[00:01:22] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:02:22] (CR) CSteipp: [C: 1] open-up-wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99775 (owner: Dan-nl)
[00:04:12] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[00:04:34] csteipp: Thank you
[00:06:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:02:21 AM UTC
[00:07:03] yvon..pff
[00:07:57] RECOVERY - Puppet freshness on yvon is OK: puppet ran at Sat Dec 7 00:07:49 UTC 2013
[00:08:42] (PS3) Dan-nl: open-up-wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99775
[00:09:26] (CR) Dan-nl: "- addressed Legoktm’s comment in ps2." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99775 (owner: Dan-nl)
[00:09:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:09:43] (CR) BryanDavis: [C: 2] open-up-wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99775 (owner: Dan-nl)
[00:09:51] (Merged) jenkins-bot: open-up-wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99775 (owner: Dan-nl)
[00:11:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:13:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:13:49] !log demon synchronized php-1.23wmf5/extensions/Wikibase/ 'Updating for debugging'
[00:14:10] Logged the message, Master
[00:15:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:17:17] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:17:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:17:43] (PS2) Dzahn: turn wikistats into module - WIP [operations/puppet] - https://gerrit.wikimedia.org/r/94409
[00:18:07] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[00:19:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:20:23] (PS1) Ori.livneh: graphite: notify uWSGI when local_settings.py changes [operations/puppet] - https://gerrit.wikimedia.org/r/99784
[00:21:06] (CR) Ori.livneh: [C: 2 V: 2] graphite: notify uWSGI when local_settings.py changes [operations/puppet] - https://gerrit.wikimedia.org/r/99784 (owner: Ori.livneh)
[00:21:27] is XO the only transit/peering out of cr1-sdtpa.wikimedia.org?
[00:21:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:21:46] (CR) GWicke: Add Mathoid module (TeX -> MathML / SVG conversion web service) (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/90733 (owner: Physikerwelt)
[00:23:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:25:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:26:42] (PS1) Ori.livneh: graphite: Correct location of Django admin static files [operations/puppet] - https://gerrit.wikimedia.org/r/99787
[00:26:58] (CR) Ori.livneh: [C: 2 V: 2] graphite: Correct location of Django admin static files [operations/puppet] - https://gerrit.wikimedia.org/r/99787 (owner: Ori.livneh)
[00:27:17] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:27:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:07:49 AM UTC
[00:28:27] RECOVERY - Puppet freshness on yvon is OK: puppet ran at Sat Dec 7 00:28:22 UTC 2013
[00:28:47] (PS1) Dzahn: change bugzilla role classes in preparation for upcoming switch to new server and after having merged the new mdoule [operations/puppet] - https://gerrit.wikimedia.org/r/99788
[00:29:33] (PS2) Dzahn: change bugzilla role classes in preparation for upcoming switch to new server and after having merged the new mdoule [operations/puppet] - https://gerrit.wikimedia.org/r/99788
[00:30:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:28:22 AM UTC
[00:31:17] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[00:32:03] (CR) Dzahn: "how about just having both? tech and non/tech? don't have any strong opinion here" [operations/puppet] - https://gerrit.wikimedia.org/r/97190 (owner: Faidon Liambotis)
[00:32:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:28:22 AM UTC
[00:33:07] RECOVERY - Puppet freshness on yvon is OK: puppet ran at Sat Dec 7 00:33:01 UTC 2013
[00:34:35] (PS10) Aude: Enable Wikidata build on beta labs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/95996
[00:34:37] PROBLEM - Puppet freshness on yvon is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 12:33:02 AM UTC
[00:35:42] cajoel: hey, what's the name of the router you need info on
[00:36:31] ideally I'd like access to observium
[00:36:33] but..
[00:36:36] 1s
[00:36:54] cr1-sdtpa.wikimedia.org:
[00:36:58] the XO uplink
[00:37:10] need to know general sense of traffic load on that link
[00:37:13] in Mbps
[00:37:31] mutante: thanks
[00:38:46] cajoel: hmm.. so i have access but i dont know off the top of my head how the access was handled
[00:39:01] on the start page of observium i see other cr1-, like esams
[00:39:11] but yet would have to find cr1-sdtpa
[00:39:40] I suppose it's possible tampa isn't in there?
[00:39:50] (CR) Deyan: "Could you elaborate on why you decided to switch the converter? From what I followed thus far, it seemed that the latexml integration was " [operations/puppet] - https://gerrit.wikimedia.org/r/61767 (owner: Physikerwelt)
[00:40:04] thanks for peeking -- I'll bug leslie
[00:40:57] cajoel: would cr-2 help
[00:41:11] i think i got something on cr-2.pmtpa
[00:41:48] I only need cr1
[00:43:22] hmm, yea, sorry, i'm on the "all devices" and i have cr1-eqiad and esams
[00:43:27] oh... cr1-sdtpa
[00:43:35] whee
[00:44:12] sometimes dots sometimes dashes?
[00:44:16] let me just mail you a screenshot
[00:44:42] i hope leslie can then help with the question about access
[00:45:42] already emailed her
[00:45:50] screen shot for a 24h period would be stellar
[00:46:01] (as long as it looks fairly regular)
[00:46:39] cajoel: ugh, 2 factor auth for mail expired. expect it in a few
[00:46:51] finds backup phone :p
[00:47:46] you did all this on your phone?
[00:47:59] golf claps
[00:49:43] (CR) GWicke: "Some of the reasons were" [operations/puppet] - https://gerrit.wikimedia.org/r/61767 (owner: Physikerwelt)
[00:51:18] (CR) Chad: [C: 2] Remove pool counter setting for Cirrus updates [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99752 (owner: Manybubbles)
[00:51:27] (Merged) jenkins-bot: Remove pool counter setting for Cirrus updates [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99752 (owner: Manybubbles)
[00:53:12] !log demon synchronized wmf-config/PoolCounterSettings-pmtpa.php 'Removing old cirrus config'
[00:53:32] Logged the message, Master
[00:53:48] !log demon synchronized wmf-config/PoolCounterSettings-eqiad.php 'Removing old cirrus config'
[00:54:04] Logged the message, Master
[01:01:02] cajoel: heh, no, but i need a backup phone to log into my mail account
[01:02:52] RECOVERY - Puppet freshness on yvon is OK: puppet ran at Sat Dec 7 01:02:45 UTC 2013
[01:08:02] ori-l: OK, my brain is still only operating at half capacity… were you about to point out some obvious way for me to get the project name/id/whatever on the puppetmaster for a given report?
[01:08:25] andrewbogott: dunno enough about openstack, sorry
[01:09:11] Would you expect facts to be available during reporting? I don't see that they are but maybe I'm making a mistake...
[01:11:52] cajoel: eh, i took some literal pictures with my phone and mailed them
[01:11:58] hah!
[01:12:01] beauty
[01:12:16] haha, i know, not the best to import into spreadsheet now
[01:12:25] but that was the workaround to mail it out :p
[01:12:26] this is just for sanity checks
[01:12:28] works great
[01:12:30] thanks!
[01:12:48] k, yw:)
[01:14:00] Just got a 503 on beta.
[01:14:15] When trying to create a Draft on http://en.wikipedia.beta.wmflabs.org/
[01:15:02] ugh
[01:15:18] https://bugzilla.wikimedia.org/show_bug.cgi?id=57249
[01:19:33] (PS1) BryanDavis: Disable wgCopyUploadProxy for beta [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99803
[01:21:09] (CR) BryanDavis: "GWToolset can't download anything without this change or a change to the squid config for url-downloader.wikimedia.org. This seems like th" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99803 (owner: BryanDavis)
[01:23:47] (CR) CSteipp: [C: 1] "should be fine for labs" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99803 (owner: BryanDavis)
[01:24:24] (CR) BryanDavis: [C: 2] Disable wgCopyUploadProxy for beta [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99803 (owner: BryanDavis)
[01:26:22] (Merged) jenkins-bot: Disable wgCopyUploadProxy for beta [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99803 (owner: BryanDavis)
[01:40:12] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:41:12] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[01:42:11] greg-g, the 503 issues are not just on long articles.
[01:42:30] Just got:
[01:42:36] Request: POST http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:UserLogin&action=submitlogin&type=signup&returnto=Special:MovePage/Draft:Jane+Cooper, from 71.175.126.100 via deployment-cache-text1 frontend ([10.4.1.133]:80), Varnish XID 2142024138
[01:42:53] When submitting the signup page.
[01:44:12] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:47:12] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[01:50:16] (PS1) Ori.livneh: Graphite web: create admin user [operations/puppet] - https://gerrit.wikimedia.org/r/100035
[01:53:12] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:53:32] (CR) Ori.livneh: [C: 2 V: 2] Graphite web: create admin user [operations/puppet] - https://gerrit.wikimedia.org/r/100035 (owner: Ori.livneh)
[01:54:12] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[02:20:06] !log LocalisationUpdate completed (1.23wmf5) at Sat Dec 7 02:20:06 UTC 2013
[02:20:27] Logged the message, Master
[02:25:18] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:31:08] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[02:37:08] !log LocalisationUpdate completed (1.23wmf6) at Sat Dec 7 02:37:08 UTC 2013
[02:37:24] Logged the message, Master
[02:42:12] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:44:12] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[02:56:12] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:57:12] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[03:22:13] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:22:21] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Dec 7 03:22:21 UTC 2013
[03:22:35] Logged the message, Master
[03:27:13] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[03:50:20] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:51:10] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[03:54:38] So we're going to run a script to blank signupstart and signupend on all Wikimedia wikis.
[03:55:03] Is there a recommended way to do this on the cluster, or just run a script using the API from outside the cluster?
[03:55:20] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:56:10] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[04:35:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:37:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:38:32] (CR) Deyan: "I guess if those are the features you want to prioritize, that's fine. I suspect LaTeXML would get you more mileage in:" [operations/puppet] - https://gerrit.wikimedia.org/r/61767 (owner: Physikerwelt)
[04:39:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:41:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:43:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:45:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:47:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:49:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:51:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:53:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:54:21] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[04:55:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:55:21] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[04:57:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:59:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:29:31 AM UTC
[04:59:51] RECOVERY - Puppet freshness on mw1034 is OK: puppet ran at Sat Dec 7 04:59:50 UTC 2013
[05:01:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:50 AM UTC
[05:01:21] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:02:11] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[05:03:21] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:50 AM UTC
[05:05:22] PROBLEM - Puppet freshness on mw1034 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:50 AM UTC
[05:27:07] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:29:07] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[05:29:37] RECOVERY - Puppet freshness on mw1034 is OK: puppet ran at Sat Dec 7 05:29:35 UTC 2013
[05:30:17] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:31:17] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[05:32:07] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:33:07] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[05:58:56] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:59:46] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[06:28:04] PROBLEM - udp2log log age for lucene on oxygen is CRITICAL: CRITICAL: log files /a/log/lucene/lucene.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[06:30:04] RECOVERY - udp2log log age for lucene on oxygen is OK: OK: all log files active
[06:52:50] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:58:50] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[07:06:19] (CR) Isarra: [C: 1] "It appended something weird to the end of the filename and I couldn't open it as a result. I removed that and it looks fine, but that may " [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99768 (owner: M4tx)
[07:22:21] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:23:11] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[07:27:51] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:28:51] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[07:47:17] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:49:17] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[08:00:17] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:01:17] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[08:10:23] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:11:23] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[08:48:03] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:47:46 AM UTC
[08:55:23] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:56:13] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[08:57:23] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:58:23] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[09:07:23] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:09:23] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[09:25:23] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:26:23] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[09:32:07] (Abandoned) Nemo bis: Simplify misc::maintenance::update_special_pages a bit [operations/puppet] - https://gerrit.wikimedia.org/r/90117 (owner: Nemo bis)
[09:39:01] (PS1) M4tx: Update favicon wikinews.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100123
[09:45:38] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Sat Dec 7 09:45:28 UTC 2013
[09:47:43] (PS7) Vldandrew: Update favicon mediawiki.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99756
[09:47:46] (CR) jenkins-bot: [V: -1] Update favicon mediawiki.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99756 (owner: Vldandrew)
[10:06:21] (PS8) Vldandrew: Update favicon mediawiki.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99756
[10:08:24] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:10:24] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[10:29:05] (PS1) Vldandrew: Update favicon incubator.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100133
[10:37:25] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:38:25] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[10:39:44] (CR) Odder: [C: 1] "The favicon looks great to me, thanks!" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100123 (owner: M4tx)
[10:56:25] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:58:25] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[11:07:42] (CR) Odder: [C: 1] "Looks fine, thanks!" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99756 (owner: Vldandrew)
[11:11:29] (CR) Odder: [C: 1] "Looks great, thanks much!" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100133 (owner: Vldandrew)
[11:12:39] (CR) Odder: "Er?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99768 (owner: M4tx)
[11:33:33] (PS1) Vldandrew: Update favicon wikisource.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100139
[11:43:28] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:43:33] (CR) Odder: [C: 1] "Do I even have to mention the icon looks great? :-)" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100139 (owner: Vldandrew)
[11:45:08] (CR) Vldandrew: "Hey, thank you very much." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100139 (owner: Vldandrew)
[11:45:28] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[11:59:46] (PS1) Vldandrew: Update favicon wikisource.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100142
[12:00:46] (PS2) Vldandrew: Update favicon wikisource.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100142
[12:18:14] (CR) Odder: [C: 1] "Great as usual." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100142 (owner: Vldandrew)
[12:34:59] !log restarting Jenkins on gallium
[12:35:15] Logged the message, Master
[12:48:31] (PS1) Vldandrew: Update favicon wikidata.ico. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144
[12:49:06] (CR) jenkins-bot: [V: -1] Update favicon wikidata.ico. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[12:55:34] (PS2) Vldandrew: Update favicon wikidata.ico. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144
[12:56:14] (CR) jenkins-bot: [V: -1] Update favicon wikidata.ico. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[13:11:47] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:12:12] (PS3) Hashar: Update favicon wikidata.ico. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[13:12:37] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[13:31:42] disappearing
[13:40:54] (CR) Odder: "There seem to be a difference between the 32px and 48px layers; the 32px one takes the full width of the layer (32px) while the 48px leave" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[13:44:42] PROBLEM - check_job_queue on terbium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:45:42] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000
[13:48:51] (CR) Vldandrew: "Yes, I thought that if I make the image fit the whole width it would look different from the others. The distance between the x32 and x16 " [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[13:54:24] (CR) Odder: [C: -1] "Please do; I just played a bit with the logo in Inkscape, and creating a layer with it taking the whole width (ie. 48px) looks possible." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[14:10:33] (PS1) Dapete: Tool Labs: install fonts for vCat tool [operations/puppet] - https://gerrit.wikimedia.org/r/100147
[14:26:34] (CR) Ladsgroup: [C: 1] Enabling Persian Wikipedia Education Program [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/99739 (owner: Ebrahim)
[15:19:31] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[17:05:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:07:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:09:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:11:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:13:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:15:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:17:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 04:59:45 PM UTC
[17:19:01] RECOVERY - Puppet freshness on mw1035 is OK: puppet ran at Sat Dec 7 17:18:56 UTC 2013
[17:20:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:18:56 PM UTC
[17:22:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:18:56 PM UTC
[17:24:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:18:56 PM UTC
[17:26:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:18:56 PM UTC
[17:28:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:18:56 PM UTC
[17:29:11] RECOVERY - Puppet freshness on mw1035 is OK: puppet ran at Sat Dec 7 17:29:03 UTC 2013
[17:30:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:29:03 PM UTC
[17:32:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:29:03 PM UTC
[17:34:31] PROBLEM - Puppet freshness on mw1035 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:29:03 PM UTC
[17:45:39] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[17:59:58] RECOVERY - Puppet freshness on mw1035 is OK: puppet ran at Sat Dec 7 17:59:51 UTC 2013
[18:10:46] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[18:23:08] (PS1) M4tx: Update favicon piece.ico [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100161
[19:27:42] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[19:34:42] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[19:40:12] (CR) Odder: [C: -1] "There is a visible difference between the existing 32px layer of the favicon and the one you committed, please improve!" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100161 (owner: M4tx)
[19:51:46] (PS4) Vldandrew: Update favicon wikidata.ico. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144
[20:39:42] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[20:43:34] (CR) Odder: [C: 1] "Looks fine, thanks again!" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/100144 (owner: Vldandrew)
[20:44:22] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: Last successful Puppet run was Sat 07 Dec 2013 05:43:40 PM UTC
[21:13:36] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Sat Dec 7 21:13:28 UTC 2013
[21:36:48] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[22:39:23] (PS1) Ori.livneh: Graphite: add system role; tweak storage schemas to be thriftier [operations/puppet] - https://gerrit.wikimedia.org/r/100185
[22:39:56] (CR) jenkins-bot: [V: -1] Graphite: add system role; tweak storage schemas to be thriftier [operations/puppet] - https://gerrit.wikimedia.org/r/100185 (owner: Ori.livneh)
[22:41:22] (PS2) Ori.livneh: Graphite: add system role; tweak storage schemas to be thriftier [operations/puppet] - https://gerrit.wikimedia.org/r/100185
[22:41:54] (CR) jenkins-bot: [V: -1] Graphite: add system role; tweak storage schemas to be thriftier [operations/puppet] - https://gerrit.wikimedia.org/r/100185 (owner: Ori.livneh)
[22:43:55] (PS3) Ori.livneh: Graphite: add system role; tweak storage schemas to be thriftier [operations/puppet] - https://gerrit.wikimedia.org/r/100185
[22:46:07] (CR) Ori.livneh: [C: 2] Graphite: add system role; tweak storage schemas to be thriftier [operations/puppet] - https://gerrit.wikimedia.org/r/100185 (owner: Ori.livneh)
[22:55:38] ori-l: does this mean that the move of gdash is coming?
[22:56:13] Nemo_bis: gdash already moved, but graphite hasn't
[22:56:38] ori-l: oh, so I can/should submit those patches for the log scale now?
[22:56:41] because gdash just generates URLs to graphite for graphs, the graphs in gdash are still coming from the old graphite instance
[22:56:48] sure
[22:57:02] yeah, you should submit them. I'm not ready to merge them quite yet but should be very shortly.
[22:57:17] great, will work on that on monday probably
[22:57:29] cool, thanks!
[23:07:27] ori-l: on special version, it says d7c6f6c is deployed while there is newer stuff in the wmf5 branch
[23:07:52] i'm wondering if it's the sepcial pages that are cached or there really is stuff not deployed?
[23:07:56] special*
[23:08:00] either someone merged and didn't sync, or they sync'd files but didn't run full scap
[23:08:15] do you have deployment privs?
[23:08:15] ok
[23:08:19] i can wait until monday
[23:08:27] wait for what?
[23:08:34] to have stuff synced
[23:08:46] not sure it's good to do on the weekend :)
[23:09:02] depends on who you ask, but yeah
[23:09:15] but -- do you have deployment privs?
[23:09:20] no
[23:10:00] would you like to? it might make sense, since i notice you're often the point-person for wikidata patch management
[23:10:29] maybe or at least be able to have shell access to see logs etc :)
[23:11:08] and not bug ree-dy all the time
[23:11:57] ok, let me figure out what the process is and let's do it
[23:12:05] probably ask rob
[23:12:49] why don't i email him and cc you?
[23:13:01] ok
[23:18:02] aude: Don't worry, when you have the deploy privs, you don't earn the matching badge for your shirt till you break the site mid-flight
[23:18:50] heh
[23:19:08] * aude sure wikidata / me has already broke something at some point
[23:44:13] PROBLEM - Backend Squid HTTP on sq80 is CRITICAL: Connection refused
[23:47:45] Nemo_bis: how come Wikitech doesn't have an entry in the interwiki map?
[23:47:51] or am I being daft and not noticing it?
[23:49:25] never mind, I'm being daft.
[23:50:15] (PS1) Ori.livneh: Parametrize Graphite's DOCUMENTATION_URL option & set it to [[wikitech:Graphite]] [operations/puppet] - https://gerrit.wikimedia.org/r/100194