[00:01:40] !log ori synchronized php-1.23wmf9/skins/vector/images 'Fix SVG MIME-type detection by reverting a9b855eea52ba3'
[00:01:58] Logged the message, Master
[00:02:28] !log ori synchronized php-1.23wmf9/skins/common 'Fix SVG MIME-type detection by reverting a9b855eea52ba3 (part 2/2)'
[00:02:46] Logged the message, Master
[00:55:28] Nemo_bis: log scale in gdash is doable now, if you want to submit a patch
[00:57:27] (CR) Ori.livneh: [C: -1] "As discussed in IRC, this is problematic: it is doubtful that a canonical, single statsd instance could serve the cluster, because statsd " [operations/dns] - https://gerrit.wikimedia.org/r/105383 (owner: Faidon Liambotis)
[00:57:56] (CR) Faidon Liambotis: "Multiple statsd service hostnames, then? :)" [operations/dns] - https://gerrit.wikimedia.org/r/105383 (owner: Faidon Liambotis)
[01:12:33] (CR) Ori.livneh: [C: 2] Fixed scap variable exporting [operations/puppet] - https://gerrit.wikimedia.org/r/105238 (owner: Aaron Schulz)
[01:12:35] ah forgot to merge a change for aaron
[01:15:28] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[01:17:11] (PS1) Faidon Liambotis: graphite: enable carbon over UDP [operations/puppet] - https://gerrit.wikimedia.org/r/105419
[01:17:48] (CR) Faidon Liambotis: [C: 2 V: 2] graphite: enable carbon over UDP [operations/puppet] - https://gerrit.wikimedia.org/r/105419 (owner: Faidon Liambotis)
[01:33:14] !log jenkins fixed up mwext-VisualEditor-qunit job, its configuration got reverted to some old/incorrect version when I downgraded the git plugin. Retriggered all changes
[01:33:33] Logged the message, Master
[02:38:08] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:38:29] !log LocalisationUpdate completed (1.23wmf8) at Sat Jan 4 02:38:29 UTC 2014
[02:38:46] Logged the message, Master
[02:39:08] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[03:13:46] !log LocalisationUpdate completed (1.23wmf9) at Sat Jan 4 03:13:45 UTC 2014
[03:14:00] Logged the message, Master
[03:45:39] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Jan 4 03:45:39 UTC 2014
[03:45:55] Logged the message, Master
[04:13:49] !log schema change, ad-hoc, additional indexes on recentchanges & wb_terms for recent slow queries
[04:14:04] Logged the message, Master
[04:35:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:28:59 AM UTC
[04:36:25] RECOVERY - Puppet freshness on es7 is OK: puppet ran at Sat Jan 4 04:36:18 UTC 2014
[04:38:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:40:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:42:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:44:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:46:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:48:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:50:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:51:32] I heard you like your puppet fresh, so I…
[04:51:33] >.>
[04:51:50] apergos: ^
[04:52:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:54:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:56:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:58:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:36:18 AM UTC
[04:59:25] RECOVERY - Puppet freshness on es7 is OK: puppet ran at Sat Jan 4 04:59:22 UTC 2014
[05:01:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:59:22 AM UTC
[05:02:30] why is it doing that...
[05:02:43] checks are already passive
[05:03:25] PROBLEM - Puppet freshness on es7 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 04:59:22 AM UTC
[05:07:44] RECOVERY - Puppet freshness on es7 is OK: puppet ran at Sat Jan 4 05:07:42 UTC 2014
[05:47:53] springle: it does that because freshness threshold is updated on every puppet run on each host, in the exported resources, and it's updated like this:
[05:47:58] delete the row. commit
[05:48:03] insert the new row. commit.
[05:49:04] so, once in a while, the timing is just right for neon to poll for the threshold and have it not be there... in which case it defaults to 1, and there you have it
[05:49:25] ah :)
[05:49:30] the next puppet run on neon will typically clear it up
[05:50:05] I haven't come up with the right workaround... well, the *right* workaround is to do the delete and insert in one transaction. anyways...
[05:50:10] stupid puppet
[06:08:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:09:24] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[06:34:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
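Editor's note on the 05:47 to 05:50 exchange: the race described there (delete the row, commit, insert the new row, commit) and the suggested fix (delete and insert in one transaction) can be sketched roughly as below. This is only an illustration under stated assumptions: SQLite stands in for whatever database actually backs the exported resources, and the `freshness` table, its columns, the value 36000 and all function names are hypothetical. Only the fallback value of 1 comes from the log.

```python
import os
import sqlite3
import tempfile

DEFAULT_THRESHOLD = 1  # fallback used when the row is missing (value taken from the log)


def poll_threshold(reader, host):
    """Read the threshold as a separate poller would, falling back to the default."""
    rows = reader.execute(
        "SELECT threshold FROM freshness WHERE host = ?", (host,)
    ).fetchall()
    return rows[0][0] if rows else DEFAULT_THRESHOLD


def update_two_commits(writer, host, value, poll):
    """Racy pattern from the log: delete + commit, then insert + commit."""
    writer.execute("DELETE FROM freshness WHERE host = ?", (host,))
    writer.commit()
    seen = poll()  # a poll landing between the two commits finds no row
    writer.execute(
        "INSERT INTO freshness (host, threshold) VALUES (?, ?)", (host, value)
    )
    writer.commit()
    return seen


def update_one_transaction(writer, host, value, poll):
    """Suggested fix: delete and insert inside a single transaction."""
    writer.execute("DELETE FROM freshness WHERE host = ?", (host,))
    writer.execute(
        "INSERT INTO freshness (host, threshold) VALUES (?, ?)", (host, value)
    )
    seen = poll()  # a concurrent poll still sees the previously committed row
    writer.commit()
    return seen


def demo():
    """Return (value seen during the racy update, value seen during the atomic one)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "freshness.db")
        writer = sqlite3.connect(path)  # plays the puppet run
        reader = sqlite3.connect(path)  # plays the monitoring host
        try:
            writer.execute(
                "CREATE TABLE freshness (host TEXT PRIMARY KEY, threshold INTEGER)"
            )
            writer.execute("INSERT INTO freshness VALUES ('es7', 36000)")
            writer.commit()

            def poll():
                return poll_threshold(reader, "es7")

            racy = update_two_commits(writer, "es7", 36000, poll)
            safe = update_one_transaction(writer, "es7", 36000, poll)
            return racy, safe
        finally:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(demo())
```

In this sketch the poll is forced into the gap deliberately; in the situation described in the log it would only land there occasionally, which matches the intermittent alerts. With the two-commit update the poller falls back to the default, while with the single-transaction update it keeps reading the old committed value until the new one is committed.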
[06:35:24] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[06:44:24] PROBLEM - SSH on searchidx1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:44:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:24] RECOVERY - SSH on searchidx1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[06:45:24] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[07:31:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:32:34] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[07:40:05] (PS1) Springle: reduce db1040 load while adding indexes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105429
[07:40:24] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[07:40:28] (CR) Springle: [C: 2] reduce db1040 load while adding indexes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105429 (owner: Springle)
[07:41:59] !log springle synchronized wmf-config/db-eqiad.php 'reduce db1040 load while adding indexes'
[07:42:15] Logged the message, Master
[08:15:14] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 05:14:51 AM UTC
[08:40:24] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[08:49:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:50:24] PROBLEM - SSH on searchidx1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:50:34] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[08:51:14] RECOVERY - SSH on searchidx1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[08:51:53] (PS1) Springle: reduce db1026 load while reindexing wikidatawiki.wb_terms [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105431
[08:52:52] (CR) Springle: [C: 2] reduce db1026 load while reindexing wikidatawiki.wb_terms [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105431 (owner: Springle)
[08:54:31] !log springle synchronized wmf-config/db-eqiad.php 'reduce db1026 load while reindexing wikidatawiki.wb_terms'
[08:54:49] Logged the message, Master
[09:46:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:49:24] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[10:15:04] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Sat Jan 4 10:15:02 UTC 2014
[11:46:24] PROBLEM - Host ms-be1012 is DOWN: PING CRITICAL - Packet loss = 100%
[14:17:54] PROBLEM - MySQL Processlist on db1006 is CRITICAL: CRIT 0 unauthenticated, 0 locked, 0 copy to table, 109 statistics
[14:18:55] RECOVERY - MySQL Processlist on db1006 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 6 statistics
[14:25:24] (PS1) Springle: db1040 back to full steam [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105442
[14:26:02] (CR) Springle: [C: 2] db1040 back to full steam [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105442 (owner: Springle)
[14:26:53] !log springle synchronized wmf-config/db-eqiad.php 'db1040 back to full steam'
[14:27:12] Logged the message, Master
[14:58:54] PROBLEM - RAID on db1047 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded)
[18:20:08] (PS1) QChris: Log correctly encoded url with parameters for nginx [operations/puppet] - https://gerrit.wikimedia.org/r/105449
[18:20:13] (PS1) QChris: Stop nginx from escaping the user agent [operations/puppet] - https://gerrit.wikimedia.org/r/105450
[18:20:14] (PS1) QChris: Log outgoing X-Analytics header for nginx [operations/puppet] - https://gerrit.wikimedia.org/r/105451
[19:37:33] (CR) Vldandrew: "I would like to work on this one. Give me a few minutes." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[19:48:53] (PS2) Vldandrew: Added commons.ico with more resolutions. \n Buzilla Link - review/yatinmaan/103107 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[19:50:53] (PS1) Tim Landscheidt: Tools: Add requested package libjpeg-turbo-progs [operations/puppet] - https://gerrit.wikimedia.org/r/105453
[19:52:24] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[19:56:03] (PS3) Odder: Added commons.ico with more resolutions. \n Buzilla Link - review/yatinmaan/103107 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[19:57:25] (CR) Odder: [C: 1] "Thanks for taking this, Vldandrew - the icon looks great. Well done :-)" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[20:02:50] (CR) Qgil: "I still miss a description explaining whether you have simply added the higher resolution PNG or you also edited the other two resolutions" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[20:04:06] (CR) Vldandrew: "I took the original version and added the 48x48." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[20:04:45] (PS4) Vldandrew: Added commons.ico with more resolutions [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[20:09:14] PROBLEM - Puppet freshness on cp1046 is CRITICAL: Last successful Puppet run was Sat 04 Jan 2014 05:08:46 PM UTC
[20:11:07] (CR) Qgil: [C: 1] Added the 48x48 resolution for commons.ico, it was based on the original icon. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[20:25:47] (CR) Odder: [C: 1] Added the 48x48 resolution for commons.ico, it was based on the original icon. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/103355 (owner: Yatinmaan)
[20:54:24] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[21:06:04] !log jenkins: purged php5-mysqld package from lanthanum and gallium. It was no more installed but had some conf file left in /etc/php5/conf.d/
[21:06:20] Logged the message, Master
[21:36:21] (PS1) Odder: Add Frank Schulenburg's blog to the English Planet [operations/puppet] - https://gerrit.wikimedia.org/r/105460
[22:17:46] (PS1) BBlack: bugfix: pointer to mem possibly freed by realloc() [operations/software/varnish/libvmod-netmapper] - https://gerrit.wikimedia.org/r/105461
[22:17:47] (PS1) BBlack: Prevent a possible resource leak under race [operations/software/varnish/libvmod-netmapper] - https://gerrit.wikimedia.org/r/105462
[22:20:14] (CR) BBlack: [C: 2 V: 2] bugfix: pointer to mem possibly freed by realloc() [operations/software/varnish/libvmod-netmapper] - https://gerrit.wikimedia.org/r/105461 (owner: BBlack)
[22:20:45] (CR) BBlack: [C: 2 V: 2] Prevent a possible resource leak under race [operations/software/varnish/libvmod-netmapper] - https://gerrit.wikimedia.org/r/105462 (owner: BBlack)
[22:26:19] (PS1) BBlack: Updated netmapper patch to e4eb4160 [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/105463
[22:26:48] (CR) BBlack: [C: 2 V: 2] Updated netmapper patch to e4eb4160 [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/105463 (owner: BBlack)
[23:08:24] RECOVERY - Puppet freshness on cp1046 is OK: puppet ran at Sat Jan 4 23:08:15 UTC 2014