[00:03:26] (CR) Dzahn: [C: -1] subscribe the chained.pem file to the non-chained.pem file [operations/puppet] - https://gerrit.wikimedia.org/r/131087 (owner: RobH)
[00:04:42] (CR) Dzahn: "was this trying to fix a specific bug?" [operations/apache-config] - https://gerrit.wikimedia.org/r/111925 (owner: BryanDavis)
[00:06:27] (CR) Dzahn: [C: 1] contint: switch localvhost to apache::conf [operations/puppet] - https://gerrit.wikimedia.org/r/155707 (https://bugzilla.wikimedia.org/68256) (owner: Hashar)
[00:06:49] (CR) Dzahn: [C: 1] contint: migrate localvhost to apache::site [operations/puppet] - https://gerrit.wikimedia.org/r/155708 (owner: Hashar)
[00:07:11] (CR) Dzahn: [C: 2] "labs only" [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/155680 (owner: Dzahn)
[00:07:45] (CR) Dzahn: [C: 2] "labs only" [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/155682 (owner: Dzahn)
[00:08:06] (CR) Dzahn: [C: 2] "labs only, let's one build the package again" [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/155685 (owner: Dzahn)
[00:18:51] (CR) BryanDavis: "It was a followup to I83cef4b4d1c956ede13b9e124a046015962d7458 and Id2cc29f911fa36805320cdb606a5da1226c9d230 to make handling of http -> h" [operations/apache-config] - https://gerrit.wikimedia.org/r/111925 (owner: BryanDavis)
[00:29:04] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:29:43] !log disabled puppet on osmium again to debug a leak; please don't re-enable
[00:29:49] Logged the message, Master
[00:29:54] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[00:31:51] ori: If you do `puppet --disable "disabled to debug a leak; please don't re-enable"` that message will show when someone tries to run puppet
[00:32:07] bd808: oh, neat. i didn't know that. thanks!
[00:32:25] It's apparently a little known puppet trick :)
[00:32:59] I knew about it but I saw someone write a TIL about it here a couple of days ago
[00:33:04] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:34:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[00:35:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[00:37:04] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:39:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[00:44:37] (PS1) Ori.livneh: jobrunner: use trebuchet package provider [operations/puppet] - https://gerrit.wikimedia.org/r/155859
[00:59:04] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:00:55] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[01:05:53] (Abandoned) Jeremyb: account creation limit for CIS (tewiki) event [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/153383 (https://bugzilla.wikimedia.org/69385) (owner: Jeremyb)
[01:29:04] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:31:56] !log springle Synchronized wmf-config/db-eqiad.php: repool db1056 (duration: 00m 06s)
[01:32:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[01:32:05] Logged the message, Master
[01:33:38] (CR) Hoo man: "Any chance we can move forward here?" [operations/puppet] - https://gerrit.wikimedia.org/r/152724 (owner: Hoo man)
[01:54:20] (CR) Hoo man: [C: 1] "Would be nice to get this merged so that I can rebase the other change into a mergeable state." [operations/puppet] - https://gerrit.wikimedia.org/r/153034 (owner: Hoo man)
[02:09:05] <^d> This is getting rather old.
[02:09:19] orly?
[02:09:46] <^d> ya rly
[02:10:17] What's getting old? :P
[02:10:46] <^d> elasticsearch being a disk hog
[02:10:49] <^d> http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&m=disk_free&s=by+name&c=Elasticsearch+cluster+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4
[02:11:05] i thought it was a RAM hog?
[02:11:42] oO
[02:11:46] <^d> It uses all the ram you can give it, but we're pretty stable with the heap / disk cache we've got now.
[02:12:10] <^d> 1.3.x elasticsearch introduced a regression.
[02:12:12] <^d> https://github.com/elasticsearch/elasticsearch/issues/7386
[02:12:24] PROBLEM - mailman_ctl on sodium is CRITICAL: PROCS CRITICAL: 0 processes with UID = 38 (list), regex args /mailman/bin/mailmanctl
[02:13:24] RECOVERY - mailman_ctl on sodium is OK: PROCS OK: 1 process with UID = 38 (list), regex args /mailman/bin/mailmanctl
[02:16:31] <^d> Hopefully the hack Nik put together with upstream will work tomorrow.
[02:16:40] <^d> Otherwise we have to keep babysitting this until 1.3.3 comes out.
[02:18:32] !log LocalisationUpdate completed (1.24wmf17) at 2014-08-23 02:17:28+00:00
[02:18:42] Logged the message, Master
[02:20:24] PROBLEM - LighttpdHTTP on dataset1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:21:15] RECOVERY - LighttpdHTTP on dataset1001 is OK: HTTP OK: HTTP/1.1 200 OK - 5122 bytes in 6.110 second response time
[02:23:03] !log LocalisationUpdate completed (1.24wmf18) at 2014-08-23 02:21:59+00:00
[02:23:09] Logged the message, Master
[02:36:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[03:07:19] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 23 03:06:13 UTC 2014 (duration 6m 12s)
[03:07:26] Logged the message, Master
[03:29:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:30:14] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[03:34:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:35:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[03:42:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:43:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[03:46:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
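The disable-with-a-reason trick that bd808 describes at [00:31:51] above is worth spelling out. What follows is a minimal sketch, not the exact session from osmium; it assumes a Puppet 3.x agent run as root (the setup in use at the time), it uses the subcommand form `puppet agent --disable` rather than the bare `puppet --disable` quoted in the chat, and the precise wording of the skip notice varies between Puppet versions.

```sh
# Disable the agent and record why. The reason string is stored in the
# agent's "disabled" lockfile under Puppet's state directory.
puppet agent --disable "disabled to debug a leak; please don't re-enable"

# Anyone who later triggers a manual run gets the reason back instead of a
# silent no-op; the notice reads roughly:
#   Skipping run of Puppet configuration client; administratively disabled
#   (Reason: 'disabled to debug a leak; please don't re-enable');
#   Use 'puppet agent --enable' to re-enable.
puppet agent --test

# Re-enable once the debugging is finished.
puppet agent --enable
```

Recording the reason this way beats disabling the agent silently, since scheduled runs that get skipped report the same message.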
[03:49:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[04:22:54] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:25:34] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Epic puppet fail
[04:27:14] PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:28:34] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[04:28:55] PROBLEM - HTTP error ratio anomaly detection on labmon1001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[04:31:34] PROBLEM - puppet last run on amssq48 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:33:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[04:34:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[04:37:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[04:40:54] RECOVERY - puppet last run on amssq58 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[04:44:04] RECOVERY - puppet last run on amssq61 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[04:45:34] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[04:47:34] RECOVERY - puppet last run on amssq48 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[04:53:11] (CR) Hoo man: [C: -1] "I guess this can't go out before the extension is live on all wikis also, right?" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/155753 (owner: 01tonythomas)
[04:58:16] (CR) Hoo man: "Ok, this should only be changed for beta right now as (according to what 01tonythomas told me) this template is also being used for produc" [operations/puppet] - https://gerrit.wikimedia.org/r/155753 (owner: 01tonythomas)
[05:03:44] (CR) Hoo man: Added the bouncehandler router to catch in all bounce emails (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/155753 (owner: 01tonythomas)
[05:11:53] (PS3) 01tonythomas: Added the bouncehandler router to catch in all bounce emails [operations/puppet] - https://gerrit.wikimedia.org/r/155753
[05:13:39] (CR) 01tonythomas: "I changed the API receiver to 'Host: http://d" [operations/puppet] - https://gerrit.wikimedia.org/r/155753 (owner: 01tonythomas)
[05:18:16] (CR) BryanDavis: "A few notes inline. We will need to get the additional extensions into the wmf release branches as well but that's sort of a separate prob" (11 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155789 (owner: Andrew Bogott)
[05:19:36] (CR) BryanDavis: "Commit message should link to bug 68751 and bug 62496" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155789 (owner: Andrew Bogott)
[05:20:19] (CR) Legoktm: Random stab at getting wikitech config in here. (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155789 (owner: Andrew Bogott)
[05:25:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:26:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[05:52:34] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[05:52:55] RECOVERY - HTTP error ratio anomaly detection on labmon1001 is OK: OK: No anomaly detected
[06:05:34] PROBLEM - puppet last run on amssq53 is CRITICAL: CRITICAL: Epic puppet fail
[06:05:44] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[06:05:54] PROBLEM - HTTP 5xx req/min on labmon1001 is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[06:06:14] PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: Epic puppet fail
[06:06:35] PROBLEM - puppet last run on amssq49 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:07:04] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:18:45] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[06:18:55] RECOVERY - HTTP 5xx req/min on labmon1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[06:23:04] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:23:34] RECOVERY - puppet last run on amssq49 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:24:14] RECOVERY - puppet last run on amssq61 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[06:24:35] RECOVERY - puppet last run on amssq53 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[06:25:25] PROBLEM - Disk space on elastic1015 is CRITICAL: DISK CRITICAL - free space: / 748 MB (2% inode=96%):
[06:27:25] PROBLEM - Disk space on elastic1013 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=96%):
[06:28:55] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:14] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:32:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[06:38:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[06:46:04] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:46:14] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[06:50:15] RECOVERY - Disk space on ms1004 is OK: DISK OK
[06:53:14] PROBLEM - Disk space on ms1004 is CRITICAL: DISK CRITICAL - free space: / 588 MB (3% inode=94%): /var/lib/ureadahead/debugfs 588 MB (3% inode=94%):
[06:53:34] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: Epic puppet fail
[07:13:35] RECOVERY - puppet last run on ms-fe3001 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[07:26:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:27:14] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[07:33:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:34:14] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[07:42:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:43:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[08:09:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[08:17:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[08:21:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:22:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[08:36:44] PROBLEM - puppet last run on mw1152 is CRITICAL: CRITICAL: Puppet has 1 failures
[08:38:03] ... how does one reset a lost 2fa token on labs/wikitech?
[08:38:51] ChrisJ: ask andrewbogott_afk or Coren :P
[08:39:31] thanks legoktm... i managed to get logged in with 2FA emergency tokens, but they don't seem to work for "disable" or "reset".
[08:39:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[08:44:30] looking at ways i could potentially get more involved as a volunteer and figured having my wikitech account functional might be helpful... :)
[08:53:44] RECOVERY - puppet last run on mw1152 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[09:12:54] PROBLEM - mailman_qrunner on sodium is CRITICAL: PROCS CRITICAL: 0 processes with UID = 38 (list), regex args /mailman/bin/qrunner
[09:13:25] PROBLEM - mailman_ctl on sodium is CRITICAL: PROCS CRITICAL: 0 processes with UID = 38 (list), regex args /mailman/bin/mailmanctl
[09:15:24] RECOVERY - mailman_ctl on sodium is OK: PROCS OK: 1 process with UID = 38 (list), regex args /mailman/bin/mailmanctl
[09:15:54] RECOVERY - mailman_qrunner on sodium is OK: PROCS OK: 8 processes with UID = 38 (list), regex args /mailman/bin/qrunner
[09:30:54] PROBLEM - mailman_qrunner on sodium is CRITICAL: PROCS CRITICAL: 0 processes with UID = 38 (list), regex args /mailman/bin/qrunner
[09:31:24] PROBLEM - mailman_ctl on sodium is CRITICAL: PROCS CRITICAL: 0 processes with UID = 38 (list), regex args /mailman/bin/mailmanctl
[09:31:54] RECOVERY - mailman_qrunner on sodium is OK: PROCS OK: 8 processes with UID = 38 (list), regex args /mailman/bin/qrunner
[09:32:24] RECOVERY - mailman_ctl on sodium is OK: PROCS OK: 1 process with UID = 38 (list), regex args /mailman/bin/mailmanctl
[09:33:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:37:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[10:09:14] PROBLEM - puppet last run on elastic1015 is CRITICAL: CRITICAL: Puppet last ran 14404 seconds ago, expected 14400
[10:10:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[10:17:14] PROBLEM - puppet last run on elastic1013 is CRITICAL: CRITICAL: Puppet last ran 14424 seconds ago, expected 14400
[10:18:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[10:35:31] (PS1) Springle: Depool db1004 for maintenance. Pool db1053 in its place. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155885
[10:36:06] (CR) Springle: [C: 2] Depool db1004 for maintenance. Pool db1053 in its place. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155885 (owner: Springle)
[10:36:10] (Merged) jenkins-bot: Depool db1004 for maintenance. Pool db1053 in its place. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155885 (owner: Springle)
[10:36:55] !log springle Synchronized wmf-config/db-eqiad.php: depool db1004. pool db1053. (duration: 00m 07s)
[10:37:00] Logged the message, Master
[10:37:14] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:40:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[10:42:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[11:21:45] !log Manually removed IPv6 address from mchenry
[11:21:51] Logged the message, Master
[11:23:36] !log Deactivated IPv6 router-advertisement on cr2-pmtpa
[11:23:42] Logged the message, Master
[11:27:47] (PS2) Mark Bergsma: Remove IPv6 address from fenari [operations/puppet] - https://gerrit.wikimedia.org/r/155758
[11:28:07] (CR) Mark Bergsma: [C: 2] Remove IPv6 address from fenari [operations/puppet] - https://gerrit.wikimedia.org/r/155758 (owner: Mark Bergsma)
[11:33:36] !log Manually removed IPv6 addresses from fenari
[11:33:42] Logged the message, Master
[11:34:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:35:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[11:39:45] PROBLEM - Host ps1-d1-pmtpa is DOWN: PING CRITICAL - Packet loss = 100%
[11:39:45] PROBLEM - Host ps1-d2-pmtpa is DOWN: PING CRITICAL - Packet loss = 100%
[11:39:45] PROBLEM - Host ps1-c1-pmtpa is DOWN: PING CRITICAL - Packet loss = 100%
[11:39:45] PROBLEM - Host mchenry is DOWN: PING CRITICAL - Packet loss = 100%
[11:40:24] PROBLEM - Host linne is DOWN: CRITICAL - Time to live exceeded (208.80.152.167)
[11:40:24] PROBLEM - Host sanger is DOWN: CRITICAL - Time to live exceeded (208.80.152.187)
[11:40:24] PROBLEM - Host fenari is DOWN: CRITICAL - Time to live exceeded (208.80.152.165)
[11:40:34] RECOVERY - Host mchenry is UP: PING OK - Packet loss = 0%, RTA = 33.95 ms
[11:40:39] odd
[11:40:44] RECOVERY - Host linne is UP: PING OK - Packet loss = 0%, RTA = 31.03 ms
[11:40:44] RECOVERY - Host fenari is UP: PING OK - Packet loss = 0%, RTA = 31.02 ms
[11:40:44] RECOVERY - Host sanger is UP: PING OK - Packet loss = 0%, RTA = 31.50 ms
[11:40:44] RECOVERY - Host ps1-d1-pmtpa is UP: PING OK - Packet loss = 0%, RTA = 33.34 ms
[11:40:44] RECOVERY - Host ps1-d2-pmtpa is UP: PING OK - Packet loss = 0%, RTA = 33.39 ms
[11:40:44] RECOVERY - Host ps1-c1-pmtpa is UP: PING OK - Packet loss = 0%, RTA = 37.32 ms
[11:57:34] PROBLEM - puppet last run on es7 is CRITICAL: CRITICAL: Epic puppet fail
[12:00:35] RECOVERY - puppet last run on es7 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[12:06:55] (PS4) 01tonythomas: Added the bouncehandler router to catch in all bounce emails [operations/puppet] - https://gerrit.wikimedia.org/r/155753
[12:11:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[12:19:04] (PS5) 01tonythomas: Added the bouncehandler router to catch in all bounce emails [operations/puppet] - https://gerrit.wikimedia.org/r/155753
[12:19:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[12:30:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:30:21] (PS6) 01tonythomas: Added the bouncehandler router to catch in all bounce emails [operations/puppet] - https://gerrit.wikimedia.org/r/155753
[12:33:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[12:41:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[12:42:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:44:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[12:51:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:52:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[13:54:17] (PS1) Danny B.: cswikinews: Remove unused custom namespace [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155893
[13:55:25] (CR) Danny B.: "This is forgotten old community decision: https://cs.wikinews.org/wiki/Wikizpr%C3%A1vy:V_redakci/07#Tematick.C3.A1_agregace_obsahu" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155893 (owner: Danny B.)
[14:12:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[14:20:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[14:29:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:31:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[14:39:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:41:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[14:42:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[14:49:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:50:14] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[15:27:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:32:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[15:36:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:39:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[15:53:15] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: Epic puppet fail
[16:03:49] (CR) Deskana: [C: 1] "Since this is only the user-specific JS/CSS that's being enabled, there are no product issues here." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/154432 (https://bugzilla.wikimedia.org/57891) (owner: Legoktm)
[16:12:15] RECOVERY - puppet last run on cp3009 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[16:13:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[16:21:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[16:43:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[17:39:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:44:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[17:48:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:49:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[17:55:59] (Abandoned) Matanya: ci firewall: move ferm rules to role level and firewall to node level [operations/puppet] - https://gerrit.wikimedia.org/r/144503 (owner: Matanya)
[17:59:27] (CR) Matanya: "I agree here, but since there is no role yet, this is the first step in simplifying the structure." [operations/puppet] - https://gerrit.wikimedia.org/r/117698 (owner: Matanya)
[18:10:08] (PS3) MZMcBride: Enable GlobalCssJs on all CentralAuth wikis minus loginwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/154432 (https://bugzilla.wikimedia.org/13953) (owner: Legoktm)
[18:14:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[18:17:47] (PS1) Ori.livneh: Lint Trebuchet provider [operations/puppet] - https://gerrit.wikimedia.org/r/155909
[18:22:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[18:42:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[18:43:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[18:44:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC
[19:24:51] unable to log in: [3a6d1d09] 2014-08-23 19:23:51: Fatal exception of type MWException
[19:24:58] (PS1) Umherirrender: Add new user rights 'editsitejs' and 'editsitecss' to user groups [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155913
[19:32:45] (CR) John F. Lewis: [C: -1] "Duplicated definitions in quite a few places" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155913 (owner: Umherirrender)
[19:34:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[19:35:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[19:49:23] (CR) John F. Lewis: Add new user rights 'editsitejs' and 'editsitecss' to user groups [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155913 (owner: Umherirrender)
[19:51:13] (CR) Calak: [C: 1] Add new user rights 'editsitejs' and 'editsitecss' to user groups [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155913 (owner: Umherirrender)
[20:01:46] (CR) Steinsplitter: [C: 1] "+1 Looks good to me." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/155913 (owner: Umherirrender)
[20:12:35] PROBLEM - puppet last run on ssl3002 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:12:44] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 13.33% of data above the critical threshold [500.0]
[20:13:04] PROBLEM - HTTP 5xx req/min on labmon1001 is CRITICAL: CRITICAL: 13.33% of data above the critical threshold [500.0]
[20:13:54] PROBLEM - puppet last run on amssq62 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:15:24] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: Puppet has 2 failures
[20:15:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC
[20:16:04] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: Epic puppet fail
[20:16:14] PROBLEM - puppet last run on cp3012 is CRITICAL: CRITICAL: Puppet has 4 failures
[20:16:44] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:17:35] PROBLEM - puppet last run on amssq31 is CRITICAL: CRITICAL: Puppet has 2 failures
[20:18:35] PROBLEM - puppet last run on cp3004 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:18:46] PROBLEM - puppet last run on amssq39 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:19:35] PROBLEM - puppet last run on cp3011 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:20:14] PROBLEM - puppet last run on ssl3001 is CRITICAL: CRITICAL: Puppet has 2 failures
[20:20:35] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:20:35] PROBLEM - puppet last run on amslvs4 is CRITICAL: CRITICAL: Epic puppet fail
[20:22:40] (PS1) coren: Labs: point /public/dumps to the new server [operations/puppet] - https://gerrit.wikimedia.org/r/156002
[20:23:04] PROBLEM - HTTP error ratio anomaly detection on labmon1001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 2 below the confidence bounds
[20:23:14] PROBLEM - Packetloss_Average on erbium is CRITICAL: packet_loss_average CRITICAL: 8.11702505882
[20:23:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC
[20:23:35] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 2 below the confidence bounds
[20:25:35] PROBLEM - puppet last run on amssq49 is CRITICAL: CRITICAL: Puppet has 1 failures
[20:26:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:27:14] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0
[20:28:44] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[20:29:04] PROBLEM - HTTP 5xx req/min on labmon1001 is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[20:30:35] RECOVERY - puppet last run on ssl3002 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[20:30:52] Coren: are you around ?
[20:31:35] PROBLEM - puppet last run on ms-be3001 is CRITICAL: CRITICAL: Puppet has 1 failures [20:31:44] PROBLEM - puppet last run on amssq56 is CRITICAL: CRITICAL: Puppet has 1 failures [20:32:14] PROBLEM - puppet last run on amssq40 is CRITICAL: CRITICAL: Puppet has 1 failures [20:33:14] RECOVERY - puppet last run on cp3012 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [20:33:24] RECOVERY - puppet last run on cp3009 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [20:33:35] RECOVERY - puppet last run on ms-fe3001 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [20:33:35] RECOVERY - puppet last run on cp3004 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [20:33:54] RECOVERY - puppet last run on amssq62 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [20:34:35] RECOVERY - puppet last run on amssq31 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [20:35:14] RECOVERY - puppet last run on ssl3001 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [20:35:44] RECOVERY - puppet last run on amssq39 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [20:36:04] RECOVERY - puppet last run on cp3018 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [20:36:35] RECOVERY - puppet last run on cp3011 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [20:37:35] RECOVERY - puppet last run on cp3017 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [20:37:54] PROBLEM - puppet last run on amssq41 is CRITICAL: CRITICAL: Epic puppet fail [20:37:54] PROBLEM - puppet last run on amssq38 is CRITICAL: CRITICAL: Puppet has 1 failures [20:39:36] RECOVERY - puppet last run on amslvs4 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [20:41:14] RECOVERY - Packetloss_Average on erbium is OK: packet_loss_average OKAY: 3.08598107143 [20:42:35] RECOVERY - puppet last run on amssq49 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [20:45:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC [20:48:35] RECOVERY - puppet last run on amssq56 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [20:48:45] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0] [20:49:04] RECOVERY - HTTP 5xx req/min on labmon1001 is OK: OK: Less than 1.00% above the threshold [250.0] [20:49:14] RECOVERY - puppet last run on amssq40 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [20:49:35] RECOVERY - puppet last run on ms-be3001 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [20:49:54] RECOVERY - puppet last run on amssq41 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [20:52:55] RECOVERY - puppet last run on amssq38 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [21:02:45] (03PS2) 10Ori.livneh: Lint Trebuchet provider [operations/puppet] - 10https://gerrit.wikimedia.org/r/155909 [21:03:05] (03CR) 10Ori.livneh: [C: 032 V: 032] "trivial and tested" [operations/puppet] - 10https://gerrit.wikimedia.org/r/155909 (owner: 10Ori.livneh) [21:05:04] PROBLEM - HTTP 5xx req/min on labmon1001 is CRITICAL: 
CRITICAL: 7.14% of data above the critical threshold [500.0] [21:05:44] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 7.14% of data above the critical threshold [500.0] [21:22:04] RECOVERY - HTTP 5xx req/min on labmon1001 is OK: OK: Less than 1.00% above the threshold [250.0] [21:22:44] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0] [21:26:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:28:14] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0 [21:38:15] PROBLEM - RAID on analytics1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:45:04] RECOVERY - RAID on analytics1003 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0 [21:57:04] RECOVERY - HTTP error ratio anomaly detection on labmon1001 is OK: OK: No anomaly detected [21:57:35] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected [22:16:34] PROBLEM - Puppet freshness on elastic1015 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:08:52 UTC [22:24:34] PROBLEM - Puppet freshness on elastic1013 is CRITICAL: Last successful Puppet run was Sat 23 Aug 2014 06:16:33 UTC [22:46:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Last successful Puppet run was Fri 22 Aug 2014 20:33:50 UTC [23:19:12] (03CR) 10coren: [C: 032] "Better partial than broken." [operations/puppet] - 10https://gerrit.wikimedia.org/r/156002 (owner: 10coren)