[00:49:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[01:30:10] <icinga-wm>	 PROBLEM - RAID on analytics1004 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[01:37:09] <icinga-wm>	 RECOVERY - RAID on analytics1004 is OK: OK: Active: 8, Working: 8, Failed: 0, Spare: 0  
[02:05:00] <icinga-wm>	 PROBLEM - Disk space on virt0 is CRITICAL: DISK CRITICAL - free space: /a 3873 MB (3% inode=99%):  
[02:07:21] <jeremyb>	 do we care about virt0?
[02:07:26] <jeremyb>	 (tampa)
[02:18:48] <logmsgbot>	 !log LocalisationUpdate completed (1.24wmf15) at 2014-09-07 02:17:44+00:00
[02:19:02] <morebots>	 Logged the message, Master
[02:31:19] <logmsgbot>	 !log LocalisationUpdate completed (1.24wmf19) at 2014-09-07 02:30:15+00:00
[02:31:25] <morebots>	 Logged the message, Master
[02:43:16] <logmsgbot>	 !log LocalisationUpdate completed (1.24wmf20) at 2014-09-07 02:42:12+00:00
[02:43:21] <morebots>	 Logged the message, Master
[02:49:29] <icinga-wm>	 PROBLEM - very high load average likely xfs on ms-be1005 is CRITICAL: CRITICAL - load average: 210.89, 115.08, 55.84  
[02:50:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[03:00:59] <icinga-wm>	 RECOVERY - Disk space on virt0 is OK: DISK OK  
[03:12:49] <icinga-wm>	 PROBLEM - swift eqiad-prod container availability on tungsten is CRITICAL: CRITICAL: 10.34% of data under the critical threshold [96.0]  
[03:29:47] <jeremyb>	 i'll assume that last alert is related to ms-be1005?
[03:30:57] <logmsgbot>	 !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep  7 03:29:51 UTC 2014 (duration 29m 50s)
[03:31:03] <morebots>	 Logged the message, Master
[03:35:58] <greg-g>	 jeremyb: heya, did you mean to claim/assign to yourself this task http://fab.wmflabs.org/T644 ? (If you did, awesome! just double checking)
[04:06:58] <jeremyb>	 greg-g: yeah
[04:07:07] <jeremyb>	 greg-g: i fixed morebots once this week already
[04:07:15] <greg-g>	 jeremyb: sweet, thank you sir. :)
[04:07:20] <jeremyb>	 what's a little more tweaking? :)
[04:23:32] <jeremyb>	 speaking of, looks like wikitechwiki was still broken today
[04:23:42] <jeremyb>	 i wonder if andrewbogott_afk looked at my patch
[04:25:02] <jeremyb>	 greg-g: so, should it be !log foo or !log deployment-prep foo ?
[04:25:24] <jeremyb>	 or do you want a QA log and people should know to use that instead of deployment-prep?
[04:52:30] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[04:59:09] <jeremyb>	 greg-g: reping. btw, did you catch that swift issue above? seems like the kind of thing you like to keep an eye on
[05:12:01] <greg-g>	 jeremyb: re !log, if it can be simply "!log foo" and that go to the right SAL, https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL then that's ideal.
[05:12:17] <jeremyb>	 '
[05:12:20] <jeremyb>	 whoops
[05:12:24] <greg-g>	 I didn't see the swift thing, /me looks
[05:13:30] <jeremyb>	 swift maybe isn't urgent or maybe is, unsure. but it's something that's happened recently and won't fix itself
[05:13:55] <jeremyb>	 (that's why we now have the special case nagios alert for load avg)
[05:14:37] <jeremyb>	 greg-g: not sure about that. you're talking about having a second beta cluster (in it's own project) and there's QA stuff unrelated to beta, right? e.g. jenkins, cloudbees, etc.
[05:14:48] <jeremyb>	 greg-g: i was thinking maybe it should be all a single unified log
[05:41:19] <icinga-wm>	 PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds  
[05:41:29] <icinga-wm>	 PROBLEM - HTTP error ratio anomaly detection on labmon1001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds  
[06:00:40] <icinga-wm>	 PROBLEM - DPKG on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:41] <icinga-wm>	 PROBLEM - swift-account-reaper on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:49] <icinga-wm>	 PROBLEM - puppet last run on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:50] <icinga-wm>	 PROBLEM - SSH on ms-be1005 is CRITICAL: CRITICAL - Socket timeout after 10 seconds  
[06:00:50] <icinga-wm>	 PROBLEM - swift-account-server on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:50] <icinga-wm>	 PROBLEM - swift-account-auditor on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:50] <icinga-wm>	 PROBLEM - swift-container-updater on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:50] <icinga-wm>	 PROBLEM - swift-container-replicator on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:59] <icinga-wm>	 PROBLEM - swift-container-auditor on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:00:59] <icinga-wm>	 PROBLEM - swift-container-server on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:09] <icinga-wm>	 PROBLEM - swift-object-replicator on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:09] <icinga-wm>	 PROBLEM - RAID on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:09] <icinga-wm>	 PROBLEM - swift-object-auditor on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:10] <icinga-wm>	 PROBLEM - check if dhclient is running on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:10] <icinga-wm>	 PROBLEM - swift-object-updater on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:19] <icinga-wm>	 PROBLEM - swift-object-server on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:01:30] <icinga-wm>	 PROBLEM - swift-account-replicator on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.  
[06:05:54] <jeremyb>	 greg-g: this seems to be the same as https://rt.wikimedia.org/Ticket/Display.html?id=8249
[06:06:05] <jeremyb>	 fwiw
[06:13:29] <icinga-wm>	 PROBLEM - NTP on ms-be1005 is CRITICAL: NTP CRITICAL: No response from NTP server  
[06:14:04] <jeremyb>	 icinga-wm: thanks! we got the picture
[06:14:15] <godog>	 ah indeed, what we have observed is the machine being responsive but it doesn't seem the case here, anyways I'm taking a look on the console
[06:15:32] <godog>	 !log powercycle ms-be1005, not even responsive on console
[06:15:37] <morebots>	 Logged the message, Master
[06:15:56] <jeremyb>	 was there a stack at least?
[06:16:11] <godog>	 no :( just last line of getty
[06:16:27] <godog>	 perhaps it has logged on kern.log
[06:16:45] <jeremyb>	 is kern.log xfs? :D
[06:17:29] <icinga-wm>	 PROBLEM - Host ms-be1005 is DOWN: PING CRITICAL - Packet loss = 100%  
[06:17:53] <godog>	 haha fair point, I think icinga wouldn't spam if services would have dependencies fwiw
[06:18:19] <icinga-wm>	 RECOVERY - swift-account-replicator on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator  
[06:18:29] <icinga-wm>	 RECOVERY - very high load average likely xfs on ms-be1005 is OK: OK - load average: 5.51, 1.32, 0.44  
[06:18:29] <icinga-wm>	 RECOVERY - Host ms-be1005 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms  
[06:18:30] <icinga-wm>	 RECOVERY - DPKG on ms-be1005 is OK: All packages OK  
[06:18:39] <icinga-wm>	 RECOVERY - swift-account-reaper on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper  
[06:18:39] <icinga-wm>	 RECOVERY - puppet last run on ms-be1005 is OK: OK: Puppet is currently enabled, last run 2430 seconds ago with 0 failures  
[06:18:40] <icinga-wm>	 RECOVERY - SSH on ms-be1005 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)  
[06:18:40] <icinga-wm>	 RECOVERY - swift-account-auditor on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor  
[06:18:40] <icinga-wm>	 RECOVERY - swift-account-server on ms-be1005 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server  
[06:18:40] <icinga-wm>	 RECOVERY - swift-container-replicator on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator  
[06:18:40] <icinga-wm>	 RECOVERY - swift-container-updater on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater  
[06:18:49] <icinga-wm>	 RECOVERY - swift-container-auditor on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor  
[06:18:49] <icinga-wm>	 RECOVERY - swift-container-server on ms-be1005 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server  
[06:18:50] <jeremyb>	 godog: yeah,.well i was thinking we could have a single unified check for "all expected processes running"
[06:18:59] <icinga-wm>	 RECOVERY - swift-object-replicator on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator  
[06:18:59] <icinga-wm>	 RECOVERY - RAID on ms-be1005 is OK: OK: optimal, 14 logical, 14 physical  
[06:19:00] <icinga-wm>	 RECOVERY - swift-object-auditor on ms-be1005 is OK: PROCS OK: 3 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor  
[06:19:00] <icinga-wm>	 RECOVERY - check if dhclient is running on ms-be1005 is OK: PROCS OK: 0 processes with command name dhclient  
[06:19:00] <icinga-wm>	 RECOVERY - swift-object-updater on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater  
[06:19:09] <icinga-wm>	 RECOVERY - swift-object-server on ms-be1005 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server  
[06:19:16] <jeremyb>	 dependencies could help a little. but maybe not right timing
[06:20:30] <godog>	 yep worth a try
[06:21:49] <grrrit-wm>	 (03PS1) 10Ori.livneh: update require_package() to latest [puppet] - 10https://gerrit.wikimedia.org/r/158913 
[06:27:40] <icinga-wm>	 PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Epic puppet fail  
[06:28:09] <icinga-wm>	 PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: Puppet has 1 failures  
[06:28:49] <icinga-wm>	 PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 1 failures  
[06:29:00] <icinga-wm>	 PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: Puppet has 1 failures  
[06:30:30] <icinga-wm>	 RECOVERY - Disk space on ms1004 is OK: DISK OK  
[06:33:30] <icinga-wm>	 PROBLEM - Disk space on ms1004 is CRITICAL: DISK CRITICAL - free space: / 2 MB (0% inode=94%): /var/lib/ureadahead/debugfs 2 MB (0% inode=94%):  
[06:45:49] <icinga-wm>	 RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures  
[06:45:59] <icinga-wm>	 RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures  
[06:46:09] <icinga-wm>	 RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures  
[06:47:40] <icinga-wm>	 RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures  
[06:52:27] <icinga-wm>	 PROBLEM - puppet last run on ssl3002 is CRITICAL: CRITICAL: Puppet has 1 failures  
[06:52:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[06:56:40] <icinga-wm>	 PROBLEM - puppet last run on db60 is CRITICAL: CRITICAL: Puppet has 3 failures  
[06:59:49] <icinga-wm>	 RECOVERY - swift eqiad-prod container availability on tungsten is OK: OK: Less than 1.00% under the threshold [98.0]  
[07:07:19] <jzerebecki>	 bd808|BUFFER: thx
[07:10:19] <icinga-wm>	 RECOVERY - puppet last run on ssl3002 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures  
[07:14:40] <icinga-wm>	 RECOVERY - puppet last run on db60 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures  
[07:18:19] <icinga-wm>	 RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected  
[07:18:29] <icinga-wm>	 RECOVERY - HTTP error ratio anomaly detection on labmon1001 is OK: OK: No anomaly detected  
[08:53:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[10:54:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[12:47:49] <icinga-wm>	 PROBLEM - puppet last run on amssq48 is CRITICAL: CRITICAL: Epic puppet fail  
[12:55:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[13:06:49] <icinga-wm>	 RECOVERY - puppet last run on amssq48 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures  
[14:46:49] <icinga-wm>	 PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Epic puppet fail  
[14:56:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[14:59:34] <grrrit-wm>	 (03CR) 10Hoo man: [C: 04-1] "This wont work as is as you use arrays as strings right now (see inline comments)." (037 comments) [puppet] - 10https://gerrit.wikimedia.org/r/155753 (owner: 1001tonythomas)
[15:06:49] <icinga-wm>	 RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures  
[15:12:09] <apergos>	 !log manually changed /etc/hosts entry on analytics1004 from having "analyticas1004.eqiad.wmnet" to "analytics1004.eqiad.wmnet"
[15:12:15] <morebots>	 Logged the message, Master
[15:12:28] <apergos>	 just in case that impacts something weirdly (I doubt it)
[15:30:41] <grrrit-wm>	 (03PS30) 1001tonythomas: Added the bouncehandler router to catch in all bounce emails [puppet] - 10https://gerrit.wikimedia.org/r/155753 
[16:02:35] <grrrit-wm>	 (03CR) 10Hoo man: "Looks ok now (untested, at first glance)." (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/155753 (owner: 1001tonythomas)
[16:05:08] <grrrit-wm>	 (03PS2) 10Faidon Liambotis: Allocate sandbox vlans for codfw and ulsfo [dns] - 10https://gerrit.wikimedia.org/r/158636 (owner: 10Mark Bergsma)
[16:05:10] <grrrit-wm>	 (03PS1) 10Faidon Liambotis: Allocate IPv4/IPv6 for RIPE Atlas codfw/ulsfo [dns] - 10https://gerrit.wikimedia.org/r/158939 
[16:26:30] <grrrit-wm>	 (03CR) 10Faidon Liambotis: [C: 031] "A couple of small comments inline." (032 comments) [dns] - 10https://gerrit.wikimedia.org/r/158382 (owner: 10BBlack)
[16:57:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[17:24:51] <grrrit-wm>	 (03CR) 10Faidon Liambotis: [C: 032] "I added this statement during the gdnsd switchover. The only reason I did was to keep the delta of responses from the old PowerDNS setup a" [puppet] - 10https://gerrit.wikimedia.org/r/158637 (owner: 10BBlack)
[17:49:24] <jackmcbarn>	 what does mw1011 do? is it a jobrunner?
[18:09:42] <grrrit-wm>	 (03CR) 10Ori.livneh: [C: 032] update require_package() to latest [puppet] - 10https://gerrit.wikimedia.org/r/158913 (owner: 10Ori.livneh)
[18:10:29] <ori>	 jackmcbarn: yes
[18:10:38] <jackmcbarn>	 ori: and it's zend, right?
[18:10:51] <ori>	 i'm pretty sure, but let me confirm
[18:11:00] <ori>	 i think _joe._ imaged a couple of additional machines
[18:11:12] <ori>	 yes, it's zend
[18:11:28] <ori>	 why, is there some indication that it's bisbehaving?
[18:11:44] <jackmcbarn>	 it took 10 seconds to run lua that takes 2 seconds when i purge/nulledit/preview it
[18:13:01] <jackmcbarn>	 (and it didn't get done, so it probably would have taken a lot longer were it allowed to finish)
[18:13:07] <ori>	 are you using HHVM?
[18:13:27] <ori>	 also, it has high load relative to the web tier apaches
[18:14:13] <jackmcbarn>	 yes, i'm on hhvm. when i switch to zend, then it takes 5-6 seconds
[18:14:42] <jackmcbarn>	 should we maybe give jobrunners a higher timeout than the webservers have, to mitigate the "Script error"s appearing on articles?
[18:16:15] <ori>	 i'm more inclined to think that we should increase capacity such that there isn't a gap in performance
[18:16:21] <ori>	 how widespread of a problem is it?
[18:18:03] <jackmcbarn>	 this is the first one i've seen since you shut off the hhvm jobrunner
[18:18:38] <jackmcbarn>	 though there's a good chance that since page cache and links tables aren't always in sync, that there's a lot more that i don't know about
[18:19:36] <ori>	 jackmcbarn: (slight tangent.. ) to what extent do the resource limits constrain what people do with lua?
[18:21:02] <ori>	 i ask because i worry about the following scenario: we roll out HHVM, performance improves sharply for a bit, then gradually declines to the current norm as people use the new platform to do more and more complex things with Lua.
[18:21:48] <jackmcbarn>	 ori: i've never seen an article anywhere near the limit for "real"
[18:22:23] <ori>	 ok, that's reassuring
[18:22:40] <jackmcbarn>	 and even really pathological cases (one of which just crashed firefox as i was about to give you the url) only takes 0.5 seconds of lua
[18:23:07] <ori>	 on reflection maybe you're right about just increasing the limit as a stopgap for the jobrunners
[18:23:22] <ori>	 what would be reasonable? 15s?
[18:23:58] <jackmcbarn>	 i'd say 20 for now, and see if they start to go over that
[18:25:20] <jackmcbarn>	 i'm thinking of adding a feature to scribunto, that if pages use more than (configurable) half of their assigned time, put them in a warning category
[18:25:57] <ori>	 that would be extremely useful
[18:33:51] <grrrit-wm>	 (03PS1) 10Ori.livneh: Scribunto: double the Lua CPU limit on the job runners [mediawiki-config] - 10https://gerrit.wikimedia.org/r/158948 
[18:35:50] <grrrit-wm>	 (03CR) 10Ori.livneh: Scribunto: double the Lua CPU limit on the job runners (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/158948 (owner: 10Ori.livneh)
[18:37:37] <ori>	 jackmcbarn: thanks again for your work on lua scripting in general.
[18:58:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[19:11:16] <grrrit-wm>	 (03PS1) 10Jackmcbarn: Increase $wgSVGMaxSize to 4096 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/158951 (https://bugzilla.wikimedia.org/70529) 
[19:38:14] <grrrit-wm>	 (03PS31) 1001tonythomas: Added the bouncehandler router to catch in all bounce emails [puppet] - 10https://gerrit.wikimedia.org/r/155753 
[19:50:11] <matanya>	 welcome back(?) paravoid
[19:50:48] <paravoid>	 nope :)
[19:50:55] <matanya>	 I hopped
[20:36:22] <ori>	 !log mw1017: upgraded HHVM from 3.3-dev+20140728+wmf5 to 3.3-dev+20140728+wmf6
[20:36:26] <morebots>	 Logged the message, Master
[20:59:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[21:10:44] <grrrit-wm>	 (03PS2) 10Ori.livneh: Clean up salt::grain [puppet] - 10https://gerrit.wikimedia.org/r/153783 
[21:11:42] <grrrit-wm>	 (03PS10) 10Ori.livneh: Clean up salt::minion [puppet] - 10https://gerrit.wikimedia.org/r/153727 
[21:15:24] <grrrit-wm>	 (03PS1) 10Krinkle: apache: Remove old comments referencing 'yaseo' [puppet] - 10https://gerrit.wikimedia.org/r/158996 
[23:00:49] <icinga-wm>	 PROBLEM - Puppet freshness on mw1053 is CRITICAL: Last successful Puppet run was Thu 04 Sep 2014 00:21:29 UTC  
[23:11:28] <grrrit-wm>	 (03PS2) 10Alex Monk: apache: Remove old comments referencing 'yaseo' [puppet] - 10https://gerrit.wikimedia.org/r/158996 (owner: 10Krinkle)
[23:25:29] <icinga-wm>	 PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: Epic puppet fail  
[23:35:05] <TimStarling>	 !log upgrading liblua everywhere
[23:35:09] <morebots>	 Logged the message, Master
[23:38:10] <icinga-wm>	 PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: Puppet has 1 failures  
[23:40:00] <icinga-wm>	 PROBLEM - puppet last run on mw1157 is CRITICAL: CRITICAL: Puppet has 1 failures  
[23:40:19] <icinga-wm>	 PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: Puppet has 2 failures  
[23:41:09] <icinga-wm>	 PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Puppet has 1 failures  
[23:41:39] <icinga-wm>	 PROBLEM - puppet last run on mw1138 is CRITICAL: CRITICAL: Puppet has 2 failures  
[23:41:39] <icinga-wm>	 PROBLEM - puppet last run on mw1182 is CRITICAL: CRITICAL: Puppet has 1 failures  
[23:42:59] <icinga-wm>	 PROBLEM - puppet last run on mw1062 is CRITICAL: CRITICAL: Puppet has 3 failures  
[23:43:10] <icinga-wm>	 PROBLEM - puppet last run on mw1132 is CRITICAL: CRITICAL: Puppet has 1 failures  
[23:43:39] <icinga-wm>	 PROBLEM - puppet last run on mw1038 is CRITICAL: CRITICAL: Puppet has 2 failures  
[23:44:09] <icinga-wm>	 PROBLEM - puppet last run on mw1005 is CRITICAL: CRITICAL: Puppet has 1 failures  
[23:44:29] <icinga-wm>	 RECOVERY - puppet last run on amssq61 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures  
[23:57:10] <icinga-wm>	 RECOVERY - puppet last run on tmh1001 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures  
[23:57:19] <icinga-wm>	 RECOVERY - puppet last run on mw1137 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures  
[23:58:00] <icinga-wm>	 RECOVERY - puppet last run on mw1157 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures  
[23:58:40] <icinga-wm>	 RECOVERY - puppet last run on mw1182 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures  
[23:59:09] <icinga-wm>	 RECOVERY - puppet last run on mw1214 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures