[00:01:28] paravoid: is this exim config really faulty? i believe it is as Tim L. says
[00:01:31] https://gerrit.wikimedia.org/r/#/c/127481/1
[00:01:45] it's toollabs
[00:02:21] (CR) Faidon Liambotis: [C: 2] Tools: Remove faulty exim configuration [operations/puppet] - https://gerrit.wikimedia.org/r/127481 (owner: Tim Landscheidt)
[00:02:28] thx!
[00:02:35] yvw :)
[00:04:35] out now, have a nice weekend all
[00:04:40] I've just noticed Visual Editor is enabled on OTRS Wiki, but it doesn't seem to work. Clicking "edit" returns an error ("Error loading data from server: parsoidserver-http-bad-status: 404"). Is that currently being worked on and/or should that go to bugzilla?
[00:05:08] gwicke: ^^
[00:07:16] (PS1) Springle: db1047 mariadb 10 now replicates s2 [operations/puppet] - https://gerrit.wikimedia.org/r/144096
[00:12:23] pajz: hmm.. James_F ^^
[00:12:26] ;)
[00:12:35] hot potato routing
[00:12:54] subbu has left already
[00:14:14] gwicke: Bleh.
[00:14:38] (CR) Springle: [C: 2] db1047 mariadb 10 now replicates s2 [operations/puppet] - https://gerrit.wikimedia.org/r/144096 (owner: Springle)
[00:15:39] James_F: are you looking into it?
[00:15:45] Yeah.
[00:15:49] Didn't realise it went out.
[00:16:21] James_F: k, thx!
[00:27:07] James_F: deploying..
[00:27:18] gwicke: Thanks!
[00:28:15] !log deployed parsoid config change e21a534 to support VE on the OTRS wiki
[00:28:21] Logged the message, Master
[00:29:10] Thanks.
[00:31:20] pajz: Should be fixed now. Sorry!
[00:35:18] Thanks. Out of curiosity, was there a request to enable VE on OTRS Wiki?
[00:37:26] Also, it's still not working for me. The error is gone, but it's now just loading endlessly (both FF and Chrome).
[00:37:54] Looks like beta labs is in read-only mode right now.
[00:38:07] James_F, ^
[00:38:52] pajz: Yeah, I chatted with the OTRS admins and in the end we said it was OK as a Beta Feature (but not on by default).
[00:38:59] pajz: Wasn't expecting it to go out today, sorry.
[00:39:45] pajz: Do you get anything in the console?
[00:40:00] But, uhm, it is now enabled by default, isn't it?
[00:40:57] pajz: Indeed.
[00:44:33] jackmcbarn, I'm not really sure what I should be looking for in the console.
[00:45:03] Provided I'm even looking at the right thing.
[00:48:23] James_F: ^
[00:48:45] James_F, http://pastebin.com/TAvK7uTA
[00:48:47] pajz: Well, if there are any messages at all that's generally a bad sign. :-)
[00:49:27] Oh. Eurgh.
[00:49:35] Damn private wikis with terrible code.
[00:51:48] pajz: Yeah, I chatted with the OTRS admins and in the end we said it was OK as a Beta Feature (but not on by default).
[00:51:52] PROBLEM - puppet last run on lvs4003 is CRITICAL: CRITICAL: Puppet has 1 failures
[00:51:55] It was supposed to be opt in
[00:54:01] RD: Yeah. :-(
[00:54:11] RD, could you check and see if it works for you now? I'm seeing an error message in chrome which is related to a (now-deleted) .js file of mine, so I wonder if that could be related.
[00:54:54] Appears to be working now, yes. (FF)
[00:55:06] Now to make it opt-in.. :D
[00:55:06] pajz: Yeah, it works for me now.
[00:55:17] Sorry. I'm being bias
[00:55:17] Ah. Hmm.
[00:57:36] yeah, I second the request to make it opt-in if that's possible.
[01:03:33] Hmm, I can't get it to work.
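A note on that troubleshooting sequence: VE's "parsoidserver-http-bad-status" error surfaces the HTTP status the Parsoid service returned, so hitting Parsoid directly separates "the service has no entry for this wiki yet" (the 404, cleared by gwicke's config deploy) from a purely client-side problem (the later endless loading, which pajz suspected was down to a now-deleted user .js file). Below is a rough probe along those lines, assuming an old-style Parsoid HTTP API of GET <base>/<prefix>/<title>; the example host, port and otrs_wikiwiki prefix are guesses, not the production values.

    import sys
    import urllib2


    def probe_parsoid(base, prefix, title='Main_Page'):
        # Assumed API shape: GET <base>/<wiki prefix>/<page title>
        url = '%s/%s/%s' % (base.rstrip('/'), prefix, title)
        try:
            code = urllib2.urlopen(url, timeout=10).getcode()
        except urllib2.HTTPError as err:
            # A 404 here typically means Parsoid's config has no prefix for this wiki.
            code = err.code
        print('%s -> HTTP %s' % (url, code))
        return code


    if __name__ == '__main__':
        # e.g. python probe_parsoid.py http://parsoid.example:8000 otrs_wikiwiki
        probe_parsoid(*sys.argv[1:])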
[01:09:56] RECOVERY - puppet last run on lvs4003 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[01:33:41] PROBLEM - puppet last run on tungsten is CRITICAL: CRITICAL: Puppet has 1 failures
[01:51:35] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 60 seconds ago with 0 failures
[01:59:38] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:05:28] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 53279 bytes in 0.409 second response time
[02:15:18] PROBLEM - graphite.wikimedia.org on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:19:28] PROBLEM - RAID on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:20:21] RECOVERY - RAID on fenari is OK: OK: Active: 2, Working: 2, Failed: 0, Spare: 0
[02:33:33] !log LocalisationUpdate completed (1.24wmf11) at 2014-07-04 02:32:29+00:00
[02:33:38] Logged the message, Master
[02:34:01] RECOVERY - graphite.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 1607 bytes in 0.003 second response time
[03:03:52] !log LocalisationUpdate completed (1.24wmf12) at 2014-07-04 03:02:49+00:00
[03:03:56] Logged the message, Master
[03:29:35] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 4 03:28:29 UTC 2014 (duration 28m 28s)
[03:29:41] Logged the message, Master
[04:31:25] (CR) Ori.livneh: [C: -1] "On reflection, I think that this is more risky than it needs to be. Let's defer the actual use of the Puppet-managed configuration files t" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/143329 (owner: Giuseppe Lavagetto)
[04:50:13] PROBLEM - puppet last run on amssq55 is CRITICAL: CRITICAL: Puppet has 1 failures
[05:07:11] PROBLEM - RAID on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:07:41] PROBLEM - MediaWiki profile collector on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:08:01] RECOVERY - RAID on tungsten is OK: OK: optimal, 1 logical, 2 physical
[05:08:11] RECOVERY - puppet last run on amssq55 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[05:08:31] RECOVERY - MediaWiki profile collector on tungsten is OK: OK: All defined mwprof jobs are runnning.
[06:27:32] PROBLEM - puppet last run on elastic1012 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:22] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:32] PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:28:32] PROBLEM - puppet last run on mw1217 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:28:42] PROBLEM - puppet last run on mw1189 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:42] PROBLEM - puppet last run on search1016 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:28:42] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:28:52] PROBLEM - puppet last run on mw1009 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:52] PROBLEM - puppet last run on mw1153 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:28:52] PROBLEM - puppet last run on db1002 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:28:52] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:02] PROBLEM - puppet last run on mw1046 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:02] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:29:02] PROBLEM - puppet last run on mw1060 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:02] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:02] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:03] PROBLEM - puppet last run on mw1150 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:03] PROBLEM - puppet last run on elastic1008 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:04] PROBLEM - puppet last run on mw1088 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:04] PROBLEM - puppet last run on mw1100 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:05] PROBLEM - puppet last run on mw1187 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:12] PROBLEM - puppet last run on search1010 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:12] PROBLEM - puppet last run on mw1176 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:12] PROBLEM - puppet last run on db74 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:12] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:22] PROBLEM - puppet last run on iron is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:22] PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:29:22] PROBLEM - puppet last run on holmium is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:22] PROBLEM - puppet last run on mw1205 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:22] PROBLEM - puppet last run on mw1164 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:23] PROBLEM - puppet last run on search1018 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:32] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:32] PROBLEM - puppet last run on mw1117 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:29:32] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:42] PROBLEM - puppet last run on mw1068 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:42] PROBLEM - puppet last run on mw1003 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:42] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:52] PROBLEM - puppet last run on mw1120 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:29:52] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:02] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:02] PROBLEM - puppet last run on db1028 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:02] PROBLEM - puppet last run on mw1099 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:03] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:12] PROBLEM - puppet last run on searchidx1001 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:12] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:12] PROBLEM - puppet last run on cp3016 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:22] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:42] PROBLEM - puppet last run on db1021 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:38:08] i'm not sure why icinga-wm has to speak in near-palindromes
[06:44:09] RECOVERY - puppet last run on mw1187 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[06:44:09] RECOVERY - puppet last run on db74 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[06:44:29] RECOVERY - puppet last run on mw1117 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[06:44:29] RECOVERY - puppet last run on cp3003 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[06:44:39] RECOVERY - puppet last run on elastic1012 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[06:44:39] RECOVERY - puppet last run on search1016 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:44:49] RECOVERY - puppet last run on mw1009 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[06:44:59] RECOVERY - puppet last run on mw1046 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[06:45:09] RECOVERY - puppet last run on mw1150 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:45:09] RECOVERY - puppet last run on elastic1008 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:45:09] RECOVERY - puppet last run on mw1100 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:45:09] RECOVERY - puppet last run on searchidx1001 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:45:29] RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[06:45:29] RECOVERY - puppet last run on mw1205 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[06:45:29] RECOVERY - puppet last run on lvs1005 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[06:45:29] RECOVERY - puppet last run on mw1217 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:45:39] RECOVERY - puppet last run on mw1068 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[06:45:39] RECOVERY - puppet last run on mw1003 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[06:45:49] RECOVERY - puppet last run on db1022 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:45:49] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures
[06:45:49] RECOVERY - puppet last run on mw1120 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[06:45:49] RECOVERY - puppet last run on mw1153 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:45:59] RECOVERY - puppet last run on db1002 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[06:45:59] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[06:45:59] RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:45:59] RECOVERY - puppet last run on iron is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[06:46:00] RECOVERY - puppet last run on mw1060 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[06:46:00] RECOVERY - puppet last run on mw1099 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[06:46:00] RECOVERY - puppet last run on mw1173 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[06:46:09] RECOVERY - puppet last run on mw1088 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[06:46:09] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[06:46:09] RECOVERY - puppet last run on mw1176 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[06:46:09] RECOVERY - puppet last run on search1010 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:46:09] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[06:46:10] RECOVERY - puppet last run on cp3016 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:46:19] RECOVERY - puppet last run on mw1166 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[06:46:29] RECOVERY - puppet last run on holmium is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[06:46:29] RECOVERY - puppet last run on mw1164 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[06:46:29] RECOVERY - puppet last run on search1018 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:46:29] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[06:46:39] RECOVERY - puppet last run on mw1189 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:46:39] RECOVERY - puppet last run on db1021 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[06:46:49] RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[06:46:49] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[06:46:59] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[06:46:59] RECOVERY - puppet last run on db1028 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:46:59] RECOVERY - puppet last run on search1001 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[06:47:09] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[06:53:06] (CR) Gilles: [C: 1] Remove remaining surveys for Media Viewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/143750 (owner: MarkTraceur)
[07:39:34] (PS1) Yurik: Handle ZERO's new carrier ip subnets [operations/puppet] - https://gerrit.wikimedia.org/r/144131
[07:46:58] PROBLEM - graphite.wikimedia.org on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:48:16] <_joe_> sigh
[07:52:48] RECOVERY - graphite.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 1607 bytes in 0.012 second response time
[07:59:39] (CR) Filippo Giunchedi: [C: 1] "one ignorable comment (prefixed with ~~) but looks good!" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/143597 (owner: Giuseppe Lavagetto)
[08:01:31] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures
[08:05:39] good morning
[08:06:33] morning?
[08:06:45] it's night for us! :P
[08:07:14] <_joe_> it's always night for someone
[08:08:05] (PS8) Giuseppe Lavagetto: nutcracker: move config in puppet, work with new packages [operations/puppet] - https://gerrit.wikimedia.org/r/143597
[08:10:38] MaxSem: might technically stillb e night for me
[08:10:53] went back home at something like 3am.
[08:12:09] gj
[08:12:33] selebrated ID?
[08:14:27] <_joe_> hashar: you debauched frenchman! you're supposed to CODE until 3 am, not have fun like normal people!
[08:15:12] ho
[08:15:28] i would code if I was a software developer :D
[08:15:37] HAAAAAA
[08:16:02] CODE PUPPET THEN
[08:16:16] oh puppet is not code
[08:16:30] it just a a harness for masochists newbie sysadmins
[08:17:07] <_joe_> hashar: you never tried coding erlang I guess
[08:17:59] na
[08:18:07] but I could since I mostly copy paste from stackoverflow
[08:20:34] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[08:21:36] <_joe_> hashar: :P
[08:23:32] _joe_: have you starred at the zuul puppet patches?
[08:24:20] <_joe_> hashar: nope sorry
[10:02:58] (PS9) Giuseppe Lavagetto: nutcracker: move config in puppet, work with new packages [operations/puppet] - https://gerrit.wikimedia.org/r/143597
[10:04:05] (PS10) Giuseppe Lavagetto: nutcracker: move config in puppet, work with new packages [operations/puppet] - https://gerrit.wikimedia.org/r/143597
[10:57:49] (CR) Alexandros Kosiaris: [C: 2] deprecated syntax in mysql/generic_my.cnf.erb [operations/puppet] - https://gerrit.wikimedia.org/r/143529 (owner: Dzahn)
[10:59:18] (CR) Alexandros Kosiaris: [C: 2] apt: minor lint [operations/puppet] - https://gerrit.wikimedia.org/r/143271 (owner: Matanya)
[11:04:41] PROBLEM - puppet last run on tungsten is CRITICAL: CRITICAL: Complete puppet failure
[11:09:41] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[11:18:51] PROBLEM - puppet last run on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:19:01] PROBLEM - Graphite Carbon on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:19:11] PROBLEM - RAID on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:19:11] PROBLEM - SSH on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:19:42] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 612 seconds ago with 0 failures
[11:19:52] RECOVERY - Graphite Carbon on tungsten is OK: OK: All defined Carbon jobs are runnning.
[11:20:02] RECOVERY - RAID on tungsten is OK: OK: optimal, 1 logical, 2 physical
[11:20:02] RECOVERY - SSH on tungsten is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[11:40:40] PROBLEM - puppet last run on tungsten is CRITICAL: CRITICAL: Complete puppet failure
[11:43:20] (PS1) Alexandros Kosiaris: Fix eventlogging duplicate definition [operations/puppet] - https://gerrit.wikimedia.org/r/144146
[11:47:35] it seems wikitech is still vulnerable agains mitm as it has an old libssl. was reported here: https://bugzilla.wikimedia.org/show_bug.cgi?id=53259#c15
[11:50:40] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[11:50:51] _joe_: could you find someone to look at it?
[11:51:28] He's /away
[11:51:33] I've just poked some Opsen
[11:51:41] thx
[11:58:20] PROBLEM - DPKG on virt1000 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[11:59:28] jzerebecki: ^^ being worked on ;)
[12:00:50] (CR) Alexandros Kosiaris: [C: 2] Fix eventlogging duplicate definition [operations/puppet] - https://gerrit.wikimedia.org/r/144146 (owner: Alexandros Kosiaris)
[12:01:53] PROBLEM - puppetmaster https on virt1000 is CRITICAL: Connection refused
[12:01:53] PROBLEM - HTTP on virt1000 is CRITICAL: Connection refused
[12:02:13] PROBLEM - Memcached on virt1000 is CRITICAL: Connection refused
[12:03:13] RECOVERY - Memcached on virt1000 is OK: TCP OK - 0.001 second response time on port 11000
[12:03:45] (CR) Alexandros Kosiaris: "This broke puppet on tungsten due to a duplicate definition. The same name was re-used a couple of lines above. Fixed in Iff01dce7dc032f21" [operations/puppet] - https://gerrit.wikimedia.org/r/143865 (https://bugzilla.wikimedia.org/67073) (owner: Nuria)
[12:03:53] RECOVERY - puppetmaster https on virt1000 is OK: HTTP OK: Status line output matched 400 - 335 bytes in 0.157 second response time
[12:03:53] RECOVERY - HTTP on virt1000 is OK: HTTP OK: HTTP/1.1 302 Found - 457 bytes in 0.003 second response time
[12:15:53] PROBLEM - puppet last run on virt1000 is CRITICAL: CRITICAL: Puppet has 1 failures
[12:19:21] (CR) Alexandros Kosiaris: [C: -2] "NAK, See https://gerrit.wikimedia.org/r/#/c/143306, we should be merging this (and its dependencies) next week." [operations/puppet] - https://gerrit.wikimedia.org/r/144092 (owner: Dzahn)
[12:26:15] keystone upgrade sux
[12:26:26] RECOVERY - DPKG on virt1000 is OK: All packages OK
[12:26:31] and makes dpkg complain
[12:28:44] !log executed dist-upgrade on virt1000. Keystone configure phase failed in keystone-manage db-sync and hence dpkg configure failed. It was trying to create an already existing index in the database. Dropped the index, ran dpkg --configure -a to recreate the index (and whatever else keystone-manage db_sync does). All is back to normal.
[12:28:49] Logged the message, Master
[12:28:57] openstack sux
[12:29:08] that's just stupid
[12:29:46] RECOVERY - puppet last run on virt1000 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[12:29:51] thanks akosiaris
[12:48:58] <_joe_> akosiaris: talk about idempotent packages....
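Some context for that !log entry: keystone-manage db_sync died because a migration step tried to create an index that was already there, so the package's configure phase fell over until the index was dropped by hand, which is what _joe_'s "idempotent packages" jab is about. Below is a toy sketch of the guard such a step needs, using sqlite3 purely so it runs standalone; the table and index names are illustrative, not keystone's actual schema or migration code.

    import sqlite3


    def ensure_index(conn, index_name, table, column):
        # Create the index only if it does not already exist, so re-running the
        # migration after a half-finished upgrade is a no-op instead of an error.
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
            (index_name,)).fetchone()
        if exists is None:
            conn.execute('CREATE INDEX %s ON %s (%s)' % (index_name, table, column))
            conn.commit()


    if __name__ == '__main__':
        conn = sqlite3.connect(':memory:')
        conn.execute('CREATE TABLE token (id INTEGER PRIMARY KEY, expires TEXT)')
        ensure_index(conn, 'ix_token_expires', 'token', 'expires')
        # Second call is harmless; a bare CREATE INDEX would raise here.
        ensure_index(conn, 'ix_token_expires', 'token', 'expires')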
[12:53:40] _joe_: btw, only late at night I realized that a big(?) part of your point was the use of exec rather than file, makes sense as well :)
[12:53:46] I'm investigating is_puppet_master now
[12:54:03] hmm, actually I should just use the sudo solution
[12:54:16] depending on is_puppet_master conflates the puppetmaster and diamond roles, sounds... wrong
[12:54:21] * YuviPanda goes back to writing code
[12:54:48] <_joe_> eheh
[12:56:03] at least cat is not a bash builtin :)
[13:35:23] (PS10) Yuvipanda: diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861
[13:37:42] (PS11) Yuvipanda: diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861
[13:38:34] hmm, every puppet run seems to be failing for me with 'Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class passwords::mysql::phabricator for graphite-test.eqiad.wmflabs on node graphite-test.eqiad.wmflabs'
[13:41:29] hmm, had to comment those out
[13:42:21] YuviPanda: That's usually defined in puppet private... not sure how you handle that for labs
[13:42:40] hoo: yeah, but I shouldn't need that at all, I guess, cine I'm not doing anything with phab
[13:42:48] the phab module should be fixed
[13:43:50] YuviPanda: Yeah... you could check whether the class is defined in there
[13:43:52] YuviPanda: it is probably missing from labs/private (gerrit repo, public)
[13:44:05] or that
[13:44:17] godog: aaah, right. I suppose labs puppetmaster add that as well, but this one just forgot to add it
[13:44:23] I can't access the private repo, can someone copy it over?
[13:44:52] YuviPanda: labs/private is public despite the name :) just contains dummy passwords
[13:45:38] hmm, right. but I guess it'll be easier to just copy them from the private repo while dummying out the passwords, but I guess I can also look for the individual variables being used and dummy them out myself
[13:45:49] but for now I just commented those out and my (unrelated) work continues...
[13:47:56] <_joe_> YuviPanda: labs/private has that class
[13:48:02] oh
[14:14:16] PROBLEM - graphite.wikimedia.org on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:16:40] (PS12) Yuvipanda: diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861
[14:17:06] RECOVERY - graphite.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 1607 bytes in 0.005 second response time
[14:17:16] _joe_: chasemp ^ python + sudo solution
[14:28:54] (CR) Yuvipanda: "Rewritten to fix issues pointed out by the comments." [operations/puppet] - https://gerrit.wikimedia.org/r/143861 (owner: Yuvipanda)
[14:32:53] <_joe_> YuviPanda: :)
[14:33:09] _joe_: it's somewhat ugly, but I guess beats messing around the puppet package
[14:33:29] _joe_: do remove the -2 (and maybe +2? :) ) if you're ok with the solution
[14:33:47] <_joe_> YuviPanda: yes, whenever I can
[14:34:01] _joe_: ty! I'll be afk for a while now, though.
[14:34:03] <_joe_> I'm kinda busy with something that incredibly decided to break on friday
[14:34:22] _joe_: ah, that never happens! :)
[14:34:28] _joe_: good luck, and thanks for all the comments/help!
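For anyone skimming the backlog: the "python + sudo solution" under review in Gerrit 143861 boils down to shipping a narrow sudoers rule that lets the diamond user run /bin/cat on puppet's state file, and having the Diamond collector shell out through it, rather than granting diamond direct read access or patching the puppet package (hence the "cat is not a bash builtin" remark). A minimal sketch of that shape, not the actual patch; the collector name, the state-file path and the sudoers text below are assumptions for illustration.

    import subprocess
    import yaml
    import diamond.collector


    class PuppetAgentCollector(diamond.collector.Collector):

        # Assumed companion sudoers rule (e.g. /etc/sudoers.d/50_diamond_sudo_for_puppet):
        #   diamond ALL = (root) NOPASSWD: /bin/cat /var/lib/puppet/state/last_run_summary.yaml
        # sudoers is unforgiving: one malformed line is enough to produce the
        # "parse error ... near line 6" cronspam that turns up later in the afternoon.
        STATE_FILE = '/var/lib/puppet/state/last_run_summary.yaml'

        def collect(self):
            try:
                output = subprocess.check_output(['sudo', '/bin/cat', self.STATE_FILE])
            except (subprocess.CalledProcessError, OSError):
                return
            summary = yaml.safe_load(output) or {}
            # Publish the numeric leaves, e.g. events.failure, resources.changed, time.total
            for section, values in summary.items():
                if not isinstance(values, dict):
                    continue
                for key, value in values.items():
                    if isinstance(value, (int, float)):
                        self.publish('%s.%s' % (section, key), value)

The trade-off Coren flags in his +1 is exactly this: the monitoring process now depends on sudo being configured correctly on every host, which is what ends up biting the labs instances a couple of hours later.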
[14:37:14] (PS13) Yuvipanda: diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861
[14:39:36] <_joe_> YuviPanda|zzz: cronspam from graphite-test
[14:44:19] (CR) coren: [C: 1] "I'm really not in love with the idea of invoking sudo with configurable parameters from deep within a monitoring process; but this is a de" [operations/puppet] - https://gerrit.wikimedia.org/r/143861 (owner: Yuvipanda)
[14:53:19] PROBLEM - RAID on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:54:09] RECOVERY - RAID on tungsten is OK: OK: optimal, 1 logical, 2 physical
[14:57:49] PROBLEM - Puppet freshness on db1009 is CRITICAL: Last successful Puppet run was Fri 04 Jul 2014 12:56:58 UTC
[15:01:47] _joe_: oh? to you?
[15:02:17] I don't know how that'll happen :|
[15:04:01] <_joe_> diamond : parse error in /etc/sudoers.d/50_diamond_sudo_for_puppet near line 6 ; TTY=pts/2 ; PWD=/ ;
[15:04:06] <_joe_> YuviPanda: ^^
[15:04:09] ah
[15:04:13] yeah, I fixed that in a later patch
[15:04:17] let me update the puppetmaster
[15:05:27] _joe_: should be fixed now
[15:05:41] <_joe_> ok
[15:05:42] <_joe_> :)
[15:05:53] well, should be fixed once this puppet run completes
[15:06:23] yeah, seems to be fixed now
[15:10:58] (CR) Giuseppe Lavagetto: [C: 2] diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861 (owner: Yuvipanda)
[15:12:16] (CR) Giuseppe Lavagetto: [C: 1] "de-moting to +1 as per small comment." (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/143861 (owner: Yuvipanda)
[15:12:38] <_joe_> YuviPanda: sorry to be your nightmare
[15:12:41] <_joe_> :)
[15:14:17] _joe_: updated
[15:14:24] (PS14) Yuvipanda: diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861
[15:14:42] _joe_: and if you think this was a nightmare, you should work with more designers :)
[15:15:13] <_joe_> YuviPanda: I'm color and style blind, which guarantees I'm always left very far from any frontend work
[15:15:50] <_joe_> thank god :)
[15:16:00] _joe_: aaah! :) I've adopted a philosophy of never questioning designers' colors, but even then... :)
[15:19:48] _joe_: want to upgrade to a +2, since the comment was fixed? :)
[15:19:56] <_joe_> oh yes sorry
[15:20:08] <_joe_> I was deep down in apache configs right now :)
[15:20:30] _joe_: ah, ok :) Sorry to pester, just poked since you first +2'd and then +1'd :)
[15:20:46] <_joe_> don't worry
[15:20:54] :)
[15:21:26] (CR) Giuseppe Lavagetto: [C: 2] diamond: Let diamond read the puppet state file [operations/puppet] - https://gerrit.wikimedia.org/r/143861 (owner: Yuvipanda)
[15:21:30] _joe_: w00t
[15:21:35] _joe_: thanks a lot!
[15:21:51] <_joe_> YuviPanda: puppet-merged
[15:22:21] (PS1) Milimetric: Add setting for max instances per recurrent run [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/144154
[15:22:25] Anyone firm with salt around?
[15:22:41] <_joe_> I often spill it
[15:22:49] <_joe_> hoo: what do you need?
[15:23:02] _joe_: look at fenari pid 21625
[15:23:19] <_joe_> ok 1 sec
[15:23:27] heavy in ram and on cpu... running for a couple of days now
[15:24:32] <_joe_> hoo: is it salt that copies python files in tmp?
[15:24:39] <_joe_> ...
[15:24:47] _joe_: According to ppid, yes
[15:24:54] I just walked the process tree up :P
[15:24:55] <_joe_> hoo: I've seen that
[15:25:09] <_joe_> my question was out of shock actually
[15:25:10] <_joe_> :)
[15:26:19] <_joe_> hoo: seems like it's reading every file in /home/.snapshot/hourly.0/wikipedia/htdocs/foundation/w2/
[15:26:33] <_joe_> so probably in /home/.snapshot
[15:26:56] ah, that's why sda is so busy :P
[15:27:04] <_joe_> so, it's not stuck, it's just doing something apparently stupid
[15:27:23] <_joe_> we can just kill it, but I'm not sure that's a good idea honestly
[15:28:19] mh, yes... I'm not into that enoguh to know :/
[15:28:20] brb
[15:28:36] <_joe_> me neither
[15:37:06] RECOVERY - Puppet freshness on db1009 is OK: puppet ran at Fri Jul 4 15:37:03 UTC 2014
[15:40:33] <_joe_> !log restarting salt-minion, killing io hungry job on fenari running since jun 30, 00 AM
[15:40:38] Logged the message, Master
[15:42:08] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures
[16:00:04] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[16:01:10] see you on monday *wave*
[16:03:17] hello anyone, how can i acknowledge an icinga alarm?
[16:04:59] ping _joe_
[16:05:13] Unless you're ops, in most cases you can't
[16:05:38] if the alarm comes to my group .. i should be able right?
[16:06:03] Not necesserily
[16:06:08] Reedy: as in "nagios/icinga group"
[16:06:50] ahh I see,Reedy, can you tell me how you do it in the icinga UI and i will try?
[16:07:28] <_joe_> nuria: hey
[16:07:49] hello, _joe_
[16:08:21] <_joe_> nuria: what alarm do you need to acknowledge?
[16:08:23] i was trying to "acknowledge" teh troughput alarm (must be missconfigured, it is alrming for "over" but should alarm for "under")
[16:08:47] "throughput of event logging"
[16:09:02] i cannot do it on the uI ( i must not have permits)
[16:09:35] <_joe_> done
[16:10:38] ok, thanks _joe_
[16:10:57] will look at alarm and script to see if teh "under" parameter is not being passed correctly
[16:22:01] <_joe_> nuria: it used to...
[16:22:12] <_joe_> YuviPanda|zzz: more cronspam
[16:22:15] <_joe_> from labs
[16:22:22] <_joe_> well I'm off now
[16:22:34] <_joe_> someone will take care of it hopefully
[16:25:25] YuviPanda|zzz: Dude, diamond is spamming root again.
[16:27:14] (PS1) coren: Revert "diamond: Let diamond read the puppet state file" [operations/puppet] - https://gerrit.wikimedia.org/r/144161
[16:28:20] (CR) coren: [C: 2] "Reverting now as this will likely cause gmail to throttle opsen email again." [operations/puppet] - https://gerrit.wikimedia.org/r/144161 (owner: coren)
[16:29:16] Come /on/ Jenkins.
[16:30:25] (CR) coren: [V: 2] "+2V, this needs reversion now." [operations/puppet] - https://gerrit.wikimedia.org/r/144161 (owner: coren)
[16:32:56] <_joe_> Coren: and you'll need to run puppet on those machines
[16:33:07] <_joe_> Coren: the change seemed correct btw
[16:33:10] _joe_: I was about to do a salt run now.
[16:33:16] <_joe_> Coren: ok
[16:33:25] _joe_: Agreed, but some hosts seemingly do not apply the sudo rule right.
[16:33:30] <_joe_> so there must be something wrong on those hosts
[16:33:32] <_joe_> yes
[16:34:02] <_joe_> Coren: a badly configure puppetmaster maybe?
[16:35:23] _joe_: Proably. I'm still trying to find a matching grain or minion glob to match. Don't want to do puppet agent on '*'
[16:35:39] <_joe_> on labs?
[16:35:47] <_joe_> that would kill virt1000
[16:37:32] I know, that's why I'm trying to find a suitable grain.
[16:37:48] Doing it on -G puppet::self
[16:38:08] <_joe_> Coren: or, you could just kill diamond everywhere
[16:38:17] <_joe_> that will get restarted on the next puppet run
[16:38:21] <_joe_> with the correct code
[16:38:31] ... not insane.
[16:38:34] <_joe_> as long as the deployment puppetmaster is updated
[16:38:55] <_joe_> Coren: they're all deployment-*
[16:39:04] <_joe_> so you must find that puppetmaster
[16:40:19] And figure out how /it/ gets upstream.
[16:40:22] &^@%#
[16:40:47] It's deployment-salt
[16:41:47] Ah, and they have unmerged changes.
[16:42:51] I expect that's why things broke in the first place.
[16:44:08] * Coren manually fixes the merge conflict
[16:45:42] <_joe_> yes
[16:45:56] <_joe_> people should not automerge
[16:47:20] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:20] PROBLEM - DPKG on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:20] PROBLEM - swift-account-reaper on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:21] PROBLEM - RAID on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:30] PROBLEM - swift-account-replicator on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:30] PROBLEM - swift-object-replicator on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:30] PROBLEM - check if dhclient is running on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:30] PROBLEM - SSH on ms-be1005 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:47:30] PROBLEM - swift-container-server on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:31] PROBLEM - swift-container-auditor on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:31] PROBLEM - swift-object-auditor on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:32] PROBLEM - swift-account-server on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:32] PROBLEM - check configured eth on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:47:40] PROBLEM - swift-container-replicator on ms-be1005 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
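(Picking the thread back up from just before the ms-be1005 alerts.) _joe_'s suggestion is the low-impact route: stop diamond on the affected instances now and let each host's next scheduled puppet run restart it once the reverted code is in place, rather than forcing an immediate agent run everywhere and stampeding virt1000. A sketch of that via salt's Python client; the grain expression is an assumption standing in for whatever '-G' target actually matched the instances.

    import salt.client


    def stop_diamond(grain='roles:diamond'):
        # expr_form='grain' is the API spelling of 'salt -G <key:value> ...'
        client = salt.client.LocalClient()
        results = client.cmd(grain, 'service.stop', ['diamond'], expr_form='grain')
        for minion, stopped in sorted(results.items()):
            print('%s: %s' % (minion, 'stopped' if stopped else 'not running or failed'))


    if __name__ == '__main__':
        stop_diamond()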
[16:48:10] RECOVERY - puppet last run on ms-be1005 is OK: OK: Puppet is currently enabled, last run 587 seconds ago with 0 failures
[16:48:10] RECOVERY - DPKG on ms-be1005 is OK: All packages OK
[16:48:11] RECOVERY - swift-account-reaper on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper
[16:48:20] RECOVERY - swift-object-replicator on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator
[16:48:20] RECOVERY - check if dhclient is running on ms-be1005 is OK: PROCS OK: 0 processes with command name dhclient
[16:48:20] RECOVERY - swift-account-replicator on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator
[16:48:20] RECOVERY - SSH on ms-be1005 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[16:48:20] RECOVERY - swift-container-server on ms-be1005 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[16:48:21] RECOVERY - swift-container-auditor on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[16:48:21] RECOVERY - swift-account-server on ms-be1005 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[16:48:22] RECOVERY - swift-object-auditor on ms-be1005 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[16:48:22] RECOVERY - check configured eth on ms-be1005 is OK: NRPE: Unable to read output
[16:48:30] RECOVERY - swift-container-replicator on ms-be1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator
[16:48:30] PROBLEM - Disk space on ms-be1005 is CRITICAL: DISK CRITICAL - /srv/swift-storage/sdg1 is not accessible: Input/output error
[16:57:20] PROBLEM - RAID on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:57:20] PROBLEM - SSH on lvs1002 is CRITICAL: Server answer:
[16:57:21] PROBLEM - SSH on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:57:21] PROBLEM - uWSGI web apps on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:57:30] PROBLEM - MediaWiki profile collector on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:57:30] PROBLEM - check configured eth on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:57:40] PROBLEM - check if dhclient is running on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:57:40] PROBLEM - puppet last run on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:57:40] PROBLEM - Graphite Carbon on tungsten is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:58:20] RECOVERY - SSH on lvs1002 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[16:59:09] RECOVERY - RAID on tungsten is OK: OK: optimal, 1 logical, 2 physical
[16:59:10] RECOVERY - SSH on tungsten is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[16:59:19] RECOVERY - uWSGI web apps on tungsten is OK: OK: All defined uWSGI apps are runnning.
[16:59:19] RECOVERY - check configured eth on tungsten is OK: NRPE: Unable to read output
[16:59:20] RECOVERY - MediaWiki profile collector on tungsten is OK: OK: All defined mwprof jobs are runnning.
[16:59:29] RECOVERY - check if dhclient is running on tungsten is OK: PROCS OK: 0 processes with command name dhclient
[16:59:29] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 572 seconds ago with 0 failures
[16:59:29] RECOVERY - Graphite Carbon on tungsten is OK: OK: All defined Carbon jobs are runnning.
[17:01:09] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CRITICAL: Puppet has 1 failures
[17:38:06] PROBLEM - Puppet freshness on db1009 is CRITICAL: Last successful Puppet run was Fri 04 Jul 2014 15:37:03 UTC
[17:57:18] (PS2) Milimetric: Add setting for max instances per recurrent run [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/144154
[18:17:29] RECOVERY - Puppet freshness on db1009 is OK: puppet ran at Fri Jul 4 18:17:24 UTC 2014
[19:20:12] sorry, I think I just accidentally created a new ticket in RT.
[19:20:28] it (and the original ticket) and be closed...
[19:28:57] (PS1) Nuria: Eventlogging monitoring, pass true w/o quotes [operations/puppet] - https://gerrit.wikimedia.org/r/144178
[20:05:45] !log Ran sync-common on fenari to update the docs on noc.wikimedia.org
[20:05:50] Logged the message, Master
[20:06:13] Nemo_bis: ^ that was what you asked for earlier (regarding dblists)
[20:08:46] thanks
[20:18:29] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[20:35:47] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[22:08:39] (PS1) Ottomata: Production now uses CDH (CDH5) module, also refactor hadoop.pp role [operations/puppet] - https://gerrit.wikimedia.org/r/144242