[00:01:09] PROBLEM - Puppet freshness on db1007 is CRITICAL: Last successful Puppet run was Sat 12 Jul 2014 21:59:40 UTC
[00:19:56] RECOVERY - Puppet freshness on db1007 is OK: puppet ran at Sun Jul 13 00:19:52 UTC 2014
[02:12:44] !log migratePass0.php finished a while back
[02:12:54] Logged the message, Master
[02:14:35] !log LocalisationUpdate completed (1.24wmf12) at 2014-07-13 02:13:32+00:00
[02:14:40] Logged the message, Master
[02:24:59] !log LocalisationUpdate completed (1.24wmf13) at 2014-07-13 02:23:56+00:00
[02:25:04] Logged the message, Master
[02:54:39] !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 13 02:53:33 UTC 2014 (duration 53m 32s)
[02:54:45] Logged the message, Master
[05:41:22] (03CR) 10Giuseppe Lavagetto: [C: 031] "I erroneously did not remove this cipher. Apart from having !DH at the end of the chiphers list, we do not set a dh_param, so that for ngi" [operations/puppet] - 10https://gerrit.wikimedia.org/r/145688 (owner: 10Dzahn)
[06:08:38] PROBLEM - graphite.wikimedia.org on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:29:04] PROBLEM - puppet last run on search1010 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:24] RECOVERY - graphite.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 1607 bytes in 0.005 second response time
[06:30:14] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:24] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:24] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:44] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:24] PROBLEM - puppet last run on mw1117 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:44:05] RECOVERY - puppet last run on search1010 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[06:45:25] RECOVERY - puppet last run on db1018 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[06:46:15] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[06:46:25] RECOVERY - puppet last run on mw1117 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[06:46:25] RECOVERY - puppet last run on mw1173 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[06:46:45] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[10:39:39] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:40:09] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:40:19] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:43:19] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[10:46:20] PROBLEM - puppet last run on amssq49 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:46:20] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:58:09] RECOVERY - puppet last run on cp3021 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[10:58:09] RECOVERY - puppet last run on cp3019 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[10:58:20] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[10:58:21] RECOVERY - puppet last run on cp3017 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
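The review comment at 05:41 above points out that nginx cannot actually use DHE cipher suites unless a dh_param is configured, which is why keeping !DH at the end of the cipher list is the consistent choice. As a purely hypothetical Puppet-flavoured sketch of what providing one would involve (file paths and module names here are assumptions, not the actual operations/puppet nginx module):

    # Hypothetical sketch only; not the real operations/puppet nginx code.
    # For DHE suites to be negotiable, nginx needs an explicit dhparam file:
    file { '/etc/nginx/dhparam.pem':
      ensure => file,
      mode   => '0444',
      source => 'puppet:///modules/nginx/dhparam.pem',  # assumed file location
    }
    # ...and the vhost template would then have to reference it:
    #   ssl_dhparam /etc/nginx/dhparam.pem;
    # Without that directive, DHE ciphers left in the list would silently
    # never be offered, so excluding them with !DH is the safer default.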
[11:03:22] RECOVERY - puppet last run on amssq54 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[11:04:21] RECOVERY - puppet last run on amssq49 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[12:21:27] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures
[12:39:25] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[13:16:28] (03PS1) 10Matanya: gitblit: fully qualify vars [operations/puppet] - 10https://gerrit.wikimedia.org/r/145894
[14:46:37] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 9 below the confidence bounds
[15:33:44] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[17:17:42] PROBLEM - puppetmaster https on palladium is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:18:32] RECOVERY - puppetmaster https on palladium is OK: HTTP OK: Status line output matched 400 - 335 bytes in 0.095 second response time
[18:31:58] (03PS1) 10Ori.livneh: apache: on service refresh, do a graceful reload instead of start/stop [operations/puppet] - 10https://gerrit.wikimedia.org/r/145908
[20:01:58] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures
[20:20:45] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[20:20:45] PROBLEM - Packetloss_Average on analytics1003 is CRITICAL: packet_loss_average CRITICAL: 10.4707055
[20:30:45] RECOVERY - Packetloss_Average on analytics1003 is OK: packet_loss_average OKAY: 1.88484675
[20:46:18] PROBLEM - Packetloss_Average on oxygen is CRITICAL: packet_loss_average CRITICAL: 20.2849271667
[20:46:48] PROBLEM - Packetloss_Average on analytics1003 is CRITICAL: packet_loss_average CRITICAL: 21.0789915833
[20:53:12] (03PS1) 10BryanDavis: labs_vagrant: Install to /srv/vagrant [operations/puppet] - 10https://gerrit.wikimedia.org/r/145974
[20:53:14] (03PS1) 10BryanDavis: labs_vagrant: cleanup sudoers config [operations/puppet] - 10https://gerrit.wikimedia.org/r/145975
[20:56:01] (03PS2) 10BryanDavis: labs_vagrant: Install to /srv/vagrant [operations/puppet] - 10https://gerrit.wikimedia.org/r/145974
[20:59:06] (03PS3) 10BryanDavis: labs_vagrant: Install to /srv/vagrant [operations/puppet] - 10https://gerrit.wikimedia.org/r/145974
[21:00:55] (03PS4) 10BryanDavis: labs_vagrant: Install to /srv/vagrant [operations/puppet] - 10https://gerrit.wikimedia.org/r/145974
[21:03:53] !log git-deploy: Deploying integration/slave-scripts I7f2b476807465
[21:03:59] Logged the message, Master
[21:16:16] RECOVERY - Packetloss_Average on oxygen is OK: packet_loss_average OKAY: 1.22043441667
[21:16:56] (03PS1) 10QChris: Reflect move of refinery script to drop partitions [operations/puppet] - 10https://gerrit.wikimedia.org/r/145980
[21:17:36] (03CR) 10QChris: "This change depends on" [operations/puppet] - 10https://gerrit.wikimedia.org/r/145980 (owner: 10QChris)
[21:30:43] PROBLEM - Packetloss_Average on analytics1003 is CRITICAL: packet_loss_average CRITICAL: 10.8208922689
[21:32:14] PROBLEM - Packetloss_Average on oxygen is CRITICAL: packet_loss_average CRITICAL: 13.9927621008
[21:33:00] (03PS5) 10BryanDavis: labs_vagrant: Install to /srv/vagrant [operations/puppet] - 10https://gerrit.wikimedia.org/r/145974
[21:36:14] RECOVERY - Packetloss_Average on oxygen is OK: packet_loss_average OKAY: 1.36625383333
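The change proposed at 18:31 (https://gerrit.wikimedia.org/r/145908) concerns how Puppet refreshes the Apache service. A minimal sketch of the general mechanism, not the patch itself: a service resource can be given an explicit restart command, which Puppet then uses on refresh instead of a stop/start cycle. The resource name and command path below are assumptions for illustration only.

    # Sketch under assumed names/paths; not the actual apache module change.
    service { 'apache2':
      ensure     => running,
      enable     => true,
      hasrestart => true,
      # When another resource notifies this service, Puppet runs this command
      # instead of stopping and starting the daemon.
      restart    => '/usr/sbin/apache2ctl graceful',  # assumed binary path
    }

A graceful reload lets in-flight requests finish, so config-triggered refreshes avoid the brief outage a full stop/start causes.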
[21:40:50] RECOVERY - Packetloss_Average on analytics1003 is OK: packet_loss_average OKAY: 1.80645991597
[21:44:50] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[21:46:21] (03CR) 10BryanDavis: "Tested via cherry-pick to self-hosted puppetmaster on bd808-vagrant.wmflabs.net instance in wikimedia-support project." [operations/puppet] - 10https://gerrit.wikimedia.org/r/145974 (owner: 10BryanDavis)
[21:46:51] (03PS2) 10BryanDavis: labs_vagrant: cleanup sudoers config [operations/puppet] - 10https://gerrit.wikimedia.org/r/145975
[21:52:11] PROBLEM - Packetloss_Average on oxygen is CRITICAL: packet_loss_average CRITICAL: 23.4265182353
[21:56:49] bd808: Do you know anything about 'cobalt'?
[21:56:50] PROBLEM - Packetloss_Average on analytics1003 is CRITICAL: packet_loss_average CRITICAL: 42.41308275
[21:56:58] Doesn't show up in http://ganglia.wikimedia.org/latest/ and not documented on wikitech
[21:57:03] bd808: ty for the patches!
[21:57:12] Noticed it got an IP recently; https://gerrit.wikimedia.org/r/#/c/139080/
[21:57:31] Krinkle: Hmm.. no I haven't heard about it
[21:57:51] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[21:58:10] YuviPanda: yw! I was playing with labs-vagrant and ran my disk up to 98% :)
[21:58:19] :D
[21:59:43] YuviPanda: I was wondering if the labs_vagrant role should run `labs-vagrant provision` too. Thoughts?
[22:00:01] bd808: right, but that would mean running provision every time puppet parent runs (30m?)
[22:00:44] yeah, it would, but "usually" it should be a slow no-op right?
[22:01:16] It would set up the wiki the first time. Maybe it could be done with a notify for the initial provision instead.
[22:01:28] I hate notify though. It makes puppet non-deterministic
[22:04:17] Krinkle: cobalt doesn't seem to be in site.pp, so I'd guess it's a new misc server for something that isn't provisioned yet.
[22:04:32] bd808: right. I'm 50/50 on that, but don't mind either way.
[22:05:04] https://github.com/search?q=cobalt.wikimedia.org+%40wikimedia&type=Code&ref=searchresults
[22:05:28] bd808: Krinkle It's currently awaiting a new hard drive according to RT
[22:05:35] https://github.com/wikimedia/operations-puppet/search?q=cobalt&ref=cmdform
[22:05:38] so it's likely unassigned and down as being dead
[22:05:39] Yeah
[22:05:45] k
[22:12:50] !log stopping puppet on rcs1001 to debug nginx issue
[22:12:55] Logged the message, Master
[22:36:18] RECOVERY - Packetloss_Average on oxygen is OK: packet_loss_average OKAY: 1.67081932773
[22:41:53] RECOVERY - Packetloss_Average on analytics1003 is OK: packet_loss_average OKAY: 0.146342857143
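To make the "notify for the initial provision" idea from the 21:59-22:01 exchange concrete, one possible shape is a refresh-only exec in the labs_vagrant role. This is purely illustrative: the resource names, command path, and the subscribed resource are assumptions, not the real operations/puppet labs_vagrant module.

    # Hypothetical sketch of running `labs-vagrant provision` once, on notify.
    exec { 'labs_vagrant_initial_provision':
      command     => '/usr/local/bin/labs-vagrant provision',  # assumed path
      refreshonly => true,   # only runs when a subscribed resource changes
      timeout     => 3600,   # the first provision can take a while
      subscribe   => Git::Clone['vagrant'],  # hypothetical resource that creates the checkout
    }

With refreshonly the half-hourly agent run stays a cheap no-op, but the trade-off is exactly the non-determinism objected to at 22:01: the exec only fires when the subscribed resource reports a change, not on every run.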