[00:00:27] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/24645
[00:00:58] RECOVERY - Apache HTTP on srv190 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.006 seconds
[00:09:04] RECOVERY - NTP on srv190 is OK: NTP OK: Offset 0.0157648325 secs
[00:13:16] PROBLEM - Apache HTTP on srv190 is CRITICAL: Connection refused
[00:13:49] AaronSchulz: we should test it :)
[00:19:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:20:47] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/24645
[00:27:04] PROBLEM - Puppet freshness on analytics1005 is CRITICAL: Puppet has not run in the last 10 hours
[00:33:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.804 seconds
[00:38:01] PROBLEM - Puppet freshness on aluminium is CRITICAL: Puppet has not run in the last 10 hours
[00:54:25] RECOVERY - Apache HTTP on srv190 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.029 second response time
[01:08:40] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:22:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.045 seconds
[01:41:13] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 216 seconds
[01:42:43] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 262 seconds
[01:45:43] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 5 seconds
[01:45:52] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 7 seconds
[01:54:52] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:08:58] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.145 seconds
[02:59:10] RECOVERY - Puppet freshness on analytics1005 is OK: puppet ran at Sat Sep 22 02:59:06 UTC 2012
[03:18:49] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[03:18:49] PROBLEM - Puppet freshness on singer is CRITICAL: Puppet has not run in the last 10 hours
[03:18:49] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours
[03:18:49] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours
[03:18:49] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours
[03:18:49] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[05:10:06] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 212 seconds
[05:10:42] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 205 seconds
[05:16:33] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[05:16:51] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay 0 seconds
[05:18:08] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 0 seconds
[05:30:08] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 216 seconds
[05:30:35] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 216 seconds
[05:34:47] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 4 seconds
[05:35:05] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay 0 seconds
[06:06:08] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[06:12:38] so how many jobrunners do we have now compared to before https://gerrit.wikimedia.org/r/24645 ?
[06:24:44] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out
[06:32:23] RECOVERY - Lucene on search1015 is OK: TCP OK - 9.032 second response time on port 8123
[06:43:50] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out
[06:52:05] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out
[06:53:17] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.027 second response time on port 8123
[07:26:08] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 187 seconds
[07:26:08] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 186 seconds
[07:27:47] RECOVERY - Lucene on search1015 is OK: TCP OK - 9.017 second response time on port 8123
[07:27:56] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[07:29:08] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 0 seconds
[07:29:08] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay 0 seconds
[07:38:57] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out
[07:40:09] RECOVERY - Lucene on search1015 is OK: TCP OK - 0.027 second response time on port 8123
[07:48:34] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out
[07:50:04] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.027 second response time on port 8123
[08:17:12] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out
[08:24:33] RECOVERY - Lucene on search1015 is OK: TCP OK - 3.023 second response time on port 8123
[08:33:47] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out
[08:35:17] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out
[08:36:38] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.027 second response time on port 8123
[08:41:35] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out
[08:47:44] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.027 second response time on port 8123
[08:47:44] RECOVERY - Lucene on search1015 is OK: TCP OK - 0.027 second response time on port 8123
[08:49:59] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[08:49:59] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[09:05:26] PROBLEM - Lucene on search1015 is CRITICAL: Connection timed out
[09:24:29] RECOVERY - Lucene on search1015 is OK: TCP OK - 3.018 second response time on port 8123
[09:26:13] !log Restarted lucene on search1015
[09:26:24] Logged the message, Master
[09:43:24] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours
[10:38:49] PROBLEM - Puppet freshness on aluminium is CRITICAL: Puppet has not run in the last 10 hours
[10:41:29] New patchset: Dereckson; "(bug 40436) Namespaces configuration for se.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/24656
[10:45:16] PROBLEM - MySQL Replication Heartbeat on db56 is CRITICAL: CRIT replication delay 199 seconds
[10:46:01] PROBLEM - MySQL Slave Delay on db56 is CRITICAL: CRIT replication delay 220 seconds
[11:11:58] RECOVERY - MySQL Replication Heartbeat on db56 is OK: OK replication delay 7 seconds
[11:12:07] RECOVERY - MySQL Slave Delay on db56 is OK: OK replication delay 16 seconds
[11:18:07] PROBLEM - MySQL Slave Delay on db56 is CRITICAL: CRIT replication delay 243 seconds
[11:18:07] PROBLEM - MySQL Replication Heartbeat on db56 is CRITICAL: CRIT replication delay 243 seconds
[12:07:20] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100%
[12:10:02] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 0.60 ms
[13:02:36] RECOVERY - MySQL Slave Delay on db56 is OK: OK replication delay 4 seconds
[13:03:12] RECOVERY - MySQL Replication Heartbeat on db56 is OK: OK replication delay 2 seconds
[13:12:03] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100%
[13:15:12] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 2.42 ms
[13:19:51] PROBLEM - Puppet freshness on singer is CRITICAL: Puppet has not run in the last 10 hours
[13:19:51] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[13:19:51] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours
[13:19:51] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[13:19:51] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours
[13:19:51] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours
[15:18:03] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[16:07:22] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[16:12:28] PROBLEM - Host cp1043 is DOWN: PING CRITICAL - Packet loss = 100%
[17:28:59] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[18:19:18] can anyone tell me when the last updates for http://rt.wikimedia.org/Ticket/Display.html?id=452 and http://rt.wikimedia.org/Ticket/Display.html?id=456 were?
[18:39:36] ACKNOWLEDGEMENT - MySQL Slave Delay on es1001 is CRITICAL: CRIT replication delay 94330 seconds asher part of making read-only
[18:39:36] ACKNOWLEDGEMENT - MySQL Slave Delay on es1002 is CRITICAL: CRIT replication delay 82406 seconds asher part of making read-only
[18:40:06] ACKNOWLEDGEMENT - MySQL Slave Delay on es1004 is CRITICAL: CRIT replication delay 82379 seconds asher part of making read-only
[18:40:06] ACKNOWLEDGEMENT - MySQL Slave Delay on es2 is CRITICAL: CRIT replication delay 94392 seconds asher part of making read-only
[18:40:36] ACKNOWLEDGEMENT - MySQL Slave Delay on es4 is CRITICAL: CRIT replication delay 94438 seconds asher part of making read-only
[18:46:24] PROBLEM - MySQL Replication Heartbeat on db1001 is CRITICAL: CRIT replication delay 269 seconds
[18:46:42] PROBLEM - MySQL Slave Delay on db1001 is CRITICAL: CRIT replication delay 287 seconds
[18:51:21] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[18:51:21] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[18:54:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:57:12] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.170 seconds
[19:32:36] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:38:36] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.187 seconds
[19:43:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:18] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours
[19:44:27] RECOVERY - MySQL Replication Heartbeat on db1001 is OK: OK replication delay 0 seconds
[19:44:36] RECOVERY - MySQL Slave Delay on db1001 is OK: OK replication delay 0 seconds
[19:44:45] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.036 seconds
[20:12:45] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:30:18] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.032 seconds
[20:39:36] PROBLEM - Puppet freshness on aluminium is CRITICAL: Puppet has not run in the last 10 hours
[20:45:59] New review: Dereckson; "Moving the discussion back to the bug report to discuss the change more broadly (and in a more natur..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/23985
[21:00:54] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:02:14] New patchset: Dereckson; "(bug 39569) Activating flood flag on it.wikibooks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/24671
[21:12:29] New patchset: Dereckson; "(bug 38398) Namespace configuration for meta." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/24672
[21:14:51] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.028 seconds
[21:47:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:00:42] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.203 seconds
[22:15:03] New review: Dereckson; "We need bug reference." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/24561
[22:34:27] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:48:43] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 1.120 seconds
[23:20:51] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[23:20:51] PROBLEM - Puppet freshness on singer is CRITICAL: Puppet has not run in the last 10 hours
[23:20:51] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours
[23:20:51] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[23:20:51] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours
[23:20:51] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours
[23:23:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:35:51] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.319 seconds