[00:10:33] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:12:57] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Puppet has not run in the last 10 hours
[00:13:51] * AaronSchulz wonders why his rgw broke
[00:15:58] nvm
[00:21:57] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours
[00:21:57] PROBLEM - Puppet freshness on ms-be1002 is CRITICAL: Puppet has not run in the last 10 hours
[00:21:57] PROBLEM - Puppet freshness on ms-be1001 is CRITICAL: Puppet has not run in the last 10 hours
[00:21:57] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours
[00:23:36] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.596 seconds
[00:31:18] preilly: here
[00:31:23] rfaulkner: welcome to the real channel
[00:31:31] haha ty
[00:31:31] rfaulkner: so this is the one that you can use https://github.com/wikimedia/Sartoris
[00:32:36] PROBLEM - Host wikipedia-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::1
[00:32:37] PROBLEM - Host wikiquote-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::3
[00:32:38] PROBLEM - Host mediawiki-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::8
[00:32:45] PROBLEM - Swift HTTP on ms-fe1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:32:57] hrm
[00:33:30] PROBLEM - Host wikiversity-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::7
[00:33:48] PROBLEM - Host upload-lb.esams.wikimedia.org_ipv6_https is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::b
[00:33:49] PROBLEM - Host wikiversity-lb.esams.wikimedia.org_ipv6_https is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::7
[00:33:54] fuck, it looks like cr1-eqiad's fpc rebooted
[00:33:55] again
[00:33:55] Ryan_Lane: every time I hear "swift" my ears twitch
[00:33:57] PROBLEM - Host bits-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::a
[00:34:15] PROBLEM - Host wikisource-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::5
[00:34:15] PROBLEM - Host wiktionary-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::2
[00:34:16] PROBLEM - Host foundation-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::9
[00:34:16] PROBLEM - Host wikinews-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::6
[00:34:24] PROBLEM - Swift HTTP on ms-fe1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:34:33] AaronSchulz: :D
[00:34:33] PROBLEM - Host upload-lb.esams.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:862:ed1a::b
[00:34:51] RECOVERY - Host bits-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 115.55 ms
[00:34:51] RECOVERY - Host foundation-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 116.90 ms
[00:35:00] RECOVERY - Host upload-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 117.52 ms
[00:35:01] RECOVERY - Host wikisource-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 115.58 ms
[00:35:09] RECOVERY - Host wikinews-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 116.84 ms
[00:35:11] RECOVERY - Host wiktionary-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 117.32 ms
[00:35:11] RECOVERY - Host wikiversity-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 115.97 ms
[00:35:28] RECOVERY - Host wikiquote-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 115.59 ms
[00:35:29] RECOVERY - Host wikipedia-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 115.91 ms
[00:35:36] RECOVERY - Host mediawiki-lb.esams.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 117.37 ms
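[editor's note: the DOWN/UP notices above come from ping-based host checks (`/bin/ping6 -n -U -w 15 -c 5 …`) whose output is summarized as packet loss and round-trip time. A minimal sketch of how such a check maps those two numbers to Nagios states and exit codes follows; the thresholds and function names are illustrative, not the real Icinga plugin configuration.]

```python
# Hedged sketch of a Nagios-style ping check: given packet loss (%) and RTA
# (ms) parsed from ping6 output, emit a state and the conventional exit code
# (0=OK, 1=WARNING, 2=CRITICAL). Thresholds below are assumptions.

STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}

def classify_ping(loss_pct, rta_ms, warn=(80, 200.0), crit=(100, 500.0)):
    """Map packet loss and round-trip time to a Nagios exit code."""
    if loss_pct >= crit[0] or (rta_ms is not None and rta_ms >= crit[1]):
        return 2
    if loss_pct >= warn[0] or (rta_ms is not None and rta_ms >= warn[1]):
        return 1
    return 0

def render(loss_pct, rta_ms):
    """Produce (exit_code, status_line) in the format seen in the log."""
    code = classify_ping(loss_pct, rta_ms)
    if rta_ms is None:  # total loss: no RTA to report
        return code, "PING %s - Packet loss = %d%%" % (STATES[code], loss_pct)
    return code, "PING %s - Packet loss = %d%%, RTA = %.2f ms" % (
        STATES[code], loss_pct, rta_ms)
```

[the esams recoveries above, e.g. 0% loss at 115.55 ms, would classify as OK under these assumed thresholds, while 100% loss is CRITICAL regardless of RTA.]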
[00:36:48] !log appears that cr1-eqiad's fpc rebooted
[00:36:56] Logged the message, Mistress of the network gear.
[00:37:08] yeah it's not the first time
[00:37:15] I was debugging it the other day too
[00:37:30] then figured there's nothing I can do and told ma rk, not sure if he told you
[00:39:30] RECOVERY - Host upload-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 114.36 ms
[00:39:31] RECOVERY - Host wikiversity-lb.esams.wikimedia.org_ipv6_https is UP: PING OK - Packet loss = 0%, RTA = 112.42 ms
[00:40:01] yeah
[00:40:05] :(
[00:41:45] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 185 seconds
[00:42:48] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 187 seconds
[00:56:00] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 189 seconds
[00:56:27] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 196 seconds
[00:57:48] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:11:09] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 17 seconds
[01:12:03] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[01:12:21] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.719 seconds
[01:21:31] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 262 seconds
[01:23:09] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 1 seconds
[01:24:03] PROBLEM - Swift HTTP on ms-fe1004 is CRITICAL: HTTP CRITICAL - No data received from host
[02:25:26] !log LocalisationUpdate completed (1.21wmf6) at Sat Dec 15 02:25:26 UTC 2012
[02:25:37] Logged the message, Master
[02:27:17] so if juniper doesn't get back to me soon, i'll disable bgp and switch ospf costs on cr1 to make all traffic go to cr2-eqiad
[02:27:22] it'll fuck our commits though :(
[02:35:54] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:39:03] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.930 seconds
[02:41:25] New review: Catrope; "DNS for this is done now. Note that when deploying this change, a Pybal restart (per https://wikitec..." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/38457
[02:45:56] !log drained all transit traffic from cr1-eqiad -- restore by load override /var/home/lcarr/cr1-eqiad.undrained if traffic levels seem too high
[02:46:05] Logged the message, Mistress of the network gear.
[02:47:17] !log LocalisationUpdate completed (1.21wmf5) at Sat Dec 15 02:47:17 UTC 2012
[02:47:26] Logged the message, Master
[03:32:45] RECOVERY - Puppet freshness on erzurumi is OK: puppet ran at Sat Dec 15 03:32:38 UTC 2012
[03:57:48] PROBLEM - Puppet freshness on ms-be3 is CRITICAL: Puppet has not run in the last 10 hours
[06:00:17] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:03:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.998 seconds
[06:06:08] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[06:14:05] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours
[06:37:47] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:49:11] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.806 seconds
[06:51:26] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[07:20:05] PROBLEM - Swift HTTP on ms-fe1003 is CRITICAL: HTTP CRITICAL - No data received from host
[07:23:41] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:38:23] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.051 seconds
[08:10:33] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.031 seconds
[08:57:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:13:33] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.045 seconds
[09:44:32] PROBLEM - Host ms-be1 is DOWN: PING CRITICAL - Packet loss = 100%
[09:46:20] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:55:21] New review: Hashar; "recheck" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/38797
[09:55:28] New review: Hashar; "recheck" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/24620
[09:59:23] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.595 seconds
[10:13:47] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Puppet has not run in the last 10 hours
[10:15:44] PROBLEM - Swift HTTP on ms-fe1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:22:47] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours
[10:22:47] PROBLEM - Puppet freshness on ms-be1001 is CRITICAL: Puppet has not run in the last 10 hours
[10:22:48] PROBLEM - Puppet freshness on ms-be1002 is CRITICAL: Puppet has not run in the last 10 hours
[10:22:48] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours
[10:33:35] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:46:38] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 3.687 seconds
[11:21:40] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:36:22] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.044 seconds
[11:39:40] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[11:39:40] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[11:55:43] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[11:55:44] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[12:08:51] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:18:49] !log ms-be1 can't be reached form mgmt console to check on it (Connection refused)
[12:19:02] Logged the message, Master
[12:25:12] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.030 seconds
[12:57:54] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:10:57] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 1.869 seconds
[13:24:10] PROBLEM - Swift HTTP on ms-fe1004 is CRITICAL: HTTP CRITICAL - No data received from host
[13:35:59] apergos: mmm
[13:44:33] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:58:39] PROBLEM - Puppet freshness on ms-be3 is CRITICAL: Puppet has not run in the last 10 hours
[14:00:54] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.033 seconds
[14:20:06] PROBLEM - Swift HTTP on ms-fe1003 is CRITICAL: HTTP CRITICAL - No data received from host
[14:30:22] !log powercycling ms-be1
[14:30:31] Logged the message, Master
[14:33:09] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:33:18] RECOVERY - Host ms-be1 is UP: PING OK - Packet loss = 0%, RTA = 0.29 ms
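[editor's note: the cr1-eqiad drain at 02:27-02:45 (disable BGP, raise OSPF costs, restore later with `load override /var/home/lcarr/cr1-eqiad.undrained`) follows the usual Junos pattern of snapshotting the candidate config before the change. The exact commands used are not in the log; this is a hypothetical sketch of that sequence, with the BGP group name invented for illustration.]

```
# Hypothetical Junos drain sequence (group name "Transit" is an assumption):
lcarr@cr1-eqiad> configure
lcarr@cr1-eqiad# save /var/home/lcarr/cr1-eqiad.undrained   # snapshot for later restore
lcarr@cr1-eqiad# deactivate protocols bgp group Transit     # stop attracting transit traffic
lcarr@cr1-eqiad# set protocols ospf overload                # advertise max metric so IGP traffic prefers cr2-eqiad
lcarr@cr1-eqiad# commit

# Restore, as noted in the !log entry:
lcarr@cr1-eqiad# load override /var/home/lcarr/cr1-eqiad.undrained
lcarr@cr1-eqiad# commit
```

[`load override` replaces the entire candidate configuration with the saved file, which is why a single saved snapshot is enough to undo the drain in one commit.]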
[14:37:18] !log maxsem synchronized /php-1.21wmf5/includes/Message.php 'https://gerrit.wikimedia.org/r/#/c/38822/'
[14:37:26] Logged the message, Master
[14:49:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.042 seconds
[15:22:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:38:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.038 seconds
[16:07:39] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[16:10:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:10:39] PROBLEM - check_gcsip on payments1 is CRITICAL: Connection timed out
[16:10:39] PROBLEM - check_gcsip on payments4 is CRITICAL: Connection timed out
[16:10:40] PROBLEM - check_gcsip on payments3 is CRITICAL: Connection timed out
[16:10:40] PROBLEM - check_gcsip on payments2 is CRITICAL: Connection timed out
[16:15:27] RECOVERY - check_gcsip on payments2 is OK: HTTP OK: HTTP/1.1 200 OK - 378 bytes in 0.139 second response time
[16:15:27] RECOVERY - check_gcsip on payments3 is OK: HTTP OK: HTTP/1.1 200 OK - 378 bytes in 0.157 second response time
[16:15:28] RECOVERY - check_gcsip on payments4 is OK: HTTP OK: HTTP/1.1 200 OK - 378 bytes in 0.192 second response time
[16:15:28] RECOVERY - check_gcsip on payments1 is OK: HTTP OK: HTTP/1.1 200 OK - 378 bytes in 0.165 second response time
[16:15:36] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours
[16:22:48] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 209 seconds
[16:23:24] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 228 seconds
[16:26:51] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.183 seconds
[16:36:18] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds
[16:37:12] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[16:45:36] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours
[16:52:39] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[16:58:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:15:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.034 seconds
[17:20:06] PROBLEM - Swift HTTP on ms-fe1003 is CRITICAL: HTTP CRITICAL - No data received from host
[17:36:44] New review: MZMcBride; "This is a trivial changeset. What needs to happen to get this deployed? Is there an associated RT ti..." [operations/debs/squid] (master) C: 0; - https://gerrit.wikimedia.org/r/18331
[17:47:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:03:45] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.025 seconds
[18:08:15] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 218 seconds
[18:08:33] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 229 seconds
[18:28:37] New review: Pgehres; "I'm not in a terrible hurry as this isn't really the best solution. Since most error pages occur wh..." [operations/debs/squid] (master) C: 0; - https://gerrit.wikimedia.org/r/18331
[18:36:18] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:51:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.041 seconds
[19:19:30] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[19:20:15] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds
[19:24:54] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:37:57] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.575 seconds
[20:13:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:14:33] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Puppet has not run in the last 10 hours
[20:23:33] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours
[20:23:33] PROBLEM - Puppet freshness on ms-be1002 is CRITICAL: Puppet has not run in the last 10 hours
[20:23:34] PROBLEM - Puppet freshness on ms-be1001 is CRITICAL: Puppet has not run in the last 10 hours
[20:23:34] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours
[20:26:33] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.797 seconds
[20:33:15] paravoid: I could not get to mgmt either from fenari or bast1001 (and obviously not to the box either)
[20:33:33] thanks for rebooting
[20:55:37] apergos: because you tried SSH on a C2100
[20:55:56] ah ;-D
[20:56:10] forgot! so used to having the 720s already
[20:56:42] that I've wiped the weirdness of the c2100s (but not their brokenness) out of my mind already...
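[editor's note: the paired "MySQL Replication Heartbeat" and "MySQL Slave Delay" alerts above measure lag two ways: the heartbeat approach writes a timestamp row on the master every second and reads its age on the replica, which keeps working even when `Seconds_Behind_Master` lies. A minimal sketch of the heartbeat-age calculation and thresholding follows; the 60s/180s thresholds are guesses inferred from when the alerts fired, not the real check configuration.]

```python
# Hedged sketch of a pt-heartbeat-style lag check: compute how stale the
# replicated heartbeat timestamp is, then map it to a Nagios state.
# Thresholds are assumptions (alerts in the log fire above ~180s).

import datetime

def heartbeat_delay(now, last_heartbeat):
    """Replication delay in whole seconds, clamped to be non-negative."""
    return max(0, int((now - last_heartbeat).total_seconds()))

def check_delay(delay, warn=60, crit=180):
    """Return (exit_code, status_line) in the format seen in the log."""
    if delay >= crit:
        return 2, "CRIT replication delay %d seconds" % delay
    if delay >= warn:
        return 1, "WARN replication delay %d seconds" % delay
    return 0, "OK replication delay %d seconds" % delay
```

[a replica whose heartbeat row is 185 seconds old, like db1035 at 00:41:45, would report CRITICAL under these assumed thresholds.]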
[21:00:27] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:02:47] WTF is going on with stafford? probably half of Nagios spam is from it
[21:16:48] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.030 seconds
[21:40:39] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[21:40:40] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[21:50:33] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:56:33] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[21:56:33] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[22:03:36] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 8.737 seconds
[22:22:12] PROBLEM - Swift HTTP on ms-fe1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:22:30] PROBLEM - Swift HTTP on ms-fe1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:37:39] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:46:12] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 207 seconds
[22:46:57] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 238 seconds
[22:49:30] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[22:50:15] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[22:54:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.029 seconds
[23:09:36] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours
[23:26:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:39:27] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 1.347 seconds
[23:59:33] PROBLEM - Puppet freshness on ms-be3 is CRITICAL: Puppet has not run in the last 10 hours
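[editor's note: the "Puppet freshness" alerts that dominate this log are a staleness check, not a liveness probe: each host records the time of its last successful Puppet run, and any host whose last run is older than 10 hours (per the alert text) goes CRITICAL. A minimal sketch of that calculation follows; the function and message shapes are illustrative, not the real passive check.]

```python
# Hedged sketch of a Puppet freshness check: flag a host whose last
# successful Puppet run is older than a threshold. The 10-hour window
# matches the alert text; everything else is an assumption.

FRESHNESS_THRESHOLD = 10 * 3600  # seconds

def freshness_state(host, last_run_epoch, now_epoch,
                    threshold=FRESHNESS_THRESHOLD):
    """Return (exit_code, status_line) for one host's Puppet freshness."""
    age = now_epoch - last_run_epoch
    hours = threshold // 3600
    if age > threshold:
        return 2, ("Puppet freshness on %s is CRITICAL: Puppet has not run "
                   "in the last %d hours" % (host, hours))
    return 0, "Puppet freshness on %s is OK" % host
```

[under this scheme a host like sq48, silent for more than 10 hours, keeps re-alerting on each evaluation cycle, which is why the same freshness notices repeat through the day.]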