[00:20:28] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:34:52] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.026 seconds [01:07:16] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:07:43] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [01:07:43] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [01:19:43] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.870 seconds [01:21:31] RECOVERY - Varnish HTTP upload-frontend on cp1030 is OK: HTTP OK HTTP/1.1 200 OK - 643 bytes in 0.055 seconds [01:24:13] !log checking/fixing puppet runs on cp103x servers [01:24:26] Logged the message, Master [01:28:01] RECOVERY - Puppet freshness on cp1034 is OK: puppet ran at Sat Sep 29 01:27:38 UTC 2012 [01:28:01] RECOVERY - Puppet freshness on cp1033 is OK: puppet ran at Sat Sep 29 01:27:46 UTC 2012 [01:29:04] RECOVERY - Puppet freshness on cp1035 is OK: puppet ran at Sat Sep 29 01:28:51 UTC 2012 [01:29:31] RECOVERY - Puppet freshness on cp1036 is OK: puppet ran at Sat Sep 29 01:29:27 UTC 2012 [01:40:25] New patchset: Jgreen; "change aluminium/grosley local archive paths" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25723 [01:41:20] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/25723 [01:41:50] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 268 seconds [01:42:29] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25723 [01:43:01] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 279 seconds [01:44:49] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 27 seconds [01:46:01] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 8 seconds [01:47:40] RECOVERY - NTP on cp1033 is OK: NTP OK: Offset -0.04411947727 secs [01:48:25] RECOVERY - NTP on cp1034 is OK: NTP OK: Offset -0.04366409779 secs [01:48:52] RECOVERY - NTP on cp1035 is OK: NTP OK: Offset -0.04420876503 secs [01:49:46] RECOVERY - NTP on cp1036 is OK: NTP OK: Offset -0.05091881752 secs [01:54:07] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:58:10] PROBLEM - Puppet freshness on mw10 is CRITICAL: Puppet has not run in the last 10 hours [01:58:10] PROBLEM - Puppet freshness on mw1 is CRITICAL: Puppet has not run in the last 10 hours [01:58:10] PROBLEM - Puppet freshness on mw11 is CRITICAL: Puppet has not run in the last 10 hours [01:58:10] PROBLEM - Puppet freshness on mw12 is CRITICAL: Puppet has not run in the last 10 hours [01:58:10] PROBLEM - Puppet freshness on mw14 is CRITICAL: Puppet has not run in the last 10 hours [01:58:11] PROBLEM - Puppet freshness on mw15 is CRITICAL: Puppet has not run in the last 10 hours [01:58:11] PROBLEM - Puppet freshness on mw16 is CRITICAL: Puppet has not run in the last 10 hours [01:58:12] PROBLEM - Puppet freshness on mw13 is CRITICAL: Puppet has not run in the last 10 hours [01:58:12] PROBLEM - Puppet freshness on mw6 is CRITICAL: Puppet has not run in the last 10 hours [01:58:13] PROBLEM - Puppet freshness on mw3 is CRITICAL: Puppet has not run in the last 10 hours [01:58:13] PROBLEM - Puppet 
freshness on mw2 is CRITICAL: Puppet has not run in the last 10 hours [01:58:14] PROBLEM - Puppet freshness on mw4 is CRITICAL: Puppet has not run in the last 10 hours [01:58:14] PROBLEM - Puppet freshness on mw8 is CRITICAL: Puppet has not run in the last 10 hours [01:58:15] PROBLEM - Puppet freshness on mw5 is CRITICAL: Puppet has not run in the last 10 hours [01:58:15] PROBLEM - Puppet freshness on mw7 is CRITICAL: Puppet has not run in the last 10 hours [01:58:16] PROBLEM - Puppet freshness on mw9 is CRITICAL: Puppet has not run in the last 10 hours [01:59:38] !log restarting cp1031 after hardware replacement [01:59:49] Logged the message, Master [02:07:55] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.022 seconds [02:40:19] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:41:49] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.020 seconds [02:51:16] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [03:04:10] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [03:29:32] New patchset: Dzahn; "update MAC address of cp1031 after mainboard was replaced in RT-3614" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25725 [03:34:31] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25725 [03:34:45] mutante: Got a minute for a bit of advice on an SQL query? Not sure which way is more efficient. [03:34:47] It used to be a very bad unindexed query, so I refactored the testswarm database and improved the query. It's a lot faster, but maybe you can tell me which I should choose? [03:34:51] http://dpaste.org/gkznU/raw/ [03:34:57] Got 2 candidates left. 
[03:36:16] RECOVERY - Host cp1031 is UP: PING OK - Packet loss = 0%, RTA = 26.49 ms [03:40:28] PROBLEM - SSH on cp1031 is CRITICAL: Connection refused [03:43:55] PROBLEM - Host cp1031 is DOWN: PING CRITICAL - Packet loss = 100% [03:44:11] Krinkle: sorry, i don't really know, but where do you run this? maybe you can just try both and measure it? like at https://ishmael.wikimedia.org/?host=dbXX [03:46:24] mutante: It's part of the main query that powers TestSwarm, which would run a lot on the integration server we have. [03:46:38] These tables get quite large over time, so it's important that the queries are good. [03:46:58] Our install got locked up about a dozen times now, every time a mysql efficiency issue when we scaled further. [03:47:07] It's getting better though, hasn't locked for 6 months now. [03:57:50] Krinkle: could you send that to the ops mailing list please. i know the integration server is gallium and i see a local mysql but i don't recall a lock up. [03:58:49] and i guess maybe it should not even run on gallium locally [03:59:25] let's hear binasher on that [03:59:45] but Friday night, and need to run [03:59:48] It's been at least 6 months since testswarm's db crashed gallium's mysql [04:00:07] PROBLEM - SSH on ssl1002 is CRITICAL: Server answer: [04:00:17] and when it does, usually only 2 or 3 people notice within hours. This was before it was hooked up to Gerrit/Jenkins, so it didn't stop anything. [04:01:37] RECOVERY - SSH on ssl1002 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [04:04:42] now that you say it.. several months ago restarting mysql there once or twice ..i think i remember vaguely [04:05:01] have you talked to hashar yet? 
[04:07:01] RECOVERY - SSH on cp1031 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [04:07:04] !log reinstalled cp1031 (new mainboard, thx RobH) [04:07:10] RECOVERY - Host cp1031 is UP: PING OK - Packet loss = 0%, RTA = 26.56 ms [04:07:15] Logged the message, Master [04:07:33] Krinkle: bbl [04:08:41] * mutante aways [04:30:07] PROBLEM - NTP on cp1031 is CRITICAL: NTP CRITICAL: Offset unknown [04:33:07] RECOVERY - NTP on cp1031 is OK: NTP OK: Offset -0.01372647285 secs [05:06:40] Krinkle: you could get gallium hooked up to ishmael [05:06:57] Sure. [05:07:12] Note though that these queries in question are for yet-to-be-implemented features. [05:07:18] huh [05:07:37] so they look like they were run/explained on empty or nearly empty tables [05:07:40] right? [05:07:47] I'm just looking for some advice from a db expert to see which way is recommended. No biggie, I know both are efficient overall, just micro optimization [05:07:57] yeah, my dev install on localhost [05:07:57] i guess not empty because it was a result of 9 [05:08:02] the feature is being written [05:08:15] so, the explain will be more useful on a table with lots of data [05:08:59] anyway, i agree with the pointer to 2 people that aren't here now (domas/asher) [05:28:55] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:30:54] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.193 seconds [05:33:36] PROBLEM - Varnish HTTP upload-frontend on cp1031 is CRITICAL: Connection refused [05:35:06] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [05:35:06] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [05:35:06] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [05:35:06] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours [05:35:06] 
PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [06:04:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:08:06] PROBLEM - Puppet freshness on cp1040 is CRITICAL: Puppet has not run in the last 10 hours [06:11:06] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.479 seconds [06:46:38] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:51:53] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours [06:54:53] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 3.262 seconds [07:31:02] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:34:56] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [07:41:59] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.391 seconds [07:57:22] PROBLEM - Puppet freshness on mw22 is CRITICAL: Puppet has not run in the last 10 hours [08:16:53] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:18:14] New patchset: Dereckson; "(bug 40611) AbuseFilter configuration for mr.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25733 [08:24:22] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [08:27:58] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.772 seconds [08:53:01] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 210 seconds [08:54:31] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 0 seconds [09:03:13] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:15:40] RECOVERY - 
Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.636 seconds [09:27:22] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [09:45:24] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [09:50:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:04:09] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.025 seconds [10:33:17] !log Created 40 TB volume 'originals' inside 36-drive aggregate 'media1' on nas1-a [10:33:28] Logged the message, Master [10:35:04] !log Manually mounted nas1-a:/vol/originals on ms7, fenari [10:35:15] Logged the message, Master [10:36:15] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:39:53] !log Started rsync of /export/upload (respecting /etc/rsync.includes) to /mnt (nas1-a:vol/originals/) on ms7 in a screen session [10:40:04] Logged the message, Master [10:41:35] New patchset: Nemo bis; "(bug 29692) Per-wiki namespace aliases shouldn't override (remove) global ones" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25737 [10:50:21] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.065 seconds [11:08:42] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [11:08:42] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [11:22:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:35:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.334 seconds [11:58:48] PROBLEM - Puppet freshness on mw1 is CRITICAL: Puppet has not run in the last 10 hours [11:58:48] PROBLEM - Puppet freshness on mw11 is CRITICAL: Puppet has not run in the last 10 
hours [11:58:48] PROBLEM - Puppet freshness on mw13 is CRITICAL: Puppet has not run in the last 10 hours [11:58:48] PROBLEM - Puppet freshness on mw10 is CRITICAL: Puppet has not run in the last 10 hours [11:58:48] PROBLEM - Puppet freshness on mw12 is CRITICAL: Puppet has not run in the last 10 hours [11:58:49] PROBLEM - Puppet freshness on mw16 is CRITICAL: Puppet has not run in the last 10 hours [11:58:49] PROBLEM - Puppet freshness on mw3 is CRITICAL: Puppet has not run in the last 10 hours [11:58:50] PROBLEM - Puppet freshness on mw15 is CRITICAL: Puppet has not run in the last 10 hours [11:58:50] PROBLEM - Puppet freshness on mw2 is CRITICAL: Puppet has not run in the last 10 hours [11:58:51] PROBLEM - Puppet freshness on mw4 is CRITICAL: Puppet has not run in the last 10 hours [11:58:51] PROBLEM - Puppet freshness on mw5 is CRITICAL: Puppet has not run in the last 10 hours [11:58:52] PROBLEM - Puppet freshness on mw6 is CRITICAL: Puppet has not run in the last 10 hours [11:58:52] PROBLEM - Puppet freshness on mw8 is CRITICAL: Puppet has not run in the last 10 hours [11:58:53] PROBLEM - Puppet freshness on mw7 is CRITICAL: Puppet has not run in the last 10 hours [11:58:53] PROBLEM - Puppet freshness on mw9 is CRITICAL: Puppet has not run in the last 10 hours [11:58:54] PROBLEM - Puppet freshness on mw14 is CRITICAL: Puppet has not run in the last 10 hours [12:09:46] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:23:43] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.021 seconds [12:29:43] PROBLEM - BGP status on cr2-eqiad is CRITICAL: (Service Check Timed Out) [12:52:22] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [12:57:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:05:25] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 
10 hours [13:09:46] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.275 seconds [13:12:05] New patchset: Alex Monk; "(bug 40614) Close strategywiki, zawiktionary, zh_min_nanwikibooks." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25738 [13:43:53] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:56:20] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.113 seconds [14:14:02] PROBLEM - Puppet freshness on cp1031 is CRITICAL: Puppet has not run in the last 10 hours [14:29:50] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:43:47] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 8.029 seconds [15:17:32] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:29:59] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.505 seconds [15:35:41] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [15:35:41] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [15:35:41] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [15:35:41] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours [15:35:41] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [16:04:16] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:08:55] PROBLEM - Puppet freshness on cp1040 is CRITICAL: Puppet has not run in the last 10 hours [16:17:01] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.374 seconds [16:32:55] mutante: have you met kelsey hightower? 
[16:33:01] (puppetconf) [16:51:31] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:52] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours [17:03:58] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.203 seconds [17:36:19] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [17:37:22] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:49:40] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.395 seconds [17:58:13] PROBLEM - Puppet freshness on mw22 is CRITICAL: Puppet has not run in the last 10 hours [18:25:13] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:25:13] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [18:37:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.154 seconds [19:12:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:26:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.025 seconds [19:28:14] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [19:46:14] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [19:58:41] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:09:47] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 8.522 seconds [20:45:11] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:53:04] hi all, I'm getting a 403 Forbidden on the API (commons.wikipedia.org/w/api.php) could it be that this IP 
address is blacklisted? [20:54:22] Does commons itself work? And other wiki sites/their apis? [20:55:59] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 221 seconds [20:55:59] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 221 seconds [20:57:38] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.429 seconds [20:57:51] Reedy: en works for me, es - api is not responding, commons is responding [20:58:31] en - api also won't load [20:58:46] what do "responding" and "won't load" mean? [20:59:04] is "won't load" the same as 403? [20:59:14] are these automated requests or how are you making them? [21:00:46] won't load and not responding are the same: the page starts loading but times out [21:00:52] which is different from 403 [21:01:02] and I'm trying the API from the web browser [21:01:10] not automated requests [21:02:36] I actually can load commons.wikimedia.org/w/api.php but for any action (api.php?action=query...) 
I get 403 [21:03:10] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 183 seconds [21:03:28] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 192 seconds [21:08:19] huh, commons api is working now, thanks [21:09:55] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [21:09:55] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [21:15:37] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay 0 seconds [21:15:46] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 0 seconds [21:29:25] PROBLEM - Host es10 is DOWN: PING CRITICAL - Packet loss = 100% [21:30:28] RECOVERY - Host es10 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms [21:32:07] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:33:46] PROBLEM - mysqld processes on es10 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld [21:44:34] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.203 seconds [21:59:52] PROBLEM - Puppet freshness on mw1 is CRITICAL: Puppet has not run in the last 10 hours [21:59:52] PROBLEM - Puppet freshness on mw12 is CRITICAL: Puppet has not run in the last 10 hours [21:59:52] PROBLEM - Puppet freshness on mw11 is CRITICAL: Puppet has not run in the last 10 hours [21:59:52] PROBLEM - Puppet freshness on mw15 is CRITICAL: Puppet has not run in the last 10 hours [21:59:52] PROBLEM - Puppet freshness on mw13 is CRITICAL: Puppet has not run in the last 10 hours [21:59:53] PROBLEM - Puppet freshness on mw14 is CRITICAL: Puppet has not run in the last 10 hours [21:59:53] PROBLEM - Puppet freshness on mw10 is CRITICAL: Puppet has not run in the last 10 hours [21:59:54] PROBLEM - Puppet freshness on mw3 is CRITICAL: Puppet has not run in the last 10 hours [21:59:54] PROBLEM - Puppet freshness on mw16 is 
CRITICAL: Puppet has not run in the last 10 hours [21:59:55] PROBLEM - Puppet freshness on mw2 is CRITICAL: Puppet has not run in the last 10 hours [21:59:55] PROBLEM - Puppet freshness on mw4 is CRITICAL: Puppet has not run in the last 10 hours [21:59:56] PROBLEM - Puppet freshness on mw5 is CRITICAL: Puppet has not run in the last 10 hours [21:59:56] PROBLEM - Puppet freshness on mw7 is CRITICAL: Puppet has not run in the last 10 hours [21:59:57] PROBLEM - Puppet freshness on mw6 is CRITICAL: Puppet has not run in the last 10 hours [21:59:57] PROBLEM - Puppet freshness on mw8 is CRITICAL: Puppet has not run in the last 10 hours [21:59:58] PROBLEM - Puppet freshness on mw9 is CRITICAL: Puppet has not run in the last 10 hours [22:18:28] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:32:25] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.046 seconds [22:52:55] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [23:05:49] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:07:01] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [23:19:46] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.039 seconds [23:52:37] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds