[00:17:24] RECOVERY - Memcached on srv250 is OK: TCP OK - 8.996 second response time on port 11000
[00:22:03] PROBLEM - Memcached on srv250 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:39:00] PROBLEM - Puppet freshness on ekrem is CRITICAL: Puppet has not run in the last 10 hours
[01:10:03] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[01:42:00] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 279 seconds
[01:42:00] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 280 seconds
[01:46:30] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 11 seconds
[01:46:39] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 17 seconds
[02:36:00] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[02:36:00] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[02:41:20] New review: Nikerabbit; "I'm quite sure that labsconsole allows passing parameters, I used that for my solr instances." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/23770
[02:41:24] RECOVERY - Puppet freshness on ekrem is OK: puppet ran at Sun Sep 16 02:41:04 UTC 2012
[02:57:58] New review: Nikerabbit; "Basically anything can be sensitive, but if this satisfies some criteria so be it." [operations/debs/lucene-search-2] (master) C: 0; - https://gerrit.wikimedia.org/r/23583
[03:39:00] PROBLEM - Puppet freshness on manganese is CRITICAL: Puppet has not run in the last 10 hours
[03:52:03] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours
[04:25:31] RECOVERY - Puppet freshness on spence is OK: puppet ran at Sun Sep 16 04:24:56 UTC 2012
[04:29:06] PROBLEM - Squid on brewster is CRITICAL: Connection refused
[04:32:44] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100%
[05:07:14] PROBLEM - SSH on srv256 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:07:41] PROBLEM - Apache HTTP on srv256 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:07:50] PROBLEM - Memcached on srv256 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:25:38] PROBLEM - NTP on srv256 is CRITICAL: NTP CRITICAL: No response from NTP server
[06:22:21] PROBLEM - Puppet freshness on srv250 is CRITICAL: Puppet has not run in the last 10 hours
[06:28:03] RECOVERY - Memcached on srv256 is OK: TCP OK - 0.001 second response time on port 11000
[06:32:33] PROBLEM - Memcached on srv256 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:43:40] RECOVERY - Squid on brewster is OK: TCP OK - 0.003 second response time on port 8080
[07:06:37] PROBLEM - Puppet freshness on singer is CRITICAL: Puppet has not run in the last 10 hours
[07:06:37] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[07:06:37] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours
[07:06:37] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours
[07:06:37] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours
[07:06:37] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[07:45:02] PROBLEM - Puppet freshness on ms-be10 is CRITICAL: Puppet has not run in the last 10 hours
[08:22:52] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:23:19] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 236 seconds
[08:24:22] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[08:24:49] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[08:25:16] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[08:31:07] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 73%, RTA = 0.27 ms
[08:44:28] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[08:55:52] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 73%, RTA = 0.24 ms
[09:01:34] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[09:14:46] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[09:16:07] RECOVERY - udp2log log age for locke on locke is OK: OK: all log files active
[09:23:00] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 86%, RTA = 0.21 ms
[09:30:30] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[09:47:48] New patchset: Aude; "set wgAutoConfirmCount to 10 for jawiki, per bug 40270" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23927
[09:53:00] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[09:58:15] PROBLEM - LVS HTTP IPv4 on wiktionary-lb.pmtpa.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:59:45] RECOVERY - LVS HTTP IPv4 on wiktionary-lb.pmtpa.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 61213 bytes in 0.162 seconds
[09:59:45] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 73%, RTA = 0.21 ms
[10:02:36] PROBLEM - LVS HTTPS IPv4 on foundation-lb.pmtpa.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:03:57] RECOVERY - LVS HTTPS IPv4 on foundation-lb.pmtpa.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 39773 bytes in 0.184 seconds
[10:07:15] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[10:09:30] PROBLEM - Apache HTTP on mw54 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:34:38] PROBLEM - udp2log log age for locke on locke is CRITICAL: CRITICAL: log files /a/squid/fundraising/logs/bannerImpressions-sampled100.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[10:40:02] PROBLEM - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:41:23] RECOVERY - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is OK: HTTP OK HTTP/1.1 200 OK - 60478 bytes in 1.806 seconds
[10:50:32] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 86%, RTA = 0.20 ms
[11:09:35] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[11:10:38] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[11:11:05] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.333 second response time
[11:15:17] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 86%, RTA = 0.19 ms
[11:15:35] PROBLEM - Apache HTTP on mw54 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:16:02] PROBLEM - LVS HTTPS IPv4 on wikiversity-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:17:23] RECOVERY - LVS HTTPS IPv4 on wikiversity-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 46783 bytes in 0.971 seconds
[11:23:05] PROBLEM - Apache HTTP on mw51 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:23:50] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[11:24:44] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.151 second response time
[11:29:14] PROBLEM - Apache HTTP on mw54 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:03:35] PROBLEM - LVS HTTP IPv4 on foundation-lb.esams.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:05:05] RECOVERY - LVS HTTP IPv4 on foundation-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.0 200 OK - 39956 bytes in 4.534 seconds
[12:05:59] PROBLEM - Apache HTTP on mw42 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:20:14] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 86%, RTA = 0.25 ms
[12:23:59] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:25:20] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.045 seconds
[12:27:35] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[12:30:18] PROBLEM - Apache HTTP on mw59 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:36:35] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[12:36:35] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[12:37:38] PROBLEM - Puppet freshness on mw22 is CRITICAL: Puppet has not run in the last 10 hours
[12:45:36] PROBLEM - LVS HTTPS IPv4 on wikiversity-lb.pmtpa.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:50:06] RECOVERY - LVS HTTPS IPv4 on wikiversity-lb.pmtpa.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 46595 bytes in 0.160 seconds
[12:58:40] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:00:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.126 seconds
[13:03:00] RECOVERY - Host srv256 is UP: PING WARNING - Packet loss = 86%, RTA = 0.20 ms
[13:10:21] PROBLEM - Host srv256 is DOWN: PING CRITICAL - Packet loss = 100%
[13:24:27] PROBLEM - Apache HTTP on mw32 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:26:11] Yeah, every wiki seems to be taking forever to load anything
[13:34:39] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:39:09] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.037 seconds
[13:39:45] PROBLEM - Puppet freshness on manganese is CRITICAL: Puppet has not run in the last 10 hours
[13:48:45] PROBLEM - Apache HTTP on mw37 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:52:39] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours
[14:04:43] PROBLEM - Apache HTTP on mw39 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:05:28] PROBLEM - Apache HTTP on mw24 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:12:13] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:14:19] PROBLEM - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:15:32] so folks... I'm here but *really* incoherent, about 5 hours sleep in the last 48 hours
[14:16:43] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.706 seconds
[14:17:19] RECOVERY - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is OK: HTTP OK HTTP/1.1 200 OK - 60479 bytes in 0.223 seconds
[14:24:22] PROBLEM - Apache HTTP on mw33 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:26:50] ok well whatever the spike was it has dropped off
[14:26:53] thank goodness
[14:27:09] http://ganglia.wikimedia.org/latest/?c=Application%20servers%20pmtpa&m=load_one&r=hour&s=by%20name&hc=4&mc=2
[14:27:16] you can see it there in the network graph
[14:27:26] http://ganglia.wikimedia.org/latest/?c=Bits%20application%20servers%20pmtpa&m=load_one&r=hour&s=by%20name&hc=4&mc=2 here too
[14:27:40] PROBLEM - Apache HTTP on mw45 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:49:16] !log cleared profiling
[14:49:26] Logged the message, Master
[14:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:53:02] !log powercycled srv250
[14:53:11] Logged the message, Master
[14:53:46] !log powercycled srv256
[14:53:54] Logged the message, Master
[14:54:13] PROBLEM - Host srv249 is DOWN: PING CRITICAL - Packet loss = 100%
[14:55:01] I am afk again
[14:55:03] er
[14:55:08] i power cycle 250, and 249 goes down?
[14:55:34] RECOVERY - Host srv249 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms
[14:56:46] RECOVERY - SSH on srv256 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[14:56:46] RECOVERY - Memcached on srv256 is OK: TCP OK - 0.001 second response time on port 11000
[14:56:55] RECOVERY - Host srv256 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms
[14:57:58] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.345 second response time
[14:58:34] RECOVERY - Apache HTTP on mw51 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.735 second response time
[14:58:34] RECOVERY - Apache HTTP on mw32 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.155 second response time
[14:59:28] RECOVERY - Apache HTTP on mw42 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 7.713 second response time
[14:59:46] PROBLEM - Apache HTTP on srv249 is CRITICAL: Connection refused
[14:59:46] PROBLEM - Puppet freshness on srv256 is CRITICAL: Puppet has not run in the last 10 hours
[15:01:25] RECOVERY - Apache HTTP on mw39 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.810 second response time
[15:01:34] RECOVERY - Apache HTTP on mw24 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.587 second response time
[15:01:43] RECOVERY - Apache HTTP on mw59 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.615 second response time
[15:02:28] RECOVERY - Apache HTTP on mw45 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.976 second response time
[15:02:37] RECOVERY - check_all_memcacheds on spence is OK: MEMCACHED OK - All memcacheds are online
[15:02:55] RECOVERY - SSH on srv250 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[15:02:55] RECOVERY - Apache HTTP on mw33 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.006 second response time
[15:02:55] RECOVERY - Apache HTTP on mw37 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.561 second response time
[15:03:31] RECOVERY - Memcached on srv250 is OK: TCP OK - 0.004 second response time on port 11000
[15:05:10] RECOVERY - Puppet freshness on srv250 is OK: puppet ran at Sun Sep 16 15:05:05 UTC 2012
[15:05:55] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.022 seconds
[15:06:49] RECOVERY - NTP on srv250 is OK: NTP OK: Offset 0.002096533775 secs
[15:07:16] RECOVERY - Apache HTTP on srv249 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.094 second response time
[15:07:34] RECOVERY - Apache HTTP on srv250 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.032 second response time
[15:12:22] New patchset: Jeremyb; "bug 40270 - jawiki: set wgAutoConfirmCount = 10" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23927
[15:13:32] New review: Jeremyb; "PS2: Tweaked the commit msg, changed one nearby bugzilla URL from HTTP -> HTTPS" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/23927
[15:23:00] RECOVERY - Puppet freshness on srv256 is OK: puppet ran at Sun Sep 16 15:22:52 UTC 2012
[15:24:39] RECOVERY - Apache HTTP on srv256 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.031 second response time
[15:37:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:49:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 1.787 seconds
[16:20:36] PROBLEM - Puppet freshness on ssl2 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:36] PROBLEM - Puppet freshness on cp1036 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:36] PROBLEM - Puppet freshness on db1019 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:36] PROBLEM - Puppet freshness on mw24 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:36] PROBLEM - Puppet freshness on sq67 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:36] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:37] PROBLEM - Puppet freshness on mw47 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:37] PROBLEM - Puppet freshness on sq68 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:38] PROBLEM - Puppet freshness on ssl1004 is CRITICAL: Puppet has not run in the last 10 hours
[16:23:38] PROBLEM - Puppet freshness on srv301 is CRITICAL: Puppet has not run in the last 10 hours
[16:24:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:24:39] PROBLEM - Puppet freshness on search34 is CRITICAL: Puppet has not run in the last 10 hours
[16:36:10] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.030 seconds
[16:54:10] PROBLEM - Puppet freshness on cp1003 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:10] PROBLEM - Puppet freshness on erzurumi is CRITICAL: Puppet has not run in the last 10 hours
[16:54:10] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:10] PROBLEM - Puppet freshness on searchidx2 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:10] PROBLEM - Puppet freshness on srv253 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:11] PROBLEM - Puppet freshness on sq58 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:11] PROBLEM - Puppet freshness on search1024 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:12] PROBLEM - Puppet freshness on ssl3 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:12] PROBLEM - Puppet freshness on williams is CRITICAL: Puppet has not run in the last 10 hours
[16:54:13] PROBLEM - Puppet freshness on srv278 is CRITICAL: Puppet has not run in the last 10 hours
[17:07:13] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[17:07:13] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours
[17:07:13] PROBLEM - Puppet freshness on singer is CRITICAL: Puppet has not run in the last 10 hours
[17:07:14] PROBLEM - Puppet freshness on virt1003 is CRITICAL: Puppet has not run in the last 10 hours
[17:07:14] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[17:07:14] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours
[17:09:28] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:21:37] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 8.885 seconds
[17:41:53] New patchset: Nemo bis; "(bug 29692) Per-wiki namespace aliases shouldn't override (remove) global ones" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23935
[17:46:13] PROBLEM - Puppet freshness on ms-be10 is CRITICAL: Puppet has not run in the last 10 hours
[17:55:58] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:09:28] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.037 seconds
[18:41:53] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:53:44] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.034 seconds
[19:02:35] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[19:03:56] New review: Dereckson; "Seems correct to me (only default namespaces are removed and whitespaces are now consistent)." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/23935
[19:25:32] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:37:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 1.385 seconds
[19:53:28] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[20:11:37] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:24:58] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.029 seconds
[20:27:01] New patchset: Nemo bis; "(bug 40285) Point Wikipedias logo to more up to date 2.0 version on Commons where available" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/23985
[20:49:30] New review: Ori.livneh; "This has been obsoleted by subsequent work on AFT logging. Roan, could you please abandon the patch?" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/10669
[20:57:58] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:08:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.923 seconds
[21:11:28] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[21:43:42] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:56:49] New review: Dereckson; "-shellpolicy +shell" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/23094
[21:57:12] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.048 seconds
[22:30:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:37:15] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[22:37:15] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[22:38:09] PROBLEM - Puppet freshness on mw22 is CRITICAL: Puppet has not run in the last 10 hours
[22:40:42] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.280 seconds
[23:15:49] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:21:41] * Jasper_Deng pokes Reedy and TimStarling
[23:26:28] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.850 seconds
[23:40:43] PROBLEM - Puppet freshness on manganese is CRITICAL: Puppet has not run in the last 10 hours
[23:53:46] PROBLEM - Puppet freshness on ms-be6 is CRITICAL: Puppet has not run in the last 10 hours