[00:03:25] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 186 seconds
[00:03:33] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 193 seconds
[00:10:45] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:19:00] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[00:19:18] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[00:20:39] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 9.327 seconds
[00:48:24] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 201 seconds
[00:48:42] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 212 seconds
[00:48:42] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 238 seconds
[00:48:42] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 238 seconds
[00:56:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:02:48] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 8 seconds
[01:04:00] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds
[01:06:15] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[01:06:24] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[01:06:42] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.427 seconds
[01:15:24] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 228 seconds
[01:16:55] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 13 seconds
[01:20:39] PROBLEM - SSH on ms1002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:41:48] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:55:00] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 9.133 seconds
[02:26:37] !log LocalisationUpdate completed (1.21wmf6) at Sat Dec 22 02:26:37 UTC 2012
[02:26:49] Logged the message, Master
[02:30:06] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours
[02:30:06] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:38:03] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.034 seconds
[02:38:45] New patchset: Dereckson; "(bug 43327) Add an autopatrolled user rights group to it.wikivoyage" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39842
[02:39:06] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours
[02:39:06] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: Puppet has not run in the last 10 hours
[02:39:07] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: Puppet has not run in the last 10 hours
[02:39:07] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: Puppet has not run in the last 10 hours
[02:50:57] RECOVERY - Puppet freshness on ms-fe1001 is OK: puppet ran at Sat Dec 22 02:50:44 UTC 2012
[02:56:57] RECOVERY - Puppet freshness on ms-fe1002 is OK: puppet ran at Sat Dec 22 02:56:46 UTC 2012
[03:14:12] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:18:51] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.423 second response time
[03:31:47] PROBLEM - Apache HTTP on mw1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:32:14] PROBLEM - Apache HTTP on mw1019 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:44:32] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:51:44] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.916 second response time
[03:58:56] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:59:05] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[03:59:05] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[04:09:35] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 7.404 second response time
[04:16:57] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:23:59] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.302 second response time
[04:43:31] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:46:58] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.623 second response time
[05:03:55] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 192 seconds
[05:05:07] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 222 seconds
[05:12:19] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:14:07] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.912 second response time
[05:21:19] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:33:55] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.744 second response time
[05:39:46] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[05:40:31] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[05:46:22] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:55:13] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.778 second response time
[06:04:22] PROBLEM - Apache HTTP on srv223 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:09:46] RECOVERY - Apache HTTP on srv223 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 6.479 second response time
[06:36:13] dang it
[06:36:15] fine
[06:38:59] RECOVERY - Apache HTTP on srv222 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.055 second response time
[06:39:22] !log shot a bunch of hung converts again on the imagescalers
[06:39:31] Logged the message, Master
[08:21:43] PROBLEM - Puppet freshness on search1001 is CRITICAL: Puppet has not run in the last 10 hours
[08:24:44] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[08:25:19] PROBLEM - check_minfraud_secondary on payments1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:19] PROBLEM - check_minfraud_secondary on payments1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:20] PROBLEM - check_minfraud_secondary on payments1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:20] PROBLEM - check_minfraud_secondary on payments4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:20] PROBLEM - check_minfraud_secondary on payments1002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:20] PROBLEM - check_minfraud_secondary on payments1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:20] PROBLEM - check_minfraud_secondary on payments3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:25:21] PROBLEM - check_minfraud_secondary on payments2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:25] PROBLEM - check_minfraud_secondary on payments1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:25] PROBLEM - check_minfraud_secondary on payments1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:26] PROBLEM - check_minfraud_secondary on payments1002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:26] PROBLEM - check_minfraud_secondary on payments4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:26] PROBLEM - check_minfraud_secondary on payments1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:26] PROBLEM - check_minfraud_secondary on payments2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:26] PROBLEM - check_minfraud_secondary on payments1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:27] PROBLEM - check_minfraud_secondary on payments3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:32:40] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours
[08:35:22] RECOVERY - check_minfraud_secondary on payments2 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.426 second response time
[08:35:22] RECOVERY - check_minfraud_secondary on payments1003 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.399 second response time
[08:35:23] RECOVERY - check_minfraud_secondary on payments3 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.418 second response time
[08:35:23] RECOVERY - check_minfraud_secondary on payments1002 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.395 second response time
[08:35:23] RECOVERY - check_minfraud_secondary on payments1 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.421 second response time
[08:35:23] RECOVERY - check_minfraud_secondary on payments1001 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.422 second response time
[08:35:23] RECOVERY - check_minfraud_secondary on payments4 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.423 second response time
[08:35:24] RECOVERY - check_minfraud_secondary on payments1004 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.479 second response time
[09:10:06] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[09:10:15] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 183 seconds
[09:10:43] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 198 seconds
[09:13:42] PROBLEM - MySQL Slave Delay on db1041 is CRITICAL: CRIT replication delay 190 seconds
[09:13:42] PROBLEM - MySQL Replication Heartbeat on db1024 is CRITICAL: CRIT replication delay 190 seconds
[09:14:18] PROBLEM - MySQL Slave Delay on db1024 is CRITICAL: CRIT replication delay 201 seconds
[09:14:36] PROBLEM - MySQL Replication Heartbeat on db1041 is CRITICAL: CRIT replication delay 208 seconds
[09:14:54] PROBLEM - MySQL Replication Heartbeat on db1028 is CRITICAL: CRIT replication delay 220 seconds
[09:15:21] PROBLEM - check_minfraud_secondary on payments1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:22] PROBLEM - check_minfraud_secondary on payments1002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:22] PROBLEM - check_minfraud_secondary on payments1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:22] PROBLEM - check_minfraud_secondary on payments2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:22] PROBLEM - check_minfraud_secondary on payments1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:22] PROBLEM - check_minfraud_secondary on payments4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:22] PROBLEM - check_minfraud_secondary on payments3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:23] PROBLEM - check_minfraud_secondary on payments1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:18:12] RECOVERY - MySQL Replication Heartbeat on db1041 is OK: OK replication delay 0 seconds
[09:18:30] RECOVERY - MySQL Replication Heartbeat on db1028 is OK: OK replication delay 0 seconds
[09:18:49] RECOVERY - MySQL Slave Delay on db1041 is OK: OK replication delay 0 seconds
[09:18:57] RECOVERY - MySQL Replication Heartbeat on db1024 is OK: OK replication delay 1 seconds
[09:19:42] RECOVERY - MySQL Slave Delay on db1024 is OK: OK replication delay 8 seconds
[09:20:18] RECOVERY - check_minfraud_secondary on payments3 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.422 second response time
[09:20:18] RECOVERY - check_minfraud_secondary on payments4 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.423 second response time
[09:20:18] RECOVERY - check_minfraud_secondary on payments1002 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.299 second response time
[09:20:18] RECOVERY - check_minfraud_secondary on payments1 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.415 second response time
[09:20:19] RECOVERY - check_minfraud_secondary on payments1001 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.296 second response time
[09:20:19] RECOVERY - check_minfraud_secondary on payments2 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.430 second response time
[09:20:19] RECOVERY - check_minfraud_secondary on payments1003 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.299 second response time
[09:20:19] RECOVERY - check_minfraud_secondary on payments1004 is OK: HTTP OK: HTTP/1.1 302 Found - 120 bytes in 0.299 second response time
[09:28:51] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 198 seconds
[09:29:27] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 210 seconds
[09:44:36] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:51:39] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.785 seconds
[10:05:36] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds
[10:05:45] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[10:06:12] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[10:06:21] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds
[10:23:36] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100%
[10:24:48] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.63 ms
[10:27:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:28:24] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused
[10:41:09] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.035 seconds
[10:45:39] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.050 second response time
[11:14:36] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:27:03] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.029 seconds
[11:32:54] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 190 seconds
[11:33:30] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 191 seconds
[11:43:33] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[11:44:01] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[12:00:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:12:57] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.055 seconds
[12:13:42] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 183 seconds
[12:15:03] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 221 seconds
[12:16:33] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 237 seconds
[12:16:33] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 236 seconds
[12:25:15] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[12:25:16] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds
[12:31:06] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours
[12:36:12] PROBLEM - Puppet freshness on db10 is CRITICAL: Puppet has not run in the last 10 hours
[12:40:06] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: Puppet has not run in the last 10 hours
[12:40:07] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: Puppet has not run in the last 10 hours
[12:40:07] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: Puppet has not run in the last 10 hours
[12:40:07] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours
[12:45:21] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:54:21] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[12:55:06] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[13:01:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.026 seconds
[13:33:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:47:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.023 seconds
[14:00:24] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[14:00:24] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[14:20:57] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:29:21] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 184 seconds
[14:29:39] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 199 seconds
[14:33:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.024 seconds
[15:06:15] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[15:06:15] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[15:08:49] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:19:36] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.651 seconds
[15:29:16] PROBLEM - SSH on cp1041 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:30:55] RECOVERY - SSH on cp1041 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[15:49:19] im getting varnish 503s on the mobile site
[15:54:28] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:08:34] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.021 seconds
[16:11:35] Varnish is giving 503 errors for some pages on the mobile site
[16:13:06] "Guru Meditation: XID: 1832150768", if that helps
[16:14:06] yeah it looks like MaxSem|android noticed that earlier too anomie
[16:14:55] and it serves desktop site for some others
[16:27:22] peeng
[16:31:49] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 193 seconds
[16:31:58] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 200 seconds
[16:40:22] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:40:22] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[16:40:58] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds
[16:48:04] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39793
[16:48:42] New patchset: Reedy; "bug 38543 - set kowikisource wgLogo to local upload" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/38911
[16:48:47] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/38911
[16:50:43] !log Created WikiLove tables on itwikivoyage
[16:50:53] Logged the message, Master
[16:50:56] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39498
[16:54:37] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.035 seconds
[16:55:41] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39572
[16:55:56] New patchset: Reedy; "(bug 42288) Babel configuration for sv.wiktionary" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39747
[16:56:01] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39747
[16:56:29] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/39842
[17:00:34] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32681
[17:02:10] !log reedy synchronized wmf-config/
[17:02:19] Logged the message, Master
[17:16:22] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 182 seconds
[17:16:31] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 187 seconds
[17:17:25] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 191 seconds
[17:17:34] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 197 seconds
[17:21:37] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 2 seconds
[17:21:46] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds
[17:27:55] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:28:04] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[17:28:13] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds
[17:38:25] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.653 seconds
[17:48:10] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 183 seconds
[17:49:13] PROBLEM - MySQL Slave Delay on db26 is CRITICAL: CRIT replication delay 234 seconds
[17:49:22] PROBLEM - MySQL Replication Heartbeat on db26 is CRITICAL: CRIT replication delay 235 seconds
[17:49:31] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 204 seconds
[17:55:40] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 189 seconds
[17:56:25] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 205 seconds
[18:06:37] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds
[18:06:55] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds
[18:11:43] RECOVERY - MySQL Slave Delay on db26 is OK: OK replication delay 0 seconds
[18:12:10] RECOVERY - MySQL Replication Heartbeat on db26 is OK: OK replication delay 0 seconds
[18:15:19] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:23:07] PROBLEM - Puppet freshness on search1001 is CRITICAL: Puppet has not run in the last 10 hours
[18:26:07] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[18:26:07] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 8.950 seconds
[18:26:31] New patchset: Reedy; "RT #2295: Run cleanupUploadStash across all wikis daily" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37968
[18:26:48] New patchset: Reedy; "Rename *.wikimedia.org.crt to star.wikimedia.org.crt like it is used in files/owa/owa-apache" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/32924
[18:34:13] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours
[18:53:07] RECOVERY - Varnish HTCP daemon on cp1042 is OK: PROCS OK: 1 process with UID = 997 (varnishhtcpd), args varnishhtcpd worker
[18:53:08] RECOVERY - Varnish HTTP mobile-backend on cp1042 is OK: HTTP OK HTTP/1.1 200 OK - 696 bytes in 3.052 seconds
[18:53:34] RECOVERY - Varnish traffic logger on cp1042 is OK: PROCS OK: 3 processes with command name varnishncsa
[19:01:13] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:10:40] PROBLEM - Varnish HTTP mobile-backend on cp1042 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:11:08] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[19:11:08] PROBLEM - Varnish traffic logger on cp1042 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[19:11:16] PROBLEM - Varnish HTCP daemon on cp1042 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[19:13:40] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.019 seconds
[19:46:58] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:01:22] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.075 seconds
[20:33:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:47:34] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.048 seconds
[21:19:40] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:30:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.890 seconds
[22:05:43] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:16:22] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.170 seconds
[22:32:07] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours
[22:37:13] PROBLEM - Puppet freshness on db10 is CRITICAL: Puppet has not run in the last 10 hours
[22:41:07] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours
[22:41:08] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: Puppet has not run in the last 10 hours
[22:41:08] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: Puppet has not run in the last 10 hours
[22:41:08] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: Puppet has not run in the last 10 hours
[22:51:19] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:05:52] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.030 seconds
[23:12:41] New patchset: Ryan Lane; "Add salt-key accept for cron" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/40073
[23:13:14] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/40073
[23:30:56] New patchset: Dereckson; "(bug 43348) Logo for fi.wiktionary" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/40074
[23:37:22] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:49:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.600 seconds