[00:00:15] (03CR) 10Reedy: [C: 04-1] "(1 comment)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/76342 (owner: 10TTO) [00:01:31] (03PS2) 10Reedy: Change Special:Upload to Wikipedia:Upload for fawiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/76281 (owner: 10Ebrahim) [00:02:00] (03CR) 10Reedy: [C: 032] Change Special:Upload to Wikipedia:Upload for fawiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/76281 (owner: 10Ebrahim) [00:02:10] (03Merged) 10jenkins-bot: Change Special:Upload to Wikipedia:Upload for fawiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/76281 (owner: 10Ebrahim) [00:04:35] !log reedy synchronized wmf-config/InitialiseSettings.php [00:04:46] Logged the message, Master [00:08:14] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [00:14:08] !log Deployed new version of parsoid-regressions via Jenkins Job Builder [00:14:19] Logged the message, Master [00:31:34] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:32:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.127 second response time [00:32:44] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 00:32:38 UTC 2013 [00:33:14] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [00:53:29] (03PS1) 10Jgreen: fight with puppet re. exim variables [operations/puppet] - 10https://gerrit.wikimedia.org/r/77472 [00:55:02] (03CR) 10Jgreen: [C: 032 V: 032] fight with puppet re. exim variables [operations/puppet] - 10https://gerrit.wikimedia.org/r/77472 (owner: 10Jgreen) [00:56:38] (03CR) 10TTO: "(1 comment)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/76342 (owner: 10TTO) [01:00:07] (03PS1) 10Jgreen: more fighting with puppet re. 
exim variables [operations/puppet] - 10https://gerrit.wikimedia.org/r/77473 [01:00:31] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:01:35] (03CR) 10Jgreen: [C: 032 V: 032] more fighting with puppet re. exim variables [operations/puppet] - 10https://gerrit.wikimedia.org/r/77473 (owner: 10Jgreen) [01:02:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 7.429 second response time [01:04:44] (03PS1) 10Jgreen: Revert "more fighting with puppet re. exim variables" [operations/puppet] - 10https://gerrit.wikimedia.org/r/77475 [01:05:33] (03PS1) 10Jgreen: Revert "fight with puppet re. exim variables" [operations/puppet] - 10https://gerrit.wikimedia.org/r/77476 [01:05:40] (03CR) 10jenkins-bot: [V: 04-1] Revert "fight with puppet re. exim variables" [operations/puppet] - 10https://gerrit.wikimedia.org/r/77476 (owner: 10Jgreen) [01:06:19] (03Abandoned) 10Jgreen: Revert "fight with puppet re. 
exim variables" [operations/puppet] - 10https://gerrit.wikimedia.org/r/77476 (owner: 10Jgreen) [01:09:41] PROBLEM - Puppet freshness on holmium is CRITICAL: No successful Puppet run in the last 10 hours [01:10:17] (03PS1) 10Jgreen: back out otrs exim4 changes [operations/puppet] - 10https://gerrit.wikimedia.org/r/77477 [01:13:45] (03CR) 10Jgreen: [C: 032 V: 032] back out otrs exim4 changes [operations/puppet] - 10https://gerrit.wikimedia.org/r/77477 (owner: 10Jgreen) [01:15:31] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:16:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 5.912 second response time [01:27:31] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:30:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 5.189 second response time [01:48:12] (03PS1) 10Dzahn: add Roan to admins contact group to receive Icinga pages [operations/puppet] - 10https://gerrit.wikimedia.org/r/77478 [01:48:49] (03PS2) 10Dzahn: add Roan to admins contact group to receive Icinga pages [operations/puppet] - 10https://gerrit.wikimedia.org/r/77478 [01:48:58] (03CR) 10Dzahn: [C: 032] add Roan to admins contact group to receive Icinga pages [operations/puppet] - 10https://gerrit.wikimedia.org/r/77478 (owner: 10Dzahn) [01:57:37] mutante: You're sure it's rkattouw and not catrope? [01:57:52] yes, i am [01:57:57] Cool. 
:-) [01:59:49] (03PS1) 10Dzahn: add a new Icinga timeperiod for Hong Kong awake hours (HKT 9am-9pm) [operations/puppet] - 10https://gerrit.wikimedia.org/r/77481 [02:02:19] (03PS2) 10Dzahn: add a new Icinga timeperiod for Hong Kong awake hours (HKT 9am-9pm) [operations/puppet] - 10https://gerrit.wikimedia.org/r/77481 [02:07:28] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [02:07:55] (03CR) 10Dzahn: [C: 032] add a new Icinga timeperiod for Hong Kong awake hours (HKT 9am-9pm) [operations/puppet] - 10https://gerrit.wikimedia.org/r/77481 (owner: 10Dzahn) [02:16:47] !log LocalisationUpdate completed (1.22wmf12) at Sat Aug 3 02:16:47 UTC 2013 [02:17:00] Logged the message, Master [02:17:08] PROBLEM - Puppet freshness on erzurumi is CRITICAL: No successful Puppet run in the last 10 hours [02:22:28] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:23:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.155 second response time [02:32:48] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 02:32:39 UTC 2013 [02:33:28] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [02:36:54] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 3 02:36:54 UTC 2013 [02:37:07] Logged the message, Master [02:52:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:53:20] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.131 second response time [02:56:10] PROBLEM - Puppet freshness on manutius is CRITICAL: No successful Puppet run in the last 10 hours [02:56:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:57:20] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 
400 - 336 bytes in 0.128 second response time [02:57:44] A Gerrit cron job was supposed to run at 1 UTC, as far as I can tell. [02:57:51] It doesn't seem to have worked. [02:57:58] If anyone has any interest in debugging it. [02:58:10] https://gerrit.wikimedia.org/r/76945 [03:05:47] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [03:19:49] Elsie: fatal: gerrit2 does not have "accessDatabase" capability. [03:19:55] commenting on that patch set [03:20:07] mutante: Thank you! [03:20:16] I guess we'll need to switch it to root, then? [03:20:32] (03CR) 10Dzahn: "this doesn't appear to work and the reason:" [operations/puppet] - 10https://gerrit.wikimedia.org/r/76945 (owner: 10Demon) [03:20:58] Elsie: that would be Permission denied (publickey). [03:21:51] Hrmmm. [03:22:37] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:23:27] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.410 second response time [03:32:37] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 03:32:35 UTC 2013 [03:32:47] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [03:52:33] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:53:23] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [04:06:27] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [04:30:57] PROBLEM - Puppet freshness on sq41 is CRITICAL: No successful Puppet run in the last 10 hours [04:31:37] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:32:27] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.130 second response time [04:32:47] RECOVERY 
- Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 04:32:43 UTC 2013 [04:33:27] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [05:05:46] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [05:22:36] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:23:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.145 second response time [05:32:46] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 05:32:37 UTC 2013 [05:32:46] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [05:49:20] Elsie: nice the reviewer count [05:49:36] Nemo_bis: If only it worked. :P [05:49:47] Chad says he'll be able to fix the permissions issue. [05:49:49] So perhaps tomorrow. [06:05:47] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [06:30:37] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:32:27] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.126 second response time [06:32:47] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 06:32:39 UTC 2013 [06:33:47] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [06:36:41] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:37:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.133 second response time [06:49:56] (03CR) 10Hashar: [C: 04-1] "Is there any rationale in choosing Nginx / LUA instead of Varnish and its VCL?"
[operations/puppet] - 10https://gerrit.wikimedia.org/r/77454 (owner: 10Yuvipanda) [07:06:21] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [07:32:57] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 07:32:50 UTC 2013 [07:33:17] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [07:57:44] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [07:57:44] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [07:57:44] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [07:57:44] PROBLEM - Puppet freshness on virt1 is CRITICAL: No successful Puppet run in the last 10 hours [07:57:44] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [07:57:44] PROBLEM - Puppet freshness on virt4 is CRITICAL: No successful Puppet run in the last 10 hours [07:59:31] !log depooling ssl1007-9 for now. They need to be added to the MW config. [07:59:42] Logged the message, Master [08:00:44] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:01:34] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 4.701 second response time [08:11:52] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [08:33:12] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 08:33:04 UTC 2013 [08:33:52] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [09:02:40] (03CR) 10QChris: "I now granted the user accessDatabase." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/76945 (owner: 10Demon) [09:02:50] PROBLEM - Puppet freshness on ssl1004 is CRITICAL: No successful Puppet run in the last 10 hours [09:06:08] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [09:32:40] (03CR) 10Yuvipanda: "Yes - next patchset will add support for dynamic HTTP routing, with the routing tables being read from Redis. nginx/lua has good support f" [operations/puppet] - 10https://gerrit.wikimedia.org/r/77454 (owner: 10Yuvipanda) [09:32:48] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 09:32:38 UTC 2013 [09:33:08] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [09:36:15] (03CR) 10Yuvipanda: "See https://wikitech.wikimedia.org/wiki/User:Yuvipanda/Dynamic_http_routing for more details." [operations/puppet] - 10https://gerrit.wikimedia.org/r/77454 (owner: 10Yuvipanda) [09:50:54] PROBLEM - Puppet freshness on pdf3 is CRITICAL: No successful Puppet run in the last 10 hours [09:57:54] PROBLEM - Puppet freshness on mchenry is CRITICAL: No successful Puppet run in the last 10 hours [10:06:29] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [10:19:17] (03CR) 10Reedy: "(1 comment)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/76342 (owner: 10TTO) [10:32:49] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 10:32:40 UTC 2013 [10:33:29] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [11:02:19] PROBLEM - Host mw27 is DOWN: PING CRITICAL - Packet loss = 100% [11:02:48] RECOVERY - Host mw27 is UP: PING OK - Packet loss = 0%, RTA = 26.54 ms [11:10:05] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [11:10:15] PROBLEM - Puppet freshness on holmium is CRITICAL: No successful Puppet run in the last 10 hours [11:32:55] RECOVERY - 
Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 11:32:47 UTC 2013 [11:33:05] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [12:08:12] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [12:17:27] PROBLEM - Puppet freshness on erzurumi is CRITICAL: No successful Puppet run in the last 10 hours [12:32:42] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 12:32:39 UTC 2013 [12:33:12] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [12:56:04] PROBLEM - Puppet freshness on manutius is CRITICAL: No successful Puppet run in the last 10 hours [13:06:13] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [13:32:43] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 13:32:40 UTC 2013 [13:33:13] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [13:35:32] Jasper_Deng_away, ori-l: ssl1007-9 were new SSL terminating boxes that were pooled without $wgSquidServersNoPurge (the trusted X-Forwarded-For list) being adjusted first [13:36:03] Jasper_Deng_away, ori-l: I see they're not pooled now but I left a comment in the file for whoever was playing with them (Ryan_Lane I think) to not forget mediawiki-config next time [13:43:19] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:44:08] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [13:48:58] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100% [13:49:18] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 26.67 ms [14:06:06] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [14:21:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout 
after 10 seconds [14:22:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.123 second response time [14:31:56] PROBLEM - Puppet freshness on sq41 is CRITICAL: No successful Puppet run in the last 10 hours [14:32:46] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 14:32:42 UTC 2013 [14:33:06] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [14:38:46] (03PS1) 10Reedy: Add ssl1007-ssl1008 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77505 [14:38:59] Ryan_Lane ^^ [14:39:30] (03CR) 10Reedy: [C: 032] Add ssl1007-ssl1008 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77505 (owner: 10Reedy) [14:39:42] (03Merged) 10jenkins-bot: Add ssl1007-ssl1008 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77505 (owner: 10Reedy) [14:40:37] !log reedy synchronized wmf-config/squid.php 'ssl1007-ssl1009' [14:40:49] Logged the message, Master [14:41:09] OWWW EMMMMM GEEEEE [14:41:35] REEDY IS DEPLOYING DURING A FREEEEEZZZZE!!ONEONE [14:41:48] I R FIXING UR BUGS [14:42:13] (03PS1) 10Reedy: Fix indenting [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77506 [14:42:17] (03CR) 10jenkins-bot: [V: 04-1] Fix indenting [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77506 (owner: 10Reedy) [14:45:51] (03PS2) 10Reedy: Fix indenting [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77506 [14:45:53] (03CR) 10jenkins-bot: [V: 04-1] Fix indenting [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77506 (owner: 10Reedy) [14:45:54] (03PS3) 10Reedy: Fix indenting [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77506 [14:45:55] (03CR) 10Reedy: [C: 032] Fix indenting [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/77506 (owner: 10Reedy) [14:45:56] (03Merged) 10jenkins-bot: Fix indenting [operations/mediawiki-config] - 
10https://gerrit.wikimedia.org/r/77506 (owner: 10Reedy) [14:46:30] Ryan_Lane: Ryan_Lane1 ^ Feel free to repool whenever [14:47:25] Or anyone else if you're paying attention/feel so inclined [15:01:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:02:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.126 second response time [15:06:40] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [15:23:22] Reedy: heh, thanks [15:23:43] Reedy: I had it edited but thought I shouldn't deploy on the Saturday before wikimania :) [15:23:59] not that I blame you, you're a bit more accustomed to deploying mw-config than me :) [15:33:10] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 15:33:07 UTC 2013 [15:33:40] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [15:36:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:37:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.137 second response time [15:43:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:44:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [15:46:21] paravoid: If we couldn't do simple stuff like this without breaking stuff.. [15:46:36] There was a suggestion of getting puppet to generate an IP list and mw can then just import that file [15:46:38] reedy: do you know which labs box has the broken controller? [15:46:59] "One of the storage disk controllers has been broken for some time" [15:47:11] Nope... Based on that bug, Ryan, Marc or Andrew should hopefully have some idea.. [15:47:51] okay...we'll deal with it on monday.
[15:48:12] Though, you would've thought that they would have escalated it to you before now [15:48:26] you would think [15:49:28] I am curious if it's in eqiad...i think labs is mostly in tampa atm [15:49:43] paravoid: ^ Don't suppose you have any idea about a dodgy controller on one of the labs stores? [15:50:02] https://bugzilla.wikimedia.org/show_bug.cgi?id=52500 [15:53:18] not really, no [16:01:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:02:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [16:06:01] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [16:08:26] cmjohnson1: Coren is supposed to know.. [16:09:13] okay..cool [16:11:02] Reedy: thanks for taking that on "last night" [16:12:03] Why one? [16:12:03] :P [16:20:10] one? [16:20:25] Reedy: meant, 'taking it on' like "take on me" the song from the 80s ;) [16:20:33] why the ' vs "? dunno [16:20:56] Which one [16:20:57] even [16:20:58] bah [16:20:58] ( [16:21:14] Reedy speak good English. [16:21:21] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:11] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.129 second response time [16:26:04] "Labs will be just like production... minus the NFS and broken controller." 
[16:29:01] PROBLEM - Host mw1085 is DOWN: PING CRITICAL - Packet loss = 100% [16:30:01] RECOVERY - Host mw1085 is UP: PING OK - Packet loss = 0%, RTA = 0.48 ms [16:32:51] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 16:32:48 UTC 2013 [16:33:01] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [16:36:21] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:37:48] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.123 second response time [17:08:20] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [17:21:45] cmjohnson1: labstore3, IIRC. I think they're all in tampa [17:22:27] hrm okay..irc some chatter about that [17:22:40] PROBLEM - SSH on pdf2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:23:04] it's under warranty so that is good news [17:25:31] RECOVERY - SSH on pdf2 is OK: SSH OK - OpenSSH_4.7p1 Debian-8ubuntu3 (protocol 2.0) [17:25:41] cmjohnson1: Coren is active! :) [17:26:33] To be fair, we don't know if the issue is a flaky driver, flaky controller, flaky connector, or flaky shelf. We're going to be shuffling a few of those variables around in the coming days in order to figure out which. [17:32:40] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 17:32:35 UTC 2013 [17:33:20] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [17:44:08] coren: okay..let me know if I can help in any way [17:45:38] Ryan_Lane, has WMF considered EV SSL certs? [17:48:38] cmjohnson1: I should expect we will call on you eventually. Ken suggested that just taking a second look at whether the wiring is secure would be a good sanity check. :-) [17:49:10] agreed [17:52:50] (03CR) 10coren: [C: 032] "Yes, that makes sense."
[operations/puppet] - 10https://gerrit.wikimedia.org/r/77452 (owner: 10Tim Landscheidt) [17:56:40] PROBLEM - SSH on pdf2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:57:50] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [17:57:50] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [17:57:50] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [17:57:50] PROBLEM - Puppet freshness on virt1 is CRITICAL: No successful Puppet run in the last 10 hours [17:57:50] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [17:57:50] PROBLEM - Puppet freshness on virt4 is CRITICAL: No successful Puppet run in the last 10 hours [17:58:40] RECOVERY - SSH on pdf2 is OK: SSH OK - OpenSSH_4.7p1 Debian-8ubuntu3 (protocol 2.0) [18:05:29] Reedy: :) sorry, the ssl servers making edits one :) [18:06:40] PROBLEM - SSH on pdf2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:07:56] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [18:09:36] RECOVERY - SSH on pdf2 is OK: SSH OK - OpenSSH_4.7p1 Debian-8ubuntu3 (protocol 2.0) [18:15:46] PROBLEM - SSH on pdf2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:18:36] RECOVERY - SSH on pdf2 is OK: SSH OK - OpenSSH_4.7p1 Debian-8ubuntu3 (protocol 2.0) [18:32:56] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 18:32:49 UTC 2013 [18:33:56] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [19:03:21] PROBLEM - Puppet freshness on ssl1004 is CRITICAL: No successful Puppet run in the last 10 hours [19:06:51] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [19:10:33] Krenair: we have EV certs for payments [19:10:52] Krenair: EV certs aren't really any more trustworthy IMO [19:11:09] 
they give endusers a warm fuzzy when they are typing in their credit card details [19:11:33] I never got a warm fuzzy! I demand a refund! [19:32:41] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 19:32:38 UTC 2013 [19:32:51] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [19:45:55] !log repooling ssl1007-9 [19:46:06] Logged the message, Master [19:51:12] PROBLEM - Puppet freshness on pdf3 is CRITICAL: No successful Puppet run in the last 10 hours [19:58:12] PROBLEM - Puppet freshness on mchenry is CRITICAL: No successful Puppet run in the last 10 hours [20:06:30] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [20:08:20] PROBLEM - Host mw1085 is DOWN: PING CRITICAL - Packet loss = 100% [20:09:30] RECOVERY - Host mw1085 is UP: PING OK - Packet loss = 0%, RTA = 0.62 ms [20:33:00] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 20:32:58 UTC 2013 [20:33:30] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [21:06:03] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [21:10:43] PROBLEM - Puppet freshness on holmium is CRITICAL: No successful Puppet run in the last 10 hours [21:32:53] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 21:32:46 UTC 2013 [21:33:03] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [22:06:36] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [22:17:56] PROBLEM - Puppet freshness on erzurumi is CRITICAL: No successful Puppet run in the last 10 hours [22:33:36] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 22:33:32 UTC 2013 [22:34:36] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [22:56:13] PROBLEM - Puppet freshness on manutius is CRITICAL: No 
successful Puppet run in the last 10 hours [23:11:38] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [23:32:48] RECOVERY - Puppet freshness on mexia is OK: puppet ran at Sat Aug 3 23:32:44 UTC 2013 [23:33:38] PROBLEM - Puppet freshness on mexia is CRITICAL: No successful Puppet run in the last 10 hours [23:33:58] PROBLEM - LVS HTTPS IPv4 on wikinews-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:34:08] PROBLEM - LVS HTTPS IPv4 on wikibooks-lb.eqiad.wikimedia.org is CRITICAL: Connection timed out [23:34:08] PROBLEM - LVS HTTP IPv4 on wikipedia-lb.eqiad.wikimedia.org is CRITICAL: Connection timed out [23:34:08] PROBLEM - LVS HTTPS IPv6 on wikinews-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:34:18] PROBLEM - LVS HTTPS IPv4 on wikipedia-lb.eqiad.wikimedia.org is CRITICAL: Connection timed out [23:34:18] PROBLEM - LVS HTTP IPv6 on bits-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out [23:34:28] PROBLEM - LVS HTTPS IPv6 on bits-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out [23:34:58] RECOVERY - LVS HTTPS IPv4 on wikinews-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 62046 bytes in 8.482 second response time [23:35:18] RECOVERY - LVS HTTP IPv6 on bits-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 3833 bytes in 1.760 second response time [23:35:21] RECOVERY - LVS HTTPS IPv6 on bits-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 3849 bytes in 4.910 second response time [23:35:28] PROBLEM - LVS HTTPS IPv4 on foundation-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:36:07] PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out [23:36:10] PROBLEM - LVS HTTP IPv4 on m.wikimedia.org is CRITICAL: Connection timed out [23:36:13] PROBLEM - LVS HTTPS IPv6 on wikivoyage-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: 
CRITICAL - Socket timeout after 10 seconds [23:36:16] PROBLEM - LVS HTTPS IPv4 on wikiquote-lb.eqiad.wikimedia.org is CRITICAL: Connection timed out [23:36:17] PROBLEM - LVS HTTP IPv4 on wikimedia-lb.eqiad.wikimedia.org is CRITICAL: HTTP CRITICAL - No data received from host [23:36:29] RECOVERY - LVS HTTPS IPv4 on wikipedia-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 99802 bytes in 0.029 second response time [23:36:29] RECOVERY - LVS HTTPS IPv4 on foundation-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 62046 bytes in 0.037 second response time [23:36:59] RECOVERY - LVS HTTPS IPv6 on wikivoyage-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 42907 bytes in 0.011 second response time [23:37:02] RECOVERY - LVS HTTPS IPv4 on wikibooks-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 62046 bytes in 0.016 second response time [23:37:02] RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 22678 bytes in 0.002 second response time [23:37:05] RECOVERY - LVS HTTP IPv4 on m.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 22678 bytes in 0.002 second response time [23:37:07] RECOVERY - LVS HTTPS IPv6 on wikinews-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 62046 bytes in 0.012 second response time [23:37:07] RECOVERY - LVS HTTP IPv4 on wikipedia-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.0 200 OK - 62040 bytes in 0.003 second response time [23:37:08] RECOVERY - LVS HTTPS IPv4 on wikiquote-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 62046 bytes in 0.015 second response time [23:37:08] RECOVERY - LVS HTTP IPv4 on wikimedia-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.0 200 OK - 99796 bytes in 0.006 second response time [23:57:41] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:58:41] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 7.508 second 
response time