[00:01:08] !log repooling ssl1002 (upgrade complete) [00:01:12] !log depooling ssl1001 [00:01:15] Logged the message, Master [00:01:21] Logged the message, Master [00:04:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:06:36] okay, scapping [00:07:00] MaxSem: Reedy yes, those are in a half-up state [00:07:05] not in rotation at the moment [00:07:56] PROBLEM - NTP on ssl4 is CRITICAL: NTP CRITICAL: No response from NTP server [00:19:11] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.662 seconds [00:21:34] !log maxsem Started syncing Wikimedia installation... : Black-deployment of ext:Solarium, not enabled anywhere yet [00:21:41] Logged the message, Master [00:22:20] PROBLEM - HTTPS on ssl4 is CRITICAL: Connection refused [00:23:15] !log repooling ssl4 (upgrade complete) [00:23:22] Logged the message, Master [00:23:23] !log depooling ssl2 for upgrade to precise [00:23:29] Logged the message, Master [00:23:59] RECOVERY - HTTPS on ssl4 is OK: OK - Certificate will expire on 08/22/2015 22:23. [00:31:29] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.004 seconds [00:33:35] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [00:33:35] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [00:33:35] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [00:33:35] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [00:34:47] RECOVERY - Apache HTTP on srv267 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.020 seconds [00:37:38] PROBLEM - HTTPS on ssl1001 is CRITICAL: Connection refused [00:42:53] RECOVERY - Apache HTTP on srv279 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.006 seconds [00:44:14] RECOVERY - HTTPS on ssl1001 is OK: OK - Certificate will expire on 10/27/2015 12:00. [00:44:20] Dear opsen, i'm looking for the ganglia config files. [00:44:47] awight: it's in gerrit [00:44:54] in the operations/puppet repo [00:45:02] likely in ganglia.pp [00:45:16] yep, in ganglia.pp [00:45:17] i found that, but it doesn't contain all the instrumentation [00:45:26] that's in files and templates [00:45:39] you'll need to read the manifest to find those files [00:45:44] RECOVERY - Apache HTTP on srv268 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.005 seconds [00:45:51] likely files/ganglia [00:45:53] and templates/ganglia [00:46:10] !log repooling ssl1001 (upgrade complete) [00:46:17] Logged the message, Master [00:46:17] !log all eqiad https hosts upgraded to precise [00:46:24] Logged the message, Master [00:48:09] grrr, scap appears to hang on wikiversions sync [00:49:09] where's that new deployment system already? [00:49:10] oh [00:49:10] right [00:49:11] PROBLEM - Host ssl2 is DOWN: PING CRITICAL - Packet loss = 100% [00:49:56] RECOVERY - Host ssl2 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms [00:53:23] PROBLEM - HTTPS on ssl2 is CRITICAL: Connection refused [00:53:50] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:56:16] Ryan_Lane, TBH, it's hanging ssh'ing one of the half-baked servers. will git-deploy not use ssh at all? 
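
An aside on the ganglia exchange above, before the conversation moves on: puppet manifests point at their payload files via puppet:/// source URLs and template() calls, so grepping the manifest is the quickest way to locate the instrumentation under files/ganglia and templates/ganglia. A minimal sketch, assuming a local clone of operations/puppet; the clone URL and manifest path are illustrative, per the "likely in ganglia.pp" guess in the log.

    # Sketch: find which files/ and templates/ entries a manifest pulls in.
    git clone https://gerrit.wikimedia.org/r/p/operations/puppet.git
    cd puppet
    grep -n 'puppet:///' manifests/ganglia.pp   # source => ... entries map to files/ganglia
    grep -n 'template('  manifests/ganglia.pp   # template(...) calls map to templates/ganglia
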
[00:56:25] it will not [00:56:35] 0mq [00:56:36] via salt [00:57:08] RECOVERY - NTP on srv278 is OK: NTP OK: Offset -0.05188822746 secs [00:57:11] not saying it won't hang, but it'll hang waiting for all minions to return (and that has a timeout setting) [00:57:21] and every host will get the command in parallel [00:57:26] s/host/minion/ [00:57:32] I really should use consistent terms [00:58:03] like the commanding entity=overlord?:P [01:00:44] RECOVERY - NTP on srv267 is OK: NTP OK: Offset -0.04328072071 secs [01:02:28] !log Scap hung and had to be aborted. Since what was being deployed wasn't enabled, no clusters were harmed. [01:02:36] Logged the message, Master [01:08:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.031 seconds [01:09:59] !log depooling ssl3003 to upgrade to precise [01:12:04] Logged the message, Master [01:13:20] RECOVERY - Apache HTTP on srv269 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.008 seconds [01:14:05] RECOVERY - Apache HTTP on srv280 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.002 seconds [01:21:22] RECOVERY - HTTPS on ssl2 is OK: OK - Certificate will expire on 08/22/2015 22:23. [01:21:22] !log repooling ssl2 (upgrade complete) [01:21:25] !log depooling ssl1 [01:21:30] Logged the message, Master [01:21:36] Logged the message, Master [01:27:22] RECOVERY - NTP on srv279 is OK: NTP OK: Offset -0.03760004044 secs [01:27:40] RECOVERY - NTP on srv268 is OK: NTP OK: Offset -0.04951989651 secs [01:40:07] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 327 seconds [01:40:25] RECOVERY - NTP on srv269 is OK: NTP OK: Offset -0.0414069891 secs [01:40:25] RECOVERY - NTP on srv280 is OK: NTP OK: Offset -0.03783047199 secs [01:40:25] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 345 seconds [01:42:13] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:43:25] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [01:47:01] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 0 seconds [01:51:13] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [01:57:04] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.661 seconds [01:59:37] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 231 seconds [01:59:55] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 246 seconds [02:07:57] mutante: !log repooling srv290-srv295 [02:08:27] er [02:08:32] that wasn't just for mutante [02:08:37] !log repooling srv290-srv295 [02:08:44] Logged the message, notpeter [02:09:59] !log depooling srv296-srv301 for upgrades to precise [02:10:07] Logged the message, notpeter [02:23:37] PROBLEM - HTTPS on ssl1 is CRITICAL: Connection refused [02:23:46] PROBLEM - Host srv296 is DOWN: PING CRITICAL - Packet loss = 100% [02:24:14] New review: Aaron Schulz; "Looks fine." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34251 [02:24:41] PROBLEM - Host srv297 is DOWN: PING CRITICAL - Packet loss = 100% [02:25:16] PROBLEM - Host srv298 is DOWN: PING CRITICAL - Packet loss = 100% [02:25:52] PROBLEM - Host srv299 is DOWN: PING CRITICAL - Packet loss = 100% [02:26:46] PROBLEM - Host srv300 is DOWN: PING CRITICAL - Packet loss = 100% [02:27:13] PROBLEM - Host srv301 is DOWN: PING CRITICAL - Packet loss = 100% [02:28:20] !log LocalisationUpdate completed (1.21wmf4) at Tue Nov 20 02:28:20 UTC 2012 [02:28:27] Logged the message, Master [02:29:28] RECOVERY - Host srv296 is UP: PING OK - Packet loss = 0%, RTA = 0.44 ms [02:30:22] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:22] RECOVERY - Host srv297 is UP: PING OK - Packet loss = 0%, RTA = 0.98 ms [02:30:58] RECOVERY - Host srv298 is UP: PING OK - Packet loss = 0%, RTA = 0.22 ms [02:31:34] RECOVERY - Host srv299 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms [02:32:28] RECOVERY - Host srv300 is UP: PING OK - Packet loss = 0%, RTA = 1.24 ms [02:32:55] RECOVERY - Host srv301 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms [02:33:04] PROBLEM - Apache HTTP on srv296 is CRITICAL: Connection refused [02:33:13] PROBLEM - SSH on srv296 is CRITICAL: Connection refused [02:34:16] PROBLEM - SSH on srv297 is CRITICAL: Connection refused [02:34:25] PROBLEM - Apache HTTP on srv297 is CRITICAL: Connection refused [02:34:34] PROBLEM - SSH on srv298 is CRITICAL: Connection refused [02:35:01] PROBLEM - Apache HTTP on srv299 is CRITICAL: Connection refused [02:35:19] PROBLEM - Apache HTTP on srv298 is CRITICAL: Connection refused [02:35:55] PROBLEM - SSH on srv299 is CRITICAL: Connection refused [02:36:40] PROBLEM - Apache HTTP on srv300 is CRITICAL: Connection refused [02:36:40] PROBLEM - SSH on srv300 is CRITICAL: Connection refused [02:36:58] PROBLEM - Apache HTTP on srv301 is CRITICAL: Connection refused [02:37:52] PROBLEM - SSH on srv301 is CRITICAL: Connection refused [02:38:37] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.037 seconds [02:41:01] RECOVERY - SSH on srv298 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:41:19] RECOVERY - SSH on srv296 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:42:32] RECOVERY - SSH on srv297 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:43:34] RECOVERY - SSH on srv300 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:44:10] RECOVERY - SSH on srv299 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:44:19] RECOVERY - SSH on srv301 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [02:52:09] !log LocalisationUpdate completed (1.21wmf3) at Tue Nov 20 02:52:09 UTC 2012 [02:52:17] Logged the message, Master [02:53:10] PROBLEM - NTP on srv296 is CRITICAL: NTP CRITICAL: No response from NTP server [02:54:13] PROBLEM - NTP on srv297 is CRITICAL: NTP CRITICAL: No response from NTP server [02:54:40] PROBLEM - NTP on srv298 is CRITICAL: NTP CRITICAL: No response from NTP server [02:55:25] PROBLEM - NTP on srv299 is CRITICAL: NTP CRITICAL: No response from NTP server [02:56:01] PROBLEM - NTP on srv301 is CRITICAL: NTP CRITICAL: No response from NTP server [02:56:29] PROBLEM - NTP on srv300 is CRITICAL: NTP CRITICAL: No response from NTP server [03:05:55] PROBLEM - Squid on brewster is CRITICAL: Connection refused [03:31:11] New patchset: Tim Starling; "Updates for Score 
deployment" [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/34255 [03:36:58] RECOVERY - Puppet freshness on dobson is OK: puppet ran at Tue Nov 20 03:36:46 UTC 2012 [03:44:28] !log on brewster: root partition is full, removing some useless squid logs [03:44:35] Logged the message, Master [03:46:19] why do people make servers with microscopic root partitions and then configure them to store gigabytes of logs? [03:46:52] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 9 seconds [03:47:01] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [03:47:55] RECOVERY - Squid on brewster is OK: TCP OK - 0.001 second response time on port 8080 [03:48:00] i think brewster had a full / recently? less than 2 weeks i guess [03:48:08] but idk if it was squid logs [03:49:43] RECOVERY - Apache HTTP on srv301 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.003 seconds [03:51:57] !log on brewster: reduced rotate count for squid logs to zero and ran logrotate -f [03:52:04] Logged the message, Master [04:03:40] RECOVERY - NTP on srv301 is OK: NTP OK: Offset -0.001162052155 secs [04:04:17] is there any way to install a package without installing the packages it recommends? [04:04:41] !log repooling ssl1 (upgrade complete) [04:04:44] RECOVERY - HTTPS on ssl1 is OK: OK - Certificate will expire on 08/22/2015 22:23. [04:04:48] Logged the message, Master [04:04:56] TimStarling: yes, not that I can remember how off the top of my head [04:05:06] TimStarling: --no-install-recommends [04:05:17] or do you mean via puppet? [04:06:21] via puppet or wikimedia-task-appserver [04:06:33] puppet is going to be more difficult [04:07:19] http://projects.puppetlabs.com/issues/1766 [04:07:59] I added timidity and lilypond to wikimedia-task-appserver [04:08:17] it will already be going out to lucid apaches since I just pushed it into the lucid-wikimedia repo [04:08:21] why are we still using wikimedia-task-appserver? [04:08:29] I thought we were pulling the package dependencies out [04:08:30] well, I did ask [04:08:50] does anyone have opinions on wikimedia-task-appserver and whether it should continue to exist? [04:08:57] it has dependencies for most of the packages that MW needs, except for 4 which have been added directly to the puppet class [04:08:58] I think it should not exist [04:09:04] it's probably a bit easier to update puppet than to update the task package [04:09:07] I think puppet should handle this [04:09:34] wikimedia-task-appserver had not been updated since august [04:09:39] I've been trying to kill all the configuration packages for a while now [04:10:01] I could have sworn I did so when we upgraded to lucid, except for a few scripts [04:10:34] we had branches of the package, I wonder if mine simply disappeared [04:10:45] wikimedia-task-appserver had 53 packages and puppet had 5 [04:10:54] so I figured wikimedia-task-appserver was the preferred solution [04:11:20] nope. I'm nearly positive I stripped almost all of them out at some point in a branched version for lucid [04:11:33] did you commit it anywhere? [04:11:34] * Ryan_Lane checks [04:12:27] I just imported it into git half an hour ago, in case you're looking there [04:12:39] ah. no. I was looking for a branch in svn [04:12:49] would the import have gotten the branches too? [04:13:15] no [04:13:26] * Ryan_Lane grumbles [04:13:39] hrmmm, still no rdns on WMF IPv6 ? 
[04:13:49] let me see what the ticket # was [04:14:00] http://svn.wikimedia.org/viewvc/mediawiki/branches/hardy/debs/wikimedia-task-appserver/debian/control?r1=83002&r2=85389 [04:14:23] that looks like I did the opposite? [04:14:25] that's 1.5 years ago [04:14:34] and in hardy [04:14:36] hrmmmm, well it does have rdns now... [04:14:39] 2620:0:861:1::2 [04:14:54] the revision says "Updating package for lucid." [04:15:05] http://svn.wikimedia.org/viewvc/mediawiki/trunk/debs/wikimedia-task-appserver/debian/control?view=log&pathrev=85389 [04:15:11] in a branch called hardy? [04:15:18] you had branched it [04:15:23] and reverted my changes [04:15:31] > Received: from [2620:0:861:1::2] (port=48253 helo=lists.wikimedia.org) by mchenry.wikimedia.org with esmtp (Exim 4.69) (envelope-from ) id baz for info@wikipedia.org; Mon, 19 Nov 2012 19:19:24 +0000 [04:15:47] though it looks like I added dependencies [04:15:50] not removed. [04:16:11] seems I'm insane. ignore me. [04:16:42] done [04:16:54] either way, I despise the configuration packages and would much prefer that things were done in puppet [04:17:33] I'm pretty sure the only one that doesn't feel that way is Jeff_Green [04:17:39] and either way, there is no way to avoid installing recommended packages? [04:17:56] yes and no [04:18:10] can change the apt configuration so that the default is to not install recommended [04:18:53] timidity recommends timidity-daemon, which is some kind of hardware emulation thing for ALSA [04:19:02] ugh [04:19:16] and lilypond recommends a couple of hundred MB of docs [04:19:56] let me see if it's possible in the package definition [04:21:37] seems not [04:21:39] fucking puppet [04:21:53] so, can change the apt configuration [04:21:59] may be able to do it per package [04:23:16] Change merged: Tim Starling; [operations/debs/wikimedia-task-appserver] (master) - https://gerrit.wikimedia.org/r/34255 [04:23:31] already deployed, so about time I merged it [04:23:56] /etc/apt/apt.conf [04:24:00] APT::Install-Recommends "0"; [04:24:18] of course, would be nicer to do for specific packages [04:24:28] maybe it's possible to do similar to pinning [04:24:31] I could add a Conflicts line [04:24:39] to wikimedia-task-appserver [04:24:42] that would likely cause problems [04:25:36] !log repooling srv258-srv280 [04:25:44] Logged the message, notpeter [04:27:03] or I could just disable the timidity-daemon service via puppet, then it would be pretty harmless [04:27:15] could. yeah [04:27:25] it's absurd that puppet can't handle this [04:27:49] I wonder what difference it would make if we disabled recommends by default [04:28:03] hard to say without reinstalling the server [04:28:08] yep [04:29:14] you could clean install 2 servers (one each with and without reccomends) and diff their installed package lists [04:29:24] yeah. could do it in labs [04:29:26] * Ryan_Lane shrugs [04:29:37] the recommendation is to not turn recommends off (of course) [04:29:48] where? [04:30:24] Ryan_Lane: so is it at all possible to get tim tams locally? [04:30:40] I've heard yes, but I haven't seen them anywhere [04:31:04] is it OK to put this in mediawiki::packages or do I need to add a dozen layers of abstraction? [04:31:12] we have stroopwafel in NYC [04:31:17] why is it always a dozen? [04:31:24] never thirteen or eleven [04:31:32] TimStarling: the packages? [04:31:41] should be fine to put into mediawiki::packages [04:31:52] disabling the timidity-daemon service [04:31:56] ah [04:32:04] hm. 
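
The two remedies floated above, spelled out as a sketch. The first makes "no recommends" the system default, using the apt.conf line quoted in the log but dropped into a conf.d snippet (the filename here is made up); the second is the "disable the timidity-daemon service via puppet" idea as a one-off puppet apply, where in production the service resource would of course live in a manifest rather than on the command line.

    # System-wide default, equivalent to the /etc/apt/apt.conf line above:
    echo 'APT::Install-Recommends "0";' > /etc/apt/apt.conf.d/99no-recommends

    # Disable the unwanted recommended daemon without removing the package:
    puppet apply -e 'service { "timidity-daemon": ensure => stopped, enable => false }'
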
[04:32:26] one sec [04:33:08] hey tim, whatever you do, can you put it in both the mediawii class and the mediawiki_new module? alsmost done migrating, I promise! [04:33:12] I'd say it's fine there [04:33:45] * jeremyb hands notpeter a k [04:33:50] it's a side-effect of the package [04:33:59] it's 830, and I've been drinking ;) [04:34:03] I think it's a good idea to put it there and document why it's there [04:34:07] I can lose all the letters I want! [04:34:28] * jeremyb gets a beer [04:37:21] there's no mediawiki class [04:40:30] interesting how wikimedia-task-appserver and the other packages MW needs are in different puppet classes in the new hierarchy [04:40:41] despite the fact that they do the same thing [04:40:58] mediawiki::packages [04:40:59] sorry [04:41:21] we should really split this off into a module eventually [04:41:25] notpeter: is that in the plans? :) [04:41:55] well, let's see if ssl3003 comes back up [04:42:09] sure. it's all in the plans [04:42:10] !log rebooting ssl3003 (upgraded via ssh, not console) [04:42:12] :D [04:42:16] Logged the message, Master [04:42:36] well, if ssl3003 doesn't come back up I guess I'll need to stop till mark gets back in [04:44:27] \o/ it came back up [04:45:25] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [04:46:25] New patchset: Tim Starling; "Changes for Score extension deployment" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34262 [04:48:14] !log repooling ssl3003 (upgrade finished) [04:48:20] Logged the message, Master [04:48:22] !log depooling ssl3002 for upgrade to precise [04:48:29] Logged the message, Master [04:50:15] someone want to review that last change? [04:50:57] name isn't necessary in the service definition [04:51:13] though it also won't hurt anything [04:51:21] yeah, I thought it probably wasn't, but there was nothing to say it wasn't in the docs [04:51:37] at least, not in the section I was reading [04:51:50] if it isn't specified, it takes the title [04:52:17] hm. well I can't really review the rewrite code [04:52:25] TimStarling: "lilypond"? [04:52:54] New review: Ryan Lane; "+1 on puppet and varnish. Someone else will need to approve for rewrite." [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/34262 [04:53:15] thanks Aaron|home [04:53:31] amending for both [04:54:00] New patchset: Tim Starling; "Changes for Score extension deployment" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34262 [04:56:09] I grepped for "math", that's how I found that varnish change [04:58:04] ahh, right, we are using varnish now [04:58:47] hopefully there will be no switching back [05:00:19] ok, I'll push that out now, if there are no more comments [05:00:36] sec :) [05:00:38] looking [05:01:09] paravoid: you are still awake? [05:01:10] heh [05:01:11] crazy [05:01:28] no, I just woke up [05:01:56] ah [05:02:09] TimStarling: two things [05:02:22] wow, paravoid waking up in the morning and sleeping at night [05:02:31] yeah, unexpected [05:02:40] a) I'm running a patched rewrite.py on ms-fe1 right now for testing, so I'd have to patch this by hand [05:03:00] but I think I'm just going to roll-out all of my rewrite.py changes to all servers anyway [05:03:05] b) this needs a squid change too [05:03:16] we're not running exclusively on varnish yet [05:03:22] squid configuration is not in puppet, is it? [05:03:27] nope [05:03:28] nope [05:03:44] paravoid: where are we using squid for uploads? 
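
A quick check of the name-versus-title point above: when a puppet service resource carries no explicit name, the title is used as the name, so the two declarations below are equivalent. --noop makes puppet report what it would do without changing anything; timidity-daemon is just the example at hand.

    puppet apply --noop -e 'service { "timidity-daemon": ensure => stopped }'
    puppet apply --noop -e 'service { "timidity-daemon": name => "timidity-daemon", ensure => stopped }'
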
[05:03:45] I want to push this out first, before I install timidity on any more servers [05:04:13] then I want to update wikimedia-task-appserver on precise [05:04:34] are we still using that? [05:04:35] then it will be time for squid [05:04:40] yes, see above [05:05:16] paravoid: I had basically the same reaction [05:06:00] someone was working on ditching this [05:06:02] I think hashar [05:06:06] oh well [05:06:19] Ryan thought he was working on ditching it too [05:06:29] but apparently nobody has succeeded [05:06:53] heh [05:06:59] so, want some help Tim? [05:07:16] if you like [05:07:19] sure [05:08:03] thanks [05:08:53] the MW configuration change is here, in case you need it for reference: https://gerrit.wikimedia.org/r/#/c/34251/ [05:09:04] status draft until the ops changes are done [05:09:07] doing squid now [05:09:19] and I already +1'ed that change [05:09:25] for the rewrite.py part :) [05:09:59] Ryan reviewed the rest of it [05:10:03] yeah [05:10:04] I'll deploy it now [05:10:15] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34262 [05:10:22] Aaron|home: pmtpa still runs on squid and is the backend for esams [05:11:48] !log rebooting ssl3002 [05:11:55] Logged the message, Master [05:11:56] I'm wondering if we should put the squid configs to gerrit [05:12:03] paravoid: meg [05:12:04] err [05:12:05] meh [05:12:11] we'll be rid of it soon enough [05:12:16] well, that was my initial reaction too [05:12:19] and we'd need to clean it [05:12:29] but people have needed it a number of times since then [05:12:34] which means we'd need to split part of it out into private configs [05:12:34] wikivoyage, this etc. [05:12:39] you know there's a reason I put the password in a separate file [05:12:52] it was meant to be public from the start, but I was shouted down [05:12:56] the IPs we are blocking would need to be split out too [05:13:03] bleh. I wish it was public [05:13:11] it's made life annoying in labs [05:13:35] IP blocks are public when they are made via the web interface [05:13:54] there's no real reason why they can't be public when they are made via configuration [05:13:59] except we're putting in blocks against DoS [05:14:23] maybe [05:14:30] !log deploying squid config for score [05:14:37] Logged the message, Master [05:14:55] I think it's likely more trouble than it's worth at this point, though [05:15:19] I'd rather put more effort in DNS being public than squid [05:16:51] New patchset: Faidon; "swift: also handle URLErrors from imagescalers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33490 [05:16:51] New patchset: Faidon; "swift: passthrough all imgscalers errors as-is" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33491 [05:16:51] New patchset: Faidon; "swift: fix https for short thumb URL redirects" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33492 [05:16:52] New patchset: Faidon; "swift: use WSGIContext properly" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33510 [05:16:52] New patchset: Faidon; "swift: removed code to hide the ETag." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/23392 [05:16:52] New patchset: Faidon; "swift: removed copy2() and friends from rewrite.py" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25410 [05:16:52] New patchset: Faidon; "swift: remove unreferenced code/variables" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33651 [05:16:53] New patchset: Faidon; "swift: add CORS support" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33652 [05:16:57] rebase fun [05:17:01] * Ryan_Lane twitches [05:17:01] heh [05:18:12] PROBLEM - HTTPS on ssl3002 is CRITICAL: Connection refused [05:19:15] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/23392 [05:19:25] !log repooling ssl3002 (upgrade complete) [05:19:31] !log depooling ssl3001 [05:19:33] Logged the message, Master [05:19:33] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/25410 [05:19:39] Logged the message, Master [05:19:51] RECOVERY - HTTPS on ssl3002 is OK: OK - Certificate will expire on 08/22/2015 22:23. [05:20:17] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33490 [05:20:40] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33491 [05:21:06] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33492 [05:21:41] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33510 [05:21:53] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33651 [05:22:08] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33652 [05:23:08] New patchset: Faidon; "swift: remove support for container sync" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33653 [05:23:38] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33653 [05:29:17] New patchset: Faidon; "swift: remove more unreferenced config" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34263 [05:29:29] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34263 [05:31:15] RECOVERY - Puppet freshness on ms-fe1 is OK: puppet ran at Tue Nov 20 05:30:59 UTC 2012 [05:31:27] TimStarling: running puppet on swift proxies and restarting them via rollover now [05:36:30] RECOVERY - Apache HTTP on srv296 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.007 seconds [05:36:39] RECOVERY - Apache HTTP on srv297 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.005 seconds [05:36:57] RECOVERY - Apache HTTP on srv300 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.006 seconds [05:37:06] RECOVERY - Apache HTTP on srv299 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.005 seconds [05:37:33] RECOVERY - Apache HTTP on srv298 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.006 seconds [05:38:42] Ryan_Lane: ok, then what prereqs are there for public DNS? [05:39:06] someone doing it? :) [05:39:29] i only have a very basic idea of what the current state is [05:39:30] heh [05:39:33] looks like I created a few appservers from scratch with my dsh -g apaches apt-get install wikimedia-task-appserver [05:39:38] I did a bit of a DNS work a few weekends ago [05:39:48] incl. 
rewriting our powerdns to work with libgeoip [05:39:51] yeah, those ones that just went OK [05:40:00] TimStarling: that's possible that dsh group is likely bad [05:40:11] there's a period somewhere in that sentence [05:40:26] you know, puppet has an export feature [05:40:40] what is rollover? [05:40:44] @@, like we use to generate the nagios configuration [05:40:57] yes, it does [05:41:01] jeremyb: depooling a server, restarting it, pooling it back. rinse, repeat [05:41:09] I'm going to install salt this week [05:41:10] TimStarling: I don't think you want us to go there [05:41:10] so mark says that puppet can't generate dsh node groups, but it's not obvious to me why not [05:41:14] so, what's the point? [05:41:25] I've used it heavily in the past [05:41:29] it's not easy to have puppet generate it [05:41:30] it has several problems [05:41:37] it's very easy [05:41:52] paravoid: oh, i was thinking there was a script or something called rollover [05:41:52] well, not very maybe [05:42:04] honestly, the solution is to switch to a modern remote execution app [05:42:20] like salt, right [05:43:05] and how will salt get groups? [05:43:32] we can have puppet configure "grains" on the systems [05:44:06] so the individual systems know which groups they're in not the central brain [05:44:11] yes [05:44:14] it's pub/sub [05:49:54] anyway, I can't deploy this without an apaches node group that is correct [05:50:08] so does anyone know what changes have been made recently that would affect it? [05:50:18] New patchset: Faidon; "swift: add CORS on just-generated thumbs too" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34264 [05:50:19] no idea [05:50:46] but a wrong apache groups sounds like much more severe than just score, doesn't it? [05:51:00] okay, ESYNTAX [05:51:28] a wrong apache group sounds severe as affects all MW deployments, doesn't it? [05:51:41] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34264 [05:51:46] mediawiki-installation is the group for mediawiki deployments [05:52:00] I could use that one instead, I guess [05:52:12] oh [05:52:24] if that is wrong, things really will go bad [05:52:42] PROBLEM - swift-container-server on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [05:53:41] TimStarling: that seems to include fenari, hume etc. though [05:54:12] RECOVERY - swift-container-server on ms-be2 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [05:54:30] yes [05:54:43] !log rebooting ssl3001 [05:54:46] they probably need timidity anyway [05:54:49] Logged the message, Master [05:55:18] note that srv296-300 must still be in puppet, because they were still being monitored [05:55:37] unless puppet is broken on spence, I guess it wouldn't be the first time [05:55:39] diff -u <(sort /etc/dsh/group/mediawiki-installation) <(sort /etc/dsh/group/apaches) [05:55:43] is... 
interesting [05:55:53] so "apaches" doesn't have imagescalers or tmh [05:56:27] PROBLEM - Host ssl3001 is DOWN: PING CRITICAL - Packet loss = 100% [05:56:33] imagescalers is deliberately missing, it has its own group [05:56:51] apaches is traditionally just the main cluster [05:57:07] theoretically equivalent to /home/wikipedia/conf/pybal/pmtpa/apaches [05:57:21] PROBLEM - swift-account-auditor on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [05:57:30] PROBLEM - swift-account-reaper on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [05:57:39] PROBLEM - swift-account-replicator on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [05:58:06] PROBLEM - swift-account-server on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [05:58:33] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [05:59:41] what is going on? [05:59:52] nothing [05:59:54] RECOVERY - swift-account-server on ms-be2 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [06:00:30] RECOVERY - swift-account-auditor on ms-be2 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [06:00:39] RECOVERY - swift-account-reaper on ms-be2 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [06:00:40] we really need to fix our nagios setup dammit [06:00:46] too slow [06:00:49] RECOVERY - swift-account-replicator on ms-be2 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [06:00:49] RECOVERY - Host ssl3001 is UP: PING OK - Packet loss = 0%, RTA = 119.22 ms [06:01:55] yeah, why do you think I was running apt-get install manually? [06:02:18] PROBLEM - Lucene on search13 is CRITICAL: Connection timed out [06:02:23] "waiting for puppet", it's the ops equivalent of "my code's compiling" [06:03:33] there's more room for fencing on the chairs in the office now. haven't seen that happen yet, but we've only had our current config a day now [06:03:36] * Ryan_Lane grumbles [06:03:41] ssl3001 won't come back up [06:04:24] PROBLEM - SSH on ssl3001 is CRITICAL: Connection refused [06:04:33] PROBLEM - HTTPS on ssl3001 is CRITICAL: Connection refused [06:06:20] and I can't get into the damn console [06:06:22] oh well [06:06:29] guess we'll be down another https server in esams [06:06:57] that's not very good [06:07:09] meh [06:07:11] well, the packages are all installed, but it looks like rewrite.py isn't updated yet? [06:07:22] we can handle all of the current traffic with a single https host if needed [06:07:26] TimStarling: it should be [06:07:52] TimStarling: have an example URL for me? [06:07:52] ah right, maybe it has [06:08:15] well, I'll enable the extension on test2 then we can get MW to make some test files [06:08:39] but the error message has changed so I guess it has been deployed [06:09:42] the "token may have timed out" basically means that no container has been created [06:10:20] Ryan_Lane: so that's how many up / down in esams now? [06:10:34] 2/2 [06:10:38] k [06:11:25] it was a risk upgrading them without console access [06:11:41] which is why I did one at a time [06:11:50] one has been broken for months [06:11:58] sure. 
just didn't realize there was one still broken forever [06:12:03] why no console access? [06:12:06] no clue [06:12:37] either they aren't connected or their network config is screwed up or they are in the wrong vlan or some other reason [06:12:52] you tried through hooft, right? [06:13:00] why through hooft? [06:13:35] you can use any esams host [06:13:46] well, yeah [06:13:56] but yeah, tried in esams ;) [06:14:05] yeah that was the question :) [06:14:16] all of the ssl hosts and a couple of the cp boxes are inaccessible [06:14:25] oh hrm [06:14:26] strange [06:14:39] yep [06:14:51] worked at some point. otherwise I couldn't have installed them [06:16:51] RECOVERY - Lucene on search13 is OK: TCP OK - 0.002 second response time on port 8123 [06:18:43] New patchset: Faidon; "Bug 41304 Add X-Content-Duration to allowed_headers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29768 [06:18:58] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/29768 [06:21:54] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33202 [06:22:53] TimStarling: so, all is well? [06:23:00] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [06:23:47] the code for the extension wasn't deployed yet, I'm adding it [06:23:59] mostly involves waiting for git [06:24:27] okay, breakfast time [06:24:30] see you in a bit [06:26:24] bye [06:29:27] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [06:33:51] !log tstarling synchronized php-1.21wmf3/extensions/Score [06:33:57] Logged the message, Master [06:34:28] !log tstarling synchronized php-1.21wmf4/extensions/Score [06:34:34] Logged the message, Master [06:36:00] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34251 [06:36:34] !log tstarling synchronized wmf-config/filebackend.php [06:36:41] Logged the message, Master [06:36:51] !log tstarling synchronized wmf-config/InitialiseSettings.php [06:36:57] Logged the message, Master [06:37:08] !log tstarling synchronized wmf-config/CommonSettings.php [06:37:13] back [06:37:16] Logged the message, Master [06:39:05] almost there [06:39:22] :) [06:40:33] PROBLEM - Lucene on search13 is CRITICAL: Connection timed out [06:41:11] search13 is lonely, wants apergos [06:41:14] New patchset: Tim Starling; "Added Score to extension-list" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34267 [06:41:30] Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34267 [06:41:56] probably [06:43:18] ok looks like I have to run scap [06:43:25] how long does that take? [06:43:33] RECOVERY - Lucene on search13 is OK: TCP OK - 0.011 second response time on port 8123 [06:43:36] yeah, my own fault, I know [06:47:09] localization update, I'm assuming? 
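
Picking up the earlier salt thread (grains written by puppet, commands fanned out over 0mq rather than ssh), here is a hedged sketch of how that targeting looks. The grain name and value ("cluster: appserver") are hypothetical, not an actual WMF convention from the log.

    # On each minion, written by puppet, so the host knows its own group:
    echo 'cluster: appserver' >> /etc/salt/grains

    # On the master: every matching minion gets the command in parallel,
    # and the call returns when minions answer or the timeout expires.
    salt -G 'cluster:appserver' cmd.run 'uptime' --timeout=30

The design point from the conversation is that this is pub/sub: the individual systems know which groups they are in, rather than a central file like /etc/dsh/group/apaches having to be kept correct.
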
[06:47:14] probably a while [06:48:26] OggHandler is disabled, that breaks it [06:52:17] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [06:52:17] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [06:53:27] oh rats [06:54:14] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [07:00:41] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [07:00:55] hmm [07:00:56] strange. [07:01:52] !log tstarling Started syncing Wikimedia installation... : [07:02:01] Logged the message, Master [07:02:20] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [07:06:24] apergos: any progress with ms7 cruft btw? [07:06:46] I'm going to take a look at the grub issue today [07:06:50] not yet [07:06:52] hopefully this should unblock you [07:06:56] can I follow along somehow? [07:07:08] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [07:07:42] hmm I don't like this [07:08:38] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [07:16:44] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [07:25:26] PROBLEM - swift-container-replicator on ms-be5 is CRITICAL: Connection refused by host [07:25:26] PROBLEM - swift-container-auditor on ms-be5 is CRITICAL: Connection refused by host [07:25:38] (that's me) [07:25:44] PROBLEM - swift-object-replicator on ms-be5 is CRITICAL: Connection refused by host [07:26:02] PROBLEM - swift-object-server on ms-be5 is CRITICAL: Connection refused by host [07:26:02] PROBLEM - swift-account-reaper on ms-be5 is CRITICAL: Connection refused by host [07:26:20] PROBLEM - swift-account-replicator on ms-be5 is CRITICAL: Connection refused by host [07:26:29] PROBLEM - swift-container-server on ms-be5 is CRITICAL: Connection refused by host [07:26:38] PROBLEM - swift-account-auditor on ms-be5 is CRITICAL: Connection refused by host [07:26:38] PROBLEM - swift-account-server on ms-be5 is CRITICAL: Connection refused by host [07:26:53] PROBLEM - swift-object-auditor on ms-be5 is CRITICAL: Connection refused by host [07:26:53] PROBLEM - swift-object-updater on ms-be5 is CRITICAL: Connection refused by host [07:28:46] huh [07:28:48] that's strange [07:28:52] 53G used and nothing uses them [07:29:54] very strange [07:32:58] !log rebooting ms-be5, strange df output for / [07:33:06] Logged the message, Master [07:35:17] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% [07:39:29] RECOVERY - swift-account-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [07:39:29] RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [07:39:29] RECOVERY - swift-object-auditor on ms-be5 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [07:39:38] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [07:39:56] RECOVERY - swift-object-replicator on ms-be5 is OK: PROCS OK: 1 
process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [07:40:06] RECOVERY - swift-account-reaper on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [07:40:14] RECOVERY - swift-account-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [07:40:15] RECOVERY - swift-object-server on ms-be5 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [07:40:15] RECOVERY - swift-container-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [07:40:15] RECOVERY - swift-container-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [07:40:23] RECOVERY - swift-object-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [07:40:23] RECOVERY - swift-account-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [07:40:56] huh [07:43:07] where are those 51G?!? [07:43:54] New patchset: Hashar; "zuul: learn the ability to set push_change_refs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34268 [07:45:21] * hashar gives 1GB and a cookie to paravoid [07:49:34] this is very strange [07:49:47] I see what you mean [07:53:01] commuting to coworking place … brb. [08:00:45] ha, very interesting [08:01:01] /proc/self/mountinfo is current. df returns what's in /etc/mtab which can be old (stale) (according to google) [08:01:49] you won't find anything there [08:01:53] it's something with the filesystem [08:02:00] maybe directory leak [08:06:27] fsck show anything? [08:08:21] no [08:08:32] awesome [08:08:33] but I'm going to retry [08:08:39] rebooting again [08:08:47] good luck [08:08:57] no point in both of us looking at it [08:09:16] oh, I'm already off [08:12:02] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% [08:15:23] !log tstarling Finished syncing Wikimedia installation... : [08:15:31] Logged the message, Master [08:15:38] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.57 ms [08:16:32] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [08:23:08] PROBLEM - Host ms-be5 is DOWN: PING CRITICAL - Packet loss = 100% [08:27:37] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours [08:34:13] PROBLEM - Puppet freshness on analytics1002 is CRITICAL: Puppet has not run in the last 10 hours [08:41:52] RECOVERY - Host ms-be5 is UP: PING OK - Packet loss = 0%, RTA = 0.36 ms [08:45:19] PROBLEM - swift-object-auditor on ms-be5 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:45:19] PROBLEM - swift-container-auditor on ms-be5 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:45:19] PROBLEM - swift-account-auditor on ms-be5 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
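
A footnote to the df-versus-mountinfo point above: df of this vintage trusts /etc/mtab, which can go stale, while /proc/self/mountinfo is the kernel's live per-process view. A quick way to compare the two, plus the du invocation used in the log; the loose ' / ' grep pattern is just for eyeballing the root filesystem's entry.

    grep ' / ' /proc/self/mountinfo   # what the kernel says is mounted on /
    grep ' / ' /etc/mtab              # what df believes
    df -h /
    du --one-file-system -hs /        # stay on the root filesystem, as above
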
[08:45:28] PROBLEM - SSH on ms-be5 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:46:04] PROBLEM - swift-account-reaper on ms-be5 is CRITICAL: Connection refused by host [08:46:04] PROBLEM - swift-account-replicator on ms-be5 is CRITICAL: Connection refused by host [08:46:04] PROBLEM - swift-object-replicator on ms-be5 is CRITICAL: Connection refused by host [08:46:04] PROBLEM - swift-object-server on ms-be5 is CRITICAL: Connection refused by host [08:46:04] PROBLEM - swift-container-server on ms-be5 is CRITICAL: Connection refused by host [08:46:22] PROBLEM - swift-account-server on ms-be5 is CRITICAL: Connection refused by host [08:46:31] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: Connection refused by host [08:46:49] PROBLEM - swift-container-replicator on ms-be5 is CRITICAL: Connection refused by host [08:47:07] PROBLEM - swift-object-updater on ms-be5 is CRITICAL: Connection refused by host [08:47:10] I'm getting tired of this. [08:48:01] RECOVERY - swift-account-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [08:48:10] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [08:48:28] RECOVERY - swift-container-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [08:48:29] RECOVERY - swift-account-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [08:48:37] RECOVERY - SSH on ms-be5 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [08:48:37] RECOVERY - swift-object-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [08:49:13] RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [08:49:13] RECOVERY - swift-account-reaper on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [08:49:13] RECOVERY - swift-object-server on ms-be5 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [08:49:13] RECOVERY - swift-container-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [08:49:13] RECOVERY - swift-account-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [08:49:14] RECOVERY - swift-object-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [08:49:22] RECOVERY - swift-object-auditor on ms-be5 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [08:52:30] what is "this"? 
[08:52:40] hey mark [08:52:44] New patchset: Faidon; "swift: set keep_cache_size to 5G" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34269 [08:52:55] ms-be5 [08:52:58] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [08:53:03] Filesystem Size Used Avail Use% Mounted on [08:53:05] /dev/md0 56G 53G 0 100% / [08:53:29] root@ms-be5:~# du --one-file-system -hs / [08:53:29] 1.9G / [08:53:40] persistent across reboots, so it's not unlinked files or anything [08:53:48] persists even after /forcefsck [08:55:15] it's kinda academic, since we're going to rebuild that box soonish [08:56:35] that's really bizarre [08:56:40] strange [08:57:27] very [08:58:07] just that box? [08:58:18] seems so [08:58:22] didn't check all of them [08:58:28] but a few of them that I did weren't like that [09:04:33] Change merged: Nikerabbit; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34027 [09:05:16] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [09:05:45] oh heh [09:06:00] mark: saw that swift commit above? [09:06:10] effectively I'm disabling the fadvise behavior completely [09:06:20] we can set a threshold to whatever we want to, but I set it at 5G [09:06:26] (max file size in swift atm) [09:09:17] New patchset: Nemo bis; "(bug 31859) Enable Narayam by default in or wiki projects" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34270 [09:10:31] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [09:10:41] ok [09:10:48] i still don't know what the point would be anyway [09:11:20] of what? fadvise? [09:11:26] why they have that? 
[09:11:36] to avoid cache pollution I think [09:11:43] New patchset: Nemo bis; "(bug 31859) Enable Narayam by default in or wiki projects" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34270 [09:11:55] but yeah, I don't think there's any point for our workload [09:12:12] Change merged: Nikerabbit; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34270 [09:18:05] !log nikerabbit synchronized wmf-config/InitialiseSettings.php 'i18n deploy' [09:18:12] Logged the message, Master [09:18:55] PROBLEM - swift-container-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater [09:22:48] !log upgrading Jenkins plugins [09:22:55] Logged the message, Master [09:23:00] okay, going to transfer 53g to fenari now [09:26:25] PROBLEM - swift-object-server on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [09:26:25] PROBLEM - swift-object-auditor on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [09:26:34] PROBLEM - swift-object-replicator on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [09:26:45] PROBLEM - swift-container-server on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [09:26:52] PROBLEM - swift-account-reaper on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [09:26:52] PROBLEM - swift-container-auditor on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [09:27:19] PROBLEM - swift-object-updater on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-updater [09:27:19] PROBLEM - swift-container-replicator on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [09:27:19] PROBLEM - swift-account-server on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [09:27:19] PROBLEM - swift-account-auditor on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [09:27:37] PROBLEM - swift-account-replicator on ms-be5 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [09:29:25] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34226 [09:36:29] mark: hi :) Timo also needs to be added to the jenkins user group on gallium. That is apparently done by editing /etc/group . RT is https://rt.wikimedia.org/Ticket/Display.html?id=3942 [09:36:40] yes [09:36:51] i was already on it :P [09:37:13] nice!! thanks a ton :-] [09:37:47] !log restarting Jenkins [09:37:53] Logged the message, Master [09:40:10] mark: and if you are in the mood for some review, I have a simple change pending https://gerrit.wikimedia.org/r/#/c/34268/ , it adds a parameter to a bunch of classes which is then used in a template expansion. 
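
For context on the fadvise discussion above: paravoid's patchset raises swift's keep_cache_size, the threshold at or below which the object server leaves a served file in the page cache instead of dropping it with posix_fadvise(DONTNEED). A sketch of how that could look in a stock object-server.conf of this swift era; the production files are generated by puppet, so the exact layout and section here are assumptions.

    [app:object-server]
    use = egg:swift#object
    # Files at or below this size stay in the page cache rather than being
    # dropped via fadvise; 5G (5368709120 bytes) covers the max file size
    # in swift at the time, effectively disabling the drop behavior.
    keep_cache_size = 5368709120
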
(for zuul, not in production yet) [09:41:14] RECOVERY - swift-account-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [09:41:24] ok [09:41:24] RECOVERY - swift-object-auditor on ms-be5 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [09:41:24] RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [09:41:24] RECOVERY - swift-account-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor [09:41:50] RECOVERY - swift-object-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator [09:41:50] RECOVERY - swift-container-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [09:41:50] RECOVERY - swift-account-reaper on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper [09:42:17] RECOVERY - swift-container-server on ms-be5 is OK: PROCS OK: 25 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [09:42:17] RECOVERY - swift-account-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator [09:42:17] RECOVERY - swift-object-server on ms-be5 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server [09:42:44] RECOVERY - swift-object-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater [09:46:33] mark: so, are we planning to get a single certificate with SANs *.m.wikipedia.org, *.m.wiktionary.org etc. for all the projects? [09:53:34] found it! [09:53:35] yay! [09:54:16] ms-be5 [09:54:20] so obvious [09:54:30] it's wild goose chase week [09:56:59] !log Upgrading Jenkins plugin xUnit which introduce a potential back compatibility issue renamed to . Will update configs files. [09:57:06] Logged the message, Master [10:10:38] i don't know if that's possible [10:10:42] but if it's not, it can't work [10:10:53] because of separate squid and varnish clusters :/ [10:11:37] yeah [10:11:47] that's what I said yesterday too [10:12:01] so, found the du/df issue [10:12:03] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34268 [10:12:20] /srv/swift-storage/sde1 on / had 50G of files there [10:12:25] but /dev/sde1 was mounted on top of it [10:12:26] doh! 
[10:13:20] RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater [10:16:06] mark: thanks for the merge :-] [10:17:48] hehe [10:20:15] ahahaha [10:20:34] I had looked at the one with nothing mounted on it (but of course it was empty) [10:34:20] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [10:34:20] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [10:34:20] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [10:49:33] New patchset: Mark Bergsma; "Retab" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34278 [10:59:52] New patchset: MF-Warburg; "abusefilter-log-detail right from sysop to autoconfirmed" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32681 [11:00:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34278 [11:07:23] New patchset: MaxSem; "Update tests, now with Wikivoyage too" [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/29895 [11:10:35] New patchset: Faidon; "Hardcode GeoIP netmask to 24 in Varnish's GeoIP" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34280 [11:10:41] mark: ^^^ [11:11:21] New patchset: MaxSem; "Update redirection rules for Wikivoyage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34281 [11:11:24] ... [11:11:41] what? [11:11:45] yes, I know C too you know ;-) [11:11:52] why do you do that while I'm working on it [11:11:56] New review: MaxSem; "Tests are at https://gerrit.wikimedia.org/r/29895" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/34281 [11:12:10] sorry, thought you said you were going to put traffic first [11:12:30] just trying to help, don't bite :) [11:14:53] Warning: opendir(/mnt/upload6/private/ExtensionDistributor/mw-snapshot/trunk/extensions) [function.opendir]: failed to open dir: [11:14:53] No such file or directory in /usr/local/apache/common-local/php-1.21wmf4/extensions/ExtensionDistributor/ExtensionDistributor_body.php on line 80 [11:14:54] Grrr [11:16:09] New review: Nemo bis; "It should be ok now, but I've not reviewed the existing configuration of all wikis so I can't +1 mys..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/32681 [11:16:59] Reedy: can you rebase https://gerrit.wikimedia.org/r/#/c/25737/ so that jenkins is run on it (or whatever it's needed for that)? [11:17:22] It needs rebasing onto master and the conflicts resolving [11:17:58] Reedy: meaning that your magical button is not enough? [11:18:04] Nope [11:18:38] aww [11:19:02] By the look of your changes, chances are it's trivial enough [11:19:14] Reedy: and who could I hope to get https://gerrit.wikimedia.org/r/#/c/33713/ reviewed from, sooner or later? [11:19:25] hmm ok so maybe I'll learn that and try [11:19:35] git review -d 25737 [11:19:41] git rebase origin [11:19:45] it'll tell you what to do from there [11:29:44] Reedy: CONFLICT (content): Merge conflict in wmf-config/InitialiseSettings.php etc. 
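
Reedy's answer follows, but for completeness, here is the usual continuation of the "git review -d / git rebase origin" recipe above once a conflict like this one is reported. The change number is from the conversation; everything else is the standard git-review rebase flow.

    git review -d 25737
    git rebase origin
    # CONFLICT (content): Merge conflict in wmf-config/InitialiseSettings.php
    $EDITOR wmf-config/InitialiseSettings.php   # resolve the <<<<<<< ... >>>>>>> hunks by hand
    git add wmf-config/InitialiseSettings.php
    git rebase --continue
    git review                                  # upload the rebased patchset
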
[11:29:50] Yup [11:29:52] Edit the file [11:29:56] It'll show you what the conflicts are [11:41:55] New patchset: Nemo bis; "(bug 29692) Per-wiki namespace aliases shouldn't override (remove) global ones" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/25737 [11:43:35] hmpf, a tab got lost [11:51:53] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [11:54:53] PROBLEM - Puppet freshness on nescio is CRITICAL: Puppet has not run in the last 10 hours [11:55:56] PROBLEM - Puppet freshness on srv296 is CRITICAL: Puppet has not run in the last 10 hours [11:55:56] PROBLEM - Puppet freshness on srv299 is CRITICAL: Puppet has not run in the last 10 hours [11:55:56] PROBLEM - Puppet freshness on srv298 is CRITICAL: Puppet has not run in the last 10 hours [11:56:59] PROBLEM - Puppet freshness on srv297 is CRITICAL: Puppet has not run in the last 10 hours [11:56:59] PROBLEM - Puppet freshness on srv300 is CRITICAL: Puppet has not run in the last 10 hours [12:33:36] Reedy, Warning: opendir(/mnt/upload6/private/ExtensionDistributor/mw-snapshot/trunk/extensions) No such file or directory in /usr/local/apache/common-local/php-1.21wmf4/extensions/ExtensionDistributor/ExtensionDistributor_body.php on line 80 [12:37:32] PROBLEM - Puppetmaster HTTPS on virt0 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:40:41] RECOVERY - Puppetmaster HTTPS on virt0 is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.039 seconds [13:12:59] MaxSem: has the extension distributor ever worked? [13:13:29] I heard it did at times over the last few years [13:13:40] :P [13:13:53] MaxSem: ooh such instances should be recorded [13:45:58] RECOVERY - Host storage3 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms [13:53:46] PROBLEM - Host storage3 is DOWN: PING CRITICAL - Packet loss = 100% [13:58:07] PROBLEM - Puppet freshness on srv301 is CRITICAL: Puppet has not run in the last 10 hours [14:02:05] New patchset: Dereckson; "(bug 42280) Enable WebFonts on fa.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34300 [14:11:10] RECOVERY - Host storage3 is UP: PING OK - Packet loss = 0%, RTA = 0.38 ms [14:12:18] New patchset: Hashar; "zuul: update url_pattern on labs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34303 [14:13:23] mark: another single click for you, changes the value of a variable ;-] https://gerrit.wikimedia.org/r/34303 [14:13:39] though you seem busy hacking some stack trace so feel free to skip [14:14:01] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34303 [14:14:49] \O/ [14:22:10] MaxSem: Yup, I know... [14:30:27] New patchset: Hashar; "zuul: fix url_pattern on labs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34305 [14:46:58] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [14:52:28] New review: Siebrand; "Don't merge yet. It will probably make sense to also deploy to the other Farsi projects. That's an o..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/34300 [15:02:18] New review: Dereckson; "Per last Siebrand comment." 
[15:53:33] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours
[16:07:48] RECOVERY - Puppet freshness on analytics1002 is OK: puppet ran at Tue Nov 20 16:07:20 UTC 2012
[16:14:01] !log depooling srv281-srv289 for upgrades to precise
[16:14:08] Logged the message, notpeter
[16:23:33] PROBLEM - Host srv282 is DOWN: PING CRITICAL - Packet loss = 100%
[16:23:51] PROBLEM - Host srv283 is DOWN: PING CRITICAL - Packet loss = 100%
[16:24:36] PROBLEM - Host srv285 is DOWN: PING CRITICAL - Packet loss = 100%
[16:24:36] PROBLEM - Host srv286 is DOWN: PING CRITICAL - Packet loss = 100%
[16:24:36] PROBLEM - Host srv287 is DOWN: PING CRITICAL - Packet loss = 100%
[16:24:36] PROBLEM - Host srv284 is DOWN: PING CRITICAL - Packet loss = 100%
[16:24:36] PROBLEM - Host srv296 is DOWN: PING CRITICAL - Packet loss = 100%
[16:24:37] PROBLEM - Host srv297 is DOWN: PING CRITICAL - Packet loss = 100%
[16:25:21] RECOVERY - Host srv281 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
[16:25:33] New patchset: Jgreen; "remove class 'base' from civicrm build, clean up misc/fundraising.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34327
[16:26:24] PROBLEM - Host srv289 is DOWN: PING CRITICAL - Packet loss = 100%
[16:26:24] PROBLEM - Host srv298 is DOWN: PING CRITICAL - Packet loss = 100%
[16:26:24] PROBLEM - Host srv288 is DOWN: PING CRITICAL - Packet loss = 100%
[16:27:27] PROBLEM - Host srv299 is DOWN: PING CRITICAL - Packet loss = 100%
[16:28:12] RECOVERY - NTP on analytics1002 is OK: NTP OK: Offset -0.03115904331 secs
[16:28:21] PROBLEM - Host srv300 is DOWN: PING CRITICAL - Packet loss = 100%
[16:28:57] PROBLEM - Host srv301 is DOWN: PING CRITICAL - Packet loss = 100%
[16:29:06] PROBLEM - Apache HTTP on srv281 is CRITICAL: Connection refused
[16:29:15] RECOVERY - Host srv282 is UP: PING OK - Packet loss = 0%, RTA = 1.16 ms
[16:29:33] RECOVERY - Host srv283 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms
[16:29:51] PROBLEM - SSH on srv281 is CRITICAL: Connection refused
[16:30:10] New patchset: Jgreen; "remove class 'base' from civicrm build, clean up misc/fundraising.pp (typo)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34327
[16:30:18] RECOVERY - Host srv285 is UP: PING OK - Packet loss = 0%, RTA = 0.71 ms
[16:30:18] RECOVERY - Host srv287 is UP: PING OK - Packet loss = 0%, RTA = 0.31 ms
[16:30:18] RECOVERY - Host srv286 is UP: PING OK - Packet loss = 0%, RTA = 0.52 ms
[16:30:18] RECOVERY - Host srv296 is UP: PING OK - Packet loss = 0%, RTA = 0.42 ms
[16:30:18] RECOVERY - Host srv297 is UP: PING OK - Packet loss = 0%, RTA = 0.48 ms
[16:30:27] PROBLEM - Host cp3019 is DOWN: PING CRITICAL - Packet loss = 100%
[16:31:39] RECOVERY - Host cp3019 is UP: PING OK - Packet loss = 0%, RTA = 118.14 ms
[16:32:06] RECOVERY - Host srv289 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms
[16:32:06] RECOVERY - Host srv288 is UP: PING OK - Packet loss = 0%, RTA = 1.52 ms
[16:32:06] RECOVERY - Host srv298 is UP: PING OK - Packet loss = 0%, RTA = 1.03 ms
[16:32:33] PROBLEM - Apache HTTP on srv282 is CRITICAL: Connection refused
[16:32:42] PROBLEM - SSH on srv282 is CRITICAL: Connection refused
[16:33:09] RECOVERY - Host srv299 is UP: PING OK - Packet loss = 0%, RTA = 0.27 ms
[16:33:36] PROBLEM - Apache HTTP on srv287 is CRITICAL: Connection refused
[16:33:45] PROBLEM - Apache HTTP on srv283 is CRITICAL: Connection refused
[16:33:45] PROBLEM - Apache HTTP on srv285 is CRITICAL: Connection refused
[16:33:45] PROBLEM - SSH on srv287 is CRITICAL: Connection refused
[16:33:46] PROBLEM - SSH on srv286 is CRITICAL: Connection refused
[16:33:54] PROBLEM - SSH on srv283 is CRITICAL: Connection refused
[16:33:54] PROBLEM - SSH on srv285 is CRITICAL: Connection refused
[16:34:03] PROBLEM - Memcached on srv287 is CRITICAL: Connection refused
[16:34:03] PROBLEM - Memcached on srv282 is CRITICAL: Connection refused
[16:34:03] PROBLEM - Memcached on srv283 is CRITICAL: Connection refused
[16:34:03] RECOVERY - Host srv300 is UP: PING OK - Packet loss = 0%, RTA = 0.38 ms
[16:34:12] PROBLEM - Apache HTTP on srv286 is CRITICAL: Connection refused
[16:34:12] PROBLEM - Apache HTTP on srv297 is CRITICAL: Connection refused
[16:34:21] PROBLEM - SSH on srv296 is CRITICAL: Connection refused
[16:34:30] PROBLEM - Apache HTTP on srv296 is CRITICAL: Connection refused
[16:34:30] PROBLEM - Memcached on srv285 is CRITICAL: Connection refused
[16:34:30] PROBLEM - Memcached on srv286 is CRITICAL: Connection refused
[16:34:39] RECOVERY - Host srv301 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
[16:34:48] PROBLEM - SSH on srv297 is CRITICAL: Connection refused
[16:34:56] New patchset: Jgreen; "remove class 'base' from civicrm build, clean up misc/fundraising.pp (typo^2)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34327
[16:35:33] PROBLEM - Apache HTTP on srv298 is CRITICAL: Connection refused
[16:35:34] PROBLEM - SSH on srv289 is CRITICAL: Connection refused
[16:35:34] PROBLEM - Memcached on srv288 is CRITICAL: Connection refused
[16:35:51] PROBLEM - Apache HTTP on srv288 is CRITICAL: Connection refused
[16:36:00] PROBLEM - Apache HTTP on srv289 is CRITICAL: Connection refused
[16:36:18] PROBLEM - SSH on srv288 is CRITICAL: Connection refused
[16:36:27] PROBLEM - Memcached on srv289 is CRITICAL: Connection refused
[16:36:36] PROBLEM - Apache HTTP on srv299 is CRITICAL: Connection refused
[16:36:54] PROBLEM - SSH on srv298 is CRITICAL: Connection refused
[16:36:57] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34327
[16:37:12] RECOVERY - SSH on srv283 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:37:21] PROBLEM - SSH on srv300 is CRITICAL: Connection refused
[16:37:21] PROBLEM - SSH on srv299 is CRITICAL: Connection refused
[16:37:30] RECOVERY - SSH on srv282 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:37:57] RECOVERY - SSH on srv281 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:38:06] PROBLEM - Apache HTTP on srv300 is CRITICAL: Connection refused
[16:38:24] PROBLEM - SSH on srv301 is CRITICAL: Connection refused
[16:38:42] RECOVERY - SSH on srv286 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:38:43] RECOVERY - SSH on srv285 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:39:09] PROBLEM - Apache HTTP on srv301 is CRITICAL: Connection refused
[16:39:36] RECOVERY - SSH on srv288 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:40:12] RECOVERY - SSH on srv287 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:40:21] RECOVERY - SSH on srv289 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:41:24] RECOVERY - SSH on srv297 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:42:27] RECOVERY - SSH on srv296 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:43:21] RECOVERY - SSH on srv298 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:43:48] RECOVERY - SSH on srv299 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:43:48] RECOVERY - SSH on srv300 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:44:51] RECOVERY - SSH on srv301 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[16:45:28] New patchset: Pyoungmeister; "setting last of srv servers to use applicationserver role classes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34328
[16:48:34] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34328
[16:49:39] PROBLEM - NTP on srv281 is CRITICAL: NTP CRITICAL: No response from NTP server
[16:51:48] New patchset: Jgreen; "grr. File attribute defaults scope is unpredictable" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34332
[16:53:06] PROBLEM - NTP on srv282 is CRITICAL: NTP CRITICAL: No response from NTP server
[16:53:33] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours
[16:53:33] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[16:53:33] PROBLEM - Puppet freshness on srv281 is CRITICAL: Puppet has not run in the last 10 hours
[16:53:33] PROBLEM - NTP on srv285 is CRITICAL: NTP CRITICAL: No response from NTP server
[16:53:50] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34332
[16:54:45] PROBLEM - NTP on srv286 is CRITICAL: NTP CRITICAL: No response from NTP server
[16:55:03] PROBLEM - NTP on srv287 is CRITICAL: NTP CRITICAL: No response from NTP server
[16:58:21] PROBLEM - NTP on srv301 is CRITICAL: NTP CRITICAL: No response from NTP server
[17:01:39] PROBLEM - NTP on srv283 is CRITICAL: NTP CRITICAL: No response from NTP server
[17:03:19] PROBLEM - NTP on srv288 is CRITICAL: NTP CRITICAL: No response from NTP server
[17:04:21] PROBLEM - NTP on srv289 is CRITICAL: NTP CRITICAL: No response from NTP server
[17:08:51] RECOVERY - Puppet freshness on srv281 is OK: puppet ran at Tue Nov 20 17:08:34 UTC 2012
[17:08:51] RECOVERY - Puppet freshness on srv296 is OK: puppet ran at Tue Nov 20 17:08:35 UTC 2012
[17:09:18] RECOVERY - Apache HTTP on srv281 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.003 seconds
[17:10:03] RECOVERY - Apache HTTP on srv296 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.013 seconds
[17:22:22] RECOVERY - NTP on srv281 is OK: NTP OK: Offset -0.006912469864 secs
[17:23:51] RECOVERY - Apache HTTP on srv282 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.001 seconds
[17:24:18] RECOVERY - Puppet freshness on srv297 is OK: puppet ran at Tue Nov 20 17:24:12 UTC 2012
[17:25:57] RECOVERY - Apache HTTP on srv297 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.006 seconds
[17:36:27] RECOVERY - NTP on srv282 is OK: NTP OK: Offset -0.0072286129 secs
[17:37:30] RECOVERY - Apache HTTP on srv283 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.003 seconds
[17:38:58] New review: Hashar; "Stepping out of this change. Reedy I think you want to abandon that change." [operations/mediawiki-config] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/27830
[17:39:36] RECOVERY - Puppet freshness on srv298 is OK: puppet ran at Tue Nov 20 17:39:19 UTC 2012
[17:39:38] Change abandoned: Reedy; "THINK OF ALL MY HARD WORK TO DO THIS" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/27830
[17:40:03] RECOVERY - NTP on srv296 is OK: NTP OK: Offset -0.03867936134 secs
[17:41:15] RECOVERY - Apache HTTP on srv298 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.004 seconds
[17:50:15] RECOVERY - Apache HTTP on srv285 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.002 seconds
[17:54:00] RECOVERY - NTP on srv297 is OK: NTP OK: Offset -0.03966116905 secs
[17:55:21] RECOVERY - Puppet freshness on srv299 is OK: puppet ran at Tue Nov 20 17:55:09 UTC 2012
[17:56:24] RECOVERY - Apache HTTP on srv299 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.006 seconds
[18:01:58] New review: Dereckson; "Per bug 42280 comment 3, this change only concerns fa.wikipedia and not other Persian projects, whic..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/34300
[18:02:33] RECOVERY - NTP on srv285 is OK: NTP OK: Offset 0.09365105629 secs
[18:03:27] RECOVERY - Apache HTTP on srv286 is OK: HTTP OK HTTP/1.1 200 OK - 456 bytes in 0.012 seconds
[18:05:42] RECOVERY - NTP on srv283 is OK: NTP OK: Offset -0.03600215912 secs
[18:09:54] RECOVERY - NTP on srv298 is OK: NTP OK: Offset -0.04400169849 secs
[18:11:06] RECOVERY - Puppet freshness on srv300 is OK: puppet ran at Tue Nov 20 18:10:58 UTC 2012
[18:11:42] RECOVERY - Apache HTTP on srv300 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.006 seconds
[18:17:15] RECOVERY - NTP on srv286 is OK: NTP OK: Offset 0.06504952908 secs
[18:17:51] RECOVERY - Apache HTTP on srv287 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.004 seconds
[18:22:48] PROBLEM - Puppet freshness on storage3 is CRITICAL: Puppet has not run in the last 10 hours
[18:26:15] RECOVERY - NTP on srv300 is OK: NTP OK: Offset 0.04560732841 secs
[18:26:24] RECOVERY - NTP on srv299 is OK: NTP OK: Offset -0.0379383564 secs
[18:27:09] RECOVERY - Puppet freshness on srv301 is OK: puppet ran at Tue Nov 20 18:27:02 UTC 2012
[18:28:03] RECOVERY - Apache HTTP on srv301 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.005 seconds
[18:28:48] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[18:30:00] RECOVERY - NTP on srv287 is OK: NTP OK: Offset -0.1149597168 secs
[18:32:15] RECOVERY - Apache HTTP on srv288 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.003 seconds
[18:49:12] RECOVERY - Apache HTTP on srv289 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.003 seconds
[18:58:03] RECOVERY - NTP on srv301 is OK: NTP OK: Offset -0.04851830006 secs
[18:58:38] New patchset: Jgreen; "logrotate config for aluminium/grosley" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34343
[19:00:18] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34343
[19:01:30] RECOVERY - NTP on srv288 is OK: NTP OK: Offset -0.05046093464 secs
[19:04:25] New patchset: Jgreen; "fix owner/privs for fundraising logrotate conf file" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34344
[19:04:46] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34344
[19:06:09] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[19:16:30] RECOVERY - NTP on srv289 is OK: NTP OK: Offset -0.03718757629 secs
[19:23:22] New patchset: Dzahn; "move misc::graphite out of misc-servers into misc/graphite.pp (RT-720)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34350
[19:26:28] New review: Dzahn; "how about misc::noc-wikimedia? it sets up the apache site for graphite on noc, but is specific. so i..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/34350
[19:29:07] New review: Dzahn; "and how about misc::noc-wikimedia? it sets up the apache site for graphite on noc, but is specific. ..." [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/34350
[19:29:08] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34350
[19:31:21] RECOVERY - NTP on srv205 is OK: NTP OK: Offset -0.04544866085 secs
[19:32:25] New patchset: Ori.livneh; "Enable event logging for mobile beta" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32864
[19:38:25] New patchset: Dzahn; "move misc::nfs-server::home / misc::nfs-server::home:rsync from misc-servers to nfs.pp (RT-720)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34353
[19:41:52] New review: Dzahn; "this is also to get stuff moved out of misc-servers." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/34353
[19:42:45] !log scribbled over /dev/sda partition table on ms-be7 so I could watch the install break, left in shell form installer
[19:42:52] Logged the message, Master
[19:43:08] !change 34353 | apergos
[19:43:08] apergos: https://gerrit.wikimedia.org/r/#q,34353,n,z
[19:43:39] !change 33511 | apergos
[19:43:39] apergos: https://gerrit.wikimedia.org/r/#q,33511,n,z
[19:44:09] (it's half just fyi, and half a question where to move stuff)
[19:45:24] looks fine to me (download.pp)
[19:46:12] misc::nfs-server::home::rsyncd -> nfs.pp
[19:46:29] don't have much to say about that one
[19:46:30] but class misc::images::rsyncd -> ? rsync.pp ?
[19:46:35] ok
[19:47:04] as usual mark may have other thoughts about how to organize this stuff
[19:47:33] so I think it would be good to collect the rsync related stuff somehow. but we also have an rsync module now that I'm not totally in love with
[19:47:42] yea, but there seems to be consensus for moving things out of misc-servers.pp one way or another
[19:48:12] cool
[19:48:15] and its likely that some are not used or should be used in role classes etc etc
[19:48:29] so i just consider this the first step.. move existing stuff out of one file..then see
[19:48:42] makes sense to me
[19:48:59] and if you move em out then someone else can always decide they want to move some bit to a different location later
[19:49:39] exactly, at least it makes it visible and easier to see, unlike searching spaghetti misc-servers.pp
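A minimal sketch of the recon step for this kind of refactor, assuming a local operations/puppet checkout; the class name is the one from the change above, and the commands are illustrative rather than the documented procedure:

    # find every manifest that still references the class, so includes don't break on the move
    git grep -l 'misc::graphite' manifests/
    # gauge how much is still tangled up in the monolithic file
    grep -c '^class ' manifests/misc-servers.pp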
[19:50:12] heh heh
[19:50:53] ok I have my copy of the install syslog for tomorrow morning's reading
[19:51:16] having stared at all the relevant scripts and pretty sure of the few code paths that it can be
[19:51:20] (for ms-be7)
[19:51:37] this will just let me verify it and think about how we can work around it without it majorly sucking
[19:53:33] RECOVERY - NTP on srv265 is OK: NTP OK: Offset -0.04022908211 secs
[19:53:51] RECOVERY - NTP on srv213 is OK: NTP OK: Offset -0.03467869759 secs
[19:54:36] RECOVERY - NTP on srv277 is OK: NTP OK: Offset -0.05993068218 secs
[19:57:01] apergos: it's gonna be a real PITA
[19:58:07] we'll see
[19:58:42] you can just leave it at the shell for now, I'm going to check it out tomorrow now that I know that is going on underneath the hood
[20:01:20] okay
[20:01:20] New patchset: Ryan Lane; "Add salt to production" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34356
[20:02:43] afk for the night (it's already late)
[20:02:47] have a good rest of the day
[20:03:26] New patchset: Kaldari; "Allowing Commons sysops to use flickr uploading via UploadWizard" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34357
[20:03:54] PROBLEM - Host storage3 is DOWN: PING CRITICAL - Packet loss = 100%
[20:04:38] New patchset: Kaldari; "Allowing Commons sysops to use flickr uploading via UploadWizard" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34357
[20:05:43] Change merged: Kaldari; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34357
[20:08:02] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34356
[20:09:18] !log kaldari synchronized wmf-config/InitialiseSettings.php 'turning on flickr uploading for sysops on Commons'
[20:09:24] Logged the message, Master
[20:19:14] New patchset: Ryan Lane; "Add master_finger configuration option" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34359
[20:19:14] New patchset: Ryan Lane; "Add master_finger fingerprint for salt masters" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34360
[20:21:07] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34359
[20:21:14] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34360
[20:21:36] PROBLEM - Apache HTTP on srv289 is CRITICAL: HTTP CRITICAL: HTTP/1.0 500 Internal Server Error
[20:23:32] !log running change_tag index migrations on the 757 wikis with <7k ct rows
[20:23:39] Logged the message, Master
[20:23:44] -_-
[20:23:56] well, *that* puppet change didn't work :(
[20:26:37] oh puppet. I hate you so
[20:26:44] we've got problem on srv289: require(/usr/local/apache/common-local/php-1.21wmf4/includes/WebStart.php) [function.require]: failed to open stream: Permission denied in /usr/local/apache/common-local/php-1.21wmf4/index.php on line 55
[20:26:52] notpeter: ^^^^
[20:26:57] yes
[20:27:03] that specific one is mildly fucked
[20:27:08] woo
[20:27:14] but it's alos depooled
[20:27:18] *also
[20:27:26] depooling fail?
[20:27:52] 127 errors among latest 1000 in apache.log
[20:28:09] prob not a depooling fail then
[20:28:46] maxsem@fenari:~$ tail /home/wikipedia/syslog/apache.log
[20:29:09] Nov 20 20:27:58 10.0.8.39 apache2[2878]: PHP Warning: require(/usr/local/apache/common-local/php-1.21wmf4/includes/WebStart.php) blah blah
[20:29:29] yeah, thanks MaxSem
[20:29:37] depooled servers are still monitored by pybal and nagios
[20:30:16] 155/1000
[20:30:24] and soon more!
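The error-rate spot check being quoted ("127 among latest 1000", then "155/1000") is easy to reproduce as a one-liner; the log path appears verbatim above, while the grep pattern is an assumption about what was being counted:

    tail -n 1000 /home/wikipedia/syslog/apache.log | grep -c 'failed to open stream: Permission denied'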
[20:31:39] aaaa
[20:32:09] binasher: shit, what's happening?
[20:33:47] i depooled it so hard, it took ALL the traffic
[20:33:59] 1/0
[20:34:56] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[20:34:56] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[20:34:56] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[20:36:35] RECOVERY - Apache HTTP on srv289 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.060 second response time
[20:37:56] !log switching payments back to eqiad
[20:38:03] Logged the message, Master
[20:38:38] Is Chris Steipp in the channel?
[20:38:50] PROBLEM - mysqld processes on es1004 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:38:51] PROBLEM - MySQL Slave Running on es1004 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:38:56] csteipp, ^^^
[20:39:08] PROBLEM - MySQL disk space on es1004 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:39:29] guess i should have looked lol csteipp - what username do you want for OTRS?
[20:39:35] PROBLEM - MySQL Recent Restart on es1004 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:39:51] RD: FYI, seems like csteipp is out for lunch.
[20:39:56] OK
[20:40:01] Thanks
[20:44:32] PROBLEM - Host es1004 is DOWN: PING CRITICAL - Packet loss = 100%
[20:45:10] New patchset: Demon; "Remove "resolves RT" feature that nobody uses" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34362
[20:47:30] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34362
[20:59:47] !log depooling mw62-mw65 for upgrade to precise
[20:59:53] Logged the message, notpeter
[21:01:36] okay, mobile window
[21:01:47] New patchset: Pyoungmeister; "more apache upgrades" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34418
[21:02:51] Change merged: MaxSem; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32864
[21:02:52] !log depooling mw17-mw19 and mw50-mw54 for upgrade to precise
[21:02:58] Logged the message, notpeter
[21:12:08] PROBLEM - Host mw18 is DOWN: PING CRITICAL - Packet loss = 100%
[21:12:08] PROBLEM - Host mw17 is DOWN: PING CRITICAL - Packet loss = 100%
[21:12:26] PROBLEM - Host mw63 is DOWN: PING CRITICAL - Packet loss = 100%
[21:12:35] PROBLEM - Host mw64 is DOWN: PING CRITICAL - Packet loss = 100%
[21:13:56] PROBLEM - Host mw19 is DOWN: PING CRITICAL - Packet loss = 100%
[21:15:26] PROBLEM - SSH on mw62 is CRITICAL: Connection refused
[21:15:35] PROBLEM - Host mw52 is DOWN: PING CRITICAL - Packet loss = 100%
[21:15:53] PROBLEM - Host mw53 is DOWN: PING CRITICAL - Packet loss = 100%
[21:16:29] PROBLEM - Apache HTTP on mw62 is CRITICAL: Connection refused
[21:17:50] RECOVERY - Host mw18 is UP: PING OK - Packet loss = 0%, RTA = 0.35 ms
[21:17:51] RECOVERY - Host mw17 is UP: PING OK - Packet loss = 0%, RTA = 2.50 ms
[21:18:08] RECOVERY - Host mw63 is UP: PING OK - Packet loss = 0%, RTA = 0.48 ms
[21:18:17] RECOVERY - Host mw64 is UP: PING OK - Packet loss = 0%, RTA = 0.61 ms
[21:18:35] PROBLEM - Apache HTTP on mw65 is CRITICAL: Connection refused
[21:18:44] PROBLEM - Apache HTTP on mw51 is CRITICAL: Connection refused
[21:18:53] PROBLEM - SSH on mw65 is CRITICAL: Connection refused
[21:19:20] PROBLEM - SSH on mw51 is CRITICAL: Connection refused
[21:19:38] PROBLEM - Memcached on mw51 is CRITICAL: Connection refused
[21:19:38] RECOVERY - Host mw19 is UP: PING OK - Packet loss = 0%, RTA = 0.56 ms
[21:19:56] PROBLEM - Apache HTTP on mw54 is CRITICAL: Connection refused
[21:20:14] PROBLEM - Memcached on mw54 is CRITICAL: Connection refused
[21:20:32] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34418
[21:21:08] PROBLEM - SSH on mw54 is CRITICAL: Connection refused
[21:21:26] RECOVERY - Host mw52 is UP: PING OK - Packet loss = 0%, RTA = 0.32 ms
[21:21:44] PROBLEM - Apache HTTP on mw17 is CRITICAL: Connection refused
[21:21:44] PROBLEM - SSH on mw63 is CRITICAL: Connection refused
[21:21:44] RECOVERY - Host mw53 is UP: PING OK - Packet loss = 0%, RTA = 0.97 ms
[21:22:11] PROBLEM - Apache HTTP on mw64 is CRITICAL: Connection refused
[21:22:20] PROBLEM - Apache HTTP on mw63 is CRITICAL: Connection refused
[21:22:20] PROBLEM - Memcached on mw17 is CRITICAL: Connection refused
[21:22:29] PROBLEM - SSH on mw17 is CRITICAL: Connection refused
[21:22:29] PROBLEM - Memcached on mw18 is CRITICAL: Connection refused
[21:22:38] PROBLEM - SSH on mw18 is CRITICAL: Connection refused
[21:22:39] PROBLEM - Apache HTTP on mw18 is CRITICAL: Connection refused
[21:22:39] PROBLEM - SSH on mw64 is CRITICAL: Connection refused
[21:22:54] RD: I'm back
[21:23:41] PROBLEM - Memcached on mw19 is CRITICAL: Connection refused
[21:23:41] PROBLEM - Apache HTTP on mw19 is CRITICAL: Connection refused
[21:23:59] PROBLEM - SSH on mw19 is CRITICAL: Connection refused
[21:24:55] csteipp: Hi - was going to create your account, just need a username
[21:25:10] RD: Oh great! csteipp
[21:25:20] PROBLEM - Memcached on mw53 is CRITICAL: Connection refused
[21:25:38] PROBLEM - Memcached on mw52 is CRITICAL: Connection refused
[21:25:56] PROBLEM - SSH on mw52 is CRITICAL: Connection refused
[21:26:05] PROBLEM - SSH on mw53 is CRITICAL: Connection refused
[21:26:14] PROBLEM - Apache HTTP on mw52 is CRITICAL: Connection refused
[21:26:32] PROBLEM - Apache HTTP on mw53 is CRITICAL: Connection refused
[21:26:41] RECOVERY - SSH on mw63 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:26:41] RECOVERY - SSH on mw62 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:26:59] RECOVERY - SSH on mw65 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:27:10] Can some gerrit op abandon this: https://gerrit.wikimedia.org/r/#/c/28296/
[21:27:17] RECOVERY - SSH on mw19 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:27:17] RECOVERY - SSH on mw17 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:27:23] and this https://gerrit.wikimedia.org/r/#/c/28944/
[21:27:26] RECOVERY - SSH on mw18 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:27:35] RECOVERY - SSH on mw64 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:27:36] these 2 are broken for a long time, and are not in master, incorrect submission to branch.
[21:28:54] Krinkle: done
[21:28:56] RECOVERY - SSH on mw51 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:29:14] RECOVERY - SSH on mw52 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:29:23] RECOVERY - SSH on mw53 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:30:44] RECOVERY - SSH on mw54 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[21:34:29] RECOVERY - Apache HTTP on mw62 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.005 seconds
[21:34:38] RECOVERY - Apache HTTP on mw17 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.004 seconds
[21:34:47] RECOVERY - Apache HTTP on mw51 is OK: HTTP OK HTTP/1.1 200 OK - 453 bytes in 0.003 seconds
[21:36:36] All done csteipp
[21:36:39] You have a couple emails
[21:36:45] RD: Thank you!
[21:36:53] PROBLEM - NTP on mw62 is CRITICAL: NTP CRITICAL: Offset unknown
[21:38:26] No prob
[21:38:42] PROBLEM - NTP on mw65 is CRITICAL: NTP CRITICAL: No response from NTP server
[21:39:00] PROBLEM - NTP on mw51 is CRITICAL: NTP CRITICAL: Offset unknown
[21:41:44] PROBLEM - NTP on mw17 is CRITICAL: NTP CRITICAL: Offset unknown
[21:41:44] PROBLEM - NTP on mw18 is CRITICAL: NTP CRITICAL: No response from NTP server
[21:43:23] PROBLEM - NTP on mw64 is CRITICAL: NTP CRITICAL: No response from NTP server
[21:46:32] RECOVERY - NTP on mw51 is OK: NTP OK: Offset 0.04408299923 secs
[21:46:41] PROBLEM - NTP on mw53 is CRITICAL: NTP CRITICAL: No response from NTP server
[21:47:35] RECOVERY - NTP on mw62 is OK: NTP OK: Offset 0.07466650009 secs
[21:48:11] RECOVERY - NTP on mw17 is OK: NTP OK: Offset -0.005201101303 secs
[21:48:11] RECOVERY - Apache HTTP on mw63 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.008 seconds
[21:48:56] RECOVERY - Apache HTTP on mw52 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.002 seconds
[21:49:23] RECOVERY - Apache HTTP on mw18 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 2.996 seconds
[21:50:26] PROBLEM - NTP on mw63 is CRITICAL: NTP CRITICAL: Offset unknown
[21:51:29] PROBLEM - NTP on mw19 is CRITICAL: NTP CRITICAL: No response from NTP server
[21:53:08] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[21:53:08] PROBLEM - NTP on mw52 is CRITICAL: NTP CRITICAL: Offset unknown
[21:54:47] PROBLEM - NTP on mw54 is CRITICAL: NTP CRITICAL: No response from NTP server
[21:56:08] PROBLEM - Puppet freshness on nescio is CRITICAL: Puppet has not run in the last 10 hours
[21:59:54] RECOVERY - NTP on mw18 is OK: NTP OK: Offset -0.0756663084 secs
[22:00:03] RECOVERY - NTP on mw63 is OK: NTP OK: Offset 0.03009176254 secs
[22:00:47] RECOVERY - Apache HTTP on mw19 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.012 seconds
[22:01:27] New patchset: Ottomata; "Removing apache analytics proxy configs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34430
[22:01:59] RECOVERY - Apache HTTP on mw53 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.001 seconds
[22:02:17] RECOVERY - Apache HTTP on mw64 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.005 seconds
[22:03:58] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34430
[22:12:47] RECOVERY - NTP on mw19 is OK: NTP OK: Offset -0.05096757412 secs
[22:14:26] RECOVERY - NTP on mw53 is OK: NTP OK: Offset 0.02249991894 secs
[22:15:56] RECOVERY - NTP on mw52 is OK: NTP OK: Offset -0.01121246815 secs
[22:22:48] Can someone please fix the permissions on the 1.21wmf4 git objects directory please?
[22:22:53] chmod -R g+w /home/wikipedia/common/php-1.21wmf4/.git/objects
[22:23:12] maybe RoanKattouw
[22:23:22] I hear that guy likes to fix git permissions
[22:24:37] * RoanKattouw fixes
[22:25:10] Thanks
[22:25:36] Done
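Beyond the one-off chmod, git has a knob for keeping a shared checkout group-writable so new object files are created with the right mode in the first place. A hypothetical preventive step under that assumption, not what was actually run here:

    cd /home/wikipedia/common/php-1.21wmf4
    git config core.sharedRepository group   # future objects/packs get group write automatically
    chmod -R g+w .git/objects                # repair what already exists (as done above)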
[22:28:59] RECOVERY - NTP on mw64 is OK: NTP OK: Offset -0.009249091148 secs
[22:32:42] New patchset: Asher; "now only 30 wikis that need wgOldChangeTagsIndex = true, changing default to false [bug 40867]" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34435
[22:33:32] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34435
[22:34:14] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.014 seconds
[22:34:23] RECOVERY - Apache HTTP on mw65 is OK: HTTP OK HTTP/1.1 200 OK - 454 bytes in 0.007 seconds
[22:35:00] !log asher synchronized wmf-config/InitialiseSettings.php 'now only 30 wikis that need wgOldChangeTagsIndex = true, changing default to false'
[22:35:06] Logged the message, Master
[22:41:21] New patchset: Kaldari; "Turning off Flickr uploading on Commons until author bug is fixed." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34437
[22:41:40] Change merged: Kaldari; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/34437
[22:44:03] !log maxsem Started syncing Wikimedia installation... : Weekly mobile deployment
[22:44:09] Logged the message, Master
[22:45:28] !log kaldari synchronized wmf-config/InitialiseSettings.php 'disabling experimental flickr uploading on commons until some bugs are fixed'
[22:45:34] Logged the message, Master
[22:56:47] !log reedy synchronized php-1.21wmf3/
[22:56:53] Logged the message, Master
[23:00:19] !log reedy synchronized php-1.21wmf4/
[23:00:27] Logged the message, Master
[23:01:59] RECOVERY - NTP on mw65 is OK: NTP OK: Offset -0.01026904583 secs
[23:01:59] RECOVERY - NTP on mw54 is OK: NTP OK: Offset -0.01452946663 secs
[23:03:54] notpeter: Do the apaches you're reinstalling have /mnt/upload6 mounted post install?
[23:05:03] Reedy, what are you pushing?
[23:05:16] Reedy: doesn't look like it
[23:05:17] why?
[23:05:25] nas1-a.pmtpa.wmnet:/vol/thumbs on /mnt/thumbs2 type nfs (rw,bg,intr)
[23:05:28] nas1-a.pmtpa.wmnet:/vol/originals on /mnt/upload7 type nfs (rw,bg,intr)
[23:05:50] 17 Warning: opendir(/mnt/upload6/private/ExtensionDistributor/mw-snapshot/trunk/extensions) [function.opendir]: failed to open dir:
[23:05:51] No such file or directory in /usr/local/apache/common-local/php-1.21wmf4/extensions/ExtensionDistributor/ExtensionDistributor_body.php on line 80
[23:06:04] uuuuuuhhhhhh
[23:06:54] * Reedy hides from paravoid
[23:07:44] Reedy: what box is that?
[23:08:12] numerous
[23:08:12] i don't think anything should have upload6 mounted still
[23:08:27] yeah, I habeebed that that was deprecated at this point
[23:08:29] though i see it's still referenced in CommonSettings :/
[23:08:30] upload6 is still used by Extdist...
[23:08:39] oh good, paravoid is here
[23:08:42] binasher: I changed that over again
[23:08:46] as upload.wm.o isn't using nas
[23:08:56] we have no web server in front of the netapp
[23:09:00] so the files it was adding to NFS weren't available to grab via http
[23:09:19] and extdist wants appservers to write stuff to be immediately served by upload.wm.org
[23:09:31] and hasn't been ported to mw-store
[23:09:35] yay for legacy
[23:09:59] paravoid: did you redo nfs.pp?
[23:10:19] I broke it yesterday
[23:10:20] fixing
[23:10:27] 3048774239bb9e59d5cfb48e79aefdfebc31ee95
[23:10:44] I slipped an absent by mistake
[23:11:00] notpeter: nfs::upload looks like it would do the thing that needs doing
[23:11:15] yep
[23:11:19] I removed /mnt/thumbs
[23:11:47] and by accident unmounted upload6
[23:11:51] that was today
[23:12:07] ah, gotcha
[23:12:42] binasher: that's what I thought. and apparently I thought right!
[23:13:37] New patchset: Faidon; "Re-add upload6" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34443
[23:13:54] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34443
[23:14:05] Reedy: apologies
[23:14:19] heh, don't worry about it
[23:14:23] thanks for fixing it
[23:14:26] no one has complained yet
[23:14:33] paravoid: yep! thank you :)
[23:15:01] I've asked apergos to look at that remaining ms7 cruft
[23:15:13] update the wiki page, then we can file a few bugzilla items
[23:15:35] like "port ext-dist to mw-store or getridofitpleaseprettyplease"
[23:16:14] I wonder how much work it would be..
[23:16:25] I think captcha is also like that
[23:16:44] haha, no one complains when captcha is broken ;)
[23:16:45] yeah...
[23:17:37] upload.wm.org means to me "user uploaded content" not "random file storage place where pieces of our infrastructure use"
[23:23:36] any ideas on how to properly dereference URLs with VCL? like /wikipedia/en/./foo or /wikipedia//en//foo
[23:23:55] ideally it should work with /../ too, so a regexp won't cut it
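The reason a regexp won't cut it: collapsing // and /./ is a plain per-match substitution, but /../ has to delete the preceding path segment, which takes iterating to a fixed point. A small shell sketch of that approach, for illustration only (not VCL; leading /../ and dot-prefixed segments are deliberately out of scope):

    normalize() {
      printf '%s\n' "$1" | sed -E -e ':a' \
        -e 's#//+#/#g; s#/\./#/#g; s#/[^/.][^/]*/\.\./#/#' \
        -e 'ta'   # loop back to :a until no substitution fired
    }
    normalize '/wikipedia//en/./foo/../bar'   # -> /wikipedia/en/bar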
[23:28:50] srv203 looks broken somehow
[23:29:31] 146703 messages from it in memcached-serious.log.. the next highest server has 634
[23:30:50] netstat -i doesn't show anything in the err cols, but tcp retrans keep climbing..
[23:30:58] LeslieCarr: can you check srv203's switch port?
[23:31:54] ok
[23:32:02] !log depooled srv203
[23:32:09] Logged the message, Master
[23:32:38] AaronSchulz: I remember you adding file backends support to captcha, but I don't see any captcha-related containers
[23:32:48] what's the status?
[23:33:02] binasher: yeah it has autonegotiated to 10Mbit
[23:33:42] glad to see it has good negotiating skills.
[23:33:58] binasher: i can imagine that would be saturated just a little bit quickly ;)
[23:34:18] paravoid: no one touched it lately, still using nfs
[23:35:05] AaronSchulz: okay, thanks.
[23:46:24] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34269
[23:47:07] A few people have written in within the past 18 hours about their NTP server/clients getting set to the year 2000.  The cause of this behavior is that an NTP server at the US Naval Observatory (pretty much the authoritative time source in the US) was rebooted and somehow reverted to the year 2000.  This, then, propagated out for a limited time and downstream time sources also got this value.  It's a transient problem and should alr
[23:47:14] lol.
[23:49:57] So that problem is affecting huge numbers of users all over the US, paravoid?
[23:50:52] that's what it says, but I haven't seen anything that affects us so far
[23:52:40] !log maxsem Finished syncing Wikimedia installation... : Weekly mobile deployment
[23:52:47] Logged the message, Master
[23:53:35] phewww
[23:54:07] paravoid: :)
[23:54:31] glad that the navy is so on top of their shit
[23:58:04] New patchset: Asher; "enabling http_stub_status_module on protoproxy nginx servers, for use with ganglia" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34447
[23:58:23] the us navy needs chronology protector
[23:59:17] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34447
[23:59:18] working on SSL binasher?