[00:48:05] New review: Jeremyb; "could leave [[Main Page]] out of the import I think" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61069 [01:00:32] New review: Tim Starling; "Looks good." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/61078 [01:02:09] Change merged: Tim Starling; [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/61033 [01:02:33] Change abandoned: Tim Starling; "Superseded" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/55841 [02:12:34] !log LocalisationUpdate completed (1.22wmf2) at Mon Apr 29 02:12:33 UTC 2013 [02:12:43] Logged the message, Master [03:25:36] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 29 03:25:35 UTC 2013 [03:25:43] Logged the message, Master [04:12:16] New patchset: Ori.livneh; "Add 'tcpircbot' Puppet class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61078 [04:14:01] New review: Ori.livneh; "PS3 allows filtering incoming connections by IPv6 CIDR range, specified in the configuration. If a C..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61078 [04:19:34] I said it would be around 5 lines, it ended up being 150. I was only off by a factor of 30, which is about the norm for software estimation. [04:20:00] * Aaron|home hands ori-l a bear to waltz with [04:20:21] heh [04:45:56] New review: Tim Starling; "netaddr.IPNetwork.__contains__ is apparently undocumented, but I confirmed that it does exist. At fi..." 
[operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/61078 [04:45:58] Change merged: Tim Starling; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61078 [06:27:32] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:28:21] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.133 second response time [06:32:31] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:33:21] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time [06:34:21] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 208 seconds [06:35:21] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 10 seconds [06:39:49] New patchset: Mattflaschen; "Add site icon config for Wikipedia." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61348 [09:35:48] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [09:35:48] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [09:35:48] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [09:35:48] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours [10:14:06] PROBLEM - Puppet freshness on mc15 is CRITICAL: No successful Puppet run in the last 10 hours [10:42:23] PROBLEM - Puppet freshness on vanadium is CRITICAL: No successful Puppet run in the last 10 hours [10:53:33] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 184 seconds [10:54:33] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 6 seconds [11:14:00] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 229 seconds [11:16:00] RECOVERY - MySQL Slave 
Delay on db1025 is OK: OK replication delay 6 seconds [11:19:00] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 186 seconds [11:21:00] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 21 seconds [11:43:56] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 229 seconds [11:45:56] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 12 seconds [11:49:56] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 210 seconds [11:55:56] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 10 seconds [12:14:04] New patchset: Matthias Mullie; "Enable Auto-archive for enwiki, dewiki, frwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60803 [12:19:21] mlitn: is it documented somewhere? what does it do? [12:19:39] I vaguely remember that "archiving" here doesn't have our usual meaning but rather means "expiry" or "deletion" [12:22:51] Nemo_bis: http://www.mediawiki.org/wiki/Article_feedback/Version_5/Feature_Requirements#Auto-archive_comments [12:22:56] I'll add it to the commit msg too ;) [12:23:25] thanks [12:24:40] AFTv5 has these "filters" (like unreviewed, useful, …); from a UI pov, archiving will remove feedback from the "unreviewed" filter & display it in the "archived" filter (similar to how marking feedback "useful" moves feedback from "unreviewed" to "useful") [12:29:36] New patchset: Matthias Mullie; "Enable Auto-archive for enwiki, dewiki, frwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60803 [12:30:11] New review: Matthias Mullie; "Should not be merged before https://rt.wikimedia.org/SelfService/Display.html?id=5016 is completed" [operations/mediawiki-config] (master) C: -2; - https://gerrit.wikimedia.org/r/60803 [12:33:06] PROBLEM - Puppet freshness on cp1031 is CRITICAL: No successful Puppet run in the last 10 hours [12:59:32] mark: around?
varnish purging broken again https://bugzilla.wikimedia.org/show_bug.cgi?id=47825 [13:16:45] drdee: finally have a contact with Cisco...just sent them everything on an1007 [13:16:49] fyi ^ [13:17:04] woot woot! thank you soo much!!! [13:17:40] had to get through support contract nonsense [13:31:02] * jeremyb_ spies east coasters [13:34:43] * jeremyb_ wonders whose week it is [13:44:46] <^demon> paravoid: Ping [13:44:54] ^demon: pong [13:45:16] <^demon> Hi! Got a second to look at a 3-line puppet change for me? [13:45:48] sure [13:45:52] <^demon> https://gerrit.wikimedia.org/r/#/c/61050/ [13:48:10] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61050 [13:50:17] <^demon> Thanks! [13:50:38] sure [14:17:14] mark, are you already looking at the problem with projectgid.rb? [14:17:52] andrewbogott: no, I don't know what I would do about that [14:18:10] i mean, clearly a fact generating something from a conf file like that is pretty broken :) [14:18:22] considering that the old ganglia manifests used that fact too, I wonder how it even worked in the first place... [14:18:31] yeah, me too. [14:18:32] is there a script that generates gmond.conf first in labs or something? [14:18:47] perhaps we should strip that down to write it out to /etc/projectgid [14:18:48] Although it looks like gmond.conf used to be produced by a python script? [14:18:51] and then write a puppet fact to read from there [14:18:57] I don't follow how these parts fit together. [14:19:08] there was certainly logic to write out gmond.conf for labs in a puppet template [14:19:51] Is generate-ganglia-conf.py unrelated? [14:19:52] and where is that script then? [14:21:30] Probably it did something dumb before like write 0 to the port and then come back on a second pass and fill it in. 
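The idea floated above (write the project GID to /etc/projectgid once, then have a fact read it from there) could look roughly like the sketch below. It is an illustration only: a real Facter fact would be written in Ruby, and the function names, the one-integer-per-file format and the GID used in the usage note are assumptions, not details of the labs setup.

```python
import os
import tempfile


def write_projectgid(path, gid):
    """Write the project GID to `path` once, e.g. at instance creation."""
    # Write to a temporary file in the same directory and rename it into
    # place, so a reader never sees a half-written file.
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{int(gid)}\n")
    os.chmod(tmp, 0o644)  # world-readable so other tools can use it too
    os.replace(tmp, path)


def read_projectgid(path):
    """Return the stored GID as an int, or None if missing or malformed."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None
```

Calling `write_projectgid("/etc/projectgid", 50380)` once (the GID is made up) and `read_projectgid("/etc/projectgid")` afterwards keeps LDAP lookups out of the fact itself.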
[14:21:35] hm [14:21:39] yuck [14:21:45] but yeah that does seem to be it [14:21:54] perhaps we can strip that down to just be a source for the projectgid fact [14:22:05] and write that to /etc/projectgid [14:22:13] then we don't need to muck around with ldap in ruby ;) [14:22:41] Or just embed ldap shell commands in the fact. [14:22:52] yeah [14:22:56] but presumably it doesn't change does it [14:23:03] nope [14:23:10] ew :) [14:23:17] (ldap in the fact) [14:23:26] so writing it out to the filesystem would perform better and be useful to other tools as well [14:23:37] It would. But, who would do the writing out? [14:23:49] i don't know how instances get created [14:24:00] just one shell command at instance creation time would already do the trick ;) [14:24:11] True. Hm... [14:25:45] * andrewbogott needs breakfast [14:40:00] New patchset: coren; "Preleminary toollabs module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59969 [14:43:15] you can use ldap as a hiera backend :) [14:43:59] PROBLEM - Puppet freshness on cp3003 is CRITICAL: No successful Puppet run in the last 10 hours [14:43:59] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [14:45:46] New patchset: coren; "Fix gridengine class to be parametrized" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61377 [15:04:24] New patchset: Andrew Bogott; "Band-aid patch to get puppet running again on labs." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/61381 [15:04:31] New patchset: Jforrester; "Deploy VisualEditor opt-in alpha to viwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61382 [15:05:41] New patchset: Jforrester; "Deploy VisualEditor opt-in alpha to viwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61382 [15:06:51] mark: https://gerrit.wikimedia.org/r/61381 [15:07:09] (Hopefully not the last word, but I want to get labs running again) [15:07:11] so what is going to fix up the projectgid? [15:07:22] i.e. what causes that script to run? [15:07:35] apergos: hey [15:07:41] yo [15:07:51] what's up with ms-be2? [15:08:17] I haven't looked at it today [15:08:30] let's see [15:09:10] mark, ganglia.pp sets up a cron that calls it. [15:09:18] weird [15:09:26] Oh, although, maybe not anymore… hm. [15:09:30] then it doesn't anymore indeed :) [15:09:38] why did you add ms-be9 without adding ms-be2 at the same time? [15:09:57] New patchset: Ottomata; "Adding Aaron Halfakar's new keys:" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61383 [15:09:57] and possibly lowering the weight of ms-be1... [15:10:06] we're losing days if not weeks :/ [15:10:19] oh. ms-be2. :-D [15:10:31] because it wasn't ready to go in [15:10:50] it was still being set up by steve (might still be tbh) [15:10:56] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61383 [15:11:19] and I couldn't wait any longer so I just put ms-be9 to 33, ms-be12 to 100 and pushed it out [15:11:20] still, after a week? [15:11:27] yes, still, after a week [15:11:32] so what is he doing instead? [15:11:50] for at least a couple days he wasn't on site [15:12:03] but chris will know the details better [15:12:29] i'm glad you're so on top of this [15:12:45] you're welcome :-P [15:14:07] mark: Did you disable that script because you determined that we don't need it, or was it an oversight?
I'm just trying to get context before I dig into understanding how this used to work... [15:14:21] andrewbogott: must have been an oversight [15:14:25] ok [15:14:31] i didn't realize gmond.conf was used in a circular fassion for projectgid [15:14:36] (seriously wtf) [15:14:55] fashion [15:15:08] so feel free to put that back, or even restore ganglia.pp entirely for labs [15:15:28] as long as the new module remains invoked for esams and manutius [15:15:57] mark: The projectgid thing is obviously silly, I'm just trying to figure out if that python script does anything else we care about. [15:16:05] yeah [15:17:20] apergos: hasn't ms-be9 already reached 33%? [15:18:54] I guess it's about providing aggregate per-project ganglia graphs. That seems useful. [15:19:12] that would be the clusters and gmetad stuff [15:19:19] i'm just not sure where this is supposed to run [15:19:20] on every host? [15:19:43] can't it do that only on the ganglia gmetad host with ldap info or something? [15:19:51] apergos: looks like it to me [15:19:54] anyway [15:19:59] if needed we can leave that part separate [15:20:03] they're distinct functions [15:20:07] gmond collects the data on a host [15:20:24] mark: yeah, it doesn't make sense to run on every host. I'm still reading… (and have never written a ganglia conf so it's slow going) [15:20:24] a gmond aggregator collects it per cluster (centrally now, also in labs before) [15:20:24] apergos: I'd say let's up ms-be9, up ms-be12, add ms-be2, lower e.g. 
ms-be1 all in one go [15:20:36] and the gmetad takes the data from the aggregators and writes it to RRDs on disk [15:20:43] (and then there's a web frontend that reads that again) [15:21:41] I already pushed out rings for ms-be9 and 12 [15:21:57] I did that in the middle of last week [15:22:13] yes, and these are done now [15:22:16] that's what I'm saying [15:22:23] you put ms-be9 to 33% [15:22:32] and that's already done as far as I can see [15:23:15] both of them are [15:23:23] let me check on that [15:23:42] it would be great if they are [15:28:27] New review: Jdlrobson; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/60885 [15:30:33] mark, that cron is part of ganglia::collector, which it looks like you left intact. That wasn't part of your refactor was it? [15:30:37] So probably the cron is still getting installed. [15:35:25] no [15:36:02] PROBLEM - Puppet freshness on ms-fe3001 is CRITICAL: No successful Puppet run in the last 10 hours [15:36:30] i only changed the monitor part (for now) [15:36:45] ok. I'm going to merge my bandaid and see what happens :) [15:37:02] apergos: ? [15:37:09] worst case it will allow labs to work again minus ganglia [15:37:17] yes? [15:38:05] weren't you checking? [15:38:07] yes [15:38:13] New review: Andrew Bogott; "Dumb but should help in the short-term." [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/61381 [15:38:14] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61381 [15:38:16] I'm now checking on the status of ms-be2 [15:49:34] New patchset: ArielGlenn; "ms-be2 new mac addr, new disk layout (720xd)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61388 [15:51:32] Andrew, plz to review https://gerrit.wikimedia.org/r/#/c/61377/ [15:51:33] ?
[15:51:36] andrewbogott: ^ [15:52:38] New patchset: ArielGlenn; "ms-be2 new mac addr, new disk layout (720xd)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61388 [15:57:07] apergos: raid cfg is finished on ms-be2 so once all the puppet changes are done you are free to install [15:57:24] ok thanks [15:59:23] New review: Andrew Bogott; "(1 comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/61377 [16:00:37] New patchset: ArielGlenn; "ms-be2 new mac addr, new disk layout (720xd), remove dup stanza" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61388 [16:02:33] New patchset: coren; "Fix gridengine class to be parametrized" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61377 [16:02:49] andrewbogott: Ask, and thou shall recieve. [16:03:25] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61388 [16:03:46] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61377 [16:04:50] I'm about to push your change andrewbogott, Coren [16:04:56] thx [16:05:22] done [16:06:50] New patchset: Faidon; "Enable LVS check for ms-fe.eqiad.wmnet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61392 [16:08:03] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61392 [16:09:17] apergos: Danke [16:09:25] yw [16:16:45] PROBLEM - Puppet freshness on db44 is CRITICAL: No successful Puppet run in the last 10 hours [16:18:06] New patchset: ArielGlenn; "ability to batch pages-logging dump (works aoround wikidata issue)" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/61394 [16:18:45] RECOVERY - Host ms-be2 is UP: PING OK - Packet loss = 0%, RTA = 26.57 ms [16:20:55] PROBLEM - swift-account-replicator on ms-be2 is CRITICAL: Connection refused by host [16:21:05] PROBLEM - swift-account-server on ms-be2 is CRITICAL: Connection refused by host [16:21:06] PROBLEM 
- swift-container-replicator on ms-be2 is CRITICAL: Connection refused by host [16:21:06] PROBLEM - SSH on ms-be2 is CRITICAL: Connection refused [16:21:06] PROBLEM - swift-object-updater on ms-be2 is CRITICAL: Connection refused by host [16:21:15] PROBLEM - swift-container-updater on ms-be2 is CRITICAL: Connection refused by host [16:21:15] PROBLEM - swift-account-reaper on ms-be2 is CRITICAL: Connection refused by host [16:21:15] PROBLEM - swift-container-auditor on ms-be2 is CRITICAL: Connection refused by host [16:21:15] PROBLEM - DPKG on ms-be2 is CRITICAL: Connection refused by host [16:21:25] PROBLEM - swift-container-server on ms-be2 is CRITICAL: Timeout while attempting connection [16:21:26] PROBLEM - swift-account-auditor on ms-be2 is CRITICAL: Timeout while attempting connection [16:21:26] PROBLEM - Disk space on ms-be2 is CRITICAL: Timeout while attempting connection [16:21:26] PROBLEM - RAID on ms-be2 is CRITICAL: Timeout while attempting connection [16:21:26] PROBLEM - swift-object-auditor on ms-be2 is CRITICAL: Timeout while attempting connection [16:21:35] PROBLEM - swift-object-server on ms-be2 is CRITICAL: Timeout while attempting connection [16:21:35] PROBLEM - swift-object-replicator on ms-be2 is CRITICAL: Timeout while attempting connection [16:22:23] apergos: have you seen ms-be11's broken disk? 
[16:22:31] yeah [16:22:34] ok [16:23:09] something's also wrong with ms-be9's nagios checks [16:23:17] see nagios [16:23:19] ok that's odd [16:23:23] nrpe config or something [16:23:51] * paravoid sighs on the mess that nagios is [16:24:04] (in general, not swift specifically) [16:25:26] PROBLEM - Host ms-be2 is DOWN: PING CRITICAL - Packet loss = 100% [16:25:35] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/61394 [16:25:45] okay, afk for now [16:26:47] later [16:30:35] RECOVERY - Host ms-be2 is UP: PING OK - Packet loss = 0 [16:44:04] !log reedy synchronized php-1.22wmf3/ 'initial sync of 1.22wmf3 files' [16:44:13] Logged the message, Master [16:49:01] !log reedy synchronized docroot [16:49:09] Logged the message, Master [17:01:37] !log attempt to fix NRPE checks on ms-be9, restart as correct user [17:01:45] Logged the message, Master [17:03:28] Uhhh [17:03:29] The authenticity of host 'mw1047 (10.64.0.77)' can't be established. [17:03:29] RSA key fingerprint is 10:af:44:e5:3a:4f:82:68:d6:9b:70:2b:2c:64:dd:4f. [17:03:29] Are you sure you want to continue connecting (yes/no)? The authenticity of host 'mw1048 (10.64.0.78)' can't be established. [17:03:29] RSA key fingerprint is 0f:10:1c:6a:be:6f:09:30:f6:f3:cf:51:51:69:a7:0f. [17:03:29] Are you sure you want to continue connecting (yes/no)? The authenticity of host 'mw1049 (10.64.0.79)' can't be established. [17:03:31] RSA key fingerprint is e4:1e:35:34:c2:a2:71:85:90:e6:bf:07:82:7e:54:da. 
[17:03:36] etc etc [17:03:44] Reedy: should be fixed already [17:03:50] i know why [17:03:52] heh [17:03:56] * Reedy runs it again [17:04:30] !log reedy synchronized live-1.5 [17:04:33] !log csteipp synchronized php-1.22wmf2/includes/ [17:04:33] !log removing reinstalled ms-be2 from ssh_known_hosts with ssh-keygen and then making it world-readable again manually so deployers don't get errors :p [17:04:45] ^ [17:04:45] Logged the message, Master [17:04:45] :D [17:04:46] Logged the message, Master [17:04:53] Logged the message, Master [17:05:16] ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R ms-be2 [17:05:24] this stuff makes it just readable for root [17:05:50] but if i dont remove it then could not connect to ms-be2 which has been reinstalled [17:06:11] same thing Tim fixed a couple days ago [17:06:33] yeah [17:07:36] apergos: success https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=ms-be9 [17:08:23] I looked at that file but it was dated apr 27, after ms-be9 reinstall [17:08:25] !log fix NRPE checks on ms-be2, running as "weird user id" bug [17:08:32] Logged the message, Master [17:08:33] why would it have been broken for that? [17:08:54] look at the user id it was running as before: [17:09:02] 4294967295 16383 0.0 0.0 25352 1172 ? Ss 16:40 0:00 /usr/sbin/nrpe -c /etc/icinga/nrpe.cfg -d [17:09:05] after: [17:09:08] icinga 23497 0.0 0.0 25476 1176 ? Ss 17:08 0:00 /usr/sbin/nrpe -c /etc/icinga/nrpe.cfg -d [17:09:25] killed the process, used init script to start it [17:09:28] what I mean is, the file had obviously been recreated after the reinstall [17:09:39] https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=ms-be2 [17:09:41] like a few days after [17:09:53] for ms-be9; why was that not enough? 
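The two process listings above differ only in the owner column. 4294967295 is 2^32 - 1, which is what -1 looks like when stored as an unsigned 32-bit uid. A minimal sketch of a check for that state, assuming `ps aux`-style rows where the first column is the owner and the eleventh is the command; the function name and the default expected user are illustrative choices, not part of any existing tool:

```python
INVALID_UID = 2**32 - 1  # 4294967295, i.e. -1 as an unsigned 32-bit uid


def nrpe_owner_problems(ps_lines, expected_user="icinga"):
    """Return (owner, pid) for every nrpe process not owned by expected_user."""
    problems = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) < 11:
            continue  # not a full `ps aux` row
        owner, pid, command = fields[0], fields[1], fields[10]
        if command.endswith("/nrpe") and owner != expected_user:
            problems.append((owner, int(pid)))
    return problems


# The two rows pasted in the channel, without the chat timestamps.
before = ("4294967295 16383 0.0 0.0 25352 1172 ? Ss 16:40 0:00 "
          "/usr/sbin/nrpe -c /etc/icinga/nrpe.cfg -d")
after = ("icinga 23497 0.0 0.0 25476 1176 ? Ss 17:08 0:00 "
         "/usr/sbin/nrpe -c /etc/icinga/nrpe.cfg -d")

print(nrpe_owner_problems([before, after]))  # [('4294967295', 16383)]
```

Only the row from before the restart is flagged; the one started through the init script runs as icinga and passes.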
[17:10:53] i just know right now this was a known bug we had with NRPE init script afair, and we fixed it and just had to kill and restart it everywhere at some point [17:11:09] both, be-2 and be-9 were in that same state though [17:11:13] ok and you run this on neon? [17:11:23] no, on the hosts themselves [17:11:26] oh [17:11:33] that's even weirder [17:11:51] the "server" is on the hosts, because it accepts the requests to execute local stuff [17:12:03] nagios remote plugin executor [17:12:21] root@ms-be2:~# /etc/init.d/nagios-nrpe-server start [17:12:39] really [17:13:02] I wonder if that's [17:13:07] well, it checks for running processes on the boxes [17:13:35] yes, we don't pass arguments to commandlines over the net [17:13:40] they are hardcoded [17:13:57] the option is called "dont_blame_nrpe" :p [17:13:57] no I was going to ask something else [17:14:01] ah,heh [17:16:15] I wonder if I just fixed db55 then [17:16:54] looks like it :) [17:16:54] maybe so [17:17:21] good. hopefully I will now remember this little issue, having done it for one of the hosts [17:17:58] I suppose after the first reboot it would be ok [17:18:10] https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=db55 yep, done [17:19:24] as long as it's running as "icinga", and it is [17:25:13] New patchset: Andrew Bogott; "Add robots.txt and a privacy policy to mediawiki_singlenode." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61069 [17:25:13] New patchset: Andrew Bogott; "Add unicorn logo to labs mediawiki installs." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61401 [17:25:56] unicorn! <3 [17:29:45] !log reedy Started syncing Wikimedia installation... : test2wiki to 1.22wmf3 and rebuild l10n cache [17:29:52] Logged the message, Master [17:34:42] assume it's just a hiccup....
[17:34:43] PHP fatal error in /usr/local/apache/common-local/wmf-config/CommonSettings.php line 2779: [17:34:46] require() [function.require]: Failed opening required '/usr/local/apache/common-local/php-1.22wmf3/../wmf-config/ExtensionMessages-1.22wmf3.php [17:34:53] (include_path='/usr/local/apache/common-local/php-1.22wmf3/extensions/TimedMediaHandler/handlers/OggHandler/PEAR/File_Ogg:/usr/local/apache/common-local/php-1.22wmf3:/usr/local/lib/php:/usr/share/php') [17:34:59] on test2 [17:35:04] Or scap being fail [17:35:19] As it's only test2wiki I'm not going to fix it to confirm whether scap does [17:35:24] !log removing asw-c1 for replacement [17:35:25] k [17:35:31] Logged the message, Master [17:35:34] thanks for the heads up though [17:40:01] New patchset: BBlack; "Work-In-Progress vhtcpd code." [operations/software/varnish/vhtcpd] (master) - https://gerrit.wikimedia.org/r/60390 [17:43:45] Can someone fix the permissions of /home/wikipedia/common/php-1.22wmf2/.git/modules/extensions/WikiLove/index.lock ? Aaron is being greedy [17:50:11] ok, who killed ms-fe [17:50:17] also, where is the bot [17:50:26] eh? [17:50:28] !log restarting icinga-wm [17:50:36] Logged the message, Mistress of the network gear. [17:50:39] we just got paged for ms-fe lvs death [17:50:40] eqiad [17:50:46] but no icinga bot echo [17:51:15] oh eqiad [17:51:17] *whew* [17:51:26] I was only messin with tampa I swear [17:51:29] oh, lookit, pmtpa [17:51:31] as well [17:51:40] appservers.svc.pmtpa.wmnet is error [17:51:46] heh [17:51:50] Error: invalid magic word 'rootpagename' [17:52:05] sounds like waiting for localisation cache [17:52:13] pybal has depooled 2 (the other two aren't depooled because of the limits) mostly fails on health checks [17:52:17] i didn't get a page [17:52:19] what's up? [17:52:30] leslie and i got pages, our phones went off at same time [17:52:38] I got no pages either [17:52:38] for ms-fe.eqiad? 
[17:52:40] ms-fe.eqiad.wmnet [17:52:42] yep [17:52:50] that's not yet in use until later today [17:52:57] so you can ignore that for another few hours [17:53:15] cool [17:53:26] ok [17:53:35] aude: Or we get to beat someone up [17:53:43] heh [18:03:03] !log deploying new frontend squid conf [18:03:10] Logged the message, notpeter [18:05:57] !log reedy Finished syncing Wikimedia installation... : test2wiki to 1.22wmf3 and rebuild l10n cache [18:06:05] Logged the message, Master [18:07:26] Actually. Can a root just delete /home/wikipedia/common/php-1.22wmf2/.git/modules/extensions/WikiLove/index.lock please? [18:07:48] Reedy: sure [18:08:00] Thanks [18:08:22] aude: Looks like we get to beat someone up [18:08:37] Reedy: done [18:08:48] thanks [18:09:54] Reedy: did you take down test2wiki just now? [18:10:00] Not just now [18:10:07] Last 40 minutes or so [18:10:40] https://gerrit.wikimedia.org/r/#/c/60198/ [18:10:40] new magic word [18:11:09] thanks aude [18:12:04] usually is localisation cache rebuild [18:12:16] w/o it, things fail hard [18:12:25] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61069 [18:12:37] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61401 [18:13:59] Oh wth [18:14:58] New review: preilly; "This is so full of win! :-)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61401 [18:18:52] paravoid: any chance you could review the ipython stuff? [18:21:15] !log reedy synchronized php-1.22wmf3/extensions/ [18:21:24] Logged the message, Master [18:27:54] Running scap number 2 [18:37:02] !log reedy synchronized php-1.22wmf2/.git/modules/extensions/WikiLove [18:37:10] Logged the message, Master [18:46:14] !log reedy Started syncing Wikimedia installation... : Scap again to see if that fixes ROOTPAGENAME and ROOTPAGENAMEE. 
Otherwise I'm reverting it [18:46:22] Logged the message, Master [19:06:08] !log reedy Finished syncing Wikimedia installation... : Scap again to see if that fixes ROOTPAGENAME and ROOTPAGENAMEE. Otherwise I'm reverting it [19:06:15] Logged the message, Master [19:06:52] Reedy: it no longer MWExceptions … but it calls itself 1.22alpha [19:09:43] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki and mediawikiwiki to 1.22wmf3 [19:09:51] Logged the message, Master [19:11:04] New patchset: coren; "Preleminary toollabs module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59969 [19:11:28] New patchset: Reedy; "testwiki, test2wiki and mediawikiwiki to 1.22wmf3" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61417 [19:12:11] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61417 [19:12:45] Change abandoned: Catrope; "No longer needed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/55137 [19:12:53] Change abandoned: Catrope; "No longer needed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/55135 [19:15:20] I am pushing out new Parsoid code now, so please ignore any Parsoid-related alerts in the next minutes [19:19:12] !log kaldari synchronized php-1.22wmf2/extensions/Echo/maintenance/setEmailOptionTemp.php [19:19:20] Logged the message, Master [19:21:49] New patchset: coren; "Fix gridengine puppet manifest syntax" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61419 [19:22:02] the Parsoid update is done now [19:22:21] Ryan_Lane: Can you check out and +2 ^^ so that I don't look so stupid anymore? [19:23:01] :D [19:23:20] oh. 
it needs a jenkins ru [19:23:21] *run [19:23:36] Jenkin lags by ~1 min usually [19:25:00] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61419 [19:25:03] done [19:25:19] for obvious things like that you can self-merge, too :) [19:25:35] Yes, so, actually using puppet syntax in puppet manifests = a good thing. :-) [19:26:06] * Coren isn't entirely certain where his brain was residing during that particular one. [19:27:10] heh [19:32:04] New patchset: coren; "Preleminary toollabs module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59969 [19:37:27] New patchset: Faidon; "Ceph: be more resilient to malfunctioning OSDs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61423 [19:37:40] ^^^ should fix the cause that produced the outage an hour ago [19:37:51] the page was for a real outage [19:38:20] it's also on my TODO to document a few things like basic troubleshooting and inform the rest of you about Ceph plans [19:38:29] but let's see how today's deployment window goes :) [19:38:43] New patchset: coren; "Preleminary toollabs module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59969 [19:38:44] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61423 [19:39:26] gridengine changeset not merged in sockpuppet [19:39:27] New patchset: Hashar; "multiversion: headers() do not play nice on CLI" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61424 [19:39:27] I'm merging [19:39:31] Coren, andrewbogott ^ [19:39:50] - require gridengine($gridmaster) [19:39:50] + class { 'gridengine': [19:39:51] + gridmaster => $gridmaster, [19:39:51] + } [19:40:27] paravoid: Yeah, my mind got puppet syntax confused with something else for a while. I did two similar commits within 1h of each other. 
:-) [19:40:29] coren: this btw means that it was either wrong before (it didn't need a require but an include), or that it's broken now (missing a class dependency) [19:40:32] New patchset: Hashar; "multiversion: ability to destroy singleton" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61425 [19:40:32] New patchset: Hashar; "multiversion: hostname to dbname basic tests" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61426 [19:41:17] paravoid: It was semi-wrong before. It didn't need the dependency per se; I tend to use require reflexively though. [19:42:14] don't [19:42:27] Oh? [19:42:31] more dependencies means larger catalog and more graphs for puppet to solve [19:42:40] Ah. I wasn't aware of that. [19:43:17] New patchset: Hashar; "beta: configuration for Wikidata" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61428 [19:46:40] New patchset: Ryan Lane; "Adding labstore3/4 to gmetad config for Labs NFS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61429 [19:47:55] New review: coren; "LGM" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/61429 [19:48:32] * Coren waits on Jenkins [19:52:16] ottomata: is the replag on db56 known? Lag: 1518271 from noc [19:52:29] not to me! [19:53:20] okay, well, all I know other than that is that it is currently pooled :-) [19:53:42] (and accepting queries) [19:53:44] that is really far behind [19:53:52] it looks like it's catching up, but that is way too far behind [19:53:54] New patchset: Ori.livneh; "RDoc for tcpircbot class and tcpircbot::instance resource" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61432 [19:54:08] binasher, you around? [19:54:13] ottomata: might it make sense to depool or set load to 0 while it catches up?
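For scale, the lag figure quoted from noc can be converted into days. A quick sketch, assuming the value is in seconds (the log does not state the unit):

```python
from datetime import timedelta

# "Lag: 1518271" as reported for db56; the unit is assumed to be seconds.
lag = timedelta(seconds=1518271)

print(lag)       # 17 days, 13:44:31
print(lag.days)  # 17
```

Under that assumption the slave is a little over seventeen and a half days behind its master.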
[19:54:29] i haven't done dba-ing for WMF yet, so i'm not sure how to do that [19:54:34] Change merged: coren; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61429 [19:54:36] but that would make sense i think [19:54:50] searching wikitech... [19:55:10] feel free to pass along to another ops, I just notified you since you are listed as the on duty :-) [19:56:02] yeah [19:56:03] Coren: wanna merge https://gerrit.wikimedia.org/r/#/c/61432/ too since you're doing puppet stuff? It's just docs + class parameter [19:56:08] i pinged binasher [19:56:13] i should know how to do this I think [19:57:05] pgehres: the comment in db-pmtpa.php says [19:57:06] 'db56' => 400, # snapshot host [19:57:11] aha [19:57:36] hmm, nothing like the monthly dumps [19:57:49] so maybe that's ok? i dunno, 17 days is pretty far behind, but if this machine is used for taking dumps then maybe its ok [19:57:57] notpeter: you there? ^ [20:00:32] New review: coren; "Simple enough" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/61432 [20:00:54] Coren: thanks [20:01:18] ori-l: I'll merge once Jenkins gets around to it [20:03:25] AaronSchulz: hey :) [20:03:54] New patchset: Aaron Schulz; "Enabled multiwrites for ceph/swift." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61462 [20:04:08] oh heh [20:07:22] ... [20:07:40] sex@wikipedia? meh, whatever [20:08:05] * AaronSchulz awaits gerrit [20:08:09] "hehehe, I named my user account something VULGAR" [20:08:29] "poppycock@wikipedia" [20:08:36] is "sex" even vulgar? [20:09:05] AaronSchulz: I guess it depends on what colour your state is [20:09:07] more so just silly [20:10:38] Change merged: coren; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61432 [20:10:59] ori-l: {{done}} [20:16:24] AaronSchulz: what's your plan? [20:16:42] Coren: thanks again [20:18:39] paravoid: just turn one writes for today and run some scripts [20:18:43] *turn on [20:18:50] ? 
[20:21:54] assuming I don't spend the whole window waiting on jenkins [20:24:50] why just writes today? [20:24:57] I thought we agreed writes & reads last week? [20:25:23] and talked to Rob about it later and decided to wait a few days [20:25:52] I don't think we are in a hurry [20:26:19] Change merged: Aaron Schulz; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61462 [20:27:28] !log aaron synchronized wmf-config/filebackend.php 'Enabled multiwrites for ceph/swift' [20:27:36] Logged the message, Master [20:28:04] it would have been nice to be part of that discussion though -- or at least be informed of it before the window... [20:30:58] !log Doing another original/math/timeline sync/copy run [20:31:06] Logged the message, Master [20:31:28] I still have apache access logs for now [20:31:35] so I see the requests coming through fine [20:32:12] paravoid: heh, well Rob is fine with either, I just want to watch some logs for a while before doing the next thing [20:33:02] I'm okay with that, just keep me in the loop though :) [20:33:42] silly amount of req/s btw [20:34:06] something like 10-15 in total [20:34:40] I did some load testing on Friday btw [20:34:57] with a sample of ~2000 real URLs [20:35:20] just GETs, one iteration with just originals and one with thumbs [20:35:47] maxed out the gigabit on the former case, got close to on the latter [20:35:53] over 800 req/s per box [20:36:15] ceph has a rest-bench package to do load testing with PUTs too [20:36:36] but I didn't bother as much, we did a bit of that with the replication scripts :) [20:39:56] heh [20:43:11] As to your other point, for someone like Jonah Falcon, a "size of penis" field could be very useful. Not sure about U.S. presidents, though I can think of at least one case where you could argue it's historically notable. 
— PinkAmpers&(Je vous invite à me parler) 23:36, 5 February 2013 (UTC) [20:43:17] ohh, wikidata :) [20:51:03] ottomata: yeah, it is comically far behind [20:51:06] although it's catching up [20:51:17] my guess would be that a dump was run against it recently [20:51:36] is that normal? [20:53:39] the pmtpa slaves to fall pretty far behind from time to time [20:53:45] I don't know if I've seen them that bad, though [20:53:55] soooo, let's go with yes, it's normal :) [20:59:40] Deploying a Parsoid config change, there shouldn't be any LVS flapping but if there is please ignore [21:02:10] All done [21:03:46] hey paravoid, hashar: nested classes/defines: go! [21:04:08] ok, thanks notpeter! [21:04:44] definitely [21:04:56] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61250 [21:05:27] ottomata: no. [21:05:30] imho :) [21:05:44] ottomata: definitely :) [21:06:02] good i agree [21:06:05] how else are you going to create things that are wildly abusive and unwieldly [21:06:07] i have a weird special case [21:06:08] ? [21:06:14] ah but now i'm in a meeting... [21:07:38] AaronSchulz: works so far [21:07:42] AaronSchulz: anything else to do/deploy? [21:13:32] AaronSchulz: I see some PUTs for thumbs too, with a PHP-Cloudfiles UA [21:14:11] * paravoid feels like talking to a wall :) [21:14:33] paravoid: VE on wikitech is a good point, I'll set that up later today. 
Should be easy to do [21:15:43] RoanKattouw: \o/ [21:16:54] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61382 [21:24:03] !log catrope synchronized wmf-config/InitialiseSettings.php 'Enabling VisualEditor on viwiki and test2wiki' [21:24:11] Logged the message, Master [21:31:04] !log upgrading db45 to precise [21:31:12] Logged the message, notpeter [21:38:46] New patchset: Bsitu; "Assign immediate priority to EchoNotificationJob" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61479 [21:41:16] anomie: Just FYI, RoanKattouw's deployment may run over due to Jenkins slowness; hopefully not, though. [21:41:54] James_F: That's ok, mine should only take a few minutes so I can start late. [21:42:06] anomie: Yeah, but keen to get the fix out the door ASAP. :-) [21:43:54] error: unable to unlink old '.gitmodules' (Permission denied) [21:43:59] * RoanKattouw glares at re [21:44:01] * RoanKattouw glares at Reedy [21:44:24] unlink where? [21:44:28] Reedy: When creating php-1.22wmfN directories, please create them with correct perms [21:44:30] drwxr-xr-x 16 reedy wikidev 4096 Apr 29 18:18 /home/wikipedia/common/php-1.22wmf3 [21:44:36] * RoanKattouw fixes [21:44:50] RoanKattouw: I don't create them [21:45:05] Who/what does? [21:45:13] It's done my multiversion/checkoutMediaWiki [21:45:20] Ah [21:45:20] And it was fine last time around... [21:45:25] Maybe your umask is off? 
[21:45:25] s/my/by/ [21:45:33] reedy@fenari:/tmp/uploads/zurich$ umask [21:45:33] 0002 [21:45:54] Hah [21:45:55] New patchset: Catrope; "Add the TemplateData extension" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61481 [21:46:02] Strange that it was 755 then [21:48:22] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61481 [21:52:12] !log reedy synchronized php-1.22wmf3/includes/DefaultSettings.php 'Fix wgVersion' [21:52:20] Logged the message, Master [21:52:29] Going to run scap now [21:53:00] * ^demon grabs an umbrella [21:53:27] !log Jenkins is in trouble [21:53:34] Logged the message, Master [21:55:55] New patchset: Lwelling; "Add global setting to trigger migration time functionality in Echo relates to https://gerrit.wikimedia.org/r/61025" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61487 [21:56:02] New patchset: Catrope; "Enable TemplateData on mw.org, testwiki and test2wiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61488 [21:56:16] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61488 [21:56:42] OK actually scapping now [21:57:48] anomie: Running scap now, will probably take a while [21:57:58] RoanKattouw: That's ok. Ping me when it's done. [21:58:35] Will do [21:58:53] * anomie is still in meeting anyway [21:59:55] !log gallium / jenkins : blackholed a web drawer using ip route add blackhole /32 [22:00:03] Logged the message, Master [22:02:56] !log jenkins web server is no more responding on port 8080. Might have to restart it. [22:03:03] Logged the message, Master [22:07:21] paravoid: back [22:07:29] welcome back :) [22:08:53] see my comments above? 
[22:10:04] I'm also looking at https://graphite.wikimedia.org/render/?title=Top%208%20FileBackend%20Methods%20by%20Max%20Average%20Time%20%28ms%29%20log%282%29%20-8hours&from=-8hours&width=1024&height=500&until=now&areaMode=none&hideLegend=false&logBase=2&lineWidth=1&lineMode=connected&target=cactiStyle%28substr%28highestMax%28FileBackendStore.*.tavg,8%29,0,2%29%29 [22:10:33] but some of the mediawiki internals elude me, so I can't fully understand it :) [22:12:23] !log gallium: restarted apache2 to kill off http connections established by mod_proxy to the jenkins backend. [22:12:31] Logged the message, Master [22:13:59] OMG this scap is sllooooowwww [22:18:25] !log catrope Started syncing Wikimedia installation... : [22:18:33] Logged the message, Master [22:19:53] paravoid: I think that's it for today [22:20:13] we're writing thumbs too I see? [22:20:16] * AaronSchulz was in a core meeting and then talking to James_F [22:20:23] yes [22:20:55] Agh [22:20:58] Fatal error: require_once(): Failed opening required '/home/wikipedia/common/php-1.22wmf3/extensions/TemplateData/TemplateData.php' (include_path='/home/wikipedia/common/php-1.22wmf3/extensions/TimedMediaHandler/handlers/OggHandler/PEAR/File_Ogg:/home/wikipedia/common/php-1.22wmf3:/usr/local/lib/php:/usr/share/php') in /usr/local/apache/common-local/wmf-config/CommonSettings.php on line 2196 [22:21:00] cp: cannot create regular file `/home/wikipedia/common/wmf-config/ExtensionMessages-1.22wmf3.php': Permission denied [22:21:09] Aborted scap [22:22:39] !log Running scap again [22:22:56] Logged the message, Mr. Obvious [22:23:07] anomie: I had to abort the scap because of a combination of me forgetting to git submodule update and the perms on ExtensionMessages-1.22wmf3.php being borked. Scapping from scratch now [22:23:25] RoanKattouw: Hopefully it goes faster this time... [22:23:37] I sure hope so but I'm skeptical [22:23:41] RoanKattouw: You could just deploy anomie's fix as you're there? 
:-) [22:23:43] Sorry for the delay dude [22:23:45] Sure [22:23:49] What cha got anomie ? [22:23:54] Rather than have anomie want to wait around. [22:24:01] Just a minute, let me git review the patches [22:24:07] https://gerrit.wikimedia.org/r/#/c/61057/ [22:24:33] OK, take your time [22:24:56] pgehres / AaronSchulz: Did you want the Lightning Deploy window for Account Audit, then? [22:25:30] sure? [22:25:36] Ok, pushed to gerrit and merged [22:25:37] anomie: OK I see they've gone in [22:25:43] I'll pull them in and scap [22:26:07] pgehres: is it in branch already? I think so [22:26:20] AaronSchulz: not that I know of [22:26:22] I'm also going to revert 60947 (to re-enable WikiLove) once I confirm those are deployed. [22:26:27] OK [22:26:34] That's a simple config change so that should be fine [22:26:47] But we'll have a ridiculously slow scap first, in just a minute [22:26:54] !log Running scap [22:27:01] Logged the message, Mr. Obvious [22:27:03] ahh, it wouldn't be in out branch script [22:27:07] *in our [22:27:39] !log (Including backport of gerrit change 61494 in the scap) [22:27:47] Logged the message, Master [22:27:59] paravoid: Where is wikitechwiki hosted these days? [22:28:26] Yeah, one sync-file, easy. Ping me when the scap is done so I can verify the fix. [22:28:41] virt0 [22:30:08] Is any of the wiki setup in version control at all? [22:30:31] I guess I can just go around changing config files as root... [22:31:21] pgehres: can you make the backport changes in gerrit? [22:31:40] AaronSchulz: sure [22:32:03] I believe RoanKattouw is deploying things at the moment, so I won't merge them [22:32:56] pgehres / AaronSchulz: https://wikitech.wikimedia.org/wiki/Deployments#Week_of_April_29th claims you're doing this at 16:00 now. 
:-) [22:33:55] trickery [22:38:14] pgehres: I am, yes [22:39:40] New patchset: Pgehres; "Adding configuration variables for Extension:AccountAudit" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61499 [22:40:20] !log catrope Started syncing Wikimedia installation... : [22:40:28] Logged the message, Master [22:52:58] New review: Aaron Schulz; "(1 comment)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/61499 [22:53:39] Reedy: spot the typo...you lose! :) [22:54:07] New patchset: Anomie; "Revert "Temporarily disabling WikiLove per bug 47457, except testwiki"" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61504 [22:58:47] New patchset: Pgehres; "Adding configuration variables for Extension:AccountAudit" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61499 [22:59:01] AaronSchulz: ^ oops, thanks for the spell check [23:03:52] !log catrope Finished syncing Wikimedia installation... : [23:03:59] Logged the message, Master [23:04:37] Change merged: Anomie; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61504 [23:05:45] !log anomie synchronized wmf-config/InitialiseSettings.php 'Re-enable WikiLove, bug fix is deployed' [23:05:52] Logged the message, Master [23:08:13] !log finished creating AccountAudit tables [23:08:21] Logged the message, Master [23:08:42] hey, congrats on the wikidata -> enwiki launch! [23:08:59] !log s5 pmtpa master changed to db73 - MASTER_LOG_FILE='db73-bin.000090', MASTER_LOG_POS=978211000 [23:09:08] Logged the message, Master [23:09:36] maplebed: thanks! congrats on becoming a fb employee :) [23:10:22] Thanks! excited for everything but the commute. [23:10:47] New patchset: Pyoungmeister; "updating s5 topology" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61511 [23:10:51] maplebed: they're moving everyone to menlo park? [23:10:58] that's brutal... [23:11:04] yup. 
[23:11:11] Ugh [23:11:47] some folks are moving, and for the rest there're shuttles etc. not so bad. [23:12:08] oh noes! [23:12:20] the commute sort of sucks away your soul [23:12:55] Amusingly, the length of my commute probably will only change by ~10 minutes. [23:13:15] that's pretty good [23:13:49] going to percona live three days in a row was a nice (horrible, painful) reminder never to work anywhere south of san bruno [23:14:37] why? :D [23:14:40] if you're into the biking 35 miles to work thing, the facebook campus is actually well located [23:15:00] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/61511 [23:15:22] The bike across the dumbarton bridge would be the most annoying part. [23:15:33] i don't find that too bad - windy sometimes [23:15:51] i wonder if you could sail most of the way [23:16:53] New patchset: Asher; "pmtpa s5 master swap" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61513 [23:18:18] New patchset: Asher; "pmtpa s5 master swap" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61513 [23:18:56] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61513 [23:20:03] !log asher synchronized wmf-config/db-pmtpa.php 'updated s5 pmtpa topology' [23:20:11] Logged the message, Master [23:22:40] New patchset: Pgehres; "Adding configuration variables for Extension:AccountAudit" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61499 [23:26:57] Change merged: Pgehres; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/61499 [23:49:31] hey, if anyone's around and wants to help with ssh, I just randomly lost access to analytics1002 [23:49:40] so I went ssh analytics1002.eqiad.wmnet and it worked fine [23:49:52] then I rsync-ed some jar files into my home directory [23:50:15] then I tried to connect with the same exact ssh command, and I get the permission denied (publickey) 
thing [23:50:45] not super urgent but ottomata isn't around and was counting on access to get something done for tfinc [23:52:21] milimetric, are you sure your network connection is stable? [23:53:00] looks ok to me, though i have had an outage here and there? why, does something get cached somewhere? [23:53:28] i'm assuming since i'm on here talking my network's ok-ish :) [23:54:37] irc doesn't really require a stable connection, ssh is a bit more fickle, if you're dropping packets that would potentially disconnect you from ssh and could interfere with the ssh handshake [23:55:51] that makes sense, but yeah, my connection's definitely stable enough for that. [23:57:14] is it possible that I tripped something that revoked my access on those servers? [23:57:21] by uploading jar / .pig files? [23:57:27] or by using rsync? [23:58:59] no [23:59:15] ok. boarding my flight...