[00:01:20] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:11:59] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 3.541 seconds [00:33:26] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [00:33:26] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [00:47:23] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:01:38] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.042 seconds [01:14:47] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 320 seconds [01:16:35] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 7 seconds [01:25:53] PROBLEM - Host ms-be1003 is DOWN: PING CRITICAL - Packet loss = 100% [01:28:17] RECOVERY - Host ms-be1003 is UP: PING OK - Packet loss = 0%, RTA = 66.01 ms [01:34:35] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:44:07] jeremyb: yes. it's segfaulting [01:44:15] I need to run it in gdb [01:45:14] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.680 seconds [01:53:29] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [01:53:29] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [01:53:29] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [01:53:29] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours [01:53:29] PROBLEM - Puppet freshness on ms-be1010 is CRITICAL: Puppet has not run in the last 10 hours [01:53:29] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [02:04:17] PROBLEM - MySQL disk space on db78 is CRITICAL: DISK CRITICAL - free space: /a 116492 MB (3% inode=99%): [02:20:38] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:27:27] !log LocalisationUpdate completed (1.21wmf6) at Wed Jan 2 02:27:26 UTC 2013 [02:27:38] Logged the message, Master [02:36:33] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.033 seconds [02:43:42] PROBLEM - Varnish traffic logger on cp1021 is CRITICAL: PROCS CRITICAL: 2 processes with command name varnishncsa [02:55:24] RECOVERY - Puppet freshness on neon is OK: puppet ran at Wed Jan 2 02:55:16 UTC 2013 [03:01:42] PROBLEM - Puppet freshness on solr2 is CRITICAL: Puppet has not run in the last 10 hours [03:03:39] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [03:10:15] RECOVERY - Varnish traffic logger on cp1021 is OK: PROCS OK: 3 processes with command name varnishncsa [03:13:42] PROBLEM - Puppet freshness on solr1003 is CRITICAL: Puppet has not run in the last 10 hours [03:13:42] PROBLEM - Puppet freshness on solr3 is CRITICAL: Puppet has not run in the last 10 hours [03:14:36] PROBLEM - Puppet freshness on solr1001 is CRITICAL: Puppet has not run in the last 10 hours [03:20:42] ah [03:21:53] PROBLEM - Varnish HTCP daemon on cp1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[03:23:32] RECOVERY - Varnish HTCP daemon on cp1021 is OK: PROCS OK: 1 process with UID = 997 (varnishhtcpd), args varnishhtcpd worker [03:28:02] RECOVERY - Puppet freshness on sq81 is OK: puppet ran at Wed Jan 2 03:27:50 UTC 2013 [03:39:53] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 195 seconds [03:40:21] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 214 seconds [03:48:53] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds [03:49:02] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds [03:54:55] !b 43112 | gerrit-wm needs a boot! [03:54:55] gerrit-wm needs a boot!: https://bugzilla.wikimedia.org/43112 [04:02:59] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours [04:27:26] PROBLEM - Puppet freshness on srv191 is CRITICAL: Puppet has not run in the last 10 hours [04:27:26] PROBLEM - Varnish HTTP upload-backend on cp1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:30:53] RECOVERY - Varnish HTTP upload-backend on cp1021 is OK: HTTP OK HTTP/1.1 200 OK - 634 bytes in 0.056 seconds [04:31:56] PROBLEM - MySQL Replication Heartbeat on db1007 is CRITICAL: CRIT replication delay 201 seconds [04:32:14] PROBLEM - MySQL Slave Delay on db1007 is CRITICAL: CRIT replication delay 206 seconds [04:45:22] New patchset: Ori.livneh; "Enable PostEdit on az, be_x_old, eo, pms, si & uk." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41917 [04:46:05] New patchset: Ori.livneh; "Enable PostEdit on az, be_x_old, eo, pms, si & uk." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41917 [04:56:23] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [05:03:35] RECOVERY - MySQL Replication Heartbeat on db1007 is OK: OK replication delay 0 seconds [05:03:53] RECOVERY - MySQL Slave Delay on db1007 is OK: OK replication delay 0 seconds [05:05:23] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours [05:30:07] * jeremyb waves to gerrit-wm! [05:34:56] hello -- any one knows total number of concurrent requests that wikipedia servers handle? [05:37:48] Many. [05:38:07] any approximate numbers? [05:38:28] Susan: this is being discussed already in #wikimedia-tech [05:38:44] Mahmoud: you were given a guess but ori's not certain if that number is really what he thinks it is [05:38:58] (the 141k from the /topic here) [05:39:48] 09:32 < ori-l> Mahmoud: IIRC the number in the topic of #wikimedia-operations is it [05:40:09] i didn't check the -tech channel since then :) [05:42:26] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [06:29:12] PROBLEM - Puppet freshness on stat1 is CRITICAL: Puppet has not run in the last 10 hours [07:05:43] apergos: no !log ? ;-) [07:05:49] nope [07:09:57] New review: MZMcBride; "So should this changeset be abandoned, then?" [operations/apache-config] (master) C: 0; - https://gerrit.wikimedia.org/r/13293 [07:10:03] apergos: ^ [07:10:43] apergos: It's curious that the bot keeps going quiet. I wonder if it makes sense to keep the bug open until that's diagnosed. [07:10:51] I didn't close it [07:10:56] I know. [07:11:30] that was deliberate [07:11:47] Heh, just got closed. 
[07:11:52] if I had been a bit more awake I would have read the scrollback here, [07:12:05] seen ryan's comment about the segfault and looked into that first [07:12:07] but I wasn't [07:12:38] apergos: the segfault is on virt0 i think [07:12:41] fwiw [07:12:45] oh, nm then [07:12:48] :-D [07:12:58] in that case leave it closed [07:13:16] * jeremyb is a little confused... why is jenkins doing merges? [07:13:40] !g 41912 | e.g. [07:13:41] e.g.: https://gerrit.wikimedia.org/r/#q,41912,n,z [07:13:59] and I don't know if this changeset should be abandoned, ask the last two people who commented on it [07:14:01] apergos: I re-opened it. [07:14:03] With a comment. [07:14:25] apergos: I was asking them. I was just pointing out that gerrit-wm was working again. [07:14:31] :-) [07:15:01] PROBLEM - Puppet freshness on silver is CRITICAL: Puppet has not run in the last 10 hours [07:15:01] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [07:15:20] I see there was a net split that resolved around the time gerrit wm went silent again [07:15:20] jeremyb: Yeah, not sure what's going on with that. [07:15:41] that's a known issue with ircbot or whatever it's called [07:15:58] affects more than just gerrit-wm and probably won't get resolved anytime soon [07:16:25] Ah, okay. [07:16:44] If there's a bug open about that, the gerrit-wm bug can be closed with a pointer. [07:16:48] so search for that in bz, if there isn't something about one of the bots and netsplits, make one, otherwise you could close up the gerrit wm one [07:16:53] yep [07:16:53] Heh. [07:17:22] hm I typed 'as dup' and it lost that in the line [07:17:23] weird [07:17:25] anyways.... [07:18:52] errr, so why did i not get notifs about the last 2 comments on bug 43112? 
[07:19:51] * jeremyb returns to sleep :) [07:21:42] I would return to sleep if only it weren't 9 am :-D [07:21:53] * apergos goes foraging for some small breakfast-like item [08:01:59] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 188 seconds [08:02:17] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 195 seconds [08:40:12] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [08:40:12] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [08:48:24] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/40795 [09:03:27] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [09:11:24] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [09:11:24] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: Puppet has not run in the last 10 hours [09:11:24] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: Puppet has not run in the last 10 hours [09:11:24] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: Puppet has not run in the last 10 hours [09:15:27] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 1 seconds [09:15:36] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds [09:36:20] !log krinkle synchronized php-1.21wmf6/resources/mediawiki.page/mediawiki.page.watch.ajax.js 'I2b1b34c9' [09:36:31] Logged the message, Master [10:34:21] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [10:34:21] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [10:59:38] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused [11:26:28] New patchset: Ori.livneh; "Bits VCL for EventLogging: require query to match" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41942 [11:47:38] New patchset: Ori.livneh; "EventLogging varnishncsa: require query to match" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41942 [11:54:59] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [11:54:59] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [11:54:59] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [11:54:59] PROBLEM - Puppet freshness on ms-be1010 is CRITICAL: Puppet has not run in the last 10 hours [11:54:59] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours [11:55:00] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [12:57:06] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [13:03:07] PROBLEM - Puppet freshness on solr2 is CRITICAL: Puppet has not run in the last 10 hours [13:05:03] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [13:15:06] PROBLEM - Puppet freshness on solr1003 is CRITICAL: Puppet has not run in the last 10 hours [13:15:06] PROBLEM - Puppet freshness on solr3 is CRITICAL: Puppet has not run in the last 10 hours [13:16:00] PROBLEM - Puppet freshness on solr1001 is CRITICAL: Puppet has not run in the last 10 hours [13:24:15] RECOVERY - Puppet freshness on neon is OK: puppet ran at Wed Jan 2 13:23:49 UTC 2013 [13:50:29] ori-l: why is vanadium's listener address on the same line as the title and kraken's on 
the next line? [13:52:12] heh, i see you reworked the commit msg [14:04:09] PROBLEM - Puppet freshness on brewster is CRITICAL: Puppet has not run in the last 10 hours [14:11:41] can someone restart labsconsole's memcached which appears to be down again? [14:13:51] yes [14:14:07] done [14:15:33] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.001 second response time on port 11000 [14:21:14] thanks! [14:29:03] PROBLEM - Puppet freshness on srv191 is CRITICAL: Puppet has not run in the last 10 hours [14:34:15] !log MaxSem> can someone restart labsconsole's memcached which appears to be down again? 14.14 paravoid> done [14:34:21] looks worth logging [14:34:25] Logged the message, Master [14:34:36] thanks, although it happens so often that it probably isn't [14:41:48] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:48:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.022 seconds [14:57:26] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [15:05:10] New patchset: Ottomata; "Setting up .htpasswd file for E3 and metrics-api.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41955 [15:06:00] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41955 [15:06:26] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours [15:06:26] cmjohnson1: morning [15:06:33] cmjohnson1: got a minute? [15:07:19] good morning paravoid..yes [15:07:27] happy new year :) [15:07:46] hope you had fun [15:07:53] New patchset: Ottomata; "Fixing 'content' parameter name" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41956 [15:08:00] :-] same to you....no, i have kids...asleep b4 midnight [15:08:17] hehe [15:08:22] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41956 [15:08:49] I filed a ticket yesterday (yes, yesterday...) about ms-be1003's cable [15:09:49] k..i see it...i can replace easily [15:10:09] do you want to keep the server on or power down? [15:10:13] on [15:10:18] ethernet cable [15:10:32] okay..i need to go to storage and get the cable...i will ping you b4 i swap it [15:10:41] thanks! [15:10:54] 100mbit is just not enough for ms-bes :) [15:11:13] New patchset: Ottomata; "Need to use single quotes for .htpasswd content." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41957 [15:11:30] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41957 [15:15:18] paravoid: the cable is ready to swap....we could bond eth0/eth1 [15:15:26] no, no reason to [15:15:28] just swap it [15:15:37] k..doing that now [15:15:50] !log swapping ethernet cable for ms-be1003 [15:15:59] Logged the message, Master [15:16:56] paravoid: done [15:17:16] cmjohnson1: perfect, renegotiated at 1000 [15:17:17] thanks [15:17:22] yw [15:20:23] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:20:48] paravoid, I've got a volunteer who has problems logging in Gerrit [15:20:59] MaxSem: on IRC? [15:21:07] yes [15:21:21] madcaplaughs, ^^^ [15:21:21] MaxSem: not a very good time now I'm afraid... [15:21:25] although... [15:22:04] madcaplaughs: symptoms? 
[15:22:17] not letting me log in [15:22:37] says cookie must be enabled which is enabled on my browser [15:22:45] so, I've done this only once before, and can't remember the details [15:22:52] how does one go about adding a new domain? [15:22:59] I need to add metrics-api. for E3 [15:23:02] pointed at stat1001 [15:23:15] ottomata: dobson maybe? wildish guess [15:23:19] If there isn't already a page on wikitech I'll write one this time :p [15:23:27] ottomata: there's probably a local repo there that's not on gerrit [15:23:54] madcaplaughs: try clearing all gerrit.wikimedia.org cookies and start again [15:24:11] madcaplaughs: also, tried multiple browsers? [15:24:57] tried on chrome and firefox [15:26:15] username? [15:26:20] now its saying incorrect username and password [15:26:28] username: Debarshiban [15:26:39] thanks jeremyb, i think that's a good guess, but not it. I found an old pdns .bak dirertory there, but not much else [15:26:53] there is a pdns-templates svn repo on sockpuppet [15:27:09] i think that's it, but i'm not sure where I'd update it to get the changes live [15:27:11] maybe sockpuppet then [15:27:12] paravoid? ^ [15:27:27] ottomata: i do have the answer in my irc logs if you really need it ;) [15:27:31] haha [15:27:37] its not an emergency at all [15:27:39] hm? [15:27:49] paravoid: where's the canonical home for DNS? [15:27:53] paravoid: to edit [15:27:53] wondering how to add a new domain name [15:27:55] was going to: [15:28:04] add a CNAME entry in pdns-templates/wikimedia.org [15:28:33] there's quite good doc on the wiki [15:28:37] wikitech [15:28:49] ah! good, i searched but could only find info about setting up DNS as a service etc. [15:30:02] ahhhh wait [15:30:02] i found it [15:30:04] its in that doc [15:30:07] okokok [15:30:09] http://wikitech.wikimedia.org/view/Dns#Changing_records_in_a_zonefile [15:33:10] thanks paravoid. As the doc there says, could you review [15:33:10] /tmp/dns.diff [15:33:10] on sockpuppet for me? [15:33:53] sure [15:34:49] looks good, but use tabs instead of spaces [15:35:05] not that I have anything against spaces, but the rest of the file is like that [15:35:08] so let's be consistent [15:35:41] the resolvers section there is wrong. at least i'm pretty sure lily is dead [15:35:46] oop, yeah ok [15:36:17] ok thanks, with tabs then, i'm going to commit it and do the rest [15:36:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 1.015 seconds [15:36:43] ottomata: note that we have no checkzones or anything (I've tried, but dobson is... hardy) [15:36:54] so if you make a typo, DNS goes down [15:37:08] just saying, be careful :) [15:37:18] haha, yikes, ok, want to look once more over /tmp/dns.diff then? [15:38:14] +1 [15:38:31] DNS is on the immediate TODO btw [15:38:53] doing what? [15:39:00] refactor it in general [15:39:02] ah, aye [15:39:05] cool [15:39:15] git/gerrit, precise, more flexible scenarios (think ulsfo) [15:39:19] ipv6 [15:39:30] it needs some love [15:40:40] aye [15:40:43] ok cool, looks good! [15:43:20] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [16:10:02] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:12:16] paravoid, since you're on duty - how's your Squid-foo? 
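On the metrics-api DNS change discussed above: the actual /tmp/dns.diff is not shown in the log, but the kind of edit being reviewed would look roughly like the sketch below. The record name, TTL and field layout here are assumptions, illustrating only the tab-separated style paravoid asks for; and since, as he notes, there is no automatic zone check, running a BIND-style checker by hand before pushing is a cheap safeguard where the tool is available.

    ; hypothetical addition to pdns-templates/wikimedia.org (tab-separated, note the trailing dot)
    metrics-api	1H	IN	CNAME	stat1001.wikimedia.org.

    # optional local syntax check of the edited zonefile (named-checkzone ships with BIND)
    named-checkzone wikimedia.org wikimedia.org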
[16:14:25] paravoid, can you take a look at https://bugzilla.wikimedia.org/show_bug.cgi?id=35215#c11 [16:20:32] MaxSem: I think I remember a discussion about this [16:20:40] Asher nak'ed iirc [16:21:56] <^demon> Apparently the market for "BZ client apis written in Java" is really small. There's like 3, and they all suck. [16:21:58] <^demon> Who knew. [16:22:02] mmm, http://www.urbandictionary.com/define.php?term=nak [16:22:14] updating ticket [16:22:29] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.053 seconds [16:22:44] MaxSem: http://en.wikipedia.org/wiki/NAK_%28protocol_message%29 [16:22:50] sorry for the slang :) [16:23:44] meh, why didn't anything reach us? [16:23:52] I'll poke him, thanks [16:24:37] Arthur mailed ops@ and Asher replied to him Cc'ing ops@ [16:24:44] but I updated bz now [16:30:17] PROBLEM - Puppet freshness on stat1 is CRITICAL: Puppet has not run in the last 10 hours [16:46:33] New patchset: Cmjohnson; "Updating mac address for solr1002. H/W change" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41963 [16:47:02] New review: Cmjohnson; "looks good to me" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/41963 [16:47:03] Change merged: Cmjohnson; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41963 [16:49:36] ^demon: for gerrit? [16:49:44] <^demon> Yeah. [16:49:48] <^demon> I'm working on a BZ plugin [16:49:50] did madcaplaughs get sorted? [16:49:56] ^demon: to show summary/status? [16:51:46] <^demon> madcaplaughs? [16:51:47] <^demon> huh. [16:54:33] !log pulling sfp uplink module on asw-c-eqiad [16:54:42] Logged the message, Mistress of the network gear. [16:55:14] 02 15:24:57 < madcaplaughs> tried on chrome and firefox [16:55:14] 02 15:26:20 < madcaplaughs> now its saying incorrect username and password [16:55:32] ^demon: can't log in to gerrit. but that question wasn't really for you. unless you have some ideas [16:55:57] yeah i could login [16:56:05] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:56:08] madcaplaughs: so what's the latest then? [16:56:08] thanks, i cleared the cookies, it worked :) [16:56:18] ok, great [16:56:33] he still can't commit though [16:56:39] i am having a minor problem with my git [16:56:42] trying to sort it out [16:56:46] <^demon> Generally "can't login" should be debugged by "Clear your session & cookies" and "Make sure you're using CN not SN" [16:56:49] well, i have the same question then: symptoms? [16:57:12] its saying permission denied (publickey) [16:57:26] <^demon> jeremyb: Well, the api developer is saying he's going to fix it this weekend :) [16:57:28] madcaplaughs: when you do what? [16:57:44] ^demon: erm, SN===CN in this case [16:57:44] when try ssh username [16:58:00] madcaplaughs: err, what's the whole command? [16:58:04] <^demon> jeremyb: I was speaking generally :) [16:58:10] ^demon: yeah :) [16:58:22] ^demon: 02 15:23:53 < jeremyb> madcaplaughs: try clearing all gerrit.wikimedia.org cookies and start again [16:59:06] ssh username@gerrit.wikimedia.org -p 29418 [16:59:19] and where did you submit your key? [17:00:09] i did a add-ssh with the path to my key [17:00:27] er, huh? [17:00:55] what is this add-ssh you speak of? [17:02:03] ssh-add [17:02:10] oh, locally [17:02:20] mutante's alive! [17:02:29] <^demon> Ok, I take back anything bad I said about this library being bad. 
The author is *awesome* [17:02:47] mutante: freues neues jahr did i butcher that? ;-) [17:02:50] <^demon> He already said he's going to fix my bug by this weekend, and then he e-mailed me to wonder what I was using his library for :) [17:03:04] <^demon> And wanting to know if I had other problems. [17:03:14] jeremyb: thanks. :) s/freues/frohes/g :) [17:03:18] you too [17:03:34] yeah locally [17:03:37] wow, 2 out of 3 [17:04:26] jeremyb: froh = glad (adjective), freuen = to gladden (verb) [17:04:43] * jeremyb also heard glicklickes (sp?) [17:04:46] sorry my bad i meant ssh-add [17:05:02] madcaplaughs: well i'll be back for a short bit in 10ish mins and will see if anyone else has figured it out by then. probably you need to pastebin something like `ssh -vv username@gerrit.wikimedia.org -p 29418 gerrit --help` [17:05:10] * jeremyb runs away [17:05:27] jeremyb: http://en.wiktionary.org/wiki/gl%C3%BCcklich [17:05:45] damn umlauts [17:06:10] hehe, check http://en.wikipedia.org/wiki/Metal_umlaut [17:07:22] ^demon: are you working on gerrit-bugzilla integration? :o [17:07:29] <^demon> Yes. [17:09:20] <^demon> I'm having trouble getting it committed though. [17:10:11] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.031 seconds [17:10:37] ^demon: you're awesome :-O [17:15:49] <^demon> Nemo_bis: https://gerrit-review.googlesource.com/#/c/40750/ :) [17:16:20] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [17:16:20] PROBLEM - Puppet freshness on silver is CRITICAL: Puppet has not run in the last 10 hours [17:22:36] hey paravoid [17:22:41] hi [17:22:45] best wishes for 2013! [17:22:52] you too :) [17:23:41] would you maybe have some time to help me and ottomata with deploying some changes to varnish/squid/ log format, in particular we need to add the X_Carrier http header [17:25:13] what do you guys need? [17:25:43] this: [17:25:44] https://gerrit.wikimedia.org/r/#/c/12188/ [17:26:19] I saw that [17:26:23] this has +1 from a number of people [17:27:20] so, what do you need? [17:27:25] yeah, it just needs done, they wanted to wait til after the fundraiser [17:27:37] i don't feel comfortable deploying this one myself [17:27:38] it needs a babysitter [17:27:53] it pushes changes to all varnishes, squids, and nginxes [17:30:50] the changeset is only for varnish & nginx [17:31:00] "only" :-) [17:32:20] right, the squid confs aren't in puppet [17:32:29] the commit message says: [17:32:35] Since the frontend.conf.php squid config template is not checked into [17:32:35] puppet, I cannot include the change to that file as part of this commit. [17:32:35] I've stored the patch in my home directory on fenari: [17:32:35] fenari:/home/otto/frontend.conf.php.accept_language_x_carrier_headers_in_log.patch [17:32:35] This needs to be applied to /home/w/conf/squid/frontend.conf.php. [17:32:56] (I made that patch back in June though, and I don't know if frontend.conf.php has changed since then) [17:35:20] so [17:35:25] how do you want to play this? [17:35:37] want to drive this? I can back you up :) [17:35:55] !log db1014 removing bad hdd from slot11 to replace w/new rt 4039 [17:36:05] Logged the message, Master [17:41:49] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:56:04] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.036 seconds [17:56:24] hey, sorry, paravoid, yeah sure! 
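Since the squid template lives outside puppet, the squid half of this deploy presumably means applying ottomata's stored patch by hand. A minimal sketch, assuming GNU patch and that the diff was taken directly against frontend.conf.php (the strip level, and therefore any -p option, is unknown); a dry run comes first because the patch dates from June and the template may have drifted since:

    # check whether the June-era patch still applies cleanly before touching the live template
    patch --dry-run /home/w/conf/squid/frontend.conf.php /home/otto/frontend.conf.php.accept_language_x_carrier_headers_in_log.patch

    # if the dry run is clean, apply for real, keeping a backup of the original
    patch -b /home/w/conf/squid/frontend.conf.php /home/otto/frontend.conf.php.accept_language_x_carrier_headers_in_log.patch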
[17:56:42] we have a little meeting now, can we do this in 30 mins or an hour? [17:57:07] sure [17:57:19] let's talk again then [17:57:40] New patchset: Reedy; "Update symlinks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41965 [17:58:25] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41965 [18:01:38] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90?authuser=1 [18:01:42] oops [18:01:45] wrong chat [18:08:14] New patchset: Aaron Schulz; "Captchas back to swift for testwikis." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41966 [18:08:46] Change merged: Aaron Schulz; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41966 [18:09:11] lesliecarr: can you check to see if the port is enabled on asw-a7-eqiad 0/32 please (solr1002) [18:09:26] sure [18:09:58] !log aaron synchronized wmf-config/CommonSettings.php [18:10:08] so, you can also check yourself -- if you do "show interface description | match 7/32 " on the switch [18:10:08] Logged the message, Master [18:10:31] or actually 7/0/32 [18:10:49] it's up and in the private vlan [18:12:05] !log reedy synchronized php-1.21wmf7/ 'Initial sync' [18:12:10] New patchset: Bouron; "Added Babel category names for Ossetian Wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41967 [18:12:14] Logged the message, Master [18:12:45] Can someone please power cycle srv191? [18:12:45] reedy@fenari:/home/wikipedia/common$ ssh srv191 [18:12:45] ssh_exchange_identification: Connection closed by remote host [18:13:14] reedy: i got it [18:13:24] !log reedy synchronized wmf-config/ [18:13:32] Logged the message, Master [18:13:34] Thanks [18:14:03] !log powercycling srv191 [18:14:12] Logged the message, Master [18:14:44] !log reedy synchronized live-1.5/ [18:14:52] Logged the message, Master [18:16:28] RECOVERY - SSH on srv191 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [18:16:49] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki to 1.21wmf7 before rebuilding message cache [18:16:58] Logged the message, Master [18:17:17] !log Running sync-common on srv191 [18:17:25] Logged the message, Master [18:23:09] !log aaron synchronized php-1.21wmf6/extensions/ConfirmEdit 'deployed 74c5543fead84a8460be51ffa0b16104cfc3abd1' [18:23:18] Logged the message, Master [18:24:25] Can someone also run this on fenari for me please? chown mwdeploy /home/wikipedia/common/wmf-config/ExtensionMessages-1.21wmf7.php [18:26:15] !log reedy Started syncing Wikimedia installation... : Rebuild message cache for 1.21wmf7 [18:26:23] Logged the message, Master [18:29:31] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:31:42] quarterly uptime report shows DNS uptime 100% [18:31:51] i wish this would stop all the vendors from trying to sell me dns services. [18:32:30] robh: i see that mdadm spam ... i was leaving it for paravoid or mark since I know they're working on ms-be1001+ [18:32:46] oh [18:32:47] missed that [18:32:51] sorry [18:32:58] not a real issue thouhg [18:33:59] oh, man, RobH and cmjohnson1 together at the same time! [18:34:08] any word about analytics1007 from Cisco? [18:35:09] ottomata: nothing new...i haven't looked at it much since i got to eqiad. [18:35:29] as in, no word from them at all in over a month? 
[18:36:08] i did not contact cisco...i don't see in the ticket if robh did or not...(he may be able to answer that) [18:36:55] fatal: Not a git repository: /tmp/new-mw-clone-1579598498/mw/.git/modules/extensions/AbuseFilter [18:36:59] Reedy: what is that on about? [18:37:16] Where are you seeing that? [18:37:28] checkoutmediawiki clones to /tmp then moves to NFS [18:38:14] git pull for wmf7 [18:40:15] I'll wait for scap to finish first [18:40:49] uhhh, contact cisco for....... [18:41:01] hrmm, i contected them for 1004 months ago and got it fixed [18:41:02] bad disk [18:41:09] i dont recall calling about analytics1007 [18:41:13] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Puppet has not run in the last 10 hours [18:41:13] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [18:41:58] RECOVERY - Puppet freshness on srv191 is OK: puppet ran at Wed Jan 2 18:41:23 UTC 2013 [18:42:07] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.027 seconds [18:43:55] RECOVERY - Apache HTTP on srv191 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.079 second response time [18:53:45] notpeter, any chance you could look at Solr monitoring? [18:59:40] sure [19:03:25] PROBLEM - Host analytics1007 is DOWN: PING CRITICAL - Packet loss = 100% [19:03:27] paravoid, wanna? [19:04:19] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [19:04:20] I'll have to go for dinner in about 15-20' [19:04:26] but until then, sure :) [19:05:42] haha, hmm, i don't think 15-20 mins is enough time, i'd like someone around to make sure I don't destroy things [19:07:20] Change merged: Ori.livneh; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41917 [19:07:36] paravoid: we have spare 320s in eqiad. [19:07:50] we have 24 or so we removed from old swift servers. [19:09:55] ottomata: okay, how about in an hour/hour and a half? [19:10:10] yeah that's good [19:12:16] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: Puppet has not run in the last 10 hours [19:12:16] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: Puppet has not run in the last 10 hours [19:12:16] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: Puppet has not run in the last 10 hours [19:12:51] binasher: if you roll out the shm_reclen change today, it might be a good opportunity for doing this one too: https://gerrit.wikimedia.org/r/#/c/41942/ (but this one is not urgent). it adds '\?.' to the varnishncsa filter regexp so that it only matches URLs that have a nonempty query string. rationale in the commit msg. [19:15:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:46] famous last words, etc. but i expect these will be the last tweaks we'll need for a long time. [19:15:57] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41942 [19:16:06] ori-l: yeah, that looks good [19:16:17] sweet, thanks [19:17:03] ottomata: isn't that conflicting with your change? [19:17:37] (41942) [19:17:41] Reedy: still have that error [19:17:49] AaronSchulz: I haven't tried to fix it [19:17:51] paravoid: hrm? why would it? [19:17:58] [18:40:15] I'll wait for scap to finish first [19:17:59] Reedy: but scap is done right? 
[19:18:02] No [19:18:06] ugh [19:18:06] paravoid, dont' think so [19:18:18] its the same file, but it should be a completely different portion [19:18:19] it's onto the last few hosts based on the dsh group [19:20:11] 8 to go [19:20:31] PROBLEM - Host silicon is DOWN: PING CRITICAL - Packet loss = 100% [19:21:31] <-- expected? [19:21:33] !log reedy Finished syncing Wikimedia installation... : Rebuild message cache for 1.21wmf7 [19:21:43] Logged the message, Master [19:22:29] silicon is not down, but it lost network [19:22:39] AaronSchulz: this looks suspicious [19:22:39] worktree = /tmp/new-mw-clone-1579598498/mw/extensions/AbuseFilter [19:22:50] New patchset: Demon; "Remove extension distributor mess from the Apaches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/41976 [19:24:13] Reedy: what are you looking at? [19:24:54] AaronSchulz: /home/wikipedia/common/php-1.21wmf7/.git/modules/extensions/AbuseFilter/config [19:25:44] ffs [19:26:05] Things seem to vary between 1.21wmf5, 1.21wmf6 and 1.21wmf7 [19:26:15] When was fenari reinstalled.. [19:26:18] /upgraded [19:26:32] 17th December [19:26:37] !log changing novaadmin password in labvs [19:26:45] !log make that labs. [19:26:47] Logged the message, Master [19:26:55] Logged the message, Master [19:26:58] After wmf6 was checked out etc [19:26:59] grr [19:27:50] Yay, stackoverflow [19:27:51] "Why are you setting worktree at all? By default, the work tree is where you run your commands from, where the .git directory is" [19:27:55] New review: Demon; "Can't be merged until I9d3c328f is merged and deployed, but at least wanted to get this up for review." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/41976 [19:28:40] Reedy: doesn't the submodule mechanism automatically set worktree? [19:28:55] robla: It's not there on the wmf5 and wmf6 checkouts [19:28:59] Which were done with older versions of git [19:29:32] ahhh....ok [19:29:46] reedy@fenari:/home/wikipedia/common/php-1.21wmf7$ git --version [19:29:47] git version 1.7.9.5 [19:30:01] I believe this is fixed in (at least) the most current version of git (1.7.10.1). I can't seem to find a changelog, so I have no idea when it got fixed. I was able to have git fix the issue by deleting both the submodule and the folder in the .git/modules folder and then redoing git submodule init and git submodule update. [19:30:05] We've a crap version of git [19:30:25] RECOVERY - Host silicon is UP: PING OK - Packet loss = 0%, RTA = 27.29 ms [19:30:33] * Reedy attempts to cleanup [19:30:53] <^demon> There's lots of crap versions of git. We really should pin our version of git we install rather than ensure => latest [19:31:05] ensure latest? why??? [19:31:28] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.027 seconds [19:31:34] * Reedy smiles at paravoid [19:31:49] <^demon> AaronSchulz: I want to add it to base, then we can pin it there :) [19:33:16] presumably, Precise has been using a crap version of Git for months [19:34:56] mmm [19:35:00] * Reedy deletes more stuff [19:35:30] Yup [19:35:34] For anyones info: [19:35:45] Delete rm -r .git/modules/extensions* [19:35:47] rm -rf extensions [19:35:50] git submodule update --init [19:35:55] "fixes" it [19:37:02] Reedy: that's fixing 1.21wmf7? [19:37:09] yeah [19:37:30] I wonder if core.worktree is set.. 
[19:39:15] I was assuming that "git submodule update --init" was how the tree was created in the first place, so I wonder why it fixes it now [19:40:03] it probably doesn't actually fix it [19:40:09] just sets the working tree to be where it is now [19:40:14] I'll verify when it's finished [19:40:23] ah, ok [19:40:39] Reedy: hmmm? [19:41:08] robla: https://bugzilla.wikimedia.org/show_bug.cgi?id=11057 [19:41:22] robla: that's been in since 2007 and it's a really simple change [19:41:32] any way to get this into the queue? [19:41:33] that bug is too old, I refuse to look at it :-P [19:41:39] :D [19:42:18] 255 chars would likely be ideal [19:42:45] <^demon|lunch> Really only thing we need to do is have binasher perform the change on wmf wikis whenever we want. [19:42:48] robla: yup [19:42:51] worktree = /home/wikipedia/common/php-1.21wmf7/extensions/AbuseFilter [19:42:52] <^demon|lunch> There's nothing preventing us from making the core change. [19:43:03] I already increased it once ;) [19:43:16] what's the size now? [19:43:21] 32 (?) [19:43:27] oh, nm, I see in the comments [19:43:34] <^demon|lunch> Yeah, we should just make it 255. [19:43:50] Ryan_Lane: Don't you want group names OVER 9000!? [19:43:58] I only requested 64 ;) [19:44:07] <^demon|lunch> But anyway, no code changes needed for it. We can go ahead and make the change in master for 3rd parties. [19:44:15] <^demon|lunch> And make the change on wmf wikis whenever we feel like it. [19:44:27] wait.. Ryan_Lane is asking for a schema change across all wmf wikis based on something in active directory. [19:44:30] active directory? [19:44:33] Ryan_Lane: WHO ARE YOU? [19:44:34] :D [19:44:47] binasher: well, to be fair, this was in 2007 [19:44:54] paravoid: We've got a crap version of git in 12.04 it seems :( [19:44:59] <^demon|lunch> We're replacing CentralAuth with ActiveDirectory. [19:45:01] and it's needed for any external auth that syncs groups ;) [19:45:09] <^demon|lunch> Since *anything* would suck less than central auth. [19:45:09] we can do 48, but you'll have to get a majority of the majority to agree [19:45:15] Ryan_Lane: You could've also fdji'd yourself and fixed it :p [19:45:21] Reedy: what do you mean? [19:45:26] Reedy: true [19:45:36] does this matter for wmf at all? or is it more for intellipedia? [19:45:48] I avoid schema changes, generally [19:45:57] <^demon|lunch> binasher: In practical terms, no. Theoretically, it could. [19:45:58] * paravoid guesses it's for labsconsole [19:45:59] binasher: it's for third party users [19:46:11] paravoid: nope. we don't sync groups to mediawiki [19:46:16] paravoid: For some reason it sets an absolute worktree rather than a relative one.. So when we checkout to /tmp, copy that away, and delete the /tmp folder, it gets upset [19:46:29] but, a third party user asked about this [19:46:37] <^demon|lunch> Goforit. [19:46:41] <^demon|lunch> No reason not to. [19:46:44] third party users who want to integrate mediawiki with active directory.. i pitty the admin who has to deal with that! [19:47:06] binasher: it's easy. it's just ldap [19:47:08] binasher: we should probably support access as a backend for mediawiki. 
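For reference, Reedy's recovery steps from the worktree exchange above, collected into one sequence. The first three commands are the ones quoted in the log; the final command is an added verification step (not from the log), using the AbuseFilter module as the example, to confirm core.worktree no longer points at the old /tmp clone path:

    # run from the broken checkout, e.g. /home/wikipedia/common/php-1.21wmf7
    rm -r .git/modules/extensions*
    rm -rf extensions
    git submodule update --init

    # verification: should now show the real extensions path, not /tmp/new-mw-clone-*/...
    git config -f .git/modules/extensions/AbuseFilter/config core.worktree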
[19:47:15] <^demon|lunch> binasher: You're describing the first mediawiki install I managed ;-) [19:47:33] to be fair AD's group schemas are quite standard [19:47:41] and used in other ldap configs too [19:48:12] third party user == intellipedia [19:48:22] well, they are one of them :D [19:48:27] !log aaron synchronized php-1.21wmf7/extensions/ConfirmEdit 'deployed 8fddb4a805ee3e79ef07ef3ca78a7ae7df5f4bba' [19:48:28] ^demon|lunch: you went from that to being paid to work on this thing? bwahahah [19:48:31] Ryan_Lane: btw, I see lots of salt-minion dmesg messages again [19:48:33] intellipedia doesn't use the ldap extension, though [19:48:36] Logged the message, Master [19:48:36] exit code 42 or something [19:48:43] paravoid: since the upgrade? [19:49:00] hah, i knew i'd get Ryan_Lane to spill something classified. so they don't use ldap! [19:49:14] I'm sure they probably use ldap [19:49:15] :D [19:49:15] yes [19:49:24] notpeter: not Access..but .. http://www.mediawiki.org/wiki/Extension:MSSQLBackCompat :) [19:49:28] code 42? [19:49:31] ^demon|lunch, when you get back from lunch, was SVN marked as read-only per http://lists.wikimedia.org/pipermail/wikitech-l/2012-October/064024.html ? [19:49:37] [184561.062848] init: salt-minion main process (18253) terminated with status 42 [19:49:37] <^demon|lunch> No. [19:49:39] <^demon|lunch> I was asked not to. [19:49:39] [187336.251528] init: salt-minion main process (19138) terminated with status 42 [19:49:43] [190345.439739] init: salt-minion main process (20047) terminated with status 42 [19:49:46] * Ryan_Lane nods [19:49:53] <^demon|lunch> "People might want to still use svn." [19:50:02] <^demon|lunch> Which isn't really an excuse, but I was busy and didn't fight it. [19:50:28] ^demon|lunch: ok, that makes me NOT delete the MediaWiki-CodeReview list then [19:50:35] paravoid: ok. I'm going to open a new issue with salt for that [19:50:46] may just need to add it into the init script [19:50:48] <^demon|lunch> mutante: Go ahead and delete the list. I don't believe anyone's actually using SVN anymore. [19:50:58] ^demon|lunch: Ryan_Lane: schema change for user_group.ug_group is ok with me [19:51:01] <^demon|lunch> And the 1 or 2 who are, aren't using CodeReview. [19:51:02] Thehelpfulone: also, should it not just receive git mail instead or something? [19:51:07] binasher: cool [19:51:16] RECOVERY - Host analytics1007 is UP: PING WARNING - Packet loss = 44%, RTA = 56.34 ms [19:51:25] ^demon|lunch: ok, i guess its a good way to find out:) [19:51:26] <^demon|lunch> mutante: No, we have mediawiki-commits that gets all the gerrit spam. [19:51:27] I think mediawiki-commits has that mutante [19:51:39] <^demon|lunch> mediawiki-CodeReview can be retired 3 months ago. [19:51:40] yea, i remember the issues with that we had in the beginning [19:51:45] ok [19:52:20] <^demon|lunch> Every last tiny bit of svn that can be killed, KILL WITH FIRE. [19:53:13] !log reedy synchronized php-1.21wmf7/extensions/ [19:53:22] Logged the message, Master [19:53:25] Thehelpfulone: <-- "Pywikipedia-svn" [19:53:52] heh, does that one have archives? 
I think Pywikipedia is the only group still using SVN [19:54:22] yes, it does have archives [19:54:29] April 2009 'til now [19:54:29] from the pywikipedia mailing list I think they're considering moving to github and then in the future to gerrit, or moving directly to git/gerrit so that they're still close to the other WMF stuff [19:54:45] why github..sigh [19:54:52] directly,yay [19:55:16] see thread starting at http://lists.wikimedia.org/pipermail/pywikipedia-l/2012-December/007657.html [19:56:51] <^demon|lunch> Hmm, speaking of CodeReview, we should disable that for everyone and make it read-only. [19:57:18] oh actually it looks like maybe then do want to go direct [19:57:22] Reedy: other than modifying tables.sql, is there something else I need to do for this change [19:57:23] ? [19:57:32] ideally create patches [19:57:37] MaxSem: so, the check needs to be something more like [19:57:38] /usr/lib/nagios/plugins/check_http -H solr1001 -I 10.64.0.198 -p 8983 -u "solr/select/?q=*%3A*&start=0&rows=1&indent=on" [19:57:47] but that's returning a 404 [19:57:48] Or update the old ones that I made before, and possibly update the updater lines [19:57:53] so I think that something is being escaped wrong [19:57:55] Thehelpfulone: http://phenoelit.org/blog/archives/2012/12/21/let_me_github_that_for_you/index.html [19:57:58] still looking at it [19:59:05] Thehelpfulone: http://www.drupal4hu.com/node/242 [19:59:50] mutante, heh you don't need to convince me, but dropping those links on the mailing list would be a good idea to help steer them in the right direction ;-) [20:00:38] New patchset: Demon; "Revoke all permissions to CodeReview actions" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41978 [20:02:51] Thehelpfulone: right.. if only it would not mean having to subscribe, post and unsubscribe.. dont want that whole list [20:03:09] lemme delete that codereview list now [20:03:12] bah okay I'll do it :P [20:03:15] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: test2wiki, wikidatawiki and mediawikiwiki to 1.21wmf7 [20:03:23] Thehelpfulone: :) [20:03:23] Logged the message, Master [20:03:27] Reedy: can I just modify patch-ug_group-length-increase.sql? [20:04:55] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:05:12] Ryan_Lane: And the ufg_group too [20:05:19] yep. seeing that [20:05:27] Ryan_Lane: If I were you, I'd rename the files slightly too and update MySqlUpdater.php etc to match [20:05:36] -_- [20:05:42] I'll just let someone else do this :) [20:05:49] lol [20:05:54] Then you don't have to mess around with fixing the update row thing [20:06:07] I need to modify it for every damn database, too, eh? [20:06:25] Not the files.. [20:06:39] sqlite mostly uses the same files [20:06:43] I'll make the changes in a few minutes [20:06:46] shouldn't take long [20:06:49] thanks [20:07:22] why is our database stuff in maintenance/ ? :) [20:07:39] legacy reasons i guess [20:07:43] <^demon|lunch> maintenance/archives, at that. [20:07:44] * Ryan_Lane nods [20:07:46] yes [20:07:48] <^demon|lunch> What a wonderful name. [20:07:52] why is mediawiki called mediawiki but has no video support in core and only recently in extensions? [20:08:01] you'd think we'd split that into schema/mysql schema/sqlite, etc [20:08:15] and have patches/ directories rather than archives/ [20:08:16] <^demon|lunch> AaronSchulz: Because "TextWiki" was lame sounding. 
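Returning to the Solr check a few lines up: the log does not show what finally fixed the 404, but one plausible culprit is the -u argument, which check_http sends as the raw request path and which therefore normally needs a leading slash (and Solr's select handler is usually reached as /solr/select?... rather than .../select/?...). A guess at the intended invocation, to be verified against the actual URL layout on solr1001:

    /usr/lib/nagios/plugins/check_http -H solr1001 -I 10.64.0.198 -p 8983 \
      -u "/solr/select?q=*%3A*&start=0&rows=1&indent=on"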
[20:08:16] AaronSchulz: irony [20:08:58] ^demon|lunch, being a crat on MediaWiki.org, did you want to remove the user right from all of https://www.mediawiki.org/w/index.php?title=Special:ListUsers&offset=&limit=500&group=coder too? :P [20:10:06] <^demon|lunch> Lazzzzyyyyy [20:16:39] hmm I can't find a script to do it, if you give me +crat and bot I'll do it manually and let you know when it's done so you can remove it completely or I can poke some other hapless crat :-) [20:16:46] Thehelpfulone: hahaha https://lists.wikimedia.org/mailman/private/ops-related/2012-November/000031.html [20:17:00] Thehelpfulone: what right? [20:17:01] it got spammed ...a bit [20:17:08] the list is private, was that my post? [20:17:17] Nemo_bis, "coder" per https://gerrit.wikimedia.org/r/#/c/41978/1/wmf-config/CommonSettings.php [20:17:22] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.798 seconds [20:17:26] coders* [20:17:40] oops, no.. it was from "Administrator Julia" in the Ukraine.. "Here are new ladies profiles for you" :o [20:17:49] haha [20:18:26] our lists are set to not accept posts from non-members by default, so either andrew changed that or he let them through the moderation queue ;-) [20:18:39] Thehelpfulone: but those are not the only rights in that group [20:18:59] Nemo_bis, all the other ones are included in other groups, are they not? [20:19:13] Thehelpfulone: so what [20:19:35] <^demon|lunch> We don't need to delete the coder group. [20:19:48] <^demon|lunch> A) There's other rights on it, and B) It's not hurting anything. [20:19:57] <^demon|lunch> I'm just wanting to lock down the CR tool, which my change does. [20:20:23] +1 [20:21:05] Ryan_Lane: https://gerrit.wikimedia.org/r/41983 [20:21:10] oh oops, I thought that editors/reviews was assignable by users but I'm thinking testwiki [20:21:50] Reedy: most of those still show 32? [20:21:57] https://gerrit.wikimedia.org/r/#/c/41983/1/maintenance/archives/patch-ufg_group-length-increase-255.sql,unified [20:22:11] Yeah, it was recording it as a delete and add, so I reset it to make it do a move [20:22:17] and forgot to change the mysql ones again [20:22:18] Reedy: we need wikidata switched back to wmf6 [20:22:20] heh [20:22:34] some reason all wikidata items are labeled as "english" [20:22:42] A maintenance/archives/patch-ufg_group-length-increase-255.sql 2 lines Side-by-Side Unified [20:22:42] D maintenance/archives/patch-ufg_group-length-increase.sql Side-by-Side Unified [20:22:43] A maintenance/archives/patch-ug_group-length-increase-255.sql 2 lines Side-by-Side Unified [20:22:43] D maintenance/archives/patch-ug_group-length-increase.sql Side-by-Side Unified [20:22:46] Meh, it's done it again anyway [20:22:48] stupid thing [20:24:09] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki back to 1.21wmf6 [20:24:17] Logged the message, Master [20:24:19] Reedy: thanks :) [20:24:40] New patchset: Reedy; "testwiki, test2wiki and mediawikiwiki to 1.21wmf7" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41984 [20:26:50] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/41984 [20:29:25] !log Modified user_former_groups.ufg_group to varbinary(255) [20:29:34] Logged the message, Master [20:30:46] ottomata: ping? [20:31:52] pnoooooonnnng [20:31:55] yeah let's do it [20:32:11] shoudl we do one at a time? 
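The ug_group schema change agreed on above boils down to widening two columns. A sketch of what the new maintenance/archives patch files, and the manual change binasher logs just above and just below, amount to; the NOT NULL default '' attributes and the /*_*/ table-prefix placeholder are assumptions based on MediaWiki's usual table definitions rather than a quote of the actual patch:

    ALTER TABLE /*_*/user_groups
      MODIFY ug_group varbinary(255) NOT NULL default '';
    ALTER TABLE /*_*/user_former_groups
      MODIFY ufg_group varbinary(255) NOT NULL default '';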
[20:33:03] !log Modified user_groups.ug_group to varbinary(255) [20:33:12] Logged the message, Master [20:34:13] Reedy: wasn't it going to 64? [20:34:53] whatevs [20:35:13] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [20:35:13] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [20:37:31] ottomata: one of what? [20:37:50] binasher: lol. Ryan said 255 on the bug, but I think robla was going with 64 for trolling reasons [20:38:05] Ryan_Lane: I'd fix labsconsole for you, but I don't have access :p [20:39:11] the 3 services that are getting changes [20:39:15] nginx, varnish, squid [20:39:22] maybe in that order? [20:40:08] nginx and varnish are a single commit [20:41:03] hm, yea, but I guess I meant babysitting them [20:41:11] running puppet on an nginx, etc. [20:41:24] i haven't done any puppet deployments with large numbers of machines before though [20:41:35] well [20:41:36] is it a pain to stop puppet on all of them while we check? [20:41:40] yes :) [20:41:42] aye [20:42:07] well, i mean, i've tested this on nginx, squid, and varnish instances of my own [20:42:26] oh you have? [20:42:28] even better [20:43:10] yeah, log1.pmtpa.wmflabs has all these changes + a mediawiki instance running, and even some python unit tests to hammer the different frontends and check the resulting log output [20:43:11] I'd say just push it [20:43:19] and force run puppet on a few of them to check that all is well [20:43:27] seems safe enough [20:43:32] Reedy: robla just curious how much a hassle or how easy would it be to have a test wiki (like test2) for wikidata? [20:43:59] (famous last words) [20:44:00] hmm, not exactly sure how to check though, i mean, i guess the first check is that the services start back up ok :p [20:44:19] i guess i can grep udp2log output for a node name [20:44:25] and we can send some requests there to see if the log format changes [20:44:30] yeah, I was hoping that you could check on the receiving end [20:44:48] ok, i'll push, can you give me a squid, nginx, and varnish frontend node name that I can grep for? [20:44:49] RECOVERY - Puppet freshness on virt0 is OK: puppet ran at Wed Jan 2 20:44:24 UTC 2013 [20:44:54] I think we have enough real requests [20:45:03] hm, yeah true ok [20:45:14] pick one, force ran puppet there and see [20:45:25] yeah, but what are they (i've never logged into a front end server) [20:45:35] nginx is easy, it's ssl1-4, ssl1001-1004 and ssl3001-3004 [20:45:41] try eqiad [20:45:46] e.g. ssl1001 [20:46:14] varnish, well, look at site.pp :) [20:47:41] aude: technically fairly easily [20:48:00] Reedy: so, i can't reproduce [20:48:02] http://www.wikidata.org/wiki/Wikidata:Project_chat#Bug_Found [20:48:22] someone says it happened before, so i just wonder if it has to do with localisation cache or something [20:48:57] * aude thinks we'll have a tough time debugging this on our test instances and has somethign to do with deployment / configurations [20:49:40] paravoid, role::cache::upload [20:49:40] ? [20:49:42] for varnish? [20:49:45] the comment says: [20:49:46] # sq79-86 are upload squids [20:49:50] buuuut, maybe it means varnish? 
[20:50:12] seeing as it looks like role::cache::upload sets up varnish [20:51:07] role::cache::upload does both iirc [20:51:19] it's varnish for eqiad, squid for pmtpa [20:51:21] esams is both [20:51:32] squid in production, varnish in testing (again, iirc) [20:51:52] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:51:56] class upload { [20:51:56] # FIXME: remove this hack [20:51:56] if $::site == "eqiad" or ($::site == "esams" and $::hostname =~ /^cp30/) { [20:52:02] ah hm [20:52:02] right [20:52:04] ^^^^ binasher: yes, I was totally trolling back there [20:52:05] Change abandoned: Demon; "Will do this later." [operations/gerrit/plugins] (master) - https://gerrit.wikimedia.org/r/39580 [20:52:24] aude: isn't the wikidata client already running on test2? [20:52:29] ok, paravoid, so: [20:52:42] node /^cp10(2[1-9]|3[0-6])\.eqiad\.wmnet$/ { [20:52:50] eqiad varnish [20:52:55] nginx: ssl1001.wikimedia.org [20:52:55] varnisih: sq79.wikimedia.org [20:52:55] squid: cp1001.eqiad.wmnet [20:52:55] or you can check on the mobile varnish too [20:52:57] (we can move to #wikimedia-tech for this conversation, btw) [20:53:09] hm [20:53:26] hm ok, so cp1021 for varnish? [20:53:41] yeah [20:53:46] hokay [20:54:04] robla: yes but i'm talking about the repo [20:54:04] cp1041-4 are the mobile ones [20:54:11] I'm presuming X-Carrier matters more there [20:55:04] we can't really have both the client and repo extensions on the same wiki [20:56:18] yeah true [20:56:28] that's varnish? ok maybe i'll check 1041 for that then [20:56:37] yeah, mobile runs on varnish exclusively [20:56:44] separate cluster too [20:56:57] which only lives in eqiad [20:57:28] ok cool [20:57:33] going to merge that change in now [20:57:38] sure [20:58:04] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/12188 [20:59:34] ok, running puppet on cp1041 [21:04:28] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.868 seconds [21:05:52] puppet takes a long time to ruunnnnn [21:09:16] ottomata: Is it quicker or slower than scap? [21:09:16] :p [21:10:08] hehe, hmm, paravoid, puppet didn't make my change on cp1041 [21:10:24] did you merge on sockpuppet? [21:10:48] yes [21:10:51] lemme double check [21:11:20] ja, its there [21:15:07] ottomata: it's there but it's not on stafford [21:15:12] you probably forgot to forward your agent [21:15:20] ahh, no i know [21:15:22] yeah i just saw that too [21:15:25] just fixed that [21:15:26] no [21:15:31] when I merged on sockpuppoet [21:15:53] i used ctrl-r to bring up the fetch command, and someone had previously done it before me with the —no-ff flag [21:16:12] i hit enter, but then saw that flag last second and hit ctrl-c (because I wasn't used to seeing that flag) [21:16:23] but I think the merge completed, but didn't execute the hook or whatever it does to update stafford [21:16:30] so, I just fetch + merged on stafford [21:16:32] looks ok now [21:16:40] yeahhhhh! [21:16:55] New review: Pgehres; "Fundraising still has a number of critical things that we maintain in the Wikimedia repo. Now that ..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/41978 [21:17:09] hmmm, i see the changes in the logs too! [21:17:10] cool! [21:17:16] yeah! [21:17:17] cool! [21:17:35] ok, running puppet on ssl1001 [21:18:05] New review: Chad Horohoe; "I totally forgot about the Wikimedia repo (and didn't realize you guys used CodeReview at all). 
No p..." [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/41978 [21:19:27] New review: Pgehres; "No worries. We do want to dump svn like a ton of bricks, so we will be making the effort to move ev..." [operations/mediawiki-config] (master); V: 0 C: -2; - https://gerrit.wikimedia.org/r/41978 [21:19:46] PROBLEM - Varnish traffic logger on cp1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:19:53] ha hm [21:19:54] ottomata: ^ [21:20:14] uh oh! [21:21:05] hmmm, my change isn't on cp1021 yet [21:21:12] yeah, maybe it's unrelated [21:21:25] RECOVERY - Varnish traffic logger on cp1021 is OK: PROCS OK: 3 processes with command name varnishncsa [21:21:47] hm, i'm not getting any logs from ssl1001, ideas on how I can send it a request it will answer? [21:21:50] (looking at configs now) [21:23:58] PROBLEM - Varnish HTCP daemon on cp1021 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:25:37] RECOVERY - Varnish HTCP daemon on cp1021 is OK: PROCS OK: 1 process with UID = 997 (varnishhtcpd), args varnishhtcpd worker [21:26:16] oh! [21:26:23] i know why im' not getting any requests [21:26:28] nginx logs aren't going to oxygen! [21:26:37] drdee, should they be? [21:27:51] huhhhh????/ [21:28:00] that's ssl traffic, right? [21:28:03] yes [21:28:15] I remember multiple issues with that, not sure if that got fixed in the meantime [21:28:20] i am pretty sure i have seen ssl traffic logs in udp2log [21:28:38] I'm pretty sure sequence numbers didn't work [21:28:46] not sure if someone fixed it in the meantime [21:28:55] yes [21:28:58] they are going to emery and locke [21:29:00] but not oxygen [21:29:07] ohhhhhh [21:29:08] and oxygen is the multicast relay [21:29:16] yeah, the seq numbers still don't work [21:29:22] why is that? [21:30:05] they never have [21:30:12] multithreaded worker stuff in nginx [21:30:17] but [21:30:21] that doesn't stop us from sending the logs [21:30:29] well, not really multithreaded related anymore [21:30:30] just stops us from using seq numbers to measure packet loss from nginx nodes [21:30:32] just buggy [21:30:34] oh [21:30:36] hm dunno then [21:30:37] but anyway [21:30:40] there's this too [21:30:49] doesn't nginx just proxy to squid anyway? [21:30:50] or varnish? [21:30:55] yes [21:31:04] so if we were logging nginx reqs then we'd have duplicates in the logs [21:31:11] if you don't detect that, yes [21:31:21] afaik no one is thinking bout it [21:31:24] drdee? 
[21:31:28] PROBLEM - LVS HTTPS IPv6 on wikisource-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:31:31] i know we've sorta talked about that before [21:31:55] PROBLEM - LVS HTTPS IPv6 on wikibooks-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:02] we definitely have talked about this :) [21:32:04] PROBLEM - LVS HTTP IPv6 on wikiquote-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:22] PROBLEM - LVS HTTPS IPv6 on upload-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:24] PROBLEM - LVS HTTPS IPv6 on mediawiki-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:24] PROBLEM - LVS HTTPS IPv6 on wikinews-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:28] i know about the sequence number problem, but that's independent from sending traffic to oxygen [21:32:31] PROBLEM - HTTPS on ssl3002 is CRITICAL: Connection refused [21:32:35] right [21:32:37] totally [21:32:40] PROBLEM - LVS HTTPS IPv6 on bits-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:41] PROBLEM - LVS HTTPS IPv6 on wikiquote-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:44] uh oh [21:32:45] revert [21:32:48] those me? [21:32:49] ok [21:32:49] PROBLEM - LVS HTTPS IPv6 on wiktionary-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:49] PROBLEM - LVS HTTPS IPv6 on wikimedia-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:49] PROBLEM - LVS HTTPS IPv6 on wikiversity-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:32:50] yes [21:32:58] PROBLEM - LVS HTTPS IPv6 on foundation-lb.esams.wikimedia.org_ipv6 is CRITICAL: Connection refused [21:33:03] just nginx or varnish too? [21:33:07] PROBLEM - LVS HTTPS IPv4 on wikinews-lb.esams.wikimedia.org is CRITICAL: Connection refused [21:33:16] RECOVERY - LVS HTTPS IPv6 on wikisource-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64705 bytes in 0.704 seconds [21:33:29] nginx was dead on ssl3002 [21:33:34] RECOVERY - LVS HTTPS IPv6 on wikibooks-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64708 bytes in 0.692 seconds [21:33:35] started it, seems to run now [21:33:37] does it start back up as is? [21:33:38] so hold off that revert [21:33:40] ok [21:33:45] it was fine on ssl1001, ok [21:33:52] RECOVERY - LVS HTTP IPv6 on wikiquote-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64708 bytes in 0.564 seconds [21:34:01] RECOVERY - LVS HTTPS IPv6 on upload-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 774 bytes in 0.514 seconds [21:34:01] the lb.esams thing is unrelated though, right? [21:34:02] RECOVERY - LVS HTTPS IPv6 on mediawiki-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64705 bytes in 0.690 seconds [21:34:10] RECOVERY - LVS HTTPS IPv6 on wikinews-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64708 bytes in 0.797 seconds [21:34:10] RECOVERY - HTTPS on ssl3002 is OK: OK - Certificate will expire on 08/22/2015 22:23. 
[21:34:11] no it isn't [21:34:18] oh i see, it uses that nginx instance [21:34:18] hm [21:34:19] RECOVERY - LVS HTTPS IPv6 on bits-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 3916 bytes in 0.454 seconds [21:34:25] hm [21:34:28] RECOVERY - LVS HTTPS IPv6 on wikiquote-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64708 bytes in 0.709 seconds [21:34:28] or something [21:34:28] RECOVERY - LVS HTTPS IPv6 on wiktionary-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64708 bytes in 0.696 seconds [21:34:28] RECOVERY - LVS HTTPS IPv6 on wikiversity-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64712 bytes in 0.689 seconds [21:34:28] RECOVERY - LVS HTTPS IPv6 on wikimedia-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 93373 bytes in 0.794 seconds [21:34:33] !log starting nginx on ssl3002 [21:34:37] RECOVERY - LVS HTTPS IPv6 on foundation-lb.esams.wikimedia.org_ipv6 is OK: HTTP OK HTTP/1.1 200 OK - 64708 bytes in 0.687 seconds [21:34:42] Logged the message, Master [21:34:44] Ryan_Lane: we're still two SSL boxes down on esams? [21:34:55] RECOVERY - LVS HTTPS IPv4 on wikinews-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 77771 bytes in 0.796 seconds [21:35:15] thank you paravoid [21:35:18] paravoid: yes [21:35:25] let me see if I can get into their console now [21:35:43] ok, well, anyway, I see the logs coming in on emery from other ssl hosts [21:35:45] and they look good [21:35:56] so now, squid? [21:35:57] we lost one ssl box and the site went down? [21:36:10] that doesn't sound right [21:36:21] what do you mean? [21:36:31] sh had spence routed to ssl3002 [21:36:35] ahhhhh [21:36:36] right [21:36:44] sorry. that sounds perfectly right then :D [21:36:55] hm [21:36:55] always forget about that [21:36:57] thinking about it [21:37:03] I'm not even sure that pybal would depool ssl3002 [21:37:10] it should [21:37:10] 3/4 is probably over the threshold [21:37:12] oh [21:37:24] let's check [21:37:28] indeed [21:37:41] we need to be able to run on a single node if necessary [21:38:13] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:38:22] paravoid, the last bit is to change the squid frontend.conf.php template on fenari, and then some magic that I don't know about to deploy the changes [21:39:00] 2013-01-02 21:24:19.922583 [uploadlb6] Could not depool server ssl3002.esams.wikimedia.org because of too many down! [21:39:06] boooooo [21:39:13] thought so [21:39:52] !log powercycled ssl3001 [21:39:56] so, yeah, we lost 50% of the site because one box went down [21:40:01] Logged the message, Master [21:40:11] well, not really 50% [21:40:12] for https and ipv6 [21:40:14] ah [21:40:14] right [21:40:23] so 20% of 50%? [21:40:31] heh [21:40:32] probably much less [21:40:35] yeah [21:40:44] but, that's a situation that needs to be resolved [21:40:54] thankfully I can get into the mgmt consoles now [21:41:04] oh yay [21:41:15] so, I'm bringing 3001 up [21:41:19] and hopefully 3004 [21:41:32] i thought 3004 had a broken hw? [21:41:47] The disk drive for /var/log/nginx is not ready yet or not present. 
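On the "Could not depool ... too many down" message above: pybal keeps a failed realserver in rotation if pulling it would shrink the pool below a configured fraction of its servers. A rough sketch of that guard (illustrative only; the 0.5 threshold is an assumed value, not what the esams ssl pool is actually configured with):

    def can_depool(currently_pooled, total_servers, depool_threshold=0.5):
        # Refuse the depool if it would leave fewer than
        # depool_threshold * total_servers realservers in rotation.
        return (currently_pooled - 1) >= total_servers * depool_threshold

    # The esams https pool above: 4 servers configured, ssl3001 and ssl3004
    # already out, then nginx dies on ssl3002.  Pulling it as well would
    # leave only ssl3003 in rotation, so pybal refuses:
    print(can_depool(currently_pooled=2, total_servers=4))  # False -> "too many down!"

Which is why one dead nginx took a slice of the https/ipv6 traffic down with it: the broken box stayed pooled.
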
[21:41:53] that's on 3001 [21:41:55] weird [21:42:06] well, 3004 was down and we couldn't get into the mgmt console [21:42:08] it may be fine [21:42:22] I have a weird recollection of mounting a separate filesystem for /var/log/nginx [21:42:48] when we deployed ipv6 and had access logging and we lost all the whole cluster due to their disk being full [21:42:51] yeah [21:43:01] not sure why I'd mount an lv instead of just deleting logs though [21:43:09] it's really vague now :) [21:43:13] :D [21:43:16] could be my fault though [21:43:20] it was filling it up very quickly [21:43:26] like per day [21:43:31] until I turned the access logs off [21:43:42] so an lv does make some sense there [21:43:53] I'm removing it from the fstab [21:44:39] nod [21:45:52] ottomata: you can't complain, it'd be boring without some excitement [21:46:17] haha [21:46:27] yeehaw [21:46:44] do you know about how squid confs get deployed (since it isn't in puppet)? [21:46:45] ottomata: always "diff -u" btw [21:46:52] yes, I've done it multiple times [21:47:40] puppet is getting worse and worse [21:47:46] I can't even get it to run on most of the systems [21:47:46] yes :( [21:48:03] we need to debug it at some point [21:48:05] virt0 said it was administratively disabled today, too [21:48:07] which is weird [21:48:16] not sure who did that [21:48:57] so, should I make the change to frontend.conf.php and let you deploy, paravoid? [21:49:56] erm [21:49:58] what sets X-Carrier? [21:50:22] RECOVERY - Puppet freshness on ssl3001 is OK: puppet ran at Wed Jan 2 21:50:09 UTC 2013 [21:50:29] paravoid: i think templates/varnish/mobile-frontend.inc.vcl.erb [21:50:49] sounds right, most of the logs won't ahve it [21:50:58] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.038 seconds [21:51:09] mobile doesn't ever go through squid [21:51:17] why add that to the squid conf? [21:51:52] to be consistent, the logs all need to ahve hte same number of fields [21:52:39] er [21:52:47] sounds a bit silly [21:52:51] but I guess okay [21:52:51] haha [21:53:20] can't say I disagree with you, but i'm sure erik z's scripts would not be happy if things weren't consistent [21:53:31] ok, the fields are now in frontend.conf.phhp [21:53:38] deploy at will, maybe just to one server first if you can [21:53:39] I know, I did that [21:53:49] ahh ok good, i had thought maybe i had and forgot that I had [21:53:53] heheh [21:54:00] it's under git [21:54:00] can you deploy to just one server? cp1001 maybe? [21:54:03] not gerrit though [21:54:05] oh it is? [21:54:13] yes, it was rcs [21:54:19] and I moved it to git :-) [21:54:23] ah nice [21:54:31] well that's good [21:55:26] oh crap [21:55:31] ha, uh oh [21:56:13] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [21:56:13] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: Puppet has not run in the last 10 hours [21:56:13] PROBLEM - Puppet freshness on ms-be1010 is CRITICAL: Puppet has not run in the last 10 hours [21:56:13] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: Puppet has not run in the last 10 hours [21:56:13] PROBLEM - Puppet freshness on sq48 is CRITICAL: Puppet has not run in the last 10 hours [21:56:13] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [21:56:21] sigh [21:56:30] what's up? 
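The consistency requirement mentioned above (every log stream carrying the same number of fields so the downstream stats scripts keep parsing) is easy to spot-check mechanically. A minimal sketch; the expected count of 14 and plain whitespace splitting are assumptions for illustration, not the real frontend log format:

    import sys
    from collections import Counter

    EXPECTED_FIELDS = 14   # assumed; use whatever the agreed log format defines

    # Tally the field count of every non-empty line piped in on stdin.
    counts = Counter(len(line.split()) for line in sys.stdin if line.strip())
    print(counts)

    unexpected = {n: c for n, c in counts.items() if n != EXPECTED_FIELDS}
    if unexpected:
        sys.exit("inconsistent field counts: %r" % unexpected)

Feeding it a few thousand sampled lines from the squid, varnish and nginx streams before and after a change like this should produce the same single count everywhere.
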
[21:56:56] the deploy script runs /usr/sbin/squid -k parse to verify that the config is sane [21:57:04] fenari was upgraded to precise [21:57:16] there's no /usr/sbin/squid anymore, since there's no squid 2.x anymore [21:57:26] and squid3's config is of course different [21:57:43] ah yay! [21:57:54] so deploy script doesn't work on fenari anymore? [21:59:34] New patchset: Pyoungmeister; "fixup for solr monitoring" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42030 [22:01:04] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42030 [22:03:19] !log reprepro copysrc precise-wikimedia lucid-wikimedia squid [22:03:29] Logged the message, Master [22:03:38] !log downgrading squid from 3.1 to 2.7 on fenari [22:03:46] Logged the message, Master [22:05:26] ok. ssl3001 is back up [22:05:40] our bastion hosts are weird [22:05:40] RECOVERY - HTTPS on ssl3001 is OK: OK - Certificate will expire on 07/19/2016 16:14. [22:05:55] what the hell do we need libdirectfb-dev for in a bastion host? [22:05:59] alas, I still can't ssh into 3004's mgmt [22:06:18] binasher: libmysqlclient-dev : Depends: libmysqlclient18 (= 5.5.28-0ubuntu0.12.04.3) but 5.5.28-mariadb-wmf201212041~precise is to be installed [22:06:52] binasher: I don't think fenari should need libmysqlclient-dev for anything, so I'm just going to remove that, but it might be a more general problem [22:09:39] texlive on fenari, oh dear [22:09:44] okay I'm going to stop digging [22:10:06] ottomata: deployed on cp1001 [22:10:29] yeah cool! [22:10:30] i see it too! [22:10:31] yay! [22:10:32] it works [22:12:23] New patchset: Cmjohnson; "Removing pappas and te from production dhcpd file" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42035 [22:13:11] New review: Cmjohnson; "Looks good to me" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/42035 [22:13:29] !log deploying squid config for Accept-Language/X-Carrier, RT 2745 & 3158 [22:13:38] Logged the message, Master [22:13:40] ottomata: deployed everywhere [22:13:44] and commited [22:13:54] New review: Bouron; "Obvious minor edits. Just new array items." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/41967 [22:14:57] woohoo [22:16:16] I still don't see the point of logging what random people on the internet would set as X-Carrier [22:16:46] PROBLEM - Puppet freshness on analytics1002 is CRITICAL: Puppet has not run in the last 10 hours [22:16:46] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Puppet has not run in the last 10 hours [22:16:46] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [22:16:46] PROBLEM - Puppet freshness on analytics1005 is CRITICAL: Puppet has not run in the last 10 hours [22:16:46] PROBLEM - Puppet freshness on analytics1004 is CRITICAL: Puppet has not run in the last 10 hours [22:16:46] PROBLEM - Puppet freshness on analytics1007 is CRITICAL: Puppet has not run in the last 10 hours [22:16:53] i.e. why it's part of frontend config [22:16:55] but oh well [22:17:21] because erik zachte has tons of scripts that parse these logs [22:17:28] and if they aren't consistent, his scripts don't work [22:17:28] ottomata: are you going to resolve those tickets? [22:17:38] oh yeah, I can do that [22:17:49] great [22:17:52] anything else that I can do? [22:18:19] thanks paravoid for helping us out! 
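The breakage is in the sanity check rather than in the config itself: the deploy script shells out to squid to parse the candidate file, and a precise fenari no longer has a squid 2.x binary to do that with. A sketch of that style of pre-deploy check (not the actual deploy script; the paths and file name are illustrative):

    import os
    import subprocess
    import sys

    SQUID = "/usr/sbin/squid"   # squid 2.7; the squid3 package ships /usr/sbin/squid3
                                # instead, and would reject a 2.7-style config anyway

    def parse_check(conf_file):
        # Fail loudly if the parser binary is missing, then let squid itself
        # validate the candidate config with `-k parse` before anything is pushed.
        if not os.path.exists(SQUID):
            sys.exit("%s not found; install squid 2.x before deploying" % SQUID)
        subprocess.check_call([SQUID, "-f", conf_file, "-k", "parse"])

    parse_check("frontend.conf")   # raises CalledProcessError on a bad config
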
[22:18:30] no problem [22:18:34] PROBLEM - Puppet freshness on analytics1002 is CRITICAL: Puppet has not run in the last 10 hours [22:18:34] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Puppet has not run in the last 10 hours [22:18:34] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [22:18:35] PROBLEM - Puppet freshness on analytics1004 is CRITICAL: Puppet has not run in the last 10 hours [22:18:35] PROBLEM - Puppet freshness on analytics1006 is CRITICAL: Puppet has not run in the last 10 hours [22:19:02] binasher: shall I resolve https://rt.wikimedia.org/Ticket/Display.html?id=2635 ? [22:19:46] PROBLEM - HTTPS on ssl3003 is CRITICAL: Connection refused [22:19:47] PROBLEM - LVS HTTPS IPv4 on mediawiki-lb.esams.wikimedia.org is CRITICAL: Connection refused [22:20:04] PROBLEM - LVS HTTPS IPv4 on wikiversity-lb.esams.wikimedia.org is CRITICAL: Connection refused [22:20:04] PROBLEM - LVS HTTPS IPv4 on wikibooks-lb.esams.wikimedia.org is CRITICAL: Connection refused [22:20:07] oh crap [22:20:13] PROBLEM - LVS HTTPS IPv4 on wikimedia-lb.esams.wikimedia.org is CRITICAL: Connection refused [22:20:13] PROBLEM - Puppet freshness on analytics1002 is CRITICAL: Puppet has not run in the last 10 hours [22:20:13] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Puppet has not run in the last 10 hours [22:20:14] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [22:20:14] PROBLEM - Puppet freshness on analytics1004 is CRITICAL: Puppet has not run in the last 10 hours [22:20:19] !log starting nginx on ssl3003 [22:20:28] Logged the message, Master [22:20:49] paravoid - thank you [22:21:00] haha [22:21:24] love how leslie thanks me every time :) [22:21:29] paravoid: what's happening on those boxes? [22:21:34] RECOVERY - HTTPS on ssl3003 is OK: OK - Certificate will expire on 07/19/2016 16:13. [22:21:35] RECOVERY - LVS HTTPS IPv4 on mediawiki-lb.esams.wikimedia.org is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.450 second response time [22:21:36] a new config is being pushed [22:21:40] ahhhhhh [22:21:41] right [22:21:46] puppet is trying to refresh [22:21:52] and somehow ends up killing nginx [22:21:52] would be good to know why they hang when reload is used [22:21:53] RECOVERY - LVS HTTPS IPv4 on wikiversity-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 51007 bytes in 0.671 seconds [22:21:53] RECOVERY - LVS HTTPS IPv4 on wikibooks-lb.esams.wikimedia.org is OK: HTTP OK HTTP/1.1 200 OK - 47134 bytes in 0.667 seconds [22:22:01] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Puppet has not run in the last 10 hours [22:22:02] PROBLEM - Puppet freshness on analytics1002 is CRITICAL: Puppet has not run in the last 10 hours [22:22:02] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [22:22:02] PROBLEM - Puppet freshness on analytics1004 is CRITICAL: Puppet has not run in the last 10 hours [22:22:02] PROBLEM - Puppet freshness on analytics1006 is CRITICAL: Puppet has not run in the last 10 hours [22:22:02] restart works, reload hangs [22:22:20] dude why does nagios keep telling us about puppet freshness [22:22:24] maybe we should force puppet to do a restart for config changes on those [22:22:36] LeslieCarr: because it's still stale :) [22:22:59] excess flood for just 5 messages? [22:23:06] did the thresholds get lowered or something? 
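On "restart works, reload hangs": the workaround floated above is to have the config-change handler on these ssl hosts do a syntax check plus a full restart instead of sending nginx a reload. A sketch of that handler (illustrative only; the real fix would live in the puppet service definition for nginx):

    import subprocess

    def apply_nginx_config():
        # Syntax-check the new config first, then restart rather than reload,
        # since reload is what hangs on these ssl termination hosts.
        subprocess.check_call(["/usr/sbin/nginx", "-t"])
        subprocess.check_call(["service", "nginx", "restart"])

    apply_nginx_config()
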
[22:23:49] PROBLEM - Puppet freshness on analytics1003 is CRITICAL: Puppet has not run in the last 10 hours [22:23:50] PROBLEM - Puppet freshness on analytics1002 is CRITICAL: Puppet has not run in the last 10 hours [22:23:50] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [22:23:50] PROBLEM - Puppet freshness on analytics1004 is CRITICAL: Puppet has not run in the last 10 hours [22:23:50] PROBLEM - Puppet freshness on analytics1006 is CRITICAL: Puppet has not run in the last 10 hours [22:24:43] no [22:24:46] maybe it's tryingto do more [22:24:59] yeah, but it's sending the notification constantly instead of every 10 hours [22:25:15] maybe it notices that it's failed to do so because it was kicked out? [22:25:18] and trying to repeat it? [22:26:10] it's getting late over here and I've been working straight for hours, I don't think I have the courage to debug irc bots :) [22:26:12] any takers? [22:26:37] you are totally free to go to sleep :) [22:27:37] ottomata: hey - is the analytics puppet expected ? [22:27:41] ottomata: do you expect puppet to be stopped/broken on all analytics hosts or just one? [22:27:50] naw, not expected, lemme check on it [22:29:02] ok [22:29:18] well, to be fair puppet is currently slightly fucked [22:29:20] that's the technical term [22:29:22] ;) [22:29:29] stafford is cpu pegged like crazy [22:30:06] ugh [22:30:14] yeah, [22:30:22] according to an03, puppet agent run is currently happening [22:30:27] i guess its just ahnging [22:30:35] will restart puppet agent on those machines [22:30:49] oh it's not really the agent i think [22:30:59] just check out htop on stafford [22:31:00] ;) [22:31:46] it's the ruby flamethrower! [22:32:36] haha, yeahhhhhh [22:32:40] it is overloaded fo sho [22:33:14] how many machines is that one puppetmaster serving? [22:33:57] all of them? [22:34:12] (i have no idea how many all is :p ) [22:34:19] oh [22:34:49] nor do i, really. a few hundred i think? [22:35:00] * Jeff_Green hangs head in shame [22:39:09] New patchset: Reedy; "Remove old pmtpa-dumpX.dblist" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/42038 [22:39:33] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/42038 [22:42:24] paravoid, btw, thanks for your help today! [22:42:26] i'm out for the eve [22:42:28] laataas [22:44:16] New patchset: Reedy; "Remove some more old symlinks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/42039 [22:47:09] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/42039 [23:01:39] Reedy, are you still deploying? 
[23:01:54] No [23:02:17] Not for 2 and a half hours or so [23:03:11] :) [23:04:02] MaxSem: I hopefully have monitoring working for solr [23:04:08] whee [23:04:11] I'm waiting for puppet to run on spence [23:04:13] which takes a while [23:04:27] here's my patch set: https://gerrit.wikimedia.org/r/#/c/42030/ [23:04:34] in case you want to read it over [23:05:20] New patchset: Ryan Lane; "Switch to using the deploy_redis returner" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42042 [23:05:28] notpeter, awesome, thank you so much [23:05:52] New patchset: Pyoungmeister; "temp assigning db61 to es1 shard for testing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42043 [23:07:56] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42042 [23:13:09] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42043 [23:20:27] !log maxsem synchronized php-1.21wmf7/extensions/MobileFrontend 'Revert MobileFrontend to the same revision as 1.21wmf6' [23:20:36] Logged the message, Master