[00:05:44] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[00:07:05] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.160 seconds
[00:13:13] New review: Aaron Schulz; "Can this be merged and improvements done later?" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/7823
[01:12:38] PROBLEM - Puppet freshness on db1042 is CRITICAL: Puppet has not run in the last 10 hours
[01:32:44] PROBLEM - Puppet freshness on es1003 is CRITICAL: Puppet has not run in the last 10 hours
[01:32:44] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours
[01:32:44] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours
[01:41:35] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 262 seconds
[01:45:56] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 0 seconds
[02:17:26] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[02:37:41] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours
[02:41:35] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.162 seconds
[02:46:41] RECOVERY - Puppet freshness on searchidx2 is OK: puppet ran at Wed Jun 6 02:46:12 UTC 2012
[02:47:16] what's going on?
[02:49:14] RECOVERY - Puppet freshness on db1042 is OK: puppet ran at Wed Jun 6 02:48:49 UTC 2012
[02:54:45] Jasper_Deng: sleep?
[02:54:48] i assume
[02:55:01] sleep of servers?
[02:55:03] maybe leslie excepted
[02:55:25] sleep of europeans and aliens visiting europe
[02:55:26] some particular squid and db servers have been complaining this whole afternoon
[02:55:37] yeah
[02:56:00] that's probably fine
[02:56:24] cp100[12] was mentioned earlier. idk if it was investigated but it's known and not new this evening
[02:56:33] the db's don't look important
[02:56:54] (certainly someone would have complained if they were masters)
[02:57:11] yeah, I was just wondering.
[02:57:47] k. my mind reading skills are out of practice
[03:08:53] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[03:26:17] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27407 bytes in 0.141 seconds
[04:23:08] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[04:42:38] PROBLEM - Puppet freshness on es1004 is CRITICAL: Puppet has not run in the last 10 hours
[04:44:44] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[04:45:56] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.160 seconds
[04:58:59] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27407 bytes in 0.113 seconds
[05:07:41] PROBLEM - Puppet freshness on storage3 is CRITICAL: Puppet has not run in the last 10 hours
[05:34:41] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[06:10:05] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[06:14:26] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[06:18:47] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27407 bytes in 0.107 seconds
[06:59:59] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27398 bytes in 0.136 seconds
[07:04:08] good morning!
[07:04:53] morning
[07:12:13] morning
[07:17:58] New patchset: Mark Bergsma; "Add all remaining IPv6 LVS service IPs and services to the LVS balancers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10399
[07:18:20] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10399
[07:21:37] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10399
[07:21:39] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10399
[07:37:51] Logged the message, Master
[07:37:55] Logged the message, Master
[07:37:59] Logged the message, Master
[07:48:26] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[07:53:05] New patchset: Raimond Spekking; "Bug 37365: Install Narayam in Gujarati Wiktionary" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/10401
[07:53:12] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/10401
[07:53:30] !log Converted geoiplookup.wikimedia.org into a separate, IPv4-only geodns record
[07:53:34] Logged the message, Master
[07:59:50] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[08:01:29] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.176 seconds
[08:02:22] ugh
[08:12:40] New patchset: Mark Bergsma; "Add first IPv6 LVS service monitoring to Nagios for testing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10402
[08:13:03] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10402
[08:13:14] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10402
[08:13:16] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10402
[08:16:47] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[08:21:08] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27407 bytes in 0.114 seconds
[08:22:56] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours
[08:26:50] RECOVERY - mysqld processes on db1042 is OK: PROCS OK: 1 process with command name mysqld
[08:30:17] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 41120 seconds
[08:32:50] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 40925 seconds
[08:41:15] !log Converted bits.wikimedia.org into a direct geodns record, removed the old bits -> bits-geo CNAME
[08:41:19] Logged the message, Master
[08:45:26] hey
[08:49:42] mark: how can I find a spare server in eqiad for v6relay?
[08:49:47] or what else can I do
[08:50:29] hi
[08:50:36] lemme find you one in a bit
[08:50:43] normally you should ask rob, who coordinates that
[08:52:48] New patchset: ArielGlenn; "one job queue across all wikis" [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/10403
[08:53:47] so, what's the status?
[08:53:56] what are you working on?
[08:56:49] ok, take server 'nitrogen'
[08:56:58] i'll put it in the public vlan now
[08:56:59] RECOVERY - Puppet freshness on spence is OK: puppet ran at Wed Jun 6 08:56:46 UTC 2012
[08:58:31] ok, done
[08:58:39] you may need to put it in dns (rev/forward)
[08:58:54] i'm waiting on spence to finish its puppet run for nagios
[08:59:00] i've prepared the upcoming dns changes
[08:59:05] and I'm pretty much ready to go
[09:00:17] which IP nitrogen should take?
[09:00:27] do we manage allocations somehow?
[09:00:58] any free one in the respective subnet, 208.80.154.0/26
[09:01:01] we just use rev dns
[09:01:18] add v6 while you're at it ;)
[09:01:35] and hurry... I'm gonna do some important dns changes soon ;)
[09:01:51] gonna get a coffee, and then i'll start
[09:06:25] RECOVERY - Host capella is UP: PING OK - Packet loss = 0%, RTA = 0.19 ms
[09:06:52] RECOVERY - SSH on capella is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[09:10:10] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[09:11:28] mark: :wq
[09:11:29] :P
[09:11:36] (you have wikimedia.org open)
[09:11:53] done
[09:13:35] * Damianz sits back and watches mark break wikipedia
[09:14:29] ~.
[09:15:23] oh? :)
[09:15:35] my connection broke
[09:16:53] http://i2.kym-cdn.com/photos/images/original/000/035/232/Internet-Don_t_worry_Tron.jpg
[09:16:56] * Damianz hides
[09:17:46] New patchset: Mark Bergsma; "Handle IPv6 LVS monitoring a bit differently, add bits/upload.pmtpa" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10404
[09:18:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10404
[09:19:19] New review: ArielGlenn; "(no comment)" [operations/dumps] (ariel); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10403
[09:19:21] Change merged: ArielGlenn; [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/10403
[09:20:06] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10404
[09:20:15] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10404
[09:20:41] New patchset: Faidon; "Add nitrogen to linux-host-entries" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10405
[09:21:04] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10405
[09:22:46] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27407 bytes in 0.107 seconds
[09:23:38] paravoid: so are you done with dns?
[09:23:49] yep
[09:24:01] well, commited, not authdns-update yet
[09:24:10] then do that now
[09:24:19] or I can
[09:24:43] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10405
[09:24:45] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10405
[09:25:04] done.
[09:25:08] thanks
[09:25:19] lvs monitoring in nagios is teh suck
[09:25:29] we really need to fix that some day
[09:28:26] New patchset: Mark Bergsma; "Fix LVS checks" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10406
[09:28:40] PROBLEM - Auth DNS on ns1.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call
[09:28:51] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10406
[09:29:03] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10406
[09:29:06] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10406
[09:29:15] eh, what's this alert?
[09:29:17] did I break it?
[09:29:38] ns1 does not respond
[09:30:06] that's "normal"
[09:30:06] a bug
[09:30:08] just restart it
[09:30:18] happens often during dns updates, it's a deadlock bug
[09:30:23] fixed in newer pdns which we'll roll soon
[09:30:55] I restarted it (on linne)
[09:31:00] I did it as well :)
[09:31:02] heh
[09:31:28] RECOVERY - Auth DNS on ns1.wikimedia.org is OK: DNS OK: 0.030 seconds response time. www.wikipedia.org returns 208.80.154.225
[09:34:20] New patchset: Mark Bergsma; "Monitor the same IPv6 LVS services for eqiad and esams" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10407
[09:34:42] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10407
[09:35:08] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10407
[09:35:11] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10407
[09:37:49] New patchset: Faidon; "Make nitrogen a role::ipv6relay" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10408
[09:38:02] yikes
[09:38:11] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10408
[09:38:28] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10408
[09:38:31] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10408
[09:38:44] New patchset: Mark Bergsma; "Fix LVS checks" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10409
[09:39:05] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10409
[09:39:05] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10409
[09:39:07] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10409
[09:40:48] so, is it live?
[09:40:52] no
[09:41:01] how about now?
[09:41:03] ;)
[09:41:15] * mark shoots ryan
[09:41:18] no, it's not (a)live
[09:41:20] heh
[09:41:22] * apergos gets popcorn
[09:48:59] esams upload https is not working
[09:49:21] for ipv6, or at all?
[09:49:25] ipv6
[09:49:44] nginx doesn't listen on it
[09:49:56] sec
[09:50:01] on ssl3001...
[09:50:04] on 3002 it works
[09:50:35] only ssl3001
[09:50:48] perhaps we should just depool it
[09:50:55] it's a one-off host, with the ipv6.labs stuff on it
[09:50:56] should for now
[09:51:02] will you?
[09:51:05] sure
[09:51:10] !log depooling ssl3001
[09:51:14] hopefully 2 hosts is enough
[09:51:15] Logged the message, Master
[09:51:16] because 3004 is also down ;)
[09:51:20] yeah
[09:51:51] should I just remove the ipv6.labs stuff?
[09:52:03] I think that may still be used in enwiki's common.js or whatever it's called now
[09:52:08] ah
[09:52:09] we should probably get that removed soon
[09:53:03] I don't see why the ipv6 stuff isn't added for 3001
[09:54:00] Instead of an ipv6day can we have a fixeverything day, like you take those 200 bugs that are not important enough to dedicate time to but are annoying as fuck and just smash them in a day?
[09:54:04] Would be more productive :D
[09:54:33] which ops bugs would those be?
[09:54:57] Anything that was written betwean the hours of 11pm and 9am
[09:55:14] * Ryan_Lane shrugs
[09:55:18] Or we could have an 'implimentoauthday' ;)
[09:55:41] most of what you are talking about is dev and not ops
[09:56:22] * Damianz gives Ryan_Lane some od that devops thing
[09:56:42] Ops have commit access too :D
[09:57:26] alright
[09:57:34] it's time to give upload some ipv6 traffic ;)
[09:57:48] * Damianz hides for when you break his bot
[09:58:28] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[09:59:55] New patchset: Ryan Lane; "Fix ordering of includes so that ipv6 will work on ssl3001" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10410
[10:00:00] !log Added AAAA record to upload.wikimedia.org
[10:00:06] Logged the message, Master
[10:00:17] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10410
[10:00:25] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10410
[10:00:28] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10410
[10:00:36] \o/
[10:00:48] I see traffic
[10:02:19] RECOVERY - Host bellin is UP: PING OK - Packet loss = 0%, RTA = 0.50 ms
[10:02:37] PROBLEM - mysqld processes on bellin is CRITICAL: Connection refused by host
[10:02:37] PROBLEM - MySQL Idle Transactions on bellin is CRITICAL: Connection refused by host
[10:02:37] PROBLEM - MySQL Slave Running on bellin is CRITICAL: Connection refused by host
[10:02:51] !log repooling ssl3001
[10:02:55] PROBLEM - MySQL Recent Restart on bellin is CRITICAL: Connection refused by host
[10:02:55] PROBLEM - MySQL disk space on bellin is CRITICAL: Connection refused by host
[10:02:55] Logged the message, Master
[10:03:22] PROBLEM - MySQL Replication Heartbeat on bellin is CRITICAL: Connection refused by host
[10:03:22] PROBLEM - NTP on bellin is CRITICAL: NTP CRITICAL: No response from NTP server
[10:03:38] Weirdly I can't connect to http but I wouldn't be surprised if my tunnel is broken.
[10:03:56] http://nagios.wikimedia.org/nagios/cgi-bin/status.cgi?servicegroup=lvs&style=detail
[10:03:58] * Damianz drop kicks the office firewall into the carpark
[10:04:30] !
[10:04:38] Oooh - you're using pybal magic for v6too? Gotta check that code out later.
[10:04:49] * Damianz waits for you to be boring and point out bgp doesn't care
[10:06:03] mark: not the time, but nitrogen's ready
[10:06:12] cool
[10:06:19] so you want me to add statics?
[10:06:22] PROBLEM - MySQL Slave Delay on bellin is CRITICAL: Connection refused by host
[10:06:22] PROBLEM - Full LVS Snapshot on bellin is CRITICAL: Connection refused by host
[10:06:40] PROBLEM - SSH on bellin is CRITICAL: Connection refused
[10:07:10] sure
[10:07:44] let me know which
[10:07:47] to which nexthop
[10:08:03] 2620:0:861:1:208:80:154:17
[10:08:08] nitrogen's AAAA :-)
[10:08:24] remind me, 2002::/16 and 2001::/32, right?
[10:09:10] yes
[10:10:28] Ohai
[10:10:58] paravoid: done
[10:11:01] thanks :-)
[10:11:08] please check if they do what they need to do ;)
[10:12:59] gah, I can't reassign tickets to myself
[10:13:04] that aren't unowned
[10:13:14] can I just upgrade myself to admin? :)
[10:13:29] you can
[10:13:33] noone can, in fact
[10:13:38] you first need to make them unowned
[10:13:40] then to yourself
[10:14:54] i hear no screams yet
[10:14:56] next: bits? :)
[10:16:49] you'd almost think we're not complete lunatics
[10:16:50] * Damianz imagines mark's desk with one of those big red usb buttons and an arrow pointing to it that reads 'abort'
[10:16:52] ah, no, you just hit the button called "steal"
[10:19:44] !log Added AAAA record to bits.wikimedia.org
[10:19:48] Logged the message, Master
[10:20:49] PROBLEM - Host bellin is DOWN: PING CRITICAL - Packet loss = 100%
[10:23:03] RECOVERY - SSH on bellin is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[10:23:03] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.160 seconds
[10:23:12] RECOVERY - Host bellin is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
[10:28:00] RECOVERY - MySQL disk space on bellin is OK: DISK OK
[10:29:12] RECOVERY - MySQL Slave Running on bellin is OK: OK replication
[10:29:21] RECOVERY - MySQL Idle Transactions on bellin is OK: OK longest blocking idle transaction sleeps for seconds
[10:29:21] RECOVERY - MySQL Recent Restart on bellin is OK: OK seconds since restart
[10:29:48] RECOVERY - MySQL Replication Heartbeat on bellin is OK: OK replication delay seconds
[10:30:06] RECOVERY - Full LVS Snapshot on bellin is OK: OK no full LVM snapshot volumes
[10:30:06] RECOVERY - MySQL Slave Delay on bellin is OK: OK replication delay seconds
[10:31:27] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[10:32:41] everything seems to work
[10:34:16] yeah
[10:34:18] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.186 seconds
[10:34:23] shall we do the rest? :)
[10:34:26] except mobile
[10:35:36] yes!
[10:35:41] alright then
[10:37:00] RECOVERY - NTP on bellin is OK: NTP OK: Offset 0.03532397747 secs
[10:37:02] !log Added AAAA records to all non-mobile wiki projects
[10:37:07] Logged the message, Master
[10:37:40] \o/
[10:37:49] awesome!
[10:37:53] oh god
[10:38:09] !log Wikipedia is IPv6-enabled.
[10:38:13] Logged the message, Master
[10:38:19] yay
[10:39:30] that gets a retweet from me!
[10:39:48] is twitter ipv6 enabled?
[10:39:51] Wth's the sal twitter again?
[10:40:13] no
[10:40:18] Damianz: @wikimediatech
[10:40:26] :)
[10:40:52] Aww ryan went with his own thing
[10:41:01] Hell I'll just spam my followers because I don't spam them enough
[10:41:27] wikimediatech gets spammed enough that I'd prefer it not have a billion followes
[10:41:30] *followers
[10:41:37] lol
[10:41:44] I don't have twitter
[10:41:47] That's why I don't follow it - I'd only ever see sal
[10:42:03] Might follow it on identica assuming you're OS friendly as I don't use identica that much.
[10:42:30] mark: If twitter get ipv6 will you have twitter?
[10:42:31] :D
[10:42:43] I do have twitter, so I'll gladly steal all of your credit :D
[10:43:27] RECOVERY - mysqld processes on bellin is OK: PROCS OK: 1 process with command name mysqld
[10:43:41] that's fine with me
[10:43:46] i'll gladly steal your salary instead
[10:43:50] :D
[10:43:51] you can have the credit
[10:44:00] You mean they pay him !?
[10:46:54] yup, some people are being paid, you know...
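(Editor's note.) The two static routes discussed above point 2002::/16 (6to4) and 2001::/32 (Teredo) at the relay's address on nitrogen. As background on why 2002::/16 covers every 6to4 client: RFC 3056 gives each public IPv4 endpoint the /48 formed by embedding its 32-bit address directly after the 2002 prefix. The sketch below illustrates that mapping with the standard-library `ipaddress` module; `sixto4_prefix` is an illustrative helper, not part of any script mentioned in this log.

```python
import ipaddress

def sixto4_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Return the 2002::/48 6to4 prefix derived from a public IPv4 address."""
    v4 = ipaddress.IPv4Address(ipv4)
    # RFC 3056: the top 16 bits are 0x2002, the next 32 bits are the
    # IPv4 address, and the remaining 80 bits belong to the site (/48).
    v6_int = (0x2002 << 112) | (int(v4) << 80)
    return ipaddress.IPv6Network((v6_int, 48))

# nitrogen's underlying IPv4 maps to a predictable 6to4 site prefix:
print(sixto4_prefix("208.80.154.17"))  # 2002:d050:9a11::/48
```

Any packet destined to a 2002::/... address can therefore be routed to a relay via the single 2002::/16 static, which is exactly what the conversation above sets up.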
[10:47:37] definitely not danny_b... ;-)
[10:48:01] meh. I should have waited to tweet. no one is awake in the US
[10:48:02] the smiley should have been :-(
[10:48:13] See if we didn't pay you we'd only have to run donations once a year (totally ripping off his if everyone donated 5$ or w/e)
[10:51:17] http://en.wikipedia.org/w/index.php?title=Special:RecentChanges&limit=500&hideliu=1
[10:51:54] I'm monitoring the relays
[10:51:59] not too much of a traffic
[10:52:03] good
[10:52:10] people should get native ;)
[10:52:33] :)
[10:52:46] hmm, we put the v6 on different servers
[10:52:58] just lvs
[10:52:59] so I guess we can see the v6 traffic
[10:53:03] yes
[10:53:03] yes
[10:53:40] however there's more pybal monitoring traffic than actual ipv6 it seems ;-)
[10:54:28] heh, yeah
[11:01:17] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[11:04:47] so, I'd like to move LVS servers to separate ganglia groups
[11:04:52] any tips before I start digging?
[11:05:03] Heyo. Quick questions Do you guys know whether rows in the externallinks table are ever deleted?
[11:05:11] you might want a JCB
[11:05:22] declerambaul: in theory... they should be when they are removed from a page
[11:05:23] In theory
[11:05:29] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.195 seconds
[11:06:13] * Damianz offers paravoid tnt
[11:07:27] Reedy: Thanks. Does in theory mean that the data could be pretty noisy?
[11:08:09] Yeah
[11:08:21] I think linksupdate does it... But I'm not in a position to go digging in the code
[11:11:17] Nono that's fine thanks. Somebody did work on external link stats and it seemed that there is a lot of spam in there was likely removed from the actual wiki pages.
[11:12:42] !log Added AAAA record to mobile
[11:12:46] Logged the message, Master
[11:12:54] now i'm really done
[11:13:02] Awww I can't get on facebook with my ipv6 interface up
[11:15:51] yaay
[11:17:03] neat, I get 15ms less with ipv6
[11:17:21] to esams
[11:17:47] New patchset: Mark Bergsma; "Add mobile IPv6 LVS service IP to lvs1004" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10415
[11:17:48] :P
[11:18:10] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10415
[11:18:24] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10415
[11:18:26] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10415
[11:22:43] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[11:22:52] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay 26 seconds
[11:24:04] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 0 seconds
[11:25:25] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.177 seconds
[11:29:19] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[11:29:34] paravoid: for selective-answer.py...
[11:29:42] it would be good to be able to support prefixes instead of just /32 ips
[11:29:56] so we can blacklist an entire prefix instead of having to find out where their resolvers live
[11:30:10] ok, will fix
[11:30:13] i have a feeling we're gonna need that script in not too long ;)
[11:30:15] makes sense
[11:30:30] and perhaps make it v6 aware too
[11:30:55] we're not gonna make our auth servers answer on v6 just yet, but some day we probably will
[11:31:17] do we have geoip data for ipv6?
[11:31:36] not yet
[11:32:47] Nameservers are kinda a weird one, lots of people have broken ipv6 that just happens to get an ra then suddenly they can't resolve anything and kittens get worried... still the isps fault though
[11:33:47] mark: do you mind if I merge https://gerrit.wikimedia.org/r/#/c/9798/ before working on prefixes?
[11:33:58] PROBLEM - Puppet freshness on es1003 is CRITICAL: Puppet has not run in the last 10 hours
[11:33:58] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours
[11:33:58] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours
[11:34:04] go ahead, the script is effectively inactive now
[11:34:16] upload.esams.wikimedia.org is no longer used
[11:34:31] New patchset: Pyoungmeister; "adding in db1042 as s1 slave" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/10416
[11:34:37] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/10416
[11:35:10] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.185 seconds
[11:35:32] Change abandoned: Pyoungmeister; "ben is going to regen." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10227
[11:35:33] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/9798
[11:35:36] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/9798
[11:41:48] mark: gah, dobson is hardy... I wanted to use python-ipaddr
[11:41:59] yeah you should
[11:42:04] ah that was why I didn't do that then ;)
[11:42:06] not in hardy
[11:42:09] use it anyway
[11:42:10] well, I'll backport it
[11:42:19] we'll upgrade that box soon anyway
[11:42:21] mchenry is hardy too btw
[11:42:22] I can always use IPy, but I prefer IPAddr
[11:42:24] both are auth
[11:45:14] New patchset: Pyoungmeister; "changing searchidx partman conf to give larger root partition" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10417
[11:45:37] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10417
[11:46:01] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10417
[11:46:04] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10417
[11:46:56] New patchset: Mark Bergsma; "Add mobile site v6 monitoring in eqiad" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10418
[11:46:58] paravoid: can I do a merge on sockpuppet?
[11:47:06] your stuff is in the queue
[11:47:18] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10418
[11:47:23] yes
[11:47:28] ok, cool
[11:47:59] (mine too ;)
[11:49:43] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[11:53:06] New patchset: Mark Bergsma; "Add monitoring for the mobile v4 site as well" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10419
[11:53:28] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10418
[11:53:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10419
[11:53:30] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10418
[11:55:06] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10419
[11:55:08] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10419
[11:55:16] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.174 seconds
[11:57:14] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused
[12:05:50] New patchset: Mark Bergsma; "Add remaining non-SSL LVS service monitoring" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10420
[12:06:13] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10420
[12:06:52] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.184 seconds
[12:10:37] New patchset: Mark Bergsma; "Add remaining non-SSL LVS service monitoring" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10420
[12:11:01] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10420
[12:11:35] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10420
[12:11:37] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10420
[12:12:55] capella's network just had a big bump
[12:13:05] who knows
[12:13:14] how big
[12:14:18] 4mbit
[12:14:22] from 0.8 or so
[12:14:41] nah, dropped again
[12:18:07] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused
[12:19:15] mark: I'm worried a bit of the efficiency of the subnet matching
[12:20:00] do we only have the relay deployed in one datacenter?
[12:22:01] New patchset: Mark Bergsma; "Rename mobile service IP" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10422 [12:22:23] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10422 [12:22:53] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10422 [12:22:56] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10422 [12:23:55] Ryan_Lane: no, faidon added an eqiad one this morning [12:24:05] ah, so none in esams [12:24:11] nope [12:24:33] * Damianz wonders how long it will take for ipv6 to push a gb/s of not asshat traffic [12:25:06] esams is using a Telia 6to4 relay which is 10ms away [12:25:14] in london [12:25:23] * Ryan_Lane nods [12:25:46] then it's unlikely we'll see much of a bump in traffic on the relay until the US peak [12:26:32] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.168 seconds [12:26:49] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [12:28:53] mark: not surfnet's??? [12:28:56] that's crazy [12:29:13] we were hitting an Amsterdam relay from Florida but we're hitting a London relay from Amsterdam?! [12:29:32] bgp is nice eh [12:29:40] Bgp is magic [12:29:49] Also us peak?! 
My us team is already up lol [12:36:52] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.176 seconds [12:38:40] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [12:39:03] mark: http://www.worldipv6launch.org/form/?q=1 [12:39:12] who can fill that out to add us to http://www.worldipv6launch.org/participants/?q=1 [12:39:32] i don't know if we'll enable it permanently [12:41:00] ah ok [12:42:02] Wait until people don't yell about it being broken first lol [12:43:32] Thehelpfulone: also, it's way past the deadline (May 30th) [12:44:27] yeah I noticed that once I had posted it [12:46:28] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [12:54:49] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [12:56:10] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.161 seconds [13:07:34] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.257 seconds [13:14:09] New patchset: Mark Bergsma; "Combine IPv6 HTTP/HTTPs monitoring, monitor HTTPS as well" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10427 [13:14:31] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10427 [13:15:12] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10427 [13:15:14] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10427 [13:25:07] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [13:25:19] New patchset: Faidon; "selective-answer.py: support prefix matches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10428 [13:25:33] mark: ^^^ [13:25:42] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10428 [13:25:44] I've tested it and it works btw [13:27:03] cool ;) [13:27:12] I'll +1 [13:27:19] if you're really careful with the dns auth servers, you can deploy it imho [13:27:39] I have to backport python-ipaddr first [13:27:46] or upgrade to precise :P [13:28:05] I have a feeling you might not want me to "just upgrade" our auth NS [13:28:09] :P [13:28:12] hehe [13:28:14] no [13:28:23] hmm it's a linear search? [13:28:34] that'll do for a while [13:28:37] how else would you do it though? [13:28:41] although if the list grows too large a trie would be better [13:28:58] yeah, that's why I said earlier that I'm worried about performance [13:29:15] trie implementations for python do exist [13:29:19] hm, there's python-radix [13:29:21] I used one in combination with bgp.py I think [13:29:22] yeah that one [13:29:53] interesting [13:29:59] let's have a look [13:33:02] now I want v6 support in labs. [13:33:06] how hard can it be [13:33:12] can't we just edit some db table? :P [13:35:58] mark: We'll let you upgrade openstack, Ryan deserves a break from being yelled at for causing labs outages :D [13:38:01] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.217 seconds [13:42:44] New patchset: Faidon; "selective-answer.py: support prefix matches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10428 [13:42:50] mark: ^^^ :-) [13:43:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10428 [13:43:51] better :-) [13:43:56] how's radix in hardy? [13:44:55] it's there, an older version, not sure it works [13:44:56] I'll try [13:45:14] oh hm, this is a puppet repo, I should add the dependency! [13:45:22] New review: Mark Bergsma; "Nice. 
:)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/10428 [13:46:58] New patchset: Faidon; "selective-answer.py: support prefix matches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10428 [13:47:20] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10428 [13:49:39] funny, you already had prefixes in participants [13:49:42] they just never matched! [13:49:58] I did? [13:50:02] I don't think I added those [13:52:14] New patchset: Mark Bergsma; "Fix check_command name" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10431 [13:52:33] mark: just tested on dobson [13:52:36] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10431 [13:52:50] installed python-radix by hand and have /root/selective-answer.py, /root/selective-test and /root/participants [13:52:56] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10431 [13:52:59] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10431 [13:53:01] ./selective-answer.py < selective-test [13:53:02] seems to work! [13:53:08] deploy it then [13:53:12] just be really careful [13:53:38] you did review it for me, don't you want to give the +1? :) [13:53:48] I did [13:53:56] I'll deploy it in a sec, I'll take a break [13:54:03] and won't attempt changing it before the break :) [13:54:05] ttyl [13:54:05] mark nagios is borking. message is: Error: Service check command 'check_https_lvs' specified in service 'LVS HTTPS IPv6' for host 'wiktionary-lb.pmtpa.wikimedia.org' not defined anywhere! [13:54:12] times 32 [13:54:13] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/10428 [13:54:24] notpeter: just fixed that [13:54:30] mark: cool! 
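[Editor's note: selective-answer.py itself isn't quoted in this log, so the following is only an illustrative sketch of the linear prefix scan being discussed: each query address is checked against every participant prefix, which is O(n) per lookup; as mark notes, a radix trie (e.g. the python-radix module mentioned above) scales better once the list grows. The PARTICIPANTS entries and function name are made-up examples, not the real file's contents; stdlib `ipaddress` stands in for the python-ipaddr backport:]

```python
import ipaddress

# Made-up example prefixes; the real participants file's contents and
# format are not shown in this log.
PARTICIPANTS = [
    ipaddress.ip_network("2001:db8::/32"),
    ipaddress.ip_network("2001:db8:1234::/48"),
]

def longest_match(addr):
    """Return the most specific participant prefix containing addr, or None.

    Linear scan: O(n) per query. Fine while the list is short; a radix
    trie (e.g. py-radix) would give faster lookups if it grows large.
    """
    ip = ipaddress.ip_address(addr)
    best = None
    for net in PARTICIPANTS:
        # Mixed v4/v6 comparisons are simply False, never a match
        if ip in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best

print(longest_match("2001:db8:1234::1"))  # 2001:db8:1234::/48
print(longest_match("2001:db9::1"))       # None
```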
[14:17:28] http://nagios.wikimedia.org/nagios/cgi-bin/status.cgi?servicegroup=lvs&style=detail is now complete [14:18:19] woow [14:19:13] what about search? [14:19:44] search? [14:19:53] is it ipv6 enabled? [14:20:01] search is internal [14:20:03] or does the outside world never hit it [14:20:05] ah. right [14:20:06] no [14:20:14] some eqiad ipv4 lvs is not listed yet [14:20:15] it goes through the api [14:20:18] yeah [14:21:17] Totally should use ipv4 everywhere and just have a few public ipv4 addresses on the active lb instances :D [14:21:21] s/ipv4/ipv6/ [14:21:28] wikibooks in eqiad isn't showing ipv4 https [14:21:44] neither is wikinews [14:22:13] that's what I said [14:22:57] heh. I misread some [14:22:58] I need a nap [14:23:02] me too [14:24:18] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [14:24:36] have a look at lvs.pp to see why ;) [14:24:40] I did it slightly differently now [14:24:48] but still need to migrate the old v4 stuff [14:37:09] ah. ok [14:37:53] New patchset: Mark Bergsma; "Make IPv6 LVS service monitoring critical" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10435 [14:38:12] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/10435 [14:39:54] New patchset: Mark Bergsma; "Make IPv6 LVS service monitoring critical" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10435 [14:40:03] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.280 seconds [14:40:18] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10435 [14:40:34] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10435 [14:40:37] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10435 [14:43:21] PROBLEM - Puppet freshness on es1004 is CRITICAL: Puppet has not run in the last 10 hours [14:53:33] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [14:56:33] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [14:59:58] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.161 seconds [15:08:49] PROBLEM - Puppet freshness on storage3 is CRITICAL: Puppet has not run in the last 10 hours [15:09:39] New review: Faidon; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10428 [15:09:41] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10428 [15:11:40] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.207 seconds [15:14:04] ugh oh [15:14:43] I think i just broke dns [15:14:46] mark: around? [15:14:55] how so? [15:15:03] what's broken about it? [15:15:14] i pushed a change to selective answer and now I don't get replies from the NS [15:15:19] even though I don't understand why [15:16:14] oh no [15:16:17] pebcak [15:16:18] phew. [15:16:22] hehe [15:16:39] I had like 10 terminals open [15:16:44] worrying that this might happen [15:16:51] that I actually misread and thought it happened :) [15:16:55] sorry for the noise. 
[15:17:16] I keep hearing mark in my head saying how I should break the site to become a proper member of the team :P [15:17:21] New patchset: Jgreen; "puppet voodoo to rename user mmullie-->mlitn per RT 3080" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10441 [15:17:37] I'm getting replies from dns [15:17:43] yes [15:17:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10441 [15:17:45] is it only ipv6 that's broken? [15:17:47] nevermind [15:17:49] nothing's broken [15:17:52] except my head [15:17:52] :D [15:18:57] the unrelated puppet errors didn't help me not panic :) [15:19:02] can someone take a look at the puppet user rename foo I'm about to merge? [15:19:46] if I knew it was a false alarm, I would have totally made the bot screw with you [15:20:26] Ryan_Lane: yes because we need more trauma around here :-] [15:20:41] we at least need more pranking :) [15:20:45] mark: so, the changes are in; whenever you want we can switch into a blacklist [15:20:52] and put prefixes in [15:20:58] yes [15:21:01] i'm just writing a mail [15:21:05] I propose to keep it as is now [15:21:11] that record in it is now unused, so can be used for testing [15:21:15] domas replied that there is no shared blacklist after all [15:21:18] then if we need to, we just change it into an active record [15:21:24] ok [15:23:49] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [15:23:57] should I install a v6relay in esams too? [15:24:14] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10441 [15:24:16] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10441 [15:24:26] meh [15:24:28] there are so many around [15:24:34] let's deal with that later :) [15:24:44] okay [15:24:48] wanna do bgp? [15:24:51] for relays? 
[15:24:52] not today [15:24:56] okay :) [15:25:01] I can attempt to do it myself [15:25:03] and thus not for the rest of the week :) [15:25:18] if you add me to the junipers [15:25:26] even though I'd prefer if you'd be around [15:25:32] I wouldn't be surprised if the statics actually disappear when the nexthops don't arp [15:25:32] so, not for the rest of the week :-) [15:25:38] yeah let's do so next week [15:25:42] sure [15:25:47] are you taking Thu/Fri off? [15:25:47] i'm sure it'll be fine ;) [15:25:50] mostly [15:25:56] i'll pay attention but won't do work [15:25:59] i'm tired :) [15:26:03] yeah, same here [15:26:10] I stayed up until the actual launch yesterday too [15:26:14] which was 3am :) [15:26:18] hehe yeah [15:26:28] plus another half an hour for the traffic to appear [15:26:35] see ops list [15:28:41] * domas points at https://www.facebook.com/notes/facebook-engineering/under-the-hood-network-implementation-for-world-ipv6-launch/10150873176303920 [15:29:14] oh, cool, thanks! [15:29:34] nice [15:31:06] surprisingly detailed [15:32:28] domas: btw, does mysql 5.6 finally support ipv6? [15:32:45] probably not [15:35:49] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [15:39:20] mark, I will likely be around only half days thurs and fri (I guess you are CT's proxy) because I will have out of town guests. You know them... 
Berlin hackathon refugees :-P [15:39:31] hehehe [15:41:13] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.277 seconds [15:52:55] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [15:55:52] PROBLEM - Host db1047 is DOWN: PING CRITICAL - Packet loss = 100% [16:12:40] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.272 seconds [16:23:01] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [16:27:31] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [16:31:52] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.163 seconds [16:43:25] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.204 seconds [16:53:37] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [16:55:42] hi [16:55:46] hi! [16:55:57] mark - just read your email [16:56:19] woot! ipv6 went out [16:56:40] hi jeremyb [16:57:55] haha, you missed the boat on that one [16:58:14] * jeremyb is still catching up [16:58:53] ya … [16:58:58] excited though [16:59:05] that was the first email I read [17:00:44] is "Increased uselessly low $wgBlockCIDRLimit default for IPv6" going out? [17:00:48] !g 10387 [17:00:48] https://gerrit.wikimedia.org/r/10387 [17:02:42] !log mailman 'site' password changed per RT 3039 [17:02:47] Logged the message, Master [17:03:30] jeremyb: I just approved it [17:05:03] Jeff_Green: and casey's (kibble) password is unchanged? [17:05:29] that's a list admin password? yeah I didn't touch that level [17:05:44] Jeff_Green: mmsitepass has a flag for list creator [17:06:03] I did not do list creator [17:06:07] k [17:06:41] is casey using that password, or is there a per-list admin password? [17:07:15] every list has a list admin and moderator pass. 
there's also global admin and creator passes [17:07:23] i think creator can only be used to create [17:07:25] ok that's what I thought [17:07:36] site admin can be used just about anywhere a pass is needed [17:07:42] right [17:07:56] list admin can be used anywhere moderator is needed. moderator is the least privileged [17:08:14] k [17:08:48] RoanKattouw: erm, that should be reported in #mediawiki no? [17:08:49] afaik the request was only for site admin, but it was a little unclear [17:09:21] Grr did someone quiet the bot again? [17:09:41] which bot? [17:10:56] gerrit-wm [17:11:06] I just unquieted it [17:11:10] so many bots [17:11:32] but it wasn't quieted since it last spoke? afaict [17:11:46] at least not with the mask you removed [17:13:54] Ryan_Lane: get the SIM all worked out? [17:14:01] no [17:14:01] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.187 seconds [17:14:43] ;( [17:24:48] Jeff_Green: who requested for it to be changed (the site password)? [17:25:06] philippe [17:25:24] mark: ping [17:25:27] ah that's fine, else I was going to tell him :) [17:25:39] Thehelpfulone: :-) [17:25:39] notpeter: ping [17:25:54] Ryan_Lane: ping [17:30:04] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [17:30:55] preilly: yes? [17:32:43] New patchset: awjrichards; "Config change for rt 3073, enables wiki session cookies to be passed from varnish" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10459 [17:33:07] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10459 [17:33:12] notpeter: can you push a varnish change once awjr gets it done? [17:33:34] sure [17:33:38] notpeter: e.g., this one — https://gerrit.wikimedia.org/r/#/c/10459/ [17:33:41] preilly, notpeter: it is done: https://gerrit.wikimedia.org/r/#/c/10459/ [17:33:43] o [17:33:50] awjr: ha ha [17:34:22] does that mean purge cache too? 
[17:34:34] jeremyb: no [17:35:50] so, is it done? [17:35:53] good to go out? [17:36:02] preilly why is the varnish config so selective about cookies it passes? [17:36:23] notpeter, it's done, just needs review and push [17:36:36] ok [17:37:08] New review: Pyoungmeister; "both patrick and arthur said this was good to go out, so I'm going to push it live!" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/10459 [17:37:11] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10459 [17:37:38] ok, it has been merged on sockpuppet [17:41:06] thanks notpeter [17:41:32] no prob [17:44:37] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [17:44:46] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.197 seconds [18:00:43] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [18:02:49] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.160 seconds [18:08:40] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [18:14:40] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.190 seconds [18:17:23] New review: Demon; "(no comment)" [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/9109 [19:04:43] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [19:05:41] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.165 seconds [19:18:21] Logged the message, Master [19:22:02] PROBLEM - Host storage3 is DOWN: PING CRITICAL - Packet loss = 100% [19:40:20] RECOVERY - SSH on storage3 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [19:40:29] RECOVERY - Host storage3 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms [19:40:47] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay seconds [19:40:56] RECOVERY - MySQL disk space 
on storage3 is OK: DISK OK [19:42:08] RECOVERY - Puppet freshness on storage3 is OK: puppet ran at Wed Jun 6 19:41:54 UTC 2012 [19:46:07] HOLY MIRACLE OF MIRACLES! [19:46:12] cmjohnson1: works [19:46:34] woah [19:46:54] * Jeff_Green dies, then comes back from the dead and demands brains. [19:55:31] Logged the message, Master [19:55:45] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [19:59:57] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [20:05:57] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.167 seconds [20:14:39] RECOVERY - Host search32 is UP: PING WARNING - Packet loss = 80%, RTA = 0.25 ms [20:16:18] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.202 seconds [20:30:24] PROBLEM - Puppet freshness on bellin is CRITICAL: Puppet has not run in the last 10 hours [20:35:08] cmjohnson1: about to go to bed, but what's up? [20:36:59] no prob! [20:47:03] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [21:06:44] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.163 seconds [21:08:37] As requested, I have created https://www.mediawiki.org/wiki/GerritShouldDieInAFire [21:15:53] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [21:17:03] dschoon: <3 name [21:18:44] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.315 seconds [21:21:47] New patchset: Jgreen; "change aluminium/grosley default mysql client charset from latin1 to binary" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10534 [21:22:10] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10534 [21:22:35] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10534 [21:22:38] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10534 [21:24:59] New patchset: Jgreen; "stupid permissions fix" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10535 [21:25:22] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10535 [21:25:22] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10535 [21:25:22] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10535 [21:31:16] New patchset: Jgreen; "disabling cron scripts on storage3 for now" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10536 [21:31:38] New review: Jgreen; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10536 [21:31:38] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/10536 [21:31:39] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/10536 [21:35:14] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [21:35:14] PROBLEM - Puppet freshness on es1003 is CRITICAL: Puppet has not run in the last 10 hours [21:35:14] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [21:35:23] RECOVERY - mysqld processes on storage3 is OK: PROCS OK: 1 process with command name mysqld [21:39:44] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 2683934 seconds [21:46:29] PROBLEM - Backend Squid HTTP on cp1001 is CRITICAL: Connection refused [22:09:07] New patchset: Platonides; "Convert http:// links in protocol-relative ones at MobileFrontend variables. $wgMFFeedbackFallbackURL seems unused, though." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/10537 [22:09:13] New review: jenkins-bot; "Build Successful " [operations/mediawiki-config] (master); V: 1 C: 0; - https://gerrit.wikimedia.org/r/10537 [22:19:52] RECOVERY - Backend Squid HTTP on cp1001 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.171 seconds [22:27:50] New review: Aaron Schulz; "(no comment)" [operations/mediawiki-config] (master); V: 0 C: 2; - https://gerrit.wikimedia.org/r/9597 [22:27:52] Change merged: Aaron Schulz; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/9597 [22:39:40] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [22:42:25] New review: preilly; "(no comment)" [operations/mediawiki-config] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10537 [22:42:27] Change merged: preilly; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/10537 [22:50:55] PROBLEM - swift-container-auditor on ms-be2 is CRITICAL: PROCS CRITICAL: 0 processes with regex args 
^/usr/bin/python /usr/bin/swift-container-auditor [23:08:28] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [23:09:49] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.160 seconds [23:10:25] RECOVERY - swift-container-auditor on ms-be2 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor [23:20:46] dschoon: do you ever go to these events: http://www.meetup.com/San-Francisco-Metrics-Meetup/events/64435452/ ? [23:36:23] PROBLEM - Backend Squid HTTP on cp1002 is CRITICAL: Connection refused [23:40:43] RECOVERY - Backend Squid HTTP on cp1002 is OK: HTTP OK HTTP/1.0 200 OK - 27399 bytes in 0.162 seconds