[00:30:54] !log streaming hotbackup of db37 to db26, preparing to reprovision db26 in s7
[00:30:55] Logged the message, Master
[00:31:59] New patchset: Asher; "moving db26 from s1 to s7" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2076
[00:32:16] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2076
[00:32:17] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2076
[00:52:10] !log shutting down mysql on db32, going to reconfigure with lvm and reslave
[00:52:11] Logged the message, Master
[00:57:49] !log streaming hotbackup of db53 to db32
[00:57:50] Logged the message, Master
[01:18:02] Ryan_Lane: ready for me to break labs or shall i wait until the morning ?
[01:18:10] go for it
[01:20:09] Anyone have any idea if bash environmental variables assigned from the command line are private to just that user, or can others see it (without root?)
[01:20:12] New patchset: Lcarr; "Pushing labs configuration into the main config module + making labs mirror the prod puppet repo configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2077
[01:20:13] check this out Ryan_Lane ?
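A note on the 01:20:09 question about environment variable visibility. The sketch below is one way to check it on a Linux host, assuming `/proc` is mounted and `stat -c` is available (GNU coreutils or BusyBox). The helper name `proc_file_mode` is made up for illustration, and other Unix systems may behave differently.

```shell
#!/bin/sh
# Linux-only sketch: compare who may read a process's environment
# versus its command line. Assumes /proc is mounted and `stat -c` exists.

# Print the octal permission bits of /proc/<pid>/<file>.
# (proc_file_mode is a hypothetical helper, not part of any tool.)
proc_file_mode() {
    stat -c '%a' "/proc/$1/$2"
}

pid=$$   # inspect the current shell

# environ is readable by the owning user only; cmdline is readable by everyone.
echo "environ mode: $(proc_file_mode "$pid" environ)"
echo "cmdline mode: $(proc_file_mode "$pid" cmdline)"
```

If the modes come back as expected, a secret held in an environment variable is hidden from other unprivileged users, while the same secret passed as a command-line argument would show up in `ps`. Root and the owning user can still read both.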
[01:21:42] RobH: I'm fairly sure they're private [01:21:46] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2077 [01:21:54] LeslieCarr: ^^ [01:22:04] I hope so, the ipmi script i am doing can read that variable and thus not prompt the sysadmin a dozen times for the damned thing [01:22:45] okay, guess i should bite the bullet and see if it works :) [01:23:52] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2077 [01:25:50] New review: Lcarr; "approving to wipe it out :)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2072 [01:25:51] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2077 [01:25:51] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2072 [01:30:23] New patchset: Lcarr; "removing conflicting apachesite definition" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2078 [01:30:40] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2078 [01:30:57] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2078 [01:43:54] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [01:44:45] New patchset: Lcarr; "Revert "removing conflicting apachesite definition"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2079 [01:44:59] New patchset: Lcarr; "Revert "Pushing labs configuration into the main config module + making labs mirror the prod puppet repo configuration"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2080 [01:45:15] New patchset: Lcarr; "Revert "addingin cron job to sync software repo in labs"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2081 [01:45:36] New review: Lcarr; "(no comment)" [operations/puppet] (production); 
V: 1 C: 2; - https://gerrit.wikimedia.org/r/2081 [01:46:06] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2081 [01:46:14] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2080 [01:46:19] New review: Lcarr; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2079 [01:46:19] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2079 [01:46:49] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2080 [01:46:56] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2081 [02:08:39] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 809s [02:24:09] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1739s [02:34:09] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [02:38:49] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [03:04:29] PROBLEM - DPKG on db57 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [03:18:09] PROBLEM - DPKG on db58 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [04:18:53] RECOVERY - Disk space on es1004 is OK: DISK OK [04:20:53] RECOVERY - MySQL disk space on es1004 is OK: DISK OK [04:39:53] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No [04:53:03] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2413* [05:19:26] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2425* [05:39:26] PROBLEM - 
ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2425* [09:03:46] hello :) [09:07:44] New patchset: ArielGlenn; "add new host for xml dump rsync mirror" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2082 [09:08:01] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2082 [09:09:03] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2082 [09:09:03] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2082 [09:41:48] New patchset: Pyoungmeister; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083 [09:42:02] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." 
[operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2083 [09:43:22] PROBLEM - Puppet freshness on copper is CRITICAL: Puppet has not run in the last 10 hours [09:43:22] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [09:43:22] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [09:45:22] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [09:48:22] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [09:49:22] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours [09:53:22] PROBLEM - Puppet freshness on ms-fe2 is CRITICAL: Puppet has not run in the last 10 hours [09:55:22] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [09:57:12] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 438405 MB (3% inode=99%): [09:57:22] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [10:00:52] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 410996 MB (3% inode=99%): [10:01:22] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [10:07:22] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours [11:00:22] RECOVERY - MySQL slave status on es1004 is OK: OK: [11:30:54] apergos: mutante: are you aware of any issue with gerrit? [11:30:56] seems dead [11:31:08] http / https to https://gerrit.wikimedia.org/ does not work [11:31:22] ssh to port 29418 for pulling repo does not work [11:31:28] but formey is pingable [11:32:10] hello [11:32:14] no, I used it earlier [11:32:17] oh and hello sorry :) [11:33:21] it seems like git pull is hung all right, that's new then [11:33:51] * hashar checks labs [11:34:23] labs works so that should not be LDAP [11:34:29] sounds like gerrit died somehow? 
:( [11:35:56] looks like it is formey [11:35:57] http://nagios.wikimedia.org/nagios/cgi-bin/status.cgi?navbarsearch=1&host=formey [11:36:05] not sure why we did not receive the notification yet though [11:37:52] load. it must have gone out to lunch [11:37:56] http://ganglia.wikimedia.org/2.2.0/?r=hour&cs=&ce=&m=load_one&s=by+name&c=Miscellaneous+pmtpa&h=formey.wikimedia.org&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4 [11:38:57] PROBLEM - HTTP on formey is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:39:20] no one else is on the mgmt console right? (I"m getting the old port in use message) [11:39:28] gonna reset it otherwise... [11:42:15] time to powercycle [11:42:27] PROBLEM - HTTPS on formey is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:24] I should repeat that in here [11:45:29] a bunch of stuff from boot up like [11:45:38] init: nss-ldap: do_open: do_start_tls failed:stat=-1 [11:45:44] init: nss_ldap: could not search LDAP server - Server is unavailable [11:45:50] [Wed Jan 25 11:44:53 2012] [warn] NameVirtualHost *:80 has no VirtualHosts [11:45:57] [Wed Jan 25 11:44:53 2012] [warn] NameVirtualHost *:443 has no VirtualHosts [11:45:58] etc. [11:47:27] at least git is working again [11:47:54] if someone with a lab project wants ot make sure it is working... [11:48:47] RECOVERY - HTTP on formey is OK: HTTP OK HTTP/1.1 200 OK - 3596 bytes in 0.007 seconds [11:48:51] hashar: do you have a labs project? [11:49:14] yeah but can't not connect on it [11:49:24] so that is not reliable for testing :) [11:49:51] :-D [11:50:36] thanks for fixing formey! [11:50:47] sure [11:52:27] RECOVERY - HTTPS on formey is OK: OK - Certificate will expire on 08/22/2015 22:23. [11:53:17] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours [11:53:57] ah.. thanks! 
also just noticed gerrit being down [11:57:07] apergos: maybe the new "atop" command would let you know which process gone mad on formey [11:57:15] not sure how to use atop though [11:58:07] mutante, do you have a lab project? [11:58:18] I would love it if someone would just check that they are working... [11:58:55] apergos: yes, i do, i will check [11:59:05] cool [11:59:30] my labs instance is running [11:59:46] ok [11:59:56] and has a running webserver..as it should [12:00:02] ok great [12:02:00] http://nagios.wmflabs.org/cgi-bin/nagios3/status.cgi?hostgroup=all&style=hostdetail ..looks good [12:03:40] hard to say from atop, java was steadily at 28% [12:04:26] I guess I'll claim that's what finally did it [12:06:54] I wonder... what starts java on here? [12:10:03] the gerrit daemon? [12:13:12] well right now it's not running [12:15:37] "jetty" ? [12:15:51] what is that? [12:15:53] HTTP server and J2EE servlet container [12:15:57] comes with gerrit [12:16:06] I seeeeee [12:17:47] review_site/bin/gerrit.sh stop [12:17:59] but gerrit works ..so all good? [12:18:12] review_site/bin/gerrit.sh stop [12:18:27] http://review.coreboot.org/Documentation/install-j2ee.html#_installation [12:18:40] for now I'm going to leave it alone [12:19:04] I had a complaint in the tech channel that extension subversion was complaining about responses from the remote subversion server [12:19:07] so that worries me some [12:19:33] but whats weird about formey is [12:19:45] look at the memory for the last week [12:19:50] now look at it for the last hour [12:22:09] btw. The requested URL /search/ was not found on this server. (ganglia) [12:23:46] oh yay svn ok now [12:24:17] I am going to be ill with a feveer in a bit I think [12:24:18] dang it [12:24:33] (I can feel it coming on now but I can beat it for a little while) [12:25:14] :/ [12:25:23] but yay for svn [12:25:30] yeah [13:14:22] ok, going to get food for soup, depon, etc... 
back in a little while [13:19:53] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2450* [14:30:49] New review: Dzahn; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2083 [15:09:12] * jeremyb wonders what depon is [15:09:21] * jeremyb sends apergos some soup [15:09:31] thanks what kind is it? [15:09:56] chicken noodle [15:09:59] it's a pain killer and fever reducer [15:10:05] http://en.wikipedia.org/wiki/Paracetamol [15:10:12] aw, can't [15:10:21] (vegetarian) [15:10:26] but thanks for the thought :-D [15:10:29] ohh, how didn't i know that [15:10:50] idk then. chicken soup is the traditional solution [15:11:24] miso soup is a pretty good substitute [15:11:45] yeah, that works [15:11:47] but I'm going to have toda a very mild potato, cauliflower and curry dish. with rice [15:11:54] if such a thing exists out there [15:12:05] tomorrow a lentil soup with greens and onions, also very mild [15:12:18] one could get it but not anywhere round here so I didn't trry to find it [15:12:39] * jeremyb hates https://en.wikipedia.org/wiki/Kneidlach , always get chicken soup without [15:16:39] what is it? [15:16:58] oh [15:17:16] I like dumplings in my soup but I won't have any in these ones [15:17:27] want to keep it very simple since I have to make it [15:20:51] I like soup in my dumplings. [15:21:26] :-D [15:22:35] ( https://en.wikipedia.org/wiki/Xiaolongbao ) [16:37:59] New patchset: Pyoungmeister; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083 [16:38:24] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." 
[operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2083 [16:47:27] New patchset: Pyoungmeister; "REVIEW REQUESTED major cleanup and refactoring and some parameterization of udp2log class, but should be no substantive changes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2083 [16:49:09] gerrit-wm: what the fuck? does it pass lint check or not? [16:49:29] ah, yes, it does. great. [16:49:54] i'll have a look [16:49:56] where are you? :) [16:50:07] I'm on asher's couch [16:52:53] what is aft? [16:55:14] it's some second udp2log process. tbh, I'm not sure why it has to be seperate... [16:55:36] I try not too stare too deeply into the madness, for fear that I will be lost there [16:57:06] that $log_file parameter [16:57:09] what does it do exactly? [16:57:13] soooooo [16:57:32] there's *no* standardization between the two setups [16:57:57] notpeter: you mean locke vs emery? [16:58:03] Jeff_Green: yeah [16:58:10] you should review my code too :) [16:58:14] ya i ran into that the other day it was a drag [16:58:33] mark_: I believe that that's the actual log file of the proc [16:58:41] i'm happy to review if you want, but I don't really know the history of why the two configs are different [16:58:52] Jeff_Green: not sure anyone does [16:59:14] mark_: I could probably just edit the init file to make them in the same place and scrap that param [16:59:22] I would like that [16:59:30] yeah, that is more reasonable [17:00:28] yeah I agree. i can adjust the fundraising log collection stuff on locke if that changes there [17:00:44] there was also the issue of which user runs udp2log, it was different between hosts [17:00:55] arg [17:01:01] yeah [17:01:24] this all thwarted me from doing log rotation for the SOPA stuff faulkner was working on [17:01:40] standardization ftw! [17:01:43] srsly [17:06:28] mark_: ah, those are in different places because the disks are partitioned differently. 
one has /a, one doesn't :/ [17:07:23] omw in [17:09:44] someone should change that mess :( [17:39:29] hey chris [17:39:31] you in the dc? [17:39:40] hey mark [17:39:45] yes [17:39:48] cool [17:39:54] we're gonna do a risky experiment in 20 mins [17:40:07] it's good that you're standby to be able to power cycle the core switch if necessary ;) [17:40:30] okay [17:40:56] hey, can you tell me what the situation is with the patches in pmtpa? [17:41:02] leslie didn't move one external link when she was there [17:41:16] I'm wondering if the physical patch is there, and whether we can move it today [17:41:26] yes..the physical patches are there [17:41:44] so in pmtpa, there is one link connected to the old core switch, csw5-pmtpa port 8/1 [17:41:56] are you able to move that patch to somewhere in row D? [17:42:32] i should be able to...i don't see any issues [17:43:44] where does that patch go to? [17:43:48] to some patch panel? [17:44:07] I mean, the current patch can't possibly be long enough, right? [17:44:20] idk..i will need to go up there...probably not [17:44:51] ok [17:44:51] give me 5 and I will go there [17:44:56] in 15 mins we need you in sdtpa though [17:44:57] so it can wait [17:45:09] I just want to get rid of that old core switch [17:45:41] that is the only thing left on those old racks and db9 [17:47:22] exactly ;) [17:47:26] I wanna get rid of those entire rows [17:47:35] row B and C [17:47:39] and only db9 is in use now [17:51:48] apergos: replacement raid controller came in [17:51:52] for your dataset1001 [17:51:55] i will install on Monday for you [17:56:59] New patchset: RobH; "added new simple shell script for ipmi mgmt" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2084 [17:57:14] mark: ^ simple shell script for ipmi mgmt [17:57:15] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2084 [17:58:02] nice [17:58:04] small suggestion [17:58:09] if you put in a "case" statement [17:58:22] then you can check whether the argument is a supported command from a list [17:58:30] it's like the preseed command in netboot.cfg [17:59:07] case $1 in powercycle|powerup|powerdown) do something ;; esac [17:59:43] also, instead of $1 $2 $3 $4 you can use "$@" for "all arguments" [18:00:55] ahh, ok [18:01:02] will tinker and append the commit =] [18:01:23] apergos: https://rt.wikimedia.org/Ticket/Display.html?id=2326 capella (previously mobile1) is all yours for the network stuffs [18:01:49] hrmm, what should we call the internal databases .. [18:03:58] RobH: great [18:04:04] and great! [18:04:26] now to get replacements for db9 [18:04:32] must kill old datacenter rows. [18:04:32] although I am sick as a dawg and hav a migraine too, I am *still* a happy camper [18:04:55] what should we call the two db9/db10 replacements. (they are not db class machines, but high performance misc servers) [18:04:57] New patchset: Ottomata; "Documentation and brainstorming, mostly of Observation class." [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2085 [18:05:03] a full db class machine is overkill for our internal dbs. [18:05:19] hence db# is no good. [18:05:59] mark: 8/1 on csw5 is multimode fiber going to hostway's idf [18:06:04] mark: the issue is the ipmitool has a couple dozen potential commands, i guess i would need to case for all of them [18:09:46] RobH: that would be good practice [18:10:13] cmjohnson: ok, are you able to put in a patch to row D yourself? 
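The `case` and `"$@"` suggestions from 17:58 to 17:59 can be sketched roughly as follows. This is not the script from change 2084: the function names and the `host1` argument are hypothetical, and `echo` stands in for the real management call so the sketch has no side effects. Only the three subcommands named in the chat are listed.

```shell
#!/bin/sh
# Hypothetical sketch of the suggested argument check; not the actual ipmi script.

# Return 0 if the first argument is one of the supported subcommands.
is_supported_command() {
    case "$1" in
        powercycle|powerup|powerdown) return 0 ;;
        *) return 1 ;;
    esac
}

# Validate the subcommand, then pass every argument through unchanged.
# "$@" expands to all arguments, each kept as a separate word.
run_mgmt() {
    if ! is_supported_command "$1"; then
        echo "unsupported command: ${1:-<none>}" >&2
        return 2
    fi
    echo "would run:" "$@"
}

run_mgmt powercycle host1
```

Supporting more subcommands, as raised at 18:06:04, would mean adding them to the `|`-separated pattern list, so the whitelist stays in one place.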
[18:10:19] we're gonna start our experiment now
[18:10:22] i will call you if I need you
[18:10:25] probably for csw1-sdtpa
[18:10:50] mark...sounds good
[18:12:20] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2084
[18:13:00] * jeremyb stabs gerrit in the eye. why can't i be anonymous once my session's expired?! let me in! (had to clear the cookie)
[18:15:51] !log Added multicast address 239.192.0.115 to the frontend squid udplog config, and pushed this to sq57 only
[18:29:34] cmjohnson: you mid task for mark?
[18:29:38] or time to do https://rt.wikimedia.org/Ticket/Display.html?id=2338 ?
[18:29:50] i need those two labeled and more importantly, the hard disks swapped with SSDs
[18:30:07] (when you finish i will be installing them to replace db9/db10 which frees space in pmtpa)
[18:30:58] robh: I am on standby for mark so I am able to take care of it
[18:31:05] cool
[18:31:19] low priority compared to network stuff, so if he pulls you back in no worries
[18:35:17] LeslieCarr: !rt 2340 when you get a moment
[18:35:24] !rt 2340
[18:35:24] https://rt.wikimedia.org/Ticket/Display.html?id=2340
[18:35:30] wmbot needs to be smarter.
[18:36:02] who is blondel ?
[18:36:16] it sounds like a blonde grendel
[18:36:29] http://en.wikipedia.org/wiki/Jacques-Fran%C3%A7ois_Blondel
[18:36:32] LeslieCarr: isn't that that guy in resevour dogs?
[18:36:33] http://en.wikipedia.org/wiki/Jacques-Nicolas_Bellin
[18:36:43] 'i dont tip'
[18:37:09] binasher: so on the db9/db10 SSD server replacement
[18:37:16] how you want that installed/parititioned?
[18:37:21] or did you want to handle that?
[18:45:36] RECOVERY - Host knsq8 is UP: PING OK - Packet loss = 0%, RTA = 109.38 ms [18:45:38] RECOVERY - Host knsq29 is UP: PING OK - Packet loss = 0%, RTA = 109.68 ms [18:45:38] RECOVERY - Host knsq24 is UP: PING OK - Packet loss = 0%, RTA = 109.29 ms [18:46:26] RECOVERY - Host wikisource-lb.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.36 ms [18:46:44] mark: whats up with this? [18:46:48] your network changes? [18:47:09] nm, varnish bug LeslieCarr is fixing? [18:47:49] i restarted varnish on niobium [18:47:53] root@niobium:~# uptime [18:47:54] 18:47:29 up 78 days, 1:24, 2 users, load average: 12670.67, 7487.67, 3122.59 [18:47:59] varnish bug caused by net changes [18:50:56] RECOVERY - Host wikiquote-lb.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.60 ms [18:51:48] explanation in wikimedia-tech [18:54:39] so [18:54:43] it broke [18:54:46] but we didn't need chris ;) [18:55:05] bummer. [18:55:59] RobH: after that excitement… wmf3642 - don't see the port for that ? [18:56:05] did it have another name at any time ? [18:56:09] ticket https://rt.wikimedia.org/Ticket/Display.html?id=2340 [18:56:31] no other name [18:56:39] go ahead and task the ticket to cmjohnson_ to comment [18:56:43] and he can confirm =] [18:57:07] the ports and such [18:57:13] he is working on those systems now in fact [18:58:49] cmjohnson_: ping ? [18:59:11] Hi Leslie [18:59:21] ticket 2340 above ? [18:59:28] can you confirm the port for wmf3642 ? [19:00:50] 17 [19:01:46] and 3643 is in 23 [19:02:00] thanks :) [19:02:33] yw [19:07:30] LeslieCarr: if ever restarting varnish on bits again, please run "varnishstat -1" first, and save the output [19:07:49] okay [19:07:54] doh :-/ [19:11:22] binasher: fwiw, we have ganglia for *all* varnish metrics right now [19:22:37] mark can you take a look at RT 2323 and 2290? 
they're equipment requests for the eqiad payments system [19:23:15] ok [19:23:21] thx [19:24:28] please put them in procurement instead :) [19:24:36] oops. ok [19:24:54] but yeah, R410s or R310s should be fine for that [19:25:01] and rob can give you a box for erzurumi [19:25:09] then we can upgrade erzurumi as well [19:25:10] ok great [19:25:11] one of the last hardy boxes [19:25:17] kill. [19:25:21] smash. [19:25:21] yes [19:29:22] LeslieCarr: so all ports done and i good on those? [19:29:31] should be rob [19:29:35] thank you [19:29:56] !log dns update go! [19:29:58] Logged the message, RobH [19:30:38] RobH: for these eqiad hosts--it sounds as if we're allocating and not procuring, so which queue would be appropriate? [19:30:38] robh: drive swap is complete and verified in post [19:30:54] Jeff_Green: procurement since you want it [19:31:01] k [19:31:03] then i make the call if we have it already or need to order and escalate [19:31:08] no need for you guys to have to worry about that =] [19:31:27] cool, thanks [19:38:49] New review: Rich Smith; "(no comment)" [analytics/reportcard] (master) C: 1; - https://gerrit.wikimedia.org/r/2073 [19:41:26] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2073 [19:41:26] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2073 [19:41:42] I want to force all RT traffic to https =P [19:41:48] anyone in ops dislike this idea? [19:41:57] like++ whoo whoo [19:42:03] New review: Diederik; "Ok." 
[analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2074 [19:42:03] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2074 [19:42:07] the links within rt are http [19:42:11] I don't dislike that idea [19:42:12] even when accessed via https [19:42:12] however [19:42:17] someone already tried that once (ryan) [19:42:22] and did it wrong, and broke the mail gateway [19:42:27] heh [19:42:28] just saying, it's not trivial ;) [19:42:31] hehe [19:42:31] damn it. [19:42:38] the mail gateway needs http to work [19:42:39] https everywhere plugin it is ;P [19:42:40] from localhost only [19:42:52] is CommonSettings, etc kept in vcs somewhere? [19:42:59] svn [19:43:04] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2085 [19:43:04] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2085 [19:46:34] New patchset: Bhartshorne; "configuring lvs to front swift" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2087 [19:46:47] New review: gerrit2; "Change did not pass lint check. You will need to send an amended patchset for this (see: https://lab..." [operations/puppet] (production); V: -1 - https://gerrit.wikimedia.org/r/2087 [19:47:30] mark: when you've gto a moment, would you review https://gerrit.wikimedia.org/r/2087? [19:47:52] was already on it ;) [19:47:58] there's a syntax error [19:48:15] you're missing a comma on line 149 [19:48:38] the service ip is not in the right ip range [19:48:53] it needs to be in 10.2.1. [19:48:59] oh, you said that. sorry. [19:49:20] otherwise looks fine [19:49:44] mark: the fiber in csw5 is not long enough to make it to row d. [19:49:55] I figured [19:50:03] is there a new fiber waiting from hostway? [19:50:14] there are several [19:52:06] but I imagine you can't replace that fiber since part of it is in the hostway cage? [19:52:31] that is correct. 
i can call Matt at hostway? [19:52:46] yeah we'll need to do this soonish [19:52:49] but doesn't need to be today [19:53:08] i'm not sure if he's busy :) [19:53:09] okay...he may have already run the fiber...just not sure...i will find out [19:53:22] PROBLEM - Puppet freshness on copper is CRITICAL: Puppet has not run in the last 10 hours [19:53:22] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [19:53:22] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours [19:54:33] mark: http://rt.wikimedia.org/Ticket/Display.html?id=2343 [19:55:22] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours [19:56:28] thanks, i'll add some comments to it [19:56:59] how much disk space do you need? [19:57:06] mark: i forwarded you the email response from Hostway [19:57:20] thanks [19:57:29] log running dist-upgrade on payments* and silicon [19:57:32] err [19:57:36] !log running dist-upgrade on payments* and silicon [19:57:38] Logged the message, Master [19:57:45] (don't)log :-P [19:58:22] PROBLEM - Puppet freshness on ms2 is CRITICAL: Puppet has not run in the last 10 hours [19:59:22] PROBLEM - Puppet freshness on owa3 is CRITICAL: Puppet has not run in the last 10 hours [20:03:22] PROBLEM - Puppet freshness on ms-fe2 is CRITICAL: Puppet has not run in the last 10 hours [20:05:22] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [20:07:22] PROBLEM - Puppet freshness on ms-fe1 is CRITICAL: Puppet has not run in the last 10 hours [20:11:22] PROBLEM - Puppet freshness on ms1 is CRITICAL: Puppet has not run in the last 10 hours [20:15:37] New patchset: Bhartshorne; "configuring lvs to front swift" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2087 [20:15:53] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2087
[20:16:00] New patchset: Asher; "thread_pool_max was way too high (setting changed meaning in 3.0, became max * pools) try to catch backend failures quicker and better serve stale content mobile frontend probes now test backend varnish, not apaches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2088
[20:16:08] mark: can you review ^^^
[20:16:16] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/2088
[20:16:19] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2087
[20:16:27] ok
[20:16:39] mark: thanks for the second review on 2087.
[20:17:05] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2087
[20:17:05] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2087
[20:18:34] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2088
[20:20:30] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2088
[20:20:31] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2088
[20:20:57] PROBLEM - Puppet freshness on owa2 is CRITICAL: Puppet has not run in the last 10 hours
[20:53:17] PROBLEM - Host knsq22 is DOWN: PING CRITICAL - Packet loss = 100%
[20:54:47] RECOVERY - Host knsq22 is UP: PING WARNING - Packet loss = 80%, RTA = 109.52 ms
[20:55:17] PROBLEM - Host knsq16 is DOWN: PING CRITICAL - Packet loss = 100%
[20:55:34] New patchset: Ottomata; "Moved Observation class to observation.py, created test directory and added unit tests for new Observation class. Need to work on UserAgentObservation next" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2089
[20:56:07] PROBLEM - Host knsq21 is DOWN: PING CRITICAL - Packet loss = 100%
[20:56:07] PROBLEM - Host ns2.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100%
[20:57:07] RECOVERY - Host knsq21 is UP: PING WARNING - Packet loss = 37%, RTA = 110.94 ms
[20:57:42] !log running "varnishadm param.set thread_pool_max 1875" on mobile varnish servers
[20:57:42] New review: Diederik; "Ok." [analytics/reportcard] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2089
[20:57:42] Change merged: Diederik; [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/2089
[20:57:43] Logged the message, Master
[20:58:57] PROBLEM - Host knsq18 is DOWN: PING CRITICAL - Packet loss = 100%
[21:00:37] RECOVERY - Host knsq16 is UP: PING WARNING - Packet loss = 50%, RTA = 110.15 ms
[21:01:17] RECOVERY - Host knsq18 is UP: PING OK - Packet loss = 0%, RTA = 109.54 ms
[21:04:17] RECOVERY - Host ns2.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.46 ms
[21:08:06] New patchset: Asher; "thread_pool_min must always be < max" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2090
[21:08:24] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2090
[21:08:25] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2090
[21:10:45] New patchset: Asher; "what a differnce a % makes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2091
[21:11:01] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2091
[21:11:02] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2091
[21:11:02] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2091
[21:22:28] !log bits caches: running varnish param.set thread_pool_min, thread_pool_max, where min = 15000 / cores / 4 and max = 15000 / cores
[21:22:30] Logged the message, Master
[21:25:53] PROBLEM - DPKG on db1029 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[21:32:33] PROBLEM - Host ms-fe.pmtpa.wmnet is DOWN: PING CRITICAL - Packet loss = 100%
[21:33:43] PROBLEM - Apache HTTP on srv197 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:35:53] RECOVERY - DPKG on db1029 is OK: All packages OK
[21:39:01] hello
[21:42:53] PROBLEM - Host amssq52 is DOWN: PING CRITICAL - Packet loss = 100%
[21:43:41] i am seeing an rtt of 110 ms between a host in amsterdam (damiana.toolserver.org) and a host in washington (amaranth.toolserver.org)
[21:44:22] the monitoring is complaining but unfortunately i do not know an average rtt and if this is something to worry about
[21:44:23] RECOVERY - Host amssq52 is UP: PING WARNING - Packet loss = 61%, RTA = 108.91 ms
[21:45:41] 110 ms seems pretty average
[21:46:03] PROBLEM - Host knsq17 is DOWN: PING CRITICAL - Packet loss = 100%
[21:46:29] thx
[21:46:43] RECOVERY - Host knsq17 is UP: PING WARNING - Packet loss = 28%, RTA = 109.30 ms
[21:47:33] PROBLEM - Host bits.esams.wikimedia.org_https is DOWN: PING CRITICAL - Packet loss = 100%
[21:48:11] looks like p-loss on our chosen path
[21:48:21] i'm going to switch that out
[21:48:23] PROBLEM - Host bits.esams.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100%
[21:50:03] PROBLEM - BGP status on csw2-esams is CRITICAL: CRITICAL: No response from remote host 91.198.174.244,
[21:50:03] RECOVERY - Host bits.esams.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 109.36 ms
[21:54:02] okay, we're going xo to datahop now, i just did a "ignore the selected paths" thing :)
[21:54:24] !log deactivated selected-paths policy-statement on cr1-eqiad and cr2-eqiad
[21:54:25] Logged the message, Mistress of the network gear.
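Note on the 21:22:28 !log entry: the sizing rule it states (min = 15000 / cores / 4, max = 15000 / cores) can be sketched in Python as below. This is only an illustration of the arithmetic, not the script that was actually run. The function name, the `total_threads` default, and rounding down to whole threads are assumptions; the log gives only the formula.

```python
def thread_pool_limits(cores, total_threads=15000):
    """Return (thread_pool_min, thread_pool_max) for one pool.

    Assumption: the divisor is the core count, as in the !log entry,
    and fractional results are rounded down.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    pool_max = total_threads // cores   # max = 15000 / cores
    pool_min = pool_max // 4            # min = 15000 / cores / 4
    return pool_min, pool_max


if __name__ == "__main__":
    for cores in (4, 8, 16):
        pool_min, pool_max = thread_pool_limits(cores)
        print(f"{cores} cores: thread_pool_min={pool_min} thread_pool_max={pool_max}")
```

For 8 cores the maximum works out to 1875, the same figure that appears in the 20:57:42 entry, and the minimum always stays below the maximum, which is the constraint named in change 2090.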
[22:00:03] RECOVERY - BGP status on csw2-esams is OK: OK: host 91.198.174.244, sessions up: 4, down: 0, shutdown: 0
[22:01:43] RECOVERY - Host bits.esams.wikimedia.org_https is UP: PING OK - Packet loss = 0%, RTA = 114.58 ms
[22:03:03] PROBLEM - Puppet freshness on knsq9 is CRITICAL: Puppet has not run in the last 10 hours
[22:04:30] hey folks, srv299 seems to be borked. i saw the following when trying to do a configchange on fenari:
[22:04:31] awjrichards@fenari:/home/wikipedia/common$ configchange wmf-config/InitialiseSettings.php "Setting up FeaturedFeeds config; disabled by default"
[22:04:31] Sending wmf-config/InitialiseSettings.php
[22:04:32] Transmitting file data .
[22:04:32] Committed revision 2861.
[22:04:32] copying to apaches
[22:04:32] srv299: Error reading response length from authentication socket.
[22:04:32] awjrichards@srv299's password:
[22:16:16] hrmm
[22:16:22] forcing it to puppet run
[22:16:33] lets see if it fixes it ;]
[22:30:40] awjr: sorry, got sidetracked
[22:30:45] puppet just ran on that, still have error?
[22:31:08] RobH: im about to scap one sec
[22:31:22] no rush
[22:31:53] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096
[22:35:21] mark: maplebed , want to check this out ?
[22:35:25] sure
[22:36:24] LeslieCarr: I don't understand what it's changing.
[22:36:36] (I haven't used the test repo enough I suppose)
[22:36:40] right now virt0 does an rsync
[22:36:44] once a minute
[22:36:52] so actually make it pull in from git
[22:37:49] ah. I'll defer to mark's review.
[22:38:44] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096
[22:39:23] maplebed: any word on ms-be1?
[22:39:33] im just checking so we can order asap for ya
[22:39:35] RobH: I'm trying to figure out mark's drive partitioning stuff in puppet
[22:39:55] ahh, ok, well, lemme know when we should move forward on ordering more of them =]
[22:41:45] !log updating dns for bellin/blondel db9/10 replacements
[22:41:47] Logged the message, RobH
[22:42:30] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/2096
[22:42:45] maplebed: need help?
[22:43:09] sure...
[22:43:12] here's the story:
[22:43:40] the first two drives are software raid with a 120G partition for /
[22:43:54] the rest (10) are unformatted / mounted.
[22:44:01] right
[22:44:07] from what I can tell, I'm supposed to list all devices,
[22:44:18] and it figures out which ones are the OS partitions (but maybe only for sun hardware)
[22:44:29] and appends 1 to each device name (sdc -> sdc1)
[22:44:36] but that won't quite work here.
[22:45:06] I'm tempted to ignore the first two disks for now
[22:45:12] and just list sdc - sdl
[22:45:35] yeah that's right, only for sun hardware
[22:45:36] then get fancy with using the rest of the first two disks (the remaining 1900G) later.
[22:45:42] because it differs for ms1/2 vs 3
[22:46:21] New patchset: Pyoungmeister; "adding cp1001-cp1020 as text squids" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2097
[22:46:31] you could in theory give every drive a small first partition, and only actually use the first two for root
[22:46:42] and then use the 2nd partition for all drives?
[22:46:45] brb - doorbell rang.
[22:47:45] yay presents! it was the ups trick.
[22:48:00] yeah, but that seems silly.
[22:48:42] why did you choose to list the device and then always use the first partition instead of just listing the full partition path in puppet?
[22:48:52] (eg sda2, sdb2, sdc1, sdd1, etc.)
[22:49:06] New patchset: Bhartshorne; "first stab at getting ms-be into puppet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2098
[22:49:15] btw, ^^^ is what I'm trying so far.
[22:49:42] mark: cp1001-cp1020 should be .eqiad.wmnet, not .w.o, correct?
[22:50:18] yeah. ok.
[22:52:00] New review: Pyoungmeister; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2097
[22:52:01] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2097
[22:54:12] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096
[22:54:24] mark: look better now ?
[22:55:10] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2098
[22:55:10] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2098
[22:56:21] New review: Mark Bergsma; "The git::clone generic definition doesn't actually work this way. Check the comments in the previous..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2096
[22:58:40] New patchset: Pyoungmeister; "also needed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2099
[23:00:33] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2096
[23:00:47] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096
[23:01:04] Change abandoned: Pyoungmeister; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2099
[23:09:40] ok
[23:09:44] who wants to make this work?
[23:09:50] bellin is saying no free leases for dhcp
[23:09:55] yet it has dns, has entry, mac is right
[23:10:02] mac on nic1 is showing on brewster
[23:10:07] with the no free leases.
[23:10:12] kick dhcp?
[23:10:16] did
[23:10:33] RobH: usually it's dns
[23:10:34] i have checked dhcp entry, restarted it, ensured it hits it, ensured the mac is right, no negatively cached dns left
[23:10:47] i'll check it out
[23:10:50] huh
[23:10:55] every time i wipe negative it comes back
[23:10:55] wtf
[23:11:06] how many l/s?
[23:11:09] redo authdns-update ? maybe one server didn't get it ?
[23:11:24] root@dobson:~# rec_control wipe-cache bellin.pmtpa.wmnet
[23:11:25] wiped 1 records, 2 negative records
[23:11:33] redo netboot, those show back up =/
[23:12:58] RECOVERY - Puppet freshness on owa1 is OK: puppet ran at Wed Jan 25 23:12:36 UTC 2012
[23:12:58] RECOVERY - Puppet freshness on copper is OK: puppet ran at Wed Jan 25 23:12:51 UTC 2012
[23:13:10] weird, i just checked all the dns servers and they're responding correctly..
[23:13:10] work damn you!
[23:13:16] yea and the entry is in dns
[23:13:18] so it got to them all
[23:13:23] i dug against each server
[23:13:31] mark: I would appreciate help - I'm not sure how to interpret the error the puppet run is giving me on ms-be1.
[23:13:36] when i push dns i dig test each server with the fwdn
[23:13:38] fqdn
[23:14:05] ok
[23:14:07] just a sec
[23:14:30] ok, who wants to take this over and give it another set of eyeballs?
[23:15:09] RobH: is it the right ip for the vlan it's in?
[23:18:32] maplebed: so far it's not doing much on the puppet run ;)
[23:18:59] arghafewfvad
[23:19:00] typo.
[23:19:00] i blame LeslieCarr because i can.
[23:19:10] :)
[23:19:55] mark: right - it's complaining about an error, but I don't see what's wrong.
[23:20:05] err: Could not retrieve catalog from remote server: Error 400 on SERVER: 'undef' from right operand of 'in' expression is not of a supported type (string, array or hash) at /var/lib/git/operations/puppet/manifests/swift.pp:214 on node ms-be1.pmtpa.wmnet
[23:20:22] looking
[23:20:22] which is buried in the filesystem stuff.
[23:20:23] ah
[23:20:26] see
[23:20:34] it refers to $base::platform::startup_drives
[23:20:39] which is undefined
[23:21:02] because base.pp doesn't recognize these dells
[23:21:02] (or most other server types)
[23:21:02] so we should fix that
[23:21:16] oh, it's because startup_drives doesn't exist?
[23:21:23] I put that in because the thumpers (ms1-2) and the thors (ms3+) have different startup drives
[23:21:23] hrmph.
[23:21:25] yeah likely
[23:21:57] New patchset: Lcarr; "Make test puppet repo act like production (pull from git)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2096
[23:22:58] RECOVERY - Puppet freshness on owa2 is OK: puppet ran at Wed Jan 25 23:22:32 UTC 2012
[23:22:58] RECOVERY - Puppet freshness on zinc is OK: puppet ran at Wed Jan 25 23:22:32 UTC 2012
[23:22:58] RECOVERY - Puppet freshness on magnesium is OK: puppet ran at Wed Jan 25 23:22:32 UTC 2012
[23:22:58] RECOVERY - Puppet freshness on owa3 is OK: puppet ran at Wed Jan 25 23:22:46 UTC 2012
[23:22:58] RECOVERY - Puppet freshness on ms2 is OK: puppet ran at Wed Jan 25 23:22:56 UTC 2012
[23:22:59] RECOVERY - Puppet freshness on ms3 is OK: puppet ran at Wed Jan 25 23:22:57 UTC 2012
[23:23:42] would you mind putting in a fix?
[23:24:04] I'm not sure what I would put there.
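Note on the error discussed above: the failure mode (a membership test against an undefined `$base::platform::startup_drives`) has a loose Python analogy, sketched below. This is not the logic in swift.pp, which the log does not show. The function name is hypothetical, and skipping the startup drives is a placeholder assumption; only the "append 1 to each device name" step comes from the conversation.

```python
def data_partitions(devices, startup_drives):
    """Map whole-disk device names to first-partition names.

    startup_drives stands in for $base::platform::startup_drives.
    Assumption: startup (OS) drives are simply skipped here; the real
    manifest may handle them differently.
    """
    partitions = []
    for dev in devices:
        # With startup_drives=None this membership test raises TypeError,
        # roughly mirroring Puppet's complaint about 'undef' as the
        # right operand of 'in'.
        if dev in startup_drives:
            continue
        partitions.append(dev + "1")  # sdc -> sdc1, as described in the chat
    return partitions


if __name__ == "__main__":
    print(data_partitions(["sda", "sdb", "sdc", "sdd"], ["sda", "sdb"]))
    try:
        data_partitions(["sdc"], None)  # platform not recognized
    except TypeError as err:
        print("undefined startup_drives:", err)
```

Under this analogy, the fix is to make sure the platform lookup yields a defined list for the new hardware, which appears to be what the later "Add platform support for Dell PowerEdge C2100 in base.pp" change is for.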
[23:24:58] RECOVERY - Puppet freshness on ms-fe1 is OK: puppet ran at Wed Jan 25 23:24:56 UTC 2012
[23:26:33] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/2096
[23:26:40] ok
[23:29:18] PROBLEM - Host srv189 is DOWN: PING CRITICAL - Packet loss = 100%
[23:29:28] RECOVERY - Puppet freshness on ms-fe2 is OK: puppet ran at Wed Jan 25 23:29:12 UTC 2012
[23:31:38] New patchset: Mark Bergsma; "Fix purge exec" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2100
[23:31:57] New patchset: Mark Bergsma; "Add platform support for Dell PowerEdge C2100 in base.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2101
[23:32:08] New patchset: Pyoungmeister; "nagios monitoring group and also some realphebitizing" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2102
[23:32:22] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2100
[23:32:23] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2100
[23:33:16] New patchset: Mark Bergsma; "Add platform support for Dell PowerEdge C2100 in base.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/2101
[23:33:43] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 1; - https://gerrit.wikimedia.org/r/2101
[23:33:49] maplebed: ^
[23:35:58] RECOVERY - Puppet freshness on ms1 is OK: puppet ran at Wed Jan 25 23:35:38 UTC 2012
[23:36:41] looking
[23:37:16] huh.
[23:37:53] merging.
[23:37:56] let's see how it does.
[23:38:13] mark: I'm merging r2100 too.
[23:38:43] no change.
[23:39:00] maybe it hasn't synced to stafford yet.
[23:39:24] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2101
[23:57:39] they're getting reqs from User-Agent: Twisted PageGetter which i assume is pybal
[23:58:26] they're all still in the pybal conf with enabled = false. monitored but not in the lvs pool
[23:58:43] that's pybal yes
[23:58:44] i'm going to delete them from there
[23:58:50] sure
[23:59:41] !log removed old external store apaches from pybal config
[23:59:42] Logged the message, Master