[00:27:25] Ryan_Lane: Ages ago, you mentioned that if dhcp_domain is defined I should use that instead of dns_instance_zone. I am at last looking at that... the dhcp_domain flag has a default value, though, so it looks like it will /always/ be defined.
[00:27:37] oh
[00:27:50] let's use dns_instance_zone specifically, then
[00:28:07] it wouldn't be the only redundant flag set
[00:28:20] So, ignore dhcp_domain entirely?
[00:28:24] yeah
[00:28:28] That's easy, it requires me to do nothing at all.
[00:28:31] cool
[00:35:10] New patchset: Lcarr; "trying to put machine in multiple clusters" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1942
[00:35:25] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/1942
[00:38:13] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1942
[00:38:14] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1942
[00:40:59] New patchset: Lcarr; "tryuing to see where ganglia chokes" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1945
[00:41:33] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1945
[00:41:34] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1945
[00:45:36] New patchset: Lcarr; "another ganglia test" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1948
[00:45:50] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/1948
[00:47:04] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1948
[00:47:04] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1948
[00:58:03] New patchset: Lcarr; "reverting everything" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1949
[00:58:18] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/1949
[00:58:46] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1949
[00:58:47] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1949
[01:04:46] New patchset: Lcarr; "fixing var" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1951
[01:05:11] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1951
[01:05:11] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1951
[01:07:57] New patchset: Lcarr; "another var fix" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1952
[01:08:13] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/1952
[01:08:27] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1952
[01:08:27] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1952
[02:57:54] RECOVERY Total Processes is now: OK on nova-dev3 nova-dev3 output: PROCS OK: 78 processes
[02:58:44] RECOVERY dpkg-check is now: OK on nova-dev3 nova-dev3 output: All packages OK
[02:59:04] RECOVERY Current Load is now: OK on nova-dev3 nova-dev3 output: OK - load average: 0.55, 0.62, 0.28
[03:00:14] RECOVERY Current Users is now: OK on nova-dev3 nova-dev3 output: USERS OK - 0 users currently logged in
[03:00:44] RECOVERY Disk Space is now: OK on nova-dev3 nova-dev3 output: DISK OK
[03:01:14] RECOVERY Free ram is now: OK on nova-dev3 nova-dev3 output: OK: 63% free memory
[04:59:46] goodbye en.wiki!
[05:02:56] Lol
[05:03:02] The learn more link is blacked out too though
[05:03:03] fail
[05:04:00] I think that means the bots get a day off
[05:04:54] And win, they fixed the learn more page.
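The dhcp_domain exchange at [00:27] at the top of this log hinges on a common flag-system subtlety: a flag declared with a default value is always "defined", so an "if defined, use it" check never falls through to the alternative. A minimal illustrative sketch using argparse (the flag names are borrowed from the conversation; this is not the actual nova/labs flag code, which used a different flags library):

```python
import argparse

# Hypothetical reconstruction of the behavior discussed at [00:27]:
# a flag with a default is always set, so "prefer dhcp_domain when
# defined" can never fall back to dns_instance_zone.
parser = argparse.ArgumentParser()
parser.add_argument("--dhcp_domain", default="example.lan")  # default => always "defined"
parser.add_argument("--dns_instance_zone")                   # no default => None until set

args = parser.parse_args([])  # simulate a run where neither flag is passed

# The "if defined" check the conversation describes:
domain = args.dhcp_domain if args.dhcp_domain is not None else args.dns_instance_zone
print(domain)  # the default wins every time, hence "ignore dhcp_domain entirely"
```

This is why the conclusion in the log is to read dns_instance_zone directly rather than gate on whether dhcp_domain is set.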
[05:05:09] * Damianz goes back to idling
[11:57:15] 01/18/2012 - 11:57:15 - Creating a project directory for planet
[11:57:16] 01/18/2012 - 11:57:15 - Creating a home directory for dzahn at /export/home/planet/dzahn
[11:58:15] 01/18/2012 - 11:58:15 - Updating keys for dzahn
[12:03:57] I need to duplicate svn://svn.wikimedia.org/svnroot/mediawiki/trunk/lucene-search-2 to svn://svn.wikimedia.org/svnroot/mediawiki/trunk/lucene-search-3
[12:04:17] using svn copy fails, any suggestion
[12:04:39] would be welcome
[12:17:01] PROBLEM host: venus is DOWN address: venus CRITICAL - Host Unreachable (venus)
[12:21:21] !log planet - created new project to test planet (rss aggregator) replacement
[12:21:24] Logged the message, Master
[12:21:44] !log bastion - bastion1 asks for reboot
[12:21:45] Logged the message, Master
[12:22:02] RECOVERY host: venus is UP address: venus PING OK - Packet loss = 0%, RTA = 7.04 ms
[12:22:12] PROBLEM Free ram is now: CRITICAL on venus venus output: Connection refused by host
[12:22:12] PROBLEM HTTP is now: CRITICAL on venus venus output: Connection refused
[12:22:36] !log planet - added new instance 'venus' to test planet-venus package
[12:22:37] Logged the message, Master
[12:23:27] !log planet - added misc::planet and misc::planet-venus classes
[12:23:28] Logged the message, Master
[12:23:42] PROBLEM Total Processes is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:24:02] PROBLEM Current Load is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:24:22] PROBLEM dpkg-check is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:24:52] PROBLEM Current Users is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:26:22] PROBLEM Current Users is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:26:32] PROBLEM Total Processes is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:27:12] RECOVERY Free ram is now: OK on venus venus output: OK: 92% free memory
[12:27:12] RECOVERY HTTP is now: OK on venus venus output: HTTP OK: HTTP/1.1 200 OK - 453 bytes in 0.007 second response time
[12:27:22] PROBLEM Disk Space is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:28:22] PROBLEM Free ram is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:28:42] RECOVERY Total Processes is now: OK on venus venus output: PROCS OK: 89 processes
[12:28:52] PROBLEM dpkg-check is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:29:02] RECOVERY Current Load is now: OK on venus venus output: OK - load average: 0.06, 0.54, 0.42
[12:29:22] RECOVERY dpkg-check is now: OK on venus venus output: All packages OK
[12:29:32] PROBLEM Current Load is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:29:52] RECOVERY Current Users is now: OK on venus venus output: USERS OK - 1 users currently logged in
[12:31:22] RECOVERY Current Users is now: OK on deployment-web deployment-web output: USERS OK - 0 users currently logged in
[12:31:32] RECOVERY Total Processes is now: OK on deployment-web deployment-web output: PROCS OK: 125 processes
[12:32:22] RECOVERY Disk Space is now: OK on deployment-web deployment-web output: DISK OK
[12:33:12] RECOVERY Free ram is now: OK on deployment-web deployment-web output: OK: 77% free memory
[12:33:42] RECOVERY dpkg-check is now: OK on deployment-web deployment-web output: All packages OK
[12:34:22] PROBLEM Current Load is now: WARNING on deployment-web deployment-web output: WARNING - load average: 2.23, 16.82, 12.97
[12:54:22] RECOVERY Current Load is now: OK on deployment-web deployment-web output: OK - load average: 0.02, 0.40, 3.65
[14:41:48] !log planet - test run with en.planet config and musings theme at http://10.4.0.66/planet/en/
[14:41:50] Logged the message, Master
[15:32:59] !log testlabs - added misc::package-builder to instance labs-build1 (ok to use to build, right? and Wikitech:Ragweed says to use that class)
[15:33:00] Logged the message, Master
[15:33:46] mutante: yeah ragweed is going to be decommissioned as I understand it
[15:33:58] in december, mark told me to build the package in a VM
[15:34:10] hashar: yeah, instead of creating a new instance, i am just using the existing build-1 though
[15:34:30] hashar: it looked pretty unused, and did not have the package-builder class applied yet
[15:40:03] hashar: ah, ok, but that does not install pbuilder, so it is all about "Git-buildpackage" now? did you build on labs yet?
[15:46:15] mutante: I built the testswarm package
[15:46:19] but did not use git-buildpackage
[15:46:35] Victory: http://www.facebook.com/SenatorMarcoRubio/posts/340889625936408
[15:46:39] tf?
[15:46:54] Bah.
[15:47:01] Victory anyways.
[15:47:12] hashar: what did you use? just dpkg-deb --build ?
[15:49:12] debuild :b
[15:49:23] the good old way to build packages
[15:49:32] but you most probably want to use git-buildpackage
[15:56:15] 01/18/2012 - 15:56:15 - Creating a home directory for platonides at /export/home/deployment-prep/platonides
[15:57:15] 01/18/2012 - 15:57:14 - Updating keys for platonides
[16:14:44] hashar: thanks.
[16:57:41] petan: you're configuring squid?
[16:59:26] "Configuring squid" always sounds like some sort of hentai anime.
[17:22:01] petan, this describes the issue http://archives.lists.indymedia.org/imc-tech/2001-October/005697.html
[17:22:07] (but was given no answer)
[17:40:44] hexmode: yes
[17:41:03] hexmode: there is a good reason for that
[17:42:14] petan: k, I knew it needed to be done ... was just surprised by the error page :)
[17:42:24] which one
[17:42:37] hexmode: I gave shell to platonides if you suspect someone :P
[17:42:38] :D
[17:42:48] heh
[17:43:00] btw you still get it?
[17:43:18] Platonides: that's it!
[17:43:19] Not now, but earlier ... 1hr ago
[17:43:22] ok
[17:59:10] !bugzilla
[17:59:10] https://bugzilla.wikimedia.org/show_bug.cgi?id=$1
[18:01:28] petan: saibo just told me he got a squid error ...
[18:01:38] really? which page
[18:01:45] I am definitely not changing it now
[18:01:48] perhaps someone else
[18:04:11] petan: http://commons.wikimedia.beta.wmflabs.org/w/index.php?title=Commons:Deletion_requests/Files_uploaded_by_Saibo&action=submit&uselang=en
[18:04:34] petan: click "show preview"
[18:09:14] ok
[18:32:24] hexmode: fixed
[20:31:21] Coren: hey! are you on the labs mailing list as well?
[20:31:42] also, Coren, what city are you nearest? https://www.mediawiki.org/wiki/MediaWiki_developer_meetings
[21:31:28] diederik: will this work? https://github.com/appliedsec/pygeoip
[21:31:45] it's a pure python api for maxmind's databases
[21:32:17] ryan: prefer the maxmind C implementation
[21:32:23] way way faster and I need to crunch a lot of data
[21:32:31] ok. lemme see if I can find it
[21:35:56] I'm installing the following: geoip-bin geoip-database libgeoip1 python-geoip
[21:35:59] * Damianz thinks about twisted hell and wonders if he'll get time to fix cb before 5am.
[21:36:39] http://geolite.maxmind.com/download/geoip/api/c/
[21:37:10] diederik: that's the geoip packages
[21:37:18] ryan: can I also copy data from bayes to stat1
[21:37:19] I just installed them
[21:37:20] ryan: okay!
[21:37:25] super super th
[21:37:25] x
[21:37:28] yw
[21:37:36] I'm trying for wurfl right now
[21:37:42] going to have to add it to our repo
[21:38:06] i think that's a great idea because i am sure this is a package that we will be using a lot in the future
[21:38:24] * Ryan_Lane nods
[21:55:02] diederik: ok. python-wurfl is now available too
[21:59:23] ryan: thanks!
[21:59:33] yw
[21:59:44] I don't see how data is making it from emery to bayes
[21:59:49] ryan: spoke with nimish, according to him there is a remote mount on locke that makes the log files available on bayes
[22:00:00] this was setup by mark bergsma
[22:00:01] on locke?
[22:00:09] that is what nimish said
[22:00:18] (don't shoot the messenger :))
[22:00:51] I see a mount for dataset2
[22:01:07] not for locke
[22:01:11] or emery
[22:01:18] so, that couldn't possibly be correct
[22:02:47] I can share the same share to stat1, though
[22:02:50] uuuuummhhhhhhhh
[22:06:31] what data would make that available?
[22:06:35] I have no clue
[22:06:39] :D
[22:06:44] let's try and see
[22:06:48] I'd imagine nimish would know this right?
[22:06:58] he's the analytics guy, eh? :D
[22:07:27] i am just trying to piece the puzzle together :D
[22:08:04] it'll save time if we ask someone who deals with this stuff a lot
[22:08:14] i think i just did
[22:08:23] who? nimish?
[22:08:26] yes
[22:08:34] he doesn't know?
[22:09:01] sorry, only asked him how the data got from locke to bayes
[22:09:10] what is the name of the share?
[22:09:13] and he didn't answer us :)
[22:09:57] /mnt/data on bayes is dataset2:/data
[22:10:16] i don't think i have access to dataset2
[22:10:24] it's a share
[22:10:37] so, if we share it to stat1, you would have access to it
[22:10:41] oh ok
[22:10:44] but, we need to know if that's what you need
[22:10:55] checking right now, 1 sec
[22:10:58] ok
[22:11:35] pretty sure that is not what i need
[22:11:58] can he get on IRC?
[22:11:59] on bayes
[22:17:14] ryan:
[22:17:24] so the data is located here:
[22:17:33] emery:/var/log/squid/archive
[22:17:34] or
[22:17:34] locke:/a/squid/archive
[22:17:54] bayes does not have any of those mounts
[22:18:20] locke doesn't even have an NFS server installed
[22:18:24] is erik z around?
[22:18:48] neither does emery
[22:18:52] I have no clue
[22:19:03] or mark bergsma
[22:19:49] so what shall we do? at least we know where the data is located on both emery and locke
[22:20:56] I have no clue
[22:21:00] I don't know how this stuff works
[22:21:24] who could fix this?
[22:21:25] is it documented anywhere?
[22:21:30] i don't think so
[22:21:59] robla: can you chime in quickly?
[22:22:47] * robla reads backlog
[22:22:48] robla: just to be clear, I'm pretty annoyed by this situation
[22:23:06] I hope not with me :)
[22:23:12] no
[22:23:15] the situation
[22:23:46] the lack of documentation for this is not ok
[22:24:04] I can't do anything because I have no clue how things work
[22:24:10] and there's no one to answer my questions
[22:24:40] back to the situation, though. I need to know how data gets to bayes from locke or emery.
[22:24:47] so that I can get it to stat1
[22:25:18] Ryan_Lane: I'm trying to fix this situation by hiring the right people (like what I'm doing today)
[22:25:43] that's good. we should also be getting people to document how things work before these people get in
[22:25:56] ugh
[22:26:05] working on it: all filters are just committed to svn, that's step 1
[22:26:09] indeed
[22:26:11] diederik: thanks
[22:52:03] ryan: "CT: i suppose there is a cronjob that transfer the daily log over
[22:52:03] or evey xx hours", could that be the case?
[22:56:46] no
[22:56:50] and I've given up looking
[22:56:56] when you guys figure it out, I'll help you more
[22:57:31] I have other things I need to do. people have been waiting for me for an hour
[23:00:10] damn, they left :(
[23:04:08] diederik: do you only need a single copy of the data?
[23:04:18] we need to figure out which user would be used for a cron
[23:04:26] and that will take a while
[23:04:48] yes, only a single copy
[23:04:55] ok
[23:05:15] so, the current data is good, and you won't need updated data today?
[23:05:31] do you have access to locke?
[23:06:40] diederik: ^^?
[23:06:49] yes
[23:06:50] if so, you can do this
[23:07:03] make sure to forward your agent to locke
[23:07:03] sorry, about single copy
[23:07:05] from fenari
[23:07:12] ssh -A fenari.wikimedia.org
[23:07:15] ssh -A locke
[23:07:21] ideally, stat1 syncs data with locke
[23:07:32] but one time shot is good for now as well
[23:09:13] on stat1: rsync -v —bwlimit=4096 /a/squid/archive/. /a/squid/archive/sampled/sampled-1000.log-201112*.gz
[23:09:17] oh, sorry
[23:09:22] you should forward your agent to stat1
[23:09:23] not locke
[23:09:28] ssh -A fenari.wikimedia.org
[23:09:33] ssh -A stat1
[23:09:55] on stat1: rsync -v —bwlimit=4096 /a/squid/archive/sampled/. /a/squid/archive/sampled/sampled-1000.log-201112*.gz
[23:10:08] ^^ fixed rsync
[23:10:31] of course, adjust the date to whatever you need to pull the right logs
[23:10:56] Hmm do I feel like installing a bunch of centos6 boxes or hacking a centos5 install to get newer libvirt stuff.
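The mount detective work earlier in the session (tracing whether bayes receives logs from locke or emery over NFS, and finding only a dataset2:/data share mounted at /mnt/data) boils down to reading the mount table. A hypothetical helper, not something run in the session, sketches that check against /proc/mounts-style text; the sample line mirrors the dataset2 share Ryan found on bayes:

```python
def nfs_mounts(mounts_text: str) -> dict:
    """Map mountpoint -> remote source for NFS entries in /proc/mounts text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: source, mountpoint, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2].startswith("nfs"):
            result[fields[1]] = fields[0]
    return result

# Sample mirrors what was seen on bayes: one dataset2 share, nothing from locke/emery.
sample = (
    "/dev/sda1 / ext4 rw,errors=remount-ro 0 0\n"
    "dataset2:/data /mnt/data nfs rw,hard,intr 0 0\n"
)
print(nfs_mounts(sample))  # {'/mnt/data': 'dataset2:/data'}
```

On a real host this would be fed the contents of /proc/mounts, which is how one would confirm that no locke: or emery: export is mounted.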
[23:11:21] centos6
[23:11:30] centos5 is ancient
[23:12:23] I wish 5>6 wasn't a re-install.
[23:12:38] 6 has some cool stuff but updating a few thousand 5 boxes is a bit PITA.
[23:13:07] ryan: sorry, but i get an rsync error
[23:13:11] rsync: link_stat "/home/diederik/—bwlimit=4096" failed: No such file or directory (2)
[23:13:17] add a space
[23:13:33] also, — is --
[23:13:47] my client automatically "corrects" that
[23:31:27] diederik: are you good now?
[23:31:54] ryan: yes, i tweaked the command a bit, and i logged in straight to stat1
[23:31:58] rsync -v --bwlimit=4096 locke:/a/squid/archive/sampled-1000.log* /a/squid/archive/sampled/
[23:32:09] it is syncing now
[23:32:11] cool
[23:32:25] thanks a bunch, i know this took way too much time :(
[23:32:34] oh. right. I gave you a bad command. heh. whoops
[23:32:45] it's ok. yw.
[23:48:06] ezachte: thanks for the documentation. added at: http://wikitech.wikimedia.org/view/Bayes
[23:48:22] ezachte: at some point we should puppetize this stuff, then we can manage the code via gerrit
[23:51:21] Ryan_Lane: sure, there are already RT requests to set up stats1 work environment (tools and rights), but this would be good follow-up
[23:53:28] yeah. sounds good
[23:53:32] thanks for the quick response
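The rsync failure at [23:13] happened because a chat client silently replaced the "--" in --bwlimit with a Unicode em dash before the command was pasted. A hypothetical paste sanitizer (illustrative only, not part of the session) that undoes common "smart" typography substitutions before running a pasted shell command:

```python
def desmart(cmd: str) -> str:
    """Undo 'smart' typography that breaks pasted shell commands."""
    return (cmd
            .replace("\u2014", "--")   # em dash back to a double hyphen
            .replace("\u2013", "-")    # en dash back to a single hyphen
            .replace("\u201c", '"')    # curly double quotes back to straight
            .replace("\u201d", '"'))

# The corrupted command from the session: note the em dash before bwlimit.
broken = ("rsync -v \u2014bwlimit=4096 "
          "locke:/a/squid/archive/sampled-1000.log* /a/squid/archive/sampled/")
print(desmart(broken))  # prints the command with --bwlimit restored
```

Without the fix, rsync parses "—bwlimit=4096" as a file path relative to the home directory, which is exactly the link_stat "/home/diederik/—bwlimit=4096" error shown in the log.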