[00:02:58] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.912 seconds [00:05:17] New patchset: Asher; "* for mysqlatfacebook 5.1 r3875 * moved packages to unique names and /usr/local/ to co-exist with mysql 5.5 packages in ubuntu precise (some of which we bring in as dependencies)" [operations/debs/mysqlatfacebook] (master) - https://gerrit.wikimedia.org/r/15989 [00:06:57] notpeter: want to review ^^ with your new fangled dba hat? [00:26:44] +17168, -0 [00:26:45] lol [00:37:46] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:38:31] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [00:44:04] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.39 ms [00:46:37] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 2.660 seconds [00:49:01] PROBLEM - Host mw1074 is DOWN: PING CRITICAL - Packet loss = 100% [01:02:40] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [01:08:13] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.51 ms [01:12:25] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [01:15:25] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [01:21:25] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [01:21:34] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:31:27] New patchset: awjrichards; "Enable mobile redirects for all *.wikimedia.org sites, except commons" [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/16000 [01:31:55] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.029 seconds [01:36:33] New patchset: awjrichards; "Enable mobile redirects for all *.wikimedia.org sites, except commons" 
[operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/16000 [01:41:58] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 250 seconds [01:42:52] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 272 seconds [01:47:39] New patchset: awjrichards; "Enable mobile redirects for all *.wikimedia.org sites, except commons and bits. Change-Id: I51956cb80f6b7fa4eb9fe8d4e26047cf1181ba35" [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/16000 [01:49:46] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 684s [01:50:58] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 11 seconds [01:54:34] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [01:54:52] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 5 seconds [01:55:46] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 20s [02:00:07] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.39 ms [02:04:01] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:12:52] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.170 seconds [02:20:22] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [02:47:22] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [02:52:55] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.37 ms [03:09:53] RECOVERY - Puppet freshness on zinc is OK: puppet ran at Thu Jul 19 03:09:46 UTC 2012 [03:11:22] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [03:16:55] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.41 ms [03:17:59] New patchset: Aaron Schulz; "Added multiwrite backend config - not yet used." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/15963 [03:25:10] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [03:27:07] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [03:33:25] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [03:46:01] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [03:48:16] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 0.55 ms [04:00:07] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [04:02:12] RECOVERY - Host search32 is UP: PING OK - Packet loss = 0%, RTA = 0.27 ms [04:29:12] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [04:34:45] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.35 ms [04:53:27] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [05:18:30] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [05:20:54] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [05:26:27] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.47 ms [06:03:12] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [06:13:15] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [06:18:48] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.41 ms [06:28:02] moring [06:32:36] PROBLEM - Host search32 is DOWN: PING CRITICAL - Packet loss = 100% [07:07:20] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [07:13:02] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.35 ms [07:31:02] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [07:36:44] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.44 ms [07:47:21] bllahh hello [07:51:20] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the 
last 10 hours [07:54:20] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours [08:00:02] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [08:05:35] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.51 ms [08:24:02] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [08:29:35] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.38 ms [08:37:01] New patchset: Hashar; "fix syntax errors" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16011 [08:37:32] apergos: mark: paravoid: mutante: can one of you please merge some syntax errors fixes to puppet ? https://gerrit.wikimedia.org/r/16011 [08:37:36] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16011 [08:38:40] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16011 [08:38:46] I was looking at the TMH stuff and wondering [08:39:08] thx! [08:39:15] paravoid: which stuff were you looking at ? [08:39:24] timedmediahandler.pp [08:39:29] I'm not very happy about it [08:39:59] the main issue is that it rely on an external repo I guess [08:40:20] that [08:40:24] the lsbdistcodename == lucid [08:40:28] the fact that it's a root class [08:41:48] https://launchpad.net/~j/+archive/timedmediahandler/+packages that Jan packages list [08:41:59] he probably need some specific versions of libs that were not in Lucid [08:42:10] anyway, we will need Precise for TMH [08:42:20] cause the transcoding require a recent ffmpeg package [08:42:28] the one provided with Lucid just don't work [08:43:30] I remember :) [08:43:42] I involved in that, remember? :) [08:44:07] of course :-] [08:44:26] just making sure your brain is still up to date following the 2 weeks of DebianConf brainwashing ! 
[08:44:28] ;-D [08:44:32] hehehe [08:44:42] I have also replied to your post about puppet tests / doc etc [08:45:07] we have a rake file already https://gerrit.wikimedia.org/r/15568 [08:45:36] looks like generating the doc is all about running something like: puppet doc --mode rdoc --outputdir doc --manifestdir manifests [08:45:36] I just replied to that too [08:45:52] ohh [08:46:58] we could definitely have Jenkins to run puppet lint on the files that got changed [08:47:06] and trigger a doc build every day [08:49:35] paravoid: do you have any knowledge about varnish and our varnish manifest? [08:49:53] I need to set up a cache for upload.beta.wmflabs.org [08:50:18] I know varnish, however I know little about our setup [08:50:35] will dig in our pp files :-] [08:54:23] paravoid: are you in the mood to merge some of my pending changes ? :D [08:54:42] I get a cleanup one https://gerrit.wikimedia.org/r/#/c/15882/ [08:54:56] a if / else hack that was meant for labs, no more needed [08:55:40] hmm maybe it is indeed [08:55:43] that site eqiad [08:55:44] grmbmblblbl [08:56:53] Change abandoned: Hashar; "That is meant to install varnish in eqiad only and squid in other clusters (like pmtpa / knams). So ..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15882 [09:02:03] paravoid: do you know how to recover files from the corrupted labs instances ? I would need the /etc/squid/* files from deployment-cache-upload ( i-00000263 ) [09:02:11] if not I will mail Ryan and cross fingers :-] [09:05:25] where are those images? [09:05:38] if he's done it before I'd prefer it if he does it tbh [09:05:45] fine to me [09:05:48] mailing him :-] [09:05:55] if not, I'll be happy to investigate [09:06:09] i am pretty sure he did before [09:06:14] I've heard very little of the specifics of the incident [09:07:23] same to me [09:07:31] so better to let Ryan handle that request. 
I have mailed him [09:17:40] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [09:23:13] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.41 ms [09:44:01] paravoid: if you are still around, there is also my rakefile enhancement https://gerrit.wikimedia.org/r/#/c/15568/ , let you validate the whole manifests with just 'rake validate' [09:44:11] and add a help message as a default target :) [10:38:20] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [10:43:53] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.38 ms [10:53:32] New review: Hashar; "Stepping out, letting ops review the change." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/13484 [11:13:41] PROBLEM - Puppet freshness on virt1002 is CRITICAL: Puppet has not run in the last 10 hours [11:16:41] PROBLEM - Puppet freshness on virt1001 is CRITICAL: Puppet has not run in the last 10 hours [11:22:41] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [11:28:50] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [11:34:23] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.35 ms [12:21:46] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [12:21:55] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [12:23:02] hashar: fwiw, I'm completely lost on the things you need me for :) [12:23:21] (not saying that you did something wrong, on the contrary, I got lost between them) [12:23:38] I guess we are both lost [12:23:49] I have been doing heavy context switching for the last 3 months or so [12:24:00] and I am lost myself [12:26:23] paravoid: I guess we could look at my pending puppet changes ? 
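(Editor's sketch: the manifest-checking commands scattered through the exchange above, gathered in one place. Shown as comments since they need puppet/puppet-lint installed; the `rake validate` target is the one proposed in change 15568, and the specific `--no-*-check` flag is illustrative, not the repo's actual lint configuration.)

```shell
# Validate the syntax of a single manifest with stock puppet tooling:
#   puppet parser validate manifests/site.pp
#
# Style-check it while ignoring upstream puppet-lint rules the repo
# does not follow (flag shown here is only an example):
#   puppet-lint --no-80chars-check manifests/site.pp
#
# Or, with the proposed Rakefile, validate the whole manifest tree at once:
#   rake validate
#
# Generate rdoc documentation from the manifests, as discussed above:
#   puppet doc --mode rdoc --outputdir doc --manifestdir manifests
```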
[12:27:19] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.42 ms [12:29:28] New patchset: Hashar; "(bug 37076) `lint` tool require php5-lint" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13048 [12:30:04] New review: Hashar; "Added dependency to lint.php script as well." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/13048 [12:30:05] New patchset: Hashar; "(bug 37076) `lint` tool require php5-lint" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13048 [12:30:39] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13048 [12:46:49] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [13:08:08] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.37 ms [13:34:50] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [13:42:27] !log dist-upgrade and reboot hosts in payments cluster [13:42:35] Logged the message, Master [13:42:37] hiyaaa [13:42:46] anyone there have gerrit project delete powers? [13:43:00] i created my new project repo with an empty first commit [13:43:03] which I did not want to do [13:43:15] and now i'm having trouble pushing my local repository (which already has a few commits) to the remote [13:45:35] Probably need/want Chad or Ryan [13:45:54] hm, ok, is Chad chad on IRC? [13:46:12] He's not here atm but he's ^deamon [13:47:41] ottomata: what is the repo ? maybe I can help [13:48:26] operations/debs/nodejs [13:48:31] https://gerrit.wikimedia.org/r/#/admin/projects/operations/debs/nodejs [13:48:37] <-- hashar btw [13:48:37] you can just delete it, and then I can recreate it properly [13:48:47] hashar! aka gerrit-helper [13:48:49] heheh [13:49:21] @search hashar [13:49:21] Results (Found 1): hashar, [13:49:26] !hashar [13:49:27] [10:15:12] !log WMFLabs seems to have recovered now [13:49:36] :o [13:49:58] ... 
[13:50:07] I am pretty sure I have whatever bot is responding in my ignore list :-] [13:50:22] heh [13:50:36] ottomata: you could push your own repo using git push -f [13:50:42] ottomata: need a force push right though [13:50:44] -f eh? [13:50:45] hmm [13:50:46] ok [13:51:16] quick q [13:51:19] ottomata: which apparently any people from the analtyics team can do [13:51:21] does this look right? [13:51:22] remote.origin.url=ssh://otto@gerrit.wikimedia.org:29418/operations/debs/nodejs.git [13:51:23] remote.origin.fetch=+refs/heads/*:refs/remotes/origin/* [13:51:23] branch.master.merge=refs/heads/master [13:52:03] please tell me you're not building packages from scratch :) [13:52:21] hehe [13:52:24] nope [13:52:25] not me anyway [13:52:38] john du hart had already done this: [13:52:48] http://svn.mediawiki.org/viewvc/mediawiki/trunk/debs/nodejs/ [13:52:52] oh and I would expect a project under operations/debs to be under ops control [13:53:07] i am just trying to build a nodejs 0.8.2 .deb [13:53:11] yeah, i could do that [13:53:11] ohh [13:53:18] and figured we can change that later [13:53:19] the package in our svn is quiet old [13:53:23] latest and greatest? :) [13:53:26] yeah but it is just the debian [13:53:28] was snapshoted from debian IIRC [13:53:34] isn't there a fresh one in Ubuntu ? [13:53:42] there's an 0.6.x something [13:53:46] afai can tell [13:53:52] (or maybe Ubuntu is going to drop node.js because it release too often *grin* ) [13:53:52] we need at least 0.8.0 [13:53:53] yes [13:54:24] i would put ops perms on this repo, but then I couldn't push straight to it :p, and I'd have to wait long times to get reviews for things that have already been reviewed [13:54:25] and quantal has just 0.6.19 :/ [13:54:32] or i'd just not bother importing this from svn [13:54:34] and just commit to svn [13:54:37] but I do not have svn perms [13:54:39] to commit [13:54:45] forget about svn [13:55:10] ooo, -f might have worked... [13:55:21] yes! 
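(Editor's sketch: a self-contained illustration of the `git push -f` trick hashar suggests. Two throwaway local repositories stand in for the Gerrit remote — which was created with an unwanted empty first commit — and for ottomata's local repo with real history; all paths, names, and commit messages are hypothetical.)

```shell
set -e
cd "$(mktemp -d)"

# Stand-in for the Gerrit remote that was created with an unwanted empty commit
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git clone -q remote.git seed 2>/dev/null
cd seed
git symbolic-ref HEAD refs/heads/master
git -c user.name=x -c user.email=x@example.org commit -q --allow-empty -m "empty init"
git push -q origin master
cd ..

# A local repository that already has real history, unrelated to the remote
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README
git add README
git -c user.name=x -c user.email=x@example.org commit -q -m "real work"

# A plain push would be rejected as a non-fast-forward;
# -f overwrites the remote's unwanted history with the local one
git remote add origin ../remote.git
git push -q -f origin master
git --git-dir=../remote.git log --format=%s master   # the remote now holds only "real work"
```

Note the caveat from the log: force-pushing requires the right Gerrit ACL ("forge"/push permission on refs/heads/*), which is why it worked for the analytics team's repo.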
[13:55:23] paravoid: don't ubuntu has something like Debian alioth where devs do their packaging work ? [13:55:33] yes! [13:55:34] it works [13:55:38] thanks hashar! [13:55:53] hashar: you mean a thing called "launchpad"? :) [13:55:55] * hashar write down ottomata owe him a beer. [13:56:20] yeah, and there is a guy that has already made these debs in launchpad [13:56:23] i was advised to build our own though... [13:56:35] paravoid: hahh that one indeed [13:56:50] http://ppa.launchpad.net/chris-lea/node.js/ubuntu/pool/main/n/ [13:57:13] paravoid: I though Debian had some kind of git repository for the debian directories [13:57:24] hashar: not exactly [13:57:42] when it comes to choices, Debian picks all of them [13:57:49] where the awesome deb developers would experiment their new packages before sending them to experimental / unstable [13:58:10] so, we have cvs, svn, hg, git, bzr, arch, darcs, monotone [13:58:29] most packages used to use svn though, with git being prevalent nowadays I think [13:58:41] most of them are hosted on git.debian.org, although there's no strict requirement to do so [14:00:01] ahhh http://anonscm.debian.org/gitweb/?p=collab-maint/nodejs.git [14:00:05] that is what I was looking for [14:00:11] from the PTS http://packages.qa.debian.org/n/nodejs.html [14:00:17] collab-maint is a special alioth project where all DDs have commit access [14:00:51] so that is for packages which are simple enough or have no clear maintainer ? [14:01:13] they do have clear maintainers but it's being used either on very simple packages where it doesn't make sense to create a dedicated team [14:01:24] (I love Debian but I barely understand their organization, we are lucky to have you around!) [14:01:29] or on packages that the maintainers feel particularly open about their maintainance [14:01:47] it's an easy way to collaborate, rather than go through the whole process of building a new team [14:02:08] traditionally Debian used to have sole maintainers or e.g. 
two-three maintainers per package [14:02:20] we're migrating to team maintainance or even collaborative maintainance gradually [14:02:33] in Ubuntu for example, everyone can commit everywhere [14:02:44] both paradigms have their pros and cons [14:03:08] so maybe we could have the node.js packaging done upstream with the Debian folks ? [14:03:15] on the collar-maint/nodejs.git repo? [14:04:47] of course you could [14:04:58] k was wondering. thx [14:04:58] ;) [14:05:12] fresh air timem [14:05:16] we have some sun outside :-] [14:05:20] (which is rare) [14:05:53] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [14:07:59] RECOVERY - SSH on virt1002 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [14:08:08] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.70 ms [14:08:16] hate gerrit hate [14:08:23] (sorry, I just want to rant) [14:08:24] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/15900 [14:09:00] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/15807 [14:09:18] please add your hate to the gerrit replacement page [14:09:26] then you can feel like your hate is productive too [14:09:53] heh [14:10:25] I've added two or three entries there [14:10:52] FF14... [14:12:09] what about it? [14:12:19] New patchset: Andrew Bogott; "Temporary change" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16025 [14:12:28] It's amsuing how quickly they're bumping the major versions now [14:12:55] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16025 [14:14:17] six-weeks isn't it? 
[14:14:28] six-week release cycle that is [14:14:33] they do have the ESR releases though [14:14:40] Sounds about right [14:15:48] the ESRs are yearly or something [14:28:26] New patchset: Andrew Bogott; "Revert "Temporary change"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16028 [14:29:02] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16028 [14:32:10] PROBLEM - Host virt1003 is DOWN: PING CRITICAL - Packet loss = 100% [14:37:43] RECOVERY - Host virt1003 is UP: PING OK - Packet loss = 0%, RTA = 35.66 ms [14:42:04] PROBLEM - SSH on virt1003 is CRITICAL: Connection refused [14:46:51] hashar: around? [14:47:00] paravoid: yup [14:48:08] I'm looking at your review requests [14:48:22] have a few mins to help me understand them? [14:49:05] of course, go ahead! ;-] [14:49:19] so, rakefile; is it used by anything else but users? [14:49:37] why do we even have that? how is it different than "puppet-lint foo.pp" or "puppet parser validate foo.pp"? [14:50:37] lot of newbies don't know about those commands [14:50:53] my point is to provide an easy to use script to people [14:51:02] okay [14:51:05] I don't mind [14:51:05] + puppet-lint comes by default with rules that we do not respect at all [14:51:19] so I wanted to make an easy way to ignore some upstream rules [14:51:22] well, I hope we will with modules [14:51:31] tie the cleanups with the split up [14:51:41] have ops agreed on a set of rules? [14:51:49] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/15568 [14:51:51] I know MediaWiki people use hard tabs as [14:52:05] whereas I think puppet would like to use 2 spaces [14:52:14] not a big deal though [14:52:24] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13339 [14:52:30] yeah!!! [14:53:10] anyway, I wanted to clean out some lintng errors then someone (maybe mark?) 
told me not to cause it clutter the blame history [14:53:17] which is a valid point [14:53:22] re: lint, that needs manual steps, right? [14:53:43] I am not sure I understand your question? [14:53:57] do you mean it need manual steps to fix the errors reported by lint? answer is yes [14:54:10] though there might some tool around to automatically solve them [14:54:12] which I doubt [14:54:13] sorry, I context switched at a very bad time [14:54:24] I meant php5-lint :) [14:54:25] ssh paravoid vmstat 1 [14:54:31] that needs pecl uninstall, right? [14:54:55] not sure [14:54:58] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [14:55:05] hey boys, q about packaging again [14:55:09] http://wikitech.wikimedia.org/view/Apt [14:55:11] says: [14:55:19] "It's important that we always specify the right distribution (hardy-wikimedia or lucid-wikimedia) for which the package is built in the package's Changelog (debian/changelog)." [14:55:29] !gerrit 13048 [14:55:29] https://gerrit.wikimedia.org/ [14:55:34] er [14:55:36] does that mean that the changelog rev should say 'lucid-wikimedia' in the version? [14:55:46] I disagree that it's important [14:55:49] paravoid: the lint script just load a php script that depends on parsekit_compile_file() provided by the php5-parsekit package you installed. Not sure pecl is related there [14:56:03] iv [14:56:07] i've got this in my control file right now [14:56:07] paravoid: anyway PHP uses a building PECL lib iirc [14:56:07] nodejs (0.8.2-1wm1) lucid-wikimedia; urgency=low [14:56:14] then over at http://wikitech.wikimedia.org/view/Pbuilder [14:56:18] ottomata: that's what it means [14:56:20] whne I create a distribution [14:56:26] should I say [14:56:30] pbuilder create --distribution lucid-wikimedia [14:56:31] ? [14:56:52] hashar: you mention on your commit message something about pecl uninstall etc. [14:57:04] paravoid: hooo yeah [14:57:21] which servers is that? fenari/bast1001? 
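(Editor's sketch: the debian/changelog convention being discussed, made concrete. The entry below is a hypothetical reconstruction matching the version line ottomata quotes; the sed expression just shows where the target distribution lives in the first line of the changelog.)

```shell
set -e
cd "$(mktemp -d)"
mkdir -p debian

# Hypothetical first entry, following the wikitech convention quoted above:
# the distribution field — after the version, before ';' — names the target repo
cat > debian/changelog <<'EOF'
nodejs (0.8.2-1wm1) lucid-wikimedia; urgency=low

  * New upstream release, packaged for the Wikimedia apt repository.

 -- Example Maintainer <ops@example.org>  Thu, 19 Jul 2012 14:56:00 +0000
EOF

# Extract the distribution from the first line
# (dpkg-parsechangelog does the same, when dpkg-dev is installed)
dist=$(sed -n '1s/^[^(]*([^)]*) \([^;]*\);.*/\1/p' debian/changelog)
echo "$dist"   # lucid-wikimedia
```

Per the wikitech page quoted above, this same distribution string is what would be passed to `pbuilder create --distribution`.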
[14:57:23] paravoid: so since we did not have php5-parsekit yet, we had to install that extension using pecl the php extension installer. [14:57:32] New patchset: Faidon; "(bug 37076) `lint` tool require php5-lint" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13048 [14:57:43] paravoid: whenever debian does not provide a php5-foobar package, we have to pecl install foobar [14:58:06] yeah I got that [14:58:07] paravoid: once 13048 it merged, it can be applied to fenari [14:58:09] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/13048 [14:58:26] paravoid: the cleanup task would be to remove the old copy that was installed by pecl [14:58:34] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13048 [14:58:44] paravoid: I have added the "how to" in commit message as a reminder to whoever was going to merge it [14:59:20] fenari$ pecl list [14:59:21] parsekit 1.3.0 stable [14:59:22] ; [14:59:22] ) [14:59:51] great [15:00:13] PROBLEM - NTP on virt1003 is CRITICAL: NTP CRITICAL: No response from NTP server [15:00:20] so that one could get removed [15:00:24] whenever puppet has run on fenari [15:00:38] not sure if this is relevant, but someone has written a providers for puppet's package resource for pear and pecl [15:00:38] http://www.mit.edu/~marthag/puppet/ [15:00:45] i've used them before and they worked great [15:00:46] please don't [15:01:06] package { "foobar": provider => "pecl" } [15:01:14] :-D [15:01:24] ops don't want to have puppet to install from third parties repository [15:01:29] aye cool [15:01:29] I think that installing whatever packaging each language decided to create is a very bad idea [15:01:36] ok, fine with me! 
[15:01:52] someone could sneak a faulty package in upstream repo and kill us :-) [15:01:56] I guess that is the main point [15:02:08] I've worked with systems that had gems, pypi, pecl and whatnot [15:02:23] they sneak bugs instead, bastards! [15:02:38] every language author thinks they're better than distributions [15:02:42] and they can do a better job [15:03:05] and that it's /their/ job of providing tools for the sysadmins [15:03:06] ottomata: if you ever need a specific php5 extension, it is best to use the debian package. If it is not there we can get ops to build the .deb package for us (Debian has a script that generate a nice package out of pecl. Something like : dh-build-from-pecl foobar) [15:03:23] php is even more fucked up, they decided to create /two/ of them! [15:03:31] ah ok cool [15:03:33] composer!!! ;-D [15:03:45] good to konw [15:03:52] for php http://getcomposer.org/ [15:04:12] paravoid, do you remember what it was like when you were learning to create .debs for the first time [15:04:13] ? [15:04:14] AGH! [15:04:18] SO MANY DIFFERENT WAYS [15:04:19] hehe [15:04:24] q [15:04:30] if I have a debian/ dir that works [15:04:31] I only managed to build two so far :-/ [15:04:32] can I just run [15:04:36] dpkg-buildpackage [15:04:37] ? [15:04:38] and both time had to find a doc [15:04:44] anyway, now I ask paravoid :-] [15:04:45] or do I have to use pbuilder? [15:04:52] do I have to have a chroot? [15:05:08] ottomata: if you're a newbie, I wouldn't recommend to you to build packages for nodejs [15:05:10] echo 'can you please build a package for foo? 
thanks -- antoine' | mail faidon@wikimedia.org [15:05:17] well i don't have to create the debian/ [15:05:20] john du hart already has [15:05:26] and I build it locally [15:05:29] seemd to work great [15:05:42] all I did was change changelog, and change a dependency in debian/control [15:05:50] most of the work is doing by nodejs's build scripts [15:06:07] http://svn.mediawiki.org/viewvc/mediawiki/trunk/debs/nodejs/debian/rules?revision=101926&view=markup [15:06:17] half of upstream build systems are completely fucked up :) [15:06:28] it seemed to work [15:06:49] node.js even uses (used?) WAF, couldn't do much worse [15:07:02] WAF? [15:07:22] a broken by design build system [15:08:03] it produces a python script that self-extracts a uuencoded payload that contains more python code that is dynamically loaded and run on the target system [15:08:29] hm, so i'm not sure of the details [15:08:35] but previously they were using scons [15:08:37] and now are using gyp [15:08:38] ? [15:08:40] hahahahaha [15:08:57] isnt scons just the build system? [15:09:07] well, it's the node.js, shouldn't expect more :) [15:09:07] for python, i think so, but they are using gyp now instead [15:09:16] make is just too mainstream for them :-) [15:09:31] anyway I just use npm whenever I need a node package [15:09:32] http://npmjs.org/ [15:09:42] oh right, they made their own package manager as well [15:10:03] and npm uses a shell script as an installer :D http://npmjs.org/install.sh [15:10:21] npm is kind of good [15:10:29] gem just works [15:10:50] cpan is a bit troublesome but it is just like perl anyway so you get used to that [15:11:14] pear / pecl I can never recall the command and always have to add the channel (aka the remote repository) [15:11:18] pip just works [15:11:31] one day we will come with a universal package manager :-D [15:11:34] oh wait [15:11:38] .deb ! 
[15:11:42] you forgot cabal [15:12:11] I am pretty sure Debian has helpers to generates .deb out of pip/gem/npm/pear/pecl etc [15:12:18] most of them, yeah [15:12:23] so um, anyway, [15:12:27] if I can build a .deb [15:12:30] using dpkg-buildpackage [15:12:34] is that all I need to do [15:12:35] ? [15:12:38] or do I need to be fancier? [15:17:12] depends on your goal :) [15:17:30] building in a clean chroot is a prerequisite for building proper packages [15:17:48] my goal is to eventually have node 0.8.2 installable on prod machines [15:17:48] but if you don't much about packaging then pbuilder is the least of your problems :) [15:18:00] so whatever you guys require to get it into the apt repo, i guess is what I need to do [15:18:35] the new reportcard needs it [15:18:35] http://reportcard.wmflabs.org/ [15:18:46] right now it is in labs, but sometime soon we want to move it to a more permanent place [15:18:46] someone else needed it do, wasn't it? [15:18:51] the visual editor team or something? [15:18:54] oh maybe? [15:18:55] dunno! [15:18:56] do they? [15:19:03] I think so [15:19:06] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [15:19:35] and why do you need 0.8.2? [15:19:43] why isn't the version in precise enough? [15:20:04] if the goal is to get it into production, it needs a rationale too :-) [15:20:47] i thikn we need 0.8.0 [15:20:56] and I do not know the exact reason, david schoonver does [15:20:59] he is developer for this [15:21:03] some dependency or something [15:21:09] also, I think I should review the work too; unfortunately we often skip peer review, but if you're not very experienced in packaging, we should probably exercise this [15:21:20] hm but I didn't do anything! [15:21:27] john du hart created this [15:21:30] and it is already in wm apt repo [15:21:33] is it? [15:21:36] oh? 
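(Editor's sketch: the clean-chroot build paravoid is describing as a prerequisite for proper packages. Shown as comments because pbuilder needs root and a network mirror; the distribution, mirror line, and .dsc filename are illustrative, reusing the version strings from the discussion above.)

```shell
# One-time: create a clean chroot matching the changelog's target distribution
#   sudo pbuilder create --distribution lucid \
#        --othermirror 'deb http://apt.wikimedia.org/wikimedia lucid-wikimedia main'
#
# Per build: produce a source package, then build it inside the clean chroot,
# so the result depends only on declared Build-Depends, not on whatever
# happens to be installed on the developer's machine
#   dpkg-buildpackage -S -us -uc
#   sudo pbuilder build ../nodejs_0.8.2-1wm1.dsc
```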
[15:21:37] we depend on packages that depend on 0.8.x [15:21:53] and in general, node moves quickly and improves greatly with each point release. [15:21:59] http://apt.wikimedia.org/wikimedia/pool/main/n/nodejs/ [15:22:03] it's something like 6x faster, and uses less RAM [15:22:07] wow [15:22:19] (0.4.x was released well over a year ago. not a surprise.) [15:22:24] here is the change I made [15:22:24] https://gerrit.wikimedia.org/r/gitweb?p=operations/debs/nodejs.git;a=commitdiff;h=4b2ec81ec8add91234b4c6645f3d3b404e0c8386 [15:22:29] keep in mind that this is also including updates to v8 [15:22:36] i ported the svn stuff into a new gerrit project [15:22:38] and made that one change [15:22:41] so updating node means getting all the benefits of google's work on v8 [15:22:48] ottomata: I don't think that's a good strategy [15:22:57] porting from svn? [15:23:01] (also, node is one of the most stable projects i've worked with, so i'm not terribly scared of regular updates.) [15:23:13] the Ubuntu people have 0.6.19 [15:23:17] http://svn.mediawiki.org/viewvc/mediawiki/trunk/debs/nodejs/ [15:23:47] I'm not sure what are the changes between 0.4.9 and 0.6.19, but if you're making packages for 0.8.2 I think you would benefit from using 0.6.19 as your base rather than 0.4.9 [15:24:04] i'm not using anything as my base [15:24:16] right? [15:24:20] maybe I don't know what you mean [15:24:26] we ahven't modified the nodejs code at all [15:24:33] all we have is a debian/ dir [15:24:35] I'm talking about packaging [15:24:41] that i'm not even sure if johnduhart actually made himself [15:24:42] not the node.js source itself [15:24:47] he might have gotten it somewhere and just tweaked it [15:24:57] oh i see [15:25:05] you mean take debian/ from ubuntu's 0.6.19 version? [15:25:06] and use that? 
[15:25:13] yes [15:25:17] I think that's a better idea [15:25:33] PROBLEM - Host virt1003 is DOWN: PING CRITICAL - Packet loss = 100% [15:25:38] or else you'll lose all of the 0.4.9->0.6.19 work that Ubuntu might have done [15:25:39] sigh ok, got different advice from notpeter yesterday :p [15:25:46] oh? [15:25:56] well, since we already have our own debian/ [15:26:00] (that does build with 0.8.2) [15:26:09] he thought it would be best if we used that [15:26:18] what do you mean "our own"? it's Ubuntu's 0.4.9 [15:26:34] and ubuntu has since updated it to 0.6.19 [15:26:43] it is ubuntu's? [15:27:18] that's what debian/changelog suggests :) [15:27:26] oh why yes it does [15:27:28] ok cool, [15:27:37] and johnduhart hardly did any of it [15:27:38] ok ok ok [15:27:39] there are no ubuntu specific changes. [15:27:42] will grab ubuntu's and try that [15:27:50] i've built the source on a stock ubuntu machine. [15:27:53] it requires no patches. [15:28:01] yeah, he's talking about the debian/ dir [15:28:11] it is possible ubuntu has made configure changes for newer versions [15:28:16] just safer to start with what they are using now [15:28:23] rather than what they were using at 0.4.x [15:28:52] now to figure out how to find that... [15:29:04] $ ls nodejs-0.4.9/debian/patches/ |wc -l [15:29:04] 8 [15:29:11] $ ls nodejs-0.6.19~dfsg1/debian/patches/ |wc -l [15:29:11] 12 [15:29:21] that's patches to node.js itself [15:29:24] interesting [15:29:31] does running buildpackage apply the patches? [15:29:36] automatically? [15:30:21] $ diff -Nurp nodejs-{0.4.9,0.6.19~dfsg1}/debian | diffstat -s [15:30:21] 30 files changed, 88202 insertions(+), 2179 deletions(-) [15:30:41] that's what you basically lost by forking 0.8.2 from 0.4.9 instead of 0.6.19 :-) [15:30:58] aye [15:31:00] ok [15:31:06] RECOVERY - Host virt1003 is UP: PING OK - Packet loss = 0%, RTA = 35.34 ms [15:33:42] I'm looking at the diff; most of them don't look relevant to us (e.g. 
porting it to ARM) [15:33:44] so, do I need to apply these patches manually? [15:33:58] but still, it doesn't make sense to fork off an ancient version and do three-way merges later [15:34:02] yeah i agree [15:34:06] no, they'll get applied automatically [15:34:08] ok cool [15:34:29] so i'm not sure of the best way to bring this over to my already existing repo, but, [15:34:33] since I don't have gerrit delete powers [15:34:39] cd nodejs [15:34:46] git rm -rf ./debian [15:34:59] ...what is in these patches? [15:35:22] cp -a ../nodejs-0.6.19-debian ./debian [15:35:23] because i'm not aware of anything that needs to be applied off the mainline release. [15:35:24] git add debian [15:35:29] seems to work... [15:36:42] dschoon: http://patch-tracker.debian.org/package/nodejs [15:36:48] thanks [15:37:06] !log rebooting hydrogen to set bios redirection [15:37:08] ottomata: also, dschoon mentioned something about newer v8 before; if you really need that, then you need to first backport that, install it into your system and then rebuild node :-) [15:37:15] Logged the message, RobH [15:37:20] (welcome to the joy of packaging) [15:37:39] paravoid: pretty sure the source distro of node has its deps frozen. [15:37:43] bwerrrrr? [15:37:46] which includes the target of v8 [15:37:48] so ignore that :) [15:38:12] I'd do all that for you but considering the increasing amount of such requests that I get per day (you're the third one today), I fear I'll soon become a bottleneck if I do so [15:38:36] dschoon: I'm pretty sure the packages don't use embedded libraries [15:38:46] meh.
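The debian/-swap that ottomata types out piecemeal above can be gathered into one runnable sketch. This is not the real gerrit checkout: the repo, changelog lines, and directory names below are stand-ins created on the spot so the sequence (remove the old packaging dir, copy in the newer one, commit as a single change) can be exercised end to end.

```shell
# Scratch reconstruction of the debian/ swap -- all paths and versions
# here are illustrative stand-ins, not the real repos.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-ins for the gerrit checkout (carrying Ubuntu's old 0.4.9
# packaging) and for Ubuntu's newer 0.6.19 debian/ dir unpacked beside it.
mkdir -p nodejs/debian nodejs-0.6.19-debian
echo 'nodejs (0.4.9-1ubuntu1) natty; urgency=low' > nodejs/debian/changelog
echo 'nodejs (0.6.19~dfsg1-1ubuntu1) quantal; urgency=low' > nodejs-0.6.19-debian/changelog

cd nodejs
git init -q
git add -A
git -c user.name=x -c user.email=x@x commit -qm 'import 0.4.9 packaging'

# The swap itself: drop the old dir, copy in the new one, commit once.
git rm -r -q debian
cp -a ../nodejs-0.6.19-debian debian
git add debian
git -c user.name=x -c user.email=x@x commit -qm 'replace debian/ with Ubuntu 0.6.19 packaging'
head -1 debian/changelog
```

Doing it as one commit keeps the history readable: the diff of that commit is exactly the 0.4.9 to 0.6.19 packaging delta.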
[15:38:48] nono, i'm trying to learn how to do it [15:38:56] would much rather you do like you are doing and answer all these questions :) [15:39:01] (thank you, btw ) [15:39:02] i shall leave it up to ottomata whether he wants to *also* package up v8 [15:39:15] but the tarball you get from nodejs.org contains v8, afaik [15:39:22] dschoon: ewwww [15:39:28] dschoon: that's always a HORRIBLE idea [15:39:32] *shrug* [15:39:41] dschoon: packagers always disable that and use the system copy of libraries [15:39:48] PROBLEM - Host 2620:0:861:1:7a2b:cbff:fe09:c21 is DOWN: PING CRITICAL - Packet loss = 100% [15:40:05] oh that host [15:40:14] v8.h is in the source [15:40:15] so yes? [15:40:18] I still remember when a zlib vulnerability was found and we needed to rebuild something like 50 packages [15:40:27] (we as in Debian) [15:40:33] PROBLEM - Host hydrogen is DOWN: PING CRITICAL - Packet loss = 100% [15:40:42] PROBLEM - Host 208.80.154.50 is DOWN: PING CRITICAL - Packet loss = 100% [15:41:09] so, nodejs on quantal depends on libv8-3.8.9.20, while precise has libv8-3.7.12.22 [15:41:26] i.e. it definitely uses the system copy and quantal has a newer v8. [15:42:02] fyi, ottomata / paravoid: https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager [15:42:46] New patchset: Hashar; "basic README introducing our files" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16035 [15:42:54] i would prefer we did this: [15:42:55] sudo add-apt-repository ppa:chris-lea/node.js [15:42:56] or [15:43:01] just add his .debs to our apt [15:43:07] would that be ok paravoid? [15:43:16] or do i need to build something with a changelog modified for wikimedia? [15:44:00] (for reference https://github.com/joyent/node ) [15:44:14] i agree with ottomata, btw.
we should just use chris lea's stuff [15:44:54] RECOVERY - SSH on virt1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [15:47:35] my personal opinion is that including the debs in our repo is fine [15:48:00] I wouldn't use the ppa itself, since that might change to e.g. newer incompatible versions or even a broken upload [15:48:14] really? that would make our lives so much easier [15:48:23] how do we make that happen? [15:48:23] !log hydrogen repaired per rt 3243 [15:48:25] we wouldn't want e.g. an apt-get upgrade on production to bring 0.9 without anyone having tested that [15:48:31] Logged the message, RobH [15:48:31] yeah certainly [15:48:35] that's what we did at CouchSurfing too [15:48:41] we'd use 3rd party debs no problem [15:48:47] uhh, rpms [15:48:48] but always from our own yum repo [15:48:53] so yeah [15:48:58] should I just email ops list and ask? [15:49:09] I can just do that now [15:49:13] i'm sure the discussion would be brief, supportive, and concise! [15:49:19] yay! from you it would probably carry more weight [15:49:35] well, I meant just including the debs in the repo [15:49:41] oh! [15:49:44] even better! [15:49:49] what worries me more is if we have anyone else using node.js [15:50:04] like, currently, and then they upgrade, and something crazy happens? [15:50:11] yes [15:50:26] hm, but they shouldn't auto-upgrade, right? :) [15:50:49] well, an apt-get upgrade on a stable system should be harmless [15:50:57] we don't stage apt-get upgrades, which would be impossible [15:51:04] aye hm [15:51:06] welp [15:51:06] hm [15:51:07] http://ppa.launchpad.net/chris-lea/node.js/ubuntu/pool/main/n/ [15:51:14] we need both nodejs and npm [15:51:27] for precise, right? [15:51:37] probably should do both precise and lucid [15:51:47] both?
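The worry paravoid raises here (a routine apt-get upgrade pulling in an untested 0.9 once third-party debs sit in the repo) has a standard guard that was not discussed in the channel but is worth noting: an apt version pin. The file path and version pattern below are illustrative only, not a change that was made.

```
# /etc/apt/preferences.d/nodejs -- illustrative pin, not an actual change
Package: nodejs
Pin: version 0.8.*
Pin-Priority: 1001
```

A priority above 1000 makes apt prefer the pinned version even when a newer one appears in the repo, so `apt-get upgrade` stays on the tested 0.8 series until someone bumps the pin deliberately.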
[15:52:01] i think our labs instances are on lucid right now, [15:52:10] well, you know, i guess just precise is fine [15:52:18] i'm sure the final home will be precise [15:52:19] and [15:52:25] I can just dpkg -i them on the labs instances [15:52:44] for now [15:53:09] RECOVERY - SSH on virt1003 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [15:54:41] paravoid: finally there is the getting rid of our ifs instance change at https://gerrit.wikimedia.org/r/#/c/15545/ [15:54:47] paravoid: that one might impact production though [15:54:56] paravoid: and labs is dead. So I guess that will be for next week :-] [15:56:18] ottomata: done :) [15:56:53] hehehehehe [15:56:54] ok [15:58:02] oh my woooo, ok thanks [15:58:33] now if only I had a precise instance to install this on…maybe I will have to make one :p [15:58:34] thank youuuuu! [15:58:37] saved me so much time [16:04:15] PROBLEM - Puppet freshness on owa1 is CRITICAL: Puppet has not run in the last 10 hours [16:10:21] New patchset: Hashar; "beta: wmgArticleFeedbackLotteryOdds => 0" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16036 [16:12:09] so hashar, do you have gerrit project delete powers? [16:12:14] since my repo is not being used at all anymore [16:12:14] hashar: uninstall ok: channel://pecl.php.net/parsekit-1.3.0 [16:12:15] we should delete it [16:12:18] hashar: (fenari) [16:12:21] ottomata: we can't delete projects in gerrit [16:12:24] hashar: please verify that lint works? [16:12:32] paravoid: let me check [16:12:33] :D [16:13:04] won't that confuse people in the future? [16:13:05] PHP Warning: PHP Startup: Unable to load dynamic library '/usr/lib/php5/20090626/parsekit.so' - /usr/lib/php5/20090626/parsekit.so: cannot open shared object file: No such file or directory in Unknown on line 0 [16:13:06] grmblblblb [16:13:08] i guess I can change the description saying [16:13:12] ah that must be the php.ini file [16:13:12] USELESS! 
[16:13:57] hashar: nope [16:14:01] so [16:14:13] that's why I despise language "package managers" [16:14:16] I installed php5-parsekit [16:14:21] then did pecl uninstall parsekit [16:14:27] cli/php.ini [16:14:28] 942:extension=parsekit.so [16:14:30] which went and removed the file that php5-parsekit shipped [16:14:31] hmm [16:14:44] it went and fucking removed stuff off /usr/lib [16:14:59] I just did apt-get install --reinstall [16:15:01] try again. [16:15:23] stupid me [16:16:27] well hmm [16:16:31] it is working again apparently [16:22:03] paravoid: ah I haven't seen your message. So yeah it works. Thanks a ton :-] [16:22:19] paravoid: off for today, see you tomorrow :-] [16:28:22] Logged the message, Master [16:33:06] PROBLEM - Host mw60 is DOWN: PING CRITICAL - Packet loss = 100% [16:47:12] PROBLEM - Host virt1001 is DOWN: PING CRITICAL - Packet loss = 100% [16:52:09] RECOVERY - Host virt1001 is UP: PING OK - Packet loss = 0%, RTA = 35.40 ms [16:53:12] RECOVERY - Host mw60 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms [16:56:48] PROBLEM - Apache HTTP on mw60 is CRITICAL: Connection refused [17:08:48] RECOVERY - Apache HTTP on mw60 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.030 second response time [17:12:06] PROBLEM - Host virt1001 is DOWN: PING CRITICAL - Packet loss = 100% [17:14:39] RECOVERY - Host virt1001 is UP: PING OK - Packet loss = 0%, RTA = 35.43 ms [17:14:40] New patchset: Pyoungmeister; "role/apache.pp cluster names are not the same as names in lvs.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16039 [17:15:16] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16039 [17:24:33] Change abandoned: Pyoungmeister; "will just fix properly now..." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/16039 [17:34:09] PROBLEM - Host virt1001 is DOWN: PING CRITICAL - Packet loss = 100% [17:36:45] RECOVERY - Host virt1001 is UP: PING OK - Packet loss = 0%, RTA = 35.38 ms [17:49:39] PROBLEM - Host virt1002 is DOWN: PING CRITICAL - Packet loss = 100% [17:52:21] PROBLEM - Puppet freshness on nfs2 is CRITICAL: Puppet has not run in the last 10 hours [17:52:30] RECOVERY - Host virt1002 is UP: PING OK - Packet loss = 0%, RTA = 35.60 ms [17:55:21] PROBLEM - Puppet freshness on nfs1 is CRITICAL: Puppet has not run in the last 10 hours [18:07:57] PROBLEM - Host virt1003 is DOWN: PING CRITICAL - Packet loss = 100% [18:09:18] RECOVERY - Host virt1003 is UP: PING OK - Packet loss = 0%, RTA = 35.38 ms [18:15:37] !log storage3 dist-upgrade and reboot [18:15:46] Logged the message, Master [18:17:42] PROBLEM - Host storage3 is DOWN: PING CRITICAL - Packet loss = 100% [18:18:09] RECOVERY - Host storage3 is UP: PING OK - Packet loss = 0%, RTA = 0.36 ms [18:21:07] !log updating dns with new zonefiles for legally won domain names [18:21:15] Logged the message, RobH [18:21:27] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: (Return code of 255 is out of bounds) [18:22:57] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 12s [18:28:57] PROBLEM - Host srv278 is DOWN: PING CRITICAL - Packet loss = 100% [18:29:51] RECOVERY - Host srv278 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms [18:33:36] PROBLEM - Apache HTTP on srv278 is CRITICAL: Connection refused [18:34:55] cmjohnson1: im getting alarms for ps1-d3 [18:35:06] its lost its lead for its temp/humid sensor 1 [18:35:14] i'll drop a ticket for ya [18:37:21] RECOVERY - Puppet freshness on virt1001 is OK: puppet ran at Thu Jul 19 18:36:48 UTC 2012 [18:37:21] RECOVERY - Puppet freshness on virt1002 is OK: puppet ran at Thu Jul 19 18:36:48 UTC 2012 [18:38:55] cool [18:38:59] too late though ;] [18:41:25] loving that 
observium spam yet? [18:41:26] ;] [18:46:12] RECOVERY - Apache HTTP on srv278 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.028 second response time [18:48:56] New patchset: preilly; "rely on HTTP_X_SUBDOMAIN not on HTTP_X_CARRIER" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16053 [18:49:44] Change merged: preilly; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16053 [18:56:23] RECOVERY - NTP on virt1002 is OK: NTP OK: Offset -0.03941094875 secs [18:57:17] RECOVERY - NTP on virt1001 is OK: NTP OK: Offset -0.02814674377 secs [18:57:26] RECOVERY - NTP on virt1003 is OK: NTP OK: Offset -0.03542995453 secs [19:04:43] blargh. "Can't locate Debconf/Frontend/newt.pm" stab. stab. stab. [19:42:53] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 305 seconds [19:48:35] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 647s [19:51:35] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 16s [19:52:20] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 17 seconds [20:00:14] New patchset: Jgreen; "modded dump_fundraisingdb to accept take a list of a subset of dbs to dump" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16057 [20:00:49] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16057 [20:01:35] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16057 [20:05:21] hey LeslieCarr [20:05:50] would you have some time this week to rack up the Dell machines for the analytics cluster? [20:08:05] drdee: no for two reasons - #1 I'm gone in about 45 minutes and #2 I'm sitting in the office in San Francisco and it's a bit far away [20:08:27] :) [20:08:34] so who would be the right person to ask?
RobH is the guy on the ground in DC (eqiad) and cmjohnson1 is the guy on the ground in tampa [20:08:48] however Rob pulled his shoulder and is not doing any heavy lifting for a few days [20:09:29] paravoid: were your modules changes also reflected in the bootstrapping for labs? [20:09:33] actually, i think they are already physically installed but the network needs to be configured [20:09:35] LeslieCarr: Your powers don't stretch to remote racking of servers? :P [20:09:42] sorry for not being more clear [20:09:43] drdee: ah :) that is much easier [20:09:49] let me see if there was a ticket [20:09:54] Damianz: i'm working on it [20:10:36] drdee: is it analytics1001 - analytics1010 ? [20:11:11] those are the cisco boxes (IIRC) they are working [20:11:32] ah, 1011 to 1027 ? [20:11:42] * drdee is searching through rt as well [20:11:48] yes that sounds about right [20:13:47] the analytics machines are all racked and wired [20:13:50] they have been for awhile [20:13:55] its all on row c networking now [20:15:37] paravoid: ah. the post-merge hook on virt0 needed to be fixed [20:15:47] we need to change that over to the same method as stafford/sockpuppet at some point [20:16:18] is there a ticket do you remember ? I'll try to get these all before i run off :) [20:18:04] still searching [20:18:19] RobH do you remember the ticket for those Dell boxes? [20:19:49] ticket for what part? [20:20:35] i assume you mean the network ticket [20:20:47] im looking but i dont see it, its row c access switches [20:20:51] https://rt.wikimedia.org/Ticket/Display.html?id=3067 maybe?
[20:21:07] https://rt.wikimedia.org/Ticket/Display.html?id=3067 [20:21:14] hrmm, not exactly [20:21:19] but if we cannot find a ticket, we can append to that [20:21:56] New patchset: Pyoungmeister; "role/apache.pp: adding in lvs_pool support" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16060 [20:22:27] cool, or if you remember what interfaces they're plugged into, i can get it all done now :) [20:22:28] drdee & LeslieCarr Ok, ticket 3067 is updated with the details [20:22:35] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16060 [20:22:37] racktables link in ticket [20:22:42] bottom server = 0 [20:22:45] then works up from there [20:22:55] cool [20:22:57] thats the default for any newly wired rack unless you tell us differently [20:23:02] :) [20:23:10] (like c1/frack is oddball) [20:23:31] it's special [20:25:07] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16060 [20:27:04] ty RobH & LeslieCarr [20:27:28] welcome =] [20:31:04] New patchset: Pyoungmeister; "role/apache.pp cleanup: only need apaches::monitoring once..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16061 [20:31:43] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16061 [20:33:34] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16061 [20:34:36] New patchset: Ryan Lane; "Adding patches for metadata queries" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16063 [20:35:12] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16063 [20:35:52] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16063 [20:44:34] LeslieCarr,ty ty ty [20:44:42] hehe saw the update to the ticket ? 
[20:44:45] before i could tell you :) [20:44:47] yep [20:44:55] should be good to go [20:44:55] so now it's up to us to install the OS? [20:45:09] usually ops does it [20:45:13] though you are so welcome to ;) [20:45:31] i think we will take it from here, but i will loop ottomata in [20:45:38] i have to run and go to the airport, if you see any issues, please email me directly as well as the ticket (so i can see easily the email) [20:45:49] will do! ty again [20:46:12] uhhhh [20:46:15] no we don't install the os! [20:46:16] haha [20:46:20] we configure once os is installed [20:46:39] if i could install the os i would though [20:46:43] would love to learn how to do that [20:47:11] whoops sorry [20:47:14] my mistake [20:48:14] it's pretty easy, i can walk you through it sometime ottomata .. except not now! got to go to the airport :) [20:48:26] ottomata: actually someone else may be able to walk you through it today :) [20:48:46] ok, but really heading out, like i said, please also directly email me if you see any problems and i'll check them out when i'm on the ground [20:49:11] ok cool, naw i have to head out soon too [20:49:34] would love to walk through it with you or someone tomorrow [20:50:20] binasher: https://gerrit.wikimedia.org/r/#/c/16000/3 is ready for review and deployment [20:50:57] preilly: code review ^^ ? [20:51:38] btw, why are you excluding commons? [20:53:50] binasher: see schedule and explanation here: http://www.mediawiki.org/wiki/Mobile_default_for_sibling_projects#Commons_switchover [20:54:16] cool, makes sense [20:59:00] hi RobH, any news from dell on stat1001 (https://rt.wikimedia.org/Ticket/Display.html?id=3121) [21:00:22] Change merged: Asher; [operations/debs/squid] (master) - https://gerrit.wikimedia.org/r/16000 [21:01:16] New patchset: Asher; "redirector as of https://gerrit.wikimedia.org/r/#/c/16000/" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16067 [21:01:52] New review: gerrit2; "Lint check passed."
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16067 [21:03:09] PROBLEM - MySQL Slave Delay on storage3 is CRITICAL: CRIT replication delay 289 seconds [21:06:10] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16067 [21:07:24] !log deploying new mobile redirector to esams text squids [21:07:32] Logged the message, Master [21:08:34] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 612s [21:15:53] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 6s [21:15:53] RECOVERY - MySQL Slave Delay on storage3 is OK: OK replication delay 1 seconds [21:23:41] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [21:36:48] !log deploying new mobile redirector to eqiad text squids [21:36:56] Logged the message, Master [21:37:04] notpeter: Are you messing with srv281 by any chance? [21:37:06] srv281: rsync: mkdir "/apache/common-local/php-1.20wmf7/extensions/LastModified" failed: No such file or directory (2) [21:37:11] nope [21:37:12] awjr: the deploy is done in europe [21:37:14] just 194 [21:37:19] binasher sweet [21:37:46] binasher is there an easy way for me to test that or should i wait til it's totally done? 
[21:39:03] awjr: wikipedia-lb.esams.wikimedia.org has address 91.198.174.225 [21:40:25] sweet - at first glance looks good [21:45:18] hmm im getting mobile main page on .m for species [21:45:27] er rather on non-.m [21:45:42] binasher ^ [21:46:08] hmm and non-.m link for the mobile view link [21:46:14] maybe a configuration problem [21:46:26] yep [21:47:38] yeah looks like when .m domains were added back in june InitialiseSettings.php was not updated to reflect .m domains [21:50:50] New patchset: Catrope; "Set up the E3Experiments extension" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16077 [21:51:23] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16077 [21:58:48] New patchset: Catrope; "Fix various snafus in E3Experiments setup" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16078 [21:59:10] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16078 [22:00:56] awjr: the eqiad deploy is done [22:01:25] New patchset: Catrope; "Enable E3Experiments on testwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16079 [22:01:29] binasher awesome thanks; im waiting for the config change to go out to fix the .m domains for species/meta [22:01:47] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/16079 [22:02:26] !log deploying new mobile redirector to pmtpa text squids (currently inactive) [22:02:33] Logged the message, Master [22:18:33] drdee: so… get to the point to tell if it's working or not ?
[22:21:35] RECOVERY - mysqld processes on db1003 is OK: PROCS OK: 1 process with command name mysqld [22:22:38] PROBLEM - Puppet freshness on db29 is CRITICAL: Puppet has not run in the last 10 hours [22:24:44] PROBLEM - MySQL Replication Heartbeat on db1003 is CRITICAL: CRIT replication delay 1907267 seconds [22:25:02] PROBLEM - MySQL Slave Delay on db1003 is CRITICAL: CRIT replication delay 1907242 seconds [22:25:26] awjr: preilly and i were going to spend some time and discuss geosearch, the associated schema changes, and whether using sphinx is actually a good idea (very possibly not) vs. a new solr install - either way it's a huge new change to wmf infrastructure.. [22:26:03] binasher: so i should not have merged that changeset? [22:26:11] i can't believe that just got a +2 and merge! [22:26:29] didn't realize the sphinx debate was still going on [22:27:00] it hasn't even really been had yet [22:27:05] but, the code seems sane and should still work w/o sphinx [22:27:33] i don't think that's how code review is supposed to work [22:29:03] awjr: what sort of testing did you do on: https://gerrit.wikimedia.org/r/#/c/15905/ [22:30:06] awjr: plus https://gerrit.wikimedia.org/r/#/c/15905/5/sql/sphinx-backed.sql [22:30:23] schema is supposed to be specially reviewed [22:31:18] preilly i did functional testing on mobile-geo [22:31:24] if the merge is a problem, revert it [22:31:43] awjr: okay makes sense [22:31:46] that should be good for a +1 [22:36:28] MaxSem: if we use sphinx, we'd want to use the pecl extension, not the userspace php api thats in https://gerrit.wikimedia.org/r/#/c/15905/5/lib/sphinxapi.php [22:36:49] so there would need to be configuration around that [22:37:20] props for running with the idea, but this needs some additional planning, and potentially review by tim [22:38:01] will do [23:01:34] MaxSem i added adm id support in r763 [23:02:06] very good [23:17:33] New patchset: Ryan Lane; "Adding missing puppet config for nova essex puppetmaster"
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/16083 [23:17:55] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16083 [23:18:01] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16083 [23:35:23] PROBLEM - Puppet freshness on ms3 is CRITICAL: Puppet has not run in the last 10 hours [23:53:45] New patchset: preilly; "X-Forwarded-For may contain two IP addresses, so grab the first." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16085 [23:54:21] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/16085 [23:55:15] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/16085 [23:56:05] preilly: is the first one always going to be the carrier ip we want? [23:56:46] binasher: yes [23:58:09] isn't varnish multithreaded? [23:58:54] strtok is not threadsafe according to the linux manual [23:59:52] TimStarling: so should I use strtok_r() for thread safety?