[00:00:41] * Krenair will amend the patchset
[00:01:19] I'm not sure what the connection with meta.wikimedia.org/wiki/MediaWiki:Experiments.js is
[00:02:26] if I was doing that task, I would make an administrative interface in PHP which ran on some unrelated domain, like e3control.wikimedia.org
[00:02:32] then that admin interface would write to the database
[00:02:57] then an RL module would provide access to that data as JS code
[00:03:09] the RL module would be part of the existing extension
[00:04:09] and I would make the DB query cached in memcached, and I would make the RL module cacheable on the client side
[00:04:22] or perhaps even combined with some startup bundle
[00:04:39] well, that was my first instinct, but halfway through implementing a mediawiki interface that allows interactive update, is versioned, and references the person making the update
[00:04:50] i got the sneaking suspicion that i'm implementing mediawiki in mediawiki
[00:04:52] ... Sounds easier to just deploy changes
[00:05:18] Fix MediaWikis config handling
[00:05:19] Krenair: no. i did that for the past six months.
[00:05:26] Reedy: Hah!
[00:05:31] I was trying to lure him in on that yesterday.
[00:05:32] Brooke: you paid him to say that
[00:05:38] I wish.
[00:05:57] afaik "mediawikis config handling" == reedy
[00:06:00] well, you could still run the admin interface on a separate domain, even if it is MW
[00:06:03] Heh.
[00:06:15] TimStarling: And use what for user auth?
[00:06:16] we have lots of wikis in *.wikimedia.org
[00:06:17] No ori-l, I won't come and configure your own MediaWiki installation
[00:06:25] Or you mean installing a separate MW instasnce?
[00:06:28] instance
[00:06:51] yeah, a separate MW instance would be ideal
[00:07:05] what would be a good one to use?
[00:07:25] I'm not sure what advantage a separate MW instance is over using Meta-Wiki.
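Editorial sketch of the design outlined at 00:02-00:04 (an admin interface writes to the database, a ResourceLoader module serves that data as JS, and the DB query is cached in memcached). This is only a rough Python illustration of the cache-aside idea, not the extension's code: `TTLCache` is a dict-backed stand-in for memcached, and `load_experiments_from_db`, `experiments_module_js` and the sample data are hypothetical names invented for the example.

```python
import json
import time


class TTLCache:
    """Tiny in-process stand-in for memcached (assumption: real code
    would talk to a memcached client instead)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)


# Counts how often the (hypothetical) database is actually queried.
db_calls = {"count": 0}


def load_experiments_from_db():
    """Hypothetical database read; returns made-up sample data."""
    db_calls["count"] += 1
    return {"exampleExperiment": {"enabled": True}}


def experiments_module_js(cache, ttl=300):
    """Build the JS payload, querying the database only on a cache miss."""
    data = cache.get("experiments")
    if data is None:
        data = load_experiments_from_db()
        cache.set("experiments", data, ttl)
    return "var experiments = " + json.dumps(data, sort_keys=True) + ";"
```

Calling `experiments_module_js` twice with the same cache should hit the stand-in database once; client-side caching of the module, also mentioned above, is a separate layer that this sketch does not model.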
[00:07:34] privilege separation
[00:07:42] usually admin interfaces are full of XSS vulnerabilities
[00:07:45] Doesn't the entire user rights system already account for that?
[00:07:51] so the idea is to put it in a separate cookie domain
[00:07:53] Well, let's not introduce those, then.
[00:08:00] yeah, sure
[00:08:08] TimStarling: well, why create an admin interface? the people who will need to update stuff number 4-5
[00:08:20] it can just be some JSON
[00:08:48] Loop.
[00:08:50] some JSON changed how?
[00:09:06] just edit the article it lives in?
[00:09:06] Isn't there a better code editor these days?
[00:09:36] New patchset: Alex Monk; "Redirect secure.wikimedia.org URLs to proper HTTPS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13429
[00:09:50] yes, you could do that
[00:10:53] and if i did do that, there really wouldn't be a point in keeping it on a separate wiki, no? since i won't be building a custom crud interface for the database, just using the standard edit interface
[00:11:09] you could even have the contents of a metawiki page delivered via an RL module on the local wiki
[00:11:23] if you wanted to get clever
[00:11:28] yeah, that's what i was trying to explain above
[00:11:32] that was the idea
[00:12:06] i didn't articulate it very clearly
[00:12:20] bbl
[00:12:49] ori-l, ok, I uploaded a new patchset to redirect wikidata properly
[00:13:05] Krenair: would you like me to update the config on the labs machine?
[00:13:10] to match the patch?
[00:13:49] I doubt it can cause any issues, but yes please
[00:16:22] Krenair: done
[00:18:09] Okay so that was a good idea
[00:18:20] Turns out what wasn't a good idea was coding at past midnight
[00:18:29] wikidata is at wikidata.org, not wikidata.wm.o -.-
[00:20:06] New patchset: Alex Monk; "Redirect secure.wikimedia.org URLs to proper HTTPS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13429
[00:24:25] Krenair: updated labs
[00:25:15] It's unclear whether it's at www.wikidata.org or wikidata.org.
[00:25:25] I guess it's still being decided or something.
[00:26:11] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:29:31] Brooke, looks like it's www.
[00:30:22] Krenair: Kind of.
[00:30:51] http://wikidata.org/wiki/Hello and http://www.wikidata.org/wiki/Hello both work.
[00:34:52] Brooke, sitematrix points to www.
[00:36:18] I don't know what you want me to say. It's in an inconsistent state. There's a bug about it somewhere.
[00:39:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 8.821 seconds
[00:40:53] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[00:48:30] Brooke, I'll assume www. here. I don't think it matters a great deal
[00:52:54] PROBLEM - Puppet freshness on cp1042 is CRITICAL: Puppet has not run in the last 10 hours
[00:57:06] New patchset: Alex Monk; "Redirect secure.wikimedia.org URLs to proper HTTPS" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/13429
[01:08:26] nak
[01:08:40] well, kind of
[01:08:48] I'd prefer to just 404 wikidata
[01:09:16] who would ever link to wikidata on secure
[01:09:18] but anyway
[01:09:25] but did testing reveal Krenair?
[01:12:18] paravoid, what did it reveal?
[01:13:50] er, s/what/but/
[01:13:55] er, s/but/what/ even
[01:14:01] too late
[01:14:07] That everything worked fine... Until I tried to add wikidata.
Took me two failed tries to work out what the hell I was doing wrong [01:14:13] I should better get some sleep [01:14:20] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:29:02] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.021 seconds [01:42:14] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 267 seconds [01:42:35] Bot noise. [01:45:41] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 28 seconds [01:46:16] Brooke? [01:46:26] dbbot-wm [01:47:39] I wonder why it keeps joining then closing it's connection [02:00:02] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 256 seconds [02:00:20] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 274 seconds [02:02:08] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:09:56] PROBLEM - Puppet freshness on copper is CRITICAL: Puppet has not run in the last 10 hours [02:09:56] PROBLEM - Puppet freshness on db1001 is CRITICAL: Puppet has not run in the last 10 hours [02:09:56] PROBLEM - Puppet freshness on db1031 is CRITICAL: Puppet has not run in the last 10 hours [02:09:56] PROBLEM - Puppet freshness on cp1005 is CRITICAL: Puppet has not run in the last 10 hours [02:09:56] PROBLEM - Puppet freshness on hooper is CRITICAL: Puppet has not run in the last 10 hours [02:09:57] PROBLEM - Puppet freshness on mc8 is CRITICAL: Puppet has not run in the last 10 hours [02:09:57] PROBLEM - Puppet freshness on srv251 is CRITICAL: Puppet has not run in the last 10 hours [02:09:58] PROBLEM - Puppet freshness on sodium is CRITICAL: Puppet has not run in the last 10 hours [02:09:58] PROBLEM - Puppet freshness on search33 is CRITICAL: Puppet has not run in the last 10 hours [02:09:59] PROBLEM - Puppet freshness on srv223 is CRITICAL: Puppet has not run in the last 10 hours [02:10:51] PROBLEM - Puppet 
freshness on es5 is CRITICAL: Puppet has not run in the last 10 hours [02:10:51] PROBLEM - Puppet freshness on mc3 is CRITICAL: Puppet has not run in the last 10 hours [02:10:51] PROBLEM - Puppet freshness on search17 is CRITICAL: Puppet has not run in the last 10 hours [02:10:51] PROBLEM - Puppet freshness on sq33 is CRITICAL: Puppet has not run in the last 10 hours [02:10:51] PROBLEM - Puppet freshness on srv272 is CRITICAL: Puppet has not run in the last 10 hours [02:11:53] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [02:11:53] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [02:11:53] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [02:13:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 5.589 seconds [02:32:41] !log LocalisationUpdate completed (1.21wmf3) at Mon Nov 12 02:32:41 UTC 2012 [02:32:50] Logged the message, Master [02:38:26] RECOVERY - Puppet freshness on search1011 is OK: puppet ran at Mon Nov 12 02:38:16 UTC 2012 [02:39:11] RECOVERY - Puppet freshness on es5 is OK: puppet ran at Mon Nov 12 02:38:43 UTC 2012 [02:39:11] RECOVERY - Puppet freshness on search33 is OK: puppet ran at Mon Nov 12 02:38:55 UTC 2012 [02:39:38] RECOVERY - Puppet freshness on db1031 is OK: puppet ran at Mon Nov 12 02:39:19 UTC 2012 [02:40:50] PROBLEM - Puppet freshness on analytics1009 is CRITICAL: Puppet has not run in the last 10 hours [02:40:50] PROBLEM - Puppet freshness on cp1006 is CRITICAL: Puppet has not run in the last 10 hours [02:40:50] PROBLEM - Puppet freshness on cp1025 is CRITICAL: Puppet has not run in the last 10 hours [02:40:50] PROBLEM - Puppet freshness on searchidx1001 is CRITICAL: Puppet has not run in the last 10 hours [02:40:50] PROBLEM - Puppet freshness on srv202 is CRITICAL: Puppet has not run in the last 10 hours [02:41:35] RECOVERY - Puppet freshness on sq33 is 
OK: puppet ran at Mon Nov 12 02:41:14 UTC 2012 [02:41:35] RECOVERY - Puppet freshness on hooper is OK: puppet ran at Mon Nov 12 02:41:17 UTC 2012 [02:42:11] RECOVERY - Puppet freshness on srv251 is OK: puppet ran at Mon Nov 12 02:41:40 UTC 2012 [02:47:08] RECOVERY - Puppet freshness on db1001 is OK: puppet ran at Mon Nov 12 02:46:53 UTC 2012 [02:48:38] RECOVERY - Puppet freshness on mc8 is OK: puppet ran at Mon Nov 12 02:48:17 UTC 2012 [02:49:14] RECOVERY - Puppet freshness on cp1006 is OK: puppet ran at Mon Nov 12 02:48:53 UTC 2012 [02:49:41] RECOVERY - Puppet freshness on srv223 is OK: puppet ran at Mon Nov 12 02:49:25 UTC 2012 [02:51:38] RECOVERY - Puppet freshness on mc3 is OK: puppet ran at Mon Nov 12 02:51:31 UTC 2012 [02:51:38] RECOVERY - Puppet freshness on srv202 is OK: puppet ran at Mon Nov 12 02:51:36 UTC 2012 [02:52:05] RECOVERY - Puppet freshness on cp1005 is OK: puppet ran at Mon Nov 12 02:51:55 UTC 2012 [02:54:11] RECOVERY - Puppet freshness on srv272 is OK: puppet ran at Mon Nov 12 02:53:52 UTC 2012 [02:58:41] RECOVERY - Puppet freshness on cp1025 is OK: puppet ran at Mon Nov 12 02:58:25 UTC 2012 [02:59:08] RECOVERY - Puppet freshness on sodium is OK: puppet ran at Mon Nov 12 02:58:44 UTC 2012 [03:02:44] RECOVERY - Puppet freshness on search17 is OK: puppet ran at Mon Nov 12 03:02:12 UTC 2012 [03:03:38] RECOVERY - Puppet freshness on searchidx1001 is OK: puppet ran at Mon Nov 12 03:03:12 UTC 2012 [03:06:38] RECOVERY - Puppet freshness on analytics1009 is OK: puppet ran at Mon Nov 12 03:06:16 UTC 2012 [03:28:50] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours [03:31:42] RECOVERY - Puppet freshness on copper is OK: puppet ran at Mon Nov 12 03:31:20 UTC 2012 [03:43:32] RECOVERY - MySQL Slave Delay on db78 is OK: OK replication delay 0 seconds [03:49:59] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 13 seconds [05:44:11] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: 
Connection timed out [05:45:32] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 3.021 second response time on port 8123 [07:02:08] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours [07:05:35] New review: Siebrand; "Any updates? Another month went by." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/12188 [08:19:11] what the hell... [08:32:13] PROBLEM - Puppet freshness on analytics1011 is CRITICAL: Puppet has not run in the last 10 hours [08:32:13] PROBLEM - Puppet freshness on analytics1013 is CRITICAL: Puppet has not run in the last 10 hours [08:32:13] PROBLEM - Puppet freshness on analytics1014 is CRITICAL: Puppet has not run in the last 10 hours [08:32:13] PROBLEM - Puppet freshness on analytics1015 is CRITICAL: Puppet has not run in the last 10 hours [08:32:13] PROBLEM - Puppet freshness on analytics1016 is CRITICAL: Puppet has not run in the last 10 hours [08:32:14] PROBLEM - Puppet freshness on analytics1018 is CRITICAL: Puppet has not run in the last 10 hours [08:32:14] PROBLEM - Puppet freshness on analytics1017 is CRITICAL: Puppet has not run in the last 10 hours [08:32:15] PROBLEM - Puppet freshness on analytics1012 is CRITICAL: Puppet has not run in the last 10 hours [08:32:15] PROBLEM - Puppet freshness on analytics1020 is CRITICAL: Puppet has not run in the last 10 hours [08:32:16] PROBLEM - Puppet freshness on analytics1022 is CRITICAL: Puppet has not run in the last 10 hours [08:32:16] PROBLEM - Puppet freshness on analytics1021 is CRITICAL: Puppet has not run in the last 10 hours [08:32:17] PROBLEM - Puppet freshness on analytics1019 is CRITICAL: Puppet has not run in the last 10 hours [08:32:17] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [08:32:18] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [09:02:18] apergos: Did you see my ping about the eswiki backups? 
[09:02:25] Should be all good now...
[09:02:32] yes, I reran them already, they are past the first problem stage
[09:02:40] yay
[09:02:46] but you're not on the xml datadumps list or you would have seen today's mail :-P
[09:03:04] I saw that the bug fix went in but not when it was deployed so thanks for the ping
[10:13:49] New patchset: Dereckson; "(bug 41962) Namespace configuration for bar.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33043
[10:14:18] New review: Dereckson; "PS3: Renaming "Portal Diskussion" to "Portal Dischkrian"" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/33043
[10:15:42] New patchset: Dereckson; "(bug 41962) Namespace configuration for bar.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33043
[10:16:03] New review: Dereckson; "PS4: Fixing whitespace issue" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/33043
[10:41:55] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[10:53:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: Puppet has not run in the last 10 hours
[11:14:22] New patchset: Ori.livneh; "Enable CombineUserTalk for enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33058
[11:14:22] New patchset: Ori.livneh; "Enable event logging for mobile beta" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32864
[11:17:57] New patchset: Ori.livneh; "Enable CombineUserTalk for enwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33058
[11:18:11] New patchset: Faidon; "swift: include Ubuntu Cloud archive for folsom" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33059
[11:18:33] Change merged: Ori.livneh; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33058
[11:20:36] apergos: ping?
[11:23:52] paravoid: ponngg
[11:25:45] apergos: ms-be6 has synced account/container but zero objects
[11:25:49] I'm debugging it now.
[11:27:45] ugh
[11:28:03] hope I didn't screw up the config somehow
[11:28:13] actually, maybe it's better if I did since that would be easily fixed
[11:28:38] how do you see those stats?
[11:28:57] df?
[11:28:59] ls?
[11:29:00] :)
[11:29:58] yeah, it's a ring screwup
[11:30:05] grr
[11:30:25] container & object have the wrong ports
[11:30:35] oh, woops
[11:30:46] also, container seems to not be balanced at all
[11:31:04] forgot to run rebalance probably
[11:31:05] hmm that'z weird, I did rebalance after adding all three
[11:31:28] maybe because it told you to rebalance again after a few hours?
[11:31:33] yep
[11:31:37] anyway
[11:31:43] want to push a fix?
[11:31:44] for account and container it does that
[11:31:54] where is it?
[11:32:10] and I'm gonna run puppet on everythig so if you have something manually disabled...
[11:32:15] now is the time to worry :-P
[11:32:38] everywhere? :)
[11:32:46] all ms-fe* all ms-be*
[11:32:51] and no, puppet is not disabled anywhere
[11:32:53] that's everywhere as far as I'm concerned
[11:32:55] ok
[11:33:01] no, everywhere was the answer to the "where is it?" question
[11:33:19] I mean where did you redo the rings?
[11:33:29] I didn't redo them, that's what I just asked you to do :)
[11:33:55] ok, I thought you said you had a fix you wanted me to push out
[11:33:57] ok then
[11:34:06] no
[11:34:08] sorry, my bad
[11:34:25] port should be 6000 for object and 6001 for container (see the entries above it)
[11:34:37] you have 6002 on all three
[11:35:03] look at the port column on the other servers and you'll spot the issue immediately
[11:36:29] ok great
[11:36:32] * mark is reading his email backlog
[11:36:34] thanks for catching that
[11:36:34] i'm now at oct 12
[11:36:59] hahahaha
[11:37:43] mark: so, I should reformat the thumpers; I looked up wikitech documentation, is it really going to be a PITA?
[11:38:28] I saw some "grub doesn't recognize sdac so you have to install grub manually foo" but I'm not sure if it's current
[11:38:41] may no longer be current
[11:38:54] just try it
[11:41:13] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33059
[11:42:03] anyone knows what happend to tmh1 yesterday? according to gangila it was off
[11:43:05] !log apt: removing swift 1.5 from precise-wikimedia
[11:43:12] Logged the message, Master
[11:43:15] j^: 19:58 mutante: powercycling tmh1
[11:44:33] that's not very helpful, is it? :)
[11:44:45] sorry, I don't have a better answer for you
[11:45:01] ok lets hope that was a one off and see
[11:45:31] was considering to increase the number of concurrent jobs to process more videos but if it already crashes with 50% cpu usage
[11:46:48] New patchset: Faidon; "Remove spurious notify, doesn't work across stages" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33062
[11:47:11] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33062
[11:47:20] j^: do we have a backlog?
[11:48:32] paravoid: yes all the existing videos are transcoded right now
[11:49:18] transcoded to what?
[11:49:55] webm, smaller resolutions if required(i.e. 1080p uploads)
[11:50:31] ok
[11:51:10] no h.264 after all?
[11:52:50] might be added at a later point
[11:52:58] code is there, not enabled right now
[11:54:01] nod
[11:55:08] so with tmh out its down to a legal/political descision to switch it on
[11:59:02] New patchset: Faidon; "Repurpose ms[1-3]" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33063
[11:59:03] great
[11:59:33] apergos: all new swift boxes should run 1.7.4
[11:59:43] ok
[11:59:51] apergos: be careful not to create ring files on precise boxes until we switch all of them to precise
[12:00:02] I assume the back ends are still on 1.5?
[12:00:08] no, ms-be6 is 1.7.4 now
[12:00:16] all precise backends will be 1.7
[12:00:17] but that's the only one?
[12:00:42] that's the only one I upgraded and we don't have ensure => latest, so yes
[12:00:49] ok great
[12:00:50] but I think it's also the only one in precise
[12:00:55] and there's no 1.7 for lucid in our repo
[12:01:03] uh huh
[12:01:10] (and not planning to)
[12:01:31] fix the ring files so we can be sure it replicates properly across versions
[12:02:27] 1.7.5 says "expected in 11 hours", heh
[12:02:53] has a few fixes that we expect
[12:05:01] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33063
[12:11:47] New patchset: Faidon; "Kill last references to Solaris" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33065
[12:12:39] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[12:12:40] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[12:12:40] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[12:12:42] New patchset: Faidon; "Cleanup the base class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33066
[12:12:54] mark: I think you'll enjoy this
[12:27:00] ?
[12:27:00] ah yes
[12:27:43] the last two
[12:33:11] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33065
[12:35:16] New patchset: Faidon; "Add ms1-3 to autoinstall" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33067
[12:35:43] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33067
[12:37:01] yay, I broke stuff
[12:39:31] New patchset: Faidon; "Fix ntp template brekage" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33068
[12:39:43] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33068
[12:55:04] mark: any insight on why ms1 won't PXE even when I get confirmation that network boot will be attempted?
[12:55:09] I don't get a DHCP request at all
[12:57:20] no
[12:57:29] should work
[12:57:33] it's on the internal vlan isn't it?
[13:05:41] hmm
[13:05:45] where shall we run the ceph monitors on
[13:08:47] lldpctl is unhelpful
[13:08:51] which switch should I look into?
[13:08:59] they're on csw1-sdtpa I'm pretty sure
[13:09:00] maybe bonding affects it
[13:09:04] oh right
[13:09:05] yes that is it
[13:09:10] sorry, should've thought of that
[13:09:17] there's an option to make that work
[13:09:26] force-up
[13:09:30] you'll need to disable the non-eth0 ports and disable the lag
[13:09:33] no there isn't
[13:10:12] old junos?
[13:10:30] I'm sure force-up works, I've used it before :)
[13:10:49] if only it were junos eh
[13:10:55] aaaaaaw crap
[13:10:58] hahaha
[13:11:12] sure, on our juniper switches force-up works great
[13:15:13] have the patience to guide me through the foundrys?
[13:15:23] or slap me with a fine manual, since I can't find one on wikitech
[13:16:17] oh there's a PDF on fenari
[13:19:10] sure
[13:19:14] lag ms1
[13:19:26] disable e for all of eth1+
[13:19:31] no deploy
[13:19:33] then install
[13:19:36] then deploy again
[13:19:38] and reenable ports
[13:20:20] ms1 dynamic Y 12 15/2 ethe 15/2
[13:20:22] some info here: http://wikitech.wikimedia.org/view/Link_aggregation#Foundry
[13:20:25] that's not very useful, is it
[13:20:33] what isn't?
[13:21:05] isn't that a single port?
[13:21:36] seems like it
[13:21:40] that's not very useful no
[13:21:58] might as well undeploy that lag
[13:23:18] looks a lot like ios
[13:23:22] (first time in a foundry)
[13:24:44] yes
[13:24:48] it's like a bad ios clone
[13:25:29] hmz so
[13:25:44] ceph docs recommend using SSDs for OSD journals and for monitors
[13:25:49] yeah
[13:25:55] but i don't think we should be using the SSDs currently used for swift
[13:26:21] those are not guaranteed for stable data, i.e. fsync() does not really mean it survives a power failure
[13:29:46] what do you mean? what's the problem with the SSDs currently used for swift?
[13:30:20] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[13:30:27] effectively they use write caching
[13:30:30] despite fsync()
[13:30:41] so data consistency is not guaranteed
[13:30:55] that's fine for squid/varnish, according to ben it was also fine for swift (not sure that's true)
[13:30:59] but it sure doesn't seem fine for ceph
[13:31:23] you mean caching within the SSD firmware?
[13:31:33] yes, inside the controller
[13:32:02] not the system's controller but the SSD's controller?
[13:32:12] first time I'm hearing this
[13:32:57] something that hdparm -W 0 doesn't fix?
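Editorial sketch for the fsync() exchange at 13:26-13:32. The Python below is only an illustration of what an application can do from its side: write the data, call fsync() on the file, then fsync() the containing directory. It assumes a POSIX system; `durable_write` is a hypothetical helper name, and nothing here is taken from ceph or swift. As the discussion points out, a successful return only means the kernel handed the data to the device; a drive that acknowledges writes from a volatile internal cache can still lose them on power failure.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the kernel to flush them to the device.

    Assumes POSIX semantics. This cannot detect a drive that reports
    completion while the data still sits in its own volatile cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush file contents and metadata
    finally:
        os.close(fd)

    # Flush the directory entry too, so the file name itself is persisted.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the sketch is the limit of this approach: everything above happens on the host side, so the hardware choice discussed in the channel still matters for journals.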
[13:33:04] no
[13:33:24] (installer worked after 'no lag ms1', yay)
[13:33:51] http://www.evanjones.ca/intel-ssd-durability.html
[13:34:16] so we use the X25m and the intel 320, which is an evolution of that
[13:34:46] as far as I understand the 320 has more/larger capacitors which help with a bit of power backup on power failures, but that doesn't sound like it really fully fixes the problem
[13:34:54] so although those may be better, I still don't fully trust them for this use
[13:35:02] we better use the Intel 720 SSDs for that purpose
[13:35:39] wow
[13:36:02] nasty
[13:37:02] ok, installer is running
[13:37:11] going to grab a quick lunch
[13:53:26] RECOVERY - mysqld processes on es4 is OK: PROCS OK: 1 process with command name mysqld
[13:56:35] back
[13:59:06] Two file systems are assigned the same mount point (/): RAID1 device #0 and SCSI29 (0,0,0), partition #1 (sdac).
[14:00:34] where?
[14:00:54] I'm reformatting ms1
[14:04:59] I see
[14:13:05] New patchset: Pyoungmeister; "re-adding es4 to db.php" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33073
[14:14:20] Change merged: Pyoungmeister; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33073
[14:14:53] !log py synchronized wmf-config/db.php 're-adding es4'
[14:15:01] Logged the message, Master
[14:22:06] New patchset: Faidon; "partman: remove mdadm/boot_degraded from recipes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33074
[14:22:06] New patchset: Faidon; "partman: new recipe for thumpers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33075
[14:23:41] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33074
[14:23:58] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33075
[14:59:12] !log reedy synchronized php-1.21wmf3/extensions/CentralNotice/
[14:59:18] Logged the message, Master
[15:18:28] yep, grub-install fails
[15:18:39] (among other d-i fails that I've solved via the magic of dd)
[15:19:04] 'grub-install /dev/sda' failed
[15:27:25] !log reedy synchronized php-1.21wmf3/includes/api/ApiEditPage.php
[15:27:31] Logged the message, Master
[15:28:04] !log reedy synchronized php-1.21wmf3/extensions/EducationProgram/
[15:28:10] Logged the message, Master
[15:29:04] Heyaaaa, RobH, are you working today?
[15:30:36] paravoid: ms3 and up will fare better
[15:34:17] apergos: you're done with ms-be3001 right?
[15:34:28] yes, sorry I didn't put it back to some pristine state
[15:34:33] also the drac on that one has the normal password now
[15:34:39] good
[15:34:47] thanks for the loan
[15:51:25] formatting ms2 now
[15:51:40] starting on ms-be3003 now
[15:53:11] esams servers don't have ssds btw
[15:53:22] ms3's mgmt doesn't work
[15:53:25] and they probably don't have the drive cages for the SSDs either
[15:53:27] grr
[15:53:41] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused
[15:55:52] $ telnet ms3.mgmt.pmtpa.wmnet 22
[15:55:52] Trying 10.1.7.3...
[15:55:52] Connected to ms3.mgmt.pmtpa.wmnet.
[15:55:52] Escape character is '^]'.
[15:55:55] and that's it, no SSH banner
[15:56:04] it's a sun
[15:58:29] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.013 second response time on port 11000
[15:59:06] yeah, and I don't know/remember how to reset it from within the system
[15:59:19] I'll ask chris to hard-reset it
[16:01:54] paravoid: reset /SP
[16:01:57] to reset the sun LOM
[16:02:07] it might be connected to the SCS
[16:02:13] or it might not
[16:02:14] ms3?
[16:02:16] lemme check
[16:02:37] RobH: I can't login to the ILOM
[16:02:50] it's not ms3 that I can't, it's ms3.mgmt
[16:03:12] yes
[16:03:23] oh, cannot get in to reset
[16:03:29] so yea, we can use the power strip to reset
[16:03:37] oh yeah
[16:03:43] we have managed power strips?
[16:03:47] yes
[16:03:50] on this
[16:03:54] i can send ms3 a power down
[16:03:55] almost all
[16:03:56] want me to?
[16:04:00] yes please!
[16:04:00] paravoid: or i can show you how
[16:04:06] however, most don't have relays
[16:04:11] but this rack does
[16:04:12] only a1 and b1 sdtpa
[16:04:16] because the core router is in it
[16:04:28] paravoid: ok, so its ok to shut ms3 down completely right?
[16:04:32] yes
[16:04:35] ok, doing now
[16:04:41] thanks
[16:04:49] !lot shutting down power ports for ms3 on ps1-a1-sdtpa
[16:05:10] ok, off, giving it a moment
[16:05:40] paravoid: Ok, its got power again, i would just ping mgmt cuz it should return to service
[16:06:38] fyi: all power strips in US use the ps#-rack-location format, so http://ps1-a1-sdtpa.mgmt.pmtpa.wmnet
[16:06:44] if you have a proxy setup into cluster
[16:07:02] http://wikitech.wikimedia.org/view/Proxy
[16:07:19] other than a1 and b1 sdtpa, its info only
[16:07:23] those are the only two switched
[16:07:35] oh, and I suppose ULSFO has switched
[16:07:43] but since nothing is deployed there yet its not documented.
[16:08:26] ms3 mgmt is owrking
[16:08:35] im going back to halo4 =]
[16:08:38] \o/
[16:08:56] * RobH may still be lurking cuz he has issues letting go
[16:13:19] RECOVERY - MySQL Slave Delay on es1001 is OK: OK replication delay NULL seconds
[16:16:28] RobH: yeah, I'm in the console, thanks a bunch
[16:18:16] PROBLEM - MySQL Slave Delay on es1001 is CRITICAL: CRIT replication delay 4492309 seconds
[16:18:32] oh yay, the precise bonding issue
[16:18:47] what a day
[16:19:43] * mark grins
[16:20:07] heh
[16:21:05] i'm not faring much better
[16:21:09] ms-be3003 hangs at the bios stage
[16:21:12] trying 3002 now
[16:23:27] these are 720xds?
[16:23:34] yes
[16:23:40] New patchset: Faidon; "Disable bonding for ms1-3" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33089
[16:23:58] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33089
[16:24:19] odd
[16:24:35] ms3 probably has a broken disk too
[16:24:50] probably
[16:24:51] it took quite a while to scan at bios time, and it's taking a while in the installer
[16:24:55] what are the odds, with 48 drives :)
[16:25:03] they're 4 years old too
[16:25:08] make that 48x3 boxes :)
[16:25:08] almost 5
[16:25:35] what's that dbbot-wm that's been flapping all day?
[16:26:18] great question for which I don't have the answer
[16:26:27] but Krinkle|detached would... :-/
[16:26:39] paravoid: There are problems with the Toolserver right now
[16:26:45] ah
[16:26:51] Some connection to an NFS server is down
[16:29:39] ms-be3002 does the same
[16:30:48] what's the last line you see, out of curiosity?
[16:30:58] Scanning for devices. Please wait, this may take several minutes...
[16:31:19] might have shipped with wrong boot order, ours did (had hd first, silently hung forever)
[16:31:35] yeah will check bios
[16:48:12] ms1 is done, ms2 is running puppet, ms3 needs remote hands for that broken disk (installer hangs)
[16:49:20] mark: btw, are you aware that ms6.esams' btrfs says "degraded"?
[16:49:33] no
[16:49:40] not surprising hehe
[16:49:47] /dev/sdc1 on /export/thumbs type btrfs (rw,noatime,degraded,subvol=thumbs)
[16:50:03] that's just the degraded option
[16:50:11] meaning it will continue to mount IF it's degraded
[16:50:16] ah
[16:50:21] but i wouldn't be surprised if it actually is
[16:50:30] I've never used btrfs
[16:50:48] apergos: how did you configure JBOD on those r720xd?
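Editorial sketch for the btrfs exchange at 16:49-16:50. The small Python helper below (a hypothetical `mount_options` function, written for this note) splits the parenthesised option list of a `mount` output line such as the one quoted above. It only shows that `degraded` appears as a mount option that was passed in; as noted in the channel, that by itself says nothing about whether the filesystem is actually running degraded.

```python
def mount_options(mount_line):
    """Parse the trailing "(opt,opt=value,...)" part of a mount output line.

    Flag-style options map to True; key=value options map to the value.
    """
    start = mount_line.rindex("(")
    end = mount_line.rindex(")")
    options = {}
    for item in mount_line[start + 1:end].split(","):
        key, _, value = item.partition("=")
        options[key] = value if value else True
    return options


line = "/dev/sdc1 on /export/thumbs type btrfs (rw,noatime,degraded,subvol=thumbs)"
opts = mount_options(line)
# "degraded" is present as a flag, alongside the ordinary options.
print(sorted(opts))
```

A check of the actual device state would need a different source of information than the mount table.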
[16:50:53] paravoid: you should
[16:51:13] on one of the menus there's an option for 'convert to non raid"
[16:51:22] ok
[16:51:43] I should indeed, looks interesting
[16:53:59] and that's a really old version of it
[16:54:10] well, lucid
[16:56:17] the ceph guys recommend using quantal when trying btrfs, so perhaps we shouldn't
[16:56:37] I'm not sure I want to combine ceph with btrfs
[16:56:42] i do
[16:56:49] on 1/3 boxes
[16:57:09] can we mix and match?
[16:57:18] have boxes with both xfs and btrfs?
[16:57:26] er, different boxes obviously
[16:57:39] sure why not?
[16:57:59] dunno
[17:02:45] !log reedy synchronized php-1.21wmf4/ 'Initial sync'
[17:02:51] Logged the message, Master
[17:02:58] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[17:03:45] !log reedy synchronized php-1.21wmf4/cache
[17:03:51] Logged the message, Master
[17:05:31] !log reedy synchronized wmf-config/ExtensionMessages-1.21wmf4.php
[17:05:38] Logged the message, Master
[17:07:18] !log reedy synchronized live-1.5/
[17:07:24] Logged the message, Master
[17:13:20] New patchset: Reedy; "Add new docroot stuff for 1.21wmf4" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33095
[17:13:34] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33095
[17:14:39] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: test2wiki to 1.21wmf4
[17:14:45] Logged the message, Master
[17:27:12] chown mwdeploy /home/wikipedia/common/wmf-config/ExtensionMessages-1.21wmf4.php
[17:27:17] Can someone run that on fenari for me please?
[17:27:56] done
[17:28:01] thanks
[17:28:34] any other perms/owners issues outstanding?
[17:31:30] New patchset: Reedy; "Removed ScanSet from extension-list" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33097
[17:32:12] nope... I think we're good now
[17:32:21] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33097
[17:33:53] New patchset: Faidon; "apt: re-add support for commenting old apt entries" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33098
[17:34:54] New patchset: Faidon; "apt: re-add support for commenting old apt entries" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33098
[17:34:57] !log reedy Started syncing Wikimedia installation... : Build 1.21wmf4 messages
[17:35:04] Logged the message, Master
[17:36:00] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33098
[17:52:49] Right, who wants to help me beat some apaches?
[17:53:02] Circa 30 of them are giving 404s for http://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Stryn
[17:53:11] what do you need done?
[17:53:22] I need opsen
[17:53:26] feh
[17:53:33] mark: paravoid, apergos ^
[17:53:36] http://p.defau.lt/?hkQnnMQPf5_hsMtXcFGO3A
[17:54:01] mw17-mw42
[17:54:52] awesome
[17:55:09] It would look like they've forgotten about wikidata
[17:56:46] did they ever know about wikidata?
[17:57:08] 17-59 now
[17:57:09] Yup
[17:57:11] srv211.pmtpa.wmnet 500 Internal Server Error
[17:57:39] 124 Warning: require(/usr/local/apache/common-local//index.php) [function.require]: failed to open stream: No such file or directory
[17:57:39] in /usr/local/apache/common-local/live-1.5/index.php on line 3
[17:57:39] 120 Fatal error: require() [function.require]: Failed opening required '/usr/local/apache/common-local//index.php' (include_path='.:
[17:57:39] /usr/share/php:/usr/local/apache/common/php') in /usr/local/apache/common-local/live-1.5/index.php on line 3
[17:58:13] ive-1.5/index.php? really?
[17:58:17] running puppet on srv211 [17:58:25] It would seemingly be related to scap [17:59:04] !log reedy synchronized live-1.5/ [17:59:10] Logged the message, Master [18:01:10] PROBLEM - Apache HTTP on srv211 is CRITICAL: Connection refused [18:02:26] what do you need? [18:02:38] * aude sad  [18:03:40] Reedy: can you rerun that test? [18:03:45] I ran puppet on two which did a resync [18:03:53] so I'm curious to see if that fixed it on mw17 and srv211 [18:04:24] yeah, hang on [18:04:28] RECOVERY - Apache HTTP on srv211 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.088 second response time [18:04:36] !log reedy synchronized php-1.21wmf3/extensions/EducationProgram/ [18:04:44] Logged the message, Master [18:05:10] !log reedy synchronized live-1.5/ [18:05:11] scap is still running [18:05:17] Logged the message, Master [18:07:41] And just got a bunch of connection closeds.. [18:08:33] that's probably why [18:09:08] http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=cpu_report&s=by+name&c=Miscellaneous+pmtpa&h=nfs1.pmtpa.wmnet&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4 [18:09:15] nfs1 has stopped reporting [18:09:46] looks fine to me? [18:10:20] as of 5 minutes ago there's nothing showing.. http://ganglia.wikimedia.org/latest/graph.php?r=2hr&z=xlarge&h=nfs1.pmtpa.wmnet&m=cpu_report&s=by+name&mc=2&g=cpu_report&c=Miscellaneous+pmtpa [18:11:12] here's why: [23379442.936876] possible SYN flooding on port 873. Sending cookies. [18:11:12] [23393843.503771] possible SYN flooding on port 873. Sending cookies. [18:11:12] [23415443.449945] possible SYN flooding on port 873. Sending cookies. [18:11:13] [23421743.400079] possible SYN flooding on port 873. Sending cookies. 
[18:11:16] New patchset: Faidon; "apt: minor syntax/whitespace fixes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33102 [18:12:16] basically the problem is a few hundred servers all connecting the rsyncd at the same time [18:12:25] its listen() queue fills up, and the kernel rejects some tcp connections [18:12:26] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33102 [18:18:32] New patchset: Faidon; "Cleanup the base class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/33066 [18:18:47] mark: what's your take on this btw? [18:19:54] !log changing ssd's on labsdb3 [18:20:01] Logged the message, Master [18:21:18] sbernardin: no veteran's day for you? [18:25:41] looks like ganglia is catching up now [18:29:16] !log reedy synchronized live-1.5/ [18:29:24] Logged the message, Master [18:32:46] reedy is doing a graceful restart of all apaches [18:33:04] Oh, that works apparently [18:33:04] !log reedy gracefulled all apaches [18:33:11] Logged the message, Master [18:33:38] Doc root complaints [18:34:13] PROBLEM - Puppet freshness on analytics1011 is CRITICAL: Puppet has not run in the last 10 hours [18:34:13] PROBLEM - Puppet freshness on analytics1013 is CRITICAL: Puppet has not run in the last 10 hours [18:34:13] PROBLEM - Puppet freshness on analytics1012 is CRITICAL: Puppet has not run in the last 10 hours [18:34:13] PROBLEM - Puppet freshness on analytics1014 is CRITICAL: Puppet has not run in the last 10 hours [18:34:13] PROBLEM - Puppet freshness on analytics1017 is CRITICAL: Puppet has not run in the last 10 hours [18:34:14] PROBLEM - Puppet freshness on analytics1018 is CRITICAL: Puppet has not run in the last 10 hours [18:34:14] PROBLEM - Puppet freshness on analytics1015 is CRITICAL: Puppet has not run in the last 10 hours [18:34:15] PROBLEM - Puppet freshness on analytics1016 is CRITICAL: Puppet has not run in the last 10 hours [18:34:15] PROBLEM - Puppet freshness on 
analytics1020 is CRITICAL: Puppet has not run in the last 10 hours [18:34:16] PROBLEM - Puppet freshness on analytics1021 is CRITICAL: Puppet has not run in the last 10 hours [18:34:16] PROBLEM - Puppet freshness on analytics1019 is CRITICAL: Puppet has not run in the last 10 hours [18:34:17] PROBLEM - Puppet freshness on analytics1022 is CRITICAL: Puppet has not run in the last 10 hours [18:34:17] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours [18:34:18] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours [18:35:52] New patchset: Reedy; "This reverts commit 65e200f65f4e0a61cf0e3cf92ab7e12f1bf1c51b" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33103 [18:35:59] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33103 [18:36:35] reedy is doing a graceful restart of all apaches [18:36:42] seems back [18:36:54] !log reedy gracefulled all apaches [18:37:02] Logged the message, Master [18:37:16] mw31: 12 Nov 18:36:45 ntpdate[31121]: no server suitable for synchronization found [18:37:16] mw31: Error: unable to contact NTP server [18:37:16] mw19: 12 Nov 18:36:46 ntpdate[12050]: no server suitable for synchronization found [18:37:16] mw19: Error: unable to contact NTP server [18:37:59] don't trust the timestamp, then :P [18:38:18] it doesn't look too far off.. [18:43:09] hm, I messed with the ntp template today [18:44:24] only 2 were complaining [18:44:31] and they didn't on earlier actions [18:45:49] Why do we have a new "performance" mailing list? [18:46:01] Krinkle: terry asked for it a while back [18:46:01] paravoid: no...had a few things to finish up today [18:46:42] private, engineering, wikitech, mediawiki, internal and security aren't enough? [18:47:48] segregation of information!
[18:47:51] like CIA [18:48:21] Krenair: I think the concept was that it'd be a separate team [18:48:23] but no idea [18:48:25] I hope the useful things from that list get reported back to or discussed on operations [18:48:26] ask terry [18:48:26] wikitech [18:48:27] etc [18:48:40] I don't think there's any traffic there [18:48:51] heh [18:48:53] worksforme [18:50:36] paravoid, ? [18:51:08] what? [18:51:22] Krenair: I think the concept was that it'd be a separate team [18:51:33] ah maybe that was to Krinkle [18:51:38] ah, sorry, yes [18:51:45] my bad [18:52:04] !log reedy synchronized php-1.21wmf4/cache/l10n/ [18:52:11] Logged the message, Master [18:53:05] !log upgrading tmh1/tmh2 and srv{190,219,220,221,222,223,224} for libav vulnerability (USN-1630-1) [18:53:12] Logged the message, Master [18:53:30] if i _want_ a value in memcached to be shared by multiple wikis in prod, is it fair game to just not prefix the key with dbname / cacheprefix? [18:53:48] should be [18:53:49] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: mediawikiwiki, testwiki and wikidatawiki to 1.21wmf4 [18:53:53] I think we do that in some places [18:53:56] Logged the message, Master [18:54:30] cool. i was debating between that and using wfForeignMemcKey with meta's db / prefix [18:54:37] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Mon Nov 12 18:54:14 UTC 2012 [18:54:39] and the upgrade also upgrades apache, and stops it in the process [18:54:44] sigh [19:02:41] !log reedy synchronized php-1.21wmf4/extensions/Wikibase [19:02:47] Logged the message, Master [19:04:31] PROBLEM - Apache HTTP on srv190 is CRITICAL: Connection refused [19:04:40] PROBLEM - Apache HTTP on srv220 is CRITICAL: Connection refused [19:04:47] New patchset: Reedy; "Add VisualEditor namespace creation to wmf-config" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31949 [19:04:49] (that would be me, see above) [19:05:02] Reedy: wait wait [19:05:11] ? 
[19:05:12] Reedy: NOOO [19:05:13] are you about to do a wmf-config sync? [19:05:17] I rebased it [19:05:21] I didn't submit it [19:05:21] Or, wait? [19:05:24] Oh OK [19:05:28] Sorry I thought you created a new one [19:05:29] Carry on [19:06:05] apergos: are you going to monitor ms-be6? [19:06:10] RECOVERY - Apache HTTP on srv190 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.081 second response time [19:06:15] do the container/object rebalance, increase weight to 66 etc.? [19:06:25] and see if generally everything works ok? [19:07:51] New patchset: Ori.livneh; "Enable wgVectorCombineUserTalk for {test|mw}wikis" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33110 [19:07:58] RECOVERY - Apache HTTP on srv220 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.066 second response time [19:08:50] Reedy: if you're doing a config sync, can this change piggyback? ^ [19:08:54] lol [19:09:11] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31949 [19:09:20] :) [19:09:21] thanks [19:09:26] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33110 [19:09:34] RoanKattouw: is that ok to go out then? 
[19:10:02] Just a second [19:10:04] I have another one [19:10:16] New patchset: Catrope; "Hide the VE preference on mw.org and enable it by default" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33111 [19:11:54] Reedy: If you could merge that one too and push them out together, that would be great [19:13:24] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33111 [19:13:32] !log reedy synchronized php-1.21wmf4/extensions/EducationProgram [19:13:38] Logged the message, Master [19:13:51] paravoid, I had a look about 5 mins ago and saw a bunch of stuff in the objects on one partition [19:14:05] that's not exactly "monitoring" but at least it means stuff is now happening [19:14:09] apergos: that was a general question [19:14:20] basically, if I should have it on my mind and check it up or if you will [19:14:23] someone has to :) [19:14:31] no, I figured I would check it [19:14:59] !log reedy synchronized wmf-config/ [19:15:07] Logged the message, Master [19:18:12] Reedy: Excellent, that worked, thanks! [19:20:45] j^: here? [19:22:55] !log Updated Parsoid on wtp1 [19:23:01] Logged the message, Mr.
Obvious [19:23:07] RECOVERY - Puppet freshness on analytics1011 is OK: puppet ran at Mon Nov 12 19:22:48 UTC 2012 [19:26:34] RECOVERY - Puppet freshness on analytics1012 is OK: puppet ran at Mon Nov 12 19:26:24 UTC 2012 [19:27:37] RECOVERY - Puppet freshness on analytics1013 is OK: puppet ran at Mon Nov 12 19:27:16 UTC 2012 [19:29:34] RECOVERY - Puppet freshness on analytics1020 is OK: puppet ran at Mon Nov 12 19:29:15 UTC 2012 [19:29:34] RECOVERY - Puppet freshness on analytics1017 is OK: puppet ran at Mon Nov 12 19:29:17 UTC 2012 [19:29:35] RECOVERY - Puppet freshness on analytics1019 is OK: puppet ran at Mon Nov 12 19:29:20 UTC 2012 [19:29:35] RECOVERY - Puppet freshness on analytics1021 is OK: puppet ran at Mon Nov 12 19:29:24 UTC 2012 [19:29:35] RECOVERY - Puppet freshness on analytics1016 is OK: puppet ran at Mon Nov 12 19:29:25 UTC 2012 [19:29:35] RECOVERY - Puppet freshness on analytics1015 is OK: puppet ran at Mon Nov 12 19:29:27 UTC 2012 [19:29:35] RECOVERY - Puppet freshness on analytics1018 is OK: puppet ran at Mon Nov 12 19:29:27 UTC 2012 [19:30:10] RECOVERY - Puppet freshness on analytics1022 is OK: puppet ran at Mon Nov 12 19:29:41 UTC 2012 [19:30:10] RECOVERY - Puppet freshness on analytics1014 is OK: puppet ran at Mon Nov 12 19:29:58 UTC 2012 [19:38:36] hah, funny [19:38:46] www.wikivoyage.org is having the list of projects on the bottom [19:38:52] but one of them is missing [19:38:55] ...Wikipedia [19:38:56] New patchset: Reedy; "testwiki, wikidatawiki, test2wiki and mediawikiwiki to 1.21wmf4" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33115 [19:39:16] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33115 [19:42:08] I can't remember which one Chris copied... 
[19:42:51] RECOVERY - NTP on analytics1011 is OK: NTP OK: Offset -0.02186000347 secs [19:42:53] They all look somewhat inconsistent [19:47:11] RECOVERY - NTP on analytics1012 is OK: NTP OK: Offset -0.02181899548 secs [19:47:47] RECOVERY - NTP on analytics1013 is OK: NTP OK: Offset -0.01740527153 secs [19:48:59] RECOVERY - NTP on analytics1020 is OK: NTP OK: Offset -0.01003038883 secs [19:49:26] RECOVERY - NTP on analytics1017 is OK: NTP OK: Offset -0.02421700954 secs [19:49:26] RECOVERY - NTP on analytics1015 is OK: NTP OK: Offset -0.02195727825 secs [19:49:26] RECOVERY - NTP on analytics1019 is OK: NTP OK: Offset -0.02186012268 secs [19:49:27] RECOVERY - NTP on analytics1021 is OK: NTP OK: Offset -0.02689230442 secs [19:50:29] RECOVERY - NTP on analytics1022 is OK: NTP OK: Offset -0.01714742184 secs [19:50:29] RECOVERY - NTP on analytics1018 is OK: NTP OK: Offset -0.01956355572 secs [19:50:29] RECOVERY - NTP on analytics1014 is OK: NTP OK: Offset -0.01474118233 secs [19:50:38] RECOVERY - NTP on analytics1016 is OK: NTP OK: Offset -0.02328658104 secs [19:52:24] apergos: have you made any progress on the ms7 cruft btw? [19:53:06] not even gotten to look at it [19:53:19] do you think you'll be able to this week? [19:53:37] I can at least make sure we know what's really left [19:54:05] well, that's an hour's work :P [19:54:38] prolly a lot longer than that, actually [19:58:56] off for the night, talk to you tomorrow [20:00:02] yeah, I'm not really here either [20:00:04] laters [20:32:43] what's the value of $wgDBprefix for meta? unclear from wmf-config. 
[20:33:44] ori-l: "meta" per http://meta.wikimedia.org/wiki/Special:SiteMatrix [20:34:54] Jasper_Deng: thanks [20:37:10] Jasper_Deng: No [20:37:20] ori-l: nothing, none of the WMF projects have a prefix [20:37:26] oh [20:37:32] I read $wgDBname [20:37:45] reedy@fenari:~$ mwscript eval.php metawiki [20:37:45] > var_dump( $wgDBprefix ); [20:37:45] string(0) "" [20:38:14] That would also be wrong for $wgDBname [20:38:17] $ mwscript eval.php metawiki [20:38:21] you just made my day [20:39:13] eval.php gives you a php command line interface thing in a setup mw environment [20:39:16] pretty handy ;) [20:41:11] i configured phpsh to do the same on vagrant [20:41:33] phpsh has readline, so you get things like command history, autocompletion, etc [20:42:29] eval.php has that too [20:42:31] To an extent [20:42:58] ah, neat [20:43:07] maybe i should take out phpsh then [20:43:17] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [20:43:30] oh: if you have ctags installed it'll also show you docstrings inline [20:43:42] phpsh may or may not be useful if it's better than eval.php, but I haven't tried phpsh and you haven't tried eval.php so neither of us knows :) [20:44:03] and so we sail on, passing each other like ships in the night [20:44:35] i'll probably replace phpsh with eval.php for the sake of consistency [20:45:10] i want to write a plugin for vagrant so you can type 'vagrant repl' and get straight to an eval.php prompt [21:08:41] New patchset: Alex Monk; "(bug 42052) Add patrol right to wikidatawiki autopatrolled group..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33146 [21:08:52] paravoid: what's up?
[21:10:40] I think he's gone for the night, sorry [21:11:46] !log reedy synchronized wmf-config/InitialiseSettings.php [21:11:52] Logged the message, Master [21:15:38] !log reedy synchronized wmf-config/InitialiseSettings.php [21:15:44] Logged the message, Master [21:17:35] !log reedy synchronized wmf-config/InitialiseSettings.php [21:17:41] Logged the message, Master [21:28:29] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33146 [21:28:43] New patchset: Reedy; "Cleanup Lucene config, removing old globals and unused code" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33037 [21:29:30] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33043 [21:30:12] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32564 [21:30:39] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/32223 [21:32:08] !log reedy synchronized wmf-config/InitialiseSettings.php [21:32:14] Logged the message, Master [21:43:15] !log reedy synchronized wmf-config/CommonSettings.php [21:43:21] Logged the message, Master [21:45:49] !log reedy synchronized wmf-config/CommonSettings.php [21:45:55] Logged the message, Master [21:46:41] New patchset: Reedy; "Re-enable lucene search on wikidatawiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33148 [21:47:54] !log reedy synchronized wmf-config/CommonSettings.php [21:48:00] Logged the message, Master [21:48:14] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33148 [22:13:47] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [22:13:47] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [22:13:47] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [23:31:15] PROBLEM - 
Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours