[00:07:52] (03PS1) 10Hoo man: Undeploy the skins extension [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 [00:08:04] Reedy: ^ [00:08:16] Not sure that's all it takes to undeploy it [00:08:24] probably need to also fix tools/release [00:08:51] are they definitely unused? [00:08:59] I thought fundraising use(d) them? [00:09:13] Reedy: SELECT * FROM user_properties WHERE up_property = 'skin' AND up_value = 'tomas'; was empty [00:09:20] same for donate and schulenburg [00:09:32] I hope they don't, because they are terribly broken [00:09:45] enjoy: https://wikimediafoundation.org/wiki/Staff_and_contractors?useskin=schulenburg [00:09:58] https://wikimediafoundation.org/wiki/Staff_and_contractors?useskin=tomas [00:10:10] Yeah, needs a commit to tools/release [00:10:30] this one is especially nice: https://wikimediafoundation.org/wiki/Staff_and_contractors?useskin=Donate [00:10:52] oh wait, it's Donate (uppercase [00:10:53] ) [00:10:55] will check back [00:11:11] lol [00:11:20] I wonder if SkinPerPage can be disabled also [00:11:26] mysql:wikiadmin@db1038 [foundationwiki]> SELECT * FROM user_properties WHERE up_property = 'skin' AND up_value = 'Donate'; [00:11:26] Empty set (0.00 sec) [00:11:28] Oh.. That might be what you need to lookup [00:11:35] use of the skin parser function... [00:11:47] there's a skin parser function... what the ... 
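[Editor's note] The preference check quoted above can be run once over all the skins under discussion rather than one query per skin. A hedged sketch against MediaWiki's `user_properties` table (skin names taken from the conversation; any other values are illustrative):

```sql
-- Count users whose saved skin preference points at one of the
-- skins being undeployed (names from the discussion above).
SELECT up_value AS skin, COUNT(*) AS users
FROM user_properties
WHERE up_property = 'skin'
  AND up_value IN ('tomas', 'donate', 'Donate', 'schulenburg')
GROUP BY up_value;
```

An empty result set, as hoo saw, means no account on that wiki has the skin saved as a preference; it says nothing about per-page usage, which is why the dump grep below was still needed.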
[00:11:54] extension SkinPerPage [00:12:34] hoo: yw [00:12:38] I see, mh [00:13:15] Should be relatively easy to check [00:13:39] FormPreloadPostCache might be able to die also [00:14:17] I shouldn't have looked into that, I guess *shivers* [00:14:21] :D [00:14:25] [00:12:34] hoo: yw [00:16:21] (03CR) 10Reedy: [C: 04-1] Undeploy the skins extension (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 (owner: 10Hoo man) [00:18:07] I've just mentioned it in #wikimedia-fundraising [00:20:08] ?useskin= [00:20:11] (01:13:14 AM) Reedy: Should be relatively easy to check [00:20:12] Maybe. [00:20:12] how? [00:20:23] Gloria: would require scanning the web logs :/ [00:20:29] It would. [00:20:32] page_text LIKE "%<skin>%" But that's the more likely scenario. [00:20:41] Than database prefs. [00:21:15] Oh, right. There was a tag. [00:21:28] was? still is [00:21:34] on foundationwiki [00:21:42] Link? [00:21:55] if ( $wgDBname == 'foundationwiki' ) { [00:21:56] include( "$IP/extensions/FormPreloadPostCache/FormPreloadPostCache.php" ); [00:21:56] include( "$IP/extensions/SkinPerPage/SkinPerPage.php" ); [00:21:56] Gloria: https://github.com/wikimedia/mediawiki-extensions-SkinPerPage/blob/master/SkinPerPage.php [00:22:04] No. [00:22:07] I mean to an actual usage. [00:22:14] That's the question [00:22:18] Bah. [00:22:22] Hence my page_text LIKE [00:22:41] page_text sounds made-up. [00:22:50] You can grep a dump pretty easily. [00:23:15] http://dumps.wikimedia.org/foundationwiki/20140303/foundationwiki-20140303-pages-articles-multistream.xml.bz2 [00:23:26] 7.4 MB! [00:23:54] * Gloria wgets. [00:23:56] 40MB [00:24:47] Don't see "<skin>" in it. [00:24:54] I'm looking at meta-current. [00:25:10] $ bzcat foundationwiki-20140303-pages-meta-current.xml.bz2 | grep "<skin>" [00:25:27] I'm not sure if zgrep works with .bz2. [00:25:29] there is [00:25:29] how is meta current smaller?
[00:25:47] <skin>schulenburg</skin> [00:26:00] lol [00:26:01] literally [00:26:09] Ohhhhhh, hah. [00:26:12] Fuck XML. [00:26:43] 49 [00:27:18] Of course, some could be archive type unused pages [00:27:19] Reedy: https://wikimediafoundation.org/w/index.php?title=2007/Donate-thanks/fr [00:27:25] but it doesn't work [00:27:32] at least for me [00:27:32] It's probably been broken for a while. [00:27:52] It's mostly 2008 donation pages. [00:27:57] The skins are probably fine to kill. [00:29:47] if ( isset( $parserOutput->spp_skin ) ) { [00:29:49] TimStarling: Oh noes! SkinPerPage is apparently broken!!!!!! [00:29:55] I don't think that property still exists, even [00:29:57] hoo: Might want to amend your commit and kill SkinPerPage too. Update commit summary to note it's broken and old usages [00:30:04] Will do [00:30:08] let it rot in hell :P [00:30:15] hey! [00:30:30] hi TimStarling [00:30:31] it surely at least deserves a place in limbo [00:30:51] are extensions not born innocent? [00:32:15] Maybe before the first code commit :P [00:32:44] You do wonder why some ever existed... [00:33:28] or ever worked (AbuseFilter, ftw) [00:34:12] (03PS2) 10Hoo man: Undeploy the skins and SkinPerPage extensions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 [00:34:51] TimStarling: Any idea about checking if FormPreloadPostCache is still actually used? [00:34:52] forgot the extension list :/ [00:35:02] PS3! 
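[Editor's note] On the zgrep question above: plain `zgrep` handles gzip, not bzip2; the bzip2 package ships `bzgrep`, and the `bzcat | grep` pipeline used here works everywhere. A small self-contained sketch of the same pipeline (the dump file is a tiny stand-in, not the real foundationwiki dump):

```shell
# Build a tiny stand-in "dump", compress it, and count <skin> tags,
# mirroring the bzcat | grep pipeline used above.
printf '<page><skin>schulenburg</skin></page>\n<page>no tag</page>\n' > dump.xml
bzip2 -f dump.xml                              # yields dump.xml.bz2
count=$(bzcat dump.xml.bz2 | grep -c '<skin>')
echo "$count"
```

The real run against foundationwiki's pages-meta-current dump reported 49 matching lines, per the log.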
[00:35:04] and/or even still works [00:35:25] (03PS3) 10Hoo man: Undeploy the skins and SkinPerPage extensions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 [00:37:32] Reedy: https://gerrit.wikimedia.org/r/117813 [00:37:41] [00:36:17] (PS1) Reedy: Don't branch skins or SkinPerPage extensions [tools/release] - https://gerrit.wikimedia.org/r/117812 [00:37:47] :D [00:38:20] That's the nice thing of pair working: Everything is being done twice and then thrown away once (or twice) [00:40:24] $wgRequest->getArray( 'fppc' ); [00:41:04] * Jasper_Deng thought we're no longer supposed to be using $wgRequest [00:41:28] https://github.com/wikimedia/mediawiki-extensions-FormPreloadPostCache/commits/master [00:41:31] Jasper_Deng: That extension is from the middle ages [00:41:32] It's very old code [00:44:42] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:45:32] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 202728 bytes in 7.307 second response time [00:47:26] I'm not sure I even remember writing it [00:48:32] * TimStarling waits for the information to seep out of cold storage [00:48:40] Written under duress? [00:49:48] ok, maybe something is vaguely resolving [00:50:39] I think it was used in banner links, to prefill currencies and things like that, based on the source wiki [00:50:55] i.e. prefill currencies in the donation forms on wikimediafoundation.org [00:51:31] well, it still works: https://wikimediafoundation.org/wiki/Template:Add_News?fppc[title]=Black%20Magic%20extensions%20rule [00:51:41] wikimediafoundation.org is not used for actual donation forms anymore, is it? [00:52:28] I think they're all on donatewiki now [00:52:43] is that hosted offshore? 
[00:52:53] and DonationInterface has its own methods for this sort of thing [00:53:04] ok, donatewiki [00:53:04] It's a republic in EQIAD [00:53:10] it doesn't rely on rewriting the output of <form> elements [00:53:11] we grepping for donationwiki :P [00:53:43] * was [00:53:44] so I guess you can remove it [00:53:50] Ok [00:53:53] Reedy: Same commit? [00:54:00] the Grim Hoo Reaper [00:54:16] Yeah, can't see any reason for it to be done separately [00:56:21] (03PS4) 10Hoo man: Undeploy the skins, SkinPerPage and FormPreloadPostCache extensions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 [00:57:17] (03PS5) 10Hoo man: Undeploy the skins, SkinPerPage and FormPreloadPostCache extensions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 [00:57:52] I think ContactPageFundraiser will be going the same way very soon [01:01:05] vim trips over "foo" in comments, nice :/ [01:01:25] always wondered why it couldn't parse InitialiseSettings [01:03:39] you forgot to set $wgVimSyntaxHighlightingCompatibility to WMF_YES [01:05:03] hah, it seems to work if you open it without offset... weird [01:07:52] ori: It's so much simpler :P Just :1 every time before saving... then it will work next time... [02:14:08] !log LocalisationUpdate completed (1.23wmf16) at 2014-03-10 02:14:08+00:00 [02:14:18] Logged the message, Master [02:25:14] !log LocalisationUpdate completed (1.23wmf17) at 2014-03-10 02:25:14+00:00 [02:25:23] Logged the message, Master [02:57:15] (03PS1) 10BryanDavis: Remove dangling scap symlinks [operations/puppet] - 10https://gerrit.wikimedia.org/r/117817 [02:57:36] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Mar 10 02:57:32 UTC 2014 (duration 57m 31s) [02:57:47] Logged the message, Master [02:58:40] (03CR) 10Ori.livneh: [C: 031] "Looks good. Can you submit a follow-up patch that removes the entries from the manifest entirely?"
[operations/puppet] - 10https://gerrit.wikimedia.org/r/117817 (owner: 10BryanDavis) [03:03:05] (03PS1) 10BryanDavis: Remove removed scap symlinks [operations/puppet] - 10https://gerrit.wikimedia.org/r/117818 [03:04:09] (03CR) 10BryanDavis: "Follow up patch in I9f05dc9 to remove mention of the deleted scripts entirely." [operations/puppet] - 10https://gerrit.wikimedia.org/r/117817 (owner: 10BryanDavis) [03:05:55] (03PS2) 10BryanDavis: Remove dangling scap symlinks [operations/puppet] - 10https://gerrit.wikimedia.org/r/117817 [03:06:01] (03CR) 10Ori.livneh: [C: 032 V: 032] Remove dangling scap symlinks [operations/puppet] - 10https://gerrit.wikimedia.org/r/117817 (owner: 10BryanDavis) [03:07:06] (03PS2) 10BryanDavis: Remove removed scap symlinks [operations/puppet] - 10https://gerrit.wikimedia.org/r/117818 [03:07:53] (03CR) 10Ori.livneh: [C: 031] "I'll merge this in an hour, once the prior patch has had a chance to roll out." [operations/puppet] - 10https://gerrit.wikimedia.org/r/117818 (owner: 10BryanDavis) [03:34:55] (03PS2) 10Ori.livneh: Add IPv6 GeoIP support to Varnish [operations/puppet] - 10https://gerrit.wikimedia.org/r/30836 (owner: 10Faidon Liambotis) [03:37:24] (03PS3) 10BryanDavis: Send Vary header on http to http redirect [operations/apache-config] - 10https://gerrit.wikimedia.org/r/111925 [03:38:17] (03CR) 10Ori.livneh: "I think the code is correct, but the database could not geo-locate any of the IPv6 addresses that I tried. (All of the ones that I tried c" [operations/puppet] - 10https://gerrit.wikimedia.org/r/30836 (owner: 10Faidon Liambotis) [03:56:29] (03CR) 10Ori.livneh: [C: 032] Remove removed scap symlinks [operations/puppet] - 10https://gerrit.wikimedia.org/r/117818 (owner: 10BryanDavis) [05:04:37] (03PS1) 10Andrew Bogott: Added instancetype fact to get labs instance flavor. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117823 [05:05:41] (03CR) 10Andrew Bogott: "Coren -- this should work but is currently untested. 
When you have a corresponding patch that actually uses this then we can test and mer" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117823 (owner: 10Andrew Bogott) [07:54:02] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [08:30:24] Jenkins hung [08:52:02] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [08:54:46] (03CR) 10Hashar: [C: 032] "Thanks! Lets see what happens." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115623 (owner: 10Hashar) [09:01:51] !log Jenkins unresponsive for some reason yeah!!! [09:01:59] Logged the message, Master [09:05:02] (03CR) 10Nemo bis: "Added MZ and THO who are the last persons who used some tender loving care on those old pages. Is anyone taking care of them, can anyone a" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 (owner: 10Hoo man) [09:05:55] !log stopping Jenkins [09:06:03] Logged the message, Master [09:09:33] (03Merged) 10jenkins-bot: beta: memcached multiwrite to pmtpa and eqiad [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115623 (owner: 10Hashar) [09:10:20] hi hashar :) [09:11:05] !log Jenkins restarted and proceeding jobs again [09:11:13] Logged the message, Master [09:18:49] does akosiaris know the port used by udp2log ? [09:20:28] is it 8420? [09:21:03] matanya: a quick puppet git grep says that it might be more than that [09:21:14] yes, i did that [09:21:23] for example 51234 on oxygen for lsearch ? [09:21:23] but what is used in fact? [09:21:33] many different ones depending on service ? [09:21:38] and what do you mean in fact ? [09:21:44] whatever puppet says is used [09:21:48] are all in use? [09:21:55] or just defined? [09:22:21] if multiple are used, it complicates my change quite a lot, but oh well [09:22:32] if puppet says they are in use, they should be in use.
That is the SOA for all our configs [09:22:45] ok, thanks [09:23:06] you are welcome [10:04:46] so akosiaris basically you can pick any of https://gerrit.wikimedia.org/r/#/q/owner:%22Matanya+%253Cmatanya%2540foss.co.il%253E%22+status:open,n,z when you find time/will. [10:04:58] I would like to finish the sudo and etherpad modules before if possible [10:05:56] matanya: i am already compiling 111754 111761 111787 112738 112872 112876 112889 [10:06:20] and looking for diffs [10:07:02] sweet, thanks [10:07:58] (03CR) 10Hoo man: "> can anyone assess the consequences of removing this?" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 (owner: 10Hoo man) [10:44:29] (03CR) 10Alexandros Kosiaris: [C: 031] "I LGTM this, but... it should be applied on hosts that include the ferm class, which means that all roles of those hosts should have their" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117698 (owner: 10Matanya) [10:45:08] (03CR) 10Alexandros Kosiaris: [C: 032] nagios: remove broken solaris icon pointer [operations/puppet] - 10https://gerrit.wikimedia.org/r/117835 (owner: 10Matanya) [11:18:35] (03PS1) 10Alexandros Kosiaris: fix puppet doc for nrpe/bacula module [operations/puppet] - 10https://gerrit.wikimedia.org/r/117844 [11:21:44] (03CR) 10Alexandros Kosiaris: [C: 032] fix puppet doc for nrpe/bacula module [operations/puppet] - 10https://gerrit.wikimedia.org/r/117844 (owner: 10Alexandros Kosiaris) [11:27:32] (03CR) 10Mark Bergsma: [C: 032] Add reverse DNS for row D subnets [operations/dns] - 10https://gerrit.wikimedia.org/r/117409 (owner: 10Mark Bergsma) [11:29:19] (03CR) 10Mark Bergsma: [C: 032] Add LVS subnet IPs for row D [operations/dns] - 10https://gerrit.wikimedia.org/r/117412 (owner: 10Mark Bergsma) [11:30:48] (03PS2) 10Mark Bergsma: Setup eqiad LVS servers for row D real servers [operations/puppet] - 10https://gerrit.wikimedia.org/r/117413 [11:32:42] (03CR) 10Mark Bergsma: [C: 032] Setup eqiad LVS servers for row D real servers
[operations/puppet] - 10https://gerrit.wikimedia.org/r/117413 (owner: 10Mark Bergsma) [12:11:42] (03PS1) 10Alexandros Kosiaris: Allow Internet Archive to archive etherpads [operations/puppet] - 10https://gerrit.wikimedia.org/r/117845 [12:13:25] akosiaris: please oh, please incorporate this ^ into my module [12:13:40] i wish to reduce the rebases :) [12:14:51] ok will do [12:15:16] (03CR) 10Alexandros Kosiaris: [C: 032] Allow Internet Archive to archive etherpads [operations/puppet] - 10https://gerrit.wikimedia.org/r/117845 (owner: 10Alexandros Kosiaris) [12:20:56] (03PS18) 10Alexandros Kosiaris: etherpad: convert into a module [operations/puppet] - 10https://gerrit.wikimedia.org/r/107567 (owner: 10Matanya) [12:25:23] akosiaris: me again, would it be better to move it into files in the module? rather than files/misc? [12:26:11] nope. standard WMF-specific vs non-WMF specific talk. I put it in the role on purpose [12:26:56] if anyone ever decides to use that module they should not also get the policy that dictates internet archiver needs to be able to archive their pads [12:28:30] makes sense.
thanks for clarifying this, i'll remember it at last [12:33:04] (03PS1) 10ArielGlenn: puppetize the page count files copy job from gadolinium and move it [operations/puppet] - 10https://gerrit.wikimedia.org/r/117848 [12:40:41] (03CR) 10Matanya: "just being a nudge" (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/117848 (owner: 10ArielGlenn) [12:44:55] hahaha that would be the one I'd miss too [12:45:53] (03PS2) 10ArielGlenn: puppetize the page count files copy job from gadolinium and move it [operations/puppet] - 10https://gerrit.wikimedia.org/r/117848 [12:49:51] (03CR) 10ArielGlenn: [C: 032] puppetize the page count files copy job from gadolinium and move it [operations/puppet] - 10https://gerrit.wikimedia.org/r/117848 (owner: 10ArielGlenn) [12:57:37] (03PS1) 10ArielGlenn: fix up page count copy script perms, require the user that runs them [operations/puppet] - 10https://gerrit.wikimedia.org/r/117852 [12:59:46] (03PS2) 10ArielGlenn: fix up page count copy script perms, require the user that runs them [operations/puppet] - 10https://gerrit.wikimedia.org/r/117852 [13:00:54] (03CR) 10Alexandros Kosiaris: [C: 032] Make wikimetrics' database setup depend on alembic [operations/puppet] - 10https://gerrit.wikimedia.org/r/117416 (owner: 10QChris) [13:04:02] (03PS3) 10ArielGlenn: fix up page count copy script perms, require the user that runs them [operations/puppet] - 10https://gerrit.wikimedia.org/r/117852 [13:06:23] (03PS4) 10ArielGlenn: fix up page count copy script perms, require the user that runs them [operations/puppet] - 10https://gerrit.wikimedia.org/r/117852 [13:07:49] (03PS5) 10ArielGlenn: fix up page count copy script perms, require the user that runs them [operations/puppet] - 10https://gerrit.wikimedia.org/r/117852 [13:09:06] akosiaris: hello! do you have any clue how a debian changelog entry could specify multiple authors ?
[13:09:18] (03CR) 10ArielGlenn: [C: 032] fix up page count copy script perms, require the user that runs them [operations/puppet] - 10https://gerrit.wikimedia.org/r/117852 (owner: 10ArielGlenn) [13:09:21] can't find doc about it :-( [13:10:48] hashar: just one more author line [13:11:18] the -- Antoine Musso .... lines ? [13:11:23] I just concatenate them ? [13:11:45] hashar: can you paste the file in question? [13:11:54] yeah will be easier : http://paste.debian.net/86851/ [13:12:14] debian gets me lost all the time [13:12:20] I got to get rid of those debian packages :] [13:12:30] that is the correct format [13:12:41] :-] [13:15:42] Now I gotta find out how to upgrade from debian policy 3.9.3 to whatever is current [13:17:19] 3.9.5 [13:17:32] hashar: just change in control [13:24:54] hashar: The maintainer name and email address used in the changelog should be the details of the person uploading this version. [13:25:12] which by definition is one person [13:27:40] hashar: e.g http://metadata.ftp-master.debian.org/changelogs//main/a/asciidoctor/asciidoctor_0.1.4-1_changelog [13:27:45] re sorry [13:28:09] ahhh [13:28:19] first time i see that. Interesting [13:28:30] an abuse of course but since it works :-) [13:28:43] I should get rid of svn [13:28:44] :D [13:28:50] akosiaris: it is used for NMU and when mentoring [13:28:55] afaik [13:29:46] I came up with: http://anonscm.debian.org/viewvc/python-modules/packages/python-statsd/trunk/debian/changelog?revision=28123&view=markup [13:29:50] I guess it is totally wrong [13:30:40] I am wondering whether it will be lintian or some other tool that will croak on this [13:31:13] i haven't met one [13:32:46] should I make it like: http://paste.debian.net/86854/ [13:33:00] aka signature having my name and a section per author?
[13:38:30] better this way hashar [13:47:43] yeah [13:47:54] now gonna spend the rest of the afternoon to figure out how to build it :] [13:59:07] hashar: debian stuff is quite straightforward [13:59:15] FOR GOD SAKE [13:59:19] dont make me /quit [13:59:24] it is not? [13:59:40] every single time I need to touch a package it takes me a week of self motivation [13:59:51] and I eventually give up after half a day [13:59:51] i agree there are easier package managers [14:00:11] then I eventually rely on local friends to help me with the packaging [14:00:23] hashar: try to build rpm :P [14:00:25] so I can get a stupid minor version installed on wm server [14:00:34] took me six months to upgrade Zuul :/ [14:00:53] just wondering, do you use dh_helper of some kind? [14:01:42] I think that was the very last time I was playing with deb packages [14:01:51] will find out a better way to ship the python modules I need [14:02:08] by better I mean something I can do in less than 10 minutes including deployment [14:08:14] you need something like gem2deb for python [14:08:37] i know there was py2deb, but don't know its status [14:09:33] and there is fpm if you don't care for the process, just the outcome [14:10:02] hashar: anyway, i'll always be glad to help you with packaging if you need (and i can help) [14:10:23] na it is ok [14:10:55] the workflow doesn't work anyway since I need the package to be built by ops / uploaded to apt.wm.o [14:11:09] I got to put the dependencies I need in some git repo that I would deploy using git-deploy [14:11:42] i see [14:18:55] akosiaris: good morningnggg (well, my time : ) ) [14:19:08] this is just a friendly ping about: https://gerrit.wikimedia.org/r/#/c/115323/ [14:19:57] ottomata: good morning to you!. I have not forgotten you. I hope to have it by day's end :-) [14:20:16] you are a true friend! thank you! :) [14:23:10] ottomata: ping [14:23:28] hiya! [14:23:29] wanna do it? [14:23:47] sure!
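[Editor's note] The multi-author changelog style discussed above (the `[ Name ]` sub-section convention from the asciidoctor example) looks like this; the version string, date and email address here are invented for illustration, only the contributor names come from the log:

```
statsd (2.0-1) unstable; urgency=low

  [ Antoine Musso ]
  * Initial packaging

  [ Alexandros Kosiaris ]
  * Bump Standards-Version to 3.9.5, no changes needed

 -- Antoine Musso <uploader@example.org>  Mon, 10 Mar 2014 14:00:00 +0000
```

The trailer line carries only the uploader's details, which is how the "one person by definition" rule and multiple authors coexist.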
[14:24:46] ottomata: just verifying, udp2log ports are 8420 and 51234 any others i should know of, didn't find more, but i don't know the system very well [14:25:29] hm, its whatever they are set to in puppet :) [14:25:46] yes, that is what i found in puppet [14:25:53] then that is all I know of too :) [14:25:55] manybubbles: , ready? [14:26:21] do it! [14:26:29] hope my grep-fu is working good [14:26:48] !log put elastic1007 and elastic1013-1016 back in main elasticsearch pool [14:26:56] Logged the message, Master [14:27:02] matanya: , just curious, why you looking? [14:27:19] replacing iptables with ferm [14:27:43] oh nice, ok [14:27:44] cool [14:28:32] matanya: I actually need some ferm rules on a few boxes, if you are interested in submitting some patches that actually do some new things :) [14:28:58] ottomata: as always, tell me :) [14:29:17] and i hope to get to at in the pipe/queue [14:29:22] *it [14:30:14] hm, cool, ok, let's just start with zookeeper then [14:30:20] let's see [14:31:01] look at zoo.cfg.erb [14:31:07] you can see the 3 ports that zookeeper users [14:31:08] uses [14:31:27] i'm not 100% sure of the rules we need [14:31:40] but basically, i think that 2182 and 2183 should only be useable by other zookeepers [14:31:56] 2181 is the client port [14:31:57] and hm [14:32:14] that's actually the one i'm worried about, as I think people can issue commands through that port [14:32:30] for now I guess, only allow analytics nodes to talk to it [14:34:19] (03PS6) 10Hoo man: Undeploy the skins, SkinPerPage and FormPreloadPostCache extensions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 [14:34:27] (03CR) 10Reedy: [C: 032] Undeploy the skins, SkinPerPage and FormPreloadPostCache extensions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117810 (owner: 10Hoo man) [14:34:37] (03Merged) 10jenkins-bot: Undeploy the skins, SkinPerPage and FormPreloadPostCache extensions [operations/mediawiki-config] -
10https://gerrit.wikimedia.org/r/117810 (owner: 10Hoo man) [14:34:44] manybubbles: how's it look? [14:34:50] no errors [14:34:51] (03PS6) 10Reedy: Disable and remove ContactPageFundraiser [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110292 [14:34:54] (03CR) 10Reedy: [C: 032] Disable and remove ContactPageFundraiser [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110292 (owner: 10Reedy) [14:35:01] (03Merged) 10jenkins-bot: Disable and remove ContactPageFundraiser [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110292 (owner: 10Reedy) [14:35:16] ottomata: looks great! [14:36:38] ottomata: can you rt it? and document the hosts by name if possible? [14:36:40] great [14:36:45] yes matanya, can do [14:37:04] matanya: can you view the core-ops queue? [14:37:06] in rt [14:37:13] i can view all [14:37:21] - access [14:37:24] i thinl [14:37:24] k [14:39:01] k [14:48:48] thanks for the ticket [14:49:17] Something wrong with Labs? [14:49:31] !log reedy synchronized wmf-config/ 'Remove ContactPageFundraiser, SkinPerPage, FormPreloadPostCache and skins used on foundationwiki' [14:49:36] Logged the message, Master [14:51:17] Oh, it’s freenode. 
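[Editor's note] The ZooKeeper firewalling sketched out earlier (2181 for clients, 2182/2183 between quorum members) could translate into ferm along these lines; this is a hedged sketch, and the `$ANALYTICS_HOSTS` / `$ZOOKEEPER_HOSTS` variables are hypothetical stand-ins for real node lists, not names from operations/puppet:

```
# Hypothetical ferm fragment restricting the three ZooKeeper ports.
chain INPUT {
    # 2181: client port - per the discussion, analytics nodes only,
    # since arbitrary clients can issue commands through it
    saddr $ANALYTICS_HOSTS proto tcp dport 2181 ACCEPT;
    # 2182/2183: quorum and leader-election - other ZooKeeper nodes only
    saddr $ZOOKEEPER_HOSTS proto tcp dport (2182 2183) ACCEPT;
}
```

ferm accepts parenthesised lists for `dport`, which keeps the two intra-cluster ports in one rule.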
[14:56:54] !log reedy updated /a/common to {{Gerrit|Ibcbd10044}}: Disable and remove ContactPageFundraiser [14:56:58] (03PS1) 10Reedy: Set default for wmgContactPageConf [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117865 [14:57:02] Logged the message, Master [14:57:09] (03PS1) 10Hashar: beta: drop $wgSessionsInMemcached [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117866 [14:57:12] (03CR) 10Reedy: [C: 032] Set default for wmgContactPageConf [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117865 (owner: 10Reedy) [14:57:21] (03Merged) 10jenkins-bot: Set default for wmgContactPageConf [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117865 (owner: 10Reedy) [14:57:24] (03CR) 10Hashar: [C: 032] beta: drop $wgSessionsInMemcached [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117866 (owner: 10Hashar) [14:57:35] (03Merged) 10jenkins-bot: beta: drop $wgSessionsInMemcached [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117866 (owner: 10Hashar) [14:57:54] !log reedy synchronized wmf-config/InitialiseSettings.php [14:58:02] Logged the message, Master [15:00:33] (03PS1) 10Hashar: beta: write sessions to both datacenters [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117867 [15:00:42] (03CR) 10Hashar: [C: 032] beta: write sessions to both datacenters [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117867 (owner: 10Hashar) [15:00:49] (03Merged) 10jenkins-bot: beta: write sessions to both datacenters [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117867 (owner: 10Hashar) [15:01:55] hashar: --- [15:02:05] * ... [15:02:15] hoo: ? [15:02:32] hashar: I'm about to deploy :P I can sync thing as well, if you want [15:02:36] but don't get in my way on tin [15:02:48] they're not production [15:02:49] hoo: they are changes made to -labs.php files which do not impact production [15:02:58] Still should be synced... 
[15:03:01] hoo: they are safe to sync. I usually sync them in prod myself [15:03:09] (03CR) 10Faidon Liambotis: "If the language hostnames are not used, why not remove them from DNS instead?" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/113972 (owner: 10Thiemo Mättig (WMDE)) [15:03:14] hoo: sorry for the trouble . It is safe for sure I guarantee it [15:03:29] (03PS2) 10Bene: Enable Extension:GuidedTour on testwikidatawiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117452 [15:03:49] (03CR) 10Hoo man: [C: 032] Enable Extension:GuidedTour on testwikidatawiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117452 (owner: 10Bene) [15:03:58] (03Merged) 10jenkins-bot: Enable Extension:GuidedTour on testwikidatawiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117452 (owner: 10Bene) [15:04:01] hashar: Ok :) [15:04:02] \o/ [15:04:49] :-]] [15:05:02] !log hoo synchronized wmf-config/InitialiseSettings.php 'Enable Extension:GuidedTour on testwikidatawiki' [15:05:09] Logged the message, Master [15:05:58] !log hoo synchronized wmf-config/session-labs.php 'Syncing labs file for consistency {{Gerrit|I5e1a3242}}' [15:06:06] Logged the message, Master [15:07:00] ok, I'm done ;) [15:07:37] looks fine [15:08:27] (03CR) 10Faidon Liambotis: [C: 04-1] "This sets a *request* header, not a response header. 
This should be moved under the 666 if in vcl_error, unless I'm misunderstanding somet" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117471 (owner: 10MaxSem) [15:08:33] yep [15:08:39] (03PS2) 10MaxSem: Redirect Kindle search requests to mobile [operations/puppet] - 10https://gerrit.wikimedia.org/r/117449 [15:09:09] (03CR) 10Faidon Liambotis: [C: 032] Redirect Kindle search requests to mobile [operations/puppet] - 10https://gerrit.wikimedia.org/r/117449 (owner: 10MaxSem) [15:09:15] (03CR) 10Faidon Liambotis: [V: 032] Redirect Kindle search requests to mobile [operations/puppet] - 10https://gerrit.wikimedia.org/r/117449 (owner: 10MaxSem) [15:22:40] thanks, paravoid [15:22:54] did you see the -1 on the other one? [15:24:01] yep, fixing [15:26:47] (03CR) 10Tobias Gritschacher: "There are plans to use them in the future." [operations/apache-config] - 10https://gerrit.wikimedia.org/r/113972 (owner: 10Thiemo Mättig (WMDE)) [15:28:19] MaxSem: I can fix too, I was just wondering if there's a misunderstanding :) [15:29:00] no, you're right - it's needed in response [15:30:08] (03CR) 10Faidon Liambotis: [C: 04-2] "Then we'll add them in the future. Sending redirects now for hostnames that do not exist because we may use them in the future doesn't mak" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/113972 (owner: 10Thiemo Mättig (WMDE)) [15:34:51] (03PS2) 10MaxSem: Always output Content-Length on mobile redirect [operations/puppet] - 10https://gerrit.wikimedia.org/r/117471 [15:36:55] akosiaris or paravoid, does someone know if there is a maintenance/whatever host with a mediawiki install on it in eqiad? https://gerrit.wikimedia.org/r/#/c/117250/ [15:37:25] terbium? [15:37:28] and if not, what host is going to have one (in order to be able to replace fenari and/or hume)? [15:37:41] Nemo_bis: Terbium, tin, ... [15:37:43] is it? 
some time ago I was told it didn't have one yet [15:37:44] look at sites.pp [15:38:01] tin is the deployment host, terbium the maintenance one [15:38:04] iirc [15:38:06] sites.php can't tell me which is actually supposed to be used though [15:38:34] true, but taht wasn't your question [15:39:03] well, I tried to clarify with "in order to be able to replace fenari and/or hume" :) [15:39:16] terbium, then [15:41:00] (03PS1) 10ArielGlenn: fix typo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117878 [15:41:44] I'll submit a patch later to verify this then ;) [15:42:26] (03CR) 10Hoo man: [C: 032] "Weird :P" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117878 (owner: 10ArielGlenn) [15:42:33] (03Merged) 10jenkins-bot: fix typo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117878 (owner: 10ArielGlenn) [15:43:42] (03PS1) 10Ottomata: Giving Nik shell access to analytics1004 to do some elasticsearch load testing [operations/puppet] - 10https://gerrit.wikimedia.org/r/117879 [15:43:51] !log hoo synchronized multiversion/MWScript.php 'Path typo fix: {{Gerrit|I6a05447}}' [15:43:58] Logged the message, Master [15:45:28] (03CR) 10Ottomata: [C: 032 V: 032] Giving Nik shell access to analytics1004 to do some elasticsearch load testing [operations/puppet] - 10https://gerrit.wikimedia.org/r/117879 (owner: 10Ottomata) [15:47:26] (03CR) 10Addshore: "*is confused*" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/113972 (owner: 10Thiemo Mättig (WMDE)) [15:47:56] (03PS1) 10Ottomata: Need accounts::manybubbles [operations/puppet] - 10https://gerrit.wikimedia.org/r/117881 [15:48:12] (03CR) 10Ottomata: [C: 032 V: 032] Need accounts::manybubbles [operations/puppet] - 10https://gerrit.wikimedia.org/r/117881 (owner: 10Ottomata) [15:49:35] manybubbles: [Nik Everett]/User[manybubbles]/ensure: created [15:49:39] on analytics1004.eqiad.wmnet [15:51:52] (03CR) 10BBlack: [C: 04-1] "1) The regsub is using XFF rather than 
Set-Cookie" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 (owner: 10Dr0ptp4kt) [16:13:55] big swift spike [16:14:09] anyone know why? [16:14:37] http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&h=ms-fe1002.eqiad.wmnet&m=cpu_report&s=by+name&mc=2&g=network_report&c=Swift+eqiad [16:16:07] hoo's deploy shouldn't have anything to do with that. [16:16:16] (GuidedTour on testwikidata) [16:21:51] Coren: do you have time to work on the virtNNNN icinga certificate error at some point? [16:22:12] will you* [16:22:29] paravoid: Middle-end of the week, I expect. I'm still busy working out kinks in the migration, but that list of problems is getting thinner. :-) [16:33:34] !log restarted Zuul on gallium : leaked file descriptors (fixed upstream) [16:33:41] Logged the message, Master [16:39:07] (03PS1) 10ArielGlenn: puppetize the weekly centralauth mysql dump [operations/puppet] - 10https://gerrit.wikimedia.org/r/117888 [16:43:09] (03CR) 10ArielGlenn: [C: 032] puppetize the weekly centralauth mysql dump [operations/puppet] - 10https://gerrit.wikimedia.org/r/117888 (owner: 10ArielGlenn) [16:43:52] (03CR) 10Matanya: "merged, so never mind, but for next time" (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/117888 (owner: 10ArielGlenn) [16:44:12] apergos: are you aware of the snapshot1004 disk space alert? [16:52:51] (03CR) 10Alexandros Kosiaris: [C: 04-1] Initial 2.0.0-1 debian release (0311 comments) [operations/debs/archiva] (debian) - 10https://gerrit.wikimedia.org/r/115323 (owner: 10Ottomata) [16:54:17] Coren: what's with labstore4/labstore1001 and their last puppet run? [16:54:26] (03PS1) 10ArielGlenn: snapshot common role gets account related classes [operations/puppet] - 10https://gerrit.wikimedia.org/r/117892 [16:55:16] paravoid: yes, greg-g and co. are working on that [16:55:22] ok [16:55:27] (greg? snapshot?)
[16:55:34] it's a mw deploy space issue [16:55:44] aha [16:56:01] should have some updates before the next round (thursday) [16:56:34] k, I'm just looking at the icinga alerts [16:56:38] we're at an all time minimum [16:56:43] let's see if we can get it to zero :) [16:56:43] (03CR) 10ArielGlenn: [C: 032] snapshot common role gets account related classes [operations/puppet] - 10https://gerrit.wikimedia.org/r/117892 (owner: 10ArielGlenn) [16:56:47] :-) [16:58:18] paravoid: yeah, bryan has a change to scap that will delete old wmfXX branchs and i10n caches as soon as they are ready to be deleted. [16:58:45] oh hi, sorry for the gratuitous ping but that's good to know [16:59:04] paravoid: They're both disabled atm while I'm twiddling with their configs. [16:59:23] why are you doing manual config changes? [16:59:34] we don't generally do that, disabling puppet should be limited to 1-2 days [17:03:35] paravoid: apergos for more information than you probably want: https://etherpad.wikimedia.org/p/OldWmfBranchCleanup [17:03:55] ottomata: hey [17:04:32] RECOVERY - Host elastic1007 is UP: PING OK - Packet loss = 0%, RTA = 1.03 ms [17:16:03] * bd808 should circulate the cleanup email to a wider list [17:22:41] paravoid, hi [17:22:45] (was eating lunch) [17:22:49] nevermind [17:22:54] thanks :) [17:23:14] ha, ok! [17:26:58] (03PS3) 10Faidon Liambotis: Always output Content-Length on mobile redirect [operations/puppet] - 10https://gerrit.wikimedia.org/r/117471 (owner: 10MaxSem) [17:27:05] (03PS4) 10MaxSem: Always output Content-Length on mobile redirect [operations/puppet] - 10https://gerrit.wikimedia.org/r/117471 [17:28:12] (03CR) 10Faidon Liambotis: [C: 032 V: 032] Always output Content-Length on mobile redirect [operations/puppet] - 10https://gerrit.wikimedia.org/r/117471 (owner: 10MaxSem) [17:28:36] So… the ops meeting is in 30 minutes or 90? [17:29:02] greg-g: looks good! [17:29:17] i think 30 minutes, right? 
[17:29:32] yeah 30 min [17:29:36] cool [17:32:21] MaxSem: see above; will be deployed in the next 30' or so [17:32:35] thanks paravoid:) [17:32:46] np [17:34:12] addshore: I'm also confused [17:34:37] what's the point of having hundreds of domains that redirect to one when we can just not have the domains at all? [17:36:14] (03CR) 10Faidon Liambotis: "I'm also confused :) Yes, they exist but they are completely pointless, unless I'm misunderstanding something." [operations/apache-config] - 10https://gerrit.wikimedia.org/r/113972 (owner: 10Thiemo Mättig (WMDE)) [17:37:28] !log upgrading mediawiki on wikitech [17:37:31] Wait, it's called $wgSquidMaxage not $wgSquidMaxAge? WTF is "Maxage" meant to be? :-) [17:37:36] Logged the message, Master [17:38:05] <^d> andrewbogott: <3 [17:38:16] ^d: Wait and see what breaks... [17:38:24] <^d> Probably search! [17:38:25] <^d> :) [17:38:33] search is already broken, that's why I'm upgrading! [17:39:10] Actually… what's the latest branch for wmf mediawiki? [17:41:21] ^d, what branch should I upgrade to? Googling is oddly not answering this question [17:41:43] <^d> 1.23wmf17 [17:41:50] ok [17:41:51] thanks [17:41:53] <^d> (wmf16 is also acceptable, but 17's latest) [17:41:55] <^d> yw [17:42:47] <^d> andrewbogott: `mwversionsinuse` on tin or check wikiversions.dat in wmf-config. [17:42:50] <^d> for future ref. [17:44:04] (03CR) 10Faidon Liambotis: "No, backend should be enough. Varnish is smart enough to always accept gzip when fetching from the backend, even if the client request doe" [operations/puppet] - 10https://gerrit.wikimedia.org/r/108484 (owner: 10Ori.livneh) [17:44:07] bblack: ^ [17:45:09] manybubbles: which log is of interest for this cirrus failure? [17:45:25] andrewbogott: it'll log php warning [17:45:50] manybubbles: what log file? [17:45:56] There's nothing in apache logs which is where I looked first [17:46:09] no? [17:46:14] what other log files do you have? 
[17:46:36] we scatter the logs all over the place in production [17:47:10] (03CR) 10BBlack: "I think the gzip issues were with ESI specifically. I'm not aware of any good reason (yet!) to fear gzip in general w/ varnish 3.0.5." [operations/puppet] - 10https://gerrit.wikimedia.org/r/108484 (owner: 10Ori.livneh) [17:47:40] Ah, here we go, just had to restart apache for some reason... [17:47:43] manybubbles: PHP Fatal error: Class 'Elastica\\Exception\\PartialShardFailureException' not found in /srv/org/wikimedia/controller/wikis/slot0/extensions/Elastica/Elastica/lib/Elastica/Transport/Http.php on line 152, referer: https://wikitech.wikimedia.org/wiki/Main_Page [17:47:43] bblack: it was more than just ESI :( [17:47:54] bblack: we specifically had issues with Range requests when we migrated text [17:48:07] yes [17:48:12] well, there were those small bugs, but I thought we were past that w/ 3.0.5's upstream fixes? [17:48:14] bblack: (range requests would require Varnish to gunzip to be able to satisfy them) [17:48:21] I guess nobody has tested though [17:48:22] we can carefully retest it [17:48:24] nope [17:48:31] <^d> andrewbogott: You updated Cirrus & Elastica, right? [17:48:36] we have some VCL code that does unset req.http.Range [17:48:53] ^d: I tried to. [17:48:56] "FIXME: we're seeing an issue with Range requests and gzip/gunzip." [17:49:04] text + mobile [17:49:06] andrewbogott and ^d: looks like we missed another import.... [17:49:10] autoloader [17:49:22] * andrewbogott has no idea what that means [17:49:22] but that'd give us more information, there are already some failures [17:49:30] <^d> I'll write a patch for it. [17:49:35] thanks [17:49:37] that'll help [17:50:02] paravoid: well, let's test that first then. but worst case we could still just disable range on upload right? or do you think we get significant savings from Range on some sort of large downloads from upload? 
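The `unset req.http.Range` workaround and the proposed SVG gzip both live in VCL; a minimal sketch of the two pieces under discussion (Varnish 3.x syntax; the SVG condition is an assumption about what Ori's change might look like, not the actual patch):

```vcl
sub vcl_recv {
    # Sidestep the Range+gzip interaction bugs by refusing partial requests
    unset req.http.Range;
}

sub vcl_fetch {
    # Gzip SVGs at backend fetch time; Varnish stores the gzipped copy and
    # gunzips on delivery only for clients that don't accept gzip
    if (beresp.http.Content-Type ~ "^image/svg") {
        set beresp.do_gzip = true;
    }
}
```

Per the conversation, only the backend side needs `do_gzip`: Varnish always advertises gzip to the backend regardless of what the client sent.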
[17:50:15] no that's not it [17:50:20] well, kind of :) [17:50:33] so the problem with Range was with varnish doing gunzip [17:50:50] because otherwise, Varnish just fetches gzipped content, stores it and serves it unmodified [17:50:50] and you think fe-gunzip on when the client doesn't accept will trigger the same? [17:51:06] so varnish's gzip codepaths aren't being executed [17:51:18] greg-g: When can I deploy hotfix for watch-star breakage (bug 62422)? [17:51:23] now Ori with this changeset wants to do gzip on SVGs, because Swift isn't doing that [17:51:29] oh, I figured it was doing the gzipping at least [17:51:41] so if gunzip was broken, I'm wondering if gzip will also be :) [17:51:47] different codepaths I suppose [17:52:01] but worth raising a flag I think [17:52:25] let's gzip the geoip cookie! [17:52:30] * ori kids. Honest. [17:52:33] Krinkle: oh, you wanna, sure. how about now? [17:52:34] (03CR) 10BBlack: [C: 04-1] "Per clarifications on IRC, we should probably test some things (about varnish gzip) first before moving forward with this!" [operations/puppet] - 10https://gerrit.wikimedia.org/r/108484 (owner: 10Ori.livneh) [17:53:15] I'll look at it this week on my little varnish dev vhost and see if I can figure out which of all of the above things are or aren't broken on 3.0.5 [17:53:34] greg-g: k, thx :) [17:53:45] thanks! [17:55:46] (03PS1) 10ArielGlenn: convert oooold snapshot manifest to module [operations/puppet] - 10https://gerrit.wikimedia.org/r/117901 [18:01:27] <^d> andrewbogott: Updating Elastica submodule for wmf17 again should fix your fatal exception. [18:01:30] <^d> (just merged it) [18:01:51] ^d: k, go ahead. I'm currently backporting a core patch, but will be a few more minutes. [18:02:03] not yet on tin yet [18:03:14] !log demon synchronized php-1.23wmf17/extensions/Elastica/Elastica.php [18:03:16] ^d: It doesn't crash anymore, now it says... 
[18:03:20] well, see for yourself: https://wikitech.wikimedia.org/w/index.php?search=address&title=Special%3ASearch&go=Go [18:03:22] Logged the message, Master [18:03:41] <^d> That's better than a fatal :p [18:03:52] <^d> We probably need to rebuild index, but manybubbles or I can do that . [18:04:27] <^d> Krinkle: Thanks. I'm done and out of your way [18:05:44] ^d: Hm.. core wmf17 is now dirty on tin? Did you not apply the update? [18:05:46] ^d, yes please! [18:05:51] Or did you only pull inside the submodule? [18:06:06] <^d> Krinkle: I did :) [18:06:24] <^d> Whoops, I didn't. [18:06:32] You did both, but they point to different hashes [18:06:35] <^d> im such a git noob sometimes. [18:06:57] tehre is a core wmf17 commit updating Elastica, and it is pulled in on tin, and the Elastica submodule also points to latest wmf17 of that extension [18:06:59] but they're not the same [18:07:40] <^d> Fixed. [18:07:43] !log demon synchronized php-1.23wmf17/extensions/Elastica/Elastica.php 'I forgot how to use git' [18:07:51] Logged the message, Master [18:10:11] ^d: Ah, right, it's pointing at master now [18:10:21] So that previous sync didn't actually deploy it? Yikes [18:10:23] Deploying mine now [18:10:41] (03CR) 10Dr0ptp4kt: "'1) The regsub is using XFF rather than Set-Cookie'" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 (owner: 10Dr0ptp4kt) [18:10:51] <^d> Krinkle: Yeah first sync was pointless. [18:10:55] <^d> Second was correct. [18:11:03] ^^^bblack, will hit you up once i'm done with email and such and you're done with the ops meeting [18:11:09] !log krinkle synchronized php-1.23wmf17/resources/mediawiki.api/mediawiki.api.watch.js 'I2ac9e0da0f1c825' [18:11:18] Logged the message, Master [18:12:12] (03PS2) 10Dr0ptp4kt: Overwrite stale ZeroOpts= (blank) cookies if HTTPS zero-rated. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 [18:38:25] (03CR) 10Ottomata: Initial 2.0.0-1 debian release (0311 comments) [operations/debs/archiva] (debian) - 10https://gerrit.wikimedia.org/r/115323 (owner: 10Ottomata) [18:38:53] (03PS7) 10Ottomata: Initial 2.0.0-1 debian release [operations/debs/archiva] (debian) - 10https://gerrit.wikimedia.org/r/115323 [18:39:38] (03PS3) 10Dr0ptp4kt: Overwrite stale ZeroOpts= (blank) cookies if HTTPS zero-rated. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 [18:40:22] (03PS4) 10Dr0ptp4kt: Overwrite stale ZeroOpts= (blank) cookies if HTTPS zero-rated. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 [18:47:51] Coren: cached dns entries have a fixed lifetime, right? Is it possible that the slowdown involves all of the cache entries expiring in lockstep? [18:50:44] (03PS1) 10Jgreen: install tftpboot/trusty-installer/ubuntu-installer & tftpboot/trusty-installer/version.info from volatile [operations/puppet] - 10https://gerrit.wikimedia.org/r/117909 [18:56:28] (03CR) 10Jgreen: [C: 032 V: 031] install tftpboot/trusty-installer/ubuntu-installer & tftpboot/trusty-installer/version.info from volatile [operations/puppet] - 10https://gerrit.wikimedia.org/r/117909 (owner: 10Jgreen) [18:56:29] andrewbogott: That seems unlikely, otherwise I'd expect the slowdown to be pretty much independent of traffic levels. 
[18:57:03] ok [18:57:51] oh, hm, "cache size 150, 130083/466899 cache insertions re-used unexpired cache entries" [18:58:01] Bigger cache could definitely help [18:59:45] I just triaged maint-announced [18:59:51] s/d$// [19:00:02] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC [19:00:05] basically merge tickets that are the same upstream ticket and closed the ones that their dates have passed [19:00:16] please keep the queue in mind when doing RT duty :) [19:09:13] akosiaris: [19:09:26] re /var/run/archiva, should the init script just create that and chown it then? [19:09:38] dr0ptp4kt: so, you're right about the if-else logic, it does trap the case where ZeroOpts is in Set-Cookie, but lacks the tls option. My concern now is about multiple options in ZeroOpts in general and the set-cookie layout [19:10:02] I was about to answer that. Yes [19:10:30] so ottomata that way we avoid the postinst, postrm and package creation stuff [19:10:46] dr0ptp4kt: (a) the name ZeroOpts implies other, future options. What's the separator, and do we need to plan for searching for tls among several options? (b) What about the ordering of multiple cookies in Set-Cookie? Seems like the current regsub sill sometimes result in OtherCookie=whatever;; ZeroOpts=tls? [19:10:48] Hm… also, Coren, https://dpaste.de/YRRU [19:10:55] doesn't that actively kill dnsmasq every puppet run? [19:11:02] ottomata: is zookeeper a git submodule? [19:11:27] matanya: : yes [19:11:36] but, the ferm changes probably only nee to happen in the role class [19:11:47] andrewbogott: ... omfg. 
ensure=>stopped [19:11:48] OR [19:11:52] hm [19:11:54] oh, so this is way it is empty [19:11:59] you coudl do that in the submodule as an opt in [19:12:02] i need to clone the submodule [19:12:10] andrewbogott: *headdesks* [19:12:13] the reason i don't love it in the submodule is that it introduce a dependency on the ferm module [19:12:14] andrewbogott: Yes, yes it does. [19:12:16] Coren: I understand the intention there, but, seems wrong! [19:12:19] but, as long as the class is decoupled [19:12:23] you can put it in the submodule [19:12:33] OK, well… I'm working on that bit of code anyway, will have a patch soon [19:12:37] i'll need to see the .erb's before [19:12:41] class zookeeper::ferm, or zookeeper::firewall, or maybe zookeeper::server::firewall [19:12:42] or something [19:12:45] aye yeah [19:12:48] so [19:12:48] yeah [19:12:51] if oyu have puppet cloned [19:12:52] jsut run [19:12:58] git submodule update --init [19:13:22] and maybe read this: [19:13:23] https://wikitech.wikimedia.org/wiki/Puppet_coding#git_submodules [19:13:26] yes, did that :) thanks [19:13:34] * matanya is reading [19:13:42] ah, akosiaris_away, you want me to git rid of the postinst stuff? [19:14:12] or you just mean tthe line that chowns /var/run/archiva [19:15:05] you don't need that postinst at all, if i follow [19:15:12] re archive [19:15:15] *a [19:15:38] ottomata: just that line [19:15:46] the rest if pretty ok [19:15:49] is* [19:16:39] Coren: actually, I don't think that does anything. The dnsmasq started by nova-network doesn't respond to 'service stop dns' because it's not started by upstart [19:16:42] ottomata: basically anything the references/creates/delete blah blah /var/run/archiva apart from the init script needs to be updated [19:16:44] So, red herring [19:17:24] andrewbogott: that's because you want to try 'service dnsmasq stop' :-) [19:17:42] <^d> !log enwiki search indexes don't look happy. sporadic reports of "search request timed out" from users, cpu usage pretty high.
possibly unrelated bug logs getting spammed with: "Retry, temporal error for query..." [19:17:51] Logged the message, Master [19:17:59] Coren: Oh, yes, that's what I tried… just mis-transcribed in irc. [19:18:07] Coren, but, please check my results [19:19:06] andrewbogott: Also, I'm not sure how that provider behaves; it may well try to find and kill the daemon even though it wasn't in upstart? I think the 'correct' test would be to force a puppet run and see what it does to the daemon. [19:19:09] ottomata: i think the rules indeed should go in manifests/role/analytics/zookeeper.pp [19:19:26] ok, will try... [19:19:44] but we will need to put ferm there, and for that we must verify we don't block any wanted traffic [19:19:58] Specifically, the daemon's pid should not change because that'd indicate it was killed then restarted by nova-network. [19:20:24] Coren, looks like no change, same pid [19:20:26] for both processes [19:20:36] Ze erring, she was red. [19:20:55] That'd have been too easy. :-) [19:21:26] Well, I still suspect it's just because the cache is too small… increased activity would invalidate the /whole/ cache, leaving it to hit upstream for every single request [19:23:10] (03PS1) 10Andrew Bogott: Increase the cache size for dnsmasq. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117913 [19:24:42] yeah, true matanya [19:25:03] <^d> I hate lsearchd. Lack of any useful logging anywhere makes debugging it difficult. [19:25:31] yeah matanya, probably zookeeper/manifests/server/firewall.pp (or ferm.pp?) [19:25:40] class zookeeper::server::firewall (or ferm) [19:26:09] (03PS2) 10Andrew Bogott: Increase the cache size for dnsmasq. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117913 [19:26:25] i prefer zookeeper::server::firewall [19:27:05] Coren or matanya, quick review of ^ ? [19:27:18] akosiaris: since dpkg won't manage /var/run/archiva anymore, shoudl postrm remove it? 
[19:27:23] * andrewbogott realizes he hasn't provided any quick reviews for about a month :( [19:27:36] matanya: i tink I do too [19:28:00] andrewbogott: Isn't that file actually generated /by/ nova-network? [19:28:12] (03CR) 10Matanya: [C: 031] "LGTM" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117913 (owner: 10Andrew Bogott) [19:28:20] Ah, no it isn't. [19:28:26] no [19:28:37] (03CR) 10coren: [C: 031] "Appears sane." [operations/puppet] - 10https://gerrit.wikimedia.org/r/117913 (owner: 10Andrew Bogott) [19:28:57] http://docs.openstack.org/grizzly/openstack-compute/admin/content/dnsmasq.html [19:29:11] andrewbogott: in fact if the files contain thr same value you can combine them [19:29:30] matanya: true [19:29:35] but that might cause issue when you need diff value for diff versions [19:29:38] so, your call here [19:29:47] Yeah, they may diverge… hard to say. [19:29:59] i would combine 150 is low by all means [19:30:14] (03CR) 10Andrew Bogott: [C: 032] Increase the cache size for dnsmasq. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117913 (owner: 10Andrew Bogott) [19:30:16] (03PS8) 10Ottomata: Initial 2.0.0-1 debian release [operations/debs/archiva] (debian) - 10https://gerrit.wikimedia.org/r/115323 [19:30:27] and 600 looks reasoneable for both, but meh [19:32:26] dammit, it's not picking up the config [19:33:13] postmerge build failed.. [19:33:19] (03PS1) 10Jgreen: check in the new versions (mostly same) of trusty ubuntu-installer files [operations/puppet] - 10https://gerrit.wikimedia.org/r/117917 [19:33:43] matanya: yeah, that's unrelated I think [19:35:13] Jeff_Green: tabs and trailing spaces ... [19:35:32] well… Coren, I'm too sleepy to figure out why that config file is ignored. I'll look again in the morning, feel free to investigate in the meantime if you're interested. [19:35:39] matanya: where? 
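The fix being merged above boils down to a one-line dnsmasq setting, with 600 being the value floated in review; the drop-in path here is a placeholder, since the file the puppet module actually writes isn't shown in the log:

```
# placeholder path: /etc/dnsmasq.d/cache.conf
# raise the DNS cache from the 150-entry size observed above
cache-size=600
```

With only 150 entries, a burst of activity can evict the whole cache and send every lookup upstream, which matches the load-dependent slowdown being debugged.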
[19:35:47] * andrewbogott should not have started a patch after 3am [19:36:06] Jeff_Green: https://gerrit.wikimedia.org/r/#/c/117917/1/modules/install-server/files/tftpboot/trusty-installer/ubuntu-installer/amd64/boot-screens/adtxt.cfg [19:36:16] mwalker|alt: that is not a manifest. [19:36:28] i know [19:36:38] just for being pretty [19:36:44] why would I reformat stock files? that's insane [19:37:02] those are stock? ignore me then [19:37:26] i just saw you deleted all of them, and then re added [19:37:41] now i got what you did. sorry for the noise [19:37:44] oooh; more trusty boot stuff! :) [19:38:29] matanya: yeah, I was hoping we could handle all of that stock stuff from puppet/volatile instead of splitting it up between git and volatile, but that made puppet unhappy for some reason I don't yet understand, so I'm putting them back [19:38:47] yeah, got it at last [19:40:36] (03CR) 10Jgreen: [C: 032 V: 031] check in the new versions (mostly same) of trusty ubuntu-installer files [operations/puppet] - 10https://gerrit.wikimedia.org/r/117917 (owner: 10Jgreen) [19:40:42] bblack: regardiging (a) for multiple fields in the cookie - yurik and i agreed not to use the cookie for things other than tls. if we change our mind in the future, though, i think we'd want to rewrite the vcl. regarding (b) for the double semi-colon in the case that the cookie is in the middle of the Set-Cookie header (as opposed to the beginning or the end), should we just remove any double semi-colons on the very next line of the [19:40:42] or should we just let it be? 
[19:42:54] (03PS3) 10Reedy: Use Vips for images over 20MP [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117809 [19:43:09] (03CR) 10Reedy: [C: 032] Use Vips for images over 20MP [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117809 (owner: 10Reedy) [19:43:16] (03Merged) 10jenkins-bot: Use Vips for images over 20MP [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117809 (owner: 10Reedy) [19:43:58] !log reedy synchronized wmf-config/ 'Use Vips for images over 20MP' [19:44:05] Logged the message, Master [19:44:14] Coren: hey, remind me, what was the issue we were having with lvm thin? [19:44:19] Coren: was it filesystem corruption? [19:45:07] paravoid: It looked more like it confused XFS into /thinking/ there was corruption. Something about the interaction with XFS readahead of metadata. [19:45:23] unable to decrement a reference count below 0 [19:45:27] do you remember a message like that? [19:45:49] see https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-linus&id=cebc2de44d3bce53e46476e774126c298ca2c8a9 [19:45:52] paravoid: I'm pretty sure that wasn't it; it was about some bit of metadata being all zeroes instead. [19:48:34] paravoid: Yeah, different issue in this case. That said, it might be worthwhile to experiment with thin volume snapshots again once things stabilize back after migration. It /was/ a damn useful feature. [19:55:16] dr0ptp4kt: I'll take a look again at the regsub stuff later this afternoon. I think we should be able to craft a single regsub statement that appends the tls bit and doesn't mess with the order or the other cookies. [19:55:55] bblack, cool, thx! 
[19:56:55] (03CR) 10Alexandros Kosiaris: [C: 032] base: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - 10https://gerrit.wikimedia.org/r/111754 (owner: 10Matanya) [20:00:19] (03CR) 10Alexandros Kosiaris: [C: 032] varnish: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - 10https://gerrit.wikimedia.org/r/111761 (owner: 10Matanya) [20:00:31] merge day! yay :) [20:02:13] akosiaris: think we can IRC discuss the remaining archiva issues? [20:02:20] or, i bet it is getting late for you, eh? [20:02:30] ottomata: it is.. like 10pm [20:02:37] thanks :) [20:02:55] we can do it tomorrow your morning though ? [20:04:42] (03CR) 10Alexandros Kosiaris: [C: 04-1] cache: puppet 3 compatibility fix: fully qualify variable (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/111787 (owner: 10Matanya) [20:06:29] yeah k sounds good [20:08:02] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC [20:08:12] (03PS2) 10Matanya: cache: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - 10https://gerrit.wikimedia.org/r/111787 [20:08:23] (03CR) 10Alexandros Kosiaris: [C: 032] search: remove lookupvar and replace with top scope @ var [operations/puppet] - 10https://gerrit.wikimedia.org/r/112876 (owner: 10Matanya) [20:10:40] (03CR) 10Alexandros Kosiaris: [C: 032] "I will merge this but the rest of the variables in that file could use the same treatment too :)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/112872 (owner: 10Matanya) [20:10:47] bblack: I'm about to deploy parsoid with the new trebuchet and might need help in case that fails [20:14:21] (03CR) 10Hashar: "Most probably fixed https://bugzilla.wikimedia.org/show_bug.cgi?id=62484 intermittent but frequent Login error on beta labs" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117867 (owner: 10Hashar) [20:14:25] (03PS1) 10Matanya: base: remove lookupvar and 
replace with top scope @ var [operations/puppet] - 10https://gerrit.wikimedia.org/r/117922 [20:15:31] the new shiny trebuchet still has the same problems: [20:15:35] git deploy service restart [20:15:35] Error received from salt; raw output: [20:15:35] 'deploy.restart' runner publish timed out [20:15:55] bblack, mutante: can one of you restart the parsoids as root? [20:16:37] this should work on the salt host: salt-run deploy.restart 'parsoid/deploy' '10%' [20:16:53] s/salt host/salt master/ [20:18:38] !log deployed Parsoid 681f7b8d2 using deploy 77d17489; service restart incomplete due to [[bugzilla:61882]] [20:18:47] Logged the message, Master [20:22:54] akosiaris, paravoid, apergos, Coren ^^ [20:23:14] * Coren reads backlog. [20:23:49] I'm pingable but pretty checked out (10:30 my evening, I start at an unreasonable mornign hour for programmers too) [20:24:40] * gwicke is just trying to get the attention of some root so that we can finish the parsoid deployment [20:24:42] Doing it. [20:25:04] Restart completed [20:25:04] (sez salt) [20:25:05] Coren, thanks! [20:26:48] !log Coren fixed up the Parsoid deploy by running "salt-run deploy.restart 'parsoid/deploy' '10%'" from the salt master as a work-around for [[bugzilla:61882]] [20:27:00] Logged the message, Master [20:28:53] something is periodically hitting swift doubling the bandwidth needed and saturating the boxes [20:29:12] I haven't been able to caught it when it happens and logs are a mess [20:29:15] any ideas? AaronSchulz maybe? 
[20:29:39] http://ganglia.wikimedia.org/latest/graph_all_periods.php?h=ms-fe1001.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2&st=1394483293&g=network_report&z=large&c=Swift%20eqiad [20:32:51] paravoid: we got full logs for swift if that can help [20:32:59] I know [20:33:07] making sure you do :-] [20:33:30] lots of errors trying to view https://en.wikipedia.org/wiki/File:Theophrastus_redivivus_%28Paris_manuscript%29.pdf [20:34:04] the MW log has http 200 logged...which usually means that content-length mismatched the payload [20:35:50] Original file ‎(8,225 × 13,085 pixels, file size: 1.87 GB, MIME type: application/pdf, 1,106 pages) [20:35:54] seriously? [20:36:09] haha [20:36:25] Yes... [20:36:29] see also https://bugzilla.wikimedia.org/show_bug.cgi?id=62423 [20:36:34] you are right, sir [20:36:52] appeared 8 times within one of minutes where we have a spike [20:37:08] AaronSchulz, you have one more beer credit [20:37:28] I wonder why getLocalCopy is triggering [20:37:54] getMetadata() is fast and the metadata looks fine on the page, so it must be something else [20:38:02] what do you mean? [20:38:05] it's trying to scale it down [20:38:10] this means the imagescalers fetching 1.9G from swift [20:38:21] multiple times, because you have multiple thumbs in the page [20:38:30] ah, right, just thumbnails [20:38:33] Yeah, so far the thumbnails don't work for the 2GB PDF ;) [20:38:33] they are broken [20:38:43] it should bail out after a few failures though [20:38:44] yeah, probably because of timeouts [20:38:53] https://upload.wikimedia.org/wikipedia/commons/thumb/1/17/Theophrastus_redivivus_%28Paris_manuscript%29.pdf/page1-301px-Theophrastus_redivivus_%28Paris_manuscript%29.pdf.jpg [20:38:56] indeed it is [20:39:04] * AaronSchulz is glad he put that logic in [20:39:07] I am glad too! [20:39:15] I never thanked you for that :) [20:39:24] that's probably more than one beer credit [20:40:13] but that's for each page, right? 
[20:40:36] so it should be easy to kill the site if I keep on requesting thumbnails for this file? [20:40:39] clever. [20:40:49] what's your point? [20:40:50] not the site [20:41:37] can't the file be devided into several smaller ones? [20:42:18] *i [20:42:46] paravoid: for each thumbnail, so other sizes have their own counter [20:43:00] yes, that's what I meant, sorry [20:43:11] other sizes and other pages [20:43:13] might be useful to have a per-file counter too in case people do things with non-standard sizes [20:43:40] this is multipage tiff, so page1-301px has a different counter than page1-75px and a different than page2-75px [20:43:50] the permutations are enough to make this an issue [20:44:12] twkozlowski: You must be new around here [20:44:35] Yes. [20:44:39] nah, insinuating that people are stupid is an old-timer's behavior [20:44:55] paravoid: assuming people use that dropdown, yeah [20:45:13] PoolCounter all the things [20:45:15] all the multipage media stuff is pretty terrible in most ways [20:45:23] paravoid: Not sure what your problem is, but I haven't been insinuating anything. [20:45:24] we all know this of course ;) [20:45:26] So get a grip. [20:46:33] can someone kill that file from commons? [20:46:49] Yeah [20:47:02] I have no administrator privileges nor do I want them [20:47:20] I can kill the file from swift, but better if you just delete it properly :) [20:47:32] Gone [20:47:36] thanks :) [20:47:37] Well, going... [20:48:22] Must be waiting for a synchronous delete/move [20:48:28] it has to move two files to archive zone :) [20:48:46] probably will timeout [20:49:03] heh [20:51:03] 304 size/page permutations requested per logs [20:51:56] so, yeah, we probably need some overall fail counter per file [20:52:34] A database query error has occurred. This may indicate a bug in the software. 
[20:52:34] Function: WikiPage::pageData [20:52:34] Error: 1205 Lock wait timeout exceeded; try restarting transaction (10.64.16.29) [20:52:35] wheee [20:52:48] It's not deleted either [20:57:02] (03PS1) 10Hashar: HHVM enabler for application servers [operations/puppet] - 10https://gerrit.wikimedia.org/r/117931 [20:57:47] (03CR) 10jenkins-bot: [V: 04-1] HHVM enabler for application servers [operations/puppet] - 10https://gerrit.wikimedia.org/r/117931 (owner: 10Hashar) [20:58:12] see if delete batch might work [20:59:35] :-( [21:00:00] reedy@tin:~$ mwscript deleteBatch.php --wiki=commonswiki --u=Reedy --r="Too big" delete.txt [21:00:01] File:Theophrastus redivivus (Paris manuscript).pdf Deleted! [21:00:01] reedy@tin:~$ [21:00:04] Very quickly [21:01:27] thanks Reedy :) [21:02:03] (03PS2) 10Hashar: HHVM enabler for application servers [operations/puppet] - 10https://gerrit.wikimedia.org/r/117931 [21:04:28] (03CR) 10Hashar: "notify {} needs the message to be passed with a 'message' option, forgot it sorry :-(" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117931 (owner: 10Hashar) [21:16:00] bblack: mind handling https://rt.wikimedia.org/Ticket/Display.html?id=7009 ? [21:26:13] matanya: ok [21:26:33] thanks, just pinged you since it says you are on the duty [21:27:23] !log No new data in logstash since 14:56Z. Bryan will investigate. [21:27:31] Logged the message, Master [21:33:04] (03PS1) 10RobH: reclaim arsenic as spare [operations/puppet] - 10https://gerrit.wikimedia.org/r/118010 [21:34:42] (03PS1) 10RobH: reclaim arsenic as spare [operations/dns] - 10https://gerrit.wikimedia.org/r/118011 [21:39:59] !log ms-be5 swapping failed disk [21:40:07] Logged the message, Master [21:42:14] sbernardin: you are fixing brewster too? [21:42:22] RECOVERY - RAID on ms-be5 is OK: OK: optimal, 13 logical, 13 physical [21:42:23] or is it fixed already? 
[21:43:43] (03PS1) 10Dzahn: add account for Chase and add to admins::root [operations/puppet] - 10https://gerrit.wikimedia.org/r/118016 [21:44:18] !log arsenic reclaim per rt6522, ignore alerts [21:44:27] Logged the message, RobH [21:45:12] (03CR) 10Dzahn: [C: 04-1] "please add the right SSH key" [operations/puppet] - 10https://gerrit.wikimedia.org/r/118016 (owner: 10Dzahn) [21:46:04] (03CR) 10RobH: [C: 032] reclaim arsenic as spare [operations/puppet] - 10https://gerrit.wikimedia.org/r/118010 (owner: 10RobH) [21:46:15] mutante: you should add xxxx to the resource too, in order to prevent mistake later, he is nit legoktm. (just a friendly advice) [21:46:33] (03CR) 10RobH: [C: 032] reclaim arsenic as spare [operations/dns] - 10https://gerrit.wikimedia.org/r/118011 (owner: 10RobH) [21:46:34] !log Restarted elastcisearch on logstash1003; it was JVM heap thrashing at 98% heap used. [21:46:43] Logged the message, Master [21:47:36] !log ganglia monitoring for elasticsearch on logstash cluster seems broken. Caused by 1.0.x upgrade having not happened there yet? [21:47:45] Logged the message, Master [21:49:53] anyone know if we have historical cache requests/second growth data? [21:50:05] like, a couple of years back at least? [21:50:42] its hard to read this out of ganglia, especially since varnish is relatively new, and I dunno what was up before that really [21:50:48] or if we kept htose metrics from squid or whatever else [21:51:06] like, how many web requests/second did we server this time last year? [21:51:12] serve* [21:52:06] (03PS2) 10Dzahn: add account for Chase and add to admins::root [operations/puppet] - 10https://gerrit.wikimedia.org/r/118016 [21:54:18] mayyyybe paravoid knows? ^^ :) [21:55:30] !log Restarted logstash on logstash1001; new events flowing in again now [21:55:30] (03PS3) 10Dzahn: add account for Chase and add to admins::root [operations/puppet] - 10https://gerrit.wikimedia.org/r/118016 [21:55:41] Logged the message, Master [21:56:12] tabs vs. 
spaces vs. flying unicorns again? :P [21:57:04] (03CR) 10Ori.livneh: [C: 032] HHVM enabler for application servers [operations/puppet] - 10https://gerrit.wikimedia.org/r/117931 (owner: 10Hashar) [21:58:12] ignores tab comments [21:58:58] ori: sorry :-( [21:59:46] ori: we can probably create a hhvm based app server on beta cluster and have varnish to route some specific url/wiki/cookie to that app server [22:00:16] tim had an idea of how to do it in apache using mod_proxy_fcgi that i think would work [22:00:26] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC [22:00:32] I wonder whether a Wikibase setup runs fully on HHVM... did you guys test that? [22:00:53] hashar: tim's idea was: RewriteCond %{HTTP_COOKIE} magic_cookie , RewriteRule ^/(.*\.php(/.*)?)$ fcgi://127.0.0.1:8888/a/mw/$1 [P] [22:00:53] hoo, who's WD dev here?:P [22:01:05] yesyessomoenesaidhhvm [22:01:15] (03PS1) 10RobH: DNS cleanup [operations/dns] - 10https://gerrit.wikimedia.org/r/118019 [22:01:21] (03CR) 10Gergő Tisza: WIP Add MMV feature flags for beta and pilot sites (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/117376 (owner: 10MarkTraceur) [22:01:35] domas, lotsofpeopledidthatlately [22:01:39] MaxSem: Dunno... *looks innocent* [22:01:45] rwar [22:01:46] rawr [22:01:47] :) [22:01:59] domas: have you heard of kPHP, btw? [22:02:17] by ВК (vkontakte) [22:02:28] ori: I would do it at the varnish level and route them to a dedicated cluster of hhvm applicationserver [22:02:32] hoo, you mean PHP3 with PHP5 syntax but without OOP? [22:02:36] (03CR) 10RobH: [C: 032] DNS cleanup [operations/dns] - 10https://gerrit.wikimedia.org/r/118019 (owner: 10RobH) [22:02:48] ori: at least for the first phase.
I dont think we want hhvm installed everywhere [22:02:51] hoo: VK sure knows how to copy [22:02:56] \o/ [22:03:02] MaxSem: No idea, I just read about it randomly and couldn't find anything useful [22:03:07] hashar: was libmemcached the only issue? [22:03:32] domas: Reads like it's pretty much the original hip hop (converts to Cpp and then compiles) [22:03:42] hoo: yup [22:03:51] VK doodz know how to troll http://beta.hstor.org/getpro/habr/post_images/0f7/6c7/ba5/0f76c7ba59445b8864e784845902ae3e.png [22:04:09] MaxSem: :D Source? [22:04:18] THEM [22:04:20] that looks super credible [22:04:36] accurate to three decimal points and everything [22:04:37] ori: probably [22:04:39] that slide shows the flawed methodology right there [22:04:46] HHVM optimizes on the fly [22:04:54] ori: libmemcached06 for wikimedia-task-appserver vs libmemached10 for the hhvm package [22:04:59] whereas HPHPC was all static analysis and profile-based etc [22:05:19] ori: we can create another instance like deployment-hhvm01 and use role::applicationserver::common( hhvm => true ) on it [22:05:21] (03PS1) 10EBernhardson: Send Flow specific logs to fluorine [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/118020 [22:06:20] we don't use libmemcached internally methinks :( [22:06:45] hashar: yeah, it might be ok for now. but the whole "drop-in replacement" value is diminished somewhat if it means "drop your production cluster and replace it with another". ideally the presence of another php interpreter wouldn't be grounds for needing to provision a separate instance [22:06:48] domas: heh... I just found that randomly in the russian Wikipedia some weeks back and wondered whether there's any credible information (or even a binary) out there... 
but nothing [22:07:29] ori: but for testing on beta that would be enough for now [22:07:36] PROBLEM - Host es3 is DOWN: PING CRITICAL - Packet loss = 100% [22:07:38] hoo: don't you know [22:07:42] .ru has all the best technology [22:07:44] hashar: +1 [22:07:47] ori: that buy you time with Tim to figure out an appropriate scenario for production :-] [22:07:48] how do they say [22:07:51] luchye v mire [22:09:03] :D [22:09:05] hoo, it's there: https://github.com/vk-com/kphp-kdb [22:09:33] MaxSem: Well, that's what you get for searching that on yandeks [22:09:41] + yandex [22:09:54] no OOP as I said, no other crap nobody uses like first(), end(), next(), prev(), current(), reset(), key() [22:10:25] MaxSem: people use next, current, reset, key, etc :P [22:10:38] [22:10:43] oh :P [22:10:57] php (subset) to c++ compiler: s/\$//g [22:11:40] well, eventually all PHP will switch to Hack [22:11:42] \o/ [22:11:45] can't wait for that [22:11:46] <3 <3 <3 [22:11:57] i can hope, but i dont expect i can write hack for prod here for a couple years :P [22:12:20] * ebernhardson can hear a chorus of 'but how will it run on shared hosting'  [22:12:33] it would be so awesome to say "mediawiki from now on is hack" <3 [22:13:53] <^d> ebernhardson: Sharded hosting can shove it. [22:13:56] <^d> *Shared [22:14:01] just discovered that VK also wrote their own DB too [22:14:08] ^d: oh i'm totally in that camp, but mediawiki sure isn't [22:14:08] велосипедисты [22:14:11] MaxSem: link :P [22:14:19] google.com [22:14:33] <^d> ebernhardson: Loudest voices aren't consensus ;-) [22:14:54] :P [22:15:26] MaxSem: every .ru property has their own databases [22:16:58] or an own file system, what the... [22:17:16] :-) [22:18:11] !log Two instances of logstash were running on logstash1001; Killed both and started service again [22:18:19] Logged the message, Master [22:18:44] interesting [22:19:42] ah, they only open sourced kPHP now... 
I was sure it was proprietary stuff last time I checked :P [22:19:57] https://ru.wikipedia.org/w/index.php?title=KPHP&diff=61818555&oldid=60391479 :P [22:20:42] * hoo goes to do something less useless [22:21:06] * ^d gives hoo that deletion bug [22:21:10] <^d> Since you're bored and all ;-) [22:22:20] ^d: I'm bored enough to look at weird software projects, not bored enough to do anything useful :D [22:27:37] weird software projects - https://pbs.twimg.com/media/BiYw4OuIIAEpksG.png [22:40:37] heading to bed *wave* [22:40:43] g'night [22:40:54] 'night hashar ;) [22:41:38] you too! [22:54:03] (03CR) 10BBlack: "Do we have cases now where multiple Set-Cookie (other than geoip) are being sent, which trigger this problem? I'm not fond of the magic m" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117375 (owner: 10Ori.livneh) [22:57:06] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (201557) [23:00:06] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:01:07] (03PS5) 10BBlack: Overwrite stale ZeroOpts= (blank) cookies if HTTPS zero-rated. [operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 (owner: 10Dr0ptp4kt) [23:03:23] (03CR) 10BBlack: "How about this?" [operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 (owner: 10Dr0ptp4kt) [23:05:35] (03CR) 10Dr0ptp4kt: [C: 031] "Oh right, no need to mess with the semicolon that way. Nice." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/117661 (owner: 10Dr0ptp4kt) [23:06:06] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200711) [23:07:19] !log shutting down srv258-srv270 [23:07:27] Logged the message, Master [23:08:26] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC [23:08:46] PROBLEM - Host srv261 is DOWN: PING CRITICAL - Packet loss = 100% [23:09:18] mutante, is hume going to be killed too? [23:09:21] *soon [23:09:32] <^d> hume still has stuff running on it :( [23:09:46] it's spamming fatal logs with "db.php not found" errors [23:09:48] MaxSem: after this is resolved: 4792 deploy terbium as hume-equivilent server in eqiad [23:10:00] <^d> https://gerrit.wikimedia.org/r/#/c/74591/ - main thing I know of still on hume. [23:10:07] last comment was "it's still used" indeed [23:10:24] so just as soon as we can [23:12:18] srv261 should _not_ have been in icinga anymore.. sigh.. fixing [23:14:06] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:26:11] (03PS3) 10CSteipp: Raise minimum password length [core] - 10https://gerrit.wikimedia.org/r/117635 [23:26:48] (03CR) 10MaxSem: [C: 031] Add password reset link from desktop on mobile [extensions/MobileFrontend] - 10https://gerrit.wikimedia.org/r/117008 (owner: 10Jdlrobson) [23:27:14] (03CR) 10GWicke: "This got left behind & needs a rebase. Probably conflicts on the blacklist." [services/parsoid] - 10https://gerrit.wikimedia.org/r/116903 (owner: 10Subramanya Sastry) [23:27:33] YuviPanda, grrrit-wm went bonkers ^^^ [23:27:46] MaxSem: hmm? [23:27:50] MaxSem: oh [23:27:53] AzaToth: ^ [23:28:09] MaxSem: is it sending everything here? [23:28:12] notice [23:28:41] hmm, no - I still see some stuff in #-dev [23:28:53] it sends to #wikimedia-qa as well [23:28:56] makes no sense to me [23:29:05] interesting.. 
[23:29:20] YuviPanda did +2 on the code, so... [23:29:25] heh :D [23:29:36] AzaToth: I'll revert for now? I'll look at it again tomorrow [23:29:46] * YuviPanda logs in [23:29:46] I'll revert the instance [23:30:12] <^d> !log kicking gerrit to pick up bugfix. [23:30:20] Logged the message, Master [23:30:25] AzaToth: ok, no need to revert on gerrit then [23:30:33] (03CR) 10Mwalker: [C: 032] utm_campaign for Mediander [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/118025 (owner: 10Adamw) [23:30:36] hmm [23:30:59] interesting... wonder why a fundraising database announces here [23:31:03] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (201600) [23:31:05] *er fundraising repo [23:31:15] mwalker: YuviPanda's fault (and mine) [23:31:26] mine indeed. [23:31:44] <^d> {{done}} [23:32:05] reverted [23:32:34] *nods* thanks :) [23:32:57] YuviPanda: I actually rebased the two commits into oblivion [23:33:28] gwicke: https://gerrit.wikimedia.org/r/#/c/116996/3 [23:33:34] that was the change [23:34:02] makes no sense to me why it failed [23:34:33] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:36:00] (for the record; we don't mind it changing here temporarily while you work it out; just not permanently) [23:36:02] MaxSem: ↑ [23:36:25] mwalker: I've reverted it for now, as I've no clue why it failed [23:36:38] AzaToth, :) [23:37:31] MaxSem: _.has(variable, "key") on undefined/null/string should result in false right? 
[23:37:33] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (201476) [23:37:53] it did that when I tested it in node at least locally [23:37:58] AzaToth, ich sprache nich Python [23:37:59] unless I fnucked up [23:38:04] MaxSem: it's JS [23:38:12] from "underscore" [23:38:17] even worse:) [23:38:32] MaxSem: fyi, grrrit-wm is written in JS [23:38:49] MaxSem, it's in the eye of the beholder [23:38:50] MaxSem: https://gerrit.wikimedia.org/r/#/c/116996/3/src/relay.js [23:39:00] * MaxSem burns AzaToth at stake [23:39:05] heretic! [23:39:15] heh [23:39:20] _ as a variable name, might as well go full-crazy and invent a bunch of infix operators so noone knows whats going on :P [23:39:33] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:40:03] ebernhardson, too bad we can't have full monads [23:40:13] ebernhardson: http://underscorejs.org/ [23:46:33] PROBLEM - check_job_queue on terbium is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 199,999 jobs: , Total (200748) [23:48:33] RECOVERY - check_job_queue on terbium is OK: JOBQUEUE OK - all job queues below 200,000 [23:51:20] YuviPanda: found it [23:51:36] AzaToth: oh? [23:52:44] YuviPanda: https://gerrit.wikimedia.org/r/118028 [23:53:03] AzaToth: gaah [23:53:16] AzaToth: can you apply the patch and test? [23:54:51] yea [23:56:02] need to wait for the parsoid dude(sse)s to commit some [23:56:40] and a commit to any repo with branch "betacluster" would be nice [23:57:39] AzaToth, the last message at :51 went only to the parsoid channel [23:57:43] did you change the code since then? [23:58:41] no [23:58:59] YuviPanda: are you restarting grrrit-wm? [23:59:11] AzaToth: no... [23:59:41] TypeError: Cannot convert null to object
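[editor's note] The closing `TypeError: Cannot convert null to object` matches what `_.has` produced when handed `null`: underscore releases of that era delegated straight to `Object.prototype.hasOwnProperty` without a null guard, so AzaToth's expectation that it "should result in false" only held for strings, not for `null`/`undefined`. A minimal plain-Node sketch of the failure mode and the guard later underscore versions adopted — this is not grrrit-wm's actual relay.js, and the `'branch'` key is illustrative:

```javascript
// Approximation of underscore ~1.6's _.has: no null guard, so the
// internal ToObject(this) inside hasOwnProperty throws on null/undefined.
var hasOwn = Object.prototype.hasOwnProperty;

function has(obj, key) {
  return hasOwn.call(obj, key); // throws TypeError when obj is null/undefined
}

// Reproduce the crash from the end of the log:
try {
  has(null, 'branch');
} catch (e) {
  console.log(e instanceof TypeError); // true
}

// The defensive form that returns false instead of throwing:
function safeHas(obj, key) {
  return obj != null && hasOwn.call(obj, key);
}

console.log(safeHas(null, 'branch'));            // false
console.log(safeHas(undefined, 'branch'));       // false
console.log(safeHas({ branch: 'x' }, 'branch')); // true
console.log(safeHas('str', 'length'));           // true (string is boxed to a String object)
```

Note the string case: `hasOwnProperty.call('str', …)` boxes the primitive rather than throwing, which is why local testing in node on strings looked fine while a `null` event payload still crashed the relay.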