[00:01:36] hrmm, all my tests from various spots hit cp1044 and no loop
[00:01:42] lemme see if 1043 is also workin
[00:02:19] aude: Do you have full access to the rt ticket or only got the mail either?
[00:02:34] I guess I'm going to reply to it to just have the stuff in the ticket itself
[00:04:04] hoo: reply to the mail
[00:19:25] For me "curl 'https://doc.wikimedia.org/VisualEditor/'" works = showing an Apache directory index.
[00:23:09] RobH: :P
[00:25:49] greg-g: ^_^
[00:25:59] be proud that you make newbs quake in fear!
[00:26:48] hah, it's just the beard
[00:45:49] (PS3) Dzahn: decom professor, add decommissioning.pp [operations/puppet] - https://gerrit.wikimedia.org/r/109884
[00:47:36] (CR) Cmjohnson: [C: 2] decom professor, add decommissioning.pp [operations/puppet] - https://gerrit.wikimedia.org/r/109884 (owner: Dzahn)
[00:48:14] (PS2) Dzahn: remove professor from DNS (decom) [operations/dns] - https://gerrit.wikimedia.org/r/109286
[00:49:23] (CR) Cmjohnson: [C: 2] remove professor from DNS (decom) [operations/dns] - https://gerrit.wikimedia.org/r/109286 (owner: Dzahn)
[01:00:32] (PS1) Cmjohnson: removing pappas from dsh/dhcpd and site.pp [operations/puppet] - https://gerrit.wikimedia.org/r/110281
[01:00:38] (Abandoned) Cmjohnson: decom "pappas" (formerly fr bastion) [operations/puppet] - https://gerrit.wikimedia.org/r/110158 (owner: Dzahn)
[01:01:48] (CR) Cmjohnson: [C: 2] removing pappas from dsh/dhcpd and site.pp [operations/puppet] - https://gerrit.wikimedia.org/r/110281 (owner: Cmjohnson)
[01:12:49] (PS1) Cmjohnson: Removing dns entries for pappas [operations/dns] - https://gerrit.wikimedia.org/r/110286
[01:14:59] (CR) Cmjohnson: [C: 2] Removing dns entries for pappas [operations/dns] - https://gerrit.wikimedia.org/r/110286 (owner: Cmjohnson)
[01:18:21] beta cluster is dead
[01:18:22] Exception from line 84 of /data/project/apache/common-local/php-master/extensions/Wikidata/extensions/Wikibase/repo/Wikibase.hooks.php: Wikibase: Incomplete configuration: $wgWBRepoSettings["entityNamespaces"] has to be set to an array mapping content model IDs to namespace IDs. See ExampleSettings.php for details and examples.
[01:19:25] aude: ^
[01:19:27] damn
[01:19:49] not again
[01:25:51] hoo: aude: filed as https://bugzilla.wikimedia.org/show_bug.cgi?id=60606 already
[01:26:10] got it
[01:26:13] MatmaRex: aude: hoo: https://bugzilla.wikimedia.org/60606
[01:26:17] making a fix
[01:26:59] Already have it fixed
[01:29:11] (PS1) Reedy: Disable and remove ContactPageFundraiser [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110292
[01:33:25] beta works again
[01:33:32] \o/
[02:02:05] !log LocalisationUpdate completed (1.23wmf11) at 2014-01-30 02:02:05+00:00
[02:02:14] Logged the message, Master
[02:02:54] !log LocalisationUpdate completed (1.23wmf10) at 2014-01-30 02:02:54+00:00
[02:03:03] Logged the message, Master
[02:05:14] MariaDB 5.5.35
[02:05:15] https://blog.mariadb.org/mariadb-5-5-35-now-available/
[02:09:15] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-30 02:09:15+00:00
[02:09:24] Logged the message, Master
[03:19:18] (PS1) Springle: repool db1042 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110307
[03:20:21] (CR) Springle: [C: 2] repool db1042 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110307 (owner: Springle)
[03:20:27] (Merged) jenkins-bot: repool db1042 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110307 (owner: Springle)
[03:21:17] !log springle synchronized wmf-config/db-eqiad.php 'repool db1042'
[03:21:26] Logged the message, Master
[03:38:10] (CR) Reedy: [C: 2] Stop making AdminSettings symlinks [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110259 (owner: Chad)
[03:38:19] (Merged) jenkins-bot: Stop making AdminSettings symlinks [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110259 (owner: Chad)
[04:05:21] (PS1) Springle: depool db1020 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110312
[04:08:55] (CR) Springle: [C: 2] depool db1020 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110312 (owner: Springle)
[04:09:02] (Merged) jenkins-bot: depool db1020 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110312 (owner: Springle)
[04:09:47] !log springle synchronized wmf-config/db-eqiad.php 'depool db1020 for schema changes'
[04:09:54] Logged the message, Master
[05:44:58] Well, this is a very basic problem… Ryan_Lane, on virt1000 "keystone endpoint-list" gets me a strange internal python keyerror
[05:45:07] as though I need to upgrade some other component or db. But everything looks up to date...
[06:49:17] (PS1) Andrew Bogott: Inform keystone about our ldap schema for roles. [operations/puppet] - https://gerrit.wikimedia.org/r/110318
[06:49:19] (PS1) Andrew Bogott: Include python-novaclient for nova/havana. [operations/puppet] - https://gerrit.wikimedia.org/r/110319
[06:51:51] (CR) Andrew Bogott: [C: 2] Inform keystone about our ldap schema for roles. [operations/puppet] - https://gerrit.wikimedia.org/r/110318 (owner: Andrew Bogott)
[06:52:04] (CR) Andrew Bogott: [C: 2] Include python-novaclient for nova/havana. [operations/puppet] - https://gerrit.wikimedia.org/r/110319 (owner: Andrew Bogott)
[07:07:32] (PS1) Andrew Bogott: Use 'keystoneclient' instead of 'keystone'. [operations/puppet] - https://gerrit.wikimedia.org/r/110320
[07:09:31] mornin
[07:10:04] (CR) Andrew Bogott: [C: 2] Use 'keystoneclient' instead of 'keystone'. [operations/puppet] - https://gerrit.wikimedia.org/r/110320 (owner: Andrew Bogott)
[08:07:55] (PS1) Matanya: emery: move rsync api job to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110325
[08:18:29] (PS1) Matanya: emery: remove api logs from emery [operations/puppet] - https://gerrit.wikimedia.org/r/110327
[08:20:29] (PS1) Matanya: emery: move glam-nara logs to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110328
[08:27:22] PROBLEM - Check status of defined EventLogging jobs on vanadium is CRITICAL: CRITICAL: Stopped EventLogging jobs: consumer/vanadium
[08:30:19] that's e
[08:31:42] I wish I knew how to fix that :-D
[08:32:18] that's "e"?
[08:32:30] *me
[08:36:22] RECOVERY - Check status of defined EventLogging jobs on vanadium is OK: OK: All defined EventLogging jobs are runnning.
[08:49:11] paravoid: mind looking at https://gerrit.wikimedia.org/r/#/c/109869/ please?
[08:58:15] paravoid: akosiaris: do we have any easy way to convert ruby gems to debian packages? :D
[08:58:22] gem2deb
[08:58:56] I am not yet sure which path we will follow: either a git repo containing all our gems dependencies, or packaging them
[09:01:00] for?
[09:01:21] context is for browsertests
[09:01:29] they are based on cucumber / ruby webdriver
[09:01:34] and have a bunch of dependencies
[09:02:12] we run them on a labs instance and run a command that installs the dependencies from rubygems
[09:02:26] hashar: packages, way better
[09:02:29] which adds a 2-3 minute overhead to the job (have to download from ruby gems repo then install them)
[09:02:50] if you wish, can package them for you
[09:03:30] matanya: thanks for your offering :]
[09:03:34] I find packaging to be painful
[09:03:49] will check with zeljkof and find out which gems we should package
[09:06:53] oh hashar would love if you can review https://gerrit.wikimedia.org/r/#/c/108289/
[09:07:00] !gitweb mediawiki/selenium
[09:07:00] https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/selenium.git
[09:07:06] * matanya lacks reviews
[09:07:35] matanya: an example of gem dependencies: https://git.wikimedia.org/blob/mediawiki%2Fselenium.git/master/mediawiki-selenium.gemspec :D
[09:07:44] mediawiki/selenium being a gem itself
[09:08:00] ohhhh
[09:08:09] thanks for the beta modularization
[09:08:41] hashar: not an issue, i can package from the one with no deps, to the last one
[09:08:46] matanya: you probably want to rebase your change already. I added a fatal monitoring script last week
[09:08:57] i rebase daily :/
[09:11:15] :(
[09:12:38] (CR) Hashar: "sounds fine. You want to rebase this change again since I have added a fatal monitoring script last week :(" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/108289 (owner: Matanya)
[09:17:59] (PS4) Matanya: beta: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/108289
[09:20:27] ok, hashar whenever you are ready, you can review the rebase, and give me a list of gems
[09:50:44] is cluster tier a specific variable for varnish, or it can be used elsewhere?
[09:51:46] * matanya wonders if ori would know
[10:31:11] (PS1) Matanya: facilities: lint [operations/puppet] - https://gerrit.wikimedia.org/r/110339
[11:03:04] akosiaris: can poolcounter be modularized? or any other plan for it?
[11:03:53] a) yes b) no, c) go ahead :-)
[11:04:06] * matanya is doing
[11:06:14] akosiaris: can I include a cron directly in site.pp?
[11:06:39] Nemo_bis: best not to do that. Unless it really really makes sense
[11:06:41] surely adding exim::roled to mchenry is excessive https://bugzilla.wikimedia.org/show_bug.cgi?id=57890#c5
[11:07:25] and privateexim::aliases::private is ... private I guess, not in puppet so I don't think the cron can/should be added there
[11:07:30] I don't know off hand. I'd have to have a look
[11:09:12] (PS1) Matanya: poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340
[11:09:39] Nemo_bis: the answer is no, don't do it
[11:10:15] read in a nice manner please
[11:10:17] then a new mail.pp class just for that?
[11:10:31] came out too harsh, sorry
[11:11:12] no Nemo_bis this : https://gerrit.wikimedia.org/r/#/c/68584/ will solve it once it is done
[11:11:53] sounds like it will take ages
[11:12:07] we need stats nowish
[11:12:14] poke andrewbogott
[11:12:46] he might know. i might take it over, if he can't but it looks nice, and i hope he can finish it
[11:16:38] sigh, now I must go
[11:16:44] too bad, I hoped to submit a patch
[11:16:45] matanya: regarding poolcounter?
[11:17:05] no, andrewbogott regarding exim module
[11:17:18] poolcounter is already pushed
[11:17:27] oh -- Yeah, I wrote a refactor once but I think mark didn't like it so… might be best to collaborate with him
[11:17:33] not that he has a spare minute this week :(
[11:17:48] ken related?
[11:18:21] well
[11:18:24] Ah, just, he's doing many things at once.
[11:18:25] Faidon has agreed to take on the mail migration
[11:18:32] so he'll probably be the point person for this new exim module
[11:18:38] correct
[11:18:49] Ah, ok then!
[11:18:55] will we have DKIM everywhere? :)
[11:19:05] yes
[11:19:16] matanya: I don't know if there's much to salvage from my patch, probably it should be abandoned.
[11:19:45] if you do that, i'll start a new one. too bad for my huge rebase on it :/
[11:20:17] and another question i had : is cluster tier a specific variable for varnish, or it can be used elsewhere?
[11:20:29] i think paravoid or mark will know to answer this
[11:20:48] grep the source :)
[11:20:51] i did
[11:20:57] where do you see it specified?
[11:21:05] cluster_tier is role/cache.pp specific indeed
[11:21:05] it is called cache.pp
[11:21:22] if it's really useful elsewhere we could think about making it a global
[11:21:53] it's hard to imagine "tiers" for other services I think
[11:22:35] all around modules/protoproxy/templates/proxy.erb
[11:22:52] 12 times.
[11:23:25] tier wouldn't help you much there, since you need to reference the backend IPs anyway
[11:23:26] so either i can replace it with a variable from the class it is called, or with some higher scope var
[11:23:37] but in any case, the protoproxy stuff is eventually going away
[11:23:51] this part anyway
[11:23:53] edge case anyway. would be replaced by what?
[11:23:58] with the localssl work
[11:24:14] local ssl? where is that?
[11:24:18] same module
[11:24:22] but different template
[11:24:25] oh
[11:24:31] (and manifest)
[11:24:43] so i'll ignore this puppet3 issue
[11:24:56] no, let's not please tie these two migrations together
[11:25:10] the syntax fixes are easy enough to do them now
[11:25:34] hmm, it is, but it involves design issues
[11:25:57] i.e. what we spoke some days ago
[11:26:04] remind me, I forgot
[11:26:25] openstack version iirc
[11:27:25] andrewbogott: solved it at the end, but it wasn't clear at first
[11:30:17] out for 15 min
[11:37:51] (CR) Alexandros Kosiaris: [C: -1] poolcounter: convert into a module (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/110340 (owner: Matanya)
[12:07:05] (PS2) Matanya: poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340
[12:07:42] (CR) jenkins-bot: [V: -1] poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340 (owner: Matanya)
[12:08:44] (PS3) Matanya: poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340
[12:16:10] hashar: i would love to understand why manifests/zuul.pp glues modules/zuul and manifests/roles/zuul.pp
[12:20:54] matanya: ahhh
[12:21:05] matanya: so Zuul module is supposed to be independent / wikimedia agnostic
[12:21:18] matanya: Zuul role just set system_role and basic stuff
[12:21:31] matanya: manifests/zuul.pp carry wikimedia specific configuration
[12:21:45] so why not put that into the role?
[12:23:07] matanya: done that on my birthday 2 years
[12:23:08] ago moved files definitions out of role::zuul to a new zuulwikimedia manifests in manifests/zuul.pp
[12:23:09] zuulwikimedia calls the module class and then deploy Wikimedia specific configuration.
[12:23:12] on https://gerrit.wikimedia.org/r/#/c/27611/
[12:24:48] and where is it now?
[12:25:00] still the same layout
[12:25:18] feel free to drop zuulwikimedia and put its content in the role class
[12:25:24] might want to check with faidon first though
[12:25:29] planning to change it yourself?
[12:25:36] nop
[12:25:41] if it works, don't break it! :-]
[12:25:50] i'd love some input from paravoid
[12:26:03] puppet-glue isn't very friendly :)
[12:28:52] lunch needed
[13:03:14] (CR) Alexandros Kosiaris: [C: -1] poolcounter: convert into a module (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/110340 (owner: Matanya)
[13:07:25] (PS4) Matanya: poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340
[13:31:35] (PS1) Andrew Bogott: Specify compute_driver as that seems required in havana. [operations/puppet] - https://gerrit.wikimedia.org/r/110365
[13:34:35] (CR) Andrew Bogott: [C: 2] Specify compute_driver as that seems required in havana. [operations/puppet] - https://gerrit.wikimedia.org/r/110365 (owner: Andrew Bogott)
[13:38:02] PROBLEM - DPKG on virt1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[13:39:02] RECOVERY - DPKG on virt1001 is OK: All packages OK
[13:48:32] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:54:22] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 178565 bytes in 6.376 second response time
[13:59:21] (PS1) Matanya: certs: lint [operations/puppet] - https://gerrit.wikimedia.org/r/110366
[14:11:26] (PS5) Alexandros Kosiaris: poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340 (owner: Matanya)
[14:13:15] matanya: ^^ that is what i meant
[14:13:34] yeah, i see. thanks. sorry for extra work
[14:15:36] i'll remove those too next time :)
[14:18:35] (CR) Alexandros Kosiaris: [C: 2] poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340 (owner: Matanya)
[14:18:54] (CR) Alexandros Kosiaris: [C: 2] puppet-merge: Fix bug introduced in 4a79aa1 [operations/puppet] - https://gerrit.wikimedia.org/r/110203 (owner: Alexandros Kosiaris)
[14:19:25] (CR) Alexandros Kosiaris: [V: 2] poolcounter: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/110340 (owner: Matanya)
[14:22:15] (PS12) Matanya: site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507
[14:22:25] (CR) jenkins-bot: [V: -1] site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507 (owner: Matanya)
[14:26:19] (CR) Alexandros Kosiaris: "Both approaches failed to take merges into account. In that case another commiter, namely gerrit, is also present in the git log output. W" [operations/puppet] - https://gerrit.wikimedia.org/r/110203 (owner: Alexandros Kosiaris)
[14:27:50] (PS13) Matanya: site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507
[14:29:47] who will be at fosdem this weekend?
[14:30:52] lots of people
[14:30:58] (CR) Guido.iaquinti: [C: 1] add FIXMEs for erzurumi references [operations/puppet] - https://gerrit.wikimedia.org/r/109655 (owner: Dzahn)
[14:31:07] what is it that you're interested in specifically?
[14:31:48] * AaronSchulz likes how dark it is outside
[14:31:57] if someone of you will be at fosdem, just to meet and have a beer :)
[14:32:10] i already know that lots of people will be there :D
[14:32:34] we'll have a stand this year
[14:32:47] plus Erik is giving a talk about the "Wikipedia stack"
[14:33:32] paravoid: i know, but who
[14:33:38] I think the best bet would be to hang out at our stand, I'm sure lots of us will be going back and forth there :)
[14:33:51] ok
[14:34:01] i'm sure paravoid doesn't have a name list
[14:34:05] I actually do
[14:34:06] https://www.mediawiki.org/wiki/Events/FOSDEM/2014
[14:34:07] :)
[14:34:17] always surprising paravoid :)
[14:37:12] PROBLEM - Host virt1001 is DOWN: PING CRITICAL - Packet loss = 100%
[14:37:34] oh?
[14:38:10] volunteer to kick the machine?
[14:39:19] that's me
[14:39:40] andrewbogott: around? want to draft an email about openstack support and need to know what exactly we are looking for
[14:41:00] drdee: I'll be awake for a bit. It's hard to know exactly what we'll need, other than 'troubleshooting help'
[14:41:16] But I can describe the setup
[14:42:22] RECOVERY - Host virt1001 is UP: PING OK - Packet loss = 0%, RTA = 0.65 ms
[14:44:22] PROBLEM - RAID on virt1001 is CRITICAL: Connection refused by host
[14:44:30] that's still me!
[14:44:42] PROBLEM - Disk space on virt1001 is CRITICAL: Connection refused by host
[14:45:02] PROBLEM - SSH on virt1001 is CRITICAL: Connection refused
[14:45:02] PROBLEM - puppet disabled on virt1001 is CRITICAL: Connection refused by host
[14:45:02] PROBLEM - DPKG on virt1001 is CRITICAL: Connection refused by host
[14:52:15] mutante: mind if I add your instructions to create jgonera to the wikitech RT page?
[14:53:40] (PS1) Aude: add data license settings for wikibase [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110368
[14:53:42] (PS1) Aude: enable wikidata build for test.wikidata and test2 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110369
[14:54:13] (CR) Ottomata: "Change the rsync job to point at erbium now too, ja?" [operations/puppet] - https://gerrit.wikimedia.org/r/110327 (owner: Matanya)
[14:54:49] (PS2) Matanya: emery: move glam-nara logs to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110328
[14:54:56] (CR) Ottomata: [C: 2 V: 2] emery: move glam-nara logs to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110328 (owner: Matanya)
[14:56:12] (CR) Matanya: "done in https://gerrit.wikimedia.org/r/#/c/110325/" [operations/puppet] - https://gerrit.wikimedia.org/r/110327 (owner: Matanya)
[14:56:38] oh matanya, didn't see that
[14:56:39] cool
[14:56:42] PROBLEM - NTP on virt1001 is CRITICAL: NTP CRITICAL: No response from NTP server
[14:56:58] (PS2) Matanya: emery: move rsync api job to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110325
[14:57:00] ottomata: told you i'm splitting it up for easier review :)
[14:57:03] (CR) Ottomata: [C: 2 V: 2] emery: move rsync api job to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110325 (owner: Matanya)
[14:59:26] Heja ops. Who would I ping for mailing list issues? See https://bugzilla.wikimedia.org/show_bug.cgi?id=60215
[15:00:11] * matanya suggests rt ticket
[15:00:35] (PS2) Aude: enable wikidata build for test.wikidata and test2 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110369
[15:03:21] (PS2) Ottomata: emery: remove api logs from emery [operations/puppet] - https://gerrit.wikimedia.org/r/110327 (owner: Matanya)
[15:03:36] (CR) Ottomata: [C: 2 V: 2] emery: remove api logs from emery [operations/puppet] - https://gerrit.wikimedia.org/r/110327 (owner: Matanya)
[15:04:42] ottomata: the left services on emery should also move to erbium, correct?
[15:09:17] (PS1) Aude: update wikibase cronjobs to use wikidata build [DNM yet] [operations/puppet] - https://gerrit.wikimedia.org/r/110371
[15:10:09] (PS2) Aude: update wikibase cronjobs to use wikidata build [DNM yet] [operations/puppet] - https://gerrit.wikimedia.org/r/110371
[15:10:52] if i want to setup new cron jobs (but not enable yet), can i list them in puppet as "absent"
[15:11:06] then change to "enabled" later
[15:11:14] does that work?
[15:11:14] no aude that won't do that
[15:11:18] aww
[15:11:22] then how?
[15:11:24] absent removes them
[15:11:28] yes
[15:11:40] i could comment them out
[15:11:46] you should use comment out
[15:11:49] ok
[15:12:28] then once need to use, remove the remark, and puppet will install them
[15:13:27] matanya, yeah, to erbium should be good
[15:13:33] i was just checking out load and stuff, i think we can handle it
[15:13:41] one patch, or a few?
[15:13:59] one is fine for the rest, but, for now, just leave the sampled-1000 filter on emery and add it to erbium
[15:14:12] we'll just turn off the sampled-1000 filter when we decom emery
[15:14:17] it's kinda nice to have it as a little backup
[15:14:25] (PS3) Aude: update wikibase cronjobs to use wikidata build [DNM yet] [operations/puppet] - https://gerrit.wikimedia.org/r/110371
[15:14:44] it will remain on gadolinium only ottomata ?
[15:14:44] like that ^ ?
[15:14:55] PROBLEM - Host virt1001 is DOWN: PING CRITICAL - Packet loss = 100%
[15:15:03] ?
[15:15:05] we want to stop the old jobs when we switch wikidata from wmf11 to wmf12
[15:15:13] go ahead and add it to erbium config now
[15:15:14] then after switched, enable new cron jobs
[15:15:22] but don't remove it from emery config yet
[15:15:36] i'm going to go in and be really careful about those logs when the rsync job changes
[15:15:45] to either make sure there is no overlap in the archived logs
[15:15:52] or at least note that there is
[15:15:59] ok
[15:16:22] aude: yes, but it hurts my eyes (where is lint?)
[15:16:32] what's wrong?
[15:16:49] i see a tab
[15:16:50] you want a proper review?
[15:16:56] yes
[15:17:01] ok
[15:18:05] RECOVERY - SSH on virt1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[15:18:15] RECOVERY - Host virt1001 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms
[15:20:43] (CR) Matanya: update wikibase cronjobs to use wikidata build [DNM yet] (10 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/110371 (owner: Aude)
[15:21:15] thanks :)
[15:21:18] aude: are you aware of the style guide?
[15:21:20] (CR) Jgreen: "All these references to erzurumi are long-deprecated--no need to replace them. The box has been simply a standby server for the past year." [operations/puppet] - https://gerrit.wikimedia.org/r/109655 (owner: Dzahn)
[15:21:26] no
[15:21:36] * aude just copying code
[15:21:47] sure it's poorly styled though
[15:21:47] aude: docs.puppetlabs.com/guides/style_guide.html
[15:21:53] thanks
[15:21:58] please follow :)
[15:22:11] i feel like the lint-nazi :P
[15:23:10] aude: two exceptions from the guide:
[15:23:32] 1) we use 4 spaces instead of 2 in this guide
[15:23:53] 2) ignore Inheritance at all cost :)
[15:24:42] aude: and to nudge a bit more, are you aware of the puppet conventions on wikitech?
[15:27:27] not sure i understand "$enabled ?{ "
[15:27:52] can i just put 'present' or 'absent' ?
[15:28:21] but we have "class misc::maintenance::wikidata( $enabled = false ) { "
[15:28:23] http://puppet-lint.com/checks/selector_inside_resource/
[15:28:29] ok
[15:29:07] aude: cron ensure => can get present or absent only
[15:30:11] http://docs.puppetlabs.com/references/latest/type.html#cron
[15:30:29] (CR) Ori.livneh: [C: -1] "What's up with the removal of monitor_ganglia? It's not mentioned in the commit message, and it's used in several modules." [operations/puppet] - https://gerrit.wikimedia.org/r/107819 (owner: Matanya)
[15:30:57] (PS4) Aude: update wikibase cronjobs to use wikidata build [DNM yet] [operations/puppet] - https://gerrit.wikimedia.org/r/110371
[15:31:41] what do you mean by "to be on the safe side remark those lines too."?
[15:31:44] add more comments?
[15:32:55] PROBLEM - Varnish HTCP daemon on cp1053 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:32:55] PROBLEM - Varnish HTTP text-backend on cp1053 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:33:15] PROBLEM - Varnish traffic logger on cp1053 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:33:44] yes aude
[15:34:00] (PS3) Matanya: nagios: puppet 3 compatibility fix: fully qualify variables [operations/puppet] - https://gerrit.wikimedia.org/r/107819
[15:34:06] (PS5) Aude: update wikibase cronjobs to use wikidata build [DNM yet] [operations/puppet] - https://gerrit.wikimedia.org/r/110371
[15:34:40] ok
[15:35:03] (CR) Matanya: "I some how did something wrong here. Not sure what. Thanks for pointing it out ori." [operations/puppet] - https://gerrit.wikimedia.org/r/107819 (owner: Matanya)
[15:35:12] (CR) Ori.livneh: [C: -1] "Rather than provision each file in /var/lib/gerrit2 as a separate resource, could you simply reproduce the hierarchy in a modules/gerrit/f" (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/109088 (owner: Matanya)
[15:35:28] cmjohnson1: If you have a DC visit planned today, go ahead and drop the 10G nic into labnet1001. I'll be sleeping during most of the DC day anyway so won't notice the downtime :)
[15:36:23] matanya: you added trailing whitespace on line 366
[15:36:33] of nagios.pp i mean
[15:36:35] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:37:08] (PS6) Aude: update wikibase cronjobs to use wikidata build [DNM yet] [operations/puppet] - https://gerrit.wikimedia.org/r/110371
[15:37:45] RECOVERY - Disk space on virt1001 is OK: DISK OK
[15:37:55] RECOVERY - puppet disabled on virt1001 is OK: OK
[15:38:05] RECOVERY - DPKG on virt1001 is OK: All packages OK
[15:38:25] RECOVERY - RAID on virt1001 is OK: OK: Active: 16, Working: 16, Failed: 0, Spare: 0
[15:38:35] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 180599 bytes in 9.620 second response time
[15:39:33] (PS4) Matanya: nagios: puppet 3 compatibility fix: fully qualify variables [operations/puppet] - https://gerrit.wikimedia.org/r/107819
[15:39:35] RECOVERY - NTP on virt1001 is OK: NTP OK: Offset -0.01575934887 secs
[15:42:32] paravoid, while Snaps is here, got a sec to opine about yajl1 vs yajl2 packages for kafkatee?
[15:43:19] (CR) Matanya: "There is no module yet. i fail to understand what you are referring to." [operations/puppet] - https://gerrit.wikimedia.org/r/109088 (owner: Matanya)
[15:45:49] (CR) Ori.livneh: [C: 2] nagios: puppet 3 compatibility fix: fully qualify variables [operations/puppet] - https://gerrit.wikimedia.org/r/107819 (owner: Matanya)
[15:48:35] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:49:25] (PS1) Matanya: emery: RT #6143 move two logs to erbium [operations/puppet] - https://gerrit.wikimedia.org/r/110382
[15:51:05] RECOVERY - Varnish traffic logger on cp1053 is OK: PROCS OK: 2 processes with command name varnishncsa
[15:51:05] PROBLEM - DPKG on virt1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[15:51:35] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 180469 bytes in 9.643 second response time
[15:51:45] RECOVERY - Varnish HTCP daemon on cp1053 is OK: PROCS OK: 1 process with UID = 111 (vhtcpd), args vhtcpd
[15:51:45] RECOVERY - Varnish HTTP text-backend on cp1053 is OK: HTTP OK: HTTP/1.1 200 OK - 189 bytes in 0.001 second response time
[15:52:05] RECOVERY - DPKG on virt1001 is OK: All packages OK
[15:53:37] (PS1) Aude: bump $wgCacheEpoch for wikidatawiki and test wikidata [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110384
[15:53:44] (PS2) Matanya: gerrit: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109088
[15:56:29] (CR) Mark Bergsma: [C: 1] Allow PUT method to hosts behind the misc Varnish cluster [operations/puppet] - https://gerrit.wikimedia.org/r/109330 (owner: BryanDavis)
[15:56:38] ori: go ahead
[15:56:41] needs a rebase though, looks like
[15:57:18] mark: awesome, thanks. i'll flag it if the rebase is in any way problematic.
[15:57:39] (PS4) BryanDavis: Allow PUT method to hosts behind the misc Varnish cluster [operations/puppet] - https://gerrit.wikimedia.org/r/109330
[16:00:03] (CR) Ori.livneh: [C: 2] "per mark's ok" [operations/puppet] - https://gerrit.wikimedia.org/r/109330 (owner: BryanDavis)
[16:16:49] ori, mark: Thanks! One step closer to logstash world domination.
[16:17:09] bd808:
[16:17:15] so verified works :)
[16:17:56] ori: is logstash planned to replace udp2log or to be on top of it?
[16:18:06] andrewbogott: sounds good I will take care of it..thx
[16:18:23] cmjohnson1: cool, thanks!
[16:19:08] matanya: Good question. For the foreseeable I think it's an addition rather than a replacement
[16:19:12] matanya: replace. though the intent is to preserve file-based logging as well for grepping
[16:20:00] * bd808 sees that ori has more complete plans :)
[16:20:05] thank you both
[16:20:21] <^d> bd808: This plan is incomplete. No plan is complete without an evil-plans.txt
[16:20:36] <^d> :)
[16:20:50] bd808: i got the redirect loop again btw a couple of minutes ago
[16:20:59] git add evil-plans.txt
[16:21:10] matanya: logstash is kinda different than udp2log
[16:21:15] (PS1) Manybubbles: Update Elasticsearch monitoring for 0.90.10 [operations/puppet] - https://gerrit.wikimedia.org/r/110389
[16:21:20] yeah, i know ottomata
[16:21:24] logstash index logs for fancy viewing and stats
[16:21:30] udp2log is a transport mechanism
[16:21:34] kafka is more the replacement for udp2log :p
[16:21:46] have log stash at my $day_job
[16:21:55] aye cool
[16:22:09] ori: Yeah. It's a varnish issue caused by the way that we send the http -> https redirect. Not sure the best way to fix it yet.
[16:22:18] build it from scratch (quite proud of it in fact)
[16:22:26] :)
[16:22:36] bd808: what's the issue, and why doesn't it affect other SSL services in misc-eqiad?
[16:23:01] ori: https://bugzilla.wikimedia.org/show_bug.cgi?id=60488
[16:23:26] Varnish is caching the 301 redirect without respecting X-Forwarded-Proto
[16:25:01] So when / is requested via http and the cache is empty the 301 to https is cached and served to both http and https requests until an Auth header is added that breaks the cache
[16:26:23] I'm not sure that a similar problem wouldn't affect other vhosts behind the misc cluster if they had the same Apache behavior of redirecting to https as canonical
[16:28:01] Adding X-Forwarded-Proto to the vcl_hash should fix it. I'm not sure if it could also be fixed by adding X-Forwarded-Proto to a Vary header. My Varnish fu is mostly search/grep based at this point.
[16:28:55] Maybe mark and/or paravoid could read https://bugzilla.wikimedia.org/show_bug.cgi?id=60488 and tell us the right way to fix this cache issue
[16:29:28] Vary: X-Forwarded-Proto
[16:29:38] when emitting the 301
[16:29:57] hm, but the 301 is being generated from apache, right?
[16:30:02] bd808: https://gerrit.wikimedia.org/r/#/c/23521/
[16:30:08] so apache can't do that
[16:30:25] so, we have a VCL hack
[16:30:36] paravoid: Yeah. 301 is done by mod_rewrite at the moment
[16:30:52] see text-backend.inc.vcl.erb
[16:30:53] paravoid: custom error code?
[16:30:57] that resolves to a 302?
[16:31:00] grep "FIXME: Fix up missing Vary headers on Apache redirects"
[16:31:04] no, just adding the Vary
[16:31:23] so you wil get 301?
[16:31:26] *will
[16:31:49] no, so that you get a 301 if you came via http, or the actual page (and not a redirect loop) if you came via https
[16:32:05] yeah, that is what i meant
[16:32:08] (CR) Chad: [C: 1] Update Elasticsearch monitoring for 0.90.10 [operations/puppet] - https://gerrit.wikimedia.org/r/110389 (owner: Manybubbles)
[16:32:17] * matanya must be clearer
[16:33:29] paravoid: Should that logic be hoisted into modules/varnish/templates/vcl/wikimedia.vcl.erb to apply everywhere or would that be gross?
[16:36:05] well, you have: [16:36:11] RewriteCond %{HTTP:X-Forwarded-Proto} !https [16:36:11] RewriteCond %{REQUEST_URI} !^/status$ [16:36:11] RewriteRule ^/(.*)$ https://<%= @hostname %>%{REQUEST_URI} [R=301,L] [16:36:12] akosiaris: ping [16:36:33] and per , [16:36:47] If a HTTP header is used in a condition this header is added to the Vary header of the response in case the condition evaluates to to true for the request. It is not added if the condition evaluates to false for the request. Adding the HTTP header to the Vary header of the response is needed for proper caching. [16:37:29] we've looked at the apache source code after failing to construct proper redirects a few times [16:37:36] it has explicit code that strips all headers when emitting redirects [16:37:39] incl. Vary [16:37:48] it's silly really [16:39:41] There is a pretty good description of the problem at http://stackoverflow.com/a/3711110/8171 which matches with Faidon's statement [16:42:00] what about the Header onsuccess merge Vary "Accept-Language" suggestion? [16:42:14] well, s/Accept-Language/X-Forwarded-Proto [16:44:33] ori: I'll try that on a test server [16:49:56] ori: No joy. The Vary that comes out is still only Accept-Encoding [16:50:47] jgage, you around? [16:51:14] bd808: gimme 5 mins [16:55:58] (03PS1) 10BryanDavis: varnish: Add X-Forwarded-Proto to Vary on redirects [operations/puppet] - 10https://gerrit.wikimedia.org/r/110393 [16:57:32] bd808: wait [16:57:54] ori: Sure. It's just a proposed patch at this point [16:58:05] * bd808 is not a root :) [16:59:51] (03PS1) 10Matanya: emery: remove last log before decom [operations/puppet] - 10https://gerrit.wikimedia.org/r/110394 [17:15:14] (03CR) 10Chad: [C: 032] Revert "New extra language for wikidata: Ottoman Turkish (ota)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110182 (owner: 10Aude) [17:25:38] (03CR) 10Chad: [V: 032] "I don't have time to wait on your shenanigans Jenkins. Merging myself." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110182 (owner: 10Aude) [17:26:40] !log demon synchronized wmf-config/InitialiseSettings.php 'Id0dea6e4: Revert "New extra language for wikidata: Ottoman Turkish (ota)"' [17:26:48] Logged the message, Master [17:26:55] <^d> aude: {{done}} [17:27:12] thanks [17:27:18] <^d> yw [17:28:20] (03PS1) 10Ori.livneh: kibana: Set Vary: X-Forwarded-Proto on HTTP -> HTTPS 301s [operations/puppet] - 10https://gerrit.wikimedia.org/r/110396 [17:28:33] bd808: if that works, are you happy with it? because: it works. [17:29:57] ^d: Reedy https://www.mediawiki.org/wiki/Wikidata_deployment#Deployment_notes (list of things needed before switching test wikidata to wmf12) [17:32:21] ori: I'll test it too. 5 minutes [17:34:39] bd808: applied locally on logstash1001. do: curl -I logstash1001 -H 'Host: logstash.wikimedia.org' -H 'X-Forwarded-Proto: http' [17:36:01] (03CR) 10BryanDavis: [C: 031] "Maybe add Bug: 60488 to the commit message." [operations/puppet] - 10https://gerrit.wikimedia.org/r/110396 (owner: 10Ori.livneh) [17:36:20] ori: Works on my test and yours. Good find! [17:36:22] (03PS2) 10Ori.livneh: kibana: Set Vary: X-Forwarded-Proto on HTTP -> HTTPS 301s [operations/puppet] - 10https://gerrit.wikimedia.org/r/110396 [17:36:50] (03CR) 10Ori.livneh: [C: 032 V: 032] kibana: Set Vary: X-Forwarded-Proto on HTTP -> HTTPS 301s [operations/puppet] - 10https://gerrit.wikimedia.org/r/110396 (owner: 10Ori.livneh) [17:37:02] (03Abandoned) 10BryanDavis: varnish: Add X-Forwarded-Proto to Vary on redirects [operations/puppet] - 10https://gerrit.wikimedia.org/r/110393 (owner: 10BryanDavis) [17:39:09] bd808: applied patch and reloaded logstash* apaches [17:40:05] paravoid: Coren merged https://gerrit.wikimedia.org/r/102617 (operations/debs/vips) on Monday. Does it automatically bubble up to the WMF repo, or do I have to ask someone to do that? (Who preferably?) 
[17:41:17] !log labnet1001 down for card install [17:41:25] Logged the message, Master [17:42:03] scfc_de: no it does not. usually the person who merged it, i.e. Coren [17:42:45] paravoid: Okay, then I'll ask him after FOSDEM. [17:42:52] ori: I can't recreate the loop and headers from Varnish look correct. [17:42:59] PROBLEM - Host labnet1001 is DOWN: PING CRITICAL - Packet loss = 100% [17:51:13] !log reedy updated /a/common to {{Gerrit|Id0633e5a0}}: depool db1020 for schema changes [17:51:21] Logged the message, Master [17:55:15] (03PS1) 10Reedy: Set $wgMathTexvcCheckExecutable to null. Not all apaches have executable [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110400 [17:56:09] Hmmm. That's not going to help [17:59:46] Oh, someone merged it [18:00:48] ori: if you've tested your workaround and can spare 5', it'd be nice to apply it to prod's apache config too [18:03:01] I read that as 5 feet [18:04:44] (03PS2) 10Manybubbles: Update Elasticsearch monitoring for 0.90.10 [operations/puppet] - 10https://gerrit.wikimedia.org/r/110389 [18:04:51] (03CR) 10Ottomata: [C: 032 V: 032] Update Elasticsearch monitoring for 0.90.10 [operations/puppet] - 10https://gerrit.wikimedia.org/r/110389 (owner: 10Manybubbles) [18:05:21] hey ori [18:05:33] i think the puppet-merge change is showing the committer in the output [18:05:36] instead of the author [18:05:40] i think it should show the author [18:05:56] its showing my name on changes that i've only rebased [18:08:03] <^d> Show both :) [18:08:37] <^d> fwiw, the author can be forged in gerrit. [18:08:53] <^d> by design [18:09:04] <^d> Committer is more reliable for "who did something" [18:10:27] (03PS1) 10Reedy: Need to rsync texvccheck files to tmp too [operations/puppet] - 10https://gerrit.wikimedia.org/r/110402 [18:11:03] Hey, can someone merge and pull the above onto whatever the puppetmaster is now please? 
[18:14:35] Reedy: i can for ya [18:15:04] (03CR) 10RobH: [C: 032] Need to rsync texvccheck files to tmp too [operations/puppet] - 10https://gerrit.wikimedia.org/r/110402 (owner: 10Reedy) [18:15:13] bleh, still verifying. [18:15:49] hrmm, i see no jobs running in zuul for it [18:15:58] oh, wait, there it is, nm [18:16:17] lots of queued items. [18:16:39] RobH: when you have time i'd love if you review my cert lint patch [18:17:51] Thanks [18:18:02] matanya: is it just formatting changes or other as well? [18:18:04] RobH: jenkins is a bit stuck right now [18:18:14] lint only RobH [18:18:25] RobH: no one around to fix it [18:18:25] (03PS1) 10Reedy: Add symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110403 [18:18:27] (03PS1) 10Reedy: Fix switchAllMediaWikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110404 [18:18:29] k, thats what it looked like but wanted to ask [18:18:29] (03PS1) 10Reedy: All Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110405 [18:18:40] (03CR) 10Reedy: [C: 032 V: 032] Add symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110403 (owner: 10Reedy) [18:19:01] (03CR) 10Reedy: [C: 032 V: 032] Fix switchAllMediaWikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110404 (owner: 10Reedy) [18:20:52] MatmaRex: has anyone looked into it? [18:21:04] cuz i see all kinds of empty build slots [18:21:42] though im not sure how the backend works to distribute jobs to those slots. [18:21:46] RobH: ^d said he doesn't know what's broken [18:21:50] hashar and krinkle are not here [18:21:53] No Krinkle|detached yet [18:21:55] heh [18:22:07] aren't they in a similar timezone? [18:22:10] if i were you i'd restart everything. 
:P [18:22:13] we're going to have to have one of them move ;] [18:22:23] Krinkle|detached is still in SF [18:22:25] haha [18:22:38] i fear restarting the service that is limping along now [18:22:44] versus taking it offline completely [18:23:05] but can give it a shot if we cannot find them later [18:23:20] <^d> It's not jenkins itself. [18:23:23] <^d> it's qunit, I think. [18:25:38] ^d: think i should kick the server? (seems drastic) [18:25:49] <^d> Yeah. [18:25:51] kick = reboot in rob terminology [18:25:57] <^d> They did something quick and easy the other day. [18:26:03] <^d> I wish I knew what it was. [18:26:11] ok, going to try rebooting it i suppose [18:26:25] before i do, lemme pull out both their cell numbers in case i break it more. [18:26:28] <^d> Overkill most likely :\ [18:26:49] well, can try restarting serivces first [18:26:51] see if that does it [18:27:23] <^d> Lemme look. [18:27:32] k, lemme kno w if you need me to do somethin [18:28:37] jenkins stuck? [18:29:17] <^d> something's stuck. [18:29:22] seems to be not filling all build/test slots [18:29:25] <^d> https://integration.wikimedia.org/zuul/ shows tons of stuff in the queue. [18:29:27] and stuff is just queuing up [18:30:13] <^d> I wonder if it's that slow pywikibot-core-test. [18:31:01] <^d> There we go! [18:31:05] <^d> Damn that test sucks. [18:31:11] <^d> ~6 hours when it succeeds. [18:31:34] christ [18:31:40] well, things seem to be going now [18:31:41] <^d> !log jenkins backed up due to pywikibot-core-test job stuck again [18:31:50] Logged the message, Master [18:32:00] lookitgo! [18:32:01] <^d> !log jenkins: killed job, queue seems to be trying to catch up now [18:32:09] Logged the message, Master [18:32:20] things are actually moving now [18:33:07] was that holding up the train? [18:33:20] <^d> That pywikibot-core-test job. 
[18:33:35] lots of empty test slots for jenkins it seems [18:33:39] <^d> https://integration.wikimedia.org/ci/job/pywikibot-core-tests/432/ [18:33:40] but maybe they aren't all pooled? [18:33:54] <^d> Not all of them can run in parallel yet I think. [18:34:13] well, before gallium had one of the slots filled [18:34:17] now its most of them [18:34:21] so its better =] [18:36:06] thanks ^d [18:36:17] <^d> yw [18:39:20] ottomata: thanks for merging the monitoring fix. now I can see gcs again! [18:40:22] (03PS2) 10Aude: add data license settings for wikibase [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110368 [18:40:27] (03CR) 10Reedy: [C: 032] add data license settings for wikibase [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110368 (owner: 10Aude) [18:41:37] yup! [18:42:08] !log reedy synchronized php-1.23wmf12 'staging' [18:42:16] Logged the message, Master [18:42:51] !log reedy synchronized docroot and w [18:42:59] Logged the message, Master [18:43:10] c'mon jenkins, meerrrrggeeeee [18:43:32] :) [18:43:41] Reedy: your change is merged live now [18:43:56] thanks :D [18:44:00] welcome [18:44:32] ooo [18:44:39] Gerrit is so colourful now [18:46:46] !log reedy started scap: testwiki to 1.23wmf12 and build l10n cache [18:46:54] Logged the message, Master [18:52:33] (03PS3) 10Aude: enable wikidata build for test.wikidata and test2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110369 [18:52:59] (03CR) 10Reedy: [V: 032] add data license settings for wikibase [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110368 (owner: 10Aude) [18:57:31] <^d> !log jenkins: aborted another one of those pywiki jobs, was starting to back things up again. this job is broken methinks. [18:57:39] Logged the message, Master [18:59:13] <^d> why the hell are we not using a fraction of our queues? 
[19:02:35] !log reedy finished scap: testwiki to 1.23wmf12 and build l10n cache (duration: 18m 45s) [19:02:43] Logged the message, Master [19:02:46] that seems faster [19:02:58] scap? yeah [19:03:05] I thought it was taking an age [19:03:10] Just me being impatient [19:04:08] (03PS2) 10Reedy: All Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110405 [19:04:13] (03CR) 10Reedy: [C: 032] All Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110405 (owner: 10Reedy) [19:04:20] Wonder if jenkins is going to play ball [19:04:24] Special:Version isn't showing the hash any more [19:04:45] (03Merged) 10jenkins-bot: All Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110405 (owner: 10Reedy) [19:04:53] Where? [19:05:03] It has a habit of going back and forth depending on .git sync... [19:05:33] https://test2.wikipedia.org/wiki/Special:Version [19:05:56] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: all wikipedias to 1.23wmf11. testwiki back to 1.23wmf10 [19:06:03] Logged the message, Master [19:07:38] Reedy: huh? aren't we going to 12? [19:07:52] Nope [19:07:55] testwiki to wmf12? [19:08:01] Oh, yeah [19:08:24] !log testwiki back to 1.23wmf11 even [19:08:32] Logged the message, Master [19:09:09] so are the group0s on wmf12 or 11? [19:09:13] I'm confused now :( [19:09:30] I had just started rebuilding cirrus indexes after the update [19:11:06] (03CR) 10Ottomata: "Hm, the sampled-1000 should be put on erbium, as well as being left on emery. You can/should leave the rsync job as well, just pointed at" [operations/puppet] - 10https://gerrit.wikimedia.org/r/110394 (owner: 10Matanya) [19:12:15] (03Abandoned) 10Reedy: Set $wgMathTexvcCheckExecutable to null. 
Not all apaches have executable [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110400 (owner: 10Reedy) [19:12:30] (03PS4) 10Aude: enable wikidata build for test.wikidata and test2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110369 [19:12:35] (03CR) 10Reedy: [C: 032] enable wikidata build for test.wikidata and test2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110369 (owner: 10Aude) [19:12:42] (03Merged) 10jenkins-bot: enable wikidata build for test.wikidata and test2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110369 (owner: 10Aude) [19:15:42] (03PS2) 10Aude: bump $wgCacheEpoch for wikidatawiki and test wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110384 [19:15:45] (03CR) 10Reedy: [C: 032] bump $wgCacheEpoch for wikidatawiki and test wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110384 (owner: 10Aude) [19:15:54] (03Merged) 10jenkins-bot: bump $wgCacheEpoch for wikidatawiki and test wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110384 (owner: 10Aude) [19:18:06] (03PS2) 10Reedy: Update php symlink to php-1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109941 [19:18:10] (03CR) 10Reedy: [C: 032] Update php symlink to php-1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109941 (owner: 10Reedy) [19:18:18] (03Merged) 10jenkins-bot: Update php symlink to php-1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109941 (owner: 10Reedy) [19:19:00] !log reedy updated /a/common to {{Gerrit|Ie5c11239f}}: Update php symlink to php-1.23wmf11 [19:19:05] (03PS1) 10Reedy: group0 wikis to 1.23wmf12 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110422 [19:19:08] Logged the message, Master [19:19:58] (03CR) 10Reedy: [C: 032] group0 wikis to 1.23wmf12 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110422 (owner: 
10Reedy) [19:20:00] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf12 [19:20:04] (03Merged) 10jenkins-bot: group0 wikis to 1.23wmf12 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110422 (owner: 10Reedy) [19:20:08] Logged the message, Master [19:20:44] !log reedy synchronized wmf-config/ [19:20:53] Logged the message, Master [19:21:19] nothing exploded [19:22:19] yet! [19:23:29] :) [19:23:33] 3 Fatal error: Call to a member function userCan() on a non-object in /usr/local/apache/common-local/php-1.23wmf12/includes/api/ApiQueryRevisions.php on line 150 [19:23:37] Noting of yours at least ;) [19:23:55] looks like we have teh usual js caching issues [19:24:11] now seems to work [19:27:08] Reedy: what is running on wmf12 now? [19:27:40] test, test2, testwikidatawiki and mediawikiwiki [19:27:56] thank you [19:29:07] !log rebuilding search index for test2wiki and checking that everything is sane [19:29:16] Logged the message, Master [19:29:29] (03PS1) 10Reedy: Remove both temp dirs [operations/puppet] - 10https://gerrit.wikimedia.org/r/110426 [19:29:55] no fires there [19:30:15] test wikidata seems good [19:30:19] and test2 [19:31:34] !log rebuilding search index on test2wiki went perfectly. proceeding with test, testwikidatawiki, and mediawikiwiki [19:31:41] Logged the message, Master [19:32:30] Reedy: I'm going to have to do this for all the wikis when they go to wmf12 [19:33:34] manybubbles: there was a bug for zhwikivoyage, is that done? [19:33:40] or ^d actually [19:34:03] twkozlowski: I think so. I believe ^d did it [19:34:16] <^d> twkozlowski: Yes, I thought we closed it already [19:34:27] <^d> If not, it's done and been done. [19:34:30] I'll close the bug then. Thanks! [19:34:45] userCan! 
[19:35:17] Reedy: i suppose give folks a little while to test on test.wikidata and then [19:35:22] wikidata can go to wmf12 [19:36:52] !log reedy synchronized php-1.23wmf12/extensions/PdfHandler [19:36:59] Logged the message, Master [19:37:02] (03PS7) 10Aude: update wikibase cronjobs to use wikidata build [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 [19:37:28] if someone would like to review puppet ^ [19:37:37] i think we'll be ready for that in a bit [19:38:59] (03PS1) 10Aude: Wikidata is ready for quantities, after it gets switched to wmf12 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110433 [19:39:45] aude: I'm going to step away for a few minutes. can you ping me when wikidata is updated to wmf12 so I can rebuild its search index? [19:40:00] i think it will be at least half hour or so [19:40:08] ought to give people time to poke at test.wikidata [19:40:18] aude: thanks! I should go poke it too [19:40:24] yes! [19:40:36] wikidata is getting quantities [19:43:16] !log reedy synchronized php-1.23wmf12/includes/api/ApiQueryRevisions.php 'bug 60635' [19:43:23] Logged the message, Master [19:46:11] userCan! [19:49:27] Hmm [19:49:34] How big is testwikidatawiki [19:52:50] <_david_> ^d, https://gerrit-review.googlesource.com/#/c/54153 [19:52:59] !log Changed wb_items_per_site.ips_row_id and wb_terms.term_row_id to BIGINT on testwikidatawiki [19:53:06] Logged the message, Master [19:53:09] Reedy: a couple hundred articles [19:53:20] <1000 [19:53:58] <^d> _david_: Oh sweet! 
[19:54:09] <_david_> ^d ;-) [19:54:48] (03PS2) 10Hashar: Remove both temp dirs in scap-recompile [operations/puppet] - 10https://gerrit.wikimedia.org/r/110426 (owner: 10Reedy) [19:55:00] heh :) [19:55:21] (03CR) 10Hashar: [C: 031] Remove both temp dirs in scap-recompile [operations/puppet] - 10https://gerrit.wikimedia.org/r/110426 (owner: 10Reedy) [19:55:37] !log Update indexes on wb_terms for testwikidatawiki https://gerrit.wikimedia.org/r/#/c/99660 [19:55:46] Logged the message, Master [19:56:29] is _david_ a gerrit guy? [19:56:59] _david_: do you know where do i put http://etherpad.wikimedia.org/p/new-gerrit-change-view-comments so that it gets read? alternatively, could you put it there? :D [19:57:03] Looks like it [19:57:12] the new change view screen is pretty nice [19:57:22] <_david_> MatmaRex, thx [19:57:35] is the new view an opt-in preference ? [19:57:51] hashar: on our install, yes [19:57:52] Reedy: is deploy completed? seeing some new i18n messages not being translated on-wiki [19:57:59] (only in Flow so far) [19:58:08] Yeah, should be all done... [19:58:17] hashar: last two dropdowns on https://gerrit.wikimedia.org/r/#/settings/preferences [19:58:33] Reedy: still waiting on wikidatawiki, right? [19:58:34] hmm, at https://www.mediawiki.org/wiki/Talk:Flow flow-post-interaction-separator isn't being translated. I just double checked the branch and its in the i18n file [19:58:50] hashar: http://i.imgur.com/8zOa81m.png http://i.imgur.com/jsD1wLO.png http://i.imgur.com/of9Z6B0.png [19:58:59] (and spelling is correct on both sides) [19:59:08] _david_: :) [19:59:13] manybubbles: I'm not sure. It's not actually scheduled [19:59:38] <_david_> MatmaRex, you mean where to put these enhancements? Feature requests? 
[20:00:07] _david_: they are pretty minor, it'd feel silly to file 20 bugs for them [20:00:27] and i suppose many are already done [20:00:37] <_david_> MatmaRex, agreed, but then really fill one minor bugs an explain exactly what you mean [20:00:39] i have no idea how close to master our install is [20:00:55] MatmaRex: yup I have seen the new GUI on other Gerrit installation :-) [20:01:39] <_david_> MatmaRex, you can check on gerit-review, they are 10-20 commits away, ok, mostly [20:01:42] the new one is a bit inconstent, but I find it better overal [20:04:38] !log reedy synchronized php-1.23wmf12/extensions/ContactPageFundraiser/ContactPage.php [20:04:46] Logged the message, Master [20:06:54] Reedy, ebernhardson : the new Flow message isn't in https://test2.wikipedia.org/wiki/Special:AllMessages?prefix=flow-post . It's special characters ' • ' but that should work... [20:08:40] <^d> MatmaRex: We're running 2.8.1 stable, so not master at all. [20:11:11] spage: i doubt its the special chars, flow-post-moderated-toggle-delete-hide is also new and missing [20:11:23] but doesn't have anything special about it, text + GENDER [20:11:38] ^d: i mean `git log --oneline v2.8.1..master | wc -l` :) [20:15:33] (03PS4) 10Manybubbles: Monitor Elasticsearch query stats groups [operations/puppet] - 10https://gerrit.wikimedia.org/r/108852 [20:15:36] Reedy, the new flow messages aren't in tin:/usr/local/apache/common/php-1.23wmf12/cache/l10n/l10n_cache-en.cdb , but that's as far as my l10n knowledge goes :) [20:15:50] (03CR) 10Manybubbles: "I believe this is ready." [operations/puppet] - 10https://gerrit.wikimedia.org/r/108852 (owner: 10Manybubbles) [20:16:02] ottomata: ^^^^ mmmmmm more graphs [20:16:10] !log reedy synchronized php-1.23wmf12/extensions/ContactPageFundraiser [20:16:18] Logged the message, Master [20:19:30] ok! 
[20:19:45] (03CR) 10Ottomata: [C: 032 V: 032] Monitor Elasticsearch query stats groups [operations/puppet] - 10https://gerrit.wikimedia.org/r/108852 (owner: 10Manybubbles) [20:20:02] merged [20:22:55] !log reedy started scap: Scap take 2 for 1.23wmf12 [20:23:03] Logged the message, Master [20:23:22] Updating LocalisationCache for 1.23wmf12... Updated 366 JSON file(s) in '/a/common/php-1.23wmf12/cache/l10n'. [20:31:14] LVS boxes have stopped reporting to ganglia [20:31:19] https://ganglia.wikimedia.org/latest/?c=LVS%20loadbalancers%20eqiad&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2 [20:31:36] same for pmtpa too [20:38:33] !log reedy finished scap: Scap take 2 for 1.23wmf12 (duration: 18m 08s) [20:38:42] Logged the message, Master [21:04:53] (03PS1) 10Ottomata: Giving shell access to Christian on gerrit and gitblit boxes [operations/puppet] - 10https://gerrit.wikimedia.org/r/110453 [21:06:39] (03PS1) 10Matanya: webserver: lint [operations/puppet] - 10https://gerrit.wikimedia.org/r/110454 [21:06:44] ottomata: around? can you review puppet stuff? [21:06:51] (03CR) 10Ottomata: [C: 032 V: 032] Giving shell access to Christian on gerrit and gitblit boxes [operations/puppet] - 10https://gerrit.wikimedia.org/r/110453 (owner: 10Ottomata) [21:07:01] seems like he is :) [21:07:12] matanya_: want to look at https://gerrit.wikimedia.org/r/#/c/110371/ again? [21:07:21] at least ensure it's not breaking stuff [21:07:31] ja [21:07:34] ok [21:07:57] we're moving the script for the cron jobs in wmf12 [21:08:30] so, 1) stop the old ones 2) switch wikidata to wmf12 3) remove commented out parts / re-enable [21:09:21] in reviewing , fixed some linting issues [21:09:26] !log reedy synchronized php-1.23wmf12/extensions/Wikidata/ [21:09:35] Logged the message, Master [21:09:36] thanks Reedy [21:10:46] (03CR) 10Matanya: "Just to make sure, you understand this code will remove the old cron jobs before the new ones come in?" 
(035 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 (owner: 10Aude) [21:11:15] oh noes :) [21:11:50] (03CR) 10Aude: "yes stop old ones, enable new ones in follow up" [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 (owner: 10Aude) [21:13:55] (03PS8) 10Aude: update wikibase cronjobs to use wikidata build [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 [21:13:56] fixed [21:16:23] ottomata: my new graphs aren't in yet! [21:16:29] the code is on the nodes [21:16:51] ohhh ganglia [21:16:52] gimme 2 mins [21:18:15] (03CR) 10Ottomata: webserver: lint (038 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/110454 (owner: 10Matanya) [21:18:29] ok manybubbles elastic1001 a good node to check [21:18:30] ? [21:18:41] https://gerrit.wikimedia.org/r/#/c/110371/ is ready [21:18:42] I was checking 1008 [21:18:44] but whatever [21:18:49] 1008 is the current master that is why I was there [21:18:52] no good reason [21:19:02] then expect follow up in ~10 min or so [21:19:05] what am I looking for? [21:19:34] aude: are you a male or a female? not sure how to refer to you, sorry [21:19:41] she [21:19:48] oh, thanks [21:19:48] there is a stat called es_prefix_queries, I believ [21:20:08] ottomata: when I run the script if finds it: value for es_prefix_queries is 2095.0 [21:20:19] but it doesn't shove it into the ganglia [21:20:27] ottomata: she pushed the code for the wikidata cron jobs [21:21:14] you can review and merge when it is reday, i did a review, but better if you take a look too, it is too late here to rely on my CR :) [21:21:48] (03CR) 10Matanya: [C: 031] update wikibase cronjobs to use wikidata build [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 (owner: 10Aude) [21:23:29] greg-g: would it be ok to sync bits dir sometime? 
we have a new ver of firefox os app there [21:23:33] manybubbles: i see a bunch of these in syslog on gmond restart [21:23:33] Jan 30 21:23:17 elastic1001 /usr/sbin/gmond[17750]: [PYTHON] Extra data key [path] could not be processed.#012 [21:23:33] Jan 30 21:23:17 elastic1001 /usr/sbin/gmond[17750]: [PYTHON] No metric name given in module [elasticsearch_monitoring].#012 [21:23:50] i don't think greg is around today [21:24:17] yurikR: wait until we are done with wikidata stuff [21:24:30] yeah, manybubbles, this wikidata stuff is probably something that someone else who is workign with it should merge [21:24:44] ottomata: who can we ask? [21:24:54] hm, i mean i can do it, but i don't know anything about it [21:25:07] has anyone in ops already been working with you on this? [21:26:07] well they helped us originally set these up [21:27:12] just need to check that things like syntax are correct [21:28:38] peter, LeslieCarr and apergos reviewed these before [21:28:47] don't know if they are available for this sort of thing [21:29:03] (03CR) 10Matanya: webserver: lint (038 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/110454 (owner: 10Matanya) [21:29:38] (03PS2) 10Matanya: webserver: lint [operations/puppet] - 10https://gerrit.wikimedia.org/r/110454 [21:30:13] (03CR) 10jenkins-bot: [V: 04-1] webserver: lint [operations/puppet] - 10https://gerrit.wikimedia.org/r/110454 (owner: 10Matanya) [21:30:32] ottomata: I'll look at monitoring [21:30:51] (03PS3) 10Matanya: webserver: lint [operations/puppet] - 10https://gerrit.wikimedia.org/r/110454 [21:30:54] suppose i can check the logs for the cron jobs [21:31:17] aude: peter and leslie no longer work for wmf [21:31:32] exactly [21:31:39] and apergos is most likely sleeping [21:31:57] but you never know with him [21:39:08] (03CR) 10Anomie: "> Scribunto does not allow the creation of namespace aliases" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101889 (owner: 10Odder) [21:40:54] paravoid: 
around? [21:40:59] * aude wonders who else to ask [21:41:04] (03PS1) 10BryanDavis: kibana: Set default dashboard [operations/puppet] - 10https://gerrit.wikimedia.org/r/110457 [21:41:24] aude, don't ask if you can ask, just ask:P [21:41:31] who can merge puppet stuff? [21:41:38] lots of ppl [21:41:47] ori: ? [21:42:07] kind of sucky to wait until next week to enable new code on wikidata [21:42:18] considering i am travelling and i am around today [21:42:33] and test.wikidata is absolutely doing good with the code [21:42:58] travelling next week* [21:43:53] aude: My trick for getting attention is to say ori three times while flashing the lights on and off in my office [21:43:58] hah [21:44:03] aude, i can merge, i just don't want to break anything that i'm not going to watch :) [21:44:18] i can watch [21:44:27] i am on terbium looking at the logs [21:44:33] i would love to break :) [21:44:34] basically they should stop [21:44:48] didn't do that in a day or two [21:44:53] then when we re-enable, they should work again [21:45:51] Yep, even *if* they break, those aren't super-critical [21:45:57] no data will be lost or whatever [21:46:07] But aude's change looks sane [21:46:21] i agree [21:46:21] 21:46:11 Posted 176 changes to iswikisource, up to ID 105196033, timestamp 20140130214611. Lag is 0 seconds. Next ID is 105196033. 
[21:46:37] that will stop (which is fine to do briefly) [21:46:40] ok aude, i can merge [21:46:44] thanks :) [21:46:48] * aude owes beer [21:46:48] (03PS9) 10Aude: update wikibase cronjobs to use wikidata build [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 [21:46:50] \o/ [21:46:54] (03CR) 10Ottomata: [C: 032 V: 032] update wikibase cronjobs to use wikidata build [operations/puppet] - 10https://gerrit.wikimedia.org/r/110371 (owner: 10Aude) [21:47:05] blame me if doesn't work [21:47:13] we shall wait for puppet then can put wikidata on wmf12 [21:47:15] Reedy: ^ [21:47:22] matanya: i'm sure it's good [21:47:27] i can run puppet manually [21:47:28] if you like [21:47:31] if you want [21:47:35] k running [21:47:40] this will speed things up [21:47:41] terbium, right? [21:47:56] yes [21:47:57] yep [21:48:02] Cannot reassign variable enabled at /etc/puppet/manifests/misc/maintenance.pp:217 [21:48:04] might take a few minutes for the currently running jobs to complete [21:48:07] gah [21:48:09] ok [21:49:18] doh [21:49:25] (03PS1) 10Aude: Fix variable assignment for wikidata cron jobs [operations/puppet] - 10https://gerrit.wikimedia.org/r/110459 [21:49:27] that's not even used in the current version AFAIR [21:49:28] does that work? [21:49:37] we cleaned up lint stuff during review [21:50:07] ok, that should work... m 0.02$ [21:50:10] * my [21:50:14] :) [21:50:46] i don't know exactly why things were done this way originally with $::enabled ? 
[21:50:59] oh hmm [21:51:00] aude [21:51:03] i don't htink you want $::enabled [21:51:13] that makes it globally qualified instead of locally [21:51:16] just $enabled will do [21:51:18] yeah [21:51:19] ok [21:51:22] amending [21:51:31] Not sure why one did that in the first place [21:51:48] (03PS2) 10Aude: Fix variable assignment for wikidata cron jobs [operations/puppet] - 10https://gerrit.wikimedia.org/r/110459 [21:54:11] (03CR) 10Hoo man: [C: 031] Fix variable assignment for wikidata cron jobs (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/110459 (owner: 10Aude) [21:54:45] something else than true/false I meant, doh [21:54:52] (03CR) 10Aude: Fix variable assignment for wikidata cron jobs (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/110459 (owner: 10Aude) [21:55:11] someone could go through the entire file and cleanup [21:55:20] * aude ignored stuff around what i changed [21:55:43] aude: Yep... we even have the old huwiki cron in there still (absent, though) [21:56:01] needs to be absent... i don't know at what point it can be removed completely [21:56:30] this has been more than a year AFAIR [21:56:43] yep [21:56:51] (03PS2) 10BryanDavis: kibana: Set default dashboard [operations/puppet] - 10https://gerrit.wikimedia.org/r/110457 [21:58:41] (03CR) 10Ottomata: [C: 032 V: 032] Fix variable assignment for wikidata cron jobs [operations/puppet] - 10https://gerrit.wikimedia.org/r/110459 (owner: 10Aude) [21:58:51] sorry about that... 
[22:02:05] notice: /Stage[main]/Misc::Maintenance::Wikidata/Cron[wikibase-dispatch-changes]/ensure: removed
[22:02:05] notice: /Stage[main]/Misc::Maintenance::Wikidata/Cron[wikibase-dispatch-changes2]/ensure: removed
[22:02:05] notice: /Stage[main]/Misc::Maintenance::Wikidata/Cron[wikibase-repo-prune]/ensure: removed
[22:02:06] looks good aude
[22:02:10] yay
[22:02:29] we'll wait a few minutes to see that everything finished / stopped
[22:02:55] :)
[22:04:43] --max-time 900
[22:04:49] guess we wait 15 min?
[22:05:23] aude: Sure... if you're on the box you can also look whether it's still running, they'll probably die earlier
[22:05:35] it is
[22:05:50] ok
[22:06:01] to be safe, we can wait
[22:06:02] Ah, no max-passes set, so they'll really run 900s each
[22:06:13] yep
[22:09:29] PROBLEM - Host mw1036 is DOWN: PING CRITICAL - Packet loss = 100%
[22:10:19] PROBLEM - Host mw1033 is DOWN: PING CRITICAL - Packet loss = 100%
[22:12:17] * aude waiting
[22:12:39] PROBLEM - Host mw1039 is DOWN: PING CRITICAL - Packet loss = 100%
[22:13:29] that's odd.
[22:13:39] three apaches in same range offlining at same time
[22:14:48] !log mw1036 crashed, unresponsive to console or ssh, rebooting
[22:14:58] Logged the message, RobH
[22:14:59] (CR) Odder: Give testwiki some custom namespaces (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/78016 (owner: TTO)
[22:16:04] 22:15:02 Done, exiting after 2734 passes and 901 seconds.
[22:16:24] Reedy: still around?
[22:16:51] wikidata can go to wmf12 whenever
[22:17:17] (PS1) Manybubbles: Fix new Elasticsearch monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/110464
[22:17:36] ottomata: found it ^^^
[22:17:46] u'foo' is no 'foo'
[22:17:49] PROBLEM - Host mw1054 is DOWN: PING CRITICAL - Packet loss = 100%
[22:17:53] but it is on some machines.........
[22:18:48] ottomata: I have some packages in labs that I need added to the apt repo
[22:18:50] rather, some machines don't have the problem
[22:18:57] (PS1) Aude: Enable new cron jobs for wikidata change dispatcher and pruning [operations/puppet] - https://gerrit.wikimedia.org/r/110465
[22:18:58] hmm
[22:19:04] k Ryan_Lane, where?
[22:19:06] not to merge that yet^
[22:19:07] let me make sure you're in the project
[22:19:08] one sec
[22:19:09] k
[22:19:13] !log mw1033 crashed, powercycling
[22:19:19] (CR) Aude: [C: -1] "not until wikidata is on wmf12" [operations/puppet] - https://gerrit.wikimedia.org/r/110465 (owner: Aude)
[22:19:20] Logged the message, RobH
[22:19:33] ok, added you to the trebuchet project
[22:19:37] I'm slowly moving stuff there
[22:19:48] I'll let you know when it's set up like the sartoris project
[22:19:59] PROBLEM - Host mw1057 is DOWN: PING CRITICAL - Packet loss = 100%
[22:20:19] ottomata: on trebuchet-build1.pmtpa.wmflabs:/home/laner
[22:20:29] it's the trebuchet-trigger package
[22:20:42] this version fixes submodules as well
[22:20:47] or, well, should
[22:20:49] RECOVERY - Host mw1036 is UP: PING OK - Packet loss = 0%, RTA = 0.45 ms
[22:20:53] I didn't get a chance to test it yet
[22:21:42] can anyone say where our equipment is at 200Paul?
[22:21:52] floor, location, etc.
[22:22:09] PROBLEM - Host mw1058 is DOWN: PING CRITICAL - Packet loss = 100%
[22:22:31] !log powercycling mw1039 as it's also crashed
[22:22:38] Logged the message, RobH
[22:22:39] RECOVERY - Host mw1033 is UP: PING OK - Packet loss = 0%, RTA = 0.34 ms
[22:24:19] PROBLEM - Host mw1061 is DOWN: PING CRITICAL - Packet loss = 100%
[22:24:39] PROBLEM - Host mw1060 is DOWN: PING CRITICAL - Packet loss = 100%
[22:24:41] Ryan_Lane, done
[22:24:41] http://apt.wikimedia.org/wikimedia/pool/main/t/trebuchet-trigger/
[22:26:00] (PS1) Aude: wikidatawiki to wmf12 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110467
[22:28:09] RECOVERY - Host mw1039 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms
[22:28:58] RobH: are you looking into mw1057, mw1058, and mw1060 as well?
[22:29:13] eventually yep
[22:29:31] looping emails for a few then will get to those three
[22:29:49] PROBLEM - Host mw1073 is DOWN: PING CRITICAL - Packet loss = 100%
[22:30:19] PROBLEM - RAID on mw1039 is CRITICAL: Connection refused by host
[22:30:19] PROBLEM - Disk space on mw1039 is CRITICAL: Connection refused by host
[22:30:39] PROBLEM - SSH on mw1039 is CRITICAL: Connection refused
[22:30:39] PROBLEM - puppet disabled on mw1039 is CRITICAL: Connection refused by host
[22:30:49] PROBLEM - Apache HTTP on mw1039 is CRITICAL: Connection refused
[22:30:59] PROBLEM - twemproxy process on mw1039 is CRITICAL: Connection refused by host
[22:31:09] PROBLEM - DPKG on mw1039 is CRITICAL: Connection refused by host
[22:31:53] urghhh
[22:32:13] filesystem repairs
[22:33:49] PROBLEM - Host mw1039 is DOWN: PING CRITICAL - Packet loss = 100%
[22:35:23] !log rebooting frozen systems mw1057, mw1058, mw1060
[22:35:30] Logged the message, RobH
[22:35:39] RECOVERY - SSH on mw1039 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[22:35:39] RECOVERY - puppet disabled on mw1039 is OK: OK
[22:35:49] RECOVERY - Host mw1039 is UP: PING OK - Packet loss = 0%, RTA = 0.20 ms
[22:35:59] RECOVERY - twemproxy process on mw1039 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker
[22:36:09] RECOVERY - DPKG on mw1039 is OK: All packages OK
[22:36:19] RECOVERY - RAID on mw1039 is OK: OK: no RAID installed
[22:36:19] RECOVERY - Disk space on mw1039 is OK: DISK OK
[22:36:45] !log powercycling mw1061, system frozen
[22:36:53] Logged the message, RobH
[22:37:49] RECOVERY - Host mw1058 is UP: PING OK - Packet loss = 0%, RTA = 0.65 ms
[22:37:49] RECOVERY - Apache HTTP on mw1039 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.045 second response time
[22:39:08] anyone want to merge https://gerrit.wikimedia.org/r/#/c/110467/ ? :)
[22:39:14] Reedy:
[22:39:29] RECOVERY - Host mw1061 is UP: PING OK - Packet loss = 0%, RTA = 0.35 ms
[22:41:09] RECOVERY - Host mw1057 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms
[22:41:19] RECOVERY - Host mw1060 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
[22:42:19] PROBLEM - Apache HTTP on mw1061 is CRITICAL: Connection refused
[22:49:49] PROBLEM - Host mw1061 is DOWN: PING CRITICAL - Packet loss = 100%
[22:51:37] ^d: do you want to review https://gerrit.wikimedia.org/r/#/c/110467/ ? ( i guess reedy is not around)
[22:51:56] * aude would merge if i could and deploy ;)
[22:52:24] not the least bit concerned about the change
[22:52:42] <^d> why didn't it happen earlier? :p
[22:52:49] RECOVERY - Host mw1061 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms
[22:52:50] don't know
[22:52:54] ^d: We were blocked on a puppet change first
[22:53:10] we poked around on test, then stopped cron job
[22:53:45] been one of the easiest deploys, in terms of lack of problems
[22:53:49] <^d> hoo: That done now?
[22:53:55] cron job stopped
[22:53:56] yes
[22:54:04] ^d: All done, that one is good to go
[22:54:21] and you should give aude +2 on the repo
[22:54:23] * hoo hides
[22:54:29] 22:15:02 Done, exiting after 2734 passes and 901 seconds.
[22:54:57] for trivial stuff, +2 could be nice
[22:55:14] Yep, aude already has shell and deploy access anyway
[22:55:19] RECOVERY - Apache HTTP on mw1061 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.044 second response time
[22:56:31] (CR) Chad: [C: 2] wikidatawiki to wmf12 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110467 (owner: Aude)
[22:56:35] thanks :)
[22:56:42] :)
[22:56:43] (Merged) jenkins-bot: wikidatawiki to wmf12 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110467 (owner: Aude)
[22:57:22] ^d: Getting aude proper access could really save us quite some time
[22:57:32] * aude spending time reading reddit
[22:57:39] and waiting
[22:58:41] aude: You (or someone else) needs to sync now
[22:58:57] i could, if ^d doesn't want to
[22:59:11] needs to rebuild wikiversions
[22:59:20] <^d> wtf.
[22:59:25] !log demon rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki to wmf12
[22:59:29] yay
[22:59:30] <^d> I'm getting pubkey denied all over the place.
[22:59:34] Logged the message, Master
[22:59:41] <^d> silly tampa.
[22:59:57] here we go, thanks ^d :)
[22:59:59] stuff looks absolutely fine
[23:00:11] <^d> it's just tampa.
[23:00:20] <^d> scap/etc hasn't worked to tampa for me for like a week.
[23:00:23] https://gerrit.wikimedia.org/r/110465 is good to merge now
[23:01:01] wikidata thinks i speak russian now as a second language
[23:01:07] (that's intended)
[23:01:24] ottomata: https://gerrit.wikimedia.org/r/#/c/110465/ ?
[23:01:26] I have that as well
[23:01:30] or ori
[23:01:39] https://gerrit.wikimedia.org/r/#/c/110433/ is also good to go
[23:01:51] <^d> I wonder if the tampa thing is just me.
[23:01:55] no idea
[23:08:47] ottomata: Want to merge https://gerrit.wikimedia.org/r/110465 ?
[23:09:02] wikidata is doing super great
[23:09:08] logs are quiet, etc
[23:09:45] hoo, aude −1ed?
[23:09:50] oh, yes
[23:09:59] ottomata: that's obsolete ;)
[23:10:11] (CR) Aude: [C: 1] "wikidata is on wmf12, logs quiet, everything looks great" [operations/puppet] - https://gerrit.wikimedia.org/r/110465 (owner: Aude)
[23:10:14] there
[23:11:00] and ^d https://gerrit.wikimedia.org/r/#/c/110433/ ? or anyone
[23:12:03] <^d> So this turns it all back to default?
[23:12:41] ^d: adds quantities data type which is / has been on test wikidata
[23:12:51] so, yes use wikibase defaults
[23:12:55] (CR) Chad: [C: 2] Wikidata is ready for quantities, after it gets switched to wmf12 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110433 (owner: Aude)
[23:13:43] (Merged) jenkins-bot: Wikidata is ready for quantities, after it gets switched to wmf12 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110433 (owner: Aude)
[23:14:37] !log demon synchronized wmf-config/Wikibase.php 'I50dfdc42: Enable quantity values'
[23:14:39] thanks
[23:14:42] <^d> yw
[23:14:45] Logged the message, Master
[23:14:50] <^d> Hmm, that ssh problem seems to be just on sync-wikiversions.
[23:14:54] <^d> wonder wtf is up there.
[23:17:56] (CR) Ottomata: [C: 2 V: 2] Enable new cron jobs for wikidata change dispatcher and pruning [operations/puppet] - https://gerrit.wikimedia.org/r/110465 (owner: Aude)
[23:19:42] thanks, ottomata
[23:20:17] want to manually run puppet again? :)
[23:37:57] ^d: mh... might it be possible to run the localization update on wmf12?
[23:38:23] <^d> We run it for all branches.
[23:38:40] ^d: When did/ will you?
[23:39:01] <^d> It's running now ;-)
[23:39:13] Ah, fine :)
[23:40:01] <^d> Gah, what is it!
[23:40:02] <^d> Failed to read file: exception 'Exception' with message 'Expected =, got '['' in /a/common/php-1.23wmf11/extensions/LocalisationUpdate/QuickArrayReader.php:140
[23:40:17] oh crap
[23:40:24] bt?
[23:40:28] !log LocalisationUpdate completed (1.23wmf11) at 2014-01-30 23:40:27+00:00
[23:40:29] PROBLEM - Varnish traffic logger on cp1054 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:40:29] PROBLEM - Varnish HTTP text-backend on cp1054 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:40:30] <^d> For wmf10
[23:40:35] Logged the message, Master
[23:40:38] <^d> Gr, wmf11, I mean
[23:40:41] <^d> wmf12 was fine.
[23:40:59] PROBLEM - Varnish HTCP daemon on cp1054 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:41:42] ^d: mh... can I still have a trace?
[23:42:27] <^d> I'll pastebin the whole thing in a moment
[23:42:31] :)
[23:42:39] greg-g: ping
[23:43:16] <^d> bleh, wmf12 too.
[23:43:19] <^d> Not all extensions.
[23:43:28] Thought so
[23:43:37] that's probably us, although the code looked sane
[23:43:49] <^d> It's Translate* extensions + VE
[23:43:53] <^d> Not you guys
[23:44:37] wow, that was unexpected :)
[23:54:19] RECOVERY - Varnish traffic logger on cp1054 is OK: PROCS OK: 2 processes with command name varnishncsa
[23:54:19] RECOVERY - Varnish HTTP text-backend on cp1054 is OK: HTTP OK: HTTP/1.1 200 OK - 189 bytes in 0.007 second response time
[23:54:49] RECOVERY - Varnish HTCP daemon on cp1054 is OK: PROCS OK: 1 process with UID = 111 (vhtcpd), args vhtcpd