[00:06:38] Change merged: Demon; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59443
[00:08:54] !log demon synchronized docroot
[00:09:01] Logged the message, Master
[00:11:02] New review: awjrichards; "looks OK to me" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/59767
[00:14:26] PROBLEM - Host db1004 is DOWN: PING CRITICAL - Packet loss = 100%
[00:16:06] RECOVERY - Host db1004 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
[00:18:03] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59519
[00:18:46] RECOVERY - mysqld processes on db1004 is OK: PROCS OK: 1 process with command name mysqld
[00:18:59] New review: Lcarr; "a yes click is needed but i am a bit concerned over the possibility of something going wrong. anyon..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/57419
[00:21:41] New patchset: Odder; "(bug 46431) Update Apple touch icon for en.wiktionary." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59771
[00:21:42] hey ^demon, you got a second to approve https://gerrit.wikimedia.org/r/#/c/59093/ ?
[00:22:18] <^demon|busy> Why is the scope compile?
[00:22:30] <^demon|busy> *Why does it need to be?
[00:22:44] New patchset: Asher; "Revert "pulling db1004"" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59772
[00:22:54] because else it won't compile :) the project structure does not follow maven pom style
[00:23:07] so the test files are in the same folder as the source files
[00:23:17] Change merged: Asher; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59772
[00:23:22] <^demon|busy> Hmm, had been for me.
[00:23:34] <^demon|busy> I guess it's harmless enough, but we should fix that you're right.
[00:23:45] i tried it this week :)
[00:23:52] New review: Lcarr; "doh there's been a path conflict! rebase (or abandon and redo) please" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/42035
[00:25:02] New review: Lcarr; "this has been sitting here a little bit -- still needed? (probably will need rebasing and cleaning)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/48149
[00:25:15] ty ^demon
[00:26:57] New review: Lcarr; "(1 comment)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50011
[00:27:55] New review: Lcarr; "poor harmon ? still needed?" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52042
[00:29:26] New review: Lcarr; "since this is only for the bots project, i am a +1 however do we need another bots member for more v..." [operations/puppet] (production) C: 1; - https://gerrit.wikimedia.org/r/53145
[00:36:15] New review: Odder; "Yes, this is true and, quite frankly, a bit surprising. " [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59756
[00:36:17] New patchset: Lcarr; "Linkify RT a little more liberally." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/49196
[00:37:22] !log asher synchronized wmf-config/db-eqiad.php 'adding db1004 at a warmup weight'
[00:37:30] Logged the message, Master
[00:37:36] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/49196
[00:39:19] New patchset: Ori.livneh; "Parametrize and extend IPython Notebook class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59727
[00:39:38] New review: Liangent; "Only Wikipedia has zh-mo enabled." [operations/apache-config] (master) - https://gerrit.wikimedia.org/r/59580
[00:39:55] hey, any ops around to merge a puppet change for vanadium for me? https://gerrit.wikimedia.org/r/59727
[00:41:20] apergos, maybe, if you have a moment?
[00:41:46] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59727
[00:42:00] binasher: :))) thanks
[00:42:17] no prob
[00:42:28] !log asher synchronized wmf-config/db-eqiad.php 'setting db1004 to full weight'
[00:42:35] Logged the message, Master
[00:45:59] ori-l: i'm pretty sure apergos is asleep
[00:46:10] it's pretty late in greece
[00:46:13] LeslieCarr: all the better, i can sneak big puppet changes right past
[00:46:17] ha
[00:46:17] j/k, i only pinged because of the topic
[00:46:40] New patchset: Demon; "Update the ldap scripts to pep8 compliant" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/53476
[00:46:44] ah
[00:54:37] New patchset: Ori.livneh; "Specify config dir using "--ipython-dir" command-line param" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59775
[00:56:45] binasher: gah, i made one small mistake. can you do this one too? ^ (sorry..)
[00:56:53] sure
[00:57:04] thanks :)
[00:57:09] New review: Demon; "(1 comment)" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/53476
[00:57:18] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59775
[01:00:12] asdjkhaksjdh i am a retard
[01:01:22] New patchset: Ori.livneh; "Drop extraneous slash from template path" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59777
[01:01:44] doh
[01:02:00] sorry
[01:02:06] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59777
[01:02:24] * ori-l puts on the puppet dunce cap
[01:03:01] <^demon|zzz> Night folks.
[01:03:01] !log re-enabling Tomasz' RT account
[01:03:10] Logged the message, Master
[01:05:33] aaaand it works, sweet
[01:14:34] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 238 seconds
[01:15:34] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 25 seconds
[01:33:34] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 205 seconds
[01:34:34] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 15 seconds
[01:55:18] New patchset: Ori.livneh; "Create IPython config based on $ipython_profile" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59782
[02:02:20] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours
[02:02:20] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours
[02:02:20] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours
[02:06:27] New review: Krinkle; "I've tested this locally in the browser dev tools, it doesn't work. The element being hidden here i..." [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/58082
[02:08:11] New review: Krinkle; "(1 comment)" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/57302
[02:10:17] New patchset: Krinkle; "gerrit-wm: Sends translatewiki events to #mediawiki-i18n" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/57302
[02:13:48] New patchset: Krinkle; "wikibugs: Set up #mediawiki-visualeditor" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37570
[02:14:27] !log LocalisationUpdate completed (1.22wmf2) at Thu Apr 18 02:14:26 UTC 2013
[02:14:35] Logged the message, Master
[02:23:54] !log LocalisationUpdate completed (1.22wmf1) at Thu Apr 18 02:23:54 UTC 2013
[02:24:02] Logged the message, Master
[02:54:52] New patchset: Odder; "(bug 44899) Namespace setup for Korean Wikiversity" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59786
[02:56:58] New patchset: Odder; "(bug 44899) Namespace setup for Korean Wikiversity" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59786
[07:58:19] !log olivneh synchronized php-1.22wmf1/extensions/ConfirmEdit 'Updating extension/ConfirmEdit (Bug 46132 / Change I8ee3fd136)'
[07:58:23] Logged the message, Master
[07:58:25] !log olivneh synchronized php-1.22wmf2/extensions/ConfirmEdit 'Updating extension/ConfirmEdit (Bug 46132 / Change I8ee3fd136)'
[07:58:27] Logged the message, Master
[07:59:07] New patchset: Isarra; "(Bug 47299) Update MediaWiki.org favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59589
[07:59:07] New patchset: Isarra; "Update wikisource favicon" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59809
[07:59:40] New review: Isarra; "This patchset was an accident. I apologise." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59589
[08:09:41] New review: Hashar; "(5 comments)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59611
[08:09:50] New patchset: Hashar; "convert package-builder to a module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59611
[08:13:44] New patchset: Hashar; "package builder now supports Debian.org unstable" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59626
[08:14:01] New review: Hashar; "rebased / fixed conflict" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59626
[08:14:12] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 197 seconds
[08:16:07] New patchset: MaxSem; "Adjust $wgLoadScript adjustment" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59810
[08:18:57] New patchset: MaxSem; "Adjust $wgLoadScript adjustment" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59810
[08:19:11] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 240 seconds
[08:20:11] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 0 seconds
[08:20:44] New review: Kaldari; "(1 comment)" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/59717
[08:20:45] New patchset: MaxSem; "Adjust $wgLoadScript adjustment" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59810
[08:22:11] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59810
[08:23:51] New patchset: Hashar; "package builder now supports Debian.org unstable" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59626
[08:24:10] New review: Hashar; "adds in mirror parameter to the class" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59626
[08:32:14] New patchset: MaxSem; "grrr" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59811
[08:34:11] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 187 seconds
[08:34:13] New patchset: MaxSem; "grrr" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59811
[08:35:05] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59811
[08:35:41] hashar, PHP linting doesn't work
[08:38:13] MaxSem: proof? :D
[08:38:46] hashar, https://gerrit.wikimedia.org/r/#/c/59810/
[08:39:07] I accidentally committed an unfinished change
[08:39:09] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 26 seconds
[08:40:14] New review: Hashar; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59810
[08:40:29] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:40:47] https://integration.wikimedia.org/ci/job/operations-mw-config-phplint/3219/console
[08:40:47] ah
[08:40:48] hm
[08:40:49] yeah
[08:41:19] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time
[08:43:33] MaxSem: seems like timo broke it with https://gerrit.wikimedia.org/r/#/c/59112/ :D
[08:47:19] MaxSem: can you fill a quick bug please? Will amend it after
[08:47:42] wydoncha simply rvv him?
[08:48:05] yeah
[08:48:17] !log maxsem synchronized wmf-config/mobile.php
[08:48:23] Logged the message, Master
[08:50:49] MaxSem: fixed https://integration.wikimedia.org/ci/job/operations-mw-config-phplint/3223/console
[08:52:16] weee
[09:01:09] New patchset: Hashar; "preparing package to be uploaded to the debian repo" [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/59605
[09:02:05] New review: Hashar; "I have cleaned up the debian/changelog file, basically wiped it out :)" [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/59605
[09:08:52] hehehe
[09:09:04] apergos: Just looking through the logs of yesterday... Thanks for reviewing the change for gerrit's apache config.
[09:09:06] New patchset: Ori.livneh; "Parametrize IPython notebook profile configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59782
[09:19:24] New patchset: Ori.livneh; "Specify 'endpoint' param in mwerrors.pyconf" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59813
[09:19:59] any ops around?
[09:21:51] * ori-l squints
[09:22:22] PROBLEM - Puppet freshness on ms-fe3001 is CRITICAL: No successful Puppet run in the last 10 hours
[09:32:32] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:33:22] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.132 second response time
[10:01:17] New patchset: Hashar; "preparing package to be uploaded to the debian repo" [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/59605
[10:12:24] PROBLEM - Puppet freshness on gallium is CRITICAL: No successful Puppet run in the last 10 hours
[10:31:35] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:32:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time
[10:59:26] New patchset: Hashar; "packaging `statsd` python module" [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/59397
[10:59:43] New review: Hashar; "tweaked package for debian uploading" [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/59397
[11:31:34] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:32:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.151 second response time
[11:36:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:37:20] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time
[11:45:50] sorry to not be responsive; was trying to negotiate customs for our visting WMF person, and instead of a half our it took three (3)
[11:45:56] hours to get one simple stpupid package
[11:46:28] ah qchris yeah happy to, for stuff like that just ping me and I look at it the next time I have some free brain cells
[11:46:34] ori-l: yes, now
[11:47:29] apergos: I will. Thanks again \o/
[11:48:28] :-)
[11:57:58] customs?
[11:58:03] for what?
[11:58:15] customs upon exit?
[12:03:10] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours
[12:03:10] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours
[12:03:10] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours
[12:12:25] no
[12:12:30] to get a package of schwag
[12:12:44] fedex didn't tell them anything when they sent it
[12:13:04] nada, not a word, and the partner company here of course... mh. so not set up to deal with regular customers
[12:31:05] PROBLEM - Varnish HTCP daemon on cp1041 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:31:55] RECOVERY - Varnish HTCP daemon on cp1041 is OK: PROCS OK: 1 process with UID = 997 (varnishhtcpd), args varnishhtcpd worker
[12:33:52] Should the mysql client stuff be installed by puppet? As terbium doesn't have it :(
[12:34:12] New patchset: Mark Bergsma; "Varnish rules for Beta cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47567
[12:34:46] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47567
[12:37:25] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59813
[12:48:19] re
[12:49:16] New review: Faidon; "(2 comments)" [operations/debs/python-statsd] (master) C: -1; - https://gerrit.wikimedia.org/r/59397
[12:49:48] paravoid: hey :) I uploaded my packages to Debian svn repo :D
[12:50:01] I saw that
[12:50:07] see the comment right above
[12:50:11] yeah
[12:50:13] fully agree
[12:50:21] was about to tell you I was going to drop our git repos
[12:51:03] it's funny, same two people, different context, different vcs
[12:51:21] New review: Hashar; "I was going to suggest the same thing. I originally did not thought about upstreaming my packages...." [operations/debs/python-statsd] (master) - https://gerrit.wikimedia.org/r/59397
[12:51:24] New patchset: Mark Bergsma; "Simplify and cleanup the mobile host rewrites" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59401
[12:51:25] nice work though
[12:51:36] I am not sure how python-debian handle review
[12:51:44] eh, was fenari someone's laptop? ls /usr/games :)
[12:52:42] MaxSem: $ dpkg -S /usr/games
[12:52:42] base-files, cowsay, bsdgames: /usr/games
[12:52:57] We need something to do while we scap
[12:53:07] hashar: mails and irc basically
[12:53:25] Reedy, sudo apt-get install skyrim
[12:54:00] paravoid: someone told to attempt to build the package against unstable. That is why I have updated our package builder module (which has new changes)
[12:54:26] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59401
[12:54:36] mark: \O/
[12:54:45] yes, always build for an up-to-date unstable before uploading to unstable
[12:55:03] but you can do this on your laptop
[12:55:07] no reason to puppetize all that
[12:55:15] i don't mind puppetizing it, but it's probably not very useful
[12:56:43] hashar: there's also https://github.com/sivy/py-statsd, both a client & server
[12:56:49] wtf is wrong with these people
[12:57:15] how many py{,thon}-statsd are out there
[12:57:34] i feel the need to create another one
[12:57:44] I think we've found what? 5 so far?
[13:00:13] paravoid: there are a bunch of them indeed https://crate.io/?has_releases=on&q=statsd
[13:00:36] I need to find a virtual box to run debian/unstable
[13:02:40] !log Rebooting cp1041
[13:02:47] Logged the message, Master
[13:06:42] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours
[13:07:29] New patchset: MaxSem; "Another $wgMFVaryResources fix" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59821
[13:08:07] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59821
[13:10:16] MaxSem: just deployed some changes to the mobile frontends with Host: header rewriting
[13:10:19] let me know if you notice any issues
[13:10:33] mark, thanks, will look
[13:10:50] basically a rewrite and followup to your beta varnish rules change
[13:10:52] actually, I'm debugging some stuff relatd to new caching right now
[13:10:58] ok
[13:11:30] I haven't really followed the work, is it enabled everywhere now?
[13:11:31] no
[13:11:36] just on test and mediawiki.org
[13:11:44] ah
[13:11:57] getting closer :)
[13:13:36] paravoid: so python-statsd package has a source named 'statsd'. Should I rename it python-statsd instead? :D
[13:13:43] paravoid: I think that is what I originally used
[13:14:02] yeah that would work
[13:14:12] * hashar hates svn
[13:14:19] we've seen no reports from ppl about it which means that either it's ok, or they don't care in which case we should just flip the switch everywhere for them to complain
[13:14:23] raise the multiple python statsd modules with debian python people though
[13:14:37] MaxSem: good ;)
[13:14:46] paravoid: they told me the most used one would be fine.
[13:14:56] MaxSem: I've never used m.mediawiki.org, I'm guessing that's the case for most people
[13:15:07] i think so too
[13:15:20] we specifically emailed a couple of lists with a plea for testing
[13:15:21] i only ever visit m.wikipedia
[13:15:56] but I need to debug the fuck out of config for it to happen
[13:17:54] dafuqdafuqdafuq
[13:20:12] PROBLEM - NTP on cp1041 is CRITICAL: NTP CRITICAL: Offset unknown
[13:24:00] !log maxsem synchronized wmf-config/mobile.php
[13:24:07] Logged the message, Master
[13:25:12] RECOVERY - NTP on cp1041 is OK: NTP OK: Offset 0.004767894745 secs
[13:43:10] hashar, so MobileFrontend is now mergeable only by jenkins? looks a bit premature taking into account its recent woes
[13:43:23] MaxSem: what?
[13:43:33] uh
[13:43:34] I have not changed anything iirc
[13:43:44] it has a dependency
[13:44:02] not because Jenkins is required. stupid me
[13:53:08] I am tired of those debian packages
[13:53:20] now it does not find the source and creates an empty package
[13:53:21] :/
[13:53:23] I am doomed
[13:54:28] New patchset: Ottomata; "Ensuring group file_mover is present for accounts::file_mover." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59826
[13:55:09] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59826
[14:01:58] New patchset: Ottomata; "Not specifying numeric gid for file_mover group." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59829
[14:02:15] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59829
[14:02:21] that account should really not be in that file, it's only meant for humans ;)
[14:02:59] file_mover? yeah probably, i was just trying to touch as little as possible for this
[14:03:55] !log maxsem synchronized php-1.22wmf2/extensions/MobileFrontend/includes/MobileContext.php 'https://gerrit.wikimedia.org/r/#/c/59818/'
[14:04:02] Logged the message, Master
[14:05:58] !log maxsem synchronized php-1.22wmf1/extensions/MobileFrontend/includes/MobileContext.php 'https://gerrit.wikimedia.org/r/#/c/59818/'
[14:06:05] Logged the message, Master
[14:08:35] allrighty - I'm in a playful mood
[14:08:59] mark, what do you think about flipping caching for more wikis?
[14:09:05] i'm all for it
[14:09:18] let's pick a victim, then:P
[14:09:24] nl.wikipedia
[14:10:55] MaxSem: I have updated the beta varnish instance with the latest puppet manifests
[14:11:32] hashar, woohoo
[14:11:54] New patchset: MaxSem; "Enable $wgMFVaryResources on nlwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59830
[14:12:01] it took about a minute before my varnishtop found a Vary header with X-Wap in it
[14:12:08] so clearly m.mediawiki.org isn't very popular ;)
[14:12:30] yeah
[14:12:31] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59830
[14:12:34] lemme change that
[14:12:39] :)
[14:14:01] there we go :)
[14:14:15] !log maxsem synchronized wmf-config/InitialiseSettings.php
[14:14:22] Logged the message, Master
[14:16:15] much better now
[14:16:43] \o/
[14:17:24] much better as in...?
[14:17:28] much more X-WAP Vary headers
[14:17:44] http://p.defau.lt/?i6EECtHuSS5uBJ3YqQ4phA
[14:17:50] that's over 60s
[14:18:12] before your change, the X-WAP line had like 2-3 hits
[14:20:38] hmm, mw1049-52 look like they're out of sync
[14:21:11] New patchset: Demon; "Update the ldap scripts to pep8 compliant" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/53476
[14:21:19] are they excluded from dsh or something?
[14:21:30] let's see
[14:21:55] mark, mw1149-52
[14:21:56] no they're in there
[14:21:58] my bad
[14:21:59] ah
[14:22:06] those too
[14:22:21] why do you think they're out of sync?
[14:22:54] I see messages in debug log related to the bug I fixed and deployed a fix recently
[14:23:01] from these 4 hosts
[14:23:16] trying again
[14:24:28] !log maxsem synchronized php-1.22wmf2/extensions/MobileFrontend/includes/MobileContext.php 'https://gerrit.wikimedia.org/r/#/c/59818/'
[14:24:35] Logged the message, Master
[14:25:34] New patchset: Aklapper; "Try again to list urgent issues in Bugzilla Weekly Report email. Now that I've visited the Logic School again..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59831
[14:25:54] !log maxsem synchronized php-1.22wmf1/extensions/MobileFrontend/includes/MobileContext.php 'https://gerrit.wikimedia.org/r/#/c/59818/'
[14:26:01] Logged the message, Master
[14:26:36] why does MF vary on accept-language again?
[14:28:14] does it? I see only Accept-Encoding
[14:28:31] check the link I just gave
[14:29:54] it's mot MF
[14:29:56] *not
[14:30:13] hmm ok
[14:30:13] it's MW, related to language variants
[14:31:02] * bug 21672: Add Accept-Language to Vary and XVO headers
[14:31:02] * if there's no 'variant' parameter existed in GET.
[14:31:40] https://bugzilla.wikimedia.org/show_bug.cgi?id=21672
[14:31:42] without normalization, having AL in vary sucks
[14:31:53] mark got a minute to flip netapp replication?
[14:31:58] yes
[14:32:01] it'll need more than a minute ;)
[14:32:09] 7!
[14:32:17] ok!
[14:32:35] whenever you're ready. fr knows we're breaking things today
[14:32:37] let me know when you're ready; i'll have to make the source volume read-only
[14:32:39] ok
[14:33:04] stopping the script on locke...
[14:35:15] Jeff_Green: can I proceed?
[14:36:40] mark: yes
[14:36:46] ok
[14:48:52] hm, hey guys, i'm looking at setting up https for metrics.wikimedia.org, curious as to the way I should do it
[14:49:02] when i've done it in the past, i usually set the https vhost up as a proxy
[14:49:23] hmm
[14:49:26] something is confused
[14:49:44] i'm looking at wikitech's setup, it looks like http just redirects to https
[14:49:55] and the main vhost configs are in the https vhost
[14:50:04] that how I shoudl do this?
[14:56:40] <^demon> ottomata: Sounds pretty standard, we do the same on gerrit.wm.o
[15:01:31] Jeff_Green: i think it's ok now
[15:01:40] nas1001-a is now read-write, nas1-a is read-only
[15:01:50] and any changes on nas1-a after we started have been lost
[15:02:42] ok cool
[15:02:59] i would think it's possible to set a volume read-write to make sure there aren't any changes
[15:03:03] i'll try mounting it r/w from the new udp2log host
[15:03:05] but it didn't seem to work that way
[15:03:20] there's only one process that writes at all, and I stopped it before you started
[15:03:33] yeah but I don't like it principally
[15:03:44] plus it'll self-heal b/c it's an rsync job
[15:07:58] i don't fully trust it yet
[15:08:03] it's currently syncing and has a lag of 3 minutes
[15:09:19] it's in sync now
[15:09:21] I guess it's ok
[15:10:10] mark locke has gone r/o as expected, but I still can't write from gadolinium after remounting
[15:10:26] !log Reversed replication of volume fr_archive, now nas1001-a:fr_archive -> nas1-a:fr_archive
[15:10:29] checking
[15:10:33] Logged the message, Master
[15:11:11] ah that's right
[15:11:17] /vol/fr_archive-sec=sys,ro=208.80.154.6:10.64.21.103:208.80.154.15:208.80.154.73
[15:11:23] the NFS rights set it to ro as well
[15:11:28] oic
[15:11:36] should all of these be able to write?
[15:11:59] the eqiad ones at least
[15:12:07] yeah but all of them?
[15:12:16] the way I'm using it is just to write from the udp2log collection hosts
[15:12:30] looking at the list now
[15:13:29] I'm not sure what the second one is for
[15:13:38] 208.80.154.6:208.80.154.15:208.80.154.73
[15:13:40] i can leave it ro and rw all the others
[15:13:52] sounds good
[15:14:06] oxygen is there as a tepid standby for gadolinium
[15:14:38] ok, try now
[15:14:47] works
[15:14:50] thanks!
[15:14:51] \o/
[15:16:02] it makes me cry that we don't use uid/gid consistently, i hate the idea of resorting to nfs trickery
[15:16:29] where you use nfs, you should use uid/gid consistently
[15:16:38] yes
[15:17:05] but today involves redoing uid/gid by hand in some cases
[15:17:06] which is of course harder if the uid used is already used for something else on other systems
[15:17:11] yeah I guess
[15:17:13] exactly
[15:17:34] so far I've managed to redo things by hand when necessary, just starting to look at gadolinium vs the world now
[15:19:28] today it's a fight between groups ganglia (system installed?) vs file_mover (puppet installed)
[15:20:58] group 999 is a free-for-all, it's a different group name on each of the 3 hosts I've looked at
[15:21:06] jeff green
[15:21:19] is there still aproblem there?
[15:21:24] do the uids have to match?
[15:21:26] sorry, gids?
[15:21:37] i was looking at that this morning
[15:22:05] if they do I can rely on them to limit read privs on the consumer hosts
[15:22:34] once upon a time i standardized it by hand on the fundraising hosts to match locke
[15:23:14] hm
[15:23:38] hm, welp, i guess fix by hand on gadolinium?
[15:23:39] I'm going to get the script working at all and then come back to this
[15:23:40] that's fine
[15:23:50] i just added the file_mover group inpuppet
[15:24:01] but I had to remove the explicit gid => 999 because of the ganglia conflict on gadolinium
[15:24:04] if you fix it by hand
[15:24:07] you can re add the explicit gid
[15:24:13] except it'll conflict elsewhere
[15:25:41] hm, yaeh amybe
[15:25:53] i think there are only a few places where accounts::file_mover is included
[15:25:57] wouldnt' be hard to fix those by hands too
[15:26:06] i think 3 nods…pdfx...?
[15:26:44] frack too--lemme look at those
[15:26:53] they're standardized, probably on gid=999
[15:27:48] nope, apparently I don't use it there at all
[15:28:14] so yeah, why don't we pick a new unconflicting uid/gid for file_mover and I'll go back and fix by hand any host where it disagrees
[15:28:37] oh hmm, ok, since locke is being deprecated anyway
[15:28:41] can we use whatever it is on gadolinium right now?
[15:28:56] 1002
[15:28:57] ?
[15:29:00] hmm, i bet that conflicts in places
[15:29:12] yeah just pick something
[15:29:13] i geuss
[15:29:50] hashar, you there?
[15:29:56] i'm tryign to use expanderb.rb
[15:29:58] getting an error
[15:30:03] ohh
[15:30:06] that is unfortunate
[15:30:09] "wrong number of arguments (0 for 1)"
[15:30:10] i think because
[15:30:13] p template.result(get_values)
[15:30:20] and get_values takes an arg
[15:30:20] def get_values(key)
[15:30:49] ottomata: how about 10003
[15:31:04] sure, no pref
[15:31:53] maybe that's the range samba uses
[15:32:08] lemme see if I can find any mention of a standard for ubuntu
[15:34:32] ottomata: yeah the script might be broken
[15:34:48] ottomata: if you get an example fill a bug / drop me an email and I will fix it :)
[15:35:38] ok, i might have a fix…but i'm not sure how it ever worked, right? since your method takes an arg and you are not passing one
[15:35:46] it did work
[15:35:51] but maybe I have hacked it before submitting hehe
[15:35:59] ottomata: looks like ubuntu uses 100-999 for dynamically allocated system groups, and 1000-29999 for dynamically allocated regular groups
[15:36:01] anyway must rush out
[15:36:02] sorry
[15:37:26] let's make file_mover 30001.30001 :-)
[15:38:43] !log authdns update
[15:38:47] ok
[15:38:50] Logged the message, Master
[15:48:47] ottomata: are you doing the puppet change or shall I?
[15:55:45] hmm, you wann?
[15:55:53] sure
[15:55:53] i can, but i'm into something else astm
[15:55:53] atm
[15:58:25] New patchset: Jgreen; "assign uid/gid 30001/30001 to file_mover, switch banner logrotation user" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59835
[15:58:53] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59835
[16:00:31] New patchset: Lwelling; "Set up email addresses for sending notifications from Echo for enwiki and mediawiki.org" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59717
[16:01:59] New review: Lwelling; "(1 comment)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59717
[16:09:50] !log mlitn synchronized php-1.22wmf1/extensions/ArticleFeedbackv5 'Update ArticleFeedbackv5 to master'
[16:09:56] Logged the message, Master
[16:10:08] !log mlitn synchronized php-1.22wmf2/extensions/ArticleFeedbackv5 'Update ArticleFeedbackv5 to master'
[16:10:16] Logged the message, Master
[16:11:13] New patchset: ArielGlenn; "new pub key for tfinc" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59837
[16:14:32] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59837
[16:17:53] New patchset: Jgreen; "puppetize user file_mover on aluminium/grosley" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59838
[16:18:22] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59838
[16:19:06] New patchset: Ottomata; "Fixing a bug in expanderb.rb. Also avoiding syntax error on -%> closing tags." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59839
[16:19:07] hashar: ^
[16:20:30] New patchset: Ottomata; "Fixing a bug in expanderb.rb. Also avoiding syntax error on -%> closing tags." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59839
[16:20:57] New patchset: Ottomata; "Fixing a bug in expanderb.rb. Also avoiding syntax error on -%> closing tags." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59839
[16:21:06] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59839
[16:29:01] New patchset: ArielGlenn; "mysql client on hume and terbium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840
[16:29:46] New patchset: Lwelling; "Set up email addresses for sending notifications from Echo for enwiki and mediawiki.org" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59717
[16:30:53] PROBLEM - Puppet freshness on cp3003 is CRITICAL: No successful Puppet run in the last 10 hours
[16:31:56] mutante, you around?
[16:35:47] New patchset: RobH; "adding wikidev group to antimony" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59841
[16:36:44] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59625
[16:38:41] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59841
[16:39:43] bleh
[16:39:49] someone has unmerged changes on sockpuppet
[16:41:21] sec
[16:41:40] what change is it ( robh )?
[16:42:03] expanderb.rb file [16:42:08] ah notme [16:42:12] notme [16:42:13] i merged cuz if its on sockpuppet already then it should be ok [16:42:26] i dont want to stop my work cuz someone just +2 and didnt finish the stuff ;] [16:42:51] urgh, it still didnt copy over chad's account to antimony [16:42:54] WTF is missing. [16:42:55] ? [16:43:34] Anyone want to glance at the entries for antimony.wikimedia.org? [16:43:55] I am not sure why, but it doesn't appear to be syncing over chad's account, when afaik it should [16:44:02] i added the wikidev group, but still no dice. [16:45:51] robh, I will look if you tell me whether we are still using sync-file to push out single files from wmf-config or whether we have some fancy git-deploy or something [16:46:09] sync-file still, git deploy stuff was killed [16:46:12] k [16:46:13] its all in git instead of svn [16:46:21] but once you git update on fenari, its old sync method [16:46:32] unless its changed since last week. [16:46:36] !log ariel synchronized wmf-config/throttle.php [16:46:42] Logged the message, Master [16:46:47] I would guess it hasn't [16:46:47] thanks [16:46:55] quite welcome [16:46:58] now I gotta see whether we update the memcache key the same way [16:47:05] anyways, tell me what to look at here [16:47:19] for chad's acct [16:48:04] So backstory, antimony will be new gerrit webfacing review thing [16:48:18] but we cannot just include all that crap yet, it has to be moved a bit at a time [16:48:27] so we cannot just copy the existing server entry that does this [16:48:32] (manganese) [16:48:46] we just want the standard puppet run + chad's account with sudo added [16:48:55] gotcha [16:49:00] So I included the account (well, chad did) then I added the wikidev this morning [16:49:05] ok [16:50:01] So I am kind of at a loss as to why the user account isn't being imported. [16:50:33] ok so the whole account doesn't show up you are saying? 
[16:51:42] yep [16:52:16] ok [16:53:52] so his account is there, I see it in /etc/passwd etc, but the home dir is a bit odd [16:54:03] so lemme check the entry [16:56:48] oh heh I can't actually spell [16:56:50] so that's a problem [16:57:33] !log reedy synchronized php-1.22wmf1/extensions/Wikibase '5d496b4ea2c8a68b9e1280a282530f7ce5232b68' [16:57:40] Logged the message, Master [17:01:46] New review: Pyoungmeister; "in the process of depricating mysql_wmf. instead, please use something like:" [operations/puppet] (production) C: -1; - https://gerrit.wikimedia.org/r/59840 [17:04:25] apergos: sorry about that, sister called me [17:04:28] got pulled away [17:04:31] back now [17:04:36] that's fine. I'm just poking a bit [17:04:48] so i added the wikidev group [17:04:54] and i also dont see it in /etc/group [17:05:26] Reedy: https://bugzilla.wikimedia.org/show_bug.cgi?id=47354 hrm [17:05:26] yes, I noticed that [17:05:29] more of those [17:05:32] !log reedy synchronized php-1.22wmf2/extensions/ [17:05:40] Logged the message, Master [17:05:40] \o/ [17:05:44] I think all wikis when reviewable templates should have reviewable modules [17:05:46] apergos: ohh, demon, daemon, was wondering what you meant, heh [17:05:53] ah yeah :-D [17:05:54] it's a matter of config [17:07:21] antimony.wikimedia.org [17:07:39] it has a public ip [17:07:46] the puppet stanza is antimony.eqiad [17:07:52] RobH: ^^ [17:08:01] .... [17:08:03] HAHAHA [17:08:07] fuck me. [17:08:08] yw :-) [17:08:11] i let someone else add that! [17:08:15] but yea, i should have caught it [17:08:17] awesome [17:08:22] ^demon|brb: ^ [17:08:29] the rest is left as an exercise to the reader... 
[17:08:39] yea, cool, thx dude [17:08:44] ^_^ [17:09:22] New review: Ottomata; "(2 comments)" [operations/debs/kafka] (master) - https://gerrit.wikimedia.org/r/53170 [17:09:34] New patchset: RobH; "antimony.wikimedia.org not antimony.eqiad.wmnet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59843 [17:09:41] fuckin a. [17:09:49] <^demon|brb> Oh man, that's my fault too. [17:09:50] i feel stupid [17:09:52] <^demon|brb> We all missed that. [17:09:56] i looked at it like a billion times [17:10:21] <^demon|brb> Ok, lunchtime. [17:10:33] i s there a notpeter in the house? [17:10:39] no [17:10:40] meh [17:10:43] dunno, im wfh today [17:10:46] laundry day. [17:11:07] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59843 [17:13:21] * Aaron|home volunteers Reedy [17:13:22] ^demon|brb: so i just watched it make your account [17:13:25] yer good to go [17:15:06] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki to 1.22wmf2 [17:15:14] Logged the message, Master [17:17:11] New patchset: Aude; "Update Wikibase settings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59844 [17:17:12] New review: ArielGlenn; "Are you saying that I should create a class for terbium and hume that incudes this package? (If so, ..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840 [17:17:22] ottomata: puppet camp in 6 days. coming? [17:17:51] ha in nyc? [17:18:57] jeremyb_: the paid puppet camp or conference? [17:19:07] i did the puppet week class, i think it was called puppet camp... [17:19:19] was very nonopensource centric version [17:19:28] 'you use site.pp, you must be an open source client' [17:19:34] what did you expect? 
:) [17:19:36] 'we dont really touch that stuff in this class' [17:20:03] i expected they would have made the commercial version just a fancy shell over the open source version [17:20:11] but its a bit of a different beast [17:20:32] I don't really see the point of trainings [17:20:39] esp. hugely expensive trainings such as these [17:20:51] heyaoooo, so i'm working https://rt.wikimedia.org/Ticket/Display.html?id=4912 [17:20:53] some folks learn better in a guided format than on self study [17:20:59] mutante added the .pem file [17:21:05] so i wont say i didnt learn anything from the class, i did [17:21:06] but install_certificate also expects a .key file [17:21:14] but its only about 40% on topic for us =[ [17:21:27] ottomata: the key is in the private repo [17:21:42] (if you have worked in puppet for 6 months in ops, you have about the same level of knowledge the class attempts to impart [17:21:44] ) [17:21:56] ohhhh ohh i see [17:22:01] makes sense [17:22:11] ottomata: it should just work with install_cert [17:22:15] ok cool [17:22:15] yea [17:22:19] :) [17:22:19] but yea, the issue is unless one person we know has taken course [17:22:19] i'm testing elsewhere i see that now [17:22:20] hm, ok [17:22:21] then we have no idea [17:22:29] so, if i use install_certificate [17:22:37] and then set SSLCertificateFile and SSLCertificateKeyFile [17:22:39] to the proper paths [17:22:41] that should do it? [17:22:44] no [17:22:52] chain? [17:22:53] also should add chain [17:23:01] or CACertificatePath :-) [17:23:11] yea, i was wondering, so which one:) [17:23:17] I prefer the path [17:23:23] New review: Mwalker; "Brion has given his plus 1; I'll deploy these at 4 in the LD window." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/59589 [17:23:30] more future proof, if you change the CA [17:23:46] ottomata: we probably need to fix this in a whole bunch of other places.. fwiw.. 
[17:23:46] lol mwalker [17:23:52] yeah i'm working off of wikitech as example [17:24:43] odder: yep; it makes a bit more sense if you look at the patch and realize that brion's +1 got lost a couple topic renames/commit message edits down [17:25:12] mwalker: I meant the bug – there are around 4 logos suggested, and you decided on this one all of a sudden [17:25:33] ottomata: so as paravoid says, set key and cert as you intended but then additionally add http://httpd.apache.org/docs/2.2/mod/mod_ssl.html#sslcacertificatepath [17:25:52] oh; Isarra ^ I assume you know about this? [17:26:02] I'm pretty much just being a deploy lackey about this [17:26:12] What? [17:26:40] odder says that there's some discussion about these logos? [17:27:23] ok, set to what? /etc/ssl/certs? [17:27:52] paravoid: is that ok? remembering the issue with using that path [17:28:01] it is [17:28:13] ok, cool, yes otto [17:28:27] ok cool [17:29:34] There was some discussion, but there wasn't any consensus for changing the favicon entirely, so this is just an update on the current version to clean it up. [17:31:09] Basically I made some design decisions to improve the aesthetics and make the favicon work cross-platform. [17:31:23] If you think it should be changed entirely, perhaps a discussion on-wiki would be better? [17:40:58] New patchset: Ottomata; "Setting up HTTPS for metrics.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59851 [17:41:39] New patchset: Ottomata; "Setting up HTTPS for metrics.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59851 [17:41:44] New patchset: Jgreen; "enable banner log rotation on gadolinium" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59852 [17:42:02] paravoid, while we are both on, wanna talk about kafka.sh? 
[17:42:20] I'm working on other things atm, but we can at least discuss so I can work on it later [17:42:22] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59852 [17:42:43] New patchset: Ottomata; "Setting up HTTPS for metrics.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59851 [17:42:46] New review: ArielGlenn; "yes it sure is. let me just do this right now." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52042 [17:43:26] New patchset: ArielGlenn; "harmon into production" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52042 [17:43:40] I really shouldn't... :) [17:43:53] hah, shouldn't what? discuss that with me? :) [17:44:00] I'm behind on my own projects, I really should invest more time in them... [17:44:14] dawwww but you are puppet + debianization reviewer guy! [17:44:18] you are the man with the stamp! [17:44:29] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/52042 [17:44:33] which is why I'm behind on my own projects I guess [17:44:43] ha yup [17:45:09] can you give me your overview beef? [17:45:11] what's the prob? [17:45:22] also, mutante, can you look this over for me: [17:45:23] https://gerrit.wikimedia.org/r/#/c/59851/3/templates/apache/sites/metrics.wikimedia.org.erb [17:45:27] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [17:46:39] New review: Mwalker; "Ok; has 16x16 and 32x32 and seems to work fine locally -- so I'll push this along with 59589 at 4 in..." 
[operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/59809 [17:50:01] !log authdns0update for git.w.o [17:50:07] Logged the message, RobH [17:50:16] ^demon: ^ done [17:50:29] <^demon> <3 [17:52:20] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59851 [17:54:17] PROBLEM - Host rdb1 is DOWN: PING CRITICAL - Packet loss = 100% [17:55:44] New patchset: Ori.livneh; "Parametrize mwerror port" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59854 [17:56:31] !log reedy Started syncing Wikimedia installation... : rebuild l10n cache for wikidata deploy [17:56:37] Logged the message, Master [18:07:06] New patchset: Ori.livneh; "Parametrize IPython notebook profile configuration" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59782 [18:07:15] New patchset: Demon; "Install apache & setup replication destination for repositories" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [18:07:58] New patchset: Demon; "Install apache & setup replication destination for repositories" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [18:08:27] upgraded his browser > 10 versions at a time.. 
Iceweasel 22.0a2 (2013-04-12) heh "aurora" ftw [18:08:41] so far it didnt crash [18:09:24] New patchset: Demon; "Install apache & setup replication destination for repositories" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [18:10:36] New patchset: Demon; "Install apache & setup replication destination for repositories" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [18:11:29] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59782 [18:15:25] New patchset: Lcarr; "gerrit-wm: Sends translatewiki events to #mediawiki-i18n" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/57302 [18:16:16] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/57302 [18:16:23] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59717 [18:16:32] New patchset: Lcarr; "zuul: support cloning from a different branch" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58737 [18:16:33] New patchset: Ori.livneh; "Fully quality ipython path" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59863 [18:18:46] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58737 [18:18:50] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59863 [18:19:01] New patchset: Lcarr; "zuul: in labs use the `labs` branch to install Zuul" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58738 [18:20:00] New patchset: Ori.livneh; "Parametrize mwerror port" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59854 [18:20:55] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58738 [18:21:07] ha, uh LeslieCarr, i just merged that on sockpuppet [18:21:09] hope that's ok [18:21:28] i merged 58737 [18:22:18] New patchset: Lcarr; "zuul: no fetch from pypi and drop 
statsd dependency" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58827 [18:22:21] PROBLEM - Host rdb2 is DOWN: PING CRITICAL - Packet loss = 100% [18:22:39] ottomata: that's ok [18:22:52] i went to go merge them and was like "why did only one change show up?" [18:22:53] hehehe [18:23:09] i'm going on a hashar reviewing roll [18:24:40] New review: Lcarr; "In the future I would like for this to not use any of the scripts and be installed by a package. Ho..." [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/58827 [18:24:40] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58827 [18:26:29] hashar: i sent a mail to ops list regarding 4 BZ's related to gallium and installing or upgrading packages [18:26:51] after andre_ pointed me to them in our bug triaging session [18:27:47] !log reedy Finished syncing Wikimedia installation... : rebuild l10n cache for wikidata deploy [18:27:54] Logged the message, Master [18:28:48] mutante: ah thanks [18:29:41] mutante: you are awesome [18:29:46] andree too [18:30:50] New patchset: Pyoungmeister; "labsdb: don't repl private wikis" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59868 [18:31:11] mutante: might reply after dinner. Else that will be tomorrow [18:31:13] dinner time [18:31:33] sure, that's why i used mail instead of bugging on IRC [18:32:01] what is the procedure to get an rt account? [18:32:20] email rt [18:32:33] rt@wikimedia.org? [18:32:52] yeah, then it secretly creates you an account, then you can "request password reset" or whatever on the webview [18:32:58] sbernardin: yes they are raided..so reboot...ctrl-r ...then f2 and clear raid config. dban should work after that [18:33:02] ah, ok [18:33:07] gwicke: just ask to get a "full" (engineering) one, getting a limited one is happening automatically when you mail it [18:33:19] gwicke: yeah, it's weird. And yeah, then what mutante said. 
[18:33:20] mail ops-requests@rt [18:34:25] rt@wikimedia.org did not work [18:34:50] ops-requests@rt.wikimedia.org? [18:35:01] correct [18:35:05] ok [18:38:50] New patchset: Reedy; "wikidatawiki to 1.22wmf2" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59870 [18:39:05] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59870 [18:39:34] New patchset: Pyoungmeister; "labsdb: don't repl private wikis" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59868 [18:42:30] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59868 [18:45:35] mutante: did not get a delivery failure this time, but no reply either. Password reset requests also don't do anything yet. [18:46:58] gwicke: it arrived just fine, you'll get email on replies and resolve. i'm taking it now [18:47:00] New review: Brion VIBBER; "Agreed, looks fine to me. Push it! :)" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/59809 [18:47:47] mutante: ah, it is a manual process. Thanks for looking into it! [18:50:18] yea, it's an access request to get membership in "engineering" group, but you're auto-confirmed by walking over to your desk:) [18:55:07] New patchset: Demon; "Initial puppetization for git.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [18:55:36] <^demon> Ok, I think that's ready now if someone's got time for a review. [18:57:15] !log aaron synchronized php-1.22wmf2/includes/job/jobs 'deployed 1e3eaed6fbe7fd91c15399cc17eac8602b34c9cd' [18:57:22] Logged the message, Master [18:58:07] gwicke: you should have received mail. your login is now simply "gwicke" as opposed to your full email address before [18:58:20] mutante: yes, thanks! 
[18:58:30] am now trying to retrieve the password through the reset interface [18:58:33] "forgot password" feature should work before and after, the difference is just the username format [18:59:17] sees the request in logs, it sends you a "reset link" [18:59:39] got yet another tiny fix for anyone: https://gerrit.wikimedia.org/r/#/c/59807/ :D I was passing an invalid param to systemuser {} [18:59:42] paravoid: https://gerrit.wikimedia.org/r/#/c/59857/5 [18:59:52] paravoid: should we get a new ssl cert for this? [18:59:59] mutante: did not receive any reset link yet [19:00:10] I know we're trying to move away from things using the star cert [19:00:33] ^demon: put in a request for a new cert [19:00:48] but it's still on the same boxes, right? [19:00:59] different box I believe [19:01:02] oh, apparently not [19:01:04] yeah, antimony [19:01:07] I wonder why [19:01:21] <^demon> We're moving the git viewing aspect of git off of manganese and onto its own box. [19:01:24] to move the load of git viewing away from gerrit [19:01:58] <^demon> I can file an RT for a new cert, no problem. I just copy+pasted what I did before :) [19:02:06] wasn't gitblit supposed to solve the load problems? [19:02:25] <^demon> Well, gitblit running as a gerrit plugin still keeps the load on gerrit. [19:02:33] New patchset: Bsitu; "Switch testwiki to use extension1 db for Echo" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59878 [19:02:35] <^demon> We're going to run gitblit as a standalone service. [19:02:45] okay [19:03:10] <^demon> gitblit as a gerrit plugin also has a dozen other problems I'm sick of dealing with ;-) [19:03:15] then the (not so important) question is, why is it under gerrit.pp / gerrit::, role::gerrit:: etc.? :) [19:03:25] different box, different hostname, different setup [19:03:29] doesn't seem very gerrit-related to me [19:03:39] <^demon> I suppose it's not. 
[19:03:56] <^demon> It just seemed like the logical place to put it since it's part of the git/gerrit infrastructure. [19:06:42] <^demon> paravoid: I can split it off to its own files if you'd prefer. [19:07:04] when we finally get the NFS server up from labs I'd like to replicate all repos there, btw [19:07:14] Coren: ^^ [19:07:17] <^demon> Easy to do. [19:07:19] into a global read only share [19:07:20] doesn't matter that much, but since you went all the way to split it... [19:07:27] oh [19:07:32] Ryan_Lane: that sounds like an excellent idea [19:07:35] yep [19:07:40] I wasn't willing to do that on gluster :) [19:07:55] lol [19:07:55] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59878 [19:08:16] ah. right. Coren isn't around today [19:08:47] RT mails out password reset links just fine, so far all requests regarding that have been explained by local spam filters [19:08:53] <^demon> Ryan_Lane: Should be able to just install role::gerrit::production::replicationdest and then add it to gerrit's list of places to replicate to. [19:10:03] New review: Lcarr; "What about class generic::mysql::packages::client ?" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840 [19:10:44] ^demon: yep [19:11:19] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59807 [19:11:35] chuckles seeing "Death and/or desctruction" as a commit message title with Topic "holy-crap" [19:13:23] !log bsitu synchronized wmf-config/InitialiseSettings.php 'Echo email config + db cluster update to testwiki' [19:13:30] Logged the message, Master [19:13:51] <^demon> paravoid: But yeah, between the new repo packing, the extra 1G we allocated to the JVM, moving repo browsing off manganese, moved connection throttling to Apache and out of Jetty...we should be a lot more stable and faster. 
[19:13:58] <^demon> This is my "make gerrit less slow" plan :) [19:14:01] !log bsitu synchronized wmf-config/CommonSettings.php 'Echo email config + db cluster update to testwiki' [19:14:07] Logged the message, Master [19:14:26] cool stuff [19:14:58] <^demon> Also, the gitblit mirror will act as a read-only git mirror...so people who are just wanting to watch the code (and not contribute) can pull from that. [19:14:59] <^demon> :) [19:18:42] <^demon> So, new cert for sure? [19:18:45] Is there any reason (for public repos) we can't just make apache serve up the git directory directly? [19:20:15] <^demon> I don't see why we should. It's available over https already. [19:20:33] <^demon> And we'll be allowing a second https mirror soon. [19:20:39] <^demon> Why have a third https way? [19:20:52] speeeeeed [19:20:57] Though I did just say apache so lol [19:21:25] <^demon> I doubt it'd be all that much faster. [19:22:20] New review: Dzahn; "missing braces" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/59831 [19:22:21] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59831 [19:22:36] New review: Dzahn; "it's just getting the number of bugs" [operations/puppet] (production) C: 2; - https://gerrit.wikimedia.org/r/59656 [19:22:45] PROBLEM - Puppet freshness on ms-fe3001 is CRITICAL: No successful Puppet run in the last 10 hours [19:23:34] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59656 [19:25:21] <^demon> paravoid: So, new cert for sure? [19:25:50] I'd really like to [19:26:06] <^demon> No problem from my end, will file in RT. [19:26:10] I'm not sure what's going on with pricing now and if it's going to get approved [19:26:21] but better file it in RT, and RobH will figure out the details I guess :) [19:26:23] just started a scap [19:27:45] what do we need a new cert for? 
[19:27:55] RobH/cmjohnson1 got the host moved yet for https://rt.wikimedia.org/Ticket/Display.html?id=4893 ? [19:27:55] (if its not a wildcard, they arent that expensive) [19:28:06] (and about to go to lunch, no quick response needed) [19:28:17] how much isn't "not that expensive"? :) [19:28:36] LeslieCarr: Ack, it was but I never updated the ticket, lemme confirm its what i tghink [19:28:43] and it should be caesium is yours [19:28:50] cool [19:28:51] :) [19:28:54] ok, nom's [19:28:55] which is in row c, lemme confirm in old tickets [19:29:15] yep, it is [19:29:39] ^demon: thanks! [19:29:47] <^demon> yw. [19:30:12] <^demon> I figured while we're at it we can get one for gerrit.wm.o and stop using the wildcard there too, so I filed the second ticket. [19:30:39] paravoid: 49/86/112/159 for 1/2/3/4 yr [19:30:42] gerrit.wikimediaplatformengineering.org :D [19:30:54] or we could use some *unicorn* domain [19:31:00] that isnt a discount for us, its just rapidssl default rates [19:31:04] so i dont mind saying in public area [19:31:05] <^demon> I should've kept that wiki-cloud.net domain and not let it go to harvesters. [19:31:14] <^demon> Then I could make everyone use git in the "wiki cloud" [19:31:17] ie: if anyone knows a reputable place thats cheaper and US based then let me know [19:31:23] <^demon> And make everyone cringe every single time. [19:31:46] (if its non US based, the user agreement may not work, but can have legal review) [19:32:59] ^demon: drop a ticket in ops-requests and assign to me if you need a cert [19:33:04] he already did [19:33:09] heh nm then [19:33:22] <^demon> #4975/4976 [19:33:35] RobH: so, the consensus is that we should buy a lot more certificates instead of reusing the wildcards everywhere, for security reasons [19:33:37] RobH: ottomata: neither paid camp nor conf, free in nyc. [19:33:47] ^demon: these will live on different boxes? (the two urls?) [19:33:52] <^demon> Yep. 
[19:33:57] so you should eventually expect many more such orders, plus renewals when the time comes [19:33:59] ok, so its two certs then [19:34:04] <^demon> gerrit.wm.o -> manganese, git.wm.o -> antimony [19:34:07] with two different keys, just fyi [19:34:12] feel free to do market survey for cheaper/easily manageable [19:34:44] hrmm, i can ask the folks we have the wildcards via [19:34:50] they may be able to offer us a much cheaper rate [19:34:52] i dunno. [19:35:08] ^demon: can this afford to sit for a day or two? [19:35:18] cuz we have some domains (main wildcards) via digicert [19:35:31] but our digicert stuff is contract, not CC billed, so would have to get the pricing [19:35:34] and get added to contract [19:35:35] RobH: ottomata: last one i went to of the same variety was mostly talks with people describing how they do things at their own companies [19:35:51] that sounds cooler than the training by a long shot. [19:36:17] <^demon> RobH: Yeah, there's no rush on this. [19:36:26] <^demon> Couple of days is no big deal. [19:36:29] hrmm, i dont think it'll be cheaper though [19:36:41] the default pricing at digicert (public on website) is 156 a year [19:36:57] but i'll ask. [19:44:59] New patchset: Reedy; "Update Wikibase settings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59844 [19:48:59] New patchset: Demon; "Initial puppetization for git.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [19:50:06] New patchset: Demon; "Initial puppetization for git.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [19:50:36] New patchset: Demon; "Initial puppetization for git.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [19:51:46] !log kaldari Started syncing Wikimedia installation... 
: [19:51:52] Logged the message, Master [19:52:14] New patchset: Demon; "Initial puppetization for git.wikimedia.org" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59857 [19:56:15] New patchset: Reedy; "wgMaxImageArea and wgMaxAnimatedGifArea to 50MP" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59925 [19:56:30] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59925 [19:57:23] !log reedy synchronized wmf-config/InitialiseSettings.php 'wgMaxImageArea and wgMaxAnimatedGifArea to 50MP' [19:57:26] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:57:30] Logged the message, Master [19:58:16] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time [20:04:00] Isarra: Do you have any idea where the Wikipedia Apple Touch icon comes from? [20:04:43] Isarra: There's been a bug about the en.wiktionary one; we can use the opportunity and update the lot. [20:05:16] odder: wgAppleTouchIcon [20:05:21] 'wiki' => '//$lang.$site.org/apple-touch-icon.png', [20:05:38] https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php [20:06:00] Reedy: I know where this is located, but I have no idea where the file itself comes from :) [20:06:13] Oh, as in, who made it? [20:06:16] There is a 114 x 114 px version, but Retina displays use a 144 x 144 one [20:06:28] Wondering if there is an SVG or a bigger PNG, or something. [20:06:44] Brion would probably be the best person to ask.. 
[20:06:53] New patchset: Demon; "Begin replicating all repositories to antimony" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59927 [20:08:52] New patchset: Demon; "Remove old github detection hack, that's what remoteNameStyle is for" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59928 [20:12:57] PROBLEM - Puppet freshness on gallium is CRITICAL: No successful Puppet run in the last 10 hours [20:14:34] !log kaldari Finished syncing Wikimedia installation... : [20:14:42] Logged the message, Master [20:15:22] odder, Reedy, the apple-touch-icon was made by Munaf [20:15:37] ? [20:17:00] New patchset: Demon; "Remove old github detection hack, that's what remoteNameStyle is for" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59928 [20:17:01] New patchset: Demon; "Begin replicating all repositories to antimony" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59927 [20:19:07] <^demon> Ok, other than the new cert, all the initial puppet work should be done now. [20:19:08] MaxSem: Never heard of; are there any possible places to look for the original file that you know of? [20:19:21] ask the author;) [20:19:49] he's currently in #wikimedia-mobile [20:19:53] oh nice. 
[20:19:54] nick munaf [20:34:12] New patchset: Reedy; "Update Wikibase settings" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59844 [20:34:19] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59844 [20:35:46] !log reedy synchronized wmf-config/ [20:35:53] Logged the message, Master [20:43:07] !log mflaschen synchronized php-1.22wmf1/extensions/GettingStarted/ 'Sync GettingStarted for E3 deploy' [20:43:14] Logged the message, Master [20:45:21] !log mflaschen synchronized php-1.22wmf2/extensions/GettingStarted/ 'Sync GettingStarted 1.22wmf2 for E3 deploy' [20:45:28] Logged the message, Master [20:48:34] RECOVERY - Disk space on analytics1010 is OK: DISK OK [20:49:27] New review: Nemo bis; "@Krinkle, thanks for testing; this means however I run out of ideas to test. Suggestions?" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/58082 [20:52:18] hey LeslieCarr, can I pm? [20:52:34] ori-l: it is quite silly to ask me if you can pm [20:52:37] you can just pm [20:52:44] that goes for everyone [20:52:58] I've noticed other people asking first so I'm just adapting, but sure, OK [20:53:06] they are also silly [20:53:08] it's never an issue for me to be *more* nagging :) [20:53:13] hehe [20:59:39] Guys — any ideas why I might be getting the 'Cannot assign user name' while trying to log in into Gerrit? [20:59:50] New patchset: Jgreen; "removing drush symlink from aluminium, replacing with wrapper script" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59935 [21:00:11] Change merged: Jgreen; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59935 [21:01:02] odder: no idea. have you tried the obvious (i.e., clearing cache / cookies)? [21:01:57] * odder runs his browser in pr0n mode... [21:02:11] Yeah, still no luck :-( [21:05:07] can you take a screenshot of the error, make a bugzilla ticket noting source ip and username you are trying to use ? 
[21:06:36] LeslieCarr: source IP = my IP address? [21:07:24] ya [21:07:32] You want me to post it to Bugzilla? [21:07:44] RECOVERY - DPKG on db1058 is OK: All packages OK [21:07:54] RECOVERY - Disk space on db1014 is OK: DISK OK [21:08:04] RECOVERY - Disk space on db1058 is OK: DISK OK [21:08:04] RECOVERY - MySQL Replication Heartbeat on db1058 is OK: OK replication delay seconds [21:08:04] RECOVERY - MySQL Slave Running on db1058 is OK: OK replication [21:08:04] RECOVERY - Full LVS Snapshot on db1058 is OK: OK no full LVM snapshot volumes [21:08:05] RECOVERY - MySQL Slave Delay on db1058 is OK: OK replication delay seconds [21:08:05] RECOVERY - RAID on db1058 is OK: OK: State is Optimal, checked 2 logical device(s) [21:08:05] RECOVERY - RAID on db1001 is OK: OK: State is Optimal, checked 2 logical device(s) [21:08:14] RECOVERY - DPKG on db1001 is OK: All packages OK [21:08:14] RECOVERY - mysqld processes on db1058 is OK: PROCS OK: 1 process with command name mysqld [21:08:15] RECOVERY - MySQL Idle Transactions on db1058 is OK: OK longest blocking idle transaction sleeps for seconds [21:08:15] RECOVERY - RAID on db1014 is OK: OK: State is Optimal, checked 2 logical device(s) [21:08:15] RECOVERY - MySQL disk space on db1058 is OK: DISK OK [21:08:15] RECOVERY - DPKG on db1014 is OK: All packages OK [21:08:15] RECOVERY - RAID on db1015 is OK: OK: State is Optimal, checked 2 logical device(s) [21:08:16] RECOVERY - mysqld processes on db1001 is OK: PROCS OK: 1 process with command name mysqld [21:08:16] RECOVERY - DPKG on db1015 is OK: All packages OK [21:08:19] if you don't feel comfortable posting it to bugzilla you can pm it to me and i can check it out [21:08:30] RECOVERY - MySQL Recent Restart on db1001 is OK: OK seconds since restart [21:08:30] RECOVERY - Disk space on db1015 is OK: DISK OK [21:08:30] RECOVERY - MySQL Recent Restart on db1058 is OK: OK seconds since restart [21:08:30] RECOVERY - Disk space on db1001 is OK: DISK OK [21:08:34] RECOVERY - MySQL disk 
space on db1001 is OK: DISK OK [21:11:20] binasher: can I trick you into commenting on https://bugzilla.wikimedia.org/show_bug.cgi?id=10788#c20 ? [21:11:50] Ryan_Lane: While you're here… I've recently changed labs-morebots/analytics-morebots/etc to run on the tools project. Do you think we should move this channel's morebot as well? [21:12:13] hm [21:12:15] I don't know [21:12:24] where does it run now? [21:12:28] it's currently running on wikitech-static [21:12:30] morebots, where do you run? [21:12:31] I am a logbot running on wikitech-static. [21:12:31] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log. [21:12:31] To log a message, type !log . [21:12:31] andre__: wow, ticket from 2007 [21:12:54] Ah, ok, so there's no real security concern with leaving it there. [21:13:13] binasher, yeah, I sometimes travel into the past. :) Feel free to close if it doesn't make sense anymore. [21:13:19] nope [21:13:30] back in a little bit [21:16:09] LeslieCarr: PM then :) [21:22:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:23:02] Reedy: you about? [21:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.149 second response time [21:24:02] Ya [21:26:57] Question: In https://noc.wikimedia.org/conf/highlight.php?file=db-eqiad.php it shows which wikis are using which master db servers, but it only shows about 30 wikis. Where do you look up the info for the other 800 wikis? [21:27:49] is there some default master db server that handles all the others? [21:28:00] Yes, s3 [21:28:04] notpeter: Wassup? [21:28:13] kaldari: /* s3 */ 'DEFAULT' => array( [21:28:26] If it's not explicitly listed, then it's in s3 [21:28:32] Ah, I see it, thanks! [21:31:31] Reedy: where are the doc for making new wikis, specifically for making new private wikis [21:31:35] there's a new step! :) [21:31:42] orly? 
[21:31:50] https://wikitech.wikimedia.org/wiki/Add_a_wiki [21:31:51] I tried searching for it, but our search feature isn't so great [21:31:55] aaaahhh [21:32:05] yeah, they now have to be added to an array in puppet [21:32:16] probably needs a few redirects creating [21:32:21] so that they won't get thrown to labsdbs [21:38:02] !log merging changes on bugzilla to now support RT in "See also" field [21:38:09] Logged the message, Master [21:39:28] !log reedy synchronized php-1.22wmf1/extensions/EducationProgram [21:39:30] more private wikis [21:39:35] Logged the message, Master [21:40:49] !log reedy synchronized php-1.22wmf2/extensions/EducationProgram [21:40:56] Logged the message, Master [21:55:27] Change abandoned: Pyoungmeister; "(no reason)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59438 [21:56:18] !log bsitu synchronized php-1.22wmf2/extensions/Echo/includes/EchoDbFactory.php 'Db cache fix' [21:56:24] Logged the message, Master [21:56:47] New patchset: Pyoungmeister; "s4: db31 out, db72 in. 
s5: db35 out, db73 in" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59945 [22:00:49] Change merged: Pyoungmeister; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59945 [22:03:04] !log py synchronized wmf-config/db-pmtpa.php 'db31 and db35 out, db72 and db73 in' [22:03:10] Logged the message, Master [22:04:55] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [22:04:55] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [22:04:55] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [22:08:34] !log correction: that was db33 out, not db31 [22:08:41] Logged the message, notpeter [22:11:57] New patchset: Pyoungmeister; "actually, using db65 as snapshot host for pmtpa s4" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59948 [22:12:18] New patchset: MaxSem; "Enable $wgMFEnableSiteNotice on testwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59949 [22:12:42] New patchset: Pyoungmeister; "actually, using db65 as snapshot host for pmtpa s4" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59948 [22:13:52] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59948 [22:14:21] andre__: lengthy comment left on 10788 [22:14:47] !log reedy synchronized php-1.22wmf1/extensions/EducationProgram [22:14:53] Logged the message, Master [22:22:02] New patchset: Pyoungmeister; "adjusting weights for pmtpa s4 hosts" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59953 [22:24:55] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 182 seconds [22:26:26] !log reedy synchronized php-1.22wmf1/extensions/EducationProgram [22:26:32] andre__: btw.. bugzilla3.wikimedia.org , yeah, with the 3 in it.. 
it is some remnant, you'll find quite a few links to that in Google, but i recently noticed it was broken when bugzilla moved to new server the last time. i fixed it in DNS the other day so it redirects to the actual bz [22:26:33] Logged the message, Master [22:30:53] New patchset: Pyoungmeister; "moving db33 and db35 to new pmtpa m1 shard" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59956 [22:31:10] New review: Reedy; "Hume already has mysqlclient.." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840 [22:34:08] Change merged: Pyoungmeister; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59953 [22:34:56] !log py synchronized wmf-config/db-pmtpa.php 'adjusting weights on pmtpa slaves' [22:35:04] Logged the message, Master [22:35:11] !log reedy synchronized php-1.22wmf2/extensions/EducationProgram [22:35:17] Logged the message, Master [22:36:06] PROBLEM - MySQL Slave Delay on db78 is CRITICAL: CRIT replication delay 212 seconds [22:37:11] New patchset: Dzahn; "add account abaso and add to mortals (RT-4956)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59453 [22:37:45] New patchset: Pyoungmeister; "moving db33 and db35 to new pmtpa m1 shard" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59956 [22:38:48] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59956 [22:41:48] New review: Mwalker; "I guess jenkins cannot verify favicons..." 
[operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59589 [22:41:49] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59589 [22:42:21] Change merged: Mwalker; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59809 [22:45:24] !log mwalker synchronized docroot/bits/favicon/mediawiki.ico 'New and shiney mw.o favicon by Isarra' [22:45:31] Logged the message, Master [22:46:02] !log mwalker synchronized docroot/bits/favicon/wikisource.ico 'New and shiney ws.o favicon by Isarra' [22:46:08] Logged the message, Master [22:48:05] !log reimaging db35 [22:48:11] Logged the message, notpeter [22:48:46] PROBLEM - Host db35 is DOWN: PING CRITICAL - Packet loss = 100% [22:49:14] Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59949 [22:50:41] !log awjrichards synchronized wmf-config/InitialiseSettings.php 'Enable mobile sitenotice on testwiki' [22:50:48] Logged the message, Master [22:51:06] !log awjrichards synchronized wmf-config/mobile.php 'Enable mobile sitenotice on testwiki' [22:51:13] Logged the message, Master [22:53:56] RECOVERY - Host db35 is UP: PING OK - Packet loss = 0%, RTA = 26.48 ms [22:55:56] PROBLEM - MySQL Slave Delay on db35 is CRITICAL: Connection refused by host [22:56:06] PROBLEM - DPKG on db35 is CRITICAL: Connection refused by host [22:56:06] PROBLEM - mysqld processes on db35 is CRITICAL: Connection refused by host [22:56:06] PROBLEM - MySQL Idle Transactions on db35 is CRITICAL: Connection refused by host [22:56:26] PROBLEM - RAID on db35 is CRITICAL: Connection refused by host [22:56:26] PROBLEM - MySQL Replication Heartbeat on db35 is CRITICAL: Connection refused by host [22:56:27] PROBLEM - Disk space on db35 is CRITICAL: Connection refused by host [22:56:36] PROBLEM - MySQL Recent Restart on db35 is CRITICAL: Connection refused by host [22:56:36] PROBLEM - SSH on db35 is CRITICAL: Connection refused 
[22:56:36] PROBLEM - Full LVS Snapshot on db35 is CRITICAL: Connection refused by host [22:56:37] PROBLEM - MySQL Slave Running on db35 is CRITICAL: Connection refused by host [22:56:46] PROBLEM - MySQL disk space on db35 is CRITICAL: Connection refused by host [22:57:32] odder: Did you manage to log in to Gerrit yet? [22:57:43] no. [22:58:35] I filed https://bugzilla.wikimedia.org/show_bug.cgi?id=47385 for now [22:58:41] odder: I have no access to LDAP or the Gerrit error message, but I could reproduce your problem when I use the wrong casing in the user name [22:58:56] odder: Yes, I saw that bug report. That's why I am asking [22:59:03] odder: :-) [22:59:06] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: NRPE: Unable to read output [22:59:13] odder: Have you tried using lowercase? [22:59:23] qchris: I'm getting it for both "Odder" and "odder" as the username. [23:00:21] notpeter: can you check out my comment here ? https://gerrit.wikimedia.org/r/#/c/59840/ [23:00:36] PROBLEM - Host db35 is DOWN: PING CRITICAL - Packet loss = 100% [23:00:45] LeslieCarr: needs more profanity [23:01:08] lemme fix that next version [23:01:32] odder: Strange. That's something I cannot reproduce locally. Hmmm. [23:01:36] RECOVERY - SSH on db35 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0) [23:01:46] RECOVERY - Host db35 is UP: PING OK - Packet loss = 0%, RTA = 26.48 ms [23:01:47] I mean, he's right. it does. 
but if terbium doesn't then hume having it isn't puppetized :) [23:02:01] odder: So I guess we'll have to wait for ^demon to show us the gerrit logs then :-( [23:02:42] New patchset: Ori.livneh; "Add default Notebook helpers to config" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59960 [23:03:04] LeslieCarr: oh, yes, and that class will also work [23:03:12] the one that you suggested [23:03:14] cool [23:03:31] I'd love to get to the point where everything related to mysql uses the mysql module [23:03:38] but, we've got a bit of rewriting to do before that [23:04:29] qchris: Ah well, I guess I can wait :) [23:04:49] any ops person got a sec to merge https://gerrit.wikimedia.org/r/#/c/59960/ for me? [23:05:33] odder: Ja, sorry. But casing bit a few people when trying to log in, so I thought it might be worth a try ;-) [23:07:18] PROBLEM - Puppet freshness on virt1005 is CRITICAL: No successful Puppet run in the last 10 hours [23:08:34] New patchset: Ori.livneh; "Add default Notebook helpers to config" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59960 [23:15:49] PROBLEM - NTP on db35 is CRITICAL: NTP CRITICAL: Offset unknown [23:18:01] New patchset: Lcarr; "mysql client on hume and terbium patch 2 - changed class used" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840 [23:20:18] New patchset: Lcarr; "mysql client on hume and terbium patch 2 - changed class used patch 3 - fixed spacing and comma" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840 [23:22:17] Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59840 [23:24:48] !log bsitu synchronized php-1.22wmf2/extensions/Echo/includes/DbEchoBackend.php 'More echo db fix' [23:24:56] Logged the message, Master [23:25:20] !log bsitu synchronized php-1.22wmf2/extensions/Echo/processEchoEmailBatch.php 'More echo db fix' [23:25:27] Logged the message, Master [23:26:19] New patchset: coren; 
"Preleminary toollabs module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59969 [23:26:48] RECOVERY - NTP on db35 is OK: NTP OK: Offset -0.002932429314 secs [23:26:59] ori-l: looking now [23:33:56] New review: Jforrester; "LGTM." [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/59768 [23:34:30] Change merged: Catrope; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/59768 [23:37:24] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59960 [23:37:39] ori-l: hey [23:37:46] !log catrope synchronized wmf-config/CommonSettings.php 'Deploy 5c5455c81bc8dee0ae2a0410655da639e2f84e87 ($wgVisualEditorParsoidProblemReportURL)' [23:37:53] Logged the message, Master [23:39:46] !log catrope synchronized php-1.22wmf1/extensions/VisualEditor 'Update VisualEditor to master' [23:39:53] Logged the message, Master [23:40:06] !log catrope synchronized php-1.22wmf2/extensions/VisualEditor 'Update VisualEditor to master' [23:40:12] Logged the message, Master [23:43:49] ori-l: we've been deprecating /a in favor of /srv, unfortunately I didn't see the commit that added /a/mongodb [23:44:31] LeslieCarr, paravoid: thanks. /a/ actually predates my time with the machine and harks back to some statistics:: class. [23:44:46] aha [23:44:53] it's easy enough to change. would you like me to? 
[23:44:59] if it's easy, sure :) [23:45:35] New patchset: Ori.livneh; "Parametrize mwerror port" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59854 [23:46:35] PROBLEM - Parsoid on wtp1 is CRITICAL: Connection refused [23:46:45] PROBLEM - Parsoid on mexia is CRITICAL: Connection refused [23:46:45] PROBLEM - Parsoid on tola is CRITICAL: Connection refused [23:46:54] New patchset: coren; "Preleminary toollabs module" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59969 [23:46:56] PROBLEM - LVS HTTP IPv4 on parsoid.svc.pmtpa.wmnet is CRITICAL: Connection refused [23:46:59] PROBLEM - Parsoid on cerium is CRITICAL: Connection refused [23:47:05] PROBLEM - Parsoid on titanium is CRITICAL: Connection refused [23:47:05] PROBLEM - LVS HTTP IPv4 on parsoidcache.svc.pmtpa.wmnet is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 674 bytes in 0.054 second response time [23:47:08] PROBLEM - LVS HTTP IPv4 on parsoid.svc.eqiad.wmnet is CRITICAL: Connection refused [23:47:15] PROBLEM - Parsoid on wtp1002 is CRITICAL: Connection refused [23:47:26] paravoid: '/srv/mongodb' ok? 
[23:47:28] dude RoanKattouw , i know you want to hit up my phone but killing parsoid is not the way [23:47:34] Parsoid dying is expected [23:47:35] PROBLEM - Parsoid Varnish on celsus is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 674 bytes in 0.055 second response time [23:47:36] PROBLEM - Parsoid on wtp1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:47:36] PROBLEM - Parsoid on wtp1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:47:36] PROBLEM - Parsoid on kuo is CRITICAL: Connection refused [23:47:38] This is a scheduled deployment [23:47:46] PROBLEM - Parsoid on lardner is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:47:46] PROBLEM - Parsoid Varnish on constable is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:47:46] PROBLEM - Parsoid Varnish on titanium is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:47:54] (Scheduled for 3pm but ignore that bit) [23:48:00] * James_F coughs. [23:48:03] It will come back within the next 5-10 [23:48:05] PROBLEM - Parsoid Varnish on cerium is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:48:10] RoanKattouw: if doing something on purpose that will page the entire ops team repeatedly, please silence monitoring first :) [23:48:41] * James_F snorts. [23:48:53] binasher: Right, I totally forgot about that [23:48:54] * RoanKattouw apologizes [23:48:57] * RoanKattouw puts on silly hat [23:49:38] i want a silly hat [23:49:59] How do I even silence things [23:50:05] This UI is completely different from what it was before [23:50:11] I suppose that's the Nagios-Icinga change [23:50:30] New patchset: Ori.livneh; "MongoDB dbpath: /a/mongodb -> /srv/mongodb" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/59972 [23:50:48] https://icinga-admin.wikimedia.org to login [23:51:09] "Disable notifications for this service"? 
[23:51:09] then if you see the button by notifications that says enabled [23:51:10] paravoid: I stopped the service and moved the dir in preparation [23:51:30] click it, then disabled if it's going to kill tons of stuff, or just notifications for this service for minor things like just parsoid [23:52:45] I'd raise the point that if the service can be down during the deployment it shouldn't page when it's down in general [23:52:59] LeslieCarr: Where is the "button by notifications that says enabled"? I don't see it [23:53:04] either it's critical for it to be up and deployment should be done without downtimes [23:53:15] or it's not, so it shouldn't page us [23:53:16] paravoid: It doesn't usually go down during deployments [23:53:27] But for this particular one we knew it would happen [23:53:43] * RoanKattouw should put this in as a feature request in git-deploy for Ryan [23:53:44] i walk over and point at your screen roan [23:53:47] OK [23:53:57] what feature? [23:54:02] parsoid [23:54:14] parsoid is already in git-deploy [23:54:21] what feature is needed for the deployment? [23:54:32] Option to disable autorestart [23:54:37] ahh. right [23:54:54] that would be handy for the config update [23:54:57] this'll be a lot easier with sartoris in the front [23:55:26] Yeah exactly [23:55:47] Background: we're deploying a config update (really a node_modules update) and a code update such that they both break each other [23:55:47] maybe I'll spend some time soon to replace that bit [23:55:49] Which is rare [23:55:59] ah. 
I see [23:56:01] But in that case it would be nice to suppress autorestart on the first deploy [23:56:05] yep [23:56:14] In general, it would be nice to be able to suppress it, restart one box, test that one, then trigger a restart [23:56:33] we could also split the restart apart so that you can restart at any point [23:56:39] Yeah [23:56:43] That's probably better [23:56:57] but then you'd need to update your deploy process [23:56:57] Restarting too soon is much worse than restarting too late [23:57:03] That's OK [23:57:21] cool. can you open a bug for this? [23:57:27] Where? [23:57:36] bugzilla has a git-deploy product, I think [23:57:40] Oh OK [23:57:56] Wikimedia -> git-deploy [23:58:06] binasher: BTW turns out I don't even have authorization to silence notifications in Nagios :O but yeah I had totally forgotten, thanks for pointing that out, I'll make sure I'll do it in the future [23:58:58] RoanKattouw: no worry! you can always yell at ops ppl before hand too [23:59:14] Yeah [23:59:18] I should totally have notified you guys [23:59:22] RoanKattouw: we can give you nagios login [23:59:27] I didn't realize that Parsoid would die until I heard cellphones go off [23:59:31] notpeter: Leslie is investigating [23:59:33] I'm supposed to have it [23:59:43] Oh look [23:59:46] I have a page SMS now! [23:59:51] It just took a long time to get here [23:59:54] Arrived 16:54