[00:01:53] (PS5) Ori.livneh: Add a handler for HHVM fatals [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/120180 (owner: MaxSem)
[00:02:06] (CR) Ori.livneh: [C: 2] Add a handler for HHVM fatals [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/120180 (owner: MaxSem)
[00:02:13] (Merged) jenkins-bot: Add a handler for HHVM fatals [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/120180 (owner: MaxSem)
[00:05:13] (PS1) BryanDavis: beta: Remove hacks from misc::deployment::vars [operations/puppet] - https://gerrit.wikimedia.org/r/147348
[00:11:28] bd808, in your puppet finangling; did you bring down beta?
[00:11:51] mwalker: ori and I were just discussing that
[00:11:53] mwalker: it was me and giuseppe most likely
[00:12:12] I helped by fixing other puppet issues apparently
[00:12:43] the master on deployment-salt had been in conflict and I unwedged it
[00:13:15] which was probably caused by me :P
[00:13:25] no, not likely
[00:13:27] actually, yes :)
[00:13:40] oh, really?
[00:13:56] I had a patch that was cherry picked; then jeff submitted it; but it had changes
[00:14:07] There was a rebase conflict with that patch, yeah
[00:14:23] bd808: i did notice that the beta app servers were perfectly happy loading vhost configs for prod
[00:14:32] so maybe the solution is not to try to pre-empt those from being loaded
[00:14:39] but to simply load the beta vhosts as well
[00:14:59] ah. yeah that should work acutally
[00:15:51] currently the apache2.conf that the mediawiki module provisions on top of the packaged one doesn't recurse into {sites,conf,mods}-enabled, but i'm pretty sure giuseppe has a patch locally to do that
[00:15:53] Hmm... that should be an easy fix right?
Just bring out configs into the puppet repo and apply them with one of the beta roles
[00:16:12] s/out/our/
[00:17:14] nothing would pick that up given the current setup
[00:17:39] it's a simple change to make it work (and one we want to make anyhow) but probably best to coordinate with _joe._
[00:17:51] is it all right for puppet agent to be disabled for a few hours on beta?
[00:18:14] Yeah it won't hurt anything. Let's log in the beta's SAL though
[00:18:35] * ori nods.
[00:24:11] (CR) BryanDavis: [C: 1] "Cherry-picked to deployment-salt and applied on deployment-bastion. Verified that Jenkins triggered scap run worked after." [operations/puppet] - https://gerrit.wikimedia.org/r/147348 (owner: BryanDavis)
[00:31:20] (CR) Ori.livneh: [C: 2] beta: Remove hacks from misc::deployment::vars [operations/puppet] - https://gerrit.wikimedia.org/r/147348 (owner: BryanDavis)
[00:53:17] ori: thanks muchly (re back up)
[00:56:52] ori and bd808 and mwalker, I guess :)
[01:04:15] greg-g, would it be unethical to give myself permissions on betawiki by logging into the database and just doing so?
[01:05:15] mwalker: Nope. Or use a maintenance script, or ask me to make you staff there
[01:05:39] ah; bd808 can you make me staff on betawiki; I promise only to do good things and never to harm kittens
[01:06:05] * bd808 doesn't care for kittens
[01:06:12] puppies?
[01:06:16] dolphins!?
[01:06:30] puppes will work
[01:06:46] what's your username on the wiki?
[01:06:50] Mwalker (WMF)
[01:08:15] mwalker: {{done}} staff; steward; developer
[01:09:56] whoo; thanks bd808!
[01:10:30] yw.
Now pass it on the next time you see someone asking for perms in beta
[01:12:55] PROBLEM - Puppet freshness on db1006 is CRITICAL: Last successful Puppet run was Thu 17 Jul 2014 23:12:41 UTC
[01:32:56] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Fri Jul 18 01:32:52 UTC 2014
[02:37:57] !log LocalisationUpdate completed (1.24wmf13) at 2014-07-18 02:36:54+00:00
[02:38:04] Logged the message, Master
[02:42:25] (PS1) Springle: Repool db1021, context RT 7916 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/147353
[02:43:11] (CR) Springle: [C: 2] Repool db1021, context RT 7916 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/147353 (owner: Springle)
[02:43:18] (Merged) jenkins-bot: Repool db1021, context RT 7916 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/147353 (owner: Springle)
[02:45:12] !log springle Synchronized wmf-config/db-eqiad.php: Repool db1021, context RT 7916, warm up (duration: 00m 08s)
[02:45:18] Logged the message, Master
[02:49:13] (PS1) Aaron Schulz: Set the refreshLinks runner timeout to 5 minutes [operations/puppet] - https://gerrit.wikimedia.org/r/147355
[02:50:22] (CR) Ori.livneh: "Looks fine, but please indicate the rationale (even if it's just "based on observed timeouts" or something like that)."
[operations/puppet] - https://gerrit.wikimedia.org/r/147355 (owner: Aaron Schulz)
[02:52:45] (PS2) Aaron Schulz: Set the refreshLinks runner timeout to 5 minutes [operations/puppet] - https://gerrit.wikimedia.org/r/147355
[02:53:47] (CR) Ori.livneh: [C: 2] Set the refreshLinks runner timeout to 5 minutes [operations/puppet] - https://gerrit.wikimedia.org/r/147355 (owner: Aaron Schulz)
[03:09:06] !log LocalisationUpdate completed (1.24wmf14) at 2014-07-18 03:08:02+00:00
[03:09:11] Logged the message, Master
[03:22:00] !log Updated jobrunner to d9520c9 and restarted service on all jobrunners
[03:22:05] Logged the message, Master
[03:52:03] PROBLEM - puppet last run on rdb1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:52:54] interesting
[03:53:31] oh, just the usual puppetmaster flakiness
[03:55:01] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 18 03:53:55 UTC 2014 (duration 53m 54s)
[03:55:06] Logged the message, Master
[04:09:59] RECOVERY - puppet last run on rdb1001 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[04:34:51] (PS5) Springle: mysql_multi_instance: qualify vars [operations/puppet] - https://gerrit.wikimedia.org/r/147076 (owner: Matanya)
[04:36:08] (CR) Ottomata: "But, this package has openjdk-7 as a dependency, right? It shouldn't let you install without Java." [operations/debs/kafka] (debian) - https://gerrit.wikimedia.org/r/147338 (owner: Kmosher)
[04:37:06] (CR) Springle: [C: 2] mysql_multi_instance: qualify vars [operations/puppet] - https://gerrit.wikimedia.org/r/147076 (owner: Matanya)
[04:46:13] PROBLEM - Unmerged changes on repository puppet on palladium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet).
[04:46:44] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet).
[04:54:13] PROBLEM - Puppet freshness on db1006 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 02:53:11 UTC
[05:53:22] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Fri Jul 18 05:53:19 UTC 2014
[06:01:39] !log Updated /srv/deployment/jobrunner to 4cddd5033efadf431e138c399b5d86542e32f196
[06:01:44] Logged the message, Master
[06:28:51] PROBLEM - puppet last run on mw1176 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:41] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:52] PROBLEM - puppet last run on cp3016 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:01] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:44:51] RECOVERY - puppet last run on mw1176 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[06:45:40] RECOVERY - puppet last run on mw1173 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[06:46:51] RECOVERY - puppet last run on cp3016 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[06:47:00] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[07:40:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:41:02] (PS1) Aaron Schulz: Lowered the default job runner timeout [operations/puppet] - https://gerrit.wikimedia.org/r/147388
[07:42:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:43:13] (PS2) Aaron Schulz: Lowered the default job runner timeout [operations/puppet] - https://gerrit.wikimedia.org/r/147388
[07:44:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:46:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:48:49]
PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:50:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:52:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:53:49] PROBLEM - Puppet freshness on db1006 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 05:53:19 UTC
[07:54:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:56:49] PROBLEM - Puppet freshness on mw1127 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 07:38:43 UTC
[07:58:19] RECOVERY - Puppet freshness on mw1127 is OK: puppet ran at Fri Jul 18 07:58:14 UTC 2014
[08:12:32] (CR) Alexandros Kosiaris: "Seems fine to me" [operations/puppet] - https://gerrit.wikimedia.org/r/122399 (owner: Dzahn)
[08:27:27] (CR) Alexandros Kosiaris: "Minor comment, also all these elliptic curves ciphers wont be enabled until we go to apache 2.4 (trusty as a distribution)" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/147207 (owner: Dzahn)
[08:29:22] (CR) Alexandros Kosiaris: [C: 2] icinga - outdated variable syntax [operations/puppet] - https://gerrit.wikimedia.org/r/147204 (owner: Dzahn)
[08:37:16] (CR) Alexandros Kosiaris: "Do we have a policy that apache configs need to be with spaces and not tabs ?" [operations/puppet] - https://gerrit.wikimedia.org/r/147197 (owner: Dzahn)
[08:40:30] (CR) Giuseppe Lavagetto: jobrunner: create hhvm-only jobrunners (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/147086 (owner: Giuseppe Lavagetto)
[08:42:48] (CR) Giuseppe Lavagetto: "If we don't, we should :)" [operations/puppet] - https://gerrit.wikimedia.org/r/147197 (owner: Dzahn)
[08:46:24] apergos: Had a chance to have another look at the wikidata json patch yet?
[08:46:51] heh, o, yesterday evening turned into an overly full program
[08:46:54] *no
[08:47:01] let me pull that up again
[08:47:07] ok :)
[08:49:06] uh that description in the system role seems ... entertaining :-D
[08:49:55] whoops
[08:50:02] copy and paste got me :P
[08:50:06] figured :-D
[08:51:06] (CR) Alexandros Kosiaris: "If we do, we should also make adding a vim modeline a requisite" [operations/puppet] - https://gerrit.wikimedia.org/r/147197 (owner: Dzahn)
[08:53:28] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Fri Jul 18 08:53:25 UTC 2014
[08:57:19] (CR) Alexandros Kosiaris: [C: 2] etherpad - outdated variable syntax [operations/puppet] - https://gerrit.wikimedia.org/r/147198 (owner: Dzahn)
[08:59:27] (PS3) Giuseppe Lavagetto: jobrunner: create hhvm-only jobrunners [operations/puppet] - https://gerrit.wikimedia.org/r/147086
[09:09:14] (PS4) Hoo man: Introduce snapshot::wikidatajsondump [operations/puppet] - https://gerrit.wikimedia.org/r/146470
[09:09:52] (CR) Hoo man: "Fix copy and paste error...
thanks, Ariel" [operations/puppet] - https://gerrit.wikimedia.org/r/146470 (owner: Hoo man)
[09:12:12] (CR) Alexandros Kosiaris: [C: -1] generic_vhost - outdated variable syntax (4 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/147210 (owner: Dzahn)
[09:21:33] (CR) Filippo Giunchedi: [C: 1] add alternatives installation for /usr/bin/php [operations/debs/hhvm] - https://gerrit.wikimedia.org/r/147141 (owner: Giuseppe Lavagetto)
[09:25:55] (PS1) Filippo Giunchedi: swift: enable recon middleware [operations/puppet] - https://gerrit.wikimedia.org/r/147399
[09:26:47] (CR) Filippo Giunchedi: [C: 2] swift: enable recon middleware [operations/puppet] - https://gerrit.wikimedia.org/r/147399 (owner: Filippo Giunchedi)
[09:26:55] (CR) Filippo Giunchedi: [V: 2] swift: enable recon middleware [operations/puppet] - https://gerrit.wikimedia.org/r/147399 (owner: Filippo Giunchedi)
[09:28:04] springle: still around? is 9355b71 good to be merged?
[09:29:03] nothing too bad it seems, just vars qualification
[09:36:51] (PS4) Giuseppe Lavagetto: jobrunner: create hhvm-only jobrunners [operations/puppet] - https://gerrit.wikimedia.org/r/147086
[09:36:58] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge.
[09:37:10] spot-checked the mysql_multi_instance module, no notify/subscribe etc so I'm merging
[09:37:18] RECOVERY - Unmerged changes on repository puppet on palladium is OK: No changes to merge.
[09:39:01] godog: ah... i really thought i merged that.
sorry
[09:39:06] (CR) Giuseppe Lavagetto: [C: 2 V: 2] add alternatives installation for /usr/bin/php [operations/debs/hhvm] - https://gerrit.wikimedia.org/r/147141 (owner: Giuseppe Lavagetto)
[09:39:45] <_joe_> btw, I know it's friday, but I'd really like to merge https://gerrit.wikimedia.org/r/#/c/147066/
[09:40:47] <_joe_> I know this would piss Reedy off a bit, he's got a few outstanding patches against apache-config
[09:40:59] springle: no worries!
[09:49:19] (CR) Filippo Giunchedi: jobrunner: create hhvm-only jobrunners (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/147086 (owner: Giuseppe Lavagetto)
[09:49:26] (CR) Filippo Giunchedi: jobrunner: create hhvm-only jobrunners (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/147086 (owner: Giuseppe Lavagetto)
[09:53:49] (CR) Giuseppe Lavagetto: jobrunner: create hhvm-only jobrunners (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/147086 (owner: Giuseppe Lavagetto)
[09:55:55] (CR) Filippo Giunchedi: releases: add reprepro repository (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/146826 (owner: Filippo Giunchedi)
[09:56:05] (PS1) Alexandros Kosiaris: akosiaris dotfiles. Link .my.cnf to /root/.my.cnf [operations/puppet] - https://gerrit.wikimedia.org/r/147408
[10:02:43] apergos: I've amended the change... is there anything else left which should be addressed?
[10:04:30] !log stagger reload swift {account,object,container} server in ms-be.eqiad to pick up recon changes
[10:04:36] Logged the message, Master
[10:04:38] (CR) Alexandros Kosiaris: [C: 2] akosiaris dotfiles.
Link .my.cnf to /root/.my.cnf [operations/puppet] - https://gerrit.wikimedia.org/r/147408 (owner: Alexandros Kosiaris)
[10:14:38] hoo: that was all I saw
[10:15:01] I'll try pushing it live a little later today
[10:15:26] apergos: That would be *so* awesome :P
[10:21:49] (CR) Alexandros Kosiaris: [C: 2] include 'bastionhost' on bastion hosts [operations/puppet] - https://gerrit.wikimedia.org/r/122399 (owner: Dzahn)
[10:23:13] how awesome would it be then? :-P
[10:25:41] On a scale of 1 to 10... A clear 11, I'd say :D
[10:25:48] :-D
[10:27:19] how awesome would i be then? :-P
[10:27:21] ftfy
[10:27:32] and hoo's answer still works
[10:27:36] (PS5) ArielGlenn: Introduce snapshot::wikidatajsondump [operations/puppet] - https://gerrit.wikimedia.org/r/146470 (owner: Hoo man)
[10:27:45] just a rebase
[10:29:17] I read your message three times and thought 'how did I leave out that t? gotta clean this keyboard'...
[10:29:50] (CR) ArielGlenn: [C: 2] Introduce snapshot::wikidatajsondump [operations/puppet] - https://gerrit.wikimedia.org/r/146470 (owner: Hoo man)
[10:29:56] \o/
[10:31:25] it's not live yet, hold your horses
[10:31:49] gotta run puppet on a secondary and a primary, then look at the corntab and the roles
[10:31:51] crontab
[10:32:59] I'm dared to say "what could possible go wrong?" ... but experience taught me better
[10:33:49] thanks apergos ! :)
[10:34:00] * aude can't wait to have a dump
[10:34:27] that... might be tmi :-D
[10:34:50] akosiaris: hm, seeing those dotfiles: why exactly do you use .variables for the hist-/file size, but do everything else history related in .bashrc?
;)
[10:36:26] Trminator: I keep .bashrc clean so I can easily upgrade to newer versions of it in the future without too much fuss
[10:36:47] PROBLEM - puppet last run on snapshot1003 is CRITICAL: CRITICAL: Complete puppet failure
[10:36:58] uhm
[10:36:59] doh
[10:37:10] ah, I c
[10:37:17] (PS1) ArielGlenn: wikidata json dumps: manifest filename had typo [operations/puppet] - https://gerrit.wikimedia.org/r/147411
[10:37:42] Trminator: and since .bash_aliases has been introduced I should migrate .aliases to that btw
[10:37:55] this is why I test
[10:38:00] hehe
[10:38:44] apergos: Ah, manual run... just wondered why that didn't make it to /var/log/puppet.log
[10:39:36] (CR) ArielGlenn: [C: 2] wikidata json dumps: manifest filename had typo [operations/puppet] - https://gerrit.wikimedia.org/r/147411 (owner: ArielGlenn)
[10:42:47] RECOVERY - puppet last run on snapshot1003 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[10:43:54] well it's in, we'll see in a few days...
[10:44:47] thanks for continuing to work on this
[10:45:09] (PS2) Giuseppe Lavagetto: mediawiki: use sites-available everywhere. [operations/puppet] - https://gerrit.wikimedia.org/r/147066
[10:46:25] apergos: I have to thank you :)
[10:46:47] not yet! if it runs successfully on.. MOnday is it? then you can thank me :-)
[10:47:05] Hehe, I'll surely do that :)
[10:47:46] (CR) Tim Landscheidt: [C: -1] "WIP." [operations/puppet] - https://gerrit.wikimedia.org/r/118796 (https://bugzilla.wikimedia.org/60925) (owner: Tim Landscheidt)
[10:48:57] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet).
[10:49:06] wtf
[10:49:21] <_joe_> apergos: how do you launch puppet-merge?
[10:49:34] from palladium
[10:49:42] just type he command but there's nothing to merge there
[10:49:43] <_joe_> sudo puppet-merge?
[10:49:52] I usually sudo -s first
[10:49:59] then puppet merge from the command line
[10:50:44] <_joe_> strange
[10:50:49] <_joe_> sudo -s should work
[10:50:53] <_joe_> sudo -i works for sure
[10:50:58] <_joe_> use sudo -i :)
[10:53:01] I used sudo -s for the previous change and it worked just fine
[10:53:07] that would have been 15 minutes ago
[10:53:14] or 20 whatever
[10:54:07] <_joe_> mmmh
[10:54:15] <_joe_> I thought sudo -s worked in fact
[10:54:29] <_joe_> we don't really understand why that happens
[10:54:33] <_joe_> but we need to fix it
[10:55:03] is there a ticket? maybe there should be
[10:56:27] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[10:59:20] (PS5) Giuseppe Lavagetto: jobrunner: create hhvm-only jobrunners [operations/puppet] - https://gerrit.wikimedia.org/r/147086
[11:00:42] I have the puppet-merge output from the two changes in my window still, it's pretty interesting
[11:02:48] <_joe_> we know what happens
[11:03:06] <_joe_> for some reasons git-merge launches the hook as the worng user
[11:03:11] <_joe_> or something like that
[11:03:23] <_joe_> btw, I've been called to lunch!
[11:14:42] !log restart proxy-server on ms-fe1003, double checking for a change in numbers reported to graphite
[11:14:48] Logged the message, Master
[11:15:09] enjoy (your lunch)
[11:16:27] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[11:17:55] (CR) Alexandros Kosiaris: [C: 2] deployment: fully qualify var [operations/puppet] - https://gerrit.wikimedia.org/r/145234 (owner: Matanya)
[11:19:05] (CR) Alexandros Kosiaris: [C: 2] geoip: qualify vars [operations/puppet] - https://gerrit.wikimedia.org/r/145269 (owner: Matanya)
[11:20:23] !log restart proxy-server on ms-fe1003, as suspected it wasn't running the latest version
[11:20:28] Logged the message, Master
[11:20:36] (CR) Alexandros Kosiaris: [C: 2] "The changes on nickel are due to the non stable nature of puppet hashes that are being used to populated those JSON files and have nothing" [operations/puppet] - https://gerrit.wikimedia.org/r/145266 (owner: Matanya)
[11:22:08] (CR) Alexandros Kosiaris: [C: 2] cache: lint [operations/puppet] - https://gerrit.wikimedia.org/r/140678 (owner: Matanya)
[11:23:01] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge.
[11:26:02] (PS2) JanZerebecki: racktables - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147185 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:26:20] to all the amazing things that made my singular journey of knowledge possible I just want to say thank you, i don't want to take up your time but i must say your work looks grand :)
[11:28:26] (CR) JanZerebecki: [C: -1] "Please remove the newlines from the ciphers list."
[operations/puppet] - https://gerrit.wikimedia.org/r/147185 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:29:35] (PS2) JanZerebecki: smokeping - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:29:39] (CR) Alexandros Kosiaris: [C: 2] admin: fix var scoping [operations/puppet] - https://gerrit.wikimedia.org/r/144908 (owner: Matanya)
[11:30:22] (PS1) Filippo Giunchedi: swift: use statsd_default_sample_rate default [operations/puppet] - https://gerrit.wikimedia.org/r/147417
[11:36:52] (CR) JanZerebecki: [C: 1] smokeping - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:37:44] (PS2) JanZerebecki: etherpad - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:39:00] (CR) Chmarkine: [C: 1] "By the way, we are using wrong certificate for https://smokeping.wikimedia.org/. The certificate is issued to librenms.wikimedia.org." [operations/puppet] - https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:39:51] (CR) JanZerebecki: [C: 1] etherpad - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:43:38] (PS1) Springle: Remove db1050 from s1, to be repurposed for dumps. [operations/puppet] - https://gerrit.wikimedia.org/r/147419
[11:44:19] (CR) Alexandros Kosiaris: [C: 1] "I can see this being used when installing via dpkg and not apt in which case it makes sense. I think we should merge." [operations/debs/kafka] (debian) - https://gerrit.wikimedia.org/r/147338 (owner: Kmosher)
[11:44:44] (CR) Springle: [C: 2] Remove db1050 from s1, to be repurposed for dumps.
[operations/puppet] - https://gerrit.wikimedia.org/r/147419 (owner: Springle)
[11:47:11] PROBLEM - MySQL Slave Running on db60 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Table ops.event_log doesnt exist on query. Default databas
[11:47:22] oops
[11:48:44] lunch, bbl
[11:49:11] RECOVERY - MySQL Slave Running on db60 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error:
[11:49:16] (PS2) JanZerebecki: icinga - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147207 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:49:59] (CR) JanZerebecki: icinga - update SSL cipher list (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/147207 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[11:50:09] (CR) JanZerebecki: [C: 1] icinga - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147207 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:05:47] (CR) JanZerebecki: [C: -1] "role::smokeping is not used in site.pp should that be removed instead?" [operations/puppet] - https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:07:20] (CR) Chmarkine: "Ganglia's server is Apache/2.2.14, while others are Apache/2.2.22. Probably Apache 2.2.14 doesn't support TLS 1.1, TLS 1.2, AES128-GCM-SHA" [operations/puppet] - https://gerrit.wikimedia.org/r/147110 (https://bugzilla.wikimedia.org/53259) (owner: Chmarkine)
[12:10:47] (CR) JanZerebecki: [C: 1] "It is in use by e.g. role::librenms, smokeping, ..." [operations/puppet] - https://gerrit.wikimedia.org/r/147208 (owner: Dzahn)
[12:11:21] (PS1) Yurik: Enabled Opera on 429-02 carrier [operations/puppet] - https://gerrit.wikimedia.org/r/147422
[12:12:52] Any way to get gerrit upgraded to the last minor version?
https://bugzilla.wikimedia.org/show_bug.cgi?id=63847
[12:13:04] bblack, https://gerrit.wikimedia.org/r/#/c/147422/ when you have a chance pls
[12:13:20] (PS2) JanZerebecki: generic_vhost (webserver) - update SSL ciphers [operations/puppet] - https://gerrit.wikimedia.org/r/147208 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:17:19] (CR) JanZerebecki: [C: 1] wikitech - remove DHE ciphers [operations/puppet] - https://gerrit.wikimedia.org/r/147315 (owner: Dzahn)
[12:17:52] (PS3) JanZerebecki: metrics - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147214 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:18:13] (CR) JanZerebecki: [C: 1] metrics - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147214 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:18:29] (CR) Chmarkine: [C: 1] icinga - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147207 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:20:11] (CR) Chmarkine: [C: 1] etherpad - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:25:15] (CR) Chmarkine: [C: 1] metrics - update SSL cipher list [operations/puppet] - https://gerrit.wikimedia.org/r/147214 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:28:44] (CR) Chmarkine: [C: 1] generic_vhost (webserver) - update SSL ciphers [operations/puppet] - https://gerrit.wikimedia.org/r/147208 (https://bugzilla.wikimedia.org/53259) (owner: Dzahn)
[12:31:01] (CR) Chmarkine: [C: 1] OTRS - remove DHE ciphers [operations/puppet] - https://gerrit.wikimedia.org/r/147316 (owner: Dzahn)
[12:31:11] (PS3) Giuseppe Lavagetto: mediawiki: use sites-available everywhere.
[operations/puppet] - https://gerrit.wikimedia.org/r/147066
[12:33:48] (CR) Chmarkine: [C: 1] wikitech - remove DHE ciphers [operations/puppet] - https://gerrit.wikimedia.org/r/147315 (owner: Dzahn)
[12:37:13] (CR) Giuseppe Lavagetto: "Puppet compiler results:" [operations/puppet] - https://gerrit.wikimedia.org/r/147066 (owner: Giuseppe Lavagetto)
[12:40:00] (CR) Giuseppe Lavagetto: "@ori: while I siphoned out some config from the jobrunner specific class for now, we should not overgeneralize now. I think once we've got" [operations/puppet] - https://gerrit.wikimedia.org/r/147086 (owner: Giuseppe Lavagetto)
[12:44:56] (CR) BBlack: [C: 2] Enabled Opera on 429-02 carrier [operations/puppet] - https://gerrit.wikimedia.org/r/147422 (owner: Yurik)
[12:51:36] _joe_: If you merge that.. Where do all the wiki specific configs go?
[12:52:09] <_joe_> Reedy: in puppet
[12:52:17] * Reedy facepalms
[12:52:22] Where etc? :P
[12:52:23] <_joe_> Reedy: I know
[12:52:36] <_joe_> I know you have a few pending patches
[12:52:37] I think you can cherry pick between repos...
[12:52:45] <_joe_> I hope so
[12:52:56] <_joe_> or, we merge them in apache-config and import them in puppet
[12:52:57] Or might be able to automate making a patch from one repo, saving it, applying to other and re-submitting
[12:52:59] <_joe_> :/
[12:53:08] well, not might for the latter
[12:53:15] the question is whether it can be done easier
[12:53:31] <_joe_> it can
[12:53:39] <_joe_> I hope
[12:53:56] Where will the files live?
[12:53:57] <_joe_> cherry-picking is basically the same thing btw
[12:54:00] ie what path
[12:54:10] <_joe_> modules/mediawiki/apache/sites/
[12:54:25] <_joe_> ehm mediawiki/files/apache/sites
[12:55:35] <_joe_> if you want, I can convert them :)
[12:58:50] (PS1) Reedy: Apache config for votewiki using mod_proxy_fcgi [operations/puppet] - https://gerrit.wikimedia.org/r/147428
[12:58:59] reedy@ubuntu64-web-esxi:~/git/operations/puppet/modules/mediawiki/files/apache/sites$ git fetch https://gerrit.wikimedia.org/r/operations/apache-config refs/changes/92/146292/1 && git cherry-pick FETCH_HEAD
[12:58:59] && git review
[12:59:06] that works as expected to move the changes
[12:59:45] <_joe_> Reedy: wow
[13:00:03] <_joe_> I'm impressed :)
[13:00:30] So all I really need is a list of all the outstanding commits we want to migrate
[13:00:41] from 146292 grab the last 2 numbers to make 92/146292
[13:01:10] Uses the same change id too
[13:01:11] * Reedy grins
[13:01:13] ... and the /1 is the patchset ;)
[13:01:16] https://gerrit.wikimedia.org/r/#/q/I9671e6df30db7718ec8329b00f46110352ddf203,n,z
[13:01:24] <_joe_> Reedy: let me check, I thought your ones were the only ones we needed to migrate
[13:01:49] <_joe_> oh I've played with fetch and patchsets and url magic for the puppet compiuler
[13:02:03] I think there might be 1 or 2 others that are still wanted
[13:02:08] (PS1) coren: Tool Labs: tweaks to bigbrother [operations/puppet] - https://gerrit.wikimedia.org/r/147432
[13:02:08] There's quite a few we should just abandon
[13:02:57] <_joe_> I'd say just migrate your ones for now
[13:03:19] I guess others can be done pretty easily if needed
[13:03:23] even if abandoned
[13:03:38] <_joe_> then once we've migrated fully and we're confident it works, we can add a README in apache_config
[13:03:43] (PS2) coren: Tool Labs: tweaks to bigbrother [operations/puppet] - https://gerrit.wikimedia.org/r/147432
[13:04:15] [13:55:35] <_joe_> if you want, I can convert them
:)
[13:04:18] Do you mean into puppet config?
[13:04:33] <_joe_> commits from one repo to the other
[13:04:56] ah
[13:04:59] <_joe_> but you did a perfect job ;)
[13:05:20] (CR) coren: [C: 2] "Minor tweaks." [operations/puppet] - https://gerrit.wikimedia.org/r/147432 (owner: coren)
[13:05:35] Damn, no qchris
[13:09:22] https://gerrit.wikimedia.org/r/changes/?q=status:open+project:operations/apache-config+owner:Reedy
[13:09:59] "_number": 146292,
[13:10:07] I'll script up something for migration in a little while
[13:10:45] I know the hhvm ones are all patch set 1
[13:11:01] <_joe_> :)
[13:11:09] <_joe_> I'm merging the change btw
[13:11:14] :P
[13:11:18] Ook!
[13:13:15] (PS4) Giuseppe Lavagetto: mediawiki: use sites-available everywhere. [operations/puppet] - https://gerrit.wikimedia.org/r/147066
[13:13:38] <_joe_> !log temporarily disabling puppet on mw servers, will re-enable when I'm done with testing (again) the change
[13:13:43] Logged the message, Master
[13:14:05] (CR) Giuseppe Lavagetto: [C: 2 V: 2] mediawiki: use sites-available everywhere. [operations/puppet] - https://gerrit.wikimedia.org/r/147066 (owner: Giuseppe Lavagetto)
[13:19:52] <_joe_> !log re-enabling puppet, applying on a sample of hosts created no change according to my tests.
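The ref arithmetic discussed at 13:00:41 and 13:01:13 (last two digits of the change number, then the full change number, then the patchset) and the `"_number"` field from the changes query at 13:09:22 can be sketched in Python. This is only an illustrative sketch, not the migration script mentioned at 13:10:07, which is not shown in the log. The function names `gerrit_change_ref`, `cherry_pick_command` and `parse_change_numbers` are hypothetical; the repository URL is the one from the command pasted at 12:58:59, and the sketch assumes the query response is a JSON list of changes preceded by Gerrit's `)]}'` guard line.

```python
import json

# Gerrit prefixes JSON responses with this guard line; strip it before parsing.
GERRIT_XSSI_PREFIX = ")]}'"


def gerrit_change_ref(change_number: int, patchset: int = 1) -> str:
    """Build the ref for a change, e.g. 146292 patchset 1 -> refs/changes/92/146292/1."""
    if change_number < 1 or patchset < 1:
        raise ValueError("change number and patchset must be positive")
    # The shard is the last two digits of the change number, zero-padded.
    shard = f"{change_number % 100:02d}"
    return f"refs/changes/{shard}/{change_number}/{patchset}"


def cherry_pick_command(repo_url: str, change_number: int, patchset: int = 1) -> str:
    """Compose the fetch + cherry-pick one-liner used to move a change between repos."""
    ref = gerrit_change_ref(change_number, patchset)
    return f"git fetch {repo_url} {ref} && git cherry-pick FETCH_HEAD"


def parse_change_numbers(body: str) -> list[int]:
    """Extract the "_number" of every change from a changes-query response body."""
    if body.startswith(GERRIT_XSSI_PREFIX):
        body = body[len(GERRIT_XSSI_PREFIX):]
    return [change["_number"] for change in json.loads(body)]


if __name__ == "__main__":
    repo = "https://gerrit.wikimedia.org/r/operations/apache-config"
    print(cherry_pick_command(repo, 146292))
```

Keeping the ref construction separate from the command string makes it easy to loop over the numbers returned by the query and default to patchset 1, which matches the remark at 13:10:45 that the hhvm changes are all patch set 1.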
[13:19:57] Logged the message, Master [13:23:52] (03PS2) 10Reedy: Apache config for votewiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147428 [13:24:19] (03Abandoned) 10Reedy: Apache config for votewiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147428 (owner: 10Reedy) [13:30:55] (03PS1) 10Giuseppe Lavagetto: add discontinuation notice [operations/apache-config] - 10https://gerrit.wikimedia.org/r/147434 [13:31:05] (03PS1) 10Reedy: Apache config for donatewiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147435 [13:31:17] (03Restored) 10Reedy: Apache config for votewiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147428 (owner: 10Reedy) [13:31:29] (03PS1) 10Reedy: Apache config for wikidatawiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147436 [13:31:53] <_joe_> Reedy: if you have a script automating the patch migration, can you commit it into apache-config, and I can mention it in the readme [13:32:12] (03PS1) 10Reedy: Apache config for testwikidatawiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147437 [13:32:17] Yeah, it's running now [13:32:48] got the feeling it might take a little while... [13:32:51] (03PS1) 10Reedy: Apache config for mediawikiwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147438 [13:33:16] (03PS1) 10Reedy: Apache config for Wiktionary using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147439 [13:33:40] (03PS1) 10Reedy: Apache config for Wikiquote using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147440 [13:34:01] <_joe_> Reedy: eheh [13:34:04] (03PS1) 10Reedy: Apache config for Wikipedia using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147441 [13:34:24] <_joe_> is it transferring your commits only, right? 
[13:34:30] (03PS1) 10Reedy: Apache config for Wikibooks using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147442 [13:34:37] yup, and only the mod_proxy_fcgi ones [13:34:59] (03PS1) 10Reedy: Apache config for Wikisource using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147443 [13:35:23] (03PS1) 10Reedy: Apache config for Wikinews using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147444 [13:35:48] (03PS1) 10Reedy: Apache config for Wikiversity using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147445 [13:36:11] (03PS1) 10Reedy: Apache config for iegcomwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147446 [13:36:59] (03PS1) 10Reedy: Apache config for transitionteamwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147448 [13:37:11] I wonder if dippy-bird still works [13:37:23] (03PS1) 10Reedy: Apache config for zerowiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147449 [13:37:43] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] add discontinuation notice [operations/apache-config] - 10https://gerrit.wikimedia.org/r/147434 (owner: 10Giuseppe Lavagetto) [13:37:46] (03PS1) 10Reedy: Apache config for legalteamwiki sing mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147450 [13:38:11] (03PS1) 10Reedy: Apache config for loginwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147451 [13:38:36] (03PS1) 10Reedy: Apache config for wikimedia chapters using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147453 [13:39:02] (03PS1) 10Reedy: Apache config for ombudsmenwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147454 [13:39:25] (03PS1) 10Reedy: Apache config for stewardwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147455 [13:39:51] (03PS1) 
10Reedy: Apache config for checkuserwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147456 [13:40:15] (03PS1) 10Reedy: Apache config for movementroleswiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147457 [13:40:43] (03PS1) 10Reedy: Apache config for outreachwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147458 [13:41:08] (03PS1) 10Reedy: Apache config for collabwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147459 [13:41:34] (03PS1) 10Reedy: Apache config for otrswiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147460 [13:41:57] (03PS1) 10Reedy: Apache config for qualitywiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147461 [13:42:26] (03PS1) 10Reedy: Apache config for auditcomwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147462 [13:43:06] (03PS1) 10Reedy: Apache config for advisorywiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147463 [13:43:23] lol [13:43:28] it's running git gc in the middle of it all [13:43:30] (03PS1) 10Reedy: Apache config for chairwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147464 [13:43:55] (03PS1) 10Reedy: Apache config for officewiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147465 [13:44:21] (03PS1) 10Reedy: Apache config for strategywiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147466 [13:44:45] (03PS1) 10Reedy: Apache config for usabilitywiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147467 [13:45:09] (03PS1) 10Reedy: Apache config for searchcomwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147469 [13:45:37] (03PS1) 10Reedy: Apache config for specieswiki using mod_proxy_fcgi [operations/puppet] - 
10https://gerrit.wikimedia.org/r/147470 [13:46:04] (03PS1) 10Reedy: Apache config for incubatorwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147471 [13:46:28] (03PS1) 10Reedy: Apache config for chapcomwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147472 [13:46:53] (03PS1) 10Reedy: Apache config for spcomwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147473 [13:47:18] (03PS1) 10Reedy: Apache config for boardgovcomwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147474 [13:47:42] (03PS1) 10Reedy: Apache config for boardwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147475 [13:48:09] (03PS1) 10Reedy: Apache config for internalwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147476 [13:48:33] (03PS1) 10Reedy: Apache config for fdcwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147477 [13:48:58] (03PS1) 10Reedy: Apache config for grantswiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147478 [13:49:22] (03PS1) 10Reedy: Apache config for commonswiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147479 [13:49:45] (03PS1) 10Reedy: Apache config for sourceswiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147480 [13:50:10] (03PS1) 10Reedy: Apache config for metawiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147481 [13:50:29] (03PS1) 10coren: Add gold and platinum MAC to dhcp [operations/puppet] - 10https://gerrit.wikimedia.org/r/147482 [13:50:35] (03PS1) 10Reedy: Apache config for Wikimania wikis using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147483 [13:50:58] (03PS1) 10Reedy: Apache config for foundationwiki using mod_proxy_fcgi [operations/puppet] - 10https://gerrit.wikimedia.org/r/147484 
[13:51:05] <_joe_> Reedy: when are people supposed to review this shitload of configs? [13:51:20] 2016 :P [13:51:32] <_joe_> Reedy: btw, you can create a simple include for fcgi that could be shared between all those sites [13:51:41] e [13:51:47] <_joe_> not that I want you to change all your patches now [13:51:50] lol [13:52:00] <_joe_> but I just noticed [13:52:02] <_joe_> :) [13:52:17] There are docroot differences between them [13:52:53] <_joe_> I have an even more evil idea [13:53:02] <_joe_> but ok, don't be scared [13:53:03] (03PS1) 10Reedy: Redirect c[sz].wikimedia.org to http://www.wikimedia.cz [operations/puppet] - 10https://gerrit.wikimedia.org/r/147485 [13:53:19] (03Abandoned) 10Reedy: Redirect c[sz].wikimedia.org to http://www.wikimedia.cz [operations/apache-config] - 10https://gerrit.wikimedia.org/r/143095 (owner: 10Reedy) [13:53:49] (03PS2) 10coren: Add gold and platinum MAC to dhcp [operations/puppet] - 10https://gerrit.wikimedia.org/r/147482 [13:54:10] (03PS1) 10Reedy: Move a lot of the miscellaneous wikis out of their own specific docroots [operations/puppet] - 10https://gerrit.wikimedia.org/r/147486 [13:54:18] (03PS10) 10Reedy: Move a lot of the miscellaneous wikis out of their own specific docroots [operations/apache-config] - 10https://gerrit.wikimedia.org/r/90703 [13:54:25] (03Abandoned) 10Reedy: Move a lot of the miscellaneous wikis out of their own specific docroots [operations/apache-config] - 10https://gerrit.wikimedia.org/r/90703 (owner: 10Reedy) [13:55:08] (03PS1) 10Reedy: Add robots.txt rewrite rule where wiki is public [operations/puppet] - 10https://gerrit.wikimedia.org/r/147487 [13:55:20] (03Abandoned) 10Reedy: Add robots.txt rewrite rule where wiki is public [operations/apache-config] - 10https://gerrit.wikimedia.org/r/143184 (owner: 10Reedy) [13:55:24] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] swift: use statsd_default_sample_rate default [operations/puppet] - 10https://gerrit.wikimedia.org/r/147417 (owner: 10Filippo Giunchedi) 
[13:55:32] (03CR) 10Alexandros Kosiaris: [C: 031] Add gold and platinum MAC to dhcp [operations/puppet] - 10https://gerrit.wikimedia.org/r/147482 (owner: 10coren) [13:56:02] (03CR) 10coren: [C: 032] Add gold and platinum MAC to dhcp [operations/puppet] - 10https://gerrit.wikimedia.org/r/147482 (owner: 10coren) [13:56:24] (03PS1) 10Reedy: Make apple-touch-icon.png configurable via touch.php [operations/puppet] - 10https://gerrit.wikimedia.org/r/147488 [13:56:32] (03Abandoned) 10Reedy: Make apple-touch-icon.png configurable via touch.php [operations/apache-config] - 10https://gerrit.wikimedia.org/r/143188 (owner: 10Reedy) [14:03:28] (03PS1) 10coren: Typo fix in linux-host-entries.ttyS1-115200 [operations/puppet] - 10https://gerrit.wikimedia.org/r/147489 [14:04:47] (03CR) 10coren: [C: 032] "Stupid typo." [operations/puppet] - 10https://gerrit.wikimedia.org/r/147489 (owner: 10coren) [14:05:02] PROBLEM - puppet last run on carbon is CRITICAL: CRITICAL: Puppet has 1 failures [14:05:23] this check is working way too well..... [14:05:36] hmmm I just got an idea... it could even tell us what the problem is [14:06:06] so that icinga-wm spams the channel with even more unimportant info [14:08:18] (03PS1) 10Reedy: Add migraterepo.php script [operations/apache-config] - 10https://gerrit.wikimedia.org/r/147490 [14:08:21] _joe_: ^ [14:08:30] indeed, it is fairly noisy [14:10:20] I have 2 actions for that check, one figure out a race that occurs on hosts where the catalog does not even compile, two to stop the daily whining of 5-10 server whenever logrotate gracefully restarts apache every day on 6:25 UTC [14:10:32] <_joe_> Reedy: \o/ [14:10:43] I was hoping the graceful restart would solve that but it did not [14:11:03] <_joe_> Reedy: can you update the readme? 
[14:12:39] (03PS2) 10Reedy: Add migraterepo.php script [operations/apache-config] - 10https://gerrit.wikimedia.org/r/147490 [14:14:02] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet). [14:14:12] again ? [14:14:14] sigh... [14:14:42] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] Add migraterepo.php script [operations/apache-config] - 10https://gerrit.wikimedia.org/r/147490 (owner: 10Reedy) [14:14:56] (03CR) 10Filippo Giunchedi: jobrunner: create hhvm-only jobrunners (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/147086 (owner: 10Giuseppe Lavagetto) [14:15:06] <_joe_> akosiaris: this happens continuously [14:15:41] and yet we can not reliably reproduce it [14:15:58] I just ran git pull on strontium and it fetched the changes nicely [14:16:01] grrrr [14:16:11] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge. [14:16:14] akosiaris: yesterday it even went so far to say there are 3 pending changes.. then when i go to strontium myself.. i can pull.. no error at all like you saw it [14:16:23] and then recovery by itself,like here [14:16:29] (03Abandoned) 10Reedy: Apache config for foundationwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146088 (owner: 10Reedy) [14:16:38] niah it is never recovering by itself [14:16:50] i disagree, it did [14:16:51] someone just runs either puppet-merge or git pull [14:16:52] that is when puppet-merge can't talk to strontium to tell it to pull no? [14:16:56] <_joe_> it recovers when someone merges [14:17:12] <_joe_> godog: it's the git post-merge hook, actually [14:17:40] godog: http://p.defau.lt/?VoFOQoBQ7pm0MxDXAPh_4g [14:17:46] this is what happens [14:18:13] <_joe_> ... 
[14:18:16] I have seen this only one when running puppet-merge yet it is quite often [14:18:34] and it is bound to happen if you do sudo puppet-merge instead of sudo -s (or -i) ; puppet-merge [14:18:46] akosiaris: yep [14:18:49] $USER not being setup and all [14:18:53] (03Abandoned) 10Reedy: Apache config for votewiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146292 (owner: 10Reedy) [14:18:54] _joe_: yeah the hook [14:18:57] (03Abandoned) 10Reedy: Apache config for donatewiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146289 (owner: 10Reedy) [14:19:00] (03Abandoned) 10Reedy: Apache config for wikidatawiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146287 (owner: 10Reedy) [14:19:03] (03Abandoned) 10Reedy: Apache config for testwikidatawiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146286 (owner: 10Reedy) [14:19:06] (03Abandoned) 10Reedy: Apache config for mediawikiwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146285 (owner: 10Reedy) [14:19:07] <_joe_> akosiaris: we should fix this in fact [14:19:09] (03Abandoned) 10Reedy: Apache config for Wiktionary using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146284 (owner: 10Reedy) [14:19:12] (03Abandoned) 10Reedy: Apache config for Wikiquote using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146283 (owner: 10Reedy) [14:19:15] (03Abandoned) 10Reedy: Apache config for Wikipedia using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146280 (owner: 10Reedy) [14:19:18] (03Abandoned) 10Reedy: Apache config for Wikibooks using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146278 (owner: 10Reedy) [14:19:22] (03Abandoned) 10Reedy: Apache config for Wikisource using mod_proxy_fcgi [operations/apache-config] - 
10https://gerrit.wikimedia.org/r/146277 (owner: 10Reedy) [14:19:25] (03Abandoned) 10Reedy: Apache config for Wikinews using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146276 (owner: 10Reedy) [14:19:28] _joe_: yeah, it should solve part of the problem for sure [14:19:28] (03Abandoned) 10Reedy: Apache config for Wikiversity using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146275 (owner: 10Reedy) [14:19:30] i even posted that link from alex yesterday.. i did NOT get that error when it happened to me [14:19:31] (03Abandoned) 10Reedy: Apache config for iegcomwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146271 (owner: 10Reedy) [14:19:34] (03Abandoned) 10Reedy: Apache config for transitionteamwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146270 (owner: 10Reedy) [14:19:37] (03Abandoned) 10Reedy: Apache config for zerowiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146268 (owner: 10Reedy) [14:19:40] (03Abandoned) 10Reedy: Apache config for legalteamwiki sing mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146267 (owner: 10Reedy) [14:19:43] Reedy: just out of curiosity, is it really you clicking the gerrit ui or sth automated? 
:) [14:19:44] (03Abandoned) 10Reedy: Apache config for loginwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146266 (owner: 10Reedy) [14:19:46] (03Abandoned) 10Reedy: Apache config for wikimedia chapters using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146265 (owner: 10Reedy) [14:19:53] (03Abandoned) 10Reedy: Apache config for stewardwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146128 (owner: 10Reedy) [14:19:54] automated [14:19:56] (03Abandoned) 10Reedy: Apache config for checkuserwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146127 (owner: 10Reedy) [14:19:59] (03Abandoned) 10Reedy: Apache config for movementroleswiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146125 (owner: 10Reedy) [14:20:03] (03Abandoned) 10Reedy: Apache config for outreachwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146124 (owner: 10Reedy) [14:20:05] phew [14:20:06] (03Abandoned) 10Reedy: Apache config for collabwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146123 (owner: 10Reedy) [14:20:06] godog: 1 liner to https://github.com/wikimedia/mediawiki-tools-dippybird/blob/master/dippy-bird.php [14:20:09] (03Abandoned) 10Reedy: Apache config for otrswiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146122 (owner: 10Reedy) [14:20:13] (03Abandoned) 10Reedy: Apache config for qualitywiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146121 (owner: 10Reedy) [14:20:16] (03Abandoned) 10Reedy: Apache config for auditcomwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146119 (owner: 10Reedy) [14:20:19] (03Abandoned) 10Reedy: Apache config for advisorywiki using mod_proxy_fcgi [operations/apache-config] - 
10https://gerrit.wikimedia.org/r/146118 (owner: 10Reedy) [14:20:22] (03Abandoned) 10Reedy: Apache config for chairwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146116 (owner: 10Reedy) [14:20:25] (03Abandoned) 10Reedy: Apache config for officewiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146113 (owner: 10Reedy) [14:20:29] (03Abandoned) 10Reedy: Apache config for strategywiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146108 (owner: 10Reedy) [14:20:32] (03Abandoned) 10Reedy: Apache config for usabilitywiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146107 (owner: 10Reedy) [14:20:35] (03Abandoned) 10Reedy: Apache config for searchcomwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146106 (owner: 10Reedy) [14:20:38] (03Abandoned) 10Reedy: Apache config for specieswiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146105 (owner: 10Reedy) [14:20:41] (03Abandoned) 10Reedy: Apache config for incubatorwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146104 (owner: 10Reedy) [14:20:46] (03Abandoned) 10Reedy: Apache config for chapcomwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146103 (owner: 10Reedy) [14:20:49] (03Abandoned) 10Reedy: Apache config for spcomwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146102 (owner: 10Reedy) [14:20:52] (03Abandoned) 10Reedy: Apache config for boardgovcomwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146101 (owner: 10Reedy) [14:20:55] (03Abandoned) 10Reedy: Apache config for boardwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146100 (owner: 10Reedy) [14:20:58] (03Abandoned) 10Reedy: Apache config for 
internalwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146099 (owner: 10Reedy) [14:21:11] php /var/www/wiki/mediawiki/tools/dippybird/dippy-bird.php --username=reedy --server=gerrit.wikimedia.org --port=29418 --query=$changeid --action=abandon [14:21:19] (03Abandoned) 10Reedy: Apache config for ombudsmenwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146129 (owner: 10Reedy) [14:21:22] (03Abandoned) 10Reedy: Apache config for fdcwiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146097 (owner: 10Reedy) [14:21:25] (03Abandoned) 10Reedy: Apache config for grantswiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146096 (owner: 10Reedy) [14:21:28] (03Abandoned) 10Reedy: Apache config for commonswiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146095 (owner: 10Reedy) [14:21:32] (03Abandoned) 10Reedy: Apache config for sourceswiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146093 (owner: 10Reedy) [14:21:35] (03Abandoned) 10Reedy: Apache config for metawiki using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146092 (owner: 10Reedy) [14:21:38] (03Abandoned) 10Reedy: Apache config for Wikimania wikis using mod_proxy_fcgi [operations/apache-config] - 10https://gerrit.wikimedia.org/r/146089 (owner: 10Reedy) [14:21:45] Reedy: ah ok, I went around trying https://github.com/pandemicsyn/fgerrit but never got used to it [14:22:01] (03Abandoned) 10Reedy: Rename "chapcomwiki" to "affcomwiki" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/53922 (https://bugzilla.wikimedia.org/39482) (owner: 10Reedy) [14:23:01] RECOVERY - puppet last run on carbon is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [14:23:34] oh srsly [14:23:41] I can't abandon other peoples changes in that repo 
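[editor's note] The `sudo puppet-merge` versus `sudo -i` problem raised in this discussion comes down to the process environment not looking like a root login shell. A minimal Python sketch of such a pre-flight check follows. It is not part of puppet-merge; the function name and the expected values are assumptions, and which variables actually differ under plain `sudo` depends on the local sudoers configuration.

```python
import os

# Hypothetical pre-flight check: report environment variables that do not
# match what a root login shell would be expected to provide.


def root_env_problems(env=None, expected_home="/root", expected_user="root"):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    if env.get("HOME") != expected_home:
        problems.append(f"HOME is {env.get('HOME')!r}, expected {expected_home!r}")
    if env.get("USER") != expected_user:
        problems.append(f"USER is {env.get('USER')!r}, expected {expected_user!r}")
    return problems


if __name__ == "__main__":
    issues = root_env_problems()
    if issues:
        print("environment does not look like a root login shell:")
        for issue in issues:
            print("  " + issue)
    else:
        print("environment looks fine")
```

A wrapper could refuse to continue when the list is non-empty, which would turn a silent hook failure into an immediate, visible one.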
[14:24:00] Reedy: you need some more access for that ;) [14:24:09] _joe_ mutante akosiaris and in turn IIRC the hook fails because HOME isn't /root and ssh to strontium fails [14:24:16] in the sudo puppet-merge case that is [14:24:29] <_joe_> so sudo -i [14:25:29] http://p.defau.lt/?sVhDg6QxTwapnfZEADsEiQ if someone else wants to do it [14:25:56] need a copy of dippy-bird.php [14:26:18] _joe_: I know what the solution is, but clearly not enough for everybody if it keeps happening.. or perhaps it fails for other reasons? [14:26:25] (03PS1) 10Reedy: Fix str_replace call [operations/apache-config] - 10https://gerrit.wikimedia.org/r/147491 [14:27:23] godog: ok then let's fix the sudo puppet-merge case and if it fails for other reasons too (i think it does) we need to figure them out [14:27:45] I am this close to making puppet-merge logging everything it does... [14:28:44] akosiaris: yep, I couldn't think of a better solution than checking HOME off top of my head, I'm sure there's something more robust [14:28:51] just run as root without sudo ?:p [14:28:59] mutante: :P [14:30:57] (03PS1) 10Alexandros Kosiaris: Add package building instruction in the README [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147492 [14:34:45] (03PS1) 10Alexandros Kosiaris: Add scalac as a build dependency [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147493 [14:35:29] (03CR) 10Alexandros Kosiaris: [C: 032] Add scalac as a build dependency [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147493 (owner: 10Alexandros Kosiaris) [14:37:21] (03PS1) 10Alexandros Kosiaris: Use %(version)s in gbp.conf for debian branch [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147494 [14:37:39] !log rolling reload of proxy-server on swift ms-fe1* to pick up changes [14:37:45] Logged the message, Master [14:41:06] (03CR) 10Dzahn: "yea, weird we don't use role::smokeping but then there is "include smokeping" on netmon1001 in site.pp" 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [14:42:14] PROBLEM - puppet last run on platinum is CRITICAL: CRITICAL: Puppet has 1 failures [14:42:44] akosiaris: thanks for all the reviews and merges today [14:42:44] PROBLEM - puppet last run on gold is CRITICAL: CRITICAL: Puppet has 1 failures [14:43:44] RECOVERY - puppet last run on gold is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [14:44:14] RECOVERY - puppet last run on platinum is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [14:46:41] (03PS1) 10Dzahn: smokeping - use apache::conf for namevirtualhost [operations/puppet] - 10https://gerrit.wikimedia.org/r/147495 [14:53:15] PROBLEM - DPKG on neon is CRITICAL: DPKG CRITICAL dpkg reports broken packages [14:53:41] eh, neon? i'll look [14:54:14] RECOVERY - DPKG on neon is OK: All packages OK [14:54:32] did somebody upgrade or do we have an "ensure => latest" [14:56:35] heh, that must have been human, 0 pending :) [14:58:01] hah, so it was okay by the time you looked? [14:58:44] yea [15:00:56] mwalker: glad we didn't let ethics get in the way of you getting perms on beta cluster [15:00:59] icinga basically does "dpkg -l" and checks for lines not starting ii or rc [15:01:17] somebody upgraded everything [15:03:32] greg-g: Ethics? Here? No. [15:03:58] (03CR) 10BryanDavis: jobrunner: create hhvm-only jobrunners (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/147086 (owner: 10Giuseppe Lavagetto) [15:20:54] PROBLEM - Puppet freshness on db1007 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 13:20:21 UTC [15:23:44] (03CR) 10Chmarkine: "Since "this site works only in browsers with SNI support", how about we remove SSL 3? 
Only IE 6 on Win XP doesn't support TLS 1.0 and IE 6" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [15:31:45] (03CR) 10Giuseppe Lavagetto: jobrunner: create hhvm-only jobrunners (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/147086 (owner: 10Giuseppe Lavagetto) [15:32:25] removed DNS from topic, incident report is now here btw: https://wikitech.wikimedia.org/wiki/Incident_documentation/20140716-DNS [15:33:01] (removed lists: too) [15:33:20] thanks bblack [15:35:58] (03CR) 10Alexandros Kosiaris: icinga - update SSL cipher list (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/147207 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [15:43:52] (03PS1) 10Alexandros Kosiaris: Use pbuilder by default [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147499 [15:56:12] (03PS1) 10Dzahn: delete unused role/smokeping.pp [operations/puppet] - 10https://gerrit.wikimedia.org/r/147502 [15:59:20] godog: thanks! https://wikitech.wikimedia.org/w/index.php?title=Incident_documentation/20131205-Swift&diff=next&oldid=107068 [16:00:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 15:57:53 UTC [16:01:48] (03Abandoned) 10Dzahn: smokeping - use apache::conf for namevirtualhost [operations/puppet] - 10https://gerrit.wikimedia.org/r/147495 (owner: 10Dzahn) [16:01:53] greg-g: no problem! 
slowly converging :) [16:02:06] :) [16:02:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 15:57:53 UTC [16:03:10] <_joe_> mmmh [16:04:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 15:57:53 UTC [16:05:15] <_joe_> and it's bogus [16:05:21] <_joe_> bye [16:05:33] (03PS3) 10Dzahn: generic_vhost - outdated variable syntax [operations/puppet] - 10https://gerrit.wikimedia.org/r/147210 [16:05:34] RECOVERY - Puppet freshness on mw1078 is OK: puppet ran at Fri Jul 18 16:05:24 UTC 2014 [16:07:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:05:24 UTC [16:09:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:05:24 UTC [16:10:15] (03PS1) 10Dzahn: delete smokeping apache template [operations/puppet] - 10https://gerrit.wikimedia.org/r/147504 [16:11:20] (03CR) 10Dzahn: "--> Change-Id: I8456d08bfda914308" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [16:11:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:05:24 UTC [16:11:33] (03Abandoned) 10Dzahn: smokeping - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147196 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [16:11:53] (03Abandoned) 10Dzahn: smokeping - retab Apache config [operations/puppet] - 10https://gerrit.wikimedia.org/r/147192 (owner: 10Dzahn) [16:12:11] (03Abandoned) 10Dzahn: smokeping - outdated variable syntax [operations/puppet] - 10https://gerrit.wikimedia.org/r/147191 (owner: 10Dzahn) [16:13:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:05:24 UTC [16:13:57] (03PS3) 10Dzahn: racktables - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147185 
(https://bugzilla.wikimedia.org/53259) [16:15:04] RECOVERY - Puppet freshness on mw1078 is OK: puppet ran at Fri Jul 18 16:15:01 UTC 2014 [16:17:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:15:01 UTC [16:17:25] RECOVERY - Puppet freshness on mw1078 is OK: puppet ran at Fri Jul 18 16:17:23 UTC 2014 [16:18:27] (03Abandoned) 10Dzahn: remove pmtpa access switches [operations/dns] - 10https://gerrit.wikimedia.org/r/143202 (owner: 10Dzahn) [16:19:24] PROBLEM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:17:23 UTC [16:19:54] CUSTOM - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:17:23 UTC [16:20:15] ACKNOWLEDGEMENT - Puppet freshness on mw1078 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 16:17:23 UTC daniel_zahn bogus [16:21:15] ACKNOWLEDGEMENT - check configured eth on labstore1001 is CRITICAL: bond0 reporting no carrier. daniel_zahn RT-7657 [16:28:55] (03PS1) 10Rush: iridium.wikimedia.org => iridium.eqiad.wmnet [operations/dns] - 10https://gerrit.wikimedia.org/r/147506 [16:31:42] (03PS2) 10Rush: phab settings comment [operations/puppet] - 10https://gerrit.wikimedia.org/r/147326 [16:31:44] (03PS1) 10Rush: iridium.wikimedia.org => iridium.eqiad.wmnet [operations/puppet] - 10https://gerrit.wikimedia.org/r/147507 [16:32:24] (03CR) 10Dzahn: [C: 031] iridium.wikimedia.org => iridium.eqiad.wmnet [operations/dns] - 10https://gerrit.wikimedia.org/r/147506 (owner: 10Rush) [16:33:17] ACKNOWLEDGEMENT - Disk space on dataset1001 is CRITICAL: DISK CRITICAL - free space: /data 860459 MB (2% inode=99%): daniel_zahn RT-7922 [16:34:37] (03CR) 10Rush: [C: 032] iridium.wikimedia.org => iridium.eqiad.wmnet [operations/dns] - 10https://gerrit.wikimedia.org/r/147506 (owner: 10Rush) [16:36:56] (03PS3) 10Rush: phab settings comment [operations/puppet] - 10https://gerrit.wikimedia.org/r/147326 [16:37:01] (03CR) 10Rush: [C: 032 V: 032] 
phab settings comment [operations/puppet] - 10https://gerrit.wikimedia.org/r/147326 (owner: 10Rush) [16:37:09] ACKNOWLEDGEMENT - puppet last run on dataset2 is CRITICAL: CRITICAL: Puppet has 3 failures daniel_zahn RT-7923 [16:37:10] (03PS2) 10Rush: iridium.wikimedia.org => iridium.eqiad.wmnet [operations/puppet] - 10https://gerrit.wikimedia.org/r/147507 [16:37:16] (03CR) 10Rush: [C: 032 V: 032] iridium.wikimedia.org => iridium.eqiad.wmnet [operations/puppet] - 10https://gerrit.wikimedia.org/r/147507 (owner: 10Rush) [16:38:14] "Slow CirrusSearch query rate" happens a lot.. but then it also always goes away by itself.. [16:38:22] (03CR) 10Chmarkine: [C: 031] racktables - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147185 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [16:41:35] ACKNOWLEDGEMENT - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.0933333333333 daniel_zahn RT-7924 [16:46:31] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: Puppet has 1 failures [16:47:14] (03PS1) 10RobH: blog ttl lowered to 5m, removed old/depreciated entries for blog [operations/dns] - 10https://gerrit.wikimedia.org/r/147508 [16:48:00] manybubbles: so I had a look at https://www.mediawiki.org/wiki/Search#Wikis [16:48:11] Does this mean you don't have a date for nlwiki yet? [16:48:44] odder: let me fix it:) [16:50:00] odder: [16:50:01] https://www.mediawiki.org/w/index.php?title=Search&diff=1070303&oldid=1068917 [16:50:17] we'll put the new schedule back together week after next after I get back from vacation [16:50:35] after I verify my performance work which I'm finishing up today [16:50:40] (03CR) 10RobH: "removing community, test, and global blog dns entries. 
They are NOT migrating with the blog on Monday, and their HTTPS support has been r" [operations/dns] - 10https://gerrit.wikimedia.org/r/147508 (owner: 10RobH) [16:50:46] if all goes well we'll go back to two a week I imagine [16:50:56] maybe four a week [16:51:01] like we'd tried [16:51:26] * odder doesn't understand 'we slipped the schedule' [16:52:01] (03CR) 10Dzahn: [C: 031] blog ttl lowered to 5m, removed old/depreciated entries for blog [operations/dns] - 10https://gerrit.wikimedia.org/r/147508 (owner: 10RobH) [16:52:04] jawiki was supposed to go to Cirrus on 14th, and now it's rescheduled for Aug 20 [16:52:23] (03CR) 10RobH: [C: 032] blog ttl lowered to 5m, removed old/depreciated entries for blog [operations/dns] - 10https://gerrit.wikimedia.org/r/147508 (owner: 10RobH) [16:54:23] (03CR) 10Dzahn: [C: 032] etherpad - retab Apache config [operations/puppet] - 10https://gerrit.wikimedia.org/r/147197 (owner: 10Dzahn) [16:54:53] odder: yeah - it was one of the ones we rolled back [16:55:11] Hence my question if you have a date for nlwiki, as you rolled that one back, too [16:56:02] we don't have a date for when we'll put any of them back. When I put jawiki where I put it, it was a mistake. [16:56:25] does nlwiki need to move around in the list or need more warning? [16:56:30] Oh, I see your latest edit [16:56:36] Thanks :) [16:57:41] sorry for being confusing [16:57:54] papaul: https://rt.wikimedia.org/Ticket/Display.html?id=7917 [16:58:03] manybubbles: I think an e-mail notification to wikitech-ambassadors a week before the deployment is fine [16:58:16] (03CR) 10Dzahn: "Chmarkine, that is probably a good point, but it would also affect all these other misc services on zirconium. 
i'd like to have that as a " [operations/puppet] - 10https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [16:58:18] manybubbles: We'll carry the news in Tech News, too, if you give us enough time [16:59:03] (03CR) 10Dzahn: [C: 032] etherpad - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [16:59:42] odder: well, for now just the fact that the rollback happened, and re-enablement/continuing will start happening in 2 weeks [16:59:49] gah, a volunteer who is not a trusted user just has to update the commit message and it's not getting +2 from jenkins anymore.. [17:00:11] (i know it's not new, just adding user to trusted user regex :) [17:00:44] greg-g: Yep, my intention precisely. [17:00:53] * greg-g nods [17:00:56] Just thinking ahead a little here :) [17:00:59] thanks :) [17:01:19] (03CR) 10Dzahn: [V: 032] etherpad - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147199 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [17:03:26] (03PS1) 10Ori.livneh: mediawiki::web: use floor/min instead of inline_template [operations/puppet] - 10https://gerrit.wikimedia.org/r/147511 [17:04:31] RECOVERY - puppet last run on db1022 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [17:06:25] (03CR) 10Plucas: "Our use-case is that we are using the java_home argument to puppet-kafka (see: https://gerrit.wikimedia.org/r/#/c/147010/) to specify whic" [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147338 (owner: 10Kmosher) [17:06:27] (03CR) 10RobH: [C: 031] "if something else is already controlling netmon's use of smokeping, then this is indeed depreciated" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147502 (owner: 10Dzahn) [17:06:35] (03PS3) 10Aaron Schulz: Lowered the default job runner timeout [operations/puppet] - 10https://gerrit.wikimedia.org/r/147388 
[17:08:31] (03PS1) 10Ori.livneh: mediawiki::web: get rid of envvars.appserver [operations/puppet] - 10https://gerrit.wikimedia.org/r/147514 [17:09:01] ori: https://gerrit.wikimedia.org/r/#/c/147388/ [17:09:01] _joe_: ^ [17:09:03] pretty amazing [17:10:09] <_joe_> ori: ahahhah LOL [17:10:32] <_joe_> so, one less thing I should modify of the debian plain config [17:12:27] _joe_: yeah, we should restore the package's envvars file ideally [17:14:04] (03PS4) 10Ori.livneh: Lowered the default job runner timeout [operations/puppet] - 10https://gerrit.wikimedia.org/r/147388 (owner: 10Aaron Schulz) [17:16:20] PROBLEM - puppet last run on stat1003 is CRITICAL: CRITICAL: Puppet has 9 failures [17:16:30] PROBLEM - puppet last run on mw1139 is CRITICAL: CRITICAL: Puppet has 10 failures [17:16:31] PROBLEM - puppet last run on amssq31 is CRITICAL: CRITICAL: Puppet has 2 failures [17:16:31] PROBLEM - puppet last run on search1011 is CRITICAL: CRITICAL: Puppet has 39 failures [17:16:31] PROBLEM - puppet last run on mw1016 is CRITICAL: CRITICAL: Puppet has 17 failures [17:16:31] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: Puppet has 9 failures [17:16:40] PROBLEM - puppet last run on mw1209 is CRITICAL: CRITICAL: Puppet has 44 failures [17:16:50] PROBLEM - puppet last run on ms-be1011 is CRITICAL: CRITICAL: Puppet has 13 failures [17:16:50] PROBLEM - puppet last run on db1072 is CRITICAL: CRITICAL: Puppet has 13 failures [17:16:50] PROBLEM - puppet last run on mw1027 is CRITICAL: CRITICAL: Puppet has 70 failures [17:16:50] PROBLEM - puppet last run on search1006 is CRITICAL: CRITICAL: Puppet has 31 failures [17:16:50] PROBLEM - puppet last run on elastic1003 is CRITICAL: CRITICAL: Puppet has 13 failures [17:17:00] PROBLEM - puppet last run on strontium is CRITICAL: CRITICAL: Puppet has 31 failures [17:17:00] PROBLEM - puppet last run on mw1152 is CRITICAL: CRITICAL: Puppet has 46 failures [17:17:00] PROBLEM - puppetmaster backend https on strontium is CRITICAL: HTTP 
CRITICAL - Invalid HTTP response received from host on port 8141: HTTP/1.1 500 Internal Server Error [17:17:00] PROBLEM - puppet last run on mw1220 is CRITICAL: CRITICAL: Puppet has 38 failures [17:17:10] PROBLEM - puppet last run on stat1001 is CRITICAL: CRITICAL: Puppet has 30 failures [17:17:10] PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: Puppet has 19 failures [17:17:11] PROBLEM - puppet last run on terbium is CRITICAL: CRITICAL: Puppet has 66 failures [17:17:11] PROBLEM - puppet last run on cp1067 is CRITICAL: CRITICAL: Puppet has 17 failures [17:17:20] PROBLEM - puppet last run on analytics1003 is CRITICAL: CRITICAL: Puppet has 20 failures [17:17:20] PROBLEM - puppet last run on cp1040 is CRITICAL: CRITICAL: Puppet has 21 failures [17:17:20] PROBLEM - puppet last run on mw1107 is CRITICAL: CRITICAL: Puppet has 49 failures [17:17:20] PROBLEM - puppet last run on cp1053 is CRITICAL: CRITICAL: Puppet has 25 failures [17:17:20] PROBLEM - puppet last run on search1004 is CRITICAL: CRITICAL: Puppet has 39 failures [17:17:20] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Puppet has 56 failures [17:17:21] PROBLEM - puppet last run on db60 is CRITICAL: CRITICAL: Puppet has 19 failures [17:17:22] (03CR) 10Ori.livneh: [C: 032] Lowered the default job runner timeout [operations/puppet] - 10https://gerrit.wikimedia.org/r/147388 (owner: 10Aaron Schulz) [17:17:30] PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: Puppet has 58 failures [17:17:30] PROBLEM - puppet last run on mw1193 is CRITICAL: CRITICAL: Puppet has 64 failures [17:17:31] PROBLEM - puppet last run on mw1064 is CRITICAL: CRITICAL: Puppet has 58 failures [17:17:31] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: Puppet has 47 failures [17:17:31] PROBLEM - puppet last run on cp1044 is CRITICAL: CRITICAL: Puppet has 19 failures [17:17:31] PROBLEM - puppet last run on wtp1010 is CRITICAL: CRITICAL: Puppet has 16 failures [17:17:31] PROBLEM - puppet last run on 
cp4012 is CRITICAL: CRITICAL: Puppet has 17 failures [17:17:31] PROBLEM - puppet last run on ms-be1002 is CRITICAL: CRITICAL: Puppet has 22 failures [17:17:32] PROBLEM - puppet last run on ms-be1015 is CRITICAL: CRITICAL: Puppet has 15 failures [17:17:32] PROBLEM - puppet last run on wtp1008 is CRITICAL: CRITICAL: Puppet has 23 failures [17:17:33] PROBLEM - puppet last run on mw1093 is CRITICAL: CRITICAL: Puppet has 31 failures [17:17:34] PROBLEM - puppet last run on mw1204 is CRITICAL: CRITICAL: Puppet has 55 failures [17:17:34] PROBLEM - puppet last run on mw1219 is CRITICAL: CRITICAL: Puppet has 54 failures [17:17:34] PROBLEM - puppet last run on search1012 is CRITICAL: CRITICAL: Puppet has 38 failures [17:17:35] PROBLEM - puppet last run on analytics1018 is CRITICAL: CRITICAL: Puppet has 18 failures [17:17:35] PROBLEM - puppet last run on mw1112 is CRITICAL: CRITICAL: Puppet has 57 failures [17:17:36] PROBLEM - puppet last run on mw1071 is CRITICAL: CRITICAL: Puppet has 54 failures [17:17:36] PROBLEM - puppet last run on db71 is CRITICAL: CRITICAL: Puppet has 20 failures [17:17:40] PROBLEM - puppet last run on mw1143 is CRITICAL: CRITICAL: Puppet has 64 failures [17:17:40] PROBLEM - puppet last run on mw1203 is CRITICAL: CRITICAL: Puppet has 53 failures [17:17:40] PROBLEM - puppet last run on cp4002 is CRITICAL: CRITICAL: Puppet has 25 failures [17:17:40] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: Puppet has 20 failures [17:17:50] PROBLEM - puppet last run on mw1207 is CRITICAL: CRITICAL: Puppet has 60 failures [17:17:50] PROBLEM - puppet last run on mw1066 is CRITICAL: CRITICAL: Puppet has 57 failures [17:17:50] PROBLEM - puppet last run on ssl3001 is CRITICAL: CRITICAL: Puppet has 28 failures [17:17:51] PROBLEM - puppet last run on mw1154 is CRITICAL: CRITICAL: Puppet has 58 failures [17:17:51] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: Puppet has 20 failures [17:17:53] um? 
[17:17:54] oh puppet [17:18:00] PROBLEM - puppet last run on db1045 is CRITICAL: CRITICAL: Puppet has 23 failures [17:18:00] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: Puppet has 27 failures [17:18:10] PROBLEM - puppet last run on mw1155 is CRITICAL: CRITICAL: Puppet has 57 failures [17:18:10] PROBLEM - puppet last run on db1037 is CRITICAL: CRITICAL: Puppet has 19 failures [17:18:10] PROBLEM - puppet last run on dysprosium is CRITICAL: CRITICAL: Puppet has 14 failures [17:18:11] PROBLEM - puppet last run on ms-fe1002 is CRITICAL: CRITICAL: Puppet has 15 failures [17:18:11] PROBLEM - puppet last run on osm-cp1001 is CRITICAL: CRITICAL: Puppet has 19 failures [17:18:11] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:11] PROBLEM - puppet last run on virt1008 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:13] (03CR) 10ArielGlenn: "You might want to consider adding log rotation so we don't keep these logs around forever." [operations/puppet] - 10https://gerrit.wikimedia.org/r/140948 (owner: 10Dzahn) [17:18:16] lots of failures, more than normal [17:18:20] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: Puppet has 25 failures [17:18:20] PROBLEM - puppet last run on mw1113 is CRITICAL: CRITICAL: Puppet has 53 failures [17:18:20] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: Puppet has 55 failures [17:18:20] PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: Puppet has 22 failures [17:18:21] PROBLEM - puppet last run on cp1068 is CRITICAL: CRITICAL: Puppet has 25 failures [17:18:21] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: Puppet has 19 failures [17:18:21] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 19 failures [17:18:30] PROBLEM - puppet last run on db1056 is CRITICAL: CRITICAL: Puppet has 20 failures [17:18:30] PROBLEM - puppet last run on ms-be1010 is CRITICAL: CRITICAL: Puppet has 17 failures [17:18:30] PROBLEM - puppet last run 
on mw1021 is CRITICAL: CRITICAL: Puppet has 55 failures [17:18:30] PROBLEM - puppet last run on analytics1012 is CRITICAL: CRITICAL: Puppet has 22 failures [17:18:30] PROBLEM - puppet last run on mexia is CRITICAL: CRITICAL: Puppet has 20 failures [17:18:30] PROBLEM - puppet last run on lvs4004 is CRITICAL: CRITICAL: Puppet has 23 failures [17:18:30] PROBLEM - puppet last run on cp4009 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:31] PROBLEM - puppet last run on amssq33 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:31] PROBLEM - puppet last run on chromium is CRITICAL: CRITICAL: Puppet has 22 failures [17:18:32] 500 Internal Server Error [17:18:32] PROBLEM - puppet last run on mw1037 is CRITICAL: CRITICAL: Puppet has 59 failures [17:18:32] PROBLEM - puppet last run on mw1047 is CRITICAL: CRITICAL: Puppet has 54 failures [17:18:33] PROBLEM - puppet last run on amssq39 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:34] PROBLEM - puppet last run on mw1073 is CRITICAL: CRITICAL: Puppet has 46 failures [17:18:34] PROBLEM - puppet last run on mw1110 is CRITICAL: CRITICAL: Puppet has 56 failures [17:18:34] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 31 failures [17:18:35] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 60 failures [17:18:35] PROBLEM - puppet last run on mw1086 is CRITICAL: CRITICAL: Puppet has 65 failures [17:18:36] PROBLEM - puppet last run on wtp1001 is CRITICAL: CRITICAL: Puppet has 25 failures [17:18:36] PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: Puppet has 57 failures [17:18:37] PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: Puppet has 24 failures [17:18:37] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: Puppet has 22 failures [17:18:38] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: Puppet has 22 failures [17:18:38] PROBLEM - puppet last run on cp3011 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:40] PROBLEM - puppet 
last run on mw1104 is CRITICAL: CRITICAL: Puppet has 54 failures [17:18:40] PROBLEM - puppet last run on mw1194 is CRITICAL: CRITICAL: Puppet has 49 failures [17:18:45] mutante: ? where [17:18:50] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: Puppet has 16 failures [17:18:50] PROBLEM - puppet last run on search1003 is CRITICAL: CRITICAL: Puppet has 41 failures [17:18:50] PROBLEM - puppet last run on mc1015 is CRITICAL: CRITICAL: Puppet has 21 failures [17:18:50] PROBLEM - puppet last run on db1009 is CRITICAL: CRITICAL: Puppet has 18 failures [17:18:51] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 54 failures [17:19:05] greg-g: random box, db1045.. looks like puppetmaster [17:19:10] PROBLEM - puppet last run on db1005 is CRITICAL: CRITICAL: Puppet has 21 failures [17:19:10] PROBLEM - puppet last run on mw1018 is CRITICAL: CRITICAL: Puppet has 48 failures [17:19:11] PROBLEM - puppet last run on ssl1003 is CRITICAL: CRITICAL: Puppet has 22 failures [17:19:16] oh [17:19:20] PROBLEM - puppet last run on fenari is CRITICAL: CRITICAL: Puppet has 71 failures [17:19:21] PROBLEM - puppet last run on mw1103 is CRITICAL: CRITICAL: Puppet has 60 failures [17:19:21] PROBLEM - puppet last run on mw1128 is CRITICAL: CRITICAL: Puppet has 61 failures [17:19:21] PROBLEM - puppet last run on elastic1010 is CRITICAL: CRITICAL: Puppet has 24 failures [17:19:21] PROBLEM - puppet last run on ms-be1014 is CRITICAL: CRITICAL: Puppet has 19 failures [17:19:21] PROBLEM - puppet last run on labsdb1005 is CRITICAL: CRITICAL: Puppet has 22 failures [17:19:21] PROBLEM - puppet last run on analytics1031 is CRITICAL: CRITICAL: Puppet has 20 failures [17:19:22] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: Puppet has 25 failures [17:19:22] (03CR) 10ArielGlenn: "Oh, forgot to say, I think the default port for ORPort is indeed 443, if we can use that, like Jan says." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/140948 (owner: 10Dzahn) [17:19:22] greg-g: puppet serves files and manifests via HTTP [17:19:22] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: Puppet has 22 failures [17:19:30] PROBLEM - puppet last run on mw1199 is CRITICAL: CRITICAL: Puppet has 61 failures [17:19:30] PROBLEM - puppet last run on es1004 is CRITICAL: CRITICAL: Puppet has 24 failures [17:19:31] PROBLEM - puppet last run on search1008 is CRITICAL: CRITICAL: Puppet has 50 failures [17:19:31] PROBLEM - puppet last run on cp4016 is CRITICAL: CRITICAL: Puppet has 23 failures [17:19:31] PROBLEM - puppet last run on amssq50 is CRITICAL: CRITICAL: Puppet has 23 failures [17:19:31] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 17 failures [17:19:31] PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: Puppet has 24 failures [17:19:31] PROBLEM - puppet last run on mw1017 is CRITICAL: CRITICAL: Puppet has 66 failures [17:19:32] PROBLEM - puppet last run on elastic1013 is CRITICAL: CRITICAL: Puppet has 17 failures [17:19:32] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Puppet has 26 failures [17:19:40] PROBLEM - puppet last run on search1019 is CRITICAL: CRITICAL: Puppet has 49 failures [17:19:40] PROBLEM - puppet last run on hydrogen is CRITICAL: CRITICAL: Puppet has 25 failures [17:19:40] PROBLEM - puppet last run on mw1095 is CRITICAL: CRITICAL: Puppet has 57 failures [17:19:40] PROBLEM - puppet last run on elastic1009 is CRITICAL: CRITICAL: Puppet has 20 failures [17:19:40] PROBLEM - puppet last run on cp4011 is CRITICAL: CRITICAL: Puppet has 21 failures [17:19:42] greg-g: the various transient failures we're seeing usually manifest as HTTP 500s from puppetmaster in the puppet logs [17:19:42] ori: gotcha, I was worried it was causing an outage, thanks [17:19:50] PROBLEM - puppet last run on mw1070 is CRITICAL: CRITICAL: Puppet has 53 failures [17:19:50] PROBLEM - puppet last run on analytics1021 
is CRITICAL: CRITICAL: Puppet has 10 failures [17:19:51] PROBLEM - puppet last run on mw1085 is CRITICAL: CRITICAL: Puppet has 56 failures [17:19:51] PROBLEM - puppet last run on virt1005 is CRITICAL: CRITICAL: Puppet has 16 failures [17:19:51] PROBLEM - puppet last run on mw1058 is CRITICAL: CRITICAL: Puppet has 59 failures [17:19:51] it's strontium [17:19:53] "makes sense" [17:19:53]
Apache/2.2.22 (Ubuntu) Server at strontium.eqiad.wmnet Port 8141
[17:20:00] PROBLEM - puppet last run on mw1157 is CRITICAL: CRITICAL: Puppet has 61 failures [17:20:11] PROBLEM - puppet last run on mw1019 is CRITICAL: CRITICAL: Puppet has 50 failures [17:20:11] PROBLEM - puppet last run on db1019 is CRITICAL: CRITICAL: Puppet has 14 failures [17:20:20] PROBLEM - puppet last run on ssl1004 is CRITICAL: CRITICAL: Puppet has 24 failures [17:20:20] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: Puppet has 51 failures [17:20:20] PROBLEM - puppet last run on virt1009 is CRITICAL: CRITICAL: Puppet has 16 failures [17:20:20] PROBLEM - puppet last run on amssq52 is CRITICAL: CRITICAL: Puppet has 27 failures [17:20:20] PROBLEM - puppet last run on analytics1024 is CRITICAL: CRITICAL: Puppet has 17 failures [17:20:21] PROBLEM - puppet last run on tantalum is CRITICAL: CRITICAL: Puppet has 21 failures [17:20:21] PROBLEM - puppet last run on ms-be3004 is CRITICAL: CRITICAL: Puppet has 17 failures [17:20:21] PROBLEM - puppet last run on lvs3002 is CRITICAL: CRITICAL: Puppet has 20 failures [17:20:30] PROBLEM - puppet last run on mw1101 is CRITICAL: CRITICAL: Puppet has 57 failures [17:20:30] PROBLEM - puppet last run on db1049 is CRITICAL: CRITICAL: Puppet has 17 failures [17:20:31] PROBLEM - puppet last run on lanthanum is CRITICAL: CRITICAL: Puppet has 32 failures [17:20:31] PROBLEM - puppet last run on mw1102 is CRITICAL: CRITICAL: Puppet has 58 failures [17:20:31] PROBLEM - puppet last run on db1058 is CRITICAL: CRITICAL: Puppet has 14 failures [17:20:31] PROBLEM - puppet last run on mw1020 is CRITICAL: CRITICAL: Puppet has 59 failures [17:20:31] PROBLEM - puppet last run on es1003 is CRITICAL: CRITICAL: Puppet has 17 failures [17:20:31] PROBLEM - puppet last run on mw1078 is CRITICAL: CRITICAL: Puppet has 60 failures [17:20:32] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CRITICAL: Puppet has 22 failures [17:20:32] PROBLEM - puppet last run on search1021 is CRITICAL: CRITICAL: Puppet has 42 failures [17:20:33] PROBLEM 
- puppet last run on mw1015 is CRITICAL: CRITICAL: Puppet has 53 failures [17:20:33] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: Puppet has 26 failures [17:20:36] [ pid=18238 file=ext/apache2/Hooks.cpp:727 time=2014-07-18 17:20:29.876 ]: Unexpected error in mod_passenger: Could not connect to the ApplicationPool server: Broken pipe (32) [17:20:40] PROBLEM - puppet last run on cp3007 is CRITICAL: CRITICAL: Puppet has 25 failures [17:20:40] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 56 failures [17:20:40] PROBLEM - puppet last run on es1005 is CRITICAL: CRITICAL: Puppet has 20 failures [17:20:47] urg [17:20:50] PROBLEM - puppet last run on lvs4001 is CRITICAL: CRITICAL: Puppet has 17 failures [17:20:50] PROBLEM - puppet last run on zinc is CRITICAL: CRITICAL: Puppet has 19 failures [17:20:50] PROBLEM - puppet last run on search1009 is CRITICAL: CRITICAL: Puppet has 45 failures [17:20:50] PROBLEM - puppet last run on labsdb1002 is CRITICAL: CRITICAL: Puppet has 22 failures [17:20:50] PROBLEM - puppet last run on mw1182 is CRITICAL: CRITICAL: Puppet has 54 failures [17:20:50] PROBLEM - puppet last run on mw1184 is CRITICAL: CRITICAL: Puppet has 49 failures [17:21:00] PROBLEM - puppet last run on analytics1034 is CRITICAL: CRITICAL: Puppet has 23 failures [17:21:00] PROBLEM - puppet last run on virt0 is CRITICAL: CRITICAL: Puppet has 53 failures [17:21:00] PROBLEM - puppet last run on amssq45 is CRITICAL: CRITICAL: Puppet has 19 failures [17:21:00] PROBLEM - puppet last run on cp1064 is CRITICAL: CRITICAL: Puppet has 20 failures [17:21:00] PROBLEM - puppet last run on elastic1016 is CRITICAL: CRITICAL: Puppet has 19 failures [17:21:00] PROBLEM - puppet last run on mw1127 is CRITICAL: CRITICAL: Puppet has 51 failures [17:21:01] PROBLEM - puppet last run on mw1075 is CRITICAL: CRITICAL: Puppet has 50 failures [17:21:10] RECOVERY - puppetmaster backend https on strontium is OK: HTTP OK: Status line output matched 400 - 335 bytes in 
5.550 second response time [17:21:10] PROBLEM - puppet last run on wtp1014 is CRITICAL: CRITICAL: Puppet has 19 failures [17:21:10] PROBLEM - puppet last run on cp1057 is CRITICAL: CRITICAL: Puppet has 19 failures [17:21:11] PROBLEM - puppet last run on cp1051 is CRITICAL: CRITICAL: Puppet has 25 failures [17:21:14] !log graceful'ed apache on strontium [17:21:19] Logged the message, Master [17:21:19] [notice] Apache/2.2.22 (Ubuntu) Phusion_Passenger/2.2.11 mod_ssl/2.2.22 OpenSSL/1.0.1 configured -- resuming normal operations [17:21:20] PROBLEM - Puppet freshness on db1007 is CRITICAL: Last successful Puppet run was Fri 18 Jul 2014 13:20:21 UTC [17:21:20] PROBLEM - puppet last run on analytics1029 is CRITICAL: CRITICAL: Puppet has 14 failures [17:21:20] PROBLEM - puppet last run on mw1196 is CRITICAL: CRITICAL: Puppet has 51 failures [17:21:30] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: Puppet has 61 failures [17:21:30] PROBLEM - puppet last run on lvs1006 is CRITICAL: CRITICAL: Puppet has 25 failures [17:21:30] PROBLEM - puppet last run on mw1096 is CRITICAL: CRITICAL: Puppet has 64 failures [17:21:31] PROBLEM - puppet last run on mw1191 is CRITICAL: CRITICAL: Puppet has 60 failures [17:21:31] PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Puppet has 56 failures [17:21:31] PROBLEM - puppet last run on mw1083 is CRITICAL: CRITICAL: Puppet has 55 failures [17:21:31] PROBLEM - puppet last run on erbium is CRITICAL: CRITICAL: Puppet has 34 failures [17:21:31] PROBLEM - puppet last run on mw1136 is CRITICAL: CRITICAL: Puppet has 63 failures [17:21:32] PROBLEM - puppet last run on mw1094 is CRITICAL: CRITICAL: Puppet has 62 failures [17:21:32] PROBLEM - puppet last run on cp1069 is CRITICAL: CRITICAL: Puppet has 12 failures [17:21:32] PROBLEM - puppet last run on search1014 is CRITICAL: CRITICAL: Puppet has 44 failures [17:21:37] thanks mutante [17:21:40] PROBLEM - puppet last run on tmh1002 is CRITICAL: CRITICAL: Puppet has 28 failures 
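With this many icinga-wm alerts scrolling past, it can help to reduce the log to the hosts whose latest alert is still a PROBLEM. The following is a minimal sketch in Python, assuming the "puppet last run" alert wording stays exactly as it appears in this log. The names `ALERT_RE` and `outstanding_criticals` are illustrative only and are not part of Icinga or of the icinga-wm bot; the sample lines are copied from earlier in this log.

```python
import re

# Matches the icinga-wm "puppet last run" alert wording seen in this log.
# Assumption: host names contain no whitespace.
ALERT_RE = re.compile(
    r"(PROBLEM|RECOVERY) - puppet last run on (\S+) is (CRITICAL|OK)"
)


def outstanding_criticals(log_text):
    """Return the set of hosts whose most recent alert is a PROBLEM."""
    state = {}
    # findall yields matches in order of appearance, so a later alert
    # for the same host overwrites the earlier one.
    for kind, host, _status in ALERT_RE.findall(log_text):
        state[host] = kind
    return {host for host, kind in state.items() if kind == "PROBLEM"}


sample = (
    "[16:46:31] PROBLEM - puppet last run on db1022 is CRITICAL: "
    "CRITICAL: Puppet has 1 failures "
    "[17:04:31] RECOVERY - puppet last run on db1022 is OK: OK: Puppet is "
    "currently enabled, last run 28 seconds ago with 0 failures "
    "[17:17:00] PROBLEM - puppet last run on strontium is CRITICAL: "
    "CRITICAL: Puppet has 31 failures"
)

print(sorted(outstanding_criticals(sample)))
```

Only the last alert per host is kept, so a host that reported a PROBLEM and later a RECOVERY drops out of the result. Alerts in other formats (for example "Puppet freshness") are deliberately ignored by this pattern.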
[17:21:45] i think that fixed it [17:21:50] PROBLEM - puppet last run on cp4013 is CRITICAL: CRITICAL: Puppet has 23 failures [17:21:50] PROBLEM - puppet last run on es10 is CRITICAL: CRITICAL: Puppet has 15 failures [17:21:50] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Puppet has 20 failures [17:21:51] PROBLEM - puppet last run on analytics1019 is CRITICAL: CRITICAL: Puppet has 25 failures [17:21:51] PROBLEM - puppet last run on es7 is CRITICAL: CRITICAL: Puppet has 19 failures [17:21:51] hold on for recoveries [17:22:00] PROBLEM - puppet last run on cp1065 is CRITICAL: CRITICAL: Puppet has 21 failures [17:22:00] the 500 error is gone [17:22:10] PROBLEM - puppet last run on mw1138 is CRITICAL: CRITICAL: Puppet has 69 failures [17:22:10] PROBLEM - puppet last run on db1041 is CRITICAL: CRITICAL: Puppet has 23 failures [17:22:11] PROBLEM - puppet last run on zirconium is CRITICAL: CRITICAL: Puppet has 50 failures [17:22:20] PROBLEM - puppet last run on wtp1021 is CRITICAL: CRITICAL: Puppet has 22 failures [17:22:20] PROBLEM - puppet last run on mw1036 is CRITICAL: CRITICAL: Puppet has 62 failures [17:22:21] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: Puppet has 22 failures [17:22:21] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 16 failures [17:22:21] PROBLEM - puppet last run on amslvs4 is CRITICAL: CRITICAL: Puppet has 15 failures [17:22:30] PROBLEM - puppet last run on amssq57 is CRITICAL: CRITICAL: Puppet has 21 failures [17:22:30] PROBLEM - puppet last run on mw1035 is CRITICAL: CRITICAL: Puppet has 63 failures [17:22:31] PROBLEM - puppet last run on analytics1015 is CRITICAL: CRITICAL: Puppet has 21 failures [17:22:31] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: Puppet has 16 failures [17:22:31] PROBLEM - puppet last run on mw1109 is CRITICAL: CRITICAL: Puppet has 54 failures [17:22:31] PROBLEM - puppet last run on mc1008 is CRITICAL: CRITICAL: Puppet has 16 failures [17:22:31] PROBLEM - 
puppet last run on tarin is CRITICAL: CRITICAL: Puppet has 15 failures [17:22:40] interesting that alarms are still coming in. [17:22:40] PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: Puppet has 49 failures [17:22:40] PROBLEM - puppet last run on wtp1024 is CRITICAL: CRITICAL: Puppet has 23 failures [17:22:40] PROBLEM - puppet last run on db1010 is CRITICAL: CRITICAL: Puppet has 24 failures [17:22:40] PROBLEM - puppet last run on mw1013 is CRITICAL: CRITICAL: Puppet has 45 failures [17:22:42] /ignore -time 30m -regexp -pattern "puppet last run" icinga-wm [17:22:46] (FWIW) [17:22:49] heh [17:22:55] i did that :p [17:22:59] so..we can talk [17:23:03] Unexpected error in mod_passenger [17:23:06] is root cause [17:23:21] (not really the very first time, afair) [17:24:22] !log puppetmaster on strontium had "Unexpected error in mod_passenger" causing puppet fails all over the place with error 500 on master, resumed normal after graceful [17:24:28] Logged the message, Master [17:24:54] should they have talked to palladium instead? [17:25:43] !log temp. stopped icinga-wm to avoid channel spam [17:25:49] Logged the message, Master [17:27:08] zirconium, f.e. finished puppet run again without issues [17:27:28] waits for icinga to have less than 243 CRITs [17:28:42] mh looks like it started a while ago [17:28:43] [ pid=17174 file=ext/apache2/Hooks.cpp:727 time=2014-07-18 17:13:56.761 ]: [17:30:05] mutante, godog: i think this is the fix: https://github.com/phusion/passenger/commit/b8047d7567438d1d8a84e [17:30:23] based on https://groups.google.com/forum/#!topic/phusion-passenger/1bsvF8hEbqA [17:32:36] that looks promising.. we have 2.2.11debian-2 [17:33:13] fortunately we don't have that problem "comes back after a few hours" so far [17:33:18] !log replacing disk 2 es1005 [17:33:22] but i think we had it crash at least once [17:33:23] Logged the message, Master [17:33:49] icinga so slow to recover.. 
hrmm [17:34:01] ori: heh could be [17:35:03] "Can those who can reproduce the problem run 'passenger-status [17:35:04] --show=backtraces' and post the output?" [17:35:04] ah [17:35:54] wait .. ERROR: Phusion Passenger doesn't seem to be running. [17:35:56] (03CR) 10Plucas: ""scalac" isn't an ubuntu package[1], but "scala" is and includes /usr/bin/scalac[2]." [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147493 (owner: 10Alexandros Kosiaris) [17:36:09] but it is [17:38:40] we're down to 130 from > 240 [17:39:20] RECOVERY - puppet last run on ssl1004 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [17:39:21] RECOVERY - puppet last run on tantalum is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [17:39:22] RECOVERY - puppet last run on analytics1024 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [17:39:30] RECOVERY - puppet last run on mw1169 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [17:39:30] RECOVERY - puppet last run on lvs1006 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [17:39:30] RECOVERY - puppet last run on analytics1036 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [17:39:31] RECOVERY - puppet last run on db1024 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [17:39:31] RECOVERY - puppet last run on db1058 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [17:39:31] RECOVERY - puppet last run on es1003 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [17:39:31] RECOVERY - puppet last run on mw1191 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [17:39:31] RECOVERY - puppet last run on mw1214 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [17:39:32] RECOVERY - puppet last run on erbium 
is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [17:39:33] RECOVERY - puppet last run on mw1083 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [17:39:33] RECOVERY - puppet last run on mc1008 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [17:39:34] RECOVERY - puppet last run on mw1136 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [17:39:34] RECOVERY - puppet last run on search1021 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [17:39:35] RECOVERY - puppet last run on cp1069 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [17:39:35] RECOVERY - puppet last run on search1014 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [17:39:35] RECOVERY - puppet last run on mw1094 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [17:39:40] RECOVERY - puppet last run on cp4015 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [17:39:40] RECOVERY - puppet last run on tmh1002 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [17:39:50] RECOVERY - puppet last run on cp4013 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [17:39:50] RECOVERY - puppet last run on es10 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [17:39:51] RECOVERY - puppet last run on lvs4001 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [17:39:51] RECOVERY - puppet last run on cp4017 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [17:39:51] RECOVERY - puppet last run on mw1184 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [17:40:00] RECOVERY - puppet last run on es7 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 
failures [17:40:00] RECOVERY - puppet last run on analytics1034 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [17:40:00] RECOVERY - puppet last run on cp1065 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [17:40:00] RECOVERY - puppet last run on virt0 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [17:40:11] RECOVERY - puppet last run on virt1002 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [17:40:11] RECOVERY - puppet last run on mw1138 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [17:40:11] RECOVERY - puppet last run on db1041 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [17:40:11] RECOVERY - puppet last run on cp1051 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [17:40:20] RECOVERY - puppet last run on wtp1021 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [17:40:20] RECOVERY - puppet last run on mw1036 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [17:40:20] RECOVERY - puppet last run on amssq52 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [17:40:20] RECOVERY - puppet last run on mw1196 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [17:40:20] RECOVERY - puppet last run on amslvs4 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [17:40:21] RECOVERY - puppet last run on amssq57 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [17:40:30] RECOVERY - puppet last run on analytics1039 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [17:40:31] RECOVERY - puppet last run on es1006 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [17:40:31] RECOVERY - puppet last run on mw1035 is 
OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [17:40:31] RECOVERY - puppet last run on mw1096 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [17:40:31] RECOVERY - Puppet freshness on db1007 is OK: puppet ran at Fri Jul 18 17:40:26 UTC 2014 [17:40:31] RECOVERY - puppet last run on cp1043 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [17:40:40] RECOVERY - puppet last run on tarin is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [17:40:40] RECOVERY - puppet last run on mw1130 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [17:40:40] RECOVERY - puppet last run on mw1218 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [17:40:40] RECOVERY - puppet last run on wtp1024 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [17:40:40] RECOVERY - puppet last run on db1010 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [17:40:41] RECOVERY - puppet last run on mw1013 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [17:40:50] RECOVERY - puppet last run on analytics1019 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [17:40:50] RECOVERY - puppet last run on db1007 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [17:41:00] RECOVERY - puppet last run on mw1062 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [17:41:10] RECOVERY - puppet last run on mw1132 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [17:41:10] RECOVERY - puppet last run on es1009 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [17:41:11] RECOVERY - puppet last run on mw1040 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [17:41:20] RECOVERY - 
puppet last run on mw1147 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [17:41:20] RECOVERY - puppet last run on cp3022 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [17:41:20] RECOVERY - puppet last run on amssq58 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [17:41:20] RECOVERY - puppet last run on ssl3003 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [17:41:30] RECOVERY - puppet last run on mc1009 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [17:41:30] RECOVERY - puppet last run on cp4007 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [17:41:30] RECOVERY - puppet last run on analytics1015 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [17:41:31] RECOVERY - puppet last run on mc1010 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [17:41:31] RECOVERY - puppet last run on mw1216 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [17:41:31] RECOVERY - puppet last run on mw1192 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [17:41:31] RECOVERY - puppet last run on mw1109 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [17:41:50] RECOVERY - puppet last run on ytterbium is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [17:41:50] RECOVERY - puppet last run on mw1161 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [17:41:50] RECOVERY - puppet last run on mw1124 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [17:42:10] RECOVERY - puppet last run on labsdb1001 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [17:42:10] RECOVERY - puppet last run on mw1038 is OK: OK: Puppet is currently 
enabled, last run 15 seconds ago with 0 failures [17:42:20] RECOVERY - puppet last run on cp3013 is OK: OK: Puppet is currently enabled, last run 60 seconds ago with 0 failures [17:42:31] RECOVERY - puppet last run on dbstore1001 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [17:42:31] RECOVERY - puppet last run on rdb1004 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [17:42:50] RECOVERY - puppet last run on mw1031 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [17:42:53] that was it.. [17:43:20] RECOVERY - puppet last run on mw1005 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [17:44:28] apergos: this is when i can use "Bob's your uncle":) [17:45:27] :-D [17:45:59] ori: "wmerrors: doing precautionary abort() after request timeout" ... what does that mean? [17:46:06] * AaronSchulz tries to make sense of the macros in the code [17:47:09] maybe the caller hanging up (or sending HUP if cli) and the caught fatal error wasn't written yet? [17:48:56] no, seems to be abort via PHP timer, not HUP [17:49:41] (03Abandoned) 10Dzahn: Revert "depool db1021 due to replag" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147166 (owner: 10Dzahn) [18:02:01] (03PS1) 10Rush: role::phabricator cleanup [operations/puppet] - 10https://gerrit.wikimedia.org/r/147530 [18:02:11] (03PS2) 10Rush: role::phabricator cleanup [operations/puppet] - 10https://gerrit.wikimedia.org/r/147530 [18:03:28] (03CR) 10Aaron Schulz: [C: 032] Set "daemonized" flag for the redis job queue [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147220 (owner: 10Aaron Schulz) [18:04:09] AaronSchulz: does https://gerrit.wikimedia.org/r/#/c/147529/ need to be backported? 
[18:04:30] (03Merged) 10jenkins-bot: Set "daemonized" flag for the redis job queue [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147220 (owner: 10Aaron Schulz) [18:05:06] !log aaron Synchronized wmf-config/jobqueue-eqiad.php: Set "daemonized" flag for the redis job queue (duration: 00m 04s) [18:05:11] Logged the message, Master [18:05:16] ori: I'll do that [18:06:15] (03CR) 10Rush: [C: 032] role::phabricator cleanup [operations/puppet] - 10https://gerrit.wikimedia.org/r/147530 (owner: 10Rush) [18:09:54] (03PS1) 10Rush: reverse proxy ssl for phabricator.wm.o => iridium [operations/puppet] - 10https://gerrit.wikimedia.org/r/147533 [18:12:07] (03PS1) 10Rush: phab.wm.o handled by misc-web-lb.eqiad [operations/dns] - 10https://gerrit.wikimedia.org/r/147534 [18:12:47] (03CR) 10jenkins-bot: [V: 04-1] phab.wm.o handled by misc-web-lb.eqiad [operations/dns] - 10https://gerrit.wikimedia.org/r/147534 (owner: 10Rush) [18:13:13] (03CR) 10Dzahn: [C: 031] reverse proxy ssl for phabricator.wm.o => iridium [operations/puppet] - 10https://gerrit.wikimedia.org/r/147533 (owner: 10Rush) [18:15:04] chasemp: that jenkins one is .. surprising ? [18:15:06] CNAME not allowed alongside other data at domainname 'phabricator.wikimedia.org.' [18:15:12] wth..it looks identical to the existing one [18:15:13] yeah asking bblack now [18:15:21] cool [18:16:28] !log aaron Synchronized php-1.24wmf14/maintenance/runJobs.php: 684c21c325370aa3baac631ae9a006fc8861b952 (duration: 00m 03s) [18:16:33] Logged the message, Master [18:17:06] !log aaron Synchronized php-1.24wmf13/maintenance/runJobs.php: ae053860dc36a07f05ab9e31299f2da0d2f66e85 (duration: 00m 03s) [18:17:11] Logged the message, Master [18:17:29] chasemp: oh.. because you have an MX for it [18:17:43] yep [18:18:20] i guess you must use A [18:20:13] this kinda dovetails into the "simplify our DNS" thing at the end of the DNS incident report, somewhat [18:21:01] IMHO, we shouldn't be doing all this CNAMEing in the first place. 
We can template locally to reduce redundancy in our local config and prevent update mistakes, but all of our public hostnames should just return a straightforward A/AAAA record from the client POV. [18:21:09] (03PS2) 10Rush: phab.wm.o handled by misc-web-lb.eqiad [operations/dns] - 10https://gerrit.wikimedia.org/r/147534 [18:21:20] RFC1034, "if a CNAME RR is present at a node, no other data should be present" [18:21:35] right [18:21:59] in general CNAMEs suck for a lot of reasons. Sometimes they're necessary when you're really crossing an ownership boundary (e.g. if we want to CNAME a wm.o hostname to a google service) [18:22:20] but since it's all within our control in most of these cases, there's no good reason to be exposing the CNAME mess to clients and dealing with the restrictions. [18:22:59] (03CR) 10BBlack: [C: 031] reverse proxy ssl for phabricator.wm.o => iridium [operations/puppet] - 10https://gerrit.wikimedia.org/r/147533 (owner: 10Rush) [18:23:21] (03PS1) 10Jgreen: correct donate.wm.o mail routing destination from aluminium to barium [operations/puppet] - 10https://gerrit.wikimedia.org/r/147542 [18:23:23] (03CR) 10BBlack: [C: 031] phab.wm.o handled by misc-web-lb.eqiad [operations/dns] - 10https://gerrit.wikimedia.org/r/147534 (owner: 10Rush) [18:23:45] thanks bblack, learned something today :) [18:24:12] (03CR) 10Rush: [C: 032 V: 032] reverse proxy ssl for phabricator.wm.o => iridium [operations/puppet] - 10https://gerrit.wikimedia.org/r/147533 (owner: 10Rush) [18:29:20] PROBLEM - puppet last run on cp1052 is CRITICAL: CRITICAL: Complete puppet failure [18:29:20] PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: Complete puppet failure [18:29:30] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: Complete puppet failure [18:29:30] PROBLEM - puppet last run on cp1059 is CRITICAL: CRITICAL: Complete puppet failure [18:29:30] PROBLEM - puppet last run on cp1046 is CRITICAL: CRITICAL: Complete puppet failure [18:29:31] PROBLEM - puppet last 
run on cp1044 is CRITICAL: CRITICAL: Complete puppet failure [18:29:31] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Complete puppet failure [18:29:31] PROBLEM - puppet last run on cp4006 is CRITICAL: CRITICAL: Complete puppet failure [18:29:31] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Complete puppet failure [18:29:32] PROBLEM - puppet last run on cp4012 is CRITICAL: CRITICAL: Complete puppet failure [18:29:32] PROBLEM - puppet last run on cp3015 is CRITICAL: CRITICAL: Complete puppet failure [18:29:32] PROBLEM - puppet last run on cp1069 is CRITICAL: CRITICAL: Complete puppet failure [18:29:40] PROBLEM - puppet last run on cp1048 is CRITICAL: CRITICAL: Complete puppet failure [18:29:40] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Complete puppet failure [18:29:40] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: Complete puppet failure [18:29:40] PROBLEM - puppet last run on cp3004 is CRITICAL: CRITICAL: Complete puppet failure [18:29:41] PROBLEM - puppet last run on cp4002 is CRITICAL: CRITICAL: Complete puppet failure [18:29:41] PROBLEM - puppet last run on cp4003 is CRITICAL: CRITICAL: Complete puppet failure [18:29:50] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Complete puppet failure [18:29:50] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: Complete puppet failure [18:29:50] PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: Complete puppet failure [18:30:00] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Complete puppet failure [18:30:00] PROBLEM - puppet last run on cp4018 is CRITICAL: CRITICAL: Complete puppet failure [18:30:00] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: Complete puppet failure [18:30:06] (03CR) 10Jgreen: [C: 032 V: 031] correct donate.wm.o mail routing destination from aluminium to barium [operations/puppet] - 10https://gerrit.wikimedia.org/r/147542 (owner: 10Jgreen) [18:30:13] PROBLEM - puppet last run on cp1045 is 
CRITICAL: CRITICAL: Complete puppet failure [18:30:13] PROBLEM - puppet last run on cp1037 is CRITICAL: CRITICAL: Complete puppet failure [18:30:13] PROBLEM - puppet last run on cp1070 is CRITICAL: CRITICAL: Complete puppet failure [18:30:20] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: Complete puppet failure [18:30:20] PROBLEM - puppet last run on cp3013 is CRITICAL: CRITICAL: Complete puppet failure [18:30:21] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: Complete puppet failure [18:30:21] PROBLEM - puppet last run on cp3016 is CRITICAL: CRITICAL: Complete puppet failure [18:30:41] RECOVERY - puppet last run on cp1048 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [18:30:44] tried to run puppet via salt and ^ this happened [18:30:50] RECOVERY - puppet last run on cp1060 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [18:30:53] spot checking some hosts manually [18:31:00] RECOVERY - puppet last run on cp1054 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [18:31:11] RECOVERY - puppet last run on cp1037 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [18:31:11] RECOVERY - puppet last run on cp1070 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [18:31:20] RECOVERY - puppet last run on cp1052 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [18:31:20] RECOVERY - puppet last run on cp1066 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [18:31:31] RECOVERY - puppet last run on cp1059 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [18:31:31] RECOVERY - puppet last run on cp1069 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [18:31:42] i just checked cp1070, i see puppet agent --test [18:31:51] according to the log the run finished, but the lock is still there 
[18:31:51] PROBLEM - icinga is a jerk [18:32:09] yeah puppetmaster doesn't like a big burst of puppet clients [18:32:12] I think it snowballed, salt & master on one host caused palladium to overload [18:32:18] you can use salt batchsize to fix that [18:32:20] and then the clients hanging and retrying caused more fun [18:32:30] RECOVERY - puppet last run on cp1044 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [18:32:44] so this is my fault, but should just be a temporary resource thing not a persistent issue [18:32:50] e.g. salt -G cluster:cache* -b 3 cmd.run 'puppet agent -t' would only run 3 at a time in parallel [18:32:58] sweet thanks [18:33:12] http://docs.saltstack.com/en/latest/topics/targeting/batch.html [18:33:20] RECOVERY - puppet last run on cp3013 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [18:33:20] RECOVERY - puppet last run on cp3016 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [18:33:20] RECOVERY - puppet last run on cp3018 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [18:33:31] RECOVERY - puppet last run on cp4006 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [18:33:31] RECOVERY - puppet last run on cp4007 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [18:33:32] (in any case, I think misc-web-lb is only on two of the cp hosts) [18:33:40] RECOVERY - puppet last run on cp4015 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [18:33:40] RECOVERY - puppet last run on cp4010 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [18:33:40] RECOVERY - puppet last run on cp3004 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [18:34:00] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [18:34:00] 
RECOVERY - puppet last run on cp4018 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [18:35:20] PROBLEM - puppet last run on amslvs3 is CRITICAL: CRITICAL: Puppet has 4 failures [18:35:30] PROBLEM - puppet last run on db1033 is CRITICAL: CRITICAL: Puppet has 14 failures [18:35:31] PROBLEM - puppet last run on db1044 is CRITICAL: CRITICAL: Puppet has 6 failures [18:35:31] PROBLEM - puppet last run on ssl1007 is CRITICAL: CRITICAL: Puppet has 28 failures [18:35:31] PROBLEM - puppet last run on wtp1003 is CRITICAL: CRITICAL: Puppet has 10 failures [18:35:31] PROBLEM - puppet last run on mw1087 is CRITICAL: CRITICAL: Puppet has 4 failures [18:35:50] PROBLEM - puppet last run on mw1148 is CRITICAL: CRITICAL: Puppet has 8 failures [18:35:50] PROBLEM - puppet last run on pc1003 is CRITICAL: CRITICAL: Puppet has 6 failures [18:35:50] PROBLEM - puppet last run on search1013 is CRITICAL: CRITICAL: Puppet has 14 failures [18:36:03] goodness gracious, well sorry that was clearly a bad idea [18:36:10] PROBLEM - puppet last run on mw1185 is CRITICAL: CRITICAL: Puppet has 24 failures [18:36:11] RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [18:36:11] PROBLEM - puppet last run on mw1032 is CRITICAL: CRITICAL: Puppet has 38 failures [18:36:11] PROBLEM - puppet last run on elastic1017 is CRITICAL: CRITICAL: Puppet has 10 failures [18:36:11] PROBLEM - puppet last run on analytics1028 is CRITICAL: CRITICAL: Puppet has 5 failures [18:36:20] PROBLEM - puppet last run on cp1052 is CRITICAL: CRITICAL: Puppet has 7 failures [18:36:30] PROBLEM - puppet last run on db1006 is CRITICAL: CRITICAL: Puppet has 6 failures [18:36:31] PROBLEM - puppet last run on mw1077 is CRITICAL: CRITICAL: Puppet has 37 failures [18:36:31] PROBLEM - puppet last run on analytics1027 is CRITICAL: CRITICAL: Puppet has 5 failures [18:36:31] PROBLEM - puppet last run on amssq38 is CRITICAL: CRITICAL: Puppet has 1 
failures [18:36:31] PROBLEM - puppet last run on amssq62 is CRITICAL: CRITICAL: Puppet has 4 failures [18:36:31] PROBLEM - puppet last run on ssl3002 is CRITICAL: CRITICAL: Puppet has 4 failures [18:36:31] PROBLEM - puppet last run on mw1033 is CRITICAL: CRITICAL: Puppet has 6 failures [18:36:32] PROBLEM - puppet last run on mw1186 is CRITICAL: CRITICAL: Puppet has 8 failures [18:36:33] PROBLEM - puppet last run on mw1167 is CRITICAL: CRITICAL: Puppet has 7 failures [18:36:33] PROBLEM - puppet last run on mw1024 is CRITICAL: CRITICAL: Puppet has 15 failures [18:36:34] PROBLEM - puppet last run on mw1201 is CRITICAL: CRITICAL: Puppet has 8 failures [18:36:34] PROBLEM - puppet last run on search1011 is CRITICAL: CRITICAL: Puppet has 3 failures [18:36:35] PROBLEM - puppet last run on db1030 is CRITICAL: CRITICAL: Puppet has 8 failures [18:36:40] PROBLEM - puppet last run on mw1001 is CRITICAL: CRITICAL: Puppet has 15 failures [18:37:05] I think the delay in the old puppet check hid some of this before [18:37:10] PROBLEM - puppet last run on mw1105 is CRITICAL: CRITICAL: Puppet has 15 failures [18:37:10] PROBLEM - puppet last run on mw1122 is CRITICAL: CRITICAL: Puppet has 17 failures [18:37:20] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: Puppet has 5 failures [18:37:20] PROBLEM - puppet last run on cp1053 is CRITICAL: CRITICAL: Puppet has 21 failures [18:37:30] RECOVERY - puppet last run on cp3005 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [18:37:31] PROBLEM - puppet last run on cp1044 is CRITICAL: CRITICAL: Puppet has 5 failures [18:37:31] PROBLEM - puppet last run on db1056 is CRITICAL: CRITICAL: Puppet has 1 failures [18:37:31] PROBLEM - puppet last run on mw1064 is CRITICAL: CRITICAL: Puppet has 4 failures [18:37:31] PROBLEM - puppet last run on amssq31 is CRITICAL: CRITICAL: Puppet has 3 failures [18:37:31] PROBLEM - puppet last run on mw1204 is CRITICAL: CRITICAL: Puppet has 2 failures [18:37:31] PROBLEM - 
puppet last run on ms-be1015 is CRITICAL: CRITICAL: Puppet has 20 failures [18:37:40] PROBLEM - puppet last run on mw1112 is CRITICAL: CRITICAL: Puppet has 3 failures [18:37:40] PROBLEM - puppet last run on db71 is CRITICAL: CRITICAL: Puppet has 5 failures [18:37:40] PROBLEM - puppet last run on mw1143 is CRITICAL: CRITICAL: Puppet has 3 failures [18:37:50] PROBLEM - puppet last run on mw1027 is CRITICAL: CRITICAL: Puppet has 31 failures [18:38:00] PROBLEM - puppet last run on elastic1003 is CRITICAL: CRITICAL: Puppet has 3 failures [18:38:00] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: Puppet has 4 failures [18:38:00] PROBLEM - puppet last run on mw1220 is CRITICAL: CRITICAL: Puppet has 34 failures [18:38:10] PROBLEM - puppet last run on terbium is CRITICAL: CRITICAL: Puppet has 30 failures [18:38:11] PROBLEM - puppet last run on db1037 is CRITICAL: CRITICAL: Puppet has 5 failures [18:38:11] PROBLEM - puppet last run on dysprosium is CRITICAL: CRITICAL: Puppet has 3 failures [18:38:11] PROBLEM - puppet last run on cp1067 is CRITICAL: CRITICAL: Puppet has 5 failures [18:38:20] PROBLEM - puppet last run on analytics1003 is CRITICAL: CRITICAL: Puppet has 9 failures [18:38:20] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: Puppet has 1 failures [18:38:20] PROBLEM - puppet last run on mw1113 is CRITICAL: CRITICAL: Puppet has 15 failures [18:38:20] PROBLEM - puppet last run on search1004 is CRITICAL: CRITICAL: Puppet has 31 failures [18:38:30] PROBLEM - puppet last run on db60 is CRITICAL: CRITICAL: Puppet has 5 failures [18:38:30] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: Puppet has 5 failures [18:38:30] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 4 failures [18:38:30] PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: Puppet has 15 failures [18:38:30] PROBLEM - puppet last run on mw1193 is CRITICAL: CRITICAL: Puppet has 38 failures [18:38:31] PROBLEM - puppet last run on hooft is CRITICAL: 
CRITICAL: Puppet has 10 failures [18:38:31] PROBLEM - puppet last run on mexia is CRITICAL: CRITICAL: Puppet has 7 failures [18:38:31] PROBLEM - puppet last run on amssq33 is CRITICAL: CRITICAL: Puppet has 4 failures [18:38:32] PROBLEM - puppet last run on ms-be1002 is CRITICAL: CRITICAL: Puppet has 21 failures [18:38:33] PROBLEM - puppet last run on mw1086 is CRITICAL: CRITICAL: Puppet has 28 failures [18:38:40] PROBLEM - puppet last run on amssq39 is CRITICAL: CRITICAL: Puppet has 5 failures [18:38:50] PROBLEM - puppet last run on ssl3001 is CRITICAL: CRITICAL: Puppet has 4 failures [18:39:40] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: Puppet has 5 failures [18:39:50] RECOVERY - puppet last run on cp4017 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [18:43:30] RECOVERY - puppet last run on cp3015 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [18:44:20] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [18:45:30] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [18:47:41] RECOVERY - puppet last run on cp4003 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [18:49:30] RECOVERY - puppet last run on cp1046 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [18:49:33] (03CR) 10Dzahn: [C: 032] delete unused role/smokeping.pp [operations/puppet] - 10https://gerrit.wikimedia.org/r/147502 (owner: 10Dzahn) [18:50:42] Is there a tutorial I can follow to run SQL queries on stat1003? [18:50:57] I tried searching for "stat1003" on wikitech.wikimedia.org and office.wikimedia.org but I can't find any instructions. 
[18:51:31] RECOVERY - puppet last run on ssl3002 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [18:51:31] RECOVERY - puppet last run on mw1087 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [18:51:50] RECOVERY - puppet last run on cp3009 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [18:52:10] RECOVERY - puppet last run on elastic1017 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [18:52:30] RECOVERY - puppet last run on hooft is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [18:52:31] RECOVERY - puppet last run on amssq62 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [18:52:31] RECOVERY - puppet last run on db1044 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [18:52:31] RECOVERY - puppet last run on ssl1007 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [18:52:31] RECOVERY - puppet last run on mw1186 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [18:52:31] RECOVERY - puppet last run on mw1201 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [18:52:31] RECOVERY - puppet last run on wtp1003 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [18:52:32] RECOVERY - puppet last run on db1030 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [18:52:40] RECOVERY - puppet last run on mw1001 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [18:52:50] RECOVERY - puppet last run on mw1148 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [18:52:50] RECOVERY - puppet last run on search1013 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [18:52:50] RECOVERY - puppet last run on pc1003 is OK: OK: Puppet is 
currently enabled, last run 47 seconds ago with 0 failures [18:53:10] RECOVERY - puppet last run on mw1105 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [18:53:11] RECOVERY - puppet last run on mw1185 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [18:53:11] RECOVERY - puppet last run on mw1032 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [18:53:20] RECOVERY - puppet last run on amslvs3 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [18:53:20] RECOVERY - puppet last run on ms-fe3001 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [18:53:30] RECOVERY - puppet last run on mw1077 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [18:53:31] RECOVERY - puppet last run on analytics1027 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [18:53:31] RECOVERY - puppet last run on amssq38 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [18:53:31] RECOVERY - puppet last run on mw1033 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [18:53:31] RECOVERY - puppet last run on mw1167 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [18:53:31] RECOVERY - puppet last run on search1011 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [18:53:35] Deskana: try #wikimedia-analytics for that [18:54:10] RECOVERY - puppet last run on mw1122 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [18:54:10] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Complete puppet failure [18:54:10] RECOVERY - puppet last run on analytics1028 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [18:54:20] RECOVERY - puppet last run on cp1052 is OK: OK: Puppet is currently enabled, last run 25 seconds 
ago with 0 failures [18:54:20] RECOVERY - puppet last run on search1004 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [18:54:30] RECOVERY - puppet last run on db1033 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [18:54:30] RECOVERY - puppet last run on db1006 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [18:54:30] RECOVERY - puppet last run on amssq31 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [18:54:31] RECOVERY - puppet last run on mw1024 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [18:54:31] RECOVERY - puppet last run on mw1086 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [18:54:40] RECOVERY - puppet last run on mw1112 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [18:54:40] RECOVERY - puppet last run on db71 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [18:54:40] RECOVERY - puppet last run on amssq39 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [18:54:40] RECOVERY - puppet last run on mw1143 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [18:54:50] RECOVERY - puppet last run on mw1027 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [18:55:00] RECOVERY - puppet last run on elastic1003 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [18:55:00] RECOVERY - puppet last run on mw1220 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [18:55:10] RECOVERY - puppet last run on cp1067 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [18:55:20] RECOVERY - puppet last run on analytics1003 is OK: OK: Puppet is currently enabled, last run 60 seconds ago with 0 failures [18:55:30] RECOVERY - puppet last run on db60 
is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [18:55:30] RECOVERY - puppet last run on mw1193 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [18:55:30] RECOVERY - puppet last run on mw1064 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [18:55:30] RECOVERY - puppet last run on cp1044 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [18:55:31] RECOVERY - puppet last run on amssq33 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [18:55:31] RECOVERY - puppet last run on mw1204 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [18:55:31] RECOVERY - puppet last run on ms-be1002 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [18:55:32] RECOVERY - puppet last run on ms-be1015 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [18:55:40] RECOVERY - puppet last run on cp4002 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [18:56:11] RECOVERY - puppet last run on terbium is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [18:56:11] RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [18:56:11] RECOVERY - puppet last run on dysprosium is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [18:56:11] RECOVERY - puppet last run on db1037 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [18:56:20] RECOVERY - puppet last run on cp1053 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [18:56:20] RECOVERY - puppet last run on mw1131 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [18:56:30] RECOVERY - puppet last run on lvs3003 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 
failures [18:56:30] RECOVERY - puppet last run on eeden is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [18:56:31] RECOVERY - puppet last run on mw1215 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [18:56:31] RECOVERY - puppet last run on cp4012 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [18:56:50] RECOVERY - puppet last run on ssl3001 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [18:57:00] RECOVERY - puppet last run on lvs1004 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [18:57:20] RECOVERY - puppet last run on mw1113 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [18:57:30] RECOVERY - puppet last run on db1056 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [18:57:31] RECOVERY - puppet last run on mexia is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [18:57:40] RECOVERY - puppet last run on amssq37 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [18:59:20] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There are 2 unmerged changes in puppet (dir /var/lib/git/operations/puppet). 
[19:00:11] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Complete puppet failure [19:03:10] (03CR) 10Dzahn: [C: 032] delete smokeping apache template [operations/puppet] - 10https://gerrit.wikimedia.org/r/147504 (owner: 10Dzahn) [19:04:58] (03PS1) 10Rush: php-mailparse for phabricator module [operations/puppet] - 10https://gerrit.wikimedia.org/r/147555 [19:07:20] PROBLEM - Disk space on gallium is CRITICAL: DISK CRITICAL - free space: /var/lib/jenkins-slave/tmpfs 18 MB (3% inode=99%): [19:07:43] (03CR) 10Rush: [C: 032] php-mailparse for phabricator module [operations/puppet] - 10https://gerrit.wikimedia.org/r/147555 (owner: 10Rush) [19:08:20] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge. [19:08:30] PROBLEM - puppet last run on cp1059 is CRITICAL: CRITICAL: Complete puppet failure [19:11:04] !log restarted apache on strontium.. sigh [19:11:10] Logged the message, Master [19:11:11] (03CR) 10JanZerebecki: icinga - update SSL cipher list (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/147207 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [19:12:40] PROBLEM - puppet last run on polonium is CRITICAL: CRITICAL: Complete puppet failure [19:13:44] (03CR) 10JanZerebecki: [C: 031] racktables - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147185 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [19:15:40] RECOVERY - puppet last run on polonium is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [19:17:10] RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [19:18:19] (03CR) 10Rush: [C: 032] phab.wm.o handled by misc-web-lb.eqiad [operations/dns] - 10https://gerrit.wikimedia.org/r/147534 (owner: 10Rush) [19:22:31] RECOVERY - puppet last run on cp1059 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [19:23:50] RECOVERY - RAID on 
es1005 is OK: OK: optimal, 1 logical, 2 physical [19:34:53] (03PS1) 10Rush: phab.wm.o https [operations/puppet] - 10https://gerrit.wikimedia.org/r/147625 [19:36:09] (03CR) 10Rush: [C: 032 V: 032] phab.wm.o https [operations/puppet] - 10https://gerrit.wikimedia.org/r/147625 (owner: 10Rush) [19:38:30] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures [19:42:20] RECOVERY - Disk space on gallium is OK: DISK OK [20:04:03] (03PS1) 10Jforrester: Enable TemplateData GUI tool on the Finnish Wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147632 (https://bugzilla.wikimedia.org/68184) [20:17:16] (03PS1) 10Dzahn: phab-login screen HTML-replace deprecated HTML [operations/puppet] - 10https://gerrit.wikimedia.org/r/147640 [20:25:20] (03CR) 10Chad: "But is great because it totally works but annoys the pedantic and validators ;-)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147640 (owner: 10Dzahn) [20:26:42] (03CR) 10Rush: "first thing epriestly said to me was "nice font tag :)" hehe" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147640 (owner: 10Dzahn) [20:27:15] (03CR) 10Chad: "It did its job then :D" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147640 (owner: 10Dzahn) [20:29:06] mutante https://gerrit.wikimedia.org/r/#/c/147168/? [20:29:09] (03CR) 10Dzahn: "Error Line 1, Column 3076: Element center not allowed as child of element font in this context. (Suppressing further errors from this subt" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147640 (owner: 10Dzahn) [20:31:01] dogeydogey: hard to review because it changes the number of lines even though you'd expect it to be just tabs and spaces [20:31:20] mutante anything i can do to make it betteR? [20:32:00] dogeydogey: extend the commit message what you are trying to achieve [20:32:20] can it be +806/-806 while you still get what you wanted? 
[20:32:21] mutante there's nothing to extend, i just wanted to clean up the spacing [20:32:34] it was really bad before [20:32:48] "fixed" can be anything [20:33:18] what are you removing? [20:35:34] well initially i wanted to just align everything [20:35:39] but at your suggestion i removed tabs [20:35:44] and used just spaces [20:36:05] yea, but aligning and removing tabs would not change the number of lines [20:36:13] there were some empty lines [20:36:16] that i removed [20:38:15] dogeydogey: assuming you did this in vim with a macro? [20:38:48] nah by hand [20:39:20] lots of spaces and and backspaces [20:39:31] well then nevermind :) was going to suggest md5sum pre and post with commands to verify changes [20:41:30] !log Load testing GeoData [20:41:33] Logged the message, Master [20:42:04] mutante: could you use the puppet compiler thing [20:42:08] to verify no actual changes? [20:42:20] chasemp: it's not a puppet change, it's DNS [20:42:27] yeah just noticed :D [20:42:36] definitly tricky [20:42:37] and "just" wikimedia zone :p [20:43:01] it would be much easier if it was not also removing the empty lines [20:43:24] so this file is compiled to the end file I believe [20:43:30] and jenkins bascially does that to verify syntax [20:43:49] you could check the end state file md5sum, assuming it ignores whitespace in the source file [20:43:56] and if they match no actual content change? 
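[annotation] chasemp's idea of comparing checksums before and after the cleanup can be sketched roughly as follows. This is only an illustration under stated assumptions: it hashes the source text directly rather than a compiled zone file, it treats any run of spaces or tabs as equivalent to a single space, and it drops empty lines. The function names and the sample records are hypothetical and are not taken from the actual DNS repository.

```python
import hashlib


def normalized_digest(text: str) -> str:
    """Hash text after discarding whitespace-only differences.

    Each line is split on runs of spaces/tabs and rejoined with single
    spaces; lines that end up empty are dropped before hashing.
    """
    lines = (" ".join(line.split()) for line in text.splitlines())
    canonical = "\n".join(line for line in lines if line)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def whitespace_only_change(before: str, after: str) -> bool:
    """True if the two versions differ only in spacing and blank lines."""
    return normalized_digest(before) == normalized_digest(after)


# Hypothetical zone-file lines, invented for illustration only.
before = "www\t\t1H  IN A\t192.0.2.1\n\n\nmail\t1H IN A 192.0.2.2\n"
after = "www     1H IN A 192.0.2.1\nmail    1H IN A 192.0.2.2\n"
print(whitespace_only_change(before, after))
```

A plain md5sum of the two raw files would differ as soon as tabs or empty lines are touched, which is why the sketch normalizes first. It would also hide whitespace changes inside quoted record data, so it is a sanity check rather than a proof that nothing changed.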
[20:47:43] chasemp: it also removes all the literal tabs [20:47:50] (well i said so, but yea) [20:47:57] that would change md5 [20:48:19] !log awight updated /a/common/php-1.24wmf12 to {{Gerrit|Idf3f49941}}: Updating ZeroBanner [20:48:23] Logged the message, Master [20:48:57] !log awight Synchronized php-1.24wmf12: update FundraisingTranslateWorkflow submodule (duration: 00m 21s) [20:49:02] Logged the message, Master [20:50:46] !log awight Synchronized php-1.24wmf12: update FundraisingTranslateWorkflow submodule (duration: 00m 49s) [20:50:50] Logged the message, Master [20:54:50] Hi, I'm deploying from tin and am forced to workaround a mysterious gerrit key issue. [20:55:01] I cannot fetch origin, and *sometimes* cannot sync-dir. [20:55:27] awight: sounds like problems with your ssh-agent [20:55:35] I'm forwarding auth using -A, but also have a key I only use on tin, cos I prefer not to forward. [20:56:06] bd808: why would I be able to log in but not forward the key, though? [20:56:30] (03CR) 10Dzahn: [C: 032] racktables - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147185 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [20:56:50] I have personally had problems where the rate of requests made by ssh during a sync swamps out my ssh-agent [20:57:04] It hasn't happened to me for a long time though [20:57:25] i also had what bd808 said and was about to ask if your agent is still alive ..locally [20:57:49] !log awight updated /a/common/php-1.24wmf13 to {{Gerrit|Id3462554b}}: Made --maxtime a soft limit again [20:57:57] Logged the message, Master [20:57:57] ps auxxww|grep $SSH_AGENT_PID [20:58:01] yep, it's alive [20:58:36] btw. that logmsg above is *not* what I just updated to [20:58:56] !log for the record, I actually updated to ade90e0e22492d87e6069db3a359b22ef56401a6 [20:59:02] Logged the message, Master [20:59:06] awight: please don't have that other key on tin though [20:59:24] mutante: oh really? OK, deleting it now. 
I was under the impression that key forwarding was the bad practice. [21:00:36] awight: if you cant use ProxyCommand you probably have not much choice, but having private key there is worse [21:00:54] !log awight Synchronized php-1.24wmf13: update FundraisingTranslateWorkflow submodule (duration: 01m 04s) [21:01:01] Logged the message, Master [21:01:36] awight: do you have more than one key loaded in the agent? [21:01:43] mutante: well, fwiw I removed the key, logged out and back in, and still fail. [21:01:48] mutante: probably. [21:01:56] I have a second key for fundraising production stuff. [21:02:11] mutante: my .ssh/config is specific about which key to use though [21:02:30] awight: we recently saw an issue where user had like 5 keys loaded in agent and he got denied .. after sending the first 4 wrong keys [21:02:41] hah [21:02:47] no, ssh -v says everything is normal [21:03:08] hmm.. define "sometimes" ? [21:03:38] did it all work for you like this before? [21:03:40] mutante: the inability to sync-dir happened a few minutes ago, nothing obviously different about my session. all 228 apaches rejected me [21:03:43] mutante: yes [21:03:54] ok, let me check one of them [21:03:58] mutante: the only risk factor I know if is that my gerrit username was changed [21:06:37] Is anyone in the process of building the wmf14 branch or something? I'm getting php list errors when attempting to sync. [21:06:46] s/list/lint/ [21:07:07] sync-dir php-1.24wmf14 "update FundraisingTranslateWorkflow submodule" [21:07:11] 21:07:01 sync-dir failed: Command '['/usr/bin/php', '-n', '-dextension=parsekit.so', '/usr/local/bin/lint.php', '/a/common/php-1.24wmf14']' returned non-zero exit status -11 [21:07:28] awight: let's try this: just login from tin on mw1084.. without any sync scripts.. just ssh [21:07:34] watching log [21:08:02] mutante: done [21:08:12] Failed publickey for awight [21:08:20] whaat? I logged in successfully. 
[21:08:39] both :) [21:08:41] it must have tried a bad one first? [21:08:44] first you are sending one that doesnt work [21:08:50] * greg-g nods [21:08:50] then the other one [21:08:50] okay, I have to stop this deployment but could not sync the 1.24wmf14 dir because of that error [21:08:58] please advise how I can clean up. [21:09:10] it's a conflict in your ssh config, from my experience [21:09:31] awight@tin:/a/common/php-1.24wmf14$ php -n -dextension=parsekit.so /usr/local/bin/lint.php /a/common/php-1.24wmf14 [21:09:34] Segmentation fault [21:09:42] something in there is sending the wrong key first, then falling back to a good one [21:09:48] greg-g: the ssh thing is not blocking me at this moment, I've worked around [21:09:53] try unloading all other keys from agent [21:09:57] greg-g: but the above error is ^^ [21:10:06] greg-g: should I rollback using the SAL? [21:10:10] I have no idea what that is supposed to do [21:10:21] it is preventing me from sync-dir'ing the wmf14 branch [21:10:27] maybe someone is building that right now? [21:10:39] shouldn't be [21:10:56] that's what I would thing. segfault is double unhelpful, unfortunately. [21:10:59] :/ [21:11:00] I'm not about to gdb that [21:11:26] the error is just on that fundraising submodule [21:11:28] revert your change in gerrit, re fetch on tin [21:11:29] ? [21:11:34] right bd808 ? [21:11:39] mutante: oh really? How can you tell? [21:11:47] Has this branch been deployed at all, yet? [21:11:48] no, i'm asking because of what you pasted [21:11:51] "update FundraisingTranslateWorkflow submodule" [21:11:52] * bd808 reads backscroll [21:12:00] that was in your error message [21:12:00] I'll revert as closely as I can. [21:12:24] (03PS1) 10Chad: Add average/90th pct search latency for 1 hour window [operations/puppet] - 10https://gerrit.wikimedia.org/r/147651 [21:13:44] awight: What state is tin in? You staged changes but can't sync? 
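[annotation] The "returned non-zero exit status -11" text in the sync-dir failure has the shape of the message Python's subprocess.CalledProcessError produces. On POSIX systems, subprocess reports a negative return code -N when the child process was killed by signal N, and signal 11 is SIGSEGV on Linux, which matches the "Segmentation fault" seen when lint.php was run by hand. A minimal sketch of decoding such a code is below; the helper name describe_returncode is hypothetical and is not part of the deployment tooling.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a subprocess return code in words."""
    if returncode < 0:
        # POSIX: subprocess reports -N when the child was killed by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


# The value reported by sync-dir in the log above.
print(describe_returncode(-11))
```

Decoding the status this way only says how the process died, not why; finding the file that triggers the crash still takes narrowing down the input.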
[21:14:22] bd808: I'm trying to rollback my small change, then I'll try to sync again. [21:14:28] I'll ping with the results [21:15:01] (03CR) 10Greg Grossmeier: ":)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147651 (owner: 10Chad) [21:15:19] ori: Still getting 502 responses on stream.wm.o from time to time. [21:15:30] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [21:15:31] Did you yield anything useful from your investigation the other day? [21:15:42] Krinkle: https://gerrit.wikimedia.org/r/#/c/145997/ never got merged [21:15:44] Krinkle: yes, that [21:16:16] (03CR) 10Krinkle: [C: 031] rcstream: make lvs health check fetch /nginx_status [operations/puppet] - 10https://gerrit.wikimedia.org/r/145997 (https://bugzilla.wikimedia.org/67957) (owner: 10Ori.livneh) [21:16:40] It's breaking production, can we get an opsen? [21:17:06] we should wait for _joe_ [21:17:23] it's a small change, but a small change to the LVS servers [21:17:34] (03PS3) 10Dzahn: racktables - retab apache config [operations/puppet] - 10https://gerrit.wikimedia.org/r/147186 [21:18:03] Ah frack. I see why lint.php is barfing [21:18:20] ori: Does it require a reboot of something that we don't want to / can't reboot whenever? Or waiting for his review in general? [21:18:21] oooh [21:18:25] The psr-3 library has php 5.4+ features in it [21:18:36] That I didn't strip out [21:18:40] Krinkle: probably both [21:18:44] It's ben down for over a week and knowing the (potentially working) fix has been known and waiting in gerrit for 4 days isn't nice.. [21:19:02] i know, but both giuseppe and i and 100% allocated to hhvm atm [21:19:14] * bd808 doesn't know how Reedy synced it yesterday [21:19:29] with the --reedy option [21:19:42] I wouldn't be surprised ;) [21:19:59] I'm sure there's at least one other opsen available that knows general stuff (e.g. lvs) enough to fix a production service that is down. 
[21:20:01] It just lets me do it [21:20:05] it knows I'll find a way around it [21:20:22] The seg fault happens on /a/common/php-1.24wmf14/vendor/psr/log/Psr/Log/LoggerTrait.php [21:20:25] (03PS4) 10Dzahn: racktables - retab apache config [operations/puppet] - 10https://gerrit.wikimedia.org/r/147186 [21:20:48] bd808: sorry, I just pinged you in -dev about this. Can I leave the cleanup to you, since I cannot sync? I've reverted my changes. [21:20:56] I can make a patch to mw/core/vendor to pull that file out (and I think a couple more?) [21:21:01] (03CR) 10Dzahn: [C: 032] racktables - retab apache config [operations/puppet] - 10https://gerrit.wikimedia.org/r/147186 (owner: 10Dzahn) [21:21:43] (03CR) 10Chad: "Wiki is locked, data's still in the database. I don't see what there is to fix." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147012 (owner: 10Chad) [21:21:45] awight: So what needs syncing when I get the lint fixed? [21:21:47] Krinkle: sure, I'm not opposed [21:22:00] Krinkle: but keep in mind that we're not considering stream.wm.o "launched" atm [21:22:06] bd808: nothing; I've reverted to in theory it should be identical to when I arrived. This is the php-1.24wmf14 directory. [21:22:20] my patches were f2a5d1996234bf94325c0f095db0959cbbae2b4e and 50d19b2e27f716fbad58c63d98699eccc18b5961 [21:22:36] bd808: ok if we're good, I'm running to a meeting with tail already between legs :D [21:23:00] (03CR) 10BBlack: [C: 031] rcstream: make lvs health check fetch /nginx_status [operations/puppet] - 10https://gerrit.wikimedia.org/r/145997 (https://bugzilla.wikimedia.org/67957) (owner: 10Ori.livneh) [21:23:02] I'll make a patch to fix core/vendor [21:23:11] kthx [21:23:20] Krinkle: we want to push that now? [21:23:22] ori: right. But I hope in the future when it is launched, that the main maintainers being otherwise occupied doesn't keep it from being fixed. 
It doesn't take insider's knowledge (and if it does, then we failed documenting it properly) [21:23:27] bblack: if you could [21:23:54] Krinkle: kk [21:23:59] i don't disagree [21:23:59] I understand it's not launched yet, you're right. [21:24:57] are you sure /nginx_status is legal? [21:25:36] bblack: we provision it explicitly , via http://wiki.nginx.org/HttpStubStatusModule [21:26:00] (03Abandoned) 10Dzahn: delete smokeping apache template [operations/puppet] - 10https://gerrit.wikimedia.org/r/147504 (owner: 10Dzahn) [21:26:01] hm, but it 404s [21:26:06] yeah that's what I mean [21:26:35] sigh. it only listens on localhost. [21:27:11] (03PS1) 10Dzahn: delete smokeping apache template [operations/puppet] - 10https://gerrit.wikimedia.org/r/147656 [21:27:13] (also, how is this affecting loadbalancing? pybal doesn't care about the contents, only that it worked at all) [21:27:48] bblack: it's affecting the backends [21:27:48] (03CR) 10Dzahn: [C: 032] "the role class that used it is already deleted now" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147656 (owner: 10Dzahn) [21:28:02] causing LVS checks to pile on specific backends [21:28:06] you mean the excess load from pybal's queries? [21:28:29] one would think a health check would be pretty cheap, and that real traffic would dwarf healthcheck traffic... [21:29:00] hmm. that's probably true. [21:29:04] (03CR) 10BBlack: rcstream: make lvs health check fetch /nginx_status [operations/puppet] - 10https://gerrit.wikimedia.org/r/145997 (https://bugzilla.wikimedia.org/67957) (owner: 10Ori.livneh) [21:29:20] in any case, I retract my ill-considered +1 :) [21:29:30] yes, it's an ill-considered fix. [21:29:40] bblack: thanks for scrutinizing [21:33:26] greg-g: any info on the user-facing consequences of MW update on Wikitech? 
[21:33:49] (03CR) 10Dzahn: [C: 032] icinga - retab Apache config template [operations/puppet] - 10https://gerrit.wikimedia.org/r/147201 (owner: 10Dzahn) [21:34:10] (03CR) 10Dzahn: "thx" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147204 (owner: 10Dzahn) [21:38:36] (03CR) 10Dzahn: [C: 032] metrics - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147214 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [21:38:49] (03CR) 10Dzahn: [V: 032] metrics - update SSL cipher list [operations/puppet] - 10https://gerrit.wikimedia.org/r/147214 (https://bugzilla.wikimedia.org/53259) (owner: 10Dzahn) [21:38:51] (03CR) 10Ori.livneh: [C: 031] Add average/90th pct search latency for 1 hour window [operations/puppet] - 10https://gerrit.wikimedia.org/r/147651 (owner: 10Chad) [21:39:19] (03PS3) 10Dzahn: metrics - outdated variable syntax [operations/puppet] - 10https://gerrit.wikimedia.org/r/147215 [21:39:35] (03CR) 10Dzahn: [C: 032] metrics - outdated variable syntax [operations/puppet] - 10https://gerrit.wikimedia.org/r/147215 (owner: 10Dzahn) [21:41:31] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures [21:43:17] manybubbles: You forgot Russia(n Wikipedia). [21:44:17] Man, what a wmf branch really needs is s few more submodules :( [21:45:27] (03CR) 10Plucas: "I need to add another patch, to apply the same change to debian/kafka.kafka-mirror.init. I'll try to rope kmosher into adding it to this r" [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/147338 (owner: 10Kmosher) [21:47:05] bd808: recursive submodules ftw! 
[21:50:16] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate declaration: Package[imagemagick] is already declared in file /etc/puppet/modules/contint/manifests/packages.pp:92; cannot redeclare at /etc/puppet/modules/mediawiki/manifests/packages.pp:13 on node i-000004b1.eqiad.wmflabs [21:50:31] Yikes [21:51:30] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [21:52:07] (03CR) 10Dzahn: [C: 032] wikitech - remove DHE ciphers [operations/puppet] - 10https://gerrit.wikimedia.org/r/147315 (owner: 10Dzahn) [21:52:35] Krinkle: on tin ? [21:52:44] how did it recover then? [21:52:50] mutante: integration-slave1004 [21:52:58] Brand new instance, couldn't provision it. [21:53:15] i see.. hrmm [21:53:35] I'm working on upgrading the Jenkins slaves to Trusty. But before I could do anything it failed on this. [21:54:38] I recall there being some kind of hack to avoid "defining" the same package twice [21:54:52] i think it's new in the mw module [21:54:59] Yes [21:54:59] and can/should be removed in contint [21:55:09] where it was long time before [21:55:11] greg-g, awight: lint.php runs cleanly on /a/common/php-1.24wmf14 again [21:55:25] Do we use that hack yet? And if not, what do we currently do when two unrelated classes use the same debian package? [21:55:41] Krinkle: we just remove it from one of the classes [21:55:47] bd808|deploy: ok awesome. I'll redo my patch so it matches wmf12 and 13. [21:55:55] mutante: And then how do you ensure the other class installs the package if missing? [21:56:23] I guess you'd move package{ foo ensure } to some generic place and depend on Package['foo'] in both places? [21:56:38] awight, greg-g: I did not sync anything, and I'd rather not if awight can take care of that as needed now [21:57:06] Ideally there'd be a test against this. 
I mean, obviously no two classes no matter how unrelated should define the same package, given how problematic that is in puppet. [21:57:10] Krinkle: it depends , but yea, in some cases we have generic package classes [21:57:23] I'll solve it whatever way works. What do you recommend we do here? [21:57:24] @seen anomie [21:57:29] in this case.. does contint need it at all? [21:57:36] i think it already has the mw classes [21:57:36] I'm not going to doubt that right now. [21:57:41] It doesn't. [21:57:57] already making a patch to remove it from contint class [21:58:01] so for now you can make instances [22:00:03] (03PS1) 10Dzahn: contint-remove imagemagick, duplicate definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 [22:00:54] mutante: It does use it. The slaves run temporary installs of mediawiki. And mediawiki presuably needs that package to do thumb scaling and what not [22:01:31] awight: do you still need to sync? [22:01:34] thanks bd808 [22:01:53] odder: "resolve some echo bugs, update localization, and fix a minor issue with edit attribution on instance info pages. " [22:02:01] odder: from what the person doing the upgrade told me [22:02:03] greg-g: yep, I'm about to right now [22:02:08] cool [22:02:16] Krinkle: when did it get added to mw module? [22:02:20] greg-g: yw. breakage was at least partly my fault [22:02:21] i dont see it yet in git log [22:02:36] mutante: Presumably when it was created recently by ori and others in the refactor of mw modules [22:02:39] bd808: what a jerk [22:02:41] greg-g: Sorry, I meant outage or whatever. [22:03:02] odder: shouldn't be [22:03:22] !log awight updated /a/common/php-1.24wmf14 to {{Gerrit|I1036dae02}}: Update mediawiki/core/vendor to head to 1.24wmf14 [22:03:26] Logged the message, Master [22:04:26] Krinkle: https://gerrit.wikimedia.org/r/#/c/133956/3 [22:04:29] mutante: Do you mind if I change that patchset to instead comment it out and add a FIXME comment? It needs this package. 
Considering this is the 5th tangent today while trying to get started, I likely won't be able to finish it today and Antoine might not know otherwise Monday if it just gets removed. [22:04:30] that is May [22:04:50] !log awight Synchronized php-1.24wmf14: update FundraisingTranslateWorkflow submodule (take 2) (duration: 00m 58s) [22:04:56] So contint slaves have been errorring on puppet provisioning every 30 minutes since May [22:04:56] Logged the message, Master [22:05:06] Krinkle: go ahead, i dont mind at all, just trying to get an emergency fix for you [22:05:16] but this appears to be broken since May 18th [22:05:43] i thought it was like ...today [22:05:51] Can we move it to a generic file maybe? Else I'd need to create a manual page with "run these bash commands manually over ssh after creating a new node before pooling it". which would contain a sudo apt-get install for imagemagick. And then I need to make sure everybody reads that page before they create new servers in labs or production of this type. [22:06:07] bd808: okay, done with that sync. thanks again for the quick patching! [22:06:10] well, that's what happens when there's no error reporting in labs (since January?). [22:06:37] (03PS2) 10Dzahn: contint-remove imagemagick, duplicate definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 [22:07:25] awight: yw. 
sorry you had to find out it was broken [22:08:46] (03CR) 10Dzahn: "there is a duplicate definition on labs instances because of the imagemagick package -> Ia3d002201734cd57" [operations/puppet] - 10https://gerrit.wikimedia.org/r/133956 (owner: 10Ori.livneh) [22:09:08] (03PS3) 10Krinkle: contint-remove imagemagick, duplicate definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 (owner: 10Dzahn) [22:09:32] (03CR) 10Krinkle: [C: 031] contint-remove imagemagick, duplicate definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 (owner: 10Dzahn) [22:10:01] Krinkle: i'd like to pass that on to ori and hashar (if we should use it in generic files ) [22:10:16] k [22:11:28] Krinkle: ah, i have another idea.. [22:11:48] if defined( Package['imagemagick'] ) { [22:11:51] (03PS4) 10Krinkle: contint: remove imagemagick, duplicate definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 (owner: 10Dzahn) [22:12:18] the other one would be virtual resources [22:12:20] seems to be used in a few other places [22:12:24] yeah [22:13:02] Maybe something like this method should be available globally? https://github.com/wikimedia/operations-puppet/blob/c2a1762afe209031d2feeea28aca85d98b548ef9/modules/puppetmaster/manifests/naggen2.pp#L7-L10 [22:13:17] or do we want the classes to be re-usable outside our puppet repo? [22:14:30] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures [22:14:34] theoretically we do want them to be re-usable.. 
not sure [22:14:41] let me amend one more time [22:14:43] OK [22:14:56] (03PS1) 10Mwalker: Some additional AppArmor paths for OCG [operations/puppet] - 10https://gerrit.wikimedia.org/r/147666 [22:17:44] (03PS5) 10Dzahn: contint: fix duplicate definition of imagemagick [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 [22:17:53] (03PS2) 10Mwalker: Some additional AppArmor paths for OCG [operations/puppet] - 10https://gerrit.wikimedia.org/r/147666 [22:18:31] Krinkle: see how it was done for python-requests package right below that [22:18:56] yeah [22:19:06] mutante: eh.. package { 'python-requests': [22:19:11] yea :p [22:19:13] (03PS6) 10Dzahn: contint: fix duplicate definition of imagemagick [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 [22:19:14] there it is [22:19:25] (03CR) 10Krinkle: [C: 031] "Thanks!" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 (owner: 10Dzahn) [22:20:26] (03CR) 10Dzahn: [C: 032] contint: fix duplicate definition of imagemagick [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 (owner: 10Dzahn) [22:21:40] Krinkle: and we have a third option i just rememberd.. "ensure_resource" from stdlib [22:21:52] but those would be bigger changes and that one was exactly how hashar solved it before [22:22:40] Krinkle: https://forge.puppetlabs.com/puppetlabs/stdlib -> "ensure_packages" even [22:22:46] Takes a list of packages and only installs them if they don't already exist. It optionally takes a hash as a second parameter that will be passed as the third argument to the ensure_resource() function. [22:23:41] I see we do already ship that [22:23:42] modules/stdlib/Modulefile [22:23:43] nice [22:26:06] (03CR) 10Dzahn: "basically (at least) 3 ways to solve this. a) if ! defined(Package .. 
b) virtual resources and 'realize' them c) ensure_packages from st" [operations/puppet] - 10https://gerrit.wikimedia.org/r/147664 (owner: 10Dzahn) [22:31:31] (03CR) 10Dzahn: [C: 032] OTRS - remove DHE ciphers [operations/puppet] - 10https://gerrit.wikimedia.org/r/147316 (owner: 10Dzahn) [22:43:19] (03CR) 10PleaseStand: "> Wiki is locked, data's still in the database. I don't see what there is to fix." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147012 (owner: 10Chad) [22:46:39] (03PS1) 10BryanDavis: Use aliasByNode() to clean up metric labels [operations/puppet] - 10https://gerrit.wikimedia.org/r/147673 [22:48:08] (03CR) 10PleaseStand: "Also, does this WONTFIX bug 67763?" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147012 (owner: 10Chad) [22:49:29] (03PS6) 10Dzahn: Add puppet module for a tor relay [operations/puppet] - 10https://gerrit.wikimedia.org/r/140948 [22:49:46] (03CR) 10Dzahn: Add puppet module for a tor relay (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/140948 (owner: 10Dzahn) [22:51:37] (03CR) 10Dzahn: [C: 031] "@Ariel: port change: done , log dir: gets created by package, checked on Debian laptop , log rotate: gets added by package as well, chec" [operations/puppet] - 10https://gerrit.wikimedia.org/r/140948 (owner: 10Dzahn) [22:52:36] (03CR) 10Dzahn: "@Jan, yep, using 443 and 80 now, i expect we'll get a misc box that does not already run anything else (RT: 7925)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/140948 (owner: 10Dzahn) [23:02:48] mutante: The error persists and it makes sense [23:02:49] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate declaration: Package[imagemagick] is already declared in file /etc/puppet/modules/contint/manifests/packages.pp:97; cannot redeclare at /etc/puppet/modules/mediawiki/manifests/packages.pp:13 on node i-000004b1.eqiad.wmflabs [23:03:02] while ideally we wouldn't rely on the load order, the load order 
is pretty consistent [23:03:11] and just like before, mediawiki/packages loads last [23:03:20] that's where the error came from [23:03:23] (03CR) 10PleaseStand: "> Also, does this WONTFIX bug 67763?" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/147012 (owner: 10Chad) [23:03:24] it never came from contint [23:03:41] mutante: I guess we need to patch mediawiki::packages as well? [23:04:31] * Krinkle writes path [23:08:16] (03PS1) 10Krinkle: mediawiki: Fix duplicate imagemagick package definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147677 [23:08:24] andrewbogott: ^ [23:09:03] (03PS2) 10Krinkle: mediawiki: Fix duplicate imagemagick package definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147677 [23:10:39] (03CR) 10Andrew Bogott: [C: 032] mediawiki: Fix duplicate imagemagick package definition [operations/puppet] - 10https://gerrit.wikimedia.org/r/147677 (owner: 10Krinkle) [23:12:12] Are there any delays with mailing lists? [23:12:46] I just sent an e-mail to a mailing list from an incorrect e-mail, and can't get the bounce [23:13:37] Krinkle: re.. was reading something else.. i see Andrew Bogott merged.. good ? [23:14:18] mutante: Yeah, but there's another package dupe, but it only tells me after I fixed that one [23:14:19] php-alc [23:14:21] php-apc [23:14:25] and there's loads more I'm sure [23:14:42] mutante: Want me to add if()'s for those as well in mediawiki and contint? [23:16:15] ori: opinions which way to work around it? [23:22:55] * AaronSchulz kicks git-deploy [23:27:14] andrewbogott: Would you be willing to merge another such patch? [23:27:26] probably :) [23:27:36] And can I use ensure_packages() instead of those if- blocks? [23:28:14] It's a relatively new global method from stdlib that does the same (read the source in stdlib/lib/puppet/parser/functions/ensure_packages.rb) we already use it in minimalpuppetagent.pp and zuul/manifests/init.pp [23:28:40] I have… never heard of that. 
Let me read up a bit [23:29:09] Krinkle, is it possible that we want to just remove these packages from CI? Do we ever apply that class w/out the other? [23:29:40] andrewbogott: no, we've already been there [23:29:42] andrewbogott: It doesn't apply both classes afaik [23:29:56] see last comment on https://gerrit.wikimedia.org/r/#/c/147664/ [23:30:00] Are things from stdlib already present throughout? [23:30:14] we can use more of those "if"s, or virtual resources or the one Krinkle mentioned [23:31:34] Hm, looks like ensure_packages is only barely used. [23:31:40] http://serverfault.com/questions/486455/why-puppet-can-require-each-package-just-once mentions all the options [23:31:45] But it does what we want, so, I think you should go ahead. [23:32:16] * AaronSchulz wonders if git-deploy is broken [23:33:20] (PS7) Dzahn: Add puppet module for a tor relay [operations/puppet] - https://gerrit.wikimedia.org/r/140948 [23:35:03] (CR) Dzahn: "PS7: add "tor-arm" package for monitoring, add comment about "MyFamily" config option..." [operations/puppet] - https://gerrit.wikimedia.org/r/140948 (owner: Dzahn) [23:36:27] (PS1) Krinkle: Fix more duplicate package definition of 'php-apc' between mediawiki and contint [operations/puppet] - https://gerrit.wikimedia.org/r/147681 [23:37:56] ori: around? [23:38:07] AaronSchulz: still in my 1:1 [23:38:16] (PS2) Krinkle: Fix dupe package definition of 'php-apc' between mediawiki and contint [operations/puppet] - https://gerrit.wikimedia.org/r/147681 [23:38:19] heh, wow [23:38:25] mutante: work around what? [23:38:43] ori: duplicate definitions of packages between contint and mw module [23:38:50] i'll look [23:38:57] we can either add lots of "if"s [23:39:06] or make them all virtual resources and realize them [23:39:24] ori: So far only imagemagick and php-apc, but once I deploy that, provisioning will probably fail on some other package..
[23:39:25] or use ensure_packages() [23:39:46] just don't declare them in contint [23:39:58] contint has them because contint tests mediawiki and mediawiki needs imagemagick and php-apc [23:40:07] so it's appropriate for those to be defined there [23:40:13] ori: 147664, 147681 (added in 133956) [23:40:39] * ori reviews [23:40:57] the first instinct was to simply remove them from contint, yes [23:41:16] (CR) Krinkle: [C: -1] "Before this change:" [operations/puppet] - https://gerrit.wikimedia.org/r/147681 (owner: Krinkle) [23:41:42] ori: Yes, but contint isn't ready yet to use all of the mediawiki class. [23:41:54] but: 15:02 < Krinkle> mutante: It does use it. The slaves run temporary installs of mediawiki. And mediawiki presumably needs that package to do thumb scaling and what not [23:42:42] Too many differences and too many abstraction layers at once. The outer scope of today is using node 0.10 for npm tests of oojs/visualeditor, *not* at all mediawiki related. But it needs Trusty Ubuntu, and for that I'm creating a new instance, and then it turned out that the role-ci-slave class has been broken for months since the mediawiki refactoring. [23:42:55] mutante: It does use it; it = imagemagick. [23:43:17] i don't want to use ensure_packages [23:43:33] it took a ridiculous amount of time to get ::mediawiki in shape and it's still not done [23:43:50] ori: if-defined-Package is preferred? [23:43:54] the spidering interdependencies are so tightly interwoven that you can't change anything if you try to move everything at once [23:43:57] ori: virtual resources like the top answer on http://serverfault.com/questions/486455/why-puppet-can-require-each-package-just-once ?
[23:44:13] it's not the right answer [23:44:27] this isn't a limitation of puppet [23:44:30] it's a design choice [23:44:46] you're supposed to articulate your setup in a way that declares everything once [23:45:48] ori: if you mean that different things using mediawiki-like stuff should use the same class, then yes, that's a bad design in contint (although to be fair, the mediawiki package didn't exist back then)- however I don't think it's at all realistic that a package is only defined once, for packages don't have a 1:1 relationship with applications, roles or classes. Lots of different things may use php [23:45:48] , imagemagick, or whatever other package. [23:46:50] you can include classes multiple times [23:46:58] they should be contained in a class that can be easily included [23:47:11] i'm in a meeting atm, is something critically broken? [23:47:16] if not, can we pick this up on monday? [23:47:25] that's a generic "package" class. afair we did those in the past but they weren't liked either [23:47:31] sorry, i gotta run though [23:48:08] ori: I can do it myself and have another opsen merge. I just need to know what you want me to use: if-defined-package or ensure_packages. We can discuss grand scheme things later. I just need my stuff to not be broken and destroy another day of work. [23:48:55] I don't think there's any issue with using ensure_packages. It's the same as package{} but doesn't conflict with contint and vice versa. So I'm going to use that. [23:49:12] we've discussed it before with faidon and joe [23:49:19] you can use it in contint but please don't introduce it to mediawiki [23:49:44] ori: it has to be in both because puppet's default deterministic load order loads mediawiki after contint [23:49:48] I tried that, it doesn't work [23:50:02] ori: It was already in mediawiki:: and at least 20 other classes in puppet, so I'm going to use if-defined, OK? [23:50:14] please don't [23:50:22] what's broken right now?
[23:50:38] Right now the integration slaves are broken because I can't create a new instance because puppet won't run [23:50:50] I already have 2 pages of documentation with manual fixes just from today, my head is exploding. [23:50:55] this does not matter... ? [23:51:21] i can help you fix it properly, but not for another twenty minutes [23:51:25] OK [23:51:26] and it would take an ops person to merge [23:51:30] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [23:51:47] ori: at least 5 different opsen have merged patches using if-defined-package or ensure_packages (which does the same) over the last few days. [23:52:07] but not to the mediawiki module [23:52:17] I'll wait 20 minutes [23:52:24] they would know to ask _joe_ or me [23:52:28] since they know we're working on it [23:53:00] PROBLEM - puppet last run on wtp1004 is CRITICAL: CRITICAL: Puppet has 1 failures [23:53:07] well, if it's breaking unrelated classes, I guess they think the future plans don't apply, certainly if it's just a minor patch, like adding an if-block that doesn't affect the mediawiki module otherwise. [23:53:50] I'll be back in a bit and then we'll fix it [23:54:55] Krinkle: I need to step away, but email me if/when you have a working patch and I'll try to check in [23:56:34] thanks andrewbogott [23:56:54] mutante might be around to merge too… but you're running out of timezones :) [23:59:09] (PS3) Krinkle: Fix more duplicate package definition of 'php-apc' between mediawiki and contint [operations/puppet] - https://gerrit.wikimedia.org/r/147681
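[editor's note] For readers following the duplicate-declaration thread above, the three workarounds named in the discussion (if-defined guards, virtual resources, and stdlib's ensure_packages()) can be sketched roughly as below. This is an illustrative fragment only: the class names `example_ci::packages` and `example_shared::packages` are hypothetical and are not the contents of the real contint or mediawiki manifests. Only the package names come from the log.

```puppet
# Hypothetical classes for illustration; not the actual Wikimedia manifests.

class example_ci::packages {
    # Option a) if-defined guard: declare the package only if nothing
    # evaluated earlier has already declared it. The check depends on
    # evaluation order, so a plain package { } declared later in another
    # class would still raise "Duplicate declaration".
    if ! defined(Package['imagemagick']) {
        package { 'imagemagick':
            ensure => present,
        }
    }

    # Option c) ensure_packages() from puppetlabs-stdlib: declares the
    # package only if it is not already in the catalog. It has the same
    # ordering caveat as the guard above.
    ensure_packages(['php-apc'])
}

class example_shared::packages {
    # Option b) virtual resource: declared exactly once in a shared class,
    # then realized wherever it is needed. Realizing it more than once is
    # allowed; declaring it more than once is not.
    @package { 'php5-cli':
        ensure => present,
    }
}

class example_ci::needs_php {
    include example_shared::packages
    realize(Package['php5-cli'])
}
```

The ordering caveat in options a) and c) matches what is reported in the log: guarding only the contint side was not enough, because the mediawiki class is evaluated afterwards and still declares the package unconditionally. Option b) and the "declare everything once in an includable class" approach argued for in the channel avoid that dependence on load order.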