[00:41:13] (CR) Ori.livneh: [C: 1] beta: Small scap fixes [operations/puppet] - https://gerrit.wikimedia.org/r/140045 (owner: BryanDavis)
[01:32:50] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Tue Jun 17 01:32:44 UTC 2014
[02:23:50] !log LocalisationUpdate completed (1.24wmf8) at 2014-06-17 02:22:46+00:00
[02:23:57] Logged the message, Master
[02:35:12] !log LocalisationUpdate completed (1.24wmf9) at 2014-06-17 02:34:09+00:00
[02:35:17] Logged the message, Master
[02:50:16] (PS1) Springle: Milimetric access to stats user on stat1003 [operations/puppet] - https://gerrit.wikimedia.org/r/140051
[02:55:51] Papaul
[03:20:18] !log LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 17 03:19:12 UTC 2014 (duration 19m 11s)
[03:20:22] Logged the message, Master
[03:25:14] (PS1) Springle: Allow bsitzmann access to bast1001 [operations/puppet] - https://gerrit.wikimedia.org/r/140052
[03:27:05] _joe_: ^ (think that's needed)
[04:18:04] hey springle
[04:18:14] hi
[04:18:30] RT gives me a CN mismatch (says it belongs to racktables)
[04:18:34] known?
[04:19:04] hmm
[04:21:30] i'm not seeing that
[04:21:44] hrmmm, rt is an abbreviation for racktables :P
[04:21:56] well i checked my hosts file and that's not it
[04:22:00] digging some more
[04:22:03] aha that rt
[04:22:18] no, i meant rt.wikimedia.org
[04:22:50] (I don't even have access to racktables)
[04:23:52] so, idk, maybe it's just some kind of failure to do SNI
[04:24:25] but i tested with 2 iceweasel instances of the same binary and i get different results (different internet connections too)
[04:24:31] * jeremyb tests some more
[04:24:40] ok
[04:25:39] (it must be using SNI because they both resolve to the same IP)
[04:26:54] ok, wtf
[04:27:12] so, i put in hosts file "127.0.0.1 rt.wikimedia.org"
[04:27:19] reloaded in both browsers and neither works
[04:27:30] (PS1) Ori.livneh: delete apache::vhost::proxy and apache::vhost::redirect [operations/puppet] - https://gerrit.wikimedia.org/r/140054
[04:27:30] then commented it out and reload again and they both work
[04:27:42] * jeremyb gives up
[04:27:45] heisenbug
[04:27:49] sorry for the (false?) alarm
[04:27:49] hah
[04:28:08] jeremyb: also, https://rt.wikimedia.org/Ticket/Display.html?id=5553
[04:28:12] can we close?
[04:28:37] (now that your browsers work... ;)
[04:28:45] hah
[04:31:00] hrmmm
[04:35:55] (CR) Giuseppe Lavagetto: [C: 1] "my bad."
[operations/puppet] - https://gerrit.wikimedia.org/r/140052 (owner: Springle)
[04:37:00] (CR) Springle: [C: 2] Allow bsitzmann access to bast1001 [operations/puppet] - https://gerrit.wikimedia.org/r/140052 (owner: Springle)
[04:41:04] (PS1) Ori.livneh: delete webserver::modproxy [operations/puppet] - https://gerrit.wikimedia.org/r/140056
[04:42:43] <_joe_> I hate puppet manifests written like that
[04:42:52] <_joe_> all small classes doing nothing
[04:43:06] heh
[04:43:08] <_joe_> ori: kill em all
[04:43:17] (PS1) Ori.livneh: delete webserver::apache2 [operations/puppet] - https://gerrit.wikimedia.org/r/140057
[04:43:29] * ori does
[04:43:32] <_joe_> ori: delete ::webserver :P
[04:43:38] eventually
[04:43:44] it's a tangled web
[04:47:12] (PS1) Ori.livneh: delete webserver::apache2::rpaf [operations/puppet] - https://gerrit.wikimedia.org/r/140058
[04:48:17] (CR) jenkins-bot: [V: -1] delete webserver::apache2::rpaf [operations/puppet] - https://gerrit.wikimedia.org/r/140058 (owner: Ori.livneh)
[04:48:22] tut tut
[04:52:01] heh, i miss Ori :) I recently mentioned Ori on a thread but i don't think he's on that list...
[04:52:14] (ori has mail in case he wants to know what i mean)
[04:52:15] i'm probably not
[04:52:20] oh
[04:52:21] * ori looks
[04:55:20] huh, are we using diamond?
[04:55:25] * jeremyb spies https://rt.wikimedia.org/Ticket/Display.html?id=7699
[04:55:37] yes
[04:57:18] hi jeremyb
[04:57:49] hi ori
[04:59:32] (PS2) Ori.livneh: delete webserver::apache2::rpaf [operations/puppet] - https://gerrit.wikimedia.org/r/140058
[05:31:13] springle: ok, replied 5553 and 7548
[05:31:46] jeremyb: thank you!
[07:03:50] (PS2) Springle: Set $wgCategoryCollation to 'uca-fr' on frwikinews [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/138820 (https://bugzilla.wikimedia.org/66165) (owner: Odder)
[07:06:08] (CR) Springle: [C: 2] Set $wgCategoryCollation to 'uca-fr' on frwikinews [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/138820 (https://bugzilla.wikimedia.org/66165) (owner: Odder)
[07:06:14] (Merged) jenkins-bot: Set $wgCategoryCollation to 'uca-fr' on frwikinews [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/138820 (https://bugzilla.wikimedia.org/66165) (owner: Odder)
[07:07:13] !log springle Synchronized wmf-config/InitialiseSettings.php: $wgCategoryCollation to uca-fr on frwikinews (duration: 00m 07s)
[07:07:18] Logged the message, Master
[07:12:21] !log starting updateCollation on s3 frwikinews from tin
[07:12:27] Logged the message, Master
[07:16:00] ori: yes
[07:20:18] thanks
[07:22:25] anyone feels like reviewing some lint fixes ?
[07:25:23] sure
[07:25:30] i can only +1/-1 tho
[07:26:02] i'll do the salt one
[07:29:01] cajoel: here?
[07:30:18] ori: btw, /etc/init.d/apache2 is wrong
[07:30:29] invoke-rc.d apache2 start is the traditional way to do it
[07:30:39] and the new way is "service apache2 start"
[07:31:02] the whole thing should be torched :/
[07:31:07] but point taken
[07:31:10] i'll update
[07:31:27] nah
[07:31:34] I'll wait until it's torched :)
[07:31:39] <_joe_> eheh
[07:31:51] i keep having to abandon commits locally
[07:31:59] because i start cleaning up after some class
[07:32:02] can you review https://gerrit.wikimedia.org/r/#/c/138804 ?
:)
[07:32:07] and the spidering dependencies lead me to construct a giant commit
[07:32:28] oh sure
[07:32:34] I have my own opinion about that one, but I'll leave it up to you :)
[07:32:59] (CR) Ori.livneh: salt: lint (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/139684 (owner: Matanya)
[07:33:54] a new php5::apache2packages class
[07:34:00] :P
[07:34:01] no no no no no no no no no no no no no no no no no no
[07:34:31] rofl
[07:35:14] (CR) Matanya: salt: lint (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/139684 (owner: Matanya)
[07:37:33] <_joe_> ori: I might be guilty for the name
[07:38:02] <_joe_> I may have said 'create a class for common php5 packages, like php5::packages'
[07:38:14] <_joe_> but people should know better than listening to me on naming
[07:38:18] <_joe_> :)
[07:38:29] <_yuvi_> nooooo
[07:38:33] php5::packages would be fine
[07:38:59] it's not the name so much as the whole thing being somewhat willy-nilly and unprincipled
[07:40:30] <_joe_> ori: well, on that I don't agree
[07:40:41] <_joe_> either you have one-size-fits all modules
[07:40:59] <_joe_> either you use ensure_resource to wrap any package def
[07:41:10] <_joe_> (which btw does not work in puppet 2.7)
[07:41:26] <_joe_> or you create classes abstracting minimum functionality units
[07:41:42] <_joe_> I prefer this very much to inclusion-order tricks
[07:41:44] these aren't completely distinct classes of functionality, though
[07:41:56] as to have the need of "minimum functionality units"
[07:42:16] contint should be as close to production as possible and hence reuse larger blocks of code
[07:42:47] <_joe_> we may agree on that, though we *may* run some php5 apps that do not need the whole mediawiki module
[07:42:58] yup, that's true
[07:43:02] i don't actually see php5::apache2packages in the patch
[07:43:05] <_joe_> and having the packages needed for that grouped in one single class
[07:43:05] am i being a dork?
[07:43:11] <_joe_> ori: mmmh let me see
[07:43:33] (CR) Ori.livneh: [C: -1] "where's php5::apache2packages?" [operations/puppet] - https://gerrit.wikimedia.org/r/138804 (owner: Hashar)
[07:47:57] (PS2) Matanya: salt: lint [operations/puppet] - https://gerrit.wikimedia.org/r/139684
[07:48:18] (PS4) Faidon Liambotis: Switch Central/South Asia to esams [operations/dns] - https://gerrit.wikimedia.org/r/80973
[07:50:32] (CR) Ori.livneh: [C: 1] salt: lint [operations/puppet] - https://gerrit.wikimedia.org/r/139684 (owner: Matanya)
[07:54:06] <_joe_> ori: so, we want to use the module 'apache' and get rid of 'webserver', right?
[07:54:45] we want to have one module to configure apache I'd say
[07:54:55] parts of webserver have some syntactic sugar that apache doesn't
[07:56:12] <_joe_> so we need to build a crasis of the two
[07:56:18] <_joe_> ok, makes sense
[07:56:25] yeah
[07:56:30] that's where i'm at, too
[07:56:47] there's stuff to torch still, tho
[07:58:37] (PS5) Faidon Liambotis: Switch Central/South Asia to esams [operations/dns] - https://gerrit.wikimedia.org/r/80973
[07:58:39] (PS1) Faidon Liambotis: Switch more Asia-Pacific countries to ulsfo [operations/dns] - https://gerrit.wikimedia.org/r/140064
[07:58:57] (CR) Giuseppe Lavagetto: [C: 2] delete webserver::apache2::rpaf [operations/puppet] - https://gerrit.wikimedia.org/r/140058 (owner: Ori.livneh)
[08:00:26] <_joe_> mmm what's up with gerrit
[08:00:54] <_joe_> oh there's a dependency
[08:01:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:01:49] (PS2) Giuseppe Lavagetto: delete webserver::modproxy [operations/puppet] - https://gerrit.wikimedia.org/r/140056 (owner: Ori.livneh)
[08:03:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:04:45] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL:
CRITICAL: 13.33% of data exceeded the critical threshold [500.0]
[08:05:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:06:21] (PS1) Ori.livneh: Delete webserver::apache::module [operations/puppet] - https://gerrit.wikimedia.org/r/140067
[08:07:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:07:39] (CR) Giuseppe Lavagetto: [C: 2] delete webserver::modproxy [operations/puppet] - https://gerrit.wikimedia.org/r/140056 (owner: Ori.livneh)
[08:08:37] (PS2) Giuseppe Lavagetto: delete webserver::apache2 [operations/puppet] - https://gerrit.wikimedia.org/r/140057 (owner: Ori.livneh)
[08:09:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:09:25] (CR) Giuseppe Lavagetto: [C: 2] delete webserver::apache2 [operations/puppet] - https://gerrit.wikimedia.org/r/140057 (owner: Ori.livneh)
[08:09:38] (CR) Giuseppe Lavagetto: [V: 2] delete webserver::apache2 [operations/puppet] - https://gerrit.wikimedia.org/r/140057 (owner: Ori.livneh)
[08:10:29] (PS3) Giuseppe Lavagetto: delete webserver::apache2::rpaf [operations/puppet] - https://gerrit.wikimedia.org/r/140058 (owner: Ori.livneh)
[08:11:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:13:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:15:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:15:07] <_joe_> ori: I've seen you merged my change to profiler-to-carbon, did you release that?
[08:17:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:18:45] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% data above the threshold [250.0]
[08:19:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:21:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:23:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:24:38] (PS7) Nikerabbit: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095
[08:25:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:27:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 07:58:16 UTC
[08:27:55] RECOVERY - Puppet freshness on analytics1026 is OK: puppet ran at Tue Jun 17 08:27:44 UTC 2014
[08:30:05] PROBLEM - Puppet freshness on analytics1026 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 08:27:44 UTC
[08:34:24] (CR) Nikerabbit: "PS7:" [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[08:36:12] (CR) Filippo Giunchedi: [C: 1] puppet: switch masters to puppet 3 [operations/puppet] - https://gerrit.wikimedia.org/r/139832 (owner: Giuseppe Lavagetto)
[08:36:29] (PS8) Nikerabbit: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095
[08:36:53] _joe_: doo iitt
[08:36:54] :D
[08:37:09] (CR) Nikerabbit: "PS8:" [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[08:37:30] haha
[08:42:18] (PS2) Filippo Giunchedi: scap: ensure=>absent /usr/local/bin/sync-common-file [operations/puppet] -
https://gerrit.wikimedia.org/r/135924 (owner: BryanDavis)
[08:42:23] (CR) Filippo Giunchedi: [C: 2 V: 2] scap: ensure=>absent /usr/local/bin/sync-common-file [operations/puppet] - https://gerrit.wikimedia.org/r/135924 (owner: BryanDavis)
[08:42:38] (CR) Filippo Giunchedi: "ori: yup, done!" [operations/puppet] - https://gerrit.wikimedia.org/r/135924 (owner: BryanDavis)
[08:45:11] <_joe_> paravoid: last rounds of testing
[08:51:54] PROBLEM - Unmerged changes on repository puppet on palladium is CRITICAL: Fetching origin
[08:52:14] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: Fetching origin
[08:58:24] RECOVERY - Puppet freshness on analytics1026 is OK: puppet ran at Tue Jun 17 08:58:15 UTC 2014
[09:04:21] oops!
[09:04:44] (PS9) Nikerabbit: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095
[09:04:50] done
[09:04:54] RECOVERY - Unmerged changes on repository puppet on palladium is OK: Fetching origin
[09:05:14] RECOVERY - Unmerged changes on repository puppet on strontium is OK: Fetching origin
[09:06:49] (PS10) KartikMistry: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[09:13:43] (PS11) KartikMistry: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[09:26:23] <_joe_> and... we've introduced a few errors w. puppet 3 since last I checked properly (~ 10 d ago)
[09:27:48] :(
[09:28:34] <_joe_> paravoid: hold on, maybe it's just some facts archives not updated... weird.
[10:41:30] (PS12) KartikMistry: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[10:47:56] (CR) Nuria: "Meta question here... weren't we going to use the rsync module?"
[operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: Milimetric)
[10:54:05] (CR) QChris: Add backup role and scripts (5 comments) [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: Milimetric)
[11:00:31] (CR) QChris: "I am not sure if tar-ing up the hourly files in the daily script is" (13 comments) [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: Milimetric)
[11:00:39] (CR) Nikerabbit: [C: -1] cxserver configuration for beta labs (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[11:25:24] PROBLEM - Puppet freshness on holmium is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 08:24:47 UTC
[11:31:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:33:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:33:31] (PS13) KartikMistry: cxserver configuration for beta labs [operations/puppet] - https://gerrit.wikimedia.org/r/139095 (owner: Nikerabbit)
[11:35:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:37:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:39:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:41:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:43:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:45:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun
2014 11:27:54 UTC
[11:47:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:49:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:51:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:53:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:55:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:57:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:27:54 UTC
[11:58:07] RECOVERY - Puppet freshness on cp1057 is OK: puppet ran at Tue Jun 17 11:58:05 UTC 2014
[12:00:07] PROBLEM - Puppet freshness on cp1057 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 11:58:05 UTC
[12:15:42] (PS1) Reedy: Non Wikipedias to 1.24wmf9 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/140100
[12:20:39] (CR) Reedy: "Noting this doesn't make any changes to redirects.conf" [operations/apache-config] - https://gerrit.wikimedia.org/r/138297 (owner: Ori.livneh)
[12:28:04] RECOVERY - Puppet freshness on cp1057 is OK: puppet ran at Tue Jun 17 12:28:01 UTC 2014
[12:35:37] FWIW, I moved all the bot traffic to notices in irssi with this script, much nicer to the eye https://raw.githubusercontent.com/bucciarati/irssi-script-msg_to_notice/master/msg_to_notice.pl
[12:35:50] 13:35 noticeable_nicks = grrrit-wm,icinga-wm,wm-bot,logmsgbot,wikibugs
[12:36:41] (CR) QChris: [C: -1] "Looks like this would not work outside of the wikimetrics1" (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/139558 (https://bugzilla.wikimedia.org/66119) (owner: Milimetric)
[12:44:14] _joe_: Mind spending 30 second on https://bugzilla.wikimedia.org/show_bug.cgi?id=66714 ?
[12:44:19] second-s
[12:45:15] I support this request, if it matter :)
[12:45:20] https://lists.wikimedia.org/mailman/admin/meta-oversight is where you can log-in _joe_
[12:45:38] * twkozlowski would need an NDA to do this stuff
[12:45:59] <_joe_> twkozlowski: sorry, but people should really use RT for mailing lists
[12:46:13] Fun fact: RT is closed for outsiders :-)
[12:46:13] <_joe_> I took over those old tickets not to leave them around
[12:46:25] Anyone can email it
[12:46:31] <_joe_> but you can still send an email to ops-requests
[12:46:39] <_joe_> Reedy: thanks
[12:46:41] Yes. And how many people know they can e-mail it RT?
[12:46:51] Those who need to know
[12:47:17] <_joe_> twkozlowski: ok, when we'll have phabricator this will be over, hopefully
[12:47:38] Yes, I hope so. Then you'll just need to move the ticket to that product or whatever the term would be.
[12:47:44] <_joe_> I can simply close the ticket asking to send an email :)
[12:48:21] That'll take you much longer than just logging in to that list and adding the guy to admins :-)
[12:48:45] <_joe_> yes, but that would mean one less person educated to the correct process
[12:48:48] <_joe_> ;)
[12:49:42] _joe_: filled 7701
[12:50:15] Now someone explain to me why we have duplicate tickets for the same request, please?
[12:50:56] <_joe_> twkozlowski: we can also have one in bugzilla that the RT onduty ops will quietly ignore, if you prefer
[12:51:36] Whatever, hopefully RT dies soon.
[12:51:43] <_joe_> (but let me eat before we continue)
[12:53:50] so much hate :)
[12:54:23] Stupid things provoke hate :-P
[12:54:41] MediaWiki is stupid
[12:54:46] PHP is stupud
[12:57:20] <_joe_> life is beautiful
[12:58:09] was this supposed to be a haiku?
:-P
[13:03:55] (CR) Jgreen: [C: 2 V: 1] fundraising: lint [operations/puppet] - https://gerrit.wikimedia.org/r/140029 (owner: Matanya)
[13:13:02] (PS1) Manybubbles: Switch all wikis to the experimental highlghter [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/140110
[13:25:01] (CR) Manybubbles: [C: 1] Reduce EventLogging sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/139817 (owner: Gilles)
[13:28:57] (CR) Ottomata: [C: 2] Remove spetrea from icinga's analytics contact group [operations/puppet] - https://gerrit.wikimedia.org/r/140016 (owner: QChris)
[13:30:36] <_joe_> ottomata: hi, I need your advice. In the last test for puppet 2 -> 3 migration, I found these diffs: http://puppet-compiler.wmflabs.org/89/html/cp1037.eqiad.wmnet.html
[13:30:51] <_joe_> it's basically only an ordering of an array. Will this matter?
[13:31:00] <_joe_> if not, I think we're good to go.
[13:31:06] nope, doesn't matter
[13:31:16] <_joe_> okedokey
[13:31:46] weird, but ok!
[13:33:14] (PS2) Giuseppe Lavagetto: puppet: switch masters to puppet 3 [operations/puppet] - https://gerrit.wikimedia.org/r/139832
[13:38:17] (CR) Giuseppe Lavagetto: [C: 2] puppet: switch masters to puppet 3 [operations/puppet] - https://gerrit.wikimedia.org/r/139832 (owner: Giuseppe Lavagetto)
[13:38:31] <_joe_> and here we are
[13:38:37] \o/
[13:44:35] * matanya crosses fingers
[13:45:12] gogogo
[13:45:45] <_joe_> palladium done, doing strontium
[13:46:08] <_joe_> the way puppet uses apt is distasteful to say the least
[13:46:55] <_joe_> update complete
[13:48:18] <_joe_> confirmed we're running puppet 3 on the master, and we're having successful puppet runs
[13:51:24] <_joe_> !log production puppet masters upgraded to puppet 3
[13:51:29] Logged the message, Master
[13:51:57] no observable difference in CPU so far
[13:52:09] hm, sort of, strontium is still not up
[13:52:23] <_joe_> really?
mmh
[13:54:13] <_joe_> should I reload apache on palladium, maybe?
[13:57:30] <_joe_> it is in now
[13:58:27] awesome
[13:58:33] let's see how the trend will look like
[13:58:38] <_joe_> yes
[13:58:40] will trusty/ruby 1.9 be next?
[13:58:42] happy days \o/
[13:58:48] <_joe_> yes
[13:58:52] awesome
[13:58:57] <_joe_> it won't take much time I hope
[13:59:09] <_joe_> then we can start migrating clusters
[13:59:49] hm, we could even go for ruby2.0, no?
[14:00:10] <_joe_> that will mean rebuilding the puppet packages for trusty I guess
[14:00:19] <_joe_> (don't remember that exactly)
[14:00:19] no
[14:00:20] Depends: init-system-helpers (>= 1.13~), puppet-common (= 3.4.3-1), ruby | ruby-interpreter
[14:00:39] # apt-cache show ruby2.0 |grep Provides
[14:00:39] Provides: ruby-interpreter
[14:00:42] <_joe_> ok, some packages it depends on depend on ruby 1.9 I guess
[14:00:58] <_joe_> some packages it build-depends for sure
[14:07:48] (Abandoned) Alexandros Kosiaris: icinga analytics contactgroup update [operations/puppet] - https://gerrit.wikimedia.org/r/139335 (owner: Alexandros Kosiaris)
[14:12:11] oh _joe_ awesome!
[14:12:15] puppet 3 puppet masters
[14:12:36] can/should I switch some of the nodes i'm working with? particularly the kafka brokers that i'm messing with?
[14:12:41] and the new cp30* nodes i'm going to install
[14:13:13] <_joe_> whenever you feel like that, on a client is as simple as putting $puppet_version = '3' in their node definitions
[14:13:20] <_joe_> it's really up to you though
[14:13:26] we can just remove the pin
[14:13:30] we don't ensure => latest
[14:14:09] <_joe_> paravoid: we do that AFAIR
[14:14:45] <_joe_> and I honestly liked the idea of being able to switch classes of servers over one at a time if we want
[14:15:14] it looks like we are
[14:16:51] <_joe_> and virt1000, the only puppet 3 client we have, is happy
[14:16:53] <_joe_> :)
[14:17:05] <_joe_> ok, now I do take a break
[14:17:39] (PS1) Faidon Liambotis: check_smtp: send the FQDN on HELO [operations/puppet] - https://gerrit.wikimedia.org/r/140126
[14:17:55] (CR) Faidon Liambotis: [C: 2 V: 2] check_smtp: send the FQDN on HELO [operations/puppet] - https://gerrit.wikimedia.org/r/140126 (owner: Faidon Liambotis)
[14:20:10] err: /Stage[main]/Icinga::Monitor::Files::Misc/Exec[fix_icinga_temp_files]/returns: change from notrun to 0 failed: /bin/chown -R icinga /var/lib/icinga returned 1 instead of one of [0] at /etc/puppet/manifests/misc/icinga.pp:299
[14:20:14] blergh
[14:21:27] bblack: ping?
[14:26:20] PROBLEM - Puppet freshness on holmium is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 08:24:47 UTC
[14:29:09] (CR) Chad: [C: 1] Switch all wikis to the experimental highlghter [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/140110 (owner: Manybubbles)
[14:40:24] (CR) JanZerebecki: [C: 1] Apache settings for wikimania2015wiki [operations/apache-config] - https://gerrit.wikimedia.org/r/139288 (https://bugzilla.wikimedia.org/66370) (owner: Withoutaname)
[14:48:01] <_joe_> so everything seems fine :)
[14:48:26] ha, famous last words :-)
[14:48:40] <_joe_> hey
[14:48:44] <_joe_> you're mean
[14:48:48] <_joe_> :P
[14:48:56] :-P
[14:49:33] _joe_: does that mean you are free to review some patches ?
[14:49:52] manybubbles: You going to do the SWAT today, since you have patches in it?
[14:49:59] anomie: yeah!
[14:50:00] <_joe_> matanya: no, I have to write an outage report now :P
[14:50:24] dammit with those outages! :P
[14:50:25] <_joe_> matanya: add me as a reviewer, I hope to give them a shot today
[14:51:50] _joe_: i piled your mail :)
[14:54:46] (CR) JanZerebecki: [C: 1] Update the redirect target for education.wikimedia.org [operations/apache-config] - https://gerrit.wikimedia.org/r/122866 (owner: Ragesoss)
[14:55:40] gi11es: arround to verify your change for SWAT today?
[14:55:51] manybubbles: yep
[14:56:03] sweet.
I'll push yours in five minutes then
[15:00:05] anomie: The time is nigh to deploy SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140617T1500)
[15:00:38] (CR) Manybubbles: [C: 2] Reduce EventLogging sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/139817 (owner: Gilles)
[15:00:50] (Merged) jenkins-bot: Reduce EventLogging sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/139817 (owner: Gilles)
[15:01:19] * manybubbles has the conch
[15:02:22] !log manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - lower event logging rate for mediaviewer (duration: 00m 05s)
[15:02:24] gi11es: synced
[15:02:27] Logged the message, Master
[15:08:16] I'm starting to see some database errors on fatalmonitor
[15:08:37] Someone just complained of connection errors on wikidata
[15:09:05] SELECT mr_blob,mr_resource,mr_timestamp FROM `msg_resource` WHERE mr_resource IN ('user.options','user.tokens') AND mr_lang = 'he'
[15:09:16] no error message I can find
[15:09:22] I'm probably reading it wrong
[15:09:41] I was just looking at dberror.log on fluorine
[15:10:00] I don't see any error messages either
[15:10:16] Tue Jun 17 15:07:00 UTC 2014 mw1095 zhwiki Error connecting to 10.64.32.17: :real_connect(): (HY000/2013): Lost connection to MySQL server at 'reading initial communication packet', system error: 104
[15:10:23] huh
[15:10:35] manybubbles: I'm starting to see the new values hit EventLogging, everything looks ok
[15:10:43] gi11es: cool - you are done
[15:10:48] thanks!
[15:10:56] Reedy: https://logstash.wikimedia.org/#/dashboard/elasticsearch/fatalmonitor shows them as spikey
[15:11:03] like, limited to small periods
[15:11:32] I'm going to proceed with my backport patch for Cirrus - its already merged and shouldn't be involved
[15:11:48] "A database error has occurred.
Did you forget to run maintenance/update.php after upgrading? See: https://www.mediawiki.org/wiki/Manual:Upgrading#Run_the_update_script\nQuery: SELECT ips_item_id FROM `wb_items_per_site` WHERE ips_site_id = 'nlwiki' AND ips_site_page = 'Vuurrugtangare' LIMIT 1 \nFunction: DatabaseBase::selectRow\nError: 0
[15:13:06] !log manybubbles Synchronized php-1.24wmf9/extensions/CirrusSearch/: SWAT - Fix Cirrus Special:Random (duration: 00m 04s)
[15:13:10] Logged the message, Master
[15:13:14] huh
[15:15:04] Reedy: same thing for SELECT user_id FROM `user` WHERE user_name = '' LIMIT 1
[15:15:06] bd808: i think godog had a question about salt in labs
[15:15:10] i'm directing him to you :P
[15:15:26] I think its not schema related, I think
[15:15:50] ah ye, was trying to get gdash up and running in labs ori bd808
[15:16:10] (PS2) Giuseppe Lavagetto: ganglia_new: retab instance.pp [operations/puppet] - https://gerrit.wikimedia.org/r/139781 (owner: Matanya)
[15:16:16] godog: Ah. And you need trebuchet?
[15:16:48] (CR) Giuseppe Lavagetto: [C: 2] ganglia_new: retab instance.pp [operations/puppet] - https://gerrit.wikimedia.org/r/139781 (owner: Matanya)
[15:17:41] (CR) Giuseppe Lavagetto: [V: 2] ganglia_new: retab instance.pp [operations/puppet] - https://gerrit.wikimedia.org/r/139781 (owner: Matanya)
[15:17:47] godog: We have trebuchet working for the deployment-prep project. Ryan set it up initially, but there are docs on wikitech -- I'll find them for you.
[15:18:01] !log manybubbles Synchronized php-1.24wmf8/extensions/CirrusSearch/: SWAT - Fix Cirrus Special:Random (duration: 00m 04s)
[15:18:06] Logged the message, Master
[15:19:00] bd808: thanks! as far as I understand gdash is being pulled in via deployment_target however salt seemed to fail while fetching
[15:20:13] godog: I think that deployment-prep may be the only project setup to make trebuchet work. There isn't any labs-wide trebuchet deploy host as far as I know.
[15:20:13] (PS2) Giuseppe Lavagetto: mirror: lint [operations/puppet] - https://gerrit.wikimedia.org/r/139681 (owner: Matanya)
[15:20:45] godog: btw, i have a patch for a grafana module, we could merge that and provision it on labs too: https://gerrit.wikimedia.org/r/#/c/133274/
[15:20:52] godog: Instructions on project local setup are at https://wikitech.wikimedia.org/wiki/Trebuchet#Using_Trebuchet_in_Labs
[15:21:34] ori: yep absolutely, would be good to have a playground
[15:21:48] (CR) Ori.livneh: "You don't need the "require wmflib", and in fact I'd make a point of leaving it out, to be consistent with the way we use stdlib." [operations/puppet] - https://gerrit.wikimedia.org/r/139921 (owner: 20after4)
[15:21:57] It would be nice to get gdash setup for beta. I got as far as setting up a graphite host and then never came back to adding other tools.
[15:22:20] bd808: ah okay! I'll take a look at how do setup trebuchet
[15:22:34] ye graphite is fairly easy in labs too (thanks ori!)
[15:22:46] graphite in labs was bd808's work too :)
[15:23:02] (CR) Giuseppe Lavagetto: [C: 2] mirror: lint [operations/puppet] - https://gerrit.wikimedia.org/r/139681 (owner: Matanya)
[15:23:10] ah, thanks bd808 :)
[15:23:28] (CR) Ori.livneh: "if this gets merged i'll add a role for labs" [operations/puppet] - https://gerrit.wikimedia.org/r/133274 (owner: Ori.livneh)
[15:25:30] (PS1) Faidon Liambotis: Kill $project-lb.wikimedia.org and move to text-lb [operations/dns] - https://gerrit.wikimedia.org/r/140136
[15:26:04] (CR) Faidon Liambotis: [C: -2] "Do not merge without an explicit ack."
[operations/dns] - 10https://gerrit.wikimedia.org/r/140136 (owner: 10Faidon Liambotis) [15:27:17] (03CR) 10Giuseppe Lavagetto: [C: 032] salt: lint [operations/puppet] - 10https://gerrit.wikimedia.org/r/139684 (owner: 10Matanya) [15:27:54] paravoid: if in the near future we wanted test.wikipedia.org routed to a trusty/hhvm machine, would that change your patch? [15:28:06] ? [15:28:06] paravoid: or would we do that at the varnish level? [15:28:32] it wouldn't change my patch [15:28:34] paravoid: i'm just looking ahead and thinking past the job runner deployment [15:28:39] nod [15:28:48] <_joe_> jobrunner? [15:28:59] <_joe_> we're getting rid of it? [15:29:03] no no [15:29:10] hhvm is getting deployed to a jobrunner first [15:29:15] <_joe_> oh you got me for a second. [15:29:24] but eventually we'll expand it to user-serving roles [15:29:28] <_joe_> (referring to job-loop.sh) [15:32:38] godog: I found a "simple" trebuchet setup that I'd forgotten about in the logstash project -- https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000320.eqiad.wmflabs [15:32:59] paravoid: pong [15:33:04] The salt master and deploy server are setup on the same host there [15:33:06] bblack: heya [15:33:12] did I break icinga? :) [15:33:15] I saw some commits re: the SPF RR [15:33:26] that it's deprecated or something? [15:33:29] yes [15:33:34] ok, we still use it :) [15:33:47] * manybubbles puts down the conch [15:33:55] bd808: oh okay, I can take a look at that too, thanks! [15:34:11] paravoid: well, basically, the position of gdnsd at this point is that there's no valid reason to use it. If we have a reason, it would be good to figure out whether it's valid. 
[15:34:26] no, I was asking if I should replace it with TXT [15:34:42] (which is based on the position of the relevant RFCs that the explicit SPF RR-type was a bad experiment, has never seen widespread adoption or requirement, and is actively harmful) [15:35:10] paravoid: hopefully you already have a TXT copy, or else everything has been broken for a long time [15:35:38] yes we do [15:35:48] so, just remove the SPF copy [15:37:05] I suspect this is something that's going to need more explaining in release notes and whatnot. People find it alarming. [15:37:37] (03PS2) 10Giuseppe Lavagetto: rt: retab [operations/puppet] - 10https://gerrit.wikimedia.org/r/139683 (owner: 10Matanya) [15:40:09] (03CR) 10Giuseppe Lavagetto: [C: 032] rt: retab [operations/puppet] - 10https://gerrit.wikimedia.org/r/139683 (owner: 10Matanya) [15:43:20] (03CR) 10Alexandros Kosiaris: [C: 032] Remove spetrea from icinga's analytics contact group [operations/puppet] - 10https://gerrit.wikimedia.org/r/140016 (owner: 10QChris) [15:49:02] (03CR) 10Alexandros Kosiaris: [C: 032] Delete webserver::apache::module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140067 (owner: 10Ori.livneh) [15:50:48] (03CR) 10BBlack: [C: 04-1] "I'd suggest splitting this into two separate changes: first the switch to everything using text-lb, then later the removal of the $project" [operations/dns] - 10https://gerrit.wikimedia.org/r/140136 (owner: 10Faidon Liambotis) [15:51:19] (03CR) 10Alexandros Kosiaris: [C: 032] delete apache::vhost::proxy and apache::vhost::redirect [operations/puppet] - 10https://gerrit.wikimedia.org/r/140054 (owner: 10Ori.livneh) [15:57:45] akosiaris: \o/ thanks! [15:58:19] akosiaris: i just emailed the ops list about deployment of apache-config, too -- would love to get your thoughts on that [16:03:05] <_joe_> I'm too tired to stay around - don't hesitate to contact me if hell freezes because of puppet 3 [16:03:48] _joe_: rest well! 
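The SPF cleanup discussed above (keep the TXT copy, drop the explicit SPF-type copy) can be sketched as a small check. This is only an illustration under assumptions: the `(name, rrtype, rdata)` tuple shape and the function name `drop_redundant_spf` are hypothetical and are not gdnsd's zone format or any tool mentioned in the channel. An SPF-type record is removed only when an identical TXT copy exists; anything without a TXT twin is kept and flagged instead of being dropped silently.

```python
def drop_redundant_spf(records):
    """Split out SPF-type records that already have an identical TXT copy.

    `records` is a list of (name, rrtype, rdata) tuples. That shape is an
    assumption made for this sketch, not a real zone-file parser.
    Returns (kept_records, spf_without_txt).
    """
    # Every (name, rdata) pair that is already published as TXT.
    txt_copies = {(name, rdata) for name, rrtype, rdata in records if rrtype == "TXT"}

    kept = []
    spf_without_txt = []
    for name, rrtype, rdata in records:
        if rrtype == "SPF":
            if (name, rdata) in txt_copies:
                # A TXT twin exists, so the SPF-type copy is redundant.
                continue
            # No TXT twin: keep the record and flag it for a human to fix.
            spf_without_txt.append((name, rdata))
        kept.append((name, rrtype, rdata))
    return kept, spf_without_txt


# Example zone data using documentation-reserved names and addresses.
zone = [
    ("example.org.", "TXT", "v=spf1 mx -all"),
    ("example.org.", "SPF", "v=spf1 mx -all"),
    ("mail.example.org.", "SPF", "v=spf1 a -all"),
    ("example.org.", "A", "192.0.2.1"),
]
kept, flagged = drop_redundant_spf(zone)
```

Flagging instead of deleting matches the caution in the conversation: removing the SPF copy is only safe if the TXT copy is already there.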
[16:06:13] (03PS3) 10Odder: Meta: automatic translation workflow state changes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/137804 (owner: 10Awight) [16:06:24] (03CR) 10Odder: Meta: automatic translation workflow state changes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/137804 (owner: 10Awight) [16:10:05] anyone around here well-versed in debugging missing ganglia data from new hosts? [16:10:34] I'm trying to track down the flow from the host's gmond to the aggregator to nickel, and the ganglia/ganglia_new puppet defs involved, etc. It's rather confusing [16:16:33] (03CR) 10Milimetric: Add backup role and scripts (0313 comments) [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: 10Milimetric) [16:16:37] (03PS4) 10Milimetric: Add backup role and scripts [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) [16:17:06] PROBLEM - Puppet freshness on polonium is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 13:16:47 UTC [16:19:59] ah nevermind, there was no misconfiguration. gmond's initscript is just stupid. when puppet initially configures it and tells it to restart, it says "ok", but it totally doesn't restart [16:20:03] manual kill + start fixes [16:20:07] not sure why [16:21:24] (03CR) 10BBlack: [C: 031] Switch Central/South Asia to esams [operations/dns] - 10https://gerrit.wikimedia.org/r/80973 (owner: 10Faidon Liambotis) [16:22:02] (03CR) 10BBlack: [C: 031] "But note, I plan to reboot lvs400x sometime today, may as well wait until after that." 
[operations/dns] - 10https://gerrit.wikimedia.org/r/140064 (owner: 10Faidon Liambotis) [16:29:56] bblack: yeah [16:32:34] !log enabled amssq31-46 esams text frontend varnishes in pybal (were misconfigured; wrong domainname) [16:32:39] Logged the message, Master [16:32:56] ^ funny how things like that become apparent when you can actually see the stats [16:33:48] (03PS1) 10Ori.livneh: apache::vhost: retab [operations/puppet] - 10https://gerrit.wikimedia.org/r/140147 [16:33:50] (03PS1) 10Ori.livneh: apache::vhost: get rid of logroot param [operations/puppet] - 10https://gerrit.wikimedia.org/r/140148 [16:34:34] (03PS2) 10Faidon Liambotis: Kill $project-lb.wikimedia.org IPs [operations/dns] - 10https://gerrit.wikimedia.org/r/140136 [16:34:36] (03PS1) 10Faidon Liambotis: Kill $project-lb.$site.wikimedia.org and free IPs [operations/dns] - 10https://gerrit.wikimedia.org/r/140149 [16:34:40] bblack: that was a good point, thanks. [16:34:50] the changes are actually completely separate from each other [16:35:28] mark: if you're up for some tech work... ;) [16:35:57] yes boss [16:37:55] paravoid: actually there's two layers of CNAME redirection in there, so it should be 3 patches :) [16:39:21] e.g. you can't remove foundation-lb.wikimedia.org until $TTL (+whatever) after removing donate 1H IN CNAME foundation-lb.wikimedia.org. [16:40:30] (03CR) 10Milimetric: "I did not see your comments on my earlier patch, I'll submit another patch after going through those, so please disregard patch 4." [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: 10Milimetric) [16:40:49] * bblack digs up his CNAMEs are evil rant... [16:41:15] (03CR) 10Milimetric: "> Meta question here... weren't we going to use the rsync module?" 
[operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: 10Milimetric) [16:47:23] (03CR) 10QChris: "> I'll submit another patch after [...]" (031 comment) [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: 10Milimetric) [16:51:46] (03PS5) 10Milimetric: Add backup role and scripts [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) [17:11:01] (03PS2) 10Milimetric: Enable the new backup role if set [operations/puppet] - 10https://gerrit.wikimedia.org/r/139558 (https://bugzilla.wikimedia.org/66119) [17:23:49] (03PS2) 10Ori.livneh: Delete webserver::apache::module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140067 [17:23:55] (03CR) 10Ori.livneh: [C: 032 V: 032] Delete webserver::apache::module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140067 (owner: 10Ori.livneh) [17:27:06] PROBLEM - Puppet freshness on holmium is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 08:24:47 UTC [17:40:14] do we have any opsen with experience tuning java GC stuff? [17:40:38] chasemp, _joe_: any experience with that? [17:41:09] I believe _joe_ has spent time with it, and godog even? [17:41:12] me not really [17:41:23] I consider java and leprosy to be equivalent [17:41:26] (03PS1) 10Ori.livneh: resolve apache package conflict [operations/puppet] - 10https://gerrit.wikimedia.org/r/140165 [17:42:08] ottomata: can i tap you for a +1 on that ^^ ? need it to fix puppet on several hosts following https://gerrit.wikimedia.org/r/#/c/140067/ [17:43:58] most of it is interim stuff to ease migration from webserver::$foo to the new apache module [17:45:17] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 5 below the confidence bounds [17:45:43] ensure_packages!??!?! [17:45:55] it's in stdlib. and it's gross, yes. 
but interim. [17:46:08] it won't live on longer than say 48 hrs [17:46:14] I did not know that existed! [17:46:17] no its ok with me [17:46:21] better than if !defined ... crap [17:46:30] well, if !defined doesn't really work [17:46:32] so it's voodoo [17:46:41] it does if everybody does it :p [17:46:44] but ja [17:46:49] (03PS1) 10Manybubbles: Setup pool counter for regex search [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/140166 [17:48:00] Reedy: do we know what these exceptions are? https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal|exception&gtype=stack&glegend=show&aggregate=1&embed=1 [17:49:27] AaronSchulz: bunch of "ApiMobileView::getParserOutput: PoolCounter didn't return parser output", and "Error: 1213 Deadlock found when trying to get lock; try restarting transaction (10.64.16.29)" in Function: Title::invalidateCache [17:50:17] ori this change looks a little risky, no? hm [17:50:23] i guess itwon't break things as is [17:50:24] hm, [17:50:35] like, removing apache packages from webserver::apache class [17:50:56] means that anyone currently using webserver::apache won't have those packages installed.....well, if they are using it on a new node.. [17:50:57] the worst case scenario, if i got it badly wrong, is that the packages would not get provisioned on a new host [17:50:59] i geuss its fine now [17:51:00] but they're already are [17:51:02] yeah [17:51:10] and this is temporary like this? [17:51:11] and i'll catch it during provisioning [17:51:13] yep [17:51:18] odd [17:51:25] how temporary? [17:51:43] this will change by EOD today [17:53:01] osmium arwiki: Could not unserialize ParsoidCacheUpdateJobOnDependencyChange job. [17:53:04] ottomata: would you prefer that i add ensure_packages to webserver::apache as well? [17:53:07] why are there piles of those again? 
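The "if !defined doesn't really work" / "it does if everybody does it" exchange above is about evaluation order, and a toy model may make it easier to follow. The sketch below is Python, not Puppet, and every name in it (`Catalog`, `declare`, `declare_unless_defined`, `compiles`) is made up for illustration. It assumes a plain resource declaration fails on a duplicate while a `defined()`-style guard only sees what was evaluated before it. It does not model stdlib's `ensure_packages`.

```python
class DuplicateDeclaration(Exception):
    """Raised when the same resource title is declared twice."""


class Catalog:
    """Toy stand-in for a catalog under compilation (not Puppet's real API)."""

    def __init__(self):
        self.resources = set()

    def declare(self, title):
        # Models a plain declaration: a second one is a compile error.
        if title in self.resources:
            raise DuplicateDeclaration(title)
        self.resources.add(title)

    def declare_unless_defined(self, title):
        # Models a guard of the "if !defined(...)" kind: it can only see
        # resources that were already evaluated at this point.
        if title not in self.resources:
            self.declare(title)


def compiles(steps):
    """Evaluate (method_name, title) steps in order; True if no duplicate."""
    catalog = Catalog()
    try:
        for method_name, title in steps:
            getattr(catalog, method_name)(title)
    except DuplicateDeclaration:
        return False
    return True


# Plain declaration first, guarded one second: the guard sees it and skips.
plain_then_guarded = compiles(
    [("declare", "apache2"), ("declare_unless_defined", "apache2")]
)
# Guarded first, plain second: the guard declares, the plain one collides.
guarded_then_plain = compiles(
    [("declare_unless_defined", "apache2"), ("declare", "apache2")]
)
# Every declaration guarded: order no longer matters.
all_guarded = compiles(
    [("declare_unless_defined", "apache2"), ("declare_unless_defined", "apache2")]
)
```

In this model the guard is only safe when every declaration of the resource uses it, which is one reading of "it does if everybody does it".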
[17:53:11] HMmmm [17:53:15] oh, by EOD today is fine i'm sure [17:53:16] ok [17:53:37] AaronSchulz: i ran jobs on osmium last night since time fixed the parser->parseroutput bug [17:53:53] (03CR) 10Ottomata: [C: 031] "Since this is transitional and the package weirdness will be fixed asap, I think its ok." [operations/puppet] - 10https://gerrit.wikimedia.org/r/140165 (owner: 10Ori.livneh) [17:54:08] (03CR) 10Ori.livneh: [C: 032] resolve apache package conflict [operations/puppet] - 10https://gerrit.wikimedia.org/r/140165 (owner: 10Ori.livneh) [17:54:52] * AaronSchulz doesn't see many errors Title::invalidateCache [17:57:01] Nikerabbit: is anyone fixing those "Did you call Parser::parse recursively?" errors? [17:58:31] anomie: lots of this error to: http://pastebin.com/Drzsq0KL [18:00:04] AaronSchulz: not assigned on current sprint [18:00:05] Reedy, greg-g: The time is nigh to deploy MediaWiki train (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140617T1800) [18:00:12] godog: yt [18:00:12] ? [18:00:52] (03PS2) 10Reedy: Non Wikipedias to 1.24wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/140100 [18:01:02] AaronSchulz: do you want me to ask time for it? [18:01:47] * anomie looks [18:02:00] (03CR) 10Reedy: [C: 032] Non Wikipedias to 1.24wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/140100 (owner: 10Reedy) [18:02:06] (03Merged) 10jenkins-bot: Non Wikipedias to 1.24wmf9 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/140100 (owner: 10Reedy) [18:02:23] Nikerabbit: it would be nice since it still spams the logs [18:02:44] (03PS1) 10Ori.livneh: dirty hack to fix service['httpd'] conflict [operations/puppet] - 10https://gerrit.wikimedia.org/r/140168 [18:03:28] ottomata: very sorry. one more, and it's a gross one. you have my commitment to resolve it by eod as an alibi (documented in commit message and code). 
[18:04:06] (03CR) 10Chad: [C: 032] Setup pool counter for regex search [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/140166 (owner: 10Manybubbles) [18:04:13] (03Merged) 10jenkins-bot: Setup pool counter for regex search [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/140166 (owner: 10Manybubbles) [18:04:18] AaronSchulz: At first glance, that looks like GIGO as far as Scributo goes. I'll try to figure out where the garbage is coming in from. [18:04:25] ottomata: note that the apache module is currently only used on tin and mediawiki_singlenode on labs so it's less risky than it looks [18:04:44] hm, ok [18:05:03] it's awful, but roll with me and you'll see [18:05:03] (03CR) 10Ottomata: [C: 031] "Hmm, Ok....." [operations/puppet] - 10https://gerrit.wikimedia.org/r/140168 (owner: 10Ori.livneh) [18:05:05] Bleugh. Back in 10-15 [18:05:14] !log demon Synchronized wmf-config/PoolCounterSettings-eqiad.php: Limit regex searches before they start landing on wikis (duration: 00m 04s) [18:05:18] Logged the message, Master [18:05:29] (03CR) 10Ori.livneh: [C: 032] dirty hack to fix service['httpd'] conflict [operations/puppet] - 10https://gerrit.wikimedia.org/r/140168 (owner: 10Ori.livneh) [18:06:07] anomie: if it's just bad user input, it should probably avoid making an exception though [18:06:27] AaronSchulz: No, not user input. [18:09:16] (03PS2) 10Ori.livneh: apache::vhost: retab [operations/puppet] - 10https://gerrit.wikimedia.org/r/140147 [18:09:34] (03CR) 10Ori.livneh: [C: 032 V: 032] apache::vhost: retab [operations/puppet] - 10https://gerrit.wikimedia.org/r/140147 (owner: 10Ori.livneh) [18:15:38] AaronSchulz: Found it. https://gerrit.wikimedia.org/r/#/c/78016/ The problem is that testwiki has aliases for namespaces 100, 101, 102, and 103, but doesn't have those namespaces actually *defined*. [18:16:28] ouch. 
[18:17:16] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected [18:22:45] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf9 [18:22:50] Logged the message, Master [18:23:49] !log reedy Synchronized docroot and w: (no message) (duration: 00m 16s) [18:23:54] Logged the message, Master [18:24:50] AaronSchulz: https://gerrit.wikimedia.org/r/140169 [18:25:46] PROBLEM - gdash.wikimedia.org on tungsten is CRITICAL: Connection refused [18:25:46] PROBLEM - DPKG on tungsten is CRITICAL: DPKG CRITICAL dpkg reports broken packages [18:25:53] * YuviPanda gives chasemp some leprosy [18:26:06] PROBLEM - graphite.wikimedia.org on tungsten is CRITICAL: Connection refused [18:26:07] oh man, that backscroll was from a while ago [18:26:16] (03CR) 10Anomie: "... Were you intending to create aliases for namespaces that don't actually exist?" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/78016 (https://bugzilla.wikimedia.org/52528) (owner: 10TTO) [18:27:46] RECOVERY - gdash.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 8758 bytes in 0.022 second response time [18:27:47] RECOVERY - DPKG on tungsten is OK: All packages OK [18:28:06] RECOVERY - graphite.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 1607 bytes in 0.021 second response time [18:31:17] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:33:06] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 53310 bytes in 0.424 second response time [18:39:16] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:40:06] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 53290 bytes in 0.594 second response time [18:44:21] (03PS1) 10Milimetric: Stop relying on limited redis module [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/140173 
(https://bugzilla.wikimedia.org/63664) [18:47:16] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:15] akosiaris: yt? [18:48:32] (03PS2) 10Milimetric: Stop relying on limited redis module [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/140173 (https://bugzilla.wikimedia.org/63664) [18:54:06] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 53291 bytes in 0.317 second response time [18:58:04] (03CR) 10Ottomata: [C: 032] "This will fix the problem, but basically means that redis won't be managed by puppet at all." [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/140173 (https://bugzilla.wikimedia.org/63664) (owner: 10Milimetric) [18:59:11] (03CR) 10Milimetric: "Yeah, I think so, we can fix it in any better ways once we think of them." [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/140173 (https://bugzilla.wikimedia.org/63664) (owner: 10Milimetric) [19:01:51] (03PS1) 10Ottomata: Update wikimetrics module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140174 [19:02:06] (03CR) 10Ottomata: [C: 032 V: 032] Update wikimetrics module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140174 (owner: 10Ottomata) [19:18:06] PROBLEM - Puppet freshness on polonium is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 13:16:47 UTC [19:18:18] what's the RT tracker mainly used for? I created some initial settings for wm2015 but I noticed previous wikimanias used RT to push patches to operations/dns. I don't know if I should use git to push the patches instead or file something to RT [19:18:52] It'll end up in Git anyway, there's no other way. [19:19:00] Well, Gerrit, I should say. [19:20:52] I thought RT might be used for nonpublic stuff, but I guess I'll use git to submit a patch. Shouldn't be harder than copypasting a few lines from previous wikimania right? 
[19:21:07] Withoutaname: RT is just used for tracking by the operation's team. You can file a ticket with all the patches linked if you want ops to recognised it quicker. [19:21:36] RT is not really non-public stuff always. Just operations may have non-public stuff :p [19:22:12] I couldn't login so I assumed it's "private" [19:22:44] Users need to be NDA'd and stuff as private stuff can come along. [19:23:30] RT stands for Request Tracker, so it's basically Bugzilla which you can't access through the web [19:23:57] yeah figures [19:24:08] You can send an e-mail to ops-requests@rt.wikimedia.org and it'll automagically create a new ticket for you and the ops team will be able to track it there [19:24:27] Feel free to link all the patches you create & stuff. [19:24:29] twkozlowski: Technically non-authorised users can't access through the web :p [19:24:51] But that's what I said. [19:25:11] you = Withoutaname [19:25:36] I guess [19:25:49] greg-g: btw poked robla? [19:26:17] regarding? [19:26:31] NDA, no :) [19:26:33] but now yes [19:26:33] robla: NDA [19:26:36] * greg-g forwards email [19:26:37] :p [19:26:48] greg-g: this is one for qgil [19:27:16] greg-g: hey... you here? [19:27:31] yeah [19:28:10] andrewbogott: paravoid: Did the CVN changes impact NFS positively? Any graph I can look at? [19:28:12] greg-g: I need to deploy a JS fix for wikidata today, can we do that soonish? [19:28:21] Js-Only [19:28:26] jquery ui issues [19:28:35] https://gerrit.wikimedia.org/r/140180 is the change [19:28:40] issue* [19:28:52] Krinkle: I'm pretty sure that change is still in progress. [19:29:13] andrewbogott: Well, the person doing the changes thinks he's done. [19:29:15] That would be me. [19:29:25] Oh, sorry, I must not understand the question [19:29:59] hoo: just to wmf8 or also 9? [19:30:03] The dozens of CVNBot in the 'cvn' labs project were writing and reading a 20+ MB sqlite file very freqiuently from NFS. 
[19:30:11] greg-g: Both, as always [19:30:36] andrewbogott: I changed it per Coren and paravoid's request to use the local disk instead. There was no reason why it was just NFS, I dind't even know /data/project used NFS. [19:30:57] I heard claims it was occupying like 80% of all I/O for labs as a whole [19:31:02] greg-g: Well actually, I guess we only need 9 [19:31:12] that change is against 8 [19:31:22] greg-g: We use one branch for two MW versions [19:31:29] so our wmf8 is your 8 and 9 [19:31:44] aha [19:31:48] right [19:32:24] hoo: kk [19:32:31] Reedy: ready for hoo to do a js update? [19:34:59] (03PS1) 10Ori.livneh: Take another step in the direction of Apache consolidation [operations/puppet] - 10https://gerrit.wikimedia.org/r/140185 [19:35:03] ottomata: ^ [19:36:58] ori, just curious, why both the dummy apache services? [19:37:17] oh, i see [19:37:23] hm, real service coming in new patch [19:37:25] because there are references to Service['httpd'] and Service['apache2'] in the code. if i consolidate them in a single patch, it's risky [19:37:26] (03PS1) 10Withoutaname: DNS settings for wikimania2015wiki [operations/dns] - 10https://gerrit.wikimedia.org/r/140186 (https://bugzilla.wikimedia.org/66370) [19:37:26] right [19:37:41] ok [19:37:46] (03CR) 10Ottomata: [C: 031] Take another step in the direction of Apache consolidation [operations/puppet] - 10https://gerrit.wikimedia.org/r/140185 (owner: 10Ori.livneh) [19:37:57] there we go [19:38:02] (03CR) 10Ori.livneh: [C: 032] Take another step in the direction of Apache consolidation [operations/puppet] - 10https://gerrit.wikimedia.org/r/140185 (owner: 10Ori.livneh) [19:40:33] DNS and apache changes have been submitted for wm2015wiki [19:44:30] John Lewis: are you part of operations? [19:45:03] s/John Lewis/JohnLewis [19:45:04] He's not [19:45:14] Withoutaname: What hoo said p [19:47:46] Of couse he is. [19:47:56] He submitted stuff to that repo, didn't he. 
[19:48:12] The question was just wrong :-P [19:48:18] :p [19:48:23] +1 twkozlowski [19:48:27] <_joe_> ori: or, use a single patch, and use the compiler to verify nothing fails [19:48:41] (03PS1) 10Ori.livneh: Consolidate httpd/apache2 Service and Package [operations/puppet] - 10https://gerrit.wikimedia.org/r/140191 [19:48:50] _joe_: so what's up with RT #7701? [19:49:18] hoo: I haven't heard from Reedy, give it another 10 minutes and you should be good to go [19:49:22] _joe_: i keep forgetting about that! [19:49:34] i'll do that for the next patch. [19:49:41] ottomata: https://gerrit.wikimedia.org/r/#/c/140191/ is very easy to reason about [19:49:51] greg-g: Ok, that's ok [19:50:07] ottomata: last patch applied correctly btw [19:50:09] <_joe_> twkozlowski: it's in the RT system, someone (typically, the RT onduty person), will take care of it [19:50:53] I see James is on it, incidentally [19:50:57] ori, just so i understand where we are going: [19:51:13] are you intending that sites-enabled files can be managed both by your apache module and manually? [19:51:21] ottomata: yep [19:51:24] aye ok then [19:52:06] twkozlowski: poke springle_ :D [19:52:13] (03CR) 10Ottomata: [C: 031] Consolidate httpd/apache2 Service and Package [operations/puppet] - 10https://gerrit.wikimedia.org/r/140191 (owner: 10Ori.livneh) [19:52:27] JohnLewis: But I have nothing to poke him about :-) [19:52:43] Oh :( [19:53:14] <_joe_> twkozlowski: also, I think he's in deep sleep mode now [19:53:39] (03CR) 10Ori.livneh: [C: 032] Consolidate httpd/apache2 Service and Package [operations/puppet] - 10https://gerrit.wikimedia.org/r/140191 (owner: 10Ori.livneh) [19:54:00] _joe_: on a related subject, what do you need to be able to close https://bugzilla.wikimedia.org/show_bug.cgi?id=66003 ? [19:54:33] Should I have Pharos (the list admin) comment there? [19:54:52] <_joe_> twkozlowski: just time to do it [19:55:07] Cool. Thanks! 
[19:55:08] <_joe_> today I was completely absorbed by a pretty big change I brought online [19:55:23] ottomata: that one applied correctly too [19:55:36] RECOVERY - Puppet freshness on holmium is OK: puppet ran at Tue Jun 17 19:55:27 UTC 2014 [19:55:56] greg-g: Am I ok to +2 my stuff and deploy when jenkins is done (that's about 3-5 min) [19:56:01] Reedy: ^ [19:56:57] yeah [19:58:11] greg-g: We should amend the deploy policy to also ban deploys just before World Cup games :D [19:58:18] hoo, that's my window in 3 minutes:) [19:58:40] MaxSem: Are you ok with me pushing a small JavaScript fix for Wikidata? [19:58:46] yep [19:58:49] <_joe_> hoo: and what about NBA/NHL finals? [19:59:10] _joe_, we're an international org, don't care [19:59:11] _joe_: I don't watch that, thus I don't care :D If others do, maybe [19:59:26] NBA finals? Aren't those over with? [19:59:33] bblack, ready when you are [19:59:35] <_joe_> bblack: they are [19:59:51] yeah and Texas won :) [19:59:52] <_joe_> MaxSem: international too, still I prefer basket over soccer [19:59:54] hoo: ze Germans aren't playing today, are they? [20:00:04] bblack, MaxSem: The time is nigh to deploy Redirect tablets to mobile site (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140617T2000) [20:00:07] twkozlowski: No, but I still wanted to watch :P [20:00:09] <_joe_> bblack: San Antonio :) [20:00:15] _joe_: it's called football you Yank! :-P [20:00:25] eh SA is like 2 hours left of me, it's close enough :) [20:00:28] <_joe_> twkozlowski: it's called calcio [20:00:46] <_joe_> bblack: I did not imagine SA to be such a big city [20:00:54] <_joe_> (discovered on wp during the finals) [20:01:31] to be politically incorrect, basically SA is the Mexican capitol city of Texas, whereas Austin is the normal capitol. 
[20:01:50] !log hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix editing site links (duration: 00m 24s) [20:01:55] Logged the message, Master [20:01:57] <_joe_> wow the right place to go eat [20:02:05] hoo: gah! i was an hour off in my head [20:02:08] sorry [20:02:13] MaxSem: sorry [20:02:18] np [20:02:30] MaxSem: do you have a link to what you're pushing for tablet? [20:02:33] sign I need to take a nap: can't tell difference between noon and 1pm [20:03:02] bblack, https://gerrit.wikimedia.org/r/#/c/134119/ [20:03:09] afternoon naps are part of the natural diurnal cycle for mammals. the workforce overlords don't like humans hearing about that, though. [20:03:14] (03PS1) 10BryanDavis: Remove package dependency for apache::mod::proxy [operations/puppet] - 10https://gerrit.wikimedia.org/r/140198 [20:04:09] MaxSem: ready whenever [20:04:24] greg-g: MaxSem: Thanks for letting me sneak in [20:04:27] bblack: +1 [20:04:28] Wikidata works again [20:04:29] bblack, go for it:) [20:04:39] hah you're gonna make me break everything? :P [20:04:54] (03PS2) 10BBlack: Redirect tablets to mobile site, currently scheduled for June 17 [operations/puppet] - 10https://gerrit.wikimedia.org/r/134119 (owner: 10MaxSem) [20:05:20] we're Wikipedia, we're broken by definition [20:06:22] <_joe_> bblack: I tried to break everything today and I failed, it seems. So it's your turn now [20:06:32] (03CR) 10BryanDavis: "Cherry-picked to deployment-salt where it fixed a puppet run error on deployment-logstash1:" [operations/puppet] - 10https://gerrit.wikimedia.org/r/140198 (owner: 10BryanDavis) [20:07:12] (03CR) 10BBlack: [C: 032 V: 032] Redirect tablets to mobile site, currently scheduled for June 17 [operations/puppet] - 10https://gerrit.wikimedia.org/r/134119 (owner: 10MaxSem) [20:07:43] it's rolling, as puppet splays, etc [20:08:08] (03CR) 10BryanDavis: "Error message in above comment should have said "libapache2-mod-proxy" instead of "libapache2-mod-proxy-html". 
The error quoted was from a" [operations/puppet] - 10https://gerrit.wikimedia.org/r/140198 (owner: 10BryanDavis) [20:16:12] <_joe_> redirect is working [20:16:21] <_joe_> at least via esams :) [20:16:52] <_joe_> the mobile site is nice even on the desktop [20:17:42] !log maxsem Synchronized php-1.24wmf8/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/140178/ (duration: 00m 05s) [20:17:46] Logged the message, Master [20:19:34] !log maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/140178/ (duration: 00m 04s) [20:19:39] Logged the message, Master [20:21:17] it's about 40% rolled-out across the global text caches so far [20:23:06] PROBLEM - Puppet freshness on netmon1001 is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 17:22:04 UTC [20:23:11] (03PS1) 10Ottomata: Now building against and running with openjdk-7 [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140206 [20:23:13] (03PS1) 10Ottomata: Add commented out recommended GC settings if using Java 7 u51 or greater [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140207 [20:23:15] (03PS1) 10Ottomata: Fix for KAFKA_MIRROR_START variable in kafka-mirror.init script [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140208 [20:23:17] (03PS1) 10Ottomata: Not starting kafka and kafka-mirror during postinstall [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140209 [20:24:04] (03PS2) 10Withoutaname: DNS settings for wikimania 2015 wiki [operations/dns] - 10https://gerrit.wikimedia.org/r/140186 (https://bugzilla.wikimedia.org/66370) [20:24:13] (03PS2) 10Withoutaname: Apache settings for wikimania 2015 wiki [operations/apache-config] - 10https://gerrit.wikimedia.org/r/139288 (https://bugzilla.wikimedia.org/66370) [20:24:33] (03PS2) 10Ottomata: Now building against and running with openjdk-7 [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140206 [20:24:42] 
(03PS2) 10Ottomata: Add commented out recommended GC settings if using Java 7 u51 or greater [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140207 [20:24:51] (03PS2) 10Ottomata: Fix for KAFKA_MIRROR_START variable in kafka-mirror.init script [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140208 [20:24:55] (03PS2) 10Ottomata: Not starting kafka and kafka-mirror during postinstall [operations/debs/kafka] (debian) - 10https://gerrit.wikimedia.org/r/140209 [20:27:13] (03PS1) 10Ori.livneh: Kill apache_module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140211 [20:27:48] (03CR) 10Ori.livneh: [C: 031] Remove package dependency for apache::mod::proxy [operations/puppet] - 10https://gerrit.wikimedia.org/r/140198 (owner: 10BryanDavis) [20:29:51] (03CR) 10jenkins-bot: [V: 04-1] Kill apache_module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140211 (owner: 10Ori.livneh) [20:31:08] (03PS2) 10Ori.livneh: Kill apache_module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140211 [20:31:50] (03PS2) 10Withoutaname: Initialize some settings for wikimania2015wiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/139279 (https://bugzilla.wikimedia.org/66370) [20:32:12] (03CR) 10jenkins-bot: [V: 04-1] Kill apache_module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140211 (owner: 10Ori.livneh) [20:32:19] (03PS3) 10Withoutaname: Initialize some settings for wikimania 2015 wiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/139279 (https://bugzilla.wikimedia.org/66370) [20:32:49] (03PS3) 10Ori.livneh: Kill apache_module [operations/puppet] - 10https://gerrit.wikimedia.org/r/140211 [20:35:07] (03CR) 10Ori.livneh: "Compiled for antimony and ytterbium: http://puppet-compiler.wmflabs.org/92/change/140211/html/" [operations/puppet] - 10https://gerrit.wikimedia.org/r/140211 (owner: 10Ori.livneh) [20:35:10] ottomata: ^ [20:35:30] i think there's probably only a couple more [20:35:55] 
ottomata: there's also bd808's change which unfortunately i can't +2 since it's not mine: https://gerrit.wikimedia.org/r/140198
[20:37:39] MaxSem: the tablet change should be live for all caches now.
[20:37:47] whee, thanks!
[20:37:48] performance impact so far is pretty negligible
[20:38:20] yep, while I see a slight traffic increase, it's barely noticeable
[20:41:36] congrats, MaxSem :)
[20:41:58] * MaxSem grabs popcorn and heads to VPT
[20:42:09] ottomata: yt?
[20:42:23] ja, hang on
[20:42:31] trying to rebase something...
[20:42:37] (PS2) Ottomata: Lint fixes for misc/statistics.pp [operations/puppet] - https://gerrit.wikimedia.org/r/139820
[20:45:09] (CR) Ottomata: Lint fixes for misc/statistics.pp (28 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/139820 (owner: Ottomata)
[20:45:52] (PS2) Ottomata: Remove package dependency for apache::mod::proxy [operations/puppet] - https://gerrit.wikimedia.org/r/140198 (owner: BryanDavis)
[20:46:01] (CR) Ottomata: [C: 2 V: 2] Remove package dependency for apache::mod::proxy [operations/puppet] - https://gerrit.wikimedia.org/r/140198 (owner: BryanDavis)
[20:49:55] ori, what about the Install_certificate for smokeping dependency?
[20:49:59] https://gerrit.wikimedia.org/r/#/c/140211/3/manifests/role/smokeping.pp
[20:50:24] ottomata: the whole line is commented out
[20:51:02] Oh
[20:51:03] there's no harm in including apache::mod::ssl tho, but i can amend to just not touch that line
[20:51:04] true true
[20:51:09] nono, that's fine
[20:51:13] somehow didn't notice that comment
[20:51:24] (CR) Ottomata: [C: 1] Kill apache_module [operations/puppet] - https://gerrit.wikimedia.org/r/140211 (owner: Ori.livneh)
[20:53:05] (CR) Ottomata: "FYI, waiting for new elastic servers to test this on before merging."
[operations/puppet] - https://gerrit.wikimedia.org/r/138012 (owner: Ottomata)
[20:53:30] (PS2) Ottomata: udp2log: lint [operations/puppet] - https://gerrit.wikimedia.org/r/139826 (owner: Matanya)
[20:53:39] (CR) Ottomata: [V: 2] udp2log: lint [operations/puppet] - https://gerrit.wikimedia.org/r/139826 (owner: Matanya)
[20:53:43] thanks
[20:54:10] yup!
[20:54:17] running puppet now on oxygen, just to be sure
[20:55:44] looks good
[20:55:49] thanks matanya
[20:55:59] :)
[20:59:34] ottomata: were you still looking?
[20:59:36] or just forgot to +1?
[20:59:42] 'sokay if the former :)
[21:00:04] bsitu: The time is nigh to deploy Flow (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140617T2100)
[21:00:05] oh, thought I did
[21:00:17] it's +1ed
[21:00:19] https://gerrit.wikimedia.org/r/#/c/140211/
[21:00:22] was there another?
[21:00:22] oh yeah you did
[21:00:23] weird
[21:00:24] sorry
[21:00:31] np
[21:01:38] (PS1) Ori.livneh: kill apache_site (wip) [operations/puppet] - https://gerrit.wikimedia.org/r/140218
[21:04:07] (PS4) Ori.livneh: Kill apache_module [operations/puppet] - https://gerrit.wikimedia.org/r/140211
[21:04:17] (CR) Ori.livneh: [C: 2 V: 2] Kill apache_module [operations/puppet] - https://gerrit.wikimedia.org/r/140211 (owner: Ori.livneh)
[21:04:37] (CR) QChris: Add backup role and scripts (1 comment) [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: Milimetric)
[21:06:56] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data exceeded the critical threshold [500.0]
[21:06:58] (CR) Milimetric: Add backup role and scripts (1 comment) [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/139557 (https://bugzilla.wikimedia.org/66119) (owner: Milimetric)
[21:13:24] (PS2) Ori.livneh: kill apache_site [operations/puppet] - https://gerrit.wikimedia.org/r/140218
[21:15:20]
ottomata: ^
[21:15:23] it's almost over, i promise :)
[21:15:29] and that one is a lot less scary once you see the trick
[21:15:34] documented in the commit message
[21:18:22] ori, are the 000-default things removed by new apache module anyway?
[21:18:33] seeing you just took the ensure => absents out of noc.pp
[21:18:42] ottomata: yes, because it sets recurse => true, purge => true for the entire sites-enabled dir
[21:19:09] hmm, puppet lets you do that AND put files in the same dir?
[21:19:23] ottomata: yes, it means recurse/purge any file not managed by puppet
[21:19:26] aye
[21:19:27] hmk
[21:19:31] ottomata: also, note that this will leave files in sites-available, but they won't be doing anything
[21:19:38] ottomata: i will be cleaning those up manually after the patch is applied everywhere
[21:19:47] ottomata: seems better than cluttering the diff with a lot of ensure => absents
[21:20:33] ha!, i just switched over to IRC to ask that!
[21:20:37] you answered my question while I was thinking it!
[21:20:40] agreed.
[21:20:41] :)
[21:20:57] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% data above the threshold [250.0]
[21:22:36] RECOVERY - Puppet freshness on netmon1001 is OK: puppet ran at Tue Jun 17 21:22:34 UTC 2014
[21:22:40] (CR) Ottomata: [C: 1] "LGTM. Ori says he is going to clean up the leftover files in sites-available manually." [operations/puppet] - https://gerrit.wikimedia.org/r/140218 (owner: Ori.livneh)
[21:22:50] ori, i'm heading out pretty soon
[21:23:03] not sure if you want to merge that unless you have another opsen around to help follow up with any quick changes
[21:23:11] bblack, thank you once again - everything seems to be in order
[21:23:23] ottomata: awesome, thanks a bunch!!!!!
i know these were pretty stressful to review
[21:23:40] naw, not stressful at all, this afternoon was a good time for it for me
[21:23:45] MaxSem: np
[21:23:51] i'm wrapping up my kafka experiments, and am now waiting for some reviews from akosiaris
[21:23:57] going through backlog of gerrit stuff anyway
[21:24:27] bblack: if shit hits the fan (it won't), would you be available to sanity-check a fixup?
[21:24:36] ottomata: *nod*
[21:26:48] (PS1) Ori.livneh: kill apache_confd [operations/puppet] - https://gerrit.wikimedia.org/r/140227
[21:27:03] ottomata: actually, before you run, just because it'd take some time to explain the backlog to someone else, do you think you could look at that one ^^ ? it's the simplest of the bunch
[21:27:22] ottomata: if not, 'sokay too
[21:28:14] i'm going to merge the apache_site one, worst case scenario if there's no one willing to look at a fix-up i can self-merge or revert
[21:28:36] looking
[21:28:46] (CR) Ori.livneh: [C: 2] kill apache_site [operations/puppet] - https://gerrit.wikimedia.org/r/140218 (owner: Ori.livneh)
[21:29:14] whoa is that really the last of generic-definitions?!
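For readers following the apache_site discussion above (the 21:18 to 21:19 exchange about recurse => true, purge => true on sites-enabled), here is a minimal sketch of that pattern. It is not the actual code from change 140218: the paths and the example.conf site name are assumptions made up for illustration. It only shows why a purged directory can still hold files that Puppet declares elsewhere.

```puppet
# Hypothetical sketch: paths and the "example" site are assumptions,
# not the contents of the real apache module.
file { '/etc/apache2/sites-enabled':
  ensure  => directory,
  recurse => true,  # manage the directory's contents, not just the directory
  purge   => true,  # remove anything inside that Puppet does not manage
}

# A file declared separately inside the same directory counts as
# managed, so the purge above leaves it in place.
file { '/etc/apache2/sites-enabled/example.conf':
  ensure => link,
  target => '/etc/apache2/sites-available/example.conf',
}
```

Under this arrangement a stray 000-default link in sites-enabled would be removed on the next run without an explicit ensure => absent, while the file it pointed to in sites-available stays behind, which matches the manual cleanup ori mentions.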
[21:29:34] (CR) Ottomata: [C: 1] kill apache_confd [operations/puppet] - https://gerrit.wikimedia.org/r/140227 (owner: Ori.livneh)
[21:29:39] ottomata: \o/ yep
[21:33:18] (CR) Ori.livneh: [C: 2 V: 2] kill apache_confd [operations/puppet] - https://gerrit.wikimedia.org/r/140227 (owner: Ori.livneh)
[21:49:03] (PS6) Withoutaname: Delete ve.wikimedia.org and leave redirect [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/131907 (https://bugzilla.wikimedia.org/55737)
[21:51:24] (PS7) Withoutaname: Reduce string URLs to defined constant [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/131914 (https://bugzilla.wikimedia.org/48618)
[21:52:43] (PS3) Withoutaname: Enable Echo on Wikimedia wikis by default [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/139326
[21:52:52] (CR) Matanya: "General note: the ensure => foo some of foo is quoted and some not. Pick one and stick with it, i recommend unquoted. +1 minor nitpick" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/139820 (owner: Ottomata)
[21:56:59] (PS1) Ori.livneh: Un-dummy apache2 service [operations/puppet] - https://gerrit.wikimedia.org/r/140229
[21:58:06] bblack: 3 line diff, reverts a temporary safety precaution from this morning that is no longer needed since all the relevant patches applied correctly. got a moment to review? ^
[21:58:40] with that i'm done with puppet for the day
[21:58:53] and the terror alert can be lowered back to orange
[22:00:06] or jgage?
:)
[22:00:50] (PS1) Kaldari: Turning on Translate Extension for foundationwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/140231
[22:00:51] i don't bite, i promise
[22:07:56] (PS1) QChris: Make dbstore1002 handle s6 analytics queries [operations/dns] - https://gerrit.wikimedia.org/r/140236 (https://bugzilla.wikimedia.org/66068)
[22:17:22] (PS1) Tim Landscheidt: Tools: Install Catalan locale [operations/puppet] - https://gerrit.wikimedia.org/r/140239 (https://bugzilla.wikimedia.org/62269)
[22:19:06] PROBLEM - Puppet freshness on polonium is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 13:16:47 UTC
[22:24:19] (CR) BBlack: [C: 2 V: 2] Un-dummy apache2 service [operations/puppet] - https://gerrit.wikimedia.org/r/140229 (owner: Ori.livneh)
[22:24:32] thanks bblack!
[22:24:38] np
[22:53:52] (PS1) Ori.livneh: Add a lightweight apache::site resource [operations/puppet] - https://gerrit.wikimedia.org/r/140242
[22:54:07] (CR) jenkins-bot: [V: -1] Add a lightweight apache::site resource [operations/puppet] - https://gerrit.wikimedia.org/r/140242 (owner: Ori.livneh)
[22:56:48] (PS1) Ori.livneh: graphite: remove duplicate site config [operations/puppet] - https://gerrit.wikimedia.org/r/140243
[23:00:04] mwalker, ori, MaxSem: The time is nigh to deploy SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140617T2300)
[23:00:17] I can do it today
[23:01:09] kaldari, regarding your configuration change -- I've long been under the impression that we didn't want translate on foundation wiki
[23:02:18] o.O
[23:02:20] https://bugzilla.wikimedia.org/show_bug.cgi?id=44871
[23:03:17] (CR) Legoktm: "This is bug 44871, btw." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/140231 (owner: Kaldari)
[23:05:44] greg-g, tangentially, is this even a swattable change? https://gerrit.wikimedia.org/r/#/c/140231/ ...
I thought enabling extensions on wikis had to be announced beforehand
[23:06:50] mwalker: Why is that?
[23:07:46] re the bug that legoktm pulled up; it's because the workflow has traditionally been translate on meta and copy over
[23:07:54] (PS4) 20after4: Move the ordered_json parser function to a shared module and remove the copy-pasta found in deployment, statsd and gdash modules. [operations/puppet] - https://gerrit.wikimedia.org/r/139921
[23:08:05] ori, can you please comment on https://bugzilla.wikimedia.org/show_bug.cgi?id=66751 ?
[23:08:06] mwalker: that's fine, I just need Special:MyLanguage
[23:08:53] right; but we can't just enable it for that feature only; it comes with lots of other baggage
[23:09:04] (CR) 20after4: "@ori.livneh: ok I removed the require calls and added a readme for wmflib module. This should be ready to commit now?" [operations/puppet] - https://gerrit.wikimedia.org/r/139921 (owner: 20after4)
[23:09:06] * mwalker wonders if we can split that functionality out
[23:09:32] mwalker: Extension:LandingCheck can basically do the same thing
[23:09:49] mwalker: You can skip deploying that in the meantime then
[23:18:27] * Jasper_Deng wonders about the plans to deploy global rename (in terms of exact hours)
[23:22:12] Jasper_Deng: it got delayed, did you see wikitech-ambassadors?
[23:22:40] oh now I see it, from Dan
[23:41:58] (PS1) Gergő Tisza: Enable MediaViewer suvery links on Commons [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/140250
[23:43:06] PROBLEM - Puppet freshness on iodine is CRITICAL: Last successful Puppet run was Tue 17 Jun 2014 20:42:42 UTC
[23:50:04] (PS2) Ori.livneh: graphite: remove duplicate site config [operations/puppet] - https://gerrit.wikimedia.org/r/140243
[23:51:52] (PS1) Ori.livneh: otrs: resolve duplicate def'n of libapache2-mod-perl2 [operations/puppet] - https://gerrit.wikimedia.org/r/140252
[23:55:46] (CR) Ori.livneh: "recheck" [operations/puppet] - https://gerrit.wikimedia.org/r/140242 (owner: Ori.livneh)