[00:04:05] We're seeing a failure to update l10n strings in media viewer
[00:04:10] Since this morning's deploy
[00:04:14] Did something go wrong?
[00:05:24] (PS5) Gergő Tisza: Increase the network performance sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:05:28] (CR) jenkins-bot: [V: -1] Increase the network performance sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:07:30] (PS6) Ori.livneh: Increase the network performance sampling rate [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:07:37] oh, woops
[00:07:42] tgr: didn't mean to step on your change
[00:07:58] mine is half in jest
[00:09:06] ori: np, needs rebase anyway
[00:09:24] tgr: you should restore PS5, but amend it to move the $wgNetworkPerformanceSamplingFactor = $wmgNetworkPerformanceSamplingFactor; statement into the if ( $wmgMediaViewerBeta ) block
[00:09:50] RoanKattouw: you may still like my solution in PS6, tho i'm not sure i'm serious about it :P
[00:11:42] ori, isn't it nicer to handle that in InitializeSettings?
[00:11:55] tgr: yes, ignore my patch
[00:12:49] ori: i meant your amend comment
[00:13:24] (PS6) BBlack: Enhanced X-Analytics header with HTTPS and Proxy information [operations/puppet] - https://gerrit.wikimedia.org/r/119795 (owner: Yurik)
[00:13:42] (CR) BBlack: [C: 2 V: 2] Enhanced X-Analytics header with HTTPS and Proxy information [operations/puppet] - https://gerrit.wikimedia.org/r/119795 (owner: Yurik)
[00:15:48] (PS7) Gergő Tisza: Increase the network performance sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:19:18] (CR) Gergő Tisza: Increase the network performance sampling rate for MediaViewer (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:23:07] AaronSchulz: which patch did you need me to review again?
[00:23:40] well, several ;)
[00:23:43] I think it was https://gerrit.wikimedia.org/r/#/c/117916/
[00:31:31] Notice: Undefined variable: udp in /a/common/wmf-config/InitialiseSettings.php on line 4004
[00:31:50] Did anyone change anything regarding that?
[00:32:06] Reedy: ^
[00:32:24] (importImages.php)
[00:33:19] (PS1) Ori.livneh: Fix typo in I86f5493d0 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119909
[00:33:30] that's erik b.'s change from earlier
[00:33:36] ...
[00:33:51] ori: Shall I approve? can also push out
[00:33:56] If needed/ wanted
[00:33:56] hoo: please
[00:34:07] https://github.com/wikimedia/operations-mediawiki-config/blame/master/wmf-config/InitialiseSettings.php#L4004
[00:34:32] RoanKattouw: ok, now I don't feel so bad: you merged it :P
[00:34:41] Wat?
[00:34:43] https://gerrit.wikimedia.org/r/#/c/118020/1/wmf-config/InitialiseSettings.php '$udp'
[00:34:48] (CR) Hoo man: [C: 2] Fix typo in I86f5493d0 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119909 (owner: Ori.livneh)
[00:34:55] (Merged) jenkins-bot: Fix typo in I86f5493d0 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119909 (owner: Ori.livneh)
[00:35:38] ori: Yeeeeah, what about that?
[00:35:56] should have been 'udp'; there is no '$udp' var
[00:36:06] Aaaah haha
[00:36:07] yikes
[00:36:11] Whoops
[00:36:19] Sorry about that :S
[00:36:20] Also stuff not deployed which has been merged an hour ago
[00:36:25] so now you have to remove the -2 from https://gerrit.wikimedia.org/r/#/c/119269/7 , by the cosmic law of silliness karma
[00:36:29] verifying
[00:36:32] one sec....
[00:38:26] (CR) Catrope: [C: 1] Increase the network performance sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:38:43] !log hoo synchronized wmf-config/ 'Fix typo <> udp, also Icdb5425 and I04e5f7f which weren't synced but look harmless'
[00:38:49] Logged the message, Master
[00:39:19] * hoo slaps RoanKattouw and MarkTraceur for not syncing their stuff out
[00:39:38] message fail :P $udp...
[00:40:04] RoanKattouw: ok, mind if i sync that?
[00:42:10] ori: If greg-g is OK with it
[00:45:07] greg-g: ?
[00:47:22] well, it was scheduled, so I'll go for it
[00:47:23] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Thu 20 Mar 2014 09:46:44 PM UTC
[00:47:41] if it explodes it'll affect eventlogging so i know how to debug
[00:47:45] but it won't explode
[00:47:49] (CR) Ori.livneh: [C: 2] Increase the network performance sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:47:58] (Merged) jenkins-bot: Increase the network performance sampling rate for MediaViewer [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119269 (owner: Gilles)
[00:48:12] !log ori updated /a/common to {{Gerrit|Ib0eb802c4}}: Fix typo in I86f5493d0
[00:48:18] Logged the message, Master
[00:49:21] !log ori synchronized wmf-config 'Ib539f96eb7: Increase the network performance sampling rate for MediaViewer'
[00:49:26] Logged the message, Master
[01:12:33] RECOVERY - Puppet freshness on labstore2 is OK: puppet ran at Fri Mar 21 01:12:32 UTC 2014
[01:20:23] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 19 Mar 2014 07:10:56 PM UTC
[01:32:43] (PS5) Hoo man: Introduce an admins::release user group [operations/puppet] - https://gerrit.wikimedia.org/r/116019
[01:45:34] (PS1) Reedy: Point php symlink at php-1.23wmf18 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119921
[01:45:38] (CR) jenkins-bot: [V: -1] Point php symlink at php-1.23wmf18 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119921 (owner: Reedy)
[01:56:28] ori, https://www.youtube.com/watch?v=z8rYotiiFP8
[02:12:39] !log LocalisationUpdate completed (1.23wmf18) at 2014-03-21 02:12:39+00:00
[02:12:46] Logged the message, Master
[02:20:28] ori: cool
[02:26:52] @info db1047
[02:26:52] Krinkle: [db1047: ?] 10.64.16.36
[02:27:09] @replag db1047
[02:27:09] Krinkle: Could not get replag information.
[02:27:25] @externals
[02:27:26] Krinkle: [operations/mediawiki-config.git] Checked out HEAD: 5f79d1012586a10ba6340edbea4e72e38dd063a0 - https://git.wikimedia.org/commit/operations%2Fmediawiki-config.git/5f79d1012586a10ba6340edbea4e72e38dd063a0
[02:27:28] @externals-update
[02:27:28] Krinkle: [operations/mediawiki-config.git] Checked out HEAD: 5f79d1012586a10ba6340edbea4e72e38dd063a0 - https://git.wikimedia.org/commit/operations%2Fmediawiki-config.git/5f79d1012586a10ba6340edbea4e72e38dd063a0
[02:27:36] @help
[02:27:36] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 2.0.0.4 my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features
[02:27:39] @info
[02:27:39] http://bots.wmflabs.org/~wm-bot/dump/%23wikimedia-operations.htm
[02:27:39] Krinkle: Invalid arguments
[02:27:42] @doc
[02:27:46] @source
[02:27:51] @docs
[02:27:51] Krinkle: https://www.mediawiki.org/wiki/dbbot-wm
[02:28:11] @externals update
[02:28:30] -_-
[02:33:15] @externals
[02:33:16] Krinkle: [operations/mediawiki-config.git] Checked out HEAD: 80c9ea77047421ee4c7a654610b4a1a39b000160 - https://git.wikimedia.org/commit/operations%2Fmediawiki-config.git/80c9ea77047421ee4c7a654610b4a1a39b000160
[02:34:40] !log LocalisationUpdate completed (1.23wmf19) at 2014-03-21 02:34:39+00:00
[02:34:45] Logged the message, Master
[02:42:03] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[02:49:30] (CR) Brian Wolff: [C: -1] "Doesn't work." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/118956 (owner: 01tonythomas)
[03:05:30] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 21 03:05:27 UTC 2014 (duration 5m 26s)
[03:05:36] Logged the message, Master
[03:31:10] @info 10.64.16.34
[03:31:10] Krinkle: Unknown identifier (10.64.16.34)
[03:31:27] @info db1045
[03:31:28] Krinkle: Unknown identifier (db1045)
[03:31:56] @info s5
[03:31:56] Krinkle: [s5] db1058: 10.64.32.28, db1005: 10.64.0.9, db1026: 10.64.16.15, db1021: 10.64.16.10
[03:36:29] @replag s5
[03:36:29] Krinkle: [s5: wikidatawiki] db1058: 0s, db1005: 0s, db1026: 0s, db1021: 0s, db1045 (*): 0s
[03:39:54] @restart
[03:39:54] @quit
[03:39:55] Permission denied
[03:40:26] @info s5 # db1045 should show up now
[03:40:26] Krinkle: [s5] db1058: 10.64.32.28, db1005: 10.64.0.9, db1026: 10.64.16.15, db1021: 10.64.16.10, db1045: 10.64.16.34
[03:40:33] @info 10.64.16.34
[03:40:33] Krinkle: [10.64.16.34: s5] db1045
[03:40:36] right
[03:40:40] @replag
[03:40:41] Krinkle: [s5] db1021: 1s
[03:41:08] https://tools.wmflabs.org/wmfdbbot/log/updateExternals.log
[03:42:03] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[03:42:10] @info db1047
[03:42:10] Krinkle: [db1047: ?] 10.64.16.36
[03:53:30] @info 10.64.16.10
[03:53:30] Krinkle: [10.64.16.10: s5] db1021
[03:53:35] Interesting..
[04:18:54] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC
[04:21:23] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 19 Mar 2014 07:10:56 PM UTC
[04:38:48] Krinkle|detached: neat (re dbbot)
[04:44:13] PROBLEM - DPKG on vanadium is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[05:09:13] RECOVERY - DPKG on vanadium is OK: All packages OK
[05:45:58] (PS1) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[05:50:46] (CR) Springle: "I'm no puppet guru. Please nitpick..." [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[07:14:59] (PS2) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[07:19:23] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC
[07:22:23] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 19 Mar 2014 07:10:56 PM UTC
[07:43:12] today I encountered a most delightful code non sequitur
[07:44:32] xhttps://github.com/minrk/pyzmq/blob/c0c9bf27da3ec3023fd0bed19780a858b09d1867/zmq/error.py#L119
[07:44:40] https://github.com/minrk/pyzmq/blob/c0c9bf27da3ec3023fd0bed19780a858b09d1867/zmq/error.py#L119 , rather
[07:45:01] pyzmq's poll() re-raises EINTR as KeyboardInterrupt
[07:46:35] they removed it in https://github.com/zeromq/pyzmq/pull/338 but neither the commit introducing it nor the commit removing it made any attempt to explain it
[07:47:37] and it's not, you know, a copy-paste error. you know there must have been some deliberate thought there.
[07:54:31] (PS1) ArielGlenn: adds-changes: generate prev maxrevid if it's missing, small code cleanups [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/119943
[07:57:13] (PS3) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[07:57:44] (CR) ArielGlenn: [C: 2] adds-changes: generate prev maxrevid if it's missing, small code cleanups [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/119943 (owner: ArielGlenn)
[08:10:28] (PS1) Nikerabbit: New LocalisationUpdate config [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119945
[08:14:33] (PS1) Nikerabbit: New LocalisationUpdate config [operations/puppet] - https://gerrit.wikimedia.org/r/119946
[08:21:55] (CR) Ori.livneh: [C: -1] Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[08:22:27] springle: gah, I meant to submit the comments on PS2
[08:22:59] :)
[08:22:59] (CR) Ori.livneh: Add a MariaDB module. (10 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[08:26:00] ori: thanks
[08:26:07] good comments
[08:26:16] np! :)
[08:42:01] (CR) Ori.livneh: [C: 2] Make logstash and kibana roles work in labs [operations/puppet] - https://gerrit.wikimedia.org/r/119099 (owner: BryanDavis)
[08:47:31] (CR) Springle: Add a MariaDB module. (4 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[08:49:09] springle: you are awesome :-]
[08:49:44] I have never understood all the DB modules we ended up with :]
[08:51:04] well, it's always like that, right -- with the systems you least want to restart or crash, you take the fewest liberties when puppetizing
[08:51:58] so you end up with something that delicately tiptoes around the way things are already provisioned rather than bold declarations
[08:52:02] yes, i think it was always done to be conservative
[08:52:22] still doing that in some places
[08:52:30] (CR) Matanya: "ori said it all, but wonder about two things, the inline comment and where did you define the ram variable?" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[08:52:32] you see it all over the codebase: the least important modules are usually the cleanest
[08:53:26] i think it's universal law
[08:53:37] clean is always better
[08:54:01] not better than a crashed instance :P
[08:54:11] :P
[08:54:37] springle: thanks for this work
[08:55:07] * hashar drinks more coffee
[08:57:32] ori: with apt::repository, no .list file anywhere in puppet? or must it still exist somewhere outside the module?
[08:58:17] if you look at modules/apt/manifests/repository.pp, you'll see that it generates a file { "/etc/apt/sources.list.d/${name}.list": resource
[08:58:33] and an exec { "apt-update-for-${name}":
[08:58:42] ah
[08:59:05] and the latter subscribes to the former, so changes always trigger an apt-get update
[08:59:20] aha that's how coredb was broken
[08:59:37] the update wasn't always triggered at the right time
[08:59:47] and the ability to require => Apt::Repository means you can depend on the file and the update
[08:59:49] ah
[09:00:50] hashar: so, I am being stupid, right -- if we are building HHVM anyway, there is really no point in using facebook's package, especially as we decided not to use it in production
[09:02:16] but since we are allowing ourselves to be loose with packages in labs given the complicated road ahead for proper hhvm packages, could we generate a simple package out of the hourly / daily builds?
[09:03:02] (PS5) Matanya: sudo: convert into a module [operations/puppet] - https://gerrit.wikimedia.org/r/111189
[09:03:17] ori: can you please ^
[09:03:19] (PS4) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[09:04:14] matanya: I've been chastened not to use my Puppet +2 outside of a narrow domain of platforms I co-administer
[09:04:30] so I can review but will leave it for ops to merge
[09:04:34] ori: all i want is a review :)
[09:07:33] (CR) Ori.livneh: "It'd be easier to review if you had a patch adding the module (keeping the old resources for now), and then another patch updating usage f" (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/111189 (owner: Matanya)
[09:08:08] ori: I think we should invest some time in the packaging work and use it for deployment
[09:08:16] ori: not sure why we would compile from source and use a binary
[09:08:42] (CR) Matanya: Add a MariaDB module. (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[09:08:44] ori: also we could use a puppet module to generate the hhvm configuration :-D
[09:08:55] thanks ori
[09:08:58] hashar: use it for deployment, meaning using it for production? or for the beta cluster?
[09:09:06] boths ?
[09:09:40] ori: if using debian packages for production is the goal, we should do the same on beta
[09:09:55] this way all the experience we gather on beta will be useful for prod later on
[09:10:01] well, once we have suitable packages for production, of course we should also use them in labs. but the impression i got from faidon was that packaging hhvm properly was going to be a long road
[09:10:42] ori: long road because I think Faidon wants a package which meet the Debian quality goals
[09:10:49] which are probably a superset of our needs
[09:10:57] we can go with a rough package that match our needs
[09:11:33] i agree if we're talking about beta
[09:11:48] we dont need to match the Debian policy of packaging all dependencies for example. We might well ship the libs in the hhvm package and point the binary to the provided compiled lib
[09:11:56] that is the same for prod
[09:12:09] then iterate and get the compiled lib packaged properly ala debian
[09:12:44] though that might be a challenge to have to maintain security patches for all the embedded libs :/
[09:12:52] it sounds right to me, but i can see how it might be disrespectful to faidon
[09:13:13] since he actually started working on packaging hhvm for debian
[09:13:54] that is the pragmatism versus idealism dilemma all over again
[09:13:54] and obviously the best possible outcome is that the packages are production-quality and (co-)maintained by a wmfer
[09:13:56] yeah I think that is Faidon aim
[09:14:06] to get Wikimedia to co maintain the package with Facebook and others
[09:14:10] which is nice.
[09:14:32] but if it takes a year to achieve this. It is going to be a serious blocker to the hhvm deployment on our cluster
[09:14:55] so we can fallback either to (a) compiled hhvm (b) a rough package
[09:14:58] well, on the one hand, i agree: i think working with a package that matches our requirements makes sense
[09:14:59] I would prefer the rough package
[09:15:38] on the other hand, faidon feels pretty strongly about this, and we're going to need ops help to deploy hhvm, and pissing them off royally right off the bat won't work very well
[09:16:09] so i'm inclined to think that we owe it to faidon to at least wait a while
[09:16:54] unless "a while" means a year or so
[09:16:54] well, i think the clearest way forward is to do the work for labs
[09:17:22] if the 'proper package' road stalls forever, we can say: look, here are packages, and they work well with our infrastructure / extensions / etc.
[09:18:06] if a proper package does arrive in a timely way, then we nevertheless gain the ability to test things in labs without having to be completely blocked on the one hand or having to fight over it on the other
[09:19:26] and who defines "our requirements" then? :)
[09:19:36] mzmcbride
[09:19:40] AHAH
[09:20:47] we got to list somewhere the possible strategies to deploy hhvm
[09:21:33] faidon was ok with using facebook's packages as long as usage was quarantined to labs, despite finding these packages seriously flawed, so i take that to mean that he wouldn't be opposed to replacing it with a rough custom package in labs
[09:21:47] then foreach each list the strengths, weakness, opportunities and possible threats
[09:21:58] hashar: not foreach, array_map
[09:22:03] functional is hipper
[09:24:05] i think there's a long list of things we still have to do before we're ready to deploy to prod
[09:24:30] who is in charge of listing all those tasks / prerequisites ?
[09:24:50] and i don't think it's true that there would not be value in testing extensions etc against a custom build
[09:25:06] hashar: they're mostly here: https://etherpad.wikimedia.org/p/hhvm
[09:25:53] and tracked in bugzilla, and listed here: https://www.mediawiki.org/wiki/HHVM#Current_work
[09:26:13] * hashar misses a proper project management tool :-]
[09:26:33] i also have a status update for ops@ / wikitech@ half-written
[09:26:46] great
[09:27:00] would it make sense to add a tracking bug in bugzilla?
[09:27:16] there was one but andre__ said it was that or a keyword, not both, and we went with a keyword
[09:27:32] (PS1) ArielGlenn: move mounts on stat1/1002 from dataset2 to dataset1001 [operations/puppet] - https://gerrit.wikimedia.org/r/119950
[09:27:35] I like the tree view display in bugzilla
[09:27:54] but yeah I guess I can look at the keyword from time to time
[09:27:56] yeah ditto, but you know andre__, he's evil and mean :P
[09:28:06] :-D
[09:28:24] https://www.mediawiki.org/wiki/HHVM#Current_work uses Extension:RSS to pull open bugs with the hiphop keyword
[09:28:27] looking at the list of bugs, it seems there is still a lot of dev to be done to run on hhvm
[09:28:30] so it should remain perpetually up-to-date
[09:28:45] so the packaging / deployment work can be done in parallel
[09:28:50] yeah
[09:29:03] like, tim has this massive patch upstream: https://github.com/facebook/hhvm/pull/1986
[09:29:04] (CR) ArielGlenn: [C: 2] move mounts on stat1/1002 from dataset2 to dataset1001 [operations/puppet] - https://gerrit.wikimedia.org/r/119950 (owner: ArielGlenn)
[09:29:08] which hasn't even been merged yet
[09:29:22] and once it's merged, it will sit in HEAD until the next release is cut
[09:29:39] which happens once every eight weeks
[09:30:29] so there's no point in picking fights about packages with so much still in the air
[09:32:50] still, having debs in labs of builds of tim's fork at https://github.com/tstarling/hiphop-php where his patches are staged would be great
[09:34:42] ori: we can build packages out of Facebook HEAD and add in Tim's patches using quilt
[09:34:51] or just build out of Tim fork :]
[09:35:02] yeah, either way
[09:36:01] okay, so first step is i undo my own kludge and remove the facebook packages from puppet / reprepro / labs
[09:36:09] I love tim code
[09:36:18] "ohai, i fixed all your bugs"
[09:37:10] or "slightly enhanced the crappy code by using which cut the run time by a factor of X" with X being quite large and the code being quite small
[09:37:34] ori: I am fine with the Facebook package :-D
[09:37:50] it let us run mw core test suite with hhvm!
[09:38:17] ah, right
[09:38:41] i just figured since upstream seemed to '+1' his changes he would go ahead and start porting luasandbox
[09:39:04] in which case we would want to have a package with the patches
[09:39:14] but maybe it's too early to even worry about *that*
[09:39:24] probably
[09:40:33] you realize that this whole exploration of alternative solutions is basically because i am being a lazy jerk and hoping not to have to fix the libmemcached compat issue in labs because i wouldn't know where to start
[09:41:13] i would basically suggest RPMs if we kept going :P
[09:51:14] ori: well memcached we will have to fix it up eventually dont we ?
[09:51:26] I am not sure why wikimedia-task-appserver depends on libmemcached6
[09:51:30] maybe for the php extensions
[09:52:21] 6?
[09:52:40] applicationserver/manifests/packages.pp has libmemcached11
[10:00:23] (PS5) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[10:01:14] !log Jenkins: deleting pmtpa labs slaves integration-slave02 and integration-slave03. Replaced by eqiad instances integration-slave1001 and integration-slave1002.
[10:01:20] Logged the message, Master
[10:01:56] ori: but I think the issue I had was wikimedia-task-appserver depending libmemcached6
[10:03:54] root@deployment-apache02:~# apt-cache depends hhvm|grep libmem
[10:03:54] Depends: libmemcached6
[10:03:54] # apt-cache depends wikimedia-task-appserver|grep libmem
[10:03:54] Depends: libmemcached10
[10:03:54] #
[10:03:54] that is the other way around :-]
[10:04:42] (PS6) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[10:20:23] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC
[10:22:22] (CR) Alexandros Kosiaris: [C: -1] "That is a just a first set of changes. I reviewed PS5 before you uploaded PS6 but I see they are still valid. What am I more concerned abo" [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[10:22:55] (CR) Alexandros Kosiaris: Add a MariaDB module. (13 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[10:23:08] grrr gerrit and PS...
[10:23:23] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 19 Mar 2014 07:10:56 PM UTC
[10:23:47] (PS7) Springle: Add a MariaDB module. [operations/puppet] - https://gerrit.wikimedia.org/r/119930
[10:27:12] pff
[10:27:19] and Gerrit does not let me push tags :D
[10:28:51] (CR) Springle: Add a MariaDB module. (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[10:31:50] (PS1) Hashar: Merge tag 'v0.8.1' from upstream [operations/debs/jenkins-debian-glue] - https://gerrit.wikimedia.org/r/119961
[10:33:45] akosiaris: would you have time this afternoon to build a package and put it on apt.wikimedia.org by any chance ? :-]
[10:33:56] it is a set of shell script I am using to build Debian packages on labs.
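The dependency check pasted at 10:03:54 can be sketched as a small shell script. This is a minimal sketch and not something anyone ran in the channel: the helper names `lib_deps` and `same_deps` are hypothetical, and the input is canned text copied from the two `apt-cache depends` outputs above, so the script runs without access to deployment-apache02.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-cache depends`-style output on stdin and
# print the names of "Depends:" entries that start with the given prefix.
lib_deps() {
    # $1: package-name prefix to look for, e.g. "libmem"
    awk -v prefix="$1" '$1 == "Depends:" && index($2, prefix) == 1 { print $2 }'
}

# Hypothetical helper: succeed only when both dependency lists are identical.
same_deps() {
    [ "$1" = "$2" ]
}

# Canned input mirroring the two outputs quoted in the log. On a real host
# one would pipe `apt-cache depends <package>` into lib_deps instead.
hhvm_deps=$(printf '  Depends: libmemcached6\n' | lib_deps libmem)
appserver_deps=$(printf '  Depends: libmemcached10\n' | lib_deps libmem)

if same_deps "$hhvm_deps" "$appserver_deps"; then
    echo "libmemcached dependencies agree: $hhvm_deps"
else
    echo "libmemcached mismatch: hhvm wants $hhvm_deps, wikimedia-task-appserver wants $appserver_deps"
fi
```

Reading from stdin keeps the filter separate from the `apt-cache` call, which is why it can be exercised with pasted text as well as live output.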
[10:34:16] need to bump it from current 0.7.x to 0.8.1 https://gerrit.wikimedia.org/r/#/c/119961/
[10:34:40] why is it empty ?
[10:34:58] aah it is a merge
[10:35:26] still weird. I would expect some diff there...
[10:36:12] (CR) Springle: "@Alex, feel free!" [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[10:38:42] akosiaris: yeah sorry :-(
[10:39:02] I pushed upstream code to the 'upstream' branch
[10:39:07] the debian package is in the master branch
[10:39:13] debian/gbp.conf should be fine
[10:40:10] akosiaris: and you can see the result of the build at https://integration.wikimedia.org/ci/job/operations-debs-jenkins-debian-glue-debian-glue/12/ Under 'Build Artifacts' are the lintian/piuparts output + the .deb files :]
[10:40:49] and it is lintian clean! https://integration.wikimedia.org/ci/job/operations-debs-jenkins-debian-glue-debian-glue/12/testReport/
[10:41:45] akosiaris: also I noticed one of your change got merged by a Gerrit user named "rush" after you voted CR+2
[10:42:00] that would be chase
[10:42:18] ahh
[10:42:21] new hire
[10:42:27] https://gerrit.wikimedia.org/r/#/c/119127/
[10:42:34] I believe an email has been sent already
[10:42:38] I thought it was some kind of bot acting after you voted +2 hehe
[10:43:00] ahahaha
[10:43:54] something like your catalog compilation script submitting changes if the catalog are properly compiled and the auto merging :]
[10:44:46] hmmm
[10:44:59] automatic +2 in case the differ says it is a noop
[10:45:06] you are intriguing me sir
[10:45:20] jenkins-debian-glue-buildenv-git_0.8.1+0~20140321103153.12~1.gbpff783e_all.deb
[10:45:22] ahahahaha
[10:45:35] guiness world of records for longest package name
[10:46:23] hehe
[10:46:32] that is merely for CI purposes
[10:46:38] you probably want to rebuild it properly
[10:46:59] either from upstream or from operations/debs/jenkins-debian-glue which is merely a copy of the github repo
[10:48:47] akosiaris:
[10:48:50] $ apt-cache dump | grep '^Package:' | awk '{print length, $0;}' | sort -nr | head -1
[10:48:51] 76 Package: libcgi-application-plugin-authorization-driver-activedirectory-perl
[10:51:29] ori: damn... I always forget about perl and cpan
[10:51:58] those blah::blah2::blah3::blah4 perl modules ...meh
[10:52:17] i try to construct a notion of what that package does as i read it but my stack overflows about 2/3rds of the way through
[10:52:48] something about cgi in perl
[10:52:58] yes, that's about as far as i got :D
[10:52:58] :P
[10:55:46] wife lunch bbl
[11:15:00] (CR) QChris: "This change (and merging it within less than 6 hours since" [operations/puppet] - https://gerrit.wikimedia.org/r/119795 (owner: Yurik)
[11:32:44] (CR) QChris: "Could this change be related to:" [operations/puppet] - https://gerrit.wikimedia.org/r/119795 (owner: Yurik)
[11:38:57] !log upgraded libmemcached packages on apt.wikimedia.org to libmemcached_1.0.17-1~wmf+precise2
[11:39:02] Logged the message, Master
[11:39:38] that is a heads up if you notice anything peculiar btw. It has been tested for days on test and beta, but you never know
[11:39:42] akosiaris: whoa, is that going to fix the hhvm package dep issue on labs?
[11:40:33] ori: * Add allow-zero-retry patch for bug #1251482 in launchpad
[11:40:42] so no
[11:41:19] ah, right -- that. well, still -- thanks! that's very good to have in
[11:45:45] ouf. that planet.osm load is taking forever...
[11:45:52] import more like it
[11:46:27] akosiaris: be patient
[11:46:51] aaa btw
[11:47:06] hstore. do we need it ? do we want it ?
[11:47:10] yes [11:47:37] ok we support it but it has no data right now, will need to populate it [11:48:08] it might be more important for osm2pgsql than osmosis [11:48:15] but think it's useful regardless [11:48:39] I understand some data can only be found efficiently through hstore [11:48:45] allows having tags indexed key => value [11:48:56] exactly [11:48:56] then that helps with rendering [11:49:02] we used it for multilingual rendering [11:49:33] put name:de => value there, for example [11:51:38] you are using osmosis right, to import? [11:51:38] (03CR) 10Alexandros Kosiaris: [C: 032] Tune labsdb postgresql [operations/puppet] - 10https://gerrit.wikimedia.org/r/119870 (owner: 10Alexandros Kosiaris) [11:52:40] osm2pgsql and osmosis for the .osc [11:52:44] ok [11:53:20] osmosis is great for doing a variety of things, like making an extract from plant or supports the format used by the osm api [11:53:36] osm2pgsql might be faster / better for the convert / import to mapnik format [11:53:59] btw [11:54:00] and is what kai maintains [11:54:03] Using 8 helper-processes [11:54:03] WARNING: Failed to fork helper processes. Falling back to only using 1 [11:54:26] any idea why ? I searched quite a bit but I did not find even a clue to what might be wrong [11:54:41] and I would sure love to use all the CPUs [11:54:43] i woul dask kai [11:54:45] ask* [11:54:55] thought so, I will email him. 
thanks [11:55:30] there are 2 modes of input, one that uses temp tables [11:55:34] import* [11:56:36] "slim" mode [11:56:55] yes, we use slim mode [11:56:59] ok [11:57:04] needed for incremental updates [11:57:17] because that is needed to support minutely mapnik updates afterwards [11:57:22] yep [11:57:43] i'm sure kai will know what to do about the error [11:59:45] (03PS2) 10Alexandros Kosiaris: Create a user for access to OSM db [operations/puppet] - 10https://gerrit.wikimedia.org/r/119869 [12:02:18] !log upgrade jenkins-debian-glue to 0.8.1 on apt.wikimedia.org [12:02:24] Logged the message, Master [12:02:25] hashar: ^^ [12:02:47] (03CR) 10Alexandros Kosiaris: [C: 032] Create a user for access to OSM db [operations/puppet] - 10https://gerrit.wikimedia.org/r/119869 (owner: 10Alexandros Kosiaris) [12:04:27] heh. osm postgres cluster directory is 370G. Not gargantuan, but quite large indeed [12:18:26] (03CR) 10Alexandros Kosiaris: [C: 032] Merge tag 'v0.8.1' from upstream [operations/debs/jenkins-debian-glue] - 10https://gerrit.wikimedia.org/r/119961 (owner: 10Hashar) [12:18:41] (03PS1) 10Ori.livneh: applicationserver::hhvm: add boost backports ppa for beta [operations/puppet] - 10https://gerrit.wikimedia.org/r/119979 [12:19:32] LeslieCarr: Hey, I nominated your file for deletion [12:19:44] https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Sue_and_WMF_Staffers_drinking_champagne_2013-03-08.jpg [12:20:02] https://commons.wikimedia.org/wiki/Commons:OTRS for the instructions [12:25:25] * MaxSem decrments twkozlowski's karma [12:52:25] (03PS1) 10coren: Tool Labs: remove 'tree' from exec_environ [operations/puppet] - 10https://gerrit.wikimedia.org/r/119982 [12:52:57] hmm. 
my phone is complaining about nocs ssl cert [12:53:51] (03CR) 10coren: [C: 032] Tool Labs: remove 'tree' from exec_environ [operations/puppet] - 10https://gerrit.wikimedia.org/r/119982 (owner: 10coren) [12:56:40] (03CR) 10Manybubbles: [C: 031] Adding archiva module and role, applying on titanium [operations/puppet] - 10https://gerrit.wikimedia.org/r/117024 (owner: 10Ottomata) [12:58:01] re [12:59:21] akosiaris: thanks for the jenkins-debian-glue package! :-] [13:10:08] (03PS1) 10Hashar: contint: puppetize jenkins-debian-glue [operations/puppet] - 10https://gerrit.wikimedia.org/r/119983 [13:10:53] (03CR) 10Hashar: "v0.8.1 of jenkins-debian-glue has been uploaded to apt.wikimedia.org a couple hours ago by Alexandros :-]" [operations/puppet] - 10https://gerrit.wikimedia.org/r/119983 (owner: 10Hashar) [13:15:52] (03PS1) 10Reedy: Add zero.wikimedia.org [operations/dns] - 10https://gerrit.wikimedia.org/r/119984 [13:18:15] (03PS1) 10Reedy: Add zerowiki [operations/apache-config] - 10https://gerrit.wikimedia.org/r/119985 [13:21:23] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC [13:24:12] (03PS3) 10Reedy: Move "RewriteEngine On" earlier in www.wikimedia.org vhost [operations/apache-config] - 10https://gerrit.wikimedia.org/r/91339 [13:24:23] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 19 Mar 2014 07:10:56 PM UTC [13:27:12] (03PS2) 10Hashar: contint: puppetize jenkins-debian-glue [operations/puppet] - 10https://gerrit.wikimedia.org/r/119983 [13:36:48] (03CR) 10coren: [C: 032] "Appears mostly sane." 
[operations/puppet] - https://gerrit.wikimedia.org/r/119983 (owner: Hashar)
[13:37:02] hashar: it has nothing to do with my Debian standarsd
[13:37:11] and I've never applied Debian standards to our packaging, fwiw
[13:37:36] hashar: this is hhvm's original "packaging": https://github.com/hhvm/packaging/blob/master/hhvm/deb/package
[13:37:45] I probably misinterpreted the discussion on the ITP :/
[13:37:48] (PS1) Reedy: Add zerowiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119990
[13:38:25] (CR) Reedy: [C: 2] Point php symlink at php-1.23wmf18 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119921 (owner: Reedy)
[13:38:39] (Merged) jenkins-bot: Point php symlink at php-1.23wmf18 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119921 (owner: Reedy)
[13:38:42] just have a look at that link above
[13:39:31] line 110 and onwards
[13:39:37] line 142 is what "packages" it
[13:39:53] (PS1) Hashar: contint: typo in contint::packages::labs [operations/puppet] - https://gerrit.wikimedia.org/r/119991
[13:39:55] these are not debian packages, they are just a tarball packed into a deb format
[13:41:20] which is not really the Debian way to do it but might be acceptable for us isn't it ?
[13:41:26] (CR) coren: [C: 2] "Need coffee, Jenkins?"
[operations/puppet] - https://gerrit.wikimedia.org/r/119991 (owner: Hashar)
[13:41:28] no
[13:41:43] it has nothing to do with the Debian way
[13:42:37] http://anonscm.debian.org/gitweb/?p=collab-maint/hhvm.git;a=shortlog
[13:42:47] David is a Facebook employee
[13:42:58] these are the real packages
[13:43:02] it already builds
[13:43:09] my understanding was that the hhvm package was including some patched libs
[13:43:13] it's still not up to Debian standards
[13:43:16] which would need to be extracted out to some proper packages
[13:43:17] no, it's not, not anymore
[13:43:20] this is not the issue
[13:43:36] !log Jenkins: installing jenkins-debian-glue and misc::package-builder on labs slaves.
[13:43:40] Logged the message, Master
[13:43:47] paravoid: that is a good news!
[13:44:09] this whole hhvm packages deal is crazy
[13:44:15] so on beta cluster should we use the package from collab-maint?
[13:44:19] no
[13:44:31] we use the crappy not-real-packages because ori was pushing me very hard for them
[13:45:04] he came up with another patch earlier today which is to use a ppa from launchpad https://gerrit.wikimedia.org/r/#/c/119979/1/modules/applicationserver/manifests/hhvm.pp ..
[13:45:24] but he was right earlier, I don't appreciate the assumption that I'm being unreasonable and putting Debian ways above Wikimedia's interests
[13:48:00] paravoid: sorry you took it like that :-/ I probably miss explained my thought
[13:48:25] as I understand it you are babysitting the hvvm debian packaging with Facebook folks
[13:48:36] we're working on it
[13:48:37] which would eventually produce a nice package that match Debian expectations
[13:49:04] I'm working on it with the hopes that it will produce a nice package that meets Wikimedia's expectations
[13:49:15] which are a subset of Debian's
[13:49:25] but since that might takes a while that would prevent us to install hhvm using a package meanwhile.
Hence my proposal to use a shitty intermediate package meanwhile
[13:49:40] we have explicitly agreed not to
[13:49:49] ahhh
[13:49:51] we are not going to use the dl.hhvm.com packages anywhere in production
[13:50:11] the existing dl.hhvm.com anyway, I suspect they switch them to david's/my work at some point
[13:50:20] so how should we install hhvm meanwhile ?
[13:50:30] where?
[13:50:58] on the beta cluster labs and contint labs slaves (we run the mediawiki unit test suite with hhvm there)
[13:51:16] I agreed to use the dl.hhvm.com packages there and they are already in apt
[13:51:30] for *labs*
[13:51:47] great
[13:51:52] so that solve my concerns :-]
[13:52:16] we got a package to install hhvm and it is not going to be the one used later on in production
[13:52:22] correct
[13:52:43] it's for testing the code with the hhvm interpreter, not for running a service
[13:52:54] that's fine, for all I care it could have as easily been tar -xzf; ./hhvm
[13:53:32] so I guess we can -1 ori proposal to install hhvm from a launchpad ppa which is https://gerrit.wikimedia.org/r/#/c/119979/1/modules/applicationserver/manifests/hhvm.pp
[13:54:21] though 2.4.2 requires a backport of libboost :/
[13:54:54] http://paste.debian.net/88893/
[13:57:20] (CR) Hashar: "The hhvm 2.4.2 package we have on apt.wikimedia.org depends on libboost 1.49. Precise has 1.46 and 1.48 :(" [operations/puppet] - https://gerrit.wikimedia.org/r/119979 (owner: Ori.livneh)
[13:57:24] pasted on gerrit chnage
[13:58:56] (CR) Ottomata: "Hm, what's the point of putting applications of this module inside of this module? I'm talking about beta::config, tendril::config, etc. " [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[14:08:01] !log Jenkins: label labs slaves with hasJenkinsDebianGlue to build debian packages on them
[14:08:06] Logged the message, Master
[14:18:56] (CR) Alexandros Kosiaris: "@ottomata: +1 to what you said. That is why I asked to have a go.
I wanna see if the effort will make sense and I Sean will like it." [operations/puppet] - https://gerrit.wikimedia.org/r/119930 (owner: Springle)
[14:36:42] (PS1) Hashar: Lint misc::package-builder [operations/puppet] - https://gerrit.wikimedia.org/r/120005
[14:42:24] (PS2) Hashar: Lint misc::package-builder [operations/puppet] - https://gerrit.wikimedia.org/r/120005
[14:47:13] RECOVERY - Puppet freshness on labsdb1004 is OK: puppet ran at Fri Mar 21 14:47:05 UTC 2014
[14:47:51] (PS1) Hashar: package-builder.pp: convert notify to use 'message' [operations/puppet] - https://gerrit.wikimedia.org/r/120008
[14:47:53] (PS1) Hashar: package-builder.pp: parameterized $pbuilder_root [operations/puppet] - https://gerrit.wikimedia.org/r/120009
[14:53:26] (CR) Brion VIBBER: "Ok I've done some testing with Wireshark to see what's actually being sent over the wire when I make requests..." [operations/puppet] - https://gerrit.wikimedia.org/r/119786 (owner: Faidon Liambotis)
[14:54:45] brion: we'd have to ask Aaron about that X-Content-Duration thing
[14:54:50] it's something that MWCore does
[14:55:11] ok
[14:55:37] i still can't believe flash doesn't let me add the damn normal regular range header
[14:59:47] (PS1) Yurik: Fix duplicate zero= entries for X-Analytics [operations/puppet] - https://gerrit.wikimedia.org/r/120010
[15:00:16] bblack, when you have a chance, yesterday's patch introduced a minor bug in logs, here's the fix ^
[15:00:17] yeah, crazy
[15:02:02] (PS2) Yurik: Fix duplicate zero= entries for X-Analytics [operations/puppet] - https://gerrit.wikimedia.org/r/120010
[15:03:28] (PS1) Alexandros Kosiaris: OSM: Better checking for planet_osm existence [operations/puppet] - https://gerrit.wikimedia.org/r/120012
[15:05:25] (CR) Alexandros Kosiaris: [C: 2] OSM: Better checking for planet_osm existence [operations/puppet] - https://gerrit.wikimedia.org/r/120012 (owner: Alexandros Kosiaris)
[15:05:56] (PS1)
Hashar: Role classes wrapper for misc::package-pbuilder [operations/puppet] - https://gerrit.wikimedia.org/r/120013
[15:06:23] oh for ....... flash won't let me read response headers either. seriously?
[15:06:31] * brion slaps Flash around with the HTTP standard
[15:08:02] brion: try Silverlight ?
[15:08:05] lol
[15:08:14] it is backed up by a multi billion dollars company!
[15:08:18] it must be good
[15:08:20] maybe i'll use javascript xhr to do HEAD request and fetch the x-content-duration :P
[15:08:28] I hear Mono was cool 5 years ago.
[15:08:32] oh mono
[15:08:55] was it a free implement of the .NET API ?
[15:09:08] ..implementation..
[15:09:27] (PS1) Alexandros Kosiaris: Fix typo introduced in db8c944 [operations/puppet] - https://gerrit.wikimedia.org/r/120015
[15:09:29] hashar: yep http://www.mono-project.com/
[15:09:37] ironic to see java is moving to openjdk :-D
[15:09:57] i remember when java wasn't free enough for us, we ran mono version of lucene search backend for a while :D
[15:11:08] (CR) Alexandros Kosiaris: [C: 2] Fix typo introduced in db8c944 [operations/puppet] - https://gerrit.wikimedia.org/r/120015 (owner: Alexandros Kosiaris)
[15:13:55] (CR) Hashar: "misc::package-builder is only used on labs apparently nowadays." [operations/puppet] - https://gerrit.wikimedia.org/r/120013 (owner: Hashar)
[15:14:00] (PS1) Ottomata: Creating var and public directories so that wikimetrics can write out public datasets and serve them [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/120016
[15:15:04] akosiaris: paravoid: ottomata: do we have any production server using misc::package-builder (installs cowbuilder to build debian packages with)
[15:15:37] I got a few changes that let me change the pbuilder root from /var/cache/pbuilder to /mnt/pbuilder for labs usage
[15:15:49] /mnt ?
[15:15:58] if the class is not being used in production anymore (it is not in site.pp ) we could safely merge them :]
[15:15:59] hashar: i do not know
[15:16:07] i have never used it
[15:16:14] yeah /mnt we went with that when adding the labs LVM mount for eqiad labs instance :\
[15:16:24] ottomata: thx :]
[15:17:59] aha
[15:18:16] paravoid: ok i think i see why i'm not always getting x-content-duration -- i think it's only set on the original file, not the derivatives
[15:18:22] for audio we almost always play original file
[15:18:29] for video we usually play a derivative
[15:19:08] lemme file a bug for the moment
[15:20:47] hashar: that's wrong
[15:20:55] the /mnt
[15:21:21] should we switch to /srv ?
[15:21:52] Coren: ^^:-D
[15:22:32] I dont mind changing to /srv but I got a bunch of repositories that have been using /mnt for ages :/
[15:22:37] not sure what needs to be changed
[15:22:39] (PS1) Alexandros Kosiaris: OSM: Tune shmmax sysctl parameter [operations/puppet] - https://gerrit.wikimedia.org/r/120017
[15:22:50] maybe we can mount the lvm volume twice :]
[15:22:58] paravoid: Yeah, I've been pusing to a move to /srv because using /mnt is wrong. Hysterical raisins, pmtpa labs use to put additional storage on /mnt
[15:23:09] hashar: That'd need a bind mount
[15:23:29] switching everything to /srv would be ncie
[15:23:40] on beta we have the scap utilities under /srv but that is part of the root paritition
[15:24:04] while /mnt is the LVM partition and host the Jenkins workspaces for example
[15:24:04] (CR) Alexandros Kosiaris: [C: 2] OSM: Tune shmmax sysctl parameter [operations/puppet] - https://gerrit.wikimedia.org/r/120017 (owner: Alexandros Kosiaris)
[15:24:11] or the jenkins user homedir
[15:26:29] hashar: We can switch to /srv using the $::lvm_mount_point variable I added.
[15:28:08] bd808: hashar btw, did I misunderstand the memcache/l10ncache thing?
(re that bug report) Is it possible that even though the config is the same, that there is a difference in where the cache lives in practicality?
[15:28:28] bd808: or add another class :D
[15:29:02] * bd808 looks for "that bug report" in a pile of inbox spam from gsoc
[15:29:38] https://bugzilla.wikimedia.org/show_bug.cgi?id=62875
[15:29:39] (PS1) Hashar: role::labs::lvm::srv to mount second disk on /srv [operations/puppet] - https://gerrit.wikimedia.org/r/120019
[15:29:42] bd808: ^
[15:29:46] bd808: ^
[15:29:49] :)
[15:30:45] (CR) Hashar: "Can be used instead of role::labs::lvm::mnt . I tend to dislike $::lvm_mount_point which Bryan added in some patch because that force us " [operations/puppet] - https://gerrit.wikimedia.org/r/120019 (owner: Hashar)
[15:31:03] bd808: I am not sure where the l10n cache is looked up on beta cluster. Would need some one with more knowledge than me to look at it.
[15:31:04] err
[15:31:07] that was for greg-g
[15:31:13] * hashar nature's call
[15:31:24] * bd808 needs a nap
[15:34:37] me too, instead: drinking maté
[15:34:39] mmm
[15:35:06] (CR) BryanDavis: "I've got no objections to this role, but I would pout about my global that allows control of the mount point being removed." [operations/puppet] - https://gerrit.wikimedia.org/r/120019 (owner: Hashar)
[15:36:07] I stayed up almost until ori hours of the morning writing code because my wife if out of town
[15:36:19] But then my dogs got me up at the normal time
[15:36:28] bd808: get to your nap :-]
[15:36:47] bd808: the only thing worse than those hours are faidon hours (eg: never go to sleep)
[15:36:50] I am not going to switch my instances to /srv on a friday evening anyway (per greg-g stick) :D
[15:39:02] (CR) Hashar: Role classes wrapper for misc::package-pbuilder (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/120013 (owner: Hashar)
[15:40:44] apergos: have you seen ms1004's alert?
[15:40:57] no
[15:41:07] disk space
[15:42:30] looking
[15:43:01] bd808: was it fun at least?
[15:43:42] greg-g: Yeah. Until Tyler showed up and dropped a -1 on the root concept again :/
[15:43:47] blah
[15:46:02] fixed
[15:46:05] thanks
[16:01:14] paravoid, i tried to work through your numbers at https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#Impact_on_infrastructure - i still end up with 500GB + 15% growth in the file count
[16:02:31] (CR) Alexandros Kosiaris: [C: -1] Manage scap proxy rsync config in puppet. (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/119677 (owner: Reedy)
[16:03:03] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[16:11:36] (CR) BryanDavis: Manage scap proxy rsync config in puppet. (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/119677 (owner: Reedy)
[16:19:38] (CR) Rush: "matanya, I haven't forgotten about this man. I am working on testing a version of this diff: https://gerrit.wikimedia.org/r/#/c/107848, t" [operations/puppet] - https://gerrit.wikimedia.org/r/111189 (owner: Matanya)
[16:20:07] :)
[16:22:23] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC
[16:25:17] (CR) Dzahn: [C: 1] "Mark, feel like checking the LVS config part?" [operations/puppet] - https://gerrit.wikimedia.org/r/119515 (owner: Reedy)
[16:25:26] paravoid: coren: I can't migrate the instances from /mnt to /srv. There is too much havoc involved because the pmtpa instances use /mnt . Would have to change a bunch of assumption made in various scripts / Jenkins config etc. I would rather stick to /mnt and migrate to /srv once all instances are in eqiad.
[16:26:29] mutante: looks fine
[16:26:34] paravoid: coren: I am happy to add a FIXME comment on the role::package::builder::labs class which uses /mnt as a pbuilder for now ( https://gerrit.wikimedia.org/r/#/c/120013/1/manifests/role/package.pp,unified )
[16:28:45] (PS5) Dzahn: Decomission lvs[1-6] [operations/puppet] - https://gerrit.wikimedia.org/r/119515 (owner: Reedy)
[16:35:58] (PS6) Reedy: Manage scap proxy rsync config in puppet. [operations/puppet] - https://gerrit.wikimedia.org/r/119677
[16:36:44] (CR) Alexandros Kosiaris: Manage scap proxy rsync config in puppet. (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/119677 (owner: Reedy)
[16:38:00] (PS1) Mark Bergsma: Add analytics1-d-eqiad subnets [operations/dns] - https://gerrit.wikimedia.org/r/120026
[16:38:54] (PS2) Mark Bergsma: Add analytics1-d-eqiad subnets [operations/dns] - https://gerrit.wikimedia.org/r/120026
[16:39:50] (CR) Mark Bergsma: [C: 2] Add analytics1-d-eqiad subnets [operations/dns] - https://gerrit.wikimedia.org/r/120026 (owner: Mark Bergsma)
[16:46:48] yurik: I thought mods to resp.http.foo in vcl_deliver weren't cached? the old zero= was being done with the same basic mechanism...
[16:48:34] bblack, not exactly - the x-analytics was set in vcl_recv, and the backend would take that value, parse it, append stuff to it, and return it in response.
The response would than get cached, and the new VCL code would append the new (duplicate) value
[16:48:49] now x-analytics is one way only -- response
[16:49:05] set by backend, appended by varnish
[16:49:27] ok, right, I wasn't considering the backend setting
[16:50:16] (PS3) BBlack: Fix duplicate zero= entries for X-Analytics [operations/puppet] - https://gerrit.wikimedia.org/r/120010 (owner: Yurik)
[16:50:25] (CR) BBlack: [C: 2 V: 2] Fix duplicate zero= entries for X-Analytics [operations/puppet] - https://gerrit.wikimedia.org/r/120010 (owner: Yurik)
[16:50:34] bblack, thx!
[16:51:03] np
[17:04:03] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[17:05:21] (PS1) Mark Bergsma: Add subnet analytics1-d-eqiad [operations/puppet] - https://gerrit.wikimedia.org/r/120030
[17:07:35] (CR) Mark Bergsma: [C: 2] Add subnet analytics1-d-eqiad [operations/puppet] - https://gerrit.wikimedia.org/r/120030 (owner: Mark Bergsma)
[17:22:11] !log Running deleteEqualMessages.php on wuuwiki (bug 43917 comment 23)
[17:22:17] Logged the message, Master
[17:23:43] !log remove pmtpa bits (sq67-70) from pybal
[17:23:48] Logged the message, Master
[17:26:33] (CR) Dzahn: [C: 2] Decomission lvs[1-6] [operations/puppet] - https://gerrit.wikimedia.org/r/119515 (owner: Reedy)
[17:38:57] robh: I've got a question about the dns changes you made in https://gerrit.wikimedia.org/r/#/c/118019
[17:39:58] That commit removed the *.local.wmftest.net entry which breaks my Scholarships role in mw-vagrant
[17:40:28] The wildcard to 127.0.0.1 was added by ori in https://gerrit.wikimedia.org/r/#/c/109948/
[17:40:40] (CR) Hashar: Role classes wrapper for misc::package-pbuilder (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/120013 (owner: Hashar)
[17:42:10] argh, i didnt realize it was being used
[17:42:23] so we can revert and toss it back
[17:42:29] That was the answer I was hoping for :)
[17:42:34] bd808:
I apologize for breakign your stuff, was unintentional =]
[17:42:52] !log removing pmtpa https from pybal (ssl1-4)
[17:42:52] hope it didnt cause too much pain
[17:42:56] Logged the message, Master
[17:43:11] robh: No worries. I didn't notice for a couple of weeks :)
[17:43:48] if you wanna put in the revert with a note that you were using it, i'll merge (seems nicer than me putting a note we talked in irc, blah blah, audit trail)
[17:43:58] less he said that the other person said
[17:44:01] Can do. Thanks
[17:44:51] (PS8) Ottomata: Adding archiva module and role, applying on titanium [operations/puppet] - https://gerrit.wikimedia.org/r/117024
[17:46:59] (PS1) BryanDavis: Revert "DNS cleanup" [operations/dns] - https://gerrit.wikimedia.org/r/120039
[17:47:42] bd808: can you add a comment?
[17:47:43] (PS2) BryanDavis: Revert "DNS cleanup" [operations/dns] - https://gerrit.wikimedia.org/r/120039
[17:47:52] ah, there is one already
[17:47:53] nevermind me
[17:48:08] .ignore paravoid :)
[17:48:12] :)
[17:48:50] Killing the weird google thing is fine with me.
[17:49:04] I can amend the revert to keep that bit
[17:49:07] that's probably some google apps for your domain test
[17:49:13] that OIT used or maybe still uses
[17:49:31] cajoel: do you know if OIT uses wmftest.org for testing Google Apps?
[17:49:59] "open web analytics" owa1-3, that died LONG time ago, right
[17:50:05] Yeah that's what it looks like or a webmaster tools claim
[17:50:12] mutante: correct
[17:50:34] see you next week
[17:50:56] !log removing owa1-3 from pmtpa pybal
[17:51:01] Logged the message, Master
[17:52:47] !log removing search_pool4 from pmtpa pybal
[17:52:51] Logged the message, Master
[17:54:46] !log same for search_pool5, and setting search_prefix (search19/20) to disabled
[17:54:51] Logged the message, Master
[17:57:00] how about mobile2-5 still being enabled in pybal
[17:57:18] already gone from DNS ...ok
[18:00:03] !log Reloading Zuul to deploy I20b4aa9159df7
[18:00:08] Logged the message, Master
[18:00:45] (CR) Nuria: [C: 1] "Tested on both daemon and apache mode. All looks good." [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/120016 (owner: Ottomata)
[18:01:24] (CR) Dzahn: "sorry, i should have seen that Change-Id: If7b0b0b7338da181e767ed023c4a52067f274e29 was already here, i duplicated it and already merged" [operations/dns] - https://gerrit.wikimedia.org/r/118644 (owner: Reedy)
[18:02:11] (CR) Ottomata: [C: 2 V: 2] Creating var and public directories so that wikimetrics can write out public datasets and serve them [operations/puppet/wikimetrics] - https://gerrit.wikimedia.org/r/120016 (owner: Ottomata)
[18:02:42] (CR) Dzahn: "sorry, i duplicated that in Change-Id: I6a5ff28f1a1228dafd2d3a34dac65a03c7923d10 . should have just used this one" [operations/dns] - https://gerrit.wikimedia.org/r/118642 (owner: Reedy)
[18:02:46] (Abandoned) Reedy: Remove ssl[1-4]. Leave mgmt [operations/dns] - https://gerrit.wikimedia.org/r/118644 (owner: Reedy)
[18:03:14] (Abandoned) Reedy: Remove sq67, sq68, sq69, sq70.
Leave mgmt [operations/dns] - https://gerrit.wikimedia.org/r/118642 (owner: Reedy)
[18:05:03] !log disabled and commented mobile2-5 in pmtpa pybal
[18:05:08] Logged the message, Master
[18:09:57] (CR) Yurik: Add zerowiki (2 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119990 (owner: Reedy)
[18:14:05] (CR) Dzahn: [C: 2] Add zero.wikimedia.org [operations/dns] - https://gerrit.wikimedia.org/r/119984 (owner: Reedy)
[18:14:49] !log DNS update - adding zero.wikimedia.org
[18:14:54] Logged the message, Master
[18:15:11] Reedy: yurik zero.wikimedia.org. 3600 IN CNAME wikimedia-lb.wikimedia.org.
[18:25:18] (PS1) Alexandros Kosiaris: Migrate the auth method of OSM user to trust [operations/puppet] - https://gerrit.wikimedia.org/r/120045
[18:27:14] (PS9) Ottomata: Adding archiva module and role, applying on titanium [operations/puppet] - https://gerrit.wikimedia.org/r/117024
[18:41:51] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 03:41:16 PM UTC
[18:42:07] (CR) Alexandros Kosiaris: [C: 2] Migrate the auth method of OSM user to trust [operations/puppet] - https://gerrit.wikimedia.org/r/120045 (owner: Alexandros Kosiaris)
[18:55:03] (PS1) Ottomata: Updating wikimetrics submodule [operations/puppet] - https://gerrit.wikimedia.org/r/120056
[18:56:30] (CR) Ottomata: [C: 2 V: 2] Updating wikimetrics submodule [operations/puppet] - https://gerrit.wikimedia.org/r/120056 (owner: Ottomata)
[18:59:52] (PS1) Dzahn: decom: remove pc1-3 (pmtpa, parser cache) [operations/dns] - https://gerrit.wikimedia.org/r/120058
[19:00:31] (CR) Dzahn: [C: -1] "wait until they are actually down" [operations/dns] - https://gerrit.wikimedia.org/r/120058 (owner: Dzahn)
[19:04:18] (PS10) Ottomata: Adding archiva module and role, applying on titanium [operations/puppet] - https://gerrit.wikimedia.org/r/117024
[19:07:13] (PS11) Ottomata:
Adding archiva module and role, applying on titanium [operations/puppet] - https://gerrit.wikimedia.org/r/117024
[19:10:54] (PS1) Dzahn: decom: remove snapshot1-4 [operations/dns] - https://gerrit.wikimedia.org/r/120060
[19:11:36] (CR) Dzahn: [C: -1] "wait until they are actuall shut down and gone from puppet" [operations/dns] - https://gerrit.wikimedia.org/r/120060 (owner: Dzahn)
[19:14:34] akosiaris: could I poke ya on this one?
[19:14:35] https://gerrit.wikimedia.org/r/#/c/117024/
[19:14:44] particularly the rsync daemon needs review I htink
[19:14:48] in gitfat.pp
[19:17:47] (PS12) Ottomata: Adding archiva module and role, applying on titanium [operations/puppet] - https://gerrit.wikimedia.org/r/117024
[19:22:51] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC
[19:23:37] !log torrus was broken. going through fix per https://wikitech.wikimedia.org/wiki/Torrus#Deadlock_problem
[19:23:42] Logged the message, Master
[19:37:48] (CR) Yurik: "Left one more comment, but otherwise looks good." (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/119990 (owner: Reedy)
[19:44:29] (PS1) Dzahn: decom: remove pmtpa search pools and searchidx2 [operations/dns] - https://gerrit.wikimedia.org/r/120062
[19:45:39] (PS2) Dzahn: decom: remove pmtpa search pools and searchidx2 [operations/dns] - https://gerrit.wikimedia.org/r/120062
[19:46:45] (CR) Chad: [C: 1] "lgtm." [operations/dns] - https://gerrit.wikimedia.org/r/120062 (owner: Dzahn)
[19:47:18] (CR) Dzahn: "11.1.2.10.in-addr.arpa domain name pointer search-pool1.svc.pmtpa.wmnet. etc..." [operations/dns] - https://gerrit.wikimedia.org/r/120062 (owner: Dzahn)
[19:52:31] could any op have a look at https://gerrit.wikimedia.org/r/119634 ? :/
[19:52:48] Easy peasy code review...
[19:54:06] (PS1) Dzahn: decom Tampa: remove service IPs [operations/dns] - https://gerrit.wikimedia.org/r/120063
[19:59:54] (CR) Hashar: "Can anyone visit this patch, try out the git buildpackage and ideally get this change merged in? It makes it possible for Jenkins to buil" [operations/debs/vips] - https://gerrit.wikimedia.org/r/113098 (owner: Hashar)
[20:04:16] (PS2) Hashar: Move math related packages to a puppet class [operations/debs/wikimedia-task-appserver] - https://gerrit.wikimedia.org/r/115135
[20:04:28] (CR) Hashar: "Trivial rebase." [operations/debs/wikimedia-task-appserver] - https://gerrit.wikimedia.org/r/115135 (owner: Hashar)
[20:05:44] (CR) Dzahn: [C: 2] "not that familiar with toollabs, but i'll do it since this is indeed exactly what is already on the bastion hosts, so same thing on exec n" [operations/puppet] - https://gerrit.wikimedia.org/r/119634 (owner: Hoo man)
[20:06:40] thanks :)
[20:07:14] (PS5) Hashar: Describe Math related packages in a class [operations/puppet] - https://gerrit.wikimedia.org/r/115133
[20:07:27] (CR) Hashar: "rebased." [operations/puppet] - https://gerrit.wikimedia.org/r/115133 (owner: Hashar)
[20:13:48] (PS1) Dzahn: remove 'labsudb' IPs and mgmt [operations/dns] - https://gerrit.wikimedia.org/r/120065
[20:17:44] (Abandoned) Hashar: adjust jobrunner/videoscaler role for beta [operations/puppet] - https://gerrit.wikimedia.org/r/77034 (owner: Hashar)
[20:18:37] (Abandoned) Hashar: /find_tabs.sh to find puppet manifest using tabs [operations/puppet] - https://gerrit.wikimedia.org/r/108018 (owner: Hashar)
[20:19:22] (PS1) coren: Tool Labs: tweak to mail config [operations/puppet] - https://gerrit.wikimedia.org/r/120069
[20:20:17] (CR) coren: [C: 2] "Those were planned, but never used."
[operations/dns] - https://gerrit.wikimedia.org/r/120065 (owner: Dzahn)
[20:20:50] (CR) coren: [C: 2] Tool Labs: tweak to mail config [operations/puppet] - https://gerrit.wikimedia.org/r/120069 (owner: coren)
[20:23:42] (Abandoned) Hashar: mediawiki: stop timidity only once it got installed [operations/puppet] - https://gerrit.wikimedia.org/r/115618 (owner: Hashar)
[20:33:16] (CR) Dzahn: "puppet-lint | grep -i "tab char"" [operations/puppet] - https://gerrit.wikimedia.org/r/108018 (owner: Hashar)
[20:37:37] (CR) Hashar: "mutante> puppet-lint | grep -i "tab char"" [operations/puppet] - https://gerrit.wikimedia.org/r/108018 (owner: Hashar)
[20:37:41] thanks mutante :]
[20:41:07] yw:)
[20:41:40] (PS5) BryanDavis: Make a self-hosted puppetmaster workalike for puppet::geoip [operations/puppet] - https://gerrit.wikimedia.org/r/119555
[20:41:51] (CR) BryanDavis: [C: 1] Make a self-hosted puppetmaster workalike for puppet::geoip [operations/puppet] - https://gerrit.wikimedia.org/r/119555 (owner: BryanDavis)
[20:42:20] (PS3) BryanDavis: Ensure that status is always defined in deploy.checkout [operations/puppet] - https://gerrit.wikimedia.org/r/119232
[20:42:48] (CR) BryanDavis: "Ryan Lane: Any suggestion on a good error number to return?" [operations/puppet] - https://gerrit.wikimedia.org/r/119232 (owner: BryanDavis)
[20:49:39] (CR) Ottomata: [C: 2 V: 2] Make a self-hosted puppetmaster workalike for puppet::geoip [operations/puppet] - https://gerrit.wikimedia.org/r/119555 (owner: BryanDavis)
[20:50:58] Thanks ottomata. That brings us down to only 2 local hacks for beta.eqiad puppet :)
[20:55:47] ^d: now i wonder why searchidx2 is even still alive, i thought we ... oh well.. was there a reason in the past to keep that longer ?
[20:58:14] <^d> nope.
[20:58:36] k, thanks!
[20:59:28] (PS1) Dzahn: decom: remove searchidx2 [operations/puppet] - https://gerrit.wikimedia.org/r/120145
[21:10:02] (PS2) Dzahn: decom: remove searchidx2 [operations/puppet] - https://gerrit.wikimedia.org/r/120145
[21:18:12] :( git browser doesn't work - clicking "HEAD" at the top gives internal error: http://git.wikimedia.org/blob/operations%2Fpuppet.git/f13f911e1f8f85d00fe803eae41aa93d42ed291c/templates%2Fvarnish%2Fzero.inc.vcl.erb
[21:19:10] <^d> same link wfm :\
[21:19:34] ^d, the link works, but if you click "HEAD" at the very top
[21:19:39] it gives an error
[21:19:48] <^d> Oh dur, yes.
[21:19:52] <^d> That's a known bug.
[21:20:00] <^d> puppet doesn't have a master branch, just production.
[21:20:06] <^d> gitblit's silly about it :\
[21:20:32] HEAD is not connected to the branch name, its just the top of the current branch, isn't it?
[21:22:08] <^d> yurik: In git-speak, yes. but see my previous comment.
[21:22:13] <^d> "gitblit's silly about it :\"
[21:22:16] :D
[21:22:26] bummer
[21:22:32] thx for explaining though!
[21:22:33] <^d> There's a bug somewhere in BZ for it.
[21:22:36] <^d> yw.
[21:26:10] Hm.. what is the supposed 500P and 60P spike on the network a few hours ago?
[21:26:13] http://ganglia.wikimedia.org/latest/?m=cpu_report&r=hour&s=by%20name&hc=4&mc=2
[21:26:22] (the "Wikimedia Grid Network" last hour)
[21:26:36] surely we didn't transfer 500 petabyte in a few minutes times?
[21:26:43] Or did we?
[21:27:38] Came from MySQL eqiad it seems
[21:30:40] (CR) Yurik: "Documentation has been updated at https://wikitech.wikimedia.org/wiki/X-Analytics" [operations/puppet] - https://gerrit.wikimedia.org/r/119795 (owner: Yurik)
[21:32:59] es1010.eqiad to be exact
[21:35:17] http://ganglia.wikimedia.org/latest/graph_all_periods.php?h=es1010.eqiad.wmnet&m=network_report&g=network_report&z=large&c=MySQL%20eqiad
[21:42:51] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 03:41:16 PM UTC
[22:20:24] (PS1) Odder: Enable Import on Spanish Wikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/120152
[22:23:51] PROBLEM - Puppet freshness on labstore2 is CRITICAL: Last successful Puppet run was Fri 21 Mar 2014 01:17:26 AM UTC
[22:33:23] (PS1) Odder: Add new user groups to Spanish Wikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/120153
[23:20:45] (CR) Tim Landscheidt: "@Hashar: Could you set up puppet-lint as non-voting for operations/puppet?" [operations/puppet] - https://gerrit.wikimedia.org/r/108018 (owner: Hashar)
[23:29:48] (CR) Rush: let bastion hosts have base::firewall (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/96424 (owner: Dzahn)
[23:47:58] (CR) Dzahn: let bastion hosts have base::firewall (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/96424 (owner: Dzahn)