[00:13:27] ebernhardson: $.proxy
[00:24:42] hi, I want to request shell access. I was following https://wikitech.wikimedia.org/wiki/Requesting_shell_access but I'm not sure if there's a way to check if I have an RT account or not. I didn't get any message after submitting the password reset form so I'd assume I don't have one.
[00:25:18] You can just email ops-requests
[00:25:56] ops-requests@wikimedia.org? OK, I'll update that wiki page
[00:34:05] thanks Reedy, sent
[00:41:43] (PS1) Gage: Enable varnishkafka on bits caches [operations/puppet] - https://gerrit.wikimedia.org/r/109800
[00:41:58] jgage_: \o/
[00:42:50] woo, a lot of reading and a tiny change
[00:42:53] sorry that took so long
[00:42:58] at least now i understand what i'm doing
[00:53:17] <^d> Friendly reminder: gerrit coming down in about ~40mins for update.
[00:55:05] (PS1) Springle: repool db1006, warm up [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109804
[00:56:04] (CR) Springle: [C: 2] repool db1006, warm up [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109804 (owner: Springle)
[00:56:28] !log ori synchronized php-1.23wmf11/extensions/WikimediaEvents 'Ifc697cbe6: Revert I829790cd5, removing module storage logging'
[00:56:35] Logged the message, Master
[00:56:56] !log springle synchronized wmf-config/db-eqiad.php 'repool db1006, warm up'
[00:57:00] greg-g: FYI we scrubbed enabling Flow on enwiki today. Instead just updating Flow in wmf11 on mediawiki.org
[00:57:06] Logged the message, Master
[00:57:33] Did Flow go out?
[00:57:54] Gloria: no, not on enwiki
[00:58:29] spage: I'm looking at https://wikitech.wikimedia.org/wiki/Deployments now. Has it been rescheduled or is it going out now(ish)?
[00:58:35] Just curious.
[00:59:24] Gloria yeah I'll update it.
[01:05:34] !log ori synchronized php-1.23wmf10/extensions/WikimediaEvents 'Ifc697cbe6: Revert I829790cd5, removing module storage logging'
[01:05:46] Logged the message, Master
[01:10:06] spage: Thanks. :-)
[01:11:48] greg-g: https://wikitech.wikimedia.org/wiki/Deployments still has "Week of January 20" on it, BTW.
[01:18:42] !log ori synchronized php-1.23wmf11/extensions/WikimediaEvents
[01:18:51] Logged the message, Master
[01:19:07] !log ori synchronized php-1.23wmf10/extensions/WikimediaEvents
[01:19:13] Logged the message, Master
[01:26:48] !log bsitu synchronized php-1.23wmf11/extensions/Flow 'Update Flow with some special contribs cherry-picks'
[01:26:56] Logged the message, Master
[01:31:42] So we deployed some fixes to Flow changes in Special:Contributions to mediawiki.org.
[01:36:51] <^d> !log gerrit down for upgrade
[01:37:00] Logged the message, Master
[01:39:43] PROBLEM - gerrit process on ytterbium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^GerritCodeReview .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war
[01:39:50] gj icinga-wm
[01:40:18] you can disable alerts in icinga-admin.wikimedia.org , though it may require ops
[01:40:55] I suspect it wont be as bad as the puppet spam ;)
[01:42:04] <^d> It'll be back soon enough.
[01:42:10] <^d> database already upgraded.
[01:42:21] <^d> re-running puppet now, then will bring service back up
[01:42:43] RECOVERY - gerrit process on ytterbium is OK: PROCS OK: 1 process with regex args ^GerritCodeReview .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war
[01:43:09] <^d> !log gerrit back up, running v2.8.1 stable now
[01:43:19] Logged the message, Master
[01:43:57] <^d> spage: Definitely has the old change screen as the default. You can turn the ugly thing on if you'd like.
[01:44:03] <^d> (although I don't know why you would :))
[02:20:23] PROBLEM - puppetmaster https on virt0 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:25:45] !log LocalisationUpdate completed (1.23wmf10) at 2014-01-28 02:25:44+00:00
[02:25:53] PROBLEM - HTTP on virt0 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:28:25] andrewbogott: ^
[02:29:03] lol, wikitech
[02:44:52] (PS1) Springle: prepare for rotation of s6 master from db1027 to db1023 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109824
[02:47:06] virt0 /a full
[02:47:27] dejavu
[02:48:35] yeah
[02:48:45] * springle deletes two oldest backups
[02:49:43] RECOVERY - HTTP on virt0 is OK: HTTP OK: HTTP/1.1 302 Found - 457 bytes in 4.237 second response time
[02:50:02] springle: Thanks!
[02:52:03] !log LocalisationUpdate completed (1.23wmf11) at 2014-01-28 02:52:02+00:00
[02:52:15] Logged the message, Master
[02:53:13] RECOVERY - puppetmaster https on virt0 is OK: HTTP OK: Status line output matched 400 - 336 bytes in 4.421 second response time
[02:53:34] !log wikitech /a full again as per Ryan Lane email to ops@ on 2014-01-14. Deleted two oldest backup sets *2014012[01]*.
[02:53:42] Logged the message, Master
[02:54:54] (CR) Springle: [C: 2] prepare for rotation of s6 master from db1027 to db1023 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109824 (owner: Springle)
[02:55:38] scfc_de: back, looking...
[02:55:45] oh, and now it's recovered
[02:55:46] Sean just fixed it ;)
[02:55:57] well it isn't fixed
[02:56:00] springle: disk full?
[02:56:01] it needs attention
[02:56:10] yes. backups keep overflowing
[02:56:15] ok
[02:56:17] You fixed wikitech
[02:57:54] !log springle synchronized wmf-config/db-eqiad.php 'db1006 to LB 400.
prep db1027 for s6 master rotation'
[02:58:01] Logged the message, Master
[03:09:19] (PS1) Springle: rotate s6 master, demote db1027, promote db1023 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109828
[03:09:43] PROBLEM - mysqld processes on db1027 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
[03:09:57] (CR) Springle: [C: 2] rotate s6 master, demote db1027, promote db1023 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109828 (owner: Springle)
[03:10:04] (Merged) jenkins-bot: rotate s6 master, demote db1027, promote db1023 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109828 (owner: Springle)
[03:11:14] !log springle synchronized wmf-config/db-eqiad.php 'rotate s6 master, demote db1027, promote db1023'
[03:11:21] Logged the message, Master
[03:11:43] PROBLEM - MySQL Replication Heartbeat on db1015 is CRITICAL: CRIT replication delay 307 seconds
[03:11:43] PROBLEM - MySQL Replication Heartbeat on db1006 is CRITICAL: CRIT replication delay 307 seconds
[03:11:45] !log springle synchronized wmf-config/db-pmtpa.php 'rotate s6 master, demote db1027, promote db1023'
[03:11:52] (CR) Andrew Bogott: [C: 2] set time intervals properly [operations/puppet] - https://gerrit.wikimedia.org/r/109701 (owner: Cmcmahon)
[03:11:52] Logged the message, Master
[03:11:53] PROBLEM - MySQL Replication Heartbeat on db1010 is CRITICAL: CRIT replication delay 321 seconds
[03:11:55] say what now
[03:12:13] PROBLEM - MySQL Replication Heartbeat on db1022 is CRITICAL: CRIT replication delay 334 seconds
[03:12:33] PROBLEM - MySQL Replication Heartbeat on db1023 is CRITICAL: CRIT replication delay 355 seconds
[03:13:31] false positive
[03:13:38] * springle phew
[03:20:45] springle, the full volume on virt0 was /a?
[03:20:55] andrewbogott: correct
[03:23:08] (PS1) Springle: update coredb topology after s6 master rotation [operations/puppet] - https://gerrit.wikimedia.org/r/109831
[03:23:08] crontab has
[03:23:08] # Puppet Name: backup-cleanup
[03:23:09] 0 3 * * * find /a/backup -type f -mtime +7 -delete
[03:23:18] springle, maybe I should just change that to +4
[03:23:30] is 4 enough?
[03:23:30] Or does that strike you as an embarrassing half-measure?
[03:23:34] who decides that
[03:23:58] well… I don't really know what that backup is for anyway. Guess I should figure that out
[03:24:28] Ah, it's backups of all the openstack bits
[03:24:29] (CR) Springle: [C: 2] update coredb topology after s6 master rotation [operations/puppet] - https://gerrit.wikimedia.org/r/109831 (owner: Springle)
[03:29:18] (PS1) Andrew Bogott: Save backups for 4 days rather than 7. [operations/puppet] - https://gerrit.wikimedia.org/r/109833
[03:32:00] (CR) Andrew Bogott: [C: 2] "Better than an outage."
[operations/puppet] - https://gerrit.wikimedia.org/r/109833 (owner: Andrew Bogott)
[03:32:36] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-28 03:32:36+00:00
[03:32:45] Logged the message, Master
[03:40:50] (PS1) Ori.livneh: retire logmsgbot from #wikimedia-dev [operations/puppet] - https://gerrit.wikimedia.org/r/109836
[03:41:06] (PS2) Ori.livneh: retire logmsgbot from #wikimedia-dev [operations/puppet] - https://gerrit.wikimedia.org/r/109836
[03:41:19] (CR) Ori.livneh: [C: 2 V: 2] retire logmsgbot from #wikimedia-dev [operations/puppet] - https://gerrit.wikimedia.org/r/109836 (owner: Ori.livneh)
[03:42:14] just needs a puppet run on neon now
[03:51:38] (PS1) Yurik: Added an opera-only partner [operations/puppet] - https://gerrit.wikimedia.org/r/109838
[03:52:16] (PS1) Springle: update cname after s6 master rotation [operations/dns] - https://gerrit.wikimedia.org/r/109839
[03:52:41] (PS1) Ori.livneh: tcpircbot: allow 'channels' to be either a string or an array [operations/puppet] - https://gerrit.wikimedia.org/r/109840
[03:52:58] (CR) Springle: [C: 2] update cname after s6 master rotation [operations/dns] - https://gerrit.wikimedia.org/r/109839 (owner: Springle)
[03:54:54] RECOVERY - MySQL Replication Heartbeat on db1010 is OK: OK replication delay -0 seconds
[03:55:04] RECOVERY - MySQL Replication Heartbeat on db1022 is OK: OK replication delay -0 seconds
[03:55:34] RECOVERY - MySQL Replication Heartbeat on db1023 is OK: OK replication delay -1 seconds
[03:55:54] RECOVERY - MySQL Replication Heartbeat on db1015 is OK: OK replication delay -0 seconds
[03:55:54] RECOVERY - MySQL Replication Heartbeat on db1006 is OK: OK replication delay -0 seconds
[03:56:20] (PS2) Ori.livneh: tcpircbot: allow 'channels' to be either a string or an array [operations/puppet] - https://gerrit.wikimedia.org/r/109840
[03:58:25] (PS3) Ori.livneh: tcpircbot: allow 'channels' to be either a
string or an array [operations/puppet] - https://gerrit.wikimedia.org/r/109840
[03:58:34] (PS4) Ori.livneh: tcpircbot: allow 'channels' to be either a string or an array [operations/puppet] - https://gerrit.wikimedia.org/r/109840
[03:58:40] (CR) Ori.livneh: [C: 2 V: 2] tcpircbot: allow 'channels' to be either a string or an array [operations/puppet] - https://gerrit.wikimedia.org/r/109840 (owner: Ori.livneh)
[04:17:02] (PS1) Springle: depool db1040 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109842
[04:17:20] (CR) Springle: [C: 2] depool db1040 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109842 (owner: Springle)
[04:17:26] (Merged) jenkins-bot: depool db1040 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/109842 (owner: Springle)
[04:18:23] !log springle synchronized wmf-config/db-pmtpa.php 'depool db1040 for schema changes'
[04:18:30] Logged the message, Master
[04:18:56] !log springle synchronized wmf-config/db-eqiad.php 'depool db1040 for schema changes'
[04:19:03] Logged the message, Master
[04:58:44] RobH: I'm having trouble accessing https://wikitech with curl/wget and andrewbogott said you're working on a "proper" certificate (apparently https://rt.wikimedia.org/Ticket/Display.html?id=6592). Do you have an ETA when that will be available?
[05:40:43] ottomata1: so
[05:40:54] ottomata1: https://github.com/trebuchet-deploy
[05:41:06] I've been slowly working everything into the upstream
[05:41:40] trebuchet itself is still in kind of a grey zone of being correctly up to date in puppet and not on github
[05:42:03] Gloria: yeah re jan 20th, just didn't have any time to archive it today, will do
[05:42:15] or, "it's a wiki" ;)
[05:42:27] :D
[05:42:28] ottomata1: you'll find the trebuchet code in puppet in the "deployment" module
[05:42:41] greg-g: Do you know if Flow has been rescheduled?
[05:44:31] not yet
[05:44:43] ottomata1: I'll try to get that code properly in github and in launchpad ppas
[05:59:17] Man, when I see that a puppet module has all its logic in .rb rather than .pp I'm immediately suspicious.
[05:59:30] Ryan_Lane: Speaking of puppet-nova here :/
[07:06:28] (PS1) Ori.livneh: Add VisualEditor timing data reporter [operations/puppet] - https://gerrit.wikimedia.org/r/109848
[07:43:41] (PS1) Matanya: emery: remove one udp2log logger. [operations/puppet] - https://gerrit.wikimedia.org/r/109849
[08:07:15] (CR) Ori.livneh: [C: 2] Add VisualEditor timing data reporter [operations/puppet] - https://gerrit.wikimedia.org/r/109848 (owner: Ori.livneh)
[08:22:14] (PS4) Yuvipanda: Deploy Extension:MobileApp to betalabs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/106217
[08:22:34] (CR) Hashar: [C: 2] Deploy Extension:MobileApp to betalabs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/106217 (owner: Yuvipanda)
[08:22:42] (Merged) jenkins-bot: Deploy Extension:MobileApp to betalabs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/106217 (owner: Yuvipanda)
[08:22:47] hashar: wooo!
[09:01:16] (CR) Alexandros Kosiaris: "@ottomata" [operations/puppet] - https://gerrit.wikimedia.org/r/109316 (owner: Ottomata)
[09:16:52] (CR) Faidon Liambotis: "Why a submodule?"
[operations/puppet] - https://gerrit.wikimedia.org/r/109316 (owner: Ottomata)
[09:17:42] paravoid: because it is also being used in Mediawiki-Vagrant, IIRC
[09:17:51] (CR) Faidon Liambotis: [C: -1] "w" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/109838 (owner: Yurik)
[09:22:57] (CR) Dzahn: [C: 2] "icinga host group was erroneously still named "pmtpa search servers (lucene)" but now only contains eqiad servers" [operations/puppet] - https://gerrit.wikimedia.org/r/109635 (owner: Dzahn)
[09:25:29] (CR) Alexandros Kosiaris: [C: 2] network: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109732 (owner: Matanya)
[09:35:39] (PS2) Yurik: Zero: Added 655-12 [operations/puppet] - https://gerrit.wikimedia.org/r/109838
[09:36:37] andrewbogott: hi, i didn't really understand why you merged the cron change (the first one) yesterday, did i miss something?
[09:37:13] matanya: Um… I merged without thinking that hard about it/figuring it was mostly harmless
[09:37:19] That's correct now in any case, right?
[09:37:37] (And, actually, I tried to get puppet-lint to complain about the lack of quotes and it didn't. But maybe a problem with my methodology.)
[09:37:51] yes, i just thought you saw something i missed
[09:38:20] regarding quotes, not lint warning, just a style thing I (we?) do
[09:38:46] I should write a draft style guide, shouldn't i?
[09:39:45] https://wikitech.wikimedia.org/wiki/Puppet_coding#Coding_Style isn't enough i fear
[09:41:36] akosiaris: i would much appreciate if you can review my site lint patch. i know it is a touchy one
[09:42:47] matanya, if your style guide isn't too long you can shoehorn it into the 'Format' section...
[09:42:57] matanya: yeah i 'll try
[09:43:28] andrewbogott: i need to get OPS ok before deciding such thing on my own
[09:43:34] thanks akosiaris
[09:43:45] matanya: true, you can write a draft someplace else and we can discuss adding it to that page
[09:43:54] ok
[09:45:23] (PS3) Faidon Liambotis: Zero: add carrier 655-12 to Varnish config [operations/puppet] - https://gerrit.wikimedia.org/r/109838 (owner: Yurik)
[09:45:34] (CR) Faidon Liambotis: [C: 2 V: 2] Zero: add carrier 655-12 to Varnish config [operations/puppet] - https://gerrit.wikimedia.org/r/109838 (owner: Yurik)
[09:48:32] (CR) Faidon Liambotis: [C: -1] coredb_mysql: puppet 3 compatibility fix: fully qualify variable (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/108313 (owner: Matanya)
[09:50:12] (CR) Faidon Liambotis: [C: -1] coredb_mysql: puppet 3 compatibility fix: fully qualify variables (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/108488 (owner: Matanya)
[09:51:02] (CR) Faidon Liambotis: "Please do not rebase needlessly. It doesn't make a difference for review purposes, and we can always rebase right before the merge."
[operations/puppet] - https://gerrit.wikimedia.org/r/90733 (owner: Physikerwelt)
[09:52:14] (PS2) Matanya: coredb_mysql: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/108313
[09:52:24] (PS3) Faidon Liambotis: ldap: puppet 3 compatibility, fix variable name typo [operations/puppet] - https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[09:52:34] (CR) Faidon Liambotis: [C: 2] ldap: puppet 3 compatibility, fix variable name typo [operations/puppet] - https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[09:52:36] (CR) jenkins-bot: [V: -1] ldap: puppet 3 compatibility, fix variable name typo [operations/puppet] - https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[09:52:54] pfft, needs rebase
[09:53:10] doing paravoid
[09:53:19] (PS2) Dzahn: remove professor from site.pp [operations/puppet] - https://gerrit.wikimedia.org/r/109285
[09:53:54] mutante: feel free to proceed with that without CR, as long as you follow the decom process :)
[09:54:44] (CR) ArielGlenn: [C: 2] remove professor from site.pp [operations/puppet] - https://gerrit.wikimedia.org/r/109285 (owner: Dzahn)
[09:54:45] paravoid: better do the change in another patch, my ldap-lint stuff changed it a lot
[09:54:57] paravoid: yea, thanks for yours and ori's trust, be i just talked to apergos about the "follow the decom process"-part, especially the right order and the non-gerrit part like cleaning certs, salt keys
[09:55:03] matanya: keep the same Change-Id
[09:55:07] i will
[09:55:09] matanya: and it will get under the same gerrit patchset
[09:57:42] (PS4) Matanya: ldap: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/107823
[09:57:49] done paravoid
[09:58:09] no, not good
[09:58:28] (CR) jenkins-bot: [V: -1] ldap: puppet 3 compatibility fix: fully qualify variable [operations/puppet] -
https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[09:59:02] (PS5) Matanya: ldap: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/107823
[09:59:36] (CR) Faidon Liambotis: [C: 2] ldap: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/107823 (owner: Matanya)
[10:00:11] (PS1) Andrew Bogott: ::qualify the global openstack_version [operations/puppet] - https://gerrit.wikimedia.org/r/109858
[10:00:46] (CR) jenkins-bot: [V: -1] ::qualify the global openstack_version [operations/puppet] - https://gerrit.wikimedia.org/r/109858 (owner: Andrew Bogott)
[10:01:50] andrewbogott: and abandon https://gerrit.wikimedia.org/r/#/c/97007 presumably?
[10:02:21] Oh, I don't know, maybe that one's better!
[10:02:23] * andrewbogott reads
[10:05:44] paravoid, apergos, I pretty much don't understand about the class vs. global question in that change. In the style guide we have 'Role classes are never parameterized, and are only configured via globals' which I wrote and maybe didn't think about enough...
[10:06:07] Does it not work to have site.pp define $foo and then refer to it later as $::foo?
[10:06:34] not if it's in a node definition, that's not top level
[10:07:55] So the whole idea of configuring a role via a global… impossible? Or just ugly due to ambiguous scope?
[10:08:15] Wrapping each var in a class seems like a whole lot of work
[10:08:26] (and can't really be done from labsconsole either atm)
[10:08:48] paravoid: do you want to take https://gerrit.wikimedia.org/r/#/c/83768/ until we made some progress on a more general approach in https://gerrit.wikimedia.org/r/#/c/107831/ , that might take a bit longer though, if the simpler one is bad though i'll abandon it in favor of the other one
[10:08:50] yeah I don't know what a good approach is
[10:09:03] the second one was kind of a reply to your comment on the first one
[10:09:06] but the dynamic scope thing means you can't do it that way in puppet3
[10:09:30] (CR) Faidon Liambotis: [C: 2] move {jobs,careers}.wikimedia.org to redirects.dat [operations/apache-config] - https://gerrit.wikimedia.org/r/106108 (owner: Jeremyb)
[10:11:12] (CR) Faidon Liambotis: [C: -1] "I'd prefer the opposite: trailing slashes everywhere." [operations/apache-config] - https://gerrit.wikimedia.org/r/106110 (owner: Jeremyb)
[10:11:35] I was just looking for some way to avoid dynamic scope, there must be better ones
[10:11:42] apergos: Tell me more about 'the dynamic scope thing'? Does that mean no globals ever?
[10:12:08] you have to give the full reference to the var name
[10:12:59] e.g. $::foo
[10:13:02] is that a full reference?
[10:13:08] yes but you can't assign to it
[10:13:17] ariables can only be assigned using their short name
[10:13:21] *variables
[10:13:28] you can retrieve from it
[10:13:49] ok, so then what's an example of assigning and subsequently retrieving?
[10:14:42] if you are in role/foo.pp and you want it to use class also called foo, but which is from modules/foo/manifests/init.pp (which has to have that name by module convention), then you'd have to use it like class { '::foo': parameter => in the role class
[10:14:52] Or is the answer to that 'it can only be done via parameters'?
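For readers following the scoping thread, the role-versus-module naming case described at 10:14:42 can be sketched roughly as below. This is an illustration only: the module class `foo`, its `parameter`, and the value passed in are hypothetical placeholders taken from the wording of that message, not code from the operations/puppet repository.

```puppet
# modules/foo/manifests/init.pp
# Hypothetical module class; the parameter name and default are placeholders.
class foo($parameter = 'default-value') {
    notice("foo configured with ${parameter}")
}

# manifests/role/foo.pp
# Role class that happens to share its last name segment with the module.
class role::foo {
    # The leading '::' anchors the class name at the top namespace.
    # Without it, 'foo' may be looked up relative to the current
    # namespace first and match role::foo itself.
    class { '::foo':
        parameter => 'value-for-this-role',
    }
}
```

The value reaches the module as an explicit class parameter, so nothing depends on the module finding a variable in the scope of whatever declared it.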
[10:14:57] common mistake was to forget the leading :: and then the role class tries to find itself
[10:15:55] (CR) Faidon Liambotis: [C: -1] "This is very nice work, kudos for doing all this!" [operations/apache-config] - https://gerrit.wikimedia.org/r/106109 (owner: Jeremyb)
[10:17:12] (Abandoned) Hashar: cleanup scap scripts, sql, etc. [operations/puppet] - https://gerrit.wikimedia.org/r/8438 (owner: Jeremyb)
[10:17:34] in search.pp lucene::server has an indexer value passed in
[10:17:37] oh shit
[10:17:38] (PS1) Dzahn: remove professor from dsh groups (decom) [operations/puppet] - https://gerrit.wikimedia.org/r/109860
[10:17:42] later in the config class there is a line if $lucene::server::indexer == true
[10:17:47] that's pretty typical
[10:17:52] (for an example)
[10:19:02] (Abandoned) Hashar: db1038 is s3 master [operations/puppet] - https://gerrit.wikimedia.org/r/90481 (owner: Springle)
[10:19:17] (Abandoned) Hashar: db1027 is s6 master [operations/puppet] - https://gerrit.wikimedia.org/r/89616 (owner: Springle)
[10:19:31] paravoid: ?
[10:19:35] (CR) Dzahn: [C: 2] "per https://wikitech.wikimedia.org/wiki/Server_Lifecycle#In_Service :)" [operations/puppet] - https://gerrit.wikimedia.org/r/109860 (owner: Dzahn)
[10:19:50] (Abandoned) Hashar: Inserted QT libraries [operations/puppet] - https://gerrit.wikimedia.org/r/96478 (owner: Petrb)
[10:19:53] nothing
[10:19:55] false alarm
[10:19:58] :) ok
[10:20:40] apergos: it isn't passed in though, right? It's globally defined?
[10:20:43] Just as a class vs. a var?
[10:21:11] (Abandoned) Hashar: more permissions/ownership hackery [operations/puppet] - https://gerrit.wikimedia.org/r/58614 (owner: Jgreen)
[10:21:20] oh, you want an example of a global var assigned and then later retrieved?
sorry, I lost track a little of what you need
[10:21:33] apergos: eh, request for command line to add to docs, if you have it handy
[10:21:36] Remove from puppet stored configuration files.
[10:21:51] i know i did it before, looks in history
[10:22:07] apergos, I think you are telling me that https://wikitech.wikimedia.org/wiki/Puppet_coding#Organization is mostly wrong and/or impossible.
[10:22:18] Which means I am immobilized, pending new instructions :)
[10:22:25] (Abandoned) Hashar: fundraising.pp file owner cleanup [operations/puppet] - https://gerrit.wikimedia.org/r/58603 (owner: Jgreen)
[10:22:40] it's on the server lifecycle page
[10:22:44] The thing you do for the above OpenStack patch works but I don't see how it's better than just using a parameter in the first place
[10:22:59] Manually run puppetstoredconfigclean.rb on the puppet master.
[10:23:01] mutante:
[10:23:03] …which, neither thing is supported by labs, which makes me sad :(
[10:23:09] andrewbogott: looking
[10:23:35] 1 on there is certainly wrong for moving to puppet3
[10:23:49] apergos: thanks, that was it, just needed reminder of the script name
[10:24:01] (adds in that place on lifecycle page)
[10:24:08] mutante: it's already there
[10:24:33] the line I pasted in here, it's from that page under the right step
[10:24:33] apergos: well, it says to "
[10:24:33] Remove from puppet stored configuration files.
[10:24:38] (Abandoned) Hashar: Send VisualEditor metrics to Ganglia via StatsD [operations/puppet] - https://gerrit.wikimedia.org/r/90464 (owner: Ori.livneh)
[10:24:38] that's a bad title
[10:24:41] as well _before_ that warning
[10:24:45] that means from the manifests
[10:24:48] These steps, once started, must be completed without interruption.
[10:25:12] apergos: duh, then i just read that part as meaning the storedconfigclean.rb
[10:25:18] apergos: ok, got it, no worries
[10:25:36] if you want to change that section title so it's clearer that's fine
[10:25:47] ok
[10:28:49] !log removed professor from dsh,puppet,puppetmaster: on palladium Killing professor.pmtpa.wmnet...done.
[10:28:52] apergos: this isn't urgent since I can just merge your patch as is. But will you take on the task of writing some docs (or at least opening a discussion) about how we should properly define the boundaries between nodes, roles, and modules?
[10:28:58] Logged the message, Master
[10:29:08] Since my attempt won't work with 3?
[10:29:39] well but the patch is considered the wrong approach
[10:29:53] so please don't merge it :-(
[10:30:00] ok...
[10:30:09] the thing is I don't know what a good approach is
[10:30:13] I guess my patch was just clarifying anyway, I can just strip out everything but the comments
[10:30:26] I suspect that the right approach is 'use hiera'
[10:30:28] I am happy to open a discussion, where were you thinking?
[10:30:32] but I don't even really know what hiera is.
[10:30:49] apergos, on wikitech-l probably.
[10:31:02] ryan has been using it at $newjob
[10:31:05] Unless you have a complete fully-formed vision in which case you can just edit the wiki by fiat
[10:31:22] no, I have a lack of vision
[10:31:34] ok I'll send an email in a bit
[10:31:36] My instinct about hiera is -- I like being able to read about things in files, like having to look inside a db less.
[10:31:43] But I may be misunderstanding how it works.
[10:31:44] Thanks.
[10:33:55] (PS5) Matanya: site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507
[10:34:16] I should read up
[10:34:18] (Abandoned) Andrew Bogott: ::qualify the global openstack_version [operations/puppet] - https://gerrit.wikimedia.org/r/109858 (owner: Andrew Bogott)
[10:34:48] !log disabled host and service checks/notifications for professor. running puppet on icinga
[10:34:55] Logged the message, Master
[10:35:10] andrewbogott: hiera would be a good approach here
[10:35:35] you will just be assigning the "data" in hiera a variable
[10:36:22] apergos: heh, disabled notifications just in time, icinga config didnt break but is now "host not found" a minute after i ACKed the services
[10:36:29] that is a good abstraction layer, and you can change the data held in hiera as much as you want as long as the variable pointing to it remains the same
[10:36:42] matanya: Is that different from a global?
[10:37:08] yes, it is held "outside" of puppet scope
[10:37:25] this prevents scoping issues
[10:37:34] I understand that it is technically different from a global, but from a code-design standpoint is it the moral equivalent of a global?
[10:37:42] Just whitewashed by the purity of 'not being code'?
[10:38:11] mostly similar, but has some diff's
[10:38:47] and is great when using stuff like passwords
[10:39:33] (CR) JanZerebecki: coredb_mysql: puppet 3 compatibility fix: fully qualify variables (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/108488 (owner: Matanya)
[10:40:28] matanya: makes sense for passwords, right now the line we draw between public and private puppet info is funny
[10:41:26] !log professor: revoked puppet cert, deleted salt minion key, powering down.
bye bye
[10:41:34] Logged the message, Master
[10:43:38] (CR) Faidon Liambotis: coredb_mysql: puppet 3 compatibility fix: fully qualify variables (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/108488 (owner: Matanya)
[10:44:27] (CR) Hashar: [C: -1] "Well funnel will redirect anything under those domains to one and only URL. So that would break any potential old links. We used to have " [operations/apache-config] - https://gerrit.wikimedia.org/r/109652 (owner: Dzahn)
[10:44:44] mutante: I guess you can get rid of the bugzilla funnel :D https://gerrit.wikimedia.org/r/#/c/109652/
[10:44:57] (CR) Matanya: coredb_mysql: puppet 3 compatibility fix: fully qualify variables (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/108488 (owner: Matanya)
[10:45:18] mutante: the SSL error for https://bugzilla.wikipedia.org/ should be fixed by pointing it to the mediawiki load balancer instead of pointing to kaulen
[10:46:08] hashar: confusing, because that's exactly what i said on that ticket in Oct 2013, but then
[10:46:11] that's fixed meanwhile, apparently:
[10:46:12] wikipedia.org zone: 73 bugs 1H IN CNAME bugs.wikimedia.org.
[10:46:35] let me look again at your comments first
[10:47:26] hashar: i did that because i looked at your example from doc.mediawiki.org to doc.wikimedia.org and there is no cert error
[10:47:37] or at least very similar thing
[10:48:12] mutante: yeah there is no cert because that points to the text load balancer and make the SSL query to terminate on the SSL proxy which have the *.mediawiki.org cert
[10:48:19] where do you see that pointed to kaulen?
[10:48:29] i just see it in wikiMedia zone
[10:48:50] apergos: a wikitech post for a simple puppet question? seriously?
[10:48:55] bugzilla.wikipedia.org. 3600 IN CNAME bugzilla.wikimedia.org.
[10:48:58] jesus
[10:49:12] andrewbogott: skim over: http://www.slideshare.net/PuppetLabs/roles-talk
[10:49:18] for the single puppet3 commit you've done?
[10:49:34] thankfully matanya isn't writing a wikitech-l mail for every puppet3 fix they are making
[10:49:56] mutante: so I guess you want to move bugzilla.wikipedia.org to text load balancer, making sure the apaches have it as a server alias
[10:50:03] paravoid: I asked him to write that post, because it's an important question. We have a style guide which is almost entirely incorrect for puppet 3.
[10:50:17] I'd like to start writing puppet 3-appropriate code as soon as possible.
[10:50:37] paravoid: i can start now :)
[10:50:47] what is it specifically on our style guide that is incorrect for puppet 3?
[10:50:55] hashar: i'm confused, that's what the ticket started out as and apache redirects have that alias since a longer time
[10:51:13] paravoid: https://wikitech.wikimedia.org/wiki/Puppet_coding#Organization
[10:51:13] but i'm also multi-tasking, so i'll look closer in a sec
[10:51:35] The interface between nodes, roles, manifests. This also matters to how the labs console handles puppet
[10:51:38] since right now it relies on globals
[10:51:46] matanya: please don't :) your amazing in volume work and lack of procrastination is highly appreciated. many kudos.
[10:51:55] andrewbogott: what specifically?
[10:52:04] can you point me to the relevant section?
[10:52:23] That is a specific section already :) Items 1,2 and 3 all mention 'globals' and none of those things will work in 3.
[10:52:28] As I understand it.
[10:52:28] mutante: the text apaches are probably already configured (haven't checked), you still need to point the DNS entry to them :-D Currently bugzilla.wikipedia.org points to kaulen which does not have the SSL cert for it (nor for *.wikipedia.org).
[10:52:51] thanks paravoid :)
[10:53:04] hashar: https://gerrit.wikimedia.org/r/#/c/108906/
[10:53:22] andrewbogott: I don't understand...
[10:53:44] what exactly is the problem? could you give an example?
[10:53:59] paravoid: ok… I think that this whole conversation happened here in this room, in response to your comment :) So [10:54:55] So my understanding is that declaring $foo=7 in site.pp and then referring to $foo in a subsequent class won't work due to how scope is handled in puppet 3. [10:55:02] that is correct [10:55:25] So, the question is, how to get node-specific config into a role class? [10:55:36] One answer is, 'make role class parameterized' but we don't do that currently, on purpose. [10:55:39] like what? [10:55:48] There are, presumably, other answers. [10:56:01] (we do that, for coredb, but it's an exception) [10:56:06] (we generally try to avoid doing so) [10:56:17] that should be removed [10:56:30] paravoid: if you are indirectly arguing that role classes should always be self-sufficient and not take params from outside, well, that would be a valid response to this question. [10:56:50] for openstack_version, the right way to do it would be to have a role class called "role::labs::eqiad" or something like that [10:57:06] or role::labs::node, that has a conditional on $::site within it [10:57:20] the latter being better [10:57:29] role classes exist specifically for /not/ being generic [10:57:31] So, ok, you are advocating for a possible answer to this question. And I support the advocating of answers! [10:57:41] like for example, we have role::authdns::ns0, role::authdns::ns1, role::authdns::ns2 [10:57:47] but you having an answer is not the same thing as thinking the question is dumb or trivial. [10:57:56] There are plenty of vars defined in site.pp currently. [10:58:04] And labsconsole allows the configuration of instances that way [10:58:08] yup, we're working on fixing this [10:58:08] And the style guide encourages it. [10:58:14] where does it encourage it? [10:58:31] Use of global variables within non-role manifests or templates is highly discouraged. A few actually global settings (e.g. 
$::site, $::realm) will crop up, but we probably don't need any new ones. [10:58:52] So… I wrote that. [10:58:56] non-role manifests. [10:59:01] good :) [10:59:04] As opposed to role manifests, which 4. Role classes are never parameterized, and are only configured via globals. [10:59:17] and 1. Nodes defined in site.pp can include role classes and define global variables. [10:59:23] and I didn't say the question is dumb, fwiw [10:59:35] ok, sorry [11:00:10] that said, i do think we should move as much 'data' out of code and use hiera. [11:00:21] Anyway -- if you have a definite, right answer for how this should be managed, then, that's great! Apergos and I just now had a long discussion about this and neither of us did. [11:00:38] using heira strikes me as a different solution from 'fully self-sufficient roles', yes? [11:00:46] apergos has largely ignored the puppet3 effort for months now, and has a single commit about it, and takes it up on wikitech-l to discuss that single one commit, while matanya for example has been doing all of https://etherpad.wikimedia.org/p/Puppet3 without too much fuss [11:00:53] this is what annoys me, to be clear. [11:01:26] (while apergos had commited to help with the puppet3 migration many weeks ago) [11:01:34] to be clear, I asked multiple times, every few days, someone (I am not going to name names because this is not about blaming them) for a review of that one change, so that I could knowo if that was the right approach before proceeding [11:01:38] I never got an answer [11:01:52] that went on for several weeks. [11:01:59] Yeah, and we just now discussed it at lenght in this channel and no one piped in with an answer... [11:02:03] Mailing list seems like the right next step [11:02:08] that is not the same as ignoring the effort. [11:02:39] apergos: so you started with a single commit from a non-trivial case, got no response to reviews, and decided to stop altogether? 
[11:02:53] when you weighed in, you told me basically to leave it alone and let other people refactor it, but that still did not tell me what the right approach is, which I asked you then too [11:02:59] I'm sorry, but https://gerrit.wikimedia.org/r/#/q/owner:%22ArielGlenn+%253Cariel%2540wikimedia.org%253E%22,n,z vs. https://gerrit.wikimedia.org/r/#/q/owner:%22Matanya+%253Cmatanya%2540foss.co.il%253E%22,n,z [11:03:06] something is wrong in this picture, isn't it? [11:03:07] from that I get the very clear conclusion: stay the heck out [11:03:32] aside from who did or didn't do something, can we look at the right approach? [11:04:12] 3 commits in 40 days, one of which is a revert; I think that speaks for itself [11:04:24] * apergos gives up [11:04:31] (and that isn't a far compare, i do many lint's, those are not functional changes) [11:06:51] anyway, to the point: the openstack classes were wrong for using a global variable for configuring the openstack version [11:06:55] anyhow, to the point, I think using module --> roles::subrole --> include in site.pp is the right way [11:07:03] OK, so: I would love it if someone would write a new puppet 3 style-guide so that I can start writing puppet3-enabled code going forward. [11:07:13] that should be a class parameter for the openstack classes, which the role class (but not site.pp) will define to 'folsom' [11:07:27] I think that matanya has already offered to start working on that, which is great. [11:07:31] that's how 90% of our manifests are written anyway [11:08:10] I spoke from a coding style prespective, but i can expand it to this too , if approved [11:08:44] and i still support using hiera [11:08:54] hiera is great, but not for this purpose [11:08:55] As usual, I don't care much what the standard is, only that there be a standard. 
[11:09:14] also, I don't think it makes sense for us to bother with hiera before the puppet 3 upgrade [11:09:19] so this would deadlock that [11:09:26] And it sounds like whatever the standard is, it will be incompatible with how labs currently works, so… that will go on my list I guess :( [11:09:33] paravoid: it can simplfy things after puppet 3 fail over [11:09:49] I don't think "openstack_version" is a good candidate for hiera [11:09:50] that is for sure andrewbogott [11:09:56] no, it is not paravoid [11:10:36] (03PS1) 10Dzahn: point bugs. and bugzilla. within wikiPedia to -lb [operations/dns] - 10https://gerrit.wikimedia.org/r/109868 [11:10:36] the solution to this doesn't need a style guide update imvho, just making it the openstack classes behave like most other classes in our repo [11:10:41] but things like $lucene_oai_pass it is [11:10:50] Yeah, in this case it's simple to make per-version role classes. It just wasn't obvious to me until /just now/ that that was the 'correct' approach. [11:10:51] but if people would like to have it explicitly documented in the style guide, I wouldn't mind either [11:12:59] andrewbogott , paravoid :and as i side point, the silde i linked above is the right approach in my opinion (http://www.slideshare.net/PuppetLabs/roles-talk) [11:14:19] * andrewbogott bookmarks [11:14:44] (03CR) 10Dzahn: "indeed, thanks Hashar, yea, in mediawiki.org zone this already works, and the Apache cluster already has these aliases in redirects.conf" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/109652 (owner: 10Dzahn) [11:16:35] execpt for the inheritance part [11:17:03] so they've split what we call "roles" into "roles" and "profiles", correct? 
[11:17:15] and enforce a one-to-one relationship between nodes & roles [11:18:02] yes [11:19:11] which makes it much easier for newcomers and when trying to remember what the hell is going on on this node, after you haven't looked at it for some months [11:19:26] it's not very different to what we do I think [11:19:40] I wouldn't mind discussing the split of roles & profiles, but I think it may be a bit premature [11:19:49] we still have too many on-going migrations [11:20:04] it'd be nice to finish up the modularization & puppet 3/no global variables processes [11:20:04] i agree, this is my future vision part [11:20:14] then start thinking about ways to further improve [11:21:56] (Abandoned) Dzahn: funnel instead of redirect historic Bugzilla URLs [operations/apache-config] - https://gerrit.wikimedia.org/r/109652 (owner: Dzahn) [11:23:25] speaking of which paravoid, i wanted to ask you about templates/varnish/bits.inc.vcl.erb [11:23:53] there is a site var there, do you have a good idea what to replace it with?
[11:26:07] with $cluster_tier [11:26:31] which should be 1 for eqiad/pmtpa, and higher for other sites [11:26:48] thanks mark [11:27:03] that variable is already set in role/cache.pp [11:27:10] but bits.inc.vcl hasn't adapted to it yet [11:27:37] yeah, i saw that [11:27:48] grep is my friend :) [11:29:29] (03PS1) 10Matanya: varnish: puppet 3 compatibility fix: correct variable [operations/puppet] - 10https://gerrit.wikimedia.org/r/109869 [11:30:55] not that there's anything really wrong with using $::site [11:35:32] (03PS1) 10Hashar: restore TTL for contint websites [operations/dns] - 10https://gerrit.wikimedia.org/r/109870 [11:36:09] (03CR) 10Hashar: "RobH: that cleans up the DNS configuration for the migration we completed last week :-)" [operations/dns] - 10https://gerrit.wikimedia.org/r/109870 (owner: 10Hashar) [11:36:36] (03CR) 10Faidon Liambotis: [C: 032] restore TTL for contint websites [operations/dns] - 10https://gerrit.wikimedia.org/r/109870 (owner: 10Hashar) [11:38:20] I think I am gonna add a vim modeline in dns zonefiles and retab/lint everything. The pedantic inside of me aches when he reads DNS changes in gerrit :-( [11:39:38] akosiaris: +1 [11:39:45] akosiaris: I tried to do it too, I've done it in a few of them [11:39:48] +1, i feel like i can only do it wrong, either it looks bad in vim or in gerrit [11:39:52] but I got too tired and stopped [11:39:54] with real tabs [11:39:56] but please do :) [11:40:22] mutante: same feeling here. [11:40:27] tabs or spaces btw ? [11:40:31] then we can get a Jenkins job to complain whenever it finds tabs \O/ [11:40:33] 4 spaces [11:40:39] 4 spaces would be nice [11:40:48] just so it's like puppet [11:40:51] 4 spaces? [11:41:10] you realize this means they won't be indented, right? :) [11:41:13] well I would lmake it 3 spaces [11:41:21] but seems most people like 4 for whatever reason [11:41:37] you need to indent at IN A boundaries etc. [11:41:49] to be specific, :set ts=4 sts=4 et :retab :wq ? 
[11:41:57] "column -t" will probably do the job, with the exception of lines that have the same LHS as the entry above them [11:41:59] or replace the values [11:42:05] man column [11:42:10] (wrong window) [11:42:57] let's do spaces, the specific amount for dns zonefiles I don't care about much [11:44:42] as long as they are indented :-) [11:44:48] yes [11:44:50] yes please [11:45:04] IN A following the IN A from the line above etc. [11:45:18] !log running fresh s5 dump for toolserver on db73 [11:45:20] yes [11:45:25] Logged the message, Master [11:45:26] and no more than one space between IN and record type [11:45:34] I've seen people do that, I find it pointless [11:46:36] and I don't mind spaces vs. tabs either, as long as it's consistent :) [11:46:53] i think for dns it's easier with spaces [11:46:57] lunch, brb [11:48:25] same && nap [11:50:38] after we change it maybe hashar can re-enable/adjust config of lint check in dns repo [11:50:54] so jenkins then tells us about tab chars if added in the future [11:51:18] matanya: why are you changing all these include lines in site.pp to continuation lines with commas ? [11:56:41] (03CR) 10Springle: "Fine with this if it's definitely a noop for the resulting config file. No comment on the puppet style/scope conversation." [operations/puppet] - 10https://gerrit.wikimedia.org/r/108488 (owner: 10Matanya) [12:04:41] akosiaris: i find it easier to read [12:06:22] matanya: more difficult to grep however [12:07:13] and more difficult to read for me at least [12:07:17] me as well [12:07:31] akosiaris: if this is the target, we should include every class [12:07:41] i'll revise it [12:07:48] thanks :-) [12:08:17] akosiaris: i'll include all, is the ok? being explicit makes sense [12:08:26] *that [12:10:34] hashar: again fatal mails from beta every minute? 
:( [12:11:28] who's good with wm-bot commands and setting up a channel?:) [12:12:41] it's like in a channel but completely muted, doesn't talk to anyone there [12:17:02] Nemo_bis: arrhhh [12:17:31] guess who is going to have to fix it ? :D [12:24:01] (PS1) Hashar: beta: fatal monitor twice per days [operations/puppet] - https://gerrit.wikimedia.org/r/109877 [12:24:13] mutante: some cron fix https://gerrit.wikimedia.org/r/109877 [12:24:38] the current definition causes a cron job to run every minute during two periods of one hour :/ [12:24:46] that spams the qa mailing list :-D [12:27:47] hashar: makes sense! [12:27:52] just jenkins isn't done yet [12:28:44] mutante: merge it please, nice spam there [12:29:01] (CR) Matanya: [C: 1] beta: fatal monitor twice per days [operations/puppet] - https://gerrit.wikimedia.org/r/109877 (owner: Hashar) [12:29:17] (CR) Nemo bis: [C: 1] "Yes please." [operations/puppet] - https://gerrit.wikimedia.org/r/109877 (owner: Hashar) [12:29:38] (CR) Dzahn: [C: 2] "right, without minute that would be every minute" [operations/puppet] - https://gerrit.wikimedia.org/r/109877 (owner: Hashar) [12:30:14] hashar: merged in puppetmaster [12:30:37] thanks [12:30:43] mutante: and the script is bugged anyway hehe [12:30:50] my puppetization was kind of lame [12:36:59] (PS1) Hashar: beta: fix fatal monitor duration detection [operations/puppet] - https://gerrit.wikimedia.org/r/109879 [12:37:01] (PS1) Hashar: beta: tweak fatal monitor email content [operations/puppet] - https://gerrit.wikimedia.org/r/109880 [12:38:10] (PS1) Hashar: beta: remove beta_monitor_fatals_every_hours [operations/puppet] - https://gerrit.wikimedia.org/r/109881 [12:38:38] mutante: and if you feel brave enough, you can get the three above in the repository :-] [12:40:49] (CR) Dzahn: [C: 2] beta: fix fatal monitor duration detection [operations/puppet] - https://gerrit.wikimedia.org/r/109879 (owner: Hashar) [12:41:09]
(CR) Dzahn: [C: 2] beta: tweak fatal monitor email content [operations/puppet] - https://gerrit.wikimedia.org/r/109880 (owner: Hashar) [12:41:26] (CR) Dzahn: [C: 2] beta: remove beta_monitor_fatals_every_hours [operations/puppet] - https://gerrit.wikimedia.org/r/109881 (owner: Hashar) [12:42:34] mutante: thx [12:42:46] hashar: yea, easy enough, comments and removing the old cron, +1 [12:42:54] cool that you fix it right away [12:44:15] :-D [12:44:33] we eventually want to phase that out [12:44:40] and query logstash directly [12:44:52] possibly via an icinga plugin [12:44:57] i see [12:45:15] so syslog -> logstash <-- poll from icinga -> irc alarm [12:45:16] buuut [12:45:20] we have no icinga for beta :-] [12:45:55] hmmm https://github.com/hpcugent/logstash-patterns [12:46:11] logstash-patterns / files / icinga [12:46:13] it has [12:46:21] well we can make logstash send passive checks http://logstash.net/docs/1.3.3/outputs/nagios [12:46:22] btw, even if unrelated [12:46:33] aha [12:47:15] hashar: i know, but said so(tm) when labs nagios was started and differed completely from prod setup [12:47:19] manual vs. puppet [12:47:28] yup [12:47:35] we need a real labs one that uses the prod classes [12:47:51] but the other labs nagios could do some nice tricks [12:47:54] the prod icinga relies on the puppet master collecting resources on all hosts [12:47:58] but that is disabled on labs [12:48:04] like getting the instance names from openstack [12:48:09] and creating the configs [12:50:23] one day maybe :D [12:50:26] i am out for nap now [12:52:06] (CR) Dzahn: [C: -1] "merge after missing bugs.wikipedia.org has been added to cluster Apache redirects" [operations/dns] - https://gerrit.wikimedia.org/r/109868 (owner: Dzahn) [12:54:18] (CR) Dzahn: [C: 1] "i have done all the other decom steps and powered it down, but disks need to be wiped and maybe cmjohnson would like to have the .mgmt.
fo" [operations/dns] - https://gerrit.wikimedia.org/r/109286 (owner: Dzahn) [13:01:01] (PS1) Dzahn: decom professor, add decommissioning.pp [operations/puppet] - https://gerrit.wikimedia.org/r/109884 [13:02:45] (CR) Dzahn: [C: 1] "merge if professor hardware is not reclaimed for anything, all else done, already removed from puppet and storedconfigs" [operations/puppet] - https://gerrit.wikimedia.org/r/109884 (owner: Dzahn) [13:15:52] (PS6) Matanya: site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507 [13:15:56] akosiaris: ^ [13:16:05] this way? [13:16:44] (CR) jenkins-bot: [V: -1] site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507 (owner: Matanya) [13:26:55] (PS7) Matanya: site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507 [14:14:07] (PS3) Matanya: coredb_mysql: puppet 3 compatibility fix: fully qualify variable [operations/puppet] - https://gerrit.wikimedia.org/r/108488 [14:19:41] (CR) Ottomata: "Faidon, so that it works in mediawiki-vagrant too." [operations/puppet] - https://gerrit.wikimedia.org/r/109316 (owner: Ottomata) [14:39:03] (PS2) Matanya: emery: remove one udp2log logger. [operations/puppet] - https://gerrit.wikimedia.org/r/109849 [14:39:13] (CR) Ottomata: [C: 2 V: 2] "Thanks Matanya!" [operations/puppet] - https://gerrit.wikimedia.org/r/109849 (owner: Matanya) [14:42:22] ottomata: i'll put a patch to move the two remaining logs to oxygen.
[14:42:31] api and glam nara [14:42:41] arabic stuff still pending [14:44:12] hmmm [14:44:14] matanya [14:44:25] maybe we should move them to erbium [14:44:38] i think there's more capacity there [14:44:55] oh, while you are at it [14:44:56] check [14:45:01] misc/statistics.pp for rsync::jobs [14:45:09] there are rsync jobs there that copy the logs from the logging boxes [14:45:38] if any of the filters we are deleting or moving have corresponding rsync jobs, the rsync job should be updated to match [14:45:43] e.g. change hostname to erbium or whatever [14:50:15] (CR) Ottomata: [C: 2] "Looks great! Let's create the topic together today?" [operations/puppet] - https://gerrit.wikimedia.org/r/109800 (owner: Gage) [14:57:17] (CR) Nemo bis: "This means the new kafka-logs will have information about all our users starting from now, right?" [operations/puppet] - https://gerrit.wikimedia.org/r/109800 (owner: Gage) [14:57:52] ottomata: that needs V+2 because he's not in trusted users yet perhaps? [14:58:21] yeah i could V+2 but I also don't want to merge it yet [14:59:06] (CR) Ottomata: "No, this just sets up kafka log collection of webrequest access logs from the bits varnishes. This is being done now so that Ori can use " [operations/puppet] - https://gerrit.wikimedia.org/r/109800 (owner: Gage) [15:02:18] (PS1) Matanya: emery: move rsync teahouse job [operations/puppet] - https://gerrit.wikimedia.org/r/109894 [15:04:17] ottomata: ^ [15:10:37] (CR) Ottomata: "Great, thanks.
Since we just merged the removal of the udp2log filter, I'll go ahead and wait a day before I merge this so the job can ru" [operations/puppet] - 10https://gerrit.wikimedia.org/r/109894 (owner: 10Matanya) [15:19:50] oh matanya, that patch is good, that's what I would do too [15:19:56] but, just so you know (in case you don't) [15:20:02] that that wouldn't actually remove the cron job [15:20:07] that would just keep puppet from managing it [15:20:09] to actually remove it [15:20:15] i know [15:20:17] you'd have to do some ensure => 'absent' [15:20:18] ah ok cool [15:20:21] ok then you know! [15:20:23] :) [15:20:30] i will remove the cron job manually when I merge [15:20:50] that is what i was intending to request :) thank you [15:21:04] great danke [15:33:58] ottomata: mind review onther minor change? [15:34:05] *another [15:34:27] PROBLEM - LVS HTTPS IPv4 on text-lb.eqiad.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.445 second response time [15:34:37] PROBLEM - LVS HTTPS IPv4 on mobile-lb.eqiad.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:34:40] not at all [15:34:47] PROBLEM - LVS HTTPS IPv6 on text-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 7.251 second response time [15:34:47] Something's not right with the cluster... [15:34:51] https://gerrit.wikimedia.org/r/#/c/106502/ [15:34:51] PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:35:14] uh oh [15:35:18] those icinga alearts are bad [15:35:37] PROBLEM - LVS HTTP IPv6 on text-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:35:50] (03CR) 10Ottomata: [C: 032 V: 032] wmclient moved to git. 
correcting README [operations/debs/adminbot] - 10https://gerrit.wikimedia.org/r/106502 (owner: 10Matanya) [15:36:37] RECOVERY - LVS HTTPS IPv4 on mobile-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 28381 bytes in 7.013 second response time [15:36:44] apergos, i don't know much about these systems, but i'm looking (i'm also in a meeting right now) [15:36:50] ok [15:36:51] packetloss ? [15:37:06] not straight cut [15:37:12] where is lvs1001 [15:38:27] RECOVERY - LVS HTTP IPv6 on text-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 66291 bytes in 0.004 second response time [15:38:30] RECOVERY - LVS HTTPS IPv4 on text-lb.eqiad.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 66419 bytes in 0.011 second response time [15:38:35] ook [15:38:37] RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 28377 bytes in 0.025 second response time [15:38:37] that was fast and odd [15:38:40] RECOVERY - LVS HTTPS IPv6 on text-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 66416 bytes in 0.029 second response time [15:38:53] did anyone do something ? [15:38:57] not me [15:39:04] not me either, was still loooking [15:39:11] not me [15:39:16] mark? [15:39:26] lvs1001 ran out of memory [15:40:03] good morning phone :) [15:40:26] cluster saw 500 MB/s network activity drop for about 5 min https://ganglia.wikimedia.org/latest/graph_all_periods.php?title=Bytes+served%2C+caches+and+misc+crumbs+-+old+data+incomplete&vl=&x=10000000000&n=&hreg[]=cp|mw|amssq&mreg[]=bytes_out>ype=stack&glegend=hide&aggregate=1 [15:40:44] (I love this graph I made, yes. :P Whatever it graphs.) [15:40:47] * hashar hands coffee and donuts to paged folks [15:41:12] ottomata: webrequest_bits topic is already created. looks correct to me but maybe you can doublecheck? 
[15:41:18] https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=cpu_report&s=by+name&c=LVS+loadbalancers+eqiad&h=lvs1001.wikimedia.org&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4 [15:41:30] Nemo_bis: you are an artist [15:41:40] :D [15:42:10] oh i hadn't checked [15:42:44] awesome, jgage looks great [15:42:48] yay [15:42:51] i think we can merge it! [15:42:54] omg [15:43:03] you want to? [15:43:11] ok [15:43:12] go ahead, merge, run puppet on one (or all) of the bits varnishes [15:43:25] i will start a consumer and look for traffic coming in [15:43:43] watching... [15:43:44] :) [15:44:36] one sec [15:44:41] ja no hurry [15:44:47] i'm still in a meeting too :) [15:45:50] heh [15:46:09] i ran puppet-merge on palladium but it says no changes to merge, i guess i missed a step. [15:46:12] * jgage consults docs [15:46:18] oh i didn't merge in gerrit yet [15:46:30] you usually do both of those steps at the same time [15:46:33] gerrit first, then puppet-merge [15:46:45] i can merge in gerrit if you like, but maybe you should ? [15:46:46] https://gerrit.wikimedia.org/r/#/c/109800/ [15:47:08] oh ok, i thought it was auto after you +2 [15:48:03] naw, you have to +2 both buttons, and then click submito [15:48:20] niah not both (most of the times) [15:48:36] the verified +2 is given by jenkins [15:48:45] and it is advisable to not override it [15:48:51] though you can [15:49:01] both = +2 cr and +submit to [15:49:17] ottomata: ok, merged and ran puppet-merge [15:49:44] ha... ok i misunderstood then... i thought V:+2, CR:+2 [15:49:46] (03CR) 10Hashar: "Make sure the application servers have the redirect and that should be fine :-D" [operations/dns] - 10https://gerrit.wikimedia.org/r/109868 (owner: 10Dzahn) [15:49:56] (03CR) 10Hashar: [C: 031] point bugs. and bugzilla. 
within wikiPedia to -lb [operations/dns] - 10https://gerrit.wikimedia.org/r/109868 (owner: 10Dzahn) [15:50:01] ok coool [15:50:02] jagage [15:51:08] jgage, log into cp1056.eqiad.wmnet and run puppetd -t [15:51:21] *tv :) [15:51:35] doing that now :) [15:52:45] hrm err: /Stage[main]/Varnishkafka::Monitoring/Exec[generate-varnishkafka.pyconf]/returns: change from notrun to 0 failed: /usr/bin/python /usr/lib/ganglia/python_modules/varnishkafka.py --generate --tmax=15 /var/cache/varnishkafka/varnishkafka.stats.json > /etc/ganglia/conf.d/varnishkafka.pyconf.new returned 1 instead of one of [0] at /etc/puppet/modules/varnishkafka/manifests/monitoring.pp:25 [15:53:07] hmmm [15:53:27] it did install varnishkafka [15:53:47] ok [15:54:00] is it running? [15:54:07] i'm running puppet on cp1057 to check [15:55:12] no, the daemon isn't running [15:55:23] running puppet a second time did produced same result [15:55:58] hmmm Jan 28 15:55:00 cp1057 varnishkafka[14764]: VSLOPEN: Failed to open Varnish VSL: No such file or directory [15:55:58] ok so [15:55:58] the monitoring one is fine [15:56:05] (03Abandoned) 10Hashar: WIP: DO NOT MERGE YET. Allow Google's bots to scrape bits. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/95548 (owner: 10Dr0ptp4kt) [15:56:13] that actually wont' work until varnishkafka generates some stats output [15:56:13] but [15:56:16] that is the varnishkafka error [15:56:20] dunno what that means yet [15:56:21] Snaps: ? [15:57:05] hm [15:57:11] VSL? should it be VCL? [15:57:21] VSL is the varnish shared log file [15:57:25] oh [15:57:42] thought the varnish lang [15:58:56] looks like varnishkafka.log is mode 640 on cp1056 vs 664 on cp1046 [15:59:15] oh hmm [15:59:18] where is that jgage? 
[15:59:23] /var/log/ [15:59:31] oh [15:59:41] that's just varnishkafka.log dunno, [15:59:44] but that's not the error [15:59:46] hmmm [15:59:52] wonder if varnishname is bad [15:59:54] checking [15:59:55] the VSL file is the file created by varnishd [16:00:00] yeah, -n [16:00:10] # varnish instance name [16:00:10] varnish.arg.n = frontend [16:00:11] hm [16:00:37] -n cp1057 [16:00:37] ah [16:00:38] yes [16:00:40] different on bits [16:00:40] hmmmm [16:00:42] weird [16:01:18] yeah ok [16:01:26] varnishkafka.conf looks appropriately similar [16:01:39] ok yeah [16:01:39] so [16:01:48] in your change to cache.pp [16:01:50] add a parameter [16:02:00] varnish_name => $::hostname [16:02:02] i think that should do it [16:02:07] hm ok [16:04:07] (03PS1) 10Gage: handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 [16:04:38] Here info of irc http://p.pw/DLV [16:05:00] jgage, need comma [16:05:06] best to just end all params with comma [16:05:07] (03CR) 10jenkins-bot: [V: 04-1] handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 (owner: 10Gage) [16:05:11] even the last one [16:05:12] ah crap, ok [16:06:15] i really should write my style guide :P [16:06:23] PROBLEM - Varnishkafka log producer on cp3019 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishkafka [16:06:53] PROBLEM - Varnishkafka log producer on cp3021 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishkafka [16:07:13] PROBLEM - Varnishkafka log producer on cp1056 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishkafka [16:07:16] hehe [16:07:20] monitoring works! [16:07:41] heh yay [16:07:51] arr where is the command to append my change [16:07:58] git commit --amend [16:08:11] aha no wonder it didn't like --append [16:08:36] set it as an alias? 
:D [16:08:43] (03PS1) 10Gage: once more with commas [operations/puppet] - 10https://gerrit.wikimedia.org/r/109910 [16:08:55] hashar: \o/ [16:08:56] hashar, good idea [16:09:00] hashar: yay for favicons [16:09:00] jgage: you should do the sane one [16:09:05] *same [16:09:06] I knew you'll be pleased :) [16:09:13] not a new patch [16:09:23] I got "git amend" and "git ammend" as alias to git commit --amend [16:09:30] can get append as well hehe [16:09:36] yeah jgage, you submitted a new patch [16:09:40] hrm why didn't my change get amended [16:09:45] arr ok how do i fix [16:09:46] yeah hm [16:09:52] twkozlowski: yeah well done. And lot of that has been done by volunteers via Google code in which is even more awesome [16:09:55] dunno [16:09:55] but its ok [16:09:59] we can abandon the other one i guess [16:10:00] * twkozlowski has a silly alias for git checkout -b blah/blah master [16:10:00] jgage: that would need to be : git commit --amend and then git review -R [16:10:02] oh i needded to clone from gerrit i think [16:10:22] (03CR) 10Ottomata: [C: 032] once more with commas [operations/puppet] - 10https://gerrit.wikimedia.org/r/109910 (owner: 10Gage) [16:10:23] jgage: if you have it locally you can checkout that branch [16:10:33] jgage, i will abandon the first one [16:10:36] go ahead and merge the second one [16:10:39] ok [16:10:41] hashar: virtually all of them have been done by GCI students, except perhaps two or three [16:10:42] if not, git review -d number [16:10:53] (03Abandoned) 10Ottomata: handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 (owner: 10Gage) [16:10:54] thanks matanya [16:11:01] hashar: not to mention the number of SVG logos they created [16:11:09] also, you usually don't need to change commit message when you amend [16:11:12] unless you want to [16:11:15] bt yeah, dunno [16:11:15] I know Quim couldn't cope with creating additional tasks for them :) [16:11:15] cool [16:11:17] twkozlowski: 
you should write an announce on wikitech :-] [16:11:26] or ottomata tells you broke something :) [16:11:29] hm but now my second change depends on the first one which was abandoned [16:11:40] oh it did! [16:11:41] hashar: one of the students who helped with the favicons was selected as our winner, in the end [16:11:42] i ddin't see that [16:11:46] i thought I checked [16:11:47] grr [16:11:49] though he also helped with some core changes, etc. [16:11:57] twkozlowski: and I would love us to have something similar to google code in albeit without Google and running continuously. [16:12:01] jgage: did you the commit from producation branch? [16:12:07] ahh poo [16:12:11] this is probably not what i should be doing right after being woken up by monitoring [16:12:13] ah poo sorry [16:12:20] ahh haha [16:12:31] (03Restored) 10Ottomata: handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 (owner: 10Gage) [16:12:33] hashar: It would be nice, but from an organizational perspective, it eats a lot of resources [16:12:33] ok, restored [16:12:36] ok jgage, lets do this right [16:12:37] do this [16:12:38] * hashar sends jgage to Git/Gerrit 101 course with matanya :D [16:12:47] git fetch ssh://jgage@gerrit.wikimedia.org:29418/operations/puppet refs/changes/09/109909/1 && git checkout FETCH_HEAD [16:12:50] make your change [16:12:51] then [16:12:57] git commit -a —amend [16:12:58] then [16:13:00] git review [16:13:07] -R [16:13:08] -amend or --amend? :) [16:13:12] -- [16:13:13] -- [16:13:13] --amend [16:13:16] -- [16:13:17] long option so it is double dash [16:13:17] lol [16:13:30] annnnd [16:13:30] hashar lost the game. 
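The amend-and-resubmit steps discussed above can be sketched as a small script. This is only a minimal illustration, assuming a throwaway local repository: the file contents and the user identity are made up, and the final `git review -R` step appears only as a comment because it needs a Gerrit remote.

```shell
#!/bin/sh
# Sketch: fix a mistake by amending the existing commit instead of
# stacking a second commit on top of it.
set -e

# Throwaway repository (hypothetical; stands in for operations/puppet).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

# First attempt: the parameter line is missing its trailing comma.
echo 'varnish_name => $::hostname' > cache.pp
git add cache.pp
git commit -q -m "handle different varnishkafka frontend for bits"

# Fix the file, then fold the fix into the same commit.
# --no-edit keeps the commit message unchanged, which also preserves a
# Change-Id footer if the message has one.
echo 'varnish_name => $::hostname,' > cache.pp
git commit -q -a --amend --no-edit

# Still a single commit on the branch, now containing the fix.
commit_count=$(git rev-list --count HEAD)
echo "commits on branch: $commit_count"

# With a Gerrit remote configured, the amended commit would then be sent
# as a new patch set of the same change with: git review -R
```

Committing the fix without `--amend` would create a second commit, which is how the extra change 109910 came about in the conversation above.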
[16:13:34] -_-amend [16:13:40] haha [16:13:42] well -- shows up as a single dash with my font :( [16:13:50] haha, it does in mine too [16:13:51] --amend [16:13:51] jgage: tab is your friend [16:13:54] sometimes [16:13:55] pro tip: install bash/zsh completion script for git [16:14:02] oh yah [16:14:10] tab completing options? black magic! [16:14:18] that works [16:14:31] then it becomes: git com --am [16:14:37] and if you want to be really pro, use vim with syntactic [16:15:00] i use this one [16:15:00] https://github.com/nojhan/liquidprompt [16:15:24] it will show you red bold error when you puppet code is broken [16:15:28] *your [16:15:30] (03PS2) 10Gage: handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 [16:17:12] very nice ottomata i'll check it out :) [16:17:29] * matanya wonders if it is packaged in his distro [16:17:40] (03PS3) 10Gage: handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 [16:17:47] ok jgage, merging [16:17:51] (03CR) 10Ottomata: [C: 032 V: 032] handle different varnishkafka frontend for bits [operations/puppet] - 10https://gerrit.wikimedia.org/r/109909 (owner: 10Gage) [16:18:06] running puppet, go ahead and run on cp1047 [16:18:08] uhh [16:18:14] cp1056? 
whatever you were on [16:18:18] yeah 56, ok [16:19:51] well ok it made the frontend change in varnishkafka.conf [16:19:59] did get another err [16:20:00] err: /Stage[main]/Ganglia/Service[gmond]: Failed to call refresh: Could not start Service[gmond]: Execution of '/etc/init.d/ganglia-monitor start' returned 1: at /etc/puppet/manifests/ganglia.pp:240 [16:20:13] RECOVERY - Varnishkafka log producer on cp1056 is OK: PROCS OK: 1 process with command name varnishkafka [16:20:22] however varnishkafka is running [16:21:28] yeah i got that too, i htink it isi lying though [16:21:30] about gmond [16:21:43] i restarted gmond manually [16:21:48] yeah [16:21:59] hm ganglia 19531 0.0 0.0 0 0 ? Z 16:21 0:00 [varnishstat] [16:22:03] coool i'm watching kafka, lots coming in [16:22:07] yaay [16:22:25] how are you watching it? ganglia? [16:24:01] naw, on analytics1021 [16:24:10] kafka console-consumer —topic webrequest_bits [16:24:14] actually, like to do [16:24:23] RECOVERY - Varnishkafka log producer on cp3019 is OK: PROCS OK: 1 process with command name varnishkafka [16:24:24] kafka console-consumer —topic webrequest_bits | jq . hostname | uniq -c [16:24:28] just to make the output a little less [16:24:39] cool [16:25:48] also, i'm running puppet on some more nodes [16:25:53] ok [16:25:53] RECOVERY - Varnishkafka log producer on cp3021 is OK: PROCS OK: 1 process with command name varnishkafka [16:25:57] ganglia stuff looks fine on them [16:26:10] probably somethign weird with that bad setup we had at first [16:26:11] looks good onow that puppet is all right [16:26:19] good [16:26:48] haven't seen jq before, neato [16:27:10] whoa [16:27:11] http://ganglia.wikimedia.org/latest/graph_all_periods.php?hreg[]=cp.%2B&mreg[]=kafka.rdkafka.topics.webrequest_.%2B%5C.txmsgs.per_second&z=large&gtype=stack&title=kafka.rdkafka.topics.webrequest_.%2B%5C.txmsgs.per_second&aggregate=1&r=hour [16:27:16] there they come! 
[16:27:24] whoa that is a lot more than mobile, wooo [16:27:25] haha [16:27:28] heh [16:27:44] Snaps: ^ :) [16:28:33] whats happening? [16:29:00] just set up varnishkafka on more nodes [16:29:03] for more traffic [16:29:10] cool! :) [16:29:19] up from about 5K msgs / ssec to 16K msgs / sec it looks like [16:30:12] fwiw ottomata's kafka console-consumer command has a typo in it, should be: jq .hostname [16:30:37] oh oops, danke [16:30:37] yeah [16:30:39] no space [16:31:22] awesooome [16:31:26] thanks jgage, its looking good [16:31:27] mmm data [16:31:36] no, thank you sir! [16:32:42] oh you know [16:32:47] i think all the ganglia data is not in yet [16:32:50] i think it is more than that [16:32:56] i only see 2 bits nodes in that graph [16:32:57] ooo [16:36:58] PROBLEM - Varnishkafka log producer on cp4004 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishkafka [16:37:08] PROBLEM - Varnishkafka log producer on cp4001 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishkafka [16:37:18] PROBLEM - Varnishkafka log producer on cp4002 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishkafka [16:37:23] ottomata is that you twiddling? [16:37:58] RECOVERY - Varnishkafka log producer on cp4004 is OK: PROCS OK: 1 process with command name varnishkafka [16:39:08] RECOVERY - Varnishkafka log producer on cp4001 is OK: PROCS OK: 1 process with command name varnishkafka [16:39:29] killing the defunct varnishstat on cp1056 just spawns a new one [16:41:00] hmm no [16:41:01] no tme [16:41:14] naw, probably puppet [16:41:17] i didn't do it on cp40* [16:44:50] jgage [16:44:55] yo [16:45:05] i think the aggregated ganglia graphs for varnishkafka take a while to get everything [16:45:08] (ganglia is weird0 [16:45:10] but check this one [16:45:10] http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&tab=v&vn=kafka&hide-hf=false [16:45:14] 50K msgs per second [16:45:44] that was more than I had expected, didn't realize bits was so much! 
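The corrected pipeline, `kafka console-consumer --topic webrequest_bits | jq .hostname | uniq -c`, pulls the `hostname` field out of each JSON message and counts runs of consecutive identical values. A rough Python equivalent is sketched below, assuming one JSON object per input line. The helper names are hypothetical and the sample hostnames in the test are illustrative only. Unlike jq, it prints the hostname without JSON quotes.

```python
import json
import sys
from itertools import groupby


def hostnames(lines):
    """Yield the "hostname" field of each non-empty JSON line (like jq .hostname)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line).get("hostname")


def count_runs(values):
    """Yield (count, value) per run of consecutive equal values (like uniq -c)."""
    for value, run in groupby(values):
        yield sum(1 for _ in run), value


if __name__ == "__main__":
    # Example: kafka console-consumer --topic webrequest_bits | python this_script.py
    for count, host in count_runs(hostnames(sys.stdin)):
        print(f"{count:7d} {host}")
```

As with `uniq -c`, a count is only emitted once a run ends, so on a live stream the output lags behind the input by one run.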
[16:45:45] :) [16:45:46] ha [16:45:47] sweet [16:45:48] heh [16:45:55] stress test :) [16:46:18] RECOVERY - Varnishkafka log producer on cp4002 is OK: PROCS OK: 1 process with command name varnishkafka [16:56:23] (03CR) 10Alexandros Kosiaris: [C: 04-1] "Various points here and there." (0312 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/109507 (owner: 10Matanya) [16:57:47] food tiiime [16:58:06] ciao [17:32:38] (03CR) 10Hashar: "Ariel and mark talked about wikipedia-lb versus wikimedia-lb . Basically it does not matter, they both point to the same text IP." [operations/dns] - 10https://gerrit.wikimedia.org/r/109868 (owner: 10Dzahn) [17:33:15] !log replacing ethernet cable db1024 rt6672 [17:33:22] Logged the message, Master [17:34:58] PROBLEM - Host db1024 is DOWN: PING CRITICAL - Packet loss = 100% [17:38:28] RECOVERY - Host db1024 is UP: PING OK - Packet loss = 0%, RTA = 0.40 ms [17:51:44] hey greg-g, do you know if the updated gwtoolset code will be deployed today to commons? it wasn't deployed last week. [17:54:09] Reedy: there are still 2 bugs open that have fixes already deployed. please take a look and close or add comments: https://bugzilla.wikimedia.org/show_bug.cgi?id=58651 and https://bugzilla.wikimedia.org/show_bug.cgi?id=58591 [17:58:31] greg-g: don't see anything listed on https://www.mediawiki.org/wiki/MediaWiki_1.23/wmf11, but we do have updates [18:00:37] dan-nl: https://www.mediawiki.org/wiki/MediaWiki_1.23/wmf11#GWToolset ? [18:01:25] ah, now i see it .. cool [18:01:53] dan-nl: is everything there you need? [18:02:03] double-checking now [18:02:06] k [18:03:57] greg-g: got my mail? [18:05:00] greg-g: this is the other patch https://gerrit.wikimedia.org/r/#/c/109273/ [18:05:16] i don't see that one listed [18:07:14] (03CR) 10Physikerwelt: "rabasing brings the change back to the first page of gerrit that gives more people a chance to find it." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [18:07:58] greg-g: that one is d376127, also see 083b22e [18:08:47] greg-g: are the localisation updates automatically included? [18:09:57] greg-g: so, updating version nr from 0.1.0 to 0.1.1 (d376127), Merge "image url not evaluated" (083b22) and image url not evaluated (6301d12) [18:14:20] aude: yeah, one sec [18:15:48] aude: replied [18:16:08] k [18:16:28] thanks [18:16:30] dan-nl: https://gerrit.wikimedia.org/r/#/c/109273/ didn't make the cut in time [18:16:35] beta seems good now [18:17:04] okay, that's just a version change, so no big deal, the other two are the main code changes [18:17:10] https://gerrit.wikimedia.org/r/#/c/107038/ is included [18:18:09] greg-g, since you are here! i asked this yesterday and then may have missed some chats with answers [18:18:14] but i want to hack on git-deploy stuff [18:18:16] what repository should I use [18:18:18] ? [18:18:24] hah, I have no effing clue [18:18:26] git-deploy? sartoris (trebuchet?) [18:18:27] ah rats! [18:18:28] there's too many of them [18:18:35] ask Ryan_Lane [18:18:42] yeah, i guess I should email [18:19:01] * Reedy deploys ottomata [18:19:15] haha [18:20:14] dan-nl: this'll show all that's in wmf11: https://git.wikimedia.org/log/mediawiki%2Fextensions%2FGWToolset.git/refs%2Fheads%2Fwmf%2F1.23wmf11 [18:20:28] !log reedy updated /a/common to {{Gerrit|Iac50ca3fb}}: Deploy Extension:MobileApp to betalabs [18:20:32] (03PS1) 10Reedy: Non Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109930 [18:20:35] Logged the message, Master [18:23:15] ori: yt? [18:30:45] (03PS1) 10Reedy: Move re-usable code from checkoutMediaWiki to checkoutMediaWiki.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109932 [18:30:54] greg-g: i don't see wmf11 next to 6301d12, will that happen later? [18:31:40] dan-nl: it's listed on that link I pasted, right? 
[18:32:10] (03CR) 10Reedy: [C: 032] Move re-usable code from checkoutMediaWiki to checkoutMediaWiki.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109932 (owner: 10Reedy) [18:32:16] (03Merged) 10jenkins-bot: Move re-usable code from checkoutMediaWiki to checkoutMediaWiki.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109932 (owner: 10Reedy) [18:36:39] !log reedy updated /a/common to {{Gerrit|If45e33ae9}}: Move re-usable code from checkoutMediaWiki to checkoutMediaWiki.php [18:36:42] (03PS1) 10Reedy: Fixup default file permissions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109933 [18:36:48] Logged the message, Master [18:38:00] (03CR) 10Reedy: [C: 032] Fixup default file permissions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109933 (owner: 10Reedy) [18:38:05] (03Merged) 10jenkins-bot: Fixup default file permissions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109933 (owner: 10Reedy) [18:40:45] !log reedy updated /a/common to {{Gerrit|Ic2ea727b4}}: Fixup default file permissions [18:40:48] (03PS1) 10Reedy: Rename phase1 dblist to group0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109934 [18:40:53] Logged the message, Master [18:41:40] greg-g: I'm trying to decide if using groupX (starts at 0) on wikitech and then phaseX on mediawiki.org is confusing [18:41:59] probably [18:42:01] :) [18:42:05] wasn't intentional [18:43:42] Which are we keeping? :P [18:44:47] Reedy: I'm agnostic [18:45:24] Anyone with any preferences? [18:45:25] which ever one requires the least amount of random other fixing [18:46:17] greg-g: Reedy we have a cherry pick to wikibase coming [18:46:25] in a few minutes [18:47:19] I don't mind the 0 base myself, but that seems likely more confusing for non tech people [18:48:11] sure [18:48:30] could just use words? 
[18:48:40] testwikis, sister projects, wikipedias [18:49:26] Reedy: https://gerrit.wikimedia.org/r/#/c/109935/ [18:49:32] whenever, no hurry [18:49:53] greg-g: i only see wmf11 next to bfc8930, Localisation updates from https://translatewiki.net., but maybe that includes all previous commits? [18:51:33] oh, yeah, sorry, see this link for what is included: https://git.wikimedia.org/log/mediawiki%2Fextensions%2FGWToolset.git/refs%2Fheads%2Fwmf%2F1.23wmf11 [18:52:54] decisions decisions [18:53:04] away for a bit [19:01:13] yay: https://projects.puppetlabs.com/issues/14312 [19:01:25] andrewbogott_afk: that's for you too [19:12:05] (03PS1) 10Reedy: Set $wgMathTexvcCheckExecutable [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109936 [19:12:59] (03CR) 10Reedy: [C: 032] Set $wgMathTexvcCheckExecutable [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109936 (owner: 10Reedy) [19:13:08] (03Merged) 10jenkins-bot: Set $wgMathTexvcCheckExecutable [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109936 (owner: 10Reedy) [19:13:23] (03PS1) 10Ori.livneh: Remove obsoleted views from Ganglia [operations/puppet] - 10https://gerrit.wikimedia.org/r/109938 [19:17:33] ottomata: hey [19:18:02] (03CR) 10Ori.livneh: [C: 032] Remove obsoleted views from Ganglia [operations/puppet] - 10https://gerrit.wikimedia.org/r/109938 (owner: 10Ori.livneh) [19:18:19] ori: bits comin' in! [19:18:20] http://ganglia.wikimedia.org/latest/graph_all_periods.php?title=&vl=&x=&n=&hreg%5B%5D=cp.*&mreg%5B%5D=kafka.rdkafka.topics.webrequest_bits.partitions.*.txmsgs.per_second&gtype=stack&glegend=show&aggregate=1 [19:18:47] no wai [19:18:56] its a loooot, didn't realize it was going to be so much [19:19:06] 10x more data in kafka now [19:19:21] up from about 5K per sec w just mobile to 50K per sec with mobile + bits [19:20:14] so how do i get at it? i can connect to a kafka broker(which? do i need to use zookeeper?) and specify the topic? 
[19:20:25] (03PS2) 10Reedy: Non Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109930 [19:20:37] (03CR) 10Reedy: [C: 032] Non Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109930 (owner: 10Reedy) [19:20:46] (03Merged) 10jenkins-bot: Non Wikipedias to 1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109930 (owner: 10Reedy) [19:21:09] (03CR) 10Faidon Liambotis: "It's the equivalent to mailing "bump" on a mailing list post every week. Please don't do that :)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [19:22:04] ori yeah you need a kafka client of some kind [19:22:11] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.23wmf11 [19:22:16] you can play with it on analytics1022 (or 1021) if you like [19:22:20] Logged the message, Master [19:22:20] if you do it there [19:22:26] you can just do [19:22:45] kafka console-consumer --topic webrequest_bits [19:22:49] thanks greg-g, will check-in on the deploy tomorrow when david tries it out [19:23:28] dan-nl: cool [19:23:49] !log reedy updated /a/common to {{Gerrit|I4abcbb866}}: Non Wikipedias to 1.23wmf11 [19:23:53] (03PS1) 10Reedy: Update php symlink to php-1.23wmf11 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109941 [19:23:58] Logged the message, Master [19:25:10] ottomata: yes, that works, and the docs for the kafka cli tool also explain to me what i need to do to create a consumer, so that's very useful [19:25:20] ottomata: thanks very very very (very, very) much! 
[19:26:44] yupyupyup [19:26:54] you can/should be able to install kafka deb package via apt anywhere and get that client [19:26:59] the zookeeper url is what you will need [19:27:33] cat /etc/profile.d/kafka.sh [19:27:38] export ZOOKEEPER_URL='analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,analytics1025.eqiad.wmnet/kafka/eqiad' [19:28:36] jesus christ magnus is a ninja [19:28:40] ori, this might/will be useful for you I think, once it is ready [19:28:45] i am just reading some of the rdkafka docs [19:28:47] haha, yeah [19:28:49] also this [19:28:53] https://gerrit.wikimedia.org/r/#/c/109505 [19:28:56] !log reedy synchronized php-1.23wmf11/extensions/Wikibase [19:29:02] Logged the message, Master [19:29:04] replacement for udp2log, check out that conf.example file for more ninja skills [19:29:14] yes i saw, we should all just retire in shame and stop pretending to be programmers [19:29:50] i'll probably just write a simple consumer using librdkafka since i assume we have it packaged [19:30:00] yup [19:30:12] :) actually i think it comes with one [19:30:16] we don't installa binary for it [19:30:22] but if you compile it it makes a test consumer that you can use [19:34:58] ottomata: seriously... well done. there's a lot more to do i guess but this is looking like a pretty awesome setup. [19:35:42] (03PS2) 10Reedy: Remove 1.18 back compat [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/93116 [19:35:46] thanks! [19:38:09] ottomata: btw, can we remove the user_metrics module from mwv? [19:38:14] wikimetrics obsoletes it, no? [19:38:37] yes! 
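The `ZOOKEEPER_URL` exported above is a ZooKeeper connection string: a comma-separated list of servers followed by an optional chroot path (here `/kafka/eqiad`). The sketch below splits such a string into its parts. The function name is hypothetical, the port default of 2181 is the standard ZooKeeper client port and is assumed because the log gives no ports, and IPv6 literals are not handled.

```python
def parse_zookeeper_url(url, default_port=2181):
    """Split 'host1[:port],host2[:port]/chroot' into ([(host, port), ...], chroot).

    Hypothetical helper for illustration; a missing chroot is reported as "/".
    """
    servers, slash, path = url.strip().partition("/")
    hosts = []
    for server in servers.split(","):
        server = server.strip()
        if not server:
            continue
        host, colon, port = server.partition(":")
        hosts.append((host, int(port) if colon else default_port))
    chroot = "/" + path if slash else "/"
    return hosts, chroot


if __name__ == "__main__":
    url = ("analytics1023.eqiad.wmnet,analytics1024.eqiad.wmnet,"
           "analytics1025.eqiad.wmnet/kafka/eqiad")
    hosts, chroot = parse_zookeeper_url(url)
    print(hosts)
    print(chroot)
```

The chroot applies to the whole ensemble, which is why it appears once at the end of the string rather than after each server.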
[19:38:38] true [19:38:39] and [19:38:41] that reminds me [19:38:45] i need to turn it off ons tat1 [19:38:50] i will do both of those right now ori [19:38:56] (03PS2) 10Jforrester: Enable VisualEditor on ptwikibooks, ptwikiversity for testing [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/108156 [19:39:03] (03CR) 10Reedy: [C: 032] Enable VisualEditor on ptwikibooks, ptwikiversity for testing [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/108156 (owner: 10Jforrester) [19:41:00] ori: https://gerrit.wikimedia.org/r/#/c/109942/ [19:43:52] (03PS1) 10Ottomata: Removing metrics site (UMAPI) from stat1001 [operations/puppet] - 10https://gerrit.wikimedia.org/r/109943 [19:44:24] (03PS1) 10Reedy: Move binaries to version indifferent folders [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109944 [19:45:53] !log reedy synchronized database lists files: [19:46:00] Logged the message, Master [19:46:40] !log reedy synchronized wmf-config/InitialiseSettings.php 'touch' [19:46:48] Logged the message, Master [19:51:41] (03CR) 10Ottomata: "I will manually disable the site on stat1001. Will remove puppetization in a week or so." [operations/puppet] - 10https://gerrit.wikimedia.org/r/109943 (owner: 10Ottomata) [19:51:47] (03CR) 10Ottomata: [C: 032 V: 032] Removing metrics site (UMAPI) from stat1001 [operations/puppet] - 10https://gerrit.wikimedia.org/r/109943 (owner: 10Ottomata) [19:56:18] RobH: I'm having trouble accessing https://wikitech with curl/wget and andrewbogott said you're working on a "proper" certificate (apparently https://rt.wikimedia.org/Ticket/Display.html?id=6592). Do you have an ETA when that will be available? 
[19:57:18] (03PS1) 10Hashar: scap-recompile: remove mw 1.18 support [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109947 [19:58:03] (03CR) 10Reedy: "I did this ages ago in https://gerrit.wikimedia.org/r/#/c/93116/" [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109947 (owner: 10Hashar) [19:58:10] hashar: ^^ [19:58:19] Oh MY GOD [19:58:26] (03PS2) 10Reedy: Move binaries to version indifferent folders [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109944 [19:58:32] so either we need to phase out wikimedia-task-appserver [19:58:43] or we need to be granted +2 and ability to push that .deb :D [19:58:55] I think it is mostly unused now [19:59:01] (03Abandoned) 10Hashar: scap-recompile: remove mw 1.18 support [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109947 (owner: 10Hashar) [19:59:24] Reedy: the Math extension needs to compile another script "texvccheck" [19:59:43] Reedy: beta uses scap-recompile , isn't it still used on prod? [20:00:04] hashar: I know it does [20:00:09] I fixed beta earlier today :P [20:00:20] ohh [20:00:28] what have you done ? :-] [20:00:45] we had https://bugzilla.wikimedia.org/show_bug.cgi?id=60486 [20:00:47] paravoid: ping [20:01:36] (03PS1) 10Ori.livneh: Add *.local.wmftest.org wildcard, mapped to 127.0.0.1 [operations/dns] - 10https://gerrit.wikimedia.org/r/109948 [20:03:09] (03PS1) 10Reedy: Build texvccheck too [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109949 [20:03:28] hashar: done ;) [20:03:31] yeah [20:03:34] copy paste for the win :D [20:03:44] was willing to refactor it :D [20:04:05] physikerwelt: https://gerrit.wikimedia.org/r/#/c/109949/1 :D [20:04:16] I guess the 2nd "install -d /usr/local/apache/uncommon/bin" is redundant [20:04:42] to fix beta have you manually compiled it ? [20:04:44] ottomata: is there any reason not to always run wikimetrics under apache in vagrant? 
[20:04:49] also is production still using scap-recompile ? [20:05:14] hashar: http://p.defau.lt/?Frd8QHGpvrdCqG4epS5q4A [20:05:24] I compiled it on /tmp and moved it into place for beta [20:05:26] ori, mainly for dev purposes [20:05:34] Reedy: you are evil :-] [20:05:35] with daemon and DEBUG=true [20:05:43] flask will reload the app whenever you save changes [20:05:47] (03CR) 10Hashar: "copy pasting for the win. Good enough." [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109949 (owner: 10Reedy) [20:05:47] ottomata: why not apache with DEBUG=true? [20:05:49] with apache you have to restart everytime [20:05:52] ah, ok [20:05:52] (03CR) 10Hashar: [C: 031] Build texvccheck too [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109949 (owner: 10Reedy) [20:06:08] Reedy: if you can trick someone from ops to rebuild the package and get it upload on apt.wm.o that would be nice [20:06:20] ori, there would be a complicated and hacky way to make that happen in wsgi [20:06:21] i think [20:06:30] you can touch the api.wsgi file and somehow get apache to reload [20:06:31] buuut [20:06:36] 1. i didn't get it to work [20:06:44] and 2. that would be hacky and annoying for development anyway [20:06:47] so i stopped trying :p [20:07:53] (03CR) 10Physikerwelt: [C: 031] "as far as I can judge (very limited) it looks ok." 
[operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109949 (owner: 10Reedy) [20:08:40] hashar: how do we set $wgMathTexvcCheckExecutable [20:08:46] physikerwelt: so basically, if you can manage to get ops to deploy https://gerrit.wikimedia.org/r/#/c/109949/ and dependent change that will fix it :-D [20:09:30] also you might want to disable texvccheck by default [20:09:39] I already set it in CommonSettings.php [20:09:52] \O/ [20:09:58] and texvccheck is on tin [20:10:34] physikerwelt: will try to get some ops to rebuild the package and get it uploaded [20:10:47] hashar: thank you [20:11:20] (03PS1) 10Reedy: Update changelog [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109950 [20:11:58] hashar: I think we should go with moving scap-recompile to where its friends live in puppet [20:11:59] Reedy: do you have any high karma to get wikimedia-task-appserver rebuild/deployed? :D [20:12:06] yeah I guess so [20:12:13] is that the last script in the .deb ? [20:12:29] Not quite [20:13:00] apache-sanity-check, apache-start, authorized_keys, check-time, mw-cksum, mw-cksum-list, scap-recompile [20:13:04] * Reedy wonders where mutante|away is [20:13:05] :D [20:13:26] we should ask the old timers (Tim|Mark) to migrate them to puppet :-] [20:13:38] if we get mathoid running one day in the glory future, we can move to the php version of texvccheck [20:13:54] Reedy: and mutante is in Germany, so he is out right now [20:14:22] He is fond of wikimedia-task-appserver though ;) [20:16:08] PROBLEM - SSH on iodine is CRITICAL: Server answer: [20:16:29] scfc_de: uh, wikitech.wikimedia.org is using its own proper certificate [20:16:35] it was resolved week(s)? ago [20:16:41] hashar: is there any open todo for me? [20:16:52] fill bugs? :D [20:16:58] and handle the project management ! [20:17:08] RECOVERY - SSH on iodine is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [20:17:20] well, a week. 
[20:17:33] hence the (s) ;] [20:17:36] (03PS1) 10Reedy: Add scap-recompile to puppet instead of wikimedia-task-appserver [operations/puppet] - 10https://gerrit.wikimedia.org/r/109951 [20:17:46] physikerwelt: so right now Reedy has fixed prod and beta so we are fine. We will get mutante (who is in germany right now) to approve / rebuild the package [20:18:06] Reedy: mind filling a RT for mutante ? [20:18:07] Or we just get the puppet version deployed :) [20:18:13] ah yeah [20:18:32] It might make sense to commit the history, then delete the file from wikimedia-task-appserver [20:19:05] we can potentially import the history of a single file from a different repo [20:19:09] using some git filter magic [20:19:19] but I am not sure whether it is worth the effort [20:19:28] <^demon|lunch> Of a single file? [20:19:28] <^demon|lunch> lol. [20:19:32] <^demon|lunch> Yeah, you can do it. [20:19:48] Just commit and delete it from wikimedia-task-appserver [20:19:49] probably [20:19:59] RobH: The problem is that (on Ubuntu Precise), "curl https://en.wikipedia.org/" => success, "curl https://wikitech.wikimedia.org/" => "curl: (60) SSL certificate problem, verify that the CA cert is OK. ". So one cert is "better" than the other :-). Do you have any suggestions on how to proceed? [20:20:23] well, they hit different clusters [20:20:30] wikitech isnt main cluster, but lets check out the chain on wikitech [20:20:33] <^demon|lunch> I saw that on my phone recently :( [20:20:39] <^demon|lunch> It didn't trust the issuing authority. [20:20:46] yea, sounds like borked chain [20:20:46] <^demon|lunch> (wikitech, that is) [20:20:47] lemme check it out [20:23:47] hashar: the project management for math is really broken... there is no project manager and nobody feels resposible for changes there... 
I expect a lot of serious problems for the upcoming changes that I try to seperate as sigle commits from math 2.0 for this problem my proposal was to set $wgMathDisableTexFilter = is_executable( $wgMathTexvcCheckExecutable ) ? false : true; I plan to change the database layout... the caching [20:23:52] Yea, its not right... [20:24:02] so the chained cert isnt pulling rapidssl, lemme check the manifests [20:24:59] (03PS2) 10Reedy: Remove scap-recompile [operations/debs/wikimedia-task-appserver] - 10https://gerrit.wikimedia.org/r/109950 [20:27:06] physikerwelt: sorry was mostly kidding about project management [20:27:25] physikerwelt: wanted to make sure texvccheck is sorted out in both prod and beta, which apparently is [20:27:40] physikerwelt: for Mathoid you definitely need some bandwidth on wmf side [20:27:52] s/bandwidth/people assigned to make it happens/ [20:28:22] (03PS8) 10Matanya: site: lint [operations/puppet] - 10https://gerrit.wikimedia.org/r/109507 [20:28:32] (03PS1) 10RobH: setting ca_name for wikitech.w.o cert [operations/puppet] - 10https://gerrit.wikimedia.org/r/109952 [20:28:46] scfc_de: so yea, it had no ca_name set, so it defaulted to our internal ca declaration [20:28:50] which is a bad chain [20:28:56] thx for noticing [20:29:06] ^demon|lunch: thx for nothing why didny you tell meeeee ;p [20:29:13] (mostly kidding ;) [20:30:06] RobH: Thanks for fixing! :-) [20:30:16] not quite fixed yet, but will be soon [20:30:34] (03CR) 10RobH: [C: 032] setting ca_name for wikitech.w.o cert [operations/puppet] - 10https://gerrit.wikimedia.org/r/109952 (owner: 10RobH) [20:30:36] (03PS1) 10Ori.livneh: gdash: add stub-quality VisualEditor dashboard [operations/puppet] - 10https://gerrit.wikimedia.org/r/109954 [20:30:55] hashar: sure. My concern is that I can not find out, if the task is scheduled or not. 
[20:30:58] (03PS2) 10Ori.livneh: gdash: add stub-quality VisualEditor dashboard [operations/puppet] - 10https://gerrit.wikimedia.org/r/109954 [20:31:04] (03CR) 10Ori.livneh: [C: 032 V: 032] gdash: add stub-quality VisualEditor dashboard [operations/puppet] - 10https://gerrit.wikimedia.org/r/109954 (owner: 10Ori.livneh) [20:31:10] !log fixing virt0 wikitech cert, wikitech may restart [20:31:14] physikerwelt: I have no idea :/ [20:31:16] well, will restart, folks may notice. [20:31:17] Logged the message, RobH [20:32:01] hashar: Therefore it would be greate if someone from WMF gets assigned to supervise the the math extension [20:32:50] gwicke: pong [20:33:35] hey [20:33:58] hi [20:33:59] paravoid, I sent a mail earlier re getting the new Parsoid deploy system ready [20:34:33] we'll need root help, so was wondering if we can tackle that sometime this week [20:34:38] hashar: I looked to https://gerrit.wikimedia.org/r/#/admin/groups/448,members does that indicate that anything? [20:35:08] RobH: But you're leaving the actual restart for wikitech admins? [20:35:18] uh? [20:35:19] no. [20:35:25] restart of apache is done [20:35:45] physikerwelt: no idea [20:35:46] "curl https://wikitech.wikimedia.org/" still fails for me? [20:36:16] i just saved the changes [20:36:17] physikerwelt: I guess that group is used to grant merge rights on Math [20:36:18] when did you try? [20:36:55] Just now. [20:37:50] hashar: yes but WMF employes have merge rights as well... so why is one WMF employee listed there? [20:39:12] hrmm [20:39:17] (03PS2) 10Ori.livneh: Add *.local.wmftest.org wildcard, mapped to 127.0.0.1 [operations/dns] - 10https://gerrit.wikimedia.org/r/109948 [20:39:22] <^demon|lunch> physikerwelt: Lots more groups have rights on mediawiki/* than just the given extension group. [20:39:25] <^demon|lunch> ACLs inherit. [20:39:39] <^demon|lunch> The mediawiki group has rights on all of it. [20:39:51] <^demon|lunch> Which inherits members from ldap/wmf and ldap/ops. 
[20:39:55] <^demon|lunch> :) [20:40:26] RobH: https://sslcheck.globalsign.com/de/sslcheck?host=wikitech.wikimedia.org says that not all certificates in the chain are included? (Sorry, the UI of that page is German for me.) [20:41:00] yea, thats what i saw the first issue being [20:41:06] but now the chain is including all of them, doublechecking [20:41:18] RobH: https://rt.wikimedia.org/Ticket/Display.html?id=5787 ? [20:41:29] ^demon|lunch^: My goal is to find the person who is somehow the main resposible person for the development of the math extension [20:41:36] paravoid: yes? [20:41:52] <^demon|lunch> physikerwelt: There isn't one responsible person. I'd say the platform *team* is most responsible at the moment. [20:42:23] RobH: it's waiting for you for 3 months now, with monthly "bumps" by sean; could you respond? [20:42:34] i responded to sean directly [20:42:36] i jsut replied on ticket [20:42:40] sean made it sound like really important, I wouldn't ask otherwise [20:42:43] he asked me the other day [20:42:50] we've been communicating regularly [20:43:05] and the other day i was told it wasnt yet super important, but today told differently by you and another [20:43:09] so its bumped up my list for now. [20:43:13] ok [20:43:21] (at no point was the urgency made clear until today, which isnt ideal) [20:43:53] well, the dec 23 mail says "but just note we're starting to feel the pinch with very few spare eqiad db boxes" [20:44:07] that sounds important :) [20:44:12] ^demon|lunch: What should I do prior to change were I expect deployment conflicts? [20:44:15] fine, im wrong all the time [20:44:17] whatevs [20:44:23] i'll take care of it today i said [20:44:33] but lets keep calling me out in a public channel, cuz thats awesome way to handle this [20:44:42] <^demon|lunch> physikerwelt: Well, coordinating the deployment is important. greg-g is useful for that. [20:44:54] did I say that? 
[20:44:59] <^demon|lunch> Getting as much non-conflicting stuff deployed separately is helpful too [20:45:17] I didn't say you are wrong all the time, and that's not what I think fwiw [20:46:11] ^demon|lunch: ok thank you... I'll try to contact him [20:46:27] hi [20:46:29] what's up? [20:46:41] <^demon|lunch> greg-g: physikerwelt has been doing some cool work on Math. [20:46:48] <^demon|lunch> But it's breaking changes, so might be hard to merge + deploy. [20:47:05] <^demon|lunch> (And is hard, since nobody really "owns" Math right now) [20:47:19] ahhhhh [20:47:21] scfc_de: so wikitech has all kinds of odd redirects [20:47:25] but the actual cert chain seems right [20:47:27] so im not sure whats up [20:47:46] physikerwelt: could you send me an email (greg@wikimedia.org) with what is needed to happen to make this a smooth operation? [20:47:50] I'll brb [20:48:06] greg-g: ok I'll do so [20:48:25] thank you very much [20:49:03] so yea puppet makes bad chain even with the correction [20:49:08] i manually fix the chain and restart apache [20:49:10] same error [20:49:30] RobH: Could it be useful to add wikitech.wikimedia.org in manifests/certs.pp's install_certificate? [20:50:06] Or pass $ca = "RapidSSL_CA.pem GeoTrust_Global_CA.pem" in role/nova.pp (guessing). [20:50:30] hrmm, yea, certs missed [20:50:49] hrmm [20:50:53] nah, thats for star stuff [20:50:57] we dont list one offs in there [20:52:16] (03CR) 10Faidon Liambotis: [C: 032] Add *.local.wmftest.org wildcard, mapped to 127.0.0.1 [operations/dns] - 10https://gerrit.wikimedia.org/r/109948 (owner: 10Ori.livneh) [20:53:00] heading out [20:53:01] RobH: Should I file a bug to put solving that on the backburner? [20:53:14] scfc_de: if in doubt fill a bug / RT :-D [20:53:24] nope, we have an rt [20:53:29] i'll reopen it but im not done staring at this [20:53:44] Okay :-). 
[20:54:49] (03PS1) 10Matanya: emery: move left emery udp2log logs and sync jobs to erbium [operations/puppet] - 10https://gerrit.wikimedia.org/r/109957 [20:57:11] scfc_de: so i see the chain as right [20:57:18] and if i curl the redirected url of https://wikitech.wikimedia.org/wiki its ok [20:57:22] drop the /wiki and nope [20:57:35] the chain is correct though in the wikitech.wikimedia.org.chained.pem [21:00:06] "curl https://wikitech.wikimedia.org/wiki" gives me the same error as "curl https://wikitech.wikimedia.org/". [21:00:14] bleh [21:00:30] wtf is up that i get no error, its my machine but i cleaned off all certs [21:00:51] ahh, i see it now [21:01:26] lemme take break off it for 15 and find out wtf is going on with somethign else and illl come back to this [21:02:08] np [21:07:15] scfc_de: RobH : for what it is worth, curl -v https://wikitech.wikimedia.org/ works for me :-) [21:07:38] http://paste.debian.net/plain/78907 [21:08:27] yea i have that on home laptop [21:08:32] but from iron i get the actual ssl error [21:09:14] hio paravoid [21:09:26] i'm trying to compile Snaps' recent initial change for kafkatee! [21:09:32] it requires yajl2 [21:09:39] which, according to this thread [21:09:41] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653880 [21:09:52] has been uploaded to debian unstable on 2012-01-26. [21:09:56] trying to figure out how to install... [21:13:19] ahhh ok i think i got it, needed to ad dit to my sources [21:15:10] hashar: My use case is from Tools where it doesn't work, and on my personal box (Fedora 19) neither. [21:15:34] ahh [21:16:10] ottomata: i pushed the last bit for emery [21:16:28] scfc_de: so yeah it doesn't work on labs http://paste.debian.net/plain/78908 [21:16:42] scfc_de: the cert chain is probably missing/wrong on labs instance :/ [21:16:45] RobH: ^^^ [21:17:16] hashar: what labs instance? [21:17:20] virt0 is hosting it no? 
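The curl results above differ from machine to machine, which is consistent with a server that does not send its full intermediate chain: a client that already has the intermediate can still verify, while others cannot. The sketch below runs the same kind of check curl does, using Python's standard `ssl` module and the local trust store. The function names are hypothetical, and the outcome depends on the CA bundle of the machine it runs on, so it illustrates the check rather than diagnosing wikitech.

```python
import socket
import ssl


def make_verifying_context():
    """Default client context: certificate and hostname verification enabled."""
    return ssl.create_default_context()


def check_chain(host, port=443, timeout=10.0):
    """Try a verified TLS handshake; return (ok, detail). Hypothetical helper."""
    context = make_verifying_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version() or "handshake ok"
    except ssl.SSLCertVerificationError as exc:
        # Raised when no path to a trusted root can be built, for example
        # because the server omitted an intermediate certificate.
        return False, exc.verify_message
    except (ssl.SSLError, OSError) as exc:
        return False, f"connection problem: {exc}"


if __name__ == "__main__":
    for name in ("en.wikipedia.org", "wikitech.wikimedia.org"):
        print(name, check_chain(name))
```

Running it from several hosts, as was done with curl on iron, a laptop and a Labs instance, shows whether a failure follows the server or the client's trust store.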
[21:17:22] any apparently [21:17:29] matanya: thanks [21:17:30] I tried on deployment-bastion.pmtpa.wmflabs [21:17:35] who ok'ed the arabic banner removal? [21:17:44] jonathan [21:17:49] hashar: No, it's wrong on wikitech :-). The server needs to supply the chain from its own certificate to a trusted root CA. [21:17:53] hashar: i dont get if its labs fault then why doesnt our production server iron work? [21:18:34] might be missing some CA in /etc/ssl/certs ? [21:19:56] I think the root CA for wikitech is Equinix or something like that, and that's installed on Labs instances. Also, http://www.sslshopper.com/ssl-checker.html#hostname=wikitech.wikimedia.org & Co. wouldn't complain if the server's cert would be working. [21:19:58] ok, just checing matanya, as I didn't see that explictly in the RT [21:20:00] maybe i'm missin git [21:20:09] oh oh [21:20:11] his most recent one [21:20:12] ok [21:20:24] yes, last comment [21:20:43] after this change is merged, we can shut down emery [21:21:37] ottomata: oops, i missed something: err: /Stage[main]/Wikimetrics/File[/vagrant/wikimetrics/wikimetrics/config]/ensure: change from absent to directory failed: Cannot create /vagrant/wikimetrics/wikimetrics/config; parent directory /vagrant/wikimetrics/wikimetrics does not exist [21:22:14] gwicke: What needs doing for https://gerrit.wikimedia.org/r/#/c/108158/ ? [21:23:18] (03CR) 10Ottomata: "Hmm, you know, just to be safe, I'd rather move these filters one at a time. 
Let's move the api filter over as part of this commit, and l" (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/109957 (owner: 10Matanya) [21:23:27] Reedy: Ori reviewed that patch already, but a quick second look might be good [21:23:32] um Reedy mind lookind at https://gerrit.wikimedia.org/r/#/c/109590/ [21:23:40] hmmmm [21:23:41] ori [21:23:42] hm [21:23:55] when that is fine, the plan is to merge that patch to create the new repo and place the upstart configs onto the parsoid machines [21:23:58] gwicke: I meant for actually deploying it [21:24:06] ah ok [21:24:08] i see that ori [21:24:10] we'd then push out the new code and test the upstart setup on one machine [21:24:34] when that looks good, then we'd switch to the new repo & upstart with a second patch [21:24:48] ori, i forget, does puppet infer dependencies if it manages file paths? [21:24:49] like [21:24:58] Reedy, finally, we'd remove the old repo & init script [21:25:04] file { '/path/to/dir' … } and then later file { '/path/to/dir/fileA': … } [21:25:12] does it know that /path/to/dir/ has to happen first? [21:29:05] (03PS1) 10Ottomata: Requiring clone of wikimetrics repo before setting up config files [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/110063 [21:29:29] (03CR) 10Ottomata: [C: 032] Requiring clone of wikimetrics repo before setting up config files [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/110063 (owner: 10Ottomata) [21:31:46] (03CR) 10Ottomata: [V: 032] Requiring clone of wikimetrics repo before setting up config files [operations/puppet/wikimetrics] - 10https://gerrit.wikimedia.org/r/110063 (owner: 10Ottomata) [21:32:44] (03PS1) 10Tim Landscheidt: wikitech: Add intermediate certificates to chain [operations/puppet] - 10https://gerrit.wikimedia.org/r/110066 [21:33:02] ottomata: yes, but the parent dir has to be declared [21:33:05] (03CR) 10Tim Landscheidt: "Total stab in the dark." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/110066 (owner: 10Tim Landscheidt) [21:37:17] ahh ok thought so but I explicitly required anyway :/ [21:37:24] one sec, got a patch for you... [21:40:25] (03PS1) 10Ori.livneh: VisualEditor performance instrumentation: report counts [operations/puppet] - 10https://gerrit.wikimedia.org/r/110076 [21:40:27] (03PS1) 10Ori.livneh: gdash: Strip (mw) and (cdn) prefixes from dashboard names [operations/puppet] - 10https://gerrit.wikimedia.org/r/110077 [21:40:31] ori https://gerrit.wikimedia.org/r/#/c/110073/ [21:40:40] Reedy, we'll basically need a root to do some basic checking after merging https://gerrit.wikimedia.org/r/#/c/107492/ [21:40:51] should not take more than 30 minutes [21:43:06] (03CR) 10Ori.livneh: [C: 032 V: 032] VisualEditor performance instrumentation: report counts [operations/puppet] - 10https://gerrit.wikimedia.org/r/110076 (owner: 10Ori.livneh) [21:43:13] (03CR) 10Ori.livneh: [C: 032 V: 032] gdash: Strip (mw) and (cdn) prefixes from dashboard names [operations/puppet] - 10https://gerrit.wikimedia.org/r/110077 (owner: 10Ori.livneh) [21:43:43] ori: why? [21:43:47] I found them useful [21:44:06] what exactly? they're a poor replacement for a proper taxonomy, imo [21:44:31] the '(mw)' category was entirely too broad and '(cdn)' wasn't exactly scoped to cdn [21:44:59] i think we'd be better-served by an informal convention of having a prefix followed by a colon [21:45:11] VisualEditor: Timing metrics [21:45:12] (cdn) is data coming from varnishes [21:45:37] but yes, it has metrics such as 500s, which are mw-generated, although from a varnish PoV [21:45:38] yeah, I get the distinction; it's just a bit unclear. How about if I add 'Varnish: ' to those metric names that previously had '(cdn)'? [21:46:10] and MediaWiki: to the ones that had (mw)? 
:) [21:46:33] that would still be better; the title is used in each dashboard page and the (mw) looks awful [21:46:44] but I can add MediaWiki: to the ones that actually measure something mediawiki-related [21:46:56] dunno, I guess it's fine without any prefix too [21:47:01] maybe I'm just used to it [21:47:15] I remember helping me when I was still figuring our setup out, though [21:47:17] i already merged it, so let me run puppet on tungsten and let you stare at it for a few seconds [21:47:33] then we can decide [21:49:13] paravoid: try now. you may need to add a junk query string to bust the cache. [21:49:48] nope, it just works [21:49:57] ™ [21:50:01] yeah, it sucks [21:50:04] (03PS4) 10Matanya: wgSitename: consistent using ' instead of " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109590 [21:50:08] (03CR) 10Reedy: [C: 032] wgSitename: consistent using ' instead of " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109590 (owner: 10Matanya) [21:50:10] specifically because they're alphabetically sorted [21:50:20] so now it's "API methods", then "All HTTP requests" then "Article Methods" [21:50:23] very confusing [21:50:32] yeah, fair point [21:50:35] then "Client-side latency", then "Data Stores" [21:50:46] ok, but i hate '(cdn)'. i'm going with 'Prefix: ' [21:50:48] cool? [21:50:56] sure, np [21:51:08] 'Frontend: ' too, I assume? [21:51:16] for Client-side metrics, VE etc.? [21:51:17] (03PS1) 10Ori.livneh: Revert "gdash: Strip (mw) and (cdn) prefixes from dashboard names" [operations/puppet] - 10https://gerrit.wikimedia.org/r/110085 [21:51:41] or RUM, dunno. That may be too buzzwordy. 
[21:51:45] (03PS4) 10Chad: Remove old Tampa srv* and mw* apaches from dsh groups [operations/puppet] - 10https://gerrit.wikimedia.org/r/108070 [21:53:27] (03Merged) 10jenkins-bot: wgSitename: consistent using ' instead of " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109590 (owner: 10Matanya) [21:58:56] (03PS1) 10MaxSem: Experiment: disable mobileview tidying [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110087 [22:01:22] (03CR) 10Yuvipanda: [C: 031] "Experimentally I'm okay with it - (unverified) but theoretically the apps should be okay with it. Since this is easy enough to revert, let" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110087 (owner: 10MaxSem) [22:01:37] (03PS2) 10Ori.livneh: Restore dashboard name prefixes [operations/puppet] - 10https://gerrit.wikimedia.org/r/110085 [22:02:10] MaxSem: I gave it a +1 [22:02:15] MaxSem: you should probably ask brion too :) [22:02:21] thanks yuvipanda [22:02:22] MaxSem: and maybe the web team? idk [22:02:26] MaxSem: I'm off now! 
[22:02:29] (03CR) 10Ori.livneh: [C: 032 V: 032] Restore dashboard name prefixes [operations/puppet] - 10https://gerrit.wikimedia.org/r/110085 (owner: 10Ori.livneh) [22:02:32] sweet dreams [22:04:19] (03CR) 10Brion VIBBER: [C: 031] "Should work -- if we have imbalanced HTML inside the section chunks, they'll get tidied up by the browser parser when inserted into the do" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110087 (owner: 10MaxSem) [22:04:50] stupid colons [22:04:53] stupid yaml [22:04:56] haha [22:04:58] stupid data [22:05:09] (03PS2) 10Reedy: Rename phase1 dblist to group0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109934 [22:05:19] (03CR) 10Reedy: [C: 032] Rename phase1 dblist to group0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109934 (owner: 10Reedy) [22:05:20] * ori waves the 5-minute amnesia wand [22:05:48] * greg-g has deja vu [22:05:59] or was that DjVu, and I was just compromised? [22:06:33] it was dejanews, and you were reading USENET [22:07:19] we should bring usenet back [22:07:22] wikipedia over nntp [22:07:42] I hear we can download more female participation from Usenet. [22:08:01] brion: they're working on it: https://archive.org/details/giganews [22:08:25] I love you IA [22:08:27] heh [22:08:27] so much [22:10:03] (03PS1) 10Ori.livneh: gdash: quote all dashboard names and descriptions [operations/puppet] - 10https://gerrit.wikimedia.org/r/110089 [22:10:27] (03CR) 10Ori.livneh: [C: 032 V: 032] gdash: quote all dashboard names and descriptions [operations/puppet] - 10https://gerrit.wikimedia.org/r/110089 (owner: 10Ori.livneh) [22:12:18] ottomata: the risk with puppet-merge is that if you have a large change it is very easy to miss any changes before it. you have to be disciplined about scrolling up. ideally puppet-merge would warn you if there is more than one committer name across the set of patches you are merging. 
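The multi-committer warning suggested above for puppet-merge could be sketched roughly as follows. This is a minimal sketch, not the actual puppet-merge code: the function names `count_committers` and `warn_if_multiple_committers` are hypothetical, and it assumes the old and new revisions are passed in as arguments. The count is compared numerically rather than matched as text, because the padding of `wc -l` output varies between platforms.

```shell
#!/bin/sh
# Hypothetical helper: read one committer email per line on stdin and
# print how many distinct addresses there are.
count_committers() {
    sort -u | wc -l | tr -d '[:space:]'
}

# Hypothetical helper: warn on stderr when the commits in OLD..NEW were
# committed by more than one person.
# Usage: warn_if_multiple_committers OLD NEW
warn_if_multiple_committers() {
    n=$(git log "$1..$2" --format=%ce | count_committers)
    if [ "$n" -gt 1 ]; then
        echo "WARNING: $n distinct committers in $1..$2" >&2
    fi
}
```

Keeping the counting separate from the `git log` call means it can be exercised with plain text input, without a repository.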
[22:12:57] (03Merged) 10jenkins-bot: Rename phase1 dblist to group0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109934 (owner: 10Reedy) [22:13:14] ori that would be fancy [22:13:24] i think i also need to change the diff that gets shown if there is a submodule change too [22:13:27] diff unified or whatever [22:18:28] ottomata: git log --color HEAD..${fetch_head_sha1} --format=%ce | sort | uniq | wc -l | grep -q 1$ [22:18:38] actually the --color doesn't make sense since you're piping it [22:19:01] !log reedy synchronized docroot and w [22:19:03] but should be correct otherwise; will be truthy if there is only one committer and false otherwise [22:19:09] Logged the message, Master [22:19:37] !log reedy synchronized database lists files: [22:19:38] actually, will be true for 11 committers as well [22:21:33] (03PS1) 10Reedy: Make sync-dblist report done, don't echo mediawiki-installation [operations/puppet] - 10https://gerrit.wikimedia.org/r/110092 [22:27:19] !log reedy updated /a/common to {{Gerrit|I6b16562d0}}: Rename phase1 dblist to group0 [22:27:27] Logged the message, Master [22:27:35] (03PS1) 10Reedy: Move reuseable code from switchAllMediaWikis to switchAllMediaWikis.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110093 [22:28:03] (03CR) 10Reedy: [C: 032] Move reuseable code from switchAllMediaWikis to switchAllMediaWikis.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110093 (owner: 10Reedy) [22:28:09] (03Merged) 10jenkins-bot: Move reuseable code from switchAllMediaWikis to switchAllMediaWikis.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/110093 (owner: 10Reedy) [22:33:23] Ok, WTF wikitech [22:33:24] ottomata: git log HEAD..${fetch_head_sha1} --format=%ce | sort | uniq | wc -l | grep -Eq '\s+1$' [22:33:26] I manually made your chained cert [22:33:30] Why arent you working?!?! [22:33:43] ottomata: $? 
will be 0 if one one committer, 1 otherwise [22:34:14] ottomata: maybe test for that and just add an echo with a warning if true? [22:36:54] ori….i'm closed down for the day, but patch away? :p [22:37:00] :) [22:37:02] where's the repo? [22:37:06] * ori searches [22:38:19] its just a script in puppet [22:39:18] yep, found it [22:41:51] ok, who understands ssl certificate chains and wants to give this a glance? [22:42:02] cuz im at a loss on why my manually created ssl chain isnt working [22:42:06] (for wikitech) [22:43:42] my chain checks show its ok [22:43:49] but i get curl error trying to pull it, stating cert error [22:44:00] so not sure wtf is up. [22:44:04] (or if its really an issue) [22:46:26] yet curling the https url from iron shows the cert error. [22:46:55] RobH: If I do "echo | openssl s_client -connect wikitech.wikimedia.org:443 -showcerts | less", I only see one certificate. Could you post the certificate chain to pastebin or somewhere? [22:48:05] interesting [22:48:15] i was just doing check, not reading full output [22:48:18] i have same finding as you [22:48:39] i just did openssl verify sslserver on virt0 [22:48:46] against wikitech.wikimdia.org, its less info than that. [22:49:06] so yea... not showing both [22:49:07] wtf [22:49:25] RobH: I'm making this up as I go along, so don't expect much wisdom from me :-). [22:49:33] no, its helping [22:49:38] i appreciate it! [22:49:55] so yea, i run against other certs we have that are identical (in that they should show the chain) [22:50:04] so gerrit.w.o, blog.w.o, they present the full chain [22:50:14] but not wikitech.w.o =P [22:50:27] though so far as i can tell, they should be identical [22:51:02] well, gerrit is same, blog may be different [22:51:07] gerrit is the known good example. [22:51:10] shows full chain [22:52:11] And wikitech.chained.pem looks similar in structure to gerrit.chained.pem? (Are those binaries?) 
[22:53:37] just certificate file of txt, should be half the same [22:53:48] each is rapidssl issued, so the issuer cert is identical for both [22:53:58] the chain is just catting them together [22:54:23] Yeah, but are there one or more --- BEGIN ... --- sections in them? [22:55:17] Is wikitech using the .pem or the .chained.pem as certificate? [22:55:24] the .pem [22:55:33] so i wonder where its setting that chain [22:55:40] cuz it may be snagging old one from config [22:55:48] which would explain why its broken [22:56:23] well, i merged my change, lemme rerun puppet and have it create the chain and see if it does it properly [22:56:29] install_certificate in manifest/certs.pp creates the .chained.pem AFAICS. [22:57:32] well, it does for the nginx hosts that call that [22:57:36] but pretty sure our odd one offs dont [22:57:43] or else we'd have blog, gerrit, etc in certs.pp [22:58:12] i think.. [22:58:30] You're right, files/apache/sites/blog.wikimedia.org refers to the .pem alone ... [22:58:35] ok, well, i see an issue [22:58:56] i added my ca_name to nova and all [22:59:12] but its still making a chain using the internal wmlabs one [22:59:18] ca_name that is [22:59:20] not rapidssl. [22:59:32] why? [22:59:41] Did you look at my Gerrit change? $ca_name isn't used. [22:59:58] (Outside of .../ldap/server.pp.) [23:00:07] (03CR) 10Awjrichards: "I've followed up with the mobile department and ping'd the appropriate product folks." [operations/puppet] - 10https://gerrit.wikimedia.org/r/108738 (owner: 10Faidon Liambotis) [23:00:24] awjr: thanks! 
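Since a chained .pem is just certificates concatenated together, counting the BEGIN blocks is a quick way to compare what is on disk with what the server presents. A minimal sketch, assuming PEM input on stdin; the function name `count_pem_certs` is hypothetical, and the file name in the comments is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: count the PEM certificate blocks read from stdin.
count_pem_certs() {
    grep -c -e '-----BEGIN CERTIFICATE-----'
}

# Certificates in a chain file on disk (file name is illustrative):
#   count_pem_certs < wikitech.wikimedia.org.chained.pem
# Certificates the server actually presents during the handshake:
#   echo | openssl s_client -connect wikitech.wikimedia.org:443 -showcerts 2>/dev/null | count_pem_certs
```

If the file holds two or more blocks but the server only presents one, the chain file is probably not the one the vhost is serving.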
[23:00:26] hrmm [23:00:34] np paravoid [23:00:50] https://gerrit.wikimedia.org/r/#/c/110066/ [23:00:54] scfc_de: i see what yer doin [23:00:58] well, im willing to give this a shot [23:01:01] =] [23:01:14] it makes sense that you list off all three entries rather than one [23:01:22] well, two, main and first in chain [23:01:35] oh that's what that SoS card was [23:01:37] (03CR) 10RobH: [C: 032] wikitech: Add intermediate certificates to chain [operations/puppet] - 10https://gerrit.wikimedia.org/r/110066 (owner: 10Tim Landscheidt) [23:01:52] awjr: thanks [23:02:09] lets give it a shot. [23:02:24] np mark [23:04:25] (It would be nice to standardize the whole Apache things more and consolidate them using webserver::apache::site or something similar, so there's less room for subtle differences between them.) [23:04:30] so i think i have two issues. 1: puppet doesnt make chain properly, which scfc_de's fix will hopefully do [23:04:47] scfc_de: your rant has been made by everyone in ops, there is a start of standardizing [23:04:49] =] [23:05:15] and then the apache vhost wasnt telling fetches where the full ca cert path [23:05:53] scfc_de: your patchset fixed the chaining issue, needed the geotrust addtion [23:06:11] still only presents one [23:06:16] but the chain internal on host is right [23:06:25] now i think we need the sslcertificatepath added to vhost for it [23:07:17] So now you're pointing SSLCertificateChainFile to .chained.pem à la modules/gitblit/templates/git.wikimedia.org.erb? 
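To see which certificate-related directives a given vhost file actually sets, one option is to grep for them. A minimal sketch only: the function name `vhost_ssl_directives` is hypothetical, the directive names are the standard mod_ssl ones, and the vhost path is whatever file is being checked.

```shell
#!/bin/sh
# Hypothetical helper: print the certificate-related mod_ssl directives set
# in the Apache vhost file given as $1. Directive names are matched
# case-insensitively, as Apache treats them.
vhost_ssl_directives() {
    grep -iE '^[[:space:]]*SSL(CertificateFile|CertificateKeyFile|CertificateChainFile|CACertificatePath|CACertificateFile)[[:space:]]' "$1"
}
```

Running it against a known-good vhost and the misbehaving one side by side makes a missing directive easy to spot.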
[23:09:06] scfc_de: so it will work now [23:09:09] so nah was two issues [23:09:18] so the cert install is how puppet handles it, but it wasnt the issue here [23:09:25] its the issue for new apache configs which refrence that properly [23:09:30] old configs like this still need SSLCACertificatePath /etc/ssl/certs/ added [23:09:39] so you future-proofed it for update later [23:09:44] and now i hot-fixed the vhost file [23:09:48] which im adding to puppet now [23:10:12] cuz glitblit uses the new proper way [23:10:16] and this is old way [23:10:18] Yeah, "curl https://wikitech.wikimedia.org/" works for me now on my box and on Labs. That was a long birth :-). Thanks! [23:10:26] (works, but doesn't seem to be the popular way to do it now)) [23:10:34] thanks for working with me on it =] [23:10:59] np [23:12:35] (03PS1) 10RobH: wikitech.w.o vhost didnt set sslcertificatepath [operations/puppet] - 10https://gerrit.wikimedia.org/r/110099 [23:13:52] integration.wikimedia.org/zuul means never refreshing your patchset page [23:14:00] or atleast just once maybe [23:14:19] (03CR) 10RobH: [C: 032] wikitech.w.o vhost didnt set sslcertificatepath [operations/puppet] - 10https://gerrit.wikimedia.org/r/110099 (owner: 10RobH) [23:17:49] and the actual fix is live (no more live hack) [23:19:19] And it's still working, so it seems to last :-). [23:27:12] (03CR) 10Reedy: "There's gotta be a better way to do this" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101889 (owner: 10Odder) [23:27:49] (03Abandoned) 10Reedy: Language Template fixup definition for UploadWizard. 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/39026 (owner: 10Lupo) [23:28:12] (03PS3) 10MaxSem: Extend OpenSearchXml with images from PageImages [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94179 [23:28:16] (03CR) 10Reedy: [C: 032] Extend OpenSearchXml with images from PageImages [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94179 (owner: 10MaxSem) [23:28:23] (03Merged) 10jenkins-bot: Extend OpenSearchXml with images from PageImages [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94179 (owner: 10MaxSem) [23:29:09] (03PS10) 10Dereckson: Throttle now handles IP ranges. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/65644 [23:31:29] (03PS3) 10Dereckson: New extra language for wikidata: Ottoman Turkish (ota) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/96771 [23:31:32] (03CR) 10Reedy: [C: 032] New extra language for wikidata: Ottoman Turkish (ota) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/96771 (owner: 10Dereckson) [23:31:40] (03Merged) 10jenkins-bot: New extra language for wikidata: Ottoman Turkish (ota) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/96771 (owner: 10Dereckson) [23:33:12] (03CR) 10Reedy: Give testwiki some custom namespaces (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/78016 (owner: 10TTO) [23:33:30] (03Abandoned) 10Reedy: Added more resolutions to commons.ico [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/103107 (owner: 10Yatinmaan) [23:34:15] (03PS8) 10Hashar: sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 [23:34:34] (03PS3) 10Gerrit Patch Uploader: Enable FlaggedRevs extension on ce.wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/108302 [23:35:53] !log Created FlaggedRevs tables on cewiki [23:36:01] Logged the message, Master [23:36:36] ottomata, do you know 
if I can simply reply to emails I get from access-requests at rt.wikimedia.org and they will get posted in RT? [23:36:44] (03PS4) 10Reedy: Enable FlaggedRevs extension on ce.wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/108302 (owner: 10Gerrit Patch Uploader) [23:36:50] yeah, i saw you replied recently, right? [23:37:05] (03CR) 10Reedy: [C: 032] Enable FlaggedRevs extension on ce.wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/108302 (owner: 10Gerrit Patch Uploader) [23:37:12] (03Merged) 10jenkins-bot: Enable FlaggedRevs extension on ce.wikipedia [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/108302 (owner: 10Gerrit Patch Uploader) [23:37:18] ottomata, yeah, ok, just double checking, thanks [23:37:36] (03Abandoned) 10Reedy: Updated docroot/bits/favicon/internal.ico [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/100326 (owner: 10Gerrit Patch Uploader) [23:37:36] yup! [23:37:51] (03CR) 10Lydia Pintscher: "Why was this merged without Daniel or Katie looking into it like I asked? This has the potential for serious breakage." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/96771 (owner: 10Dereckson) [23:37:55] (03CR) 10Odder: "Scribunto does not allow the creation of namespace aliases; it is only possible to translate NS_MODULE and NS_MODULE_TALK into a language " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101889 (owner: 10Odder) [23:43:51] (03CR) 10Reedy: "SMW code would seem to suggest it can be done..." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/101889 (owner: 10Odder) [23:45:41] (03PS6) 10Reedy: Make missing.php aware of interwiki prefixes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94716 (owner: 10TTO) [23:46:12] (03PS7) 10TTO: Make missing.php aware of interwiki prefixes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94716 [23:46:17] (03CR) 10Reedy: [C: 032] Make missing.php aware of interwiki prefixes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94716 (owner: 10TTO) [23:46:24] (03Merged) 10jenkins-bot: Make missing.php aware of interwiki prefixes [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/94716 (owner: 10TTO) [23:48:09] (03PS3) 10Addshore: Add wikibase permissions to MWOAuthGrantPermissions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109333 [23:48:13] (03CR) 10Reedy: [C: 032] Add wikibase permissions to MWOAuthGrantPermissions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109333 (owner: 10Addshore) [23:48:20] (03Merged) 10jenkins-bot: Add wikibase permissions to MWOAuthGrantPermissions [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109333 (owner: 10Addshore) [23:50:05] merge all the stuffs! 
:) [23:52:15] (03PS2) 10Reedy: Add checkuser OAuth group [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109308 (owner: 10Anomie) [23:54:51] (03CR) 10Reedy: [C: 032] Add checkuser OAuth group [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109308 (owner: 10Anomie) [23:54:58] (03Merged) 10jenkins-bot: Add checkuser OAuth group [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/109308 (owner: 10Anomie) [23:57:27] (03CR) 10Peachey88: "@Lydia Pintscher, Based on your BZ Comment (C7) they did look at it, And Siebrand even followed up with a comment close the the start of D" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/96771 (owner: 10Dereckson) [23:59:12] (03PS1) 10Ori.livneh: puppet-merge: warn if multiple committers [operations/puppet] - 10https://gerrit.wikimedia.org/r/110104