[01:50:39] PROBLEM - MySQL Recent Restart on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:50:40] PROBLEM - MySQL Processlist on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:50:50] PROBLEM - DPKG on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:50:59] PROBLEM - check if dhclient is running on db1046 is CRITICAL: Timeout while attempting connection [01:51:00] PROBLEM - RAID on db1046 is CRITICAL: Timeout while attempting connection [01:51:00] PROBLEM - SSH on db1046 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:51:01] PROBLEM - MySQL disk space on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:51:01] PROBLEM - puppet disabled on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:51:01] PROBLEM - Disk space on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:51:02] PROBLEM - MySQL Idle Transactions on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:51:02] PROBLEM - check configured eth on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:51:02] PROBLEM - mysqld processes on db1046 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [01:52:19] * springle pokes db1046 [02:13:31] PROBLEM - Disk space on virt0 is CRITICAL: DISK CRITICAL - free space: /a 3790 MB (3% inode=99%): [02:20:42] !log LocalisationUpdate completed (1.24wmf1) at 2014-04-28 02:20:39+00:00 [02:20:50] Logged the message, Master [02:21:30] PROBLEM - Disk space on virt0 is CRITICAL: DISK CRITICAL - free space: /a 3434 MB (3% inode=99%): [02:21:48] D'oh! [02:22:21] !log powercycle db1046 unresponsive [02:22:28] Logged the message, Master [02:22:53] Ah, I was wondering why I couldn't connect to the mgmt interface. [02:23:10] Simple answer: because you were. 
:-) [02:23:22] heh [02:23:30] RECOVERY - Disk space on virt0 is OK: DISK OK [02:30:07] !log LocalisationUpdate completed (1.24wmf2) at 2014-04-28 02:30:04+00:00 [02:30:14] Logged the message, Master [03:00:04] !log starting online schema change, bug 64411, page_props.pp_sortkey [03:00:11] Logged the message, Master [03:12:17] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 28 03:12:11 UTC 2014 (duration 12m 10s) [03:12:22] Logged the message, Master [03:23:39] springle: I'm around, by the way, if there's anything you need [03:27:56] ori: cool thanks :) I noticed on email list you said you can cross reference db1048 uuid's with log files, which would be wise. Fine to begin that against db1048 if/when you like. [03:28:24] it won't affect me fighting with db1046 and db1047 repl [03:28:46] cool, will do. i'll need to modify a script so it'll take me a few to get started; will let you know once i have the results. [03:29:53] excellent [03:30:23] i'm still a little worried about the duplicate ids [03:30:53] springle: which duplicate ids? [03:31:01] i ran joins on uuid for all tables and found no gaps, but i wonder if any new ids from the new consumer were overwritten [03:31:57] springle: you'll have to slow down for me a little :) how would you look for gaps? [03:32:50] ori: recall I mentioned in email that running the duplicate consumers in parallel for a while meant some auto-inc ids were used on db1048 as well as in the db1047 dump? theoretically shouldn't be an issue, but would be nice to verify from external uuid's [03:34:08] ori: looking for gaps meant LEFT JOIN .. WHERE .. IS NULL on old and new consumer data [03:36:00] ok, that makes sense. [03:36:44] do you use the auto-inc primary key for anything? [03:36:50] (just wondering) [03:37:14] no, it's not significant (and not part of the event record). it's an artifact of storing in the database. 
i would have used the uuid as the primary key but asher advised against it iirc [03:37:25] yeah, for index size [03:37:42] well, and write overhead i guess [03:38:13] uuid == PK would help for any future INSERT IGNORE migrations though :) [03:38:27] well, there's a unique key on uuid, it's just not the primary key [03:39:02] it isn't unique [03:39:03] hm, is it not unique? [03:39:06] not formally [03:39:08] yes, i just noticed that [03:39:13] normal key, and nullable [03:39:43] that's odd. i could have sworn.. let me look at the source code again, it's been a while [03:39:44] hence my last email to you :) i assumed it was properly unique only based on uuid-ness [03:40:34] # Every table gets an integer auto-increment primary key column `id` [03:40:35] # and an indexed CHAR(32) column, `uuid`. (UUIDs could be stored as [03:40:35] # binary in a CHAR(16) column, but at the cost of readability.) [03:40:36] columns = [ [03:40:39] sqlalchemy.Column('id', sqlalchemy.Integer, primary_key=True), [03:40:40] # To keep INSERTs fast, the index on `uuid` is not unique. [03:40:42] sqlalchemy.Column('uuid', sqlalchemy.CHAR(32), index=True) [03:40:44] ] [03:40:51] :D [03:40:55] speeeeed! [03:41:22] which actually I wouldn't be too concerned about knowing the relatively small consumer write load [03:41:42] is that actually legit, or is 2013 ori full of shit? [03:41:52] * ori doesn't trust older self [03:42:25] i would have probably gone with CHAR(16) in hindsight; UUIDs are many things but readable is not one [03:42:54] unique keys are a little slower to update in some cases [03:43:15] not crazy. but probably also not a big deal in this case [03:43:51] CHAR(16) vs CHAR(32) is a legit consideration [03:44:08] i may quote you in the file header [03:44:14] :) [03:44:19] "not crazy." -- sean pringle, 27-April-2014 [03:50:12] actually, make them BINARY(16). 
table charset is utf8 [03:52:22] ori: pros and cons http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/ [03:54:15] imo the best combination here would be binary(16) uuid as primary key and lose the auto-inc id. this would still have some drawbacks, but the overhead of a single 16-byte binary PK vs auto-inc + secondary index, with just a single consumer, is lesser [03:56:17] yes, that seems right [03:57:38] and if readability is a concern the uuid could be duplicated and stored as a redundant but unindexed char(32) column [03:58:00] or binary [03:58:09] yep [04:00:22] it's compelling -- not for whatever performance improvement it would confer, but because it would make QAing the data easier [04:03:36] i think i'm going to write up the considerations in a TODO file or a bugzilla bug, and then CC analytics. i'm a bit reluctant to mess around with the data model [04:06:36] yep. plus any performance concern is also very storage engine specific, and we're talking innodb mostly. if it was tokudb or aria engine, bottlenecks will be elsewhere [04:06:53] losing auto-inc happens to make it more portable too, if that matters [04:08:30] PROBLEM - MySQL Slave Delay on db60 is CRITICAL: CRIT replication delay 334 seconds [04:08:32] PROBLEM - MySQL Replication Heartbeat on db60 is CRITICAL: CRIT replication delay 334 seconds [04:16:30] RECOVERY - MySQL Slave Delay on db60 is OK: OK replication delay 92 seconds [04:16:32] RECOVERY - MySQL Replication Heartbeat on db60 is OK: OK replication delay 90 seconds [04:34:57] (03CR) 10KartikMistry: "We have two instances maintained by team that are using browsertests." [operations/puppet] - 10https://gerrit.wikimedia.org/r/129687 (owner: 10Hashar) [06:47:33] apergos: morning [06:48:05] would an icinga check on all hosts for cpu/memory usage be useful ? 
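A minimal sketch of the uuid discussion above (03:31 to 04:06), for readers who want to see the two ideas side by side: the 16-byte binary uuid used as the primary key with no auto-inc id, and the "LEFT JOIN .. WHERE .. IS NULL" gap check between old and new consumer data. Everything here is an assumption for illustration, not the real consumer code: the table names old_consumer and new_consumer, the payload column, and the helper names are hypothetical, and SQLite (BLOB, INSERT OR IGNORE) stands in for MySQL/InnoDB (BINARY(16), INSERT IGNORE).

```python
import sqlite3
import uuid


def make_table(conn, name):
    # Hypothetical schema: the uuid is stored as 16 raw bytes and is the
    # primary key; there is no auto-increment id column.
    conn.execute(f"CREATE TABLE {name} (uuid BLOB PRIMARY KEY, payload TEXT)")


def insert_event(conn, table, uuid_hex, payload):
    # uuid_hex is the readable 32-character form; only the 16-byte form is stored.
    # INSERT OR IGNORE is SQLite's counterpart of MySQL's INSERT IGNORE:
    # a row whose uuid already exists is silently skipped.
    conn.execute(
        f"INSERT OR IGNORE INTO {table} (uuid, payload) VALUES (?, ?)",
        (uuid.UUID(uuid_hex).bytes, payload),
    )


def missing_from_new(conn, old, new):
    # Gap check: rows present in the old table with no matching uuid in the new one.
    rows = conn.execute(
        f"SELECT o.uuid FROM {old} AS o "
        f"LEFT JOIN {new} AS n ON n.uuid = o.uuid "
        f"WHERE n.uuid IS NULL"
    ).fetchall()
    # Convert back to the readable hex form for reporting.
    return sorted(uuid.UUID(bytes=row[0]).hex for row in rows)


conn = sqlite3.connect(":memory:")
make_table(conn, "old_consumer")
make_table(conn, "new_consumer")

# Fixed uuids so the run is repeatable.
ids = [uuid.UUID(int=n).hex for n in (1, 2, 3)]
for u in ids:
    insert_event(conn, "old_consumer", u, "event")
for u in ids[:2]:
    insert_event(conn, "new_consumer", u, "event")

# Re-inserting an existing uuid is ignored instead of creating a duplicate.
insert_event(conn, "new_consumer", ids[0], "replayed event")

print(missing_from_new(conn, "old_consumer", "new_consumer"))
```

With the uuid as primary key the duplicate insert cannot create a second row, which is the property that would make an INSERT IGNORE style migration safe; with a plain non-unique index, as in the quoted source, the replayed event would be stored twice.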
[07:04:27] (03PS1) 10Nemo bis: Use an actually generic address as $wmgNotificationSender default [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130037 (https://bugzilla.wikimedia.org/58261) [07:06:01] (03CR) 10Nemo bis: "Followup: I214b0f5fd82d357a32849bee2e072f33577f8ef6" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/59717 (https://bugzilla.wikimedia.org/46670) (owner: 10Lwelling) [07:06:51] (03CR) 10Legoktm: [C: 031] Use an actually generic address as $wmgNotificationSender default [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130037 (https://bugzilla.wikimedia.org/58261) (owner: 10Nemo bis) [07:29:55] (03PS1) 10Dzahn: disable Siebrand's shell account [operations/puppet] - 10https://gerrit.wikimedia.org/r/130038 [07:31:56] (03CR) 10Dzahn: [C: 032] disable Siebrand's shell account [operations/puppet] - 10https://gerrit.wikimedia.org/r/130038 (owner: 10Dzahn) [07:32:28] morning mutante [07:34:13] <_joe_> hi matanya [07:34:22] hi _joe_ :) [07:34:49] <_joe_> matanya: http://puppet-transition-helper.wmflabs.org/html/ (today it will improve, still here are reasonable results [07:34:54] <_joe_> matanya: bottom line [07:35:08] <_joe_> matanya: a lot of templates to fix, and just a few other things [07:36:04] * matanya is looking [07:36:25] 500 Internal Server Error [07:36:57] _joe_: i guess that needs fixing too :P [07:37:12] <_joe_> matanya: that's a 404 [07:37:26] <_joe_> but that is because we need to fix the private/labs repo [07:37:32] <_joe_> I'll do that today [07:37:45] k, thanks [07:41:48] _joe_: to verify the output of the tool: what is the content of /etc/icinga/puppet_hosts.cfg on neon? [07:44:17] <_joe_> matanya: icinga results on 2.7 are wrong for sure [07:44:32] <_joe_> matanya: I had to disable --storedconfigs on 2.7 or it would fail [07:45:07] <_joe_> so we have no external resources collected in puppet 2.7 [07:45:25] just wondering :) can you check please ? 
[07:47:55] (03PS8) 10Giuseppe Lavagetto: Substituting the check_graphite script. [operations/puppet] - 10https://gerrit.wikimedia.org/r/125726 [07:48:35] <_joe_> matanya: yes gimme 5 minutes [07:49:14] <_joe_> matanya: what should I check in particular? [07:49:39] the content of the file. should be : /usr/local/bin/naggen', '--stdout', '--type', 'hostextinfo [07:50:01] <_joe_> no. [07:50:10] <_joe_> look at http://puppet-transition-helper.wmflabs.org/compiled/puppet_catalogs_2.7/neon.wikimedia.org.warnings [07:50:26] <_joe_> the first line is the error, we don't have naggen [07:50:35] <_joe_> that's why compilation fails [07:50:42] <_joe_> ok, another dependency [07:51:04] so we should add naggen? [07:51:09] <_joe_> yes [07:51:35] to your tool, or to our puppet repo? who has the missing deps? [07:52:30] <_joe_> to my 'tool' [07:52:42] <_joe_> I just guess naggen is not in $PATH [07:52:48] <_joe_> I'll check in a few [07:53:18] modules/puppetmaster/files/naggen [07:53:36] (03CR) 10Giuseppe Lavagetto: [C: 032] "Let's merge (and fix anything later)." [operations/puppet] - 10https://gerrit.wikimedia.org/r/125726 (owner: 10Giuseppe Lavagetto) [07:56:55] (03PS1) 10Dzahn: disable account 'akhanna' [operations/puppet] - 10https://gerrit.wikimedia.org/r/130039 [07:57:29] morning hashar do you need help with https://bugzilla.wikimedia.org/show_bug.cgi?id=63934 ? 
[07:58:03] matanya: I gotta figure out a solution for the dependent tickets :} [07:58:12] hold on brb [08:00:07] (03CR) 10Dzahn: [C: 032] disable account 'akhanna' [operations/puppet] - 10https://gerrit.wikimedia.org/r/130039 (owner: 10Dzahn) [08:05:31] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: usage: check_graphite [-h] [-U URL] [-T TIMEOUT] [08:06:37] _joe_: ^ [08:08:08] <_joe_> matanya: yes I know [08:09:42] <_joe_> matanya: wait for the next puppet run at least, the file on disk has changed but the checkcommands def has not been reloaded [08:11:08] <_joe_> matanya: if that does not recover in ~ 10 minutes, then I'll worry about it [08:11:14] (03CR) 10Hashar: "@KartikMistry Yup that is what this patch is about. Your two instances were unreachable from the Jenkins slave so I added the iptables ru" [operations/puppet] - 10https://gerrit.wikimedia.org/r/129687 (owner: 10Hashar) [08:11:18] ok, sure [08:24:11] (03PS1) 10Matanya: appserver: no more hardy boxes [operations/puppet] - 10https://gerrit.wikimedia.org/r/130042 [08:44:33] (03PS3) 10Dzahn: rm old wikibugs - replaced by pywikibugs [operations/puppet] - 10https://gerrit.wikimedia.org/r/129694 [08:45:33] (03CR) 10Dzahn: [C: 032] rm old wikibugs - replaced by pywikibugs [operations/puppet] - 10https://gerrit.wikimedia.org/r/129694 (owner: 10Dzahn) [08:49:35] !log reloading db1046 from fresh m2 dump [08:49:42] Logged the message, Master [08:51:33] (03CR) 10Dzahn: "what is the motivation? are all ferm rules supposed to be in roles? is that considered config?" 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/129965 (owner: 10Matanya) [08:54:03] (03PS1) 10Odder: Add Library of Congress to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130043 (https://bugzilla.wikimedia.org/64487) [08:55:01] (03PS2) 10Odder: Add Library of Congress to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130043 (https://bugzilla.wikimedia.org/64487) [08:57:25] (03CR) 10Zfilipin: [C: 031] contint/beta: set natfix for the labs shared proxy [operations/puppet] - 10https://gerrit.wikimedia.org/r/129687 (owner: 10Hashar) [09:11:05] (03PS5) 10Dzahn: turn ircecho into a parameterized class [operations/puppet] - 10https://gerrit.wikimedia.org/r/129676 [09:16:37] (03CR) 10Dzahn: "Alex, there is just the single bot (icinga-wm) left now" [operations/puppet] - 10https://gerrit.wikimedia.org/r/129676 (owner: 10Dzahn) [09:18:30] (03PS1) 10Springle: Include standard for the new mariadb roles. Remove duplicate includes added previously to node definitions. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130046 [09:21:08] _joe_: ^ [09:21:16] <_joe_> springle: on it [09:21:23] thank you [09:21:58] <_joe_> (btw, including standard twice should not be an issue) [09:22:08] ah cool, did wonder [09:22:37] (03CR) 10Giuseppe Lavagetto: [C: 032] Include standard for the new mariadb roles. Remove duplicate includes added previously to node definitions. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130046 (owner: 10Springle) [09:26:58] (03PS1) 10Hashar: contint: get phantomJS on Jenkins slaves [operations/puppet] - 10https://gerrit.wikimedia.org/r/130049 [09:27:18] (03CR) 10Matanya: "I think ferm rules are specific to wmf, and as such should be in role classes." [operations/puppet] - 10https://gerrit.wikimedia.org/r/129965 (owner: 10Matanya) [09:27:30] https://www.wikimania.org/ has wrong cert? 
[09:28:14] Nemo_bis: open a ticket in rt [09:28:26] you should poke RobH [09:28:56] Nemo_bis: that will be related to wikimania.org moving from WMCH to WMF [09:29:03] there is a months old ticket about that [09:29:19] #5587 [09:29:38] so Wikimania CH used to own that domain and it returns a wikimedia.ch cert [09:30:13] in fact, you'll have to ask Manuel Schneider [09:30:23] Admin Email:info@wikimedia.ch [09:30:43] and also CC RobH about a new cert [09:33:38] last status we have was that the decision what happens with wikimania.org went to the WMCH board [09:34:32] (03PS2) 10Hashar: contint: get phantomJS on Jenkins slaves [operations/puppet] - 10https://gerrit.wikimedia.org/r/130049 [09:35:16] (03PS3) 10Hashar: contint: get phantomJS on Jenkins slaves [operations/puppet] - 10https://gerrit.wikimedia.org/r/130049 [09:35:18] (03PS3) 10Hashar: contint: get composer on Jenkins slaves [operations/puppet] - 10https://gerrit.wikimedia.org/r/124305 [09:35:54] (03CR) 10Hashar: [V: 032] "Rebased. Still deployed on contint puppetmaster" [operations/puppet] - 10https://gerrit.wikimedia.org/r/124305 (owner: 10Hashar) [09:36:25] (03CR) 10Hashar: [C: 031 V: 032] "Rebased on top of https://gerrit.wikimedia.org/r/#/c/124305/ which brings composer. That is to avoid a potential conflict." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130049 (owner: 10Hashar) [09:42:11] (03CR) 10Zfilipin: [C: 031] contint: get phantomJS on Jenkins slaves [operations/puppet] - 10https://gerrit.wikimedia.org/r/130049 (owner: 10Hashar) [09:46:48] (03CR) 10KartikMistry: [C: 031] "As per Hashar's latest comments :)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/129687 (owner: 10Hashar) [09:47:46] (03PS1) 10Giuseppe Lavagetto: Fix the monitor_graphite_threshold check. [operations/puppet] - 10https://gerrit.wikimedia.org/r/130052 [09:48:36] (03CR) 10Giuseppe Lavagetto: [C: 032] Fix the monitor_graphite_threshold check. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/130052 (owner: 10Giuseppe Lavagetto) [09:58:32] mutante: ah right, I had forgotten [10:02:58] sent [10:05:27] (03CR) 10Alexandros Kosiaris: [C: 032] "Matanya is correct. Monitoring, backup and firewalling are very specific to WMF production and as such they are better positioned in role" [operations/puppet] - 10https://gerrit.wikimedia.org/r/129965 (owner: 10Matanya) [10:06:31] Oh, RT gives AutoReply now, that's very helpful :) [10:12:14] (03PS1) 10Aude: Set $wgPagePropsHaveSortkey to false [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130055 (https://bugzilla.wikimedia.org/64411) [10:13:22] (03CR) 10Dzahn: "" This package is a dummy transitional package. It can be safely removed." Depends: fonts-vlgothic" [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [10:16:04] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1% data above the threshold [250.0] [10:17:18] <_joe_> and there it is. [10:17:57] :) [10:18:19] <_joe_> now I should write some docs :) [10:18:48] <_joe_> one thing I notoriously suck at. [10:31:02] Nemo_bis: blame/thank apergos :) [10:31:34] hashar: i'm gzipping all the jenkins console logs (re: change 125991) [10:31:54] mutante: dont do it as root! :-D [10:32:25] hashar: arr, ..ok [10:32:30] as jenkins.. 
grmbl [10:32:34] mutante: the command seems fine [10:32:42] but it traverses a million files/directories [10:32:53] I am not sure how much time it takes nor whether it is a good idea :/ [10:33:07] that's why i wanted to watch it running [10:33:14] sorry about the user [10:36:26] hashar: they are all owned by jenkins:jenkins [10:37:13] the initial run will just be long, the diff every 24h shouldn't be that bad [10:37:37] I wish Jenkins could compress them automatically [10:37:53] I need to discard old builds as well [10:39:13] runs it exactly as in the cron job, with nice -n 19 etc [10:39:43] <_joe_> mutante: nice -n 19? cpu intensive? [10:40:37] it doesn't look bad in top [10:40:42] maybe 2% [10:40:59] <_joe_> so I'd use ionice instead :) [10:42:05] <_joe_> (or, if you're fancy and your kernel supports it, cgroups and blkio limits :) ) [10:44:04] ok, fair, is that worth it for a mostly one-time thing ? [10:44:11] and https://ganglia.wikimedia.org/latest/?c=Miscellaneous%20eqiad&h=gallium.wikimedia.org&m=cpu_report&r=hour&s=descending&hc=4&mc=2 [10:44:39] just tests what was suggested on gerrit [10:45:23] <_joe_> mutante: well 10% iowait is not nice, but still not killing the server [10:46:32] thanks ariel :) [10:46:49] 10% iowait is low [10:46:59] ionice -c 3 ? [10:47:10] ah I was preceded [10:48:14] yw [10:48:56] <_joe_> eggia' [10:51:00] continues with ionice -c3 [10:56:50] (03CR) 10Springle: [C: 031] "Seems fine, obviously, however the schema change has been done. Is the field also to be populated by a batch job?" 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130055 (https://bugzilla.wikimedia.org/64411) (owner: 10Aude) [11:00:31] !log completed schema change, bug 64411, page_props.pp_sortkey [11:00:37] Logged the message, Master [11:05:20] (03PS4) 10Dzahn: Add jkrauska [operations/puppet] - 10https://gerrit.wikimedia.org/r/127134 (owner: 10Jkrauska) [11:11:05] (03PS5) 10Dzahn: add shell account for jkrauska [operations/puppet] - 10https://gerrit.wikimedia.org/r/127134 (owner: 10Jkrauska) [11:12:48] (03CR) 10Reedy: [C: 04-1] "http://packages.ubuntu.com/precise/ttf-vlgothic" [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [11:16:53] (03PS2) 10Reedy: Add ttf-vlgothic to imagescalers [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) [11:30:05] (03PS3) 10Reedy: Add fonts-japanese-gothic to imagescalers [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) [11:30:29] fixing the edit summary too works [11:30:37] s/works/helps/ [11:30:54] Reedy: package 'fonts-japanese-gothic' as it is purely virtual [11:31:12] i think fonts-vlgothic [11:46:13] !log Running deleteEqualMessages.php on bpywiki (bug 43917) [11:46:20] Logged the message, Master [11:47:47] (03CR) 10Aude: "I don't think populating the column is required for having the setting = true." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130055 (https://bugzilla.wikimedia.org/64411) (owner: 10Aude) [11:55:09] (03Abandoned) 10Aude: Set $wgPagePropsHaveSortkey to false [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130055 (https://bugzilla.wikimedia.org/64411) (owner: 10Aude) [12:07:17] (03PS6) 10Dzahn: add shell account for jkrauska [operations/puppet] - 10https://gerrit.wikimedia.org/r/127134 (owner: 10Jkrauska) [12:09:36] (03CR) 10Dzahn: [C: 032] add shell account for jkrauska [operations/puppet] - 10https://gerrit.wikimedia.org/r/127134 (owner: 10Jkrauska) [12:26:14] (03PS1) 10Dzahn: add 'rhenium' (netflow box) to site.pp [operations/puppet] - 10https://gerrit.wikimedia.org/r/130060 [12:27:10] (03PS2) 10Dzahn: add 'rhenium' (netflow box) to site.pp [operations/puppet] - 10https://gerrit.wikimedia.org/r/130060 [12:28:14] (03CR) 10Matanya: [C: 031] "I think we should add ferm on the host too." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130060 (owner: 10Dzahn) [12:29:04] !log Running deleteEqualMessages.php on afwiki (bug 43917) [12:29:07] !log Running deleteEqualMessages.php on cvwiki (bug 43917) [12:29:11] Logged the message, Master [12:29:18] Logged the message, Master [12:31:33] (03PS1) 10Matanya: archiva: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 [12:32:44] (03PS1) 10Dzahn: apply role::pmacct on node rhenium [operations/puppet] - 10https://gerrit.wikimedia.org/r/130062 [12:33:53] (03CR) 10Alexandros Kosiaris: [C: 04-1] "One minor comment. Otherwise LGTM" (032 comments) [operations/puppet] - 10https://gerrit.wikimedia.org/r/129676 (owner: 10Dzahn) [12:35:52] mutante / matanya / Nemo_bis : uhh, not sure what we would do about the cert error, its not our server. 
(wikimania.org) [12:36:04] yet [12:36:04] (you guys pinged me about it so it was in my bouncer, heh) [12:36:14] thats what folks been telling me now for over half a decade [12:36:22] 'wikimania.org is going to transfer to wikimedia' [12:36:23] ;] [12:36:44] where is manuel s ? [12:36:58] RobH: yes, something changed on their side but no indication yet that it actually moved to us :p [12:37:16] i hope it does transfer, would be nice [12:37:25] but _once_ and if they finally decide to move ... they should ask about certs _before_ just switching [12:37:30] yea [12:37:32] (dont take my doubt as an indication of disapproval, i want the domain, heh_ [12:37:43] mutante: its wikimania.org, it has no certs [12:37:48] it'd go to our main varnish cluster [12:37:51] no cert to buy =] [12:37:58] sorry, main ssl cluster, eqiad [12:38:04] (03CR) 10Matanya: [C: 031] "as a side note: I think the ferm rule should move from modules/pmacct/manifests/configs.pp, but yes, this seems right." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130062 (owner: 10Dzahn) [12:38:06] (only ulsfo has varnish handling ssl) [12:39:09] RobH: wouldn't we still need to get star.wikimania.org ? [12:39:38] RobH: I read an old and very interesting doc yesterday about ssl optimization, interested to read ? [12:40:08] hrmm, i guess so yea [12:40:18] have to add it to unified cert [12:40:23] or its own small cert, bleh [12:40:36] matanya: I'll book mark it and read it later, right now im ssl'd out =] [12:41:16] (03CR) 10Dzahn: "fonts-japanese-gothic is virtual. and we tried to remove all the virtual ones, or? fonts-vlgothic is the "regular" one" [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [12:44:29] (03CR) 10Alexandros Kosiaris: [C: 04-1] "There seems also to be rsync::server included from archiva::gitfat so at least a ferm::service/rule will be needed for that. 
Not sure this" (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 (owner: 10Matanya) [12:45:46] (03PS6) 10Dzahn: turn ircecho into a parameterized class [operations/puppet] - 10https://gerrit.wikimedia.org/r/129676 [12:47:13] (03PS2) 10Matanya: archiva: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 [12:48:04] <_joe_> !log restarted apache on wikitech-static [12:48:13] Logged the message, Master [12:53:15] (03CR) 10Dzahn: [C: 032] turn ircecho into a parameterized class [operations/puppet] - 10https://gerrit.wikimedia.org/r/129676 (owner: 10Dzahn) [12:55:19] RobH: https://insouciant.org/tech/ssl-performance-case-study/ [12:55:44] cool, thx =] [12:59:53] (03PS4) 10Dzahn: Add fonts-vlgothic to imagescalers [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [13:01:29] (03CR) 10Dzahn: [C: 04-1] "eh, wth, bug 127623 ?:), hold on" [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [13:02:15] (03PS5) 10Dzahn: Add fonts-vlgothic to imagescalers [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [13:03:23] (03CR) 10Dzahn: "- no puppet3 regression, using parameters in role" [operations/puppet] - 10https://gerrit.wikimedia.org/r/129676 (owner: 10Dzahn) [13:04:55] (03CR) 10Dzahn: [C: 032] ""VL Gothic is beautiful Japanese free Gothic TrueType font, developed by Project Vine"" [operations/puppet] - 10https://gerrit.wikimedia.org/r/127623 (https://bugzilla.wikimedia.org/64002) (owner: 10Reedy) [13:22:35] (03CR) 10Ottomata: "The rsync server is meant to be open to the public. It allows users of repositories that use artifacts hosted here to run 'git fat pull' " [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 (owner: 10Matanya) [13:24:31] ottomata: on port 873 ? 
^ [13:26:21] (03CR) 10Ottomata: archiva: add ferm rule (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 (owner: 10Matanya) [13:27:11] matanya: yup [13:27:20] sure, fixing [13:28:01] (03PS3) 10Matanya: archiva: add ferm rule [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 [13:30:33] (03PS1) 10Matanya: titinium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 [13:30:45] matanya: titanium [13:31:07] <_joe_> titinium in italian sounds funny [13:31:25] <_joe_> well, in english as well, I guess [13:31:42] (03PS2) 10Matanya: titanium: add firewall to the host [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 [13:32:57] thanks Reedy fixed [13:34:55] (03CR) 10Matanya: [C: 04-1] "This depends on https://gerrit.wikimedia.org/r/#/c/130061/2" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130066 (owner: 10Matanya) [13:35:00] (03PS1) 10Odder: Add University of Neuchâtel to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130067 (https://bugzilla.wikimedia.org/64535) [13:38:56] (03CR) 10Krinkle: "Inline (what is /usr/share/dblist?)" (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/65634 (owner: 10Petrb) [13:39:37] (03PS1) 10Andrew Bogott: Change six UIDs to match ldap [operations/puppet] - 10https://gerrit.wikimedia.org/r/130069 [13:42:29] (03CR) 10Dzahn: "can't find user abaso in LDAP, all others confirmed though" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130069 (owner: 10Andrew Bogott) [13:43:08] (03PS1) 10Odder: Additional two Swiss domains to wgCopyUploadsDomains [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/130070 (https://bugzilla.wikimedia.org/64536) [13:46:24] (03CR) 10Krinkle: improved sql script (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/67826 (owner: 10Petrb) [13:48:00] (03CR) 10Andrew Bogott: "Adam's labs login is dr0ptp4kt for some reason" 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/130069 (owner: 10Andrew Bogott) [13:48:58] (03PS6) 10Hoo man: Make labs' sql command work with -v and remove cruft [operations/puppet] - 10https://gerrit.wikimedia.org/r/113755 [13:49:10] Krinkle: ^ that might be interesting for you :P [13:49:14] (03CR) 10Dzahn: "ugh, i see, so how are we resolving those? different users with same UID, meh. renaming labs users? meh. .." [operations/puppet] - 10https://gerrit.wikimedia.org/r/130069 (owner: 10Andrew Bogott) [13:50:43] (03PS1) 10Krinkle: toollabs/sql: Remove unused 'list', remove duplicate 'commons' [operations/puppet] - 10https://gerrit.wikimedia.org/r/130071 [13:51:14] (03PS2) 10Andrew Bogott: Change five UIDs to match ldap [operations/puppet] - 10https://gerrit.wikimedia.org/r/130069 [13:51:29] (03CR) 10Krinkle: "Fixed in I795051ab034bb4c." [operations/puppet] - 10https://gerrit.wikimedia.org/r/65634 (owner: 10Petrb) [13:51:36] (03CR) 10Krinkle: "Fixed in I795051ab034bb4c." [operations/puppet] - 10https://gerrit.wikimedia.org/r/65634 (owner: 10Petrb) [13:51:41] (03CR) 10Krinkle: "Fixed in I795051ab034bb4c." [operations/puppet] - 10https://gerrit.wikimedia.org/r/67826 (owner: 10Petrb) [13:52:11] (03CR) 10Hoo man: [C: 04-1] "Mostly redundant with https://gerrit.wikimedia.org/r/113755 (the _p additions aren't needed)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130071 (owner: 10Krinkle) [13:52:54] (03CR) 10Krinkle: new tool for easy sql replica access (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/65634 (owner: 10Petrb) [13:53:28] (03CR) 10Dzahn: [C: 031] Change five UIDs to match ldap [operations/puppet] - 10https://gerrit.wikimedia.org/r/130069 (owner: 10Andrew Bogott) [13:53:35] andrewbogott: heheee [13:54:11] (03CR) 10Krinkle: "That change is a more major refactor. This is a lot more trivial and therefore easier to review and merge. 
Yours can trivially be rebased " [operations/puppet] - 10https://gerrit.wikimedia.org/r/130071 (owner: 10Krinkle) [13:55:16] (03CR) 10Hoo man: "I don't see a reason for this change, thus... it would be nice if someone could finally review my code... that would be time spend better " [operations/puppet] - 10https://gerrit.wikimedia.org/r/130071 (owner: 10Krinkle) [13:57:01] (03PS1) 10Krinkle: toollabs/sql: Add handling for connecting to "meta_p" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130073 [13:58:27] (03PS2) 10Krinkle: toollabs/sql: Support connecting to "meta_p" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130073 [13:59:02] hoo: I can rebase it for you, that would make your change easier to review as it'll make less changes [13:59:26] nobody is going to review any of this, anyway... probably [14:01:13] (03CR) 10Ottomata: archiva: add ferm rule (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/130061 (owner: 10Matanya) [14:03:54] (03PS7) 10Krinkle: toollabs/sql: Fix argument forwarding (-v breaks mysql) and clean up [operations/puppet] - 10https://gerrit.wikimedia.org/r/113755 (owner: 10Hoo man) [14:04:28] (03PS3) 10Krinkle: toollabs/sql: Add handling for connecting to "meta_p" [operations/puppet] - 10https://gerrit.wikimedia.org/r/130073 [14:07:04] Krinkle: did you figure things out wrt jsduck? [14:07:15] akosiaris had updated the ticket since last week, aiui [14:07:17] (03PS8) 10Krinkle: toollabs/sql: Fix argument forwarding (-v breaks mysql) and clean up [operations/puppet] - 10https://gerrit.wikimedia.org/r/113755 (owner: 10Hoo man) [14:07:36] paravoid: I'm waiting for someone to tell me where and how I can test the package. [14:07:47] paravoid: akosiaris put one on the repo last week, but it didn't work [14:07:59] Krinkle: jsduck 5 ? [14:08:06] quack [14:08:07] I think akosiaris was going to or has attempted to fix it, but I haven't heard back yet [14:08:09] matanya: yes [14:08:34] yes and I am still on it. 
I have packaged rkelly-remix (being a dependency). I will be contacting you later today so you can test [14:08:47] (CR) Krinkle: "Clarified more of the changes in the commit message." [operations/puppet] - https://gerrit.wikimedia.org/r/113755 (owner: Hoo man) [14:09:22] (CR) Krinkle: "This change did even more than the commit message says, effectively moved those out into separate changes by rebasing onto my change." [operations/puppet] - https://gerrit.wikimedia.org/r/113755 (owner: Hoo man) [14:09:32] akosiaris: awesome :) [14:10:16] (CR) Reedy: Vary twemproxy config location based on getRealmSpecificFilename() (take 2) (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/129663 (https://bugzilla.wikimedia.org/62836) (owner: Reedy) [14:11:02] hoo: can you undo your -1? [14:11:22] (CR) Andrew Bogott: [C: 2] Change five UIDs to match ldap [operations/puppet] - https://gerrit.wikimedia.org/r/130069 (owner: Andrew Bogott) [14:11:37] done [14:12:02] (CR) Krinkle: "This is not needed. We already have phantomjs in slave-scripts, via that same npm." [operations/puppet] - https://gerrit.wikimedia.org/r/130049 (owner: Hashar) [14:12:20] hashar: [14:12:31] Krinkle: in conf call sorry [14:12:48] (CR) Krinkle: [C: -1] "https://github.com/wikimedia/integration-jenkins/blob/master/bin/phantomjs" [operations/puppet] - https://gerrit.wikimedia.org/r/130049 (owner: Hashar) [14:19:58] (PS1) Reedy: Use twemproxy on labs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130078 [14:20:21] Krinkle: where is the other phantomjs? :-) [14:20:27] read [14:20:29] up [14:20:40] yeah :D [14:20:49] I thought you wanted to drop that wmf grunt stuff entirely ? [14:20:55] so went with yet another repo [14:21:27] Which is why I am moving that dependency up from wmfgrunt to the root. But there is no need for setting up more cruft.
slave-scripts has a package.json for this reason [14:21:39] ahhhhhh [14:21:47] Also, you should've included the package.json in integration/phantomjs.git so that it can be verified by someone. [14:21:47] I guess the same can be done for kss js [14:21:51] some kind of css linter [14:23:35] bin/phantomjs@ -> ../tools/node_modules/grunt-contrib-qunit/node_modules/grunt-lib-phantomjs/node_modules/phantomjs/bin/phantomjs :( [14:23:44] hashar: the other phantomjs has been there for a while, not new. But I've moved the dependency up a few directories so that it won't be removed if we drop wmfgrunt [14:23:44] (CR) Giuseppe Lavagetto: [C: 1] "Looks good; added a small suggestion" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/129663 (https://bugzilla.wikimedia.org/62836) (owner: Reedy) [14:23:47] Krinkle: should I add phantomjs to the /tools/package.json ? [14:24:01] hashar: you're behind a few minutes.. [14:24:23] yeah in conf call :D [14:24:48] hashar: I just moved it [14:24:52] (CR) Hashar: [C: -1] "Ahhh will have to cleanup that mess so :-]" [operations/puppet] - https://gerrit.wikimedia.org/r/130049 (owner: Hashar) [14:25:12] (PS1) Odder: Add image-reviewer group to Persian Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130079 (https://bugzilla.wikimedia.org/64532) [14:26:03] (CR) Dzahn: [C: 1] Remove mysql client from bastionhost [operations/puppet] - https://gerrit.wikimedia.org/r/126027 (owner: Hoo man) [14:26:23] (CR) Reedy: Vary twemproxy config location based on getRealmSpecificFilename() (take 2) (3 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/129663 (https://bugzilla.wikimedia.org/62836) (owner: Reedy) [14:27:57] Krinkle: great. Can we bump phantom js now ? :) [14:29:06] hashar: Already new enough afaik. [14:29:14] hashar: Which version do you need?
[14:29:20] "phantomjs": "~1.9.0-1", [14:29:21] not sure [14:29:28] asking zeljkof [14:29:40] hashar: Also, read the commit messages before changing it. [14:30:20] (CR) Calak: [C: 1] "Thanks." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130079 (https://bugzilla.wikimedia.org/64532) (owner: Odder) [14:30:26] Krinkle: let's follow up in #wikimedia-qa :) [14:34:42] (PS3) Reedy: Vary twemproxy config location based on getRealmSpecificFilename() (take 2) [operations/puppet] - https://gerrit.wikimedia.org/r/129663 (https://bugzilla.wikimedia.org/62836) [14:36:27] (Abandoned) Hashar: contint: get phantomJS on Jenkins slaves [operations/puppet] - https://gerrit.wikimedia.org/r/130049 (owner: Hashar) [14:38:59] (PS4) Reedy: Vary twemproxy config location based on getRealmSpecificFilename() (take 2) [operations/puppet] - https://gerrit.wikimedia.org/r/129663 (https://bugzilla.wikimedia.org/62836) [14:46:56] (CR) Giuseppe Lavagetto: [C: 2] "Merging." [operations/puppet] - https://gerrit.wikimedia.org/r/129663 (https://bugzilla.wikimedia.org/62836) (owner: Reedy) [14:51:12] manybubbles: Since you have changes in it, do you want to do today's SWAT? [14:51:29] anomie: yeah, I'll do it! [14:54:10] doesn't look like odder is around. [14:54:25] hoo|away: you going to be available in case something goes wrong with your SWAT changes when I start in 5? [14:54:43] hoo|away: also, do you mind if I sync them together or should they go one after the other? [14:55:29] manybubbles: oh, I'm here. [14:55:55] twkozlowski: cool. sweet. your changes: do you want them one after another or all at the same time?
[14:56:27] Let me look it up again to see what exactly is scheduled [14:56:55] manybubbles: Whatever suits you better, I have no preference [14:58:09] (CR) Manybubbles: [C: 2] Add a new namespace to Hebrew Wikisource [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/129666 (https://bugzilla.wikimedia.org/64353) (owner: Odder) [14:58:16] * manybubbles has the conch [14:58:19] (Merged) jenkins-bot: Add a new namespace to Hebrew Wikisource [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/129666 (https://bugzilla.wikimedia.org/64353) (owner: Odder) [14:59:01] (CR) Manybubbles: [C: 2] Add Library of Congress to wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130043 (https://bugzilla.wikimedia.org/64487) (owner: Odder) [14:59:10] (Merged) jenkins-bot: Add Library of Congress to wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/130043 (https://bugzilla.wikimedia.org/64487) (owner: Odder) [14:59:25] twkozlowski: mind rebasing :https://gerrit.wikimedia.org/r/#/c/129675/ [14:59:59] Sure. [15:00:04] (CR) BBlack: "Are we still waiting on verification that there's no ill effects for analytics?"
[operations/puppet] - https://gerrit.wikimedia.org/r/129714 (owner: Dr0ptp4kt) [15:00:35] (PS4) BBlack: Set domain to TLD on GeoIP cookie [operations/puppet] - https://gerrit.wikimedia.org/r/127131 (owner: Ori.livneh) [15:02:47] (PS2) Odder: National Library of Scotland to wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/129675 (https://bugzilla.wikimedia.org/64357) [15:03:47] manybubbles: ^^ [15:05:58] RECOVERY - twemproxy port on fenari is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [15:05:58] RECOVERY - twemproxy process on fenari is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [15:06:03] (CR) Manybubbles: [C: 2] National Library of Scotland to wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/129675 (https://bugzilla.wikimedia.org/64357) (owner: Odder) [15:06:17] (Merged) jenkins-bot: National Library of Scotland to wgCopyUploadsDomains [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/129675 (https://bugzilla.wikimedia.org/64357) (owner: Odder) [15:06:35] twkozlowski: k. deploying [15:06:52] (CR) BBlack: [C: 2 V: 2] Set domain to TLD on GeoIP cookie [operations/puppet] - https://gerrit.wikimedia.org/r/127131 (owner: Ori.livneh) [15:08:28] !log manybubbles synchronized wmf-config/InitialiseSettings.php 'Add new sources to gwtoolset and namespaces to hewikisource' [15:08:33] Logged the message, Master [15:08:35] twkozlowski: ^^^^ [15:08:42] please make sure everything looks good [15:09:41] manybubbles: Looks fine to me. [15:09:44] hoo|away: I'll do yours when you come online [15:09:52] twkozlowski: sweet, consider yourself SWATed [15:09:59] :-) [15:22:40] apergos: dataset alerts? [15:22:46] (dataset2 & dataset1001) [15:28:30] anyone around to support hoo's swat deploy? [15:29:41] what is it?
[15:30:14] Nikerabbit: https://gerrit.wikimedia.org/r/#/c/129707/ and https://gerrit.wikimedia.org/r/#/c/129708/ [15:30:44] seem simple enough but I don't want to do them without him around to verify that they worked properly [15:30:56] looking [15:31:48] I wonder if it has been verified on testwikidata [15:35:15] Krinkle: wget http://apt.wikimedia.org/pending/ruby-rkelly-remix_0.0.6-1_all.deb && sudo dpkg -i ruby-rkelly-remix_0.0.6-1_all.deb [15:35:34] and test [15:36:01] and if everything is ok I will update apt.wikimedia.org [15:38:37] PROBLEM - Puppet freshness on dataset2 is CRITICAL: Last successful Puppet run was Mon 28 Apr 2014 03:36:28 PM UTC [15:39:27] RECOVERY - Puppet freshness on dataset2 is OK: puppet ran at Mon Apr 28 15:39:25 UTC 2014 [15:39:37] PROBLEM - Puppet freshness on dataset2 is CRITICAL: Last successful Puppet run was Mon 28 Apr 2014 03:39:25 PM UTC [15:39:50] chasemp: could you have a look at https://gerrit.wikimedia.org/r/#/c/129728/ ? I'm hoping it can fit in with your user refactor... [15:40:40]