[00:08:20] RECOVERY - puppet last run on cp4001 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[00:58:21] PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL: CRITICAL: 7.69% of data above the critical threshold [500.0]
[01:00:15] RoanKattouw_away: http://www.cbsnews.com/news/wikipedia-jimmy-wales-morley-safer-60-minutes/
[01:00:18] You're on it :D
[01:00:23] (PS5) BryanDavis: logstash: Ship logs via syslog udp datagrams [mediawiki-config] - https://gerrit.wikimedia.org/r/191259 (https://phabricator.wikimedia.org/T88732)
[01:00:25] (PS1) BryanDavis: Update beta cluster logging config for namespaced classes [mediawiki-config] - https://gerrit.wikimedia.org/r/201986
[01:00:27] (PS1) BryanDavis: Convert MWLogger to MediaWiki\Logger\LoggerFactory [mediawiki-config] - https://gerrit.wikimedia.org/r/201987
[01:00:48] maybe not
[01:02:41] (CR) BryanDavis: "Latest patch set requires code that will be in 1.26wmf1 (namespaced classes)" [mediawiki-config] - https://gerrit.wikimedia.org/r/191259 (https://phabricator.wikimedia.org/T88732) (owner: BryanDavis)
[01:04:51] ori: I would appreciate your re-review of https://gerrit.wikimedia.org/r/#/c/191259 with clear reasons that you may think it still deserves a -2 when you can get to it.
[01:05:57] (CR) BryanDavis: [C: -1] "This will require 1.26wmf1 on all wikis (2015-04-15 at current release cadence)." [mediawiki-config] - https://gerrit.wikimedia.org/r/201987 (owner: BryanDavis)
[01:07:27] bd808: OK, I'll try to get to it ASAP. Feel free to re-poke if I don't follow-up promptly.
[01:07:57] ori: thanks. It can roll out before the train on Wednesday so you have time
[01:08:19] s/can/cannot/
[01:10:11] RECOVERY - HTTP 5xx req/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[01:10:29] (CR) Tim Landscheidt: "@Jeremyb: I think with https://gerrit.wikimedia.org/r/#/c/197341 merged, this change is now obsolete and can be abandoned." [puppet] - https://gerrit.wikimedia.org/r/111387 (owner: Jeremyb)
[01:14:37] bd808: so, just to clarify, you don't want *unclear* reasons?
[01:14:49] you should have said so from the beginning! how am i supposed to know that
[01:15:17] heh. I don't want "it feels wrong" because that's hard to address :)
[01:15:36] bd808: well, in https://phabricator.wikimedia.org/T88732 i wrote:
[01:15:43] "If you are sure that Monolog is a good solution, then there is no reason to shirk the work of introducing it properly, which is by deprecating $wgDebugLogGroups and rewriting the configuration blocks that utilize it in operations/mediawiki-config to use Monolog instead."
[01:15:58] which you had quoted and replied to with "Fair enough."
[01:16:53] And additional reasons not to jump to removing wgDebugLogGroups quite yet
[01:17:27] Like the clear need to test monolog to logstash a full prod volume again
[01:18:20] If the reason is really "we can't stand any more half measures" then I guess I'll need to think hard about a safe way to do that
[01:18:37] but I should also be giving out a ton of -2s every day
[01:19:07] so generate wgDebugLogGroups from the monolog config
[01:19:22] how is that different at all?
[01:19:51] it's different in what remains to be done to get us to things-as-they-should-be once you're confident enough in the logstash cluster
[01:20:11] in the current case, the base configuration is still articulated in the old way, and needs to be ported over
[01:20:31] in the alternative i'm suggesting, the base configuration is current, and it's a matter of deleting a block of code.
[01:23:23] *nod* that is closer to my desired outcome. I'll have to think a bit on how it could be done
[01:24:24] Actually what may be even easier to just to jump to all monolog all the time but make the logstash bits respond to a feature flag
[01:24:58] The only fears that are left are really related to the logstash input transport
[01:25:13] right, not the use of logstash for udp logging
[01:25:25] that sounds good to me
[01:27:29] Would you object to the full Monolog config being generated from some more terse input?
[01:28:02] Sampling especially turns into a bit of a dog's breakfast of config when it is needed
[01:37:26] The formatter that is used for the udp2log packets depends on $wgDebugLogGroups, but only to know if it should format a message like wfDebug() would have or like wfDebugLog() would have.
[01:41:23] (CR) Ori.livneh: "To recap: the value of the sweeping changes to the logging infrastructure in core is (to my mind) in making MediaWiki behave like a well-m" [mediawiki-config] - https://gerrit.wikimedia.org/r/191259 (https://phabricator.wikimedia.org/T88732) (owner: BryanDavis)
[01:42:34] (CR) BryanDavis: [C: -1] logstash: Ship logs via syslog udp datagrams [mediawiki-config] - https://gerrit.wikimedia.org/r/191259 (https://phabricator.wikimedia.org/T88732) (owner: BryanDavis)
[02:17:43] !log l10nupdate Synchronized php-1.25wmf23/cache/l10n: (no message) (duration: 06m 26s)
[02:18:00] Logged the message, Master
[02:22:29] !log LocalisationUpdate completed (1.25wmf23) at 2015-04-06 02:21:26+00:00
[02:22:36] Logged the message, Master
[02:39:09] !log l10nupdate Synchronized php-1.25wmf24/cache/l10n: (no message) (duration: 06m 06s)
[02:39:16] Logged the message, Master
[02:43:46] !log LocalisationUpdate completed (1.25wmf24) at 2015-04-06 02:42:42+00:00
[02:43:52] Logged the message, Master
[03:09:39] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: puppet fail
[03:28:10] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:00:19] PROBLEM - puppet last run on mw2195 is CRITICAL: CRITICAL: puppet fail
[04:12:43] (PS4) Tim Landscheidt: Tools: Factor out registering with proxies [puppet] - https://gerrit.wikimedia.org/r/197658 (https://phabricator.wikimedia.org/T91954)
[04:12:45] (PS1) Tim Landscheidt: Tools: Make list of proxies for portgrabber configurable [puppet] - https://gerrit.wikimedia.org/r/201991 (https://phabricator.wikimedia.org/T91954)
[04:16:42] (CR) Tim Landscheidt: "Still not tested for uswgi-python and nodejs." [puppet] - https://gerrit.wikimedia.org/r/197658 (https://phabricator.wikimedia.org/T91954) (owner: Tim Landscheidt)
[04:16:59] RECOVERY - puppet last run on mw2195 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[04:17:38] (CR) Tim Landscheidt: "Tested on Toolsbeta apart from uswgi-python and nodejs." [puppet] - https://gerrit.wikimedia.org/r/201991 (https://phabricator.wikimedia.org/T91954) (owner: Tim Landscheidt)
[04:29:44] (CR) Yuvipanda: "Looks ok. Ultimately I think we'll want to move all of these into a package by itself. I'll test and merge tomorrow." [puppet] - https://gerrit.wikimedia.org/r/197658 (https://phabricator.wikimedia.org/T91954) (owner: Tim Landscheidt)
[04:46:03] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 6 04:45:00 UTC 2015 (duration 44m 59s)
[04:46:08] Logged the message, Master
[05:36:42] operations, Interdatacenter-IPsec: Kernel panics on Jessie (3.16.0-4-amd64) during IPsec load test - https://phabricator.wikimedia.org/T94820#1181348 (faidon) Well, I obviously agree with @BBlack that we should test with 3.19 (and, in general, have a test environment that resembles production as much as po...
[05:42:19] (PS1) Tim Landscheidt: Tools: Fix bigbrother's patterns for web service types [puppet] - https://gerrit.wikimedia.org/r/201996 (https://phabricator.wikimedia.org/T94496)
[05:42:46] (PS1) Faidon Liambotis: Remove bloom filter store configuration [mediawiki-config] - https://gerrit.wikimedia.org/r/201997 (https://phabricator.wikimedia.org/T93006)
[05:44:21] (CR) Tim Landscheidt: "Tested live on tools-submit for "lighttpd" and "uwsgi-python"." [puppet] - https://gerrit.wikimedia.org/r/201996 (https://phabricator.wikimedia.org/T94496) (owner: Tim Landscheidt)
[05:45:23] operations, hardware-requests: Decom/repurpose rbf* hosts - https://phabricator.wikimedia.org/T95153#1181377 (faidon) NEW
[05:48:16] operations: Decommission svn.wikimedia.org server (import SVN into Phabricator) - https://phabricator.wikimedia.org/T86655#1181387 (jayvdb)
[05:48:24] operations: Migrate leftover old svn content to a readonly repo somewhere - https://phabricator.wikimedia.org/T86674#1181389 (jayvdb)
[06:29:59] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:10] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:40] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:41] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:50] PROBLEM - puppet last run on ms-fe2001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:50] PROBLEM - puppet last run on cp3041 is CRITICAL: CRITICAL: puppet fail
[06:34:40] PROBLEM - puppet last run on mw1235 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:35:20] PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:36:20] PROBLEM - puppet last run on mw2023 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:45:30] RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[06:46:20] RECOVERY - puppet last run on mw1235 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[06:46:31] RECOVERY - puppet last run on mw2023 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[06:46:39] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[06:46:59] RECOVERY - puppet last run on lvs2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:30] RECOVERY - puppet last run on amssq54 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:30] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:30] RECOVERY - puppet last run on ms-fe2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:47:39] RECOVERY - puppet last run on cp3041 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[07:26:57] (CR) Gilles: [C: 2] Remove bloom filter store configuration [mediawiki-config] - https://gerrit.wikimedia.org/r/201997 (https://phabricator.wikimedia.org/T93006) (owner: Faidon Liambotis)
[07:27:02] (Merged) jenkins-bot: Remove bloom filter store configuration [mediawiki-config] - https://gerrit.wikimedia.org/r/201997 (https://phabricator.wikimedia.org/T93006) (owner: Faidon Liambotis)
[07:42:49] PROBLEM - Unmerged changes on repository mediawiki_config on tin is CRITICAL: There is one unmerged change in mediawiki_config (dir /srv/mediawiki-staging/).
[07:49:22] operations: Package or import PageSpeed module (mod_pagespeed) - https://phabricator.wikimedia.org/T95123#1181521 (ori) Assigning to myself as more evidence is needed for the value it would provide. It is possible that the subset of filters which are safe to use is the complement of the set of filters which w...
[07:58:05] operations, ops-eqiad: vanadium failed disk /dev/sda - https://phabricator.wikimedia.org/T94926#1181538 (faidon)
[08:00:07] operations: eventlog1001 / full - https://phabricator.wikimedia.org/T95154#1181542 (faidon) NEW a:Ottomata
[08:01:40] RECOVERY - salt-minion processes on db1035 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[08:16:46] operations, ops-eqiad, Analytics-EventLogging: vanadium failed disk /dev/sda - https://phabricator.wikimedia.org/T94926#1181595 (yuvipanda)
[08:18:02] (PS1) Faidon Liambotis: Kill passwords.pp, superseded by labs-private [puppet] - https://gerrit.wikimedia.org/r/202006
[08:22:23] (CR) Alexandros Kosiaris: [C: 2] "I am of the same opinion. +2 from me." [puppet] - https://gerrit.wikimedia.org/r/202006 (owner: Faidon Liambotis)
[08:24:51] (PS2) Faidon Liambotis: Silence Phabricator/Bugzilla migration cronspam [puppet] - https://gerrit.wikimedia.org/r/201930
[08:25:06] (CR) Faidon Liambotis: [C: 2] Silence Phabricator/Bugzilla migration cronspam [puppet] - https://gerrit.wikimedia.org/r/201930 (owner: Faidon Liambotis)
[09:05:14] (CR) Werdna: [C: 2] Add $wgPopupsSurveyLink if $wmgUsePopups is true [mediawiki-config] - https://gerrit.wikimedia.org/r/201419 (https://phabricator.wikimedia.org/T1005) (owner: Prtksxna)
[09:05:21] (Merged) jenkins-bot: Add $wgPopupsSurveyLink if $wmgUsePopups is true [mediawiki-config] - https://gerrit.wikimedia.org/r/201419 (https://phabricator.wikimedia.org/T1005) (owner: Prtksxna)
[10:09:00] PROBLEM - puppet last run on wtp2015 is CRITICAL: CRITICAL: puppet fail
[10:19:01] PROBLEM - puppet last run on wtp2011 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:20:45] Puppet, Technical-Debt: "Setting templatedir is deprecated" warning issued on self-hosted puppetmaster - https://phabricator.wikimedia.org/T95158#1181785 (Tgr) NEW
[10:27:30] RECOVERY - puppet last run on wtp2015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[10:35:49] PROBLEM - puppet last run on cp3034 is CRITICAL: CRITICAL: puppet fail
[10:35:50] RECOVERY - puppet last run on wtp2011 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[10:45:35] (CR) Alexandros Kosiaris: [C: 2] network: add uranium.wm ipv6 to def [puppet] - https://gerrit.wikimedia.org/r/201958 (owner: John F. Lewis)
[10:52:40] RECOVERY - puppet last run on cp3034 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[10:59:27] (CR) Alexandros Kosiaris: [C: 1] dumps: ferm service for rsyncd clients using hiera [puppet] - https://gerrit.wikimedia.org/r/188204 (owner: Dzahn)
[11:02:35] Puppet, Tool-Labs: Document our GridEngine set up - https://phabricator.wikimedia.org/T88733#1181911 (scfc) p:Triage>Normal
[11:10:59] PROBLEM - RAID on eventlog1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:13:29] PROBLEM - configured eth on eventlog1001 is CRITICAL: NRPE: Unable to read output
[11:13:30] PROBLEM - DPKG on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:15:00] PROBLEM - salt-minion processes on eventlog1001 is CRITICAL: NRPE: Unable to read output
[11:15:09] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: NRPE: Unable to read output
[11:16:40] PROBLEM - dhclient process on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:16:49] PROBLEM - configured eth on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:17:39] PROBLEM - RAID on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:18:20] RECOVERY - dhclient process on eventlog1001 is OK: PROCS OK: 0 processes with command name dhclient
[11:18:53] PROBLEM - SSH on eventlog1001 is CRITICAL: Server answer:
[11:20:11] PROBLEM - configured eth on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:20:19] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: NRPE: Unable to read output
[11:21:19] PROBLEM - Disk space on eventlog1001 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=86%):
[11:21:20] PROBLEM - RAID on eventlog1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:21:50] PROBLEM - puppet last run on eventlog1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:21:59] PROBLEM - salt-minion processes on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:23:40] PROBLEM - dhclient process on eventlog1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:23:59] RECOVERY - DPKG on eventlog1001 is OK: All packages OK
[11:25:29] PROBLEM - configured eth on eventlog1001 is CRITICAL: NRPE: Unable to read output
[11:26:59] PROBLEM - dhclient process on eventlog1001 is CRITICAL: NRPE: Unable to read output
[11:28:21] PROBLEM - puppet last run on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:28:40] PROBLEM - salt-minion processes on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:28:49] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[11:28:53] PROBLEM - DPKG on eventlog1001 is CRITICAL: NRPE: Call to popen() failed
[11:32:00] RECOVERY - dhclient process on eventlog1001 is OK: PROCS OK: 0 processes with command name dhclient
[11:32:09] RECOVERY - salt-minion processes on eventlog1001 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[11:32:09] RECOVERY - configured eth on eventlog1001 is OK: NRPE: Unable to read output
[11:32:10] RECOVERY - Check status of defined EventLogging jobs on eventlog1001 is OK: OK: All defined EventLogging jobs are runnning.
[11:32:13] RECOVERY - DPKG on eventlog1001 is OK: All packages OK
[11:32:39] RECOVERY - SSH on eventlog1001 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2 (protocol 2.0)
[11:32:59] RECOVERY - RAID on eventlog1001 is OK: OK: no disks configured for RAID
[11:42:30] PROBLEM - puppet last run on mw2035 is CRITICAL: CRITICAL: puppet fail
[11:53:10] RECOVERY - Disk space on eventlog1001 is OK: DISK OK
[12:01:09] RECOVERY - puppet last run on mw2035 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:02:10] RECOVERY - puppet last run on eventlog1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:09:52] (PS1) KartikMistry: CX: Enable 'newarticle' campaign by default [mediawiki-config] - https://gerrit.wikimedia.org/r/202021 (https://phabricator.wikimedia.org/T95147)
[12:26:37] operations, Wikimedia-Apache-configuration: Apache slash expansion incorrectly redirects from HTTPS to HTTP - https://phabricator.wikimedia.org/T95164#1182055 (Krinkle) NEW
[12:39:32] operations, Wikimedia-Apache-configuration: Apache slash expansion should not redirect from HTTPS to HTTP - https://phabricator.wikimedia.org/T95164#1182064 (Krinkle)
[13:08:52] kart_: Puppet changes (i.e. 201667) aren't really good for SWAT, since mostly only ops has +2 on the puppet repo.
[13:46:30] PROBLEM - puppet last run on mw1115 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:46:59] PROBLEM - puppet last run on mw1080 is CRITICAL: CRITICAL: Puppet has 2 failures
[13:47:39] PROBLEM - puppet last run on mc1016 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:47:50] PROBLEM - puppet last run on mw2034 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:48:57] anomie: manybubbles: Hi! I added a request to the SWAT deployment this morning, an important EducationProgram update that needs a scap. Basically it's important so that in Ukrainian, we don't fall back on newly-arrived Russian translations of EP namespaces
[13:49:14] I added the patches for EP production branches but I don't have +2 on thos branches
[13:49:28] AndyRussG: cool. on moment
[13:49:31] anomie: sure. greg-g added it to make sure it need to merge along with. I'll just remove from list.
[13:49:38] manybubbles: k thanks! :)
[13:50:06] AndyRussG: I suspect anomie will swat today because he has patches in it
[13:50:28] Ah OK
[13:51:26] the swat looks quite full but you seem pretty convinced its important. I know sending russian to a ukranian is certainly weird.
[13:51:58] AndyRussG: can it wait until tomorrow or is the program starting super soon
[13:52:37] AndyRussG: in any case I've +2ed so they'll merge to the release branches. can you create the submodule updates?
[13:53:01] Ops-Access-Requests, operations, Analytics-EventLogging: Grant user 'tomasz' access to dbstore1002 for Event Logging data - https://phabricator.wikimedia.org/T95036#1182107 (Ottomata) Thomas will need to be added to the 'researchers' group. Thomas, you can then read the file /etc/mysql/conf.d/researc...
[13:54:04] manybubbles: I don't know the details, Glaisher and some other folks have flagged it as urgent. The phab task is at "Unbreak now". https://phabricator.wikimedia.org/T73953
[13:55:29] As far as I understood, there's controversy in the Ukrainian community about even having Russian, rather than English, as the fallback language
[13:55:49] operations: eventlog1001 / full - https://phabricator.wikimedia.org/T95154#1182120 (Ottomata) On it... full!!
[13:56:16] AndyRussG: yeah - imagine. I suppose my question is "do they see the russian messages right now?"
[13:56:24] and, also, can you build the submodule branch
[13:56:39] manybubbles: do you know if folks like greg-g are around today?
[13:56:45] sorry, submodule update to core
[13:56:46] or anyone
[13:56:52] know*
[13:57:42] manybubbles: The actual patch had been sitting around for a while--I hadn't noticed that it didn't hadn't gone through--but it became urgent just recent because the Russian translations merged. Yeah, those are in the EP's 1.25wmf23 branch
[13:57:48] anomie: any plans to do a scap in today's swap?
[13:57:55] Yeah! I'll make the patches
[13:58:46] AndyRussG: thanks. I'll get them out this morning.
[13:58:49] aude: let me check
[13:58:53] manybubbles: Maybe. We're at 10 patches now, so the latest added by AndyRussG that would require a scap might get bumped.
[13:59:00] PROBLEM - puppet last run on analytics1018 is CRITICAL: CRITICAL: puppet fail
[13:59:11] manybubbles: thanks
[13:59:15] anomie: i might do my own
[13:59:33] anomie: yeah - I'm somewhat sensitive to those issues so I might see if I can sneak the scap in for them early so they don't get in your way
[14:00:12] aude: I think greg-g will be around.
[14:00:30] manybubbles: Go for it, there's nothing scheduled for the next hour.
[14:01:01] manybubbles: thanks so much! doing the patches now...
[14:01:50] RECOVERY - puppet last run on mw1115 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[14:02:14] manybubbles: ok
[14:02:50] RECOVERY - puppet last run on mc1016 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[14:02:52] we have some small schema changes to do for one our tables on test.wikidata, test2 and test
[14:03:00] RECOVERY - puppet last run on mw2034 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[14:03:01] so might as well do our things together
[14:03:23] but would prefer greg say it's ok before i do them
[14:03:50] RECOVERY - puppet last run on mw1080 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:03:54] * aude can't imagine it not being ok
[14:05:31] * manybubbles imagines its ok too
[14:05:50] AndyRussG and anomie: https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=152338&oldid=152336
[14:05:57] manybubbles: here's 1.25wmf23 https://gerrit.wikimedia.org/r/#/c/202030/
[14:06:36] as much as I've deployed, I've only scaped like 4 times
[14:07:07] manybubbles: Go for it. It takes a long time though, so ideally go for it in the next 10-15 minutes.
[14:07:31] anomie: yeah. I'm confident scap will work and everything will be fine. but yeah. I want to start in 10 minutes.
[14:08:23] never had too much problem with scap
[14:09:47] * aude claims hour after swat
[14:09:51] the* hour
[14:10:57] manybubbles: here's 1.25wmf24 https://gerrit.wikimedia.org/r/#/c/202031/
[14:11:23] aude: stick that on the deployments page?
[14:12:11] manybubbles: done
[14:12:12] manybubbles: yeah the super special deploy on deployments page looks great, thanks...That's a SSMPSWATSWAT then?
[14:12:14] and removed self from swat
[14:13:06] indeed
[14:14:03] Yeah I recently got deploy rights (for CentralNotice) (though I guess I still am not fully deploy-able Gerritwise) but I've never done a scap
[14:14:34] anomie: tin says: It looks like git-am is in progress. Cannot rebase.
[14:14:41] manybubbles@tin:/srv/mediawiki-staging/php-1.25wmf24$ git pull
[14:14:52] any experience with that stuff?
[14:15:10] * anomie looks
[14:15:49] I suppose I could abort the am
[14:15:59] RECOVERY - puppet last run on analytics1018 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:22:56] !log manybubbles Started scap: earyly-SWAT: Ukrainian translations for EducationProgram
[14:23:02] Logged the message, Master
[14:23:26] that pig is new
[14:24:57] Which pig?
[14:25:20] (PS3) Rush: phab: update phab version in labs to 2015-02-18 [puppet] - https://gerrit.wikimedia.org/r/201857 (owner: Dzahn)
[14:26:34] AndyRussG: its an ascii flying pig with sweet goggles that spits out when you run scap
[14:26:50] looks like "scap"
[14:26:57] except scap look a bit more like scop
[14:28:15] manybubbles: whoa............. amazing
[14:28:36] Even better than apt-get's super cow!
[14:29:18] way fancier. it has color
[14:31:26] What a great perk for scappers!
[14:31:35] operations, Labs, hardware-requests: eqiad: (6) labs virt nodes - https://phabricator.wikimedia.org/T89752#1182146 (Cmjohnson) Andrew, What naming convention do you want to use? Stick with labs10xx for now or start with something new? Also, do you want these in row D or are there any cisco's we ca...
[14:51:03] manybubbles: Still scapping? Do you just want to handle SWAT too?
[14:51:37] anomie: I certainly can if you'd like to go do other things. its still scaping
[14:51:46] 75%
[14:52:21] manybubbles: I'll have to be around to verify my two patches didn't somehow break editing, but it'd be easier for you to just SWAT when the scap is done instead of trying to hand-off.
[14:52:33] k
[14:55:45] manybubbles: anomie: thanks once again!
[15:00:04] manybubbles, anomie, ^d, thcipriani: Dear anthropoid, the time has come. Please deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20150406T1500).
[15:00:09] PROBLEM - puppet last run on wtp2016 is CRITICAL: CRITICAL: puppet fail
[15:01:45] I'm around to 'test'.
[15:01:45] * greg-g looks around
[15:02:05] Scappy the flying pig in all it's ascii art glory -- https://wikitech.wikimedia.org/wiki/Wikimedia_binaries#/media/File:Scap-logo-white-on-black.png
[15:02:41] bd808: lovely
[15:02:48] will we see it each time? :)
[15:03:48] !log manybubbles Finished scap: earyly-SWAT: Ukrainian translations for EducationProgram (duration: 40m 52s)
[15:03:52] Logged the message, Master
[15:03:53] It will print at the start of scap, sync-file and sync-dir
[15:04:07] AndyRussG: done
[15:04:11] wooo sync-dir too
[15:04:13] manybubbles: nice!
[15:05:33] anomie: +2ed your patches.
[15:06:59] akosiaris: can you merge https://gerrit.wikimedia.org/r/#/c/201667/
[15:07:00] (CR) KartikMistry: [C: 1] "Good to go anytime now." [puppet] - https://gerrit.wikimedia.org/r/201667 (owner: KartikMistry)
[15:08:18] he has to wear goggles because of the speed!
[15:08:23] manybubbles: confirmed it's there on uk.wikipedia.org
[15:08:48] manybubbles: Let me know when they're deployed and I'll double-check that editing isn't broken. Unfortunately if I knew how to actually make the exceptions be thrown I wouldn't need the patches in the first place...
[15:10:15] AndyRussG: sweet!
[15:10:20] anomie: you want me to do just wmf24?
[15:10:26] first
[15:10:28] that is
[15:10:52] manybubbles: That's testwiki? That's what I ususally do.
[15:11:37] anomie: I'm going to have to hand off to you in 20 or 30 minutes actually - I have to do another interview and I have to prepare
[15:11:45] I hadn't realized it
[15:11:47] until now
[15:13:51] manybubbles: ok. I see kart_ and gilles are already here, while jhernandez is not in any channels I'm in and probably isn't even the right jhernandez.
[15:14:08] anomie: yes
[15:14:12] anomie: it's meee
[15:14:18] first time around for this
[15:14:24] Blocked-on-Operations, Ops-Access-Requests, operations: Access to francium - https://phabricator.wikimedia.org/T94093#1182222 (GWicke) @ArielGlenn and @RobH, any ETA on the shell access so that we can start testing? If this is going to take longer, any objections against testing on ruthenium in the mea...
[15:14:42] joakino: Ah, you (or whoever put your name on https://wikitech.wikimedia.org/wiki/Deployments) forgot to put your correct irc nick.
[15:15:29] woops
[15:15:33] Blocked-on-Operations, Ops-Access-Requests, operations: Access to francium - https://phabricator.wikimedia.org/T94093#1182226 (RobH) This was discussed in the last operations meeting, and decided then that Ariel had to work this out with you guys. (I was only involved as I was the clinic person that w...
[15:16:03] Blocked-on-Operations, Ops-Access-Requests, operations: Access to francium - https://phabricator.wikimedia.org/T94093#1182227 (RobH) So until that dicsusison takes place with you guys deciding how this is going to roll out and implement, this is stalled.
[15:16:27] joakino: Do you know how to make the patch to update your submodule in the mediawiki/core repo?
[15:17:09] RECOVERY - puppet last run on wtp2016 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[15:17:10] anomie: no idea, cc: phuedx
[15:17:33] joakino, anomie: has that thing been merged?
[15:18:06] phuedx: They merged a version of it in master, but not the wmf23 cherry-pick yet.
[15:18:38] anomie: your wmf24 patch is merged. want me to deploy it?
[15:18:48] manybubbles: Please do
[15:19:04] phuedx: anomie: both ps1 or 2 are fine. the emitted scroll event is not used anywhere, and has been in master MobileFrontend for a bit
[15:19:15] joakino: Instructions are at https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Updating_the_submodule; the down-side is that it requires checking out the appropriate wmf branch of mediawiki/core, which can take a long time if you're on a slow connection.
[15:19:37] have we started SWAT?
[15:19:56] !log manybubbles Synchronized php-1.25wmf24/includes/Revision.php: SWAT try and catch funky revision errors 1/2 (duration: 00m 12s)
[15:19:58] anomie: i'm set up to do it -- if joakino's fine with the cherry-pick (and updates master soon) then i'll merge it and make the patch
[15:19:59] Logged the message, Master
[15:20:02] kart_: manybubbles is starting it, but he says I'll have to finish it because he as other obligations before the end of the window
[15:20:16] please phuedx
[15:20:19] manybubbles: Save on testwiki was not broken
[15:20:24] !log manybubbles Synchronized php-1.25wmf24/includes/page/WikiPage.php: SWAT try and catch funky revision errors 2/2 (duration: 00m 12s)
[15:20:28] Logged the message, Master
[15:20:32] anomie: try again - I had to do the second file
[15:20:47] manybubbles: Still not broken
[15:20:53] anomie: ok - doing 23
[15:20:53] anomie: also merge time for core in addition :)
[15:21:21] * phuedx stares at jerkins
[15:21:45] !log manybubbles Synchronized php-1.25wmf23/includes/Revision.php: SWAT try and catch funky revision errors 1/2 (duration: 00m 12s)
[15:21:48] Logged the message, Master
[15:22:07] !log manybubbles Synchronized php-1.25wmf23/includes/page/WikiPage.php: SWAT try and catch funky revision errors 2/2 (duration: 00m 13s)
[15:22:09] anomie: both files now out on 23 ^^^^^^
[15:22:13] Logged the message, Master
[15:22:14] manybubbles: enwiki isn't broken either. Yay!
[15:22:23] anomie: hurray!
[15:22:41] kart_: your turn?
[15:22:52] (CR) Manybubbles: [C: 2] CX: Enable Content Translation in guwiki and viwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/201666 (owner: KartikMistry)
[15:24:17] godog: can you merge https://gerrit.wikimedia.org/r/#/c/201667/ if you're around?
[15:24:23] akosiaris: if you're ^^
[15:24:39] oh boy the config repo on tin isn't clean....
[15:25:13] manybubbles: for future reference -- what do we do in that scenario?
[15:25:27] phuedx: sorry, which scenario?
[15:25:34] unclean config on tin
[15:25:56] manybubbles: I saw some patches merged today earlier
[15:26:03] but forgot to poke people
[15:26:12] operations, Labs, hardware-requests: eqiad: (6) labs virt nodes - https://phabricator.wikimedia.org/T89752#1182256 (Andrew) According to the naming scheme in https://phabricator.wikimedia.org/T95042, let's name these boxes 'labvirt10xx'. You can start with 1001 and when I re-image the other HPs I'll r...
[15:26:13] usually I poke people and halt
[15:26:27] :/
[15:26:30] but sometimes I read the patches, decide they are ok, and then just go on
[15:26:54] manybubbles: +2 without SWAT should be removed :)
[15:27:19] (Merged) jenkins-bot: CX: Enable Content Translation in guwiki and viwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/201666 (owner: KartikMistry)
[15:27:26] kart_: I just did it though! I +2ed your patch but am stuck behind other merge but not deployed patches
[15:27:44] faidon and prateek
[15:28:00] hey
[15:28:01] somethings beta only things get merged
[15:28:02] what did I do?
[15:28:07] I think werdna merged one?
[15:28:09] without syncing to production [15:28:16] I didn't merge anything [15:28:17] not ideal [15:28:29] paravoid: https://gerrit.wikimedia.org/r/#/c/201997/ [15:28:32] oh, probably my patch that someone else merged [15:28:35] paravoid: I think it's actually just your patch, sorry [15:28:36] yeah [15:28:46] I was just looking at git, sorry [15:28:51] no worries [15:28:53] can I sneak another backport into my swat list? :P bd808 just +2ed it on master [15:28:53] kart_: poke andrewbogott with the puppet patch [15:28:53] is safe to sync? [15:29:09] (03CR) 10Alexandros Kosiaris: [C: 032] CX: Add 'gu' and 'vi' in language selector [puppet] - 10https://gerrit.wikimedia.org/r/201667 (owner: 10KartikMistry) [15:29:20] JohnLewis: alex just did :) [15:29:23] or have him beaten by akosiaris :) [15:29:25] it should be [15:29:29] gar! I've got to hand off to anomie soon - swat is already behind because I scaped an hour ago and it bled into it (sorry) [15:29:55] paravoid: k. then I'll sync it with kart_'s config patch [15:30:03] (03CR) 10Rush: [C: 04-1] Puppet run storage upgrade for phd service (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/201864 (https://phabricator.wikimedia.org/T95062) (owner: 10Negative24) [15:30:20] manybubbles: Ok, let's hand off if I have the situation correct: You're in the middle of kart_'s patch, but werdna and gilles merged stuff without deploying (grr). Then gilles and joakino are left? [15:30:27] gilles: Things look pretty open after swat. You can probably do it yourself after Nik and Brad are done with the long list [15:30:39] anomie: close enough [15:30:49] what?
I only merged the extension's commit, as usual [15:30:52] anomie: I've git fetched kart_'s patch but not pulled [15:31:05] it's not gilles, I think, it's someone else but I haven't checked [15:31:09] gilles: https://gerrit.wikimedia.org/r/#/c/201997/ [15:31:12] bd808: check the calendar again :) [15:31:24] don't see a problem with getting gilles' thing in though [15:31:27] oh dammit, sorry about that [15:31:41] * werdna blinks [15:31:44] * werdna waves [15:31:57] unless i filled in the wrong day or such [15:32:15] aude: oops. wikidata isn't on my gcal calendar view [15:32:20] ah [15:32:32] I should have something block my +2 abilities in the 2 hours after waking up [15:32:43] gcal just has the big standing deploy windows on it I think [15:32:56] bd808: i see [15:33:00] werdna: You merged https://gerrit.wikimedia.org/r/#/c/201419/ without deploying it. [15:33:16] ok /me logs out of tin and goes to markup deployments page. good luck anomie [15:33:17] manybubbles: Are you going to finish kart_'s stuff, or am I? [15:33:22] oh, just answered [15:33:23] anomie: you? [15:33:27] sorry [15:33:27] ok [15:33:27] I +2’d a patch on operations/wikimedia-config or whatever it’s called. I didn’t realise that obliged me to deploy it immediately. [15:33:42] Sorry about that. [15:33:51] werdna: Yeah, otherwise you screw up anyone else wanting to deploy a config change. [15:34:09] werdna: Safe for me to deploy it for you, now? [15:34:10] even if it's a beta only change [15:34:14] hmm, would be nice if I could approve something to go out with the “deployment train" [15:34:16] still needs to be synced [15:34:38] werdna: For that, you +1 and talk to the train deployer.
[15:35:10] anomie: It’s adding a config variable that is used by an extension, though obviously i approved that revision this morning and it won’t be live yet [15:35:20] but I assume there’s no harm in an unused config variable lying around for a week or two [15:36:19] RECOVERY - Unmerged changes on repository mediawiki_config on tin is OK: No changes to merge. [15:36:46] godog: any clinic interest from last week that I should pay extra attention to? [15:36:57] !log anomie Synchronized wmf-config/: SWAT: Enable ContentTranslation in the Vietnamese and Gujarati Wikipedia, and sync some other changes that naughty people didn't sync themselves but say are safe. (duration: 00m 12s) [15:37:03] werdna, gilles, kart_: Check that nothing broke, please ^ [15:37:03] Logged the message, Master [15:37:14] anomie: checking.. [15:37:15] werdna: Shouldn't be [15:37:55] anomie: that thing was unused [15:38:05] andrewbogott: I can give you like patches you can deal with while on duty this week if you want :) [15:38:07] turned off a few weeks ago [15:38:12] *like 4 patches [15:38:18] gilles: Good, then nothing should be broken ;) [15:38:34] JohnLewis: sure, you can add me. I need to catch up in phab before I do any code review though. [15:38:37] anomie: https://gerrit.wikimedia.org/r/#/c/202041/ [15:38:40] joakino: ^ [15:39:05] sorry it took so long [15:39:12] andrewbogott: that's fine. adding you now. two are just merges while two require changes to a servers networking conf [15:39:12] gilles: You're next after kart_ confirms his, so be ready to test. [15:39:16] anomie: looks fine. [15:39:24] anomie: well, I don't see CX in SpecialPages :/ [15:39:31] That's bad. [15:39:58] CX? 
[15:40:14] content translation [15:40:21] werdna: Content Translation [15:40:32] Special:ContentTranslation [15:41:10] 10Ops-Access-Reviews, 6operations: Add researchers to 'research' group - https://phabricator.wikimedia.org/T95173#1182321 (10Andrew) 3NEW [15:41:34] anomie: I've sneak-edited the '23 and '24 backports of what bd808 merged earlier [15:41:41] and I'll be ready to test that too [15:42:29] gilles: One of your links seems to be incorrect, and I'll probably do joakino's before those. [15:42:31] anomie: not yet. [15:42:33] If there's time. [15:42:36] mkay, checking [15:43:09] 10Ops-Access-Reviews, 6operations: Add tomasz to 'research' group - https://phabricator.wikimedia.org/T95173#1182321 (10Andrew) [15:43:21] anomie: link fixed [15:43:23] kart_: I see "Content Translation statistics" on those wikis, and I don't see "Content Translation" on frwiki despite that supposedly already being enabled before this patch. [15:43:40] 10Ops-Access-Requests, 6operations, 10Analytics-EventLogging: Grant user 'tomasz' access to dbstore1002 for Event Logging data - https://phabricator.wikimedia.org/T95036#1182337 (10Andrew) p:5Triage>3High [15:44:01] anomie: https://fr.wikipedia.org/wiki/Sp%C3%A9cial:ContentTranslation [15:44:37] kart_: Yeah. And frwiki wasn't one of the ones you just messed with. Should we revert your patch, or leave it? [15:45:10] anomie: I messed? :) [15:45:37] anomie: checking again. [15:46:13] Well, while kart_ is investigating that I'm going to deploy gilles's UploadWizard bump. [15:46:25] * gilles is ready [15:46:32] anomie: my bad. go ahead. [15:46:50] Idiot me. [15:47:25] anomie: is the mobile frontend patch gonna make it?
[15:47:32] joakino: Yes [15:47:35] anomie: sorry for noise, I forgot in excitement that I've to enable Beta :) [15:47:35] !log anomie Synchronized php-1.25wmf24/extensions/UploadWizard/: SWAT: Backport UploadWizard bugfix (duration: 00m 12s) [15:47:38] Logged the message, Master [15:47:39] gilles: ^ Test please [15:47:43] anomie: testing... [15:47:46] wohoo 👍 [15:47:53] joakino: You're next, be ready to test [15:47:58] k [15:49:02] anomie: fix confirmed, uploadwizard looking good on testwiki [15:50:13] !log anomie Synchronized php-1.25wmf23/extensions/MobileFrontend/: SWAT: MobileFrontend: Debounce resize events [[gerrit:201840]] (duration: 00m 12s) [15:50:14] joakino: ^ Test please [15:50:17] Logged the message, Master [15:50:44] on it [15:51:26] 6operations, 10ops-codfw, 3codfw-appserver-setup, 3wikis-in-codfw: mw2208-2209, mw2213 have unreachable mgmt interfaces - https://phabricator.wikimedia.org/T93857#1182352 (10Papaul) 5Open>3Resolved a:3Papaul IDRAC card has been replaced and installation complete on mw2208. [15:51:27] 6operations, 3codfw-appserver-setup, 3wikis-in-codfw: install/deploy codfw appservers - https://phabricator.wikimedia.org/T85227#1182355 (10Papaul) [15:51:40] 10Ops-Access-Requests, 6operations, 10Analytics-Cluster: Requesting access to analytics-users (stat1002) for Jkatz - https://phabricator.wikimedia.org/T94939#1182358 (10Andrew) Once we have approval from both Toby and Howie, the clock will start ticking for this. A simple reply to this ticket from both is s... [15:52:35] gilles: Unless Jenkins has suddenly gotten faster than it has ever been before for mediawiki/core merges, I don't think there's time to properly merge and deploy your other two before the end of the SWAT window. aude might be willing to do a post-SWAT after she does her stuff.
[15:52:52] alright, thanks anomie [15:53:24] 10Ops-Access-Requests, 6operations, 10Analytics-Cluster: Requesting access to analytics-users (stat1002) for Jkatz - https://phabricator.wikimedia.org/T94939#1182362 (10Andrew) [15:53:56] 10Ops-Access-Requests, 6operations, 10Analytics-Cluster: Requesting access to analytics-users (stat1002) for Jkatz - https://phabricator.wikimedia.org/T94939#1176345 (10Andrew) Sorry, not Howie obviously. Whoever Jon's direct manager is, which I can't currently tell from looking at the staff page :/ [15:54:23] andrewbogott: toby [15:54:46] Ah, Jon is on the analytics team? [15:55:26] gilles: what is the change? [15:55:48] anomie: phuedx works fine on http://en.m.wikipedia.beta.wmflabs.org/ [15:55:55] aude: https://gerrit.wikimedia.org/r/#/c/202042/ https://gerrit.wikimedia.org/r/#/c/202039/ [15:56:02] * anomie declares this rocky SWAT closed! [15:56:09] andrewbogott: no, he's PM, and toby is acting new-Howie [15:56:30] oh… ok. That’s simple then. [15:56:31] thanks [15:56:43] :) np [15:56:56] 10Ops-Access-Requests, 6operations, 10Analytics-Cluster: Requesting access to analytics-users (stat1002) for Jkatz - https://phabricator.wikimedia.org/T94939#1182388 (10Andrew) So, I guess just Toby. [15:57:26] i might hold off on our schema stuff until we check with sean [15:57:31] but will do our swat stuff now [15:58:58] 6operations, 10ops-codfw: install cable covers in enclosure's sidewalls - https://phabricator.wikimedia.org/T84072#1182394 (10Papaul) 5Open>3Resolved The covers are in place and the spares are in storage. [15:59:40] gilles: the patches look safe [15:59:47] i can take care of them [15:59:56] aude: thank you! [16:00:05] aude: Respected human, time to deploy Wikidata (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20150406T1600). Please do the needful. 
[16:00:14] I'm tailing dbperformance.log, I should see the change immediately [16:00:27] ok [16:01:50] i'll do https://gerrit.wikimedia.org/r/#/c/202039 first (wmf24) [16:02:10] (03PS1) 10Hoo man: Fix legacy dump link creation [puppet] - 10https://gerrit.wikimedia.org/r/202044 [16:04:04] 7Blocked-on-Operations, 6operations, 10Continuous-Integration, 6Release-Engineering, 6Scrum-of-Scrums: Jenkins: Re-enable lint checks for Apache config in operations-puppet - https://phabricator.wikimedia.org/T72068#1182424 (10Andrew) Hashar, if this is still blocked on reviews then ping me on IRC this w... [16:06:50] 7Blocked-on-Operations, 6operations, 10Continuous-Integration: Build Debian package ruby-jsduck for Jessie - https://phabricator.wikimedia.org/T95008#1182434 (10Andrew) So, to clarify -- you have the packages already built and in the repo for Trusty? Is this as simple as adding those same packages to the Je... [16:07:34] jenkins is slow today... [16:09:29] (03PS5) 10Andrew Bogott: Redirect wikibook(s).(org|com) to www.wikibooks.org [puppet] - 10https://gerrit.wikimedia.org/r/185474 (https://phabricator.wikimedia.org/T87039) (owner: 10Glaisher) [16:12:05] !log removing higher metric for eqiad-ulsfo GTT link [16:12:09] Logged the message, Master [16:13:09] gilles: still waiting on jenkins :/ [16:14:24] (03PS3) 10Andrew Bogott: various role classes - indentation fixes [puppet] - 10https://gerrit.wikimedia.org/r/200110 (https://phabricator.wikimedia.org/T93645) (owner: 10Dzahn) [16:15:27] (03CR) 10Andrew Bogott: [C: 032] various role classes - indentation fixes [puppet] - 10https://gerrit.wikimedia.org/r/200110 (https://phabricator.wikimedia.org/T93645) (owner: 10Dzahn) [16:19:59] aude, you deploying? [16:20:11] Krenair: yes [16:20:13] ok [16:20:14] if jenkins allows [16:20:40] been almost 20 minutes [16:20:40] jgage: since this is related to https, do you want it? 
https://phabricator.wikimedia.org/T95164 [16:20:56] glhf [16:27:01] 10Ops-Access-Requests, 6operations, 10Analytics-Cluster: Requesting access to analytics-users (stat1002) for Jkatz - https://phabricator.wikimedia.org/T94939#1182506 (10JKatzWMF) @ ottomata my direct manager is toby, thanks! [16:27:34] still waiting, though looks like jenkins is almost done [16:27:47] 6operations, 6Services, 7Service-Architecture: Set up monitoring automation for services - https://phabricator.wikimedia.org/T94821#1182507 (10Andrew) p:5Triage>3Normal [16:28:31] 6operations, 10ops-esams, 10procurement: Buy fiber patches - https://phabricator.wikimedia.org/T94846#1182508 (10Andrew) p:5Triage>3High [16:29:14] 6operations, 10MediaWiki-Logging, 6Release-Engineering, 7HHVM: SlowTimer logs should go to their own location, instead of hhvm.log - https://phabricator.wikimedia.org/T94855#1182512 (10Andrew) p:5Triage>3Normal [16:30:05] gilles: wmf24 patch finally merged [16:30:08] still around? [16:30:11] yep [16:30:14] ok [16:30:22] wmf23 should be merged shortly after [16:30:24] 6operations, 7Performance: Optimize prod's resource domains for SPDY/HTTP2 - https://phabricator.wikimedia.org/T94896#1182515 (10Andrew) p:5Triage>3Normal [16:31:45] 6operations: Enable the usage of `hhvm -m debug --debug-host ::1` from mw1017 so developers can step through code (think gdb) in production to see what is going wrong. 
- https://phabricator.wikimedia.org/T94951#1182518 (10Andrew) p:5Triage>3Normal [16:31:57] !log aude Synchronized php-1.25wmf24/includes/profiler/TransactionProfiler.php: Track request method in dbperformance.log (duration: 00m 12s) [16:32:00] Logged the message, Master [16:32:04] there^ [16:32:19] aude: I confirm that it works [16:32:23] ok [16:32:26] * aude doing wmf23 [16:32:31] 6operations, 10Continuous-Integration: Provide Jessie package to fullfil Mediawiki::Packages requirement - https://phabricator.wikimedia.org/T95002#1182528 (10Andrew) p:5Triage>3Normal [16:33:15] (03PS1) 10Ottomata: Set up disk full alerts for eventlogging that will notify the analytics contact group [puppet] - 10https://gerrit.wikimedia.org/r/202048 (https://phabricator.wikimedia.org/T95154) [16:33:34] !log aude Synchronized php-1.25wmf23/includes/profiler/TransactionProfiler.php: Track request method in dbperformance.log (duration: 00m 13s) [16:33:37] Logged the message, Master [16:33:38] 6operations, 7Tracking: Make ircecho much better (Tracking) - https://phabricator.wikimedia.org/T95052#1182535 (10Andrew) p:5Triage>3Low [16:33:38] there^ [16:33:43] and now will do wikidata stuff [16:33:43] 6operations: ircecho should accept input via unix sockets - https://phabricator.wikimedia.org/T95053#1182537 (10Andrew) p:5Triage>3Low [16:33:45] 6operations: Move ircecho config file to be YAML - https://phabricator.wikimedia.org/T95054#1182538 (10Andrew) p:5Triage>3Low [16:33:49] 6operations: Convert ircecho init script to an upstart job - https://phabricator.wikimedia.org/T95055#1182540 (10Andrew) p:5Triage>3Low [16:34:04] (03CR) 10Nuria: [C: 031] Set up disk full alerts for eventlogging that will notify the analytics contact group [puppet] - 10https://gerrit.wikimedia.org/r/202048 (https://phabricator.wikimedia.org/T95154) (owner: 10Ottomata) [16:34:21] (03PS2) 10Ottomata: Set up disk full alerts for eventlogging that will notify the analytics contact group [puppet] - 
10https://gerrit.wikimedia.org/r/202048 (https://phabricator.wikimedia.org/T95154) [16:34:30] 6operations, 10ops-eqiad, 10Analytics-EventLogging: vanadium failed disk /dev/sda - https://phabricator.wikimedia.org/T94926#1182542 (10Andrew) p:5Triage>3High [16:35:22] 6operations, 10ops-eqiad, 10Analytics-EventLogging: vanadium failed disk /dev/sda - https://phabricator.wikimedia.org/T94926#1182543 (10Ottomata) FYI, because of this, on Friday we replaced vanadium with eventlog1001. vanadium is no longer in production, and will be decommissioned this week. [16:35:25] 6operations, 3Interdatacenter-IPsec: Kernel panics on Jessie (3.16.0-4-amd64) during IPsec load test - https://phabricator.wikimedia.org/T94820#1182544 (10Andrew) p:5Triage>3High [16:36:02] (03CR) 10Ottomata: [C: 032] Set up disk full alerts for eventlogging that will notify the analytics contact group [puppet] - 10https://gerrit.wikimedia.org/r/202048 (https://phabricator.wikimedia.org/T95154) (owner: 10Ottomata) [16:37:03] aude: it works, thanks! [16:37:26] if I want to test a patch that adds a new project to manifests/role/deployment.pp, do I need to set up a self-hosted saltmaster? [16:38:30] gilles: sure :) [16:39:09] 6operations, 10hardware-requests: Decom/repurpose rbf* hosts - https://phabricator.wikimedia.org/T95153#1182548 (10Andrew) p:5Triage>3Normal [16:39:18] 6operations, 10Datasets-General-or-Unknown, 6Services, 10hardware-requests: Hardware for HTML / zim dumps - https://phabricator.wikimedia.org/T91853#1182553 (10RobH) [16:39:21] 6operations, 5Patch-For-Review: deploy francium for html/zim dumps - https://phabricator.wikimedia.org/T93113#1182552 (10RobH) [16:39:23] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1182550 (10RobH) 5Open>3declined There was an IRC discussion between Ariel, Faidon, & Gabriel. The end result is some testing will need to take place in labs, and the service will be... 
[16:39:50] 6operations, 10ops-eqiad, 10Analytics-EventLogging: vanadium failed disk /dev/sda - https://phabricator.wikimedia.org/T94926#1182554 (10yuvipanda) p:5High>3Low [16:40:24] !log aude Synchronized php-1.25wmf24/extensions/Wikidata: Update property suggester, valueview and fix editlinks bug in client (duration: 00m 19s) [16:40:29] Logged the message, Master [16:40:47] * aude checks [16:43:13] looks good [16:43:43] you have another patch to deploy right? [16:43:59] or was it done already? [16:44:30] almost done [16:44:36] wmf23 now [16:44:45] !log aude Synchronized php-1.25wmf24/extensions/Wikidata: Fix editlinks bug in client (duration: 00m 21s) [16:44:47] PROBLEM - Eventlogging /srv disk space on eventlog1001 is CRITICAL: DISK CRITICAL - free space: / 6168 MB (70% inode=86%): [16:44:49] * aude checks again [16:44:50] Logged the message, Master [16:45:09] mutante: Does ‘deploying’ https://gerrit.wikimedia.org/r/#/c/185474/5 consist of anything other than merging it and then nervously watching a few of those urls? [16:45:17] * aude notes a bunch of Fatal error: request has exceeded memory limit in /srv/mediawiki/php-1.25wmf23/includes/Export.php on line 945 [16:45:32] 10Ops-Access-Requests, 6operations: Access request: +2 on cassandra submodule for services team members - https://phabricator.wikimedia.org/T93775#1182559 (10Ottomata) No, it won't. It will just do the regular puppet lints. However, I think you could still do this with the second commit that updates the subm... [16:45:46] can't be related though but still somewhat concerning [16:46:19] 6operations, 6Phabricator, 10Wikimedia-Bugzilla, 7Tracking: Tracking: Remove Bugzilla from production - https://phabricator.wikimedia.org/T95184#1182562 (10JohnLewis) 3NEW [16:46:49] done?
:) [16:46:50] ottomata: [16:47:06] 6operations, 6Phabricator, 10Wikimedia-Bugzilla, 7Tracking: Tracking: Remove Bugzilla from production - https://phabricator.wikimedia.org/T95184#1182570 (10JohnLewis) [16:47:10] 6operations, 10Wikimedia-Bugzilla: analyze Bugzilla access logs - https://phabricator.wikimedia.org/T86859#1182569 (10JohnLewis) [16:47:14] 10Ops-Access-Requests, 6operations, 10Analytics-Cluster: Requesting access to analytics-users (stat1002) for Jkatz - https://phabricator.wikimedia.org/T94939#1182571 (10Tnegrin) approved [16:47:31] heh... I think I got that the wrong way around >.> [16:47:46] Krenair: i think so [16:48:03] I just wanted to send out a debug logging patch for something [16:48:19] give me a minute [16:48:29] or go ahead and if i need to do anything else, can do after [16:48:44] ah, [16:48:48] i know what i did [16:48:53] 6operations, 6Phabricator, 10Wikimedia-Bugzilla, 7Tracking: Tracking: Remove Bugzilla from production - https://phabricator.wikimedia.org/T95184#1182562 (10JohnLewis) [16:48:55] (03PS2) 10ArielGlenn: Fix legacy dump link creation [puppet] - 10https://gerrit.wikimedia.org/r/202044 (owner: 10Hoo man) [16:48:57] 6operations, 10Wikimedia-Bugzilla: analyze Bugzilla access logs - https://phabricator.wikimedia.org/T86859#977836 (10JohnLewis) [16:49:05] * aude synced wmf24 twice and not wmf23 [16:49:10] hah [16:49:24] can i do that now? [16:49:30] sure [16:49:32] k [16:49:33] I haven't touched anything [16:49:49] * aude was wondering wtf, bug not fixed [16:49:59] !log aude Synchronized php-1.25wmf23/extensions/Wikidata: Fix editlinks bug in client (duration: 00m 21s) [16:50:02] Logged the message, Master [16:50:14] (03CR) 10ArielGlenn: [C: 032] Fix legacy dump link creation [puppet] - 10https://gerrit.wikimedia.org/r/202044 (owner: 10Hoo man) [16:50:18] looks good now :) [16:50:21] * aude is donoe [16:50:23] done* [16:51:00] ^d, what were you using to debug log stuff in production? [16:51:18] <^d> fatalmonitor and intuition. 
[16:51:29] <^d> Plus slapping extra wfDebugLogGroup() calls if needed [16:51:32] <^d> Or throwing an exception [16:51:36] 6operations, 10Wikimedia-Bugzilla: analyze Bugzilla access logs - https://phabricator.wikimedia.org/T86859#1182575 (10JohnLewis) [16:51:38] 6operations, 6Phabricator, 10Wikimedia-Bugzilla: Bugzilla HTML static version and database dump - https://phabricator.wikimedia.org/T1198#1182576 (10JohnLewis) [16:51:42] 6operations, 6Phabricator, 10Wikimedia-Bugzilla, 7Tracking: Tracking: Remove Bugzilla from production - https://phabricator.wikimedia.org/T95184#1182574 (10JohnLewis) [16:51:53] wfDebugLog, with a wgDebugLogGroups entry, ^d? [16:53:18] (03PS1) 10Ottomata: Don't use critical => 'true' for eventlogging disk space alerts; also fix /srv alert path [puppet] - 10https://gerrit.wikimedia.org/r/202050 [16:53:44] (03PS2) 10Ottomata: Don't use critical => 'true' for eventlogging disk space alerts; also fix /srv alert path [puppet] - 10https://gerrit.wikimedia.org/r/202050 [16:53:57] <^d> Krenair: That [16:54:30] (03CR) 10Ottomata: [C: 032] Don't use critical => 'true' for eventlogging disk space alerts; also fix /srv alert path [puppet] - 10https://gerrit.wikimedia.org/r/202050 (owner: 10Ottomata) [16:57:11] ok... let's see then [16:57:15] !log krenair Synchronized wmf-config/InitialiseSettings.php: debug logging (duration: 00m 12s) [16:57:20] Logged the message, Master [16:59:15] !log krenair Synchronized php-1.25wmf23/includes/libs/MapCacheLRU.php: debug logging (duration: 00m 12s) [16:59:18] Logged the message, Master [17:00:33] !log krenair Synchronized php-1.25wmf23/includes/libs/MapCacheLRU.php: ok, done (duration: 00m 12s) [17:00:36] Logged the message, Master [17:01:05] !log krenair Synchronized wmf-config/InitialiseSettings.php: done (duration: 00m 14s) [17:01:06] oO What's wrong with MapCacheLRU? 
[17:01:08] Logged the message, Master [17:01:12] got exactly what I was looking for [17:01:15] * hoo just making use of it [17:01:31] 6operations, 6Phabricator: re-use server 'radon' as phab failover - https://phabricator.wikimedia.org/T88818#1182594 (10chasemp) a:5chasemp>3Dzahn >>! In T88818#1051922, @Dzahn wrote: > hey @chasemp this can be reused anytime. just assigning it to you to let you know, because you asked about radon in icing... [17:01:38] hoo, it's been erroring a lot [17:01:49] :S [17:02:00] but only if you did silly things with it [17:02:16] like gave it keys that weren't strings/integers [17:02:52] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1182603 (10GWicke) As requested by @arielglenn, there is now a test dump for svwiki running on the 'htmldump' labs instance. The hope is that svwiki will fit into the 130G storage availab... [17:04:23] 6operations, 6Phabricator, 6Project-Creators: Create policy projects and convert people projects to open - https://phabricator.wikimedia.org/T90491#1182605 (10chasemp) >>! In T90491#1151638, @atgo wrote: > Maybe I'm late in the discussion here and you guys are set - but prefixing the Project with //acl*// mo... 
[17:07:40] PROBLEM - puppet last run on capella is CRITICAL: CRITICAL: puppet fail [17:21:50] RECOVERY - Eventlogging /srv disk space on eventlog1001 is OK: DISK OK - free space: /srv 382859 MB (82% inode=99%): [17:25:00] RECOVERY - puppet last run on capella is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [17:33:56] I wonder if I should have run that debugging for longer actually :/ [17:34:18] and for the Title::newFromText change instead [17:36:25] 6operations, 10Wikimedia-Bugzilla: analyze Bugzilla access logs - https://phabricator.wikimedia.org/T86859#1182758 (10Andrew) access.log and error.log contains about 36 hours' worth of access: # grep -ir bugzilla access.log | wc 1054 22300 314744 # grep -ir static-bugzilla access.log | wc 588 126... [17:38:01] PROBLEM - puppet last run on amssq33 is CRITICAL: CRITICAL: puppet fail [17:38:20] !log restarted eventlogging to deal with log issues [17:38:23] Logged the message, Master [17:42:16] 6operations, 5Patch-For-Review: eventlog1001 / full - https://phabricator.wikimedia.org/T95154#1182775 (10Ottomata) 5Open>3Resolved - deployed change to log less to stderr/stdout - moved /var/log/upstart into /srv/log/upstart and symlinked [17:43:41] hoo, yeah... so actually I'm wondering if I should revert https://gerrit.wikimedia.org/r/#/c/176515/9 [17:47:40] mh [17:48:29] that Title::newFromText bit could be nasty [17:50:33] will look into it a bit later [17:54:23] (03PS1) 10Ottomata: Fix eventlogging graphite consumer on hafnium [puppet] - 10https://gerrit.wikimedia.org/r/202070 [17:54:32] (03PS2) 10Ottomata: Fix eventlogging graphite consumer on hafnium [puppet] - 10https://gerrit.wikimedia.org/r/202070 [17:55:07] (03PS1) 10Shanmugamp7: Enable new user groups (patroller, rollbacker, autopatrolled) on ta wikipedia. 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/202071 (https://phabricator.wikimedia.org/T95180) [17:55:09] (03CR) 10Nuria: [C: 031] Fix eventlogging graphite consumer on hafnium [puppet] - 10https://gerrit.wikimedia.org/r/202070 (owner: 10Ottomata) [17:55:10] RECOVERY - puppet last run on amssq33 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [17:56:46] (03CR) 10Ottomata: [C: 032] Fix eventlogging graphite consumer on hafnium [puppet] - 10https://gerrit.wikimedia.org/r/202070 (owner: 10Ottomata) [17:57:04] (03CR) 10John F. Lewis: [C: 031] "Consensus and change lgtm." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202071 (https://phabricator.wikimedia.org/T95180) (owner: 10Shanmugamp7) [17:58:52] 6operations, 6Collaboration-Team, 10Flow: Flow Exception Caught: DB connection error: Can't connect to MySQL server - https://phabricator.wikimedia.org/T95121#1182872 (10EBernhardson) [18:00:58] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1182882 (10jeremyb) 3NEW [18:03:35] (03PS1) 10Southparkfan: nginx: don't add duplicate slash [puppet] - 10https://gerrit.wikimedia.org/r/202074 (https://phabricator.wikimedia.org/T73152) [18:12:23] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1182954 (10Nemo_bis) Which brings us back to my request of a 500 GB partition on labs. :) https://lists.wikimedia.org/pipermail/labs-l/2015-March/003557.html [18:13:51] (03PS2) 10Andrew Bogott: labs: update bastion values in network.pp [puppet] - 10https://gerrit.wikimedia.org/r/201957 (owner: 10John F. Lewis) [18:15:10] (03CR) 10Andrew Bogott: [C: 032] labs: update bastion values in network.pp [puppet] - 10https://gerrit.wikimedia.org/r/201957 (owner: 10John F. Lewis) [18:15:57] (03PS9) 10Andrew Bogott: Alphabetise site.pp [puppet] - 10https://gerrit.wikimedia.org/r/201850 (owner: 10John F. 
Lewis) [18:16:59] (03CR) 10Andrew Bogott: [C: 032] Alphabetise site.pp [puppet] - 10https://gerrit.wikimedia.org/r/201850 (owner: 10John F. Lewis) [18:18:17] andrewbogott: ^ could go wrong very easily, so be careful :) [18:18:24] should be ok, though... [18:18:29] Yeah, it scares me a bit [18:18:43] but if it breaks it’ll be obvious [18:19:14] hopefully :D [18:25:08] 6operations, 5Patch-For-Review: remove public IP from zirconium - https://phabricator.wikimedia.org/T90676#1183003 (10Andrew) This seems like something that should accompany a re-install... it's a lot of risky mucking around for a simple cleanup on a running system. [18:26:01] (03CR) 10Andrew Bogott: [C: 04-1] "For a running-and-working system, this seems intrusive..." [puppet] - 10https://gerrit.wikimedia.org/r/192827 (https://phabricator.wikimedia.org/T90676) (owner: 10John F. Lewis) [18:33:35] (03CR) 10Alex Monk: [C: 04-1] "This needs to be grant-able to global groups, see I77ec5a14 and the ticket" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/201940 (https://phabricator.wikimedia.org/T94368) (owner: 10Glaisher) [18:35:20] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.00666666666667 [18:35:28] (03PS1) 10Ottomata: Make it possible to reload systemd hosted varnishkafka instances [puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/202084 [18:35:50] (03CR) 10Alex Monk: [C: 031] Add $wgUploadNavigationUrl for iswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/201943 (https://phabricator.wikimedia.org/T95089) (owner: 10Glaisher) [18:36:39] 6operations, 6Collaboration-Team, 10Flow: Flow Exception Caught: DB connection error: Can't connect to MySQL server - https://phabricator.wikimedia.org/T95121#1183039 (10Andrew) Does this happen every time you follow those steps, or was it a one-off? 
[18:36:56] (03CR) 10Ottomata: [C: 032 V: 032] Make it possible to reload systemd hosted varnishkafka instances [puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/202084 (owner: 10Ottomata) [18:37:34] (03PS1) 10Ottomata: Update varnishkafka with reload instance systemd change [puppet] - 10https://gerrit.wikimedia.org/r/202085 [18:37:49] (03PS2) 10Ottomata: Update varnishkafka with reload instance systemd change [puppet] - 10https://gerrit.wikimedia.org/r/202085 [18:41:19] RECOVERY - Slow CirrusSearch query rate on fluorine is OK: CirrusSearch-slow.log_line_rate OKAY: 0.0 [18:43:35] (03CR) 10Ottomata: [C: 032] Update varnishkafka with reload instance systemd change [puppet] - 10https://gerrit.wikimedia.org/r/202085 (owner: 10Ottomata) [18:53:14] 6operations, 10ops-eqiad: labnodepool1001 setup tasks: labels/ports/racktables - https://phabricator.wikimedia.org/T95048#1183118 (10Cmjohnson) ge-7/0/26 description updated. racktables updated, label updated (although the name is super long and didn't fit on the label) Added to [edit interfaces interface-ra... [18:55:19] 6operations, 10ops-eqiad: labnodepool1001 setup tasks: labels/ports/racktables - https://phabricator.wikimedia.org/T95048#1183125 (10Cmjohnson) 5Open>3Resolved [18:55:21] 6operations, 3Continuous-Integration-Isolation: install/deploy labnodepool1001 - https://phabricator.wikimedia.org/T95045#1183126 (10Cmjohnson) [18:55:46] (03CR) 10Andrew Bogott: [C: 032] "You're right, I don't know what that firewall hole is for." [puppet] - 10https://gerrit.wikimedia.org/r/201879 (owner: 10Dzahn) [18:56:08] 6operations, 6Collaboration-Team, 10Flow: Flow Exception Caught: DB connection error: Can't connect to MySQL server - https://phabricator.wikimedia.org/T95121#1183129 (10He7d3r) I wasn't able to reproduce this again. 
[18:56:24] (03CR) 10Andrew Bogott: [C: 032] openstack firewall: avoid hardcoding tendril IP [puppet] - 10https://gerrit.wikimedia.org/r/201875 (owner: 10Dzahn) [18:58:11] ^d, what log group have you been sending stuff to for debugging? [18:58:23] <^d> I don't have a catch-all one [18:58:25] <^d> Maybe we should [18:58:41] <^d> "AdHocDebug" [18:58:44] I had to set one up earlier :/ [18:59:13] then realised I couldn't simply move the file out of fluorine:/a/mw-log to ~ to avoid cluttering the directory [18:59:14] oops [18:59:31] <^d> Yeah a dedicated group would be nice [19:00:39] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: puppet fail [19:02:31] (03PS1) 10Gergő Tisza: Use require_package for python-redis [puppet] - 10https://gerrit.wikimedia.org/r/202093 [19:03:19] (03CR) 10Ori.livneh: [C: 031] Use require_package for python-redis [puppet] - 10https://gerrit.wikimedia.org/r/202093 (owner: 10Gergő Tisza) [19:04:00] (03PS1) 10Alex Monk: Add a new log group for temp live debugging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202094 [19:05:23] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1183157 (10GWicke) I set up a 500G VM using a third-party provider to test enwiki. @nemo_bis, let me know if you want to use that to test ZIM generation as well once I'm done with the HTM... 
[19:18:17] (03CR) 10Negative24: Puppet run storage upgrade for phd service (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/201864 (https://phabricator.wikimedia.org/T95062) (owner: 10Negative24) [19:19:49] RECOVERY - puppet last run on cp3022 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:20:44] (03PS1) 10Dzahn: admin: add tomasz to researchers group [puppet] - 10https://gerrit.wikimedia.org/r/202095 (https://phabricator.wikimedia.org/T95036) [19:21:03] (03PS1) 10Ori.livneh: postgresql: small lint fixes for server.pp [puppet] - 10https://gerrit.wikimedia.org/r/202096 [19:22:16] (03PS2) 10Dzahn: admin: add tomasz to researchers group [puppet] - 10https://gerrit.wikimedia.org/r/202095 (https://phabricator.wikimedia.org/T95036) [19:24:35] mutante, doesn't that need approval from tnegrin/dsicore? [19:28:04] ^d, shall we just do https://gerrit.wikimedia.org/r/202094 ? [19:28:08] Krenair: yes. and i wasn't going to skip that. [19:28:18] ok :) [19:29:20] <^d> Yes [19:29:50] PROBLEM - puppet last run on mw2017 is CRITICAL: CRITICAL: puppet fail [19:30:37] (03CR) 10Alex Monk: [C: 032] Add a new log group for temp live debugging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202094 (owner: 10Alex Monk) [19:30:50] (03CR) 10Dzahn: [C: 032] postgresql: small lint fixes for server.pp [puppet] - 10https://gerrit.wikimedia.org/r/202096 (owner: 10Ori.livneh) [19:32:07] (03CR) 10Dzahn: [C: 04-1] "pending approval" [puppet] - 10https://gerrit.wikimedia.org/r/202095 (https://phabricator.wikimedia.org/T95036) (owner: 10Dzahn) [19:32:15] (03Merged) 10jenkins-bot: Add a new log group for temp live debugging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202094 (owner: 10Alex Monk) [19:32:22] csteipp or Krenair, can I get advice about using TitleBlacklist? 
[19:32:32] sure [19:33:00] (03CR) 10Dzahn: "thank you Andrew" [puppet] - 10https://gerrit.wikimedia.org/r/200110 (https://phabricator.wikimedia.org/T93645) (owner: 10Dzahn) [19:33:03] actually, will move this into a pm since it’s security-ish [19:33:49] !log krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/202094/ - should basically be a no-op for now (duration: 00m 13s) [19:33:52] Logged the message, Master [19:40:58] 10Ops-Access-Requests, 6operations, 10Analytics-EventLogging, 5Patch-For-Review: Grant user 'tomasz' access to dbstore1002 for Event Logging data - https://phabricator.wikimedia.org/T95036#1183202 (10Dzahn) @Mark or @tnegrin Do you approve? [19:41:01] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183204 (10jeremyb-phone) looks like this is @andrew's week [19:41:20] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183208 (10jeremyb-phone) [19:43:13] mutante ^ some fun there ;) [19:45:53] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183219 (10Dzahn) Careful with this! If you change archive content you have to edit the mbox file and then recreate the HTML files from the mbox using mailman. This regularly changes existi... 
[19:46:59] RECOVERY - puppet last run on mw2017 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [19:48:54] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183229 (10Dzahn) see this first https://wikitech.wikimedia.org/wiki/Remove_a_message_from_mailing_list_archive [19:50:44] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183239 (10Krenair) #WMF-Legal is going to need to say that this is a legal obligation, otherwise this request must be rejected. [19:56:30] yo, looks like something is croaking periodically. https://www.mediawiki.org/w/index.php?search=mobile+showcase&title=Special%3ASearch&go=Go [19:56:44] sometimes it shows results without normal styling, sometimes it 503s [19:56:51] greg-g: legoktm manybubbles ^ [19:57:56] umm, that would be a bits issue then [19:58:16] bits is fataling [19:59:44] one of you admin on meta wiki? [20:00:04] gwicke, cscott, arlolra, subbu: Dear anthropoid, the time has come. Please deploy Services – Parsoid / OCG / Citoid / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20150406T2000). [20:00:39] PROBLEM - HTTP 5xx req/min on graphite1001 is CRITICAL: CRITICAL: 28.57% of data above the critical threshold [500.0] [20:00:56] uh oh. [20:00:59] PROBLEM - HTTP error ratio anomaly detection on graphite1001 is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 1 below the confidence bounds [20:02:22] hmm, I can repro that as well [20:02:39] paravoid: bblack ^^ bits failures [20:02:44] ori: does X-Wikimedia-Debug also send bits.wm.o requests to testwiki? 
[20:03:02] mw1017 really [20:03:40] (03Abandoned) 10Andrew Bogott: resolv: selector outside a resource [puppet] - 10https://gerrit.wikimedia.org/r/195516 (owner: 10Matanya) [20:03:45] https://www.mediawiki.org/load.php?debug=false&lang=en&modules=site&only=styles&skin=monobook&* ocassionally says "404 File Not Found" [20:03:51] other times its a 503 [20:04:03] somewhat strange dip in https://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&c=Bits+caches+eqiad&h=&tab=m&vn=&hide-hf=false&m=cpu_report&sh=1&z=small&hc=4&host_regex=&max_graphs=0&s=by+name [20:04:08] but that’s probably not relevant [20:04:30] oh wait, it's w/load.php [20:04:40] (03PS3) 10Andrew Bogott: dynamicproxy: resource attributes quote [puppet] - 10https://gerrit.wikimedia.org/r/195627 (owner: 10Matanya) [20:04:40] https://www.mediawiki.org/w/load.php?debug=false&lang=en&modules=site&only=styles&skin=monobook&* appears to be fine [20:04:54] actually, it’s not just a bits issue [20:05:01] refreshing the original link enough times gives me a 503 [20:05:58] (03CR) 10Andrew Bogott: [C: 032] dynamicproxy: resource attributes quote [puppet] - 10https://gerrit.wikimedia.org/r/195627 (owner: 10Matanya) [20:07:43] can’t repro that anymore [20:07:46] YuviPanda: 503 on special:search? [20:07:53] I can't either [20:07:58] but styles are in and out [20:08:00] I got one repro tho [20:08:01] hmmm [20:08:42] oh my, getting 503s [20:09:18] (03PS4) 10Andrew Bogott: ferm: resource attributes quoting [puppet] - 10https://gerrit.wikimedia.org/r/195858 (https://phabricator.wikimedia.org/T91908) (owner: 10Matanya) [20:09:35] domas: did you just get one? [20:10:38] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183266 (10jeremyb) ok, I had considered including legal here to begin with. public lists on that server generally don't permit content which would be a copyvio if it were posted instead to... 
[20:11:47] (03CR) 10Andrew Bogott: [C: 032] ferm: resource attributes quoting [puppet] - 10https://gerrit.wikimedia.org/r/195858 (https://phabricator.wikimedia.org/T91908) (owner: 10Matanya) [20:14:05] (03PS3) 10Andrew Bogott: puppet_compiler: resource attributes quoting and minor lints [puppet] - 10https://gerrit.wikimedia.org/r/195660 (owner: 10Matanya) [20:14:57] thanks andrewbogott [20:15:11] matanya: a couple of these need manual rebases now :( [20:15:23] 6operations, 10Wikimedia-Mailing-lists: scrub non-free PDF from list archives - https://phabricator.wikimedia.org/T95195#1183285 (10Krenair) Well, roots have to be involved here, because mailman. And therefore legal too, IMO. [20:15:27] (03CR) 10Andrew Bogott: [C: 032] puppet_compiler: resource attributes quoting and minor lints [puppet] - 10https://gerrit.wikimedia.org/r/195660 (owner: 10Matanya) [20:15:27] legoktm: I’m unable to repro anything. [20:15:29] now [20:15:42] I got a 503 on my watchlist on mw.o [20:15:45] enwp is fine though [20:15:51] er, 503 from bits on my watchlist [20:16:06] andrewbogott: i'll do it after the holidays, not on a relable network atm [20:16:29] (03PS1) 10John F. Lewis: remove radon from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202223 [20:16:40] (03CR) 10jenkins-bot: [V: 04-1] remove radon from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202223 (owner: 10John F. Lewis) [20:17:17] (03PS2) 10Dzahn: remove radon from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202223 (https://phabricator.wikimedia.org/T88818) (owner: 10John F. Lewis) [20:17:27] (03CR) 10jenkins-bot: [V: 04-1] remove radon from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202223 (https://phabricator.wikimedia.org/T88818) (owner: 10John F. 
Lewis) [20:17:38] !log updated Parsoid to version d5aa726ebe831e6e7d3343f1dd01d8cc11fba1c3 [20:17:43] Logged the message, Master [20:17:51] YuviPanda: https://gdash.wikimedia.org/dashboards/reqerror/ [20:20:10] (03PS3) 10Dzahn: remove radon from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202223 (https://phabricator.wikimedia.org/T88818) (owner: 10John F. Lewis) [20:22:42] ah, whelp [20:22:57] seems to have died down... [20:24:10] (03PS1) 10John F. Lewis: hadeus / capella: remove from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202226 (https://phabricator.wikimedia.org/T94474) [20:24:24] (03CR) 10Dzahn: [C: 032] remove radon from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202223 (https://phabricator.wikimedia.org/T88818) (owner: 10John F. Lewis) [20:25:36] 6operations, 6Phabricator, 5Patch-For-Review: reclaim radon as spare re-use server 'radon' as phab failover - https://phabricator.wikimedia.org/T88818#1183359 (10Dzahn) [20:29:20] RECOVERY - HTTP 5xx req/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [20:29:58] YuviPanda: I didĄ [20:30:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 37.44, 29.77, 18.24 [20:30:47] uh oh [20:31:41] is 1025 frack? [20:32:52] yes [20:32:55] yup [20:33:11] K4|meeting: ^ [20:33:21] 6operations, 3Interdatacenter-IPsec: Kernel panics on Jessie (3.16.0-4-amd64) during IPsec load test - https://phabricator.wikimedia.org/T94820#1183423 (10Gage) Thanks for the feedback. Steps to reproduce are in the task description, I used IPv4: ``` while true ; do wget -nv -O /dev/null http://10.64.0.170/ind... [20:34:55] YuviPanda: ori: Hi, thanks for the ping. Yes, we definitely care about that, it's a SPOF for all fundraising things. [20:35:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 48.23, 40.50, 25.98 [20:35:43] Jeff_Green just had laptop fail, I'm sure he's getting the icinga notifications but I'll email him a note as well. 
[20:35:57] anyone else in SF tz has root on frack? [20:36:27] awight: can you phone him? [20:36:27] cmjohnson1 ^^ ? [20:36:32] ori: k [20:37:30] (03PS1) 10Yuvipanda: Add .gitreview [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202228 [20:37:44] (03CR) 10Yuvipanda: [C: 032 V: 032] Add .gitreview [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202228 (owner: 10Yuvipanda) [20:38:53] hi, i have root on db1025 [20:39:22] YuviPanda: ori: jgage: I confirmed that Jeff_Green's aware of the situation, he's just locked out of IRC at the moment, probably due to laptop fail. [20:39:32] thanks [20:39:39] the output of show processlist means little to me :\ [20:39:44] :) [20:40:07] jgage: if you can securely paste that somewhere, or email to me, I might see things... [20:40:14] buncha selects to civicrm [20:40:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 46.05, 45.44, 32.25 [20:40:23] i hear phab has a pastebin, lemme look for that [20:40:39] jgage: pls be careful about PII, obviously... [20:40:41] AaronS: there are a lot of exceptions for http://commons.wikimedia.org/w/thumb_handler.php/archive/9/9f/20101013182931!Qanat_illustration-de.svg/120px-Qanat_illustration-de.svg.png in the logs [20:41:00] you can limit the pastebin to custom people [20:41:03] that works [20:41:04] awight: yep [20:41:20] i think i have access as well [20:41:45] grr. cosmic misalignment. [20:41:57] jgage: Is mysqld the cause of the excessive load? [20:42:10] it's all normal stupid civicrm queries [20:42:11] two users [20:42:34] lots and lots of copies of the same query, so they're hitting refresh a lot [20:42:39] aaargh. [20:42:40] OK [20:42:59] Jeff_Green_Reall: can you send me the UIDs? I'll try to flail my arms in front of their desks. [20:43:31] awight: numeric? [20:43:36] sure [20:44:02] Jeff_Green_Reall: and fwiw, the mapping is just, select * from drupal.users where uid in (...) [20:44:33] yeah. 
I'll leave that to you while I fight freenode and linux battles [20:44:35] 6operations, 10RESTBase, 7Performance: Create a path entry point for the REST API under regular domains - https://phabricator.wikimedia.org/T95229#1183654 (10GWicke) 3NEW [20:44:59] fwiw: phab -> left side nav -> Applications -> Utilities -> Paste -> Create Paste -> Visible To -> Custom Policy -> allow users awight -> Save Policy [20:45:12] looks like i don't need it now, but yay now i know how [20:45:19] *adds that application to his navbar* [20:45:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 44.90, 44.13, 35.22 [20:46:38] jgage: when you hit the + like when you create a task, paste is just another option as well [20:46:55] ah cool thanks [20:46:57] upper right corner, the bigger + [20:47:20] mm trying to talk about user interfaces [20:47:24] !log deploying restbase 42db7c422f [20:47:27] Logged the message, Master [20:49:21] (03PS2) 10John F. Lewis: haedus / capella: remove from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202226 (https://phabricator.wikimedia.org/T94474) [20:49:29] (03PS1) 10Legoktm: Force tox -e flake8 to run using python3.4 [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202232 [20:49:37] YuviPanda: ^ [20:50:09] (03CR) 10Yuvipanda: [C: 032 V: 032] Force tox -e flake8 to run using python3.4 [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202232 (owner: 10Legoktm) [20:50:13] legoktm: ^ [20:50:14] done [20:50:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 30.61, 37.45, 35.00 [20:51:03] (03PS3) 10Dzahn: haedus / capella: remove from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202226 (https://phabricator.wikimedia.org/T94474) (owner: 10John F. 
Lewis) [20:51:44] 6operations: Trigger some sort of alert if the memcache-serious log file is filling up at a greater than usual rate - https://phabricator.wikimedia.org/T95231#1183701 (10EBernhardson) 3NEW [20:52:11] (03PS1) 10Yuvipanda: Add LICENSE [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202235 [20:52:16] legoktm: ^ [20:52:22] jgage: If you're still in there, can you go ahead and kill any old queries for the offending users? Will PM the uids. [20:52:33] (03CR) 10Legoktm: [C: 031] Add LICENSE [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202235 (owner: 10Yuvipanda) [20:52:41] !log deployed restbase 42db7c422f [20:52:42] (03PS2) 10Yuvipanda: Add LICENSE [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202235 [20:52:45] Logged the message, Master [20:52:50] (03CR) 10Yuvipanda: [C: 032 V: 032] Add LICENSE [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202235 (owner: 10Yuvipanda) [20:53:10] (03CR) 10Dzahn: [C: 032] haedus / capella: remove from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/202226 (https://phabricator.wikimedia.org/T94474) (owner: 10John F. 
Lewis) [20:55:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 2.71, 19.28, 28.23 [20:56:04] (03PS1) 10Legoktm: jenkins job validation, do not submit [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202236 [20:56:15] (03PS1) 10Ori.livneh: ircecho message beautification tweaks [puppet] - 10https://gerrit.wikimedia.org/r/202237 [20:57:14] (03CR) 10Ori.livneh: [C: 032] ircecho message beautification tweaks [puppet] - 10https://gerrit.wikimedia.org/r/202237 (owner: 10Ori.livneh) [20:57:18] (03Abandoned) 10Legoktm: jenkins job validation, do not submit [software/tools-manifest] - 10https://gerrit.wikimedia.org/r/202236 (owner: 10Legoktm) [20:57:42] YuviPanda: ^ jenkins configured [20:57:58] 6operations, 5Patch-For-Review: reclaim / decom haedus and capella - https://phabricator.wikimedia.org/T94474#1183723 (10JohnLewis) Removed from site.pp and added to the list of spares [[https://wikitech.wikimedia.org/w/index.php?title=Server_Spares&diff=152418&oldid=152413 | here]]. @dzahn is dealing with the... [20:58:25] 6operations, 6Phabricator, 5Patch-For-Review: reclaim radon as spare re-use server 'radon' as phab failover - https://phabricator.wikimedia.org/T88818#1183725 (10JohnLewis) Removed from site.pp and added to the list of spares [[https://wikitech.wikimedia.org/w/index.php?title=Server_Spares&diff=152418... [20:59:45] 6operations, 5Patch-For-Review: reclaim / decom haedus and capella - https://phabricator.wikimedia.org/T94474#1183728 (10Dzahn) [palladium:~] $ sudo puppetstoredconfigclean.rb haedus.codfw.wmnet Killing haedus.codfw.wmnet...done. [palladium:~] $ sudo puppetstoredconfigclean.rb capella.codfw.wmnet Killing capel... [20:59:50] (03CR) 10Ottomata: "Faidon suggested that we just make vagrant puppet's modulepath include operations/puppet/modules. 
If we do this, then all of my objection" [puppet] - 10https://gerrit.wikimedia.org/r/196335 (https://phabricator.wikimedia.org/T92560) (owner: 10Eevans) [21:00:12] 6operations, 5Patch-For-Review: reclaim / decom haedus and capella - https://phabricator.wikimedia.org/T94474#1183730 (10Dzahn) a:3Dzahn [21:00:19] PROBLEM - check_load on db1025 is CRITICAL: CRITICAL - load average: 4.53, 8.82, 21.10 [21:00:59] (03PS1) 10Hashar: Include README.rst in Sphinx generated doc [tools/scap] - 10https://gerrit.wikimedia.org/r/202240 [21:01:13] 6operations, 6Phabricator, 5Patch-For-Review: reclaim radon as spare re-use server 'radon' as phab failover - https://phabricator.wikimedia.org/T88818#1183738 (10Dzahn) [palladium:~] $ sudo puppetstoredconfigclean.rb radon.eqiad.wmnet Killing radon.eqiad.wmnet...done. this will remove it from monitor... [21:02:19] (03CR) 10Rush: Puppet run storage upgrade for phd service (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/201864 (https://phabricator.wikimedia.org/T95062) (owner: 10Negative24) [21:14:18] (03PS1) 10John F. Lewis: decom rbf[1-2]00[1-2] [puppet] - 10https://gerrit.wikimedia.org/r/202242 (https://phabricator.wikimedia.org/T95153) [21:14:28] (03CR) 10jenkins-bot: [V: 04-1] decom rbf[1-2]00[1-2] [puppet] - 10https://gerrit.wikimedia.org/r/202242 (https://phabricator.wikimedia.org/T95153) (owner: 10John F. Lewis) [21:14:30] (03PS2) 10John F. 
Lewis: decom rbf[1-2]00[1-2] [puppet] - 10https://gerrit.wikimedia.org/r/202242 (https://phabricator.wikimedia.org/T95153) [21:18:07] 6operations, 10RESTBase, 7Performance: Create a path entry point for the REST API under regular domains - https://phabricator.wikimedia.org/T95229#1183806 (10GWicke) [21:18:13] 6operations, 10hardware-requests, 5Patch-For-Review: Decom/repurpose rbf* hosts - https://phabricator.wikimedia.org/T95153#1183808 (10Dzahn) a:3Dzahn [21:21:04] 6operations, 10Deployment-Systems, 6Release-Engineering: Determine Trebuchet/git-deploy maintenance plan - https://phabricator.wikimedia.org/T85008#1183826 (10greg) a:5greg>3demon Reassigning to Chad, he's going to talk with Ryan soon about this (spoiler alert to Ryan) :) [21:26:14] ori: ? [21:26:26] matanya: ? [21:27:00] ori: some users on some projects are complaining special:staticsitcs went crazy [21:27:15] why are you pinging me? [21:27:39] is it related to: https://gerrit.wikimedia.org/r/188066 ? [21:27:50] which is mentioned in https://phabricator.wikimedia.org/T68867 ? [21:28:21] Nemo_bis: do you know what's up? [21:28:41] i doubt he is around at this hour [21:30:20] RECOVERY - check_load on db1025 is OK: OK - load average: 1.05, 1.47, 4.48 [21:37:11] RECOVERY - HTTP error ratio anomaly detection on graphite1001 is OK: OK: No anomaly detected [21:37:47] (03CR) 10Dzahn: [C: 032] "https://gerrit.wikimedia.org/r/#/c/201997/" [puppet] - 10https://gerrit.wikimedia.org/r/202242 (https://phabricator.wikimedia.org/T95153) (owner: 10John F. Lewis) [21:40:59] 6operations, 10hardware-requests, 5Patch-For-Review: Decom/repurpose rbf* hosts - https://phabricator.wikimedia.org/T95153#1183961 (10Dzahn) [palladium:~] $ sudo puppetstoredconfigclean.rb rbf1001.eqiad.wmnet Killing rbf1001.eqiad.wmnet...done. [palladium:~] $ sudo puppetstoredconfigclean.rb rbf1002.eqiad.wm... 
[21:42:41] 6operations, 6Phabricator, 6Project-Creators: Create policy projects and convert people projects to open - https://phabricator.wikimedia.org/T90491#1183974 (10atgo) It's certainly not the end of the world, but my team for sure mostly uses the "projects I am in" on the dashboard. We have a lot of projects spr... [21:42:51] PROBLEM - puppet last run on ms-be2014 is CRITICAL: CRITICAL: puppet fail [21:44:50] (03PS1) 10Dzahn: admin: add jkatz to analytics-privatedata-users [puppet] - 10https://gerrit.wikimedia.org/r/202248 (https://phabricator.wikimedia.org/T94939) [21:54:21] 6operations, 5Patch-For-Review: deploy francium for html/zim dumps - https://phabricator.wikimedia.org/T93113#1184052 (10GWicke) [21:54:23] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1184051 (10GWicke) 5declined>3Open [21:54:25] 6operations, 10Datasets-General-or-Unknown, 6Services, 10hardware-requests: Hardware for HTML / zim dumps - https://phabricator.wikimedia.org/T91853#1184053 (10GWicke) [21:57:18] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1184067 (10GWicke) Re-opened, as we realistically need shell to deploy and run this service. If you really don't plan to grant shell on this box at all, then we'll have to discuss who is... 
[21:59:41] RECOVERY - puppet last run on ms-be2014 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:07:19] (03PS2) 10Yuvipanda: Set has_ganglia=false for labs [puppet] - 10https://gerrit.wikimedia.org/r/201942 (https://phabricator.wikimedia.org/T95107) (owner: 10Gergő Tisza) [22:17:33] (03PS1) 10EBernhardson: Enable the flow extension on wikidatawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202256 [22:20:04] 6operations, 10hardware-requests: Repurpose rhenium as "network insight" host - https://phabricator.wikimedia.org/T95243#1184113 (10faidon) 3NEW [22:34:36] 6operations: Purge > 90 days stat1002:/a/squid/archive/edits - https://phabricator.wikimedia.org/T92339#1184166 (10kevinator) @ezachte uses these logs and @ottomata has asked him if we need to keep them. [22:47:56] (03PS1) 10Chmarkine: dbtree - Raise HSTS max-age to 1 year and add always flag [puppet] - 10https://gerrit.wikimedia.org/r/202267 (https://phabricator.wikimedia.org/T40516) [22:59:41] (03PS1) 10BryanDavis: Remove exotic unicode from ascii logo [tools/scap] - 10https://gerrit.wikimedia.org/r/202271 [23:00:04] RoanKattouw, ^d, Krenair, ebernhardson: Respected human, time to deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20150406T2300). Please do the needful. [23:01:14] ok, so [23:01:25] ebernhardson [23:01:31] \o [23:02:08] bd808: exotic unicode? sounds sexy [23:02:30] so sexy [23:02:37] curvy even [23:03:13] (03CR) 10Dzahn: [C: 031] Remove exotic unicode from ascii logo [tools/scap] - 10https://gerrit.wikimedia.org/r/202271 (owner: 10BryanDavis) [23:03:54] ebernhardson, hm, ok, so this mediawiki core commit touches quite a few files.. [23:04:02] wait, geocities is "oocities" now? [23:04:06] introduces new messages [23:04:14] well, only one. and it's an error. 
ok [23:04:33] mutante: geocites is finally dead but there is a pretty complete archive at oocities [23:04:38] and unlikely to come up if this is deployed successfully [23:04:45] Krenair: yea [23:05:11] edit api on desktop and mobile... [23:05:18] bd808: got it:) [23:05:36] looks fine [23:06:01] ebernhardson, shouldn't mediawiki.messagePoster.wikitext depend on mediawiki.messagePoster? [23:06:57] (03CR) 10BryanDavis: [C: 032] Include README.rst in Sphinx generated doc [tools/scap] - 10https://gerrit.wikimedia.org/r/202240 (owner: 10Hashar) [23:07:00] I might be reviewing this a bit closely for swat but it feels like almost a new feature and we have an hour window, so I want to get this right [23:07:17] (03Merged) 10jenkins-bot: Include README.rst in Sphinx generated doc [tools/scap] - 10https://gerrit.wikimedia.org/r/202240 (owner: 10Hashar) [23:07:59] AaronS: Can I get a +2 on https://gerrit.wikimedia.org/r/#/c/202271 to avoid self-merge nastiness? [23:08:00] Do we usually document that a jQuery.Promise has .done and .fail functions in our own docs? :/ [23:08:07] ebernhardson, ^ [23:09:19] (03CR) 10Aaron Schulz: [C: 032] Remove exotic unicode from ascii logo [tools/scap] - 10https://gerrit.wikimedia.org/r/202271 (owner: 10BryanDavis) [23:10:19] (03PS2) 10John F. Lewis: bugzilla: use https by default for static [puppet] - 10https://gerrit.wikimedia.org/r/201964 [23:11:50] (03Merged) 10jenkins-bot: Remove exotic unicode from ascii logo [tools/scap] - 10https://gerrit.wikimedia.org/r/202271 (owner: 10BryanDavis) [23:12:10] ebernhardson, hello? [23:12:18] Krenair: sorry, subbu poked me in the other room [23:12:41] I think it's fine to swat this if you're going to test it thoroughly. [23:13:09] Krenair, sorry, we were going to do two patches, then we punted one to tomorrow. I thought ebernhardson was also going to remove this one. However, it's fine to do it tonight. 
[23:13:21] ebernhardson, Krenair sorry :) [23:14:27] (03PS1) 10Yuvipanda: base: Allow hiera to override ldap use_dnsmasq variable [puppet] - 10https://gerrit.wikimedia.org/r/202278 (https://phabricator.wikimedia.org/T95240) [23:14:28] thcipriani: can you cherry pick ^ and try? [23:14:39] andrewbogott: ^ [23:14:46] We're still doing https://gerrit.wikimedia.org/r/#/c/202262/ right ebernhardson superm401? [23:14:49] hiera doesn’t inject globals [23:15:38] ebernhardson, superm401: and https://gerrit.wikimedia.org/r/#/c/202256/ ? [23:15:40] YuviPanda: I'll give that a shot [23:15:43] (03CR) 10Andrew Bogott: [C: 031] base: Allow hiera to override ldap use_dnsmasq variable [puppet] - 10https://gerrit.wikimedia.org/r/202278 (https://phabricator.wikimedia.org/T95240) (owner: 10Yuvipanda) [23:15:45] quick responses would be appreciated... [23:16:11] Krenair, no, let's skip that too. [23:16:22] Krenair: just the wikidatawiki config patch [23:16:37] so we're doing the core change and the config change [23:16:39] but not the flow change [23:16:42] that right superm401 & ebernhardson? [23:16:47] Krenair: no, only the config patch [23:17:10] :| [23:17:25] wtf? [23:17:29] Krenair: you complained and found valid issues, whats the :| for :P [23:17:42] They're not issues that would block swat [23:17:49] it will probably work, but we don't have to rush this out [23:17:52] they were potential things to improve on master [23:17:57] ebernhardson, it will work fine. I tested it. [23:18:02] (03CR) 10Alex Monk: [C: 032] Enable the flow extension on wikidatawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202256 (owner: 10EBernhardson) [23:18:15] Krenair, okay, we can do it then. I don't have a strong preference either way. I misunderstood and thought you were concerned enough to be hesitant deploying it. [23:18:47] The only way it wouldn't work is if you loaded 'mediawiki.messagePoster.wikitext' directly which defeats the whole point. 
[23:19:05] However, modules are supposed to load safely if loaded directly, so I'll fix it. [23:22:48] (03Abandoned) 10Negative24: Puppet run storage upgrade for phd service [puppet] - 10https://gerrit.wikimedia.org/r/201864 (https://phabricator.wikimedia.org/T95062) (owner: 10Negative24) [23:23:35] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations: Access to francium - https://phabricator.wikimedia.org/T94093#1184230 (10GWicke) @arielglenn, the svwiki dump finished without issues. Final disk usage was 50G. Now trying dewiki on the labs instance, while enwiki continues to run on the larger VM. [23:25:55] YuviPanda: patch seems to work on staging [23:26:18] (03CR) 10Thcipriani: [C: 031] "Works well on staging-palladium." [puppet] - 10https://gerrit.wikimedia.org/r/202278 (https://phabricator.wikimedia.org/T95240) (owner: 10Yuvipanda) [23:26:29] (03Merged) 10jenkins-bot: Enable the flow extension on wikidatawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/202256 (owner: 10EBernhardson) [23:26:29] thcipriani: sweet. let me emrge [23:26:31] *merge [23:26:49] (03CR) 10Yuvipanda: [C: 032] base: Allow hiera to override ldap use_dnsmasq variable [puppet] - 10https://gerrit.wikimedia.org/r/202278 (https://phabricator.wikimedia.org/T95240) (owner: 10Yuvipanda) [23:27:25] !log krenair Synchronized flow.dblist: https://gerrit.wikimedia.org/r/#/c/202256/ - flow to wikidatawiki (duration: 00m 12s) [23:27:26] ebernhardson, superm401 ^ [23:27:30] Logged the message, Master [23:27:34] please test etc. [23:28:17] Krenair, thanks. https://www.wikidata.org/wiki/Special:Version and https://www.wikidata.org/wiki/Special:EnableFlow behave as expected. [23:28:38] We don't have any Flow boards yet (not sure where we want them). I expect these will be added soon, but as is it's working normally. [23:29:05] ok, great [23:29:32] btw, in case you missed it [23:29:37] Do we usually document that a jQuery.Promise has .done and .fail functions in our own docs? 
:/ [23:30:10] Krenair, I did that to document the parameter passed to the done callback: [23:30:12] * @return {mw.messagePoster.MessagePoster} return.done.poster MessagePoster [23:30:21] I tried skipping the return.done level first, but jsduck does not allow that. [23:31:23] 6operations, 6Engineering-Community, 6WMF-Legal, 6WMF-NDA: Implement the Volunteer NDA process in Phabricator - https://phabricator.wikimedia.org/T655#1184247 (10Dzahn) 5Resolved>3Open [23:32:12] Krinkle: Re ---^^ what is the preferred jsduck syntax to document the type of an object that a returned promise is resolved with? [23:32:36] (03PS4) 10Negative24: phab: update phab version in labs to 2015-02-18 [puppet] - 10https://gerrit.wikimedia.org/r/201857 (owner: 10Dzahn) [23:32:58] RoanKattouw: {jQuery.Promise} [23:33:19] Avoid documenting the whole object/.done{Function} etc. [23:33:35] OK [23:33:37] superm401: ---^^ [23:33:54] oh hi Krinkle [23:33:58] I was wondering where you were today [23:34:08] I'm chasing bunnies [23:34:09] ecmabot went nuts and duplicated itself a few times again [23:34:16] Krinkle, like * @return {jQuery.Promise} Promise resolving to a {mw.messagePoster.MessagePoster} ? [23:34:36] superm401: no {} on the second one, but yeah. That should link it. [23:34:36] 6operations, 10RESTBase, 10hardware-requests: Expand RESTBase cluster capacity - https://phabricator.wikimedia.org/T93790#1184264 (10GWicke) Re init scripts: We could consider using [systemd instances](http://0pointer.de/blog/projects/instances.html) to avoid actually having to create init scripts. This feat... [23:34:39] (03CR) 10Rush: [C: 032 V: 032] phab: update phab version in labs to 2015-02-18 [puppet] - 10https://gerrit.wikimedia.org/r/201857 (owner: 10Dzahn) [23:34:44] Krenair: Aye, is it fixed now? [23:34:56] Krenair: I blame jstart for having -once not be once. 
[23:35:05] there's a phab task for it [23:35:07] >>> no [23:35:07] <1JTAAOGUB> Krenair: ReferenceError: no is not defined [23:35:08] <17SAB9VNK> Krenair: ReferenceError: no is not defined [23:35:08] <64MACH3PW> Krenair: ReferenceError: no is not defined [23:35:08] Krenair: ReferenceError: no is not defined [23:35:29] eh? [23:35:40] >> truth [23:35:41] Krinkle: ReferenceError: truth is not defined [23:35:41] <17SAB9VNK> Krinkle: ReferenceError: truth is not defined [23:35:41] <1JTAAOGUB> Krinkle: ReferenceError: truth is not defined [23:35:41] <64MACH3PW> Krinkle: ReferenceError: truth is not defined [23:36:00] * Krinkle fixes [23:36:09] oh, other thing Krinkle [23:36:12] https://gerrit.wikimedia.org/r/#/c/202218/ [23:36:34] and I have some other things I should swat [23:36:41] while I have a chance [23:37:27] (03CR) 10Alex Monk: [C: 032] Add $wgUploadNavigationUrl for iswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/201943 (https://phabricator.wikimedia.org/T95089) (owner: 10Glaisher) [23:37:35] (03Merged) 10jenkins-bot: Add $wgUploadNavigationUrl for iswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/201943 (https://phabricator.wikimedia.org/T95089) (owner: 10Glaisher) [23:37:40] >> fun! [23:37:40] legoktm: SyntaxError: Unexpected token ! [23:37:40] <17SAB9VNK> legoktm: SyntaxError: Unexpected token ! [23:37:40] <1JTAAOGUB> legoktm: SyntaxError: Unexpected token ! [23:37:40] <64MACH3PW> legoktm: SyntaxError: Unexpected token ! [23:38:21] !log krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/201943/ - wgUploadNavigationUrl for iswiki (duration: 00m 12s) [23:38:26] Logged the message, Master [23:43:06] Glaisher, hey [23:46:29] hmm... weird. can't get it to have any actual effect [23:46:45] am wondering if it's affected by caching [23:48:57] ah [23:49:01] because the on-wiki css hides it. 
[23:49:03] ok :) [23:50:16] PROBLEM - puppet last run on subra is CRITICAL: CRITICAL: Puppet has 1 failures [23:51:11] (PS2) Alex Monk: Enable transwiki import from English Wikisource on Telugu Wikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/201196 (https://phabricator.wikimedia.org/T94531) (owner: Pmlineditor) [23:51:22] (PS3) Alex Monk: Enable transwiki import from English Wikisource on Telugu Wikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/201196 (https://phabricator.wikimedia.org/T94531) (owner: Pmlineditor) [23:51:34] (CR) Alex Monk: [C: 2] Enable transwiki import from English Wikisource on Telugu Wikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/201196 (https://phabricator.wikimedia.org/T94531) (owner: Pmlineditor) [23:51:39] (Merged) jenkins-bot: Enable transwiki import from English Wikisource on Telugu Wikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/201196 (https://phabricator.wikimedia.org/T94531) (owner: Pmlineditor) [23:52:18] !log krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/201196/ (duration: 00m 11s) [23:52:25] Logged the message, Master [23:52:36] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures [23:53:05] (CR) Alex Monk: [C: 2] Switch CA icon from 'wiki' to 'wikipedia' [mediawiki-config] - https://gerrit.wikimedia.org/r/194084 (https://phabricator.wikimedia.org/T91340) (owner: MaxSem) [23:53:07] PROBLEM - puppet last run on mw1213 is CRITICAL: CRITICAL: Puppet has 2 failures [23:53:12] (Merged) jenkins-bot: Switch CA icon from 'wiki' to 'wikipedia' [mediawiki-config] - https://gerrit.wikimedia.org/r/194084 (https://phabricator.wikimedia.org/T91340) (owner: MaxSem) [23:53:37] PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: Puppet has 1 failures [23:53:43] !log krenair Synchronized wmf-config/InitialiseSettings.php: 
https://gerrit.wikimedia.org/r/#/c/194084/ (duration: 00m 12s) [23:53:46] PROBLEM - puppet last run on mw1084 is CRITICAL: CRITICAL: Puppet has 1 failures [23:53:46] Logged the message, Master [23:54:07] PROBLEM - puppet last run on mw1149 is CRITICAL: CRITICAL: Puppet has 1 failures [23:54:27] PROBLEM - puppet last run on mw1111 is CRITICAL: CRITICAL: Puppet has 1 failures [23:54:36] PROBLEM - puppet last run on mw1151 is CRITICAL: CRITICAL: Puppet has 1 failures [23:55:51] (CR) Alex Monk: [C: 2] Enable new user groups (patroller, rollbacker, autopatrolled) on ta wikipedia. [mediawiki-config] - https://gerrit.wikimedia.org/r/202071 (https://phabricator.wikimedia.org/T95180) (owner: Shanmugamp7) [23:55:59] (Merged) jenkins-bot: Enable new user groups (patroller, rollbacker, autopatrolled) on ta wikipedia. [mediawiki-config] - https://gerrit.wikimedia.org/r/202071 (https://phabricator.wikimedia.org/T95180) (owner: Shanmugamp7) [23:56:44] !log krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/202071/ (duration: 00m 14s) [23:56:49] Logged the message, Master [23:57:56] RECOVERY - puppet last run on mw1151 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [23:58:31] false positives [23:59:37] PROBLEM - puppet last run on amssq56 is CRITICAL: CRITICAL: puppet fail