[00:00:04] RoanKattouw ostriches Krenair: Respected human, time to deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160112T0000). Please do the needful. [00:00:05] James_F: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [00:00:35] ok [00:01:15] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1926191 (10Dzahn) a:3Dzahn [00:01:28] (03PS2) 10Alex Monk: Remove wgArticlePath from InitialiseSettings as it's in CommonSettings [mediawiki-config] - 10https://gerrit.wikimedia.org/r/260242 (owner: 10Reedy) [00:01:30] (03CR) 10BryanDavis: [C: 031] "Tested that memory cgroup support was actually enabled on puppet-andrew-cgroup.puppet.eqiad.wmflabs by manually following MediaWiki-Vagran" [puppet] - 10https://gerrit.wikimedia.org/r/262838 (https://phabricator.wikimedia.org/T122734) (owner: 10Andrew Bogott) [00:01:42] (03CR) 10Milimetric: [C: 04-1] "This un-does the blacklisting of MobileWebSectionUsage. It should NOT be merged until that schema is more predictable (by sampling, for e" [puppet] - 10https://gerrit.wikimedia.org/r/263549 (owner: 10Milimetric) [00:02:14] (03CR) 10JanZerebecki: "After applying to the canary, the 3 curl commands from my comment above, return the expected result." 
[puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [00:02:16] (03CR) 10Alex Monk: [C: 032] Remove wgArticlePath from InitialiseSettings as it's in CommonSettings [mediawiki-config] - 10https://gerrit.wikimedia.org/r/260242 (owner: 10Reedy) [00:02:34] 6operations, 7Mail: remove gbyrd from exim alias file - https://phabricator.wikimedia.org/T123285#1926201 (10Dzahn) a:3JKrauska [00:03:48] (03Merged) 10jenkins-bot: Remove wgArticlePath from InitialiseSettings as it's in CommonSettings [mediawiki-config] - 10https://gerrit.wikimedia.org/r/260242 (owner: 10Reedy) [00:04:14] (03PS2) 10BryanDavis: [WIP] Provision MediaWiki-Vagrant on Jessie hosts [puppet] - 10https://gerrit.wikimedia.org/r/245920 [00:04:29] Krenair: I can do the SWAT if you haven't already started [00:04:31] RECOVERY - puppet last run on mw1108 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [00:04:36] (03CR) 10jenkins-bot: [V: 04-1] [WIP] Provision MediaWiki-Vagrant on Jessie hosts [puppet] - 10https://gerrit.wikimedia.org/r/245920 (owner: 10BryanDavis) [00:04:36] I already started [00:04:46] Oh, you're already +2ing things [00:05:36] checked on mw1017 [00:05:39] seems ok [00:05:39] 6operations, 7Mail: Remove exim alias - yuvipanda - https://phabricator.wikimedia.org/T123275#1926226 (10JKrauska) I don't think it has to fully resolve inside exim.. eg. khorn isn't defined itself inside the exim config, but I'm fairly confident it's still delivering to her address on google.. 
[00:06:01] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/260242/ (duration: 00m 30s) [00:06:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:07:37] (03PS4) 10Alex Monk: Exempt private/fishbowl wikis from the global title blacklist [mediawiki-config] - 10https://gerrit.wikimedia.org/r/244140 (https://phabricator.wikimedia.org/T114873) (owner: 10TTO) [00:07:45] (03CR) 10Alex Monk: [C: 032] Exempt private/fishbowl wikis from the global title blacklist [mediawiki-config] - 10https://gerrit.wikimedia.org/r/244140 (https://phabricator.wikimedia.org/T114873) (owner: 10TTO) [00:08:18] (03Merged) 10jenkins-bot: Exempt private/fishbowl wikis from the global title blacklist [mediawiki-config] - 10https://gerrit.wikimedia.org/r/244140 (https://phabricator.wikimedia.org/T114873) (owner: 10TTO) [00:09:25] (03PS1) 10KartikMistry: WIP: CX: Use ordered_yaml instead of ordered_json [puppet] - 10https://gerrit.wikimedia.org/r/263550 [00:09:31] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/244140/ (duration: 00m 30s) [00:09:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:10:29] !log krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/244140/ (duration: 00m 30s) [00:10:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:12:13] (03PS6) 10Alex Monk: Namespace config change on de.wikivoyage.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/255361 (https://phabricator.wikimedia.org/T119420) (owner: 10Mdann52) [00:12:19] (03CR) 10Alex Monk: [C: 032] Namespace config change on de.wikivoyage.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/255361 (https://phabricator.wikimedia.org/T119420) (owner: 10Mdann52) [00:13:25] helpful timing grrrit-wm [00:14:05] 6operations, 10Deployment-Systems, 6Performance-Team, 
10Traffic: Varnish cache busting desired for /static/$VERSION/ resources which change within the lifetime of a branch - https://phabricator.wikimedia.org/T99096#1926319 (10Krinkle) a:3Krinkle [00:14:14] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/255361/ (duration: 00m 30s) [00:14:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:15:13] (03PS1) 10MaxSem: Update WMF address [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263551 [00:16:03] (03PS2) 10Alex Monk: Template editor group on hi.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258444 (https://phabricator.wikimedia.org/T120342) (owner: 10Dereckson) [00:16:21] (03CR) 10Alex Monk: [C: 032] Template editor group on hi.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258444 (https://phabricator.wikimedia.org/T120342) (owner: 10Dereckson) [00:16:56] (03Merged) 10jenkins-bot: Template editor group on hi.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258444 (https://phabricator.wikimedia.org/T120342) (owner: 10Dereckson) [00:16:59] (03PS1) 10Andrew Bogott: Horizon: Fix up cache rules [puppet] - 10https://gerrit.wikimedia.org/r/263552 [00:18:22] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258444/ (duration: 00m 30s) [00:18:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:19:56] (03PS3) 10Alex Monk: Enable interface-editor group at urwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258453 (https://phabricator.wikimedia.org/T120348) (owner: 10Luke081515) [00:20:12] RECOVERY - puppet last run on mw2168 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:20:17] (03CR) 10Dzahn: [C: 031] "looks ok. 
note though that this is not wikidata-only, this affects ALL projects and API links, also wikipedia" [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [00:21:48] (03CR) 10Dzahn: "on canary appserver:" [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [00:23:55] (03CR) 10Alex Monk: [C: 032] Enable interface-editor group at urwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258453 (https://phabricator.wikimedia.org/T120348) (owner: 10Luke081515) [00:24:18] (03Merged) 10jenkins-bot: Enable interface-editor group at urwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258453 (https://phabricator.wikimedia.org/T120348) (owner: 10Luke081515) [00:25:02] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [00:25:53] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258453/ (duration: 00m 30s) [00:25:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:26:21] Krenair: Thanks for running point [00:27:31] (03PS2) 10Alex Monk: Enable NewUserMessage on ps.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258672 (https://phabricator.wikimedia.org/T121132) (owner: 10Dereckson) [00:27:40] (03CR) 10Alex Monk: [C: 032] Enable NewUserMessage on ps.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258672 (https://phabricator.wikimedia.org/T121132) (owner: 10Dereckson) [00:27:42] PROBLEM - puppet last run on mw2106 is CRITICAL: CRITICAL: Puppet has 1 failures [00:27:56] (03PS5) 10Dzahn: tor: move role to module/role [puppet] - 10https://gerrit.wikimedia.org/r/260065 [00:27:59] 6operations, 7Mail: Remove exim alias - timo: ttijhof - https://phabricator.wikimedia.org/T123330#1926375 (10JKrauska) 3NEW a:3Dzahn [00:28:04] (03Merged) 10jenkins-bot: Enable 
NewUserMessage on ps.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258672 (https://phabricator.wikimedia.org/T121132) (owner: 10Dereckson) [00:28:19] 6operations, 7Mail: Remove exim alias - timo: ttijhof - https://phabricator.wikimedia.org/T123330#1926375 (10JKrauska) Remove the line: timo: ttijhof [00:29:11] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258672/ (duration: 00m 30s) [00:29:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:29:23] (03CR) 10Dzahn: [C: 032] "http://puppet-compiler.wmflabs.org/1573/radium.wikimedia.org/" [puppet] - 10https://gerrit.wikimedia.org/r/260065 (owner: 10Dzahn) [00:30:25] 6operations, 7Mail: Remove exim alias - timo: ttijhof - https://phabricator.wikimedia.org/T123330#1926394 (10Dzahn) [00:30:42] 6operations, 7Mail: Remove Exim Alias - luis: lvilla - https://phabricator.wikimedia.org/T123331#1926396 (10JKrauska) 3NEW a:3Dzahn [00:30:54] (03PS2) 10Alex Monk: Set site name on sr.wiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258670 (https://phabricator.wikimedia.org/T121278) (owner: 10Dereckson) [00:31:04] (03CR) 10Alex Monk: [C: 032] Set site name on sr.wiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258670 (https://phabricator.wikimedia.org/T121278) (owner: 10Dereckson) [00:31:52] (03Merged) 10jenkins-bot: Set site name on sr.wiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258670 (https://phabricator.wikimedia.org/T121278) (owner: 10Dereckson) [00:32:44] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258670/ (duration: 00m 30s) [00:32:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:33:44] 6operations, 7Mail: Move most (all?) 
exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1926407 (10Dzahn) [00:33:45] 6operations, 7Mail: Remove exim alias - timo: ttijhof - https://phabricator.wikimedia.org/T123330#1926405 (10Dzahn) 5Open>3Resolved done. removed on palladium and ran puppet on mx1001: -timo: ttijhof [00:34:10] legoktm, around? [00:34:51] 6operations, 7Mail: Remove Exim Alias - luis: lvilla - https://phabricator.wikimedia.org/T123331#1926409 (10Dzahn) [00:34:52] 6operations, 7Mail: Remove exim alias - timo: ttijhof - https://phabricator.wikimedia.org/T123330#1926408 (10Dzahn) [00:35:58] 6operations, 7Mail: Remove Exim Alias - luis: lvilla - https://phabricator.wikimedia.org/T123331#1926396 (10Dzahn) [00:35:59] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1926411 (10Dzahn) [00:38:12] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1926434 (10Dzahn) [00:38:13] 6operations, 7Mail: Remove Exim Alias - luis: lvilla - https://phabricator.wikimedia.org/T123331#1926432 (10Dzahn) 5Open>3Resolved done. removed on palladium and ran puppet on mx1001/2001: ``` -# Luis Villa -luis: lvilla ``` [00:40:07] 16:41 < eurodyne> that's weird. when I log into meta, it redirects me to the login site :/ [00:40:09] 16:43 < mutante> ugh, confirmed that [00:41:14] 6operations: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1926453 (10JGulingan) 3NEW [00:41:15] go to meta.wm.org, login, get redirected to login wiki [00:41:21] go back to meta.. 
still logged in [00:41:35] Yeah, quiddity just reported that over in -core too [00:41:40] I tried in an incognito window [00:41:43] Logging in redirects me to login.wm.o [00:41:50] If I then go back, I am logged in, but only on that subdomain [00:42:06] So logging in on en.wp.o logs me in there and also on *.wp.o, but not on *.wiktionary.org [00:42:10] reproduced in Firefox-incognito, and chrome-standard. [00:42:40] legoktm: Around? ----^^ [00:45:05] looking [00:45:23] it happens on all projects for me [00:45:25] woah [00:45:27] incl. en.wp and wiktionary [00:46:25] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: rv 443026e3ad18934dd0017a258673d88104cf6b5e (duration: 00m 29s) [00:46:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [00:46:29] hmm [00:46:34] I'm on a bus on slow wifi, RoanKattouw, I assume you and Krenair are on point for debugging and/or getting others to help. [00:47:02] I think that fixed it? [00:47:11] please confirm RoanKattouw quiddity mutante [00:47:15] The question is, who can help [00:47:27] looks fixed to me (en.wp) [00:47:40] csteipp isn't anywhere on the internet, and dapatrick and legoktm are not responding [00:47:40] Checking [00:48:03] Yup, WFM [00:48:03] Thanks Krenair [00:48:15] Thanks much, Krenair :) [00:48:24] SWAT regression? [00:48:25] Krenair, partially. Login now works, but searching from Enwiki for "fr:wikt:" sends me to French Wiktionary, and I'm not logged-in there. [00:48:46] quiddity: I think you'll have to re-login [00:48:46] and refreshing doesn't help.
[00:48:51] wmf on meta too [00:48:52] (03PS1) 10Alex Monk: Revert "Remove wgArticlePath from InitialiseSettings as it's in CommonSettings" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263557 [00:48:59] s/wmf/wfm/g :p [00:49:06] Logging into enwiki in incognito, then going to frwikt in that same window worked [00:49:06] lol [00:49:09] That's just retarded [00:49:13] WHY IS IT IN BOTH PLACES [00:49:17] (03CR) 10Alex Monk: [C: 032] "already in prod" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263557 (owner: 10Alex Monk) [00:49:30] Reedy, so I suspect that some code is looking this up via wgConf [00:49:57] (03Merged) 10jenkins-bot: Revert "Remove wgArticlePath from InitialiseSettings as it's in CommonSettings" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263557 (owner: 10Alex Monk) [00:50:06] Needs digging into further I guess [00:50:32] quiddity, yes, try logging out and back in from enwiki [00:50:58] Still not working, even after logging-out and logging-in. (Chrome, and no extensions. Reproduced in Opera, no extensions) [00:51:16] Login at Enwiki, and then visit any Sister project. Not logged-in. [00:51:47] Ahhh, now logged in. It took a while for SUL to kick in [00:51:51] well it works for me [00:51:52] ok [00:51:58] yes, it's not instant [00:52:02] tgr, ^ [00:52:25] (TL;DR: It's fixed) [00:52:31] thx [00:53:01] RECOVERY - puppet last run on mw2106 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:53:16] login = new mw.Uri( [00:53:16] mw.config.get( 'wgArticlePath' ).replace( '$1', 'Special:CentralAutoLogin/toolslist' ) [00:53:16] ); [00:53:33] that can't be it [00:53:37] that's not it [00:57:45] there was one more for swat but I'd like to wait for legoktm to be around for it [01:10:31] PROBLEM - Unmerged changes on repository mediawiki_config on mira is CRITICAL: There is one unmerged change in mediawiki_config (dir /srv/mediawiki-staging/). 
[01:11:42] 6operations: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1926578 (10Peachey88) How is Bmansurov trying to log in, via LDAP or using the MediaWiki oauth? [01:12:11] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1926587 (10Peachey88) [01:13:24] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1926593 (10bmansurov) Hi, I've tried both methods. After logging in I'm being asked for an app code. [01:15:32] (03CR) 10Alex Monk: "This came up for swat but I wasn't fully comfortable doing it without legoktm being around" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/237686 (owner: 10Legoktm) [01:16:26] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1926603 (10Dzahn) please see the steps outlined on https://www.mediawiki.org/wiki/Phabricator/Help/Two-factor_Authentication_Resets [01:18:29] (03PS1) 10Krinkle: [WIP] Implement /w/static.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263566 (https://phabricator.wikimedia.org/T99096) [01:23:01] do we have something like https://tools.wmflabs.org/cdnjs/ that can be used in production? [01:26:16] no. production code should be committed to production repos. :) [01:29:19] (03CR) 10Jforrester: [C: 031] Consistently use require_once for MWVersion.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263415 (owner: 10Krinkle) [02:17:21] RECOVERY - Unmerged changes on repository mediawiki_config on mira is OK: No changes to merge.
[02:26:05] !log mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 47s) [02:26:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [02:33:00] !log l10nupdate@tin ResourceLoader cache refresh completed at Tue Jan 12 02:33:00 UTC 2016 (duration 6m 55s) [02:33:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [03:06:11] PROBLEM - Eqiad HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [03:06:21] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [03:12:22] RECOVERY - Eqiad HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [03:12:32] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [03:28:53] 6operations: add ema to ops mailing lists - https://phabricator.wikimedia.org/T123256#1926863 (10ema) >>! In T123256#1925372, @Dzahn wrote: > https://lists.wikimedia.org/mailman/listinfo/ops Subscription request sent. [03:37:22] 6operations, 6Performance-Team, 10Wikimedia-General-or-Unknown, 5Patch-For-Review: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1926866 (10faidon) 06:25 UTC is cron.daily, which includes, among others, logrotate. We have three HHVM/MediaWiki-related logrotates, but only `/etc/logrotat... [03:39:23] 6operations, 7Mail: Remove exim alias - yuvipanda - https://phabricator.wikimedia.org/T123275#1926867 (10faidon) That (@JKrauska's comment) is correct — exim does recursive expansion of aliases. Referencing a Google address on the root@ alias is fine. 
[03:42:02] (03CR) 10Faidon Liambotis: [C: 031] "Yes please :)" [puppet] - 10https://gerrit.wikimedia.org/r/263363 (https://phabricator.wikimedia.org/T122665) (owner: 10Muehlenhoff) [03:44:05] 6operations: add ema to icinga (contact / paging) - https://phabricator.wikimedia.org/T123257#1926868 (10ema) a:3ema [03:45:55] 6operations: add ema to ops mail aliases (exim) - https://phabricator.wikimedia.org/T123255#1926876 (10ema) a:3ema [03:47:02] PROBLEM - puppet last run on mw2139 is CRITICAL: CRITICAL: puppet fail [03:51:08] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1926882 (10ema) [03:51:09] 10Ops-Access-Requests, 6operations, 5Patch-For-Review: root shell for ema - https://phabricator.wikimedia.org/T123252#1926881 (10ema) 5Open>3Resolved [03:52:15] 10Ops-Access-Requests, 6operations, 5Patch-For-Review: root shell for ema - https://phabricator.wikimedia.org/T123252#1924616 (10ema) [03:53:05] hi, I'm here [03:56:08] (03CR) 10Legoktm: "It should have been removed from CommonSettings, not InitialiseSettings. 
Core settings should be set in InitialiseSettings, regardless of " [mediawiki-config] - 10https://gerrit.wikimedia.org/r/260242 (owner: 10Reedy) [03:56:15] Krenair, RoanKattouw_away: ^ [03:59:01] PROBLEM - puppet last run on sca1001 is CRITICAL: CRITICAL: Puppet last ran 4 days ago [03:59:39] !log reenabling puppet on sca1001/2; no reason was left [03:59:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [04:00:18] 7Blocked-on-Operations, 6operations, 10RESTBase, 10procurement: Expand SSD space in Cassandra cluster - https://phabricator.wikimedia.org/T121575#1926909 (10GWicke) [04:00:32] PROBLEM - puppet last run on sca1002 is CRITICAL: CRITICAL: Puppet last ran 4 days ago [04:01:12] RECOVERY - puppet last run on sca1001 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [04:02:51] RECOVERY - puppet last run on sca1002 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [04:04:24] 6operations, 10RESTBase, 7RESTBase-architecture: restbase - nodejs package upgrade - puppet fail - https://phabricator.wikimedia.org/T123297#1926924 (10GWicke) @dzahn, you basically migrated the remaining nodes to 4.2. We originally planned to do this tomorrow to be absolutely sure that 4.2 is indeed workin... [04:07:32] !log cleaning up elastic1006's /var/log from old logs [04:07:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [04:09:15] paravoid: Too much logging for logrotate to keep up? :( [04:09:26] logrotate isn't set to compress, so.. [04:10:06] 7Blocked-on-Operations, 6operations, 10RESTBase, 6Services: Switch RESTBase to use Node.js 4.2 - https://phabricator.wikimedia.org/T107762#1926940 (10GWicke) Per T123297, the remaining nodes have been switched to 4.2, but they have not been restarted. To be on the safe side, I think we should temporarily r... [04:10:11] ...it isn't? 
[04:10:14] * ostriches shakes head [04:10:52] paravoid: could you revert restbase1001-1004 to node 0.10 please? [04:11:27] (03PS1) 10Faidon Liambotis: elasticsearch: compress old log files [puppet] - 10https://gerrit.wikimedia.org/r/263579 [04:12:16] gwicke: looking [04:12:21] RECOVERY - puppet last run on mw2139 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [04:12:27] why was there a dpkg alert for those nodes? [04:12:50] (03CR) 10Faidon Liambotis: [C: 032] elasticsearch: compress old log files [puppet] - 10https://gerrit.wikimedia.org/r/263579 (owner: 10Faidon Liambotis) [04:12:50] it seems that whoever uploaded the node packages & added the hold didn't disable / ack the alerts [04:12:52] (03CR) 10Chad: [C: 031] "I like this change, I have indicated it with a completely symbolic +1!" [puppet] - 10https://gerrit.wikimedia.org/r/263579 (owner: 10Faidon Liambotis) [04:13:16] no, a new version in the repo shouldn't result in dpkg alerts [04:13:30] do we have an ensure => latest somewhere? [04:13:43] I hope not [04:13:52] me too, let me check [04:14:26] oh there was a hold set [04:16:03] there isn't any need for setting hold [04:16:03] anyway, I got to run; would be nice to do this migration in a more orderly manner [04:16:12] ttyl [04:16:14] yeah :) [04:16:17] I'll downgrade [04:25:46] 6operations, 10RESTBase, 7RESTBase-architecture: restbase - nodejs package upgrade - puppet fail - https://phabricator.wikimedia.org/T123297#1926953 (10faidon) 5Open>3Resolved a:3faidon This was applied inconsistently (restbase1002/1003 were still running 0.10). I downgraded 1001/1004 to 0.10 again an... 
[04:54:56] (03PS2) 10Andrew Bogott: Horizon: Fix up cache rules [puppet] - 10https://gerrit.wikimedia.org/r/263552 [04:56:34] (03CR) 10Andrew Bogott: [C: 032] Horizon: Fix up cache rules [puppet] - 10https://gerrit.wikimedia.org/r/263552 (owner: 10Andrew Bogott) [05:09:42] (03Abandoned) 10Faidon Liambotis: Don't require nodejs for restbase [puppet] - 10https://gerrit.wikimedia.org/r/229304 (owner: 10GWicke) [05:27:41] PROBLEM - Hadoop HistoryServer on analytics1001 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer [05:31:50] !log rm CirrusSearchRequests.log-201510*.gz on fluorine (saving ~200G) [05:31:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [05:39:36] paravoid: i'm going to deploy https://gerrit.wikimedia.org/r/#/c/263425/ if you don't mind [05:41:37] actually, i'm going to look at the apache log a little first [05:42:22] RECOVERY - Hadoop HistoryServer on analytics1001 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer [06:31:22] PROBLEM - puppet last run on db2055 is CRITICAL: CRITICAL: Puppet has 2 failures [06:31:51] PROBLEM - puppet last run on mw2043 is CRITICAL: CRITICAL: Puppet has 1 failures [06:31:52] PROBLEM - puppet last run on db2056 is CRITICAL: CRITICAL: puppet fail [06:32:12] PROBLEM - puppet last run on wtp2015 is CRITICAL: CRITICAL: Puppet has 2 failures [06:32:42] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures [06:32:51] PROBLEM - puppet last run on mw2021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:33:11] PROBLEM - puppet last run on mw2036 is CRITICAL: CRITICAL: Puppet has 2 failures [06:33:12] PROBLEM - puppet last run on mw2158 is CRITICAL: CRITICAL: Puppet has 1 failures [06:33:51] PROBLEM - puppet last run on mw2050 is CRITICAL: CRITICAL: Puppet has 1 failures [06:34:22] PROBLEM - puppet last run on mw2207 is CRITICAL: CRITICAL: 
Puppet has 1 failures [06:34:41] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 2 failures [06:36:02] PROBLEM - puppet last run on db2049 is CRITICAL: CRITICAL: Puppet has 1 failures [06:40:14] (03PS3) 10Andrew Bogott: Enable memory cgroups for labs debian instances [puppet] - 10https://gerrit.wikimedia.org/r/262838 (https://phabricator.wikimedia.org/T122734) [06:44:07] (03CR) 10Andrew Bogott: [C: 032] Enable memory cgroups for labs debian instances [puppet] - 10https://gerrit.wikimedia.org/r/262838 (https://phabricator.wikimedia.org/T122734) (owner: 10Andrew Bogott) [06:51:30] (03PS3) 10Andrew Bogott: Labs jessie image: override the debian grub with a copy of our current puppetized grub defaults. [puppet] - 10https://gerrit.wikimedia.org/r/262839 (https://phabricator.wikimedia.org/T122734) [06:52:32] (03CR) 10Andrew Bogott: [C: 032] Labs jessie image: override the debian grub with a copy of our current puppetized grub defaults. [puppet] - 10https://gerrit.wikimedia.org/r/262839 (https://phabricator.wikimedia.org/T122734) (owner: 10Andrew Bogott) [06:55:42] RECOVERY - puppet last run on mw2021 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [06:56:02] RECOVERY - puppet last run on mw2036 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [06:56:31] RECOVERY - puppet last run on db2055 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [06:56:51] PROBLEM - Kafka Broker Replica Max Lag on kafka1020 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [5000000.0] [06:56:51] RECOVERY - puppet last run on mw2043 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [06:56:52] RECOVERY - puppet last run on db2056 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:57:21] RECOVERY - puppet last run on wtp2015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures 
[06:57:21] RECOVERY - puppet last run on mw2207 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [06:57:33] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [06:57:42] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:58:12] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:58:51] RECOVERY - puppet last run on mw2050 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:01:12] RECOVERY - puppet last run on db2049 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:05:22] RECOVERY - Kafka Broker Replica Max Lag on kafka1020 is OK: OK: Less than 50.00% above the threshold [1000000.0] [07:14:17] (03PS1) 10Andrew Bogott: Remove apt http proxies from jessie image [puppet] - 10https://gerrit.wikimedia.org/r/263584 [07:15:32] (03CR) 10Andrew Bogott: [C: 032] Remove apt http proxies from jessie image [puppet] - 10https://gerrit.wikimedia.org/r/263584 (owner: 10Andrew Bogott) [07:38:33] !log cr2-eqiad: reenable BGP peerings with GTT [07:38:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [07:41:54] 6operations, 6Performance-Team, 10Wikimedia-General-or-Unknown, 5Patch-For-Review: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1927080 (10ori) To try to detect what caused the change at ~6:30, I counted jobs by type for the two hours before and after the change. I did this on mw1166,... [07:57:28] 6operations, 10netops: User connectivity issues to wikipedias; fine to phabricator et al - https://phabricator.wikimedia.org/T123211#1927097 (10faidon) 5Open>3Invalid a:3faidon The reverse traceroute looks to be to an IP that ends way earlier than the forward traceroute (probably hop 5 from the original... 
[08:00:02] PROBLEM - very high load average likely xfs on ms-be1002 is CRITICAL: CRITICAL - load average: 207.71, 151.20, 75.55 [08:01:14] 6operations, 10ops-codfw: patch/implement new zayo wave (579171) codfw-ulsfo cr1-codfw:xe-5/0/2 - https://phabricator.wikimedia.org/T122823#1927102 (10faidon) [08:05:31] PROBLEM - puppet last run on nescio is CRITICAL: CRITICAL: puppet fail [08:32:52] RECOVERY - puppet last run on nescio is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [08:36:12] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: Puppet last ran 2 days ago [08:37:16] !log ms-be1002: echo b > /proc/sysrq-trigger, kernel misbehaving and unrecoverable (out of kernel memory/XFS issues) [08:37:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [08:40:02] RECOVERY - very high load average likely xfs on ms-be1002 is OK: OK - load average: 13.48, 3.12, 1.03 [08:44:21] PROBLEM - puppet last run on mw1168 is CRITICAL: CRITICAL: Puppet last ran 2 days ago [08:47:43] PROBLEM - puppet last run on mw1167 is CRITICAL: CRITICAL: Puppet last ran 2 days ago [08:50:42] RECOVERY - puppet last run on mw1169 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [08:56:51] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: Puppet last ran 2 days ago [09:01:12] RECOVERY - puppet last run on mw1168 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [09:04:41] RECOVERY - puppet last run on mw1167 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [09:17:42] RECOVERY - puppet last run on mw1166 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [09:32:09] (03CR) 10Giuseppe Lavagetto: [C: 032] etcd: auth puppetization [puppet] - 10https://gerrit.wikimedia.org/r/255155 (https://phabricator.wikimedia.org/T97972) (owner: 10Giuseppe Lavagetto) [09:32:18] (03PS22) 10Giuseppe Lavagetto: etcd: auth puppetization 
[puppet] - 10https://gerrit.wikimedia.org/r/255155 (https://phabricator.wikimedia.org/T97972) [09:44:22] (03CR) 10Reedy: "I did sorta say that in an earlier comment :P" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/260242 (owner: 10Reedy) [09:44:53] (03CR) 10Filippo Giunchedi: "FWIW python-diamond upstream has been receptive of patches in the past, we should send it upstream if it makes sense" [puppet] - 10https://gerrit.wikimedia.org/r/263394 (owner: 10Rush) [09:53:01] PROBLEM - Host mr1-codfw.oob is DOWN: PING CRITICAL - Packet loss = 100% [09:53:21] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 205, down: 1, dormant: 0, excluded: 0, unused: 0BRxe-4/3/1: down - Transit: Telia (IC-308845) {#3861} [10Gbps]BR [09:56:42] I think that's expected, there was planned work from telia [09:59:21] RECOVERY - Host mr1-codfw.oob is UP: PING OK - Packet loss = 0%, RTA = 38.19 ms [10:02:20] Hi ops team [10:02:35] I can't connect to stat1002 anymore, is that expected? [10:03:05] (03CR) 10Sjoerddebruin: [C: 031] Set logos for mobile login page for Wikidata and Wikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263201 (https://phabricator.wikimedia.org/T123175) (owner: 10Aude) [10:04:02] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 207, down: 0, dormant: 0, excluded: 0, unused: 0 [10:07:31] ok, seems back :) [10:11:51] PROBLEM - puppet last run on mw1165 is CRITICAL: CRITICAL: Puppet last ran 2 days ago [10:14:41] PROBLEM - Host mr1-codfw.oob is DOWN: PING CRITICAL - Packet loss = 100% [10:16:41] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 205, down: 1, dormant: 0, excluded: 0, unused: 0BRxe-4/3/1: down - Transit: Telia (IC-308845) {#3861} [10Gbps]BR [10:20:43] mark paravoid akosiaris ^ anything we can do for that to move traffic? 
[10:21:01] RECOVERY - Host mr1-codfw.oob is UP: PING OK - Packet loss = 0%, RTA = 38.92 ms [10:22:21] RECOVERY - puppet last run on mw1165 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [10:23:01] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 207, down: 0, dormant: 0, excluded: 0, unused: 0 [10:33:54] hey [10:34:05] I'm not home [10:34:17] if it's flapping a lot, we can disable [10:34:29] doesn't sound so bad so far [10:34:49] can you check for traffic impact, eg. 500s? [10:35:05] paravoid: hey, flapped twice in the last half an hour with reports of people not being able to connect, checking 500s [10:35:11] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 813 [10:37:41] PROBLEM - puppet last run on mw2034 is CRITICAL: CRITICAL: Puppet has 1 failures [10:38:02] yeah doesn't seem impactful on 500s paravoid [10:39:43] it dropped around ~1.5gbit of traffic afaics [10:40:11] RECOVERY - check_mysql on db1008 is OK: Uptime: 1879402 Threads: 2 Questions: 43567568 Slow queries: 18996 Opens: 60220 Flush tables: 2 Open tables: 416 Queries per second avg: 23.181 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [11:00:36] (03PS1) 10Giuseppe Lavagetto: etcd: turn on authentication in production [puppet] - 10https://gerrit.wikimedia.org/r/263596 [11:03:23] RECOVERY - puppet last run on mw2034 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [11:14:43] (03CR) 10Alexandros Kosiaris: [C: 031] "We should think what else needs to be unique as well in the future, but this LGTM." [puppet] - 10https://gerrit.wikimedia.org/r/263363 (https://phabricator.wikimedia.org/T122665) (owner: 10Muehlenhoff) [11:41:05] 6operations, 10RESTBase, 7RESTBase-architecture: restbase - nodejs package upgrade - puppet fail - https://phabricator.wikimedia.org/T123297#1927317 (10akosiaris) >>! 
In T123297#1926953, @faidon wrote: > This was applied inconsistently (restbase1002/1003 were still running 0.10). I downgraded 1001/1004 to 0... [11:53:14] (03CR) 10Giuseppe Lavagetto: [C: 032] "https://puppet-compiler.wmflabs.org/1575/ shows what seem as correct results" [puppet] - 10https://gerrit.wikimedia.org/r/263596 (owner: 10Giuseppe Lavagetto) [11:57:40] <_joe_> !log enabling auth on the production etcd cluster [11:57:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [12:06:13] (03PS1) 10Giuseppe Lavagetto: etcd: allow reconnection from etcd-manage [puppet] - 10https://gerrit.wikimedia.org/r/263598 [12:07:04] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] etcd: allow reconnection from etcd-manage [puppet] - 10https://gerrit.wikimedia.org/r/263598 (owner: 10Giuseppe Lavagetto) [12:11:08] (03CR) 10Alexandros Kosiaris: "inline comments answered. @Giuseppe, is there a scap3 local command that does all that ? I 'd love to use it and kill all the duplicating " (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: 10Alexandros Kosiaris) [12:13:15] 6operations, 10hardware-requests: Upgrade restbase100[7-9] to match restbase100[1-6] hardware - https://phabricator.wikimedia.org/T119935#1927330 (10fgiunchedi) update, I've tested locally growing a raid0 and was successful ``` # cat /proc/mdstat Personalities : [raid0] md0 : active raid0 sdc[1] sdb[0]... [12:16:12] hey, I'm back [12:17:36] hey paravoid, looks like it stopped flapping and the window is almost over, traffic doesn't seem to be fully back to what it was before tho [12:23:20] 6operations: add ema to ops mail aliases (exim) - https://phabricator.wikimedia.org/T123255#1927343 (10ema) 5Open>3Resolved Done. Thanks @Dzahn and @Joe for your help! 
[12:23:21] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1927346 (10ema) [12:33:00] 6operations, 10LDAP-Access-Requests: ldap/ops membership for ema - https://phabricator.wikimedia.org/T123253#1927355 (10Krenair) I think the ldap-admins (and ops) do something like this from terbium: `modify-ldap-group --addmembers=ema ops` [12:34:26] 6operations: add ema to ops mailing lists - https://phabricator.wikimedia.org/T123256#1927356 (10Krenair) a:3Dzahn [12:42:20] (03CR) 10Alexandros Kosiaris: [V: 04-1] "puppet compiler complains on this one https://puppet-compiler.wmflabs.org/1576/sca1001.eqiad.wmnet/change.sca1001.eqiad.wmnet.err" [puppet] - 10https://gerrit.wikimedia.org/r/263550 (owner: 10KartikMistry) [12:42:32] PROBLEM - Kafka Broker Replica Max Lag on kafka1020 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [5000000.0] [12:44:42] RECOVERY - Kafka Broker Replica Max Lag on kafka1020 is OK: OK: Less than 50.00% above the threshold [1000000.0] [12:50:52] (03PS9) 10Alexandros Kosiaris: Puppet provider for scap3 [puppet] - 10https://gerrit.wikimedia.org/r/262742 [12:51:36] (03PS4) 10Giuseppe Lavagetto: conftool: add support for ACLs, helper scripts [puppet] - 10https://gerrit.wikimedia.org/r/258975 [12:54:06] (03PS5) 10Giuseppe Lavagetto: conftool: add support for ACLs, helper scripts [puppet] - 10https://gerrit.wikimedia.org/r/258975 [12:59:43] (03CR) 10Giuseppe Lavagetto: "LGTM" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/258975 (owner: 10Giuseppe Lavagetto) [13:00:03] (03CR) 10Giuseppe Lavagetto: [C: 032] conftool: add support for ACLs, helper scripts [puppet] - 10https://gerrit.wikimedia.org/r/258975 (owner: 10Giuseppe Lavagetto) [13:08:33] (03PS1) 10Giuseppe Lavagetto: etcd: add allow_reconnect true to the client config [puppet] - 10https://gerrit.wikimedia.org/r/263601 [13:10:10] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] "LGTM" [puppet] - 
10https://gerrit.wikimedia.org/r/263601 (owner: 10Giuseppe Lavagetto) [13:10:12] YuviPanda, now I just get ldap.SIZELIMIT_EXCEEDED when I try to run https://phabricator.wikimedia.org/T108078#1512371 :( [13:11:10] <_joe_> Krenair: why are you doing that btw? [13:11:39] _joe_, ? [13:11:53] <_joe_> auditing the ssh keys between labs and prod [13:11:56] why am I running the script? [13:12:01] <_joe_> yes [13:12:12] <_joe_> is there something specific you're looking at? [13:12:34] no, I just remembered that it hadn't been checked in a while [13:12:49] <_joe_> ok [13:13:24] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 205, down: 1, dormant: 0, excluded: 0, unused: 0BRxe-4/3/1: down - Transit: Telia (IC-308845) {#3861} [10Gbps]BR [13:13:49] <_joe_> akosiaris paravoid ^^ flapping again? [13:14:11] I have trouble connecting to iron so there is something here for sure [13:14:58] from Orange France I can't reach eqiad. Stop at some open transit node in Paris [13:15:37] oh, it's fine again [13:19:29] seems that indeed our telia transit has some problems today [13:20:06] ah telia contacted us about that [13:21:42] at least I can still reach exams :-} [13:21:44] esams [13:21:53] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 207, down: 0, dormant: 0, excluded: 0, unused: 0 [13:22:50] akosiaris: route propagating , i am reaching open transit -> telia now [13:23:50] what I don't get thought is why our routes aren't reachable by another link [13:25:11] (03PS1) 10Ema: nagios: add myself to sms contactgroup [puppet] - 10https://gerrit.wikimedia.org/r/263602 (https://phabricator.wikimedia.org/T123257) [13:27:43] PROBLEM - test icmp reachability to eqiad on ripe-atlas-eqiad is CRITICAL: CRITICAL - failed 42 probes of 381 (alerts on 19) [13:30:21] (03PS6) 10BBlack: Text VCL: Fix up logged-in users caching [puppet] - 10https://gerrit.wikimedia.org/r/259882 [13:35:31] (03PS7) 10BBlack: Text 
VCL: Fix up logged-in users caching [puppet] - 10https://gerrit.wikimedia.org/r/259882 [13:38:44] RECOVERY - test icmp reachability to eqiad on ripe-atlas-eqiad is OK: OK - failed 1 probes of 381 (alerts on 19) [13:41:41] (03CR) 10BBlack: [C: 031] Fix api redirect that come in via https to target https [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [13:47:08] (03CR) 10BBlack: [C: 031] Enable RPS on eth0 on labstores [puppet] - 10https://gerrit.wikimedia.org/r/261598 (owner: 10Mark Bergsma) [13:47:45] (03CR) 10BBlack: "(also, note that merging this patch should remove irqbalance for you when it applies)" [puppet] - 10https://gerrit.wikimedia.org/r/261598 (owner: 10Mark Bergsma) [13:47:58] (yeah ;) [14:02:04] PROBLEM - Kafka Broker Replica Max Lag on kafka1014 is CRITICAL: CRITICAL: 58.33% of data above the critical threshold [5000000.0] [14:03:39] 6operations, 5Patch-For-Review: url-downloader should be set up more redundantly - https://phabricator.wikimedia.org/T122134#1927405 (10akosiaris) Let's actually do this a bit better and not tie the url-downloader service to acamar. I suggest setting up a VM in `codfw` and have the service over there. After th... 
[14:10:24] RECOVERY - Kafka Broker Replica Max Lag on kafka1014 is OK: OK: Less than 50.00% above the threshold [1000000.0] [14:10:33] PROBLEM - puppet last run on mw2022 is CRITICAL: CRITICAL: puppet fail [14:14:44] PROBLEM - Kafka Broker Replica Max Lag on kafka1012 is CRITICAL: CRITICAL: 51.72% of data above the critical threshold [5000000.0] [14:16:42] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] nagios: add myself to sms contactgroup [puppet] - 10https://gerrit.wikimedia.org/r/263602 (https://phabricator.wikimedia.org/T123257) (owner: 10Ema) [14:17:49] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1927414 (10bmansurov) Done: * P2471 * https://www.mediawiki.org/w/index.php?title=User%3ABmansurov_%28WMF%29&type=revision&diff=2014332&oldid=1488052 [14:22:35] (03CR) 10Hashar: "recheck" [puppet] - 10https://gerrit.wikimedia.org/r/244148 (https://phabricator.wikimedia.org/T114887) (owner: 10Hashar) [14:25:14] RECOVERY - Kafka Broker Replica Max Lag on kafka1012 is OK: OK: Less than 50.00% above the threshold [1000000.0] [14:32:01] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1927433 (10ema) [14:32:02] 6operations, 5Patch-For-Review: add ema to icinga (contact / paging) - https://phabricator.wikimedia.org/T123257#1927431 (10ema) 5Open>3Resolved Done. 
[14:37:58] RECOVERY - puppet last run on mw2022 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [14:49:08] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: puppet fail [14:49:59] 6operations: add ema to ops mailing lists - https://phabricator.wikimedia.org/T123256#1927453 (10Joe) a:5Dzahn>3Joe [14:51:27] 6operations, 10LDAP-Access-Requests: ldap/ops membership for ema - https://phabricator.wikimedia.org/T123253#1927455 (10ema) a:3ema [14:52:40] 6operations, 7Pybal: conftool backend errors during merge - https://phabricator.wikimedia.org/T114091#1927458 (10Joe) 5Open>3Resolved [14:53:15] 6operations, 7Pybal: conftool backend errors during merge - https://phabricator.wikimedia.org/T114091#1684335 (10Joe) With the upgrade to etcd 2.2 those problems should be solved. My initial testing didn't show any further issues btw. [14:53:37] 6operations, 10Traffic, 5Patch-For-Review, 7discovery-system, 5services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972#1927462 (10Joe) [14:53:39] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Upgrade conftool to support credentials form a config file - https://phabricator.wikimedia.org/T118833#1927460 (10Joe) 5Open>3Resolved [14:53:53] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Upgrade conftool to support credentials form a config file - https://phabricator.wikimedia.org/T118833#1810322 (10Joe) [14:53:55] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Upgrade python-etcd to 0.4.2+ - https://phabricator.wikimedia.org/T118834#1927463 (10Joe) 5Open>3Resolved [14:54:03] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Upgrade python-etcd to 0.4.2+ - https://phabricator.wikimedia.org/T118834#1810332 (10Joe) a:3Joe [14:54:16] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Upgrade conftool to support credentials form a config file - https://phabricator.wikimedia.org/T118833#1810322 (10Joe) 
a:3Joe [14:54:47] 6operations, 10Traffic, 5Patch-For-Review, 7discovery-system, 5services-tooling: Backport etcd 2.2 to jessie - https://phabricator.wikimedia.org/T118830#1927468 (10Joe) 5Open>3Resolved [14:54:49] 6operations, 10Traffic, 5Patch-For-Review, 7discovery-system, 5services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972#1256151 (10Joe) [14:54:59] !log added myself to ops and wmf ldap groups [14:55:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [14:55:14] 6operations, 10Traffic, 5Patch-For-Review, 7discovery-system, 5services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972#1256151 (10Joe) [14:55:16] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Upgrade the production etcd cluster to 2.2 - https://phabricator.wikimedia.org/T118831#1927470 (10Joe) 5Open>3Resolved a:3Joe [14:55:56] 6operations, 10Traffic, 7discovery-system, 5services-tooling: Create a tool to sync static configuration from a repository to the consistent k/v store - https://phabricator.wikimedia.org/T97978#1927475 (10Joe) [14:55:57] 6operations, 10Traffic, 5Patch-For-Review, 7discovery-system, 5services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972#1927474 (10Joe) 5Open>3Resolved [14:58:30] ema: nice! 
you should be now capable of using the various tools that require ldap auth like icinga, grafana-admin, tendril, graphite, logstash and servermon [14:59:22] (03PS1) 10BBlack: eqiad misc-web addr fixes: 1/5 add new reverse DNS [dns] - 10https://gerrit.wikimedia.org/r/263609 (https://phabricator.wikimedia.org/T83110) [14:59:24] (03PS1) 10BBlack: eqiad misc-web addr fixes: 3/5 switch forward DNS [dns] - 10https://gerrit.wikimedia.org/r/263610 (https://phabricator.wikimedia.org/T83110) [14:59:26] (03PS1) 10BBlack: eqiad misc-web addr fixes: 5/5 remove old reverse DNS [dns] - 10https://gerrit.wikimedia.org/r/263611 (https://phabricator.wikimedia.org/T83110) [14:59:28] (03PS1) 10BBlack: eqiad misc-web addr fixes: 2/5 add new addrs to LVS/caches [puppet] - 10https://gerrit.wikimedia.org/r/263612 (https://phabricator.wikimedia.org/T83110) [14:59:30] (03PS1) 10BBlack: eqiad misc-web addr fixes: 4/5 remove old addrs from LVS/caches [puppet] - 10https://gerrit.wikimedia.org/r/263613 (https://phabricator.wikimedia.org/T83110) [15:00:13] (03PS1) 10Mdann52: Config changes for gu.wikiquote.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263614 (https://phabricator.wikimedia.org/T121853) [15:01:05] (03CR) 10jenkins-bot: [V: 04-1] Config changes for gu.wikiquote.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263614 (https://phabricator.wikimedia.org/T121853) (owner: 10Mdann52) [15:01:11] 6operations, 10hardware-requests: Upgrade restbase100[7-9] to match restbase100[1-6] hardware - https://phabricator.wikimedia.org/T119935#1927484 (10fgiunchedi) I can't reproduce locally by growing a raid0 to a third disk while also having disk activity, I'm proposing the following: * expand the raid1s on res... 
[15:01:24] (03PS2) 10Mdann52: Config changes for gu.wikiquote.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263614 (https://phabricator.wikimedia.org/T121853) [15:07:23] (03CR) 10BBlack: [C: 032] eqiad misc-web addr fixes: 1/5 add new reverse DNS [dns] - 10https://gerrit.wikimedia.org/r/263609 (https://phabricator.wikimedia.org/T83110) (owner: 10BBlack) [15:08:47] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1927489 (10ema) [15:08:48] 6operations, 10LDAP-Access-Requests: ldap/ops membership for ema - https://phabricator.wikimedia.org/T123253#1927487 (10ema) 5Open>3Resolved Done by running the following on terbium.eqiad.wmnet: `modify-ldap-group --addmembers=ema ops` `modify-ldap-group --addmembers=ema wmf` [15:10:43] 6operations: add ema to ops mailing lists - https://phabricator.wikimedia.org/T123256#1927490 (10Joe) Done. [15:11:29] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1927492 (10Joe) [15:11:30] 6operations: add ema to ops mailing lists - https://phabricator.wikimedia.org/T123256#1927491 (10Joe) 5Open>3Resolved [15:11:51] (03CR) 10BBlack: [C: 032] eqiad misc-web addr fixes: 2/5 add new addrs to LVS/caches [puppet] - 10https://gerrit.wikimedia.org/r/263612 (https://phabricator.wikimedia.org/T83110) (owner: 10BBlack) [15:12:23] 6operations, 5Patch-For-Review, 7discovery-system: conftools: hostname creation validation, set != create - https://phabricator.wikimedia.org/T104574#1927493 (10Joe) 5Open>3Resolved [15:16:57] RECOVERY - puppet last run on lvs2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [15:32:22] (03CR) 10BBlack: [C: 032] eqiad misc-web addr fixes: 3/5 switch forward DNS [dns] - 10https://gerrit.wikimedia.org/r/263610 (https://phabricator.wikimedia.org/T83110) (owner: 10BBlack) [15:33:43] (03CR) 10Thiemo Mättig (WMDE): [C: 031] "Our product manager (Lydia) decided we want this." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/263046 (https://phabricator.wikimedia.org/T123112) (owner: 10Thiemo Mättig (WMDE)) [15:39:23] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1927512 (10Krenair) What else is left to do? [15:48:49] (03CR) 10Aude: "if the config is going to be same for test.wikidata, beta and wikidata, then please define it just once in Wikibase.php" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263046 (https://phabricator.wikimedia.org/T123112) (owner: 10Thiemo Mättig (WMDE)) [15:51:22] (03PS3) 10Hashar: varnish: lint varnishlog.py [puppet] - 10https://gerrit.wikimedia.org/r/262597 [15:53:45] (03PS1) 10Filippo Giunchedi: swift: reinstall ms-fe3* with jessie [puppet] - 10https://gerrit.wikimedia.org/r/263617 (https://phabricator.wikimedia.org/T117972) [15:53:58] (03PS1) 10Giuseppe Lavagetto: etcd::client: sort keys in the config files [puppet] - 10https://gerrit.wikimedia.org/r/263618 [15:54:00] (03PS1) 10Giuseppe Lavagetto: conftool: fix scripts syntax [puppet] - 10https://gerrit.wikimedia.org/r/263619 [15:54:02] (03PS1) 10Giuseppe Lavagetto: role::cache: install conftool scripts [puppet] - 10https://gerrit.wikimedia.org/r/263620 [15:54:15] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] swift: reinstall ms-fe3* with jessie [puppet] - 10https://gerrit.wikimedia.org/r/263617 (https://phabricator.wikimedia.org/T117972) (owner: 10Filippo Giunchedi) [15:56:25] !log reprovision ms-fe3001 with jessie [15:56:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [15:59:16] (03PS4) 10BBlack: varnish: lint varnishlog.py [puppet] - 10https://gerrit.wikimedia.org/r/262597 (owner: 10Hashar) [15:59:27] (03CR) 10BBlack: [C: 032 V: 032] varnish: lint varnishlog.py [puppet] - 10https://gerrit.wikimedia.org/r/262597 (owner: 10Hashar) [16:00:04] anomie ostriches thcipriani marktraceur Krenair: Dear anthropoid, the time has come. 
Please deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160112T1600). [16:00:04] James_F: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [16:00:21] * James_F waves. [16:00:45] I'm a block from the office :( [16:01:03] bblack: thank you :-} [16:01:04] (03CR) 10BBlack: [C: 04-1] "This needs to hold until at least Jan 13 @ 16:00 UTC to be sure old address are gone from DNS caches." [puppet] - 10https://gerrit.wikimedia.org/r/263613 (https://phabricator.wikimedia.org/T83110) (owner: 10BBlack) [16:01:41] thcipriani: I can wait. [16:01:41] (03CR) 10BBlack: [C: 04-1] "Needs to wait on merge of Iaa67744ab6a80f4ed242cfcda67e98e7caf42365 first" [dns] - 10https://gerrit.wikimedia.org/r/263611 (https://phabricator.wikimedia.org/T83110) (owner: 10BBlack) [16:02:14] James_F: kk. I'll be up in a sec. [16:02:20] thcipriani: No rush. :-) [16:06:23] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258477 (https://phabricator.wikimedia.org/T121238) (owner: 10Luke081515) [16:07:05] (03Merged) 10jenkins-bot: Enable flood group at lvwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258477 (https://phabricator.wikimedia.org/T121238) (owner: 10Luke081515) [16:08:44] (03PS2) 10Hashar: Get rid of .pep8 files [puppet] - 10https://gerrit.wikimedia.org/r/262598 (https://phabricator.wikimedia.org/T114887) [16:10:10] hmm, no bot today: Enable flood group at lvwiki just sync'd [16:10:12] ^ James_F [16:10:32] (03PS8) 10Hashar: tox entry point to run pep8==1.4.6 [puppet] - 10https://gerrit.wikimedia.org/r/244148 (https://phabricator.wikimedia.org/T114887) [16:11:37] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258474 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:11:49] (03CR) 10jenkins-bot: [V: 04-1] tox entry point to run pep8==1.4.6 [puppet] - 
10https://gerrit.wikimedia.org/r/244148 (https://phabricator.wikimedia.org/T114887) (owner: 10Hashar) [16:11:53] thcipriani: Hmm, not showing up in the production listing [16:11:58] (03CR) 10jenkins-bot: [V: 04-1] Allow sysop to grant and revoke transwiki on gu.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258474 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:12:05] thcipriani: https://lv.wikipedia.org/wiki/Special:ListGroupRights?uselang=en [16:12:08] (03CR) 10Hashar: [C: 031] "We had all pep8 1.4.6 errors fixed over the last few days. Thus the .pep8 files are no more needed." [puppet] - 10https://gerrit.wikimedia.org/r/262598 (https://phabricator.wikimedia.org/T114887) (owner: 10Hashar) [16:12:42] guit is never ending [16:14:04] Hmm. [16:14:31] James_F: huh, spot checked on 1017, it's definitely sync'd. [16:14:31] <_joe_> hashar: I'd vote to exclude 80 chars violations from all our pep8 checks [16:14:42] <_joe_> is there a way to do it? [16:14:47] <_joe_> I mean globally [16:14:56] _joe_: yup jayvdb pointed it out earlier. It is set to max 173 chars ~ [16:15:08] <_joe_> hashar: oh cool [16:15:09] but maybe we can just entirely ignore the error [16:15:13] <_joe_> that's a good idea [16:15:13] thcipriani: Let's continue and I'll poke it a bit. [16:15:16] kk [16:15:21] <_joe_> nah 173 chars seem good :P [16:16:45] _joe_: do you get some spare time to get rid of the .pep8 file and switch the repo to use tox + pep8 ? 
[16:17:21] <_joe_> hashar: puppetSWAT is for such changes, IMO, but no, I definitely have 0 spare time now :( [16:17:28] (03CR) 10Thcipriani: Allow sysop to grant and revoke transwiki on gu.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258474 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:17:39] _joe_: ok :-) [16:18:12] 6operations, 10ops-codfw: ms-be2007 - System halted!Error: Integrated RAID - https://phabricator.wikimedia.org/T122844#1927603 (10Papaul) @RobH, @fgiunchedi This server is no longer under warranty. Last date was November 15, 2015 .The server is using Dell PowerEdge PERC H710 512MB Mini Mono RAID Controller 6... [16:18:55] James_F: hmm skipping https://gerrit.wikimedia.org/r/#/c/258474/ for now as well, needs manual rebase. [16:19:10] OK, I'll do that. [16:19:36] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258442 (https://phabricator.wikimedia.org/T119807) (owner: 10Dereckson) [16:20:20] (03Merged) 10jenkins-bot: Namespace configuration on my.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258442 (https://phabricator.wikimedia.org/T119807) (owner: 10Dereckson) [16:22:35] (03PS2) 10Jforrester: Allow sysop to grant and revoke transwiki on gu.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258474 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:22:53] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on my.wikipedia [[gerrit:258442]] (duration: 00m 30s) [16:22:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:23:09] ^ James_F check when you get a chance :) [16:23:22] thcipriani: mywiki working. [16:23:49] kk, thanks [16:26:09] 6operations, 10ops-codfw: ms-be2007 - System halted!Error: Integrated RAID - https://phabricator.wikimedia.org/T122844#1927622 (10RobH) Unfortunately the h310 is not an acceptable replacement for the H710. 
(It is really just a very low end raid controller, we try to not use the H310 anywhere.) I'll create a... [16:26:22] James_F: on https://gerrit.wikimedia.org/r/#/c/258441/ seems like there are now 2 entries for guwiki in a single array. Probably works ok, but likely does not have well defined behavior. If you're fine with it, I can get it out, just wanted to point it out. [16:26:31] 6operations, 6Analytics-Kanban, 7HTTPS: EventLogging sees too few distinct client IPs {oryx} [8 pts] - https://phabricator.wikimedia.org/T119144#1927624 (10Ottomata) a:3Ottomata [16:26:54] thcipriani: Aren't those the add and remove arrays? [16:27:08] double checking... [16:27:36] 6operations, 10ops-codfw: ms-be2007 - System halted!Error: Integrated RAID - https://phabricator.wikimedia.org/T122844#1927637 (10fgiunchedi) @papaul we want an h710 on these machines as h310 is not really suited. Not sure how many (if any) spares there are in eqiad, I'll defer to @robh for spare vs buy [16:27:36] wgAddGroups and wgRemoveGroups [16:27:51] RobH: lol, timing fail [16:28:02] 6operations, 10ops-codfw: ms-be2007 - System halted!Error: Integrated RAID - https://phabricator.wikimedia.org/T122844#1927640 (10RobH) [16:28:04] ? [16:28:15] T122844 [16:28:42] Robh: if we have to buy can we also include one as spare thnaks [16:28:51] oh, yea no spare 710s [16:29:02] papaul: welll, the issue is we dont use them in under warranty systems [16:29:11] James_F: just looks like wgImportSources 2 keys that are guwiki, am I missing some context here? [16:29:14] so keeping spare raid controllers indefinitely usuaully isnt useful, but not my call [16:29:21] i'll note on the task that you requested a spare as well though. [16:29:31] thcipriani: Oh, sorry, I was looking at the wrong guwiki patch. :-) [16:29:32] its mark's call if we keep a spare 710 there [16:29:42] (depends on hte price i assume ;) [16:29:45] thcipriani: Yeah, I'll fix that. [16:29:51] James_F: thanks :) [16:30:14] Looks like a rebase oddity. 
[16:30:21] RobH: all the system that uses them are i think the ms-be* and all those those systems are no longer under warranty [16:30:30] (03PS2) 10Jforrester: Import sources on gu.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258441 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:30:31] yep [16:30:49] I'm not sure what the replacement plan is on them either. If we plan to keep them for another year a spare on the shelf is a good idea [16:30:49] thcipriani: https://gerrit.wikimedia.org/r/#/c/258474/ should be good now. [16:31:02] RobH: ok [16:31:02] if we're replacing them next quarter, it isnt. (I dont htink we are replacing these next quarter) [16:31:06] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258436 (https://phabricator.wikimedia.org/T120936) (owner: 10Dereckson) [16:31:12] but not sure, mark will know when its pushed to him for approval [16:31:18] James_F: kk, thanks. [16:31:41] RobH: thanks [16:31:44] (03Merged) 10jenkins-bot: Namespace configuration on pa.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258436 (https://phabricator.wikimedia.org/T120936) (owner: 10Dereckson) [16:32:00] is the h710 in ms-be2007 a 512 or 1tb memory version? [16:32:05] papaul: ^ it should be printed on it [16:32:21] I'm askign dell to just quote a replacement, but just checkign so i dont have to trust dell [16:32:26] 512 [16:33:41] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on pa.wikipedia [[gerrit:258436]] (duration: 00m 29s) [16:33:45] RobH: MCR5X and not 5CT6D [16:33:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:33:48] ^ James_F check please [16:34:09] (03PS2) 10Giuseppe Lavagetto: etcd::client: sort keys in the config files [puppet] - 10https://gerrit.wikimedia.org/r/263618 [16:34:10] cool, thanks! [16:34:18] thcipriani: pawiki working. 
[16:34:18] the email is sent for quote, its linked off your onsite ticket [16:34:26] (since it has pricing it had to be in the procurement space as a sub-task) [16:34:34] RobH: ok [16:34:42] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258474 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:35:29] (03Merged) 10jenkins-bot: Allow sysop to grant and revoke transwiki on gu.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258474 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:35:45] PROBLEM - DPKG on ms-fe3001 is CRITICAL: Connection refused by host [16:36:25] PROBLEM - Disk space on ms-fe3001 is CRITICAL: Connection refused by host [16:36:38] that's me, icinga races [16:37:15] (03PS1) 10Ottomata: Use X-Client-IP instead of %h for eventlogging varnishkafka instance [puppet] - 10https://gerrit.wikimedia.org/r/263623 (https://phabricator.wikimedia.org/T119144) [16:37:40] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow sysop to grant and revoke transwiki on gu.wikipedia [[gerrit:258474]] (duration: 00m 29s) [16:37:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:37:48] ^ James_F check please [16:37:55] RECOVERY - DPKG on ms-fe3001 is OK: All packages OK [16:38:07] thcipriani: Not really checkable; lack of fatals is good enough. 
[16:38:16] kk [16:38:35] RECOVERY - Disk space on ms-fe3001 is OK: DISK OK [16:39:03] (03CR) 10Ottomata: [C: 032] Use X-Client-IP instead of %h for eventlogging varnishkafka instance [puppet] - 10https://gerrit.wikimedia.org/r/263623 (https://phabricator.wikimedia.org/T119144) (owner: 10Ottomata) [16:39:23] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/254842 (https://phabricator.wikimedia.org/T75414) (owner: 10Cenarium) [16:40:17] (03Merged) 10jenkins-bot: Remove proxyunbannable [mediawiki-config] - 10https://gerrit.wikimedia.org/r/254842 (https://phabricator.wikimedia.org/T75414) (owner: 10Cenarium) [16:41:59] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove proxyunbannable [[gerrit:254842]] (duration: 00m 30s) [16:42:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:42:05] ^ James_F sync'd [16:42:14] That one's pretty untestable too. [16:42:15] And should be a no-op. :-) [16:42:22] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/255519 (https://phabricator.wikimedia.org/T119510) (owner: 10Mdann52) [16:42:32] thcipriani: Well, site's still up, so… [16:42:35] PROBLEM - salt-minion processes on tin is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/salt-minion [16:42:58] <_joe_> (last famous words) [16:43:11] :D [16:43:13] (03Merged) 10jenkins-bot: Add portal namespace to ps.wikipedia.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/255519 (https://phabricator.wikimedia.org/T119510) (owner: 10Mdann52) [16:43:51] _joe_: :-) [16:44:52] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add portal namespace to ps.wikipedia.org [[gerrit:255519]] (duration: 00m 30s) [16:44:54] ^ James_F check please [16:45:41] thcipriani: pswiki tested and working. 
[16:45:59] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/256853 (https://phabricator.wikimedia.org/T50493) (owner: 10Alex Monk) [16:46:33] 6operations, 6Analytics-Kanban, 7HTTPS, 5Patch-For-Review: EventLogging sees too few distinct client IPs {oryx} [8 pts] - https://phabricator.wikimedia.org/T119144#1927669 (10Ironholds_backup) Will this do XFF resolution or just the immediate client IP? (Vast fix either way, mind!) [16:46:57] (03Merged) 10jenkins-bot: Get rid of old unused $wgAllowed* variables [mediawiki-config] - 10https://gerrit.wikimedia.org/r/256853 (https://phabricator.wikimedia.org/T50493) (owner: 10Alex Monk) [16:47:04] RECOVERY - salt-minion processes on tin is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion [16:47:14] <_joe_> !log restarted salt-minion on tin [16:47:15] 6operations, 6Analytics-Kanban, 7HTTPS, 5Patch-For-Review: EventLogging sees too few distinct client IPs {oryx} [8 pts] - https://phabricator.wikimedia.org/T119144#1927671 (10Ottomata) Yes, it does XFF. It is the new canonical way of IDing client IPs, and is done in varnish for all requests. [16:47:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:48:19] (03PS1) 10Mdann52: Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) [16:48:43] !log thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Get rid of old unused $wgAllowed* variables [[gerrit:256853]] (duration: 00m 29s) [16:48:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:49:08] ^ James_F last one! No explosions near as I can tell :P [16:49:22] thcipriani: https://gerrit.wikimedia.org/r/#/c/258441/ ? [16:49:31] (I fixed the rebase.) 
[16:49:52] James_F: gotcha [16:50:16] (03CR) 10Samtar: [C: 031] "Beat me to it, code looks good" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [16:50:22] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258441 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:51:08] (03Merged) 10jenkins-bot: Import sources on gu.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258441 (https://phabricator.wikimedia.org/T120346) (owner: 10Dereckson) [16:51:44] ottomata: stat1002 disk space alert [16:52:14] (03CR) 10Ema: [C: 04-1] etcd::client: sort keys in the config files (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/263618 (owner: 10Giuseppe Lavagetto) [16:53:03] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Import sources on gu.wikipedia [[gerrit:258441]] (duration: 00m 29s) [16:53:06] ^ James_F check please [16:53:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:53:13] (03PS1) 10BBlack: public LVS subnet comment cleanup [dns] - 10https://gerrit.wikimedia.org/r/263626 [16:53:17] thcipriani: Yeah, looks OK I think. [16:53:21] yeah was kinda looking at it, i don't see anything actively growing, looks like it is just kinda full [16:53:23] 500G avail [16:53:24] but still [16:53:30] (03PS2) 10BBlack: eqiad misc-web addr fixes: 5/5 remove old reverse DNS [dns] - 10https://gerrit.wikimedia.org/r/263611 (https://phabricator.wikimedia.org/T83110) [16:53:32] James_F: kk, thanks for all your help! [16:53:39] thcipriani: Thank you! 
[16:54:00] (03CR) 10BBlack: [C: 04-1] "Needs to wait on merge of Iaa67744ab6a80f4ed242cfcda67e98e7caf42365 first" [dns] - 10https://gerrit.wikimedia.org/r/263611 (https://phabricator.wikimedia.org/T83110) (owner: 10BBlack) [16:54:15] PROBLEM - salt-minion processes on mira is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/salt-minion [16:54:44] <_joe_> wtf is happening? [16:54:48] <_joe_> first tin now mira [16:55:00] (03CR) 10Glaisher: [C: 04-1] Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [16:56:07] 6operations, 6Analytics-Kanban, 7HTTPS, 5Patch-For-Review: EventLogging sees too few distinct client IPs {oryx} [8 pts] - https://phabricator.wikimedia.org/T119144#1927694 (10Ironholds_backup) Cooool! Super-excited about this :D [16:56:15] PROBLEM - Kafka Broker Replica Max Lag on kafka1014 is CRITICAL: CRITICAL: 69.57% of data above the critical threshold [5000000.0] [16:57:28] (03CR) 10Mdann52: "Per previous uses, this appears valid." (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [16:57:54] (03CR) 10BBlack: [C: 032] public LVS subnet comment cleanup [dns] - 10https://gerrit.wikimedia.org/r/263626 (owner: 10BBlack) [16:59:42] whoa, just saw a massive spike in dberrors, like 3000 in a matter of seconds, went away though... [17:00:07] all connection errors, I guess. 
[17:00:55] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: Puppet has 1 failures [17:01:57] (03PS3) 10Giuseppe Lavagetto: etcd::client: sort keys in the config files [puppet] - 10https://gerrit.wikimedia.org/r/263618 [17:03:49] (03PS2) 10Mdann52: Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) [17:05:06] (03PS4) 10Giuseppe Lavagetto: etcd::client: sort keys in the config files [puppet] - 10https://gerrit.wikimedia.org/r/263618 [17:09:12] RECOVERY - Kafka Broker Replica Max Lag on kafka1014 is OK: OK: Less than 50.00% above the threshold [1000000.0] [17:09:59] (03PS3) 10Mdann52: Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) [17:13:53] (03CR) 10Glaisher: [C: 04-1] Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 (032 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [17:14:50] (03CR) 10Hashar: "recheck" [puppet] - 10https://gerrit.wikimedia.org/r/244148 (https://phabricator.wikimedia.org/T114887) (owner: 10Hashar) [17:14:59] (03PS1) 10Filippo Giunchedi: swift: adjust dependencies for jessie [puppet] - 10https://gerrit.wikimedia.org/r/263628 (https://phabricator.wikimedia.org/T117972) [17:15:01] (03PS1) 10Filippo Giunchedi: swift: adjust mount options for debian and ubuntu [puppet] - 10https://gerrit.wikimedia.org/r/263629 (https://phabricator.wikimedia.org/T117972) [17:15:03] (03PS1) 10Filippo Giunchedi: swift: add explicit bind_port to servers [puppet] - 10https://gerrit.wikimedia.org/r/263630 (https://phabricator.wikimedia.org/T117972) [17:16:41] (03CR) 10jenkins-bot: [V: 04-1] swift: adjust mount options for debian and ubuntu [puppet] - 10https://gerrit.wikimedia.org/r/263629 
(https://phabricator.wikimedia.org/T117972) (owner: 10Filippo Giunchedi) [17:16:52] (03PS1) 10Chad: Update debian package for gerrit [debs/gerrit] - 10https://gerrit.wikimedia.org/r/263631 [17:18:32] RECOVERY - salt-minion processes on mira is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion [17:20:03] (03PS4) 10Mdann52: Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) [17:22:00] (03CR) 10Luke081515: [C: 04-1] Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [17:22:23] (03PS2) 10Filippo Giunchedi: swift: add explicit bind_port to servers [puppet] - 10https://gerrit.wikimedia.org/r/263630 (https://phabricator.wikimedia.org/T117972) [17:22:25] (03PS2) 10Filippo Giunchedi: swift: adjust mount options for debian and ubuntu [puppet] - 10https://gerrit.wikimedia.org/r/263629 (https://phabricator.wikimedia.org/T117972) [17:22:29] (03CR) 10Chad: "Still needs plugins and still needs to figure out wtf is going on with that jenkins job." [debs/gerrit] - 10https://gerrit.wikimedia.org/r/263631 (owner: 10Chad) [17:23:02] (03CR) 10Samtar: [C: 04-1] "Still requiring single quotes around each Wiki name in the array" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [17:24:05] (03CR) 10Ema: [C: 032] etcd::client: sort keys in the config files [puppet] - 10https://gerrit.wikimedia.org/r/263618 (owner: 10Giuseppe Lavagetto) [17:25:13] 10Ops-Access-Requests, 6operations: onboarding Emanuele Rocca - https://phabricator.wikimedia.org/T123089#1927755 (10Dzahn) 5Open>3Resolved a:3Dzahn compared to onboarding in the past on RT tickets. 
it was the same stuff, bugtracker, code review, root shell, mailman, exim, icinga, labs/ldap, IRC. lo... [17:26:55] (03PS5) 10Ema: etcd::client: sort keys in the config files [puppet] - 10https://gerrit.wikimedia.org/r/263618 (owner: 10Giuseppe Lavagetto) [17:27:32] RECOVERY - puppet last run on mw1090 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:28:22] (03PS5) 10Mdann52: Add temporary lift of IP cap for eswiki/wikivoyage on 2016-01-14/15 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) [17:29:06] (03PS8) 10BBlack: Text VCL: Fix up logged-in users caching [puppet] - 10https://gerrit.wikimedia.org/r/259882 [17:31:41] 6operations, 6Analytics-Kanban, 7HTTPS, 5Patch-For-Review: EventLogging sees too few distinct client IPs {oryx} [8 pts] - https://phabricator.wikimedia.org/T119144#1927780 (10Ottomata) Ok, should be deployed. Can someone verify that new data looks good? [17:31:55] (03PS1) 10Luke081515: Enable flood group at lvwiki: Correct mistake [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263632 (https://phabricator.wikimedia.org/T121238) [17:32:53] (03CR) 10Luke081515: Enable flood group at lvwiki (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/258477 (https://phabricator.wikimedia.org/T121238) (owner: 10Luke081515) [17:35:05] (03CR) 10Alex Monk: [C: 032] Enable flood group at lvwiki: Correct mistake [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263632 (https://phabricator.wikimedia.org/T121238) (owner: 10Luke081515) [17:35:10] (03CR) 10Luke081515: [C: 031] "Looks good for me now" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [17:35:12] 6operations, 10Analytics, 6Discovery, 10EventBus, and 7 others: Define edit related events for change propagation - https://phabricator.wikimedia.org/T116247#1927791 (10Ottomata) I believe we can close this task, ja? 
Got a few defined here: https://github.com/wikimedia/mediawiki-event-schemas/tree/master/... [17:35:28] (03Merged) 10jenkins-bot: Enable flood group at lvwiki: Correct mistake [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263632 (https://phabricator.wikimedia.org/T121238) (owner: 10Luke081515) [17:36:26] (03PS2) 10Giuseppe Lavagetto: conftool: fix scripts syntax [puppet] - 10https://gerrit.wikimedia.org/r/263619 [17:36:50] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] conftool: fix scripts syntax [puppet] - 10https://gerrit.wikimedia.org/r/263619 (owner: 10Giuseppe Lavagetto) [17:37:06] !log krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/263632/ (duration: 00m 31s) [17:37:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:37:48] 6operations, 10Analytics, 6Discovery, 10EventBus, and 7 others: Define edit related events for change propagation - https://phabricator.wikimedia.org/T116247#1927800 (10Eevans) >>! In T116247#1927791, @Ottomata wrote: > I believe we can close this task, ja? Got a few defined here: https://github.com/wikim... 
[17:38:12] (03PS2) 10Giuseppe Lavagetto: role::cache: install conftool scripts [puppet] - 10https://gerrit.wikimedia.org/r/263620 [17:40:15] apergos: https://gerrit.wikimedia.org/r/#/c/253594/ ?:) [17:40:35] it's yours, was just wondering if we should get it done [17:41:09] sure [17:41:16] or I can just "get it done" [17:41:18] mutante: [17:41:21] right now [17:41:40] apergos: :) yes, let's, already +1 a while ago [17:41:52] (03CR) 10Samtar: [C: 031] "All good" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263625 (https://phabricator.wikimedia.org/T123351) (owner: 10Mdann52) [17:41:57] (03PS2) 10ArielGlenn: keep fewer dataset web server logs, add date to filename [puppet] - 10https://gerrit.wikimedia.org/r/253594 (https://phabricator.wikimedia.org/T118739) [17:43:13] (03CR) 10ArielGlenn: [C: 032] keep fewer dataset web server logs, add date to filename [puppet] - 10https://gerrit.wikimedia.org/r/253594 (https://phabricator.wikimedia.org/T118739) (owner: 10ArielGlenn) [17:45:20] sold and done [17:45:22] :) thanks [17:45:26] thanks for the ping [17:45:56] quite welcome [17:45:57] Either of you have a minute to poke https://gerrit.wikimedia.org/r/#/c/259301/ for me? [17:47:28] looking [17:48:06] thx! [17:48:19] (03PS3) 10ArielGlenn: ci: remove elasticsearch from browsertest slaves [puppet] - 10https://gerrit.wikimedia.org/r/259301 (https://phabricator.wikimedia.org/T89083) (owner: 10Chad) [17:48:26] (03CR) 10Dzahn: [C: 031] "per hashar the cherry picker" [puppet] - 10https://gerrit.wikimedia.org/r/259301 (https://phabricator.wikimedia.org/T89083) (owner: 10Chad) [17:49:48] (03CR) 10ArielGlenn: [C: 032] ci: remove elasticsearch from browsertest slaves [puppet] - 10https://gerrit.wikimedia.org/r/259301 (https://phabricator.wikimedia.org/T89083) (owner: 10Chad) [17:50:38] There's a redis piece that's running (according to comments) for Cirrus testing too, but I dunno if that's used by anything else so I left it alone. 
[17:50:44] Main thing was killing ES itself [17:51:18] so tht's the manual kill that's needed I guess [17:51:21] gotcha [17:52:37] 503 spike [17:52:42] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] [17:53:07] apergos: antoine already did that bit via salt, this is for the ci nodes [17:53:18] okey dokey [17:53:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] [17:54:43] PROBLEM - puppet last run on ms-be3002 is CRITICAL: CRITICAL: Puppet has 1 failures [17:54:59] (03PS7) 10Dzahn: Fix api redirect that come in via https to target https [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [17:56:09] 6operations: add user jrabah@ to strategicpartnerships@ - https://phabricator.wikimedia.org/T122989#1927919 (10eliza) Hello Just checking in, was wondering what the status may be on this? Also - the user name is jrabah (one b) - sorry for the confusion. Thank you, Eliza [17:56:40] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1927921 (10demon) >>! In T123334#1927414, @bmansurov wrote: > Done: > > * P2471 > * https://www.mediawiki.org/w/index.php?title=User%3ABmansurov_%28WMF%29&type=revision&diff=2014332&oldid=1488052... [17:57:43] (03CR) 10Thcipriani: [C: 031] "This is the right direction for moving ahead with packaging." 
[puppet] - 10https://gerrit.wikimedia.org/r/259071 (https://phabricator.wikimedia.org/T121435) (owner: 10Chad) [17:58:47] (03CR) 10Dzahn: [C: 032] Fix api redirect that come in via https to target https [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [17:59:04] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:59:52] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:59:53] (03PS2) 10Filippo Giunchedi: scap: Put configuration for scap in /etc/scap3.cfg [puppet] - 10https://gerrit.wikimedia.org/r/259071 (https://phabricator.wikimedia.org/T121435) (owner: 10Chad) [18:00:33] ostriches thcipriani good to merge https://gerrit.wikimedia.org/r/#/c/259071/1 (?) [18:01:38] jouncebot: next [18:01:39] In 0 hour(s) and 58 minute(s): MediaWiki train (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160112T1900) [18:02:35] (03PS9) 10BBlack: Text VCL: Fix up logged-in users caching [puppet] - 10https://gerrit.wikimedia.org/r/259882 [18:02:56] (03CR) 10Dzahn: "https://wikidata.org/api/" [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [18:03:13] (03CR) 10BBlack: [C: 032 V: 032] Text VCL: Fix up logged-in users caching [puppet] - 10https://gerrit.wikimedia.org/r/259882 (owner: 10BBlack) [18:03:15] (03CR) 10Luke081515: [C: 04-1] "You need to add spaces there" (035 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263614 (https://phabricator.wikimedia.org/T121853) (owner: 10Mdann52) [18:03:51] PROBLEM - Host analytics1021 is DOWN: PING CRITICAL - Packet loss = 100% [18:04:40] godog: sure, should be a no op. [18:04:59] <_joe_> thcipriani: can I steal you for 5 mins? [18:05:00] It's not a no-op. 
[18:05:07] We don't currently have a scap.cfg in /etc [18:05:38] <_joe_> thcipriani: so I was looking at https://gerrit.wikimedia.org/r/#/c/262742/ [18:05:38] ostriches: sure, but it's a no op in scap terms, nothing actually changing. [18:05:45] Yerp [18:05:46] After this lands we'll have config in 2 places for a bit and then we'll delete the other piece in scap :) [18:06:01] _joe_: yup. [18:06:07] <_joe_> and noticed akosiaris used native git commands there as opposed as some scap3 command that can run locally [18:06:28] <_joe_> doesn't scap3 have local commands like "sync-common" was for the old scap? [18:07:28] (03PS3) 10Filippo Giunchedi: scap: Put configuration for scap in /etc/scap3.cfg [puppet] - 10https://gerrit.wikimedia.org/r/259071 (https://phabricator.wikimedia.org/T121435) (owner: 10Chad) [18:07:35] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] scap: Put configuration for scap in /etc/scap3.cfg [puppet] - 10https://gerrit.wikimedia.org/r/259071 (https://phabricator.wikimedia.org/T121435) (owner: 10Chad) [18:08:00] _joe_: this is not something that currently exists, but it's what we want to do for the provider, there should be a command that is effectively like the salt-call deploy.[whatever] [repo]. This will be coming shortly, but does not exist yet. [18:08:13] <_joe_> ok thx [18:08:18] <_joe_> can you comment on the PS? [18:08:22] I agree that the provider shouldn't duplicate this function. 
[18:08:26] _joe_: will do [18:08:53] (03PS1) 10Chad: Remove plugin repository [puppet] - 10https://gerrit.wikimedia.org/r/263634 [18:13:05] <_joe_> thcipriani: thanks [18:15:02] !log cutting MW branch 1.27.0-wmf.10 [18:15:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:19:21] RECOVERY - puppet last run on ms-be3002 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [18:21:46] (03CR) 10Thcipriani: "There is not currently a command that can be run on a target that will cause a target to do a deploy locally; however, after the scap meet" [puppet] - 10https://gerrit.wikimedia.org/r/262742 (owner: 10Alexandros Kosiaris) [18:23:42] (03PS1) 10Ori.livneh: Add %D (response time in microseconds) to Apache log formats [puppet] - 10https://gerrit.wikimedia.org/r/263637 [18:24:11] bd808: do you know of anything that may complain or break if an additional field is added to the apache log formats used by mediawiki? [18:24:34] (03PS2) 10Ori.livneh: Add %D (response time in microseconds) to Apache log formats [puppet] - 10https://gerrit.wikimedia.org/r/263637 [18:25:14] 6operations, 10RESTBase, 7RESTBase-architecture: restbase - nodejs package upgrade - puppet fail - https://phabricator.wikimedia.org/T123297#1928039 (10Dzahn) From my perspective this just popped up in Icinga as CRIT puppet and DPKG on multiple servers and when i saw the nodejs package was in status "halF-c... 
[18:26:06] <_joe_> ori: I strongly doubt it since said logs are just local AFAIR [18:27:22] yeah, I think so too [18:27:27] (03PS1) 10BBlack: case-insensitive Vary:Cookie matching, JIC [puppet] - 10https://gerrit.wikimedia.org/r/263638 [18:27:31] <_joe_> lemme check [18:29:04] (03CR) 10BBlack: [C: 032] case-insensitive Vary:Cookie matching, JIC [puppet] - 10https://gerrit.wikimedia.org/r/263638 (owner: 10BBlack) [18:29:15] <_joe_> ori: yeah the two formats seem to be used only locally as far as apache is concerned [18:29:57] <_joe_> I don't remember how the sampled logs on fluorine are generated though [18:30:01] * _joe_ brainfart [18:31:09] 7Puppet, 10Deployment-Systems, 5Patch-For-Review, 3Scap3: Move scap.cfg things out of scap and into puppet - https://phabricator.wikimedia.org/T121435#1928061 (10demon) [18:31:10] <_joe_> syslog or something, I guess [18:32:08] (03CR) 10Giuseppe Lavagetto: [C: 031] "seems ok, I didn't check with analytics though. You should probably ask their manager :D" [puppet] - 10https://gerrit.wikimedia.org/r/263637 (owner: 10Ori.livneh) [18:39:38] (03PS1) 10Andrew Bogott: Add stubby node definitions for labtest* [puppet] - 10https://gerrit.wikimedia.org/r/263639 [18:40:13] 6operations: request VM for url-downloader in codfw - https://phabricator.wikimedia.org/T123386#1928090 (10Dzahn) 3NEW [18:40:47] 6operations, 5Patch-For-Review: url-downloader should be set up more redundantly - https://phabricator.wikimedia.org/T122134#1928104 (10Dzahn) >>! In T122134#1927405, @akosiaris wrote: > I suggest setting up a VM in `codfw` and have the service over there. After that we can do the same in `eqiad` alright, cre... 
[18:41:24] (03PS2) 10Andrew Bogott: Add stubby node definitions for labtest* [puppet] - 10https://gerrit.wikimedia.org/r/263639 [18:43:27] (03CR) 10Andrew Bogott: [C: 032] Add stubby node definitions for labtest* [puppet] - 10https://gerrit.wikimedia.org/r/263639 (owner: 10Andrew Bogott) [18:45:32] RECOVERY - puppet last run on labtestvirt2001 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [18:46:11] RECOVERY - puppet last run on labtestnet2001 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [18:46:50] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1928137 (10Dzahn) [18:46:51] 6operations, 7Mail: Remove exim alias - yuvipanda - https://phabricator.wikimedia.org/T123275#1928135 (10Dzahn) 5Open>3Resolved ok. thanks for confirming. this is done now. ``` -# Yuvaraj Pandian RT-2171 -yuvipanda: ypandian ``` [18:47:41] RECOVERY - puppet last run on labtestmetal2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:50:27] hiyaa, who knows anything about the naggen2 puppet_hosts.cfg generation? [18:50:40] getting a weird IP for analytics1021, which is causing a ping alert to stick [18:51:34] ottomata: it is from puppet exported resources, in mysql [18:51:45] hm, aye [18:52:01] i've cleared the stored resource for that host several times, and it keeps coming back with an incorrect IP [18:52:09] i reinstalled it last week sometime [18:52:11] you can try running the "puppetstoredconfigclean.rb" script [18:52:21] and then let it recreate it on next puppet run [18:52:25] oh, ok [18:52:31] lag [18:52:57] ottomata: what are the wrong vs correct ips? [18:53:16] heh the script should have fixed that yeah [18:53:43] ja, the IP is wrong in the db [18:53:48] its getting [18:53:48] 172.17.42.1 [18:54:17] should be 10.64.5.14 [18:54:39] that's the docker0 interface isn't it? 
[18:54:45] OOoO [18:54:47] yeah probably [18:54:51] YuviPanda: ^^ [18:55:01] yeah [18:55:11] puppet is finding the docker IP instead of the real one [18:55:43] hehe nasty [18:56:15] "Well, there’s your problem" [18:56:19] yup [18:56:20] hehhe [19:00:04] thcipriani: Respected human, time to deploy MediaWiki train (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160112T1900). Please do the needful. [19:00:14] (03PS3) 10Mdann52: Config changes for gu.wikiquote.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263614 (https://phabricator.wikimedia.org/T121853) [19:01:52] (03PS3) 10Ottomata: EventBus: add spec-based monitoring [puppet] - 10https://gerrit.wikimedia.org/r/260799 (owner: 10Mobrovac) [19:04:08] (03CR) 10Ottomata: "Just updated this to use $service_name as the name of the endpoint instead of $title." [puppet] - 10https://gerrit.wikimedia.org/r/260799 (owner: 10Mobrovac) [19:04:46] (03CR) 10Dzahn: "curl -H 'X-Wikimedia-Debug: 1' --dump-header - https://www.wikidata.org/api | grep "has moved"" [puppet] - 10https://gerrit.wikimedia.org/r/255150 (https://phabricator.wikimedia.org/T119532) (owner: 10JanZerebecki) [19:04:48] (03Abandoned) 10Ottomata: Disable AQS cassandra CQL interface check until AQS is production ready [puppet] - 10https://gerrit.wikimedia.org/r/247910 (https://phabricator.wikimedia.org/T78514) (owner: 10Ottomata) [19:05:16] 6operations, 10Deployment-Systems, 3Scap3: Move scap target configuration to etcd - https://phabricator.wikimedia.org/T115899#1928242 (10demon) [19:05:18] 6operations, 10Deployment-Systems, 6Performance-Team, 7HHVM, 3Scap3: Make scap able to depool/repool servers via the conftool API - https://phabricator.wikimedia.org/T104352#1928241 (10demon) [19:05:27] (03CR) 10Ottomata: "Joal, status update? I think we are just using X-Client-IP now, so can we remove XFF?" 
[puppet] - 10https://gerrit.wikimedia.org/r/253474 (https://phabricator.wikimedia.org/T118557) (owner: 10BBlack) [19:19:21] ottomata: godog yeah, that needs facter fix [19:20:46] (03CR) 10Alex Monk: "aren't they both in use? or does 3rd not accept mail now?" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263551 (owner: 10MaxSem) [19:21:18] 6operations, 10Analytics, 6Discovery, 10EventBus, and 6 others: Reliable publish / subscribe event bus - https://phabricator.wikimedia.org/T84923#1928329 (10mobrovac) [19:21:22] 6operations, 10Analytics, 6Discovery, 10EventBus, and 7 others: Define edit related events for change propagation - https://phabricator.wikimedia.org/T116247#1928325 (10mobrovac) 5Open>3Resolved Indeed. We are done here. [19:21:49] (03PS1) 10Dduvall: Add 1.27.0-wmf.10 symlinks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263646 [19:21:50] (03PS1) 10Dduvall: Group0 to 1.27.0-wmf.10 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263647 [19:21:56] yeah it should just ignore docker0 for the purposes of choosing $ipaddress [19:22:20] godog: I think it should find eth0 or something like that, since docker will also create veth interfaces, and with k8s we might get flannel interfaces [19:23:19] yeah eth+ might work just fine, it should just for work for e.g. lvs or labs [19:23:39] ostriches: yo, can you +2 those? 
^ [19:24:08] ostriches: pleeeeease [19:24:33] (03CR) 10Chad: [C: 032] Add 1.27.0-wmf.10 symlinks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263646 (owner: 10Dduvall) [19:24:40] \o/ [19:24:43] ostriches: thanks, homey [19:24:52] YuviPanda: and certainly something is broken if $fqdn and $ipaddress don't match [19:24:53] (03CR) 10Chad: [C: 032] Group0 to 1.27.0-wmf.10 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263647 (owner: 10Dduvall) [19:24:55] (03Merged) 10jenkins-bot: Add 1.27.0-wmf.10 symlinks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263646 (owner: 10Dduvall) [19:25:21] (03Merged) 10jenkins-bot: Group0 to 1.27.0-wmf.10 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263647 (owner: 10Dduvall) [19:25:49] grrrit-wm: yeah [19:25:51] err [19:25:53] godog: yeah [19:25:56] ottomata: can you file a bug? [19:26:26] !log import new r-base package into carbon [19:26:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:30:17] (03PS1) 10Dzahn: varnish/misc-web: add 15.wp.org -> bromine backend [puppet] - 10https://gerrit.wikimedia.org/r/263648 (https://phabricator.wikimedia.org/T599) [19:30:32] !log dduvall@tin Started scap: testwiki to php-1.27.0-wmf.10 and rebuild l10n cache [19:30:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:30:47] (03CR) 10Dzahn: "also see https://gerrit.wikimedia.org/r/#/c/263648/" [dns] - 10https://gerrit.wikimedia.org/r/248504 (https://phabricator.wikimedia.org/T599) (owner: 10Dzahn) [19:31:15] !log dduvall@tin scap failed: CalledProcessError Command 'sudo -u www-data -n -- /bin/mktemp' returned non-zero exit status 1 (duration: 00m 42s) [19:31:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:31:50] 10Ops-Access-Requests, 6operations, 10Analytics: add mforns, milimetric, nuria,ottomata, madhuvishy and joal to piwik-roots - https://phabricator.wikimedia.org/T122325#1928442 (10ggellerman) 
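Note on the analytics1021 exchange above (18:50 to 19:25): the stored address was 172.17.42.1, the docker0 bridge, rather than 10.64.5.14, and the suggested fix was to ignore docker0 and prefer an eth* interface when choosing $ipaddress. The following is only a minimal sketch of that selection rule in Python, not the actual facter fix. The function name, the input shape (a mapping of interface name to IPv4 address) and the list of virtual-interface prefixes are assumptions made for illustration.

```python
import re

# Assumed prefixes for virtual interfaces to skip (docker bridge, veth pairs,
# flannel overlays). This list is illustrative, not taken from facter.
VIRTUAL_PREFIXES = ("docker", "veth", "flannel")


def pick_primary_ip(interfaces):
    """Return the address of the preferred interface, or None if none qualifies.

    `interfaces` maps an interface name to its IPv4 address string.
    """
    # Drop loopback and anything that looks like a container/overlay interface.
    candidates = {
        name: ip
        for name, ip in interfaces.items()
        if name != "lo" and not name.startswith(VIRTUAL_PREFIXES)
    }
    # Prefer ethN interfaces, lowest index first (eth2 sorts before eth10).
    eth = sorted(
        (name for name in candidates if re.fullmatch(r"eth\d+", name)),
        key=lambda name: int(name[3:]),
    )
    if eth:
        return candidates[eth[0]]
    # Otherwise fall back to the first remaining interface by name.
    remaining = sorted(candidates)
    return candidates[remaining[0]] if remaining else None


if __name__ == "__main__":
    # Addresses as reported in the log for analytics1021.
    print(pick_primary_ip({"lo": "127.0.0.1", "docker0": "172.17.42.1", "eth0": "10.64.5.14"}))
```

As noted in the channel, a rule like this may not suit every host (LVS and labs were mentioned as cases to check), so the prefix list would need reviewing per environment.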
[19:32:04] 6operations, 10Analytics, 10Deployment-Systems, 6Services, 3Scap3: Deploy AQS with scap3 - https://phabricator.wikimedia.org/T114999#1928450 (10ggellerman) [19:32:44] marxarelli: yikes. mktemp failed? [19:33:06] bd808: werd. as www-data? [19:33:59] (03PS1) 10Dzahn: varnish/misc-web: enable caching for some static sites [puppet] - 10https://gerrit.wikimedia.org/r/263650 [19:34:07] bd808: noteworthy that there is now an /etc/scap.cfg file that _might_ have something to do with the failure...not sure at the moment. [19:35:42] (03CR) 10Dzahn: [C: 032] varnish/misc-web: add 15.wp.org -> bromine backend [puppet] - 10https://gerrit.wikimedia.org/r/263648 (https://phabricator.wikimedia.org/T599) (owner: 10Dzahn) [19:35:55] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1928541 (10bmansurov) @demon, done. [19:35:56] bd808: i'm not a deployer, apparently [19:36:06] lol [19:36:10] i.e. no sudoers [19:36:19] :( [19:36:53] bummer, I think marxarelli needs to be added to the deployment group :\ [19:37:21] (03CR) 10Ottomata: [C: 031] EventBus: add spec-based monitoring [puppet] - 10https://gerrit.wikimedia.org/r/260799 (owner: 10Mobrovac) [19:37:41] alright, I'm going to commandeer the new branch deployment :P [19:37:49] 6operations, 7Mail: remove dana alias -- hasn't worked here in years - https://phabricator.wikimedia.org/T123401#1928558 (10JKrauska) 3NEW a:3Dzahn [19:38:02] (03CR) 10Mobrovac: [C: 031] EventBus: add spec-based monitoring [puppet] - 10https://gerrit.wikimedia.org/r/260799 (owner: 10Mobrovac) [19:38:41] 6operations, 10Analytics, 7Privacy: Honor DNT header for access logs & varnish logs - https://phabricator.wikimedia.org/T98831#1928593 (10ggellerman) [19:38:48] * marxarelli steps away from the sword in the stone [19:39:39] !log thcipriani@tin Started scap: testwiki to php-1.27.0-wmf.10 and rebuild l10n cache [19:39:43] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:40:31] (03PS4) 10Ottomata: EventBus: add spec-based monitoring [puppet] - 10https://gerrit.wikimedia.org/r/260799 (owner: 10Mobrovac) [19:40:41] (03CR) 10Ottomata: [C: 032 V: 032] EventBus: add spec-based monitoring [puppet] - 10https://gerrit.wikimedia.org/r/260799 (owner: 10Mobrovac) [19:41:08] 6operations, 10Analytics, 10Analytics-Cluster, 10Traffic: Upgrade analytics-eqiad Kafka cluster to Kafka 0.9 - https://phabricator.wikimedia.org/T121562#1928633 (10ggellerman) [19:45:32] PROBLEM - Unmerged changes on repository mediawiki_config on mira is CRITICAL: There are 2 unmerged changes in mediawiki_config (dir /srv/mediawiki-staging/). [19:45:38] (03PS1) 10Dzahn: (WIP) - add Apache site for 15.wp.org [puppet] - 10https://gerrit.wikimedia.org/r/263652 (https://phabricator.wikimedia.org/T599) [19:51:26] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1928923 (10Dzahn) [19:51:27] 6operations, 7Mail: remove dana alias -- hasn't worked here in years - https://phabricator.wikimedia.org/T123401#1928921 (10Dzahn) 5Open>3Resolved done. ``` -## Admin ## - -# Dana Isokawa -dana: disokawa - ``` [19:51:40] 6operations, 7Mail: exim alias remove - feed-sales - https://phabricator.wikimedia.org/T123406#1928925 (10JKrauska) 3NEW a:3Dzahn [19:52:53] (03CR) 10Joal: "Code has been deployed ottomata, but jobs have not been restarted." [puppet] - 10https://gerrit.wikimedia.org/r/253474 (https://phabricator.wikimedia.org/T118557) (owner: 10BBlack) [19:53:52] 6operations, 7Mail: remove dariot alias - https://phabricator.wikimedia.org/T123407#1928938 (10JKrauska) 3NEW a:3Dzahn [19:58:01] RECOVERY - Unmerged changes on repository mediawiki_config on mira is OK: No changes to merge. 
[20:00:16] 6operations, 7Mail: Remove exim alias - wikiguides - https://phabricator.wikimedia.org/T123410#1928975 (10JKrauska) 3NEW a:3Dzahn [20:01:39] 6operations, 7Mail: Remove exim alias - wikiguides - https://phabricator.wikimedia.org/T123410#1928975 (10JKrauska) also please remove wikivoyage-announce: jalexander Same person approved -- was a one time use, not needed anymore. [20:08:41] PROBLEM - eventlogging-service-eventbus endpoints health on kafka1002 is CRITICAL: Generic error: NoneType object has no attribute __getitem__ [20:08:53] PROBLEM - eventlogging-service-eventbus endpoints health on kafka1001 is CRITICAL: Generic error: unicode object does not support item assignment [20:09:30] that's me and marko figuring out this service checker thing [20:10:20] !log restbase switching restbase100[1-4] to node 4.2 [20:10:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:13:32] RECOVERY - DPKG on restbase1002 is OK: All packages OK [20:13:51] !log restbase switch of restbase100[1-4] to node 4.2 completed [20:13:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:14:13] (03PS1) 10ArielGlenn: dumps: add separate directory creation stages for staged dumps [puppet] - 10https://gerrit.wikimedia.org/r/263654 [20:14:22] RECOVERY - DPKG on restbase1003 is OK: All packages OK [20:14:51] !log restbase switching restbase200x to node 4.2 [20:14:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:16:32] RECOVERY - DPKG on restbase2006 is OK: All packages OK [20:17:12] RECOVERY - DPKG on restbase2003 is OK: All packages OK [20:17:31] RECOVERY - DPKG on restbase2001 is OK: All packages OK [20:17:41] 7Blocked-on-Operations, 6operations, 10RESTBase, 6Services: Switch RESTBase to use Node.js 4.2 - https://phabricator.wikimedia.org/T107762#1929059 (10mobrovac) 5Open>3Resolved a:3mobrovac All of the remaining nodes (`restbase100[1-4]` and `restbase200x`) have now 
been switched to Node 4.2 and RESTBas... [20:17:52] RECOVERY - DPKG on restbase2004 is OK: All packages OK [20:18:40] (03CR) 10ArielGlenn: [C: 032] dumps: add separate directory creation stages for staged dumps [puppet] - 10https://gerrit.wikimedia.org/r/263654 (owner: 10ArielGlenn) [20:18:49] 7Blocked-on-Operations, 6operations, 10RESTBase, 6Services: Switch RESTBase to use Node.js 4.2 - https://phabricator.wikimedia.org/T107762#1929073 (10GWicke) @mobrovac, could you update the staging cluster as well? [20:19:21] RECOVERY - puppet last run on restbase1002 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [20:22:34] 7Blocked-on-Operations, 6operations, 10RESTBase, 6Services: Switch RESTBase to use Node.js 4.2 - https://phabricator.wikimedia.org/T107762#1929094 (10mobrovac) >>! In T107762#1929073, @GWicke wrote: > @mobrovac, could you update the staging cluster as well? {{done}} for all 6 nodes in the staging cluster. [20:22:42] RECOVERY - DPKG on restbase-test2003 is OK: All packages OK [20:23:12] RECOVERY - DPKG on restbase-test2001 is OK: All packages OK [20:24:24] 7Blocked-on-Operations, 6operations, 10RESTBase, 6Services: Switch RESTBase to use Node.js 4.2 - https://phabricator.wikimedia.org/T107762#1929109 (10GWicke) @mobrovac: Awesome, mille gazie! [20:27:02] RECOVERY - puppet last run on restbase2006 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [20:28:22] (03PS1) 10Hoo man: Set $wgMathEnableWikibaseDataType to false [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263658 [20:31:42] RECOVERY - puppet last run on restbase2003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [20:32:31] RECOVERY - puppet last run on restbase2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [20:32:49] greg-g: You around? [20:33:01] (03CR) 10JanZerebecki: "The variable is introduced in https://gerrit.wikimedia.org/r/#/c/263657/ ." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/263658 (owner: 10Hoo man) [20:33:11] (03PS1) 10ArielGlenn: dumps: actually produce the new dircreation dump stages files [puppet] - 10https://gerrit.wikimedia.org/r/263659 [20:33:33] (03CR) 10JanZerebecki: [C: 031] Set $wgMathEnableWikibaseDataType to false [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263658 (owner: 10Hoo man) [20:33:52] RECOVERY - puppet last run on restbase-test2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [20:34:21] !log thcipriani@tin Finished scap: testwiki to php-1.27.0-wmf.10 and rebuild l10n cache (duration: 54m 42s) [20:34:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:35:15] (03CR) 10ArielGlenn: [C: 032] dumps: actually produce the new dircreation dump stages files [puppet] - 10https://gerrit.wikimedia.org/r/263659 (owner: 10ArielGlenn) [20:37:12] PROBLEM - Swift HTTP frontend on ms-fe3001 is CRITICAL: Connection refused [20:37:22] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: Puppet has 1 failures [20:38:21] PROBLEM - Swift HTTP backend on ms-fe3001 is CRITICAL: Connection refused [20:40:07] (03CR) 10Krinkle: "Recommended test path: Merge, pull on tin, sync-common on mw1017, use X-Wikimedia-Debug (e.g. using ChromeWikimediaDebug) to test MassMess" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/237686 (owner: 10Legoktm) [20:42:01] ok taking over the train from Dan and Tyler who are going to get lunch.. [20:42:26] greg-g: I want to deploy some things to disable a feature which is otherwise going to be deployed with the train and it will break things(tm) [20:42:56] That is: One small patch to the math extension (adding a feature flag) and one config. patch setting that flag to false for production. 
[20:42:57] 6operations, 6Performance-Team, 10Wikimedia-General-or-Unknown, 5Patch-For-Review: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1929172 (10aaron) Interesting that gwtoolsetUpload* jobs ran at 91-97% less rate (almost not running) in the later period of the above graph that the former... [20:44:24] thcipriani: ^ FYI as well [20:44:26] !log twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.27.0-wmf.10 [20:44:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:44:42] Don't really want testwikidata to go wmf10 before that [20:45:08] hoo: no problem [20:46:10] hoo: I'm done deploying the train for today, if you want to deploy something now. assuming nothing blows up in the next 5 minutes then I'd say it's safe to deploy changes [20:46:59] twentyafterfour: Thanks, will go ahead [20:49:31] (03PS1) 10ArielGlenn: dumps: add nocreate versions of stages files for huge wikis [puppet] - 10https://gerrit.wikimedia.org/r/263661 [20:50:21] (03CR) 10Hoo man: [C: 032] Set $wgMathEnableWikibaseDataType to false [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263658 (owner: 10Hoo man) [20:50:44] (03Merged) 10jenkins-bot: Set $wgMathEnableWikibaseDataType to false [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263658 (owner: 10Hoo man) [20:52:43] sync-masters is very slow... is that known? [20:52:46] !log hoo@tin Synchronized wmf-config/: Set $wgMathEnableWikibaseDataType to false (duration: 01m 29s) [20:52:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:53:00] How very slow? [20:53:13] 20:52:26 Finished sync-masters (duration: 01m 09s) [20:54:17] Seems reasonable [20:54:28] hm... that's way longer than the other steps :/ [20:54:57] Imagine stuff goes awry and you sit there for a minute waiting for an unresponsive scap :S [20:55:29] Would it be "safe" to defer that? 
[20:55:37] (03PS2) 10ArielGlenn: dumps: add nocreate versions of stages files for huge wikis [puppet] - 10https://gerrit.wikimedia.org/r/263661 [20:56:18] hoo: no, because cross-dc rsync replica servers feed from the other masters [20:56:43] OH [20:56:45] I see [20:57:03] (03CR) 10ArielGlenn: [C: 032] dumps: add nocreate versions of stages files for huge wikis [puppet] - 10https://gerrit.wikimedia.org/r/263661 (owner: 10ArielGlenn) [20:57:03] it is slower than a single host sync as it syncs the entire /srv/mediawiki-staging area including .git files on every change [20:57:06] So we will have to live with that from now on? :/ [20:57:30] but much of the perceived slowness is due to the lack of feedback on progress [20:57:46] Well, a minute is quite a lot [20:58:00] when the average sync took maybe 15s-20s [20:58:18] not compared to a 50m full scap, but yes for a file sync it is the dominant time right now [20:58:36] (not that a 50m scap is good either) [20:59:25] the scap3 (aka trebuchet over ssh) plans will probably change things quite a bit [20:59:30] I can remember destroying the interwiki cache (a script went bad)... luckily I could revert it with the old version from my home in less than 45s... that would probably be like one and a half minutes now [21:00:12] scap3 doesn't have co-master support yet, but it will/should [21:01:13] ostriches: *nod* I was mostly thinking about git operations vs rsync. My hypothesis has long been that precomputed deltas will make scap much faster [21:01:20] scap3 is a long way from deploying mediawiki [21:01:48] bd808: Yep, I think we'll have to abandon rsync for that, yes. [21:02:02] installing an ssd on tin & mira would make things much faster too [21:02:31] lol [21:02:48] today we cut the branch from a ramdisk.
that really improved performance a lot [21:02:54] nice [21:03:16] what the hell jenkins [21:03:26] speeding up l10nupdate with ssd/ramdisk has been a long-held hypothesis as well [21:03:51] * twentyafterfour checks how much RAM is installed on tin [21:04:04] 16G [21:04:59] that's enough [21:05:37] well, sorta.. mediawiki-staging is 24g right now but it doesn't need to be that big [21:05:50] if we pruned branches more consistently [21:06:02] hoo: what did jenkins do to you? [21:07:00] jzerebecki: It fails for the Math extensions for reasons unknown to me [21:07:07] but unrelated to what we're doing [21:07:20] !log hoo@tin Synchronized php-1.27.0-wmf.10/extensions/Math/: Introduce a "MathEnableWikibaseDataType" config (duration: 00m 32s) [21:07:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:07:47] Confirmed: New data type is no longer selectable on testwikidata [21:09:31] (03PS1) 10ArielGlenn: dumps: correct number of small/big wiki staged createdirs [puppet] - 10https://gerrit.wikimedia.org/r/263663 [21:10:49] (03CR) 10ArielGlenn: [C: 032] dumps: correct number of small/big wiki staged createdirs [puppet] - 10https://gerrit.wikimedia.org/r/263663 (owner: 10ArielGlenn) [21:15:10] (03CR) 10BBlack: [C: 031] varnish/misc-web: enable caching for some static sites [puppet] - 10https://gerrit.wikimedia.org/r/263650 (owner: 10Dzahn) [21:15:26] (03CR) 10Dzahn: [C: 032] jmxtrans: ignore indentation warnings [puppet/jmxtrans] - 10https://gerrit.wikimedia.org/r/259884 (owner: 10Dzahn) [21:19:19] Going to have some food, but am reachable on my phone (as hoo_) [21:19:59] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1929325 (10Luke081515) p:5Triage>3High [21:22:06] (03PS1) 10Dzahn: jmxtrans: update submodule for lint fix [puppet] - 10https://gerrit.wikimedia.org/r/263670 [21:23:25] (03PS2) 10Dzahn: jmxtrans: update submodule for lint fix [puppet] -
10https://gerrit.wikimedia.org/r/263670 [21:23:33] (03CR) 10Dzahn: [C: 032] jmxtrans: update submodule for lint fix [puppet] - 10https://gerrit.wikimedia.org/r/263670 (owner: 10Dzahn) [21:28:48] (03PS1) 10Greg Grossmeier: Add dduvall to deployment group [puppet] - 10https://gerrit.wikimedia.org/r/263676 [21:29:25] marx [21:31:20] (03CR) 10Greg Grossmeier: [C: 031] "Obviously I approve this." [puppet] - 10https://gerrit.wikimedia.org/r/263676 (owner: 10Greg Grossmeier) [21:32:55] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1929400 (10Dzahn) [21:33:25] greg-g: :) [21:36:16] (03CR) 10Alex Monk: Make MediaWiki treat $lang of be_x_oldwiki as be-tarask, just don't change the real DB name (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/236966 (https://phabricator.wikimedia.org/T111853) (owner: 10Alex Monk) [21:37:38] (03Abandoned) 10Ricordisamoa: maintain-replicas: add cx_translations and cx_translators [software] - 10https://gerrit.wikimedia.org/r/255943 (https://phabricator.wikimedia.org/T119847) (owner: 10Ricordisamoa) [21:37:41] PROBLEM - puppet last run on cp3049 is CRITICAL: CRITICAL: puppet fail [21:41:02] 6operations, 7Mail: remove exim alias - chad - https://phabricator.wikimedia.org/T123423#1929434 (10JKrauska) 3NEW a:3Dzahn [21:41:37] (03PS1) 10Aaron Schulz: Set $wgCentralAuthUseSlaves for mw.org and testwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263682 [21:44:55] (03PS2) 10Aaron Schulz: Set $wgCentralAuthUseSlaves for mw.org and testwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263682 (https://phabricator.wikimedia.org/T119689) [21:45:01] 6operations, 6Performance-Team, 10Wikimedia-General-or-Unknown, 5Patch-For-Review: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1929456 (10ori) >>! In T122069#1929172, @aaron wrote: > Interesting that gwtoolsetUpload* jobs ran at 91-97% less rate (almost not running) in the later peri... 
[21:46:34] (03PS3) 10Ori.livneh: Add %D (response time in microseconds) to Apache log formats [puppet] - 10https://gerrit.wikimedia.org/r/263637 [21:46:51] ostriches: how can bmansurov use phab web interface when he's locked out from doing the 2FA…, or does that only restrict certain features? [21:47:06] (03PS1) 10Ori.livneh: Disable gwtoolsetUpload* jobs on even-numbered jobrunners [puppet] - 10https://gerrit.wikimedia.org/r/263724 (https://phabricator.wikimedia.org/T122069) [21:51:04] (03CR) 10Ori.livneh: [C: 032 V: 032] "I confirmed that this does the right thing using PCC." [puppet] - 10https://gerrit.wikimedia.org/r/263724 (https://phabricator.wikimedia.org/T122069) (owner: 10Ori.livneh) [21:52:09] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1929469 (10Dzahn) [21:52:10] 6operations, 7Mail: remove dariot alias - https://phabricator.wikimedia.org/T123407#1929467 (10Dzahn) 5Open>3Resolved done ``` -dariot: dtaraborelli - ``` [21:54:51] 6operations, 7Mail: Remove exim alias - contact - https://phabricator.wikimedia.org/T123421#1929479 (10Dzahn) 5Open>3Resolved done. ``` # Group aliases -contact: mprimack ``` [21:54:51] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1929481 (10Dzahn) [21:58:11] !log Restarting jobchron / jobrunner / HHVM on all job runners for I44990808 [21:58:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:59:46] 6operations, 7Mail: Remove exim alias - wikiguides - https://phabricator.wikimedia.org/T123410#1929488 (10Dzahn) done ``` ## Community ## -wikiguides: jalexander ... -# RT-4393 - wikivoyage announcements -wikivoyage-announce: jalexander - ``` [22:00:07] 6operations, 7Mail: Move most (all?)
exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1929490 (10Dzahn) [22:00:09] 6operations, 7Mail: Remove exim alias - wikiguides - https://phabricator.wikimedia.org/T123410#1929489 (10Dzahn) 5Open>3Resolved [22:02:21] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1929495 (10Dzahn) [22:02:23] 6operations, 7Mail: remove exim alias - chad - https://phabricator.wikimedia.org/T123423#1929493 (10Dzahn) 5Open>3Resolved done ``` -# Chad / ^demon -chad: chadh - ``` [22:02:52] RECOVERY - puppet last run on cp3049 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [22:07:48] 6operations, 7Mail: Move most (all?) exim personal aliases to OIT - https://phabricator.wikimedia.org/T122144#1929517 (10Dzahn) @JKrauska to clean up the file i have removed these sections that were already commented: ``` -# Trademarks -# trademark has been migrated to google but the alias must remain here.... [22:17:03] 6operations: add user jrabah@ to strategicpartnerships@ - https://phabricator.wikimedia.org/T122989#1929547 (10Dzahn) 5Open>3Resolved a:3Dzahn @Eliza this is done now. ``` -strategicpartnerships: schang, dfoy, avrana, jvargas, sgupta, sventura +strategicpartnerships: schang, dfoy, avrana, jvargas, sgupt... 
[22:17:32] 6operations, 7Mail: add user jrabah@ to strategicpartnerships@ - https://phabricator.wikimedia.org/T122989#1929555 (10Dzahn) [22:18:47] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations, 6Discovery, 10Maps: Please grant admin Cassandra access to maps-admins - https://phabricator.wikimedia.org/T122465#1929564 (10Dzahn) [22:19:04] 7Blocked-on-Operations, 10Ops-Access-Requests, 6operations, 6Discovery, 10Maps: Please grant admin Cassandra access to maps-admins - https://phabricator.wikimedia.org/T122465#1929565 (10Dzahn) p:5Triage>3Normal [22:21:17] (03CR) 10Dzahn: [C: 031] "a +1 from the mobile team would be nice" [dns] - 10https://gerrit.wikimedia.org/r/256597 (https://phabricator.wikimedia.org/T120143) (owner: 10Dzahn) [22:21:36] (03PS1) 10JanZerebecki: reload apache after config change [puppet] - 10https://gerrit.wikimedia.org/r/263745 [22:24:01] (03PS3) 10Dzahn: delete www.m.wikipedia.org [dns] - 10https://gerrit.wikimedia.org/r/256597 (https://phabricator.wikimedia.org/T120143) [22:24:17] (03PS4) 10Dzahn: delete www.m.wikipedia.org [dns] - 10https://gerrit.wikimedia.org/r/256597 (https://phabricator.wikimedia.org/T120143) [22:31:50] mutante: while you're looking at that, could you ask about login.m.wikimedia.org too? it's defined in DNS (etc), but apparently the mobile site/clients actually use the regular login.wikimedia.org. I just haven't had time to dig and ask. 
[22:32:20] bblack: sure, will do that [22:32:38] mutante: https://phabricator.wikimedia.org/T111967 is where the question comes from, to know if we need it in our list of special HSTS hostnames for wikimedia.org [22:32:56] alright, gotcha [22:36:05] (03PS7) 10Madhuvishy: [WIP] wikimetrics: Puppet module for wikimetrics [puppet] - 10https://gerrit.wikimedia.org/r/260687 [22:36:56] 6operations, 7Mobile, 5Patch-For-Review: Investigate if login.m.wikipedia.org needs to stay around - https://phabricator.wikimedia.org/T123431#1929602 (10Dzahn) 3NEW [22:37:04] (03CR) 10jenkins-bot: [V: 04-1] [WIP] wikimetrics: Puppet module for wikimetrics [puppet] - 10https://gerrit.wikimedia.org/r/260687 (owner: 10Madhuvishy) [22:37:42] (03PS1) 10ArielGlenn: dumps: make cutoff option work the way it should for createdir jobs [dumps] (ariel) - 10https://gerrit.wikimedia.org/r/263749 [22:37:49] 6operations, 7Mobile, 5Patch-For-Review: Investigate if login.m.wikipedia.org needs to stay around - https://phabricator.wikimedia.org/T123431#1929602 (10Dzahn) [22:37:59] 6operations, 7Mobile: Investigate if login.m.wikipedia.org needs to stay around - https://phabricator.wikimedia.org/T123431#1929612 (10Krenair) [22:38:36] 6operations, 10DBA: Revision 186704908 on en.wikipedia.org, Fatal exception: unknown "cluster16" - https://phabricator.wikimedia.org/T26675#1929614 (10Krenair) a:5Krenair>3None [22:38:49] 6operations, 10DBA: Revision 186704908 on en.wikipedia.org, Fatal exception: unknown "cluster16" - https://phabricator.wikimedia.org/T26675#272037 (10Krenair) removing this from my assigned list until someone answers my question [22:39:42] 6operations, 7Mail: remove wikibugs-irc mail alias ? - https://phabricator.wikimedia.org/T123432#1929618 (10Dzahn) 3NEW a:3Dzahn [22:40:04] 6operations, 7Mail: remove wikibugs-irc mail alias ? 
- https://phabricator.wikimedia.org/T123432#1929618 (10Dzahn) [22:41:08] (03CR) 10ArielGlenn: [C: 032 V: 032] dumps: make cutoff option work the way it should for createdir jobs [dumps] (ariel) - 10https://gerrit.wikimedia.org/r/263749 (owner: 10ArielGlenn) [22:42:54] 6operations, 6Performance-Team, 10Wikimedia-General-or-Unknown, 5Patch-For-Review: jobrunner memory leaks - https://phabricator.wikimedia.org/T122069#1929628 (10aaron) Possibly related to https://github.com/facebook/hhvm/issues/3899 (which is the same pattern the GWT code uses). [22:44:10] (03PS1) 10Aude: Explicitly define Wikibase data types [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263753 [22:45:17] (03CR) 10Aude: "i think it's ok to have the new data types on beta, but perhaps not yet for test.wikidata." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263753 (owner: 10Aude) [22:51:33] 6operations, 7Mobile: Investigate if login.m.wikipedia.org needs to stay around - https://phabricator.wikimedia.org/T123431#1929651 (10Dzahn) a:3Dzahn [22:51:40] (03CR) 10Hoo man: [C: 04-1] "I don't think the lib dataTypes setting is being used anymore. Could probably also be removed entirely." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/263753 (owner: 10Aude) [23:09:31] (03PS3) 10Dzahn: ganglia: move roles to modules/role/ [puppet] - 10https://gerrit.wikimedia.org/r/260698 [23:10:49] (03PS4) 10Dzahn: ganglia: move roles to modules/role/ [puppet] - 10https://gerrit.wikimedia.org/r/260698 [23:11:06] (03CR) 10Dzahn: [C: 032] ganglia: move roles to modules/role/ [puppet] - 10https://gerrit.wikimedia.org/r/260698 (owner: 10Dzahn) [23:14:14] (03CR) 10Dzahn: "noop on uranium (ganglia-web)" [puppet] - 10https://gerrit.wikimedia.org/r/260698 (owner: 10Dzahn) [23:14:56] 7Puppet, 10Deployment-Systems, 5Patch-For-Review, 3Scap3: Move scap.cfg things out of scap and into puppet - https://phabricator.wikimedia.org/T121435#1929745 (10demon) 5Open>3Resolved [23:15:30] (03PS25) 10Paladox: Rename all main WikimediaIncubator settings to have a wg prefix [mediawiki-config] - 10https://gerrit.wikimedia.org/r/207909 [23:17:17] (03PS26) 10Paladox: Rename all main WikimediaIncubator settings to have a wg prefix [mediawiki-config] - 10https://gerrit.wikimedia.org/r/207909 [23:18:19] (03PS1) 10ArielGlenn: dumps: new actions 'show lastrun' and 'show alldone' for dumps admin [dumps] (ariel) - 10https://gerrit.wikimedia.org/r/263762 [23:18:38] (03Abandoned) 10Paladox: Rename all main WikimediaIncubator settings to have a wg prefix [mediawiki-config] - 10https://gerrit.wikimedia.org/r/207909 (owner: 10Paladox) [23:24:34] (03PS2) 10Dzahn: annualreport: add Apache site for 15.wp.org [puppet] - 10https://gerrit.wikimedia.org/r/263652 (https://phabricator.wikimedia.org/T599) [23:25:46] (03PS8) 10Madhuvishy: [WIP] wikimetrics: Puppet module for wikimetrics [puppet] - 10https://gerrit.wikimedia.org/r/260687 [23:26:42] 6operations, 6Performance-Team, 7Performance: Update HHVM package to recent release - https://phabricator.wikimedia.org/T119637#1929783 (10Reedy) [23:26:59] (03CR) 10jenkins-bot: [V: 04-1] [WIP] wikimetrics: Puppet module for wikimetrics [puppet] - 
10https://gerrit.wikimedia.org/r/260687 (owner: 10Madhuvishy) [23:28:21] (03PS3) 10Dzahn: annualreport: add Apache site for 15.wp.org [puppet] - 10https://gerrit.wikimedia.org/r/263652 (https://phabricator.wikimedia.org/T599) [23:28:34] (03CR) 10Dzahn: [C: 032] annualreport: add Apache site for 15.wp.org [puppet] - 10https://gerrit.wikimedia.org/r/263652 (https://phabricator.wikimedia.org/T599) (owner: 10Dzahn) [23:29:46] 23:25:04 + tox -v -e pep8 [23:29:55] 23:25:04 ERROR: unknown environment 'pep8' [23:30:10] jzerebecki: ^ just talked about it last night, heh [23:30:27] 6operations, 6Phabricator: Bahodir Mansurov locked out of Phabricator - https://phabricator.wikimedia.org/T123334#1929819 (10demon) 5Open>3Resolved a:3demon ``` These auth factors will be stripped: bmansurov totp Mobile Phone App (TOTP) Strip these authentication factors? [y/N] y Stripping au... [23:30:30] operations-puppet-tox-pep8 FAILURE in 4s (non-voting) [23:31:12] mutante: that error is because there is not pep8 defined in tox.ini in puppet.git [23:31:26] 7Puppet: uwsgi puppet module does not seem to trigger restart when config is updated - https://phabricator.wikimedia.org/T123438#1929823 (10madhuvishy) 3NEW [23:32:26] jzerebecki: something is new about it [23:32:36] i am reporting on https://phabricator.wikimedia.org/T117570 [23:36:02] PROBLEM - puppet last run on mw1036 is CRITICAL: CRITICAL: Puppet has 1 failures [23:37:33] mutante: yes the job was added yesterday [23:37:40] as non-voting [23:38:51] gotcha, that is what i noticed [23:39:04] everything was green usually [23:39:31] so it was added but some config is missing [23:41:06] (03PS1) 10ArielGlenn: add OfficeIT namespace to wikitech [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263768 (https://phabricator.wikimedia.org/T123383) [23:42:59] (03PS1) 10JGirault: Bump portals to master (deploy new a/b/c test) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263770 [23:43:09] mutante: that is added in 
https://gerrit.wikimedia.org/r/#/c/244148/ [23:43:20] s/is /will / [23:43:43] 6operations, 6Discovery, 10Wikimedia-Logstash, 7Elasticsearch: Upgrade ElasticSearch to 1.7.4 - https://phabricator.wikimedia.org/T122697#1929863 (10Reedy) [23:43:54] jzerebecki: ah :) and ironically that fails with the voting pep8 check ? [23:45:28] (03PS5) 10Dzahn: add 15.wikipedia.org -> misc-addrs [dns] - 10https://gerrit.wikimedia.org/r/248504 (https://phabricator.wikimedia.org/T599) [23:47:53] 6operations, 7Mobile: Investigate if login.m.wikipedia.org needs to stay around - https://phabricator.wikimedia.org/T123431#1929871 (10Dzahn) @oxygen:/srv/log/webrequest# jq .uri_host /srv/log/webrequest/sampled-1000.json | wc -l **7035053** total @oxygen:/srv/log/webrequest# jq .uri_host /srv/log/webrequest... [23:48:47] (03CR) 10Dzahn: [C: 032] add 15.wikipedia.org -> misc-addrs [dns] - 10https://gerrit.wikimedia.org/r/248504 (https://phabricator.wikimedia.org/T599) (owner: 10Dzahn) [23:54:32] (03CR) 10Dzahn: [C: 031] add OfficeIT namespace to wikitech [mediawiki-config] - 10https://gerrit.wikimedia.org/r/263768 (https://phabricator.wikimedia.org/T123383) (owner: 10ArielGlenn) [23:56:50] (03PS1) 10Andrew Bogott: Labs Jessie image: Explicitly install 3.19.0-1 kernel [puppet] - 10https://gerrit.wikimedia.org/r/263772 [23:56:52] (03PS1) 10Andrew Bogott: Labs debian image: pre-install a bunch of things that are in base [puppet] - 10https://gerrit.wikimedia.org/r/263773 [23:58:32] (03CR) 10Andrew Bogott: [C: 032] Labs Jessie image: Explicitly install 3.19.0-1 kernel [puppet] - 10https://gerrit.wikimedia.org/r/263772 (owner: 10Andrew Bogott) [23:58:58] (03CR) 10Andrew Bogott: [C: 032] Labs debian image: pre-install a bunch of things that are in base [puppet] - 10https://gerrit.wikimedia.org/r/263773 (owner: 10Andrew Bogott)