[00:00:21] (PS5) Alexandros Kosiaris: puppetmaster: Fix trailing whitespace found [puppet] - https://gerrit.wikimedia.org/r/308348 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:01:04] (CR) Alexandros Kosiaris: [C: 2 V: 2] puppetmaster: Fix trailing whitespace found [puppet] - https://gerrit.wikimedia.org/r/308348 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:02:02] (CR) Alexandros Kosiaris: [C: 2] striker: Add custom error pages to apache vhost [puppet] - https://gerrit.wikimedia.org/r/308359 (https://phabricator.wikimedia.org/T144040) (owner: BryanDavis)
[00:02:07] (PS3) Alexandros Kosiaris: striker: Add custom error pages to apache vhost [puppet] - https://gerrit.wikimedia.org/r/308359 (https://phabricator.wikimedia.org/T144040) (owner: BryanDavis)
[00:02:09] (CR) Alexandros Kosiaris: [V: 2] striker: Add custom error pages to apache vhost [puppet] - https://gerrit.wikimedia.org/r/308359 (https://phabricator.wikimedia.org/T144040) (owner: BryanDavis)
[00:04:51] (CR) Paladox: "Thanks." [puppet] - https://gerrit.wikimedia.org/r/308348 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:05:09] (CR) Paladox: "Thanks." [puppet] - https://gerrit.wikimedia.org/r/308348 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:05:57] (PS4) Alexandros Kosiaris: cgred: Fix indentation of => [puppet] - https://gerrit.wikimedia.org/r/308333 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:06:02] (CR) Alexandros Kosiaris: [C: 2 V: 2] cgred: Fix indentation of => [puppet] - https://gerrit.wikimedia.org/r/308333 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:06:14] (CR) Paladox: "Thanks." [puppet] - https://gerrit.wikimedia.org/r/308333 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[00:13:32] (PS4) Paladox: role/cxserver: Fix role::cxserver not in autoload module layout [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645)
[00:14:28] (PS5) Paladox: role/cxserver: Fix role::cxserver not in autoload module layout [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645)
[00:30:04] PROBLEM - Disk space on scb1001 is CRITICAL: DISK CRITICAL - free space: / 349 MB (3% inode=84%)
[02:11:16] Operations, Traffic, HTTPS, Patch-For-Review: Create a secure redirect service for large count of non-canonical / junk domains - https://phabricator.wikimedia.org/T133548#2607883 (AlexMonk-WMF)
[02:13:37] Operations, Traffic, HTTPS, Patch-For-Review: Create a secure redirect service for large count of non-canonical / junk domains - https://phabricator.wikimedia.org/T133548#2235376 (AlexMonk-WMF) >>! In T133548#2242401, @BBlack wrote: > According to [[ https://letsencrypt.org/upcoming-features/ | h...
[02:24:10] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.17) (duration: 10m 50s)
[02:24:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:29:51] !log l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 5 02:29:51 UTC 2016 (duration 5m 42s)
[02:29:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[06:15:05] (PS2) Giuseppe Lavagetto: role::rcstream: move redis settings from legacy config file [puppet] - https://gerrit.wikimedia.org/r/308432 (https://phabricator.wikimedia.org/T134400)
[06:23:04] (CR) Giuseppe Lavagetto: [C: 2] role::rcstream: move redis settings from legacy config file [puppet] - https://gerrit.wikimedia.org/r/308432 (https://phabricator.wikimedia.org/T134400) (owner: Giuseppe Lavagetto)
[06:34:59] (PS2) Giuseppe Lavagetto: redis: manage our redis common config with puppet [puppet] - https://gerrit.wikimedia.org/r/301790 (https://phabricator.wikimedia.org/T134400)
[06:40:50] (CR) Giuseppe Lavagetto: [C: 2] redis: manage our redis common config with puppet [puppet] - https://gerrit.wikimedia.org/r/301790 (https://phabricator.wikimedia.org/T134400) (owner: Giuseppe Lavagetto)
[06:41:30] (PS2) Muehlenhoff: kibana: Restrict to domain networks [puppet] - https://gerrit.wikimedia.org/r/308171
[06:51:22] (CR) Elukey: "PCC: https://puppet-compiler.wmflabs.org/3942/" [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308173 (owner: Elukey)
[06:51:33] (CR) Elukey: [C: 2] Allow a more custom set of arguments for the cleanup script [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308173 (owner: Elukey)
[06:54:13] (PS1) Elukey: Update the zookeeper module's sha to the latest revision. [puppet] - https://gerrit.wikimedia.org/r/308514
[06:56:32] (PS3) Muehlenhoff: kibana: Restrict to domain networks [puppet] - https://gerrit.wikimedia.org/r/308171
[06:57:50] (CR) Muehlenhoff: [C: 2] kibana: Restrict to domain networks [puppet] - https://gerrit.wikimedia.org/r/308171 (owner: Muehlenhoff)
[06:58:04] (CR) Elukey: [C: 2] Update the zookeeper module's sha to the latest revision. [puppet] - https://gerrit.wikimedia.org/r/308514 (owner: Elukey)
[06:58:10] (PS2) Elukey: Update the zookeeper module's sha to the latest revision. [puppet] - https://gerrit.wikimedia.org/r/308514
[07:00:18] Operations, Patch-For-Review: Puppet-manage redis.conf - https://phabricator.wikimedia.org/T134400#2608077 (Joe) Open>Resolved
[07:02:25] I just disabled puppet on conf100[123] as precaution
[07:02:37] even if the change is about a cron
[07:09:57] (PS1) Addshore: Enable the revisionslider on test.wikidata.org [mediawiki-config] - https://gerrit.wikimedia.org/r/308517 (https://phabricator.wikimedia.org/T144616)
[07:11:28] (PS4) Addshore: Enable RevisionSlider BetaFeature on all wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/305653 (https://phabricator.wikimedia.org/T143421)
[07:12:30] (PS1) Elukey: Renamed the cleanup_cron_ensure variable to fix a typo [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308518
[07:12:32] of course I put ensure = true
[07:12:38] sigh
[07:12:50] !log reimaging mw2169-mw2172 to jessie
[07:12:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[07:13:50] (PS2) Elukey: Renamed the cleanup_cron_ensure variable to fix a typo [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308518
[07:17:38] (CR) Elukey: [C: 2] Renamed the cleanup_cron_ensure variable to fix a typo [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308518 (owner: Elukey)
[07:17:48] (CR) Elukey: "https://puppet-compiler.wmflabs.org/3944/conf1001.eqiad.wmnet/" [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308518 (owner: Elukey)
[07:17:51] (CR) Elukey: [C: 2] Renamed the cleanup_cron_ensure variable to fix a typo [puppet/zookeeper] - https://gerrit.wikimedia.org/r/308518 (owner: Elukey)
[07:19:55] Operations, fundraising-tech-ops: barium low on disk space - https://phabricator.wikimedia.org/T144659#2608110 (MoritzMuehlenhoff) a:Jgreen
[07:20:05] (PS1) Elukey: Update zookeeper module's sha to the latest code change [puppet] - https://gerrit.wikimedia.org/r/308519
[07:21:05] (CR) Elukey: [C: 2 V: 2] Update zookeeper module's sha to the latest code change [puppet] - https://gerrit.wikimedia.org/r/308519 (owner: Elukey)
[07:23:48] (PS1) Volans: Automation: automatically reimage host [puppet] - https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536)
[07:24:48] * elukey hugs volans
[07:27:51] all right all good on the Druid's zookeeper, waiting a bit and then re-enable puppet on conf*
[07:32:39] (CR) Volans: "Given the particular nature of the script most things can be tested only in production." [puppet] - https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536) (owner: Volans)
[07:32:53] elukey, moritzm ^^^
[07:33:58] (PS1) Gilles: Configure Thumbor Swift connection timeout [puppet] - https://gerrit.wikimedia.org/r/308522 (https://phabricator.wikimedia.org/T144414)
[07:34:01] "I don't usually do testing, but when I do, I do it in production
[07:34:15] :D
[07:34:24] I am going to review the code in a bit, thanks!
[07:35:03] consider that is somehow a "quick and dirty" way of doing it, not the full proper way I have in mind ;)
[07:35:05] I'll have a look later as well
[07:44:46] <_joe_> volans: the full proper way should include a correct script to restart apache
[07:44:49] <_joe_> :P
[07:44:56] <_joe_> volans: I'll take a look too
[07:45:05] _joe_: :-P
[07:45:33] good morning
[07:50:11] (PS1) Addshore: Remove my own old SSH key [puppet] - https://gerrit.wikimedia.org/r/308523
[08:03:09] (PS1) Hashar: contint: drop browser test from Precise [puppet] - https://gerrit.wikimedia.org/r/308524
[08:06:34] (CR) Hashar: [C: 1] "Cherry picked on CI puppetmaster. It is quite dirty though :(" [puppet] - https://gerrit.wikimedia.org/r/308524 (owner: Hashar)
[08:07:49] (CR) Nemo bis: [C: 1] "Should be ok" [puppet] - https://gerrit.wikimedia.org/r/308435 (https://phabricator.wikimedia.org/T136924) (owner: Merlijn van Deen)
[08:09:00] (PS1) Elukey: Attempt to stop cronspam from graphite-web [puppet] - https://gerrit.wikimedia.org/r/308526 (https://phabricator.wikimedia.org/T132324)
[08:16:50] (PS1) Giuseppe Lavagetto: puppetmaster2001: add conftool::master role [puppet] - https://gerrit.wikimedia.org/r/308527
[08:16:55] <_joe_> akosiaris: ^^
[08:22:07] Operations: Provide wrapper script for account handling - https://phabricator.wikimedia.org/T142825#2608190 (MoritzMuehlenhoff) p:Triage>Normal
[08:24:39] Operations, Beta-Cluster-Infrastructure, Prometheus-metrics-monitoring: deploy prometheus node_exporter and server to deployment-prep - https://phabricator.wikimedia.org/T144502#2608192 (hashar)
[08:30:27] _joe_: ah yes, I did not add any extra roles palladium has to puppetmaster2001 on purpose so we can do it piecemeal later on.thanks!
[08:30:55] (CR) Alexandros Kosiaris: [C: 2] puppetmaster2001: add conftool::master role [puppet] - https://gerrit.wikimedia.org/r/308527 (owner: Giuseppe Lavagetto)
[08:32:27] <_joe_> akosiaris: yeah I'm looking into everything that's needed
[08:32:27] (CR) Hashar: [C: 1] manifests/role: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308315 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:32:38] (CR) Hashar: [C: 1] deployment: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308336 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:32:48] (CR) Hashar: [C: 1] docker: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308337 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:33:16] (CR) Hashar: [C: 1] ganglia: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308339 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:33:29] (CR) Hashar: [C: 1] labs_dns: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308340 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:33:38] (CR) Hashar: [C: 1] lvs: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308342 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:39:04] Operations, Ops-Access-Requests, Research-and-Data, Research-collaborations, Research-management: Request access to data for WDQS research - https://phabricator.wikimedia.org/T142780#2608275 (AlexKrauseTUD) I know how this sounds, but I must kindly ask you to update my public key one last tim...
[08:42:15] (CR) Alexandros Kosiaris: [C: -1] role/cxserver: Fix role::cxserver not in autoload module layout (1 comment) [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:43:00] (CR) Filippo Giunchedi: [C: 1] Attempt to stop cronspam from graphite-web [puppet] - https://gerrit.wikimedia.org/r/308526 (https://phabricator.wikimedia.org/T132324) (owner: Elukey)
[08:43:58] (PS1) Gehel: elasticsearch cirrus - align configuration for deployment-prep [puppet] - https://gerrit.wikimedia.org/r/308529
[08:46:28] (PS1) Alexandros Kosiaris: role: 2fa => twofa [puppet] - https://gerrit.wikimedia.org/r/308530
[08:47:27] (CR) Gehel: [C: 2] elasticsearch cirrus - align configuration for deployment-prep [puppet] - https://gerrit.wikimedia.org/r/308529 (owner: Gehel)
[08:49:33] (CR) Muehlenhoff: [C: 1] role: 2fa => twofa [puppet] - https://gerrit.wikimedia.org/r/308530 (owner: Alexandros Kosiaris)
[08:50:31] (CR) Alexandros Kosiaris: [C: 2] role: 2fa => twofa [puppet] - https://gerrit.wikimedia.org/r/308530 (owner: Alexandros Kosiaris)
[08:50:36] (PS2) Alexandros Kosiaris: role: 2fa => twofa [puppet] - https://gerrit.wikimedia.org/r/308530
[08:50:38] (CR) Giuseppe Lavagetto: [C: -2] "Please do not change role names; this and all the remaining roles not in autoload layout should be moved all at once, because of" [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[08:50:40] (CR) Alexandros Kosiaris: [V: 2] role: 2fa => twofa [puppet] - https://gerrit.wikimedia.org/r/308530 (owner: Alexandros Kosiaris)
[08:51:27] <_joe_> akosiaris: did ou try to quote the argument instead?
[08:51:42] I 've tried to escape it
[08:51:52] and it looked ugly... quoting looked uglier
[08:52:07] <_joe_> ok ok
[08:57:51] (PS2) Elukey: Attempt to stop cronspam from graphite-web [puppet] - https://gerrit.wikimedia.org/r/308526 (https://phabricator.wikimedia.org/T132324)
[09:01:06] !log reimaging mw2174-mw2177 to jessie
[09:01:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:01:18] Operations, ops-eqiad, media-storage: diagnose failed(?) sda on ms-be1022 - https://phabricator.wikimedia.org/T140597#2608334 (fgiunchedi) a:Cmjohnson>fgiunchedi thanks @Cmjohnson ! I've reimaged the machine and it seems fine so far, I'll run some tests and put it in service if no further err...
[09:01:35] (PS1) Alexandros Kosiaris: pybal: Fix require_package Puppet 4.x syntax [puppet] - https://gerrit.wikimedia.org/r/308531
[09:02:38] (CR) Elukey: [C: 2] Attempt to stop cronspam from graphite-web [puppet] - https://gerrit.wikimedia.org/r/308526 (https://phabricator.wikimedia.org/T132324) (owner: Elukey)
[09:03:47] akosiaris: should I merge your changes?
[09:04:09] otherwise feel free to merge mines later on, I am not in a hurry
[09:04:17] elukey: feel free
[09:04:31] all right merging
[09:10:06] (PS2) Filippo Giunchedi: Configure Thumbor Swift connection timeout [puppet] - https://gerrit.wikimedia.org/r/308522 (https://phabricator.wikimedia.org/T144414) (owner: Gilles)
[09:12:04] (CR) Alexandros Kosiaris: [C: -1] "nice!Thanks for that!. first round of comments, I 'll looking into the python script itself later in the day" (3 comments) [puppet] - https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536) (owner: Volans)
[09:12:27] Operations, ops-codfw, Puppet-infrastructure-modernization: rack/setup/deploy puppetmaster200[12] - https://phabricator.wikimedia.org/T143255#2608339 (akosiaris) Open>Resolved
[09:12:28] (CR) Filippo Giunchedi: [C: 2] Configure Thumbor Swift connection timeout [puppet] - https://gerrit.wikimedia.org/r/308522 (https://phabricator.wikimedia.org/T144414) (owner: Gilles)
[09:13:37] (CR) Alexandros Kosiaris: "heh, I am actually the one advising the role name change in the interest of moving this forward. It's a very ugly thing indeed, but I am n" [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[09:14:15] !log reimaging mw218[4567] to Debian Jessie
[09:14:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:19:26] (PS1) Giuseppe Lavagetto: puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - https://gerrit.wikimedia.org/r/308533
[09:19:58] <_joe_> akosiaris: ^^ still completely untested and un-proofread, though
[09:20:33] (CR) jenkins-bot: [V: -1] puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - https://gerrit.wikimedia.org/r/308533 (owner: Giuseppe Lavagetto)
[09:24:28] !log reimaging mw21(8[89]|9[01]) to Debian Jessie
[09:24:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:25:06] (CR) Alexandros Kosiaris: "premise looks fine, minor comment. This would need a ops meeting approval but given one is not gonna happen today we can expedite with mar" (1 comment) [puppet] - https://gerrit.wikimedia.org/r/302471 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[09:27:16] Operations, Performance-Team, Thumbor, Patch-For-Review: add thumbor to production infrastructure - https://phabricator.wikimedia.org/T139606#2608345 (fgiunchedi)
[09:31:11] (PS1) Giuseppe Lavagetto: role::xenon: fixup fluorine for redis config [puppet] - https://gerrit.wikimedia.org/r/308534
[09:32:22] (CR) jenkins-bot: [V: -1] role::xenon: fixup fluorine for redis config [puppet] - https://gerrit.wikimedia.org/r/308534 (owner: Giuseppe Lavagetto)
[09:34:02] (PS2) Giuseppe Lavagetto: role::xenon: fixup fluorine for redis config [puppet] - https://gerrit.wikimedia.org/r/308534
[09:41:26] (PS1) Elukey: Remove the unused group analytics-root from puppet [puppet] - https://gerrit.wikimedia.org/r/308535
[09:46:25] (CR) Volans: "Thanks Alex for the review" (3 comments) [puppet] - https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536) (owner: Volans)
[09:46:38] (PS2) Volans: Automation: automatically reimage host [puppet] - https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536)
[09:49:20] (PS6) Paladox: role/cxserver: Fix role::cxserver not in autoload module layout [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645)
[09:49:25] (PS7) Paladox: role/cxserver: Fix role::cxserver not in autoload module layout [puppet] - https://gerrit.wikimedia.org/r/308498 (https://phabricator.wikimedia.org/T93645)
[09:52:34] (PS1) Alexandros Kosiaris: mysql_multi_instance: Remove duplicate keys [puppet] - https://gerrit.wikimedia.org/r/308538
[09:52:36] (PS1) Alexandros Kosiaris: Use parentheses in all custom functions [puppet] - https://gerrit.wikimedia.org/r/308539
[09:52:38] (PS1) Alexandros Kosiaris: puppet::self: Merge the standalone statements in the hash [puppet] - https://gerrit.wikimedia.org/r/308540
[09:54:05] Operations, Ops-Access-Requests: Sudo access to the Analytics Druid cluster for the Analytics team - https://phabricator.wikimedia.org/T144726#2608377 (elukey)
[09:54:16] Operations, Ops-Access-Requests: Sudo access to the Analytics Druid cluster for the Analytics team - https://phabricator.wikimedia.org/T144726#2608391 (elukey) p:Triage>Normal
[09:57:47] (CR) Giuseppe Lavagetto: [C: 2] role::xenon: fixup fluorine for redis config [puppet] - https://gerrit.wikimedia.org/r/308534 (owner: Giuseppe Lavagetto)
[09:59:09] (PS1) Filippo Giunchedi: thumbor: add firejail profile [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606)
[10:00:42] (PS1) Elukey: Add the druid-admins group for the Analytics team [puppet] - https://gerrit.wikimedia.org/r/308544 (https://phabricator.wikimedia.org/T144726)
[10:01:53] Operations, Ops-Access-Requests, Patch-For-Review: Sudo access to the Analytics Druid cluster for the Analytics team - https://phabricator.wikimedia.org/T144726#2608412 (elukey) There shouldn't be any need of manager approvals since the Analytics team is requesting access for itself.
[10:02:09] (PS1) Giuseppe Lavagetto: role::xenon: don't bind to localhost for redis [puppet] - https://gerrit.wikimedia.org/r/308545
[10:02:44] (CR) Giuseppe Lavagetto: [C: 2 V: 2] role::xenon: don't bind to localhost for redis [puppet] - https://gerrit.wikimedia.org/r/308545 (owner: Giuseppe Lavagetto)
[10:05:40] (PS2) Filippo Giunchedi: introduce thumbor-admins group [puppet] - https://gerrit.wikimedia.org/r/302471 (https://phabricator.wikimedia.org/T139606)
[10:06:56] (CR) jenkins-bot: [V: -1] introduce thumbor-admins group [puppet] - https://gerrit.wikimedia.org/r/302471 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[10:10:05] (PS2) Gehel: elasticsearch: increase keepalive timeout to allow for better connection reuse. [puppet] - https://gerrit.wikimedia.org/r/308234
[10:11:18] (Abandoned) Elukey: Add the druid-admins group for the Analytics team [puppet] - https://gerrit.wikimedia.org/r/308544 (https://phabricator.wikimedia.org/T144726) (owner: Elukey)
[10:11:34] (CR) Gehel: [C: 2] elasticsearch: increase keepalive timeout to allow for better connection reuse. [puppet] - https://gerrit.wikimedia.org/r/308234 (owner: Gehel)
[10:12:27] (PS3) Filippo Giunchedi: introduce thumbor-admins group [puppet] - https://gerrit.wikimedia.org/r/302471 (https://phabricator.wikimedia.org/T139606)
[10:12:31] (PS1) Alexandros Kosiaris: 2fa: Fix file name [puppet] - https://gerrit.wikimedia.org/r/308549
[10:12:49] (PS1) Marostegui: Add db1019 to temporary replace db1064 which is going to be reimaged and upgraded [mediawiki-config] - https://gerrit.wikimedia.org/r/308550
[10:12:52] (CR) Alexandros Kosiaris: [C: 2 V: 2] 2fa: Fix file name [puppet] - https://gerrit.wikimedia.org/r/308549 (owner: Alexandros Kosiaris)
[10:12:56] (PS2) Alexandros Kosiaris: 2fa: Fix file name [puppet] - https://gerrit.wikimedia.org/r/308549
[10:12:59] (CR) Alexandros Kosiaris: [V: 2] 2fa: Fix file name [puppet] - https://gerrit.wikimedia.org/r/308549 (owner: Alexandros Kosiaris)
[10:13:35] (CR) Hashar: [C: 1] openstack: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308345 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:14:14] (CR) Muehlenhoff: "Two comments inline, looks good to me." (2 comments) [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[10:14:17] (CR) Hashar: [C: 1] nagios_common: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308344 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:14:30] (CR) Hashar: [C: 1] statsd_proxy: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308356 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:15:00] (CR) Hashar: [C: 1] pybal: Fix optional parameter listed before required parameter [puppet] - https://gerrit.wikimedia.org/r/308349 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:16:48] Operations, Datasets-General-or-Unknown, hardware-requests: reallocate snapshto1001 for use as canary/testbed for dumps - https://phabricator.wikimedia.org/T144728#2608426 (ArielGlenn)
[10:18:10] !log reimaging mw2192-mw2195 to jessie
[10:18:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[10:21:13] (Restored) Elukey: Add the druid-admins group for the Analytics team [puppet] - https://gerrit.wikimedia.org/r/308544 (https://phabricator.wikimedia.org/T144726) (owner: Elukey)
[10:23:16] moritzm: is there already a generic firejail profile "with network" I could test with thumbor ?
[10:24:23] (PS2) Elukey: Add the druid-admins group for the Analytics team [puppet] - https://gerrit.wikimedia.org/r/308544 (https://phabricator.wikimedia.org/T144726)
[10:26:10] (PS3) Elukey: Add the druid-admins group for the Analytics team [puppet] - https://gerrit.wikimedia.org/r/308544 (https://phabricator.wikimedia.org/T144726)
[10:26:14] godog: more or less what service::unit uses (e.g. on scb), but I haven't converted these to use a specific profile file, ATM they pass it on the command line. initially it would look like mediawiki-converters.profile without "net none"
[10:26:47] (CR) Jcrespo: [C: -2] "The right fix is: https://gerrit.wikimedia.org/r/301076" [puppet] - https://gerrit.wikimedia.org/r/308441 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:27:07] just use the thumbor profile as currently used in your patch, I can convert that if the python-wand bug is fixed at some point
[10:27:53] moritzm: ok thanks!
[10:28:44] (PS2) Filippo Giunchedi: thumbor: add firejail profile [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606)
[10:29:49] (CR) Jcrespo: "Mutante, we wasted precious and really appreciated volunteer's time." [puppet] - https://gerrit.wikimedia.org/r/308441 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:30:38] (CR) Filippo Giunchedi: thumbor: add firejail profile (2 comments) [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[10:32:05] (PS1) ArielGlenn: allow test module commands such as test.ping to be run from palladium [puppet] - https://gerrit.wikimedia.org/r/308551
[10:33:18] (CR) Jcrespo: "See my comment below." (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/308550 (owner: Marostegui)
[10:34:02] (CR) Marostegui: [C: 1] "Totally in favor of pigz, when available." [puppet] - https://gerrit.wikimedia.org/r/293743 (owner: Jcrespo)
[10:35:20] (Abandoned) Hashar: Flake8 for ganglia [puppet] - https://gerrit.wikimedia.org/r/277498 (owner: Ladsgroup)
[10:36:16] (Abandoned) Paladox: role: Fix not in autoload module layout [puppet] - https://gerrit.wikimedia.org/r/308441 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[10:37:16] !log depooling/rebooting/repooling sca1001 for upgrade to Linux 4.4 (T144492)
[10:37:18] T144492: Blocked /etc/passwd on sca100[1234] hosts - https://phabricator.wikimedia.org/T144492
[10:37:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[10:38:53] (PS1) Gilles: Upgrade to version 0.1.13 [debs/python-thumbor-wikimedia] - https://gerrit.wikimedia.org/r/308552
[10:42:40] (PS2) Marostegui: Add db1019 to temporary replace db1064 which is going to be reimaged and upgraded [mediawiki-config] - https://gerrit.wikimedia.org/r/308550
[10:45:46] (CR) Filippo Giunchedi: thumbor: add firejail profile (1 comment) [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[10:52:37] (CR) Filippo Giunchedi: [C: 1] Upgrade to version 0.1.13 [debs/python-thumbor-wikimedia] - https://gerrit.wikimedia.org/r/308552 (owner: Gilles)
[10:54:13] (PS3) Mobrovac: Change-Prop: Enable file transclusion updates [puppet] - https://gerrit.wikimedia.org/r/306308 (owner: Ppchelko)
[10:55:18] (CR) Hoo man: [C: -1] "Wikibase on purpose counts articles differently than normal MediaWiki. Changing this should only be done after a thorough discussion, I th" [mediawiki-config] - https://gerrit.wikimedia.org/r/308430 (https://phabricator.wikimedia.org/T144687) (owner: Urbanecm)
[10:55:22] (CR) Muehlenhoff: [C: 1] thumbor: add firejail profile [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[10:57:43] (PS1) Volans: Salt: reducing permissions on the master's Job cache [puppet] - https://gerrit.wikimedia.org/r/308554 (https://phabricator.wikimedia.org/T143536)
[10:59:35] (CR) Volans: "Ariel, Moritz, do you think that this will do the job?" [puppet] - https://gerrit.wikimedia.org/r/308554 (https://phabricator.wikimedia.org/T143536) (owner: Volans)
[11:00:37] (PS8) Mobrovac: service::node: Compile the file holding puppet-controlled vars [puppet] - https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542)
[11:01:39] (CR) Muehlenhoff: "How about also allowing journalctl? This allows him to start/stop/restart, but it will be difficult to track down startup problems in prod" [puppet] - https://gerrit.wikimedia.org/r/302471 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[11:04:10] (PS3) Filippo Giunchedi: thumbor: add firejail profile [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606)
[11:04:48] (CR) Giuseppe Lavagetto: "@moritzm I would log from thumbor to a file via syslog filters in order to do that, see what we did for service::node" [puppet] - https://gerrit.wikimedia.org/r/302471 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[11:05:19] (CR) Mobrovac: "Re separation, this was necessary to be able to properly and thoroughly test the process. Given the situation, IMHO it'd be better to keep" (3 comments) [puppet] - https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: Mobrovac)
[11:06:53] (CR) Filippo Giunchedi: [C: 2] thumbor: add firejail profile [puppet] - https://gerrit.wikimedia.org/r/308542 (https://phabricator.wikimedia.org/T139606) (owner: Filippo Giunchedi)
[11:10:57] Operations, Performance-Team, Thumbor, Patch-For-Review: add thumbor to production infrastructure - https://phabricator.wikimedia.org/T139606#2608522 (fgiunchedi)
[11:12:02] (CR) Mobrovac: [C: 1] Use parentheses in all custom functions [puppet] - https://gerrit.wikimedia.org/r/308539 (owner: Alexandros Kosiaris)
[11:16:58] (CR) Muehlenhoff: "One bug and one enhancement proposal, only skimmed it so far, but will have a more indepth look later the day." (2 comments) [puppet] - https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536) (owner: Volans)
[11:18:34] (PS3) Marostegui: Add db1019 to temporary replace db1064 which is going to be reimaged and upgraded Bug: T144723 Change-Id: I5c94357f512f8563f121331616cdfd179d1f4eb9 [mediawiki-config] - https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723)
[11:20:21] (CR) Mobrovac: [C: -1] "https://gerrit.wikimedia.org/r/#/c/308021/ eliminates mathoid's variables altogether, so this should not be an issue in a day or two." [puppet] - https://gerrit.wikimedia.org/r/308343 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[11:20:45] (Abandoned) Paladox: mathoid: Fix variable contains an uppercase letter [puppet] - https://gerrit.wikimedia.org/r/308343 (https://phabricator.wikimedia.org/T93645) (owner: Paladox)
[11:21:38] (CR) Urbanecm: "@Hoo man Do you mean discussion inside Wikidata or some other kind of discussion?" [mediawiki-config] - https://gerrit.wikimedia.org/r/308430 (https://phabricator.wikimedia.org/T144687) (owner: Urbanecm)
[11:21:49] thanks moritzm
[11:22:09] (CR) Hoo man: "On Wikidata initially, probably… and then we can take it back to Phabricator to see how to proceed (technically)." [mediawiki-config] - https://gerrit.wikimedia.org/r/308430 (https://phabricator.wikimedia.org/T144687) (owner: Urbanecm)
[11:23:18] (CR) Urbanecm: "This https://www.wikidata.org/w/index.php?title=Wikidata_talk:Main_Page&oldid=373228470#Item_count won't do? I think as this is technical " [mediawiki-config] - https://gerrit.wikimedia.org/r/308430 (https://phabricator.wikimedia.org/T144687) (owner: Urbanecm)
[11:23:22] (CR) ArielGlenn: [C: 1] "Everything the salt master does is run as root so this seems like a no-brainer. I have a changeset in to the puppet compiler for updating" [puppet] - https://gerrit.wikimedia.org/r/308554 (https://phabricator.wikimedia.org/T143536) (owner: Volans)
[11:24:46] (CR) Hoo man: "I'm fairly sure that there are more people with an opinion on this, this has been discussed before (but I don't have links offhand)." [mediawiki-config] - https://gerrit.wikimedia.org/r/308430 (https://phabricator.wikimedia.org/T144687) (owner: Urbanecm)
[11:24:48] (PS2) Giuseppe Lavagetto: puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - https://gerrit.wikimedia.org/r/308533
[11:26:21] (CR) jenkins-bot: [V: -1] puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - https://gerrit.wikimedia.org/r/308533 (owner: Giuseppe Lavagetto)
[11:30:36] (PS3) Giuseppe Lavagetto: puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - https://gerrit.wikimedia.org/r/308533
[11:35:58] (CR) Muehlenhoff: [C: 1] "Agreed, this group is empty for about 15 months now and seems unlikely to be needed again." [puppet] - https://gerrit.wikimedia.org/r/308535 (owner: Elukey)
[11:36:31] (CR) Urbanecm: "Okay, so descheduled. I'll wait some time." [mediawiki-config] - https://gerrit.wikimedia.org/r/308430 (https://phabricator.wikimedia.org/T144687) (owner: Urbanecm)
[11:40:23] !log Reimaging mw217[89] and mw219[6789] to Debian jessie
[11:40:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[11:42:01] (PS4) Marostegui: Add db1019 to temporary replace db1064 [mediawiki-config] - https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723)
[11:44:05] (Abandoned) Volans: Reimaging: Fix infinite loops when -n is set [puppet] - https://gerrit.wikimedia.org/r/307490 (https://phabricator.wikimedia.org/T144264) (owner: Volans)
[11:46:48] (PS5) Marostegui: mariadb: Depool db1064 for maintenance; pool db1019 instead [mediawiki-config] - https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723)
[11:47:18] (CR) Muehlenhoff: [C: -1] "Looks good from a technical PoV, but this grants new sudo permissions and needs to be acked in the ops meeting first." [puppet] - https://gerrit.wikimedia.org/r/308544 (https://phabricator.wikimedia.org/T144726) (owner: Elukey)
[11:49:23] moritzm: about --^ agreed, I just prepared the change to be ready :)
[11:53:09] ah, ok :-)
[11:55:59] hashar: Dereckson aude I can do the EU swat today (as I have a patch in it!) :)
[11:56:05] just FYI!
[11:59:25] (03CR) 10Gilles: "Repackaging this and deploying it should fix T144481 T144415 and T144414" [debs/python-thumbor-wikimedia] - 10https://gerrit.wikimedia.org/r/308552 (owner: 10Gilles) [12:11:37] (03PS6) 10Marostegui: mariadb: Depool db1064 for maintenance; pool db1019 instead [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723) [12:16:28] (03CR) 10Filippo Giunchedi: [C: 032] Upgrade to version 0.1.13 [debs/python-thumbor-wikimedia] - 10https://gerrit.wikimedia.org/r/308552 (owner: 10Gilles) [12:17:09] addshore: sure ! := [12:17:18] I will be around if needed [12:17:28] zeljkof: ^^ [12:18:10] 06Operations, 07HHVM: Migrate deployment servers (tin/mira) to jessie - https://phabricator.wikimedia.org/T144578#2608625 (10MoritzMuehlenhoff) p:05Triage>03Normal [12:18:21] zeljkof: I will be around too :) [12:21:23] 06Operations: Update ICU version to 55.1 - https://phabricator.wikimedia.org/T143931#2583636 (10MoritzMuehlenhoff) We can't easily upgrade icu to 55.1. Can we pinpoint the fix to a specific upstream commit? 
[12:22:03] 06Operations, 10ops-eqiad: graphite1002.eqiad.wmnet: slot=10 disk failed - https://phabricator.wikimedia.org/T141795#2608633 (10MoritzMuehlenhoff) a:03Cmjohnson [12:22:24] 06Operations, 10Mail: mx1001/2001 - Exim SMTP - Certificate expires Sep 22 2016 - https://phabricator.wikimedia.org/T144568#2608634 (10MoritzMuehlenhoff) p:05Triage>03High [12:23:11] 06Operations: Blocked /etc/passwd on sca100[1234] hosts - https://phabricator.wikimedia.org/T144492#2608647 (10MoritzMuehlenhoff) a:03MoritzMuehlenhoff [12:24:15] 06Operations, 13Patch-For-Review: Handling of customised systemd units via puppet in base::service_unit - https://phabricator.wikimedia.org/T143210#2608648 (10MoritzMuehlenhoff) a:03MoritzMuehlenhoff [12:24:42] 06Operations: Require/track email addresses - https://phabricator.wikimedia.org/T142826#2608649 (10MoritzMuehlenhoff) p:05Triage>03Normal [12:24:49] 06Operations: Optional expiry date for user accounts - https://phabricator.wikimedia.org/T142816#2608650 (10MoritzMuehlenhoff) p:05Triage>03Normal [12:24:59] 06Operations: Cross-validation of account data - https://phabricator.wikimedia.org/T142836#2608651 (10MoritzMuehlenhoff) p:05Triage>03Normal [12:25:07] 06Operations: Require/track Phabricator username - https://phabricator.wikimedia.org/T142830#2608652 (10MoritzMuehlenhoff) p:05Triage>03Normal [12:25:18] 06Operations: Enforce reference to Phabricator task for all commits to modules/admin/data/data.yaml - https://phabricator.wikimedia.org/T142827#2608653 (10MoritzMuehlenhoff) p:05Triage>03Normal [12:26:15] 06Operations, 10ops-eqiad: ms-be1004.eqiad.wmnet: slot=3 dev=sdd failed - https://phabricator.wikimedia.org/T144499#2608654 (10MoritzMuehlenhoff) a:03Cmjohnson [12:29:44] hashar: zeljkof cool! 
:) [12:33:33] 06Operations, 10Traffic, 13Patch-For-Review: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502#2608658 (10ema) We suspect that the bug(s) encountered while upgrading ulsfo might have been caused by running a mix of Varnish 3 and Varnish 4 through multiple layers of caches (ulsfo... [12:33:52] (03PS7) 10Marostegui: mariadb: Depool db1064 for maintenance; pool db1019 instead [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723) [12:39:11] (03PS1) 10Ema: Revert "Upgrade cp4005 (ulsfo cache_upload) to Varnish 4" [puppet] - 10https://gerrit.wikimedia.org/r/308560 (https://phabricator.wikimedia.org/T131502) [12:40:01] (03CR) 10Ema: [C: 032 V: 032] Revert "Upgrade cp4005 (ulsfo cache_upload) to Varnish 4" [puppet] - 10https://gerrit.wikimedia.org/r/308560 (https://phabricator.wikimedia.org/T131502) (owner: 10Ema) [12:40:04] 06Operations, 06Discovery, 10Elasticsearch, 06Discovery-Search (Current work): Make elasticsearch actually uses shard allocation awareness - https://phabricator.wikimedia.org/T143571#2608663 (10Gehel) [12:40:45] (03CR) 10Gilles: "I'll let you close the phab tasks once you've confirmed that all three issues are resolved" [debs/python-thumbor-wikimedia] - 10https://gerrit.wikimedia.org/r/308552 (owner: 10Gilles) [12:41:42] !log downgrading cp4005 to varnish 3 T131502 [12:41:43] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [12:41:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [12:42:39] (03PS1) 10Gehel: elasticsearch - enable row aware shard allocation [puppet] - 10https://gerrit.wikimedia.org/r/308561 (https://phabricator.wikimedia.org/T143571) [12:44:25] 06Operations, 06Discovery, 10Elasticsearch, 06Discovery-Search (Current work), 13Patch-For-Review: Make elasticsearch actually uses shard allocation awareness - https://phabricator.wikimedia.org/T143571#2608671 (10Gehel) 
Allocation will be row aware, not rack aware. Spreading shards across row will ensur... [12:44:46] (03CR) 10Jcrespo: [C: 031] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723) (owner: 10Marostegui) [12:45:16] marostegui, let's do a deploy [12:45:27] jynus: let's do it! [12:45:29] (if you are around) [12:45:36] I am! [12:47:30] !log depooling/rebooting/repooling sca1002 for upgrade to Linux 4.4 (T144492) [12:47:31] T144492: Blocked /etc/passwd on sca100[1234] hosts - https://phabricator.wikimedia.org/T144492 [12:47:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [12:47:42] jynus: Let's wait like 30 minutes, or so, I will ping you around 15:15 [12:47:49] good [12:49:02] actually, at that time there is the European SWAT [12:49:26] so it will have to be after that, or add it to the list [12:49:39] !log repool cp4005 with varnish 3 [12:49:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [12:49:56] I will add it to the list, ask you for a self-deploy [12:56:41] (03CR) 10ArielGlenn: [C: 032] allow test module commands such as test.ping to be run from palladium [puppet] - 10https://gerrit.wikimedia.org/r/308551 (owner: 10ArielGlenn) [12:56:52] (03PS2) 10ArielGlenn: allow test module commands such as test.ping to be run from palladium [puppet] - 10https://gerrit.wikimedia.org/r/308551 [12:56:57] addshore: ack'ed [12:57:14] =] [13:00:04] hashar, Dereckson, addshore, and aude: Dear anthropoid, the time has come. Please deploy European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T1300). [13:00:04] MatmaRex, Urbanecm, Addshore, and jynus: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be available during the process. 
[13:00:13] Around [13:00:21] here [13:00:34] * addshore does not see a MatmaRex :o [13:01:01] Urbanecm: I'll start with you then! [13:01:17] Okay [13:01:22] (03CR) 10Addshore: [C: 032] Add source wikis in import page in sawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308429 (https://phabricator.wikimedia.org/T133483) (owner: 10Urbanecm) [13:01:49] (03Merged) 10jenkins-bot: Add source wikis in import page in sawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308429 (https://phabricator.wikimedia.org/T133483) (owner: 10Urbanecm) [13:03:43] Urbanecm: your change is deployed on mw1099, please check! :) [13:04:19] Okay, checking. [13:05:13] addshore: sawiki works but I'm not a sysop so I can't check it fully. [13:05:28] So I'll ask the author of the task to check. [13:05:31] Is it fine? [13:05:51] I think so, it doesn't look like anything can go wrong with that one :) [13:06:18] Okay. [13:07:00] !log addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:308429|Add source wikis in import page in sawiki]] (duration: 00m 53s) [13:07:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [13:07:07] Urbanecm: all deployed! [13:07:12] 06Operations: Blocked /etc/passwd on sca100[1234] hosts - https://phabricator.wikimedia.org/T144492#2608702 (10MoritzMuehlenhoff) 05Open>03Resolved All systems from the sca cluster are now running the 4.4 HWE kernel from trusty, puppet runs are fine again. 
[13:07:33] (03CR) 10Addshore: [C: 032] Enable the revisionslider on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308517 (https://phabricator.wikimedia.org/T144616) (owner: 10Addshore) [13:08:03] Urbanecm: incase you missed that last message, it is deployed everywhere now [13:08:20] 06Operations, 13Patch-For-Review: Handling of customised systemd units via puppet in base::service_unit - https://phabricator.wikimedia.org/T143210#2608704 (10MoritzMuehlenhoff) 05Open>03Resolved base::service_unit provides support for an override file and HHVM has been migrated to that. I also sent a noti... [13:09:08] (03PS2) 10Addshore: Enable the revisionslider on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308517 (https://phabricator.wikimedia.org/T144616) [13:09:16] (03CR) 10Addshore: [C: 032] Enable the revisionslider on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308517 (https://phabricator.wikimedia.org/T144616) (owner: 10Addshore) [13:09:43] (03Merged) 10jenkins-bot: Enable the revisionslider on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308517 (https://phabricator.wikimedia.org/T144616) (owner: 10Addshore) [13:12:41] !log addshore@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:308517|Enable the RevisionSlider on test.wikidata.org]] (duration: 00m 48s) [13:13:13] marostegui: jynus as MatmaRex is a no show I'll leave you to do your patch :) [13:13:35] thanks, please ping me in case things change [13:13:41] so we do not collide [13:13:48] willdo! 
[13:19:20] (03PS2) 10Elukey: Remove the unused group analytics-root from puppet [puppet] - 10https://gerrit.wikimedia.org/r/308535 [13:23:04] (03CR) 10Elukey: [C: 032] Remove the unused group analytics-root from puppet [puppet] - 10https://gerrit.wikimedia.org/r/308535 (owner: 10Elukey) [13:23:27] (03PS4) 10Giuseppe Lavagetto: puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - 10https://gerrit.wikimedia.org/r/308533 [13:23:49] !log reimaging mw2087 to jessie [13:23:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [13:29:03] (03CR) 10Giuseppe Lavagetto: [C: 032] puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - 10https://gerrit.wikimedia.org/r/308533 (owner: 10Giuseppe Lavagetto) [13:29:17] (03CR) 10Giuseppe Lavagetto: "Does the right thing: https://puppet-compiler.wmflabs.org/3953/" [puppet] - 10https://gerrit.wikimedia.org/r/308533 (owner: 10Giuseppe Lavagetto) [13:29:23] (03PS5) 10Giuseppe Lavagetto: puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - 10https://gerrit.wikimedia.org/r/308533 [13:29:31] (03CR) 10Giuseppe Lavagetto: [V: 032] puppetmaster: allow using puppetdb as a backend for storeconfigs [puppet] - 10https://gerrit.wikimedia.org/r/308533 (owner: 10Giuseppe Lavagetto) [13:30:49] moritzm: codfw appservers (except mw2170) all debianized :) [13:34:42] (03PS1) 10Gehel: graphite - fix storage_schemas order [puppet] - 10https://gerrit.wikimedia.org/r/308565 [13:36:03] (03CR) 10jenkins-bot: [V: 04-1] graphite - fix storage_schemas order [puppet] - 10https://gerrit.wikimedia.org/r/308565 (owner: 10Gehel) [13:36:43] 06Operations, 06Operations-Software-Development, 07HHVM, 13Patch-For-Review: Upgrade all mw* servers to debian jessie - https://phabricator.wikimedia.org/T143536#2608809 (10Volans) [13:37:24] (03PS2) 10Gehel: graphite - fix storage_schemas order [puppet] - 10https://gerrit.wikimedia.org/r/308565 [13:38:11] elukey: \o/ also fixing the 
last scaler as we speak (via manual racadm setting, since Papaul's IPMI change seems non-working) [13:40:45] (03PS4) 10Giuseppe Lavagetto: Change-Prop: Enable file transclusion updates [puppet] - 10https://gerrit.wikimedia.org/r/306308 (owner: 10Ppchelko) [13:40:57] !log upgrading mw130[012345] to the latest version of Apache httpd (eqiad jobrunners, one at the time) [13:41:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [13:41:38] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] Change-Prop: Enable file transclusion updates [puppet] - 10https://gerrit.wikimedia.org/r/306308 (owner: 10Ppchelko) [13:43:43] <_joe_> mobrovac: running puppet now [13:43:48] kk [13:43:56] <_joe_> done [13:44:30] k, first restarting in codfw [13:44:31] jynus: I am ready to deploy [13:45:59] (03PS8) 10Marostegui: mariadb: Depool db1064 for maintenance; pool db1019 instead [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723) [13:46:29] _joe_: all good in codfw, proceeding to eqiad [13:47:01] !log change-prop restarting for https://gerrit.wikimedia.org/r/306308 [13:47:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [13:47:09] (03CR) 10Marostegui: [C: 032] mariadb: Depool db1064 for maintenance; pool db1019 instead (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723) (owner: 10Marostegui) [13:47:40] (03Merged) 10jenkins-bot: mariadb: Depool db1064 for maintenance; pool db1019 instead [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308550 (https://phabricator.wikimedia.org/T144723) (owner: 10Marostegui) [13:52:53] 07Puppet, 10Continuous-Integration-Infrastructure: Cant refresh Nodepool snapshot due to puppet: Could not find class passwords::puppet::database - https://phabricator.wikimedia.org/T143769#2608847 (10hashar) I think the original issue related to puppetmaster class is gone. 
The reason for Alexandros change is... [13:55:43] _joe_: ok, looking good so far, thnx! [13:57:36] <_joe_> mobrovac: cool [13:57:44] (03PS1) 10Giuseppe Lavagetto: role::puppetmaster: removed useless quoted booleans [puppet] - 10https://gerrit.wikimedia.org/r/308568 [13:57:46] (03PS1) 10Giuseppe Lavagetto: puppetdb: use on codfw puppetmaster [puppet] - 10https://gerrit.wikimedia.org/r/308569 [14:00:18] (03CR) 10Volans: "Hi Alex, thanks for the new revision, it's in much better shape now." (0320 comments) [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) (owner: 10Alex Monk) [14:00:21] !log upgrading mw1306/mw1299 to the latest version of Apache httpd [14:00:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [14:05:06] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Changing db-eqiad config to depool db1064 - T144723 (duration: 00m 48s) [14:05:07] T144723: Reimage & upgrade db1064 - https://phabricator.wikimedia.org/T144723 [14:05:08] 07Puppet, 10Continuous-Integration-Infrastructure, 13Patch-For-Review: Cant refresh Nodepool snapshot due to puppet: Could not find class passwords::puppet::database - https://phabricator.wikimedia.org/T143769#2608943 (10hashar) 05Open>03Resolved a:03hashar That fixed the build of both Jessie and Trust... [14:05:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [14:07:36] The deploy is done this box failed: mw2087.codfw.wmnet failed because of ssh fingerprint, the full log is here: https://phabricator.wikimedia.org/P3980 [14:08:07] elukey, could it be one of the servers being reimaged? 
[14:08:10] (03PS2) 10Giuseppe Lavagetto: role::puppetmaster: removed useless quoted booleans [puppet] - 10https://gerrit.wikimedia.org/r/308568 [14:09:34] 06Operations: Remove mw2061-mw2074 - https://phabricator.wikimedia.org/T144745#2608981 (10MoritzMuehlenhoff) [14:11:54] jynus: yes sorry didn't see the message, it has been reimaged [14:11:54] 06Operations: Remove mw2061-mw2074 - https://phabricator.wikimedia.org/T144745#2609006 (10Joe) We can remove these, but we'd probably need some more servers to compensate the loss of computing power (eqiad servers are now newer and more powerful). [14:12:09] elukey, you take care of complete the install and do pull? [14:12:20] yes sure [14:12:24] elukey: thank you [14:12:49] sorry for the trouble :) In case you have any doubt during the next days https://etherpad.wikimedia.org/p/trusty-mw-reimage [14:12:51] <_joe_> elukey: while reimaging, you're setting the servers to "inactive" in conftool, right? [14:12:55] elukey, no trouble [14:13:04] <_joe_> if so, you can run puppet on tin to spare the deployers the pain [14:13:12] <_joe_> it will read data from conftool [14:13:25] _joe_, I assume inactive will remove from dsh? [14:13:41] in addition from LVS? 
[14:14:02] _joe_ I didn't do it for the last reimage since it was far from deployment times, but you're right [14:14:24] <_joe_> jynus: yes, but removing from dsh is done by puppet [14:14:36] <_joe_> which we might want to change in the future, or not [14:14:54] yes, indeed we want to coordinate on that [14:15:39] marostegui: yes, mw2087 is currently being reimaged to jessie [14:15:53] note I am only nitpicking because I want to show marostegui the "right way to do things" [14:16:13] moritzm: cheers - elukey will kindly take care of it too :) [14:16:32] and it gets complicated because we are in the middle of many new, better ways to do things [14:17:04] marostegui: we'll try to set all the hosts inactive to avoid any mess during the next days, but we do the pulls after all the reimages so we'll take care of it :) [14:17:19] (03CR) 10Giuseppe Lavagetto: [C: 032] role::puppetmaster: removed useless quoted booleans [puppet] - 10https://gerrit.wikimedia.org/r/308568 (owner: 10Giuseppe Lavagetto) [14:17:29] elukey: sounds good, thank you again [14:17:35] <_joe_> now I feel better ^^ [14:17:46] 06Operations, 10Traffic, 13Patch-For-Review: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502#2180463 (10fgiunchedi) >>! In T131502#2608658, @ema wrote: > We suspect that the bug(s) encountered while upgrading ulsfo might have been caused by running a mix of Varnish 3 and Varni... 
[14:18:38] elukey, not an issue, in fact [14:18:47] it was a good exercise [14:18:56] one thing I proposed to releng [14:19:07] is to have a place to communicate long-running maintenance [14:19:25] <_joe_> jynus: fetching dsh from conftool is what does that better [14:19:42] elukey, _joe_ have a look at my proposal at https://phabricator.wikimedia.org/T144661 [14:19:46] <_joe_> but for now I decided to use puppet, thinking we don't need a full-blown confd running there [14:19:57] jynus: I have relayed your idea to greg on Friday evening and he likes it a lot :] [14:20:05] he probably has commented on the related task [14:20:14] <_joe_> jynus: oh ok I misunderstood [14:20:22] hashar told me they sometimes are a bit confused about ongoing maintenance [14:20:30] and I believe we could communicate it better [14:20:42] without blocking ongoing deploys [14:20:53] I like the idea, especially for big things like the debian mw* reimages [14:20:55] e.g. "this week we are going to reimage all mw servers" [14:20:56] <_joe_> rolling upgrades like this should be painless if elukey does his homework :P [14:20:58] exactly that [14:21:06] * elukey hides and cries [14:21:09] :D [14:21:15] the real challenge is being able to catch up with everything that is going on :/ [14:21:16] yes, the idea is not to notify every single issue [14:21:22] but large plans [14:21:28] <_joe_> elukey: seriously, we have a tech solution to the issue in that case [14:21:46] _joe_, comment on the ticket if you have better solutions [14:22:12] _joe_ I promise that I'll follow the rules for all the next reimages [14:22:26] <_joe_> jynus: not in general, for this specific case, deployers should not even feel the problem, but this makes me rethink a bit the whole thing [14:22:49] I didn't write it for this specific issue [14:22:57] but for schema changes vs. 
maintenance scripts [14:23:11] I only added mw maintenance as a stretch [14:23:22] feel free to give thoughts there about that RFC [14:23:31] for the dsh groups generated from etcd by puppet, I could not understand why it would include hosts flagged as inactive [14:23:33] even if they are "I will not do that" [14:23:54] I kind of stack overflowed looking at all the bits involved [14:23:55] <_joe_> hashar: can you be more explicit, please? [14:24:00] <_joe_> what do you mean? [14:24:03] yeah [14:24:04] so last week [14:24:14] we had a bunch of mw20xx servers being reimaged to Jessie [14:24:32] <_joe_> did you verify they were set to pooled=inactive? [14:24:43] they were flagged as inactive in etcd/confd as seen via https://config-master.wikimedia.org/conftool/codfw/apaches [14:24:45] BUT [14:24:54] puppet insisted on still including them in the dsh files that scap is using [14:24:59] <_joe_> hashar: as NOT seen there [14:25:02] they were showing as 'enabled': False [14:25:11] <_joe_> hashar: that's pooled=no [14:25:22] so at https://config-master.wikimedia.org/conftool/codfw/apaches [14:25:25] <_joe_> which I repeatedly stressed should not be used for reimages [14:25:29] a bunch of those servers being reimaged lack a pooled=no ? [14:25:51] <_joe_> if they are pooled=inactive, you should NOT see the servers in https://config-master.wikimedia.org/conftool/codfw/apaches [14:26:08] * _joe_ goes to change "inactive" to "decommissioned" [14:26:11] so in the end [14:26:32] the servers being reimaged were not ready by the time of evening SWAT [14:26:35] 06Operations, 06Discovery, 10Elasticsearch, 06Discovery-Search (Current work), 13Patch-For-Review: Make elasticsearch actually uses shard allocation awareness - https://phabricator.wikimedia.org/T143571#2609091 (10Gehel) Discussion with @dcausse: It is not entirely clear from the documentation what happ... 
[14:26:35] included in the DSH files [14:26:49] which caused scap to choke/idle waiting for them to respond to ssh [14:27:00] <_joe_> hashar: yeah blame the reimager [14:27:02] <_joe_> :P [14:27:03] (03CR) 10Filippo Giunchedi: "LGTM, re: separation I'd prefer if it was split in two patches: one for the feature itself and one for migrating mathoid. Testing should w" [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [14:27:27] <_joe_> brb [14:27:33] maybe we could have a maintenance=true field which would let us skip them [14:27:52] also scap should most probably rely on etcd/conftool whatever instead of the dsh files generated by puppet [14:28:20] (not blaming, just trying to expose how I understood it last week) [14:28:51] eventually Daniel disabled puppet on tin, deleted the mw20xx servers from the dsh files and we finished SWAT / and mw train deploy [14:29:55] hashar, will it guarantee a pull in time? [14:30:12] 07Puppet, 10Continuous-Integration-Infrastructure, 13Patch-For-Review: Cant refresh Nodepool snapshot due to puppet: Could not find class passwords::puppet::database - https://phabricator.wikimedia.org/T143769#2609093 (10hashar) Cant refresh the snapshots since uploading the images fail with: BadReque... [14:33:01] hashar: I am pretty sure that I am the one to blame for the dsh problem, really sorry [14:33:36] 06Operations, 10Ops-Access-Requests: Requesting access to stat1002 for ZZhou (WMF) - https://phabricator.wikimedia.org/T144624#2605535 (10MoritzMuehlenhoff) I noticed that you have a PGP key for your wikimedia address, which is great for validating this change: Could you please send me a PGP-signed mail with y... [14:33:45] 06Operations, 10Ops-Access-Requests: Requesting access to stat1002 for ZZhou (WMF) - https://phabricator.wikimedia.org/T144624#2609114 (10MoritzMuehlenhoff) a:03MoritzMuehlenhoff [14:34:52] elukey: not a big deal. 
I am not sure how you can solve that really [14:35:26] sounds to me that entries with 'enabled': False at https://config-master.wikimedia.org/conftool/codfw/apaches should not be included in the dsh files [14:35:34] hashar: well as Giuseppe was saying setting the hosts to inactive would have helped a lot [14:35:45] then I am not sure what is the semantic behind 'enabled' [14:36:05] enabled false means that they are not receiving traffic from LVS but they could be repooled immediately [14:36:11] so it makes sense to have them in the dsh [14:36:25] inactive means that you are going to execute a longer maintenance [14:37:03] ohhh [14:37:51] so the reimaging servers should have been flagged 'inactive': true? [14:37:52] What Giuseppe was saying is that if the reimager (like me) sets the host as inactive, then the next puppet run on tin will take care of the dsh removal [14:37:57] yeah [14:38:08] and I guess that would have solved past week issue (delta waiting for puppet to trigger and update the dsh files) [14:38:11] I mean, not sure if they'll be shown as inactive in the webpage [14:38:22] volans, you know I'm trying to port an existing script here, right? [14:38:29] hashar: you wouldn't have noticed it [14:38:36] along with making some pre-defined improvements [14:38:54] hashar: so good lesson learned, we still have to reimage tons of servers :) [14:38:57] elukey: no worries. I am happy that we have all the mechanism / software stack in place [14:39:30] then I guess if there is a reimaging doc somewhere, it would be nice to add the "set host inactive to get rid of it from dsh notes" as a step [14:39:36] and we will be nicely covered :] [14:39:50] Krenair: yes I know, I put too much stuff there? 
[14:39:54] hashar: I wrote it in the etherpad but didn't followed the rules :P [14:40:12] but I'll try to write a guide somewhere [14:40:15] sorry for the mysql return values, I actually forgot was mentioned in the commit message that was not included [14:40:22] elukey: hehe even better! then it all boils down the human error ;] [14:40:34] maybe something like: "first step, depool with inactive. second step: did you do the first? sure?" [14:41:18] hashar: codfw gives you a false sense of confidence since it is not taking traffic [14:41:38] and it is easy to forget that it is used in tons of other things [14:41:41] like scap deployments :D [14:42:48] volans, things like "If launched without parameters it will replace all existing views, looks like a bit dangerous as a default to me." [14:42:55] !log upgrading apache httpd to the latest version on mw129[3-8] (eqiad image scalers) [14:43:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [14:43:01] You may or may not be right, but what does the existing script do? [14:43:39] elukey: yeah no worries. [14:44:01] elukey: I was more looking at whether we had a system to prevent that and a doc that had the step. looks like we are covered [14:44:14] the workaround was quite easy anyway (hack the dsh files on tin) [14:44:29] <_joe_> hashar: yes everything is documented and if used correctly would do exactly what you wanted [14:45:38] Krenair: sure, that for example is a very generic comment, doesn't have to be fixed right now [14:46:08] 06Operations, 05Continuous-Integration-Scaling, 07Nodepool, 07WorkType-NewFunctionality: Backport python-shade from debian/testing to jessie-wikimedia - https://phabricator.wikimedia.org/T107267#2609155 (10hashar) **status update** I haven't come to it cause the next commit in Nodepool relying on it is q... 
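The pooled-state semantics discussed above (pooled=yes, pooled=no shown as 'enabled': False, and pooled=inactive) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the code conftool or puppet actually runs: the function names `lvs_targets` and `dsh_targets`, the host dictionary layout, and the sample states for mw2088 and mw2089 are all hypothetical. It shows why a host set to pooled=no would still appear in the dsh list used for deploys, while a host set to pooled=inactive would drop out of it on the next regeneration.

```python
from typing import Dict, List

# Hypothetical state names, taken from the values mentioned in the discussion.
POOLED_YES = "yes"            # receiving traffic
POOLED_NO = "no"              # depooled, but could be repooled immediately
POOLED_INACTIVE = "inactive"  # longer maintenance, e.g. a reimage


def lvs_targets(hosts: Dict[str, str]) -> List[str]:
    """Hosts that should receive traffic: only pooled=yes (assumed rule)."""
    return sorted(h for h, state in hosts.items() if state == POOLED_YES)


def dsh_targets(hosts: Dict[str, str]) -> List[str]:
    """Hosts a deploy should reach: everything except pooled=inactive.

    pooled=no hosts stay in the list so they receive code and can be
    repooled right away; inactive hosts are skipped entirely.
    """
    return sorted(h for h, state in hosts.items() if state != POOLED_INACTIVE)


if __name__ == "__main__":
    # Illustrative data only; the states assigned here are made up.
    hosts = {
        "mw2087.codfw.wmnet": POOLED_INACTIVE,  # being reimaged
        "mw2088.codfw.wmnet": POOLED_NO,
        "mw2089.codfw.wmnet": POOLED_YES,
    }
    print("lvs:", lvs_targets(hosts))
    print("dsh:", dsh_targets(hosts))
```

Under these assumptions, marking a host inactive before a reimage is what keeps it out of the deploy target list; marking it pooled=no alone would not.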
[14:46:24] 06Operations: Update ICU version to 55.1 - https://phabricator.wikimedia.org/T143931#2609156 (10MoritzMuehlenhoff) p:05Triage>03Normal [14:46:34] 06Operations, 05Continuous-Integration-Scaling, 07Nodepool, 07WorkType-NewFunctionality: Backport python-shade from debian/testing to jessie-wikimedia - https://phabricator.wikimedia.org/T107267#2609157 (10hashar) p:05Normal>03Low [14:47:02] (03CR) 10Volans: "actual puppet compiler output: https://puppet-compiler.wmflabs.org/3951/" [puppet] - 10https://gerrit.wikimedia.org/r/308554 (https://phabricator.wikimedia.org/T143536) (owner: 10Volans) [14:47:29] 06Operations, 10Ops-Access-Requests: Requesting access to stat1002 for ZZhou (WMF) - https://phabricator.wikimedia.org/T144624#2609164 (10MoritzMuehlenhoff) p:05Triage>03Normal [14:50:23] (03PS2) 10Volans: Salt: reducing permissions on the master's Job cache [puppet] - 10https://gerrit.wikimedia.org/r/308554 (https://phabricator.wikimedia.org/T143536) [14:50:27] (03PS9) 10Mobrovac: service::node: Compile the file holding puppet-controlled vars [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) [14:51:02] (03PS2) 10Giuseppe Lavagetto: postgresql::server: fix service name on jessie. [puppet] - 10https://gerrit.wikimedia.org/r/304456 [14:52:00] (03CR) 10Volans: [C: 032] Salt: reducing permissions on the master's Job cache [puppet] - 10https://gerrit.wikimedia.org/r/308554 (https://phabricator.wikimedia.org/T143536) (owner: 10Volans) [14:53:40] (03PS3) 10Giuseppe Lavagetto: postgresql::server: fix service name on jessie. [puppet] - 10https://gerrit.wikimedia.org/r/304456 [14:53:47] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] postgresql::server: fix service name on jessie. 
[puppet] - 10https://gerrit.wikimedia.org/r/304456 (owner: 10Giuseppe Lavagetto) [14:54:15] (03PS1) 10Mobrovac: Mathoid: USe Scap3 to deploy the config [puppet] - 10https://gerrit.wikimedia.org/r/308574 (https://phabricator.wikimedia.org/T144755) [14:56:17] (03CR) 10Mobrovac: "@Filippo, {{done}}, this patch only adds the feature, while Idb25ea39e95b19fa9d056a1b7abe8da570a4c2c8 switches Mathoid to use it." [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [14:56:41] _joe_: if you have a min, please take a look at ^ [14:56:49] * mobrovac is eager to get config deploys going [14:58:17] <_joe_> mobrovac: yeah, not really right now, in a few [14:58:26] <_joe_> I realize it's already late in the day [14:58:26] kk thnx [14:59:26] (03PS1) 10Filippo Giunchedi: lvs: switch prometheus to 'sh' scheduler [puppet] - 10https://gerrit.wikimedia.org/r/308575 [15:03:33] ori: (in case you're online today) can you tell me about the labs projects 'mdc,' 'mdc-east' and 'mdc-west'? Are they defunct? [15:04:23] <_joe_> mobrovac: honestly I have a few doubts about what you do in this PS [15:04:45] _joe_: happy to discuss [15:10:30] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "One small style note, and a substantially larger question." 
(032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [15:10:46] (03PS1) 10Muehlenhoff: role::statistics: Limit to production networks [puppet] - 10https://gerrit.wikimedia.org/r/308576 [15:12:01] (03PS3) 10Giuseppe Lavagetto: scap::source: use puppet to manage directory creation [puppet] - 10https://gerrit.wikimedia.org/r/306429 [15:13:22] (03PS1) 10Reedy: 2 more to extension.json in extension-list [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308577 (https://phabricator.wikimedia.org/T139800) [15:15:37] (03CR) 10Mobrovac: service::node: Compile the file holding puppet-controlled vars (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [15:15:47] _joe_: replied ^ [15:19:11] (03CR) 10Giuseppe Lavagetto: service::node: Compile the file holding puppet-controlled vars (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [15:19:15] (03PS10) 10Mobrovac: service::node: Compile the file holding puppet-controlled vars [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) [15:23:05] (03CR) 10Mobrovac: service::node: Compile the file holding puppet-controlled vars (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [15:26:20] (03PS11) 10Mobrovac: service::node: Compile the file holding puppet-controlled vars [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) [15:26:39] <_joe_> mobrovac: the dependency should be on the deploy vars file then maybe? 
[15:26:51] (03CR) 10Mobrovac: service::node: Compile the file holding puppet-controlled vars (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/308021 (https://phabricator.wikimedia.org/T144542) (owner: 10Mobrovac) [15:27:35] _joe_: ideally, yes, but on config-vars.yaml change, one would need to execute scap3 config deploy, just restarting the service wouldn't do us much good [15:28:46] (03PS2) 10Andrew Bogott: openstack horizon puppettab: Put docs on a question mark next to the role name [puppet] - 10https://gerrit.wikimedia.org/r/308188 (https://phabricator.wikimedia.org/T91990) (owner: 10Alex Monk) [15:29:17] (03CR) 10Alex Monk: Add python version of maintain-replicas script (0320 comments) [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) (owner: 10Alex Monk) [15:29:25] (03PS20) 10Alex Monk: Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [15:31:56] (03CR) 10Andrew Bogott: [C: 032] openstack horizon puppettab: Put docs on a question mark next to the role name [puppet] - 10https://gerrit.wikimedia.org/r/308188 (https://phabricator.wikimedia.org/T91990) (owner: 10Alex Monk) [15:32:24] <_joe_> mobrovac: ouch that's bad [15:33:04] _joe_: we could add an exec that does that [15:33:22] as effectively it's not a full deploy, but only a refresh of the config file [15:33:33] <_joe_> mobrovac: yes we should [15:33:40] in that case, if auto_refresh is true, do the exec [15:33:58] <_joe_> but can we sit on this until tomorrow? 
I want to take a closer look at the whole service::node current shape [15:34:07] kk [15:34:35] <_joe_> it has grown complex and it's also full of nested conditionals I'd like to simplify [15:35:09] (03PS1) 10Ema: cache_upload: route around codfw in cache::route_table [puppet] - 10https://gerrit.wikimedia.org/r/308582 (https://phabricator.wikimedia.org/T131502) [15:35:46] <_joe_> what exec would be needed there? [15:36:11] lemme look it up [15:38:35] _joe_: '/usr/bin/scap', 'deploy-local', '-v', '--repo', 'mathoid/deploy', '--force', '-g', 'default', 'config_deploy' [15:38:47] that seems to be the command run by scap to assemble the config file [15:38:54] (03CR) 10BBlack: [C: 031] cache_upload: route around codfw in cache::route_table [puppet] - 10https://gerrit.wikimedia.org/r/308582 (https://phabricator.wikimedia.org/T131502) (owner: 10Ema) [15:39:19] _joe_: but we should probably find a suitable solution in consultation with releng [15:39:31] _joe_: i think for now we should be fine with just failing [15:39:37] (03PS2) 10Ema: cache_upload: route around codfw in cache::route_table [puppet] - 10https://gerrit.wikimedia.org/r/308582 (https://phabricator.wikimedia.org/T131502) [15:39:56] (03CR) 10Ema: [C: 032 V: 032] cache_upload: route around codfw in cache::route_table [puppet] - 10https://gerrit.wikimedia.org/r/308582 (https://phabricator.wikimedia.org/T131502) (owner: 10Ema) [15:43:16] (03PS1) 10Hashar: nodepool: stop injecting show:true to images [puppet] - 10https://gerrit.wikimedia.org/r/308583 (https://phabricator.wikimedia.org/T91782) [15:46:51] (03CR) 10Paladox: [C: 031] nodepool: stop injecting show:true to images [puppet] - 10https://gerrit.wikimedia.org/r/308583 (https://phabricator.wikimedia.org/T91782) (owner: 10Hashar) [15:52:39] (03PS8) 10Andrew Bogott: openstack: Fix optional parameter listed before required parameter [puppet] - 10https://gerrit.wikimedia.org/r/308345 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [15:54:31] (03CR) 
10Andrew Bogott: [C: 032] "Puppet compiler approves." [puppet] - 10https://gerrit.wikimedia.org/r/308345 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [15:54:56] (03CR) 10Paladox: "Thanks." [puppet] - 10https://gerrit.wikimedia.org/r/308345 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [15:56:08] (03PS4) 10Giuseppe Lavagetto: scap::source: use puppet to manage directory creation [puppet] - 10https://gerrit.wikimedia.org/r/306429 [15:56:21] (03PS4) 10Jcrespo: Delete coredb_mysql module and dependent roles and modules [puppet] - 10https://gerrit.wikimedia.org/r/301076 [15:56:34] ^this patch only keeps getting larger and larger [15:57:07] 06Operations, 10fundraising-tech-ops: barium low on disk space - https://phabricator.wikimedia.org/T144659#2609379 (10Jgreen) Multiple causes: @cwdent via email "Eileen pointed out /srv/org.wikimedia.civicrm/sites/default/files/civicrm/ConfigAndLog/ which was filling up with some dedupe related stuff.  She al... [16:00:20] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "Needs more work, see https://puppet-compiler.wmflabs.org/3959/tin.eqiad.wmnet/change.tin.eqiad.wmnet.err" [puppet] - 10https://gerrit.wikimedia.org/r/306429 (owner: 10Giuseppe Lavagetto) [16:00:47] (03CR) 10Andrew Bogott: "I'm happy to merge this anytime you are standing by to watch :)" [puppet] - 10https://gerrit.wikimedia.org/r/308583 (https://phabricator.wikimedia.org/T91782) (owner: 10Hashar) [16:01:57] (03CR) 10Alexandros Kosiaris: [C: 032] mysql_multi_instance: Remove duplicate keys [puppet] - 10https://gerrit.wikimedia.org/r/308538 (owner: 10Alexandros Kosiaris) [16:02:02] (03PS2) 10Alexandros Kosiaris: mysql_multi_instance: Remove duplicate keys [puppet] - 10https://gerrit.wikimedia.org/r/308538 [16:02:05] (03CR) 10Andrew Bogott: [C: 032] openstack: Fix indentation of => [puppet] - 10https://gerrit.wikimedia.org/r/308346 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [16:02:07] (03CR) 10Alexandros Kosiaris: [V: 032] 
mysql_multi_instance: Remove duplicate keys [puppet] - 10https://gerrit.wikimedia.org/r/308538 (owner: 10Alexandros Kosiaris) [16:02:10] (03PS4) 10Andrew Bogott: openstack: Fix indentation of => [puppet] - 10https://gerrit.wikimedia.org/r/308346 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [16:02:12] (03CR) 10Alexandros Kosiaris: [C: 032] Use parentheses in all custom functions [puppet] - 10https://gerrit.wikimedia.org/r/308539 (owner: 10Alexandros Kosiaris) [16:02:16] (03PS2) 10Alexandros Kosiaris: Use parentheses in all custom functions [puppet] - 10https://gerrit.wikimedia.org/r/308539 [16:02:30] (03CR) 10Alexandros Kosiaris: [V: 032] "PCC happy at https://puppet-compiler.wmflabs.org/3948/" [puppet] - 10https://gerrit.wikimedia.org/r/308539 (owner: 10Alexandros Kosiaris) [16:02:58] (03PS2) 10Alexandros Kosiaris: puppet::self: Merge the standalone statements in the hash [puppet] - 10https://gerrit.wikimedia.org/r/308540 [16:03:02] (03CR) 10Alexandros Kosiaris: [C: 032 V: 032] puppet::self: Merge the standalone statements in the hash [puppet] - 10https://gerrit.wikimedia.org/r/308540 (owner: 10Alexandros Kosiaris) [16:04:01] (03PS5) 10Andrew Bogott: openstack: Fix indentation of => [puppet] - 10https://gerrit.wikimedia.org/r/308346 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [16:06:15] now scb1001 running low on space, probably due to apt cache/old kernels [16:06:20] (03PS1) 10Alexandros Kosiaris: pybal: Puppet 4 compatible require_package invocation [puppet] - 10https://gerrit.wikimedia.org/r/308584 [16:06:32] jynus: it's systemd logs [16:06:41] I got an open item in my TODO to fix it [16:07:39] (03CR) 10Alexandros Kosiaris: [C: 032] pybal: Puppet 4 compatible require_package invocation [puppet] - 10https://gerrit.wikimedia.org/r/308584 (owner: 10Alexandros Kosiaris) [16:07:43] (03PS2) 10Alexandros Kosiaris: pybal: Puppet 4 compatible require_package invocation [puppet] - 10https://gerrit.wikimedia.org/r/308584 [16:07:46] (03CR) 
10Alexandros Kosiaris: [V: 032] pybal: Puppet 4 compatible require_package invocation [puppet] - 10https://gerrit.wikimedia.org/r/308584 (owner: 10Alexandros Kosiaris) [16:08:02] it is one of those 8-gb root partition servers [16:09:09] 06Operations, 06Discovery, 06Discovery-Search, 10Elasticsearch, 10Wikimedia-Logstash: Disable cron job to clear elasticsearch caches and validate that it does not have significant impact on GC - https://phabricator.wikimedia.org/T144396#2609409 (10Gehel) 05Open>03Resolved GC still seems well under co... [16:11:45] 06Operations, 06Discovery, 10Wikidata, 10Wikidata-Query-Service, and 2 others: Install and configure new WDQS nodes on codfw - https://phabricator.wikimedia.org/T144380#2609413 (10Gehel) The cleanup of rules.log file is low priority and tracked on T144539. It will not be done as part of this task. [16:12:10] (03PS5) 10Andrew Bogott: openstack: lint, fix optional parameter listed before required parameter [puppet] - 10https://gerrit.wikimedia.org/r/308278 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [16:12:39] 06Operations, 06Discovery, 06Maps, 03Maps-Sprint, 13Patch-For-Review: Configure LVS in front of maps100? servers - https://phabricator.wikimedia.org/T142393#2609415 (10Gehel) [16:12:44] (03CR) 10Paladox: "Thanks." [puppet] - 10https://gerrit.wikimedia.org/r/308346 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [16:14:09] (03PS1) 10Ppchelko: Change-Prop: Encode page title before calling links module [puppet] - 10https://gerrit.wikimedia.org/r/308586 [16:15:37] (03CR) 10Andrew Bogott: [C: 032] "Puppet compiler approves." 
[puppet] - 10https://gerrit.wikimedia.org/r/308278 (https://phabricator.wikimedia.org/T93645) (owner: 10Paladox) [16:25:53] (03PS1) 10Gehel: maps - create project specific indices during initial data import [puppet] - 10https://gerrit.wikimedia.org/r/308587 [16:27:17] (03PS1) 10Ema: cache_upload: route codfw straight to applayer [puppet] - 10https://gerrit.wikimedia.org/r/308588 (https://phabricator.wikimedia.org/T131502) [16:29:15] (03CR) 10Ema: [C: 032] cache_upload: route codfw straight to applayer [puppet] - 10https://gerrit.wikimedia.org/r/308588 (https://phabricator.wikimedia.org/T131502) (owner: 10Ema) [16:30:15] (03CR) 10Yurik: maps - create project specific indices during initial data import (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/308587 (owner: 10Gehel) [16:33:27] (03PS2) 10Gehel: maps - create project specific indices during initial data import [puppet] - 10https://gerrit.wikimedia.org/r/308587 [16:44:12] (03CR) 10Yurik: [C: 031] maps - create project specific indices during initial data import [puppet] - 10https://gerrit.wikimedia.org/r/308587 (owner: 10Gehel) [16:47:30] (03PS21) 10Alex Monk: Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [16:48:13] (03PS3) 10Gehel: maps - create project specific indices during initial data import [puppet] - 10https://gerrit.wikimedia.org/r/308587 [16:48:25] (03CR) 10jenkins-bot: [V: 04-1] Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) (owner: 10Alex Monk) [16:49:33] (03PS22) 10Alex Monk: Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [16:50:30] (03CR) 10jenkins-bot: [V: 04-1] Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 
(https://phabricator.wikimedia.org/T138450) (owner: 10Alex Monk) [16:51:11] volans, ^ think I broke flake8 [16:51:16] (03CR) 10Yurik: maps - create project specific indices during initial data import (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/308587 (owner: 10Gehel) [16:51:24] Krenair: lol, looking [16:52:20] looks like column_number ends up being None [16:52:34] and so it gets TypeError: unsupported operand type(s) for +: 'NoneType' and 'int' [16:53:50] (03PS23) 10Alex Monk: Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [16:54:02] yes saw that [16:55:12] think the actual error it was trying to print was something like this: [16:55:13] maintain-replicas/maintain-replicas.py:278:10: E901 SyntaxError: can't assign to function call [16:55:37] now it passed [16:56:02] (03CR) 10Yurik: [C: 031] maps - create project specific indices during initial data import [puppet] - 10https://gerrit.wikimedia.org/r/308587 (owner: 10Gehel) [16:56:14] wonder what version of flake8 is in use on jenkins [16:56:22] I have 2.5.4 (pep8: 1.7.0, mccabe: 0.2.1, pyflakes: 1.1.0) CPython 3.5.2 on Linux [16:56:34] should be 2.5.5 [16:56:39] specified in the tox.ini [16:56:42] ah [16:57:57] or maybe not... let me recall [16:58:33] actually 16:54:09 flake8 installed: configparser==3.5.0,enum34==1.1.6,flake8==3.0.4,mccabe==0.5.2,pycodestyle==2.0.0,pyflakes==1.2.3 [17:00:04] gehel: Dear anthropoid, the time has come. Please deploy Weekly Wikidata query service deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T1700). [17:00:13] jouncebot: o/ [17:01:23] Krenair: ping me when you want me to have another pass [17:01:49] haven't done any of your General stuff or MySQL stuff yet [17:02:03] and again thanks for taking care of this! 
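Note on the flake8 failure discussed above (16:52 to 16:55): the TypeError hid the real E901 report because a column of None was added to an int. The following is a minimal Python sketch of that failure mode and a defensive alternative. It is not flake8's actual code; `naive_location`, `format_location` and `syntax_error_of` are hypothetical helpers, and treating a missing column as column 1 is an assumption.

```python
def naive_location(path, line, column):
    """Format 'path:line:col' assuming the column is always an int."""
    # Raises TypeError when column is None, which masks the real lint error.
    return f"{path}:{line}:{column + 1}"


def format_location(path, line, column):
    """Format 'path:line:col', tolerating a missing (None) column."""
    col = 0 if column is None else column
    return f"{path}:{line}:{col + 1}"


def syntax_error_of(source):
    """Return the SyntaxError raised when compiling source, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc
    return None


if __name__ == "__main__":
    # Assigning to a function call is the kind of error E901 reported.
    print(syntax_error_of("f() = 1"))
    print(format_location("maintain-replicas.py", 278, None))
```

With the guard in place the underlying SyntaxError can still be reported even when no column is available.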
[17:04:25] 06Operations, 13Patch-For-Review, 05Prometheus-metrics-monitoring: deploy prometheus node_exporter for host monitoring - https://phabricator.wikimedia.org/T140646#2609495 (10fgiunchedi) In the interest of having an updated version of node-exporter, I've uploaded `0.12.0+git20160831.0.1549f308+ds1` internally... [17:04:32] (03PS5) 10Gehel: wdqs - move data to /srv [puppet] - 10https://gerrit.wikimedia.org/r/308023 (https://phabricator.wikimedia.org/T144536) [17:06:36] (03CR) 10Gehel: [C: 032] wdqs - move data to /srv [puppet] - 10https://gerrit.wikimedia.org/r/308023 (https://phabricator.wikimedia.org/T144536) (owner: 10Gehel) [17:12:22] SMalyshev: ready for the scary scap3 part of the wdqs deploy? [17:12:53] gehel: yes I think so [17:13:04] gehel: can we deploy just one server? [17:13:14] (03PS3) 10Alex Monk: Fixes and improvements for maintain-meta_p [software] - 10https://gerrit.wikimedia.org/r/304425 [17:13:16] (03PS1) 10Alex Monk: maintain-meta_p: style improvements [software] - 10https://gerrit.wikimedia.org/r/308590 [17:13:24] SMalyshev: yes, we have a canary configured, so it will stop after wdqs1001 [17:13:51] gehel: canary by itself doesn't stop IIRC. if it thinks it's ok it will proceed. But I want to manually check [17:14:24] SMalyshev: I remember it asking for confirmation... but I might be wrong [17:14:42] gehel: scap deploy -l host should work [17:14:48] I can remove 1002 from the list [17:14:57] SMalyshev: -l is even better! [17:15:03] yeah let's try with -l [17:15:05] (03PS1) 10Filippo Giunchedi: debian: add back librsvg2-bin [debs/python-thumbor-wikimedia] - 10https://gerrit.wikimedia.org/r/308591 [17:15:20] gehel: if it's fine then we do without -l [17:16:31] SMalyshev: the scap config changes have not been merged yet... [17:17:21] gehel: ah, sorry, let me see [17:17:50] (03CR) 10Volans: [C: 031] "LGTM" [software] - 10https://gerrit.wikimedia.org/r/308590 (owner: 10Alex Monk) [17:17:59] gehel: merged now [17:18:05] SMalyshev: thanks! 
[17:18:27] SMalyshev: yep, looks good [17:18:29] thanks Krenair for fixing flake8 in maintain-replicas! [17:18:51] now we just have to get that dependency merged :) [17:19:56] maintain-meta_p? [17:20:08] yeah [17:20:16] I can merge it ;) [17:21:06] I reviewed just the diffs for the style, I saw that some stuff could be merged with the other one you're working to, but can be done in a future iteration [17:21:37] (03PS1) 10Muehlenhoff: Rename ferm service for postgres/puppetdb [puppet] - 10https://gerrit.wikimedia.org/r/308592 [17:22:01] SMalyshev: failing... [17:22:25] volans, you think meta_p should be dealt with in the same script as all the views? [17:22:35] gehel: hmm looks like symlink is not there [17:23:06] SMalyshev: yep... so scap did not re-create it... [17:23:53] gehel: right... maybe the group command not works like I thought it works? [17:24:04] maybe it works only for one group [17:24:17] (03PS1) 10Ema: Upgrade upload codfw to Varnish 4 [puppet] - 10https://gerrit.wikimedia.org/r/308593 (https://phabricator.wikimedia.org/T131502) [17:24:21] the deploy log say: 17:20:22 [wdqs1001.eqiad.wmnet] Executing check 'create_symlink_rules" [17:24:42] gehel: that's not the right one, right one is create_symlink_jnl [17:24:51] Krenair: I don't have enough context right now on meta_p but I saw a bunch of "familiar" stuff that looks common to both [17:25:01] yes [17:25:02] gehel: can you replace group: canary, default with group: canary and try again? [17:25:04] there's some common stuff [17:25:20] SMalyshev: I'll just add some icinga downtime before it screams... 
[17:25:20] mysql connections, config, mediawiki-config stuff [17:26:30] yep [17:26:33] (03CR) 10Ema: [C: 032] Upgrade upload codfw to Varnish 4 [puppet] - 10https://gerrit.wikimedia.org/r/308593 (https://phabricator.wikimedia.org/T131502) (owner: 10Ema) [17:27:30] SMalyshev: retrying with group: canaries [17:27:45] gehel: canary, not canaries [17:27:53] SMalyshev: for the record, scap asks confirmation to continue after canary [17:28:06] gehel: good! [17:28:22] (03PS1) 10Muehlenhoff: postgres/osm: Make accessible from production and labs networks [puppet] - 10https://gerrit.wikimedia.org/r/308594 [17:28:47] SMalyshev: no diff, so scap does nothing... there is probably a --force flag or similar, checking [17:28:59] gehel: hmm let me see [17:30:03] gehel: weird... maybe group does not work for some reason, I don't know enough about scap to say :( [17:30:35] the docs say it should work, but looks like it does not [17:30:45] or maybe because nothing changed it doesn't deploy? [17:32:01] gehel: there is, -f [17:32:35] mobrovac: thanks! I was still trying to find my way in the docs! [17:33:11] SMalyshev: ok, with -f symlink is created [17:33:32] cool [17:34:40] !log change-prop deploying 222fcf8 [17:34:41] seems to be fine [17:34:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:35:02] gehel: updater is still not running though [17:35:03] SMalyshev: test queries still failing, checking [17:35:05] !log upgrading cp2022 to varnish 4 T131502 [17:35:05] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [17:35:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:35:17] gehel: which ones? [17:35:34] SMalyshev: all :( [17:35:48] gehel: hmm... what are you running exactly and how? [17:36:33] ./test.sh -s http://localhost:8888/ (from the wikidata/query/rdf/queries repo), with SSH tunnel created [17:38:07] my bad... I activated maintenance on nginx during deploy... 
[17:38:09] hmm [17:38:13] gehel: ohh [17:40:02] gehel: hmm I remove maintenance file but nginx still says not found [17:40:09] SMalyshev: ok, looks good to me [17:40:30] gehel: does it work for you now? [17:40:38] SMalyshev: it works for me... [17:40:44] hmm ok [17:41:59] SMalyshev: checks.yaml seems to be yaml (obvious), so the syntax for an array is probably group: [ canary, default ] [17:42:00] gehel: ok, seems to be fine then [17:42:19] or new lines and "-" [17:42:28] gehel: maybe... I just split it into two clauses for now [17:42:40] since we'll be removing it anyway... [17:43:04] SMalyshev: whatever works for you, as you said, it is only temporary [17:43:25] !log restarting pybal on lvs2002 T134893 [17:43:26] T134893: Unhandled pybal error causing services to be depooled in etcd but not in lvs - https://phabricator.wikimedia.org/T134893 [17:43:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:43:41] SMalyshev: you push the fix to the deploy repo? [17:43:49] gehel: yes [17:44:00] SMalyshev: thanks! [17:50:04] gehel: so, I assume it's ready for codfw now? [17:50:43] SMalyshev: well, we still need to deploy wdqs1002, but then yes I'll start on codfw (this evening or tomorrow) [17:51:29] SMalyshev: Ok, I see the new checks, deploying again [17:53:04] SMalyshev: redeploy looks good on wdqs1001 [17:53:28] gehel: ok, ping me if anything weird comes up [17:54:00] SMalyshev: wdqs1002 looks good as well, seems we are done here [18:00:04] anomie, ostriches, thcipriani, hashar, and twentyafterfour: Respected human, time to deploy Morning SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T1800). Please do the needful. [18:02:31] gehel: ok, cool, if you're going to deploy codfw today please ping me if anything interesting happens :) [18:15:33] hey MatmaRex ! [18:16:34] hi addshore [18:16:59] You missed your mid day european swat stuff ;) [18:17:27] yeahhh. 
sorry [18:17:44] there is nothing in the swat window that is running right now though! [18:18:27] i don't want to pop up with it in the middle of the window though :P, and i didn't reschedule earlier. unless one of you folks wants to do the deployment, then sure, i'm around [18:18:36] if not i'll put it for the evening one [18:19:02] If you want them in this window I'm all ready to do them for you :) [18:22:03] (03PS1) 10ArielGlenn: dumps: use shell for explicit pipeline for check if files are empty [dumps] - 10https://gerrit.wikimedia.org/r/308599 [18:22:50] (03CR) 10ArielGlenn: [C: 032] dumps: use shell for explicit pipeline for check if files are empty [dumps] - 10https://gerrit.wikimedia.org/r/308599 (owner: 10ArielGlenn) [18:25:18] addshore: oh, you're a deployer? well, let's do it then :P [18:25:26] okay! :) [18:25:39] MatmaRex: can you move them in the deployments calendar ? [18:25:40] i'll fix up the Deployments page [18:25:41] yeah [18:27:14] (done) [18:27:20] =] [18:27:40] just got to wait for jenkins [18:31:23] !log upgrading cp2026 to varnish 4 T131502 [18:31:24] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [18:31:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:32:44] *twiddles thumbs* [18:37:21] MatmaRex: https://gerrit.wikimedia.org/r/#/c/308262/ is on mw1099 please check :) [18:39:23] addshore: doing. give me a few minutes, i have to make an account and upload a file [18:39:27] okay! [18:41:13] addshore: only https://gerrit.wikimedia.org/r/#/c/308262/ ? or all of them? [18:41:27] just did the first one for now, or do you need them all at the same time? [18:42:13] addshore: https://gerrit.wikimedia.org/r/#/c/308262/ by itself doesn't do much, the other two depend on it [18:42:31] okay, let me grab them all together for you [18:45:45] MatmaRex: all 3 are now on mw1099! [18:49:16] addshore: thanks, everything works as expected :) [18:49:28] okay! 
I'll push them all out in order then :) [18:49:48] yeah, the first one should go out first, the other two in any order afterwards [18:50:23] (actually, syncing them out-of-order wouldn't cause problems, the extra parameter to the function would just be ignored) [18:50:33] !log addshore@tin Synchronized php-1.28.0-wmf.17/resources/src/mediawiki/api/messages.js: SWAT: [[gerrit:308262|mw.api.messages: Allow passing extra parameters for the API call]] (duration: 00m 53s) [18:50:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:51:24] !log addshore@tin Synchronized php-1.28.0-wmf.17/resources/src/mediawiki/mediawiki.Upload.BookletLayout.js: SWAT: [[gerrit:308270|mw.Upload.BookletLayout: Use amenableparser to handle templates in error messages]] (duration: 00m 47s) [18:51:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:52:19] !log addshore@tin Synchronized php-1.28.0-wmf.17/extensions/UploadWizard/resources/mw.UploadWizardDetails.js: SWAT: [[gerrit:308267|mw.UploadWizardDetails, mw.UploadWizardUpload: Use amenableparser to handle templates in error messages]] Part 1/2 (duration: 00m 48s) [18:52:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:53:10] !log addshore@tin Synchronized php-1.28.0-wmf.17/extensions/UploadWizard/resources/mw.UploadWizardUpload.js: SWAT: [[gerrit:308267|mw.UploadWizardDetails, mw.UploadWizardUpload: Use amenableparser to handle templates in error messages]] Part 2/2 (duration: 00m 46s) [18:53:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:53:18] MatmaRex: thats all of them [18:53:31] yup. thanks! [18:55:21] And it fitted in the windows MatmaRex ;) [18:55:53] time to leave the office! 
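Note on the sync order used above: change 308262 had to go out first, and 308270 and 308267 could follow in either order because both depend on it. A minimal sketch of that ordering as a dependency graph follows, assuming Python 3.9+ for the standard-library `graphlib` module. The `DEPENDS_ON` mapping and `sync_order` helper are illustrative and not part of any deployment tooling.

```python
from graphlib import TopologicalSorter

# Gerrit change numbers from the log, each mapped to the changes it depends on.
DEPENDS_ON = {
    "308262": set(),        # mw.api.messages: extra parameters for the API call
    "308270": {"308262"},   # mw.Upload.BookletLayout
    "308267": {"308262"},   # mw.UploadWizardDetails / mw.UploadWizardUpload
}


def sync_order(depends_on):
    """Return one valid order in which every change follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(sync_order(DEPENDS_ON))
```

Any order returned starts with 308262; the relative order of the other two is not significant, which matches what MatmaRex says at 18:49:48.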
[19:22:03] !log upgrading cp2024 to varnish 4 T131502 [19:22:04] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [19:22:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:22:21] !log ema@palladium conftool action : set/pooled=no; selector: cp2024.codfw.wmnet [19:22:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:26:15] !log ema@palladium conftool action : set/pooled=yes; selector: cp2024.codfw.wmnet [19:26:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:45:39] !log upgrading cp2020 to varnish 4 T131502 [19:45:40] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [19:45:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:45:54] !log ema@palladium conftool action : set/pooled=no; selector: cp2020.codfw.wmnet [19:45:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:49:46] !log ema@palladium conftool action : set/pooled=yes; selector: cp2020.codfw.wmnet [19:49:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:00:04] gwicke, cscott, arlolra, subbu, bearND, mdholloway, halfak, and Amir1: Dear anthropoid, the time has come. Please deploy Services – Parsoid / OCG / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2000). [20:23:54] 06Operations, 06Commons, 10MediaWiki-File-management, 06Multimedia, and 4 others: InstantCommons broken by switch to HTTPS - https://phabricator.wikimedia.org/T102566#2609727 (10Tau) Hi! Finally InstantCommons is working in my wiki again. I don't know exactly what caused the error but after updating form P... 
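Note on the repeated pattern in these !log entries: each cache host is depooled, upgraded, and then repooled before the next one is touched. The sketch below models that sequence in Python under stated assumptions: `set_pooled` and `upgrade` are hypothetical callables standing in for the real tooling, and leaving a host depooled when its upgrade fails is an assumption, not something the log states.

```python
def upgrade_host(host, set_pooled, upgrade):
    """Depool host, run the upgrade, and repool only if it succeeded.

    set_pooled(host, state) and upgrade(host) are hypothetical callables;
    upgrade is expected to return True on success.
    """
    set_pooled(host, "no")
    if not upgrade(host):
        # Assumption: a failed host stays depooled so it serves no traffic.
        return False
    set_pooled(host, "yes")
    return True


def rolling_upgrade(hosts, set_pooled, upgrade):
    """Upgrade hosts one at a time, stopping at the first failure."""
    done = []
    for host in hosts:
        if not upgrade_host(host, set_pooled, upgrade):
            break
        done.append(host)
    return done


if __name__ == "__main__":
    actions = []
    rolling_upgrade(
        ["cp2024.codfw.wmnet", "cp2020.codfw.wmnet"],
        set_pooled=lambda h, s: actions.append(f"set/pooled={s}; selector: {h}"),
        upgrade=lambda h: True,
    )
    print("\n".join(actions))
```

Handling one host at a time keeps the rest of the pool serving traffic while each upgrade runs.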
[20:36:50] 06Operations, 10Analytics, 06Performance-Team, 10Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2609746 (10Nuria) > The problem is, however, that your "observations", the impressions, are not independent, because subsets of them are generated by the same users, and so >yo... [20:41:21] !log upgrading cp2017 to varnish 4 T131502 [20:41:22] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [20:41:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:41:37] !log ema@palladium conftool action : set/pooled=no; selector: cp2017.codfw.wmnet [20:41:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:45:33] !log ema@palladium conftool action : set/pooled=yes; selector: cp2017.codfw.wmnet [20:45:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:56:36] !log Updated striker to b5fdbf9 (T144040, T144296) [20:56:38] T144296: Admin console fails to add new diffusionrepo entries - https://phabricator.wikimedia.org/T144296 [20:56:38] T144040: Need pretty page for uwsgi proxy errors - https://phabricator.wikimedia.org/T144040 [20:56:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:00:04] dapatrick and bawolff: Dear anthropoid, the time has come. Please deploy Weekly Security deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2100). 
[21:10:44] (03PS3) 10ArielGlenn: add timeout and related callback to method for running proc without output [dumps] - 10https://gerrit.wikimedia.org/r/308015 [21:10:46] (03PS11) 10ArielGlenn: abstract out code for adds/changes dumps generation, for general library [dumps] - 10https://gerrit.wikimedia.org/r/307257 (https://phabricator.wikimedia.org/T133547) [21:10:48] (03PS3) 10ArielGlenn: fix up locking for misc dumps [dumps] - 10https://gerrit.wikimedia.org/r/308016 [21:11:19] blergh [21:11:25] well that's it for me for the night anyways [21:13:41] 06Operations, 10Dumps-Generation: fix up datasets uid - https://phabricator.wikimedia.org/T113467#1665979 (10hashar) We have the issue on beta. The task has a fair amount of details T117028 [21:28:39] !log upgrading cp2014 to varnish 4 T131502 [21:28:40] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [21:28:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:28:57] !log ema@palladium conftool action : set/pooled=no; selector: cp2014.codfw.wmnet [21:29:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:31:01] (03CR) 10Andrew Bogott: [C: 032] google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308111 (https://phabricator.wikimedia.org/T144290) (owner: 10BryanDavis) [21:31:06] (03PS5) 10Andrew Bogott: google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308111 (https://phabricator.wikimedia.org/T144290) (owner: 10BryanDavis) [21:31:41] (03PS1) 10Aaron Schulz: Avoid pointless ChronologyProtector duplicate key notices [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308662 [21:32:11] thanks andrewbogott [21:33:08] (03CR) 10Andrew Bogott: [V: 032] google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308111 
(https://phabricator.wikimedia.org/T144290) (owner: 10BryanDavis) [21:33:23] (03CR) 10Andrew Bogott: google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308111 (https://phabricator.wikimedia.org/T144290) (owner: 10BryanDavis) [21:36:10] !log ema@palladium conftool action : set/pooled=yes; selector: cp2014.codfw.wmnet [21:36:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:45:26] <_joe_> icinga-wm: ping? [21:49:47] (03PS1) 10Giuseppe Lavagetto: Revert "google_api_proxy: Add role for Google-api-proxy Labs project" [puppet] - 10https://gerrit.wikimedia.org/r/308663 [21:50:09] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] "breaking puppet everywhere." [puppet] - 10https://gerrit.wikimedia.org/r/308663 (owner: 10Giuseppe Lavagetto) [21:50:27] <_joe_> done [21:51:02] thanks _joe_ [21:51:26] <_joe_> andrewbogott: https://phabricator.wikimedia.org/T119042 [21:51:32] <_joe_> this is why I had to revert [21:52:19] <_joe_> andrewbogott: due to a bug in the puppet parser, we can't have first-level roles like modules/role/manifests/google_api_proxy.pp [21:52:34] <_joe_> put that in manifests/role for now [21:53:15] ircecho is logging on neon [21:54:04] _joe_: reading… what did you revert? [21:54:19] <_joe_> andrewbogott: https://gerrit.wikimedia.org/r/308111 [21:54:24] <_joe_> see the backlog [21:54:24] oh yeah, elukey re-enabled ircecho shortly after disabling it, it's on SAL [21:54:32] (03CR) 10Hashar: "The snapshots we boot instances from are currently outdated. When this change merge, Nodepool will refresh them automatically at 14:14UTC." [puppet] - 10https://gerrit.wikimedia.org/r/308583 (https://phabricator.wikimedia.org/T91782) (owner: 10Hashar) [21:54:33] oh, bd808's thing. 
ok [21:55:20] <_joe_> yeah, good night :) [21:55:33] I cannot find in my log when icinga-wm died, last message was at Mon 00:31:27 UTC [21:55:54] I can still CTCP PING icinga-wm [21:56:03] so it live [21:56:04] lives* [21:56:26] I think the puppet parser error is fixed in puppet-lint 2.*+ [21:57:16] Oh yes, I think someone may have quit the bot yesterday due to a huge puppet log here [21:57:24] it was filled with errors [21:57:49] I can restart ircecho [21:58:06] oh crap. sorry I broke the world. [21:58:18] !log restarting ircecho on neon to get back icinga-wm [21:58:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:58:44] ^^ it seems the bot was there, probably just froze [21:58:47] lol [21:58:51] thanks volans [21:58:54] RECOVERY - puppet last run on aqs1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:58:58] paladox: it was logging fine on neon [21:59:00] there we go [21:59:04] Oh [21:59:32] !log upgrading cp2011 to varnish 4 T131502 [21:59:33] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [21:59:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [21:59:46] Oh wait, isn't there going to be a huge puppet recovery spam now [21:59:54] paladox: yes! :) [21:59:59] LOL :) [22:00:04] !log ema@palladium conftool action : set/pooled=no; selector: cp2011.codfw.wmnet [22:00:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:00:22] * paladox goes back to submitting a patch to gerrit to improve gerrit :) [22:00:49] paladox: careful to not enter in an infinite recursive loop ;) [22:00:58] *not to enter [22:01:03] volans oh what do you mean? 
[22:01:16] bd808: probably the right thing to do (for now at least) is move your role into modules/role/manifests/labs [22:01:21] sending a patch to gerrit to change gerrit [22:01:31] (just a joke ;) ) [22:01:51] volans lol, it is to fix T144565 [22:01:51] T144565: Gerrit's new side-by-side diff screen sometimes cuts off the last few characters of a line - https://phabricator.wikimedia.org/T144565 [22:01:55] RECOVERY - puppet last run on logstash1003 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [22:02:32] volans ^^ it will be a new pref, and my first time at java, ive been working on the change for most of the day. [22:02:42] pref = preference [22:02:51] 06Operations, 07Puppet: Add a Jenkins check that forbids creation of /modules/role/manifests/*.pp - https://phabricator.wikimedia.org/T144774#2609873 (10Andrew) [22:02:54] oh is broken in the backend? [22:03:08] volans what do you mean? [22:03:13] _joe_: (if still here) am I understanding the issue properly in https://phabricator.wikimedia.org/T144774 ? [22:03:17] Is that for gerrit or a different question? 
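The check requested in T144774 (forbid `*.pp` files directly under `modules/role/manifests/`) could be sketched roughly as below. This is only an illustration in Python, not the test that was actually submitted in change 308666; the function name `top_level_role_manifests` and the command-line handling are hypothetical.

```python
import sys
from pathlib import Path


def top_level_role_manifests(repo_root):
    """Return *.pp files sitting directly in modules/role/manifests/.

    Files in subdirectories (for example modules/role/manifests/labs/)
    are not reported, since only first-level roles trigger the problem
    described in the log.
    """
    manifests = Path(repo_root) / "modules" / "role" / "manifests"
    if not manifests.is_dir():
        return []
    # glob("*.pp") is non-recursive, so nested manifests are skipped.
    return sorted(p for p in manifests.glob("*.pp") if p.is_file())


if __name__ == "__main__":
    # Hypothetical CLI: pass the checkout path, default to the current dir.
    offenders = top_level_role_manifests(sys.argv[1] if len(sys.argv) > 1 else ".")
    for path in offenders:
        print(f"first-level role manifest not allowed: {path}")
    sys.exit(1 if offenders else 0)
```

Keeping a script like this inside the puppet repository, rather than in integration/config, matches the self-serve reasoning given later in the log.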
[22:03:19] I hoped was just bad CSS [22:03:33] volans nope, we tryed css but that caused it to be a bit ugly [22:03:42] so we reverted it last week [22:03:59] !log ema@palladium conftool action : set/pooled=yes; selector: cp2011.codfw.wmnet [22:04:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:04:47] paladox: ok, didn't know [22:05:01] RECOVERY - puppet last run on iron is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [22:05:03] RECOVERY - puppet last run on db1078 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [22:05:03] RECOVERY - puppet last run on ms-be1023 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [22:05:13] Yep, its all codemirror's fault, i just wish they do lineWrapping by default [22:05:14] RECOVERY - puppet last run on db1080 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [22:05:14] RECOVERY - puppet last run on ganeti1003 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [22:05:15] RECOVERY - puppet last run on scb2002 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [22:05:24] since gerrit has chosen to use scroll bars by default [22:05:25] LOL [22:05:27] RECOVERY - puppet last run on db2018 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [22:05:33] RECOVERY - puppet last run on db1035 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [22:05:44] RECOVERY - puppet last run on rdb2001 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [22:05:45] RECOVERY - puppet last run on cp4003 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [22:05:53] RECOVERY - puppet last run on aqs1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:05:54] RECOVERY - puppet last run on cp1045 
is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [22:06:09] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:06:09] RECOVERY - puppet last run on cp4004 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:06:09] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [22:06:11] if too noisy I can stop it for a bit (icinga-wm0 [22:06:13] RECOVERY - puppet last run on cp3037 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [22:06:25] RECOVERY - puppet last run on restbase2001 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [22:06:25] RECOVERY - puppet last run on pybal-test2003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:06:34] RECOVERY - puppet last run on db1037 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:06:34] RECOVERY - puppet last run on db1023 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [22:06:36] RECOVERY - puppet last run on db2040 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:06:40] (03PS1) 10BryanDavis: google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308664 (https://phabricator.wikimedia.org/T144290) [22:06:46] RECOVERY - puppet last run on es2001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [22:07:03] RECOVERY - puppet last run on mc1015 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [22:07:03] RECOVERY - puppet last run on thumbor1001 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [22:07:03] RECOVERY - puppet last run on ms-fe2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 
failures [22:07:03] RECOVERY - puppet last run on dbproxy1007 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [22:07:04] RECOVERY - puppet last run on cp3038 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [22:07:05] RECOVERY - puppet last run on db2052 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:07:13] RECOVERY - puppet last run on labsdb1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:07:14] RECOVERY - puppet last run on restbase-test2003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:07:15] RECOVERY - puppet last run on cp4011 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:07:28] RECOVERY - puppet last run on radon is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:07:30] RECOVERY - puppet last run on rdb1008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:07:31] RECOVERY - puppet last run on db1092 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [22:08:00] !log stopped ircecho on neon to avoid the spam of recovery, monitoring icinga, I'll re-enable it in a bit [22:08:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:08:22] (or puppet will re-enable it for me :) ) [22:08:22] volans theres the #wikimedia-offtopic channel :) [22:08:29] lol [22:08:43] (03PS2) 10BryanDavis: google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308664 (https://phabricator.wikimedia.org/T144290) [22:09:01] andrewbogott: ^ I think that will avoid the bug (which I should have remembered) [22:09:21] * andrewbogott tries to remember how to add jenkins tests [22:10:22] andrewbogott through integration/config and then deploy with zuul, or you can if you have permissions to do it, do it through 
jenkins gui [22:13:21] (03PS3) 10BryanDavis: google_api_proxy: Add role for Google-api-proxy Labs project [puppet] - 10https://gerrit.wikimedia.org/r/308664 (https://phabricator.wikimedia.org/T144290) [22:23:35] 06Operations, 07Puppet: Add a Jenkins check that forbids creation of /modules/role/manifests/*.pp - https://phabricator.wikimedia.org/T144774#2609923 (10Legoktm) a:05Joe>03Legoktm [22:24:12] !log upgrading cp2008 to varnish 4 T131502 [22:24:13] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [22:24:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:24:29] !log restarting ircecho on neon to get back icinga-wm [22:24:31] !log ema@palladium conftool action : set/pooled=no; selector: cp2008.codfw.wmnet [22:24:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:28:00] !log ema@palladium conftool action : set/pooled=yes; selector: cp2008.codfw.wmnet [22:28:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:30:02] 06Operations: APT and Puppet failing on ms-be1022 - https://phabricator.wikimedia.org/T144776#2609927 (10Volans) [22:35:48] (03PS1) 10Legoktm: Add test to ensure there are no *.pp files in modules/role/manifests/ [puppet] - 10https://gerrit.wikimedia.org/r/308666 (https://phabricator.wikimedia.org/T144774) [22:36:01] volans yay i finally submitted https://gerrit-review.googlesource.com/85355 [22:36:48] 06Operations: ganglia-monitor and puppet failing on bast3001 - https://phabricator.wikimedia.org/T144778#2609955 (10Volans) [22:36:58] (03PS2) 10Legoktm: Add test to ensure there are no *.pp files in modules/role/manifests/ [puppet] - 10https://gerrit.wikimedia.org/r/308666 (https://phabricator.wikimedia.org/T144774) [22:37:11] paladox: \o/ [22:37:22] :) [22:47:54] volans i made a mistake with that patch lol [22:48:23] rotfl, sorry dind't look at it, I'm finishing something else [22:48:55] Ok, i 
emailed a gerrit user on the problem [22:49:33] legoktm: I have no reason to prefer my patch over yours… any preference? [22:49:57] !log upgrading cp2005 to varnish 4 T131502 [22:49:57] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [22:50:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:50:04] (except for the part where I can't write yaml to save my life, apparently.) [22:50:22] !log ema@palladium conftool action : set/pooled=no; selector: cp2005.codfw.wmnet [22:50:24] andrewbogott: we want to keep as much repo-specific logic out of integration/config as possible, that way CI is self-serve. In case the test needs to be modified in the future, if it's in ops/puppet, you can do it whenever, but if it's in CI, then you're blocked on the CI team [22:50:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:50:38] legoktm: ok [22:51:10] (03PS3) 10Volans: Automation: automatically reimage host [puppet] - 10https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536) [22:52:12] (03CR) 10Andrew Bogott: [C: 031] Add test to ensure there are no *.pp files in modules/role/manifests/ [puppet] - 10https://gerrit.wikimedia.org/r/308666 (https://phabricator.wikimedia.org/T144774) (owner: 10Legoktm) [22:54:04] (03CR) 10Volans: "@Moritz, both fixed, see my replies inline" (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/308520 (https://phabricator.wikimedia.org/T143536) (owner: 10Volans) [22:55:05] !log ema@palladium conftool action : set/pooled=yes; selector: cp2005.codfw.wmnet [22:55:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [22:55:14] time for bed... 
ttyl [22:55:35] 'night volans [22:56:07] thanks, you should too (is still in the same TZ ;) ) [22:56:12] s/is/if/ [22:56:47] jouncebot: next [22:56:47] In 0 hour(s) and 3 minute(s): Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2300) [22:56:58] jouncebot: now [22:57:10] well that's not good [22:57:13] lol bd808 you killed it [22:57:22] testing new code :) [22:58:15] (03CR) 10BryanDavis: [C: 04-1] "This needs some work. It crashed the bot when there was no active window." [wikimedia/bots/jouncebot] - 10https://gerrit.wikimedia.org/r/308086 (owner: 10BryanDavis) [22:58:33] jouncebot: next [22:58:33] In 0 hour(s) and 1 minute(s): Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2300) [22:59:59] is wikimania2015 closed yet? [23:00:04] RoanKattouw, ostriches, MaxSem, and Dereckson: Dear anthropoid, the time has come. Please deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2300). [23:00:04] Dereckson: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [23:00:08] s/yet/already/ [23:00:34] pretend you've not read anything [23:00:45] (listed on the page) [23:02:22] Hello [23:02:46] It's scheduled for this SWAT. [23:02:56] I see [23:03:06] Let's do that so. 
[23:03:20] * mafk does not have any patch for SWAT today [23:03:49] (03PS9) 10Dereckson: Closing wikimania2015wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/298772 (https://phabricator.wikimedia.org/T139032) (owner: 10MarcoAurelio) [23:05:32] (03CR) 10Dereckson: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/298772 (https://phabricator.wikimedia.org/T139032) (owner: 10MarcoAurelio) [23:05:33] ah, that's mine yep [23:06:01] (03Merged) 10jenkins-bot: Closing wikimania2015wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/298772 (https://phabricator.wikimedia.org/T139032) (owner: 10MarcoAurelio) [23:06:37] mafk: live on mw1099 [23:06:49] checking [23:07:23] https://wikimania2015.wikimedia.org/w/index.php?title=MediaWiki:Sitenotice&curid=14466&diff=53214&oldid=53174 [23:07:38] can you edit? [23:07:40] perhaps remove the site notice before commit? [23:07:45] Nope [23:07:58] stewards can edit closed wikis so I can't really check much [23:08:04] will see listgrouprights [23:08:09] hold on [23:08:38] (03PS2) 10BryanDavis: Add a "now" command [wikimedia/bots/jouncebot] - 10https://gerrit.wikimedia.org/r/308086 [23:08:39] https://wikimania2015.wikimedia.org/w/index.php?title=Social_Events&action=edit -> I can edit on prod, I correctly see a read only view on mw1099 [23:09:12] listgrouprights shows edit rights removed from '*' [23:09:27] good [23:09:51] strangely the createpage and createpage appears on mw1099, I think they should be removed too; but maybe w/o the 'edit' rights they don't work? [23:10:00] I'll delete the sitenotice later [23:10:01] I'm not sure [23:10:18] https://wikitech.wikimedia.org/wiki/Close_a_wiki only suggest to edit groupOverrides [23:10:34] (but now, perhaps this doc should be updated too) [23:10:55] * Dereckson check create page [23:11:12] There is currently no text in this page. 
You can search for this page title in other pages, or search the related logs, but you do not have permission to create this page. [23:12:03] jouncebot: now [23:12:03] For the next -1 hour(s) and 12 minute(s): Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2300) [23:12:12] almost.. [23:12:49] Dereckson: a hack in closedwikis would do it (adding more rights to remove) [23:13:03] anyway, no edits are possible [23:13:06] so it's closed [23:13:14] Okay, let's merge [23:13:32] well, deploy [23:14:10] !log dereckson@tin Synchronized wmf-config/: Close wikimania2015 (T139032). So long and thanks for all the fish. (duration: 00m 51s) [23:14:11] T139032: Close wikimania2015wiki - https://phabricator.wikimedia.org/T139032 [23:14:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:15:13] err [23:16:01] dblist isn't in wmf-config/ [23:16:29] lol @ deploymsg [23:17:43] !log upgrading cp2002 to varnish 4 T131502 [23:17:43] !log dereckson@tin Synchronized dblists/closed.dblist: Close wikimania2015 (T139032) dblist update (duration: 00m 47s) [23:17:44] T131502: Convert upload cluster to Varnish 4 - https://phabricator.wikimedia.org/T131502 [23:17:44] T139032: Close wikimania2015wiki - https://phabricator.wikimedia.org/T139032 [23:17:44] Reedy: ping? 
[23:17:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:17:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:17:56] !log ema@palladium conftool action : set/pooled=no; selector: cp2002.codfw.wmnet [23:18:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:18:22] Dereckson: the dblists should be in a subdirectory [23:18:30] yep [23:18:38] https://github.com/wikimedia/operations-mediawiki-config/tree/master/dblists [23:19:08] bd808: yes, it's synced now, but one moment I thought it was wmf-config/dblists [23:19:23] *nod* [23:19:24] https://phabricator.wikimedia.org/diffusion/OMWC/browse/master/dblists/ [23:20:18] So I confirm I lost the permission to edit https://wikimania2015.wikimedia.org/w/index.php?title=Friendly_space&action=edit [23:20:26] Closed. [23:20:59] I've just deleted the sitenotice [23:22:33] Thanks. [23:22:44] well, everything is done for wm2015 I think [23:22:47] Next: Allow wikitech to write files for Math [23:23:01] Dereckson: did you sync. InitialiseSettings as well? [23:23:21] (03PS2) 10Dereckson: Set wgMathFileBackend to false for wikitech wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308117 (https://phabricator.wikimedia.org/T126628) [23:23:24] !log ema@palladium conftool action : set/pooled=yes; selector: cp2002.codfw.wmnet [23:23:24] mafk: yup [23:23:28] i see wmf-config/ but nothing else, does it mean that the whole directory was sync? [23:23:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:23:38] that's right [23:23:49] how clever I am :P [23:23:52] thank you [23:24:04] (03CR) 10Dereckson: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308117 (https://phabricator.wikimedia.org/T126628) (owner: 10Dereckson) [23:24:21] Let's see what the new error in the Math on wikitech quest is. 
[23:24:28] (03Merged) 10jenkins-bot: Set wgMathFileBackend to false for wikitech wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/308117 (https://phabricator.wikimedia.org/T126628) (owner: 10Dereckson) [23:26:06] (03PS3) 10BryanDavis: Add a "now" command [wikimedia/bots/jouncebot] - 10https://gerrit.wikimedia.org/r/308086 [23:27:01] !log dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Set wgMathFileBackend to false for wikitech wikis (T126628) (duration: 00m 48s) [23:27:03] T126628: Allow wikitech to write files for Math - https://phabricator.wikimedia.org/T126628 [23:27:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:28:57] That part seems to work. We now just need a Puppet change to serve /srv/math-images as /wiki/images/math and we're done. [23:29:23] jouncebot: now [23:29:23] For the next 0 hour(s) and 30 minute(s): Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160905T2300) [23:29:24] Krenair: ping? [23:29:32] pong [23:29:38] jouncebot: next [23:29:38] In 13 hour(s) and 30 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160906T1300) [23:29:44] Could you confirm /srv/math-images/6/1/9/619a7845480ba7a8a749dc56a6de7c60.png exists? [23:29:49] on silver? [23:29:52] yes [23:30:00] you know you can log into that yourself right? :) [23:30:17] -rwxrwxrwx 1 www-data www-data 1268 Sep 5 23:27 /srv/math-images/6/1/9/619a7845480ba7a8a749dc56a6de7c60.png [23:30:18] ah, I thought it was a restricted host [23:30:25] yeah, to deployers [23:30:36] only deployers and ops can log in [23:31:06] Okay, so, 308110 works. 
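The jouncebot replies in this log use an "N hour(s) and M minute(s)" format, and one earlier reply came out as "-1 hour(s) and 12 minute(s)". A minimal sketch of producing that format without negative values is shown below. It is not jouncebot's code; the function name `format_remaining` is hypothetical, and clamping at zero is an assumption about the desired behaviour.

```python
from datetime import datetime, timedelta


def format_remaining(now, end):
    """Format the time left until `end` as 'H hour(s) and M minute(s)'.

    A deadline already in the past is clamped to zero instead of
    producing negative hours; leftover seconds are dropped.
    """
    remaining = max(end - now, timedelta(0))
    hours, rest = divmod(int(remaining.total_seconds()), 3600)
    minutes = rest // 60
    return f"{hours} hour(s) and {minutes} minute(s)"


# Time of the "jouncebot: next" request and the start of the next window,
# both taken from the log (deploycal-item-20160906T1300).
print(format_remaining(datetime(2016, 9, 5, 23, 29, 38),
                       datetime(2016, 9, 6, 13, 0)))
```

With the timestamps above, the result agrees with the "In 13 hour(s) and 30 minute(s)" reply in the log.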
[23:31:07] almost all hosts are restricted to a group or two [23:31:21] only bast[1-4]001 and rutherfordium are not [23:33:42] 06Operations, 06Labs, 10Wikimedia-Site-requests, 10wikitech.wikimedia.org, 13Patch-For-Review: Enable math extension on wikitech - https://phabricator.wikimedia.org/T126338#2610008 (10Dereckson) Math extension can now write files successfully in /srv/math-images folder. Last piece of the configuration i... [23:34:19] (03CR) 10BryanDavis: [V: 032] "Cherry-picked and tested live:" [wikimedia/bots/jouncebot] - 10https://gerrit.wikimedia.org/r/308086 (owner: 10BryanDavis) [23:35:56] and I guess the hosts which allow more than 2 groups... [23:42:59] (03PS1) 10Dereckson: Wikitech: Serve /srv/math-images as /w/images/math [puppet] - 10https://gerrit.wikimedia.org/r/308671 (https://phabricator.wikimedia.org/T126628) [23:43:08] Krenair: we had another solution than /srv/math-images: write to /srv/org/wikimedia/controller/wikis/images/math [23:43:42] if you really want to add to /srv/org... [23:43:47] ok [23:44:12] No, I don't especially want that, I just noticed that writing the Apache change. [23:46:48] (03PS2) 10Dereckson: Wikitech: Serve /srv/math-images as /w/images/math [puppet] - 10https://gerrit.wikimedia.org/r/308671 (https://phabricator.wikimedia.org/T126338) [23:47:22] PROBLEM - puppet last run on db2069 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues