[01:04:00] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3916113 (Tbayer) >>! In T185350#3913845, @Nuria wrote: ... > it is likely that for media downloaded in chunks the field doesn't reflect file s...
[02:37:23] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.17) (duration: 07m 24s)
[02:37:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:25:25] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 748.17 seconds
[03:53:34] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 257.24 seconds
[04:32:44] PROBLEM - Check Varnish expiry mailbox lag on cp4021 is CRITICAL: CRITICAL: expiry mailbox lag is 2116243
[04:33:16] (PS3) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[04:33:41] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[04:57:38] (PS4) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[04:57:41] (PS1) Andrew Bogott: striker: rename role class to profile [puppet] - https://gerrit.wikimedia.org/r/405669
[04:58:11] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[04:58:37] (CR) jerkins-bot: [V: -1] striker: rename role class to profile [puppet] -
https://gerrit.wikimedia.org/r/405669 (owner: Andrew Bogott)
[05:01:16] (PS2) Andrew Bogott: striker: rename role class to profile [puppet] - https://gerrit.wikimedia.org/r/405669
[05:01:18] (PS5) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[05:01:59] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[05:02:14] PROBLEM - Check Varnish expiry mailbox lag on cp4022 is CRITICAL: CRITICAL: expiry mailbox lag is 2021916
[05:37:15] Hello
[05:37:31] Please See https://phabricator.wikimedia.org/T185439
[05:37:53] Something went wrong with gerrit
[05:38:27] Paladox: Hey
[05:39:46] paladox: Hey
[05:53:39] (PS3) Andrew Bogott: striker: rename role class to profile [puppet] - https://gerrit.wikimedia.org/r/405669
[05:53:41] (PS6) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[05:53:43] (PS1) Andrew Bogott: role::client::labs: remove hiera defaults that rely on ::global vars [puppet] - https://gerrit.wikimedia.org/r/405671
[05:54:46] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[06:18:41] !log Compress ruwiki on db1102 - T182450
[06:18:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:18:55] T182450: db1102 (sanitarium) filling up (WAS: Clean up old binlogs from db1102 (sanitarium multi-instance)) - https://phabricator.wikimedia.org/T182450
[06:27:59] (PS1) Marostegui: db-eqiad.php:
Move db1063 to s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405672 (https://phabricator.wikimedia.org/T184397)
[06:28:58] (PS2) Marostegui: db-eqiad.php: Move db1063 to s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405672 (https://phabricator.wikimedia.org/T184397)
[06:36:11] (CR) Marostegui: [C: 2] db-eqiad.php: Move db1063 to s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405672 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[06:39:32] (Merged) jenkins-bot: db-eqiad.php: Move db1063 to s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405672 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[06:39:44] (CR) jenkins-bot: db-eqiad.php: Move db1063 to s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405672 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[06:41:00] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Move db1063 from s8 to s6 - T184397 (duration: 00m 58s)
[06:41:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:41:12] T184397: Decommission db1030 - https://phabricator.wikimedia.org/T184397
[06:49:24] (PS1) Marostegui: mariadb: Move db1063 to s6 [puppet] - https://gerrit.wikimedia.org/r/405673 (https://phabricator.wikimedia.org/T184397)
[06:56:12] (CR) Marostegui: "https://puppet-compiler.wmflabs.org/compiler02/9798/" [puppet] - https://gerrit.wikimedia.org/r/405673 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[06:56:15] (CR) Marostegui: [C: 2] mariadb: Move db1063 to s6 [puppet] - https://gerrit.wikimedia.org/r/405673 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[07:03:05] (PS1) Marostegui: db-eqiad.php: Depool db1089 and db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405674 (https://phabricator.wikimedia.org/T162807)
[07:04:24] RECOVERY - Disk space on labtestnet2001 is OK: DISK OK
[07:04:26] !log truncated
/var/log/upstart/neutron-server.log on labtestnet2001 - / disk space exhausted
[07:04:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:04:54] Cc: andrewbogott,chasemp,madhuvishy ---^
[07:05:06] Puppet is also disabled with no reason..
[07:06:05] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1089 and db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405674 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[07:07:31] (Merged) jenkins-bot: db-eqiad.php: Depool db1089 and db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405674 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[07:07:41] (CR) jenkins-bot: db-eqiad.php: Depool db1089 and db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405674 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[07:09:00] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 56s)
[07:09:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:09:11] T162807: Run pt-table-checksum on s1 (enwiki) - https://phabricator.wikimedia.org/T162807
[07:11:39] !log Stop replication in sync db1089 and db1067 - T162807
[07:11:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:32:47] PROBLEM - Check Varnish expiry mailbox lag on cp4024 is CRITICAL: CRITICAL: expiry mailbox lag is 2138569
[07:40:58] (PS1) Marostegui: db-eqiad.php: Repool db1067, depool db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405675 (https://phabricator.wikimedia.org/T162807)
[07:43:15] (CR) Marostegui: [C: 2] db-eqiad.php: Repool db1067, depool db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405675 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[07:44:43] (Merged) jenkins-bot: db-eqiad.php: Repool db1067, depool db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405675
(https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[07:46:47] (CR) jenkins-bot: db-eqiad.php: Repool db1067, depool db1067 [mediawiki-config] - https://gerrit.wikimedia.org/r/405675 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[07:46:50] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1067, depool db1066 - T162807 (duration: 00m 56s)
[07:47:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:47:03] T162807: Run pt-table-checksum on s1 (enwiki) - https://phabricator.wikimedia.org/T162807
[07:51:03] !log Stop replication in sync db1089 and db1066 - T162807
[07:51:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:56:32] (PS1) Marostegui: db-eqiad.php: Repool db1066, depool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405676 (https://phabricator.wikimedia.org/T162807)
[08:15:39] hello
[08:15:59] Please see https://phabricator.wikimedia.org/T185439
[08:18:12] "The page you requested was not found, or you do not have permission to view this page."
is what I'm getting when I try to view it
[08:31:55] !log upgrading video scalers to HHVM 3.18.7
[08:32:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:33:52] (PS1) Marostegui: netboot.cfg: Fixing typo [puppet] - https://gerrit.wikimedia.org/r/405678
[08:35:16] (CR) Marostegui: [C: 2] netboot.cfg: Fixing typo [puppet] - https://gerrit.wikimedia.org/r/405678 (owner: Marostegui)
[08:37:30] (PS1) Jcrespo: mariadb: Pool db1067 permanently with higher weight [mediawiki-config] - https://gerrit.wikimedia.org/r/405679
[08:37:48] (CR) Marostegui: [C: 1] mariadb: Pool db1067 permanently with higher weight [mediawiki-config] - https://gerrit.wikimedia.org/r/405679 (owner: Jcrespo)
[08:38:41] (CR) Jcrespo: [C: 2] mariadb: Pool db1067 permanently with higher weight [mediawiki-config] - https://gerrit.wikimedia.org/r/405679 (owner: Jcrespo)
[08:39:10] Jayprakash12345: was it pushed as a draft by any chance?
[08:40:07] (Merged) jenkins-bot: mariadb: Pool db1067 permanently with higher weight [mediawiki-config] - https://gerrit.wikimedia.org/r/405679 (owner: Jcrespo)
[08:40:18] (CR) jenkins-bot: mariadb: Pool db1067 permanently with higher weight [mediawiki-config] - https://gerrit.wikimedia.org/r/405679 (owner: Jcrespo)
[08:41:46] !log jynus@tin Synchronized wmf-config/db-eqiad.php: Increase db1067 weight (duration: 00m 56s)
[08:41:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:42:08] p858snake|L: I Publish Edit on https://gerrit.wikimedia.org/r/#/c/405590/ . jenkin-bot were give me -1 Build. Because I forget to add , in array. However When I Re-edit patch. and going to Publish.
Then It shows error
[08:42:26] interesting
[08:42:41] the change definitely existed and was public, it's in the logs on https://wm-bot.wmflabs.org/logs/%23wikimedia-dev/
[08:42:49] p858snake|L: Re-edit patch was in Draft
[08:43:22] Jayprakash12345: i can access this: https://gerrit.wikimedia.org/r/#/c/405590/1/
[08:43:50] and up to patchset 5: https://gerrit.wikimedia.org/r/#/c/405590/5/
[08:44:01] MatmaRex_mobile: But I cant
[08:44:03] but ps6 and ps7 disappeared or are drafts
[08:44:12] Jayprakash12345: its probably best to wait till no_justification is available
[08:44:18] Jayprakash12345: you probably can see that one. with the 1/ at the end
[08:44:19] I can't remember how to fix that issue
[08:44:53] it's only the latest patchset of your changeset that is somehow curiously broken
[08:45:30] i can't comment on the phab task right now but you might want to note this there
[08:46:37] MatmaRex_mobile: Yeah, you are right I can access https://gerrit.wikimedia.org/r/#/c/405590/1/
[08:55:14] (PS5) Gehel: Revert "Updates to enable short URLs for transliteration for crhwiki - beta" [puppet] - https://gerrit.wikimedia.org/r/405048 (https://phabricator.wikimedia.org/T23582) (owner: Tjones)
[08:55:53] (CR) Gehel: [C: 2] Revert "Updates to enable short URLs for transliteration for crhwiki - beta" [puppet] - https://gerrit.wikimedia.org/r/405048 (https://phabricator.wikimedia.org/T23582) (owner: Tjones)
[09:06:15] (PS2) Marostegui: db-eqiad.php: Repool db1066, depool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405676 (https://phabricator.wikimedia.org/T162807)
[09:10:53] (CR) Marostegui: [C: 2] db-eqiad.php: Repool db1066, depool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405676 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[09:12:33] (Merged) jenkins-bot: db-eqiad.php: Repool db1066, depool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405676
(https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[09:12:47] (CR) jenkins-bot: db-eqiad.php: Repool db1066, depool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405676 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[09:13:52] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 00m 56s)
[09:14:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:14:03] T162807: Run pt-table-checksum on s1 (enwiki) - https://phabricator.wikimedia.org/T162807
[09:14:59] (PS1) Marostegui: db-eqiad.php: Depool db1030 [mediawiki-config] - https://gerrit.wikimedia.org/r/405681 (https://phabricator.wikimedia.org/T184397)
[09:18:04] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1030 [mediawiki-config] - https://gerrit.wikimedia.org/r/405681 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[09:19:33] (Merged) jenkins-bot: db-eqiad.php: Depool db1030 [mediawiki-config] - https://gerrit.wikimedia.org/r/405681 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[09:19:53] (CR) jenkins-bot: db-eqiad.php: Depool db1030 [mediawiki-config] - https://gerrit.wikimedia.org/r/405681 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[09:20:46] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1030 - T184397 (duration: 00m 56s)
[09:20:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:20:59] T184397: Decommission db1030 - https://phabricator.wikimedia.org/T184397
[09:21:08] !log Stop MySQL on db1030 to clone db1063 - T184397
[09:21:14] (CR) Muehlenhoff: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/404967 (owner: Muehlenhoff)
[09:21:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:25:05] !log Stop replication in sync db1099:3311 and db1089 - T162807
[09:25:16] Logged the message
at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:25:17] T162807: Run pt-table-checksum on s1 (enwiki) - https://phabricator.wikimedia.org/T162807
[09:28:14] (CR) Alexandros Kosiaris: [C: -1] ircecho: Support ssl when connecting to irc (6 comments) [puppet] - https://gerrit.wikimedia.org/r/405591 (owner: Paladox)
[09:30:32] (CR) jerkins-bot: [V: -1] Depool poolcounter1002 [mediawiki-config] - https://gerrit.wikimedia.org/r/404967 (owner: Muehlenhoff)
[09:30:53] (PS3) Zfilipin: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: Urbanecm)
[09:37:38] (CR) Alexandros Kosiaris: [C: -1] ircecho: Support auth over irc (2 comments) [puppet] - https://gerrit.wikimedia.org/r/405594 (owner: Paladox)
[09:39:43] (PS1) Marostegui: db-eqiad.php: Repool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405683 (https://phabricator.wikimedia.org/T162807)
[09:41:04] (CR) Alexandros Kosiaris: [C: 1] admin: Use the debian staff group for ops (1 comment) [puppet] - https://gerrit.wikimedia.org/r/331602 (owner: Alexandros Kosiaris)
[09:41:18] (PS4) Alexandros Kosiaris: admin: Use the debian staff group for ops [puppet] - https://gerrit.wikimedia.org/r/331602
[09:43:08] (PS2) Marostegui: db-eqiad.php: Repool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405683 (https://phabricator.wikimedia.org/T162807)
[09:44:58] (PS1) Jcrespo: mariadb: Repool db2036 after reimage [mediawiki-config] - https://gerrit.wikimedia.org/r/405684
[09:45:23] (CR) Jcrespo: [C: 1] mariadb: Repool db2036 after reimage [mediawiki-config] - https://gerrit.wikimedia.org/r/405684 (owner: Jcrespo)
[09:47:05] (CR) Marostegui: [C: 1] mariadb: Repool db2036 after reimage [mediawiki-config] - https://gerrit.wikimedia.org/r/405684 (owner: Jcrespo)
[09:47:11] (PS1)
Ladsgroup: Enable fine grained lua tracking for arwiki, fawiki, viwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/405685 (https://phabricator.wikimedia.org/T185032)
[09:47:15] (CR) Marostegui: [C: 2] db-eqiad.php: Repool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405683 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[09:48:27] (CR) Jcrespo: [C: 2] mariadb: Repool db2036 after reimage [mediawiki-config] - https://gerrit.wikimedia.org/r/405684 (owner: Jcrespo)
[09:50:02] (Merged) jenkins-bot: db-eqiad.php: Repool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405683 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[09:50:19] (CR) jenkins-bot: db-eqiad.php: Repool db1099:3311 [mediawiki-config] - https://gerrit.wikimedia.org/r/405683 (https://phabricator.wikimedia.org/T162807) (owner: Marostegui)
[09:50:20] !log running heavy reads on db2043, db2036 to try to reproduce s3 codfw crash
[09:50:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:51:20] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
[09:51:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:51:31] T162807: Run pt-table-checksum on s1 (enwiki) - https://phabricator.wikimedia.org/T162807
[09:51:52] (Merged) jenkins-bot: mariadb: Repool db2036 after reimage [mediawiki-config] - https://gerrit.wikimedia.org/r/405684 (owner: Jcrespo)
[09:53:33] (CR) jenkins-bot: mariadb: Repool db2036 after reimage [mediawiki-config] - https://gerrit.wikimedia.org/r/405684 (owner: Jcrespo)
[09:56:19] (CR) Hashar: [C: 1] Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: Urbanecm)
[09:58:31] (CR) Hashar: "I am not sure there is a point in
whitelisting every single website from the Ukrainian government. Usually we just white list the specifi" [mediawiki-config] - https://gerrit.wikimedia.org/r/405550 (https://phabricator.wikimedia.org/T185399) (owner: Urbanecm)
[09:59:25] (CR) Hashar: [C: 1] Enable fine grained lua tracking for arwiki, fawiki, viwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/405685 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[09:59:47] (CR) Hashar: [C: 1] Update officewiki logo, add HD logo for officewiki [mediawiki-config] - https://gerrit.wikimedia.org/r/403349 (https://phabricator.wikimedia.org/T184575) (owner: Urbanecm)
[10:29:08] (PS5) Volans: Migration to Python 3 [software/cumin] - https://gerrit.wikimedia.org/r/402059
[10:46:38] (PS1) Marostegui: s6,s8.hosts: Move db1063 to s6 [software] - https://gerrit.wikimedia.org/r/405688 (https://phabricator.wikimedia.org/T184397)
[10:48:43] (CR) Marostegui: [C: 2] s6,s8.hosts: Move db1063 to s6 [software] - https://gerrit.wikimedia.org/r/405688 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[10:49:26] (Merged) jenkins-bot: s6,s8.hosts: Move db1063 to s6 [software] - https://gerrit.wikimedia.org/r/405688 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[10:52:45] RECOVERY - Check Varnish expiry mailbox lag on cp4024 is OK: OK: expiry mailbox lag is 3
[11:00:04] jan_drewniak: I, the Bot under the Fountain, allow thee, The Deployer, to do Wikimedia Portals Update deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180122T1100).
[11:00:04] No GERRIT patches in the queue for this window AFAICS.
[11:02:19] (PS1) Volans: Refactor for Cumin 2.0.0 API [debs/debdeploy] - https://gerrit.wikimedia.org/r/405690
[11:15:06] (PS1) Muehlenhoff: Let debdeploy-server depend on cumin [debs/debdeploy] - https://gerrit.wikimedia.org/r/405692
[11:38:29] !log uploaded cumin_2.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
[11:38:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:45:55] !log jynus@tin Synchronized wmf-config/db-codfw.php: Repool db2036 (duration: 00m 57s)
[11:46:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:00:51] !log Change x1 codfw topology: db2034 to replicate from eqiad
[12:01:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:01:06] !log Change x1 codfw topology: db2034 to replicate from eqiad T184888
[12:01:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:01:18] T184888: Replace codfw x1 master (db2033) (WAS: Failed BBU on db2033 (x1 master)) - https://phabricator.wikimedia.org/T184888
[12:07:59] (PS1) Marostegui: mariadb: Disable notifications db1030 [puppet] - https://gerrit.wikimedia.org/r/405695 (https://phabricator.wikimedia.org/T184397)
[12:10:02] (PS1) Marostegui: db-eqiad.php: Pool db1063 as vslow in s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405696 (https://phabricator.wikimedia.org/T184397)
[12:11:29] (CR) jerkins-bot: [V: -1] db-eqiad.php: Pool db1063 as vslow in s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405696 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[12:12:07] (PS2) Marostegui: db-eqiad.php: Pool db1063 as vslow in s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405696 (https://phabricator.wikimedia.org/T184397)
[12:13:09] (CR) Marostegui: [C: 2] mariadb: Disable notifications db1030 [puppet] - https://gerrit.wikimedia.org/r/405695 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[12:15:10] (CR) Faidon Liambotis: [C: -1] lxc: Fix support for stretch (4 comments) [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: Paladox)
[12:22:31] !log upgrade mw1238-mw1258 to HHVM 3.18.7
[12:22:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:27:58] (CR) Marostegui: [C: 2] db-eqiad.php: Pool db1063 as vslow in s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405696 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[12:28:24] (CR) Volans: [C: 1] "LGTM" [debs/debdeploy] - https://gerrit.wikimedia.org/r/405692 (owner: Muehlenhoff)
[12:29:30] (Merged) jenkins-bot: db-eqiad.php: Pool db1063 as vslow in s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405696 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[12:29:43] (CR) jenkins-bot: db-eqiad.php: Pool db1063 as vslow in s6 [mediawiki-config] - https://gerrit.wikimedia.org/r/405696 (https://phabricator.wikimedia.org/T184397) (owner: Marostegui)
[12:30:49] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Pool db1063 as vslow - T184397 (duration: 00m 56s)
[12:31:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:31:05] T184397: Decommission db1030 - https://phabricator.wikimedia.org/T184397
[12:33:28] (CR) Muehlenhoff: [C: 2] Let debdeploy-server depend on cumin [debs/debdeploy] - https://gerrit.wikimedia.org/r/405692 (owner: Muehlenhoff)
[12:34:38] PROBLEM - puppet last run on mw1238 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 6 minutes ago with 2 failures.
Failed resources (up to 3 shown): Package[hhvm-dbg],Package[hhvm]
[12:35:08] (PS2) Volans: Refactor for Cumin 2.0.0 API [debs/debdeploy] - https://gerrit.wikimedia.org/r/405690
[12:36:12] (CR) Lucas Werkmeister (WMDE): "WBQC commit 41c26ca1ef is in wmf.17:" [mediawiki-config] - https://gerrit.wikimedia.org/r/403939 (https://phabricator.wikimedia.org/T180614) (owner: Lucas Werkmeister (WMDE))
[12:38:33] !log upgraded cumin on labpuppetmasters hosts to 2.0.0
[12:38:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:40:09] PROBLEM - DPKG on mw1244 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[12:41:18] RECOVERY - DPKG on mw1244 is OK: All packages OK
[12:43:36] (CR) Muehlenhoff: [C: 1] "Looks good" [debs/debdeploy] - https://gerrit.wikimedia.org/r/405690 (owner: Volans)
[12:45:28] (CR) Volans: [C: 2] Refactor for Cumin 2.0.0 API [debs/debdeploy] - https://gerrit.wikimedia.org/r/405690 (owner: Volans)
[12:51:23] (PS1) Gilles: Lower Thumbor subprocess timeout to 59 seconds [puppet] - https://gerrit.wikimedia.org/r/405698 (https://phabricator.wikimedia.org/T185479)
[12:59:39] RECOVERY - puppet last run on mw1238 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[13:10:38] PROBLEM - Host heze is DOWN: PING CRITICAL - Packet loss = 100%
[13:10:38] PROBLEM - Host helium is DOWN: PING CRITICAL - Packet loss = 100%
[13:11:08] RECOVERY - Host helium is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
[13:12:08] RECOVERY - Host heze is UP: PING OK - Packet loss = 0%, RTA = 36.18 ms
[13:45:54] (PS11) Paladox: lxc: Fix support for stretch [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377)
[13:45:56] (CR) Paladox: lxc: Fix support for stretch (4 comments) [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: Paladox)
[13:46:27] !log Force BBU relearn on db1016 - T166344
[13:46:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:46:44] T166344: db1016 m1 master: Possibly faulty BBU - https://phabricator.wikimedia.org/T166344
[13:48:11] (CR) Faidon Liambotis: [C: -1] lxc: Fix support for stretch (2 comments) [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: Paladox)
[13:48:27] Operations, ops-eqiad, DBA: db1016 m1 master: Possibly faulty BBU - https://phabricator.wikimedia.org/T166344#3917284 (Marostegui) This failed again - I have forced a relearn
[13:48:59] (PS12) Paladox: lxc: Fix support for stretch [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377)
[13:49:06] (CR) Paladox: lxc: Fix support for stretch (2 comments) [puppet] - https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: Paladox)
[13:53:18] (Draft1) Paladox: ircecho: Remove support for sysvinit script [puppet] - https://gerrit.wikimedia.org/r/405703
[13:53:21] (PS2) Paladox: ircecho: Remove support for sysvinit script [puppet] - https://gerrit.wikimedia.org/r/405703
[13:58:52] Operations, Puppet, User-Joe, cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3917303 (herron)
[13:58:54] Operations, Patch-For-Review: Puppet: Setting configtimeout is deprecated - https://phabricator.wikimedia.org/T182585#3917301 (herron) Open>Resolved a:herron
[14:00:04] addshore, hashar, anomie, no_justification, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: That opportune time is upon us again. Time for a European Mid-day SWAT(Max 8 patches) deploy. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180122T1400).
[14:00:04] Lucas_WMDE, Urbanecm, and Amir1: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[14:00:11] o/
[14:00:12] I can SWAT today
[14:00:19] Amir1: want to deploy your own changes?
[14:00:25] yup
[14:00:26] * Lucas_WMDE is here :)
[14:00:26] if so, go ahead :)
[14:00:34] Thanks
[14:00:49] I'm here
[14:01:20] (CR) Ladsgroup: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/405685 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[14:02:42] Lucas_WMDE: you are next, as soon as Amir1 is done
[14:02:49] cool, thanks
[14:02:55] (Merged) jenkins-bot: Enable fine grained lua tracking for arwiki, fawiki, viwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/405685 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[14:03:07] Lucas_WMDE: do you want to deploy your own changes?
[14:03:09] (CR) jenkins-bot: Enable fine grained lua tracking for arwiki, fawiki, viwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/405685 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[14:03:45] zeljkof: I don’t think I have the necessary access for that…
[14:03:59] ok, I'll deploy
[14:05:25] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:405685|Enable fine grained lua tracking for arwiki, fawiki, viwiki (T185032)]] (duration: 00m 57s)
[14:05:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:05:39] T185032: Enable lua fine grained usage tracking - Late January 2018 batch - https://phabricator.wikimedia.org/T185032
[14:05:49] The deploy is done.
Everything looks fine for now
[14:05:55] will keep monitoring things
[14:05:57] Amir1: ok, taking over
[14:06:04] Thanks :)
[14:06:08] (CR) Zfilipin: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/403939 (https://phabricator.wikimedia.org/T180614) (owner: Lucas Werkmeister (WMDE))
[14:06:13] thank you :)
[14:06:18] Operations, Puppet, Traffic: Puppet hosts with signed certificate present on agent but not master - https://phabricator.wikimedia.org/T185239#3917317 (herron)
[14:06:58] I see 105 cases of fatal in Quiz extension, do we have it deployed in WMF infra, that's interesting
[14:07:12] apparently
[14:07:23] I'm not familiar with it
[14:07:33] Lucas_WMDE: the commit will be at mwdebug1002 in a few minutes
[14:07:44] (Merged) jenkins-bot: Remove $wgWBQualityConstraintsIncludeDetailInApi setting [mediawiki-config] - https://gerrit.wikimedia.org/r/403939 (https://phabricator.wikimedia.org/T180614) (owner: Lucas Werkmeister (WMDE))
[14:07:54] (CR) jenkins-bot: Remove $wgWBQualityConstraintsIncludeDetailInApi setting [mediawiki-config] - https://gerrit.wikimedia.org/r/403939 (https://phabricator.wikimedia.org/T180614) (owner: Lucas Werkmeister (WMDE))
[14:07:54] ok thanks, I can test it on ApiSandbox then
[14:09:40] Urbanecm: please stand by, you are next :)
[14:09:48] Ok zeljkof
[14:10:09] Lucas_WMDE: the commit is at mwdebug1002, please test and let me know if I can deploy
[14:11:01] zeljkof: looks good, the config variable still seems to be in effect as expected
[14:11:10] ok, deploying
[14:11:28] !log cleanup leftover logrotate configuration on wdqs*
[14:11:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:12:24] !log zfilipin@tin Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:403939|Remove $wgWBQualityConstraintsIncludeDetailInApi setting (T180614)]] (duration: 00m 56s)
[14:12:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:12:35] T180614: Remove detail and detailHTML from wbcheckconstraints response - https://phabricator.wikimedia.org/T180614
[14:12:45] Lucas_WMDE: deployed, please check and thanks for deploying with #releng :)
[14:13:29] zeljkof: still looking good (server: mw1234.eqiad.wmnet), thank you for deploying :)
[14:14:39] Urbanecm: hashar had some comments for 405550
[14:14:55] (PS3) Zfilipin: Add several domains of Ukraine government to wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/405550 (https://phabricator.wikimedia.org/T185399) (owner: Urbanecm)
[14:15:35] are we uploading photos from government sites?
[14:16:14] zeljkof, I see. I don't think it is good point to override hashar's or anybody else's opinion, so I think it'll be better to solve this on-the-task and skip it for today.
[14:16:18] Is hashar here?
[14:16:30] Urbanecm: he is traveling today
[14:16:59] so please let's discuss this at the task
[14:17:06] skipping for today
[14:17:18] Oh, ok. So let's skip 405550 and process with 404088 (this is Poland government and no oppose there)
[14:17:24] Operations, Puppet, Patch-For-Review: Puppet hosts with their cert revoked can still run puppet - https://phabricator.wikimedia.org/T184444#3917345 (herron) Open>Resolved Resolving this task as agents with revoked certs are no longer able to run puppet. Follow-up task for issue 3 above (host...
[14:18:09] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: 10Urbanecm) [14:18:17] (03CR) 10Zfilipin: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: 10Urbanecm) [14:18:21] (03PS4) 10Zfilipin: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: 10Urbanecm) [14:18:33] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: 10Urbanecm) [14:19:26] Urbanecm: can you test 404088? [14:19:36] in general, I mean, it's still merging [14:19:54] (03PS1) 10Volans: Cumin: set PuppetDB api version to 3 [puppet] - 10https://gerrit.wikimedia.org/r/405712 (https://phabricator.wikimedia.org/T182575) [14:20:08] (03Merged) 10jenkins-bot: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: 10Urbanecm) [14:20:18] (03CR) 10jenkins-bot: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404088 (https://phabricator.wikimedia.org/T184853) (owner: 10Urbanecm) [14:21:07] (03CR) 10Volans: [C: 032] Cumin: set PuppetDB api version to 3 [puppet] - 10https://gerrit.wikimedia.org/r/405712 (https://phabricator.wikimedia.org/T182575) (owner: 10Volans) [14:21:57] zeljkof, you can deploy it into prod directly... [14:22:34] deploying [14:22:53] ack [14:22:57] are remaining two commits testable at mwdebug? 
[14:23:12] yes [14:23:25] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:404088|Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains (T184853)]] (duration: 00m 56s) [14:23:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:23:39] T184853: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains - https://phabricator.wikimedia.org/T184853 [14:25:28] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405256 (https://phabricator.wikimedia.org/T184553) (owner: 10Urbanecm) [14:26:36] !log uploaded debdeploy 0.0.99.2 for jessie-wikimedia, stretch-wikimedia, trusty-wikimedia to apt.wikimedia.org [14:26:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:27:04] (03CR) 10Zfilipin: Allow bureaucrats@mr.wiki to grant&revoke accountcreator [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405256 (https://phabricator.wikimedia.org/T184553) (owner: 10Urbanecm) [14:27:08] (03PS2) 10Zfilipin: Allow bureaucrats@mr.wiki to grant&revoke accountcreator [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405256 (https://phabricator.wikimedia.org/T184553) (owner: 10Urbanecm) [14:27:17] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405256 (https://phabricator.wikimedia.org/T184553) (owner: 10Urbanecm) [14:28:42] (03Merged) 10jenkins-bot: Allow bureaucrats@mr.wiki to grant&revoke accountcreator [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405256 (https://phabricator.wikimedia.org/T184553) (owner: 10Urbanecm) [14:28:53] (03CR) 10jenkins-bot: Allow bureaucrats@mr.wiki to grant&revoke accountcreator [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405256 (https://phabricator.wikimedia.org/T184553) (owner: 10Urbanecm) [14:29:16] (03PS1) 10Volans: Backends: add known hosts files backend [software/cumin] - 10https://gerrit.wikimedia.org/r/405719 [14:29:52] Urbanecm: 405256 is at mwdebug1002 
[14:30:02] ack [14:30:45] zeljkof, working, please deploy [14:31:02] deploying [14:31:15] ack [14:31:53] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:405256|Allow bureaucrats@mr.wiki to grant&revoke accountcreator (T184553)]] (duration: 00m 56s) [14:32:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:32:07] T184553: Let mr.wiki bureaucrats add/remove account creator flag - https://phabricator.wikimedia.org/T184553 [14:32:07] Urbanecm: deployed, please check [14:32:22] zeljkof, working, thanks [14:32:30] (03PS2) 10Zfilipin: Update officewiki logo, add HD logo for officewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/403349 (https://phabricator.wikimedia.org/T184575) (owner: 10Urbanecm) [14:33:40] PROBLEM - Disk space on labtestnet2001 is CRITICAL: DISK CRITICAL - free space: / 348 MB (3% inode=61%) [14:34:16] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/403349 (https://phabricator.wikimedia.org/T184575) (owner: 10Urbanecm) [14:35:18] Urbanecm: can you test 403349? should I do it? [14:35:40] zeljkof, yes, at mwdebug :) [14:35:42] (03Merged) 10jenkins-bot: Update officewiki logo, add HD logo for officewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/403349 (https://phabricator.wikimedia.org/T184575) (owner: 10Urbanecm) [14:36:13] Urbanecm: sorry, did not understand, will you test, or me? [14:36:39] RECOVERY - Disk space on labtestnet2001 is OK: DISK OK [14:36:44] !log truncate (again) /var/log/upstart/neutron-server.log on labtestnet2001 [14:36:49] (03CR) 10jenkins-bot: Update officewiki logo, add HD logo for officewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/403349 (https://phabricator.wikimedia.org/T184575) (owner: 10Urbanecm) [14:36:54] zeljkof, oh, understand. 
Relevant pages of officewiki is public so I can test [14:36:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:36:58] (can and will ofc) [14:36:58] andrewbogott, chasemp --^ [14:37:11] Urbanecm: ok, will be there in a minute [14:37:15] ack [14:37:41] elukey: thanks, I did that on Friday too. It's really chatty [14:37:52] Urbanecm: logos are at mwdebug [14:38:00] ack [14:38:29] zeljkof, logos are changed, so working [14:38:31] please deploy [14:38:36] deploying [14:39:54] !log zfilipin@tin Synchronized static/images/project-logos/: SWAT: [[gerrit:403349|Update officewiki logo, add HD logo for officewiki (T184575)]] (duration: 00m 56s) [14:39:57] ack [14:40:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:40:06] T184575: Update logo for Wikimedia Office - https://phabricator.wikimedia.org/T184575 [14:41:05] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:403349|Update officewiki logo, add HD logo for officewiki (T184575)]] (duration: 00m 56s) [14:41:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:41:51] Urbanecm: deployed, purged officewiki.png from cache [14:42:04] Great. [14:42:15] Seems it's everything at least for me :) [14:42:20] Amir1: Quiz errors dropped to 49 [14:42:30] Urbanecm: thanks for deploying with #releng ;) [14:42:31] zeljkof, should I mark something about the skipped patch? [14:42:36] please do [14:42:40] Thanks for the heads up :) [14:42:43] leave a comment in gerrit and phab [14:43:08] !log EU SWAT finished [14:43:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:43:40] 10Operations, 10Analytics-Kanban, 10Patch-For-Review: Puppet admin module should support adding system users to managed groups - https://phabricator.wikimedia.org/T174465#3917427 (10Ottomata) Yeah, this is a very common request, and in the past we've told people we can't do it. Their only recourse then is t... 
[14:47:50] zeljkof, I've marked it in calendar and at Phab [14:47:54] should I do it in Gerrit too? [14:49:10] Urbanecm: please do [14:49:17] Ok, will do [14:49:44] (03CR) 10Urbanecm: "Skipped due to Hashar's concerns." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405550 (https://phabricator.wikimedia.org/T185399) (owner: 10Urbanecm) [14:51:00] elukey: I have an irc ping from you somewhere but unsure what about [14:51:19] chasemp: o/ - "!log truncate (again) /var/log/upstart/neutron-server.log on labtestnet2001" [14:51:30] elukey: ok thanks, cheers [14:55:49] RECOVERY - puppet last run on labtestnet2001 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [14:59:08] !log upgrade mw1221-mw1235 to HHVM 3.18.7 [14:59:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:11:07] 10Operations, 10cloud-services-team: Onboard astorm to WMF - https://phabricator.wikimedia.org/T185493#3917465 (10chasemp) p:05Triage>03Normal [15:12:03] !log restarting jenkins [15:12:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:12:17] 10Operations, 10cloud-services-team: Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3917476 (10chasemp) [15:13:24] 10Operations, 10ops-eqiad, 10Analytics, 10Patch-For-Review: rack/setup/install notebook[34] - https://phabricator.wikimedia.org/T183935#3917477 (10Ottomata) Status? [15:25:50] 10Operations, 10Analytics, 10hardware-requests: EQIAD: (1) hardware request for eventlog1001 replacement - eventlog1002.
- https://phabricator.wikimedia.org/T184551#3917495 (10faidon) a:05faidon>03RobH Sounds good, please go ahead :) [15:26:12] (03PS2) 10Andrew Bogott: role::client::labs: remove hiera defaults that rely on ::global vars [puppet] - 10https://gerrit.wikimedia.org/r/405671 [15:26:14] (03PS4) 10Gehel: wdqs: cleanup JVM options for blazegraph [puppet] - 10https://gerrit.wikimedia.org/r/388026 (https://phabricator.wikimedia.org/T175919) [15:28:09] (03PS5) 10Gehel: wdqs: cleanup JVM options for blazegraph [puppet] - 10https://gerrit.wikimedia.org/r/388026 (https://phabricator.wikimedia.org/T175919) [15:28:44] (03CR) 10Andrew Bogott: [C: 032] role::client::labs: remove hiera defaults that rely on ::global vars [puppet] - 10https://gerrit.wikimedia.org/r/405671 (owner: 10Andrew Bogott) [15:38:15] 10Operations, 10Puppet: Upgrade puppetDB to version 3.2 or newer - https://phabricator.wikimedia.org/T177253#3917517 (10herron) The puppetlabs puppetdb 4.4 jessie package is running successfully in labs on a stretch instance (puppet project). With a few local modifications puppetdb 4.4 is cooperating with ro... 
[15:46:59] PROBLEM - MD RAID on restbase-dev1006 is CRITICAL: CRITICAL: State: degraded, Active: 11, Working: 11, Failed: 1, Spare: 0 [15:46:59] ACKNOWLEDGEMENT - MD RAID on restbase-dev1006 is CRITICAL: CRITICAL: State: degraded, Active: 11, Working: 11, Failed: 1, Spare: 0 nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T185494 [15:47:03] 10Operations, 10ops-eqiad: Degraded RAID on restbase-dev1006 - https://phabricator.wikimedia.org/T185494#3917526 (10ops-monitoring-bot) [15:47:50] PROBLEM - cassandra-b CQL 10.64.48.169:9042 on restbase-dev1006 is CRITICAL: connect to address 10.64.48.169 and port 9042: Connection refused [15:48:09] PROBLEM - cassandra-a CQL 10.64.48.168:9042 on restbase-dev1006 is CRITICAL: connect to address 10.64.48.168 and port 9042: Connection refused [15:48:19] PROBLEM - restbase endpoints health on restbase-dev1006 is CRITICAL: /en.wikipedia.org/v1/page/summary/{title} (Get summary from storage) is CRITICAL: Test Get summary from storage returned the unexpected status 500 (expecting: 200): /en.wikipedia.org/v1/feed/onthisday/{type}/{mm}/{dd} (Retrieve selected the events for Jan 01) is CRITICAL: Test Retrieve selected the events for Jan 01 returned the unexpected status 500 (expectin [15:48:29] PROBLEM - restbase endpoints health on restbase-dev1004 is CRITICAL: /en.wikipedia.org/v1/page/summary/{title} (Get summary from storage) is CRITICAL: Test Get summary from storage returned the unexpected status 500 (expecting: 200): /en.wikipedia.org/v1/feed/onthisday/{type}/{mm}/{dd} (Retrieve selected the events for Jan 01) is CRITICAL: Test Retrieve selected the events for Jan 01 returned the unexpected status 500 (expectin [15:48:39] PROBLEM - restbase endpoints health on restbase-dev1005 is CRITICAL: /en.wikipedia.org/v1/page/summary/{title} (Get summary from storage) is CRITICAL: Test Get summary from storage returned the unexpected status 500 (expecting: 200): /en.wikipedia.org/v1/feed/onthisday/{type}/{mm}/{dd} 
(Retrieve selected the events for Jan 01) is CRITICAL: Test Retrieve selected the events for Jan 01 returned the unexpected status 500 (expectin [15:48:48] (03PS1) 10Fdans: Remove sensitive fields from whitelist for QuickSurvey schemas [puppet] - 10https://gerrit.wikimedia.org/r/405727 (https://phabricator.wikimedia.org/T174386) [15:49:40] !log upgrade image scalers in eqiad to HHVM 3.18.7 [15:49:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:52:50] PROBLEM - MegaRAID on db1016 is CRITICAL: CRITICAL: 1 LD(s) must have write cache policy WriteBack, currently using: WriteThrough [15:54:12] 10Operations, 10Analytics-Data-Quality, 10Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3917572 (10Tbayer) [15:54:42] :-( [16:09:07] I have run another relearn to try to help db1016 but it might be that its bbu is dead forever [16:11:11] 10Operations, 10Analytics-Data-Quality, 10Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3917622 (10Nuria) >What is the argument for assuming that this data is correct? The data for the pdf file transfer looks as you would expect if... 
[16:11:52] 10Operations, 10Puppet: Build a pair of debian stretch PuppetDB servers - https://phabricator.wikimedia.org/T185499#3917623 (10herron) p:05Triage>03Normal [16:12:06] 10Operations, 10Puppet: Extend puppetmaster::puppetdb to support puppetlabs packaged puppetdb 4.4 - https://phabricator.wikimedia.org/T185500#3917639 (10herron) p:05Triage>03Normal [16:12:17] 10Operations, 10Puppet: Add PuppetDB version selector (puppet/hiera) - https://phabricator.wikimedia.org/T185501#3917650 (10herron) p:05Triage>03Normal [16:12:38] 10Operations, 10Puppet: Port puppetlabs PuppetDB 4.4 package to stretch - https://phabricator.wikimedia.org/T185502#3917661 (10herron) p:05Triage>03Normal [16:13:16] 10Operations, 10Puppet: Upgrade PuppetDB to version 4.4 - https://phabricator.wikimedia.org/T177253#3917672 (10herron) [16:15:53] 10Operations, 10Puppet: Add PuppetDB version selector (puppet/hiera) - https://phabricator.wikimedia.org/T185501#3917679 (10herron) [16:17:24] (03PS3) 10Alexandros Kosiaris: ircecho: Remove support for sysvinit script [puppet] - 10https://gerrit.wikimedia.org/r/405703 (owner: 10Paladox) [16:17:32] (03CR) 10Alexandros Kosiaris: [C: 032] "LGTM, thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/405703 (owner: 10Paladox) [16:17:45] (03CR) 10Paladox: "Thanks and your welcome :)." 
[puppet] - 10https://gerrit.wikimedia.org/r/405703 (owner: 10Paladox) [16:22:47] 10Operations, 10monitoring: Netbox: add Icinga check for PosgreSQL - https://phabricator.wikimedia.org/T185504#3917698 (10Volans) p:05Triage>03Normal [16:24:27] 10Operations, 10monitoring: Netbox: add Icinga check for the website - https://phabricator.wikimedia.org/T185505#3917710 (10Volans) p:05Triage>03Normal [16:25:00] 10Operations, 10Patch-For-Review: Netbox: postgres cannot be restarted w/ current config - https://phabricator.wikimedia.org/T184634#3917722 (10Volans) [16:25:52] 10Operations, 10Analytics-Data-Quality, 10Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3917729 (10Tbayer) >>! In T185350#3917622, @Nuria wrote: >>What is the argument for assuming that this data is correct? > The data for the pdf f... [16:29:35] (03CR) 10Dzahn: "if Nuria still sees performance issues, with the login page specifically or Gerrit in general, kindly request that we open a real ticket a" [puppet] - 10https://gerrit.wikimedia.org/r/405368 (owner: 10Paladox) [16:30:26] (03CR) 10Dzahn: Gerrit: Fix performance issues with new login ui (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/405368 (owner: 10Paladox) [16:30:47] (03PS9) 10Paladox: Gerrit: Fix performance issues with new login ui [puppet] - 10https://gerrit.wikimedia.org/r/405368 [16:30:58] (03CR) 10Paladox: Gerrit: Fix performance issues with new login ui (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/405368 (owner: 10Paladox) [16:34:16] 10Operations, 10Gerrit, 10Performance: New gerrit login ui is causing performance problems when going through gerrit.wikimedia.org - https://phabricator.wikimedia.org/T185506#3917733 (10Paladox) [16:34:19] (03PS10) 10Paladox: Gerrit: Fix performance issues with new login ui [puppet] - 10https://gerrit.wikimedia.org/r/405368 (https://phabricator.wikimedia.org/T185506) [16:34:35] (03CR) 10Dzahn: "maybe better call 
this something like "remove visibility: hidden; from " or similar, describing what it actually does." [puppet] - 10https://gerrit.wikimedia.org/r/405368 (https://phabricator.wikimedia.org/T185506) (owner: 10Paladox) [16:35:18] (03PS11) 10Paladox: Gerrit: remove visibility: hidden; from [puppet] - 10https://gerrit.wikimedia.org/r/405368 (https://phabricator.wikimedia.org/T185506) [16:35:25] (03CR) 10Paladox: "> maybe better call this something like "remove visibility: hidden;" [puppet] - 10https://gerrit.wikimedia.org/r/405368 (https://phabricator.wikimedia.org/T185506) (owner: 10Paladox) [16:38:40] (03PS2) 10Dzahn: mediawiki: Update path to loadExitNodes.php [puppet] - 10https://gerrit.wikimedia.org/r/399434 (owner: 10Reedy) [16:40:03] (03PS3) 10Dzahn: mediawiki: Update path to loadExitNodes.php [puppet] - 10https://gerrit.wikimedia.org/r/399434 (owner: 10Reedy) [16:40:12] (03CR) 10Dzahn: [C: 032] "confirmed:" [puppet] - 10https://gerrit.wikimedia.org/r/399434 (owner: 10Reedy) [16:43:33] (03CR) 10Dzahn: "can you test the generated rewrite rule on a cloud instance, using the "apache-fast-test" script? 
It can be found in operations/puppet ./" [puppet] - 10https://gerrit.wikimedia.org/r/394743 (https://phabricator.wikimedia.org/T181878) (owner: 10Framawiki) [16:46:56] (03PS13) 10Faidon Liambotis: lxc: Fix support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: 10Paladox) [16:46:58] (03CR) 10Faidon Liambotis: [C: 032] lxc: Fix support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: 10Paladox) [16:47:48] (03CR) 10Faidon Liambotis: [C: 032] lxc: Fix support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: 10Paladox) [16:47:51] (03CR) 10Paladox: "Thanks :)" [puppet] - 10https://gerrit.wikimedia.org/r/405208 (https://phabricator.wikimedia.org/T180377) (owner: 10Paladox) [16:48:23] yw :) [16:49:46] (03CR) 10Paladox: "Does this mean this wont work on trusty? Anyways support for debian stretch was added in https://gerrit.wikimedia.org/r/#/c/405208/" [puppet] - 10https://gerrit.wikimedia.org/r/405532 (owner: 10Andrew Bogott) [16:51:21] !log upgraded debdeploy and cumin to latest released on neodymium/sarin - T182575 [16:51:27] (03Draft1) 10Paladox: mediawiki_vagrant: Add support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405736 [16:51:30] (03PS2) 10Paladox: mediawiki_vagrant: Add support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405736 [16:51:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:51:33] T182575: Cumin: PuppetDB backend, add support for API v4 - https://phabricator.wikimedia.org/T182575 [16:51:48] 10Puppet, 10Operations-Software-Development: Cumin: PuppetDB backend, add support for API v4 - https://phabricator.wikimedia.org/T182575#3917778 (10Volans) 05Open>03Resolved Cumin 2.0.0 with support for Puppet API v4 was released. Debdeploy was updated accordingly. Both debdeploy and cumin were released in... 
[16:51:52] 10Operations, 10Puppet, 10Goal: Modernize Puppet Configuration Management (2017-18 Q3 Goal) - https://phabricator.wikimedia.org/T184561#3917781 (10Volans) [16:51:56] (03PS3) 10Paladox: mediawiki_vagrant: Add support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405736 (https://phabricator.wikimedia.org/T180377) [16:52:16] 10Operations, 10Puppet, 10Goal: Modernize Puppet Configuration Management (2017-18 Q3 Goal) - https://phabricator.wikimedia.org/T184561#3888176 (10Volans) [16:53:15] (03CR) 10Andrew Bogott: [C: 04-2] "All this does is notify so it doesn't actively prevent running on trusty. And, as far as I know vagrant isn't really maintained on Trusty" [puppet] - 10https://gerrit.wikimedia.org/r/405736 (https://phabricator.wikimedia.org/T180377) (owner: 10Paladox) [16:53:39] (03PS4) 10Paladox: mediawiki_vagrant: Add support for stretch [puppet] - 10https://gerrit.wikimedia.org/r/405736 (https://phabricator.wikimedia.org/T180377) [16:54:08] (03CR) 10Paladox: "> All this does is notify so it doesn't actively prevent running on" [puppet] - 10https://gerrit.wikimedia.org/r/405736 (https://phabricator.wikimedia.org/T180377) (owner: 10Paladox) [16:54:30] (03PS1) 10Andrew Bogott: Revert "role::client::labs: remove hiera defaults that rely on ::global vars" [puppet] - 10https://gerrit.wikimedia.org/r/405737 [16:55:04] (03PS2) 10Andrew Bogott: Revert "role::client::labs: remove hiera defaults that rely on ::global vars" [puppet] - 10https://gerrit.wikimedia.org/r/405737 [16:55:52] (03CR) 10Andrew Bogott: [C: 032] Revert "role::client::labs: remove hiera defaults that rely on ::global vars" [puppet] - 10https://gerrit.wikimedia.org/r/405737 (owner: 10Andrew Bogott) [16:59:28] (03CR) 10Alexandros Kosiaris: [C: 032] "I am gonna merge this in the interest of moving forward. I think I have addressed comments and if I haven't, we can always revert." 
[puppet] - 10https://gerrit.wikimedia.org/r/331602 (owner: 10Alexandros Kosiaris) [16:59:32] (03PS5) 10Alexandros Kosiaris: admin: Use the debian staff group for ops [puppet] - 10https://gerrit.wikimedia.org/r/331602 [17:00:36] Hi ops-team - Little ping before deploying analytics-refinery -- Hadoop only related stuff [17:01:39] !log joal@tin Started deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands) [17:01:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:11:54] !log joal@tin Finished deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands) (duration: 10m 14s) [17:12:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:17:44] (03CR) 10EddieGP: "I don't know if someone else on this task can test it - I for one don't have access to any cloud instance (and probably won't have time to" [puppet] - 10https://gerrit.wikimedia.org/r/394743 (https://phabricator.wikimedia.org/T181878) (owner: 10Framawiki) [17:18:59] PROBLEM - puppet last run on ganeti1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:19:19] PROBLEM - puppet last run on mw1343 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:19:19] PROBLEM - puppet last run on lvs1010 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:20:29] PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:20:54] *waves* any ops around to reset my 2fa on my phab account? [17:21:59] PROBLEM - puppet last run on labvirt1009 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:22:30] PROBLEM - puppet last run on db1060 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [17:22:40] PROBLEM - puppet last run on poolcounter1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:23:10] PROBLEM - puppet last run on db1069 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:23:10] PROBLEM - puppet last run on analytics1033 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:23:15] akosiaris: puppetdb OOMed ^^^ [17:23:19] puppetdb on nitrogen killed volans [17:23:20] ahhaha [17:23:29] PROBLEM - puppet last run on elastic1038 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:23:30] poor puppetdb [17:23:50] addshore: most/all of us in a meeting right now [17:23:59] PROBLEM - puppet last run on snapshot1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:24:17] volans: ack, not desperate :) still logged in in my browser right now :) [17:24:40] volans: ahaha, good thing we waited for another day [17:24:47] indeed! 
good call [17:25:25] addshore: if you're in you should be able to remove it I think [17:26:22] volans: it asks me for a 2fa code in order to remove or add one ;) [17:27:44] eheheh [17:42:50] RECOVERY - MegaRAID on db1016 is OK: OK: optimal, 1 logical, 2 physical, WriteBack policy [17:47:31] RECOVERY - puppet last run on db1060 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [17:47:40] RECOVERY - puppet last run on poolcounter1001 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [17:48:06] 10Operations, 10ops-eqiad, 10DBA: db1016 m1 master: Possibly faulty BBU - https://phabricator.wikimedia.org/T166344#3917896 (10Marostegui) After the relearn: ``` ˜/icinga-wm 18:42> RECOVERY - MegaRAID on db1016 is OK: OK: optimal, 1 logical, 2 physical, WriteBack policy ``` Not sure it will last like that f... [17:48:11] RECOVERY - puppet last run on analytics1033 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [17:48:11] RECOVERY - puppet last run on db1069 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [17:48:31] RECOVERY - puppet last run on elastic1038 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:49:00] RECOVERY - puppet last run on ganeti1004 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [17:49:00] RECOVERY - puppet last run on snapshot1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:49:20] RECOVERY - puppet last run on mw1343 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:49:20] RECOVERY - puppet last run on lvs1010 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [17:50:30] RECOVERY - puppet last run on helium is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [17:50:53] 10Operations, 10Release Pipeline, 10Release-Engineering-Team (Kanban): 
Package/upload service-checker for Debian stretch - https://phabricator.wikimedia.org/T184224#3917908 (10akosiaris) After a couple of setbacks, I 've managed to build the package just fine on stretch. I 'll upload it soon [17:52:00] RECOVERY - puppet last run on labvirt1009 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [17:53:01] 10Operations, 10Puppet, 10Traffic: Puppet hosts with signed certificate present on agent but not master - https://phabricator.wikimedia.org/T185239#3910613 (10akosiaris) ganeti2001, wtp2005, wtp2020 only use that cert for rsyslog and not for any service so we can refresh the puppet certs for them whenever we... [18:00:04] gehel and SMalyshev: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) Wikidata Query Service weekly deploy deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180122T1800). [18:00:04] No GERRIT patches in the queue for this window AFAICS. [18:00:37] jouncebot: WDQS GUI update coming up [18:14:41] !log gehel@tin Started deploy [wdqs/wdqs@f59ed29]: (no justification provided) [18:14:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:15:12] damn, forgot to add a justification again... 
[18:15:19] !log updating wdqs GUI [18:15:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:16:53] !log gehel@tin Finished deploy [wdqs/wdqs@f59ed29]: (no justification provided) (duration: 02m 12s) [18:17:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:18:37] SMalyshev: update completed, all tests are green ^ [18:22:22] (03PS1) 10Dzahn: Revert "rename ms-be3003.mgmt to bast3003.mgmt" [dns] - 10https://gerrit.wikimedia.org/r/405744 [18:27:47] (03CR) 10Dzahn: [C: 032] Revert "rename ms-be3003.mgmt to bast3003.mgmt" [dns] - 10https://gerrit.wikimedia.org/r/405744 (owner: 10Dzahn) [18:27:50] (03PS2) 10Dzahn: Revert "rename ms-be3003.mgmt to bast3003.mgmt" [dns] - 10https://gerrit.wikimedia.org/r/405744 [19:00:04] addshore, hashar, anomie, no_justification, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Dear deployers, time to do the Morning SWAT (Max 8 patches) deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180122T1900). [19:00:05] RoanKattouw, MatmaRex, Biplab, and stephanebisson: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [19:00:21] hi. [19:00:24] I'm here [19:01:47] hello [19:03:45] Any deployer /not/ at the dev summit? [19:04:10] * no_justification isn't :P [19:06:36] stephanebisson, quite some I think. [19:07:20] we should just make RoanKattouw do it. ;> [19:08:02] I'm at the dev summit though [19:08:13] oh whoops [19:08:31] That said I can probably do it, I'm not really paying attention to this session [19:08:40] Neither session going on right now was all that interesting to me [19:09:16] Is Knowledge as a service as boring as third parties? 
[19:10:12] It's just that neither is that relevant to what I work on [19:10:34] But then after lunch I want to go to both sessions, of course :| [19:10:40] of course [19:11:04] (03CR) 10MarcoAurelio: "recheck" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:11:14] (03PS4) 10Catrope: Add Draft namespace to the Nepali Wikipedia. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404067 (https://phabricator.wikimedia.org/T184157) (owner: 10Biplab Anand) [19:11:17] (03CR) 10Catrope: [C: 032] Add Draft namespace to the Nepali Wikipedia. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404067 (https://phabricator.wikimedia.org/T184157) (owner: 10Biplab Anand) [19:11:19] I don't even know what we're talking about right now [19:14:56] (03CR) 10jerkins-bot: [V: 04-1] Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:15:41] (03Merged) 10jenkins-bot: Add Draft namespace to the Nepali Wikipedia. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404067 (https://phabricator.wikimedia.org/T184157) (owner: 10Biplab Anand) [19:19:05] hello [19:19:19] (03PS3) 10Urbanecm: Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:19:47] https://gerrit.wikimedia.org/r/#/c/404067/ are on mwdebug1002? [19:20:39] (03CR) 10jenkins-bot: Add Draft namespace to the Nepali Wikipedia. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404067 (https://phabricator.wikimedia.org/T184157) (owner: 10Biplab Anand) [19:22:34] https://gerrit.wikimedia.org/r/#/c/404067/ are on mwdebug1002? 
[19:26:02] Jayprakash12345_: Sorry, I got distracted [19:26:07] I'm putting it on there right now [19:26:44] Jayprakash12345_: OK it's there now [19:27:04] Ah, wait, I spoke too sson [19:27:07] *soon [19:27:31] NOW it's there [19:29:27] stephanebisson: MatmaRex: Your patches are also on mwdebug1002 now [19:30:58] RoanKattouw: looks good [19:31:39] !log catrope@tin Synchronized php-1.31.0-wmf.17/extensions/InputBox/InputBox.hooks.php: T185367 (duration: 00m 58s) [19:31:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:31:53] T185367: InputBox doesn't let you combine useve=true with prefix - https://phabricator.wikimedia.org/T185367 [19:32:27] RoanKattouw: that's bad. I was able to "break" my testwiki usertalk even with the patch on mwdebug1002... abort mission [19:32:50] Hmm weird [19:33:00] It does deal with deferred stuff so it might be worth deploying anyway [19:33:19] Then again I don't think deferral runs things on different machines [19:33:20] https://gerrit.wikimedia.org/r/#/c/404067/ is good [19:33:30] please deploy [19:34:14] RoanKattouw: Everything is ok. [19:34:24] Cool, I'll do yours next [19:34:34] Another one is being deployed right now [19:34:42] !log catrope@tin Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionDetailsWidget.js: T184380 (duration: 00m 56s) [19:34:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:34:56] T184380: Upload Wizard ignores the description= field when using the upload campaign - https://phabricator.wikimedia.org/T184380 [19:34:56] RoanKattouw: I'd be surprised if this code supports serialization to distribute deferred on other machines [19:35:08] Yeah true [19:35:14] stephanebisson: OK so you wanna revert?
[19:35:29] RoanKattouw: yeah, I think it's better
[19:35:47] clearly, I don't understand as much as I thought
[19:35:59] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Add Draft namespace to newiki (T184157) (duration: 00m 56s)
[19:36:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:36:12] T184157: Add Draft namespace to the Nepali Wikipedia - https://phabricator.wikimedia.org/T184157
[19:36:27] (thanks)
[19:36:55] RoanKattouw: Thanks
[19:37:08] RoanKattouw: how do we do this? do I revert on master and backport the revert patch?
[19:37:33] I just reverted the backport
[19:37:47] What you do in master is up to you
[19:37:49] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3918246 (Nuria) >Also, note in both cases (response_size 18550 and 15722), these weren't partial requests either (the status code was 200, not...
[19:37:52] ok
[19:38:40] Operations, Release Pipeline, Release-Engineering-Team (Kanban): Package/upload service-checker for Debian stretch - https://phabricator.wikimedia.org/T184224#3918252 (dduvall) >>! In T184224#3917908, @akosiaris wrote: > After a couple of setbacks, I 've managed to build the package just fine on stre...
[19:39:51] (PS4) Urbanecm: Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: MarcoAurelio)
[19:43:47] (Abandoned) Andrew Bogott: striker: rename role class to profile [puppet] - https://gerrit.wikimedia.org/r/405669 (owner: Andrew Bogott)
[19:48:56] (PS7) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[19:49:31] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[19:56:16] (PS1) Framawiki: Add NS_MAIN to $wgNamespacesWithSubpages for cawikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/405755 (https://phabricator.wikimedia.org/T185436)
[19:58:53] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3918346 (Tbayer) @ottomata notes that the response_size field should correspond to the "Size of response in bytes, excluding HTTP headers" out...
[20:01:44] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3918360 (Ottomata) Thanks @Tbayer, meant to write that here too :) I'd be very surprised if the value of `%b` from varnishlog which is sent a...
[20:02:07] (PS8) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[20:02:39] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[20:06:08] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3918374 (Tbayer) In the meantime, I ran a query to estimate how much data was transferred in the download direction last month overall *if* th...
[20:06:21] Operations: shutdown/migrate gurvin (ipv6proxy,sslproxy) - https://phabricator.wikimedia.org/T83211#3918375 (Dzahn)
[20:09:01] Operations: what's left in Tampa - https://phabricator.wikimedia.org/T83156#3918390 (Dzahn)
[20:09:47] Operations: what's left in Tampa - child ticket - https://phabricator.wikimedia.org/T84665#3918487 (Dzahn)
[20:13:36] Operations: shutdown fenari - https://phabricator.wikimedia.org/T83207#3918610 (Dzahn)
[20:14:51] Operations, ops-requests: apache-fast-test not managed by puppet on fenari - https://phabricator.wikimedia.org/T83136#909634 (Dzahn)
[20:15:51] Operations: move apache config deployment from fenari to tin - https://phabricator.wikimedia.org/T83121#909419 (Dzahn)
[20:18:25] thanks Reedy :)
[20:23:28] (CR) Framawiki: [C: -1] Changing namespaces on some Urdu language projects, 'وکی لغت‌' to 'ویکی لغت', 'وکی کتب' to 'ویکی کتب', 'وکی اقتباسات' to 'ویکی اقتباس'.
(3 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/403120 (owner: محمد شعیب)
[20:29:06] Operations, Puppet, Traffic: Puppet hosts with signed certificate present on agent but not master - https://phabricator.wikimedia.org/T185239#3918682 (herron)
[20:30:02] Operations, Puppet, Traffic: Puppet hosts with signed certificate present on agent but not master - https://phabricator.wikimedia.org/T185239#3910613 (herron) Thanks @akosiaris! ganeti2001, wtp2005, wtp2020 puppet certs have been refreshed.
[20:44:35] (CR) Framawiki: robots.txt: Remove old and disabled archive.org_bot rule (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/358171 (https://phabricator.wikimedia.org/T7582) (owner: Framawiki)
[20:55:42] The jobqueue is getting bigger for now, That's me, trying to refresh and reduce x usages, this directly improves jobqueue in the next days
[21:02:25] !log restarting archiva
[21:02:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:16:17] (PS1) Framawiki: Allow euwiki bureaucrats to add/remove 'accountcreator' right [mediawiki-config] - https://gerrit.wikimedia.org/r/405771 (https://phabricator.wikimedia.org/T185531)
[21:21:50] (PS1) Dzahn: rename amslvs4 to bast3003 [dns] - https://gerrit.wikimedia.org/r/405801 (https://phabricator.wikimedia.org/T184936)
[21:26:24] (PS2) Dzahn: rename amslvs4 to bast3003 [dns] - https://gerrit.wikimedia.org/r/405801 (https://phabricator.wikimedia.org/T184936)
[21:26:32] (PS1) Dzahn: httpd: copy apache-fast-test from apache module [puppet] - https://gerrit.wikimedia.org/r/405802
[21:26:34] (PS1) Dzahn: install_server: update MAC of bast3003 [puppet] - https://gerrit.wikimedia.org/r/405803 (https://phabricator.wikimedia.org/T184936)
[21:28:00] (PS3) Dzahn: rename amslvs4.mgmt to bast3003.mgmt [dns] - https://gerrit.wikimedia.org/r/405801 (https://phabricator.wikimedia.org/T184936)
[21:35:20] Operations,
ops-codfw, netops: rack spare switches in c1-codfw - https://phabricator.wikimedia.org/T185336#3918812 (Papaul) serial ports information ex4300-spare1-codfw = port 39 ex4300-spare2-codfw = port 40 qfx5100-spare1-codfw = port 41 qfx5100-spare2-codfw = port 42
[21:35:37] Operations, ops-codfw, netops: rack spare switches in c1-codfw - https://phabricator.wikimedia.org/T185336#3918813 (Papaul)
[21:46:14] (PS2) Dzahn: httpd: copy apache-fast-test from apache module [puppet] - https://gerrit.wikimedia.org/r/405802
[21:47:00] (CR) Dzahn: [C: 2] rename amslvs4.mgmt to bast3003.mgmt [dns] - https://gerrit.wikimedia.org/r/405801 (https://phabricator.wikimedia.org/T184936) (owner: Dzahn)
[21:47:20] Operations, monitoring: Netbox: add Icinga check for the website - https://phabricator.wikimedia.org/T185505#3918825 (ayounsi) Are those checks good enough? https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=netmon1002&service=netbox+uWSGI+web+app https://icinga.wikimedia.org/cgi-bin/ici...
[21:48:19] (PS3) Dzahn: httpd: copy apache-fast-test from apache module [puppet] - https://gerrit.wikimedia.org/r/405802
[21:49:08] (CR) Dzahn: [C: 2] "just copy of existing file we have been using for years" [puppet] - https://gerrit.wikimedia.org/r/405802 (owner: Dzahn)
[21:50:02] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3913807 (faidon) >>! In T185350#3918374, @Tbayer wrote: > In the meantime, I ran a query to estimate how much data was transferred in the down...
[22:00:04] bawolff and Reedy: My dear minions, it's time we take the moon! Just kidding. Time for Weekly Security deployment window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180122T2200).
[22:00:04] No GERRIT patches in the queue for this window AFAICS.
[22:00:36] (PS1) Dzahn: apache: rm helper_scripts class, mv script to deployment_server [puppet] - https://gerrit.wikimedia.org/r/405806
[22:01:09] (CR) jerkins-bot: [V: -1] apache: rm helper_scripts class, mv script to deployment_server [puppet] - https://gerrit.wikimedia.org/r/405806 (owner: Dzahn)
[22:04:05] (CR) MarcoAurelio: [C: 1] Allow euwiki bureaucrats to add/remove 'accountcreator' right [mediawiki-config] - https://gerrit.wikimedia.org/r/405771 (https://phabricator.wikimedia.org/T185531) (owner: Framawiki)
[22:04:30] (PS2) Dzahn: apache: rm helper_scripts class, mv script to deployment_server [puppet] - https://gerrit.wikimedia.org/r/405806
[22:04:59] (PS2) Dzahn: install_server: update MAC of bast3003 [puppet] - https://gerrit.wikimedia.org/r/405803 (https://phabricator.wikimedia.org/T184936)
[22:05:36] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3918834 (Tbayer) OK, I just launched the below query for ulsfo - will report the result here once it has completed. ```lang=sql SELECT SUM(...
[22:05:39] (CR) Dzahn: [C: 2] install_server: update MAC of bast3003 [puppet] - https://gerrit.wikimedia.org/r/405803 (https://phabricator.wikimedia.org/T184936) (owner: Dzahn)
[22:09:09] (PS1) Herron: puppetdb: add major version and package variant parameters [puppet] - https://gerrit.wikimedia.org/r/405808 (https://phabricator.wikimedia.org/T185501)
[22:09:30] PROBLEM - MariaDB Slave Lag: s6 on db1102 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 327.20 seconds
[22:11:30] RECOVERY - MariaDB Slave Lag: s6 on db1102 is OK: OK slave_sql_lag Replication lag: 0.00 seconds
[22:12:44] Operations, Puppet, Patch-For-Review: Add PuppetDB version selector (puppet/hiera) - https://phabricator.wikimedia.org/T185501#3918872 (herron)
[22:12:46] Operations, Puppet: Extend puppetmaster::puppetdb to support puppetlabs packaged puppetdb 4.4 - https://phabricator.wikimedia.org/T185500#3918871 (herron)
[22:16:11] Operations, monitoring: Netbox: add Icinga check for the website - https://phabricator.wikimedia.org/T185505#3918894 (Volans) Those are a good start, but if I'm not mistaken, all of them are local on the host, right? I think we might need one for netbox.wikimedia.org too, done from the Icinga host itself...
[22:22:01] paravoid: BTW, from the webrequest data it looks like we currently have only 80 servers (or server names) serving requests, is that plausible?
[22:22:13] https://www.irccloud.com/pastebin/MRFIr2rA/server%20names%20appearing%20on%20January%2021
[22:23:36] Operations, monitoring: Netbox: add Icinga check for the website - https://phabricator.wikimedia.org/T185505#3917710 (Dzahn) The last check is from external. Actually we have that already: 487195 check_command check_https_url!netbox.wikimedia.org!https://netbox.wikimedia.org
[22:25:54] HaeB: those are the front edge varnish server hostnames
[22:25:54] HaeB: those aren't the backend mws, so, probably?
[22:26:06] what bryan said
[22:26:24] I could see if those were exactly right in puppet, but, feels close
[22:26:58] the sites.pp is a mess of regex so good luck deciding if it's exhaustive from that :)
[22:27:33] but as a random example, cp1053.eqiad.wmnet is a role::cache::text box
[22:27:38] yeah, 80 is accurate
[22:28:11] total count of cp* systems is 92, then 12 of those are either test, not brought online yet or decommissioned
[22:29:19] A:cp is 81 including cp1008 that is for testing
[22:29:27] but I'm not sure that answers the question
[22:30:14] looking at icinga gives you a list of host names that is easier to read than site.pp
[22:31:08] i see, thanks for the explanations - somehow thought the number of front varnishes must be like several hundred
[22:33:43] you're not crazy, it used to
[22:33:51] we've just been optimizing :)
[22:36:59] Operations, cloud-services-team: Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918977 (chasemp) a:Bstorm Hi Brooke :)
[22:37:30] Operations, cloud-services-team: Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918979 (chasemp)
[22:38:10] !log rebooting the-server-formerly-known-as-amslvs4 to PXE to reinstall it as bast3003.
doesn't work
[22:38:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:39:54] Operations, cloud-services-team: Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918988 (chasemp)
[22:40:25] the "DHCP spam" is still going on, i could see the interesting log lines among that
[22:40:50] Operations, cloud-services-team: Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918989 (Bstorm)
[22:41:17] Operations, cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918990 (bd808)
[22:41:47] Operations, cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918992 (chasemp)
[22:41:58] Operations, cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3918993 (Bstorm)
[22:55:44] (PS1) Dzahn: install_server: don't use http to fetch bast3003 installer [puppet] - https://gerrit.wikimedia.org/r/405813 (https://phabricator.wikimedia.org/T182215)
[22:56:03] (CR) jerkins-bot: [V: -1] install_server: don't use http to fetch bast3003 installer [puppet] - https://gerrit.wikimedia.org/r/405813 (https://phabricator.wikimedia.org/T182215) (owner: Dzahn)
[22:58:15] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3919049 (Tbayer) @faidon: The ulsfo result for December is 2018 (decimal) terabytes. Plausible? ``` total_bytes requests 2017852005519238...
[23:03:25] Operations, cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3919066 (chasemp)
[23:03:37] (PS2) Dzahn: install_server: don't use http to fetch bast3003 installer [puppet] - https://gerrit.wikimedia.org/r/405813 (https://phabricator.wikimedia.org/T182215)
[23:06:22] (CR) Dzahn: [C: 2] install_server: don't use http to fetch bast3003 installer [puppet] - https://gerrit.wikimedia.org/r/405813 (https://phabricator.wikimedia.org/T182215) (owner: Dzahn)
[23:06:46] Operations, Analytics-Data-Quality, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3919078 (faidon) Interesting! So with a ratio in:out of approximately 25:1 (based on January's figures), this means that we could estimate the...
[23:07:12] Operations, cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3919079 (chasemp)
[23:10:02] Operations, cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3919088 (chasemp)
[23:28:29] (CR) Platonides: [C: 1] Allow euwiki bureaucrats to add/remove 'accountcreator' right [mediawiki-config] - https://gerrit.wikimedia.org/r/405771 (https://phabricator.wikimedia.org/T185531) (owner: Framawiki)
[23:35:31] Operations, Analytics, Code-Stewardship-Reviews, Tools, Wikimedia-IRC-RC-Server: IRC RecentChanges feed: code stewardship request - https://phabricator.wikimedia.org/T185319#3919182 (greg) p:Triage>Normal