[01:05:01] PROBLEM - Check health of redis instance on 6480 on rdb2005 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6480 [01:05:01] PROBLEM - Check health of redis instance on 6479 on rdb2005 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6479 [01:05:11] PROBLEM - Check health of redis instance on 6481 on rdb2005 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6481 [01:06:01] RECOVERY - Check health of redis instance on 6480 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 4197295 keys, up 5 minutes 52 seconds - replication_delay is 0 [01:06:01] RECOVERY - Check health of redis instance on 6479 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6479 has 1 databases (db0) with 4197810 keys, up 5 minutes 52 seconds - replication_delay is 0 [01:06:11] RECOVERY - Check health of redis instance on 6481 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 4196369 keys, up 6 minutes - replication_delay is 0 [02:41:27] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.2) (duration: 13m 49s) [02:41:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:48:21] !log l10nupdate@tin ResourceLoader cache refresh completed at Mon Oct 9 02:48:21 UTC 2017 (duration 6m 54s) [02:48:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:29:52] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 825.09 seconds [04:24:02] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 260.26 seconds [05:32:21] (03PS1) 10Marostegui: db-eqiad.php: Depool db1083 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383055 (https://phabricator.wikimedia.org/T174509) [05:34:16] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1083 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383055 
(https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:35:52] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1083 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383055 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:36:09] (03PS1) 10Marostegui: db1080.yaml: Update socket path [puppet] - 10https://gerrit.wikimedia.org/r/383056 [05:36:12] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1083 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383055 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:37:03] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1083 - T174509 (duration: 00m 47s) [05:37:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:37:11] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [05:38:55] !log Stop MySQL on db1083 to upgrade it [05:39:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:41:18] (03PS2) 10Marostegui: db1083.yaml: Update socket path [puppet] - 10https://gerrit.wikimedia.org/r/383056 [05:42:02] (03CR) 10Marostegui: [C: 032] db1083.yaml: Update socket path [puppet] - 10https://gerrit.wikimedia.org/r/383056 (owner: 10Marostegui) [05:44:53] !log Optimize pagelinks and templatelinks on db1083 - T174509 [05:44:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:44:59] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [05:46:33] (03PS4) 10Marostegui: db-eqiad.php: Set commonswiki on read only [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382379 (https://phabricator.wikimedia.org/T176883) [05:46:43] (03PS3) 10Marostegui: db1068: Update socket path [puppet] - 10https://gerrit.wikimedia.org/r/382380 (https://phabricator.wikimedia.org/T168661) [05:47:43] (03PS1) 10Marostegui: db-eqiad.php: Depool db1090 [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/383057 (https://phabricator.wikimedia.org/T174509) [05:50:59] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1090 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383057 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:52:35] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1090 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383057 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:52:49] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1090 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383057 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:53:37] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1090 - T174509 (duration: 00m 47s) [05:53:40] !log Optimize pagelinks and templatelinks on db1090 - T174509 [05:53:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:53:43] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [05:53:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:54:51] (03PS1) 10Marostegui: db-eqiad.php: Depool db1091 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383058 (https://phabricator.wikimedia.org/T174509) [05:56:41] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1091 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383058 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:58:10] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1091 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383058 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:58:21] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1091 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383058 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [05:59:12] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1091 - T174509 (duration: 00m 47s) 
[05:59:16] !log Optimize pagelinks and templatelinks on db1091 - T174509 [05:59:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:59:19] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [05:59:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:01:48] (03PS1) 10Marostegui: db-eqiad.php: Depool db1096 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383059 (https://phabricator.wikimedia.org/T174509) [06:03:31] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1096 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383059 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [06:04:01] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0 [06:04:41] PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 120, down: 1, dormant: 0, excluded: 0, unused: 0 [06:04:51] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1096 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383059 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [06:06:10] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1096 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383059 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [06:06:45] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1096 - T174509 (duration: 00m 47s) [06:06:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:06:51] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [06:08:21] !log Optimize pagelinks and templatelinks on db1096 - T174509 [06:08:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:11:29] !log Optimize pagelinks and templatelinks on db1050 - T174509 [06:11:34] 
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:14:12] !log Optimize pagelinks and templatelinks on db1039 - T174509 [06:14:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:14:19] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [06:15:42] RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 122, down: 0, dormant: 0, excluded: 0, unused: 0 [06:16:11] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0 [06:18:52] PROBLEM - Check Varnish expiry mailbox lag on cp4022 is CRITICAL: CRITICAL: expiry mailbox lag is 2087622 [06:25:25] !log Drop moodbar_feedback and moodbar_feedback_response from s6 - T153033 [06:25:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:25:33] T153033: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033 [06:30:29] rip [06:30:53] xddd [06:33:13] (03PS1) 10Marostegui: mariadb: Provision db2079 into s5 [puppet] - 10https://gerrit.wikimedia.org/r/383062 (https://phabricator.wikimedia.org/T170662) [06:36:47] (03PS2) 10Marostegui: mariadb: Provision db2079 into s5 [puppet] - 10https://gerrit.wikimedia.org/r/383062 (https://phabricator.wikimedia.org/T170662) [06:41:22] (03CR) 10Marostegui: "Looks good: https://puppet-compiler.wmflabs.org/compiler02/8219/" [puppet] - 10https://gerrit.wikimedia.org/r/383062 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:46:16] (03CR) 10Marostegui: [C: 032] mariadb: Provision db2079 into s5 [puppet] - 10https://gerrit.wikimedia.org/r/383062 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:51:23] (03PS1) 10Marostegui: db-codfw.php: Depool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383068 (https://phabricator.wikimedia.org/T170662) [06:53:31] (03CR) 
10Marostegui: [C: 032] db-codfw.php: Depool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383068 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:54:41] (03PS1) 10Marostegui: s5.hosts: Add db2079 [software] - 10https://gerrit.wikimedia.org/r/383069 (https://phabricator.wikimedia.org/T170662) [06:55:06] (03Merged) 10jenkins-bot: db-codfw.php: Depool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383068 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:55:39] (03CR) 10Marostegui: [C: 032] s5.hosts: Add db2079 [software] - 10https://gerrit.wikimedia.org/r/383069 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:56:10] (03CR) 10jenkins-bot: db-codfw.php: Depool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383068 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:56:17] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Depool db2075 - T170662 (duration: 00m 46s) [06:56:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:56:23] T170662: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662 [06:56:27] (03Merged) 10jenkins-bot: s5.hosts: Add db2079 [software] - 10https://gerrit.wikimedia.org/r/383069 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [06:56:28] !log Stop MySQL on db2075 to use it to clone db2079 - T170662 [06:56:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:01:37] !log cp4022 - backend restart, mailbox lag, cache_upload [07:01:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:03:36] (03PS1) 10Marostegui: db-eqiad,db-codfw.php: Add db2079 the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383070 (https://phabricator.wikimedia.org/T170662) [07:08:54] RECOVERY - Check Varnish expiry mailbox lag on cp4022 is OK: OK: expiry mailbox lag is 0 [07:09:50] (03PS2) 
10Muehlenhoff: Install python-dateutils via puppet [puppet] - 10https://gerrit.wikimedia.org/r/382707 [07:11:07] (03CR) 10Marostegui: [C: 032] db-eqiad,db-codfw.php: Add db2079 the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383070 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [07:13:28] (03Merged) 10jenkins-bot: db-eqiad,db-codfw.php: Add db2079 the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383070 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [07:13:42] (03CR) 10jenkins-bot: db-eqiad,db-codfw.php: Add db2079 the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383070 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [07:15:01] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Add db2079 to the config - T170662 (duration: 00m 47s) [07:15:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:15:07] T170662: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662 [07:16:42] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Add db2079 to the config - T170662 (duration: 00m 47s) [07:16:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:18:38] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668357 (10jcrespo) Correct me if I am misunderstanding something, but on RAID 10, we can lose a whole mirror group and we would be ok, what we cannot lose is the same disk on the two mirror... [07:20:15] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668371 (10Marostegui) >>! In T177720#3668357, @jcrespo wrote: > Correct me if I am misunderstanding something, but on RAID 10, we can lose a whole mirror group and we would be ok, what we c... 
[07:20:44] 10Operations, 10ops-codfw, 10DBA: db2038 disk with predictive failure - https://phabricator.wikimedia.org/T177720#3668372 (10Marostegui) p:05Triage>03High [07:28:46] (03CR) 10Muehlenhoff: [C: 032] Install python-dateutils via puppet [puppet] - 10https://gerrit.wikimedia.org/r/382707 (owner: 10Muehlenhoff) [07:29:53] 10Operations, 10ops-codfw, 10DBA: db2038 disk with predictive failure - https://phabricator.wikimedia.org/T177720#3668379 (10Marostegui) No, they are actually two different disks indeed by looking at the serials. [07:30:14] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668381 (10Marostegui) [07:33:00] !log Drop moodbar_feedback and moodbar_feedback_response from s2 - T153033 [07:33:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:33:07] T153033: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033 [07:42:42] 10Operations: Integrate stretch 9.2 point release - https://phabricator.wikimedia.org/T177739#3668407 (10MoritzMuehlenhoff) [07:42:45] (03PS5) 10Ema: varnish: support for version 5 [puppet] - 10https://gerrit.wikimedia.org/r/382464 (https://phabricator.wikimedia.org/T168529) [07:43:17] (03CR) 10jerkins-bot: [V: 04-1] varnish: support for version 5 [puppet] - 10https://gerrit.wikimedia.org/r/382464 (https://phabricator.wikimedia.org/T168529) (owner: 10Ema) [07:45:31] (03PS6) 10Ema: varnish: add support for version 5 [puppet] - 10https://gerrit.wikimedia.org/r/382464 (https://phabricator.wikimedia.org/T168529) [07:45:40] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668420 (10Papaul) @Marostegui this server is out of warranty 2017-07-10. We need to find out if any of the decommissioned servers have the same disks that we can use. 
[07:46:02] (03CR) 10jerkins-bot: [V: 04-1] varnish: add support for version 5 [puppet] - 10https://gerrit.wikimedia.org/r/382464 (https://phabricator.wikimedia.org/T168529) (owner: 10Ema) [07:49:32] (03PS4) 10Giuseppe Lavagetto: role::cache::base: abstract varnish logging to class [puppet] - 10https://gerrit.wikimedia.org/r/382674 [07:49:34] (03PS5) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 [07:49:36] (03PS4) 10Giuseppe Lavagetto: role::cache::base: convert to profile [1] [puppet] - 10https://gerrit.wikimedia.org/r/382683 [07:49:38] (03PS5) 10Giuseppe Lavagetto: role::cache::base: convert to profile [2] [puppet] - 10https://gerrit.wikimedia.org/r/382684 [07:49:40] (03PS3) 10Giuseppe Lavagetto: prometheus::varnish::exporter: convert to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/382720 [07:49:42] (03PS3) 10Giuseppe Lavagetto: cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 [07:49:44] (03PS1) 10Giuseppe Lavagetto: profile::cache::base: add role::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/383072 [07:49:46] (03PS1) 10Giuseppe Lavagetto: profile::cache::ssl::unified: move from role, refactor [puppet] - 10https://gerrit.wikimedia.org/r/383073 [07:51:52] (03CR) 10jerkins-bot: [V: 04-1] cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 (owner: 10Giuseppe Lavagetto) [07:52:00] (03CR) 10jerkins-bot: [V: 04-1] profile::cache::base: add role::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/383072 (owner: 10Giuseppe Lavagetto) [07:52:29] (03CR) 10jerkins-bot: [V: 04-1] profile::cache::ssl::unified: move from role, refactor [puppet] - 10https://gerrit.wikimedia.org/r/383073 (owner: 10Giuseppe Lavagetto) [07:56:53] (03PS1) 10Marostegui: db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 
(https://phabricator.wikimedia.org/T172679) [07:58:14] (03PS2) 10Marostegui: db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 (https://phabricator.wikimedia.org/T172679) [07:58:14] <_joe_> yeah I know jenkins, don't be such a jerk ;) [07:59:52] (03CR) 10jerkins-bot: [V: 04-1] db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:02:13] (03PS1) 10Marostegui: mariadb: Move db1066 from s1 to s3 [puppet] - 10https://gerrit.wikimedia.org/r/383077 (https://phabricator.wikimedia.org/T172679) [08:02:21] (03PS1) 10Giuseppe Lavagetto: role::cache::text: move to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/383078 [08:02:42] (03PS3) 10Marostegui: db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 (https://phabricator.wikimedia.org/T172679) [08:06:01] (03CR) 10Marostegui: "Looks good: https://puppet-compiler.wmflabs.org/compiler02/8222/" [puppet] - 10https://gerrit.wikimedia.org/r/383077 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:08:48] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:10:28] (03Merged) 10jenkins-bot: db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:10:30] (03CR) 10jenkins-bot: db-eqiad.php: Move db1066 from s1 api to s3 vslow [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383076 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:12:23] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Move db1066 from s1 to s3 - T172679 (duration: 01m 25s) [08:12:30] Logged 
the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:12:30] T172679: Productionize 11 new eqiad database servers - https://phabricator.wikimedia.org/T172679 [08:15:59] (03CR) 10Giuseppe Lavagetto: "https://puppet-compiler.wmflabs.org/compiler02/8223/ seems the patch is a complete noop in terms of the generated catalog." [puppet] - 10https://gerrit.wikimedia.org/r/382674 (owner: 10Giuseppe Lavagetto) [08:19:42] (03PS5) 10Ema: role::cache::base: abstract varnish logging to class [puppet] - 10https://gerrit.wikimedia.org/r/382674 (owner: 10Giuseppe Lavagetto) [08:20:18] (03CR) 10Ema: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/382674 (owner: 10Giuseppe Lavagetto) [08:20:54] (03PS1) 10Marostegui: db-eqiad.php: Restore db1066 on s1, db1072 to s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383083 (https://phabricator.wikimedia.org/T172679) [08:21:40] (03Abandoned) 10Marostegui: mariadb: Move db1066 from s1 to s3 [puppet] - 10https://gerrit.wikimedia.org/r/383077 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:22:31] (03CR) 10Giuseppe Lavagetto: [C: 032] role::cache::base: abstract varnish logging to class [puppet] - 10https://gerrit.wikimedia.org/r/382674 (owner: 10Giuseppe Lavagetto) [08:22:57] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Restore db1066 on s1, db1072 to s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383083 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:25:59] (03Merged) 10jenkins-bot: db-eqiad.php: Restore db1066 on s1, db1072 to s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383083 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:26:22] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Restore db1066 on s1 to and move db1072 to s3 instead - T172679 (duration: 00m 47s) [08:26:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:26:29] T172679: Productionize 11 new eqiad database servers - 
https://phabricator.wikimedia.org/T172679 [08:26:33] (03PS1) 10Marostegui: mariadb: Move db1072 to s3 [puppet] - 10https://gerrit.wikimedia.org/r/383085 (https://phabricator.wikimedia.org/T172679) [08:31:18] (03CR) 10jenkins-bot: db-eqiad.php: Restore db1066 on s1, db1072 to s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383083 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:33:28] (03PS2) 10Marostegui: mariadb: Move db1072 to s3 [puppet] - 10https://gerrit.wikimedia.org/r/383085 (https://phabricator.wikimedia.org/T172679) [08:35:41] 10Operations: Investigate Chrony as a replacement for ISC ntpd - https://phabricator.wikimedia.org/T177742#3668545 (10MoritzMuehlenhoff) [08:36:47] (03CR) 10Marostegui: "This looks good: https://puppet-compiler.wmflabs.org/compiler02/8226/" [puppet] - 10https://gerrit.wikimedia.org/r/383085 (https://phabricator.wikimedia.org/T172679) (owner: 10Marostegui) [08:36:54] (03PS6) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 [08:37:31] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1096" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383086 [08:37:34] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1096" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383086 [08:39:17] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1096" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383086 (owner: 10Marostegui) [08:40:43] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1096" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383086 (owner: 10Marostegui) [08:40:57] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1096" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383086 (owner: 10Marostegui) [08:41:48] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1096 - T174509 (duration: 00m 47s) [08:41:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log 
[08:41:54] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [08:43:35] (03PS1) 10Marostegui: db-eqiad.php: Depool db1092 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383087 (https://phabricator.wikimedia.org/T174509) [08:43:56] (03PS7) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 [08:46:35] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1092 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383087 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [08:48:51] 10Operations, 10Datasets-General-or-Unknown, 10Patch-For-Review: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680#3668636 (10elukey) Some info from this morning: ``` elukey@dataset1001:~$ uptime 08:43:39 up 95 days, 19:23, 1 user, load avera... [08:50:06] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1092 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383087 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [08:50:20] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1092 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383087 (https://phabricator.wikimedia.org/T174509) (owner: 10Marostegui) [08:51:02] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1092 - T174509 (duration: 00m 47s) [08:51:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:08] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [08:51:14] !log Optimize pagelinks and templatelinks on db1092 - T174509 [08:51:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:59] 10Operations: Investigate Chrony as a replacement for ISC ntpd - https://phabricator.wikimedia.org/T177742#3668642 (10ema) p:05Triage>03Normal [08:52:09] !log Drop moodbar_feedback and 
moodbar_feedback_response from s4 - T153033 [08:52:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:52:15] T153033: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033 [08:53:55] (03PS8) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 [08:55:32] (03PS1) 10Marostegui: db-codfw.php: Repool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383089 (https://phabricator.wikimedia.org/T170662) [08:55:39] 10Operations, 10LDAP-Access-Requests, 10WMF-NDA-Requests: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3668661 (10Pablo-WMDE) @Dzahn Yes, I am asking for +2 permissions to be able to perform code review for my team mates. Phrased the ticket the way it was recommended... [08:56:13] !log disabled puppet on 'stat100[5-6]*,snapshot100[1,5-7]*,dataset1001*' to manually recover from nfsd stuck - T169680 [08:56:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:56:20] T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680 [08:59:08] (03PS1) 10Marostegui: db2079.yaml: Enable icinga notifications [puppet] - 10https://gerrit.wikimedia.org/r/383090 (https://phabricator.wikimedia.org/T170662) [08:59:47] (03CR) 10Marostegui: [C: 032] db-codfw.php: Repool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383089 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [09:00:07] (03PS9) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 [09:01:20] (03Merged) 10jenkins-bot: db-codfw.php: Repool db2075 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383089 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [09:01:34] (03CR) 10jenkins-bot: db-codfw.php: Repool db2075 [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/383089 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [09:01:47] (03CR) 10Marostegui: [C: 032] "This looks good: https://puppet-compiler.wmflabs.org/compiler02/8232/" [puppet] - 10https://gerrit.wikimedia.org/r/383090 (https://phabricator.wikimedia.org/T170662) (owner: 10Marostegui) [09:02:20] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Repool db2075 - T170662 (duration: 00m 47s) [09:02:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:02:27] T170662: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662 [09:04:46] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668690 (10jcrespo) @Marostegui Should we do a master failover? We had planned it anyway- this is a good excuse. I know the answer is "yes, if we find the time" :-) Maybe I can take care of... [09:08:05] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668714 (10Marostegui) This is not a master :-) But yes, s6 needs a master failover anyways to decommission db2028 (s6 master) and to finish T169501. There are no alters running on s6 or sch... [09:12:29] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668722 (10jcrespo) > This is not a master :-) Oh! Easier, then- so just pooling one new server in preparation for the failover. [09:19:20] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668742 (10jcrespo) But this doesn't solve our issue, db2038 is not supposed to go away :-/ [09:21:07] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668748 (10Marostegui) >>! 
In T177720#3668742, @jcrespo wrote: > But this doesn't solve our issue, db2038 is not supposed to go away :-/ No, only hosts <2030 are supposed to go away. That i... [09:21:20] (03CR) 10Giuseppe Lavagetto: "https://puppet-compiler.wmflabs.org/compiler03/8233/ shows no significant diffs, the only one being one dependency being added twice in th" [puppet] - 10https://gerrit.wikimedia.org/r/382675 (owner: 10Giuseppe Lavagetto) [09:34:52] !log installing vim security updates [09:34:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:40:32] PROBLEM - HHVM rendering on mw2127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:41:21] RECOVERY - HHVM rendering on mw2127 is OK: HTTP OK: HTTP/1.1 200 OK - 76971 bytes in 0.428 second response time [09:43:14] 10Operations, 10Puppet, 10DBA: Switch databases to the future parser - https://phabricator.wikimedia.org/T172498#3668792 (10Joe) 05Open>03Resolved [09:43:17] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: Switch all hosts to the future parser - https://phabricator.wikimedia.org/T171704#3668793 (10Joe) [09:43:46] !log restarting Jenkins. Deadlock in SSHSlave plugin that causes memory to leak quite rapidly - T177749 [09:43:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:43:52] T177749: integration-slave-docker-1001 deadlocked since Friday Oct 6th ~ 19:00 utc - https://phabricator.wikimedia.org/T177749 [09:43:56] (03CR) 10Ema: [C: 031] "LGTM, one nit." (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/382675 (owner: 10Giuseppe Lavagetto) [09:44:40] 10Operations, 10Puppet, 10cloud-services-team, 10User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3668801 (10Joe) [09:45:42] 10Operations, 10LDAP-Access-Requests, 10WMF-NDA-Requests: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3668808 (10Addshore) >>! 
In T177599#3666657, @Dzahn wrote: > Hi, re: "to be able to contribute to AdvancedSearch" does this mean you want the +2 permissions to be... [09:45:49] 10Operations, 10LDAP-Access-Requests, 10WMF-NDA-Requests, 10User-Addshore: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3668809 (10Addshore) [09:47:25] 10Operations, 10Puppet, 10cloud-services-team, 10User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3668818 (10Joe) [09:48:15] 10Operations, 10Puppet, 10DBA: Switch databases to the future parser - https://phabricator.wikimedia.org/T172498#3668819 (10jcrespo) But we didn't check the parameter changes, did you do that or did it finally work? Why resolve now? [09:50:39] !log Drop moodbar_feedback and moodbar_feedback_response from s5 - T153033 [09:50:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:50:47] T153033: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033 [09:51:38] !log test setting bgp med on lvs3001/3003 T165584 [09:51:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:51:43] T165584: Deploy pybal with BGP MED support (for primary/backup) in production - https://phabricator.wikimedia.org/T165584 [09:51:48] !log rebooting dataset1001 for NFS stuck - T169680 [09:51:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:51:55] T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680 [09:51:55] Can I deploy for T171027 now? [09:51:55] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis - https://phabricator.wikimedia.org/T171027 [09:52:17] tell me if anything is going on right now so I should wait until later [09:52:24] jynus: marostegui FYI also [09:53:41] hashar: around? [09:54:06] jynus: yes! 
[09:54:28] hashar: I can take full responsability of the patch Amir1 prepared [09:54:40] but I assume releng has veto power [09:55:36] I think commonswiki and ruwiki are broken, and that would temporarilly fix them (with additonal database work) [09:56:11] is that for https://phabricator.wikimedia.org/T171027 ? [09:56:19] yes [09:56:49] the title is a bit missleading, hopefuly you may have seen my latest comments [09:56:53] well I cant really tell how safe deploying whatever patch is :] Seems it is database related? [09:57:30] https://phabricator.wikimedia.org/T171027#3667090 [09:57:40] This is the patch: https://gerrit.wikimedia.org/r/#/c/383093/ [09:57:58] and the patch seems ok by wikidata people and Reedy [09:58:01] ah that is to fix a fix that fix a fix :D [09:58:27] yeah I guess it is all good [09:59:03] might want to have the patch in the master branch as well, so it get deployed later this week as part of the train (when we bump wmf version) [09:59:59] (03CR) 10Alexandros Kosiaris: [C: 032] bacula: remove ganglia backup sets [puppet] - 10https://gerrit.wikimedia.org/r/382914 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [10:00:03] !log rebooting snapshot1005 for NFS stuck - T169680 [10:00:09] (03PS2) 10Alexandros Kosiaris: bacula: remove ganglia backup sets [puppet] - 10https://gerrit.wikimedia.org/r/382914 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [10:00:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:00:14] Amir1: jynus: let me know if you need assistance to deploy it? 
:) [10:00:14] T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680 [10:00:15] (03CR) 10Alexandros Kosiaris: [V: 032 C: 032] bacula: remove ganglia backup sets [puppet] - 10https://gerrit.wikimedia.org/r/382914 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [10:00:28] hashar: sure [10:00:41] I trust Amir1, just want to be responsible for it too [10:00:47] (03PS5) 10Ema: pybal: BGP MED configuration [puppet] - 10https://gerrit.wikimedia.org/r/380516 (https://phabricator.wikimedia.org/T165584) [10:01:05] as the promoter of the functionality rollback [10:01:22] (03CR) 10Ema: [V: 032 C: 032] pybal: BGP MED configuration [puppet] - 10https://gerrit.wikimedia.org/r/380516 (https://phabricator.wikimedia.org/T165584) (owner: 10Ema) [10:02:20] I am not sure replicating the Wikidata changes to all wiki clients is any of a good idea :D [10:02:25] that seems indeed rather spammy [10:03:34] (03CR) 10Mobrovac: cassandra: move machines from restbase to restbase_ng cluster (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/382506 (https://phabricator.wikimedia.org/T177501) (owner: 10Eevans) [10:05:57] Amir1: also when ever it get disabled, you will probably want to craft an announce of some sort? 
[10:06:24] Amir1: as I understand it, that means the wikibase events will no more show up in the wiki RecentChanges [10:06:36] (eg https://commons.wikimedia.org/wiki/Special:RecentChanges?hidecategorization=1&limit=50&days=7&urlversion=2&hidelog=1&hidenewpages=1&hidepageedits=1 ) [10:07:10] 10Operations, 10Puppet, 10cloud-services-team, 10User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3668864 (10Joe) [10:13:15] (03PS10) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 [10:14:51] (03PS1) 10Gehel: logstash: remove references to old logstash servers for decommissioning [puppet] - 10https://gerrit.wikimedia.org/r/383096 (https://phabricator.wikimedia.org/T175830) [10:15:45] !log rebooting snapshot1001 for NFS stuck - T169680 [10:15:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:15:52] T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680 [10:16:51] hashar: I asked it to be announced using the weekly community message [10:17:07] (03PS1) 10Gehel: lgostash: all log producers need to use the logstash LVS endpoint [puppet] - 10https://gerrit.wikimedia.org/r/383097 (https://phabricator.wikimedia.org/T175242) [10:18:23] (03CR) 10Elukey: "nit while seeing the commit msg on irc: s/lgostash/logstash :)" [puppet] - 10https://gerrit.wikimedia.org/r/383097 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [10:18:25] (03CR) 10Giuseppe Lavagetto: cache: convert kafka::webrequest to profile (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/382675 (owner: 10Giuseppe Lavagetto) [10:18:58] (03PS2) 10Gehel: logstash: all log producers need to use the logstash LVS endpoint [puppet] - 10https://gerrit.wikimedia.org/r/383097 (https://phabricator.wikimedia.org/T175242) [10:19:29] hashar: yes [10:19:35] (03CR) 10Gehel: "@elukey: thanks! 
I don't know how many times I have done the exact same typo. I should learn, but it seems I don't..." [puppet] - 10https://gerrit.wikimedia.org/r/383097 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [10:19:35] (sorry, was at the daily) [10:21:16] !log rebooting snapshot1006 for NFS stuck - T169680 [10:21:17] (03CR) 10WMDE-leszek: "Sorry, forgot to hit "Post". Works good, thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/382467 (owner: 10WMDE-leszek) [10:21:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:21:23] T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680 [10:21:58] !log Drop moodbar_feedback and moodbar_feedback_response from s7 - T153033 [10:22:03] (03PS1) 10Gehel: maps: all log producers need to use the logstash LVS endpoint [puppet] - 10https://gerrit.wikimedia.org/r/383098 (https://phabricator.wikimedia.org/T175242) [10:22:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:22:04] T153033: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033 [10:22:22] (03CR) 10Giuseppe Lavagetto: [C: 032] cache: convert kafka::webrequest to profile [puppet] - 10https://gerrit.wikimedia.org/r/382675 (owner: 10Giuseppe Lavagetto) [10:24:24] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668922 (10jcrespo) db2028 RAID claims it has 558.911 GB disks. db2038 RAID claims it has 600GB, maybe the actual size is the same? In that case the failover could actually help. 
[10:24:54] (03CR) 10Gehel: "puppet compiler agrees: https://puppet-compiler.wmflabs.org/compiler02/8234/" [puppet] - 10https://gerrit.wikimedia.org/r/383098 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [10:25:00] (03PS2) 10Gehel: maps: all log producers need to use the logstash LVS endpoint [puppet] - 10https://gerrit.wikimedia.org/r/383098 (https://phabricator.wikimedia.org/T175242) [10:25:06] volans: crond is stopped on snapshot1007? [10:25:14] first entry in fatalmonitor: 966 Syntax Warning: Invalid Font Weight [10:25:18] just sayin [10:25:19] hoo: yes, we're rebooting it [10:25:26] volans: Ok, cool [10:25:30] give us a few minutes and will be back online properly [10:25:34] (03CR) 10Gehel: [C: 032] maps: all log producers need to use the logstash LVS endpoint [puppet] - 10https://gerrit.wikimedia.org/r/383098 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [10:25:39] sorry about the trouble, see T169680 [10:26:00] volans: I know… was just wondering [10:26:12] is it ok to re-start dumps afterwards [10:26:33] we have weekly dumps that usually start 3am mondays [10:27:54] hoo: sure, everything should be fine after, dataset1001 is already rebooted and ok, do you know how to restart them or is something that Ariel takes care of usually? [10:28:08] I can restart them [10:30:22] (03PS2) 10Giuseppe Lavagetto: profile::docker::registry: conform to current style guide [puppet] - 10https://gerrit.wikimedia.org/r/381987 [10:30:28] 10Operations, 10ops-codfw, 10DBA: db2038 two disks with predictive failure - https://phabricator.wikimedia.org/T177720#3668930 (10Marostegui) By looking at both hosts' disks serial numbers, they are both 600GB 15k SAS 3.5" so maybe we can exchange them. @Papaul probably knows better if we can exchange those... 
[10:30:48] !log rebooting snapshot1007 for NFS stuck - T169680 [10:30:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:30:56] T169680: NFS on dataset1001 overloaded, high load on the hosts that mount it - https://phabricator.wikimedia.org/T169680 [10:31:08] Amir1: Syntax Warning: Invalid Font Weight <--- you can ignore that one [10:31:20] okay [10:31:29] lots of revision query errors? [10:31:34] Amir1: that comes from HHVM stdout or stderr. The content comes from some wfShellExec() that does not capture stderr [10:32:09] oh, I was looking at the wrong range- ignore me [10:32:13] there are a bunch of slow ApiQueryRecentChanges::run [10:32:25] hashar: those are the ones we want to fix :-) [10:32:28] jynus: I'm making the patch for the config, can you tell me which wikis this should be stopped? commonswiki, ruwiki, and? [10:32:29] \o/ [10:32:37] hashar: "fix" [10:32:54] Amir1: immediately commons an ru [10:33:02] RECOVERY - puppet last run on snapshot1006 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [10:33:08] the other can wait for a more thorough evaluation [10:33:21] I think some other on s3 is affected [10:33:31] but has so little traffic, it is more difficult to see [10:33:37] jynus: also by the time of deployment, the new rc records won't stop because there are lots of them in the jobqueue already, they fully stop after two days (but the rate would immediately drop) [10:33:45] I know [10:33:53] also the rows will not disappear [10:34:04] and the tables will not shrink in size [10:34:09] that will be my "job" [10:34:24] yeah, I think removing them is one thing you need to do, I probably can help if you want me to [10:34:30] yes [10:34:47] I need to make sure I only delete the right type/source [10:35:12] RECOVERY - puppet last run on snapshot1001 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures [10:35:23] and of course we canno delete too fast [10:35:32] !log Started 
dumpwikidatajson.sh on snapshot1007 after T169680 disruptions [10:35:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:36:28] hoo: I would have pinged once up, is still running the first puppet afeter the reboot [10:36:47] the commons page that hit /wiki/Special:RecentChangesLinked still seems to timeout [10:36:48] ;) [10:36:49] hashar: ApiQueryRecentChanges::run (not all, only some) and most SpecialWatchlist::doMainQuery [10:36:59] eg https://commons.wikimedia.org/wiki/Special:RecentChangesLinked/Category:Author_died_more_than_100_years_ago_public_domain_images [10:37:17] they show up in /srv/mw-log/exception.log [10:37:22] (03CR) 10Giuseppe Lavagetto: [C: 032] profile::docker::registry: conform to current style guide [puppet] - 10https://gerrit.wikimedia.org/r/381987 (owner: 10Giuseppe Lavagetto) [10:37:35] yep [10:37:46] started the deploy [10:37:48] user errors are many [10:37:49] volans: Sorry :S [10:37:59] maintenance probles are even more, hashar [10:38:11] RECOVERY - puppet last run on snapshot1007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [10:38:17] hoo: all done, seems good and the partition is mounted at boot, so should be good anyway :) [10:38:33] !log ladsgroup@tin Synchronized php-1.31.0-wmf.2/extensions/Wikidata/extensions/Wikibase/client: Re-instate $wgWBClientSettings[injectRecentChanges] (T171027) (duration: 00m 56s) [10:38:36] hashar: the thing is, it started so slowly [10:38:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:38:38] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis - https://phabricator.wikimedia.org/T171027 [10:38:49] that it didn't got much attention [10:40:03] but again, this is not a fix, this is a rollback; fix will have to come later [10:41:14] (03PS1) 10Ladsgroup: Disable injecing RC records for commonswiki and ruwiki [mediawiki-config] 
- 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) [10:41:46] one suggestion that I had was to dispatch only unpatrolled edits because the main reason to inject rc records is to fight vandalism in wikidata [10:42:06] I mentioned the posiblity of not materializing them but on query time [10:42:28] there are many thing we can do [10:42:32] RECOVERY - puppet last run on stat1006 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [10:42:48] the problem is not the functionality, but the current implementation under the current usage [10:43:21] I assume nobody predicted how many pages were going to be inserted [10:44:00] but this is ok, we rollback first, we rething alternatives [10:44:50] I confirm that the change is deployed (I logged in into one random node and manually checked, hope that's fine) [10:45:11] question, was it enabled by deafult? [10:45:18] time to deploy the config change [10:45:22] I thought I saw it was disabled by default [10:45:34] let me recheck [10:45:39] https://gerrit.wikimedia.org/r/#/c/383093/1/extensions/Wikibase/client/config/WikibaseClient.default.php [10:45:51] o cool [10:45:58] either it was changed or I missread it [10:47:01] RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [10:47:04] (03CR) 10Ladsgroup: [C: 032] Disable injecing RC records for commonswiki and ruwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) (owner: 10Ladsgroup) [10:47:17] (03CR) 10Jcrespo: [C: 031] Disable injecing RC records for commonswiki and ruwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) (owner: 10Ladsgroup) [10:47:28] jynus: okay, is there a way to monitor jobs coming in? 
[10:47:36] yes [10:47:48] although I would do it even easier [10:47:59] do a change an monitor results [10:48:35] (03Merged) 10jenkins-bot: Disable injecing RC records for commonswiki and ruwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) (owner: 10Ladsgroup) [10:48:45] (03CR) 10jenkins-bot: Disable injecing RC records for commonswiki and ruwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) (owner: 10Ladsgroup) [10:48:46] what does it trigger the injection? is it an HTTP requests? [10:49:05] does the edition itself create the job or something else? [10:49:31] edition creates a job [10:49:42] wikibase-injectRCrecord or something like that [10:49:43] ok, so new editions should not be refected [10:49:59] *reflected, without needing a jobque restart, right? [10:50:52] !log ladsgroup@tin Synchronized wmf-config/Wikibase-production.php: Disable injecing RC records for commonswiki and ruwiki (T171027) (duration: 00m 47s) [10:50:58] If that is true, I say we test as a final user [10:50:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:50:59] T171027: "2062 Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis - https://phabricator.wikimedia.org/T171027 [10:51:17] yes [10:51:21] that's correct [10:51:33] the change for disabling it is deployed as well [10:51:49] let's watch a random page, do a trivial edit on wikidata [10:52:05] they should be connected somehow [10:52:14] and see if the functionality still gets the edit [10:52:23] or if the row gets inserted on rcs [10:53:54] jynus: https://www.wikidata.org/w/index.php?title=Q20820239&diff=574470637&oldid=499491915 [10:54:37] !log installing expect updates from stretch 9.2 point release [10:54:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:55:06] "Show Wikidata edits in 
recent changes" on preferences, right? [10:56:14] jynus: you can do it in the watchlist too [10:56:22] a, ok [10:56:23] no need to enable it in preferences, it makes it default [10:56:56] to be fair, I do not know too much about the functionality to see which spefic changed would show or nort [10:56:57] I don't get my change in wikidata watchlist [10:57:01] so it stopped it seems [10:57:05] I don't either [10:57:14] https://commons.wikimedia.org/wiki/Category:Bieberer_Aussichtsturm [10:57:16] I will check the recentchanges table [10:57:26] for more direct evaluation [10:57:26] watch this page and then go to your watchlist in commons [10:57:56] https://commons.wikimedia.org/wiki/Special:RecentChanges?hidebots=1&hidecategorization=1&limit=50&days=7&urlversion=2 [10:58:01] also we have this too [10:58:33] do we have examples ther of wikidata-rcs before the deploy? [10:58:53] https://commons.wikimedia.org/wiki/Special:RecentChanges?hidecategorization=1&limit=50&days=7&urlversion=2&hidelog=1&hidepageedits=1&hidenewpages=1 [10:58:59] yes, I can see now [10:59:01] This is the best way to check [10:59:04] did the same manually :-) [10:59:32] seems stopped since 10:56 [10:59:56] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1090" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383101 [10:59:58] which is about 6 minutes after your deploy [11:00:01] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1090" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383101 [11:00:03] which makes sense [11:00:20] wow, lots of changes [11:00:22] before [11:00:41] I need to go now, someone is waiting for me, I'll be back in on hour [11:00:43] o/ [11:00:47] bye! [11:00:56] thanks! 
[11:01:56] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1090" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383101 (owner: 10Marostegui) [11:03:27] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1090" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383101 (owner: 10Marostegui) [11:03:41] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1090" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383101 (owner: 10Marostegui) [11:04:31] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1090 - T174509 (duration: 00m 46s) [11:04:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:04:38] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [11:06:13] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1092" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383102 [11:06:16] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1092" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383102 [11:07:07] (03PS5) 10Giuseppe Lavagetto: role::cache::base: convert to profile [1/2] [puppet] - 10https://gerrit.wikimedia.org/r/382683 [11:07:57] (03CR) 10Reedy: "Surely that disables it for everywhere else but ruwiki and commonswiki?" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) (owner: 10Ladsgroup) [11:08:36] !log installing whois updates from stretch 9.2 point release [11:08:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:08:47] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1092" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383102 (owner: 10Marostegui) [11:10:09] jynus: Was that patch right? Unless I'm not thinking clearly because of tiredness and this headache thing... 
[11:10:19] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1092" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383102 (owner: 10Marostegui) [11:10:21] which one? [11:10:46] if ( !in_array( $wgDBname, [ 'commonswiki', 'ruwiki' ] ) ) { [11:10:46] $wgWBClientSettings['injectRecentChanges'] = false; [11:10:56] oh [11:11:12] Unless I'm completely out of it.. [11:11:16] but [11:11:18] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1092 - T174509 (duration: 00m 47s) [11:11:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:11:24] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [11:11:34] it affected commonswiki as expected [11:11:49] That can't be right [11:11:51] maybe in_array returns 0 [11:12:50] I thought so [11:12:51] reedy@tin:~$ mwscript eval.php commonswiki [11:12:51] > var_dump( $wgWBClientSettings['injectRecentChanges'] ); [11:12:51] bool(true) [11:13:27] indeed on ruwiki they continue [11:13:46] and also on commons [11:13:51] yes, it is wrong [11:13:52] lol [11:14:01] (03PS1) 10Reedy: Revert "Disable injecing RC records for commonswiki and ruwiki" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383103 [11:14:26] Well, doesn't really want reverting does it [11:14:31] It wants the ! removing [11:14:33] I do see https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/Wikibase.php#L205 [11:14:44] Which disables the inject on the repo wikis [11:14:51] (03Abandoned) 10Reedy: Revert "Disable injecing RC records for commonswiki and ruwiki" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383103 (owner: 10Reedy) [11:15:05] that is on wikidata only [11:15:12] yeah, I know [11:15:17] which is expected [11:15:23] the patch needs to be inverse [11:15:26] Which, of course... 
Was a noop after the code was stripped out [11:15:28] Which is amusing [11:15:33] removed as "not used" [11:15:38] my ass it wasn't used [11:15:48] ? [11:15:53] (03CR) 10Ladsgroup: "You're right. I will make a patch in one hour, i'm in lunch break atm. Or you can jyst remove the "!" Sign and depliy" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383100 (https://phabricator.wikimedia.org/T171027) (owner: 10Ladsgroup) [11:16:23] (03PS1) 10Reedy: Remove ! from Ifec94445b09e9fdc3a9f44d6393e992e66fd1226 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383104 [11:16:35] oh feck off portals [11:16:43] (03PS2) 10Reedy: Remove ! from Ifec94445b09e9fdc3a9f44d6393e992e66fd1226 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383104 [11:16:49] (03CR) 10Reedy: [C: 032] Remove ! from Ifec94445b09e9fdc3a9f44d6393e992e66fd1226 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383104 (owner: 10Reedy) [11:17:53] that is my fault, I didn't see the !, and because how batched it is, it wasn't caught immediately when tested [11:18:17] heh [11:18:19] to be fair [11:18:20] (03Merged) 10jenkins-bot: Remove ! 
from Ifec94445b09e9fdc3a9f44d6393e992e66fd1226 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383104 (owner: 10Reedy) [11:18:27] I wouldn't mind disabling it everywhere [11:18:43] Let's fix one thing at once ;) [11:18:46] yes [11:18:56] let's deploy your ammend first [11:19:41] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1083" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383105 [11:19:44] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1083" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383105 [11:19:45] !log reedy@tin Synchronized wmf-config/Wikibase-production.php: Actually disable injecting RC everywhere but commonswiki and ruwiki (duration: 00m 46s) [11:19:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:20:28] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1092" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383102 (owner: 10Marostegui) [11:20:30] (03CR) 10jenkins-bot: Remove ! from Ifec94445b09e9fdc3a9f44d6393e992e66fd1226 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383104 (owner: 10Reedy) [11:20:33] 10Operations, 10Continuous-Integration-Infrastructure, 10Jenkins, 10Release-Engineering-Team (Kanban): Upgrade jenkins to 2.73.1 (new lts release) - https://phabricator.wikimedia.org/T168644#3669025 (10Paladox) >>! In T168644#3660984, @hashar wrote: > For #operations , we would need the Jenkins 2.73.1 Debi... 
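[editor's note] The inverted condition found at 11:10 and corrected by the 11:19 sync can be modelled in a few lines. This is only a minimal Python sketch of the PHP check quoted in the log, not the real wmf-config code. The function names are hypothetical, and the default of True is an assumption based on the `bool(true)` that eval.php returned for commonswiki.

```python
# Wikis on which RC injection was supposed to be switched off (from the log).
TARGET_WIKIS = ["commonswiki", "ruwiki"]


def inject_rc_as_deployed(db_name: str) -> bool:
    """Model of the first patch: the stray negation inverts the intent."""
    inject = True  # assumed default, see the eval.php output in the log
    if db_name not in TARGET_WIKIS:  # mirrors !in_array( $wgDBname, [...] )
        inject = False
    return inject


def inject_rc_as_fixed(db_name: str) -> bool:
    """Model of the follow-up patch: the '!' is removed."""
    inject = True  # same assumed default
    if db_name in TARGET_WIKIS:  # mirrors in_array( $wgDBname, [...] )
        inject = False
    return inject


if __name__ == "__main__":
    for wiki in ["commonswiki", "ruwiki", "enwiki"]:
        print(wiki, inject_rc_as_deployed(wiki), inject_rc_as_fixed(wiki))
```

Under these assumptions the first version leaves injection on for exactly the two wikis it was meant to protect and turns it off everywhere else, which matches the observation that records kept arriving on ruwiki and commons.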
[11:22:14] !log Optimize ores_classification table on db1083 - T159753 [11:22:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:22:21] T159753: Concerns about ores_classification table size on enwiki - https://phabricator.wikimedia.org/T159753 [11:22:34] !log installing apt updates from stretch 9.2 point release [11:22:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:23:57] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1083" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383105 (owner: 10Marostegui) [11:25:26] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1083" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383105 (owner: 10Marostegui) [11:26:11] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1083" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383105 (owner: 10Marostegui) [11:26:36] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1083 - T174509 (duration: 00m 46s) [11:26:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:26:42] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [11:29:05] 10Operations, 10OCG-General, 10Readers-Community-Engagement, 10Epic, and 3 others: [EPIC] (Proposal) Replicate core OCG features and sunset OCG service - https://phabricator.wikimedia.org/T150871#3669033 (10ovasileva) >>! In T150871#3665353, @Nemo_bis wrote: > I'm not attached to OCG, but https://www.media... [11:31:32] 10Operations, 10LDAP-Access-Requests, 10WMF-NDA-Requests, 10User-Addshore: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3669034 (10Tobi_WMDE_SW) @Dzahn exactly. The ldap/wmde group is used to control access of WMDE employees to repositories that WMDE has ownership... 
[11:32:04] (03PS1) 10Reedy: Disable inject recent changes on all client wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383107 (https://phabricator.wikimedia.org/T171027) [11:32:27] (03PS2) 10Reedy: Disable inject recent changes on all client wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383107 (https://phabricator.wikimedia.org/T171027) [11:32:30] (03CR) 10Reedy: "stupid portals" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383107 (https://phabricator.wikimedia.org/T171027) (owner: 10Reedy) [11:36:10] Amir1: jynus: Reedy the wikidata/rc thing is still going isn't it ? [11:36:31] hashar: define still going? [11:36:46] by that I mean, do you still need to deploy more patches/fixes ? [11:36:49] I would like to upgrade jenkins [11:37:01] which would cause CI to pause for a few :] [11:37:52] Reedy: do you want to deploy that now or just preparing? [11:38:10] I don't mind deploying it now [11:38:16] We might aswell stem the flow completely [11:38:24] I do not need all now [11:38:41] I would send an email to wikitech and see the response [11:38:53] for ruwiki and commonswiki [11:39:03] but I will not block you for the others [11:39:06] (03PS2) 10Zfilipin: Fix amwikimedia site name [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382463 (https://phabricator.wikimedia.org/T176042) (owner: 10Ladsgroup) [11:39:25] whateve happens, let's not block hashar much :-) [11:40:07] Certainly either way is fine... [11:41:00] let's wait, and reevaluate tomorow with more people [11:41:06] jynus: Let's leave it for now, and we can deploy later if necessary [11:41:07] heh [11:41:19] hashar: go ahead! 
[11:41:27] ok [11:42:40] !log Upgrading Jenkins - T168644 [11:42:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:42:47] T168644: Upgrade jenkins to 2.73.1 (new lts release) - https://phabricator.wikimedia.org/T168644 [11:45:21] PROBLEM - DPKG on ms-be2039 is CRITICAL: DPKG CRITICAL dpkg reports broken packages [11:45:43] 10Operations, 10Continuous-Integration-Infrastructure, 10Jenkins, 10Release-Engineering-Team (Kanban): Upgrade jenkins to 2.73.1 (new lts release) - https://phabricator.wikimedia.org/T168644#3669089 (10hashar) Got a bunch of: ``` WARNING: [hudson.plugins.sshslaves.verifiers.TrileadVersionSupportManager ge... [11:46:03] jynys: Reedy: jenkins is back :] [11:46:26] 10Operations, 10Wikimedia-Mailing-lists: Have a conversation about migrating from GNU Mailman 2.1 to GNU Mailman 3.0 - https://phabricator.wikimedia.org/T52864#1243534 (10MoritzMuehlenhoff) JFTR, the first maiman 3 package has been uploaded to Debian now: https://packages.qa.debian.org/m/mailman3-core.html [11:48:21] RECOVERY - DPKG on ms-be2039 is OK: All packages OK [11:49:00] !log joal@tin Started deploy [analytics/refinery@c3812c2]: Analytics regular weekly deploy [11:49:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:31] 10Operations, 10Continuous-Integration-Infrastructure, 10Jenkins, 10Release-Engineering-Team (Kanban): Upgrade jenkins to 2.73.1 (new lts release) - https://phabricator.wikimedia.org/T168644#3669096 (10hashar) 05Open>03Resolved a:03hashar [11:50:31] PROBLEM - HHVM rendering on mw2211 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:51:21] RECOVERY - HHVM rendering on mw2211 is OK: HTTP OK: HTTP/1.1 200 OK - 76935 bytes in 0.303 second response time [11:59:24] !log joal@tin Finished deploy [analytics/refinery@c3812c2]: Analytics regular weekly deploy (duration: 10m 24s) [11:59:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:08:36] 10Operations, 
10Wikimedia-Mailing-lists: Have a conversation about migrating from GNU Mailman 2.1 to GNU Mailman 3.0 - https://phabricator.wikimedia.org/T52864#3669128 (10MarcoAurelio) @MoritzMuehlenhoff Does that help or ease the suggested migration? Thanks. [12:26:39] !log restart pybal on esams load balancers to pick up bgp-med config change T165584 [12:26:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:26:46] T165584: Deploy pybal with BGP MED support (for primary/backup) in production - https://phabricator.wikimedia.org/T165584 [12:26:58] jynus: Reedy, I'm back from the lunch [12:27:08] what's the plan, disable only these two or all? [12:29:41] I first disable it on the two first and then we can talk about all [12:31:09] !log restart pybal on ulsfo load balancers to pick up bgp-med config change T165584 [12:31:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:32:01] oh, you already did [12:32:02] thanks [12:38:03] !log restart pybal on codfw load balancers to pick up bgp-med config change T165584 [12:38:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:38:09] T165584: Deploy pybal with BGP MED support (for primary/backup) in production - https://phabricator.wikimedia.org/T165584 [12:47:29] !log restart pybal on eqiad load balancers to pick up bgp-med config change T165584 [12:47:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:47:34] T165584: Deploy pybal with BGP MED support (for primary/backup) in production - https://phabricator.wikimedia.org/T165584 [12:54:02] 10Operations, 10Pybal, 10Traffic, 10netops, 10Patch-For-Review: Deploy pybal with BGP MED support (for primary/backup) in production - https://phabricator.wikimedia.org/T165584#3270574 (10ema) All load balancers are now using BGP MED. Primaries send the MED attribute with a value of 0, backups send 100.... 
[12:56:47] jouncebot: next [12:56:47] In 0 hour(s) and 3 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T1300) [12:58:54] (03PS1) 10Zoranzoki21: New throttle rule [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383119 (https://phabricator.wikimedia.org/T177737) [12:59:45] (03PS2) 10Zoranzoki21: New throttle rule [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383119 (https://phabricator.wikimedia.org/T177737) [13:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Time to snap out of that daydream and deploy European Mid-day SWAT(Max 8 patches). Get on with it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T1300). [13:00:04] Amir1, MatmaRex, and dcausse: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [13:00:16] I can SWAT today [13:00:21] (03PS2) 10DCausse: Upgrade plugins (official LTR RC1, extra 5.5.2.2, highlighter 5.5.2.1) [software/elasticsearch/plugins] - 10https://gerrit.wikimedia.org/r/381798 [13:00:23] o/ [13:00:43] hi [13:00:48] o/ [13:00:51] (03PS1) 10Hashar: Jenkins now supports our MAC/KEX algorithms [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) [13:01:37] looks like CI is busy, deployments might be slow :| [13:01:50] HI. Please deploy patch: https://gerrit.wikimedia.org/r/#/c/383119 [13:01:51] Urgent is [13:02:12] Amir1: reviewing 382463 [13:02:30] Zoranzoki21: how urgent? should I deploy it right now, or during this window? [13:02:41] Now [13:03:12] Zoranzoki21: ok, reviewing it, please add it to the calenar [13:03:19] Ok.. 
I will add [13:04:28] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383119 (https://phabricator.wikimedia.org/T177737) (owner: 10Zoranzoki21) [13:04:55] Thank you.. I added in calendar [13:05:39] Zoranzoki21: thanks, waiting for CI and will deploy it as soon as it is done [13:05:47] (03PS1) 10Ema: prometheus: re-introduce IPVS aggregation rule [puppet] - 10https://gerrit.wikimedia.org/r/383121 [13:05:50] Okay [13:07:57] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382463 (https://phabricator.wikimedia.org/T176042) (owner: 10Ladsgroup) [13:08:05] (03Merged) 10jenkins-bot: New throttle rule [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383119 (https://phabricator.wikimedia.org/T177737) (owner: 10Zoranzoki21) [13:08:15] (03CR) 10jenkins-bot: New throttle rule [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383119 (https://phabricator.wikimedia.org/T177737) (owner: 10Zoranzoki21) [13:09:53] (03Merged) 10jenkins-bot: Fix amwikimedia site name [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382463 (https://phabricator.wikimedia.org/T176042) (owner: 10Ladsgroup) [13:10:13] !log zfilipin@tin Synchronized wmf-config/throttle.php: SWAT: [[gerrit:383119|New throttle rule (T177737)]] (duration: 00m 47s) [13:10:13] (03CR) 10jenkins-bot: Fix amwikimedia site name [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382463 (https://phabricator.wikimedia.org/T176042) (owner: 10Ladsgroup) [13:10:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:10:21] T177737: Requesting temporary lift of IP cap on 2017-10-09 - https://phabricator.wikimedia.org/T177737 [13:10:32] Zoranzoki21: deployed, thanks for deploying with #releng ;) [13:10:55] Amir1: want to test at mwdebug or should I deploy directly? [13:11:16] Thank you! And one question. You only add +2 and then software work automaticly? 
[13:11:42] zeljkof: doesn't matter much [13:11:52] it's a very tiny change [13:12:02] Amir1: direct deploy then? [13:12:15] yeah [13:12:25] Zoranzoki21: for throttle changes, yes, as far as I know [13:12:31] Amir1: deploying... [13:13:22] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:382463|Fix amwikimedia site name (T176042)]] (duration: 00m 47s) [13:13:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:13:28] T176042: Create amwikimedia - https://phabricator.wikimedia.org/T176042 [13:13:53] Please deploy this too: https://gerrit.wikimedia.org/r/#/c/375765/ [13:14:05] Amir1: deployed, please check, https://am.wikimedia.org/wiki/%D5%8E%D5%AB%D6%84%D5%AB%D5%B4%D5%A5%D5%A4%D5%AB%D5%A1_%D5%80%D5%A1%D5%B5%D5%A1%D5%BD%D5%BF%D5%A1%D5%B6 looks good to me, but then I don't know that script at all [13:14:34] MatmaRex: reviewing 382749 [13:14:55] I will add to whitelist dvorapa in calendar [13:15:27] zeljkof: looks okay [13:15:30] (03CR) 10Hashar: [V: 031] "I have cherry picked it on deployment-prep and integration. Puppet did change sshd_config with:" [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [13:16:52] Amir1: thanks for releasing with #releng! 
;) [13:17:00] thank you :) [13:17:06] I added in calendar for current SWAT patch https://gerrit.wikimedia.org/r/#/c/375765/ Please, if you can deploy this too [13:17:07] (03CR) 10Paladox: [C: 031] Jenkins now supports our MAC/KEX algorithms [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [13:17:55] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382749 (https://phabricator.wikimedia.org/T176647) (owner: 10Bartosz Dziewoński) [13:18:00] (03PS2) 10Ema: cache::upload: enable nginx-lua-prometheus [puppet] - 10https://gerrit.wikimedia.org/r/382663 [13:18:04] (03CR) 10Zfilipin: Disallow most file types from upload to enwikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382749 (https://phabricator.wikimedia.org/T176647) (owner: 10Bartosz Dziewoński) [13:18:08] (03PS2) 10Zfilipin: Disallow most file types from upload to enwikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382749 (https://phabricator.wikimedia.org/T176647) (owner: 10Bartosz Dziewoński) [13:18:14] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382749 (https://phabricator.wikimedia.org/T176647) (owner: 10Bartosz Dziewoński) [13:18:26] (03CR) 10Ema: [V: 032 C: 032] cache::upload: enable nginx-lua-prometheus [puppet] - 10https://gerrit.wikimedia.org/r/382663 (owner: 10Ema) [13:18:39] (03PS2) 10Hashar: Jenkins now supports our MAC/KEX algorithms [labs] [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) [13:18:41] (03PS1) 10Hashar: Jenkins now supports our MAC/KEXY algorithms [prod] [puppet] - 10https://gerrit.wikimedia.org/r/383122 (https://phabricator.wikimedia.org/T103351) [13:19:27] (03PS8) 10Zoranzoki21: Enable Extension:Newsletter on hewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/381537 (https://phabricator.wikimedia.org/T177151) [13:20:05] (03CR) 10Giuseppe Lavagetto: 
"https://puppet-compiler.wmflabs.org/compiler02/8236/ the only difference is a require changed to an explicit declaration at the start of t" [puppet] - 10https://gerrit.wikimedia.org/r/382683 (owner: 10Giuseppe Lavagetto) [13:20:05] Zoranzoki21: if you are proposing a commit for SWAT, it should be listed under your name, even if the patch author is somebody else [13:20:31] Zoranzoki21: so, 375765 should be under your name, not MarcoAurelio [13:20:43] Ok.. I will change it [13:21:15] Changed [13:21:25] (03Merged) 10jenkins-bot: Disallow most file types from upload to enwikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382749 (https://phabricator.wikimedia.org/T176647) (owner: 10Bartosz Dziewoński) [13:21:35] (03CR) 10jenkins-bot: Disallow most file types from upload to enwikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382749 (https://phabricator.wikimedia.org/T176647) (owner: 10Bartosz Dziewoński) [13:21:54] zeljkof: note that InitialiseSettings.php must be synced before CommonSettings.php [13:22:11] (03CR) 10Hashar: [C: 031] "Puppet compiler for contint1001/contint2001 https://puppet-compiler.wmflabs.org/compiler02/8238/" [puppet] - 10https://gerrit.wikimedia.org/r/383122 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [13:22:26] You saying it to me? [13:22:32] Or to another user? 
[13:22:38] MatmaRex: thanks; I was just about to ask [13:23:16] Zeljko, I changed for https://gerrit.wikimedia.org/r/#/c/375765/ in calendar to be my name [13:23:17] (03PS2) 10Hashar: Jenkins now supports our MAC/KEX algorithms [prod] [puppet] - 10https://gerrit.wikimedia.org/r/383122 (https://phabricator.wikimedia.org/T103351) [13:23:50] MatmaRex: 382749 is at mwdebug1002, please test and let me know if I can deploy [13:24:33] (03CR) 10Zoranzoki21: [C: 031] Jenkins now supports our MAC/KEX algorithms [prod] [puppet] - 10https://gerrit.wikimedia.org/r/383122 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [13:24:50] zeljkof: looks good, https://en.wikivoyage.org/wiki/Special:Upload has the expected new list of file types, and oither wikis seem unaffected [13:25:05] MatmaRex: ok, deploying [13:26:26] (03PS6) 10Giuseppe Lavagetto: role::cache::base: convert to profile [2] [puppet] - 10https://gerrit.wikimedia.org/r/382684 [13:26:31] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:382749|Disallow most file types from upload to enwikivoyage (T176647)]] (duration: 00m 47s) [13:26:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:26:38] T176647: Help English Wikivoyage restrict certain file type uploads to address illicit file uploads - https://phabricator.wikimedia.org/T176647 [13:27:08] (03CR) 10Zoranzoki21: [C: 031] "Looks good to me, but someone else must approve" [puppet] - 10https://gerrit.wikimedia.org/r/382684 (owner: 10Giuseppe Lavagetto) [13:27:28] !log zfilipin@tin Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:382749|Disallow most file types from upload to enwikivoyage (T176647)]] (duration: 00m 47s) [13:27:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:28:52] thanks [13:29:07] Zoranzoki21: I do not feel comfortable deploying 375765 until somebody that is more familiar with CI (like hashar) gives it a +1, sorry [13:29:23] MatmaRex: 
deployed, please check [13:29:26] Ok [13:29:33] zeljkof: works fine [13:29:37] No problem [13:29:58] MatmaRex, Zoranzoki21: thanks for deploying with #releng :) [13:30:15] dcausse: reviewing 383079 [13:30:25] zeljkof: thanks [13:30:30] What mean #releng [13:30:52] Zoranzoki21: sorry, Release engineering, my team :) [13:31:03] like in #wikimedia-releng :D [13:31:14] Ok [13:31:17] Thank you [13:33:04] Zeljko, when event expire, can I create revert of patch and to add it for morning swat? [13:33:40] Or to say here? What is better option? [13:33:43] Zoranzoki21: for throttle rule? yes, you can create a patch that cleans up expired events, or just revert the one you have created [13:34:14] Yes, in file is current only one rule which I created.. I will revert it later when event end [13:34:29] Zoranzoki21: sounds good [13:34:55] And then to request here or to add in deployments table? [13:35:28] Zoranzoki21: add the commit to a SWAT deploy that you are able to be online for [13:35:43] Ok.. It can be deployed for morning swat [13:36:03] If in morning swat will be 8 patches, I will request deploy here [13:36:49] dcausse: any order in which files should be deployed? or any order would do? [13:37:14] zeljkof: I think you can sync the whole extension? [13:37:29] dcausse: that's an option too :) [13:37:33] so, the whole extension? [13:37:40] yes I think so [13:37:45] will do [13:38:32] the order would be fuzzylikethis, autoload, elasticsearchttmserver, but just doing it all sounds okay [13:39:37] zeljkof: sorry what is wrong? :) [13:40:02] hashar: not sure I this is OK to deploy https://gerrit.wikimedia.org/r/#/c/375765/ [13:40:19] zeljkof: they dont go through SWAT [13:40:38] I am deploying it anyway :] [13:41:00] hashar: ah, did not notice it's in integration/config :D [13:41:15] Zoranzoki21: ^ [13:42:18] This will be deployed? 
Thank you [13:42:22] And sorry for boring [13:44:28] Zoranzoki21: no problem, hashar said he is deploying it now [13:44:50] dcausse, Nikerabbit: 383079 is at mwdebug1002, please check and let me know if I can deploy [13:44:53] OK. Super [13:44:59] zeljkof: testing [13:45:48] Zoranzoki21: commits in integration/config do not need to be deployed during SWAT window, hashar and I can deploy them at any time, but for this commit I was not sure if it's ok to merge it [13:46:55] (03CR) 10Ema: [C: 031] role::cache::base: convert to profile [1/2] [puppet] - 10https://gerrit.wikimedia.org/r/382683 (owner: 10Giuseppe Lavagetto) [13:49:39] zeljkof: looks fine for me, I see translation suggestions again on long entries [13:50:14] (03PS6) 10Giuseppe Lavagetto: role::cache::base: convert to profile [1/2] [puppet] - 10https://gerrit.wikimedia.org/r/382683 [13:50:23] (03PS3) 10Muehlenhoff: Jenkins now supports our MAC/KEX algorithms [labs] [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [13:50:30] oops, I was testing on mwdebug1001 [13:50:48] dcausse: ok, deploying [13:50:49] mwdebug1002 indeed returns results for https://meta.wikimedia.org/w/api.php?action=translationaids&format=jsonfm&title=Translations%3ATech%2FNews%2F2017%2F41%2F1%2Ffi for example [13:50:58] (03CR) 10Giuseppe Lavagetto: [C: 032] role::cache::base: convert to profile [1/2] [puppet] - 10https://gerrit.wikimedia.org/r/382683 (owner: 10Giuseppe Lavagetto) [13:51:04] Nikerabbit: it's always 1002 ;) should I wait for you? [13:51:28] zeljkof: I'm done [13:51:36] Nikerabbit: ok to deploy? 
[13:51:40] yes [13:52:11] dcausse, Nikerabbit: deploying [13:52:16] ok [13:53:01] !log zfilipin@tin Synchronized php-1.31.0-wmf.2/extensions/Translate/: SWAT: [[gerrit:383079|Revert "[tech-debt] Remove usage of FuzzyLikeThis in favor of simple fuzzy match" (T177727)]] (duration: 00m 57s) [13:53:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:53:09] T177727: Translation memory not working for anything longer - https://phabricator.wikimedia.org/T177727 [13:53:21] dcausse, Nikerabbit: deployed, please check and thanks for deploying with #releng ;) [13:53:46] zeljkof: looks good to me! thanks releng! :) [13:53:49] !log EU SWAT finished [13:53:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:54:40] Thank you very much! [13:55:41] PROBLEM - puppet last run on cp2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:55:42] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:56:11] PROBLEM - puppet last run on cp4023 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:56:52] PROBLEM - puppet last run on cp1052 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:58:28] <_joe_> that was me ^^ [13:58:30] <_joe_> fixing it [13:58:32] (03PS7) 10Giuseppe Lavagetto: role::cache::base: convert to profile [2/2] [puppet] - 10https://gerrit.wikimedia.org/r/382684 [13:59:01] PROBLEM - puppet last run on cp4018 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [13:59:13] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [14:02:10] (03CR) 10Ema: [C: 031] "Typo in the commit message (unnneeded). Besides that, LGTM." 
[puppet] - 10https://gerrit.wikimedia.org/r/382684 (owner: 10Giuseppe Lavagetto) [14:02:12] (03CR) 10Giuseppe Lavagetto: [C: 032] role::cache::base: convert to profile [2/2] [puppet] - 10https://gerrit.wikimedia.org/r/382684 (owner: 10Giuseppe Lavagetto) [14:02:57] (03PS8) 10Giuseppe Lavagetto: role::cache::base: convert to profile [2/2] [puppet] - 10https://gerrit.wikimedia.org/r/382684 [14:04:13] (03PS4) 10Muehlenhoff: Jenkins now supports our MAC/KEX algorithms [labs] [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [14:05:35] (03CR) 10Muehlenhoff: [C: 032] Jenkins now supports our MAC/KEX algorithms [labs] [puppet] - 10https://gerrit.wikimedia.org/r/383120 (https://phabricator.wikimedia.org/T103351) (owner: 10Hashar) [14:06:51] RECOVERY - puppet last run on cp1052 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [14:09:11] RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [14:10:42] RECOVERY - puppet last run on cp1054 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [14:15:31] This is deployed? https://gerrit.wikimedia.org/r/#/c/375765/ [14:17:12] 10Operations, 10Wikimedia-Mailing-lists: Have a conversation about migrating from GNU Mailman 2.1 to GNU Mailman 3.0 - https://phabricator.wikimedia.org/T52864#3669528 (10Elitre) [14:17:32] (03PS4) 10Giuseppe Lavagetto: prometheus::varnish::exporter: convert to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/382720 [14:18:58] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669546 (10jkroll) Thanks @MaxSem. I've tried to reproduce it on deployment-mediawiki04.eqiad.wmflabs without succes... 
[14:23:57] (03CR) 10Giuseppe Lavagetto: [C: 031] "https://puppet-compiler.wmflabs.org/compiler02/8241/" [puppet] - 10https://gerrit.wikimedia.org/r/382720 (owner: 10Giuseppe Lavagetto) [14:24:00] 10Operations, 10monitoring, 10Patch-For-Review: Uninstall ganglia from the fleet - https://phabricator.wikimedia.org/T177225#3651131 (10akosiaris) I 've seen all the nice changes. While merging the bacula one, it got me thinking how are we going to clean up the fleet ? Should we use puppet (so all these chan... [14:24:01] RECOVERY - puppet last run on cp4018 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [14:25:41] RECOVERY - puppet last run on cp2001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [14:26:02] RECOVERY - puppet last run on cp4023 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [14:28:50] (03CR) 10Ema: [C: 031] prometheus::varnish::exporter: convert to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/382720 (owner: 10Giuseppe Lavagetto) [14:30:50] (03PS4) 10Giuseppe Lavagetto: cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 [14:33:37] (03PS2) 10Andrew Bogott: Fix up puppet-compiler for labs usage [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/325042 (owner: 10Gerrit Patch Uploader) [14:36:13] 10Operations, 10monitoring, 10Patch-For-Review: Uninstall ganglia from the fleet - https://phabricator.wikimedia.org/T177225#3651131 (10MoritzMuehlenhoff) For dropping the salt minion, the removal was done in two stages, first a commit which purged the packages and when that had run across the fleet and WMCS... 
[14:39:10] (03CR) 10jerkins-bot: [V: 04-1] Fix up puppet-compiler for labs usage [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/325042 (owner: 10Gerrit Patch Uploader) [14:43:33] (03CR) 10Ema: [C: 032] prometheus::varnish::exporter: convert to role/profile [puppet] - 10https://gerrit.wikimedia.org/r/382720 (owner: 10Giuseppe Lavagetto) [14:43:43] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669614 (10jkroll) Also, getting a core file when this happens would be helpful. The core pattern was set to /data/p... [14:45:48] (03PS3) 10Andrew Bogott: Fix up puppet-compiler for labs usage [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/325042 (owner: 10Gerrit Patch Uploader) [14:47:12] (03CR) 10Muehlenhoff: [C: 031] cumin: drop ganglia::web role alias [puppet] - 10https://gerrit.wikimedia.org/r/382920 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [14:48:41] (03CR) 10Volans: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/383096 (https://phabricator.wikimedia.org/T175830) (owner: 10Gehel) [14:49:40] (03PS2) 10Gehel: logstash: remove references to old logstash servers for decommissioning [puppet] - 10https://gerrit.wikimedia.org/r/383096 (https://phabricator.wikimedia.org/T175830) [14:54:14] (03CR) 10jerkins-bot: [V: 04-1] Fix up puppet-compiler for labs usage [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/325042 (owner: 10Gerrit Patch Uploader) [14:56:03] 10Operations, 10monitoring, 10Patch-For-Review: Uninstall ganglia from the fleet - https://phabricator.wikimedia.org/T177225#3669650 (10Ottomata) > @Ottomata Hi, wondering what do you think should happen with modules/confluent/manifests/kafka/mirror/jmxtrans.pp and modules/confluent/manifests/kafka/broker/jm... 
[14:58:42] (03CR) 10Ema: [C: 032] prometheus: re-introduce IPVS aggregation rule [puppet] - 10https://gerrit.wikimedia.org/r/383121 (owner: 10Ema) [14:58:48] (03PS2) 10Ema: prometheus: re-introduce IPVS aggregation rule [puppet] - 10https://gerrit.wikimedia.org/r/383121 [14:58:52] (03CR) 10Ema: [V: 032 C: 032] prometheus: re-introduce IPVS aggregation rule [puppet] - 10https://gerrit.wikimedia.org/r/383121 (owner: 10Ema) [14:59:21] (03CR) 10Gehel: [C: 031] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382988 (https://phabricator.wikimedia.org/T177695) (owner: 10Jayprakash12345) [14:59:23] (03PS4) 10Andrew Bogott: Fix up puppet-compiler for labs usage [software/puppet-compiler] - 10https://gerrit.wikimedia.org/r/325042 (owner: 10Gerrit Patch Uploader) [15:00:04] (03CR) 10Ottomata: [C: 031] udp2log: remove ganglia monitoring [puppet] - 10https://gerrit.wikimedia.org/r/382913 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [15:03:57] (03PS1) 10Zoranzoki21: Enable SandboxLink on gawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383145 (https://phabricator.wikimedia.org/T177775) [15:04:17] (03CR) 10Muehlenhoff: Add support for stretch to hhvm::debug (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/380722 (owner: 10Muehlenhoff) [15:04:20] (03PS1) 10Gehel: logstash: update logstash_syslog common hiera parameter to point to LVS. 
[puppet] - 10https://gerrit.wikimedia.org/r/383146 (https://phabricator.wikimedia.org/T175242) [15:04:39] (03PS5) 10Muehlenhoff: Add support for stretch to hhvm::debug [puppet] - 10https://gerrit.wikimedia.org/r/380722 [15:04:59] (03PS2) 10Zoranzoki21: Enable SandboxLink on gawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383145 (https://phabricator.wikimedia.org/T177775) [15:05:33] (03PS3) 10Gehel: logstash: remove references to old logstash servers for decommissioning [puppet] - 10https://gerrit.wikimedia.org/r/383096 (https://phabricator.wikimedia.org/T175830) [15:05:48] (03CR) 10Volans: "see inline" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/383146 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [15:06:46] gehel: did we already tried with one test host? ^^^ [15:07:23] (03PS2) 10Gehel: logstash: update logstash_syslog common hiera parameter to point to LVS. [puppet] - 10https://gerrit.wikimedia.org/r/383146 (https://phabricator.wikimedia.org/T175242) [15:08:58] (03CR) 10Gehel: logstash: update logstash_syslog common hiera parameter to point to LVS. (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/383146 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [15:09:21] volans: nope, not yet, but yes, that might be a good idea... [15:09:59] damn, we have a lot of different mediawiki roles! [15:10:08] yeah, I was thinking at least one mwdebug in eqiad and one in codfw, or even a mw host [15:10:12] to be sure they can connect fine [15:10:27] ok, I'll prepare the patch in a minute... [15:12:44] (03CR) 10Muehlenhoff: [C: 032] logstash: remove references to old logstash servers for decommissioning [puppet] - 10https://gerrit.wikimedia.org/r/383096 (https://phabricator.wikimedia.org/T175830) (owner: 10Gehel) [15:13:14] moritzm: ^thanks! [15:13:26] thanks for preparing the patch :-) [15:13:49] I think it was the 3rd time I rebased it and did not merge it right away... 
[15:16:15] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3632060 (10Addshore) I have also tried reproducing this and really cant make it 503. Do we have a link to the origin... [15:16:35] (03PS1) 10Gehel: [test] mediawiki: use LVS endpoint for logstash [puppet] - 10https://gerrit.wikimedia.org/r/383147 (https://phabricator.wikimedia.org/T175242) [15:23:46] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669690 (10MoritzMuehlenhoff) >>! In T176637#3669546, @jkroll wrote: > Thanks @MaxSem. I've tried to reproduce it on... [15:27:36] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1091" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383148 [15:27:39] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1091" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383148 [15:32:15] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669695 (10Addshore) >>! In T176637#3669690, @MoritzMuehlenhoff wrote: > On the 28th, the update to HHVM 3.18.5 was... [15:34:49] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669700 (10jkroll) >>! In T176637#3669690, @MoritzMuehlenhoff wrote: > On the 28th, the update to HHVM 3.18.5 was in... [15:38:02] (03CR) 10Volans: [C: 031] "LGTM, but I'm not familiar with MW logstash settings, so I'd suggest to check it with someone else too." 
[puppet] - 10https://gerrit.wikimedia.org/r/383147 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [15:38:26] 10Operations, 10DBA, 10monitoring, 10Patch-For-Review, 10Prometheus-metrics-monitoring: MySQL monitoring with prometheus - https://phabricator.wikimedia.org/T143896#3669716 (10jcrespo) [15:39:48] 10Operations, 10DBA, 10monitoring, 10Patch-For-Review, 10Prometheus-metrics-monitoring: MySQL metrics monitoring - https://phabricator.wikimedia.org/T143896#2582458 (10jcrespo) [15:41:54] (03PS3) 10DCausse: Upgrade plugins (official LTR RC1, extra 5.5.2.3, highlighter 5.5.2.2) [software/elasticsearch/plugins] - 10https://gerrit.wikimedia.org/r/381798 [15:42:53] (03PS2) 10Gehel: [test] mediawiki: use LVS endpoint for logstash [puppet] - 10https://gerrit.wikimedia.org/r/383147 (https://phabricator.wikimedia.org/T175242) [15:42:54] 10Operations, 10DBA, 10monitoring: Generate instance list of database hosts to be monitored automatically from exported resources - https://phabricator.wikimedia.org/T177779#3669724 (10jcrespo) [15:43:38] 10Operations, 10DBA, 10monitoring, 10Patch-For-Review, 10Prometheus-metrics-monitoring: MySQL metrics monitoring - https://phabricator.wikimedia.org/T143896#2582458 (10jcrespo) [15:45:18] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1091" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383148 (owner: 10Marostegui) [15:46:16] (03CR) 10Gehel: "The logstash events go through rsyslog. 
Puppet compiler agrees: https://puppet-compiler.wmflabs.org/compiler02/8244/" [puppet] - 10https://gerrit.wikimedia.org/r/383147 (https://phabricator.wikimedia.org/T175242) (owner: 10Gehel) [15:50:33] (03CR) 10Ema: [C: 031] cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 (owner: 10Giuseppe Lavagetto) [15:52:31] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1091" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383148 (owner: 10Marostegui) [15:52:46] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1091" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383148 (owner: 10Marostegui) [15:53:26] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1091 - T174509 (duration: 00m 47s) [15:53:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:53:36] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [15:56:47] (03PS5) 10Giuseppe Lavagetto: cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 [15:58:21] (03CR) 10Giuseppe Lavagetto: [C: 032] cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 (owner: 10Giuseppe Lavagetto) [15:58:29] (03CR) 10Giuseppe Lavagetto: [C: 032] cacheproxy: move some content to new module [puppet] - 10https://gerrit.wikimedia.org/r/382721 (owner: 10Giuseppe Lavagetto) [16:02:42] (03PS1) 10Zoranzoki21: Revert "New throttle rule" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383150 [16:02:52] (03CR) 10Gehel: [C: 031] "LGTM, checksums verified" [software/elasticsearch/plugins] - 10https://gerrit.wikimedia.org/r/381798 (owner: 10DCausse) [16:03:13] (03CR) 10Zoranzoki21: "Deploy this in Morning SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383150 (owner: 10Zoranzoki21) [16:03:27] (03PS5) 10Marostegui: db-eqiad.php: Set commonswiki on read only [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/382379 (https://phabricator.wikimedia.org/T176883) [16:05:31] 10Operations, 10Ops-Access-Requests, 10Analytics: analytics-privatedata-users access for Jeff Green - https://phabricator.wikimedia.org/T177602#3664354 (10Nuria) Approved, this is ops for FR tech [16:07:28] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669831 (10Jdlrobson) @Addshore this doesn't appear to be happening any more, but it was 100% reproducible when it c... [16:08:25] 10Operations, 10Contributors-Team, 10MobileFrontend, 10wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3669832 (10Jdlrobson) [16:08:56] Please fix links [16:09:05] I think on: [16:09:06] #wikimedia-operations: Wikimedia Platform Operations, serious stuff | Status: UP | Log: https://bit.ly/wikitech | Channel logs: https://bit.ly/opsirclog | Ops Clinic Duty: herron [16:12:12] (03PS9) 10Zoranzoki21: Enable Extension:Newsletter on hewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/381537 (https://phabricator.wikimedia.org/T177151) [16:13:16] (03PS1) 10Ottomata: Remove druid100[456] from druid analytics cluster. 
[puppet] - 10https://gerrit.wikimedia.org/r/383154 (https://phabricator.wikimedia.org/T176223) [16:22:14] (03PS2) 10Giuseppe Lavagetto: profile::cache::base: add role::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/383072 [16:22:16] (03PS1) 10Giuseppe Lavagetto: Rakefile: do not collect ignored lint issues win wmf_styleguide [puppet] - 10https://gerrit.wikimedia.org/r/383155 [16:26:11] (03PS2) 10Giuseppe Lavagetto: Rakefile: do not collect ignored lint issues in wmf_styleguide [puppet] - 10https://gerrit.wikimedia.org/r/383155 [16:26:13] (03PS3) 10Giuseppe Lavagetto: profile::cache::base: add role::lvs::realserver [puppet] - 10https://gerrit.wikimedia.org/r/383072 [16:26:55] (03CR) 10Giuseppe Lavagetto: [C: 032] Rakefile: do not collect ignored lint issues in wmf_styleguide [puppet] - 10https://gerrit.wikimedia.org/r/383155 (owner: 10Giuseppe Lavagetto) [16:27:51] (03PS2) 10Ottomata: Remove druid100[456] from druid analytics cluster. [puppet] - 10https://gerrit.wikimedia.org/r/383154 (https://phabricator.wikimedia.org/T176223) [16:28:51] volans: Any idea about stat1005? It has crond disabled still AFAICT [16:29:43] 10Operations, 10Analytics-Kanban, 10Traffic: Invalid "wikimedia" family in unique devices data due to misplaced WMF-Last-Access-Global cookie - https://phabricator.wikimedia.org/T174640#3669886 (10Nuria) a:03JAllemandou [16:30:45] (03CR) 10Ottomata: [C: 032] "https://puppet-compiler.wmflabs.org/compiler03/8248/" [puppet] - 10https://gerrit.wikimedia.org/r/383154 (https://phabricator.wikimedia.org/T176223) (owner: 10Ottomata) [16:30:52] (03PS3) 10Ottomata: Remove druid100[456] from druid analytics cluster. [puppet] - 10https://gerrit.wikimedia.org/r/383154 (https://phabricator.wikimedia.org/T176223) [16:30:56] (03CR) 10Ottomata: [V: 032 C: 032] Remove druid100[456] from druid analytics cluster. 
[puppet] - 10https://gerrit.wikimedia.org/r/383154 (https://phabricator.wikimedia.org/T176223) (owner: 10Ottomata) [16:31:03] or ottomata ^ [16:31:19] hoo: mmmh I left stat1005/6 to elukey [16:31:25] let me check [16:31:28] hoo: yeah I forgot to enable it [16:31:31] busy afternoon [16:31:34] I was about to do it :) [16:32:18] cron? [16:33:05] ottomata: Riccardo stopped it for the nfs mount problem [16:33:12] oh [16:33:13] ah k [16:33:23] just re-enabled it :) [16:34:44] !log beginning decom and reinstall process for druid1004-1006 -- T176223 [16:34:48] !log stopping druid services on druid1006 [16:34:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:34:50] T176223: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223 [16:34:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:39:11] 10Operations, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Tune Kafka logs to register clients connected - https://phabricator.wikimedia.org/T173493#3669942 (10Nuria) 05Open>03Resolved [16:42:44] (03CR) 10Giuseppe Lavagetto: [C: 04-2] "some puppet classes need fixing before this can be done." [puppet] - 10https://gerrit.wikimedia.org/r/383072 (owner: 10Giuseppe Lavagetto) [16:43:18] (03PS2) 10Giuseppe Lavagetto: profile::cache::ssl::unified: move from role, refactor [puppet] - 10https://gerrit.wikimedia.org/r/383073 [16:53:33] (03PS3) 10Giuseppe Lavagetto: profile::cache::ssl::unified: move from role, refactor [puppet] - 10https://gerrit.wikimedia.org/r/383073 [17:00:04] gehel: Dear deployers, time to do the Wikidata Query Service weekly deploy deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T1700). [17:00:04] No GERRIT patches in the queue for this window AFAICS. [17:00:27] jouncebot: o/ and yep, nothing to deploy on wdqs today... 
[17:04:13] (PS1) Lucas Werkmeister (WMDE): Switch to new wbcheckconstraints API format [mediawiki-config] - https://gerrit.wikimedia.org/r/383162 (https://phabricator.wikimedia.org/T175590) [17:04:16] (PS1) Lucas Werkmeister (WMDE): Enable constraint checks on qualifiers and references [mediawiki-config] - https://gerrit.wikimedia.org/r/383163 (https://phabricator.wikimedia.org/T176863) [17:19:34] Operations, Contributors-Team, MobileFrontend, wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3670038 (MoritzMuehlenhoff) >>! In T176637#3669700, @jkroll wrote: > Is it likely that between rollout of the new... [17:20:25] Operations, Contributors-Team, MobileFrontend, wikidiff2, and 3 others: Diff page consistently produces 503 on beta cluster on first visit - https://phabricator.wikimedia.org/T176637#3670039 (MoritzMuehlenhoff) >>! In T176637#3669695, @Addshore wrote: > Is it possible or worth rolling back the ve... [17:28:11] !log stopping druid services on druid1005 [17:28:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:46:33] (PS4) Zoranzoki21: profile::cache::base: add role::lvs::realserver [puppet] - https://gerrit.wikimedia.org/r/383072 (owner: Giuseppe Lavagetto) [17:59:56] !log stopping druid services on druid1006 [18:00:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:00:05] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: That opportune time is upon us again. Time for a Morning SWAT (Max 8 patches) deploy. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T1800). [18:00:05] davidwbarratt, foks, Jayprakash12345, and Zoranzoki21: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be around during the process.
Note: If you break AND fix the wikis, you will be rewarded with a sticker. [18:00:14] !log stopping druid services on druid1004 (not 1006) [18:00:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:00:22] here! [18:00:27] So who will SWAT today? [18:00:29] * foks waves [18:00:30] (if we are really doing SWAT today) [18:00:36] yeah likewise [18:00:38] wait [18:00:46] I no mean what say jouncebot [18:00:54] Where? Which sticker? [18:01:13] Operations, monitoring, Patch-For-Review: Uninstall ganglia from the fleet - https://phabricator.wikimedia.org/T177225#3651131 (faidon) I saw some of these commits fly by. These are obviously well agreed in principle but I think it's important to not have regressions here -- if we remove a service fr... [18:02:01] my patch will still need to be +2'd - this will be the first time I've rebased something and would love to not totally screw it up [18:04:04] (PS4) Jayprakash12345: Enable NewUserMessage for SUL accounts too on hi.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/382964 (https://phabricator.wikimedia.org/T177690) [18:04:14] (PS4) Jayprakash12345: Add autopatrolled user group to sd.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/382934 (https://phabricator.wikimedia.org/T177141) [18:04:19] (PS4) Jayprakash12345: Enable NewUserMessage for SUL accounts too on dty.wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/382960 (https://phabricator.wikimedia.org/T177688) [18:08:30] Operations, Analytics-Kanban, User-Elukey: LVS for Druid - https://phabricator.wikimedia.org/T177511#3670122 (Ottomata) [18:14:51] Maybe I'll just push it back actually [18:16:35] yeah, pushed mine back to Wednesday [18:20:43] I am repeating my Question "So who will SWAT today?" [18:25:41] Jayprakash12345: Probably nobody will. It's a holiday for the US folks.
I'm not sure why we even have a SWAT scheduled for today because last time we had a US holiday, they were cancelled. [18:25:53] Jayprakash12345: You better move your patch to Wednesday too. [18:26:17] ok [18:27:07] My patch revert of throttle rule please deploy [18:36:40] I am not sure where all US folks are :/ [18:36:55] (PS1) Jdlrobson: Enable new print styles on Vector [mediawiki-config] - https://gerrit.wikimedia.org/r/383170 (https://phabricator.wikimedia.org/T169732) [18:37:06] davidwbarratt, foks, Jayprakash12345, and Zoranzoki21: I can sprint the SWAT [18:37:22] yay! [18:37:37] hashar it's a holiday in the US [18:37:47] hashar: don’t worry about mine :) [18:37:52] https://www.redcort.com/us-federal-bank-holidays/ [18:37:54] (CR) Hashar: [C: 2] "We can just clean them up later as we add new rules :D" [mediawiki-config] - https://gerrit.wikimedia.org/r/383150 (owner: Zoranzoki21) [18:37:57] hashar it's a us holiday ^^ [18:37:59] https://en.wikipedia.org/wiki/Indigenous_Peoples%27_Day [18:38:00] hashar: so sounds like 4pm SWAT window might not happen? i was wondering about that [18:38:11] jdlrobson: most probably not [18:38:16] "Columbus DayOctober 9Monday" [18:38:19] seems american are again in hollidays ! [18:39:22] davidwbarratt: are you able to verify your patch work by using mwdebug1001? If so I am going to deploy it on that machine [18:39:57] hashar I sure can. and the maintenance script will also need to be run. [18:40:12] hashar: dont suppose you have space for one more now?
:) [18:40:16] (Merged) jenkins-bot: Revert "New throttle rule" [mediawiki-config] - https://gerrit.wikimedia.org/r/383150 (owner: Zoranzoki21) [18:40:21] davidwbarratt: ok let me handle the simpler patches first [18:40:27] (CR) jenkins-bot: Revert "New throttle rule" [mediawiki-config] - https://gerrit.wikimedia.org/r/383150 (owner: Zoranzoki21) [18:40:30] jdlrobson: I dont mind extending the window yeah [18:40:50] hashar thanks <3 https://gerrit.wikimedia.org/r/#/c/383170/ [18:41:05] (its on the 4pm slot but i can move it forward on the deployment calendar [18:41:12] jdlrobson: add it to https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T1800 I guess i will come to it in half an hour or so [18:41:50] (CR) Hashar: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/382934 (https://phabricator.wikimedia.org/T177141) (owner: Jayprakash12345) [18:44:24] (Merged) jenkins-bot: Add autopatrolled user group to sd.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/382934 (https://phabricator.wikimedia.org/T177141) (owner: Jayprakash12345) [18:45:42] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Add autopatrolled user group to sd.wikipedia - T177141 (duration: 00m 47s) [18:45:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:45:49] T177141: Enable autopatroller and rollbacker groups on Sindhi Wikipedia - https://phabricator.wikimedia.org/T177141 [18:46:04] (PS5) Hashar: Enable NewUserMessage for SUL accounts too on dty.wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/382960 (https://phabricator.wikimedia.org/T177688) (owner: Jayprakash12345) [18:46:07] (PS5) Hashar: Enable NewUserMessage for SUL accounts too on hi.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/382964 (https://phabricator.wikimedia.org/T177690) (owner: Jayprakash12345) [18:46:29] (CR) Hashar: [C: 2] "SWAT" [mediawiki-config] - 
https://gerrit.wikimedia.org/r/382960 (https://phabricator.wikimedia.org/T177688) (owner: Jayprakash12345) [18:46:31] (CR) Hashar: "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/382964 (https://phabricator.wikimedia.org/T177690) (owner: Jayprakash12345) [18:46:35] (CR) Hashar: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/382964 (https://phabricator.wikimedia.org/T177690) (owner: Jayprakash12345) [18:46:58] davidwbarratt: doing those two patches above and I look at your :) [18:47:07] thank you hashar [18:47:23] hashar thanks! [18:47:27] davidwbarratt: havent you done the cherry pick to the wmf branches ?? [18:47:46] hashar uhhh... no? I didn't know I needed to. :) [18:48:02] (Merged) jenkins-bot: Enable NewUserMessage for SUL accounts too on dty.wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/382960 (https://phabricator.wikimedia.org/T177688) (owner: Jayprakash12345) [18:48:04] (Merged) jenkins-bot: Enable NewUserMessage for SUL accounts too on hi.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/382964 (https://phabricator.wikimedia.org/T177690) (owner: Jayprakash12345) [18:48:05] ah tyler did [18:48:20] davidwbarratt: https://gerrit.wikimedia.org/r/#/q/7ce947ebc50c8cad32fe31df6fc73bee73761bba [18:48:33] and seems we want https://gerrit.wikimedia.org/r/#/c/382561/ [18:49:14] which is in merge conflict with branch wmf/1.31.0-wmf.2 bah [18:49:55] hashar yeah that second one fixes the problem we had during the last deploy [18:50:12] ohhh [18:50:16] and that got reverted bah [18:50:32] hashar right, basically you can't use echoDB to get user_properties [18:50:40] hashar you have to call the main DB [18:51:09] errr.. 
this is the fix to the problem in the last deploy https://gerrit.wikimedia.org/r/#/c/382561/2/maintenance/updatePerUserBlacklist.php [18:52:37] davidwbarratt: https://gerrit.wikimedia.org/r/#/c/383173/ :D [18:52:43] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage for SUL accounts too on dty.wiki - T177688 (duration: 00m 47s) [18:52:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:52:49] T177688: Enable NewUserMessage for SUL accounts too on dty.wiki - https://phabricator.wikimedia.org/T177688 [18:53:41] hashar that doesn't have the fix in it https://gerrit.wikimedia.org/r/#/c/383173/1/maintenance/updatePerUserBlacklist.php is that intentional? [18:54:55] davidwbarratt: yeah I am reapplying them both as a single commit [18:55:23] Operations, Analytics-Kanban, Traffic, Patch-For-Review: Invalid "wikimedia" family in unique devices data due to misplaced WMF-Last-Access-Global cookie - https://phabricator.wikimedia.org/T174640#3670314 (JAllemandou) The change above doesn't change the behavior of cookies, but at least removes... 
[18:55:26] hashar kk [18:55:58] davidwbarratt: https://gerrit.wikimedia.org/r/383173 Reapply "Use User Ids instead of User Names for Echo Mute"" [18:56:01] it should have both patches [18:57:14] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage for SUL accounts too on hi.wikiversity - T177690 (duration: 00m 47s) [18:57:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:57:20] T177690: Enable NewUserMessage for SUL accounts too on hi.wikiversity - https://phabricator.wikimedia.org/T177690 [18:57:38] Thanks Hashar Sir [18:57:39] jdlrobson: I will do your patch :) [18:57:46] Jayprakash12345: they should be all fine :) [18:57:50] hashar: w00t [18:57:53] ready to test when you are [18:58:21] (CR) Hashar: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/383170 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [18:58:29] jdlrobson: I will push it to mwdebug1001 [19:00:16] (PS2) Hashar: Enable new print styles on Vector [mediawiki-config] - https://gerrit.wikimedia.org/r/383170 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:00:21] (CR) Hashar: [C: 2] Enable new print styles on Vector [mediawiki-config] - https://gerrit.wikimedia.org/r/383170 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:00:52] poor gerrit is quite slow [19:01:11] (CR) jenkins-bot: Add autopatrolled user group to sd.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/382934 (https://phabricator.wikimedia.org/T177141) (owner: Jayprakash12345) [19:01:13] (CR) jenkins-bot: Enable NewUserMessage for SUL accounts too on dty.wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/382960 (https://phabricator.wikimedia.org/T177688) (owner: Jayprakash12345) [19:01:14] hashar looks good to me! 
[19:01:15] (CR) jenkins-bot: Enable NewUserMessage for SUL accounts too on hi.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/382964 (https://phabricator.wikimedia.org/T177690) (owner: Jayprakash12345) [19:02:59] (Merged) jenkins-bot: Enable new print styles on Vector [mediawiki-config] - https://gerrit.wikimedia.org/r/383170 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:03:05] (CR) jenkins-bot: Enable new print styles on Vector [mediawiki-config] - https://gerrit.wikimedia.org/r/383170 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:03:33] jdlrobson: it is on mwdebug1001 [19:03:47] davidwbarratt: can you +2 it please ? :] [19:04:03] hashar: looking [19:04:20] hashar I don't have access to +2 this repo [19:04:29] oh my :D [19:04:55] and I thought that all @wikimedia.org users had +2 bah [19:06:18] hashar: seeing some weirdness which im investigating.. [19:06:34] jdlrobson: no worries take your time [19:06:39] hashar well I do for some repos, but I guess not all of them. :/ [19:06:57] (PS3) Zoranzoki21: Enable SandboxLink on gawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383145 (https://phabricator.wikimedia.org/T177775) [19:07:02] (PS10) Zoranzoki21: Enable Extension:Newsletter on hewiki [mediawiki-config] - https://gerrit.wikimedia.org/r/381537 (https://phabricator.wikimedia.org/T177151) [19:08:42] davidwbarratt: so how do we deploy that one ? Should it be checked first? [19:09:03] davidwbarratt: seems it is solely a maintenance script and thus I can just deploy it cluster wide [19:09:15] hashar where there any conclicts when you cherry-picked? [19:09:23] no [19:09:46] hashar it's also a small change in how we are storing the user pref (i.e. 
now they are ids instead of uernames) [19:09:55] hashar if there were no conflicts then I say we are good to go [19:10:03] hashar I can test out the feature first before you run the script [19:10:05] ah there is some code change [19:10:20] so probably gotta give it a try on mwdebug1001 first [19:10:39] hashar sure, let me know when it's available and I'll test it out. :) [19:11:17] oh my [19:12:33] davidwbarratt: it is on mwdebug1001 [19:12:52] hashar: i think it's okay to sync. I'm seeing issues with a logo not rendering, but it doesn't seem to be fataling and might be due to caching [19:13:07] jdlrobson: well we can purge the logo if that helps? [19:13:17] i have a theory we can try [19:13:17] hashar kk, let me test it out [19:13:22] the url is relative and it might need to be absolute [19:13:35] jdlrobson: though hmm on mwdebug1001 there should not be much caching [19:13:55] basically some CSS takes the config variable and generates a base64 uri for it [19:14:18] jdlrobson: and the base url reference would be wrong? [19:14:39] jdlrobson: I lost track of how we serve extensions assets nowadays :( My last info was bits.wikimedia.org (gone :D ) [19:14:40] CSSMin::buildUrlValue( CSSMin::encodeImageAsDataURI('/static/images/mobile/copyright/wikipedia-wordmark-en.svg' ) ) [19:15:25] Operations, DBA, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3670375 (jrbs) I've rebased (thanks, @Dzahn!) and scheduled this to go out on SWAT on Wednesday. [19:15:56] hashar it all looks good to me! :) [19:16:05] Operations, DBA, Support-and-Safety, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3670376 (jrbs) [19:16:37] davidwbarratt: so deploy and then we run the maintenance script? [19:16:37] hashar: let's try it. I can debug this tomorrow. [19:16:56] hashar sure! 
unless you want to test out the maintenance script somewhere. [19:17:04] yeah hmm [19:17:06] lets deploy it [19:17:26] jdlrobson: you are the judge. I cant tell how bad the lack of icon is :] [19:17:40] jdlrobson: will sync it just after david one [19:17:52] !log hashar@tin Synchronized php-1.31.0-wmf.2/extensions/Echo: Reapply "Use User Ids instead of User Names for Echo Mute"" - T173475 (duration: 00m 52s) [19:17:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:17:59] T173475: Echo Notification Mute (Block List) can be bypassed by changing username - https://phabricator.wikimedia.org/T173475 [19:18:02] davidwbarratt: it is live! [19:18:08] hashar YAY! [19:18:14] hashar did the script run? [19:18:43] nice one! [19:18:54] davidwbarratt: na not yet [19:19:03] davidwbarratt: so which wiki should I run it against ? :) [19:19:24] hashar all of the ones that have echo enabled [19:19:25] !log hashar@tin scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details) [19:19:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:19:37] bah canaries failling :( [19:19:45] jdlrobson: ^^ [19:20:16] hashar: ah shit [19:20:17] Notice: Undefined index: copyright in /srv/mediawiki/wmf-config/CommonSettings.php on line 658 [19:20:20] forgot to check [19:20:23] * jdlrobson facepalms [19:20:24] I screwed up the order I guess [19:20:34] should be an isset [19:20:37] no it's my bad [19:20:40] I synced commonsettings.php first [19:21:01] and maybe it needs $wgVectorExperimentalPrintStyles = true first ? 
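The "Undefined index: copyright" notice above appeared because CommonSettings.php read a settings key before the file defining it had been synced, and the suggested fix was an isset() check. A rough Python analogue of that guard follows; the real file is PHP, and `apply_optional_setting`, the dictionary layout, and the paths are invented here purely for illustration.

```python
def apply_optional_setting(site_config, key, target, target_key):
    """Copy site_config[key] into target only when the key is defined.

    Returns True if the value was applied, False if the key was missing.
    """
    if key in site_config:
        target[target_key] = site_config[key]
        return True
    return False


# Configuration as seen by a server that has not received the new definitions yet.
old_config = {"wordmark": "/static/images/wordmark.svg"}
# Configuration after the definitions have been synced (hypothetical key and path).
new_config = dict(old_config, copyright="/static/images/copyright.svg")

globals_old = {}
globals_new = {}
applied_old = apply_optional_setting(old_config, "copyright", globals_old, "print_logo")
applied_new = apply_optional_setting(new_config, "copyright", globals_new, "print_logo")
```

With a guard of this kind the order in which the two files reach a server stops mattering: a missing key is skipped rather than raising a notice.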
[19:22:04] (PS1) Hashar: Revert "Enable new print styles on Vector" [mediawiki-config] - https://gerrit.wikimedia.org/r/383178 [19:22:18] !log hashar@tin Synchronized wmf-config/CommonSettings.php: REVERT: Enable new print styles on Vector - T169732 (duration: 00m 49s) [19:22:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:22:24] T169732: Deploy new desktop print styles on all projects - https://phabricator.wikimedia.org/T169732 [19:22:25] (PS1) Jdlrobson: Check config variables are set before applying [mediawiki-config] - https://gerrit.wikimedia.org/r/383179 (https://phabricator.wikimedia.org/T169732) [19:22:27] (CR) Hashar: [C: 2] "Reverted on prod due to undefined index after I synced CommonSettings.php" [mediawiki-config] - https://gerrit.wikimedia.org/r/383178 (owner: Hashar) [19:22:29] hashar: ^ needed that [19:22:33] jdlrobson: I have reverted it [19:22:38] i can squash them [19:22:55] jdlrobson: yes pelase. By cherry picking https://gerrit.wikimedia.org/r/#/c/383170/ + adding your fix up :) [19:23:32] $ cat echo.dblist [19:23:33] %% all.dblist - nonecho.dblist [19:23:35] hehe that is creatie [19:23:38] creative [19:24:05] (Merged) jenkins-bot: Revert "Enable new print styles on Vector" [mediawiki-config] - https://gerrit.wikimedia.org/r/383178 (owner: Hashar) [19:24:22] how long does it take the feature to available without debug? 
[19:24:24] (PS1) Jdlrobson: Enable new print styles on Vector [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) [19:24:27] hashar: ^ [19:26:07] (CR) jenkins-bot: Revert "Enable new print styles on Vector" [mediawiki-config] - https://gerrit.wikimedia.org/r/383178 (owner: Hashar) [19:26:33] davidwbarratt: it is running [19:27:21] hashar: actually let's abort the deploy today [19:27:30] i'm a little nervous now and want some more time to check it out [19:27:31] jdlrobson: sure [19:27:42] jdlrobson: what you do instead is add it to the -labs.php files [19:27:50] jdlrobson: and try it out / debug it out on the beta cluster? [19:27:58] hashar:that sounds like a good idea [19:28:26] davidwbarratt: it explodes somehow :( [19:28:40] hashar: i'll do testwiki [19:28:43] Notice: Undefined variable: dbFactory in /srv/mediawiki/php-1.31.0-wmf.2/extensions/Echo/maintenance/updatePerUserBlacklist.php on line 82 [19:28:45] PHP Fatal error: Call to a member function waitForSlaves() on null in /srv/mediawiki/php-1.31.0-wmf.2/extensions/Echo/maintenance/updatePerUserBlacklist.php on line 82 [19:29:04] hashar AHHHHHH!!!!!! [19:29:29] (PS2) Jdlrobson: Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) [19:29:38] * davidwbarratt gives up on life [19:29:39] ^ hashar let's roll out to test wiki [19:30:06] hashar well roll that back. 
:( [19:30:17] (CR) Hashar: [C: 2] Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:30:31] (CR) jerkins-bot: [V: -1] Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:30:34] davidwbarratt: let me paste it :) [19:30:56] davidwbarratt: https://phabricator.wikimedia.org/P6096 :D [19:30:59] (PS3) Jdlrobson: Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) [19:31:00] o_O [19:31:09] saves. commits. gitreviews. [19:31:19] hashar thanks, sorry 'bout that. :( [19:31:31] Is this sticker-worthy? [19:31:41] :D [19:32:06] davidwbarratt: so maybe it process the batches just fine [19:32:13] though it keep saying "Updated 0 users" :) [19:32:58] ohh [19:33:11] davidwbarratt: $dbfactory should be $dbw I guess :) [19:35:42] hashar: https://gerrit.wikimedia.org/r/383180 PS3 fixes lintin errors [19:35:42] hashar yeah I think so, didn't update the variable at the bottom, though doesn't make a lot of sense why it worked locally. :) [19:36:11] davidwbarratt: cause locally you updated less users than the batch size maybe [19:36:30] hashar ha, that would be true. :) [19:36:42] davidwbarratt: so most probably you wanna patch it and add a call to the global wfWaitForSlaves() [19:37:02] hashar so I can get this ready for next time, how do you do a git review to the branch that it needs to be on? [19:37:24] git-review [19:37:40] and if unsure run it with -n (dry run) which will show you the command it will end up doing eventually [19:37:44] so eg: [19:37:44] hashar ah! thankks [19:37:51] git-review -n wmf/1.30.0-wmf.2 [19:38:26] hashar cool cool. I'll have a cherry-picked patch ready for next time. 
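The exchange above suggests the script worked locally because the local data set was smaller than the batch size, so the replication wait on line 82 never ran. A minimal Python sketch of that pattern follows. It is not the Echo script itself; `process_users`, `broken_wait`, and the batch size are hypothetical names chosen for illustration.

```python
def process_users(user_ids, batch_size, wait_for_replication):
    """Process ids one at a time, pausing for replication after each full batch.

    Returns the running total of processed ids.
    """
    processed = 0
    for _ in user_ids:
        processed += 1
        # Only reached when a full batch completes, so a bug in this branch
        # stays hidden for inputs smaller than batch_size.
        if processed % batch_size == 0:
            wait_for_replication()
    return processed


def broken_wait():
    # Stand-in for calling a method on an undefined variable.
    raise RuntimeError("replication helper is not defined")


# Small "local" data set: the wait branch never runs, so nothing fails.
small_total = process_users(range(3), batch_size=5, wait_for_replication=broken_wait)

# Larger "production" data set: the first full batch triggers the failure.
try:
    process_users(range(12), batch_size=5, wait_for_replication=broken_wait)
    large_failed = False
except RuntimeError:
    large_failed = True
```

Testing with an input at least one batch long, or temporarily lowering the batch size, would exercise the branch before deployment.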
[19:38:34] davidwbarratt: I am running the script again [19:38:46] I have live hacked a call to wfWaitForSlaves() [19:38:59] hashar okie dokie. :) [19:39:19] hashar you are braver than I am. :) [19:40:25] davidwbarratt: this is what I did https://gerrit.wikimedia.org/r/383182 updatePerUserBlacklist wfWaitForSlaves() [19:41:12] and the batch output is a bit odd $this->output( "Updated $processed Users\n" ); [19:41:23] on each batch completion it shows the total number of user updated :D [19:41:26] so far [19:41:42] which kind of confused me :] But it is of no importance I guess [19:42:17] (CR) Hashar: [C: 2] Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:42:22] jdlrobson: doing it [19:42:34] jdlrobson: then really for experimental / debug stuff. Beta cluster is usually safer :] [19:43:07] hashar looks good to me [19:43:50] (Merged) jenkins-bot: Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:43:55] oh weird, yeah I obviously didn't test with a large data set [19:45:14] davidwbarratt: script is completed :) [19:45:22] hashar YAYAYAY! [19:45:23] I got a log of some partial session if that can help [19:45:42] hashar eh, as long as it completed it should all be good to go [19:45:43] on terbium.eqiad.wmnet in /home/hashar/echo_blacklist.log [19:46:10] (CR) jenkins-bot: Enable new print styles on Vector in test wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/383180 (https://phabricator.wikimedia.org/T169732) (owner: Jdlrobson) [19:47:22] jdlrobson: syncing [19:47:38] hashar: thx. it's just this logo issue i need to work out. 
Hopefully we can wrap that up today for tomorrow [19:47:50] davidwbarratt: and you will want to cherry pick https://gerrit.wikimedia.org/r/#/c/383182/ to the master branch [19:48:00] davidwbarratt: then get it reviewed by someone who knows better than me :] [19:48:04] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable new print styles on Vector in test wiki - T169732 (duration: 00m 47s) [19:48:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:48:11] T169732: Deploy new desktop print styles on all projects - https://phabricator.wikimedia.org/T169732 [19:48:24] jdlrobson: sure thing. I am not there tomorrow though, but maybe I will be there in my evening :D [19:48:47] hashar kk, thank you so much! I'm glad that's done! [19:49:12] hashar: heads up i might need your help debugging https://phabricator.wikimedia.org/T177672 later this week [19:50:42] !log hashar@tin Synchronized php-1.31.0-wmf.2/extensions/Echo/maintenance/updatePerUserBlacklist.php: Sync a live hack in Echo updatePerUserBlacklist https://gerrit.wikimedia.org/r/#/c/383182/ - T173475 (duration: 00m 47s) [19:50:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:50:48] T173475: Echo Notification Mute (Block List) can be bypassed by changing username - https://phabricator.wikimedia.org/T173475 [19:50:56] jdlrobson: not sure I will have much time for debugging this week :((( [19:51:08] !log morning swat completed [19:51:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:51:16] I am taking a break [19:51:30] will then monitor perf metrics and such to see whether something is snowballing [19:51:32] hashar: is it synced? [19:51:38] (the config change for testwiki?) [19:51:45] should be? 
:) [19:52:24] maybe 5 min cache problem [19:52:28] doing it again [19:52:56] maybe that is hold in resource loader cache [19:53:03] which is known to not always invalidate itself [19:53:05] hashar here's the cherry-pick https://gerrit.wikimedia.org/r/#/c/383183/ [19:53:11] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable new print styles on Vector in test wiki - T169732 (duration: 00m 47s) [19:53:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:53:17] T169732: Deploy new desktop print styles on all projects - https://phabricator.wikimedia.org/T169732 [19:53:27] jdlrobson: i synced it again [19:54:31] hashar: yeh im seeing it [19:54:44] jdlrobson: \o/ [19:55:02] 19:48 Synchronized wmf-config/InitialiseSettings.php: Enable new print styles on Vector in test wiki - T169732 (duration: 00m 47s) [19:55:08] so I did it 7 minutes ago [19:55:15] and re did it after [19:55:22] but most probably that sounds like a 5 minutes TTL cache [19:55:37] davidwbarratt: awesome. Lets see what proper reviewers will say about it :) [19:58:07] thanks hashar i too need a break :) [19:58:47] (PS1) Nuria: [WIP] Removing from whitelist tables that no longer exist [puppet] - https://gerrit.wikimedia.org/r/383185 (https://phabricator.wikimedia.org/T171629) [20:00:05] gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: Time to snap out of that daydream and deploy Services – Parsoid / OCG / Citoid / Mobileapps / ORES / …. Get on with it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T2000). [20:00:05] No GERRIT patches in the queue for this window AFAICS. 
[20:07:29] all looks gone [20:07:34] so I guess it is sleep time & [20:08:31] PROBLEM - HHVM rendering on mw2126 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:09:30] RECOVERY - HHVM rendering on mw2126 is OK: HTTP OK: HTTP/1.1 200 OK - 77198 bytes in 0.318 second response time [20:36:14] (PS2) Nuria: [WIP] Removing from whitelist tables that no longer exist [puppet] - https://gerrit.wikimedia.org/r/383185 (https://phabricator.wikimedia.org/T171629) [20:37:52] (PS1) Andrew Bogott: puppetmaster: don't include ruby-ldap packages [puppet] - https://gerrit.wikimedia.org/r/383192 [20:38:47] (CR) Andrew Bogott: [C: 2] puppetmaster: don't include ruby-ldap packages [puppet] - https://gerrit.wikimedia.org/r/383192 (owner: Andrew Bogott) [20:43:17] (PS1) Jdlrobson: Disable OCG services [mediawiki-config] - https://gerrit.wikimedia.org/r/383210 (https://phabricator.wikimedia.org/T177795) [20:50:16] (PS1) Hashar: PHP CodeSniffer no more process autogenerated files [mediawiki-config] - https://gerrit.wikimedia.org/r/383234 [20:54:30] (CR) Hashar: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/383234 (owner: Hashar) [21:00:04] dapatrick, bawolff, and Reedy: #bothumor My software never has bugs. It just develops random features. Rise for Weekly Security deployment window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T2100). [21:00:04] No GERRIT patches in the queue for this window AFAICS. 
[21:00:20] PROBLEM - Nginx local proxy to apache on mw2121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:01:20] RECOVERY - Nginx local proxy to apache on mw2121 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.199 second response time [21:02:34] (PS3) Volans: Docstrings: use Google Style [software/cumin] - https://gerrit.wikimedia.org/r/382479 (https://phabricator.wikimedia.org/T159308) [21:02:36] (PS3) Volans: Documentation: convert Markdown to reStructuredText [software/cumin] - https://gerrit.wikimedia.org/r/382480 (https://phabricator.wikimedia.org/T159308) [21:02:38] (PS5) Volans: CLI: extract parser definition from parse_args() [software/cumin] - https://gerrit.wikimedia.org/r/382481 [21:02:40] (PS5) Volans: setup.py: prepare for PyPi submission [software/cumin] - https://gerrit.wikimedia.org/r/382482 [21:02:42] (PS10) Volans: Documentation: Sphinx setup [software/cumin] - https://gerrit.wikimedia.org/r/382483 (https://phabricator.wikimedia.org/T159308) [21:02:44] (PS10) Volans: setup.py and tox: spit dependencies [software/cumin] - https://gerrit.wikimedia.org/r/382484 [21:12:42] (CR) Volans: "replies inline, thanks for the review!" (5 comments) [software/cumin] - https://gerrit.wikimedia.org/r/382479 (https://phabricator.wikimedia.org/T159308) (owner: Volans) [21:26:52] Operations, Puppet, Operations-Software-Development: Consider adding a --skip-conftool option to puppet-merge - https://phabricator.wikimedia.org/T157133#3670734 (Volans) Open>stalled @jcrespo is this still happening? 
[22:18:02] !log Removed empty left-over xmldatadumps/public/other/wikibase/wikidatawiki/20171007 (https://dumps.wikimedia.org/wikidatawiki/entities/20171007/) [22:18:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:26:38] Operations, DBA, Support-and-Safety, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3670808 (Reedy) >>! In T174370#3670375, @jrbs wrote: > I've rebased (thanks, @Dzahn!) and scheduled this to go out on SWAT on Wednesd... [22:33:47] Operations, DBA, Support-and-Safety, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3670812 (jrbs) >>! In T174370#3670808, @Reedy wrote: >>>! In T174370#3670375, @jrbs wrote: >> I've rebased (thanks, @Dzahn!) and sche... [22:34:21] Operations, DBA, Support-and-Safety, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3670813 (Reedy) >>! In T174370#3670812, @jrbs wrote: > What do you recommend? (I am absolutely new to this.) Ask me nicely and I'll... [22:36:15] Operations, DBA, Support-and-Safety, Patch-For-Review, Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3670814 (jrbs) >>! In T174370#3670813, @Reedy wrote: >>>! In T174370#3670812, @jrbs wrote: >> What do you recommend? (I am absolutely... [23:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: It is that lovely time of the day again! You are hereby commanded to deploy Evening SWAT (Max 8 patches). (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171009T2300). [23:00:04] No GERRIT patches in the queue for this window AFAICS.