[00:38:10] (CR) Tim Starling: [C: 1] "Reviewed LBFactoryMulti, LBFactory, LoadBalancer, DatabaseMysqli. Should work." [mediawiki-config] - https://gerrit.wikimedia.org/r/386810 (https://phabricator.wikimedia.org/T178553) (owner: Marostegui)
[01:04:13] PROBLEM - Check health of redis instance on 6379 on rdb2003 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6379
[01:04:33] PROBLEM - Check health of redis instance on 6479 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1509325468 600 - REDIS 2.8.17 on 127.0.0.1:6479 has 1 databases (db0) with 3950789 keys, up 4 minutes 26 seconds - replication_delay is 1509325468
[01:04:52] PROBLEM - Check health of redis instance on 6481 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1509325485 600 - REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 3947910 keys, up 4 minutes 42 seconds - replication_delay is 1509325485
[01:05:02] PROBLEM - Check health of redis instance on 6480 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1509325496 600 - REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 3949937 keys, up 4 minutes 54 seconds - replication_delay is 1509325496
[01:05:22] RECOVERY - Check health of redis instance on 6379 on rdb2003 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6379 has 1 databases (db0) with 8643998 keys, up 5 minutes 11 seconds - replication_delay is 0
[01:06:02] RECOVERY - Check health of redis instance on 6480 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 3939987 keys, up 5 minutes 55 seconds - replication_delay is 0
[01:06:33] RECOVERY - Check health of redis instance on 6479 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6479 has 1 databases (db0) with 3940655 keys, up 6 minutes 28 seconds - replication_delay is 0
[01:06:53] RECOVERY - Check health of redis instance on 6481 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 3939040 keys, up 6 minutes 46 seconds -
replication_delay is 0
[02:35:28] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.4) (duration: 11m 18s)
[02:35:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:04:26] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.5) (duration: 11m 38s)
[03:04:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:11:37] !log l10nupdate@tin ResourceLoader cache refresh completed at Mon Oct 30 03:11:37 UTC 2017 (duration 7m 11s)
[03:11:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:26:02] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 790.41 seconds
[04:02:12] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 294.81 seconds
[05:58:09] !log Deploy alter table on db1075 (s3 primary master) - T174509
[05:58:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:58:16] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[06:06:54] (PS1) Marostegui: db-codfw.php: Depool db2040 [mediawiki-config] - https://gerrit.wikimedia.org/r/387163 (https://phabricator.wikimedia.org/T178359)
[06:08:40] !log Stop MySQL on db2040 to populate s7 on db2086 - T178359
[06:08:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:08:46] T178359: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359
[06:08:55] (CR) Marostegui: [C: 2] db-codfw.php: Depool db2040 [mediawiki-config] - https://gerrit.wikimedia.org/r/387163 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[06:10:45] (Merged) jenkins-bot: db-codfw.php: Depool db2040 [mediawiki-config] - https://gerrit.wikimedia.org/r/387163 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[06:10:56] (CR) jenkins-bot: db-codfw.php: Depool db2040
[mediawiki-config] - https://gerrit.wikimedia.org/r/387163 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[06:12:23] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Depool db2040 - T178359 (duration: 00m 51s)
[06:12:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:24:50] !log Optimize dewiki.pagelinks and dewiki.templatelinks on db1063 (s5 master) - T174509
[06:24:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:24:57] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[06:27:22] PROBLEM - ores on scb2006 is CRITICAL: connect to address 10.192.32.20 and port 8081: Connection refused
[06:34:06] (PS1) Marostegui: mariadb: Move db1103 from s3 to s4 [puppet] - https://gerrit.wikimedia.org/r/387165 (https://phabricator.wikimedia.org/T161088)
[06:35:30] (PS1) Marostegui: db-eqiad.php: Move db1103 from s3 to s4 [mediawiki-config] - https://gerrit.wikimedia.org/r/387166 (https://phabricator.wikimedia.org/T161088)
[06:36:31] (PS2) Marostegui: db-eqiad.php: Move db1103 from s3 to s4 [mediawiki-config] - https://gerrit.wikimedia.org/r/387166 (https://phabricator.wikimedia.org/T161088)
[06:38:18] (CR) Marostegui: [C: 2] db-eqiad.php: Move db1103 from s3 to s4 [mediawiki-config] - https://gerrit.wikimedia.org/r/387166 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:40:07] (Merged) jenkins-bot: db-eqiad.php: Move db1103 from s3 to s4 [mediawiki-config] - https://gerrit.wikimedia.org/r/387166 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:40:23] (CR) jenkins-bot: db-eqiad.php: Move db1103 from s3 to s4 [mediawiki-config] - https://gerrit.wikimedia.org/r/387166 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:40:53] !log Stop MySQL on db1103 to reclone it - T161088
[06:40:59] Logged the message at
https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:41:01] T161088: Migrate some s4 hosts to file per table - https://phabricator.wikimedia.org/T161088
[06:41:15] (CR) Marostegui: [C: 2] "Puppet looks good: https://puppet-compiler.wmflabs.org/compiler02/8530/" [puppet] - https://gerrit.wikimedia.org/r/387165 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:41:23] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Move db1103 from s3 to s4 - T161088 (duration: 00m 49s)
[06:41:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:43:12] (PS1) Marostegui: db-eqiad.php: Depool db1097 [mediawiki-config] - https://gerrit.wikimedia.org/r/387167 (https://phabricator.wikimedia.org/T161088)
[06:44:59] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1097 [mediawiki-config] - https://gerrit.wikimedia.org/r/387167 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:46:13] (Merged) jenkins-bot: db-eqiad.php: Depool db1097 [mediawiki-config] - https://gerrit.wikimedia.org/r/387167 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:46:21] (CR) jenkins-bot: db-eqiad.php: Depool db1097 [mediawiki-config] - https://gerrit.wikimedia.org/r/387167 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[06:47:16] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1097 - T161088 (duration: 00m 50s)
[06:47:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:47:22] RECOVERY - ores on scb2006 is OK: HTTP OK: HTTP/1.0 200 OK - 3666 bytes in 0.083 second response time
[06:47:22] T161088: Migrate some s4 hosts to file per table - https://phabricator.wikimedia.org/T161088
[06:51:17] !log Stop MySQL on db1097 to clone db1103 - T161088
[06:51:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:04:03] PROBLEM - puppet last run on db2018 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[07:08:53] RECOVERY - puppet last run on db2018 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[07:41:53] (PS1) Marostegui: install_server: Reimage db2087 as stretch [puppet] - https://gerrit.wikimedia.org/r/387168 (https://phabricator.wikimedia.org/T178359)
[07:42:50] (CR) Marostegui: [C: 2] install_server: Reimage db2087 as stretch [puppet] - https://gerrit.wikimedia.org/r/387168 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[07:44:52] (PS1) Marostegui: s3,s4.hosts: Move db1103 from s3 to s4 [software] - https://gerrit.wikimedia.org/r/387169 (https://phabricator.wikimedia.org/T161088)
[07:46:46] (CR) Marostegui: [C: 2] s3,s4.hosts: Move db1103 from s3 to s4 [software] - https://gerrit.wikimedia.org/r/387169 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[07:47:57] (Merged) jenkins-bot: s3,s4.hosts: Move db1103 from s3 to s4 [software] - https://gerrit.wikimedia.org/r/387169 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[07:51:40] PROBLEM - Check systemd state on stat1005 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[07:53:21] (PS1) Marostegui: mariadb: Add db2087 to s6 and s7 [puppet] - https://gerrit.wikimedia.org/r/387170 (https://phabricator.wikimedia.org/T178359)
[07:53:54] (CR) jerkins-bot: [V: -1] mariadb: Add db2087 to s6 and s7 [puppet] - https://gerrit.wikimedia.org/r/387170 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[07:56:33] (PS2) Marostegui: mariadb: Add db2087 to s6 and s7 [puppet] - https://gerrit.wikimedia.org/r/387170 (https://phabricator.wikimedia.org/T178359)
[07:58:54] (CR) Giuseppe Lavagetto: [V: 2 C: 2] First version of scap deployment of docker-pkg (2 comments) [docker-images/docker-pkg/deploy] - https://gerrit.wikimedia.org/r/386641 (owner: Giuseppe Lavagetto)
[08:00:49] (CR) Giuseppe Lavagetto: [V: 2 C: 2] Artifacts for the first release [docker-images/docker-pkg/deploy] - https://gerrit.wikimedia.org/r/386878 (owner: Giuseppe Lavagetto)
[08:01:40] (CR) Marostegui: [C: 2] "Puppet looks good: https://puppet-compiler.wmflabs.org/compiler02/8531/" [puppet] - https://gerrit.wikimedia.org/r/387170 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[08:11:48] !log Stop MySQL on db2086 to transfer s7 to db2087 - T178359
[08:11:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:11:55] T178359: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359
[08:16:49] (PS1) Marostegui: Revert "db-codfw.php: Depool db2040" [mediawiki-config] - https://gerrit.wikimedia.org/r/387174
[08:16:54] (PS2) Marostegui: Revert "db-codfw.php: Depool db2040" [mediawiki-config] - https://gerrit.wikimedia.org/r/387174
[08:19:32] (PS1) Giuseppe Lavagetto: role::deployment_server: add docker-pkg [puppet] - https://gerrit.wikimedia.org/r/387175
[08:19:52] (CR) Marostegui: [C: 2] Revert "db-codfw.php: Depool db2040" [mediawiki-config] - https://gerrit.wikimedia.org/r/387174 (owner: Marostegui)
[08:20:33] (CR)
Giuseppe Lavagetto: [C: 2] role::deployment_server: add docker-pkg [puppet] - https://gerrit.wikimedia.org/r/387175 (owner: Giuseppe Lavagetto)
[08:20:39] (PS2) Giuseppe Lavagetto: role::deployment_server: add docker-pkg [puppet] - https://gerrit.wikimedia.org/r/387175
[08:22:01] (Merged) jenkins-bot: Revert "db-codfw.php: Depool db2040" [mediawiki-config] - https://gerrit.wikimedia.org/r/387174 (owner: Marostegui)
[08:22:10] (CR) jenkins-bot: Revert "db-codfw.php: Depool db2040" [mediawiki-config] - https://gerrit.wikimedia.org/r/387174 (owner: Marostegui)
[08:22:50] (PS1) Marostegui: s6,s7.hosts: Add db2087 [software] - https://gerrit.wikimedia.org/r/387177 (https://phabricator.wikimedia.org/T178359)
[08:23:06] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Repool db2040 - T178359 (duration: 00m 50s)
[08:23:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:23:13] T178359: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359
[08:24:02] (PS6) Dzahn: ci::slave::labs: Drop host_aliases for gerrit.wm.org ssh [puppet] - https://gerrit.wikimedia.org/r/386257 (owner: Paladox)
[08:24:47] (PS1) Marostegui: db-codfw.php: Depool db2039 [mediawiki-config] - https://gerrit.wikimedia.org/r/387179
[08:25:18] (PS2) Marostegui: db-codfw.php: Depool db2039 [mediawiki-config] - https://gerrit.wikimedia.org/r/387179
[08:25:36] (CR) Dzahn: [C: 2] ci::slave::labs: Drop host_aliases for gerrit.wm.org ssh [puppet] - https://gerrit.wikimedia.org/r/386257 (owner: Paladox)
[08:25:49] (PS2) Marostegui: s6,s7.hosts: Add db2087 [software] - https://gerrit.wikimedia.org/r/387177 (https://phabricator.wikimedia.org/T178359)
[08:26:13] (PS1) ArielGlenn: fix arg initialization issue [dumps] - https://gerrit.wikimedia.org/r/387181
[08:26:17] (PS2) Dzahn: rm requesttracker::labs class [puppet] - https://gerrit.wikimedia.org/r/385495
[08:26:54]
(PS2) Gehel: wdqs: remove PrintPLAB from GC logging [puppet] - https://gerrit.wikimedia.org/r/386791 (https://phabricator.wikimedia.org/T175919)
[08:27:07] (CR) Dzahn: [C: 2] rm requesttracker::labs class [puppet] - https://gerrit.wikimedia.org/r/385495 (owner: Dzahn)
[08:27:19] (CR) ArielGlenn: [C: 2] fix arg initialization issue [dumps] - https://gerrit.wikimedia.org/r/387181 (owner: ArielGlenn)
[08:27:43] (CR) Marostegui: [C: 2] s6,s7.hosts: Add db2087 [software] - https://gerrit.wikimedia.org/r/387177 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[08:27:49] (CR) Gehel: [C: 2] wdqs: remove PrintPLAB from GC logging [puppet] - https://gerrit.wikimedia.org/r/386791 (https://phabricator.wikimedia.org/T175919) (owner: Gehel)
[08:27:59] (PS3) Gehel: wdqs: remove PrintPLAB from GC logging [puppet] - https://gerrit.wikimedia.org/r/386791 (https://phabricator.wikimedia.org/T175919)
[08:28:01] (CR) Marostegui: [C: 2] db-codfw.php: Depool db2039 [mediawiki-config] - https://gerrit.wikimedia.org/r/387179 (owner: Marostegui)
[08:28:05] !log ariel@tin Started deploy [dumps/dumps@c204c72]: fix args initialization issue in getconfigvals script
[08:28:07] !log ariel@tin Finished deploy [dumps/dumps@c204c72]: fix args initialization issue in getconfigvals script (duration: 00m 02s)
[08:28:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:28:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:28:29] (Merged) jenkins-bot: s6,s7.hosts: Add db2087 [software] - https://gerrit.wikimedia.org/r/387177 (https://phabricator.wikimedia.org/T178359) (owner: Marostegui)
[08:29:14] (Merged) jenkins-bot: db-codfw.php: Depool db2039 [mediawiki-config] - https://gerrit.wikimedia.org/r/387179 (owner: Marostegui)
[08:29:25] (CR) jenkins-bot: db-codfw.php: Depool db2039 [mediawiki-config] - https://gerrit.wikimedia.org/r/387179 (owner:
Marostegui)
[08:30:03] PROBLEM - puppet last run on naos is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 minutes ago with 1 failures. Failed resources (up to 3 shown): Scap_source[docker-pkg/deploy]
[08:30:24] _joe_: ^^^ ;)
[08:30:37] <_joe_> interesting
[08:30:43] <_joe_> it worked like a charm on tin
[08:31:00] <_joe_> I think that's a race condition of some sort
[08:31:36] (CR) Dzahn: [C: 2] extdist: use profile::labs::lvm::srv instead of role [puppet] - https://gerrit.wikimedia.org/r/385477 (owner: Hashar)
[08:31:41] <_joe_> even simpler: gerrit failure afaics
[08:31:42] (PS2) Dzahn: extdist: use profile::labs::lvm::srv instead of role [puppet] - https://gerrit.wikimedia.org/r/385477 (owner: Hashar)
[08:31:56] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Depool db2039 (duration: 00m 50s)
[08:32:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:32:02] RECOVERY - Check systemd state on stat1005 is OK: OK - running: The system is fully operational
[08:32:10] <_joe_> indeed
[08:32:23] !log rolling restart of wdqs for config reload - T175919
[08:32:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:32:29] T175919: investigate GC times on wikidata query service - https://phabricator.wikimedia.org/T175919
[08:33:09] !log installing wget security updates
[08:33:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:33:43] (PS1) Marostegui: db-eqiad.php: Repool db1097 with low weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387183 (https://phabricator.wikimedia.org/T161088)
[08:35:03] RECOVERY - puppet last run on naos is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[08:35:56] (CR) Marostegui: [C: 2] db-eqiad.php: Repool db1097 with low weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387183 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[08:36:52]
(PS9) Elukey: role::mediawiki::jobrunner: inc runners for refreshLinks/htmlCacheUpdate [puppet] - https://gerrit.wikimedia.org/r/386636 (https://phabricator.wikimedia.org/T173710)
[08:37:09] (Merged) jenkins-bot: db-eqiad.php: Repool db1097 with low weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387183 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[08:37:21] (CR) jenkins-bot: db-eqiad.php: Repool db1097 with low weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387183 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[08:37:41] (PS4) Dzahn: Switch use of jenkins for >= stretch to thirdparty/ci [puppet] - https://gerrit.wikimedia.org/r/385169 (owner: Muehlenhoff)
[08:38:08] (CR) Elukey: [C: 2] role::mediawiki::jobrunner: inc runners for refreshLinks/htmlCacheUpdate [puppet] - https://gerrit.wikimedia.org/r/386636 (https://phabricator.wikimedia.org/T173710) (owner: Elukey)
[08:38:09] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1097 with low weight - T161088 (duration: 00m 50s)
[08:38:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:38:15] T161088: Migrate some s4 hosts to file per table - https://phabricator.wikimedia.org/T161088
[08:38:37] (CR) Dzahn: [C: 2] Switch use of jenkins for >= stretch to thirdparty/ci [puppet] - https://gerrit.wikimedia.org/r/385169 (owner: Muehlenhoff)
[08:38:43] (PS5) Dzahn: Switch use of jenkins for >= stretch to thirdparty/ci [puppet] - https://gerrit.wikimedia.org/r/385169 (owner: Muehlenhoff)
[08:39:22] !log Stop replication on db2039 to reimport and compress tables
[08:39:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:40:37] (CR) Dzahn: "yep, confirmed on releases*:" [puppet] - https://gerrit.wikimedia.org/r/385169 (owner: Muehlenhoff)
[08:42:42] !log raised priority of refreshlink and htmlcacheupdate job execution on
jobrunners (https://gerrit.wikimedia.org/r/#/c/386636/) - T173710
[08:42:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:42:49] T173710: Job queue is increasing non-stop - https://phabricator.wikimedia.org/T173710
[08:43:53] (CR) Dzahn: [C: 1] "@bblack was there a reason you wanted us to wait on these 'switch to $cache_misc source range' changes? RT would be perfect for testing. h" [puppet] - https://gerrit.wikimedia.org/r/366812 (owner: Muehlenhoff)
[08:50:12] (CR) Dzahn: "per ticket comments - looks like this needs "be more db friendly" first at https://gerrit.wikimedia.org/r/#/c/384429/" [puppet] - https://gerrit.wikimedia.org/r/382631 (https://phabricator.wikimedia.org/T176754) (owner: EddieGP)
[08:51:02] !log cp4022: restart varnish-be for mbox lag
[08:51:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:53:34] (CR) Gehel: puppet: change elasticsearch_5 template to check undef(nil) variable (1 comment) [puppet] - https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) (owner: Herron)
[08:55:22] RECOVERY - Check Varnish expiry mailbox lag on cp4022 is OK: OK: expiry mailbox lag is 0
[08:58:25] bblack: ema: Is there any reason not to revert all the Wikidata things back to where we were Thursday afternoon?
[08:58:50] hoo: morgen! Are you still having issues?
[08:59:54] Nope, everything looks fine from our side (the root issue is probably still open, but that's almost certainly unrelated)
[09:00:42] right.
So, my understanding from reading irc backlog+task is that https://gerrit.wikimedia.org/r/#/c/387059/ stopped the symptoms
[09:03:31] Operations, Ops-Access-Requests: Add legoktm to releasers-mediawiki - https://phabricator.wikimedia.org/T179264#3718668 (Dzahn)
[09:03:47] (PS1) Marostegui: db-eqiad.php: Increase db1097 API weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387187 (https://phabricator.wikimedia.org/T161088)
[09:03:47] Operations, Ops-Access-Requests: Add legoktm to releasers-mediawiki - https://phabricator.wikimedia.org/T179264#3718681 (Dzahn)
[09:03:54] hoo: I'm now trying to figure out how to identify "indefinite" responses on the varnish side (as per T179156#3718225). If you do spot recent interesting changes in wikidata in that context please let me know
[09:03:54] T179156: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156
[09:03:54] ema: It did, yes
[09:04:02] PROBLEM - Router interfaces on cr1-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 66, down: 1, dormant: 0, excluded: 0, unused: 0
[09:04:15] I will… but I don't think we're doing anything remotely like that
[09:04:16] (PS2) Dzahn: admin: Add legoktm to releasers-mediawiki to upload wikidiff2 tarballs [puppet] - https://gerrit.wikimedia.org/r/386896 (https://phabricator.wikimedia.org/T179264) (owner: Legoktm)
[09:04:32] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0
[09:04:52] (CR) Dzahn: [C: 1] admin: Add legoktm to releasers-mediawiki to upload wikidiff2 tarballs [puppet] - https://gerrit.wikimedia.org/r/386896 (https://phabricator.wikimedia.org/T179264) (owner: Legoktm)
[09:05:51] Operations, Ops-Access-Requests, Patch-For-Review: Add legoktm to releasers-mediawiki - https://phabricator.wikimedia.org/T179264#3718692 (Dzahn) p:Triage>Normal
[09:06:30] (CR)
Marostegui: [C: 2] db-eqiad.php: Increase db1097 API weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387187 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[09:07:09] Operations, CirrusSearch, Discovery, MediaWiki-JobQueue, and 6 others: Job queue is increasing non-stop - https://phabricator.wikimedia.org/T173710#3718694 (elukey) >>! In T173710#3717940, @Jack_who_built_the_house wrote: > On ruwiki, many editors are complaining about slow updating of pages with...
[09:07:39] (Merged) jenkins-bot: db-eqiad.php: Increase db1097 API weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387187 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[09:07:41] I'm going to schedule turning everything back to pre-issue state starting at 11:00 UTC… will schedule it in a second
[09:07:48] (CR) jenkins-bot: db-eqiad.php: Increase db1097 API weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387187 (https://phabricator.wikimedia.org/T161088) (owner: Marostegui)
[09:07:49] ema: ^ fine w/ you?
[09:08:43] !log Drop wb_entity_per_page page from s3 and s5 - T177601
[09:08:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:08:49] T177601: Deploy dropping wb_entity_per_page table - https://phabricator.wikimedia.org/T177601
[09:08:52] hoo: yes
[09:08:54] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Increase db1097 api traffic - T161088 (duration: 00m 50s)
[09:09:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:09:02] T161088: Migrate some s4 hosts to file per table - https://phabricator.wikimedia.org/T161088
[09:22:01] (PS1) Hoo man: Revert "Wikidatawiki to wmf.4" [mediawiki-config] - https://gerrit.wikimedia.org/r/387188 (https://phabricator.wikimedia.org/T179156)
[09:22:03] (PS1) Hoo man: Revert "Disable constraints check with SPARQL for now" [mediawiki-config] - https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156)
[09:22:05] (PS1) Hoo man: Revert "Revert "Add property for RDF mapping of external identifiers for Wikidata"" [mediawiki-config] - https://gerrit.wikimedia.org/r/387190 (https://phabricator.wikimedia.org/T179156)
[09:22:11] !log Stop replication in sync on db2039 and db2046 to reimport tables
[09:22:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:25:49] Operations, CirrusSearch, Discovery, MediaWiki-JobQueue, and 6 others: Job queue is increasing non-stop - https://phabricator.wikimedia.org/T173710#3718725 (Jack_who_built_the_house) Thanks for the reply. It just surprises me that [[https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&si...
[09:29:38] (PS1) ArielGlenn: allow dumpsdata hosts to sync to/from ms1001 and dataset1001 [puppet] - https://gerrit.wikimedia.org/r/387197
[09:30:23] RECOVERY - Router interfaces on cr1-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 68, down: 0, dormant: 0, excluded: 0, unused: 0
[09:30:52] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[09:32:12] (PS1) Addshore: Set wgWikiDiff2MovedParagraphDetectionCutoff for group0 [mediawiki-config] - https://gerrit.wikimedia.org/r/387198 (https://phabricator.wikimedia.org/T177891)
[09:32:27] Operations, wikidiff2, Patch-For-Review, User-Addshore, WMDE-QWERTY-Team-Board: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3718734 (Addshore) a:Addshore
[09:37:20] (PS6) Addshore: WIP DNM Add ::statistics::wmde::wikidata_concepts [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258)
[09:37:31] (PS2) ArielGlenn: allow dumpsdata hosts to sync to/from ms1001 and dataset1001 [puppet] - https://gerrit.wikimedia.org/r/387197
[09:38:21] (CR) Alexandros Kosiaris: [C: 1] puppet: change dbstore_multiinstance mariadb groups call to full name [puppet] - https://gerrit.wikimedia.org/r/387151 (https://phabricator.wikimedia.org/T179161) (owner: Herron)
[09:38:24] (CR) ArielGlenn: [C: 2] allow dumpsdata hosts to sync to/from ms1001 and dataset1001 [puppet] - https://gerrit.wikimedia.org/r/387197 (owner: ArielGlenn)
[09:39:43] (CR) Alexandros Kosiaris: [C: 1] puppet: change mediawiki refreshlinks cronjob call to use full name [puppet] - https://gerrit.wikimedia.org/r/387142 (https://phabricator.wikimedia.org/T179177) (owner: Herron)
[09:42:31] (CR) Alexandros Kosiaris: [C: 1] puppet: change ganglia aggregator site_instances call to full name [puppet] - https://gerrit.wikimedia.org/r/387139
(https://phabricator.wikimedia.org/T179165) (owner: Herron)
[09:46:07] (PS1) ArielGlenn: add dumpsdata hosts to list for rsync ferm rules for dumps servers [puppet] - https://gerrit.wikimedia.org/r/387201
[09:47:34] (CR) Alexandros Kosiaris: puppet: change elasticsearch_5 template to check undef(nil) variable (1 comment) [puppet] - https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) (owner: Herron)
[09:48:13] (PS2) ArielGlenn: add dumpsdata hosts to list for rsync ferm rules for dumps servers [puppet] - https://gerrit.wikimedia.org/r/387201
[09:50:28] (CR) ArielGlenn: [C: 2] add dumpsdata hosts to list for rsync ferm rules for dumps servers [puppet] - https://gerrit.wikimedia.org/r/387201 (owner: ArielGlenn)
[09:50:55] !log mobrovac@tin Started deploy [restbase/deploy@2b5889b]: Double-process all summaries and include the parsoid no-op switch
[09:51:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:52:39] (PS1) Marostegui: db-eqiad.php: Restore db1097 original weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387203
[09:54:12] PROBLEM - Restbase root url on restbase1007 is CRITICAL: connect to address 10.64.0.223 and port 7231: Connection refused
[09:54:31] known ^
[09:54:32] (PS7) Addshore: Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258)
[09:54:35] (CR) Marostegui: [C: 2] db-eqiad.php: Restore db1097 original weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387203 (owner: Marostegui)
[09:54:57] (PS1) Giuseppe Lavagetto: profile::docker::builder: add scap deployment of docker-pkg [puppet] - https://gerrit.wikimedia.org/r/387204
[09:55:03] (CR) jerkins-bot: [V: -1] Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: Addshore)
[09:55:45] (Merged) jenkins-bot:
db-eqiad.php: Restore db1097 original weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387203 (owner: Marostegui)
[09:56:19] (CR) jenkins-bot: db-eqiad.php: Restore db1097 original weight [mediawiki-config] - https://gerrit.wikimedia.org/r/387203 (owner: Marostegui)
[09:57:09] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Restore db1097 original weight - T161088 (duration: 00m 52s)
[09:57:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:57:15] T161088: Migrate some s4 hosts to file per table - https://phabricator.wikimedia.org/T161088
[09:57:23] PROBLEM - Check systemd state on restbase1007 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[10:01:09] (PS8) Addshore: Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258)
[10:01:42] (CR) jerkins-bot: [V: -1] Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: Addshore)
[10:02:19] (PS9) Addshore: Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258)
[10:03:07] !log mobrovac@tin Finished deploy [restbase/deploy@2b5889b]: Double-process all summaries and include the parsoid no-op switch (duration: 12m 12s)
[10:03:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:04:27] (PS10) Addshore: Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258)
[10:05:23] RECOVERY - Check systemd state on restbase1007 is OK: OK - running: The system is fully operational
[10:05:47] (CR) Addshore: "So, we tried to add include ::r_lang within the wdcm manifest but the puppet tests complained."
[puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: Addshore)
[10:07:32] (PS11) Addshore: Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258)
[10:08:06] (CR) GoranSMilovanovic: [C: 1] Add ::statistics::wmde::wdcm [puppet] - https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: Addshore)
[10:09:57] (PS3) Filippo Giunchedi: prometheus: add redis_exporter class and profile [puppet] - https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637)
[10:09:59] (PS2) Filippo Giunchedi: hieradata: add redis stretch deployment-prep instances [puppet] - https://gerrit.wikimedia.org/r/386869 (https://phabricator.wikimedia.org/T148637)
[10:10:01] (PS2) Filippo Giunchedi: redis: add stretch support [puppet] - https://gerrit.wikimedia.org/r/386870 (https://phabricator.wikimedia.org/T148637)
[10:10:29] (CR) jerkins-bot: [V: -1] prometheus: add redis_exporter class and profile [puppet] - https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: Filippo Giunchedi)
[10:13:01] (CR) Alexandros Kosiaris: [C: -1] puppet: add puppet-master.conf to avoid conflict at pkg install time (1 comment) [puppet] - https://gerrit.wikimedia.org/r/386696 (https://phabricator.wikimedia.org/T179102) (owner: Herron)
[10:14:55] Operations, ORES, Scoring-platform-team, Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3718772 (ema) >>! In T179156#3717895, @BBlack wrote: > My best hypothesis for the "unreasonable" behavior that would break un...
[10:17:09] (03CR) 10Lucas Werkmeister (WMDE): [C: 031] "<3" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [10:18:11] (03CR) 10Hoo man: "> (is it possible to subscribe to a Gerrit change without leaving a comment like this one?)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [10:18:16] 10Operations, 10ops-eqiad, 10Packaging: Deprecate host copper.eqiad.wmnet - https://phabricator.wikimedia.org/T176957#3718773 (10akosiaris) [10:21:17] 10Operations, 10Discovery-Search: search.wikimedia.org is source of lots of 500s - https://phabricator.wikimedia.org/T179266#3718779 (10Ladsgroup) [10:21:59] (03PS1) 10Alexandros Kosiaris: Deprecate copper [puppet] - 10https://gerrit.wikimedia.org/r/387208 (https://phabricator.wikimedia.org/T176957) [10:22:50] !log mobrovac@tin Started deploy [restbase/deploy@2b5889b]: Double-process all summaries and include the parsoid no-op switch [10:22:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:23:00] (03PS4) 10Filippo Giunchedi: prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) [10:23:02] (03PS3) 10Filippo Giunchedi: hieradata: add redis stretch deployment-prep instances [puppet] - 10https://gerrit.wikimedia.org/r/386869 (https://phabricator.wikimedia.org/T148637) [10:23:04] (03PS3) 10Filippo Giunchedi: redis: add stretch support [puppet] - 10https://gerrit.wikimedia.org/r/386870 (https://phabricator.wikimedia.org/T148637) [10:23:32] (03CR) 10jerkins-bot: [V: 04-1] prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: 10Filippo Giunchedi) [10:25:00] (03PS1) 10Alexandros Kosiaris: Deprecate copper [dns] - 10https://gerrit.wikimedia.org/r/387209 
(https://phabricator.wikimedia.org/T176957) [10:25:57] (03CR) 10Alexandros Kosiaris: [C: 032] Deprecate copper [puppet] - 10https://gerrit.wikimedia.org/r/387208 (https://phabricator.wikimedia.org/T176957) (owner: 10Alexandros Kosiaris) [10:27:30] !log mobrovac@tin Finished deploy [restbase/deploy@2b5889b]: Double-process all summaries and include the parsoid no-op switch (duration: 04m 41s) [10:27:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:31:11] !log Stop MySQL on db2039 to copy its data to db2087.s6 - T178359 [10:31:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:31:18] T178359: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359 [10:33:14] (03PS1) 10Ladsgroup: Revert "UBN! disbale ores for wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 [10:35:37] (03CR) 10Alexandros Kosiaris: [C: 032] Deprecate copper [dns] - 10https://gerrit.wikimedia.org/r/387209 (https://phabricator.wikimedia.org/T176957) (owner: 10Alexandros Kosiaris) [10:37:11] 10Operations, 10ops-eqiad, 10Packaging, 10Patch-For-Review: Deprecate host copper.eqiad.wmnet - https://phabricator.wikimedia.org/T176957#3718864 (10akosiaris) 05stalled>03Open p:05Low>03Normal a:05akosiaris>03Cmjohnson [10:37:14] (03CR) 10Hashar: "Note that the hosts contint1001/contint2001 have profile::ci::docker applied which also define thirdparty/ci. 
Albeit with a different name" [puppet] - 10https://gerrit.wikimedia.org/r/385169 (owner: 10Muehlenhoff) [10:48:15] (03PS5) 10Filippo Giunchedi: prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) [10:48:17] (03PS4) 10Filippo Giunchedi: hieradata: add redis stretch deployment-prep instances [puppet] - 10https://gerrit.wikimedia.org/r/386869 (https://phabricator.wikimedia.org/T148637) [10:48:19] (03PS4) 10Filippo Giunchedi: redis: add stretch support [puppet] - 10https://gerrit.wikimedia.org/r/386870 (https://phabricator.wikimedia.org/T148637) [10:48:22] RECOVERY - Restbase root url on restbase1007 is OK: HTTP OK: HTTP/1.1 200 - 15763 bytes in 0.008 second response time [10:48:50] (03CR) 10jerkins-bot: [V: 04-1] prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: 10Filippo Giunchedi) [10:50:22] !log mobrovac@tin Started deploy [restbase/deploy@2b5889b]: Double-process all summaries and include the parsoid no-op switch, take #gazillion [10:50:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:52:29] !log poweroff copper T176957 [10:52:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:52:36] T176957: Deprecate host copper.eqiad.wmnet - https://phabricator.wikimedia.org/T176957 [10:53:29] (03PS2) 10Giuseppe Lavagetto: docker::builder: add class to install tools to build images [puppet] - 10https://gerrit.wikimedia.org/r/387204 [10:54:00] (03CR) 10jerkins-bot: [V: 04-1] docker::builder: add class to install tools to build images [puppet] - 10https://gerrit.wikimedia.org/r/387204 (owner: 10Giuseppe Lavagetto) [10:56:13] (03PS3) 10Giuseppe Lavagetto: docker::builder: add class to install tools to build images [puppet] - 10https://gerrit.wikimedia.org/r/387204 [10:57:17] (03CR) 10Filippo Giunchedi: "> Main test build 
failed." [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: 10Filippo Giunchedi) [10:58:55] !log mobrovac@tin Finished deploy [restbase/deploy@2b5889b]: Double-process all summaries and include the parsoid no-op switch, take #gazillion (duration: 08m 33s) [10:59:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:46] (03CR) 10Alexandros Kosiaris: [C: 04-1] Puppet: Change hostcert and hostprivkey paths on puppetmasters (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/386666 (https://phabricator.wikimedia.org/T179099) (owner: 10Herron) [11:00:05] hoo: #bothumor My software never has bugs. It just develops random features. Rise for Undo attempted T179156 mitigations. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T1100). [11:00:05] No GERRIT patches in the queue for this window AFAICS. [11:00:05] T179156: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156 [11:02:07] (03CR) 10Hashar: [C: 04-1] "We require npm from Debian to be able to install a newer version. I dont think nodejs-legacy provides npm on Jessie though." 
[puppet] - 10https://gerrit.wikimedia.org/r/386889 (owner: 10Paladox) [11:02:18] I'll start in a bit [11:02:25] (03CR) 10Giuseppe Lavagetto: [C: 032] docker::builder: add class to install tools to build images [puppet] - 10https://gerrit.wikimedia.org/r/387204 (owner: 10Giuseppe Lavagetto) [11:02:28] (03PS1) 10Marostegui: wiki-replicas.sql: New file just to track GRANTS [puppet] - 10https://gerrit.wikimedia.org/r/387214 (https://phabricator.wikimedia.org/T178128) [11:02:47] (03PS2) 10Marostegui: wiki-replicas.sql: New file just to track GRANTS [puppet] - 10https://gerrit.wikimedia.org/r/387214 (https://phabricator.wikimedia.org/T178128) [11:03:18] 10Operations, 10monitoring: Disk space checks complaining on docker build hosts when building containers - https://phabricator.wikimedia.org/T179271#3718891 (10akosiaris) [11:03:26] 10Operations, 10monitoring: Disk space checks complaining on docker build hosts when building containers - https://phabricator.wikimedia.org/T179271#3718905 (10akosiaris) p:05Triage>03Normal [11:04:01] 10Operations, 10Ops-Access-Requests, 10DBA, 10Patch-For-Review, 10cloud-services-team (Kanban): Access to raw database tables on labsdb* for wmcs-admin users - https://phabricator.wikimedia.org/T178128#3718906 (10Marostegui) @madhuvishy please review this: https://gerrit.wikimedia.org/r/#/c/387214/ and i... [11:04:22] 10Operations: Copper root (/) 95% full - https://phabricator.wikimedia.org/T172409#3497614 (10akosiaris) 05Open>03Resolved I 'll re-resolve this. copper has been deprecated in T176957. 
I 've moved @Dzahn's comment about `Permission denied` errors to T179271 [11:06:51] (03CR) 10Marostegui: "And as expected, it is noop: https://puppet-compiler.wmflabs.org/compiler02/8536/" [puppet] - 10https://gerrit.wikimedia.org/r/387214 (https://phabricator.wikimedia.org/T178128) (owner: 10Marostegui) [11:09:45] (03PS10) 10Hashar: interface: IPAddr.new() requires an address family [puppet] - 10https://gerrit.wikimedia.org/r/336840 [11:10:34] (03CR) 10jerkins-bot: [V: 04-1] interface: IPAddr.new() requires an address family [puppet] - 10https://gerrit.wikimedia.org/r/336840 (owner: 10Hashar) [11:12:00] (03CR) 10Hoo man: [C: 032] Revert "Wikidatawiki to wmf.4" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387188 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:12:23] (03PS11) 10Hashar: interface: IPAddr.new() requires an address family [puppet] - 10https://gerrit.wikimedia.org/r/336840 [11:13:15] (03Merged) 10jenkins-bot: Revert "Wikidatawiki to wmf.4" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387188 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:13:29] (03CR) 10jenkins-bot: Revert "Wikidatawiki to wmf.4" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387188 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:15:55] !log hoo@tin rebuilt wikiversions.php and synchronized wikiversions files: Wikidatawiki back to wmf.5 (T179156) [11:16:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:16:01] T179156: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156 [11:16:20] (03CR) 10Hashar: "Made it an independent patch. Fixed rubocop." 
[puppet] - 10https://gerrit.wikimedia.org/r/336840 (owner: 10Hashar) [11:21:19] (03CR) 10Hoo man: [C: 032] Revert "Disable constraints check with SPARQL for now" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:24:13] !log Compress table cebwiki.externallinks on db2092 - T178359 [11:24:15] (03Abandoned) 10Paladox: Gerrit: Set auth.userNameToLowerCase [puppet] - 10https://gerrit.wikimedia.org/r/368196 (owner: 10Paladox) [11:24:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:24:19] T178359: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359 [11:24:25] (03CR) 10Paladox: "Abandoning for now." [puppet] - 10https://gerrit.wikimedia.org/r/368196 (owner: 10Paladox) [11:27:52] PROBLEM - puppet last run on cp1063 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [11:28:49] (03PS2) 10Hoo man: Revert "Disable constraints check with SPARQL for now" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) [11:28:51] (03PS2) 10Hoo man: Revert "Revert "Add property for RDF mapping of external identifiers for Wikidata"" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387190 (https://phabricator.wikimedia.org/T179156) [11:29:02] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3718957 (10Lucas_Werkmeister_WMDE) > The only live polling feature I can think of that was recently introduced is for the live... 
[11:29:48] (03CR) 10Hoo man: [C: 032] Revert "Disable constraints check with SPARQL for now" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:31:04] (03Merged) 10jenkins-bot: Revert "Disable constraints check with SPARQL for now" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:31:16] (03CR) 10jenkins-bot: Revert "Disable constraints check with SPARQL for now" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387189 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:33:14] !log hoo@tin Synchronized wmf-config/Wikibase-production.php: Re-enable constraints check with SPARQL (T179156) (duration: 00m 50s) [11:33:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:33:20] T179156: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156 [11:35:45] (03CR) 10Hoo man: [C: 032] Revert "Revert "Add property for RDF mapping of external identifiers for Wikidata"" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387190 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:41:01] (03Merged) 10jenkins-bot: Revert "Revert "Add property for RDF mapping of external identifiers for Wikidata"" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387190 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:41:14] (03CR) 10jenkins-bot: Revert "Revert "Add property for RDF mapping of external identifiers for Wikidata"" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387190 (https://phabricator.wikimedia.org/T179156) (owner: 10Hoo man) [11:41:43] (03PS1) 10Dzahn: introduce webperf1001 [dns] - 10https://gerrit.wikimedia.org/r/387215 (https://phabricator.wikimedia.org/T179036) [11:41:51] (03CR) 10jerkins-bot: [V: 04-1] introduce webperf1001 [dns] - 10https://gerrit.wikimedia.org/r/387215 
(https://phabricator.wikimedia.org/T179036) (owner: 10Dzahn) [11:43:26] (03PS4) 10Ema: varnish: monitor child process restarts [puppet] - 10https://gerrit.wikimedia.org/r/386867 [11:43:31] (03CR) 10Ema: [V: 032 C: 032] varnish: monitor child process restarts [puppet] - 10https://gerrit.wikimedia.org/r/386867 (owner: 10Ema) [11:43:52] (03PS2) 10Dzahn: introduce webperf1001 [dns] - 10https://gerrit.wikimedia.org/r/387215 (https://phabricator.wikimedia.org/T179036) [11:44:38] !log hoo@tin Synchronized wmf-config/Wikibase.php: Re-add property for RDF mapping of external identifiers for Wikidata (T179156, T178180) (duration: 00m 49s) [11:44:42] (03PS6) 10Filippo Giunchedi: prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) [11:44:44] (03PS5) 10Filippo Giunchedi: hieradata: add redis stretch deployment-prep instances [puppet] - 10https://gerrit.wikimedia.org/r/386869 (https://phabricator.wikimedia.org/T148637) [11:44:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:44:45] T179156: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156 [11:44:45] T178180: Enable RDF mapping for external identifiers for Wikidata.org - https://phabricator.wikimedia.org/T178180 [11:44:46] (03PS5) 10Filippo Giunchedi: redis: add stretch support [puppet] - 10https://gerrit.wikimedia.org/r/386870 (https://phabricator.wikimedia.org/T148637) [11:45:16] (03CR) 10jerkins-bot: [V: 04-1] prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: 10Filippo Giunchedi) [11:48:26] 10Operations, 10vm-requests, 10Patch-For-Review, 10Performance-Team (Radar): Request VM for webperf (metrics processing) - https://phabricator.wikimedia.org/T179036#3719055 (10Dzahn) suggesting we introduce 
**webperf1001.eqiad.wmnet**/**webperf2001.codfw.wmnet** for this rather than using the misc names.... [11:48:40] !log Ran rebuildPropertyInfo for wikidatawiki and manually cleared the property cache (mc) [11:48:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:48:48] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3719057 (10BBlack) >>! In T179156#3718772, @ema wrote: > There's a timeout limiting the total amount of time varnish is allowed... [11:50:32] hoo: I'm going to undo the 2x earlier cache changes as well, the timeout reduction and max_connections limit [11:50:43] Ok [11:50:47] I'm done with my deploys [11:50:59] I'll leave it to the individual teams to re-enable ORES and remex html [11:51:06] (03PS1) 10BBlack: Revert "varnish: lower timeouts to api+appservers backends" [puppet] - 10https://gerrit.wikimedia.org/r/387219 [11:51:16] (03CR) 10BBlack: [C: 032] Revert "cache_text: raise MW connection limits to 10K" [puppet] - 10https://gerrit.wikimedia.org/r/386824 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [11:51:37] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3719064 (10hoo) [11:51:41] (03PS2) 10BBlack: Revert "varnish: lower timeouts to api+appservers backends" [puppet] - 10https://gerrit.wikimedia.org/r/387219 [11:51:54] (03CR) 10BBlack: [V: 032 C: 032] Revert "varnish: lower timeouts to api+appservers backends" [puppet] - 10https://gerrit.wikimedia.org/r/387219 (owner: 10BBlack) [11:52:01] (03PS2) 10BBlack: Revert "cache_text: raise MW connection limits to 10K" [puppet] - 10https://gerrit.wikimedia.org/r/386824 (https://phabricator.wikimedia.org/T179156) [11:52:03] (03CR) 10BBlack: [V: 032 C: 032] Revert "cache_text: raise MW connection limits 
to 10K" [puppet] - 10https://gerrit.wikimedia.org/r/386824 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [11:52:46] (03PS7) 10Filippo Giunchedi: prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) [11:52:47] (03PS6) 10Filippo Giunchedi: hieradata: add redis stretch deployment-prep instances [puppet] - 10https://gerrit.wikimedia.org/r/386869 (https://phabricator.wikimedia.org/T148637) [11:52:49] (03PS6) 10Filippo Giunchedi: redis: add stretch support [puppet] - 10https://gerrit.wikimedia.org/r/386870 (https://phabricator.wikimedia.org/T148637) [11:53:18] (03CR) 10jerkins-bot: [V: 04-1] prometheus: add redis_exporter class and profile [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: 10Filippo Giunchedi) [11:55:05] !log oblivian@tin Started deploy [docker-pkg/deploy@549c157]: Test deploy for virtualenv creation [11:55:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:57:38] (03CR) 10Filippo Giunchedi: "PCC https://puppet-compiler.wmflabs.org/compiler02/8539/maps1001.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/325466 (https://phabricator.wikimedia.org/T148637) (owner: 10Filippo Giunchedi) [11:57:47] !log oblivian@tin Started deploy [docker-pkg/deploy@549c157]: (no justification provided) [11:57:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:57:52] RECOVERY - puppet last run on cp1063 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:57:58] !log oblivian@tin Finished deploy [docker-pkg/deploy@549c157]: (no justification provided) (duration: 00m 11s) [11:58:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:58:34] !log depool tools-exec-1401 to test patch in T179024 --> aborrero@tools-bastion-03:~$ sudo exec-manage depool tools-exec-1401.tools.eqiad.wmflabs 
[11:58:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:58:41] T179024: nfsiostat collector appears to be broken - https://phabricator.wikimedia.org/T179024 [11:59:47] !log oblivian@tin Started deploy [docker-pkg/deploy@549c157]: (no justification provided) [11:59:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:00:06] !log oblivian@tin Finished deploy [docker-pkg/deploy@549c157]: (no justification provided) (duration: 00m 19s) [12:00:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:01:58] !log oblivian@tin Started deploy [docker-pkg/deploy@549c157]: (no justification provided) [12:02:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:02:09] !log oblivian@tin Finished deploy [docker-pkg/deploy@549c157]: (no justification provided) (duration: 00m 11s) [12:02:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:02:53] !log Manually started dumpwikidatajson on snapshot1007 (cron run failed tonight) [12:02:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:04:14] apergos: ^ [12:08:26] (03PS2) 10Rush: wmcs: add wmcs-roots to role::wmcs::openstack::wikitech [puppet] - 10https://gerrit.wikimedia.org/r/387153 (owner: 10BryanDavis) [12:10:01] (03CR) 10Rush: [C: 032] wmcs: add wmcs-roots to role::wmcs::openstack::wikitech [puppet] - 10https://gerrit.wikimedia.org/r/387153 (owner: 10BryanDavis) [12:11:22] (03PS1) 10Ema: cache_upload: ensure consistent CL [puppet] - 10https://gerrit.wikimedia.org/r/387221 [12:11:38] (03CR) 10WMDE-Fisch: [C: 031] "Goat for it!" 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/387198 (https://phabricator.wikimedia.org/T177891) (owner: 10Addshore) [12:11:51] /q govg [12:11:55] heh [12:12:35] (03PS11) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [12:16:56] hoo, thanks for the heads up [12:17:38] PROBLEM - puppet last run on elastic1050 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [12:23:46] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3719145 (10Lucas_Werkmeister_WMDE) >>! In T179156#3719057, @BBlack wrote: >could other services on text-lb be making these kind... [12:24:56] (03PS1) 10Giuseppe Lavagetto: docker::builder: also install python3-wheel [puppet] - 10https://gerrit.wikimedia.org/r/387222 [12:31:37] (03CR) 10BBlack: [C: 031] cache_upload: ensure consistent CL [puppet] - 10https://gerrit.wikimedia.org/r/387221 (owner: 10Ema) [12:37:44] (03PS1) 10Giuseppe Lavagetto: Various fixes [docker-images/docker-pkg/deploy] - 10https://gerrit.wikimedia.org/r/387224 [12:41:02] (03PS2) 10Giuseppe Lavagetto: Various fixes [docker-images/docker-pkg/deploy] - 10https://gerrit.wikimedia.org/r/387224 [12:41:13] (03PS1) 10BBlack: cache_text: reduce applayer timeouts to reasonable values [puppet] - 10https://gerrit.wikimedia.org/r/387225 (https://phabricator.wikimedia.org/T179156) [12:42:56] (03PS1) 10Elukey: role::analytics_cluster::refinery: fix logrotate [puppet] - 10https://gerrit.wikimedia.org/r/387226 [12:43:34] (03CR) 10Giuseppe Lavagetto: [C: 031] "Seems reasonable, it could be interesting to investigate timeouts at a later time." 
[puppet] - 10https://gerrit.wikimedia.org/r/387225 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [12:44:37] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] Various fixes [docker-images/docker-pkg/deploy] - 10https://gerrit.wikimedia.org/r/387224 (owner: 10Giuseppe Lavagetto) [12:44:40] (03CR) 10Elukey: [C: 032] role::analytics_cluster::refinery: fix logrotate [puppet] - 10https://gerrit.wikimedia.org/r/387226 (owner: 10Elukey) [12:45:02] (03CR) 10Giuseppe Lavagetto: [C: 032] docker::builder: also install python3-wheel [puppet] - 10https://gerrit.wikimedia.org/r/387222 (owner: 10Giuseppe Lavagetto) [12:45:26] (03PS2) 10Giuseppe Lavagetto: docker::builder: also install python3-wheel [puppet] - 10https://gerrit.wikimedia.org/r/387222 [12:46:46] (03PS1) 10Ema: varnish: fix 'child restarted' icinga check [puppet] - 10https://gerrit.wikimedia.org/r/387227 [12:47:28] !log repool tools-exec-1401 [12:47:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:47:43] RECOVERY - puppet last run on elastic1050 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [12:49:31] (03PS1) 10BBlack: cache_text: reduce inter-cache backend timeouts as well [puppet] - 10https://gerrit.wikimedia.org/r/387228 (https://phabricator.wikimedia.org/T179156) [12:49:34] !log oblivian@tin Started deploy [docker-pkg/deploy@e9b9146]: Proper deploy of first version [12:49:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:49:41] !log oblivian@tin Finished deploy [docker-pkg/deploy@e9b9146]: Proper deploy of first version (duration: 00m 07s) [12:49:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:50:22] (03CR) 10Ema: [C: 031] cache_text: reduce applayer timeouts to reasonable values [puppet] - 10https://gerrit.wikimedia.org/r/387225 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [12:50:46] jouncebot: next [12:50:46] In 0 hour(s) and 9 minute(s): European 
Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T1300) [12:51:00] oh, SWAT is an hour earlier for me this week :) [12:52:25] (03CR) 10Zoranzoki21: "Thank you @Framawiki" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387106 (https://phabricator.wikimedia.org/T179251) (owner: 10Zoranzoki21) [12:53:00] (03CR) 10Zoranzoki21: [C: 031] Revert "UBN! disbale ores for wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 (owner: 10Ladsgroup) [12:53:31] 10Operations, 10CirrusSearch, 10Discovery, 10MediaWiki-JobQueue, and 6 others: Job queue is increasing non-stop - https://phabricator.wikimedia.org/T173710#3719254 (10Ladsgroup) >>! In T173710#3718725, @Jack_who_built_the_house wrote: > Thanks for the reply. It just surprises me that [[https://en.wikipedia... [12:54:27] (03PS1) 10Reedy: Add port numbers to beta db servers [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387229 (https://phabricator.wikimedia.org/T178553) [12:55:01] (03PS2) 10Reedy: Add port numbers to beta db servers [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387229 (https://phabricator.wikimedia.org/T178553) [12:55:42] (03CR) 10Reedy: [C: 032] Add port numbers to beta db servers [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387229 (https://phabricator.wikimedia.org/T178553) (owner: 10Reedy) [12:56:55] (03Merged) 10jenkins-bot: Add port numbers to beta db servers [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387229 (https://phabricator.wikimedia.org/T178553) (owner: 10Reedy) [12:57:04] (03CR) 10jenkins-bot: Add port numbers to beta db servers [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387229 (https://phabricator.wikimedia.org/T178553) (owner: 10Reedy) [12:58:15] !log reedy@tin Synchronized wmf-config/db-labs.php: beta labs stuffs (duration: 00m 50s) [12:58:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:59:49] Reedy: swat starts in a minute, are you 
deploying? [12:59:59] zeljkof: just a beta patch as a test [13:00:02] All done now :) [13:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: It is that lovely time of the day again! You are hereby commanded to deploy European Mid-day SWAT(Max 8 patches). (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T1300). [13:00:04] Zoranzoki21, addshore, and Amir1: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [13:00:11] Reedy: ok, great :) [13:00:16] I can SWAT today! [13:00:16] \o/ [13:00:19] Hi [13:00:22] o/ [13:00:23] Hello! [13:00:50] anybody else wants to swat? addshore? want to swat your stuff? or should I deploy everything? [13:01:01] I can do mine if you would like :) [13:01:18] leave mine until last! [13:01:21] Zeljko you do my.. :) [13:01:21] addshore: please do :) you can start, I'll review the rest [13:01:37] addshore: ok, I can ping you when I am done, if you prefer being last [13:01:42] ack! [13:01:58] Zoranzoki21: do you have two commits? looks like you have forgot the link to the second one [13:02:20] wait to see [13:02:20] Amir1: want to deploy your commits? or should I? 
[13:02:40] Zoranzoki21: reviewing 387106 [13:02:46] zeljkof: whichever that's easier for you :) [13:02:49] Zeljko: Only one patch I have [13:03:13] Enable the Autopatrolled User Rights on hiwikiversity (T179251) [13:03:13] Enable Extension:SandboxLink on hiwikiversity (T179252) are in one patch all because is created in same time for same wikis [13:03:14] T179251: Enable the Autopatrolled User Rights on hiwikiversity - https://phabricator.wikimedia.org/T179251 [13:03:14] T179252: Enable Extension:SandboxLink on hiwikiversity - https://phabricator.wikimedia.org/T179252 [13:03:14] Amir1: in that case, you deploy ;) I'll let you know when I am done [13:03:41] cool [13:03:46] Zoranzoki21: ok, the commit message sounded like it was two commits [13:04:06] OK.. Because is two tasks in one patch [13:05:01] zeljkof: Amir1 and I just discussed and I will go before him :) [13:05:24] addshore: nice, I'll ping you when I am done, the two of you can take over [13:05:34] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387106 (https://phabricator.wikimedia.org/T179251) (owner: 10Zoranzoki21) [13:06:46] (03Merged) 10jenkins-bot: Enable the Autopatrolled User Rights & Ext:SandboxLink on hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387106 (https://phabricator.wikimedia.org/T179251) (owner: 10Zoranzoki21) [13:06:52] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] Switch to use docker-pkg for builds [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/385143 (owner: 10Giuseppe Lavagetto) [13:06:55] (03CR) 10jenkins-bot: Enable the Autopatrolled User Rights & Ext:SandboxLink on hiwikiversity [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387106 (https://phabricator.wikimedia.org/T179251) (owner: 10Zoranzoki21) [13:07:49] (03PS12) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [13:08:10] 
Zoranzoki21: 387106 is at mwdebug1002, please test and let me know if I can deploy [13:09:58] Zoranzoki21: 387106 is at mwdebug1002, please test and let me know if I can deploy [13:12:20] Zoranzoki21, Zoran_Dori: around? [13:14:47] dun dun dun [13:15:15] addshore, Amir1: looks like Zoranzoki21 is gone :( [13:15:25] :| [13:15:28] not sure what to do with 387106 [13:15:43] should I revert? deploy? I am not familiar with what it does [13:15:54] *looks* [13:16:04] I can be done in a minute (revert/deploy) if you have any opinions [13:17:14] deploy and leave a comment in gerrit+phab that I could not test? [13:18:01] I think the patch looks okay [13:18:23] (03PS1) 10Ladsgroup: labs: Disable unneeded models in non-enwiki wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387231 (https://phabricator.wikimedia.org/T178792) [13:18:32] addshore: ok, so deploy? [13:18:36] yup [13:18:37] yeah it's super trivial [13:18:41] I will also watch everything with you :) [13:18:46] and easily testable [13:19:44] (03CR) 10Madhuvishy: [C: 031] "Thanks for making this!" 
[puppet] - 10https://gerrit.wikimedia.org/r/387214 (https://phabricator.wikimedia.org/T178128) (owner: 10Marostegui) [13:19:57] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:387106|Enable the Autopatrolled User Rights & Ext:SandboxLink on hiwikiversity (T179251 T179252)]] (duration: 00m 50s) [13:20:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:20:05] T179251: Enable the Autopatrolled User Rights on hiwikiversity - https://phabricator.wikimedia.org/T179251 [13:20:05] T179252: Enable Extension:SandboxLink on hiwikiversity - https://phabricator.wikimedia.org/T179252 [13:20:05] (03PS2) 10Herron: puppet: change elasticsearch_5 template to parse under puppet 4 [puppet] - 10https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) [13:20:07] Zoranzoki21: 387106 is deployed, please test [13:20:25] addshore, Amir1: SWAT is yours :) [13:20:32] Awesome! [13:20:42] addshore, Amir1: also, feel free to test 387106, if you know how [13:20:44] (03PS2) 10Addshore: Set wgWikiDiff2MovedParagraphDetectionCutoff for group0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387198 (https://phabricator.wikimedia.org/T177891) [13:20:47] (03CR) 10Addshore: [C: 032] Set wgWikiDiff2MovedParagraphDetectionCutoff for group0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387198 (https://phabricator.wikimedia.org/T177891) (owner: 10Addshore) [13:21:00] !log Set innodb_max_dirty_pages_pct = 10 on labsdb1001 so it powers off a bit faster - T168584 [13:21:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:21:06] T168584: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584 [13:21:46] (03PS3) 10Marostegui: wiki-replicas.sql: New file just to track GRANTS [puppet] - 10https://gerrit.wikimedia.org/r/387214 (https://phabricator.wikimedia.org/T178128) [13:22:33] (03Merged) 10jenkins-bot: Set wgWikiDiff2MovedParagraphDetectionCutoff for 
group0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387198 (https://phabricator.wikimedia.org/T177891) (owner: 10Addshore) [13:22:42] (03CR) 10jenkins-bot: Set wgWikiDiff2MovedParagraphDetectionCutoff for group0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387198 (https://phabricator.wikimedia.org/T177891) (owner: 10Addshore) [13:23:47] zeljkof: the sandbox link is there: https://hi.wikiversity.org/wiki/%E0%A4%B5%E0%A4%BF%E0%A4%95%E0%A4%BF%E0%A4%B5%E0%A4%BF%E0%A4%A6%E0%A5%8D%E0%A4%AF%E0%A4%BE%E0%A4%B2%E0%A4%AF:%E0%A4%AE%E0%A5%81%E0%A4%96%E0%A4%AA%E0%A5%83%E0%A4%B7%E0%A5%8D%E0%A4%A0?uselang=en [13:24:04] (03CR) 10Marostegui: [C: 032] wiki-replicas.sql: New file just to track GRANTS [puppet] - 10https://gerrit.wikimedia.org/r/387214 (https://phabricator.wikimedia.org/T178128) (owner: 10Marostegui) [13:25:16] same for auto patrol: https://hi.wikiversity.org/wiki/%E0%A4%B5%E0%A4%BF%E0%A4%B6%E0%A5%87%E0%A4%B7:%E0%A4%B8%E0%A4%A6%E0%A4%B8%E0%A5%8D%E0%A4%AF_%E0%A4%B8%E0%A4%AE%E0%A5%82%E0%A4%B9_%E0%A4%85%E0%A4%A7%E0%A4%BF%E0%A4%95%E0%A4%BE%E0%A4%B0?uselang=en [13:27:24] (03CR) 10Herron: "Good point! Indeed the first patch set didn't fix the issue. After taking another look this seems simpler than I had thought. Patch set 2 " [puppet] - 10https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) (owner: 10Herron) [13:27:56] 10Operations, 10Ops-Access-Requests, 10DBA, 10Patch-For-Review, 10cloud-services-team (Kanban): Access to raw database tables on labsdb* for wmcs-admin users - https://phabricator.wikimedia.org/T178128#3719383 (10madhuvishy) 05Open>03Resolved [13:27:59] Amir1: o/ - do you have a min for a couple of questions? [13:28:05] elukey: sure [13:28:16] hmmm [13:29:27] Amir1: thanks [13:29:29] Amir1: thanks! So this morning we prioritized (and added more workers) for refreshlinks and htmlcacheupdate jobs.. 
it looked good but at some point the enqueue rate spiked and we got a ton of jobs into the queue, getting to 11+ again [13:29:43] oh boy [13:29:52] now it went back to completed > enqueued [13:29:56] addshore@mwdebug1002:/srv/mediawiki/wmf-config$ mwscript eval.php --wiki=testwiki [13:29:56] PHP Fatal error: Class 'Memcached' not found in /srv/mediawiki/php-1.31.0-wmf.5/includes/libs/objectcache/MemcachedPeclBagOStuff.php on line 61 [13:30:05] >.> [13:30:36] Amir1: ^^ any ideas? [13:31:00] ah sorry didn't see that you were doing swait, will wait! [13:31:13] addshore: That's weird, Do you want to test it in tin instead? [13:32:00] Amir1: does eval.php not work on app servers only on deploy servers? [13:32:07] (03CR) 10Filippo Giunchedi: [C: 031] varnish: fix 'child restarted' icinga check [puppet] - 10https://gerrit.wikimedia.org/r/387227 (owner: 10Ema) [13:32:24] elukey: nah, I'm waiting. The thing is I don't know what can be the cause of these job spike, it can be that someone edited one of the highly used templates and we only can wait that jobqueue digest it, if that's the case but I'm not sure [13:32:31] elukey: I need to check [13:32:34] (03PS2) 10Madhuvishy: labsdb: Switchover dns for labsdb1001 shards to labsdb1003 [puppet] - 10https://gerrit.wikimedia.org/r/386660 (https://phabricator.wikimedia.org/T168584) [13:32:42] addshore: I never tried it in appservers [13:33:07] okay, well, I am not getting the desired effect from this patch so I might revert and try again later or on another day [13:33:21] (03CR) 10Madhuvishy: [C: 032] labsdb: Switchover dns for labsdb1001 shards to labsdb1003 [puppet] - 10https://gerrit.wikimedia.org/r/386660 (https://phabricator.wikimedia.org/T168584) (owner: 10Madhuvishy) [13:34:56] Amir1: yes yes, just wondering if there was an outstanding issues say with how refreshlinks are enqueued etc.. 
[13:35:52] I schedule https://gerrit.wikimedia.org/r/#/c/386777/ [13:36:00] Please merge [13:36:28] elukey: AFAIK nothing atm, we changed some stuff but they don't match time-wise [13:36:39] hoo cc [13:37:00] what's up again [13:37:00] :/ [13:37:29] moritzm: around? [13:37:49] * hoo reads irc logs on labs [13:38:07] addshore: tell me when I can take over [13:38:13] Amir1: will do [13:38:20] elukey: Amir1: I see [13:38:20] Thanks [13:38:37] Can anyone able to merge the patch 386777 [13:38:52] elukey: Any chance this is mostly/ a lot on ruwikisource? [13:39:16] Jayprakash12345: I can [13:39:27] just waiting for some stuff to be handled [13:39:40] hoo: this is the last snapshot that I took some days ago - https://phabricator.wikimedia.org/T173710#3712681 [13:39:44] I can recheck [13:40:15] elukey: I see [13:40:22] I just checked and ruwikisource seems fine [13:40:32] (even though there's a lot happening on WD regarding it right now) [13:40:48] !log Switched dns over for c1 shards to c3 in preparation of labsdb1001 reboot (Merged https://gerrit.wikimedia.org/r/#/c/386660/, Ran puppet on labservice1001|2) [13:40:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:40:58] So do, I started mwdebug1002 [13:42:07] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: puppetmaster hostcert and hostprivkey point to nonexistent files - https://phabricator.wikimedia.org/T179099#3719441 (10herron) @akosiaris following up to your comment in https://gerrit.wikimedia.org/r/#/c/386666/ > Unless I am mistaken (documentatio... 
[13:42:11] elukey: Our numbers don't seem out of bound currently (a little for ruwikisource, but that's not the issue here) [13:42:23] Jayprakash12345: I'm waiting for old deployments to finish, I can't merge your until they are done [13:42:23] (03PS1) 10Addshore: Revert "Set wgWikiDiff2MovedParagraphDetectionCutoff for group0" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387232 [13:42:24] It's still not very unlikely that this is us [13:42:27] (03PS2) 10Jayprakash12345: Change the af.wiktionary logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386777 (https://phabricator.wikimedia.org/T178824) [13:42:33] Amir1: reverting [13:42:44] not sure why that wasnt working will continue looking into it this afternoon [13:43:07] (03CR) 10Addshore: [C: 032] Revert "Set wgWikiDiff2MovedParagraphDetectionCutoff for group0" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387232 (owner: 10Addshore) [13:43:22] hoo: sorry to ask, us == wikidata? [13:43:24] (03CR) 10Herron: "Please see follow-up in T179099" [puppet] - 10https://gerrit.wikimedia.org/r/386666 (https://phabricator.wikimedia.org/T179099) (owner: 10Herron) [13:43:30] elukey: Sorry, yes [13:43:33] ahh okok :) [13:44:19] (03Merged) 10jenkins-bot: Revert "Set wgWikiDiff2MovedParagraphDetectionCutoff for group0" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387232 (owner: 10Addshore) [13:45:09] https://grafana.wikimedia.org/dashboard/db/wikidataclient-change-handling-wikipageupdater?panelId=2&fullscreen&orgId=1 [13:45:17] That doesn't look worse than usually [13:45:28] Amir1: it is all yours! 
[13:45:37] Thanks [13:45:57] hoo: the issue seems to be commons, ~6M jobs only for it (refreshlinks + htmlcacheupdate) [13:46:08] (03CR) 10Ladsgroup: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386777 (https://phabricator.wikimedia.org/T178824) (owner: 10Jayprakash12345) [13:46:09] (03CR) 10Herron: [C: 032] puppet: change ganglia aggregator site_instances call to full name [puppet] - 10https://gerrit.wikimedia.org/r/387139 (https://phabricator.wikimedia.org/T179165) (owner: 10Herron) [13:46:17] (03PS2) 10Herron: puppet: change ganglia aggregator site_instances call to full name [puppet] - 10https://gerrit.wikimedia.org/r/387139 (https://phabricator.wikimedia.org/T179165) [13:46:25] (03CR) 10jenkins-bot: Revert "Set wgWikiDiff2MovedParagraphDetectionCutoff for group0" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387232 (owner: 10Addshore) [13:47:52] elukey: Yikes… this is probably not Wikidata… but they might be editing highly-used templates again [13:48:38] (03CR) 10Herron: [C: 032] puppet: change mediawiki refreshlinks cronjob call to use full name [puppet] - 10https://gerrit.wikimedia.org/r/387142 (https://phabricator.wikimedia.org/T179177) (owner: 10Herron) [13:49:10] (03PS2) 10Herron: puppet: change mediawiki refreshlinks cronjob call to use full name [puppet] - 10https://gerrit.wikimedia.org/r/387142 (https://phabricator.wikimedia.org/T179177) [13:49:10] hm… we don't see to have per-wiki stats about this [13:49:12] (03PS3) 10Ladsgroup: Change the af.wiktionary logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386777 (https://phabricator.wikimedia.org/T178824) (owner: 10Jayprakash12345) [13:49:15] Amir1: ^ We should change that [13:49:45] hoo: yeah, definitely [13:49:58] I wanted to do it for a very long time [13:50:30] 10Operations, 10MediaWiki-General-or-Unknown, 10TechCom-RfC: Bump PHP requirement to 5.6 in 1.31 - https://phabricator.wikimedia.org/T178538#3695340 (10Anomie) See T172165#3699606 for some relevant 
criticism of this decision. [13:50:33] Amir1: Do you want to or shall I take a shot? [13:50:54] you do, today I'm packed [13:51:11] 10Operations, 10wikidiff2, 10Patch-For-Review, 10User-Addshore, 10WMDE-QWERTY-Team-Board: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3719471 (10Addshore) So, I just tried deploying the above change. While testing on mwdebug1002 the config switch didn't a... [13:51:13] I will [13:51:16] (03CR) 10Ladsgroup: [C: 032] Change the af.wiktionary logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386777 (https://phabricator.wikimedia.org/T178824) (owner: 10Jayprakash12345) [13:51:27] Thanks :) [13:51:52] The Template:FlickreviewR (And sub-templates) have been edited (a lot) [13:51:57] that one has 200k+ usages [13:52:12] (03CR) 10Herron: [C: 032] puppet: change dbstore_multiinstance mariadb groups call to full name [puppet] - 10https://gerrit.wikimedia.org/r/387151 (https://phabricator.wikimedia.org/T179161) (owner: 10Herron) [13:52:17] (03PS2) 10Herron: puppet: change dbstore_multiinstance mariadb groups call to full name [puppet] - 10https://gerrit.wikimedia.org/r/387151 (https://phabricator.wikimedia.org/T179161) [13:52:27] but that's not enough here I guess (assuming job deduplication works) [13:52:31] (03Merged) 10jenkins-bot: Change the af.wiktionary logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386777 (https://phabricator.wikimedia.org/T178824) (owner: 10Jayprakash12345) [13:52:45] (03CR) 10jenkins-bot: Change the af.wiktionary logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386777 (https://phabricator.wikimedia.org/T178824) (owner: 10Jayprakash12345) [13:53:14] Jayprakash12345: Test please [13:53:19] in mwdebug1002 [13:53:27] ok [13:54:27] (03PS2) 10Marostegui: db-codfw.php: Pool db2084 as multi-instance host [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386810 (https://phabricator.wikimedia.org/T178553) [13:56:41] I tested and it works just fine 
[13:56:44] going live [13:56:49] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: puppet4: Error while evaluating a Resource Statement, Unknown resource type: 'site_instances' at /etc/puppet/modules/ganglia/manifests/monitor/aggregator.pp:36:5 - https://phabricator.wikimedia.org/T179165#3719484 (10herron) New error after merging th... [13:56:49] Everything is fine [13:57:22] Please run stash bot [13:58:47] Amir1 : ? [13:58:56] Doing it [13:59:01] !log ladsgroup@tin Synchronized static/images/project-logos: Change the af.wiktionary logo, part I (T178824) (duration: 00m 50s) [13:59:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:59:06] T178824: Change the af.wiktionary logo - https://phabricator.wikimedia.org/T178824 [13:59:24] 10Operations, 10Puppet, 10User-Joe: puppet4 Error while evaluating a Resource Statement, Unknown resource type: 'updatequerypages::cronjob' at /etc/puppet/modules/mediawiki/manifests/maintenance/updatequerypages.pp:12:5 - https://phabricator.wikimedia.org/T179290#3719488 (10herron) [14:00:08] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: Change the af.wiktionary logo, part II (T178824) (duration: 00m 50s) [14:00:10] (03CR) 10Ladsgroup: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 (owner: 10Ladsgroup) [14:00:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:00:21] deployed [14:00:53] jouncebot: enxt [14:00:53] 10Operations, 10Puppet, 10User-Joe: puppet4: Error while evaluating a Resource Statement, Unknown resource type: 'instance' at /etc/puppet/modules/ganglia/manifests/monitor/aggregator/site_instances.pp:4:5 - https://phabricator.wikimedia.org/T179291#3719508 (10herron) [14:00:56] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3672194 (10herron) [14:00:56] jouncebot: next [14:00:56] In 2 
hour(s) and 59 minute(s): Wikidata Query Service weekly deploy (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T1700) [14:01:12] stupid swat slot [14:01:13] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3681039 (10herron) [14:01:16] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: puppet4: Error while evaluating a Resource Statement, Unknown resource type: 'site_instances' at /etc/puppet/modules/ganglia/manifests/monitor/aggregator.pp:36:5 - https://phabricator.wikimedia.org/T179165#3719526 (10herron) 05Open>03Resolved a:0... [14:01:24] I have missed it due to the DST oddiness [14:02:14] (03CR) 10Hashar: [C: 032] labs: Disable unneeded models in non-enwiki wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387231 (https://phabricator.wikimedia.org/T178792) (owner: 10Ladsgroup) [14:02:25] (03PS2) 10Ladsgroup: Revert "UBN! disbale ores for wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 [14:02:36] (03CR) 10Ladsgroup: [C: 032] Revert "UBN! disbale ores for wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 (owner: 10Ladsgroup) [14:02:47] Amir1 : Thanks [14:03:14] I'm SWATing ATM :D [14:03:23] (03Merged) 10jenkins-bot: labs: Disable unneeded models in non-enwiki wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387231 (https://phabricator.wikimedia.org/T178792) (owner: 10Ladsgroup) [14:03:36] (03CR) 10jenkins-bot: labs: Disable unneeded models in non-enwiki wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387231 (https://phabricator.wikimedia.org/T178792) (owner: 10Ladsgroup) [14:03:45] (03Merged) 10jenkins-bot: Revert "UBN! 
disbale ores for wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 (owner: 10Ladsgroup) [14:06:16] (03PS2) 10Ema: cache_upload: ensure consistent CL [puppet] - 10https://gerrit.wikimedia.org/r/387221 [14:06:24] (03CR) 10jenkins-bot: Revert "UBN! disbale ores for wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387210 (owner: 10Ladsgroup) [14:06:32] (03CR) 10Ema: [V: 032 C: 032] cache_upload: ensure consistent CL [puppet] - 10https://gerrit.wikimedia.org/r/387221 (owner: 10Ema) [14:07:43] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: Revert "UBN! disbale ores for wikidata" (T179107) (duration: 00m 49s) [14:07:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:07:49] T179107: ORES service erroring, in a way that throws exceptions in Extension:ORES - https://phabricator.wikimedia.org/T179107 [14:08:45] (03PS1) 10BBlack: [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) [14:09:02] Sorry, forgot to rebase [14:09:06] doing it again [14:09:07] (03CR) 10jerkins-bot: [V: 04-1] [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [14:09:07] hoo: I am super ignorant but is there a quick way to check recent changes fro commons happened on 20171027042217 ? [14:09:14] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: Revert "UBN! disbale ores for wikidata" (T179107) (duration: 00m 50s) [14:09:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:10:09] elukey: Sure $ sql --group recentchanges commonswiki -e "SELECT * FROM recentchanges WHERE rc_timestamp = '20171027042217'" [14:11:21] the SWAT is done [14:11:27] hoo: ah nice! thanks! 
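Note on the query above: an exact match on rc_timestamp only finds an edit saved in that very second, and a job's root timestamp does not necessarily equal the timestamp of the edit that triggered it. A minimal sketch of widening the lookup to a small window follows. The helper name `rc_window_query`, the 60-second slack, and the selected columns are illustrative assumptions, not something used in the channel; the output is just a SQL string that could be passed to the `sql ... -e` wrapper shown above.

```python
from datetime import datetime, timedelta

# MediaWiki stores timestamps as 14-digit UTC strings: YYYYMMDDHHMMSS.
MW_TS_FORMAT = "%Y%m%d%H%M%S"


def rc_window_query(root_ts: str, slack_seconds: int = 60) -> str:
    """Build a recentchanges query covering root_ts +/- slack_seconds.

    Hypothetical helper: the slack value and column list are assumptions
    chosen for illustration only.
    """
    center = datetime.strptime(root_ts, MW_TS_FORMAT)
    delta = timedelta(seconds=slack_seconds)
    start = (center - delta).strftime(MW_TS_FORMAT)
    end = (center + delta).strftime(MW_TS_FORMAT)
    return (
        "SELECT rc_timestamp, rc_namespace, rc_title FROM recentchanges "
        f"WHERE rc_timestamp BETWEEN '{start}' AND '{end}' "
        "ORDER BY rc_timestamp"
    )


if __name__ == "__main__":
    # Timestamp taken from the question in the log.
    print(rc_window_query("20171027042217"))
```

Because the 14-digit strings sort lexicographically in time order, a plain BETWEEN on the string column is enough here; no date conversion is needed on the database side.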
[14:11:29] I'm monitoring fatal for ores in wikidata [14:12:27] (03PS2) 10Ema: varnish: fix 'child restarted' icinga check [puppet] - 10https://gerrit.wikimedia.org/r/387227 [14:12:39] (03CR) 10Ema: [V: 032 C: 032] varnish: fix 'child restarted' icinga check [puppet] - 10https://gerrit.wikimedia.org/r/387227 (owner: 10Ema) [14:12:41] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3719574 (10herron) [14:12:43] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: puppet4: Could not find declared class mariadb::groups at /etc/puppet/modules/role/manifests/mariadb/dbstore_multiinstance.pp:16:5 - https://phabricator.wikimedia.org/T179161#3719572 (10herron) 05Open>03Resolved a:03herron [14:13:28] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3681042 (10herron) [14:15:35] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3719579 (10herron) [14:16:00] 10Operations, 10ops-eqiad: Degraded RAID on tungsten - https://phabricator.wikimedia.org/T179129#3719584 (10Cmjohnson) 05Open>03Resolved a:03Cmjohnson Replaced the disk but this server should be replaced and decommissioned sooner rather than later. [14:16:36] 10Operations, 10ops-eqiad, 10Packaging, 10Patch-For-Review: Decommission host copper.eqiad.wmnet - https://phabricator.wikimedia.org/T176957#3719588 (10Cmjohnson) [14:18:38] 10Operations, 10ops-eqiad, 10Analytics, 10User-Elukey: Possibly faulty BBU on analytics1029 - https://phabricator.wikimedia.org/T178742#3719606 (10Cmjohnson) a:03RobH this server is out of warranty by 6 months. Assigning to @robh to determine if we should order a new one? 
[14:19:24] 10Operations, 10ops-eqiad, 10Analytics, 10User-Elukey: Check analytics1037 power supply status - https://phabricator.wikimedia.org/T179192#3719610 (10Cmjohnson) a:03RobH this server is out of warranty by 6 months. Assigning to @robh to determine if we should order a new one...probably two. [14:20:45] (03PS3) 10Filippo Giunchedi: prometheus: add analytics instance [puppet] - 10https://gerrit.wikimedia.org/r/378716 (https://phabricator.wikimedia.org/T175922) [14:21:04] (03PS2) 10BBlack: [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) [14:21:10] 10Operations, 10ops-eqiad, 10DC-Ops: scs-c1-eqiad unresponsive - https://phabricator.wikimedia.org/T175625#3719619 (10Cmjohnson) a:05Cmjohnson>03RobH I've replaced the console, set it up so it's accessible. Setup all the ports but I am not able to access ports via pmshell. @robh could you look into thi... [14:21:19] (03CR) 10jerkins-bot: [V: 04-1] prometheus: add analytics instance [puppet] - 10https://gerrit.wikimedia.org/r/378716 (https://phabricator.wikimedia.org/T175922) (owner: 10Filippo Giunchedi) [14:21:21] (03CR) 10jerkins-bot: [V: 04-1] [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [14:21:57] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: eqiad: rack frack refresh equipment - https://phabricator.wikimedia.org/T169644#3719622 (10Cmjohnson) 05Open>03Resolved this has been completed. 
[14:22:02] (03CR) 10Alexandros Kosiaris: [C: 031] puppet: change elasticsearch_5 template to parse under puppet 4 [puppet] - 10https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) (owner: 10Herron) [14:22:16] 10Operations, 10ops-eqiad, 10fundraising-tech-ops, 10netops: rack/setup/wire/deploy msw2-c1-eqiad - https://phabricator.wikimedia.org/T166171#3719635 (10Cmjohnson) 05Open>03Resolved Completed. [14:27:48] 10Operations, 10vm-requests, 10Patch-For-Review, 10Performance-Team (Radar): Request VM for webperf (metrics processing) - https://phabricator.wikimedia.org/T179036#3719648 (10akosiaris) @Dzahn Yes and yes, both sound fine. @Krinkle Nicely written task! Thanks! [14:28:33] 10Operations, 10ops-eqiad, 10DBA: decommission db1018 - https://phabricator.wikimedia.org/T176215#3719649 (10Cmjohnson) [14:28:40] (03CR) 10Alexandros Kosiaris: [C: 031] introduce webperf1001 [dns] - 10https://gerrit.wikimedia.org/r/387215 (https://phabricator.wikimedia.org/T179036) (owner: 10Dzahn) [14:28:45] 10Operations, 10ops-eqiad, 10DBA: decommission db1018 - https://phabricator.wikimedia.org/T176215#3617573 (10Cmjohnson) Server has been wiped and removed from rack...racktables updated [14:29:03] 10Operations, 10ops-eqiad, 10DBA: decommission db1018 - https://phabricator.wikimedia.org/T176215#3719653 (10Cmjohnson) 05Open>03Resolved [14:29:05] 10Operations, 10DBA, 10Patch-For-Review: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3719654 (10Cmjohnson) [14:29:20] marostegui: cmjohnson1 bd808 We're all here, I'm working off of https://etherpad.wikimedia.org/p/labsdb-reboots. The task is T168584, and I've done the downtimes and announcements on irc/lists. 
[14:29:20] T168584: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584 [14:29:37] madhuvishy: i am ready whenver you want [14:29:46] marostegui: Let's do it :) [14:29:54] ok [14:29:55] standing by [14:30:05] * bd808 knocks wood [14:30:10] !log Stop replication and MySQL on labsdb1001 - T168584 [14:30:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:30:35] chasemp: ^ fyi [14:30:43] kk [14:32:07] mysql is being stopped... [14:32:13] (03PS1) 10Herron: puppet: change discovery-statefile template to parse under puppet 4 [puppet] - 10https://gerrit.wikimedia.org/r/387237 (https://phabricator.wikimedia.org/T179084) [14:32:13] let's hope it will not take long [14:33:11] * volans around if needed ;) [14:33:43] madhuvishy: mysql stopped - all good from my side [14:33:58] volans: thanks! [14:35:34] marostegui: cool, wanna hit reboot? I can watch the console [14:35:47] yep [14:36:04] !log Reboot labsdb1001 - T168584 [14:36:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:36:10] T168584: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584 [14:36:11] done [14:36:25] 14:27:33 up 515 days, 12:58, 3 users, load average: 5.60, 5.76, 6.39 [14:36:29] bye bye XD [14:36:46] okay, watching here [14:36:48] :) [14:37:25] (03CR) 10Herron: "https://puppet-compiler.wmflabs.org/compiler02/8549/" [puppet] - 10https://gerrit.wikimedia.org/r/387237 (https://phabricator.wikimedia.org/T179084) (owner: 10Herron) [14:37:57] (03CR) 10Herron: [C: 032] puppet: change elasticsearch_5 template to parse under puppet 4 [puppet] - 10https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) (owner: 10Herron) [14:38:03] (03PS3) 10Herron: puppet: change elasticsearch_5 template to parse under puppet 4 [puppet] - 10https://gerrit.wikimedia.org/r/387113 (https://phabricator.wikimedia.org/T179174) [14:40:03] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 
10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3719677 (10herron) [14:40:06] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: Puppet4: Error while evaluating a Function Call, Failed to parse template elasticsearch/elasticsearch_5.yml.erb:\n Filepath: /etc/puppet/modules/elasticsearch/templates/elasticsearch_5.yml.erb\n Lin... - https://phabricator.wikimedia.org/T179174#3719675 [14:40:30] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3699687 (10herron) [14:40:41] server up!! [14:40:56] (well, network at least) [14:41:11] marostegui: here it says [14:41:14] https://www.irccloud.com/pastebin/C0JsWRQI/ [14:41:21] cmjohnson1: ^ [14:41:27] :| [14:41:45] hit skip [14:41:58] okay done [14:42:11] i am in [14:42:24] I have login [14:42:25] cool [14:42:25] * bd808 fidgets [14:42:26] storage looks good [14:42:32] srvuserdata was the old partition i believe [14:42:39] let me check storage before starting mysql [14:42:44] looks okay [14:42:48] okay [14:43:13] is the kernel correct? [14:43:32] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3719686 (10herron) [14:43:34] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: Puppet4: Error while evaluating a Resource Statement, Unknown resource type: 'cronjob' at /etc/puppet/modules/mediawiki/manifests/maintenance/refreshlinks.pp:14:5 - https://phabricator.wikimedia.org/T179177#3719684 (10herron) 05Open>03Resolved a:... [14:43:40] hoo: I'd have another n00b question: is there a way to check what a refreshlink did for a specific page? 
Say in commons [14:43:51] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3699689 (10herron) [14:44:20] I am currently seeing a rootjob in commons that is responsible for a ton of refreshlinks, started on 20171027042217 [14:44:39] but the correspondent edit on recentchanges does not make a lot of sense [14:44:57] (and the rs might of course not be the same between change and job submission) [14:44:57] marostegui: I think so, moritzm is the latest kernel version for trusty - 3.13.0-133-generic? [14:45:10] labsdb1003 is at 3.13.0-40-generic [14:45:17] going to start mysql without replication first [14:45:42] RECOVERY - MegaRAID on tungsten is OK: OK: optimal, 1 logical, 2 physical [14:46:33] mysql up and I can see tables, databases and select stuff, so going to start replication again [14:46:44] elukey: Not that I'm aware of… no [14:47:17] marostegui: cool. https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?host=labsdb1001 looks good [14:47:24] madhuvishy:all done from my side [14:48:03] marostegui: nice! [14:48:41] I'm thinking we let it be for 1-2 hours, and then revert dns back if things are still looking good [14:48:48] chasemp: bd808 ^ sound okay? [14:48:48] agreed [14:49:07] madhuvishy: yeah. works for me [14:49:20] madhuvishy: yep, k. Can someone do some manual poking to try to use it? do we have an easy tool to point at it? [14:49:55] things that are pointed at c1.labsdb would be hitting it [14:50:13] I don't know any to check offhand [14:50:22] https://tools.wmflabs.org/tool-db-usage/ [14:50:25] actually... this is a great opportunity to see what is pinned to c1 [14:51:35] yeah, that tool would hit it, but only once an hour I think. I can force it to bypass cache, but that's not a heavy workout at all. 
[14:51:58] * bd808 makes it refresh [14:55:49] we can do mysql --defaults-file=$HOME/replica.my.cnf -h c1.labsdb from any tool, and test some queries [14:57:08] (03PS13) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [14:57:20] (03CR) 10Dzahn: [C: 032] introduce webperf1001 [dns] - 10https://gerrit.wikimedia.org/r/387215 (https://phabricator.wikimedia.org/T179036) (owner: 10Dzahn) [14:57:52] simple test cases will do I think, was just considering that an hour is a sane time to wait but is misleading if there is no traffic we can verify works before repooling :) [14:58:11] that is also true [14:58:16] mysql caught up already [14:58:30] chasemp: yeah I'm running some queries from Quarry recent queries to test :) [14:59:31] 10Operations, 10Prod-Kubernetes, 10monitoring, 10Kubernetes, and 2 others: Improve monitoring of the Kubernetes clusters - https://phabricator.wikimedia.org/T177395#3719781 (10akosiaris) [14:59:54] 10Operations, 10Prod-Kubernetes, 10monitoring, 10Kubernetes, and 2 others: Improve monitoring of the Kubernetes clusters - https://phabricator.wikimedia.org/T177395#3657156 (10akosiaris) a:03akosiaris [15:00:19] I feel bad if we have to reboot labsdb1003: 15:00:13 up 1061 days, [15:00:38] o_0 [15:01:18] let's not reboot it and keep that c00l uptime! [15:01:27] thanks cmjohnson1, it looks like we're good for now, fingers crossed! [15:01:45] sounds good(ish) [15:02:17] my test queries all did good [15:12:26] (03PS1) 10Ema: varnish child started: avoid illegal characters [puppet] - 10https://gerrit.wikimedia.org/r/387242 [15:15:55] 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3719840 (10Marostegui) [15:15:57] madhuvishy: I have updated ^ [15:16:25] marostegui: thanks! 
[15:17:12] 10Operations, 10Mail, 10monitoring: prometheus metrics and grafana dashboard for exim - https://phabricator.wikimedia.org/T179302#3719846 (10Dzahn) [15:20:28] !log re-enabling Zayo transit in eqiad [15:20:31] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3715032 (10daniel) @BBlack wrote: > something that's doing a legitimate request->response cycle, but trickling out the bytes o... [15:20:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:22:07] (03PS1) 10Arturo Borrero Gonzalez: diamond: nfsiostat: update collector to read from arbitrary NFS mount points [puppet] - 10https://gerrit.wikimedia.org/r/387243 (https://phabricator.wikimedia.org/T179024) [15:22:35] (03CR) 10jerkins-bot: [V: 04-1] diamond: nfsiostat: update collector to read from arbitrary NFS mount points [puppet] - 10https://gerrit.wikimedia.org/r/387243 (https://phabricator.wikimedia.org/T179024) (owner: 10Arturo Borrero Gonzalez) [15:23:40] -_- [15:26:56] (03PS2) 10Arturo Borrero Gonzalez: diamond: nfsiostat: update collector to read from arbitrary NFS mount points [puppet] - 10https://gerrit.wikimedia.org/r/387243 (https://phabricator.wikimedia.org/T179024) [15:32:08] 10Operations, 10Puppet, 10User-Joe: Puppet4: Error while evaluating a Resource Statement, Unknown resource type: 'exim_alias_file' at /etc/puppet/private/modules/privateexim/manifests/init.pp:55:2 - https://phabricator.wikimedia.org/T179170#3719894 (10herron) Fixed with commit 1cc673d6e6903b26e01fdcf8d6cbcd... 
[15:32:16] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3719897 (10herron) [15:32:18] 10Operations, 10Puppet, 10User-Joe: Puppet4: Error while evaluating a Resource Statement, Unknown resource type: 'exim_alias_file' at /etc/puppet/private/modules/privateexim/manifests/init.pp:55:2 - https://phabricator.wikimedia.org/T179170#3719895 (10herron) 05Open>03Resolved a:03herron [15:32:39] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe, 10cloud-services-team (FY2017-18): Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3699910 (10herron) [15:34:41] (03CR) 10Thcipriani: [C: 04-1] "> @thcipriani: The revised version of D774 was merged, so this is" [puppet] - 10https://gerrit.wikimedia.org/r/380503 (https://phabricator.wikimedia.org/T172333) (owner: 10Alexandros Kosiaris) [15:34:44] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3719914 (10BBlack) Trickled-in POST on the client side would be something else. Varnish's timeout_idle, which is set to 5s on... [15:39:13] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3719928 (10daniel) > In any case, this would consume front-edge client connections, but wouldn't trigger anything deeper into... [15:41:52] PROBLEM - puppet last run on rdb1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [15:42:20] arturo: Jenkins is saying hello! :) [15:43:15] elukey: should be fixed now, right? 
[15:43:27] :-) [15:43:53] now it likes your change more yes :D [15:44:10] I have to admit I hate indentation with spaces [15:44:40] 10Operations, 10ops-eqdfw, 10DC-Ops: add missing asset tag and correct location in rack for cr1-eqdfw - https://phabricator.wikimedia.org/T178613#3719946 (10Papaul) 05Open>03Resolved complete [15:44:49] you can start wars among people with these statements :D [15:45:00] (03CR) 10Dzahn: [C: 04-1] "blocked on https://phabricator.wikimedia.org/T179302" [puppet] - 10https://gerrit.wikimedia.org/r/382916 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [15:46:31] PROBLEM - puppet last run on elastic1038 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [15:46:58] (03Abandoned) 10Chad: Pylint nitpicks: clean up import orders [software/conftool] - 10https://gerrit.wikimedia.org/r/386206 (owner: 10Chad) [15:47:01] (03Abandoned) 10Chad: Pylint nitpicks: Whitespace/continuation fixes [software/conftool] - 10https://gerrit.wikimedia.org/r/386211 (owner: 10Chad) [15:48:50] arturo: hi, if you check on gerrit web ui, in your user settings /profile, there is a section for SSH keys as well [15:49:06] arturo: and it is technically unrelated to the prod ssh key in puppet [15:49:25] do you already have a gerrit key setup there? re: comments on the ticket [15:49:41] mutante: I was able to push to gerrit, so I guess so [15:49:53] elukey: :-) [15:50:23] arturo: ah:) that sounds like it's resolved. 
i was reading https://phabricator.wikimedia.org/T178807#3719429 [15:50:40] 10Operations, 10Goal, 10User-fgiunchedi: Port postgresql metrics to Prometheus - https://phabricator.wikimedia.org/T179306#3719974 (10fgiunchedi) [15:51:11] mutante: sorry, will update the ticket right now [15:51:18] arturo: no worries, glad it works [15:54:36] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3719995 (10BBlack) >>! In T179156#3719928, @daniel wrote: >> In any case, this would consume front-edge client connections, bu... [15:57:26] !log depool again tools-exec-1401.tools.eqiad.wmflabs for more tests related to T179024 [15:57:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:57:31] T179024: nfsiostat collector appears to be broken - https://phabricator.wikimedia.org/T179024 [15:58:38] arturo: generally that would be !log'd in 'wikimedia-cloud' w/ "!log " as each project has it's own SAL page fyi [15:59:14] ok great :S [15:59:26] (03PS1) 10Chad: Future-proof against changing import semantics in python 3 [software/conftool] - 10https://gerrit.wikimedia.org/r/387246 [15:59:41] can I manually edit the wiki pages? [15:59:52] (03CR) 10Chad: [C: 04-2] "Um, how did I push without a change-id?!" 
[software/conftool] - 10https://gerrit.wikimedia.org/r/387246 (owner: 10Chad) [15:59:59] (03CR) 10Dzahn: "also see https://phabricator.wikimedia.org/T175798 where we want to move diamond collectors to prometheus" [puppet] - 10https://gerrit.wikimedia.org/r/387243 (https://phabricator.wikimedia.org/T179024) (owner: 10Arturo Borrero Gonzalez) [16:00:01] (03CR) 10jerkins-bot: [V: 04-1] Future-proof against changing import semantics in python 3 [software/conftool] - 10https://gerrit.wikimedia.org/r/387246 (owner: 10Chad) [16:00:28] arturo: yes, you can edit the page "SAL" on wikitech wiki [16:01:55] (03PS1) 10Chad: Future-proof python 3 print function [software/conftool] - 10https://gerrit.wikimedia.org/r/387254 [16:02:34] (03PS15) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [16:02:36] meanwhile I'm on the longest chain of patchets in modern history [16:03:09] (03CR) 10jerkins-bot: [V: 04-1] openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [16:06:16] (03CR) 10Mforns: [C: 031] "After discussion in https://phabricator.wikimedia.org/T175395, seems it's safe to keep the skin field, after having bucketized it to (`vec" [puppet] - 10https://gerrit.wikimedia.org/r/379829 (https://phabricator.wikimedia.org/T175395) (owner: 10Bmansurov) [16:10:41] 10Operations, 10Goal, 10Patch-For-Review, 10User-fgiunchedi: Port non-deprecated Diamond collectors to Prometheus - https://phabricator.wikimedia.org/T177196#3720025 (10fgiunchedi) [16:10:43] 10Operations, 10monitoring: Port non-deprecated Diamond collectors to Prometheus - https://phabricator.wikimedia.org/T175798#3720027 (10fgiunchedi) [16:11:22] 10Operations, 10Mail, 10monitoring: prometheus metrics and grafana dashboard for exim - https://phabricator.wikimedia.org/T179302#3720029 (10fgiunchedi) 
[16:11:24] 10Operations, 10monitoring, 10Discovery-Search (Current work): port elasticsearch diamond collector to prometheus - https://phabricator.wikimedia.org/T175799#3720030 (10fgiunchedi) [16:11:26] 10Operations, 10Goal, 10Patch-For-Review, 10User-fgiunchedi: Port non-deprecated Diamond collectors to Prometheus - https://phabricator.wikimedia.org/T177196#3650139 (10fgiunchedi) [16:11:30] 10Operations, 10Mail, 10monitoring: prometheus metrics and grafana dashboard for exim - https://phabricator.wikimedia.org/T179302#3719846 (10fgiunchedi) [16:11:31] RECOVERY - puppet last run on elastic1038 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [16:11:32] 10Operations, 10monitoring, 10Discovery-Search (Current work): port elasticsearch diamond collector to prometheus - https://phabricator.wikimedia.org/T175799#3603468 (10fgiunchedi) [16:11:34] 10Operations, 10monitoring: Port non-deprecated Diamond collectors to Prometheus - https://phabricator.wikimedia.org/T175798#3603447 (10fgiunchedi) [16:12:01] RECOVERY - puppet last run on rdb1007 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [16:12:56] (03CR) 10jerkins-bot: [V: 04-1] Future-proof python 3 print function [software/conftool] - 10https://gerrit.wikimedia.org/r/387254 (owner: 10Chad) [16:15:41] (03PS3) 10Arturo Borrero Gonzalez: diamond: nfsiostat update collector to read from arbitrary NFS mount points [puppet] - 10https://gerrit.wikimedia.org/r/387243 (https://phabricator.wikimedia.org/T179024) [16:15:41] CUSTOM - LVS HTTP IPv4 on ores.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 4375 bytes in 0.006 second response time [16:15:59] <_joe_> what's happening on ores? [16:16:09] <_joe_> why CUSTOM, too? [16:16:10] _joe_: it's the test [16:16:12] <_joe_> wth is that? 
[16:16:13] _joe_: We don’t know of anything [16:16:24] <_joe_> oh ok [16:16:26] _joe_: it was a test to test paging [16:26:46] 10Operations, 10ops-eqiad, 10Packaging, 10hardware-requests: Decommission host copper.eqiad.wmnet - https://phabricator.wikimedia.org/T176957#3720086 (10RobH) [16:27:46] (03Draft2) 10Addshore: WIP DNM role and profile for wdcm dashboards [puppet] - 10https://gerrit.wikimedia.org/r/387211 [16:28:32] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3720106 (10BBlack) p:05Unbreak!>03High Reducing this from UBN->High, because current best-working-theory is this problem is... [16:29:08] 10Operations, 10ops-eqiad, 10Analytics, 10User-Elukey: Check analytics1037 power supply status - https://phabricator.wikimedia.org/T179192#3720108 (10RobH) a:05RobH>03Cmjohnson @Cmjohnson: Do we have any power supplies on already decommissioned hardware that would fit in the system with the failed powe... [16:31:44] (03PS1) 10Giuseppe Lavagetto: Small fixes, log path change [docker-images/docker-pkg] - 10https://gerrit.wikimedia.org/r/387268 [16:35:19] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] Small fixes, log path change [docker-images/docker-pkg] - 10https://gerrit.wikimedia.org/r/387268 (owner: 10Giuseppe Lavagetto) [16:36:29] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: puppetmaster hostcert and hostprivkey point to nonexistent files - https://phabricator.wikimedia.org/T179099#3720148 (10akosiaris) >>! In T179099#3719441, @herron wrote: > @akosiaris following up to your comment in https://gerrit.wikimedia.org/r/#/c/3...
[16:39:07] (03PS1) 10Dzahn: introduce webperf2001.codfw.wmnet [dns] - 10https://gerrit.wikimedia.org/r/387270 (https://phabricator.wikimedia.org/T179036) [16:39:15] (03CR) 10jerkins-bot: [V: 04-1] introduce webperf2001.codfw.wmnet [dns] - 10https://gerrit.wikimedia.org/r/387270 (https://phabricator.wikimedia.org/T179036) (owner: 10Dzahn) [16:39:46] (03PS2) 10Dzahn: introduce webperf2001.codfw.wmnet [dns] - 10https://gerrit.wikimedia.org/r/387270 (https://phabricator.wikimedia.org/T179036) [16:43:03] (03PS1) 10Giuseppe Lavagetto: New version of the software [docker-images/docker-pkg/deploy] - 10https://gerrit.wikimedia.org/r/387278 [16:43:29] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] New version of the software [docker-images/docker-pkg/deploy] - 10https://gerrit.wikimedia.org/r/387278 (owner: 10Giuseppe Lavagetto) [16:44:56] !log oblivian@tin Started deploy [docker-pkg/deploy@576c80f]: New version with small fixes [16:45:00] !log oblivian@tin Finished deploy [docker-pkg/deploy@576c80f]: New version with small fixes (duration: 00m 04s) [16:45:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:45:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:45:11] (03PS3) 10Dzahn: admin: Add legoktm to releasers-mediawiki to upload wikidiff2 tarballs [puppet] - 10https://gerrit.wikimedia.org/r/386896 (https://phabricator.wikimedia.org/T179264) (owner: 10Legoktm) [16:46:11] (03CR) 10Dzahn: [C: 032] "approved in ops meeting" [puppet] - 10https://gerrit.wikimedia.org/r/386896 (https://phabricator.wikimedia.org/T179264) (owner: 10Legoktm) [16:50:22] (03CR) 10Dzahn: "[releases1001:~] $ id legoktm" [puppet] - 10https://gerrit.wikimedia.org/r/386896 (https://phabricator.wikimedia.org/T179264) (owner: 10Legoktm) [16:50:52] 10Operations, 10Ops-Access-Requests, 10Patch-For-Review: Add legoktm to releasers-mediawiki - https://phabricator.wikimedia.org/T179264#3720210 (10Dzahn) 05Open>03Resolved a:03Dzahn ``` 
[releases1001:~] $ id legoktm uid=2552(legoktm) gid=500(wikidev) groups=500(wikidev),711(releasers-mediawiki) [relea... [16:52:22] (03PS1) 10Chad: Fix a ton of errors that were making flake8 freak out [software/conftool] - 10https://gerrit.wikimedia.org/r/387279 [16:54:24] (03PS1) 10Chad: Future-proof against changing import semantics in python 3 [software/conftool] - 10https://gerrit.wikimedia.org/r/387280 [16:54:35] (03PS1) 10DCausse: Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270) [16:54:50] (03Abandoned) 10Chad: Future-proof against changing import semantics in python 3 [software/conftool] - 10https://gerrit.wikimedia.org/r/387246 (owner: 10Chad) [16:55:01] (03CR) 10jerkins-bot: [V: 04-1] Future-proof against changing import semantics in python 3 [software/conftool] - 10https://gerrit.wikimedia.org/r/387280 (owner: 10Chad) [16:55:47] 10Operations, 10ORES, 10Patch-For-Review, 10Scoring-platform-team (Current), and 2 others: Stress/capacity test new ores* cluster - https://phabricator.wikimedia.org/T169246#3720235 (10Halfak) a:05Halfak>03awight [16:57:50] AaronSchulz: o/ [16:58:08] (03PS2) 10DCausse: Properly check for cluster existence prior setting TTM mirrors [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387281 (https://phabricator.wikimedia.org/T179270) [17:00:04] gehel: Your horoscope predicts another unfortunate Wikidata Query Service weekly deploy deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T1700). [17:00:04] No GERRIT patches in the queue for this window AFAICS. 
[17:00:27] (03PS1) 10Hoo man: Wikidata dispatcher: Choose a better value for --randomness [puppet] - 10https://gerrit.wikimedia.org/r/387282 [17:01:56] (03PS3) 10BBlack: [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) [17:02:17] (03CR) 10jerkins-bot: [V: 04-1] [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [17:02:59] jouncebot: no direct gerrit patch, but GUI update coming up! [17:03:20] (03PS1) 10Madhuvishy: Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003" [puppet] - 10https://gerrit.wikimedia.org/r/387291 [17:03:28] (03PS2) 10Madhuvishy: Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003" [puppet] - 10https://gerrit.wikimedia.org/r/387291 [17:04:38] (03PS4) 10BBlack: [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) [17:04:57] (03CR) 10jerkins-bot: [V: 04-1] [WIP] backend transaction_timeout [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/387236 (https://phabricator.wikimedia.org/T179156) (owner: 10BBlack) [17:05:13] (03CR) 10Madhuvishy: [C: 032] Revert "labsdb: Switchover dns for labsdb1001 shards to labsdb1003" [puppet] - 10https://gerrit.wikimedia.org/r/387291 (owner: 10Madhuvishy) [17:05:36] !log gehel@tin Started deploy [wdqs/wdqs@fdb9100]: GUI update [17:05:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:07:53] !log gehel@tin Finished deploy [wdqs/wdqs@fdb9100]: GUI update (duration: 02m 17s) [17:07:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:08:18] !log Revert dns switchover for c1 shards to c3 post labsdb1001 reboot T168584 [17:08:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:08:23] 
T168584: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584 [17:08:34] SMalyshev: wdqs update completed, tests are green [17:10:04] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3720290 (10daniel) > Because they're POST they'd be handled as an immediate pass through the varnish layers, so I don't think t... [17:10:12] (03CR) 10Dzahn: [C: 032] introduce webperf2001.codfw.wmnet [dns] - 10https://gerrit.wikimedia.org/r/387270 (https://phabricator.wikimedia.org/T179036) (owner: 10Dzahn) [17:12:25] _joe_ needs a FOSS clone of http://www.bitboost.com/pawsense/ [17:13:14] (03PS16) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [17:13:49] (03CR) 10jerkins-bot: [V: 04-1] openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [17:14:23] (03PS18) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [17:14:58] (03CR) 10jerkins-bot: [V: 04-1] openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [17:17:11] (03PS19) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [17:20:05] (03PS1) 10Chad: Future-proof python 3 print function [software/conftool] - 10https://gerrit.wikimedia.org/r/387297 [17:20:27] (03Abandoned) 10Chad: Future-proof python 3 print function [software/conftool] - 10https://gerrit.wikimedia.org/r/387254 (owner: 10Chad) [17:23:10] (03PS2) 10Chad: 
Future-proof python 3 print function [software/conftool] - 10https://gerrit.wikimedia.org/r/387297 [17:23:40] 10Operations, 10CirrusSearch, 10Discovery, 10MediaWiki-JobQueue, and 6 others: Job queue is increasing non-stop - https://phabricator.wikimedia.org/T173710#3720346 (10elukey) We had some relief after the last change in the configs of the jobrunners, namely the queue started shrinking, but then we got back... [17:24:22] 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3720347 (10madhuvishy) The 1001 reboot is all done. Notes from my planning etherpad: labsdb1001 (Planned for Oct 30 2017 14:3... [17:29:03] 10Operations, 10CirrusSearch, 10Discovery, 10MediaWiki-JobQueue, and 6 others: Job queue is increasing non-stop - https://phabricator.wikimedia.org/T173710#3720358 (10EBernhardson) All jobs have a `requestId` parameter, which is passed down through the execution chain. This is the same as the `reqId` field... [17:30:40] (03PS20) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [17:31:10] (03CR) 10jerkins-bot: [V: 04-1] openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [17:37:41] (03PS1) 10Ori.livneh: Update my (ori) keys [puppet] - 10https://gerrit.wikimedia.org/r/387301 [17:39:57] 10Operations, 10ORES, 10Scoring-platform-team, 10Traffic, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3720392 (10BBlack) >>! In T179156#3719995, @BBlack wrote: > We have an obvious case of normal slow chunked uploads of large fil... [17:40:57] volans: cumin question :) Can I run something across all VPS hosts, but exclude a project or two? 
[17:41:18] I know how to target all nodes, not sure if and how exclusion works [17:41:42] 10Operations, 10Ops-Access-Requests: Add hoo to perf-roots - https://phabricator.wikimedia.org/T179317#3720398 (10hoo) [17:41:57] madhuvishy: sure, so I think there are two ways, but one for sure :D [17:42:22] so the first obvious way is to use the global grammar that allows you to combine subqueries with and/or/and not/xor [17:43:31] like the current alias all is defined [17:43:55] volans: like 'O{*} and not O{project:tools}' [17:43:58] so you could do: A:all and not O{project:foo} and not O{project:bar} [17:44:15] if you want to exclude contintcloud and admin-monitoring those are already excluded by the alias all [17:44:22] all: 'O{*} and not O{project:contintcloud} and not O{project:admin-monitoring}' [17:44:29] I see [17:44:31] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [17:44:38] see /etc/cumin/aliases.yaml on the host or hieradata/eqiad/profile/openstack/main/cumin.yaml in puppet [17:45:05] volans: cool thanks :) [17:45:37] madhuvishy: did by any chance any of you look into that openstack policy to speed things up when querying all projects? [17:46:32] I haven't for sure, but I don't think so?
[17:47:46] (03PS1) 10Chad: py3 compat [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 [17:47:48] (03PS1) 10Chad: Py3 compat: swap xrange for range [software/conftool] - 10https://gerrit.wikimedia.org/r/387306 [17:47:58] ok, np, in case you do ping me so we can test the performances and I can make the changed in cumin to take advantage of it ;) [17:48:07] (03CR) 10jerkins-bot: [V: 04-1] py3 compat [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 (owner: 10Chad) [17:48:09] (03CR) 10jerkins-bot: [V: 04-1] Py3 compat: swap xrange for range [software/conftool] - 10https://gerrit.wikimedia.org/r/387306 (owner: 10Chad) [17:48:33] volans: for sure :) thanks for cumin again, it's awesome! [17:49:03] 10Operations, 10Ops-Access-Requests: Add hoo to perf-roots - https://phabricator.wikimedia.org/T179317#3720428 (10daniel) It would indeed be very useful to us of @hoo had this access. [17:49:15] (03PS1) 10Herron: puppet: change mediawiki updatequerypages::cronjob call to full name [puppet] - 10https://gerrit.wikimedia.org/r/387307 (https://phabricator.wikimedia.org/T179290) [17:49:30] RECOVERY - puppet last run on bast1001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [17:49:30] madhuvishy: and btw there is a pending CR to allow to plug-in external backends, so we could write one for the puppet API of yours ;) [17:49:48] at that point you could mix projects in openstack and puppet classes... [17:49:55] ya that'd be really cool [17:50:16] (03PS1) 10Jdrewniak: Bumping portals to master [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387308 (https://phabricator.wikimedia.org/T128546) [17:50:25] (03CR) 10Ori.livneh: "I e-mailed additional information for verification to security@." 
[puppet] - 10https://gerrit.wikimedia.org/r/387301 (owner: 10Ori.livneh) [17:50:58] (03PS1) 10Chad: Add .coverage to .gitignore [software/conftool] - 10https://gerrit.wikimedia.org/r/387309 [17:51:57] madhuvishy: stay tuned ;) [17:53:11] (03PS21) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [18:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: That opportune time is upon us again. Time for a Morning SWAT (Max 8 patches) deploy. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T1800). [18:00:04] bd808 and jan_drewniak: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [18:00:30] o/ [18:00:52] o/ [18:02:10] I suppose I could try to remember how to swat if there are no awesomely skilled swat teamers today [18:02:51] if you're not up for it, I'm around :) [18:03:05] thcipriani: you have the con sir. 
:) [18:03:11] :D [18:03:35] (03PS2) 10Thcipriani: Add Timeless skin to wikitech [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387069 (https://phabricator.wikimedia.org/T154371) (owner: 10BryanDavis) [18:03:37] you don't want managers wandering around on tin and breaking things [18:03:54] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387069 (https://phabricator.wikimedia.org/T154371) (owner: 10BryanDavis) [18:04:28] (03Restored) 10Chad: Pylint nitpicks: Whitespace/continuation fixes [software/conftool] - 10https://gerrit.wikimedia.org/r/386211 (owner: 10Chad) [18:05:21] totally, I'm way more effective at breaking stuff :P [18:05:31] (03Merged) 10jenkins-bot: Add Timeless skin to wikitech [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387069 (https://phabricator.wikimedia.org/T154371) (owner: 10BryanDavis) [18:05:58] 10Operations, 10ORES, 10Scoring-platform-team: Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3720469 (10Halfak) [18:06:02] bd808: you can't test ^ anywhere but silver, correct? And at that point it's live? 
[18:06:14] er, ^ meant to point to the timeless skin change [18:06:23] (03CR) 10jenkins-bot: Add Timeless skin to wikitech [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387069 (https://phabricator.wikimedia.org/T154371) (owner: 10BryanDavis) [18:06:36] thcipriani: there is labtestwikitech, but I don't think this rates that [18:06:37] (03PS22) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [18:06:50] its just letting a skin be used [18:06:58] okie doke [18:07:01] * thcipriani syncs [18:07:28] (03CR) 10jerkins-bot: [V: 04-1] openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [18:08:32] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:387069|Add Timeless skin to wikitech]] T154371 (duration: 00m 51s) [18:08:36] ^ bd808 live now [18:08:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:08:38] T154371: Review and deploy Timeless skin - https://phabricator.wikimedia.org/T154371 [18:09:13] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387308 (https://phabricator.wikimedia.org/T128546) (owner: 10Jdrewniak) [18:09:50] thcipriani: seems to work -- https://wikitech.wikimedia.org/wiki/Main_Page?useskin=timeless&cachebust [18:10:12] fancy :) Thanks for checking [18:11:59] (03Merged) 10jenkins-bot: Bumping portals to master [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387308 (https://phabricator.wikimedia.org/T128546) (owner: 10Jdrewniak) [18:12:10] (03PS23) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [18:12:13] (03CR) 10jenkins-bot: Bumping portals to master [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/387308 (https://phabricator.wikimedia.org/T128546) (owner: 10Jdrewniak) [18:12:44] jan_drewniak: ^ live on mwdebug1002, check please [18:13:58] thcipriani: looks good :) [18:14:07] ok, going live [18:15:01] (03PS1) 10Chad: Pylint nitpicks: Whitespace/continuation/parens fixes [software/conftool] - 10https://gerrit.wikimedia.org/r/387313 [18:15:43] !log thcipriani@tin Synchronized portals/prod/wikipedia.org/assets: SWAT: [[gerrit:387308|Bumping portals to master]] T128546 (duration: 00m 50s) [18:15:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:15:49] T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546 [18:16:04] (03Abandoned) 10Chad: Pylint nitpicks: Whitespace/continuation fixes [software/conftool] - 10https://gerrit.wikimedia.org/r/386211 (owner: 10Chad) [18:16:20] * paladox switches to use timeless on wikitech :) [18:16:34] !log thcipriani@tin Synchronized portals: SWAT: [[gerrit:387308|Bumping portals to master]] T128546 (duration: 00m 50s) [18:16:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:16:43] ^ jan_drewniak live everywhere [18:17:22] thcipriani: thanks! [18:17:31] yw :) [18:17:35] (03PS2) 10Chad: py3 compat [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 [18:18:03] (03CR) 10Bearloga: "Not sure why `include ::r_lang` gave you problems, but try this other way?" 
(033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore) [18:18:14] (03PS12) 10Bearloga: Add ::statistics::wmde::wdcm [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore) [18:20:19] (03CR) 10jerkins-bot: [V: 04-1] Pylint nitpicks: Whitespace/continuation/parens fixes [software/conftool] - 10https://gerrit.wikimedia.org/r/387313 (owner: 10Chad) [18:22:26] (03CR) 10GoranSMilovanovic: [C: 031] "@Bearloga We've too wondered why would a simple `include ::r_lang` cause any problems." [puppet] - 10https://gerrit.wikimedia.org/r/369902 (https://phabricator.wikimedia.org/T171258) (owner: 10Addshore) [18:22:35] !log awight@tin Started deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) [18:22:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:28:10] (03CR) 10GoranSMilovanovic: "Failed? It seems similar to what has happened w. 
the r_lang module before:" [puppet] - 10https://gerrit.wikimedia.org/r/387211 (owner: 10Addshore) [18:28:23] (03CR) 10jerkins-bot: [V: 04-1] py3 compat [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 (owner: 10Chad) [18:29:40] (03PS3) 10Chad: py3 compat: rewrite execfile() using exec() [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 [18:37:41] (03CR) 10jerkins-bot: [V: 04-1] py3 compat: rewrite execfile() using exec() [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 (owner: 10Chad) [18:40:30] !log awight@tin Started deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) [18:40:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:41:10] !log awight@tin Started deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) [18:41:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:46:35] (03PS4) 10Chad: py3 compat: rewrite execfile() using exec() [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 [18:46:37] (03PS1) 10Chad: Fix StringIO import for py2/3 compat [software/conftool] - 10https://gerrit.wikimedia.org/r/387322 [18:47:15] (03CR) 10jerkins-bot: [V: 04-1] py3 compat: rewrite execfile() using exec() [software/conftool] - 10https://gerrit.wikimedia.org/r/387305 (owner: 10Chad) [18:47:32] (03CR) 10jerkins-bot: [V: 04-1] Fix StringIO import for py2/3 compat [software/conftool] - 10https://gerrit.wikimedia.org/r/387322 (owner: 10Chad) [18:48:19] (03PS2) 10Chad: Py3 compat: swap xrange for range [software/conftool] - 10https://gerrit.wikimedia.org/r/387306 [18:48:57] (03CR) 10jerkins-bot: [V: 04-1] Py3 compat: swap xrange for range [software/conftool] - 10https://gerrit.wikimedia.org/r/387306 (owner: 10Chad) [19:01:03] !log awight@tin Finished deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) (duration: 19m 53s) [19:01:07] !log awight@tin Started deploy 
[ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) [19:01:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:01:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:01:29] (sorry for the spam, git is taking a dump on me) [19:07:12] yeah… I’m consistently getting a 504 response from tin, when checking out one of my submodules. [19:07:24] !log awight@tin Finished deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) (duration: 06m 16s) [19:07:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:14:50] (03CR) 10Aaron Schulz: [C: 031] "Is anything else needed to deploy this change?" [puppet] - 10https://gerrit.wikimedia.org/r/384695 (https://phabricator.wikimedia.org/T175672) (owner: 10Jcrespo) [19:28:43] hello thcipriani, is it possible to deploy an unexpected small swat patch in this window ? if you are ok, i'll add it to wikitech:deployments [19:36:50] ping thcipriani :) [19:37:33] framawiki: well, we're over the window by a half-hour. What's the change? [19:38:32] thcipriani: https://phabricator.wikimedia.org/T178545 [19:38:59] (03PS3) 10Framawiki: Create Appendix NS on Burmese Wiktionary (mywikt) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385190 (https://phabricator.wikimedia.org/T178545) [19:41:31] (03PS24) 10Rush: openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) [19:41:40] framawiki: sure, I can get that out if you're around to test.
[19:42:47] thcipriani: thanks :) i'm ready to test [19:44:22] (03CR) 10Thcipriani: [C: 032] "(late) SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385190 (https://phabricator.wikimedia.org/T178545) (owner: 10Framawiki) [19:49:13] (03Merged) 10jenkins-bot: Create Appendix NS on Burmese Wiktionary (mywikt) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385190 (https://phabricator.wikimedia.org/T178545) (owner: 10Framawiki) [19:49:27] (03CR) 10jenkins-bot: Create Appendix NS on Burmese Wiktionary (mywikt) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/385190 (https://phabricator.wikimedia.org/T178545) (owner: 10Framawiki) [19:50:30] framawiki: ^ is live on mwdebug1002, check please [19:50:38] (03PS1) 10Rush: openstack-refactor match old labs encapi dummy passwords [labs/private] - 10https://gerrit.wikimedia.org/r/387350 [19:52:35] thcipriani: good for me [19:53:38] framawiki: ok, going live [19:55:28] !log awight@tin Started deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) [19:55:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:55:42] !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:385190|Create Appendix NS on Burmese Wiktionary]] T178545 (duration: 00m 55s) [19:55:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:55:50] T178545: Create Appendix namespace on Burmese Wiktionary (mywikt) - https://phabricator.wikimedia.org/T178545 [19:56:48] ^ framawiki all done [19:56:52] thanks thcipriani, and sorry for the forgetfulness [19:57:25] no worries, thanks for the patch and being around to check :) [20:00:04] gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: It is that lovely time of the day again! You are hereby commanded to deploy Services – Parsoid / OCG / Citoid / Mobileapps / ORES / …. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T2000). 
[20:00:04] No GERRIT patches in the queue for this window AFAICS. [20:01:57] (03CR) 10Rush: [V: 032 C: 032] openstack-refactor match old labs encapi dummy passwords [labs/private] - 10https://gerrit.wikimedia.org/r/387350 (owner: 10Rush) [20:08:03] !log T179083: Creating keyspace enwiki_T_parsoid_stash_htmlmXxc_uDhgnFQAdM8PPlH5 [20:08:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:08:08] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [20:09:39] !log awight@tin Finished deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) (duration: 14m 11s) [20:09:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:12:26] PROBLEM - Host copper is DOWN: PING CRITICAL - Packet loss = 100% [20:14:43] !log T179083: Creating table enwiki_T_parsoid_stash_htmlmXxc_uDhgnFQAdM8PPlH5.meta [20:14:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:14:50] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [20:18:42] !log arlolra@tin Started deploy [parsoid/deploy@8a2c0ce]: Updating Parsoid to 292633c4 [20:18:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:20:41] !log T179083: Creating table enwiki_T_parsoid_stash_htmlmXxc_uDhgnFQAdM8PPlH5.data [20:20:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:20:47] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [20:21:00] (03CR) 10Rush: "http://puppet-compiler.wmflabs.org/8558/" [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [20:24:17] (03CR) 10Rush: [C: 032] openstack: refactor deployment specific puppetmaster code [puppet] - 10https://gerrit.wikimedia.org/r/386612 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) 
[20:25:41] !log T179083: Creating keyspace enwiki_T_parsoid_stash_wikitextf0PBY8UXqY8UuiDv1 [20:25:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:25:48] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [20:26:15] !log arlolra@tin Finished deploy [parsoid/deploy@8a2c0ce]: Updating Parsoid to 292633c4 (duration: 07m 33s) [20:26:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:27:00] !log T179083: Creating table enwiki_T_parsoid_stash_wikitextf0PBY8UXqY8UuiDv1.meta [20:27:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:27:20] 10Operations, 10vm-requests, 10Patch-For-Review, 10Performance-Team (Radar): Request VM for webperf (metrics processing) - https://phabricator.wikimedia.org/T179036#3720964 (10Krinkle) p:05Triage>03Normal [20:28:16] !log T179083: Creating table enwiki_T_parsoid_stash_wikitextf0PBY8UXqY8UuiDv1.data [20:28:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:31:57] 10Operations, 10ops-codfw, 10fundraising-tech-ops, 10netops: connect second ethernet interface for fundraising codfw hosts - https://phabricator.wikimedia.org/T176175#3720997 (10ayounsi) Switch ports activated, some interfaces still show as down, most likely not activated on the server side: ``` ge-0/0/3... 
[20:33:43] !log T179083: Creating enwiki_T_parsoid_stash_dataWH8IDUS9SGI6LPpsJsLOQ [20:33:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:33:49] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [20:34:34] !log T179083: Creating enwiki_T_parsoid_stash_dataWH8IDUS9SGI6LPpsJsLOQ.meta [20:34:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:35:54] !log Updated Parsoid to 292633c4 (T133334) [20:36:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:36:00] T133334: Ref marker in caption in data-mw - https://phabricator.wikimedia.org/T133334 [20:36:30] !log T179083: Creating enwiki_T_parsoid_stash_dataWH8IDUS9SGI6LPpsJsLOQ.data [20:36:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:38:06] urandom: have a moment for a short question ? [20:38:13] matanya: sure [20:39:14] a user approached me that he forgot his password, but he is still logged in, he does not have an email set, any way he can recover his password ? [20:40:41] urandom: he also says when trying to add an email he is requested for a password [20:41:27] matanya: i may not be the best person to answer this, but afaik, being still logged in won't be of help there [20:41:56] Would security have a better answer matanya [20:42:23] yeah, i was just looking for a better channel to suggest [20:42:33] i.e dapatrick or Reedy or brian ? 
[20:42:41] Yes [20:42:59] thanks [20:43:03] Np [20:44:59] (03PS1) 10Rush: openstack: labtestpuppetmaster params adjustments [puppet] - 10https://gerrit.wikimedia.org/r/387366 (https://phabricator.wikimedia.org/T171494) [20:46:53] !log T179083: Creating enwiki_T_parsoid_stash_section2ACMDK1DRzacK9nUB3.{data,meta} (15sec delay) [20:46:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:46:59] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [20:56:06] !log T179083: Creating commons_T_parsoid_stash_htmlmXxc_uDhgnFQAdM8PPlH.{data,meta} (25 sec delay) [20:56:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:56:12] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [21:00:04] dapatrick, bawolff, and Reedy: (Dis)respected human, time to deploy Weekly Security deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T2100). Please do the needful. [21:00:04] No GERRIT patches in the queue for this window AFAICS. 
[21:00:25] PROBLEM - HHVM rendering on mw2121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:01:00] (03PS1) 10Rush: puppetmaster standalone: pull puppetmaster from hiera [puppet] - 10https://gerrit.wikimedia.org/r/387371 (https://phabricator.wikimedia.org/T171494) [21:01:14] RECOVERY - HHVM rendering on mw2121 is OK: HTTP OK: HTTP/1.1 200 OK - 79851 bytes in 0.303 second response time [21:01:33] (03CR) 10jerkins-bot: [V: 04-1] puppetmaster standalone: pull puppetmaster from hiera [puppet] - 10https://gerrit.wikimedia.org/r/387371 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [21:02:02] (03CR) 10Andrew Bogott: [C: 031] puppetmaster standalone: pull puppetmaster from hiera [puppet] - 10https://gerrit.wikimedia.org/r/387371 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [21:02:45] !log T179083: Creating commons_T_parsoid_stash_wikitextf0PBY8UXqY8UuiDv.{meta,data} (30 sec delay) [21:02:50] (03CR) 10Rush: [V: 032 C: 032] puppetmaster standalone: pull puppetmaster from hiera [puppet] - 10https://gerrit.wikimedia.org/r/387371 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [21:02:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:02:52] T179083: Cassandra 3.11.0 schema creation seems unreliable - https://phabricator.wikimedia.org/T179083 [21:13:12] (03PS2) 10Rush: openstack: labtestpuppetmaster params adjustments [puppet] - 10https://gerrit.wikimedia.org/r/387366 (https://phabricator.wikimedia.org/T171494) [21:13:28] (03PS3) 10Rush: openstack: labtestpuppetmaster params adjustments [puppet] - 10https://gerrit.wikimedia.org/r/387366 (https://phabricator.wikimedia.org/T171494) [21:20:04] (03CR) 10Rush: [C: 032] openstack: labtestpuppetmaster params adjustments [puppet] - 10https://gerrit.wikimedia.org/r/387366 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [21:46:59] !log awight@tin Started deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) 
[21:47:03] !log restart nova-fullstack to force a run [21:47:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:47:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:47:20] (03Abandoned) 10EBernhardson: Switch CirrusSearch MLR model for enwiki to older model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/379252 (owner: 10EBernhardson) [21:48:42] !log awight@tin Finished deploy [ores/deploy@a0f7d5c]: Smoke test for timeout bug on ores1002 (non-production) (duration: 01m 42s) [21:48:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:50:28] (03Draft1) 10Paladox: Gerrit: Add some soy templates for its-phabricator [puppet] - 10https://gerrit.wikimedia.org/r/387410 [21:50:31] (03Draft2) 10Paladox: Gerrit: Add some soy templates for its-phabricator [puppet] - 10https://gerrit.wikimedia.org/r/387410 [21:52:03] (03CR) 10Paladox: Gerrit: Add some soy templates for its-phabricator (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/387410 (owner: 10Paladox) [21:52:37] (03CR) 10Paladox: Gerrit: Add some soy templates for its-phabricator (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/387410 (owner: 10Paladox) [21:53:19] !log awight@tin Started deploy [ores/deploy@a0f7d5c]: Fun with deployments (non-production) T179336 [21:53:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:53:25] T179336: ORES deploy submodule 504 - https://phabricator.wikimedia.org/T179336 [21:55:22] !log awight@tin Finished deploy [ores/deploy@a0f7d5c]: Fun with deployments (non-production) T179336 (duration: 02m 03s) [21:55:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:03:32] (03CR) 10Chad: [C: 031] Gerrit: Add some soy templates for its-phabricator [puppet] - 10https://gerrit.wikimedia.org/r/387410 (owner: 10Paladox) [22:09:20] (03CR) 10EBernhardson: "model uploaded to deployment-prep" [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/386219 (owner: 10EBernhardson) [22:11:36] (03PS1) 10Rush: admin: wmcs-roots got lost on labpuppetmaster1002 [puppet] - 10https://gerrit.wikimedia.org/r/387433 (https://phabricator.wikimedia.org/T171494) [22:11:49] (03PS2) 10Rush: admin: wmcs-roots got lost on labpuppetmaster1002 [puppet] - 10https://gerrit.wikimedia.org/r/387433 (https://phabricator.wikimedia.org/T171494) [22:18:07] (03CR) 10Rush: [C: 032] admin: wmcs-roots got lost on labpuppetmaster1002 [puppet] - 10https://gerrit.wikimedia.org/r/387433 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [22:52:32] oh goodie. zuul was filled up just in time for SWAT :P [23:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: My dear minions, it's time we take the moon! Just kidding. Time for Evening SWAT (Max 8 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171030T2300). [23:00:04] ebernhardson: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. 
[23:01:00] * ebernhardson clicks merge and hopes for the best [23:01:16] (03CR) 10EBernhardson: [C: 032] Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386219 (owner: 10EBernhardson) [23:01:51] ebernhardson: blame https://phabricator.wikimedia.org/T168353 :( [23:02:00] tons of small patches across many repos [23:06:06] (03PS1) 10BryanDavis: labs: Rebranding for Puppet failure emails [puppet] - 10https://gerrit.wikimedia.org/r/387492 (https://phabricator.wikimedia.org/T168480) [23:08:05] (03CR) 10Paladox: [C: 031] labs: Rebranding for Puppet failure emails [puppet] - 10https://gerrit.wikimedia.org/r/387492 (https://phabricator.wikimedia.org/T168480) (owner: 10BryanDavis) [23:11:55] PROBLEM - Check Varnish expiry mailbox lag on cp4025 is CRITICAL: CRITICAL: expiry mailbox lag is 2076318 [23:14:48] ebernhardson: you have a merge conflict [23:15:15] * greg-g was wondering why it wasn't announcing in here [23:15:25] greg-g: ahh, i see. thanks! [23:15:29] (03PS2) 10EBernhardson: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386219 [23:15:41] (03CR) 10EBernhardson: [C: 032] Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386219 (owner: 10EBernhardson) [23:16:00] i've got wmf.4 and wmf.5 patches going too, although those hit the expedited queue [23:19:38] !log ebernhardson@tin Synchronized php-1.31.0-wmf.5/extensions/CirrusSearch/: Fix highlight tags inside section anchors of search results (duration: 01m 03s) [23:19:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:21:09] (03Merged) 10jenkins-bot: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386219 (owner: 10EBernhardson) [23:21:19] (03CR) 10jenkins-bot: Update CirrusSearch enwiki MLR model [mediawiki-config] - 10https://gerrit.wikimedia.org/r/386219 (owner: 10EBernhardson) [23:22:28] !log ebernhardson@tin 
Synchronized php-1.31.0-wmf.4/extensions/CirrusSearch/: Fix highlight tags inside section anchors of search results (duration: 01m 01s) [23:22:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:28:13] !log ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: Update cirrussearch MLR model for enwiki (duration: 00m 50s) [23:28:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:30:27] SWAT complete. [23:59:15] 10Operations, 10Scap, 10Release-Engineering-Team (Watching / External): Scap: Standardize git version - https://phabricator.wikimedia.org/T179353#3721967 (10thcipriani)