[01:04:26] PROBLEM - Check health of redis instance on 6481 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1508115861 600 - REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 4173091 keys, up 4 minutes 18 seconds - replication_delay is 1508115861
[01:05:06] PROBLEM - Check health of redis instance on 6480 on rdb2005 is CRITICAL: CRITICAL: replication_delay is 1508115898 600 - REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 4175872 keys, up 4 minutes 56 seconds - replication_delay is 1508115898
[01:05:06] PROBLEM - Check health of redis instance on 6479 on rdb2005 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6479
[01:06:06] RECOVERY - Check health of redis instance on 6480 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6480 has 1 databases (db0) with 4170515 keys, up 6 minutes - replication_delay is 0
[01:06:07] RECOVERY - Check health of redis instance on 6479 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6479 has 1 databases (db0) with 4172607 keys, up 6 minutes - replication_delay is 0
[01:06:26] RECOVERY - Check health of redis instance on 6481 on rdb2005 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6481 has 1 databases (db0) with 4169887 keys, up 6 minutes 18 seconds - replication_delay is 0
[02:28:58] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.3) (duration: 07m 44s)
[02:29:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:35:34] !log l10nupdate@tin ResourceLoader cache refresh completed at Mon Oct 16 02:35:34 UTC 2017 (duration 6m 36s)
[02:35:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:27:06] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 621.37 seconds
[04:17:05] (PS2) BBlack: Various minor improvements/updates [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384165
[04:17:07] (PS3) BBlack: Remove multi-head support from strq, move into purger. [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382865
[04:17:09] (PS3) BBlack: Move all URL parsing and HTTP req generation to receiver [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382867
[04:17:11] (PS3) BBlack: Chain the purgers together and split their stats [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382868
[04:17:13] (PS3) BBlack: Bump http-parser upstream src to 2.7.1 + fixups [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382870
[04:17:15] (PS2) BBlack: Refactor (rewrite?!) purging code [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384167
[04:17:17] (PS3) BBlack: Release 0.1.0 [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382873
[04:17:19] (PS1) BBlack: strq+purger: refactor, simplify, add queue delays [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384433
[04:17:21] (PS1) BBlack: Rework stats further [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384434
[04:17:23] (PS1) BBlack: link against jemalloc [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384435
[04:17:47] (Abandoned) BBlack: purger: optimize write watcher churn [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384168 (owner: BBlack)
[04:18:09] (Abandoned) BBlack: bump copyright years [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382872 (owner: BBlack)
[04:18:14] (Abandoned) BBlack: http-parser usage updates [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382871 (owner: BBlack)
[04:18:17] (Abandoned) BBlack: split per-purger stats [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382869 (owner: BBlack)
[04:18:20] (Abandoned) BBlack: reduce CONN_WAIT_MAX to 8s [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/384164 (owner: BBlack)
[04:18:23] (Abandoned) BBlack: Move the strq object completely inside the purger [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382866 (owner: BBlack)
[04:18:25] (Abandoned) BBlack: Add inpkts_sent metric [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382864 (owner: BBlack)
[04:18:28] (Abandoned) BBlack: raise PERSIST_REQS to 100K [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382863 (owner: BBlack)
[04:18:31] (Abandoned) BBlack: assertion/warning nits [software/varnish/vhtcpd] - https://gerrit.wikimedia.org/r/382862 (owner: BBlack)
[04:32:26] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 252.21 seconds
[05:22:44] <_joe_> morning
[05:31:29] hey _joe_ o/
[05:33:28] (PS1) Marostegui: db-eqiad.php: Depool db1073 [mediawiki-config] - https://gerrit.wikimedia.org/r/384436
[05:35:18] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1073 [mediawiki-config] - https://gerrit.wikimedia.org/r/384436 (owner: Marostegui)
[05:36:27] (Merged) jenkins-bot: db-eqiad.php: Depool db1073 [mediawiki-config] - https://gerrit.wikimedia.org/r/384436 (owner: Marostegui)
[05:36:37] (CR) jenkins-bot: db-eqiad.php: Depool db1073 [mediawiki-config] - https://gerrit.wikimedia.org/r/384436 (owner: Marostegui)
[05:36:55] !log Optimize templatelinks and pagelinks on db1073 - 174509
[05:37:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:37:08] !log Optimize templatelinks and pagelinks on db1073 - T174509
[05:37:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:37:15] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[05:37:32] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1073 - T174509 (duration: 00m 47s)
[05:37:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:23] (PS1) Marostegui: db-eqiad.php: Depool db1074 [mediawiki-config] - https://gerrit.wikimedia.org/r/384438 (https://phabricator.wikimedia.org/T174509)
[05:45:43] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1074 [mediawiki-config] - https://gerrit.wikimedia.org/r/384438 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:46:53] (Merged) jenkins-bot: db-eqiad.php: Depool db1074 [mediawiki-config] - https://gerrit.wikimedia.org/r/384438 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:47:01] (CR) jenkins-bot: db-eqiad.php: Depool db1074 [mediawiki-config] - https://gerrit.wikimedia.org/r/384438 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:47:18] !log Optimize pagelinks and templatelinks on db1074 - T174509
[05:47:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:47:24] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[05:47:47] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1074 - T174509 (duration: 00m 46s)
[05:47:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:50:30] (PS1) Marostegui: db-eqiad.php: Depool db1084 [mediawiki-config] - https://gerrit.wikimedia.org/r/384439 (https://phabricator.wikimedia.org/T174509)
[05:51:48] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1084 [mediawiki-config] - https://gerrit.wikimedia.org/r/384439 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:53:36] !log Optimize recentchanges, pagelinks and templatelinks on db1084 - T174509 T177772
[05:53:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:53:45] T177772: Purge 90% of rows from recentchanges (and posibly defragment) from commonswiki and ruwiki (the ones with source:wikidata) - https://phabricator.wikimedia.org/T177772
[05:53:45] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[05:53:47] (Merged) jenkins-bot: db-eqiad.php: Depool db1084 [mediawiki-config] - https://gerrit.wikimedia.org/r/384439 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:54:52] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1084 - T174509 T177772 (duration: 00m 46s)
[05:54:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:56:18] (PS1) Marostegui: db-eqiad.php: Depool db1070 [mediawiki-config] - https://gerrit.wikimedia.org/r/384440 (https://phabricator.wikimedia.org/T174509)
[05:56:23] (CR) jenkins-bot: db-eqiad.php: Depool db1084 [mediawiki-config] - https://gerrit.wikimedia.org/r/384439 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:57:51] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1070 [mediawiki-config] - https://gerrit.wikimedia.org/r/384440 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:59:04] (Merged) jenkins-bot: db-eqiad.php: Depool db1070 [mediawiki-config] - https://gerrit.wikimedia.org/r/384440 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:59:13] (CR) jenkins-bot: db-eqiad.php: Depool db1070 [mediawiki-config] - https://gerrit.wikimedia.org/r/384440 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[05:59:21] !log Optimize recentchanges, pagelinks and templatelinks on db1070 - T174509
[05:59:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:59:28] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[06:00:04] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1070 - T174509 (duration: 00m 46s)
[06:00:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:01:16] (PS1) Marostegui: db-eqiad.php: Depool db1093 [mediawiki-config] - https://gerrit.wikimedia.org/r/384441 (https://phabricator.wikimedia.org/T174509)
[06:02:37] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1093 [mediawiki-config] - https://gerrit.wikimedia.org/r/384441 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[06:03:54] (Merged) jenkins-bot: db-eqiad.php: Depool db1093 [mediawiki-config] - https://gerrit.wikimedia.org/r/384441 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[06:04:15] !log Optimize recentchanges, pagelinks and templatelinks on db1093 - T174509 T177772
[06:04:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:04:22] T177772: Purge 90% of rows from recentchanges (and posibly defragment) from commonswiki and ruwiki (the ones with source:wikidata) - https://phabricator.wikimedia.org/T177772
[06:04:50] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1093 - T174509 T177772 (duration: 00m 46s)
[06:04:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:04:57] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[06:06:02] (PS1) Marostegui: db-eqiad.php: Depool db1094 [mediawiki-config] - https://gerrit.wikimedia.org/r/384443 (https://phabricator.wikimedia.org/T174509)
[06:06:24] (CR) jenkins-bot: db-eqiad.php: Depool db1093 [mediawiki-config] - https://gerrit.wikimedia.org/r/384441 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[06:07:40] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1094 [mediawiki-config] - https://gerrit.wikimedia.org/r/384443 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[06:08:52] (Merged) jenkins-bot: db-eqiad.php: Depool db1094 [mediawiki-config] - https://gerrit.wikimedia.org/r/384443 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[06:09:03] (CR) jenkins-bot: db-eqiad.php: Depool db1094 [mediawiki-config] - https://gerrit.wikimedia.org/r/384443 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[06:12:57] !log Optimize pagelinks and templatelinks on db1094 - T174509
[06:13:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:13:04] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[06:17:16] PROBLEM - Check Varnish expiry mailbox lag on cp4026 is CRITICAL: CRITICAL: expiry mailbox lag is 2023831
[06:18:17] (PS1) Marostegui: mariadb: Add db2084 to s5 and later s6 [puppet] - https://gerrit.wikimedia.org/r/384445 (https://phabricator.wikimedia.org/T170662)
[06:19:22] (PS1) Marostegui: s5.hosts: Add db2084 to s5 [software] - https://gerrit.wikimedia.org/r/384446 (https://phabricator.wikimedia.org/T170662)
[06:19:37] PROBLEM - Router interfaces on cr1-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 66, down: 1, dormant: 0, excluded: 0, unused: 0
[06:19:46] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0
[06:20:36] (CR) Marostegui: [C: 2] s5.hosts: Add db2084 to s5 [software] - https://gerrit.wikimedia.org/r/384446 (https://phabricator.wikimedia.org/T170662) (owner: Marostegui)
[06:20:57] PROBLEM - nova-compute process on labvirt1013 is CRITICAL: PROCS CRITICAL: 2 processes with regex args ^/usr/bin/pytho[n] /usr/bin/nova-compute
[06:21:24] (Merged) jenkins-bot: s5.hosts: Add db2084 to s5 [software] - https://gerrit.wikimedia.org/r/384446 (https://phabricator.wikimedia.org/T170662) (owner: Marostegui)
[06:21:49] (PS2) Marostegui: mariadb: Add db2084 to s5 and later s8 [puppet] - https://gerrit.wikimedia.org/r/384445 (https://phabricator.wikimedia.org/T170662)
[06:21:56] RECOVERY - nova-compute process on labvirt1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n] /usr/bin/nova-compute
[06:23:13] (CR) Marostegui: [C: 2] "Puppet looks good: https://puppet-compiler.wmflabs.org/compiler02/8324/" [puppet] - https://gerrit.wikimedia.org/r/384445 (https://phabricator.wikimedia.org/T170662) (owner: Marostegui)
[06:25:12] !log Stop MySQL on db2079 to clone db2084 from it - T170662
[06:25:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:25:19] T170662: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662
[06:27:33] (CR) Marostegui: [C: 1] maintain-views: Add log types to logging_whitelist [puppet] - https://gerrit.wikimedia.org/r/384201 (https://phabricator.wikimedia.org/T178052) (owner: BryanDavis)
[06:30:38] (PS1) Marostegui: db-eqiad,db-codfw.php: Add db2084 to the config [mediawiki-config] - https://gerrit.wikimedia.org/r/384447 (https://phabricator.wikimedia.org/T170662)
[06:35:56] RECOVERY - Router interfaces on cr1-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 68, down: 0, dormant: 0, excluded: 0, unused: 0
[06:35:56] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[06:52:19] Operations, LDAP-Access-Requests, WMF-NDA-Requests, User-Addshore: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3686454 (MoritzMuehlenhoff) >>! In T177599#3684322, @Dzahn wrote: > Would it make sense if i add the wmde group members? "ldap_wmde_users" on...
[07:10:51] (CR) Marostegui: [C: 2] db-eqiad,db-codfw.php: Add db2084 to the config [mediawiki-config] - https://gerrit.wikimedia.org/r/384447 (https://phabricator.wikimedia.org/T170662) (owner: Marostegui)
[07:11:54] (Merged) jenkins-bot: db-eqiad,db-codfw.php: Add db2084 to the config [mediawiki-config] - https://gerrit.wikimedia.org/r/384447 (https://phabricator.wikimedia.org/T170662) (owner: Marostegui)
[07:12:11] (CR) jenkins-bot: db-eqiad,db-codfw.php: Add db2084 to the config [mediawiki-config] - https://gerrit.wikimedia.org/r/384447 (https://phabricator.wikimedia.org/T170662) (owner: Marostegui)
[07:13:13] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Add db2084 to the config - T170662 (duration: 00m 47s)
[07:13:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:13:21] T170662: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662
[07:14:15] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Add db2084 to the config - T170662 (duration: 00m 46s)
[07:14:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:17:09] (PS1) Marostegui: db-eqiad.php: Depool db1072 to fix data drifts [mediawiki-config] - https://gerrit.wikimedia.org/r/384448 (https://phabricator.wikimedia.org/T164488)
[07:17:11] (PS3) Giuseppe Lavagetto: Port docker builder [docker-images/docker-pkg] - https://gerrit.wikimedia.org/r/384081 (https://phabricator.wikimedia.org/T177276)
[07:19:01] !log Optimize table ores_classification on db1073 - T159753
[07:19:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:19:08] T159753: Concerns about ores_classification table size on enwiki - https://phabricator.wikimedia.org/T159753
[07:19:44] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1072 to fix data drifts [mediawiki-config] - https://gerrit.wikimedia.org/r/384448 (https://phabricator.wikimedia.org/T164488) (owner: Marostegui)
[07:21:00] (Merged) jenkins-bot: db-eqiad.php: Depool db1072 to fix data drifts [mediawiki-config] - https://gerrit.wikimedia.org/r/384448 (https://phabricator.wikimedia.org/T164488) (owner: Marostegui)
[07:21:06] (CR) jenkins-bot: db-eqiad.php: Depool db1072 to fix data drifts [mediawiki-config] - https://gerrit.wikimedia.org/r/384448 (https://phabricator.wikimedia.org/T164488) (owner: Marostegui)
[07:21:36] PROBLEM - Host mw2251 is DOWN: PING CRITICAL - Packet loss = 100%
[07:22:28] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1072 - T164488 (duration: 00m 46s)
[07:22:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:22:36] T164488: Run pt-table-checksum on s3 - https://phabricator.wikimedia.org/T164488
[07:22:36] RECOVERY - Host mw2251 is UP: PING OK - Packet loss = 0%, RTA = 36.06 ms
[07:30:34] !log Stop db1103 and db1072 at the same position to fix data drifts - T164488
[07:30:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:30:41] T164488: Run pt-table-checksum on s3 - https://phabricator.wikimedia.org/T164488
[07:37:16] RECOVERY - Check Varnish expiry mailbox lag on cp4026 is OK: OK: expiry mailbox lag is 0
[07:42:27] !log installing NSS security updates
[07:42:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:56:30] !log Stop db1103 and db1038 at the same position to fix data drifts - T164488
[07:56:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:56:37] T164488: Run pt-table-checksum on s3 - https://phabricator.wikimedia.org/T164488
[08:12:13] (CR) Gehel: [C: -1] "Minor fix needed on r_lang::bioc, otherwise, LGTM" (5 comments) [puppet] - https://gerrit.wikimedia.org/r/383916 (https://phabricator.wikimedia.org/T178096) (owner: Bearloga)
[08:15:57] Operations, MW-1.30-release-notes, Performance-Team, Thumbor, and 2 others: Limit maximum x-content-dimension size to avoid hitting nginx limits - https://phabricator.wikimedia.org/T167034#3686547 (Gilles)
[08:16:00] Operations, Performance-Team, Thumbor, MW-1.31-release-notes (WMF-deploy-2017-09-26 (1.31.0-wmf.1)), User-fgiunchedi: Remove X-Content-Dimensions for multipage originals - https://phabricator.wikimedia.org/T175689#3686545 (Gilles) Open>Resolved Cleanup should be complete for TIFFs, PD...
[08:18:48] (PS4) Giuseppe Lavagetto: Port docker builder [docker-images/docker-pkg] - https://gerrit.wikimedia.org/r/384081 (https://phabricator.wikimedia.org/T177276)
[08:24:11] (PS1) Marostegui: [WIP]: Support multiinstance in core servers [puppet] - https://gerrit.wikimedia.org/r/384452
[08:24:41] (CR) jerkins-bot: [V: -1] [WIP]: Support multiinstance in core servers [puppet] - https://gerrit.wikimedia.org/r/384452 (owner: Marostegui)
[08:25:00] (PS1) Marostegui: db-eqiad,db-codfw.php: Remove db1050 from config [mediawiki-config] - https://gerrit.wikimedia.org/r/384453 (https://phabricator.wikimedia.org/T178162)
[08:27:47] (CR) Marostegui: [C: 2] db-eqiad,db-codfw.php: Remove db1050 from config [mediawiki-config] - https://gerrit.wikimedia.org/r/384453 (https://phabricator.wikimedia.org/T178162) (owner: Marostegui)
[08:29:02] (Merged) jenkins-bot: db-eqiad,db-codfw.php: Remove db1050 from config [mediawiki-config] - https://gerrit.wikimedia.org/r/384453 (https://phabricator.wikimedia.org/T178162) (owner: Marostegui)
[08:29:11] (CR) jenkins-bot: db-eqiad,db-codfw.php: Remove db1050 from config [mediawiki-config] - https://gerrit.wikimedia.org/r/384453 (https://phabricator.wikimedia.org/T178162) (owner: Marostegui)
[08:29:13] (PS2) Gehel: Temporarily silence noisy warnings for dictionary upgrade [puppet] - https://gerrit.wikimedia.org/r/384114 (https://phabricator.wikimedia.org/T175948) (owner: Smalyshev)
[08:30:07] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Remove db1050 from config - T178162 (duration: 00m 46s)
[08:30:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:30:15] T178162: Decommission db1050 - https://phabricator.wikimedia.org/T178162
[08:30:49] (CR) Gehel: [C: 2] Temporarily silence noisy warnings for dictionary upgrade [puppet] - https://gerrit.wikimedia.org/r/384114 (https://phabricator.wikimedia.org/T175948) (owner: Smalyshev)
[08:31:02] !log marostegui@tin Synchronized wmf-config/db-codfw.php: Remove db1050 from config - T178162 (duration: 00m 46s)
[08:31:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:31:08] (PS1) Marostegui: s6.hosts: Remove db1050 [software] - https://gerrit.wikimedia.org/r/384454 (https://phabricator.wikimedia.org/T178162)
[08:33:07] (CR) Marostegui: [C: 2] s6.hosts: Remove db1050 [software] - https://gerrit.wikimedia.org/r/384454 (https://phabricator.wikimedia.org/T178162) (owner: Marostegui)
[08:33:23] (PS1) Marostegui: mariadb: Remove db1050 from config [puppet] - https://gerrit.wikimedia.org/r/384455 (https://phabricator.wikimedia.org/T178162)
[08:33:55] (Merged) jenkins-bot: s6.hosts: Remove db1050 [software] - https://gerrit.wikimedia.org/r/384454 (https://phabricator.wikimedia.org/T178162) (owner: Marostegui)
[08:35:58] (CR) Jcrespo: [C: 1] base/icinga: if on labs, don't page for mysql procs [puppet] - https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) (owner: Dzahn)
[08:36:20] (CR) Jcrespo: [C: 1] "I have not checked that it works." [puppet] - https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) (owner: Dzahn)
[08:43:09] (PS1) Muehlenhoff: Add library hint for NSS [puppet] - https://gerrit.wikimedia.org/r/384457
[08:43:45] (CR) Muehlenhoff: [C: 2] Add library hint for NSS [puppet] - https://gerrit.wikimedia.org/r/384457 (owner: Muehlenhoff)
[08:44:11] (PS2) Marostegui: mariadb: Remove db1050 from config [puppet] - https://gerrit.wikimedia.org/r/384455 (https://phabricator.wikimedia.org/T178162)
[08:44:42] (CR) Marostegui: [C: 2] "Puppet looks good: https://puppet-compiler.wmflabs.org/compiler02/8325/" [puppet] - https://gerrit.wikimedia.org/r/384455 (https://phabricator.wikimedia.org/T178162) (owner: Marostegui)
[08:45:06] moritzm: ok to merge your changes?
[08:46:10] !log Remove db1050 from tendril - 178162
[08:46:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:46:23] !log Remove db1050 from tendril - T178162
[08:46:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:46:29] T178162: Decommission db1050 - https://phabricator.wikimedia.org/T178162
[08:50:31] moritzm: ping :)
[08:50:48] marostegui: yes, please go ahead
[08:50:52] \o/
[08:50:59] merging
[08:53:52] !log Stop MySQL on db1050 as it will be decommissioned - T178162
[08:53:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:53:58] T178162: Decommission db1050 - https://phabricator.wikimedia.org/T178162
[08:55:32] (CR) Muehlenhoff: [C: 1] "Looks good; in stretch dnsutils now uses priority "standard" so it no longer gets added by default in the d-i base installation." [puppet] - https://gerrit.wikimedia.org/r/383657 (owner: Dzahn)
[08:56:35] Operations, ops-eqiad, DBA, Patch-For-Review: Decommission db1050 - https://phabricator.wikimedia.org/T178162#3686629 (Marostegui) a:Cmjohnson db1050 can now be decommissioned by @Cmjohnson @Cmjohnson remember that one of the disks is failed, so it would be good to identify that one before de...
[08:56:54] Operations, ops-eqiad, DBA, Patch-For-Review: Decommission db1050 - https://phabricator.wikimedia.org/T178162#3686644 (Marostegui)
[08:57:26] Operations, ops-eqiad, DBA, Patch-For-Review: Decommission db1050 - https://phabricator.wikimedia.org/T178162#3682923 (Marostegui)
[08:57:41] (PS1) Marostegui: Revert "db-eqiad.php: Depool db1093" [mediawiki-config] - https://gerrit.wikimedia.org/r/384465
[08:57:43] (CR) DCausse: [cirrus] Deploy recall A/B on enwiki (2 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502) (owner: DCausse)
[08:57:50] (PS2) Marostegui: Revert "db-eqiad.php: Depool db1093" [mediawiki-config] - https://gerrit.wikimedia.org/r/384465
[09:00:01] (CR) Marostegui: [C: 2] Revert "db-eqiad.php: Depool db1093" [mediawiki-config] - https://gerrit.wikimedia.org/r/384465 (owner: Marostegui)
[09:01:15] (Merged) jenkins-bot: Revert "db-eqiad.php: Depool db1093" [mediawiki-config] - https://gerrit.wikimedia.org/r/384465 (owner: Marostegui)
[09:02:08] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1093 - T174509 T177772 (duration: 00m 46s)
[09:02:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:02:16] T177772: Purge 90% of rows from recentchanges (and posibly defragment) from commonswiki and ruwiki (the ones with source:wikidata) - https://phabricator.wikimedia.org/T177772
[09:02:16] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[09:04:42] PROBLEM - Nginx local proxy to apache on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:04:42] PROBLEM - HHVM rendering on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:05:41] (PS1) Marostegui: db-eqiad.php: Depool db1088 [mediawiki-config] - https://gerrit.wikimedia.org/r/384466 (https://phabricator.wikimedia.org/T174509)
[09:05:42] RECOVERY - Nginx local proxy to apache on mw1315 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 619 bytes in 4.300 second response time
[09:06:02] PROBLEM - Apache HTTP on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:06:48] (CR) Jcrespo: [C: 1] db-eqiad.php: Depool db1088 [mediawiki-config] - https://gerrit.wikimedia.org/r/384466 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[09:07:01] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1088 [mediawiki-config] - https://gerrit.wikimedia.org/r/384466 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[09:07:16] !log Optimize recentchanges, pagelinks and templatelinks on db1088 - https://phabricator.wikimedia.org/T174509 https://phabricator.wikimedia.org/T177772
[09:07:19] (PS4) DCausse: [cirrus] Prepare profiles for the recall A/B on enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502)
[09:07:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:07:42] RECOVERY - HHVM rendering on mw1315 is OK: HTTP OK: HTTP/1.1 200 OK - 78498 bytes in 4.495 second response time
[09:07:52] RECOVERY - Apache HTTP on mw1315 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.033 second response time
[09:08:55] (Merged) jenkins-bot: db-eqiad.php: Depool db1088 [mediawiki-config] - https://gerrit.wikimedia.org/r/384466 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[09:09:44] (PS5) DCausse: [cirrus] Prepare profiles for the recall A/B test on enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502)
[09:10:06] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1088 - T174509 T177772 (duration: 00m 46s)
[09:10:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:10:13] T177772: Purge 90% of rows from recentchanges (and posibly defragment) from commonswiki and ruwiki (the ones with source:wikidata) - https://phabricator.wikimedia.org/T177772
[09:10:14] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509
[09:14:32] PROBLEM - HHVM rendering on mw1296 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:15:23] RECOVERY - HHVM rendering on mw1296 is OK: HTTP OK: HTTP/1.1 200 OK - 78538 bytes in 0.138 second response time
[09:16:36] (CR) jenkins-bot: Revert "db-eqiad.php: Depool db1093" [mediawiki-config] - https://gerrit.wikimedia.org/r/384465 (owner: Marostegui)
[09:16:38] (CR) jenkins-bot: db-eqiad.php: Depool db1088 [mediawiki-config] - https://gerrit.wikimedia.org/r/384466 (https://phabricator.wikimedia.org/T174509) (owner: Marostegui)
[09:17:37] (CR) Giuseppe Lavagetto: [C: 2] Fix up puppet-compiler for labs usage [software/puppet-compiler] - https://gerrit.wikimedia.org/r/325042 (owner: Gerrit Patch Uploader)
[09:24:17] (CR) Alexandros Kosiaris: [C: 1] base/standard_packages: add dnsutils [puppet] - https://gerrit.wikimedia.org/r/383657 (owner: Dzahn)
[09:25:32] (CR) Alexandros Kosiaris: [C: -1] "pedantic inline comment, otherwise LGTM" (2 comments) [puppet] - https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) (owner: Dzahn)
[09:32:54] (CR) Ema: [C: 2] runcommand: do not crash on empty runcommand.arguments [debs/pybal] - https://gerrit.wikimedia.org/r/384000 (https://phabricator.wikimedia.org/T178149) (owner: Ema)
[09:33:03] (PS1) Ema: runcommand: do not crash on empty runcommand.arguments [debs/pybal] (1.14) - https://gerrit.wikimedia.org/r/384480 (https://phabricator.wikimedia.org/T178149)
[09:34:01] (CR) Ema: [V: 2 C: 2] runcommand: do not crash on empty runcommand.arguments [debs/pybal] (1.14) - https://gerrit.wikimedia.org/r/384480 (https://phabricator.wikimedia.org/T178149) (owner: Ema)
[09:36:36] (PS1) Ema: 1.14.2: do not crash on empty runcommand.arguments [debs/pybal] - https://gerrit.wikimedia.org/r/384483 (https://phabricator.wikimedia.org/T178149)
[09:37:02] !log restarting elasticsearch on elastic1017
[09:37:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:38:20] (CR) Ema: [C: 2] 1.14.2: do not crash on empty runcommand.arguments [debs/pybal] - https://gerrit.wikimedia.org/r/384483 (https://phabricator.wikimedia.org/T178149) (owner: Ema)
[09:38:33] (PS1) Ema: 1.14.2: do not crash on empty runcommand.arguments [debs/pybal] (1.14) - https://gerrit.wikimedia.org/r/384484 (https://phabricator.wikimedia.org/T178149)
[09:38:47] (CR) Ema: [V: 2 C: 2] 1.14.2: do not crash on empty runcommand.arguments [debs/pybal] (1.14) - https://gerrit.wikimedia.org/r/384484 (https://phabricator.wikimedia.org/T178149) (owner: Ema)
[09:39:00] !log restarting cassandra on cerium/xenon/praseodymium to pick up NSS security update
[09:39:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:40:38] !log restarting cassandra on restbase1007 to pick up NSS security update
[09:40:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:43:53] !log pybal 1.14.2 uploaded to apt.w.o
[09:43:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:44:49] !log upgrading ulsfo LVSs to pybal 1.14.2 (T178149, T177815)
[09:44:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:44:56] T177815: Alerts on LVS services with one single realserver - https://phabricator.wikimedia.org/T177815
[09:44:56] T178149: RunCommandMonitoringProtocol throws an exception if runcommand.arguments is not specified - https://phabricator.wikimedia.org/T178149
[09:54:33] !log upgrading esams LVSs to pybal 1.14.2 (T178149, T177815)
[09:54:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:54:41] T177815: Alerts on LVS services with one single realserver - https://phabricator.wikimedia.org/T177815
[09:54:41] T178149: RunCommandMonitoringProtocol throws an exception if runcommand.arguments is not specified - https://phabricator.wikimedia.org/T178149
[10:03:13] !log upgrading codfw LVSs to pybal 1.14.2 (T178149, T177815)
[10:03:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:03:21] T177815: Alerts on LVS services with one single realserver - https://phabricator.wikimedia.org/T177815
[10:03:21] T178149: RunCommandMonitoringProtocol throws an exception if runcommand.arguments is not specified - https://phabricator.wikimedia.org/T178149
[10:13:41] (PS1) Volans: Documentation: fix ReadTheDocs CSS override [software/cumin] - https://gerrit.wikimedia.org/r/384489
[10:13:43] (PS1) Volans: Documentation: amend changelog [software/cumin] - https://gerrit.wikimedia.org/r/384490 (https://phabricator.wikimedia.org/T159308)
[10:37:51] (CR) MarcoAurelio: [C: -1] Enable blocking feature of abuse filter in fawikiquote (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: Ladsgroup)
[10:42:55] (PS3) MarcoAurelio: AbuseFilter configuration changes for es.wikiquote [mediawiki-config] - https://gerrit.wikimedia.org/r/384265 (https://phabricator.wikimedia.org/T177760)
[10:44:13] (PS4) MarcoAurelio: AbuseFilter configuration changes for es.wikiquote [mediawiki-config] - https://gerrit.wikimedia.org/r/384265 (https://phabricator.wikimedia.org/T177760)
[10:48:21] !log cleaning up ores_classification table in wikidatawiki (T159753)
[10:48:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:48:29] T159753: Concerns about ores_classification table size on enwiki - https://phabricator.wikimedia.org/T159753
[10:53:25] (CR) Volans: [C: 2] Documentation: fix ReadTheDocs CSS override [software/cumin] - https://gerrit.wikimedia.org/r/384489 (owner: Volans)
[10:55:45] (Merged) jenkins-bot: Documentation: fix ReadTheDocs CSS override [software/cumin] - https://gerrit.wikimedia.org/r/384489 (owner: Volans)
[10:58:02] (CR) Volans: [C: 2] Documentation: amend changelog [software/cumin] - https://gerrit.wikimedia.org/r/384490 (https://phabricator.wikimedia.org/T159308) (owner: Volans)
[11:00:27] (Merged) jenkins-bot: Documentation: amend changelog [software/cumin] - https://gerrit.wikimedia.org/r/384490 (https://phabricator.wikimedia.org/T159308) (owner: Volans)
[11:10:18] !log mobrovac@tin Started restart [electron-render/deploy@8dd5f13]: Electron hanging - T174916
[11:10:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:10:25] T174916: electron/pdfrender hangs - https://phabricator.wikimedia.org/T174916
[11:14:10] (PS1) Marostegui: Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - https://gerrit.wikimedia.org/r/384495
[11:14:15] (PS2) Marostegui: Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - https://gerrit.wikimedia.org/r/384495
[11:19:22] (CR) Marostegui: [C: 2] Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - https://gerrit.wikimedia.org/r/384495 (owner: Marostegui)
[11:20:33] (Merged) jenkins-bot: Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - https://gerrit.wikimedia.org/r/384495 (owner: Marostegui)
[11:20:43] (CR) jenkins-bot: Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - https://gerrit.wikimedia.org/r/384495 (owner: Marostegui)
[11:22:05] !log marostegui@tin
Synchronized wmf-config/db-eqiad.php: Repool db1070 - T174509 (duration: 00m 46s) [11:22:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:22:12] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [11:23:02] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1074" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384497 [11:23:07] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1074" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384497 [11:24:32] PROBLEM - HHVM rendering on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:24:44] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1074" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384497 (owner: 10Marostegui) [11:25:22] PROBLEM - Nginx local proxy to apache on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:25:33] PROBLEM - Apache HTTP on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:27:17] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1074" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384497 (owner: 10Marostegui) [11:27:26] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1074" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384497 (owner: 10Marostegui) [11:28:12] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1074 - T174509 (duration: 00m 46s) [11:28:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:28:20] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [11:43:23] PROBLEM - SSH on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:22] RECOVERY - SSH on mw1315 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0) [11:46:42] (03CR) 10Marostegui: "Apart from the style things, puppet compiles correctly so far: 
https://puppet-compiler.wmflabs.org/compiler02/8326/" [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [11:48:02] RECOVERY - Apache HTTP on mw1315 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.032 second response time [11:48:43] RECOVERY - HHVM rendering on mw1315 is OK: HTTP OK: HTTP/1.1 200 OK - 78497 bytes in 0.107 second response time [11:50:02] PROBLEM - Check systemd state on mw1315 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [11:51:33] RECOVERY - Nginx local proxy to apache on mw1315 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.037 second response time [11:52:12] PROBLEM - Apache HTTP on mw1315 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:53:02] RECOVERY - Apache HTTP on mw1315 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.026 second response time [11:53:50] ^ hmmh, hung kernel thread, having a look [12:00:02] RECOVERY - Check systemd state on mw1315 is OK: OK - running: The system is fully operational [12:09:04] (03PS2) 10Marostegui: [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 [12:09:32] (03CR) 10jerkins-bot: [V: 04-1] [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:12:19] (03PS3) 10Marostegui: [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 [12:12:50] (03CR) 10jerkins-bot: [V: 04-1] [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:16:48] (03PS4) 10Marostegui: [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 [12:17:18] (03CR) 10jerkins-bot: [V: 04-1] [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:30:48] (03PS5) 10Marostegui: [WIP]: Support multiinstance in 
core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 [12:31:16] (03CR) 10Mforns: Removing appInstallId from whitelist (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/384093 (https://phabricator.wikimedia.org/T178174) (owner: 10Nuria) [12:31:16] (03CR) 10jerkins-bot: [V: 04-1] [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:33:24] (03CR) 10Marostegui: "The changes that would be applied to db2084 look good: https://puppet-compiler.wmflabs.org/compiler02/8331/" [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:36:04] (03CR) 10Jcrespo: [WIP]: Support multiinstance in core servers (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:37:41] (03CR) 10Jcrespo: [WIP]: Support multiinstance in core servers (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:38:35] (03CR) 10Huji: "You did not catch my point. 
Of the lines of code from fawiki that you listed above, the first two have nothing to do with whether AbuseFil" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup) [12:38:38] !log upgrading eqiad LVSs to pybal 1.14.2 (T178149, T177815) [12:38:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:38:47] T177815: Alerts on LVS services with one single realserver - https://phabricator.wikimedia.org/T177815 [12:38:47] T178149: RunCommandMonitoringProtocol throws an exception if runcommand.arguments is not specified - https://phabricator.wikimedia.org/T178149 [12:41:08] (03CR) 10Marostegui: [WIP]: Support multiinstance in core servers (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:41:24] (03PS6) 10Marostegui: [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 [12:41:53] (03CR) 10jerkins-bot: [V: 04-1] [WIP]: Support multiinstance in core servers [puppet] - 10https://gerrit.wikimedia.org/r/384452 (owner: 10Marostegui) [12:42:43] RECOVERY - PyBal backends health check on lvs1010 is OK: PYBAL OK - All pools are healthy [12:43:13] this is because of the pybal upgrade ^ [12:46:33] 10Operations, 10Pybal, 10Traffic, 10Patch-For-Review: Alerts on LVS services with one single realserver - https://phabricator.wikimedia.org/T177815#3687153 (10ema) 05Open>03Resolved a:03ema PyBal upgraded to 1.14.2 on all LVS hosts. [12:46:56] 10Operations, 10Pybal, 10Traffic, 10Patch-For-Review: RunCommandMonitoringProtocol throws an exception if runcommand.arguments is not specified - https://phabricator.wikimedia.org/T178149#3687156 (10ema) 05Open>03Resolved a:03ema PyBal upgraded to 1.14.2 on all LVS hosts. 
[12:48:01] (03PS3) 10Ema: varnish: remove varnishlog.py [puppet] - 10https://gerrit.wikimedia.org/r/383586 [12:48:17] (03CR) 10Ema: [V: 032 C: 032] varnish: remove varnishlog.py [puppet] - 10https://gerrit.wikimedia.org/r/383586 (owner: 10Ema) [12:48:46] jouncebot: next [12:48:46] In 0 hour(s) and 11 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T1300) [12:53:54] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1088" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384505 [12:53:57] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1088" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384505 [12:58:11] 10Operations, 10wikidiff2, 10User-Addshore, 10WMDE-QWERTY-Team-Board: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3674094 (10Tobi_WMDE_SW) Thanks @Legoktm! @MoritzMuehlenhoff Which would be the next steps in order to get this to production? [12:59:01] !log restart cassandra on maps1001 [12:59:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:00:06] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: It is that lovely time of the day again! You are hereby commanded to deploy European Mid-day SWAT(Max 8 patches). (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T1300). [13:00:06] kart_, Zoranzoki21, dcausse, and TabbyCat: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. 
[13:00:11] o/ [13:00:15] o/ [13:00:27] (03CR) 10Hashar: [C: 032] Deploy Compact Language Links on the German Wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383357 (https://phabricator.wikimedia.org/T177836) (owner: 10KartikMistry) [13:00:29] o/ [13:00:33] I can swat, unless you insist, hashar? :) [13:00:52] oh, looks like he already started [13:00:55] Isarra: ping? [13:01:02] * kart_ is here [13:01:22] zeljkof: yeah I am going to handle it :) [13:01:28] hashar: if Zoranzoki isn't here, and Isarra neither, perhaps it's best to postpone Timeless deployment [13:01:32] hashar: thanks :) [13:01:51] hashar: we need to run the script I mention post-deployment. [13:01:59] (03CR) 10MarcoAurelio: [C: 04-1] "Sorry, didn't meant to post the full quoted text." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup) [13:02:02] aharoni: is also here to watch :) [13:02:14] o/ [13:02:30] (and I'm not sure Zoran is familiar with the pages to check on wiki to see there isn't too problematic issues) [13:02:37] I have something to add to the SWAT [13:02:40] if it's okay [13:02:40] kart_: please tell me before you run [13:02:45] (03PS1) 10Ladsgroup: labs: enable draftquality for enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384507 (https://phabricator.wikimedia.org/T176183) [13:02:57] https://gerrit.wikimedia.org/r/384507 [13:03:00] aharoni: Sure. [13:03:38] (03CR) 10Dereckson: [C: 04-1] "I'd recommend a dedicated window to deploy this, so we can check carefully what the wikis look like and provide tips to fix common.css, et" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/377864 (https://phabricator.wikimedia.org/T154371) (owner: 10Framawiki) [13:03:44] hashar: note: Please don't put output of script publicly, private output to me and aharoni is fine. 
[13:04:50] kart_: okk [13:04:59] (03PS3) 10Hashar: Deploy Compact Language Links on the German Wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383357 (https://phabricator.wikimedia.org/T177836) (owner: 10KartikMistry) [13:05:11] on which MW version are we running? [13:05:16] 1.30 wmf-what? [13:05:44] (03CR) 10Hashar: [C: 032] Deploy Compact Language Links on the German Wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383357 (https://phabricator.wikimedia.org/T177836) (owner: 10KartikMistry) [13:06:37] hashar, kart_ - I think that you should first run the preferences script [13:07:06] aharoni: kart_ maybe it is easier if you run them yourselves? :D [13:07:17] I'd like to get 384509 cherry-picked, I'll add it to the swat window [13:07:33] aharoni: oh, okay! [13:08:04] (03Merged) 10jenkins-bot: Deploy Compact Language Links on the German Wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383357 (https://phabricator.wikimedia.org/T177836) (owner: 10KartikMistry) [13:08:13] (03CR) 10jenkins-bot: Deploy Compact Language Links on the German Wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383357 (https://phabricator.wikimedia.org/T177836) (owner: 10KartikMistry) [13:09:24] hashar: Let me dry-run first. [13:10:13] kart_: the change is merged at least [13:11:36] Dereckson: for the timeless skin, I am not quite sure what else need to be done really [13:11:53] hashar: I'm doing dry-run, let me finish and then we can deploy. [13:13:03] hashar: only check if all works fine, looks good, and communicate with the wiki interface editor / admins to fix some CSS issues [13:13:22] but if the deployment is at day 1, and this check is at day 8 [13:13:29] Dereckson: then it is an optional skin isn't it ? I mean, user have to opt-in to switch to it ? 
[13:13:36] that will give one week of bad user experience to contributors willing to test the skin [13:13:41] yes it's optionnal [13:14:12] but to have an option, you enable it, and hop, the layout is broken, wouldn't be a good first user experience with this skin [13:14:54] so I'd prefer we postpone this when Isarra or something else knowing the points to check is there [13:15:48] Dereckson: isn't it easier to enable it then users can switch to it and start reporting issues? :) [13:16:59] hashar: I'm running, [13:17:02] mwscript php-1.31.0-wmf.3/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php --dry-run --wiki=dewiki [13:17:05] What is wrong? [13:17:15] On tin. [13:17:23] (03CR) 10Hashar: "This is for labs only. It does not have to be in the SWAT window which is for production ? :)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384507 (https://phabricator.wikimedia.org/T176183) (owner: 10Ladsgroup) [13:17:37] kart_: i dont know? :) [13:17:40] kart_: what is happening [13:18:01] PHP Fatal error: Usage: mwscript scriptName.php —wiki=dbname [13:18:04] hashar: ^ [13:18:08] hashar: I guess so, I'm probably cautious following fr.wikt issue where the workmark and tagline were so long it shifted the design of +- 100px down [13:18:12] do you have some funny hyphen perhaps? [13:18:16] Probably location to call script is wrong? [13:18:25] kart_: flip the last two options. --wiki has to be the first argument [13:18:35] kart_: also I have not updated the submodule on tin [13:18:41] kart_: but I can do it right now [13:18:44] (03PS9) 10Rush: openstack: horizon to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384069 (https://phabricator.wikimedia.org/T171494) [13:19:06] done [13:19:22] OK it works. [13:19:29] kart_: mwscript extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php --wiki=dewiki --dry-run [13:19:35] hashar: yep. 
[13:19:39] well if it works I guess you can run it :) [13:19:45] then I will sync it [13:20:16] dcausse: I am preparing your patches [13:20:22] hashar: thanks [13:21:29] (03PS3) 10Hashar: Add additional namespaces to search results for bnwikisource [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383794 (https://phabricator.wikimedia.org/T178041) (owner: 10DCausse) [13:21:31] (03PS3) 10Hashar: [cirrus] Disable A/B test of MLR on testing on 18 wikis with > 1% of search traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382451 (https://phabricator.wikimedia.org/T177490) (owner: 10DCausse) [13:21:33] (03PS6) 10Hashar: [cirrus] Prepare profiles for the recall A/B test on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502) (owner: 10DCausse) [13:21:41] (03CR) 10Hashar: "I have NOT deployed this patch per Dereckson comment." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/377864 (https://phabricator.wikimedia.org/T154371) (owner: 10Framawiki) [13:22:12] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383794 (https://phabricator.wikimedia.org/T178041) (owner: 10DCausse) [13:23:34] kart_: script is still running isn't it ? [13:23:35] srsly hashar ... the order of the options is important? [13:24:00] hashar: dry-run is done now. [13:24:06] aharoni: mwscript usage is: mwscript <--wiki=foo> [rest of options] [13:24:08] hashar: now running the script. [13:24:22] kart_: can I sync the config change or does the maintenance script has to complete first? [13:24:25] aharoni: do you want to see the output or should I run the script? [13:24:34] hashar: let the script finish, please. [13:24:53] aharoni: ping :) [13:25:46] seeing dry run output can be useful, yes [13:25:57] aharoni: OK. 
[13:26:23] (03Merged) 10jenkins-bot: Add additional namespaces to search results for bnwikisource [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383794 (https://phabricator.wikimedia.org/T178041) (owner: 10DCausse) [13:26:31] (03CR) 10jenkins-bot: Add additional namespaces to search results for bnwikisource [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383794 (https://phabricator.wikimedia.org/T178041) (owner: 10DCausse) [13:28:22] hashar: aharoni is checking dru-run output.. [13:28:27] dry* [13:31:22] 10Operations, 10Puppet, 10User-Joe: Set up puppet catalog diff on host with access to puppetmaster1001 and puppetmaster2001 - https://phabricator.wikimedia.org/T177843#3687274 (10herron) a:03herron [13:31:27] (03CR) 10Ladsgroup: "IIRC I was told to do it in SWAT to sync them with prod just for the sake of synchronization of git log" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384507 (https://phabricator.wikimedia.org/T176183) (owner: 10Ladsgroup) [13:32:58] bah [13:33:07] I should have done the SWAT in another order :D [13:33:31] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382451 (https://phabricator.wikimedia.org/T177490) (owner: 10DCausse) [13:33:37] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502) (owner: 10DCausse) [13:34:26] hashar: we need to revert the change, submitting patch. 
[13:34:38] kart :- [13:34:39] ( [13:34:50] (03PS1) 10KartikMistry: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 [13:35:02] hashar: https://gerrit.wikimedia.org/r/#/c/384513/ [13:35:06] (03PS2) 10Hashar: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [13:35:09] (03CR) 10Hashar: [C: 032] Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [13:35:58] (03Merged) 10jenkins-bot: [cirrus] Disable A/B test of MLR on testing on 18 wikis with > 1% of search traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382451 (https://phabricator.wikimedia.org/T177490) (owner: 10DCausse) [13:36:01] (03Merged) 10jenkins-bot: [cirrus] Prepare profiles for the recall A/B test on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502) (owner: 10DCausse) [13:37:00] 10Operations, 10Ops-Access-Requests, 10DBA, 10cloud-services-team (Kanban): Access to raw database tables on labsdb* for wmcs-admin users - https://phabricator.wikimedia.org/T178128#3681627 (10jcrespo) This is a problem created by cloud people themselves. I told them to create the view user with localhost... 
[13:37:24] (03PS3) 10Muehlenhoff: Configure fixed lock manager ports for labstore NFS servers [puppet] - 10https://gerrit.wikimedia.org/r/357562 (https://phabricator.wikimedia.org/T165136) [13:37:37] dcausse: doing your changes now [13:37:44] ok [13:38:19] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Add additional namespaces to search results for bnwikisource - T178041 (duration: 00m 49s) [13:38:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:38:27] T178041: Special:Search—add additional namespaces to search results for Bengali Wikisource - https://phabricator.wikimedia.org/T178041 [13:38:47] dcausse: I dont think there is anything to test is there? [13:38:51] eg for https://gerrit.wikimedia.org/r/#/c/382451/3 [13:40:02] hashar: it's noop, the only test would be making sure that it's not spammy the logs because of a typo in a config var [13:40:08] hashar: do you have time to redeploy my change? [13:40:23] hashar: we just confirmed that glitch is not a glitch :) [13:40:31] (03CR) 10Rush: [C: 032] openstack: horizon to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384069 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [13:40:42] kart_: ..... [13:40:46] kart_: then send it again :) [13:40:52] hashar: sure. [13:40:53] could it be after doing the pending patches? I'm also waiting :) [13:41:25] hashar: have you seen my comment in the patch? [13:41:30] hashar: the revert is not done yet? [13:41:46] kart_: I did revert it yes [13:41:55] I'm fine with merging and deploying after the SWAT if you think it's okay [13:42:02] Amir1: yeah. 
change that are solely on -labs.php do not have to run via swat [13:42:12] we just +2 them whenever asked [13:42:21] will do at the end of this window if time allow [13:42:28] !log hashar@tin Synchronized wmf-config: [cirrus] Disable A/B test of MLR on testing on 18 wikis with > 1% of search traffic - T177490 (duration: 00m 48s) [13:42:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:42:35] T177490: Stop A/B test on 18 wikis with > 1% of search traffic - https://phabricator.wikimedia.org/T177490 [13:42:45] I can do it too (so I don't bother you) if that's fine for you [13:43:26] hashar: Why I can't revert: https://gerrit.wikimedia.org/r/#/c/384513/ [13:43:39] It says, can't merge in the patch too. [13:43:47] Can you please check? [13:44:22] !log upgrading deployment-mediawiki04 to wikidiff2 1.5.1 [13:44:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:45:07] (03PS3) 10Gehel: mjolnir: cleanup service declaration [puppet] - 10https://gerrit.wikimedia.org/r/384029 [13:45:19] !log hashar@tin Synchronized wmf-config/: [cirrus] Prepare profiles for the recall A/B test on enwiki - T177502 (duration: 00m 48s) [13:45:21] dcausse: all deployed [13:45:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:45:25] T177502: Deploy A/B test to test relaxing the retrieval query filter - https://phabricator.wikimedia.org/T177502 [13:45:26] hashar: thanks! [13:45:38] is it just me, or is beta down? https://en.wikipedia.beta.wmflabs.org [13:46:03] zeljkof: me as well... [13:46:06] zeljkof: same for me [13:46:30] XioNoX: looks like beta is down again :) ^ [13:46:35] hashar: Can you check: https://gerrit.wikimedia.org/r/#/c/384513/ :) [13:46:36] kart_: ohh my revert patch hasnt been merged ( https://gerrit.wikimedia.org/r/#/c/384513/ ) :D [13:46:43] :D [13:46:45] OK. [13:46:45] it's up for me [13:46:47] hashar: did you just break beta? 
;) [13:46:48] (03Abandoned) 10Hashar: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [13:46:57] (03CR) 10Hashar: "Did not get reverted :)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383357 (https://phabricator.wikimedia.org/T177836) (owner: 10KartikMistry) [13:46:59] (03PS1) 10Rush: openstack: horizon correct wmflabsdotorg_admin lookup path [puppet] - 10https://gerrit.wikimedia.org/r/384517 (https://phabricator.wikimedia.org/T171494) [13:47:07] XioNoX, hashar: nevermind, it's back up [13:47:15] (03Restored) 10KartikMistry: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [13:47:18] kart_: so we should deploy it ? [13:47:20] gehel, dcausse: looks like it's back [13:47:29] just like magic :) [13:47:30] zeljkof: yes... [13:47:35] (03CR) 10Gehel: [C: 032] mjolnir: cleanup service declaration [puppet] - 10https://gerrit.wikimedia.org/r/384029 (owner: 10Gehel) [13:47:36] kart_: because now I am confused [13:47:39] hashar: yes. I clicked restore button. [13:47:56] hashar: go ahead and merge, I'm running the script now. [13:47:57] (03CR) 10Rush: [C: 032] openstack: horizon correct wmflabsdotorg_admin lookup path [puppet] - 10https://gerrit.wikimedia.org/r/384517 (https://phabricator.wikimedia.org/T171494) (owner: 10Rush) [13:47:58] kart_: you told me there is no glitch [13:48:05] (03PS2) 10Rush: openstack: horizon correct wmflabsdotorg_admin lookup path [puppet] - 10https://gerrit.wikimedia.org/r/384517 (https://phabricator.wikimedia.org/T171494) [13:48:21] (03PS2) 10Andrew Bogott: maintain-views: Add log types to logging_whitelist [puppet] - 10https://gerrit.wikimedia.org/r/384201 (https://phabricator.wikimedia.org/T178052) (owner: 10BryanDavis) [13:48:22] hashar: Yes. so, we can revert the revert. [13:48:45] kart_: but the revert has not been merged. 
That is why I abandoned the change :) [13:48:54] aha [13:48:57] (03Abandoned) 10Hashar: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [13:49:00] kart_: :) [13:49:18] hashar: OK. So, can I go ahead with script? [13:49:20] (03CR) 10Andrew Bogott: [C: 032] maintain-views: Add log types to logging_whitelist [puppet] - 10https://gerrit.wikimedia.org/r/384201 (https://phabricator.wikimedia.org/T178052) (owner: 10BryanDavis) [13:49:31] (03PS3) 10Andrew Bogott: maintain-views: Add log types to logging_whitelist [puppet] - 10https://gerrit.wikimedia.org/r/384201 (https://phabricator.wikimedia.org/T178052) (owner: 10BryanDavis) [13:49:46] kart_: so tin now has the proper config [13:49:54] kart_: but I ahvent synced the cluster yet [13:50:38] !log upgrading deployment-mediawiki0[56] to wikidiff2 1.5.1 [13:50:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:50:54] hashar: it should be fine, I'm going ahead and running script. [13:51:09] hashar: you can deploy in mwdebug first and please let me know. [13:51:18] aharoni: running script now.. [13:51:29] yay [13:51:44] aharoni: kart_ it is on mwdebug1001 [13:53:06] aharoni: ^ [13:53:42] looks good, hashar [13:53:46] aharoni: can you test. [13:53:54] (03CR) 10jenkins-bot: [cirrus] Disable A/B test of MLR on testing on 18 wikis with > 1% of search traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382451 (https://phabricator.wikimedia.org/T177490) (owner: 10DCausse) [13:53:56] (03CR) 10jenkins-bot: [cirrus] Prepare profiles for the recall A/B test on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/382470 (https://phabricator.wikimedia.org/T177502) (owner: 10DCausse) [13:53:57] kart_: did the script run? [13:54:14] aharoni: still running.. [13:54:14] tabbycat: sorry that is running late. Wanna check the CheckUser hotfix ? [13:54:31] hashar: is it on mwdebug? 
[13:54:33] hashar: wait till script finish for final deploy + aharoni's check [13:54:39] (03PS3) 10Gehel: maps: introduce the base requirements for vector tiles and cleartables [puppet] - 10https://gerrit.wikimedia.org/r/383841 (https://phabricator.wikimedia.org/T157613) [13:55:02] tabbycat: it is now :] [13:55:10] hashar: checking [13:55:22] (03CR) 10Gehel: [C: 032] maps: introduce the base requirements for vector tiles and cleartables [puppet] - 10https://gerrit.wikimedia.org/r/383841 (https://phabricator.wikimedia.org/T157613) (owner: 10Gehel) [13:55:42] kart_: what is deployed on mwdebug1001? [13:55:48] the default preference change? [13:55:52] aharoni: yes [13:55:55] I don't see it yet. [13:55:59] aharoni: looks good to me. [13:56:12] hashar: which mwdebug, please? [13:56:19] tabbycat: mwdebug1001 [13:56:24] ok, checking now [13:56:32] (03CR) 10Ottomata: [C: 032] Fix cron job for refinery data drop of MediaWiki snapshots [puppet] - 10https://gerrit.wikimedia.org/r/384346 (https://phabricator.wikimedia.org/T178256) (owner: 10Mforns) [13:56:33] kart_: how did you test? [13:56:40] (03PS2) 10Ottomata: Fix cron job for refinery data drop of MediaWiki snapshots [puppet] - 10https://gerrit.wikimedia.org/r/384346 (https://phabricator.wikimedia.org/T178256) (owner: 10Mforns) [13:56:43] aharoni: mwdebug1001, check if beta option is not visible + CLL on some random pages. [13:56:51] (03CR) 10Ottomata: [C: 032] "Sorry, I should have caught this in the other review." 
[puppet] - 10https://gerrit.wikimedia.org/r/384346 (https://phabricator.wikimedia.org/T178256) (owner: 10Mforns) [13:56:55] (03CR) 10Ottomata: [V: 032 C: 032] Fix cron job for refinery data drop of MediaWiki snapshots [puppet] - 10https://gerrit.wikimedia.org/r/384346 (https://phabricator.wikimedia.org/T178256) (owner: 10Mforns) [13:57:13] aharoni: refresh page and after enabling mwdebug1001 :) [13:57:20] hashar: hmm [13:57:21] oh, now I see it - only after purging [13:57:30] on loginwiki, I get the message duplicated [13:57:34] let me check meta [13:57:35] OK, let's wait for the preferences to end, kart_ [13:57:39] hashar: we are good. [13:57:51] aharoni: script is done. [13:58:03] kart_: can I see the new output? [13:58:11] aharoni: oops. [13:58:39] hashar: no, test do not pass [13:58:44] message is duplicated [13:58:51] kart_: oops? [13:58:57] aharoni: I accidently closed tab. [13:59:19] ehh... so is it lost? [13:59:24] aharoni: yes. [13:59:26] meh [13:59:28] was it dry? [13:59:36] aharoni: no. real run. [13:59:42] oh, so let me check... [13:59:44] aharoni: I can get output. [13:59:50] (by dry-run) [14:00:02] tabbycat: so revert I guess ? [14:00:13] hashar: I did https://gerrit.wikimedia.org/r/#/c/384519/ for you [14:00:15] yes [14:00:20] I'll inform Huji and Melos [14:00:25] tabbycat: thank you ) [14:00:32] hopefully they can fix it before it hits master [14:00:35] aharoni: can you check quickly? [14:00:52] kart_: so it was supposed to make my preference actually disabled in the database? [14:00:55] becasue it didn't. [14:01:23] aharoni: yes. [14:01:45] (03PS5) 10Hashar: AbuseFilter configuration changes for es.wikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384265 (https://phabricator.wikimedia.org/T177760) (owner: 10MarcoAurelio) [14:01:45] nope, didn't seem to work. 
[14:01:46] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384265 (https://phabricator.wikimedia.org/T177760) (owner: 10MarcoAurelio) [14:01:50] abort, I guess :( [14:02:17] aharoni: kart_ should I sync the change after all? Or should we revert ? :) [14:02:27] kart_: are you sure it was real run? I don't see that it changed anything. [14:02:48] tabbycat: and I am doing the abusefilter change [14:02:59] hashar: okay, ready to check [14:03:07] prêt-a-verifier :D [14:03:11] ;] [14:03:21] kart_: was it supposed to change real preferences? am I supposed to see it in sql on terbium? [14:03:39] aharoni: yes. should update in DB [14:03:53] hashar: please hold on. Sorry for messup. [14:03:56] kart_: it doesn't seem to have updated anything. [14:04:05] kart_: hashar , abort, revert :( [14:04:08] we'll have to rescheduly. [14:04:12] reschedule. [14:04:21] you can probably prepare it via the beta cluster [14:04:30] and check there that it is working as expected :] [14:04:36] (03PS6) 10Muehlenhoff: Add support for stretch to hhvm::debug [puppet] - 10https://gerrit.wikimedia.org/r/380722 [14:04:41] aharoni: wait a second. [14:05:12] hashar: OK. Aborting. [14:05:13] PROBLEM - puppet last run on relforge1001 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[mjolnir-kafka-daemon] [14:05:18] (03Restored) 10Hashar: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [14:05:22] hashar: we have to reschedule. 
[14:05:24] (03PS3) 10Hashar: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [14:05:30] (03CR) 10Hashar: [C: 032] Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [14:05:39] (03CR) 10Hashar: [C: 032] "Finally that is being reverted for real." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [14:05:48] (03PS1) 10Ema: varnish: execution environment configuration [puppet] - 10https://gerrit.wikimedia.org/r/384520 [14:05:59] (03CR) 10Hashar: [C: 032] labs: enable draftquality for enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384507 (https://phabricator.wikimedia.org/T176183) (owner: 10Ladsgroup) [14:06:11] (03CR) 10Alexandros Kosiaris: [C: 04-1] "First round of comments. Overall a nice idea" (0313 comments) [docker-images/docker-pkg] - 10https://gerrit.wikimedia.org/r/384081 (https://phabricator.wikimedia.org/T177276) (owner: 10Giuseppe Lavagetto) [14:06:13] (03Merged) 10jenkins-bot: AbuseFilter configuration changes for es.wikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384265 (https://phabricator.wikimedia.org/T177760) (owner: 10MarcoAurelio) [14:06:23] (03CR) 10Krinkle: [C: 031] "Nice catch :)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/371967 (https://phabricator.wikimedia.org/T173342) (owner: 10Dereckson) [14:06:46] (03CR) 10jenkins-bot: AbuseFilter configuration changes for es.wikiquote [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384265 (https://phabricator.wikimedia.org/T177760) (owner: 10MarcoAurelio) [14:07:11] (03Merged) 10jenkins-bot: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [14:07:13] hashar: thanks and sorry again :) [14:07:25] no worries ) [14:08:05] 10Operations, 
10LDAP-Access-Requests, 10WMF-NDA-Requests, 10User-Addshore: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3687404 (10Dzahn) Alright, thanks. So all members of 'wmde' should be added to ldap_only_users. I will add a patch to do that. Pablo is listed a... [14:08:05] !log hashar@tin Synchronized wmf-config/abusefilter.php: (T177760) Enable AbuseFilter profiler | (T177761) Allow AbuseFilter to block & grant relevant permissions to permit administrators to manage the restricted option. (duration: 00m 47s) [14:08:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:08:13] T177761: Enable AbuseFilter to block on es.wikiquote - https://phabricator.wikimedia.org/T177761 [14:08:13] T177760: Enable AbuseFilter profiler for es.wikiquote - https://phabricator.wikimedia.org/T177760 [14:08:32] PROBLEM - puppet last run on relforge1002 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[mjolnir-kafka-daemon] [14:08:39] kart_: patch is reverted on tin and mwdebug1001 :) [14:08:49] (03Merged) 10jenkins-bot: labs: enable draftquality for enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384507 (https://phabricator.wikimedia.org/T176183) (owner: 10Ladsgroup) [14:08:54] !log European swat is complete [14:09:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:09:08] hashar: nice. Thanks! [14:09:13] hashar: is abusefilter es.wq merged and deployed? [14:09:21] tabbycat: yes! [14:09:32] (03CR) 10jenkins-bot: Revert "Deploy Compact Language Links on the German Wikipedia" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384513 (owner: 10KartikMistry) [14:09:44] hashar: oh, true, didn't saw the !log [14:09:50] thanks :D [14:10:04] Merci bien Monsieur Musso :P [14:10:47] \o: [14:10:49] \o/ [14:15:59] hmm, does anyone know if the abusefilter profiler takes some time to get it working? 
because it's showing 0 [14:20:44] (03CR) 10Krinkle: [C: 032] multiversion: Fix expanddblist shebang [mediawiki-config] - 10https://gerrit.wikimedia.org/r/371966 (owner: 10Dereckson) [14:20:48] (03CR) 10Krinkle: [C: 032] multiversion: Fix PHP notice when no argument is given [mediawiki-config] - 10https://gerrit.wikimedia.org/r/371967 (https://phabricator.wikimedia.org/T173342) (owner: 10Dereckson) [14:23:24] (03Merged) 10jenkins-bot: multiversion: Fix expanddblist shebang [mediawiki-config] - 10https://gerrit.wikimedia.org/r/371966 (owner: 10Dereckson) [14:24:45] (03PS1) 10BBlack: cacheproxy: IPv4 PMTUD when blackhole detected [puppet] - 10https://gerrit.wikimedia.org/r/384526 [14:24:56] (03PS3) 10BBlack: Refactor (rewrite?!) purging code [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384167 [14:24:58] (03PS2) 10BBlack: strq+purger: refactor, simplify, add queue delays [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384433 [14:25:00] (03PS2) 10BBlack: Rework stats further [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384434 [14:25:02] (03PS2) 10BBlack: link against jemalloc [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384435 [14:25:04] (03PS4) 10BBlack: Release 0.1.0 [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382873 [14:25:11] (03CR) 10jerkins-bot: [V: 04-1] cacheproxy: IPv4 PMTUD when blackhole detected [puppet] - 10https://gerrit.wikimedia.org/r/384526 (owner: 10BBlack) [14:26:54] 10Operations, 10ops-codfw, 10DBA: db2081 unreachable - https://phabricator.wikimedia.org/T178140#3687513 (10Papaul) p:05Triage>03Normal [14:29:17] (03PS1) 10KartikMistry: Deploy Compact Language Links on the German Wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384527 (https://phabricator.wikimedia.org/T177836) [14:30:49] !log uploaded wikidiff 1.5.1 to apt.wikimedia.org for jessie/experimental [14:30:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:31:23] 
(03PS2) 10BBlack: cacheproxy: IPv4 PMTUD when blackhole detected [puppet] - 10https://gerrit.wikimedia.org/r/384526 [14:32:14] (03CR) 10Ottomata: [C: 031] "+1, shall I merge?" [puppet] - 10https://gerrit.wikimedia.org/r/383842 (https://phabricator.wikimedia.org/T178076) (owner: 10Hashar) [14:34:05] !log deploying hotfix for T178068 and T178107 [14:34:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:34:16] T178107: Phabricator admins can not edit all project fields - https://phabricator.wikimedia.org/T178107 [14:34:16] T178068: Remove useless/misleading "Comments" field on top of https://phabricator.wikimedia.org/maniphest/task/edit/form/1/ and other places - https://phabricator.wikimedia.org/T178068 [14:36:26] 10Operations, 10wikidiff2, 10User-Addshore, 10WMDE-QWERTY-Team-Board: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3687556 (10MoritzMuehlenhoff) @Tobi_WMDE_SW : I've built and uploaded 1.5.1 to apt.wikimedia.org and upgraded the app servers in deployment-pre... [14:39:09] 10Operations, 10wikidiff2, 10User-Addshore, 10WMDE-QWERTY-Team-Board: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3687566 (10Addshore) >>! In T177891#3687556, @MoritzMuehlenhoff wrote: > We can only update wgWikiDiff2MovedParagraphDetectionCutoff once all a... [14:40:06] (03PS4) 10BBlack: Refactor (rewrite?!) 
purging code [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384167 [14:40:09] (03PS3) 10BBlack: strq+purger: refactor, simplify, add queue delays [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384433 [14:40:11] (03PS3) 10BBlack: Rework stats further [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384434 [14:40:13] (03PS3) 10BBlack: link against jemalloc [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384435 [14:40:15] (03PS5) 10BBlack: Release 0.1.0 [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382873 [14:43:25] 10Operations, 10wikidiff2, 10User-Addshore, 10WMDE-QWERTY-Team-Board: Update and use php-wikidiff2 to 1.5 in production - https://phabricator.wikimedia.org/T177891#3687588 (10MoritzMuehlenhoff) > We could set this based on hostname though? > This could also be limited per wiki and we could start with testw... [14:44:12] (03PS10) 10Ottomata: Add the LVS config for the druid-public-broker service [puppet] - 10https://gerrit.wikimedia.org/r/383880 (https://phabricator.wikimedia.org/T176223) [14:44:42] RECOVERY - Host db2081.mgmt is UP: PING OK - Packet loss = 0%, RTA = 37.01 ms [14:45:30] 10Operations, 10Continuous-Integration-Infrastructure (shipyard), 10Patch-For-Review, 10User-Joe: Unify production and CI docker image build process - https://phabricator.wikimedia.org/T177276#3687595 (10Addshore) @thcipriani having run with dates for the past weeks I really don't like them. It would make... [14:45:40] 10Operations, 10LDAP-Access-Requests, 10WMF-NDA-Requests, 10User-Addshore: Request to be added to the ldap/wmde group - https://phabricator.wikimedia.org/T177599#3687599 (10MoritzMuehlenhoff) >>! In T177599#3687404, @Dzahn wrote: > Pablo is listed as "has LDAP NDA access" in that doc. That means "had acc... 
[14:47:51] (03PS1) 10DCausse: [cirrus] Fix typo in UserTesting config var [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384529 (https://phabricator.wikimedia.org/T177502) [14:50:00] 10Operations, 10Commons, 10Thumbor, 10media-storage, 10Performance-Team (Radar): Jessie rsvg/cairo can't render specific SVG file on Commons - https://phabricator.wikimedia.org/T170628#3687636 (10MoritzMuehlenhoff) >>! In T170628#3684122, @Gilles wrote: > Probably something in the Cairo libraries? That's... [15:05:02] (03PS1) 10Ottomata: Update kafka broker profile simple.yaml config to match new profile params [puppet] - 10https://gerrit.wikimedia.org/r/384533 [15:06:01] (03CR) 10Ottomata: [C: 032] Update kafka broker profile simple.yaml config to match new profile params [puppet] - 10https://gerrit.wikimedia.org/r/384533 (owner: 10Ottomata) [15:10:54] (03PS4) 10BBlack: Rework stats further [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384434 [15:11:15] 10Operations, 10Commons, 10Thumbor, 10media-storage, 10Performance-Team (Radar): Jessie rsvg/cairo can't render specific SVG file on Commons - https://phabricator.wikimedia.org/T170628#3687787 (10MoritzMuehlenhoff) a:05MoritzMuehlenhoff>03None [15:14:05] is wmf/1.31.0-wmf.3 the branch we are using now? [15:15:03] davidwbarratt: https://tools.wmflabs.org/versions/ [15:15:33] jynus ah! thanks! 
[15:17:53] (03PS3) 10Marostegui: Revert "db-eqiad.php: Depool db1088" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384505 [15:21:25] (03PS4) 10BBlack: link against jemalloc [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384435 [15:21:27] (03PS6) 10BBlack: Release 0.1.0 [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382873 [15:21:36] (03CR) 10jenkins-bot: labs: enable draftquality for enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384507 (https://phabricator.wikimedia.org/T176183) (owner: 10Ladsgroup) [15:22:36] (03CR) 10jenkins-bot: multiversion: Fix expanddblist shebang [mediawiki-config] - 10https://gerrit.wikimedia.org/r/371966 (owner: 10Dereckson) [15:22:37] 10Operations, 10ops-codfw, 10DBA: db2081 unreachable - https://phabricator.wikimedia.org/T178140#3687832 (10Papaul) Please see attachment for the error that was on the screen {F10261356} Steps taken: - Removed PSU's for couple of minutes - Update IDRAC firmware from 2.40 to 2.50 - Update BIOS from 2.4.3 t... [15:24:40] 10Operations, 10ops-codfw, 10DBA: db2081 unreachable - https://phabricator.wikimedia.org/T178140#3687839 (10Marostegui) Thanks @Papaul for troubleshooting this. This was one of the new servers we recently bought. Will you talk to the vendor to get a technician to look at it or advise? This server was not i... [15:25:58] 10Operations, 10monitoring: Cron spam: figure out a way it doesn't get ignored - https://phabricator.wikimedia.org/T178311#3687848 (10Volans) [15:26:32] Is there anyway to pull people's preferences from the backup? https://phabricator.wikimedia.org/T177825#3685628 [15:26:44] I mean we'd need to update the script to handle a mixed-state [15:27:18] but I'm wondering if we can recover the user preference at all? 
[15:34:26] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1088" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384505 (owner: 10Marostegui) [15:35:13] 10Operations, 10ops-codfw, 10DBA: db2081 unreachable - https://phabricator.wikimedia.org/T178140#3687900 (10Papaul) @Marostegui when you call Dell the first thing they will tell you is to update the firmware that is the reason i didn't call them and i went ahead and update all the firmwares and left the tas... [15:36:23] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1088" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384505 (owner: 10Marostegui) [15:36:32] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1088" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384505 (owner: 10Marostegui) [15:37:21] 10Operations, 10ops-codfw, 10DBA: db2081 unreachable - https://phabricator.wikimedia.org/T178140#3682064 (10jcrespo) I think lately we have been doing the following: if it is the first time, upgrade firmware and collect logs/other proof. If it is the second time, collect evidence and ask for replacements. We... [15:37:28] 10Operations, 10ops-codfw, 10DBA: db2081 unreachable - https://phabricator.wikimedia.org/T178140#3687911 (10Marostegui) >>! In T178140#3687900, @Papaul wrote: > @Marostegui when you call Dell the first thing they will tell you is to update the firmware that is the reason i didn't call them and i went ahead a... 
[15:38:39] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1088 - T174509 T177772 (duration: 00m 47s) [15:38:39] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1094" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384540 [15:38:41] (03PS1) 10Ema: cache: use esams/cache/misc.yaml for varnish 5 hieradata [puppet] - 10https://gerrit.wikimedia.org/r/384541 (https://phabricator.wikimedia.org/T177233) [15:38:43] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1094" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384540 [15:38:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:38:47] T177772: Purge 90% of rows from recentchanges (and posibly defragment) from commonswiki and ruwiki (the ones with source:wikidata) - https://phabricator.wikimedia.org/T177772 [15:38:47] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [15:39:01] (03Abandoned) 10Nuria: Do not store PopUps events on MySQL [puppet] - 10https://gerrit.wikimedia.org/r/383389 (https://phabricator.wikimedia.org/T176469) (owner: 10Nuria) [15:39:22] anyone? [15:42:35] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1094" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384540 (owner: 10Marostegui) [15:42:58] (03CR) 10Ema: [C: 032] cache: use esams/cache/misc.yaml for varnish 5 hieradata [puppet] - 10https://gerrit.wikimedia.org/r/384541 (https://phabricator.wikimedia.org/T177233) (owner: 10Ema) [15:43:45] Dereckson: Since the worst that'll happen is apt to be weird design anomalies, we can totally just fix stuff after the fact. [15:44:14] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1094" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384540 (owner: 10Marostegui) [15:44:40] Dereckson: Is timeless deployed on those wikis now? Also, heads up, I'm back on US time now, and sleeping 12-hour nights don't ask. 
[15:45:35] 10Operations, 10ops-eqiad, 10DBA: Degraded RAID on db1092 - https://phabricator.wikimedia.org/T177264#3687936 (10Marostegui) @Cmjohnson any update? [15:45:41] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1094 - T174509 (duration: 00m 46s) [15:45:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:45:49] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [15:46:24] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1094" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384540 (owner: 10Marostegui) [15:46:24] 10Operations, 10ops-eqiad, 10DBA: Degraded RAID on db1092 - https://phabricator.wikimedia.org/T177264#3687939 (10Cmjohnson) @marostegui I received an email over the weekend that the disk has shipped. I expect it today or tomorrow. [15:46:42] 10Operations, 10ops-eqiad, 10DBA: Degraded RAID on db1092 - https://phabricator.wikimedia.org/T177264#3687941 (10Marostegui) Excellent! Thank you! 
[15:49:44] (03CR) 10Alexandros Kosiaris: Deployment pipeline profile (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/382608 (https://phabricator.wikimedia.org/T173128) (owner: 10Thcipriani) [15:50:06] (03CR) 10Alexandros Kosiaris: [C: 032] Deployment pipeline profile [puppet] - 10https://gerrit.wikimedia.org/r/382608 (https://phabricator.wikimedia.org/T173128) (owner: 10Thcipriani) [15:50:17] (03PS8) 10Alexandros Kosiaris: Deployment pipeline profile [puppet] - 10https://gerrit.wikimedia.org/r/382608 (https://phabricator.wikimedia.org/T173128) (owner: 10Thcipriani) [15:50:27] (03CR) 10Alexandros Kosiaris: [V: 032 C: 032] Deployment pipeline profile [puppet] - 10https://gerrit.wikimedia.org/r/382608 (https://phabricator.wikimedia.org/T173128) (owner: 10Thcipriani) [15:50:45] !log gehel@tin Started deploy [tilerator/deploy@f3b26f3]: testing tileshell fix - T177389 [15:50:48] (03PS1) 10Nuria: Do not store PopUps events on MySQL [puppet] - 10https://gerrit.wikimedia.org/r/384542 (https://phabricator.wikimedia.org/T176469) [15:50:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:50:52] T177389: tileshell does not exit - https://phabricator.wikimedia.org/T177389 [15:51:05] !log gehel@tin Finished deploy [tilerator/deploy@f3b26f3]: testing tileshell fix - T177389 (duration: 00m 20s) [15:51:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:51:56] !log killing stuck tileshell process on maps-test2001 - T177389 [15:52:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:53:26] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1084" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384543 [15:53:31] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1084" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384543 [15:54:09] (03CR) 10Ottomata: [C: 032] Do not store PopUps events on MySQL [puppet] - 10https://gerrit.wikimedia.org/r/384542 
(https://phabricator.wikimedia.org/T176469) (owner: 10Nuria) [15:54:16] (03PS2) 10Ottomata: Do not store PopUps events on MySQL [puppet] - 10https://gerrit.wikimedia.org/r/384542 (https://phabricator.wikimedia.org/T176469) (owner: 10Nuria) [15:54:18] (03CR) 10Ottomata: [V: 032 C: 032] Do not store PopUps events on MySQL [puppet] - 10https://gerrit.wikimedia.org/r/384542 (https://phabricator.wikimedia.org/T176469) (owner: 10Nuria) [15:56:14] (03PS1) 10Ema: cache: upgrade misc_codfw to Varnish 5 [puppet] - 10https://gerrit.wikimedia.org/r/384544 (https://phabricator.wikimedia.org/T177233) [15:56:23] PROBLEM - puppet last run on contint1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [15:56:36] (03PS1) 10Andrew Bogott: data services: add amwikimedia to s3 [puppet] - 10https://gerrit.wikimedia.org/r/384545 (https://phabricator.wikimedia.org/T176043) [15:58:37] (03CR) 10Andrew Bogott: [C: 032] data services: add amwikimedia to s3 [puppet] - 10https://gerrit.wikimedia.org/r/384545 (https://phabricator.wikimedia.org/T176043) (owner: 10Andrew Bogott) [15:59:21] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1084" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384543 (owner: 10Marostegui) [15:59:38] 10Operations, 10Anti-Harassment, 10Collaboration-Team-Triage, 10Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3687964 (10dbarratt) [16:04:17] (03PS3) 10BBlack: Various minor improvements/updates [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384165 [16:04:19] (03PS4) 10BBlack: Remove multi-head support from strq, move into purger. 
[software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382865 [16:04:21] (03PS4) 10BBlack: Move all URL parsing and HTTP req generation to receiver [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382867 [16:04:23] (03PS4) 10BBlack: Chain the purgers together and split their stats [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382868 [16:04:23] 10Operations, 10Anti-Harassment, 10Collaboration-Team-Triage, 10Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3687986 (10dbarratt) [16:04:27] (03PS4) 10BBlack: Bump http-parser upstream src to 2.7.1 + fixups [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382870 [16:04:29] (03PS5) 10BBlack: Refactor (rewrite?!) purging code [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384167 [16:04:31] (03PS4) 10BBlack: strq+purger: refactor, simplify, add queue delays [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384433 [16:04:33] (03PS5) 10BBlack: Rework stats further [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384434 [16:04:35] (03PS5) 10BBlack: link against jemalloc [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384435 [16:04:37] (03PS7) 10BBlack: Release 0.1.0 [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382873 [16:04:39] 10Operations, 10Anti-Harassment, 10Collaboration-Team-Triage, 10Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3687964 (10dbarratt) [16:04:45] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1084" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384543 (owner: 10Marostegui) [16:05:41] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1084 - T174509 T177772 (duration: 00m 46s) [16:05:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:05:49] T177772: Purge 90% of rows from recentchanges (and posibly defragment) from 
commonswiki and ruwiki (the ones with source:wikidata) - https://phabricator.wikimedia.org/T177772 [16:05:50] T174509: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509 [16:06:07] (03PS1) 10Volans: PuppetDB backend: support for Roles and Profiles [software/cumin] - 10https://gerrit.wikimedia.org/r/384547 (https://phabricator.wikimedia.org/T178279) [16:06:20] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1084" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384543 (owner: 10Marostegui) [16:06:22] RECOVERY - puppet last run on contint1001 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [16:08:11] 10Operations, 10monitoring, 10Patch-For-Review: Add PDU redundancy server/router/switch checks in Icinga - https://phabricator.wikimedia.org/T109903#3688004 (10faidon) What's the status and what's left here? @herron? [16:08:12] (03PS2) 10Ema: cache: upgrade misc_codfw to Varnish 5 [puppet] - 10https://gerrit.wikimedia.org/r/384544 (https://phabricator.wikimedia.org/T177233) [16:08:20] (03CR) 10Ema: [V: 032 C: 032] cache: upgrade misc_codfw to Varnish 5 [puppet] - 10https://gerrit.wikimedia.org/r/384544 (https://phabricator.wikimedia.org/T177233) (owner: 10Ema) [16:09:21] !log upgrade misc_codfw to varnish 5 T177233 [16:09:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:09:30] T177233: Upgrade cache_misc to Varnish 5 - https://phabricator.wikimedia.org/T177233 [16:09:53] nobody answered so I created a task: https://phabricator.wikimedia.org/T178313 [16:13:32] 10Operations, 10Anti-Harassment, 10Collaboration-Team-Triage, 10Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3687964 (10jcrespo) "Recover the preference from the backup " Do you have a timestamp and a query for me? 
[16:16:43] !log joal@tin Started deploy [analytics/aqs/deploy@aa7ef0d]: Deploy new mediawiki-history endpoints (alpha version) after node-modules fix [16:16:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:17:09] !Log Create bureaucrat account for Dereckson on techconductwiki (T165977) [16:17:09] T165977: Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977 [16:18:22] PROBLEM - puppet last run on kafka1014 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [16:23:40] !log joal@tin Finished deploy [analytics/aqs/deploy@aa7ef0d]: Deploy new mediawiki-history endpoints (alpha version) after node-modules fix (duration: 06m 56s) [16:23:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:32:02] (03PS1) 10Gehel: wdqs: increase heap size to 16GB [puppet] - 10https://gerrit.wikimedia.org/r/384557 (https://phabricator.wikimedia.org/T175919) [16:35:10] !log joal@tin Started deploy [analytics/aqs/deploy@339674c]: Deploy new mediawiki-history endpoints (alpha version) after monitorin fix [16:35:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:39:52] !log joal@tin Finished deploy [analytics/aqs/deploy@339674c]: Deploy new mediawiki-history endpoints (alpha version) after monitorin fix (duration: 04m 42s) [16:39:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:46:20] 10Operations, 10monitoring, 10Patch-For-Review: Add PDU redundancy server/router/switch checks in Icinga - https://phabricator.wikimedia.org/T109903#3688088 (10herron) Server PSU monitoring via icinga check_ipmi is complete and each PSU problem has been acknowledged with a sub task to investigate the affecte... 
[16:48:22] RECOVERY - puppet last run on kafka1014 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [16:55:24] 10Operations, 10DBA, 10Support-and-Safety, 10Patch-For-Review, 10Wiki-Setup (Create): Create elections committee private wiki - https://phabricator.wikimedia.org/T174370#3688097 (10jrbs) >>! In T174370#3685824, @Dereckson wrote: > There is hi.wiktionary to create soon, I'll see next week to plan a window... [16:56:06] (03CR) 10Ottomata: [C: 032] Add the LVS config for the druid-public-broker service [puppet] - 10https://gerrit.wikimedia.org/r/383880 (https://phabricator.wikimedia.org/T176223) (owner: 10Ottomata) [16:56:11] (03PS1) 10Cmjohnson: Removing db1022 from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/384558 [16:56:16] (03PS11) 10Ottomata: Add the LVS config for the druid-public-broker service [puppet] - 10https://gerrit.wikimedia.org/r/383880 (https://phabricator.wikimedia.org/T176223) [16:56:23] !log deploying LVS for druid-public-broker https://phabricator.wikimedia.org/T176223 [16:56:25] (03CR) 10Ottomata: [V: 032 C: 032] Add the LVS config for the druid-public-broker service [puppet] - 10https://gerrit.wikimedia.org/r/383880 (https://phabricator.wikimedia.org/T176223) (owner: 10Ottomata) [16:56:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:57:28] (03CR) 10Cmjohnson: [C: 032] Removing db1022 from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/384558 (owner: 10Cmjohnson) [16:57:55] (03PS2) 10Cmjohnson: Removing db1022 from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/384558 [16:57:59] (03CR) 10Cmjohnson: [V: 032 C: 032] Removing db1022 from site.pp [puppet] - 10https://gerrit.wikimedia.org/r/384558 (owner: 10Cmjohnson) [16:58:14] (03CR) 10Smalyshev: [C: 031] wdqs: increase heap size to 16GB [puppet] - 10https://gerrit.wikimedia.org/r/384557 (https://phabricator.wikimedia.org/T175919) (owner: 10Gehel) [16:58:45] (03PS2) 10Gehel: wdqs: increase heap size to 16GB 
[puppet] - 10https://gerrit.wikimedia.org/r/384557 (https://phabricator.wikimedia.org/T175919) [16:59:41] (03CR) 10Gehel: [C: 032] wdqs: increase heap size to 16GB [puppet] - 10https://gerrit.wikimedia.org/r/384557 (https://phabricator.wikimedia.org/T175919) (owner: 10Gehel) [17:00:04] gehel: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Wikidata Query Service weekly deploy deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T1700). [17:00:06] Smalyshev: A patch you scheduled for Wikidata Query Service weekly deploy is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [17:00:25] jouncebot: o/ [17:04:39] (03Abandoned) 10Gehel: wdqs: reduce blazegraph heap size to 10GB [puppet] - 10https://gerrit.wikimedia.org/r/380972 (https://phabricator.wikimedia.org/T175919) (owner: 10Gehel) [17:06:02] !log gehel@tin Started deploy [wdqs/wdqs@b42eb0d]: WDQS GUI uipdates [17:06:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:07:07] 10Operations, 10ops-eqiad, 10DBA: Decommission db1022 (Was: db1022 broke while changing topology on s6- evaluate if to fix or directly decommission) - https://phabricator.wikimedia.org/T163778#3688148 (10Cmjohnson) 05Open>03Resolved db1022 is no longer. removed from site.pp, all dns is removed, killed in... 
[17:07:09] 10Operations, 10DBA, 10Patch-For-Review: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#3688151 (10Cmjohnson) [17:08:02] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [1000.0] [17:08:13] PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] [17:08:22] !log gehel@tin Finished deploy [wdqs/wdqs@b42eb0d]: WDQS GUI uipdates (duration: 02m 20s) [17:08:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:08:33] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] [17:09:02] PROBLEM - Eqiad HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [1000.0] [17:09:06] SMalyshev: ^ deployment completed, tests are green... [17:09:13] cool! [17:09:42] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] [17:10:12] PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] [17:10:29] 10Operations, 10ops-eqiad, 10Patch-For-Review, 10User-Elukey: Decommission old memcached hosts - mc1001->mc1018 - https://phabricator.wikimedia.org/T164341#3688158 (10Volans) @Cmjohnson FYI I'm not using anymore the above hosts for testing. 
[17:10:51] (PS1) Cmjohnson: Removing db1018 from site.pp and dhcpd file (decom) [puppet] - https://gerrit.wikimedia.org/r/384562 [17:11:49] (CR) Muehlenhoff: [C: 2] Add support for stretch to hhvm::debug [puppet] - https://gerrit.wikimedia.org/r/380722 (owner: Muehlenhoff) [17:12:02] (PS7) Muehlenhoff: Add support for stretch to hhvm::debug [puppet] - https://gerrit.wikimedia.org/r/380722 [17:12:24] (CR) Addshore: [C: 1] ci: provide bare copy of puppet.git on Docker slaves [puppet] - https://gerrit.wikimedia.org/r/383843 (https://phabricator.wikimedia.org/T178076) (owner: Hashar) [17:12:56] the rise in HTTP 5xx above is suspiciously close to the deployment of WDQS. It seems to be going down again though... [17:14:38] it seems gone now [17:14:52] (PS2) Cmjohnson: Removing db1018 from site.pp and dhcpd file (decom) [puppet] - https://gerrit.wikimedia.org/r/384562 [17:15:43] RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:16:27] (CR) Cmjohnson: [C: 2] Removing db1018 from site.pp and dhcpd file (decom) [puppet] - https://gerrit.wikimedia.org/r/384562 (owner: Cmjohnson) [17:16:39] (PS3) Cmjohnson: Removing db1018 from site.pp and dhcpd file (decom) [puppet] - https://gerrit.wikimedia.org/r/384562 [17:16:43] (CR) Cmjohnson: [V: 2 C: 2] Removing db1018 from site.pp and dhcpd file (decom) [puppet] - https://gerrit.wikimedia.org/r/384562 (owner: Cmjohnson) [17:17:02] RECOVERY - Eqiad HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:17:12] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:17:13] RECOVERY - Codfw HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:21:02] PROBLEM - Host db1018 is DOWN: PING CRITICAL - Packet loss = 100% [17:21:20] Where would I find the deployment logs?
[17:21:29] Operations, MediaWiki-Platform-Team, Performance-Team, HHVM: Convert Wikimedia production HHVM instances to have hhvm.php7.all set true - https://phabricator.wikimedia.org/T173786#3688184 (Anomie) It looks like that's upstream bug https://github.com/facebook/hhvm/issues/7198. In short, PHP7 allow... [17:21:43] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] [17:25:55] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688194 (dbarratt) >>! In T178313#3688011, @jcrespo wrote: > "Recover the preference from the backup " > > Do you have a timestamp an... [17:26:56] (PS1) Cmjohnson: Removing dns entries for db1018 (decom) [dns] - https://gerrit.wikimedia.org/r/384564 [17:28:24] (PS1) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [17:28:50] (CR) jerkins-bot: [V: -1] openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [17:28:56] (PS2) Cmjohnson: Removing dns entries for db1018 (decom) [dns] - https://gerrit.wikimedia.org/r/384564 [17:32:24] (CR) Cmjohnson: [C: 2] Removing dns entries for db1018 (decom) [dns] - https://gerrit.wikimedia.org/r/384564 (owner: Cmjohnson) [17:33:23] (PS2) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [17:33:33] Operations, ops-eqiad, DBA: decommission db1018 - https://phabricator.wikimedia.org/T176215#3688214 (Cmjohnson) [17:33:54] (CR) jerkins-bot: [V: -1] openstack: nova manager/osm/wikitech to module/profile/role [puppet] - 
https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [17:34:51] (CR) ArielGlenn: [C: 2] make the 'latestlink' job not do cleanups of old files or write status files [dumps] - https://gerrit.wikimedia.org/r/384211 (owner: ArielGlenn) [17:35:00] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688217 (jcrespo) > How would I get a more precise timestamp for you? Can you tell me a query that was run, so there is a hard limit... [17:35:42] (PS3) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [17:35:46] !log ariel@tin Started deploy [dumps/dumps@2b9326b]: do the minimum for latestlinks job, no updating other files [17:35:49] !log ariel@tin Finished deploy [dumps/dumps@2b9326b]: do the minimum for latestlinks job, no updating other files (duration: 00m 03s) [17:35:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:35:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:43:48] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688248 (Catrope) [17:45:40] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688252 (dbarratt) >>! In T178313#3688217, @jcrespo wrote: > Can you tell me a query that was run, so there is a hard limit per wiki?...
[17:51:05] (PS4) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [17:52:19] (PS1) Cmjohnson: Removing mgmt dns for restbase-dev1001-3 (decom) [dns] - https://gerrit.wikimedia.org/r/384568 [17:53:02] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688273 (jcrespo) Thanks, https://phabricator.wikimedia.org/diffusion/ECHO/browse/master/maintenance/updatePerUserBlacklist.php;65a61a... [17:54:32] (PS2) Cmjohnson: Removing mgmt dns for restbase-dev1001-3 (decom) [dns] - https://gerrit.wikimedia.org/r/384568 [17:54:54] (CR) Cmjohnson: [C: 2] Removing mgmt dns for restbase-dev1001-3 (decom) [dns] - https://gerrit.wikimedia.org/r/384568 (owner: Cmjohnson) [17:57:12] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3687964 (Catrope) @jcrespo What I think (hope) you will find is two groups of `UPDATE user_properties SET up_value=X WHERE up_user=N A... [18:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) Morning SWAT (Max 8 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T1800). [18:00:04] Smalyshev and davidwbarratt: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[18:00:17] here [18:00:24] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688304 (dbarratt) Yes, but, just to be clear, the list of prefs that //need// to be updated should be taken from prod **right now** b... [18:00:25] here [18:00:54] (PS4) Addshore: ci: provide bare copy of puppet.git on Docker slaves [puppet] - https://gerrit.wikimedia.org/r/383843 (https://phabricator.wikimedia.org/T178076) (owner: Hashar) [18:02:30] (PS1) Addshore: ci: add mediawiki core and vendor to gitcache for docker slaves [puppet] - https://gerrit.wikimedia.org/r/384570 (https://phabricator.wikimedia.org/T178076) [18:02:32] (PS2) Dzahn: base/standard_packages: add dnsutils [puppet] - https://gerrit.wikimedia.org/r/383657 [18:03:34] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688311 (jcrespo) Can I be a bit simplistic?- I will setup a mysql instance in read only accesible from the cluster with user_properti... [18:03:50] (PS5) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:04:05] I can SWAT [18:04:47] davidwbarratt: Re your last comment on T178313 , do you have access to stat1006?
T178313: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313 [18:05:12] (CR) Dzahn: base/icinga: if on labs, don't page for mysql procs (2 comments) [puppet] - https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) (owner: Dzahn) [18:05:14] (PS4) Dzahn: base/icinga: if on labs, don't page for mysql procs [puppet] - https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) [18:06:00] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688317 (jcrespo) While that is being setup, if the number of updates is relatively small, compared to the table size, I can try to fi... [18:06:03] (Abandoned) Dzahn: mariadb/icinga: if fqdn like labtest, don't page [puppet] - https://gerrit.wikimedia.org/r/383713 (https://phabricator.wikimedia.org/T178008) (owner: Dzahn) [18:06:29] RoanKattouw umm, I don't think so [18:06:59] (CR) Thcipriani: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/383464 (https://phabricator.wikimedia.org/T175199) (owner: Smalyshev) [18:07:26] (CR) Dzahn: [C: 2] base/standard_packages: add dnsutils [puppet] - https://gerrit.wikimedia.org/r/383657 (owner: Dzahn) [18:07:35] (PS5) Dzahn: base/icinga: if on labs, don't page for mysql procs [puppet] - https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) [18:07:50] RoanKattouw since I don't know what that is. [18:07:55] Ha OK [18:08:16] It's a box that has a real-time replica of the live DB [18:08:34] I have access to it so I can pull the list of users with broken blacklists from it [18:09:09] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688322 (Catrope) >>! 
In T178313#3688317, @jcrespo wrote: > While that is being setup, if the number of updates is relatively small, c... [18:09:26] Doing that right now [18:12:04] (Merged) jenkins-bot: Add configuration for statement indexing for Wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/383464 (https://phabricator.wikimedia.org/T175199) (owner: Smalyshev) [18:12:18] RoanKattouw oh, I have access to teribum and the DB [18:12:19] (CR) jenkins-bot: Add configuration for statement indexing for Wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/383464 (https://phabricator.wikimedia.org/T175199) (owner: Smalyshev) [18:13:02] Aha OK [18:13:08] Well I'm already running my thing [18:13:26] SMalyshev: you change is live on mwdebug1002, check please [18:13:31] *your [18:13:31] catrope@stat1006:~$ mkdir blacklistusers; for db in `cat all.dblist`; do echo $db; echo "SELECT up_user FROM user_properties WHERE up_property = 'echo-notifications-blacklist' AND up_value REGEXP '^(0|0\n)+$';" | mysql --skip-column-names -h s3-analytics-slave $db > blacklistusers/$db.txt; done [18:13:40] thcipriani: thanks, this is kinda hard to check without db reindex though :) [18:14:02] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688332 (jcrespo) I am backing up the ROW binary logs right now, because they happen to have a 14-15 day expiration. [18:14:40] And it's done [18:14:46] thcipriani: doesn't seem to break anything, but I'll need it on terbium and try to reindex to see if it really works [18:15:01] RoanKattouw oh ok, cool so I can just run that? [18:15:09] It's already finished for me [18:15:12] I'll send you the result [18:15:55] SMalyshev: ok, should I push out live and let you work on terbium, or pull over to terbium and then push out live?
[18:16:26] thcipriani: I think you can push it live, even if it doesn't work it won't hurt anything [18:16:33] davidwbarratt: I've put it in my home dir on terbium. /home/catrope/blacklistusers.tar.gz [18:16:33] * thcipriani does [18:17:02] it's a config that is only used when indexing [18:17:07] thcipriani: thanks! [18:18:09] RoanKattouw ok great I can take a look [18:18:34] !log thcipriani@tin Synchronized wmf-config/Wikibase.php: SWAT: [[gerrit:383464|Add configuration for statement indexing for Wikidata]] T175199 (duration: 00m 47s) [18:18:40] ^ SMalyshev live now [18:18:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:18:42] T175199: Index certain statements for Wikidata items - https://phabricator.wikimedia.org/T175199 [18:18:49] thcipriani: thanks! [18:18:56] (PS3) Nuria: Removing appInstallId from whitelist [puppet] - https://gerrit.wikimedia.org/r/384093 (https://phabricator.wikimedia.org/T178174) [18:19:13] (CR) Nuria: Removing appInstallId from whitelist (1 comment) [puppet] - https://gerrit.wikimedia.org/r/384093 (https://phabricator.wikimedia.org/T178174) (owner: Nuria) [18:22:41] (CR) Rush: "http://puppet-compiler.wmflabs.org/8338/" [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [18:22:50] (PS6) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:23:33] RoanKattouw so these are the bad ones from prod right now, correct? [18:23:38] (PS1) Madhuvishy: ssh-key-ldap-lookup: Modify script to run additional checks before ssh login [puppet] - https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) [18:23:39] Yes that's right [18:24:16] As of 18:12ish UTC anyway [18:24:18] RoanKattouw ok, that looks correct to me [18:25:02] thcipriani any questions about my patch?
[18:25:39] davidwbarratt: nope, just waiting on the gating tests to finish up then it'll be merged [18:25:51] thcipriani perfect [18:27:18] (PS2) Madhuvishy: ssh-key-ldap-lookup: Modify script to run additional checks before ssh login [puppet] - https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) [18:27:19] davidwbarratt: your change is live on mwdebug1002, check please [18:27:29] thcipriani kk [18:31:09] (PS7) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:31:34] (PS8) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:33:12] (CR) Dzahn: "i think this needs a related ticket to talk about it. maybe it is intentional that there needs to be that manual step. but i'm not sure. i" [puppet] (refs/meta/config) - https://gerrit.wikimedia.org/r/384306 (owner: Paladox) [18:34:34] thcipriani doh! do not deploy, it looks like it breaks the echo notification blacklist. :( [18:34:49] davidwbarratt: ok, reverting, thanks for checking :) [18:35:17] thcipriani no problem, it fixes the bug, but creates a different regression. :( [18:35:26] * thcipriani nods [18:35:31] davidwbarratt: Whoops, what's the regression? [18:35:53] RoanKattouw the blocklist doesn't work at all (i.e. 
adding someone to the blocklist does nothing) [18:36:01] hah wtf [18:36:05] I did not test that [18:36:15] (PS9) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:36:16] Operations, Gerrit: Enable auto submodule updates on operations/puppet - https://phabricator.wikimedia.org/T178322#3688396 (Paladox) [18:36:30] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688408 (jcrespo) I get something like: ``` ### UPDATE `enwiki`.`user_properties` ### WHERE ### @1=[user_id] /* INT meta=0 nullable... [18:36:35] RoanKattouw I didn't either, but I thought I should before deploy. :) [18:36:56] Yeah good call [18:37:01] How does a ,true cause that I wonder [18:37:15] (PS4) Paladox: Enable auto submodule updates [puppet] (refs/meta/config) - https://gerrit.wikimedia.org/r/384306 (https://phabricator.wikimedia.org/T178322) [18:37:44] (PS11) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:38:33] Operations, Gerrit, Patch-For-Review: Enable auto submodule updates on operations/puppet - https://phabricator.wikimedia.org/T178322#3688429 (demon) Open>declined I'm pretty sure nobody in ops wants this functionality on puppet -- the parent repo operates by the policy of "if you merge it, yo...
[18:42:10] (PS12) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:42:16] (CR) Andrew Bogott: [C: 1] "This looks fine -- I'm a bit surprised by the password changes (generally we go ahead and use the real passwords here since it's public) b" [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [18:43:32] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688451 (Catrope) Thanks for finding that @jcrespo. It's great to see that both the old and new values are in the dump. Could you spec... [18:43:39] (CR) Alexandros Kosiaris: varnish: execution environment configuration (4 comments) [puppet] - https://gerrit.wikimedia.org/r/384520 (owner: Ema) [18:45:19] (PS13) Rush: openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) [18:47:00] (CR) Chad: [C: -2] "Declined the task, please abandon this. And please stop adding auto-updating submodules everywhere. If people want them they'll add them." 
[puppet] (refs/meta/config) - https://gerrit.wikimedia.org/r/384306 (https://phabricator.wikimedia.org/T178322) (owner: Paladox) [18:47:32] (Abandoned) Paladox: Enable auto submodule updates [puppet] (refs/meta/config) - https://gerrit.wikimedia.org/r/384306 (https://phabricator.wikimedia.org/T178322) (owner: Paladox) [18:50:56] Operations, Kubernetes: Operations 2017-18 Q2 Program 6 umbrella task - https://phabricator.wikimedia.org/T178325#3688463 (akosiaris) [18:51:07] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688475 (jcrespo) @Catrope I need a regex or a grep command or something. Or, I give you the logs and you can extract exactly the quer... [18:51:44] Operations: fix disk space check on dataset1001 - https://phabricator.wikimedia.org/T84148#3688476 (Dzahn) a:ArielGlenn>Dzahn [18:51:46] Operations: fix disk space check on dataset1001 - https://phabricator.wikimedia.org/T84148#923388 (Dzahn) [18:51:54] Operations: fix disk space check on dataset1001 - https://phabricator.wikimedia.org/T84148#923388 (Dzahn) p:High>Low [18:52:00] Operations: fix disk space check on dataset1001 - https://phabricator.wikimedia.org/T84148#923388 (Dzahn) Open>Resolved [18:52:06] Operations, Prod-Kubernetes, Kubernetes, User-Joe: Create scaffolding of services templates for deployment in production/staging - https://phabricator.wikimedia.org/T177397#3688484 (akosiaris) [18:52:08] Operations, Prod-Kubernetes, Kubernetes, User-Joe: Design pod-level monitoring and service-level alerting - https://phabricator.wikimedia.org/T177396#3688485 (akosiaris) [18:52:10] Operations, Prod-Kubernetes, Kubernetes, User-Joe: Improve monitoring of the Kubernetes clusters - https://phabricator.wikimedia.org/T177395#3688486 (akosiaris) [18:52:12] Operations, Prod-Kubernetes, Kubernetes, 
User-Joe: Experiment with a TLS proxy/router for pods - https://phabricator.wikimedia.org/T177394#3688487 (akosiaris) [18:52:14] Operations, Kubernetes: Operations 2017-18 Q2 Program 6 umbrella task - https://phabricator.wikimedia.org/T178325#3688483 (akosiaris) [18:52:16] Operations, Prod-Kubernetes, Kubernetes: Implement authentication/authorization in Kubernetes clusters - https://phabricator.wikimedia.org/T177393#3688488 (akosiaris) [18:52:49] Operations: fix disk space check on dataset1001 - https://phabricator.wikimedia.org/T84148#923388 (Dzahn) @Aklapper thanks, resolved :) (and made public, NDA was just because this was an RT import, yea, that old) [18:53:35] Operations, monitoring: fix disk space check on dataset1001 - https://phabricator.wikimedia.org/T84148#3688503 (Dzahn) [18:55:24] (PS2) BBlack: browsersec: bump to 100% 2017-10-17, update translations [puppet] - https://gerrit.wikimedia.org/r/376316 (https://phabricator.wikimedia.org/T163251) [18:55:50] (CR) Rush: [C: 2] openstack: nova manager/osm/wikitech to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384566 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [18:57:28] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688539 (jcrespo) I get, however: ``` ### DELETE FROM `enwiki`.`user_properties` ### WHERE ### @1=[user-id] /* INT meta=0 nullable=...
[18:57:36] Operations, ORES, Patch-For-Review, Scoring-platform-team (Current), User-Ladsgroup: Review and fix file handle management in worker and celery processes - https://phabricator.wikimedia.org/T174402#3688540 (Halfak) a:Ladsgroup>Halfak [18:58:53] Operations, ORES, Patch-For-Review, Scoring-platform-team (Current), and 2 others: Stress/capacity test new ores* cluster - https://phabricator.wikimedia.org/T169246#3688542 (Halfak) a:Ladsgroup>Halfak [18:59:18] Operations, ORES, Graphite, Scoring-platform-team (Current), User-fgiunchedi: Regularly purge old ores graphite metrics - https://phabricator.wikimedia.org/T169969#3688544 (Halfak) a:Halfak [19:00:36] (PS3) BBlack: browsersec: bump to 100% 2017-10-17, update translations [puppet] - https://gerrit.wikimedia.org/r/376316 (https://phabricator.wikimedia.org/T163251) [19:00:38] (PS1) BBlack: ssl_ciphersuite: dump 3DES on 2017-11-17 [puppet] - https://gerrit.wikimedia.org/r/384578 (https://phabricator.wikimedia.org/T147199) [19:03:11] (PS3) Madhuvishy: ssh-key-ldap-lookup: Modify script to run additional checks before ssh login [puppet] - https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) [19:06:05] madhuvishy: I don't think it's a good idea to pile these up in that script [19:06:35] (I'm in the QR currently) [19:06:45] (so may be slow to respond, I mean) [19:07:05] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688554 (jcrespo) Yeah, all modifications of echo-notifications-blacklist value to 0 are DELETES and then inserts, starting from 19:18...
[19:07:14] Operations, Traffic, Patch-For-Review, User-notice: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support) - https://phabricator.wikimedia.org/T147199#3688555 (Pigsonthewing) Sad to see this claim made in 'Tech News' today: > If you use Internet Explorer 8 on Windows XP you can... [19:08:55] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688558 (Catrope) >>! In T178313#3688475, @jcrespo wrote: > @Catrope I need a regex or a grep command or something. Or, I give you the... [19:11:07] "'/usr/sbin/service mjolnir-kafka-daemon .. is masked" on relforge1001. is this intentional? [19:15:08] Operations, Traffic, Patch-For-Review, User-notice: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support) - https://phabricator.wikimedia.org/T147199#3688584 (Johan) @Pigsonthewing Yes, Tech News deals in simplification, for a number of reasons (non-native speakers without access... [19:16:34] Operations, Citoid, RESTBase, RESTBase-API, and 5 others: Set-up Citoid behind RESTBase - https://phabricator.wikimedia.org/T108646#3688587 (mobrovac) [19:19:09] Operations, Anti-Harassment, Collaboration-Team-Triage, Notifications: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688602 (jcrespo) What is your user name on terbium?, I will send all relevant logs to you so you can check it, by yourself as I will...
[19:19:19] (PS1) Rush: openstack: nodepool to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384582 (https://phabricator.wikimedia.org/T171494) [19:23:50] (CR) Rush: "http://puppet-compiler.wmflabs.org/8340/labnodepool1001.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/384582 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [19:24:12] (CR) Rush: [C: 2] openstack: nodepool to module/profile/role [puppet] - https://gerrit.wikimedia.org/r/384582 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [19:26:14] (PS1) Dzahn: admins: (WIP) add all wmde LDAP users [puppet] - https://gerrit.wikimedia.org/r/384584 [19:29:15] Operations, Collaboration-Team-Triage, Notifications, Anti-Harassment (AHT Sprint 7): Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688632 (dbarratt) [19:30:09] (PS1) Rush: openstack: ensure system::role on all role/wmcs [puppet] - https://gerrit.wikimedia.org/r/384585 (https://phabricator.wikimedia.org/T171494) [19:30:36] Operations, Collaboration-Team-Triage, Notifications, Anti-Harassment (AHT Sprint 7): Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688636 (Catrope) >>! In T178313#3688602, @jcrespo wrote: > What is your user name on terbium?, I will send all relevan...
[19:31:03] (CR) Rush: [C: 2] openstack: ensure system::role on all role/wmcs [puppet] - https://gerrit.wikimedia.org/r/384585 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [19:32:25] (PS1) Ottomata: Set up Kafka MirrorMaker from main -> jumbo in eqiad [puppet] - https://gerrit.wikimedia.org/r/384586 (https://phabricator.wikimedia.org/T177216) [19:32:52] (CR) jerkins-bot: [V: -1] Set up Kafka MirrorMaker from main -> jumbo in eqiad [puppet] - https://gerrit.wikimedia.org/r/384586 (https://phabricator.wikimedia.org/T177216) (owner: Ottomata) [19:33:02] (CR) Hoo man: [C: -1] "Should also be enabled on cawiki" [mediawiki-config] - https://gerrit.wikimedia.org/r/384003 (https://phabricator.wikimedia.org/T177155) (owner: Hoo man) [19:34:04] (PS2) Ottomata: Set up Kafka MirrorMaker from main -> jumbo in eqiad [puppet] - https://gerrit.wikimedia.org/r/384586 (https://phabricator.wikimedia.org/T177216) [19:38:26] (PS1) Rush: openstack: reuse canonical pdns values for nova network [puppet] - https://gerrit.wikimedia.org/r/384587 (https://phabricator.wikimedia.org/T171494) [19:43:29] (CR) Rush: [C: 2] openstack: reuse canonical pdns values for nova network [puppet] - https://gerrit.wikimedia.org/r/384587 (https://phabricator.wikimedia.org/T171494) (owner: Rush) [19:46:26] (Draft1) Paladox: Gerrit: Switch to the mariadb connector [puppet] - https://gerrit.wikimedia.org/r/384588 [19:46:28] (PS2) Paladox: Gerrit: Switch to the mariadb connector [puppet] - https://gerrit.wikimedia.org/r/384588 (https://phabricator.wikimedia.org/T176164) [19:46:42] (CR) Paladox: "Sorry that i uploaded this, but would have forgotten :)." 
[puppet] - https://gerrit.wikimedia.org/r/384588 (https://phabricator.wikimedia.org/T176164) (owner: Paladox) [19:48:24] (PS1) Rush: openstack: enable paging on full disk for main labvirt role [puppet] - https://gerrit.wikimedia.org/r/384589 (https://phabricator.wikimedia.org/T175077) [19:50:27] (PS5) Chad: Remove $stdlogo entirely [mediawiki-config] - https://gerrit.wikimedia.org/r/359037 (owner: Reedy) [19:50:38] (PS27) Paladox: Gerrit: Use systemd::service for systemd [puppet] - https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [19:51:52] (PS1) Herron: MX: Change Exim configuration to use letsencrypt certificate [puppet] - https://gerrit.wikimedia.org/r/384591 (https://phabricator.wikimedia.org/T174081) [19:51:59] RoanKattouw uhh... well... it seems to work fine with your patch locally, so maybe it just doesn't work right with just using the debug headers [19:52:28] ha [19:52:37] davidwbarratt: Yeah deferred and job queue stuff usually doesn't work well with that [19:52:56] (CR) Chad: [C: 2] Remove $stdlogo entirely [mediawiki-config] - https://gerrit.wikimedia.org/r/359037 (owner: Reedy) [19:53:09] (PS2) Hoo man: Enable description usage tracking on a few test wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/384003 (https://phabricator.wikimedia.org/T177155) [19:53:11] (PS1) Hoo man: Re-enable Statement usage tracking on cawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/384592 (https://phabricator.wikimedia.org/T151717) [19:53:14] (PS2) Herron: MX: Change Exim configuration to use letsencrypt certificate [puppet] - https://gerrit.wikimedia.org/r/384591 (https://phabricator.wikimedia.org/T174081) [19:53:59] (PS2) Rush: openstack: enable paging on full disk for main labvirt role [puppet] - https://gerrit.wikimedia.org/r/384589 (https://phabricator.wikimedia.org/T175077) [19:54:06] (Merged) jenkins-bot: Remove $stdlogo entirely [mediawiki-config] - 
https://gerrit.wikimedia.org/r/359037 (owner: Reedy) [19:54:20] RoanKattouw hmm, well I'm not getting notifications at all locally, so I should probably fix that first, THEN test it. :) [19:56:13] (CR) jenkins-bot: Remove $stdlogo entirely [mediawiki-config] - https://gerrit.wikimedia.org/r/359037 (owner: Reedy) [19:56:29] !log demon@tin Synchronized tests/cirrusTest.php: Ief0ea01f (duration: 00m 47s) [19:56:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:57:38] !log demon@tin Synchronized wmf-config/InitialiseSettings.php: Ief0ea01f (duration: 00m 46s) [19:57:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:58:02] !log sudo apt install wmf-mariadb101-client on terbium for debugging an issue [19:58:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:59:20] !log demon@tin Synchronized wmf-config/CommonSettings.php: Ief0ea01f (duration: 00m 46s) [19:59:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:00:02] (CR) Rush: "http://puppet-compiler.wmflabs.org/8343/labvirt1001.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/384589 (https://phabricator.wikimedia.org/T175077) (owner: Rush) [20:00:04] gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: How many deployers does it take to do Services – Parsoid / OCG / Citoid / Mobileapps / ORES / … deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T2000). [20:00:04] No GERRIT patches in the queue for this window AFAICS.
[20:00:06] (03CR) 10Rush: [C: 032] openstack: enable paging on full disk for main labvirt role [puppet] - 10https://gerrit.wikimedia.org/r/384589 (https://phabricator.wikimedia.org/T175077) (owner: 10Rush) [20:00:25] no mobileapps deploy today [20:00:37] no ORES today [20:03:28] (03PS6) 10Rush: base/icinga: if on labs, don't page for mysql procs [puppet] - 10https://gerrit.wikimedia.org/r/384183 (https://phabricator.wikimedia.org/T178008) (owner: 10Dzahn) [20:03:59] 10Operations, 10Collaboration-Team-Triage, 10Notifications, 10Anti-Harassment (AHT Sprint 7): Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3688760 (10jcrespo) I have copied to `terbium:/home/catrope/T178313` all relevant logs around the 24 hours of the given d... [20:06:31] 10Operations, 10Puppet, 10DBA: Switch databases to the future parser - https://phabricator.wikimedia.org/T172498#3688779 (10jcrespo) 05Resolved>03Open Open because I didn't get an answer. [20:06:33] 10Operations, 10Puppet, 10Patch-For-Review, 10User-Joe: Switch all hosts to the future parser - https://phabricator.wikimedia.org/T171704#3688781 (10jcrespo) [20:09:07] (03PS2) 10Andrew Bogott: DO NOT MERGE: no-op patch for testing [puppet] - 10https://gerrit.wikimedia.org/r/383942 [20:09:09] (03PS1) 10Andrew Bogott: DO NOT MERGE: testing patch that should break with the future parser [puppet] - 10https://gerrit.wikimedia.org/r/384595 [20:09:14] (03CR) 10Andrew Bogott: [C: 04-2] DO NOT MERGE: testing patch that should break with the future parser [puppet] - 10https://gerrit.wikimedia.org/r/384595 (owner: 10Andrew Bogott) [20:09:45] (03CR) 10jerkins-bot: [V: 04-1] DO NOT MERGE: testing patch that should break with the future parser [puppet] - 10https://gerrit.wikimedia.org/r/384595 (owner: 10Andrew Bogott) [20:11:17] (03PS1) 10RobH: add new ssh key for bob west [puppet] - 10https://gerrit.wikimedia.org/r/384598 (https://phabricator.wikimedia.org/T177889) [20:11:59] (03CR) 10RobH: [C: 032] add 
new ssh key for bob west [puppet] - 10https://gerrit.wikimedia.org/r/384598 (https://phabricator.wikimedia.org/T177889) (owner: 10RobH) [20:12:26] (03Draft1) 10Paladox: Gerrit: Set lfs configuation [puppet] - 10https://gerrit.wikimedia.org/r/384596 (https://phabricator.wikimedia.org/T171758) [20:12:33] (03PS2) 10Paladox: Gerrit: Set lfs configuation [puppet] - 10https://gerrit.wikimedia.org/r/384596 (https://phabricator.wikimedia.org/T171758) [20:13:02] (03CR) 10jerkins-bot: [V: 04-1] Gerrit: Set lfs configuation [puppet] - 10https://gerrit.wikimedia.org/r/384596 (https://phabricator.wikimedia.org/T171758) (owner: 10Paladox) [20:14:17] 10Operations, 10Ops-Access-Requests, 10Research, 10Patch-For-Review: Request public key change for a research fellow - https://phabricator.wikimedia.org/T177889#3688801 (10RobH) a:03Cervisiarius @Cervisiarius: I've left your existing key in place as well for now. ``` - ssh-rsa AAAAB3NzaC1yc2EAAAAB... [20:14:24] (03PS3) 10Paladox: Gerrit: Set lfs configuation [puppet] - 10https://gerrit.wikimedia.org/r/384596 (https://phabricator.wikimedia.org/T171758) [20:14:44] (03PS1) 10Rush: openstack: cleanup for paths and novaconfig usage [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) [20:15:15] (03CR) 10jerkins-bot: [V: 04-1] openstack: cleanup for paths and novaconfig usage [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [20:17:03] no parsoid deploy today [20:17:33] 10Operations, 10Gerrit, 10ORES, 10Scoring-platform-team, and 3 others: Support git-lfs files in gerrit - https://phabricator.wikimedia.org/T171758#3688811 (10Paladox) I've done the configuration. We need to install the plugin. But the good thing is we can install things, and if the configuration is wrong,... 
[20:19:05] (03PS2) 10Rush: openstack: cleanup for paths and novaconfig usage [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) [20:19:30] (03PS3) 10Rush: openstack: cleanup for paths and novaconfig usage [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) [20:19:33] (03CR) 10jerkins-bot: [V: 04-1] openstack: cleanup for paths and novaconfig usage [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [20:19:59] (03CR) 10jerkins-bot: [V: 04-1] openstack: cleanup for paths and novaconfig usage [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [20:20:14] !log demon@tin Pruned MediaWiki: 1.30.0-wmf.18 (duration: 02m 59s) [20:20:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:23:15] 10Operations, 10LDAP-Access-Requests, 10WMF-Legal, 10WMF-NDA-Requests: Request access to logstash (nda group) for @framawiki - https://phabricator.wikimedia.org/T176364#3688824 (10RobH) 05Open>03stalled [20:33:07] (03PS1) 10Ottomata: Small refactor for some kafka classes to ease creation of mirror maker profile [puppet] - 10https://gerrit.wikimedia.org/r/384602 (https://phabricator.wikimedia.org/T177216) [20:34:53] (03CR) 10Rush: "http://puppet-compiler.wmflabs.org/8344/" [puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [20:35:01] (03PS2) 10Ottomata: Small refactor for some kafka classes to ease creation of mirror maker profile [puppet] - 10https://gerrit.wikimedia.org/r/384602 (https://phabricator.wikimedia.org/T177216) [20:35:54] !log disable puppet around cloud things for a refactor merge [20:36:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:36] (03CR) 10Rush: [V: 032 C: 032] "Not wanting to couple this w/ a toolforge refactor to avoid the style -1" 
[puppet] - 10https://gerrit.wikimedia.org/r/384599 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [20:48:06] (03PS3) 10Ottomata: Small refactor for some kafka classes to ease creation of mirror maker profile [puppet] - 10https://gerrit.wikimedia.org/r/384602 (https://phabricator.wikimedia.org/T177216) [20:56:53] (03PS1) 10Ottomata: Add profile::prometheus::jmx_exporter define to DRY up using jmx exporter [puppet] - 10https://gerrit.wikimedia.org/r/384608 [20:57:14] (03PS1) 10Andrew Bogott: add dummy star.tools.wmflabs.org.key [labs/private] - 10https://gerrit.wikimedia.org/r/384609 [20:57:24] (03CR) 10jerkins-bot: [V: 04-1] Add profile::prometheus::jmx_exporter define to DRY up using jmx exporter [puppet] - 10https://gerrit.wikimedia.org/r/384608 (owner: 10Ottomata) [20:57:27] (03CR) 10Andrew Bogott: [V: 032 C: 032] add dummy star.tools.wmflabs.org.key [labs/private] - 10https://gerrit.wikimedia.org/r/384609 (owner: 10Andrew Bogott) [20:58:40] (03PS2) 10Ottomata: Add profile::prometheus::jmx_exporter define to DRY up using jmx exporter [puppet] - 10https://gerrit.wikimedia.org/r/384608 [20:59:09] (03CR) 10jerkins-bot: [V: 04-1] Add profile::prometheus::jmx_exporter define to DRY up using jmx exporter [puppet] - 10https://gerrit.wikimedia.org/r/384608 (owner: 10Ottomata) [21:00:04] dapatrick, bawolff, and Reedy: Your horoscope predicts another unfortunate Weekly Security deployment window deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T2100). [21:00:04] No GERRIT patches in the queue for this window AFAICS. [21:00:13] (03CR) 10Ottomata: "Mostly no-op https://puppet-compiler.wmflabs.org/compiler02/8347/, should be safe." 
[puppet] - 10https://gerrit.wikimedia.org/r/384602 (https://phabricator.wikimedia.org/T177216) (owner: 10Ottomata) [21:02:28] 10Operations, 10Traffic, 10Patch-For-Review, 10User-notice: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support) - https://phabricator.wikimedia.org/T147199#3688928 (10Pigsonthewing) > simplification... at the cost of precision > "you" instead of "you or someone with administrator acces... [21:02:42] (03PS3) 10Ottomata: Add profile::prometheus::jmx_exporter define to DRY up using jmx exporter [puppet] - 10https://gerrit.wikimedia.org/r/384608 [21:03:00] (03PS4) 10Ottomata: Add profile::prometheus::jmx_exporter define to DRY up using jmx exporter [puppet] - 10https://gerrit.wikimedia.org/r/384608 [21:09:18] (03CR) 10Ottomata: "It was getting pretty cumbersome to use jmx_exporter! (I'm doing more with it for Kafka MirrorMaker). This helps a bit." [puppet] - 10https://gerrit.wikimedia.org/r/384608 (owner: 10Ottomata) [21:12:37] PROBLEM - HHVM rendering on mw1298 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:13:17] (03PS1) 10Andrew Bogott: update fake star.tools.wmflabs.org.key to be a bit more convincing [labs/private] - 10https://gerrit.wikimedia.org/r/384613 [21:13:25] (03CR) 10Andrew Bogott: [V: 032 C: 032] update fake star.tools.wmflabs.org.key to be a bit more convincing [labs/private] - 10https://gerrit.wikimedia.org/r/384613 (owner: 10Andrew Bogott) [21:13:27] RECOVERY - HHVM rendering on mw1298 is OK: HTTP OK: HTTP/1.1 200 OK - 78751 bytes in 0.139 second response time [21:17:54] (03PS4) 10Rush: ssh-key-ldap-lookup: Modify script to run additional checks before ssh login [puppet] - 10https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) (owner: 10Madhuvishy) [21:19:04] (03PS2) 10Dzahn: admins: (WIP) add all wmde LDAP users [puppet] - 10https://gerrit.wikimedia.org/r/384584 [21:19:57] (03PS3) 10Dzahn: admins: (WIP) add all wmde LDAP users [puppet] - 
10https://gerrit.wikimedia.org/r/384584 [21:22:53] (03PS6) 10BBlack: Rework stats further [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384434 [21:22:55] (03PS6) 10BBlack: link against jemalloc [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/384435 [21:22:57] (03PS8) 10BBlack: Release 0.1.0 [software/varnish/vhtcpd] - 10https://gerrit.wikimedia.org/r/382873 [21:24:11] (03CR) 10Rush: [C: 032] ssh-key-ldap-lookup: Modify script to run additional checks before ssh login (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) (owner: 10Madhuvishy) [21:24:19] (03CR) 10Rush: [C: 031] ssh-key-ldap-lookup: Modify script to run additional checks before ssh login [puppet] - 10https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) (owner: 10Madhuvishy) [21:26:30] (03PS4) 10Dzahn: admins: (WIP) add missing wmde LDAP users [puppet] - 10https://gerrit.wikimedia.org/r/384584 [21:29:56] (03CR) 10Bearloga: "Fixes coming in next patchset…" (034 comments) [puppet] - 10https://gerrit.wikimedia.org/r/383916 (https://phabricator.wikimedia.org/T178096) (owner: 10Bearloga) [21:35:16] (03CR) 10Rush: ssh-key-ldap-lookup: Modify script to run additional checks before ssh login [puppet] - 10https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) (owner: 10Madhuvishy) [21:38:37] (03PS28) 10Paladox: Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [21:46:09] (03PS7) 10Bearloga: Add profiles/roles for stats/ML on Wikimedia Cloud [puppet] - 10https://gerrit.wikimedia.org/r/383916 (https://phabricator.wikimedia.org/T178096) [21:46:38] (03CR) 10jerkins-bot: [V: 04-1] Add profiles/roles for stats/ML on Wikimedia Cloud [puppet] - 10https://gerrit.wikimedia.org/r/383916 (https://phabricator.wikimedia.org/T178096) (owner: 10Bearloga) [21:52:24] 10Operations, 
10Collaboration-Team-Triage, 10Notifications, 10Anti-Harassment (AHT Sprint 7): Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3689065 (10dbarratt) @Catrope This is really bizarre... the maintenance script was already setup to handle being run mul... [21:56:15] 10Operations, 10Collaboration-Team-Triage, 10Notifications, 10Anti-Harassment (AHT Sprint 7): Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3689076 (10dbarratt) [21:58:41] (03PS1) 10Volans: Backends: add support to external backends plugins [software/cumin] - 10https://gerrit.wikimedia.org/r/384616 (https://phabricator.wikimedia.org/T178342) [22:00:10] (03CR) 10EBernhardson: [C: 031] [cirrus] Fix typo in UserTesting config var [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384529 (https://phabricator.wikimedia.org/T177502) (owner: 10DCausse) [22:03:07] (03PS8) 10Bearloga: Add profiles/roles for stats/ML on Wikimedia Cloud [puppet] - 10https://gerrit.wikimedia.org/r/383916 (https://phabricator.wikimedia.org/T178096) [22:04:33] (03CR) 10Faidon Liambotis: [C: 04-2] "See inline for specific comments, but more generally, I think using ssh-key-ldap-lookup for this purpose is a fundamentally flawed idea. I" (034 comments) [puppet] - 10https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) (owner: 10Madhuvishy) [22:07:20] Is CentralIdLookup not available in maintenance scripts? [22:07:33] (03CR) 10Chad: [C: 031] "Maybe it'll finally work." 
(031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [22:07:34] I mean it works locally [22:07:44] but it failed in the cluster (returned an array of 0's) [22:09:52] (03PS29) 10Paladox: Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [22:09:55] (03CR) 10Paladox: Gerrit: Use systemd::service for systemd (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [22:10:37] everything available in mediawiki should be available from a maintenance script [22:10:53] (03Draft1) 10Paladox: Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 [22:10:58] (03PS2) 10Paladox: Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 [22:11:37] davidwbarratt: Keep in mind, wmf wikis have a significantly different user config due to CentralAuth than a vanilla mediawiki install [22:11:51] if you're encountering differences locally, my first guess would be its related to that [22:11:52] (03CR) 10Paladox: "This will https://gerrit.wikimedia.org/r/#/c/384617/ also need to be merged at the same time. 
This limit is already set in the systemd fil" [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [22:12:02] (03CR) 10Rush: "moot point now but I meant a 0:)" [puppet] - 10https://gerrit.wikimedia.org/r/384574 (https://phabricator.wikimedia.org/T171508) (owner: 10Madhuvishy) [22:17:18] bawolff yeah I figured that was the case, obviously there isn't any central auth db locally [22:17:52] It's certainly possible to set up central auth locally [22:18:24] It's less of a pain than it sounds (certainly much easier than some of the other interwiki extensions like GlobalUsage or Wikidata) [22:19:27] 10Operations, 10Collaboration-Team-Triage, 10Notifications, 10Anti-Harassment (AHT Sprint 7), 10Patch-For-Review: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3689122 (10dbarratt) >>! In T178313#3689118, @gerritbot wrote: > Change 384618 had a related patch... [22:20:46] 10Operations, 10Collaboration-Team-Triage, 10Notifications, 10Anti-Harassment (AHT Sprint 7), 10Patch-For-Review: Recover Echo Notification Blacklist from Backup - https://phabricator.wikimedia.org/T178313#3689131 (10dbarratt) @jcrespo would it be possible to run the script on the recovered data as a tes...
[22:24:15] (03CR) 10Dzahn: [C: 031] "i started alphabetically but then noticed we already append users at the bottom and the script shouldn't care :)" [puppet] - 10https://gerrit.wikimedia.org/r/384584 (owner: 10Dzahn) [22:24:34] (03PS5) 10Dzahn: admins: add missing wmde LDAP users [puppet] - 10https://gerrit.wikimedia.org/r/384584 [22:32:02] (03PS6) 10Dzahn: admins: add missing wmde LDAP users [puppet] - 10https://gerrit.wikimedia.org/r/384584 [22:34:15] (03PS1) 10Rush: openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [22:34:42] (03CR) 10jerkins-bot: [V: 04-1] openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [22:34:56] (03CR) 10Huji: "Yeah, I don't know if we explicitly have that in the documentation, but a situation in which rangeblock is enabled and block is not is har" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/384252 (https://phabricator.wikimedia.org/T178227) (owner: 10Ladsgroup) [22:44:27] 10Operations, 10Edit-Review-Improvements, 10Collaboration-Feature-Rollouts (Collaboration-WL-Graduated-Everywhere), 10Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017), 10Performance: Systematically test load speeds of Watchlist and Recent Changes - https://phabricator.wikimedia.org/T176445#3625823 (... [22:47:40] (03CR) 10Chad: "In which case let's raise it first, then do this change after to make it configurable. 
Either way, can we pull it out of the systemd chang" [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [22:48:05] (03CR) 10Paladox: "> In which case let's raise it first, then do this change after to" [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [22:50:47] (03CR) 10Paladox: "> In which case let's raise it first, then do this change after to" [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [22:51:21] (03Draft1) 10Paladox: Gerrit: Make packedGitOpenFiles configurable with puppet variable [puppet] - 10https://gerrit.wikimedia.org/r/384623 [22:51:23] (03Draft2) 10Paladox: Gerrit: Make packedGitOpenFiles configurable with puppet variable [puppet] - 10https://gerrit.wikimedia.org/r/384623 [22:52:43] (03PS3) 10Paladox: Gerrit: Make packedGitOpenFiles configurable with puppet variable [puppet] - 10https://gerrit.wikimedia.org/r/384623 [22:52:45] (03PS30) 10Paladox: Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [22:52:46] (03PS3) 10Paladox: Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 [22:53:44] (03PS2) 10Rush: openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [22:54:21] (03CR) 10jerkins-bot: [V: 04-1] openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [22:56:30] (03CR) 10Chad: [C: 031] "This can land right away, won't even need a service restart" [puppet] - 10https://gerrit.wikimedia.org/r/384623 (owner: 10Paladox) [22:57:08] (03PS3) 10Rush: openstack: dns-floating-ip-updater to module/profile/role [puppet] - 
10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [22:57:17] (03CR) 10Chad: [C: 031] "Needs service restart, but is ok to go." [puppet] - 10https://gerrit.wikimedia.org/r/384617 (owner: 10Paladox) [22:57:24] (03PS4) 10Rush: openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [22:57:55] (03CR) 10jerkins-bot: [V: 04-1] openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [22:59:32] (03PS5) 10Rush: openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [22:59:50] (03PS6) 10Rush: openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [23:00:05] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: How many deployers does it take to do Evening SWAT (Max 8 patches) deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171016T2300). [23:00:05] davidwbarratt: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [23:00:17] here! [23:00:20] (03CR) 10jerkins-bot: [V: 04-1] openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [23:00:26] and DMaza as a backup! 
[23:02:27] (03PS7) 10Rush: WIP openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [23:02:57] (03CR) 10jerkins-bot: [V: 04-1] WIP openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [23:02:59] (03PS8) 10Rush: WIP openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) [23:03:26] (03CR) 10jerkins-bot: [V: 04-1] WIP openstack: dns-floating-ip-updater to module/profile/role [puppet] - 10https://gerrit.wikimedia.org/r/384620 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush) [23:04:29] I can SWAT [23:04:44] thcipriani ++ [23:05:04] (03PS4) 10Paladox: Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) [23:05:22] (03CR) 10Dzahn: "confirmed the existing system unit file in prod already has the "20000" value for LimitNOFILE. as long as LimitNOFILE is the same as "git_" [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) (owner: 10Paladox) [23:05:53] davidwbarratt: so am I unreverting this patch? it's ok now? 
[23:06:15] thcipriani yes, I realized I was testing on a User Talk page, which is excluded from the Echo blacklist [23:06:26] sorry about that [23:06:39] ahh, ok, np, better safe than sorry :) [23:07:51] (03CR) 10Dzahn: [C: 031] ""# NOFILE : GERRIT_FDS, determined by "core.packedGitOpenFiles" in the script"" [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) (owner: 10Paladox) [23:09:27] * Melos is here [23:10:30] (03CR) 10Dzahn: [C: 032] Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) (owner: 10Paladox) [23:10:54] (03CR) 10Dzahn: [C: 031] "@paladox can't merge because dependency :)" [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) (owner: 10Paladox) [23:10:59] thanks :) [23:11:17] paladox: well, downgraded 2 to 1, heh [23:12:25] heh [23:14:22] davidwbarratt: patch is live on mwdebug1002, check please [23:14:53] (03CR) 10Dzahn: [C: 032] Gerrit: Make packedGitOpenFiles configurable with puppet variable [puppet] - 10https://gerrit.wikimedia.org/r/384623 (owner: 10Paladox) [23:15:08] no_justification, I wonder, should we do this https://gerrit.wikimedia.org/r/#/c/384617/ before the systemd change? [23:16:06] Who is available for SWAT https://gerrit.wikimedia.org/r/#/c/384622/ so I can test it?
[23:16:09] paladox: yea :) probably [23:16:24] ok thanks will change it now [23:16:25] :) [23:17:04] paladox: ok, the first was no-op [23:17:08] :) [23:17:19] (03PS5) 10Paladox: Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) [23:17:21] (03PS31) 10Paladox: Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [23:17:24] mutante done ^^ [23:17:25] :) [23:17:57] (03CR) 10jerkins-bot: [V: 04-1] Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) (owner: 10Paladox) [23:18:25] (03CR) 10Dzahn: [C: 032] Gerrit: Raise git_open_file to 20000 [puppet] - 10https://gerrit.wikimedia.org/r/384617 (https://phabricator.wikimedia.org/T168360) (owner: 10Paladox) [23:18:42] thanks :) [23:18:56] jerkins isnt happy about the last one [23:19:22] (03PS32) 10Paladox: Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [23:19:49] 23:17:55 ProtocolError: ("Connection broken: error(104, 'Connection reset by peer')", error(104, 'Connection reset by peer')) [23:20:01] works now :) [23:20:05] paladox: where [23:20:12] https://integration.wikimedia.org/ci/job/operations-puppet-tests-docker/7406/console [23:20:20] ah [23:20:26] checking [23:21:11] paladox: it downloaded new setuptools or something [23:21:15] right before that [23:21:15] thcipriani seems to work! [23:21:23] davidwbarratt: ok, going live :) [23:21:25] hmm yep [23:22:09] thcipriani great! 
[23:24:40] (03PS33) 10Paladox: Gerrit: Use systemd::service for systemd [puppet] - 10https://gerrit.wikimedia.org/r/378768 (https://phabricator.wikimedia.org/T157414) [23:25:15] !log thcipriani@tin Synchronized php-1.31.0-wmf.3/extensions/Echo/includes/ContainmentSet.php: SWAT: [[gerrit:384537|ContainmentSet: Use strict comparison for array_search()]] T177825 (duration: 01m 45s) [23:25:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:25:23] T177825: No on-site Notifications received for certain type of actions - https://phabricator.wikimedia.org/T177825 [23:25:24] ^ davidwbarratt live now [23:25:27] yay! [23:27:42] thcipriani: Are you available for SWAT https://gerrit.wikimedia.org/r/#/c/384622/ so I can test it? [23:28:18] Melos: oh, sure, sorry didn't see that patch initially :) [23:36:01] jouncebot: refresh [23:36:05] I refreshed my knowledge about deployments. [23:39:34] Melos: your change is live on mwdebug1002, check please [23:39:54] ok, testing... [23:41:11] (03Abandoned) 10Paladox: Add branch field to .gitmodules [puppet] - 10https://gerrit.wikimedia.org/r/384307 (owner: 10Paladox) [23:42:34] (03CR) 10Dzahn: "do you guys know if all the openstack stats that you need have been already moved to grafana?" [puppet] - 10https://gerrit.wikimedia.org/r/382917 (https://phabricator.wikimedia.org/T177225) (owner: 10Dzahn) [23:44:43] thcipriani: it works :) [23:44:52] Melos: ok, deploying live :) [23:45:03] thank you! 
[23:46:39] !log increase elasticsearch log levels for transport to debug to help track down T167410 [23:46:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:46:46] T167410: ApiQuerySearch.php: Call to a member function termMatches() on a non-object (boolean) - https://phabricator.wikimedia.org/T167410 [23:49:28] !log thcipriani@tin Synchronized php-1.31.0-wmf.3/extensions/CheckUser/specials/SpecialCheckUser.php: SWAT: [[gerrit:384622|Revert "Revert "Restore checkuser-userlinks when exist""]] T170507 (duration: 00m 46s) [23:49:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:49:35] ^ Melos live everywhere now [23:49:35] T170507: CheckUser "contributions" link should be a red link for non-existent accounts - https://phabricator.wikimedia.org/T170507 [23:50:45] thank you again :)