[00:00:04] twentyafterfour: My dear minions, it's time we take the moon! Just kidding. Time for Phabricator update deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T0000). [02:24:36] !log l10nupdate@deploy1001 scap sync-l10n completed (1.32.0-wmf.12) (duration: 10m 41s) [02:24:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:33:40] (03PS1) 10Krinkle: Remove unused $wgEventLoggingSchemaIndexUri [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446711 [02:47:43] 10Operations, 10Phabricator: unable to subscribe to operations tag after migration and merge from ops-core and ops-request - https://phabricator.wikimedia.org/T89053 (10chasemp) [02:55:48] (03PS1) 10Rush: phab: changes to files in redirector.pp should restart apache [puppet] - 10https://gerrit.wikimedia.org/r/446712 (https://phabricator.wikimedia.org/T199741) [02:56:22] (03CR) 10jerkins-bot: [V: 04-1] phab: changes to files in redirector.pp should restart apache [puppet] - 10https://gerrit.wikimedia.org/r/446712 (https://phabricator.wikimedia.org/T199741) (owner: 10Rush) [02:58:44] !log l10nupdate@deploy1001 scap sync-l10n completed (1.32.0-wmf.13) (duration: 15m 05s) [02:58:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:00:08] (03CR) 10Rush: "http://puppet-compiler.wmflabs.org/11809/" [puppet] - 10https://gerrit.wikimedia.org/r/446712 (https://phabricator.wikimedia.org/T199741) (owner: 10Rush) [03:01:37] (03PS2) 10Rush: phab: changes to files in redirector.pp should restart apache [puppet] - 10https://gerrit.wikimedia.org/r/446712 (https://phabricator.wikimedia.org/T199741) [03:03:19] (03CR) 10Rush: [C: 032] phab: changes to files in redirector.pp should restart apache [puppet] - 10https://gerrit.wikimedia.org/r/446712 (https://phabricator.wikimedia.org/T199741) (owner: 10Rush) [03:09:19] !log l10nupdate@deploy1001 ResourceLoader cache refresh completed at Thu Jul 19 03:09:19 UTC 2018 (duration 10m 35s) 
[03:09:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:24:42] (03Abandoned) 10Rush: nfs-mounts: per cluster definitions for mounts [puppet] - 10https://gerrit.wikimedia.org/r/345631 (https://phabricator.wikimedia.org/T158883) (owner: 10Rush) [03:28:10] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 844.97 seconds [03:36:24] (03PS1) 10Rush: labstore: notes for nfs-manage [puppet] - 10https://gerrit.wikimedia.org/r/446715 (https://phabricator.wikimedia.org/T169570) [03:39:11] (03CR) 10Rush: [C: 032] labstore: notes for nfs-manage [puppet] - 10https://gerrit.wikimedia.org/r/446715 (https://phabricator.wikimedia.org/T169570) (owner: 10Rush) [03:43:59] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 269.80 seconds [03:50:44] (03CR) 10Krinkle: [C: 032] Remove unused $wgEventLoggingSchemaIndexUri [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446711 (owner: 10Krinkle) [03:51:09] * Krinkle staging on mwdeploy1001/mwdebug1002 [03:52:33] (03Merged) 10jenkins-bot: Remove unused $wgEventLoggingSchemaIndexUri [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446711 (owner: 10Krinkle) [03:52:51] (03CR) 10jenkins-bot: Remove unused $wgEventLoggingSchemaIndexUri [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446711 (owner: 10Krinkle) [03:57:26] !log krinkle@deploy1001 Synchronized wmf-config/CommonSettings.php: I978f3eda8 - Farewell temporary hack from 2013 (duration: 00m 56s) [03:57:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:00:04] kart_: #bothumor Q:Why did functions stop calling each other? A:They had arguments. Rise for ContentTranslation Draft Purge . (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T0400). [04:00:12] * Krinkle unlocks deploy [04:00:58] here. 
[04:05:54] !log Starting ContentTranslation draft purge script on mwmaint1001, dry run first and then --really [04:05:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:12:17] !log Finished ContentTranslation draft purge script on mwmaint1001 (T199658) [04:12:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:12:21] T199658: Second manual run of unpublished draft purge script - https://phabricator.wikimedia.org/T199658 [04:12:36] (with some new bug though, will report to relevant extension) [04:16:18] (03PS1) 10Andrew Bogott: Neutron nova.conf: notification_driver = messagingv2 [puppet] - 10https://gerrit.wikimedia.org/r/446716 [04:17:07] (03CR) 10Andrew Bogott: [C: 032] Neutron nova.conf: notification_driver = messagingv2 [puppet] - 10https://gerrit.wikimedia.org/r/446716 (owner: 10Andrew Bogott) [04:43:18] (03PS2) 10KartikMistry: apertium-kaz-tat: Fix Build-Dep [debs/contenttranslation/apertium-kaz-tat] - 10https://gerrit.wikimedia.org/r/446525 (https://phabricator.wikimedia.org/T199962) [04:52:51] (03PS1) 10KartikMistry: apertium-eo-fr: Use --no-parallel flag [debs/contenttranslation/apertium-eo-fr] - 10https://gerrit.wikimedia.org/r/446717 (https://phabricator.wikimedia.org/T199962) [05:06:05] (03PS1) 10Marostegui: db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446718 [05:08:43] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446718 (owner: 10Marostegui) [05:10:22] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446718 (owner: 10Marostegui) [05:10:56] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446718 (owner: 10Marostegui) [05:11:37] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 for MySQL upgrade (duration: 00m 56s) [05:11:40]
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:14:21] !log Stop MySQL on db1103 to upgrade MySQL [05:14:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:20:12] (03PS1) 10Marostegui: db-eqiad.php: Slowly repool db1103 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446721 [05:24:34] (03PS2) 10Marostegui: db-eqiad.php: Slowly repool db1103 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446721 [05:26:44] (03PS2) 10Tim Starling: Make "sql wikishared" work again [puppet] - 10https://gerrit.wikimedia.org/r/446530 (https://phabricator.wikimedia.org/T199152) [05:27:12] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Slowly repool db1103 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446721 (owner: 10Marostegui) [05:28:26] (03CR) 10Tim Starling: [C: 032] Make "sql wikishared" work again [puppet] - 10https://gerrit.wikimedia.org/r/446530 (https://phabricator.wikimedia.org/T199152) (owner: 10Tim Starling) [05:28:48] (03Merged) 10jenkins-bot: db-eqiad.php: Slowly repool db1103 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446721 (owner: 10Marostegui) [05:29:04] (03CR) 10jenkins-bot: db-eqiad.php: Slowly repool db1103 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446721 (owner: 10Marostegui) [05:30:07] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Slowly repool db1103:3312 after MySQL upgrade (duration: 00m 54s) [05:30:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:38:45] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446722 [05:42:02] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446722 (owner: 10Marostegui) [05:43:41] (03Merged) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446722 (owner: 10Marostegui) 
[05:44:57] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3312 after MySQL upgrade (duration: 00m 55s) [05:44:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:48:10] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446722 (owner: 10Marostegui) [05:49:07] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446723 [05:52:39] PROBLEM - Disk space on elastic1018 is CRITICAL: DISK CRITICAL - free space: /srv 50445 MB (10% inode=99%) [05:55:01] gehel,dcausse --^ [05:57:30] 10Operations, 10Core-Platform-Team, 10MediaWiki-Maintenance-scripts: sql enwik gives a poor error message when db doesn't exist - https://phabricator.wikimedia.org/T199008 (10tstarling) Poor but specific. The old script was just guessing when it said "Error looking up DB", it would have said that if the PHP... 
[05:57:40] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446723 (owner: 10Marostegui) [05:59:12] (03Merged) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446723 (owner: 10Marostegui) [05:59:28] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446723 (owner: 10Marostegui) [06:00:30] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3312 after MySQL upgrade (duration: 00m 54s) [06:00:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:01:26] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446724 [06:03:59] PROBLEM - Disk space on elastic1018 is CRITICAL: DISK CRITICAL - free space: /srv 52285 MB (10% inode=99%) [06:08:41] so /srv is now /dev/md2 493G 420G 49G 90% /srv [06:08:58] I think it is bouncing as it happens sometimes between the partition's limits [06:09:01] should be fine [06:09:40] RECOVERY - Disk space on elastic1018 is OK: DISK OK [06:25:10] 10Operations, 10Core-Platform-Team, 10WMF-JobQueue, 10Services (designing), and 2 others: Exception "Job queue is read-only" - https://phabricator.wikimedia.org/T199594 (10tstarling) To serve read traffic correctly, $wgReadOnly needs to be false. $wgReadOnly is mostly a UI-layer concept which shows some i... 
[06:38:58] (03PS3) 10Muehlenhoff: Install libvips-tools on mw app servers [puppet] - 10https://gerrit.wikimedia.org/r/446640 (https://phabricator.wikimedia.org/T199938) (owner: 10Reedy) [06:40:21] (03PS4) 10Muehlenhoff: Install libvips-tools on mw app servers [puppet] - 10https://gerrit.wikimedia.org/r/446640 (https://phabricator.wikimedia.org/T199938) (owner: 10Reedy) [06:41:27] (03CR) 10Muehlenhoff: [C: 032] Install libvips-tools on mw app servers [puppet] - 10https://gerrit.wikimedia.org/r/446640 (https://phabricator.wikimedia.org/T199938) (owner: 10Reedy) [06:47:19] 10Operations, 10MediaWiki-extensions-VipsScaler, 10Multimedia, 10Patch-For-Review, 10Wikimedia-log-errors: VipsScaler broken for MediaWiki production (/usr/bin/vips: No such file) - https://phabricator.wikimedia.org/T199938 (10MoritzMuehlenhoff) 05Open>03Resolved a:03MoritzMuehlenhoff vips is now d... [06:47:56] (03PS1) 10Muehlenhoff: Switch mediawiki::packages to require_package [puppet] - 10https://gerrit.wikimedia.org/r/446734 [06:54:01] PROBLEM - puppet last run on labweb1001 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [06:57:46] Reedy: ^ I think that's from your last change "Duplicate declaration: Package[libvips-tools] is already declared in file /etc/puppet/modules/mediawiki/manifests/packages.pp:10; cannot redeclare at /etc/puppet/modules/profile/manifests/openstack/base/wikitech/web.pp:18 at /etc/puppet/modules/profile/manifests/openstack/base/wikitech/web.pp:18:5 on node labweb1001.wikimedia.org" [06:59:55] 10Operations, 10netops, 10Patch-For-Review, 10Wikimedia-Incident: Juniper HA audit - https://phabricator.wikimedia.org/T191667 (10Krinkle) [07:01:02] 10Operations, 10Traffic, 10Wikimedia-Incident: Investigate 2018-04-10 global traffic drop - https://phabricator.wikimedia.org/T191940 (10Krinkle) 05Open>03Resolved [07:01:03] 10Operations, 10Traffic, 10Wikimedia-Incident: Investigate 2018-04-10 global traffic drop - https://phabricator.wikimedia.org/T191940 (10Krinkle) [07:01:14] 10Operations, 10Traffic, 10Wikimedia-Incident: Investigate 2018-04-10 global traffic drop - https://phabricator.wikimedia.org/T191940 (10Krinkle) [07:01:33] (03PS1) 10Rush: wikitech: remove libvips-tools as it is now on all app servers [puppet] - 10https://gerrit.wikimedia.org/r/446735 (https://phabricator.wikimedia.org/T199938) [07:03:53] (03PS1) 10Muehlenhoff: Fix some copy&paste errors in help texts [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446736 [07:04:23] (03CR) 10Krinkle: wikitech: remove libvips-tools as it is now on all app servers (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/446735 (https://phabricator.wikimedia.org/T199938) (owner: 10Rush) [07:04:31] PROBLEM - puppet last run on labweb1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [07:05:32] ACKNOWLEDGEMENT - puppet last run on labweb1001 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues cpettet https://gerrit.wikimedia.org/r/c/operations/puppet/+/446735 [07:05:33] ACKNOWLEDGEMENT - puppet last run on labweb1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues cpettet https://gerrit.wikimedia.org/r/c/operations/puppet/+/446735 [07:07:51] (03CR) 10Muehlenhoff: [C: 031] "Looks fine, but I'd recommend to also switch the others to require_package(), which would have prevented this puppet failure in the first " (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/446735 (https://phabricator.wikimedia.org/T199938) (owner: 10Rush) [07:09:09] PROBLEM - puppet last run on labtestweb2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [07:10:14] (03CR) 10Muehlenhoff: [C: 032] Fix some copy&paste errors in help texts [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446736 (owner: 10Muehlenhoff) [07:10:38] (03CR) 10Muehlenhoff: [V: 032 C: 032] Fix some copy&paste errors in help texts [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446736 (owner: 10Muehlenhoff) [07:12:31] 10Operations, 10MediaWiki-extensions-VipsScaler, 10Multimedia, 10Patch-For-Review, 10Wikimedia-log-errors: VipsScaler broken for MediaWiki production (/usr/bin/vips: No such file) - https://phabricator.wikimedia.org/T199938 (10Krinkle) [07:12:34] 10Operations, 10MediaWiki-extensions-VipsScaler, 10Multimedia, 10Patch-For-Review, 10Wikimedia-log-errors: VipsScaler broken for MediaWiki production (/usr/bin/vips: No such file) - https://phabricator.wikimedia.org/T199938 (10Krinkle) [07:13:21] (03PS1) 10Muehlenhoff: Fix some copy&paste errors in help texts [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446737 [07:14:09] (03CR) 10Muehlenhoff: [V: 032 C: 032] Fix some copy&paste errors in help texts [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446737 (owner: 10Muehlenhoff) [07:14:40] (03Abandoned) 10Muehlenhoff: Fix some copy&paste 
errors in help texts [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446736 (owner: 10Muehlenhoff) [07:15:31] (03CR) 10Muehlenhoff: [C: 032] Switch mediawiki::packages to require_package [puppet] - 10https://gerrit.wikimedia.org/r/446734 (owner: 10Muehlenhoff) [07:26:03] !log removing cdrom and sr_mod (related module) kernel modules across the fleet (following the blacklisting of the kernel module via puppet) [07:26:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:34:40] PROBLEM - Disk space on elastic1020 is CRITICAL: DISK CRITICAL - free space: /srv 51465 MB (10% inode=99%) [07:40:41] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446724 (owner: 10Marostegui) [07:42:23] (03Merged) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446724 (owner: 10Marostegui) [07:42:39] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446724 (owner: 10Marostegui) [07:43:47] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3312 after MySQL upgrade (duration: 00m 55s) [07:43:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:44:32] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446740 [07:44:50] (03PS4) 10Jcrespo: mariadb: Depool es1019 for reimage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446553 (https://phabricator.wikimedia.org/T197073) [07:45:09] jynus: you go first :) [07:45:26] (03CR) 10Jcrespo: [C: 032] mariadb: Depool es1019 for reimage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446553 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [07:46:49] (03PS1) 10Volans: DataTables: force session save [software/debmonitor] - 
10https://gerrit.wikimedia.org/r/446741 [07:46:55] (03CR) 10jenkins-bot: mariadb: Depool es1019 for reimage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446553 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [07:47:42] (03PS2) 10Volans: wmf-auto-reimage: use system URLs for Apache test [puppet] - 10https://gerrit.wikimedia.org/r/446346 [07:48:10] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: Depool es1019 (duration: 00m 54s) [07:48:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:48:14] (03CR) 10jerkins-bot: [V: 04-1] DataTables: force session save [software/debmonitor] - 10https://gerrit.wikimedia.org/r/446741 (owner: 10Volans) [07:48:47] (03PS2) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446740 [07:48:51] (03PS2) 10Elukey: EventStreams: Consume messages from the local Kafka cluster [puppet] - 10https://gerrit.wikimedia.org/r/446556 (https://phabricator.wikimedia.org/T199813) (owner: 10Mobrovac) [07:50:37] !log stop es1019 and reimage it [07:50:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:50:41] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446740 (owner: 10Marostegui) [07:52:22] (03Merged) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446740 (owner: 10Marostegui) [07:52:35] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446740 (owner: 10Marostegui) [07:52:57] (03CR) 10Elukey: [C: 032] "After a long chat with Marko and several attempts to figure out what's happening, we decided to remove the only difference between ES in c" [puppet] - 10https://gerrit.wikimedia.org/r/446556 (https://phabricator.wikimedia.org/T199813) (owner: 10Mobrovac) [07:54:17] (03PS1) 
10Marostegui: db-eqiad.php: Tackle weights for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446742 [07:54:58] !log Deploy schema change on s2 codfw masters (db2035) this will generate lag on s2 codfw T144010 T51190 T199368 [07:55:00] RECOVERY - Disk space on elastic1020 is OK: DISK OK [07:55:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:55:03] T51190: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190 [07:55:04] T144010: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 [07:55:04] T199368: Convert UNIQUE INDEX to PK in Production - https://phabricator.wikimedia.org/T199368 [07:55:40] marostegui: I'm about to deploy cxserver change, OK to do now? [07:55:48] kart_: give me a sec [07:55:52] OK [07:55:58] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Tackle weights for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446742 (owner: 10Marostegui) [07:56:03] just need to deploy this ^ [07:56:05] should be a minute :) [07:56:11] Sure :) [07:56:54] (03CR) 10Muehlenhoff: [C: 031] "Looks good" [software/debmonitor] - 10https://gerrit.wikimedia.org/r/446741 (owner: 10Volans) [07:57:34] (03Merged) 10jenkins-bot: db-eqiad.php: Tackle weights for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446742 (owner: 10Marostegui) [07:58:40] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3314 after MySQL upgrade (duration: 00m 54s) [07:58:41] kart_: all done - thank you! [07:58:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:00:00] (03PS1) 10Hashar: contint: libvips-tool is provided by mediawiki::packages [puppet] - 10https://gerrit.wikimedia.org/r/446743 [08:00:19] marostegui: cool. 
[08:00:27] !log kartik@deploy1001 Started deploy [cxserver/deploy@fe0d521]: Update cxserver to 92e3d2e (T199380, T199885) [08:00:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:00:32] T199885: Js error "Cannot read property 'stack' of undefined" from cxserver - https://phabricator.wikimedia.org/T199885 [08:00:35] T199380: [wmf.12] ⧼cx-error-page-not-found⧽ for translating from jawiki - https://phabricator.wikimedia.org/T199380 [08:01:59] (03CR) 10jenkins-bot: db-eqiad.php: Tackle weights for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446742 (owner: 10Marostegui) [08:04:25] !log kartik@deploy1001 Finished deploy [cxserver/deploy@fe0d521]: Update cxserver to 92e3d2e (T199380, T199885) (duration: 03m 58s) [08:04:26] (03CR) 10Hashar: [V: 031] "I have cherry picked it on the CI puppetmaster and that unbroke the duplicate package definition on integration-slave-jessie-1001.integrat" [puppet] - 10https://gerrit.wikimedia.org/r/446743 (owner: 10Hashar) [08:04:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:05:57] (03PS3) 10Volans: wmf-auto-reimage: use system URLs for Apache test [puppet] - 10https://gerrit.wikimedia.org/r/446346 [08:08:33] !log restart eventstreams on scb2* hosts to pick up new Kafka settings (pointing it to main-codfw) - T199813 [08:08:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:08:36] T199813: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 [08:10:47] (03CR) 10Volans: [C: 032] wmf-auto-reimage: use system URLs for Apache test [puppet] - 10https://gerrit.wikimedia.org/r/446346 (owner: 10Volans) [08:17:28] 10Operations, 10Performance-Team, 10vm-requests: Increase webperf1002/webperf2002 space from 50GB to 500 GB (Ganeti) - https://phabricator.wikimedia.org/T199853 (10fgiunchedi) Swift could be an option since data size isn't huge. 
In that case I would recommend: * write path through a swift client talking http... [08:21:29] ACKNOWLEDGEMENT - Device not healthy -SMART- on db1069 is CRITICAL: cluster=mysql device=megaraid,0 instance=db1069:9100 job=node site=eqiad Marostegui T199056 - The acknowledgement expires at: 2018-07-20 08:21:12. https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=db1069&var-datasource=eqiad%2520prometheus%252Fops [08:23:09] (03CR) 10Muehlenhoff: "No, libogg0, libvorbisenc2, oggvideotools were needed for video scaling in the past, but that's no longer the case." [puppet] - 10https://gerrit.wikimedia.org/r/446743 (owner: 10Hashar) [08:24:43] kart_: you done? [08:25:48] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446747 [08:28:41] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446747 (owner: 10Marostegui) [08:30:21] (03Merged) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446747 (owner: 10Marostegui) [08:30:34] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446747 (owner: 10Marostegui) [08:31:35] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3314 after MySQL upgrade (duration: 00m 54s) [08:31:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:32:43] !log Deploy schema change on dbstore1002:s2 T144010 T51190 T199368 [08:32:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:32:49] T51190: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190 [08:32:49] T144010: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 [08:32:49] T199368: Convert UNIQUE INDEX to PK in Production - https://phabricator.wikimedia.org/T199368 
[08:34:55] 10Operations, 10Core-Platform-Team, 10WMF-JobQueue, 10Services (designing), and 2 others: Exception "Job queue is read-only" - https://phabricator.wikimedia.org/T199594 (10mobrovac) p:05High>03Normal >>! In T199594#4436984, @tstarling wrote: > To serve read traffic correctly, $wgReadOnly needs to be fa... [08:39:31] !log start upgrade of cumin to 3.0.1-1 on labpuppetmaster* - T187773 [08:39:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:39:34] T187773: Cumin: upgrade it to 3.0.1 in production - https://phabricator.wikimedia.org/T187773 [08:40:19] (03PS9) 10Volans: Cumin masters in WMCS: upgrade to python3 [puppet] - 10https://gerrit.wikimedia.org/r/419131 (https://phabricator.wikimedia.org/T188112) [08:41:25] PROBLEM - Restbase edge esams on text-lb.esams.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received [08:41:31] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446749 [08:42:16] RECOVERY - Restbase edge esams on text-lb.esams.wikimedia.org is OK: All endpoints are healthy [08:43:08] 10Operations, 10Wikimedia-Apache-configuration, 10Chinese-Sites, 10Patch-For-Review, 10User-Urbanecm: All "zh-my" variant page views get 404 Not Found on zh.wikipedia.org - https://phabricator.wikimedia.org/T198371 (10RazeSoldier) @Urbanecm Any progress? 
[08:43:16] (03CR) 10Volans: [C: 032] Cumin masters in WMCS: upgrade to python3 [puppet] - 10https://gerrit.wikimedia.org/r/419131 (https://phabricator.wikimedia.org/T188112) (owner: 10Volans) [08:43:27] (03PS2) 10Muehlenhoff: contint: libvips-tool is provided by mediawiki::packages [puppet] - 10https://gerrit.wikimedia.org/r/446743 (owner: 10Hashar) [08:43:31] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446749 (owner: 10Marostegui) [08:44:26] (03CR) 10Muehlenhoff: [C: 032] contint: libvips-tool is provided by mediawiki::packages [puppet] - 10https://gerrit.wikimedia.org/r/446743 (owner: 10Hashar) [08:45:22] (03Merged) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446749 (owner: 10Marostegui) [08:46:25] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3314 after MySQL upgrade (duration: 00m 54s) [08:46:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:47:15] 10Operations, 10Operations-Software-Development: Systemd session creation fails under I/O load - https://phabricator.wikimedia.org/T199911 (10fgiunchedi) It doesn't seem specific to debmonitor afaics, e.g. on ms-be1022 there's an occurrence where a session for root is involved instead: ``` syslog.1:Jul 18 11:... 
[08:48:10] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446749 (owner: 10Marostegui) [08:53:38] (03CR) 10Catrope: [C: 032] Enable PageTriage ORES filters on labs enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446624 (https://phabricator.wikimedia.org/T198747) (owner: 10Sbisson) [08:53:40] (03PS1) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446751 [08:54:43] (03PS2) 10Catrope: Enable PageTriage ORES filters on labs enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446624 (https://phabricator.wikimedia.org/T198747) (owner: 10Sbisson) [08:55:00] (03CR) 10Catrope: Enable PageTriage ORES filters on labs enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446624 (https://phabricator.wikimedia.org/T198747) (owner: 10Sbisson) [08:55:05] (03CR) 10Catrope: [C: 032] Enable PageTriage ORES filters on labs enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446624 (https://phabricator.wikimedia.org/T198747) (owner: 10Sbisson) [08:56:40] (03Merged) 10jenkins-bot: Enable PageTriage ORES filters on labs enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446624 (https://phabricator.wikimedia.org/T198747) (owner: 10Sbisson) [08:57:27] (03PS1) 10Jcrespo: mariadb: Repool es1019 with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446752 (https://phabricator.wikimedia.org/T197073) [08:58:01] (03CR) 10jenkins-bot: Enable PageTriage ORES filters on labs enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446624 (https://phabricator.wikimedia.org/T198747) (owner: 10Sbisson) [08:59:52] (03PS1) 10Jcrespo: mariadb: Repool es1019 full after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446755 (https://phabricator.wikimedia.org/T197073) [09:01:24] (03PS2) 10Jcrespo: mariadb: Repool es1019 with low load [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/446752 (https://phabricator.wikimedia.org/T197073) [09:02:17] (03CR) 10Jcrespo: [C: 032] mariadb: Repool es1019 with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446752 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [09:03:56] (03Merged) 10jenkins-bot: mariadb: Repool es1019 with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446752 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [09:04:16] (03PS1) 10Catrope: Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) [09:04:41] (03PS2) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446751 [09:05:16] (03PS1) 10Volans: cumin: pin also python3-urllib3 for WMCS [puppet] - 10https://gerrit.wikimedia.org/r/446757 (https://phabricator.wikimedia.org/T188112) [09:06:22] (03PS3) 10Ema: network::constants: define all caches, not only cache_misc [puppet] - 10https://gerrit.wikimedia.org/r/445126 (https://phabricator.wikimedia.org/T164609) [09:08:01] (03CR) 10Muehlenhoff: [C: 031] cumin: pin also python3-urllib3 for WMCS [puppet] - 10https://gerrit.wikimedia.org/r/446757 (https://phabricator.wikimedia.org/T188112) (owner: 10Volans) [09:08:15] (03CR) 10jenkins-bot: mariadb: Repool es1019 with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446752 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [09:08:17] (03CR) 10Volans: [C: 032] cumin: pin also python3-urllib3 for WMCS [puppet] - 10https://gerrit.wikimedia.org/r/446757 (https://phabricator.wikimedia.org/T188112) (owner: 10Volans) [09:09:07] (03PS4) 10Ema: network::constants: define all caches, not only cache_misc [puppet] - 10https://gerrit.wikimedia.org/r/445126 (https://phabricator.wikimedia.org/T164609) [09:09:53] 10Operations, 10Core-Platform-Team, 10monitoring, 10Wikimedia-Incident: 
Add alerts for Logstash rates in production - https://phabricator.wikimedia.org/T199479 (10fgiunchedi) Looks like logstash rates are in graphite ATM, so either a grafana alert and its counterpart in puppet or a graphite alert. Both wou... [09:10:11] (03CR) 10Ema: [C: 032] network::constants: define all caches, not only cache_misc [puppet] - 10https://gerrit.wikimedia.org/r/445126 (https://phabricator.wikimedia.org/T164609) (owner: 10Ema) [09:11:12] (03PS2) 10Jcrespo: mariadb: Repool es1019 fully after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446755 (https://phabricator.wikimedia.org/T197073) [09:11:45] marostegui: do you deploy or I do? [09:12:07] jynus: go for it [09:12:12] I didn't merge :) [09:12:57] should I rebase to my change, or apply both? [09:13:47] Sure, I can +2 mine if you like [09:13:49] Or wait for you [09:13:54] I haven't +2'ed my change [09:13:57] so you can deploy if you like [09:14:02] mmm [09:14:22] https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/446751/ that is my change :) [09:14:41] !log uploaded jenkins 2.121.2 to apt.wikimedia.org (T199448) [09:14:41] mmm [09:14:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:14:49] something is strange [09:14:52] ^ hashar [09:15:00] (03CR) 10Hashar: "I did not have the issue on integration-cumin.integration.eqiad.wmflabs Maybe because python-urllib3 got properly installed as a depende" [puppet] - 10https://gerrit.wikimedia.org/r/446757 (https://phabricator.wikimedia.org/T188112) (owner: 10Volans) [09:15:21] what is strange jynus? [09:15:24] marostegui: can you log in and check which of the change there are yours? [09:15:29] sure [09:15:33] intended for deploy [09:16:05] moritzm: danke! [09:16:10] hashar: yeah I know and on labpuppetmaster1001 is already the newer version but on 1002 it was the old :D [09:16:42] marostegui: I think there are 4 undeployed changes? 
[09:16:48] jynus: there are no changes from pending to deploy [09:16:54] I see changes from catrope [09:17:05] volans: I guess if python3-urllib is already installed, it would not be magically upgraded to the backports one. At least on the labs instance it is all clear! Thank you very much [09:17:35] thank you for the testing, not sure if you need to remove the cherry-pick [09:17:44] marostegui: what was your last deploy? [09:18:08] 333840f8c839cd1f30bff621c8dbd85fa69e434d [09:18:27] Which in gerrit is: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/446749/ [09:20:10] so there are 2 of those [09:20:17] 333840f8c839cd1f30 [09:20:24] and a6404ed53690d62487 [09:20:58] no [09:21:00] they are different [09:21:10] one is for watchlist and another one for recentchanges [09:21:18] ok, that was confusing me because it has the same title [09:21:19] they have the same title though, so that wasn't helping [09:21:20] :) [09:21:22] yeah [09:21:30] !log installing ant security updates [09:21:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:21:42] I had to do many commits for just the same thing so I didn't bother to change the title, I should've [09:21:42] ok, then, I think I have it now [09:21:53] I just need to deploy mine [09:21:57] yep [09:22:05] once you are done I will +2 mine [09:22:40] then it was the catrope change which was noop [09:23:11] (03PS3) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446751 [09:23:22] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: Repool es1019 with low load (duration: 00m 54s) [09:23:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:23:27] I added a new sentence in the message to avoid the same confusion :) [09:23:41] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446751 (owner: 10Marostegui)
[09:23:59] (03PS4) 10Marostegui: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446751 [09:24:50] !log installing tomcat7 security updates [09:24:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:25:58] 10Operations, 10Wikimedia-Apache-configuration, 10Chinese-Sites, 10Patch-For-Review, 10User-Urbanecm: All "zh-my" variant page views get 404 Not Found on zh.wikipedia.org - https://phabricator.wikimedia.org/T198371 (101233thehongkonger) Should be deployment problem? [09:27:12] !log run xfs_repair on filesystems reporting negative space available on ms-be1040 - T199198 [09:27:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:27:16] T199198: Some swift filesystems reporting negative disk usage - https://phabricator.wikimedia.org/T199198 [09:27:33] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3314 after MySQL upgrade (duration: 00m 54s) [09:27:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:28:17] (03CR) 10jenkins-bot: db-eqiad.php: Increase weight for db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446751 (owner: 10Marostegui) [09:29:24] (03CR) 10Vgutierrez: [C: 04-1] WIP: provide ACMEv2 support based on certbot/acme library (031 comment) [software/certcentral] - 10https://gerrit.wikimedia.org/r/446618 (https://phabricator.wikimedia.org/T199717) (owner: 10Vgutierrez) [09:33:01] !log removed leftover cumin installation from labtestpuppetmaster2001 - T187773 [09:33:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:33:05] T187773: Cumin: upgrade it to 3.0.1 in production - https://phabricator.wikimedia.org/T187773 [09:33:15] RECOVERY - Check systemd state on ms-be1022 is OK: OK - running: The system is fully operational [09:33:26] !log installing openssl security updates [09:33:28] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:35:32] !log cumin upgrade on labpuppetmaster* hosts completed - T187773 [09:35:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:38:30] (03Restored) 10Gergő Tisza: Deploy TemplateStyles to frwiki and zhwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/416708 (https://phabricator.wikimedia.org/T189022) (owner: 10Zoranzoki21) [09:39:49] (03CR) 10Volans: [V: 032 C: 032] DataTables: force session save [software/debmonitor] - 10https://gerrit.wikimedia.org/r/446741 (owner: 10Volans) [09:40:40] (03Abandoned) 10Gergő Tisza: Deploy TemplateStyles to frwiki and zhwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/416708 (https://phabricator.wikimedia.org/T189022) (owner: 10Zoranzoki21) [09:44:23] (03PS1) 10Muehlenhoff: Add cumin alias for kafkamon [puppet] - 10https://gerrit.wikimedia.org/r/446769 [09:46:43] (03CR) 10Muehlenhoff: [C: 032] Add cumin alias for kafkamon [puppet] - 10https://gerrit.wikimedia.org/r/446769 (owner: 10Muehlenhoff) [09:49:32] !log aaron@deploy1001 Synchronized php-1.32.0-wmf.13/includes/libs/objectcache/MultiWriteBagOStuff.php: 2cee0ab7085a74d35bc192ec8bc9cf461b286695 (duration: 00m 55s) [09:49:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:54:34] (03PS1) 10Hashar: jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) [09:55:00] (03CR) 10Hashar: [C: 04-1] "WIP, I need to add some spec tests" [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [09:55:30] (03CR) 10jerkins-bot: [V: 04-1] jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [09:55:46] !log installing check-postgres update from stretch 9.5 point release [09:55:48] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:00:34] 10Operations, 10CommRel-Internals, 10Wikimedia-Mailing-lists: Close https://lists.wikimedia.org/mailman/listinfo/cep and keep the archive for now - https://phabricator.wikimedia.org/T155683 (10Elitre) [10:06:04] !log installing file/libmagic security updates [10:06:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:09:19] hashar: https://gerrit.wikimedia.org/r/#/c/446777/ [10:13:05] PROBLEM - Restbase edge esams on text-lb.esams.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received [10:14:06] RECOVERY - Restbase edge esams on text-lb.esams.wikimedia.org is OK: All endpoints are healthy [10:14:25] (03PS10) 10Volans: Cumin masters in prod: upgrade to python3 [puppet] - 10https://gerrit.wikimedia.org/r/412894 (https://phabricator.wikimedia.org/T187773) [10:15:54] (03PS1) 10Marostegui: db-eqiad.php: Give db1103:3314 more traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446780 [10:21:46] as a heads up we're planning to upgrade cumin on sarin shortly, so use neodymium instead if you need to run anything cumin-related. We'll advise when we're about to upgrade neodymium too [10:26:32] (03PS4) 10Muehlenhoff: Migrate the server side to Python3 [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/405879 (owner: 10Volans) [10:29:43] (03CR) 10Muehlenhoff: [C: 032] Migrate the server side to Python3 [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/405879 (owner: 10Volans) [10:36:36] (03PS1) 10Ppchelko: Remove base64 hack for binary values decoding. 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/446785 [10:40:03] (03PS1) 10Reedy: Add wikimania.wikimedia.org to apache ServerAlias [puppet] - 10https://gerrit.wikimedia.org/r/446786 (https://phabricator.wikimedia.org/T199935) [10:46:31] (03CR) 10Volans: [C: 032] "Compiler looks happy:" [puppet] - 10https://gerrit.wikimedia.org/r/412894 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [10:52:22] !log made myself a crat on test2wiki [10:52:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:55:24] (03PS2) 10Hashar: jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) [10:56:56] (03CR) 10Hashar: "I thought I wrote test for Jenkins system properties but there are none. Looks good to me as is. I think it has to be applied after Jenk" [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [10:58:15] PROBLEM - Check systemd state on ms-be1035 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [11:00:00] !log upgrading cumin on sarin - T187773 [11:00:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:00:03] T187773: Cumin: upgrade it to 3.0.1 in production - https://phabricator.wikimedia.org/T187773 [11:00:04] addshore, hashar, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: That opportune time is upon us again. Time for a European Mid-day SWAT(Max 6 patches) deploy. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1100). [11:00:04] tgr, davidwbarratt, and RoanKattouw: A patch you scheduled for European Mid-day SWAT(Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [11:00:11] here! 
[11:00:20] Look on TemplateStyles enable deployment :) [11:00:38] I can SWAT today [11:00:50] tgr, davidwbarratt, and RoanKattouw: anybody wants to deploy their own patches? [11:01:45] uhhhhh, sounds like way too much responsibility. [11:01:54] :D [11:02:07] davidwbarratt: there are docs https://wikitech.wikimedia.org/wiki/SWAT_deploys/Deployers [11:02:46] o/ [11:03:21] zeljkof: can do it if you prefer, I'm fine with sitting in the back though [11:03:47] tgr: I prefer developers deploying their own changes :) go ahead then while I review other patches [11:03:59] davidwbarratt: and you have to be in this group https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/admin/data/data.yaml$62 [11:05:16] well I'm not in that group, so I'll stay out of it for now. :) [11:05:41] (03CR) 10Gergő Tisza: [C: 032] Deploy TemplateStyles to frwiki and zhwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446586 (https://phabricator.wikimedia.org/T189022) (owner: 10Gergő Tisza) [11:05:43] (03CR) 10Gergő Tisza: [C: 032] Enable TemplateStyles on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/444567 (https://phabricator.wikimedia.org/T197603) (owner: 10Gergő Tisza) [11:05:54] davidwbarratt: please stand by then, you are next, after tgr :) [11:07:25] (03CR) 10jerkins-bot: [V: 04-1] Enable TemplateStyles on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/444567 (https://phabricator.wikimedia.org/T197603) (owner: 10Gergő Tisza) [11:08:43] RoanKattouw: around for SWAT? [11:08:50] zeljkof: Yes sorry, here now [11:09:08] ok, you are third, please stand by :) [11:09:37] ookay.. 
so the parent patch is merged, the child patch says V:-2 the parent patch failed to merge [11:10:02] (03CR) 10Gergő Tisza: [C: 032] Enable TemplateStyles on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/444567 (https://phabricator.wikimedia.org/T197603) (owner: 10Gergő Tisza) [11:10:04] any chance of https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/445720/ going in at the end? [11:10:33] (03PS2) 10Gergő Tisza: Deploy TemplateStyles to frwiki and zhwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446586 (https://phabricator.wikimedia.org/T189022) [11:10:38] ah, never mind, not actually merged [11:11:02] (03PS3) 10Gergő Tisza: Enable TemplateStyles on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/444567 (https://phabricator.wikimedia.org/T197603) [11:11:23] I'm just not used to mediawiki-config CI messages, apparently [11:11:59] apergos: please add it to the calendar [11:12:10] RoanKattouw: do you want to deploy your own patch? [11:12:17] Sure [11:12:29] Let me know when [11:12:34] RoanKattouw: ok, in that case you are next, as soon as tgr is done [11:14:20] (03PS1) 10Volans: cumin: remove disable warning for urllib3 [puppet] - 10https://gerrit.wikimedia.org/r/446789 (https://phabricator.wikimedia.org/T187773) [11:15:49] (03PS2) 10Volans: cumin: remove disable warning for urllib3 [puppet] - 10https://gerrit.wikimedia.org/r/446789 (https://phabricator.wikimedia.org/T187773) [11:16:07] (03CR) 10Muehlenhoff: [C: 031] "Looks good to me!" [puppet] - 10https://gerrit.wikimedia.org/r/446789 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [11:16:40] apergos: I don't see your commit in the calendar, did you add it? want to deploy it yourself? [11:16:43] zeljkof: ok. 
as not quite a dev, I felt weird adding it myself but it's there [11:16:50] (03CR) 10Volans: [C: 032] cumin: remove disable warning for urllib3 [puppet] - 10https://gerrit.wikimedia.org/r/446789 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [11:17:01] and I would prefer someone else deploy if that is ok [11:17:14] apergos: sure, that's why I'm here :) [11:17:46] is there a prefix to add for changes like dblists? (it's not 'config') [11:18:11] (03CR) 10jenkins-bot: Deploy TemplateStyles to frwiki and zhwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446586 (https://phabricator.wikimedia.org/T189022) (owner: 10Gergő Tisza) [11:18:28] dblists live in the config repo... [11:18:43] !log tgr@deploy1001 Synchronized wmf-config/InitialiseSettings.php: enable TemplateStyles on enwiki, frwiki, zhwiki T197603 T191452 T189022 (duration: 00m 55s) [11:18:50] ah it's about the repo, not if it's a config change per se? fixing [11:18:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:18:51] T191452: Deploy TemplateStyles on French Wikipedia on 2018-07-19 - https://phabricator.wikimedia.org/T191452 [11:18:51] T197603: Deploy TemplateStyles to the English Wikipedia on 2018-07-19 - https://phabricator.wikimedia.org/T197603 [11:18:52] T189022: Deploy TemplateStyles to zhwiki on 2018-07-19 - https://phabricator.wikimedia.org/T189022 [11:19:27] RoanKattouw: I'm done [11:19:46] OK [11:19:56] !log maintain-views --all-databases --replace-all --table ores_model --debug (labsdb) [11:19:57] this is strange, fatal monitor is empty o.o https://logstash.wikimedia.org/app/kibana#/dashboard/Fatal-Monitor?_g=h@bfc149c&_a=h@2d022de [11:19:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:20:09] !log maintain-views --all-databases --replace-all --table ores_classification --debug (labsdb) [11:20:11] ah, there is one thing now [11:20:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log 
[11:20:16] (03CR) 10Catrope: [C: 032] Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) (owner: 10Catrope) [11:21:59] apergos: is there a phab task related to 445720? [11:22:44] !log testing reimage of californium (spare host) with the upgraded cumin [11:22:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:22:50] davidwbarratt, apergos: do you know how to test at mwdebug1002? [11:23:07] your changes will be deployed there first, before deploying everywhere [11:23:07] https://phabricator.wikimedia.org/T198356 this is the task [11:23:14] and there's nothing to test really.... [11:23:18] zeljkof yes I do [11:23:31] apergos: could you please add the task to the commit message? let me know if you need help with that [11:23:36] davidwbarratt: great [11:23:40] sec [11:23:43] zeljkof it should work on any wiki that has .13 [11:23:52] 10Operations, 10Cloud-Services, 10wikitech.wikimedia.org, 10cloud-services-team (Kanban): Set up external DNS record for wikitech-static - https://phabricator.wikimedia.org/T164290 (10aborrero) 05Open>03Resolved a:03aborrero Done https://office.wikimedia.org/w/index.php?title=Guide_for_new_engineerin... [11:24:05] apergos: ok, so your change should be deployed without testing? [11:24:27] davidwbarratt: hm, let me check if mwdebug1002 is on .13 [11:25:09] someone was quicker than me [11:25:33] davidwbarratt: ah, I remember now, so since .13 is now on group0 and group1, you can use any of those wikis to test, let me know if you need a list [11:25:49] there's nothing to test... [11:25:59] zeljkof a list would be great. :) [11:26:35] apergos: quicker with adding phab task? 
I still don't see it [11:26:42] it was already added [11:26:51] and not by me, so I tossed my commit [11:26:56] (03PS2) 10Catrope: Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) [11:27:02] Ugh I hate this rebase requirement [11:27:04] ah no [11:27:05] I am wrong [11:27:13] I tossed the commit that fixed it grrr gerrit uis [11:27:17] Jenkins will just silently not merge your patch if you don't rebase it first [11:27:19] apergos: somebody edited the commit message, but not by adding the task [11:27:26] (03CR) 10Catrope: Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) (owner: 10Catrope) [11:27:29] (03CR) 10Catrope: [C: 032] Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) (owner: 10Catrope) [11:27:40] RoanKattouw: yes, that is annoying :/ [11:28:04] davidwbarratt: this page has groups and says which group has which version https://tools.wmflabs.org/versions/ [11:28:10] (03PS3) 10ArielGlenn: Remove labs wikis from the categories-rdf list, don't need them [mediawiki-config] - 10https://gerrit.wikimedia.org/r/445720 (owner: 10Smalyshev) [11:28:20] davidwbarratt: so groups 0 and 1 are at .13 [11:28:30] there you are, sorry for that mixup [11:28:31] yeah super annoying [11:28:35] zeljkof ah, thanks! 
[11:28:43] (03Merged) 10jenkins-bot: Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) (owner: 10Catrope) [11:28:45] davidwbarratt: if you click the triangle after .13 it expands the list [11:29:18] davidwbarratt: the list is a bit cryptic but if you google a site it should find it :) [11:29:38] davidwbarratt: and let me know if you need help [11:29:45] zeljkof yeah I know it's the dbkey [11:30:00] zeljkof let me know when it's ready to test [11:30:50] davidwbarratt: there is surely a table somewhere translating strings to urls :) you are next, as soon as RoanKattouw is done [11:31:36] (03PS4) 10Zfilipin: Enable Special:Block Feedback Request [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446613 (https://phabricator.wikimedia.org/T199919) (owner: 10Dbarratt) [11:32:36] 10Operations, 10TemplateStyles, 10Traffic, 10Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410 (10Tgr) [11:32:40] 10Operations, 10TemplateStyles, 10Traffic, 10Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410 (10Deskana) [11:33:09] !log catrope@deploy1001 Synchronized wmf-config/InitialiseSettings.php: Enable Structured Discussions beta feature on orwiki (T199971) (duration: 00m 55s) [11:33:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:33:12] T199971: Enable structured discussion beta feature for user on Odia WIkipedia - https://phabricator.wikimedia.org/T199971 [11:34:27] zeljkof: I'm done [11:34:45] RoanKattouw: great, taking over swat then [11:35:03] davidwbarratt: please stand by, the patch will be at mwdebug1002 in a few minutes [11:35:14] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446613 (https://phabricator.wikimedia.org/T199919) (owner: 
10Dbarratt) [11:35:46] kk [11:36:57] (03Merged) 10jenkins-bot: Enable Special:Block Feedback Request [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446613 (https://phabricator.wikimedia.org/T199919) (owner: 10Dbarratt) [11:38:14] davidwbarratt: the patch is at mwdebug1002, please test and let me know if I can deploy it [11:38:55] zeljkof it looks awesome! please deploy. :) [11:39:04] davidwbarratt: ok, deploying [11:40:08] !log zfilipin@deploy1001 Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:446613|Enable Special:Block Feedback Request (T199919)]] (duration: 00m 55s) [11:40:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:40:11] T199919: Enable Partial Block banner on all Wikimedia wikis - https://phabricator.wikimedia.org/T199919 [11:40:25] davidwbarratt: it's deployed, please test and thanks for deploying with #releng ;) [11:40:58] zeljkof tested again on prod and looks fantastic https://test.wikipedia.org/wiki/Special:Block [11:41:00] apergos: please stand by, your commit is next [11:41:04] zeljkof thanks! 
[11:41:35] (03PS4) 10Zfilipin: Remove labs wikis from the categories-rdf list, don't need them [mediawiki-config] - 10https://gerrit.wikimedia.org/r/445720 (https://phabricator.wikimedia.org/T198356) (owner: 10Smalyshev) [11:42:35] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/445720 (https://phabricator.wikimedia.org/T198356) (owner: 10Smalyshev) [11:44:13] (03Merged) 10jenkins-bot: Remove labs wikis from the categories-rdf list, don't need them [mediawiki-config] - 10https://gerrit.wikimedia.org/r/445720 (https://phabricator.wikimedia.org/T198356) (owner: 10Smalyshev) [11:44:19] I'm watching it :-) [11:44:48] !log installing reportbug update from stretch point release to test py3-based debdeploy [11:44:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:46:07] (03PS1) 10Volans: wmf-auto-reimage: convert Py3 bytes to str [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) [11:46:15] !log zfilipin@deploy1001 Synchronized dblists/categories-rdf.dblist: SWAT: [[gerrit:445720|Remove labs wikis from the categories-rdf list, dont need them (T198356)]] (duration: 00m 55s) [11:46:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:46:19] T198356: Generate daily diffs for recently changed categories - https://phabricator.wikimedia.org/T198356 [11:46:52] apergos: it's deployed, please test (if there is anything to test) and thanks for deploying with #releng ;) [11:46:55] (03PS2) 10Volans: wmf-auto-reimage: convert Py3 bytes to str [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) [11:47:48] there's nothing to test (yet), and thank you! 
[11:48:19] apergos: for future reference, there was a typo in the commit message, it's `Bug: T198356`, not `Bug; T198356` [11:48:25] (03PS1) 10Muehlenhoff: Further Python 3 changes [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446794 [11:48:25] (I've fixed it) [11:48:33] ah I waned it to be a : [11:48:37] *wanted [11:48:40] nothing else for swat? [11:48:44] apergos: it happens :) [11:48:47] and then wondered why gerritbot didn't update the task... meh [11:49:48] !log EU SWAT finished [11:49:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:53:13] (03PS3) 10Volans: wmf-auto-reimage: convert Py3 bytes to str [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) [11:53:24] (03CR) 10Volans: [C: 031] "LGTM, minor nitpick inline" (031 comment) [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446794 (owner: 10Muehlenhoff) [11:54:06] (03CR) 10jerkins-bot: [V: 04-1] wmf-auto-reimage: convert Py3 bytes to str [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [11:57:12] (03CR) 10Volans: "Jenkins failure is due to the removal of the print function from the future, as it's checking the file with py2 flake8, see https://phabri" [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [11:58:22] (03CR) 10Muehlenhoff: Further Python 3 changes (031 comment) [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446794 (owner: 10Muehlenhoff) [11:59:16] (03CR) 10Muehlenhoff: [C: 031] "Looks good!" 
[puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [11:59:28] (03PS2) 10Muehlenhoff: Further Python 3 changes [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446794 [12:00:04] Deploy window Pre MediaWiki train sanity break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1200) [12:03:58] 10Operations, 10Wikimedia-SVG-rendering, 10Upstream: librsvg misinterpret quoted font family names that contain whitespaces - https://phabricator.wikimedia.org/T64987 (10Niridya) [12:06:28] 10Operations, 10Wikimedia-SVG-rendering, 10Upstream: librsvg misinterpret quoted font family names that contain whitespaces - https://phabricator.wikimedia.org/T64987 (10Niridya) [12:26:03] (03CR) 10Muehlenhoff: [C: 032] Further Python 3 changes [debs/debdeploy] - 10https://gerrit.wikimedia.org/r/446794 (owner: 10Muehlenhoff) [12:37:43] (03PS1) 10DCausse: search.wikimedia.org should properly handle multivalue separation char (0x1F) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446801 [12:38:02] !log uploaded debdeploy 0.0.99.5 to apt.wikimedia.org [12:38:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:40:31] (03CR) 10ArielGlenn: [C: 04-1] "I'm testing munged copies of these scripts now, they are ok except for a couple nits (inline). 
I haven't stared too hard at the manifests " (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/378355 (https://phabricator.wikimedia.org/T198356) (owner: 10Smalyshev) [12:41:31] (03PS1) 10Ladsgroup: Set to write the new change tag backend everywhere, enable reading in frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446804 (https://phabricator.wikimedia.org/T194165) [12:48:56] !log installing xerces-c security updates [12:48:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:50:40] (03PS4) 10Volans: cumin: convert Py3 bytes to str in related scripts [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) [12:51:23] (03CR) 10jerkins-bot: [V: 04-1] cumin: convert Py3 bytes to str in related scripts [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [12:51:42] yeah yeah I know [12:52:47] 10Operations, 10Core-Platform-Team, 10WMF-JobQueue, 10Patch-For-Review, and 3 others: Exception "Job queue is read-only" - https://phabricator.wikimedia.org/T199594 (10aaron) Normally, it would be odd to let jobs pile up but not execute them, though the multi-DC use case of $wgReadOnly in one of the DCs wa... [12:54:36] (03CR) 10Volans: [V: 032 C: 032] cumin: convert Py3 bytes to str in related scripts [puppet] - 10https://gerrit.wikimedia.org/r/446793 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [13:00:04] zeljkof: Time to snap out of that daydream and deploy MediaWiki train - European version. Get on with it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1300). [13:00:49] hashar: I have just noticed that Krinkle has added two blockers to wmf.13 task https://phabricator.wikimedia.org/T191059 [13:04:39] ok, so according to the task description: [13:04:43] > Any open subtasks block the train from moving forward. This means no further deployments until the blockers are resolved. 
[13:04:59] not moving the train forward because of blocking tasks [13:05:00] !log installing ffmpeg security updates [13:05:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:05:46] 10Operations, 10monitoring, 10Patch-For-Review: Sunset Watchmouse's status.wikimedia.org - https://phabricator.wikimedia.org/T199816 (10fgiunchedi) Agreed on removing `status.w.o`, not used by us and confusing for users. We might reintroduce it later with sth similar that fits better our requirements. [13:13:34] (03PS2) 10Andrew Bogott: wikitech: remove libvips-tools as it is now on all app servers [puppet] - 10https://gerrit.wikimedia.org/r/446735 (https://phabricator.wikimedia.org/T199938) (owner: 10Rush) [13:14:22] 10Operations, 10monitoring, 10Patch-For-Review: Sunset Watchmouse's status.wikimedia.org - https://phabricator.wikimedia.org/T199816 (10BBlack) IIRC, there's a bunch of crazy config supporting it on the same rackspace server that hosts wikitech-static. Some hacky stuff I threw together to proxy from watchmo... [13:15:11] (03CR) 10Andrew Bogott: [C: 032] wikitech: remove libvips-tools as it is now on all app servers [puppet] - 10https://gerrit.wikimedia.org/r/446735 (https://phabricator.wikimedia.org/T199938) (owner: 10Rush) [13:16:15] (03PS1) 10Volans: wmcs: convert nfs_hostlist to py3 [puppet] - 10https://gerrit.wikimedia.org/r/446810 (https://phabricator.wikimedia.org/T187773) [13:18:48] (03PS2) 10Volans: wmcs: convert nfs_hostlist to py3 [puppet] - 10https://gerrit.wikimedia.org/r/446810 (https://phabricator.wikimedia.org/T187773) [13:19:49] (03CR) 10Volans: [C: 032] wmcs: convert nfs_hostlist to py3 [puppet] - 10https://gerrit.wikimedia.org/r/446810 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [13:19:58] zeljkof: can I deploy db-eqiad.php? 
(Asking because I am not sure if that interferes with the train) [13:20:12] (03PS2) 10Marostegui: db-eqiad.php: Give db1103:3314 more traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446780 [13:21:22] marostegui: I'm not moving the train forward because there are two blocking tasks https://phabricator.wikimedia.org/T191059 [13:21:34] marostegui: so, I guess, go ahead :) [13:21:44] Ah right! [13:21:46] Thank you [13:22:31] RECOVERY - puppet last run on labweb1001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [13:22:44] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Give db1103:3314 more traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446780 (owner: 10Marostegui) [13:22:49] !log installing python-pysaml2 security updates [13:22:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:04] (03Merged) 10jenkins-bot: db-eqiad.php: Give db1103:3314 more traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446780 (owner: 10Marostegui) [13:25:10] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Increase weight for db1103:3314 after MySQL upgrade (duration: 00m 54s) [13:25:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:26:36] (03PS1) 10Marostegui: db-eqiad.php: Fully repool db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446812 [13:31:19] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Fully repool db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446812 (owner: 10Marostegui) [13:32:57] (03Merged) 10jenkins-bot: db-eqiad.php: Fully repool db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446812 (owner: 10Marostegui) [13:33:07] RECOVERY - puppet last run on labweb1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:33:41] (03PS1) 10Volans: wmf-auto-reimage: fix py3 bug [puppet] - 10https://gerrit.wikimedia.org/r/446816 
(https://phabricator.wikimedia.org/T187773) [13:34:12] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Fully repool db1103:3314 (duration: 00m 53s) [13:34:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:35:27] (03CR) 10Volans: [C: 032] wmf-auto-reimage: fix py3 bug [puppet] - 10https://gerrit.wikimedia.org/r/446816 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [13:36:41] zeljkof: no train right? can I reimage a mediawiki host then I guess [13:36:58] volans: train is blocked on two subtasks, go ahead [13:37:14] ack, thanks [13:37:26] RECOVERY - puppet last run on labtestweb2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:38:03] !log reimaging mw2225 to test py3 reimage with new cumin, conftool and apache test - T187773 [13:38:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:38:06] T187773: Cumin: upgrade it to 3.0.1 in production - https://phabricator.wikimedia.org/T187773 [13:38:23] (03PS1) 10Marostegui: db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446817 (https://phabricator.wikimedia.org/T199368) [13:39:49] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446817 (https://phabricator.wikimedia.org/T199368) (owner: 10Marostegui) [13:41:26] 10Operations: Integrate Stretch 9.5 point release - https://phabricator.wikimedia.org/T199670 (10MoritzMuehlenhoff) These updates have been fully deployed: ``` check-postgres devscripts faad2 file libipc-run-perl patch postgresql-9.6 reportbug subversion xapian-core xerces-c ``` [13:41:56] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446817 (https://phabricator.wikimedia.org/T199368) (owner: 10Marostegui) [13:42:01] zeljkof: sorry I was out [13:43:14] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: 
Depool db1103:3312 for alter table (duration: 00m 54s) [13:43:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:43:27] !log Deploy schema change on db1103:3312 T144010 T51190 T199368 [13:43:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:43:38] T51190: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190 [13:43:38] T144010: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 [13:43:38] T199368: Convert UNIQUE INDEX to PK in Production - https://phabricator.wikimedia.org/T199368 [13:46:52] (PS1) Elukey: role::kafka::jumbo::broker: set max jvm heap size to 2G [puppet] - https://gerrit.wikimedia.org/r/446820 [13:46:53] hashar: no problem, I have stopped the train [13:47:01] hashar: not sure what else to do :) [13:47:52] zeljkof: Special:Log GET vs POST got a patch backported to wmf13 https://phabricator.wikimedia.org/T199856 [13:48:13] (PS2) Elukey: role::kafka::jumbo::broker: set max jvm heap size to 2G [puppet] - https://gerrit.wikimedia.org/r/446820 [13:48:31] apparently synced by tyler early this morning. So it is probably solved now [13:48:44] hashar: I guess that means it's solved, but there is another one [13:48:47] the other is the Babel patch https://gerrit.wikimedia.org/r/446766 which is pending review [13:50:20] hashar: so in short, train is still blocked? :) [13:50:39] I don't know :] [13:51:40] I guess if in doubt, hold the train [13:52:12] who knows how that bug would explode once it is pushed to group2 ..
[13:52:38] (03CR) 10Elukey: [C: 032] "https://puppet-compiler.wmflabs.org/compiler03/11812/kafka-jumbo1001.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/446820 (owner: 10Elukey) [13:53:51] (03PS2) 10Vgutierrez: WIP: provide ACMEv2 support based on certbot/acme library [software/certcentral] - 10https://gerrit.wikimedia.org/r/446618 (https://phabricator.wikimedia.org/T199717) [13:57:12] !log restart kafka on kafka-jumbo1001 to raise Xmx/Xms jvm settings (1g -> 2g) [13:57:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:59:33] 10Operations, 10TCB-Team, 10wikidiff2, 10WMDE-QWERTY-Sprint-2018-07-17: Update wikidiff2 library on the WMF production cluster to v1.7.2 - https://phabricator.wikimedia.org/T199801 (10MoritzMuehlenhoff) [14:01:55] (03CR) 10jerkins-bot: [V: 04-1] WIP: provide ACMEv2 support based on certbot/acme library [software/certcentral] - 10https://gerrit.wikimedia.org/r/446618 (https://phabricator.wikimedia.org/T199717) (owner: 10Vgutierrez) [14:05:55] [cross posting] so far all good on sarin, we're planning to upgrade cumin/reimage/debdeploy/etc. on neodymium too, any blocker in progress? [14:07:22] 10Operations, 10TCB-Team, 10wikidiff2, 10WMDE-QWERTY-Sprint-2018-07-17: Update wikidiff2 library on the WMF production cluster to v1.7.2 - https://phabricator.wikimedia.org/T199801 (10WMDE-Fisch) @MoritzMuehlenhoff sounds good! Our goal would be having this rolled out on production in the week of the 6th o... [14:12:11] PROBLEM - Disk space on elastic1024 is CRITICAL: DISK CRITICAL - free space: /srv 50518 MB (10% inode=99%) [14:15:16] 10Operations, 10TCB-Team, 10wikidiff2, 10WMDE-QWERTY-Sprint-2018-07-17: Update wikidiff2 library on the WMF production cluster to v1.7.2 - https://phabricator.wikimedia.org/T199801 (10MoritzMuehlenhoff) The week of the 6th seems unrealistic, I'm also off between Aug 3 to Aug 8, but the week of the 13th see... 
[14:15:54] 10Operations, 10MediaWiki-Maintenance-scripts: Separate host lookup from the sql shell script - https://phabricator.wikimedia.org/T141255 (10Marostegui) [14:15:57] !log upgrading cumin/debdeploy/reimage and other tools on neodymium - T187773 [14:16:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:16:01] T187773: Cumin: upgrade it to 3.0.1 in production - https://phabricator.wikimedia.org/T187773 [14:16:44] (03PS1) 10Jcrespo: mariadb: Productionize db1095 and db1102 into s1-test [puppet] - 10https://gerrit.wikimedia.org/r/446827 (https://phabricator.wikimedia.org/T197073) [14:17:22] !log roll restart kafka on kafka-jumbo* and kafka main-codfw (kafka2*) to pick up new Xmx/Xms settings (1g -> 2g) [14:17:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:18:06] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1103:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446828 [14:18:35] elukey: using cumin? from where? 
[14:19:21] volans: manually, if you want I can use cumin [14:19:35] ah ok, just in the middle of upgrading neodymium ;) [14:19:42] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1103:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446828 (owner: 10Marostegui) [14:19:44] yep yep I saw it :) [14:19:44] so if you need it use sarin for the next 10 minutes [14:19:50] then np ;) [14:21:25] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1103:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446828 (owner: 10Marostegui) [14:22:37] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Repool db1103:3312 after alter table (duration: 00m 54s) [14:22:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:24:49] (03PS1) 10Marostegui: db-eqiad.php: Depool db1105:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446830 (https://phabricator.wikimedia.org/T199368) [14:25:33] [cross-posting] cumin/debdeploy/reimage/other-tools upgrade completed, let me know if you encounter any issue [14:26:35] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Depool db1105:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446830 (https://phabricator.wikimedia.org/T199368) (owner: 10Marostegui) [14:28:09] (03Merged) 10jenkins-bot: db-eqiad.php: Depool db1105:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446830 (https://phabricator.wikimedia.org/T199368) (owner: 10Marostegui) [14:29:23] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Depool db1105:3312 for alter table (duration: 00m 54s) [14:29:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:29:53] (03PS2) 10Jcrespo: mariadb: Productionize db1095 and db1102 into test-s1 [puppet] - 10https://gerrit.wikimedia.org/r/446827 (https://phabricator.wikimedia.org/T197073) [14:30:31] !log Deploy schema change on db1105:3312 T144010 T51190 T199368 [14:30:36] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:30:37] T51190: Truncate SHA-1 indexes - https://phabricator.wikimedia.org/T51190 [14:30:37] T144010: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 [14:30:38] T199368: Convert UNIQUE INDEX to PK in Production - https://phabricator.wikimedia.org/T199368 [14:33:32] (03CR) 10Marostegui: [C: 031] mariadb: Productionize db1095 and db1102 into test-s1 [puppet] - 10https://gerrit.wikimedia.org/r/446827 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [14:34:26] (03PS1) 10Andrew Bogott: labtest: allow second_region_designate to access the database [puppet] - 10https://gerrit.wikimedia.org/r/446833 [14:39:04] (03PS2) 10Andrew Bogott: labtest: allow second_region_designate to access the database [puppet] - 10https://gerrit.wikimedia.org/r/446833 [14:39:44] (03CR) 10Jcrespo: [C: 032] mariadb: Productionize db1095 and db1102 into test-s1 [puppet] - 10https://gerrit.wikimedia.org/r/446827 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [14:41:43] (03PS3) 10Andrew Bogott: labtest: allow second_region_designate to access the database [puppet] - 10https://gerrit.wikimedia.org/r/446833 [14:43:14] (03CR) 10Andrew Bogott: [C: 032] labtest: allow second_region_designate to access the database [puppet] - 10https://gerrit.wikimedia.org/r/446833 (owner: 10Andrew Bogott) [14:45:31] RECOVERY - Disk space on elastic1024 is OK: DISK OK [14:46:35] (03PS1) 10Volans: cumin: explicitely require clustershell [puppet] - 10https://gerrit.wikimedia.org/r/446838 (https://phabricator.wikimedia.org/T187773) [14:47:01] (03PS1) 10Andrew Bogott: labtest: add some needed parentheses for a ferm list [puppet] - 10https://gerrit.wikimedia.org/r/446839 [14:47:13] 10Operations, 10TCB-Team, 10wikidiff2, 10WMDE-QWERTY-Sprint-2018-07-17: Update wikidiff2 library on the WMF production cluster to v1.7.2 - https://phabricator.wikimedia.org/T199801 (10WMDE-Fisch) Ok 13th should also be fine I just checked 
back with the community communications team. [14:47:48] (03CR) 10Andrew Bogott: [C: 032] labtest: add some needed parentheses for a ferm list [puppet] - 10https://gerrit.wikimedia.org/r/446839 (owner: 10Andrew Bogott) [14:48:43] 10Operations, 10TCB-Team, 10wikidiff2, 10WMDE-QWERTY-Sprint-2018-07-17: Update wikidiff2 library on the WMF production cluster to v1.7.2 - https://phabricator.wikimedia.org/T199801 (10MoritzMuehlenhoff) Ok, just ping this task when you have the 1.7.2 release ready. [14:49:41] (03CR) 10Muehlenhoff: [C: 031] "Looks good!" [puppet] - 10https://gerrit.wikimedia.org/r/446838 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [14:50:17] (03CR) 10Volans: "compiler looks good too:" [puppet] - 10https://gerrit.wikimedia.org/r/446838 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [14:50:26] (03PS2) 10Volans: cumin: explicitely require clustershell [puppet] - 10https://gerrit.wikimedia.org/r/446838 (https://phabricator.wikimedia.org/T187773) [14:50:33] 10Operations, 10TCB-Team, 10wikidiff2, 10WMDE-QWERTY-Sprint-2018-07-17: Update wikidiff2 library on the WMF production cluster to v1.7.2 - https://phabricator.wikimedia.org/T199801 (10WMDE-Fisch) >>! In T199801#4438521, @MoritzMuehlenhoff wrote: > Ok, just ping this task when you have the 1.7.2 release rea... 
[14:50:48] (03PS1) 10Jforrester: Install but don't enable the WikibaseMediaInfo extension, part I [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446841 (https://phabricator.wikimedia.org/T180981) [14:50:50] (03PS1) 10Jforrester: Install but don't enable the WikibaseMediaInfo extension, part II [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446842 (https://phabricator.wikimedia.org/T180981) [14:50:52] (03PS1) 10Jforrester: Install but don't enable the WikibaseMediaInfo extension, part III [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446843 (https://phabricator.wikimedia.org/T180981) [14:50:54] (03PS1) 10Jforrester: Install but don't enable the WikibaseMediaInfo extension, part IV [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446844 (https://phabricator.wikimedia.org/T180981) [14:50:56] (03PS1) 10Jforrester: Enable the WikibaseMediaInfo extension in Beta Cluster Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446845 (https://phabricator.wikimedia.org/T180981) [14:51:20] (03CR) 10Volans: [C: 032] cumin: explicitely require clustershell [puppet] - 10https://gerrit.wikimedia.org/r/446838 (https://phabricator.wikimedia.org/T187773) (owner: 10Volans) [14:51:56] (03PS3) 10Jcrespo: mariadb: Repool es1019 fully after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446755 (https://phabricator.wikimedia.org/T197073) [14:51:58] (03PS1) 10Jcrespo: mariadb: Depool db1099 for cloning and upgrade [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446846 (https://phabricator.wikimedia.org/T197073) [14:52:38] (03CR) 10jerkins-bot: [V: 04-1] Enable the WikibaseMediaInfo extension in Beta Cluster Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446845 (https://phabricator.wikimedia.org/T180981) (owner: 10Jforrester) [14:54:00] (03CR) 10Jcrespo: [C: 032] mariadb: Depool db1099 for cloning and upgrade [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446846 (https://phabricator.wikimedia.org/T197073) 
(owner: 10Jcrespo) [14:54:08] (03PS2) 10Jcrespo: mariadb: Depool db1099 for cloning and upgrade [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446846 (https://phabricator.wikimedia.org/T197073) [14:55:18] 10Operations, 10ops-eqiad, 10Cloud-VPS, 10cloud-services-team: Rack/cable/configure asw2-b-eqiad switch stack - https://phabricator.wikimedia.org/T183585 (10Marostegui) >>! In T183585#4424395, @ayounsi wrote: > Aiming at doing the asw-b to asw2-b migration on July 31st (3pm UTC, 11am EDT, 8am PDT), 4h. > d... [14:57:14] (03PS1) 10Jforrester: Delete multiversion/submodules.json, putatively unused [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446847 [14:59:35] !log rebooting multatuli for kernel test [14:59:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:00:29] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: Depool db1099 (duration: 00m 54s) [15:00:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:00:41] PROBLEM - puppet last run on labcontrol1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [15:01:11] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1105:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446848 [15:02:06] (03PS2) 10Marostegui: Revert "db-eqiad.php: Depool db1105:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446848 [15:03:01] PROBLEM - puppet last run on cloudcontrol1004 is CRITICAL: CRITICAL: Catalog fetch fail. 
Either compilation failed or puppetmaster has issues [15:03:31] (03PS4) 10Jcrespo: mariadb: Repool es1019 fully after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446755 (https://phabricator.wikimedia.org/T197073) [15:03:33] (03PS1) 10Jcrespo: mariadb: Fully repool db1099, including db1099:s8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446849 (https://phabricator.wikimedia.org/T197073) [15:04:11] (03PS2) 10Jcrespo: mariadb: Fully depool db1099, including db1099:s8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446849 (https://phabricator.wikimedia.org/T197073) [15:05:51] all kafka brokers are running with the new heap settings (2G xmx/xms rather than 1g), everything looks good [15:06:54] (03CR) 10Jforrester: [C: 04-2] "Not yet. :-)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446845 (https://phabricator.wikimedia.org/T180981) (owner: 10Jforrester) [15:07:00] (03PS2) 10Jforrester: Enable the WikibaseMediaInfo extension in Beta Cluster Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446845 (https://phabricator.wikimedia.org/T180981) [15:07:41] PROBLEM - puppet last run on labtestcontrol2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues [15:08:54] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1105:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446848 (owner: 10Marostegui) [15:10:23] 10Operations, 10Datasets-General-or-Unknown: Sometimes (at peak usage?), dumps.wikimedia.org becomes very slow for users (sometimes unresponsive) - https://phabricator.wikimedia.org/T45647 (10jcrespo) Is someone still suffering from this issue anymore? If not, it should be closed. 
[15:10:29] (Merged) jenkins-bot: Revert "db-eqiad.php: Depool db1105:3312" [mediawiki-config] - https://gerrit.wikimedia.org/r/446848 (owner: Marostegui) [15:11:38] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: Repool db1105:3312 after alter table (duration: 00m 54s) [15:11:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:11:50] Operations, monitoring, Patch-For-Review: Sunset Watchmouse's status.wikimedia.org - https://phabricator.wikimedia.org/T199816 (Quiddity) For wiki-pages that currently link to that subdomain (that should be updated to remove the link, or otherwise considered as part of this decision), it looks like t... [15:12:42] !log volans@sarin conftool action : set/pooled=yes; selector: name=mw2225.codfw.wmnet [15:12:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:13:55] Operations, ops-eqiad, Cloud-VPS, cloud-services-team: Rack/cable/configure asw2-b-eqiad switch stack - https://phabricator.wikimedia.org/T183585 (ayounsi) [15:15:50] Operations, Datasets-General-or-Unknown: Provide a good download service of dumps from Wikimedia - https://phabricator.wikimedia.org/T122917 (ArielGlenn) [15:15:54] Operations, Datasets-General-or-Unknown: Sometimes (at peak usage?), dumps.wikimedia.org becomes very slow for users (sometimes unresponsive) - https://phabricator.wikimedia.org/T45647 (ArielGlenn) Open>declined Given that the hosting setup for this service is different now, this might as well be...
[15:15:56] (PS1) Volans: wmf-auto-reimage: fix py3 syntax [puppet] - https://gerrit.wikimedia.org/r/446852 [15:16:41] (CR) jerkins-bot: [V: -1] wmf-auto-reimage: fix py3 syntax [puppet] - https://gerrit.wikimedia.org/r/446852 (owner: Volans) [15:17:41] (CR) Volans: [V: 2 C: 2] "CI test it with the wrong tools (py2 vs py3)" [puppet] - https://gerrit.wikimedia.org/r/446852 (owner: Volans) [15:19:20] Operations, LDAP-Access-Requests: Add Lea Voget (WMDE) & Bmueller to the WMDE LDAP group - https://phabricator.wikimedia.org/T199967 (herron) p:Triage>Normal Hi @RStallman-legalteam, could you please get the NDA ball rolling for these two users? Thanks in advance! [15:19:51] PROBLEM - swift-container-replicator on ms-be1040 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-replicator [15:19:51] PROBLEM - swift-account-server on ms-be1040 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server [15:19:52] PROBLEM - swift-object-updater on ms-be1040 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-updater [15:19:52] PROBLEM - swift-object-auditor on ms-be1040 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor [15:20:01] ugh, that's me and the expired downtime [15:20:01] PROBLEM - swift-container-server on ms-be1040 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server [15:20:05] fixed [15:20:36] Operations, LDAP-Access-Requests, Graphite, User-Addshore: Give Bmueller grafana-admin access - https://phabricator.wikimedia.org/T199965 (herron) p:Triage>Normal [15:20:50] Operations, LDAP-Access-Requests, Graphite, User-Addshore: Give Lea Voget (WMDE) grafana-admin access - https://phabricator.wikimedia.org/T199966 (herron) p:Triage>Normal [15:21:16] !log reimaging mw2226
to test py3 reimage with new cumin, conftool and apache test from neodymium - T187773 [15:21:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:21:20] T187773: Cumin: upgrade it to 3.0.1 in production - https://phabricator.wikimedia.org/T187773 [15:27:30] (03PS1) 10Thcipriani: releasers-mediawiki: fix sudo permissions [puppet] - 10https://gerrit.wikimedia.org/r/446859 [15:28:00] (03PS3) 10Jcrespo: mariadb: Fully depool db1099, including db1099:s8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446849 (https://phabricator.wikimedia.org/T197073) [15:30:57] (03CR) 10Thcipriani: [C: 031] jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [15:31:41] 10Operations, 10Discovery-Search (Current work): Google Search Console access for Search Platform team - https://phabricator.wikimedia.org/T188453 (10herron) Since this task has been inactionable in the access request queue for some time I'm going to remove the SRE-Access-Requests tag. Please re-add project t... 
[15:32:11] (03CR) 10jenkins-bot: Enable TemplateStyles on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/444567 (https://phabricator.wikimedia.org/T197603) (owner: 10Gergő Tisza) [15:32:13] (03CR) 10jenkins-bot: Enable Structured Discussions beta feature on orwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446756 (https://phabricator.wikimedia.org/T199971) (owner: 10Catrope) [15:32:15] (03CR) 10jenkins-bot: Enable Special:Block Feedback Request [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446613 (https://phabricator.wikimedia.org/T199919) (owner: 10Dbarratt) [15:32:17] (03CR) 10jenkins-bot: Remove labs wikis from the categories-rdf list, don't need them [mediawiki-config] - 10https://gerrit.wikimedia.org/r/445720 (https://phabricator.wikimedia.org/T198356) (owner: 10Smalyshev) [15:32:20] (03CR) 10jenkins-bot: db-eqiad.php: Give db1103:3314 more traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446780 (owner: 10Marostegui) [15:32:22] (03CR) 10jenkins-bot: db-eqiad.php: Fully repool db1103:3314 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446812 (owner: 10Marostegui) [15:32:24] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1103:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446817 (https://phabricator.wikimedia.org/T199368) (owner: 10Marostegui) [15:32:26] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1103:3312" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446828 (owner: 10Marostegui) [15:32:28] (03CR) 10jenkins-bot: db-eqiad.php: Depool db1105:3312 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446830 (https://phabricator.wikimedia.org/T199368) (owner: 10Marostegui) [15:32:30] (03CR) 10jenkins-bot: mariadb: Depool db1099 for cloning and upgrade [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446846 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [15:32:32] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1105:3312" [mediawiki-config] - 
https://gerrit.wikimedia.org/r/446848 (owner: Marostegui) [15:34:53] (CR) Jcrespo: [C: 2] mariadb: Fully depool db1099, including db1099:s8 [mediawiki-config] - https://gerrit.wikimedia.org/r/446849 (https://phabricator.wikimedia.org/T197073) (owner: Jcrespo) [15:36:18] (CR) Volans: "As requested I've only had a quick pass to the acme_requests.py file. Apart some comments inline (all minors), I appreciate the effort to " (5 comments) [software/certcentral] - https://gerrit.wikimedia.org/r/446618 (https://phabricator.wikimedia.org/T199717) (owner: Vgutierrez) [15:36:24] (Merged) jenkins-bot: mariadb: Fully depool db1099, including db1099:s8 [mediawiki-config] - https://gerrit.wikimedia.org/r/446849 (https://phabricator.wikimedia.org/T197073) (owner: Jcrespo) [15:39:12] Operations, ops-codfw, Analytics, Analytics-Kanban, and 5 others: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 (Milimetric) [15:50:37] !log dropping undocumented grants on labsdb1005 using replication T199186 [15:50:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:51:55] RECOVERY - Check systemd state on ms-be1035 is OK: OK - running: The system is fully operational [15:52:09] godog: FYI another systemd slice, I've cleared it ^^^ [15:53:01] !log CI: npmjs is slow.
Bumping Quibble jobs timeout from 30 to 45 minutes | https://gerrit.wikimedia.org/r/#/c/446863 | T198348 [15:53:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:53:05] T198348: Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 [15:56:15] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0 [15:56:25] PROBLEM - Router interfaces on cr1-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 63, down: 1, dormant: 0, excluded: 0, unused: 0 [15:57:08] Operations, User-notice: 2018 data center switchover: Move all the things over to codfw - https://phabricator.wikimedia.org/T200022 (Jdforrester-WMF) p:Triage>Normal [15:57:33] Operations, User-Johan: 2018 data center switchover: Move all the things back to eqiad - https://phabricator.wikimedia.org/T200023 (Jdforrester-WMF) p:Triage>Normal [15:57:48] Operations, User-notice: 2018 data center switchover: Move all the things over to codfw - https://phabricator.wikimedia.org/T200022 (Jdforrester-WMF) [15:58:15] Operations, Traffic, Patch-For-Review: Pick up a suitable ACME library for certcentral - https://phabricator.wikimedia.org/T199717 (Vgutierrez) I've been playing a little bit with the ACME library from certbot and it looks promising. on https://gerrit.wikimedia.org/r/#/c/operations/software/certcentr...
[15:58:25] PROBLEM - Host labmon1001 is DOWN: PING CRITICAL - Packet loss = 100% [15:59:59] !log Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds [16:00:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:00:03] T132839: [RfC] Property suggester suggests human properties for non-human items - https://phabricator.wikimedia.org/T132839 [16:00:05] godog, moritzm, and _joe_: (Dis)respected human, time to deploy Puppet SWAT(Max 6 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1600). Please do the needful. [16:00:05] thcipriani: A patch you scheduled for Puppet SWAT(Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [16:00:10] o/ [16:02:58] my patch is pretty easy, looks like a typo [16:03:04] let me see [16:03:25] yep makes sense to me [16:05:17] so apparently the typo was made by chad [16:05:52] at https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/399123/ [16:05:59] and not corrected by robh [16:06:22] =[ [16:07:06] (CR) Jcrespo: [C: 2] releasers-mediawiki: fix sudo permissions [puppet] - https://gerrit.wikimedia.org/r/446859 (owner: Thcipriani) [16:10:57] thcipriani: could you log out of releases1001, log in again and try, preferably with a status call [16:11:09] well, status doesn't need super [16:11:15] maybe something else [16:11:57] not sure if the service can be restarted easily [16:12:06] jynus: sure, just tried it. It was failing previously on releases1001 and is now working. I also see my permissions in sudo -l are updated: thanks for your help! [16:12:24] thank you for spotting the error [16:12:58] volans: gah, thanks! [16:13:35] tell me if you need me to do it manually on the other servers [16:16:17] I don't need puppet manually run anywhere else, thanks!
[16:16:29] (03CR) 10jenkins-bot: mariadb: Fully depool db1099, including db1099:s8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446849 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [16:16:44] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: Fully depool db1099, including from s8 (duration: 00m 55s) [16:16:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:19:35] (03PS1) 10BBlack: foundationwiki: revert references to the root URL [puppet] - 10https://gerrit.wikimedia.org/r/446867 (https://phabricator.wikimedia.org/T199812) [16:20:52] (03CR) 10BBlack: [C: 032] foundationwiki: revert references to the root URL [puppet] - 10https://gerrit.wikimedia.org/r/446867 (https://phabricator.wikimedia.org/T199812) (owner: 10BBlack) [16:21:15] RECOVERY - Host labmon1001 is UP: PING OK - Packet loss = 0%, RTA = 0.33 ms [16:27:55] (03PS1) 10Andrew Bogott: labmon1001: Turn on paging for wmcs Ops [puppet] - 10https://gerrit.wikimedia.org/r/446868 [16:28:02] !log stop, clone away and upgrade db1099 two mysql instances [16:28:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:28:58] (03CR) 10Arturo Borrero Gonzalez: [C: 031] labmon1001: Turn on paging for wmcs Ops [puppet] - 10https://gerrit.wikimedia.org/r/446868 (owner: 10Andrew Bogott) [16:29:29] (03PS1) 10DCausse: Upgrade to 6.3.1-alpha1 (without hebrew) [software/elasticsearch/plugins] - 10https://gerrit.wikimedia.org/r/446869 (https://phabricator.wikimedia.org/T199791) [16:31:15] (03CR) 10Andrew Bogott: [C: 032] labmon1001: Turn on paging for wmcs Ops [puppet] - 10https://gerrit.wikimedia.org/r/446868 (owner: 10Andrew Bogott) [16:37:16] (03CR) 10Jcrespo: [V: 032 C: 032] [WIP] Create framework to transfer files over the LAN [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/433556 (owner: 10Rduran) [16:37:26] (03CR) 10Jcrespo: [V: 032 C: 032] [WIP] Use Cumin to implement the comunication for the transfer 
[software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/433557 (https://phabricator.wikimedia.org/T156462) (owner: 10Rduran) [16:37:34] (03CR) 10Jcrespo: [V: 032 C: 032] [WIP] Refactor code in transfer.py [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/433558 (https://phabricator.wikimedia.org/T156462) (owner: 10Rduran) [16:37:43] (03CR) 10Jcrespo: [V: 032 C: 032] [WIP] Add unit tests for transfer.py and CumminExecution [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/437503 (owner: 10Rduran) [16:48:11] 10Operations, 10ops-codfw, 10Analytics, 10Analytics-Kanban, and 5 others: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 (10mobrovac) A little recap and current status. We tried obtaining heap dumps from memory-fat processes in codfw, but these we... [17:00:04] cscott, arlolra, subbu, halfak, and Amir1: That opportune time is upon us again. Time for a Services – Graphoid / Parsoid / Citoid / ORES deploy. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1700). 
[17:15:48] PROBLEM - Device not healthy -SMART- on db2061 is CRITICAL: cluster=mysql device=cciss,9 instance=db2061:9100 job=node site=codfw https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=db2061&var-datasource=codfw%2520prometheus%252Fops [17:18:07] (PS13) Jcrespo: [WIP] Add replication managing [software/wmfmariadbpy] - https://gerrit.wikimedia.org/r/439871 [17:18:09] (PS1) Jcrespo: transfer.py: Make checksum optional [software/wmfmariadbpy] - https://gerrit.wikimedia.org/r/446871 (https://phabricator.wikimedia.org/T156462) [17:19:24] (PS3) Hashar: jenkins: add buildsDir system property [puppet] - https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) [17:24:47] Operations, Maps (Tilerator): Externalize tile storage for maps - https://phabricator.wikimedia.org/T196474 (Mholloway) [17:24:50] Operations, Maps (Tilerator): Investigate Swift as a storage backend for maps tiles - https://phabricator.wikimedia.org/T149885 (Mholloway) [17:26:18] (PS1) Daimona Eaytoy: Add abusefilter-modify-restricted right to sysops on itwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/446872 (https://phabricator.wikimedia.org/T199783) [17:29:23] AaronSchulz: There's a guy in #mediawiki complaining that his MediaWiki install is super slow since he upgraded and the profile says it's spending most of the time in ChronologyProtector [17:29:32] Operations, Maps (Tilerator): Increase frequency of OSM replication - https://phabricator.wikimedia.org/T137939 (Mholloway) Still something we're interested in doing, but not sufficiently high-priority for #maps-sprint.
[17:30:08] AaronSchulz: I'm wondering if maybe 52af356cad3799 needs to be backported, but I'm not really familiar with this part of the codebase [17:33:16] or some other chronology protector change needs to be [17:39:20] that is strange, that part of the codebase could create stalls, but without cpu usage, as it is implemented with database waits [17:39:30] and I assume profiling == CPU usage? [17:39:50] or is it wall clock time? [17:40:17] (03PS5) 10Jcrespo: mariadb: Repool es1019 fully after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446755 (https://phabricator.wikimedia.org/T197073) [17:40:18] (03PS1) 10Jcrespo: mariadb: Repool db1099 (both instances) with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446875 (https://phabricator.wikimedia.org/T197073) [17:40:25] Well usleep is showing up in the profile, I would assume that means it's wall clock time [17:41:00] then, yes, as a part of code thought for waiting could indeed create waits :-) [17:41:05] I assume you're referring to Ulfr's issue at https://pastebin.com/5avpsR2z [17:41:17] jynus: indeed [17:41:18] I don't know, I am going by your report [17:41:38] Just wanted to make sure you weren't talking about some separate server issue :) [17:42:13] problem is that part of the code is probably very specific deployment-dependent [17:42:39] While it does get enabled by default if you have more than 1 db server [17:42:58] yeah, but probably that is a minority of 3rd party users [17:43:31] so it probably doesn't get extensive 3rd party testing [17:43:52] yeah definitely a minority, but there are more than you would expect [17:44:01] once you include the corporate users [17:44:24] if those are milliseconds [17:44:30] that is 5 seconds stuck waiting [17:44:43] (03CR) 10Herron: jenkins: add buildsDir system property (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [17:44:50] the op's solution would be to reduce
the timeout to 0 [17:45:08] well, not a solution, just a bad workaround [17:45:23] in the dirtiest way [17:45:33] I know that is configurable [17:54:34] (03PS4) 10Thcipriani: jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [17:56:09] (03CR) 10Thcipriani: [C: 031] jenkins: add buildsDir system property (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [18:00:04] addshore, hashar, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: It is that lovely time of the day again! You are hereby commanded to deploy Morning SWAT (Max 6 patches). (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1800). [18:00:04] Amir1, Lucas_WMDE, and Daimona: A patch you scheduled for Morning SWAT (Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [18:00:12] o/ [18:00:15] Hey [18:01:04] Since I'm not the only one, can I come back in 10 minutes? 
[18:01:07] 10Operations, 10Operations-Software-Development, 10Goal, 10Patch-For-Review: Release and deploy Debmonitor (patch management software) [Technology Goal 2017-18_Q4] - https://phabricator.wikimedia.org/T191298 (10Volans) [18:01:56] I think so [18:02:03] Nice :-) [18:02:21] (03CR) 10Herron: [C: 031] "https://puppet-compiler.wmflabs.org/compiler02/11818/" [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [18:03:07] Since no one is around I do the SWAT [18:05:46] (03CR) 10Ladsgroup: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446804 (https://phabricator.wikimedia.org/T194165) (owner: 10Ladsgroup) [18:07:20] (03Merged) 10jenkins-bot: Set to write the new change tag backend everywhere, enable reading in frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446804 (https://phabricator.wikimedia.org/T194165) (owner: 10Ladsgroup) [18:08:12] (03CR) 10jenkins-bot: Set to write the new change tag backend everywhere, enable reading in frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446804 (https://phabricator.wikimedia.org/T194165) (owner: 10Ladsgroup) [18:08:31] o/ [18:08:34] sorry I’m late [18:10:16] And I'm back [18:12:29] works fine, Moving forward [18:14:12] Lucas_WMDE: You're next [18:14:20] !log ladsgroup@deploy1001 Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:446804|Set to write the new change tag backend everywhere, enable reading in frwiki (T194165)]] (duration: 00m 55s) [18:14:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:14:27] T194165: Start writing to change_tag_def in production - https://phabricator.wikimedia.org/T194165 [18:15:29] ok [18:15:42] I’m trying to check how to verify the change once it’s deployed… [18:17:48] !log start of ladsgroup@mwmaint1001:~$ foreachwikiindblist all populateChangeTagDef.php (T193873) It will take a couple of days [18:18:04] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:18:09] T193873: Run maintenance script to populate change_tag_def on WMF production (all wikis) - https://phabricator.wikimedia.org/T193873 [18:22:22] Amir1: it seems I can’t reproduce the bug that my patch tries to fix right now [18:22:29] so I won’t be able to tell you if the fix works or not :/ [18:22:59] Lucas_WMDE: should I revert it? [18:23:25] no [18:23:32] I still think it’s a good idea to deploy [18:24:04] hmm, this seems very intermittent and hard to test [18:24:06] okay [18:24:06] but I’m not sure if it makes sense to go through the debug servers first [18:24:18] Noted [18:24:48] it was not as intermittent a few hours ago… but it seems to have calmed down since then https://grafana.wikimedia.org/dashboard/db/wikidata-quality [18:25:52] (you can still see cached SPARQL errors on all kinds of items you visit, because the results are cached, but on the special page I’m not getting them anymore) [18:26:14] (03PS5) 10Thcipriani: jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [18:26:57] okay, I just got a ton of errors on Q183 [18:27:05] with the patch applied those should turn into TODOs [18:27:13] Amir1: looks like I will be able to test after all [18:27:13] !log ladsgroup@deploy1001 Synchronized php-1.32.0-wmf.13/extensions/WikibaseQualityConstraints: SWAT: [[gerrit:446813|Report SPARQL errors as TODO, not VIOLATION (T199788)]] (duration: 00m 56s) [18:27:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:27:17] T199788: Don’t report SPARQL errors as violations - https://phabricator.wikimedia.org/T199788 [18:27:33] ^ Ignore this, I forgot to rebase [18:28:46] It's not merged yet :/ [18:32:59] come one jenkins [18:33:03] 17 minutes already [18:33:51] 92% https://integration.wikimedia.org/ci/job/quibble-composer-mysql-hhvm-docker/308/console [18:34:16] 10Operations, 
10ops-eqiad, 10decommission, 10User-ArielGlenn: decommission snapshot1001 - https://phabricator.wikimedia.org/T197021 (10RobH) [18:38:12] finally [18:39:08] Lucas_WMDE: It's live in mwdebug1002, please test [18:39:39] (03PS1) 10RobH: decom of snapshot1001 [puppet] - 10https://gerrit.wikimedia.org/r/446879 (https://phabricator.wikimedia.org/T197021) [18:40:02] Amir1: checking [18:40:16] Daimona: you're next after Lucas. Are you admin on itwiktionary and can test the patch? [18:40:17] (03CR) 10RobH: [C: 032] decom of snapshot1001 [puppet] - 10https://gerrit.wikimedia.org/r/446879 (https://phabricator.wikimedia.org/T197021) (owner: 10RobH) [18:40:30] Yes, they gave me temporary rights [18:40:35] Great [18:44:59] ruh roh [18:45:03] (03PS1) 10RobH: decom of snapshot1001 production dns [dns] - 10https://gerrit.wikimedia.org/r/446880 (https://phabricator.wikimedia.org/T197021) [18:45:04] backend errors [18:45:09] * Lucas_WMDE peeks at logstash [18:45:35] (03CR) 10RobH: [C: 032] decom of snapshot1001 production dns [dns] - 10https://gerrit.wikimedia.org/r/446880 (https://phabricator.wikimedia.org/T197021) (owner: 10RobH) [18:46:11] I can't see any errors in mwdebug1002 [18:46:35] it might just be due to how long the constraint check takes…? [18:46:48] “Error: 503, Backend fetch failed” [18:47:00] 10Operations, 10ops-eqiad, 10decommission, 10User-ArielGlenn: decommission snapshot1001 - https://phabricator.wikimedia.org/T197021 (10RobH) a:03Cmjohnson [18:47:45] (03Abandoned) 10Dduvall: ci: Docker registry for container builds [puppet] - 10https://gerrit.wikimedia.org/r/345422 (https://phabricator.wikimedia.org/T161657) (owner: 10Dduvall) [18:48:36] Lucas_WMDE: moving forward? 
[18:48:47] (03Abandoned) 10Dduvall: labs: Expand paths for nuyaml hiera lookup under common [puppet] - 10https://gerrit.wikimedia.org/r/274566 (owner: 10Dduvall) [18:49:01] Amir1: I guess so [18:49:06] I just got some TODOs on https://www.wikidata.org/wiki/Special:ConstraintReport/Q42 [18:49:12] which would otherwise have been violations [18:49:14] so the change is working [18:49:20] okay [18:49:29] and the errors must be because of how long the constraint check takes [18:49:53] (does mwdebug1002 have less CPU power than the regular servers, perhaps?) [18:50:19] !log ladsgroup@deploy1001 Synchronized php-1.32.0-wmf.13/extensions/WikibaseQualityConstraints: SWAT: [[gerrit:446813|Report SPARQL errors as TODO, not VIOLATION (T199788)]] (duration: 00m 56s) [18:50:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:50:23] T199788: Don’t report SPARQL errors as violations - https://phabricator.wikimedia.org/T199788 [18:50:34] Lucas_WMDE: ^ Test it please [18:51:38] (03CR) 10Ladsgroup: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446872 (https://phabricator.wikimedia.org/T199783) (owner: 10Daimona Eaytoy) [18:52:41] Daimona: you're up :) [18:53:14] Nice [18:53:17] (03Merged) 10jenkins-bot: Add abusefilter-modify-restricted right to sysops on itwiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446872 (https://phabricator.wikimedia.org/T199783) (owner: 10Daimona Eaytoy) [18:53:34] (03CR) 10jenkins-bot: Add abusefilter-modify-restricted right to sysops on itwiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446872 (https://phabricator.wikimedia.org/T199783) (owner: 10Daimona Eaytoy) [18:53:49] Amir1: can confirm, it works without mw-debug [18:53:58] Daimona: live in mwdebug1002, please test [18:54:09] Lucas_WMDE: coolio. See you soon! 
[18:54:23] Going [18:54:53] Working fine [18:55:48] nice, going live [18:57:41] !log ladsgroup@deploy1001 Synchronized wmf-config/abusefilter.php: SWAT: [[gerrit:446872|Add abusefilter-modify-restricted right to sysops on itwiktionary (T199783)]] (duration: 00m 53s) [18:57:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:57:45] T199783: it.wiktionary : request for $wgAbuseFilterAvailableActions[] = 'block' - https://phabricator.wikimedia.org/T199783 [18:57:55] Daimona: ^ live now, please double check :) [19:00:00] Ok, going [19:00:04] Deploy window MediaWiki train - Americas version (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T1900) [19:00:39] Yes, still working, many thanks :-) [19:00:51] :) Thank you! [19:00:58] !log Morning SWAT is done [19:01:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:04:02] !log roll restart eventstreams on scb2* hosts to prevent OOM issues over the EU night - T199813 [19:04:05] hashar: herron around for (hopefully short) jenkins upgrade? [19:04:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:04:06] T199813: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 [19:04:54] it's probably best to start with the releases jenkins server so if there are problems it's less impactful for folks [19:05:03] agreed [19:05:05] ready! [19:05:18] and contint2001 is a nice sandbox area to validate the puppet patch [19:05:39] so we can disable puppet on contint1001 , merge the patch, verify on releases hosts / contint2001 [19:05:49] thank you herron for helping out ! :] [19:06:07] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decom/reclaim tin - https://phabricator.wikimedia.org/T196175 (10RobH) [19:06:16] for sure! 
[19:06:35] hashar: sounds like a good plan to me [19:06:41] (03PS1) 10RobH: decom of tin [puppet] - 10https://gerrit.wikimedia.org/r/446884 (https://phabricator.wikimedia.org/T196175) [19:07:45] (03PS1) 10RobH: decom tin production entries [dns] - 10https://gerrit.wikimedia.org/r/446885 (https://phabricator.wikimedia.org/T196175) [19:08:13] okkk [19:08:25] puppet disabled on contint1001 [19:09:02] cool [19:09:03] hashar: do you have root on the releases machines? [19:09:09] probably not [19:09:09] I've locked contint1001/2001 [19:09:28] I lack root [19:09:39] herron: could you lock releases1001/2001? [19:09:53] it may have been only no_justificatio.n who had root there? [19:10:00] lock them how? [19:10:26] sorry, disable puppet there [19:10:33] ah! sure [19:11:29] done, so now contint1001 and releases[12]001 have puppet agent disabled [19:11:40] so we can no merge the puppet change [19:11:42] I got contint2001, so that's all of them [19:11:45] and upgrade jenkins on one of the host [19:12:12] contint2001 or releases2001 are the spare so we can use those to test puppet + jenkins startup [19:12:14] (03CR) 10RobH: [C: 032] decom of tin [puppet] - 10https://gerrit.wikimedia.org/r/446884 (https://phabricator.wikimedia.org/T196175) (owner: 10RobH) [19:12:14] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decom/reclaim tin - https://phabricator.wikimedia.org/T196175 (10RobH) [19:12:47] (03CR) 10RobH: [C: 032] decom tin production entries [dns] - 10https://gerrit.wikimedia.org/r/446885 (https://phabricator.wikimedia.org/T196175) (owner: 10RobH) [19:13:02] (03PS3) 10ArielGlenn: do 8 jobs in parallel for wikidata weeklies [puppet] - 10https://gerrit.wikimedia.org/r/432368 (https://phabricator.wikimedia.org/T181936) [19:14:08] (03CR) 10ArielGlenn: [C: 032] do 8 jobs in parallel for wikidata weeklies [puppet] - 10https://gerrit.wikimedia.org/r/432368 (https://phabricator.wikimedia.org/T181936) (owner: 10ArielGlenn) [19:14:14] ohhhh 
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/446774/ ran through the puppet compiler [19:14:18] completely forgot about ppc [19:14:26] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decom/reclaim tin - https://phabricator.wikimedia.org/T196175 (10RobH) a:03Cmjohnson [19:15:00] yes, although the link I pasted in there is stale. there was another run after ps5 [19:15:08] hopefully ExecStart=jenkins -D${ITEM_FULL_NAME} will be treated literally [19:15:19] here it is https://puppet-compiler.wmflabs.org/compiler02/11820/ [19:15:53] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decom/reclaim tin - https://phabricator.wikimedia.org/T196175 (10RobH) [19:15:55] yeah hmm [19:16:05] +ExecStart=/usr/bin/java -Djenkins.model.Jenkins.buildsDir=/srv/jenkins/builds/${ITEM_FULL_NAME} [19:16:09] (03CR) 10Herron: "> https://puppet-compiler.wmflabs.org/compiler02/11818/" [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [19:16:18] hashar: should be single quoted now [19:16:33] buildsDir='/srv/jenkins/builds/${ITEM_FULL_NAME}' [19:16:51] see https://puppet-compiler.wmflabs.org/compiler02/11820/ [19:16:52] (03PS1) 10Volans: wmf-decommission-host: initial version [puppet] - 10https://gerrit.wikimedia.org/r/446887 (https://phabricator.wikimedia.org/T198649) [19:17:03] (03CR) 10Hashar: [C: 031] jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [19:17:08] yeah that looks saner [19:17:20] herron: since neither hashar or I have root on releases1001 could you merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/446774/, run apt-get install jenkins, and then unlock/run puppet? [19:17:32] yeah you bet, will do that now [19:17:36] after that we can restart jenkins and checkout the update. 
[19:18:00] cool [19:18:05] apt-getting now [19:18:12] systemctl stop jenkins ; puppet agent -tv; apt upgrade [19:18:46] I cant remember whether the package upgrade starts it automagically, and if it start the service, I cant remember whether it uses the proper script [19:18:58] !log setting mtu 9192 to all codfw interface ranges [19:19:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:19:05] so I usually look at the full process command running with ps -u jenkins f [19:19:17] actually let me set some icing downtimes [19:21:47] ok downtimes are in [19:21:56] carrying on with releases1001 [19:22:42] I should have thought about the icinga down [19:23:15] (03PS4) 10ArielGlenn: quick script to show runtimes of dump jobs [dumps] - 10https://gerrit.wikimedia.org/r/444603 (https://phabricator.wikimedia.org/T199117) [19:23:35] systemctl forcing a pager is so annoying :\ [19:23:52] (03CR) 10ArielGlenn: [C: 032] quick script to show runtimes of dump jobs [dumps] - 10https://gerrit.wikimedia.org/r/444603 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [19:24:10] hmm [19:24:25] https://www.irccloud.com/pastebin/KntS1QfH/ [19:25:03] eek [19:25:10] !log ariel@deploy1001 Started deploy [dumps/dumps@f0c4a70]: handle empty tables properly; script to show runtimes of dump steps [19:25:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:25:13] !log ariel@deploy1001 Finished deploy [dumps/dumps@f0c4a70]: handle empty tables properly; script to show runtimes of dump steps (duration: 00m 03s) [19:25:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:26:32] herron: maybe redownload it? 
apt-get clean && apt-get install jenkins [19:26:41] worth a shot [19:26:54] else you would have to mess up with reprepro there are guidances on https://wikitech.wikimedia.org/wiki/Jenkins#Updating [19:27:03] no love :( [19:27:46] (03PS2) 10ArielGlenn: fix typo [puppet] - 10https://gerrit.wikimedia.org/r/444850 [19:27:52] grblblbl [19:28:08] I’ll try the reprepro checkupdate [19:28:14] we can do a basic md5 comparison of http://pkg.jenkins-ci.org/debian-stable/binary/jenkins_2.121.2_all.deb and /var/cache/apt/archives/jenkins_2.121.2_all.deb [19:28:59] (03CR) 10ArielGlenn: [C: 032] fix typo [puppet] - 10https://gerrit.wikimedia.org/r/444850 (owner: 10ArielGlenn) [19:30:37] herron: md5 mismatch ! [19:30:56] a freshly downloaded one has 888e4d585c1636c0e4a93b5334b3595b [19:31:12] the (bad) one downloaded from apt has bc579711d3265bc9ed359d7b5bc207b2 [19:32:10] alright so I suppose we try importing http://pkg.jenkins-ci.org/debian-stable/binary/jenkins_2.121.2_all.deb again [19:32:25] yup [19:32:49] and reprepro should have their gpg key already [19:33:02] or maybe it is just on the target hosts [19:35:27] and I have no idea how reprepro works :\ [19:36:14] heh I am looking into how this was imported myself [19:36:41] reprepro -C thirdparty includedeb jessie-wikimedia jenkins_2.121.2_all.deb [19:36:41] reprepro export jessie-wikimedia [19:36:51] accoridng to the orange warning on https://wikitech.wikimedia.org/wiki/Jenkins#Updating [19:37:16] it's under thirdparty/ci now, right? [19:37:23] and stretch [19:37:46] right [19:38:05] * hashar tries sudo npm install jenkins [19:38:36] hashar: that won't work, you forgot the global flag :P [19:39:54] at least the check sum is wrong on both jessie and stretch [19:41:38] herron: maybe the new deb can just be added again? 
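Editor's note: the "basic md5 comparison" mentioned above can be sketched roughly as follows. This is a minimal illustration, not what was actually run in the channel; the function names md5_of_file and same_package are hypothetical, and the paths in the usage note are only examples taken from the discussion.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_package(fresh_path, cached_path):
    """True when both files have identical MD5 digests."""
    return md5_of_file(fresh_path) == md5_of_file(cached_path)
```

Assuming a freshly downloaded copy were saved next to the apt cache copy, same_package("jenkins_2.121.2_all.deb", "/var/cache/apt/archives/jenkins_2.121.2_all.deb") would return False for the mismatch reported above. MD5 is adequate for spotting a corrupted download, but it is not a substitute for verifying the repository signature.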
[19:41:49] and reprepro export might then just override the old one [19:42:36] it’s bailing out because the checksum of the new package doesn’t match, and removing the existing package is erroring out too [19:43:29] :^\ what is the erroring on package removal? [19:43:58] else we can grab the package directly on the hosts, do the puppet/upgrade etc [19:44:06] and I can catch up with moritz.m tomorrow [19:44:44] else you might end up spending the whole day figuring out reprepro [19:46:06] haha, I posted more details in security [19:49:55] hashar: morit.z is on vacation ;) I'm having a look [19:51:32] (03PS1) 10ArielGlenn: Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [19:52:23] (03CR) 10jerkins-bot: [V: 04-1] Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [19:55:05] (03PS2) 10ArielGlenn: Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [19:55:45] (03CR) 10jerkins-bot: [V: 04-1] Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [19:57:12] (03PS2) 10Volans: wmf-decommission-host: initial version [puppet] - 10https://gerrit.wikimedia.org/r/446887 (https://phabricator.wikimedia.org/T198649) [19:59:08] (03PS3) 10ArielGlenn: Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [20:07:47] (03PS2) 10Jcrespo: mariadb: Repool db1099 (both instances) with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446875 (https://phabricator.wikimedia.org/T197073) [20:09:33] (03PS4) 10ArielGlenn: Collect run times for 
certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [20:10:12] (03CR) 10jerkins-bot: [V: 04-1] Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [20:12:24] (03PS5) 10ArielGlenn: Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [20:13:00] (03CR) 10Jcrespo: [C: 032] mariadb: Repool db1099 (both instances) with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446875 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [20:18:40] have we picked up releases1001.eqiad.wmnet as the candidate? [20:18:58] then we can systemctl stop jenkins; puppet agent -tv; apt upgrade [20:19:01] (03CR) 10Jcrespo: [V: 032 C: 032] mariadb: Repool db1099 (both instances) with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446875 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [20:19:10] and then check whether jenkins got started with the proper parameters [20:19:15] might have to systemctl start it [20:19:28] ok, I have the package updated but not yet merged that patch [20:19:37] so it looks like the apt-install started jenkins [20:19:57] (03PS1) 10Andrew Bogott: labtestn: define second_region_designate_host [puppet] - 10https://gerrit.wikimedia.org/r/446901 [20:20:18] hashar: so, yeah, your order seems correct [20:20:40] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: Repool db1099 with low load (duration: 00m 54s) [20:20:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:20:47] doesn't matter so much for releases, but probably will for ci-jenkins [20:21:01] should the patch be merged yet? [20:21:17] the puppet patch? 
yes please [20:21:17] yes [20:21:23] ok, doing it [20:21:30] (03PS6) 10Jcrespo: mariadb: Repool es1019 fully after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446755 (https://phabricator.wikimedia.org/T197073) [20:21:32] (03PS1) 10Jcrespo: mariadb: Repool db1099 fully after warmup [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446902 (https://phabricator.wikimedia.org/T197073) [20:21:43] (03PS6) 10Herron: jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [20:22:28] (03CR) 10Herron: [V: 032 C: 032] jenkins: add buildsDir system property [puppet] - 10https://gerrit.wikimedia.org/r/446774 (https://phabricator.wikimedia.org/T199448) (owner: 10Hashar) [20:23:33] merged [20:23:53] running the commands you listed now hashar [20:25:18] (03CR) 10jenkins-bot: mariadb: Repool db1099 (both instances) with low load [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446875 (https://phabricator.wikimedia.org/T197073) (owner: 10Jcrespo) [20:25:28] (03PS2) 10Andrew Bogott: wmcs: define second_region_designate_host [puppet] - 10https://gerrit.wikimedia.org/r/446901 [20:25:39] (03PS6) 10ArielGlenn: Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [20:25:57] done, jenkins 2.121.2 installed [20:26:01] cool [20:26:04] systemctl status jenkins [20:26:14] would indicate whether the deb upgrade started it or not [20:26:27] (03PS1) 10RobH: decom prod dns for db1053 [dns] - 10https://gerrit.wikimedia.org/r/446903 (https://phabricator.wikimedia.org/T194634) [20:26:30] (03PS3) 10Andrew Bogott: wmcs: define second_region_designate_host [puppet] - 10https://gerrit.wikimedia.org/r/446901 [20:26:34] releases1001 jenkins[2182]: SEVERE: [jenkins.InitReactorRunner$1 onTaskFailed] Failed Loading global config [20:26:57] :/ [20:27:19] (03CR) 10RobH: [C: 032] decom prod 
dns for db1053 [dns] - 10https://gerrit.wikimedia.org/r/446903 (https://phabricator.wikimedia.org/T194634) (owner: 10RobH) [20:27:22] Main PID: 2182 (code=exited, status=143) [20:27:26] (03CR) 10Andrew Bogott: [C: 032] wmcs: define second_region_designate_host [puppet] - 10https://gerrit.wikimedia.org/r/446901 (owner: 10Andrew Bogott) [20:27:30] that is java bailling out I guess [20:28:48] well [20:29:40] I wonder if it's because we still have the buildsDir in the globalconfig? [20:30:11] (03PS1) 10RobH: decom db1053 [puppet] - 10https://gerrit.wikimedia.org/r/446904 (https://phabricator.wikimedia.org/T194634) [20:30:25] (03PS7) 10ArielGlenn: Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) [20:30:26] I dont get it [20:30:31] is that releases1001? [20:30:37] yes [20:30:37] it is [20:30:38] Active: active (running) [20:30:43] and there is a java process [20:30:44] (03PS2) 10RobH: decom db1053 [puppet] - 10https://gerrit.wikimedia.org/r/446904 (https://phabricator.wikimedia.org/T194634) [20:30:48] TIME 0:20 and some command [20:30:49] (03CR) 10RobH: [C: 032] decom db1053 [puppet] - 10https://gerrit.wikimedia.org/r/446904 (https://phabricator.wikimedia.org/T194634) (owner: 10RobH) [20:31:40] AHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh [20:31:42] https://releases-jenkins.wikimedia.org/ [20:31:45] (03CR) 10ArielGlenn: [C: 032] Collect run times for certain dump steps on the big wikis [puppet] - 10https://gerrit.wikimedia.org/r/446894 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [20:31:48] jenkins.model.InvalidBuildsDir: /builds does not contain ${ITEM_FULL_NAME} or ${ITEM_ROOTDIR}, cannot distinguish between projects [20:31:52] right [20:32:06] I guess releases is missing the magic hieria / profile thing to inject the build dir? 
[20:32:09] (03PS3) 10RobH: decom db1053 [puppet] - 10https://gerrit.wikimedia.org/r/446904 (https://phabricator.wikimedia.org/T194634) [20:32:23] apergos: im in rebase race with you [20:32:25] ;] [20:32:51] I won ;-P [20:32:55] you did [20:33:05] and now you have a clear field. [20:33:05] hashar: so it's in the service file [20:33:18] hashar: but if you look at ps output: -Djenkins.model.Jenkins.buildsDir=/builds [20:33:29] hmm [20:33:42] so even though it's single-quoted, it's trying to expand it :( [20:33:49] maybe systemd needs to be reloaded [20:33:59] sure will try that [20:35:02] systemctl show jenkins |grep ExecStart [20:35:12] yeah that gives hmm ${ITEM_ROOTDIR}/builds [20:35:16] without the single quotes :/// [20:36:37] hashar: > To pass a literal dollar sign, use "$$". Variables whose value is not known at expansion time are treated as empty strings. Note that the first argument (i.e. the program to execute) may not be a variable. [20:36:59] https://www.freedesktop.org/software/systemd/man/systemd.service.html#Command%20lines [20:37:07] (03PS1) 10Andrew Bogott: Keystone: more second_region_designate_host things [puppet] - 10https://gerrit.wikimedia.org/r/446966 [20:37:21] 10Operations, 10ops-eqiad, 10DBA, 10decommission: Decommission db1053 - https://phabricator.wikimedia.org/T194634 (10RobH) [20:37:23] great [20:37:29] because: of course [20:37:38] ' Quotes themselves are removed' [20:38:00] yeah, sorry forgot this detail, and I actually hit it already once in the past [20:38:23] it would be very great to have all that process managed in a reliable language such as bash [20:38:30] err my english is crap [20:38:47] (03CR) 10Andrew Bogott: [C: 032] Keystone: more second_region_designate_host things [puppet] - 10https://gerrit.wikimedia.org/r/446966 (owner: 10Andrew Bogott) [20:39:18] I updated to $$ in place and that error went away [20:39:44] https://releases-jenkins.wikimedia.org seems happy [20:39:48] AHH [20:39:56] you guys are all my heroes [20:40:40] 
disabled puppet again on releases1001 while persisting that [20:41:06] so ideally [20:41:20] on puppet modules/jenkins/manifests/init.pp should accept ${ITEM_DIR} [20:41:32] (03PS1) 10Thcipriani: Escape '$' with '$' for systemd unit files [puppet] - 10https://gerrit.wikimedia.org/r/446967 [20:41:37] but escape the $ sign when doing the assignement to java_args [20:42:17] RECOVERY - puppet last run on labcontrol1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [20:43:38] (03PS1) 10Hashar: jenkins: escape $ in tokens before passing to systemd [puppet] - 10https://gerrit.wikimedia.org/r/446969 [20:43:41] hashar: https://gerrit.wikimedia.org/r/c/operations/puppet/+/446967 [20:43:48] yeah and mine is slightly different https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/446969 :D [20:43:51] doing it just in time [20:44:07] RECOVERY - puppet last run on labtestcontrol2003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [20:44:26] I am sure herron will come with a 3rd way! [20:44:47] regsubst, now I'm waiting for: ${ITEM}/buildroot$$ :P [20:44:57] hehe [20:45:33] yeah [20:45:57] at least release is up and upgraded [20:46:20] (03CR) 10RobH: [C: 031] wmf-decommission-host: initial version [puppet] - 10https://gerrit.wikimedia.org/r/446887 (https://phabricator.wikimedia.org/T198649) (owner: 10Volans) [20:46:28] haha two patches is company, three is a crowd [20:46:39] do we need to escape the first $? 
regsubst($builds_dir, '\$', '$$', 'G') [20:46:42] https://puppet-compiler.wmflabs.org/compiler02/11825/releases1001.eqiad.wmnet/ [20:47:01] hahaha, called it [20:47:06] grrr unluckily that went with adding the dollar at the end :^\ [20:47:19] solving problems with regexes: now you have two problems [20:48:46] so [20:48:48] (03Abandoned) 10Thcipriani: Escape '$' with '$' for systemd unit files [puppet] - 10https://gerrit.wikimedia.org/r/446967 (owner: 10Thcipriani) [20:48:50] either escape with $$ [20:48:53] or with \$ :] [20:48:58] I am taking baits [20:49:31] * herron rolls dice [20:49:37] afaik linart poettering had nothing to do with puppet, so I'd guess \$ [20:50:13] *lennart [20:50:30] (03PS1) 10ArielGlenn: Collect slowest revision history content dump runs [puppet] - 10https://gerrit.wikimedia.org/r/446970 (https://phabricator.wikimedia.org/T199117) [20:51:05] (03CR) 10jerkins-bot: [V: 04-1] Collect slowest revision history content dump runs [puppet] - 10https://gerrit.wikimedia.org/r/446970 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [20:51:36] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0 [20:51:38] * volans gotta go now, sorry [20:52:27] RECOVERY - Router interfaces on cr1-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 65, down: 0, dormant: 0, excluded: 0, unused: 0 [20:52:34] * hashar writes test [20:52:35] (03PS2) 10ArielGlenn: Collect slowest revision history content dump runs [puppet] - 10https://gerrit.wikimedia.org/r/446970 (https://phabricator.wikimedia.org/T199117) [20:57:16] (03PS2) 10Hashar: jenkins: escape $ in tokens before passing to systemd [puppet] - 10https://gerrit.wikimedia.org/r/446969 [20:57:25] herron: thcipriani this time with a spec file [20:57:57] so whatever we set in hiera or pass to the class [20:58:04] it is escaped when being passed to systemd [20:58:34] * hashar grabs watter [20:59:10] RECOVERY - puppet last 
run on cloudcontrol1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:00:38] * thcipriani runs puppet compiler [21:02:21] hashar: herron https://puppet-compiler.wmflabs.org/compiler02/11827/ [21:02:40] nice [21:03:19] (03CR) 10Thcipriani: [C: 031] jenkins: escape $ in tokens before passing to systemd [puppet] - 10https://gerrit.wikimedia.org/r/446969 (owner: 10Hashar) [21:03:41] (03PS1) 10Andrew Bogott: labtestn: update some designate settings [puppet] - 10https://gerrit.wikimedia.org/r/446971 [21:04:01] I feel sorry it is dragging everyone for so long :^\ [21:04:22] ^ sorry herron and thank you [21:04:36] no worries [21:04:49] so right now it’s set as -Djenkins.model.Jenkins.buildsDir=$${ITEM_ROOTDIR}/builds (manually) [21:05:01] we can try running puppet on releases1001 [21:05:03] reload systemd [21:05:06] we want it to be -Djenkins.model.Jenkins.buildsDir=/srv/jenkins/builds/$${ITEM_FULL_NAME} ? [21:05:06] restart jenkins [21:05:09] and see what happens [21:05:13] yeah double $$ for systemd [21:05:25] (03CR) 10Andrew Bogott: [C: 032] labtestn: update some designate settings [puppet] - 10https://gerrit.wikimedia.org/r/446971 (owner: 10Andrew Bogott) [21:05:26] herron: for releases, ITEM_ROOTDIR is correct [21:05:32] ahh gotcha [21:05:38] for contint, it should be /srv/jenkins/etc... [21:05:39] else if we use ${ITEM_FULL_NAME} it will lookup an env variable ITEM_FULL_NAME and falllback to an empty string [21:06:06] makes sense now [21:06:28] here goes [21:06:38] (03PS3) 10Herron: jenkins: escape $ in tokens before passing to systemd [puppet] - 10https://gerrit.wikimedia.org/r/446969 (owner: 10Hashar) [21:07:38] (03CR) 10Herron: [C: 032] jenkins: escape $ in tokens before passing to systemd [puppet] - 10https://gerrit.wikimedia.org/r/446969 (owner: 10Hashar) [21:09:45] enabling puppet on contint2001 [21:10:41] puppet run was a noop on releases1001 (as expected) [21:11:02] great!!!!!!!!!!!!!! 
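The escaping discussed above can be sketched in Python. This is only an illustration, not the Puppet code from change 446969: the helper name escape_for_systemd is hypothetical, and Python's re.sub stands in for Puppet's regsubst. It assumes what the log suggests, namely that an unescaped '$' pattern acts as the end-of-string anchor (which would explain the dollar being added at the end), while '\$' matches the literal dollar sign. Doubling the dollar is what lets systemd pass a literal ${ITEM_FULL_NAME} through to Jenkins instead of looking it up as an environment variable.

```python
import re


def escape_for_systemd(value: str) -> str:
    """Double every '$' so systemd hands a literal '$' to the command."""
    return value.replace("$", "$$")


# Value as it would be set in hiera or passed to the class (from the log).
builds_dir = "/srv/jenkins/builds/${ITEM_FULL_NAME}"

# Unescaped pattern: '$' is the end-of-string anchor, so the replacement
# is appended at the end and the '$' inside the value is left alone.
anchored = re.sub(r"$", "$$", builds_dir)

# Escaped pattern: '\$' matches the literal dollar sign.
literal = re.sub(r"\$", "$$", builds_dir)

print(anchored)
print(literal)
print(escape_for_systemd(builds_dir))
```

A plain string replacement avoids the regex pitfall entirely, which is why the helper does not use re at all.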
[21:11:06] (03CR) 10ArielGlenn: [C: 032] Collect slowest revision history content dump runs [puppet] - 10https://gerrit.wikimedia.org/r/446970 (https://phabricator.wikimedia.org/T199117) (owner: 10ArielGlenn) [21:11:15] (03PS3) 10ArielGlenn: Collect slowest revision history content dump runs [puppet] - 10https://gerrit.wikimedia.org/r/446970 (https://phabricator.wikimedia.org/T199117) [21:11:43] herron: can you upgrade releases2001 as well ? [21:11:55] sure [21:12:00] doing that now [21:15:16] FWIW, releases1001 seems to be respecting buildsDir and is working fine afaict [21:15:22] releases-jenkins rather [21:15:53] awesome [21:15:59] guess I can do contint1001 now [21:16:21] releases2001 is done [21:16:59] !log stopping the CI jenkins on contint1001 [21:17:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:17:57] buildsDir=/srv/jenkins/builds/$${ITEM_FULL_NAME} [21:18:08] \o/ [21:18:13] beautiful [21:18:17] but should we reload systemd ? [21:18:23] sure [21:18:34] also: probably [21:18:45] necessary [21:19:00] hmm puppet does it apparently [21:19:06] fancy [21:19:55] !log starting CI jenkins on contint1001 [21:19:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:20:32] oh yeah [21:20:33] it booted [21:20:53] hashar: still says old version [21:21:08] ahhh [21:21:11] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0 [21:21:11] I forgot apt install [21:21:29] !log restarting CI on contint1001 and actually upgrading it [21:21:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:21:33] I should have let you do the commands [21:22:01] PROBLEM - Router interfaces on cr1-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 63, down: 1, dormant: 0, excluded: 0, unused: 0 [21:22:24] do you need me to apt-get install on contint? 
[21:22:32] /usr/bin/java -Djenkins.model.Jenkins.buildsDir=/srv/jenkins/builds/${ITEM_FULL_NAME} [21:22:41] or you have the power there [21:22:42] herron: thcipriani looks good :) [21:22:54] so we have root on contint1001 / contint2001 [21:23:00] but not on releases1001 / release2001 [21:23:00] nice ok [21:23:49] so a build of https://integration.wikimedia.org/ci/job/pywikibot-core-tox-docker/3547/ has its build stuff in /srv/jenkins/builds/pywikibot-core-tox-docker/3547 [21:23:55] which mean the parameter works properly [21:23:55] ah good, and everyone who had global build got global cancel [21:25:08] herron: I think we are set. Thank you and sorry for all the rerepro madness + the puppet escaping issue [21:25:23] no worries! glad it all worked out in the end [21:25:59] (03PS2) 10Thcipriani: Scap: update-interwiki-cache for labs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446507 (https://phabricator.wikimedia.org/T198844) [21:26:14] I see jobs getting triggered [21:26:24] herron: thanks for all your help! [21:27:16] any time! [21:32:40] RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0 [21:33:30] RECOVERY - Router interfaces on cr1-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 65, down: 0, dormant: 0, excluded: 0, unused: 0 [21:42:02] herron: have you removed the scheduled maintenance flags in icinga? [21:42:08] or do they clear up automatically? [21:42:51] I set 2 hour downtime for the host and all services, so that will expire on its own [21:42:56] hosts* [21:43:19] herron: awesome. Thank you very very much for all the extra time [21:43:28] usually it takes like 10 or 15 minutes :\ [21:43:28] Did the train not go out on group2 today? [21:43:40] no probelm! [21:43:41] davidwbarratt: nope. too many blockers [21:44:19] https://phabricator.wikimedia.org/T191059 [21:44:29] hashar ah. when's the next oppertunity? 
[21:46:17] I would expect that's a question best left to the developers handling the blockers [21:47:06] Krenair oh so it's just as soon as the blockers are resolved? [21:47:24] I doubt that [21:48:14] davidwbarratt: I dont think it will happen today, we dont deploy on friday [21:48:28] so probably we would push it on monday if bugs got fixed (I am sure they will) [21:52:12] !log CI Jenkins has been upgraded successfully! [21:52:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:52:18] I am offf [21:59:24] (03PS1) 10Andrew Bogott: wmcs puppetmasters: pass in second_region_designate_host [puppet] - 10https://gerrit.wikimedia.org/r/446981 [22:04:12] (03PS1) 10Dbarratt: Revert "Enable Special:Block Feedback Request" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446982 [22:06:46] (03PS2) 10Dbarratt: Revert "Enable Special:Block Feedback Request" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446982 [22:10:05] (03CR) 10Andrew Bogott: [C: 032] wmcs puppetmasters: pass in second_region_designate_host [puppet] - 10https://gerrit.wikimedia.org/r/446981 (owner: 10Andrew Bogott) [22:18:13] !log enable ospf on eqdfw-knams link (GTT) [22:18:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:22:41] PROBLEM - HTTP availability for Varnish at esams on einsteinium is CRITICAL: job=varnish-text site=esams https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=3&fullscreen&refresh=1m&orgId=1 [22:22:41] PROBLEM - HTTP availability for Nginx -SSL terminators- at esams on einsteinium is CRITICAL: cluster=cache_text site=esams https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=4&fullscreen&refresh=1m&orgId=1 [22:23:41] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0] 
https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5 [22:23:50] RECOVERY - HTTP availability for Varnish at esams on einsteinium is OK: All metrics within thresholds. https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=3&fullscreen&refresh=1m&orgId=1 [22:24:11] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=esams&var-cache_type=All&var-status_type=5 [22:25:01] RECOVERY - HTTP availability for Nginx -SSL terminators- at esams on einsteinium is OK: All metrics within thresholds. https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=4&fullscreen&refresh=1m&orgId=1 [22:26:40] (03PS3) 10EBernhardson: [WIP] Add mjolnir kafka daemon to primary elasticsearch clusters [puppet] - 10https://gerrit.wikimedia.org/r/445254 (https://phabricator.wikimedia.org/T198490) [22:30:30] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5 [22:30:51] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=esams&var-cache_type=All&var-status_type=5 [22:43:53] 10Operations, 10monitoring, 10Privacy, 10Security-Core: status.wikimedia.org should not load Google Analytics - https://phabricator.wikimedia.org/T115945 (10Framawiki) [22:43:55] 10Operations, 10monitoring, 10Patch-For-Review: Sunset Watchmouse's status.wikimedia.org - 
https://phabricator.wikimedia.org/T199816 (10Framawiki) [22:46:25] 10Operations, 10Privacy, 10Security: status.wikimedia.org should have an alternative privacy policy - https://phabricator.wikimedia.org/T189763 (10Framawiki) [22:48:41] 10Operations, 10monitoring, 10Privacy, 10Security: status.wikimedia.org should have an alternative privacy policy - https://phabricator.wikimedia.org/T189763 (10Framawiki) [22:51:31] !log disable ospf on eqdfw-knams link (GTT) [22:51:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:55:57] (03PS1) 10Bstorm: gridengine: stretch doesn't have an hhvm package [puppet] - 10https://gerrit.wikimedia.org/r/446990 (https://phabricator.wikimedia.org/T199276) [23:00:04] addshore, hashar, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) Evening SWAT (Max 6 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180719T2300). [23:00:04] davidwbarratt: A patch you scheduled for Evening SWAT (Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [23:00:19] here! [23:03:36] anyone able to SWAT? [23:05:12] I can SWAT [23:05:39] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446982 (owner: 10Dbarratt) [23:05:50] thcipriani yay! thanks! [23:05:59] * thcipriani doffs cap [23:06:57] (03Merged) 10jenkins-bot: Revert "Enable Special:Block Feedback Request" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446982 (owner: 10Dbarratt) [23:07:16] 10Operations, 10Commons, 10Multimedia, 10media-storage, 10User-Josve05a: Specific revisions of multiple files missing from Swift - 404 Not Found returned - https://phabricator.wikimedia.org/T124101 (10AlexisJazz) https://commons.wikimedia.org/wiki/File:Chris_Benoit_in_the_Ring.jpg 7 previous revisions m... 
[23:07:59] davidwbarratt: your change is live on mwdebug1002, check please [23:08:08] (03PS1) 10Bstorm: dumps distribution: failing web services over to labstore1006 [puppet] - 10https://gerrit.wikimedia.org/r/446991 (https://phabricator.wikimedia.org/T196651) [23:08:10] checking [23:08:16] (03CR) 10jenkins-bot: Revert "Enable Special:Block Feedback Request" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/446982 (owner: 10Dbarratt) [23:08:54] thcipriani it looks perfect! [23:09:08] awesome! ok, going live. [23:11:03] !log thcipriani@deploy1001 Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:446982|Revert "Enable Special:Block Feedback Request"]] (duration: 00m 55s) [23:11:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:11:10] ^ davidwbarratt live everywhere [23:11:23] thcipriani looks great! thanks! [23:11:28] (03PS1) 10BBlack: Revert "eqsin+zero fallback" [puppet] - 10https://gerrit.wikimedia.org/r/446992 (https://phabricator.wikimedia.org/T189250) [23:11:29] yw :) [23:12:35] (03PS4) 10Bstorm: WIP dumps: fail over dumps web to labstore1006 [dns] - 10https://gerrit.wikimedia.org/r/446476 (https://phabricator.wikimedia.org/T196651) [23:12:43] (03CR) 10jerkins-bot: [V: 04-1] WIP dumps: fail over dumps web to labstore1006 [dns] - 10https://gerrit.wikimedia.org/r/446476 (https://phabricator.wikimedia.org/T196651) (owner: 10Bstorm) [23:13:08] 10Operations, 10ops-eqsin, 10Traffic: rack/setup scs-eqsin.mgmt.eqsin.wmnet - https://phabricator.wikimedia.org/T181569 (10BBlack) 05stalled>03Resolved Assuming the above is true :) [23:15:21] (03PS5) 10Bstorm: WIP dumps: fail over dumps web to labstore1006 [dns] - 10https://gerrit.wikimedia.org/r/446476 (https://phabricator.wikimedia.org/T196651) [23:19:02] (03PS6) 10Bstorm: dumps distribution: fail over dumps web to labstore1006 [dns] - 10https://gerrit.wikimedia.org/r/446476 (https://phabricator.wikimedia.org/T196651) [23:19:54] (03CR) 10Legoktm: "hhvm is available in 
stretch-wikimedia... https://tools.wmflabs.org/apt-browser/stretch-wikimedia/main/" [puppet] - 10https://gerrit.wikimedia.org/r/446990 (https://phabricator.wikimedia.org/T199276) (owner: 10Bstorm) [23:23:03] (03CR) 10Bstorm: [C: 032] dumps distribution: failing web services over to labstore1006 [puppet] - 10https://gerrit.wikimedia.org/r/446991 (https://phabricator.wikimedia.org/T196651) (owner: 10Bstorm) [23:26:28] (03CR) 10Bstorm: [C: 032] dumps distribution: fail over dumps web to labstore1006 [dns] - 10https://gerrit.wikimedia.org/r/446476 (https://phabricator.wikimedia.org/T196651) (owner: 10Bstorm) [23:32:39] (03PS1) 10BBlack: geo-maps: unblock eqsin routing for Zero-affected countries [dns] - 10https://gerrit.wikimedia.org/r/446996 (https://phabricator.wikimedia.org/T189250) [23:32:41] (03PS1) 10BBlack: geo-maps: cleanup ordering for OC [dns] - 10https://gerrit.wikimedia.org/r/446997 (https://phabricator.wikimedia.org/T189252) [23:33:27] (03CR) 10Bstorm: "> Patch Set 1:" [puppet] - 10https://gerrit.wikimedia.org/r/446990 (https://phabricator.wikimedia.org/T199276) (owner: 10Bstorm) [23:34:54] thcipriani: Got two wmf.13 fixes to make progress on train unblocking [23:35:02] (03CR) 10Bstorm: "Wait, I've got it. The problem is the opposite: There's nothing telling stretch VMs to install it in the first place :)" [puppet] - 10https://gerrit.wikimedia.org/r/446990 (https://phabricator.wikimedia.org/T199276) (owner: 10Bstorm) [23:35:05] First https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/446995/, and then cherry-pick https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/446777/ [23:35:10] Krinkle: awesome! [23:35:25] OK to merge the first one? [23:35:32] (03Abandoned) 10Bstorm: gridengine: stretch doesn't have an hhvm package [puppet] - 10https://gerrit.wikimedia.org/r/446990 (https://phabricator.wikimedia.org/T199276) (owner: 10Bstorm) [23:35:54] sure, I'm off the deployment server [23:36:01] Ah, okay. 
[23:36:09] I can deploy it, np [23:38:04] 10Operations, 10Traffic, 10Patch-For-Review: WP Zero workarounds for eqsin - https://phabricator.wikimedia.org/T189250 (10BBlack) Thanks @DFoy for chasing down the contracts and issues here, we're clear to remove these workaround and close up this ticket. \o/ [23:38:19] (03CR) 10BBlack: [C: 032] Revert "eqsin+zero fallback" [puppet] - 10https://gerrit.wikimedia.org/r/446992 (https://phabricator.wikimedia.org/T189250) (owner: 10BBlack) [23:38:28] (03PS2) 10BBlack: Revert "eqsin+zero fallback" [puppet] - 10https://gerrit.wikimedia.org/r/446992 (https://phabricator.wikimedia.org/T189250) [23:43:09] (03CR) 10BBlack: [C: 032] geo-maps: unblock eqsin routing for Zero-affected countries [dns] - 10https://gerrit.wikimedia.org/r/446996 (https://phabricator.wikimedia.org/T189250) (owner: 10BBlack) [23:43:39] (03CR) 10BBlack: [C: 032] geo-maps: cleanup ordering for OC [dns] - 10https://gerrit.wikimedia.org/r/446997 (https://phabricator.wikimedia.org/T189252) (owner: 10BBlack) [23:46:18] Hm.. jobs are taking >10min [23:46:26] 14min and counting [23:46:37] 10Operations, 10Traffic: Enable Service in Asia Cache DC - https://phabricator.wikimedia.org/T156026 (10BBlack) [23:46:49] bblack: nice going :) [23:47:02] 10Operations, 10Traffic, 10Patch-For-Review: WP Zero workarounds for eqsin - https://phabricator.wikimedia.org/T189250 (10BBlack) 05Open>03Resolved [23:47:10] UNDEPLOY WPZERO [23:47:34] I wish someone would give us a date for turning it off [23:47:48] WP -1? [23:48:09] WP -Infinity [23:48:17] Nvm, trademarked. [23:54:22] 10Operations, 10Traffic: Backend naming in VCL needs to use fqdn+port - https://phabricator.wikimedia.org/T138546 (10Krinkle) [23:54:51] Dunder Mifflin Infinity!