[00:01:53] that's that
[00:04:06] RECOVERY - puppet last run on mw2258 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:05:47] jouncebot: next
[00:05:47] In 106 hour(s) and 54 minute(s): Wikimedia Portals Update (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171127T1100)
[00:05:55] welp
[00:06:57] isn't there supposed to be a SWAT window right now?
[00:07:12] nope
[00:07:22] it's a us holiday tomorrow so this is like a friday
[00:07:35] https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171122T0000 tho?
[00:08:18] that was very early this morning
[00:08:50] oh I missed the "(Tue)"
[00:09:01] and thought it was today >.>
[00:13:06] Operations, Quarry, cloud-services-team: let quarry use the mariadb module - https://phabricator.wikimedia.org/T181205#3782981 (Dzahn)
[00:13:19] Operations, Quarry, cloud-services-team: let quarry use the mariadb module - https://phabricator.wikimedia.org/T181205#3782996 (Dzahn)
[00:13:22] Operations, DBA, Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070#3151523 (Dzahn)
[00:14:55] Operations, DBA, Patch-For-Review: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070#3151523 (Dzahn) Created subtask to make quarry use the mariadb module since that is one of the few things still using it.
[00:16:23] (CR) Dzahn: "and https://grafana.wikimedia.org/dashboard/db/prometheus-apache-hhvm-dc-stats?orgId=1 is still working now as well" [puppet] - https://gerrit.wikimedia.org/r/392763 (owner: Dzahn)
[00:21:39] Operations, Goal, Patch-For-Review, User-fgiunchedi, cloud-services-team (Kanban): Port non-deprecated Diamond collectors to Prometheus - https://phabricator.wikimedia.org/T177196#3650139 (Dzahn) Is there an issue with the memcached exporter since this looks empty: https://grafana.wikimedia....
[00:22:21] (CR) Dzahn: "https://grafana.wikimedia.org/dashboard/db/memcached?orgId=1&var-server=All&from=now-90d&to=now looks empty or am i looking at the wrong" [puppet] - https://gerrit.wikimedia.org/r/382922 (https://phabricator.wikimedia.org/T177225) (owner: Dzahn)
[00:28:16] (PS1) Dzahn: builder: use profile::base::firewall [puppet] - https://gerrit.wikimedia.org/r/392994
[00:28:57] (CR) Dzahn: [C: 2] builder: use profile::base::firewall [puppet] - https://gerrit.wikimedia.org/r/392994 (owner: Dzahn)
[00:31:44] (PS1) Dzahn: logstash: use profile::base::firewall [puppet] - https://gerrit.wikimedia.org/r/392996
[00:33:10] (CR) Dzahn: "no-op on boron.eqiad.wmnet" [puppet] - https://gerrit.wikimedia.org/r/392994 (owner: Dzahn)
[00:36:37] (PS1) Dzahn: url_downloader: move firewall to role, use profile [puppet] - https://gerrit.wikimedia.org/r/392997
[00:51:27] (CR) Chad: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/332777 (https://phabricator.wikimedia.org/T115713) (owner: Hashar)
[00:51:34] (CR) jerkins-bot: [V: -1] (WIP) run tests against multiple mw versions (WIP) [mediawiki-config] - https://gerrit.wikimedia.org/r/332777 (https://phabricator.wikimedia.org/T115713) (owner: Hashar)
[00:52:40] (CR) Chad: "Let's do it :)" [puppet] - https://gerrit.wikimedia.org/r/392181 (owner: Thcipriani)
[00:53:05] (CR) Chad: [C: 1] deployment servers: Switch to
component/git [puppet] - https://gerrit.wikimedia.org/r/392447 (owner: Muehlenhoff)
[00:53:23] (CR) Chad: [C: 2] keys: Document usage of gpg --fetch-keys to import all keys [mediawiki-config] - https://gerrit.wikimedia.org/r/392464 (owner: Legoktm)
[00:54:17] (CR) Chad: [C: 2] keys: Note that Chris, Mark and Markus are no longer releasing new versions [mediawiki-config] - https://gerrit.wikimedia.org/r/392462 (https://phabricator.wikimedia.org/T180615) (owner: Legoktm)
[00:54:42] (CR) Chad: [C: 2] "Fair enough, missed that as part of the whole topic" [mediawiki-config] - https://gerrit.wikimedia.org/r/392463 (owner: Legoktm)
[00:55:25] PROBLEM - SSH on scb1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:55:25] PROBLEM - pdfrender on scb1001 is CRITICAL: connect to address 10.64.0.16 and port 5252: Connection refused
[00:55:27] (CR) Chad: [C: 2] keys: Document which key is which in keys.txt [mediawiki-config] - https://gerrit.wikimedia.org/r/392461 (owner: Legoktm)
[00:56:41] (Merged) jenkins-bot: keys: Document which key is which in keys.txt [mediawiki-config] - https://gerrit.wikimedia.org/r/392461 (owner: Legoktm)
[00:56:46] (CR) jerkins-bot: [V: -1] keys: Remove keys of former release managers from keys.txt [mediawiki-config] - https://gerrit.wikimedia.org/r/392463 (owner: Legoktm)
[00:56:48] (CR) jerkins-bot: [V: -1] keys: Document usage of gpg --fetch-keys to import all keys [mediawiki-config] - https://gerrit.wikimedia.org/r/392464 (owner: Legoktm)
[00:56:58] (CR) jenkins-bot: keys: Document which key is which in keys.txt [mediawiki-config] - https://gerrit.wikimedia.org/r/392461 (owner: Legoktm)
[00:57:55] PROBLEM - cxserver endpoints health on scb1001 is CRITICAL: /v2/translate/{from}/{to}{/provider} (Machine translate an HTML fragment using Apertium, adapt the links to target language wiki.)
timed out before a response was received: /v1/mt/{from}/{to}{/provider} (Machine translate an HTML fragment using Apertium.) timed out before a response was received
[00:58:07] (PS3) Chad: keys: Note that Chris, Mark and Markus are no longer releasing new versions [mediawiki-config] - https://gerrit.wikimedia.org/r/392462 (https://phabricator.wikimedia.org/T180615) (owner: Legoktm)
[00:58:16] RECOVERY - SSH on scb1001 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[00:59:46] RECOVERY - cxserver endpoints health on scb1001 is OK: All endpoints are healthy
[01:00:16] (PS3) Chad: keys: Remove keys of former release managers from keys.txt [mediawiki-config] - https://gerrit.wikimedia.org/r/392463 (owner: Legoktm)
[01:00:19] (CR) jenkins-bot: keys: Note that Chris, Mark and Markus are no longer releasing new versions [mediawiki-config] - https://gerrit.wikimedia.org/r/392462 (https://phabricator.wikimedia.org/T180615) (owner: Legoktm)
[01:00:26] RECOVERY - pdfrender on scb1001 is OK: HTTP OK: HTTP/1.1 200 OK - 275 bytes in 0.006 second response time
[01:00:30] (PS3) Chad: keys: Document usage of gpg --fetch-keys to import all keys [mediawiki-config] - https://gerrit.wikimedia.org/r/392464 (owner: Legoktm)
[01:01:32] (CR) Chad: [C: -2] "This is dangerous. Other repos depend on this."
[mediawiki-config] - https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: TerraCodes)
[01:03:34] (CR) jenkins-bot: keys: Document usage of gpg --fetch-keys to import all keys [mediawiki-config] - https://gerrit.wikimedia.org/r/392464 (owner: Legoktm)
[01:03:43] (CR) jenkins-bot: keys: Remove keys of former release managers from keys.txt [mediawiki-config] - https://gerrit.wikimedia.org/r/392463 (owner: Legoktm)
[01:05:14] !log demon@tin Synchronized docroot/mediawiki/keys/: update formatting, docs, etc (duration: 00m 46s)
[01:05:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:09:19] (CR) Chad: "There isn't an $wmfAllServices[*]['ores'] anymore. Still needed?" [mediawiki-config] - https://gerrit.wikimedia.org/r/345510 (owner: Giuseppe Lavagetto)
[01:13:37] (PS4) Chad: Migrate AbuseFilter config off wmg variables, part 1 [mediawiki-config] - https://gerrit.wikimedia.org/r/374651 (owner: MaxSem)
[01:16:06] (CR) Chad: [C: 2] Migrate AbuseFilter config off wmg variables, part 1 [mediawiki-config] - https://gerrit.wikimedia.org/r/374651 (owner: MaxSem)
[01:16:41] (Merged) jenkins-bot: Migrate AbuseFilter config off wmg variables, part 1 [mediawiki-config] - https://gerrit.wikimedia.org/r/374651 (owner: MaxSem)
[01:16:52] (CR) jenkins-bot: Migrate AbuseFilter config off wmg variables, part 1 [mediawiki-config] - https://gerrit.wikimedia.org/r/374651 (owner: MaxSem)
[01:19:28] (CR) TerraCodes: "> This is dangerous. Other repos depend on this."
[mediawiki-config] - https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: TerraCodes)
[01:19:38] !log demon@tin Synchronized wmf-config/CommonSettings.php: abusefilter stuff (duration: 00m 45s)
[01:19:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:20:34] (PS3) Chad: Migrate AbuseFilter config off wmg variables, part 2 [mediawiki-config] - https://gerrit.wikimedia.org/r/374652 (owner: MaxSem)
[01:21:06] !log demon@tin Synchronized wmf-config/InitialiseSettings.php: abusefilter stuff (duration: 00m 45s)
[01:21:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:21:28] (PS3) Chad: Cleanup: Removed wgEnableRcFiltersBetaFeature setting for Beta Cluster, true everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/374383 (owner: Jforrester)
[01:21:42] (CR) Chad: [C: 2] Migrate AbuseFilter config off wmg variables, part 2 [mediawiki-config] - https://gerrit.wikimedia.org/r/374652 (owner: MaxSem)
[01:22:35] (CR) Chad: "Status here?"
[mediawiki-config] - https://gerrit.wikimedia.org/r/386361 (owner: Mobrovac)
[01:22:47] (Merged) jenkins-bot: Migrate AbuseFilter config off wmg variables, part 2 [mediawiki-config] - https://gerrit.wikimedia.org/r/374652 (owner: MaxSem)
[01:22:54] (CR) jenkins-bot: Migrate AbuseFilter config off wmg variables, part 2 [mediawiki-config] - https://gerrit.wikimedia.org/r/374652 (owner: MaxSem)
[01:23:54] (CR) Chad: [C: 2] Cleanup: Removed wgEnableRcFiltersBetaFeature setting for Beta Cluster, true everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/374383 (owner: Jforrester)
[01:24:24] !log demon@tin Synchronized wmf-config/CommonSettings.php: abusefilter stuff (duration: 00m 45s)
[01:24:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:25:05] (Merged) jenkins-bot: Cleanup: Removed wgEnableRcFiltersBetaFeature setting for Beta Cluster, true everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/374383 (owner: Jforrester)
[01:25:12] (PS2) Chad: Set $wgCentralAuthGlobalBlockInterwikiPrefix [mediawiki-config] - https://gerrit.wikimedia.org/r/386627 (owner: Anomie)
[01:26:00] (CR) jenkins-bot: Cleanup: Removed wgEnableRcFiltersBetaFeature setting for Beta Cluster, true everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/374383 (owner: Jforrester)
[01:26:16] !log demon@tin Synchronized wmf-config/InitialiseSettings-labs.php: no-op (duration: 00m 44s)
[01:26:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:27:04] (Abandoned) Chad: Revert "Enable Timeless everywhere" [mediawiki-config] - https://gerrit.wikimedia.org/r/392762 (owner: Iniquity)
[01:27:35] (CR) Chad: [C: 2] Set $wgCentralAuthGlobalBlockInterwikiPrefix [mediawiki-config] - https://gerrit.wikimedia.org/r/386627 (owner: Anomie)
[01:28:36] (Merged) jenkins-bot: Set $wgCentralAuthGlobalBlockInterwikiPrefix [mediawiki-config] -
https://gerrit.wikimedia.org/r/386627 (owner: Anomie)
[01:29:12] (PS4) Chad: Disable DisableAccount on wikis where there are no disabled users [mediawiki-config] - https://gerrit.wikimedia.org/r/338792 (https://phabricator.wikimedia.org/T106067) (owner: Reedy)
[01:29:30] (PS3) Chad: Disable EducationProgram on cs.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/391163 (https://phabricator.wikimedia.org/T180426) (owner: Urbanecm)
[01:29:41] !log demon@tin Synchronized wmf-config/CommonSettings.php: Set $wgCentralAuthGlobalBlockInterwikiPrefix (duration: 00m 45s)
[01:29:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:29:50] (CR) jenkins-bot: Set $wgCentralAuthGlobalBlockInterwikiPrefix [mediawiki-config] - https://gerrit.wikimedia.org/r/386627 (owner: Anomie)
[01:31:31] (PS4) Chad: Disable EducationProgram on cs.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/391163 (https://phabricator.wikimedia.org/T180426) (owner: Urbanecm)
[01:33:48] (CR) Chad: [C: 2] Disable EducationProgram on cs.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/391163 (https://phabricator.wikimedia.org/T180426) (owner: Urbanecm)
[01:34:55] (Merged) jenkins-bot: Disable EducationProgram on cs.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/391163 (https://phabricator.wikimedia.org/T180426) (owner: Urbanecm)
[01:36:32] !log demon@tin Synchronized wmf-config/InitialiseSettings.php: dropping education program from cswiki (duration: 00m 45s)
[01:36:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:36:51] (CR) jenkins-bot: Disable EducationProgram on cs.wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/391163 (https://phabricator.wikimedia.org/T180426) (owner: Urbanecm)
[01:43:57] (CR) Dereckson: [C: 1] "Cautious CR+1 as the community has indeed followed the required steps (discuss a policy, publish
it, link it on meta)." [mediawiki-config] - https://gerrit.wikimedia.org/r/390303 (https://phabricator.wikimedia.org/T166763) (owner: TerraCodes)
[01:46:21] (PS2) Dereckson: Add stemming languages settings for description indexing [mediawiki-config] - https://gerrit.wikimedia.org/r/392894 (https://phabricator.wikimedia.org/T176903) (owner: Smalyshev)
[01:47:21] (CR) Smalyshev: "It doesn't really depends on that patch (it can be merged independently, in any order) though they are of course related." [mediawiki-config] - https://gerrit.wikimedia.org/r/392894 (https://phabricator.wikimedia.org/T176903) (owner: Smalyshev)
[01:53:42] (CR) Dereckson: [C: -1] "Technically correct. Input from security would be nice before deploy this." [mediawiki-config] - https://gerrit.wikimedia.org/r/392224 (https://phabricator.wikimedia.org/T180648) (owner: Huji)
[01:55:48] (CR) Dereckson: [C: 1] Update logo in the footer button image [mediawiki-config] - https://gerrit.wikimedia.org/r/376150 (https://phabricator.wikimedia.org/T174603) (owner: Odder)
[01:57:22] (CR) Dereckson: "Is this change still needed?"
[mediawiki-config] - https://gerrit.wikimedia.org/r/345179 (https://phabricator.wikimedia.org/T160887) (owner: Daniel Kinzler)
[02:19:52] Operations, Quarry, cloud-services-team (Kanban): let quarry use the mariadb module - https://phabricator.wikimedia.org/T181205#3783068 (bd808)
[02:23:34] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.8) (duration: 05m 27s)
[02:23:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:57:56] (PS1) TerraCodes: Readd wikimedia.org to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[02:58:50] (CR) jerkins-bot: [V: -1] Readd wikimedia.org to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:05:38] (PS2) TerraCodes: Readd wikimedia.org to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[03:16:47] (CR) Krinkle: [C: -1] "This is used for Http connections in MediaWiki. While codereview-proxy might not exist anymore, there are still lots of other services and" [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:24:25] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 846.39 seconds
[03:25:00] (CR) TerraCodes: "> This is used for Http connections in MediaWiki. While" [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:26:31] (CR) Krinkle: [C: -1] "Yes. This is merely an optimisation at this point that bypasses the proxy.
If an entry is missing (like wikidata) the end result is that t" [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:28:46] (CR) Krinkle: [C: -1] "See also T172357. For potential reviewer: This is an old variable that should be audited very carefully for how it is used. It is unclear " [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:29:43] (PS3) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[03:30:20] (CR) TerraCodes: "> Yes. This is merely an optimisation at this point that bypasses the" [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:33:11] (CR) Krinkle: Add loginwiki and wikidata to $wgLocalVirtualHosts (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:37:53] (PS4) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[03:38:39] (CR) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:45:35] (CR) Krinkle: Add loginwiki and wikidata to $wgLocalVirtualHosts (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[03:59:26] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 220.58 seconds
[04:39:36] (PS5) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] -
https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[04:39:59] (CR) TerraCodes: ">" (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[04:41:13] (PS6) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[04:43:17] (CR) Chad: [C: -1] "Removes codereview-proxy mention, a must-have for nostalgia." [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[04:44:23] (PS7) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[05:04:57] (CR) Chad: [C: 1] "<3" [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[05:54:32] !log kartik@tin Started deploy [cxserver/deploy@5b35ed5]: Update cxserver to e8fe3f0
[05:54:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:58:45] !log kartik@tin Finished deploy [cxserver/deploy@5b35ed5]: Update cxserver to e8fe3f0 (duration: 04m 13s)
[05:58:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:19:28] (CR) Marostegui: [C: 1] mariadb: Switchover codfw s5 master from db2023 to db2052 [mediawiki-config] - https://gerrit.wikimedia.org/r/392879 (https://phabricator.wikimedia.org/T177208) (owner: Jcrespo)
[06:19:51] (CR) Marostegui: [C: 1] mariadb: Switchover s5 codfw master (db2023) to db2052 [puppet] - https://gerrit.wikimedia.org/r/392881 (https://phabricator.wikimedia.org/T176243) (owner: Jcrespo)
[06:20:28] Operations, DBA, Wikimedia-Site-requests: Global rename of JeanBono → Rexcornot: supervision needed -
https://phabricator.wikimedia.org/T181170#3783151 (Marostegui) I am online now :-)
[06:26:12] (PS1) Marostegui: db-eqiad.php: db1063, db1051 to fully serve dump [mediawiki-config] - https://gerrit.wikimedia.org/r/393012 (https://phabricator.wikimedia.org/T177208)
[06:27:45] PROBLEM - puppet last run on mw2173 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/prometheus-puppet-agent-stats]
[06:28:43] (CR) Marostegui: [C: 2] db-eqiad.php: db1063, db1051 to fully serve dump [mediawiki-config] - https://gerrit.wikimedia.org/r/393012 (https://phabricator.wikimedia.org/T177208) (owner: Marostegui)
[06:28:45] PROBLEM - puppet last run on mw2222 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/mediawiki-firejail-ghostscript]
[06:29:51] (Merged) jenkins-bot: db-eqiad.php: db1063, db1051 to fully serve dump [mediawiki-config] - https://gerrit.wikimedia.org/r/393012 (https://phabricator.wikimedia.org/T177208) (owner: Marostegui)
[06:30:08] (CR) jenkins-bot: db-eqiad.php: db1063, db1051 to fully serve dump [mediawiki-config] - https://gerrit.wikimedia.org/r/393012 (https://phabricator.wikimedia.org/T177208) (owner: Marostegui)
[06:32:06] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Fully pool db1051 and db1063 in vslow service for s5 to warm them up for the s8 split - T177208 (duration: 00m 46s)
[06:32:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:32:13] T177208: Provide dedicated database resources for wikidata - https://phabricator.wikimedia.org/T177208
[06:44:21] !log legoktm@tin:~$ echo "https://www.mediawiki.org/keys/keys.html" | mwscript purgeList.php
[06:44:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:48:39] (PS1) Marostegui: db-eqiad.php: Depool db1089
[mediawiki-config] - https://gerrit.wikimedia.org/r/393016 (https://phabricator.wikimedia.org/T180045)
[06:50:13] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1089 [mediawiki-config] - https://gerrit.wikimedia.org/r/393016 (https://phabricator.wikimedia.org/T180045) (owner: Marostegui)
[06:51:23] (Merged) jenkins-bot: db-eqiad.php: Depool db1089 [mediawiki-config] - https://gerrit.wikimedia.org/r/393016 (https://phabricator.wikimedia.org/T180045) (owner: Marostegui)
[06:51:38] (CR) jenkins-bot: db-eqiad.php: Depool db1089 [mediawiki-config] - https://gerrit.wikimedia.org/r/393016 (https://phabricator.wikimedia.org/T180045) (owner: Marostegui)
[06:53:04] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1089 - T180045 (duration: 00m 45s)
[06:53:04] !log Drop index on ores_classification on s2 - T180045
[06:53:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:53:12] T180045: Review and deploy schema change on dropping oresc_rev_predicted_model index - https://phabricator.wikimedia.org/T180045
[06:53:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:55:04] !log Drop index on ores_classification on s1 - T180045
[06:55:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:55:15] (PS1) Marostegui: Revert "db-eqiad.php: Depool db1089" [mediawiki-config] - https://gerrit.wikimedia.org/r/393020
[06:56:51] (CR) Marostegui: [C: 2] Revert "db-eqiad.php: Depool db1089" [mediawiki-config] - https://gerrit.wikimedia.org/r/393020 (owner: Marostegui)
[06:57:45] RECOVERY - puppet last run on mw2173 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[06:58:01] (Merged) jenkins-bot: Revert "db-eqiad.php: Depool db1089" [mediawiki-config] - https://gerrit.wikimedia.org/r/393020 (owner: Marostegui)
[06:58:28] (CR) jenkins-bot: Revert "db-eqiad.php: Depool db1089" [mediawiki-config] -
https://gerrit.wikimedia.org/r/393020 (owner: Marostegui)
[06:58:45] RECOVERY - puppet last run on mw2222 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[06:58:57] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1089 - T180045 (duration: 00m 45s)
[06:59:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:59:04] T180045: Review and deploy schema change on dropping oresc_rev_predicted_model index - https://phabricator.wikimedia.org/T180045
[07:10:39] (PS1) Marostegui: db-eqiad.php: Move db1053 to s2 [mediawiki-config] - https://gerrit.wikimedia.org/r/393023 (https://phabricator.wikimedia.org/T134476)
[07:12:03] (PS2) Marostegui: db-eqiad.php: Move db1053 to s2 [mediawiki-config] - https://gerrit.wikimedia.org/r/393023 (https://phabricator.wikimedia.org/T134476)
[07:13:46] (CR) Marostegui: [C: 2] db-eqiad.php: Move db1053 to s2 [mediawiki-config] - https://gerrit.wikimedia.org/r/393023 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:14:57] (Merged) jenkins-bot: db-eqiad.php: Move db1053 to s2 [mediawiki-config] - https://gerrit.wikimedia.org/r/393023 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:16:02] (PS1) Marostegui: mariadb: Move db1053 to s2 [puppet] - https://gerrit.wikimedia.org/r/393024 (https://phabricator.wikimedia.org/T134476)
[07:16:13] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Move db1053 to s2 to replace db1021 as vslow, dump slave - T134476 (duration: 00m 45s)
[07:16:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:16:23] T134476: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476
[07:16:52] (CR) jenkins-bot: db-eqiad.php: Move db1053 to s2 [mediawiki-config] - https://gerrit.wikimedia.org/r/393023 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:19:58] (CR) Marostegui: [C:
2] "Puppet looks good: https://puppet-compiler.wmflabs.org/compiler02/8943/" [puppet] - https://gerrit.wikimedia.org/r/393024 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:23:34] (PS1) Marostegui: db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/393025 (https://phabricator.wikimedia.org/T134476)
[07:26:26] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/393025 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:27:41] (Merged) jenkins-bot: db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/393025 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:27:50] (CR) jenkins-bot: db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/393025 (https://phabricator.wikimedia.org/T134476) (owner: Marostegui)
[07:28:51] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1021 - T134476 (duration: 00m 45s)
[07:28:55] !log Stop MySQL on db1021 and db1053 to clone db1053
[07:28:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:28:58] T134476: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476
[07:29:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:53:12] (PS1) Marostegui: s2,s4.hosts: Move db1053 from s4 to s2 [software] - https://gerrit.wikimedia.org/r/393026
[07:54:38] (CR) Marostegui: [C: 2] s2,s4.hosts: Move db1053 from s4 to s2 [software] - https://gerrit.wikimedia.org/r/393026 (owner: Marostegui)
[07:55:19] (Merged) jenkins-bot: s2,s4.hosts: Move db1053 from s4 to s2 [software] - https://gerrit.wikimedia.org/r/393026 (owner: Marostegui)
[07:56:33] (PS1) Marostegui: db1072.yaml: Change binlog to ROW [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058)
[07:58:44]
(PS1) Marostegui: db-eqiad.php: Depool db1072 [mediawiki-config] - https://gerrit.wikimedia.org/r/393028 (https://phabricator.wikimedia.org/T153058)
[08:02:01] (Abandoned) Marostegui: db1072.yaml: Change binlog to ROW [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:02:15] (Abandoned) Marostegui: db-eqiad.php: Depool db1072 [mediawiki-config] - https://gerrit.wikimedia.org/r/393028 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:14:44] (CR) Jcrespo: "I think we should deploy this anyway. Probably we should not run ROW by default?" [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:16:21] (CR) Marostegui: "> Probably we should not run" [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:16:28] (Restored) Marostegui: db1072.yaml: Change binlog to ROW [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:16:43] (Restored) Marostegui: db-eqiad.php: Depool db1072 [mediawiki-config] - https://gerrit.wikimedia.org/r/393028 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:19:05] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1072 [mediawiki-config] - https://gerrit.wikimedia.org/r/393028 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:20:16] (CR) Jcrespo: "Independently of the general decision, marking hosts that need row as ROW it is a good idea, the same way that marking hosts that need STA" [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:20:18] (Merged) jenkins-bot: db-eqiad.php: Depool db1072 [mediawiki-config] - https://gerrit.wikimedia.org/r/393028 (https://phabricator.wikimedia.org/T153058) (owner:
Marostegui)
[08:20:26] (CR) jenkins-bot: db-eqiad.php: Depool db1072 [mediawiki-config] - https://gerrit.wikimedia.org/r/393028 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:21:21] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 00m 45s)
[08:21:23] !log Stop MySQL on db1072 to MySQL upgrade and kernel upgrade
[08:21:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:21:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:21:33] (CR) Jcrespo: [C: 1] db1072.yaml: Change binlog to ROW [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:22:14] (PS2) Muehlenhoff: deployment servers: Switch to component/git [puppet] - https://gerrit.wikimedia.org/r/392447
[08:22:33] (CR) Marostegui: [C: 2] db1072.yaml: Change binlog to ROW [puppet] - https://gerrit.wikimedia.org/r/393027 (https://phabricator.wikimedia.org/T153058) (owner: Marostegui)
[08:26:00] (PS3) Jcrespo: mariadb: Switchover codfw s5 master from db2023 to db2052 [mediawiki-config] - https://gerrit.wikimedia.org/r/392879 (https://phabricator.wikimedia.org/T177208)
[08:29:56] !log starting switchover of db2023 to db2052
[08:30:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:31:18] !log installing bdb security updates on trusty
[08:31:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:44:29] !log stopping and restarting db2052
[08:44:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:44:36] (PS1) Marostegui: Revert "db-eqiad.php: Depool db1072" [mediawiki-config] - https://gerrit.wikimedia.org/r/393029
[08:46:26] (PS2) Jcrespo: mariadb: Switchover s5 codfw master (db2023) to db2052 [puppet] - https://gerrit.wikimedia.org/r/392881 (https://phabricator.wikimedia.org/T176243)
[08:47:00]
(03CR) 10Jcrespo: [C: 032] mariadb: Switchover s5 codfw master (db2023) to db2052 [puppet] - 10https://gerrit.wikimedia.org/r/392881 (https://phabricator.wikimedia.org/T176243) (owner: 10Jcrespo) [08:53:54] (03CR) 10Marostegui: [C: 032] Revert "db-eqiad.php: Depool db1072" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393029 (owner: 10Marostegui) [08:55:07] (03Merged) 10jenkins-bot: Revert "db-eqiad.php: Depool db1072" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393029 (owner: 10Marostegui) [08:56:02] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1072 (duration: 00m 44s) [08:56:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:56:55] (03CR) 10jenkins-bot: Revert "db-eqiad.php: Depool db1072" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393029 (owner: 10Marostegui) [09:01:25] (03PS12) 10Elukey: role::analytics_cluster::hadoop: move worker and masters to role/profiles [puppet] - 10https://gerrit.wikimedia.org/r/392658 (https://phabricator.wikimedia.org/T167790) [09:05:44] (03CR) 10Jcrespo: [C: 032] mariadb: Switchover codfw s5 master from db2023 to db2052 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392879 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [09:06:49] (03Merged) 10jenkins-bot: mariadb: Switchover codfw s5 master from db2023 to db2052 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392879 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [09:06:58] (03CR) 10jenkins-bot: mariadb: Switchover codfw s5 master from db2023 to db2052 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392879 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [09:07:36] (03PS1) 10Jcrespo: Update mariadb packages 10.1 to .28 and 10.0 to .33 [software] - 10https://gerrit.wikimedia.org/r/393033 [09:11:53] (03PS1) 10Jcrespo: dbhosts: Update s5 hosts and introduce s8 hosts [software] - 10https://gerrit.wikimedia.org/r/393035 
(https://phabricator.wikimedia.org/T177208) [09:12:13] (03CR) 10Jcrespo: [C: 032] Update mariadb packages 10.1 to .28 and 10.0 to .33 [software] - 10https://gerrit.wikimedia.org/r/393033 (owner: 10Jcrespo) [09:13:28] (03CR) 10Jcrespo: "Ports, socket, name updates etc. will follow." [software] - 10https://gerrit.wikimedia.org/r/393035 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [09:13:38] PROBLEM - MariaDB Slave SQL: s5 on dbstore1002 is CRITICAL: CRITICAL slave_sql_state Slave_SQL_Running: No, Errno: 1032, Errmsg: Could not execute Delete_rows_v1 event on table dewiki.pagelinks: Cant find record in pagelinks, Error_code: 1032: handler error HA_ERR_KEY_NOT_FOUND: the events master log db1070-bin.001499, end_log_pos 292651069 [09:14:13] ^ marostegui I will take care of that [09:14:41] Thanks :) [09:16:16] !log jynus@tin Synchronized wmf-config/db-codfw.php: mariadb: Switchover s5 codfw master (duration: 00m 45s) [09:16:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:17:18] (03CR) 10Marostegui: [C: 031] dbhosts: Update s5 hosts and introduce s8 hosts [software] - 10https://gerrit.wikimedia.org/r/393035 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [09:21:18] (03CR) 10Jcrespo: [C: 032] dbhosts: Update s5 hosts and introduce s8 hosts [software] - 10https://gerrit.wikimedia.org/r/393035 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [09:23:35] (03CR) 10Elukey: [C: 032] role::analytics_cluster::hadoop: move worker and masters to role/profiles [puppet] - 10https://gerrit.wikimedia.org/r/392658 (https://phabricator.wikimedia.org/T167790) (owner: 10Elukey) [09:25:18] PROBLEM - MariaDB Slave Lag: s5 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 851.41 seconds [09:27:35] (03Abandoned) 10Mobrovac: Revert "[Beta Labs] Use only EventBus for job processing." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/386361 (owner: 10Mobrovac) [09:27:38] checking if dbstore1001 has the same issues [09:27:48] RECOVERY - MariaDB Slave SQL: s5 on dbstore1002 is OK: OK slave_sql_state Slave_SQL_Running: Yes [09:28:50] was it a missing row? [09:28:50] nope, evething ok on dbstore1001 [09:28:54] 3 missing rows [09:29:08] out of a large number with the same id [09:29:19] RECOVERY - MariaDB Slave Lag: s5 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 249.37 seconds [09:33:22] (03CR) 10Faidon Liambotis: kmod::blacklist: prevent manual install, update initramfs (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/392644 (owner: 10BBlack) [09:42:12] (03CR) 10Ema: kmod::blacklist: prevent manual install, update initramfs (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/392644 (owner: 10BBlack) [09:49:00] (03PS1) 10Elukey: role::analytics_cluster::hadoop::standby: restore namenode bacula backup [puppet] - 10https://gerrit.wikimedia.org/r/393041 (https://phabricator.wikimedia.org/T167790) [09:53:21] (03PS2) 10Elukey: role::analytics_cluster::hadoop::standby: restore namenode bacula backup [puppet] - 10https://gerrit.wikimedia.org/r/393041 (https://phabricator.wikimedia.org/T167790) [09:57:13] (03PS3) 10Elukey: role::analytics_cluster::hadoop::standby: restore namenode bacula backup [puppet] - 10https://gerrit.wikimedia.org/r/393041 (https://phabricator.wikimedia.org/T167790) [09:58:58] (03CR) 10Elukey: [C: 032] "https://puppet-compiler.wmflabs.org/compiler02/8949/analytics1002.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/393041 (https://phabricator.wikimedia.org/T167790) (owner: 10Elukey) [10:03:49] (03CR) 10Elukey: "Andrew: I merged the change anyway since I wanted to progress the Prometheus work, let's chat later on if the refinery profile should be m" [puppet] - 10https://gerrit.wikimedia.org/r/392658 (https://phabricator.wikimedia.org/T167790) (owner: 10Elukey) [10:07:08] !log rebooting sarin for 
update to 4.9.51 [10:07:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:49:05] Operations, ops-eqiad: Possible memory errors on ganeti1005 - https://phabricator.wikimedia.org/T181121#3783476 (akosiaris) p:Triage>Normal a:Cmjohnson Box has been emptied of VMs and is ready for a hardware check. Let's see what that will say [10:57:34] !log migrating instances off ganeti2008 / kernel reboot to 4.9.51 [10:57:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:06:58] !log migrating instances off ganeti2007 / kernel reboot to 4.9.51 [11:07:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:10:19] PROBLEM - OTRS SMTP on mendelevium is CRITICAL: connect to address 10.64.32.174 and port 25: Connection refused [11:10:58] PROBLEM - Check systemd state on mendelevium is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [11:14:19] RECOVERY - OTRS SMTP on mendelevium is OK: SMTP OK - 0.016 sec. response time [11:14:58] RECOVERY - Check systemd state on mendelevium is OK: OK - running: The system is fully operational [11:15:06] !log migrating instances off ganeti2006 / kernel reboot to 4.9.51 [11:15:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:17:42] Operations, Analytics-Kanban, hardware-requests: eqiad: (2) hardware request for jupyter notebook refresh (SWAP) - https://phabricator.wikimedia.org/T175603#3783540 (faidon) [11:20:21] Operations, OTRS, Patch-For-Review, Security: Upgrade OTRS to 5.0.24 - https://phabricator.wikimedia.org/T181127#3783543 (akosiaris) Open>Resolved Upgrade completed successfully. We are currently running 5.0.24 with version 1.0.11 of WikimediaTemplates package.
[11:23:09] !log migrating instances off ganeti2005 / kernel reboot to 4.9.51 [11:23:09] 10Operations, 10Wikimedia-Planet, 10Patch-For-Review: Enable http/2 for planet apache - https://phabricator.wikimedia.org/T181202#3782900 (10faidon) Planet is behind misc-web, right? If so, that's a fairly pointless task, unless I'm missing something. Even if HTTP/2 made a difference (doubtful) for the inter... [11:23:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:29:31] 10Operations, 10Wikimedia-Planet, 10Patch-For-Review: Enable http/2 for planet apache - https://phabricator.wikimedia.org/T181202#3782900 (10MoritzMuehlenhoff) Also, over the last 1.5 years there have been four HTTP/2 specific Apache vulnerabilities ranging from DoS to potential code execution via use-after-... [11:34:10] !log kartik@tin Started deploy [cxserver/deploy@51b78ce]: T181209 Update cxserver to e8fe3f0 to fix Youdao MT [11:34:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:34:17] T181209: Content Translation: the Youdao Machine Translation does not work. 
- https://phabricator.wikimedia.org/T181209 [11:35:06] !log migrating instances off ganeti200 / kernel reboot to 4.9.51 [11:35:10] !log migrating instances off ganeti2004 / kernel reboot to 4.9.51 [11:35:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:35:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:37:11] !log kartik@tin Finished deploy [cxserver/deploy@51b78ce]: T181209 Update cxserver to e8fe3f0 to fix Youdao MT (duration: 03m 01s) [11:37:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:41:04] (03PS2) 10Reedy: GWToolset migratory config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392218 (https://phabricator.wikimedia.org/T87928) [11:45:54] jouncebot: next [11:45:55] In 95 hour(s) and 14 minute(s): Wikimedia Portals Update (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20171127T1100) [11:48:39] pfft [11:48:47] * Reedy laughs at "no deploys" [11:52:29] !log migrating instances off ganeti2003 / kernel reboot to 4.9.51 [11:52:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:53:04] 10Operations, 10Developer-Relations, 10cloud-services-team (Kanban): Create discourse-mediawiki.wmflabs.org (pilot instance) - https://phabricator.wikimedia.org/T180854#3783598 (10Qgil) >>! In T180854#3778284, @bd808 wrote: > I would also recommend that the deployment be managed with Puppet (and possibly sca... [11:56:18] PROBLEM - NTP on sca2004 is CRITICAL: NTP CRITICAL: Offset unknown [12:07:43] akosiaris: ^ ? 
[12:07:53] oh might be related to moritzm's restart [12:08:35] mobrovac: yeah it's the migrations [12:08:49] should manage to coalesce soon [12:09:06] eer converge, not coalesce [12:09:52] yeah, it's one of the few trusty hosts on there and ntpd can loose clock sync during migration to a different code [12:09:55] node [12:10:25] !log migrating instances off ganeti2002 / kernel reboot to 4.9.51 [12:10:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:26:18] RECOVERY - NTP on sca2004 is OK: NTP OK: Offset -0.07227835059 secs [12:35:56] (03CR) 10Reedy: "I'd love to know wtf phpcs is actually doing" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [12:40:37] !log migrating instances off ganeti2001 / kernel reboot to 4.9.51 [12:40:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:43:53] (03CR) 10Reedy: [C: 032] GWToolset migratory config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392218 (https://phabricator.wikimedia.org/T87928) (owner: 10Reedy) [12:45:05] (03Merged) 10jenkins-bot: GWToolset migratory config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392218 (https://phabricator.wikimedia.org/T87928) (owner: 10Reedy) [12:46:52] (03CR) 10jenkins-bot: GWToolset migratory config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392218 (https://phabricator.wikimedia.org/T87928) (owner: 10Reedy) [12:47:31] !log reedy@tin Synchronized wmf-config/CommonSettings.php: GWToolset migratory config T87928 (duration: 00m 46s) [12:47:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:47:38] T87928: Convert GWToolset to use extension registration - https://phabricator.wikimedia.org/T87928 [13:13:03] (03PS1) 10Paladox: Revert "planet: Add support for http/2 on stretch" [puppet] - 10https://gerrit.wikimedia.org/r/393064 [13:13:27] (03PS2) 10Paladox: Revert "planet: Add support for http/2 on stretch" [puppet] - 
10https://gerrit.wikimedia.org/r/393064 [13:13:29] (03CR) 10jerkins-bot: [V: 04-1] Revert "planet: Add support for http/2 on stretch" [puppet] - 10https://gerrit.wikimedia.org/r/393064 (owner: 10Paladox) [13:13:33] !log sutting down db2023 and db2038 for cloning [13:13:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:13:42] more like shutting [13:13:49] (03CR) 10jerkins-bot: [V: 04-1] Revert "planet: Add support for http/2 on stretch" [puppet] - 10https://gerrit.wikimedia.org/r/393064 (owner: 10Paladox) [13:15:06] (03PS3) 10Paladox: Revert "planet: Add support for http/2 on stretch" [puppet] - 10https://gerrit.wikimedia.org/r/393064 [13:22:37] (03PS1) 10Jcrespo: [WIP]mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [13:23:02] (03CR) 10Jcrespo: "Needs more pending changes" [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [13:32:20] (03PS1) 10Mark Bergsma: Support multiple BGP peerings [debs/pybal] - 10https://gerrit.wikimedia.org/r/393066 (https://phabricator.wikimedia.org/T180069) [13:33:58] ema: ^ [13:41:22] (03PS5) 10ArielGlenn: rsync all dumps status files to web servers and unpack them periodically [puppet] - 10https://gerrit.wikimedia.org/r/392875 (https://phabricator.wikimedia.org/T179857) [13:58:20] (03PS2) 10Elukey: role:prometheus::analytics: add druid_exporter targets [puppet] - 10https://gerrit.wikimedia.org/r/392841 (https://phabricator.wikimedia.org/T177459) [13:58:53] (03CR) 10Elukey: [C: 032] role:prometheus::analytics: add druid_exporter targets [puppet] - 10https://gerrit.wikimedia.org/r/392841 (https://phabricator.wikimedia.org/T177459) (owner: 10Elukey) [14:01:10] (03PS1) 10Marostegui: db-eqiad.php: Pool db1053 as vslow,dump in s2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393069 (https://phabricator.wikimedia.org/T134476) [14:03:23] (03CR) 
10Marostegui: [C: 032] db-eqiad.php: Pool db1053 as vslow,dump in s2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393069 (https://phabricator.wikimedia.org/T134476) (owner: 10Marostegui) [14:04:28] (03Merged) 10jenkins-bot: db-eqiad.php: Pool db1053 as vslow,dump in s2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393069 (https://phabricator.wikimedia.org/T134476) (owner: 10Marostegui) [14:05:24] (03PS1) 10Marostegui: db-eqiad.php: Weight 0 for db1053 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393070 [14:07:02] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Weight 0 for db1053 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393070 (owner: 10Marostegui) [14:07:03] (03CR) 10jenkins-bot: db-eqiad.php: Pool db1053 as vslow,dump in s2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393069 (https://phabricator.wikimedia.org/T134476) (owner: 10Marostegui) [14:08:00] (03Merged) 10jenkins-bot: db-eqiad.php: Weight 0 for db1053 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393070 (owner: 10Marostegui) [14:09:14] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Pool db1053 in s2 as vslow to replace db1021 - T134476 (duration: 00m 45s) [14:09:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:09:21] T134476: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476 [14:10:12] (03CR) 10jenkins-bot: db-eqiad.php: Weight 0 for db1053 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393070 (owner: 10Marostegui) [14:10:19] !log Stop Mysql on db1101.s5 to clone db1097 - T178359 [14:10:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:10:26] T178359: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359 [14:34:38] (03PS2) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [14:36:11] (03PS3) 10Jcrespo: 
mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [14:36:31] (03CR) 10Jcrespo: "I will run puppet compiler to check it is noop except db2038" [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [14:40:41] (03PS4) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [14:43:52] (03CR) 10Ema: Support multiple BGP peerings (032 comments) [debs/pybal] - 10https://gerrit.wikimedia.org/r/393066 (https://phabricator.wikimedia.org/T180069) (owner: 10Mark Bergsma) [14:47:49] i don't understand that first comment [14:47:56] it IS split over multiple lines isn't it? [14:48:19] (03PS1) 10Jon Harald Søby: Update logo for Wikimedia Norge's chapter wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393075 (https://phabricator.wikimedia.org/T181241) [14:48:21] oh you mean multiple statements I guess [14:50:38] mark: yes! [14:50:44] :) [14:50:46] :) [15:07:39] PROBLEM - Check systemd state on bohrium is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [15:11:34] working on it --^ [15:13:45] (03PS3) 10Muehlenhoff: deployment servers: Switch to component/git [puppet] - 10https://gerrit.wikimedia.org/r/392447 [15:13:59] PROBLEM - puppet last run on es2017 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. 
Failed resources (up to 3 shown): Package[initramfs-tools] [15:14:19] (03PS5) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [15:20:39] RECOVERY - Check systemd state on bohrium is OK: OK - running: The system is fully operational [15:23:27] !log rebooting mwdebug* servers for update to 4.9.51 [15:23:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:25:05] (03PS6) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [15:25:19] (03CR) 10Jcrespo: "https://puppet-compiler.wmflabs.org/compiler02/8953/" [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [15:26:26] (03PS7) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [15:33:29] (03CR) 10Marostegui: [C: 031] mariadb: Move hosts to s8 replica set on codfw (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [15:34:21] (03PS2) 10Mark Bergsma: Support multiple BGP peerings [debs/pybal] - 10https://gerrit.wikimedia.org/r/393066 (https://phabricator.wikimedia.org/T180069) [15:35:03] (03CR) 10jerkins-bot: [V: 04-1] Support multiple BGP peerings [debs/pybal] - 10https://gerrit.wikimedia.org/r/393066 (https://phabricator.wikimedia.org/T180069) (owner: 10Mark Bergsma) [15:35:18] (03CR) 10Muehlenhoff: [C: 032] deployment servers: Switch to component/git [puppet] - 10https://gerrit.wikimedia.org/r/392447 (owner: 10Muehlenhoff) [15:35:33] doh [15:37:31] (03PS6) 10ArielGlenn: rsync all dumps status files to web servers and unpack them periodically [puppet] - 10https://gerrit.wikimedia.org/r/392875 (https://phabricator.wikimedia.org/T179857) [15:37:55] (03PS3) 
10Mark Bergsma: Support multiple BGP peerings [debs/pybal] - 10https://gerrit.wikimedia.org/r/393066 (https://phabricator.wikimedia.org/T180069) [15:39:27] (03CR) 10Mark Bergsma: Support multiple BGP peerings (031 comment) [debs/pybal] - 10https://gerrit.wikimedia.org/r/393066 (https://phabricator.wikimedia.org/T180069) (owner: 10Mark Bergsma) [15:40:30] (03PS8) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [15:42:19] (03CR) 10Marostegui: [C: 031] mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [15:43:59] RECOVERY - puppet last run on es2017 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [15:45:04] (03PS1) 10Muehlenhoff: Stop using experimental component on deployment servers [puppet] - 10https://gerrit.wikimedia.org/r/393079 [15:47:22] (03CR) 10Muehlenhoff: [C: 032] Stop using experimental component on deployment servers [puppet] - 10https://gerrit.wikimedia.org/r/393079 (owner: 10Muehlenhoff) [15:50:36] !log disable puppet on db2023, db2038 for deployment whle transfer is ongoing [15:50:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:50:47] (03PS9) 10Jcrespo: mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) [15:51:00] (03CR) 10Jcrespo: [C: 032] mariadb: Move hosts to s8 replica set on codfw [puppet] - 10https://gerrit.wikimedia.org/r/393065 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [15:54:01] (03PS7) 10ArielGlenn: rsync all dumps status files to web servers and unpack them periodically [puppet] - 10https://gerrit.wikimedia.org/r/392875 (https://phabricator.wikimedia.org/T179857) [15:56:03] (03PS1) 10Muehlenhoff: Remove experimental component from contintcloud [puppet] - 
10https://gerrit.wikimedia.org/r/393081 [15:57:26] (03PS1) 10Muehlenhoff: Remove experimental component from pybal-test and multatuli [puppet] - 10https://gerrit.wikimedia.org/r/393082 [16:02:51] (03PS1) 10Jcrespo: mariadb: Move db1071 to s8 [puppet] - 10https://gerrit.wikimedia.org/r/393086 (https://phabricator.wikimedia.org/T177208) [16:03:02] (03PS2) 10Jcrespo: mariadb: Move db1071 to s8 [puppet] - 10https://gerrit.wikimedia.org/r/393086 (https://phabricator.wikimedia.org/T177208) [16:03:04] (03CR) 10Marostegui: [C: 031] mariadb: Move db1071 to s8 [puppet] - 10https://gerrit.wikimedia.org/r/393086 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [16:06:06] (03CR) 10Jcrespo: [C: 032] mariadb: Move db1071 to s8 [puppet] - 10https://gerrit.wikimedia.org/r/393086 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [16:16:07] (03CR) 10Ema: [C: 031] Remove experimental component from pybal-test and multatuli [puppet] - 10https://gerrit.wikimedia.org/r/393082 (owner: 10Muehlenhoff) [16:22:04] PROBLEM - Check systemd state on bohrium is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [16:22:54] this is known, piwik's db got corrupted after the vm migration and Manuel is recovering the last backup (wikilove to bacula) [16:23:04] RECOVERY - Check systemd state on bohrium is OK: OK - running: The system is fully operational [16:24:38] 10Operations, 10Release Pipeline, 10Release-Engineering-Team (Watching / External): Update Debian package for Blubber - https://phabricator.wikimedia.org/T179984#3784256 (10akosiaris) >>! In T179984#3782184, @thcipriani wrote: >>>! In T179984#3781320, @akosiaris wrote: >> Which means it complains about not f... 
[16:24:46] (03PS1) 10Jcrespo: prometheus-mysqld-exporter: Introduce s8 replica set on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/393094 (https://phabricator.wikimedia.org/T177208) [16:25:13] PROBLEM - Host actinium is DOWN: PING CRITICAL - Packet loss = 100% [16:25:23] PROBLEM - Host bohrium is DOWN: PING CRITICAL - Packet loss = 100% [16:25:23] PROBLEM - Host dysprosium is DOWN: PING CRITICAL - Packet loss = 100% [16:25:48] * elukey cries in a corner [16:25:54] PROBLEM - Host dubnium is DOWN: PING CRITICAL - Packet loss = 100% [16:26:03] PROBLEM - Host etcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:04] PROBLEM - Host kubestagetcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:04] PROBLEM - Host sca1004 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:04] PROBLEM - Host etcd1003 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:04] PROBLEM - Host releases1001 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:04] PROBLEM - Host chlorine is DOWN: PING CRITICAL - Packet loss = 100% [16:26:04] PROBLEM - Host neon is DOWN: PING CRITICAL - Packet loss = 100% [16:26:06] ? [16:26:13] PROBLEM - Host etcd1005 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:13] PROBLEM - Host logstash1008 is DOWN: PING CRITICAL - Packet loss = 100% [16:26:23] * akosiaris looking [16:26:48] ganeti1006 [16:26:51] oh [16:26:57] is it fatal? 
[16:27:03] RECOVERY - Host etcd1003 is UP: PING WARNING - Packet loss = 0%, RTA = 1846.76 ms [16:27:05] RECOVERY - Host actinium is UP: PING WARNING - Packet loss = 0%, RTA = 1866.04 ms [16:27:05] RECOVERY - Host bohrium is UP: PING WARNING - Packet loss = 0%, RTA = 1846.49 ms [16:27:05] RECOVERY - Host etcd1002 is UP: PING WARNING - Packet loss = 0%, RTA = 1853.62 ms [16:27:05] RECOVERY - Host dysprosium is UP: PING WARNING - Packet loss = 0%, RTA = 1872.48 ms [16:27:05] RECOVERY - Host dubnium is UP: PING WARNING - Packet loss = 0%, RTA = 1940.21 ms [16:27:05] RECOVERY - Host kubestagetcd1002 is UP: PING WARNING - Packet loss = 0%, RTA = 1953.70 ms [16:27:11] ok that's weird [16:27:13] PROBLEM - citoid endpoints health on scb1004 is CRITICAL: /api (Zotero alive) timed out before a response was received [16:27:13] only network, then? [16:27:16] I did not do anything [16:27:23] https://phabricator.wikimedia.org/P6373 [16:27:24] (because it recovered so quickly) [16:27:32] or an io wait, etc. [16:27:49] well, cpu blockage [16:28:22] idrac console is not working :-( [16:28:26] should we preventively fail over?
[16:28:45] https://grafana.wikimedia.org/dashboard/file/server-board.json?var-server=ganeti1006&refresh=1m&orgId=1&from=now-15m&to=now [16:28:54] PROBLEM - Host kubestagetcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [16:28:54] PROBLEM - Host etcd1003 is DOWN: PING CRITICAL - Packet loss = 100% [16:29:02] yeah, starting migrations [16:29:04] PROBLEM - Host actinium is DOWN: PING CRITICAL - Packet loss = 44%, RTA = 6288.02 ms [16:29:13] RECOVERY - Host sca1004 is UP: PING OK - Packet loss = 0%, RTA = 2.72 ms [16:29:13] RECOVERY - Host logstash1008 is UP: PING OK - Packet loss = 0%, RTA = 2.87 ms [16:29:13] RECOVERY - Host releases1001 is UP: PING OK - Packet loss = 0%, RTA = 2.52 ms [16:29:13] RECOVERY - Host neon is UP: PING OK - Packet loss = 0%, RTA = 2.50 ms [16:29:33] PROBLEM - Host dubnium is DOWN: PING CRITICAL - Packet loss = 100% [16:29:33] PROBLEM - citoid endpoints health on scb1001 is CRITICAL: /api (Zotero alive) timed out before a response was received [16:29:33] it is going up and down [16:29:53] PROBLEM - Misc HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3fullscreenorgId=1var-site=Allvar-cache_type=miscvar-status_type=5 [16:30:03] PROBLEM - SSH on ganeti1006 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:30:12] I was doing a gzip -d on bohrium [16:30:16] Just FYI [16:30:24] PROBLEM - Host bohrium is DOWN: PING CRITICAL - Packet loss = 100% [16:30:34] a gzip -d shouldn't kill a host and all vms [16:30:45] !log gnt-node failover ganeti1006 [16:30:52] I know, just saying that it looks like a coincidence… [16:30:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:30:54] PROBLEM - Host etcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [16:30:55] the 50x are piwik's almost for sure [16:30:58] hopefully not related [16:31:04] RECOVERY - citoid 
endpoints health on scb1004 is OK: All endpoints are healthy [16:31:05] damn... just yesterday ganeti1005 had problems... 1006 today ? [16:31:24] PROBLEM - Host sca1004 is DOWN: PING CRITICAL - Packet loss = 100% [16:31:24] PROBLEM - Host releases1001 is DOWN: PING CRITICAL - Packet loss = 100% [16:31:24] PROBLEM - Host neon is DOWN: PING CRITICAL - Packet loss = 100% [16:31:24] RECOVERY - citoid endpoints health on scb1001 is OK: All endpoints are healthy [16:31:27] let us know if you need help with somthing else [16:31:33] PROBLEM - Host logstash1008 is DOWN: PING CRITICAL - Packet loss = 100% [16:31:54] PROBLEM - Host dysprosium is DOWN: PING CRITICAL - Packet loss = 100% [16:32:52] console is working but no login... stalls forever [16:32:55] oh, it is not high cpu [16:32:59] it is high system [16:33:15] I wonder if it really was gzip -d [16:33:30] akosiaris: I got into ganeti [16:33:33] RECOVERY - Host kubestagetcd1002 is UP: PING OK - Packet loss = 0%, RTA = 1.36 ms [16:33:33] RECOVERY - Host releases1001 is UP: PING OK - Packet loss = 0%, RTA = 1.44 ms [16:33:33] RECOVERY - Host dysprosium is UP: PING OK - Packet loss = 0%, RTA = 1.41 ms [16:33:33] RECOVERY - Host bohrium is UP: PING OK - Packet loss = 0%, RTA = 1.39 ms [16:33:33] RECOVERY - Host actinium is UP: PING OK - Packet loss = 0%, RTA = 1.45 ms [16:33:33] RECOVERY - Host etcd1003 is UP: PING OK - Packet loss = 0%, RTA = 1.22 ms [16:33:33] RECOVERY - Host etcd1005 is UP: PING OK - Packet loss = 0%, RTA = 0.83 ms [16:33:34] RECOVERY - Host dubnium is UP: PING OK - Packet loss = 0%, RTA = 0.88 ms [16:33:34] RECOVERY - Host etcd1002 is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms [16:33:35] RECOVERY - Host chlorine is UP: PING OK - Packet loss = 0%, RTA = 1.02 ms [16:33:37] probably you could now, too [16:33:43] RECOVERY - Host neon is UP: PING OK - Packet loss = 0%, RTA = 2.37 ms [16:33:43] RECOVERY - Host sca1004 is UP: PING OK - Packet loss = 0%, RTA = 2.58 ms [16:33:47] damn [16:33:52] exact same 
error as ganeti1005 [16:33:54] RECOVERY - SSH on ganeti1006 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0) [16:34:14] how many etcd instances were on the failed host? [16:34:14] [13333511.538975] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set [16:34:14] [13333511.538975] bad because of flags: 0x14(referenced|dirty) [16:34:15] [13333511.539015] BUG: Bad page state in process in:imklog pfn:525357 [16:34:15] [13333511.539016] page:ffffea865494d5c0 count:0 mapcount:0 mapping: (null) index:0x1 [16:34:18] not good ... [16:34:37] it's fine now btw [16:34:40] maybe a full restart? [16:34:49] I mean, not now [16:35:07] https://phabricator.wikimedia.org/T181121 [16:35:10] actually now, I got that on the console [16:35:16] exact same thing [16:35:18] volans: those are for role(etcd::networking) [16:35:43] RECOVERY - Host logstash1008 is UP: PING OK - Packet loss = 0%, RTA = 1.99 ms [16:35:44] is there enough hosts to failover? [16:36:13] yeah but I am worried. Those 2 boxes are of the same batch [16:36:22] elukey: 1002/3 are kubernetes AFAICT [16:36:22] and the other 2 are the same batch as well [16:36:32] yeah ignore the etcds [16:36:33] it's fine [16:36:54] even if they crapped their pants, it's of no consequence for now [16:37:02] super [16:37:09] ok, standing by [16:37:21] (me, not the machine :-)) [16:37:21] agree, just to ensure going forward that they are more spread across multiple physical hosts;) [16:37:38] actually they were [16:37:44] but https://phabricator.wikimedia.org/T181121 [16:37:59] not sure what to trust right now [16:38:16] is any of the 4 boxes trustworthy ? 
[16:38:18] looking at logs [16:38:20] yeah, I'm wondering if we should have some monitoring for this, but don't want to bother you now [16:38:23] Operations, ops-eqiad: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121#3784279 (jcrespo) [16:39:48] Operations, ops-eqiad: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121#3780358 (jcrespo) Almost the same issue happened on 1006 at 4:28pm today: https://grafana.wikimedia.org/dashboard/file/server-board.json?var-server=ganeti1006&refresh=1m&orgId=1&from=151... [16:40:30] SEL has nothing in it [16:41:27] to be fair, the one machine showing large cpu was bohrium [16:42:02] obviously I am not blaming gzip, but maybe that machine has something that makes the host go wild [16:42:10] raclog is empty too [16:42:18] ok, we are starting to look at a kernel bug then ? [16:42:18] s/machine/vm/ [16:42:44] kernel or bad luck with hardware? [16:43:03] 2 different boxes exhibiting the exact same hardware issue 2 days apart ?
[16:43:08] we have a new kernel to roll out, don't want to add another variable to the mix, but maybe worth trying on one host [16:43:32] it might very well indeed be bad luck (or bad batch) [16:43:38] but somehow I am not betting my money on it [16:43:48] in that case I'd choose one that has not yet failed [16:45:54] RECOVERY - Misc HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3fullscreenorgId=1var-site=Allvar-cache_type=miscvar-status_type=5 [16:46:10] it's not the RAID either [16:46:21] just double-checked [16:46:23] /usr/local/lib/nagios/plugins/check_raid megacli [16:46:23] OK: optimal, 1 logical, 4 physical [16:46:23] OK [16:46:38] I see some vague references to an ext4 memory bug [16:46:42] on kernel 4.4 [16:46:52] but this is 4.9 [16:47:29] (PS1) Mark Bergsma: [WiP] Support per-service-IP BGP MED values [debs/pybal] - https://gerrit.wikimedia.org/r/393097 (https://phabricator.wikimedia.org/T165764) [16:48:22] (CR) jerkins-bot: [V: -1] [WiP] Support per-service-IP BGP MED values [debs/pybal] - https://gerrit.wikimedia.org/r/393097 (https://phabricator.wikimedia.org/T165764) (owner: Mark Bergsma) [16:49:41] note how bohrium is the only guest increasing resources https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&from=now-30m&to=now&var-server=bohrium&var-network=eth0 [16:49:55] the others just stall with not too much increase in usage [16:50:30] maybe anecdotal [16:53:10] marostegui: could you please repeat whatever it is that you were doing on bohrium ? 
[16:53:46] akosiaris: he is afk atm but I can explain [16:53:55] back [16:53:59] let me do it again [16:54:00] XD [16:54:00] hahaha [16:54:30] started [16:55:20] (03PS2) 10Jcrespo: prometheus-mysqld-exporter: Introduce s8 replica set on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/393094 (https://phabricator.wikimedia.org/T177208) [16:56:23] huh, I think I just lost bohrium [16:56:23] PROBLEM - Host bohrium is DOWN: PING CRITICAL - Packet loss = 100% [16:56:27] ah there we go [16:56:28] oh my [16:56:30] for that love of ... [16:56:36] so it is the gzip.. [16:56:37] marostegui: it's all your fault!!! [16:56:55] I have been facing the similar issue. I could reproduce this when running "rsync" with large number of files (between USB drive and local filesystem). Kernel version 3.12.9-301.fc20.x86_64. [16:56:55] PROBLEM - Host etcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [16:56:58] https://bugzilla.redhat.com/show_bug.cgi?id=1048442 [16:57:01] maybe ? [16:57:03] RECOVERY - Host bohrium is UP: PING WARNING - Packet loss = 58%, RTA = 3.48 ms [16:57:11] I see closed not a bug [16:57:13] RECOVERY - Host etcd1002 is UP: PING OK - Packet loss = 0%, RTA = 3.91 ms [16:57:41] (03PS3) 10Jcrespo: prometheus-mysqld-exporter: Introduce s8 replica set on prometheus [puppet] - 10https://gerrit.wikimedia.org/r/393094 (https://phabricator.wikimedia.org/T177208) [16:57:43] jesus... [16:58:02] But this time it doesn't look as bad (yet) [16:58:18] Installed the memory test in grub using memtest-setup and upon running it showed many memory errors. Had to replace the memory with compatible to that of the motherboard. Running rsync after that went find without any issues. It turns out to be incompatible memory module. [16:58:24] damn... if it is that [16:58:54] but that would mean hw issues on two hosts right? 
[16:59:03] now i lost connection again [16:59:21] (03PS2) 10Mark Bergsma: [WiP] Support per-service-IP BGP MED values [debs/pybal] - 10https://gerrit.wikimedia.org/r/393097 (https://phabricator.wikimedia.org/T165764) [16:59:25] (03CR) 10Jcrespo: "Of course, this assumes we will have rebooted the codfw multi hosts." [puppet] - 10https://gerrit.wikimedia.org/r/393094 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [16:59:36] elukey: yeah, which only means bad batch of hardware [16:59:54] PROBLEM - Host ganeti1006 is DOWN: PING CRITICAL - Packet loss = 100% [16:59:58] Nov 23 16:57:55 ganeti1006 kernel: [13334986.089222] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 [17:00:03] ah this is even better this time around [17:00:07] at least it oopsed [17:00:23] PROBLEM - Host actinium is DOWN: PING CRITICAL - Packet loss = 100% [17:00:33] PROBLEM - Host bohrium is DOWN: PING CRITICAL - Packet loss = 100% [17:00:51] yeah this time box ain't coming back [17:00:54] PROBLEM - Host kubestagetcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [17:00:54] PROBLEM - Host logstash1008 is DOWN: PING CRITICAL - Packet loss = 100% [17:00:54] PROBLEM - Host etcd1003 is DOWN: PING CRITICAL - Packet loss = 100% [17:00:54] PROBLEM - Host dubnium is DOWN: PING CRITICAL - Packet loss = 100% [17:00:54] PROBLEM - Host etcd1002 is DOWN: PING CRITICAL - Packet loss = 100% [17:00:54] PROBLEM - Host etcd1005 is DOWN: PING CRITICAL - Packet loss = 100% [17:00:54] PROBLEM - Host neon is DOWN: PING CRITICAL - Packet loss = 100% [17:00:55] PROBLEM - Host releases1001 is DOWN: PING CRITICAL - Packet loss = 100% [17:00:58] forcing reboot [17:01:03] PROBLEM - Host chlorine is DOWN: PING CRITICAL - Packet loss = 100% [17:01:03] PROBLEM - Host dysprosium is DOWN: PING CRITICAL - Packet loss = 100% [17:01:04] PROBLEM - Host sca1004 is DOWN: PING CRITICAL - Packet loss = 100% [17:01:31] !log force powercycle on ganeti1006 T181121 [17:01:32] given that seems that we 
have a repro, I would give the kernel upgrade a shot akosiaris, thoughts? [17:01:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:01:38] T181121: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121 [17:02:13] volans: yeah let's see if indeed we have a reproduction, cause this time around the effect was way worse [17:02:34] is swap configured on those hosts on a regular file? [17:02:50] the vms, I mean [17:03:02] (CR) Marostegui: [C: 1] prometheus-mysqld-exporter: Introduce s8 replica set on prometheus [puppet] - https://gerrit.wikimedia.org/r/393094 (https://phabricator.wikimedia.org/T177208) (owner: Jcrespo) [17:03:02] https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=bohrium&var-network=eth0&from=now-1h&to=now&panelId=17&fullscreen [17:03:40] otherwise I would not see where 60GB come from [17:03:43] RECOVERY - Host ganeti1006 is UP: PING OK - Packet loss = 0%, RTA = 1.00 ms [17:04:19] yes it sees [17:04:24] it is* [17:04:55] I think bohrium is the trigger, obviously, that doesn't help too much [17:05:33] RECOVERY - Host logstash1008 is UP: PING OK - Packet loss = 0%, RTA = 4.86 ms [17:05:35] but maybe it could be isolated? [17:05:40] the watcher is restarting all VMs [17:05:53] RECOVERY - Host dysprosium is UP: PING OK - Packet loss = 0%, RTA = 8.38 ms [17:05:53] RECOVERY - Host etcd1005 is UP: PING OK - Packet loss = 0%, RTA = 9.74 ms [17:05:53] RECOVERY - Host chlorine is UP: PING OK - Packet loss = 0%, RTA = 9.17 ms [17:05:53] RECOVERY - Host neon is UP: PING OK - Packet loss = 0%, RTA = 9.54 ms [17:05:53] RECOVERY - Host releases1001 is UP: PING OK - Packet loss = 0%, RTA = 9.51 ms [17:06:02] marostegui: can I have that command you used ? 
[17:06:03] RECOVERY - Host dubnium is UP: PING OK - Packet loss = 0%, RTA = 8.55 ms [17:06:03] RECOVERY - Host etcd1003 is UP: PING OK - Packet loss = 0%, RTA = 8.77 ms [17:06:04] RECOVERY - Host kubestagetcd1002 is UP: PING OK - Packet loss = 0%, RTA = 8.44 ms [17:06:04] RECOVERY - Host etcd1002 is UP: PING OK - Packet loss = 0%, RTA = 9.66 ms [17:06:04] PROBLEM - Misc HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3fullscreenorgId=1var-site=Allvar-cache_type=miscvar-status_type=5 [17:06:04] RECOVERY - Host bohrium is UP: PING WARNING - Packet loss = 37%, RTA = 8.58 ms [17:06:09] hoepfully that will give some stability [17:06:13] RECOVERY - Host sca1004 is UP: PING OK - Packet loss = 0%, RTA = 9.44 ms [17:06:13] RECOVERY - Host actinium is UP: PING WARNING - Packet loss = 66%, RTA = 10.11 ms [17:06:56] (03PS19) 10Zoranzoki21: Enable the ArticlePlaceholder for sewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387077 (https://phabricator.wikimedia.org/T179241) [17:07:52] (03PS8) 10ArielGlenn: rsync all dumps status files to web servers and unpack them periodically [puppet] - 10https://gerrit.wikimedia.org/r/392875 (https://phabricator.wikimedia.org/T179857) [17:07:53] to be fair, piwiki may not be the best piece of software, ever [17:08:02] 10Operations, 10ops-eqiad: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121#3784334 (10akosiaris) https://bugzilla.redhat.com/show_bug.cgi?id=1048442 points indeed to memory module error. That being said, for 2 boxes (of the same batch) to exhibit the exact same beha... 
[17:08:25] piwik is awful software but shouldn't have that impact [17:09:04] yeah, sure [17:09:21] maybe some swapping creates memory pressure, which starts a bug [17:09:44] maybe it is a red herring because of the memory bug [17:10:27] yeah the kernel null pointer dereference is a different action than before [17:10:38] (CR) Jcrespo: [C: 2] prometheus-mysqld-exporter: Introduce s8 replica set on prometheus [puppet] - https://gerrit.wikimedia.org/r/393094 (https://phabricator.wikimedia.org/T177208) (owner: Jcrespo) [17:13:42] so at the moment piwik is not trying to insert any data into the db (record statistics set off), it replies to all the POSTs with a 204 [17:13:56] and the db is empty waiting for the backup [17:15:42] so, I am proactively gonna drain ganeti1006 as well [17:15:49] except bohrium [17:16:00] I doubt it has anything to do with bohrium [17:16:09] I honestly do, but let's be on the safe side [17:17:17] * elukey nods [17:18:03] oh, I do not think it has anything to do with what bohrium does (its service) [17:18:04] RECOVERY - Misc HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3fullscreenorgId=1var-site=Allvar-cache_type=miscvar-status_type=5 [17:18:32] it probably just happens to affect that VM [17:18:56] as in, the host kernel processes handling that VM [17:19:16] yeah it was the trigger.. 
but I doubt it has anything to do with the actual problem [17:19:38] kernel processes sounds weird, but you get what I am trying to say [17:19:39] still separating a possible trigger is saner than having it alongside everything else [17:20:59] I'd say we wait for Chris to run the mem test on 1005, if that's in fact broken, it's likely we hit a bad batch, if not we can dig further [17:21:18] yeah agreed [17:21:24] but it's 4 days until then [17:21:39] well somewhat less but you get my point [17:23:33] Operations, ops-eqiad: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121#3784351 (akosiaris) p:Normal>High [17:25:03] 4.9.51 was deployed on bohrium on the 16th (as general note) [17:25:04] yeah, I see what you mean. I reckon we don't have smarthands for eqiad, right? [17:26:39] I am not sure. Haven't checked recently. Anyway I'll keep an eye on the other 2 boxes [17:26:50] almost everything is now on those 2 (1007, 1008) [17:28:49] Operations, ops-eqiad: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121#3784354 (akosiaris) After some IRC discussions, let's proceed with checking `ganeti1005` and see what we make of it. For now all VMs are on ganeti1007, ganeti1008 except bohrium. Let's hope... [17:29:30] !log set ganeti1006 as drained. ganeti1005 was already set. That will prevent scheduling VMs on those. T181121 [17:29:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:29:38] T181121: Possible memory errors on ganeti1005, ganeti1006 - https://phabricator.wikimedia.org/T181121 [17:30:37] elukey: I think bohrium is rather the "victim" here, ganeti started to oops at 16:24:33 and bohrium started to login hung tasks starting at 16:26:43 [17:30:51] (and those are logged after 120 seconds, so that adds up) [17:30:57] * elukey nods [17:31:18] (started to log) I meant [17:32:25] moritzm: the trigger is probably some memory-expensive operation. 
It just happened to be in the range bohrium was residing on and got triggered by a gzip -d on bohrium [17:32:58] but it could very well be triggered by a simple ls command [17:33:07] assuming we are talking about a faulty DIMM [17:33:20] yeah, I thought so [17:33:27] don't we have ECC on those boxes ? [17:33:28] and after reboot, it could be any other vm [17:33:53] akosiaris: did you check the hw log? [17:34:01] yeah nothing [17:34:03] normally it should complain there [17:34:09] both SEL and RACLOG [17:36:50] akosiaris: I don't think these have ECC RAM, dmidecode prints "Error Information Handle: Not provided" (and IIRC this means no ECC) [17:36:58] (PS3) Mark Bergsma: [WiP] Support per-service-IP BGP MED values [debs/pybal] - https://gerrit.wikimedia.org/r/393097 (https://phabricator.wikimedia.org/T165764) [17:37:07] ok then, that gives an extra hint... [17:40:05] hmm, although there are also other sources which claim that "Total Width: 72 bits" in dmidecode means ECC-capable, and that's the case on ganeti1005. [17:41:40] Physical Memory Array [17:41:40] Location: System Board Or Motherboard [17:41:40] Use: System Memory [17:41:40] Error Correction Type: Multi-bit ECC [17:41:59] I say we are ECC [17:42:00] :P [17:42:43] so what would be the plan? wait for Chris next week? [17:42:54] (trying to figure out the outage that piwik will take) [17:43:12] outage ? is piwik currently in an outage ? 
[17:44:25] I was thinking yes, waiting for Chris next week unless something way more urgent comes up [17:44:38] I've emptied both boxes from VMs, bohrium excluded [17:45:02] unless it's a bad batch and all boxes are hit by it (at which point we are practically doomed) [17:45:05] we should be ok [17:45:13] so when bohrium was moved mysql didn't like it a lot and one table got corrupted, so me and Manuel were trying to import the last backup (unzipping it with gzip -d :) [17:45:58] I don't think the corruption had anything to do with the move [17:45:58] at the moment mysql is emptied on bohrium, and data is not being inserted (maintenance mode for piwik) [17:46:13] more probably with ganeti1005 croaking yesterday ? [17:46:29] well, now it is the best time to reimport it, if it is isolated [17:46:32] ah yes I meant the ganeti1005 event :) [17:46:47] I agree with jynus.. Import it and let's see what happens [17:46:52] if it breaks again, we know why .:-) [17:47:00] we blame piwik! [17:47:09] no, we blame elukey! [17:47:14] :-D [17:47:17] of course! \o/ [17:47:19] ahahaha [17:48:06] akosiaris: ok so am I free to proceed with another gzip on bohrium now ? 
[17:48:21] yeah go ahead [17:48:31] I am monitoring the situation as well [17:48:36] (03PS4) 10Dzahn: Revert "planet: Add support for http/2 on stretch" [puppet] - 10https://gerrit.wikimedia.org/r/393064 (owner: 10Paladox) [17:49:06] started [17:51:29] done, nothing exploded afaics [17:51:37] yup [17:51:55] (03CR) 10Dzahn: [C: 032] Revert "planet: Add support for http/2 on stretch" [puppet] - 10https://gerrit.wikimedia.org/r/393064 (owner: 10Paladox) [17:52:11] this is more and more pointing towards a memory issue [17:52:13] well, I think it is swapping a lot [17:52:40] the vm [17:52:47] but nothing breaking [17:52:56] it's not swapping at all [17:53:03] Swap: 998396 0 998396 [17:53:25] https://grafana.wikimedia.org/dashboard/file/server-board.json?var-server=bohrium&refresh=1m&orgId=1&from=now-30m&to=now&panelId=17&fullscreen&var-network=eth0 [17:53:38] that's disk space [17:53:39] thn it is creating 20GB of temporary files [17:53:52] 10Operations, 10Wikimedia-Planet, 10Patch-For-Review: upgrade planet instances to stretch - https://phabricator.wikimedia.org/T168490#3784394 (10Dzahn) [17:53:59] ok, but temp files are not swap [17:54:00] 10Operations, 10Wikimedia-Planet, 10Patch-For-Review: Enable http/2 for planet apache - https://phabricator.wikimedia.org/T181202#3784392 (10Dzahn) 05Open>03declined >>! In T181202#3783548, @faidon wrote: > 008b62f4048 should probably be reverted, its effects manually reverted (`a2dismod http2`) and this... [17:54:58] elukey: I can move bohrium off of ganeti1006 if you want [17:55:29] just to be safe for the next few days [17:55:55] this is weird btw.. those memory dimms were manufactured to be problematic on thanksgiving ? [17:55:57] :P [17:56:01] ahhaha [17:56:08] our luck this quarter is quite great [17:56:18] xddddddd [17:57:07] akosiaris: not sure what it is best for bohrium frankly [17:57:52] well you kind of tested the entire memory already. 
With just 1 VM on that boxes, the only other thing that can trigger the memory issue is puppet [17:58:20] the entire memory of bohrium* [17:58:32] so the next step now would be to load a mysqldump and then re-enable piwik [17:58:40] go for it [17:58:43] super [17:58:46] great! [17:59:34] I do not know if db2023 will want to start [17:59:44] not that I care much now that it has been cloned away [18:01:58] the import is now being done, we'll see if we kill it [18:07:52] so? [18:08:11] it will take a few hours [18:08:14] so far so good [18:08:23] but only 1.5G written XD [18:08:27] nah, if it has been that [18:08:38] it was too fast from good to bad state [18:08:44] it had already happen [18:09:16] it was bohrium's mastermind plan to run alone on ganeti1006 [18:09:36] lol [18:10:27] (03PS2) 10Muehlenhoff: Remove experimental component from pybal-test and multatuli [puppet] - 10https://gerrit.wikimedia.org/r/393082 [18:11:06] ok then. I think I am gonna go then. Feel free to page me if all hell breaks loose [18:11:29] ack! Will stay a bit more and then go as well [18:11:33] I am going to call it a day too [18:11:41] thanks all for the help! [18:11:44] (03CR) 10Muehlenhoff: [C: 032] Remove experimental component from pybal-test and multatuli [puppet] - 10https://gerrit.wikimedia.org/r/393082 (owner: 10Muehlenhoff) [18:25:12] (03PS1) 10Jcrespo: mariadb: Move some (only the single-instance) s5 hosts to s8 [puppet] - 10https://gerrit.wikimedia.org/r/393102 (https://phabricator.wikimedia.org/T177208) [18:26:34] (03CR) 10Jcrespo: "It would also be nice to restart later db1092, too, so we get rid of the old socket location." 
[puppet] - 10https://gerrit.wikimedia.org/r/393102 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [18:41:23] (03PS4) 10Mark Bergsma: [WiP] Support per-service-IP BGP MED values [debs/pybal] - 10https://gerrit.wikimedia.org/r/393097 (https://phabricator.wikimedia.org/T165764) [18:50:04] (03PS1) 10Elukey: profile::piwik::my.cnf: enable innodb_file_per_table [puppet] - 10https://gerrit.wikimedia.org/r/393108 [18:50:34] (03CR) 10Elukey: [C: 032] profile::piwik::my.cnf: enable innodb_file_per_table [puppet] - 10https://gerrit.wikimedia.org/r/393108 (owner: 10Elukey) [18:50:59] this one was applied manually by Manuel before starting the import --^ [18:55:49] (03CR) 10Marostegui: [C: 031] mariadb: Move some (only the single-instance) s5 hosts to s8 (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/393102 (https://phabricator.wikimedia.org/T177208) (owner: 10Jcrespo) [19:03:36] 10Operations, 10DBA, 10MediaWiki-Configuration, 10Wikidata: Test moving testwikidatawiki database to s8 replica set on Wikimedia - https://phabricator.wikimedia.org/T180694#3784503 (10Marostegui) To give an update. Jaime successfully moved wikidatata to the s8 set of servers in codfw (passive DC). There is... [19:15:10] PROBLEM - Host mw2251 is DOWN: PING CRITICAL - Packet loss = 100% [19:28:05] 10Operations, 10DBA, 10MediaWiki-Configuration, 10Wikidata: Test moving testwikidatawiki database to s8 replica set on Wikimedia - https://phabricator.wikimedia.org/T180694#3784515 (10Addshore) >>! In T180694#3784503, @Marostegui wrote: > To give an update. > Jaime successfully moved wikidatata to the s8 s... [19:56:34] 10Operations, 10DBA, 10MediaWiki-Configuration, 10Wikidata: Test moving testwikidatawiki database to s8 replica set on Wikimedia - https://phabricator.wikimedia.org/T180694#3784552 (10Marostegui) wikidata: https://gerrit.wikimedia.org/r/#/c/391835/ Again, this was only for READS and has NO EFFECT on produc... 
[19:57:48] (03CR) 10Gehel: [C: 04-1] "Minor comment inline, otherwise LGTM" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/392691 (https://phabricator.wikimedia.org/T177254) (owner: 10Herron) [20:05:32] (03CR) 10Chad: "Likewise...." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [20:29:45] (03CR) 10Zoranzoki21: "https://gerrit.wikimedia.org/r/#/c/392184/" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/383944 (https://phabricator.wikimedia.org/T45956) (owner: 10Zoranzoki21) [20:41:46] (03CR) 10Chad: [C: 04-1] "WikimediaMaintenance, WikimediaMessages. I could've sworn it was more, but perhaps that's not the case anymore." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: 10TerraCodes) [20:43:04] (03PS8) 10TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) [20:44:36] (03CR) 10Chad: [C: 032] Update logo for Wikimedia Norge's chapter wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393075 (https://phabricator.wikimedia.org/T181241) (owner: 10Jon Harald Søby) [20:45:51] (03Merged) 10jenkins-bot: Update logo for Wikimedia Norge's chapter wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393075 (https://phabricator.wikimedia.org/T181241) (owner: 10Jon Harald Søby) [20:46:55] (03CR) 10jenkins-bot: Update logo for Wikimedia Norge's chapter wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393075 (https://phabricator.wikimedia.org/T181241) (owner: 10Jon Harald Søby) [20:49:15] !log demon@tin Synchronized static/images/project-logos/nowikimedia.png: logo update (duration: 02m 51s) [20:49:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:54:51] (03PS1) 10TerraCodes: Remove single editor tab for plwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393121 
(https://phabricator.wikimedia.org/T181045) [20:55:26] (03PS3) 10Chad: Move a variable closer to other relevant code [mediawiki-config] - 10https://gerrit.wikimedia.org/r/374653 (owner: 10MaxSem) [20:57:55] (03PS1) 10Chad: wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393122 [20:58:03] (03CR) 10Jayprakash12345: [C: 031] Remove single editor tab for plwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/393121 (https://phabricator.wikimedia.org/T181045) (owner: 10TerraCodes) [21:13:09] RECOVERY - Host mw2251 is UP: PING OK - Packet loss = 0%, RTA = 36.88 ms [21:14:59] (03PS15) 10Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 [21:15:29] !log rebooted mw2251 after unresponsive on mgmt console and no ping [21:15:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:22:23] 10Operations, 10ops-codfw: mw2251 problems - https://phabricator.wikimedia.org/T181263#3784641 (10ArielGlenn) [21:25:55] (03CR) 10jerkins-bot: [V: 04-1] Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [21:30:34] (03CR) 10Reedy: "Filed as https://phabricator.wikimedia.org/T181262 and upstream as https://github.com/squizlabs/PHP_CodeSniffer/issues/1758" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [21:31:31] (03CR) 10TerraCodes: "> WikimediaMaintenance, WikimediaMessages. I could've sworn it was" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: 10TerraCodes) [21:33:43] (03CR) 10Hashar: "How are you updating it? 
IIRC we have to do:" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [21:34:27] !log ariel@puppetmaster1001 conftool action : set/pooled=no; selector: name=mw2251.codfw.wmnet [21:34:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:34:37] (03CR) 10Reedy: "That shouldn't mean it gets stuck "Creating file list..."" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [21:35:38] 10Operations, 10ops-codfw: mw2251 problems - https://phabricator.wikimedia.org/T181263#3784650 (10ArielGlenn) I've depooled it for now. [21:35:58] (03CR) 10Reedy: "And composer.lock looks sane" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [21:36:31] (03CR) 10TerraCodes: "nvm my last comment. patches submitted as https://gerrit.wikimedia.org/r/#/c/393126/ and https://gerrit.wikimedia.org/r/393131" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: 10TerraCodes) [21:42:02] (03CR) 10MarcoAurelio: "This change was last updated 11 months ago. Is there any interest in doing this? :)" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/309066 (https://phabricator.wikimedia.org/T85847) (owner: 10Legoktm) [21:46:05] Reedy, does https://gerrit.wikimedia.org/r/#/c/391706/ requires SWAT? I think it's a no-op? [21:47:35] No, doesn't need a swat window [21:47:46] I meant to sync that one [21:50:21] (03CR) 10MarcoAurelio: "I don't think this needs to be rebased everyday." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387077 (https://phabricator.wikimedia.org/T179241) (owner: 10Zoranzoki21) [21:52:20] Actually, needs a rebase [21:52:59] I can't access the 'rebase' button on not-self changes [21:53:03] (03CR) 10Chad: "Indeed. Only time you need to rebase is if there's an actual conflict, or if you need to bring in some other change that you now rely on. 
" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387077 (https://phabricator.wikimedia.org/T179241) (owner: 10Zoranzoki21) [21:53:22] (03CR) 10Chad: "Needs a rebase, then will merge." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/391706 (owner: 10Brian Wolff) [21:53:45] TabbyCat: People rebase too often anyway [21:54:02] It's not necessary until merge time, or if there's something you explicitly want to have included in your tree [21:54:29] I probably abuse that feature a bit too; because manual rebasing is such a pain for me... [21:58:28] (03PS16) 10Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 [21:59:20] TabbyCat: Don't get me wrong, it's a nice feature. But pressing it every day to keep it up to date and kill the little warning is just spammy :) [21:59:26] (the warning is a little overzealous, tbh) [22:00:24] no_justification, got you right, no worries :) [22:01:00] Yeah, you only need to rebase every time the button becomes active [22:01:18] * no_justification whacks Reedy [22:01:20] Shush you [22:01:34] * Reedy cries [22:01:41] Don't give people ideas :p [22:04:22] * TabbyCat spams gerrit with rebases [22:04:37] such fun to clog zuul with new checks :p [22:05:31] That too ^ [22:05:39] Mostly useless rechecking of code that didn't change [22:08:01] I shall make a lolcat... Every time you rebase a change, no_justification makes Reedy cry. Think on Reedy, do not rebase if there's no need. [22:08:35] Every time you rebase, it also adds DB entries, git reflog entries.... [22:08:51] It's not a completely zero overhead [22:09:02] * no_justification should write an e-mail [22:09:04] :) [22:09:14] commit number also changes, right? 
[22:09:21] s/number/hash [22:09:24] (03CR) 10jerkins-bot: [V: 04-1] Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [22:09:56] (03PS17) 10Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 [22:10:05] Think of the hashes! [22:10:07] TabbyCat: Yes, sha1 will obviously change since you have a new parent [22:10:07] We might run out! [22:11:20] (03CR) 10jerkins-bot: [V: 04-1] Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [22:11:36] Success [22:11:44] We're at 10 characters for shortest unique sha1 in MW core now [22:13:02] (default is 7, Linux kernel is at 11 or 12, iirc [22:13:17] (shortest abbreviated sha1) [22:13:41] https://github.com/git/git/commit/e6c587c733b4634030b353f4024794b08bc86892 is fun [22:16:56] (03PS18) 10Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 [22:17:12] (03CR) 10Chad: [C: 032] Move a variable closer to other relevant code [mediawiki-config] - 10https://gerrit.wikimedia.org/r/374653 (owner: 10MaxSem) [22:18:10] (03CR) 10jerkins-bot: [V: 04-1] Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367465 (owner: 10Reedy) [22:18:12] (03CR) 10Chad: "Could we separate them? One change for the fixes, then another to bump the version cleanly?" 
[mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[22:18:16] (Merged) jenkins-bot: Move a variable closer to other relevant code [mediawiki-config] - https://gerrit.wikimedia.org/r/374653 (owner: MaxSem)
[22:18:21] (CR) Chad: "(makes me feel safer about no-ops)" [mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[22:18:26] (CR) jenkins-bot: Move a variable closer to other relevant code [mediawiki-config] - https://gerrit.wikimedia.org/r/374653 (owner: MaxSem)
[22:19:33] !log demon@tin Synchronized wmf-config/CommonSettings.php: no-op, moving stuff around (duration: 00m 47s)
[22:19:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:20:16] (CR) Chad: "It wasn't about automatically sorting, it was just adding a check in jenkins for additions being out of order." [mediawiki-config] - https://gerrit.wikimedia.org/r/383999 (owner: Hoo man)
[22:22:14] (PS19) Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465
[22:22:20] I'll split them in a few minutes
[22:22:27] That should pass now..
[22:23:30] (CR) jerkins-bot: [V: -1] Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[22:23:38] Damn it. 1 left
[22:24:49] (PS20) Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465
[22:26:42] (CR) Chad: "Holy crap it's working" [mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[22:28:31] (CR) Chad: "I think this can be abandoned now?" [puppet] - https://gerrit.wikimedia.org/r/332531 (https://phabricator.wikimedia.org/T141324) (owner: Paladox)
[22:30:12] (Abandoned) Paladox: Gerrit: Enable logstash by default for prod gerrit [puppet] - https://gerrit.wikimedia.org/r/332531 (https://phabricator.wikimedia.org/T141324) (owner: Paladox)
[22:30:37] (PS21) Reedy: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465
[22:30:40] (PS1) Reedy: Code style for mediawiki-codesniffer 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/393142
[22:30:45] soooorted
[22:31:51] Interesting.. I accidentally removed
[22:31:54] But it's still working
[22:32:15] meh
[22:33:24] How on earth did we still have NS_IMAGE constants lying about?
[22:33:25] Heh
[22:33:31] (CR) Chad: [C: 2] Code style for mediawiki-codesniffer 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/393142 (owner: Reedy)
[22:33:34] (CR) Reedy: [C: 1] "SHIP IT" [mediawiki-config] - https://gerrit.wikimedia.org/r/393142 (owner: Reedy)
[22:34:40] (Merged) jenkins-bot: Code style for mediawiki-codesniffer 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/393142 (owner: Reedy)
[22:37:54] Operations, Analytics, hardware-requests: Refresh or replace oxygen - https://phabricator.wikimedia.org/T181264#3784670 (faidon)
[22:38:11] (CR) jenkins-bot: Code style for mediawiki-codesniffer 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/393142 (owner: Reedy)
[22:39:48] !log demon@tin Synchronized wmf-config/: style fixes, no-op (duration: 00m 47s)
[22:39:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:49:19] (CR) Chad: [C: 1] "votewiki too maybe?" [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302) (owner: TerraCodes)
[22:50:01] Reedy: This one is trivial Reedy: This one is trivial
[22:50:07] Errr, wrong copy+paste
[22:50:24] This one: https://gerrit.wikimedia.org/r/#/c/393122/
[22:51:05] (CR) Reedy: [C: 2] wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - https://gerrit.wikimedia.org/r/393122 (owner: Chad)
[22:52:37] (PS2) Chad: wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - https://gerrit.wikimedia.org/r/393122
[22:52:47] Stupid jgit
[22:56:39] (PS1) Reedy: Simplify PageTriage config [mediawiki-config] - https://gerrit.wikimedia.org/r/393164
[22:56:44] ^ that's 3 more we can remove
[23:00:16] (CR) Reedy: [C: -1] Simplify PageTriage config (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/393164 (owner: Reedy)
[23:04:03] (CR) Reedy: "Meh, default is [], so no problems here" [mediawiki-config] - https://gerrit.wikimedia.org/r/393164 (owner: Reedy)
[23:10:59] (PS1) Reedy: Remove $wmgSearchExtraNamespaces imtermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393165
[23:12:59] (PS1) Reedy: Remove wmgRelatedSitesPrefixes intermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393166
[23:15:13] (PS1) Reedy: Remove GettingStarted intermediate variables [mediawiki-config] - https://gerrit.wikimedia.org/r/393167
[23:20:33] (PS9) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[23:20:39] (PS10) TerraCodes: Add loginwiki and wikidata to $wgLocalVirtualHosts [mediawiki-config] - https://gerrit.wikimedia.org/r/392999 (https://phabricator.wikimedia.org/T117302)
[23:20:47] (CR) Reedy: [C: 2] Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[23:21:04] (PS2) TerraCodes: Remove single editor tab for plwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/393121 (https://phabricator.wikimedia.org/T181045)
[23:22:14] (Merged) jenkins-bot: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[23:22:26] (CR) jenkins-bot: Update mediawiki-codesniffer to 14.1.0 [mediawiki-config] - https://gerrit.wikimedia.org/r/367465 (owner: Reedy)
[23:23:22] !log reedy@tin Synchronized phpcs.xml: (no justification provided) (duration: 00m 45s)
[23:23:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:24:18] !log reedy@tin Synchronized composer.json: (no justification provided) (duration: 00m 45s)
[23:24:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:25:23] !log reedy@tin Synchronized composer.lock: (no justification provided) (duration: 00m 44s)
[23:25:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:26:00] (CR) Reedy: [C: 2] Simplify PageTriage config [mediawiki-config] - https://gerrit.wikimedia.org/r/393164 (owner: Reedy)
[23:27:23] (Merged) jenkins-bot: Simplify PageTriage config [mediawiki-config] - https://gerrit.wikimedia.org/r/393164 (owner: Reedy)
[23:27:47] (PS2) Reedy: Remove $wmgSearchExtraNamespaces imtermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393165
[23:27:51] (CR) Reedy: [C: 2] Remove $wmgSearchExtraNamespaces imtermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393165 (owner: Reedy)
[23:29:13] (Merged) jenkins-bot: Remove $wmgSearchExtraNamespaces imtermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393165 (owner: Reedy)
[23:29:17] (CR) jenkins-bot: Simplify PageTriage config [mediawiki-config] - https://gerrit.wikimedia.org/r/393164 (owner: Reedy)
[23:29:39] (PS2) Reedy: Remove wmgRelatedSitesPrefixes intermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393166
[23:29:43] (CR) Reedy: [C: 2] Remove wmgRelatedSitesPrefixes intermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393166 (owner: Reedy)
[23:30:59] (Merged) jenkins-bot: Remove wmgRelatedSitesPrefixes intermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393166 (owner: Reedy)
[23:31:16] (PS2) Reedy: Remove GettingStarted intermediate variables [mediawiki-config] - https://gerrit.wikimedia.org/r/393167
[23:31:20] (CR) Reedy: [C: 2] Remove GettingStarted intermediate variables [mediawiki-config] - https://gerrit.wikimedia.org/r/393167 (owner: Reedy)
[23:32:24] (CR) jenkins-bot: Remove $wmgSearchExtraNamespaces imtermediatary [mediawiki-config] - https://gerrit.wikimedia.org/r/393165 (owner: Reedy)
[23:32:48] (Merged) jenkins-bot: Remove GettingStarted intermediate variables [mediawiki-config] - https://gerrit.wikimedia.org/r/393167 (owner: Reedy)
[23:33:29] (CR) Reedy: wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - https://gerrit.wikimedia.org/r/393122 (owner: Chad)
[23:33:32] (PS3) Reedy: wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - https://gerrit.wikimedia.org/r/393122 (owner: Chad)
[23:33:36] (CR) Reedy: [C: 2] wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - https://gerrit.wikimedia.org/r/393122 (owner: Chad)
[23:34:59] (Merged) jenkins-bot: wgPageImagesExpandOpenSearchXml: drop intermediate $wmg setting [mediawiki-config] - https://gerrit.wikimedia.org/r/393122 (owner: Chad)
[23:42:40] !log reedy@tin Synchronized wmf-config/: Simplify variable assignment (duration: 00m 47s)
[23:42:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log