[00:10:42] (PS1) Catrope: Enable and configure GrowthExperiments on svwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/627395 (https://phabricator.wikimedia.org/T257220)
[00:19:38] Operations, ops-codfw, DC-Ops, Maps: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet - https://phabricator.wikimedia.org/T260271 (Papaul)
[00:45:04] PROBLEM - Postgres Replication Lag on maps2003 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 39715720 and 1 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[00:47:02] RECOVERY - Postgres Replication Lag on maps2003 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 118048 and 90 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[01:27:47] (PS20) Ryan Kemper: elasticsearch: Let spicerack handle wait for all write queues to clear [cookbooks] - https://gerrit.wikimedia.org/r/603731 (https://phabricator.wikimedia.org/T261239)
[02:07:00] (PS1) TrainBranchBot: Branch commit for wmf/1.36.0-wmf.9 [core] (wmf/1.36.0-wmf.9) - https://gerrit.wikimedia.org/r/627400
[02:07:43] (PS2) DannyS712: Branch commit for wmf/1.36.0-wmf.9 [core] (wmf/1.36.0-wmf.9) - https://gerrit.wikimedia.org/r/627400 (https://phabricator.wikimedia.org/T257977) (owner: TrainBranchBot)
[04:43:04] PROBLEM - Router interfaces on cr4-ulsfo is CRITICAL: CRITICAL: host 198.35.26.193, interfaces up: 75, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[04:49:32] PROBLEM - OSPF status on cr1-codfw is CRITICAL: OSPFv2: 5/6 UP : OSPFv3: 5/6 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[04:55:22] RECOVERY - OSPF status on cr1-codfw is OK: OSPFv2: 5/5 UP : OSPFv3: 5/5 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[04:58:46] RECOVERY - Router interfaces on cr4-ulsfo is OK: OK: host 198.35.26.193, interfaces up: 77, down: 0,
dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:09:27] (CR) Marostegui: [C: +2] mariadb: Add arbcom_ruwiki to the list of private wikis [puppet] - https://gerrit.wikimedia.org/r/627331 (https://phabricator.wikimedia.org/T262832) (owner: Jcrespo)
[05:10:19] !log Restart sanitarium hosts on eqiad and codfw T262832
[05:10:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:10:27] T262832: Prepare and check storage layer for arbcom_ruwiki - https://phabricator.wikimedia.org/T262832
[05:28:16] Operations, DBA, observability: Prometheus/MariaDB counts a 'SELECT ... FOR UPDATE' query as an UPDATE query - https://phabricator.wikimedia.org/T262579 (Marostegui) I am not sure there's much else to do here. I believe that the most likely explanation on why they aren't on the binlog is T262579#6453...
[05:30:16] (PS1) Marostegui: dbproxy: Depool labsdb1010 [puppet] - https://gerrit.wikimedia.org/r/627409 (https://phabricator.wikimedia.org/T261456)
[05:31:26] (CR) Marostegui: [C: +2] dbproxy: Depool labsdb1010 [puppet] - https://gerrit.wikimedia.org/r/627409 (https://phabricator.wikimedia.org/T261456) (owner: Marostegui)
[05:33:33] Operations, ops-codfw, DBA, Patch-For-Review, User-Kormat: db2125 crashed - mgmt iface also not available - https://phabricator.wikimedia.org/T260670 (Marostegui) Thank you Papaul. I have started MySQL back in db2125 so replication doesn't get behind too much.
[05:33:48] !log Depool labsdb1010 for PDU maintenance
[05:33:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:55:48] (PS1) Marostegui: mariadb: Productionize es2028 [puppet] - https://gerrit.wikimedia.org/r/627410 (https://phabricator.wikimedia.org/T261717)
[06:03:07] (CR) Marostegui: [C: +2] mariadb: Productionize es2028 [puppet] - https://gerrit.wikimedia.org/r/627410 (https://phabricator.wikimedia.org/T261717) (owner: Marostegui)
[06:05:09] !log marostegui@cumin1001 dbctl commit (dc=all): 'Set es2012 as es1 codfw master T261717', diff saved to https://phabricator.wikimedia.org/P12584 and previous config saved to /var/cache/conftool/dbconfig/20200915-060508-marostegui.json
[06:05:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:05:16] T261717: Productionize es20[26-34] and es10[26-34] - https://phabricator.wikimedia.org/T261717
[06:06:24] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool es2011 to clone es2028', diff saved to https://phabricator.wikimedia.org/P12585 and previous config saved to /var/cache/conftool/dbconfig/20200915-060623-marostegui.json
[06:06:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:08:05] !log Stop mysql on es2011 to clone es2028
[06:08:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:57] Operations, ops-eqiad, DBA, DC-Ops: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (Marostegui) I have depooled labsdb1010, I will stop mysql in a couple of hours
[06:18:06] Operations, Analytics, Traffic, Wikimedia-General-or-Unknown: Cookie “WMF-Last-Access-Global” has been rejected for invalid domain. - https://phabricator.wikimedia.org/T261803 (Aklapper)
[06:19:01] Operations, Analytics, Traffic, Wikimedia-General-or-Unknown: Cookie “WMF-Last-Access-Global” has been rejected for invalid domain.
- https://phabricator.wikimedia.org/T261803 (Aklapper) Introduced in T138027; T262882 might be a dup?
[06:26:05] Operations, Traffic, netops: Wikimedia projects not reachable for some Telecom Italia users - https://phabricator.wikimedia.org/T262869 (Aklapper) For anyone running into this, please follow https://www.mediawiki.org/wiki/How_to_report_a_bug#Reporting_a_connectivity_issue (but please note that this t...
[06:26:16] (CR) Marostegui: "Thank you for taking on this Brooke. I would like to hear Stevie's opinion on this, as I wanted her to take a look on something similar fo" [puppet] - https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: Bstorm)
[06:28:42] Operations, DBA, observability: Prometheus/MariaDB counts a 'SELECT ... FOR UPDATE' query as an UPDATE query - https://phabricator.wikimedia.org/T262579 (jcrespo) Open→Invalid
[06:29:14] (CR) Marostegui: wikireplicas: Proposal for a proxy setup on multi-instance replicas (2 comments) [puppet] - https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: Bstorm)
[06:34:32] (CR) Lars Wirzenius: [C: +2] "LGTM." [core] (wmf/1.36.0-wmf.9) - https://gerrit.wikimedia.org/r/627400 (https://phabricator.wikimedia.org/T257977) (owner: TrainBranchBot)
[06:34:48] (CR) Ayounsi: [C: +1] "LGTM, but I don't know about the specific urllib3.disable_warnings() questions."
[cookbooks] - https://gerrit.wikimedia.org/r/627272 (owner: Jbond)
[06:48:14] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:53:05] (Merged) jenkins-bot: Branch commit for wmf/1.36.0-wmf.9 [core] (wmf/1.36.0-wmf.9) - https://gerrit.wikimedia.org/r/627400 (https://phabricator.wikimedia.org/T257977) (owner: TrainBranchBot)
[06:57:45] !log 1.36.0-wmf.9 was branched at 7269b6b57b6f79646b96ece818d2f2d38e0d2ea6 for T257977
[06:57:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:57:52] T257977: 1.36.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T257977
[07:03:29] (PS21) Ryan Kemper: elasticsearch: Let spicerack handle wait for all write queues to clear [cookbooks] - https://gerrit.wikimedia.org/r/603731 (https://phabricator.wikimedia.org/T261239)
[07:03:38] (PS2) Muehlenhoff: superset: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627203
[07:05:29] (CR) Muehlenhoff: [C: +2] superset: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627203 (owner: Muehlenhoff)
[07:10:07] (PS5) Ryan Kemper: elasticsearch: Store which dcs to query in class [software/spicerack] - https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239)
[07:11:17] (PS6) Ryan Kemper: elasticsearch: Store which dcs to query in class [software/spicerack] - https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239)
[07:12:06] (PS4) Muehlenhoff: Manage /etc/apt/sources.list via Puppet (WIP) [puppet] - https://gerrit.wikimedia.org/r/626693 (https://phabricator.wikimedia.org/T156562)
[07:12:11] (PS4) Muehlenhoff: icinga: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627201
[07:12:13] (PS2) Muehlenhoff: turnilo: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627202
[07:12:15] (PS2) Muehlenhoff: alerts: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627226
[07:13:29] (CR) Muehlenhoff: [C: +2] alerts: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627226 (owner: Muehlenhoff)
[07:13:35] (CR) jerkins-bot: [V: -1] elasticsearch: Store which dcs to query in class [software/spicerack] - https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) (owner: Ryan Kemper)
[07:13:37] (PS3) Muehlenhoff: alerts: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627226
[07:16:35] !log pre-configure SGIX port on cr2-eqsin
[07:16:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:19:37] !log elukey@cumin1001 START - Cookbook sre.druid.roll-restart-workers
[07:19:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:19:54] !log roll restart druid cluster to pick up openjdk updates
[07:19:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:22:48] I'm adding a new host to swift codfw shortly, no impact expected
[07:23:13] (CR) Filippo Giunchedi: [V: +2 C: +2] codfw-prod: add ms-be2057 at object weight 100 [software/swift-ring] - https://gerrit.wikimedia.org/r/625604 (https://phabricator.wikimedia.org/T261633) (owner: Filippo Giunchedi)
[07:23:18] Operations, MediaWiki-General, Performance-Team, serviceops-radar, and 3 others: Move MainStash out of Redis to a simpler multi-dc aware solution - https://phabricator.wikimedia.org/T212129 (Gilles) @Marostegui @LSobanski is this still on track to be purchased in Q2?
[07:24:23] !log swift codfw add ms-be2057 at object weight 100 - T261633
[07:24:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:24:29] T261633: Put ms-be2057 (Dell R740xd2) in service - https://phabricator.wikimedia.org/T261633
[07:31:32] (PS5) Muehlenhoff: icinga: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627201
[07:33:54] Operations, MediaWiki-General, Performance-Team, serviceops-radar, and 3 others: Move MainStash out of Redis to a simpler multi-dc aware solution - https://phabricator.wikimedia.org/T212129 (Marostegui) Yes, and hopefully also set up during Q2 :-)
[07:41:01] (PS6) Muehlenhoff: icinga: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627201
[07:44:17] (PS7) Ryan Kemper: elasticsearch: Store which dcs to query in class [software/spicerack] - https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239)
[07:45:03] (CR) Muehlenhoff: [C: +2] icinga: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627201 (owner: Muehlenhoff)
[07:45:49] (CR) Ryan Kemper: "Okay, inline feedback is implemented."
(3 comments) [software/spicerack] - https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) (owner: Ryan Kemper)
[07:46:14] (CR) Ryan Kemper: elasticsearch: Let spicerack handle wait for all write queues to clear (2 comments) [cookbooks] - https://gerrit.wikimedia.org/r/603731 (https://phabricator.wikimedia.org/T261239) (owner: Ryan Kemper)
[07:47:02] (PS1) Filippo Giunchedi: hieradata: enable rsyslog queues in esams/eqiad [puppet] - https://gerrit.wikimedia.org/r/627418 (https://phabricator.wikimedia.org/T226703)
[07:47:43] looking for kind souls to +1 ^
[07:49:29] having a look in ~ 5m
[07:50:11] (PS1) Lars Wirzenius: testwikis wikis to 1.36.0-wmf.9 [mediawiki-config] - https://gerrit.wikimedia.org/r/627419
[07:50:13] (CR) Lars Wirzenius: [C: +2] testwikis wikis to 1.36.0-wmf.9 [mediawiki-config] - https://gerrit.wikimedia.org/r/627419 (owner: Lars Wirzenius)
[07:50:16] (CR) Kormat: [C: +1] hieradata: enable rsyslog queues in esams/eqiad [puppet] - https://gerrit.wikimedia.org/r/627418 (https://phabricator.wikimedia.org/T226703) (owner: Filippo Giunchedi)
[07:50:27] godog: +1'd in defiance of your criteria
[07:51:00] (Merged) jenkins-bot: testwikis wikis to 1.36.0-wmf.9 [mediawiki-config] - https://gerrit.wikimedia.org/r/627419 (owner: Lars Wirzenius)
[07:51:00] kormat: haha! appreciate it, I knew I could count on you!
[07:51:13] moritzm: thanks!
[07:51:29] * kormat wonders if he's being baited
[07:52:35] heheh fair, no I genuinely appreciate it
[07:54:38] !log liw@deploy1001 Started scap: testwikis to 1.36.0-wmf.9
[07:54:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:55:33] (CR) Muehlenhoff: [C: +1] "Looks good, PCC also fine: https://puppet-compiler.wmflabs.org/compiler1001/25068/" [puppet] - https://gerrit.wikimedia.org/r/627418 (https://phabricator.wikimedia.org/T226703) (owner: Filippo Giunchedi)
[07:56:09] (PS3) Muehlenhoff: turnilo: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627202
[07:58:14] Operations, ops-eqiad, DC-Ops: Netbox report accounting icinga alert - https://phabricator.wikimedia.org/T250053 (ayounsi) > there could be some type of auto-generated task that assigns each DC engineer a Phabricator task That's something we want to have for a while, see for example T225140. But as F...
[08:01:04] !log T187984 migration script on otrs1001 proceeding as expected. Still in step 31/44, but that's what we saw in the test migration
[08:01:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:01:10] T187984: Update OTRS to the latest stable version (6.0.x) - https://phabricator.wikimedia.org/T187984
[08:02:23] (CR) Elukey: [C: +1] "LGTM, I also want to follow Kormat's lead so here's my +1."
[puppet] - https://gerrit.wikimedia.org/r/627418 (https://phabricator.wikimedia.org/T226703) (owner: Filippo Giunchedi)
[08:02:55] !log elukey@cumin1001 END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
[08:02:55] :D
[08:02:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:03:38] (CR) Filippo Giunchedi: [C: +2] hieradata: enable rsyslog queues in esams/eqiad [puppet] - https://gerrit.wikimedia.org/r/627418 (https://phabricator.wikimedia.org/T226703) (owner: Filippo Giunchedi)
[08:04:01] !log elukey@cumin1001 START - Cookbook sre.druid.roll-restart-workers
[08:04:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:05:48] !log liw@deploy1001 scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_498180604" --store-class=LCStoreCDB --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 11m 10s)
[08:05:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:09:55] PROBLEM - aqs endpoints health on aqs1007 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:10:12] Operations, ops-eqiad, Analytics-Radar: an-presto1004 down - https://phabricator.wikimedia.org/T253438 (elukey) @wiki_willy do we have a high level timeline about when we could have the host back in service?
We are not in a hurry but it has been down from the end of March :(
[08:10:24] PROBLEM - aqs endpoints health on aqs1005 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:10:53] this is due to the druid roll restart, weird
[08:11:40] PROBLEM - aqs endpoints health on aqs1004 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:11:42] PROBLEM - aqs endpoints health on aqs1008 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:11:44] RECOVERY - aqs endpoints health on aqs1007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:13:22] !log Stop MySQL on labsdb1010 for PDU maintenance T261456
[08:13:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:13:29] T261456: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456
[08:13:34] RECOVERY - aqs endpoints health on aqs1008 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:14:31] Operations, ops-eqiad, DBA, DC-Ops: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (Marostegui) labsdb1010 mysql has been stopped @Cmjohnson please take special care of: dbproxy1018 and dbproxy1019, and labsdb1011 as those hosts are serving a
[08:15:53] (CR)
Muehlenhoff: [C: +2] turnilo: Also enforce access setting in the IDP service definition [puppet] - https://gerrit.wikimedia.org/r/627202 (owner: Muehlenhoff)
[08:16:10] RECOVERY - aqs endpoints health on aqs1005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:22:12] (PS1) Muehlenhoff: Add comment for people.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/627422
[08:27:04] RECOVERY - aqs endpoints health on aqs1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[08:28:54] PROBLEM - Rate of JVM GC Old generation-s runs - elastic2029-production-search-psi-codfw on elastic2029 is CRITICAL: 107.8 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-codfw&var-instance=elastic2029&panelId=37
[08:31:29] (PS2) Giuseppe Lavagetto: mobileapps: use restbase-for-services everywhere [deployment-charts] - https://gerrit.wikimedia.org/r/626270
[08:32:34] (CR) Jbond: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/627422 (owner: Muehlenhoff)
[08:33:17] (CR) Giuseppe Lavagetto: [C: +2] mobileapps: use restbase-for-services everywhere [deployment-charts] - https://gerrit.wikimedia.org/r/626270 (owner: Giuseppe Lavagetto)
[08:34:40] (CR) Kormat: "LGTM, just one nit."
(1 comment) [software/wmfmariadbpy] - https://gerrit.wikimedia.org/r/626602 (owner: Jcrespo)
[08:34:46] (Merged) jenkins-bot: mobileapps: use restbase-for-services everywhere [deployment-charts] - https://gerrit.wikimedia.org/r/626270 (owner: Giuseppe Lavagetto)
[08:35:03] Operations, netbox: Netbox: fill network topology - https://phabricator.wikimedia.org/T205897 (ayounsi)
[08:35:50] (CR) Muehlenhoff: [C: +2] Add comment for people.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/627422 (owner: Muehlenhoff)
[08:42:14] (Abandoned) Jbond: git: allow multiple calls to git::systemconfig [puppet] - https://gerrit.wikimedia.org/r/626260 (https://phabricator.wikimedia.org/T262244) (owner: Jbond)
[08:49:09] Operations, ops-eqiad, DC-Ops: Physically db1131 from B5 to C8 - https://phabricator.wikimedia.org/T262901 (Marostegui)
[08:49:20] Operations, ops-eqiad, DC-Ops: Physically db1131 from B5 to C8 - https://phabricator.wikimedia.org/T262901 (Marostegui)
[08:49:30] Operations, ops-eqiad, DC-Ops: Physically db1131 from B5 to C8 - https://phabricator.wikimedia.org/T262901 (Marostegui) p:Triage→Medium
[08:51:12] (PS10) Jbond: pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - https://gerrit.wikimedia.org/r/627272
[08:52:14] (CR) jerkins-bot: [V: -1] pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - https://gerrit.wikimedia.org/r/627272 (owner: Jbond)
[08:52:29] !log elukey@cumin1001 END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
[08:52:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:53:11] !log elukey@cumin1001 START - Cookbook sre.zookeeper.roll-restart-zookeeper
[08:53:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:53:26] !log roll restart druid zookeeper clusters for openjdk upgrades
[08:53:30] Logged the message at
https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:55:39] Operations, ops-eqiad, DC-Ops: Physically move db1131 from B5 to C8 - https://phabricator.wikimedia.org/T262901 (Marostegui)
[08:58:51] !log oblivian@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
[08:58:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:59:55] !log elukey@cumin1001 END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0)
[08:59:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:01:48] !log elukey@cumin1001 START - Cookbook sre.zookeeper.roll-restart-zookeeper
[09:01:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:02:23] (PS2) JMeybohm: lvs: Rename termbox-https to termbox [puppet] - https://gerrit.wikimedia.org/r/627297 (https://phabricator.wikimedia.org/T254581)
[09:02:27] (PS2) JMeybohm: lvs: Remove termbox non-TLS endpoint from LVS 1/3 [puppet] - https://gerrit.wikimedia.org/r/627298 (https://phabricator.wikimedia.org/T254581)
[09:02:28] (PS2) JMeybohm: lvs: Remove termbox non-TLS endpoint from LVS 2/3 [puppet] - https://gerrit.wikimedia.org/r/627299 (https://phabricator.wikimedia.org/T254581)
[09:02:30] (PS2) JMeybohm: lvs: Remove termbox non-TLS endpoint from LVS 3/3 [puppet] - https://gerrit.wikimedia.org/r/627300 (https://phabricator.wikimedia.org/T254581)
[09:04:22] !log restart elasticsearch on elastic2029 (high GC
[09:04:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:04:53] (PS3) JMeybohm: lvs: Rename termbox-https to termbox [puppet] - https://gerrit.wikimedia.org/r/627297 (https://phabricator.wikimedia.org/T254581)
[09:04:55] (PS3) JMeybohm: lvs: Remove termbox non-TLS endpoint from LVS 1/3 [puppet] - https://gerrit.wikimedia.org/r/627298 (https://phabricator.wikimedia.org/T254581)
[09:04:57] (PS3) JMeybohm: lvs: Remove
termbox non-TLS endpoint from LVS 2/3 [puppet] - https://gerrit.wikimedia.org/r/627299 (https://phabricator.wikimedia.org/T254581)
[09:04:59] (PS3) JMeybohm: lvs: Remove termbox non-TLS endpoint from LVS 3/3 [puppet] - https://gerrit.wikimedia.org/r/627300 (https://phabricator.wikimedia.org/T254581)
[09:05:28] PROBLEM - Ensure local MW versions match expected deployment on mw1318 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:30] PROBLEM - Ensure local MW versions match expected deployment on mw2300 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:30] PROBLEM - Ensure local MW versions match expected deployment on mw1397 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:32] PROBLEM - Ensure local MW versions match expected deployment on parse2008 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:32] PROBLEM - Ensure local MW versions match expected deployment on wtp1044 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:32] PROBLEM - Ensure local MW versions match expected deployment on wtp2006 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:32] PROBLEM - Ensure local MW versions match expected deployment on mw2278 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:34] PROBLEM - Ensure local MW versions match expected deployment on mw1314 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:34] PROBLEM - Ensure local MW versions match expected deployment on mw1329 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:35] PROBLEM - Ensure local MW versions match expected deployment on mw1341 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:35] PROBLEM - Ensure local MW versions match expected deployment on parse2019 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:35] PROBLEM - Ensure local MW versions match expected deployment on mw2339 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:35] PROBLEM - Ensure local MW versions match expected deployment on mw2357 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:35] PROBLEM - Ensure local MW versions match expected deployment on mw2226 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:35] PROBLEM - Ensure local MW versions match expected deployment on mw2255 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:44] PROBLEM - Ensure local MW versions match expected deployment on mw1385 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:44] PROBLEM - Ensure local MW versions match expected deployment on mw1364 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:45] PROBLEM - Ensure local MW versions match expected deployment on mw1370 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:45] PROBLEM - Ensure local MW versions match expected deployment on mw1296 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw1361 is CRITICAL: CRITICAL: 3 mismatched
wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw1363 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw1315 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw1279 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw1339 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw2326 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:48] PROBLEM - Ensure local MW versions match expected deployment on mw2292 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:49] PROBLEM - Ensure local MW versions match expected deployment on mw2289 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:49] PROBLEM - Ensure local MW versions match expected deployment on mw2282 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:50] !log oblivian@deploy1001 helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
[09:05:50] !log oblivian@deploy1001 helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
[09:05:50] PROBLEM - Ensure local MW versions match expected deployment on mw2216 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:50] PROBLEM - Ensure local MW versions match expected deployment on mw2264 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:05:54] PROBLEM - Ensure local MW versions match expected deployment on mw1396 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:54] PROBLEM - Ensure local MW versions match expected deployment on mw1366 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:54] PROBLEM - Ensure local MW versions match expected deployment on mw2307 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:54] PROBLEM - Ensure local MW versions match expected deployment on mw2334 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:54] PROBLEM - Ensure local MW versions match expected deployment on mw2265 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:54] PROBLEM - Ensure local MW versions match expected deployment on mw2266 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:55] PROBLEM - Ensure local MW versions match expected deployment on mw1387 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:56] PROBLEM - Ensure local MW versions match expected deployment on mw2245 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:56] PROBLEM - Ensure local MW
versions match expected deployment on mw2272 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:05:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:06:02] PROBLEM - Ensure local MW versions match expected deployment on labweb1001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:02] PROBLEM - Ensure local MW versions match expected deployment on mw1382 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:02] PROBLEM - Ensure local MW versions match expected deployment on wtp1036 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:02] PROBLEM - Ensure local MW versions match expected deployment on wtp1037 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:08] PROBLEM - Ensure local MW versions match expected deployment on parse2012 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:08] PROBLEM - Ensure local MW versions match expected deployment on mw2370 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:08] PROBLEM - Ensure local MW versions match expected deployment on mw2323 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:08] PROBLEM - Ensure local MW versions match expected deployment on mw2316 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:08] PROBLEM - Ensure local MW versions match expected deployment on mw2310 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers
[09:06:08] PROBLEM - Ensure local MW versions match expected deployment
on wtp2002 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:08] PROBLEM - Ensure local MW versions match expected deployment on wtp2017 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:18] PROBLEM - Ensure local MW versions match expected deployment on mw1353 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:18] PROBLEM - Ensure local MW versions match expected deployment on mw1379 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:18] PROBLEM - Ensure local MW versions match expected deployment on mw1356 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:18] PROBLEM - Ensure local MW versions match expected deployment on mw1367 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:18] PROBLEM - Ensure local MW versions match expected deployment on mw1372 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:18] PROBLEM - Ensure local MW versions match expected deployment on mw1378 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:22] PROBLEM - Ensure local MW versions match expected deployment on wtp1033 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:22] PROBLEM - Ensure local MW versions match expected deployment on parse2005 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:22] PROBLEM - Ensure local MW versions match expected deployment on mw2291 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers 
[09:06:22] PROBLEM - Ensure local MW versions match expected deployment on mw2235 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:22] PROBLEM - Ensure local MW versions match expected deployment on mw2229 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:22] PROBLEM - Ensure local MW versions match expected deployment on mw2363 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:28] PROBLEM - Ensure local MW versions match expected deployment on mw1348 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:28] PROBLEM - Ensure local MW versions match expected deployment on mw2294 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:28] PROBLEM - Ensure local MW versions match expected deployment on mw2250 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:28] PROBLEM - Ensure local MW versions match expected deployment on mw2234 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:30] PROBLEM - Ensure local MW versions match expected deployment on mw1395 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:30] PROBLEM - Ensure local MW versions match expected deployment on mw1390 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:31] liw: ^^ [09:06:32] PROBLEM - Ensure local MW versions match expected deployment on wtp1045 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:32] PROBLEM - Ensure local MW versions match expected deployment on mw1302 is CRITICAL: CRITICAL: 3 
mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:32] PROBLEM - Ensure local MW versions match expected deployment on mw1265 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:32] PROBLEM - Ensure local MW versions match expected deployment on mw2374 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:34] PROBLEM - Ensure local MW versions match expected deployment on parse2015 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:34] PROBLEM - Ensure local MW versions match expected deployment on parse2009 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:34] PROBLEM - Ensure local MW versions match expected deployment on mw2332 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:34] PROBLEM - Ensure local MW versions match expected deployment on mw2335 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:38] PROBLEM - Ensure local MW versions match expected deployment on wtp1027 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:38] PROBLEM - Ensure local MW versions match expected deployment on mwdebug1002 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:38] PROBLEM - Ensure local MW versions match expected deployment on mw1301 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:38] PROBLEM - Ensure local MW versions match expected deployment on mw1323 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW 
versions match expected deployment on wtp1031 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW versions match expected deployment on snapshot1005 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW versions match expected deployment on mw2322 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW versions match expected deployment on mw2320 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW versions match expected deployment on wtp2018 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW versions match expected deployment on wtp2012 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:40] PROBLEM - Ensure local MW versions match expected deployment on mw2219 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:41] PROBLEM - Ensure local MW versions match expected deployment on mw2230 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:41] PROBLEM - Ensure local MW versions match expected deployment on mw2269 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:42] PROBLEM - Ensure local MW versions match expected deployment on mw2228 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:42] PROBLEM - Ensure local MW versions match expected deployment on mw2256 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:06:44] I am downtiming the alerts [09:06:44] PROBLEM - Ensure local MW versions match expected deployment on mw1347 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:44] PROBLEM - Ensure local MW versions match expected deployment on mw1273 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:44] PROBLEM - Ensure local MW versions match expected deployment on mw1274 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:45] PROBLEM - Ensure local MW versions match expected deployment on mw1282 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:46] PROBLEM - Ensure local MW versions match expected deployment on mw2237 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:46] PROBLEM - Ensure local MW versions match expected deployment on mw2220 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:49] acking* [09:06:52] PROBLEM - Ensure local MW versions match expected deployment on parse2016 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:52] PROBLEM - Ensure local MW versions match expected deployment on mw2360 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:52] PROBLEM - Ensure local MW versions match expected deployment on deploy2001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:06:52] PROBLEM - Ensure local MW versions match expected deployment on mw2247 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers 
[09:06:58] wt... [09:06:58] PROBLEM - Ensure local MW versions match expected deployment on mw1359 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:14] PROBLEM - Ensure local MW versions match expected deployment on mw1321 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:16] PROBLEM - Ensure local MW versions match expected deployment on mw1283 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:16] PROBLEM - Ensure local MW versions match expected deployment on mw1287 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:16] PROBLEM - Ensure local MW versions match expected deployment on mw1306 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:16] PROBLEM - Ensure local MW versions match expected deployment on mw1281 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:18] PROBLEM - Ensure local MW versions match expected deployment on mw2260 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:18] PROBLEM - Ensure local MW versions match expected deployment on mw2223 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:18] PROBLEM - Ensure local MW versions match expected deployment on mw2279 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:18] PROBLEM - Ensure local MW versions match expected deployment on mw2258 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:38] PROBLEM - Ensure local MW versions match expected deployment on mw2306 is CRITICAL: CRITICAL: 3 
mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:38] PROBLEM - Ensure local MW versions match expected deployment on mw2359 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:40] PROBLEM - Ensure local MW versions match expected deployment on mw1389 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:40] PROBLEM - Ensure local MW versions match expected deployment on mw1404 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:40] PROBLEM - Ensure local MW versions match expected deployment on mw1401 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:40] PROBLEM - Ensure local MW versions match expected deployment on mw1407 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:40] PROBLEM - Ensure local MW versions match expected deployment on mw1402 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:42] PROBLEM - Ensure local MW versions match expected deployment on mw1362 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:46] PROBLEM - Ensure local MW versions match expected deployment on mw1350 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:48] PROBLEM - Ensure local MW versions match expected deployment on mw1320 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:52] PROBLEM - Ensure local MW versions match expected deployment on mw2225 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:54] PROBLEM - Ensure local MW versions 
match expected deployment on mw1346 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:54] PROBLEM - Ensure local MW versions match expected deployment on mw1328 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:58] PROBLEM - Ensure local MW versions match expected deployment on mw1386 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:58] PROBLEM - Ensure local MW versions match expected deployment on mw1373 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:58] PROBLEM - Ensure local MW versions match expected deployment on mw1377 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:58] PROBLEM - Ensure local MW versions match expected deployment on mw1383 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:07:58] PROBLEM - Ensure local MW versions match expected deployment on parse2001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:00] PROBLEM - Ensure local MW versions match expected deployment on parse2007 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:00] PROBLEM - Ensure local MW versions match expected deployment on mw2328 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:00] PROBLEM - Ensure local MW versions match expected deployment on mwmaint2001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:06] PROBLEM - Ensure local MW versions match expected deployment on mw1278 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:08:06] PROBLEM - Ensure local MW versions match expected deployment on mw1289 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:06] PROBLEM - Ensure local MW versions match expected deployment on mw1272 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:06] PROBLEM - Ensure local MW versions match expected deployment on mw1293 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:06] PROBLEM - Ensure local MW versions match expected deployment on mw1308 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:06] PROBLEM - Ensure local MW versions match expected deployment on snapshot1007 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:08] PROBLEM - Ensure local MW versions match expected deployment on mw2303 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:08] PROBLEM - Ensure local MW versions match expected deployment on mw2375 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:08] PROBLEM - Ensure local MW versions match expected deployment on mw2329 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:12] PROBLEM - Ensure local MW versions match expected deployment on mw1393 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected deployment on mwmaint1002 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected 
deployment on snapshot1008 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected deployment on mw1349 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected deployment on wtp1039 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected deployment on mw2315 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected deployment on mw2368 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:15] PROBLEM - Ensure local MW versions match expected deployment on mw2231 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:16] PROBLEM - Ensure local MW versions match expected deployment on mw2268 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:16] PROBLEM - Ensure local MW versions match expected deployment on mw2288 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on mw2369 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on parse2004 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on mw2356 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on mw2311 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on mw2338 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on mw2373 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:22] PROBLEM - Ensure local MW versions match expected deployment on wtp2019 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:23] PROBLEM - Ensure local MW versions match expected deployment on wtp2014 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:23] PROBLEM - Ensure local MW versions match expected deployment on mw2244 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:24] PROBLEM - Ensure local MW versions match expected deployment on wtp2020 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:24] PROBLEM - Ensure local MW versions match expected deployment on snapshot1009 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:25] PROBLEM - Ensure local MW versions match expected deployment on mw1412 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:25] PROBLEM - Ensure local MW versions match expected deployment on mw1369 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:26] PROBLEM - Ensure local MW versions match expected 
deployment on mw1374 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:26] PROBLEM - Ensure local MW versions match expected deployment on mw2280 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:27] PROBLEM - Ensure local MW versions match expected deployment on mw2267 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:27] PROBLEM - Ensure local MW versions match expected deployment on mw2248 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:28] PROBLEM - Ensure local MW versions match expected deployment on wtp1025 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:28] PROBLEM - Ensure local MW versions match expected deployment on mw1298 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:29] PROBLEM - Ensure local MW versions match expected deployment on mw1313 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:29] PROBLEM - Ensure local MW versions match expected deployment on mw1290 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:30] PROBLEM - Ensure local MW versions match expected deployment on cloudweb2001-dev is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:30] PROBLEM - Ensure local MW versions match expected deployment on mw2372 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:31] PROBLEM - Ensure local MW versions match expected deployment on wtp2008 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:08:31] PROBLEM - Ensure local MW versions match expected deployment on wtp2007 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:32] PROBLEM - Ensure local MW versions match expected deployment on mw2286 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:32] PROBLEM - Ensure local MW versions match expected deployment on mw2218 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:33] <_joe_> uhhh [09:08:33] PROBLEM - Ensure local MW versions match expected deployment on mw2275 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:34] PROBLEM - Ensure local MW versions match expected deployment on wtp2001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:35] !log elukey@cumin1001 END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) [09:08:38] PROBLEM - Ensure local MW versions match expected deployment on mw1309 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:08:42] PROBLEM - Ensure local MW versions match expected deployment on mw1408 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:42] PROBLEM - Ensure local MW versions match expected deployment on mw1398 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW versions match expected deployment on mw1358 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW 
versions match expected deployment on mw1335 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW versions match expected deployment on mw1285 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW versions match expected deployment on mw1261 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW versions match expected deployment on parse2006 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW versions match expected deployment on mw2302 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:44] PROBLEM - Ensure local MW versions match expected deployment on mw2327 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:45] PROBLEM - Ensure local MW versions match expected deployment on mw2314 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:45] PROBLEM - Ensure local MW versions match expected deployment on mw2232 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:46] PROBLEM - Ensure local MW versions match expected deployment on mw2221 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:50] PROBLEM - Ensure local MW versions match expected deployment on mw1392 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:50] PROBLEM - Ensure local MW versions match expected deployment on mw1357 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:08:50] PROBLEM - Ensure local MW versions match expected deployment on mw1371 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:50] PROBLEM - Ensure local MW versions match expected deployment on mw1375 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:50] <_joe_> sigh, this alert goes off way too often [09:08:52] PROBLEM - Ensure local MW versions match expected deployment on mw2321 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:52] PROBLEM - Ensure local MW versions match expected deployment on mw2293 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:52] PROBLEM - Ensure local MW versions match expected deployment on mw2271 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:54] PROBLEM - Ensure local MW versions match expected deployment on mw1411 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:54] PROBLEM - Ensure local MW versions match expected deployment on mw1410 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:54] PROBLEM - Ensure local MW versions match expected deployment on mw1351 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:56] PROBLEM - Ensure local MW versions match expected deployment on mw1303 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:56] PROBLEM - Ensure local MW versions match expected deployment on mw1267 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:58] 
PROBLEM - Ensure local MW versions match expected deployment on mw2324 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:08:58] PROBLEM - Ensure local MW versions match expected deployment on mw2337 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:00] PROBLEM - Ensure local MW versions match expected deployment on wtp1032 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:00] PROBLEM - Ensure local MW versions match expected deployment on wtp1029 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:00] PROBLEM - Ensure local MW versions match expected deployment on mw1330 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:00] PROBLEM - Ensure local MW versions match expected deployment on mw1305 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:00] PROBLEM - Ensure local MW versions match expected deployment on wtp1028 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:00] PROBLEM - Ensure local MW versions match expected deployment on parse2020 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:02] PROBLEM - Ensure local MW versions match expected deployment on mw2313 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:02] PROBLEM - Ensure local MW versions match expected deployment on mw1354 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:02] PROBLEM - Ensure local MW versions match expected deployment on mw1381 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:09:08] PROBLEM - Ensure local MW versions match expected deployment on mw2239 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:08] PROBLEM - Ensure local MW versions match expected deployment on mw2236 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:08] PROBLEM - Ensure local MW versions match expected deployment on mw2254 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:10] PROBLEM - Ensure local MW versions match expected deployment on mw2367 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:10] PROBLEM - Ensure local MW versions match expected deployment on mw2353 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:10] PROBLEM - Ensure local MW versions match expected deployment on mw2350 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:12] PROBLEM - Ensure local MW versions match expected deployment on wtp2013 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:12] PROBLEM - Ensure local MW versions match expected deployment on mw2217 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:12] PROBLEM - Ensure local MW versions match expected deployment on wtp2004 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:25] PROBLEM - Ensure local MW versions match expected deployment on mw1295 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:30] PROBLEM - Ensure local MW versions match expected deployment 
on mw1394 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:30] PROBLEM - Ensure local MW versions match expected deployment on mw1400 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:30] PROBLEM - Ensure local MW versions match expected deployment on mw1399 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on mw1312 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on snapshot1006 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on parse2002 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on mw2331 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on mw2296 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on mw2355 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:32] PROBLEM - Ensure local MW versions match expected deployment on mw2287 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:36] PROBLEM - Ensure local MW versions match expected deployment on parse2003 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers 
[09:09:36] PROBLEM - Ensure local MW versions match expected deployment on mw2330 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:36] PROBLEM - Ensure local MW versions match expected deployment on mw2336 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:36] PROBLEM - Ensure local MW versions match expected deployment on mw2376 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:36] PROBLEM - Ensure local MW versions match expected deployment on mw2238 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:38] PROBLEM - Ensure local MW versions match expected deployment on mw1406 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:38] PROBLEM - Ensure local MW versions match expected deployment on mw1345 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:38] PROBLEM - Ensure local MW versions match expected deployment on mw1311 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:38] PROBLEM - Ensure local MW versions match expected deployment on mw1263 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:38] PROBLEM - Ensure local MW versions match expected deployment on mw1327 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:38] PROBLEM - Ensure local MW versions match expected deployment on mw1319 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:39] PROBLEM - Ensure local MW versions match expected deployment on mw1334 is CRITICAL: CRITICAL: 3 mismatched 
wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:39] PROBLEM - Ensure local MW versions match expected deployment on mw1288 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:40] PROBLEM - Ensure local MW versions match expected deployment on mw1270 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:40] PROBLEM - Ensure local MW versions match expected deployment on mw1336 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:41] PROBLEM - Ensure local MW versions match expected deployment on mw1326 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:41] PROBLEM - Ensure local MW versions match expected deployment on mw1343 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:42] PROBLEM - Ensure local MW versions match expected deployment on mw2317 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:42] PROBLEM - Ensure local MW versions match expected deployment on mw2333 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:43] PROBLEM - Ensure local MW versions match expected deployment on mw2309 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:43] PROBLEM - Ensure local MW versions match expected deployment on mw2318 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:44] PROBLEM - Ensure local MW versions match expected deployment on mw2371 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:44] PROBLEM - Ensure local MW versions match expected 
deployment on mw2351 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:45] PROBLEM - Ensure local MW versions match expected deployment on mw2252 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:45] PROBLEM - Ensure local MW versions match expected deployment on mw2246 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:46] PROBLEM - Ensure local MW versions match expected deployment on mw2277 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:46] PROBLEM - Ensure local MW versions match expected deployment on mw2274 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:47] PROBLEM - Ensure local MW versions match expected deployment on mwdebug2001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:47] PROBLEM - Ensure local MW versions match expected deployment on mw2263 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:48] PROBLEM - Ensure local MW versions match expected deployment on mw1405 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:48] PROBLEM - Ensure local MW versions match expected deployment on mw1403 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:49] PROBLEM - Ensure local MW versions match expected deployment on mw1391 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:54] PROBLEM - Ensure local MW versions match expected deployment on mw1380 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:09:54] PROBLEM - Ensure local MW versions match expected deployment on wtp1042 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:54] PROBLEM - Ensure local MW versions match expected deployment on wtp2016 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:56] PROBLEM - Ensure local MW versions match expected deployment on snapshot1010 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:58] PROBLEM - Ensure local MW versions match expected deployment on mw1413 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:58] PROBLEM - Ensure local MW versions match expected deployment on mw1355 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:09:58] PROBLEM - Ensure local MW versions match expected deployment on mw1384 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:02] PROBLEM - Ensure local MW versions match expected deployment on wtp1035 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:02] PROBLEM - Ensure local MW versions match expected deployment on mw2352 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:02] PROBLEM - Ensure local MW versions match expected deployment on mw2261 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:10] PROBLEM - Ensure local MW versions match expected deployment on mw1294 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:10] PROBLEM - Ensure local MW versions match expected 
deployment on mw1266 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:10] PROBLEM - Ensure local MW versions match expected deployment on mw1317 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:10] PROBLEM - Ensure local MW versions match expected deployment on mw1275 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on mw1307 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on parse2011 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on parse2013 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on mw2325 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on mw2354 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on mw2362 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:12] PROBLEM - Ensure local MW versions match expected deployment on mw2299 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:13] PROBLEM - Ensure local MW versions match expected deployment on mwdebug2002 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:10:16] PROBLEM - Ensure local MW versions match expected deployment on mw1352 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:16] PROBLEM - Ensure local MW versions match expected deployment on mw1277 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:16] PROBLEM - Ensure local MW versions match expected deployment on mw1331 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on mw2308 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on wtp2009 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on mw2224 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on mw2284 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on mw2285 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on mw2241 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:18] PROBLEM - Ensure local MW versions match expected deployment on mw2273 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:19] PROBLEM - Ensure local MW versions match expected deployment 
on mw2259 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:19] PROBLEM - Ensure local MW versions match expected deployment on mw2276 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:20] PROBLEM - Ensure local MW versions match expected deployment on wtp1041 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:20] PROBLEM - Ensure local MW versions match expected deployment on wtp1038 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:22] PROBLEM - Ensure local MW versions match expected deployment on wtp1047 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:22] PROBLEM - Ensure local MW versions match expected deployment on mw1337 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:22] PROBLEM - Ensure local MW versions match expected deployment on mw1344 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:28] PROBLEM - Ensure local MW versions match expected deployment on mw1333 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:28] PROBLEM - Ensure local MW versions match expected deployment on mw1284 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:28] PROBLEM - Ensure local MW versions match expected deployment on mw1268 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:28] PROBLEM - Ensure local MW versions match expected deployment on mw2366 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:28] 
PROBLEM - Ensure local MW versions match expected deployment on mw2358 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:28] PROBLEM - Ensure local MW versions match expected deployment on wtp2003 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:30] PROBLEM - Ensure local MW versions match expected deployment on scandium is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:34] PROBLEM - Ensure local MW versions match expected deployment on mw1322 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:34] PROBLEM - Ensure local MW versions match expected deployment on mw1338 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:34] PROBLEM - Ensure local MW versions match expected deployment on mw1299 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:34] PROBLEM - Ensure local MW versions match expected deployment on parse2010 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:35] PROBLEM - Ensure local MW versions match expected deployment on mw2365 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:35] PROBLEM - Ensure local MW versions match expected deployment on mw2304 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:35] PROBLEM - Ensure local MW versions match expected deployment on mw2361 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:35] PROBLEM - Ensure local MW versions match expected deployment on mw2312 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:10:35] PROBLEM - Ensure local MW versions match expected deployment on mw2251 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:36] PROBLEM - Ensure local MW versions match expected deployment on mw2253 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:36] PROBLEM - Ensure local MW versions match expected deployment on mw2281 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:37] PROBLEM - Ensure local MW versions match expected deployment on mw2215 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:37] PROBLEM - Ensure local MW versions match expected deployment on mw2222 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:38] PROBLEM - Ensure local MW versions match expected deployment on mw2262 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:38] PROBLEM - Ensure local MW versions match expected deployment on mw2270 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:40] PROBLEM - Ensure local MW versions match expected deployment on mw1388 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:40] PROBLEM - Ensure local MW versions match expected deployment on mw1376 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:40] PROBLEM - Ensure local MW versions match expected deployment on mw1365 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:40] PROBLEM - Ensure local MW versions match expected deployment 
on mw1340 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:41] PROBLEM - Ensure local MW versions match expected deployment on mw1264 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:42] PROBLEM - Ensure local MW versions match expected deployment on mw1409 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:42] PROBLEM - Ensure local MW versions match expected deployment on mw1324 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:42] PROBLEM - Ensure local MW versions match expected deployment on mw1286 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:43] PROBLEM - Ensure local MW versions match expected deployment on mw1300 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:43] PROBLEM - Ensure local MW versions match expected deployment on parse2018 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:44] PROBLEM - Ensure local MW versions match expected deployment on mw2305 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:44] PROBLEM - Ensure local MW versions match expected deployment on mw2364 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:45] PROBLEM - Ensure local MW versions match expected deployment on mw2283 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:48] PROBLEM - Ensure local MW versions match expected deployment on wtp1030 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:48] 
PROBLEM - Ensure local MW versions match expected deployment on mwdebug1001 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:50] PROBLEM - Ensure local MW versions match expected deployment on wtp1046 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:50] PROBLEM - Ensure local MW versions match expected deployment on mw2319 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:50] PROBLEM - Ensure local MW versions match expected deployment on wtp2015 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:50] PROBLEM - Ensure local MW versions match expected deployment on wtp2011 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:50] PROBLEM - Ensure local MW versions match expected deployment on mw2233 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:50] PROBLEM - Ensure local MW versions match expected deployment on mw2243 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:52] PROBLEM - Ensure local MW versions match expected deployment on wtp1034 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:52] PROBLEM - Ensure local MW versions match expected deployment on wtp1048 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:52] PROBLEM - Ensure local MW versions match expected deployment on wtp1026 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:52] PROBLEM - Ensure local MW versions match expected deployment on wtp2010 is CRITICAL: CRITICAL: 3 mismatched 
wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:52] PROBLEM - Ensure local MW versions match expected deployment on mw2249 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:10:58] PROBLEM - Ensure local MW versions match expected deployment on labweb1002 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:00] PROBLEM - Ensure local MW versions match expected deployment on mw1368 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:04] PROBLEM - Ensure local MW versions match expected deployment on wtp1043 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:04] PROBLEM - Ensure local MW versions match expected deployment on mw2290 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:05] PROBLEM - Ensure local MW versions match expected deployment on mw1342 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:05] PROBLEM - Ensure local MW versions match expected deployment on mw1276 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match expected deployment on mw1332 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match expected deployment on wtp1040 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match expected deployment on mw1316 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match 
expected deployment on mw1310 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match expected deployment on mw1325 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match expected deployment on mw1269 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:12] PROBLEM - Ensure local MW versions match expected deployment on mw1262 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:13] PROBLEM - Ensure local MW versions match expected deployment on mw1271 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:13] PROBLEM - Ensure local MW versions match expected deployment on mw1304 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:14] PROBLEM - Ensure local MW versions match expected deployment on parse2014 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:14] PROBLEM - Ensure local MW versions match expected deployment on mw2297 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:15] PROBLEM - Ensure local MW versions match expected deployment on mw2298 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:16] PROBLEM - Ensure local MW versions match expected deployment on mw2295 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:16] PROBLEM - Ensure local MW versions match expected deployment on mw2301 is CRITICAL: CRITICAL: 3 mismatched wikiversions 
https://wikitech.wikimedia.org/wiki/Application_servers [09:11:16] PROBLEM - Ensure local MW versions match expected deployment on mw2257 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:17] PROBLEM - Ensure local MW versions match expected deployment on mw2240 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:17] PROBLEM - Ensure local MW versions match expected deployment on mw2227 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:18] PROBLEM - Ensure local MW versions match expected deployment on mw2242 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:20] PROBLEM - Ensure local MW versions match expected deployment on parse2017 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:11:25] PROBLEM - Ensure local MW versions match expected deployment on mw1297 is CRITICAL: CRITICAL: 3 mismatched wikiversions https://wikitech.wikimedia.org/wiki/Application_servers [09:13:18] Operations, netbox: Netbox: fill network topology - https://phabricator.wikimedia.org/T205897 (ayounsi) [09:13:30] !log oblivian@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' . [09:13:31] !log oblivian@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . 
[09:13:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:13:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:17:54] (CR) Gehel: [C: -1] elasticsearch: Store which dcs to query in class (2 comments) [software/spicerack] - https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) (owner: Ryan Kemper) [09:19:28] RECOVERY - Rate of JVM GC Old generation-s runs - elastic2029-production-search-psi-codfw on elastic2029 is OK: (C)100 gt (W)80 gt 55.93 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-codfw&var-instance=elastic2029&panelId=37 [09:22:11] !log Stop MySQL on s5 and s8 eqiad primary master - lag will show up on labsdb hosts T261455 [09:22:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:22:17] T261455: New Date - Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C2 and C3 - https://phabricator.wikimedia.org/T261455 [09:22:54] Operations, ops-eqiad, DBA, DC-Ops: New Date - Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C2 and C3 - https://phabricator.wikimedia.org/T261455 (Marostegui) >>! In T261455#6458359, @Marostegui wrote: >>>! In T261455#6423007, @Marostegui wrote: >> Please take extra care with db1087, db1100 and d... 
[09:23:03] (CR) Giuseppe Lavagetto: [C: +2] wikifeeds: use the service proxy in staging [deployment-charts] - https://gerrit.wikimedia.org/r/626132 (https://phabricator.wikimedia.org/T255878) (owner: Giuseppe Lavagetto) [09:24:14] (Merged) jenkins-bot: wikifeeds: use the service proxy in staging [deployment-charts] - https://gerrit.wikimedia.org/r/626132 (https://phabricator.wikimedia.org/T255878) (owner: Giuseppe Lavagetto) [09:27:06] (PS11) Jbond: pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - https://gerrit.wikimedia.org/r/627272 [09:28:00] (CR) jerkins-bot: [V: -1] pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - https://gerrit.wikimedia.org/r/627272 (owner: Jbond) [09:29:24] (PS1) Giuseppe Lavagetto: Bump wikifeeds chart to pick up the TLS upgrades. [deployment-charts] - https://gerrit.wikimedia.org/r/627430 [09:31:05] (PS1) JMeybohm: lvs: Rename cxserver-https to cxserver [puppet] - https://gerrit.wikimedia.org/r/627431 (https://phabricator.wikimedia.org/T255879) [09:31:07] (PS1) JMeybohm: lvs: Remove cxserver non-TLS endpoint from LVS 1/3 [puppet] - https://gerrit.wikimedia.org/r/627432 (https://phabricator.wikimedia.org/T255879) [09:31:09] (PS1) JMeybohm: lvs: Remove cxserver non-TLS endpoint from LVS 2/3 [puppet] - https://gerrit.wikimedia.org/r/627433 (https://phabricator.wikimedia.org/T255879) [09:31:11] (PS1) JMeybohm: lvs: Remove cxserver non-TLS endpoint from LVS 3/3 [puppet] - https://gerrit.wikimedia.org/r/627434 (https://phabricator.wikimedia.org/T255879) [09:31:44] Operations, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Move cxserver to use TLS only - https://phabricator.wikimedia.org/T255879 (JMeybohm) a:JMeybohm [09:31:46] (CR) Giuseppe Lavagetto: [C: +2] Bump wikifeeds chart to pick up the TLS upgrades. 
[deployment-charts] - https://gerrit.wikimedia.org/r/627430 (owner: Giuseppe Lavagetto) [09:33:05] (Merged) jenkins-bot: Bump wikifeeds chart to pick up the TLS upgrades. [deployment-charts] - https://gerrit.wikimedia.org/r/627430 (owner: Giuseppe Lavagetto) [09:36:10] (PS12) Jbond: pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - https://gerrit.wikimedia.org/r/627272 [09:46:55] PROBLEM - MariaDB Replica Lag: m2 on db2078 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 8643.00 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [09:47:30] marostegui: ^ that yours? [09:47:40] PROBLEM - MariaDB Replica Lag: m2 on db1117 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 8727.00 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [09:48:04] kormat: that is from the otrs maintenance, I will silence those. I did yesterday for 24h [09:48:08] jynus: for awareness ^ [09:48:38] it should take 48 hours [09:48:44] so 24 hours more [09:49:02] yes, just gave them 24h more [09:49:08] RECOVERY - Ensure local MW versions match expected deployment on mw1401 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:08] RECOVERY - Ensure local MW versions match expected deployment on mw1407 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:08] RECOVERY - Ensure local MW versions match expected deployment on mw1402 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:08] RECOVERY - Ensure local MW versions match expected deployment on mw1389 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:08] RECOVERY - Ensure local MW versions match expected deployment on mw1404 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:12] 
RECOVERY - Ensure local MW versions match expected deployment on mw1362 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:12] RECOVERY - Ensure local MW versions match expected deployment on mw1350 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:12] RECOVERY - Ensure local MW versions match expected deployment on mw1320 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:16] RECOVERY - Ensure local MW versions match expected deployment on mw2225 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:20] RECOVERY - Ensure local MW versions match expected deployment on mw1328 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:20] RECOVERY - Ensure local MW versions match expected deployment on mw1346 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:25] RECOVERY - Ensure local MW versions match expected deployment on mw1386 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:25] RECOVERY - Ensure local MW versions match expected deployment on mw1377 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:25] RECOVERY - Ensure local MW versions match expected deployment on mw1373 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:25] RECOVERY - Ensure local MW versions match expected deployment on mw1383 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:25] RECOVERY - Ensure local MW versions match expected deployment on parse2001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:26] RECOVERY - Ensure local MW versions match expected deployment on parse2007 is OK: OKAY: 
wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:26] RECOVERY - Ensure local MW versions match expected deployment on mw2328 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:28] RECOVERY - Ensure local MW versions match expected deployment on mwmaint2001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:34] RECOVERY - Ensure local MW versions match expected deployment on mw1278 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:34] RECOVERY - Ensure local MW versions match expected deployment on mw1308 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:34] RECOVERY - Ensure local MW versions match expected deployment on mw1293 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:35] RECOVERY - Ensure local MW versions match expected deployment on mw1289 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:35] RECOVERY - Ensure local MW versions match expected deployment on mw1272 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:35] RECOVERY - Ensure local MW versions match expected deployment on snapshot1007 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:35] RECOVERY - Ensure local MW versions match expected deployment on mw2375 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:35] RECOVERY - Ensure local MW versions match expected deployment on mw2329 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:36] RECOVERY - Ensure local MW versions match expected deployment on mw2303 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers 
[09:49:38] RECOVERY - Ensure local MW versions match expected deployment on mw1393 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:42] RECOVERY - Ensure local MW versions match expected deployment on snapshot1008 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:42] RECOVERY - Ensure local MW versions match expected deployment on mw1349 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:42] RECOVERY - Ensure local MW versions match expected deployment on mwmaint1002 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:42] RECOVERY - Ensure local MW versions match expected deployment on wtp1039 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:42] RECOVERY - Ensure local MW versions match expected deployment on mw2368 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:42] RECOVERY - Ensure local MW versions match expected deployment on mw2315 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:43] RECOVERY - Ensure local MW versions match expected deployment on mw2288 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:43] RECOVERY - Ensure local MW versions match expected deployment on mw2231 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:43] RECOVERY - Ensure local MW versions match expected deployment on mw2268 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on parse2004 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on 
mw2338 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on mw2369 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on mw2373 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on mw2356 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on mw2311 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:50] RECOVERY - Ensure local MW versions match expected deployment on wtp2020 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:51] RECOVERY - Ensure local MW versions match expected deployment on wtp2014 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:51] RECOVERY - Ensure local MW versions match expected deployment on mw2244 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:52] RECOVERY - Ensure local MW versions match expected deployment on wtp2019 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:52] RECOVERY - Ensure local MW versions match expected deployment on snapshot1009 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:53] RECOVERY - Ensure local MW versions match expected deployment on mw1412 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:53] RECOVERY - Ensure local MW versions match expected deployment on mw1374 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:49:54] RECOVERY - Ensure local MW versions match expected deployment on mw1369 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on mw2280 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on mw2267 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on mw2248 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on wtp1025 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on mw1313 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on mw1290 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:58] RECOVERY - Ensure local MW versions match expected deployment on mw1298 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:49:59] RECOVERY - Ensure local MW versions match expected deployment on cloudweb2001-dev is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:00] RECOVERY - Ensure local MW versions match expected deployment on mw2372 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:00] RECOVERY - Ensure local MW versions match expected deployment on wtp2008 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:00] RECOVERY - Ensure 
local MW versions match expected deployment on mw2286 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:01] RECOVERY - Ensure local MW versions match expected deployment on wtp2007 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:01] RECOVERY - Ensure local MW versions match expected deployment on mw2218 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:02] RECOVERY - Ensure local MW versions match expected deployment on mw2275 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:06] RECOVERY - Ensure local MW versions match expected deployment on wtp2001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:08] RECOVERY - Ensure local MW versions match expected deployment on mw1309 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:10] RECOVERY - Ensure local MW versions match expected deployment on mw1408 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:12] RECOVERY - Ensure local MW versions match expected deployment on mw1398 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:12] RECOVERY - Ensure local MW versions match expected deployment on mw1358 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:14] RECOVERY - Ensure local MW versions match expected deployment on mw1335 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:14] RECOVERY - Ensure local MW versions match expected deployment on mw1285 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:14] RECOVERY - Ensure local MW versions match expected deployment on mw1261 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:50:14] RECOVERY - Ensure local MW versions match expected deployment on parse2006 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:15] RECOVERY - Ensure local MW versions match expected deployment on mw2327 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:15] RECOVERY - Ensure local MW versions match expected deployment on mw2314 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:15] RECOVERY - Ensure local MW versions match expected deployment on mw2302 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:15] RECOVERY - Ensure local MW versions match expected deployment on mw2221 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:16] RECOVERY - Ensure local MW versions match expected deployment on mw2232 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW versions match expected deployment on mw1392 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW versions match expected deployment on mw1375 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW versions match expected deployment on mw1371 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW versions match expected deployment on mw1357 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW versions match expected deployment on mw2321 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW 
versions match expected deployment on mw2293 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:22] RECOVERY - Ensure local MW versions match expected deployment on mw2271 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:24] RECOVERY - Ensure local MW versions match expected deployment on mw1411 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:24] RECOVERY - Ensure local MW versions match expected deployment on mw1410 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:24] RECOVERY - Ensure local MW versions match expected deployment on mw1351 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:28] RECOVERY - Ensure local MW versions match expected deployment on mw1303 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:28] RECOVERY - Ensure local MW versions match expected deployment on mw1267 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:28] RECOVERY - Ensure local MW versions match expected deployment on mw2337 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:28] RECOVERY - Ensure local MW versions match expected deployment on mw2324 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on wtp1032 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on wtp1028 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on wtp1029 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on mw1305 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on mw1330 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on parse2020 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:34] RECOVERY - Ensure local MW versions match expected deployment on mw2313 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:35] RECOVERY - Ensure local MW versions match expected deployment on mw1381 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:35] RECOVERY - Ensure local MW versions match expected deployment on mw1354 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:40] RECOVERY - Ensure local MW versions match expected deployment on mw2239 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:40] RECOVERY - Ensure local MW versions match expected deployment on mw2254 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:40] RECOVERY - Ensure local MW versions match expected deployment on mw2236 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:40] RECOVERY - Ensure local MW versions match expected deployment on mw2350 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:40] RECOVERY - Ensure local MW versions match expected deployment on mw2367 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:40] RECOVERY - Ensure local MW 
versions match expected deployment on mw2353 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:44] RECOVERY - Ensure local MW versions match expected deployment on wtp2004 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:44] RECOVERY - Ensure local MW versions match expected deployment on wtp2013 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:44] RECOVERY - Ensure local MW versions match expected deployment on mw2217 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:54] RECOVERY - Ensure local MW versions match expected deployment on mw1295 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:50:58] RECOVERY - Ensure local MW versions match expected deployment on mw1394 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:00] RECOVERY - Ensure local MW versions match expected deployment on mw1400 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:00] RECOVERY - Ensure local MW versions match expected deployment on mw1399 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:04] RECOVERY - Ensure local MW versions match expected deployment on mw1312 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:04] RECOVERY - Ensure local MW versions match expected deployment on snapshot1006 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:04] RECOVERY - Ensure local MW versions match expected deployment on parse2002 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:04] RECOVERY - Ensure local MW versions match expected deployment on mw2355 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:51:05] RECOVERY - Ensure local MW versions match expected deployment on mw2331 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:05] RECOVERY - Ensure local MW versions match expected deployment on mw2296 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:05] RECOVERY - Ensure local MW versions match expected deployment on mw2287 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on parse2003 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on mw2330 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on mw2376 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on mw2336 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on mw2238 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on mw1406 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:10] RECOVERY - Ensure local MW versions match expected deployment on mw1311 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:11] RECOVERY - Ensure local MW versions match expected deployment on mw1345 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:11] RECOVERY - Ensure local MW 
versions match expected deployment on mw1343 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:12] RECOVERY - Ensure local MW versions match expected deployment on mw1319 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:12] RECOVERY - Ensure local MW versions match expected deployment on mw1263 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:13] RECOVERY - Ensure local MW versions match expected deployment on mw1326 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:13] RECOVERY - Ensure local MW versions match expected deployment on mw1336 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:14] RECOVERY - Ensure local MW versions match expected deployment on mw1334 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:14] RECOVERY - Ensure local MW versions match expected deployment on mw1327 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:15] RECOVERY - Ensure local MW versions match expected deployment on mw1288 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:15] RECOVERY - Ensure local MW versions match expected deployment on mw1270 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:16] RECOVERY - Ensure local MW versions match expected deployment on mw2309 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:16] RECOVERY - Ensure local MW versions match expected deployment on mw2351 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:17] RECOVERY - Ensure local MW versions match expected deployment on mw2318 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:51:17] RECOVERY - Ensure local MW versions match expected deployment on mw2317 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:18] RECOVERY - Ensure local MW versions match expected deployment on mw2333 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:18] RECOVERY - Ensure local MW versions match expected deployment on mw2246 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:19] RECOVERY - Ensure local MW versions match expected deployment on mw2371 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:19] RECOVERY - Ensure local MW versions match expected deployment on mw2277 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:20] RECOVERY - Ensure local MW versions match expected deployment on mw2274 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:20] RECOVERY - Ensure local MW versions match expected deployment on mw2252 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:21] RECOVERY - Ensure local MW versions match expected deployment on mw2263 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:21] RECOVERY - Ensure local MW versions match expected deployment on mwdebug2001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:22] RECOVERY - Ensure local MW versions match expected deployment on mw1391 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:22] RECOVERY - Ensure local MW versions match expected deployment on mw1405 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:23] RECOVERY - Ensure local 
MW versions match expected deployment on mw1403 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:24] RECOVERY - Ensure local MW versions match expected deployment on mw1380 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:28] RECOVERY - Ensure local MW versions match expected deployment on wtp1042 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:28] RECOVERY - Ensure local MW versions match expected deployment on wtp2016 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:28] RECOVERY - Ensure local MW versions match expected deployment on snapshot1010 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:28] RECOVERY - Ensure local MW versions match expected deployment on mw1413 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:30] RECOVERY - Ensure local MW versions match expected deployment on mw1384 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:30] RECOVERY - Ensure local MW versions match expected deployment on mw1355 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:34] RECOVERY - Ensure local MW versions match expected deployment on wtp1035 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:34] RECOVERY - Ensure local MW versions match expected deployment on mw2352 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:34] RECOVERY - Ensure local MW versions match expected deployment on mw2261 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on mw1307 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on mw1317 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on mw1294 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on mw1275 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on mw1266 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on parse2011 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:42] RECOVERY - Ensure local MW versions match expected deployment on parse2013 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:46] RECOVERY - Ensure local MW versions match expected deployment on mw2299 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:46] RECOVERY - Ensure local MW versions match expected deployment on mw2354 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:46] RECOVERY - Ensure local MW versions match expected deployment on mw2362 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:46] RECOVERY - Ensure local MW versions match expected deployment on mw2325 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:46] RECOVERY - Ensure local MW versions match expected deployment on mwdebug2002 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:48] RECOVERY - Ensure 
local MW versions match expected deployment on mw1352 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:48] RECOVERY - Ensure local MW versions match expected deployment on mw1277 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:48] RECOVERY - Ensure local MW versions match expected deployment on mw1331 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on wtp2009 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on mw2259 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on mw2285 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on mw2308 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on mw2224 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on mw2284 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:50] RECOVERY - Ensure local MW versions match expected deployment on mw2241 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:51] RECOVERY - Ensure local MW versions match expected deployment on mw2276 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:51] RECOVERY - Ensure local MW versions match expected deployment on mw2273 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:51:52] RECOVERY - Ensure local MW versions match expected deployment on wtp1038 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:52] RECOVERY - Ensure local MW versions match expected deployment on wtp1047 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:53] RECOVERY - Ensure local MW versions match expected deployment on wtp1041 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:53] RECOVERY - Ensure local MW versions match expected deployment on mw1344 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:51:54] RECOVERY - Ensure local MW versions match expected deployment on mw1337 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:00] RECOVERY - Ensure local MW versions match expected deployment on scandium is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:00] RECOVERY - Ensure local MW versions match expected deployment on mw1333 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:00] RECOVERY - Ensure local MW versions match expected deployment on mw1284 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:00] RECOVERY - Ensure local MW versions match expected deployment on mw1268 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:02] RECOVERY - Ensure local MW versions match expected deployment on mw2366 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:02] RECOVERY - Ensure local MW versions match expected deployment on mw2358 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:02] RECOVERY - Ensure local 
MW versions match expected deployment on wtp2003 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on mw1338 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on mw1322 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on mw1299 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on parse2010 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on mw2304 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on mw2365 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:06] RECOVERY - Ensure local MW versions match expected deployment on mw2312 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:07] RECOVERY - Ensure local MW versions match expected deployment on mw2361 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:07] RECOVERY - Ensure local MW versions match expected deployment on mw2253 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:08] RECOVERY - Ensure local MW versions match expected deployment on mw2251 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:08] RECOVERY - Ensure local MW versions match expected deployment on mw2281 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:52:09] RECOVERY - Ensure local MW versions match expected deployment on mw2262 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:09] RECOVERY - Ensure local MW versions match expected deployment on mw2215 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:10] RECOVERY - Ensure local MW versions match expected deployment on mw2222 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:10] RECOVERY - Ensure local MW versions match expected deployment on mw2270 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:11] RECOVERY - Ensure local MW versions match expected deployment on mw1388 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:11] RECOVERY - Ensure local MW versions match expected deployment on mw1365 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:12] RECOVERY - Ensure local MW versions match expected deployment on mw1376 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:12] RECOVERY - Ensure local MW versions match expected deployment on mw1264 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:13] RECOVERY - Ensure local MW versions match expected deployment on mw1340 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:14] RECOVERY - Ensure local MW versions match expected deployment on mw1409 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:14] RECOVERY - Ensure local MW versions match expected deployment on mw1324 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:14] RECOVERY - Ensure local MW 
versions match expected deployment on mw1286 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:15] RECOVERY - Ensure local MW versions match expected deployment on mw1300 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:15] RECOVERY - Ensure local MW versions match expected deployment on parse2018 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:18] RECOVERY - Ensure local MW versions match expected deployment on mw2305 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:18] RECOVERY - Ensure local MW versions match expected deployment on mw2364 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:18] RECOVERY - Ensure local MW versions match expected deployment on mw2283 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:20] RECOVERY - Ensure local MW versions match expected deployment on mwdebug1001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:20] RECOVERY - Ensure local MW versions match expected deployment on wtp1030 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:20] RECOVERY - Ensure local MW versions match expected deployment on wtp1046 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:20] RECOVERY - Ensure local MW versions match expected deployment on mw2319 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:20] RECOVERY - Ensure local MW versions match expected deployment on mw2233 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:22] RECOVERY - Ensure local MW versions match expected deployment on wtp2011 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:52:22] RECOVERY - Ensure local MW versions match expected deployment on wtp2015 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:22] RECOVERY - Ensure local MW versions match expected deployment on mw2243 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:28] RECOVERY - Ensure local MW versions match expected deployment on wtp1026 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:28] RECOVERY - Ensure local MW versions match expected deployment on wtp1034 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:28] RECOVERY - Ensure local MW versions match expected deployment on wtp1048 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:28] RECOVERY - Ensure local MW versions match expected deployment on wtp2010 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:28] RECOVERY - Ensure local MW versions match expected deployment on mw2249 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:32] RECOVERY - Ensure local MW versions match expected deployment on labweb1002 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:38] RECOVERY - Ensure local MW versions match expected deployment on mw1368 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:38] RECOVERY - Ensure local MW versions match expected deployment on wtp1043 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:38] RECOVERY - Ensure local MW versions match expected deployment on mw2290 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:38] RECOVERY - Ensure 
local MW versions match expected deployment on mw1342 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:38] RECOVERY - Ensure local MW versions match expected deployment on mw1276 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on wtp1040 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on mw1316 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on mw1325 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on mw1332 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on mw1310 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on mw1262 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:48] RECOVERY - Ensure local MW versions match expected deployment on mw1304 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:49] RECOVERY - Ensure local MW versions match expected deployment on mw1271 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:49] RECOVERY - Ensure local MW versions match expected deployment on mw1269 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:50] RECOVERY - Ensure local MW versions match expected deployment on parse2014 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:52:50] RECOVERY - Ensure local MW versions match expected deployment on mw2298 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:51] RECOVERY - Ensure local MW versions match expected deployment on mw2297 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:51] RECOVERY - Ensure local MW versions match expected deployment on mw2301 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:52] RECOVERY - Ensure local MW versions match expected deployment on mw2295 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:52] RECOVERY - Ensure local MW versions match expected deployment on mw2242 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:53] RECOVERY - Ensure local MW versions match expected deployment on mw2240 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:53] RECOVERY - Ensure local MW versions match expected deployment on mw2257 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:54] RECOVERY - Ensure local MW versions match expected deployment on mw2227 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:52:54] RECOVERY - Ensure local MW versions match expected deployment on parse2017 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:00] RECOVERY - Ensure local MW versions match expected deployment on mw1297 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:04] RECOVERY - Ensure local MW versions match expected deployment on mw1318 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:04] RECOVERY - Ensure local MW 
versions match expected deployment on mw2300 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:04] RECOVERY - Ensure local MW versions match expected deployment on mw1397 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:04] RECOVERY - Ensure local MW versions match expected deployment on wtp1044 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:05] RECOVERY - Ensure local MW versions match expected deployment on parse2008 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:05] RECOVERY - Ensure local MW versions match expected deployment on wtp2006 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:05] RECOVERY - Ensure local MW versions match expected deployment on mw2278 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:08] RECOVERY - Ensure local MW versions match expected deployment on mw1341 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:08] RECOVERY - Ensure local MW versions match expected deployment on mw1314 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:08] RECOVERY - Ensure local MW versions match expected deployment on mw1329 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:08] RECOVERY - Ensure local MW versions match expected deployment on parse2019 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:10] RECOVERY - Ensure local MW versions match expected deployment on mw2339 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:10] RECOVERY - Ensure local MW versions match expected deployment on mw2226 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:53:10] RECOVERY - Ensure local MW versions match expected deployment on mw2357 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:10] RECOVERY - Ensure local MW versions match expected deployment on mw2255 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:18] RECOVERY - Ensure local MW versions match expected deployment on mw1385 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:18] RECOVERY - Ensure local MW versions match expected deployment on mw1370 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:18] RECOVERY - Ensure local MW versions match expected deployment on mw1364 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:18] RECOVERY - Ensure local MW versions match expected deployment on mw1296 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW versions match expected deployment on mw1361 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW versions match expected deployment on mw1363 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW versions match expected deployment on mw1339 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW versions match expected deployment on mw1315 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW versions match expected deployment on mw1279 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW 
versions match expected deployment on mw2326 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:22] RECOVERY - Ensure local MW versions match expected deployment on mw2292 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:23] RECOVERY - Ensure local MW versions match expected deployment on mw2282 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:23] RECOVERY - Ensure local MW versions match expected deployment on mw2289 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:24] RECOVERY - Ensure local MW versions match expected deployment on mw2216 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:24] RECOVERY - Ensure local MW versions match expected deployment on mw2264 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:26] RECOVERY - Ensure local MW versions match expected deployment on mw1396 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:26] RECOVERY - Ensure local MW versions match expected deployment on mw1366 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:26] RECOVERY - Ensure local MW versions match expected deployment on mw2334 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:26] RECOVERY - Ensure local MW versions match expected deployment on mw2307 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:28] RECOVERY - Ensure local MW versions match expected deployment on mw2265 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:28] RECOVERY - Ensure local MW versions match expected deployment on mw2266 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:53:28] RECOVERY - Ensure local MW versions match expected deployment on mw1387 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:30] RECOVERY - Ensure local MW versions match expected deployment on mw2272 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:30] RECOVERY - Ensure local MW versions match expected deployment on mw2245 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:34] RECOVERY - Ensure local MW versions match expected deployment on labweb1001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:34] RECOVERY - Ensure local MW versions match expected deployment on mw1382 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:35] RECOVERY - Ensure local MW versions match expected deployment on wtp1036 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:35] RECOVERY - Ensure local MW versions match expected deployment on wtp1037 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure local MW versions match expected deployment on parse2012 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure local MW versions match expected deployment on mw2316 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure local MW versions match expected deployment on mw2310 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure local MW versions match expected deployment on mw2323 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure 
local MW versions match expected deployment on mw2370 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure local MW versions match expected deployment on wtp2002 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:40] RECOVERY - Ensure local MW versions match expected deployment on wtp2017 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:50] RECOVERY - Ensure local MW versions match expected deployment on mw1353 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:50] RECOVERY - Ensure local MW versions match expected deployment on mw1372 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:50] RECOVERY - Ensure local MW versions match expected deployment on mw1379 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:50] RECOVERY - Ensure local MW versions match expected deployment on mw1356 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:50] RECOVERY - Ensure local MW versions match expected deployment on mw1367 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:50] RECOVERY - Ensure local MW versions match expected deployment on mw1378 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:52] RECOVERY - Ensure local MW versions match expected deployment on wtp1033 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:52] RECOVERY - Ensure local MW versions match expected deployment on parse2005 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:52] RECOVERY - Ensure local MW versions match expected deployment on mw2363 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:53:52] RECOVERY - Ensure local MW versions match expected deployment on mw2291 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:52] RECOVERY - Ensure local MW versions match expected deployment on mw2229 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:53] RECOVERY - Ensure local MW versions match expected deployment on mw2235 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:56] RECOVERY - Ensure local MW versions match expected deployment on mw1390 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:56] RECOVERY - Ensure local MW versions match expected deployment on mw1395 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:56] RECOVERY - Ensure local MW versions match expected deployment on mw1348 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:56] RECOVERY - Ensure local MW versions match expected deployment on mw2294 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:56] RECOVERY - Ensure local MW versions match expected deployment on mw2250 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:56] RECOVERY - Ensure local MW versions match expected deployment on mw2234 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:58] RECOVERY - Ensure local MW versions match expected deployment on wtp1045 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:58] RECOVERY - Ensure local MW versions match expected deployment on mw1265 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:58] RECOVERY - Ensure local MW 
versions match expected deployment on mw1302 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:53:58] (PS1) Alexandros Kosiaris: prometheus: Scrape k8s etcd nodes [puppet] - https://gerrit.wikimedia.org/r/627439 [09:54:00] RECOVERY - Ensure local MW versions match expected deployment on parse2009 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:00] RECOVERY - Ensure local MW versions match expected deployment on mw2374 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:00] RECOVERY - Ensure local MW versions match expected deployment on parse2015 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:00] RECOVERY - Ensure local MW versions match expected deployment on mw2332 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:00] RECOVERY - Ensure local MW versions match expected deployment on mw2335 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:04] RECOVERY - Ensure local MW versions match expected deployment on mwdebug1002 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:04] RECOVERY - Ensure local MW versions match expected deployment on wtp1027 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:04] RECOVERY - Ensure local MW versions match expected deployment on mw1323 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:04] RECOVERY - Ensure local MW versions match expected deployment on mw1301 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on wtp1031 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on snapshot1005 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on mw2322 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on mw2320 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on wtp2012 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on wtp2018 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:06] RECOVERY - Ensure local MW versions match expected deployment on mw2219 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:08] RECOVERY - Ensure local MW versions match expected deployment on mw2230 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:08] RECOVERY - Ensure local MW versions match expected deployment on mw2256 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:08] RECOVERY - Ensure local MW versions match expected deployment on mw2269 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:08] RECOVERY - Ensure local MW versions match expected deployment on mw2228 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:10] RECOVERY - Ensure local MW versions match expected deployment on mw1347 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:10] RECOVERY - Ensure 
local MW versions match expected deployment on mw1273 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:10] RECOVERY - Ensure local MW versions match expected deployment on mw1274 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:10] RECOVERY - Ensure local MW versions match expected deployment on mw1282 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:12] RECOVERY - Ensure local MW versions match expected deployment on mw2220 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:12] RECOVERY - Ensure local MW versions match expected deployment on mw2237 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:18] RECOVERY - Ensure local MW versions match expected deployment on parse2016 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:18] RECOVERY - Ensure local MW versions match expected deployment on mw2360 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:18] RECOVERY - Ensure local MW versions match expected deployment on deploy2001 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:18] RECOVERY - Ensure local MW versions match expected deployment on mw2247 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:24] RECOVERY - Ensure local MW versions match expected deployment on mw1359 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:36] RECOVERY - Ensure local MW versions match expected deployment on mw1321 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw1306 is OK: OKAY: wikiversions in sync 
https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw1287 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw1281 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw1283 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw2223 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw2258 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:40] RECOVERY - Ensure local MW versions match expected deployment on mw2260 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:41] RECOVERY - Ensure local MW versions match expected deployment on mw2279 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:54:51] Operations, ops-eqiad, DC-Ops: Physically move db1131 from B5 to C8 - https://phabricator.wikimedia.org/T262901 (Peachey88) [09:55:02] RECOVERY - Ensure local MW versions match expected deployment on mw2306 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [09:55:02] RECOVERY - Ensure local MW versions match expected deployment on mw2359 is OK: OKAY: wikiversions in sync https://wikitech.wikimedia.org/wiki/Application_servers [10:01:38] (PS1) Giuseppe Lavagetto: wikifeeds: re-add statsd config [deployment-charts] - https://gerrit.wikimedia.org/r/627441 [10:01:45] (CR) jerkins-bot: [V: -1] wikifeeds: re-add statsd config [deployment-charts] - 
10https://gerrit.wikimedia.org/r/627441 (owner: 10Giuseppe Lavagetto) [10:02:02] (03PS2) 10Giuseppe Lavagetto: wikifeeds: re-add statsd config [deployment-charts] - 10https://gerrit.wikimedia.org/r/627441 [10:03:52] (03PS1) 10Volans: wmcs: remove old unused records [dns] - 10https://gerrit.wikimedia.org/r/627442 (https://phabricator.wikimedia.org/T262863) [10:12:58] !log jayme@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [10:13:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:22:00] !log liw@deploy1001 rebuilt and synchronized wikiversions files: Revert "testwikiswikis to 1.36.0-wmf.9" [10:22:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:24:26] (03PS1) 10Lars Wirzenius: Revert "testwikis wikis to 1.36.0-wmf.9" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627469 [10:24:28] (03CR) 10Lars Wirzenius: [C: 03+2] Revert "testwikis wikis to 1.36.0-wmf.9" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627469 (owner: 10Lars Wirzenius) [10:24:30] (03PS1) 10Lars Wirzenius: testwikis wikis to 1.36.0-wmf.8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627470 [10:24:32] (03CR) 10Lars Wirzenius: [C: 03+2] testwikis wikis to 1.36.0-wmf.8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627470 (owner: 10Lars Wirzenius) [10:24:55] (03Merged) 10jenkins-bot: Revert "testwikis wikis to 1.36.0-wmf.9" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627469 (owner: 10Lars Wirzenius) [10:25:38] (03Merged) 10jenkins-bot: testwikis wikis to 1.36.0-wmf.8 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627470 (owner: 10Lars Wirzenius) [10:33:51] !log jayme@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . 
[10:33:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:36:12] (03CR) 10MSantos: [C: 03+2] wikifeeds: re-add statsd config [deployment-charts] - 10https://gerrit.wikimedia.org/r/627441 (owner: 10Giuseppe Lavagetto) [10:36:56] !log uploaded libxml2 2.9.1+dfsg1-5+deb8u8+wmf1 for jessie-wikimedia [10:36:57] <_joe_> jayme: I bumped the push-notifications chart version [10:36:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:37:06] <_joe_> removing an unused file [10:37:19] <_joe_> if you want to re-deploy to staging :) [10:37:24] _joe_: 😱 [10:37:27] <_joe_> well, once the change is merged [10:37:43] (03Merged) 10jenkins-bot: wikifeeds: re-add statsd config [deployment-charts] - 10https://gerrit.wikimedia.org/r/627441 (owner: 10Giuseppe Lavagetto) [10:38:49] sure .. effie ^^ [10:39:12] _joe_: oh [10:39:21] jayme: I will do it in a bit [10:48:49] !log oblivian@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . [10:48:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:50:01] (03PS13) 10Jbond: pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [10:54:57] !log mass update PDU SNMP community in LibreNMS - T246890 [10:55:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:55:47] 10Operations, 10GrowthExperiments-NewcomerTasks, 10Product-Infrastructure-Team-Backlog, 10serviceops: Service operations setup for Add a Link project - https://phabricator.wikimedia.org/T258978 (10kostajh) Meeting 14/09/2020 Attendees: - Kosta (Growth) - Giuseppe (SRE) - Martin (Research) Summar... 
[10:56:04] (03PS1) 10ArielGlenn: Add sample code illustrating use of the commandmanagement module classes [dumps] - 10https://gerrit.wikimedia.org/r/627475 [10:56:49] !log oblivian@deploy1001 helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [10:56:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:58:13] !log jayme@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [10:58:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:00] ah lol that was me ^ [11:00:04] Amir1, Lucas_WMDE, awight, and Urbanecm: How many deployers does it take to do European mid-day backport window deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1100). [11:00:04] No GERRIT patches in the queue for this window AFAICS. [11:02:35] 10Operations, 10GrowthExperiments-NewcomerTasks, 10Product-Infrastructure-Team-Backlog, 10serviceops: Service operations setup for Add a Link project - https://phabricator.wikimedia.org/T258978 (10kostajh) [11:03:21] (03CR) 10Hnowlan: "> Patch Set 2: Code-Review-1" (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 (owner: 10Hnowlan) [11:03:37] (03PS3) 10Hnowlan: api-gateway: migrate to new helmfile format [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 [11:05:31] (03PS14) 10Jbond: pdu.rotate-snmp disable urllib warnings and add ability to coninue on errors [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [11:07:40] 10Operations: Improve process to add/update keys for pwstore repo - https://phabricator.wikimedia.org/T262393 (10MoritzMuehlenhoff) p:05Triage→03Medium a:03MoritzMuehlenhoff [11:11:32] PROBLEM - ps1-c6-codfw-infeed-load-tower-A-phase-Y on ps1-c6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:32] PROBLEM - 
ps1-d7-codfw-infeed-load-tower-B-phase-Y on ps1-d7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:35] PROBLEM - ps1-c7-eqiad-infeed-load-tower-B-phase-Z on ps1-c7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:35] PROBLEM - ps1-b5-codfw-infeed-load-tower-B-phase-Y on ps1-b5-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:42] PROBLEM - ps1-a2-codfw-infeed-load-tower-B-phase-Y on ps1-a2-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:42] PROBLEM - ps1-a2-codfw-infeed-load-tower-B-phase-X on ps1-a2-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:42] PROBLEM - ps1-d7-eqiad-infeed-load-tower-B-phase-Z on ps1-d7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:48] PROBLEM - ps1-c8-codfw-infeed-load-tower-A-phase-X on ps1-c8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:48] PROBLEM - ps1-d7-codfw-infeed-load-tower-A-phase-Y on ps1-d7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:48] PROBLEM - ps1-d6-codfw-infeed-load-tower-B-phase-X on ps1-d6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:48] PROBLEM - ps1-a7-codfw-infeed-load-tower-A-phase-X on ps1-a7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:52] PROBLEM - ps1-d8-eqiad-infeed-load-tower-A-phase-Y on ps1-d8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:11:58] PROBLEM - ps1-a7-eqiad-infeed-load-tower-A-phase-Y on ps1-a7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:00] PROBLEM - ps1-c2-eqiad-infeed-load-tower-B-phase-Y on ps1-c2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:01] er, that's a downtime expiring [11:12:03] expired downtime? 
[11:12:06] PROBLEM - ps1-c7-eqiad-infeed-load-tower-B-phase-X on ps1-c7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:06] PROBLEM - ps1-b1-eqiad-infeed-load-tower-A-phase-X on ps1-b1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:07] please ignore [11:12:10] k [11:12:12] PROBLEM - ps1-b1-eqiad-infeed-load-tower-A-phase-Y on ps1-b1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:12] PROBLEM - ps1-b1-eqiad-infeed-load-tower-B-phase-X on ps1-b1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:12] PROBLEM - ps1-c7-codfw-infeed-load-tower-B-phase-Z on ps1-c7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:12] PROBLEM - ps1-d8-eqiad-infeed-load-tower-B-phase-X on ps1-d8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:12] PROBLEM - ps1-a6-eqiad-infeed-load-tower-A-phase-Y on ps1-a6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:16] PROBLEM - ps1-b2-codfw-infeed-load-tower-A-phase-Z on ps1-b2-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:16] PROBLEM - ps1-a7-eqiad-infeed-load-tower-A-phase-X on ps1-a7-eqiad is CRITICAL: CRITICAL - Plugin timed out 
while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:18] PROBLEM - ps1-c2-eqiad-infeed-load-tower-B-phase-Z on ps1-c2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:18] PROBLEM - ps1-a2-codfw-infeed-load-tower-A-phase-X on ps1-a2-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:18] PROBLEM - ps1-c7-codfw-infeed-load-tower-A-phase-X on ps1-c7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:18] PROBLEM - ps1-d8-eqiad-infeed-load-tower-B-phase-Y on ps1-d8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:20] PROBLEM - ps1-c6-codfw-infeed-load-tower-B-phase-Z on ps1-c6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:20] PROBLEM - ps1-a6-eqiad-infeed-load-tower-A-phase-Z on ps1-a6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:23] !log mass update SCS SNMP community in LibreNMS - T246890 [11:12:24] PROBLEM - ps1-d7-codfw-infeed-load-tower-A-phase-X on ps1-d7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:24] PROBLEM - ps1-b1-eqiad-infeed-load-tower-B-phase-Z on ps1-b1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:24] PROBLEM - ps1-a7-codfw-infeed-load-tower-A-phase-Z on ps1-a7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:26] PROBLEM - ps1-a3-codfw-infeed-load-tower-B-phase-X on ps1-a3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:12:28] PROBLEM - ps1-a7-eqiad-infeed-load-tower-B-phase-X on ps1-a7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:32] PROBLEM - ps1-d8-eqiad-infeed-load-tower-A-phase-X on ps1-d8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:34] PROBLEM - ps1-oe15-esams-infeed-load-tower-A-single-phase on ps1-oe15-esams is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-a3-codfw-infeed-load-tower-B-phase-Z on ps1-a3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-c7-codfw-infeed-load-tower-A-phase-Y on ps1-c7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-b5-codfw-infeed-load-tower-A-phase-X on ps1-b5-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-a2-codfw-infeed-load-tower-A-phase-Y on ps1-a2-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-c6-codfw-infeed-load-tower-A-phase-X on ps1-c6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-c8-codfw-infeed-load-tower-A-phase-Y on ps1-c8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:36] PROBLEM - ps1-c7-eqiad-infeed-load-tower-A-phase-Z on ps1-c7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:37] PROBLEM - ps1-a3-codfw-infeed-load-tower-A-phase-X on ps1-a3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:40] PROBLEM - ps1-d8-eqiad-infeed-load-tower-B-phase-Z on ps1-d8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:40] PROBLEM - ps1-d7-codfw-infeed-load-tower-B-phase-X on ps1-d7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:42] PROBLEM - ps1-a3-codfw-infeed-load-tower-A-phase-Y on ps1-a3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:42] PROBLEM - ps1-a3-codfw-infeed-load-tower-A-phase-Z on 
ps1-a3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:42] PROBLEM - ps1-b5-codfw-infeed-load-tower-A-phase-Y on ps1-b5-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:44] PROBLEM - ps1-c6-codfw-infeed-load-tower-B-phase-X on ps1-c6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:44] PROBLEM - ps1-a6-eqiad-infeed-load-tower-B-phase-X on ps1-a6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:44] PROBLEM - ps1-c7-eqiad-infeed-load-tower-A-phase-Y on ps1-c7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:44] PROBLEM - ps1-d7-eqiad-infeed-load-tower-B-phase-Y on ps1-d7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:46] PROBLEM - ps1-b1-eqiad-infeed-load-tower-B-phase-Y on ps1-b1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:48] PROBLEM - ps1-c7-codfw-infeed-load-tower-B-phase-X on ps1-c7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:48] PROBLEM - ps1-a7-eqiad-infeed-load-tower-B-phase-Y on ps1-a7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:50] PROBLEM - ps1-d7-codfw-infeed-load-tower-B-phase-Z on ps1-d7-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:50] PROBLEM - ps1-a6-eqiad-infeed-load-tower-B-phase-Y on ps1-a6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:50] PROBLEM - ps1-d6-codfw-infeed-load-tower-B-phase-Z on ps1-d6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:50] PROBLEM - ps1-c6-codfw-infeed-load-tower-A-phase-Z on ps1-c6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:50] PROBLEM - ps1-oe15-esams-infeed-load-tower-B-single-phase on ps1-oe15-esams is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:52] PROBLEM - ps1-d8-eqiad-infeed-load-tower-A-phase-Z on ps1-d8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:52] PROBLEM - ps1-a7-eqiad-infeed-load-tower-B-phase-Z on ps1-a7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:52] PROBLEM - ps1-b2-codfw-infeed-load-tower-B-phase-X on ps1-b2-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:52] PROBLEM - ps1-c6-codfw-infeed-load-tower-B-phase-Y on 
ps1-c6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:52] PROBLEM - ps1-d6-codfw-infeed-load-tower-B-phase-Y on ps1-d6-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:12:52] PROBLEM - ps1-c7-eqiad-infeed-load-tower-A-phase-X on ps1-c7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:13:01] I re-downtimed it for 2h [11:13:26] thx [11:14:40] (03PS15) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [11:15:13] jbond42: should I have a look? ^^ [11:15:34] *right now [11:17:14] volans: thanks but no im still hitting edge cases [11:17:40] k [11:17:56] !log roll out scap 3.15.0-1 to all - T261234 [11:18:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:18:02] T261234: Deploy Scap version 3.15.0-1 - https://phabricator.wikimedia.org/T261234 [11:18:51] RECOVERY - ps1-c2-eqiad-infeed-load-tower-B-phase-Y on ps1-c2-eqiad is OK: SNMP OK - ps1-c2-eqiad-infeed-load-tower-B-phase-Y 763 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:51] RECOVERY - ps1-c7-eqiad-infeed-load-tower-B-phase-X on ps1-c7-eqiad is OK: SNMP OK - ps1-c7-eqiad-infeed-load-tower-B-phase-X 725 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:51] RECOVERY - ps1-b1-eqiad-infeed-load-tower-A-phase-X on ps1-b1-eqiad is OK: SNMP OK - ps1-b1-eqiad-infeed-load-tower-A-phase-X 257 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:51] RECOVERY - ps1-b1-eqiad-infeed-load-tower-A-phase-Y on ps1-b1-eqiad is OK: SNMP OK - 
ps1-b1-eqiad-infeed-load-tower-A-phase-Y 191 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:51] RECOVERY - ps1-b1-eqiad-infeed-load-tower-B-phase-X on ps1-b1-eqiad is OK: SNMP OK - ps1-b1-eqiad-infeed-load-tower-B-phase-X 242 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:51] RECOVERY - ps1-d8-eqiad-infeed-load-tower-B-phase-X on ps1-d8-eqiad is OK: SNMP OK - ps1-d8-eqiad-infeed-load-tower-B-phase-X 624 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:51] RECOVERY - ps1-c2-eqiad-infeed-load-tower-B-phase-Z on ps1-c2-eqiad is OK: SNMP OK - ps1-c2-eqiad-infeed-load-tower-B-phase-Z 813 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:52] RECOVERY - ps1-c7-codfw-infeed-load-tower-B-phase-Z on ps1-c7-codfw is OK: SNMP OK - ps1-c7-codfw-infeed-load-tower-B-phase-Z 419 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:52] RECOVERY - ps1-d8-codfw-infeed-load-tower-B-phase-Y on ps1-d8-codfw is OK: SNMP OK - ps1-d8-codfw-infeed-load-tower-B-phase-Y 32 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:53] RECOVERY - ps1-b2-codfw-infeed-load-tower-A-phase-Z on ps1-b2-codfw is OK: SNMP OK - ps1-b2-codfw-infeed-load-tower-A-phase-Z 824 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:53] RECOVERY - ps1-b1-eqiad-infeed-load-tower-B-phase-Z on ps1-b1-eqiad is OK: SNMP OK - ps1-b1-eqiad-infeed-load-tower-B-phase-Z 331 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:54] RECOVERY - ps1-d8-eqiad-infeed-load-tower-B-phase-Y on ps1-d8-eqiad is OK: SNMP OK - ps1-d8-eqiad-infeed-load-tower-B-phase-Y 651 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:54] RECOVERY - ps1-a2-codfw-infeed-load-tower-A-phase-X on ps1-a2-codfw 
is OK: SNMP OK - ps1-a2-codfw-infeed-load-tower-A-phase-X 878 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:55] RECOVERY - ps1-c7-codfw-infeed-load-tower-A-phase-X on ps1-c7-codfw is OK: SNMP OK - ps1-c7-codfw-infeed-load-tower-A-phase-X 762 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:56] RECOVERY - ps1-c6-codfw-infeed-load-tower-B-phase-Z on ps1-c6-codfw is OK: SNMP OK - ps1-c6-codfw-infeed-load-tower-B-phase-Z 708 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:56] RECOVERY - ps1-d7-codfw-infeed-load-tower-A-phase-X on ps1-d7-codfw is OK: SNMP OK - ps1-d7-codfw-infeed-load-tower-A-phase-X 730 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:56] RECOVERY - ps1-a7-codfw-infeed-load-tower-A-phase-Z on ps1-a7-codfw is OK: SNMP OK - ps1-a7-codfw-infeed-load-tower-A-phase-Z 871 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:57] RECOVERY - ps1-d8-codfw-infeed-load-tower-A-phase-Z on ps1-d8-codfw is OK: SNMP OK - ps1-d8-codfw-infeed-load-tower-A-phase-Z 324 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:57] RECOVERY - ps1-a3-codfw-infeed-load-tower-B-phase-X on ps1-a3-codfw is OK: SNMP OK - ps1-a3-codfw-infeed-load-tower-B-phase-X 949 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:18:59] RECOVERY - ps1-d8-eqiad-infeed-load-tower-A-phase-X on ps1-d8-eqiad is OK: SNMP OK - ps1-d8-eqiad-infeed-load-tower-A-phase-X 491 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-a3-codfw-infeed-load-tower-B-phase-Z on ps1-a3-codfw is OK: SNMP OK - ps1-a3-codfw-infeed-load-tower-B-phase-Z 922 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-c7-eqiad-infeed-load-tower-A-phase-Z 
on ps1-c7-eqiad is OK: SNMP OK - ps1-c7-eqiad-infeed-load-tower-A-phase-Z 588 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-b5-codfw-infeed-load-tower-A-phase-X on ps1-b5-codfw is OK: SNMP OK - ps1-b5-codfw-infeed-load-tower-A-phase-X 680 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-c7-codfw-infeed-load-tower-A-phase-Y on ps1-c7-codfw is OK: SNMP OK - ps1-c7-codfw-infeed-load-tower-A-phase-Y 208 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-a2-codfw-infeed-load-tower-A-phase-Y on ps1-a2-codfw is OK: SNMP OK - ps1-a2-codfw-infeed-load-tower-A-phase-Y 121 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-c6-codfw-infeed-load-tower-A-phase-X on ps1-c6-codfw is OK: SNMP OK - ps1-c6-codfw-infeed-load-tower-A-phase-X 809 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:03] RECOVERY - ps1-a3-codfw-infeed-load-tower-A-phase-X on ps1-a3-codfw is OK: SNMP OK - ps1-a3-codfw-infeed-load-tower-A-phase-X 953 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:04] RECOVERY - ps1-oe15-esams-infeed-load-tower-A-single-phase on ps1-oe15-esams is OK: SNMP OK - ps1-oe15-esams-infeed-load-tower-A-single-phase 290 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:04] RECOVERY - ps1-c8-codfw-infeed-load-tower-A-phase-Y on ps1-c8-codfw is OK: SNMP OK - ps1-c8-codfw-infeed-load-tower-A-phase-Y 493 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:09] RECOVERY - ps1-d8-eqiad-infeed-load-tower-B-phase-Z on ps1-d8-eqiad is OK: SNMP OK - ps1-d8-eqiad-infeed-load-tower-B-phase-Z 835 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:09] RECOVERY - 
ps1-d7-codfw-infeed-load-tower-B-phase-X on ps1-d7-codfw is OK: SNMP OK - ps1-d7-codfw-infeed-load-tower-B-phase-X 384 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:11] RECOVERY - ps1-c7-eqiad-infeed-load-tower-A-phase-Y on ps1-c7-eqiad is OK: SNMP OK - ps1-c7-eqiad-infeed-load-tower-A-phase-Y 638 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:11] RECOVERY - ps1-a3-codfw-infeed-load-tower-A-phase-Y on ps1-a3-codfw is OK: SNMP OK - ps1-a3-codfw-infeed-load-tower-A-phase-Y 904 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:11] RECOVERY - ps1-a3-codfw-infeed-load-tower-A-phase-Z on ps1-a3-codfw is OK: SNMP OK - ps1-a3-codfw-infeed-load-tower-A-phase-Z 935 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:11] RECOVERY - ps1-c6-codfw-infeed-load-tower-B-phase-X on ps1-c6-codfw is OK: SNMP OK - ps1-c6-codfw-infeed-load-tower-B-phase-X 603 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:11] RECOVERY - ps1-b5-codfw-infeed-load-tower-A-phase-Y on ps1-b5-codfw is OK: SNMP OK - ps1-b5-codfw-infeed-load-tower-A-phase-Y 450 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:11] RECOVERY - ps1-d8-codfw-infeed-load-tower-B-phase-Z on ps1-d8-codfw is OK: SNMP OK - ps1-d8-codfw-infeed-load-tower-B-phase-Z 309 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:13] RECOVERY - ps1-d7-eqiad-infeed-load-tower-B-phase-Y on ps1-d7-eqiad is OK: SNMP OK - ps1-d7-eqiad-infeed-load-tower-B-phase-Y 546 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:13] RECOVERY - ps1-d8-codfw-infeed-load-tower-A-phase-X on ps1-d8-codfw is OK: SNMP OK - ps1-d8-codfw-infeed-load-tower-A-phase-X 298 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook 
[11:19:17] RECOVERY - ps1-b1-eqiad-infeed-load-tower-B-phase-Y on ps1-b1-eqiad is OK: SNMP OK - ps1-b1-eqiad-infeed-load-tower-B-phase-Y 173 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:17] RECOVERY - ps1-c7-codfw-infeed-load-tower-B-phase-X on ps1-c7-codfw is OK: SNMP OK - ps1-c7-codfw-infeed-load-tower-B-phase-X 325 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:19] RECOVERY - ps1-d7-codfw-infeed-load-tower-B-phase-Z on ps1-d7-codfw is OK: SNMP OK - ps1-d7-codfw-infeed-load-tower-B-phase-Z 471 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:19] RECOVERY - ps1-d8-codfw-infeed-load-tower-A-phase-Y on ps1-d8-codfw is OK: SNMP OK - ps1-d8-codfw-infeed-load-tower-A-phase-Y 46 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:19] RECOVERY - ps1-d6-codfw-infeed-load-tower-B-phase-Z on ps1-d6-codfw is OK: SNMP OK - ps1-d6-codfw-infeed-load-tower-B-phase-Z 219 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:19] RECOVERY - ps1-c6-codfw-infeed-load-tower-A-phase-Z on ps1-c6-codfw is OK: SNMP OK - ps1-c6-codfw-infeed-load-tower-A-phase-Z 957 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:19] RECOVERY - ps1-oe15-esams-infeed-load-tower-B-single-phase on ps1-oe15-esams is OK: SNMP OK - ps1-oe15-esams-infeed-load-tower-B-single-phase 668 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:19:54] (03PS16) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [11:20:43] RECOVERY - ps1-d8-eqiad-infeed-load-tower-A-phase-Y on ps1-d8-eqiad is OK: SNMP OK - ps1-d8-eqiad-infeed-load-tower-A-phase-Y 649 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:20:43] RECOVERY - 
ps1-c6-codfw-infeed-load-tower-A-phase-Y on ps1-c6-codfw is OK: SNMP OK - ps1-c6-codfw-infeed-load-tower-A-phase-Y 727 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:22:53] RECOVERY - ps1-a2-codfw-infeed-load-tower-B-phase-X on ps1-a2-codfw is OK: SNMP OK - ps1-a2-codfw-infeed-load-tower-B-phase-X 304 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:22:53] RECOVERY - ps1-d8-eqiad-infeed-load-tower-A-phase-Z on ps1-d8-eqiad is OK: SNMP OK - ps1-d8-eqiad-infeed-load-tower-A-phase-Z 717 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:27] RECOVERY - ps1-a2-codfw-infeed-load-tower-B-phase-Y on ps1-a2-codfw is OK: SNMP OK - ps1-a2-codfw-infeed-load-tower-B-phase-Y 65 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:29] RECOVERY - ps1-a7-codfw-infeed-load-tower-A-phase-X on ps1-a7-codfw is OK: SNMP OK - ps1-a7-codfw-infeed-load-tower-A-phase-X 780 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:29] RECOVERY - ps1-b2-codfw-infeed-load-tower-B-phase-X on ps1-b2-codfw is OK: SNMP OK - ps1-b2-codfw-infeed-load-tower-B-phase-X 325 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:29] RECOVERY - ps1-b5-codfw-infeed-load-tower-B-phase-Y on ps1-b5-codfw is OK: SNMP OK - ps1-b5-codfw-infeed-load-tower-B-phase-Y 179 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:29] RECOVERY - ps1-c6-codfw-infeed-load-tower-B-phase-Y on ps1-c6-codfw is OK: SNMP OK - ps1-c6-codfw-infeed-load-tower-B-phase-Y 572 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:29] RECOVERY - ps1-c7-eqiad-infeed-load-tower-A-phase-X on ps1-c7-eqiad is OK: SNMP OK - ps1-c7-eqiad-infeed-load-tower-A-phase-X 775 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook 
[11:24:29] RECOVERY - ps1-c7-eqiad-infeed-load-tower-B-phase-Z on ps1-c7-eqiad is OK: SNMP OK - ps1-c7-eqiad-infeed-load-tower-B-phase-Z 550 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:29] RECOVERY - ps1-c8-codfw-infeed-load-tower-A-phase-X on ps1-c8-codfw is OK: SNMP OK - ps1-c8-codfw-infeed-load-tower-A-phase-X 444 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:30] RECOVERY - ps1-d6-codfw-infeed-load-tower-B-phase-X on ps1-d6-codfw is OK: SNMP OK - ps1-d6-codfw-infeed-load-tower-B-phase-X 197 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:30] RECOVERY - ps1-d6-codfw-infeed-load-tower-B-phase-Y on ps1-d6-codfw is OK: SNMP OK - ps1-d6-codfw-infeed-load-tower-B-phase-Y 99 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:31] RECOVERY - ps1-d7-codfw-infeed-load-tower-A-phase-Y on ps1-d7-codfw is OK: SNMP OK - ps1-d7-codfw-infeed-load-tower-A-phase-Y 140 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:31] RECOVERY - ps1-d7-codfw-infeed-load-tower-B-phase-Y on ps1-d7-codfw is OK: SNMP OK - ps1-d7-codfw-infeed-load-tower-B-phase-Y 143 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:32] RECOVERY - ps1-d7-eqiad-infeed-load-tower-B-phase-Z on ps1-d7-eqiad is OK: SNMP OK - ps1-d7-eqiad-infeed-load-tower-B-phase-Z 748 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:24:32] RECOVERY - ps1-d8-codfw-infeed-load-tower-B-phase-X on ps1-d8-codfw is OK: SNMP OK - ps1-d8-codfw-infeed-load-tower-B-phase-X 308 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:30:44] (03PS17) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [11:32:02] (03CR) 10jerkins-bot: [V: 04-1] cookbook sre.pdu: Fix 
reboot logic and other minor fixes [cookbooks] - https://gerrit.wikimedia.org/r/627272 (owner: Jbond) [11:36:23] marostegui: hey, is it fine if I re-enable DPL at ruwikinews now, as we discussed yesterday evening? [11:36:57] Urbanecm: +1 [11:37:03] okay, doing [11:38:04] Urbanecm: I will monitor s3 then [11:38:10] okay, thanks [11:41:10] (PS1) Urbanecm: Revert "Disable DynamicPageList on ruwikinews" [mediawiki-config] - https://gerrit.wikimedia.org/r/627479 (https://phabricator.wikimedia.org/T262240) [11:41:46] liw: the php symlink at deploy1001 is dirty [11:41:50] (ie. not committed) [11:41:57] (CR) Urbanecm: [C: +2] Revert "Disable DynamicPageList on ruwikinews" [mediawiki-config] - https://gerrit.wikimedia.org/r/627479 (https://phabricator.wikimedia.org/T262240) (owner: Urbanecm) [11:42:40] (Merged) jenkins-bot: Revert "Disable DynamicPageList on ruwikinews" [mediawiki-config] - https://gerrit.wikimedia.org/r/627479 (https://phabricator.wikimedia.org/T262240) (owner: Urbanecm) [11:43:49] Urbanecm, yes; everything broke today. 
I'll fix that [11:43:52] (PS18) Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - https://gerrit.wikimedia.org/r/627272 [11:44:06] liw: yup, saw that scap sync doesn't work at all :/ [11:44:16] !log urbanecm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: 294931fc6eb9e365894ec0cf94c155d55ecae549: Revert "Disable DynamicPageList on ruwikinews" (T262240; T262391) (duration: 00m 58s) [11:44:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:44:23] marostegui: DPL is back up [11:44:24] T262391: DPL extension has been disabled on Russian Wikinews - https://phabricator.wikimedia.org/T262391 [11:44:27] Urbanecm: cheers [11:44:29] (PS1) Lars Wirzenius: Commit manually updated php symlink (cleanup from panic fixing) [mediawiki-config] - https://gerrit.wikimedia.org/r/627480 [11:44:31] (CR) Lars Wirzenius: [C: +2] Commit manually updated php symlink (cleanup from panic fixing) [mediawiki-config] - https://gerrit.wikimedia.org/r/627480 (owner: Lars Wirzenius) [11:44:46] Urbanecm: Let's see how s3 does then from now on :) [11:45:01] yup - please keep me posted. I'll update the task and ruwikinews community [11:45:05] (CR) jerkins-bot: [V: -1] cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - https://gerrit.wikimedia.org/r/627272 (owner: Jbond) [11:45:13] (Merged) jenkins-bot: Commit manually updated php symlink (cleanup from panic fixing) [mediawiki-config] - https://gerrit.wikimedia.org/r/627480 (owner: Lars Wirzenius) [11:45:14] scap sync-file returned this: 11:44:16 /usr/bin/sudo -u root -- /usr/local/sbin/check-and-restart-php php7.2-fpm 100 on mw2306.codfw.wmnet returned [56]: [11:45:17] is this a reason to worry? 
[11:51:49] rzl: ^ ping as you're on duty, and I'm not sure who I should ping ^ [12:00:01] (PS1) Tchanders: Enable Special:Investigate on itwiki and svwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/627481 [12:00:05] Deploy window Pre MediaWiki train sanity break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1200) [12:00:27] (CR) Tchanders: [C: -1] "Wait for GuidedTour translations" [mediawiki-config] - https://gerrit.wikimedia.org/r/627481 (owner: Tchanders) [12:03:22] Operations, DBA, Patch-For-Review, User-Kormat, User-jbond: Refactor mariadb puppet code - https://phabricator.wikimedia.org/T256972 (Kormat) [12:05:34] !log elukey@cumin1001 START - Cookbook sre.cassandra.roll-restart [12:05:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:05:51] !log roll restart cassandra on aqs* to pick up openjdk upgrades [12:05:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:08:37] !log rolling restart of php7.2-fpm [12:08:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:14:09] (PS4) JMeybohm: lvs: Rename termbox-https to termbox [puppet] - https://gerrit.wikimedia.org/r/627297 (https://phabricator.wikimedia.org/T254581) [12:15:42] (PS1) Elukey: role::search::airflow: add profile::analytics::cluster::gitconfig [puppet] - https://gerrit.wikimedia.org/r/627483 [12:17:07] PROBLEM - PyBal connections to etcd on lvs1016 is CRITICAL: CRITICAL: 102 connections established with conf1004.eqiad.wmnet:4001 (min=104) https://wikitech.wikimedia.org/wiki/PyBal [12:18:48] !log update libxml2 on stretch and jessie [12:18:50] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1087 from vslow, s8, add db1092 temporarily', diff saved to https://phabricator.wikimedia.org/P12589 and previous config saved to /var/cache/conftool/dbconfig/20200915-121849-marostegui.json [12:18:52] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:23:05] RECOVERY - PyBal connections to etcd on lvs1016 is OK: OK: 104 connections established with conf1004.eqiad.wmnet:4001 (min=104) https://wikitech.wikimedia.org/wiki/PyBal [12:24:16] (03CR) 10JMeybohm: "> Patch Set 2:" (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 (owner: 10Hnowlan) [12:24:19] (03CR) 10Elukey: "https://puppet-compiler.wmflabs.org/compiler1003/25069/an-airflow1001.eqiad.wmnet/index.html" [puppet] - 10https://gerrit.wikimedia.org/r/627483 (owner: 10Elukey) [12:24:23] (03CR) 10Elukey: [C: 03+2] role::search::airflow: add profile::analytics::cluster::gitconfig [puppet] - 10https://gerrit.wikimedia.org/r/627483 (owner: 10Elukey) [12:26:47] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "You should do a few additional things here:" [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 (owner: 10Hnowlan) [12:31:04] (03CR) 10JMeybohm: [C: 03+1] "Cool!" [puppet] - 10https://gerrit.wikimedia.org/r/627439 (owner: 10Alexandros Kosiaris) [12:35:50] (03CR) 10Kormat: [C: 03+1] "> Patch Set 9:" [puppet] - 10https://gerrit.wikimedia.org/r/622444 (https://phabricator.wikimedia.org/T260843) (owner: 10Bstorm) [12:38:17] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "https://puppet-compiler.wmflabs.org/compiler1001/25071/" [puppet] - 10https://gerrit.wikimedia.org/r/627297 (https://phabricator.wikimedia.org/T254581) (owner: 10JMeybohm) [12:40:16] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "given we still have the service IP on all backends anyways, you can skip this patch in this case IMO." [puppet] - 10https://gerrit.wikimedia.org/r/627299 (https://phabricator.wikimedia.org/T254581) (owner: 10JMeybohm) [12:41:39] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "LGTM, this will just remove monitoring, so that we don't risk paging." 
[puppet] - 10https://gerrit.wikimedia.org/r/627298 (https://phabricator.wikimedia.org/T254581) (owner: 10JMeybohm) [12:42:10] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "IMHO we should skip patch 2 and merge this directly." [puppet] - 10https://gerrit.wikimedia.org/r/627300 (https://phabricator.wikimedia.org/T254581) (owner: 10JMeybohm) [12:45:09] PROBLEM - Logstash Elasticsearch indexing errors #o11y on icinga1001 is CRITICAL: 15.94 ge 8 https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash [12:46:31] (03CR) 10Gehel: [C: 04-1] "Yet another comment." (031 comment) [software/spicerack] - 10https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) (owner: 10Ryan Kemper) [12:47:43] PROBLEM - restbase endpoints health on restbase1021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:47:43] PROBLEM - restbase endpoints health on restbase1020 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:47:47] PROBLEM - restbase endpoints health on restbase2013 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:47:51] PROBLEM - Restbase LVS codfw on restbase.svc.codfw.wmnet is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/RESTBase [12:48:01] PROBLEM - wikifeeds 
eqiad on wikifeeds.svc.eqiad.wmnet is CRITICAL: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most read articles for January 1, 2016) timed out before a response was received: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most-read articles for January 1, 2016 (with aggregated=true)) timed out before a response was received https://wikitech.wikimedia.org/wiki/Wikifeeds [12:48:55] PROBLEM - Restbase LVS eqiad on restbase.svc.eqiad.wmnet is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/RESTBase [12:48:55] PROBLEM - restbase endpoints health on restbase1027 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:48:56] PROBLEM - restbase endpoints health on restbase1018 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:01] PROBLEM - restbase endpoints health on restbase2022 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:01] PROBLEM - restbase endpoints health on restbase2015 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:05] PROBLEM - restbase endpoints health on restbase1019 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for 
April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:06] PROBLEM - restbase endpoints health on restbase1022 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:06] PROBLEM - restbase endpoints health on restbase2020 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:17] PROBLEM - restbase endpoints health on restbase1023 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:17] PROBLEM - restbase endpoints health on restbase2018 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:49] PROBLEM - restbase endpoints health on restbase2012 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:49:49] PROBLEM - restbase endpoints health on restbase2014 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:50:39] PROBLEM - restbase endpoints health on restbase1016 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated 
feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:50:41] wow [12:50:46] any maintenance ongoing? [12:50:53] PROBLEM - restbase endpoints health on restbase2019 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:51:08] elukey: not sure if the PDU work started already [12:51:23] #page for restbase if anyone with more context is around [12:51:47] PROBLEM - restbase endpoints health on restbase2021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:51:55] PROBLEM - restbase endpoints health on restbase1017 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:51:57] PROBLEM - wikifeeds codfw on wikifeeds.svc.codfw.wmnet is CRITICAL: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most read articles for January 1, 2016) timed out before a response was received: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most-read articles for January 1, 2016 (with aggregated=true)) timed out before a response was received https://wikitech.wikimedia.org/wiki/Wikifeeds [12:51:57] PROBLEM - restbase endpoints health on restbase2017 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:01] PROBLEM - restbase endpoints health on restbase1026 is CRITICAL: 
/en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:15] PROBLEM - restbase endpoints health on restbase-dev1004 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:15] PROBLEM - restbase endpoints health on restbase2011 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:21] PROBLEM - restbase endpoints health on restbase1025 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:21] PROBLEM - restbase endpoints health on restbase-dev1005 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:21] PROBLEM - restbase endpoints health on restbase-dev1006 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:21] PROBLEM - restbase endpoints health on restbase2016 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:21] PROBLEM - restbase endpoints health 
on restbase2023 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:49] PROBLEM - restbase endpoints health on restbase1024 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:49] PROBLEM - restbase endpoints health on restbase1027 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:49] PROBLEM - restbase endpoints health on restbase1018 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:49] PROBLEM - restbase endpoints health on restbase2010 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:59] PROBLEM - restbase endpoints health on restbase1022 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:59] PROBLEM - restbase endpoints health on restbase1019 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:52:59] PROBLEM - restbase 
endpoints health on restbase2020 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:53:07] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "https://puppet-compiler.wmflabs.org/compiler1001/25073/" [puppet] - 10https://gerrit.wikimedia.org/r/627431 (https://phabricator.wikimedia.org/T255879) (owner: 10JMeybohm) [12:53:23] this doesn't look right [12:53:43] RECOVERY - restbase endpoints health on restbase2014 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:53:43] RECOVERY - restbase endpoints health on restbase2012 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:53:45] RECOVERY - restbase endpoints health on restbase2021 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:53:45] RECOVERY - Restbase LVS codfw on restbase.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/RESTBase [12:53:53] RECOVERY - wikifeeds eqiad on wikifeeds.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Wikifeeds [12:53:55] RECOVERY - restbase endpoints health on restbase1017 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:53:55] RECOVERY - wikifeeds codfw on wikifeeds.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Wikifeeds [12:53:55] RECOVERY - restbase endpoints health on restbase2017 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:53:56] RECOVERY - restbase endpoints health on restbase1026 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:54:07] RECOVERY - restbase endpoints health on restbase2011 is OK: 
All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:54:07] RECOVERY - restbase endpoints health on restbase-dev1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:54:13] RECOVERY - restbase endpoints health on restbase1025 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:54:43] (03CR) 10Gehel: "LGTM, trivial enough" [puppet] - 10https://gerrit.wikimedia.org/r/627483 (owner: 10Elukey) [12:55:30] (03CR) 10Giuseppe Lavagetto: [C: 03+1] lvs: Remove cxserver non-TLS endpoint from LVS 1/3 [puppet] - 10https://gerrit.wikimedia.org/r/627432 (https://phabricator.wikimedia.org/T255879) (owner: 10JMeybohm) [12:55:44] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "Same as for termbox." [puppet] - 10https://gerrit.wikimedia.org/r/627433 (https://phabricator.wikimedia.org/T255879) (owner: 10JMeybohm) [12:57:00] (03CR) 10Giuseppe Lavagetto: [C: 03+1] lvs: Remove cxserver non-TLS endpoint from LVS 3/3 [puppet] - 10https://gerrit.wikimedia.org/r/627434 (https://phabricator.wikimedia.org/T255879) (owner: 10JMeybohm) [12:57:33] PROBLEM - restbase endpoints health on restbase1020 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:57:33] PROBLEM - restbase endpoints health on restbase1021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:09] RECOVERY - restbase endpoints health on restbase-dev1006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:09] RECOVERY - restbase endpoints health on restbase2023 is OK: All endpoints are healthy 
https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:09] RECOVERY - restbase endpoints health on restbase2016 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:09] RECOVERY - restbase endpoints health on restbase-dev1005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:29] RECOVERY - restbase endpoints health on restbase1016 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:43] PROBLEM - restbase endpoints health on restbase1018 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:43] PROBLEM - restbase endpoints health on restbase2010 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:51] PROBLEM - restbase endpoints health on restbase2022 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:58:53] PROBLEM - restbase endpoints health on restbase2020 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:43] PROBLEM - Restbase LVS codfw on restbase.svc.codfw.wmnet is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/RESTBase [12:59:43] PROBLEM 
- restbase endpoints health on restbase2014 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:43] PROBLEM - restbase endpoints health on restbase2021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:43] PROBLEM - restbase endpoints health on restbase2012 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:45] PROBLEM - wikifeeds codfw on wikifeeds.svc.codfw.wmnet is CRITICAL: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most read articles for January 1, 2016) is CRITICAL: Test retrieve the most read articles for January 1, 2016 returned the unexpected status 429 (expecting: 200): /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most-read articles for January 1, 2016 (with aggregated=true)) is CRITICAL: [12:59:45] most-read articles for January 1, 2016 (with aggregated=true) returned the unexpected status 429 (expecting: 200) https://wikitech.wikimedia.org/wiki/Wikifeeds [12:59:53] PROBLEM - wikifeeds eqiad on wikifeeds.svc.eqiad.wmnet is CRITICAL: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most read articles for January 1, 2016) timed out before a response was received: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most-read articles for January 1, 2016 (with aggregated=true)) timed out before a response was received https://wikitech.wikimedia.org/wiki/Wikifeeds [12:59:53] PROBLEM - restbase endpoints health on restbase1017 is CRITICAL: 
/en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:53] PROBLEM - restbase endpoints health on restbase2017 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:57] PROBLEM - restbase endpoints health on restbase1026 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [12:59:58] (03CR) 10Jcrespo: [C: 04-1] mariadb.py: Remove redundant code already present on WMFMariaDB class (031 comment) [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626602 (owner: 10Jcrespo) [13:00:04] liw and brennen: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) Mediawiki train - European+American Version deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1300). 
[13:00:09] PROBLEM - restbase endpoints health on restbase-dev1004 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:00:43] PROBLEM - restbase endpoints health on restbase2019 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:00:49] PROBLEM - restbase endpoints health on restbase2015 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:01:05] PROBLEM - restbase endpoints health on restbase2018 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:01:36] PROBLEM - restbase endpoints health on restbase2013 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:01:37] 10Operations, 10LDAP-Access-Requests: Add Bereket teshome to the ldap/wmde and ldap/nda group - https://phabricator.wikimedia.org/T262921 (10bete) [13:01:39] <_joe_> uhm let's see the telemetry [13:01:46] (03PS5) 10Jcrespo: mariadb.py: Remove redundant code already present on WMFMariaDB class [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626602 [13:01:48] (03PS1) 10ZPapierski: Switch between active W[D|C]QS indexes [puppet] - 10https://gerrit.wikimedia.org/r/627495 (https://phabricator.wikimedia.org/T262828) [13:01:49] 
<_joe_> akosiaris: mobileapps having issues? [13:02:46] <_joe_> no looks like wikifeeds [13:02:46] * akosiaris looking [13:02:53] PROBLEM - restbase endpoints health on restbase1019 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:02:55] I'm around, let me know if I can help [13:03:03] PROBLEM - restbase endpoints health on restbase1023 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:03:21] _joe_: latency increases [13:03:39] PROBLEM - restbase endpoints health on restbase2021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:03:39] PROBLEM - restbase endpoints health on restbase2014 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:03:48] <_joe_> yes, on wikifeeds, p99 [13:03:49] p90s, though, p50 is still ok [13:03:49] PROBLEM - restbase endpoints health on restbase2017 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:03:52] _joe_, akosiaris I am writing in #sre, it may be AQS [13:04:06] PROBLEM - restbase endpoints health on restbase2011 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a 
response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:13] PROBLEM - restbase endpoints health on restbase-dev1005 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:13] PROBLEM - restbase endpoints health on restbase-dev1006 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:13] PROBLEM - restbase endpoints health on restbase2016 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:21] I am doing a roll restart of cassandra in there, that matches with the wikifeeds latencies, but I don't know if wikifeeds uses aqs or not [13:04:33] RECOVERY - restbase endpoints health on restbase1027 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:33] RECOVERY - restbase endpoints health on restbase2019 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:33] RECOVERY - restbase endpoints health on restbase1018 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:33] RECOVERY - restbase endpoints health on restbase2010 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:33] RECOVERY - restbase endpoints health on restbase1024 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:43] RECOVERY - restbase endpoints health on restbase1022 is OK: All 
endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:43] RECOVERY - restbase endpoints health on restbase2020 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:43] RECOVERY - restbase endpoints health on restbase1019 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:45] RECOVERY - restbase endpoints health on restbase2015 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:45] RECOVERY - restbase endpoints health on restbase2022 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:55] RECOVERY - restbase endpoints health on restbase1023 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:04:59] RECOVERY - restbase endpoints health on restbase2018 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:19] RECOVERY - restbase endpoints health on restbase1021 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:19] RECOVERY - restbase endpoints health on restbase1020 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:25] RECOVERY - restbase endpoints health on restbase2013 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:31] RECOVERY - restbase endpoints health on restbase2021 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:31] RECOVERY - Restbase LVS codfw on restbase.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/RESTBase [13:05:31] RECOVERY - restbase endpoints health on restbase2012 is OK: All endpoints are healthy 
https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:31] RECOVERY - restbase endpoints health on restbase2014 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:45] RECOVERY - wikifeeds codfw on wikifeeds.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Wikifeeds [13:05:45] RECOVERY - wikifeeds eqiad on wikifeeds.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Wikifeeds [13:05:46] RECOVERY - restbase endpoints health on restbase2017 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:46] RECOVERY - restbase endpoints health on restbase1017 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:05:46] RECOVERY - restbase endpoints health on restbase1026 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:06:01] RECOVERY - restbase endpoints health on restbase2011 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:06:01] RECOVERY - restbase endpoints health on restbase-dev1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:06:07] RECOVERY - restbase endpoints health on restbase2016 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:06:07] RECOVERY - restbase endpoints health on restbase-dev1005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:06:07] RECOVERY - restbase endpoints health on restbase-dev1006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:06:43] PROBLEM - Restbase LVS eqiad on restbase.svc.eqiad.wmnet is CRITICAL: /en.wikipedia.org/v1/page/talk/{title} (Get structured talk page for enwiki Salt article) 
is CRITICAL: Test Get structured talk page for enwiki Salt article returned the unexpected status 503 (expecting: 200): /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https: [13:06:43] dia.org/wiki/RESTBase [13:08:28] (03PS4) 10Jcrespo: resolve: Allow connections with :
, in addition to port [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626603 [13:10:24] (03CR) 10Jcrespo: [C: 04-1] "Wait, I don't think the logic is correct, needs more tests." [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626602 (owner: 10Jcrespo) [13:10:37] PROBLEM - Restbase LVS eqiad on restbase.svc.eqiad.wmnet is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/RESTBase [13:10:45] PROBLEM - restbase endpoints health on restbase1019 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:21] PROBLEM - restbase endpoints health on restbase1020 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:21] PROBLEM - restbase endpoints health on restbase1021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:27] PROBLEM - restbase endpoints health on restbase2013 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:29] (03PS1) 10Filippo Giunchedi: update debian files to handle new prometheus-icinga-am service [debs/prometheus-icinga-exporter] - 10https://gerrit.wikimedia.org/r/627497 [13:11:31] (03PS1) 10Filippo Giunchedi: Merge branch 'upstream/0.10' into HEAD [debs/prometheus-icinga-exporter] - 
10https://gerrit.wikimedia.org/r/627498 [13:11:33] PROBLEM - Restbase LVS codfw on restbase.svc.codfw.wmnet is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/RESTBase [13:11:33] PROBLEM - restbase endpoints health on restbase2014 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:33] PROBLEM - restbase endpoints health on restbase2021 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:33] PROBLEM - restbase endpoints health on restbase2012 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:35] PROBLEM - wikifeeds codfw on wikifeeds.svc.codfw.wmnet is CRITICAL: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most read articles for January 1, 2016) is CRITICAL: Test retrieve the most read articles for January 1, 2016 returned the unexpected status 429 (expecting: 200) https://wikitech.wikimedia.org/wiki/Wikifeeds [13:11:43] PROBLEM - restbase endpoints health on restbase1017 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:11:49] PROBLEM - restbase endpoints health on restbase1026 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) 
timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:03] PROBLEM - restbase endpoints health on restbase-dev1004 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:03] PROBLEM - restbase endpoints health on restbase2011 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:05] PROBLEM - IPMI Sensor Status on ms-be1024 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 2 = Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [13:12:09] PROBLEM - restbase endpoints health on restbase1025 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:09] PROBLEM - restbase endpoints health on restbase-dev1005 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:36] PROBLEM - restbase endpoints health on restbase1027 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:36] PROBLEM - restbase endpoints health on restbase1024 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed 
content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:12:45] PROBLEM - restbase endpoints health on restbase2022 is CRITICAL: /en.wikipedia.org/v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:15] RECOVERY - restbase endpoints health on restbase1021 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:15] RECOVERY - restbase endpoints health on restbase1020 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:21] RECOVERY - restbase endpoints health on restbase2013 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:27] RECOVERY - restbase endpoints health on restbase2014 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:27] RECOVERY - restbase endpoints health on restbase2012 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:27] RECOVERY - Restbase LVS codfw on restbase.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/RESTBase [13:13:27] RECOVERY - restbase endpoints health on restbase2021 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:35] RECOVERY - wikifeeds codfw on wikifeeds.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Wikifeeds [13:13:35] RECOVERY - restbase endpoints health on restbase1017 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:56] RECOVERY - restbase endpoints health on restbase2011 is OK: All endpoints are healthy 
https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:13:56] RECOVERY - restbase endpoints health on restbase-dev1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:14:01] RECOVERY - restbase endpoints health on restbase1025 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:14:01] RECOVERY - restbase endpoints health on restbase-dev1005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:14:24] !log beginning work inside racks c2, c3, c4 and c5 eqiad [13:14:27] RECOVERY - restbase endpoints health on restbase1027 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:14:27] RECOVERY - restbase endpoints health on restbase1024 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:14:27] RECOVERY - Restbase LVS eqiad on restbase.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/RESTBase [13:14:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:14:37] RECOVERY - restbase endpoints health on restbase2022 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:14:37] RECOVERY - restbase endpoints health on restbase1019 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:15:41] RECOVERY - restbase endpoints health on restbase1026 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase [13:18:14] !log elukey@cumin1001 END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) [13:18:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:18:33] RECOVERY - Logstash Elasticsearch indexing errors #o11y on icinga1001 is OK: (C)8 ge (W)1 ge 0.4 
https://wikitech.wikimedia.org/wiki/Logstash%23Indexing_errors https://logstash.wikimedia.org/goto/1cee1f1b5d4e6c5e06edb3353a2a4b83 https://grafana.wikimedia.org/dashboard/db/logstash [13:19:25] (03PS19) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [13:20:20] (03CR) 10jerkins-bot: [V: 04-1] cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 (owner: 10Jbond) [13:24:37] (03PS20) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [13:25:27] (03PS1) 10Filippo Giunchedi: prometheus: use prometheus-icinga-am to send alerts [puppet] - 10https://gerrit.wikimedia.org/r/627500 (https://phabricator.wikimedia.org/T258948) [13:25:48] (03PS6) 10Jcrespo: mariadb.py: Remove redundant code already present on WMFMariaDB class [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626602 [13:26:14] (03CR) 10Jcrespo: "After testing, I believe this is closer to the intended logic." [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626602 (owner: 10Jcrespo) [13:28:08] 04Critical Alert for device ps1-c1-eqiad.mgmt.eqiad.wmnet - Device rebooted [13:30:02] (03PS1) 10CDanis: eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 (https://phabricator.wikimedia.org/T262087) [13:30:27] (03PS2) 10CDanis: eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 (https://phabricator.wikimedia.org/T262087) [13:30:54] (03PS5) 10Jcrespo: resolve: Allow connections with :
, in addition to port [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/626603 [13:30:56] (03PS1) 10Jcrespo: mariadb: Use labsdb mysql config group for both labsdb and clouddb hosts [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/627502 [13:32:14] (03CR) 10Giuseppe Lavagetto: [C: 03+1] eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 (https://phabricator.wikimedia.org/T262087) (owner: 10CDanis) [13:33:08] 04̶C̶r̶i̶t̶i̶c̶a̶l Device ps1-c1-eqiad.mgmt.eqiad.wmnet recovered from Device rebooted [13:33:15] 04Critical Alert for device ps1-23-ulsfo.mgmt.ulsfo.wmnet - Device rebooted [13:33:36] (03PS3) 10CDanis: eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 (https://phabricator.wikimedia.org/T262087) [13:33:55] (03CR) 10Alexandros Kosiaris: [C: 03+1] eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 (https://phabricator.wikimedia.org/T262087) (owner: 10CDanis) [13:33:58] (03CR) 10JMeybohm: [C: 03+2] lvs: Rename termbox-https to termbox [puppet] - 10https://gerrit.wikimedia.org/r/627297 (https://phabricator.wikimedia.org/T254581) (owner: 10JMeybohm) [13:34:05] (03CR) 10JMeybohm: [C: 03+2] lvs: Rename cxserver-https to cxserver [puppet] - 10https://gerrit.wikimedia.org/r/627431 (https://phabricator.wikimedia.org/T255879) (owner: 10JMeybohm) [13:34:59] (03PS2) 10JMeybohm: lvs: Rename cxserver-https to cxserver [puppet] - 10https://gerrit.wikimedia.org/r/627431 (https://phabricator.wikimedia.org/T255879) [13:35:07] (03CR) 10Ottomata: [C: 03+1] eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 (https://phabricator.wikimedia.org/T262087) (owner: 10CDanis) [13:36:58] (03CR) 10CDanis: [C: 03+2] eventgate-logging-external: +replicas & cache NEL [deployment-charts] - 10https://gerrit.wikimedia.org/r/627501 
(https://phabricator.wikimedia.org/T262087) (owner: 10CDanis) [13:37:38] (03PS21) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [13:38:08] 04̶C̶r̶i̶t̶i̶c̶a̶l Device ps1-23-ulsfo.mgmt.ulsfo.wmnet recovered from Device rebooted [13:40:48] <[1997kB]> ^ is it related to toolforge? [13:41:23] <[1997kB]> I'm getting 502 bad gateway [13:42:50] [1997kB]: no [13:45:29] PROBLEM - kubelet operational latencies on kubernetes2003 is CRITICAL: instance=kubernetes2003.codfw.wmnet https://wikitech.wikimedia.org/wiki/Kubernetes https://grafana.wikimedia.org/dashboard/db/kubernetes-kubelets?orgId=1 [13:46:47] 10Operations, 10ops-eqiad, 10serviceops: mw1360's NIC is faulty - https://phabricator.wikimedia.org/T262151 (10Volans) The device is still active in Netbox, shouldn't be marked as failed? [13:47:52] (03PS22) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [13:48:08] 04Critical Alert for device ps1-c4-eqiad.mgmt.eqiad.wmnet - Device rebooted [13:49:53] (03PS24) 10Ottomata: Canary events refinery job [puppet] - 10https://gerrit.wikimedia.org/r/624168 (https://phabricator.wikimedia.org/T251609) [13:50:37] !log akosiaris@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [13:50:37] !log akosiaris@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . 
[13:50:41] (03CR) 10Ottomata: [C: 03+2] Canary events refinery job [puppet] - 10https://gerrit.wikimedia.org/r/624168 (https://phabricator.wikimedia.org/T251609) (owner: 10Ottomata) [13:50:41] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Canary events refinery job [puppet] - 10https://gerrit.wikimedia.org/r/624168 (https://phabricator.wikimedia.org/T251609) (owner: 10Ottomata) [13:50:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:50:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:51:51] !log cdanis@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [13:51:51] !log cdanis@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [13:51:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:51:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:53:08] 04̶C̶r̶i̶t̶i̶c̶a̶l Device ps1-c4-eqiad.mgmt.eqiad.wmnet recovered from Device rebooted [13:53:13] (03PS1) 10Ottomata: refine - Exclude CitationUsagePageLoad from refine [puppet] - 10https://gerrit.wikimedia.org/r/627507 [13:54:24] (03Abandoned) 10JMeybohm: lvs: Remove termbox non-TLS endpoint from LVS 2/3 [puppet] - 10https://gerrit.wikimedia.org/r/627299 (https://phabricator.wikimedia.org/T254581) (owner: 10JMeybohm) [13:54:31] (03Abandoned) 10JMeybohm: lvs: Remove cxserver non-TLS endpoint from LVS 2/3 [puppet] - 10https://gerrit.wikimedia.org/r/627433 (https://phabricator.wikimedia.org/T255879) (owner: 10JMeybohm) [14:01:24] (03PS1) 10Ottomata: refinery java_job.sh.erb - fix proxyPort CLI opt [puppet] - 10https://gerrit.wikimedia.org/r/627509 [14:01:26] !log akosiaris@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . 
[14:01:26] !log akosiaris@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [14:01:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:01:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:01:45] (03CR) 10Kormat: wikireplicas: Proposal for a proxy setup on multi-instance replicas (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: 10Bstorm) [14:02:09] (03PS2) 10JMeybohm: lvs: Remove mobileapps non-TLS endpoint from LVS 1/2 [puppet] - 10https://gerrit.wikimedia.org/r/627271 (https://phabricator.wikimedia.org/T255876) [14:02:11] (03PS3) 10JMeybohm: lvs: Remove mobileapps non-TLS endpoint from LVS 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/627266 (https://phabricator.wikimedia.org/T255876) [14:02:37] (03CR) 10Ottomata: [C: 03+2] refinery java_job.sh.erb - fix proxyPort CLI opt [puppet] - 10https://gerrit.wikimedia.org/r/627509 (owner: 10Ottomata) [14:02:51] (03Abandoned) 10JMeybohm: lvs: Remove mobileapps non-TLS endpoint from LVS 2/3 [puppet] - 10https://gerrit.wikimedia.org/r/627265 (https://phabricator.wikimedia.org/T255876) (owner: 10JMeybohm) [14:03:02] RECOVERY - kubelet operational latencies on kubernetes2003 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Kubernetes https://grafana.wikimedia.org/dashboard/db/kubernetes-kubelets?orgId=1 [14:03:08] 04Critical Alert for device ps1-c8-codfw.mgmt.codfw.wmnet - Device rebooted [14:03:42] 10Operations, 10DBA, 10Blocked-on-schema-change, 10User-Kormat: Schema change to make change_tag.ct_rc_id unsigned - https://phabricator.wikimedia.org/T259831 (10Kormat) [14:03:51] (03Abandoned) 10JMeybohm: lvs: Remove blubberoid non-TLS endpoint from LVS 2/3 [puppet] - 10https://gerrit.wikimedia.org/r/627269 (https://phabricator.wikimedia.org/T236017) (owner: 10JMeybohm) [14:04:15] (03PS2) 10Ottomata: refine - Exclude CitationUsagePageLoad from refine [puppet] - 10https://gerrit.wikimedia.org/r/627507 [14:04:23] (03CR) 10Ottomata: [V: 03+2 C: 03+2] refine - Exclude CitationUsagePageLoad from refine [puppet] - 10https://gerrit.wikimedia.org/r/627507 (owner: 10Ottomata) [14:04:34] PROBLEM - IPMI Sensor Status on kafka-jumbo1004 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [PS Redundancy = Critical, Status = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [14:05:15] (03PS10) 10Elukey: Add basic Debian packaging [debs/hue] - 10https://gerrit.wikimedia.org/r/618728 (https://phabricator.wikimedia.org/T233073) [14:05:18] (03Abandoned) 10Mholloway: Revert "mobileapps: use the service proxy everywhere" [deployment-charts] - 10https://gerrit.wikimedia.org/r/626037 (owner: 10MSantos) [14:06:16] PROBLEM - IPMI Sensor Status on db1088 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 1 = Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [14:06:29] (03CR) 10Ayounsi: cookbook sre.pdu: Fix reboot logic and other minor fixes (031 comment) [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 (owner: 10Jbond) [14:07:48] 10Operations, 10DBA, 
10Blocked-on-schema-change, 10User-Kormat: Schema change to make change_tag.ct_rc_id unsigned - https://phabricator.wikimedia.org/T259831 (10Kormat) s6 codfw progress: [] db2076.codfw.wmnet sanitarium master [] db2087.codfw.wmnet [] db2089.codfw.wmnet [] db2095.codfw.wmnet sanitarium... [14:08:08] 04Critical Alert for device ps1-b7-codfw.mgmt.codfw.wmnet - Device rebooted [14:08:14] 04̶C̶r̶i̶t̶i̶c̶a̶l Device ps1-c8-codfw.mgmt.codfw.wmnet recovered from Device rebooted [14:08:27] 10Operations, 10DBA, 10Blocked-on-schema-change, 10User-Kormat: Schema change to make change_tag.ct_rc_id unsigned - https://phabricator.wikimedia.org/T259831 (10Kormat) [14:08:30] (03PS11) 10Elukey: Add basic Debian packaging [debs/hue] - 10https://gerrit.wikimedia.org/r/618728 (https://phabricator.wikimedia.org/T233073) [14:11:36] (03CR) 10Jcrespo: "I don't understand well the proposal, but in case context was necessary, when dns-per-section were setup many years ago, the desire/intend" [puppet] - 10https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: 10Bstorm) [14:14:21] (03PS23) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [14:15:13] (03PS4) 10Hnowlan: api-gateway: migrate to new helmfile format [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 [14:16:12] (03PS1) 10Ottomata: wgEventStreams - enable canary events for eventlogging_TemplateWizard [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627511 (https://phabricator.wikimedia.org/T251609) [14:16:30] 10Operations, 10Product-Infrastructure-Team-Backlog, 10Reading List Service, 10MW-1.36-notes (1.36.0-wmf.9; 2020-09-15): fixListSize.php creates excessive DB load in production - https://phabricator.wikimedia.org/T262575 (10Mholloway) 05Open→03Resolved a:03Mholloway [14:18:10] (03PS2) 10Ottomata: wgEventStreams - enable canary events for eventlogging_TemplateWizard [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/627511 (https://phabricator.wikimedia.org/T251609) [14:18:57] liw: brennen is train happening now? i have a simple config sync I'd like to do [14:19:07] ottomata, no, train is blocked [14:19:14] ok so I can proceed? [14:19:40] (proceeding...) :) [14:19:43] I guess [14:19:49] do tell if it worked [14:19:55] (03CR) 10Ottomata: [C: 03+2] wgEventStreams - enable canary events for eventlogging_TemplateWizard [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627511 (https://phabricator.wikimedia.org/T251609) (owner: 10Ottomata) [14:20:17] (because train can't do squat until scap can rebuild l10n cache again) [14:20:41] (03Merged) 10jenkins-bot: wgEventStreams - enable canary events for eventlogging_TemplateWizard [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627511 (https://phabricator.wikimedia.org/T251609) (owner: 10Ottomata) [14:21:01] (03PS5) 10Hnowlan: api-gateway: migrate to new helmfile format [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 [14:21:13] (03PS24) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [14:21:28] !log otto@deploy1001 sync-file aborted: wgEventStreams: Set canary_events_enabled: true for eventlogging_TemplateWizard - T251609 (duration: 00m 06s) [14:21:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:21:35] (03CR) 10Ppchelko: [C: 03+1] "I donno how to review this. eyeballing it definitely wouldn't catch any errors. looks sane." [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 (owner: 10Hnowlan) [14:21:37] T251609: Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events - https://phabricator.wikimedia.org/T251609 [14:22:24] (03CR) 10Nuria: [C: 03+1] "Seems the best to avoid errors, thank you." 
[puppet] - 10https://gerrit.wikimedia.org/r/627507 (owner: 10Ottomata) [14:22:31] !log otto@deploy1001 Synchronized wmf-config/InitialiseSettings.php: wgEventStreams: Set canary_events_enabled: true for eventlogging_TemplateWizard - T251609 (duration: 00m 56s) [14:22:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:22:54] (03PS1) 10Elukey: sre.cassandra.roll-restart.py: add more accurate sleep time [cookbooks] - 10https://gerrit.wikimedia.org/r/627512 [14:23:53] !log oblivian@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [14:23:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:25:02] (03CR) 10Hnowlan: "> Patch Set 3: Code-Review-1" [deployment-charts] - 10https://gerrit.wikimedia.org/r/627250 (owner: 10Hnowlan) [14:25:28] (03PS2) 10Elukey: sre.cassandra.roll-restart.py: add more accurate sleep time [cookbooks] - 10https://gerrit.wikimedia.org/r/627512 [14:29:48] (03CR) 10JMeybohm: [C: 04-1] "I was a bit to fast I guess...looking at PCC again it seems as if the target configs are empty (https://puppet-compiler.wmflabs.org/compil" (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/627439 (owner: 10Alexandros Kosiaris) [14:30:29] (03CR) 10Giuseppe Lavagetto: [C: 03+1] lvs: Remove mobileapps non-TLS endpoint from LVS 1/2 [puppet] - 10https://gerrit.wikimedia.org/r/627271 (https://phabricator.wikimedia.org/T255876) (owner: 10JMeybohm) [14:30:43] (03CR) 10Giuseppe Lavagetto: [C: 03+1] lvs: Remove mobileapps non-TLS endpoint from LVS 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/627266 (https://phabricator.wikimedia.org/T255876) (owner: 10JMeybohm) [14:31:14] (03CR) 10Giuseppe Lavagetto: [C: 03+1] lvs: Remove blubberoid non-TLS endpoint from LVS 1/3 [puppet] - 10https://gerrit.wikimedia.org/r/627268 (https://phabricator.wikimedia.org/T236017) (owner: 10JMeybohm) [14:31:57] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "ok but let's check with releng 
that no ci process uses http." [puppet] - 10https://gerrit.wikimedia.org/r/627270 (https://phabricator.wikimedia.org/T236017) (owner: 10JMeybohm) [14:33:39] (03CR) 10JMeybohm: "> Patch Set 1: Code-Review+1" [puppet] - 10https://gerrit.wikimedia.org/r/627270 (https://phabricator.wikimedia.org/T236017) (owner: 10JMeybohm) [14:35:06] RECOVERY - IPMI Sensor Status on kafka-jumbo1004 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [14:35:57] (03CR) 10Muehlenhoff: [C: 03+1] "Looks good, 10 seconds is also what I've been using for manual rolling restarts" [cookbooks] - 10https://gerrit.wikimedia.org/r/627512 (owner: 10Elukey) [14:36:46] RECOVERY - IPMI Sensor Status on db1088 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [14:38:08] !log remove old SNMP community from all network devices [14:38:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:39:38] (03CR) 10Elukey: "> Patch Set 2: Code-Review+1" [cookbooks] - 10https://gerrit.wikimedia.org/r/627512 (owner: 10Elukey) [14:40:52] (03PS1) 10Ayounsi: Remove need for codfw only SNMP community [puppet] - 10https://gerrit.wikimedia.org/r/627514 [14:41:28] (03PS1) 10Filippo Giunchedi: grafana: switch to eqiad [puppet] - 10https://gerrit.wikimedia.org/r/627515 (https://phabricator.wikimedia.org/T259143) [14:42:25] !log volans@cumin1001 START - Cookbook sre.dns.netbox [14:42:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:46:07] 10Operations, 10LDAP-Access-Requests: Add Bereket teshome to the ldap/wmde and ldap/nda group - https://phabricator.wikimedia.org/T262921 (10RLazarus) And @bete one request (at the same time, to speed things along): Could you please ask your manager to comment on this Phab ticket, giving approval? 
[14:48:17] !log volans@cumin1001 END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [14:48:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:48:28] (03CR) 10Cwhite: [C: 03+1] grafana: switch to eqiad [puppet] - 10https://gerrit.wikimedia.org/r/627515 (https://phabricator.wikimedia.org/T259143) (owner: 10Filippo Giunchedi) [14:49:23] jouncebot: next [14:49:24] In 1 hour(s) and 10 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1600) [14:49:48] I'm flipping grafana back to eqiad shortly FYI, no impact expected [14:51:25] (03CR) 10Muehlenhoff: [C: 03+1] "Looks good to me" [debs/hue] - 10https://gerrit.wikimedia.org/r/618728 (https://phabricator.wikimedia.org/T233073) (owner: 10Elukey) [14:52:20] (03CR) 10Filippo Giunchedi: [C: 03+2] grafana: switch to eqiad [puppet] - 10https://gerrit.wikimedia.org/r/627515 (https://phabricator.wikimedia.org/T259143) (owner: 10Filippo Giunchedi) [14:53:27] PROBLEM - IPMI Sensor Status on db1090 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 2 = Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [14:53:41] !log switch grafana to eqiad - T259143 [14:53:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:53:50] T259143: Upgrade to Grafana 7 - https://phabricator.wikimedia.org/T259143 [14:56:18] (03PS1) 10Giuseppe Lavagetto: wikifeeds: use the service proxy everywhere [deployment-charts] - 10https://gerrit.wikimedia.org/r/627517 (https://phabricator.wikimedia.org/T255878) [14:57:35] 10Operations, 10LDAP-Access-Requests: Add Bereket teshome to the ldap/wmde and ldap/nda group - https://phabricator.wikimedia.org/T262921 (10bete) @RLazarus sure. 
will do [14:57:53] (03PS1) 10Volans: databases: remove leftover records from old hosts [dns] - 10https://gerrit.wikimedia.org/r/627518 (https://phabricator.wikimedia.org/T244153) [14:57:55] (03PS1) 10Volans: wmcs: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627519 (https://phabricator.wikimedia.org/T244153) [14:57:57] (03CR) 10Cwhite: [C: 03+2] update debian files to handle new prometheus-icinga-am service [debs/prometheus-icinga-exporter] - 10https://gerrit.wikimedia.org/r/627497 (owner: 10Filippo Giunchedi) [14:57:59] (03PS1) 10Volans: puppetdb: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627520 (https://phabricator.wikimedia.org/T244153) [14:58:01] (03PS1) 10Volans: ores: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627521 (https://phabricator.wikimedia.org/T244153) [14:58:03] (03PS1) 10Volans: k8s: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627522 (https://phabricator.wikimedia.org/T244153) [14:58:05] (03PS1) 10Volans: misc: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627523 (https://phabricator.wikimedia.org/T244153) [14:58:07] (03PS1) 10Volans: lvs: convert IPv6 PTR ORIGINs to /64 [dns] - 10https://gerrit.wikimedia.org/r/627524 (https://phabricator.wikimedia.org/T244153) [14:59:17] (03CR) 10Cwhite: [V: 03+2 C: 03+2] Merge branch 'upstream/0.10' into HEAD [debs/prometheus-icinga-exporter] - 10https://gerrit.wikimedia.org/r/627498 (owner: 10Filippo Giunchedi) [14:59:33] (03CR) 10Giuseppe Lavagetto: "I have tested the wikifeeds on staging quickly using service-checker." [deployment-charts] - 10https://gerrit.wikimedia.org/r/627517 (https://phabricator.wikimedia.org/T255878) (owner: 10Giuseppe Lavagetto) [14:59:35] (03CR) 10Ayounsi: "Reclaiming public IPs, nice!" 
[dns] - 10https://gerrit.wikimedia.org/r/627523 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [14:59:39] (03PS1) 10Ostrzyciel: Enable the reverted tag on all wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627526 (https://phabricator.wikimedia.org/T164307) [15:00:39] (03PS2) 10Ostrzyciel: Enable the reverted tag on all wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627526 (https://phabricator.wikimedia.org/T164307) [15:02:54] (03CR) 10Muehlenhoff: [C: 03+1] "Looks good, these were missed in T227640" [dns] - 10https://gerrit.wikimedia.org/r/627521 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:03:14] (03Abandoned) 10Cwhite: update debian files to handle new prometheus-icinga-am service [debs/prometheus-icinga-exporter] (debian/sid) - 10https://gerrit.wikimedia.org/r/626001 (owner: 10Cwhite) [15:04:28] (03CR) 10Muehlenhoff: [C: 03+1] "Looks good, these are obsolete now." [dns] - 10https://gerrit.wikimedia.org/r/627520 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:07:02] 10Operations, 10Prod-Kubernetes, 10serviceops, 10Kubernetes: Move eventgate-analytics to use TLS only - https://phabricator.wikimedia.org/T255870 (10JMeybohm) a:03JMeybohm [15:07:12] (03PS1) 10JMeybohm: lvs: Remove check_eventgate_analytics_http_cluster monitoring [puppet] - 10https://gerrit.wikimedia.org/r/627528 (https://phabricator.wikimedia.org/T255870) [15:07:15] (03PS1) 10JMeybohm: lvs: Remove eventgate-analytics non-TLS endpoint from LVS 1/2 [puppet] - 10https://gerrit.wikimedia.org/r/627529 (https://phabricator.wikimedia.org/T255870) [15:07:18] (03PS1) 10JMeybohm: lvs: Remove eventgate-analytics non-TLS endpoint from LVS 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/627530 (https://phabricator.wikimedia.org/T255870) [15:07:57] (03CR) 10Muehlenhoff: [C: 03+1] "All obsolete; radon was an authdns, polonium an MX, californium hosted wikitech and technetium/cygnus were VMs created for a pentest in 20" [dns] - 
10https://gerrit.wikimedia.org/r/627523 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:08:40] (03PS25) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [15:08:43] PROBLEM - ps1-603-eqsin-infeed-load-tower-B-single-phase on ps1-603-eqsin is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:45] PROBLEM - ps1-a1-eqiad-infeed-load-tower-B-phase-Z on ps1-a1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:45] PROBLEM - ps1-a8-eqiad-infeed-load-tower-B-phase-Z on ps1-a8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:45] PROBLEM - ps1-b6-eqiad-infeed-load-tower-B-phase-X on ps1-b6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:45] PROBLEM - ps1-b7-eqiad-infeed-load-tower-B-phase-Z on ps1-b7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:45] PROBLEM - ps1-d5-eqiad-infeed-load-tower-B-phase-Z on ps1-d5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:45] PROBLEM - ps1-c3-codfw-infeed-load-tower-B-phase-Z on ps1-c3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:49] PROBLEM - ps1-a5-eqiad-infeed-load-tower-A-phase-X on ps1-a5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing 
system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:49] PROBLEM - ps1-604-eqsin-infeed-load-tower-A-single-phase on ps1-604-eqsin is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:49] PROBLEM - Host ps1-c3-eqiad is DOWN: PING CRITICAL - Packet loss = 100% [15:08:49] PROBLEM - ps1-b6-eqiad-infeed-load-tower-B-phase-Y on ps1-b6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-604-eqsin-infeed-load-tower-B-single-phase on ps1-604-eqsin is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-a5-eqiad-infeed-load-tower-A-phase-Y on ps1-a5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-a4-eqiad-infeed-load-tower-A-phase-X on ps1-a4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-b3-eqiad-infeed-load-tower-A-phase-X on ps1-b3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-a8-codfw-infeed-load-tower-A-phase-X on ps1-a8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-d2-eqiad-infeed-load-tower-A-phase-Z on ps1-d2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:56] PROBLEM - ps1-a5-eqiad-infeed-load-tower-A-phase-Z on ps1-a5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:57] PROBLEM - ps1-d3-eqiad-infeed-load-tower-B-phase-Y on ps1-d3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:57] PROBLEM - ps1-b6-eqiad-infeed-load-tower-B-phase-Z on ps1-b6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:58] PROBLEM - ps1-22-ulsfo-infeed-load-tower-B-single-phase on ps1-22-ulsfo is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:58] PROBLEM - ps1-b2-eqiad-infeed-load-tower-B-phase-Z on ps1-b2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:59] PROBLEM - ps1-a1-eqiad-infeed-load-tower-A-phase-Y on ps1-a1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:08:59] PROBLEM - ps1-a2-eqiad-infeed-load-tower-B-phase-X on ps1-a2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:00] PROBLEM - ps1-a8-eqiad-infeed-load-tower-A-phase-Y on ps1-a8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:00] PROBLEM - ps1-b7-eqiad-infeed-load-tower-A-phase-Y on 
ps1-b7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:01] PROBLEM - ps1-b8-eqiad-infeed-load-tower-B-phase-X on ps1-b8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:01] PROBLEM - ps1-c3-codfw-infeed-load-tower-A-phase-Y on ps1-c3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:02] PROBLEM - ps1-c8-eqiad-infeed-load-tower-B-phase-Z on ps1-c8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:02] PROBLEM - ps1-a3-eqiad-infeed-load-tower-B-phase-Z on ps1-a3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:04] this is me, I'll extend downtime [15:09:19] PROBLEM - ps1-a4-eqiad-infeed-load-tower-B-phase-X on ps1-a4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:19] PROBLEM - ps1-a3-eqiad-infeed-load-tower-A-phase-Y on ps1-a3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:19] PROBLEM - ps1-a5-eqiad-infeed-load-tower-B-phase-Z on ps1-a5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:19] PROBLEM - ps1-a8-codfw-infeed-load-tower-B-phase-X on ps1-a8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:19] PROBLEM - ps1-b3-eqiad-infeed-load-tower-B-phase-X on ps1-b3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:19] PROBLEM - ps1-b2-eqiad-infeed-load-tower-A-phase-Y on ps1-b2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:19] PROBLEM - ps1-d2-eqiad-infeed-load-tower-B-phase-Z on ps1-d2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:20] PROBLEM - ps1-c8-eqiad-infeed-load-tower-A-phase-Y on ps1-c8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d3-eqiad-infeed-load-tower-B-phase-X on ps1-d3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d2-eqiad-infeed-load-tower-A-phase-Y on ps1-d2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d2-eqiad-infeed-load-tower-B-phase-X on ps1-d2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d3-eqiad-infeed-load-tower-A-phase-Z on ps1-d3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d4-eqiad-infeed-load-tower-A-phase-Y on 
ps1-d4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d4-eqiad-infeed-load-tower-A-phase-X on ps1-d4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:23] PROBLEM - ps1-d2-eqiad-infeed-load-tower-B-phase-Y on ps1-d2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:24] PROBLEM - ps1-d2-eqiad-infeed-load-tower-A-phase-X on ps1-d2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:24] PROBLEM - ps1-d4-eqiad-infeed-load-tower-B-phase-Z on ps1-d4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:25] PROBLEM - ps1-d4-eqiad-infeed-load-tower-B-phase-Y on ps1-d4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:25] PROBLEM - ps1-d4-eqiad-infeed-load-tower-B-phase-X on ps1-d4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:26] PROBLEM - ps1-d3-eqiad-infeed-load-tower-A-phase-Y on ps1-d3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:26] PROBLEM - ps1-d5-eqiad-infeed-load-tower-A-phase-Y on ps1-d5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:27] PROBLEM - ps1-d5-eqiad-infeed-load-tower-B-phase-X on ps1-d5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:27] PROBLEM - ps1-d5-eqiad-infeed-load-tower-B-phase-Y on ps1-d5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:28] PROBLEM - ps1-a2-eqiad-infeed-load-tower-A-phase-Y on ps1-a2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:28] PROBLEM - ps1-a3-eqiad-infeed-load-tower-B-phase-X on ps1-a3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:29] PROBLEM - ps1-b2-eqiad-infeed-load-tower-B-phase-X on ps1-b2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:29] PROBLEM - ps1-a4-eqiad-infeed-load-tower-B-phase-Z on ps1-a4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:30] PROBLEM - ps1-b3-eqiad-infeed-load-tower-B-phase-Z on ps1-b3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:30] PROBLEM - ps1-a8-codfw-infeed-load-tower-B-phase-Z on ps1-a8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:31] PROBLEM - ps1-b8-eqiad-infeed-load-tower-A-phase-Y on 
ps1-b8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:31] PROBLEM - ps1-c8-eqiad-infeed-load-tower-B-phase-X on ps1-c8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:32] PROBLEM - Host ps1-c2-eqiad is DOWN: PING CRITICAL - Packet loss = 100% [15:09:32] PROBLEM - ps1-a1-eqiad-infeed-load-tower-A-phase-Z on ps1-a1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:33] PROBLEM - ps1-a8-eqiad-infeed-load-tower-A-phase-Z on ps1-a8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:33] PROBLEM - ps1-a2-eqiad-infeed-load-tower-B-phase-Y on ps1-a2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:34] PROBLEM - ps1-b6-eqiad-infeed-load-tower-A-phase-X on ps1-b6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:34] PROBLEM - ps1-b7-eqiad-infeed-load-tower-A-phase-Z on ps1-b7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:35] PROBLEM - ps1-b8-eqiad-infeed-load-tower-B-phase-Y on ps1-b8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:35] PROBLEM - ps1-c3-codfw-infeed-load-tower-A-phase-Z on ps1-c3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system 
call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:36] PROBLEM - ps1-d5-eqiad-infeed-load-tower-A-phase-Z on ps1-d5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:47] 10Operations, 10LDAP-Access-Requests: Add Bereket teshome to the ldap/wmde and ldap/nda group - https://phabricator.wikimedia.org/T262921 (10conny-kawohl_WMDE) Hey @RLazarus, I am @bete's Engineering Manager and give my approval for this action [15:09:57] PROBLEM - ps1-a3-eqiad-infeed-load-tower-A-phase-X on ps1-a3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - ps1-a4-eqiad-infeed-load-tower-A-phase-Z on ps1-a4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - ps1-a5-eqiad-infeed-load-tower-B-phase-Y on ps1-a5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - ps1-b3-eqiad-infeed-load-tower-A-phase-Z on ps1-b3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - ps1-b2-eqiad-infeed-load-tower-A-phase-X on ps1-b2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - ps1-a8-codfw-infeed-load-tower-A-phase-Z on ps1-a8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - 
ps1-a1-eqiad-infeed-load-tower-B-phase-X on ps1-a1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:09:59] PROBLEM - ps1-a2-eqiad-infeed-load-tower-B-phase-Z on ps1-a2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:00] PROBLEM - ps1-a8-eqiad-infeed-load-tower-B-phase-X on ps1-a8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:00] PROBLEM - ps1-b6-eqiad-infeed-load-tower-A-phase-Y on ps1-b6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:01] PROBLEM - ps1-b8-eqiad-infeed-load-tower-B-phase-Z on ps1-b8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:01] PROBLEM - ps1-b7-eqiad-infeed-load-tower-B-phase-X on ps1-b7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:02] PROBLEM - ps1-c3-codfw-infeed-load-tower-B-phase-X on ps1-c3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:03] PROBLEM - ps1-c8-eqiad-infeed-load-tower-A-phase-X on ps1-c8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:09] PROBLEM - ps1-22-ulsfo-infeed-load-tower-A-single-phase on ps1-22-ulsfo is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:09] PROBLEM - ps1-a1-eqiad-infeed-load-tower-A-phase-X on ps1-a1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-a2-eqiad-infeed-load-tower-A-phase-Z on ps1-a2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-a8-eqiad-infeed-load-tower-A-phase-X on ps1-a8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-b7-eqiad-infeed-load-tower-A-phase-X on ps1-b7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-b2-eqiad-infeed-load-tower-B-phase-Y on ps1-b2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-a3-eqiad-infeed-load-tower-B-phase-Y on ps1-a3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-b8-eqiad-infeed-load-tower-A-phase-Z on ps1-b8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:11] PROBLEM - ps1-c3-codfw-infeed-load-tower-A-phase-X on ps1-c3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:12] PROBLEM - ps1-c8-eqiad-infeed-load-tower-B-phase-Y on 
ps1-c8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:12] PROBLEM - ps1-d5-eqiad-infeed-load-tower-A-phase-X on ps1-d5-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:18] (03PS1) 10Filippo Giunchedi: rsync: handle quickdatacopy cron cleanup when flipping source/dest [puppet] - 10https://gerrit.wikimedia.org/r/627531 [15:10:23] PROBLEM - ps1-a2-eqiad-infeed-load-tower-A-phase-X on ps1-a2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:23] PROBLEM - ps1-a4-eqiad-infeed-load-tower-B-phase-Y on ps1-a4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:23] PROBLEM - ps1-b8-eqiad-infeed-load-tower-A-phase-X on ps1-b8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:23] PROBLEM - ps1-a3-eqiad-infeed-load-tower-A-phase-Z on ps1-a3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:23] PROBLEM - ps1-b2-eqiad-infeed-load-tower-A-phase-Z on ps1-b2-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:23] PROBLEM - ps1-a8-codfw-infeed-load-tower-B-phase-Y on ps1-a8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:23] PROBLEM - ps1-b3-eqiad-infeed-load-tower-B-phase-Y on 
ps1-b3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:25] PROBLEM - ps1-603-eqsin-infeed-load-tower-A-single-phase on ps1-603-eqsin is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:25] PROBLEM - ps1-a1-eqiad-infeed-load-tower-B-phase-Y on ps1-a1-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:27] PROBLEM - ps1-a6-eqiad-infeed-load-tower-A-phase-X on ps1-a6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:27] PROBLEM - ps1-a7-eqiad-infeed-load-tower-A-phase-Z on ps1-a7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:27] PROBLEM - ps1-a8-eqiad-infeed-load-tower-B-phase-Y on ps1-a8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:27] PROBLEM - ps1-b6-eqiad-infeed-load-tower-A-phase-Z on ps1-b6-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:27] PROBLEM - ps1-b7-eqiad-infeed-load-tower-B-phase-Y on ps1-b7-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:27] PROBLEM - ps1-d3-eqiad-infeed-load-tower-A-phase-X on ps1-d3-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:28] PROBLEM - ps1-c8-eqiad-infeed-load-tower-A-phase-Z on ps1-c8-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:28] PROBLEM - ps1-c3-codfw-infeed-load-tower-B-phase-Y on ps1-c3-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:29] PROBLEM - ps1-d4-eqiad-infeed-load-tower-A-phase-Z on ps1-d4-eqiad is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:10:33] PROBLEM - Host ms-be1024 is DOWN: PING CRITICAL - Packet loss = 100% [15:11:23] PROBLEM - Host thanos-be1003 is DOWN: PING CRITICAL - Packet loss = 100% [15:12:23] PROBLEM - Host ms-be1024.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:13:40] 10Operations, 10observability, 10good first task: nagios-nrpe-server.service: systemd unit references path below legacy directory /var/run/ - https://phabricator.wikimedia.org/T252990 (10ema) 05Open→03Resolved >>! In T252990#6458970, @gerritbot wrote: > Change 621967 **merged** by Herron: > [operations/p... [15:13:41] At a previous company, we had a jar where you had to put in 5 EUR every time you caused a page because you rebooted a machine without a silence/maintenance window in. Around Christmas every year, we went to have dinner from that budget :) [15:14:04] lol [15:14:16] klausman: to a fast food place? 
[15:14:34] klausman: I'm pretty sure we can organize a conference with that money here :D [15:14:44] (03CR) 10Elukey: "After reading some docs, what I'd do is:" [puppet] - 10https://gerrit.wikimedia.org/r/626481 (https://phabricator.wikimedia.org/T213741) (owner: 10Razzi) [15:14:59] michelin restaurant for 30+ [15:15:49] RECOVERY - Host thanos-be1003 is UP: PING OK - Packet loss = 0%, RTA = 0.16 ms [15:15:55] marostegui: no, usually a nice italian restaurant [15:16:13] RECOVERY - Host ms-be1024 is UP: PING OK - Packet loss = 0%, RTA = 0.14 ms [15:16:53] Let's just say that the pager at that place was never as quiet as one would hope [15:17:21] PROBLEM - Host an-worker1106.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:18:25] PROBLEM - Host deploy1001.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:18:26] PROBLEM - Host labsdb1010.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:18:29] PROBLEM - Host ps1-c4-eqiad is DOWN: PING CRITICAL - Packet loss = 100% [15:18:31] PROBLEM - Host logstash1028.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:18:36] 10Operations, 10Prod-Kubernetes, 10serviceops, 10Kubernetes: Move eventgate-main to use TLS only - https://phabricator.wikimedia.org/T255873 (10JMeybohm) a:03JMeybohm [15:18:41] (03CR) 10Filippo Giunchedi: "PCC https://puppet-compiler.wmflabs.org/compiler1003/25075/" [puppet] - 10https://gerrit.wikimedia.org/r/627531 (owner: 10Filippo Giunchedi) [15:18:45] PROBLEM - Host ms-fe1007.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:09] PROBLEM - Host snapshot1006.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:29] PROBLEM - Host an-worker1089.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:29] PROBLEM - Host an-worker1100.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:29] PROBLEM - Host an-worker1108.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:29] PROBLEM - Host an-worker1107.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:45] PROBLEM - Host 
an-worker1090.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:19:49] PROBLEM - Host an-worker1105.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:20:05] PROBLEM - Host cp1084.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:20:09] PROBLEM - Host ms-be1025.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:20:15] PROBLEM - Host cp1083.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:20:19] (03PS2) 10Volans: lvs: convert IPv6 PTR ORIGINs to /64 [dns] - 10https://gerrit.wikimedia.org/r/627524 (https://phabricator.wikimedia.org/T244153) [15:20:25] PROBLEM - Host ms-be1054.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:21:29] RECOVERY - Host ps1-c3-eqiad is UP: PING WARNING - Packet loss = 90%, RTA = 1.29 ms [15:22:21] RECOVERY - Host ps1-c2-eqiad is UP: PING OK - Packet loss = 0%, RTA = 1.68 ms [15:23:23] PROBLEM - Host kafka-jumbo1005.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:23:23] PROBLEM - Host kafka-jumbo1007.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:23:27] PROBLEM - Host mwlog1001.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:23:32] (03PS1) 10JMeybohm: lvs: Remove check_eventgate_main_http_cluster monitoring [puppet] - 10https://gerrit.wikimedia.org/r/627536 (https://phabricator.wikimedia.org/T255873) [15:23:34] (03PS1) 10JMeybohm: lvs: Remove eventgate-main non-TLS endpoint from LVS 1/2 [puppet] - 10https://gerrit.wikimedia.org/r/627537 (https://phabricator.wikimedia.org/T255873) [15:23:36] (03PS1) 10JMeybohm: lvs: Remove eventgate-main non-TLS endpoint from LVS 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/627538 (https://phabricator.wikimedia.org/T255873) [15:24:07] PROBLEM - Host ores1006.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:24:13] PROBLEM - Host thanos-be1003.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:24:23] RECOVERY - IPMI Sensor Status on db1090 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [15:26:33] (03PS1) 10Ottomata: eventgate-logging-external - set schema_uri_query_param and stream_query_param [deployment-charts] - 10https://gerrit.wikimedia.org/r/627539 (https://phabricator.wikimedia.org/T262087) [15:26:37] !log manual install prometheus-icinga-exporter upgrade on icinga2001 [15:26:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:27:38] (03CR) 10Ottomata: [V: 03+2 C: 03+2] eventgate-logging-external - set schema_uri_query_param and stream_query_param [deployment-charts] - 10https://gerrit.wikimedia.org/r/627539 (https://phabricator.wikimedia.org/T262087) (owner: 10Ottomata) [15:28:21] !log otto@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [15:28:21] !log otto@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [15:28:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:28:28] (03CR) 10ProcrastinatingReader: "Saw the other patch has been merged - thanks! This one removes it from wmf-config, which isn't required anymore. 
I probably can't get arou" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/619991 (https://phabricator.wikimedia.org/T255506) (owner: 10ProcrastinatingReader) [15:28:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:28:46] (03PS1) 10Papaul: DHCP: Add MAC address for maps2005-maps2010 [puppet] - 10https://gerrit.wikimedia.org/r/627540 (https://phabricator.wikimedia.org/T260271) [15:29:29] PROBLEM - Host thanos-be1003 is DOWN: PING CRITICAL - Packet loss = 100% [15:29:56] (03PS1) 10JMeybohm: lvs: Remove proton non-TLS endpoint from LVS 1/2 [puppet] - 10https://gerrit.wikimedia.org/r/627541 (https://phabricator.wikimedia.org/T255877) [15:30:00] (03PS1) 10JMeybohm: lvs: Remove proton non-TLS endpoint from LVS 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/627542 (https://phabricator.wikimedia.org/T255877) [15:30:02] (03CR) 10BBlack: [C: 03+1] lvs: convert IPv6 PTR ORIGINs to /64 [dns] - 10https://gerrit.wikimedia.org/r/627524 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:30:08] (03CR) 10Papaul: [C: 03+2] DHCP: Add MAC address for maps2005-maps2010 [puppet] - 10https://gerrit.wikimedia.org/r/627540 (https://phabricator.wikimedia.org/T260271) (owner: 10Papaul) [15:30:10] (03PS26) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [15:30:17] RECOVERY - Host ores1006.mgmt is UP: PING WARNING - Packet loss = 90%, RTA = 2.36 ms [15:30:17] RECOVERY - Host an-worker1105.mgmt is UP: PING OK - Packet loss = 0%, RTA = 2.03 ms [15:30:32] !log otto@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [15:30:32] !log otto@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . 
[15:30:35] RECOVERY - Host deploy1001.mgmt is UP: PING OK - Packet loss = 0%, RTA = 2.24 ms [15:30:35] RECOVERY - Host labsdb1010.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.71 ms [15:30:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:30:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:30:43] RECOVERY - Host ms-be1024.mgmt is UP: PING OK - Packet loss = 0%, RTA = 3.00 ms [15:30:55] RECOVERY - Host ms-fe1007.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.52 ms [15:31:09] RECOVERY - ps1-d3-eqiad-infeed-load-tower-A-phase-X on ps1-d3-eqiad is OK: SNMP OK - ps1-d3-eqiad-infeed-load-tower-A-phase-X 274 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:31:15] RECOVERY - Host cp1084.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.04 ms [15:31:16] RECOVERY - Host snapshot1006.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.98 ms [15:31:21] (03PS2) 10Volans: ores: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627521 (https://phabricator.wikimedia.org/T244153) [15:31:35] RECOVERY - ps1-d3-eqiad-infeed-load-tower-B-phase-Y on ps1-d3-eqiad is OK: SNMP OK - ps1-d3-eqiad-infeed-load-tower-B-phase-Y 236 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:31:39] RECOVERY - Host an-worker1089.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms [15:31:39] RECOVERY - Host an-worker1100.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.09 ms [15:31:39] RECOVERY - Host an-worker1108.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.06 ms [15:31:39] RECOVERY - Host an-worker1107.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.10 ms [15:31:53] RECOVERY - Host an-worker1090.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.82 ms [15:32:03] RECOVERY - ps1-d3-eqiad-infeed-load-tower-A-phase-Z on ps1-d3-eqiad is OK: SNMP OK - ps1-d3-eqiad-infeed-load-tower-A-phase-Z 278 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:32:03] RECOVERY - ps1-d3-eqiad-infeed-load-tower-B-phase-X on ps1-d3-eqiad is OK: SNMP OK - ps1-d3-eqiad-infeed-load-tower-B-phase-X 281 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:32:03] RECOVERY - ps1-d3-eqiad-infeed-load-tower-A-phase-Y on ps1-d3-eqiad is OK: SNMP OK - ps1-d3-eqiad-infeed-load-tower-A-phase-Y 255 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:32:13] RECOVERY - Host ms-be1025.mgmt is UP: PING OK - Packet loss = 0%, RTA = 34.90 ms [15:32:21] RECOVERY - Host cp1083.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.11 ms [15:32:31] RECOVERY - Host ms-be1054.mgmt is UP: PING OK - Packet loss = 0%, RTA = 24.65 ms [15:32:47] PROBLEM - IPMI Sensor Status on kafka-jumbo1005 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [PS Redundancy = Critical, Status = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [15:32:53] (03CR) 10Volans: [C: 03+2] ores: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627521 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:33:03] !log otto@deploy1001 helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [15:33:03] !log otto@deploy1001 helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . 
[15:33:05] RECOVERY - Host mwlog1001.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.87 ms [15:33:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:33:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:33:33] (03PS2) 10Volans: puppetdb: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627520 (https://phabricator.wikimedia.org/T244153) [15:34:13] (03CR) 10Volans: [C: 03+2] puppetdb: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627520 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:35:23] RECOVERY - Host thanos-be1003 is UP: PING OK - Packet loss = 0%, RTA = 0.17 ms [15:35:27] RECOVERY - Host kafka-jumbo1005.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.84 ms [15:35:27] RECOVERY - Host kafka-jumbo1007.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.06 ms [15:35:33] RECOVERY - Host an-worker1106.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.05 ms [15:36:21] RECOVERY - Host thanos-be1003.mgmt is UP: PING OK - Packet loss = 0%, RTA = 2.00 ms [15:36:49] RECOVERY - Host logstash1028.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.24 ms [15:36:55] (03PS3) 10Volans: lvs: convert IPv6 PTR ORIGINs to /64 [dns] - 10https://gerrit.wikimedia.org/r/627524 (https://phabricator.wikimedia.org/T244153) [15:38:07] RECOVERY - ps1-22-ulsfo-infeed-load-tower-B-single-phase on ps1-22-ulsfo is OK: SNMP OK - ps1-22-ulsfo-infeed-load-tower-B-single-phase 445 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:38:18] (03PS1) 10RobH: setting ps1-c[23]-eqiad monitoring [puppet] - 10https://gerrit.wikimedia.org/r/627547 (https://phabricator.wikimedia.org/T261455) [15:38:23] !log ppchelko@deploy1001 Started deploy [restbase/deploy@f7cda70]: Fix analytics by-country endpoint [15:38:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:38:39] RECOVERY - 
ps1-22-ulsfo-infeed-load-tower-A-single-phase on ps1-22-ulsfo is OK: SNMP OK - ps1-22-ulsfo-infeed-load-tower-A-single-phase 552 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:38:52] (03CR) 10RobH: [C: 03+2] setting ps1-c[23]-eqiad monitoring [puppet] - 10https://gerrit.wikimedia.org/r/627547 (https://phabricator.wikimedia.org/T261455) (owner: 10RobH) [15:43:00] (03PS27) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [15:43:01] (03CR) 10Volans: [C: 03+2] lvs: convert IPv6 PTR ORIGINs to /64 [dns] - 10https://gerrit.wikimedia.org/r/627524 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [15:43:56] (03PS5) 10Reedy: Add security.wikimedia.org microsite [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) [15:44:57] (03CR) 10Cwhite: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/627500 (https://phabricator.wikimedia.org/T258948) (owner: 10Filippo Giunchedi) [15:45:59] RECOVERY - ps1-a4-eqiad-infeed-load-tower-B-phase-Z on ps1-a4-eqiad is OK: SNMP OK - ps1-a4-eqiad-infeed-load-tower-B-phase-Z 370 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:46:03] RECOVERY - ps1-a4-eqiad-infeed-load-tower-A-phase-X on ps1-a4-eqiad is OK: SNMP OK - ps1-a4-eqiad-infeed-load-tower-A-phase-X 295 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:46:03] RECOVERY - ps1-a4-eqiad-infeed-load-tower-B-phase-X on ps1-a4-eqiad is OK: SNMP OK - ps1-a4-eqiad-infeed-load-tower-B-phase-X 330 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:46:13] RECOVERY - ps1-a4-eqiad-infeed-load-tower-A-phase-Z on ps1-a4-eqiad is OK: SNMP OK - ps1-a4-eqiad-infeed-load-tower-A-phase-Z 326 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:46:26] RECOVERY - IPMI Sensor Status on ms-be1024 
is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [15:47:35] RECOVERY - ps1-b6-eqiad-infeed-load-tower-A-phase-X on ps1-b6-eqiad is OK: SNMP OK - ps1-b6-eqiad-infeed-load-tower-A-phase-X 292 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:47:43] RECOVERY - ps1-b6-eqiad-infeed-load-tower-A-phase-Y on ps1-b6-eqiad is OK: SNMP OK - ps1-b6-eqiad-infeed-load-tower-A-phase-Y 268 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:49:03] RECOVERY - ps1-b2-eqiad-infeed-load-tower-A-phase-X on ps1-b2-eqiad is OK: SNMP OK - ps1-b2-eqiad-infeed-load-tower-A-phase-X 331 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:49:07] RECOVERY - ps1-b2-eqiad-infeed-load-tower-B-phase-X on ps1-b2-eqiad is OK: SNMP OK - ps1-b2-eqiad-infeed-load-tower-B-phase-X 310 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:49:15] RECOVERY - ps1-a4-eqiad-infeed-load-tower-B-phase-Y on ps1-a4-eqiad is OK: SNMP OK - ps1-a4-eqiad-infeed-load-tower-B-phase-Y 258 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:49:15] RECOVERY - ps1-b2-eqiad-infeed-load-tower-B-phase-Z on ps1-b2-eqiad is OK: SNMP OK - ps1-b2-eqiad-infeed-load-tower-B-phase-Z 449 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:49:15] RECOVERY - ps1-b6-eqiad-infeed-load-tower-B-phase-X on ps1-b6-eqiad is OK: SNMP OK - ps1-b6-eqiad-infeed-load-tower-B-phase-X 292 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:49:41] 10Operations, 10ops-eqiad, 10netops: eqiad row D switch fabric recabling - https://phabricator.wikimedia.org/T256112 (10ayounsi) [15:50:26] 10Operations, 10ops-eqiad, 10DBA, 10netops, and 2 others: Upgrade eqiad rack D4 to 10G switch - 
https://phabricator.wikimedia.org/T196487 (10ayounsi) [15:51:27] RECOVERY - ps1-b6-eqiad-infeed-load-tower-B-phase-Y on ps1-b6-eqiad is OK: SNMP OK - ps1-b6-eqiad-infeed-load-tower-B-phase-Y 243 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:52:07] RECOVERY - ps1-a8-eqiad-infeed-load-tower-A-phase-X on ps1-a8-eqiad is OK: SNMP OK - ps1-a8-eqiad-infeed-load-tower-A-phase-X 69 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:52:07] RECOVERY - ps1-a8-eqiad-infeed-load-tower-B-phase-X on ps1-a8-eqiad is OK: SNMP OK - ps1-a8-eqiad-infeed-load-tower-B-phase-X 139 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:52:13] RECOVERY - ps1-a8-eqiad-infeed-load-tower-A-phase-Z on ps1-a8-eqiad is OK: SNMP OK - ps1-a8-eqiad-infeed-load-tower-A-phase-Z 345 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:52:36] ... i dunno why that alerts [15:52:49] we aren't working in row a, but icinga just restarted due to updates in puppet [15:53:13] maybe due to snmp rotation today? 
[15:53:39] RECOVERY - ps1-a6-eqiad-infeed-load-tower-A-phase-Z on ps1-a6-eqiad is OK: SNMP OK - ps1-a6-eqiad-infeed-load-tower-A-phase-Z 364 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:53:39] RECOVERY - ps1-b6-eqiad-infeed-load-tower-B-phase-Z on ps1-b6-eqiad is OK: SNMP OK - ps1-b6-eqiad-infeed-load-tower-B-phase-Z 281 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:53:45] RECOVERY - ps1-a6-eqiad-infeed-load-tower-A-phase-Y on ps1-a6-eqiad is OK: SNMP OK - ps1-a6-eqiad-infeed-load-tower-A-phase-Y 254 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:53:47] RECOVERY - ps1-a6-eqiad-infeed-load-tower-A-phase-X on ps1-a6-eqiad is OK: SNMP OK - ps1-a6-eqiad-infeed-load-tower-A-phase-X 286 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:55:07] RECOVERY - ps1-a2-eqiad-infeed-load-tower-A-phase-Y on ps1-a2-eqiad is OK: SNMP OK - ps1-a2-eqiad-infeed-load-tower-A-phase-Y 313 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:55:07] RECOVERY - ps1-a2-eqiad-infeed-load-tower-B-phase-Y on ps1-a2-eqiad is OK: SNMP OK - ps1-a2-eqiad-infeed-load-tower-B-phase-Y 323 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:55:12] robh: yes this is the snmp rotate script fixing the stragglers [15:55:13] RECOVERY - ps1-a2-eqiad-infeed-load-tower-B-phase-Z on ps1-a2-eqiad is OK: SNMP OK - ps1-a2-eqiad-infeed-load-tower-B-phase-Z 302 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:55:13] RECOVERY - ps1-a2-eqiad-infeed-load-tower-A-phase-Z on ps1-a2-eqiad is OK: SNMP OK - ps1-a2-eqiad-infeed-load-tower-A-phase-Z 280 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:55:19] (03PS6) 10Reedy: Add security.wikimedia.org microsite [puppet] - 10https://gerrit.wikimedia.org/r/612279 
(https://phabricator.wikimedia.org/T257834) [15:55:23] cool [15:55:44] 10Operations, 10LDAP-Access-Requests: Add Bereket teshome to the ldap/wmde and ldap/nda group - https://phabricator.wikimedia.org/T262921 (10RLazarus) @conny-kawohl_WMDE Thanks! [15:55:49] RECOVERY - ps1-a6-eqiad-infeed-load-tower-B-phase-X on ps1-a6-eqiad is OK: SNMP OK - ps1-a6-eqiad-infeed-load-tower-B-phase-X 330 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:56:22] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops, 10Patch-For-Review: New Date - Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C2 and C3 - https://phabricator.wikimedia.org/T261455 (10RobH) [15:56:31] 10Operations, 10ops-eqiad, 10Analytics-Radar: an-presto1004 down - https://phabricator.wikimedia.org/T253438 (10wiki_willy) Let me check with @Cmjohnson . He's tied up with PDU upgrades this week, and he's out on vacation half of next week. But let's see if we can at least get a timeframe for you. Thanks,... [15:56:33] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: New Date - Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C2 and C3 - https://phabricator.wikimedia.org/T261455 (10RobH) [15:56:41] RECOVERY - ps1-b3-eqiad-infeed-load-tower-A-phase-Z on ps1-b3-eqiad is OK: SNMP OK - ps1-b3-eqiad-infeed-load-tower-A-phase-Z 297 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:56:47] RECOVERY - ps1-b3-eqiad-infeed-load-tower-B-phase-Z on ps1-b3-eqiad is OK: SNMP OK - ps1-b3-eqiad-infeed-load-tower-B-phase-Z 308 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:56:54] (03CR) 10Alexandros Kosiaris: "> Patch Set 1: Code-Review-1" (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/627439 (owner: 10Alexandros Kosiaris) [15:57:08] (03PS2) 10Alexandros Kosiaris: prometheus: Scrape k8s etcd nodes [puppet] - 10https://gerrit.wikimedia.org/r/627439 [15:58:05] RECOVERY - ps1-a6-eqiad-infeed-load-tower-B-phase-Y on ps1-a6-eqiad is OK: SNMP OK 
- ps1-a6-eqiad-infeed-load-tower-B-phase-Y 255 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:58:05] RECOVERY - ps1-b3-eqiad-infeed-load-tower-A-phase-X on ps1-b3-eqiad is OK: SNMP OK - ps1-b3-eqiad-infeed-load-tower-A-phase-X 456 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:58:15] RECOVERY - ps1-a7-eqiad-infeed-load-tower-B-phase-Z on ps1-a7-eqiad is OK: SNMP OK - ps1-a7-eqiad-infeed-load-tower-B-phase-Z 422 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:58:25] RECOVERY - ps1-a7-eqiad-infeed-load-tower-A-phase-X on ps1-a7-eqiad is OK: SNMP OK - ps1-a7-eqiad-infeed-load-tower-A-phase-X 333 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:58:39] RECOVERY - ps1-a7-eqiad-infeed-load-tower-A-phase-Y on ps1-a7-eqiad is OK: SNMP OK - ps1-a7-eqiad-infeed-load-tower-A-phase-Y 309 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:58:57] 10Operations, 10ops-eqiad, 10DC-Ops: Physically move db1131 from B5 to C8 - https://phabricator.wikimedia.org/T262901 (10wiki_willy) a:05wiki_willy→03Cmjohnson @Cmjohnson - are you able to work with @Marostegui on moving this host, maybe after one of the PDU upgrades this week? Thanks, Willy [15:59:04] (03PS1) 10Reedy: Remove old list of sites listed on miscweb [puppet] - 10https://gerrit.wikimedia.org/r/627551 [15:59:25] (03CR) 10Elukey: "https://puppet-compiler.wmflabs.org/compiler1002/25077/miscweb1002.eqiad.wmnet/index.html" [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) (owner: 10Reedy) [15:59:56] (03CR) 10Alexandros Kosiaris: [C: 03+1] k8s: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627522 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [16:00:04] jbond42 and cdanis: #bothumor I � Unicode. All rise for Puppet request window deploy. 
(https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1600). [16:00:17] RECOVERY - ps1-603-eqsin-infeed-load-tower-A-single-phase on ps1-603-eqsin is OK: SNMP OK - ps1-603-eqsin-infeed-load-tower-A-single-phase 746 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:00:29] (03CR) 10Cwhite: prometheus: Scrape k8s etcd nodes (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/627439 (owner: 10Alexandros Kosiaris) [16:00:31] (03PS1) 10Papaul: Add maps20(0[5-9]|1[0]) to site.pp [puppet] - 10https://gerrit.wikimedia.org/r/627553 (https://phabricator.wikimedia.org/T260271) [16:01:23] RECOVERY - ps1-a1-eqiad-infeed-load-tower-A-phase-Z on ps1-a1-eqiad is OK: SNMP OK - ps1-a1-eqiad-infeed-load-tower-A-phase-Z 371 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:01:25] RECOVERY - ps1-a1-eqiad-infeed-load-tower-B-phase-Z on ps1-a1-eqiad is OK: SNMP OK - ps1-a1-eqiad-infeed-load-tower-B-phase-Z 365 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:01:25] RECOVERY - ps1-a1-eqiad-infeed-load-tower-B-phase-Y on ps1-a1-eqiad is OK: SNMP OK - ps1-a1-eqiad-infeed-load-tower-B-phase-Y 319 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:01:27] RECOVERY - ps1-a1-eqiad-infeed-load-tower-B-phase-X on ps1-a1-eqiad is OK: SNMP OK - ps1-a1-eqiad-infeed-load-tower-B-phase-X 146 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:01:33] RECOVERY - ps1-a1-eqiad-infeed-load-tower-A-phase-X on ps1-a1-eqiad is OK: SNMP OK - ps1-a1-eqiad-infeed-load-tower-A-phase-X 106 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:02:27] RECOVERY - ps1-603-eqsin-infeed-load-tower-B-single-phase on ps1-603-eqsin is OK: SNMP OK - ps1-603-eqsin-infeed-load-tower-B-single-phase 207 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook 
[16:02:45] RECOVERY - ps1-d5-eqiad-infeed-load-tower-A-phase-X on ps1-d5-eqiad is OK: SNMP OK - ps1-d5-eqiad-infeed-load-tower-A-phase-X 265 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:02:51] RECOVERY - ps1-d5-eqiad-infeed-load-tower-B-phase-X on ps1-d5-eqiad is OK: SNMP OK - ps1-d5-eqiad-infeed-load-tower-B-phase-X 237 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:02:51] RECOVERY - ps1-d5-eqiad-infeed-load-tower-A-phase-Y on ps1-d5-eqiad is OK: SNMP OK - ps1-d5-eqiad-infeed-load-tower-A-phase-Y 241 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:02:51] RECOVERY - ps1-d5-eqiad-infeed-load-tower-B-phase-Y on ps1-d5-eqiad is OK: SNMP OK - ps1-d5-eqiad-infeed-load-tower-B-phase-Y 207 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:02:55] RECOVERY - ps1-d5-eqiad-infeed-load-tower-A-phase-Z on ps1-d5-eqiad is OK: SNMP OK - ps1-d5-eqiad-infeed-load-tower-A-phase-Z 319 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:03:11] RECOVERY - IPMI Sensor Status on kafka-jumbo1005 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:03:35] (03CR) 10Jbond: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/627514 (owner: 10Ayounsi) [16:04:29] RECOVERY - ps1-a3-eqiad-infeed-load-tower-B-phase-X on ps1-a3-eqiad is OK: SNMP OK - ps1-a3-eqiad-infeed-load-tower-B-phase-X 332 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:31] RECOVERY - ps1-a3-eqiad-infeed-load-tower-A-phase-Y on ps1-a3-eqiad is OK: SNMP OK - ps1-a3-eqiad-infeed-load-tower-A-phase-Y 362 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:35] RECOVERY - ps1-a3-eqiad-infeed-load-tower-A-phase-X on ps1-a3-eqiad is OK: SNMP 
OK - ps1-a3-eqiad-infeed-load-tower-A-phase-X 377 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:35] RECOVERY - ps1-a3-eqiad-infeed-load-tower-B-phase-Y on ps1-a3-eqiad is OK: SNMP OK - ps1-a3-eqiad-infeed-load-tower-B-phase-Y 356 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:45] RECOVERY - ps1-a2-eqiad-infeed-load-tower-A-phase-X on ps1-a2-eqiad is OK: SNMP OK - ps1-a2-eqiad-infeed-load-tower-A-phase-X 428 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:45] RECOVERY - ps1-b3-eqiad-infeed-load-tower-B-phase-X on ps1-b3-eqiad is OK: SNMP OK - ps1-b3-eqiad-infeed-load-tower-B-phase-X 419 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:45] RECOVERY - ps1-a3-eqiad-infeed-load-tower-B-phase-Z on ps1-a3-eqiad is OK: SNMP OK - ps1-a3-eqiad-infeed-load-tower-B-phase-Z 318 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:45] RECOVERY - ps1-604-eqsin-infeed-load-tower-A-single-phase on ps1-604-eqsin is OK: SNMP OK - ps1-604-eqsin-infeed-load-tower-A-single-phase 634 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:04:59] RECOVERY - ps1-a3-eqiad-infeed-load-tower-A-phase-Z on ps1-a3-eqiad is OK: SNMP OK - ps1-a3-eqiad-infeed-load-tower-A-phase-Z 296 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:05:59] RECOVERY - ps1-b8-eqiad-infeed-load-tower-A-phase-X on ps1-b8-eqiad is OK: SNMP OK - ps1-b8-eqiad-infeed-load-tower-A-phase-X 348 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:01] RECOVERY - ps1-b8-eqiad-infeed-load-tower-A-phase-Y on ps1-b8-eqiad is OK: SNMP OK - ps1-b8-eqiad-infeed-load-tower-A-phase-Y 445 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:01] RECOVERY - 
ps1-b8-eqiad-infeed-load-tower-B-phase-Y on ps1-b8-eqiad is OK: SNMP OK - ps1-b8-eqiad-infeed-load-tower-B-phase-Y 311 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:03] RECOVERY - ps1-b8-eqiad-infeed-load-tower-A-phase-Z on ps1-b8-eqiad is OK: SNMP OK - ps1-b8-eqiad-infeed-load-tower-A-phase-Z 257 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:05] RECOVERY - ps1-b8-eqiad-infeed-load-tower-B-phase-Z on ps1-b8-eqiad is OK: SNMP OK - ps1-b8-eqiad-infeed-load-tower-B-phase-Z 308 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:22] (03PS7) 10Reedy: Add security.wikimedia.org microsite [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) [16:06:27] RECOVERY - ps1-b8-eqiad-infeed-load-tower-B-phase-X on ps1-b8-eqiad is OK: SNMP OK - ps1-b8-eqiad-infeed-load-tower-B-phase-X 385 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:44] (03PS2) 10Volans: k8s: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627522 (https://phabricator.wikimedia.org/T244153) [16:06:55] RECOVERY - ps1-a8-eqiad-infeed-load-tower-A-phase-Y on ps1-a8-eqiad is OK: SNMP OK - ps1-a8-eqiad-infeed-load-tower-A-phase-Y 324 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:55] RECOVERY - ps1-b3-eqiad-infeed-load-tower-B-phase-Y on ps1-b3-eqiad is OK: SNMP OK - ps1-b3-eqiad-infeed-load-tower-B-phase-Y 365 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:55] RECOVERY - ps1-d5-eqiad-infeed-load-tower-B-phase-Z on ps1-d5-eqiad is OK: SNMP OK - ps1-d5-eqiad-infeed-load-tower-B-phase-Z 291 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:06:55] RECOVERY - ps1-604-eqsin-infeed-load-tower-B-single-phase on ps1-604-eqsin is OK: SNMP OK - 
ps1-604-eqsin-infeed-load-tower-B-single-phase 306 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:07:27] RECOVERY - ps1-d4-eqiad-infeed-load-tower-A-phase-Y on ps1-d4-eqiad is OK: SNMP OK - ps1-d4-eqiad-infeed-load-tower-A-phase-Y 199 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:07:27] RECOVERY - ps1-d4-eqiad-infeed-load-tower-B-phase-Z on ps1-d4-eqiad is OK: SNMP OK - ps1-d4-eqiad-infeed-load-tower-B-phase-Z 214 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:07:27] RECOVERY - ps1-d4-eqiad-infeed-load-tower-A-phase-X on ps1-d4-eqiad is OK: SNMP OK - ps1-d4-eqiad-infeed-load-tower-A-phase-X 159 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:07:27] RECOVERY - ps1-d4-eqiad-infeed-load-tower-B-phase-X on ps1-d4-eqiad is OK: SNMP OK - ps1-d4-eqiad-infeed-load-tower-B-phase-X 174 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:07:27] RECOVERY - ps1-d4-eqiad-infeed-load-tower-B-phase-Y on ps1-d4-eqiad is OK: SNMP OK - ps1-d4-eqiad-infeed-load-tower-B-phase-Y 172 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:09:01] RECOVERY - ps1-b7-eqiad-infeed-load-tower-B-phase-X on ps1-b7-eqiad is OK: SNMP OK - ps1-b7-eqiad-infeed-load-tower-B-phase-X 386 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:09:01] RECOVERY - ps1-b7-eqiad-infeed-load-tower-A-phase-X on ps1-b7-eqiad is OK: SNMP OK - ps1-b7-eqiad-infeed-load-tower-A-phase-X 410 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:09:05] RECOVERY - ps1-b7-eqiad-infeed-load-tower-A-phase-Z on ps1-b7-eqiad is OK: SNMP OK - ps1-b7-eqiad-infeed-load-tower-A-phase-Z 325 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:09:27] RECOVERY - ps1-b7-eqiad-infeed-load-tower-A-phase-Y on 
ps1-b7-eqiad is OK: SNMP OK - ps1-b7-eqiad-infeed-load-tower-A-phase-Y 496 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:10:27] (03PS8) 10Jbond: Add security.wikimedia.org microsite [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) (owner: 10Reedy) [16:10:31] RECOVERY - ps1-a5-eqiad-infeed-load-tower-A-phase-Y on ps1-a5-eqiad is OK: SNMP OK - ps1-a5-eqiad-infeed-load-tower-A-phase-Y 300 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:10:39] RECOVERY - ps1-a5-eqiad-infeed-load-tower-B-phase-Y on ps1-a5-eqiad is OK: SNMP OK - ps1-a5-eqiad-infeed-load-tower-B-phase-Y 266 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:10:46] (03CR) 10Volans: [C: 03+2] k8s: remove leftover records for old hosts [dns] - 10https://gerrit.wikimedia.org/r/627522 (https://phabricator.wikimedia.org/T244153) (owner: 10Volans) [16:10:47] RECOVERY - ps1-a5-eqiad-infeed-load-tower-A-phase-X on ps1-a5-eqiad is OK: SNMP OK - ps1-a5-eqiad-infeed-load-tower-A-phase-X 298 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:04] 10Operations, 10ops-codfw, 10DBA, 10Patch-For-Review, 10User-Kormat: db2125 crashed - mgmt iface also not available - https://phabricator.wikimedia.org/T260670 (10Papaul) @Marostegui please see below . Asking me to do what we already did when we first had the problem which was to upgrade the server Firmw... 
[16:11:05] RECOVERY - ps1-a5-eqiad-infeed-load-tower-A-phase-Z on ps1-a5-eqiad is OK: SNMP OK - ps1-a5-eqiad-infeed-load-tower-A-phase-Z 275 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:23] RECOVERY - ps1-a5-eqiad-infeed-load-tower-B-phase-Z on ps1-a5-eqiad is OK: SNMP OK - ps1-a5-eqiad-infeed-load-tower-B-phase-Z 300 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:23] RECOVERY - ps1-b2-eqiad-infeed-load-tower-A-phase-Y on ps1-b2-eqiad is OK: SNMP OK - ps1-b2-eqiad-infeed-load-tower-A-phase-Y 447 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:23] RECOVERY - ps1-b7-eqiad-infeed-load-tower-B-phase-Y on ps1-b7-eqiad is OK: SNMP OK - ps1-b7-eqiad-infeed-load-tower-B-phase-Y 383 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:23] RECOVERY - ps1-a2-eqiad-infeed-load-tower-B-phase-X on ps1-a2-eqiad is OK: SNMP OK - ps1-a2-eqiad-infeed-load-tower-B-phase-X 393 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:23] RECOVERY - ps1-d4-eqiad-infeed-load-tower-A-phase-Z on ps1-d4-eqiad is OK: SNMP OK - ps1-d4-eqiad-infeed-load-tower-A-phase-Z 193 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:11:31] (03CR) 10Ottomata: analytics_cluster/turnilo: Configure url shortner (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/622600 (https://phabricator.wikimedia.org/T233336) (owner: 10Milimetric) [16:12:11] PROBLEM - IPMI Sensor Status on mc1032 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 2 = Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:12:22] (03PS9) 10Jbond: Add security.wikimedia.org microsite [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) 
(owner: 10Reedy) [16:12:37] (03CR) 10Jbond: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) (owner: 10Reedy) [16:13:23] PROBLEM - Host ps1-c5-eqiad is DOWN: PING CRITICAL - Packet loss = 100% [16:13:35] RECOVERY - ps1-a7-eqiad-infeed-load-tower-A-phase-Z on ps1-a7-eqiad is OK: SNMP OK - ps1-a7-eqiad-infeed-load-tower-A-phase-Z 413 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:13:35] RECOVERY - ps1-a8-eqiad-infeed-load-tower-B-phase-Y on ps1-a8-eqiad is OK: SNMP OK - ps1-a8-eqiad-infeed-load-tower-B-phase-Y 256 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:13:35] RECOVERY - ps1-b7-eqiad-infeed-load-tower-B-phase-Z on ps1-b7-eqiad is OK: SNMP OK - ps1-b7-eqiad-infeed-load-tower-B-phase-Z 556 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:13:35] RECOVERY - ps1-b2-eqiad-infeed-load-tower-A-phase-Z on ps1-b2-eqiad is OK: SNMP OK - ps1-b2-eqiad-infeed-load-tower-A-phase-Z 543 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:14:05] 10Operations, 10ops-codfw, 10DBA, 10Patch-For-Review, 10User-Kormat: db2125 crashed - mgmt iface also not available - https://phabricator.wikimedia.org/T260670 (10Marostegui) @wiki_willy can we escalate this to someone else? This seems a bit of a loop over a kinda new server that shows the same HW error... 
[16:14:21] PROBLEM - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is CRITICAL: /v2/suggest/sections/{title}/{from}/{to} (Suggest source sections to translate) is CRITICAL: Test Suggest source sections to translate returned the unexpected status 503 (expecting: 200) https://wikitech.wikimedia.org/wiki/CX [16:15:03] PROBLEM - Host dbproxy1019.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:15:36] (03CR) 10Jbond: [C: 03+2] Add security.wikimedia.org microsite [puppet] - 10https://gerrit.wikimedia.org/r/612279 (https://phabricator.wikimedia.org/T257834) (owner: 10Reedy) [16:15:37] PROBLEM - Host cloudmetrics1001.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:15:41] RECOVERY - ps1-a7-eqiad-infeed-load-tower-B-phase-X on ps1-a7-eqiad is OK: SNMP OK - ps1-a7-eqiad-infeed-load-tower-B-phase-X 372 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:15:41] RECOVERY - ps1-a7-eqiad-infeed-load-tower-B-phase-Y on ps1-a7-eqiad is OK: SNMP OK - ps1-a7-eqiad-infeed-load-tower-B-phase-Y 329 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:15:43] RECOVERY - ps1-a8-eqiad-infeed-load-tower-B-phase-Z on ps1-a8-eqiad is OK: SNMP OK - ps1-a8-eqiad-infeed-load-tower-B-phase-Z 328 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:15:43] RECOVERY - ps1-a1-eqiad-infeed-load-tower-A-phase-Y on ps1-a1-eqiad is OK: SNMP OK - ps1-a1-eqiad-infeed-load-tower-A-phase-Y 278 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:15:45] PROBLEM - Host elastic1041.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:08] 04Critical Alert for device ps1-b7-codfw.mgmt.codfw.wmnet - Device rebooted got acknowledged [16:16:13] RECOVERY - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX [16:16:13] PROBLEM - Host an-conf1002.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:15] 
PROBLEM - IPMI Sensor Status on mc1030 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 2 = Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:16:19] PROBLEM - Host an-test-worker1002.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:21] PROBLEM - Host aqs1005.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:23] PROBLEM - Host dbproxy1018.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:25] PROBLEM - Host mc1032.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:27] PROBLEM - Host druid1002.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:16:57] PROBLEM - Host relforge1002.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:17:16] C5 ^ [16:17:38] PROBLEM - Host elastic1040.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:17:38] PROBLEM - Host elastic1043.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:17:50] PROBLEM - Host cloudcontrol1005.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:10] PROBLEM - Host elastic1042.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:12] PROBLEM - Host es1022.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:12] PROBLEM - Host wdqs1013.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:14] PROBLEM - Host wtp1037.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:14] PROBLEM - Host wtp1038.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:16] PROBLEM - Host wtp1039.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:26] PROBLEM - Host kubernetes1012.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:30] PROBLEM - Host labstore1005.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:18:38] PROBLEM - Host mc1031.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:19:22] RECOVERY - Host cloudcontrol1005.mgmt is UP: PING OK - Packet loss = 0%, RTA = 2.06 ms [16:19:33] this was all c5 mgmt...fixed [16:19:50] RECOVERY - Host 
elastic1042.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms [16:20:08] RECOVERY - Host dbproxy1019.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.04 ms [16:20:42] RECOVERY - Host cloudmetrics1001.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.72 ms [16:21:16] RECOVERY - Host an-conf1002.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.04 ms [16:21:22] RECOVERY - Host an-test-worker1002.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.08 ms [16:21:24] RECOVERY - Host aqs1005.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.77 ms [16:21:26] RECOVERY - Host dbproxy1018.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.05 ms [16:21:28] RECOVERY - Host mc1032.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.73 ms [16:21:30] RECOVERY - Host druid1002.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.77 ms [16:22:02] RECOVERY - Host relforge1002.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.73 ms [16:22:04] RECOVERY - ps1-b2-eqiad-infeed-load-tower-B-phase-Y on ps1-b2-eqiad is OK: SNMP OK - ps1-b2-eqiad-infeed-load-tower-B-phase-Y 489 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:23:14] RECOVERY - Host elastic1040.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms [16:23:14] RECOVERY - Host elastic1041.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms [16:23:16] RECOVERY - Host elastic1043.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms [16:23:16] RECOVERY - Host es1022.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.01 ms [16:23:16] RECOVERY - Host wdqs1013.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.03 ms [16:23:18] RECOVERY - Host wtp1037.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.84 ms [16:23:18] RECOVERY - Host wtp1038.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.76 ms [16:23:20] RECOVERY - Host wtp1039.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.82 ms [16:23:30] PROBLEM - IPMI Sensor Status on aqs1005 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 2 = 
Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:23:32] RECOVERY - Host kubernetes1012.mgmt is UP: PING OK - Packet loss = 0%, RTA = 1.10 ms [16:23:36] RECOVERY - Host labstore1005.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.80 ms [16:23:46] RECOVERY - Host mc1031.mgmt is UP: PING OK - Packet loss = 0%, RTA = 0.73 ms [16:24:11] (03PS1) 10Herron: mx: remove spamhaus dnsbl lookups [puppet] - 10https://gerrit.wikimedia.org/r/627557 (https://phabricator.wikimedia.org/T262642) [16:26:28] PROBLEM - IPMI Sensor Status on mc1031 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Power Supply 2 = Critical, Power Supplies = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:28:32] (03PS1) 10RobH: ps1-c[45]-eqiad update [puppet] - 10https://gerrit.wikimedia.org/r/627558 (https://phabricator.wikimedia.org/T261456) [16:28:57] (03CR) 10jerkins-bot: [V: 04-1] ps1-c[45]-eqiad update [puppet] - 10https://gerrit.wikimedia.org/r/627558 (https://phabricator.wikimedia.org/T261456) (owner: 10RobH) [16:30:22] (03PS2) 10RobH: ps1-c[45]-eqiad update [puppet] - 10https://gerrit.wikimedia.org/r/627558 (https://phabricator.wikimedia.org/T261456) [16:31:40] (03PS1) 10Hnowlan: api-gateway: limit persistent connections to attempt to address SSL issues. [deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) [16:31:42] (03CR) 10RobH: [C: 03+2] ps1-c[45]-eqiad update [puppet] - 10https://gerrit.wikimedia.org/r/627558 (https://phabricator.wikimedia.org/T261456) (owner: 10RobH) [16:31:56] (03CR) 10jerkins-bot: [V: 04-1] api-gateway: limit persistent connections to attempt to address SSL issues. 
[deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:32:42] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops, 10Patch-For-Review: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (10RobH) [16:32:44] (03CR) 10Papaul: [C: 03+2] Add maps20(0[5-9]|1[0]) to site.pp [puppet] - 10https://gerrit.wikimedia.org/r/627553 (https://phabricator.wikimedia.org/T260271) (owner: 10Papaul) [16:32:48] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (10RobH) [16:34:33] (03PS2) 10Hnowlan: api-gateway: limit persistent connections to attempt to address SSL issues. [deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) [16:34:40] 10Operations, 10ops-codfw, 10DC-Ops, 10Maps, 10Patch-For-Review: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet - https://phabricator.wikimedia.org/T260271 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts: ` maps2005.codfw.wmnet ` The l... [16:36:03] (03Abandoned) 10Majavah: Add arbcom_ruwiki to $private_wikis [puppet] - 10https://gerrit.wikimedia.org/r/627257 (https://phabricator.wikimedia.org/T262812) (owner: 10Majavah) [16:36:05] (03Abandoned) 10Hnowlan: api-gateway: disable connection reuse [deployment-charts] - 10https://gerrit.wikimedia.org/r/627322 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:38:24] (03CR) 10Ppchelko: api-gateway: limit persistent connections to attempt to address SSL issues. 
(031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:40:58] RECOVERY - Host ps1-c5-eqiad is UP: PING OK - Packet loss = 0%, RTA = 1.26 ms [16:41:30] (03PS1) 10Jbond: webserver-misc-apps.discovery.wmnet: add security.wikimedia.org [puppet] - 10https://gerrit.wikimedia.org/r/627561 [16:41:59] (03CR) 10Jbond: [C: 03+2] webserver-misc-apps.discovery.wmnet: add security.wikimedia.org [puppet] - 10https://gerrit.wikimedia.org/r/627561 (owner: 10Jbond) [16:42:00] RECOVERY - ps1-b6-eqiad-infeed-load-tower-A-phase-Z on ps1-b6-eqiad is OK: SNMP OK - ps1-b6-eqiad-infeed-load-tower-A-phase-Z 252 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:42:05] (03CR) 10jerkins-bot: [V: 04-1] webserver-misc-apps.discovery.wmnet: add security.wikimedia.org [puppet] - 10https://gerrit.wikimedia.org/r/627561 (owner: 10Jbond) [16:42:52] RECOVERY - IPMI Sensor Status on mc1032 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:43:11] (03PS2) 10Jbond: webserver-misc-apps.discovery.wmnet: add security.wikimedia.org [puppet] - 10https://gerrit.wikimedia.org/r/627561 [16:44:35] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (10RobH) [16:44:50] (03PS1) 10Ahmon Dancy: Use require instead of include in ServiceConfig.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627563 [16:46:27] (03CR) 10Hnowlan: api-gateway: limit persistent connections to attempt to address SSL issues. (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:47:04] (03CR) 10Ppchelko: [C: 03+1] "let's try I guess?" 
[deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:49:20] (03CR) 10Catrope: [C: 03+1] Enable the reverted tag on all wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627526 (https://phabricator.wikimedia.org/T164307) (owner: 10Ostrzyciel) [16:53:41] RECOVERY - IPMI Sensor Status on aqs1005 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:55:07] (03CR) 10Hnowlan: [C: 03+2] api-gateway: limit persistent connections to attempt to address SSL issues. [deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:56:28] (03Merged) 10jenkins-bot: api-gateway: limit persistent connections to attempt to address SSL issues. [deployment-charts] - 10https://gerrit.wikimedia.org/r/627559 (https://phabricator.wikimedia.org/T262490) (owner: 10Hnowlan) [16:56:44] RECOVERY - IPMI Sensor Status on mc1031 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [16:57:15] !log hnowlan@deploy1001 helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [16:57:17] 10Operations, 10ops-codfw, 10DC-Ops, 10Maps, 10Patch-For-Review: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet - https://phabricator.wikimedia.org/T260271 (10Papaul) @RKemper the installing is not able to setup the raid using the partman recipe below . We are using a HW raid 10 and a SW ra... [16:57:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:59:07] !log hnowlan@deploy1001 helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' . 
[16:59:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:00:04] halfak and accraze: It is that lovely time of the day again! You are hereby commanded to deploy Services – Graphoid / ORES. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1700). [17:00:20] RECOVERY - Host ps1-c4-eqiad is UP: PING OK - Packet loss = 0%, RTA = 1.39 ms [17:00:33] !log hnowlan@deploy1001 helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' . [17:00:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:00:57] 10Operations, 10ops-codfw, 10DC-Ops, 10Maps, 10Patch-For-Review: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet - https://phabricator.wikimedia.org/T260271 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['maps2005.codfw.wmnet'] ` Of which those **FAILED**: ` ['maps2005.codfw.wmne... [17:05:09] !log ppchelko@deploy1001 Finished deploy [restbase/deploy@f7cda70]: Fix analytics by-country endpoint (duration: 86m 46s) [17:05:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:05:26] !log ppchelko@deploy1001 Started deploy [restbase/deploy@f7cda70]: Fix analytics by-country endpoint, feeds time out [17:05:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:07:06] (03PS2) 10Ahmon Dancy: Use require instead of include in ServiceConfig.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627563 (https://phabricator.wikimedia.org/T187147) [17:08:43] (03CR) 10Herron: "adding subscribers from the task for review" [puppet] - 10https://gerrit.wikimedia.org/r/627557 (https://phabricator.wikimedia.org/T262642) (owner: 10Herron) [17:08:46] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: patch new cross-connect - https://phabricator.wikimedia.org/T261791 (10Cmjohnson) a:05Cmjohnson→03RobH @robh cross-connect has been connected 15/16 and matched the serial number equinix provided. 
[17:09:27] (03CR) 10CRusnov: "This change is ready for review." [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/627567 (owner: 10CRusnov) [17:09:46] RECOVERY - ps1-d2-eqiad-infeed-load-tower-B-phase-Z on ps1-d2-eqiad is OK: SNMP OK - ps1-d2-eqiad-infeed-load-tower-B-phase-Z 721 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:09:50] RECOVERY - ps1-d2-eqiad-infeed-load-tower-B-phase-Y on ps1-d2-eqiad is OK: SNMP OK - ps1-d2-eqiad-infeed-load-tower-B-phase-Y 477 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:10:08] 10Operations, 10Traffic, 10netops: Wikimedia projects not reachable for some Telecom Italia users - https://phabricator.wikimedia.org/T262869 (10CDanis) >>! In T262869#6461316, @Aklapper wrote: > For anyone running into this, please follow https://www.mediawiki.org/wiki/How_to_report_a_bug#Reporting_a_connec... [17:10:14] RECOVERY - ps1-d2-eqiad-infeed-load-tower-A-phase-X on ps1-d2-eqiad is OK: SNMP OK - ps1-d2-eqiad-infeed-load-tower-A-phase-X 744 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:10:14] RECOVERY - ps1-d2-eqiad-infeed-load-tower-B-phase-X on ps1-d2-eqiad is OK: SNMP OK - ps1-d2-eqiad-infeed-load-tower-B-phase-X 705 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:10:16] RECOVERY - ps1-d2-eqiad-infeed-load-tower-A-phase-Y on ps1-d2-eqiad is OK: SNMP OK - ps1-d2-eqiad-infeed-load-tower-A-phase-Y 448 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:10:16] RECOVERY - ps1-d2-eqiad-infeed-load-tower-A-phase-Z on ps1-d2-eqiad is OK: SNMP OK - ps1-d2-eqiad-infeed-load-tower-A-phase-Z 671 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:11:10] PROBLEM - ps1-a8-codfw-infeed-load-tower-A-phase-Y on ps1-a8-codfw is CRITICAL: CRITICAL - Plugin timed out while executing system call 
https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:12:15] 10Operations, 10Product-Infrastructure-Team-Backlog, 10Push-Notification-Service: [EPIC] Deploy push-notifications service to production - https://phabricator.wikimedia.org/T256237 (10MSantos) [17:12:18] 10Operations, 10Product-Infrastructure-Team-Backlog, 10Push-Notification-Service, 10serviceops, and 2 others: Deploy push-notifications service to Kubernetes - https://phabricator.wikimedia.org/T256973 (10MSantos) [17:15:56] RECOVERY - IPMI Sensor Status on mc1030 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [17:17:39] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: New Date - Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C2 and C3 - https://phabricator.wikimedia.org/T261455 (10RobH) [17:19:25] (03PS2) 10Cwhite: profile: remove usage of logstash statsd outputs [puppet] - 10https://gerrit.wikimedia.org/r/625975 (https://phabricator.wikimedia.org/T256418) [17:20:40] RECOVERY - ps1-c3-codfw-infeed-load-tower-A-phase-Z on ps1-c3-codfw is OK: SNMP OK - ps1-c3-codfw-infeed-load-tower-A-phase-Z 207 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:21:18] RECOVERY - ps1-c3-codfw-infeed-load-tower-B-phase-X on ps1-c3-codfw is OK: SNMP OK - ps1-c3-codfw-infeed-load-tower-B-phase-X 212 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:21:20] RECOVERY - ps1-c3-codfw-infeed-load-tower-B-phase-Y on ps1-c3-codfw is OK: SNMP OK - ps1-c3-codfw-infeed-load-tower-B-phase-Y 269 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:21:20] RECOVERY - ps1-c3-codfw-infeed-load-tower-B-phase-Z on ps1-c3-codfw is OK: SNMP OK - ps1-c3-codfw-infeed-load-tower-B-phase-Z 237 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:21:24] RECOVERY - 
ps1-c3-codfw-infeed-load-tower-A-phase-Y on ps1-c3-codfw is OK: SNMP OK - ps1-c3-codfw-infeed-load-tower-A-phase-Y 265 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:21:36] RECOVERY - ps1-c3-codfw-infeed-load-tower-A-phase-X on ps1-c3-codfw is OK: SNMP OK - ps1-c3-codfw-infeed-load-tower-A-phase-X 236 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:21:57] 10Operations, 10Traffic, 10netops: Wikimedia projects not reachable for some Telecom Italia users - https://phabricator.wikimedia.org/T262869 (10Dzahn) >>! In T262869#6461316, @Aklapper wrote: > but please note that this ticket is public so you may not want to post your IP and other personal data If you are... [17:23:16] RECOVERY - ps1-c8-eqiad-infeed-load-tower-B-phase-Z on ps1-c8-eqiad is OK: SNMP OK - ps1-c8-eqiad-infeed-load-tower-B-phase-Z 358 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:23:16] RECOVERY - ps1-c8-eqiad-infeed-load-tower-A-phase-Z on ps1-c8-eqiad is OK: SNMP OK - ps1-c8-eqiad-infeed-load-tower-A-phase-Z 376 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:23:20] RECOVERY - ps1-c8-eqiad-infeed-load-tower-B-phase-X on ps1-c8-eqiad is OK: SNMP OK - ps1-c8-eqiad-infeed-load-tower-B-phase-X 295 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:23:20] RECOVERY - ps1-c8-eqiad-infeed-load-tower-B-phase-Y on ps1-c8-eqiad is OK: SNMP OK - ps1-c8-eqiad-infeed-load-tower-B-phase-Y 315 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:23:34] RECOVERY - ps1-c8-eqiad-infeed-load-tower-A-phase-X on ps1-c8-eqiad is OK: SNMP OK - ps1-c8-eqiad-infeed-load-tower-A-phase-X 248 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:23:34] RECOVERY - ps1-c8-eqiad-infeed-load-tower-A-phase-Y on ps1-c8-eqiad is OK: SNMP OK - 
ps1-c8-eqiad-infeed-load-tower-A-phase-Y 408 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [17:23:40] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: New Date - Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C2 and C3 - https://phabricator.wikimedia.org/T261455 (10RobH) [17:26:17] (03CR) 10Cwhite: [C: 03+2] "PCC checks out: https://puppet-compiler.wmflabs.org/compiler1002/25079/" [puppet] - 10https://gerrit.wikimedia.org/r/625975 (https://phabricator.wikimedia.org/T256418) (owner: 10Cwhite) [17:28:56] (03CR) 10Krinkle: [C: 03+1] "LGTM. Whenever prod is clear, let's merge this, observe it in beta, and then on mwdebug/prod shortly thereafter." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627563 (https://phabricator.wikimedia.org/T187147) (owner: 10Ahmon Dancy) [17:30:26] (03CR) 10Herron: "FYI currently seeing OTRS mail in the deferred queue on MX with "mendelevium.eqiad.wmnet [10.64.32.174] Connection refused"" [puppet] - 10https://gerrit.wikimedia.org/r/626630 (https://phabricator.wikimedia.org/T187984) (owner: 10Alexandros Kosiaris) [17:35:30] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (10RobH) [17:35:58] (03CR) 10Krinkle: [C: 03+2] Use require instead of include in ServiceConfig.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627563 (https://phabricator.wikimedia.org/T187147) (owner: 10Ahmon Dancy) [17:36:37] (03Merged) 10jenkins-bot: Use require instead of include in ServiceConfig.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627563 (https://phabricator.wikimedia.org/T187147) (owner: 10Ahmon Dancy) [17:37:25] * Krinkle locks deploy on deploy1001 [17:38:46] 10Operations, 10ops-eqiad, 10DBA, 10DC-Ops: Tue, Sept 15 PDU Upgrade 12pm-4pm UTC- Racks C4 and C5 - https://phabricator.wikimedia.org/T261456 (10RobH) [17:41:54] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: patch new cross-connect - 
https://phabricator.wikimedia.org/T261791 (10RobH) No light on the connection, you may have to roll the fiber at the dmarc panel. [17:43:08] !log ppchelko@deploy1001 Finished deploy [restbase/deploy@f7cda70]: Fix analytics by-country endpoint, feeds time out (duration: 37m 42s) [17:43:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:45:41] * Krinkle staging on mwdebug2001.codfw [17:45:50] PROBLEM - IPMI Sensor Status on dbproxy1020 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Status = Critical, PS Redundancy = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures [17:46:24] (03PS1) 10Jason Linehan: Enable MediaWiki client errors on frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627575 (https://phabricator.wikimedia.org/T255585) [17:46:48] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: patch new cross-connect - https://phabricator.wikimedia.org/T261791 (10ayounsi) There is no light in. I emailed Telia to let them know we're ready. Updated the X-connect ID/ports, etc in the termination A https://netbox.wikimedia.org/circuits/circuits/95/ Nex... [17:48:06] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: patch new cross-connect - https://phabricator.wikimedia.org/T261791 (10ayounsi) From Telia: > We currently see this link down on our end. Receiving low power. > Tx Power: 0.65160 mW (-1.86019 dBm) > Rx Power: 0.00260 mW (-25.85027 dBm) [17:52:43] (03CR) 10Mholloway: "This change is ready for review." 
(031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/624043 (https://phabricator.wikimedia.org/T260247) (owner: 10MSantos) [17:54:46] (03CR) 10Jdlrobson: [C: 03+1] Enable MediaWiki client errors on frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627575 (https://phabricator.wikimedia.org/T255585) (owner: 10Jason Linehan) [17:56:54] jouncebot: next [17:56:54] In 0 hour(s) and 3 minute(s): Morning backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1800) [17:58:16] (03PS1) 10Razzi: profile::piwik::webserver: Add geoip data to Matomo [puppet] - 10https://gerrit.wikimedia.org/r/627577 (https://phabricator.wikimedia.org/T213741) [17:58:52] Urbanecm: hold on a minute [17:58:58] ah I see not yet started [17:59:01] one minute shall suffice [17:59:07] !log krinkle@deploy1001 Synchronized src/ServiceConfig.php: If727ae4335 (duration: 00m 56s) [17:59:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:59:12] Krinkle: I'm not doing anything atm, I'll delay window once you're done [18:00:04] RoanKattouw, Niharika, and Urbanecm: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Morning backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1800). [18:00:04] Ostrzyciel and hip: A patch you scheduled for Morning backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [18:00:11] \o/ [18:00:22] Krinkle: ping me once deploy host is free please :-) [18:01:45] (03CR) 10Razzi: "https://puppet-compiler.wmflabs.org/compiler1002/25080/matomo1002.eqiad.wmnet/index.html" [puppet] - 10https://gerrit.wikimedia.org/r/627577 (https://phabricator.wikimedia.org/T213741) (owner: 10Razzi) [18:02:11] Urbanecm: done [18:02:16] thanks [18:02:28] Ostrzyciel: hello, are you around? 
:-) [18:02:42] hip is here [18:02:45] Urbanecm: yup [18:02:47] thanks hip100 [18:02:50] let's start then! [18:03:01] (03CR) 10Elukey: [C: 03+2] profile::piwik::webserver: Add geoip data to Matomo [puppet] - 10https://gerrit.wikimedia.org/r/627577 (https://phabricator.wikimedia.org/T213741) (owner: 10Razzi) [18:03:30] (03CR) 10Urbanecm: [C: 03+2] Enable the reverted tag on all wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627526 (https://phabricator.wikimedia.org/T164307) (owner: 10Ostrzyciel) [18:04:19] (03Merged) 10jenkins-bot: Enable the reverted tag on all wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627526 (https://phabricator.wikimedia.org/T164307) (owner: 10Ostrzyciel) [18:05:12] Ostrzyciel: pulled onto mwdebug2001, can you test? [18:05:38] yes, will do [18:06:47] Urbanecm: oh, wait, no... it's about background jobs so I don't know if it will work in debug mode [18:06:55] hmm, probably not [18:07:18] I'll sync then [18:08:15] hip100: are you able to test your patch? [18:08:18] yep [18:08:54] okay, thx [18:08:55] !log urbanecm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: 79004b7e503c7274fa56d2699b423b6919fbc869: Enable the reverted tag on all wikis (T164307) (duration: 00m 56s) [18:09:00] I'll ping you once its ready [18:09:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:09:02] T164307: Add Reverted filter to RecentChanges Filters - https://phabricator.wikimedia.org/T164307 [18:09:08] Ostrzyciel: your patch should be live. 
Please make sure it works :) [18:09:12] (03PS2) 10Urbanecm: Enable MediaWiki client errors on frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627575 (https://phabricator.wikimedia.org/T255585) (owner: 10Jason Linehan) [18:09:14] Urbanecm: sounds good [18:09:18] (03CR) 10Urbanecm: [C: 03+2] Enable MediaWiki client errors on frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627575 (https://phabricator.wikimedia.org/T255585) (owner: 10Jason Linehan) [18:10:05] (03Merged) 10jenkins-bot: Enable MediaWiki client errors on frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627575 (https://phabricator.wikimedia.org/T255585) (owner: 10Jason Linehan) [18:10:05] Urbanecm tested Ostrzyciel's patch, works - https://en.wikipedia.org/w/index.php?title=User:DannyS712/sandbox&diff=978569119&oldid=976159700&diffmode=source shows that the prior edit was reverted [18:10:11] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: patch new cross-connect - https://phabricator.wikimedia.org/T261791 (10RobH) a:05RobH→03ayounsi So I am a bit confused: * I wasn't included in the above email to Telia, I have no idea what is going on for this. Please include me in communications on link... [18:10:16] cool DannyS712, thanks! [18:10:42] (03PS28) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [18:10:56] hip100: your patch is available at mwdebug2001, please test and let me know [18:11:27] Urbanecm: looks good [18:11:32] thanks, syncing [18:12:16] yeah, it works :) [18:12:19] DannyS712: thanks :) [18:12:41] DannyS712: will you be available for a short while, so we can make sure T241503 works? 
[18:12:52] !log urbanecm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: 1d3456570b80b1d8af1d2b71975496e54f87b24e: Enable MediaWiki client errors on frwiki (T255585) (duration: 00m 57s) [18:12:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:12:58] T255585: Extend client-side error logging coverage - https://phabricator.wikimedia.org/T255585 [18:13:04] yes, do you want to do https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/583745 now ? [18:13:08] hip100: should be done :) [18:13:08] 04̶C̶r̶i̶t̶i̶c̶a̶l Device ps1-b7-codfw.mgmt.codfw.wmnet recovered from Device rebooted [18:13:14] Thanks Urbanecm! [18:13:18] happy to help! [18:13:22] DannyS712: yes :) [18:13:40] (03CR) 10Urbanecm: [C: 03+2] [Beta cluster] add a fake 'UselessRightForTesting' to available rights [mediawiki-config] - 10https://gerrit.wikimedia.org/r/583745 (https://phabricator.wikimedia.org/T241503) (owner: 10DannyS712) [18:14:29] (03Merged) 10jenkins-bot: [Beta cluster] add a fake 'UselessRightForTesting' to available rights [mediawiki-config] - 10https://gerrit.wikimedia.org/r/583745 (https://phabricator.wikimedia.org/T241503) (owner: 10DannyS712) [18:14:51] Ostrzyciel the reverted tag doesn't show up as an option in the tags interface for recent changes, though it works if you add it to the url (eg [18:14:52] https://en.wikipedia.org/w/index.php?hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=30&enhanced=1&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&title=Special:RecentChanges&urlversion=2) [18:15:04] DannyS712: cache? [18:15:18] DannyS712: probably something with the UI, I've seen it lag similarly for manual revert tag [18:15:23] cache maybe? [18:15:24] doesn't show up on beta cluster either, and its been enabled there for longer [18:15:30] hm. 
[18:15:34] that's more worrying [18:15:42] relevant caching appears to be `ChangesListSpecialPage-changeTagListSummary` - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/specialpage/ChangesListSpecialPage.php#913 - which is for a day [18:15:55] I'll have a closer look at this in half an hour :) [18:15:57] but I'm not positive [18:15:59] thanks both [18:16:02] thanks for noticing [18:16:22] DannyS712: patch should be at beta btw [18:17:30] (03PS29) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 [18:18:16] This is the second day now I am getting this error when trying to edit things - was Meta the other day and today Office Wiki. Different browsers - same account and computer. "Error contacting the Parsoid/RESTBase server (HTTP 500)" Is this a known problem, something new, or just my odd luck? [18:18:25] *trying to save edits [18:18:51] Urbanecm not showing up at https://meta.wikimedia.beta.wmflabs.org/wiki/Special:GlobalGroupPermissions/global-dannys712-test as an option to add [18:18:53] I assume this is visual editor varnent? [18:19:05] DannyS712: hmm, maybe it is at deployment host, but not synced, let me check [18:19:22] Urbanecm: Yeah - and the error also appears if trying to switch to code view [18:19:30] DannyS712: yup, should be live as soon as https://integration.wikimedia.org/ci/job/beta-scap-eqiad/317383/console finishes [18:20:01] Mac 10.15 - Safari and Chrome [18:20:20] varnent: I recommend you file a Phabricator task, so it can be investigated [18:20:49] (03CR) 10Jbond: cookbook sre.pdu: Fix reboot logic and other minor fixes (033 comments) [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 (owner: 10Jbond) [18:20:54] Urbanecm: Will do - I was hoping it was just a momentary known quirk :) [18:21:33] varnent: I'm not aware of any - and thanks [18:21:41] DannyS712: what about now? 
[18:21:47] hasn't finished yet [18:22:08] well the syncing part was, scap-cdb-rebuild shouldn't affect us IIRC [18:23:11] oh, its there, just because it wasn't lowercase it was sorted at the top [18:23:17] (03PS2) 10Urbanecm: Remove abusefilter-view right grant from wmf-config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/619991 (https://phabricator.wikimedia.org/T255506) (owner: 10ProcrastinatingReader) [18:23:32] (03CR) 10Urbanecm: [C: 03+2] Remove abusefilter-view right grant from wmf-config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/619991 (https://phabricator.wikimedia.org/T255506) (owner: 10ProcrastinatingReader) [18:24:13] (03Merged) 10jenkins-bot: Remove abusefilter-view right grant from wmf-config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/619991 (https://phabricator.wikimedia.org/T255506) (owner: 10ProcrastinatingReader) [18:24:21] DannyS712: cool, ping me once you're done with testing, I'll test this patch ^^ now :-) [18:24:44] done [18:25:07] (03CR) 10Jbond: "Ready for review" (032 comments) [cookbooks] - 10https://gerrit.wikimedia.org/r/627272 (owner: 10Jbond) [18:25:27] (03PS1) 10DannyS712: Revert "[Beta cluster] add a fake 'UselessRightForTesting' to available rights" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627453 (https://phabricator.wikimedia.org/T241503) [18:25:37] (03PS2) 10DannyS712: Revert "[Beta cluster] add a fake 'UselessRightForTesting' to available rights" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627453 (https://phabricator.wikimedia.org/T241503) [18:25:42] DannyS712: you need me to revert, right? 
[18:25:50] ^ yeah, can you deploy [18:25:58] (03CR) 10Urbanecm: [C: 03+2] Revert "[Beta cluster] add a fake 'UselessRightForTesting' to available rights" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627453 (https://phabricator.wikimedia.org/T241503) (owner: 10DannyS712) [18:26:07] approved, the computer should do that automatically :-) [18:26:38] (03Merged) 10jenkins-bot: Revert "[Beta cluster] add a fake 'UselessRightForTesting' to available rights" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627453 (https://phabricator.wikimedia.org/T241503) (owner: 10DannyS712) [18:27:40] will revoke once it no longer shows up as an option on the other groups [18:27:55] thx [18:28:08] !log urbanecm@deploy1001 Synchronized wmf-config/abusefilter.php: 084729b7fd0716f11265f1b37570afc120b27109: Remove abusefilter-view right grant from wmf-config (T255506) (duration: 00m 56s) [18:28:13] (03PS10) 10Ahmon Dancy: Factor out datacenters lists [mediawiki-config] - 10https://gerrit.wikimedia.org/r/622193 [18:28:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:28:14] T255506: Identify how abuse log details were purged from the CU logs - https://phabricator.wikimedia.org/T255506 [18:28:42] (03CR) 10jerkins-bot: [V: 04-1] Factor out datacenters lists [mediawiki-config] - 10https://gerrit.wikimedia.org/r/622193 (owner: 10Ahmon Dancy) [18:30:59] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: Telia IC-361191) cross-connection - https://phabricator.wikimedia.org/T261791 (10RobH) a:05ayounsi→03Cmjohnson Ok, summary of what we know and what we need to check: TX light to Telia: * Telia's RX of our light is at 0.00260 mW (-25.85027 dBm) * Our TX on... 
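Note on the optical readings quoted for T261791: each mW figure and the dBm figure next to it express the same measurement, related by dBm = 10 * log10(P / 1 mW). A minimal Python sketch of that conversion follows, using the two readings Telia reported earlier in this log. The function names are illustrative assumptions and are not taken from any Wikimedia or Telia tooling.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert optical power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        # log10 is undefined for zero or negative power
        raise ValueError("power must be positive to express it in dBm")
    return 10 * math.log10(power_mw)


def dbm_to_mw(power_dbm: float) -> float:
    """Inverse conversion: dBm back to milliwatts."""
    return 10 ** (power_dbm / 10)


if __name__ == "__main__":
    # Readings as quoted from Telia in the log (their Tx and their Rx).
    for label, mw in (("Tx", 0.65160), ("Rx", 0.00260)):
        print(f"{label}: {mw:.5f} mW = {mw_to_dbm(mw):.5f} dBm")
```

The two readings are Telia's transmit and receive sides, which travel in opposite directions on the link, so subtracting one from the other would not give the loss of either fiber path.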
[18:31:19] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: Telia IC-361191) cross-connection - https://phabricator.wikimedia.org/T261791 (10RobH) [18:31:24] 10Operations, 10ops-eqiad, 10DC-Ops, 10netops: Telia IC-361191) patch - https://phabricator.wikimedia.org/T261791 (10RobH) [18:32:17] (03CR) 10Dzahn: [C: 03+1] rsync: handle quickdatacopy cron cleanup when flipping source/dest [puppet] - 10https://gerrit.wikimedia.org/r/627531 (owner: 10Filippo Giunchedi) [18:32:33] !log Morning B&C done [18:32:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:32:37] (03PS1) 10Elukey: Revert "profile::piwik::webserver: Add geoip data to Matomo" [puppet] - 10https://gerrit.wikimedia.org/r/627454 [18:33:46] (03CR) 10Elukey: [C: 03+2] Revert "profile::piwik::webserver: Add geoip data to Matomo" [puppet] - 10https://gerrit.wikimedia.org/r/627454 (owner: 10Elukey) [18:33:49] (03PS11) 10Ahmon Dancy: Factor out datacenters lists [mediawiki-config] - 10https://gerrit.wikimedia.org/r/622193 [18:33:51] (03PS1) 10CDanis: original logstash: accept Network Error Logging reports [puppet] - 10https://gerrit.wikimedia.org/r/627582 (https://phabricator.wikimedia.org/T257527) [18:36:36] (03CR) 10Ahmon Dancy: "Bug from patchset 9 has been fixed and we have validation that tests notice that bug (now that https://gerrit.wikimedia.org/r/c/operations" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/622193 (owner: 10Ahmon Dancy) [18:36:44] (03CR) 10Herron: [C: 03+1] "LGTM!" [puppet] - 10https://gerrit.wikimedia.org/r/627582 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [18:37:56] 10Operations, 10Product-Infrastructure-Data, 10Epic, 10Goal, 10Patch-For-Review: automatically collect network error reports from users' browsers (Network Error Logging API) - https://phabricator.wikimedia.org/T257527 (10ssingh) >>! In T257527#6437113, @CDanis wrote: > == Rollout planning braindump > >... 
[18:39:04] (03CR) 10Bstorm: [C: 03+2] wikireplicas: create multiinstance roles and profiles [puppet] - 10https://gerrit.wikimedia.org/r/622444 (https://phabricator.wikimedia.org/T260843) (owner: 10Bstorm) [18:40:08] Urbanecm should be ready to test [18:40:26] DannyS712: you mean the removal? [18:40:33] yeah. {{doing}} [18:40:33] well, go ahead if it's possible :-) [18:40:39] cool! [18:41:40] (03CR) 10CDanis: [C: 03+2] original logstash: accept Network Error Logging reports [puppet] - 10https://gerrit.wikimedia.org/r/627582 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [18:41:50] Urbanecm worked [18:42:02] excellent! [18:42:08] thanks for your help [18:42:49] DannyS712: re that thing with the reverted tag in RC: does beta cluster just mirror main wikipedia config? if so, it had the reverted tag disabled with the rest of the wikis (except testwiki); I just checked the RC on testwiki and the tag is there [18:43:01] (03CR) 10Urbanecm: "> Patch Set 1:" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/619991 (https://phabricator.wikimedia.org/T255506) (owner: 10ProcrastinatingReader) [18:43:03] so I guess it should appear on other wikis in a day [18:43:20] Ostrzyciel hmm, that is probably the reason. Is it possible to reset the cache manually? Urbanecm [18:43:33] I'm not sure if it is cache, and if so, where it resides [18:43:53] wait - does beta have an _edit_ with that tag? [18:44:45] (03CR) 10Dzahn: [V: 03+1 C: 03+1] "lgtm, also verified with more hosts here: https://puppet-compiler.wmflabs.org/compiler1003/25082/ where autosync is not used the crons a" [puppet] - 10https://gerrit.wikimedia.org/r/627531 (owner: 10Filippo Giunchedi) [18:45:41] Urbanecm it does now, https://en.wikipedia.beta.wmflabs.org/w/index.php?title=User:DannyS712/sandbox&action=history [18:46:04] and Ostrzyciel I was wrong earlier about adding it to the url working [18:46:51] is there a maintenance script to touch cached values? 
[18:47:14] well, depends how exactly this is cached [18:47:22] DannyS712: weird thing is that special:Tags says 0 edits [18:47:38] thats because its slow... [18:47:53] (03CR) 10Bstorm: [C: 04-2] "> Patch Set 1:" (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: 10Bstorm) [18:48:14] wdym, slow? [18:48:57] it uses ChangeTags::tagUsageStatistics which is cached for 5 minutes (https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/changetags/ChangeTags.php) so it takes a bit to update [18:49:14] aha [18:49:27] maybe RC does the same? [18:49:30] and shows displayed ones? [18:49:32] *used ones [18:49:49] rc uses a different, much longer cache, I think - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/specialpage/ChangesListSpecialPage.php#913 [18:50:05] (03PS3) 10Razzi: Add geoip::data::puppet to profile::piwik::instance [puppet] - 10https://gerrit.wikimedia.org/r/626481 (https://phabricator.wikimedia.org/T213741) [18:50:27] i see [18:50:54] DannyS712: try now? [18:51:28] PROBLEM - Mobileapps LVS eqiad on mobileapps.svc.eqiad.wmnet is CRITICAL: /{domain}/v1/page/summary/{title} (Get summary for test page) is CRITICAL: Test Get summary for test page returned the unexpected status 503 (expecting: 200) https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [18:51:55] (03CR) 10Elukey: [C: 03+2] Add geoip::data::puppet to profile::piwik::instance [puppet] - 10https://gerrit.wikimedia.org/r/626481 (https://phabricator.wikimedia.org/T213741) (owner: 10Razzi) [18:51:56] yup, works now on betacluster [18:52:00] cool [18:52:14] purged with shell.php [18:52:49] (03CR) 10Krinkle: [C: 04-1] "LGTM, but per the issue we found, this should be covered by a test. For the deploy, we'll also need to write (e.g. 
in a comment here) the " [mediawiki-config] - 10https://gerrit.wikimedia.org/r/622193 (owner: 10Ahmon Dancy) [18:52:50] I was going to say, something like `MediaWikiServices::getInstance()->getMainWANObjectCache()->delete( keyName )` should be enough [18:53:09] is there an easy way to run that on all wikis in production, so that they don't need to wait a day? [18:53:27] ehh, I'd ignore that [18:53:40] it works, assigns tags, and that's all we need to do :-) [18:53:57] okay [18:55:22] RECOVERY - Mobileapps LVS eqiad on mobileapps.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [18:55:56] (03PS2) 10Bstorm: wikireplicas: Proposal for a proxy setup on multi-instance replicas [puppet] - 10https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) [18:56:21] (03CR) 10Herron: [C: 03+1] "LGTM, thx for this!" [puppet] - 10https://gerrit.wikimedia.org/r/624310 (owner: 10Dzahn) [18:56:56] PROBLEM - logstash syslog TCP port on logstash1009 is CRITICAL: connect to address 127.0.0.1 and port 10514: Connection refused https://wikitech.wikimedia.org/wiki/Logstash [18:57:42] (03CR) 10Bstorm: [C: 04-2] wikireplicas: Proposal for a proxy setup on multi-instance replicas (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: 10Bstorm) [18:58:29] (03PS8) 10Ryan Kemper: elasticsearch: Store which dcs to query in class [software/spicerack] - 10https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) [18:58:52] RECOVERY - logstash syslog TCP port on logstash1009 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 10514 https://wikitech.wikimedia.org/wiki/Logstash [18:59:53] (03PS3) 10Dzahn: lists: move hiera lookup out of module, add servername as proper parameter [puppet] - 10https://gerrit.wikimedia.org/r/624310 [19:00:04] liw and brennen: I, the Bot under the Fountain, allow thee, The Deployer, to do 
Mediawiki train - European+American Version (secondary timeslot) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1900). [19:00:38] current status: train still blocked on T262900 [19:00:39] T262900: Rebuilding l10n cache fails for train - https://phabricator.wikimedia.org/T262900 [19:01:28] (03CR) 10Dzahn: [V: 03+1] "https://puppet-compiler.wmflabs.org/compiler1003/25084/lists1001.wikimedia.org/index.html" [puppet] - 10https://gerrit.wikimedia.org/r/624310 (owner: 10Dzahn) [19:01:51] (03CR) 10Dzahn: [V: 03+1 C: 03+2] lists: move hiera lookup out of module, add servername as proper parameter [puppet] - 10https://gerrit.wikimedia.org/r/624310 (owner: 10Dzahn) [19:03:27] (03PS2) 10Dzahn: lists: replace hardcoded server name with variable [puppet] - 10https://gerrit.wikimedia.org/r/624319 [19:04:40] (03CR) 10Dzahn: "ran puppet on lists1001 - nothing happened" [puppet] - 10https://gerrit.wikimedia.org/r/624310 (owner: 10Dzahn) [19:05:24] (03CR) 10Dzahn: [V: 03+1] "https://puppet-compiler.wmflabs.org/compiler1003/25085/" [puppet] - 10https://gerrit.wikimedia.org/r/624319 (owner: 10Dzahn) [19:09:39] (03PS1) 10Elukey: Revert "Add geoip::data::puppet to profile::piwik::instance" [puppet] - 10https://gerrit.wikimedia.org/r/627455 [19:10:49] (03CR) 10Elukey: [C: 03+2] Revert "Add geoip::data::puppet to profile::piwik::instance" [puppet] - 10https://gerrit.wikimedia.org/r/627455 (owner: 10Elukey) [19:12:29] (03CR) 10Bstorm: [C: 04-2] wikireplicas: Proposal for a proxy setup on multi-instance replicas (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/627379 (https://phabricator.wikimedia.org/T260389) (owner: 10Bstorm) [19:13:14] 10Operations, 10Parsoid, 10observability, 10serviceops, 10User-jijiki: Create per cluster error rate alerts on Mediawiki servers - https://phabricator.wikimedia.org/T262078 (10jijiki) [19:14:04] (03CR) 10Herron: [C: 03+1] "LGTM, PCC is a noop https://puppet-compiler.wmflabs.org/compiler1003/25086/" [puppet] - 
10https://gerrit.wikimedia.org/r/624319 (owner: 10Dzahn) [19:15:24] (03PS1) 10Ppchelko: Api-gateway: Implement support for X-Wikimedia-Debug header [deployment-charts] - 10https://gerrit.wikimedia.org/r/627588 (https://phabricator.wikimedia.org/T262396) [19:18:00] (03CR) 10Dzahn: [V: 03+1 C: 03+2] lists: replace hardcoded server name with variable [puppet] - 10https://gerrit.wikimedia.org/r/624319 (owner: 10Dzahn) [19:18:52] (03PS9) 10Ryan Kemper: elasticsearch: Store which dcs to query in class [software/spicerack] - 10https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) [19:19:09] 10Operations, 10ops-eqiad, 10DC-Ops, 10fundraising-tech-ops: (Need By: 2020-09-30) rack/setup/install frmx1001 & frdata1002 - https://phabricator.wikimedia.org/T260181 (10Jgreen) [19:19:21] (03CR) 10Ryan Kemper: elasticsearch: Store which dcs to query in class (033 comments) [software/spicerack] - 10https://gerrit.wikimedia.org/r/626240 (https://phabricator.wikimedia.org/T261239) (owner: 10Ryan Kemper) [19:21:26] (03CR) 10Dzahn: "ran puppet on lists1001 - noop - also running on icinga1001 - noop" [puppet] - 10https://gerrit.wikimedia.org/r/624319 (owner: 10Dzahn) [19:21:58] (03PS5) 10Dzahn: nrpe: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/624376 (https://phabricator.wikimedia.org/T209953) [19:22:27] 10Operations, 10ops-codfw, 10DBA, 10Patch-For-Review, 10User-Kormat: db2125 crashed - mgmt iface also not available - https://phabricator.wikimedia.org/T260670 (10wiki_willy) Hi @Marostegui - I can escalate this to our account rep, and see if they can either escalate up further or swap it out with a new... 
[19:25:59] 10Operations, 10ops-eqiad, 10DC-Ops, 10fundraising-tech-ops: (Need By: 2020-09-30) rack/setup/install frmx1001 & frdata1002 - https://phabricator.wikimedia.org/T260181 (10Jgreen) [19:30:46] (03PS1) 10CDanis: logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) [19:32:16] (03CR) 10jerkins-bot: [V: 04-1] logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [19:32:18] (03CR) 10CDanis: "PCC: https://puppet-compiler.wmflabs.org/compiler1001/25090/" [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [19:35:11] (03PS2) 10CDanis: logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) [19:36:42] (03CR) 10jerkins-bot: [V: 04-1] logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [19:43:42] (03CR) 10CDanis: "recheck" [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [19:55:06] Urbanecm: Adde to this one: https://phabricator.wikimedia.org/T262838 [19:55:17] *added [20:05:02] 10Operations, 10Traffic, 10netops: Wikimedia projects not reachable for some Telecom Italia users - https://phabricator.wikimedia.org/T262869 (10Andyrom75) >>! In T262869#6460813, @CDanis wrote: > Today we had reports of an issue from @Andyrom75 that was happening all the time on their Wind (AS1267) mobile c... 
[20:05:24] 10Operations, 10Product-Infrastructure-Data, 10Epic, 10Goal, 10Patch-For-Review: automatically collect network error reports from users' browsers (Network Error Logging API) - https://phabricator.wikimedia.org/T257527 (10CDanis) >>! In T257527#6463988, @ssingh wrote: > 1. For the TTL (defined by `max_age... [20:06:45] (03PS3) 10CDanis: logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) [20:08:05] (03CR) 10jerkins-bot: [V: 04-1] logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [20:09:11] (03CR) 10Effie Mouzeli: "New PCC https://puppet-compiler.wmflabs.org/compiler1001/25092/prometheus1003.eqiad.wmnet/index.html" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/627439 (owner: 10Alexandros Kosiaris) [20:10:01] (03PS4) 10CDanis: logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) [20:12:00] (03CR) 10Cwhite: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [20:12:59] (03PS5) 10CDanis: logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) [20:13:10] PROBLEM - Mobileapps LVS eqiad on mobileapps.svc.eqiad.wmnet is CRITICAL: /{domain}/v1/page/metadata/{title} (retrieve extended metadata for Video article on English Wikipedia) is CRITICAL: Test retrieve extended metadata for Video article on English Wikipedia returned the unexpected status 503 (expecting: 200) https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [20:15:06] RECOVERY - Mobileapps LVS eqiad on mobileapps.svc.eqiad.wmnet is OK: All endpoints are healthy 
https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [20:15:15] (03CR) 10CDanis: [C: 03+2] logstash: NEL: rename overloaded body field to report_body [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [20:20:31] (03PS1) 10Jdlrobson: Remove site footer links indirection [skins/Nostalgia] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627459 (https://phabricator.wikimedia.org/T258001) [20:20:33] (03CR) 10Ottomata: "Thank you logstash >:I" [puppet] - 10https://gerrit.wikimedia.org/r/627589 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [20:23:31] (03PS1) 10CDanis: collector7: fix omitted 'tags' stanza [puppet] - 10https://gerrit.wikimedia.org/r/627591 (https://phabricator.wikimedia.org/T257527) [20:24:33] (03PS2) 10CDanis: collector7: fix omitted 'type' stanza [puppet] - 10https://gerrit.wikimedia.org/r/627591 (https://phabricator.wikimedia.org/T257527) [20:25:55] (03CR) 10CDanis: [C: 03+2] collector7: fix omitted 'type' stanza [puppet] - 10https://gerrit.wikimedia.org/r/627591 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [20:26:10] (03PS1) 10Bartosz Dziewoński: Fix APCOND_FR_NEVERBLOCKED handling [extensions/FlaggedRevs] (wmf/1.36.0-wmf.8) - 10https://gerrit.wikimedia.org/r/627461 (https://phabricator.wikimedia.org/T262970) [20:26:33] (03PS1) 10Bartosz Dziewoński: Fix APCOND_FR_NEVERBLOCKED handling [extensions/FlaggedRevs] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627462 (https://phabricator.wikimedia.org/T262970) [20:27:24] Jdlrobson: just the one for wmf.9, yeah? i can merge and pull that down; shouldn't need to be synced since we haven't been able to sync that branch at all yet. [20:28:35] jouncebot: now [20:28:35] For the next 0 hour(s) and 31 minute(s): Mediawiki train - European+American Version (secondary timeslot) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T1900) [20:29:27] brennen: yep [20:32:58] cool, thanks. 
[20:33:07] (03CR) 10Brennen Bearnes: [C: 03+2] Remove site footer links indirection [skins/Nostalgia] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627459 (https://phabricator.wikimedia.org/T258001) (owner: 10Jdlrobson) [20:37:07] (03Merged) 10jenkins-bot: Remove site footer links indirection [skins/Nostalgia] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627459 (https://phabricator.wikimedia.org/T258001) (owner: 10Jdlrobson) [20:38:10] 10Operations, 10Platform Engineering, 10Product-Infrastructure-Team-Backlog, 10RESTBase, and 4 others: High numbers of HTTP 429 errors - https://phabricator.wikimedia.org/T262691 (10eprodromou) OK, we'll take a look at getting this figured out. @Pchelolo , let's consult on what's needed. [20:39:45] thanks brennen [20:39:47] 1 down! [20:40:12] is there a string of these? [20:40:30] 10Operations, 10Editing-team, 10MassMessage, 10WMF-JobQueue, 10Platform Team Workboards (Clinic Duty Team): Same MassMessage is being sent more than once - https://phabricator.wikimedia.org/T93049 (10eprodromou) [20:41:32] (03CR) 10Dzahn: [V: 03+1] "https://puppet-compiler.wmflabs.org/compiler1002/25087/" [puppet] - 10https://gerrit.wikimedia.org/r/624376 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [20:42:11] (03PS1) 10Bartosz Dziewoński: Add constants from FlaggedRevs to defines.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627594 [20:42:13] (03PS1) 10Bartosz Dziewoński: flaggedrevs: Remove non-existent config options [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627595 [20:42:53] (03CR) 10Dzahn: [V: 03+1 C: 03+2] "compiled on a lot of hosts and icinga1001 - noop" [puppet] - 10https://gerrit.wikimedia.org/r/624376 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [20:44:54] !log removing extraneous recursive symlink /srv/mediawiki-staging/php-1.36.0-wmf.9/php-1.36.0-wmf.8 [20:44:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:47:42] brennen: hmm there's 
also https://gerrit.wikimedia.org/r/c/mediawiki/skins/Nostalgia/+/627560 but i'm a little confused - i can't seem to cherry pick it [20:47:46] did someone else already do that? [20:49:59] (03PS1) 10Jdlrobson: mainPageLink is deprecated [skins/Nostalgia] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627596 (https://phabricator.wikimedia.org/T257996) [20:50:17] ^ brennen i don't think it got cherry picked [20:52:48] (03PS6) 10Dzahn: monitoring: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/624369 (https://phabricator.wikimedia.org/T209953) [20:55:21] (03CR) 10Brennen Bearnes: [C: 03+2] mainPageLink is deprecated [skins/Nostalgia] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627596 (https://phabricator.wikimedia.org/T257996) (owner: 10Jdlrobson) [20:56:11] (03CR) 10Dzahn: [C: 03+2] "https://puppet-compiler.wmflabs.org/compiler1003/25094/" [puppet] - 10https://gerrit.wikimedia.org/r/624369 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [20:56:19] Jdlrobson: will fetch that one as well once merged. [20:58:48] (03Merged) 10jenkins-bot: mainPageLink is deprecated [skins/Nostalgia] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627596 (https://phabricator.wikimedia.org/T257996) (owner: 10Jdlrobson) [20:59:50] brennen: ok great. I think that should unblock you - just the message issue now [20:59:58] (03CR) 10Dzahn: "ran puppet on about 5 different hosts using these, then on icinga1001 - noop everywhere" [puppet] - 10https://gerrit.wikimedia.org/r/624369 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [21:00:17] Jdlrobson: cool, ty. [21:02:34] (03CR) 10Dzahn: [V: 03+1] "for the log rsync this change should not change anything, for the https monitoring it should add it on the second server.
this will just m" [puppet] - 10https://gerrit.wikimedia.org/r/624328 (owner: 10Dzahn) [21:04:26] (03CR) 10Bstorm: [C: 03+1] "Looks right to me" [software/wmfmariadbpy] - 10https://gerrit.wikimedia.org/r/627502 (owner: 10Jcrespo) [21:05:20] (03CR) 10Bstorm: [C: 03+1] tools-clush-generator: use eqiad1.wikimedia.cloud [puppet] - 10https://gerrit.wikimedia.org/r/626464 (https://phabricator.wikimedia.org/T260614) (owner: 10Andrew Bogott) [21:05:47] (03CR) 10Dzahn: [V: 03+1] "actually.. the HTTPS check command already takes the parameter "xmldumps_server" which is dumps.wikimedia.org which is an alias for labsto" [puppet] - 10https://gerrit.wikimedia.org/r/624328 (owner: 10Dzahn) [21:06:47] (03PS1) 10CDanis: WIP: can't use type, use tags [puppet] - 10https://gerrit.wikimedia.org/r/627599 [21:08:58] RECOVERY - MariaDB Replica Lag: m2 on db1117 is OK: OK slave_sql_lag Replication lag: 0.00 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:09:04] (03PS3) 10Dzahn: dumps: rename the do_acme parameter and lookup [puppet] - 10https://gerrit.wikimedia.org/r/624328 [21:10:00] (03PS2) 10CDanis: WIP: can't use type, use tags [puppet] - 10https://gerrit.wikimedia.org/r/627599 [21:10:03] (03CR) 10jerkins-bot: [V: 04-1] dumps: rename the do_acme parameter and lookup [puppet] - 10https://gerrit.wikimedia.org/r/624328 (owner: 10Dzahn) [21:13:24] (03CR) 10CDanis: "PCC lgtm, test passes https://puppet-compiler.wmflabs.org/compiler1003/25096/" [puppet] - 10https://gerrit.wikimedia.org/r/627599 (owner: 10CDanis) [21:13:32] (03PS3) 10CDanis: logstash NEL: use 'tags' not 'type' [puppet] - 10https://gerrit.wikimedia.org/r/627599 (https://phabricator.wikimedia.org/T257527) [21:14:18] (03PS4) 10Dzahn: dumps: rename the do_acme parameter and lookup [puppet] - 10https://gerrit.wikimedia.org/r/624328 [21:15:08] (03CR) 10Cwhite: [C: 03+1] logstash NEL: use 'tags' not 'type' [puppet] - 10https://gerrit.wikimedia.org/r/627599 
(https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [21:16:09] (03CR) 10CDanis: [C: 03+2] logstash NEL: use 'tags' not 'type' [puppet] - 10https://gerrit.wikimedia.org/r/627599 (https://phabricator.wikimedia.org/T257527) (owner: 10CDanis) [21:20:32] (03PS5) 10Dzahn: dumps: rename the do_acme parameter and lookup [puppet] - 10https://gerrit.wikimedia.org/r/624328 [21:22:40] (03PS6) 10Dzahn: dumps: rename the do_acme parameter and lookup [puppet] - 10https://gerrit.wikimedia.org/r/624328 [21:22:50] (03CR) 10Dzahn: [V: 03+1] "> If that represents a change, then we should probably introduce it on a separate patch, since this is aiming to be a renaming of a variab" [puppet] - 10https://gerrit.wikimedia.org/r/624328 (owner: 10Dzahn) [21:26:53] (03PS6) 10Dzahn: scap: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/624343 (https://phabricator.wikimedia.org/T209953) [21:27:19] (03CR) 10Dzahn: [C: 03+2] scap: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/624343 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [21:30:55] (03CR) 10Dzahn: "noop on deploy1001, conf1004, parse2001, .." [puppet] - 10https://gerrit.wikimedia.org/r/624343 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [21:33:07] (03PS3) 10Dzahn: service::node: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/624346 [21:39:20] (03PS1) 10Bstorm: icinga: permissions add both cases for bstorm [puppet] - 10https://gerrit.wikimedia.org/r/627602 [21:42:15] (03CR) 10Dzahn: [C: 03+2] "https://puppet-compiler.wmflabs.org/compiler1003/25100/" [puppet] - 10https://gerrit.wikimedia.org/r/624346 (owner: 10Dzahn) [21:49:09] (03CR) 10Dzahn: "noop on aqs1004, scandium, parse2001,..." 
[puppet] - 10https://gerrit.wikimedia.org/r/624346 (owner: 10Dzahn) [21:50:24] (03CR) 10Bstorm: [C: 03+1] wmcs: remove old unused records [dns] - 10https://gerrit.wikimedia.org/r/627442 (https://phabricator.wikimedia.org/T262863) (owner: 10Volans) [21:52:44] (03CR) 10Dzahn: "I have been in this situation myself countless times, so definitely feel why this would be desired. But I also can't help to think it shou" [puppet] - 10https://gerrit.wikimedia.org/r/627602 (owner: 10Bstorm) [22:08:25] 10Operations, 10Language-Team, 10Wikimedia-Mailing-lists: localisation-team mailing list to be archived and made read-only - https://phabricator.wikimedia.org/T262788 (10RLazarus) 05Open→03Resolved p:05Triage→03Medium a:03RLazarus Done! ` rzl@lists1001:~$ sudo disable_list localisation-team /var/l... [22:10:34] (03CR) 10Dzahn: puppetmaster: (re)move hiera lookup for scripts to profiles (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624335 (owner: 10Dzahn) [22:10:39] (03CR) 10Hashar: "Here are the quilt patches! :]" [debs/hue] - 10https://gerrit.wikimedia.org/r/618728 (https://phabricator.wikimedia.org/T233073) (owner: 10Elukey) [22:14:30] (03CR) 10Dzahn: puppetmaster::backend: replace hiera with lookup, data types (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624342 (https://phabricator.wikimedia.org/T209953) (owner: 10Dzahn) [22:14:52] (03PS4) 10Dzahn: puppetmaster::backend: replace hiera with lookup, data types [puppet] - 10https://gerrit.wikimedia.org/r/624342 (https://phabricator.wikimedia.org/T209953) [22:24:57] Can someone take a look at varnish XID 521411223 please? I just got a 503 trying to nuke pages on wikidata [22:29:28] (03CR) 10Bstorm: [C: 03+1] "What actually uses this? Is it intended to be a testing platform for clush runs as a hostlist argument?" 
[puppet] - 10https://gerrit.wikimedia.org/r/626457 (https://phabricator.wikimedia.org/T260614) (owner: 10Andrew Bogott) [22:37:25] (03CR) 10CRusnov: "One caveat of course is that numerous wikimedia.org records are not able to be migrated separately in this patch." [dns] - 10https://gerrit.wikimedia.org/r/627605 (https://phabricator.wikimedia.org/T258729) (owner: 10CRusnov) [22:38:48] (03CR) 10Bstorm: [C: 04-1] "I'm actually wondering what or who uses this script at this point while we are looking at it. I might find out "everybody", but I'll take " (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624122 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [22:40:07] (03PS2) 10Dzahn: ntp::daemon: replace hiera() with lookup(), lint [puppet] - 10https://gerrit.wikimedia.org/r/624332 [22:43:32] (03PS3) 10CRusnov: toolforge/gridscripts/runninggridtasks.py: Fix Python3 PEP8 Warning [puppet] - 10https://gerrit.wikimedia.org/r/624122 (https://phabricator.wikimedia.org/T247364) [22:44:46] (03CR) 10CRusnov: "Thanks for looking at it, if it turns out nobody uses it this patch can become an rm instead. 
😊" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624122 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [22:46:50] (03PS5) 10CRusnov: base/apt-upgrade-activity.py: Port to Python3 [puppet] - 10https://gerrit.wikimedia.org/r/624732 (https://phabricator.wikimedia.org/T247364) [22:51:31] (03CR) 10Dzahn: ntp::daemon: replace hiera() with lookup(), lint (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624332 (owner: 10Dzahn) [22:51:52] (03CR) 10CRusnov: [C: 03+2] base/apt-upgrade-activity.py: Port to Python3 [puppet] - 10https://gerrit.wikimedia.org/r/624732 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [22:53:12] (03PS1) 10CRusnov: Revert "base/apt-upgrade-activity.py: Port to Python3" [puppet] - 10https://gerrit.wikimedia.org/r/627607 [22:53:42] (03CR) 10CRusnov: [V: 03+2 C: 03+2] Revert "base/apt-upgrade-activity.py: Port to Python3" [puppet] - 10https://gerrit.wikimedia.org/r/627607 (owner: 10CRusnov) [22:57:18] (03PS1) 10Urbanecm: Revert "Remove abusefilter-view right grant from wmf-config" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627608 (https://phabricator.wikimedia.org/T255506) [22:57:40] (03CR) 10Urbanecm: [C: 03+2] Revert "Remove abusefilter-view right grant from wmf-config" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627608 (https://phabricator.wikimedia.org/T255506) (owner: 10Urbanecm) [22:58:07] (03PS4) 10Dzahn: puppetdb: (re)move hiera lookup for db pass to profile [puppet] - 10https://gerrit.wikimedia.org/r/624340 [22:58:20] (03Merged) 10jenkins-bot: Revert "Remove abusefilter-view right grant from wmf-config" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627608 (https://phabricator.wikimedia.org/T255506) (owner: 10Urbanecm) [22:58:40] (03CR) 10CRusnov: "This change is ready for review." 
[puppet] - 10https://gerrit.wikimedia.org/r/627627 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [22:58:46] (03PS1) 10Bstorm: toolforge grid: Remove some old scripts we don't use anymore [puppet] - 10https://gerrit.wikimedia.org/r/627628 (https://phabricator.wikimedia.org/T247364) [23:00:04] RoanKattouw, Niharika, and Urbanecm: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Evening backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200915T2300). [23:00:04] MatmaRex: A patch you scheduled for Evening backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [23:00:07] !log urbanecm@deploy1001 Synchronized wmf-config/abusefilter.php: 62b21d55a8f0a94b8cd268d5024df0cf64013dd5: Revert "Remove abusefilter-view right grant from wmf-config" (T255506) (duration: 00m 59s) [23:00:13] I can deploy today [23:00:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:00:15] T255506: Identify how abuse log details were purged from the CU logs - https://phabricator.wikimedia.org/T255506 [23:00:22] MatmaRex: hello :-) [23:00:25] hello [23:00:34] (03CR) 10Bstorm: "> Patch Set 3:" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624122 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [23:00:41] (03CR) 10Urbanecm: [C: 03+2] Fix APCOND_FR_NEVERBLOCKED handling [extensions/FlaggedRevs] (wmf/1.36.0-wmf.8) - 10https://gerrit.wikimedia.org/r/627461 (https://phabricator.wikimedia.org/T262970) (owner: 10Bartosz Dziewoński) [23:00:53] (03CR) 10Urbanecm: [C: 03+2] Fix APCOND_FR_NEVERBLOCKED handling [extensions/FlaggedRevs] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627462 (https://phabricator.wikimedia.org/T262970) (owner: 10Bartosz Dziewoński) [23:01:02] MatmaRex: do your patches depend on each other? 
[23:01:12] I'm asking if we can go with config now, and wait for backports to settle [23:01:22] Urbanecm: FYI i can't directly test the FlaggedRevs patches, the effect will be visible in autopromotion logs [23:01:25] Urbanecm: no, they don't [23:01:33] we can go with the config [23:01:48] the config patches should be no-ops [23:02:20] okay, thanks [23:02:21] looking [23:02:42] MatmaRex: so, also untestable, I guess? [23:02:48] (03CR) 10Urbanecm: [C: 03+2] flaggedrevs: Remove non-existent config options [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627595 (owner: 10Bartosz Dziewoński) [23:02:59] yeah [23:03:45] for https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/627594, I guess the deployment order should be "defines first, IS.php later", is that right? [23:04:07] (03Merged) 10jenkins-bot: Fix APCOND_FR_NEVERBLOCKED handling [extensions/FlaggedRevs] (wmf/1.36.0-wmf.8) - 10https://gerrit.wikimedia.org/r/627461 (https://phabricator.wikimedia.org/T262970) (owner: 10Bartosz Dziewoński) [23:04:11] (03Merged) 10jenkins-bot: Fix APCOND_FR_NEVERBLOCKED handling [extensions/FlaggedRevs] (wmf/1.36.0-wmf.9) - 10https://gerrit.wikimedia.org/r/627462 (https://phabricator.wikimedia.org/T262970) (owner: 10Bartosz Dziewoński) [23:04:21] Urbanecm: oh, yeah. i forgot that this is a pain, sorry. i can split the commit [23:04:29] or maybe just abandon it. it's not important, meh [23:04:30] 10Operations, 10Wikidata, 10Wikimedia-Mailing-lists: Stop archiving the wikidata-bugs mailinglist in pipermail - https://phabricator.wikimedia.org/T262773 (10RLazarus) Any list administrator can disable archiving new messages, at https://lists.wikimedia.org/mailman/admin/wikidata-bugs/archive -- just change...
[23:04:52] MatmaRex: no problem, as long as we know the order, it's fine :) [23:05:06] (03CR) 10Urbanecm: [C: 03+2] Add constants from FlaggedRevs to defines.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627594 (owner: 10Bartosz Dziewoński) [23:05:09] (03PS2) 10Urbanecm: flaggedrevs: Remove non-existent config options [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627595 (owner: 10Bartosz Dziewoński) [23:05:15] (03CR) 10Urbanecm: [C: 03+2] flaggedrevs: Remove non-existent config options [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627595 (owner: 10Bartosz Dziewoński) [23:06:05] (03Merged) 10jenkins-bot: Add constants from FlaggedRevs to defines.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627594 (owner: 10Bartosz Dziewoński) [23:06:48] (03Merged) 10jenkins-bot: flaggedrevs: Remove non-existent config options [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627595 (owner: 10Bartosz Dziewoński) [23:07:27] !log urbanecm@deploy1001 Scap failed!: Call to mwscript eval.php stderr: not empty [23:07:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:08:07] MatmaRex: the defines.php patch doesn't work as intended [23:08:15] (03CR) 10CRusnov: "> Patch Set 3:" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/624122 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [23:08:25] do i have a syntax error or something? 
[23:08:27] MatmaRex: it refuses to be synced, with "Use of undefined constant APCOND_FR_MAXREVERTEDEDITRATIO" as the error [23:08:38] (03Abandoned) 10CRusnov: toolforge/gridscripts/runninggridtasks.py: Fix Python3 PEP8 Warning [puppet] - 10https://gerrit.wikimedia.org/r/624122 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [23:08:42] it doesn't define the constants for some reason [23:09:04] (03PS1) 10CDanis: WIP: serve NEL headers on group0 [puppet] - 10https://gerrit.wikimedia.org/r/627629 [23:09:09] i have no idea why that would happen [23:09:22] but we should just revert it, it's not necessary [23:09:34] will do [23:09:44] (the other one doesn't depend on it) [23:09:50] ack [23:12:38] (03PS1) 10Urbanecm: Revert "Add constants from FlaggedRevs to defines.php" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627630 [23:12:39] (03CR) 10Urbanecm: [C: 03+2] Revert "Add constants from FlaggedRevs to defines.php" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627630 (owner: 10Urbanecm) [23:13:11] (03Merged) 10jenkins-bot: Revert "Add constants from FlaggedRevs to defines.php" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/627630 (owner: 10Urbanecm) [23:13:22] so, syncing the other config patch [23:14:09] !log urbanecm@deploy1001 Synchronized wmf-config/flaggedrevs.php: ac8bd3894f2dc8f2735cc9fa7b860af1d91c6707: flaggedrevs: Remove non-existent config options (duration: 00m 58s) [23:14:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:15:52] (03PS2) 10CDanis: WIP: serve NEL headers on group0 [puppet] - 10https://gerrit.wikimedia.org/r/627629 [23:17:14] 10Operations, 10Traffic, 10netops: Wikimedia projects not reachable for some Telecom Italia users - https://phabricator.wikimedia.org/T262869 (10Andyrom75) ~15min ago connection has been restored. I'll test it again tomorrow. 
[23:18:40] (03PS3) 10CDanis: WIP: serve NEL headers on group0 [puppet] - 10https://gerrit.wikimedia.org/r/627629 [23:18:42] !log urbanecm@deploy1001 Synchronized php-1.36.0-wmf.8/extensions/FlaggedRevs/backend/FlaggedRevsHooks.php: 5beace32a396adfcce46b04e7f969b2f9f9effda: Fix APCOND_FR_NEVERBLOCKED handling (T262970) (duration: 00m 58s) [23:18:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:18:49] T262970: FlaggedRevs doesn't check the 'neverBlocked' / APCOND_FR_NEVERBLOCKED option when autopromoting - https://phabricator.wikimedia.org/T262970 [23:19:45] MatmaRex: second backport is landing now [23:19:54] thanks [23:20:15] (03PS4) 10CDanis: WIP: serve NEL headers on group0 [puppet] - 10https://gerrit.wikimedia.org/r/627629 [23:20:33] !log urbanecm@deploy1001 Synchronized php-1.36.0-wmf.9/extensions/FlaggedRevs/backend/FlaggedRevsHooks.php: 1c0b0d161fe1024d6d08a27bbacf5b62c56c9c01: Fix APCOND_FR_NEVERBLOCKED handling (T262970) (duration: 00m 56s) [23:20:37] and, done [23:20:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:20:49] MatmaRex: unless you have any lastminutes, its should be fine :) [23:20:54] *done [23:21:14] Urbanecm: thank you! [23:21:21] happy to help! 
[23:21:55] (03CR) 10Dzahn: [V: 03+1] "https://puppet-compiler.wmflabs.org/compiler1002/25102/puppetdb2002.codfw.wmnet/index.html" [puppet] - 10https://gerrit.wikimedia.org/r/624340 (owner: 10Dzahn) [23:28:46] (03CR) 10CDanis: [C: 03+1] base/apt-upgrade-activity.py: Port to Python3 [puppet] - 10https://gerrit.wikimedia.org/r/627627 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [23:29:36] (03CR) 10CRusnov: [C: 03+2] base/apt-upgrade-activity.py: Port to Python3 [puppet] - 10https://gerrit.wikimedia.org/r/627627 (https://phabricator.wikimedia.org/T247364) (owner: 10CRusnov) [23:35:41] (03PS5) 10CDanis: WIP: serve NEL headers on group0 [puppet] - 10https://gerrit.wikimedia.org/r/627629 [23:56:39] 10Operations, 10Wikidata, 10Wikimedia-Mailing-lists: Stop archiving the wikidata-bugs mailinglist in pipermail - https://phabricator.wikimedia.org/T262773 (10Dzahn) I can only agree with what RLazarus already said. This is just a config change at the list-admin level. The people receiving wikidata-bugs-owner...