[09:19:12] serviceops, SRE, Traffic, Wikimedia-Incident: The safe service restart script doesn't detect failure when running with poolcounter. - https://phabricator.wikimedia.org/T272262 (Joe)
[09:20:08] serviceops, SRE, Traffic, Wikimedia-Incident: The safe service restart script doesn't detect failure when running with poolcounter. - https://phabricator.wikimedia.org/T272262 (Joe) p:Triage→Unbreak! a:Joe
[10:38:04] serviceops, SRE, Wikimedia-Incident: High latency on appservers - https://phabricator.wikimedia.org/T272215 (Joe)
[10:38:09] serviceops, SRE, Traffic, Patch-For-Review, Wikimedia-Incident: The safe service restart script doesn't detect failure when running with poolcounter. - https://phabricator.wikimedia.org/T272262 (Joe)
[10:38:38] serviceops, SRE, Wikimedia-Incident: High latency on appservers - https://phabricator.wikimedia.org/T272215 (Joe)
[10:38:42] serviceops, SRE, Traffic, Patch-For-Review, Wikimedia-Incident: The safe service restart script doesn't detect failure when running with poolcounter. - https://phabricator.wikimedia.org/T272262 (Joe) Open→Resolved The script has been merged and will deploy everywhere in the next 20 mi...
[11:05:56] serviceops, SRE, Wikimedia-Incident: High latency on appservers - https://phabricator.wikimedia.org/T272215 (Joe) The timers have been reenabled, and the next scap deployment should properly run check_and_restart for php7-fpm, and restart those.
[11:30:10] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: k8s_infrastructure_users: rsyslog and echostore share the same id - https://phabricator.wikimedia.org/T269461 (JMeybohm) p:Triage→Medium
[11:58:55] serviceops, SRE, ops-eqsin: ganeti5002 was down / powered off, machine check entries in SEL - https://phabricator.wikimedia.org/T261130 (MoritzMuehlenhoff) Thanks Papaul and Rob, I'll take care of re-adding ganeti5002 to the eqsin Ganeti cluster.
[14:54:20] serviceops, MW-on-K8s, SRE: Create a yaml structure for defining apache virtualhosts for mediawiki, that can be used both in puppet and in helm charts. - https://phabricator.wikimedia.org/T272305 (Joe)
[15:08:17] serviceops, SRE, Traffic: Upgrade envoyproxy to 1.16.2 - https://phabricator.wikimedia.org/T271407 (Vgutierrez) there are some issues with the python requirements of envoy 1.16.2 as it requires python 3.6 or higher and clearly the building environment isn't fulfilling the requirement. So a tiny worka...
[15:53:01] serviceops, SRE, Traffic: Upgrade envoyproxy to 1.16.2 - https://phabricator.wikimedia.org/T271407 (Joe) @Vgutierrez we can create a new building env based on buster I think, that's much better as an option.
[16:01:50] serviceops, Add-Link, GrowthExperiments-NewcomerTasks, Product-Infrastructure-Team-Backlog, and 3 others: Service operations setup for Add a Link project - https://phabricator.wikimedia.org/T258978 (kostajh)
[16:04:12] serviceops, Add-Link, GrowthExperiments-NewcomerTasks, Product-Infrastructure-Team-Backlog, and 3 others: Service operations setup for Add a Link project - https://phabricator.wikimedia.org/T258978 (kostajh) >>! In T258978#6751367, @akosiaris wrote: >>>! In T258978#6729580, @kostajh wrote: >> @ak...
[16:27:39] serviceops, Add-Link, GrowthExperiments-NewcomerTasks, Product-Infrastructure-Team-Backlog, and 3 others: Service operations setup for Add a Link project - https://phabricator.wikimedia.org/T258978 (kostajh)
[17:25:31] serviceops, SRE, Wikimedia-Incident: High latency on appservers - https://phabricator.wikimedia.org/T272215 (jcrespo) p:Triage→Medium This was UBN on Saturday, based on Joe's comment, I am putting this now to Medium. More details are yet to be provided on the Incident report, I can help with...
[17:28:15] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Deployment services), and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqiad.wmnet for host...
[17:28:33] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Deployment services), and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqiad.wmnet for host...
[17:29:22] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Deployment services), and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqiad.wmnet for host...
[17:30:13] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Deployment services), and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqiad.wmnet for host...
[18:10:36] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Deployment services), and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2274.codfw.wmnet'] ` and were **ALL** s...
[18:11:08] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2275.codfw.wmnet'] ` an...
[18:11:24] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2273.codfw.wmnet'] ` an...
[18:11:46] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2271.codfw.wmnet'] ` an...
[19:15:34] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[19:19:28] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[19:20:52] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[19:54:48] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2276.codfw.wmnet'] ` an...
[19:54:55] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2277.codfw.wmnet'] ` an...
[20:00:52] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2303.codfw.wmnet'] ` an...
[20:01:42] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2305.codfw.wmnet'] ` an...
[20:27:04] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:28:05] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:29:02] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:30:11] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:30:18] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2310.codfw.wmnet'] ` Of...
[20:30:39] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[21:08:41] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2307.codfw.wmnet'] ` an...
[21:10:07] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2309.codfw.wmnet'] ` an...
[21:10:54] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2310.codfw.wmnet'] ` an...
[21:12:49] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2311.codfw.wmnet'] ` an...