[00:46:08] serviceops, SRE: upgrade mwmaint servers to buster - https://phabricator.wikimedia.org/T267607 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mwmaint2001.codfw.wmnet'] ` and were **ALL** successful.
[01:03:42] mwmaint2001 now on buster, no puppet issue whatsoever (almost surprised but good). doesn't say anything about running the actual jobs yet.. of course
[01:04:29] also that was a HP and it did not even break
[01:04:39] that's the ral magic!
[01:04:42] *real
[01:04:46] :pp
[01:04:59] * bd808 hates all the HP bits in cloudland
[01:05:18] I was connecting to console and "oh no, it's a HP"
[01:05:51] cant use any of the racadm commands either
[05:22:46] serviceops, Performance-Team, Platform Engineering, Wikimedia-Rdbms, and 4 others: Determine and implement multi-dc strategy for ChronologyProtector - https://phabricator.wikimedia.org/T254634 (aaron)
[07:26:43] serviceops, SRE: upgrade mwmaint servers to buster - https://phabricator.wikimedia.org/T267607 (MoritzMuehlenhoff) IIRC the previous update for the mwmaint servers happened via a hardware replacement: mwmaint1002 was new server which replaced terbium. Procedure-wise it's probably best if we reimage an ex...
[08:54:52] serviceops, SRE, Patch-For-Review, User-jijiki: Upgrade memcached to version 1.6.x - https://phabricator.wikimedia.org/T270315 (hashar) That broke Puppet on `deployment-memc08.deployment-prep.eqiad.wmflabs`, I don't know why other memcached instances are not affected though. I have filed T275187...
[10:09:49] elukey: this is probably you? some host aliases for cumin having to do with hadoop are unhappy, "Unable to execute query for alias hadoop-backup: Unexpected boolean operator 'or' with hosts '' and "Alias hadoop-master-backup matched 0 hosts", "Alias hadoop-standby-backup matched 0 hosts"
[10:14:26] apergos: yep I just merged a change for it, it should clear when puppet runs on cumin
[10:33:05] awesome ty
[11:43:54] serviceops, SRE: mcrouter v0.41 fails to connect to a mcrouter v0.37 ssl proxy - https://phabricator.wikimedia.org/T275202 (jijiki) p:Triage→Medium
[11:45:20] serviceops, SRE: mcrouter v0.41 fails to connect to a mcrouter v0.37 ssl proxy - https://phabricator.wikimedia.org/T275202 (jijiki)
[11:59:47] serviceops, SRE, observability, Sustainability (Incident Followup), User-jijiki: add monitoring of sustained memcached TKO rates - https://phabricator.wikimedia.org/T253384 (jijiki)
[12:27:40] serviceops, MW-on-K8s, Release Pipeline, Patch-For-Review, Release-Engineering-Team-TODO: Create restricted docker-registry namespace for security patched images - https://phabricator.wikimedia.org/T273521 (akosiaris) >>! In T273521#6841748, @Legoktm wrote: >>>! In T273521#6841527, @dancy wro...
[13:30:37] serviceops, SRE, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, Platform Engineering (Icebox): Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (akosiaris)
[14:21:23] wow look at that. "undeploy graphoid", two such beautiful beautiful words together
[14:51:43] cancel culture
[15:05:33] I'm here for it
[15:06:54] serviceops, MediaWiki-Authentication-and-authorization: sessionstore certificates will expire soon - https://phabricator.wikimedia.org/T274564 (akosiaris) p:Triage→High I 'll work on this next week.
[15:22:47] serviceops, SRE: upgrade mwmaint servers to buster - https://phabricator.wikimedia.org/T267607 (RLazarus) See T266717 for some related discussion.
[16:12:45] serviceops, MW-on-K8s, Release Pipeline, Patch-For-Review, Release-Engineering-Team-TODO: Create restricted docker-registry namespace for security patched images - https://phabricator.wikimedia.org/T273521 (dancy) Thanks for the reference @akosiaris. That was a good read.
[17:06:25] serviceops, SRE, Patch-For-Review, Release-Engineering-Team (Deployment services), and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqiad.wmnet for host...
[17:14:31] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:41:38] serviceops, SRE: mcrouter v0.41 fails to connect to a mcrouter v0.37 ssl proxy - https://phabricator.wikimedia.org/T275202 (jijiki) Open→Resolved a:jijiki We may have discovered this sooner if we had a relevant alert. I am closing this task and cont the alert/monitoring discussion on T253384
[18:01:02] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2272.codfw.wmnet'] ` an...
[18:12:15] serviceops, SRE, observability, Sustainability (Incident Followup), User-jijiki: add monitoring of sustained memcached TKO rates - https://phabricator.wikimedia.org/T253384 (jijiki) I thought about this a bit, I do have some reservations as it is towards the right direction, but here it is. A...
[18:30:14] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1341.eqiad.wmnet'] ` an...
[18:34:52] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1367.eqiad.wmnet'] ` an...
[18:50:47] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:52:21] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:53:27] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:55:45] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:12:04] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1261.eqiad.wmnet'] ` an...
[20:12:27] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2257.codfw.wmnet'] ` an...
[20:13:50] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1270.eqiad.wmnet'] ` an...
[20:14:33] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1287.eqiad.wmnet'] ` an...
[20:55:20] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:56:54] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[21:01:49] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[21:04:22] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[21:29:47] serviceops, SRE, Patch-For-Review, User-jijiki: Upgrade memcached to version 1.6.x - https://phabricator.wikimedia.org/T270315 (hashar)
[21:40:53] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2336.codfw.wmnet'] ` an...
[21:42:55] serviceops, IABot-Team, InternetArchiveBot: IABot 301 POST requests - https://phabricator.wikimedia.org/T274090 (jijiki) p:Triage→Medium
[21:46:16] serviceops, SRE, observability, Sustainability (Incident Followup), User-jijiki: add monitoring of sustained memcached TKO rates - https://phabricator.wikimedia.org/T253384 (jijiki) After some fiddling, another thing I am wondering if possible things to alert on the sum of hard TKO states re...
[21:51:32] both APP and API in codfw are 100% buster now
[21:56:01] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[22:07:54] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1262.eqiad.wmnet'] ` an...
[22:09:44] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1320.eqiad.wmnet'] ` an...
[22:14:05] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[22:17:19] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1340.eqiad.wmnet'] ` an...
[22:19:07] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[22:34:10] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[22:51:57] serviceops, Performance-Team, Platform Engineering, Wikimedia-Rdbms, and 4 others: Determine and implement multi-dc strategy for ChronologyProtector - https://phabricator.wikimedia.org/T254634 (aaron)
[23:09:24] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1339.eqiad.wmnet'] ` an...
[23:27:41] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1333.eqiad.wmnet'] ` an...
[23:32:21] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1342.eqiad.wmnet'] ` an...
[23:48:54] serviceops, SRE, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1317.eqiad.wmnet'] ` an...