[05:25:04] serviceops, Operations: High traffic on mc1020 (18 Aug) - https://phabricator.wikimedia.org/T260622 (Joe) Since no analysis on the incoming keys was done, there is no way to know what the problem was. I'll uncordon mc1020, and monitor the situation, but I assume there isn't much else to do at this point.
[07:18:02] serviceops: Decommission mw[2135-2214].codfw.wmnet - https://phabricator.wikimedia.org/T260654 (jijiki)
[07:24:20] serviceops, Platform Team Initiatives (Session Management Service (CDP2)), User-Clarakosi: Package table_properties utility for Debian - https://phabricator.wikimedia.org/T226551 (jijiki)
[07:24:26] serviceops, Operations, Packaging, Platform Team Initiatives (Session Management Service (CDP2)): Need help to create and deploy Debian-packaged Python 3 app - https://phabricator.wikimedia.org/T229980 (jijiki) Open→Resolved a:jijiki Closing this task due to inactivity, please reopen...
[07:39:46] serviceops, DBA, Operations, Parsoid, Parsoid-Tests: update mysql GRANTs for testreduce - https://phabricator.wikimedia.org/T260627 (Kormat) Hi, i've created the new grants. Please test and let me know if there are any issues. Cheers.
[07:47:30] serviceops, DBA, Phabricator, Release-Engineering-Team-TODO, and 3 others: Improve privilege separation for phabricator's config files and mysql credentials - https://phabricator.wikimedia.org/T146055 (jcrespo) Stalled→Open
[07:56:07] serviceops, DBA, Phabricator, Release-Engineering-Team-TODO, and 4 others: Improve privilege separation for phabricator's config files and mysql credentials - https://phabricator.wikimedia.org/T146055 (jcrespo) This is done: https://gerrit.wikimedia.org/r/c/operations/puppet/+/620879 Anything pen...
[08:13:24] _joe_: o/ anything on the reboots beside what's in the etherpad?
[08:13:52] <_joe_> jayme: no, but wait until I have merged a couple patches to reboot-cluster please :)
[08:13:59] hrhr
[08:14:08] sure...let me know
[08:16:00] <_joe_> also volans is fixing something in spicerack itself
[08:17:01] okay. I'm not in a rush to continue with that. :) Just wanted to get it off our backs
[08:23:30] yeah, sorry about that, there was a bug in the icinga module in spicerack, we can release it right away to unblock you
[08:37:01] serviceops, DBA, Operations, Parsoid, and 2 others: update mysql GRANTs for testreduce - https://phabricator.wikimedia.org/T260627 (Kormat)
[08:44:35] serviceops, Operations, SRE-tools: Create a cookbook to perform a rolling reboot of a kubernetes cluster - https://phabricator.wikimedia.org/T260661 (Joe)
[08:44:45] serviceops, Operations, SRE-tools: Create a cookbook to perform a rolling reboot of a kubernetes cluster - https://phabricator.wikimedia.org/T260661 (Joe) p:Triage→Medium
[08:51:17] serviceops, Operations, SRE-tools: Create a cookbook for depooling one or all services from one kubernetes cluster - https://phabricator.wikimedia.org/T260663 (Joe)
[08:55:19] serviceops, Operations, SRE-tools: Create a cookbook for applying an apache config change safely - https://phabricator.wikimedia.org/T260664 (Joe)
[08:58:24] serviceops, Operations, SRE-tools: Create a cookbook to automate gerrit's switchover - https://phabricator.wikimedia.org/T260666 (Joe) p:Triage→Medium
[09:52:58] <_joe_> jayme, effie: reboot-cluster now seems to work decently well, I'm going to revisit instructions a bit
[09:53:20] <_joe_> I'm currently rebooting the codfw apis in batches of 5
[09:53:58] nice!
[09:55:04] <_joe_> jayme: after the backport window is done, I'd ask you to follow up with the eqiad jobrunners probably
[10:27:04] serviceops, Operations, Platform Engineering: PHP microservice for containerized shell execution - https://phabricator.wikimedia.org/T260330 (tstarling) p:Medium→Triage
[10:30:56] serviceops, Operations, Platform Engineering: PHP microservice for containerized shell execution - https://phabricator.wikimedia.org/T260330 (tstarling) a:tstarling Assigning to myself since implementation work is underway.
[12:38:30] serviceops, Prod-Kubernetes, Release Pipeline, Patch-For-Review: Refactor our helmfile.d dir structure for services - https://phabricator.wikimedia.org/T258572 (JMeybohm) FTR: I'm going to do mathoid
[15:39:30] * jayme switching location, biab
[16:16:20] you probably saw in the Etherpad, i rebooted all remaining eqiad API appserver yesterday
[16:20:46] serviceops, Operations, Platform Team Workboards (Clinic Duty Team): PHP microservice for containerized shell execution - https://phabricator.wikimedia.org/T260330 (AMooney) p:Triage→Medium
[16:59:11] <_joe_> mutante: yes, only thing left are the jobrunners
[16:59:31] 'k, cool
[16:59:44] <_joe_> btw volans fixed a bug in spicerack so now even reboot-single works well
[17:00:09] <_joe_> now we actually force the checks in icinga :P
[17:01:05] eheheh
[17:01:07] sorry about that
[17:11:09] _joe_: there is nothing special about rebooting the mcrouter proxies right? (Besides not to take them all down at the same time ofc)
[17:11:42] <_joe_> jayme: let's talk tomorrow
[17:11:53] <_joe_> I need to go afk now :P
[17:12:01] _joe_: okay, ttyl o/
[20:12:57] serviceops, Operations, Platform Engineering, Wikidata, Sustainability (Incident Followup): mw* servers memory leaks (12 Aug) - https://phabricator.wikimedia.org/T260281 (eprodromou) We're tracking this, but unsure as to next steps.
Let us know if more active investigation from Platform team...
[21:09:16] serviceops, Operations, Platform Engineering, Wikidata, Sustainability (Incident Followup): mw* servers memory leaks (12 Aug) - https://phabricator.wikimedia.org/T260281 (Joe)
[21:17:12] serviceops, ChangeProp, Platform Team Workboards (Clinic Duty Team): Partition the transclusions topic in ChangeProp - https://phabricator.wikimedia.org/T157649 (Pchelolo) a:Pchelolo I've merged https://github.com/wikimedia/change-propagation/pull/351 Now we need to deploy change-prop with this...
[21:33:15] serviceops, Operations, Parsing-Team, TechCom, and 4 others: Strategy for storing parser output for "old revision" (Popular diffs and permalinks) - https://phabricator.wikimedia.org/T244058 (daniel) a:tstarling→None
[22:19:26] serviceops, Continuous-Integration-Infrastructure, Operations: decom releases1001 and releases2001 - https://phabricator.wikimedia.org/T260742 (Dzahn)
[22:19:39] serviceops, Operations: decom releases1001 and releases2001 - https://phabricator.wikimedia.org/T260742 (Dzahn)
[22:21:34] serviceops, Continuous-Integration-Infrastructure, Operations, Patch-For-Review: replace backends for releases.wikimedia.org with buster VMs - https://phabricator.wikimedia.org/T247652 (Dzahn) both https://releases.wikimedia.org and https://releases-jenkins.wikimedia.org have been switched to the...
[22:25:06] serviceops, Continuous-Integration-Infrastructure, Operations, Patch-For-Review: replace backends for releases.wikimedia.org with buster VMs - https://phabricator.wikimedia.org/T247652 (Dzahn) Open→Resolved