[00:39:24] 10serviceops, 10Arc-Lamp, 10Performance-Team: Resolve arclamp disk exhaustion problem (Oct 2019) - https://phabricator.wikimedia.org/T235455 (10Krinkle)
[00:39:27] 10serviceops, 10Arc-Lamp, 10Performance-Team: Resolve arclamp disk exhaustion problem (Oct 2019) - https://phabricator.wikimedia.org/T235455 (10Krinkle)
[01:19:27] 10serviceops, 10Core Platform Team, 10MediaWiki-General, 10Operations, and 2 others: Revisit timeouts, concurrency limits in remote HTTP calls from MediaWiki - https://phabricator.wikimedia.org/T245170 (10tstarling) > * SwiftFileBackend has a concerning mix of high timeouts, high traffic and an inability t...
[01:25:21] 10serviceops, 10Operations, 10Thumbor, 10User-jijiki: Upgrade Thumbor to Buster - https://phabricator.wikimedia.org/T216815 (10AntiCompositeNumber)
[05:30:51] 10serviceops, 10Operations, 10Thumbor, 10User-jijiki: Upgrade Thumbor to Buster - https://phabricator.wikimedia.org/T216815 (10AntiCompositeNumber)
[08:04:37] 10serviceops, 10Operations, 10Security-Team, 10vm-requests, 10PM: Eqiad: 1VM request for Peek (PM service in use by Security Team) - https://phabricator.wikimedia.org/T252210 (10Dzahn) 05Open→03Stalled a:05Dzahn→03None Giving it back to the pool and setting to stalled because of ongoing discussio...
[09:38:24] 10serviceops, 10Core Platform Team, 10Operations, 10Traffic, and 2 others: Reduce rate of purges emitted by MediaWiki - https://phabricator.wikimedia.org/T250205 (10aaron) I'm not fond of the idea of not sending purges for indirect edits nor using RefreshLinksJob instead of HtmlCacheUpdateJob (too slow IMO...
[10:43:19] 10serviceops, 10Operations, 10Patch-For-Review: upgrade people.wikimedia.org backend to buster - https://phabricator.wikimedia.org/T247649 (10Dzahn) 05Open→03Resolved This has happened and an announcement has been sent to the ops and wikitech-l lists.
[10:44:01] 10serviceops, 10Operations: decom people1001 - https://phabricator.wikimedia.org/T253296 (10Dzahn)
[11:42:26] 10serviceops, 10Operations, 10decommission, 10ops-codfw, 10Patch-For-Review: codfw: decom at least 15 appservers (mw2158 through mw2172) in codfw rack C3 to make room for new servers - https://phabricator.wikimedia.org/T247018 (10Dzahn) 05Stalled→03Open
[12:49:29] in eqiad we have 24 jobrunners, in codfw we have 31 jobrunners. that is because we run old and new servers. in rack C3 (the one dcops would use to test cabling stuff) there are 13 jobrunners, all consecutive numbers with the same role, and all old, so to be decom'ed. Now if i remove them all at once we will of course only have 18 jobrunners vs 24 but the new ones
[12:49:59] or we need to change some other roles to become jobrunners to balance it
[12:51:42] also ideally we don't want all the jobrunners to be in the same rack like that
[12:58:03] do you have opinions on whether it's ok to remove those 13 all at once and keep (only) 18? https://gerrit.wikimedia.org/r/c/operations/puppet/+/597771
[13:27:53] <_joe_> mutante: shouldn't we have installed the new servers to balance that out?
[13:28:12] <_joe_> like end up like in eqiad after we decommissioned all the older servers
[13:28:35] <_joe_> but yes, I think it's in general ok to run with 18 jobrunners, but we might need to rebalance that
[13:42:57] _joe_: yes, we balanced appserver and API_appserver but not jobrunners well enough then
[13:43:36] debugging something for cloud with issues on install_servers
[13:44:24] thanks for the input, yes, i will follow-up with a change to create some more jobrunners from new servers then
[15:09:19] 10serviceops, 10Operations, 10decommission, 10ops-codfw, 10Patch-For-Review: codfw: decom at least 15 appservers (mw2158 through mw2172) in codfw rack C3 to make room for new servers - https://phabricator.wikimedia.org/T247018 (10ops-monitoring-bot) Icinga downtime for 4 days, 0:00:00 set by dzahn@cumin10...
[15:25:31] <_joe_> akosiaris: not sure if this is genius or evil https://github.com/awslabs/cdk8s
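(Context for the cdk8s link: cdk8s defines Kubernetes manifests in a general-purpose language and synthesizes them to plain YAML, which is why it reads as either genius or evil. A minimal TypeScript sketch of the idea; the resource name, image, and the assumption that extra properties such as `spec` are passed through by ApiObject to the synthesized manifest are illustrative, not taken from the discussion above.)

    import { App, Chart, ApiObject } from 'cdk8s';
    import { Construct } from 'constructs';

    // Each Chart is a unit of synthesis and becomes one YAML file under dist/.
    class ExampleChart extends Chart {
      constructor(scope: Construct, id: string) {
        super(scope, id);

        // ApiObject is the low-level construct: apiVersion/kind plus raw fields.
        // Hypothetical Deployment purely for illustration.
        new ApiObject(this, 'deployment', {
          apiVersion: 'apps/v1',
          kind: 'Deployment',
          metadata: { name: 'example-app' },
          spec: {
            replicas: 2,
            selector: { matchLabels: { app: 'example-app' } },
            template: {
              metadata: { labels: { app: 'example-app' } },
              spec: { containers: [{ name: 'app', image: 'nginx' }] },
            },
          },
        });
      }
    }

    const app = new App();
    new ExampleChart(app, 'example');
    app.synth(); // emits the rendered Kubernetes YAML into dist/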