[00:04:03] serviceops, Operations: (Need By Dec 20) rack/setup/install mw13[49-84].eqiad.wmnet - https://phabricator.wikimedia.org/T236437 (Dzahn) a:Dzahn
[00:05:26] serviceops, Analytics, Analytics-Kanban, Patch-For-Review: Create production and canary releases for existent eventgate helmfile services - https://phabricator.wikimedia.org/T245203 (EBernhardson) Re-deployed our glent esbulk oozie job against refinery versioned `2020-02-19T16.58.16+00.00--scap_s...
[00:13:23] serviceops, Operations: (Need By Dec 20) rack/setup/install mw13[49-84].eqiad.wmnet - https://phabricator.wikimedia.org/T236437 (Dzahn)
[00:15:06] serviceops, Operations: (Need By Dec 20) rack/setup/install mw13[49-84].eqiad.wmnet - https://phabricator.wikimedia.org/T236437 (Dzahn) All hosts have roles, weight 30, have been pooled and changed to status "Active" in Netbox. ` {"mw1349.eqiad.wmnet": {"weight": 30, "pooled": "yes"}, "tags": "dc=eqiad...
[00:16:00] serviceops, Operations: (Need By Dec 20) rack/setup/install mw13[49-84].eqiad.wmnet - https://phabricator.wikimedia.org/T236437 (Dzahn) Open→Resolved
[02:34:32] serviceops, Analytics, Analytics-Kanban, Patch-For-Review: Create production and canary releases for existent eventgate helmfile services - https://phabricator.wikimedia.org/T245203 (Ottomata) Thank you!
[12:40:40] what's the rough time line to decom mw1221-mw1258 and mw2135-mw2214? as these were not reimaged after the HHVM removal, they still have PHP7.0 installed. unless these decoms are close, I'd remove the PHP 7.0 packages from those via Cumin?
[14:12:17] akosiaris: thank you for review! ...do we have namespaces?! :)
[14:12:38] ottomata: nope, not yet
[14:12:43] k
[14:12:47] but I'll get to them later today
[14:15:38] <3
[14:35:07] serviceops, Analytics, Analytics-Kanban, Patch-For-Review: Create production and canary releases for existent eventgate helmfile services - https://phabricator.wikimedia.org/T245203 (Ottomata)
[14:36:52] serviceops, Analytics, Analytics-Kanban, Patch-For-Review: Create production and canary releases for existent eventgate helmfile services - https://phabricator.wikimedia.org/T245203 (Ottomata)
[15:14:01] <_joe_> rlazarus: once you're done with the new eqiad servers, there is also the new codfw servers. There you can probably be more aggressive with the installations
[15:14:11] <_joe_> it's also dc switchover prep work wink wink
[15:14:15] haha yep
[15:14:34] mutante is already partway through those, I'll grab a few after the meeting probably
[15:14:37] <_joe_> also
[15:15:10] I'm almost done with these httpbb tests though, I want to get those mailed out first
[15:15:11] <_joe_> the average latency on both appservers and api has gone down
[15:15:16] <_joe_> oh sure
[15:15:33] <_joe_> adding servers is kind of a background task I assume once you know how to do it
[15:15:38] <_joe_> use it as a filler
[15:16:12] <_joe_> when did you start adding new servers?
[15:16:47] <_joe_> I'd bet on the 18th from the graphs
[15:17:16] yep
[15:17:28] should be a smaller number on the 18th and then more on the 19th
[15:17:46] <_joe_> yeah so this confirms again my finding that php-fpm's performance suffers quite a bit at high concurrency
[15:18:21] <_joe_> https://people.wikimedia.org/~oblivian/T206341/ has a recap of some tests I did comparing hhvm vs php-fpm
[15:19:10] oh neat
[15:19:23] <_joe_> but for instance https://people.wikimedia.org/~oblivian/T206341/images/high_workers_re-parse_c40.png shows pretty clearly that under heavy parser load
[15:19:29] <_joe_> hhvm has much better performance
[15:19:57] <_joe_> while at lower concurrency latency is comparable https://people.wikimedia.org/~oblivian/T206341/images/high_workers_re-parse_c10.png
[15:20:30] and way worse for the obama page it looks like?
[15:20:45] or, ranges from "slightly worse" to "way worse"
[15:20:49] <_joe_> so I'm starting to wonder if my old idea of running multiple instances of php-fpm on the same box could just allow us to get better performance and throughput
[15:20:51] <_joe_> yes
[15:21:11] yeah I was just getting there too -- depends on what we're bottlenecking on I guess
[15:21:12] <_joe_> the obama page is our standard test for "wikitext can just be too much"
[15:21:25] <_joe_> I think it's on managing many busy workers
[15:21:27] but you're right, in that case it probably isn't "number of cores" or whatever
[15:21:29] yeah
[15:21:38] <_joe_> and their access to shared resources like apc I guess
[15:21:58] yeah I guess I'm just wondering if it's process-shared resources or system-shared resourced
[15:21:59] *s
[15:22:04] but easy enough to find out
[15:22:08] <_joe_> process-shared
[15:22:17] <_joe_> that's apc, I mean
[15:22:29] nod
[15:27:48] meanwhile in things I learned from apache-fast-test, https://test.wikidata.org/wiki/Q77119
[15:28:20] imagine a world where every baYjnvuD can freely share in the sum of all evgWPgeSywYLStrjvjBE
[15:36:42] <_joe_> ahahahah
[15:36:51] <_joe_> I think there is a reason why that's used
[15:36:58] <_joe_> which I forgot
[16:00:37] <_joe_> mark: 1h30 of meeting? I have an interview just after that, I usually need some time to prepare
[16:00:54] <_joe_> I might leave a few minutes earlier
[16:00:59] that's ok
[16:12:42] I was about to say -- is there any chance we can do a five-minute sanity break about halfway through
[16:13:06] agree we have 90m of stuff to discuss but a quick pause will probably make the second half of the meeting way more useful
[16:14:39] we can do that yes
[16:14:53] we might not need it all
[16:15:06] but for sure our usual meetings already seem rushed at times
[16:15:08] and that's not great with annual planning
[16:16:36] serviceops, ChangeProp, Release Pipeline, Release-Engineering-Team-TODO, and 3 others: Migrate changeprop to kubernetes - https://phabricator.wikimedia.org/T213193 (akosiaris)
[16:16:47] yeah agree
[16:24:29] <_joe_> I'm also skipping the product-platform sync again :/
[16:31:25] * akosiaris alone in some meeting?
[17:21:20] akosiaris: am i ready to go with deploying eg-analytics-external?
[17:23:13] ottomata: review https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/573624/ first?
[17:41:17] serviceops: upgrade MediaWiki appservers to Debian 10 (buster) - https://phabricator.wikimedia.org/T245757 (Dzahn)
[17:42:00] serviceops: upgrade MediaWiki appservers to Debian 10 (buster) - https://phabricator.wikimedia.org/T245757 (Dzahn) This has dependencies on other teams as well.
[17:59:38] serviceops, Release-Engineering-Team: upgrade MediaWiki appservers to Debian 10 (buster) - https://phabricator.wikimedia.org/T245757 (Jdforrester-WMF)
[17:59:59] serviceops, Release-Engineering-Team, Release-Engineering-Team-TODO: upgrade MediaWiki appservers to Debian 10 (buster) - https://phabricator.wikimedia.org/T245757 (Jdforrester-WMF)
[18:09:33] serviceops, Release-Engineering-Team-TODO, Release-Engineering-Team (Deployment services): upgrade MediaWiki appservers to Debian 10 (buster) - https://phabricator.wikimedia.org/T245757 (greg)
[18:22:23] ah! reviewing
[18:22:26] ...meetings...
[18:26:01] akosiaris: commented
[18:26:02] ty
[18:46:07] serviceops, MediaWiki-Docker, Release-Engineering-Team (Pipeline), User-brennen: Clarify and document our docker image building process and policies. - https://phabricator.wikimedia.org/T216234 (brennen)
[19:19:01] serviceops, Release-Engineering-Team-TODO, Release-Engineering-Team (Deployment services): upgrade MediaWiki appservers to Debian 10 (buster) - https://phabricator.wikimedia.org/T245757 (Dzahn) Open→Stalled
[22:28:48] serviceops, Operations, ops-codfw, Patch-For-Review: rack/setup/install new codfw mw systems - https://phabricator.wikimedia.org/T241852 (ops-monitoring-bot) Icinga downtime for 1:00:00 set by dzahn@cumin1001 on 8 host(s) and their services with reason: new_install ` mw[2317-2324].codfw.wmnet `
[23:01:27] serviceops, Operations, ops-codfw, Patch-For-Review: rack/setup/install new codfw mw systems - https://phabricator.wikimedia.org/T241852 (ops-monitoring-bot) Icinga downtime for 1:00:00 set by dzahn@cumin1001 on 8 host(s) and their services with reason: new_install ` mw[2317-2324].codfw.wmnet `
[23:14:49] serviceops, Operations, ops-codfw, Patch-For-Review: rack/setup/install new codfw mw systems - https://phabricator.wikimedia.org/T241852 (ops-monitoring-bot) Icinga downtime for 1:00:00 set by dzahn@cumin1001 on 8 host(s) and their services with reason: new_install ` mw[2317-2324].codfw.wmnet `