[00:09:34] serviceops, Analytics, Analytics-Kanban, Patch-For-Review, User-jijiki: Mechanism to flag webrequests as "debug" - https://phabricator.wikimedia.org/T263683 (Milimetric) Ok, some of these things are easy and some are a bit harder. Instead of filtering out the requests at "refine" time, we wi...
[01:01:50] serviceops, Operations, Performance-Team, User-jijiki: Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe - https://phabricator.wikimedia.org/T264604 (Krinkle)
[06:44:29] serviceops, Release Pipeline, Release-Engineering-Team, Release-Engineering-Team-TODO: Provide the official production base images for Wikimedia use - https://phabricator.wikimedia.org/T238774 (Joe)
[06:44:44] serviceops, MW-on-K8s, Operations: Create the base container images for running MediaWiki in a production environment - https://phabricator.wikimedia.org/T265324 (Joe) Open→Resolved
[06:45:43] serviceops, Operations, docker-pkg: Docker image on the build host seem to ignore apt priority for wikimedia packages - https://phabricator.wikimedia.org/T268612 (Joe) a:Joe
[06:46:18] <_joe_> jayme / akosiaris: should I target helm3 for a new service? and if so, anything specific I should be careful about?
[08:12:54] _joe_: stick to helm2 for now. We're just planning to use helm3 for the admin part in first
[08:13:23] <_joe_> ack
[08:14:07] <_joe_> jayme: btw, I forgot to ask. With calico-within-k8s, what happens if the kube master is down?
[08:14:36] <_joe_> would calico keep chugging along nicely, or would it fail?
[08:15:50] We will probably need to test to be completely sure, but as with a down master no changes to the cluster topology can be made, I would say it just keeps running as every other pod does as long as the node(s) are fine
[08:20:58] <_joe_> I think calico might need to store data at runtime though, we should test it.
[08:21:45] what data are you thinking of?
[08:22:48] <_joe_> I just remember calico was pretty chatty with etcd even while running
[08:23:16] <_joe_> and I did test back in the day that version 2.x would work fine in a static-state when etcd was down, and correctly reconnect
[08:23:54] <_joe_> given they've rewritten all of it, not sure it's still true
[08:24:35] <_joe_> btw back in the day calico suggested to use calico/node within the k8s cluster only for small installations
[08:24:55] <_joe_> I guess that's changed with newer kube versions that use etcd3 and are thus more performant
[08:29:07] there is an additional "api-fan-out-service" not (typha) which is recommended for clusters >50 nodes. We plan to directly start using it to gain some experience while we go
[08:57:03] serviceops, Growth-Team, Operations, Patch-For-Review, and 2 others: Reimage one memcached shard per DC to Buster - https://phabricator.wikimedia.org/T252391 (jijiki) >>! In T252391#6632790, @elukey wrote: >>>! In T252391#6631201, @jijiki wrote: >> Since we are happy with the current settings, I...
[09:00:42] akosiaris: the issues with helm3 & repo you had...I do have them as well it seems. It does fetch but not honor the updates it seems
[09:01:11] or 0.2.5-wmf1 < 0.2.0 ...
[09:36:56] it is...
[10:22:29] <_joe_> jayme: btw I fully expect us to have more than 50 nodes in a few months
[10:23:03] that's why we thought we should include typha from start
[10:33:27] serviceops, Data-Persistence-Consultation, MediaWiki-Parser, Parsoid: CAPEX for ParserCache for Parsoid - https://phabricator.wikimedia.org/T263587 (LSobanski) I believe the key questions to #DBA have been answered so I'm removing the team tag. Please add us back if there is anything else we can...
[10:54:09] jayme: mine were after all a PEBKAC. I was aliasing helm3 to change the directories, but helmfile called it directly
[10:54:21] when I fixed that it worked ok
[10:54:35] btw https://gerrit.wikimedia.org/r/c/operations/puppet/+/643456 and friends.
[10:55:09] those should cover the removal of the config file in https://gerrit.wikimedia.org/r/c/operations/debs/kubernetes/+/642011, leaving some reviews into it
[11:04:41] oh, I forgot about api-server. Fixing that after lunch
[12:14:42] Ok apiserver done as well, let's see what PCC says for all of this
[12:58:07] serviceops, Operations, Kubernetes: Migrate Chartmuseum (python3-docker-report) to use helm3 - https://phabricator.wikimedia.org/T268743 (JMeybohm)
[12:58:22] serviceops, Operations, Kubernetes: Migrate Chartmuseum (python3-docker-report) to use helm3 - https://phabricator.wikimedia.org/T268743 (JMeybohm) p:Triage→Medium
[13:43:03] serviceops, Prod-Kubernetes, Kubernetes: Upgrade kubernetes clusters to a security supported (LTS) version - https://phabricator.wikimedia.org/T244335 (akosiaris)
[16:10:22] akosiaris: where did you get your PCC values from at https://gerrit.wikimedia.org/r/c/operations/puppet/+/643458/3 ?
[17:14:53] serviceops, Operations, Scap, Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)): Make a way to build Scap .deb in Docker - https://phabricator.wikimedia.org/T265501 (LarsWirzenius) Open→Resolved This is merged into scap.git now.
[17:23:02] jayme: https://puppet-compiler.wmflabs.org/compiler1001/26684/ (I've responded in the patch too)
[20:32:44] serviceops, Operations, MW-1.36-notes (1.36.0-wmf.18; 2020-11-17), Performance Issue, and 3 others: Strategy for storing parser output for "old revision" (Popular diffs and permalinks) - https://phabricator.wikimedia.org/T244058 (Krinkle)
[22:36:49] serviceops, Operations, Release-Engineering-Team-TODO, Patch-For-Review, and 2 others: Upgrade MediaWiki appservers to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn) >>! In T245757#6644078, @MoritzMuehlenhoff wrote: > With all Puppet patches and debs landed, mwdebug...
[22:38:34] serviceops, Operations, Release-Engineering-Team-TODO, Patch-For-Review, and 2 others: Upgrade MediaWiki appservers to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn) >>! In T245757#6644135, @MoritzMuehlenhoff wrote: > Actually, before reimaging let's merge https://g...
[22:40:33] serviceops, Operations, Release-Engineering-Team-TODO, Patch-For-Review, and 2 others: Upgrade MediaWiki appservers to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn) Icinga all green with one exception: opcache-health, attempted to hit "reschedule next service check...