[07:46:06] serviceops, SRE, Patch-For-Review: upgrade conf2* servers to stretch - https://phabricator.wikimedia.org/T271573 (elukey) @JMeybohm I created the first two patches to swap the zookeeper servers, in theory it should work fine. The delicate step is the roll restart of the daemons after the second one, bu...
[07:51:25] FYI, I've built PHP 7.2.34 packages (plus the latest backported fixes after 7.2 was EOLed), will run some tests on mwdebug and prod canaries and then gradually roll out further
[07:53:13] <_joe_> moritzm: ack, will you also rebuild the docker image?
[07:55:42] yeah, I'll do that once I'm confident that 7.2.34 is solidly running on the canaries
[09:37:04] serviceops, MW-on-K8s, SRE: Figure out appropriate readiness and liveness probes - https://phabricator.wikimedia.org/T276908 (Joe) Open→Resolved
[09:37:10] serviceops, MW-on-K8s, SRE, Patch-For-Review: Create a basic helm chart to test MediaWiki on kubernetes - https://phabricator.wikimedia.org/T265327 (Joe)
[09:44:49] serviceops, MW-on-K8s, SRE: Benchmark performance of MediaWiki on k8s - https://phabricator.wikimedia.org/T280497 (Joe)
[09:44:56] serviceops, MW-on-K8s, SRE: Benchmark performance of MediaWiki on k8s - https://phabricator.wikimedia.org/T280497 (Joe) p:Triage→High
[10:03:36] <_joe_> akosiaris: elukey and I were talking about how to manage istio docker images
[10:03:46] * elukey sees Alex running
[10:03:47] * akosiaris runs
[10:03:50] <_joe_> I see we imported calico/node directly in our registry?
[10:03:51] ahahahahah
[10:04:35] <_joe_> so, I was looking at istio images and they're ubuntu-based, so converting them to "our" images should be easy
[10:05:01] we did not
[10:05:03] https://wikitech.wikimedia.org/wiki/Kubernetes/Kubernetes_Infrastructure_upgrade_policy#Using_existing_upstream_binaries
[10:05:07] for the MVP we'd need two istio daemons, istiod (control plane, formerly citadel+etc..) and istio-gateway
[10:06:10] I would not even try to use upstream images. But if istio releases binaries that policy can cover you
[10:06:15] <_joe_> akosiaris: I don't see the calico/node image in production-images
[10:06:36] there is also a tool called "istioctl" that can take care of deploying things instead of using helm3, like https://github.com/kubeflow/kfserving/blob/master/test/scripts/run-e2e-tests.sh#L39-L71
[10:06:53] _joe_: https://gerrit.wikimedia.org/g/operations/debs/calico/+/refs/heads/master
[10:07:05] akosiaris: istio ships binaries for all things in every release so in theory we could use those IIUC
[10:07:19] As you can see it's a simple debian/rules file that fetches and validates upstream binaries
[10:07:27] and creates the corresponding packages
[10:07:31] ack ack
[10:09:12] <_joe_> akosiaris: yeah we're just pushing their images https://gerrit.wikimedia.org/r/plugins/gitiles/operations/debs/calico/+/refs/heads/master/debian/push-calico-images.sh
[10:09:31] No we are not
[10:09:57] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/debs/calico/+/refs/heads/master/debian/get-calico-release.sh
[10:10:16] we extract the binaries
[10:10:29] <_joe_> akosiaris: the image on our registry and the one on dockerhub have the exact same sha256
[10:12:23] <_joe_> anyways, I think doing something more clean is possible with little effort overall, in this case
[10:13:26] <_joe_> elukey: can you link me the images you wanted to use? On dockerhub I mean
[10:14:27] _joe_ this is the bit that I still don't have :D
[10:14:52] I mean I thought that we'd have used base debian images with the istio binaries on top (like istiod etc..)
[10:15:09] <_joe_> elukey: that's optimal ofc
[10:15:15] I'll check on the helm files if I see docker image names
[10:15:30] <_joe_> if you don't need more than just a base distro with a binary
[10:15:36] <_joe_> what akosiaris told you is the best option
[10:16:18] <_joe_> https://github.com/istio/istio/blob/master/docker/Dockerfile.base is pretty bland, but they do install a few packages
[10:16:37] <_joe_> but this is declaredly for debug builds only
[10:19:43] _joe_ I am missing the part of where we get the docker images for calico
[10:20:04] <_joe_> elukey: ignore that for now
[10:23:59] <_joe_> so the problem I see is I can't find the dockerfiles for e.g. gcr.io/istio-testing/operator
[10:24:09] <_joe_> which is one of the default images used by the chart
[10:28:01] so in theory in the helm files it says
[10:28:56] can't find the sentence but it referred to dockerhub for production images
[10:29:07] (gcr.io only for testing etc..)
[10:29:17] but for example I can't find istiod
[10:29:25] I suppose it is in some image
[10:29:34] which one? No idea, still need to dig into it :D
[10:30:06] the main problem is that most of the guides use istioctl, that does everything behind the scenes with a minimal yaml
[10:32:00] but as discussed previously it is basically https://github.com/alexandrelevine/istio/blob/master/docker/Dockerfile.istiod
[10:32:51] all right need to step afk in a bit, but I'll add all info to the task and open one to think about an istio debian pkg
[10:33:03] thanks for the pointers :)
[10:36:55] <_joe_> I hope we thoroughly confused you
[12:22:59] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.0-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius) Open→Invalid We're going to tag a new version with fixes, and open a new task for that.
[16:07:35] serviceops, Analytics, Event-Platform, SRE, Patch-For-Review: DRY kafka broker declaration in helmfiles - https://phabricator.wikimedia.org/T253058 (akosiaris) Hi! Adopting the new functionality in networkpolicy resources has indeed created some tech debt. It's a tech debt we created on purp...
[16:08:13] serviceops, Analytics, Event-Platform, SRE, Patch-For-Review: DRY kafka broker declaration in helmfiles - https://phabricator.wikimedia.org/T253058 (Ottomata) <3
[16:22:48] serviceops, Analytics, Event-Platform, SRE, Patch-For-Review: DRY kafka broker declaration in helmfiles - https://phabricator.wikimedia.org/T253058 (akosiaris) a:akosiaris
[16:36:22] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.0-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius) Invalid→Open Re-opening and editing, because I'm lazy.
[16:36:50] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.1-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius)
[16:37:16] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.1-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius) Open→Stalled
[16:38:04] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.1-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius)
[16:38:26] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.1-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius)
[16:54:54] serviceops, Scap, Release-Engineering-Team-TODO: Deploy Scap version 3.17.1-1 - https://phabricator.wikimedia.org/T279695 (LarsWirzenius) Stalled→Open We've tagged the bug fix release 3.17.1 and tested it on the beta cluster. We'd like it to be built as a .deb and deployed to production, when th...
[18:42:41] serviceops, SRE, Performance-Team (Radar): Get rid of nutcracker for connecting to redis - https://phabricator.wikimedia.org/T277183 (jijiki)
[18:42:47] serviceops, Platform Engineering, SRE, Patch-For-Review, Performance-Team (Radar): Phasing out "redis_sessions" MediaWiki cluster - https://phabricator.wikimedia.org/T267581 (jijiki)
[18:42:51] serviceops, Platform Engineering, SRE, Patch-For-Review, Performance-Team (Radar): Phasing out "redis_sessions" MediaWiki cluster - https://phabricator.wikimedia.org/T267581 (jijiki)
[18:52:28] serviceops, SRE, User-jijiki: Shrink redis_sessions cluster - https://phabricator.wikimedia.org/T280582 (jijiki)
[18:52:48] serviceops, SRE, User-jijiki: Shrink redis_sessions cluster - https://phabricator.wikimedia.org/T280582 (jijiki)
[18:52:51] serviceops, Platform Engineering, SRE, Patch-For-Review, Performance-Team (Radar): Phasing out "redis_sessions" MediaWiki cluster - https://phabricator.wikimedia.org/T267581 (jijiki)
[18:53:35] serviceops, Platform Engineering, SRE, Patch-For-Review, Performance-Team (Radar): Phasing out "redis_sessions" MediaWiki cluster from the memcached cluster - https://phabricator.wikimedia.org/T267581 (jijiki)
[18:54:26] serviceops, SRE, Performance-Team (Radar), User-jijiki: Shrink redis_sessions cluster - https://phabricator.wikimedia.org/T280582 (Krinkle)
[18:59:13] serviceops, Platform Engineering, SRE, Patch-For-Review, Performance-Team (Radar): Phasing out "redis_sessions" MediaWiki cluster and away from the memcached cluster - https://phabricator.wikimedia.org/T267581 (jijiki)
[19:00:31] serviceops, SRE, Performance-Team (Radar), User-jijiki: Shrink redis_sessions cluster - https://phabricator.wikimedia.org/T280582 (Krinkle) Note that deployment of T113916 (migrate module dep store from a MW core DB table written to during GET requests, to instead use Main Stash) was halted becau...
[19:06:34] serviceops, SRE: Move "redis_sessions" to "redis_misc" cluster - https://phabricator.wikimedia.org/T280586 (jijiki)
[19:07:55] serviceops, SRE: Move "redis_sessions" to "redis_misc" cluster - https://phabricator.wikimedia.org/T280586 (jijiki) p:Triage→Medium
[19:08:27] serviceops, SRE: Move "redis_sessions" to "redis_misc" cluster - https://phabricator.wikimedia.org/T280586 (jijiki)
[19:08:31] serviceops, Platform Engineering, SRE, Patch-For-Review, Performance-Team (Radar): Phasing out "redis_sessions" MediaWiki cluster and away from the memcached cluster - https://phabricator.wikimedia.org/T267581 (jijiki)
[19:10:01] serviceops, SRE, Performance-Team (Radar): Phase out nutcracker for connecting to redis - https://phabricator.wikimedia.org/T277183 (jijiki)
[19:15:26] serviceops, SRE, Performance-Team (Radar), User-jijiki: Shrink redis_sessions cluster - https://phabricator.wikimedia.org/T280582 (jijiki) >>! In T280582#7015884, @Krinkle wrote: > Note that deployment of T113916 was halted because the Redis capacity was actually considered too small. (That task...
[19:54:36] serviceops, SRE, Performance-Team (Radar), User-jijiki: Shrink redis_sessions cluster - https://phabricator.wikimedia.org/T280582 (Reedy)
[20:39:51] serviceops, SRE, Patch-For-Review: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301) - https://phabricator.wikimedia.org/T280203 (Dzahn)
[20:41:40] serviceops, SRE, Patch-For-Review: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301) - https://phabricator.wikimedia.org/T280203 (Dzahn)
[22:10:25] serviceops: move phabricator to new hardware generation - https://phabricator.wikimedia.org/T280597 (Dzahn)
[23:14:19] serviceops, Phabricator: move phabricator to new hardware generation - https://phabricator.wikimedia.org/T280597 (Dzahn)