[00:21:33] serviceops, MW-on-K8s, Operations, Release Pipeline (Blubber), Release-Engineering-Team (Pipeline): Deployment infrastructure for PHP microservices - https://phabricator.wikimedia.org/T261369 (Krinkle) Some kind of `/deploy` repo seems needed, I think, as otherwise we would be deploying unaud...
[02:08:55] serviceops, Platform Engineering Roadmap Decision Making, Code-Health-Objective, Performance-Team (Radar), and 4 others: Determine multi-dc strategy for CentralAuth - https://phabricator.wikimedia.org/T267270 (Krinkle)
[02:08:58] serviceops, Performance-Team, Platform Engineering, Wikimedia-Rdbms, and 2 others: Determine multi-dc strategy for ChronologyProtector - https://phabricator.wikimedia.org/T254634 (Krinkle)
[02:13:58] serviceops, Performance-Team, Platform Engineering, Wikimedia-Rdbms, and 2 others: Determine and implement multi-dc strategy for ChronologyProtector - https://phabricator.wikimedia.org/T254634 (Krinkle)
[08:28:55] serviceops, Continuous-Integration-Infrastructure, Operations, Release-Engineering-Team (CI & Testing services): replace doc1001.eqiad.wmnet with a buster VM - https://phabricator.wikimedia.org/T247653 (hashar)
[08:51:42] serviceops: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (elukey) There is a very interesting diff between mc1032 (not showing the issue) and mc1033 (showing the issue): ` elukey@mc1032:~$ sudo cat /etc/network/interfaces # This file describes the network interfa...
[09:13:45] serviceops: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (elukey) https://gerrit.wikimedia.org/r/c/operations/puppet/+/648238/1/modules/install_server/files/autoinstall/scripts/late_command.sh @jbond there seems to be something strange happening during reimages a...
[09:15:23] serviceops, Operations: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (Joe) a:jbond Per @elukey's comment, assigning to John.
[10:16:58] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (jijiki)
[10:20:20] serviceops, Operations, Patch-For-Review: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (jbond) Noticed the following servers with this issue which i have manually fixed `...
[10:21:11] serviceops, Operations, Patch-For-Review: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (jbond) for the record the following line was introduced by error > up ip addr add 2620:0:861:107:10:64:48:155 dev eno1/64 it should be the same as what puppet adds i....
[10:45:57] serviceops, Operations, Patch-For-Review: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (jbond) I have applied a fix and tested this on sertest1001 and all looks good but will leave this task open for further confirmation
[10:53:42] _joe_: should I be just as careful with rolling the lock redis hosts on codfw too? in mediawiki-config
[10:54:05] <_joe_> effie: I don't think so, no, as all requests to redis lock come from eqiad
[10:54:09] <_joe_> so using the eqiad hosts
[10:54:17] ok in that case I will swap them in one go
[10:54:51] I thought so, but I wanted to be sure
[10:59:26] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` mc1022.eqiad.wmnet ` The log can be...
[10:59:45] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` mc2022.codfw.wmnet ` The log can be...
[11:34:47] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mc2022.codfw.wmnet'] ` and were **ALL** successful.
[11:44:38] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mc1022.eqiad.wmnet'] ` and were **ALL** successful.
[12:18:30] serviceops, MW-on-K8s, Operations, Release Pipeline (Blubber), Release-Engineering-Team (Pipeline): Deployment infrastructure for PHP microservices - https://phabricator.wikimedia.org/T261369 (akosiaris) >>! In T261369#6693707, @Krinkle wrote: > Some kind of `/deploy` repo seems needed, I thi...
[12:28:42] serviceops, Prod-Kubernetes, Kubernetes: Target Sources (component/kubernetes-future/source/Sources) is configured multiple times - https://phabricator.wikimedia.org/T270271 (JMeybohm)
[12:29:20] serviceops, Prod-Kubernetes, Kubernetes: Target Sources (component/kubernetes-future/source/Sources) is configured multiple times - https://phabricator.wikimedia.org/T270271 (JMeybohm)
[13:47:06] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (hashar)
[13:47:52] ^ seems the docker registry yields "missing or empty Content-Length header" from time to time, I merely triaged the task.
[14:08:36] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (kostajh) ` lang=bash $ docker pull docker-registry.wikimedia.org/releng/node10-test-browser Using...
[14:12:05] hashar: thanks. Do you know what we use for our docker registry software?
[14:13:17] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (kostajh) Are we using harbor (was scanning T209271)? If so, perhaps this is the same issue as http...
[14:17:31] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (kostajh) > I can not reproduce it consistently. @zeljkofilipin so eventually it worked for you?...
[14:23:26] serviceops, Prod-Kubernetes, Kubernetes: Add kubernetes 1.17+ topology annotations - https://phabricator.wikimedia.org/T270191 (akosiaris)
[15:01:42] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (jijiki)
[15:15:05] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (zeljkofilipin) @kostajh it was failing consistently on my macos 11 machine, working fine on my mac...
[15:18:40] serviceops, Prod-Kubernetes, Kubernetes: kubernetes tmpfiles references path below legacy directory /var/run/ - https://phabricator.wikimedia.org/T270298 (JMeybohm)
[15:22:40] wkandek are you able to comment on https://phabricator.wikimedia.org/T270184#6695583, it's not clear to me from the task or change why the additional IP got added, and if it's safe to remove
[15:31:33] serviceops, Prod-Kubernetes, Kubernetes: Restart kubernetes services on package upgrades - https://phabricator.wikimedia.org/T270302 (JMeybohm)
[15:37:36] serviceops, Operations, Release-Engineering-Team-TODO, Patch-For-Review, and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (hnowlan) mw1265 is now reimaged and pooled with weight 5 (as opposed to its previous 25)
[15:38:27] serviceops, Operations, Release-Engineering-Team-TODO, Patch-For-Review, and 2 others: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (jijiki)
[15:54:54] serviceops, Operations: ifup@eno1.service failed on some buster hosts - https://phabricator.wikimedia.org/T270220 (RLazarus) Oh wow, I filed this and went to bed, love to wake up and see it fully handled. :) Thanks all!
[15:55:35] akosiaris: as a follow up question for yesterday's talk: from the log helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production', what does [staging] mean there?
[15:55:50] the staging k8s cluster?
[15:57:45] yes
[15:57:45] volans: yes
[15:58:11] so we have prod and staging pods in the staging cluster and the same in the prod cluster?
[15:58:17] that one is kind of misleading btw
[15:58:36] I would have thought deploying staging means deploying to the staging cluster and prod to prod
[15:58:40] while "release" being the name of the helm release which is not aligned to our new naming scheme currently
[15:58:42] and a bug that it is executed probably. There is no production release in staging
[16:01:00] ok, as you might guess staging and production in the same deployment log are a bit confusing :)
[16:10:56] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (jijiki)
[16:22:33] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` mc1019.eqiad.wmnet ` The log can be...
[16:22:44] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` mc2019.codfw.wmnet ` The log can be...
[16:23:09] jbond: commented on https://phabricator.wikimedia.org/T270184#6695583
[16:24:19] great thanks
[16:47:58] kostajh: sorry missed your message. I know nothing about the docker registry :D i am merely an end user of it!
[16:48:45] if anyone here knows about it though, there are a bunch of docker pull failing with "ERROR: missing or empty Content-Length header" # https://phabricator.wikimedia.org/T270270
[16:49:33] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mc1019.eqiad.wmnet'] ` and were **ALL** successful.
[16:49:40] <_joe_> jayme: ^^
[16:50:43] hum...
[16:54:23] I tried some hours ago to quickly reproduce but was not able to
[16:54:55] Reminds me of the PDF thing tbh - wanted to re-read that
[17:00:29] serviceops, Operations, Platform Engineering, Wikidata, and 4 others: Upgrade memcached cluster to Debian Buster - https://phabricator.wikimedia.org/T213089 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mc2019.codfw.wmnet'] ` and were **ALL** successful.
[17:09:17] serviceops, Analytics, User-jijiki: Clarify multi-service instance concepts in helm charts and enable canary releases - https://phabricator.wikimedia.org/T242861 (jijiki)
[17:15:32] serviceops, Operations, User-jijiki: Upgrade memcached to version 1.6.x - https://phabricator.wikimedia.org/T270315 (jijiki)
[17:15:44] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (JMeybohm) >>! In T270270#6695364, @kostajh wrote: > Are we using harbor (was scanning T209271)? If...
[17:15:51] serviceops, Operations, User-jijiki: Upgrade memcached to version 1.6.x - https://phabricator.wikimedia.org/T270315 (jijiki)
[17:15:53] serviceops, Operations, Patch-For-Review: Upgrade and improve our application object caching service (memcached) - https://phabricator.wikimedia.org/T244852 (jijiki)
[17:17:49] serviceops, Operations, Patch-For-Review: Upgrade and improve our application object caching service (memcached) - https://phabricator.wikimedia.org/T244852 (jijiki)
[17:29:36] serviceops, Operations: Test and deploy mcrouter 0.41 - https://phabricator.wikimedia.org/T244476 (jijiki) Open→Declined
[17:29:38] serviceops, Operations, Patch-For-Review: Upgrade and improve our application object caching service (memcached) - https://phabricator.wikimedia.org/T244852 (jijiki)
[17:29:41] serviceops, Operations: Test and deploy mcrouter 0.41 - https://phabricator.wikimedia.org/T244476 (jijiki) For reasons explained in T251574#6148741, going with 0.41 is our only option, closing this task
[17:30:40] serviceops, Operations: Recurrent TX bw saturation for mediawiki memcached shards - https://phabricator.wikimedia.org/T258679 (jijiki) Open→Resolved a:jijiki I am closing this since there are no immediate actionables :)
[17:30:44] serviceops, Operations, Patch-For-Review: Upgrade and improve our application object caching service (memcached) - https://phabricator.wikimedia.org/T244852 (jijiki)
[17:38:06] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (JMeybohm) Could you please check if you see any additional errors/hints in the docker daemon logs?...
[18:52:32] effie: if you could review https://gerrit.wikimedia.org/r/c/operations/puppet/+/643354 when you get a chance that'd be awesome
[21:19:44] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (kostajh) >>! In T270270#6696236, @JMeybohm wrote: > Could you please check if you see any addition...
[21:24:35] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (kostajh) I can reproduce the error on macOS 11.0.1 + Docker for Mac 3.0.1 (Docker version 20.10.0)...
[21:28:46] serviceops, MediaWiki-Docker, Operations, User-zeljkofilipin: docker pull from docker-registry fails with `ERROR: missing or empty Content-Length header` - https://phabricator.wikimedia.org/T270270 (kostajh) I filed https://github.com/docker/for-mac/issues/5143 in case it's an issue with Docker f...
[22:22:22] kostajh: you mind posting the dockerd debug logs from pulling an image (with error) somewhere? I would like to figure out at which point it fails (fetching some meta info or a specific layer maybe)
[22:31:12] jayme: yes I can do that tomorrow
[22:31:21] cool, thanks