[00:09:22] <wikibugs>	 10serviceops, 10Parsoid-Tests, 10SRE, 10Parsoid (Tracking), 10Patch-For-Review: Make testreduce web UI publicly accessible on the internet - https://phabricator.wikimedia.org/T266509 (10Dzahn)
[00:30:51] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by legoktm on cumin1001.eq...
[01:13:37] <wikibugs>	 10serviceops, 10observability, 10Patch-For-Review, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10Dzahn) ran into this issue today when working on T266509.  Was wondering for some time why the envoy setup looks fine but things are not working.  until...
[01:14:08] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2295.codfw.wmnet', 'mw22...
[06:30:08] <wikibugs>	 10serviceops, 10SRE, 10Traffic, 10Wikimedia-production-error: Cyberbot is getting a lot of 502 errors, or blank responses when querying the API - https://phabricator.wikimedia.org/T273003 (10Joe) >>! In T273003#6778171, @CDanis wrote: > It seems the User-Agent being used is `Peachy MediaWiki Bot API Versio...
[07:25:38] <effie>	 _joe_: heads up, I will merge 629343 and enable ipv6 on restbase_dev 
[07:25:47] <_joe_>	 +1
[07:41:02] <effie>	 I think I should enable it on 2/3 mwdebug servers; the restbase-dev hosts have had almost no socket errors in the past 2 days
[07:41:53] <effie>	 or on a canary mw server, and let it sit a bit before rolling out everywhere
[08:23:55] <_joe_>	 effie: I'm more worried about "this breaks application X for unforeseen reasons"
[08:24:25] <effie>	 we can sit on a canary for as long as we want
[08:24:41] <effie>	 this has been going on for months, no need to rush it 
[09:16:21] <_joe_>	 yeah I would say do restbase-dev this morning, app canaries this afternoon if no fire is visible there, and we can then wait a week before promoting it to everywhere
[09:39:17] <wikibugs>	 10serviceops, 10observability, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10akosiaris) I 've left a comment in the merged change, duplicating here for visibility (since the change is merged already)  IPv4 compatible addresses are deprecated (yet still...
[09:39:57] <akosiaris>	 I think we should revisit the ipv4_compat approach once more tbh
[09:40:27] <akosiaris>	 I've commented on the task, but it can be problematic and we do use ACLs based on network addresses in envoy
[09:41:13] <akosiaris>	 last I checked supporting it "correctly" would require quite a bit of duplication of config, but maybe we can get away with templating it in a for loop 
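A minimal sketch of the ACL concern raised above, assuming the problem is that a dual-stack listener with ipv4_compat reports IPv4 peers as IPv4-mapped IPv6 addresses (`::ffff:a.b.c.d`), so a network-based ACL written for IPv4 ranges no longer matches them. The ACL range and the function names are hypothetical, and this is plain Python for illustration, not Envoy's own matching logic.

```python
import ipaddress

# Hypothetical ACL: a single internal IPv4 range.
ACL = ["10.64.0.0/12"]


def naive_allowed(addr, networks):
    """Check the peer address as-is against the ACL networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(net) for net in networks)


def normalize_peer(addr):
    """Unwrap an IPv4-mapped IPv6 address (::ffff:a.b.c.d) to plain IPv4."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


def allowed(addr, networks):
    """Check the normalized peer address against the ACL networks."""
    ip = normalize_peer(addr)
    return any(ip in ipaddress.ip_network(net) for net in networks)
```

With these definitions, `naive_allowed("::ffff:10.64.48.40", ACL)` is false because an IPv6 address is never a member of an IPv4 network, while `allowed("::ffff:10.64.48.40", ACL)` is true after unwrapping.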
[10:01:51] <_joe_>	 ugh I just realized I read the patch incorrectly, damn
[10:02:05] <_joe_>	 effie: you deployed only to restbase-dev, correct?
[10:02:18] <akosiaris>	 yes, only that
[10:03:03] <effie>	 we can fix it, no harm has been done
[10:03:06] <_joe_>	 yeah so, this basically makes envoy listen on all IPs, not just on localhost. I misread the patch
[10:03:30] <effie>	 it was 0.0.0.0 before
[10:03:44] <_joe_>	 uh, wait a sec
[10:03:51] <akosiaris>	 yeah, it was always listening on all IPs anyway
[10:03:54] <_joe_>	 it was for local listeners?
[10:04:08] <_joe_>	 then it was a brainfart from me when reviewing your patch
[10:04:16] <effie>	 things happen 
[10:04:25] <_joe_>	 (it's still properly firewalled on ipv4 at least, that's probably why)
[10:04:47] <akosiaris>	 even the admin interface is listening on all IPs
[10:04:58] <effie>	 but not in v6!
[10:05:32] <akosiaris>	 lol. indeed
[10:05:39] <_joe_>	 yeah still properly firewalled
[10:05:42] <akosiaris>	 it's listening btw on all ips so that prometheus can scrape it
[10:05:50] <_joe_>	 yes, but it's firewalled
[10:06:01] <effie>	 I know, I didn't touch the admin interface. I am about to pop out, let me take a better look later or tomorrow
[10:06:01] <akosiaris>	 yes. &R_SERVICE(tcp, 9631, (@resolve((prometheus1003.eqiad.wmnet prometheus1004.eqiad.wmnet)) @resolve((prometheus1003.eqiad.wmnet prometheus1004.eqiad.wmnet), AAAA)));
[10:06:05] <_joe_>	 I prefer the approach I took on k8s, but I never managed to backport it
[10:06:13] <effie>	 which is ?
[10:06:28] <_joe_>	 we only listen with the admin interface on localhost
[10:06:34] <effie>	 ah!
[10:06:45] <effie>	 I thought about the ::1 thing on the service itself
[10:06:46] <_joe_>	 and then we have a listener that just exposes the prometheus endpoints to everyone
[10:08:35] <akosiaris>	 interestingly on restbase nodes, there is no reason to listen on anything but localhost
[10:08:38] <effie>	 then I could possibly work on backporting that, then see what we will do about v6
[10:08:58] <effie>	 akosiaris: generally, this was intended to fix the connecting to 'localhost' issue
[10:09:04] <_joe_>	 akosiaris: on every host, the only thing that needs to listen on 0.0.0.0 is really the tls terminator
[10:09:18] <akosiaris>	 effie: yup
[10:09:22] <akosiaris>	 _joe_: and yup
[10:09:25] <_joe_>	 but given everything is firewalled, I never cared
[10:09:42] <akosiaris>	 defence in depth!
[10:09:50] <_joe_>	 but indeed introducing ipv4_compat on "lo" is not what I was worried about
[10:10:12] <_joe_>	 akosiaris: well, those listeners are just proxies to things you can reach from within anywhere in production anyways
[10:10:23] <_joe_>	 if anything, it will be more respectful than curl(1) 
[10:10:31] <_joe_>	 so... not really in need of defence :P
[10:12:17] <effie>	 so if we fix this to listen to 127.0.0.1 and ::1 
[10:12:26] <effie>	 will that make everyone happy?
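A rough sketch of the "listen on 127.0.0.1 and ::1" idea, using the "template it in a for loop" approach mentioned earlier: one listener stanza is generated per loopback address instead of a single dual-stack listener. The `socket_address`, `address` and `port_value` keys follow Envoy's listener address layout; the listener name, the port and the helper function are hypothetical, and filter chains are omitted.

```python
import json

# The two loopback addresses discussed in the channel.
LOOPBACKS = ("127.0.0.1", "::1")


def loopback_listeners(name, port):
    """Build one listener stanza per loopback address (filter chains omitted)."""
    return [
        {
            # Suffix the name so the two listeners stay distinguishable.
            "name": f"{name}_{'v6' if ':' in addr else 'v4'}",
            "address": {
                "socket_address": {"address": addr, "port_value": port}
            },
        }
        for addr in LOOPBACKS
    ]


# Hypothetical service name and port, for illustration only.
print(json.dumps(loopback_listeners("local_service", 6011), indent=2))
```

The cost is the config duplication noted above (two listeners per service), but no IPv4-mapped addresses appear, so address-based ACLs keep working unchanged.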
[10:12:52] <akosiaris>	 so, if I stop ferm for a while on any node, will I have just opened a port for anyone to mess with the admin interface?
[10:13:13] <effie>	 akosiaris: this can be true for many things in the infra 
[10:13:32] <akosiaris>	 sure, but that's no excuse for letting it happen if not needed
[10:13:33] <_joe_>	 akosiaris: yes, that's why I wanted to backport the approach we took in k8s
[10:13:44] <_joe_>	 it's a better approach, just didn't have time
[10:14:06] <_joe_>	 but I need to find time for a major overhaul of envoy's config anyway
[10:14:17] <akosiaris>	 OKR!
[10:14:20] <_joe_>	 me/someone in our team. We need to move to v3
[10:14:31] <_joe_>	 I have enough OKRs for this quarter, thanks :D
[10:14:39] <akosiaris>	 it can be next Q
[10:14:57] <_joe_>	 yes, or someone else picks it up. Valentin is already working on v3
[10:15:03] <effie>	 for the time being, because I really need to go 
[10:15:06] <_joe_>	 we have more stuff we need to do with envoy
[10:15:09] <_joe_>	 effie: go go go :)
[10:15:20] <_joe_>	 you can read the backlog later if you are interested :)
[10:15:23] <effie>	 should we roll back, or fix it to listen on 127.0.0.1 and ::1
[10:15:35] <_joe_>	 no it's restbase-dev, it's ok IMHO
[10:15:40] <effie>	 I agree
[10:15:41] <akosiaris>	 I don't think we need to rollback, we are ok for now
[10:16:04] <akosiaris>	 but it seems we need to invest a bit more time overall in it? I did not know there is a v3 config format.
[10:16:12] <akosiaris>	 how much better is it? Do I want to look it up?
[10:16:20] <effie>	 I would like to see the socket errors fixed though 
[10:16:25] <_joe_>	 akosiaris: there is other stuff I'd like to do, including starting to use file-based xDS, move to the v3 config, better CI...
[10:16:27] <akosiaris>	 or will it be more protobufs in a yaml format?
[10:16:36] <_joe_>	 akosiaris: ofc it is
[10:16:49] <_joe_>	 v3 just adds new features and fixes some logical inconsistencies
[10:16:51] <akosiaris>	 effie: They are harmless btw, it's not urgent. 
[10:16:59] <_joe_>	 I agree
[10:17:05] <effie>	 akosiaris: they are, but when it wasn't like that
[10:17:13] <_joe_>	 so, maybe we need to create a phab tag for service-proxy or envoy
[10:17:16] <akosiaris>	 and if v3 differs significantly it might be work down the drain
[10:17:20] <effie>	 and we had a minor issue that was related to e.g. temporary timeouts
[10:17:29] <effie>	 this graph was useful:
[10:17:37] <_joe_>	 that we might need to share with traffic folks
[10:17:52] <akosiaris>	 yeah I've seen the task. I get why it's useful to not have those around
[10:18:05] <effie>	 https://grafana.wikimedia.org/d/5E7tdiGWz/xxxx-effie?viewPanel=3&orgId=1
[10:18:28] <effie>	 as it would usually be 0, so anything else would show up there
[10:22:51] <akosiaris>	 our k8s envoys also don't listen on IPv6 of course. And more or less the same issue exists there too, just probably not of the same magnitude
[10:25:59] <effie>	 anyway, I think that socket errors, generally, are a useful metric when debugging a problem; right now it is basically useless
[10:42:56] <wikibugs>	 10serviceops, 10envoy, 10observability, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10Joe)
[10:43:53] <wikibugs>	 10serviceops, 10SRE, 10envoy, 10Service-Architecture: Using envoy to connect from MediaWiki to restbase causes an explosion of live LVS connections. - https://phabricator.wikimedia.org/T266855 (10Joe)
[10:44:04] <wikibugs>	 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Allow canarying new envoy configurations in kubernetes - https://phabricator.wikimedia.org/T265882 (10Joe)
[10:44:14] <wikibugs>	 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Improve envoy configuration CI checks - https://phabricator.wikimedia.org/T265881 (10Joe)
[10:44:24] <wikibugs>	 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Upgrade envoy configuration to use the v3 API - https://phabricator.wikimedia.org/T265880 (10Joe)
[10:44:41] <wikibugs>	 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Consider using a file-based xDS system for envoy in k8s - https://phabricator.wikimedia.org/T265879 (10Joe)
[10:49:22] <moritzm>	 wmfdebug:0.0.4 is the only container image we use (besides Toolforge) which contains sudo. By name it sounds like an image developers use for local tests, but we should upgrade it nonetheless, I guess
[14:02:26] <wikibugs>	 10serviceops, 10Analytics-Radar, 10Release-Engineering-Team, 10observability, and 2 others: Create a separate 'mwdebug' cluster - https://phabricator.wikimedia.org/T262202 (10jijiki)
[14:02:29] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10User-jijiki: Mechanism to flag webrequests as "debug" - https://phabricator.wikimedia.org/T263683 (10jijiki) 05Open→03Resolved @Milimetric  patch is merged! We are setting  debug=1 in the X-Analytics header if "X-Wikimedia-Debug" is present. Thank you fo...
[15:02:58] <ottomata>	 jayme:  ping on https://phabricator.wikimedia.org/T269160#6777382 :D
[15:45:06] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10JMeybohm) >>! In T269160#6777382, @elukey wrote: > Waiting for @JMeybohm's...
[15:45:07] <jayme>	 ottomata: pong
[15:47:20] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) >>! In T269160#6780685, @JMeybohm wrote: >>>! In T269160#6777382, @...
[15:48:11] <ottomata>	 TY!
[15:55:35] <wikibugs>	 10serviceops, 10Wikimedia-Logstash, 10Kubernetes: Create a logstash dashboard showing all application logs for a selected service - https://phabricator.wikimedia.org/T263755 (10Ottomata) 05Open→03Resolved a:03Ottomata Now that we've upgraded, here's a permalink:  https://logstash.wikimedia.org/app/dash...
[15:55:55] <ottomata>	 Heya, should we include this in a wikitech doc somewhere?
[15:55:55] <ottomata>	 https://logstash.wikimedia.org/app/dashboards#/view/7f883390-fe76-11ea-b848-090a7444f26c?_g=(filters%3A!()%2CrefreshInterval%3A(pause%3A!t%2Cvalue%3A0)%2Ctime%3A(from%3Anow-15m%2Cto%3Anow))
[15:56:54] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) ` Error: pods is forbidden: User "eventstreams-internal" cannot lis...
[15:59:27] <ottomata>	 Added here: https://wikitech.wikimedia.org/wiki/Deployment_pipeline/FAQ
[15:59:30] <ottomata>	 but not sure if that is the best place
[16:01:24] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10JMeybohm) You probably have not yet deployed the admin part (the new namespa...
[16:04:17] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) >>! In T269160#6780761, @JMeybohm wrote: > You probably have not ye...
[16:06:46] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10JMeybohm) Apart from you testing my attention again (kube_env admin [codfw|...
[16:09:39] <elukey>	 jayme: o/
[16:10:08] <elukey>	 so the admin for codfw yields a very long set of changes (chart: raw etc.)
[16:10:12] <elukey>	 like the one I did for staging
[16:10:37] <elukey>	 I said "n" to the first just to quickly stop but of course now it asks me a long list of "do you want to apply?"
[16:11:19] <jayme>	 elukey: yeah..maybe I did not say it enough: It's a mess :P
[16:11:55] <jayme>	 there are one or more loops in that shell script. So you will be questioned a bunch of times
[16:12:24] <jayme>	 As that is probably my fault (not deploying that after upgrading raw charts version) I can take it if you want
[16:12:44] <elukey>	 at this point I'll say "n" to all just to avoid any mess, is it ok? Then I'd let you proceed (when you have time) so I don't cause trouble
[16:13:03] <jayme>	 yeah, that's fine
[16:13:57] <jayme>	 sorry for that. :-| Let me know when you got out of the no-no-loop 
[16:14:24] <elukey>	 nono it is fine, really sorry to pester you 
[16:19:57] <elukey>	 jayme: I am out :)
[16:21:01] <jayme>	 elukey: ack
[16:25:22] <ottomata>	 elukey:  logs in logstash for eventstreams-internal: https://logstash.wikimedia.org/goto/b408da9f4b39f66a0d098006285a033e
[16:37:51] <jayme>	 elukey: you should be good on codfw
[16:38:51] <elukey>	 jayme: ack trying again
[16:40:37] <elukey>	 jayme: yep worked!
[16:42:51] <jayme>	 elukey: cool. Give me a sec to complete eqiad
[16:52:46] <jayme>	 elukey: eqiad should be fine as well
[16:53:55] <elukey>	 <3
[16:56:24] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) es-internal deployed in both eqiad and codfw, next steps are:  - te...
[17:00:09] <wikibugs>	 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10Ottomata) @elukey [[ https://logstash.wikimedia.org/goto/b408da9f4b39f66a0d...
[17:03:41] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:06:53] <elukey>	 updated https://wikitech.wikimedia.org/wiki/Kubernetes#Add_a_new_service as well :)
[17:11:21] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:12:26] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:13:54] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:18:58] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:35:54] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[17:36:54] <wikibugs>	 10serviceops, 10Dumps-Generation, 10Platform Engineering, 10SRE: Upgrade snapshot hosts to Buster - https://phabricator.wikimedia.org/T269377 (10ArielGlenn) I've built th package and set up a test instance in deployment-prep, but there's issues with mediawiki scripts there; see T273089 for the details.
[17:51:34] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1407.eqiad.wmnet'] `  an...
[17:52:21] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1406.eqiad.wmnet'] `  an...
[18:01:00] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2301.codfw.wmnet'] `  an...
[18:08:17] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:10:56] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:14:08] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:20:48] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1268.eqiad.wmnet'] `  an...
[18:25:01] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[18:44:06] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2215.codfw.wmnet'] `  an...
[18:58:10] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2216.codfw.wmnet'] `  an...
[19:15:28] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[19:28:04] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2217.codfw.wmnet'] `  an...
[19:30:05] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2218.codfw.wmnet'] `  an...
[19:31:11] <mutante>	 hrmm.. So I am still trying to get https://parsoid-rt-tests.wikimedia.org to work again. Meanwhile I: a) made envoy listen on IPv6, b) made backend nginx listen on IPv6. As opposed to before, I can now use curl from a random cp machine, like cp4030, and actually connect (404, but not connection refused anymore). Except.. from external I still see 502 .. sigh
[19:31:47] <mutante>	 but if my curl from cp works.. it is not firewalling, it is not about v4/v6 anymore, cert matches.. what else
[19:32:09] <cdanis>	 mutante: have you looked at the ats-be logs?
[19:32:38] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2219.codfw.wmnet'] `  an...
[19:33:53] <wikibugs>	 10serviceops, 10envoy, 10observability, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10jijiki) >>! In T255568#6779477, @akosiaris wrote: > I 've left a comment in the merged change, duplicating here for visibility (since the change is merged already)...
[19:35:10] <mutante>	 cdanis: I looked at /var/log/trafficserver/*.log and it says "[CONNECTION_ERROR] to 10.64.48.40". But when I telnet to that IP on 443 .. I am connected
[19:35:55] <cdanis>	 any chance it is using the wrong port?
[19:35:59] <mutante>	 curl -vvv -6 https://testreduce.discovery.wmnet/ also works
[19:36:38] <mutante>	 cdanis: well.. I changed the upstream port to be 8001, not the default of 80, but the TLS port should be 443 
[19:36:45] <cdanis>	 mm
[19:37:24] <mutante>	 could not connect [CONNECTION_ERROR] to 10.64.48.40 for 'https://testreduce.discovery.wmnet/favicon.ico'
[19:37:48] <mutante>	 then I do
[19:37:50] <mutante>	 curl -vvv https://testreduce.discovery.wmnet/favicon.ico
[19:38:00] <mutante>	 and it is connected and just 404
[19:38:29] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1263.eqiad.wmnet'] `  an...
[19:40:49] <mutante>	 also I can see I am already talking to the nginx backend when doing that .. hrmm
[20:12:13] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:16:25] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:22:59] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[20:35:15] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2221.codfw.wmnet'] `  an...
[20:51:54] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1405.eqiad.wmnet'] `  an...
[21:23:51] <wikibugs>	 10serviceops, 10Scap, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO: Define a mediawiki "version" - https://phabricator.wikimedia.org/T218412 (10Legoktm)
[21:36:27] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2222.codfw.wmnet'] `  an...
[21:37:41] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2246.codfw.wmnet'] `  an...
[23:57:30] <wikibugs>	 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...