[00:09:22] 10serviceops, 10Parsoid-Tests, 10SRE, 10Parsoid (Tracking), 10Patch-For-Review: Make testreduce web UI publicly accessible on the internet - https://phabricator.wikimedia.org/T266509 (10Dzahn) [00:30:51] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by legoktm on cumin1001.eq... [01:13:37] 10serviceops, 10observability, 10Patch-For-Review, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10Dzahn) ran into this issue today when working on T266509. Was wondering for some time why the envoy setup looks fine but things are not working. until... [01:14:08] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2295.codfw.wmnet', 'mw22... [06:30:08] 10serviceops, 10SRE, 10Traffic, 10Wikimedia-production-error: Cyberbot is getting a lot of 502 errors, or blank responses when querying the API - https://phabricator.wikimedia.org/T273003 (10Joe) >>! In T273003#6778171, @CDanis wrote: > It seems the User-Agent being used is `Peachy MediaWiki Bot API Versio... 
[07:25:38] _joe_: heads up, I will merge 629343 and enable ipv6 on restbase_dev [07:25:47] <_joe_> +1 [07:41:02] I think I should enable on 2/3 mwdebug servers, restbase-dev host have had almost no socket errors the past 2 days [07:41:53] or on a canary mw server, and let it sit a bit before rolling out everywhere [08:23:55] <_joe_> effie: I'm more worried about "this breaks application X for unforeseen reasons" [08:24:25] we can sit on a canary for as long as we want [08:24:41] this has been going on for months, no need to rush it [09:16:21] <_joe_> yeah I would say do restbase-dev this morning, app canaries this afternoon if no fire is visible there, and we can then wait a week before promoting it to everywhere [09:39:17] 10serviceops, 10observability, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10akosiaris) I 've left a comment in the merged change, duplicating here for visibility (since the change is merged already) IPv4 compatible addresses are deprecated (yet still... [09:39:57] I think we should revisit the ipv4_compat approach once more tbh [09:40:27] I 've commented on the task, but it can be problematic and we do use ACLs based on network addresses in envoy [09:41:13] last I checked supporting it "correctly" would require quite a bit of duplication of config, but maybe we can get away with templating it in a for loop [10:01:51] <_joe_> ugh I just realized I read the patch incorrectly, damn [10:02:05] <_joe_> effie: did you just deploy to restbase-dev, correct? [10:02:18] yes, only that [10:03:03] we can fix it, no harm has been done [10:03:06] <_joe_> yeah so, this basically makes envoy listen on all IPs, not just on localhost. I misread the patch [10:03:30] it was 0.0.0.0 before [10:03:44] <_joe_> uh, wait a sec [10:03:51] yeah, it was always listening on all IPs anyway [10:03:54] <_joe_> it was for local listeners? 
[10:04:08] <_joe_> then it was a brainfart from me when reviewing your patch [10:04:16] things happen [10:04:25] <_joe_> (it's still properly firewalled on ipv4 at least, that's why probably) [10:04:47] even the admin interface is listening on all IPs [10:04:58] but not in v6! [10:05:32] lol. indeed [10:05:39] <_joe_> yeah still properly firewalled [10:05:42] it's listening btw on all ips so that prometheus can scrape it [10:05:50] <_joe_> yes, but it's firewalled [10:06:01] I know, I didn't touch the admin interface. I am about to pop out, let me take a better look later or tomorrow [10:06:01] yes. &R_SERVICE(tcp, 9631, (@resolve((prometheus1003.eqiad.wmnet prometheus1004.eqiad.wmnet)) @resolve((prometheus1003.eqiad.wmnet prometheus1004.eqiad.wmnet), AAAA))); [10:06:05] <_joe_> I prefer the approach I took on k8s, but I never managed to backport it [10:06:13] which is ? [10:06:28] <_joe_> we only listen with the admin interface on localhost [10:06:32] sh [10:06:34] ah ! [10:06:45] I thought about the ::1 thing on the service itself [10:06:46] <_joe_> and then we have a listener that just exposes the prometheus endpoints to everyone [10:08:35] interestingly on restbase nodes, there is no reason to listen on anything but localhost [10:08:38] then I could possibly work to backport that, then see what we will do about v6 [10:08:58] akosiaris: generally, this was intended to fix the connecting to 'localhost' issue [10:09:04] <_joe_> akosiaris: on every host, the only thing that needs to listen on 0.0.0.0 is really the tls terminator [10:09:18] effie: yup [10:09:22] _joe_: and yup [10:09:25] <_joe_> but given everything is firewalled, I never cared [10:09:42] defence in depth! 
[10:09:50] <_joe_> but indeed introducing ipv4_compat on "lo" is not what I was worried about [10:10:12] <_joe_> akosiaris: well, those listeners are just proxies to things you can reach from within anywhere in production anyways [10:10:23] <_joe_> if anything, it will be more respectful than curl(1) [10:10:31] <_joe_> so... not really in need of defence :P [10:12:17] so if we fix this to listen to 127.0.0.1 and ::1 [10:12:26] will that make everyone happy ? [10:12:52] so, if I stop ferm for a while on any node, will I have just opened a port for anyone to mess with the admin interface? [10:13:13] akosiaris: this can be true for many things in the infra [10:13:32] sure, but that's no excuse for letting it happen if not needed [10:13:33] <_joe_> akosiaris: yes, that's why I wanted to backport the approach we took in k8s [10:13:44] <_joe_> it's a better approach, just didn't have time [10:14:06] <_joe_> but I need to find time for a major overhaul of envoy's config anyways [10:14:17] OKR! [10:14:20] <_joe_> me/someone in our team. We need to move to v3 [10:14:31] <_joe_> I have enough OKRs for this quarter, thanks :D [10:14:39] it can be next Q [10:14:57] <_joe_> yes, or someone else picks it up. Valentin is already working on v3 [10:15:03] for the time being, because I really need to go [10:15:06] <_joe_> we have more stuff we need to do with envoy [10:15:09] <_joe_> effie: go go go :) [10:15:20] <_joe_> you can read the backlog later if you are interested :) [10:15:23] should we roll back, or fix it to listen to 127.0.0.1 and ::! [10:15:25] 1 [10:15:35] <_joe_> no it's restbase-dev, it's ok IMHO [10:15:40] I agree [10:15:41] I don't think we need to rollback, we are ok for now [10:16:04] but it seems we need to invest a bit more time overall in it? I did not know there is a v3 config format. [10:16:12] how better is it? Do I want to look it up? 
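The concern raised above about ipv4_compat and ACLs based on network addresses can be illustrated with a small sketch. It assumes that a peer connecting over IPv4 to a dual-stack socket shows up as an IPv4-mapped IPv6 address (`::ffff:a.b.c.d`), which is how Linux dual-stack sockets report such peers. The helper names and the `10.64.0.0/16` network are hypothetical and only for illustration; this is not how envoy itself evaluates its ACLs.

```python
import ipaddress


def normalize_peer(addr: str):
    """Return the peer address, unwrapping IPv4-mapped IPv6 (::ffff:a.b.c.d)."""
    ip = ipaddress.ip_address(addr)
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


def acl_allows(peer: str, allowed_networks, unmap: bool = True) -> bool:
    """Check a peer against a list of CIDR networks (hypothetical ACL helper).

    With unmap=False the peer is compared as-is, which is what a naive
    network-based ACL would do on a dual-stack listener.
    """
    ip = normalize_peer(peer) if unmap else ipaddress.ip_address(peer)
    for net in allowed_networks:
        network = ipaddress.ip_network(net)
        # Only compare addresses and networks of the same IP version.
        if ip.version == network.version and ip in network:
            return True
    return False


if __name__ == "__main__":
    acl = ["10.64.0.0/16", "127.0.0.0/8", "::1/128"]  # assumed example networks
    mapped_peer = "::ffff:10.64.48.40"
    print(acl_allows(mapped_peer, acl, unmap=False))  # naive comparison
    print(acl_allows(mapped_peer, acl, unmap=True))   # after unmapping
```

An IPv4-only ACL does not match the mapped form of the same client unless the address is unmapped first, or the ACL entries are duplicated for both forms. That is roughly the "duplication of config" trade-off mentioned in the discussion.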
[10:16:20] I would like to see the socket errors fixed though [10:16:25] <_joe_> akosiaris: there is other stuff I'd like to do, including starting to use file-based xDS, move to the v3 config, better CI... [10:16:27] or will it be more protobufs in a yaml format? [10:16:36] <_joe_> akosiaris: ofc it is [10:16:49] <_joe_> v3 just adds new features and fixes some logical inconsistencies [10:16:51] effie: They are harmless btw, it's not urgent. [10:16:59] <_joe_> I agree [10:17:05] akosiaris: they are, but when it wasn't like that [10:17:13] <_joe_> so, maybe we need to create a phab tag for service-proxy or envoy [10:17:16] and if v3 differs significantly it might be work down the drain [10:17:20] and we had a minor issue that was related with eg temporary timeouts [10:17:29] this graph, was useful: [10:17:37] <_joe_> that we might need to share with traffic folks [10:17:52] yeah I 've seen the task. I get why it's useful to not have those around [10:18:05] https://grafana.wikimedia.org/d/5E7tdiGWz/xxxx-effie?viewPanel=3&orgId=1 [10:18:28] as it would be usually 0, so anything else, it would show up there [10:22:51] our k8s envoys also don't listen on IPv6 of course. And more or less the same issue exists there too, just probably not on the same magnitude [10:25:59] anyway, I think that socket errors, generally, is a useful metric when debugging a problem, right now we basically useless [10:26:08] it is* [10:42:56] 10serviceops, 10envoy, 10observability, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10Joe) [10:43:53] 10serviceops, 10SRE, 10envoy, 10Service-Architecture: Using envoy to connect from MediaWiki to restbase causes an explosion of live LVS connections. 
- https://phabricator.wikimedia.org/T266855 (10Joe) [10:44:04] 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Allow canarying new envoy configurations in kubernetes - https://phabricator.wikimedia.org/T265882 (10Joe) [10:44:14] 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Improve envoy configuration CI checks - https://phabricator.wikimedia.org/T265881 (10Joe) [10:44:24] 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Upgrade envoy configuration to use the v3 API - https://phabricator.wikimedia.org/T265880 (10Joe) [10:44:41] 10serviceops, 10SRE, 10envoy, 10Kubernetes, 10Service-Architecture: Consider using a file-based xDS system for envoy in k8s - https://phabricator.wikimedia.org/T265879 (10Joe) [10:49:22] wmfdebug:0.0.4 is the only container image we use (besides Toolforge) which contains sudo. By name it sounds like an image developers use for local tests, but we should upgrade it nonetheless, I guess [14:02:26] 10serviceops, 10Analytics-Radar, 10Release-Engineering-Team, 10observability, and 2 others: Create a separate 'mwdebug' cluster - https://phabricator.wikimedia.org/T262202 (10jijiki) [14:02:29] 10serviceops, 10Analytics, 10Analytics-Kanban, 10User-jijiki: Mechanism to flag webrequests as "debug" - https://phabricator.wikimedia.org/T263683 (10jijiki) 05Open→03Resolved @Milimetric patch is merged! We are setting debug=1 in the X-Analytics header if "X-Wikimedia-Debug" is present. Thank you fo... [15:02:58] jayme: ping on https://phabricator.wikimedia.org/T269160#6777382 :D [15:45:06] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10JMeybohm) >>! In T269160#6777382, @elukey wrote: > Waiting for @JMeybohm's... 
[15:45:07] ottomata: pong [15:47:20] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) >>! In T269160#6780685, @JMeybohm wrote: >>>! In T269160#6777382, @... [15:48:11] TY! [15:55:35] 10serviceops, 10Wikimedia-Logstash, 10Kubernetes: Create a logstash dashboard showing all application logs for a selected service - https://phabricator.wikimedia.org/T263755 (10Ottomata) 05Open→03Resolved a:03Ottomata Now that we've upgraded, here's a permalink: https://logstash.wikimedia.org/app/dash... [15:55:55] Heya, should we include this in a wikitech doc somewhere? [15:55:55] https://logstash.wikimedia.org/app/dashboards#/view/7f883390-fe76-11ea-b848-090a7444f26c?_g=(filters%3A!()%2CrefreshInterval%3A(pause%3A!t%2Cvalue%3A0)%2Ctime%3A(from%3Anow-15m%2Cto%3Anow)) [15:56:54] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) ` Error: pods is forbidden: User "eventstreams-internal" cannot lis... [15:59:27] Added here: https://wikitech.wikimedia.org/wiki/Deployment_pipeline/FAQ [15:59:30] but not sure if that is the best place [16:01:24] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10JMeybohm) You probably have not yet deployed the admin part (the new namespa... [16:04:17] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) >>! 
In T269160#6780761, @JMeybohm wrote: > You probably have not ye... [16:06:46] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10JMeybohm) Apart from you testing my attention again (kube_env admin [codfw|... [16:09:39] jayme: o/ [16:10:08] so the admin for codfw yields to a very long chart: raw etc.. set of changes [16:10:12] like the one I did for staging [16:10:37] I said "n" to the first just to quickly stop but of course now it asks me a long list of "do you want to apply?" [16:11:19] elukey: yeah..maybe I did not say it enough: It's a mess :P [16:11:55] there is one (or more) loops in that shell script. So you will be questioned a bunch of times [16:12:24] As that is probably my fault (not deploying that after upgrading raw charts version) I can take it if you want [16:12:44] at this point I'll say "n" to all just to avoid any mess, is it ok? Then I'd let you proceed (when you have time) so I don't cause troubles [16:13:03] yeah, thats fine [16:13:57] sorry for that. :-| Let me know when you got out of the no-no-loop [16:14:24] nono it is fine, really sorry to pester you [16:19:57] jayme: I am out :) [16:21:01] elukey: ack [16:25:22] elukey: logs in logstash for eventstreams-internal: https://logstash.wikimedia.org/goto/b408da9f4b39f66a0d098006285a033e [16:37:51] elukey: you should be good on codfw [16:38:51] jayme: ack trying again [16:40:37] jayme: yep worked! [16:42:51] elukey: cool. 
Give me a sec to complete eqiad [16:52:46] elukey: eqiad should be fine as well [16:53:55] <3 [16:56:24] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) es-internal deployed in both eqiad and codfw, next steps are: - te... [17:00:09] 10serviceops, 10Analytics, 10Analytics-Kanban, 10Event-Platform, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10Ottomata) @elukey [[ https://logstash.wikimedia.org/goto/b408da9f4b39f66a0d... [17:03:41] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [17:06:53] updated https://wikitech.wikimedia.org/wiki/Kubernetes#Add_a_new_service as well :) [17:11:21] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [17:12:26] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... 
[17:13:54] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [17:18:58] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [17:35:54] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [17:36:54] 10serviceops, 10Dumps-Generation, 10Platform Engineering, 10SRE: Upgrade snapshot hosts to Buster - https://phabricator.wikimedia.org/T269377 (10ArielGlenn) I've built th package and set up a test instance in deployment-prep, but there's issues with mediawiki scripts there; see T273089 for the details. [17:51:34] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1407.eqiad.wmnet'] ` an... [17:52:21] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1406.eqiad.wmnet'] ` an... 
[18:01:00] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2301.codfw.wmnet'] ` an... [18:08:17] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [18:10:56] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [18:14:08] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [18:20:48] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1268.eqiad.wmnet'] ` an... [18:25:01] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... 
[18:44:06] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2215.codfw.wmnet'] ` an... [18:58:10] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2216.codfw.wmnet'] ` an... [19:15:28] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [19:28:04] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2217.codfw.wmnet'] ` an... [19:30:05] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2218.codfw.wmnet'] ` an... [19:31:11] hrmm.. So I am still trying to get https://parsoid-rt-tests.wikimedia.org to work again. meanwhile I: a) made envoy listen on IPv6 b) made backend nginx listen on IPv6.. as opposed to before now I can use curl from a random cp machine, like cp4030 and actually connect (404, but not connection refused anymore). except.. from external I still see 502 .. 
sigh [19:31:47] but if my curl from cp works.. it is not firewalling, it is not about v4/v6 anymore, cert matches.. what else [19:32:09] mutante: have you looked at the ats-be logs? [19:32:38] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2219.codfw.wmnet'] ` an... [19:33:53] 10serviceops, 10envoy, 10observability, 10User-fgiunchedi: Envoy should listen on ipv6 and ipv4 - https://phabricator.wikimedia.org/T255568 (10jijiki) >>! In T255568#6779477, @akosiaris wrote: > I 've left a comment in the merged change, duplicating here for visibility (since the change is merged already)... [19:35:10] cdanis: I looked at /var/log/trafficserver/*.log and it says "[CONNECTION_ERROR] to 10.64.48.40". But when i telnet to that IP on 443 .. i am connected [19:35:55] any chance it is using the wrong port? [19:35:59] curl -vvv -6 https://testreduce.discovery.wmnet/ also works [19:36:38] cdanis: well.. I changed the upstream port to be 8001, not the default of 80, but the TLS port should be 443 [19:36:45] mm [19:37:24] could not connect [CONNECTION_ERROR] to 10.64.48.40 for 'https://testreduce.discovery.wmnet/favicon.ico' [19:37:48] then I do [19:37:50] url -vvv https://testreduce.discovery.wmnet/favicon.ico [19:37:54] curl [19:38:00] and it is connected and just 404 [19:38:29] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1263.eqiad.wmnet'] ` an... [19:40:49] also I can see I am already talking to the nginx backend when doing that .. 
hrmm [20:12:13] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [20:16:25] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [20:22:59] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia... [20:35:15] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2221.codfw.wmnet'] ` an... [20:51:54] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1405.eqiad.wmnet'] ` an... 
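The manual checks from the 19:31 to 19:40 debugging thread (telnet to port 443, `curl -6`) can be sketched as a per-address-family TCP probe. This is a minimal sketch using only Python's standard `socket` module; the `probe` helper is hypothetical, and the hostname in the usage example is taken from the log and only resolves inside that network.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> dict:
    """Try a plain TCP connect to every address the name resolves to.

    Returns a dict like {"ipv4": True, "ipv6": False}; a family is absent
    when the name has no address of that family.
    """
    labels = {socket.AF_INET: "ipv4", socket.AF_INET6: "ipv6"}
    results = {}
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return results  # name did not resolve at all
    for family, socktype, proto, _canonname, sockaddr in infos:
        label = labels.get(family)
        if label is None:
            continue
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
            ok = True
        except OSError:
            ok = False
        # A family counts as reachable if any of its addresses accepts.
        results[label] = results.get(label, False) or ok
    return results


if __name__ == "__main__":
    # Hostname from the log; adjust to something resolvable where you run this.
    print(probe("testreduce.discovery.wmnet", 443))
```

A successful connect only shows that the port accepts TCP connections for that address family. It says nothing about TLS, SNI, or what the proxy in front does, which is consistent with curl working from a cp host while ats-be still logged CONNECTION_ERROR.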
[21:23:51] 10serviceops, 10Scap, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO: Define a mediawiki "version" - https://phabricator.wikimedia.org/T218412 (10Legoktm) [21:36:27] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2222.codfw.wmnet'] ` an... [21:37:41] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2246.codfw.wmnet'] ` an... [23:57:30] 10serviceops, 10SRE, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...