[01:45:46] bd808, tofu just talks directly to the openstack apis, right? It doesn't use some kind of bespoke tofu middleware or api provider?
[01:46:03] Because I can't make much sense out of https://github.com/gophercloud/gophercloud/commit/e3947338f69b8b52bd494f09388152febc49c6aa, it seems like it would break basically everything.
[01:46:48] * andrewbogott checking to see if that's actually right or if he's just confused about the version strings
[01:48:29] andrewbogott: Yes, just the normal api. It gets the actual version numbers for each service from calling the service catalog API.
[01:49:15] I guess the cinder change seems right...
[01:49:42] So it's maybe just a mistake for that one service. Seems like an awfully specific mistake though
[01:51:31] I don't actually know a lot about how this works. Is gophercloud a tf provider, an alternative to terraform-provider-openstack/openstack?
[01:52:18] (I'm looking at paws and quarry which are the two things that I know work with magnum, don't see any refs to gophercloud in there but maybe it's an indirect dependency)
[01:54:58] oh maybe you need gophercloud because go rather than python
[01:55:09] Gophercloud is used inside the terraform provider.
[01:55:13] https://docs.openstack.org/api-ref/container-infrastructure-management/ is the docs location, but that URL is about the only place I can find the "container-infrastructure-management" string.
[01:56:01] Yes, gophercloud is a golang native OpenStack client.
[01:58:40] Let me dig up my github credentials and see if I can ask a question or open an issue...
[02:02:23] I think https://opendev.org/openstack/magnum/search?q="Openstack-Api-Version" makes it likely that the gophercloud change is wrong.
[02:04:53] I don't know how this code is structured at all, but this seems like it could be the same issue: https://github.com/gophercloud/gophercloud/issues/1682
[02:07:28] oh, hang on, that was filed in 2019
[02:07:29] dang
[02:07:37] definitely not related
[02:08:26] "If you know how to fix an issue, consider opening a pull request for it." ok then, I just might
[02:09:01] unless you want to do the honors
[08:03:11] morning
[08:03:22] o/
[08:39:23] there was a brief alert about something related to anycast, is anyone rebooting/restarting/reloading something somewhere? (hmm... I think it might have actually been nrpe bad output for the anycast check)
[08:44:43] dcaro: sounds like a side effect of T374842
[08:44:44] T374842: Retire anycast_healthchecker Icinga check - https://phabricator.wikimedia.org/T374842
[08:45:13] yep, that's most probably it :), thanks
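
A hedged aside on the header question above: one way to settle whether Magnum accepts "container-infrastructure-management" in the OpenStack-API-Version microversion header, or only a shorter string such as "container-infra", is to send the same request with both values and compare the responses. The Go sketch below is only an illustration of that idea; MAGNUM_ENDPOINT, OS_TOKEN, and the "1.10" microversion are placeholders/assumptions, not values from the conversation.

    package main

    import (
        "fmt"
        "io"
        "net/http"
        "os"
    )

    func main() {
        // Placeholders: point these at a real Magnum endpoint and a valid Keystone token.
        endpoint := os.Getenv("MAGNUM_ENDPOINT") // e.g. https://cloud.example.org:9511/v1/clusters (hypothetical)
        token := os.Getenv("OS_TOKEN")

        // Two candidate service strings for the microversion header; which one
        // Magnum actually accepts is the open question in the log above.
        candidates := []string{
            "container-infra 1.10",
            "container-infrastructure-management 1.10",
        }

        for _, service := range candidates {
            req, err := http.NewRequest(http.MethodGet, endpoint, nil)
            if err != nil {
                fmt.Fprintln(os.Stderr, err)
                continue
            }
            req.Header.Set("X-Auth-Token", token)
            req.Header.Set("OpenStack-API-Version", service)

            resp, err := http.DefaultClient.Do(req)
            if err != nil {
                fmt.Fprintln(os.Stderr, err)
                continue
            }
            body, _ := io.ReadAll(resp.Body)
            resp.Body.Close()

            // A rejected service string should come back as a 4xx with an error
            // body; an accepted one should return the normal cluster listing.
            fmt.Printf("%-45s -> %s (%d bytes)\n", service, resp.Status, len(body))
        }
    }

Running this against any reachable Magnum deployment (codfw1dev, for example) with the two environment variables set should make the unsupported string show up as a 4xx while the supported one lists clusters normally.
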
[09:06:50] hello!
reading the backscroll about gophercloud, I'm trying to figure out which versions of the provider contain that commit, but it's not straightforward
[09:10:26] it seems they have given up on updating CHANGELOG.md, but this suggests the commit is in 2.7.0 but not in 2.6.0 https://github.com/gophercloud/gophercloud/compare/v2.6.0...v2.7.0
[09:12:25] and gophercloud 2.7 was introduced in v3.1.0 of the openstack provider
[09:12:40] I will double check with a.ndrew later if this matches his findings
[09:16:07] oh and toolsdb is lagging again :/ looking
[09:16:36] just a short spike that resolved by itself, apparently
[11:32:11] fwiw on that abuse complaint - the domain they posted (botnix.cloud) seems to be a minecraft-server hosting thing
[11:33:26] which is significant as a huge amount of ddos traffic is stupid minecraft-beef stuff
[11:33:56] might point more at a malicious user rather than 'misconfigured box'
[12:41:23] I am rolling out ipv6 config on the eqiad1 cloudservices (DNS) hosts, so in case you see alerts those are probably me
[12:43:22] ack
[12:46:16] I looked into the clouddumps alert that triggered 20 mins ago. the unit kiwix-mirror-update.service failed with an rsync error. restarting the unit seems to have fixed it.
[13:38:24] thanks
[15:12:15] andrewbogott: I think at this point I've mostly convinced myself that gophercloud is the broken bit and not us. I guess the one thing I haven't done or seen you do is ask the OpenStack Magnum folks directly if `Openstack-Api-Version: container-infrastructure-management` is valid or a hallucination. Thanks for putting some energy into looking.
[15:14:07] That's what I think too, do you want to open an issue with gophercloud or shall I? I'm thinking that there might just be a bug in their aliasing setup.
[15:15:11] If you have the energy to describe the problem to them that would be great. I totally can too, but maybe you have a bit more OpenStack authority than I do. :)
[15:15:33] I'll give it a go
[16:19:56] dhinus: dcaro: andrewbogott: please check that my summary in https://phabricator.wikimedia.org/T394035#10924578 is accurate
[16:23:41] taavi: LGTM, thanks. I'm also interested in hearing if you think there are weak points in this version, apart from the fact we will not be able to use this service from a CLI client
[16:24:46] dhinus: yeah, the main downside imho is the additional complexity for making a user-facing API, since that will require Yet One More Service To Maintain
[16:25:42] I'm personally ok with that tradeoff, I hope I won't regret it in the future :)
[16:42:18] that looks right to me, thanks for writing it up taavi
[16:42:27] andrewbogott: dhinus fyi. about the partman recipe, the patch https://gerrit.wikimedia.org/r/c/operations/puppet/+/1075552 was to 'restore' the previous status after introducing the shared partman script, we were already deleting the drives before that
[16:42:38] I don't think the UI end is more work than any of the other options really.
[16:43:04] dcaro: ok. When I have codfw1dev to play with I'll fork that recipe without that bit and see if it does something reasonable :)
[16:43:46] taavi: LGTM
[16:45:00] andrewbogott: note that the cookbooks might not be ready for this (not sure xd). In codfw though we might not have enough HA, probably taking a full node out breaks the cluster :/
[16:47:33] Yeah, that's why I'm waiting for new codfw1dev hardware to come online before experimenting. T393614
[16:47:33] T393614: Q4:rack/setup/install cloudcephosd200[567] - https://phabricator.wikimedia.org/T393614
[16:49:20] dcaro: ack about the partman patch, that makes sense
[16:50:51] I think it makes sense to test if we can do what data-platform is doing, i.e. not delete the partitions on reimage. it will become even more useful if we get larger hosts with larger drives
[17:00:30] bd808: https://github.com/gophercloud/gophercloud/issues/3429 in case you want to add more detail
[17:01:04] dhinus: agree yep
[17:23:03] not sure if there's anyone around, but this would be good to get a review on, it's a simple patch but prevents the jobs-emailer from failing again https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/24
[17:23:27] (we have an alert now though, so not super-critical)
[17:29:00] * dcaro off
[17:29:01] cya tomorrow
[20:27:41] Ready for review on https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/87
[20:40:57] I came across a problem in the pre-commit steps of Components-CLI, https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/jobs/539588, where the pipeline is throwing a 500 error when building the wheel for shellcheck_py. Running the tests locally causes no issues, so it's the GitLab server that is having trouble downloading this library. I'll dig deeper tomorrow, but any clues are appreciated.
[20:41:03] * chuckonwu off
[23:34:04] taavi: I feel like you and I both have written things in Phab about adding TLS support for the proxied services behind dynamicproxy, but today I am not finding that ticket. Am I hallucinating or just not guessing the right keywords?
[23:39:37] Context for wanting to find this ticket/conversation is me trying to figure out how to expose the Kubernetes API from a Magnum cluster to WMF production. I'm thinking https://zuul-k8s.wmcloud.org should be possible, but might need some work on dynamicproxy (or other hacks)
[23:41:29] Tyler found it for me! https://phabricator.wikimedia.org/T274386#10779578 I was looking for open tasks :)
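
A hedged aside on that last item: if the zuul-k8s.wmcloud.org name discussed above ever exists behind dynamicproxy, one quick way to check that the proxy hands back a usable TLS endpoint for the cluster's apiserver is a plain HTTPS request to /version. The Go sketch below assumes that hypothetical hostname and only demonstrates reachability plus inspecting the served certificate, not authenticated access or any particular dynamicproxy design.

    package main

    import (
        "fmt"
        "net/http"
        "os"
        "time"
    )

    func main() {
        // Hypothetical proxied apiserver name from the conversation above; pass
        // a different URL as the first argument to check something else.
        url := "https://zuul-k8s.wmcloud.org/version"
        if len(os.Args) > 1 {
            url = os.Args[1]
        }

        client := &http.Client{Timeout: 10 * time.Second}
        resp, err := client.Get(url)
        if err != nil {
            // TLS handshake or certificate errors surface here, which is the
            // part the proxy work would need to get right.
            fmt.Fprintln(os.Stderr, "request failed:", err)
            os.Exit(1)
        }
        defer resp.Body.Close()

        // An unauthenticated request to an apiserver typically returns 401/403
        // (or version info) rather than a connection failure, so any HTTP
        // status at all means the endpoint is reachable over TLS.
        fmt.Println("status:", resp.Status)
        if resp.TLS != nil {
            for _, cert := range resp.TLS.PeerCertificates {
                fmt.Println("cert:", cert.Subject.String(), "expires:", cert.NotAfter.Format(time.RFC3339))
            }
        }
    }
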