[06:01:30] bd808: for a service only accessed from wikiprod, you might as well just make the backend available over IPv6 and skip the web proxy config. that also lets you not expose the k8s api to the entire internet.
[07:32:08] It seems it was temporary (re: I came across a problem in the pre-commit steps)
[08:40:39] morning. toolsdb was lagging again during the night, and recovered. if this keeps happening every night, it might be some cron job doing bulk deletes on a db. I wouldn't worry for now.
[09:50:24] anyone familiar with how to locally test https://gitlab.wikimedia.org/repos/cloud/cloud-vps/horizon/deploy ? My current state is that it tries to authenticate to eqiad keystone and gets unauthorized
[11:41:15] can I get a review for https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/49?
[12:15:46] one tab of the phabricator board is using 17G on my firefox...
[12:15:48] https://usercontent.irccloud-cdn.com/file/XvSgpHKj/image.png
[12:18:57] taavi: +1d
[12:29:44] taavi: replied to your questions here, with some of my own xd https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/24
[12:34:27] thanks, replied
[12:48:43] andrewbogott: does rados allow limiting access to (s3) buckets beyond a project? ideally loki would have a service account that can only access the loki buckets but nothing else in those projects
[13:25:07] I don't know, but not that I know of.
[13:25:39] There's still the option of creating per-tool keystone projects, which would unlock things like that, but I can't get anyone interested in that idea :)
[13:35:00] i mean, if we wanted to implement that isolation today, we could just make a separate cloud vps project to hold the buckets, no per-tool support needed for that
[13:35:51] also true! I guess I was thinking you'd have a container per tool, but maybe you're thinking about having all of Loki's stuff in one place.
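The IPv6 suggestion at the top (serve the backend directly over IPv6 and skip the web proxy) comes down to binding the service socket to an IPv6 address. A minimal self-contained sketch, not the actual backend — a real service would bind "::" (all IPv6 interfaces) on a fixed port:

```python
import socket

# IPv6 echo sketch: server bound to the IPv6 loopback on an ephemeral
# port, client connects over IPv6 and gets its bytes echoed back.
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.bind(("::1", 0))          # "::" would listen on all IPv6 interfaces
srv.listen(1)
port = srv.getsockname()[1]

cli = socket.create_connection(("::1", port))
conn, _addr = srv.accept()
cli.sendall(b"ping")
conn.sendall(conn.recv(4))    # echo the 4 bytes back
reply = cli.recv(4)
cli.close(); conn.close(); srv.close()
print(reply)
```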
[13:37:16] yeah, loki will definitely put all the log information in a single (well, 3) container. if you wanted per-tool containers you'd need to run an entirely separate loki installation, which doesn't make sense for the vast majority of the tools.
[13:37:50] yep, that would be a lot of loki
[13:38:17] so that brings me to the next question: which service account should I create the s3 credentials under? I guess I should create new developer accounts for loki in both tools and toolsbeta, and create the ec2-style credentials under those?
[13:41:51] You're talking about storing the logs in new projects, right? Like tools-loki-logs and toolsbeta-loki-logs?
[13:42:17] So probably the service user only really needs the role assigned on those projects and not on tools or toolsbeta themselves
[13:42:58] (the object_storage role, I mean)
[13:43:35] I'm honestly not really sure
[13:43:46] a separate tenant would indeed be the best for security
[13:44:11] but that means I now need to figure out how to make the toolforge tofu repo do things on multiple keystone tenants
[13:44:12] Are we using object storage in tools/toolsbeta for other things?
[13:44:30] tofu state at the moment
[13:44:32] soon: harbor storage
[13:46:04] And those are things which it's fine (or maybe good) for every tool user to be able to read...
[13:46:11] I'm just thinking my way through this...
[13:46:47] But we don't have a way to put logs in an s3 container in 'tools' without providing access to every log ever to every toolforge member
[13:47:08] by "member" you mean former projectadmin, right? not current "viewer"?
[13:47:19] No, I mean everyone, readers included.
[13:47:25] uh
[13:47:34] Is that wrong? Do 'readers' not have access to s3 things?
[13:47:41] I thought it was members only
[13:47:48] * andrewbogott checks
[13:48:09] rgw_keystone_accepted_roles = 'admin, member, object_storage'
[13:48:12] You are right
[13:48:34] phew.
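The `rgw_keystone_accepted_roles` line andrewbogott quotes above is a radosgw setting; a sketch of the kind of ceph.conf fragment it lives in (section name and surrounding layout are illustrative — only the roles line itself is confirmed by the log):

```ini
[client.rgw]
# Only keystone tokens carrying one of these roles may use the S3/Swift
# API; 'reader' is absent, so project viewers get no object storage access.
rgw_keystone_accepted_roles = 'admin, member, object_storage'
```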
tofu state must not be world-readable and none of that can be writable by non-admins
[13:48:45] and loki data also should not be readable by non-admins
[13:49:14] ok, great.
[13:49:37] So yeah, I think I don't have a lot of concern about storing logs in that same security sphere
[13:50:25] which is to say: I think it's up to you whether you want them in a separate tenant or not.
[13:50:33] i think out of those, tofu state is the most sensitive; in theory write access to it could be escalated to write access to most openstack resources
[13:50:36] Having them in the existing tenant seems OK to me.
[13:50:58] yeah
[13:51:34] otoh, this is much easier to do properly now than later
[13:51:37] so let's do them separately
[13:52:27] Fine with me, tenants are cheap. As you say, it's just getting tofu to manage service projects.
[13:53:02] I guess we could manage them in tofu-infra, which would be very simple to implement but not perfectly organized.
[13:53:50] want me to file project request tasks for the record?
[13:54:23] yeah, probably a good idea.
[13:54:24] also, do you think you'll have time for T396016 anytime soon?
[13:54:25] T396016: Review our handling of keystone 'member' role (previously known as 'projectadmin') - https://phabricator.wikimedia.org/T396016
[13:55:07] I haven't been thinking of that as high priority, is it blocking things or just making you nervous?
[13:56:07] not a blocker, but it would let me manage the required role assignments here within toolforge tofu instead of doing them by hand
[13:56:38] ah, I see! good point.
[13:57:09] I don't think you should plan on it getting done soon unless you need me to drop everything and focus on it. It'll be delicate and take a while.
[13:57:27] +1 for a new tenant :), we would need one for tools and one for toolsbeta, right?
[13:57:52] sure, that's fine
[13:58:02] dcaro: yeah, although we can start with toolsbeta only for now.
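The "service user only needs the role assigned on those projects" idea maps onto standard OpenStack CLI calls; a sketch with hypothetical user and project names (the `loki` user and `toolsbeta-logging` project are assumptions for illustration):

```shell
# Grant the loki service user object storage rights only on the logging project
openstack role add --user loki --project toolsbeta-logging object_storage

# Mint EC2-style (S3) credentials for loki scoped to that project
openstack ec2 credentials create --user loki --project toolsbeta-logging
```

The resulting access/secret key pair is what Loki's S3 storage config would consume; since the role assignment exists only on the logging project, those credentials cannot touch tofu state or harbor storage in the main tenants.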
[13:58:15] awesome yes
[14:00:22] I have some minor worries about getting tofu to do things on multiple tenants, but it's probably doable, worst case with multiple "applys", once per tenant
[14:00:46] T397339
[14:00:46] T397339: Request creation of toolsbeta-logging VPS project - https://phabricator.wikimedia.org/T397339
[14:48:28] bd808: regarding magnum vs. gophercloud, I'm not sure I agree with stephen's interpretation but he's fixing it https://github.com/gophercloud/gophercloud/issues/3429#issuecomment-2983369440
[14:59:41] taavi, tell me again the cookbook that reboots all cloudvirts?
[15:01:20] andrewbogott: wmcs.openstack.cloudvirt.safe_reboot, but give it a --cluster-name instead of a --fqdn (and likely --ceph-only)
[15:01:37] oh, it's hiding! Ok, trying...
[15:24:05] taavi: trying IPv6 exposure is on my list of things to do. I had problems building a cluster with the IPv6 network before, but it turns out that there was at least one other issue in play when that test failed, which I have since resolved.
[15:26:19] andrewbogott: His commit message for the magnum patch looks like he was going to give a citation and then forgot. :/ He is giving out big "everyone should read the docs I read" energy.
[15:26:59] He might be technically correct about the name being 'wrong', but he's going to have a steep climb.
[15:27:17] Although his patch supports multiple names, so it seems like a polite way to change it; it seems backwards-compatible.
[15:28:12] But we're vindicated in any case. I don't know what that means for your deployment, probably you don't want to wait for the next release.
[15:30:02] Oh, I just downgraded to a version of the openstack provider that works. I lost a day to that, but this is somewhat expected when copying a thing you haven't touched for 9 months and then trying to catch up with upstream changes.
[15:31:33] heh. I checked the guy's github profile. https://that.guru/ is his blog and well, that tells me something
[15:48:41] confident!
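Pointing tofu at a second keystone tenant without separate "applys" is usually done with provider aliases; a sketch (the project name comes from T397339 above, everything else — alias, resource, container names — is illustrative):

```hcl
# Default provider stays on the main toolsbeta project
provider "openstack" {
  tenant_name = "toolsbeta"
}

# Aliased provider scoped to the separate logging tenant
provider "openstack" {
  alias       = "logging"
  tenant_name = "toolsbeta-logging"
}

# Resources select a tenant by referencing the alias
resource "openstack_objectstorage_container_v1" "loki_chunks" {
  provider = openstack.logging
  name     = "loki-chunks"
}
```

With this shape a single `tofu apply` can manage resources in both tenants, at the cost of the credentials used needing role assignments on each project.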
[16:16:29] dcaro: unfortunately https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/API_Gateway partially duplicates https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/APIs :(
[16:17:17] oh, I did not see that page
[16:17:42] yeah, I didn't make it particularly discoverable then
[17:18:18] Can anyone review this: https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/36
[17:31:17] taavi: I am fixing it :)
[17:35:52] hashar: hmmm.. your patch just removes that rule, instead of fixing data: urls to work like I intended them to when writing that rule
[17:36:10] WIP
[17:37:13] taavi: I did it in two steps; the first change removes the erroneous entry, and I think it is fine since it is a noop
[17:37:37] cause that `img-src: 'unsafe-inline'` does not apply and is most probably ignored
[17:37:57] the next change adds `data:`, which does cause a change and might need to be reverted :]
[17:38:52] I have split them so that the addition of `data:` can be reverted if need be!
[17:39:05] ah
[17:39:18] if you need a demo/test case: https://object.eqiad1.wikimediacloud.org/swift/v1/AUTH_a3598983742448b3b056b5fcb228faa9/artifacts/2e1/wikimedia/2e13672fcdfd4d4ba68ec123eceac5dc/index.html
[17:39:40] the Firefox inspector seems to ignore the CSP ( https://phabricator.wikimedia.org/F62379119 ) which is how I caught it
[17:39:50] and I am wondering whether that might be a security issue in Firefox
[17:40:06] but after 10 hours of work, I think I will have dinner instead :-]
[17:40:16] thanks taavi !
[17:41:28] I am off for dinner
[18:04:25] taavi: for the pages, I think I'll incorporate some of the old APIs one into the API_Gateway one; it makes sense to me that admin pages are per-component (I'll also add how to get the pods and such things on the api-gateway one, same format we have for the others). Maybe part of it will go to the generic Toolforge page (about the generic APIs and such), and link from there to the specific component page
[18:04:28] wdyt?
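For context on the CSP exchange above: `'unsafe-inline'` is a keyword that only has meaning in script and style directives, so browsers indeed ignore it inside `img-src`, while `data:` is the scheme source that actually permits inline base64 images. A sketch of the before/after header (the directive lists are illustrative, only the `img-src` values are from the log):

```
Before ('unsafe-inline' is a no-op in img-src):
Content-Security-Policy: default-src 'self'; img-src 'self' 'unsafe-inline'

After (data: URLs allowed for images, e.g. <img src="data:image/png;base64,...">):
Content-Security-Policy: default-src 'self'; img-src 'self' data:
```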
[18:14:03] chuckonwu: done, almost there
[18:14:40] * dcaro off
[18:24:09] taavi: thank you :)