[10:37:14] !log cloudinfra rebooting cloudinfra-acme-chief-01 to ensure hostname stability (T276041)
[10:37:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL
[10:37:24] T276041: Puppet failure on cloudinfra-acme-chief-01.novalocal - https://phabricator.wikimedia.org/T276041
[10:37:24] Hello Ms/Mr Cloud! Please detach the (newly created) LDAP account from WP account 'grin', so I can attach my old and regained account ('grin' as well) myself. You can also remove the new (currently attached) one from LDAP, or I'll request that later on. I have repeated this request on the new account's talk page.
[10:38:58] 😁 can you open a task in phabricator.wikimedia.org please? that will make sure we get to it
[10:39:22] Will do.
[10:39:27] thanks!
[10:39:32] thanks as well. :-)
[10:46:54] dcaro, uh, where would you prefer the task to be opened? there seems to be no ops or similar component.
[10:47:39] or just into the cloud services top, wherever it lands? :)
[10:48:20] use cloud-services-team, we'll move it around if needed
[10:48:32] ok.
[14:41:51] !log deployment-prep changed profile::redis::multidc::discovery from 'false' to "" to comply with strict typing in the deployment-memc puppet prefix.
[14:41:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[17:54:24] tarrow: your VM orig-01.wikibase-registry.eqiad1.wikimedia.cloud has had puppet disabled for many months. Can I enable it?
[18:01:02] question, on the DNS Zones tab on horizon, can I create an A record pointing to the IP of an instance with a web proxy, so I could also use that DNS name, in this case lta-tracker.wmflabs.org, to access the web proxy
[18:01:07] (if that made any sense)
[18:01:17] or do I need a floating IP to do that
[18:07:43] andrewbogott: go ahead. I don't remember us disabling it for any good reason. It probably died when we bumped into a disc quota or something.
[18:08:55] However, I'll check with the rest of the team tomorrow if we actually still need that VM (I'm on holiday and it's mostly after euro working hours). Thanks! :)
[18:09:56] Last week, I asked how to share data between instances on Cloud VPS (or send data from a Cloud VPS instance to a tool on Toolforge). I think @mutante said: "ideally, use puppet to create a systemd timer on your cloud VPS that runs a command which dumps it into the webserver document root..." To do that, do I need to request Shared Storage NFS
[18:09:57] services? https://wikitech.wikimedia.org/wiki/Help:Shared_storage#/data/project
[18:10:25] ariutta: I don't believe so
[18:11:40] ah wait yes you would
[18:11:56] "You can request for access to the listed shares by filing a task on Phabricator under the Data-Services and VPS-Projects projects. When Shared Storage NFS services have been granted, NFS will be mounted by puppet on any VMs where the hiera key mount_nfs: true applies."
[18:15:54] The one thing that confused me was that mutante said I could dump data into the webserver document root. But I don't think Shared Storage NFS services allows one instance to access the document root of another instance, does it? It seems as if the first instance would have to dump the data to /data/project and then the webserver instance would have to
[18:15:55] transfer from /data/project to the document root.
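
A minimal Puppet sketch of the webserver side of the question above, assuming the Shared Storage NFS request is granted and the VM carries the hiera key mount_nfs: true. The class and path names are illustrative, not taken from an existing project; the point is that the webserver does not need a second copy step, it can link the shared directory into its document root and serve it directly.

    # Hypothetical webserver-side profile: expose the NFS-shared directory
    # through the document root without copying it again.
    # Assumes /data/project is mounted (mount_nfs: true) and Apache is
    # configured to follow symlinks; 'summaries' is a made-up subdirectory.
    class profile::summaries_webroot {
        file { '/var/www/html/summaries':
            ensure => link,
            target => '/data/project/summaries',
        }
    }
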
[18:22:35] tarrow: oh, it has a full drive
[18:22:41] so it definitely hasn't been doing anything useful
[18:23:54] Maybe Cloud VPS works like Toolforge, where I can transfer data from the bastion node like: cp /data/project/wikipathways/testfile.txt /data/project/wikipathways-data/public_html/
[18:24:48] ariutta: do you want to /share/ data between two VMs, or just do a one-off transfer (e.g. to rebuild or replace a VM)?
[18:25:30] andrewbogott: I want to do a periodic transfer
[18:25:58] ok, so yeah, NFS would be a reasonable solution or you could set up some kind of rsync
[18:26:20] andrewbogott: you can rsync from vps to toolforge?
[18:26:22] you're right that it would need to happen via /data/project or /scratch
[18:26:33] oh, hang on
[18:26:37] I was thinking this was within a project
[18:26:39] My first thought was rsync, but I was told managing the SSH credentials would be difficult
[18:26:48] between toolforge and a non-toolforge VM? that's a different problem...
[18:27:00] Either one works for me
[18:27:35] My first thought was to rsync from a Cloud VPS instance to Toolforge, but I can just do it all within a Cloud VPS project, no Toolforge
[18:27:35] Things are set up to have fairly strict boundaries between projects.
[18:28:00] so, I'm confused. Where is the data coming from and where is it going?
[18:28:58] A Cloud VPS instance is periodically generating some data summaries, and I want to share those data summaries via a webserver. That webserver can be on Cloud VPS or whatever is easiest.
[18:29:21] I see
[18:29:46] so you could avoid data transfer entirely by having the same instance that generates the summaries publish them, couldn't you?
[18:31:50] Yes, that might work. I'm still new to Cloud VPS, but I thought it was recommended to make each instance have a tight focus on functionality. But I could definitely treat an instance more like a "normal" on-prem server.
[18:33:18] question, on the DNS Zones tab on horizon, can I create an A record pointing to the IP of an instance with a web proxy, so I could also use that DNS name, in this case lta-tracker.wmflabs.org, to access the web proxy
[18:33:46] Zppix: why would you not just name the proxy that?
[18:34:10] andrewbogott: that is a very good point... *facepalms*
[18:34:11] ariutta: If you keep things within a single project it should be straightforward. You could run the webserver on a different VM and use NFS to share the files between the VMs
[18:34:13] or do it all in one place
[18:34:27] Zppix: as far as I know you can have multiple front-end proxies to the same backend
[18:34:46] tarrow: that VM is totally broken so I'm going to ignore it for now. Please clean it up or delete it sometime soon :)
[18:35:14] * Zppix needs coffee
[18:35:26] I don't know why I was trying to overcomplicate it
[18:37:12] @andrewbogott: Cool, thanks! I went ahead and made a Shared Storage NFS services request: https://phabricator.wikimedia.org/T276141
[18:37:29] ok!
[18:37:46] andrewbogott: Roger, I suspect it was supposed to be decommissioned and deleted then. Thanks for flagging it up
[18:38:38] also I can confirm you can have multiple web proxies per instance
[18:38:58] per instance or per project?
[18:39:09] both
[18:39:48] you can per project for sure, per instance probably yes but haven't tried
[18:39:48] no, I was saying I can confirm that you can
[18:40:13] ah, didn't see that :P
[18:40:35] np
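
The producer side of the periodic transfer discussed above could look roughly like the sketch below, once the share requested in T276141 is mounted. It uses the systemd::timer::job define that comes up later in this log (around 19:11); the parameter names should be checked against the current operations/puppet repo, and the class name, paths, user and schedule are all illustrative assumptions.

    # Hypothetical profile for the VM that generates the data summaries:
    # publish them into the project NFS share on a schedule.
    class profile::summaries_publish {
        systemd::timer::job { 'publish-summaries':
            ensure      => present,
            description => 'Copy generated data summaries to the shared NFS path',
            user        => 'root',
            # Source and destination paths are made up for the example.
            command     => '/usr/bin/rsync -a /srv/summaries/ /data/project/summaries/',
            # Daily at 04:00, using systemd OnCalendar syntax.
            interval    => {
                'start'    => 'OnCalendar',
                'interval' => '*-*-* 04:00:00',
            },
        }
    }
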
[19:06:36] The docs say Cloud VPS instances aren't backed up, and we should always be prepared to redeploy an instance (via Puppet, etc.). So we should treat an instance's disk storage as ephemeral, and if we need data to stay around, use a volume (Cinder Attachable Block Storage). If I'm using a LAMP stack webserver (role::simplelamp2) to share data and that
[19:06:37] data should survive an instance failure, should I put the data onto a Cinder volume and serve it directly from the volume? I could create a symlink in /var/www/html/ that points to the mounted volume.
[19:08:35] ariutta: out of a gut feeling I would probably serve normally from the local file system but then additionally create a cron/timer that does something like rsync -av /var/www/html /srv/cinder/foo/backup/
[19:09:08] just 2 cents though, by no means the official docs
[19:09:31] That would make sense. I just didn't want to take up twice the storage space if not needed :-)
[19:11:46] ariutta: btw, there is systemd::timer::job in puppet to automatically create a service and timer for you that runs any command.. which can be an rsync for backup. then you have also puppetized the job itself in case an instance goes away
[19:11:58] Nice!
[19:12:10] ask me for an example if you get to that
[19:12:43] I think I'm about ready, if you have an example handy!
[19:13:08] https://gerrit.wikimedia.org/r/q/topic:%22cron-timer%22+(status:open%20OR%20status:merged)
[19:13:17] eh, let me pick a more specific one
[19:13:37] https://gerrit.wikimedia.org/r/c/operations/puppet/+/655172/4/modules/deployment/manifests/rsync.pp
[19:13:40] at the bottom
[19:13:46] that is a timer that runs an rsync
[19:13:55] Very helpful, thanks!
[19:14:01] but the first step you need is to have your own role class
[19:14:07] ideally
[19:14:13] that you can apply to your nodes
[19:14:23] now you use simplelamp2, right?
[19:14:38] yes, that's right
[19:14:40] so you want to add to that
[19:14:45] either by adding a second role
[19:14:57] or by making a new role that does what simplelamp2 does but then also more
[19:15:19] that is the idea of it at least
[19:15:21] pretty cool
[19:15:37] that each cloud vps project can pick from a menu of standard roles but also make their own
[19:15:57] the thing is mostly that you need someone to merge code in prod or use your own puppetmaster
[19:16:07] with their own pros and cons
[19:16:16] mutante: that menu on horizon doesn't show up, at least not for me
[19:16:26] I have to go to the puppet repo for the list xD
[19:16:28] I can help though to get something merged if it's actually used
[19:17:32] Zppix: yes, back in Wikitech there were buttons to add a class from the repo to the list shown to cloud VPS users in the UI to select from
[19:17:53] There's supposed to be one on Horizon too, according to the docs
[19:18:01] Zppix: but in Horizon it's not like that anymore (maybe)
[19:18:24] https://wikitech.wikimedia.org/wiki/Help:Puppet#Apply_a_puppet_role_to_or_change_hiera_config_of_an_individual_instance
[19:18:35] yea, I think I noticed this too Zppix
[19:18:57] it could be improved, to have a larger list of pre-selected roles
[19:19:06] that have been tested to work for cloud VPS uses
[19:19:19] i know it used to show it, idk if it was removed or what
[19:19:28] I think there just weren't that many users.
[19:19:34] I mean I talked to like 5 people about simplelamp
[19:19:37] when replacing it
[19:20:03] * Zppix really needs to upgrade my puppetmaster to buster, it's running on a year-old image of stretch
[19:20:07] would be nice to have more though, and more standard roles beyond simplelamp
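
Putting the pieces from 19:08 to 19:15 together, a project role that "does what simplelamp2 does but then also more" might look roughly like the sketch below. It is only a sketch under stated assumptions: in the production repo a role would normally include profiles rather than another role, /srv/cinder/foo/backup/ is just the placeholder path from the suggestion above, and the systemd::timer::job parameters should be checked against the linked rsync.pp example; all names are illustrative.

    # Hypothetical role combining the existing LAMP setup with the
    # project-specific extras discussed above.
    class role::wikipathways_data {
        # Everything role::simplelamp2 provides (Apache, MariaDB, PHP);
        # in the real repo you would include the same profiles it does
        # instead of including a role from a role.
        include ::role::simplelamp2

        # The publishing timer sketched earlier in this log.
        include ::profile::summaries_publish

        # mutante's suggestion from 19:08: keep serving from the local
        # filesystem, but back the document root up to the mounted Cinder
        # volume on a schedule so the data survives an instance failure.
        systemd::timer::job { 'backup-docroot-to-cinder':
            ensure      => present,
            description => 'Back up /var/www/html to the Cinder volume',
            user        => 'root',
            command     => '/usr/bin/rsync -av /var/www/html /srv/cinder/foo/backup/',
            interval    => {
                'start'    => 'OnCalendar',
                'interval' => '*-*-* 05:00:00',
            },
        }
    }
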
[19:23:49] I hate the fact that I have to delete the instance and create a new one to change the image
[19:24:34] Zppix: but.. did you let puppet set up the puppetmaster?
[19:24:47] yes, but I don't want to have to redo all the certs
[19:24:52] gotcha
[19:24:59] for the 3rd time
[19:34:37] mutante: puppet used to have an API that provided available classes and docs. Because of 'open core' nonsense that feature vanished from the version of puppet that we use, and that API was the backbone of the menu-based puppet UI.
[19:35:33] andrewbogott: ooh, interesting. I had no idea. 'open core'.. hrmm.. like varnish
[19:35:51] or gitlab
[19:36:25] At least I think that's where it went. I told some of the devs that I was using it to generate UI and they were like, oh, cool! But they didn't offer any suggestions beyond that.
[19:36:45] Also SREs complained to me periodically about the UI being slow to load and I was tired of that, so I wasn't very motivated to keep it alive :)
[19:41:13] *nod*. I think we could do something close enough by just having a wiki page that lists a couple of recommended classes. It doesn't necessarily have to be in Horizon itself and doesn't need to let you select any role from the repo. Probably makes more sense to have a hand-curated list, or users are overwhelmed by all the prod roles.
[19:42:34] as long as there is the normal form to apply any class to a node.. the docs about what to type into it can live elsewhere
[19:42:49] mutante: yeah, agreed, some docs with suggested roles and a link from Horizon would just about do the trick.
[19:43:20] yep
[19:45:37] !log adding inherited domain-wide roles to novaadmin and novaobserver as per T274385
[19:45:38] andrewbogott: Unknown project "adding"
[19:45:38] T274385: rework novaadmin and novaobserver project memberships - https://phabricator.wikimedia.org/T274385
[19:50:54] !log admin adding inherited domain-wide roles to novaadmin and novaobserver as per T274385
[19:50:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[19:50:58] T274385: rework novaadmin and novaobserver project memberships - https://phabricator.wikimedia.org/T274385
[19:51:11] !log admin removing novaobserver from all projects save 'observer' for T274385
[19:51:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[20:12:11] !log admin removing novaadmin from all projects save 'admin' for T274385
[20:12:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[20:12:17] T274385: rework novaadmin and novaobserver project memberships - https://phabricator.wikimedia.org/T274385
[21:15:08] mutante: just do what I do, enable all the roles :P (jk)
[21:57:47] !log tools.wd-image-positions deployed bf35152db8 (GitHub actions; adds pytest to venv)
[21:57:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wd-image-positions/SAL