[00:07:50] General toolforge-related question: is it frowned upon to have clients (let's say the Android app) hit a service hosted on toolforge? Or maybe a better question is, are there general guidelines around what kinds/amounts of traffic should/shouldn't be hitting a toolforge service. Or does "it depend" :)
[00:15:16] nikkinikk: Depends what you mean by "hit" etc
[00:16:30] For example, if someone was going to make the Wikipedia Android app hard depend on something on toolforge, and even more so put a lot of load on it, that would definitely be frowned upon
[00:19:07] Reedy: Specifically, a node API hosted on toolforge that Android would hit for a proof of concept project. Unsure on expected load right now, but there would be a hard dependency on the service to enable one of the new Android features; it wouldn't be relied on for basic functionality (reading articles, editing articles, etc.), just for one new experimental feature.
[00:34:49] I wouldn't recommend it just because Toolforge doesn't have the same privacy policy, scalability options, monitoring, logging, etc.
[00:36:46] nikkinikk: traffic-wise you would probably be fine. Service stability is a different question
[00:46:04] Got it, ok, I might go back, get some more info, and then bug you guys again. Because I'm not sure what the expected uptime/stability is; y'all just made it too easy to get stuff up and alive, it's so tempting to just leave it on there haha.
[06:26:06] hey, I'm rebuilding some deployment-prep instances. If I have an instance that needs more than the default 20 GB of disk for storage, but I don't care if I lose that data, is Cinder still preferred or should I use LVM? I see the large disk instance size on Horizon but it's marked as "private", so I'm not sure if I should use that
[06:27:45] andrewbogott: ^ since you asked to ping, but I'd imagine others can help too
[06:28:12] if Cinder is preferred, I'm going to need more than the default 10 GB for all of deployment-prep :D
[08:16:57] Majavah: I would say cinder is more future-proof
[08:17:09] feel free to ask for more quota
[09:12:47] !log admin draining cloudvirt1022 for T275753
[09:12:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:48:41] why are some hiera values set in operations/puppet/hieradata/cloud/eqiad1 and not in horizon?
[09:57:59] !log tools depool tools-sgewebgrid-generic-0901 to reboot VM. It was stuck in MIGRATING state when draining cloudvirt1022
[09:58:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:09:15] Majavah: I guess it depends on the project; some seem to be there to work around weird scenarios (e.g. the puppet ENC not working), and some seem to be just config (that could live in horizon). Having version control is a plus of keeping it there, but being able to change it right away is a plus of horizon
[10:33:28] Majavah: good question. We try to have everything on horizon, but for some concrete cases it makes sense to have them in ops/puppet.git, at least some defaults. If for whatever reason the puppet ENC API is down, and therefore cloud puppetmasters cannot read hiera from horizon, then it could "destroy" some VMs by using wrong hiera default values.
[10:33:28] So it is kind of a fail-safe mechanism to have some "key" values in ops/puppet.git
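A minimal sketch of the fail-safe behaviour described above: per-project hiera edited in Horizon is layered on top of version-controlled defaults from ops/puppet.git, and if the ENC API cannot be reached the repo defaults stay in effect rather than an empty config. The ENC URL and file path here are hypothetical placeholders, not the real endpoints, and the actual cloud puppetmaster logic differs in detail.

```python
# Illustration only (not the real puppetmaster code) of the hiera fail-safe:
# Horizon/ENC hiera overrides repo defaults, but if the ENC API is
# unreachable we keep the ops/puppet.git defaults instead of proceeding
# with empty (potentially destructive) values.
import requests  # assumed available
import yaml      # PyYAML

ENC_API = "https://puppet-enc.example.org/v1/hiera"   # hypothetical URL
REPO_DEFAULTS = "hieradata/cloud/eqiad1/common.yaml"  # illustrative path in ops/puppet.git

def hiera_for(project: str, fqdn: str) -> dict:
    # 1. version-controlled defaults from ops/puppet.git
    with open(REPO_DEFAULTS) as f:
        values = yaml.safe_load(f) or {}
    # 2. overlay per-project/per-host hiera edited in Horizon, if reachable
    try:
        resp = requests.get(f"{ENC_API}/{project}/{fqdn}", timeout=5)
        resp.raise_for_status()
        values.update(yaml.safe_load(resp.text) or {})
    except (requests.RequestException, yaml.YAMLError):
        pass  # ENC down: fall back to the repo defaults only
    return values
```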
[11:01:58] !log admin rebooting cloudvirt1022 for T275753
[11:02:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:24:28] !log admin rebooted cloudvirt1022, re-adding to ceph and removing from maintenance host aggregate for T275753
[11:24:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:25:46] !log tools rebooted tools-sgewebgrid-generic-0901, repooling it again
[11:25:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:29:48] !log admin draining cloudvirt1024 for T275753
[11:29:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:32:15] the deployment-prep silence on vpsalertmanager.toolforge.org is expiring in a day, it probably should be extended somehow?
[14:24:13] more questions: why does instance-puppet have a phabricator/ directory when there is no such project as "phabricator"?
[14:53:44] maybe a leftover? it's been a while since it had any meaningful change; there's a phabricator VM under devtools though
[15:02:10] yeah, I'm just wondering since it references some deleted deployment-prep VMs
[15:12:04] !log admin rebooting cloudvirt1024 for T275753
[15:12:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[15:41:38] !log admin draining cloudvirt1025 for T275753
[15:41:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[15:55:18] !log admin rebooting cloudvirt1025 for T275753
[15:55:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[15:57:45] !log admin draining cloudvirt1026 for T275753
[15:57:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:09:30] !log admin rebooting cloudvirt1026 for T275753
[16:09:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:11:45] !log admin draining cloudvirt1031 for T275753
[16:11:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:33:36] !log admin rebooting cloudvirt1031 for T275753
[16:33:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:34:48] !log admin draining cloudvirt1032 for T275753
[16:34:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:46:40] !log toolhub Restarting docker process. Not sure if it was a crash or another problem.
[16:46:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolhub/SAL
[16:47:11] bd808: I think I live-migrated the VM the other day. Could be related...
[16:49:08] arturo: yeah, it's possible. No big deal. It's a super experimental demo server :)
[16:50:29] ok :-)
[16:59:19] !log admin rebooting cloudvirt1032 for T275753
[16:59:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[18:36:10] !log admin rebooting cloudmetrics1002; the console is hanging
[18:36:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[19:41:16] how long does it normally take for a newly-created VM to register in dns?
[20:04:23] almost instant in my experience
[20:04:40] shdubsh: which VM and what name are you trying?
[20:05:14] Majavah: pontoon-elastic7-02.monitoring.eqiad.wmflabs
[20:05:57] shdubsh: https://wikitech.wikimedia.org/wiki/News/Phasing_out_the_.wmflabs_domain
[20:06:21] new VMs only have .wikimedia.cloud names, not .wmflabs names
[20:07:01] shdubsh: try pontoon-elastic7-02.monitoring.eqiad1.wikimedia.cloud
[20:07:04] it pings
[20:07:56] yep, that appears to be alive now (it wasn't earlier: `Host pontoon-elastic7-02.monitoring.wikimedia.cloud not found: 2(SERVFAIL)`)
[20:08:25] now the error is `Permission denied (publickey).`
[20:08:26] that name is missing .eqiad1. in the middle
[20:08:45] * shdubsh facepalms (my bad, sorry!)
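For anyone following along, a quick way to check the two name forms discussed above is a small resolution check like the sketch below. The instance name is the one from the chat, it should be run from a host that can resolve Cloud VPS names, and only the eqiad1.wikimedia.cloud form is expected to exist for newly created VMs.

```python
# Check which DNS names exist for a new Cloud VPS instance. The legacy
# .wmflabs form is no longer created for new VMs (see the phasing-out
# announcement linked above).
import socket

CANDIDATES = [
    "pontoon-elastic7-02.monitoring.eqiad.wmflabs",           # legacy form
    "pontoon-elastic7-02.monitoring.eqiad1.wikimedia.cloud",  # current form
]

def resolves(fqdn: str) -> bool:
    try:
        socket.getaddrinfo(fqdn, None)
        return True
    except socket.gaierror:
        return False

for name in CANDIDATES:
    print(f"{name}: {'resolves' if resolves(name) else 'does not resolve'}")
```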
[20:10:42] shdubsh: did you get into it? I see a connection closed message in auth.log with your username in it
[20:11:14] bd808: something must be up with my ssh config at this point
[20:11:24] `jumphost loop via bastion.wmcloud.org`
[20:11:40] what does your ssh config look like?
[20:14:03] hmmm... the connection closed lines in auth.log are marked "[preauth]", which maybe means the public key was not accepted?
[20:14:38] "Accepted publickey for cwhite" victory!
[20:15:11] Majavah, bd808: got it. I had a fairly legacy setup with quite a few *. entries. Nuked a few and it clicked into place
[20:15:24] thanks for the assist!
[20:15:51] * bd808 mostly cheered from the bleachers
[20:25:13] !log tools.lexeme-forms deployed 15a24d63eb (minor Czech verbs improvement)
[20:25:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[21:08:15] !log tools.lexeme-forms deployed 1435d31446 (update Swedish translations)
[21:08:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[21:54:07] shdubsh: if you just created the VM, give it ~5-10 minutes to run the init script; my guess is it was denying your key because it hadn't finished ssh setup yet
[21:54:40] I've had the exact same thing happen before, and based on the instance logs when it's happened to me, it's because cloud-init had not finished yet
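A hedged sketch of the "wait for first boot" advice in the last two messages: right after creating an instance, retry a no-op ssh command until the key is accepted instead of assuming your ssh config is broken. The host is the instance from the chat, the retry timings (up to roughly 10 minutes) are illustrative rather than prescribed, and it assumes your ~/.ssh/config already routes through the Cloud VPS bastion.

```python
# Retry ssh until a fresh Cloud VPS instance finishes cloud-init and
# starts accepting the public key. Timings are illustrative only.
import subprocess
import time

HOST = "pontoon-elastic7-02.monitoring.eqiad1.wikimedia.cloud"  # instance from the chat

def wait_for_ssh(host: str, attempts: int = 20, delay: int = 30) -> bool:
    """Run a no-op ssh command until the key is accepted or we give up."""
    for _ in range(attempts):
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host, "true"],
            capture_output=True,
        )
        if result.returncode == 0:
            return True      # key accepted, instance is ready
        time.sleep(delay)    # likely still running cloud-init; wait and retry
    return False

if __name__ == "__main__":
    print("ready" if wait_for_ssh(HOST) else "still not accepting the key")
```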