[05:45:12] !log tools.sal restarted webservice (T259560)
[05:45:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sal/SAL
[08:01:14] To discuss development ideas about the LanguageConverter it might be useful to ask the maintainers instead. ;)
[14:11:40] Hmm, this one instance has been "scheduling" for 22 hours now. Not really a problem, but I suppose it's consuming unnecessary resources? Is there some form to fill out to create a Phabricator task or something?
[14:13:48] Something sounds wrong for sure. Just file a task and tag cloud services team and cloud-vps
[14:43:28] Reedy: ok, thanks!
[15:25:28] kalle: we're having some db performance issues which cause actions to time out. I'd advise deleting and recreating, unless you're unlucky it should work this time
[17:00:19] andrewbogott: Ran delete some 20 hours ago, but that didn't help.
[17:04:19] Ok... I'm in transit but will clean things up later today
[17:06:08] No worries, we don't the the resources or anything.
[17:06:27] Added a bug now. T259644
[17:06:28] T259644: New instance has been scheduling for more than a day - https://phabricator.wikimedia.org/T259644
[18:24:00] !log cloudinfra Made DNS entry lists.wmcloud.org A 185.15.56.49 (Tried CNAME to mailman.wmcloud.org first but Horizon didn't like that) (T259444)
[18:24:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL
[18:24:02] T259444: Request for creating a DNS record for lists.wmcloud.org to 185.15.56.28 - https://phabricator.wikimedia.org/T259444
[18:54:50] !log admin restarting mariadb on cloudcontrol1004 to setup parallel replication
[18:54:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[19:45:54] bd808: sorry but one thing, gmail gave me this: "DNS Error: 29726073 DNS type 'mx' lookup of lists.wmcloud.org responded with code NXDOMAIN Domain name not found: lists.wmcloud.org" I thought if no MX record exists, it would fallback to A type record but I might be wrong
[19:46:18] nope, you need an MX
[19:47:01] :(
[19:47:12] Amir1: do you only need an MX record? Or will this also be used for other things?
[19:47:18] no, just MX
[19:47:30] Thanks
[19:47:34] sorry again
[19:47:40] I bother a lot
[19:56:26] Amir1: we are having some db problems that will keep me from making the MX record for a bit, but I'll update the task when it is done
[19:56:48] thanks. No worries. This can wait
[21:57:47] Amir1: I think you may have been correct about not needing an MX record. I figured out that the A record I had made went missing from Designate as fallout from the db things we have going on in OpenStack right now. I recreated the record. Maybe you can run your test again?
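A minimal sketch of the fallback rule discussed above, assuming the behaviour described in RFC 5321 section 5.1: if a domain has MX records they are tried in order of preference, and if it has none but does have an address record, the domain itself is treated as an implicit MX. A name with no records at all (the NXDOMAIN case Gmail reported while the A record was missing) cannot receive mail. The function name `mail_targets` and its arguments are hypothetical and only illustrate the rule; no real DNS lookup is made.

```python
from typing import List, Optional, Tuple


def mail_targets(domain: str,
                 mx_records: List[Tuple[int, str]],
                 address_records: List[str]) -> Optional[List[str]]:
    """Return the hosts a sender would try, or None if the domain is undeliverable.

    mx_records is a list of (preference, hostname) pairs; address_records is
    the list of A/AAAA addresses for the domain itself. Both are supplied by
    the caller (hypothetical inputs, not fetched from DNS here).
    """
    if mx_records:
        # Explicit MX records win; the lowest preference value is tried first.
        return [host for _, host in sorted(mx_records)]
    if address_records:
        # No MX record: the domain itself acts as an implicit MX.
        return [domain]
    # Neither MX nor address records: nothing to deliver to.
    return None
```

With only the A record in place, `mail_targets("lists.wmcloud.org", [], ["185.15.56.49"])` yields the domain itself, which matches the outcome at 21:57:47 once the record was recreated.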
[22:00:04] !log deployment-prep deleted deployment-chromium01
[22:00:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[22:25:23] bd808: sure
[22:25:59] bd808: Actually, I don't need to
[22:26:00] https://usercontent.irccloud-cdn.com/file/01OdFPqs/image.png
[22:28:36] I need to fix the HTTPS traffic though (I need to open port 443, do the Let'sEncrypt dance, ...)
[22:29:14] Amir1: are you using deployment-prep or a separate project?
[22:29:31] it's a separate project (mailman)
[22:29:44] I had done it before with meet
[22:29:50] so you won't have acme-chief then
[22:30:02] and the letsencrypt::cert method will fail
[22:30:13] i recommend you use the same workaround we used for gerrit
[22:30:37] which is basically installing certbot and a cron
[22:30:38] ensure_packages('certbot')
[22:30:38] cron { 'certbot_renew':
[22:30:51] then you have to manually run it once but the renewal will be automatic
[22:31:39] the reason you can't use letsencrypt::certificate like in the past is that https://gerrit.wikimedia.org/r/c/operations/puppet/+/602722 does not get merged
[22:31:54] and upstream "Account creation on ACMEv1 is disabled."
[22:32:04] so it still works to renew existing certs but not to create new certs
[22:32:16] just trying to save you from that rabbit hole
[22:33:36] ah puppet
[22:33:51] Thanks!
[22:34:11] it's not puppet, it's https://community.letsencrypt.org/t/end-of-life-plan-for-acmev1/88430
[22:34:38] Amir1: so.. install certbot package. manually run it. i did the option that just gets me the certs and does not try to edit webserver config.. copied them in place
[22:34:49] then the cron does the rest
[22:34:53] mutante: is this about the version of certbot you have installed?
[22:34:58] if you want to use ACMEv2
[22:35:16] ningu: it's that acme_tiny.py uses ACMEv1
[22:35:20] certbot is the alternative
[22:35:27] ah ok
[22:35:38] fix should be https://gerrit.wikimedia.org/r/c/operations/puppet/+/602722
[22:35:53] I've always just used certbot
[22:36:02] !log tools.lexeme-forms deployed 39457a18ab (Bengali adjectives and verbs)
[22:36:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[22:36:13] that would fasten things up
[22:36:25] ningu: yea, but the production method in the past was letsencrypt::certificate puppet class
[22:36:35] and then that was replaced with acme_chief
[22:36:48] but acme_chief means you have to setup another instance in your project for it
[22:37:08] acme_tiny looks useful in general, actually, but I've never had much of an issue getting certbot installed and working
[22:37:09] unfortunately prod is not like cloud again
[22:37:52] ningu: it is fine for a few servers but it's a different story when it's hundreds of servers
[22:37:53] and since everyone else uses certbot essentially, I just rely on it being well maintained
[22:38:04] yeah, I've only ever had a few so :P
[22:38:26] the point of acme_chief is that the requests to LE are all made from a central server
[22:38:41] which then gives the certs to the servers actually needing them
[22:39:03] setting that up in another cloud instance to serve 1 other instance would be way overkill
[22:39:41] how many cert requests are there really, though? even with hundreds of servers, you'd just need the renewal every few months and check for it once a day or whatever?
[22:39:44] doing it completely manual is also bad.. so that method above was my "that's as good as it gets" to solve all that
[22:40:04] no, not completely manually, I meant puppetized certbot
[22:40:12] maybe that's still too disorganized though
[22:41:11] https://phabricator.wikimedia.org/source/operations-software-acme-chief/browse/master/README.md
[22:43:29] !log tools.lexeme-forms actually deployed 39457a18ab (forgot to git rebase)
[22:43:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
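The "check once a day, renew every few months" pattern from the discussion above can be sketched roughly as follows. This is only an illustration of the decision a daily cron job has to make, not the code certbot or acme_chief actually runs. The function name `needs_renewal` and the 30-day window are assumptions for the example (30 days is believed to match certbot's default renewal window, but treat it as a parameter).

```python
from datetime import datetime, timedelta


def needs_renewal(not_after: datetime, now: datetime, window_days: int = 30) -> bool:
    """Return True when the certificate expires within window_days of now.

    not_after is the certificate's expiry time. window_days is an assumed
    threshold for this sketch; a daily job would call this once per
    certificate and only contact the CA when it returns True.
    """
    return not_after - now <= timedelta(days=window_days)
```

With a 90-day certificate and a 30-day window, a daily check like this would trigger a renewal request roughly once every 60 days per certificate, which is the small request volume the 22:39:41 question is getting at.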