[00:00:41] !log admin T193264 Added osm.db.svc.eqiad.wmflabs to cloud DNS
[00:00:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[00:00:52] T193264: Replace labsdb100[4567] with instances on cloudvirt1019 and cloudvirt1020 - https://phabricator.wikimedia.org/T193264
[11:50:18] bstorm_: hey, should scoring platform do it? https://phabricator.wikimedia.org/T219563
[11:56:45] Amir1: a bit early for her
[11:57:07] it's fine. I'm around for a couple of hours :D
[12:00:22] Amir1: I would guess, you don't probably have access to add a DNS entry under svc.eqiad.wmflabs
[12:00:36] yes, that's probably true
[12:00:40] Though... "We'd want to update apps that connect to the server as well" should be doable (at least creating patches)
[12:00:58] !log wikidata-dev add alaasarhan as member+admin
[12:00:59] I'm presuming Brooke created it as a TODO, and ease of tracking the "updating apps" followup work
[12:00:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikidata-dev/SAL
[12:01:15] okay, it was on the SP's team kanban so I thought we should do it
[12:01:49] I think you blame Phabs' "Create Subtasks" being crappy and copying every man and his dog down
[12:03:54] LOL probably
[12:19:21] that changed. now you get a new pop up with options how to create subtask, heh
[12:19:29] and you can remove people there
[13:19:16] !log maps shutting down maps-tiles2 and maps-tiles3
[13:19:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Maps/SAL
[13:33:00] Amir1: yeah, that was just a tag remaining when I created a subtask. Wikilabels should start using the new hostname after it is created, but nothing to do now. Didn't mean to confuse
[13:54:37] !log tools moving tools-static-13 to eqiad1-r
[13:54:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:34:11] !log tools moving tools-static.wmflabs.org to point to tools-static-13 in eqiad1-r
[14:34:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:34:19] !log admin moving tools-static.wmflabs.org to point to tools-static-13 in eqiad1-r
[14:34:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:37:41] no worries, thanks
[15:49:08] !log integration - integration-slave-jessie-1001 - systemctl stop xvfb, systemctl start xvfb to confirm nothing was broken by https://gerrit.wikimedia.org/r/c/operations/puppet/+/499993
[15:49:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Integration/SAL
[16:22:21] noticing some interesting messages on a cloud instance running mwv in /srv/lxc/mediawiki-vagrant_default_1543587714233_46525/rootfs/var/log/mail.info any chance someone more knowledgeable could take a quick look and verify we aren't somehow becoming a mail spammer?
[16:22:36] i think it's just registration emails from some bot filling out the form on wiki, but not 100%...
[16:23:44] ebernhardson: what instance?
[16:23:54] bd808: cirrus-browser-bot.eqiad.wmflabs
[16:25:10] the default setup for MediaWiki-Vagrant traps all emails and dumps them into the vagrant user's spool, but I'll look to see if that vm is somehow different
[16:25:37] that makes sense, i wasn't sure how it's all configured
[16:28:26] exim on that vm is all messed up
[16:28:44] missing some config files it looks like, specifically the hashed alias db
[16:29:46] that mwv instance is dated roughly nov 30 2018, could probably destroy and rebuild i suppose
[16:30:54] its running exim instead of postfix... /me keeps digging
[16:31:34] hmmm... no. Its running postfix...
[16:33:07] !log search Ran `postmap /etc/postfix/virtual` and `postalias /etc/postfix/aliases` in mwv LXC container on cirrus-browser-bot.search.eqiad.wmflabs
[16:33:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Search/SAL
[16:36:56] !log search Ran `postfix flush` in mwv LXC container on cirrus-browser-bot.search.eqiad.wmflabs
[16:36:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Search/SAL
[16:37:48] ebernhardson: fixed. You now have 14356 messages in /var/mail/vagrant on that vm
[16:38:46] bd808: :)
[16:38:50] the `postalias /etc/postfix/aliases` thing was what Puppet had skipped at some point.
[16:39:18] thanks!
[16:49:52] !log tools cleared E state from 21 queues
[16:49:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:50:02] Hi, I'm fighting with PATH environ variable at toolforge grid
[16:50:05] see https://paste.ee/p/uVPel for demonstration
[16:50:08] any ideas how to fix this?
[16:53:52] !log tools moving tools-static-12 to eqiad1-r
[16:53:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:58:01] Urbanecm: looking. I think I know the answer, but I want to double check
[16:58:10] thx bd808
[17:02:54] Urbanecm: your interactive work on the bastion is getting $PATH settings from your $HOME/.bashrc file. Bash only sources this file for shells that are both interactive and non-login. The environment created by grid engine to run your submitted job does not meet these qualifications.
[17:03:18] so basically, the config is different
[17:03:56] hmm, what file should i use then bd808 ?
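Annotation on the 17:02:54 explanation: the same bare command name can resolve to a program under one PATH and to nothing under another. A minimal Python sketch of that effect, assuming a hypothetical command name `sysopbot-demo`, a throwaway directory standing in for a tool's bin directory, and a made-up job PATH of `/usr/bin:/bin`; none of these values come from the real Toolforge grid setup.

```python
import os
import shutil
import tempfile


def resolve(command, path_value):
    """Return the full path `command` resolves to under `path_value`, or None."""
    return shutil.which(command, path=path_value)


def demo():
    """Resolve one command under an 'interactive' PATH and a bare 'job' PATH."""
    with tempfile.TemporaryDirectory() as tool_bin:
        # Hypothetical executable standing in for a script kept in ~/bin.
        script = os.path.join(tool_bin, "sysopbot-demo")
        with open(script, "w") as handle:
            handle.write("#!/bin/sh\necho hello\n")
        os.chmod(script, 0o755)

        # Assumed PATH values: the interactive shell has the extra directory
        # (as if .bashrc had added it), the job environment does not.
        job_dirs = ["/usr/bin", "/bin"]
        interactive_path = os.pathsep.join([tool_bin] + job_dirs)
        job_path = os.pathsep.join(job_dirs)

        return (
            resolve("sysopbot-demo", interactive_path),
            resolve("sysopbot-demo", job_path),
        )


if __name__ == "__main__":
    found, missing = demo()
    print("interactive PATH:", found)
    print("job PATH:", missing)
```

Giving the job an absolute path avoids the lookup altogether, so the result no longer depends on which shell config was read.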
[17:04:01] The best advice I can give is to write grid jobs that are independent of shell config
[17:04:48] I think your job would work as `jsub -N test-direct ~/bin/sysopbot ~/pwb/scripts/login.py`
[17:05:58] another q is why environ used when doing test-direct and test-script differs
[17:06:44] I *think* I know that too, but let me double check before I tell a lie :)
[17:06:58] any known problems with tools-static? Quarry is failing to load a bunch of cdnjs things for me atm
[17:07:38] Lucas_WMDE: andrewbogott was doing some changes tools-static earlier today. Could be unintended fallout
[17:07:53] should I open a Phabricator task?
[17:11:41] !log tools Restarted nginx on tools-static-13
[17:11:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:11:55] hmmm...
[17:12:12] ping says “destination host unreachable”
[17:12:19] did the instance change, perhaps?
[17:12:22] and I have an old IP?
[17:12:23] it did
[17:12:37] Lucas_WMDE: is there a hard-coded IP someplace? In theory dns should be updated everywhere
[17:12:56] (I waited a long time! But maybe not long enough)
[17:13:09] uh oh
[17:13:12] andrewbogott: I'm seeing "185.15.56.60" from my local
[17:13:13] tools-static.wmflabs.org. 7198 IN A 208.80.155.174
[17:13:21] on Toolforge I see 172.16.0.186
[17:13:24] I still have this cached
[17:13:28] dang
[17:13:28] on my own system, 208.80.155.174
[17:13:38] um… I'll revive the old one for a bit longer.
[17:13:47] what was the TTL on it before the IP was changed?
[17:14:55] 1.1.1.1 and 8.8.8.8 both seem to have the new IP
[17:15:02] Lucas_WMDE: better?
[17:15:21] andrewbogott: https://tools-static.wmflabs.org/toolforge/ is working for me again
[17:15:29] Krenair: It was whatever the default is, I think one hour
[17:15:46] bd808: ok. Guess I'll wait until tomorrow to make the rest of this move
[17:15:56] interesting
[17:16:19] !log tools aborted move of tools-static-12; will wait until tomorrow and give DNS caches more time to update
[17:16:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:16:22] andrewbogott: yes, thank you <3
[17:16:23] wonder why mine has >7000
[17:16:32] andrewbogott: fair enough. Distributed caches are fun!
[17:16:50] in my case it was the WMDE office’s DNS cache that still had (has) the old entry, apparently
[17:17:07] can I get that to reveal its TTL for the entry?
[17:17:11] * Lucas_WMDE is not a DNS expert
[17:17:17] just dig the hostname
[17:17:23] dig tools-static.wmflabs.org
[17:18:01] I don’t see a TTL in there
[17:18:10] “WHEN” seems to be the request timestamp
[17:18:16] you should get a line back with the IP, that it's an A record, that it's for the internet, and the name
[17:18:31] like tools-static.wmflabs.org. 7198 IN A 208.80.155.174
[17:18:31] (but if I dig without a DNS argument it’ll just reach my local systemd-resolved anyways)
[17:18:37] (without a *nameserver argument)
[17:18:48] Urbanecm: ok, to the magic that is happening in your 'direct' test is that our `jsub` program has code to try and figure out the full path of the main executable locally (on the bastion) before sending the command to the job grid. This is an attempt to work with the other issue you are running into about .bashrc and related interactive shell config.
[17:19:12] ok, thx bd808
[17:19:55] so when you run something like `jsub -N whatever my_random_prog` it acts like you did `jsub -N whatever $(which my_random_prog)`
[17:20:36] and that `which` lookup happens with whatever PATH is set in the submitting shell
[17:20:39] hm, clearly the default ttl is more than an hour :(
[17:20:51] andrewbogott: I bet its 1 day
[17:21:39] probably. I was lazy and used "all the dns caches I can think of" as a proxy for "all the dns caches everywhere"
[17:25:06] !log tools Cleared the "Eqw" state of 44 jobs with `qstat -u '*' | grep Eqw | awk '{print $1;}' | xargs -L1 sudo qmod -cj` on tools-sgegrid-master
[17:25:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:30:59] oh, the number after the domain in the dig output is the TTL?
[17:31:21] in that case the WMDE office DNS server will have the outdated IP address for about half an hour longer
[17:31:24] (assuming it’s in seconds)
[20:22:38] !log tools Disabled puppet on tools-checker-0{1,2} to make testing new role::wmcs::toolforge::checker easier (T219243)
[20:22:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[20:22:42] T219243: Migrate tools-checker system to Stretch - https://phabricator.wikimedia.org/T219243
[20:24:29] !log tools Cherry-picked https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/500095/ to tools-puppetmaster-01 for testing (T219243)
[20:24:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[20:32:36] !log tools Creating tools-checker-03 with role::wmcs::toolforge::checker (T219243)
[20:32:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[20:32:39] T219243: Migrate tools-checker system to Stretch - https://phabricator.wikimedia.org/T219243
[20:33:13] o/ bstorm_
[20:33:21] Have you disabled labsdb1004 yet?
[20:33:40] I'm hoping to double-check that labels.wmflabs.org is still online and happy after that happens.
[20:34:08] Hey halfak, I can stop postgres in a sec.
[20:34:13] I can't see any reason why it wouldn't be as we seem to be reading and writing to the new clouddb1002 DB. But I just want to be sure.
[20:34:15] I was just making sure it won't alert at all :)
[20:34:26] Perfect!
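Annotation on the `dig` question (17:17 to 17:31): in an answer line such as `tools-static.wmflabs.org. 7198 IN A 208.80.155.174`, the second field is the TTL the answering cache has left, in seconds. A minimal Python sketch of reading it; the function names are illustrative, and it assumes the simple five-field answer line quoted in the log rather than full `dig` output.

```python
from datetime import timedelta


def parse_answer_line(line):
    """Split a dig answer line into name, TTL (seconds), class, type and data."""
    name, ttl, record_class, record_type, data = line.split(None, 4)
    return {
        "name": name,
        "ttl": int(ttl),
        "class": record_class,
        "type": record_type,
        "data": data.strip(),
    }


def remaining_cache_time(line):
    """Return how much longer the answering cache may keep serving this record."""
    return timedelta(seconds=parse_answer_line(line)["ttl"])


# The answer line pasted in the channel at 17:13:13.
ANSWER = "tools-static.wmflabs.org. 7198 IN A 208.80.155.174"

if __name__ == "__main__":
    record = parse_answer_line(ANSWER)
    print(record["data"], "cached for another", remaining_cache_time(ANSWER))
```

For the pasted line this comes to just under two hours, which is why a cache that answered with 7198 could not have been working from a one-hour TTL.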
[20:37:01] halfak: it's turned off
[20:37:05] postgres that is
[20:37:36] And everything seems happy on our end.
[20:37:38] Thank you!
[20:37:42] Great :0
[20:37:43] :)
[20:37:56] Good migration. I'd buy you a beer if you were in Minneapolis :D
[20:48:58] !log tools Using root console to fix broken initial puppet run on tools-checker-03.
[20:49:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:04:20] haha
[21:08:58] !log tools Updated cherry-pick of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/500095/ on tools-puppetmaster-01 (T219243)
[21:09:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:09:02] T219243: Migrate tools-checker system to Stretch - https://phabricator.wikimedia.org/T219243
[21:13:29] !log tools depooled tools-sgewebgrid-generic-0903 because of some stuck jobs and odd load characteristics
[21:13:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:57:42] !help
[21:57:42] audiodude: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[21:57:53] what is the process for being added to a VPS box?
[21:58:23] my teammate said he couldn't add me, but he also claimed to try adding "Audiodude" which is my Wikipedia username, but I log into toolforge as "tmoney"
[21:58:24] audiodude: virtual servers are organized into projects. Access to VMs is controlled via project membership.
[21:58:48] So someone who is already a projectadmin in the project would need to add you
[21:58:59] okay, so if I can "become enwp10" on toolforge I should be able to ssh into the VPS box?
[21:59:30] you won't be able to ssh between VMs, only from your local system directly to whatever VM you need to access.
[21:59:52] And I don't understand where 'become' would fit in there. Isn't enwp10 just a tool?
[22:00:09] I'm not trying to SSH between VMs
[22:00:19] enwp10 is a tool, I guess there's a separate "project"
[22:00:26] I thought I was part of it though
[22:00:48] Are we talking about tools or a different project?
[22:00:56] there is no project named "enwp10"
[22:01:11] sorry I was conflating the two
[22:01:19] okay
[22:01:24] the SSH usernames will be the same
[22:01:52] tools has a 'become' command and the concept of tools internally, this is distinct from projects
[22:01:58] an example of a project is 'tools' or 'bastion'
[22:02:06] audiodude: https://toolsadmin.wikimedia.org/tools/id/enwp10 -- that says you are a maintainer
[22:02:51] bd808: right that's what I expect
[22:03:34] sorry, let me back up
[22:03:38] What are you trying to SSH to exactly audiodude ?
[22:04:01] > We also have got a dedicated server in VPS to run all the WP1 stuff, I
[22:04:01] > wanted to give you access, but it looks like you need to do a few
[22:04:01] > registration stuff first. I wasn't able to add user "Audiodude".
[22:04:24] do you have a hostname?
[22:04:30] Kelson is the teammate, he hasn't mentioned a specific hostname yet
[22:04:50] krenair@bastion-eqiad1-01:~$ ldapsearch -LLLx member=uid=Kelson,ou=people,dc=wikimedia,dc=org dn
[22:04:50] dn: cn=project-bastion,ou=groups,dc=wikimedia,dc=org
[22:04:50] dn: cn=project-tools,ou=groups,dc=wikimedia,dc=org
[22:04:50] dn: cn=tools.enwp10,ou=servicegroups,dc=wikimedia,dc=org
[22:04:51] dn: cn=project-mwoffliner,ou=groups,dc=wikimedia,dc=org
[22:04:53] krenair@bastion-eqiad1-01:~$
[22:04:59] am guessing this is about something in mwoffliner?
[22:05:11] yes I believe so
[22:05:43] audiodude: it's hard for us to help if we don't know what project you mean — but step one is to figure out what your username is, and step two is to have Kelson add you. If you have a toolforge account then your account is already set up and just needs to be added.
[22:06:16] if you log into toolforge as tmoney, that is your UID
[22:06:18] then you will log in with something like this: https://wikitech.wikimedia.org/wiki/Help:Access#Accessing_instances_with_ProxyJump_ssh_option_(recommended)
[22:06:19] okay
[22:06:23] which can be used as 'Shell name' in Horizon to add you to the project
[22:07:06] okay, so I think Kelson just tried to add me as Audiodude, which is my Wikipedia handle instead of my toolforge username tmoney
[22:07:18] I'll let him know
[22:07:21] In the Shell name field or the other field?
[22:07:34] I don't know, I quoted everything he told me
[22:07:35] Because your CN is Audiodude
[22:07:43] The distinction between UID and CN is important.
[22:09:08] !xy is https://en.wikipedia.org/wiki/XY_problem -- Tell us what you are trying to accomplish rather than asking how to solve a secondary issue
[22:09:08] krenair@bastion-eqiad1-01:~$ ldapsearch -LLLx uid=tmoney cn
[22:09:08] dn: uid=tmoney,ou=people,dc=wikimedia,dc=org
[22:09:08] This key already exists - remove it, if you want to change it
[22:09:08] cn: Audiodude
[22:09:20] !xy
[22:09:20] The XY problem is asking about your attempted *solution* rather than your *actual problem*. http://meta.stackoverflow.com/a/66378
[22:09:53] pretty much the same :)
[22:10:46] sorry, I thought my problem was what Kelson said, I need to do some "registration" before he can "add me" to the mw-offliner box
[22:11:02] but I also suspected it had something to do with the UID/CN mismatch
[22:11:15] no, it sounds like you already have an account
[22:11:35] I thought so too
[22:11:37] You're not Automactic / Chris are you?
[22:11:43] no
[22:11:54] ok, didn't think so, just checking there's no multiple-account-confusion going on
[22:12:02] the real problem is "Kelson doesn't know how to add me to the mw-offliner box"
[22:12:13] which maybe Kelson should ask for help on....
[22:12:22] people are not added to boxes
[22:12:41] it is not a per-instance thing without a bunch of extra config IIRC
[22:12:54] people get added to projects and instances will check you are a member of their project
[22:13:06] okay, I think I'm figuring that out yes
[22:13:18] there are 6 instances in that mwoffliner project - https://tools.wmflabs.org/openstack-browser/project/mwoffliner
[22:14:29] gotcha
[22:14:40] okay so Kelson needs to add me as a User to mwoffliner
[22:14:57] and then I can proxyjump to the box I need from my local machine
[22:15:01] is that right?
[22:15:24] yes, via one of the standard bastions
[22:17:48] okay thanks everybody, sorry for not being more clear
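Annotation on the UID/CN point (22:06 to 22:09): the `ldapsearch` output pasted in the channel carries both values for one account, the `uid` inside the `dn` and the separate `cn`. A minimal Python sketch that pulls the two apart; the function name is illustrative, and it assumes only plain `attr: value` lines (no base64 values, no folded lines), so it is not a general LDIF parser.

```python
def shell_name_and_cn(ldif_text):
    """Extract the uid (from the dn) and the cn from simple LDIF-style text."""
    attrs = {}
    for line in ldif_text.splitlines():
        key, separator, value = line.partition(": ")
        if separator:
            attrs[key.strip()] = value.strip()

    # The first component of the dn is expected to look like "uid=tmoney".
    first_component = attrs.get("dn", "").split(",", 1)[0]
    rdn_key, _, rdn_value = first_component.partition("=")
    uid = rdn_value if rdn_key == "uid" else None
    return uid, attrs.get("cn")


# The two LDAP lines pasted at 22:09:08.
SAMPLE = "dn: uid=tmoney,ou=people,dc=wikimedia,dc=org\ncn: Audiodude\n"

if __name__ == "__main__":
    uid, cn = shell_name_and_cn(SAMPLE)
    print("shell name (uid):", uid)
    print("cn:", cn)
```

Per the discussion, the first value (`tmoney`) is the one that goes in Horizon's 'Shell name' field; the second (`Audiodude`) is the CN.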