[00:05:38] musikanimal: sorry, totally didn't see https://phabricator.wikimedia.org/T196525#4261598 and forgot to check back
[06:01:35] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/443348 (owner: L10n-bot)
[10:04:31] (PS1) MarcoAurelio: Ignore 'CommunityTechBot' [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/443391 (https://phabricator.wikimedia.org/T198552)
[10:05:30] (PS2) MarcoAurelio: Ignore 'CommunityTechBot' [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/443391 (https://phabricator.wikimedia.org/T198552)
[10:05:54] (PS3) MarcoAurelio: Ignore 'CommunityTechBot' [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/443391 (https://phabricator.wikimedia.org/T198552)
[10:35:17] (CR) Merlijn van Deen: [C: 2] Ignore 'CommunityTechBot' [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/443391 (https://phabricator.wikimedia.org/T198552) (owner: MarcoAurelio)
[10:35:46] (Merged) jenkins-bot: Ignore 'CommunityTechBot' [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/443391 (https://phabricator.wikimedia.org/T198552) (owner: MarcoAurelio)
[10:35:56] (CR) jenkins-bot: Ignore 'CommunityTechBot' [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/443391 (https://phabricator.wikimedia.org/T198552) (owner: MarcoAurelio)
[10:37:27] !log wikibugs merged ignore-communitytechbot-patch and restarted
[10:37:28] valhallasw`cloud: Unknown project "wikibugs"
[10:37:28] valhallasw`cloud: Did you mean to say "tools.wikibugs" instead?
[10:37:32] !log tools.wikibugs merged ignore-communitytechbot-patch and restarted
[10:37:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[10:38:57] thanks, that was quick ^^
[10:51:10] Hauskatze: thanks for submitting the patch :-)
[10:51:39] np, happy to help
[10:52:35] vgutierrez: hey!
I've got your email regarding updating the TLS version of my tool (Fountain). Can you please clarify it a bit? I'm pretty sure the TLS version I use is dependent on the mono version installed on the web grid, and I didn't get the same notification regarding my other tool (chie-bot) which is using the same grid queue (webgrid-lighttpd@tools-webgrid). I don't do any special TLS handling in either of those tools, s
[10:54:31] kf8: I don't know the details, but looking at the tool, it might be about the outgoing requests used for oauth login/api calls
[10:55:26] valhallasw`cloud: thanks! that actually might be true! let me double check
[10:55:48] I thought it was about mediawiki api
[10:56:26] Both are probably affected
[10:58:27] valhallasw`cloud: probably not, because I'm using different libraries to access them
[10:59:07] kf8: right, my fault then, I haven't identified your tool as running on web grid
[10:59:34] kf8: being a mono one, as long as it has been restarted after the mono upgrade, you should be safe :)
[11:00:07] T194665
[11:00:07] T194665: Provide an up-to-date mono environment on toolforge - https://phabricator.wikimedia.org/T194665
[11:00:09] that update :)
[11:00:54] vgutierrez: it was restarted on 06/20/2018. is that recent enough?
[11:01:46] indeed, it was upgraded on May 28th
[11:03:09] vgutierrez: great, thanks!
please contact me again if the issue persists
[11:27:49] thanks valhallasw`cloud vgutierrez
[11:41:43] np :)
[11:48:44] !log phabricator upgrading phabricator to stretch
[11:48:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Phabricator/SAL
[12:35:44] !help i cannot seem to ssh into phabricator after upgraind to stretch using apt
[12:35:45] paladox: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[12:35:52] upgrading
[12:36:00] i am looking at https://horizon.wikimedia.org/project/instances/22a00740-9c8f-4258-8ad0-b4082c03deee/console
[12:36:10] which is showing
[12:36:11] [ 43.251945] rc.local[614]: Error: Could not send report: Failed to open TCP connection to labs-puppetmaster-eqiad.wikimedia.org:8140 (getaddrinfo: Name or service not known)
[13:03:09] paladox: into phabricator?
[13:03:22] arturo yep, the instance is called phabricator
[13:03:30] phabricator.phabricator.eqiad.wmflabs
[13:04:15] i think it may be the ssh-phab service
[13:04:21] overiding the ssh service
[13:04:47] that links gives me `'Unable to get log for instance "22a00740-9c8f-4258-8ad0-b4082c03deee"`
[13:04:50] did you kill the instance?
[13:05:02] oh, I'm probably not member of the project
[13:05:44] nope, didn't kill it
[13:06:46] paladox: I don't know what changed in the config during the upgrade
[13:07:11] paladox: but if nobody have a better idea, I can try opening the disk locally and modifying some files to add a temporal ssh hole
[13:07:23] ah ok
[13:07:24] cc andrewbogott ^^^
[13:07:30] arturo i am trying to stop ssh-phab
[13:07:33] with a local hack
[13:07:36] on puppet-phabricator
[13:08:13] but if the VM can't talk to the puppetmaster, perhaps that won't work
[13:08:29] hmm, it seems it can
[13:08:33] after it trys again
[13:09:41] ok nice
[13:10:43] but this is to puppet-phabricator.phabricator.eqiad.wmflabs
[13:10:55] not to http://labs-puppetmaster-eqiad.wikimedia.org:8140
[13:10:58] arturo ^^
[13:11:10] ok, so your own puppetmaster
[13:11:30] yep
[13:13:42] did it work?
[13:14:14] * arturo lunch soon
[13:14:26] arturo nope dosen't seem to have worked
[13:14:39] i've stopped ferm with a hack
[13:15:11] perhaps the VM lost connectivity with the DNS server somehow
[13:15:24] hmm
[13:15:28] though the web ui works
[13:15:44] https://phab.wmflabs.org (rebooted the instance so will take a minute or two to come back up)
[13:15:45] paladox: do you you usually upgrade VMs between distros in place? In my experience that breaks things, I always discourage it.
[13:15:52] andrewbogott yeh
[13:15:57] it worked many times before
[13:15:58] for me
[13:16:52] though wierd thing is i could ssh after the upgrade (i rebooted then i could ssh in) (i also do the apt auto remove command too)
[13:34:53] hey, chasemp and/or bstorm_, see discussion in -security, chris would like to delay adding those shelves to labstore1006 until next week because he has a pile of other work and a 2 day week this week
[13:35:05] ack, just reading now
[13:35:18] if this is ok, then $someone should move the relevant services back to labstore1006, prolly today
[13:35:27] and then forget about it until the shelf move
[13:36:13] apergos: we had a bit of leftover drama from this on sat that make me inclined to keep it all 1007 for now if you're ok w/ it, I think load wise it has seemed fine
[13:36:49] andrew pinged me and it seemed
[13:36:58] that labstore1007 was rather loaded there for awhile
[13:37:04] (I was gone, it had resolved when I got back)
[13:37:21] so I'd rather split up the services again, if I understood correctly
[13:38:52] apergos: the issue at least then was not 1007 at all, but 1006 still lingering in a few places and the dwait state procs looking for 1006 causing load to surge
[13:39:10] but I haven't looked at 1007's load today at all
[13:39:14] oh! well that was my complete misinterpetation then
[13:39:34] apergos: for reference we solved it by purging all mentions of 1006 everywhere on sat
[13:39:52] where were there mentions left over anyways?
[13:41:01] the puppet change to move things over did not remove from /etc/fstab (so resurface on reboot) or umount directly (so left behind for shenanigans by default).
my adhoc thinking had all been done in toolforge only to address that but there were enough instnaces outside tools to cause issue
[13:41:30] https://grafana.wikimedia.org/dashboard/db/host-overview?refresh=300s&orgId=1&var-server=labstore1007&var-datasource=eqiad%20prometheus%2Fops&var-cluster=misc&from=now-7d&to=now this looks pretty calm actually
[13:41:32] I would like to think that shouldn't happen regardless
[13:41:36] but that was teh deal
[13:41:43] I wanted to inspect some of the mount options there today
[13:41:49] yeah puppet does not clean up fstab, it is true
[13:42:17] as with all things nfs, the mess is way bigger than the issue causing it :)
[13:42:24] this points even more to a separte shell script tat would take care of some of that
[13:42:38] ugh, yes
[13:42:49] and one stats host hd to be rebooted iirc too
[13:43:07] noticed that too, but I wasn't involved at all
[13:43:36] me neither, just saw it on the lists
[13:43:56] all right, well given what I see for labstore1007 over the last several days it seems quite happy to me
[13:44:04] so sure, we can leave it
[13:44:22] that's what I get for reading irc in a hurry on the weekend
[13:45:28] heard ;) apergos since you're in the outlier tz between teh few of us, want to send a cal invite for a time you're ok w/? (next mon)
[13:45:35] bc next tue and wed I see existing maintenance things actually
[13:45:57] any time is going to suck tbh
[13:46:10] we have our main meeting at 7 my time, that's til 8
[13:46:14] before that is too early for you
[13:46:19] after that is ... gonna suck
[13:46:31] so just schedule it and... do I need to be there? maybe I don't
[13:46:53] you just want to set up a new pv and extend the existing lv for /srv/dumps
[13:47:04] I didn't even set those up originally, madhu did I guess
[13:47:45] right, I'm half-educated about this all as well
[13:47:59] ah so you're hoping two half-educateds =one whole?
:-D
[13:48:12] well if you put it at 8 pm my time or later but not too much later...
[13:48:21] at some point I won't be even a half person
[13:48:29] 5 utc in other words
[13:49:47] but we should make sure chris is good with that time too
[13:49:56] otherwise maybe it waits til thurs
[13:50:00] I'll send someting out to at least get the shelves there and we can think on what do to w/ it. I think making those drives present as a single raid 10 physical is the Right Thing(TM) and then wehtehr to extend the LV or make another mount is up for discussion
[13:50:06] * chasemp nods
[13:50:14] ok
[13:50:25] yes they should be a single logical drive indeed
[15:26:34] ssh: connect to host phabricator port 22: Connection refused
[15:26:40] arturo i get that ^^
[15:35:59] root@puppet-phabricator:/var/lib/git/operations/puppet# host labs-puppetmaster-eqiad.wikimedia.org
[15:36:00] Host labs-puppetmaster-eqiad.wikimedia.org not found: 3(NXDOMAIN)
[15:36:06] does labs-puppetmaster-eqiad.wikimedia.org exist?
[15:36:15] andrewbogott arturo ^^
[15:37:22] paladox: that's an obsolete name, the new puppetmaster is just called 'labs-puppetmaster.wikimedia.org'
[15:37:48] ah andrewbogott seems the script that reboots vm still uses it
[15:38:04] 'the script that reboots vm'?
[15:38:36] andrewbogott: is deployment-tin still the deployment server for beta? If not, which one is now?
:-)
[15:38:38] andrewbogott when i rebooted phabricator it comes up with this log https://horizon.wikimedia.org/project/instances/22a00740-9c8f-4258-8ad0-b4082c03deee/console
[15:39:01] and furthur down i see
[15:39:01] [ 35.248827] rc.local[612]: Warning: Failed to open TCP connection to labs-puppetmaster-eqiad.wikimedia.org:8140 (getaddrinfo: Name or service not known)
[15:39:02] [ 35.250933] rc.local[612]: Info: Retrieving pluginfacts
[15:39:20] but then works after it re runs puppet without " puppet agent --onetime --verbose --no-daemonize --no-splay --show_diff --waitforcert=10 --certname=phabricator.phabricator.eqiad.wmflabs --server=labs-puppetmaster-eqiad.wikimedia.org"
[15:39:59] Hauskatze: better question for #wikimedia-releng
[15:40:09] oops, wrong channel indeed
[15:54:25] [ 173.605342] rc.local[609]: Notice: /Stage[main]/Role::Phabricator/Service[sshd]/ensure: ensure changed 'stopped' to 'running'
[15:54:35] seems maybe the sshd server stopped runnin
[15:54:37] running
[16:25:07] !log reinstalling phabricator as sshd service is failing
[16:25:08] paladox: Unknown project "reinstalling"
[16:25:24] !log phabricator reinstalling phabricator as sshd service is failing
[16:25:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Phabricator/SAL
[17:55:09] Q: Does one need to add their SSH keys somewhere besides wikitech to be able to access horizon.wikimedia.org?
[17:57:44] Niharika: two-factor authentication I think
[17:58:42] Not sure if we're still using NovaKey to manage SSH keys on Wikitech though. I think we do this now on toolsadmin?
[17:58:59] Ah, okay.
[17:59:05] Niharika: ssh keys are not related to horizon but 2factor is yes
[17:59:15] chasemp: Got it, thanks!
[18:30:08] so... I enabled 2FA for "mooeypoo" in testwiki... I go to wikitech.wikimedia.org to see what's going on with my ssh kys and maybe update one but I see "enabled 2FA" in my settings for "mooeypoo" on that wiki...
did it just not get updated, or is that user not part of the LDAP ? or did I do something wrong?
[18:31:01] Should I enable 2FA for "mooeypoo" user on wikitech.wikimedia.org or should I wait for some update, or is that user doesn't matter for anything? I'm confused :\
[18:32:31] mooeypoo: Wikitech is not a SUL wiki
[18:32:44] 2FA there is independent from other Wikimedia wikis
[18:34:15] Hauskatze: I thought so, but my confusion is which of the users is the one I actually need to login to cloud account
[18:34:21] yep Hauskatze nailed it
[18:34:24] LDAP or wikitech
[18:34:32] LDAP
[18:34:44] wikitech has a wiki user name and a "shell" name set at login
[18:34:45] ok so that one doesn't work
[18:35:06] ... yes, I know the shell name, I was stupid when I registered and misunderstood what that means and now I'm "wikigit" everywhere
[18:35:07] :P
[18:35:15] either I'm wrong then or something is off :)
[18:35:16] heh
[18:35:37] Thankfully, no one can see it in gerrit, so it's only me when I clone stuff :p
[18:36:43] ok so if logging in to cloud doesn't work, should I check my SSH keys .... and where? I have one set up in wikitech and gerrit because I always duplicate them to both (the instructions say to do that) so theoretically I should have it working, but I can't manage to log in
[18:37:36] wikigit is cool :P
[18:45:32] mooeypoo: stepping back, I'm not sure what you mean by "cloud account"
[18:45:39] horizon.wikimedia.org?
[19:05:26] chasemp: yes, plus ssh'ing into commtech-2.commtech.eqiad.wmflabs
[19:05:40] I get "Could not resolve hostname commtech-2.commtech.eqiad.wmflabs: Name or service not known" that's what started this whole thing :\
[19:11:49] mooeypoo: that error sounds like maybe your are not bouncing through a bastion?
That hostname would only work once you are inside the Cloud VPS environment
[19:12:15] that was my suspicion too, but I don't know how to do that
[19:12:52] (PS1) Paladox: phab-01.wmflabs.org -> phab.wmflabs.org [labs/icinga2] - https://gerrit.wikimedia.org/r/443482
[19:13:11] mooeypoo: have you read https://wikitech.wikimedia.org/wiki/Help:Access#Accessing_instances_with_ProxyCommand_ssh_option_(recommended) ?
[19:13:14] (PS2) Paladox: phab-01.wmflabs.org -> phab.wmflabs.org [labs/icinga2] - https://gerrit.wikimedia.org/r/443482
[19:13:23] (CR) Paladox: [V: 2 C: 2] phab-01.wmflabs.org -> phab.wmflabs.org [labs/icinga2] - https://gerrit.wikimedia.org/r/443482 (owner: Paladox)
[19:13:59] mooeypoo: I can help you with that, you probably just need some .ssh/config rules
[19:20:49] bd808: thanks for that, that's what was missing!
[19:21:04] chasemp: bd808 ... and RoanKattouw helped me out -- I can connect now. THANKS!
[21:15:39] (CR) MarcoAurelio: "recheck" [labs/striker] - https://gerrit.wikimedia.org/r/440758 (owner: MarcoAurelio)
[21:17:19] (CR) jerkins-bot: [V: -1] Use #acl*repository-admins instead of #repository-admins [labs/striker] - https://gerrit.wikimedia.org/r/440758 (owner: MarcoAurelio)
[21:17:42] (CR) MarcoAurelio: "py34 issues this time" [labs/striker] - https://gerrit.wikimedia.org/r/440758 (owner: MarcoAurelio)
[21:28:21] (CR) Hashar: "recheck" [labs/striker] - https://gerrit.wikimedia.org/r/421670 (https://phabricator.wikimedia.org/T190543) (owner: BryanDavis)
[21:29:11] (CR) Hashar: "recheck" [labs/striker] - https://gerrit.wikimedia.org/r/443203 (https://phabricator.wikimedia.org/T198076) (owner: BryanDavis)
[21:29:24] (CR) Hashar: "recheck" [labs/striker] - https://gerrit.wikimedia.org/r/421669 (owner: BryanDavis)
[21:30:22] (CR) jerkins-bot: [V: -1] Update UI to use term "Wikimedia developer account" [labs/striker] - https://gerrit.wikimedia.org/r/421670
(https://phabricator.wikimedia.org/T190543) (owner: BryanDavis)
[21:30:52] (CR) jerkins-bot: [V: -1] Temporarily replace requirements.txt with `pip --freeze` from prod [labs/striker] - https://gerrit.wikimedia.org/r/443203 (https://phabricator.wikimedia.org/T198076) (owner: BryanDavis)
[21:31:03] (CR) jerkins-bot: [V: -1] Order maintainers by cn [labs/striker] - https://gerrit.wikimedia.org/r/421669 (owner: BryanDavis)