[09:47:23] Is anyone here and can take a look? https://phabricator.wikimedia.org/T224656 ((for me it smells like broken index or a problem with the disc))
[09:58:16] !log tools T224558 reboot tools-worker-1029 after puppet changes for sssd/sudo in jessie
[09:58:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:58:19] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[09:59:37] Wurgl: I suggest you add the DBAs in that phabricator task loop
[10:00:30] Which ones?
[10:00:53] Aha! The group?
[10:00:57] perhaps adding the `DBA` phab tag, yes
[10:01:56] !log tools T224558 add tools-worker-1029 to the nodes pool of k8s
[10:01:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:09:43] !log tools T224558 disable puppet in all tools-worker- nodes
[10:09:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:09:47] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[10:27:19] !log tools T224558 switch tools-worker-1001 to sssd/sudo. Includes drain/depool/reboot/repool
[10:27:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:27:22] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[10:28:27] !log tools T224558 use hiera config in prefix tools-worker for sssd/sudo
[10:28:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:33:24] !log tools T224558 switch tools-worker-1002 to sssd/sudo. Includes drain/depool/reboot/repool
[10:33:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:33:27] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[10:48:21] !log tools T224558 drop/build a VM for tools-worker-1002. It didn't like the sssd/sudo change :-(
[10:48:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:48:24] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[10:59:03] !log tools.integraality Service stop, mv logs, service start for T224651
[10:59:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.integraality/SAL
[11:23:19] !log tools T224558 depool tools-worker-1003
[11:23:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:23:24] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[11:58:31] Hi! I’m using this query in one of my tools to get the first edit of a user (example for dewiki): `SELECT rev_timestamp FROM revision_userindex WHERE rev_user=336793 ORDER BY rev_timestamp ASC LIMIT 1;` This used to be really fast, but currently it takes > 10 seconds, up to several minutes. Any ideas what causes the long execution time and how to speed it up?
[11:59:05] hi ireas I would suggest you open a phabricator task and add the `DBA` tag to it :-)
[11:59:35] !log tools T224558 repool tools-worker-1003 (using sssd/sudo now!)
[11:59:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:59:39] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[11:59:53] arturo: okay, thanks!
[12:08:06] !log tools.integraality Nuked the virtualenv and reinstalled all deps from scratch, in desperation for T224651
[12:08:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.integraality/SAL
[12:20:28] !log tools cordon/drain tools-worker-1003 because T224651 and T224651
[12:20:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:20:33] T224651: Manual update - stale file handle - https://phabricator.wikimedia.org/T224651
[12:22:52] !log tools cordon/drain tools-worker-1029 because T224651 and T224651
[12:22:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:23:46] !log tools cordon/drain tools-worker-1001 because T224651 and T224651
[12:23:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:25:11] !log tools cordon/drain tools-worker-1002 because T224651 and T224651
[12:25:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:29:29] !log tools switch hiera setting back to classic/sudoldap for tools-worker because T224651 (T224558)
[12:29:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:29:33] T224558: sssd: support for Debian Jessie - https://phabricator.wikimedia.org/T224558
[12:29:34] T224651: Manual update - stale file handle - https://phabricator.wikimedia.org/T224651
[12:35:18] !log tools enable puppet in tools-worker nodes
[12:35:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:42:19] !log tools reboot tools-worker-1001 to cleanup sssd config and let nslcd/nscd start freshly
[12:42:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:47:17] !log tools reboot tools-worker-1002 to cleanup sssd config and let nslcd/nscd start freshly
[12:47:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:01:03] !log tools reboot tools-worker-1003 to cleanup sssd config and let nslcd/nscd start freshly
[13:01:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:01:36] !log tools uncordon/repool tools-worker-1001/2/3. They should be fine now. I'm only leaving 1029 cordoned for testing purposes
[13:01:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:15:56] !help I’m getting a scary error when trying to `git fetch` from Phabricator:
[15:15:56] lucaswerkmeister: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[15:16:00] https://paste.gnome.org/pvds38ghq
[15:16:34] I'm not familiar with git-ssh.wikimedia.org?
[15:17:00] Ah, it's phabricator
[15:17:07] yeah
[15:17:11] Diffusion or Differential, whichever one
[15:18:19] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/git-ssh.wikimedia.org was last edited in 2017
[15:19:01] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/phab1003.eqiad.wmnet
[15:19:06] lucaswerkmeister i think that's expected.
[15:19:15] we recently migrated phabricator to a new server.
[15:19:21] phab1001 -> phab1003
[15:19:23] Yeah that matches phab1003
[15:19:42] then I guess https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/git-ssh.wikimedia.org should be updated
[15:19:52] (I can’t do it, it’s protected)
[15:19:55] Seems like it.
[15:20:09] or should just be a redirect to phab1003 :)
[15:20:21] or transclude it
[15:20:49] anything that doesn’t require me to know about this internal server move ^^
[15:21:37] mutante: You seem to be listed on commits about this
[15:22:20] Do you have access to update that, and/or an opinion on the way to update it?
[15:23:08] I can update it, so I'm sure mutante can :) Let me see what other aliases, etc. are handled like in there...
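For readers comparing a host key against the fingerprint pages linked above, here is a minimal sketch of how an OpenSSH-style SHA256 fingerprint is derived from a public key line. It assumes the SHA256 format that current OpenSSH clients print; the function name and the key blob are made-up placeholders, not a real Wikimedia host key.

```python
import base64
import hashlib


def sha256_fingerprint(pubkey_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of a public key line.

    Expects the usual "<type> <base64-blob> [comment]" layout.
    """
    blob_b64 = pubkey_line.split()[1]
    blob = base64.b64decode(blob_b64)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as unpadded base64 with a "SHA256:" prefix.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Hypothetical placeholder blob, for illustration only.
EXAMPLE_KEY = (
    "ssh-ed25519 "
    + base64.b64encode(b"not-a-real-key-blob").decode("ascii")
    + " example"
)
fingerprint = sha256_fingerprint(EXAMPLE_KEY)
print(fingerprint)
```

The printed value can then be compared by eye with whatever the ssh client reports when it warns about a changed host key.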
[15:23:35] Looks like just kinda copied https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/login-stretch.tools.wmflabs.org
[15:28:02] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/git-ssh.wikimedia.org <--- I think I did that right
[15:28:03] also, apparently my ssh client first tries to connect via IPv6, and I have to wait for that to time out before a connection via IPv4 succeeds
[15:28:10] Just doing it like the cool kids do
[15:33:12] hello, i can confirm this service recently switched from phab1001 to phab1003, so the fingerprint change is expected
[15:33:16] thanks for already updating it
[15:33:18] https://wikitech.wikimedia.org/wiki/Phab1003 is here
[15:33:58] setup related to IPv6 should have no changes to before
[15:34:15] wait, apparently I’m not actually able to connect either
[15:34:18] I missed it before in the debug output
[15:34:29] but I’m just getting “permission denied (publickey,keyboard-interactive)”
[15:35:29] this sounds like an issue with the key not being loaded locally
[15:35:44] it’s at least offering the key according to debug1
[15:35:55] currently running ssh with -vvv but need to wait for the IPv6 timeout before I see results :)
[15:36:32] i don't think this part is related to the server switch
[15:37:05] that was literally just changing the backend IP
[15:37:57] btw, which project are you trying to use?
[15:38:13] since there is a ticket about moving everything off of this service, afaik
[15:38:14] tool-quickcategories.git
[15:38:29] and once Striker can automatically create Gerrit repositories, yes I’ll gladly migrate
[15:39:39] ssh debug output:
[15:39:42] https://paste.gnome.org/pqpqg6mto
[15:40:25] It shouldn't be hard to convert striker to create gerrit repos.
[15:42:12] paladox: are you familiar with using it this way? does it work for you?
[15:43:39] mutante cloning over ssh for phab?
[15:43:50] ye
[15:44:00] also trying to find that ticket..
[15:44:13] ok
[15:46:33] mutante: which one? the general “stop using Differential” is https://phabricator.wikimedia.org/T191182
[15:47:37] yes, that. thanks
[15:48:37] i remember doing that database query a couple months ago
[15:49:00] the one you mentioned doesn't seem to be in that though..hmm
[15:49:41] well tool-quickcategories didn’t exist yet in Dec 2018
[15:49:55] ..we have been adding new things to it? :(
[15:50:25] well as long as Striker offers one-click repository creation I’ll continue to use it
[15:50:29] as I’ve said on that task already
[15:50:39] i was under the impression there is only one subtask left
[15:51:25] that's unfortunate if different people go in opposite directions
[15:52:08] I’ll create a Phabricator task for adding Gerrit support to Striker
[15:52:12] unrelated.. paladox seems to have a patch about the IPv6 part but i need to double check that. i will in a little while
[15:52:27] that sounds good,thx
[15:52:35] and CC the two people (so far) who said it shouldn’t be too much work :P
[15:53:25] it's a simple post to gerrit's rest api to create a repository.
[15:56:18] paladox: oh.. you might be right after all. that "unmapped" IP is git-ssh and the other one is phab1003-vcs
[15:56:28] yup
[15:56:29] lucaswerkmeister: hold on.. we might know the reason
[15:56:49] hold on with what? I assume this isn’t related to Striker
[15:57:13] it is not. i just mean "we are working on it. might be fixed in a few min"
[15:57:19] ok thanks :)
[16:00:38] that fixed ipv6 now, but still get "vcs@git-ssh.wikimedia.org: Permission denied (publickey,keyboard-interactive)."
[16:02:19] yup, IPv6 working, at least the permission denied happens a lot faster now
[16:02:36] I filed https://phabricator.wikimedia.org/T224676 for Striker+Gerrit btw
[16:04:59] paladox: phab1001 also had IPs on eth0 that were not in DNS.. unrelated comment.. and removed
[16:05:11] thanks.
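As a rough illustration of the Gerrit REST call mentioned at 15:53, the sketch below only builds the HTTP request and does not send it. It assumes the project-creation endpoint as I understand the Gerrit REST documentation (a `PUT` to `/a/projects/<name>` with a JSON body); the host, project name, and helper name are hypothetical, and authentication is left out entirely.

```python
import json
import urllib.parse
import urllib.request


def build_create_project_request(
    base_url: str, project: str, description: str
) -> urllib.request.Request:
    """Build (but do not send) a Gerrit 'create project' REST request."""
    # Project names may contain slashes, so the whole name is URL-encoded.
    encoded = urllib.parse.quote(project, safe="")
    # "/a/" is the prefix Gerrit uses for authenticated REST calls.
    url = f"{base_url.rstrip('/')}/a/projects/{encoded}"
    body = json.dumps(
        {"description": description, "create_empty_commit": True}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json; charset=UTF-8"},
    )


# Hypothetical host and project name, for illustration only.
req = build_create_project_request(
    "https://gerrit.example.org", "labs/tools/example-tool", "Example tool"
)
print(req.get_method(), req.full_url)
```

A real integration would also need credentials and error handling for the case where the project already exists.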
[16:06:05] also i had removed the phab role from the old server yesterday so that puppet could be enabled again without adding/starting phab stuff there
[16:09:19] permissions on /srv/repos are the same, i see the sshd is running. twentyafterfour had tested it. maybe he would know more
[16:09:24] trying to find logs
[16:18:35] yea, so the IPv6 thing is fixed indeed and that was a mistake we made. thanks for pointing it out
[16:19:14] the permission denied issue though.. i don't know. i don't see logs for it and since everybody uses vcs@ there must be something in the phab db to map users to keys
[16:19:19] not familiar with that part
[16:19:29] hm, okay
[16:19:29] would like to add twentyafterfour to a ticket
[16:19:33] should I create the ticket?
[16:19:51] yes please. so i can add Mukund and refer to it later today
[16:22:03] https://phabricator.wikimedia.org/T224677
[16:25:20] * paladox is going to try and support repo creation in gerrit from striker.
[16:25:37] \o/
[16:26:10] paladox: step 1 for that would be adding a gerrit deploy to the striker role in MediaWiki-Vagrant so that things can be tested. :)
[16:26:23] ah
[16:28:05] with gerrit we use a git fat deployment.
[16:28:57] i guess now's the time to add a gerrit class to MediaWiki-Vagrant :)
[17:00:18] * twentyafterfour checks T224677
[17:06:16] bd808 does vagrant use wikimedia's apt repo (ie stretch?)
[17:09:50] ah, it does!
[17:16:56] the good thing is gerrit does not require a db making an install easy.
[17:17:38] well technically it does from 2.16 (but we can use a h2 db which requires nothing from our side, gerrit setups that).
[17:29:42] turns out the vcs user account is locked.. i hear
[18:34:28] now I’m getting “connection closed” from the IPv6 address
[18:38:31] stop using that new fancy pants internets
[18:49:33] it waits for 2 minutes before closing the connection btw
[18:53:30] and it happens with `ssh -4` as well, not the new fancy pants internets’ fault :P
[18:53:41] hangs after the “offering public key” step
[19:00:44] lucaswerkmeister yup, he's debugging *i think*
[19:01:42] apparently someone called a repo THMBREXT (brexit).
[19:03:13] ok lol
[19:03:41] (needed a few seconds to realise what the THM stood for ^^)
[19:04:07] (better than abbreviating it after Boris Johnson I suppose)
[19:04:54] lol
[19:07:00] lucaswerkmeister: i might be seeing that too
[19:09:34] cscott i think your issue is a gerrit one, whereas lucaswerkmeister is a phab one.
[19:10:26] paladox: but maybe root cause is similar (ipv6 config, IIUC)
[19:10:31] hmm
[19:10:37] no no my problem is IPv4 too
[19:12:55] cscott what does ssh cscott@gerrit.wikimedia.org -p 29418 -vvv show?
[19:18:35] the user is not locked anymore, it's like on phab1001 now
[19:18:53] it's like we solved 3 issues and now it's the 4th :P
[19:19:24] like the opposite of yak shaving
[19:19:32] heh
[19:19:37] what's yak shaving?
[19:19:49] paladox: https://en.wiktionary.org/wiki/yak_shaving
[19:20:07] I think I might have used the term incorrectly, actually
[19:20:07] Ren and Stimpy :P
[19:20:11] lol
[19:20:26] https://en.wiktionary.org/wiki/when_you%27re_up_to_your_neck_in_alligators,_it%27s_hard_to_remember_that_your_initial_objective_was_to_drain_the_swamp#English
[19:20:48] but googling “yak shaving gif” did bring me to the one I was thinking of, https://i.imgur.com/t0XHtgJ.gifv
[19:21:58] so.. i unlocked the user and then also did " ./bin/config set diffusion.ssh-user vcs"
[19:22:38] i also verified the sudo rules are there and the same as before
[19:23:08] twentyafterfour: ^
[19:23:33] the vcs user is allowed to run these:
[19:23:38] vcs ALL=(phd) SETENV: NOPASSWD: /usr/bin/git-upload-pack, /usr/bin/git-receive-pack, /usr/bin/svnserve
[19:23:41] as phd
[19:23:47] this is the same as on old server too
[19:23:50] mutante what does /srv/deployment/phabricator/deploymentbin/ssh-auth paladox show?
[19:24:57] paladox: no such file or directory
[19:25:03] oh
[19:25:03] hmm
[19:25:04] on BOTH servers
[19:25:04] misspelt
[19:25:13] /srv/deployment/phabricator/deployment/bin/ssh-auth paladox
[19:25:17] ohh
[19:25:18] /srv/phab/phabricator/bin/ssh-auth
[19:25:22] that one ^^
[19:25:49] that shows a lot of keys
[19:25:51] also on BOTH
[19:26:32] hmm
[19:26:40] it does not make a difference if i add "paladox" or not
[19:29:26] "paladox" does not show up in the output of it either way
[19:29:47] "debug2: we sent a publickey packet, wait for reply"
[19:30:00] so it's sent a pub key, but nothing is returned.
[19:33:05] turns out my problem was https://apple.stackexchange.com/questions/277479/openssh-hangs-at-rekey-after-134217728-blocks
[19:33:37] oh okay
[19:33:53] wait, you can have multiple processes listen on a socket?
[19:34:08] yup
[19:34:28] well, if you set the socket options correctly IIRC
[19:35:31] paladox: command="'/srv/deployment/phabricator/deployment/phabricator/bin/ssh-exec' '--phabricator-ssh-user' 'Paladox' '--phabricator-ssh-key' '274'",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa ...
[19:35:35] followed by your keys
[19:35:44] if i use the right capitalization of your user name :p
[19:35:48] heh
[19:36:02] so if that works, something else must be wrong?
[19:37:42] i want more logs directly from the sshd that phab starts...
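On the 19:33 question about several listeners sharing one socket address: a minimal sketch of the "socket options" answer, assuming a Linux kernel with `SO_REUSEPORT` (3.9 or later) and both sockets owned by the same user. The helper name is made up for illustration, and the example uses two sockets in one process rather than two separate processes.

```python
import socket


def bind_reuseport(port: int) -> socket.socket:
    """Create a TCP listener on 127.0.0.1 with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set on every socket, including the first one.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


first = bind_reuseport(0)            # port 0: let the kernel pick a free port
port = first.getsockname()[1]
second = bind_reuseport(port)        # second listener on the same address and port
same_address = first.getsockname() == second.getsockname()
print(same_address)
first.close()
second.close()
```

Without the option on both sockets, the second `bind()` would be expected to fail with "Address already in use".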
[19:37:57] auth.log isnt helpful anymore since we fixed the user locked issue
[19:38:33] sudo service ssh-phab status
[19:38:51] SO_REUSEPORT (since Linux 3.9)
[19:38:52] Permits multiple AF_INET or AF_INET6 sockets to be bound to an
[19:38:52] identical socket address. This option must be set on each
[19:38:52] socket (including the first socket) prior to calling bind(2) on
[19:38:54] the socket. To prevent port hijacking, all of the processes
[19:38:54] binding to the same address must have the same effective UID.
[19:38:57] This option can be employed with both TCP and UDP sockets.
[19:38:57] For TCP sockets, this option allows accept(2) load distribution
[19:38:59] in a multi-threaded server to be improved by using a distinct
[19:39:01] listener socket for each thread. This provides improved load
[19:39:04] distribution as compared to traditional techniques such using a
[19:39:06] single accept(2)ing thread that distributes connections, or hav‐
[19:39:08] ing multiple threads that compete to accept(2) from the same
[19:39:10] socket.
[19:39:12] For UDP sockets, the use of this option can provide better dis‐
[19:39:15] cscott: Tsk, over-paste.
[19:39:17] tribution of incoming datagrams to multiple processes (or
[19:39:19] threads) as compared to the traditional technique of having mul‐
[19:39:21] tiple processes compete to receive datagrams on the same socket.
[19:39:26] (man 7 socket)
[19:40:34] James_F: man 7 socket | sed -e 's/^/James_F /' | irc
[19:40:37] ;)
[19:40:49] phabricator-ssh-exec: Welcome to Phabricator.You are logged in as Paladox.
[19:40:58] i can do this without even having a key
[19:41:02] from localhost
[19:41:16] heh
[19:43:10] paladox: systemctl status ssh-phab just shows me some "Invalid user" lines but nothing about "vcs" at all :p
[19:43:18] oh
[19:43:31] cat /var/log/syslog ? :P
[19:44:32] just one thing about "vcs" and that is how it was set:
[19:44:38] (/Stage[main]/Phabricator/File[/srv/phab/phabricator/conf/local/local.json]/content) - "diffusion.ssh-user": "vcs",
[19:58:39] bd808 should we auto provision gerrit through puppet or let the user run the java -jar init command?
[20:19:53] can't we just add the striker role on a cloud VPS and configure it to use the existing gerrit test server
[20:20:18] without having to add new code in vagrant
[20:20:47] we already have a gerrit made for testing and also already have the puppet role to setup striker::web.. no?
[20:21:33] i kind of fail to understand how vagrant actually makes it easier
[20:29:35] twentyafterfour: oops.. that puppet change failed
[20:29:51] found unexpected end of stream while scanning a quoted scalar at line 325 column 15 at /etc/puppet/modules/phabricator/manifests/init.pp:87:23
[20:30:41] mutante: amended
[20:30:54] missing '
[20:31:16] already merged though :P
[20:31:40] https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/513407
[20:32:08] ok, reusing it
[20:44:42] mutante: mediawiki-vagrant makes developing on my laptop much easier than trying to integrate with some other random networked service. The mwvagrant stuff is also how I run the striker test server in cloud vps
[20:45:43] paladox: Ideally the puppet role would set everything up, but if that's hard for various reasons we can add an instructions page to the role that tell you the manual things that need to be done
[20:46:13] the existing striker role has one of those because there were a few things that I couldn't figure out how to automate reliably
[20:46:21] ok, it can be done, but getting it to not do it on each puppet run would be hard.
[20:47:05] oh
[20:47:12] paladox: is there any file that is created when it runs the first time?
[20:47:19] i guess we could do "creates => /var/lib/gerrit/touch"
[20:47:31] it creates an index folder
[20:47:59] and also a bunch of other folders (but we would need to configure a default etc/gerrit.config though)
[20:48:35] the role should make a fully functional service if possible, so it probably should include a config file
[20:49:41] ok. yup that will be easy to do.
[20:49:52] at least mostly a copy and paste from prods gerrit role.
[22:12:23] !log tools.quickcategories git remote add github https://github.com/lucaswerkmeister/tool-quickcategories.git # work around T224677
[22:12:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[22:17:45] !log tools.quickcategories deployed e3cd2871eb (sort key support and minor UI fix)
[22:17:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[22:30:15] !log tools.quickcategories deployed ff3e7b2931 (add sort key hint to placeholder)
[22:30:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[23:33:02] bd808 i think https://gerrit.wikimedia.org/r/#/c/mediawiki/vagrant/+/513314/ may work
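The `creates =>` idea from 20:47 (run the init step only while a marker path is missing) can be sketched outside Puppet as follows. This is only an illustration of the guard logic, not the Puppet implementation; `run_once`, the temporary directory, and the stand-in init command are all hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_once(command: list[str], creates: Path) -> bool:
    """Run `command` only if `creates` does not exist yet.

    Returns True if the command ran, False if it was skipped.
    """
    if creates.exists():
        return False
    subprocess.run(command, check=True)
    return True


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "index"     # stand-in for the index folder mentioned above
    # Hypothetical stand-in for the real init command: it just creates the marker.
    init_cmd = [
        sys.executable,
        "-c",
        f"import pathlib; pathlib.Path({str(marker)!r}).mkdir()",
    ]
    first_run = run_once(init_cmd, marker)
    second_run = run_once(init_cmd, marker)
    print(first_run, second_run)
```

The guard is only as reliable as the marker: if the init step can fail after creating the path, a later run would skip it, so choosing something written at the very end of initialisation is the safer design.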