[00:19:20] bd808: Who would be a good mentor for teaching QA engineers about back-end debugging, error logs, Kibana, etc.?
[00:27:04] kaldari: hmm. anomie is the best I know at digging up root causes. Most senior engineers should be pretty competent at it though I would think
[00:27:16] Thanks!
[11:03:06] (PS1) Legoktm: Skip wmf branches over 900 (e.g. 1.32.0-wmf.999) [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/440517 (https://phabricator.wikimedia.org/T197462)
[11:04:57] (CR) Legoktm: [C: 2] Skip wmf branches over 900 (e.g. 1.32.0-wmf.999) [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/440517 (https://phabricator.wikimedia.org/T197462) (owner: Legoktm)
[11:05:24] (Merged) jenkins-bot: Skip wmf branches over 900 (e.g. 1.32.0-wmf.999) [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/440517 (https://phabricator.wikimedia.org/T197462) (owner: Legoktm)
[15:55:42] hey all. I broke my vagrant trying to git pull
[15:56:05] after a vagrant destroy vagrant is reporting all roles as disabled
[15:56:19] jdlrobson: that's a bit annoying :/
[15:56:19] and vagrant up runs but does nothing
[15:56:38] i wondered if anyone could take a look if it's salvageable before i spend a day replicating my environment again
[15:57:02] it's the one on reading-web-staging-3.eqiad.wmflabs. I hate to ask, but I'm wary how much of my day I could waste doing this :)
[15:57:15] jdlrobson: if `vagrant roles list -e` is empty, there is probably nothing that we can find telling what was there
[15:57:23] that is only tracked locally
[15:58:25] bd808: yeh it's empty
[15:58:33] i was also getting some weird ssh problems yesterday when trying to up it
[16:00:15] There's 2 difficult problems in Wikimedia land: Finding a solution that makes everyone happy and vagrant.
[16:08:25] https://www.irccloud.com/pastebin/iryjsMqF/
[16:08:35] anyone encountered this before on a vagrant up?
[16:09:52] jdlrobson: I have seen that before. I do not have a good story about how to fix it. It is some bug between the LXC layer, the bridge network, and Vagrant itself
[16:10:03] ok im just gonna burn it to the ground then :/
[16:10:21] sometimes it goes away after 2-3 `vagrant reload`. Other times I have fixed it by rebooting the Cloud VPS VM
[16:11:31] It went away.. but still broken, so gonna burn it to the ground. Is debian-9.4-stretch the machine the cool kids use these days? Is a medium instance recommended for an extension heavy setup or should i go with large
[16:12:24] yes to stretch. I think a medium would usually be big enough for a pretty advanced/heavily used test wiki
[16:18:29] bd808: would this also be a good time to setup a personal project for all my tools
[16:19:12] jdlrobson: we really try to discourage grab bag and single owner projects
[16:19:29] ah got it. So what's the recommendation for projects like https://trending.wmflabs.org/
[16:19:54] toolsforge maybe?
[16:21:57] jdlrobson: yeah, trending might be a good fit for a tool
[16:22:10] what does the backend for it look like?
[16:22:34] bd808: it's a node script, but it has a heavy amount of environment variables (e.g. private keys) and needs to run on trending.wmflabs.org domain
[16:25:09] jdlrobson: the private stuff should be doable for a kubernetes webservice in a few different ways. For the domain we could add a redirect via https://wikitech.wikimedia.org/wiki/Nova_Resource:Redirects, but the canonical URL would be tools.wmflabs.org//
[16:25:37] hmm
[16:25:54] You could certainly try setting up a tool for it and see how that works before switching canonically
[16:26:51] we have a feature request for .toolforge.org canonical URLs but we do not have work on that scheduled yet
[16:28:22] k k
[16:52:50] bd808: sorry me again
[16:52:55] Getting "No usable default provider could be found for your system." on my recent install
[16:52:58] is there a role i missed out on?
[16:54:03] jdlrobson: I think that generally means that your shell session is not loading the alias that makes `vagrant` actually run the `/usr/local/mwvagrant` wrapper script
[16:56:57] stupid question but how can i fix that?
[16:57:05] jdlrobson: so log out and log back in or source /etc/profile.d/alias-vagrant.sh
[16:58:20] or mwvagrant :)
[16:58:22] the command
[16:59:16] something very weird with this machine
[17:00:01] https://www.irccloud.com/pastebin/FDwPoKqD/
[17:00:48] jdlrobson: I'm heading in to meeting, but yeah it looks like there are some issues there.
[17:00:50] I can help debug in an hour or so
[17:01:24] thanks bd808
[17:01:38] jdlrobson try it as root :)
[17:02:05] it should not need root, so if it does there is something broken
[17:02:34] is this on stretch jdlrobson?
[17:02:38] or jessie
[17:02:45] ie the host image (not the vagrant image)
[17:05:56] I think i worked it out. Had a .vagrant file with wrong ownership
[17:10:14] nope
[18:07:57] The proxy is the challenge now http://reading-web-staging.wmflabs.org/ 504 time outs
[18:31:06] bd808: can you ping me when you're free
[18:31:53] jdlrobson: have you set a security group to allow the proxy to access the instance?
[18:32:02] chicocvenancio: yeh :/
[18:32:34] it has default and web
[18:32:35] vagrant is running
[18:32:42] proxy is pointing at 8080
[18:33:30] some weird errors around `default: mesg: ttyname failed: Inappropriate ioctl for device`
[18:33:37] this is from a fresh instance :/
[18:33:54] what are the configs for those security groups?
[18:34:08] at least one of them has to allow cloud access to por 8080
[18:34:13] *port 8080
[18:38:07] yup it does
[18:38:42] should i try jessie rather than stretch?
[18:41:12] is it surely a vagrant issue at this point?
[18:41:30] Do requests from within the instance work?
[18:47:56] chicocvenancio: am trying from scratch again just in case i did something wrong in the setup process. fingers crossed
[18:48:14] * chicocvenancio crosses fingers
[18:57:49] \o/ http://reading-web-staging.wmflabs.org/wiki/Main_Page
[18:58:04] :D
[18:58:11] the solution was 'try again from the beginning'?
[19:11:52] !log twl cleaning up miscellaneous puppet issues
[19:11:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Twl/SAL
[19:12:07] andrewbogott: basically yes
[19:12:13] andrewbogott: but 3 times
[19:12:22] that sounds discouraging :(
[19:13:59] only remaining issue i have is logging in - https://login.wiki.local.wmftest.net/wiki/Special:CentralLogin/start?token=30a10cfff108db55c037224a7f4eb07d
[19:14:07] it redirects to that url which doesn't work
[19:34:08] having problems again :/ now trying to reload it
[19:34:11] https://www.irccloud.com/pastebin/1EPT0bMC/
[19:35:50] jdlrobson: try `sudo netstat -nlp | grep :8080` the message says something is already using the port, that command should tell you what it is
[19:36:10] (my guess would be vagrant itself)
[19:36:32] tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 16302/redir
[19:37:59] thank you :)
[19:38:14] not sure what "redir" is, do you know?
[19:38:50] ahh... I see
[19:39:05] so i have it running now but it looks like this: http://reading-web-staging.wmflabs.org/w/index.php
[19:39:07] jdlrobson: ugh. we have seen that one before too. Basically the network proxy setup between the Cloud VPS vm and the LXC container can get 'stuck' and not tear everything down when you restart or completely rebuild the container
[19:40:02] jdlrobson: the wiki not found thing can sometimes be fixed by restarting the apache process inside the LXC container -- vagrant ssh -- sudo service apache2 restart
[19:40:21] h/t to Roan for telling me about that this week actually
[19:40:42] AAMZING
[19:41:07] Kosta had the same issue but found an Apache restart didn't fix it for him. vagrant destroy + vagrant up did though
[19:41:13] I have a theory that sometimes apache starts before the NFS mount of /vagrant is working and ends up stuck with an open file handle to the underlying empty directory
[19:41:18] But obviously that's much slower :)
[19:41:52] bd808: That is 100% consistent with my earlier investigations of the issue. glob('/vagrant/*') returned an empty array
[19:42:31] and w000tttt
[19:42:36] achievement unlocked
[19:42:41] http://reading-web-staging.wmflabs.org/wiki/Special:MobileDiff/9 I now have the new wikidiff2 on mobile
[19:43:51] that's some fancy lorem ipsum you got there
[19:44:05] "cold-pressed echo park tattooed poutine"
[19:47:52] During early development of VE we used bacon ipsum, but that fragment sounds more like hipster ipsum
[19:48:04] (For the record I am being entirely serious, those are both real things)
[22:39:51] (PS1) QChris: Add .gitreview [labs/tools/ScrotBot] - https://gerrit.wikimedia.org/r/440630
[22:40:34] (CR) QChris: [V: 2 C: 2] Add .gitreview [labs/tools/ScrotBot] - https://gerrit.wikimedia.org/r/440630 (owner: QChris)