[06:37:54] (CR) Hashar: "No worries Kunal. I guess I randomly came across that wikibugs2 repo, noticed tox would fail on stretch (it has python35) and send a rough" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/381492 (owner: Hashar) [12:22:24] tgr: ping again. here's the current errors: https://pastebin.com/skmDEtrv [12:49:47] !log git upgrading gerrit-test3 to latest 2.14.6 pre release [12:49:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL [13:25:17] {"error":"mwoauthdatastore-bad-token","message":"No token was found matching your request."} [13:38:08] Please deploy a wmf/1.31.0-wmf.5 in es.wikipedia.org. [13:41:30] Please deploy a wmf/1.31.0-wmf.5 in es.wikipedia.org. [13:41:55] Guest16974: please stop wasting everybody's time. [13:42:01] No. [13:42:05] Please deploy [13:42:14] Guest16974: Please stop spamming. [13:42:30] No. [13:42:36] Please deploy [13:42:45] Guest16974: OK, bot. [13:42:57] Okay. [13:43:01] Deploying... [13:43:08] Guest16974: Please and reply No [13:43:20] No [13:45:58] Please deploy [14:04:55] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: @addshore & @Christoph_Jauera_(WMDE) - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting [14:30:54] chasemp: andrewbogott: So I went ahead and deleted the large instance so I can respawn a big disk version of it. Unfortunately Bigdisk also spawns a lot of RAM with it, which exceeds my quota limits. It apparently allocates 24GB of RAM. [14:31:17] Cyberpower678: hm, I'll look [14:31:28] (tx andrewbogott) [14:32:16] andrewbogott: thx. :-) [14:32:45] wow 24gb of ram, that's a lot heh :) [14:32:52] Indeed. [14:33:35] Since MySQL doesn't need that much, I think, even with a large DB, since most of the operations are on the disk itself. [14:33:58] chasemp: ^ Or does MySQL need more RAM as the DB grows larger?
[14:35:05] it's all contextual to what you are doing, but it doesn't hurt. we'll see what andrew says, I agree that flavor requires 24GB of RAM atm which seems excessive [14:35:27] i thought it was 8192 [14:35:31] basically a large w/ a lot of disk along [14:35:40] So do I, and apparently so did andrewbogott [14:36:27] Cyberpower678: I made you a new flavor called 'justdisk' that is what we all thought 'bigdisk' was previously :) [14:36:30] * Cyberpower678 will use this opportunity to get his other instances off of trusty [14:36:40] :-) [14:36:43] * Cyberpower678 checks [14:38:13] * Cyberpower678 is spawning the instance [14:44:31] I've figured out that $this->makeOAuthCall( $requestToken, $tokenUrl ); results in {"error":"mwoauthdatastore-bad-token","message":"No token was found matching your request."} [14:48:33] DatGuy: so what's the takeaway from that? [14:48:42] That I'm very confused [14:48:57] I've been testing it for about a full 24 hours and can't figure it out [14:48:59] How can I help. I have a lot of OAuth experience. [14:49:31] with the mediawiki OAuth library? [14:49:37] (php obviously) [14:49:48] I haven't used. I wrote my own. [14:50:07] Link to the Library> [14:50:12] ? [14:50:35] https://packagist.org/packages/mediawiki/oauthclient [14:51:46] Is this an owner-only consumer, or is this a tool? [14:52:02] tool [14:52:13] Can you Pastebin your code? [14:53:01] https://pastebin.com/ZX0TZQQ1 [14:53:01] brb [14:53:10] it's mostly just from the usage section in the readme [14:57:10] I see that, so what's the URL that's actually running this thing? [14:57:42] Or how are you running it right now. [14:58:07] Technical Advice IRC meeting starting now in channel #wikimedia-tech, hosts: @addshore & @Christoph_Jauera_(WMDE) - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting [15:21:56] localhost [15:22:00] Cyberpower678: ^ [15:22:11] xampp specifically [15:25:30] So on a browser then. 
[15:25:31] DatGuy: Are you using the grant that you created for toolforge from localhost? If so that may be your problem [15:25:52] bd808: Actually I think the code is the problem. [15:26:24] He's not directing the browser to Wikipedia to have the user authenticate, nor is he retrieving the verification tokens. [15:26:40] that's done in Client.php [15:26:52] DatGuy: the redirect is done there? [15:26:52] or, well, it's actually echo-d [15:26:53] that library works, I use it in https://tools.wmflabs.org/bash/. The demo code that comes with the library is a command line toy [15:27:09] bd808: I thought so. [15:27:26] bd808: He's using that example code on his web application [15:27:46] so what does it need to be for webapp? [15:28:16] DatGuy: Instead of echoing $next, use the command header( "Location: $next" ); [15:28:27] and that'll work? [15:28:27] I should do a more real version and post it on wikitech like https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Flask_OAuth_tool [15:29:39] seems like it [15:29:57] OAuth_verifier and _token aren't anything private, are they? [15:30:12] meaning it's randomised each request and not like the private key [15:30:18] secret* to be exact [15:31:25] DatGuy: the verifier is kind of public. It only works with a given consumer key to confirm the authorization request. [15:31:43] got it. thanks. that's way more simple than I suspected [15:31:51] Is it working now? [15:32:18] yep [15:32:22] :-) [15:32:30] bd808: nice library [15:33:42] mutante: how long does it take for puppet to run? [15:34:07] he's currently on a plane Cyberpower678 :) [15:34:31] bd808: I'm running "puppet agent" but it's been running for more than 20 minutes. [15:34:42] wow that's a long time [15:34:47] did you do puppet agent -tv? [15:34:56] No. Just puppet agen [15:35:05] Cyberpower678: how exactly did you run it? `sudo puppet agent -tv` ? [15:35:22] ah [15:35:26] No. 
I did "sudo puppet agent" [15:35:28] puppet agent will run forever [15:35:35] you need to do puppet agent -tv [15:35:45] * Cyberpower678 kills puppet [15:36:09] Added -tv to it [15:43:41] Cyberpower678: the magic there is that without extra flags `puppet agent` just starts a demon that tries to apply your puppet config every 30 minutes [15:44:06] Oh. [15:44:14] Demons are scary. ;-) [15:45:03] the `-t` or `--test` flag is shorthand for `puppet agent --onetime --verbose --ignorecache --no-daemonize --no-usecacheonfailure --detailed-exitcodes --no-splay --show_diff` [15:45:35] which really means "compile and apply the catalog once and show me what happened" [15:46:06] Adding the extra `-v` makes the output shown even more verbose [15:51:06] bd808: does .htaccess work on labs? [15:52:25] DatGuy: no. the lighttpd server does not read .htaccess files [15:52:31] also there is no labs ;) [15:52:36] dammit, toolforge [15:52:44] so must I use the default 404? [15:53:45] bd808: I think there's a bug on Horizon. [15:53:47] you can make a $HOME/.lighttpd.conf file [15:54:34] I've been trying to figure out why SRV wasn't mounting nor was Maria being setup. When I launched the instance I selected the two puppet roles and clicked launch instance. [15:54:54] DatGuy: I'm not 100% sure if $HOME/.lighttpd.conf can properly intercept 404s however. The nginx proxy may look at the status code and then return the global 404 page always [15:55:00] But when I went back to check, they weren't applied. [15:55:37] "I selected the two puppet roles and clicked launch instance." -- how did you select puppet config before having and instance created? [15:55:47] *an instance [15:56:11] bd808: On horizon you are given the options of applying puppet roles when configuring the instance. [15:56:20] DatGuy: https://redmine.lighttpd.net/projects/1/wiki/Server_error-handler-404Details [15:56:29] huh. 
I'm not sure I've ever noticed that :) [15:56:30] there are some tools with cool 404s https://tools.wmflabs.org/bub/nonexistent [15:57:03] zhuyifei1999_: can you add that to mine? :DD [15:57:13] uh, not my tool [15:57:22] Aw. :( [15:57:24] I saw that 404 by chance [15:57:49] zhuyifei1999_: actually I want that as my "no webservice" message. :D [15:57:54] it looks like bub does that by having all URLs route through a python dispatch script [16:00:50] DatGuy: found some docs on wikitech -- https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web/Lighttpd#Header.2C_mimetype.2C_character_encoding.2C_error_handler [16:01:09] server.error-handler-404 += "/error-404.php" [16:01:30] great, thanks [16:01:46] we have lots of docs. The trick is finding them :) [16:03:32] wait, is that in $HOME or /public_html? [16:04:27] DatGuy: $HOME [16:04:36] nice [16:05:59] Cyberpower678: I'm not seeing puppet config as part of the launch instance dialog. Did you maybe apply security groups (firewall) when you launched? [16:06:44] I could swear they were there as an option. [16:07:01] But yes I also applied security groups to the instance. [16:57:10] bd808: How do I pull up the a puppet role configuration? [16:57:30] For some reason the Role cyberbot::db, didn't point the data directory to SRV [16:57:39] mutante: set it up to. [16:58:34] Cyberpower678: I'm not quite sure I understand the question. In Horizon you setup puppet for an instance using the Puppet tab on the instance's details page. [16:58:58] there are other fancy ways to do that as well, but probably not needed for what you are doing [16:59:26] bd808: No what I want to see is what's inside of Role. [16:59:56] What it's telling Puppet to do. Because either puppet misfired and set up my DB wrong, or the Role hasn't been updated yet. 
[17:00:28] Cyberpower678: https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/cyberbot/db.pp [17:01:06] it doesnt set the datadir because that was a feature of the mysql class which turned out to be broken [17:01:11] due to hardcoded precise stuff used by quarry [17:01:14] oh [17:01:28] so we stopped using that and installed mariadb with just require_package [17:01:30] I wish I knew that. MariaDB is locked up. [17:01:45] i thought we talked about that [17:02:02] how is that related to the db locking up? [17:02:07] which already worked [17:02:13] mutante: I don't think so. We did talk about a lot though. :p [17:02:24] mutante: I deleted cyberbot-db-01 [17:02:29] why ? [17:02:33] and am in the process of rebuilding it. [17:02:49] Because I had to to get the 300 GB of disk space that was approved for me. [17:02:59] Cyberpower678: puppet:///modules/profile/cyberbot/my.cnf' [17:03:05] that is the path to the config file [17:03:32] Where do I type that path in [17:04:06] you would git clone the repo , then you'd have all the files locally [17:04:12] or you can use github again https://github.com/wikimedia/puppet/blob/production/modules/profile/files/cyberbot/my.cnf [17:04:25] my point is: [17:04:26] datadir = /srv/mysql/data [17:04:38] the datadir does get set and it does use /srv like you requested [17:04:47] where you had enough space [17:05:13] so i dont understand what happened in between but i doubt that "mariadb is locked" has to do with the datadir [17:08:51] mutante: so what happened is I applied the puppet role thinking it also sets up the listener address and data directory. So when went to restore the dump, it wrote to root, ran out of space, and locked up. 
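Editor's sketch: the failure described just above (a dump restored into a data directory on the root disk until it filled up) can be caught before the restore starts. The functions below are a minimal, hypothetical example and not part of the cyberbot puppet role. They assume a local MariaDB reachable as root over the socket, and the "twice the dump size" headroom is an arbitrary rule of thumb chosen for illustration.

```shell
# Sketch only: function names and the 2x headroom rule are assumptions.

# Pure helper: succeed only if the dump fits in the free space with headroom.
# Arguments: dump size in KB, free space in KB.
fits_with_headroom() {
  local dump_kb=$1 free_kb=$2
  # Require twice the dump size as a rough allowance for indexes and logs.
  [ $(( dump_kb * 2 )) -le "$free_kb" ]
}

# Ask the running server where it stores data, then compare the free space
# on that filesystem with the size of the dump file.
# Argument: path to the dump file (hypothetical).
check_restore_target() {
  local dump_file=$1 datadir free_kb dump_kb
  datadir=$(sudo mysql -N -B -e 'SELECT @@datadir') || return 1
  free_kb=$(df -Pk "$datadir" | awk 'NR==2 {print $4}')
  dump_kb=$(du -k "$dump_file" | cut -f1)
  echo "datadir=$datadir free_kb=$free_kb dump_kb=$dump_kb"
  fits_with_headroom "$dump_kb" "$free_kb"
}
```

If the printed datadir is still /var/lib/mysql rather than /srv/mysql/data, the my.cnf shipped by puppet has probably not been picked up yet, and restarting the service before restoring is the step to try first.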
[17:40:36] bd808: mutante mentions git cloning puppet:///modules/profile/cyberbot/my.cnf [17:40:48] bd808: when I do that, I get fatal: Unable to find remote helper for 'puppet' [18:22:57] (PS1) Krinkle: frontend: Restore loader indicator when landing from permalink [labs/tools/guc] - https://gerrit.wikimedia.org/r/386434 [18:23:11] (CR) Krinkle: [C: 2] "https://developer.mozilla.org/en-US/docs/Web/API/HTMLFormElement" [labs/tools/guc] - https://gerrit.wikimedia.org/r/386434 (owner: Krinkle) [18:23:38] (Merged) jenkins-bot: frontend: Restore loader indicator when landing from permalink [labs/tools/guc] - https://gerrit.wikimedia.org/r/386434 (owner: Krinkle) [18:51:49] paladox: since mutante is on a plane, maybe you can help> [18:51:50] ? [18:51:58] i can try [18:52:14] paladox: So I used puppet to install MariaDB. [18:52:19] yep [18:52:31] However something seems to go wrong every time. [18:52:40] oh [18:52:43] Namely the mysqld socket file isn't created [18:52:48] oh [18:52:54] mysql should create it [18:53:01] So when I try to access the DB as root, I can't connect [18:53:08] yep [18:53:12] any errors in the logs [18:53:12] ? [18:53:22] When I try to restart service, I also get errors. [18:53:25] on why it can't create the socket [18:53:41] Where's the log usually saved? [18:53:50] /var/log/mysql/ [18:55:11] Cyberpower678: if there is no log there check journalctl and syslog [18:55:12] uh, not there [18:56:51] hmm the folder is empty for me [18:56:53] DatGuy: filed T179030 about the client being stupid [18:56:53] T179030: OAuthClient should check for error before validating JWT - https://phabricator.wikimedia.org/T179030 [18:57:08] log_error = /var/log/mysql/error.log [18:57:18] apart from that, is there something to fix?
[18:57:21] paladox: zhuyifei1999_https://pastebin.com/9SpJvcvQ [18:57:25] https://pastebin.com/9SpJvcvQ [18:57:49] thanks [18:58:06] if you haven't seen https://www.mediawiki.org/wiki/OAuth/For_Developers#OAuth_in_detail what would have helped you find it? [18:58:22] exitcode 1 isn't really helpful imo [18:58:24] if you have seen it, is there something in there that should be clarified? [18:58:41] anything in the mysql logs [18:58:54] paladox: zhuyifei1999_ also https://pastebin.com/XLHLVarL [18:58:54] Cyberpower678 /var/log/mysql/error.log [18:58:54] ? [18:59:02] Just has notes in it. [18:59:15] thanks [19:00:00] please look the logs :) [19:00:05] systemd wont tell us much [19:00:07] :) [19:00:13] /var/log/mysql/error.log [19:01:17] paladox: zhuyifei1999_: https://pastebin.com/vj3dZX4N [19:01:56] thanks [19:01:58] hmm [19:02:16] as soon as it starts [19:02:18] it stops [19:02:24] Version: '10.1.26-MariaDB-0+deb9u1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Debian 9.1 [19:02:31] looks like it creats the socket then stops [19:03:56] * zhuyifei1999_ can't help unfortunately [19:04:00] It definitely has to do with something puppet is doing. Because before the puppet configurations are applied, as in I have to restart MariaDB, the thing works normally. [19:04:24] mutante: is the expert who set this up for me, but he's traveling, so probably unreachable. [19:04:46] wait for jynus? [19:04:59] tgr: The fix was that the demo wouldn't actually work. gotta use header() for the tool to redirect, instead of just putting a "Insert token here" [19:05:26] is the directory created for Cyberpower678 re do what you said works then run puppet agent -tv [19:05:31] to see what puppet does :) [19:06:05] paladox: the entire DB installation is done with puppet. [19:06:24] yep, but you said "Because before the puppet configurations are applied, as in I have to restart MariaDB, the thing works normally." [19:06:25] So it doesn't even exist until puppet agent -tv is run. 
[19:06:52] Well it starts the DB installation before puppet applies the my.cnf file [19:07:05] So the DB needs to be restarted after installation with puppet. [19:07:18] i wonder what is in /var/log/mysql/mysql.err [19:07:19] ? [19:07:27] paladox: https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/cyberbot/db.pp [19:07:33] That's the puppet role. [19:08:27] DatGuy: the demo in the readme file, you mean? [19:08:38] yep [19:09:05] i wonder if it's because [19:09:07] sql_mode = [19:09:08] is blank [19:09:23] Cyberpower678 check /var/log/mysql/mysql.err [19:09:45] paladox: https://pastebin.com/HWUcXGvC [19:10:09] ah [19:10:10] thanks [19:10:13] now we have the error [19:10:14] 2017-10-25 18:55:44 140159842312768 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist [19:10:47] Cyberpower678 could you cd to /var/lib/mysql [19:10:51] please [19:10:57] does it have folders in there? [19:11:42] Yes [19:11:47] ok [19:12:08] that data will most likly need to be copied to /srv/mysql/data/ [19:12:20] though i doint know if we want to do it manual [19:12:26] Ah. [19:12:28] i see no way of puppitising that data. [19:12:45] though you could do cp -R * /srv/mysql/data/ [19:13:03] but remeber to try and puppitise something or speak to mutante on what to do :) [19:13:28] copying for now will unbreak mysql for now [19:15:36] please do puppitise it later :) [19:15:55] paladox: service mysql start still erroring out [19:16:15] check /var/log/mysql/mysql.err [19:16:20] Cyberpower678 ^^ [19:16:49] Fatal error: Can't open and lock privilege tables: Table 'mysql.user' doesn't exist [19:16:58] oh [19:17:03] ah [19:17:23] Cyberbot678 try sudo chown -R mysql:mysql /srv/mysql/data/ [19:17:33] also is there a mysql folder in /srv/mysql/data/ ? [19:17:58] Grunt, I think I copied it wrong. Give me a sec [19:18:49] ok [19:21:42] Cyberpower678 did that work? 
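Editor's sketch: the manual copy suggested above uses `cp -R *`, which skips dotfiles (the shell glob does not match them) and does not preserve file ownership. A hedged alternative is `cp -a` with the `dir/.` form, shown below as a small helper. The function name is hypothetical, and the paths in the usage comment are the ones mentioned in the conversation.

```shell
# Sketch only: copy_datadir is a hypothetical helper name.

# Copy a directory's full contents, dotfiles included, preserving modes and
# timestamps (and ownership when run as root).
# Arguments: source directory, destination directory.
copy_datadir() {
  local src=$1 dst=$2
  mkdir -p "$dst" && cp -a "$src"/. "$dst"/
}

# Hypothetical usage on the database host, with the service stopped first:
#   sudo service mysql stop
#   sudo cp -a /var/lib/mysql/. /srv/mysql/data/
#   sudo chown -R mysql:mysql /srv/mysql/data
#   sudo service mysql start
```

Stopping the service before copying matters because files copied from a running server may be inconsistent. The chown is kept as a safety net in case the copy was not run as root.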
[19:21:59] Give me more seconds [19:22:10] ok [19:23:00] paladox: remind me of the quickest way to empty out an entire directory? [19:23:14] rm -rf [19:23:29] doint do rm -rf / :) [19:23:35] that will delete everything [19:23:57] Haha. I can always respawn the instance. :D [19:24:02] heh [19:27:40] Service started [19:27:51] I'm in [19:27:52] :) [19:27:56] :-) [19:28:01] Awesome. [19:28:17] Next problem [19:28:21] mysql --host cyberbot-db-01.cyberbot.eqiad.wmflabs --database s51059__cyberbot [19:28:21] ERROR 2003 (HY000): Can't connect to MySQL server on 'cyberbot-db-01.cyberbot.eqiad.wmflabs' (113) [19:28:33] The server should be visible. [19:28:41] um [19:28:53] Err wait [19:29:22] I should be getting a not allowed to connect error. [19:29:44] The listening address isn't bound to localhost is it [19:30:44] i think it's because bind_address = 0.0.0.0 [19:30:50] i think it's because you have [19:30:50] bind_address = 0.0.0.0 [19:30:51] set [19:30:53] but maybe do [19:31:00] mysql -u -p [19:31:07] that will do it to localhost [19:31:21] https://github.com/wikimedia/puppet/blob/production/modules/profile/files/cyberbot/my.cnf#L25 [19:31:33] brb [19:32:16] paladox: Well it should be listening for external connections. [19:42:07] labtestcontrol2003 is hitting wmf.3? [19:42:09] Um...why? [19:42:26] https://logstash.wikimedia.org/goto/11043646bcdb57bb54c08c76ea2ef80c [19:43:23] Do we have a host that's up but not pooled in scap? [19:47:07] no_justification: sorry man, that's me erroneously apologies. It's come up as a puppetmaster and there are backend scap things that happened which I did not expect [19:48:29] Hmmm, that's scary for a puppetmaster.... [19:49:24] I'm digging down through the layers [19:50:31] Lmk if I can help with anything [19:50:55] I seriously will, thanks for the heads up [20:19:40] im back [20:20:19] Cyberpower678 did you manage to connect? [20:20:39] So am I and the dump is writing to the newly rebuilt DB. 
[20:20:52] Now to get the listener to accept external connections [20:21:05] For some reason it isn't. [20:22:53] Cyberpower678 bind_address is what you want [20:22:54] :) [20:22:57] https://github.com/wikimedia/puppet/blob/production/modules/profile/files/cyberbot/my.cnf#L25 [20:23:17] paladox: doesn't 0.0.0.0 mean everything though? [20:23:27] um [20:23:28] not sure [20:23:52] oh i see [20:23:53] https://serverfault.com/questions/257513/how-bad-is-setting-mysqls-bind-address-to-0-0-0-0 [20:25:16] i think you want to open the firewall for port 3306 [20:25:22] which requires a puppet change [20:25:28] otherwise it would be manual [20:25:35] puppet change to what? [20:25:38] but mutante wants it done in a puppet class :) [20:25:47] to open the firewall for port 3306 [20:25:50] mysql port [20:26:16] There already exists a security group for that. [20:27:37] yes but that only allows outside traffic that dosen't tell the host to allow the traffic in :) [20:47:04] so... what if I have a tool that I that is a static web, but I need npm to download the deps? [20:57:14] davidwbarratt: npm should be available to use from your tool, you can just do npm install? [20:57:42] not sure if I'm misunderstanding, let me know if you ran into some trouble trying this [20:57:57] madhuvishy uhh.. I haven't created the tool yet, but the kuberneties container doesn't include node in static-web [20:58:04] madhuvishy or should kuberneties not be used? [20:59:17] ah i see [21:00:03] madhuvishy `unless you can use more than one container? [21:01:08] @seen jynus [21:01:24] davidwbarratt: so these npm packages just pull in static stuff right? [21:01:40] madhuvishy right and run webpack and then die [21:02:02] madhuvishy but I think you can have a deployments.yaml that can run both containers? 
[21:02:09] you can just pull in the packages and launch it as a plain webservice with k8s [21:02:34] like https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Default_web_server_.28lighttpd_.2B_PHP.29 [21:03:17] you just are looking to serve static files in your webservice, npm install is just a step to do first to pull in deps [21:03:26] iiuc [21:07:04] davidwbarratt: ^ [21:07:12] hmm [21:19:17] davidwbarratt: let me know if that helps, and if you have more questions! [21:19:33] paladox: sorry, I had to AFK. [21:19:37] ok [21:19:38] madhuvishy sure will! thanks! [21:19:41] So how do I fix the firewalling? [21:20:00] i could tell you the manual command but then i will be told off by mutante :) [21:20:14] as he likes things to be done with puppet [21:20:18] instead of manually [21:20:34] paladox: I agree with mutante but he's busy, and I need the DB working now. [21:20:36] :-) [21:20:39] ok [21:20:55] We'll work to get the puppet role setup correctly by learning from this. [21:21:05] sudo iptables -A INPUT -p tcp --dport 3306 -j ACCEPT [21:21:13] though that will reset it self on reboot [21:21:28] which is kind of safe so rebooting it will reset it back to the default firewall. [21:21:52] paladox: hmm... [21:22:00] It kind of needs to be permanent [21:22:10] that would be puppet then [21:22:12] heh :) [21:22:21] you just need to add that command to a file [21:22:29] and remember to use when rebootin [21:22:31] rebooting [21:22:35] What I don' understand is why it worked correctly last time. Puppet didn't open the traffic to that port when mutante worked on the instance [21:22:37] until you do the puppet change. [21:22:50] It happened to me too [21:23:05] * Cyberpower678 is confused [21:23:08] BRB [21:23:28] * paladox looks up polygerrit plugin docs :). 
[21:48:31] (PS1) Rush: openstack: add new puppetmaster profile dummy secrets [labs/private] - https://gerrit.wikimedia.org/r/386540 [21:49:51] (PS2) Rush: openstack: add new puppetmaster profile dummy secrets [labs/private] - https://gerrit.wikimedia.org/r/386540 [21:50:24] paladox: So here's what's confusing me. [21:50:40] ok [21:50:49] When mutante setup the DB with puppet and tinkered, I don't remember him playing with IP Tables. [21:51:22] And every time I restarted that instance, the DB was reachable without reconfiguring the IP table. [21:51:31] (CR) Rush: [V: 2 C: 2] openstack: add new puppetmaster profile dummy secrets [labs/private] - https://gerrit.wikimedia.org/r/386540 (owner: Rush) [21:51:37] hmm [21:51:47] did running the command i gave you work? [21:52:00] All I had to do was assign my security group to the instance, and presto outside access was allowed. [21:52:12] I haven't tried yet. [21:53:01] paladox: -bash: iptables: command not found [21:53:09] oh [21:53:15] which os do you have installed? [21:53:19] Stretch [21:53:26] ok [21:53:30] i have that installed too [21:53:52] root@phabricator:/home/paladox# sudo iptables -A INPUT -p tcp --dport 5666 -j ACCEPT [21:53:52] root@phabricator:/home/paladox# [21:54:51] I forgot to sudo [21:55:15] oh [21:55:22] But still not connecting. [21:55:35] oh [21:55:38] what's the error? [21:55:46] ERROR 2003 (HY000): Can't connect to MySQL server on 'cyberbot-db-01.cyberbot.eqiad.wmflabs' (113) [21:55:52] oh [21:56:22] i get this [21:56:23] root@phabricator:/home/paladox# mysql --host cyberbot-db-01.cyberbot.eqiad.wmflabs --database s51059__cyberbot [21:56:23] ERROR 1130 (HY000): Host 'phabricator.phabricator.eqiad.wmflabs' is not allowed to connect to this MariaDB server [21:57:07] :O [21:57:10] eut [21:57:13] *wut [21:57:18] heh [21:57:44] So it's just not working for me then. GREAT [21:58:17] are you doing it from the same host [21:58:19] ?
[21:58:22] if you are [21:58:26] use localhost [21:58:28] mysql [21:58:34] cyberpower678@cyberbot-exec-iabot-01:~$ mysql --host cyberbot-db-01.cyberbot.eqiad.wmflabs [21:58:34] ERROR 2003 (HY000): Can't connect to MySQL server on 'cyberbot-db-01.cyberbot.eqiad.wmflabs' (113) [21:59:03] Unless. [21:59:25] oh is that tools? [21:59:26] bd808 wait a second... if I can run a container on Toolforge... and that container runs as root... what is stopping me from installing anything within the container? [21:59:38] bd808: When I destroyed the old instance and spawned this new one, I used the same name. [21:59:48] Could the IP Tables be messed up somewhere? [22:00:14] paladox: no. It's my bot's exec node [22:00:24] oh [22:01:47] davidwbarratt: neither on the grid, nor within kubernetes do Tools run as root [22:02:00] they run as the Tool user itself [22:02:09] chasemp _within_ the container? [22:02:09] paladox: I think my exec nodes may be using an outdated IP table. [22:02:16] oh i see [22:02:19] chasemp: Can you confirm this. [22:02:53] chasemp: When I destroyed the old instance and spawned this new one, I used the same name. Could the IP Tables be messed up somewhere? [22:02:53] davidwbarratt: yes [22:03:05] chasemp kk [22:03:10] chasemp thanks! [22:03:29] chasemp I didn't know you can run containers as anything other than root [22:03:32] Cyberpower678: those things I wouldn't imagine are connected, there is a bug where a host of a prevously used name inherits the puppet config of the old host named that [22:03:43] but that's not iptables specific [22:04:21] chasemp: well, that bug never happened. I had to apply the configuration [22:04:42] that may be doubly weird I'm not sure, maybe that was fixed already [22:04:42] chasemp: My instance cyberbot-exec-iabot-01 cannot connection to cyberbot-db-01 [22:05:04] But things outside my instance can. [22:05:16] *project [22:05:53] chasemp: ^ [22:06:01] I have no idea what is going on. 
[22:06:38] it runs trusty https://tools.wmflabs.org/openstack-browser/server/cyberbot-exec-01.cyberbot.eqiad.wmflabs [22:06:53] Cyberpower678: when I telnet from cyberbot-exec-iabot-01 to cyberbot-db-01 on port 3306 I can see the traffic hit [22:06:56] and the error message of [22:07:01] hHost 'cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs' is not allowed to connect to this MariaDB serverConnection closed by foreign host. [22:07:05] seems mariadb specific [22:07:18] I believe you have some database specific restriction denying the connection [22:07:23] instead of a purely connectivity issue [22:07:29] i.e. not iptables or security groups [22:07:33] chasemp: but when I run mysql --host cyberbot-db-01.cyberbot.eqiad.wmflabs --database s51059__cyberbot [22:07:47] I get "ERROR 2003 (HY000): Can't connect to MySQL server on 'cyberbot-db-01.cyberbot.eqiad.wmflabs' (113)" [22:07:51] root@cyberbot-db-01:~# tcpdump port 3306 [22:07:51] tcpdump: verbose output suppressed, use -v or -vv for full protocol decode [22:07:51] listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes [22:07:53] 22:06:23.298843 IP cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs.34918 > cyberbot-db-01.cyberbot.eqiad.wmflabs.mysql: Flags [S], seq 3992021385, win 29200, options [mss 1460,sackOK,TS val 8530720 ecr 0,nop,wscale 9], length 0 [22:08:51] chasemp: so no doubt the instance can connect so what causes mysql to fail to connect since the instance was rebuilt. 
[22:08:57] https://stackoverflow.com/questions/1559955/host-xxx-xx-xxx-xxx-is-not-allowed-to-connect-to-this-mysql-server [22:09:12] that error message seems to be mysql privs specific [22:09:12] By instance I mean cyberbot-db-01 [22:09:28] davidwbarratt: you have discovered part of why we don't have "bring your own container" yet :) [22:09:28] yep, it's lack of permission at mysqld [22:09:37] chasemp: I'm getting an entirely different error message when using mysql from cyberbot-exec-iabot-01 [22:10:03] Platonides: ^ [22:10:24] :) [22:10:40] chasemp: Platonides: See examples I'm about to paste below: [22:10:42] cyberpower678@cyberbot-exec-iabot-01:~$ mysql --host cyberbot-db-01.cyberbot.eqiad.wmflabs --database s51059__cyberbot [22:10:43] ERROR 2003 (HY000): Can't connect to MySQL server on 'cyberbot-db-01.cyberbot.eqiad.wmflabs' (113) [22:10:53] tools.iabot@tools-bastion-03:~$ mysql --host cyberbot-db-01.cyberbot.eqiad.wmflabs --database s51059__cyberbot [22:10:53] ERROR 1130 (HY000): Host 'tools-bastion-03.tools.eqiad.wmflabs' is not allowed to connect to this MariaDB server [22:11:31] tools is working fine. cyberbot-exec-iabot-01 isn't. [22:12:50] That doesn't look like Tools works fine [22:13:14] https://stackoverflow.com/questions/2857446/error-1130-in-mysql [22:13:19] chasemp: For tools it's getting the explicit error that a lack of permissions is letting it connect. [22:13:36] Which I remedy by adding my grants. [22:13:42] telnet cyberbot-db-01.cyberbot.eqiad.wmflabs 3306 -- "telnet: Unable to connect to remote host: No route to host" [22:13:47] something funky there [22:14:35] bd808: where is that from? [22:14:38] i get [22:14:39] root@phabricator:/home/paladox# telnet cyberbot-db-01.cyberbot.eqiad.wmflabs 3306 [22:14:40] but ping works? [22:14:46] Trying 10.68.19.101... [22:14:46] Connected to cyberbot-db-01.cyberbot.eqiad.wmflabs. [22:14:52] Escape character is '^]'. 
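Editor's sketch: the two client errors pasted above come from different layers, which is why they call for different fixes. ERROR 2003 is raised by the client when the TCP connection itself fails, and the number in parentheses is the operating system errno (113 is EHOSTUNREACH, "No route to host", on Linux; 111 is ECONNREFUSED). ERROR 1130 is sent by the server after the connection succeeded, because the connecting host has no matching entry in the grant tables. The helper below is a hypothetical illustration of that triage. The labels it prints are made up for this example.

```shell
# Sketch only: classify_mysql_error and its labels are hypothetical.

# Map a mysql client error line to the layer that most likely produced it.
# Argument: the error line as printed by the mysql client.
classify_mysql_error() {
  case "$1" in
    *"ERROR 2003"*"(113)"*) echo "no-route" ;;  # errno 113: packets never reached the host
    *"ERROR 2003"*"(111)"*) echo "refused" ;;   # errno 111: host reached, nothing listening on the port
    *"ERROR 2003"*)         echo "network" ;;   # some other connection-level failure
    *"ERROR 1130"*)         echo "grants" ;;    # mysqld answered and rejected the client host
    *)                      echo "unknown" ;;
  esac
}
```

Read this way, the tools-bastion-03 output shows a working network path and a missing grant, while the cyberbot-exec-iabot-01 output shows a problem below the database, which matches the "No route to host" telnet result from the same instance.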
[22:14:53] chasemp: that's from cyberbot-exec-iabot-01 [22:14:56] `Host 'phabricator.phabricator.eqiad.wmflabs' is not allowed to connect to this MariaDB serverConnection closed by foreign host. [22:15:01] that's not what it does for me from there oddly [22:15:08] cyberbot-exec-iabot-01:~$ telnet cyberbot-db-01.eqiad.wmflabs 3306 [22:15:08] Trying 10.68.19.101... [22:15:08] Connected to cyberbot-db-01.eqiad.wmflabs. [22:15:10] Escape character is '^]'. [22:15:12] hHost 'cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs' is not allowed to connect to this MariaDB serverConnection closed by foreign host. [22:15:33] that is bizarre [22:16:07] I'm ssh'ed in with my labs root key. ping works. telnet gives that no route response [22:16:16] bd808 but if I'm reading these docs right.. I can start multiple containers with deployments.yaml? or is that incorrect? [22:16:38] davidwbarratt: you can, yes. [22:16:45] bd808 YAY! [22:16:55] a "pod" can contain multiple containers [22:17:06] bd808: I cant' get it to fail that way [22:17:07] so far [22:17:24] davidwbarratt: the ones we make with `webservice --backend kubernetes` are just a single container [22:17:44] but if you make your own deployment then you can have multiple [22:17:48] it's consistent for me and I can see the traffic hit cyberbot-db-01 and respond [22:17:59] bd808 ah! so how would I invoke deployments.yaml? [22:18:19] * bd808 looks for the example doc [22:18:45] davidwbarratt: there is a micro example at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Kubernetes#Kubernetes_continuous_jobs [22:19:03] basically you read the upstream Kubernetes documentation [22:19:21] Cyberpower678: I'm about out of time at the moment, I'm not sure. From what I see I would say it's a mariadb permissions issue, a new exec host would have a new IP and in mysql grants are user@ip and so I would expect a new exec to have issues. 
[22:19:27] but I'm not sure what bd808 is seeing [22:19:32] kubectl create -f my_deployment.yaml [22:19:35] bd808 haha. ok, great. :) I'm going to make a docker-compose for now (since I know how to do that and it runs easily on the mac) and then I'll convert that to a pod. :) [22:19:39] so we have reached phab task level of oddness I think [22:20:00] chasemp: But it's not a new instance [22:20:15] It was working fine before I trashed cyberbot-db-01 and rebuilt it. [22:20:34] chasemp: is the cyberbot-exex-iabot-01 that you are on the owner of the 10.68.23.85 ip? [22:20:37] ah, I had which was new mixed up [22:20:53] * bd808 wonders if there is weirdness in the tubes [22:20:54] inet 10.68.23.85/21 [22:21:01] yes [22:21:01] really really weird [22:22:23] oh.. there is weirdness. When I telnet cyberbot-db-01 routes to 10.68.18.167, but ping goes to 10.68.19.101 [22:22:46] dns cache? [22:22:58] host cyberbot-db-01.cyberbot.eqiad.wmflabs says "cyberbot-db-01.cyberbot.eqiad.wmflabs has address 10.68.19.101" [22:23:05] * chasemp crazy ideas [22:23:18] must be a cache somewhere [22:23:21] root@phabricator:/home/paladox# telnet cyberbot-db-01 [22:23:21] Trying 10.68.19.101... [22:23:32] root@phabricator:/home/paladox# ping cyberbot-db-01.cyberbot.eqiad.wmflabs [22:23:32] PING cyberbot-db-01.cyberbot.eqiad.wmflabs (10.68.19.101) 56(84) bytes of data [22:24:07] !log cyberbot cyberbot-exec-iabot-01:~# service nscd restart [22:24:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cyberbot/SAL [22:24:13] that...is hanging [22:24:22] chasemp: for me too... [22:24:32] maybe bad timing :) [22:24:34] * bd808 had run by not !logged yet [22:24:45] chasemp: paladox: I can definitely confirm it's not a grants issue. I've now added all of the permissions and my exec node cannot connect. 
[22:24:48] bd808: ^ [22:25:06] cyberbot-exec-iabot-01:~# host cyberbot-db-01.cyberbot.eqiad.wmflabs [22:25:06] cyberbot-db-01.cyberbot.eqiad.wmflabs has address 10.68.19.101 [22:25:10] bd808: try telnet [22:25:53] chasemp: I'm still getting the 10.68.18.167 address from somewhere [22:25:54] this could be more gotchas of hostname reuse; stale DNS will abound [22:26:00] bd808: that's weird [22:26:07] bd808: try a new terminal [22:26:41] disconnected and reconnected. same bad IP [22:26:50] this is really weird [22:27:12] dig @labs-recursor0.wikimedia.org cyberbot-db-01.cyberbot.eqiad.wmflabs [22:27:13] dig and host give the right answer [22:27:18] 10.68.19.101 [22:27:48] it's not in the hosts file [22:28:03] I restarted both nscd and nslcd [22:28:19] disconnected and reconnected [22:28:28] ping gets the right address [22:28:43] telnet gets some stale address [22:28:56] dig for cyberbot-exec-iabot-01 is not returning an IP for me [22:29:24] I only see [22:29:25] ;; AUTHORITY SECTION: [22:29:26] . 885 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2017102501 1800 900 604800 86400 [22:30:02] paladox: you have to put .eqiad.wmflabs on it [22:30:04] paladox: it would only work from inside a labs project and with the full hostname [22:30:10] ah thanks [22:30:21] telnet is stalling [22:30:22] root@phabricator:/home/paladox# telnet cyberbot-exec-iabot-01 [22:30:22] Trying 10.68.23.85... [22:30:37] it should have errored by now [22:31:04] Cyberpower678: what happens if you don't use hostnames and hardcode the 10.68.19.101 IP? [22:31:13] that will be a blackhole firewall route [22:31:23] paladox: ^ [22:31:31] thanks [22:31:45] chasemp: I'd rather not. If I ever needed to rebuild, then I'd have to change the host names everywhere.
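Editorial aside on the symptom above (dig, host and ping return 10.68.19.101 while telnet still tries 10.68.18.167): a minimal Python sketch, not something that was run in the channel, that asks the system resolver for a hostname's IPv4 addresses and lists any that differ from the address the recursor reports. `socket.getaddrinfo` uses the host's normal name-service lookup path, which is where a local cache such as nscd can sit, whereas dig queries the DNS server directly. The function names and the idea of passing in an "expected" list are assumptions made for illustration.

```python
import socket


def resolve_via_system(hostname):
    """Return the sorted IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def stale_addresses(resolved, expected):
    """Return resolved addresses that are missing from the expected set."""
    return sorted(set(resolved) - set(expected))
```

On the exec node this could be called as `stale_addresses(resolve_via_system("cyberbot-db-01.cyberbot.eqiad.wmflabs"), ["10.68.19.101"])`; a non-empty result would point at a local cache rather than at the recursor.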
[22:31:50] for testing [22:31:56] not forever [22:31:56] * Cyberpower678 tests [22:32:41] It works [22:32:43] `telnet 10.68.19.101 3306` gives me "Host 'cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs' is not allowed to connect to this MariaDB server" [22:33:07] so there is a cache somewhere holding on to the old instance IP [22:33:18] let's reboot the exec node [22:33:20] to see [22:33:32] maybe nscd or nslcd is hanging on too tight for $reasons? [22:33:42] there or the recursor are the only caches I can think of [22:33:49] and the recursors have it right [22:34:17] Somebody killed my root access on that instance. I can't reboot [22:34:28] ok [22:34:34] cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs I am going to reboot then [22:34:40] 3..2..1.. [22:34:53] !log cyberbot reboot cyberbot-exec-iabot-01.cyberbot.eqiad.wmflabs [22:34:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cyberbot/SAL [22:34:58] Can somebody explain [22:35:00] sudo reboot [22:35:01] sudo: unknown uid 3045: who are you? [22:35:17] that's probably from nslcd freaking out from the restart [22:35:23] yeah ^ that [22:35:25] or maybe it just went haywire [22:35:49] Okay, it's working again. :D [22:35:57] root access that is. [22:36:11] bd808: telent? [22:36:12] Host is connecting now too. :D [22:36:13] telnet [22:36:25] chasemp: that fixed it for me. telnet cyberbot-db-01.cyberbot.eqiad.wmflabs 3306 -- Trying 10.68.19.101... [22:36:28] methinks nscd took a wrong turn [22:36:35] there is a way to clear cache w/o restarting [22:36:37] nscd -g? [22:36:53] can't recall [22:36:55] but either way [22:37:01] reusing hostnames has weird outcomes :) [22:37:05] I had kill -9'd it [22:37:08] but yeah [22:37:14] reusing hostnames is bad mojo [22:37:16] * chasemp off to dinner [22:37:30] chasemp: enjoy [22:37:37] Cyberpower678: I think you are back to MariaDB grant issues now [22:37:51] bd808: I can fix those myself. :D [22:38:01] Thanks for the troubleshooting.
:-) [22:38:33] chasemp https://stijn.tintel.eu/blog/2012/05/10/how-to-really-flush-the-various-nscd-caches :) [22:38:43] nscd --invalidate=hosts [22:39:15] TIL "The nscd caches are saved to disk" [22:39:34] I thought they were just in RAM [22:39:37] What is Secureauth? :O [22:39:48] I've never EVER seen that before. [22:40:36] Cyberpower678: context? [22:41:05] there is a multi-factor auth product named secureauth, but we don't have it here [22:42:01] bd808: When trying to auth into MySQL. [22:43:03] bd808: ERROR 1275 (HY000): Server is running in --secure-auth mode, but 's51059'@'tools-bastion-03.tools.eqiad.wmflabs' has a password in the old format; please change the password to the new format [22:43:03] oh. that's the MySQL protocol change [22:43:46] But it was still working this morning on the instance I spawned last week. Did the change just happen? [22:44:04] did you load your permissions tables from a dump file made on an old version of MySQL (like pre 4.1)? [22:44:13] No [22:44:28] I did exactly what I did last week. [22:45:26] that client flag defaults to off... unless our packages set it to true in a config file somewhere [22:45:45] It must have. [23:05:06] bd808: I'm reading help docs, but none of them are working. I just can't figure out how to convert to newer auth methods or disable secure-auth [23:06:31] https://dev.mysql.com/doc/workbench/en/wb-mysql-connections-secure-auth.html [23:07:33] ignore ^^ [23:07:38] that's for workbence [23:07:41] workbench [23:08:04] paladox: I use Workbench [23:08:07] https://dev.mysql.com/doc/refman/5.6/en/account-upgrades.html [23:08:13] on the Linux OS? [23:08:23] I thought you had MariaDB installed? [23:08:41] MariaDB doesn't have Workbench as far as I know :) [23:09:11] paladox: never mind. [23:09:30] I use Workbench to query the DB.
[23:09:43] see https://dev.mysql.com/doc/refman/5.6/en/account-upgrades.html Cyberpower678 [23:10:55] * Cyberpower678 looks [23:27:28] paladox: ALTER USER 's51059'@'%.tools.eqiad.wmflabs' IDENTIFIED WITH mysql_native_password BY ''; [23:27:36] yeah [23:27:45] Gives me a syntax error [23:28:43] paladox: I don't understand [23:29:05] oh [23:29:11] what syntax error does it say? [23:29:14] oh [23:29:15] wait [23:29:18] I know the error [23:29:44] never mind [23:29:45] Cyberpower678 what's the syntax error? [23:29:53] * paladox it's 00:29am here :) [23:30:43] paladox: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'USER 's51059'@'%.tools.eqiad.wmflabs' IDENTIFIED WITH mysql_native_password BY '' at line 1 [23:30:58] hmm [23:31:07] you forgot ALTER :) [23:31:15] ALTER USER 's51059'@'%.tools.eqiad.wmflabs' IDENTIFIED WITH mysql_native_password BY ''; [23:31:19] Is what I used [23:31:36] hmm [23:32:29] oh [23:32:35] it is for MySQL 5.7 [23:32:49] the pre-5.7 one is below [23:32:51] SET old_passwords = 0; [23:32:51] UPDATE mysql.user SET plugin = 'mysql_native_password', [23:32:51] Password = PASSWORD('DBA-chosen-password') [23:32:51] WHERE (User, Host) = ('user1', 'localhost'); [23:32:52] FLUSH PRIVILEGES; [23:33:00] Cyberpower678 ^^ [23:33:09] Considering that secure-auth is being forced on me, I would say that is 5.7 [23:33:17] But how do I check to be sure? [23:33:18] you're using MariaDB [23:33:22] which is not MySQL :) [23:33:35] Well I can't even disable it. [23:35:05] https://serverfault.com/questions/573809/skip-secure-auth-mysql-5-6-15 [23:35:34] secure-auth = OFF in my.cnf [23:35:46] though not sure under which section it goes [23:37:19] follow "Before MySQL 5.7, you must modify the mysql.user table directly using these statements:" [23:37:21] section [23:39:01] anyways I have to go now, sorry, as it's 00:38 in the morning :). UTC+1.
[23:39:18] my clock goes back on Sunday anyways [23:40:18] paladox: The doc is useless [23:41:03] Sorry, but every time I hit a certain step, I get a syntax error. [23:41:04] :/ [23:41:24] This is just stupid. Why does MariaDB force this on users without proper documentation? [23:41:44] https://jira.mariadb.org/browse/MDEV-9859 [23:42:00] It seems secure-auth is false by default [23:42:03] in MariaDB [23:42:10] but MySQL 5.6+ has it on by default [23:42:35] anyways must go
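Editorial aside on the question left open above ("how do I check to be sure"): `SELECT VERSION();` returns the server's version string, and MariaDB builds normally include "MariaDB" in it. A minimal Python sketch that classifies such a string so it is clear which vendor's manual applies. The helper name and the sample strings are hypothetical, and treating everything without "MariaDB" in the string as MySQL is a simplifying assumption.

```python
import re


def parse_server_version(version_string):
    """Split a VERSION() string into (flavor, (major, minor, patch))."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    numbers = tuple(int(part) for part in match.groups())
    # Assumption: anything that does not name MariaDB is treated as MySQL.
    flavor = "MariaDB" if "mariadb" in version_string.lower() else "MySQL"
    return flavor, numbers
```

The `ALTER USER ... IDENTIFIED WITH ... BY` form quoted earlier comes from the MySQL 5.7 documentation; whether a particular MariaDB release accepts it is something to confirm in the MariaDB manual for the version this reports.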