[10:44:32] hi all
[10:45:03] how can I reset my toolsadmin.wikimedia.org user's password?
[11:04:38] avgas: That's LDAP login, identical to your wikitech login. I think you should be able to reset it on wikitech.
[13:10:23] Are there any limitations on the frequency of logging into Wikipedia? My bot is logging in every minute (don't ask why) and sometimes gets the error "Check your login and password"
[13:12:57] see the replies in #wikimedia-tech
[14:32:02] !log deployment-prep git pull on /var/lib/git/labs/private and resolve one merge conflict. (the root key file is too old here)
[14:32:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[15:31:10] Technical Advice IRC meeting starting in 30 minutes in channel #wikimedia-tech, hosts: @Amir1 & @Thiemo_WMDE - all questions welcome, more info: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[17:32:18] hi, i'm getting puppet errors
[17:32:19] with
[17:32:21] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Variable $::nameservers is not defined! at /etc/puppet/modules/base/manifests/resolving.pp:6 on node phabricator.phabricator.eqiad.wmflabs
[17:32:22] Warning: Not using cache on failed catalog
[17:32:23] Error: Could not retrieve catalog; skipping run
[17:32:35] on two instances
[17:33:10] make that 3
[17:37:01] also see -releng.
[17:39:40] i've found the breakage.
[17:39:46] caused by https://gerrit.wikimedia.org/r/#/c/394042/
[17:41:13] https://gerrit.wikimedia.org/r/#/c/394095/
[17:42:23] I wonder if there is just something like a puppetmaster restart that is needed after _joe_'s patch
[17:42:49] i can test that on another instance.
[17:42:58] paladox: are you only seeing this blow up on hosts that are using a project local puppetmaster?
[17:43:20] i will check on another instance that doesn't use a local puppet master
[17:43:59] works on instances that don't use a local puppet master
[17:44:02] bd808 ^^
[17:44:46] let me check to see if the main puppetmaster for Cloud VPS has picked up that patch yet...
[17:46:32] thanks
[17:46:54] paladox: that patch is on the shared Cloud VPS puppetmaster. I guess that means there is something weird about how it is interacting with role::puppetmaster::standalone
[17:47:07] yeh.
[17:47:49] andrewbogott: do you have a minute to look at role::puppetmaster::standalone and how https://gerrit.wikimedia.org/r/#/c/394042/ may be causing breakage?
[17:48:34] bd808: I can in a bit, I'm in the midst of a thing
[17:50:35] bd808, paladox, I can at least confirm in passing that that patch (in isolation) seems to break something
[17:50:52] thanks.
[17:54:30]
[17:54:48] 2017-11-29 16:16:57* chas.emp | if you add environment = future to [main] in puppet.conf I bet it runs clean. Not sure if it's actually desired state but it's a test :)
[17:55:00] I'm going to try to dump that on giuseppe
[17:55:04] from -operations earlier. Not sure if I got it right, but I believe it was the same thing
[17:59:42] Actually, ignore me. That patch was merged after what I thought was related.
[18:04:10] paladox: I think that https://gerrit.wikimedia.org/r/#/c/394098/ will fix the issue without reverting
[18:04:19] thanks, testing :)
[18:04:52] nope still fails andrewbogott
[18:05:01] i cherry picked https://gerrit.wikimedia.org/r/#/c/394098/ and ran puppet
[18:05:06] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Variable $::nameservers is not defined! at /etc/puppet/modules/base/manifests/resolving.pp:6 on node puppet-paladox3.git.eqiad.wmflabs
[18:05:06] Warning: Not using cache on failed catalog
[18:05:06] Error: Could not retrieve catalog; skipping run
[18:06:36] I don't think cherry-picking onto the puppetmaster will help, those changes have to actually be applied there...
[18:06:55] so you'll need to hand-apply the config changes from that patch in order to run puppet, in order to actually apply the patch :)
[18:07:39] i see that it is applied here
[18:07:41] default_manifest = $confdir/manifests/site.pp
[18:07:41] environmentpath = $confdir/environments
[18:07:43] in puppet.conf
[18:08:54] that's not applied, it should be default_manifest = $confdir/manifests
[18:09:34] oh i see
[18:10:12] still fails after changing it to ^^
[18:11:00] you'll need to service apache2 restart
[18:11:38] ok thanks.
[18:12:21] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: comparison of String with 4 failed at /etc/puppet/modules/role/manifests/puppetmaster/standalone.pp:65 on node puppet-paladox3.git.eqiad.wmflabs
[18:12:21] Warning: Not using cache on failed catalog
[18:12:21] Error: Could not retrieve catalog; skipping run
[18:12:24] fails with ^^
[18:12:55] hm, it doesn't like if $puppet_major_version < 4 { ?
[18:12:56] which is if $puppet_major_version < 4 {
[18:12:59] yep
[18:13:39] ah
[18:13:42] $puppet_major_version = hiera('puppet_major_version', 3),
[18:13:56] $puppet_major_version = hiera('puppet_major_version', undef), -> $puppet_major_version = hiera('puppet_major_version', 3),
[18:15:04] yeah, ok, I'll make it default to 3
[18:15:10] that fixes it
[18:15:10] is tools-login.wmflabs.org down ?
[18:15:15] but then puppet reverts it
[18:15:32] see https://phabricator.wikimedia.org/P6394 andrewbogott
[18:16:10] probably just me
[18:16:17] uh never mind, it works now
[18:16:18] :)
[18:16:24] andrewbogott: any resolution to my request ?
[18:16:28] matanya: I just got logged into it
[18:16:35] matanya: I think there's discussion on the ticket
[18:16:44] indeed
[18:16:53] but last reply is by me :)
[18:17:23] oh, ok, I'm behind then. Maybe ping bd808 or give it a day :)
[18:17:39] I guess I pinged him by saying you should ping him
[18:18:15] lol
[18:18:29] matanya: I'm still not sold on the file sharing plan, but andrewbogott seems to think it might be useful
[18:18:48] I just get nervous about file drops generally
[18:18:53] even open source meeting solution is worth it imho
[18:19:04] useful but also maybe in need of supervision or some kind of controls on who can upload
[18:19:17] i can administer it
[18:19:44] can we try with a limited poc ?
[18:20:06] i will install and grant access to the standard committee at first and see the response
[18:21:20] paladox: ok, I think things are cleared up now - thanks for diagnosing!
[18:21:49] matanya: yeah, I think we can let you try something out, but we should make some sort of plan about how to decide if the experiment is working
[18:22:17] bd808: check usage ?
[18:22:44] if we can find a video system that is actually workable I think we would want to figure out how to move it to "real" hardware. Without a real time kernel I fear it won't scale very well
[18:22:59] andrewbogott thank you for fixing it. I wonder how this will be applied to all the local puppet masters
[18:23:10] that have to manually make a change and get puppet to reapply.
[18:23:16] i did a test with andrewbogott on my own home virt and he said it was great
[18:23:26] I think they'll recover since the patch should be applied via cron...
[18:24:11] you mean git fetched?
[18:24:47] yeah
[18:25:08] ah, it changes this file /etc/puppet/puppet.conf.d/20-master.conf which is why puppet.conf was different.
[18:25:17] paladox: I will keep an eye on the deployment-prep master
[18:25:28] thanks :)
[18:25:28] hm, maybe that's not self-hosted
[18:25:48] matanya: should we put a time box on the initial evaluation? Maybe something like 2-3 months and then check to see if it is actually getting use?
[18:26:07] i am cool with the bd808
[18:26:12] *that
[18:26:49] matanya: ok. I'll add some notes to the ticket and then we can get you started.
[18:26:58] thanks
[18:42:55] !log collaborate Created project with matanya as initial admin (T181369)
[18:42:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Collaborate/SAL
[18:42:58] T181369: Request creation of collaborate VPS project - https://phabricator.wikimedia.org/T181369
[19:22:52] bd808: i have connectivity issues to labs, might it be my ssh version ?
[19:22:59] it times out on auth
[19:33:37] legoktm: *waves* have you been able to get anywhere with the releasetaggerbot issue?
[19:53:56] matanya: I suppose it could be, but I don't know that we have changed the configuration on our side recently.
[19:54:47] matanya: is your result the same with tools-login directly and via the cloud bastion?
[19:55:15] bd808: i can pm you the timeout
[19:55:48] sure
[20:35:21] (PS1) Merlijn van Deen: Catch, log and re-raise all exceptions at highest level [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/394133
[20:35:42] (CR) jerkins-bot: [V: -1] Catch, log and re-raise all exceptions at highest level [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/394133 (owner: Merlijn van Deen)
[20:37:51] (PS2) Merlijn van Deen: Catch, log and re-raise all exceptions at highest level [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/394133
[21:02:17] valhallasw`cloud: I honestly haven't actually spent time looking at it yet
[21:07:29] back
[21:07:55] woops wrong channel
[21:21:23] legoktm: np, I think in the end nothing is really broken? at least magically things started working again. Maybe it was a security issue after all, that got opened up or something
[21:21:31] in any case the logging should be better now
[21:27:10] valhallasw`cloud: we should have it file its own bugs
[21:27:47] !log deployment-prep Update ores submodule, for RevIdScorer statistics
[21:27:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[21:36:38] yeah, 10 hours of lag. https://tools.wmflabs.org/replag/
[21:56:58] (CR) Jforrester: "Seems sane." [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/394133 (owner: Merlijn van Deen)
[22:44:17] !log
[22:44:17] Deadsoul: Unknown project ""
[23:27:13] !log tools.zppixbot cleared reminders.db file.
[23:27:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.zppixbot/SAL