[00:17:54] now at https://discourse-mediawiki.wmflabs.org/t/edit-conflicts-by-tool-not-detected/1223
[00:22:27] !log tools Raise web-memlimit for isbn tool to 6G for tomcat8 (T217406)
[00:22:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[00:22:31] T217406: Stretch grid problem: cannot migrate tomcat webservice - https://phabricator.wikimedia.org/T217406
[00:59:19] AnomieBOT jobs 715447 and 707476 seem to be stuck. I tried to qdel them and kill -9 the jobs on tools-sgeexec-0917 and it didn't make them go away.
[01:03:13] anomie: force deleted for you. The grid at least seems to think they are gone now
[01:03:36] Thanks
[01:04:22] !log tools.anomiebot Force deleted jobs 707476 and 715447 per irc request
[01:04:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.anomiebot/SAL
[04:59:22] !log tools disabled puppet for a little bit on tools-bastion-07
[04:59:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[05:03:29] Hi friends.
[05:03:51] I've been getting e-mails that my tools such as "dbreps" and "mzmcbride" are going to be disabled soon.
[05:06:56] I'm skimming https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation now.
[05:07:42] It's not clear what happens if I do nothing. If I just wait until March 25, 2019, can I log in and just re-establish my crontab on the new host?
[05:07:57] !log tools re-enabled puppet for tools-sgebastion-07
[05:07:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[05:08:20] I'm pretty sure I write them to ~/misc/crontab.txt or similar, so it would just be a matter of running "crontab ~/misc/crontab.txt" if so.
[05:09:23] Haley: That's pretty much the idea as long as the jobs work with the new versions of libraries and execution environments
[05:09:44] Okay, great!
[05:09:47] Waiting until the end is not a good idea because they might not "just work" with the new setup
[05:10:04] Haley: So I'd recommend trying things before they are about to get turned off.
[05:10:10] That's the big thing.
[05:10:25] Depending on what you run
[08:32:47] !log testlabs re-enabled puppet on ldap-jessie/trusty VMs
[08:32:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Testlabs/SAL
[11:20:44] !log tools disable puppet in tools-sgebastion-07 for testing T215154
[11:20:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:20:48] T215154: Toolforge: sgebastion: systemd resource control not working - https://phabricator.wikimedia.org/T215154
[11:53:32] !log tools enable puppet in tools-sgebastion-07 (T215154)
[11:53:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:53:36] T215154: Toolforge: sgebastion: systemd resource control not working - https://phabricator.wikimedia.org/T215154
[12:17:07] !log tools reboot tools-sgebastion-07 (T215154)
[12:17:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:17:20] T215154: Toolforge: sgebastion: systemd resource control not working - https://phabricator.wikimedia.org/T215154
[12:19:03] ah, that would be why my mosh session suddenly died
[12:19:58] sorry for that Lucas_WMDE
[12:20:02] I didn't see other option
[12:20:12] should be back online already, right?
[12:20:17] I’ll try
[12:20:41] not yet, apparently
[12:20:54] let me check
[12:21:04] (perhaps a wall message next time? or was it super urgent?)
[12:21:18] now it’s back
[12:21:30] yes, a wall message should have worked
[12:22:24] thanks, and sorry Lucas_WMDE
[12:22:37] no problem :)
[12:22:44] thanks for taking care of the issue, it does sound important ^^
[12:33:36] !log tools reboot tools-sgebastion-08 (T215154)
[12:33:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:33:40] T215154: Toolforge: sgebastion: systemd resource control not working - https://phabricator.wikimedia.org/T215154
[15:01:44] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: @nuria & @Thiemo_WMDE - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[15:37:11] 768121 0.25770 u2_candida tools.firefl qw 03/12/2019 17:49:27 1
[15:37:35] This is the result of qstat when run through putty
[15:37:55] How can this qw be changed to running state?
[15:43:04] Hey guys, with the whole login host shuffle I lost track a bit and https://wikitech.wikimedia.org/wiki/Help:Toolforge#Bastion_hosts is outdated
[15:43:28] multichill: outdated how?
[15:43:40] It says we have 2 login hosts
[15:44:03] did I not fix the DNS for dev.tools?
[15:44:16] We have trusty hosts, we have stretch, we have normal and dev
[15:44:20] So I would expect 4 now
[15:44:40] the trusty hosts have ~2 weeks to live. I didn't want to pollute the docs with them
[15:45:03] the two names listed there are "canonical" and will continue to exist over time
[15:45:05] Especially during migration the documentation should be up to date
[15:45:27] Could you please make it complete? :-)
[15:45:51] I would like to reconnect to the trusty dev host to finish some things and I have no clue what hostname it has now
[15:46:18] https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation#SSH_to_the_Stretch_bastion
[15:46:25] login-trusty.tools.wmflabs.org
[15:47:02] That's not the dev host
[15:47:43] trusty-dev.tools.wmflabs.org
[15:48:12] there is no actual difference between trusty-login and trusty-dev
[15:48:40] they are physically distinct instances, but identical in config
[15:51:13] Technical Advice IRC meeting starting in 10 minutes in channel #wikimedia-tech, hosts: @nuria & @Thiemo_WMDE - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[15:52:53] bd808: Just by convention right. Don't do heavy stuff on the normal login host
[15:56:29] bd808: Not sure how long ago you updated dns, but dev.tools.wmflabs.org made me end up at the old host
[15:57:21] multichill: I may have missed that one. I will double check and fix if so in the next hour or two (meeting time of the day now)
[16:01:28] No rush, dns is slow anywa
[16:01:31] *anyway
[16:10:30] !log tools Updated DNS for dev.tools.wmflabs.org to point to Stretch secondary bastion. This was missed on 2019-03-07
[16:10:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:10:40] !log tools rebooted cron server
[17:10:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:11:06] !log tools specifically rebooted SGE cron server tools-sgecron-01
[17:11:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[18:09:42] !log deployment-prep restart elasticsearch on deployment-elastic* to deploy apifeature usage fix (T183156)
[18:09:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[18:09:47] T183156: ApiFeatureUsage data is not being populated in the Beta Cluster - https://phabricator.wikimedia.org/T183156
[18:14:47] Is something wrong with puppet in deployment-prep? Running puppet on deployment-elastic05.deployment-prep.eqiad.wmflabs fails to fetch the catalog and does nothing, paste of run: https://phabricator.wikimedia.org/P8194
[18:14:56] Looks like SSL certificate problems
[18:15:39] the primary error is probably: Error: Could not send report: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: puppet]
[18:16:17] Krenair ^^
[18:19:14] * Krenair looks
[18:19:45] thanks
[18:21:01] those instances have been up for at least 3-4 months (most likely that is just when they got migrated to the new region), they should not be experiencing these puppet SSL problems typically associated with new instances
[18:22:38] i thought i might check dates in /etc, but i see files go back as far as aug 7, 2006. So maybe not a good way to tell how old they are :)
[18:22:59] they have certainly been around some time though
[18:23:12] um
[18:23:17] /etc/puppet/puppet.conf is wrong
[18:23:18] very very wrong
[18:23:21] server = labs-puppetmaster.wikimedia.org
[18:23:29] how did that get there
[18:23:55] let's see what happens when I fix that back to our one
[18:24:00] okay it's happier
[18:24:07] lots of changes
[18:24:38] looks like it updated the file i was expecting from a patch merged earlier today
[18:24:58] looks like puppet on beta is a pain
[18:25:21] s/ on beta//
[18:25:26] maybe having the config spread across wikitech hiera, horizon and operations/puppet.git doesn't help
[18:26:06] yep
[18:26:28] i keep hearing there is a longer term plan to replace this thing, we've only been talking about it for 3 years or so so it's getting closer :)
[18:26:45] which thing?
[18:27:04] beta cluster, maybe replace is the wrong word. They've been calling it a staging cluster or some such
[18:27:19] like upgrading our lists to mailman v3... stalled for years
[18:27:23] :(
[18:27:28] replacing beta itself wouldn't solve this problem
[18:27:37] you'd start running into the same kinds of issues
[18:29:48] Krenair: the idea would be that an entirely new system would be designed taking into account the problems we've had with beta cluster over the years. I assume that means we would solve some of the old problems and create a slew of new ones
[18:36:21] is it the design of the beta cluster itself that is the problem though? :)
[18:36:35] hiera could be moved around without any redesigning
[18:41:06] what about the other puppet problems around it?
[18:43:30] tbh i'm not sure, i only have to use whatever releng comes up with :)
[20:08:16] !log tools.tsreports rebuilt venv & restarted webservice on Stretch -- seems to work without any changes (phew).
[20:08:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.tsreports/SAL
[20:32:06] !log tools.tsreports marked as deprecated & wrote conversion guide to quarry
[20:32:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.tsreports/SAL
[22:55:14] !log tools Rebuilding jessie Kubernetes images
[22:55:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[23:21:51] I'm having some trouble migrating my Trusty jobs to Stretch on toolforge, anyone want to help me with this?
[23:22:02] bstorm_: ^?
[23:23:14] 👋🏻 I'm kind of foggy headed from being sick, but I checked back in for a bit, so I'll try to be helpful. What's happening?
[23:25:16] The migration instructions leave a bit to be desired...
[23:25:43] kaldari: what are you having trouble with?
[23:25:49] It says the final step is "$ jstart ...", but what goes after the jstart?
[23:26:16] I can give you the qstat output for each of my projects
[23:26:49] Effectively, it should be whatever you used to launch them on the Trusty grid
[23:26:58] I have no idea
[23:27:03] But that may have been a long time ago :)
[23:27:11] Most of them were set up years ago
[23:27:15] Can you give me a project name?
[23:27:20] sure
[23:27:33] The easiest option might be just looking around there for me.
[23:27:42] reftoolbar
[23:27:59] Ok, taking a look
[23:28:55] thanks!
[23:29:31] Looks like that's a webservice
[23:30:00] !log tools Rebuilding stretch Kubernetes images
[23:30:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[23:31:04] So that means this is the pertinent section https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation#Move_a_grid_engine_webservice
[23:31:40] It would be with --backend=gridengine...but I'm not sure what type yet...checking
[23:32:25] How did you find out that the job was a webservice?
[23:32:27] It's a lighttpd project.
[23:32:52] I ran `history` on the shell as the project and also noted that there's a file called `service.manifest`
[23:33:09] history showed me that it was started with the webservice start command
[23:33:21] I checked for a cron job with `crontab -l` and found none
[23:33:44] I didn't see any `jstart` command so I'm assuming nobody ran one
[23:33:47] got it! I was just looking at the wrong documentation then
[23:34:57] from the service.manifest file, it was running on lighttpd...is it a php project?
[23:35:04] yes
[23:35:35] kaldari: you should consider moving it to a php7.2 webservice on the Kubernetes cluster :)
[23:35:43] There's that :)
[23:36:06] the server kitties will thank you ;)
[23:36:42] More docs on all that https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation#Move_a_grid_engine_webservice
[23:36:56] Good luck!
[23:37:12] * chicocvenancio always has bad memories about debugging lighttpd tools
[23:37:18] Hah, never mind I sent that link above as well. I did mention that I'm foggy headed.
[23:38:01] Thanks!!
[23:40:15] So the final command that I'll run on Stretch is "webservice --backend=gridengine lighttpd start"?
[23:41:09] (If I was keeping under the same set-up)
[23:41:51] kaldari: yes, that's the most direct translation. But *nudge* `webservice --backend=kubernetes php7.2 start` is more awesome ;) Unless your tool shells out to random binaries
[23:42:11] sure, I'll give it a shot
[23:42:24] but with the less critical tools first :)
[23:49:54] Kubernetes seems to work
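
For reference, the checks described in the conversation above (look for a `service.manifest` file, then look at `crontab -l`) can be sketched as a small shell helper. This is only a rough sketch, not a Toolforge command: the function name `guess_launch_method` and its arguments are hypothetical, and it only tests whether `service.manifest` exists rather than reading what is in it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Toolforge command): guess how a tool was
# launched, using the clues mentioned in the log above.
#   $1  tool home directory to inspect
#   $2  optional crontab text; defaults to the output of `crontab -l`
guess_launch_method() {
    local tool_home="$1"
    local cron_text="${2-$(crontab -l 2>/dev/null)}"

    if [ -f "$tool_home/service.manifest" ]; then
        # A service.manifest file suggests the tool ran as a webservice.
        echo "webservice"
    elif printf '%s\n' "$cron_text" | grep -Eqv '^[[:space:]]*(#|$)'; then
        # Any crontab line that is not blank or a comment counts as a cron job.
        echo "cron"
    else
        # No clue found; check `history` by hand for an old jstart command.
        echo "unknown"
    fi
}
```

Run as the tool account, `guess_launch_method "$HOME"` prints one word. If it prints `webservice`, the commands quoted in the log apply: `webservice --backend=gridengine lighttpd start` for a lighttpd tool kept on the grid, or `webservice --backend=kubernetes php7.2 start` for a PHP tool moved to Kubernetes.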