[00:04:26] tgr: it's not the same... unless your tool is already on Kubernetes?
[00:05:01] each bastion can only see one of the two job grids; whichever matches the OS of the bastion
[00:06:11] yeah, the tutorial recommended moving tools to kubernetes so I did that
[00:06:14] tgr: if you migrated a tool directly from the job grid to the Kubernetes cluster then the nag email will whine at you again the following week because it sadly has no way to check the kubernetes namespaces for running pods
[00:06:26] but it was not clear what that means wrt. OS version
[00:06:57] ah, ok
[00:07:09] yeah, I was trying to figure out why I got the email again
[00:23:08] bd808: I don't get how making /tmp noexec improves security. one is not supposed to execute binaries from other users, and if you want to exec something yourself without being able to write anywhere, you can just map memory and jump to it
[00:23:39] "one is not supposed to" !== "it can't be done"
[00:24:08] how will it be unintentionally executed?
[00:24:53] a typical attack scenario is something convoluted like a web vuln that allows writing to /tmp followed by another vuln that gets a file in /tmp symlinked into the $PATH of others
[00:24:53] (intentional execution... is basically calling to get hacked)
[00:26:17] so let's say that is indeed an issue. how will we prevent users from creating world writable directories themselves in their home dirs?
[00:26:34] noexec on /tmp is a pretty standard security recommendation -- https://www.debian.org/doc/manuals/securing-debian-howto/ch4.en.html#s4.10
[00:26:48] and what are we going to do with /dev/shm, /data/project/.shared, /var/tmp, etc?
[00:27:39] make incremental changes when we can
[00:28:08] * zhuyifei1999_ just feels like this is another kaslr
[00:29:16] zhuyifei1999_: do you have a use case for exec from /tmp? Or just reacting to the change with questions?
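Editor's note: for anyone following the noexec discussion, a minimal sketch of how one might check which mounts carry the `noexec` option, assuming a Linux host where `/proc/mounts` lists one mount per line as "device mountpoint fstype options dump pass". The function names and the sample text are made up for illustration; they are not taken from any Toolforge tooling.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set for an exact mount point, or None if it is not a mount point."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # If the same mount point appears more than once, keep the last entry.
            found = set(fields[3].split(","))
    return found


def is_noexec(mounts_text, mount_point):
    """True only when mount_point is itself a mount point and has the noexec option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noexec" in options


# Hypothetical sample in /proc/mounts format, not copied from a real host.
SAMPLE = """\
tmpfs /tmp tmpfs rw,nosuid,nodev,noexec 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
"""

if __name__ == "__main__":
    for path in ("/tmp", "/dev/shm"):
        print(path, "noexec" if is_noexec(SAMPLE, path) else "exec allowed or not a mount point")
```

On a real host the text would come from `open("/proc/mounts").read()`. The check only matches exact mount points, so a plain directory that lives on a larger filesystem reports False even if its parent mount is noexec.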
[00:29:32] 'Be careful if setting /tmp noexec when you want to install new software, since some programs might use it for installation'
[00:29:32] both are valid, I'm just trying to understand
[00:30:35] *nod* which is what brought it up at all with musikanimal's custom ruby compile
[00:31:23] I don't have a use case that I must use /tmp. I could just write to somewhere else instead (well, instance local storage is probably limited to /var/tmp, and tmpfs is probably /dev/shm & /run/user/blah)
[00:32:11] Like I said in that ticket, we can revisit the decision. It needs discussion rather than immediate rejection however
[00:32:30] I could overwrite TMPDIR to there, and, if they are noexec'ed, then I point to NFS home dir
[00:33:07] like, with my own code it's not a must, but it would be nice if it could be easier
[00:33:19] I'm going to try to build Ruby manually next. Not sure if it's the makefile that tells it to write to /tmp or if it's rbenv
[00:33:48] musikanimal: you can overwrite TMPDIR to, say, /var/tmp
[00:34:04] oh neat-o
[00:34:20] if that fails you can use somewhere in your home dir
[00:34:23] musikanimal: and you checked that the Stretch system ruby is not good enough right?
[00:35:25] yeah that was going to be my last resort. I'm pretty sure 2.3.3 will work just fine; I'm just trying to get everything to the latest and greatest
[00:35:37] 2.3.3 is from 2016
[00:36:16] * bd808 is from the 1970's
[00:36:39] why is stretch, our brand new system, running on 3 year old software...
[00:36:59] * zhuyifei1999_ 's unpopular opinion: we should run sid
[00:37:09] (maintenance nightmare)
[00:38:26] Stretch is 18 months old and had a freeze before that, so 2-3 year old releases are about normal
[00:38:53] the real fix for these things is containers, but we aren't there yet
[00:39:15] containers with cron support!
:)
[00:39:47] and sudo & uid mapping
[00:39:53] Toolforge is full of tech debt that goes all the way back to Toolserver in 2005
[00:40:24] Choices could have been made in 2013 that would make our life better today as maintainers, but they were not
[00:41:08] was container technology that good in 2013?
[00:41:30] no, but shared hosting as a business model was also mostly dead
[00:41:57] and grid engine was already dead tech
[00:43:24] We are now in this ugly cycle of trying to make upgrades and improvements with minimal disruption to actively used services and jobs. Many tool maintainers already dislike the rate of change we have forced on them
[00:43:53] so we have to move slowly and carefully, but also make changes that are not going to be popular with everyone
[08:39:18] Hello, any idea when Beta Cluster will come out of read-only mode? Thanks.
[09:54:32] * CFisch_WMDE has a problem on one of the tools
[09:54:58] https://www.irccloud.com/pastebin/t9UbjROH/
[09:56:37] tl;dr: "You can't run this webservice, there's a gridengine service running already. You can't stop this gridengine service, there's none running"
[11:12:44] CFisch_WMDE: is there a kubernetes service?
[11:13:27] It might be webservice misbehaving, or there is a k8s webservice
[11:32:42] chicocvenancio: When I run "webservice --backend=kubernetes stop" I also get the "Looks like you already have another webservice running, with a gridengine backend"
[11:34:28] So afaik there should be no webservice running - when you check https://tools.wmflabs.org/commons-video-clicks/ it also gives you nothing there
[11:36:36] CFisch_WMDE: I'm checking it, one moment
[11:41:40] cool thanks
[11:45:04] CFisch_WMDE: it's up now, running on kubernetes. for some reason the `webservice` command didn't like the `service.manifest` file's contents.
I just renamed it to `service.manifest.old` so a new one would be created and started the service with `webservice --backend=kubernetes start`
[11:45:21] I'll open a bug report about this
[11:45:24] thanks for reporting it
[11:45:31] gtirloni: it's known
[11:45:40] ah
[11:45:45] do we have a phab task for that?
[11:45:57] Thanks a lot for fixing it gtirloni !
[11:46:16] gtirloni: Probably, not sure which
[11:46:30] chicocvenancio: ok, I'll try to find it. thanks for the heads up
[11:47:09] T216375
[11:47:09] T216375: "Looks like you already have another webservice running" failure when trying to migrate webservice - https://phabricator.wikimedia.org/T216375
[11:47:22] bd.808 made a FAQ entry about this in Wikitech as well
[11:47:28] https://wikitech.wikimedia.org/wiki/News/Toolforge_Trusty_deprecation#'webservice_stop'_says_service_is_not_running,_but_'webservice_start'_says_service_is_running
[11:47:42] cool
[11:47:55] Yep!
[13:18:34] !log shinken re-enabled puppet on shinken-02
[13:18:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Shinken/SAL
[13:36:22] !log shinken rebooted shinken-02
[13:36:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Shinken/SAL
[13:50:48] Hello everybody, how can I use virtualenv with webservice? Is there some easy way to demonstrate? (I could not find anything in docs)
[14:02:05] Dvorapa: see if this helps https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Using_virtualenv_with_webservice_shell
[14:03:46] gtirloni: yeah, I read that. But there's nothing about non-python, so not really helpful
[14:04:25] what language are you using?
[14:06:21] Dart (so perhaps some general venv, not python oriented?) and Perl (I want to fix k8s dependency failure this way)
[14:10:40] two languages I don't know anything about, sorry. Maybe open a phab task explaining your problem?
[14:11:27] Well and is there some general approach to run webservice using venv?
(because from that Python oriented help page I cannot decode it)
[14:15:10] not that I know of, you'll probably need to use some language-specific solution (i.e. the virtualenv equivalent in Perl). Maybe if you can get conda installed locally in your tool? I don't know
[14:17:35] hmm, thank you
[14:18:32] Dvorapa: each language has its own approach, unfortunately. I can currently walk through python, nodejs and rails in toolforge
[14:18:57] (using something like virtual environments on each of those)
[14:18:58] I see
[14:19:37] conda seems interesting as a cross-language tool but I haven't tried it. I just see it mentioned often... https://conda.io
[14:22:53] chicocvenancio: is there some tutorial for nodejs btw? it should be similar to Dart so I can use it for my purpose instead (it'll be limited as NodeJS is not perfect, but I'll find a way to make it work)
[14:23:21] I mean NodeJS webservice venv tutorial or help page
[14:23:54] gtirloni: conda seems really interesting, but I'm not sure how to run webservice from it
[14:24:39] Dvorapa: I haven't made one and can't find one. I don't think so
[14:25:08] hmm, never mind
[14:26:04] I use nvm to get a local node version and install packages locally
[14:26:22] I did make a small note about activating nvm with webservice here https://wikitech.wikimedia.org/wiki/Help:Node.js#Use_other_versions
[14:46:59] there is a 'generic virtualenv'. it's called 'gentoo prefix'. it's self-contained but takes forever to set up
[14:47:16] so not really suggesting it
[14:54:14] zhuyifei1999_: wow, this seems really cool, never heard about it. I still don't know how to run webservice from it, same as with conda, but I guess I can avoid it and run an apache from it adjusted to my needs
[14:59:24] takes forever to set up... yeah, that's the large downside of it.
But it is a little bit expected
[15:13:27] !log tools shutdown tools-puppetmaster-01
[15:13:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:59:42] !log tools started tools-puppetmaster-01 (new size: m1.large)
[15:59:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:29:05] !log tools upgraded and rebooted tools-puppetmaster-01 (new kernel)
[16:29:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:16:48] Hi, I just ran into the limit of max_user_connections on labsdb replicas. I cannot even log in to now flush my inactive connections. I don't mind all my inactive processes being killed. Is it possible for someone with privileges to flush my connections for user "u2443"? Thanks in advance.
[19:17:37] (I was just prototyping a connection in IPython and accidentally spun up more than 10 connections) [Sorry :( ]
[19:55:18] notconfusing: you can say ! help to get someone who can
[19:56:34] !help I just ran into the max_user_connections on labsdb replicas. I cannot even log in to now flush my inactive connections. I don't mind all my inactive processes being killed. Is it possible for someone with privileges to flush my connections for user "u2443"? Thanks in advance.
[19:56:34] notconfusing: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[19:57:02] I can take a look
[19:58:22] notconfusing: I don't see them right now as active...
[19:58:27] Can you try again?
[19:59:01] Hmm, I see now that even if I use a different replica.my.cnf file I get the same error. Do you think it's counting connections from my VPS?
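Editor's note: one way to avoid piling up idle connections while prototyping is to cap how many a process may hold and to close each one deterministically. The sketch below is only an illustration under assumptions: `ConnectionBudget` and `FakeConnection` are hypothetical names, the limit is a parameter rather than the replicas' real max_user_connections value, and `connect` stands in for whatever zero-argument factory your database driver would give you.

```python
import threading
from contextlib import contextmanager


class ConnectionBudget:
    """Cap how many connections this process holds open at the same time."""

    def __init__(self, connect, limit):
        self._connect = connect  # zero-argument factory returning an object with .close()
        self._slots = threading.BoundedSemaphore(limit)

    @contextmanager
    def connection(self):
        # Fail fast instead of opening one more connection than the budget allows.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("connection budget exhausted")
        try:
            conn = self._connect()
            try:
                yield conn
            finally:
                conn.close()  # always closed, even if the body raised
        finally:
            self._slots.release()


class FakeConnection:
    """Hypothetical stand-in for a real database connection, used for the demo."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


if __name__ == "__main__":
    budget = ConnectionBudget(FakeConnection, limit=2)
    with budget.connection() as first, budget.connection() as second:
        try:
            with budget.connection():
                pass
        except RuntimeError as err:
            print("third connection refused:", err)
    print("closed after use:", first.closed and second.closed)
```

With a real driver, `connect` could be a small function or `functools.partial` that opens the connection with your credentials file; the budget itself does not care which driver is behind it.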
[19:59:18] Any connections to that server would be counted
[19:59:31] But it would be by user
[20:00:09] When I just checked, I literally saw 0 connections running for that user, though
[20:00:42] Thanks bstorm_ for checking, any other thoughts on what to try?
[20:02:23] Are you just running an interactive process?
[20:02:40] Or does your app potentially grab a connection pool higher than 20?
[20:03:01] `select * from information_schema.processlist where user='u2443';` comes up dry
[20:08:30] My app should just grab 2 of these at a time. But maybe it's iterating over it too quickly and it's temporary. OK thanks!
[20:08:38] That might be it...
[20:08:47] np
[20:31:42] bd808: https://www.mediawiki.org/wiki/Wikimedia_Technology#Cloud_Services_(WMCS) is outdated, correct?
[20:43:08] bd808: Srishti was right, you need to register the revision for the schema used in the gadget. I think https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/extension.json#L96 seems like a good place to do it, any objections?
[21:02:00] Kb03: sort of, yes. The WMCS team still exists, but the budget is now owned by the Technical Engagement team -- https://www.mediawiki.org/wiki/Wikimedia_Technical_Engagement
[21:03:19] milimetric: that's needed for a schema that is only used from javascript? I thought the registration parts were only about calling EventLogging from PHP code?
[21:04:26] bd808: yeah, both JS and PHP need to know what revision to send with each event. And that example from extension.json is how you configure the revision
[21:04:50] otherwise it would have to use the latest revision or something, and that's not what clients would want
[21:06:29] * bd808 reads https://www.mediawiki.org/wiki/Extension:EventLogging/Programming#How_it_works and sees this info
[21:09:33] milimetric: so no objections from me for adding to WikimediaEvents.
We really don't have a better place to put the config for this one since it's a gadget and we apparently don't use wmf-config settings for these registrations
[21:20:34] ok bd808, makes sense, I guess Srishti will send a patch
[21:34:28] fwiw I've been using `pip install pywikibot` for a year or two now with no issues
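Editor's note: on the schema revision point raised earlier in the log, the idea is that a client sends a pinned revision with each event, so the schema name has to be mapped to a revision somewhere. The following is a rough sketch of that idea only. It is not the EventLogging API; the schema name, the revision number, and `build_event` are all hypothetical.

```python
# Hypothetical registry: schema name -> pinned revision id.
SCHEMA_REVISIONS = {"ExampleGadgetSchema": 12345}


def build_event(schema, payload, revisions=SCHEMA_REVISIONS):
    """Wrap payload with the schema name and its registered revision."""
    if schema not in revisions:
        # Without a registered revision the client cannot know which one to send.
        raise KeyError(f"schema {schema!r} has no registered revision")
    return {"schema": schema, "revision": revisions[schema], "event": dict(payload)}


if __name__ == "__main__":
    print(build_event("ExampleGadgetSchema", {"action": "click"}))
```

Pinning the revision in one place means every caller sends the same, known revision rather than guessing at the latest one.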