[12:15:16] !help my tool (quickcategories) appears to have died yesterday for no apparent reason
[12:15:16] lucaswerkmeister: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[12:15:26] uwsgi.log says “SIGINT/SIGQUIT received...killing workers...”
[12:15:36] last modified yesterday 19:31 UTC
[12:16:08] the same thing seems to have happened to another tool, ordia (I can’t read its uwsgi.log but it was also last modified 19:31)
[12:16:13] lucaswerkmeister: restarting it works?
[12:16:18] haven’t tried it yet
[12:16:22] probably
[12:16:24] but I’m curious why it died :)
[12:16:27] I couldn’t find any SAL entries
[12:16:35] Grid engine?
[12:16:39] Or k8s?
[12:16:42] k8s
[12:16:55] * chicocvenancio is curious as well
[12:17:01] `webservice start` says “your job is already running”
[12:17:24] `kubectl get pods` shows the pod as “0/1” in the “ready” column, “pending” in “status”
[12:17:32] Ohh
[12:17:40] Can you describe it?
[12:17:50] describe what?
[12:17:58] `kubectl describe pod PODNAME`
[12:18:20] oh, okay
[12:18:24] https://paste.gnome.org/patdtqecq
[12:18:32] That is a question because I'm not sure Toolforge k8s permissions allows that
[12:18:33] * lucaswerkmeister is still mostly a k8s noob, sorry
[12:19:11] I can try a `webservice restart`
[12:19:15] lucaswerkmeister: is that the full output from describing it?
[12:19:26] yes
[12:19:38] sorry, except for a “no events.” line at the bottom
[12:19:43] Toolforge is on an old k8s version, so I am a bit unfamiliar with it
[12:19:52] (which doesn’t have a newline after it, so it bleeds into my prompt, that’s why I didn’t copy it)
[12:20:04] Yeah, that was the line I wanted, but populated with the events
[12:20:25] Your user probably doesn't have access to the events
[12:20:31] ah okay
[12:20:42] There is definitely an event if it is in pending mode
[12:20:49] Pending status
[12:21:07] Or at least there should be one
[12:21:27] Webservice restart is your best bet to get it up
[12:21:33] trying that
[12:21:40] “your job is not running, starting…”
[12:22:32] You can watch it with `kubectl get pods --watch`
[12:22:46] To see if it is properly scheduled and starts
[12:23:11] ah, nice
[12:23:20] `kubectl get pods` still shows the same pod, though (-xqip5)
[12:23:29] same age, too
[12:23:43] That's not a good sign
[12:23:50] You can probably delete it
[12:23:52] !log tools.quickcategories kubectl delete pod quickcategories-654583560-xqip5
[12:23:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[12:23:58] Yeah
[12:23:59] That
[12:24:06] might as well start with the per-tool SALs, haven’t played with those before :)
[12:25:06] okay, that started a new pod, which is also pending now
[12:25:13] no output from `kubectl logs`
[12:25:58] If it stays pending then there might be something wrong with k8s in Toolforge
[12:26:20] seems so
[12:27:06] I tried `kubectl exec -it … bash` but it says the pod does not have a host assigned
[12:27:16] (I guess that’s what “pending” means? the container isn’t running yet?)
[12:27:27] Yes
[12:27:50] The event would tell you why it doesn't have a node yet
[12:28:13] But I think users don't have that permission, root might
[12:28:44] okay
[12:29:47] none of the recent Phabricator tasks sound related, I guess I’ll create one
[12:30:22] It is a Sunday, and very early in the time zone of the on-call person for WMCS. So yeah, phab task and waiting seems sensible
[12:42:08] https://phabricator.wikimedia.org/T220912
[12:42:12] anyways, thanks for your help so far :)
[12:42:21] * lucaswerkmeister starts making lunch
[16:07:09] Hrm. Sounds like too many k8s workers are down
[16:20:25] I am wondering whether https://phabricator.wikimedia.org/T220912 is a Python3.4/Python3.5 problem?
[16:23:11] !log tools moved all tools-worker nodes off of cloudvirt1015 and uncordoned them
[16:23:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:27:05] I have installed python3.5 in a Python virtualenv on Toolforge, but Kubernetes seems to use python3.4. It is unclear to me how I can construct a Python3.4 virtualenv on the tool's Toolforge account.
[16:28:14] fnielsen: you need to use a k8s shell to create the venv
[16:28:55] From the tool account `webservice --backend=kubernetes python shell`
[16:31:32] ok
[16:38:34] I have gotten into a shell with "webservice --backend=kubernetes python shell", but where do I find virtualenv or similar?
[16:40:52] "python3 /usr/lib/python3/dist-packages/virtualenv.py --python=python3.4 venv3.4" seems to create a 3.4 virtualenv
[17:04:33] fnielsen: https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Flask_OAuth_tool#Create_a_Python_virtual_environment_for_the_application's_external_library_dependencies
[17:04:49] There are several ways to do it.
[17:05:33] Toolforge k8s expects the venv to be in the $HOME/www/python/venv folder
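
A rough sketch of how the virtualenv steps from the end of the log might fit together. It is not a verified Toolforge recipe. It reuses the `virtualenv.py` call quoted in the log, but points it at `$HOME/www/python/venv` instead of the `venv3.4` directory used there; combining the two is an assumption. `VENV_DIR` and `create_tool_venv` are hypothetical names introduced only for illustration.

```shell
# Sketch only: meant to be run inside the container shell opened from the
# tool account with
#   webservice --backend=kubernetes python shell
# (that command is interactive, so it is not part of this script).

# Folder mentioned in the last message of the log.
VENV_DIR="$HOME/www/python/venv"

# Hypothetical helper: wraps the virtualenv.py call quoted in the log,
# targeting VENV_DIR rather than "venv3.4" (an assumption, see above).
create_tool_venv() {
    mkdir -p "$(dirname "$VENV_DIR")"
    python3 /usr/lib/python3/dist-packages/virtualenv.py --python=python3.4 "$VENV_DIR"
}

# Nothing is created until the helper is called explicitly:
# create_tool_venv
```

The linked wikitech help page describes other ways to do this, so treat the helper as one possible arrangement and not the recommended one.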