[01:03:00] !help
[01:03:00] Davod: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[01:03:28] Davod: what's up?
[01:04:25] Hello, I can report that I successfully started the Webservice on webarchivebot, but only by explicitly invoking the parameters for Kubernetes
[01:05:17] However, Kubernetes returns an error when starting a continuous job on a PHP7.2 Docker image.
[01:06:34] tools.webarchivebot@tools-bastion-03:~/bin$ kubectl get pods
[01:06:35] NAME READY STATUS RESTARTS AGE
[01:06:35] webarchivebot-3613232555-sfb57 1/1 Running 0 53s
[01:06:35] webarchivebot-backend-4187794248-gdjtv 0/1 CrashLoopBackOff 3 1m
[01:06:35] tools.webarchivebot@tools-bastion-03:~/bin$ nano WebArchiveBOT.sh
[01:06:35] tools.webarchivebot@tools-bastion-03:~/bin$ nano WebArchiveBOT.sh
[01:07:49] You can use `kubectl logs ` to see what the error is
[01:08:34] looks like "no such file or directory". Where is the deployment template that you used to start the job?
[01:08:51] This is the Docker image I used:
[01:08:52] docker-registry.tools.wmflabs.org/toollabs-php72-web:latest
[01:10:01] how did you tell Kubernetes to start it? -- https://wikitech.wikimedia.org/wiki/Help:Toolforge/Kubernetes#Kubernetes_continuous_jobs
[01:10:37] Let me see
[01:12:54] container_linux.go:247: starting container process caused "exec: \"/usr/bin/php -c /data/project/webarchivebot/bin/webarchivebot.ini -f /data/project/webarchivebot/bin/main.php\": stat /usr/bin/php -c /data/project/webarchivebot/bin/webarchivebot.ini -f /data/project/webarchivebot/bin/main.php: no such file or directory"
[01:13:13] This is kubectl logs
[01:13:32] right. that's where I saw the "no such file or directory" message
[01:14:06] In the deployment file, the command section I set
[01:14:08] command: [ "/usr/bin/php -c $CONF_FILE -f $DIR/main.php" ]
[01:15:43] Now, it is
[01:15:44] command: [ "/usr/bin/php", "-c $CONF_FILE", "-f $DIR/main.php" ]
[01:17:49] tools.webarchivebot@tools-bastion-03:~/bin$ kubectl logs webarchivebot-backend-999730792-a34tz
[01:17:49] Could not open input file: /data/project/webarchivebot/bin/main.php
[01:18:08] I got this error message, however, /data/project/webarchivebot/bin/main.php exists
[01:20:16] AAaaaaahhhh...
[01:20:33] In the documentation, I see:
[01:20:34] This deployment:
[01:20:34] Uses the 'stashbot' namespace that the tool is authorized to control
[01:20:34] Creates a container using the 'latest' version of the 'docker-registry.tools.wmflabs.org/toollabs-python2-base' Docker image.
[01:20:34] Runs the command /data/project/stashbot/bin/stashbot.sh run when the container starts.
[01:20:34] Mounts the /data/project/stashbot/ NFS directory as /data/project/stashbot/ inside the container.
[01:20:56] Davod: if you start an interactive pod, does the command work? That would be something like: webservice --backend=kubernetes php7.2 shell and then your command
[01:21:37] I see that /data/project/webarchivebot/bin/main.php is a symlink which I think should work but I'm wondering if that is somehow causing problems
[01:22:01] The problem with my bot is that the NFS share is mounted *after* running the given command.
[01:22:16] The symlink target is inside my home directory.
[01:22:48] main.php -> ../git/WebArchiveBOT4/bin/main.php
[01:22:57] tn volumes are setup before the command is run. That order in the help page is just to match the order of the parts of the definition
[01:23:15] ahhh, thanks for the clarification
[01:24:43] Strange
[01:25:55] So, I got rid of the symlinks. I'll try again
[01:27:46] Well, apparently the problem was caused by symlinking those files
[01:28:08] interesting
[01:29:15] Nope, it failed again
[01:29:36] tools.webarchivebot@tools-bastion-03:~/bin$ kubectl logs webarchivebot-backend-999730792-0y946
[01:29:37] Could not open input file: /data/project/webarchivebot/bin/main.php
[01:32:36] "/usr/bin/php", "-c $CONF_FILE", "-f $DIR/main.php"
[01:34:55] hmm, is someone trying to connect github to this channel?
[01:46:21] So, the error persists
[01:46:21] Could not open input file: /data/project/webarchivebot/bin/main.php
[01:46:46] The deployment file:
[01:46:47] command: [ "/usr/bin/php", "-c $CONF_FILE", "-f $DIR/main.php" ]
[01:47:16] command: [ "/usr/bin/php", "-c /data/project/webarchivebot/bin/webarchivebot.ini", "-f /data/project/webarchivebot/bin/main.php" ]
[01:47:48] Davod: checking what works with the shell is your best option for debugging that
[01:48:22] OK
[01:48:37] Following is the generated deployment file:
[01:48:39] kind: Deployment
[01:48:39] apiVersion: extensions/v1beta1
[01:48:39] metadata:
[01:48:39] name: webarchivebot-backend
[01:48:39] namespace: webarchivebot
[01:48:40] spec:
[01:48:42] replicas: 1
[01:48:44] template:
[01:48:48] metadata:
[01:48:50] labels:
[01:48:52] name: webarchivebot-backend
[01:48:54] spec:
[01:48:56] containers:
[01:48:58] - name: webarchivebot-backend
[01:49:00] image: docker-registry.tools.wmflabs.org/toollabs-php72-web:latest
[01:49:02] command: [ "/usr/bin/php", "-c /data/project/webarchivebot/bin/webarchivebot.ini", "-f /data/project/webarchivebot/bin/main.php" ]
[01:49:05] workingDir: /data/project/webarchivebot
[01:49:07] env:
[01:49:08] pastebins are a good thing
[01:49:09] - name: HOME
[01:49:11] value: /data/project/webarchivebot
[01:49:13] imagePullPolicy: Always
[01:49:15] volumeMounts:
[01:49:19] - name: home
[01:49:21] mountPath: /data/project/webarchivebot
[01:49:53] Davod, please use a pastebin
[01:49:59] there is a wikimedia one at https://phabricator.wikimedia.org/paste/edit/form/14/
[01:50:12] there are also others like pastebin.com
[01:50:57] or paste.ubuntu.com
[01:51:00] The next thing I would try is changing the command definition to properly quote the arguments: [ "/usr/bin/php", "-c", "/data/project/webarchivebot/bin/webarchivebot.ini", "-f", "/data/project/webarchivebot/bin/main.php" ]
[01:51:29] I'm not sure how 'smart' the parser is for that
[01:51:34] https://phabricator.wikimedia.org/P7942
[01:51:39] thanks
[01:51:55] Let me try that
[01:53:37] Apparently the problem was the parameters...
[01:55:40] that github bot is annoying :/
[01:57:23] As I see it, is it totally necessary to validate the command when I start the container?
[01:58:10] And is that validation implemented upstream, or only by Toolforge?
[01:58:59] Davod: I'm not sure I understand your question. Can you rephrase it?
[02:01:42] I'm talking about the deployment file parsing. The section "command" gives the error if the path is not found. So, does that implementation come from Kubernetes upstream, or from the Wikimedia Labs developers?
[02:02:29] It comes from the Docker runtime that the Kubernetes deployment is managing
[02:03:05] but it is related to the base docker image you are using too I suppose. The parsing is all Docker
[02:03:13] mmm
[02:08:25] Well, the last time, the tool worked correctly. So, I rolled back to creating symlinks for the PHP files, and again I got CrashLoopBackOff
[02:08:36] No logs are given when trying to get them
[02:11:01] I just removed the symlink and recreated the file called from the command section
[02:11:09] without killing the pod
[02:11:57] But it still has not restarted after 5 attempts
[02:13:35] Davod: this may help -- https://kubernetes.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/
[02:13:43] * bd808 -> dinner
[02:20:46] !log discourse rebuilding MW Discourse after applying https://github.com/tgr/discourse-chat-integration/commit/dffc22f9f71d77b41a3c794307841952eb5909b4
[02:20:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Discourse/SAL
[02:24:14] dang, github does not expose commits by ID?
[02:28:37] Once again, the pod gets error:
[02:28:38] webarchivebot-backend-999730792-b098o 0/1 CrashLoopBackOff 2 51s
[02:29:08] And kubectl logs webarchivebot-backend(...) outputs nothing
[02:31:50] Source code is here https://github.com/Amitie10g/WebArchiveBOT/tree/kubernetes/bin
[02:34:01] Davod: what did you change between when it worked and when it broke?
[02:35:09] In summary: the way the command is set in the deployment file, and then removing the symlinks to the actual files
[02:35:50] Deployment file: https://phabricator.wikimedia.org/P7942
[02:37:54] And when running kubectl logs, no output is given
[02:38:34] And running kubectl get pods shows "Error", then "CrashLoopBackOff"
[02:39:09] And I'm unable to access the shell
[02:40:09] when it shows CrashLoopBackOff that means that Kubernetes is not trying to run the container because it has failed too many times in quick succession
[02:40:39] you have to either wait or destroy the deployment and start over
[02:42:48] OK
[02:43:56] sometimes `kubectl get pod --output=yaml` will give you more information about the last state change
[02:44:17] Well, I'll wait some minutes and I'll start again
[02:44:29] I created the symlink, let's see what happens
[02:56:30] Well, I waited some minutes, but it still persists. Maybe tomorrow I'll try again
[03:16:32] I will let you know when I see Davod and I will deliver that message to them
[03:16:32] @notify Davod Something got broken in your php script. Running `/usr/bin/php -c /data/project/webarchivebot/bin/webarchivebot.ini -f /data/project/webarchivebot/bin/main.php` is exiting with a 255 error status and no output.
[20:16:54] !help Hi, https://codesearch.wmflabs.org/ does not load. Can you restart/reload it please?
[20:16:54] Zoranzoki21: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[20:24:57] Works for me, unless I don't understand what 'loading' is supposed to look like
[20:25:03] oh, and also they're gone :/
[20:25:07] heh
[20:25:40] Doesn't load for me
[20:25:54] Does now meh
[20:26:26] WFM too
[20:26:45] It's hard for me to tell if it's slow because I'm on train wifi
[20:27:03] Reedy: likely the slow WiFi :P
[21:00:31] loads fast for me too :D
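
Note on the command fix suggested at [01:51:00]: the `command` list in the deployment is handed to the container runtime as an argument vector with no shell in between, so nothing splits a string such as "-c /data/project/webarchivebot/bin/webarchivebot.ini" at the space. Presumably PHP then received the flag and its value fused into one argument, which would explain "Could not open input file" even though the file existed. The following is only a minimal sketch in Python of the splitting that the corrected list does by hand. `split_exec_args` is a hypothetical helper written for this note; it is not part of kubectl, Docker, or Toolforge.

```python
import shlex


def split_exec_args(command):
    """Split every element of an exec-form command into separate argv tokens.

    Hypothetical helper: it mimics the word splitting a shell would do,
    which the container runtime does NOT do for a `command` list.
    """
    argv = []
    for element in command:
        # shlex.split breaks on unquoted whitespace, like a POSIX shell.
        argv.extend(shlex.split(element))
    return argv


# The form that failed in the log: each flag is fused with its value.
fused = [
    "/usr/bin/php",
    "-c /data/project/webarchivebot/bin/webarchivebot.ini",
    "-f /data/project/webarchivebot/bin/main.php",
]

# Print one token per line; this is the list the deployment should contain.
for token in split_exec_args(fused):
    print(repr(token))
```

The printed tokens match the five-element list from [01:51:00], with each flag and each path as its own entry.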