[09:49:33] I keep getting 504 Gateway timeout when accessing https://tools.wmflabs.org/copyvios, is that a known issue?
[10:02:00] !help
[10:02:00] CurbSafeCharmer: If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-team
[10:02:11] hi there!
[10:02:25] CurbSafeCharmer: let me investigate what's going on
[10:02:31] Thanks
[10:06:00] the tool may be overloaded
[10:07:26] arturo I'm getting 503s now
[10:08:11] !log tools.copyvios restart webservice, because I see in the logs `uWSGI listen queue of socket ":8000" (fd: 7) full !!! (101/100)`
[10:08:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.copyvios/SAL
[10:09:06] something is wrong with the source code perhaps
[10:09:16] open("/usr/lib/uwsgi/plugins/python3_plugin.so"): No such file or directory [core/utils.c line 3664]
[10:09:16] !!! UNABLE to load uWSGI plugin: /usr/lib/uwsgi/plugins/python3_plugin.so: cannot open shared object file: No such file or directory !!!
[10:09:16] [uWSGI] getting INI configuration from /data/project/copyvios/www/python/uwsgi.ini
[10:09:23] or some py2 vs py3 thing?
[10:10:46] anyway CurbSafeCharmer it seems to work now?
[10:11:48] just testing
[10:12:47] yes, working, though slower than usual
[10:13:18] arturo thanks for your help.
[10:13:41] according to https://grafana-labs.wikimedia.org/d/toolforge-k8s-namespace-resources/kubernetes-namespace-resources?orgId=1&var-namespace=tool-copyvios&refresh=5m&from=now-1h&to=now metrics may indicate this tool needs higher CPU limits
[10:14:09] CurbSafeCharmer: I suggest you contact the tool owner to let them know about this
[10:14:47] ok, not sure who that is but will try to find out
[10:22:56] Earwig, arturo points to needing a higher CPU limit on copyvio.
[11:04:49] @arturo I'm getting 504s again
[11:05:27] :-/
[11:31:14] I could try poking more but I would really like to have the tool developer around
[11:31:54] @arturo it is Earwig. I've invited them here and also left a message on their talk page.
[11:37:20] CurbSafeCharmer: probably also open a phabricator task
[13:22:28] !log tools relocating tools-sgewebgrid-lighttpd-0914 to cloudvirt1012 to spread same VMs across different hypervisors
[13:22:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:29:43] CurbSafeCharmer: arturo re copyvios the tool tends to be slow normally probably due to all the things it must do, i wonder if it 504s due to the long load times
[15:43:24] The python uwsgi error messages are confusing but not a bug. For historical reasons the config is shared between py2 and py3 so uwsgi tries to load both runtimes. One or the other will fail to load in Kubernetes.
[15:44:18] That error is a soft error though. The runtime matching the container loads and runs.
[17:43:59] oh it seems we have a ticket already T244107
[17:44:00] T244107: Copyvios tool webservice failed to start on new Kubernetes cluster - https://phabricator.wikimedia.org/T244107
[18:53:53] !log tools T168677 created DNS TXT record _psl.toolforge.org. with value `https://github.com/publicsuffix/list/pull/970`
[18:53:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[18:53:56] T168677: Add new Cloud Services domains to public suffix list - https://phabricator.wikimedia.org/T168677
[20:02:24] bd808: thanks, `webservice migrate` worked perfectly over the weekend for me. I did notice that if the webservice is still starting up, it's now a 503 error instead of 502, which I think is better?
[20:03:54] arturo: on the publicsuffix PR, do we have a generic cloud-admin@wikimedia.org address that could be provided instead of individuals?
[22:21:29] legoktm: we have a cloud-admin mailing list
[22:21:58] anyway I don't expect anybody to send any email...
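
Note on the uWSGI messages quoted at 10:08:11 and 10:09:16: per the explanation at 15:43:24 and 15:44:18, the plugin-load failure is a soft error, while the full listen queue is what prompted the restart. A minimal sketch of telling the two apart when skimming a tool's log might look like the following. The function name `classify_uwsgi_line` and the severity labels are hypothetical, and the patterns are derived only from the exact lines quoted in this log, so other uWSGI versions may word these messages differently.

```python
import re

# Patterns taken from the two uWSGI messages quoted in this log.
# Assumption: the wording is stable; other uWSGI versions may differ.
LISTEN_QUEUE_FULL = re.compile(
    r'listen queue of socket "(?P<socket>[^"]+)" \(fd: \d+\) full !!! '
    r'\((?P<used>\d+)/(?P<limit>\d+)\)'
)
PLUGIN_LOAD_FAILED = re.compile(
    r'UNABLE to load uWSGI plugin: (?P<path>\S+?):'
)


def classify_uwsgi_line(line):
    """Return (severity, detail) for one log line.

    The severity labels are this sketch's own convention, not uWSGI's.
    """
    match = LISTEN_QUEUE_FULL.search(line)
    if match:
        # Backlog is full: requests are queuing faster than workers serve them.
        detail = "socket {} backlog {}/{}".format(
            match["socket"], match["used"], match["limit"])
        return ("overloaded", detail)
    match = PLUGIN_LOAD_FAILED.search(line)
    if match:
        # Described in the log as a soft error: the runtime matching
        # the container still loads and runs.
        return ("soft", "plugin not loaded: " + match["path"])
    return ("other", line.strip())
```

Fed the line from 10:08:11, this reports the socket and the backlog figures; fed the `UNABLE to load uWSGI plugin` line, it reports the plugin path and marks it as soft, so it can be ignored while chasing the 504s.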