[00:43:49] hey, https://admin.toolforge.org/tools is not working, Error ID: 3mcmm7t5-66899867, is it a known error?
[00:45:07] blerg. it was working :/ I'll take a look
[00:45:21] ty
[00:53:20] !log tools.admin Hard restart
[00:53:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.admin/SAL
[00:54:18] dont|panic: should be fixed, and as a bonus I think I know why it broke :)
[00:54:50] ty :D
[00:55:05] yep, looks good :)
[00:57:58] !log tools.admin Hard restart. Deploying 9d497a7 (Toolinfo: do not require a trailing / for toolforge.org urls)
[00:57:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.admin/SAL
[01:26:53] !log tools.sigma fix hardcoded /sigma paths so summary.py tool works again
[01:26:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sigma/SAL
[01:27:16] oops, I thought that would show up under my name
[02:10:11] hi everyone
[02:10:16] now that we use toolforge.org now
[02:10:21] how do user emails change?
[02:10:25] is it still sigma@tools.wmflabs.org
[02:10:33] or should i use sigma@toolforge.org (?)
[02:11:47] oh
[02:11:48] Exceptions to service migration
[02:11:48] Toolforge email will still be operational under the legacy address formats and is not currently being updated:
[02:11:48] tools.toolname@tools.wmflabs.org
[02:11:48] toolname.maintainers@tools.wmflabs.org
[02:11:50] just kidding found it
[02:11:52] thanks all
[03:07:21] !log tools.sigma Switch to Python 3.7/k8s
[03:07:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sigma/SAL
[07:59:18] I am not an spammer, i am an real person. My request id is 845, and i need to approve me a request.
[08:03:24] I am not an spammer, i am an real person. My request id is 845, and i need to approve me a request.
[08:05:56] I am not an spammer, i am an real person. My request id is 845, and i need to approve me a request.
[08:06:12] In toolforge.
[08:07:49] bd808 I am not an spammer, i am an real person. My request id is 845, and i need to approve me a request. In Toolforge.
[08:17:50] I am investigating a spike of HTTP 502 Bad Gateway errors on a toolforge project (https://wdreconcile.toolforge.org/)
[08:18:38] pintoch This has been resolved.
[08:18:53] have you got a link to share about that?
[08:18:59] https://wdreconcile.toolforge.org/
[08:19:02] pintoch
[08:20:13] the spike seems to have started over the past 24 hours
[08:20:26] hmmm
[08:20:52] is it running on kubernetes or in the grid engine?
[08:22:48] Majavah Toolforge project is running in Kubernetes.
[08:23:08] Majavah And i need to approve me a request id 845 for Toolforge.
[08:23:44] Majavah: it is on kubernetes. My investigation so far: https://phabricator.wikimedia.org/T257405
[08:25:02] Majavah And i need to approve me a request id 845 for Toolforge.
[08:26:18] Jack_Frost need to approve me a request id 845 for Toolforge.
[08:26:27] JAA10: if you're really JAA, say so in the IA channel
[08:26:56] legoktm I are really JAA, but i need to approve my request id 845.
[08:27:13] legoktm For ToolForge.
[08:27:27] yeah no
[08:27:44] legoktm Approve me a request now.
[08:27:44] pintoch: I assume you've checked uwsgi.log and `kubectl pods pod_name`
[08:28:31] legoktm I need to approve my request id 845 now for Toolforge.
[08:28:51] Declined.
[08:28:54] Bye.
[08:30:20] (I pm'd the real JAA to verify that it was an imposter)
[08:31:11] pintoch: that command above for kube logs was wrong, it's "kubectl logs pod_name" where pod_name comes from "kubectl get pods"
[08:31:24] 502 from openresty usually means that k8s can't talk to the underlying app
[08:40:38] Majavah: somehow I have never seen any logs from the webservice pod with this command (perhaps I should change the logging configuration)
[08:41:23] legoktm: that is also my experience, but the puzzling thing is that it only happens for some URLs, for which the 502 error is returned immediately
[08:41:37] pintoch: that's usually empty but I've seen some strange things appear there sometimes
[08:41:48] viewing the uwsgi.log file is usually more helpful
[08:42:47] yes indeed
[08:43:30] the weird thing is that uwsgi.log misses recent (and successful) requests
[08:45:00] it is 20M big, too. By the way, I have been wondering whether it can be an issue if uwsgi.log gets too big?
[08:46:06] it can grow to hundreds of megabytes pretty quickly on this tool, so I regularly delete it and restart the webservice
[09:09:37] Hello, i can deploy Wikimedia in my own datacenter???
[09:13:00] arturo My request id 846 needs to approved.
[09:15:16] sure, let me approve it
[09:16:12] arturo: from same person: https://phabricator.wikimedia.org/T257411
[09:16:29] any objections if I just decline and disable their phab account?
[09:16:47] Majavah: yes please
[09:17:43] arturo: done
[09:18:41] thanks!
[11:08:46] !log toolsbeta live-hacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/610029 (T234617)
[11:08:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[11:08:49] T234617: Toolforge. introduce new domain toolforge.org - https://phabricator.wikimedia.org/T234617
[11:11:49] !log tools live-hacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/610029 (T234617)
[11:11:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:16:39] !log tools merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/610029 -- important change to front-proxy (T234617)
[11:16:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:16:43] T234617: Toolforge. introduce new domain toolforge.org - https://phabricator.wikimedia.org/T234617
[11:31:07] !log tools.citation-template-filling fix .lighttpd.conf after domain switch
[11:31:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.citation-template-filling/SAL
[11:31:38] !log tools.citation-template-filling fix .lighttpd.conf after domain switch (T257421)
[11:31:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.citation-template-filling/SAL
[11:56:36] And me give me a op.
[16:11:46] arturo: Hi! I know our failure to plan doesn't make an emergency for you, but I didn't realize your 15:00 UTC meeting when quota reviews happen wouldn't occur this week, would it be possible to review T257336? This isn't a typical quota increase, as there is specific hardware installed and only available to that project that we can't boot due to the quota limits
[16:11:47] T257336: Request increased quota for wikidata-query Cloud VPS project - https://phabricator.wikimedia.org/T257336
[16:12:34] ebernhardson: sorry, yes, you are right. I can take a look
[16:12:51] lol. that's a fun bug ebernhardson
[16:13:10] thanks!
[16:14:02] The disk bit doesn't matter (disk is not really quota'ed). But yes we should let you use your hardware ;)
[16:14:22] pfft
[16:14:25] steal it for other projects ;)
[16:14:30] lol
[16:14:45] I hear that zppix bot needs a lot of capacity
[16:22:16] !log tools.listpages Hard restart to correct webservice state tracking (T257417)
[16:22:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.listpages/SAL
[16:27:54] !log tools.wpcleaner Hard restart to regenerate lighttpd.conf and recreate Ingress objects (T257384)
[16:27:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wpcleaner/SAL
[16:36:39] !log tools.wdreconcile Hard restart to reset Ingress objects (T257405)
[16:36:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wdreconcile/SAL
[16:56:12] ebernhardson: I'm going off for today now. Hopefully other folks in my team can take a look today!
[17:01:15] alright, thanks!
[17:03:20] ebernhardson: do I need to make the instance for y'all too? I think I do?
[17:04:52] bd808: not sure, i tried to start the instance and it was denied for quota. Mostly i was going to try a second time and see what happens
[17:05:04] yeah, give it a shot
[17:05:07] ok, one sec
[17:06:10] !log wikidata-query Added self (bd808) as projectadmin
[17:06:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikidata-query/SAL
[17:08:22] bd808: It's starting, thanks!
[17:08:36] ebernhardson: that instance is being built on cloudvirt1024 though
[17:08:42] :S
[17:08:57] feel free to delete and re-try, i'm not sure what i should be selecting instead
[17:09:26] I think you did what you could do. I think we need to change how your custom flavor works
[17:09:40] I'll get it sorted! :)
[17:09:49] thanks
[17:20:36] !log tools.refill-api Deleted pod to restart the app T257471
[17:20:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.refill-api/SAL
[18:30:42] i'm trying to push my tool to gerrit and it says it requires git-review 1.27. is that something that is provided by toolforge or do i have to update that myself?
[18:33:13] gifti: you will have to update yourself please. needing the newer version is a consequence of Gerrit recently being upgraded to 3.2
[18:33:55] gifti: here are docs on how to install it https://www.mediawiki.org/wiki/Gerrit/git-review#Installation
[18:34:44] mutante: I think they are talking about git-review that is installed on toolforge
[18:34:50] it's somehow using 1.25
[18:35:03] Majavah: ooh. thanks for pointing that out
[18:35:13] * Majavah files a task
[18:35:31] gifti: ^ ignore my comment then. i said that because it was a common request right after the Gerrit upgrade but unrelated to toolforge
[18:35:40] Majavah: thanks for making a task
[18:36:56] created https://phabricator.wikimedia.org/T257496
[19:09:56] part of me wants to scream that having git-review on toolforge is bad, but I guess since T64871 installed it we should try to make it work
[19:09:57] T64871: Install git review on tools - https://phabricator.wikimedia.org/T64871
[19:13:08] well at least don't leave a broken version there :P
[19:13:26] "broken" is pretty relative. :)
[19:14:30] and we have 1.25 because that is what Debian provides in Stretch. Upgrading Toolforge to Buster is a much bigger project than most folks would guess
[19:15:10] "JuSt ReBuIlD tHe VmS!1!"
[19:17:29] ebernhardson: wcqs-beta-01 is building and this time it is on cloudvirt-wdqs1002.eqiad.wmnet where it belongs :)
[19:18:17] ebernhardson: we ended up making the flavor smaller to fit on your hypervisors, but it is still within what we promised you in the planning tasks
[19:23:28] !log quarry `framawiki@quarry-web-01:/tmp$ find /tmp/* -mtime +360 -user www-data -exec sudo rm -v {} \;` 778 files deleted for 10G.
[19:23:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Quarry/SAL
[19:24:01] Framawiki: yikes that's a lot of leaked tmp garbage
[19:24:15] are there things that really need to live there for a year?
[19:24:50] We have a puppet module to setup tmpreaper that could be added into the quarry puppet stuff
[19:25:17] bd808: of course not. As usual I forgot to bind the task in the log, so here it is: https://phabricator.wikimedia.org/T238375
[19:26:28] thanks for tmpreaper, noted
[19:26:40] as a person who has created many tmp resource leaks, I empathize with the upstream maintainer :)
[19:28:02] hmmm.. maybe we don't have a nice module for it yet. But we do set it up in ::profile::toolforge::grid::node::web
[19:35:13] Framawiki: I haven't said this in a while, so thank you very much for working on Quarry :)
[19:38:30] bd808: I was actually suspicious of the flavor sizing, but wasn't sure what exactly it was supposed to be. This will work great, thanks!
[19:39:10] ebernhardson: I think we made it to fill up a "normal" hypervisor without remembering that your hardware is not as huge :)
[19:39:33] makes sense
[19:57:52] bd808: not sure if a 5min session of workaround is "working on the project" :p and for once I'm faster than zhuyifei1999_
[19:58:31] what was the message? irccloud had an outage and I didn't see
[19:59:41] zhuyifei1999_: just working on T238375
[19:59:41] T238375: quarry-web-01 leaks files in /tmp - https://phabricator.wikimedia.org/T238375
[20:09:01] Hi, I've restarted the templatehoard server using legoktm's recommendations for a Rust webserver (kubernetes backend, type golang111), but am now getting 502 messages that seem to be caused by my server not finding files whose paths involve a symlink. Is this expected? I don't see anything in the documentation about it.
[20:10:13] Framawiki, bd808: feel free to go ahead with that
[20:10:35] now that I'm thinking about it, why doesn't xlsxwriter use anon inodes?
[20:13:29] (The 502 response is because one of the endpoints crashes when it should return an error message. I will fix that, but still curious why symlinks don't seem to work.)
[20:20:30] Erutuon: symlinks *should* work. Do you have logs or other things to tell you which ones are failing to resolve from inside the container?
[20:51:59] !log tools.totoazero deployed ee28dfd maj_articles_recents.py
[20:52:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.totoazero/SAL
[20:56:51] bd808: My logging aren't especially clear, but initially Kubernetes wasn't finding ~/bin/server (originally a symlink to ~/git/templatehoard-server/target/release/templatehoard-server) and now when the server is running, it isn't finding various files under ~/git/templatehoard-server/cbor. I'm not entirely sure that it's a problem with Kubernetes though, since I'm running the server using
[20:56:53] ~/git/server.sh, which cds to ~/git/templatehoard-server before running the server. Probably I need to uncomplicate all of this somehow.
[20:59:40] *my logging isn't ^.^
[21:04:26] Anyway, when I start server.sh with the Grid Engine (generic), it works; with Kubernetes (golang111) it doesn't.
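One way to approach the question of which symlinks fail to resolve is to list every symlink under a directory whose target does not exist in the environment where the check runs. The following is only a minimal sketch and was not part of the discussion: `find_broken_symlinks` is a hypothetical helper name, and it assumes a Python 3 interpreter is available wherever it is run, which may not be true inside a golang111 container.

```python
import os
import sys
from pathlib import Path


def find_broken_symlinks(root):
    """Return sorted (link, target) pairs for symlinks under root whose targets do not exist."""
    broken = []
    # followlinks=False: report symlinked directories but do not descend into them.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # Path.exists() follows the link, so it is False for a dangling symlink.
            if path.is_symlink() and not path.exists():
                broken.append((str(path), os.readlink(path)))
    return sorted(broken)


if __name__ == "__main__":
    # The directory to scan is taken from the command line; "." is an assumed default.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for link, target in find_broken_symlinks(root):
        print(f"{link} -> {target} (does not resolve)")
```

Running the same check in both environments and comparing the output would show whether a link target that exists on the grid is missing from the container's view of the filesystem.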
[21:48:50] !log tools.lexeme-forms deployed b65c1018ff (translation update)
[21:48:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[22:27:19] hello! lately I've noticed login.toolforge.org is very slow. It will freeze halfway through typing something. Is this happening for anyone else? I have not had this problem on VPS
[22:29:17] sometimes people do things on that host they shouldn't
[22:31:01] musikanimal: Right at this moment?
[22:31:08] yes
[22:31:09] looks to be... login is sloooooow
[22:31:14] high load
[22:31:29] root login is much faster
[22:31:35] yeah I just tried dev.toolforge.org and it seems to be OK. I'll use that for now
[22:35:35] https://grafana.wikimedia.org/d/ykpqNajZk/cloud-nfs-stats?panelId=254&fullscreen&orgId=1 shows a weird NFS traffic pattern beginning around 18:33-18:52 ?
[22:43:31] wurgl's scp keeps coming up on iotop
[22:43:50] wurgl 29498 0.1 0.0 38796 2864 ? DNs 22:33 0:00 | \_ scp -f /data/project/persondata/spielwiese/dump_de_td_data.bz2
[22:44:08] 29498 be/6 wurgl 0.00 B/s 0.00 B/s 0.00 % 99.99 % scp -f /data/project/persondata/spielwiese/dump_de_td_data.bz2
[22:45:48] Reedy, did you kill that?
[22:45:54] No
[22:46:35] okay well I went to kill it and found the process was gone and I could ls my home dir quickly
[22:46:55] naturally finished?
[22:47:07] maybe yeah