[07:16:28] Hi folks. I have my python-based Telegram wikilinksbot running on Toolforge, and it's been working well. It needs to run continuously, and it had been running for 3 days with no problem until it stopped a few hours ago. When I enter the terminal it was running in, it just said "Killed". Any idea why?
[07:16:53] (i.e. is it likely to be some error in the code, or is there something on Toolforge that kills things that are running for too long or something?)
[07:22:27] Googling/Stack Overflow indicates that "Killed" normally comes when a script uses too much memory
[07:26:37] Maybe you need to take care of gc, I suspect.
[07:30:50] what's gc?
[07:36:21] ha, garbage collection
[07:36:24] looking into it
[08:15:31] acagastya, is it really as simple as doing `import gc` and `gc.enable()` in python, and that's it?
[08:16:52] That, I can't say for certain.
[08:17:31] python should be able to gc automatically -- and since it doesn't, maybe something somewhere in the code is not getting collected. You would have to see the logs for that.
[08:17:40] And obviously, the source code.
[12:22:35] Jhs: gc is enabled by default.
[12:22:58] is the bot running on kubernetes?
[12:23:14] They might not be discarding what is no longer needed.
[12:23:33] yes, 'memory leak'
[12:24:23] I have tools to figure out causes of memory leaks
[12:24:56] but we want to make sure it is in fact a memory leak, hence 'is the bot running on kubernetes?'
[12:26:04] Has been idle for four hours, maybe they are offline.
[12:26:45] it's fine I can wait
[12:26:55] it's morning for me :)
[12:27:19] East Coast, I suspect.
[12:27:32] no, CDT
[12:27:40] I'm in Illinois
[12:28:05] Yeah, would have been the next guess. Too early for the west coast.
[12:29:12] this is what I use to debug memory leaks: https://github.com/zhuyifei1999/guppy3
[12:29:19] in Python
[12:29:41] (yeah I'm that person who made it work on Py 3)
[12:31:27] You do realise that having 1999 in your username triggers a rush of imposter syndrome, no?
[12:32:10] * zhuyifei1999_ looks up imposter syndrome
[12:32:58] don't understand. why?
[12:34:58] Because of the nature of the primitive brain -- it starts comparing -- people are just so much better (and with 1999 in the username, possibly a younger person is so skilled), and it makes me question my life decisions.
[12:35:37] nineteen ninety nine. there isn't another way for so many nines :P
[12:37:31] 1999.999?
[12:38:20] umm okay, but you have to insert non-nine words into it
[12:38:29] like 'nine hundred'
[12:38:34] oh
[12:38:48] you mean 'point nine nine nine'
[12:39:07] 'point' isn't nine :P
[12:44:47] But why nine?
[12:46:53] why not :P you want 1888?
[12:50:05] 1444 -- that is a Fourier series right there.
[12:53:36] ? you mean https://en.wikipedia.org/wiki/Fourier_series ? isn't that about how any periodic function can be made into sinusoids?
[13:03:09] acagastya: I just saw on your twitter (yeah I'm bored) that you need a crash course on JS Promises, check out https://commons.wikimedia.org/wiki/User_talk:Zhuyifei1999/Archive_55#AjaxQuickDelete_and_Cat-a-lot_on_betacommons there's a long long explanation :P
[13:39:25] https://www.smbc-comics.com/comics/20130201.gif?w=144
[13:41:01] ah
[14:56:41] Jhs: I probably killed your bot. It was running on the login.toolforge.org bastion host rather than on the grid engine or kubernetes clusters. I killed processes from ~10 users and tools that were on the bastion yesterday.
[14:57:07] Ah, the classic newbie blunder.
[14:59:46] bd808. "b" is for bot-slayer.
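For anyone who hits a similar "Killed" that really is a memory leak: as noted above, CPython's garbage collector is enabled by default, so `import gc` plus `gc.enable()` on its own won't change anything; the useful step is to find out which objects keep accumulating. Below is a minimal sketch of the heap-diff approach using guppy3 (the tool linked at 12:29). `do_one_bot_cycle()` is a hypothetical stand-in for whatever the bot does in its loop, not the actual wikilinksbot code:

```python
import gc

from guppy import hpy  # pip install guppy3

hp = hpy()


def do_one_bot_cycle():
    """Hypothetical placeholder for one iteration of the bot's work loop."""
    pass


print("gc enabled:", gc.isenabled())  # True by default on CPython

before = hp.heap()            # snapshot of all reachable objects
for _ in range(1000):
    do_one_bot_cycle()
gc.collect()                  # clear collectable cycles so only real leaks remain
after = hp.heap()

# Objects created during the loop that are still alive, grouped by class/size.
print(after - before)
```

The printout is a table of surviving objects by class and total size; a class whose count keeps growing across runs is the likely leak, i.e. something is holding references that never get dropped.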
[15:10:04] bd808: Somehow, without using a mediawiki framework, and using that example, I was able to make the bot work (CC: zhuyifei1999_). I wrote too many async functions, but now it works.
[15:10:41] ok
[15:10:45] I am not proud of that code -- but it works: https://github.com/acagastya/enwnbot/blob/master/promUrlShortener.js
[15:11:26] globalState o.O
[15:11:34] Thanks all. (And bd808, in that case, the API is not misbehaving -- we would just need better tutorials, I think.)
[15:11:58] zhuyifei1999_: I use React -- sometimes with Redux.
[15:12:28] The global state is there to monitor whether an error occurs, so the code can fall back to another method that works.
[15:13:13] It holds an error message and two boolean values -- one for checking whether it was logged in or not.
[15:13:50] There is obviously a better way of doing this -- I didn't think this would work -- but I can finally rest now, before I start editing again.
(a rough Python sketch of that "better way" -- falling back without the global state -- appears at the end of this log)
[15:14:55] ok
[18:12:42] !log tools.paws-public applied in-place fix for non-ASCII usernames and applied this to my own version of the image T252217
[18:12:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.paws-public/SAL
[18:12:45] T252217: Public-Link of Notebooks of users with non-ASCII characters return 500 error - https://phabricator.wikimedia.org/T252217
[19:37:13] !log tools adding docker image for paws-public docker-registry.tools.wmflabs.org/paws-public-nginx:openresty T252217
[19:37:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:37:17] T252217: Public-Link of Notebooks of users with non-ASCII characters return 500 error - https://phabricator.wikimedia.org/T252217
[19:39:00] !log tools.paws-public switch deployment to the openresty version to try it out T252217
[19:39:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.paws-public/SAL
[21:23:12] !log tools.zppixbot syncing to deploy 599873 -- T233993
[21:23:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.zppixbot/SAL
[21:23:28] MacFan4000: ^
[21:24:19] !log tools.zppixbot sync done
[21:24:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.zppixbot/SAL
[21:29:05] herron, hey, elk7-fe1.logging.eqiad.wmflabs appears to be out of disk space
[21:41:30] Krenair: hey, ok thx I'll have a look next week, it's just a test host
[21:41:36] ok
[21:42:28] no rush, just noticed it after looking at an `apt-cache policy ...` across all instances
[22:07:37] !log tools.zppixbot-test purge old *.*db files, tar+gzip logs/* and nuke the pycaches
[22:07:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.zppixbot-test/SAL
[22:10:19] !log tools.zppixbot-test dropped old logs
[22:10:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.zppixbot-test/SAL
[22:10:55] !log mediawiki-vagrant Rebooting mwv-builder-03.mediawiki-vagrant.eqiad.wmflabs
[22:10:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Mediawiki-vagrant/SAL
[22:13:11] bd808: how do you see the size of a $HOME folder
[22:13:59] RhinosF1: `du -sh $HOME`, but that can be an expensive NFS operation too.
[22:14:17] RhinosF1: I'm logged in on the NFS master right now, what tool would you like checked?
[22:14:38] bd808: I was going to compare zppixbot to zppixbot-test
[22:14:52] as I've just nuked loads of caches and logs from -test
[22:15:08] zppixbot is 1.3G, -test is 127M
[22:15:26] * bd808 makes eyes at 1.3G for an irc bot
[22:15:36] bd808: what's the big stuff?
[22:16:10] 780M ZppixBot
[22:16:19] 95M envs
[22:16:38] 66M stashbot (?)
[22:17:33] bd808: 66M in a stashbot folder in our $HOME
[22:17:52] yeah. looks like some biggish venvs
[22:18:14] the main disk hog looks to be the MediaWiki deploy
[22:18:27] bd808: ah
[22:18:47] I ran some cleanup scripts on that not so long ago
[22:18:55] anything nukeable?
[22:20:46] RhinosF1: probably not in the wiki. Looks like it is mostly extensions and the giant pile of i18n files that MediaWiki ships
[22:21:25] bd808: it should be near stock
[22:22:36] I'm going to delete the stashbot folder on Monday
[22:27:44] we do have caches for things that no longer exist
[22:35:35] bd808: is there any way to stop a kubectl pod auto-recreating for a short while? Bar deleting it
[22:36:54] RhinosF1: yes, you can scale it down to 0 replicas. something like `kubectl scale --replicas=0 $(kubectl get deployment -o name)`
[22:37:11] bd808: what do you do to bring it back?
[22:37:24] same thing but --replicas=1
[22:38:27] `kubectl get deployment -o name` probably returns multiple things for zppixbot
[22:38:44] because it has a webservice deployment and a custom deployment for the bot
[22:38:55] it does
[22:39:17] so really just run `kubectl get deployment -o name` directly to figure out which deployment you want to scale
[22:39:36] and then `kubectl scale --replicas=0 `
[22:40:00] so deployment.apps/sopel.bot for us instead of
[22:41:14] yeah, that sounds right
[22:42:20] cool, I plan to stop the bot while I purge the database stuff.
[23:42:04] !log clouddb-services stopped puppet and restarting mariadb on clouddb1002 after filtering out a table T253738
[23:42:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Clouddb-services/SAL
[23:42:07] T253738: ToolsDB: master crashed, replica having consistency issues - https://phabricator.wikimedia.org/T253738
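On the promUrlShortener.js discussion from 15:10–15:14 (a global state holding an error message and login flags so the code can fall back to another method): the usual cleaner shape is to let the primary call fail and return the fallback's result directly, instead of recording the error in shared state. A rough Python asyncio analogue of that pattern -- the function names are made up for illustration, and this is not the actual enwnbot code (which is JavaScript):

```python
import asyncio


async def shorten_via_api(url: str) -> str:
    """Hypothetical primary method, e.g. a logged-in URL-shortener API call."""
    raise RuntimeError("pretend the API call or login failed")


async def shorten_fallback(url: str) -> str:
    """Hypothetical fallback method; here it just returns the URL unchanged."""
    return url


async def shorten(url: str) -> str:
    # Fall back by returning a value from the except branch, rather than
    # setting an error message and login flags on a shared global object.
    try:
        return await shorten_via_api(url)
    except Exception:
        return await shorten_fallback(url)


print(asyncio.run(shorten("https://en.wikinews.org/wiki/Main_Page")))
```

The callers only ever see a result, so there is no global error/login state to keep consistent between async functions.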
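And for the 22:35–22:42 exchange about pausing a Kubernetes-managed bot on Toolforge: the two commands bd808 gives (`kubectl scale --replicas=0 <deployment>`, then the same with `--replicas=1`) can be wrapped so the scale-up happens even if the maintenance step fails. A small sketch using Python's subprocess; the deployment name `deployment.apps/sopel.bot` is the one mentioned in the log and will differ per tool:

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def paused_deployment(name: str):
    """Scale a Kubernetes deployment to 0 replicas, then back to 1 on exit."""
    subprocess.run(["kubectl", "scale", "--replicas=0", name], check=True)
    try:
        yield
    finally:
        # Bring the bot back even if the maintenance work raised an error.
        subprocess.run(["kubectl", "scale", "--replicas=1", name], check=True)


if __name__ == "__main__":
    # Deployment name taken from the discussion above; list yours with
    # `kubectl get deployment -o name` and pick the one you want to stop.
    with paused_deployment("deployment.apps/sopel.bot"):
        print("bot is stopped; safe to purge the database files here")
```

If the deployment normally runs more than one replica, scale back to that number instead of a hard-coded 1.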