[00:44:00] [bz] (NEW - created by: Tyler Romeo, priority: Normal - enhancement) [Bug 52354] Run Minion testing instance for security testing - https://bugzilla.wikimedia.org/show_bug.cgi?id=52354
[00:49:23] "this webserver does not list directory contents by default" is there a way to enable this for a particular tool?
[00:51:05] gry: I'm not sure if there is an .htaccess override and if that is enabled; let me see.
[00:53:34] gry: Where did you get "this webserver does not list directory contents by default" from?
[00:53:45] https://tools.wmflabs.org/wmtran/v/ last line
[00:54:48] gry: Ah, okay. It works if you put "Options +Indexes" in an .htaccess in the directory to be listed.
[00:59:44] Ok. Similarly how do I disable auto-load of index.html, so it shows directory listing even if index.html exists?
[01:04:08] gry: You can add "DirectoryIndex does-not-exist.html" as another line to .htaccess (with "does-not-exist.html" being a filename that is unlikely to exist in the directory).
[01:08:01] thanks :)
[01:08:14] np
[02:24:01] Hi, I'm trying to make this script run - what am I doing wrong? https://toolserver.googlecode.com/svn/trunk/wlm/wlm2.py\
[02:24:09] (wrong link) https://toolserver.googlecode.com/svn/trunk/wlm/wlm2.py
[02:25:19] anyone around?
[02:32:13] yukon: You mean on Tools?
[02:33:38] yes scfc_de
[02:35:55] yukon: What's the URL on Tools?
[02:36:25] tools.wmflabs.org/mono/wlm2.py
[02:38:56] scfc_de: ^
[02:40:34] yukon: The server's error log shows "File "/data/project/mono/public_html/wlm2.py" is writeable by others". You need to remove the write permission for others ("world").
[02:44:40] octal is now 755
[02:46:15] yukon: Now the error is: "UID of script "/data/project/mono/public_html/wlm2.py" is smaller than min_uid". The file needs to be owned by the tool account; to do that, "become mono" to switch to the tool account, change to public_html and "take wlm2.py" to have the tool account take ownership of the file.
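Editorial note on the two errors above ("writeable by others" and the ownership complaint): a minimal Python sketch of how one might check a script for both conditions before asking in the channel. The function names are hypothetical and not part of any Tools utility; the sketch only inspects and clears standard POSIX permission bits, and it does not reproduce what "take" does.

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True if 'others' have write permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def remove_world_write(path):
    """Clear only the others-write bit; all other permission bits are kept."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IWOTH)


def owned_by_current_user(path):
    """Return True if the file's owner is the user running this check."""
    return os.stat(path).st_uid == os.geteuid()


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real script.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        demo_path = handle.name
    os.chmod(demo_path, 0o777)
    print("world-writable before:", is_world_writable(demo_path))
    remove_world_write(demo_path)
    print("world-writable after:", is_world_writable(demo_path))
    print("owned by current user:", owned_by_current_user(demo_path))
    os.remove(demo_path)
```

Run as the tool account, the ownership check would be expected to report False for a file still owned by the personal account, which is the situation "take" is meant to resolve.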
[02:47:52] hmm - in winzcp, should I change the owner to local-mono? it's set at 'mono' right now
[02:47:59] *winscp
[02:48:47] I don't think you can change the owner in winscp (or anything else besides using "take"). Does it work for you?
[02:51:09] permission denied by server when I try that
[02:51:24] Looking for advice/assistance with jsub and parallelized tasks.
[02:51:40] error code 3
[02:56:35] yukon: Then you have to use "take" on the command line.
[02:56:46] Hasteur: What do you mean by "parallelized"?
[02:58:09] scfc_de: I mean a bot process that takes an input and then does some computation. I'd like to queue up a bunch of jobs, but self restrict to only 5 running at a time.
[03:00:22] Hasteur: Any particular reason you want to restrict to 5? The grid limits to 16 IIRC, and *should* make sure that every user gets his fair share.
[03:01:20] I'm submitting a series of jobs (i.e. 30) and want to leave myself room for random pitch jobs.
[03:03:50] Hasteur: Toolserver has a grid resource "user_slot" (of 10 per user), where you could request 2 per parallel job and thus achieve at most 5 jobs at the same time, *but* we haven't implemented that in Tools yet. Should be a very minor change, so it could be done on Monday when Coren is more online.
[03:08:13] ok, trying to not let long tasks go over 24 hours (as apparently the grid kills it) but trying to subdivide and not have to babysit the queue all day long too
[03:13:40] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - normal) [Bug 52976] Provide user_slot resource in grid - https://bugzilla.wikimedia.org/show_bug.cgi?id=52976
[03:14:29] Hasteur: The grid shouldn't kill long running jobs (or any jobs for that matter :-)). When were the jobs killed?
[03:18:51] Some time around 19:57 UTC. I can tell this based off the bot not doing any more edits after that point until I fired the next run thinking that the current run had finished.
[[:en:Special:Contributions/HasteurBot]]
[03:19:17] Gap between 19:57 and 02:34 (when I started the next job)
[03:20:32] Hasteur: Job name "g13_nudge"?
[03:21:03] yep.
[03:24:26] Job #809031 ended Sat Aug 17 19:57:16 2013 with "exit_status 1".
[03:25:18] Job #812020 lasted 3 minutes around 2:33Z and exited with "exit_status 137", which is odd because that's usually out-of-memory, but maxvmem was 147.828M.
[03:26:04] But Job #809031 seems to have ended due to the bot exiting abnormally.
[03:26:24] ("qacct -j 809031" for what I can see.)
[03:27:49] #812028 again "exit_status 137", but #812076 and #812061 "exit_status 1". Do you see anything in the job's output?
[03:29:14] Strange. I know I submitted the job with 512mb requested
[03:29:19] A-ha!
[03:29:55] Another reason to not have long run executions: The data set you started with may not be valid when you're traversing down it at the speed of a glacier
[03:30:45] :-)
[03:30:49] The page the bot was going after had been deleted between the point that the list of pages was generated and the time the bot got around to evaluating it.
[03:31:33] Sorry for all the hubbub
[03:32:14] No problem. But the one job (#812020) shouldn't have OOMed, because 147 MByte < 512 MByte. I think the default is 256 MByte, so even that wouldn't have been reached.
[03:32:54] Technically, there can be other reasons why a job is "kill -KILL"ed and thus giving "exit status 137", but OOM is the most common cause.
[03:33:14] Let's see if I can confirm it in the syslog of the exec host.
[03:35:27] Well, rethinking it, it wouldn't be in the syslog -- don't know if the grid engine keeps track of that anywhere.
[08:54:02] * valhallasw sends stroopwafels to addbonn
[08:54:58] :)
[08:55:17] I could do with some, rather hungry !
[08:59:12] :D
[13:15:25] kma500: I was planning to write a *short* mail, but it's becoming a bit long. But I'm working on it!
:-)
[15:23:31] Coren: ping
[15:53:09] mhm I am back :o just fyi
[15:55:47] Coren: what is wrong with exec-01
[16:16:11] petan: YEAH :) HAPPY
[18:02:27] sorry if this has already come up, but i'm trying to connect to a labs instance and i'm getting an error about being able to create my home directory?
[18:02:39] https://gist.github.com/edsu/6263011
[18:04:12] filesystem failure?
[18:05:18] oh hm, was just noticing http://lists.wikimedia.org/pipermail/labs-l/2013-August/001488.html
[18:05:27] maybe it needs a reboot
[18:06:11] * edsu tries
[18:07:24] not sure if that reboot i did from the instance list actually worked
[18:08:17] https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000864
[18:08:57] @notify Coren
[18:08:57] This user is now online in #wikimedia-labs. I'll let you know when they show some activity (talk, etc.)
[18:13:29] * edsu emails instead
[19:15:05] hmm
[19:20:43] yuvipanda: :)
[19:20:58] how's it going?
[19:21:32] ori-l: pretty good, I think :)
[19:21:34] much better than expected, really
[19:21:36] ori-l: first time I've managed to completely disconnect myself from code.
[19:21:38] so that's an achievement unlocked
[19:21:52] ori-l: flying out tomorrow
[19:22:07] ori-l: I was going to sleep early but then I told an englishman I was working for Wikipedia, and now here I am after two beers :P
[19:22:10] nice
[19:22:41] ori-l: my credit card is still unusable, and so is my netbanking
[19:22:46] but other than that... :P
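Editorial note on the "exit_status 137" discussed between 03:25 and 03:33 above: the number matches the common shell convention of reporting 128 plus the signal number when a process is killed by a signal, so 137 corresponds to signal 9 (SIGKILL), in line with the "kill -KILL" remark. A small Python sketch of that arithmetic follows; the helper name is hypothetical, and it assumes the grid's accounting follows the same 128+N convention, which the log suggests but does not confirm.

```python
import signal


def describe_exit_status(status):
    """Interpret an exit status using the 128 + signal-number convention.

    Assumption: statuses above 128 mean "terminated by a signal". A program
    can also return such a value on its own, so treat the result as a hint.
    """
    if status > 128:
        signum = status - 128
        try:
            return "killed by " + signal.Signals(signum).name
        except ValueError:
            return "killed by unknown signal %d" % signum
    return "exited with code %d" % status


if __name__ == "__main__":
    # The two statuses that appear in the log above.
    for status in (1, 137):
        print(status, "->", describe_exit_status(status))
```

The sketch says nothing about why the signal was sent; as noted in the log, running out of memory is only the most common cause.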