[01:50:17] hi zhuyifei
[01:50:44] I think I have been able to reset master to restore
[01:50:48] Now
[01:51:27] By "cherry-picking the additional commits" do you mean I have to recommit the additional commits
[01:51:30] ?
[01:51:50] the import error is still there
[02:22:02] Finally the import error is gone
[04:44:47] hi zhuyifei1999_
[04:44:52] hi
[04:45:08] I think all the code is working as expected except
[04:45:15] the uploading part
[04:45:31] hmm
[04:45:35] what's wrong?
[04:45:52] should I try site.upload instead of UploadRobot?
[04:46:02] yes
[04:46:17] https://github.com/infobliss/sibutest2/blob/master/NationaalArchief2.py#L222 the bot is not getting called properly
[04:46:19] I said to not use UploadRobot like months ago
[04:46:37] yeah you did
[04:46:57] so it's a good opportunity to change it now
[04:47:02] k
[04:47:39] https://github.com/toollabs/video2commons/blob/8cf788ea011823eda2b250d874e43d789111ad6a/video2commons/backend/upload/__init__.py#L89
[04:47:48] This is an example
[04:47:50] yeah
[04:48:00] it's overly complicated though
[04:48:28] its purpose is to make files up to 4GiB upload
[04:48:56] ok then could you let me know how to simplify it?
[04:49:15] remove the "chunk_size=chunked, async=bool(chunked)"
[04:49:32] ok is that all?
[04:49:49] and nearly all of the wrappers around
[04:50:09] is the comment parameter required?
[04:50:21] yes
[04:50:26] it's the upload summary
[04:50:46] it defaults to the file description page wikitext
[04:50:54] which is really ugly
[04:51:31] for our case shall we use the description parameter for comment?
[04:51:37] no
[04:51:43] a sec
[04:52:17] see https://phabricator.wikimedia.org/T170195#3426445
[04:53:02] in "widely used pattern as far as I am aware" there are many links to various files with different upload summaries
[04:53:45] ok
[04:54:04] (Of course I still prefer the v2c style as it shows the source of the file clearly)
[04:55:17] shall we say upload from the "glamname"?
[04:55:34] sure
[04:55:42] ok
[04:55:46] unless basvb object :)
[04:55:56] ok
[04:56:13] https://github.com/toollabs/video2commons/blob/8cf788ea011823eda2b250d874e43d789111ad6a/video2commons/backend/upload/__init__.py#L79
[04:56:23] page takes as param wikifilename
[04:56:37] that's the filename on-wiki
[04:56:50] again in site.upload we have source_filename=filename
[04:57:09] that's the uri to the file
[04:57:32] ok the file location
[04:57:38] yeah
[04:57:45] the one to-be-uploaded
[04:58:12] like http://afbeeldingen.gahetna.nl/naa/thumb/10000x10000/b0539951-3ad8-5d9c-9327-9717699e8c19.jpg
[04:58:35] you might not have the rights to upload-by-url though
[04:58:47] so you may have to download locally then upload
[04:59:01] omg
[04:59:06] that's so bad
[04:59:42] removing the local copy will be again an overhead
[04:59:58] use tmpfile module
[05:00:38] *tempfile https://docs.python.org/2/library/tempfile.html
[05:01:00] ok
[05:02:19] but if I don't have rights to upload-by-url then how does UploadRobot doesn't face any issues?
[05:02:33] *don't face
[05:03:46] a sec
[05:04:41] https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/specialbots.py#L124
[05:05:43] great
[05:06:39] can I use this function in my code?
[05:07:11] I don't really recommend it
[05:07:57] ok
[05:08:45] this code can OOM
[05:09:16] i.e. download everything to memory, then write to file
[05:10:02] it doesn't use with: statements either
[05:10:53] OOM?
[05:11:04] out of memory
[05:12:07] ok
[05:12:48] you know, there's a one-liner to do nearly the whole thing https://stackoverflow.com/a/22776
[05:13:05] so if there's a very large file that can OOM we have to raise an exception
[05:13:15] no
[05:13:33] that code is fundamentally bad on how it downloads data
[05:14:54] alright
[05:15:22] but say we have a file that we cannot accommodate into memory then what do we do?
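(Editor's sketch, not part of the log.) The download-then-upload approach discussed above could look roughly like the following, assuming Python 3 and only the standard library. The helper name `downloaded_tempfile` is hypothetical. It streams the remote file to a temporary file in fixed-size chunks, so it should not run out of memory on large files, and it removes the local copy afterwards.

```python
import contextlib
import os
import shutil
import tempfile
import urllib.request


@contextlib.contextmanager
def downloaded_tempfile(url, suffix=""):
    """Stream `url` into a temporary file and yield the file's path.

    Hypothetical helper: the temporary file is deleted when the
    with-block exits, so no local copy is left behind.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as resp:
            # copyfileobj copies chunk by chunk, so memory use stays
            # bounded regardless of how large the remote file is.
            shutil.copyfileobj(resp, out)
        yield path
    finally:
        os.remove(path)
```

Inside the `with downloaded_tempfile(url) as path:` block, the path would be passed on as `source_filename=path` to `site.upload`, together with a `comment` for the upload summary, as described in the conversation.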
[05:16:06] https://stackoverflow.com/a/22776
[05:16:32] use python standard library
[05:17:11] ok so this will take care of files however large
[05:19:22] https://www.irccloud.com/pastebin/eDROLXDn/
[05:19:39] the source of that function in python3
[05:21:04] nice
[05:21:30] thanks zhuyifei1999_
[05:21:37] np
[05:21:47] I will report to you again after trying to upload this way
[05:21:55] k
[08:11:15] hi
[08:11:27] hello
[09:12:28] hello basvb
[09:12:51] hi
[09:12:53] shall we discuss about multiple image upload?
[09:13:18] yes, although I realised there are some steps which might be more logical before that
[09:13:34] 1: a good url recognizer
[09:13:41] url/identifier
[09:14:01] 2: how we want to deal with wikitext changing
[09:14:11] both of these could influence the multiple image workflow
[09:15:26] you mean wikitext edit?
[09:16:56] also I didn't properly get the point about url. Are you saying about allowing image uploads by url?
[09:17:21] which is done by unique ID only now
[09:19:03] yes we should do some recognizing of different formats. easier for the user -> better
[09:19:17] but we could do that at a per glam level
[09:19:30] a few regexes per glam should do right?
[09:19:52] sure we had planned for providing that support
[09:20:30] for NA only http://proxy.handle.net/10648/* type
[09:21:27] different kind of unique urls
[09:21:37] but we can do that on a per glam basis
[09:21:45] so user flow:
[09:21:54] 1. user selects glam/collection from drop down
[09:22:06] 2. it shows a field where one can enter image url/identifier
[09:22:45] I got you
[09:22:53] 3. It also shows a field (mark) "upload multiple image". You select that and 2.
changes to searchstring
[09:23:07] we can then per glam allow for multiple image upload to be possible
[09:24:06] or may be we can have both textboxes for url/id and searchstring
[09:24:14] If multiple images are uploaded the user likely wants to do changes on all images so add a category to all of them etc,
[09:24:22] and not change 10x the same thing in the wikitext
[09:24:48] ok
[09:24:52] ok so user selects upload multiple
[09:24:57] and gives a search string
[09:25:23] then we provide all the images with a mark "upload" (and on top select all / deselect all)
[09:26:18] probably you want to have a maximum of results (100?) and some good idea what to do if there are more results
[09:26:33] this will get a bit more complicated with ideally loading in images and providing some info
[09:26:41] otherwise the user can't select
[09:26:46] so this is not easy to implement
[09:27:17] zhuyifei1999_: any input on the multiple image upload?
[09:27:37] yeah this is exactly like google 'image' search
[09:27:57] show thumbnail of images
[09:28:11] it's not exactly compatible with the edit-wikitext-directly scheme imo
[09:28:27] we would need two workflows to deal with wikitext
[09:29:49] yep, I would suggest to make it a setting to be able to edit wikitext (also for single upload) with some mark
[09:29:55] and, unless we duplicate a lot of code, some related stuffs might have to move into js
[09:30:00] some users just want to do quick upload
[09:31:26] well, do you want to enable a switch between editing form and raw wikitext?
I think we discussed this a while ago and it's complex
[09:31:55] unless we can afford to abandon all the data input during the switch
[09:33:15] (the switch shouldn't be too much of a problem for me myself, but I'm not sure about infobliss)
[09:33:21] for multiple image it may not be logical to show raw wikitext for all images
[09:33:43] exactly
[09:33:52] maybe some road in the middle, but that could become ugly (category as form add, maybe for some glams another very important field, rest wikitext (after that and then we can abandon the exact mapping at that point)
[09:34:00] instead we can ask for category etc that applies to all of the images
[09:34:22] and for multiple uploads no wikitext editing
[09:34:50] you would have to implement a. wikitext from glam b. wikitext from user form input c. user form from glam
[09:35:25] user provides some inputs (category) + glam mapping
[09:35:32] these result in a proposed wikitext
[09:35:37] hmm
[09:35:41] which the user can edit
[09:35:47] (in some cases)
[09:35:54] which results in a final text to upload
[09:36:15] so in multi image allow category input only?
[09:36:20] I could also agree with not going for wikitext editing at all, and leave that to the user to do on commons
[09:36:35] yeah simplicity of use was our first priority.
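(Editor's sketch, not part of the log.) The "user inputs (category) + glam mapping result in a proposed wikitext" step for the category-only case might be sketched as below. The function name `add_categories` is hypothetical, and the glam-generated wikitext is simply taken as a string; this is one possible shape, not the project's actual code.

```python
def add_categories(wikitext, categories):
    """Append one [[Category:...]] line per user-supplied category.

    Hypothetical helper: `wikitext` is whatever the glam mapping
    produced for one image, `categories` is the user's form input.
    """
    result = wikitext.rstrip()
    for name in categories:
        name = name.strip()
        # Accept input with or without the "Category:" prefix.
        if name.lower().startswith("category:"):
            name = name[len("category:"):].strip()
        link = "[[Category:%s]]" % name
        # Skip empty input and categories that are already present.
        if name and link not in result:
            result += "\n" + link
    return result + "\n"
```

For a multi-image upload the same list of categories would be applied to the wikitext of every selected image, so the user enters it once rather than once per file.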
[09:36:52] that would work
[09:36:54] it's of course a little less nice, but commons has the editor for it
[09:37:10] though I'm not sure if you want multi-input
[09:37:22] what do you mean exactly with multi-input
[09:37:39] I mean input multiple categories at once
[09:38:09] afaik html does not support that natively
[09:38:54] I think categories are often interesting to add, especially with multi-image upload
[09:39:03] so we can limit the user to 0 or 1 custom categories, or write some js to make the # of custom categories arbitrary
[09:39:04] if I search for queen beatrix 1950
[09:39:20] then it would be good to add the category: queen beatrix in 1950 to all 20 resulting files
[09:39:36] yeah
[09:39:44] or 2 categories, if 2 are applicable
[09:39:54] zhuyifei1999_: we have an "Add more categories" button already for multiple category input
[09:40:24] where?
[09:40:35] in the sibutest app
[09:40:42] I mean the code
[09:40:56] oh wait
[09:41:58] https://github.com/infobliss/sibutest2/blob/master/templates/index.html#L166
[09:43:05] hmm
[09:43:29] that should work
[09:43:35] It now is on the level that for all glams you can add categories? I think that's fine in this case
[09:43:59] for some other fields we might like them for some glams and not for others
[09:44:29] although the code is ... ugly... I'll save my complaints for a later time
[09:44:47] ok zhuyifei1999_
[09:50:34] so zhuyifei1999_ could you suggest anything on how I may show thumbnails?
[09:50:54] hmm
[09:51:10] basvb: does the glams provide them?
[09:51:19] *do
[09:52:18] they provide the url to the image in the metadata, the NA provides 10 urls (in different sizes)
[09:52:58] yeah
[09:53:59] that's fine
[09:54:15] well that is pretty glam specific likely
[09:54:18] we have to form a layout and do other things
[09:54:37] http://amdata.adlibsoft.com/wwwopac.ashx?database=AMcollect&search=priref=124&output=json gives only one image url
[09:54:48] which you even have to change
[09:54:52] and add checkboxes on the corner of each image
[09:54:55] as it's an internal database url
[09:55:07] checkboxes on the corner?
[09:55:25] just a checkbox below the image/some text (title?)
[09:55:31] for select/deselect
[09:56:50] zhuyifei1999_: I think most glams provide just the single (full scale) image
[09:57:59] http://ahm.adlibsoft.com/ahmimages/pictura2009/S_A_10075_000.jpg
[09:58:47] ok
[09:59:17] we may be able to scale it down
[09:59:30] zhuyifei does that cost a lot of time for scaling down when there are 100 images or something?
[10:07:51] well if we want to avoid scaling down what we can do is show two images per page and have navigation via pagination
[10:08:07] or may be navigation by scrolling
[10:08:22] as in google image search
[10:08:30] basvb: it might, all of them has to be done on server-side
[10:08:42] there are also caching issues etc.
[10:08:54] that defeats the purpose of multi-uploading with previews a bit
[10:10:48] * infobliss_ needs a brb
[10:17:47] hi
[10:20:16] so how do we go about multiple images preview?
[10:21:00] * zhuyifei1999_ don't know, unless we want to stress labs servers
[10:21:15] ok
[10:21:28] inspect some of the searches of glams
[10:21:33] http://am.adlibhosting.com/results
[10:22:03] how do they provide the previews
[10:22:28] and is that something we can do as well
[10:23:29] but don't they fetch directly from db?
[10:23:58] maybe, but they are exposing their db through adlibhosting
[10:24:01] db never contain the actual files, only uris
[10:24:14] maybe you can just do the http://ahm.adlibsoft.com/ahmimages/pictura2009/S_A_10075_000.jpg with something like http://ahm.adlibsoft.com/ahmimages/pictura2009/S_A_10075_000.jpg/200px
[10:24:32] and they have the thumbnails on their serverside
[10:25:39] 404 - File or directory not found
[10:26:09] on what?
[10:26:15] that last url?
[10:26:15] I just made that up
[10:26:20] inspect their code
[10:26:26] ok
[10:27:04] you mean their api doc?
[10:27:16] is the code open too?
[10:27:31] no, look at their generated html
[10:28:33] https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx?command=getcontent&server=images&value=KA_21965.JPG&width=150&height=150&imageformat=jpg&scalemode=fit&canvascolor=eeeeee
[10:29:25] although that doesn't give me an image they do something with the images there
[10:29:59] basvb: &amp; => &
[10:30:08] aah oops
[10:30:45] sorry I'm doing something at the same time, was hoping you could try and attempt to investigate the html a bit infobliss and see if you can construct the image url from this
[10:31:03] I think it's a good lesson in backtracking how something works
[10:31:08] ok
[10:31:37] the url I gave you should be able to form into a workable url I think
[10:43:03] infobliss did you find the url pattern?
[10:47:02] sorry my net got disconnected
[10:49:44] give me a moment basvb I am looking for
[10:51:03] https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx?command=getcontent&server=images&value=LA_2019.JPG&width=150&height=150&imageformat=jpg&scalemode=fit&canvascolor=eeeeee
[10:51:19] value= changes
[10:52:27] all image thumbnails are 150*150
[11:04:19] yes I got at https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx?command=getcontent&server=images&value=KA_21965.JPG&width=100&height=100
[11:04:28] other sizes work as well
[11:04:51] scalemode fit you can decide whether you want it, canvascolor is the white above and below
[11:05:14] ok I'm off for today
[11:05:20] I think for most glams it can work like this
[11:05:38] so server side image thumbing
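(Editor's sketch, not part of the log.) The thumbnail URL pattern worked out above could be built as follows. The endpoint and the query parameters `command`, `server`, `value`, `width` and `height` are taken from the URLs in the conversation; the function name `thumbnail_url` and the default size are assumptions, and other glams may need a different endpoint.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the URLs pasted in the conversation.
THUMB_ENDPOINT = "https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx"


def thumbnail_url(value, width=150, height=150, endpoint=THUMB_ENDPOINT):
    """Build a server-side thumbnail URL for the image file name `value`.

    Hypothetical helper: only `value` changes between images; width and
    height select the thumbnail size.
    """
    params = [
        ("command", "getcontent"),
        ("server", "images"),
        ("value", value),
        ("width", width),
        ("height", height),
    ]
    return endpoint + "?" + urlencode(params)
```

Calling `thumbnail_url("KA_21965.JPG", 100, 100)` reproduces the 100x100 URL quoted at 11:04:19. The optional `imageformat`, `scalemode` and `canvascolor` parameters are left out here since the shorter URL was reported to work without them.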