[05:34:31] sorry for my unresponsiveness yesterday. was busy trying to hack one of cp678's tools (he agreed) and talking to him about it [05:35:18] got some session exploits and an arbitrary code execution exploit [08:45:26] np zhuyifei1999_ [09:01:59] I am writing the generic license checker function [09:02:52] whose purpose is to extract the license info and fill the 'Permission' field in the Photograph/other infobox template [09:04:09] As of now the fill_template() method in the GLAM_I class is doing the metadata mapping for each GLAM [09:05:23] Ideally this method will call the license_checker() method [09:06:35] In fact it will be very much glam specific [09:06:46] Also it's a few if conditions only [09:07:35] I am not sure if it should be a separate function altogether. [10:22:57] Never mind [10:23:06] It is required. [10:23:29] Since we are not adding the license info as part of the description anymore [10:23:47] so this new method can do that part too. [10:24:15] I recently discovered that the new uploads (only 2) did not have license info [10:24:38] As reported by YiFeiBot [10:37:35] infobliss: tl;dr? [10:43:07] ? [10:43:30] I was asking if you can provide a summary of the above ^ [10:44:10] * zhuyifei1999_ just got home [10:45:01] nothing to focus actually [10:45:08] k [10:45:14] I got the point of having the function [13:12:57] When I am testing in my local machine [13:12:58] https://github.com/infobliss/sibutest2/blob/master/OOP/NationaalArchiefGLAM.py#L134 [13:13:13] anything beyond this line is not getting executed [13:13:29] are you getting an exception? [13:13:30] I changed to - categories.append( '[[Category:Photographs by ' + photographer + ']]') [13:13:37] it worked [13:13:51] I mean .format not working here [13:15:07] Is this die to python version? [13:15:12] *du [13:15:14] *due [13:15:20] ? [13:15:29] what's your original code? [13:15:45] Is .format not working due to the python version? [13:15:51] what's your original code? 
[13:16:06] the link above [13:16:14] https://github.com/infobliss/sibutest2/blob/master/OOP/NationaalArchiefGLAM.py#L134 [13:17:34] that should work [13:17:41] did you get an exception? [13:17:53] no [13:18:08] then what happens after that line? [13:18:42] the if block is executed [13:18:56] then it goes to error.html [13:19:37] in your code error.html is from an except: clause right? [13:19:46] ^ indicate an exception happened [13:20:17] ok [13:20:25] log the exception [13:20:32] and see what went wrong [13:20:45] ok [13:20:58] Also one more thing [13:21:18] 'photographer': u'[\u2026] Punt (Anefo)' [13:21:35] This is the photographer name extracted from an url [13:21:36] ? [13:21:40] yeah? [13:22:00] a sec [13:23:02] when parameter contains this as part of it [13:23:04] https://github.com/infobliss/sibutest2/blob/master/OOP/GenericGLAM.py#L53 [13:23:10] gives exception [13:23:27] unless photographer is removed from the template dictionary [13:23:32] what is the exception? [13:23:45] well I will print the exception [13:23:50] and let you know [13:24:02] For me it goes to error.html in every case [13:27:55] 'ascii' codec can't encode character u'\u2026' in position 1: ordinal not in range(128) [13:28:04] ah ascii errors [13:28:09] *unicodeerror [13:29:44] which line errrored again? https://github.com/infobliss/sibutest2/blob/master/OOP/GenericGLAM.py#L53 or https://github.com/infobliss/sibutest2/blob/master/OOP/NationaalArchiefGLAM.py#L134? [13:30:53] https://github.com/infobliss/sibutest2/blob/master/OOP/GenericGLAM.py#L53 [13:30:58] sorry [13:31:00] the other [13:31:38] can you give me repr(parameters)? [13:31:50] ok [13:32:28] https://justpaste.it/19f82 [13:33:00] ah [13:33:14] https://github.com/infobliss/sibutest2/blob/master/OOP/infobox_templates.py#L80 ''' => u''' [13:34:06] u''' is required? [13:34:25] it makes the string literal unicode [13:35:00] ok [13:35:17] uh, are you on python 2 or 3? 
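Note on the 'ascii' codec error above: a minimal Python 3 sketch of what goes wrong, assuming the infobox template is an ordinary format string. The photographer value is the one from the log; the one-line template is a stand-in, not the real contents of infobox_templates.py. On Python 2 a byte-string template roughly has to encode a unicode argument as ASCII, which is the step that fails; on Python 3 every str is already Unicode.

```python
# Photographer value as reported in the log; the template line is a stand-in.
photographer = '[\u2026] Punt (Anefo)'
template = '|photographer = {photographer}'

# Python 3: str is Unicode, so formatting succeeds without any u'' prefix.
filled = template.format(photographer=photographer)

# Roughly what Python 2 attempted implicitly with a byte-string template:
# encode the unicode value as ASCII, which fails on the ellipsis character.
try:
    photographer.encode('ascii')
except UnicodeEncodeError as exc:
    message = str(exc)

print(filled)
print(message)
```

The u prefix on string literals is accepted again from Python 3.3 onward, so writing u''' in the template file should be harmless on a 3.5 interpreter and is what makes the same file work on Python 2.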
[13:35:38] I don't know maybe 2 [13:35:39] I mean you want the tool to be python 2 or 3? [13:35:54] when I write python --version it gives 2 [13:36:09] which one do you want it to be? [13:36:09] but when I write python3 it gives python 3.5 [13:36:13] 3.5 [13:36:18] u''' may be a syntax error in python 3 [13:37:12] oh ok, toollabs is in python 3 [13:37:22] there everything works fine [13:37:28] than use python 3 on your local environment [13:37:32] *then [13:37:45] I have a dilemma [13:37:59] I think both 2 and 3 are in my machine [13:38:06] it's ubuntu 16.04 [13:38:08] then just use python 3 [13:38:17] i.e. python3 -m flask ... [13:38:26] oh ok thanks [16:52:37] hi [17:09:51] hi [17:10:26] I was online but didn't have a clue that you have sent a text [17:10:44] I have written one license checker function [17:10:54] though it is not geberic [17:11:15] *generic [17:11:41] and also I have enabled upload by url for NA [17:12:04] It's hard to find a free image in NA [17:12:13] I have been looking for one for a long time now [17:12:34] trial and error, trying search strings and then checking json of images [17:12:55] well does CC0 mean free? [17:14:29] there are CC0 images that also have public domain = false [17:14:43] e.g. http://www.gahetna.nl/beeldbank-api/zoek/a9643e42-d0b4-102d-bcf8-003048976d84 [17:15:38] Also do you guys get a ping sound when someone texts here? [17:15:49] I don't get any :( [17:16:58] CC0 is basically public domain, with a reason of "copyright holder released the work into PD" [17:17:08] yeah I do [17:17:48] ok [17:18:08] I only get pings when I'm named, I'll be here in a bit [17:18:20] ok basvb [17:18:30] people use different irc clients. I just went lazy and use an online service [17:18:49] (the irc client I liked died) [17:19:05] oh sad [17:19:37] how do I enable the ping sound? [17:19:54] depends on your client [17:20:10] it's from the browser [17:20:16] webchat? 
[17:20:21] yeah [17:20:40] left top option [17:20:43] webchat is a very basic irc client. I don't think it can [17:20:45] if not there I don't know [17:21:13] which one do you use zhuyifei1999_ ? [17:21:24] the lazy solution [17:21:53] cc0 images are ok [17:22:02] can you link the function implementation line? [17:22:05] I used to use pidgeonclient (not pidgin) by [[User:Petr Bena]]. Now I use irccloud [17:22:32] + why is the function not generic? [17:22:41] you just make a generic function which returns false [17:23:06] so it has to be overridden by a local function which either returns a template for the license (or true) or false if the license is not ok [17:23:28] That much only? [17:23:39] unless zhuyifei1999_ thinks that is improper coding [17:23:44] don't know what else it should do [17:23:46] ? [17:23:49] * zhuyifei1999_ reads [17:24:07] I'd say return true [17:24:09] you know that a lot better than I do [17:24:28] no I mean I think we can do away with a generic function. [17:24:30] uh [17:24:34] wait [17:24:40] each glam will take care of itself [17:24:41] and get the license in the metadata mapping function? 
[17:24:48] yeah [17:25:09] setting it as a generic function makes sure every glam class needs to implement a license check [17:25:17] and ensures that you can use it for each glam [17:25:18] yeah return false [17:25:27] return false for generic case [17:25:38] for specific case: check if the license is ok [17:26:00] and then return true (or the specific license, but I think that is not needed, that can be done in a glam specific mapping) [17:26:32] just either a license template string or false [17:26:47] false or none, both shall work [17:27:18] https://codeshare.io/50q4Zl [17:27:24] for the NA glam [17:27:32] we can then just do if not glamimage.license: rejectupload() [17:27:36] sending the permission and license together [17:27:45] something like that ^ [17:28:50] And we also add the license info appropriately in the wikitext [17:29:01] maybe the json can be loaded in another function? [17:29:15] because you'll use that in another location as well [17:29:35] true [17:29:47] But then what will be sent to this function as param? [17:29:49] duh... I'd say a million properties instead of a single function to parse the json [17:30:23] what do you mean exactly zhuyifei? 
[17:30:53] a sec [17:31:03] and then I'd also suggest to return just the license [17:31:47] ok [17:31:51] as wikitext, and add it into the parameter on function call when used in glam for the mapping [17:32:00] infobliss: AFAIK, 'permission' parameter in the infobox template is redundant to the license templates [17:32:01] or to just check its existence [17:32:06] no need to set the former [17:32:09] true I was just looking into it [17:32:36] permission is the field where you can add the license [17:32:40] there is no license field [17:32:51] you can also add the license outside the infobox [17:32:56] but don't add it twice [17:33:06] yeah it has to be added outside infobox [17:33:12] not twice [17:33:26] otherwise YiFeiBot catches it :) [17:33:32] I think inside is better long term (structured data) but that's something we could have hour long discussions on [17:33:47] YiFeiBot says 'image has no license' [17:34:40] and the cc0 images from NA are also ok under cc0 license which is used here for public domain already [17:34:55] by a million properties I mean @property def license(self): doc = if self.jsondata.get('auteursrechten_voorwaarde_Public_Domain'): return '{{CC-0}}' elif ...: return ... etc. [17:35:13] apply to all parameters [17:35:48] ok basvb we have to deal with CC0 separately [17:35:49] so essentially you can '{obj.license}'.format(obj=theglamimageobj) [17:35:52] you mean check all options in one call to the json [17:36:07] dictionary [17:36:13] which file did my bot catch? [17:36:35] it could be a false positive [17:36:56] in the current code files get uploaded without a license I think [17:37:07] at least the code in the codeshare [17:37:30] infobliss, what else did you manage to complete today? 
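The "million properties" idea above can be sketched as follows. This is only an illustration under assumptions: the class name GlamImage and the sample record are hypothetical, while the JSON key, the '{{CC-0}}' string and the '{obj.license}' format call are taken from the messages above. The elided elif branches are left out rather than guessed.

```python
class GlamImage:
    """Hypothetical wrapper around one parsed JSON record of a GLAM image."""

    def __init__(self, jsondata):
        self.jsondata = jsondata  # dict produced by parsing the GLAM's JSON

    @property
    def license(self):
        # Key name as quoted in the chat; any further branches are elided there.
        if self.jsondata.get('auteursrechten_voorwaarde_Public_Domain'):
            return '{{CC-0}}'
        return None  # no known free license


# Hypothetical sample record, not real API output.
image = GlamImage({'auteursrechten_voorwaarde_Public_Domain': True})

# Attribute access inside a format string needs the object passed by name.
wikitext = '{obj.license}'.format(obj=image)

# The upload guard suggested earlier: reject when no license was found.
should_reject = not GlamImage({}).license
```

Because each field is a property, the mapping code can pull values straight into a template with format fields such as {obj.license}, and a missing license stays falsy for the reject check.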
[17:37:41] upload by url [17:38:08] and I have got freedom from Toollabs testing and committing [17:38:34] all development and testing in local machine now [17:38:36] nice so you can test locally [17:39:14] there was issue due to python 2, I installed pip3 [17:39:22] now it is same as Toollabs [17:39:57] the upload by url code was tested [17:40:03] I will send the code [17:40:07] pls wait [17:40:21] can you add the generic functions to https://github.com/infobliss/sibutest2/blob/master/OOP/GenericGLAM.py as a basic function, so I know which to write for the amsterdam museum [17:40:36] also I think we should work a bit on the naming [17:40:44] OOP is not a good name in my opinion [17:41:02] ok [17:41:15] I'd like to see 1 folder with just the template [17:41:18] templates [17:41:33] that also allows for automatic fetching of all glams [17:41:36] in fact it is not clear to me what to send as param to license_checker() [17:42:51] for most glams a dictionary (which is the parsed json) [17:43:08] ok [17:43:15] zhuyifei1999_ : https://commons.wikimedia.org/w/index.php?title=File:Prestatieloop_van_Amsterdamse_wielerclub_La_Champion_in_Amsterdamse_Bos,_actie_-_Nationaal_Archief_-_925-2648.jpg&action=history [17:43:28] this dictionary you get from a function which parses the json into a dictionary (just the one line) but you repeat the use of that [17:44:04] well, it clearly didn't have a license https://commons.wikimedia.org/w/index.php?title=File:Prestatieloop_van_Amsterdamse_wielerclub_La_Champion_in_Amsterdamse_Bos,_actie_-_Nationaal_Archief_-_925-2648.jpg&diff=253254215&oldid=253134433 [17:44:11] infobliss, is that because you replaced the permission parameter and moved that into license parameter [17:44:23] no basvb [17:44:48] if it has no license it shouldn't be uploaded [17:45:01] it's because we have not put any license info in wikitext, only filled the infobox template [17:45:05] it had license [17:45:22] I added the code for adding the license [17:47:57] 
infobox_templates.py will be brought outside OOP [17:49:56] I thought we discussed the folder structure needed some redoing, so I don't know what implications inside and outside OOP will have [17:50:18] +1 [17:50:35] you said you want the templates separately in a folder [17:50:43] 1 folder with all the templates [17:50:58] 1 with libraries (maybe that needs another name) [17:51:35] so some of the generic library functions (which we can extend and delete unusable ones from), the genericglam code and the infobox templates [17:53:07] some of the generic library functions (which we can extend and delete unusable ones from), the genericglam code and the infobox templates <== inside one folder? [17:54:38] yes [17:54:45] I made those 2 folders already a month ago [17:55:11] Glam_mappings and libraries [17:55:24] maybe they can use a rename but they are a bit more descriptive already [17:55:40] can you delete the trial folder? [17:55:49] yes [17:56:01] just do some general clean up, remove what is not needed [17:56:21] I get confused when you say templates [17:56:36] Sometimes I think you are referring to infobox templates [17:56:48] sometimes you refer to mappings [17:57:15] yeah I will do some cleanup [17:57:50] sure, just ask which one I mean [17:57:56] in the end there are 3 types of templates [17:58:00] or even more [17:58:23] ok [17:58:45] which is the 3rd type? 
[18:00:17] commons infobox, html templates, mappings [18:00:37] uh html ok [18:10:44] I have been thinking about the amsterdam museum mapping [18:10:56] the issue with multiple files which does not fit within the current structure (idea) [18:11:14] if you provide one identifier it might be related to more than 1 file [18:12:07] that is strange [18:12:22] there are three options: just upload the first file (incomplete solution); upload all files (needs the ability to return multiple files for upload, bit unexpected for the user); something else (let the user pick, very complicated to build generically) [18:13:12] option 3 could be done [18:13:33] It's going to be specific to this GLAM [18:15:01] could you give one such sample url from the Amsterdam Museum site? [18:16:28] http://amdata.adlibsoft.com/wwwopac.ashx?database=AMcollect&search=priref=12&output=json [18:16:41] I don't like to build things specific to 1 glam [18:16:53] it's going to heavily impact the general structure [18:17:54] ok [18:20:05] 4 images? [18:21:28] sometimes 1 [18:21:31] sometimes 10 [18:21:34] maybe sometimes 20 [18:21:50] never count on the 1 example telling the complete story [18:23:02] hmm [18:25:08] zhuyifei1999_: what do you think about this multiple image for one ID issue? [18:28:27] can I have a tl;dr? I'm working on some js madness [18:28:58] one unique id gives several images [18:29:15] how to decide what/how to upload? 
[18:32:14] just uploading N images (up to roughly 10) from one ID seems the best to me [18:32:37] it's not that nice but as long as that's supported it's what correlates with that one ID [18:32:56] ok [18:32:56] hmm [18:33:05] is it possible to give a user a warning [18:33:21] or some sort of choosing like we do for a search [18:34:53] I guess it could be done, reusing some of that structure, but it is quite a specific case I think [18:35:01] so I wouldn't want it to take up too much effort [18:35:54] It's also an option to go for just uploading all now, but keep this in mind as a task to later make cleaner (by asking for feedback from the user) [18:36:59] is there an /actually unique/ identifier for each image [18:37:17] for gallica there are ids and page numbers [18:41:30] basvb : do you know? [18:41:33] http://amdata.adlibsoft.com/wwwopac.ashx?database=AMcollect&search=priref=12&output=json [18:41:37] the id is on object level [18:41:59] the image has reproduction.reference.lref but you can't find this easily as end-user [18:42:28] it's not displayed on the page (only one image is displayed there) [18:43:36] not all reproduction numbers have corresponding images either. [18:44:01] what do you mean? 
[18:45:52] If you look at http://hdl.handle.net/11259/collection.12 you can only find one of the images [18:46:05] for this object reproduction[3] and reproduction[4] are empty, i.e., they have no images [18:47:32] https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx?command=getcontent&server=images&value=A_334_000.jpg&width=500&height=500 [18:47:48] https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx?command=getcontent&server=images&value=A_334.jpg&width=500&height=500 [18:47:50] I see the images [18:48:00] here they are all roughly the same, so uploading 4 is not good [18:48:18] https://am-web.adlibhosting.com/wwwopacx_images/wwwopac.ashx?command=getcontent&server=images&value=BuitenCollectie\S_BC_00850_003.jpg&width=500&height=500 [18:48:37] the quality varies a lot [18:50:16] yeah these are corresponding to reproduction[0], reproduction[1], reproduction[2], reproduction[5] [18:50:30] reproduction[2] says "low-res scan" [18:51:12] where do you see those [18:51:21] I only see 4 reproductions at http://amdata.adlibsoft.com/wwwopac.ashx?database=AMcollect&search=priref=12&output=json [18:51:47] there's indeed digitale opname (digital copy), scan and low res scan [18:51:53] what browser do you use? [18:52:08] aah I get what you mean [18:52:15] there's ,"","" in the array [18:52:22] yeah [18:52:23] yep, well those we would skip ofcourse anyhow [18:52:55] guess that answers that, for now I'll determine what's the highest resolution and then go for the first object with that resolution [18:53:28] keeps the problem simple [18:53:37] right [18:53:57] I'll hope to finish the mapping tomorrow or otherwise this weekend [18:54:06] what are your work plans for tomorrow? [18:59:56] ok I submitted the evaluation [19:00:19] hi [19:00:26] was in brb [19:00:59] tomorrow's plan: [19:01:26] clean up and modify the folder structure [19:01:34] that's no 1 [19:02:02] I'd also like you to provide an estimate of the time things will take [19:02:07] 2. 
enable wikitext editing before final upload [19:02:14] or well the time you will spend on them tomorrow [19:02:29] this helps you in not getting stuck too long on 1 difficult task and moving on [19:02:36] 1. 1 hour [19:02:57] 2. including UI changes : 3 hours [19:03:05] for 2. I think that is not a good idea for now [19:03:19] wasn't the focus on getting the generic mapping a bit more structured [19:03:29] and getting multi image upload supported [19:04:22] which generic mapping are you referring to? [19:04:36] I'd like to at least see: 2. generic function for license check (returning false) [19:04:43] 3. generic function for image thumbnail [19:05:08] that's a 10 minute job as you can have them empty returning false [19:05:24] implementing them for NA is the next thing, that can be 1-2 hours [19:05:43] 3. can also return false by default [19:05:55] what is the param taken by generic function for image thumbnail? [19:06:09] what do you think is logical? [19:06:45] I think the dictionary based on the json is the best thing to be sending around [19:06:53] or parts of it [19:06:58] the implementation will be glam specific [19:07:21] and return value? 
[19:07:24] it will allow us to have proper functions within the provided glams [19:07:33] by default None or false [19:08:03] and for the specific glam they return either a license-text or a thumbnail image url [19:08:24] ok [19:08:47] so go for 2 hours on that if you want to implement it for NA (if it takes more just stop and we discuss it in the evening) [19:09:21] then you can decide whether you want to work on the multi-image implementation or wikitext editing for final upload for the other 3-4 hours [19:09:37] or put some (2-3) hours into both of those [19:09:44] if you get stuck on the one move to the other [19:09:58] ok [19:10:19] and if you get bored (I don't think so) you can put another hour into the documentation, explaining what should be provided for 1 glam [19:10:46] sounds like a good plan, I hope this helps you to keep focus and know when to move on when you get stuck [19:11:04] yeah sounds good [19:11:37] I can't see the evaluation right now. [19:11:52] they will show tomorrow maybe [19:11:57] Ok, I have to do something for my final hours of the day, tomorrow I'm likely here a bit during the day working on the Amsterdam Museum [19:12:25] do you get all info in the evaluation (the bad, okay, good grades?) or only the comment section to you [19:12:38] comment [19:12:40] ok [19:13:58] If I'm too blunt please know that I'm Dutch and we're known for being blunt (and sometimes that's misinterpreted as us being uncivil) ;) [19:14:56] np [19:25:27] good night [19:28:54] good night
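
The plan agreed above (generic functions that return None or false by default, overridden per GLAM, taking the parsed-JSON dictionary as parameter) could look roughly like the sketch below. The class names follow the file names GenericGLAM.py and NationaalArchiefGLAM.py, and license_checker() is the name used in the chat; thumbnail_url(), can_upload() and the sample records are hypothetical names introduced only for illustration.

```python
class GenericGLAM:
    """Base class: every GLAM inherits safe defaults that block uploads."""

    def license_checker(self, record):
        # Default: no license known, so nothing should be uploaded.
        return None

    def thumbnail_url(self, record):
        # Hypothetical method name; default is "no thumbnail available".
        return None


class NationaalArchiefGLAM(GenericGLAM):
    """GLAM-specific override; only the one branch quoted in the chat."""

    def license_checker(self, record):
        # record is the dictionary parsed from the GLAM's JSON.
        if record.get('auteursrechten_voorwaarde_Public_Domain'):
            return '{{CC-0}}'
        return None


def can_upload(glam, record):
    # Hypothetical helper: upload only when a license template was returned.
    return bool(glam.license_checker(record))


# Hypothetical sample records, not real API output.
free_record = {'auteursrechten_voorwaarde_Public_Domain': True}
other_record = {'auteursrechten_voorwaarde_Public_Domain': False}

print(can_upload(NationaalArchiefGLAM(), free_record))
print(can_upload(NationaalArchiefGLAM(), other_record))
print(can_upload(GenericGLAM(), free_record))
```

Returning None from the base class means a new GLAM that forgets to implement the check fails closed: its files are rejected instead of being uploaded without a license.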