[17:23:07] hi [17:31:42] hi [17:32:26] I have done further changes here and there [17:32:54] I will now commit them and test from toollabs [17:33:06] a few things are left [17:33:45] so let us start the meeting at our usual time 18:00 UTC [17:39:12] https://github.com/infobliss/sibutest2/blob/master/libraries/GenericGLAM.py#L61 this is not a proper if-else [17:39:24] you can't assume them to be art-photo [17:40:10] I'm also not so much a fan of adding other things than wikitemplate parameters to the parameters constant [17:41:35] https://github.com/infobliss/sibutest2/blob/master/libraries/GenericGLAM.py#L16 and L19 can you please comment what those should be replaced with in the not-generic case [17:41:38] this is very unclear [17:45:10] ok [17:45:48] why are you doing https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L82? [17:45:51] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L82 [17:45:55] imo L16 and L19 won't be needed at all [17:46:05] the whole idea is to call https://github.com/infobliss/sibutest2/blob/master/libraries/infobox_templates.py#L55 [17:46:10] and use that [17:46:22] these will need a lot of rewriting to be proper code [17:49:23] imo L82 is just putting empty values in the mapping without which we will get error [17:50:11] yes there is a predefined dictionary with the correct mapping! 
[17:50:21] at https://github.com/infobliss/sibutest2/blob/master/libraries/infobox_templates.py#L55 [17:50:27] what is otherwise the purpose of https://github.com/infobliss/sibutest2/blob/master/libraries/infobox_templates.py#L55 [17:52:33] ok [17:55:14] the trick is to make code generic and reuse it [17:56:19] yes [17:56:51] will use the empty mapping [17:57:08] from infobox_templates.py [17:57:15] yes, but the code is otherwise going to need quite some restructuring as well [17:57:22] I want to see you code comment a lot more [17:57:34] ok [17:57:37] and zhuyifei1999_ already pointed out the print statements (in production code) need to go [17:57:45] that is being done [17:57:56] next commit will address them [17:58:03] if logical things should be separated into multiple functions [17:59:20] you mean break down larger functions into smaller ones if possible? [18:02:40] yes [18:03:07] ok [18:07:50] now I am extracting the largest resolution image from the given images [18:08:28] https://codeshare.io/5Qmbj7 [18:13:28] you need an indent before line 6 and 7 [18:13:59] doe it work? [18:15:21] it is copy pasted from original code [18:15:30] identatation got disturbed [18:15:34] it works [18:17:30] ok nice [18:37:11] right now removing the print statements could be risky [18:37:18] they support me while testing [18:37:32] use logging [18:37:33] we may remove them all when we have tested at least one more GLAM [18:37:55] sure [18:38:02] ok [19:17:39] zhuyifei1999_: I've had this struggle before, how can I import python files which are in a parent (or subfolder of a parent) folder? [19:17:56] sys.path [19:18:08] it's hacky [19:18:09] so appending the path with .. 
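The sys.path approach mentioned above can be sketched roughly as follows. This is a minimal illustration only, assuming a script that sits one folder below the project root; the helper name `add_parent_to_path` and the example layout are hypothetical and not taken from either repository.

```python
import os
import sys


def add_parent_to_path(script_file, levels_up=1):
    """Put the folder `levels_up` above the script's own folder on sys.path.

    Hypothetical helper: returns the folder that was added so the caller
    can check or log it.
    """
    root = os.path.dirname(os.path.abspath(script_file))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    # Insert at the front so project modules win over same-named
    # installed packages; skip if it is already present.
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Typical use at the top of a script, before importing project modules:
#   add_parent_to_path(__file__)
```

This only changes where Python looks for modules in the current process, which is why it counts as a workaround rather than a fix for the project layout.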
[19:18:14] ok thought that was hacky yes [19:18:20] that's what I use to do [19:19:20] eg https://github.com/toollabs/video2commons/blob/master/utils/cleanuptasks.py#L26 [19:20:52] basically, don't do it whenever there is a better method [19:21:51] but in the case of ^, there isn't a way :( (the v2c repo is cloned to different paths) [19:26:40] strangely enough https://github.com/infobliss/sibutest2/blob/master/Glam_mappings/AmsterdamMuseum.py#L9 seems to work without a sys.path [19:26:52] or am I missing something there [19:27:42] sys.path includes the directory of the initial script iirc [19:27:59] so if it runs app.py https://github.com/infobliss/sibutest2 is included [19:30:04] I'm running that file directly [19:30:10] in pycharm [19:30:19] amsterdammuseum.py [19:30:27] * zhuyifei1999_ don't know pycharm [19:30:53] oh an IDE [19:31:10] yep [19:31:19] pycharm may use other methods to run the file [19:31:29] eg python -m [19:31:51] or set PYTHONPATH or other environment variables [19:31:58] or ... [19:31:58] yes, could be [19:32:11] likely project root appended to pythonpath [19:32:11] but direct runs shouldn't work [19:35:15] btw, https://phabricator.wikimedia.org/T172065 I'm insane :P [19:37:52] how so? [19:37:58] amount of subtasks? [19:38:31] 21 completed, 82 to go [19:39:40] what are these tools doing exactly? (an example?) [19:40:14] loading resources from third party sites [19:40:22] yep that I got [19:40:28] so for example showing google maps? [19:40:49] or what we do by loading from a glam, is that also loading from a 3rd party website? [19:40:56] connecting to their API [19:41:10] https://tools.wmflabs.org/coord/ loads google maps [19:41:37] connecting from backend is okay, not connecting on frontend [19:41:50] i.e. 
browser [19:41:57] ok [19:42:11] https://tools.wmflabs.org/coord/ [19:42:14] uh [19:42:20] https://phabricator.wikimedia.org/phame/post/view/65/toolforge_provides_proxied_mirrors_of_cdnjs_and_now_fontcdn_for_your_usage_and_user-privacy/ [19:43:22] you don't want google to know "this ip visited this tool on this date" do you? :P [19:43:36] preferably no [19:43:53] I know what you can do when you have all/wrong data [19:44:18] especially considering we use google to much and google probably knows this ip is this person [19:44:25] *too much [19:44:39] "I know what you can do when you have all/wrong data" <= ? [19:44:41] yep, although google has a relatively open approach [19:44:55] don't understand that ^ [19:45:25] as a data scientist I'm used to working with very sensitive data, I've never done anything wrong but we often have discussions about legal and ethic use of data [19:45:40] we can not use all the data we have/can get by far [19:45:49] because it would infringe on the privacy of the clients [19:45:53] yeah [19:46:16] the EU has much better regulation in that area than the US [19:46:50] that's why we try to make tools not load third party resources [19:47:07] for that reason many dutch/eu companies forbid to store their (cloud) data in the US [19:47:28] outside EU i mean [19:47:34] so our privacy policy, not a mixture of ours and googles, are in effect [19:47:39] yes [19:47:40] hmm [19:59:11] btw isn't it like 3:30 AM for you zhuyifei1999_? [19:59:31] almost 4AM [20:00:00] 3:59? 
[20:00:13] now 4 [20:00:17] infobliss: I revised https://github.com/infobliss/sibutest2/blob/master/libraries/gen_lib.py#L23 [20:00:23] and deleted some outdated functions from there [20:00:36] please use the title generator in the NA archive code as well [20:00:45] there is duplicate code in NA doing the same thing [20:00:46] ok [20:00:58] I completed the Amsterdam Museum mapping [20:01:25] now I only have to rewrite the code to fit with the rest (genericGLAM) [20:01:35] and actually combine it's elements [20:02:54] do you want me to do something here? [20:04:28] basvb : Is there a way to push to github from pyCharm? [20:06:17] yes [20:06:27] I'll get it working [20:06:32] with the genericGLAM [20:06:48] because it should be that others can add glams easily [20:06:57] so I want to do it and tell you what is difficult [20:07:28] pycharm < VCS I'm a bit done for today, time to watch some film/series etc, [20:08:11] btw the license checker(), get_thumb_url(), gallery_builder() etc should be implemented for each GLAM in its own class [20:08:27] yes [20:08:36] as done fro NA https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py [20:09:07] please describe that at https://github.com/infobliss/sibutest2/blob/master/libraries/GenericGLAM.py#L16 [20:09:16] what kind of returns you expect for which cases [20:09:38] I don't think we need those dummy functions in GenericGLAM at all [20:09:39] you can't just say it to me, it should be clear from instructions/code [20:09:46] I think we do [20:10:07] for naming conventions and to return false if nothing is given [20:10:49] ok [20:11:11] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L42 the whole mapping has to be done, can't you check directly from the source? 
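The idea of keeping documented dummy functions in GenericGLAM, for naming conventions and a False default, with each GLAM overriding them in its own class, might look like the sketch below. The method names follow the conversation, but the signatures, the `ExampleGLAM` class, the record layout and the licence table are all assumptions made for illustration; this is not the code in the repository.

```python
class GenericGLAM(object):
    """Base class documenting what every GLAM-specific class should override."""

    def license_checker(self, record):
        """Return the licence template text if the record's licence is
        acceptable, otherwise False."""
        return False

    def get_thumb_url(self, record):
        """Return the URL of a thumbnail for the record, or False if the
        record has no usable image."""
        return False


class ExampleGLAM(GenericGLAM):
    """Hypothetical GLAM used only to show the overriding pattern."""

    # Assumed mapping from the source's licence label to a template string.
    ACCEPTED = {"CC0": "{{Cc-zero}}", "CC BY-SA 4.0": "{{Cc-by-sa-4.0}}"}

    def license_checker(self, record):
        # Unknown or missing licences fall through to False.
        return self.ACCEPTED.get(record.get("license"), False)

    def get_thumb_url(self, record):
        images = record.get("images") or []
        if not images:
            return False
        # Assumed record layout: each image is a dict with "width" and "url".
        return min(images, key=lambda image: image["width"])["url"]
```

With the expected return values written in the base class docstrings, someone adding a new GLAM can see what each function must return without asking.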
[20:11:12] I didn't have to use the functions in GenericGLAM anywhere [20:12:04] oh sorry [20:12:09] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L49 how is that code so long/complicated [20:12:12] that function is to be removed [20:12:20] why? [20:12:25] you use it in https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L49 [20:12:35] it's very nice to have [20:12:36] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L42 is redundant now [20:12:54] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L93 [20:12:56] I think I see the issue [20:13:11] you are doing way too much of the gallery building in the glam-specific function [20:13:36] aah you have the same function twice [20:13:40] that's not good indeed [20:14:03] not an issue I forgot to remove [20:14:16] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L93 that doesn't look very clean [20:14:20] I wrote the new one recently [20:14:32] you want to return false if the license is not valid [20:14:38] that's the whole idea of the license checker [20:15:05] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L94 why are you not reusing the load_json_from_url function? [20:15:18] my logic is if there's no acceptable license then return empty [20:15:42] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L37 has the same [20:16:09] also why are you returning a constant? [20:16:58] do you prefer parsed_json to be declared global? [20:17:14] i.e., a class variable? [20:17:19] nop [20:17:21] no [20:17:27] no global variables thanks [20:17:40] it can be a library function [20:17:42] then how to use L37 in L94? 
[20:18:15] def load_json_from_url(url): return json.loads(urllib2.urlopen(url).read().decode()) [20:18:23] and that works for all glams [20:18:30] which load json from urls [20:18:35] this is one line of code [20:18:43] and? [20:18:58] if you use it at 100 places [20:19:01] and you have to change it [20:19:08] then you have to change 100 places [20:19:12] rule #1: don't repeat yourself [20:19:14] which means problems [20:19:18] so you make a function [20:19:22] and then only have to change 1 place [20:19:40] I believe this code works differently in python2 and 3 [20:19:52] ok [20:19:58] #2: make logic clear and without garbage [20:19:59] or well, urllib2 is different, so that's the imports [20:20:10] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L13 to 16 [20:20:14] that's for that code [20:20:20] so it's not 1 line but also those 4 [20:20:30] #3: explain with comments whenever the code is unclear [20:20:32] (not sure if that's the clean way to do that, it's my hacky way ;) [20:21:28] ok got you [20:21:41] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L49 which resolution is this returning? [20:21:58] infobliss: oh and if you stick to python 3, you can reduce that try: ... except ImportError: ... to only the python3-compatible version [20:22:03] largest available [20:22:16] zhuyifei1999_: I believe toollabs was running on python 2? 
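The shared library function discussed above could be written once along these lines. This sketch assumes Python 3, so it uses `urllib.request` in place of `urllib2`; on a Python 2 deployment the import would differ. The `opener` parameter is an addition for illustration and testing and is not part of the one-liner quoted in the log.

```python
import json
from urllib.request import urlopen


def load_json_from_url(url, opener=urlopen):
    """Fetch `url` and return the response body parsed as JSON.

    `opener` is any callable that takes a URL and returns a file-like
    object usable as a context manager; it defaults to urllib's urlopen.
    """
    with opener(url) as response:
        # Decode explicitly so the behaviour does not depend on defaults.
        return json.loads(response.read().decode("utf-8"))
```

Every GLAM class that reads JSON can then call this one function, so a change in how requests are made only has to happen in one place.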
[20:22:27] no python 3 [20:22:32] ok so I thought the idea behind the thumb function was to get a 100x100 thumbnail version [20:22:34] basvb: depends on how you start it [20:22:35] for the gallery [20:22:48] there is /usr/bin/python3 [20:22:52] zhuyifei1999_: glam2commons is running python2 on tools [20:23:02] that's what I meant [20:23:14] then stick the code to python 2 [20:23:29] we aren't making a general-use library [20:23:43] I prefer to work in python3 myself [20:23:48] so was doing that locally ;) [20:23:51] no need for those compatibility cruft [20:24:41] basvb: if you have python2 production and python3 testing, you may be unable to reproduce errors in production [20:24:44] I am working in python3 locally [20:25:26] https://github.com/infobliss/sibutest2/blob/master/glams/NationaalArchiefGLAM.py#L2 should this be modified to python3? [20:25:30] make the environment on testing / production as similar as possible [20:25:41] then why are we not using python 3 on toollabs infobliss ? 
[20:25:44] if we both prefer it [20:25:55] and zhuyifei1999_ my testing practices are not in order I know ;) [20:25:57] I prefer python3 [20:26:01] infobliss: editing it won't magically work [20:26:24] shebang ling is used when you execute it directly [20:26:42] * zhuyifei1999_ prefer python 2 (just a habit) [20:27:17] better to change to python 3 I think, it's been around for years already [20:27:46] Till now I was thinking that g2c in toollabs is using python3 [20:27:47] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#python_.28Python3_.2B_Kubernetes.29 [20:28:07] depends on how you start the service [20:28:23] I'm sure it wasn't before [20:28:26] you might be using python3 without knowing [20:28:26] ok got you [20:28:35] please check it [20:28:43] I think default is python 3 [20:29:00] earlier when I was using python 2 locally I got errors in toollabs [20:29:14] but now I don't get any once I moved to python3 [20:29:46] don't assume please check it to be certain [20:30:06] I've already wasted days of my life on working in wrong versions :) [20:30:16] basvb: /me is old school sigh. still coding js as if coffee script, angular2, etc. don't even exist :P [20:30:42] old school 2010 coding? [20:30:50] kind of [20:31:21] As per this help doc I am starting service in python3 only [20:31:28] though python is new school compared to perl [20:31:38] inbefore somebody comments on them using cobal or fortran [20:31:51] Coren was the last Perl coder in WMF afaik [20:32:00] lol [20:32:47] I've done some prolog in the past [20:33:04] 1970s coding :) [20:33:24] assembly? [20:34:23] just logical programming, nothing fancy [20:34:26] * zhuyifei1999_ will read more python3 docs when I get to it [20:34:47] as long as you write print() most work is done [20:35:03] https://docs.python.org/3/reference/datamodel.html <= especially this [20:36:28] any of you knows lisp? 
[20:36:29] I don't know all of that stuff ;) [20:36:44] but I'm allowed to be a hacky programmer as a data scientist [20:36:52] infobliss: a bit, forgot most of it already [20:37:21] but lisp is cool, just not the syntax [20:37:43] I wrote a simple program in lisp yesterday ;) [20:37:56] you have to write the operator before all the operands [20:37:57] cool [20:38:15] I didn't do any serious lisp code though [20:38:23] yeah, super weird [20:38:33] like prefix notation [20:38:46] I had to do my functional programming course in https://en.wikipedia.org/wiki/Clean_(programming_language) [20:38:49] it was terrible [20:39:06] the language was made by the professors in Nijmegen [20:39:08] so we had to use it [20:39:27] basvb: try Haskell [20:39:40] sure it was basicly haskell but then bad [20:39:54] ok. we have functional programming as a course here too [20:39:56] if you had an error it didn't help you at all [20:40:02] I didn't do it though [20:40:13] and some other core errors in the language [20:40:45] * zhuyifei1999_ heard that once you are obsessed with functional programming, magic [20:41:08] i.e. very high productivity [20:41:08] I think it depends a bit on what you do? [20:41:17] but I liked the recursive puzzles yes [20:41:20] that part of it [20:44:22] you both have holidays btw that you are staying up all night? [20:44:55] sleeping can sometimes be difficult [20:46:05] lol [20:46:09] getting up even more so [20:46:33] insomniac [20:46:57] tomorrow I gotta try to implement http://guides.rubyonrails.org/active_model_basics.html API in python 3 (because someone from WMCN asked me to) [20:47:13] that's why I was reading data models [20:47:51] is it written in rubyonrails now? [20:48:12] well, in python that has to be a complete rewrite [20:48:24] python is explicit and ruby is magical :P [20:48:45] this seems pretty core in the programming language? 
[20:48:49] Zen of Python [20:48:52] yes [20:49:23] that's far outside my domain [20:49:37] :P [20:49:37] I'm more of the writing complicated algorithms [20:49:51] yeah [20:50:16] * zhuyifei1999_ isn't good at those [20:50:23] although I don't do that often enough [20:50:31] like to puzzle with logic [20:50:58] logic? try https://docs.python.org/3/reference/datamodel.html#metaclasses [20:51:43] https://en.wikipedia.org/wiki/Propositional_calculus [20:51:56] or just generally complicated puzzles [20:52:10] it can make non-built-in things-that-could-only-be-built-in [20:52:21] eg. java enums in python [20:52:35] * zhuyifei1999_ looks [20:53:40] why is this calculus? [20:53:54] it's logic [20:54:02] I know it under propositional logic [20:54:37] In general terms, a calculus is a formal system that consists of a set of syntactic expressions (well-formed formulas), a distinguished subset of these expressions (axioms), plus a set of formal rules that define a specific binary relation, intended to be interpreted as logical equivalence, on the space of expressions. [20:54:42] that's what the article states on it [20:55:10] wikipedia needs some tl;dr [20:55:37] https://en.wikipedia.org/wiki/First-order_logic is also interesting [20:55:56] I guess you'll get those at uni [21:02:35] then you can start building turing machines https://en.wikipedia.org/wiki/Turing_machine [21:02:42] and program in those [21:03:04] sigh [21:03:26] not a good idea? [21:04:19] turing is too smart [21:04:33] he has a test and a machine [21:04:57] but the concept of the machine is interesting [21:05:25] it basicly tells you what you need to be able to do everything a computer can (theoretically) [21:06:28] it's like the mathimatical proof behind computing [21:11:00] ok I'm off to sleep, nn