[00:46:46] * M4r51n gn/
[10:06:20] Is there a tool to compare item lists, like AWB's list compare?
[10:08:40] Probably not like AWB… but what exactly are you trying to do?
[10:08:54] There are many ways to list entities and certain aspects of them
[10:13:38] addshore: your composer.json and README do not state the required PHP version https://github.com/addwiki/wikibase-api/tree/master
[10:13:54] And filing an issue on phab is too much effort ;p
[10:14:17] "php": ">=5.5"
[10:14:28] based on mediawiki-api-base needing "php": ">=5.5"
[10:14:36] care to make the change? ;)
[10:25:42] DanielK_WMDE: hoo aude Jonas_WMDE: thoughts on https://github.com/wmde/WikibaseDataModel/issues/611 ?
[10:28:07] Commented
[10:28:27] addshore: https://github.com/addwiki/wikibase-api/pull/40 ;)
[10:29:27] JeroenDeDauw: Did you need to call asshore because of that? :P
[10:29:35] + him
[10:29:47] huh?
[10:30:15] hoo: addshore is bad by default. didn't you hear, he's even broken now
[10:30:16] Look at that PR again
[10:30:36] what pr
[10:30:55] * JeroenDeDauw is confuse
[10:31:20] https://github.com/addwiki/wikibase-api/pull/40
[10:31:24] Look at the title :D
[10:31:48] hoo: looks good to me?
[10:32:06] oh
[10:32:07] LOL
[10:32:14] fail
[10:32:54] addshore: your nickname has been altered
[10:33:00] :/
[10:33:06] Adrian_WMDE: Thiemo_WMDE: what about https://gerrit.wikimedia.org/r/#/c/268388/?
[10:33:18] are you on it to get it unblocked?
[10:33:44] JeroenDeDauw: want to force push a new commit message too? ;)
[10:33:51] pfffft
[10:34:00] :P
[10:34:10] Anyway, I got to go… cu o/
[10:34:49] wow, the api is cursing at me
[10:34:55] addshore: you happy now?
[10:34:59] 'continue': {'continue': 'gapcontinue||exturlusage',
[10:35:00] 'gapcontinue': '!!Fuck_you!!'},
[10:35:46] JeroenDeDauw: merged ;)
[10:36:06] mdupont: nice
[10:40:29] the FU shows up at the end of every batch
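For reference, the "cursing" above is just the Action API's continuation mechanism doing its job: each response carries a continue block whose values (here gapcontinue, a page title, in this case a page literally named "!!Fuck_you!!") must be fed back into the next request, which is why it greets you at the end of every batch. A minimal sketch in plain PHP; the endpoint and query parameters are illustrative, not mdupont's actual code:

$params = [
    'action' => 'query',
    'generator' => 'allpages',
    'prop' => 'extlinks',
    'format' => 'json',
    'continue' => '', // opt in to the current continuation scheme
];

do {
    $url = 'https://www.wikidata.org/w/api.php?' . http_build_query( $params );
    $result = json_decode( file_get_contents( $url ), true );

    // ... process $result['query'] here ...

    // Merge the returned continuation values (e.g. gapcontinue, the title of
    // the next page in the generator) into the follow-up request; stop once
    // the API no longer returns a 'continue' block.
    $params = isset( $result['continue'] )
        ? array_merge( $params, $result['continue'] )
        : null;
} while ( $params !== null );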
[11:12:31] Tobi_WMDE_SW: Waiting for Thiemo to look at my answers
[11:29:01] Lydia_WMDE: https://en.wikipedia.org/wiki/Celestial_coordinate_system
[11:31:29] e.g. https://en.wikipedia.org/wiki/Barnard%27s_Star
[11:32:17] https://en.wikipedia.org/wiki/Mercalli_intensity_scale needs formatting as roman numerals :)
[11:33:10] Adrian_WMDE: yeah, Jonas_WMDE and Thiemo_WMDE will look into that today
[11:40:26] Lydia_WMDE: T126445
[12:57:15] aude: Hey, maybe I'm getting it wrong. But I do only one select, then batch the result in packages of 50 rows and send them to the master
[12:57:22] is that a bad way?
[12:57:38] should I select each time?
[13:10:54] OK, I read all the docs possible and couldn't find an answer to it
[13:22:35] hoo: hey, around?
[13:22:40] yes
[13:22:42] what's up?
[13:22:55] re. https://gerrit.wikimedia.org/r/#/c/268874/
[13:23:01] I've some questions
[13:23:04] :D
[13:23:17] I do only one select, but batch the result in packages of 50 rows and send them to the master
[13:23:54] Is that a bad way to do it? Should I select each time, hoo?
[13:24:25] Amir1: Well, you should not try to select more than 1000 or maybe 5000 rows
[13:24:45] default is 1000
[13:24:47] so populating the whole RC (10M+ rows) in one go wouldn't really work well
[13:24:53] yeah, but we have 10M rows to populate
[13:24:58] no no
[13:25:14] We don't want to populate the whole thing
[13:25:27] ut?
[13:25:29] * but
[13:26:21] populating the whole db brings the ores server to its knees
[13:26:47] I have no idea about that
[13:26:57] so… what/how much do you plan to populate?
[13:27:50] this populating has two use cases: 1- once it's deployed, I think scoring edits from the last three days would be enough
[13:27:53] Also, do you plan to populate it for every incoming edit or only for certain ones (say non-bot edits or so)?
[13:28:07] hoo: no, for that we have a hook
[13:28:16] that pushes a job
[13:28:19] I know, but do you plan to run that for all edits?
[13:28:28] Because we have up to 600 edits/minute
[13:28:35] can your service cope with that load?
[13:28:47] I think it does
[13:28:54] because we already pre-cache
[13:28:58] Could do just non-bot edits
[13:29:03] o/
[13:29:06] That will cut down work by 2/3 generally
[13:29:09] o/ :)
[13:29:21] * halfak just woke up and is checking on server processes he started last night
[13:29:37] halfak: Yeah, that would save you a lot https://grafana.wikimedia.org/dashboard/db/wikidata-edits
[13:29:46] but we also have many non-bot users doing a lot of edits via tools
[13:30:21] hoo, yeah. Understood. Those, I think we want to score.
[13:30:22] we deploy the extension on wikidata after the service got deployed into prod
[13:30:36] I think that would be good
[13:31:14] Amir1, +1 for making sure that the maintenance script runs on the most recent edits first
[13:31:27] It does
[13:31:36] that's why it has an order by clause
[13:31:39] :)
[13:32:32] but the hook asking for scores is being run for every edit, no matter what
[13:32:41] I should work on it
[13:32:45] I still fear your select will be expensive, if you run it over 1M+ rows
[13:32:59] it's easy to implement
[13:33:09] because the selectivity will be quite low after some initial rows have been populated
[13:33:34] hoo: we shouldn't run this script for more than 20-50K
[13:34:07] that'll only give you maybe 1 or 2 hours
[13:34:18] excluding bots, maybe 4 or so
[13:34:33] not only because of the extension and db, but because of the ORES service too
[13:35:01] if this only does one request at a time, it shouldn't be able to knock over your service, really
[13:35:16] or do you parallelize the http(s) requests?
[13:35:37] no, we don't parallelize it
[13:36:01] It shouldn't be a problem then
[13:36:05] but it might take a very very long time to finish
[13:36:14] the other requests will be parallel, because jobs don't run in sequence
[13:36:40] hm… maybe it's not ever worth populating then?
[13:36:44] for more than 100K edits it might take more than 24 hours
[13:37:24] but you should really load test your infrastructure and measure where the limits are
[13:37:27] it's not worth populating except for RC and watchlist
[13:37:34] (once it's in production)
[13:37:57] But that will eventually also happen after a few days
[13:38:11] exactly
[13:38:22] this is just a simple kick off
[13:38:40] If you think this is needed, ok
[13:38:51] for about 10K edits
[13:39:10] 10k non-bot edits happen in an hour
[13:39:18] so just wait an hour post deploy and you get that anyway
[13:39:31] kk
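The fix Amir1 says is "easy to implement", skipping bot edits before the hook queues a scoring job, could look roughly like this. The hook handler and job class are hypothetical stand-ins, not the extension's actual code:

class OresHooks { // illustrative class name
    public static function onPageContentSaveComplete( WikiPage $wikiPage, User $user ) {
        // Bot edits are roughly 2/3 of Wikidata's volume, so bailing out
        // here cuts most of the scoring load (per the discussion above).
        if ( $user->isAllowed( 'bot' ) ) {
            return true;
        }

        // Queue one scoring job per non-bot edit; jobs later run in parallel.
        JobQueueGroup::singleton()->push(
            new ScoreRevisionJob( // hypothetical job class
                $wikiPage->getTitle(),
                [ 'revid' => $wikiPage->getLatest() ]
            )
        );

        return true;
    }
}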
[13:39:55] also it has a second use case, for when ores is down. In that time span the extension doesn't store scores of edits
[13:40:01] you could also start by deploying the hook and only enable the user-facing bits a few days after
[13:40:33] once ores is back online we should run this to populate the leftovers
[13:40:48] that's a valid use case
[13:40:48] hoo: in case of Wikidata, yes
[13:40:58] but for that it should support time spans or rc_ids for batching
[13:41:02] Amir1: 50 inserts into master would be ok
[13:41:11] s/batching/selecting/
[13:41:16] but after maybe 500, wait for slaves
[13:41:19] but for small wikis (or maybe even en.wp)
[13:41:29] or just wait for slaves after each insert
[13:41:48] I only really have Wikidata in mind right now, just to be clear. What I say might have no relevance for any other wiki
[13:42:31] aude: the ores service can't handle more than 50 revs at a time
[13:42:48] Amir1: that's ok
[13:42:56] hoo: you're right. I should implement time spans and rev ids
[13:43:15] i'm just thinking of the part where we insert to master
[13:43:37] * aude reads scrollback
[13:43:46] Lydia_WMDE: https://phabricator.wikimedia.org/T126349
[13:45:22] hoo: The biggest use case of the maintenance script is for small wikis once they're kicked off
[13:45:42] with one or two runs we can go back at least a few days
[13:45:48] Amir1: Ok, for those it'll probably just work
[13:45:52] (1K-2K)
[13:46:17] Do you have a ticket about load testing of the service?
[13:46:26] I would like to see that prior to deploy
[13:46:39] Just so that we know when to expect problems
[13:47:14] halfak might help with that, also yuvi
[13:47:54] Keep me updated on that, please. I'm always worried about site stability
[13:48:01] same goes for others, mostly probably aude
[13:48:32] will the service be migrated from labs to production?
[13:48:49] aude: yes, soonish
[13:48:52] aude: yes, before the extension goes live
[13:48:52] ok
[13:49:02] before deployment to Wikidata
[13:49:06] ok
[13:49:26] hoo: this extension might go live for fa.wp (a small wiki)
[13:49:32] before the prod migration
[13:49:40] depends on ops
[13:49:43] sounds good :)
[13:49:48] ugh :/
[13:49:55] fa.wp is not that small though
[13:50:03] I wouldn't want that… also you need a proxy to get from prod appservers/job runners to labs
[13:51:19] that's not impossible to do, but not nice at all
[13:51:25] I'd wait for the production deploy of the service
[13:51:42] if there's no ticket for load testing, please create one.
[13:51:58] I think there are
[13:52:14] but I don't know of it
[13:52:21] hoo, we do a lot of load testing :)
[13:52:32] E.g. our precacher hits ORES in realtime
[13:52:44] And we do a lot of offline analyses where we'll hit ORES as fast as we can.
[13:52:52] So do other people we don't know.
[13:53:23] halfak: Ok, but still I want numbers… there's got to be a point where it'll start falling over.
[13:53:41] If you can handle 5k requests per second, fine… but below that you might run into problems
[13:54:06] We have backpressure that kicks in when our queue hits 100 score requests.
[13:54:26] We can clear 100 requests in 2 seconds.
[13:54:43] So that means once everything has a 2 second delay, we say "slow down".
[13:54:57] the system is horizontally scalable
[13:55:11] But hoo has a point.
[13:55:20] I don't think we have a number for the current worker set.
[13:56:10] I also try to exclude bots from the scoring hooks
[13:56:54] If you need code review for anything or have problems with integrating with MW, please let me or aude know.
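Putting the numbers from this exchange together (a bounded select, most recent edits first, ORES's 50-revision batch limit, and waiting for slaves after a few hundred inserts), the populate script could be sketched like this. Class and helper names are made up for illustration and DB_SLAVE reflects the MediaWiki API of the time:

class PopulateOresScores extends Maintenance { // hypothetical script name
    public function execute() {
        $dbr = $this->getDB( DB_SLAVE );

        // One bounded read; never try to walk all 10M+ recentchanges rows.
        $res = $dbr->select(
            'recentchanges',
            [ 'rc_this_oldid' ],
            [ 'rc_bot' => 0 ], // non-bot edits only
            __METHOD__,
            [ 'ORDER BY' => 'rc_id DESC', 'LIMIT' => 20000 ] // most recent first
        );

        $revIds = [];
        foreach ( $res as $row ) {
            $revIds[] = (int)$row->rc_this_oldid;
        }

        // ORES accepts at most 50 revisions per request, so chunk accordingly.
        // scoreAndStore() (one HTTP request plus up to 50 master inserts) is
        // a hypothetical helper.
        foreach ( array_chunk( $revIds, 50 ) as $i => $batch ) {
            $this->scoreAndStore( $batch );

            // roughly 500 inserts between waits, as suggested above
            if ( ( $i + 1 ) % 10 === 0 ) {
                wfWaitForSlaves();
            }
        }
    }
}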
[13:57:24] hoo, I don't suppose you'd help us review changes to the python-based service, would you?
[13:57:27] :D
[13:58:17] Haven't done Python in a while, so that would make that interesting :D
[13:59:08] but really, I can probably only be of help when it comes to gluing stuff to MediaWiki
[14:02:44] awesome hoo, one question before I start. I want to define a global variable, a bool called wgOresBotsExcluded, true by default. While running the hook for saving an edit, I want to check it. I heard global variables are expensive performance-wise and need to be avoided. Is this correct?
[14:02:50] No worries hoo. Just always on the lookout for good reviewers. :)
[14:04:22] Amir1: Don't worry about that, if you only check it once per edit. I'm not sure about HHVM's behaviour, but I know some JITs have trouble with globals
[14:04:30] because these are hard to nail down to a type
[14:04:43] but that doesn't really matter here, if you don't poll it in a tight loop or so
[14:05:00] function calls are way more expensive (and not troublesome either)
[14:06:35] ok thanks :)
[14:17:11] hoo: https://gerrit.wikimedia.org/r/#/c/269672/
[14:19:31] Looks good
[14:19:50] great :)
[14:20:49] aude: https://gerrit.wikimedia.org/r/269663
[14:20:58] Please check this
[14:30:38] Amir1: :/
[14:31:09] aude: ?
[14:31:28] https://gerrit.wikimedia.org/r/#/c/269555/ the slashes
[14:32:06] aude: woah… since when are we doing that
[14:32:21] * hoo eyes Reedy
[14:41:40] PhpStorm was complaining
[14:42:06] Pretty sure you can shut that up
[14:42:20] not using PhpStorm myself, so dunno how
[14:42:20] Well, it was more that it wasn't seeing the sub functions
[14:42:40] So, presumably, it was looking for those functions in that NS, not in the global scope
[14:42:52] Which may or may not be a bug
[14:43:09] hm… that should really work
[14:43:14] it should follow the PHP semantics
[14:43:25] pretty sure others who use it don't face such problems
[14:43:26] !hss
[14:43:26] ZOMG!! https://upload.wikimedia.org/wikipedia/mediawiki/6/69/Hesaidsemanticga2.jpg
[14:43:27] but not sure
[14:43:46] lol
[14:44:03] :D
[14:53:58] * hoo wonders why we don't have any math properties on WD yet
[14:54:04] Has it not been announced?
[14:56:34] one has been proposd
[14:56:37] proposed*
[14:56:41] I see
[14:56:49] has to be discussed :)
[14:56:56] thought there might already be ones "queued"
[14:57:00] like for other data types
[14:57:12] i don't think that is the case, but could be wrong
[14:57:40] https://www.wikidata.org/wiki/Wikidata:Property_proposal/Pending
[14:57:43] nothing for math
[14:58:06] I see
[14:58:29] I'll be back in an hour or so… cu o/
[14:58:34] ok
[15:15:21] Thiemo_WMDE: Hehe... ChangeOps::validate has this: $entity = unserialize( serialize( $entity ) );
[15:15:26] *sigh*
[15:15:55] DanielK_WMDE: that a problem?
[15:16:55] JeroenDeDauw: right now it's actually good to have it there. But EntityDocument should really have a copy() method. As we see above, we need it.
[15:17:17] More fundamentally, we'd actually need to clone in more places, but we don't because it's expensive, we just hope for the best
[15:17:34] I want immutable entities...
[15:18:14] JeroenDeDauw: unserialize( serialize( $entity ) ) becomes problematic if we use any kind of lazy loading in the data model, btw
[15:22:08] DanielK_WMDE: http://bit.ly/1KbMDqT
[15:24:00] * DanielK_WMDE gets some marshmallows
[15:48:05] aude: around?
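On the wgOresBotsExcluded question: a single global read per edit is harmless, as hoo says; the concern only arises when a global is polled in a tight loop. A minimal sketch of the access pattern; the setting name comes from the chat, the handler is illustrative:

$wgOresBotsExcluded = true; // default, defined in the extension's setup file

function onEditSaved( User $user ) {
    global $wgOresBotsExcluded; // one cheap read per edit, not per loop iteration

    if ( $wgOresBotsExcluded && $user->isAllowed( 'bot' ) ) {
        return true; // configured to skip bot edits entirely
    }
    // ... queue the scoring request as in the earlier sketch ...
    return true;
}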
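The "slashes" change aude points at concerns PHP's name resolution: inside a namespace, an unqualified function call is looked up in that namespace first and only then falls back to the global function, which is presumably what confused PhpStorm. A two-line illustration (the namespace is made up):

namespace Wikibase\Example; // illustrative namespace

$a = strtolower( 'ABC' );  // tries \Wikibase\Example\strtolower(), falls back to \strtolower()
$b = \strtolower( 'ABC' ); // leading backslash: resolved to the global function directly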
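On DanielK_WMDE's point about the serialize() round trip: it is a correct deep copy today, but it serializes the whole object graph, so any lazily loaded member holding a closure or a DB handle would make serialize() throw. A dedicated copy() method clones explicitly instead; a naive sketch of the idea, not WikibaseDataModel's actual code:

class Item /* implements EntityDocument */ {
    private $fingerprint;

    public function copy() {
        // Clone each mutable member explicitly rather than round-tripping
        // through serialize(); lazily loaded parts can then decide whether
        // to resolve themselves or to copy their loader instead.
        $copy = clone $this;
        $copy->fingerprint = clone $this->fingerprint;
        return $copy;
    }
}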
[15:53:52] Amir1: hello! we had a small chat regarding ores the other day. I'd like to know, is there scope for contributing to the python-based ores service? I looked at the github page, but didn't find pointers to start...
[15:55:30] hey codezee, thank you for your interest. Depends on how you want to help
[15:55:49] we have a huge backlog of tasks you can pick up on the phabricator task board
[15:55:56] phabricator.wikimedia.org/tag/revisionscoringasaservice/
[15:56:47] if you want to help in a non-technical manner, we need community liaisons, labelers (especially for Wikidata)
[15:56:50] etc.
[15:57:11] * jzerebecki broke Wikibase CI
[15:57:30] Amir1: as an intro, I'm a computer science student with knowledge of python and an interest in machine learning and related fields, so I'm digging
[15:57:57] Since I've been contributing to mediawiki, I came across wiki-ai
[15:58:05] which interested me
[15:58:29] one of the cool things we are working on is using self-training in our models
[15:58:42] (semi-supervised ML)
[15:58:58] all my previous contributions have been to mediawiki in general (php + js), but now I'm looking for such stuff to work on, if it's possible
[15:59:04] there are several low-hanging fruits we can work on too
[15:59:19] great
[15:59:40] you can join #wikimedia-ai
[15:59:45] also the mailing list
[16:00:02] (ai @ lists)
[16:33:14] Thiemo_WMDE: merged things can be reverted
[16:33:30] * jzerebecki ducks
[16:38:43] Amir1: ?
[17:28:47] hi guys, anyone here to discuss https://gerrit.wikimedia.org/r/#/c/267019 ?
[18:30:52] DanielK_WMDE: why does https://gerrit.wikimedia.org/r/#/c/269657/ fail?
[18:32:21] physikerwelt: i don't think the failure is valid
[18:32:38] we can try +2 again
[18:33:58] the math tests fail as well... https://phabricator.wikimedia.org/T126422 Trying +2 again would probably not harm, but James_F was really not happy with manual merging last time we talked
[18:34:22] physikerwelt: +2 but not submit (let jenkins try again)
[18:42:17] ... by the way... how can we get properties here https://www.wikidata.org/wiki/Special:ListProperties?datatype=math
[18:43:09] is there another chat that discusses the data in wikidata?
[18:52:44] ... it seems that there is some progress on https://www.wikidata.org/wiki/Wikidata:Property_proposal/Natural_science#defining_formula ... how long does it usually take until the new value types are added?
[18:56:04] physikerwelt: properties need to be proposed
[18:56:34] how long it takes... depends on how quickly some consensus forms
[19:09:53] aude, physikerwelt: i think one of the wikibase qunit tests is flaky, but i only noticed it today since i fiddled with our CI config. so maybe it wasn't before.
[19:13:35] jzerebecki: do you by chance know if the php55 tests are executed on different hardware or at least with different packages installed? https://phabricator.wikimedia.org/T126422 indicates that the greek latex package teubner might be missing
[19:15:26] physikerwelt: yes, those run on the next ubuntu lts release compared to php53
[19:18:34] jzerebecki: ok, so I can probably not do anything besides waiting until someone installs the package, or disabling the tests
[19:20:56] physikerwelt: those vms are provisioned with operations/puppet.git ; wikitech can show you the puppet classes that are applied to those nodes.
[19:35:57] Where is the IETF bingo card? ;)
[19:42:14] jzerebecki: a long time ago hashar tried to write a puppet role that installed all the required modules... but it was never reviewed and was finally abandoned...
so it's probably better to remove the tests
[19:47:39] physikerwelt: but it worked before, right? that would indicate that it is puppetized. is it currently used in production?
[19:52:56] physikerwelt: i just suspect that, because the package names are usually different between those distributions, the puppet module needs to be fixed to work on both
[19:53:21] jzerebecki: any idea what is wrong here? https://gerrit.wikimedia.org/r/#/c/269718/
[19:53:49] DanielK_WMDE: Just try again
[19:54:06] That test has been flaky at least all day now
[19:54:11] third time is a charm?...
[19:54:48] yeah :P
[19:56:14] DanielK_WMDE: Do we have an agreement to use the new and shiny array syntax in Wikibase?
[19:56:30] (I don't plan to migrate old code… but I guess someone will do that once we start)
[19:57:02] hoo: does core?
[19:57:03] i see no reason to migrate old code. but i also see no reason not to use the new syntax.
[19:57:30] jzerebecki: I think the RfC said it's ok for new code
[19:57:43] it's not a new language feature, so it should be unproblematic.
[19:57:47] not sure that got reflected in the coding conventions, yet
[19:59:26] jzerebecki: unfortunately I do not remember the details. I think meta packages are used rather than specifying the packages individually
[20:02:06] the package list from vagrant puppet/modules/role/manifests/math.pp:5 is quite stable... but the packages used in production and ci tend to break from time to time
[20:08:34] hoo: not sure i like someone changing the syntax en masse
[20:08:52] but yeah, seems we are doing it for core and it is generally ok
[20:09:05] but i would want team agreement
[20:09:36] Would like to have a team agreement to not mass-change the syntax
[20:09:42] because otherwise it will happen
[20:10:10] +1
[20:10:25] Should we have an RfC ticket for this?
[20:10:28] jzerebecki: do we know more about https://phabricator.wikimedia.org/T126512#2016147 ?
[20:10:31] hoo: yeah
[20:11:01] it is probably just me, but i get "Uncaught TypeError: (language || this.language).toLowerCase is not a function"
[20:11:09] on Special:NewItem locally
[20:11:17] (updated my uls)
[20:11:46] and Uncaught TypeError: $ulsTrigger.tipsy is not a function
[20:12:14] I can try to look into that later on
[20:13:05] in my debugger, language is usually a language code (always "es")
[20:13:16] but then one of them is an OO widget
[20:35:29] hi, is there a field for spoken articles in a wikidata item?
[20:41:58] aude, hoo|away, jzerebecki: https://gerrit.wikimedia.org/r/#/c/269745/
[20:44:07] aude: I don't know more than what is on the ticket
[20:45:17] Helmoony: i don't know, but it would be nice! Not sure how best to do that, though. You'd have to specify the language as a qualifier, i suppose.
[20:51:21] physikerwelt____: any idea what an involved font name or package name might be?
[20:53:30] scratch my question. i'm still reading the bug
[20:57:56] DanielK_WMDE: i saw
[21:04:02] jzerebecki: I commented on phabricator
[21:53:52] DanielK_WMDE, thank you
[22:01:52] Lydia_WMDE: Hi
[22:03:31] d3r1ck: hi
[22:05:34] Thank God hoo is around too :)
[22:05:48] hey d3r1ck :)
[22:05:54] Lydia_WMDE: i sent you a mail, did you receive it?
[22:06:17] d3r1ck: yes, but i did not get to reading it yet. it has been a super busy day
[22:06:46] Lydia_WMDE: Ohhh, sorry to hear that. Please don't overstress yourself, take it easy :)
[22:06:57] heh
[22:07:00] i am trying
[22:07:24] Lydia_WMDE: so i guess you did not have the discussion with your team members today?
[22:07:37] about the prject?
[22:07:40] *project
[22:08:00] d3r1ck: we did talk about it briefly and the next thing i need to do is send an email to the guy who worked on it for wikipedia
[22:09:19] Lydia_WMDE: Ok, thanks for that. Can you Cc me on the mail so that i can hook up with him?
[22:09:34] I think hoo too, but he has to agree first :)
[22:10:01] hoo: should Lydia_WMDE Cc you on the mail she wants to send to the IFTTT Wikipedia guy?
[22:11:03] She can, but I don't think there's a need for me to be involved on that level
[22:12:12] hoo: If you say so.
[22:12:37] Lydia_WMDE: Well, we have all heard from hoo. So you can go ahead and do what you had in mind. I will be waiting for the next step.
[22:12:50] Lydia_WMDE: So after that, what is the next thing i am supposed to do?
[22:13:21] I think it might be good if you start by picking up some easy tasks around Wikidata, to get into it
[22:13:29] GSoC is a bit away still, also
[22:14:40] hoo: yeah, that's what i am doing. i did one yesterday, even though it was more related to MobileFrontend
[22:15:14] hoo: and it was fine, but jdlrobson took it to a higher level since there was something to discuss about the bug
[22:15:28] Actually the bug was even filed by Lydia_WMDE :0
[22:15:32] * :)
[22:16:02] hoo: That's actually what i am doing, i am working on that. That's a preliminary step and i must do it :)
[22:17:48] https://phabricator.wikimedia.org/T111016 https://phabricator.wikimedia.org/T50799 https://phabricator.wikimedia.org/T66315
[22:18:01] these bugs shouldn't be too hard to fix (but not trivial either)
[22:18:11] if you are interested in one, go ahead :)
[22:18:31] There are many more, these are just a few
[22:19:44] hoo: Ok, let me commence