[03:05:57] does anyone have any idea what this is? https://www.wikidata.org/wiki/Q5849656
[03:11:50] yurik: one of 6 subregions that make up one of the 32 departments of Colombia
[03:12:30] mutante, i guess it should be organized differently, right?
[03:12:45] is that an admin division?
[03:12:54] i got very confused by that one for some reason
[03:13:13] i think it should use "is in the admin region"
[03:13:17] and "Caldas"
[03:13:20] yes
[03:13:24] * yurik is cleaning up all the OSM admin levels and tying them to wikidata
[03:13:43] sigh, lots of work is ahead of us there :(
[03:13:45] "located in the administrative territorial entity"
[03:13:48] that's what i meant
[03:13:48] right
[03:14:16] i suspect the graph ext can be used to generate the whole tree...
[03:14:25] country is divided into "departments" and the next level under that
[03:14:26] too bad it will be humongous
[03:14:28] is this "subregion"
[03:15:08] right, thanks, the map at https://es.wikipedia.org/wiki/Bajo_Occidente kinda cleaned it up for me
[08:48:35] Hello, I'm wondering if there's any way to conveniently edit wikidata without having to enable javascript
[10:13:36] Just to give another update about the dumps: The TTL dumps will arrive soon, but the JSON dumps will still run for a while
[10:14:24] We have one known performance problem which affects both dump types
[10:14:34] and probably another thing which only affects the JSON dumps
[10:14:39] I'm going to investigate
[10:20:13] or maybe it's just the general slowdown and the json runners were unlucky… mh
[10:28:33] DanielK_WMDE: FYI: https://phabricator.wikimedia.org/T151356
[10:28:47] That's why our dumpers are annoyingly slow now
[10:29:08] According to Jaime we hit a Maria bug there…
[10:30:19] (It's even worse this week, because we were also hit by a database server fault, thus the process had to be restarted)
[10:39:26] hoo: Maria bug? nice :D
[10:39:58] hoo: it's picking the wrong index, because of page_is_redirect = 0 ?
That's stupid
[10:41:18] hoo: wait... the EXPLAIN says it's using the same index. And in the second case, it scans *way* more rows... that should be a lot worse...
[10:44:13] Well, these are just estimations
[10:44:25] the key difference is that it's not doing a range select
[10:47:12] maybe we just need to regenerate the table stats?
[10:47:16] btw, i replied to https://gerrit.wikimedia.org/r/#/c/322644/3
[10:47:20] addshore: --^
[10:47:42] DanielK_WMDE: yup, im going to amend it now :)
[10:47:53] \o/
[10:49:00] DanielK_WMDE: are there any other things like this that should probably be in core?
[10:49:43] addshore: everything ;)
[10:49:52] haha
[10:49:59] nothing light-weight and self-contained comes to mind.
[10:50:12] i'll let you know when i come across something
[10:50:32] oh, one thing: TitleFactory.
[10:50:48] DanielK_WMDE: TitleFactory, DataUpdateAdapter, GenericEventDispatcher
[10:50:54] for more, grep for "aw man this should really be in core, yo!"
[10:50:57] That's what a quick grep gave me
[10:51:03] * addshore writes a post it note containing those names
[10:51:20] hoo: heh! how close is your grep to my suggestion?
[10:51:50] hoo: i think there already is something like DataUpdateAdapter in core. CallableDataUpdate or something
[10:52:20] yeh there is I think
[10:52:33] hoo: what timezone is aaron in?
[10:52:46] but it's not quite the same thing
[10:52:46] addshore: pacific
[10:53:08] cool!
[10:53:16] it's thanksgiving week, though
[10:53:21] lots of wolks are not working
[10:53:25] *folks
[10:53:27] ahh true, and it is a fair way through the week now too
[10:54:10] so maybe wait until tonight, and merge tomorrow if he doesn't comment?
[10:54:25] having aaron's input on this would be nice, but i don't think we need to block on that
[10:54:54] if you really want his opinion, perhaps send him an email
[10:56:47] DanielK_WMDE: Ok with me, if the change is done and there are tests for it
[11:23:02] which project in phabricator do I enter a problem with the list of interwiki links shown in wikipedia under?
[11:24:28] nikki: Maybe there's a mediawiki-interwiki
[11:24:51] so there is :)
[11:33:24] My train is arriving… will disconnect now
[11:33:25] cu o&
[11:33:27] * o/
[12:13:35] nikki: Have you used the query service to calculate a percentage?
[12:14:18] not yet, but it sounds like it should be possible
[12:15:16] I'm adding https://www.wikidata.org/wiki/Property:P350 to paintings. Would be nice to be able to calculate the coverage
[12:22:25] hmm... http://tinyurl.com/zjv5mq2 is the only approach I can think of at the moment (i.e. one subquery to return how many should have it, another to return how many do and then calculate the percentage)
[12:22:39] nikki: http://tinyurl.com/hcg73w4 seems to work! :-)
[12:22:55] ah :)
[12:23:15] yours looks nicer than mine
[12:31:17] nikki: Took a bit of tweaking: https://www.wikidata.org/wiki/User:Multichill/Painting_collections_RKDimages
[12:31:55] cool :)
[16:23:26] yurik: how are you finding/matching things in osm?
[16:43:12] nikki: i think it's based on wikidata tags in osm
[16:43:22] not sure yurik is doing it or someone else
[16:43:41] or maybe looking at wikipedia tags would also work
[16:43:53] it is based on wikidata tags in osm but that's not what I meant
[16:44:09] what do you mean?
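The two-count approach described at 12:22:25 (one count of items that should have the property, one count of items that do) comes down to simple arithmetic once both counts are available. A minimal sketch, assuming the counts have already been fetched from the query service; the function name `coverage_percent` and the numbers in the usage line are hypothetical and not taken from the linked queries.

```python
def coverage_percent(have: int, should_have: int) -> float:
    """Return the share of items carrying the property, as a percentage.

    `have` is the number of items that carry the property and
    `should_have` is the number of items expected to carry it.
    Both values are assumed to come from two separate count queries.
    """
    if should_have <= 0:
        raise ValueError("should_have must be positive")
    if not 0 <= have <= should_have:
        raise ValueError("have must be between 0 and should_have")
    return 100.0 * have / should_have


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(coverage_percent(25, 200))
```

Keeping the arithmetic outside the query makes it easy to recompute coverage per collection from a table of counts.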
[16:44:58] they were adding wikidata tags to osm and I'm wondering how they're finding the things which need tags adding (and which wikidata id to add)
[16:46:38] because when I've tried to find and add missing wikidata ids, it's been really slow and annoying to do :/
[16:50:01] looks like mapbox is working on this
[16:50:15] http://wiki.openstreetmap.org/wiki/Wikidata#Linking_from_OSM_to_Wikidata
[16:50:26] and there is support for this in josm and ID editor
[16:50:53] and using the tasking manager (http://tasks.osmcanada.ca/project/42)
[16:52:21] and https://forum.openstreetmap.org/viewtopic.php?pid=618646#p618646 (what yurik is doing, using the wikipedia tags)
[16:53:34] ah
[16:54:28] nikki, here's how we're doing it at Mapbox https://github.com/mapbox/mapping/issues/242
[16:54:39] hi aude :D
[16:55:35] hi planemad :)
[16:56:08] we first find a list of potential matches by matching an input set of OSM features to Wikidata entries with coordinates and the same name, and then filter it down by match distance
[16:56:32] surprisingly good results atlas for cities and towns
[16:56:37] *atleast
[16:57:48] biking, home back in an hour
[17:52:40] aude, nikki, planemad - i found a great way to massively attach wikidata IDs in OSM: I download all admin objects using relation["wikipedia"]["wikidata"!~".*"]["admin_level"="6"]({{bbox}}); select all relations, and fetch matching wikidata IDs using wikipedia plugin
[17:52:57] so far i did all 1..5, and some of 6
[17:53:48] now i need to somehow get a tree from wikidata of all admin districts
[17:55:12] ones which never had a wikipedia tag won't get a wikidata tag with your method, I guess
[17:56:00] right
[17:56:40] but considering that i already did 5000+ of them, not many are missing
[17:57:32] yurik, nice! surprised this did not raise any flags with the OSM community with such a massive change
[17:57:44] nikki, do you know how it would be possible to get a tree data in SPARQL? given the root (e.g.
USA), gets all states, plus all districts for those states, etc
[17:57:58] planemad, they did - i stopped at 6
[17:58:16] now under discussion on the same thread as the bot
[17:58:22] ah ok, the community works :)
[17:58:29] yep :)
[17:58:39] yurik, can you share the link to the discussion
[17:58:40] i slowed down a lot, and started resolving many by hand
[17:58:58] https://forum.openstreetmap.org/viewtopic.php?id=56436
[18:01:21] yurik: Blazegraph is a graph engine so getting a tree shouldn't be too hard. Did you try https://tools.wmflabs.org/wikidata-todo/tree.html?lang=en&q=Q701&rp=131 ?
[18:03:22] Or https://tools.wmflabs.org/wikidata-todo/tree.html?lang=en&q=Q55&rp=131&depth=2 if that doesn't kill your browser :P
[18:04:58] yurik: Another question for you, do any of our visualization libraries support Venn diagrams? I would like to make Venn diagrams based on SPARQL
[18:33:22] yurik: I'm not having much luck :( it should be possible to fetch stuff somehow, but everything I try is timing out
[18:34:22] there could be some recursion
[18:34:33] not sure how well the query service handles that
[18:41:05] who knows... it's not like I can make any sense of how it works
[18:42:54] like http://tinyurl.com/zchk69l times out, but http://tinyurl.com/gsuhqhy is really quick, the only difference is that I tried to set the country as a variable
[18:43:12] changing the "bind" to "values" doesn't help either
[18:45:42] nikki, what about the graph examples - they use some weird service that creates nested graphing?
[18:46:15] nikkj http://tinyurl.com/hnqskun
[18:46:19] nikki,
[18:46:32] SMalyshev might know about this?
[18:47:19] yurik: about?
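The matching approach described at 16:56:08 (pair OSM features with Wikidata entries that share a name, then filter by distance) can be sketched as follows. This is only an illustration under stated assumptions, not the Mapbox implementation: the record fields (`osm_id`, `qid`, `name`, `lat`, `lon`), the function names, and the 10 km default threshold are all hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def match_by_name_and_distance(osm_features, wikidata_entries, max_km=10.0):
    """Return (osm_id, qid, distance_km) for same-name pairs within max_km.

    Step 1: index the Wikidata entries by case-folded name.
    Step 2: for each OSM feature, keep only same-name candidates whose
            coordinates lie within max_km of the feature.
    """
    by_name = {}
    for entry in wikidata_entries:
        by_name.setdefault(entry["name"].casefold(), []).append(entry)

    matches = []
    for feature in osm_features:
        for entry in by_name.get(feature["name"].casefold(), []):
            distance = haversine_km(feature["lat"], feature["lon"],
                                    entry["lat"], entry["lon"])
            if distance <= max_km:
                matches.append((feature["osm_id"], entry["qid"], distance))
    return matches
```

Exact name equality is the simplest possible candidate filter; a feature with several surviving candidates would still need manual review.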
[18:48:43] SMalyshev, that query you made a while ago for the subregions - would it be possible to generalize it - as in given an admin region, it gets all the subregions, plus their subregions, etc, up to N levels
[18:49:10] this way we can draw a tree, and also to validate the whole subtree without going into each subbranch one by one
[18:49:22] yurik: maybe... but it requires good data I suspect
[18:49:24] we could try
[18:49:39] SMalyshev, well, i need it specifically so i can validate data and compare it with OSM
[18:49:44] and make sure both are in sync
[18:50:34] + 'category_tree' => Array ()
[18:50:34] + 'primary_category' => null
[18:50:34] + 'featured' => null
[18:50:34] + 'dcom_tags' => null
[18:50:39] gah ooop
[18:52:55] multichill, you can draw anything you want with vega - the problem is to actually draw it. Vega is like C rather than PHP - you have to do everything by hand :)
[18:53:15] multichill, which specific graph would you like to make?
[20:14:25] it seems there was an important update to pywikibot recently?
[20:51:55] SMalyshev, so what do you think - is it possible to generate a graph just like http://tinyurl.com/hnqskun for administrative subregions?
[20:52:25] it's ok for it to be bad quality - it's perfect because we can see mistakes and fix them
[20:52:38] yurik: probably
[20:52:48] yurik: let's try it. Which region do you want?
[20:52:52] SMalyshev, i tried it yesterday, but my query was timing out
[20:53:01] SMalyshev, UK
[20:53:14] with all subregions... say 3 levels deep
[20:59:38] 3 levels is ton of stuff
[20:59:50] how many things do you want there?
[21:00:17] you can get like every pub, bridge and road crossing in scotland
[21:01:31] SMalyshev, no no, it has to be only cities/villages/...
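The request at 18:48:43 (given an admin region, get its subregions, their subregions, and so on up to N levels) amounts to a depth-limited breadth-first traversal over reversed "located in" links. A minimal sketch over an in-memory mapping, assuming the item-to-parent pairs have already been exported; the function name `subdivisions` and the `located_in` input shape are hypothetical, and this is not the query that was run on the query service.

```python
from collections import deque


def subdivisions(root, located_in, max_depth):
    """Return {item: depth} for every item within max_depth levels below root.

    located_in maps an item id to the list of ids it is located in
    (its P131 values). The mapping is reversed first, so the traversal
    walks from a region down to the items that point at it.
    """
    children = {}
    for item, parents in located_in.items():
        for parent in parents:
            children.setdefault(parent, []).append(item)

    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand below the requested level
        for child in sorted(children.get(node, [])):
            if child not in depth:  # first visit gives the shortest depth
                depth[child] = depth[node] + 1
                queue.append(child)

    del depth[root]  # report only the subdivisions, not the root itself
    return depth
```

Because each item is recorded on first visit only, an item reachable through several parents appears once, at its shallowest depth.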
[21:01:43] basically the administrative division/sub-division of the world
[21:02:10] SMalyshev, https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative#10_admin_level_values_for_specific_countries
[21:02:21] OSM describes it with a number
[21:03:38] yurik: hmm I'm not sure which property to use...
[21:04:01] SMalyshev, i think it's identical to what you did for the subregions graph
[21:04:45] http://tinyurl.com/j39xsyb
[21:04:47] SMalyshev, ^
[21:05:36] yurik: ah here's a problem. I used there specific classes like "province" etc
[21:05:45] but I can't do it on multiple steps
[21:06:01] I have no idea what second-level division of UK is called
[21:07:06] SMalyshev, but i thought you don't need to specify it - you can simply specify "instance of or a subclass of" Q56061 -- administrative territorial entity
[21:07:27] yurik: well, ok, I could try that
[21:08:28] SMalyshev, did you ever use the graph building service? I was trying to adapt it, but because it has to be "reverse" rather than "forward" (plus a filter to get rid of anything unrelated), i think it timed out
[21:09:09] yeah, all admin division types should at least be subclasses of that, you could look for ones which are subclasses of both https://www.wikidata.org/wiki/Q13220204 and https://www.wikidata.org/wiki/Q717478 but it depends how well modelled it is
[21:09:12] by reverse i mean - given an item, find all items that list that item as their P131, and who themselves are admin entities
[21:09:31] yurik: for 2 levels, I get 1179 items
[21:09:43] SMalyshev, that's pretty good i guess?
[21:09:44] I expect for 3 levels it would be order of magnitude more
[21:10:14] I'm not sure I even want to try to graph it... it'd be a mess
[21:10:19] already gives me something to work with? but even 2 would be good, if they are clean - e.g.
no dups (multiple P131 would be weird)
[21:10:35] so try http://tinyurl.com/zyol45u
[21:10:35] SMalyshev, it should show a semi-balanced tree i think
[21:11:33] I don't think SSSP gives duplicates. BFS might, not sure
[21:11:58] uuu...... pretty picture :)
[21:12:08] i switched it to the graph mode, taking a while to rebalance
[21:12:22] that's what I thought
[21:12:47] yurik: https://en.wikipedia.org/wiki/Venn_diagram for sets retrieved using SPARQL
[21:13:34] For example, items for humans that are painters + items for humans that have ulan + items for humans that have RKDartists
[21:13:53] Jersey is kind of sitting alone
[21:14:36] and I think Scotland is about to separate from England :) but don't think that graph will converge soon
[21:16:14] yurik: anyway, does it look like what you were looking for?
[21:16:54] SMalyshev, yes, thanks! Is it possible to run it for 3+ levels though? I would love to get all of subdivision items this way (not graph it of course :)
[21:17:26] yurik: yeah change maxIterations
[21:17:39] this is needed for data quality control as well - capture multiple p131, or those that are not pointing anywhere... would be cool to run it for the whole planet
[21:17:45] i'm worried about that param :)
[21:17:48] ok, thanks!
[21:18:52] I wonder if jersey should even be there... it isn't technically in the uk, as I understand it
[21:19:24] nikki: it's listed as P131 in uk I guess
[21:19:35] yeah, I'm just questioning the p131 statement :)
[21:20:18] that's above my paygrade, let UK politicians worry about that :)
[21:22:53] ooh http://tinyurl.com/jg2sd5k the graph for taiwan works pretty nicely (even with the inconsistencies in whether to link the two provinces or not)
[21:24:09] yeah taiwan renders nicely as a graph
[21:37:44] even with 3 levels
[22:40:51] jzerebecki: ping
[22:41:30] I'd like someone to look at https://phabricator.wikimedia.org/T126146 - which is blocking a UBN!
issue
[23:24:50] DanielK_WMDE_: hm… do you have any idea what to poke at regarding the JSON dumps being incredibly slow?
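For the Venn diagram idea raised at 18:04:58 and 21:13:34 (painters, items with ULAN, items with RKDartists), the numbers a drawing library would need are the sizes of the mutually exclusive regions. A minimal sketch, assuming each set of item ids has already been retrieved with its own query; the function name `venn_regions` and the set labels are hypothetical, and no particular visualization library is implied.

```python
def venn_regions(sets):
    """Count items in each exclusive Venn region.

    sets maps a label to a set of item ids. The result maps a
    frozenset of labels to the number of items that belong to exactly
    those sets and to no others. Empty regions are omitted.
    """
    labels = sorted(sets)
    universe = set().union(*sets.values())
    regions = {}
    for item in universe:
        # The region is identified by the labels of all sets holding the item.
        key = frozenset(label for label in labels if item in sets[label])
        regions[key] = regions.get(key, 0) + 1
    return regions
```

The region counts always sum to the number of distinct items, which gives a quick sanity check before drawing anything.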