[00:10:48] hm, i wonder if there is a way to only do a query on a partial set of the Qs
[05:12:27] hello
[10:54:48] not sure if it is admissible: https://www.wikidata.org/wiki/Q56708228
[10:55:33] Tpt[m]: usual SEO spam
[10:56:33] I think all identifiers are editable by users?
[10:57:45] yes, I think so
[11:06:13] https://www.wikidata.org/wiki/Special:WhatLinksHere/Q56279965 is a good honey pot
[13:13:22] SothoTalKer_: what sort of partial set are you thinking of?
[13:16:47] sjoerddebruin: I wish we had a better idea of which identifiers can make things notable and which can't
[13:18:38] I was working on a script a while back to check various things to see if something is notable, and the one thing that I can't really do is check whether there are notable identifiers
[13:19:42] since there doesn't seem to be any way to find out which ones can be ignored
[13:19:46] nikki: well, limiting the range of Qs that are searched by a query
[13:23:50] if you mean something like search everything from Q40000000 to Q50000000, then I don't think you can
[13:24:18] ok :(
[13:24:35] what are you actually trying to do?
[13:25:09] well, my query times out so i wanted to do it in parts (:
[13:27:50] and what are you trying to query for? :P
[13:27:54] maybe it can be altered to not time out
[13:28:39] http://tinyurl.com/ya2dxnmr
[13:37:03] replacing the service line with optional { ?item rdfs:label ?itemLabel filter (lang(?itemLabel) = "en") } makes it return in 39 seconds for me, and then commenting out the limits returns all of them in 52 seconds
[13:38:00] I don't use the label service much because it seems to be slower (and I don't like having to manually select all the variables) but I wonder if there's a way to speed it up
[13:44:34] so it's just the label service which is so slow? :x
[13:45:17] thanks :3
[13:48:07] hm, if i use that i should be able to get rid of the second query and do it all at once
[13:48:49] probably :)
[13:49:15] replacing the minus { ... } bits with optional { ... } and filter (!bound(...)) also seems to be a little bit faster here
[13:50:58] ok, i cant get rid of the second query, else i get a bad aggregate error :x
[13:51:32] to bad there's no way to get all the results just as is and then remove all the doubles
[13:51:36] too
[13:54:30] you probably forgot to group by ?itemLabel as well
[13:54:41] but when I try to remove the second query it times out again :/
[13:55:33] i probably have to reduce the set even further. The time it takes seems to depend on the load of the server, heh.
[13:57:25] like the optional rdfs query still has to work on a larger set. Maybe if I make multiple subqueries it gets faster :D
[13:58:10] it might do, if it removes enough rows
[14:04:01] ah well, i go with the 2 query and the rdfs solution for now :)
[14:04:10] thank you :3
[14:06:15] you're welcome :D
[14:10:20] and now my spreadsheet is slow :D
[14:15:42] I can't help with that :P
[14:18:17] i wonder if the real excel would be faster than libreoffice.
[14:28:32] nikki: there are curated identifiers and uncurated ones...
[14:28:49] https://www.wikidata.org/wiki/Q19595382 is a mess for example
[14:29:03] "authority control" is curated, so why does ORCID has this for example?
[14:35:29] i don't think the curation works very well. I cleaned up a lot of them (:
[14:47:47] I don't think everything with an identifier marked as authority control would necessarily be notable
[14:49:15] like for places, osm relation id, tripadvisor id, facebook places id...
[15:43:56] can an admin please deal with https://www.wikidata.org/wiki/Special:Contributions/Abs.Alg_03 they are mass spamming
[17:03:42] for anyone remotely interested https://addshore.com/2018/12/wikidata-architecture-overview-diagrams/
[17:04:43] Thanks! :D
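
Note on the two query rewrites suggested at 13:37 and 13:49: the sketch below shows roughly what they look like when applied to a query string. The query behind the tinyurl link is not quoted in the log, so `ORIGINAL_QUERY` is a hypothetical stand-in, and the helper names (`label_optional`, `replace_label_service`, `exclusion`) are made up for illustration. It only rewrites text and does not contact the query service, so it says nothing about the timings mentioned above.

```python
import re

# Hypothetical stand-in: the real query behind the tinyurl link is not in the log.
ORIGINAL_QUERY = """\
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q5 .
  MINUS { ?item wdt:P569 ?born }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
LIMIT 100
"""

# Matches a single-line label-service clause and captures its indentation.
SERVICE_LINE = re.compile(
    r"^([ \t]*)SERVICE[ \t]+wikibase:label[ \t]*\{.*\}[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def label_optional(var="item", lang="en"):
    """Return the OPTIONAL rdfs:label clause quoted at 13:37 for ?var."""
    return (
        f"OPTIONAL {{ ?{var} rdfs:label ?{var}Label "
        f'FILTER (LANG(?{var}Label) = "{lang}") }}'
    )


def replace_label_service(query, var="item", lang="en"):
    """Swap the label-service line for the OPTIONAL rdfs:label clause."""
    clause = label_optional(var, lang)
    # A function replacement avoids backslash handling in the substituted text.
    return SERVICE_LINE.sub(lambda m: m.group(1) + clause, query)


def exclusion(pattern, var, use_minus=False):
    """Exclude rows matching pattern, as MINUS or as OPTIONAL + !BOUND (13:49)."""
    if use_minus:
        return f"MINUS {{ {pattern} }}"
    return f"OPTIONAL {{ {pattern} }} FILTER (!BOUND(?{var}))"


if __name__ == "__main__":
    print(replace_label_service(ORIGINAL_QUERY))
    print(exclusion("?item wdt:P569 ?born", "born"))
```

The OPTIONAL plus `!BOUND` form gives the same rows as MINUS only when the tested variable (`?born` here) is not bound anywhere else in the query, so treat the second helper as a pattern to adapt by hand rather than a drop-in replacement. If the rewritten query also aggregates, `?itemLabel` has to appear in the GROUP BY as well, as pointed out at 13:54.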