[01:08:37] Hey people
[01:09:00] What about import of structured data from Wikipedia infoboxes?
[01:09:12] I see it's not all there yet.
[01:09:53] It's not
[01:09:54] Is there an existing plan for world domination for the future - or do we do that today already? If so, how?
[01:11:10] World domination as in? Having ALL the data?
[01:12:38] Organized effort to get it done - maybe automated import of everything, or special tools to select things for inclusion one by one - or whatever...
[01:12:55] There are various tools
[01:13:35] Seems an obvious idea to go for all infobox data...
[01:13:35] but there's no real plan to import everything... individuals import what they (and the community) find useful, from both Wikimedia sites (Wikipedia, ...) as well as 3rd party sources
[01:14:02] Could you give me some pointers on tools?
[01:14:14] That'd be interesting.
[01:15:18] Are there just bots for that, or other means as well?
[01:15:34] Offhand I can point you to https://tools.wmflabs.org/autolist/
[01:15:45] which can be used based on categories etc.
[01:15:57] I know that there are various other tools that use infoboxes etc.
[01:16:06] "etc." doesn't seem to encompass infoboxes.
[01:16:07] but I've never used these, so don't really know them
[01:16:19] in case of Autolist.
[01:16:34] yeah, Autolist isn't able to read infoboxes (or other parts of wikitext)
[01:16:58] What I read sounded as if there could be a world domination plan called "phase II" for that stuff...
[01:19:18] I was not able to find tools for infobox data so far... :-(
[10:57:10] tarrow: got more sparql foo for me? ;)
[10:57:26] perhaps! Let me know what you need?
[10:58:33] well, I want a query over a whole tree of P31 / instance of
[10:59:05] e.g. I want all things that are an instance of this, or an instance of a subclass of this
[11:00:18] *looks for an example in the sparql examples*
[11:03:01] hmmm, ?race wdt:P31/wdt:P279* wd:Q1968664 ?
[11:04:29] yes tarrow, that's it! :) win!
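The path expression agreed on above can be written out as a complete query; a minimal sketch, using the wd:Q1968664 class and ?race variable from the conversation:

```sparql
# Everything that is an instance (P31) of Q1968664,
# or an instance of any transitive subclass (P279*) of it.
SELECT ?race WHERE {
  ?race wdt:P31/wdt:P279* wd:Q1968664 .
}
```

The `*` makes the P279 step optional and repeatable, so direct instances are included along with instances of arbitrarily deep subclasses.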
[11:04:53] :D
[11:42:42] hmm, this one seems to give me a very off number...
[11:42:42] SELECT (count(?r) AS ?rcount) WHERE { ?r ?x }
[11:47:56] hi, is someone using the wikidata vagrant role... I wonder if additional configuration is needed to enable the interwiki links... currently I don't see anything there http://math.beta.wmflabs.org/wiki/Non-Archimedean_ordered_field
[12:49:38] DanielK_WMDE_: entity_stats table, columns( entityId, statName, count) ;)
[15:57:59] I'm trying to configure my wiki to allow users to select a wikidata item... is there an example where something like that has already been done?
[16:02:32] physikerwelt: a remote wikibase client pulling data from wikidata.org?
[16:03:14] Lydia_WMDE: o/
[16:03:22] Amir1: hey
[16:03:48] let's go to skype
[16:04:05] ok
[16:04:37] I'm trying to submit an edit through the API Sandbox, and the API is unable to decode this snak value: {"amount": "+55", "unit": "http://www.wikidata.org/entity/Q76", "upperBound": "+56", "lowerBound": "+54" }
[16:04:47] Even though it's commensurate with the API output?
[16:04:51] harej: which api module?
[16:06:43] benestar: any idea how to use the sparql endpoint to get a TSV / basic list instead of json?
[16:07:24] I see the UI thing does format=json but format=tsv just returns XML.. do I really have to do it using headers? any idea?
[16:19:29] addshore: yes
[16:20:07] physikerwelt: https://phabricator.wikimedia.org/T48556
[16:21:39] addshore: looking
[16:23:22] benestar: meh, it looks like the query engine throws an exception halfway through the list anyway :/
[16:23:37] * addshore is frustrated by how much execution time adding distinct to a count causes
[16:24:34] addshore: wbcreateclaim
[16:24:42] *looks at the docs*
[16:24:50] addshore: do you think it makes sense to look into the change you started
[16:25:02] The docs... kinda suck. Unless you have access to far richer documentation than I do.
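For reference, the quoted quantity value would be posted to the wbcreateclaim module roughly like this; a sketch only (Q4115189 is the sandbox item used later in the log, the property id is a placeholder, and the token is elided):

```
action=wbcreateclaim
entity=Q4115189
property=PXXX        (placeholder property id)
snaktype=value
value={"amount":"+55","unit":"http://www.wikidata.org/entity/Q76","upperBound":"+56","lowerBound":"+54"}
token=...
```

The value parameter is a JSON string, so any stray character inside it makes the whole snak undecodable — which turns out to be exactly what happened further down.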
[16:25:26] addshore: I'm currently wondering why my testwiki even knows that it's not wikipedia
[16:25:51] http://math.beta.wmflabs.org/wiki/Chua's_circuit has no interwiki links while https://en.wikipedia.org/wiki/Chua's_circuit has
[16:25:54] format Available in versions after 1.4.0. This is an optional query parameter that allows you to set the result type other than via the Accept headers. Valid values are json, xml, application/sparql-results+json, and application/sparql-results+xml. json and xml are simple shortcuts for the full MIME type specification. Setting this parameter will override any Accept header that is present.
[16:25:57] addshore: --^
[16:26:55] harej: nope, the same docs ;)
[16:27:55] harej: why are you trying to set the unit to Barack Obama? :P
[16:28:15] It's a silly test.
[16:28:35] Though for what it's worth I get the same error even when I try setting the unit to int 1
[16:29:56] harej: works for me with the data you have given
[16:30:01] https://usercontent.irccloud-cdn.com/file/QWO8jdEj/
[16:30:06] what error exactly were you getting?
[16:30:22] physikerwelt: it would be great, but right now the team is concentrating on other things!
[16:31:14] benestar: mhhhm, I guess headers it is! I wonder if a different format might avoid the time-out in the engine.... I don't know if it already has the full result and times out just sending the data, or if it actually died when doing the query
[16:31:47] addshore: did you try using ?explain?
[16:31:53] that really helps understanding what blazegraph actually does
[16:32:36] benestar: where, in the query? or request?
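The documentation quoted above means the result type can be forced per request; as an illustration (the endpoint path is assumed to be the standard WDQS one):

```
GET https://query.wikidata.org/sparql?query=...&format=json
```

Note that tsv is not among the listed valid values, which would explain why format=tsv was silently ignored; if the endpoint serves TSV at all, it would presumably be via an Accept header such as text/tab-separated-values rather than this parameter.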
[16:33:06] addshore: just add explain=true to the query
[16:33:24] * to the http request
[16:33:58] addshore: I thought that it might be relatively easy to concrete something along the lines of https://www.mediawiki.org/wiki/Manual:Enabling_autocomplete_in_a_form and to use the autocompletion of the search field of wikidata.org
[16:34:19] s/concrete/complete
[16:34:45] benestar: oooooh, that looks nice
[16:36:21] physikerwelt: there is slightly more integration than just stuff like that :0
[16:36:22] :)
[16:37:02] addshore... that is all that I would need at this very moment
[16:37:43] hmm, I don't follow, auto-completion of search? for what purpose? it doesn't allow you to actually do anything
[16:37:47] addshore: the error I get is invalid snak (could not decode snak value)
[16:38:02] Aka wikibase-api-invalid-snak
[16:38:14] harej: send me your POSTDATA string in PM?
[16:40:00] addshore: I have the problem that I want to have language-independent identifiers for concepts. And if I could get the QXXX number from that API that would help a lot
[16:41:08] Addshore I emailed it to you; actually using IRC on my phone, since IRCCloud magically doesn't work on my work computer anymore.
[16:41:27] *looking*
[16:43:36] harej: interesting, the top one fails, the bottom one succeeds ;) https://usercontent.irccloud-cdn.com/file/VS7pV9nW/
[16:45:30] They look identical to me?
[16:45:34] yeh :P
[16:45:49] I'm guessing some of the chars are actually different though....
[16:46:17] wait, I spotted it
[16:46:28] https://usercontent.irccloud-cdn.com/file/Tp2Tix8c/
[16:46:31] harej: ^^
[16:46:37] you have a typo!
[16:46:42] Fucking shit.
[16:46:45] :D
[16:47:41] Alright. Time to make my changes to WiDaR
[16:49:02] best regards, the US federal government :P
[16:50:23] Bitbucket is like Weird Alternate Universe GitHub.
[16:50:33] harej: I love typos like that :D mainly when they are not mine however :)
[16:50:39] hope you didn't spend too long on it!
[16:50:56] physikerwelt: Well right now I don't think there is really any easy integration
[16:51:02] Whatever. I get paid by the hour.
[16:51:43] addshore... ok I'll try to find my own way
[16:52:48] benestar: any idea if I use LIMIT and OFFSET in a query what the default order will be? or will it just be random?
[16:53:25] addshore: I don't think they order it by some default value, rather "random" how it is stored in the db, though not sure
[16:53:45] hmm, okay, I may ask SMalyshev when he wakes up if he has any idea ;)
[17:07:38] benestar:
[17:07:39] Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.
[17:08:04] but of course with just limit and offset the queries return; add an order by in there and they time out
[17:08:08] ah ok :)
[17:08:27] addshore: sure, ordering is very expensive
[17:08:37] basically it has to fetch the whole result set and then order it
[17:08:49] yup :P
[17:08:52] :S
[17:09:08] right now I think the best solution for me is to bypass the limit somehow ;)
[17:09:33] which limit?
[17:09:36] is there a hard one?
[17:10:26] benestar: yes, 30 seconds query execution time / general time
[17:11:25] working on dumps would be much easier but the data is a bit outdated
[17:11:44] yeh, I managed to cover all of the other cases we want to with sql and sparql
[17:11:53] just not this last one without hitting the limit :P
[17:12:16] what exactly are you trying to do?
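The caveat quoted above means OFFSET paging is only meaningful with an explicit sort; a minimal sketch (the triple pattern is illustrative, not taken from the log):

```sparql
# Without the ORDER BY, each OFFSET window may come back in a
# different, storage-dependent order between requests, so pages
# can overlap or skip results.
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 .
}
ORDER BY ?item
LIMIT 100
OFFSET 200
```

As noted in the conversation, the sort forces the engine to materialize the full result set before slicing it, which is why adding it can push a large query past the execution time limit.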
[17:12:39] count the number of referenced statements ;)
[17:13:08] but yeh benestar https://github.com/wmde/wikidata-analysis/commit/c56b89315f87fe36673282d7a9450c51e3ff9ec9
[17:13:14] I could always fall back to the dump stuff
[17:13:34] but if I manage to make it possible without then that would be a big bonus and would mean daily rather than weekly etc data
[17:14:52] addshore: I got my autocomplete feature via the php api https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&search=Magnetic%20moment
[18:14:52] SMalyshev: you'll have to let me know what you think of https://gerrit.wikimedia.org/r/#/c/256039/ when you are around ;)
[18:21:29] cool, so currently 800epm ish happening and neither dispatch nor the query service is choking!
[19:08:23] SMalyshev: *agrees with your comment* I just had a look at the code to see if there was a default timeout if no header was set, but apparently not; things would just run forever
[19:20:21] addshore, if you have a string and then you isolate a floating point number from that string and assign it to a variable, is the resulting variable considered to also be a string?
[19:20:55] harej: that is a language-dependent question I believe :P
[19:21:02] JavaScript.
[19:21:28] I'm asking because of this weirdness: https://www.wikidata.org/w/index.php?title=Q4115189&diff=277672372&oldid=277646872
[19:21:42] (I'm trying to introduce decimals to Quick Statements)
[19:22:21] If the isolated decimal value does not remain a string and becomes a floating point thing, that would explain why that happens.
[19:22:46] well, as a guess I would say it becomes a float ;)
[19:25:32] that's bad! I want it to remain a string!
[19:28:32] So I am trying some stuff out in a JavaScript interpreter and it would seem that it does actually remain a string. Which makes my problem that much more baffling.
[19:29:46] well harej, you can probably pass a float string in there and the api will convert it
[19:30:57] http://hastebin.com/afukatapaz.avrasm
[19:31:44] addshore: So it's the API's doing, then. Is there any way I can cajole the API into... not doing that?
[19:31:59] nope, if you pass it a float it's gonna think it is a float :p
[19:32:16] But I am pretty sure I am passing it as a string?
[19:32:28] but everything is a string there!
[19:32:55] and then also in php strings and ints etc aren't that dissimilar
[19:33:43] This is the relevant code:
[19:33:49] https://www.irccloud.com/pastebin/1eRZZ4Tl/
[19:33:49] https://www.irccloud.com/pastebin/J9FjqCKE/
[19:34:12] Uhh, somehow that pasted twice.
[19:36:33] Actually, *that* isn't the relevant stuff.
[19:36:52] THIS is. https://www.irccloud.com/pastebin/KljnKRzB/
[19:36:52] THIS is. https://www.irccloud.com/pastebin/YPMybeZt/
[19:37:05] Bloody hell, what's wrong with IRCCloud?
[19:38:06] Lydia_WMDE: http://blog.wikimedia.org/2015/11/30/artificial-intelligence-x-ray-specs/
[19:38:09] hey :)
[19:41:54] Further, I don't seem to have any problems when doing it in the API sandbox: https://www.wikidata.org/w/index.php?title=Q4115189&type=revision&diff=277772014&oldid=277676380
[19:48:26] harej: hmmm
[19:52:30] Amir1: nice
[20:29:39] jzerebec1i: I added you to the patch! :)
[20:35:57] addshore: thx
[20:47:40] https://www.wikidata.org/w/index.php?title=Q4115189&type=revision&diff=277831315&oldid=277772014 << WiDaR with units!!!!
[20:47:43] :D
[20:47:57] next up: widar with decimal points (unusually difficult for some reason)
[20:48:44] https://www.wikidata.org/w/index.php?title=Q4115189&diff=next&oldid=277831315 << Oops! I did it again.
[20:48:58] all of the float
[20:49:33] https://www.wikidata.org/w/index.php?title=Q4115189&diff=prev&oldid=277833637
[20:49:36] This works.
[20:49:59] Quantity-with-units has a different workflow from regular quantities.
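On the string-versus-float question running through this exchange: in JavaScript, a regex match yields a string, while parseFloat yields a number, and re-serializing the number can silently change the digits (a trailing zero disappears, for example). A small illustrative sketch, not taken from the WiDaR code:

```javascript
// Isolating "1.50" from a string with a regex keeps it a string...
const input = "quantity: 1.50";
const matched = input.match(/[0-9.]+/)[0];  // "1.50" (still a string)

// ...but converting it to a number loses the original formatting.
const asNumber = parseFloat(matched);       // 1.5 (a number)
const roundTripped = String(asNumber);      // "1.5" - trailing zero gone
```

This is why a value that "remains a string" in the interpreter can still come out mangled if any wrapper along the way converts it to a number and back before it reaches the API.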
[20:50:56] WiDaR includes some wrappers for ease of use, and I think those wrappers are fucking with the numbers it receives. Quantity-with-units bypasses this by sending a pure MediaWiki API query.
[20:52:59] I've been Twittering with Magnus Manske about it. I named the specific lines in his WiDaR script that were bugging me. We'll see where this goes.
[21:33:42] I have this WDQS query: http://tinyurl.com/zbu56fg I would like to add another requirement: that the place of burial is in Sweden. But when I add "?placeofburial (wdt:P17|wdt:P131)* wd:Q34 ." I get a strange error. What am I doing wrong?
[21:35:05] Ainali: ?placeofburial (wdt:P17/131)* wd:Q34 .
[21:35:12] it's a / and not a |
[21:39:33] you probably want ?placeofburial wdt:P17/wdt:P131* wd:Q34 .
[21:45:36] SMalyshev: s/he is splotted, can't read us...
[21:45:46] plitted*
[21:45:51] splitted!
[21:45:53] rah
[21:46:01] three times to write it right
[21:47:04] Harmonia_Amanda: Thanks, that put me on the right track
[21:47:16] (22:35:06) Ainali left the channel (quit: *.net *.split)
[21:47:16] (22:35:12) Harmonia_Amanda: it's a / and not a |
[21:47:20] ;)
[21:47:48] and (22:39:33) SMalyshev: you probably want ?placeofburial wdt:P17/wdt:P131* wd:Q34 .
[21:48:16] Harmonia_Amanda: Yes, that's what I used
[21:48:24] But it timed out...
[21:48:30] https://www.wikidata.org/w/index.php?title=Q4115189&type=revision&diff=277884476&oldid=277833637 YEAAHHHH!!!
[21:48:59] Ainali: we have had problems with the endpoint for several days
[21:49:27] so if your query is too heavy,...
[21:50:09] This one is a bit lighter and works: http://tinyurl.com/q75kxml
[21:51:35] But it surprises me to see that the first two results have P119:Q34 (place of burial: Sweden). If I could filter those out I would have a list I could start working with
[21:55:08] ok, I'm tired, but you're querying for the items *with* P119:Q34, no?
[21:58:54] Harmonia_Amanda: Hmm, I want items in Sweden, but more specific than just country level.
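On the operator mix-up above: in a SPARQL property path, `|` means "either property" while `/` chains one property after another. The suggested pattern, written out with what it matches (Q34 is Sweden, as in the conversation):

```sparql
# The burial place's country (P17) must be something that lies,
# through zero or more located-in (P131) steps, in Sweden (Q34).
# With zero P131 steps, this matches places whose P17 is Q34 directly.
?placeofburial wdt:P17/wdt:P131* wd:Q34 .
```

The original attempt, `(wdt:P17|wdt:P131)*`, asks the engine to explore every mixture of the two properties at every depth, which is a far heavier traversal.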
Example: an item having P119:Q10602636 is a good result (because Q10602636 is located in Q34)
[21:59:58] add a "filter not exists" for the exact value Q34?
[22:00:09] but that will make the query heavy...
[22:04:33] Ah, but could I instead add something like that the place of burial must have P31:Q39614 without making it heavy?
[22:04:48] yes
[22:05:36] just add a second line: ?placeofburial wdi:P31 wd:Q39614 .
[22:05:52] wdt*
[22:06:08] Ainali: http://tinyurl.com/onnt2am
[22:06:16] 1535 results
[22:06:44] Harmonia_Amanda: Awesome, big thanks!
[22:06:50] :)
[22:27:48] SMalyshev: also, tarrow managed to get his query service set up for his wikibase install, with auto-updating every 10 mins etc :) So your docs must be good! ;D
[22:38:42] addshore: great! :)
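The winning query sits behind a short link, so the following is only a reconstruction from the fragments quoted in the log (the actual SELECT clause and any extra conditions in the original are unknown): place of burial (P119) restricted to instances of Q39614 that lie in Sweden (Q34):

```sparql
# People whose place of burial (P119) is an instance (P31) of Q39614,
# where that place lies (via country/located-in) in Sweden (Q34).
SELECT ?person ?placeofburial WHERE {
  ?person wdt:P119 ?placeofburial .
  ?placeofburial wdt:P31 wd:Q39614 .
  ?placeofburial wdt:P17/wdt:P131* wd:Q34 .
}
```

Adding the cheap P31 constraint narrows the candidate set early, which is why it kept the query light where a FILTER NOT EXISTS on the exact value Q34 would not have.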