[04:43:56] hi guys
[04:44:12] while trying to get up and running with wikidata
[04:44:16] I got some errors
[04:44:24] I've pasted them here: https://pastebin.com/930bqpbB
[04:44:39] please check them and see if you can see the cause
[06:04:30] FYI: wikidata is in read-only mode for ~30 min
[06:22:16] maintenance is done
[10:11:45] is it intended that "part of" (P361) has an inverse constraint but "has part" (P527) does not?
[19:03:25] PROBLEM - puppet last run on wdqs1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[19:21:37] hi folks,
[19:21:38] I was trying to write a query in SPARQL, perhaps someone more experienced can help me.
[19:21:39] I wanted to get all things/people/... located (or any of its subproperties) in a place (region). The thing is that it returns the items related to that region, but not those related to the cities located in that region. How could I make the query 'recursive'?
[19:21:39] this is my query: https://pastebin.com/Nv4DYybR
[19:28:25] RECOVERY - puppet last run on wdqs1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:52:29] can I get the link to the english wikipedia article of a human (Q5) in my results?
[21:53:20] cortexman: any specific human, and do you mean SPARQL results?
[21:53:43] SPARQL https://gist.github.com/brianmingus/f0c4a7e5e0edde96870e3357548738c2
[21:54:56] the links to the wikipedia articles are on the entity pages
[21:55:08] not sure if that means they are queryable
[21:55:15] Yes
[21:55:18] One sec :)
[21:56:09] http://tinyurl.com/yaljcsjq
[21:58:33] cortexman: http://tinyurl.com/y6umkjcq or so I guess - for some reason also printing the label takes forever
[21:59:06] There's a ton of Q5 so you'll probably need a limit, but dunno, didn't try without
[21:59:51] thanks!
[22:00:05] (thanks for the hint on how that works, sjoerddebruin :) )
[22:00:05] yes, i'd be interested in sorting by article length, but i kind of doubt that's stored
[22:00:12] i'd really just need to stand up wikipedia locally i think
[22:00:39] reosarevok: lol
[22:01:17] cortexman: I think there's a way to query that too, but I have no idea how tbh. What are you trying to get exactly?
[22:01:36] i'm doing machine learning / NLP and so longer articles on people are good
[22:02:16] is schema.org a standard that allows easier linking between ontologies?
[22:02:47] i'm interested in federated queries, i just feel like they are destined to be slow
[22:06:48] Hmm. cortexman: https://petscan.wmflabs.org/?psid=2319510 is a list of American politician articles by article length, in case that kind of stuff seems useful
[22:06:58] (you can output json)
[22:07:23] there seems to be an api that has articles by byte length
[22:07:27] not sure about wikidata yet
[22:07:53] PetScan.. i see "Positron Emission Tomography Scanner" :)
[22:09:12] interesting
[22:13:35] not sure if PetScan can be used to get a list of all people. that's some serious category jiu-jitsu
[22:15:27] All people seems rather intensive.
[22:18:14] Category People, depth 10, sort by size, has property P5 (human)..wait
[22:25:29] It'll kill it. All people is too hardcore
[22:25:52] What you might be able to do is use it to get a list of subcategories, and *then* automatically query several times at lower levels
[22:26:03] hm.
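
The pastebin query from 19:21:39 is not reproduced in the log, so the following is only a sketch of the usual answer: SPARQL property paths make the location lookup "recursive". The region (Catalonia, Q5705) and the properties P276 (location) and P131 (located in the administrative territorial entity) are placeholders and may not match the original query.

    # Sketch: items located in a region, directly or via any chain of
    # contained places. Q5705 (Catalonia) is only an example region.
    SELECT ?item ?itemLabel WHERE {
      # one P276 or P131 step, then any number of further P131 steps,
      # so items attached to cities inside the region also match
      ?item (wdt:P276|wdt:P131)/wdt:P131* wd:Q5705 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    LIMIT 100

The * in the property path is what makes the traversal transitive; without it, only items stated directly against the region itself are returned.
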
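The shortened query links from 21:56:09 and 21:58:33 cannot be expanded here, but the pattern they most likely rely on is the standard sitelink lookup: Wikipedia article URLs are modelled as schema:about triples on the article node, so the English Wikipedia link can be selected next to each human. A sketch, with the LIMIT suggested at 21:59:06 because of the number of Q5 items:

    # Humans together with their English Wikipedia article.
    SELECT ?person ?personLabel ?article WHERE {
      ?person wdt:P31 wd:Q5 .                        # instance of human
      ?article schema:about ?person ;
               schema:isPartOf <https://en.wikipedia.org/> .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    LIMIT 100

The label service block can be dropped if it proves too slow, as noted at 21:58:33; the article URL itself does not need it.
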
[22:26:11] it's hard to know without reading the source of PetScan
[22:26:21] if it's doing one wikidata query per person, it'll never finish
[22:26:38] Well, I have tried going to a high depth, it doesn't work
[22:26:41] It just breaks eventually
[22:29:08] Heh, even high-depth category queries break. (deep depth?)
[22:30:37] Something like going over all categories in https://petscan.wmflabs.org/?psid=2319576 might be interesting
[22:30:38] perhaps they break due to recursive membership
[22:31:10] (I expect people who don't have an occupation category are more likely to have short articles and can often be ignored)
[22:32:19] Nah, they just break because there's a ton of results. There's probably a way to make it not break, but I'd probably just do it by smaller categories myself - of course, I haven't played with the MediaWiki API so there might be a better way
[22:49:25] it is much faster to just use wikidata for the query and then download all the articles
[22:49:36] or grab them out of a dump
[22:49:50] i think PetScan has a lot of n^2 calculations
[22:52:51] according to wikidata, there are only 442,068 humans in the english wikipedia ^_^
[23:02:18] Wikidata contains 4 million humans though
[23:06:04] that's across all wikis presumably
[23:06:47] ~400k seems reasonable given there are 5.5 million articles on enwiki
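
The figures at 22:52:51 and 23:02:18 presumably come from counting queries; a sketch of the restricted count (humans with an English Wikipedia article), assuming it was run on the Wikidata Query Service:

    # Count humans that have an English Wikipedia sitelink.
    # Dropping the two schema: triples counts all humans instead
    # (~4 million), which is more likely to hit the query timeout.
    SELECT (COUNT(DISTINCT ?person) AS ?humans) WHERE {
      ?person wdt:P31 wd:Q5 .
      ?article schema:about ?person ;
               schema:isPartOf <https://en.wikipedia.org/> .
    }
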