[15:22:44] <kiwi_75>	 https://www.wikidata.org/wiki/Q36020#P1472 warns that a Commons link should exist, but it does exist!
[15:25:59] <kiwi_75>	 to the devs thx!
[19:38:29] <datasc>	 Hi guys, did anyone try to parse the 780 GB JSON dump, or am I doing something wrong? I can't even count the number of lines
[19:40:21] <Addax>	 datasc: it's... pretty brutal cruising through it
[19:40:28] <Addax>	 I don't even think I tried the full dump
[19:41:13] <datasc>	 Yeah... the problem is that it contains a bunch of useless elements
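
For reference, counting or filtering entities does not require loading the whole file. The sketch below assumes the dump is laid out as one big JSON array with one entity per line and a trailing comma on each line; the function names are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per dump line, skipping the array brackets."""
    for line in lines:
        # Each entity line ends with "," except the last one.
        line = line.strip().rstrip(",")
        if line in ("[", "]", ""):
            continue
        yield json.loads(line)


def count_entities(lines: Iterable[str]) -> int:
    """Count entities without keeping any of them in memory."""
    return sum(1 for _ in iter_entities(lines))
```

With a bz2-compressed dump this could be driven by `bz2.open(path, "rt", encoding="utf-8")` from the standard library, passing the open file straight to `count_entities`. It still has to read every byte, so a full pass stays slow; it only keeps memory use flat.
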
[19:41:58] <Addax>	 datasc: what I do is actually use sparql to get a base set of elements and then request other statements as necessary
[19:42:14] <Addax>	 it's muuuuuch faster than trying to crawl the dump - and yes, I have limits in place to respect the TOS
[19:42:28] <Addax>	 It could be a LOT faster if I didn't, but that's obviously terrible behavior
[19:42:59] <Addax>	 (I also cache statements, so I don't request them over and over again)
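
The caching Addax describes could look roughly like the sketch below. `StatementCache` and the injected `fetch` callable are hypothetical names, not part of any Wikidata client; the point is only that each entity is requested at most once.

```python
from typing import Callable, Dict


class StatementCache:
    """Remember fetched statements so each entity is requested only once."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch          # hypothetical function doing the real request
        self._store: Dict[str, dict] = {}
        self.misses = 0              # number of requests actually sent

    def get(self, entity_id: str) -> dict:
        if entity_id not in self._store:
            self.misses += 1
            self._store[entity_id] = self._fetch(entity_id)
        return self._store[entity_id]
```
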
[19:43:10] <datasc>	 I didn't try the query service
[19:43:22] <Addax>	 datasc: what kind of information are you digging for?
[19:44:36] <datasc>	 Virtually all information about the world: demographics, resources, political situations, etc. For example, I would first like to query all cities in the world
[19:44:49] <datasc>	 Is that possible?
[19:45:33] <datasc>	 Is Wikidata the right dataset for my use case, for example if I want all cities that are still active in today's world
[19:45:45] <datasc>	 ?
[19:46:20] <Addax>	 Hmm, not sure about "still active" but I imagine you could get a pretty good sample set of cities. I don't know about "all cities," because *someone* would have to create that data
[19:47:21] <Addax>	 Look up Carthage and New York City, see what the statements say about the two
[19:48:16] <Addax>	 of course, there are lots of cities called "Carthage" these days
[19:49:58] <datasc>	 Polysemy could be a problem but some errors don't bother me
[19:50:31] <datasc>	 I'll just try to fetch all instances of city
[19:51:33] <Addax>	 *nod* I'd use the query service - you're going to have much more manageable datasets that way
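
A minimal sketch of what "fetch all instances of city" might look like against the query service, assuming `wd:Q515` (city), `wdt:P31` (instance of) and `wdt:P279` (subclass of). The helper names are made up, and the block only builds the request URL rather than sending it.

```python
from typing import Optional
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def city_query(include_subclasses: bool = True, limit: Optional[int] = None) -> str:
    """Build a SPARQL query for items that are instances of city (Q515)."""
    # With subclasses, items typed as e.g. a more specific kind of city also match.
    path = "wdt:P31/wdt:P279*" if include_subclasses else "wdt:P31"
    query = f"SELECT ?city WHERE {{\n  ?city {path} wd:Q515 .\n}}"
    if limit is not None:
        query += f"\nLIMIT {limit}"
    return query


def request_url(query: str) -> str:
    """URL for a GET request asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

Matching only direct `wdt:P31 wd:Q515` statements may return far fewer items than following subclasses, while the subclass path is heavier and may hit the service's time limit, so it is worth trying both with a small `limit` first.
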
[19:51:49] <Addax>	 If memory serves, the TOS is 6 requests at a time, so ...
[19:52:20] <Addax>	 (I have been using that as a limit for a while now and do not get the "server busy" throttle, so I think that's applicable)
[19:52:52] <Addax>	 I have a retry set up just in case I get the server busy response but it hasn't fired off for a long time
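
The limit-plus-retry setup could be sketched as below. The value 6 is the figure Addax quotes from memory, not a verified limit; `ServerBusy` is a hypothetical exception that the caller's own `fetch` function would raise when throttled.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

MAX_PARALLEL = 6  # figure quoted from memory in the chat, not a verified limit


class ServerBusy(Exception):
    """Hypothetical error raised by a fetch function when the server throttles."""


def with_retry(fetch: Callable, attempts: int = 3, base_delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> Callable:
    """Wrap fetch so a busy response is retried with exponential backoff."""
    def wrapped(item):
        for attempt in range(attempts):
            try:
                return fetch(item)
            except ServerBusy:
                if attempt == attempts - 1:
                    raise  # give up after the last attempt
                sleep(base_delay * 2 ** attempt)
    return wrapped


def fetch_all(items: Iterable, fetch: Callable, **retry_kwargs) -> List:
    """Run fetch over items with at most MAX_PARALLEL requests in flight."""
    worker = with_retry(fetch, **retry_kwargs)
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(worker, items))
```

Results come back in the same order as `items`, since `pool.map` preserves input order.
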
[19:53:26] <datasc>	 6 requests at a time? But is one big request valid?
[19:53:43] <Addax>	 yes
[19:54:20] <Addax>	 my process is to request all organizations (businesses) which gives a 240000-line response... I then trawl through those, looking for changes
[19:54:21] <datasc>	 Ok so no problems, let's try
[19:54:47] <datasc>	 Oh ok, yeah I don't need to look for changes for my problem
[19:55:23] <datasc>	 How much time does it take for your request to complete?
[19:55:33] <Addax>	 that one? 22-24 sec
[19:56:48] <datasc>	 Ok, I'll make my request and give it a try, thanks for the help
[19:57:03] <Addax>	 the full runtime is a lot longer, because I have to look for connected edges too with the TOS limit, but caching helps (there's a lot of common statements!)
[20:00:45] <datasc>	 devs should probably try to make a better JSON dump, with only English labels, less information and more interesting instances. Or one dump per instance type
[20:01:13] <Addax>	 well, the labels aren't THAT much of the data...
[20:02:35] <datasc>	 each property has a hash, which isn't useful for many applications
[20:12:18] <Addax>	 datasc: I am familiar, yeah. I actually filter out the non-English labels myself too.
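
The filtering both of them mention can be done per entity after parsing. This is a sketch that assumes the usual entity JSON layout, where `labels`, `descriptions` and `aliases` are keyed by language code and snaks carry a `hash` field; `slim_entity` and `drop_hashes` are illustrative names.

```python
from typing import Any

LANG_KEYED = ("labels", "descriptions", "aliases")


def drop_hashes(node: Any) -> Any:
    """Return a copy of node with every 'hash' key removed, at any depth."""
    if isinstance(node, dict):
        return {k: drop_hashes(v) for k, v in node.items() if k != "hash"}
    if isinstance(node, list):
        return [drop_hashes(v) for v in node]
    return node


def slim_entity(entity: dict, lang: str = "en") -> dict:
    """Keep only one language for labels/descriptions/aliases and drop hashes."""
    slim = drop_hashes(entity)
    for field in LANG_KEYED:
        if field in slim:
            slim[field] = {k: v for k, v in slim[field].items() if k == lang}
    return slim
```
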
[21:32:44] <datasc>	 Well, I can't even reach 10k cities in one query
[23:54:38] <bitbit>	 Hi, I'm quite new to SPARQL and am trying to query a CSV of cities with headers "city,country,population". However, I was only able to make the country display as "wd:Q12345". How do you make the query display the country's name? https://w.wiki/4vt
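
One common way to get names instead of IDs on the query service is the label service: selecting `?countryLabel` rather than `?country` and adding a `SERVICE wikibase:label` clause. The sketch below is not a rewrite of the linked query, whose contents are not shown here; it assumes `wdt:P17` (country) and `wdt:P1082` (population), and `bindings_to_csv` is a made-up helper that turns standard SPARQL JSON results into the requested CSV.

```python
import csv
import io

CITY_CSV_QUERY = """
SELECT ?cityLabel ?countryLabel ?population WHERE {
  ?city wdt:P31 wd:Q515 ;
        wdt:P17 ?country ;
        wdt:P1082 ?population .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

# CSV header -> SPARQL variable holding the value
COLUMNS = {"city": "cityLabel", "country": "countryLabel", "population": "population"}


def bindings_to_csv(results: dict) -> str:
    """Convert a SPARQL JSON result document into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS.keys())
    for row in results["results"]["bindings"]:
        # A variable can be missing from a binding; write an empty cell then.
        writer.writerow(row.get(var, {}).get("value", "") for var in COLUMNS.values())
    return out.getvalue()
```

The `...Label` variables are filled in by the label service automatically for any variable in the query, so no extra triple patterns are needed for the names.
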