[10:57:01] melderick: The new dump is there, sorry for the delay
[11:01:01] hoo: oh great ! thanks for the news :) I go fetch it ^__^
[11:03:24] hm, I wonder whether we should look into Zstd
[11:05:41] hoo: that much time spent in compressing the dump ?
[11:07:21] Well, bzip2 takes about 5h to compress the dump
[11:07:32] and we can't use pbzip2, because many clients can't digest that
[11:07:51] also Zstd is probably going to be a fair bit smaller
[11:11:25] i see
[11:12:12] i usually don't wait for the bzip2 dump :)
[14:43:26] hello
[14:44:13] I am working on T114473 and I want to get the list of badges by language. I am not sure how to list all the items which are created for a specific language
[14:44:13] T114473: Provide a special page to list all available badges - https://phabricator.wikimedia.org/T114473
[14:44:20] could you please help me with some advice?
[14:52:22] hey victorbarbu ! Do you mean you want to go from a list of badge / IDs to a list of labels?
[14:52:29] in a given language?
[15:48:54] hi folks. I have a question about the notability/deletion policy after finding an entity missing that I had been using. I have asked the user who performed the deletion if they can clarify their rationale (https://www.wikidata.org/w/index.php?title=Topic:Tg9hzsbqlpy8ri6b&topic_showPostId=tg9hzsgrg0dg158j#flow-post-tg9hzsgrg0dg158j) but would appreciate the perspective of others here. To be perfectly honest, prior to today I didn't
[15:48:55] know that wikidata IDs were ever outright deleted (rather than merged or similarly deprecated).
[15:51:20] the page was deleted on viwiki: https://vi.wikipedia.org/w/index.php?title=%C4%90%E1%BA%B7c_bi%E1%BB%87t:Nh%E1%BA%ADt_tr%C3%ACnh&page=Ia+Dom
[15:51:41] so the link was deleted from the wikidata item at the same time
[15:53:00] tomlee did you see my answer?
[15:53:12] Stryn: yes, just reading the history to understand myself. that's very helpful!
[15:53:50] should the newly created article have a wikidata link now, though?
or should the old wikidata item be restored and amended to point to the new wikidata article?
[15:56:27] the article on viwiki is about the same subject as it was before the deletion?
[15:56:51] if so, then we can restore the wikidata item
[15:57:36] I believe so. My own project interacts with wikidata for international place name translations. From my own records I can confirm the name of the item is `Ia Dom`, that it is an inhabited place and that it is at or near 107.5377778,13.80888889
[15:58:11] (and that the wikidata ID I used was Q10772504)
[15:58:51] ok, restored
[15:58:55] thanks very much!
[18:17:56] Lydia_WMDE: https://phabricator.wikimedia.org/T149419
[18:26:44] 1000 edits on Wikidata is a milestone to some?
[18:29:02] if done without automatic tools maybe
[18:30:09] I still find it rather cute.
[18:39:29] sjoerddebruin: After I made the query for you I made http://tinyurl.com/zg32v9o ;-)
[18:40:12] ah, for the weird P31 thing
[18:40:14] Everything that has more than one instance of and one of the instance of statements is a subclass of the other
[18:40:43] 68 left on https://petscan.wmflabs.org/?language=en&project=wikipedia&ns%5B0%5D=1&sparql=SELECT%20%3Fitem%20WHERE%20%7B%20%3Fitem%20wdt%3AP31%20%3Fsub0%20.%20%3Ftree0%20%28wdt%3AP5%29*%20%3Fsub0%20.%20%3Ftree0%20%28wdt%3AP31%29*%20wd%3AQ28640%20%7D&wikidata_prop_item_use=Q210167%2C%20Q28640&wpiu=none&sortby=date&interface_language=en&active_tab=tab_output&doit= too
[18:41:02] (occupations as P31/P279)
[18:47:03] excluding some large ones that need proper fixing
[21:33:37] So, https://grafana.wikimedia.org/dashboard/db/wikidata-datamodel-terms seems to be on ar as language by default. addshore, can you confirm?
[21:42:01] Amir1: sjoerddebruin pointed https://tools.wmflabs.org/wikidata-game/distributed/#game=15 out to me. Cool! I completely missed that. I've been playing around with this for some time.
[21:42:30] Did you ever try to base it on templates?
My score for (infobox and navigation) templates is much much higher
[21:42:39] https://tools.wmflabs.org/widar/index.php?action=authorize seems broken though. :O
[21:42:41] \o/
[21:43:11] multichill: Amazing multichill. I use categories to score but I should start trying out templates too
[21:43:38] My manual list is at https://nl.wikipedia.org/wiki/Gebruiker:NoclaimsBot/Template_claim (more linked from https://www.wikidata.org/wiki/User:NoclaimsBot)
[21:43:57] I'm quite sure kian scales better than me ;-)
[21:45:16] :D but it's more complicated :(((
[21:45:49] Amir1: Take for example the Dutch Wikipedia, only "infobox*", "navigatie*" and "taxobox" are of interest. Pulling the usage of those templates from the database is easy and even faster than categories
[21:47:40] Yes, I'll probably also start from Dutch
[21:47:53] so many volunteers to test out <3
[21:49:59] So you train kian. Kian gives suggestions. You put that in the s52709__kian_p db and that is used in the tool right? Any docs on how you train kian Amir1?
[21:50:46] Or do I just RTFM and install https://github.com/Ladsgroup/Kian ? ;-)
[21:50:47] multichill: yup, that's correct. Some docs can be found in https://github.com/Ladsgroup/Kian
[21:51:19] But there are some tips, the sql queries are too big to be run using python. I run them directly and Kian understands it's there
[21:57:17] Amir1, I heard that you speak Persian, and was wondering if I could get a quick spot of help if you're around
[21:57:56] primefac: sure, sup?
[21:58:02] وزیر فرهنگ در دولت دکتر مصدق، 1331-1330 ش (1952-1951م)
[21:59:17] source is from about halfway down http://www.hessaby.com/khadamat.htm
[21:59:22] Minister of culture in Mosadegh cabinet, 1330-1331 Solar Hijri (1952-1951 Gregorian)
[21:59:30] HTH
[21:59:48] interesting. Google was *almost* right :-p
[21:59:53] thanks
[22:00:11] out of curiosity, would there be a significant difference between "education" and "culture" when written in Persian?
[22:00:24] Amir1: I'm looking at the categories.json. Is the format of that file explained somewhere? What am I looking at?
[22:01:01] primefac: education: آموزش - culture: فرهنگ
[22:01:13] multichill: let me check
[22:01:47] very interesting. Thanks Amir1!
[22:03:43] multichill: e.g. "Concursos_de_belleza_en_1988": [4, 0, 2] in esSpain means there are four members in that category, where 0 of them are defined as P17:Spain and two of them have P17 but not Spain
[22:04:08] multichill: Do you want me to add you to the Kian service group so you have access to such data?
[22:04:33] Amir1: Clear. So the basic function of kian is to answer a question not with yes or no, but a percentage of certainty, right?
[22:04:54] Your chmod skills are excellent, I can already read everything ;-)
[22:05:01] yes, like most ML classifiers
[22:05:06] :))))
[22:06:59] And how about the next step? Now we have to make up the suggestions (humans on nl, football on it, etc), would we be able to process a dataset and the program comes with suggestions we didn't think of?
[22:07:55] multichill: scripts folder
[22:07:58] Amir1: That would be *really* interesting.
[22:08:05] several ways to do that
[22:08:16] 1- Add them when it's confident enough
[22:08:38] 2- put them into kian table for the kian game when it's confident but not much
[22:09:23] 3- Find possible mistakes when it's confident that the existing data is wrong
[22:09:52] But that's all on a per-set basis, right? For example nlHuman?
[22:10:36] Yup
[22:11:01] but the kian tables have a model id so you can add as many as you want
[22:11:39] Do you want to see the kian tables or can you already? :D
[22:12:01] MariaDB [s52709__kian_p] ?
[22:12:11] yup
[22:12:40] I think I made two tables or two databases (one for the game and one for possible mistakes) but don't recall
[22:13:08] I see a table kian and a table kian_mistakes
[22:14:08] so two tables in one db
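
Editor's sketch, not part of the log: the categories.json triple and the three confidence-based actions described above could be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not Kian's actual code. The function names, the threshold values, the action labels, and the derived "no_claim" count are all hypothetical; only the [4, 0, 2] example and its meaning come from the conversation.

```python
import json

# Hypothetical cut-offs: the log never states Kian's real thresholds.
ADD_THRESHOLD = 0.95      # "confident enough" -> add the claim directly
GAME_THRESHOLD = 0.60     # "confident but not much" -> send to the kian game
MISTAKE_THRESHOLD = 0.05  # confident the existing data is wrong


def parse_category_entry(counts):
    """Unpack a [members, matching, other_value] triple as explained in the log."""
    members, matching, other_value = counts
    return {
        "members": members,          # items in the category
        "matching": matching,        # items with the target claim (e.g. P17:Spain)
        "other_value": other_value,  # items with the property but another value
        # Assumption: whatever is left has no value for the property at all.
        "no_claim": members - matching - other_value,
    }


def route(confidence, has_claim):
    """Map a model score in [0, 1] to one of the three actions listed in the log."""
    if has_claim:
        # Action 3: the item already has the claim, but the model disagrees strongly.
        return "possible_mistake" if confidence <= MISTAKE_THRESHOLD else "keep"
    if confidence >= ADD_THRESHOLD:
        return "add_claim"   # action 1
    if confidence >= GAME_THRESHOLD:
        return "kian_game"   # action 2
    return "skip"


sample = json.loads('{"Concursos_de_belleza_en_1988": [4, 0, 2]}')
stats = parse_category_entry(sample["Concursos_de_belleza_en_1988"])
print(stats)
print(route(0.97, has_claim=False), route(0.70, has_claim=False), route(0.01, has_claim=True))
```

In this sketch the two routing targets would correspond to the kian and kian_mistakes tables mentioned at the end of the log, with one row set per model id.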