[12:57:05] hi
[12:57:38] anybody there?
[15:08:27] Hello, I have a question regarding the recommended import method for Wikidata dumps. Is the utility "mwdumper" still the preferred way? It seems to no longer be maintained, with some patches only available on Phabricator. Also, the dump is over 1.2 TB extracted (I extracted it myself to take the load off mwdumper while setting it up manually). Any help is appreciated. I am using the https://github.com/wmde/wikibase-docker stack
[15:08:27] btw.
[15:56:25] hi esaller !
[15:56:34] esaller: which dumps? :)
[15:56:38] json dumps or?
[15:59:04] hi addshore ! I am currently working with *-pages-articles-multistream.xml.bz2 from https://dumps.wikimedia.org/wikidatawiki/20190101/. Before that I tried a dump from the entities subfolder (munge.sh, etc.), which did not show up in MediaWiki or the query service.
[15:59:28] I am probably missing something huge about how everything works together
[16:03:43] The one thing that worked was the included WikibaseImport. With that, the data showed up in the wiki and the query service, but the IDs did not match up (entity Q# and property P#). It also seems terribly slow compared to an actual dump import
[16:15:02] so esaller you should just be able to do an xml import using the standard mediawiki import process for that
[16:15:28] although your settings, specifically the namespace config, will have to be the same as on wikidata.org; the namespace for items there differs from the default install
[16:15:46] then to get it from there to the query service you'll need to dump the triples using one of the wikidata maintenance scripts
[16:16:04] then munge the data using the query service script and load it into blazegraph.
[16:16:25] BUT... do you just want the data in the query service, or are you looking at altering the data through wikibase once you've loaded it?
[16:18:24] I actually only need the data in the query service. It would be part of a research project, and I want a reproducible state of Wikidata that I can query
[16:21:29] for that it would be much better to download one of the dumps at https://dumps.wikimedia.org/wikidatawiki/entities/ and feed that into a local query service instance
[16:21:59] see the documentation at https://github.com/wikimedia/wikidata-query-rdf/blob/master/docs/getting-started.md
[16:22:19] Yes :)
[16:22:21] what Lucas_WMDE said :D
[16:27:08] Huh, that seems like a more straightforward approach. I stumbled over that repo and used it in the wikibase-docker stack. (Specifically, I took an entities dump and followed the instructions, but was not able to query it in WDQS or the Blazegraph dashboard)
[16:28:56] I will try to set up a minimal version with just the repo you posted. Thanks a lot for the help! I have been banging my head against this for some time now.
[22:35:22] hm. does anyone here know how to merge categories on commons? I believe https://commons.wikimedia.org/wiki/Category:%D0%93%D0%B8%D0%B6%D0%B6%D0%B0%D0%BA and https://commons.wikimedia.org/wiki/Category:Gidzhak are the same thing
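
The XML-import route described at 16:15 (import the pages-articles dump into a local Wikibase, then re-export the triples for the query service) could be sketched roughly as below. This is a minimal sketch, not a tested recipe: the paths assume a standard MediaWiki checkout with the Wikibase extension, the dumpRdf.php defaults and the exact invocation details are assumptions, and each script's --help (plus the namespace config note above) should be checked before running anything.

```bash
#!/usr/bin/env bash
# Rough sketch of the XML-import route. Paths and defaults are assumptions
# based on a standard MediaWiki + Wikibase install; verify with --help.
set -euo pipefail

DUMP=wikidatawiki-20190101-pages-articles-multistream.xml.bz2

# 1. Import the XML dump with the standard MediaWiki importer
#    (mwdumper is effectively unmaintained; importDump.php is slower but supported).
bzcat "$DUMP" | php maintenance/importDump.php

# 2. Re-export the imported entities as RDF triples with the Wikibase
#    maintenance script (assumed here to write to stdout by default).
php extensions/Wikibase/repo/maintenance/dumpRdf.php > local-wikibase.ttl

# 3. Munge local-wikibase.ttl and load it into Blazegraph using the
#    query service scripts -- see the next sketch.
```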
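
For the query-service-only use case the channel recommends (download an RDF dump from https://dumps.wikimedia.org/wikidatawiki/entities/ and feed it into a local query service), a minimal sketch following the linked getting-started.md might look like this. The dump file name, service directory, and flags are assumptions; check the dump directory listing and the scripts' --help for your version.

```bash
#!/usr/bin/env bash
# Rough sketch: load a Wikidata entities TTL dump into a local query service.
# Assumes the wikidata-query-rdf service distribution is unpacked in $SERVICE_DIR
# and $DUMP is a TTL dump downloaded from dumps.wikimedia.org/wikidatawiki/entities/.
set -euo pipefail

DUMP=$PWD/wikidata-all.ttl.gz   # placeholder name; use the dated file actually downloaded
SERVICE_DIR=./service           # unpacked wikidata-query-rdf service distribution

cd "$SERVICE_DIR"

# 1. Split and normalize ("munge") the dump into chunks Blazegraph can load.
./munge.sh -f "$DUMP" -d ./data/split

# 2. Start Blazegraph (leave it running; backgrounded here for brevity).
./runBlazegraph.sh &

# 3. Load the munged chunks into the wdq namespace.
./loadData.sh -n wdq -d "$PWD/data/split"

# 4. For a fixed, reproducible snapshot the live updater (runUpdate.sh) can be
#    skipped; queries then go against the local Blazegraph SPARQL endpoint.
```

In the wikibase-docker stack the equivalent steps would presumably run inside the wdqs container (and the import/maintenance steps inside the wikibase container), but that is an assumption about the stack's layout rather than something stated in the conversation above.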