[07:48:24] Hello guys, just a quick question :)
[07:48:27] Using Wikidata enriched with other data sources, I must ingest the entire Wikidata JSON dump into a dev graph database of mine. That's easy (yet time-consuming), but once that's done, I want to keep my copy updated by querying the RecentChanges and LogEvents API endpoints to retrieve the changes/deletes/creates that occurred between two timestamps (I'd
[07:48:27] do so every few minutes) - that's relatively easy too!
[07:48:28] How do I get the cutoff timestamp for a given JSON dump? Where is this available, or how can I figure it out, since the last-modified timestamps aren't present in JSON dumps?
[08:17:30] morning
[08:20:40] morning
[08:45:32] https://www.wikidata.org/wiki/Q6525292#P569 - I think the enwiki date is more likely to be true than the SNAC one, is SNAC usually a mess or something?
[09:29:22] Hello people, it seems that we have an issue with the Wikidata mailing list https://phabricator.wikimedia.org/T187163 If you tried to send an email to the list recently, let me know :)
[09:35:41] Hi, there seems to be an issue with the example queries list: while the Cats example is still at the top of the page https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples, it doesn't show up on query.wikidata.org
[09:37:41] That's probably an action of the #TeamGoat :D
[09:37:57] Thanks for noticing, I'll have a look at it
[09:42:32] Zolo removed it on the 4th (because it's exactly the same request as the one for goats), Fuzheado put it back on the 10th (with the comment "Keep cats at the top, as many training materials use this already (video, and otherwise)", which I totally agree with)
[09:45:08] I agree too, and it's probably just a problem of updating delay
[10:35:17] Hi, where do I find the documentation for querying Wikidata programmatically? i.e. given a Wikidata ID of a node, I'm interested in fetching all its outgoing edges and corresponding target nodes
[10:35:26] and do the above task programmatically
[10:47:37] AdityaAS: https://www.wikidata.org/wiki/Special:MyLanguage/Wikidata:Data_access is the best starting point in the documentation, I think
[11:02:22] Lucas_WMDE: Hi, I went through the link you shared and I understand that for a given entity, say Q42 (Douglas Adams), I can get say the image property P18 (I'm assuming here that Q42 does indeed have a P18 property)
[11:02:54] What I'm looking for is, given an entity (Q42), an API which lists out all the properties that exist for Q42 along with the RHS (value) of each property
[11:03:33] Something like "
[11:03:34] sex or gender
[11:03:34] male
[11:03:34] 3 references
[11:03:36] country of citizenship
[11:03:38] United Kingdom
[11:03:41] 1 reference
[11:03:43] name in native language
[11:03:46] Douglas Adams (English)
[11:03:48] 0 references
[11:03:51] birth name
[11:03:53] Douglas Noel Adams (British English)
[11:03:56] 1 reference
[11:03:58] given name
[11:04:01] Douglas
[11:04:03] series ordinal
[11:04:06] 1
[11:04:08] 1 reference
[11:04:11] Noël
[11:04:13] series ordinal
[11:04:16] 2
[11:04:18] 1 reference
[11:04:21] family name
[11:04:23] Adams
[11:04:26] Something like ^^ as a list of JSONs
[11:10:15] AdityaAS: I don’t think we have that in exactly this format, but the JSON version of the “linked data interface” has all that information and more
[11:44:54] Bah, where's multichill when you need him :p
[11:52:40] Lucas_WMDE: the JSON version works too
[12:18:28] Was able to get the info that I needed. Thanks Lucas_WMDE
[12:18:48] yay!
no problem :)
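
A minimal sketch (one possible approach, not an official client) of what was discussed above: fetch the linked data interface JSON for an item and list every property with the raw value and reference count of each statement. The URL and the entities/claims/mainsnak layout are the standard Special:EntityData JSON format; turning IDs into labels is left out for brevity.

    # Sketch: list all statements (property -> value, reference count) of an item
    # via the Special:EntityData JSON. Error handling omitted for brevity.
    import json
    import urllib.request

    def list_statements(qid):
        url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
        with urllib.request.urlopen(url) as response:
            data = json.load(response)
        entity = data["entities"][qid]
        for prop, statements in entity["claims"].items():
            for statement in statements:
                snak = statement["mainsnak"]
                # only "value" snaks carry a datavalue; "somevalue"/"novalue" snaks don't
                if snak["snaktype"] == "value":
                    value = snak["datavalue"]["value"]
                else:
                    value = snak["snaktype"]
                refs = len(statement.get("references", []))
                print(prop, value, f"({refs} references)")

    list_statements("Q42")

Turning the property IDs (P21, P27, ...) and item values into labels like "sex or gender" / "male" would take a second request, e.g. to the wbgetentities API with props=labels, which the Data_access page linked above points to.
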
[15:28:29] FYI, the Wikidata mailing list is currently encountering issues. If you send an email, it may not be received. We're working on this, I'll let you know as soon as it's fixed. Sorry for the inconvenience!
[15:30:00] Oh no! :O
[16:20:31] hi all, I'm parsing the en-wiki dump and got only 17m pages at the end - does anyone know off-hand how many pages ought to be in the English Wikipedia dump?
[16:20:52] i was expecting something like 40m records
[16:21:18] Hm, https://en.wikipedia.org/wiki/Special:Statistics
[16:22:23] thanks sjoerddebruin
[16:22:32] of that 44m, how many do you guess are redirects?
[16:22:50] 5m pages, ~5m redirects?
[16:23:35] even if there's a talk page for every article, 44m seems like a lot - what am I missing?
[16:24:37] What dump did you download?
[16:25:14] "Current revisions only, no talk or user pages" for "pages-articles-multistream.xml.bz2" according to https://en.wikipedia.org/wiki/Wikipedia:Database_download#Where_do_I_get_it?
[16:25:19] sjoerddebruin: you should probably only focus on the namespace you need/know
[16:25:25] * spencerk
[16:26:21] Wikidata has 7½M items linked to enwiki articles http://tinyurl.com/y9aj4m7g
[16:26:31] unless I got something wrong in that query
[16:27:32] "en-wiki dump", Lucas. :)
[16:28:03] (so it's not really Wikidata-related, but I hate to point users to other channels while the knowledge is probably here more than there)
[16:28:50] yeah, i used the latest-pages-articles dump
[16:29:06] i haven't checked out the multistream one
[16:30:55] thanks sjoerddebruin which channel would you recommend?
[16:30:58] (sorry, new here)
[16:31:27] You're fine here. :P Read the rest of my message.
[16:32:37] sjoerddebruin: yes, I’m aware :)
[16:32:47] but I thought it could still help to have that number
[16:33:07] when trying to figure out whether 17M pages seems plausible
[16:35:28] * sjoerddebruin curses into a bag. https://www.wikidata.org/w/index.php?title=Wikidata:Database_reports/Constraint_violations/Mandatory_constraints/Violations&action=history
[16:38:42] That... is a biggun
[16:39:45] "Let's change the regex format while the transition is still happening and keep the mandatory status!"
[16:41:44] Oh, so it's "it'll solve itself in a few days" rather than "here you go, sjoerddebruin, fix this"?
[16:41:54] Probably.
[16:43:03] sjoerddebruin: that regex looks just borked to me
[16:43:12] The IDs changed, it seems.
[16:43:19] “0-f”… regexes don’t know about hex digits like that
[16:43:31] I assume that’s supposed to be [0-9a-f-]
[16:43:36] Feel free to fix.
[16:43:46] I'm ordering pizza to forget this day.
[16:49:19] sjoerddebruin: does that work? I need to try it sometime too
[16:50:16] (Spoiler: it doesn't with Finnish pizza)
[16:50:22] Need to do more research.
[16:50:44] I commend your commitment to science.
[16:50:55] Should request a grant for that.
[16:51:05] Might be approved!
[16:51:26] Also for beer. "In order to give the best attention to each Wikidata item about beer, I need to drink it."
[16:51:33] Nemo_bis: damn, I suspect it won't work with Estonian pizza either then
[16:51:35] As long as you guarantee an exceptional diversity of your pizza sources.
[16:51:46] and what's wrong with Finnish pizza? :O
[16:52:01] ohnoes
[16:52:09] The Finns are always watching
[16:52:19] I'm everywhere :)
[16:52:44] I also ate an Estonian pizza last weekend
[16:53:29] I miss the Italian Wikimania pizzas :)
[16:53:56] Pizzeria Oasi?
[16:54:08] Indeed.
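
A small illustration of the regex point in the constraint-violations discussion above, using made-up sample IDs and assuming the intended pattern really is [0-9a-f-]. In a character class, "0-f" is an ASCII range from '0' (0x30) to 'f' (0x66), so it accepts digits, several punctuation characters, all uppercase letters, and a-f, while rejecting the hyphen.

    # Hypothetical check of the broken vs. intended character class.
    import re

    broken = re.compile(r"[0-f]+\Z")       # ASCII range '0'..'f', not hex digits
    intended = re.compile(r"[0-9a-f-]+\Z") # hex digits plus a literal hyphen

    for sample in ["01abef", "1a2b-3c4d", "XYZ:_"]:
        print(sample, bool(broken.match(sample)), bool(intended.match(sample)))
    # 01abef     True  True   (plain hex passes both)
    # 1a2b-3c4d  False True   (the broken range rejects the hyphen)
    # XYZ:_      True  False  (the broken range accepts non-hex characters)
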
[16:54:43] It was a good place to hide during the terrible rain at the end. :)
[16:57:47] I was hiding with a ton of chocolate at that moment, so I was rather good too.
[16:58:31] Hmmm, and there was also rain at the last Wikimania...
[16:58:35] Damn weather gods...
[17:02:01] Wait, how is rain relevant, I thought y'all would be inside with your computers whatever the weather :p
[17:15:35] Lucas_WMDE: how does the whole WMDE suffix thingy work? Who needs to have it and when?
[17:15:50] do you mean in IRC or on-wiki?
[17:16:58] In general
[17:17:30] (we have new board elections coming this weekend, I might end up on the WMEE board, and I'm wondering if that means something should change)
[17:17:49] (but I haven't seen any docs about that)
[17:18:30] I don't think there are any rules for IRC nicknames.
[17:20:13] on-wiki I use the (WMDE) account for anything I do as part of my employment
[17:20:20] so I generally shouldn’t use it to do any real editing
[17:20:34] for that I have my personal account
[17:21:10] and I have several browser profiles open, one for work and one private, so if I want to do any quick edits while at work I’ll do it from the private window, where I’m logged into my private account
[17:21:26] so it’s just some nobody editing :)
[17:23:55] I see :)
[17:24:13] with IRC I only have a single client open, so it’s a bit blurrier
[17:24:55] and I sometimes answer questions as Lucas_WMDE that are really more in WikidataFacts’ area of expertise
[17:24:59] I just hope no one’s too confused :D
[17:25:33] Oh. You're also WikidataFacts? :D
[17:25:56] yes! :D
[17:27:39] haha
[17:27:41] Now I know!
[17:28:03] that account hasn’t been anonymous for almost a year now :D
[17:28:34] Well, I just never noticed :)
[17:31:07] Anyone else who's also someone else that I should know about? :D
[17:32:03] hm, let me think…
[17:32:19] (this is the perfect opportunity to spoil some movies you haven’t seen yet, right? :P )
[17:33:04] (probably...)
[17:45:14] hey, if anybody is interested, i grepped around the 44m dump by namespace, this is what things look like broadly
[17:45:15] https://user-images.githubusercontent.com/399657/36164809-80f1e254-10bb-11e8-9ef9-e8f294e25cf1.png
[17:45:41] there are nearly more UserTalk pages than articles
[17:59:03] btw, the Structured Commons office hour over in #wikimedia-office starts in a few minutes
[19:26:35] Alright, time to be productive.
[21:00:18] About time. https://www.wikidata.org/w/index.php?title=Wikidata:Requests_for_permissions/Bot&curid=4769822&action=history
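
A rough sketch of how a per-namespace breakdown like the one linked above (17:45:15) can be computed while streaming a dump, assuming the standard MediaWiki XML export format; the dump file name is illustrative.

    # Count pages per namespace in a pages-articles dump by streaming the XML.
    import bz2
    import collections
    import xml.etree.ElementTree as ET

    def local(tag):
        """Strip the XML namespace prefix from a tag like '{...}page'."""
        return tag.rsplit("}", 1)[-1]

    counts = collections.Counter()
    with bz2.open("enwiki-latest-pages-articles.xml.bz2", "rb") as dump:
        for _, elem in ET.iterparse(dump):  # end events: children are fully parsed
            if local(elem.tag) == "page":
                # each <page> carries an <ns> child holding the namespace number
                ns = next(child.text for child in elem if local(child.tag) == "ns")
                counts[ns] += 1
                elem.clear()  # keep memory flat while streaming the whole dump

    for ns, total in counts.most_common():
        print(ns, total)

Namespace 0 is articles (including redirects), 1 is Talk, 3 is User talk, and so on; redirect pages can be told apart by the presence of a <redirect> child element, which is one way to split the earlier 44m figure into articles vs. redirects.
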