[06:34:15] hi, any admins around?
[06:34:34] with some time to spare?
[06:34:35] AdityaAS: hi
[06:35:20] Hi Eurodyne, I have a couple of questions with respect to Wikidata and Freebase.. Will you be able to help?
[06:36:09] sorry, I don't know much about that ):
[06:37:26] Ah. Alright. There was a paper by Google a few years back "From Freebase to Wikidata: The Great Migration". I wanted to ask if the migration is still happening or if it has stopped (due to some reasons)
[06:38:18] Freebase claimed to have information about 4.5 billion entities. Wikidata seems to be much smaller in comparison (If indeed the migration did happen)
[06:39:27] AdityaAS: many Freebase statements are loaded in the primary sources tool. When editors enable this tool, this data is presented as proposed statements on item pages
[06:39:50] they can validate the statements so that they get added to the item
[06:40:10] AFAICT it is the only sort of Freebase import that is going on at the moment
[06:40:55] Oh, so it is somewhat manual
[06:41:27] I have a use case where using the Freebase data would be invaluable. I was hoping I could use Wikidata instead
[06:44:01] AdityaAS: maybe https://developers.google.com/freebase/ helps?
[07:00:34] Yep. That does.
[07:03:55] An aside question, Wikipedia as a knowledge base contains more descriptive information (context, hyperlinks) etc.. Whereas Freebase though has a large number of entities, does not have that information.
[07:04:00] How about Wikidata?
[07:21:24] I don't know that much about Freebase, but I would compare it to Wikidata rather than to Wikipedia
[07:22:12] Btw. the link I posted claims that Freebase contains ~1.9bn triples of information. Wikidata is at 2.35bn meanwhile, thus it is already "larger" than Freebase was
[07:22:26] although this measure is certainly not the best to measure size
[08:22:48] :O https://www.wikidata.org/w/index.php?title=MediaWiki:Gadgets-definition&curid=21137&diff=529370361&oldid=511956086
[08:24:32] what is it?
[08:24:41] Search for other namespaces.
[08:24:58] But properties aren't included, am I correct Amir1?
[08:25:31] yeah, AFAIK, wbsearchentities should handle it
[08:26:20] It does, not added to the UI for some reason.
[08:28:26] hmm, I guess I can't do anything for that (unless added to UI)
[08:28:41] maybe we should write another gadget for that, because the API call and stuff is different
[08:30:09] Nah, there are better things to write gadgets for.
[08:30:43] Like "add new item" when nothing matches your value, linking to new item with the value filled in. :)
[08:32:07] And do we still need that cirrus "beta" gadget?
[08:38:50] IDK
[08:40:24] Anyway, thanks for this one. :)
[08:45:03] may I ask what this gadget does?!
[08:45:21] Shows search suggestions for other namespaces on Wikidata.
[08:46:38] how is this different from a prefix search?
[08:47:16] That one didn't show suggestions?
[08:47:33] At least not on the top right.
[08:47:37] ah, now I see
[08:47:56] maybe it wasn't loaded properly in the beginning
[08:52:40] this is the first time I (occasionally) see autodescriptions in the search suggestions as well. Is this also done by the searchAll gadget?
[08:59:43] That should be autodesc.
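(Editor's sketch for the 08:24:58 to 08:28:41 exchange: the point that wbsearchentities can search properties even though the search box does not expose it could look roughly like the following. The `wbsearchentities` module and its `search`, `language`, `type`, `limit` and `format` parameters are part of the Wikibase API; the helper names `build_search_url` and `extract_matches` are made up for illustration and are not taken from any gadget.)

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_search_url(term, entity_type="property", language="en", limit=7):
    """Build a wbsearchentities request URL (hypothetical helper).

    The API searches items unless `type` says otherwise, so property
    search needs type=property spelled out.
    """
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "type": entity_type,
        "limit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_matches(response):
    """Return (id, label) pairs from a decoded wbsearchentities response."""
    return [(hit["id"], hit.get("label", "")) for hit in response.get("search", [])]
```

(A gadget would send the same parameters from JavaScript; the sketch only shows which parameters differ from an ordinary item search.)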
[09:10:13] I wonder if there's a way to make the query service go through pages of mw api results
[09:11:14] I have several things where I want to go through a user's contributions and do queries against the items, but they did more edits than I can get back in a single request
[09:12:31] and I imagine it would be useful when people want to check all items in a large category and things like that too
[09:43:51] sjoerddebruin: you are welcome :)
[10:59:57] MisterSynergy: Where can I see the statistics about Wikidata? Like the total triples like you quoted before
[11:00:05] @MisterSynergy
[11:03:14] AdityaAS: triples are here https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?refresh=1m&panelId=7&fullscreen&orgId=1&from=now-1y&to=now
[11:03:26] Other numbers at https://grafana.wikimedia.org/dashboard/db/wikidata :)
[11:03:48] Thanks sjoerddebruin
[13:57:51] http://tinyurl.com/y73yh2ej almost 600 million descriptions...
[14:04:10] nikki: I got 600001861, does that mean we just passed 600M?
[14:04:44] I assume so
[14:04:58] I assume bots are still happily adding them
[14:05:34] it was at 599996503 for me
[14:06:14] ok
[14:06:21] pity that https://grafana.wikimedia.org/dashboard/db/wikidata-datamodel-terms is broken
[14:07:08] yeah, now that I try again it's over 600 million
[14:08:42] bah, it doesn't want to tell me how many category descriptions we have
[14:09:03] from our results the other day, I'm pretty sure the "wikimedia ..." ones are more than half though
[14:12:29] sum of first 152 lines of https://gist.github.com/lucaswerkmeister/a5b02cda1ce9ea14874dd2828ce57e79 is 347569647
[14:12:41] and it looks like those are all Wikimedia stuff (category or disambiguation)
[14:12:52] so yeah, more than half looks fairly accurate :)
[15:38:01] Lucas_WMDE: I get a constraint violation for country: novalue... it seems it doesn't like having novalue if there's a constraint saying that items used in a statement should have a specific property
[15:38:09] but that seems wrong, since novalue is a special value, not an item
[15:38:37] hrm
[15:39:18] if we had an explicit “must have custom value” constraint I’d feel more comfortable removing that implicit constraint from all the constraint types that impose it
[15:39:39] I think all the checkers that need the value will currently tell you that it shouldn’t be somevalue/novalue
[15:39:53] the alternative is to add every item in antarctica as an exception :P
[15:40:08] they could all be changed to skip the constraint check instead, but I feel like there’s probably some properties where you want to prevent somevalue/novalue
[15:40:18] imo a new "allowed types of value" should be introduced
[15:40:19] and we currently don’t have a constraint type for that
[15:41:17] matej_suchanek: that could work if we had an item for “custom value”
[15:41:29] (somevalue/novalue would be represented by themselves, of course :) )
[15:41:44] like that, yeah
[15:42:21] can one of you open a Phabricator task so we don’t forget?
[15:43:26] doing...
[15:43:33] thanks
[15:45:45] https://phabricator.wikimedia.org/T172129
[15:47:03] I guess we should have another ticket for removing the constraint violation for country: novalue too?
[15:50:05] nikki: https://phabricator.wikimedia.org/T172130
[15:50:50] yet another chain of tasks :)
[15:50:55] do we really have to wait until then? :/
[15:51:38] maybe I should remove that constraint, it has over 800,000 violations
[15:52:50] hm, perhaps it doesn’t have to wait
[15:54:16] the description of the constraint says "If [item A] has this property with value [item B], [item B] is required to have property ..." after all, it doesn't say "isn't allowed to use novalue or somevalue"
[15:54:59] I wonder how the bot treats novalue's...
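(Editor's sketch of the "skip the constraint check instead" option raised at 15:40:08, for the constraint that says the target item must itself have a given property. It relies only on the Wikibase JSON convention that a snak's `snaktype` is `value`, `somevalue` or `novalue`. The function name, the `load_entity` callback and the three result strings are assumptions made for illustration; this is not the extension's actual checker code.)

```python
def check_target_has_property(statement, required_property, load_entity):
    """Check that the item used as a value has `required_property`.

    `statement` is a statement in Wikibase JSON form and `load_entity`
    is a hypothetical callback mapping an entity ID to its JSON.
    """
    mainsnak = statement["mainsnak"]

    # somevalue/novalue carry no target item, so there is nothing to
    # inspect; report "skipped" rather than a violation.
    if mainsnak["snaktype"] != "value":
        return "skipped"

    target_id = mainsnak["datavalue"]["value"]["id"]
    target = load_entity(target_id)
    if required_property in target.get("claims", {}):
        return "compliance"
    return "violation"
```

(With this behaviour a country: novalue statement is neither a pass nor a failure, which leaves room for a separate constraint type to forbid somevalue/novalue where that is wanted.)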
[15:56:01] no idea
[15:58:20] if you find out that the bot ignores them instead of reporting them, that would be a strong argument for ignoring them in the extension before we have the new constraint type…
[15:58:33] the bot’s not open source, is it? that would simplify my life a lot (in general)
[15:59:01] haven't seen a piece of code yet, probably not
[16:00:00] looks like the bot complains as well, at least on Format: https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P3963#.22Format.22_violations
[16:00:34] but perhaps it varies by constraint type
[16:00:46] I saw some regexes which allow an empty string, then some/novalue are considered ok
[16:00:55] at least by the bot
[16:01:11] ok…
[16:01:42] but this "hack" can also be replaced by the proposed const. type
[16:03:13] definitely
[16:06:53] nikki: which constraint has 800,000 violations? I only count 18997 country: novalue statements
[16:08:13] the one on country which says the target also needs country
[16:08:34] oh, you mean 800,000 violations total, not just due to the somevalue/novalue thing?
[16:08:37] yeah
[16:08:51] ah, ok
[16:09:23] * nikki needs to rush out
[16:12:09] nikki, Lucas_WMDE: when you have time, please see https://phabricator.wikimedia.org/T172134
[16:16:33] matej_suchanek: <3 for opening a Phabricator task and not just writing on a talk page that I’ll never see ;)
[16:16:45] did you explicitly CC me? I’m curious why I got an email about it
[16:17:16] yes, I did... perhaps adding the tags is enough
[16:17:38] oh, you've already commented
[16:18:10] if adding the tags is enough, then I can think of someone who could benefit from that mechanism *cough*
[16:18:19] but I don’t think I’m actually subscribed to the tag, lemme change that
[16:19:36] having read your reply, you mean that "number of values" qualifier wouldn't be necessary
[16:19:47] yes
[16:19:51] it would always be minimum+maximum quantity
[16:20:00] that way we don’t need a new property
[16:20:10] smart
[16:20:27] (though you’re admin, so new property isn’t a big hurdle :D )
[16:20:40] I haven't created one...
[16:22:13] Hm, the effect of https://phabricator.wikimedia.org/T169060 is lower than I thought.
[16:22:13] in the proposal, I've mentioned "single value" that could also be merged... what do you think?
[16:23:00] I'm afraid it may complicate things for users
[16:23:17] matej_suchanek: no objection from me
[16:23:36] SingleValueChecker and MultiValueChecker only differ in 4 lines, if I can get rid of that duplication I’m happy :)
[16:23:42] which users do you mean?
[16:23:55] users that maintain properties
[16:24:07] for users that see the constraint violations, it should be okay if we have good messages
[16:24:08] okay
[16:24:17] for them it could be a bit confusing, yeah
[16:24:36] I expect this to be decided on first
[16:25:32] but I'm glad to hear there will be no problem with it in the backend
[16:27:09] nikki: I’m surprised that you didn’t complain about the message
[16:27:09] > Properties with constraint "Q21510865" need to have a value.
[16:27:10] that item ID should be a link with label :)
[16:31:00] Ah, found one. Items like these have suggestions now. https://www.wikidata.org/wiki/Q1499116
[18:37:58] Hm, it's decreasing... https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch?refresh=1m&orgId=1&from=now-7d&to=now
[18:42:02] I don't know what to do with https://www.wikidata.org/wiki/Q163758... it seems people didn't use to distinguish the massif and the highest peak and now they do and we have another item specifically for the massif and another specifically for the peak
[20:29:54] WikidataFacts: where was that message shown?
[20:36:16] nikki: https://www.wikidata.org/wiki/Q1513315, constraint violation on country statement
[20:37:22] huh. I don't remember seeing that one
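(Editor's sketch for the 16:19:36 to 16:23:36 exchange: the idea of folding "single value" and "multi value" into one check driven by a minimum and maximum quantity might look like the following. The function name, the keyword parameters and the result strings are assumptions made for illustration; the real SingleValueChecker and MultiValueChecker are not reproduced here.)

```python
def check_value_count(statements, minimum=None, maximum=None):
    """Check how many statements a property has on one item.

    A bound of None means that side is unlimited. Under this reading,
    "single value" is maximum=1 and "multi value" is minimum=2.
    """
    count = len(statements)
    if minimum is not None and count < minimum:
        return "violation"
    if maximum is not None and count > maximum:
        return "violation"
    return "compliance"


# The two existing constraint types expressed as bounds on one check:
def single_value(statements):
    return check_value_count(statements, maximum=1)


def multi_value(statements):
    return check_value_count(statements, minimum=2)
```

(Both wrappers share one code path, which is the duplication the 16:23:36 message hopes to remove.)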