[07:45:45] is anyone around who actually uses twitter? wondering if they can try pinging magnus about https://www.wikidata.org/wiki/Wikidata:Administrators%27_noticeboard#Another_fight_with_bot
[07:46:42] blocking the account will break the quick statements batch mode, but if it's going to keep re-adding removed statements every day, I don't see any other option :/
[07:50:21] nikki: I pinged him (from my personal account)
[07:50:29] thanks
[07:50:31] np
[09:32:04] Anybody familiar with the Wikidata search API?
[09:32:19] I wonder what the "search-continue" field means.
[09:34:28] no_gravity: in wbsearchentities?
[09:34:47] "offset to continue the query from a previous search"
[09:35:49] sjoerddebruin: Yup, in wbsearchentities.
[09:35:53] sjoerddebruin: What does it mean?
[09:36:03] For example here:
[09:36:07] https://www.wikidata.org/w/api.php?action=wbsearchentities&format=json&search=star+wars&language=en&type=item
[09:36:17] "search-continue: 7"
[09:36:27] 7 what?
[09:36:50] 7 results
[09:37:08] I mean what does the field mean?
[09:37:17] Hello, excuse me if I come back to an old question. Are there tools for extracting a small dump of Wikidata focused on the life sciences?
[09:37:25] sjoerddebruin: That there are more results and I have to paginate or something?
[09:37:37] 7 would be a strange choice for a maximum number of results.
[09:37:38] The number of results currently showing, I think. You can use the value to feed the parameter "continue".
[09:37:41] https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities
[09:37:47] 7 is the default for some reason.
[09:37:53] 7? Really?
[09:38:03] I would have intuitively grasped it if it were 10.
[09:38:43] Not sure how 10 would influence our perforamnce.
[09:38:49] Funky
[09:38:50] Performance, need more coffee.
[09:39:03] Ciccio: you can use the Query Service?
[09:39:19] sjoerddebruin: Thanks
[09:39:38] No problem. API documentation is still hidden at times.
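(To make the pagination above concrete: the search-continue value is meant to be fed back in as the continue parameter, e.g. https://www.wikidata.org/w/api.php?action=wbsearchentities&format=json&search=star+wars&language=en&type=item&continue=7 should return the next page of matches, together with a new search-continue value if there are more.)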
[09:42:00] sjoerddebruin: do you mean the SPARQL endpoint? Yep, I definitely could, but I would have to hard-code custom queries. I'm wondering if there is a smarter way to get that data.
[09:43:02] That seems like the easiest way to get a small portion of data. https://www.wikidata.org/wiki/Wikidata:Data_access
[09:44:24] sjoerddebruin: ok, got it, thank you very much : )
[11:19:33] PROBLEM - High lag on wdqs1002 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [1800.0]
[11:19:34] PROBLEM - WDQS HTTP on wdqs1002 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Temporarily Unavailable - 387 bytes in 0.001 second response time
[11:19:43] PROBLEM - WDQS SPARQL on wdqs1002 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Temporarily Unavailable - 387 bytes in 0.001 second response time
[13:03:38] Hello, when I download data from wikidata in nt (by adding ".nt" at the end of the URL), the properties of the property (e.g., owl:equivalentProperty, owl:inverseOf) are not stated according to the OWL standard. E.g., "part of" (P361) is the inverse property of "has part" (P527), but the inverse is expressed with P1696 and not owl:inverseOf. Is there a way to extract the data in a standard way?
[13:06:38] the only person I know who knows about the rdf stuff is SMalyshev, who is probably not around right now
[13:06:53] maybe there's someone else who can help though
[13:14:45] I'm getting badtoken API errors for lots of userscripts. Is something happening?
[13:14:55] Reloading the page works, but it's annoying
[13:18:28] I haven't noticed anything :/
[13:27:13] hi nikki, ok thank you
[13:41:54] When I query for something "part of" (P361) something else, I do not get all the results, such as "something else has part something". Since they are logically related (part of is the inverse of has part), it's strange... why does that happen?
[13:43:41] Ciccio: unfortunately there is no such inference going on in SPARQL
[13:43:54] you have to add the reverse edge explicitly in Wikidata
[13:44:19] For example, according to the Reasonator https://tools.wmflabs.org/reasonator/?&q=720988, the human genome has only 7 chromosomes, and that's not true : (
[13:44:23] or you explicitly ask for it in the query, e.g. wdt:P361|^wdt:P527
[13:44:42] but for part of / has part, both edges should always exist in Wikidata
[13:44:53] so the inference shouldn't be necessary
[13:46:14] pintoch: Doesn't Wikidata perform any inference about its data? :(
[13:48:08] Lucas_WMDE: so those kinds of cases are managed manually, right? In the case of the chromosomes of the human genome, there is no "part of" counterpart for "has part"
[13:48:40] Ciccio: well, people can run bots to fix these issues, but there's nothing in Wikidata itself that does inference
[13:49:15] ok Lucas_WMDE, got it : )
[13:52:50] Lucas_WMDE: actually your integration of constraint violations in Wikibase introduces some sort of inference, right?
[13:53:03] it isn't exposed in SPARQL but it is already something
[13:53:07] I wouldn't really say that
[13:53:14] it tells you that the statement is missing
[13:53:21] yeah, it does not add it
[13:53:23] but that doesn't mean that adding (inferring) the statement is the right solution
[13:53:43] perhaps something else should be changed
[13:54:01] yeah, I just mean that it is some sort of basic reasoning on top of the data model
[13:54:29] that is available via the API, and so on
[13:55:28] yeah
[14:24:06] Lucas_WMDE: what does wdt:P361|^wdt:P527 mean?
[14:24:30] it's a property path, you can use it in the predicate position of a triple
[14:24:46] a|b means "or" (?subject a ?object OR ?subject b ?object)
[14:24:55] ^ inverts the direction of the partial path
[14:25:24] so ?subject wdt:P361|^wdt:P527 ?object means ?subject wdt:P361 ?object OR ?object wdt:P527 ?subject
[14:25:35] ah
[14:25:41] thanks
[14:26:00] tbh, a UNION might be the better way to write this, much more readable :)
[14:28:33] not as compact though! especially if I want to make the indentation nice and symmetrical
[14:28:58] yeah
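(To make the property path above concrete, here is a minimal sketch, untested, of the two equivalent ways to ask for the parts of the human genome (Q720988, the item from the Reasonator link earlier): once as a property path, once spelled out as the UNION that Lucas_WMDE mentions. Either form returns matches entered in either direction, which works around items where only one of the two edges was filled in.)

    SELECT ?part WHERE {
      ?part wdt:P361|^wdt:P527 wd:Q720988 .
    }

    SELECT ?part WHERE {
      { ?part wdt:P361 wd:Q720988 . }      # ?part "part of" human genome
      UNION
      { wd:Q720988 wdt:P527 ?part . }      # human genome "has part" ?part
    }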
[15:02:33] PROBLEM - High lag on wdqs1001 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [1800.0]
[15:06:34] PROBLEM - High lag on wdqs1003 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [1800.0]
[15:08:34] PROBLEM - High lag on wdqs1003 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [1800.0]
[15:10:34] PROBLEM - High lag on wdqs1003 is CRITICAL: CRITICAL: 41.38% of data above the critical threshold [1800.0]
[15:30:50] Hey! I'm sorry, I think I broke Template:Wikidata-Dev-Navigation by adding a new link, could someone have a look? Thanks a lot ^^
[15:42:36] Auregann_WMDE: why do you think you broke it? any links where I can see it's broken?
[15:46:16] if you mean that you didn't see all the links at Template:Wikidata-Dev-Navigation, it just required purging
[15:55:23] Hello, folks! When you transferred data from Freebase to Wikidata, why did you use only Wikipedia links for constructing the object mapping? Is it because all other URLs have little influence, or something else?
[16:15:19] I don't really understand the question, and I don't know anything about the freebase stuff so I can't really answer the question, but my guess would be that if you have a dataset with lots of wikipedia links, that's going to produce the most matches, because the other ids/urls in wikidata mostly came from the wikipedia pages
[16:20:50] nikki: I found the answer in the paper about the migration, I just didn't notice it before. My bad. Thanks for the response anyway.
[16:21:04] oh, good :)
[17:28:58] Interesting. https://grafana.wikimedia.org/dashboard/db/wikidata-webpagetest?refresh=5m&panelId=40&fullscreen&orgId=1
[18:00:31] looking at https://twitter.com/everypolitbot/status/885467159916412929 I wonder if there is a simple way in SPARQL to say "only consider claims that hold today" (as specified by the start time and end time qualifiers)
[18:01:02] (this question has probably been asked many times, sorry ^^)
[18:01:38] I can see how you would decompose every statement in the SPARQL query to add the filters manually, but it is quite painful to do
[18:02:18] I don’t think there’s anything especially simple
[18:02:38] is that the sort of thing one could implement with a SERVICE ?
[18:03:18] hmm… probably
[18:03:20] I have no idea how the service API works :-/
[18:03:47] but if so, I would find that quite useful (I have run into this problem multiple times)
[18:03:51] a custom function might also work
[18:03:59] FILTER(wikibase:isCurrent(?statement)) or something like that
[18:04:12] ah, interesting
[18:04:22] (that’s a fictional function I just made up)
[18:04:25] but that still requires you to bind all the statements to variables
[18:04:29] yeah
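(A sketch of the manual decomposition pintoch describes above: nothing like isCurrent() exists, but the start time (P580) and end time (P582) qualifiers can be checked per statement. P6 (head of government) on Germany (Q183) is only an illustrative choice here, not something from the discussion. Painful indeed, since every statement that matters has to be spelled out like this.)

    SELECT ?head WHERE {
      wd:Q183 p:P6 ?statement .                      # one statement node per claim
      ?statement ps:P6 ?head .
      OPTIONAL { ?statement pq:P580 ?start . }
      OPTIONAL { ?statement pq:P582 ?end . }
      FILTER(!BOUND(?start) || ?start <= NOW())      # keep claims with no start time, or one that has already passed
      FILTER(!BOUND(?end) || ?end >= NOW())          # keep claims with no end time, or one still in the future
    }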
[18:05:11] I was also thinking about the same thing for ranks
[18:05:35] very often I want to say "only consider statements which have the best rank among the other statements with the same property on the item"
[18:05:57] but that’s wdt:, isn’t it?
[18:06:03] oh is it?
[18:06:06] yes :)
[18:06:10] ah that's great :-D
[18:06:14] only best rank, never deprecated rank
[18:06:19] the t stands for “truthy” :)
[18:06:32] oh I see!!
[18:06:35] “wow, that was easy!” :D
[18:07:01] that is very good to know, thanks
[18:07:45] yeah so basically my service would only be useful for queries in the past
[18:07:51] although I don't think you can do that if you need qualifiers
[18:08:04] nikki: good point
[18:08:05] (unless there's some truthy version of p: that I've not heard about?)
[18:08:20] no, but best rank statements have rdf:type wikibase:BestRank iirc
[18:08:27] (though I don’t think I’ve ever used that)
[18:08:57] still complicates things since you have to find best rank or normal rank if there aren't any best rank ones :/
[18:09:17] no, best rank is “preferred, or normal if there aren’t any preferred statements”
[18:09:20] oh
[18:09:24] it does that for you :)
[18:11:00] this is the sort of moment when I realize it would be worth reading the docs sometimes :-°
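(For reference, the pattern being described, which gets confirmed later in the log at 20:14 with the tinyurl query: best-rank statement nodes carry rdf:type wikibase:BestRank, so the full p:/ps: form can still be restricted to "preferred, or normal if there is no preferred statement". A rough sketch, with P39 (position held) chosen arbitrarily as the example property:)

    SELECT ?item ?position WHERE {
      ?item p:P39 ?statement .
      ?statement a wikibase:BestRank ;    # preferred rank, or normal if the item has no preferred P39 statement
                 ps:P39 ?position .
    }
    LIMIT 10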
[18:11:34] oof, but I can’t test any of this on the sandbox item right now because apparently WDQS is 3 hours behind Wikidata :O
[18:12:12] woot
[18:12:16] https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?refresh=1m&panelId=8&fullscreen&orgId=1
[18:12:21] lord have mercy
[18:12:27] wdqs1002 is DAYS behind
[18:12:41] I said that before yeah
[18:13:08] and no matter how many times I send the query I always get x-served-by: wdqs1003
[18:13:53] SMalyshev: ^
[18:14:24] looks like error rate and varnish latency both plateaued 4 hours ago https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?refresh=1m&orgId=1
[18:14:24] yes I know. Somebody is spamming the service :(
[18:14:30] aw, shit
[18:14:43] unfortunately I didn't have much success in asking for help from ops in blocking the source
[18:14:50] Like, sending queries?
[18:15:02] yes, same query actually. very heavy one
[18:15:05] 13K per hour
[18:15:07] Interesting.
[18:15:15] * WikidataFacts quickly checks https://grafana.wikimedia.org/dashboard/db/wikidata-quality?orgId=1
[18:15:29] phew, doesn’t look like it’s my code, no elevated query count
[18:15:29] unfortunately, even with IP rate limiting and timeouts, enough passes through to do the damage
[18:15:43] Just a regular IP or?
[18:15:58] it looks like this: SELECT ?cid ?article ?gnd WHERE { ?cid wdt:P227 ?gnd. ?article schema:about ?cid . filter( regex(str(?article), "en.wikipedia.org" )) } limit 1000 offset 185000
[18:16:10] gnd, interesting
[18:16:22] the offset is different each time, but always high, and there are always 10k+ of them per hour
[18:16:28] looks like some broken script
[18:17:19] it's a single IP, so maybe I could block it on my side of thins
[18:17:20] nikki: apparently fnielsen took care of the quickstatementsbot fight (https://www.wikidata.org/wiki/Wikidata:Administrators%27_noticeboard#Another_fight_with_bot)
[18:17:22] *things
[18:18:08] ugh, that query can easily be rewritten to be much more efficient too
[18:18:17] if only we could contact the creator to tell them that :D
[18:18:26] I have the IP but that's it
[18:18:31] though the offset might still be tricky
[18:18:46] blazegraph doesn't do large offsets very well
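(For the record, one way the quoted GND query could be written more efficiently, as a sketch of the idea rather than what the unknown script actually does: sitelinks carry schema:isPartOf in the RDF, so the regex over every article URL can be dropped. The large OFFSET, which Blazegraph handles poorly, would still need a different paging strategy.)

    SELECT ?cid ?article ?gnd WHERE {
      ?cid wdt:P227 ?gnd .                                   # GND identifier
      ?article schema:about ?cid ;
               schema:isPartOf <https://en.wikipedia.org/> . # only English Wikipedia sitelinks
    }
    LIMIT 1000 OFFSET 185000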
[18:20:07] and i assume it doesn’t have a useful rDNS
[18:20:16] (the IP I mean, sorry)
[18:20:32] WikidataFacts: those were just two of the examples, from what I understood
[18:20:40] hm
[18:21:10] it's in Frankfurt, that's all I know more or less
[18:21:21] the plot thickens
[19:44:47] A bunch of DCs are in Frankfurt as that's where the DE-CIX is
[19:45:12] And pretty much all the cloud providers have a region there
[19:48:53] hey, it looks like WDQS load went back down about half an hour ago
[19:49:08] SMalyshev: did you manage to get anywhere with the ops people?
[19:49:15] or did the queries stop?
[19:49:30] nope, but I think it stopped for the meantime
[19:49:31] ok
[19:49:39] I'll look into how to do blocking on our side, if possible
[19:50:07] so now we wait for the lag to go back down… looks like wdqs1003 lag is already down by half
[19:52:32] yeah, as soon as the junk has stopped, it is catching up
[19:52:38] probably will be on in an hour or so
[19:53:04] the 7-day graphs look *really* interesting: https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?refresh=1m&orgId=1&from=now-7d&to=now
[19:53:13] straight, clean triangles in the Batch Progress
[19:54:26] I see, nice :)
[19:54:36] but it's *only* missing data
[20:01:20] triangles are db reloads... due to link format change
[20:01:33] there are still 2 servers that need to be reloaded, out of 6
[20:02:44] oh, you’re staggering them?
[20:02:51] okay
[20:03:02] yes of course, I don't want to take everything down for 3 days
[20:03:10] oh
[20:03:17] I didn’t realize how long a reload takes
[20:03:40] yup. About a day to load the db from dump, and about 1 or 2 to catch up
[20:03:58] depending on how fresh the dump is
[20:04:02] ah, so the wdqs1002 lag is unrelated to those expensive queries, that just hasn’t caught up yet
[20:04:10] and how much stuff was added in the meantime
[20:04:16] WikidataFacts: right
[20:04:57] now if everybody would stop editing wikidata for a couple of days, that'd go faster of course... ;)
[20:05:43] so what you’re saying is, sjoerddebruin has a good excuse to block aggressive bots much earlier right now? ;)
[20:06:20] sure, why not :)
[20:13:27] oh, I had been wondering why wdqs1002 was so much more lagged
[20:14:41] no wonder we have lagging problems... atm there's a user editing 5 times per sec.
[20:14:55] anyways, the very belated confirmation for nikki and pintoch: yes, `?statement a wikibase:BestRank` does the useful thing http://tinyurl.com/ycupxmsa
[20:15:11] seems like wdqs1003 is back to normal
[20:15:15] yup
[20:15:22] now if only I could convince my queries to always go to that one :P
[20:15:25] that spamming host creates 99% of the 429's and 500's for the last week :(
[20:15:49] SMalyshev: is it possible to select the host that queries are routed to
[20:15:51] ?
[20:15:58] I really need to block it until whoever made it fixes that
[20:15:59] WikidataFacts: nope
[20:16:01] hrm
[20:16:14] that means the “Data updated *n* time ago” display can be misleading
[20:16:20] because it might hit a completely different server
[20:16:24] the vps that we're using works at the kernel level (no packet inspection) and is randomized
[20:16:35] ok
[20:16:51] and now they went to 8 edits per sec...
[20:17:00] WikidataFacts: yup, ideally all servers are the same but in reality...
[20:17:19] that's why we can't cancel queries btw
[20:17:26] maybe the response should include the date? then it could update it
[20:17:32] ah, yes, I remember that issue :)
[20:17:33] I just have no way to send a cancel request to a specific server
[20:18:44] WikidataFacts: re: ranks, cool :) also I see you found an opportunity to use | again :P
[20:19:01] :D
[20:19:40] one thing’s really inconvenient about | though – autocompletion doesn’t quite understand it
[20:19:51] try starting with p:P31|p:sex or gender and autocompleting it
[20:23:47] is it legal to run several QS at once? I would say no but try to explain...
[20:24:18] you can run several queries at once, but there's a 5 per IP limit
[20:24:29] It's really unwanted.
[20:24:35] and if those are heavy queries, I'd rather you didn't :)
[20:24:39] ah, I see what you mean.
although not a problem *I'm* going to have any time soon since I've force-hidden those popups with some !important css
[20:24:40] There have been discussions about this, only one instance at a time.
[20:25:01] look on RC
[20:25:20] I suspect them of being behind today's lag increase
[20:26:09] https://www.wikidata.org/wiki/User:Mr.Ibrahembot was causing a lot of lag too, that one is running slower now.
[20:26:14] I don't think it's any one user
[20:26:24] there have been several people doing lots of edits at different points
[20:26:46] True. https://www.wikidata.org/wiki/Special:Contributions/XXN-bot just continues.
[20:27:28] I mean https://www.wikidata.org/wiki/Special:Contributions/Muhammad_Abul-Futooh
[20:27:37] those edits have some capitalisation problems, and if they want to do stuff like that, there are more efficient ways :/
[20:27:39] Yeah, that one has been editing for a while now.
[20:27:48] RECOVERY - High lag on wdqs1003 is OK: OK: Less than 30.00% above the threshold [600.0]
[20:27:53] 454,150 edits this month
[20:28:02] 425,627 in the last 24 hours
[20:28:14] \o/
[20:28:16] omg
[20:28:18] PROBLEM - High lag on wdqs1001 is CRITICAL: CRITICAL: 55.17% of data above the critical threshold [1800.0]
[20:28:26] Please warn.
[20:29:38] hmm icinga you are wrong, wdqs1001 is fine as far as I see
[20:29:58] also 1003, both caught up now
[20:30:19] Seems faster than the dispatch then...
[20:30:43] I'm writing a letter for them
[20:32:16] done
[20:33:01] that wbsetdescription spike https://grafana.wikimedia.org/dashboard/db/wikidata-edits?refresh=1m&panelId=4&fullscreen&orgId=1
[20:33:35] ah, so we have graphs for these as well...
[20:33:44] Graph everything!
[20:34:14] they've made it 532 edits/s...
[20:34:31] http://tinyurl.com/ybgjzwm9
[20:34:45] need to login for that
[20:34:52] this is what the broken bot I'm fighting is doing ^
[20:35:15] sjoerddebruin: yeah sorry, this is from raw logs, so yeah, it requires login
[20:35:27] Do you need a special account or?
[20:36:08] hmm, I dunno, if you have an account that can be used to see stats with PI then that one would work, but if not then probably not :(
[20:36:13] Seems like he stopped. https://www.wikidata.org/wiki/Special:Contributions/Muhammad_Abul-Futooh
[20:36:18] RECOVERY - High lag on wdqs1001 is OK: OK: Less than 30.00% above the threshold [600.0]
[20:36:29] ok thanks icinga :)
[20:36:54] wdqs1002 is slowly catching up
[20:39:35] matej_suchanek: http://wikidata.wikiscan.org/hours/24/users is useful for this
[20:40:01] didn't know this one, thanks a lot
[20:40:07] I will explore it tomorrow
[20:40:07] :D
[20:40:40] This is this month so far http://wikidata.wikiscan.org/?date_filter=201707&menu=userstats
[20:40:48] now I’m just imagining this scene in my head
[20:40:48] icinga: wdqs1001 is CRITICAL
[20:40:48] SMalyshev: *glares at icinga* icinga. we talked about this
[20:40:48] icinga: sorry, nevermind, wdqs1001 is OK
[20:40:56] http://wikidata.wikiscan.org/?menu=userstats&usort=total&bot=1&detail=0&page=1&date_filter=201707 (with bots)
[20:41:06] heh :)
[20:42:37] the queue is indeed dropping now
[20:45:01] The number of edits per day is still 19% bigger than last year.
[20:45:10] A lot of interesting stats. http://wikidata.wikiscan.org
[20:45:59] 99% main namespace edits is a proud thing. :)
[20:46:13] just been there, thank you again
[20:46:25] (only 71% for enwiki and 83% for commons)
[20:49:32] I have a latitude and longitude, how can I make a coordinate thing the query service understands so that I can switch to the map?
[20:52:24] https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Globe_coordinate mentions "geo:wktLiteral"
[20:52:47] I have bind(concat("Point(", ?lon, " ", ?lat, ")")^^geo:wktLiteral as ?coords2) . but that gives me an error
[20:53:42] bind(geo:wktLiteral(concat("Point(", ?lon, " ", ?lat, ")")) as ?coords2) . doesn't give me an error but doesn't give me any results anymore either
[20:55:09] (the coordinates come from some other dataset, which is why I have variables in a form it doesn't understand)
[20:58:37] and bind(concat("Point(", ?lon, " ", ?lat, ")") as ?coords2) . doesn't give me an error, does give me results, but the variable is empty
[21:00:19] and bind(geo:wktLiteral(?lon, ?lat) as ?coords2) . doesn't give me any results. I think I've run out of ideas now
[21:03:12] bind(geo:wktLiteral(concat("Point(", str(?lon), " ", str(?lat), ")")) as ?coords2) . also doesn't work
[21:03:39] WikidataFacts: maybe you can help me :P I'm trying to turn latitude and longitude variables into something it knows how to map
[21:05:29] sorry, I got DCed for a bit
[21:06:02] nikki: I think you want the strdt function
[21:06:16] strdt(concat(…), geo:wktLiteral)
[21:06:19] or the other way around, not sure
[21:06:57] <3
[21:06:59] yes :D
[21:07:04] :)
[21:07:16] bind(strdt(concat("Point(", str(?lon), " ", str(?lat), ")"), geo:wktLiteral) as ?coords2) . works, ugly as hell but...
[21:07:25] maybe I don't need those str()s
[21:07:40] oh, I do. bah
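(The working version from 21:07, slightly tidied for reference: STRDT() turns the assembled string into a geo:wktLiteral, which is what the map view looks for, and WKT wants longitude before latitude. The VALUES row is only a stand-in for the external dataset the coordinates actually come from; Q64/Berlin is an arbitrary example.)

    SELECT ?item ?coords WHERE {
      VALUES (?item ?lon ?lat) { (wd:Q64 13.4 52.5) }   # placeholder row; real queries would bind ?lon/?lat from the other dataset
      BIND(STRDT(CONCAT("Point(", STR(?lon), " ", STR(?lat), ")"), geo:wktLiteral) AS ?coords)
    }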
[21:08:55] https://www.wikidata.org/wiki/Special:Contributions/Muhammad_Abul-Futooh seems to be running again, still too fast imo
[21:09:16] moment...
[21:10:22] it's a bit slower, 1-2 hits/s
[21:11:44] hello
[21:11:49] hey!
[21:12:27] by the way, I wonder why they were allowed to edit so quickly
[21:12:39] I mean, no throttling was in place
[21:13:06] how can I install wikidata with wikimedia
[21:13:35] all the extensions necessary
[21:13:46] to install
[21:14:16] to build a wiki
[21:14:32] diego__: see https://www.mediawiki.org/wiki/Wikibase/Installation
[21:15:30] can I write in spanish
[21:15:40] my english
[21:15:51] is not good
[21:20:28] (in Spanish) I'm working on a cultural project and I need to make a wiki, like Wikipedia's; the MediaWiki installation is already done, but what I'm missing is how to install MediaWiki step by step
[21:21:24] (in Spanish) sorry, I'm working on a cultural project and I need to make a wiki, like Wikipedia's; the MediaWiki installation is already done, but what I'm missing is how to install Wikidata step by step
[21:23:52] (in Portuguese) which version of MediaWiki are you using?
[21:24:21] (in Portuguese) (I'm sorry, but I don't speak Spanish, and my Portuguese isn't very good either, I hope you can understand :) )
[21:24:35] welp, nevermind
[21:24:53] hello
[21:25:12] (in Spanish) I'm working on a cultural project and I need to make a wiki, like Wikipedia's; the MediaWiki installation is already done, but what I'm missing is how to install MediaWiki step by step
[21:26:12] (in Spanish) and how to get wikidata working inside the wiki
[21:26:15] (in Spanish) and how to get wikidata working inside the wiki
[21:27:59] can you help
[23:57:05] sparql's driving me crazy. can anyone figure out why queries like this return blanks? http://tinyurl.com/ybfgdjh4