[08:09:27] This is nice to wake up to. https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch?refresh=1m&orgId=1&from=now-12h&to=now
[08:09:54] Still, we need more capacity.
[11:36:58] Lucas_WMDE: any idea why http://tinyurl.com/ybvcae6b works and http://tinyurl.com/ybgyzhxx doesn't?
[11:37:08] I'm guessing it's a bug but maybe I'm missing something
[11:37:35] if I add "?title wikibase:apiOutput mwapi:title" it has the ids as the page title, so it's definitely *finding* items, it's just not filling out the item variable properly
[11:39:31] hm, interesting edge case
[11:39:37] Wikidata items aren’t sitelinks of themselves, that’s probably the issue
[11:39:52] ah, good point
[11:40:33] but it seems obvious that this should work – MWAPI has *less* work to do if the API already returns items :)
[11:40:38] open a task?
[11:40:48] I was trying to generate some search links that combine a free-text search on the title with restrictions on the statements, but typical, I manage to run into some weird behaviour that makes it more awkward :P
[11:40:57] will do
[11:41:05] ok
[11:57:23] https://phabricator.wikimedia.org/T172685
[12:55:12] Lucas_WMDE: more behaviour I don't understand :( http://tinyurl.com/y8m9dv66 this is super fast, but uncommenting the line near the end makes it time out
[12:55:46] I tried putting the service bit in a separate "with { }" thing but that didn't help
[12:57:00] I tried disabling the optimizer but that doesn’t help either
[12:58:34] weird
[13:00:15] I can’t find any way to make that not timeout
[13:01:57] wtf, why are the datatypes empty?
http://tinyurl.com/y8mwxbou
[13:02:33] oh, because IRIs aren’t literals
[13:07:31] I'm confused :P
[13:08:54] I was trying to debug if there was anything wrong with ?item
[13:09:02] but it’s correct
[13:09:27] Hmm https://www.wikidata.org/wiki/Q23708554
[13:09:58] ah
[13:10:10] I thought you meant I couldn't make ?item like that or something
[13:10:30] no, it’s completely correct
[13:10:44] oh good :D
[13:10:44] and the query is still slow even if you replace ?item with a wd:Q… literal, actually
[13:10:48] no clue why
[13:10:51] strange
[13:33:18] In case you are wondering why dispatch is growing again. https://www.wikidata.org/wiki/Special:NewPages
[13:45:14] any idea how many cebwiki articles are still left to be created?
[13:46:02] Let's see.
[13:47:24] Yesterday we had 4,025,799 sitelinks to ceb, they have 4,986,579 articles.
[13:47:33] (using https://www.wikidata.org/wiki/User:Pasleim/Sitelink_statistics)
[13:47:49] oh, okay
[13:48:23] Note that the sitelinks count includes all pages.
[13:48:30] So the gap is even bigger.
[13:49:22] do you think there’s a significant number of non-article sitelinks to cebwiki?
[13:49:27] hm, perhaps categories
[13:53:18] https://ceb.wikipedia.org/wiki/Mud_Lake
[13:53:58] There is also https://ceb.wikipedia.org/wiki/Mill_Creek
[17:32:31] I cannot edit Wikidata because search is not available ..
[17:32:42] ?
[17:32:45] I trust I am not the first one to notice this
[17:33:15] An error has occurred while searching: Search is currently too busy. Please try again later.
[17:34:35] I see some spikes on https://grafana.wikimedia.org/dashboard/db/elasticsearch?orgId=1&from=now-6h&to=now
[17:35:19] Seems like a patch that went wrong and was just reverted.
[17:36:05] ok
[17:36:27] SMalyshev: I forgot, are you aware of https://www.wikidata.org/wiki/User:Sjoerddebruin/Cirrus?
:)
[17:37:00] sjoerddebruin: yes, thanks
[17:37:17] we'll be tuning it soon, see https://phabricator.wikimedia.org/T172467
[17:37:22] Thanks Sjoerd
[17:37:24] Yeah, I saw that.
[17:37:26] GerardM: no problem.
[19:40:07] oooh SMalyshev
[19:40:17] lowercase entity IDs
[19:40:30] they can indeed be a thing, and will be in old data
[19:40:31] addshore: yeah do you know anything about them?
[19:41:00] is addbot creating them or something else?
[19:41:07] They can exist (historically) but everything now should only accept / normalize to uppercase ids
[19:41:25] Nope, not an issue with addbot, it used to be an api issue, im sure there is a relevant ticket somewhere
[19:41:33] also, note, addbot hasnt made an edit on wikidata for years
[19:42:22] they may be a problem in queries, as I'm pretty sure URIs are case sensitive
[19:42:31] oooh, interesting
[19:42:45] where did you find these lowercase entity ids? in the database?
[19:43:10] if it's widespread I could maybe make some normalization when we convert string to URI...
[19:43:44] addshore: are we sure nobody produces new lowercase ones?
[19:43:58] addshore: they are in page_props table
[19:44:03] not 100% sure no, internally for wikidata Q123 = q123
[19:44:08] see https://phabricator.wikimedia.org/T172642
[19:45:44] oooh, interesting, so in the client tables
[19:51:46] So. $entityPerPage[$row->pp_page] = $entityId; where $entityId is an EntityId object, and the toString method is thus used
[19:52:10] that should produce uppercase one AFAIK
[19:52:23] that comes from $this->serialization which is set by self::normalizeIdSerialization( $serialization );
[19:53:31] SMalyshev: const PATTERN = '/^Q[1-9]\d{0,9}\z/i';
[19:53:51] that's verification, right?
[19:54:04] according to that the regex is case insensitive, but i guess thats got to handle legacy stuff that is lower case...
[19:54:28] $localId = strtoupper( $serializationParts[2] ); in the constructor!! :)
[19:54:46] return new self( 'Q' . $numericId );
[19:55:06] return new self( self::joinSerialization( [ $repositoryName, '', 'Q' . $numericId ] ) );
[19:55:10] yup, so looks like legacy db stuff will continue having lowercase things, but anything new will be uppercase
[19:55:58] ok, so I guess I'll add uppercasing for string->URI converter and maybe somebody (not me :) can write a bot/script to clean up page_props...
[19:56:00] so, we could run some scripts to fix all the stuff in the dbs
[19:56:11] may even make sense to split the bug into two...
[19:57:05] All of the wikibase apis will expose them with upper case Qids
[19:57:45] we solved that problem aggges ago, but I guess "Lowercase QIDs returned by wikibase:mwapi" is talking about some pageprops mediawiki api which gets them straight from the DB I guess?
[19:59:38] SMalyshev: I'll let you update the ticket if thats okay! I leave for my flight in 7 hours and still need to pack and sleep :D
[19:59:52] addshore: ok, thanks :)
[20:00:25] addshore: yes this is from pageprops API which has no idea about specific props I guess
[20:01:56] could add some normalization there but that feels ugly
[20:03:03] yeah I think better solution would be to just clean up bad IDs
[20:03:54] lowercase entity IDs in statement IDs are unrelated to this, right? because those still exist on WDQS even without MWAPI
[20:04:31] WikidataFacts: statement IDs are not related to this
[20:04:37] ok
[20:04:45] WikidataFacts: statement GUIDs are indeed something else
[20:04:59] but you shouldnt really see them any more! (lowercase ids in them)
[20:05:12] well old statements still have them
[20:05:18] and they’re not normalized on output, apparently
[20:06:00] WikidataFacts: yeh, true, old statement ids in json will remain lowercase
[20:42:59] WikidataFacts: would you be able to make the ticket mentioned in https://phabricator.wikimedia.org/T172685?
[20:43:39] I'm not a developer, I don't know how all this stuff works.
I've heard of page props but know pretty much nothing about it, so I certainly don't know how I'm supposed to outline a case for having them >_<
[20:48:12] I’m not sure I can make a good case for this either
[20:48:31] ?item wikibase:apiOutputItem mwapi:title works btw
[20:57:28] confusing... :/
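
The uppercasing idea from the 19:53 to 19:56 exchange (validate case-insensitively, then uppercase before building the URI) can be sketched roughly as below. This is only an illustration in Python, not the Wikibase or WDQS code: the function names normalize_item_id and to_entity_uri are hypothetical, the regex is a port of the PATTERN quoted in the log, and the http://www.wikidata.org/entity/ prefix is assumed to be the URI base the converter would use.

```python
import re

# Port of the pattern quoted in the log: '/^Q[1-9]\d{0,9}\z/i'.
# [0-9] is used instead of \d so only ASCII digits match, and \Z anchors
# at the very end of the string like PHP's \z.
ITEM_ID_PATTERN = re.compile(r"^Q[1-9][0-9]{0,9}\Z", re.IGNORECASE)

# Assumed URI prefix for Wikidata entities (an assumption of this sketch).
ENTITY_BASE = "http://www.wikidata.org/entity/"


def normalize_item_id(raw: str) -> str:
    """Accept an item ID in either case and return the uppercase form.

    Hypothetical helper: mirrors the 'validate case-insensitively, then
    strtoupper' behaviour described in the chat.
    """
    if not ITEM_ID_PATTERN.match(raw):
        raise ValueError(f"not a valid item ID: {raw!r}")
    return raw.upper()


def to_entity_uri(raw: str) -> str:
    """Build an entity URI from a possibly lowercase item ID (hypothetical)."""
    return ENTITY_BASE + normalize_item_id(raw)


if __name__ == "__main__":
    # A legacy lowercase ID, as might come straight out of page_props.
    print(to_entity_uri("q123"))
```

Since URIs are case sensitive, normalizing at the string-to-URI step keeps legacy lowercase values from producing URIs that match nothing, while cleaning up the stored IDs remains a separate job.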