[07:54:02] meh https://github.com/google/primarysources/issues/106#issuecomment-222612530
[07:55:35] you also need to sign a Contributor License Agreement tho
[08:56:55] heads up for T136598
[08:56:55] T136598: Wikidata master database connection issue - https://phabricator.wikimedia.org/T136598
[08:57:38] ~5000 errors/hour
[08:57:40] yikes
[08:58:07] was something deployed yesterday?
[08:59:31] bots are behaving this time, I see no issue there
[08:59:56] Maybe aude knows, I don't see hoo here.
[09:00:13] ok, will ping them later if they connect, see you
[09:00:29] (sorry if this was offtopic)
[09:00:54] No, this isn't offtopic to me. :)
[09:01:10] is there a wikidata-devel or is this a good place?
[09:01:31] Mostly people are here, I think. Not a dev myself. ;)
[09:01:45] ok, thanks anyway
[09:02:11] Not sure who's in the office today, even Lydia_WMDE hasn't checked the chat.
[09:02:25] meep
[09:02:26] :D
[09:02:29] wasup?
[09:02:39] it is not an "unbreak now", so no need to rush
[09:02:56] looking
[09:03:37] jynus: ok i'll add a bunch of people who might be able to help investigate
[09:03:46] thanks
[09:04:24] np
[09:05:23] Nemo_bis: sjoerddebruin: if we are sure it'd help we can take the primary sources tool out of the google org
[09:47:56] where would it make sense to put data about minecraft, on a dedicated wikibase instance or on wikidata itself? (data like what's in the infobox at http://minecraft.gamepedia.com/Stone )
[09:49:07] I think it probably doesn't belong on wikidata, but I'm not sure I really understand what data wikidata should contain
[09:50:17] after all, the google knowledge graph does have that kind of data about minecraft https://www.google.com/search?q=minecraft+stone
[10:21:41] rom1504: You can probably put it on Wikidata. If the properties you want already exist, that's a good sign; otherwise, propose the properties at https://www.wikidata.org/wiki/WD:Property_proposal and see what others say.
[10:22:18] Wikidata does have an abundance of Pokémon-related info after all.
[10:24:06] I don't work with fictional universe stuff so I don't know where the particular boundaries are, but in my general experience with Wikidata there is a lot you can do :D
[11:26:30] I see, I'll look into it
[17:58:18] jzerebecki: https://www.w3.org/International/tests/repository/html5/the-dir-attribute/results-dir-auto
[17:59:58] jzerebecki: https://msdn.microsoft.com/en-us/library/ms533728%28v=vs.85%29.aspx
[18:04:36] jzerebecki: https://www.w3.org/TR/html5/dom.html#the-dir-attribute
[18:29:48] DanielK_WMDE: https://gerrit.wikimedia.org/r/#/c/291960/
[19:04:23] DanielK_WMDE: around? I wanted to talk about T89733
[19:04:23] T89733: Allow ContentHandler to expose structured data to the search engine. - https://phabricator.wikimedia.org/T89733
[19:05:06] SMalyshev: hey. i'm about to leave, but i still have a few minutes
[19:05:29] I have followed the patches a bit, but I find it hard to say anything about them. too much cirrus-specific stuff
[19:05:39] DanielK_WMDE: cool. So mainly about the content and parser output
[19:05:53] DanielK_WMDE: I am trying to separate the cirrus-specific parts from the generic parts
[19:07:20] DanielK_WMDE: so what is done now (before the patch) is that the search indexer uses ParserCache or Content::getParserOutput to get the parser output
[19:07:37] DanielK_WMDE: looks like you think there's a better way to do it - so what should happen instead?
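For context, a rough sketch of the pre-patch flow SMalyshev describes above: the updater gets a ParserOutput either from the ParserCache or by rendering the Content object directly. This is a simplified reconstruction from the discussion (assuming $title is the Title being reindexed), not the actual CirrusSearch code:

    <?php
    // Simplified reconstruction of the pre-patch indexing flow: try the
    // parser cache first, otherwise render the Content object directly.
    $page = WikiPage::factory( $title );
    $content = $page->getContent();
    $revId = $page->getLatest();
    $parserOptions = $page->makeParserOptions( 'canonical' );

    // The cache may already hold rendered output for the latest revision...
    $parserOutput = ParserCache::singleton()->get( $page, $parserOptions );
    if ( !$parserOutput ) {
        // ...otherwise ask the Content object to render itself, which is
        // exactly the coupling the discussion below wants moved out of Content.
        $parserOutput = $content->getParserOutput( $page->getTitle(), $revId );
    }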
[19:09:24] also, I'm not 100% sure about the relationship between Title, WikiPage and Content and which should play which role here
[19:09:31] SMalyshev: well, I don't really have a very good idea... I just don't want a Content object to load some other stuff from somewhere, and return data based on that. That makes no sense. Content should be self-contained, and only return info about the actual data it contains.
[19:10:05] SMalyshev: so, for now, I'd simply suggest to keep your old logic, but do it in ContentHandler (instead of Content):
[19:10:11] DanielK_WMDE: but that's what is happening now: $parserOutput = $content->getParserOutput( $page->getTitle(), $revId );
[19:10:25] but getParserOutput doesn't load content
[19:10:34] it sends the content through the parser, and returns the result
[19:10:40] DanielK_WMDE: I have no idea :) new code does the same thing as the old code
[19:11:06] but actually, I really regret having put that into the Content object. All the heavy-lifting logic should be in ContentHandler. It was a mistake to put that into Content.
[19:11:27] unless I am missing something in the details... they both seem to call getParserOutput?
[19:11:46] SMalyshev: yea - it's not so much what it does, it's where it is. To me, a class and a method is a knowledge domain. And Content shouldn't know about loading HTML from the parser cache.
[19:11:55] ContentHandler however can know about that, no problem
[19:12:25] DanielK_WMDE: well, it doesn't - ParserCache is handled in WikiPage
[19:12:54] SMalyshev: Content shouldn't know about *any* storage layer stuff
[19:13:01] WikiPage is a storage interface
[19:13:25] DanielK_WMDE: ah. hmm... so I can switch it to $this->getParserOutput but that loses the cache
[19:13:30] and I do want to use the cache...
[19:13:47] so just move the logic into ContentHandler
[19:13:56] what's the problem with that?
[19:14:26] (Ideally, Content wouldn't know the ContentHandler either - it currently does, but only via global state)
[19:14:28] DanielK_WMDE: so Content would call ContentHandler to convert Title to ParserOutput?
[19:14:31] no.
[19:14:43] Cirrus would ask the ContentHandler, instead of asking Content.
[19:15:03] $title->getContentHandler()->getSearchFields( $title )
[19:15:21] or $title->getContentHandler()->getSearchFields( $title, $content ) if you have the content.
[19:15:23] well, this seems a bit wrong, since ContentHandler is supposed to be a metadata class, and Content is supposed to be the specific content of the page, no?
[19:15:34] but then the question is whether the content actually corresponds to whatever is in the parser cache
[19:15:48] ContentHandler is not a metadata class.
[19:15:57] it's a factory for all services related to a specific content type
[19:16:13] well, not technically a factory. But an access point for them
[19:16:18] DanielK_WMDE: right, for a content *type* but not for specific content?
[19:16:28] ContentHandler is a functionality bundle of stuff-you-can-do-with-content-of-type-x
[19:16:32] yes
[19:16:39] but you can pass the specific content as a param
[19:17:02] ideally, Content should really have no methods beyond getType()
[19:17:10] I guess I can... I'm just not sure then what it would be useful for
[19:17:43] Keeping stuff where it belongs and avoiding confusion :)
[19:18:14] Perhaps give it a try and see if it makes things harder or easier for you
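To make the proposed shape concrete, a minimal sketch of the call pattern DanielK_WMDE is suggesting: the indexer asks the per-content-type ContentHandler rather than the Content object. The getSearchFields() name is taken from the chat above and is illustrative, not a settled API; ContentHandler::getForTitle() is the usual way to obtain the handler for a title:

    <?php
    // Illustrative caller side: CirrusSearch asks the ContentHandler (the
    // access point for everything you can do with content of this type)
    // for index fields, instead of asking the Content value object itself.
    $handler = ContentHandler::getForTitle( $title );

    // Without the content loaded yet (the handler can fetch it)...
    $fields = $handler->getSearchFields( $title );

    // ...or pass the Content in if the updater already has it, so the
    // fields are guaranteed to be computed from that exact content.
    $fields = $handler->getSearchFields( $title, $content );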
[19:18:16] ok, so suppose we move getFieldsForSearchIndex to ContentHandler. How do we then get the output?
[19:18:30] still using WikiPage?
[19:18:31] What output?
[19:18:37] ParserOutput
[19:18:37] yes, still use WikiPage (for now)
[19:19:20] I think the code is fine, it's just in the wrong place
[19:19:50] i'm not sure if the method in ContentHandler should get the Content object passed in, or just a Title
[19:19:54] there are two problems there: a) revision ID and b) there's a hook CirrusSearchBuildDocumentParse which requires ParserOutput, but there's no such object in the index updater anymore as it's done in ContentHandler...
[19:19:59] (or a WikiPage, even)
[19:20:15] DanielK_WMDE: it needs content since e.g. TextContent uses the content size as a field
[19:20:25] You can get Content from WikiPage easily
[19:20:33] and it will then be consistent with the ParserOutput
[19:20:43] DanielK_WMDE: ok, so maybe I should pass WikiPage then and not Content
[19:20:45] WikiPage can also give you the revision id
[19:20:55] yes, perhaps
[19:21:01] I do start with WikiPage in Updater.
[19:21:18] so that would make the code more straightforward. yay :)
[19:21:21] so if I ask WikiPage for parser output twice - is it cached?
[19:21:34] if it's the same WikiPage object, then yes
[19:21:48] WikiPage is a lazy-loading handle for page content
[19:21:57] ok then, then it should work
[19:22:04] cool!
[19:22:20] ok, I'll try it and see what comes out, thanks
[19:22:31] ok, i'll run off now. i have a lot of meetings tomorrow, but i'll be online for the archcom stuff in the evening.
[19:22:39] thanks for working on this!
[19:22:51] cool, ttyl :)
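Putting the pieces of that conversation together, the handler-side method might look roughly like this. It is a speculative sketch, not the patch under review: the method name getFieldsForSearchIndex comes from the chat, the WikiPage parameter reflects where the discussion landed, and the returned fields are placeholders:

    <?php
    // Speculative sketch of the handler-side method, as it would sit on a
    // ContentHandler subclass. Taking a WikiPage means the revision id,
    // the Content, and the ParserOutput all come from one lazy-loading
    // handle, so they stay consistent with each other, and repeated
    // getParserOutput() calls on the same WikiPage object are cached.
    public function getFieldsForSearchIndex( WikiPage $page ) {
        $content = $page->getContent();          // lazy-loaded
        $parserOptions = $page->makeParserOptions( 'canonical' );
        $parserOutput = $page->getParserOutput( $parserOptions, $page->getLatest() );

        return [
            // Placeholder fields; e.g. TextContent would index the content size.
            'text' => $parserOutput ? $parserOutput->getText() : '',
            'content_size' => $content ? $content->getSize() : 0,
        ];
    }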
[20:38:53] The VIKINGS are coming!
[20:39:05] again?
[20:40:05] We nearly conquered the World, and then we got involved with the Danes.. :(
[20:40:48] never get involved with the Danes, didn't you learn anything from Hamlet?
[20:41:25] If it's not the Danes, then it is the Sedes…
[20:41:35] Svedes…
[20:41:37] everyone dies in the end, except the one who died in the beginning
[20:41:40] Never mind…
[20:41:45] #wikimedia-operations
[20:41:53] ah
[20:42:18] audephone: /join maybe ? :p
[20:42:28] Yeah
[20:42:28] Hello audephone
[20:42:39] Hi jeblad
[20:43:32] Swedes are strange. We invaded them once, and they refused to fight. Very odd people.
[20:44:03] There should be a property for an oddness factor on Wikidata
[20:44:52] Can I deprecate Sweden? =)
[20:45:00] who decides about oddness?
[20:45:13] (as a neuro-atypical person, that's a real question)
[20:47:28] everyone's odd in their own ways
[20:51:17] I guess most of the time it's "you do things that are unusual to me, therefore you're odd"
[20:51:20] It's my interpretation of the World that is correct. It is a self-evident fact!
[20:51:38] but everyone does things that some other people find unusual, therefore everyone is odd! :D
[20:52:05] nikki? is that wmde nikki?
[20:52:29] I don't work for wmde, so if there's a nikki working for wmde that's not me
[20:53:07] Ok, so you are the odd not-nikki! ;)
[20:53:34] I have the wikimedia username nikki, therefore I'm the only real nikki! :P
[21:01:08] Users who manually copy data from wikidata into wikipedia so they can use it locally in production modules … those people … o_O
[21:02:34] ah, we have the same on frwiki
[21:03:21] people who actually think that using a bot to copy over all wikidata edits to frwiki is a viable solution...
[21:03:51] * Harmonia_Amanda doesn't understand at all
[21:04:57] We have about 1700 modules that could be replaced by 3 well-behaved modules fetching the same data from wikidata
[22:46:11] ns info multichill
[22:51:34] Someone remind me: how many different mechanisms does Wikidata have for machine-readable data? There's Special:EntityData, the WDQS, and wbgetclaims?
[22:54:47] WikidataFacts: btw, please consider changing your on-wiki username, even though I understand your disclaimer on your userpage
[22:55:45] :(
[22:55:49] any suggestions for an alternative?
[22:55:57] anything that doesn't include "Wikidata"
[22:56:58] alright, I'll try to think of something
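For reference, the three mechanisms named in that question can each be exercised with a plain HTTP GET. A minimal sketch, using file_get_contents() for brevity (a real client should send a descriptive User-Agent and handle errors; Q42 and P31 are just example identifiers):

    <?php
    // 1. Special:EntityData - the full entity as JSON (RDF variants exist too).
    $entity = json_decode( file_get_contents(
        'https://www.wikidata.org/wiki/Special:EntityData/Q42.json'
    ), true );

    // 2. wbgetclaims - just the statements, via the MediaWiki action API.
    $claims = json_decode( file_get_contents(
        'https://www.wikidata.org/w/api.php?action=wbgetclaims&entity=Q42&property=P31&format=json'
    ), true );

    // 3. WDQS - arbitrary SPARQL queries over the whole graph.
    $sparql = 'SELECT ?human WHERE { ?human wdt:P31 wd:Q5 } LIMIT 5';
    $results = json_decode( file_get_contents(
        'https://query.wikidata.org/sparql?format=json&query=' . rawurlencode( $sparql )
    ), true );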