[06:53:20] Morning!
[07:06:31] DanielK_WMDE: got 5 mins to chat about dispatching? :)
[07:07:05] addshore: i'm at the doctor's office - so sure, until i get called in :D
[07:07:11] hahaa!
[07:07:30] SO, dispatching is done by the maint script, and the maint script reads rows and then creates jobs right?
[07:07:46] yes
[07:08:03] How does that script decide where to start reading from in the changes table? as the rows are not removed at that point right?
[07:08:31] it tracks the position for each target wiki in the wb_changes_dispatch table
[07:08:38] ahhh
[07:08:39] that table is also used for locking
[07:08:47] when does wb_changes_dispatch get written to?
[07:09:06] when the jobs got successfully posted
[07:09:13] *looks*
[07:09:49] by the job itself? or by some other code?
[07:10:07] by the maint script. when the job was posted, not when it gets executed
[07:10:16] the job is executed later on the client wiki
[07:10:48] the "dispatching" part ends when the jobs for the client wikis are in the queue.
[07:10:52] ahh lockClient in ChangeDispatchCoordinator!
[07:10:57] on the client side, we have "change handling"
[07:11:14] yes, that's the locking bit. could be that it also updates the pointer
[07:11:33] *looks for the update*
[07:11:53] // update state record for already known client wiki
[07:12:15] right yes, that looks like the way I thought it worked
[07:12:43] so the script locks for a client, updates the touched time, and then tries to add a bunch of jobs, but if those jobs cause exceptions then the changes never get dispatched
[07:12:53] addshore: i think the pointer is part of the $state parameter of releaseClient
[07:13:30] addshore: you mean the jobs cause an exception while being posted, or when being executed?
[07:13:36] while being posted
[07:13:46] huh. how would that happen?
[07:14:06] hmm, but yeh the touched time is updated in the release too
[07:14:14] it happened yesterday during the eqiad -> codfw thing
[07:14:14] but yea, that would fail.
and dispatching for that target wiki would not progress, the pointer would not be incremented
[07:14:42] addshore: the touch date is purely informative. it's the position pointer (referencing a change id) that is pertinent
[07:15:11] addshore: chd_seen
[07:15:16] maybe not the most obvious name
[07:15:41] *looks*
[07:16:41] heh, I cant see what calls releaseClient xd
[07:16:58] ahh ChangeDispatcher
[07:17:05] addshore: look at $continueAfter in ChangeDispatcher::dispatchTo
[07:17:34] so yeh, the new chd_seen is set in an array in ChangeDispatcher and passed to the coordinator in releaseClient
[07:17:42] exactly
[07:18:12] but if anything fails, wb_changes_dispatch isn't updated with a new chd_seen. So it'll get stuck, if the error persists.
[07:18:24] though it shouldn't starve dispatching to other wikis
[07:18:26] and there is no try/catch for sendNotification so if it exceptions I would believe it wouldn't update the seen
[07:18:31] this is actually by design
[07:18:45] it wouldn't, and it shouldn't
[07:18:58] the idea is that it would retry, and recover from intermediate errors
[07:19:08] Then I guess the thing I saw that didnt get dispatched was just a random fluke!
[07:19:14] and it shouldn't affect dispatching to other clients
[07:19:21] possible
[07:19:36] I'll go see if it even made it into the wb_changes table
[07:20:17] yea. we don't have ACID semantics for updating secondary data. things can get lost
[07:20:22] we also had roughly 400 items that were created around the time of the readonly switch that never made it into the entity_per_page table thus wouldnt render
[07:20:44] ick. we should ditch that table...
[07:21:04] hehe, yeh, mediawiki said the page existed, so it tried to render it, and then wikibase said, no that entity doesnt exist!
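A rough sketch of the pointer behaviour discussed above (07:08 to 07:19), in Python rather than Wikibase's PHP. Everything here is a hypothetical stand-in: `DispatchState` plays the role of one wb_changes_dispatch row, `Dispatcher.dispatch_to` loosely mirrors the lock / post jobs / release sequence, and `post_job` is an assumed callback. Releasing the lock in a `finally` is an assumption of this sketch, not something the log confirms about the real code. The point it illustrates is only that chd_seen advances after the jobs were posted, so a failed post leaves the pointer where it was and the same changes are retried on the next pass.

```python
from dataclasses import dataclass, field


@dataclass
class DispatchState:
    """Hypothetical stand-in for one wb_changes_dispatch row."""
    chd_seen: int = 0      # id of the last change whose job was posted
    locked: bool = False   # the same row doubles as the per-wiki lock


@dataclass
class Dispatcher:
    """Toy model of the maintenance script; not the real Wikibase classes."""
    changes: list                               # ascending change ids (wb_changes)
    states: dict = field(default_factory=dict)  # wiki name -> DispatchState

    def dispatch_to(self, wiki, post_job, batch_size=3):
        state = self.states.setdefault(wiki, DispatchState())
        if state.locked:
            return False                 # another process holds this wiki
        state.locked = True              # lock the client wiki
        try:
            # Continue after the last change this wiki has seen.
            batch = [c for c in self.changes if c > state.chd_seen][:batch_size]
            if batch:
                post_job(wiki, batch)        # may raise while posting
                state.chd_seen = batch[-1]   # advance only after a successful post
        finally:
            state.locked = False         # assumption: the lock is always released
        return True
```

With this shape, an exception while posting to one wiki propagates to the caller without touching that wiki's chd_seen, and the other wikis keep their own independent pointers, which matches the "it would retry" and "shouldn't starve dispatching to other wikis" remarks.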
[07:21:30] the table is really redundant
[07:21:39] also FYI we now have a dash for the dispatch maint script https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch-script
[07:22:27] *sigh* the entire maintenance script business sucks. we shouldn't need that.
[07:22:38] yeh ;)
[07:22:47] but thats for future us to deal with right? ;)
[07:23:08] http://voicesfromthecommunity.co/wp-content/uploads/2014/04/Do-Not-Fix.png
[07:24:43] addshore: re
[09:56:55] Lydia_WMDE: the tickets from yesterday are: https://phabricator.wikimedia.org/T133045 https://phabricator.wikimedia.org/T133048 https://phabricator.wikimedia.org/T133144
[09:59:20] DanielK_WMDE: Adrian_WMDE: Thiemo_WMDE: Lydia_WMDE: addshore: here's the scrumofscrums etherpad link again: https://etherpad.wikimedia.org/p/Scrum-of-Scrums
[10:03:56] addshore: thx
[10:08:14] Lydia_WMDE: https://www.loc.gov/standards/iso639-2/php/langcodes-keyword.php?SearchTerm=gsw&SearchType=ALL
[10:09:15] DanielK_WMDE: this is what I found for the stuff we were talking about re dispatching https://phabricator.wikimedia.org/T133144
[10:09:46] at a guess simply adding those changes to the top of the table would be bad if other changes had been dispatched for those items since?
[10:13:04] Lydia_WMDE: https://doc.wikimedia.org/mediawiki-core/master/php/Names_8php_source.html
[10:29:39] addshore: messing with the order would be bad if anything actually tried to maintain a replicated copy by re-playing the changes
[10:29:46] but all we do based on the changes is purge
[10:29:48] that should be fine
[10:29:59] oh the change just purges everything?
[10:30:09] cool! *removes a bit of complexity from the script*
[10:30:24] not everything.
we are trying to be as selective as possible, that's why usage-tracking is rather fine grained
[10:30:53] addshore: well, the repo shouldn't assume that this is all that the clients do
[10:31:00] but the assumption is fine for a one-off fix on our cluster
[10:31:16] okay, I'll make it reply all change since the first one it needs to re add then!?
[10:31:27] *replay
[10:32:15] if you want that script to be generic, that's the safe option
[10:32:33] but it will cause lots and lots of unnecessary purges. which causes a lot of load on the app servers
[10:33:00] i'd personally only replay the ones we missed.
[10:33:13] I could add an option to just ignore the excessive step for now :)
[10:33:17] the script would be a one-off, and wouldn't go into the wikibase repo.
[10:33:25] we can keep it in the wikimedia tools repo somewhere
[10:33:31] its likely it will need to be run again on thursday
[10:33:55] yea, sure, we should keep it. but that doesn't mean it has to be a canonical part of wikibase
[10:33:59] yay, multi option multi purpose ;)
[10:34:29] all of the options ;)
[10:55:09] DanielK_WMDE: thoughts? https://gerrit.wikimedia.org/r/#/c/284440/
[11:37:56] How many statements are there on Wikidata currently?
[11:38:59] Last number I saw was 70 million, but it's probably grown since then?
[11:39:55] Found the right chart. 94 million. Wow!
[11:39:57] harej: 94 million, https://grafana.wikimedia.org/dashboard/db/wikidata-datamodel-statements
[11:40:39] Curious that it's not one of the KPIs. That's where I checked first.
[11:40:44] kpi?
[11:41:06] key performance indicator. the major metrics.
[11:43:22] Also, am I reading this graph right: https://grafana.wikimedia.org/dashboard/db/wikidata-kpis
[11:43:37] average 4.5 statements per item :/ not enoughhh. although I guess that's also totally skewed by the large number of things like categories
[11:43:47] is Special:EntityData accessed 2.58 million times per day?
[11:46:14] that's how I would read it, whether it's right or not, I dunno :)
[11:48:16] Hi all!
[11:48:36] Also, what does "entity usage pages" refer to?
[11:55:09] I assumed from the size of the number that it means the item pages
[14:49:51] * aude waves
[15:03:56] Thiemo_WMDE: lets agree not to do timezones? :P http://stackoverflow.com/questions/6841333/why-is-subtracting-these-two-times-in-1927-giving-a-strange-result
[15:04:01] * addshore waves at aude
[15:07:04] :o
[15:20:27] In my Template, I’m trying to grab an image whether or not the user enters [[File:]] or [[Image:]] or no prefix at all and then if there’s a caption, then to add that, too. My {{#if}} skills are weak. My work in progress: https://dpaste.de/vYM3
[15:51:26] is it not possible to close the query explanation thing for the sparql thing and have it *stay* closed?