[00:30:11] AaronSchulz: here for another hour or so
[00:30:15] if you want to do it together
[02:00:58] Krinkle: gah, I missed that window
[02:02:20] I'm still here, AaronSchulz
[02:08:01] Krinkle: I put it in for Monday, but we I'm around now
[02:08:27] *we can do it now since I'm around now
[02:08:40] AaronSchulz: let's do it
[02:08:43] want to drive?
[02:11:19] ok
[02:24:51] Krinkle: ssh acting up again
[02:26:14] ah, there we go
[02:27:44] AaronSchulz: logstash and testwiki standing by :)
[02:27:47] which canary?
[02:30:34] 1001
[02:32:11] AaronSchulz: eh..
[02:32:23] did the fix go out?
[02:32:50] prod = wmf.27
[02:32:58] fix is in wmf.28 + wmf.29
[02:34:45] Krinkle: I thought the fix was ages ago
[02:34:47] * AaronSchulz checks
[02:34:55] that would explain what I'm seeing though
[02:35:10] "included in" in Gerrit says wmf.28/29
[02:35:22] which was indeed 2.5 weeks ago
[02:35:26] but we've had two failed trains
[02:38:16] merge conflicts
[02:42:01] easy merge though
[02:42:11] right, will need to check out wmf and cherry-pick carefully. Are there any non-diff conflicts we need to think about as well? E.g. things we changed elsewhere in rdbms or objectcache since wmf.27?
[02:43:21] https://github.com/wikimedia/mediawiki/commits/wmf/1.36.0-wmf.27/includes/libs/rdbms vs https://github.com/wikimedia/mediawiki/commits/bebbc12f95/includes/libs/rdbms
[02:44:33] - rdbms: use LoadBalancer::MAX_LAG_DEFAULT constant within LoadMonitor
[02:44:34] - rdbms: sanity check if $conn is false in LoadBalancer::getConnection
[02:45:25] we did end up changing the effective value for lagWarnThreshold afaik
[02:45:28] but that's only a logger call
[02:45:31] Krinkle: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/661832
[02:45:33] confirmed no side-effects
[02:45:34] wasn't sure
[02:45:57] the other changes don't matter
[02:46:45] ack, yeah only self::KEY_LOCAL_DOMAIN changed I see now
[02:46:46] cool
[02:46:49] reviewing now
[02:47:24] +2'ed
[02:47:24] I'm getting read-only mode on mwdebug, so it will be reassuring to see that go away when that merges
[02:47:36] yeah
[02:47:56] can you revert or checkout HEAD~1 in wmf-config on deploy and scap pull that so we can test the MW change only first?
[02:48:35] ok
[02:51:04] Krinkle: done
[02:51:24] CI... coming along
[02:56:51] ETA 11/14min
[03:02:18] 9/17min
[03:02:31] https://xkcd.com/612/
[03:02:31] :D
[03:05:24] heh
[03:05:50] * AaronSchulz fed his dog and prepped oatmeal while waiting
[03:06:19] https://integration.wikimedia.org/zuul/?filter=661832
[03:21:17] 🚀
[03:21:49] Krinkle: it's on 1001
[03:24:02] > MediumSpecificBagOStuff::mergeViaCas failed due to read I/O error on get() for testwiki:abusefilter-profile:v3:22.
[03:24:21] not seen that before but I guess could happen very rarely
[03:25:37] Krinkle: I'm going to do the config part on mwdebug again soon
[03:25:49] sgtm
[03:26:17] I couldn't get the cas error to happen again on another wiki or the same, but I guess we can look out for it post-deploy just in case
[03:26:26] maybe roll out the rdbms change meanwhile?
[03:28:04] Krinkle: the config change is on 1001 now too
[03:28:33] I can checkout the last rev and push the LB changes for real though
[03:29:31] doing that now
[03:29:43] * Krinkle nods
[03:37:49] Krinkle: seems fine
[03:38:18] doing the config one soon
[03:40:59] yep
[03:41:02] looking good all
[03:41:14] opening memc/wan meanwhile
[03:41:18] dashes
[03:42:37] https://logstash.wikimedia.org/app/dashboards#/view/mediawiki-errors https://grafana.wikimedia.org/d/2Zx07tGZz/wanobjectcache https://grafana.wikimedia.org/d/000000316/memcache
[03:43:49] Krinkle: btw, did you see my comment on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/657471/ ?
[03:46:22] 5 Mil misses -> 15 Mil misses per min
[03:46:36] as expected I suppose with global being >50% of keys
[03:46:42] key accesses*
[03:47:36] overall getWithSet calls 15 Mil -> 25 Mil, that one is slightly more surprising. maybe due to nested calls?
[03:57:51] and starting to recover pretty quickly
[03:58:02] I'm surprised bandwidth barely moved at all
[03:58:06] I guess most keys are small
[03:58:29] https://usercontent.irccloud-cdn.com/file/76lYIuzs/memc-T252564.png
[03:58:30] T252564: Let WANObjectCache store "sister keys" on the same backend as the main value key - https://phabricator.wikimedia.org/T252564
[04:01:06] AaronSchulz: btw, confirmed nested keys indeed. SqlBlobStore has a nested getWithSet, in getBlob() -> [getWithSet -]> fetchBlobs -> expandBlob -> [getWithSet-]> ExternalStore fetch
[04:01:11] forgot about that one
[04:10:24] Krinkle: it also is using "blob addresses" as cache keys
[04:10:27] seems messy
[04:21:56] Krinkle: I guess https://gerrit.wikimedia.org/r/c/mediawiki/core/+/543730 can be reverted and getBlob() could have the caching removed (being a naive wrapper)
[04:22:28] hmm, well the error models are a bit different, so maybe nix the second part
[04:25:16] there should be an expandBlob() and expandBlobInternal() or something
[04:28:09] hmm, the expandBlob() params are also poorly named
[07:05:13] Krinkle: phedenskog: dpifke: in the FOSDEM Matrix, can you check if you got invited to dedicated rooms for the sessions you're hosting?
[07:05:23] they should have integrated video widgets at the top
[08:52:56] gilles: yes I'm invited
[09:39:08] I got my FOSDEM t-shirt today, right size :)
[09:40:57] I woke up last night in the middle of a nightmare about FOSDEM: the videos and chats didn't work and there were no questions to ask, and things were not in sync.
[09:53:38] haha
[11:34:09] https://github.com/FOSDEM/video/blob/master/instructions/fosdem2021/speakers.md
[11:34:53] after the Q&A anyone can join the backstage chatroom and talk directly to the speaker, including with video
[12:12:36] just got my t-shirt as well, just in time!
[12:52:15] gilles: About "IMPORTANT: How to know exactly when your q&a session goes live?" I guess then the best way is to download the video, check the length and calculate when the Q&A will start?
[12:52:25] And just start at that time :)
[13:02:28] I added the Q&A start times on the talks I host in the document (adding those extra 30 s).
[13:24:35] good idea
[13:36:27] Which document?
[14:30:59] Krinkle: The Google Docs where we write the questions.
[14:31:39] Look for an email from Gilles.
[16:20:15] Krinkle: https://docs.google.com/document/d/1WBue8MROL18UWMYvSEQaXGK2OPWFKAPx4OlAvN2-RMs/edit
[18:36:38] k