[07:49:26] greetings
[09:33:25] hola
[09:37:08] morning!
[11:04:46] updatetools is spamming again ("notification about job updatetools"). I'm doing more debugging, will report back in T413431
[11:04:46] T413431: updatetools is frequently timing out - https://phabricator.wikimedia.org/T413431
[15:49:11] apropos of nothing, yesterday I watched https://www.youtube.com/watch?v=czzAVuVz7u4 and found it interesting
[15:51:34] taavi: failover for cloudlb is the same, right? Just stop bird on the host to be reimaged?
[15:51:53] andrewbogott: for cloudlb, yes
[15:52:08] I forgot that cloudgws still use keepalived instead of bird, so for those it's stopping keepalived instead
[15:52:16] good to know!
[15:52:30] I might do eqiad1 cloudlb nodes now if you don't have concerns
[15:52:41] did you do that in codfw yet?
[15:52:56] yes
[15:52:59] then sure
[16:23:17] bd808 (or whoever): I have a dockerfile question which you might have an easy answer for. The file in question is https://gitlab.wikimedia.org/repos/sre/wikitech-static-docker/-/blob/main/dockerfile?ref_type=heads and the issue is that I'm filling up the gitlab runner at build-time.
[16:23:35] I think it very likely that line 75 is the culprit, as it essentially doubles the space needs for the build.
[16:25:26] My question is... do you see a less messy way to only get one copy of that dir or should I just cram those two build steps into one container and leave the build-time cruft behind?
[16:26:56] (I guess the other possibility is that docker is already being smart and that step doesn't actually duplicate things)
[16:31:15] I was going to suggest a bigger gitlab runner as a possible solution, but it looks like you are already using the wmcs runners which are I believe the largest available right now without making a custom runner.
[16:31:32] * bd808 looks at the Dockerfile
[16:32:36] yeah, I started out the morning thinking I would negotiate with releng to have bigger runners. But then thought the responsible thing would be to try optimizing a bit first
[16:35:49] andrewbogott: can you tell me more about the "build-time cruft" aspect? Is that just the apt cache & httrack and python runtimes mostly, or is there something else lurking in there?
[16:36:13] nah, that's it I think. Pretty harmless.
[16:36:38] I just wondered if there was something like a 'move' option or a 'link rather than copy' option or similar.
[16:36:48] If not then I'll just combine those containers.
[16:37:20] hmmmm not totally sure how to order things and still do the lunr build so maybe /everything/ will have to go in one container
[16:37:26] I think the way the file is setup you are getting four copies of the html pages by the end of the build: the first scrape, the copy into the node container, the lunr index, and the copy into the nginx container.
[16:37:47] yeah, I bet you're right.
[16:37:49] The big storage is probably images right?
[16:38:01] yeah
[16:38:06] which don't get copied into the lunr container
[16:38:20] (gotta run downstairs, back in 5)
[16:39:17] Lines 3 & 4 probably don't do quite what you expect. I wonder if I can make Blubber do all these build steps...
[16:46:49] what do 3 and 4 do?
[16:53:03] bd808: if you're trying a local test be sure and change line 18 to -#L50 \ or you'll be waiting all day
[16:53:15] but I really didn't mean to nerd-snipe you into rewriting everything!
[16:56:41] andrewbogott: 3 is apt-get update and then 4 is apt-get install. Those being separate lines means they are separate cache layers and that has impacts on things when caching is happening. The way this is used maybe that doesn't matter though--meaning that maybe there is never really a layer cache in play until things are uploaded to the registry.
[16:59:15] The update could definitely be doing nothing and I wouldn't notice
[16:59:29] which maybe means I should just remove that line anyway
[18:22:50] * dhinus off
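
Note on the apt-get point raised at 16:56:41, as a minimal sketch only: when `apt-get update` and `apt-get install` sit in separate RUN instructions, each one produces its own cached layer, so a cached update layer can be reused later with a stale package index. A common pattern is to chain both commands in a single RUN and remove the index files in that same step. The base image, the stage name, and the `python3` package name below are assumptions for illustration and are not taken from the wikitech-static-docker Dockerfile; httrack and a python runtime are the packages mentioned in the discussion above.

```dockerfile
# Hypothetical sketch: base image and stage name are assumptions.
FROM debian:bookworm-slim AS scrape

# A single RUN yields a single layer, so the package index is always
# refreshed in the same step that installs from it. Removing the index
# files before the layer is committed keeps them out of the image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends httrack python3 \
    && rm -rf /var/lib/apt/lists/*
```

This does not address the larger space problem of the HTML tree being copied between build stages; it only covers the two apt lines.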