[00:06:41] AaronSchulz: what part was bogus? https://gerrit.wikimedia.org/r/c/mediawiki/core/+/609902
[00:07:34] "maxPreferedKeySize"
[00:08:01] right, we never set that
[00:08:04] I misread
[00:08:05] OK.
[00:08:06] landing :)
[01:36:55] Krinkle: ha, thanks for the +2 on 615302, I was just about to ask if you meant to do that with your earlier comment. :)
[01:37:38] FYI, possible fix for corrupt graphs is at https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/617275. Rebasing now.
[01:38:22] dpifke: I was in puppet mood
[01:38:25] because prometheus
[01:38:33] but then when I was about to cherry pick it in beta
[01:38:39] I realised it does not apply to ops/puppet
[01:38:53] dpifke: can you scap deploy the arclamp to beta?
[01:39:50] Yup, no problem. Unless we want to try to do the other patch at the same time.
[01:40:33] I can also cherry pick that one into beta; I ran it manually there and it seemed to do the right thing, but might not be a bad idea to let it run overnight.
[01:43:24] dpifke: not following what to cherry pick where?
[01:44:22] Sorry, talking about the change to arclamp-generate-svgs to hopefully reduce flamegraph.pl out-of-memory errors. (I'm almost certain that's what's causing the corrupt output svgs.)
[01:44:36] https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/617275
[01:45:04] ah sure, yeah, cherry picking that and seeing it complete a couple runs first would be good yeah
[01:45:54] OK. Will deploy the new metrics to beta & prod, then cherry-pick the other patch just in beta.
[02:03:05] The updated arclamp-generate-metrics script is in prod, but won't have any effect until https://gerrit.wikimedia.org/r/c/operations/puppet/+/613359 lands.
[02:03:51] dpifke: https://performance.wikimedia.beta.wmflabs.org/arclamp/metrics
[02:03:54] should be there though, right?
[02:04:01] The updated arclamp-generate-svgs script is in beta, I'll verify first thing tomorrow that it does the right thing.
[02:04:27] Yes, but that's a cached copy.
[02:04:45] ack yeah waiting for it to roll over
[02:04:48] Prometheus will scrape direct from webperf1002 so as to get the more recent data.
[02:04:51] s/rollover/run/
[02:05:07] oh you mean the response is http cached?
[02:05:17] Yup.
[02:05:27] right 5min default cache for public http responses in ATS
[02:05:29] But only if viewed through the reverse proxy. Which is why we don't want to scrape from there.
[02:05:36] right sure
[02:05:44] what's the cron cadence?
[02:06:09] Every other minute.
[02:06:58] The `find` commands it runs shouldn't introduce a huge load (thanks to the kernel dirent cache), but aren't completely free.
[02:06:58] shows up now for me
[22:15:03] dpifke: Looks like we're only a few steps away from being able to import the mongo records.
[22:15:37] Might be a good time to get that over with in Beta first if you haven't already :)
[22:16:40] Sounds good to me. Reviewing the script now.