[00:58:13] godog: did you mean miss instead of hit? Or is this hit somewhere other than varnish?
[07:42:55] Krinkle: did you find something special/new interesting with the web vitals/seo post?
[09:11:07] phedenskog: nice! re: moving one tool at a time to AM alerts :)
[09:11:43] Krinkle: yeah miss increased too, I meant hit-local specifically as experienced by ats-tls, cfr
[09:11:46] https://grafana.wikimedia.org/d/8T2XA-5Gz/frontend-ats-tls-ttfb-latency?orgId=1&var-site=All&var-cache_type=upload&var-status_type=2&var-cache_status=hit-front&var-cache_status=hit-local&var-cache_status=int-front&var-cache_status=int-local&var-cache_status=miss&var-ttfb_max=1.5&var-percentile=95&from=1613955274162&to=1614059356023
[09:16:26] godog: there's nothing right now we can do with changing summary/description etc? When I test the alert name and the summary is the same in AlertManager (with Grafana alerts) and the description is the description of the alert. Right now I adding descriptions that explains what fails but then you need to click once to see it (and the title is repeated in the summary so it do not add any value).
[09:16:59] I just want to make sure I just make as good as it can be right now :)
[09:18:19] when I move on to the WebPageReplay alerts they are 12*6 so I wanna make sure I only do them once.
[09:19:26] phedenskog: I think I don't understand the "click once to see it", you mean to expand 'description' label in alerts.w.o ?
[09:19:36] s/label/annotation/
[09:23:13] yep you need click on the description field to show the value. One more extra step :) However maybe it doesn't matter now. One thing though: In the email from the alert manager the source is included, so clicking on that takes you to the Graph. Could that be included too in front end of the Alert Manager? Right now I need to add that myself as a tag right?
[09:24:30] godog: But I probably still need to do that (adding a tag). The source links to the exact panel but we usually have history graphs on the same dashboards so if I investigate something the panel is not enough info you need to see the full dashboard.
[09:27:12] phedenskog: got it re: clicking once, that's on purpose to not clutter the ui by default, because 'description' can be long
[09:27:33] I see what you are saying though that it is one more step heh
[09:28:04] re: tag, yeah afaics grafana doesn't send the dashboard url along by default, I'm reading https://github.com/grafana/grafana/blob/master/pkg/services/alerting/notifiers/alertmanager.go
[09:28:43] I'm also trying to find grafana default variables, perhaps there's the dashboard url in there
[09:30:34] I guess $__dashboard but that's the dashboard name according to https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/
[09:32:35] let me try thanks
[10:00:15] godog: seems like ${__dashboard} isn't evaluated in the description/tag field. Thinking I could do generic messages that had the correct link.
[11:22:51] godog: how does your day look tomorrow? can we have a quick meeting so I can sync everything with you before I move on with the rest of the alerts?
[11:28:00] phedenskog: totally, I'll be busy in the morning and a couple of meetings in the afternoon, feel free to send me an invite !
[12:49:02] phedenskog: nope, just that rum data remains important
[12:50:56] godog: I don't follow, why would latency of ats/varnish cache hit in eqsin or codfw for upload be affected by swift failing over to eqiad? That's essentially the miss/applayer/backend for it right? And we don't have the shared varnish backend anymore either
[12:51:44] Anyway, I'm not worried, just looking to understand Jan general
[12:51:50] in general *
[13:08:21] Krinkle: good question, I don't have a good explanation for that behaviour, my guess would be that ats (the backend instance) still talks to swift even in (some?) hit-local cases
[13:28:44] godog: hmm okay so I did understand correctly, maybe I should be worried then
[13:30:41] Krinkle: the other explanation I can think of is that the metric isn't measuring what we think it is measuring
[13:30:48] anyways I just wanted to give you the heads up
[14:35:22] Krinkle: yeah, he works for Akamai pushing Boomerang so ... :)
[14:38:55] Akamai gonna push RUM, Catchpoint synthetic so good that we have both. I wonder though for "normal" web sites if Crux is enough (instead of RUM) since its still only Chrome that will effect the Google Web Vitals?
[18:12:37] gilles: following your suggestion I moved all wikipedia preview images to wikipedia/static (https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/666680/) Any comments you may have on this patch are welcome (or a +1 would help me getting it deployed). Thanks!