[04:09:21] Traffic: Images randomly fail to load - https://phabricator.wikimedia.org/T418323#11666449 (BrokenImages1234) ` GET https://upload.wikimedia.org/wikipedia/commons/thumb/4/49/Kaiten-zushi_005.jpg/500px-Kaiten-zushi_005.jpg NS_BINDING_ABORTED Status 429 Version HTTP/2 Transferred 939 B (0 B size) Referrer Pol...
[06:51:51] netops, Infrastructure-Foundations, Observability-Logging: ~5k/logs/sec from netdev - https://phabricator.wikimedia.org/T412143#11666554 (ayounsi) From JTAC after sharing with them our gNMIc config : > I was able to replicate the issue in the lab. I've also shared the same update with engineering....
[07:19:04] netops, Infrastructure-Foundations, Prod-Kubernetes, ServiceOps new, SRE: Eqiad: lsw1-d7-eqiad BGP maintenance - https://phabricator.wikimedia.org/T418772#11666586 (ops-monitoring-bot) Draining ganeti1051.eqiad.wmnet of running VMs
[07:43:51] netops, Infrastructure-Foundations, Prod-Kubernetes, ServiceOps new, SRE: Eqiad: lsw1-d7-eqiad BGP maintenance - https://phabricator.wikimedia.org/T418772#11666621 (JMeybohm)
[08:10:44] Traffic, Data-Engineering, Infrastructure-Foundations, Patch-For-Review: Export development_network_probe data to Puppet servers for CDN deployment - https://phabricator.wikimedia.org/T402512#11666664 (elukey) I am finally able to query a week worth of IPs from webrequest and dump a txt file on t...
[08:25:31] vgutierrez, XioNoX: sorry, got distracted yesterday, since lvs1013 is just an experimental host, shall I simply powercycle it? then we can see if it's some NIC issue which solves itself on reboot. the server is from 2017 after all
[08:27:00] moritzm: what's the issue?
[08:28:51] that the server is from 2017 :D
[08:29:15] true, right now I'd say, why not decom it?
[08:29:20] XioNoX: from yesterday: https://paste.debian.net/hidden/7b3b268e
[08:29:39] oh wow I have a short memory :)
[08:29:40] TTBOMK these are used for some Liberica/Katran tests still
[08:30:06] XioNoX: I'll send you a Tiktok message next time :-)
[08:30:12] hahahaha
[08:30:33] yeah maybe decom it and re-use a less old decomed host if testing is still needed
[08:32:31] let's wait for Valentin to chime in, he's probably the last one to use it for tests
[08:32:48] for sure
[08:40:45] netops, Infrastructure-Foundations, Prod-Kubernetes, ServiceOps new, SRE: Eqiad: lsw1-d7-eqiad BGP maintenance - https://phabricator.wikimedia.org/T418772#11666770 (JMeybohm)
[08:44:07] yes please go ahead moritzm
[08:47:44] doing that now
[08:52:48] all fine after the powercycle
[08:53:20] Traffic: Images randomly fail to load - https://phabricator.wikimedia.org/T418323#11666816 (ihurbain) The whole thing, bar a couple of redacted elements (for another image on officewiki): ` GET scheme https host upload.wikimedia.org filename /wikipedia/commons/thumb/8/86/WMF_homepage_banner_image_01.pn...
[09:35:32] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11666985 (ABran-WMF) >>! In T417998#11639695, @ABran-WMF wrote: > This is currently blocked...
[09:36:21] netops, Infrastructure-Foundations, Prod-Kubernetes, ServiceOps new, SRE: Eqiad: lsw1-d7-eqiad BGP maintenance - https://phabricator.wikimedia.org/T418772#11666990 (MoritzMuehlenhoff)
[09:54:34] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, Thumbor: MediaViewer (and the commons file page) should serve WebP originals not thumbnails of equivalent size - https://phabricator.wikimedia.org/T418745#11667054 (MatthewVernon) Perhaps instead for the odd non-web format (which seem...
[09:57:16] netops, Infrastructure-Foundations, Prod-Kubernetes, ServiceOps new, SRE: Eqiad: lsw1-d7-eqiad BGP maintenance - https://phabricator.wikimedia.org/T418772#11667074 (BTullis)
[11:54:53] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, Thumbor: MediaViewer (and the commons file page) should serve WebP originals not thumbnails of equivalent size - https://phabricator.wikimedia.org/T418745#11667612 (Ladsgroup) >>! In T418745#11665357, @Tacsipacsi wrote: > TIFFs (e.g....
[14:20:28] vgutierrez: qq about ATS backend caching. If i'm visiting https://turnilo-next.w.o and still getting the infamous landing page, does that refresh the 24h cache on the cp host I landed on?
[14:20:52] 24h is the max TTL that the CDN will allow
[14:21:30] so once a page is cached, a new hit does not reset the cache time then
[14:22:14] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, and 2 others: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11668230 (Ladsgroup) Top "file formats" for the non-standard sizes with enwiki as referrer are as follows: ` spark-sql (default)> s...
[14:22:49] if that's indeed the case, the turnilo-next.w.o page has been cached by ATS for more than 24h, and I'm still getting redirected to the landing page sometimes, and sometimes to the idp-test auth page (which is expected)
[14:23:02] and then for each asset (css, js, etc) it's a roll of the dice
[14:23:37] s/has been cached by ATS for more than 24h/was first cached in ATS > 24h ago/
[14:24:29] https://wikitech.wikimedia.org/wiki/Kafka_HTTP_purging#One-off_purge seems to only work for wiki pages according to c.laime
[14:24:41] so, I'm wondering what my options are atm
[14:27:09] I can see the change going through puppet roughly 24h ago https://puppetboard.wikimedia.org/report/cp7015.magru.wmnet/544038a0f5516590224008140e531c7755824a68
[14:28:28] what's even more surprising to me is that uncached URLs (assuming we differentiate URLs in the cache based on query args) such as https://turnilo-next.wikimedia.org/?abcd=1 still redirect me to the landing page
[14:29:30] brouberol: is caching even enabled in ATS for that backend?
[14:30:03] oh huh it is caching: 'normal'
[14:30:10] it should be yes, according to https://gerrit.wikimedia.org/r/c/operations/puppet/+/1247013
[14:30:34] I don't think what claime told you is true
[14:31:32] for additional datapoints,
[14:31:33] brouberol@cp6009:~$ curl https://turnilo-next.discovery.wmnet:30443 # shows idp redirection
[14:31:33] brouberol@cp6009:~$ curl -s -v -H 'Host: turnilo-next.wikimedia.org' http://turnilo-next.wikimedia.org:3128 # shows the landing page html
[14:31:55] sorry, disregard the