[04:50:31] Traffic, Hiddenparma: Introduce allowlists into the CDN (text) filtering - https://phabricator.wikimedia.org/T399057 (Joe) NEW
[05:08:29] Traffic, Hiddenparma: Introduce allowlists into the CDN (text) filtering - https://phabricator.wikimedia.org/T399057#10987056 (Joe) I should add - following up on what we already did with `x-provenance`, as much identification of traffic as we can should be done at the HAProxy layer; or at least up to th...
[05:28:22] Traffic: Too Many Requests Error on certain useragents (QtWebEngine, outdated chromium based browsers, Safari 605) - https://phabricator.wikimedia.org/T398753#10987070 (Joe) We've changed the rules so that you should not be consistently blocked, but you might still be rate-limited if you make a lot of reques...
[05:28:43] Traffic: Too Many Requests Error on certain useragents (QtWebEngine, outdated chromium based browsers, Safari 605) - https://phabricator.wikimedia.org/T398753#10987072 (Joe) Open→Resolved a:Joe
[06:18:30] Traffic, Hiddenparma, Patch-For-Review: Requestctl should use x-provenance header - https://phabricator.wikimedia.org/T396621#10987090 (Fabfur)
[08:05:39] Traffic, Hiddenparma, Patch-For-Review: Requestctl should use x-provenance header - https://phabricator.wikimedia.org/T396621#10987293 (Fabfur)
[10:16:08] netops, Infrastructure-Foundations, Observability-Alerting, Patch-For-Review: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#10987549 (cmooney) @ayounsi just a data-point but the QFX5120 in the codfw expansion cage, on 23.4R2.13, are exporting the BFD...
[10:29:35] Traffic, HaproxyKafka, Sustainability (Incident Followup): Avoid logging errors per produced message - https://phabricator.wikimedia.org/T380583#10987592 (Fabfur) In progress→Resolved
[11:11:28] Traffic: Split haproxy configuration in different files - https://phabricator.wikimedia.org/T399071 (Fabfur) NEW
[11:37:22] Traffic: Upgrade to haproxy 2.8.15 - https://phabricator.wikimedia.org/T398720#10987749 (SLyngshede-WMF)
[12:24:19] Traffic, Patch-For-Review: Split haproxy configuration in different files - https://phabricator.wikimedia.org/T399071#10987873 (Fabfur) Open→Resolved Removed confd leftover files too
[12:35:47] Traffic: Upgrade to haproxy 2.8.15 - https://phabricator.wikimedia.org/T398720#10987915 (SLyngshede-WMF)
[12:56:06] Traffic: Consider using the alternate chain of Google Trust Services certificates - https://phabricator.wikimedia.org/T398596#10987973 (Vgutierrez)
[13:17:41] Traffic, Experimentation Lab: Block requests to /evt-103e/v2/events with no Edge Unique - https://phabricator.wikimedia.org/T398181#10988046 (Vgutierrez) In progress→Resolved ` $ curl -s -v https://en.wikipedia.org/evt-103e/v2/events -o /dev/null 2>&1 |egrep '(x-cache|HTTP/2)' < HTTP/2 204 <...
[13:18:51] Traffic: Disable OCSP for GTS issued certificates - https://phabricator.wikimedia.org/T399079 (Vgutierrez) NEW
[13:19:12] Traffic: Disable OCSP for GTS issued certificates - https://phabricator.wikimedia.org/T399079#10988060 (Vgutierrez) p:Triage→Medium
[13:28:48] Traffic: Upgrade to haproxy 2.8.15 - https://phabricator.wikimedia.org/T398720#10988112 (SLyngshede-WMF)
[13:48:56] netops, Infrastructure-Foundations, netbox, SRE: Decom cookbook: delete virtual interfaces from device - https://phabricator.wikimedia.org/T398412#10988208 (taavi)
[14:30:08] Traffic, Hiddenparma, Patch-For-Review: Requestctl should use x-provenance header - https://phabricator.wikimedia.org/T396621#10988484 (Fabfur) Open→Resolved
[15:39:46] Traffic: Remove katran blockers for low-traffic non-k8s based services - https://phabricator.wikimedia.org/T373020#10988812 (Vgutierrez) Open→Resolved
[16:00:20] FIRING: DnsboxServiceMismatch: Service ntp-b state mismatch on dns7002:9100 - https://wikitech.wikimedia.org/wiki/DNS#DnsboxServiceMismatch - https://grafana.wikimedia.org/d/96fb573c-0f3c-456a-886c-e50c29f3ed48/dns-box-service-state?var-site=magru&var-instance=dns7002:9100 - https://alerts.wikimedia.org/?q=alertname%3DDnsboxServiceMismatch
[16:00:38] ^ yep expected. fired it intentionally
[16:10:20] RESOLVED: DnsboxServiceMismatch: Service ntp-b state mismatch on dns7002:9100 - https://wikitech.wikimedia.org/wiki/DNS#DnsboxServiceMismatch - https://grafana.wikimedia.org/d/96fb573c-0f3c-456a-886c-e50c29f3ed48/dns-box-service-state?var-site=magru&var-instance=dns7002:9100 - https://alerts.wikimedia.org/?q=alertname%3DDnsboxServiceMismatch
[16:12:06] Traffic, Data-Platform-SRE, PyBal: Pybal: Depool nodes outside broadcast domain - https://phabricator.wikimedia.org/T363697#10988976 (bking) As we have migrated from L2 load balancing to IPIP (ref T394062 ), this ticket is no longer relevant. Closing...
[16:12:22] Traffic, PyBal, Data-Platform-SRE (2025.07.05 - 2025.07.25): Pybal: Depool nodes outside broadcast domain - https://phabricator.wikimedia.org/T363697#10988978 (bking)
[16:12:53] Traffic, PyBal, Data-Platform-SRE (2025.07.05 - 2025.07.25): Pybal: Depool nodes outside broadcast domain - https://phabricator.wikimedia.org/T363697#10988980 (bking) Open→Invalid
[16:25:12] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#10989042 (elukey) @Jhancock.wm is there another node that we can test provision on? For example 2044 or 2045, just to understand if it is a problem of 2043 or not. Lem...
[16:33:12] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#10989096 (Volans) Are we sure that the network card is properly installed? I'm getting this from racadm: ` racadm>>get nic.nicconfig ERROR: SWC0244 : Invalid Fully Qu...
[17:28:59] Traffic, Patch-For-Review: Consider using a dedicated TLS certificate for upload.w.o - https://phabricator.wikimedia.org/T394484#10989288 (CDanis) Thanks for fixing Probenet, @Vgutierrez {F63651793}
[17:49:14] netops, Traffic, Infrastructure-Foundations, SRE: Alert when anycast-healthchecker withdraws BGP route - https://phabricator.wikimedia.org/T374619#10989357 (ssingh) On the DNS hosts as of today, we have an alert in place if we detect a mismatch between the service state as defined by confd/confct...
[17:51:12] cdanis: could you share that turnilo link?
[17:51:35] vgutierrez: https://w.wiki/EgVj
[18:22:40] Traffic: Remove OCSP monitoring and related bits - https://phabricator.wikimedia.org/T399114 (ssingh) NEW
[18:22:58] Traffic: Remove OCSP monitoring and related bits - https://phabricator.wikimedia.org/T399114#10989460 (ssingh)
[18:23:04] Traffic, Patch-For-Review: Disable OCSP for GTS issued certificates - https://phabricator.wikimedia.org/T399079#10989459 (ssingh)
[18:23:14] Traffic, DC-Ops, ops-eqiad, SRE: Q3:test NIC for lvs1017 - https://phabricator.wikimedia.org/T387145#10989461 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1017.eqiad.wmnet with OS bullseye
[18:23:17] Traffic: Remove OCSP monitoring and related bits - https://phabricator.wikimedia.org/T399114#10989462 (ssingh) Open→In progress p:Triage→Medium
[18:25:04] Traffic, DC-Ops, ops-eqiad, SRE: Q3:test NIC for lvs1017 - https://phabricator.wikimedia.org/T387145#10989482 (BCornwall) I've updated lvs1017's BIOS and Mellanox firmware to the latest versions (2.23.0 and 16.35.30.06) prior to reimaging
[19:43:39] Traffic, DC-Ops, ops-eqiad, SRE: Q3:test NIC for lvs1017 - https://phabricator.wikimedia.org/T387145#10989724 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1017.eqiad.wmnet with OS bullseye executed with errors: - lvs1017 (**FAIL**) - Removed f...
[20:31:39] Traffic, DC-Ops, ops-eqiad, SRE: Q3:test NIC for lvs1017 - https://phabricator.wikimedia.org/T387145#10989858 (BCornwall) Hi, @VRiley-WMF, I'm unable to get lvs1017 to PXE boot - I'm getting `media test failure` errors that advise checking the cables. I'm able to ping the connected switch (lsw1-e...
[23:40:20] FIRING: DnsboxServiceMismatch: Service recdns state mismatch on dns5004:9100 - https://wikitech.wikimedia.org/wiki/DNS#DnsboxServiceMismatch - https://grafana.wikimedia.org/d/96fb573c-0f3c-456a-886c-e50c29f3ed48/dns-box-service-state?var-site=eqsin&var-instance=dns5004:9100 - https://alerts.wikimedia.org/?q=alertname%3DDnsboxServiceMismatch
[23:45:20] RESOLVED: DnsboxServiceMismatch: Service recdns state mismatch on dns5004:9100 - https://wikitech.wikimedia.org/wiki/DNS#DnsboxServiceMismatch - https://grafana.wikimedia.org/d/96fb573c-0f3c-456a-886c-e50c29f3ed48/dns-box-service-state?var-site=eqsin&var-instance=dns5004:9100 - https://alerts.wikimedia.org/?q=alertname%3DDnsboxServiceMismatch