[02:50:41] FIRING: SystemdUnitFailed: netbox_ganeti_codfw02_sync.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:50:41] FIRING: [2x] SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:45:41] FIRING: [2x] SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:10:30] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru <-> asw1-b3-magru link down March 2026 - https://phabricator.wikimedia.org/T418978#11686377 (ayounsi) a:RobH Rob, could you prioritize this ? Thanks
[09:39:06] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11686507 (ayounsi) a:ayounsi→Papaul @papaul, can you try a factory reset of the switch from rack 23? (the one failing the TLS cookbook). I'm also still...
[09:40:15] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: PeeringBGPDown (instance cr1-drmrs:9804) - https://phabricator.wikimedia.org/T416987#11686528 (ayounsi) Open→Resolved
[11:45:41] FIRING: SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:56:22] CAS-SSO, Infrastructure-Foundations: Upgrade Apereo CAS to version 7.3 series - https://phabricator.wikimedia.org/T419419 (SLyngshede-WMF) NEW
[12:56:24] CAS-SSO, Infrastructure-Foundations: Upgrade Apereo CAS to version 7.3 series - https://phabricator.wikimedia.org/T419419#11687575 (SLyngshede-WMF) p:Triage→Low
[13:40:10] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11687791 (Papaul) @ayounsi yes I can do that. Do we have some documentation on how to factory reset the Nokia switch somewhere or is it just "delete /"...
[14:59:29] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: Update ULSFO LVS service IP's - https://phabricator.wikimedia.org/T418971#11688170 (MoritzMuehlenhoff) p:Triage→High
[15:45:41] FIRING: SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:48:36] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru <-> asw1-b3-magru link down March 2026 - https://phabricator.wikimedia.org/T418978#11688418 (RobH) I'll work on this now.
[16:00:09] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru <-> asw1-b3-magru link down March 2026 - https://phabricator.wikimedia.org/T418978#11688523 (RobH) CS1253254 filed, listed myself, Arzhel, Cathal, and Papaul on the CC list. > Account: WIKIMEDIA > Contact: Robert McMahon Halsell > D...
[19:27:17] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru <-> asw1-b3-magru link down March 2026 - https://phabricator.wikimedia.org/T418978#11689626 (RobH) Ok, they swapped the optic in cr2-magru but it still shows down: et-0/0/1 up down Core: asw1-b3-magru:et-0/0/50 {#70130} The o...
[19:35:41] FIRING: [2x] SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:35:51] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru <-> asw1-b3-magru link down March 2026 - https://phabricator.wikimedia.org/T418978#11689696 (RobH) > Support, > > Thank you, we can see the old module QSFP-100GBASE-SR4 SN GT3AAG00321 was removed and replaced with QSFP-100GBASE-SR4 mo...
[19:50:41] FIRING: [2x] SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:28:36] netops, Infrastructure-Foundations, ops-magru, SRE: cr2-magru <-> asw1-b3-magru link down March 2026 - https://phabricator.wikimedia.org/T418978#11690332 (RobH) They've now replaced the patch cable but we're still seeing down: > Comment generated in Smart Hands: Dear, evening. > > As requested...
[23:50:41] FIRING: SystemdUnitFailed: bitu-permission-request.service on idm1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed