[00:05:25] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:52:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[01:47:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[04:05:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:49:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:22:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[07:57:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[08:05:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:12:11] netops, Infrastructure-Foundations, SRE: cr1-eqiad:et-1/1/2 <-> cr1-codfw:et-1/0/2 transport flapping, disabled for now - https://phabricator.wikimedia.org/T407578#11284197 (cmooney) Thanks Brandon you did the right thing. For now, for troubleshooting, I have set the Arelion circuit to 'drained' sta...
[08:24:49] netops, Infrastructure-Foundations, SRE: cr1-eqiad:et-1/1/2 <-> cr1-codfw:et-1/0/2 transport flapping, disabled for now - https://phabricator.wikimedia.org/T407578#11284209 (cmooney) Seems this started fairly suddenly yesterday afternoon: {F66756494 width=800} The link is flapping hard up/down cons...
[08:52:02] netops, Infrastructure-Foundations, SRE: Arelion 100G transport cr1-eqiad:et-1/1/2 <-> cr1-codfw:et-1/0/2 flapping on eqiad side [Oct 2025] - https://phabricator.wikimedia.org/T407578#11284373 (cmooney)
[09:02:11] netops, Infrastructure-Foundations, SRE: Arelion 100G transport cr1-eqiad:et-1/1/2 <-> cr1-codfw:et-1/0/2 flapping on eqiad side [Oct 2025] - https://phabricator.wikimedia.org/T407578#11284488 (cmooney) The link has been mostly stable since re-enabling it at 08:15 UTC, it flapped a few times immediat...
[09:34:48] RESOLVED: PuppetZeroResources: Puppet has failed generate resources on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[10:00:21] netops, Infrastructure-Foundations, SRE: Arelion 100G transport cr1-eqiad:et-1/1/2 <-> cr1-codfw:et-1/0/2 flapping on eqiad side [Oct 2025] - https://phabricator.wikimedia.org/T407578#11284632 (cmooney) So there was a known fault on the Arelion side and they had raised a ticket internally about it....
[11:22:45] netops, Infrastructure-Foundations, Documentation: The links under "Test IP fragmentation issues" on `wikitech:Reporting a connectivity issue` no longer appear to work - https://phabricator.wikimedia.org/T407505#11284843 (Ladsgroup) Triaging the ticket to the correct team. Please correct me if it's w...
[12:05:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:15:25] RESOLVED: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:31:39] netops, Infrastructure-Foundations, Observability-Alerting: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#11285219 (cmooney) Further to note that the Nokias do support the Openconfig BGP paths, apart from this one which is no main issue: ` /network-insta...
[19:59:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[20:09:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[23:55:55] FIRING: MaxConntrack: Max conntrack at 82.98% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack