[00:03:28] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (94.72%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [00:03:29] FIRING: SystemdUnitFailed: wmf_auto_restart_ipmiseld.service on netmon1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [02:33:28] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.03%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [03:33:28] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. 
- https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [04:03:28] FIRING: SystemdUnitFailed: wmf_auto_restart_ipmiseld.service on netmon1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [05:04:32] RESOLVED: SystemdUnitFailed: wmf_auto_restart_ipmiseld.service on netmon1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [06:33:28] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (94.78%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [06:59:31] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.03%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [07:03:28] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.03%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [07:33:28] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. 
- https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [07:51:34] FIRING: DiskSpace: Disk space cumin2002:9100:/ 4.945% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cumin2002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace [08:00:25] FIRING: SystemdUnitFailed: prometheus-puppet-agent-stats.service on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [08:27:34] RESOLVED: DiskSpace: Disk space cumin2002:9100:/ 5.208% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cumin2002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace [08:27:55] RESOLVED: SystemdUnitFailed: prometheus-puppet-agent-stats.service on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [09:01:22] 10netops, 06Infrastructure-Foundations, 06SRE, 13Patch-For-Review: Map dumps HTTPS traffic as low-priority for QoS - https://phabricator.wikimedia.org/T397153#10927019 (10cmooney) >>! In T397153#10925689, @xcollazo wrote: > Should we also mark rsync traffic as low-priority then? Hmm yeah it might not be a... [09:01:51] 10netops, 06Infrastructure-Foundations, 06SRE, 13Patch-For-Review: Map dumps HTTPS traffic as low-priority for QoS - https://phabricator.wikimedia.org/T397153#10927020 (10cmooney) FWIW the change to mark the HTTP traffic is in place and working ` cmooney@clouddumps1002:~$ sudo iptables -v -n -t mangle -L P... 
[09:47:33] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10927145 (10BTullis) Hello, just to let you know, I'm now trying the same operation on an-coord1003 T394499#10927000 and getting the same error as @RobH ab... [10:12:00] 10netbox, 06Infrastructure-Foundations: Upgrade Netbox to version 4.0.11 - https://phabricator.wikimedia.org/T397300 (10SLyngshede-WMF) 03NEW [10:12:08] 10netbox, 06Infrastructure-Foundations: Upgrade Netbox to version 4.0.11 - https://phabricator.wikimedia.org/T397300#10927285 (10SLyngshede-WMF) p:05Triage→03Medium [10:21:25] FIRING: SystemdUnitFailed: netbox_report_cables_run.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [10:32:54] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10927411 (10Volans) @BTullis the SSD upgrade is a type of its own, not STORAGE, so the files must be in `/srv/firmware/poweredge-r440/SSD`. If you use that... [10:37:11] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: Sync firmwares directory between the cumin hosts - https://phabricator.wikimedia.org/T397306 (10Volans) 03NEW p:05Triage→03Medium [10:37:16] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10927437 (10Volans) Created T397306 [10:41:49] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: Sync firmwares directory between the cumin hosts - https://phabricator.wikimedia.org/T397306#10927452 (10MoritzMuehlenhoff) We could also have one seedhost on a single designated Cumin host where dc ops can write to. And then set up an rsync which syncs th... 
[10:49:23] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: Sync firmwares directory between the cumin hosts - https://phabricator.wikimedia.org/T397306#10927495 (10Volans) That's an interesting idea that would work right now because the auto-download from the Dell website is broken, but if we fix that then any cum... [10:51:25] RESOLVED: SystemdUnitFailed: netbox_report_cables_run.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [11:03:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.06%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [11:18:10] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10927588 (10BTullis) >>! In T394543#10927411, @Volans wrote: > @BTullis the SSD upgrade is a type of its own, not STORAGE, so the files must be in `/srv/fi... 
[11:26:21] 10SRE-tools, 06collaboration-services, 06Infrastructure-Foundations, 10Puppet-Core, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#10927634 (10taavi) [11:28:25] FIRING: SystemdUnitFailed: user@499.service on aux-k8s-worker1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [11:30:41] 10SRE-tools, 06collaboration-services, 06Infrastructure-Foundations, 10Puppet-Core, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#10927696 (10MoritzMuehlenhoff) [11:33:29] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [11:55:39] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10927811 (10Volans) The cookbook exited with that code because it had a failure, unfortunately was missing a useful logging message at the right point. I'm... 
[11:58:13] 10netbox, 06Infrastructure-Foundations, 13Patch-For-Review: Upgrade Netbox to version 4.0.11 - https://phabricator.wikimedia.org/T397300#10927812 (10SLyngshede-WMF) [11:58:25] RESOLVED: SystemdUnitFailed: user@499.service on aux-k8s-worker1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [12:21:57] 10netbox, 06Infrastructure-Foundations, 13Patch-For-Review: Upgrade Netbox to version 4.0.11 - https://phabricator.wikimedia.org/T397300#10927916 (10SLyngshede-WMF) ` sudo cookbook sre.deploy.python-code -t T397300 -r 'Release v4.0.11 to netbox-next' -u netbox netbox 'A:netbox-canary' ` [12:29:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [12:33:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.02%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [12:59:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.05%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [13:03:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.01%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - 
https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [13:15:17] 10netbox, 06Infrastructure-Foundations: Upgrade Netbox to version 4.0.11 - https://phabricator.wikimedia.org/T397300#10928150 (10SLyngshede-WMF) Database restore from prod: ` SSH_AUTH_SOCK=/run/keyholder/proxy.sock scp -3 root@netboxdb2003.codfw.wmnet:/srv/postgres-backup/psql-all-dbs-latest.sql.gz root@netb... [13:31:48] 10netops, 06Infrastructure-Foundations, 06SRE: Map dumps HTTPS traffic as low-priority for QoS - https://phabricator.wikimedia.org/T397153#10928195 (10cmooney) 05Open→03Resolved a:03cmooney [13:59:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.02%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [14:03:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.09%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [14:11:29] 10SRE-tools, 06Infrastructure-Foundations, 10Observability-Alerting, 10SRE Observability (FY2025/2026-Q1): Cookbook sre.hosts.remove_downtime does not remove silences - https://phabricator.wikimedia.org/T395032#10928440 (10lmata) [14:59:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.06%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [15:03:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.18%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - 
https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [15:13:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [15:33:29] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [15:42:09] dear foundations [15:42:30] I added a VM, and then switched BGP to true, and ran homer and I got [15:42:30] ERROR:homer_plugins.wmf-netbox:No BGP group found for wikikube-worker-exp2001. [15:42:44] have I done something wrong? is it me or you ? [15:43:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [15:57:41] we chatted over irc, but `wikikube-worker-exp` needs to be added to the list below https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/homer/deploy/+/refs/heads/master/plugins/wmf-netbox.py#38 [16:00:18] cheers arzhel, thanks [16:01:06] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10928851 (10BTullis) >>! In T394543#10927811, @Volans wrote: > If you try to re-run it it does tell you there is nothing to upgrade right? I can confirm t... 
[16:13:44] 10SRE-tools, 06DC-Ops, 06Infrastructure-Foundations: SSD firmware update not working in firmware cookbook - https://phabricator.wikimedia.org/T394543#10928884 (10Volans) yes if you pick the same version (option 0 above) it would just tell you that there is nothing to do because already at the same versio... [16:29:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.01%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [16:33:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.12%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [16:59:25] FIRING: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [16:59:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.06%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [17:03:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.06%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [17:04:25] RESOLVED: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - 
https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [17:29:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [17:29:58] ^ out of curiosity, do we know what these are about? I may have missed it [17:30:11] as in, the text is clear but have we looked into what's causing that or is that expected somehow? [17:33:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.06%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [17:49:20] sukhe: it's normal to happen from time to time, I was going to attempt rebalancing that nodegroup later today, per the runbook [17:50:20] cdanis: thanks! makes sense. [17:50:49] fwiw, I have only observed this recently so it may have been when it was added (not sure) or something that changed (which was what my question was about I guess) [17:51:29] probably an existing VM got resized, or just a bad assignment on adding a new one [17:55:05] yeah... 
[17:59:32] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (95.04%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [18:03:29] FIRING: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.16%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [18:10:51] 💙cdanis@ganeti1046.eqiad.wmnet ~ 🕑☕ sudo gnt-node list -o name,group | grep 1036 [18:10:53] ganeti1036.eqiad.wmnet B [18:10:55] 💙cdanis@ganeti1046.eqiad.wmnet ~ 🕑☕ sudo hbal -L -X -G B [18:13:29] RESOLVED: [2x] GanetiMemoryPressure: Ganeti: High memory usage (95.16%) on ganeti1036:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure [18:14:42] neat [18:14:50] it's not done yet, but it is moving several VMs off of 1036 [18:15:52] https://phabricator.wikimedia.org/P78380 [18:25:19] thanks for rebalancing, the current disproportion was due some forced migrations caused when debugging https://phabricator.wikimedia.org/T397025 [18:25:55] sadly the cpufreq governor turned out to be red herring: https://phabricator.wikimedia.org/T396660#10928689 [18:27:32] interesting [19:33:29] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. 
- https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [20:40:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [21:50:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [21:59:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [23:33:29] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts