[06:25:51] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Homer trying to delete BGP peerings for VMs on new Eqiad ganeti nodes - https://phabricator.wikimedia.org/T381175#11885209 (ayounsi) Open→Resolved I think we're all good here, the issue has been tackled in 2 different ways and...
[06:28:23] netops, Infrastructure-Foundations, SRE, Epic: [tracking] Don't keep on the public vlans hosts that don't require it - https://phabricator.wikimedia.org/T317177#11885212 (ayounsi)
[08:47:30] netops, Infrastructure-Foundations, Observability-Metrics, Patch-For-Review: gNMIc: investigate new "collector" command - https://phabricator.wikimedia.org/T416360#11885702 (ayounsi) Open→Resolved a:ayounsi All done.
[09:06:22] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Network telemetry - collect device sub-interface statistics with gnmic - https://phabricator.wikimedia.org/T424683#11885878 (ayounsi) Nice! We can also filter out the `.16386`, `.16384`, `.16385`, `.16383`, `.32769` - weird juniper... a...
[10:11:30] Traffic, Beta-Cluster-Infrastructure, Patch-For-Review: Puppet agent failure detected on instance deployment-cache-text08 in project deployment-prep - https://phabricator.wikimedia.org/T424549#11886136 (elukey) Open→Resolved Fixed!
[10:18:33] Traffic, Liberica, Machine-Learning-Team, Prod-Kubernetes, Kubernetes: Migrate ML k8s apiserver and services to IPIP - https://phabricator.wikimedia.org/T420438#11886145 (elukey) @klausman @DPogorzelski-WMF Hi! Do you have a timeline for this work?
[10:21:35] Traffic, Liberica, Machine-Learning-Team, Prod-Kubernetes, Kubernetes: Migrate ML k8s apiserver and services to IPIP - https://phabricator.wikimedia.org/T420438#11886154 (klausman) a:klausman
[10:21:57] Traffic, Liberica, Machine-Learning-Team, Prod-Kubernetes, Kubernetes: Migrate ML k8s apiserver and services to IPIP - https://phabricator.wikimedia.org/T420438#11886155 (klausman) >>! In T420438#11886144, @elukey wrote: > @klausman @DPogorzelski-WMF Hi! Do you have a timeline for this work?...
[12:59:10] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11886556 (ayounsi) > suggest corebgp-- for it I suggest `core1` instead of `corebgp` but that lgtm! For v4 I'd have thought a /31 for a vlan used only between 2 CRs. So if we add anoth...
[13:56:39] Traffic, Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: Surge in webrequest validation check - https://phabricator.wikimedia.org/T422030#11886758 (xcollazo) Even though the patch applied cleanly and DAG `refine_webrequest_hourly_text` continues to move forward, the error email...
[14:01:37] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886785 (SLyngshede-WMF) Minor error in command, should have been: ` $ ssh cumin1003.eqiad.wmnet $ sudo cookbook sre.dns.admin depool ulsfo -t T408892 -r...
[14:03:14] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886801 (SLyngshede-WMF) Depooling command output, for the records: ` slyngshede@cumin1003:~$ sudo cookbook sre.dns.admin depool ulsfo -t T408892 -r "New...
[14:25:45] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886863 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=6733bed9-572f-4b81-9a71-76b2217ca3b5) set by pt1979@cumin1003 for 4:00:00 on 4 hos...
[14:30:44] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11886897 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=ea06e422-63a1-4feb-89ac-13f0b89b4956) set by pt1979@cumin1003 for 4:00:00 on 5 hos...
[15:19:49] Traffic, Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: Surge in webrequest validation check - https://phabricator.wikimedia.org/T422030#11887099 (JAllemandou) The problem is due to a forgotten step in deploying the `refinery` repo. I have just fixed that. Let's validate that...
[15:33:39] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11887131 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=241a7848-479d-48b2-8824-9a08c17249ab) set by ayounsi@cumin1003 for 20:00:00 on 39...
[15:39:50] Traffic, collaboration-services, Continuous-Integration-Infrastructure, Continuous-Integration-Config: Purge frontend cache when publish new coverage report under https://doc.wikimedia.org/cover - https://phabricator.wikimedia.org/T423951#11887159 (Dzahn) Open→Resolved
[15:40:20] Traffic, collaboration-services, Continuous-Integration-Infrastructure, Continuous-Integration-Config: Purge frontend cache when publish new coverage report under https://doc.wikimedia.org/cover - https://phabricator.wikimedia.org/T423951#11887160 (Dzahn) a:Dzahn→None
[17:25:40] Traffic, SRE: [Search Console Verification DNS Request] - {{wikimediafoundation.org}} - https://phabricator.wikimedia.org/T404974#11887479 (Aklapper)
[18:17:00] Traffic, Commons, DC-Ops, MediaWiki-Core-Revision-backend, and 3 others: ESAMS serving an older revision of some overwritten files - https://phabricator.wikimedia.org/T425216#11887577 (AlexisJazz) https://commons.wikimedia.org/wiki/File:Hana_Vagnerov%C3%A1_v_Show_Jana_Krause_19._5._2021_upout%C3%...
[22:52:36] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11888002 (Papaul)
[23:10:31] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11888032 (Papaul)