[00:52:25] serviceops, Content-Transform-Team, Wikifeeds, Wikipedia-Android-App-Backlog: Significant increase in wikifeeds latency and mobileapps error rate since 2025/11/13 - https://phabricator.wikimedia.org/T410296#11408049 (Scott_French) Ah, interesting! Thanks @Jgiannelos - I didn't realize there had /...
[01:40:16] serviceops, Patch-For-Review: Migrate the etcd main cluster to cfssl-based PKI - https://phabricator.wikimedia.org/T352245#11408078 (Scott_French) I'll be out the remainder of this week, but when I return I'd like to get this moving forward in eqiad. The sequencing would look similar to codfw, though so...
[02:01:06] serviceops, Patch-For-Review: MediaWiki on PHP 8.3 production workload migration - https://phabricator.wikimedia.org/T405955#11408084 (Scott_French)
[02:08:30] serviceops, Patch-For-Review: MediaWiki on PHP 8.3 production workload migration - https://phabricator.wikimedia.org/T405955#11408087 (Scott_French) Note: There is one lingering //unused// reference to a very stale 8.1 image in [[ https://gerrit.wikimedia.org/g/operations/deployment-charts/+/3f9dc0486fab...
[08:59:10] serviceops, Infrastructure-Foundations, SRE-tools: Add a --rack flag to sre.k8s.pool-depool-node - https://phabricator.wikimedia.org/T410537#11408474 (MLechvien-WMF) a:MLechvien-WMF
[09:50:58] serviceops, Patch-For-Review: Migrate the etcd main cluster to cfssl-based PKI - https://phabricator.wikimedia.org/T352245#11408637 (MoritzMuehlenhoff) FWIW, the plan for eqiad sounds good to me
[10:53:49] serviceops, Traffic, MediaWiki-Platform-Team (Kanban Board), OKR-Work: api-gateway helm chart: rest routes should return retry-after when a rate limit applies. - https://phabricator.wikimedia.org/T405636#11408840 (daniel) a:Hokwelum
[11:11:48] serviceops: Incident: 2022-09-08 codfw appservers degradation - https://phabricator.wikimedia.org/T317340#11408893 (jijiki) Open→Resolved a:jijiki Bluntly closing
[11:18:59] serviceops, Connection-Team: MediaWiki periodic job campaignevents-aggregateanswers-metawiki failed - https://phabricator.wikimedia.org/T410748#11408912 (Clement_Goubert) >>! In T410748#11406974, @Daimona wrote: >>>! In T410748#11405819, @Clement_Goubert wrote: >> The commands should be run on `deploymen...
[11:29:19] serviceops, Abstract Wikipedia team, MediaWiki-Action-API, MW-Interfaces-Team, and 4 others: wikifunctions.org API no longer works via that URL (without 'www.') - https://phabricator.wikimedia.org/T411066#11408962 (Clement_Goubert) p:Triage→High a:Clement_Goubert
[11:40:59] serviceops, Data-Engineering, Data-Engineering-Radar, Observability-Logging, and 2 others: Fix Kafka replicas skew - https://phabricator.wikimedia.org/T407185#11409008 (Clement_Goubert) Open→In progress
[11:43:53] serviceops, MinT, Prod-Kubernetes, SRE: machinetranslation eqiad pods in state ContainerStatusUnknown - https://phabricator.wikimedia.org/T411058#11409010 (KartikMistry) @RLazarus We deployed MinT lastly on 06 Nov with a37ece7cde26383bba8b3f22519635f3e3b95da5. Is it possible that resource allocat...
[11:44:47] serviceops, CXServer, LPL Projects (Other), Unplanned-Sprint-Work: cxserver: Remove Yandex MT key from production - https://phabricator.wikimedia.org/T408138#11409012 (KartikMistry) Thanks @jijiki
[12:23:22] serviceops, Connection-Team: MediaWiki periodic job campaignevents-aggregateanswers-metawiki failed - https://phabricator.wikimedia.org/T410748#11409166 (Daimona) Open→Resolved a:Clement_Goubert Thanks for the pointers, makes sense to me! I'll go ahead and close this then, as it was just...
[12:35:31] serviceops, Abstract Wikipedia team, MediaWiki-Action-API, MW-Interfaces-Team, and 4 others: wikifunctions.org API no longer works via that URL (without 'www.') - https://phabricator.wikimedia.org/T411066#11409199 (Clement_Goubert) Open→Resolved Deployed and tested quickly, looks like...
[14:00:24] serviceops, MinT, Prod-Kubernetes, SRE: machinetranslation eqiad pods in state ContainerStatusUnknown - https://phabricator.wikimedia.org/T411058#11409494 (JMeybohm) `ContainerStatusUnknown` usually happens when a node is down or otherwise in trouble which seems to have been the for the two nodes...
[14:41:51] serviceops: Improve detection of kafka-main broker TLS certificate rotations - https://phabricator.wikimedia.org/T410552#11409667 (Blake) If my understanding is correct, we're now going to need to do some metric relabeling. Naively, what I would like to do is something like the following: ` (process_start_...
[15:23:31] Hello everyone I'm about to deploy another changeprop change into production, FYI, https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1211655
[15:24:28] dpogorzelski: ack, thanks :)
[15:38:47] serviceops, MediaWiki-Engineering, MediaWiki-Maintenance-system, MW-on-K8s: mwscript-k8s makes it challenging to run LoggedUpdateMaintenance across all wikis - https://phabricator.wikimedia.org/T411104#11409831 (Urbanecm_WMF) #serviceops: I'd appreciate your opinion on the `mwscript-k8s` facing s...
[15:47:01] serviceops, Growth-Team, MediaWiki-Engineering, MediaWiki-Maintenance-system, MW-on-K8s: mwscript-k8s makes it challenging to run LoggedUpdateMaintenance across all wikis - https://phabricator.wikimedia.org/T411104#11409850 (Urbanecm_WMF)
[15:56:46] serviceops, Growth-Team, MediaWiki-Engineering, MediaWiki-Maintenance-system, MW-on-K8s: mwscript-k8s makes it challenging to run LoggedUpdateMaintenance across all wikis - https://phabricator.wikimedia.org/T411104#11409929 (Clement_Goubert) Added documentation of FOREACHWIKI_IGNORE_ERRORS ht...
[16:11:12] serviceops, Growth-Team, MediaWiki-Engineering, MediaWiki-Maintenance-system, MW-on-K8s: mwscript-k8s makes it challenging to run LoggedUpdateMaintenance across all wikis - https://phabricator.wikimedia.org/T411104#11409999 (Urbanecm_WMF) >>! In T411104#11409929, @Clement_Goubert wrote: > Add...
[16:21:37] serviceops, Growth-Team, MediaWiki-Engineering, MediaWiki-Maintenance-system, MW-on-K8s: mwscript-k8s makes it challenging to run LoggedUpdateMaintenance across all wikis - https://phabricator.wikimedia.org/T411104#11410073 (Clement_Goubert) >>! In T411104#11409999, @Urbanecm_WMF wrote: >>>!...
[16:39:46] serviceops, Data-Engineering, Data-Engineering-Radar, Observability-Logging, and 2 others: Fix Kafka replicas skew - https://phabricator.wikimedia.org/T407185#11410134 (Clement_Goubert) Rebalance done on kafka-main-eqiad {F70672099} Partition count stays imbalanced due to partition size variance,...
[18:11:59] serviceops, WE4.2 Bot detection: hcaptcha extension, proxy: Define the backoff and retry strategies - https://phabricator.wikimedia.org/T411115 (Raine) NEW
[18:53:04] serviceops, DNS, SRE, Traffic, Language codes: Redirect legacy language codes for Toki Pona to tok.wikipedia.org - https://phabricator.wikimedia.org/T404507#11410708 (Tamzin) This technically wasn't stalled, but there wasn't much reason to get around to it till now, so, noting that T404457 ha...
[20:35:45] serviceops: hcaptcha proxy: update wikitech page - https://phabricator.wikimedia.org/T411131 (Raine) NEW
[20:37:02] serviceops, Documentation: hcaptcha proxy: update wikitech page - https://phabricator.wikimedia.org/T411131#11411028 (taavi)
[21:11:51] serviceops, Documentation, WE4.2 Bot detection: hcaptcha proxy: update wikitech page - https://phabricator.wikimedia.org/T411131#11411072 (Raine) p:Triage→Medium a:Raine
[21:34:39] serviceops: hcaptcha proxy: bump connection limits + stress test - https://phabricator.wikimedia.org/T411141 (Raine) NEW
[22:18:20] serviceops: hcaptcha-proxy: update service catalog - https://phabricator.wikimedia.org/T411148 (Raine) NEW