[00:00:05] Deploy window Web Team deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0000)
[00:00:08] jan_drewniak: I was eventually able to get everything deployed once that wrapped up. no worries!
[00:40:22] (PS1) TrainBranchBot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1233317
[00:40:22] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1233317 (owner: TrainBranchBot)
[00:51:11] SRE-swift-storage, Data-Persistence, MediaViewer, Thumbor, Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11556167 (Krinkle) >>! In T414805#11555784, @Izno wrote: > https://en.wikipedia.org/wiki/User:Bradv/Scripts/ExpandDiffs.js#L-20 […]...
[00:52:37] (Merged) jenkins-bot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1233317 (owner: TrainBranchBot)
[01:04:03] SRE-swift-storage, Data-Persistence, MediaViewer, Thumbor, Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11556200 (Nux) Are icons really a problem? I mean, are there actually external sites using images loaded from Wikimedia as icons on...
[01:05:03] FIRING: MediaWikiEditFailures: Elevated MediaWiki edit failures (session_loss) for cluster - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000208/edit-count?orgId=1&viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiEditFailures
[01:10:38] (PS1) TrainBranchBot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1233324
[01:10:38] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1233324 (owner: TrainBranchBot)
[01:25:03] RESOLVED: MediaWikiEditFailures: Elevated MediaWiki edit failures (session_loss) for cluster - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000208/edit-count?orgId=1&viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiEditFailures
[01:29:41] FIRING: [9x] SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:35:48] (Merged) jenkins-bot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1233324 (owner: TrainBranchBot)
[02:01:12] !log mwpresync@deploy2002 Started scap build-images: Publishing wmf/next image
[02:01:14] RECOVERY - dump of s1 in codfw on backupmon1001 is OK: Last dump for s1 at codfw (db2141) taken on 2026-01-27 00:00:14 (153 GiB, +0.4 %) https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[02:01:32] FIRING: [4x] ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[02:10:04] (PS1) TrainBranchBot: Branch
commit for wmf/1.46.0-wmf.13 [core] (wmf/1.46.0-wmf.13) - https://gerrit.wikimedia.org/r/1233336 (https://phabricator.wikimedia.org/T413804)
[02:10:07] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/1.46.0-wmf.13 [core] (wmf/1.46.0-wmf.13) - https://gerrit.wikimedia.org/r/1233336 (https://phabricator.wikimedia.org/T413804) (owner: TrainBranchBot)
[02:13:54] !log mwpresync@deploy2002 Finished scap build-images: Publishing wmf/next image (duration: 12m 42s)
[02:20:57] SRE-swift-storage, Data-Persistence, MediaViewer, Thumbor, Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11556311 (Nux) Found two problems with the migration: 1. The docs for the editor suggest using 22px (though I guess you can replace...
[02:22:50] (Merged) jenkins-bot: Branch commit for wmf/1.46.0-wmf.13 [core] (wmf/1.46.0-wmf.13) - https://gerrit.wikimedia.org/r/1233336 (https://phabricator.wikimedia.org/T413804) (owner: TrainBranchBot)
[02:29:46] SRE-swift-storage, Data-Persistence, MediaViewer, Thumbor, Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11556320 (Ladsgroup) why not simply using https://upload.wikimedia.org/wikipedia/commons/thumb/6/61/Contribs_icon-black.svg/20px-Con...
[02:30:42] (PS1) Foks: AccountRecovery: Adding additional Zendesk fields [mediawiki-config] - https://gerrit.wikimedia.org/r/1233349 (https://phabricator.wikimedia.org/T414597)
[02:50:40] FIRING: SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:52:43] SRE-swift-storage, Data-Persistence, MediaViewer, Thumbor, Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11556343 (Nux) Because SVG is sharp on any dpi/ppi.
[02:56:18] PROBLEM - dump of m1 in codfw on backupmon1001 is CRITICAL: Last dump for m1 at codfw (db2160) taken on 2026-01-27 00:15:00 is 75 GiB, but the previous one was 89 GiB, a change of -15.9 % https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[03:00:05] Deploy window Automatic branching of MediaWiki, extensions, skins, and vendor – see Heterogeneous deployment/Train deploys (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0300)
[03:06:16] PROBLEM - dump of s1 in eqiad on backupmon1001 is CRITICAL: Last dump for s1 at eqiad (db1240) taken on 2026-01-27 00:00:10 is 153 GiB, but the previous one was 183 GiB, a change of -16.2 % https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[03:19:14] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate eventstreams-internal.discovery.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[03:34:14] FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets -
https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[03:41:18] PROBLEM - dump of s8 in eqiad on backupmon1001 is CRITICAL: Last dump for s8 at eqiad (db1171) taken on 2026-01-27 00:00:03 is 183 GiB, but the previous one was 240 GiB, a change of -23.4 % https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[04:00:04] Deploy window Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see Heterogeneous deployment/Train deploys (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0400)
[04:02:09] (PS1) TrainBranchBot: testwikis to 1.46.0-wmf.13 [mediawiki-config] - https://gerrit.wikimedia.org/r/1233367 (https://phabricator.wikimedia.org/T413804)
[04:02:13] (CR) TrainBranchBot: [C:+2] "Initiated by mwpresync@deploy2002" [mediawiki-config] - https://gerrit.wikimedia.org/r/1233367 (https://phabricator.wikimedia.org/T413804) (owner: TrainBranchBot)
[04:03:04] (Merged) jenkins-bot: testwikis to 1.46.0-wmf.13 [mediawiki-config] - https://gerrit.wikimedia.org/r/1233367 (https://phabricator.wikimedia.org/T413804) (owner: TrainBranchBot)
[04:03:33] !log mwpresync@deploy2002 Started scap sync-world: testwikis to 1.46.0-wmf.13 refs T413804
[04:03:39] T413804: 1.46.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T413804
[04:48:36] !log mwpresync@deploy2002 Finished scap sync-world: testwikis to 1.46.0-wmf.13 refs T413804 (duration: 45m 03s)
[04:48:41] T413804: 1.46.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T413804
[05:00:05] Deploy window Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0500)
[05:02:46] !log mwpresync@deploy2002 Pruned MediaWiki: 1.46.0-wmf.10 (duration: 02m 42s)
[05:09:14] FIRING: [3x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw -
https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[05:29:41] FIRING: [9x] SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:35:10] FIRING: [3x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[05:39:14] FIRING: [3x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[05:41:18] PROBLEM - dump of m1 in eqiad on backupmon1001 is CRITICAL: Last dump for m1 at eqiad (db1217) taken on 2026-01-27 03:05:15 is 75 GiB, but the previous one was 90 GiB, a change of -15.9 % https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[06:01:32] FIRING: [4x] ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[06:36:54] (PS1) Clare Ming: Add ssh key for cjming new laptop [puppet] - https://gerrit.wikimedia.org/r/1233566
[06:41:14] PROBLEM - dump of s8 in codfw on backupmon1001 is CRITICAL: Last dump for s8 at codfw (db2198) taken on 2026-01-27 00:00:05 is 183 GiB, but the previous one was 240 GiB, a change of -23.4 %
https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[06:42:52] (PS1) Marostegui: installserver: Do not format db2248 [puppet] - https://gerrit.wikimedia.org/r/1233569 (https://phabricator.wikimedia.org/T415358)
[06:44:52] (CR) Marostegui: [C:+2] installserver: Do not format db2248 [puppet] - https://gerrit.wikimedia.org/r/1233569 (https://phabricator.wikimedia.org/T415358) (owner: Marostegui)
[06:50:40] FIRING: SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:00:05] Deploy window MediaWiki infrastructure (UTC early) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0700)
[07:00:05] marostegui, Amir1, and federico3: It is that lovely time of the day again! You are hereby commanded to deploy Primary database switchover. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0700).
[07:18:08] !log marostegui@cumin1003 START - Cookbook sre.mysql.newdepool depool db2248: Reimage
[07:18:29] !log marostegui@cumin1003 END (PASS) - Cookbook sre.mysql.newdepool (exit_code=0) depool db2248: Reimage
[07:19:06] !log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2248.codfw.wmnet with reason: reimage
[07:19:14] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate eventstreams-internal.discovery.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[07:19:51] FIRING: CoreRouterInterfaceDropPercent: Core router normal + high priority queue drops are high on cr3-eqsin:xe-0/1/3 (Peering: Equinix (Wikimedia-SG1-IX-00 Singapore, ...
[07:19:51] MAC filter) {#1016}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#CoreRouterInterfaceDropPercent - https://grafana.wikimedia.org/d/5p97dAASz/network-device-interface-queues-and-error-stats?var-site=eqsin,var-instance=cr3-eqsin:9804&var-interface=xe-0/1/3 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDropPercent
[07:20:17] !log marostegui@cumin1003 START - Cookbook sre.hosts.reimage for host db2248.codfw.wmnet with OS trixie
[07:24:51] RESOLVED: CoreRouterInterfaceDropPercent: Core router normal + high priority queue drops are high on cr3-eqsin:xe-0/1/3 (Peering: Equinix (Wikimedia-SG1-IX-00 Singapore, ...
[07:24:51] MAC filter) {#1016}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#CoreRouterInterfaceDropPercent - https://grafana.wikimedia.org/d/5p97dAASz/network-device-interface-queues-and-error-stats?var-site=eqsin,var-instance=cr3-eqsin:9804&var-interface=xe-0/1/3 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDropPercent
[07:24:51] ACKNOWLEDGEMENT - dump of m1 in codfw on backupmon1001 is CRITICAL: Last dump for m1 at codfw (db2160) taken on 2026-01-27 00:15:00 is 75 GiB, but the previous one was 89 GiB, a change of -15.9 % Marostegui This is expected https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[07:24:51] ACKNOWLEDGEMENT - dump of m1 in eqiad on backupmon1001 is CRITICAL: Last dump for m1 at eqiad (db1217) taken on 2026-01-27 03:05:15 is 75 GiB, but the previous one was 90 GiB, a change of -15.9 % Marostegui This is expected https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[07:24:51] ACKNOWLEDGEMENT - dump of s1 in eqiad on backupmon1001 is CRITICAL: Last dump for s1 at eqiad (db1240) taken on 2026-01-27 00:00:10 is 153 GiB, but the previous one was 183 GiB, a change of -16.2 % Marostegui This is expected https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[07:24:51] ACKNOWLEDGEMENT - dump of s8 in codfw on
backupmon1001 is CRITICAL: Last dump for s8 at codfw (db2198) taken on 2026-01-27 00:00:05 is 183 GiB, but the previous one was 240 GiB, a change of -23.4 % Marostegui This is expected https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[07:24:51] ACKNOWLEDGEMENT - dump of s8 in eqiad on backupmon1001 is CRITICAL: Last dump for s8 at eqiad (db1171) taken on 2026-01-27 00:00:03 is 183 GiB, but the previous one was 240 GiB, a change of -23.4 % Marostegui This is expected https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Rerun_a_failed_backup
[07:36:14] !log marostegui@cumin1003 START - Cookbook sre.hosts.downtime for 2:00:00 on db2248.codfw.wmnet with reason: host reimage
[07:39:36] !log marostegui@cumin1003 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2248.codfw.wmnet with reason: host reimage
[08:00:05] Amir1, Urbanecm, and awight: Time to do the UTC morning backport window deploy. Don't look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T0800).
[08:00:05] No Gerrit patches in the queue for this window AFAICS.
[08:01:32] !log marostegui@cumin1003 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2248.codfw.wmnet with OS trixie
[08:04:42] SRE, Traffic: OGP lists fullsize thumbnail version of original instead the original itself - https://phabricator.wikimedia.org/T415598#11556682 (Bawolff) >>! In T415598#11555961, @AntiCompositeNumber wrote: >>>! In T415598#11555931, @TheDJ wrote: >> The ogp.me tag is listing the thumbnail variant of the...
[08:19:44] !log marostegui@cumin1003 START - Cookbook sre.mysql.newpool pool db2248: After reimage
[08:20:21] !log marostegui@cumin1003 dbctl commit (dc=all): 'Set db1184 with weight 0 T415238', diff saved to https://phabricator.wikimedia.org/P87950 and previous config saved to /var/cache/conftool/dbconfig/20260127-082020-marostegui.json
[08:20:26] T415238: Switchover s1 master (db1163 -> db1184) - https://phabricator.wikimedia.org/T415238
[08:20:44] !log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s1 T415238
[08:21:03] (CR) Marostegui: [C:+2] mariadb: Promote db1184 to s1 master [puppet] - https://gerrit.wikimedia.org/r/1230144 (https://phabricator.wikimedia.org/T415238) (owner: Gerrit maintenance bot)
[08:24:46] !log Starting s1 eqiad failover from db1163 to db1184 - T415238
[08:24:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:25:03] !log marostegui@cumin1003 dbctl commit (dc=all): 'Promote db1184 to s1 primary T415238', diff saved to https://phabricator.wikimedia.org/P87951 and previous config saved to /var/cache/conftool/dbconfig/20260127-082502-marostegui.json
[08:25:42] !log marostegui@cumin1003 dbctl commit (dc=all): 'Depool db1163 T415238', diff saved to https://phabricator.wikimedia.org/P87952 and previous config saved to /var/cache/conftool/dbconfig/20260127-082542-marostegui.json
[08:25:49] T415238: Switchover s1 master (db1163 -> db1184) - https://phabricator.wikimedia.org/T415238
[08:27:13] !log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1163.eqiad.wmnet with reason: schema change
[08:28:20] (PS1) Marostegui: db1163: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/1233641 (https://phabricator.wikimedia.org/T411163)
[08:28:49] !log Deploy schema change on old s1 master db1163 T411163 T411164
[08:29:02] Logged the message at
https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:29:03] T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163
[08:29:03] T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164
[08:29:07] (CR) Marostegui: [C:+2] db1163: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/1233641 (https://phabricator.wikimedia.org/T411163) (owner: Marostegui)
[09:05:21] !log marostegui@cumin1003 END (PASS) - Cookbook sre.mysql.newpool (exit_code=0) pool db2248: After reimage
[09:09:19] (PS1) Urbanecm: Update interwiki cache [mediawiki-config] - https://gerrit.wikimedia.org/r/1233648 (https://phabricator.wikimedia.org/T413283)
[09:09:43] (CR) TrainBranchBot: [C:+2] "Approved by urbanecm@deploy2002 using scap backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/1233648 (https://phabricator.wikimedia.org/T413283) (owner: Urbanecm)
[09:10:43] (Merged) jenkins-bot: Update interwiki cache [mediawiki-config] - https://gerrit.wikimedia.org/r/1233648 (https://phabricator.wikimedia.org/T413283) (owner: Urbanecm)
[09:11:55] !log urbanecm@deploy2002 Started scap sync-world: Backport for [[gerrit:1233648|Update interwiki cache (T413283)]]
[09:12:02] T413283: Create Jju Wikipedia - https://phabricator.wikimedia.org/T413283
[09:13:04] !log cwhite@deploy2002 Started deploy [performance/arc-lamp@03c538c]: T391517
[09:13:10] T391517: Daily flame graph for "fn-EditAction" missing since 28 March 2025 - https://phabricator.wikimedia.org/T391517
[09:13:12] !log cwhite@deploy2002 Finished deploy [performance/arc-lamp@03c538c]: T391517 (duration: 00m 08s)
[09:15:28] SRE, SRE-Access-Requests: Requesting access to deployer for trueg - https://phabricator.wikimedia.org/T415632 (trueg) NEW
[09:19:36] (PS1) Samwilson: Enable watchlist labels everywhere (prod and beta) [mediawiki-config] - https://gerrit.wikimedia.org/r/1233651
(https://phabricator.wikimedia.org/T413967)
[09:22:50] !log urbanecm@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233648|Update interwiki cache (T413283)]] (duration: 10m 55s)
[09:22:55] T413283: Create Jju Wikipedia - https://phabricator.wikimedia.org/T413283
[09:29:41] FIRING: [9x] SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:39:11] (PS2) Cwhite: centralauth: add recording rules for grafana widgets [puppet] - https://gerrit.wikimedia.org/r/1233214 (https://phabricator.wikimedia.org/T415035) (owner: Tiziano Fogli)
[09:39:14] FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[09:39:33] (CR) Cwhite: [C:+1] "LGTM!"
[puppet] - https://gerrit.wikimedia.org/r/1233214 (https://phabricator.wikimedia.org/T415035) (owner: Tiziano Fogli)
[09:41:07] (CR) Cwhite: [C:+1] centralauth: add recording rules for grafana widgets (1 comment) [puppet] - https://gerrit.wikimedia.org/r/1233214 (https://phabricator.wikimedia.org/T415035) (owner: Tiziano Fogli)
[09:41:14] (CR) Tiziano Fogli: [C:+2] centralauth: add recording rules for grafana widgets [puppet] - https://gerrit.wikimedia.org/r/1233214 (https://phabricator.wikimedia.org/T415035) (owner: Tiziano Fogli)
[09:44:14] (CR) Cathal Mooney: [C:+2] Netops: link to more-specific dashboards for interface based alerts [alerts] - https://gerrit.wikimedia.org/r/1229163 (owner: Cathal Mooney)
[09:45:56] (Merged) jenkins-bot: Netops: link to more-specific dashboards for interface based alerts [alerts] - https://gerrit.wikimedia.org/r/1229163 (owner: Cathal Mooney)
[09:58:54] SRE, SRE-Access-Requests: Requesting access to deployment for trueg - https://phabricator.wikimedia.org/T415632#11556902 (Novem_Linguae)
[10:01:32] FIRING: [4x] ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:06:17] FIRING: [4x] ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:08:36] (PS1) Phuedx: TestKitchen: Add event intake service URLs [mediawiki-config] - https://gerrit.wikimedia.org/r/1233659
[10:09:29] (PS1) Brouberol: postgresql-airflow-main: force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233660
(https://phabricator.wikimedia.org/T415635)
[10:10:43] PROBLEM - PyBal backends health check on lvs1019 is CRITICAL: PYBAL CRITICAL - CRITICAL - wdqs-main_443: Servers wdqs1018.eqiad.wmnet, wdqs1021.eqiad.wmnet, wdqs1020.eqiad.wmnet, wdqs1022.eqiad.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal
[10:11:22] Oh good, it looks like someone is burning down wdqs again
[10:12:45] (PS2) Brouberol: postgresql-airflow-main: force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233660 (https://phabricator.wikimedia.org/T415635)
[10:14:23] PROBLEM - PyBal backends health check on lvs1020 is CRITICAL: PYBAL CRITICAL - CRITICAL - wdqs-main_443: Servers wdqs1018.eqiad.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal
[10:18:43] RECOVERY - PyBal backends health check on lvs1019 is OK: PYBAL OK - All pools are healthy https://wikitech.wikimedia.org/wiki/PyBal
[10:20:23] RECOVERY - PyBal backends health check on lvs1020 is OK: PYBAL OK - All pools are healthy https://wikitech.wikimedia.org/wiki/PyBal
[10:26:25] (CR) Joal: [C:+1] postgresql-airflow-main: force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233660 (https://phabricator.wikimedia.org/T415635) (owner: Brouberol)
[10:40:45] PROBLEM - Host wikikube-worker1097 is DOWN: PING CRITICAL - Packet loss = 80%, RTA = 9069.21 ms
[10:41:39] RECOVERY - Host wikikube-worker1097 is UP: PING OK - Packet loss = 0%, RTA = 0.29 ms
[10:46:17] FIRING: [4x] ProbeDown: Service wdqs1015:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:48:19] (PS2) GergesShamon: [arwikibooks] Update logos and wordmark [mediawiki-config] - https://gerrit.wikimedia.org/r/1195183
[10:50:40] FIRING: SystemdUnitFailed:
send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:51:17] FIRING: [4x] ProbeDown: Service wdqs1015:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:53:06] (CR) ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-i" [mediawiki-config] - https://gerrit.wikimedia.org/r/1195183 (owner: GergesShamon)
[10:58:38] (PS1) Santiago Faci: Removed `mpic` as local service [mediawiki-config] - https://gerrit.wikimedia.org/r/1233670 (https://phabricator.wikimedia.org/T407805)
[10:59:24] (CR) CI reject: [V:-1] Removed `mpic` as local service [mediawiki-config] - https://gerrit.wikimedia.org/r/1233670 (https://phabricator.wikimedia.org/T407805) (owner: Santiago Faci)
[10:59:32] (CR) Samtar: [C:+1] Enable watchlist labels everywhere (prod and beta) [mediawiki-config] - https://gerrit.wikimedia.org/r/1233651 (https://phabricator.wikimedia.org/T413967) (owner: Samwilson)
[11:00:05] Deploy window MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1100)
[11:05:23] (CR) Santiago Faci: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/1233670 (https://phabricator.wikimedia.org/T407805) (owner: Santiago Faci)
[11:05:37] (CR) Brouberol: [C:+2] postgresql-airflow-main: force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233660 (https://phabricator.wikimedia.org/T415635) (owner: Brouberol)
[11:06:17] RESOLVED: [2x] ProbeDown: Service
wdqs1015:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1015:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[11:15:17] (PS1) STran: Remove deprecated IRS v2 configurations [mediawiki-config] - https://gerrit.wikimedia.org/r/1233674 (https://phabricator.wikimedia.org/T413951)
[11:16:13] (PS1) Brouberol: postgresql-airflow-main: (fix) force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233675 (https://phabricator.wikimedia.org/T415635)
[11:16:17] FIRING: [2x] ProbeDown: Service wdqs1011:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1011:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[11:18:20] (CR) BCornwall: [V:+1 C:+1] "I would move this to a subdir named `completions/` so that way someone can potentially contribute fish/zsh completions as well."
[puppet] - https://gerrit.wikimedia.org/r/1230341 (owner: Majavah)
[11:19:13] (CR) Joal: [C:+1] postgresql-airflow-main: (fix) force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233675 (https://phabricator.wikimedia.org/T415635) (owner: Brouberol)
[11:19:14] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate eventstreams-internal.discovery.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[11:19:25] (CR) Brouberol: [C:+2] postgresql-airflow-main: (fix) force the pods to avoid nodes with 1G NICs [deployment-charts] - https://gerrit.wikimedia.org/r/1233675 (https://phabricator.wikimedia.org/T415635) (owner: Brouberol)
[11:21:17] RESOLVED: [2x] ProbeDown: Service wdqs1011:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1011:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[11:22:31] !log brouberol@deploy2002 helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
[11:22:39] !log brouberol@deploy2002 helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
[11:31:34] (PS3) Majavah: utils/pcc: Add bash completion script [puppet] - https://gerrit.wikimedia.org/r/1230341
[11:37:57] ops-codfw, SRE, DC-Ops, ServiceOps new, and 2 others: Q2:rack/setup/install wikikube-worker2332-56 - https://phabricator.wikimedia.org/T408757#11557236 (Jhancock.wm) Open→Resolved
[11:38:28] SRE, Infrastructure-Foundations, netops, Traffic: Map internet-bound upload traffic to low-priority QoS queue - https://phabricator.wikimedia.org/T415649 (cmooney) NEW p:Triage→Low
[11:41:40] SRE,
Infrastructure-Foundations, netops, Traffic: Map internet-bound upload traffic to low-priority QoS queue - https://phabricator.wikimedia.org/T415649#11557255 (cmooney)
[11:42:18] (PS1) Reedy: CommonSettings.php: Stop loading WebAuthn [mediawiki-config] - https://gerrit.wikimedia.org/r/1233679 (https://phabricator.wikimedia.org/T303495)
[11:42:21] (PS1) Reedy: wmf-config: Remove $wmgUseWebAuthn and extension from extension-list [mediawiki-config] - https://gerrit.wikimedia.org/r/1233680 (https://phabricator.wikimedia.org/T303495)
[11:42:31] (CR) Reedy: [C:-2] "Not yet!" [mediawiki-config] - https://gerrit.wikimedia.org/r/1233679 (https://phabricator.wikimedia.org/T303495) (owner: Reedy)
[11:47:58] (CR) Dzahn: [C:+2] add abstract.wikipedia.org to section for wikis not covered by langlist (1 comment) [dns] - https://gerrit.wikimedia.org/r/1227706 (https://phabricator.wikimedia.org/T411724) (owner: Dzahn)
[11:52:45] (CR) Mszwarc: [C:+1] Remove deprecated IRS v2 configurations [mediawiki-config] - https://gerrit.wikimedia.org/r/1233674 (https://phabricator.wikimedia.org/T413951) (owner: STran)
[11:58:39] !log fnegri@cumin1003 START - Cookbook sre.wikireplicas.add-wiki for database kajwiki (T415041)
[11:58:46] T415041: [wikireplicas] Create views for new wiki kajwiki - https://phabricator.wikimedia.org/T415041
[12:01:26] (CR) Dzahn: [V:+1 C:+2] "https://puppet-compiler.wmflabs.org/output/1227735/7949/zuul1001.eqiad.wmnet/index.html" [puppet] - https://gerrit.wikimedia.org/r/1227735 (https://phabricator.wikimedia.org/T405119) (owner: Dzahn)
[12:02:17] FIRING: [2x] ProbeDown: Service wdqs1013:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1013:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[12:05:51] (PS1) Kevin
Bazira: aptrepo: add ROCm 7.0.0 packages to wikimedia bookworm mirror [puppet] - 10https://gerrit.wikimedia.org/r/1233681 (https://phabricator.wikimedia.org/T415627) [12:07:17] FIRING: ProbeDown: Service wdqs1017:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1017:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [12:14:30] 10ops-eqiad, 06SRE, 06DC-Ops: Inbound errors on interface lswtest-d8-eqiad:mgmt0 () - https://phabricator.wikimedia.org/T415109#11557505 (10phaultfinder) [12:14:31] 10ops-codfw, 06SRE, 06DC-Ops: Inbound errors on interface lsw1-f2-codfw:mgmt0 () - https://phabricator.wikimedia.org/T412154#11557506 (10phaultfinder) [12:14:32] 10ops-codfw, 06SRE, 06DC-Ops: Inbound errors on interface lsw1-f4-codfw:mgmt0 () - https://phabricator.wikimedia.org/T412153#11557507 (10phaultfinder) [12:14:33] 10ops-codfw, 06SRE, 06DC-Ops: Inbound errors on interface lsw1-e4-codfw:mgmt0 () - https://phabricator.wikimedia.org/T412152#11557508 (10phaultfinder) [12:14:35] 10ops-codfw, 06SRE, 06DC-Ops: Inbound errors on interface lsw1-e2-codfw:mgmt0 () - https://phabricator.wikimedia.org/T412155#11557510 (10phaultfinder) [12:15:34] 06SRE, 10PageImages, 06Traffic: OGP lists fullsize thumbnail version of original instead the original itself - https://phabricator.wikimedia.org/T415598#11557513 (10TheDJ) It does a straight `$file->transform()`, which indeed has this effect if creating a thumbnail according to the specified instructions. Th... 
[12:16:51] (03CR) 10Dpogorzelski: [C:03+2] aptrepo: add ROCm 7.0.0 packages to wikimedia bookworm mirror [puppet] - 10https://gerrit.wikimedia.org/r/1233681 (https://phabricator.wikimedia.org/T415627) (owner: 10Kevin Bazira) [12:22:47] RESOLVED: [2x] ProbeDown: Service wdqs1013:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1013:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [12:22:49] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87956 and previous config saved to /var/cache/conftool/dbconfig/20260127-122248-marostegui.json [12:22:56] T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163 [12:22:57] T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164 [12:25:15] (03CR) 10Superpes15: [C:03+1] "Now seems good but please, before deploying, please check if they also want a wordmark" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [12:25:39] (03CR) 10Superpes15: [C:03+1] "*Tagline" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [12:32:58] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P87957 and previous config saved to /var/cache/conftool/dbconfig/20260127-123257-marostegui.json [12:35:30] jouncebot: nowandnext [12:35:30] No deployments scheduled for the next 0 hour(s) and 24 minute(s) [12:35:30] In 0 hour(s) and 24 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1300) [12:36:44] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the 
[Tuesday, January 27 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploy" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1214584 (https://phabricator.wikimedia.org/T410908) (owner: 10Cparle) [12:38:02] (03CR) 10GergesShamon: "@superpes15.itwiki@gmail.com, The tagline hasn't changed, and the community hasn't requested it be changed; the tagline in the logo is the" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [12:38:25] (03PS1) 10Samtar: Watchlist: do not double-escape labels, and always use `` [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233688 (https://phabricator.wikimedia.org/T415489) [12:41:22] (03CR) 10Neriah: [arwikibooks] Update logos and wordmark (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [12:42:19] (03CR) 10Majavah: [C:03+2] utils/pcc: Add bash completion script [puppet] - 10https://gerrit.wikimedia.org/r/1230341 (owner: 10Majavah) [12:43:04] (03PS4) 10D3r1ck01: EditWatchlistPaginate feature flag has been removed from MW code, so remove it from config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1214584 (https://phabricator.wikimedia.org/T410908) (owner: 10Cparle) [12:43:06] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P87958 and previous config saved to /var/cache/conftool/dbconfig/20260127-124305-marostegui.json [12:43:21] (03PS3) 10GergesShamon: [arwikibooks] Update logos and wordmark [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 [12:44:00] (03CR) 10GergesShamon: [arwikibooks] Update logos and wordmark (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [12:44:53] (03CR) 10GergesShamon: "Done" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [12:46:08] jouncebot: nowandnext [12:46:08] No deployments 
scheduled for the next 0 hour(s) and 13 minute(s) [12:46:08] In 0 hour(s) and 13 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1300) [12:53:15] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87959 and previous config saved to /var/cache/conftool/dbconfig/20260127-125314-marostegui.json [12:53:23] T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163 [12:53:24] T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164 [12:53:30] !log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance [12:53:39] !log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db2195 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87960 and previous config saved to /var/cache/conftool/dbconfig/20260127-125338-marostegui.json [12:55:34] !log fnegri@cumin1003 END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database kajwiki (T415041) [12:55:39] T415041: [wikireplicas] Create views for new wiki kajwiki - https://phabricator.wikimedia.org/T415041 [13:00:05] Deploy window Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1300) [13:03:23] jouncebot: nowandnext [13:03:23] For the next 0 hour(s) and 56 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1300) [13:03:23] In 0 hour(s) and 56 minute(s): UTC afternoon backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1400) [13:05:25] RESOLVED: SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - 
https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [13:06:27] (03PS1) 10Santiago Faci: Test Kitchen UI: Deploy v.1.1.7 release to staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233696 (https://phabricator.wikimedia.org/T415325) [13:07:00] (03PS1) 10Dzahn: zookeeper: add parameter and path to tls cert passphrase [puppet] - 10https://gerrit.wikimedia.org/r/1233697 (https://phabricator.wikimedia.org/T405119) [13:13:51] (03PS1) 10BCornwall: ssl: Remove unused digicert certificates [puppet] - 10https://gerrit.wikimedia.org/r/1233698 (https://phabricator.wikimedia.org/T414955) [13:16:37] (03CR) 10BCornwall: [V:03+1] "PCC SUCCESS (NOOP 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/7951/console" [puppet] - 10https://gerrit.wikimedia.org/r/1233698 (https://phabricator.wikimedia.org/T414955) (owner: 10BCornwall) [13:17:53] (03CR) 10Majavah: [C:03+2] hieradata: openstack: Use dedicated memcache user in eqiad1 [puppet] - 10https://gerrit.wikimedia.org/r/1230335 (https://phabricator.wikimedia.org/T273950) (owner: 10Majavah) [13:19:00] (03PS2) 10Majavah: hieradata: Use dedicated memcache user by default in Cloud VPS [puppet] - 10https://gerrit.wikimedia.org/r/1230336 (https://phabricator.wikimedia.org/T273950) [13:20:28] (03CR) 10Majavah: hieradata: Use dedicated memcache user by default in Cloud VPS (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/1230336 (https://phabricator.wikimedia.org/T273950) (owner: 10Majavah) [13:25:21] 06SRE, 10LDAP-Access-Requests, 06WMF-NDA-Requests: Grant Access to NDA for Johannnes89 - https://phabricator.wikimedia.org/T414789#11557725 (10Arnoldokoth) @Johannnes89 Np. Do we need to amend anything or this can be resolved? 
[13:26:56] 10ops-eqiad, 06SRE, 06DC-Ops: Alert for device ps1-e3-eqiad.mgmt.eqiad.wmnet - PDU sensor over limit - https://phabricator.wikimedia.org/T415466#11557737 (10Jclark-ctr) 05Open→03Resolved a:03Jclark-ctr [13:27:46] (03CR) 10TrainBranchBot: [C:03+2] "Approved by samtar@deploy2002 using scap backport" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1214584 (https://phabricator.wikimedia.org/T410908) (owner: 10Cparle) [13:28:40] (03Merged) 10jenkins-bot: EditWatchlistPaginate feature flag has been removed from MW code, so remove it from config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1214584 (https://phabricator.wikimedia.org/T410908) (owner: 10Cparle) [13:29:13] !log samtar@deploy2002 Started scap sync-world: Backport for [[gerrit:1214584|EditWatchlistPaginate feature flag has been removed from MW code, so remove it from config (T410908)]] [13:29:18] T410908: Remove the $wgEditWatchlistPaginate feature flag - https://phabricator.wikimedia.org/T410908 [13:29:41] FIRING: [9x] SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [13:31:20] !log samtar@deploy2002 samtar, cparle: Backport for [[gerrit:1214584|EditWatchlistPaginate feature flag has been removed from MW code, so remove it from config (T410908)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
[13:32:35] !log samtar@deploy2002 samtar, cparle: Continuing with sync [13:36:44] !log samtar@deploy2002 Finished scap sync-world: Backport for [[gerrit:1214584|EditWatchlistPaginate feature flag has been removed from MW code, so remove it from config (T410908)]] (duration: 07m 31s) [13:36:50] T410908: Remove the $wgEditWatchlistPaginate feature flag - https://phabricator.wikimedia.org/T410908 [13:39:14] FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [13:41:25] (03CR) 10TrainBranchBot: [C:03+2] "Approved by samtar@deploy2002 using scap backport" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233688 (https://phabricator.wikimedia.org/T415489) (owner: 10Samtar) [13:45:56] (03Merged) 10jenkins-bot: Watchlist: do not double-escape labels, and always use `` [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233688 (https://phabricator.wikimedia.org/T415489) (owner: 10Samtar) [13:46:26] !log samtar@deploy2002 Started scap sync-world: Backport for [[gerrit:1233688|Watchlist: do not double-escape labels, and always use `` (T415489)]] [13:46:31] T415489: HTML entities are double-escaped in watchlist filters - https://phabricator.wikimedia.org/T415489 [13:48:08] !log fnegri@cumin1003 START - Cookbook sre.wikireplicas.add-wiki for database kaiwiki (T414240) [13:48:14] T414240: [wikireplicas] Create views for new wiki kaiwiki - https://phabricator.wikimedia.org/T414240 [13:48:18] !log fnegri@cumin1003 END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database kaiwiki (T414240) [13:48:30] !log samtar@deploy2002 samtar: Backport for [[gerrit:1233688|Watchlist: do not double-escape labels, and always use `` (T415489)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. [13:49:03] !log fnegri@cumin1003 START - Cookbook sre.wikireplicas.add-wiki for database pplwiki (T415050) [13:49:08] T415050: [wikireplicas] Create views for new wiki pplwiki - https://phabricator.wikimedia.org/T415050 [13:49:13] !log fnegri@cumin1003 END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database pplwiki (T415050) [13:49:27] !log samtar@deploy2002 samtar: Continuing with sync [13:51:05] 10SRE-swift-storage, 06Data-Persistence, 10MediaViewer, 10Thumbor, 06Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11557872 (10Krinkle) 22px is the button size reserved by WikiEditor. Any different size will look off in the interface, including 1px... [13:52:05] (03PS1) 10Dreamy Jazz: CheckUser: Enable read new for user agent table migration on group1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233706 (https://phabricator.wikimedia.org/T361199) [13:53:31] !log samtar@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233688|Watchlist: do not double-escape labels, and always use `` (T415489)]] (duration: 07m 05s) [13:53:37] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploy" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233706 (https://phabricator.wikimedia.org/T361199) (owner: 10Dreamy Jazz) [13:53:38] T415489: HTML entities are double-escaped in watchlist filters - https://phabricator.wikimedia.org/T415489 [13:54:38] heck yeah [13:54:47] * TheresNoTime hides before the backport window actually starts /j [13:54:52] decided to fix the bug first after all? 
;) [13:55:32] Lucas_WMDE: didn't fancy deploying a surprise feature(tm) this week [13:55:36] (03PS1) 10Michael Große: metrics(ReviseTone): add missing instrumentation parameters [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233709 (https://phabricator.wikimedia.org/T415580) [13:55:50] * Lucas_WMDE is tempted to !bash that for some reason [13:55:56] (03CR) 10TrainBranchBot: [C:03+2] "Approved by dreamyjazz@deploy2002 using scap backport" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233706 (https://phabricator.wikimedia.org/T361199) (owner: 10Dreamy Jazz) [13:56:07] so many people deploying stuff before the window starts [13:56:12] (03CR) 10Vgutierrez: [C:03+1] ssl: Remove unused digicert certificates [puppet] - 10https://gerrit.wikimedia.org/r/1233698 (https://phabricator.wikimedia.org/T414955) (owner: 10BCornwall) [13:56:19] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploy" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233709 (https://phabricator.wikimedia.org/T415580) (owner: 10Michael Große) [13:56:58] (03Merged) 10jenkins-bot: CheckUser: Enable read new for user agent table migration on group1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233706 (https://phabricator.wikimedia.org/T361199) (owner: 10Dreamy Jazz) [13:57:30] !log dreamyjazz@deploy2002 Started scap sync-world: Backport for [[gerrit:1233706|CheckUser: Enable read new for user agent table migration on group1 (T361199)]] [13:57:35] T361199: Set user agent schema migration config to read new on WMF wikis - https://phabricator.wikimedia.org/T361199 [13:59:38] !log dreamyjazz@deploy2002 dreamyjazz: Backport for [[gerrit:1233706|CheckUser: Enable read new for user agent table migration on group1 (T361199)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. [14:00:02] (03PS7) 10Krinkle: varnish: Restrict unauth sitemap access to verified crawlers (cat B) [puppet] - 10https://gerrit.wikimedia.org/r/1233188 (https://phabricator.wikimedia.org/T407122) [14:00:04] Lucas_WMDE, Urbanecm, and TheresNoTime: Time to do the UTC afternoon backport window deploy. Don't look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1400). [14:00:05] Dreamy_Jazz, MichaelG_WMF, and Superpes: A patch you scheduled for UTC afternoon backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [14:01:27] hey hey :) [14:01:43] o/ [14:01:47] o/ [14:03:12] I can deploy once Dreamy_Jazz is done [14:03:24] <3 [14:03:53] mine can probably not be tested in a sensible way. It adds instrumentation parameters when a particular kind of edit is saved, which I should not do in production myself as staff. [14:04:11] Lucas_WMDE I'm finishing another patch and you can deploy both together :) [14:04:25] ok ^^ [14:04:44] !log dreamyjazz@deploy2002 dreamyjazz: Continuing with sync [14:06:03] (03PS1) 10Superpes15: [enwikibooks] Enable VisualEditor on Project and Transwiki namespaces [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233711 (https://phabricator.wikimedia.org/T415595) [14:08:15] Lucas_WMDE Done! I suppose you can deploy everything together (they are all quite easy) :) [14:08:35] thanks, looking [14:08:47] !log dreamyjazz@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233706|CheckUser: Enable read new for user agent table migration on group1 (T361199)]] (duration: 11m 16s) [14:08:53] T361199: Set user agent schema migration config to read new on WMF wikis - https://phabricator.wikimedia.org/T361199 [14:09:08] I'm one with my use of scap [14:10:28] thanks! 
[14:10:45] (03CR) 10TrainBranchBot: [C:03+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233709 (https://phabricator.wikimedia.org/T415580) (owner: 10Michael Große) [14:11:45] I’m trying to figure out if https://phabricator.wikimedia.org/T415595 should wait because the proposal has technically only been open for a few days (less than a week) [14:12:12] but https://meta.wikimedia.org/wiki/Requesting_wiki_configuration_changes doesn’t say that the discussion has to be open for X long before consensus can be said to have been reached [14:13:00] it’s probably fine to go ahead with it [14:13:07] Consensus reached, first discussion started on 9 January, no issue for me [14:13:11] (03Merged) 10jenkins-bot: metrics(ReviseTone): add missing instrumentation parameters [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233709 (https://phabricator.wikimedia.org/T415580) (owner: 10Michael Große) [14:13:44] !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1233709|metrics(ReviseTone): add missing instrumentation parameters (T415580)]] [14:13:52] T415580: Revise Tone instrumentation for a saved edit is missing important custom parameters - https://phabricator.wikimedia.org/T415580 [14:13:59] I guess https://phabricator.wikimedia.org/T392286 is in the same situation, https://meta.wikimedia.org/wiki/Talk:Universal_Code_of_Conduct/Coordinating_Committee#U4C_feedback_requested_re._phab:T392286 technically started 6 days ago [14:14:00] T392286: Signature button should appear in the edit toolbar on "Case" namespace on u4cwiki - https://phabricator.wikimedia.org/T392286 [14:14:11] Lucas_WMDE It concerns a private wiki [14:14:23] but I think it’s a harmless enough and unobjectionable change (that’s also safe to revert) [14:14:25] so, sure [14:14:31] Also it's an old issue :) [14:14:49] well yes, but a task 
doesn’t gain consensus just by sitting around for nine months with no reactions :P [14:15:07] I created the patch when I was in the U4C but I didn't have time to schedule the deploy [14:15:23] Yep yep, but at that time the consensus was clear, in our private discussion [14:15:53] !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, migr: Backport for [[gerrit:1233709|metrics(ReviseTone): add missing instrumentation parameters (T415580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [14:16:49] !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, migr: Continuing with sync [14:16:55] I just did a quick check that editing isn’t totally broken ^^ [14:17:13] ok, thanks^^ [14:17:20] hm, actually, mwdebug logstash has an error “Non-scalar value found in the event [14:17:20] ” [14:17:26] does that sound familiar MichaelG_WMF? [14:17:39] * Lucas_WMDE looks at the patch [14:17:54] I interrupted the job [14:18:01] I think it might be due to the patch [14:18:17] the patch has [14:18:17] 'revision_id' => $event->getLatestRevisionAfter(), [14:18:25] and the logstash entry has [14:18:29] prop_name: revision_id [14:18:40] prop_val_type: MediaWiki\Revision\RevisionStoreRecord [14:18:48] no does not sound familiar [14:18:58] MichaelG_WMF: I think you’re missing a ->getId() on that object? [14:19:11] yeah, this might be what is going on [14:19:17] getLatestRevisionAfter() returns a revision record not its ID [14:19:21] I thought I tested this :/ [14:19:36] I’ll revert on wmf.12 [14:19:57] thank you! 
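The bug diagnosed in the exchange above (an object passed where a scalar revision ID was expected) can be illustrated with a small sketch. This is not the GrowthExperiments or EventBus code: it is a Python analogue written under assumptions. `RevisionRecord`, `get_id`, `find_non_scalars`, and the event fields other than `revision_id` are hypothetical names chosen for illustration; only the `prop_name` / `prop_val_type` labels and the "missing `->getId()`" diagnosis come from the log.

```python
from dataclasses import dataclass

# Assumption: for this sketch, "scalar" means a plain string, number, or boolean.
SCALAR_TYPES = (str, int, float, bool)


@dataclass
class RevisionRecord:
    """Hypothetical stand-in for a revision object that wraps an integer ID."""
    rev_id: int

    def get_id(self) -> int:
        return self.rev_id


def find_non_scalars(event: dict) -> list[dict]:
    """Return one entry per event property whose value is not a scalar."""
    return [
        {"prop_name": name, "prop_val_type": type(value).__name__}
        for name, value in event.items()
        if not isinstance(value, SCALAR_TYPES)
    ]


latest = RevisionRecord(rev_id=12345)

# Broken: the whole record object is placed in the event.
broken_event = {"action": "save", "revision_id": latest}

# Fixed: only the record's integer ID is placed in the event.
fixed_event = {"action": "save", "revision_id": latest.get_id()}

print(find_non_scalars(broken_event))
print(find_non_scalars(fixed_event))
```

Running a check of this kind against a test edit before continuing a sync would surface the offending property name and type in the same shape as the logstash entry quoted above.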
[14:20:16] * MichaelG_WMF goes back, fixes their change, and triple-checks it [14:20:20] (03PS1) 10Lucas Werkmeister (WMDE): Revert "metrics(ReviseTone): add missing instrumentation parameters" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 [14:20:43] (03CR) 10BCornwall: [V:03+1 C:03+2] ssl: Remove unused digicert certificates [puppet] - 10https://gerrit.wikimedia.org/r/1233698 (https://phabricator.wikimedia.org/T414955) (owner: 10BCornwall) [14:20:48] (03PS2) 10Lucas Werkmeister (WMDE): Revert "metrics(ReviseTone): add missing instrumentation parameters" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 (https://phabricator.wikimedia.org/T415580) [14:20:55] (03CR) 10TrainBranchBot: [C:03+2] "Copied votes on follow-up patch sets have been updated:" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 (https://phabricator.wikimedia.org/T415580) (owner: 10Lucas Werkmeister (WMDE)) [14:21:03] (03CR) 10TrainBranchBot: "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 (https://phabricator.wikimedia.org/T415580) (owner: 10Lucas Werkmeister (WMDE)) [14:21:16] I’m glad I remembered to check mwdebug logstash [14:21:23] next step, learn to check it *before* continuing the deploy [14:21:30] but hey, still an improvement [14:21:55] (a while ago I deployed a faulty config – QuickSurvey iirc – that logged millions of warnings :S) [14:23:33] bleh, PS1 and PS2 of that change are both in zuul’s gate-and-submit-wmf queue… I think I’d better abort the first one [14:24:17] (03CR) 10CI reject: [V:04-1] Revert "metrics(ReviseTone): add missing instrumentation parameters" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 (https://phabricator.wikimedia.org/T415580) (owner: 10Lucas Werkmeister (WMDE)) 
[14:24:20] now it’s rebuilding PS2 yay [14:24:29] (and spiderpig is still running) [14:26:49] 06SRE, 10LDAP-Access-Requests, 06WMF-NDA-Requests: Grant Access to NDA for Johannnes89 - https://phabricator.wikimedia.org/T414789#11558081 (10Johannnes89) Due to my mistake https://gerrit.wikimedia.org/r/c/operations/puppet/+/1229200 refers to `johannnes89` (there's no account with that name) instead of `j8... [14:28:28] (03CR) 10Lucas Werkmeister (WMDE): Revert "metrics(ReviseTone): add missing instrumentation parameters" (031 comment) [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 (https://phabricator.wikimedia.org/T415580) (owner: 10Lucas Werkmeister (WMDE)) [14:34:43] (03Merged) 10jenkins-bot: Revert "metrics(ReviseTone): add missing instrumentation parameters" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233721 (https://phabricator.wikimedia.org/T415580) (owner: 10Lucas Werkmeister (WMDE)) [14:34:51] yay [14:35:19] !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1233721|Revert "metrics(ReviseTone): add missing instrumentation parameters" (T415580)]] [14:35:25] T415580: Revise Tone instrumentation for a saved edit is missing important custom parameters - https://phabricator.wikimedia.org/T415580 [14:37:27] !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for [[gerrit:1233721|Revert "metrics(ReviseTone): add missing instrumentation parameters" (T415580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [14:39:17] trying sandbox again [14:39:40] MichaelG_WMF: now I’m wondering, did I accidentally manage to hit a newcomer code path with my test edit earlier, or does every edit go through that method? 
[14:39:42] ^^ [14:40:53] ok, logstash looks healthier onw [14:40:54] *now [14:40:57] !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync [14:41:01] that's also something I want to look into. I'm not sure if we have that information available there or whether the plan is to correlate that we the newcomer task during analytics [14:41:27] ok [14:42:04] I'm figuring this out right now, together with Sergio. It would be really great if there would be a better way to test these things _before_ an experiment actually starts [14:43:39] (03PS1) 10Joal: Add comments to dse-k8s-eqiad airflow helmfiles [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233730 [14:44:21] fwiw the canaries didn’t flag up an unusual error rate in https://spiderpig.wikimedia.org/jobs/1241 [14:44:29] (it had just started sync-prod-k8s when I interrupted the deploy) [14:44:56] !log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233721|Revert "metrics(ReviseTone): add missing instrumentation parameters" (T415580)]] (duration: 09m 37s) [14:45:01] T415580: Revise Tone instrumentation for a saved edit is missing important custom parameters - https://phabricator.wikimedia.org/T415580 [14:45:05] if the canaries don’t get enough traffic to flag up errors that happen on every edit, that sounds concerning [14:45:22] * Lucas_WMDE checks mediawiki-errors logstash [14:45:30] I can't access that, but it does indeed sound troubling [14:45:45] hm, nothing in that logstash at all [14:45:52] I see only 1 event of that kind in my logstash board for event validation [14:46:03] aha, EventBus channel [14:46:10] ok there it is, 123 events overall [14:46:20] https://logstash.wikimedia.org/goto/5c26d0d11dcc48f8165fba59a6a866c9 [14:46:20] oh wait, now there is more [14:46:45] so that was something like half a dozen errors per minute [14:46:52] until the revert finished deploying [14:47:09] (all from canary hosts, except one pinkunicorn = mwdebug) 
[14:47:35] Superpes: just checking, are you still there? ^^ [14:47:44] For sure [14:47:45] we’re good to go for your changes now [14:47:46] thx [14:48:08] (03CR) 10Lucas Werkmeister (WMDE): [enwikibooks] Enable VisualEditor on Project and Transwiki namespaces (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233711 (https://phabricator.wikimedia.org/T415595) (owner: 10Superpes15) [14:48:11] (03CR) 10TrainBranchBot: [C:03+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1137428 (https://phabricator.wikimedia.org/T392286) (owner: 10Superpes15) [14:48:11] (03CR) 10TrainBranchBot: [C:03+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233711 (https://phabricator.wikimedia.org/T415595) (owner: 10Superpes15) [14:49:17] (03Merged) 10jenkins-bot: [u4cwiki] Add signature button to edit toolbar in Case namespace [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1137428 (https://phabricator.wikimedia.org/T392286) (owner: 10Superpes15) [14:49:21] (03Merged) 10jenkins-bot: [enwikibooks] Enable VisualEditor on Project and Transwiki namespaces [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233711 (https://phabricator.wikimedia.org/T415595) (owner: 10Superpes15) [14:49:53] !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1137428|[u4cwiki] Add signature button to edit toolbar in Case namespace (T392286)]], [[gerrit:1233711|[enwikibooks] Enable VisualEditor on Project and Transwiki namespaces (T415595)]] [14:50:02] T392286: Signature button should appear in the edit toolbar on "Case" namespace on u4cwiki - https://phabricator.wikimedia.org/T392286 [14:50:03] T415595: Enable VisualEditor in en.Wikibooks namespaces (Wikibooks and Transwiki) - https://phabricator.wikimedia.org/T415595 [14:50:54] * Lucas_WMDE checks when 
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/1233223 was merged [14:51:05] ah, it did make it into wmf.13 :S [14:51:26] yeah, I know. The fix will also need to be backported there [14:51:35] yeah [14:51:41] I was just gonna make it a train blocker for now [14:51:46] or do you want to revert it on wmf.13 first? [14:52:06] !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, superpes: Backport for [[gerrit:1137428|[u4cwiki] Add signature button to edit toolbar in Case namespace (T392286)]], [[gerrit:1233711|[enwikibooks] Enable VisualEditor on Project and Transwiki namespaces (T415595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [14:52:10] Testing [14:52:11] (the alternative would be to fix it on master and then backport that to wmf.13 and then unblock the train) [14:52:12] thanks [14:52:48] (03CR) 10Lucas Werkmeister (WMDE): "congrats on [SpiderPig job #1234](https://spiderpig.wikimedia.org/jobs/1234) ^^" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233228 (owner: 10Dreamy Jazz) [14:53:29] @Lucas_WMDE undecided. probably making it a train blocker makes more sense. this is important for the experiment we're running, so I don't want to wait another week. OTOH, we can just revert it now on wmf.13 and then backport the proper fix. [14:53:37] but that maybe makes it even more complicated [14:53:40] Lucas_WMDE Looks fine to me :) [14:54:09] thanks Superpes! [14:54:12] !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, superpes: Continuing with sync [14:54:31] MichaelG_WMF: ack, thanks. I’ll mark it as a blocker for now and leave a comment with the options [14:55:50] thank you! 
[14:58:15] !log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1137428|[u4cwiki] Add signature button to edit toolbar in Case namespace (T392286)]], [[gerrit:1233711|[enwikibooks] Enable VisualEditor on Project and Transwiki namespaces (T415595)]] (duration: 08m 22s) [14:58:23] T392286: Signature button should appear in the edit toolbar on "Case" namespace on u4cwiki - https://phabricator.wikimedia.org/T392286 [14:58:24] T415595: Enable VisualEditor in en.Wikibooks namespaces (Wikibooks and Transwiki) - https://phabricator.wikimedia.org/T415595 [14:58:36] Lucas_WMDE Thanks for your assistance (as always) :3 [14:59:11] np :) [14:59:19] !log UTC afternoon backport+config window done [14:59:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:00:04] Deploy window Test Kitchen UI Deployment Window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1500) [15:07:21] (03PS1) 10Dreamy Jazz: SI: Add applied filters to page_load instrumentation event [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233736 (https://phabricator.wikimedia.org/T415369) [15:07:26] jouncebot: nowandnext [15:07:26] For the next 0 hour(s) and 22 minute(s): Test Kitchen UI Deployment Window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1500) [15:07:26] In 0 hour(s) and 22 minute(s): Test Kitchen Experiment Deployment Window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1530) [15:07:40] Anyone mind if I backport during this window? 
[15:09:14] FIRING: [3x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [15:10:35] (03CR) 10TrainBranchBot: [C:03+2] "Approved by dreamyjazz@deploy2002 using scap backport" [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233736 (https://phabricator.wikimedia.org/T415369) (owner: 10Dreamy Jazz) [15:12:07] (03CR) 10Brouberol: [C:03+1] "LGTM with a tiny nit!" [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233730 (owner: 10Joal) [15:19:14] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate eventstreams-internal.discovery.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire [15:20:34] 10SRE-swift-storage, 06Data-Persistence, 10MediaViewer, 10Thumbor, 06Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11558341 (10Ladsgroup) To double check, should it be `background-size: cover;`? [15:21:16] 06SRE, 10SRE-Access-Requests: Requesting access to deployment for trueg - https://phabricator.wikimedia.org/T415632#11558344 (10Dzahn) a:03thcipriani Hi Tyler, what do you think? 
[15:21:42] (03Merged) 10jenkins-bot: SI: Add applied filters to page_load instrumentation event [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233736 (https://phabricator.wikimedia.org/T415369) (owner: 10Dreamy Jazz) [15:22:16] !log dreamyjazz@deploy2002 Started scap sync-world: Backport for [[gerrit:1233736|SI: Add applied filters to page_load instrumentation event (T415369)]] [15:22:17] 06SRE, 10SRE-Access-Requests: Requesting access to deployment for trueg - https://phabricator.wikimedia.org/T415632#11558348 (10Dzahn) @gmodena or @DSantamaria do you approve? [15:22:23] T415369: Suggested Investigations: Instrument the Filters - https://phabricator.wikimedia.org/T415369 [15:23:46] !log sukhe@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on cp2043.codfw.wmnet with reason: host not provisioned [15:24:25] !log dreamyjazz@deploy2002 dreamyjazz: Backport for [[gerrit:1233736|SI: Add applied filters to page_load instrumentation event (T415369)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
[15:26:11] !log dreamyjazz@deploy2002 dreamyjazz: Continuing with sync [15:30:05] Deploy window Test Kitchen Experiment Deployment Window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1530) [15:30:14] !log dreamyjazz@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233736|SI: Add applied filters to page_load instrumentation event (T415369)]] (duration: 07m 58s) [15:30:20] T415369: Suggested Investigations: Instrument the Filters - https://phabricator.wikimedia.org/T415369 [15:34:14] FIRING: [3x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [15:45:05] I'm going to need to backport again shortly [15:46:59] (03PS2) 10Joal: Add comments to dse-k8s-eqiad airflow helmfiles [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233730 [15:47:45] (03PS1) 10Dreamy Jazz: Follow-up: SI: Add applied filters to page_load instrumentation event [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233743 (https://phabricator.wikimedia.org/T415369) [15:47:54] (03CR) 10Dreamy Jazz: [C:03+2] Follow-up: SI: Add applied filters to page_load instrumentation event [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233743 (https://phabricator.wikimedia.org/T415369) (owner: 10Dreamy Jazz) [15:48:07] (03CR) 10TrainBranchBot: [C:03+2] "Approved by dreamyjazz@deploy2002 using scap backport" [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233743 (https://phabricator.wikimedia.org/T415369) (owner: 10Dreamy Jazz) [15:52:38] 06SRE, 06Infrastructure-Foundations, 10netops, 06Traffic: Map internet-bound upload traffic to low-priority QoS queue - https://phabricator.wikimedia.org/T415649#11558424 (10cmooney) [15:53:48] 10SRE-swift-storage, 06Data-Persistence, 
10MediaViewer, 10Thumbor, 06Traffic: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11558427 (10Krinkle) >>! In T414805#11558341, @Ladsgroup wrote: > To double check, should it be `background-size: cover;`? No, that w... [15:57:43] (03CR) 10Joal: Add comments to dse-k8s-eqiad airflow helmfiles (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233730 (owner: 10Joal) [16:00:05] jelto, arnoldokoth, mutante, and arnaudb: May I have your attention please! SRE Collaboration Services office hours. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1600) [16:01:16] (03Merged) 10jenkins-bot: Follow-up: SI: Add applied filters to page_load instrumentation event [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233743 (https://phabricator.wikimedia.org/T415369) (owner: 10Dreamy Jazz) [16:01:46] !log dreamyjazz@deploy2002 Started scap sync-world: Backport for [[gerrit:1233743|Follow-up: SI: Add applied filters to page_load instrumentation event (T415369)]] [16:01:53] T415369: Suggested Investigations: Instrument the Filters - https://phabricator.wikimedia.org/T415369 [16:03:54] !log dreamyjazz@deploy2002 dreamyjazz: Backport for [[gerrit:1233743|Follow-up: SI: Add applied filters to page_load instrumentation event (T415369)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [16:08:11] Testing.... 
[16:08:58] !log dreamyjazz@deploy2002 dreamyjazz: Continuing with sync [16:13:04] !log dreamyjazz@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233743|Follow-up: SI: Add applied filters to page_load instrumentation event (T415369)]] (duration: 11m 18s) [16:13:10] T415369: Suggested Investigations: Instrument the Filters - https://phabricator.wikimedia.org/T415369 [16:19:37] (03PS1) 10Dreamy Jazz: CheckUser: Read new for user agent table migration everywhere [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233752 (https://phabricator.wikimedia.org/T361199) [16:19:38] jouncebot: nowandnext [16:19:38] For the next 0 hour(s) and 40 minute(s): SRE Collaboration Services office hours (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1600) [16:19:39] In 0 hour(s) and 40 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1700) [16:20:35] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, January 28 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#depl" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233752 (https://phabricator.wikimedia.org/T361199) (owner: 10Dreamy Jazz) [16:27:06] 10ops-eqiad, 06SRE, 06DC-Ops, 06Data-Platform-SRE (2026.01.23 - 2026.02.13): Q3:rack/setup/install dse-k8s-worker10[20-22] - https://phabricator.wikimedia.org/T414216#11558586 (10Jclark-ctr) a:05BTullis→03None [16:46:42] (03PS2) 10Phuedx: TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 [16:46:54] !log dancy@deploy2002 Installing scap version "4.237.0" for 2 host(s) [16:47:34] (03CR) 10CI reject: [V:04-1] TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [16:48:45] !log dancy@deploy2002 Installation of scap version "4.237.0" completed for 2 hosts [16:52:56] (03PS1) 10Federico Ceratto: clone: setup and start 
repl on target host early [cookbooks] - 10https://gerrit.wikimedia.org/r/1233761 (https://phabricator.wikimedia.org/T415564) [17:00:05] jhathaway and rzl: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Puppet request window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1700). [17:00:05] No Gerrit patches in the queue for this window AFAICS. [17:02:56] jhathaway, rzl: if either of you are bored I have a Puppet patch that is running as a cherry-pick in Beta that could land -- https://gerrit.wikimedia.org/r/c/operations/puppet/+/1229186 [17:03:16] bd808: I can take a look after our summit stuff wraps up for the day [17:05:54] There are many better things to do in Lisbon than merge Puppet patches, but if you find time it would be appreciated. [17:07:07] I'm remoting in on a godawful mix of UTC-8 and UTC, so when they're heading out for a glamorous dinner on the town I'll be free to have a look :) [17:10:07] (03CR) 10Majavah: [V:03+1] "PCC SUCCESS (NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/7952/console" [puppet] - 10https://gerrit.wikimedia.org/r/1229186 (https://phabricator.wikimedia.org/T415113) (owner: 10BryanDavis) [17:11:39] (03CR) 10Majavah: [V:03+1 C:03+2] "PCC on prod is no-op, and the cherry-pick works as expected." 
[puppet] - 10https://gerrit.wikimedia.org/r/1229186 (https://phabricator.wikimedia.org/T415113) (owner: 10BryanDavis) [17:29:41] FIRING: [9x] SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [17:31:03] FIRING: MediaWikiEditFailures: Elevated MediaWiki edit failures (session_loss) for cluster - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000208/edit-count?orgId=1&viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiEditFailures [17:38:30] (03PS1) 10Clare Ming: ext.testKitchen: Add distinct config for each intake URL [extensions/TestKitchen] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233770 [17:38:48] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-i" [extensions/TestKitchen] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233770 (owner: 10Clare Ming) [17:39:17] (03PS1) 10Clare Ming: ext.testKitchen: Add distinct config for each intake URL [extensions/TestKitchen] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233771 [17:39:34] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-i" [extensions/TestKitchen] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233771 (owner: 10Clare Ming) [17:40:08] (03PS3) 10Phuedx: TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 [17:40:51] Thanks for that merge taavi. 
:) [17:40:58] (03CR) 10CI reject: [V:04-1] TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [17:43:01] PROBLEM - Check unit status of statograph_post on alert1002 is CRITICAL: CRITICAL: Status of the systemd unit statograph_post https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state [17:43:50] (03PS1) 10Brennen Bearnes: metrics(ReviseTone): revision id must be an integer [extensions/GrowthExperiments] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233773 (https://phabricator.wikimedia.org/T415580) [17:45:32] (03PS4) 10Clare Ming: TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [17:48:44] (03CR) 10ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-i" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [17:50:44] (03PS1) 10Brouberol: airflow: store large XCOMs in s3 to alleviate load on the database [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) [17:53:01] RECOVERY - Check unit status of statograph_post on alert1002 is OK: OK: Status of the systemd unit statograph_post https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state [17:53:43] (03PS2) 10Brouberol: airflow: store large XCOMs in s3 to alleviate load on the database [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) [18:00:05] Deploy window MediaWiki infrastructure (UTC late) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1800) [18:08:41] (03PS1) 10Clare Ming: Update pageVisitBotDetection [extensions/WikimediaEvents] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233783 (https://phabricator.wikimedia.org/T411453) [18:08:55] (03CR) 
10ScheduleDeploymentBot: "Scheduled for deployment in the [Tuesday, January 27 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-i" [extensions/WikimediaEvents] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233783 (https://phabricator.wikimedia.org/T411453) (owner: 10Clare Ming) [18:19:18] (03PS1) 10Dreamy Jazz: SI: Provide caller for user IDs query in cases pager [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233788 (https://phabricator.wikimedia.org/T415694) [18:19:23] jouncebot: nowandnext [18:19:23] For the next 0 hour(s) and 40 minute(s): MediaWiki infrastructure (UTC late) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1800) [18:19:23] In 0 hour(s) and 40 minute(s): MediaWiki train - Utc-7+Utc-0 Version (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1900) [18:19:52] Going to deploy a non-urgent backport to fix some potential for log spam in wmf.13 [18:19:58] (03CR) 10TrainBranchBot: [C:03+2] "Approved by dreamyjazz@deploy2002 using scap backport" [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233788 (https://phabricator.wikimedia.org/T415694) (owner: 10Dreamy Jazz) [18:21:04] (03CR) 10Michael Große: [C:03+1] metrics(ReviseTone): revision id must be an integer [extensions/GrowthExperiments] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233773 (https://phabricator.wikimedia.org/T415580) (owner: 10Brennen Bearnes) [18:23:19] (03CR) 10Btullis: "Will we need any new housekeeping routines for the XCOM files on S3?" [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) (owner: 10Brouberol) [18:25:53] Dreamy_Jazz: mind pinging when you're done with that? i have a train-related backport. [18:26:04] Yeah, I can [18:26:20] I can also stop my scap so they can be deployed at the same time? 
[18:26:35] (My backport hasn't actually merged yet and it is low-prio) [18:26:52] Dreamy_Jazz: i can sling out both at once if you'd like. [18:26:55] Sure [18:27:24] My change is https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/1233788 [18:27:27] Stopped scap [18:29:17] (03CR) 10TrainBranchBot: [C:03+2] "Approved by brennen@deploy2002 using scap backport" [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233788 (https://phabricator.wikimedia.org/T415694) (owner: 10Dreamy Jazz) [18:29:17] (03CR) 10TrainBranchBot: [C:03+2] "Approved by brennen@deploy2002 using scap backport" [extensions/GrowthExperiments] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233773 (https://phabricator.wikimedia.org/T415580) (owner: 10Brennen Bearnes) [18:29:41] Thanks! [18:29:53] sure thing. will there be anything to test with that one? [18:30:36] (03Merged) 10jenkins-bot: SI: Provide caller for user IDs query in cases pager [extensions/CheckUser] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233788 (https://phabricator.wikimedia.org/T415694) (owner: 10Dreamy Jazz) [18:31:03] RESOLVED: MediaWikiEditFailures: Elevated MediaWiki edit failures (session_loss) for cluster - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000208/edit-count?orgId=1&viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiEditFailures [18:31:24] I can test that the logstash log doesn't get created, but the change is clear enough that I don't think I'll need to test [18:31:54] But can be around if you'd like me to test it [18:34:38] (03CR) 10Joal: [C:03+1] airflow: store large XCOMs in s3 to alleviate load on the database (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) (owner: 10Brouberol) [18:36:48] Dreamy_Jazz: always good for peace of mind, though i expect it'll be fine. 
[18:37:13] Sure, it will be quick to test so happy to [18:41:27] (03Merged) 10jenkins-bot: metrics(ReviseTone): revision id must be an integer [extensions/GrowthExperiments] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233773 (https://phabricator.wikimedia.org/T415580) (owner: 10Brennen Bearnes) [18:42:03] !log brennen@deploy2002 Started scap sync-world: Backport for [[gerrit:1233788|SI: Provide caller for user IDs query in cases pager (T415694)]], [[gerrit:1233773|metrics(ReviseTone): revision id must be an integer (T415580)]] [18:42:13] T415694: Suggested Investigations: Caller not provided for query for user IDs in cases pager - https://phabricator.wikimedia.org/T415694 [18:42:13] T415580: Revise Tone instrumentation for a saved edit is missing important custom parameters - https://phabricator.wikimedia.org/T415580 [18:44:11] !log brennen@deploy2002 brennen, dreamyjazz: Backport for [[gerrit:1233788|SI: Provide caller for user IDs query in cases pager (T415694)]], [[gerrit:1233773|metrics(ReviseTone): revision id must be an integer (T415580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [18:44:22] Testing... [18:45:16] My change works [18:45:25] Dreamy_Jazz: y [18:45:27] er, ty. 
:) [18:45:32] :D [18:45:37] !log brennen@deploy2002 brennen, dreamyjazz: Continuing with sync [18:46:48] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1203 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87970 and previous config saved to /var/cache/conftool/dbconfig/20260127-184647-marostegui.json [18:46:57] T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163 [18:46:57] T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164 [18:49:49] !log brennen@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233788|SI: Provide caller for user IDs query in cases pager (T415694)]], [[gerrit:1233773|metrics(ReviseTone): revision id must be an integer (T415580)]] (duration: 07m 46s) [18:49:56] T415694: Suggested Investigations: Caller not provided for query for user IDs in cases pager - https://phabricator.wikimedia.org/T415694 [18:49:58] T415580: Revise Tone instrumentation for a saved edit is missing important custom parameters - https://phabricator.wikimedia.org/T415580 [18:56:57] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P87971 and previous config saved to /var/cache/conftool/dbconfig/20260127-185656-marostegui.json [19:00:04] brennen and andre: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) MediaWiki train - Utc-7+Utc-0 Version deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T1900). 
[19:00:39] FIRING: CoreBGPDown: Core BGP session down between cr1-eqiad and cr2-eqord (208.80.154.198) - group Confed_eqord - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://grafana.wikimedia.org/d/ed8da087-4bcb-407d-9596-d158b8145d45/bgp-neighbors-detail?orgId=1&var-site=eqiad&var-device=cr1-eqiad:9804&var-bgp_group=Confed_eqord&var-bgp_neighbor=cr2-eqord - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown [19:01:17] FIRING: ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1014:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:01:40] !log 1.46.0-wmf.13 train status (T413804): currently blocked by T415619 [19:01:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:01:48] T413804: 1.46.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T413804 [19:01:48] T415619: Creation of dynamic property MediaWiki\Language\Dependency\FileDependency::$filename is deprecated {"exception":"[object] (ErrorException(code: 0) - https://phabricator.wikimedia.org/T415619 [19:02:47] brennen: Does that actually block the train, or just TWN? [19:02:53] FIRING: [2x] CoreRouterInterfaceDown: Core router interface down - cr2-eqiad:xe-1/0/1:0 (Transport: cr2-eqord:xe-0/1/5 (Arelion, IC-314533 24ms 10Gbps wave) {#10180823000321:0}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown [19:03:14] James_F: yeah, good question. [19:04:05] i note T415597, so maybe seen elsewhere. poking around in logstash. 
[19:04:05] T415597: PHP Deprecated: Creation of dynamic property MediaWiki\Language\Dependency\FileDependency::$filename is deprecated - https://phabricator.wikimedia.org/T415597 [19:04:18] Hmm, yeah, same issue I imagine? [19:04:55] also given this is a deprecation warning it may not need to block unless it's a high volume [19:05:39] FIRING: [4x] CoreBGPDown: Core BGP session down between cr1-eqiad and cr2-eqord (208.80.154.198) - group Confed_eqord - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown [19:06:17] FIRING: [2x] ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#wdqs1014:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:07:05] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P87972 and previous config saved to /var/cache/conftool/dbconfig/20260127-190704-marostegui.json [19:11:49] James_F: ~800 similar errors on testwiki / testcommonswiki, so sort of seems like it could blow up in volume. [19:11:59] Ack. 
[19:17:13] !log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1203 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87973 and previous config saved to /var/cache/conftool/dbconfig/20260127-191713-marostegui.json [19:17:18] !log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance [19:17:22] T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163 [19:17:22] T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164 [19:17:27] !log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1214 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87974 and previous config saved to /var/cache/conftool/dbconfig/20260127-191726-marostegui.json [19:19:14] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate eventstreams-internal.discovery.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire [19:19:48] (03PS1) 10PipelineBot: citoid: pipeline bot promote [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233810 [19:34:15] FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [19:56:17] FIRING: [3x] ProbeDown: Service wdqs1014:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:57:47] !log removing 1 file for legal compliance [19:57:49] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:01:17] FIRING: [5x] ProbeDown: Service wdqs1013:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:11:17] FIRING: [6x] ProbeDown: Service wdqs1013:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:28:42] (03CR) 10Brouberol: airflow: store large XCOMs in s3 to alleviate load on the database (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) (owner: 10Brouberol) [20:32:53] RESOLVED: [2x] CoreRouterInterfaceDown: Core router interface down - cr2-eqiad:xe-1/0/1:0 (Transport: cr2-eqord:xe-0/1/5 (Arelion, IC-314533 24ms 10Gbps wave) {#10180823000321:0}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown [20:35:39] RESOLVED: [4x] CoreBGPDown: Core BGP session down between cr1-eqiad and cr2-eqord (208.80.154.198) - group Confed_eqord - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown [20:35:45] (03PS1) 10Reedy: FileBasedMessageGroupFactory: Update cache version [extensions/Translate] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233822 (https://phabricator.wikimedia.org/T415619) [20:37:25] brennen: ^ That missed the train... 
It's not going to fix everything (or possibly much, based on the bug report), but it shouldn't harm either [20:38:16] <_Gerges> ping [20:41:17] FIRING: [8x] ProbeDown: Service wdqs1011:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:43:02] (03CR) 10Brouberol: airflow: store large XCOMs in s3 to alleviate load on the database (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) (owner: 10Brouberol) [20:44:05] Reedy: ack, will backport. [20:44:36] (03CR) 10TrainBranchBot: [C:03+2] "Approved by brennen@deploy2002 using scap backport" [extensions/Translate] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233822 (https://phabricator.wikimedia.org/T415619) (owner: 10Reedy) [20:45:32] (03PS3) 10Brouberol: airflow: store large XCOMs in s3 to alleviate load on the database [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) [20:45:34] (03CR) 10Brouberol: "I'm unse" [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) (owner: 10Brouberol) [20:51:17] FIRING: [10x] ProbeDown: Service wdqs1011:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:51:46] Reedy: have any thoughts on rolling this forward? 30 or 40 an hour of these aren't that worrisome from a logspam perspective but i'm not sure whether to expect it to be a much higher volume. 
[20:59:47] note for backport window: scap still running, feel free to take over after that finishes [21:00:05] RoanKattouw, Urbanecm, TheresNoTime, kindrobot, and cjming: gettimeofday() says it's time for UTC late backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T2100) [21:00:05] _Gerges and cjming: A patch you scheduled for UTC late backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [21:00:27] <_Gerges> here [21:00:56] o/ [21:01:17] FIRING: [11x] ProbeDown: Service wdqs1011:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:01:47] _Gerges: do you need a deployer? [21:02:03] <_Gerges> yes [21:02:43] i can deploy for you - let's see if the current scap finishes soon 🤞 [21:03:58] hopefully soon - thanks cjming, i'm a touch distracted currently, having a plumbing situation after our recent cold snap [21:05:22] thanks brennen - will keep an eye on things - good luck with your plumbing! [21:06:17] FIRING: [12x] ProbeDown: Service wdqs1011:443 has failed probes (http_wdqs_main_external_search_sparql_endpoint_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:13:02] is 27 minutes to merge reasonable? it seems a bit stalled - https://integration.wikimedia.org/ci/job/quibble-with-gated-extensions-vendor-mysql-php83/9469/console [21:14:48] yeah, this seems excessive [21:14:53] cjming: hmm.... maybe? We have some challenges seeing what is going on inside jobs that are running phpunit batches in parallel. 
[21:16:23] it's sitting on ready to submit; i always forget whether that's a pathological scenario and what to do about it if so [21:16:49] [meanwhile: loud crunching noises from the drainpipe] [21:17:08] is there a way to bypass the jenkins job that it's choking on? [21:17:21] phab search found T383932, which - if the current patch is experiencing the same issue - might anecdotally imply that this CI job might remain stalled until the CI 60min timeout (if it's not manually cancelled before that) [21:17:22] T383932: Build timed out while quibble is waiting for Post-dependency install, pre-database dependent steps: 3440s elapsed, 1/2 completed - https://phabricator.wikimedia.org/T383932 [21:17:37] ^^ [21:18:11] you can cancel the job and then resubmit, but I would not advocate a force merge [21:20:53] thanks bd808 - i guess that is what i'll do - cancelling #9469 [21:21:18] (03CR) 10CI reject: [V:04-1] FileBasedMessageGroupFactory: Update cache version [extensions/Translate] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233822 (https://phabricator.wikimedia.org/T415619) (owner: 10Reedy) [21:22:17] i'll try re-scap backporting 1233822 and then if that finishes, move onto the first config patch in the queue [21:23:03] (03CR) 10TrainBranchBot: [C:03+2] "Approved by cjming@deploy2002 using scap backport" [extensions/Translate] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233822 (https://phabricator.wikimedia.org/T415619) (owner: 10Reedy) [21:24:47] (03Merged) 10jenkins-bot: FileBasedMessageGroupFactory: Update cache version [extensions/Translate] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233822 (https://phabricator.wikimedia.org/T415619) (owner: 10Reedy) [21:25:21] that was quick [21:25:22] !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1233822|FileBasedMessageGroupFactory: Update cache version (T415619)]] [21:25:28] T415619: Creation of dynamic property MediaWiki\Language\Dependency\FileDependency::$filename is 
deprecated {"exception":"[object] (ErrorException(code: 0) - https://phabricator.wikimedia.org/T415619 [21:25:29] test result caching :D [21:25:39] nice [21:25:54] yeah, the test cache stuff that dduvall did is awesome. [21:27:45] !log cjming@deploy2002 cjming, reedy: Backport for [[gerrit:1233822|FileBasedMessageGroupFactory: Update cache version (T415619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [21:28:06] presumably i can sync? [21:29:08] going to error on syncing [21:29:13] not sure how to test [21:29:37] !log cjming@deploy2002 cjming, reedy: Continuing with sync [21:29:41] FIRING: [9x] SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [21:30:34] (03PS4) 10GergesShamon: [arwikibooks] Update logos and wordmark [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 [21:31:41] Just curious, what ended up being wrong with the logo stuff _Gerges? 
[21:32:51] <_Gerges> I was using an older version of the logo update tool in Tox [21:33:19] _Gerges: your config patch is next up [21:33:46] !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233822|FileBasedMessageGroupFactory: Update cache version (T415619)]] (duration: 08m 24s) [21:34:00] T415619: Creation of dynamic property MediaWiki\Language\Dependency\FileDependency::$filename is deprecated {"exception":"[object] (ErrorException(code: 0) - https://phabricator.wikimedia.org/T415619 [21:34:17] (03CR) 10TrainBranchBot: [C:03+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [21:35:10] (03Merged) 10jenkins-bot: [arwikibooks] Update logos and wordmark [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1195183 (owner: 10GergesShamon) [21:35:33] (03PS1) 10Xcollazo: analytics: refinery: add data purge for File Export. [puppet] - 10https://gerrit.wikimedia.org/r/1233836 (https://phabricator.wikimedia.org/T414389) [21:35:41] !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1195183|[arwikibooks] Update logos and wordmark]] [21:37:54] !log cjming@deploy2002 cjming, gergesshamon: Backport for [[gerrit:1195183|[arwikibooks] Update logos and wordmark]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [21:38:22] _Gerges: want to check? [21:38:58] lmk if/when i can sync [21:39:03] <_Gerges> done [21:39:14] gtg then i assume? [21:39:27] <_Gerges> https://usercontent.irccloud-cdn.com/file/f26VBfBV/image.png [21:39:37] <_Gerges> yes [21:39:41] cool [21:39:44] !log cjming@deploy2002 cjming, gergesshamon: Continuing with sync [21:40:50] (03PS2) 10Xcollazo: analytics: refinery: add data purge for File Export. [puppet] - 10https://gerrit.wikimedia.org/r/1233836 (https://phabricator.wikimedia.org/T414389) [21:41:12] brennen: It does look like most of the prod noise is Translate... 
[21:42:17] And most of that will disappear as things get cache expired.... But I suspect new wikis will bring more for a bit [21:42:49] (03CR) 10Xcollazo: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/1233836 (https://phabricator.wikimedia.org/T414389) (owner: 10Xcollazo) [21:42:56] i can roll forward (after backport window) and see what happens. [21:43:50] !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1195183|[arwikibooks] Update logos and wordmark]] (duration: 08m 09s) [21:44:01] running purgeList no [21:44:02] now [21:45:37] _Gerges: should be live - all files were purged [21:45:49] (03CR) 10Xcollazo: "PPC looks good." [puppet] - 10https://gerrit.wikimedia.org/r/1233836 (https://phabricator.wikimedia.org/T414389) (owner: 10Xcollazo) [21:47:09] (03CR) 10TrainBranchBot: [C:03+2] "Approved by cjming@deploy2002 using scap backport" [extensions/TestKitchen] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233770 (owner: 10Clare Ming) [21:47:09] (03CR) 10TrainBranchBot: [C:03+2] "Approved by cjming@deploy2002 using scap backport" [extensions/TestKitchen] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233771 (owner: 10Clare Ming) [21:47:28] <_Gerges> thanks [21:47:33] yw! 
[21:48:15] (03Merged) 10jenkins-bot: ext.testKitchen: Add distinct config for each intake URL [extensions/TestKitchen] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233770 (owner: 10Clare Ming) [21:53:44] (03Merged) 10jenkins-bot: ext.testKitchen: Add distinct config for each intake URL [extensions/TestKitchen] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233771 (owner: 10Clare Ming) [21:53:53] (03CR) 10Ahmon Dancy: [C:03+2] "Approved by Jeena" [deployment-charts] - 10https://gerrit.wikimedia.org/r/1227821 (https://phabricator.wikimedia.org/T401197) (owner: 10Clément Goubert) [21:54:15] !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1233770|ext.testKitchen: Add distinct config for each intake URL]], [[gerrit:1233771|ext.testKitchen: Add distinct config for each intake URL]] [21:55:11] (03Merged) 10jenkins-bot: charts: Remove unused chart mediawiki-dev [deployment-charts] - 10https://gerrit.wikimedia.org/r/1227821 (https://phabricator.wikimedia.org/T401197) (owner: 10Clément Goubert) [21:55:24] cjming: mind giving me a shout when you're wrapping up? [21:55:42] brennen: sure thing -- just one more config patch and backport - that ok? [21:55:48] yeah, sounds good. [21:56:02] then i'll give group0 a shot. [21:56:25] !log cjming@deploy2002 cjming: Backport for [[gerrit:1233770|ext.testKitchen: Add distinct config for each intake URL]], [[gerrit:1233771|ext.testKitchen: Add distinct config for each intake URL]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
[21:56:47] !log cjming@deploy2002 cjming: Continuing with sync [21:58:23] (03PS5) 10Clare Ming: TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [22:00:04] Deploy window Web Team deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260127T2200) [22:00:49] !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233770|ext.testKitchen: Add distinct config for each intake URL]], [[gerrit:1233771|ext.testKitchen: Add distinct config for each intake URL]] (duration: 06m 34s) [22:01:12] (03CR) 10TrainBranchBot: [C:03+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [22:02:18] (03Merged) 10jenkins-bot: TestKitchen: Add event intake service URLs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233659 (owner: 10Phuedx) [22:02:48] !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1233659|TestKitchen: Add event intake service URLs]] [22:04:57] !log cjming@deploy2002 phuedx, cjming: Backport for [[gerrit:1233659|TestKitchen: Add event intake service URLs]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
[22:05:20] !log cjming@deploy2002 phuedx, cjming: Continuing with sync [22:09:25] !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233659|TestKitchen: Add event intake service URLs]] (duration: 06m 37s) [22:09:32] last one [22:09:49] (03CR) 10TrainBranchBot: [C:03+2] "Approved by cjming@deploy2002 using scap backport" [extensions/WikimediaEvents] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233783 (https://phabricator.wikimedia.org/T411453) (owner: 10Clare Ming) [22:14:19] (03Merged) 10jenkins-bot: Update pageVisitBotDetection [extensions/WikimediaEvents] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233783 (https://phabricator.wikimedia.org/T411453) (owner: 10Clare Ming) [22:14:50] !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1233783|Update pageVisitBotDetection (T411453)]] [22:16:58] !log cjming@deploy2002 cjming: Backport for [[gerrit:1233783|Update pageVisitBotDetection (T411453)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [22:17:31] !log cjming@deploy2002 cjming: Continuing with sync [22:21:36] !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1233783|Update pageVisitBotDetection (T411453)]] (duration: 06m 46s) [22:22:02] brennen: all yours [22:22:16] thanks for your patience [22:23:03] cjming: thanks! [22:23:19] np! 
[22:24:39] (03PS1) 10TrainBranchBot: group0 to 1.46.0-wmf.13 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233845 (https://phabricator.wikimedia.org/T413804) [22:24:42] (03CR) 10TrainBranchBot: [C:03+2] "Initiated by brennen@deploy2002" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233845 (https://phabricator.wikimedia.org/T413804) (owner: 10TrainBranchBot) [22:25:29] (03CR) 10Btullis: [C:03+1] airflow: store large XCOMs in s3 to alleviate load on the database [deployment-charts] - 10https://gerrit.wikimedia.org/r/1233776 (https://phabricator.wikimedia.org/T415661) (owner: 10Brouberol) [22:25:42] (03Merged) 10jenkins-bot: group0 to 1.46.0-wmf.13 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233845 (https://phabricator.wikimedia.org/T413804) (owner: 10TrainBranchBot) [22:37:28] yep, ok, that's not gonna work. [22:37:56] (03PS1) 10TrainBranchBot: testwikis to 1.46.0-wmf.13 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233848 (https://phabricator.wikimedia.org/T413804) [22:37:59] (03CR) 10TrainBranchBot: [C:03+2] "Initiated by brennen@deploy2002" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233848 (https://phabricator.wikimedia.org/T413804) (owner: 10TrainBranchBot) [22:38:54] (03CR) 10CI reject: [V:04-1] testwikis to 1.46.0-wmf.13 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233848 (https://phabricator.wikimedia.org/T413804) (owner: 10TrainBranchBot) [22:39:20] Does look like most of them are in Translate... 
[22:43:50] (03CR) 10Brennen Bearnes: "recheck" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233848 (https://phabricator.wikimedia.org/T413804) (owner: 10TrainBranchBot) [22:48:00] (03Abandoned) 10Brennen Bearnes: testwikis to 1.46.0-wmf.13 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233848 (https://phabricator.wikimedia.org/T413804) (owner: 10TrainBranchBot) [22:50:42] !log brennen@deploy2002 Started scap sync-world: Reverting 1.46.0-wmf.13 to testwikis (T413804) [22:50:47] T413804: 1.46.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T413804 [22:57:16] !log brennen@deploy2002 Finished scap sync-world: Reverting 1.46.0-wmf.13 to testwikis (T413804) (duration: 06m 58s) [22:57:21] T413804: 1.46.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T413804 [22:58:20] (03PS1) 10Brennen Bearnes: Revert "group0 to 1.46.0-wmf.13" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233856 (https://phabricator.wikimedia.org/T413804) [22:59:53] (03CR) 10CI reject: [V:04-1] Revert "group0 to 1.46.0-wmf.13" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233856 (https://phabricator.wikimedia.org/T413804) (owner: 10Brennen Bearnes) [23:07:27] (03PS1) 10Zabe: build: Upgrade PHPUnit from 10.5.58 to 10.5.62 to unblock CI [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) [23:07:45] (03PS2) 10Zabe: build: Upgrade PHPUnit from 10.5.59 to 10.5.62 to unblock CI [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) [23:09:20] (03CR) 10Jforrester: "Oh, right, I need to reply about why we had composer.lock in this repo." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) (owner: 10Zabe) [23:10:00] (03CR) 10Jforrester: [C:03+1] "We should land this immediately to unblock deploys." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) (owner: 10Zabe) [23:10:15] brennen: Are you OK to sync ^? [23:10:21] (03PS1) 10Reedy: Updated phpunit/phpunit from 9.6.21 to 9.6.33 [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233859 (https://phabricator.wikimedia.org/T415723) [23:10:29] (03CR) 10Reedy: [C:03+2] Updated phpunit/phpunit from 9.6.21 to 9.6.33 [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233859 (https://phabricator.wikimedia.org/T415723) (owner: 10Reedy) [23:10:31] James_F: yessir [23:10:36] (03PS1) 10Reedy: Updated phpunit/phpunit from 9.6.21 to 9.6.33 [core] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233860 (https://phabricator.wikimedia.org/T415723) [23:10:41] Well, Reedy is merging to prod branches right now… [23:10:44] (03CR) 10Reedy: [C:03+2] Updated phpunit/phpunit from 9.6.21 to 9.6.33 [core] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233860 (https://phabricator.wikimedia.org/T415723) (owner: 10Reedy) [23:10:46] So presumably that's handled? ;-) [23:12:42] All three are really needed :P [23:12:53] (03PS1) 10Jforrester: Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) [23:13:08] And ^^^ revert to unblock the train, hopefully. [23:13:17] (03CR) 10Reedy: [C:03+2] build: Upgrade PHPUnit from 10.5.59 to 10.5.62 to unblock CI [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) (owner: 10Zabe) [23:13:18] (03CR) 10Brennen Bearnes: [C:03+2] build: Upgrade PHPUnit from 10.5.59 to 10.5.62 to unblock CI [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) (owner: 10Zabe) [23:13:20] (Yay serialisation, have I ever said how much I love you?) 
[23:13:50] believe we'll need to make sure the current wikiversions merges as well before syncing [23:14:12] all of this is what i get for throwing caution to the winds about a late-day train deploy [23:14:30] (03Merged) 10jenkins-bot: build: Upgrade PHPUnit from 10.5.59 to 10.5.62 to unblock CI [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233858 (https://phabricator.wikimedia.org/T415723) (owner: 10Zabe) [23:14:33] brennen: you broke phpunit? [23:14:35] :P [23:15:13] haha, no, but the fates decreed that i would stumble over a broken phpunit for the hubris of mucking with things after 3pm local [23:15:37] it's after 2300 local here ;) [23:16:47] (03PS2) 10Brennen Bearnes: Revert "group0 to 1.46.0-wmf.13" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233856 (https://phabricator.wikimedia.org/T413804) [23:17:13] (03CR) 10CI reject: [V:04-1] Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) (owner: 10Jforrester) [23:17:15] FIRING: MediaWikiHighErrorRate: Elevated rate of MediaWiki errors - kube-mw-web - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000438/mediawiki-exceptions-alerts?panelId=18&fullscreen&orgId=1&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiHighErrorRate [23:17:17] (03CR) 10Brennen Bearnes: [C:03+2] Revert "group0 to 1.46.0-wmf.13" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233856 (https://phabricator.wikimedia.org/T413804) (owner: 10Brennen Bearnes) [23:18:21] (03Merged) 10jenkins-bot: Revert "group0 to 1.46.0-wmf.13" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/1233856 (https://phabricator.wikimedia.org/T413804) (owner: 10Brennen Bearnes) [23:19:05] (03PS2) 10Jforrester: Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 
(https://phabricator.wikimedia.org/T415619) [23:19:12] (03CR) 10Reedy: [C:03+2] Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) (owner: 10Jforrester) [23:19:15] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate eventstreams-internal.discovery.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire [23:22:09] zabe: oh fucking seriously? [23:22:15] RESOLVED: MediaWikiHighErrorRate: Elevated rate of MediaWiki errors - kube-mw-web - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000438/mediawiki-exceptions-alerts?panelId=18&fullscreen&orgId=1&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiHighErrorRate [23:22:33] https://github.com/sebastianbergmann/phpunit/commit/b36f02317466907a230d3aa1d34467041271ef4a [23:22:34] hehe [23:22:56] (03Merged) 10jenkins-bot: Updated phpunit/phpunit from 9.6.21 to 9.6.33 [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233859 (https://phabricator.wikimedia.org/T415723) (owner: 10Reedy) [23:23:00] (03CR) 10CI reject: [V:04-1] Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) (owner: 10Jforrester) [23:23:01] "a regression" [23:23:03] that'sr eally helpful [23:23:11] (03Merged) 10jenkins-bot: Updated phpunit/phpunit from 9.6.21 to 9.6.33 [core] (wmf/1.46.0-wmf.12) - 10https://gerrit.wikimedia.org/r/1233860 (https://phabricator.wikimedia.org/T415723) (owner: 10Reedy) [23:23:33] (03CR) 10Reedy: Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) (owner: 
10Jforrester) [23:23:36] (03CR) 10Reedy: [C:03+2] Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) (owner: 10Jforrester) [23:23:52] so there're currently a bunch of serialization (i assume - __PHP_Incomplete_Class given) in .12 [23:25:37] serialization _errors_. i can type. [23:25:55] Can you see in what part of MW? [23:26:02] Or is the trace unhelpful? [23:27:25] James_F: T415725 - i'm assuming this is from the earlier rolling forward to group0 [23:27:25] T415725: TypeError: MediaWiki\Extension\Translate\MessageGroupProcessing\CachedMessageGroupFactoryLoader::MediaWiki\Extension\Translate\MessageGroupProcessing\{closure}(): Argument #1 ($value) must be of type DependencyWrapper, __PHP_In - https://phabricator.wikimedia.org/T415725 [23:27:35] still translate [23:27:40] Hmm, yeah, makes sense. Drat. [23:27:41] James_F: If you're doing them manually, you need to +0.0.01 [23:28:03] Reedy: Argh, 10.5.63? [23:28:14] Yup [23:28:15] https://github.com/sebastianbergmann/phpunit/releases/tag/10.5.63 [23:28:20] Bother. [23:29:04] Though https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/1233857 passed on .62. [23:29:10] (03PS1) 10Dwisehaupt: frack dns cleanup and reconfig [dns] - 10https://gerrit.wikimedia.org/r/1233877 (https://phabricator.wikimedia.org/T364185) [23:29:23] https://github.com/sebastianbergmann/phpunit/releases/tag/10.5.62 [23:29:26] https://github.com/sebastianbergmann/phpunit/commit/b36f02317466907a230d3aa1d34467041271ef4a it's some coverage non specific regression [23:29:32] Right, both are fine but ..63 is better. [23:29:34] serialization! [23:29:58] (03CR) 10CI reject: [V:04-1] frack dns cleanup and reconfig [dns] - 10https://gerrit.wikimedia.org/r/1233877 (https://phabricator.wikimedia.org/T364185) (owner: 10Dwisehaupt) [23:31:45] Have aborted the script and tweaked it to +0.0.1 for the rest. 
[23:34:15] FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [23:35:04] (03Merged) 10jenkins-bot: Revert "Language: Namespace dependency classes" [core] (wmf/1.46.0-wmf.13) - 10https://gerrit.wikimedia.org/r/1233862 (https://phabricator.wikimedia.org/T415619) (owner: 10Jforrester) [23:35:33] brennen: Want me to deploy the 4 patches? [23:36:14] Reedy: fire away [23:38:01] !log reedy@deploy2002 Started scap sync-world: Backport for [[gerrit:1233860|Updated phpunit/phpunit from 9.6.21 to 9.6.33 (T415723)]], [[gerrit:1233862|Revert "Language: Namespace dependency classes" (T415619)]], [[gerrit:1233858|build: Upgrade PHPUnit from 10.5.59 to 10.5.62 to unblock CI (T415723)]], [[gerrit:1233859|Updated phpunit/phpunit from 9.6.21 to 9.6.33 (T415723)]] [23:38:09] T415723: CI blocked from installing phpunit by CVE-2026-24765 - https://phabricator.wikimedia.org/T415723 [23:38:09] T415619: Creation of dynamic property MediaWiki\Language\Dependency\FileDependency::$filename is deprecated {"exception":"[object] (ErrorException(code: 0) - https://phabricator.wikimedia.org/T415619 [23:46:45] Thanks for the C+2s, Reedy. [23:49:31] (03PS2) 10Dwisehaupt: frack dns cleanup and reconfig [dns] - 10https://gerrit.wikimedia.org/r/1233877 (https://phabricator.wikimedia.org/T364185) [23:50:06] (03CR) 10CI reject: [V:04-1] frack dns cleanup and reconfig [dns] - 10https://gerrit.wikimedia.org/r/1233877 (https://phabricator.wikimedia.org/T364185) (owner: 10Dwisehaupt)