[01:23:44] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Dumps-Generation, Patch-For-Review: when analyzing a Wikifunctions dump, parent_id in page creation revisions is sometimes 0 and sometimes None - https://phabricator.wikimedia.org/T420974#11888191 (Ottomata) [RevisionEntitySerializer - allow rev_pa...
[01:50:31] FIRING: [4x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[01:50:31] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[01:50:31] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[03:10:31] FIRING: [3x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[03:10:31] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[03:10:31] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[07:10:46] FIRING: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[07:10:46] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[07:10:46] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[07:11:33] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list April 2026 - https://phabricator.wikimedia.org/T424607#11888379 (GFontenelle_WMF) Thank you so much, @mforns!
[07:11:41] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Commons-Impact-Metrics, Commons-Impact-Metrics-Requests: Update Commons Impact Metrics allow-list April 2026 - https://phabricator.wikimedia.org/T424607#11888380 (GFontenelle_WMF) Open→Resolved
[08:00:03] Data-Engineering, MediaWiki-Platform-Team, FY2025-26 KR 5.1, OKR-Work: redioscope: periodically publish top clients to the data lake - https://phabricator.wikimedia.org/T424823#11888491 (daniel)
[08:01:29] Data-Engineering, MediaWiki-Platform-Team, FY2025-26 KR 5.1, OKR-Work: redioscope: periodically publish top clients to the data lake - https://phabricator.wikimedia.org/T424823#11888499 (daniel)
[09:08:34] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental: Accelerate sqoop landing for MediaWiki History private tables - https://phabricator.wikimedia.org/T424355#11888757 (APizzata-WMF) After a discussion with @xcollazo and @JAllemandou we decided to create 3 parallel processes: ` /usr/loc...
[11:10:46] FIRING: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[11:10:46] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[11:10:46] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[12:27:39] Data-Engineering: Inconsistent wiki list: grouped_wikis.csv extended *after* some sqoop jobs have already started - https://phabricator.wikimedia.org/T425385#11889256 (Aklapper)
[13:15:00] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Epic, Event-Platform: Relative Trending - https://phabricator.wikimedia.org/T425418 (JMonton-WMF) NEW
[13:19:38] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Relative Trending - Design document - https://phabricator.wikimedia.org/T425421 (JMonton-WMF) NEW
[13:56:48] Data-Engineering, Data-Engineering-Radar, Event-Platform, Machine-Learning-Team (Q4 FY2025-26), Patch-For-Review: Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11889722 (isarantopoulos) since this happens al...
[14:04:26] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11889747 (Ottomata) Here is EventBus code that previously emitted revert info: https://gerrit.wikimedia.org/r/c/mediawiki/extensi...
[14:04:44] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11889749 (Ottomata) @xcollazo @Milimetric I was wrong about only UI revert info being available. It looks like we have a lot more!
[14:55:48] FIRING: EventgateLatency: Elevated latency for POST events on eventgate-logging-external in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateLatency
[15:10:46] FIRING: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[15:10:46] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[15:10:46] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[15:10:53] RESOLVED: EventgateLatency: Elevated latency for POST events on eventgate-logging-external in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateLatency
[15:41:12] !log Test Kitchen edge-unique experiments (poll 200367) - adds: logged-out-retention-round10; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[15:41:14] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:42:24] Data-Engineering: Migrate generated-data-platform-aqs Docker images away from Debian Bullseye - https://phabricator.wikimedia.org/T425310#11890264 (Snwachukwu) Data Engineering owns AQS services. I recently bumped media-analytics service to Bookworm. https://gerrit.wikimedia.org/r/c/generated-data-platform/a...
[15:54:35] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental: Draft: Architectural design agreement: Incremental MediaWiki History - https://phabricator.wikimedia.org/T424359#11890356 (xcollazo) ## Decisions from design meeting — 2026-05-05 ### Decision 1: Move to Spark 3.5 (drop the Spark 3.1.2...
[16:13:58] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental, Data-Platform-SRE (2026-04-24 - 2026-05-15): A more recent Spark + Iceberg will make Incremental MediaWiki History much more efficient - https://phabricator.wikimedia.org/T424381#11890461 (xcollazo) CC @BTullis, @Gehel, @Ahoelzl
[16:34:05] Data-Engineering (Q4 FS25/26 April 1st - June 30st), AQS2.0: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202#11890533 (simon04) >>! In T415202#11851634, @Ladsgroup wrote: > I updated my tool to get videoplays. Hi @Ladsgroup, the following URL seemed to wor...
[17:10:43] Data-Engineering (Q4 FS25/26 April 1st - June 30st), AQS2.0: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202#11890762 (New_York-air) Same here
[17:35:07] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental: Draft: Architectural design agreement: Incremental MediaWiki History - https://phabricator.wikimedia.org/T424359#11890969 (xcollazo) --- **Updated plan — 2026-05-05** (supersedes the earlier plan comment) Key changes from the design m...
[18:12:48] FIRING: EventgateLatency: Elevated latency for POST events on eventgate-logging-external in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateLatency
[18:17:48] RESOLVED: EventgateLatency: Elevated latency for POST events on eventgate-logging-external in eqiad. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?viewPanel=79&orgId=1&var-service=eventgate-logging-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateLatency
[18:23:24] Data-Engineering, Patch-For-Review: table_maintenance_iceberg_monthly permission issue fails task due to permission on Ivy cache artifact - https://phabricator.wikimedia.org/T418804#11891317 (xcollazo) From @BTullis via Slack: ` I have executed this: btullis@cumin1003:~$ sudo cumin A:hadoop-worker 'rm -...
[18:26:10] Data-Engineering, Patch-For-Review: table_maintenance_iceberg_monthly permission issue fails task due to permission on Ivy cache artifact - https://phabricator.wikimedia.org/T418804#11891334 (xcollazo) >>! In T418804#11891241, @CodeReviewBot wrote: > xcollazo **merged** https://gitlab.wikimedia.org/repos...
[18:26:43] Data-Engineering, Patch-For-Review: table_maintenance_iceberg_monthly permission issue fails task due to permission on Ivy cache artifact - https://phabricator.wikimedia.org/T418804#11891347 (xcollazo) a:xcollazo
[18:27:13] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: table_maintenance_iceberg_monthly permission issue fails task due to permission on Ivy cache artifact - https://phabricator.wikimedia.org/T418804#11891349 (xcollazo)
[19:02:22] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental, Epic: Incremental MediaWiki History - https://phabricator.wikimedia.org/T424350#11891537 (xcollazo)
[19:04:00] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental: Spike: can analytics-refinery-source build Spark 3.5 artifacts? - https://phabricator.wikimedia.org/T425474 (xcollazo) NEW
[19:10:46] FIRING: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[19:10:46] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[19:10:46] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[19:31:59] Data-Engineering: Mediawiki History Failure - https://phabricator.wikimedia.org/T425443#11891633 (Aklapper)
[19:33:00] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental: Spike: can analytics-refinery-source build Spark 3.5 artifacts? - https://phabricator.wikimedia.org/T425474#11891644 (xcollazo) ## Spike Result: GO — build within analytics-refinery-source **Approach tested:** new `refinery-job-35` Ma...
[19:37:04] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental: Spike: can analytics-refinery-source build Spark 3.5 artifacts? - https://phabricator.wikimedia.org/T425474#11891663 (xcollazo) CC @JAllemandou
[19:38:18] Data-Engineering: Mediawiki History Failure - https://phabricator.wikimedia.org/T425443#11891677 (AKhatun_WMF)
[19:40:04] (PS1) Xcollazo: Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474)
[19:42:10] (PS2) Xcollazo: Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474)
[19:50:17] Data-Engineering: Mediawiki History Failure - https://phabricator.wikimedia.org/T425443#11891713 (AKhatun_WMF) [Slack thread](https://wikimedia.slack.com/archives/C05RHK7PS6Q/p1777920926844709) Error log: ` 26/05/02 21:44:21 WARN DenormalizedHistoryChecker: DenormMetricsGrowthErrors ratio 0.0831792975970425...
[19:52:11] Data-Engineering: Mediawiki History Failure - https://phabricator.wikimedia.org/T425443#11891720 (AKhatun_WMF)
[19:54:45] Data-Engineering: Mediawiki History Failure [2026-04] - https://phabricator.wikimedia.org/T425443#11891721 (AKhatun_WMF)
[20:01:48] (CR) CI reject: [V:-1] Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474) (owner: Xcollazo)
[20:21:00] (PS3) Xcollazo: Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474)
[20:32:22] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11891858 (xcollazo) >>! In T423583#11889748, @Ottomata wrote: > @xcollazo @Milimetric I was wrong about only...
[20:39:29] (CR) CI reject: [V:-1] Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474) (owner: Xcollazo)
[20:40:20] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic, Patch-For-Review: Surge in webrequest validation check - https://phabricator.wikimedia.org/T422030#11891870 (AKhatun_WMF) From current Ops Week: - The `ERROR` emails have stopped. - We are still getting `WARNING` emails quite frequently:...
[20:43:05] (PS4) Xcollazo: Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474)
[21:01:57] (CR) CI reject: [V:-1] Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474) (owner: Xcollazo)
[21:52:12] !log Test Kitchen mw-user experiment (poll 201461) - adds: ab-test-email-confirmation-banner; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[21:52:14] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:59:43] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11892141 (Ottomata) Well, that is just a TBD schema right now! We'll see what MW let's me get, but I think...
[22:00:29] Data-Engineering (Q4 FS25/26 April 1st - June 30st), MWH-Incremental, Epic: Incremental MediaWiki History - https://phabricator.wikimedia.org/T424350#11892144 (Ottomata)
[22:00:32] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11892143 (Ottomata)
[23:10:46] FIRING: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[23:10:46] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[23:10:46] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[23:34:29] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Mediawiki History Failure [2026-04] - https://phabricator.wikimedia.org/T425443#11892350 (Ahoelzl) p:Triage→High
[23:40:11] (PS5) Xcollazo: Spike: refinery-job-35 submodule compiles against Spark 3.5.8 + Iceberg 1.10.1 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283087 (https://phabricator.wikimedia.org/T425474)
[23:43:06] (PS1) Xcollazo: Add .DS_Store, logs/, and *.bak to .gitignore [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1283113