[01:10:10] Machine-Learning-Team: Install AMD GPU + torch version of ML Labs machines - https://phabricator.wikimedia.org/T412357#11577089 (achou) @Isaac: Kevin is currently working on the vLLM image for Lift Wing in {T415627}. He's been testing different versions and building the image on our ML-build machine, which w...
[05:03:27] Machine-Learning-Team: Install AMD GPU + torch version of ML Labs machines - https://phabricator.wikimedia.org/T412357#11577291 (kevinbazira) >>! In T412357#11577087, @achou wrote: > @Isaac: Kevin is currently working on the vLLM image for Lift Wing in {T415627}. He's been testing different versions and buil...
[09:36:49] FIRING: KubernetesDeploymentUnavailableReplicas: ...
[09:36:49] Deployment revertrisk-wikidata-predictor-00005-deployment in revertrisk at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revertrisk&var-deployment=revertrisk-wikidata-predictor-00005-deployment - ...
[09:36:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[10:26:49] RESOLVED: KubernetesDeploymentUnavailableReplicas: ...
[10:26:49] Deployment revertrisk-wikidata-predictor-00005-deployment in revertrisk at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revertrisk&var-deployment=revertrisk-wikidata-predictor-00005-deployment - ...
[10:26:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[10:31:45] Machine-Learning-Team, ORES, Automoderator, Moderator-Tools-Team: ORES is not working on testwiki - https://phabricator.wikimedia.org/T411786#11577946 (gkyziridis) Open→Resolved
[10:47:42] Lift-Wing, Machine-Learning-Team, Wikidata, OKR-Work, Patch-For-Review: Optimize revertrisk-wikidata inference service to achieve ~500ms latency target - https://phabricator.wikimedia.org/T414060#11578047 (kevinbazira) In T409388#11560918, the WME team reported encountering the error: `{"erro...
[10:49:56] Machine-Learning-Team, Data-Engineering, Event-Platform: Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11578057 (achou) Answering @gkyziridis's questions: > … what kind of optimization do you have in mind? I meant o...
[10:59:12] (PS1) Kevin Bazira: revertrisk-wikidata: handle unexpected dicts [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236255 (https://phabricator.wikimedia.org/T414060)
[11:48:45] (CR) Gkyziridis: [C:+1] "LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236255 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[13:05:58] (PS1) Kevin Bazira: revertrisk-wikidata: add retry logic for Wikidata requests [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236280 (https://phabricator.wikimedia.org/T414060)
[13:08:09] (CR) Kevin Bazira: [C:+2] revertrisk-wikidata: handle unexpected dicts [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236255 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[13:09:24] (Merged) jenkins-bot: revertrisk-wikidata: handle unexpected dicts [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236255 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[14:25:23] Machine-Learning-Team: Install AMD GPU + torch version of ML Labs machines - https://phabricator.wikimedia.org/T412357#11578738 (Isaac) Thanks all for your work on this! I'll monitor then and happy to take it for a test run when it's ready :)
[14:26:53] Lift-Wing, Machine-Learning-Team, Essential-Work: Update WMF Debian vLLM image to support latest upstream software stack - https://phabricator.wikimedia.org/T415627#11578754 (kevinbazira) I've upgraded the wmf-debian-vllm image to support the latest upstream software stack as of Jan 2026. Below is th...
[14:46:47] Lift-Wing, Machine-Learning-Team, Essential-Work: Update WMF Debian vLLM image to support latest upstream software stack - https://phabricator.wikimedia.org/T415627#11578850 (kevinbazira) I have tested the image built in T415627#11578754 on ML-Lab. It was able to successfully load and run inference o...
[15:38:57] (CR) Gkyziridis: [C:+1] "LGTM! Nice that you are using tenacity!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236280 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[15:40:31] Machine-Learning-Team, MediaWiki-extensions-ORES, MediaWiki-Recent-changes, Moderator-Tools-Team (Kanban), OKR-Work (WE1 FY2025-26): Enable revert risk filters for first batch of wikis: < 1000 monthly edits - https://phabricator.wikimedia.org/T411485#11579191 (Ladsgroup) Go for it.
[15:55:10] (CR) Kevin Bazira: [C:+2] revertrisk-wikidata: add retry logic for Wikidata requests [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236280 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[15:56:37] (Merged) jenkins-bot: revertrisk-wikidata: add retry logic for Wikidata requests [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236280 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[18:29:17] Machine-Learning-Team: Reduce logstash logs from machine learning infra - https://phabricator.wikimedia.org/T416384 (Ladsgroup) NEW
[18:36:15] Machine-Learning-Team: Reduce logstash logs from machine learning infra - https://phabricator.wikimedia.org/T416384#11580128 (Ladsgroup) Also if you check UA, most logs are simply from "MediaWiki/1.46.0-wmf.13" or "ChangePropagation/WMF". Can we sample these?
[19:55:05] Machine-Learning-Team, Data-Engineering, Event-Platform: Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11580587 (Ottomata) > we may need to produce predictions to a separate stream instead of mediawiki.page_revert_r...
[19:56:25] Machine-Learning-Team, Add-Link-Structured-Task, Community Feedback (Growth), Growth-Team: Introduce case sensitivity to machine learning model for Add a Link - https://phabricator.wikimedia.org/T405185#11580589 (Sucheta-Salgaonkar-WMF) @OKarakaya-WMF thanks soo much for the analysis you provided...
[22:40:18] Machine-Learning-Team, Add-Link-Structured-Task, Community Feedback (Growth), Growth-Team: Introduce case sensitivity to machine learning model for Add a Link - https://phabricator.wikimedia.org/T405185#11581215 (KStoller-WMF) >(2) Do we have data to suggest that this might be an even more pervas...