[00:19:51] FIRING: ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:24:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:29:16] Tool-recaptime-dev, Release-Engineering-Team, GitLab (Project Migration): Create new GitLab project group: recaptime-dev - https://phabricator.wikimedia.org/T418698#11666227 (ajhalili2006) Open→Declined Thanks for the response and feedback! I know it's confusing how it fits within here (r...
[00:33:58] (update) raymond-ndibe: docs: update deployment docs [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/36 (owner: dcaro)
[00:34:02] (approved) raymond-ndibe: docs: update deployment docs [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/36 (owner: dcaro)
[00:34:11] (update) raymond-ndibe: docs: update deployment docs [repos/cloud/toolforge/registry-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/registry-admission/-/merge_requests/36 (owner: dcaro)
[00:34:12] (update) raymond-ndibe: docs: update deployment instructions [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/70 (owner: dcaro)
[00:34:15] (approved) raymond-ndibe: docs: update deployment instructions [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/70 (owner: dcaro)
[00:34:20] (update) raymond-ndibe: docs: update deployment docs [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/44 (owner: dcaro)
[00:34:27] (approved) raymond-ndibe: docs: update deployment docs [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/44 (owner: dcaro)
[00:34:28] (update) raymond-ndibe: docs: update deployment instructions [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/70 (owner: dcaro)
[00:34:37] (update) raymond-ndibe: docs: update deployment docs [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/44 (owner: dcaro)
[00:34:40] (update) raymond-ndibe: docs: added standard deployment notes [repos/cloud/toolforge/image-config] - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/20 (owner: dcaro)
[00:34:44] (approved) raymond-ndibe: docs: added standard deployment notes [repos/cloud/toolforge/image-config] - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/20 (owner: dcaro)
[00:34:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:35:45] (close) raymond-ndibe: Draft: Jobs api version that deletes and recreates jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/261
[00:39:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:44:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:49:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:17:51] FIRING: ProbeDown: Service tools-k8s-haproxy-8:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:20:27] Tools: What is recaptime-dev and how does it advance the goals of the Wikimedia movement? - https://phabricator.wikimedia.org/T418818 (bd808) NEW
[01:22:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:27:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance extdist-06 in project extdist - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[01:27:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:32:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:37:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:42:51] RESOLVED: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:43:51] FIRING: ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:48:06] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:48:51] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:53:06] RESOLVED: [2x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:55:52] (update) raymond-ndibe: logs-api,logs: test start, end params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[02:17:51] FIRING: [2x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:22:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:27:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:32:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:37:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:39:13] (update) raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917)
[02:42:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:47:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:48:28] (update) raymond-ndibe: logs-api,logs: test start, end params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[02:52:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:59:19] (update) raymond-ndibe: logs-api,logs: test start, end params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[02:59:32] (update) raymond-ndibe: logs-api,logs: test since, until params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[03:08:25] (update) raymond-ndibe: logs-api,logs: test since, until params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[03:10:28] (update) raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917)
[03:15:00] (update) raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917)
[03:18:51] FIRING: [2x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:20:07] (update) raymond-ndibe: components-api: bump to 0.0.186-20260302234750-21efe557 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1152 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:20:07] (approved) raymond-ndibe: components-api: bump to 0.0.186-20260302234750-21efe557 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1152 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:20:19] (merge) raymond-ndibe: components-api: bump to 0.0.186-20260302234750-21efe557 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1152 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:20:28] (update) raymond-ndibe: logs-api: bump to 0.0.14-20260302234841-60bec639 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1146 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:20:34] (approved) raymond-ndibe: logs-api: bump to 0.0.14-20260302234841-60bec639 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1146 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:20:36] (update) raymond-ndibe: logs-api: bump to 0.0.14-20260302234841-60bec639 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1146 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:02] (merge) raymond-ndibe: logs-api: bump to 0.0.14-20260302234841-60bec639 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1146 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:12] (update) raymond-ndibe: jobs-emailer: bump to 0.0.77-20260302234644-7259b5e1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1159 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:15] (approved) raymond-ndibe: jobs-emailer: bump to 0.0.77-20260302234644-7259b5e1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1159 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:18] (update) raymond-ndibe: jobs-emailer: bump to 0.0.77-20260302234644-7259b5e1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1159 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:43] (merge) raymond-ndibe: jobs-emailer: bump to 0.0.77-20260302234644-7259b5e1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1159 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:45] (update) raymond-ndibe: envvars-api: bump to 0.0.81-20260302155558-9020eb8b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1157 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:51] (update) raymond-ndibe: envvars-api: bump to 0.0.81-20260302155558-9020eb8b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1157 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:21:53] (approved) raymond-ndibe: envvars-api: bump to 0.0.81-20260302155558-9020eb8b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1157 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:22:32] (merge) raymond-ndibe: envvars-api: bump to 0.0.81-20260302155558-9020eb8b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1157 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:22:59] (update) raymond-ndibe: builds-api: bump to 0.0.209-20260302155339-cd236f5b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1156 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:23:04] (approved) raymond-ndibe: builds-api: bump to 0.0.209-20260302155339-cd236f5b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1156 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:23:21] (update) raymond-ndibe: builds-api: bump to 0.0.209-20260302155339-cd236f5b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1156 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:23:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:23:54] (merge) raymond-ndibe: builds-api: bump to 0.0.209-20260302155339-cd236f5b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1156 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:24:32] (update) raymond-ndibe: jobs-api: bump to 0.0.466-20260302155936-7a426183 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1153 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:24:33] (approved) raymond-ndibe: jobs-api: bump to 0.0.466-20260302155936-7a426183 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1153 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:24:38] (update) raymond-ndibe: jobs-api: bump to 0.0.466-20260302155936-7a426183 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1153 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:25:11] (merge) raymond-ndibe: jobs-api: bump to 0.0.466-20260302155936-7a426183 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1153 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:25:45] (update) raymond-ndibe: maintain-kubeusers: bump to 0.0.191-20260302155714-fe5f1382 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1151 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:25:51] (approved) raymond-ndibe: maintain-kubeusers: bump to 0.0.191-20260302155714-fe5f1382 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1151 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:25:54] (update) raymond-ndibe: maintain-kubeusers: bump to 0.0.191-20260302155714-fe5f1382 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1151 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:26:19] (merge) raymond-ndibe: maintain-kubeusers: bump to 0.0.191-20260302155714-fe5f1382 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1151 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:26:40] (update) raymond-ndibe: envvars-admission: bump to 0.0.38-20260302155454-a88eccb9 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1150 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:26:44] (update) raymond-ndibe: envvars-admission: bump to 0.0.38-20260302155454-a88eccb9 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1150 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:26:50] (approved) raymond-ndibe: envvars-admission: bump to 0.0.38-20260302155454-a88eccb9 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1150 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:27:09] (merge) raymond-ndibe: envvars-admission: bump to 0.0.38-20260302155454-a88eccb9 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1150 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:27:52] (update) raymond-ndibe: ingress-admission: bump to 0.0.79-20260302155630-364cb534 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1149 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:27:58] (update) raymond-ndibe: ingress-admission: bump to 0.0.79-20260302155630-364cb534 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1149 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:28:00] (approved) raymond-ndibe: ingress-admission: bump to 0.0.79-20260302155630-364cb534 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1149 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:28:25] (merge) raymond-ndibe: ingress-admission: bump to 0.0.79-20260302155630-364cb534 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1149 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:28:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:28:52] (update) raymond-ndibe: maintain-harbor: bump to 0.0.68-20260302234411-b765488a [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1158 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:28:56] (approved) raymond-ndibe: maintain-harbor: bump to 0.0.68-20260302234411-b765488a [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1158 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:29:00] (update) raymond-ndibe: maintain-harbor: bump to 0.0.68-20260302234411-b765488a [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1158 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:29:27] (merge) raymond-ndibe: maintain-harbor: bump to 0.0.68-20260302234411-b765488a [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1158 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:29:29] (update) raymond-ndibe: api-gateway: bump to 0.0.88-20260302155547-44264f79 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1148 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:29:35] (update) raymond-ndibe: api-gateway: bump to 0.0.88-20260302155547-44264f79 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1148 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:29:45] (approved) raymond-ndibe: api-gateway: bump to 0.0.88-20260302155547-44264f79 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1148 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:30:06] (merge) raymond-ndibe: api-gateway: bump to 0.0.88-20260302155547-44264f79 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1148 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:30:23] (update) raymond-ndibe: builds-builder: bump to 0.0.142-20260302155522-7ff17e10 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1145 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:30:28] (approved) raymond-ndibe: builds-builder: bump to 0.0.142-20260302155522-7ff17e10 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1145 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:30:32] (update) raymond-ndibe: builds-builder: bump to 0.0.142-20260302155522-7ff17e10 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1145 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:31:00] (merge) raymond-ndibe: builds-builder: bump to 0.0.142-20260302155522-7ff17e10 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1145 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[03:33:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:38:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:43:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:48:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:17:44] (update) raymond-ndibe: logs-api,logs: add logs tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326)
[04:18:52] FIRING: ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:18:54] (update) raymond-ndibe: logs-api,logs: test since, until params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[04:23:23] (update) raymond-ndibe: logs-api,logs: add logs tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326)
[04:23:31] (update) raymond-ndibe: logs-api,logs: add logs tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326)
[04:23:52] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:24:06] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:24:46] (update) raymond-ndibe: logs-api,logs: add logs tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326)
[04:28:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:29:01] (update) raymond-ndibe: logs-api,logs: test since, until params for logs-api and jobs-api log endpoint [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917)
[04:33:51] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:38:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:39:06] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:42:20] (update) raymond-ndibe: logs-api,logs: add logs tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326)
[04:42:27] (update) raymond-ndibe: logs-api,logs: add logs tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326)
[04:43:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:48:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:49:06] FIRING: [4x] ProbeDown: 
Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [04:53:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [04:58:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:03:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:08:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:13:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes 
(http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:18:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:25:33] 10Tool-wsindex, 10Wikisource Reader App (Android), 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11666494 (10Muguro) **Weekly Internship Report** //Week 12: February 23 - February 27// **Overview of Tasks Completed:** Task 1: [[ http... 
[05:33:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:38:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:43:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:45:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:50:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [05:55:51] FIRING: [4x] ProbeDown: Service 
tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:00:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:05:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:10:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:15:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:20:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes 
(http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:25:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:30:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:35:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:40:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:45:51] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:50:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [06:55:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:00:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:05:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:15:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:20:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:25:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:30:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:35:51] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:40:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:45:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:50:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:51:06] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [07:55:51] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:00:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:05:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:10:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:15:42] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829 (10fgiunchedi) 03NEW [08:15:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:20:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:25:51] FIRING: [4x] ProbeDown: Service 
tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:27:56] FIRING: [2x] SystemdUnitDown: The service unit hdfs_rsync_mediawiki_content_history.service is in failed status on host clouddumps1001. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [08:30:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:35:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:38:59] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Platform-SRE (2026-02-13 - 2026-03-06): Drop support for cl_to, cl_collation and il_to from wikireplicas - https://phabricator.wikimedia.org/T417492#11666764 (10Gehel) [08:44:49] 06cloud-services-team, 10Cloud-VPS: grafana.wmcloud.org unavailable - failed db migration - https://phabricator.wikimedia.org/T418831 (10fgiunchedi) 03NEW [08:45:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:47:02] 06cloud-services-team, 10Cloud-VPS: grafana.wmcloud.org unavailable - failed db migration - https://phabricator.wikimedia.org/T418831#11666794 (10Volans) Seems related to https://github.com/grafana/grafana/issues/118836 [08:50:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [08:53:29] 06cloud-services-team, 10Cloud-VPS: grafana.wmcloud.org unavailable - failed db migration - https://phabricator.wikimedia.org/T418831#11666817 (10fgiunchedi) Thank you @Volans ! Upstream issue recommends either downgrading grafana, upgrading mariadb or wait for a fixed grafana release with collation compatibil... 
[08:55:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:00:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:05:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:10:51] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:15:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:17:51] FIRING: ProbeDown: Service tools-k8s-haproxy-7:443 
has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:21:06] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:22:51] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:23:21] FIRING: ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:26:06] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:28:21] RESOLVED: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes 
(http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:31:06] FIRING: [4x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [09:33:43] 10superset.wmcloud.org, 10Automoderator, 06Moderator-Tools-Team, 06Product-Analytics: Migrate Automoderator activity dashboard from wmcloud to wikimedia Superset instance - https://phabricator.wikimedia.org/T418836 (10Samwalton9-WMF) 03NEW [09:36:34] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11666993 (10dcaro) On redis, there was an increase in the network traffic around the time the alert started triggering (not exactly, it seems a bit later): https://grafana.wmcloud.org/g... [09:37:34] 10Cloud-VPS (Quota-requests): Quota increases for gitlab-runners - https://phabricator.wikimedia.org/T418813#11666996 (10Volans) Given the size of the request this will need a discussion in the next WMCS team sync up on Thu., As I'm not aware of any prior coordinated plans for this specific migration, if there w... [09:38:04] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11666998 (10dcaro) It also matches an increase on the amount of `set` commands, and on the time the server spends on those. 
[09:42:00] 06cloud-services-team, 10superset.wmcloud.org: Sunset superset.wmcloud.org - https://phabricator.wikimedia.org/T416373#11667015 (10Samwalton9-WMF) [09:42:16] 06cloud-services-team, 10superset.wmcloud.org: Sunset superset.wmcloud.org - https://phabricator.wikimedia.org/T416373#11667020 (10Samwalton9-WMF) [10:08:38] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11667129 (10dcaro) Enabled logging on fourohfour, it seems that the `did-you-mean` urls are taking really long, looking [10:17:19] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11667178 (10dcaro) Added logs before and after hitting redis cache, and the slowdown does not seem to be there (diff between miss memory-hit redis): ` [2026-03-03 10:16:32,815] WARNING i... [10:18:21] RESOLVED: ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_this_tool_does_not_exist_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [10:22:56] FIRING: [2x] SystemdUnitDown: The systemd unit hdfs_rsync_mediawiki_content_history.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [10:23:49] 10Cloud-VPS (Quota-requests): Quota increases for gitlab-runners - https://phabricator.wikimedia.org/T418813#11667214 (10fnegri) @dduvall I'm happy to see that your experiments with Magnum went well (I remember we discussed this in SF back in 2023!) and you're planning to expand its usage. At the same time, Magn... 
[10:36:23] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11667276 (10dcaro) Some progress, I've added a memory cache for the 'did-you-mean' endpoint (that gets all tools and projects), and that reduced most of the calls made to it from several... [10:42:12] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11667294 (10dcaro) Things seem stable for almost 10m, most of the requests to that endpoint were for the non-existing tool `quentinv57-tools.toolforge.org` [10:42:50] 06cloud-services-team, 10Toolforge: fourohfour general unavailability / overload - https://phabricator.wikimedia.org/T418829#11667295 (10dcaro) That tool is redirected from the proxies too: https://codesearch.wmcloud.org/search/?q=quentinv57-tools.toolforge.org [11:04:00] (03open) 10dcaro: app: cache the suggested tools in memory [toolforge-repos/fourohfour] - 10https://gitlab.wikimedia.org/toolforge-repos/fourohfour/-/merge_requests/16 [11:05:51] (03update) 10dcaro: app: cache the suggested tools in memory [toolforge-repos/fourohfour] - 10https://gitlab.wikimedia.org/toolforge-repos/fourohfour/-/merge_requests/16 [11:42:08] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Persistence, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11667548 (10BTullis) I've added the puppet definition of the x1 section to `an-redacteddb1001` now. However, the service didn'... [12:05:42] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Persistence, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11667642 (10BTullis) I have also fixed up the Icinga checks by manually creating the grants required to carry out the checks.... 
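Note on the 10:36:23 and 11:04:00 entries above: the log only says that the suggested tools for the 'did-you-mean' endpoint are now cached in memory; it does not show the change itself. The following is a minimal sketch of that general technique, not the code from the merge request. The class name `TTLCache`, the `loader` callable, the `clock` parameter and the 300-second lifetime in the usage comment are all assumptions made for illustration.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Keep one expensive result in process memory for a fixed time.

    Hypothetical helper: the real fourohfour change may look different.
    """

    def __init__(
        self,
        loader: Callable[[], T],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader          # function that does the slow lookup
        self._ttl = ttl_seconds        # how long a loaded value stays valid
        self._clock = clock            # injectable so tests can fake time
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        # Reload only on first use or once the stored value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]


# Usage sketch (assumed names): wrap the slow "list all tools" call once,
# then serve repeated requests from memory.
#   tools_cache = TTLCache(fetch_all_tools, ttl_seconds=300)
#   suggestions = tools_cache.get()
```

A cache like this trades freshness for load: a newly created tool would not appear in suggestions until the entry expires. The sketch takes no lock, so under several threads the loader could run more than once around expiry.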
[12:06:20] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Persistence, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11667658 (10BTullis) 05Open→03Stalled a:05BTullis→03None [12:30:22] 10Tool-wsindex, 10Wikisource Reader App (Android), 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11667802 (10Bodhisattwa) [12:31:00] 06cloud-services-team, 10superset.wmcloud.org: Sunset superset.wmcloud.org - https://phabricator.wikimedia.org/T416373#11667806 (10Pginer-WMF) >>! In T416373#11665536, @Andrew wrote: >> If it is shutting down, it would be good to understand. which will be the available alternatives to provide communities with... [12:38:42] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Cloud-VPS: Carry out controlled network switch down tests in cloud - https://phabricator.wikimedia.org/T417393#11667832 (10fgiunchedi) [12:42:48] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Persistence, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11667849 (10Marostegui) >>! In T407485#11667548, @BTullis wrote: > I've added the puppet definition of the x1 section to `an-r... 
[12:54:16] 06cloud-services-team, 10Cloud-VPS: Consider enabling per-unit cgroup metrics in Cloud VPS via cadvisor - https://phabricator.wikimedia.org/T418083#11667904 (10fgiunchedi) [13:34:10] 06cloud-services-team, 10Toolforge: [fourohfour] general unavailability / overload - https://phabricator.wikimedia.org/T418829#11668013 (10dcaro) a:03dcaro [13:34:18] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): [fourohfour] general unavailability / overload - https://phabricator.wikimedia.org/T418829#11668015 (10dcaro) p:05Triage→03Medium [13:34:27] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): [fourohfour] general unavailability / overload - https://phabricator.wikimedia.org/T418829#11668017 (10dcaro) [13:34:36] 10Toolforge (Toolforge iteration 25): [harbor,tools] Harbor object usage in S3 is steadily increasing - https://phabricator.wikimedia.org/T418528#11668018 (10dcaro) p:05Triage→03Medium [13:35:14] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): [fourohfour] general unavailability / overload - https://phabricator.wikimedia.org/T418829#11668020 (10dcaro) 05Open→03In progress [13:38:13] 10Toolforge (Toolforge iteration 25): [harbor,tools] Harbor object usage in S3 is steadily increasing - https://phabricator.wikimedia.org/T418528#11668022 (10dcaro) 05Open→03In progress [14:11:03] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 07Epic: [KR] WE6.3 Introduce a sustainability scoring system for the Toolforge platform - https://phabricator.wikimedia.org/T368600#11668160 (10dcaro) [14:14:30] 10Toolforge (Toolforge iteration 25): [cicd] add trixie image with newer python - https://phabricator.wikimedia.org/T409058#11668179 (10dcaro) 05In progress→03Resolved [14:16:35] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Persistence, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11668200 (10BTullis) >>! 
In T407485#11667849, @Marostegui wrote: >>>! In T407485#11667548, @BTullis wrote: >> This will be del... [14:22:45] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Persistence, and 3 others: Set up x1 replication to an-redacteddb1001 - https://phabricator.wikimedia.org/T407485#11668231 (10Marostegui) No, I really don't have strong feelings about it. As this host is owned by your team, I am happy wi... [14:23:11] FIRING: [2x] SystemdUnitDown: The systemd unit hdfs_rsync_mediawiki_content_history.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [14:32:04] (03update) 10dcaro: runtime::diff_with_running_job: temp conditional to force job version upgrade from v1 -> v2 [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/259 (https://phabricator.wikimedia.org/T359649) (owner: 10raymond-ndibe) [14:32:38] 06cloud-services-team, 10Toolforge: [clis] standardize the package names - https://phabricator.wikimedia.org/T399080#11668273 (10dcaro) a:05dcaro→03None [14:33:23] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-api,harbor,image-config] Move pre-built images to harbor - https://phabricator.wikimedia.org/T409727#11668283 (10dcaro) 05In progress→03Stalled [14:33:45] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 07Epic, 13Patch-For-Review: [jobs-api,webservice] Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#11668286 (10dcaro) 05In progress→03Stalled [14:36:10] 10Toolforge (Toolforge iteration 25): [jobs-api] omitting filelog field from jobs creation payload completely breaks jobs-api for the tool - https://phabricator.wikimedia.org/T417518#11668312 (10dcaro) p:05Triage→03Medium [14:36:31] 
06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25): [jobs-api] Create storage layer, and save business models in persistent storage - https://phabricator.wikimedia.org/T359650#11668316 (10dcaro) 05In progress→03Stalled [14:37:04] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25): [jobs-api] Create storage layer, and save business models in persistent storage - https://phabricator.wikimedia.org/T359650#11668324 (10dcaro) Waiting to get {T359649} in first. [14:37:38] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api] make job status an enum, with clearly defined states - https://phabricator.wikimedia.org/T401172#11668328 (10dcaro) Waiting for {T359650} to get deployed [14:38:25] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 07Epic, 13Patch-For-Review: [jobs-api] expose jobs-api continuous jobs to the internet via `toolname.toolforge.org`, just like webservice - https://phabricator.wikimedia.org/T388092#11668331 (10dcaro) To be deployed after {T40... [14:39:03] 06cloud-services-team, 10Toolforge: [components-api] optionally log deployments to SAL automatically - https://phabricator.wikimedia.org/T393169#11668333 (10dcaro) a:05dcaro→03None [14:44:20] 06cloud-services-team, 10Toolforge: Node.js buildpack selects EOL Node.js version - https://phabricator.wikimedia.org/T418414#11668349 (10dcaro) I'll close as duplicate of {T380127}, as both will be solved the same way, not to say the task is not useful though, just to be able to keep track of workable items. 
[14:44:37] 06cloud-services-team, 10Toolforge: Node.js buildpack selects EOL Node.js version - https://phabricator.wikimedia.org/T418414#11668351 (10dcaro) [14:44:53] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668354 (10dcaro) →14Duplicate dup:03T418414 [14:45:23] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668357 (10dcaro) 05Duplicate→03Open Merged the wrong way :facepalm: [14:46:05] 06cloud-services-team, 10Toolforge: Node.js buildpack selects EOL Node.js version - https://phabricator.wikimedia.org/T418414#11668363 (10dcaro) →14Duplicate dup:03T380127 [14:46:22] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668365 (10dcaro) [14:46:25] 06cloud-services-team, 10Toolforge: [build service] Python pack is outdated, does not support latest Python 3.14 stable release - https://phabricator.wikimedia.org/T408108#11668368 (10dcaro) I'll merge with {T380127} as both get fixed the same way. 
[14:46:38] 06cloud-services-team, 10Toolforge: [build service] Python pack is outdated, does not support latest Python 3.14 stable release - https://phabricator.wikimedia.org/T408108#11668371 (10dcaro) →14Duplicate dup:03T380127 [14:46:44] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668373 (10dcaro) [14:47:04] 06cloud-services-team, 10Toolforge: [Build service] latest builder has old PHP - https://phabricator.wikimedia.org/T401875#11668378 (10dcaro) I'll merge with {T380127} as both are fixed the same way. [14:47:15] 06cloud-services-team, 10Toolforge: [Build service] latest builder has old PHP - https://phabricator.wikimedia.org/T401875#11668381 (10dcaro) →14Duplicate dup:03T380127 [14:47:18] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668383 (10dcaro) [14:48:19] 06cloud-services-team, 10Toolforge: Upgrade golang buildpack to 1.22 - https://phabricator.wikimedia.org/T363854#11668387 (10dcaro) I'll merge with {T380127} as they are fixed the same way [14:48:30] 06cloud-services-team, 10Toolforge: Upgrade golang buildpack to 1.22 - https://phabricator.wikimedia.org/T363854#11668389 (10dcaro) →14Duplicate dup:03T380127 [14:48:39] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668391 (10dcaro) [14:49:17] 06cloud-services-team, 10Tool-spacemedia, 10Toolforge: [Build service] latest builder has old Java - https://phabricator.wikimedia.org/T405415#11668395 (10dcaro) I'll merge this into {T380127} as they 
are fixed the same way [14:49:29] 06cloud-services-team, 10Tool-spacemedia, 10Toolforge: [Build service] latest builder has old Java - https://phabricator.wikimedia.org/T405415#11668398 (10dcaro) →14Duplicate dup:03T380127 [14:49:33] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668400 (10dcaro) [14:50:27] 06cloud-services-team, 10Toolforge: [functional tests] improve cleanaup - https://phabricator.wikimedia.org/T409009#11668405 (10dcaro) [14:51:14] 06cloud-services-team, 10Toolforge: [functional tests] improve cleanup - https://phabricator.wikimedia.org/T409009#11668407 (10dcaro) [14:51:39] 06cloud-services-team, 10Toolforge: [functional tests] improve cleanup leftover test logs - https://phabricator.wikimedia.org/T409009#11668411 (10dcaro) [14:58:17] 06cloud-services-team, 10Toolforge: [jobs-api] Indicate when a job is too big to be scheduled - https://phabricator.wikimedia.org/T383515#11668455 (10dcaro) a:05Raymond_Ndibe→03None [14:58:25] 06cloud-services-team, 10Toolforge, 07Epic: [cicd] Streamline toolforge cli deployment and external contributor ci flows - https://phabricator.wikimedia.org/T392524#11668458 (10dcaro) a:05Raymond_Ndibe→03None [14:58:35] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge: [components-api] Add webservice support - https://phabricator.wikimedia.org/T362077#11668464 (10dcaro) a:05dcaro→03None [14:58:41] 06cloud-services-team, 10Toolforge: [jobs-api] Generate the openapi definition from the code - https://phabricator.wikimedia.org/T390138#11668466 (10dcaro) a:05Raymond_Ndibe→03None [14:58:52] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): Replace ingress-nginx before upstream EOL date - https://phabricator.wikimedia.org/T392356#11668468 (10dcaro) [14:58:56] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge 
iteration 25): [infra,k8s] Upgrade Toolforge Kubernetes to version 1.32 - https://phabricator.wikimedia.org/T379047#11668470 (10dcaro) [14:59:04] 06cloud-services-team, 10Toolforge: [components-api] add order to the components deployment - https://phabricator.wikimedia.org/T362075#11668477 (10dcaro) a:05dcaro→03None [14:59:12] 06cloud-services-team, 10Toolforge: [jobs-api] separate jobs-framework k8s object templates from code - https://phabricator.wikimedia.org/T358815#11668478 (10dcaro) a:05Raymond_Ndibe→03None [15:00:07] 06cloud-services-team, 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api] Refactor before webservice support - https://phabricator.wikimedia.org/T359804#11668481 (10dcaro) [15:00:13] 06cloud-services-team, 10Toolforge, 13Patch-For-Review: [harbor,trove] Trove DB filled disk and caused toolforge-build to fail as a result - https://phabricator.wikimedia.org/T354714#11668484 (10dcaro) a:05dcaro→03None [15:00:50] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): [harbor] Update HarborDown runbook with the incident debugging details - https://phabricator.wikimedia.org/T354739#11668491 (10dcaro) [15:01:07] 06cloud-services-team, 10Toolforge, 05Goal: [harbor] Deploy with Helm - https://phabricator.wikimedia.org/T356301#11668495 (10dcaro) a:05Raymond_Ndibe→03None [15:19:39] 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [components-api] Queue builds when the build queue is full - https://phabricator.wikimedia.org/T402568#11668550 (10dcaro) This task has to be extended a bit more on what options do we have to implement this, some early suggestions: * On components-api... 
[15:20:01] 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [components-api] Queue builds when the build queue is full - https://phabricator.wikimedia.org/T402568#11668564 (10dcaro) 05Stalled→03In progress [15:20:35] 10Toolforge (Toolforge iteration 25), 07Upstream: [builds-builder] golang based images get infinite nested loops for procfile entries - https://phabricator.wikimedia.org/T363417#11668570 (10dcaro) Will be fixed when upgrading the buildpacks {T380127} [15:21:23] 10Toolforge (Toolforge iteration 25), 07Upstream: [builds-builder] golang based images get infinite nested loops for procfile entries - https://phabricator.wikimedia.org/T363417#11668573 (10dcaro) [15:21:26] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11668574 (10dcaro) [15:24:13] 06cloud-services-team, 10Toolforge: [toolforge] simplify calling the different toolforge apis from within the containers - https://phabricator.wikimedia.org/T356377#11668588 (10dcaro) a:05dcaro→03None [15:25:48] 06cloud-services-team, 10Toolforge, 13Patch-For-Review: [harbor,infra] Find a way to manage toolforge project policies with code - https://phabricator.wikimedia.org/T360509#11668608 (10dcaro) a:05Raymond_Ndibe→03None [15:26:05] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): Replace ingress-nginx before upstream EOL date - https://phabricator.wikimedia.org/T392356#11668612 (10dcaro) 05Open→03In progress [15:26:37] 06cloud-services-team, 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api] Refactor before webservice support - https://phabricator.wikimedia.org/T359804#11668619 (10dcaro) 05Open→03Stalled [15:28:37] 10Toolforge (Toolforge iteration 25): [harbor,toolsbeta] for some reason maintain_harbor seems to not be cleaning up toolforge/* images - 
https://phabricator.wikimedia.org/T417894#11668630 (10dcaro) Waiting to get to the limit point of number of images to double check if the fixes did fix it. [15:31:26] 10Toolforge (Toolforge iteration 25): [harbor,tools] Harbor object usage in S3 is steadily increasing - https://phabricator.wikimedia.org/T418528#11668643 (10dcaro) Potential next steps to debug: * Checking in harbor DB if those objects are still referenced * Checking the timestamps of the objects to see if they... [15:33:11] 10Toolforge (Toolforge iteration 25): [harbor,tools] Harbor object usage in S3 is steadily increasing - https://phabricator.wikimedia.org/T418528#11668662 (10fnegri) Looking at the rate of increase, we have about 10 days before we reach the current limit. Shall we bump the limit up so we have more time to inves... [15:33:49] 10Toolforge (Toolforge iteration 25): [harbor,tools] Harbor object usage in S3 is steadily increasing - https://phabricator.wikimedia.org/T418528#11668664 (10fnegri) {F72484058} [15:35:12] 06cloud-services-team, 10Toolforge: [harbor,toolsbeta] for some reason maintain_harbor seems to not be cleaning up toolforge/* images - https://phabricator.wikimedia.org/T417894#11668673 (10dcaro) a:05Raymond_Ndibe→03None [15:35:25] 06cloud-services-team, 10Toolforge (Toolforge iteration 25): [image-config] deprecate and move all data to builds-api - https://phabricator.wikimedia.org/T409728#11668677 (10dcaro) a:05Raymond_Ndibe→03None [15:35:40] 06cloud-services-team, 10Toolforge: [jobs-api,webservice] Fetch images from builds-api - https://phabricator.wikimedia.org/T409725#11668680 (10dcaro) a:05Raymond_Ndibe→03None [15:35:57] 06cloud-services-team, 10Toolforge, 13Patch-For-Review: [builds-api] Add an endpoint to get all available images - https://phabricator.wikimedia.org/T409726#11668682 (10dcaro) a:05Raymond_Ndibe→03None [15:36:12] 06cloud-services-team, 10Toolforge: [image-config] deprecate and move all data to builds-api - 
https://phabricator.wikimedia.org/T409728#11668684 (10dcaro) [15:39:46] 06cloud-services-team, 10Toolforge: Update maintain_kubeusers to use the toolstate database - https://phabricator.wikimedia.org/T334629#11668710 (10dcaro) a:05Raymond_Ndibe→03None [15:44:52] 06cloud-services-team, 10Toolforge, 07Kubernetes: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237#11668746 (10dcaro) a:03Andrew [15:45:10] 06cloud-services-team, 10Toolforge (Toolforge iteration 25), 07Kubernetes: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237#11668749 (10dcaro) [15:47:32] (03approved) 10fnegri: Allow to force ARM kubernetes stack [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/303 (owner: 10volans) [15:57:23] (03update) 10dcaro: runtime::diff_with_running_job: temp conditional to force job version upgrade from v1 -> v2 [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/259 (https://phabricator.wikimedia.org/T359649) (owner: 10raymond-ndibe) [15:57:25] (03approved) 10dcaro: runtime::diff_with_running_job: temp conditional to force job version upgrade from v1 -> v2 [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/259 (https://phabricator.wikimedia.org/T359649) (owner: 10raymond-ndibe) [15:57:39] (03merge) 10dcaro: runtime::diff_with_running_job: temp conditional to force job version upgrade from v1 -> v2 [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/259 (https://phabricator.wikimedia.org/T359649) (owner: 10raymond-ndibe) [16:00:43] (03update) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-api: bump to 0.0.467-20260303155756-d163bf8c [repos/cloud/toolforge/toolforge-deploy] - 
10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1160 (https://phabricator.wikimedia.org/T359649) [16:00:45] (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-api: bump to 0.0.467-20260303155756-d163bf8c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1160 (https://phabricator.wikimedia.org/T359649) [16:02:12] (03update) 10raymond-ndibe: jobs-api: add jobs version migration script and docs [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/746 (https://phabricator.wikimedia.org/T359649) (owner: 10dcaro) [16:05:46] (03update) 10dcaro: jobs-api: add jobs version migration script and docs [repos/cloud/toolforge/toolforge-deploy] (bump_jobs-api) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/746 (https://phabricator.wikimedia.org/T359649) [16:05:57] (03update) 10dcaro: jobs-api: add jobs version migration script and docs [repos/cloud/toolforge/toolforge-deploy] (bump_jobs-api) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/746 (https://phabricator.wikimedia.org/T359649) [16:07:06] (03merge) 10dcaro: jobs-api: add jobs version migration script and docs [repos/cloud/toolforge/toolforge-deploy] (bump_jobs-api) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/746 (https://phabricator.wikimedia.org/T359649) [16:07:11] (03update) 10dcaro: jobs-api: bump to 0.0.467-20260303155756-d163bf8c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1160 (https://phabricator.wikimedia.org/T359649) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [16:07:58] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api 
[16:21:15] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api [16:24:55] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api [16:37:08] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api [16:46:12] (03update) 10fnegri: Allow to force ARM kubernetes stack [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/303 (owner: 10volans) [17:04:39] 06cloud-services-team, 10Data-Services, 07Documentation: Revise and reformat Portal:Data_Services - https://phabricator.wikimedia.org/T348024#11669208 (10TBurmeister) 05In progress→03Resolved Hi @Hugoa, the page layout looks really nice, and the information you chose to keep in each box makes sense t... [17:11:55] 10Cloud-VPS (Project-requests): Request creation of azwikimedia VPS project - https://phabricator.wikimedia.org/T417736#11669240 (10Andrew) Sorry about the slow process @Nemoralis. There are a few issues that I see with your request. 1) The email and survey issues seem like things that are best managed by exist... [17:17:00] 10Tool-campwiz-nxt, 10Google-Summer-of-Code (Google Summer of Code (2026)): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11669286 (10Divya.code) Hi @Nokib_Sarkar and @Tiven2240, I’m Divya, a 3rd-year engineering student from India with experience in React, JavaScript, and buildi... 
[17:27:11] 10Tool-wiktlexbot: Properly retrieve lemmas for sign languages - https://phabricator.wikimedia.org/T418890 (10Redmin) 03NEW [17:31:30] (03approved) 10dcaro: Allow to force ARM kubernetes stack [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/303 (owner: 10volans) [17:32:01] 10Tool-wiktlexbot: Properly retrieve lemmas for sign languages - https://phabricator.wikimedia.org/T418890#11669351 (10Redmin) p:05Triage→03Medium [17:42:22] (03update) 10dcaro: jobs-api: bump to 0.0.467-20260303155756-d163bf8c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1160 (https://phabricator.wikimedia.org/T359649) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [17:42:43] (03approved) 10dcaro: jobs-api: bump to 0.0.467-20260303155756-d163bf8c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1160 (https://phabricator.wikimedia.org/T359649) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [17:42:50] (03merge) 10dcaro: jobs-api: bump to 0.0.467-20260303155756-d163bf8c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1160 (https://phabricator.wikimedia.org/T359649) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [17:49:09] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version - https://phabricator.wikimedia.org/T359649#11669408 (10dcaro) All tools were migrated :) Logs left under `/home/dcaro/tool_... 
[17:49:32] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version - https://phabricator.wikimedia.org/T359649#11669411 (10dcaro) 05In progress→03Resolved [17:51:15] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version - https://phabricator.wikimedia.org/T359649#11669438 (10dcaro) List of tools that needed some manual intervention: ` 200... [17:54:08] 10Tool-wiktlexbot: Properly retrieve lemmas for sign languages - https://phabricator.wikimedia.org/T418890#11669465 (10Redmin) [17:59:24] (03update) 10taavi: Validate Gateway API HTTPRoute resources [repos/cloud/toolforge/ingress-admission] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/37 (https://phabricator.wikimedia.org/T418276) [18:01:28] 10wikitech.wikimedia.org, 06serviceops-radar, 06SRE, 13Patch-For-Review, 07SRE-Unowned: Redesign wikitech-static - https://phabricator.wikimedia.org/T376400#11669487 (10Andrew) 05Open→03Resolved [18:05:49] (03update) 10taavi: Validate Gateway API HTTPRoute resources [repos/cloud/toolforge/ingress-admission] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/37 (https://phabricator.wikimedia.org/T418276) [18:06:19] (03update) 10taavi: Validate Gateway API HTTPRoute resources [repos/cloud/toolforge/ingress-admission] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/37 (https://phabricator.wikimedia.org/T418276) [18:08:10] (03update) 10taavi: Deploy istio and a Gateway [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1143 (https://phabricator.wikimedia.org/T418274) [18:08:52] (03update) 10taavi: Validate Gateway API HTTPRoute 
resources [repos/cloud/toolforge/ingress-admission] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/37 (https://phabricator.wikimedia.org/T418276) [18:09:25] 06cloud-services-team, 10superset.wmcloud.org: Sunset superset.wmcloud.org - https://phabricator.wikimedia.org/T416373#11669528 (10fnegri) I tried to understand how many people are opening the existing superset dashboards. Unfortunately we only have logs for the past few days, but it looks like there is some o... [18:10:55] (03update) 10taavi: Allow tools to manage HTTPRoute resources [repos/cloud/toolforge/maintain-kubeusers] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/83 (https://phabricator.wikimedia.org/T418276) [18:13:58] (03merge) 10taavi: Validate Gateway API HTTPRoute resources [repos/cloud/toolforge/ingress-admission] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-admission/-/merge_requests/37 (https://phabricator.wikimedia.org/T418276) [18:14:18] (03update) 10taavi: Allow tools to manage HTTPRoute resources [repos/cloud/toolforge/maintain-kubeusers] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/83 (https://phabricator.wikimedia.org/T418276) [18:18:11] 06cloud-services-team, 10superset.wmcloud.org: Sunset superset.wmcloud.org - https://phabricator.wikimedia.org/T416373#11669589 (10fnegri) A slightly more accurate version: `lang=shell-session root@superset-bastion:~# kubectl logs pod/superset-5d5b4875f9-9nhz6 |awk '{print $4 " " $7}' |grep " /superset/dashbo... 
[18:18:52] (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: ingress-admission: bump to 0.0.85-20260303181419-22df37af [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1161 (https://phabricator.wikimedia.org/T418276) [18:19:17] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission [18:19:48] !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component ingress-admission [18:22:43] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission [18:23:04] !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component ingress-admission [18:23:11] FIRING: [2x] SystemdUnitDown: The systemd unit hdfs_rsync_mediawiki_content_history.service on node clouddumps1001 has been failing for more than two hours. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [18:24:53] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission [18:25:15] !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component ingress-admission [18:26:15] 06cloud-services-team, 10Toolforge: toolforge-deploy tests failure - https://phabricator.wikimedia.org/T418897 (10taavi) 03NEW [18:26:26] 06cloud-services-team, 10Toolforge: toolforge-deploy tests failure: Your local changes to the following files would be overwritten by checkout: components/jobs-api/2025_04_migration_of_all_jobs_to_version_2 - https://phabricator.wikimedia.org/T418897#11669628 (10taavi) [18:32:45] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission [18:34:02] 06cloud-services-team, 10Toolforge: toolforge-deploy tests failure: Your local changes to the following files would be overwritten by checkout: components/jobs-api/2025_04_migration_of_all_jobs_to_version_2 - https://phabricator.wikimedia.org/T418897#11669691 (10taavi) @dcaro To unblock other work, I have stas... 
[18:40:54] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component ingress-admission
[19:41:40] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node
[19:51:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-31 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[20:01:16] !log root@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0)
[20:01:48] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission
[20:09:58] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component ingress-admission
[20:21:28] RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance toolsbeta-test-k8s-etcd-31 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[20:28:27] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node
[20:50:45] !log root@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0)
[20:51:02] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node
[21:09:46] !log tools.cluebotng-review Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22642685900 (https://github.com/cluebotng/component-configs/commits/3cbfb68b3c0e7d97130ede1be762389f300234d2)
[21:09:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL
[21:11:18] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99)
[21:14:37] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[21:15:17] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[21:17:05] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[21:27:55] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[21:44:36] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[21:56:10] !log root@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[21:58:08] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[21:58:41] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:00:49] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:00:58] FIRING: JobsEmailerNoEmails: No emails sent in the last hour - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerNoEmails - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerNoEmails
[22:01:23] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:03:00] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:03:35] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:05:58] RESOLVED: JobsEmailerNoEmails: No emails sent in the last hour - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerNoEmails - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerNoEmails
[22:06:22] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:06:56] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:08:12] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node
[22:15:11] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99)
[22:16:17] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:16:51] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:23:11] FIRING: [2x] SystemdUnitDown: The systemd unit hdfs_rsync_mediawiki_content_history.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[22:29:06] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:40:03] !log root@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[22:40:43] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:41:19] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:47:01] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[22:48:57] Cloud-VPS (Quota-requests): Quota increases for gitlab-runners - https://phabricator.wikimedia.org/T418813#11671120 (dduvall) Sounds good to me. Our planned Zuul migration also involves using Magnum for untrusted workloads (see T396936 and related tasks), so yes let's have a cross-team meeting to sync up on...
[22:55:11] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[22:56:28] FIRING: PuppetAgentNoResources: No Puppet resources found on instance toolsbeta-test-k8s-etcd-33 on project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[23:00:59] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[23:10:15] !log root@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[23:36:35] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[23:37:08] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[23:48:58] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[23:49:31] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[23:51:06] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[23:51:39] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[23:52:35] VPS-project-Codesearch, collaboration-services: confd fails with "no such host" in SRV lookup from _etcd-client-ssl._tcp.codesearch.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T417458#11671461 (Dzahn) masked the confd service. this removes the "FATALs spam" (tail -f /var/log/syslog | grep...
[23:53:33] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node
[23:54:07] !log root@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[23:55:37] !log root@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node