[00:03:05] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-1 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesse
[00:05:55] <jinxer-wm>	 FIRING: MaxConntrack: Max conntrack at 84.34% on cloudvirt1067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:48:05] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-1 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesse
[00:55:56] <jinxer-wm>	 RESOLVED: MaxConntrack: Max conntrack at 82.82% on cloudvirt1067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[03:13:05] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[03:19:19] <jinxer-wm>	 FIRING: HighIOWaitStalling: High iowait detected on clouddumps1002:9100. - https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Shared_storage#Dumps - https://grafana.wikimedia.org/d/000000568/wmcs-dumps-general-view - https://alerts.wikimedia.org/?q=alertname%3DHighIOWaitStalling
[03:24:19] <jinxer-wm>	 RESOLVED: HighIOWaitStalling: High iowait detected on clouddumps1002:9100. - https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Shared_storage#Dumps - https://grafana.wikimedia.org/d/000000568/wmcs-dumps-general-view - https://alerts.wikimedia.org/?q=alertname%3DHighIOWaitStalling
[05:22:50] <wikibugs>	 cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install cloudcephosd10[48-51] - https://phabricator.wikimedia.org/T394333#10973902 (ayounsi) There is currently only one switch per rack, so I suggest we only use one uplink for now, and revisit it the day we have more.
[06:08:06] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[06:46:28] <wmcs-alerts>	 FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance runner-1033 in project gitlab-runners   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[07:01:31] <wmcs-alerts>	 FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 3 deleted instances on gitlab-runners-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[07:11:28] <wmcs-alerts>	 RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance runner-1033 in project gitlab-runners   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[07:24:48] <jinxer-wm>	 FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudgw1004:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:28:06] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[07:29:48] <jinxer-wm>	 RESOLVED: PuppetZeroResources: Puppet has failed generate resources on cloudgw1004:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:57:26] <wikibugs>	 (CR) Eugene233: "recheck" [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166054 (https://phabricator.wikimedia.org/T390397) (owner: Bovimacoco)
[08:15:33] <wikibugs>	 cloud-services-team, Toolforge: [toolforge-cli-gen] review the https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-gen-cli client as potential consolidation - https://phabricator.wikimedia.org/T398651#10974128 (Addshore) @dcaro want to schedule a call to walk through it all in more detail?
[08:19:14] <wikibugs>	 cloud-services-team, Toolforge: [toolforge-cli-gen] review the https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-gen-cli client as potential consolidation - https://phabricator.wikimedia.org/T398651#10974130 (dcaro) >>! In T398651#10974128, @Addshore wrote: > @dcaro want to schedule a call to...
[08:28:06] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[09:17:41] <wikibugs>	 (open) dcaro: Draft: DONOTMERGE: always auth as tf-test [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/72
[09:21:37] <wikibugs>	 (update) dcaro: Draft: DONOTMERGE: always auth as tf-test [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/72
[09:21:45] <wikibugs>	 (PS1) Essa237: Refined the landing page [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166348
[09:23:21] <wikibugs>	 (update) dcaro: Draft: DONOTMERGE: always auth as tf-test [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/72
[09:25:02] <wikibugs>	 (Abandoned) Essa237: [Fix] added a landing page [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1157586 (owner: Essa237)
[09:28:15] <wikibugs>	 wikitech.wikimedia.org, SRE, SRE-Access-Requests: Add Sowmya Guru to list of "WMDE group" approvers on Wikitech - https://phabricator.wikimedia.org/T398686 (Tobi_WMDE_SW) NEW
[09:29:57] <wikibugs>	 (open) taavi: logs: Move multi-pod fix from jobs-api to here [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/82 (https://phabricator.wikimedia.org/T398647)
[09:33:17] <wikibugs>	 wikitech.wikimedia.org, SRE, SRE-Access-Requests: Add Sowmya Guru to list of "WMDE group" approvers on Wikitech - https://phabricator.wikimedia.org/T398686#10974387 (Clement_Goubert)
[09:33:29] <wikibugs>	 wikitech.wikimedia.org, SRE, SRE-Access-Requests: Add Sowmya Guru to list of "WMDE group" approvers on Wikitech - https://phabricator.wikimedia.org/T398686#10974388 (Clement_Goubert)
[09:33:34] <wikibugs>	 (update) dcaro: Draft: DONOTMERGE: always auth as tf-test [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/72
[09:35:50] <wikibugs>	 (open) taavi: Draft: Use logging multi-pod fix moved to toolforge-weld [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/179 (https://phabricator.wikimedia.org/T398647)
[09:37:00] <wikibugs>	 wikitech.wikimedia.org, SRE, SRE-Access-Requests: Add Sowmya Guru to list of "WMDE group" approvers on Wikitech - https://phabricator.wikimedia.org/T398686#10974404 (Clement_Goubert) Open→In progress p:Triage→Medium @Tobi_WMDE_SW Can you or @sowmya.guru fill out the first part of the...
[09:39:55] <wikibugs>	 (update) taavi: Draft: Use logging multi-pod fix moved to toolforge-weld [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/179 (https://phabricator.wikimedia.org/T398647)
[09:40:55] <wikibugs>	 (update) dcaro: Draft: DONOTMERGE: always auth as tf-test [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/72
[09:44:42] <wikibugs>	 (update) taavi: logs: Move multi-pod fix from jobs-api to here [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/82 (https://phabricator.wikimedia.org/T398647)
[10:02:24] <wikibugs>	 cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: Move Kubernetes log source multi-pod handling from jobs-api to toolforge-weld - https://phabricator.wikimedia.org/T398647#10974500 (taavi)
[10:33:06] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[10:42:31] <wikibugs>	 (PS1) NkwadaNora: [fix]: created a pyproject.toml file at the root of the project, this tells tox to skip trying to build a Python distribution and just run your npm lint commands insteadsince the project is javascript, typeScript and not a python project [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166379
[10:51:33] <wikibugs>	 (Abandoned) NkwadaNora: rearrange the location of some files [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1152117 (owner: NkwadaNora)
[10:53:06] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[10:54:01] <wikibugs>	 (CR) Eugene233: "recheck" [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166379 (owner: NkwadaNora)
[10:55:29] <wmcs-alerts>	 FIRING: NfsAlmostFull: The NFS drive is over 85% capacity (currently 87.13%) at host paws-nfs-1 in project paws   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DNfsAlmostFull
[11:02:39] <wikibugs>	 (CR) NkwadaNora: [C:+1] [fix]: created a pyproject.toml file at the root of the project, this tells tox to skip trying to build a Python distribution and just run y [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166379 (owner: NkwadaNora)
[11:05:22] <wikibugs>	 (CR) Eugene233: [C:+2] [fix]: created a pyproject.toml file at the root of the project, this tells tox to skip trying to build a Python distribution and just run y [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166379 (owner: NkwadaNora)
[11:28:06] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[12:31:40] <wikibugs>	 (open) taavi: Query logs from Loki [repos/cloud/toolforge/jobs-api] (taavi/logging) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/180 (https://phabricator.wikimedia.org/T398645)
[12:36:01] <wikibugs>	 (update) taavi: Query logs from Loki [repos/cloud/toolforge/jobs-api] (taavi/logging) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/180 (https://phabricator.wikimedia.org/T398645)
[12:38:41] <jinxer-wm>	 FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:42:39] <wikibugs>	 cloud-services-team, Cloud-VPS (Project-requests): Request creation of wikidata-deleted VPS project - https://phabricator.wikimedia.org/T398254#10975074 (taavi) a:taavi
[12:42:46] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 wikidata-deleted START - Cookbook wmcs.vps.create_project for project wikidata-deleted in eqiad1 (T398254)
[12:42:47] <stashbot>	 taavi@cloudcumin1001: Unknown project "wikidata-deleted"
[12:42:48] <stashbot>	 T398254: Request creation of wikidata-deleted VPS project - https://phabricator.wikimedia.org/T398254
[12:43:25] <wikibugs>	 (open) group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49: projects: added project wikidata-deleted [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/256 (https://phabricator.wikimedia.org/T398254)
[12:44:08] <wikibugs>	 (update) taavi: Query logs from Loki [repos/cloud/toolforge/jobs-api] (taavi/logging) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/180 (https://phabricator.wikimedia.org/T398645)
[12:44:37] <wikibugs>	 (merge) taavi: projects: added project wikidata-deleted [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/256 (https://phabricator.wikimedia.org/T398254) (owner: group_199_bot_333a6c67971a471aeb1cf0b14ccf9f49)
[12:47:06] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 wikidata-deleted END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project wikidata-deleted in eqiad1 (T398254)
[12:47:07] <stashbot>	 taavi@cloudcumin1001: Unknown project "wikidata-deleted"
[12:49:44] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 wikidata-deleted START - Cookbook wmcs.vps.create_project for project wikidata-deleted in eqiad1 (T398254)
[12:49:48] <stashbot>	 T398254: Request creation of wikidata-deleted VPS project - https://phabricator.wikimedia.org/T398254
[12:51:18] <wikibugs>	 cloud-services-team, Cloud-VPS: Cloud VPS project creation cookbook times out really often - https://phabricator.wikimedia.org/T398712 (taavi) NEW
[12:51:18] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 wikidata-deleted END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project wikidata-deleted in eqiad1 (T398254)
[12:53:08] <wikibugs>	 cloud-services-team, Cloud-VPS (Project-requests), Patch-For-Review: Request creation of wikidata-deleted VPS project - https://phabricator.wikimedia.org/T398254#10975101 (taavi) Open→Resolved This project has been created. @bovlb: please make sure that you are subscribed to [[ https://li...
[12:54:07] <wmcs-alerts>	 FIRING: HarborComponentDown: A Harbor component is down. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[12:59:47] <icinga-wm>	 PROBLEM - toolschecker: NFS read/writeable on labs instances on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 504 Gateway Time-out - string OK not found on http://checker.tools.wmflabs.org:80/nfs/home - 324 bytes in 60.005 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[13:01:25] <icinga-wm>	 RECOVERY - toolschecker: NFS read/writeable on labs instances on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 35.375 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[13:03:06] <wmcs-alerts>	 FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[13:13:06] <wmcs-alerts>	 FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[13:18:06] <wmcs-alerts>	 FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[13:23:06] <wmcs-alerts>	 FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[13:24:49] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-36
[13:26:07] <wikibugs>	 cloud-services-team, Toolforge: toolsbeta harbor disk full - https://phabricator.wikimedia.org/T398715 (taavi) NEW p:Triage→High
[13:28:06] <wmcs-alerts>	 FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[13:30:39] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-36
[13:33:06] <wmcs-alerts>	 RESOLVED: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-36 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProce
[13:50:45] <wikibugs>	 (approved) marostegui: Update ES switchover script [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/11 (https://phabricator.wikimedia.org/T397628) (owner: ladsgroup)
[14:08:05] <wmcs-alerts>	 FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-24 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[14:38:05] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-12 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[14:44:37] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-24
[14:48:41] <jinxer-wm>	 RESOLVED: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:56:18] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-24
[14:58:04] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-12 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[15:03:05] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-12 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[15:08:04] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-12 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[15:33:05] <wmcs-alerts>	 RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-24 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[16:13:27] <wikibugs>	 cloud-services-team, Toolforge: toolsbeta harbor disk full - https://phabricator.wikimedia.org/T398715#10975666 (Raymond_Ndibe) a:Raymond_Ndibe
[16:38:36] <wikibugs>	 Tools: [versions] Link to MediaWiki release notes for deployed versions - https://phabricator.wikimedia.org/T398725 (bd808) NEW
[17:03:47] <wikibugs>	 (merge) ladsgroup: Update ES switchover script [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/11 (https://phabricator.wikimedia.org/T397628)
[17:39:05] <wmcs-alerts>	 FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-57 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[18:23:21] <wikibugs>	 cloud-services-team, Toolforge: toolsbeta harbor disk full - https://phabricator.wikimedia.org/T398715#10975938 (Raymond_Ndibe) Managed to get it down to 50% by manually cleaning up some images and running garbage collection (took some fiddling to get gc to run because gc needs redis and redis was down b...
[18:24:00] <wikibugs>	 VPS-project-Phabricator, collaboration-services, Release-Engineering-Team: Add the 'other assignee' field to the Phabricator test instance - https://phabricator.wikimedia.org/T398732 (A_smart_kitten) NEW
[18:24:10] <wikibugs>	 cloud-services-team, Toolforge (Toolforge iteration 21): toolsbeta harbor disk full - https://phabricator.wikimedia.org/T398715#10975962 (Raymond_Ndibe)
[18:24:24] <wikibugs>	 cloud-services-team, Toolforge (Toolforge iteration 21): toolsbeta harbor disk full - https://phabricator.wikimedia.org/T398715#10975965 (Raymond_Ndibe) Open→In progress
[18:49:02] <wikibugs>	 (PS1) Krinkle: IPInfo: Improve and simplify getAsInfo implementation further [labs/tools/guc] - https://gerrit.wikimedia.org/r/1166433
[18:50:31] <wikibugs>	 (CR) Krinkle: [C:+2] IPInfo: Improve and simplify getAsInfo implementation further [labs/tools/guc] - https://gerrit.wikimedia.org/r/1166433 (owner: Krinkle)
[18:51:27] <wikibugs>	 (Merged) jenkins-bot: IPInfo: Improve and simplify getAsInfo implementation further [labs/tools/guc] - https://gerrit.wikimedia.org/r/1166433 (owner: Krinkle)
[19:49:04] <wmcs-alerts>	 FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-47 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[23:15:15] <wikibugs>	 (PS1) Jacob4code: Updated README Elaborate steps to run tool on local setup [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166450
[23:16:41] <wikibugs>	 (CR) Jacob4code: "Hi can you please review this ?" [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1166450 (owner: Jacob4code)