[08:40:23] GitLab (Infrastructure), collaboration-services: upgrade gitlab hosts to bookworm - https://phabricator.wikimedia.org/T399306#10999160 (Jelto) p:Triage→High a:Jelto
[09:06:50] GitLab (Infrastructure), collaboration-services: upgrade gitlab hosts to bookworm - https://phabricator.wikimedia.org/T399306#10999234 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1003 for host gitlab2003.wikimedia.org with OS bookworm
[09:29:46] GitLab (Project Migration), cloud-services-team, Patch-For-Review, Toolforge (Toolforge iteration 21): Migrate misctools package to GitLab - https://phabricator.wikimedia.org/T398202#10999330 (dcaro)
[09:35:05] GitLab (Project Migration), cloud-services-team, Patch-For-Review, Toolforge (Toolforge iteration 21): Migrate misctools package to GitLab - https://phabricator.wikimedia.org/T398202#10999387 (dcaro) a:dcaro
[09:35:09] GitLab (Project Migration), cloud-services-team, Patch-For-Review, Toolforge (Toolforge iteration 21): Migrate misctools package to GitLab - https://phabricator.wikimedia.org/T398202#10999390 (dcaro) Open→Resolved
[09:46:16] GitLab (Infrastructure), collaboration-services: upgrade gitlab hosts to bookworm - https://phabricator.wikimedia.org/T399306#10999476 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1003 for host gitlab2003.wikimedia.org with OS bookworm completed: - gitlab2003 (**PASS**)...
[09:48:03] GitLab (Infrastructure), collaboration-services: upgrade gitlab hosts to bookworm - https://phabricator.wikimedia.org/T399306#10999503 (Jelto)
[12:11:20] Hi! our tofu runs on CI started failing due to being unable to contact the provider
[12:11:31] `https://proxy-eqiad1.wmflabs.org:5668/dynamicproxy-api/v1/tools/mapping/prometheus.svc.toolforge.org`
[12:11:43] ex. https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/jobs/562476
[12:12:22] iirc there was some kind of outbound traffic filtering enabled not long ago or something? Maybe it's missing that entry?
[12:12:35] (any other debugging help is appreciated too :) )
[12:16:03] I think I found something :), will send a patch soon
[12:17:05] GitLab (Infrastructure), collaboration-services, Patch-For-Review: upgrade gitlab hosts to bookworm - https://phabricator.wikimedia.org/T399306#11000072 (Jelto)
[12:17:35] GitLab (Infrastructure), collaboration-services, Patch-For-Review: upgrade gitlab hosts to bookworm - https://phabricator.wikimedia.org/T399306#11000074 (Jelto) The new test instance `gitlab-1002` is running `bookworm`. I'll do a few more tests but it looks promising so far.
[12:20:00] Here it is https://gerrit.wikimedia.org/r/c/operations/puppet/+/1169098
[12:20:18] proxy-eqiad1.wmflabs.org/proxy-eqiad1.wmcloud.org is listed only with port 443, 5668 should be added
[12:20:40] ah yes exactly :)
[12:21:32] looks good to me, thanks for uploading the patch. Should I merge it?
[12:22:12] give me a sec, I used the wrong dns name (deprecated)
[12:22:14] updating
[12:22:39] proxy-eqiad1.wmflabs.org and proxy-eqiad1.wmcloud.org both resolve to the same IP. But yes let's use the newer one
[12:22:59] huh, the codfw1 does not resolve :/
[12:23:40] then just use eqiad1 for now?
[12:24:05] changed it, will update the name when we add the entry for codfw too
[12:24:27] both are needed, one to manage openstack in eqiad, one to manage it in codfw (development environment)
[12:25:06] ah in codfw only the old wmflabs name exists..
[12:25:13] yep looks like, needs fixing :)
[12:28:23] +1ed, let me know if you need help merging this change
[12:28:56] is there anything else I have to do after the merge?
[12:29:54] no not really, wait for wmcs puppet master and puppet runs (10+30 minutes max). If this is urgent I can force a puppet run
[12:31:14] ack, not urgent, I can wait, thanks!
[12:31:27] great :)
[13:40:09] hmm... it did not work, something I noticed is that the ip that the workers resolve the hostname to is internal 172.16.16.6, maybe the puppet name resolution for the firewall rules uses the external one?
[13:42:24] hmm... the runner also resolves to the internal ip
[13:42:26] https://www.irccloud.com/pastebin/zQQCIMso/
[13:42:46] but the connection fails from ci :/ https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/jobs/562524
[13:48:10] It looks like ferm was not restarted. There is no rule for port 5668 in iptables. After restarting ferm there is one.
[13:48:10] Let me restart ferm on all runners, one sec
[13:48:56] okok
[13:49:30] ok I restarted ferm, can you try again?
[13:53:58] on it, one already passed 👍
[13:54:15] we often get timeouts when pulling things from github, is that a known issue?
[13:54:34] (any workaround besides taking it off github? xd, it's the terraform provider in this case)
[13:56:14] the checks now worked well thanks!
[13:57:35] ex. of the github timeout thing (happened quite often today) https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/jobs/562535
[13:59:15] great, nice to hear that
[13:59:16] Regarding the Github timeouts I have to check. I'm not aware of any throttling we have in place for github
[21:48:08] GitLab (Administration, Settings & Policy), MediaWiki-Gerrit-Group-Requests: Remove +2 rights from SD0001 for SecurePoll - https://phabricator.wikimedia.org/T399508 (Snaevar) NEW
[22:31:04] GitLab (Administration, Settings & Policy), MediaWiki-Gerrit-Group-Requests: Remove +2 rights from SD0001 for SecurePoll - https://phabricator.wikimedia.org/T399508#11002637 (Zabe) > Violation 1: > Creation of a patch in T396441, while the underlying discussion at https://en.wikipedia.org/wiki/Wikipedia:...
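
Note on the 12:11 to 13:56 thread (port 5668 blocked from the CI runners): the two checks done by hand there, "what does the hostname resolve to" and "does a TCP connection to the port succeed", can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool anyone used in the conversation; the function names `resolve_ipv4` and `can_connect` are hypothetical, and it says nothing about why a connection fails (missing ferm/iptables rule, routing, or the service itself).

```python
import socket


def resolve_ipv4(host: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses that `host` resolves to.

    Raises socket.gaierror if the name does not resolve.
    """
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Any OSError (refused, timed out, unreachable, name not resolving) counts as failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from a runner, `resolve_ipv4("proxy-eqiad1.wmcloud.org")` would show which address the runner sees, and `can_connect("proxy-eqiad1.wmcloud.org", 5668)` would show whether the outbound connection gets through; comparing the second result against port 443 helps separate a per-port filtering problem from a general reachability problem.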