[00:04:43] E: Failed to fetch http://httpredir.debian.org/debian/pool/main/k/keyutils/keyutils_1.5.9-9_amd64.deb Cannot initiate the connection to httpredir.debian.org:80 (2001:4f8:1:c::15). - connect (101: Network is unreachable) [IP: 2001:4f8:1:c::15 80]
[00:05:37] seems the network is broken in mw vagrant?
[00:16:50] root@stretch:/home/vagrant#
[00:16:55] looks broken based on the name
[00:22:38] paladox: have to use a http_proxy?
[00:22:55] I don't think so, at least this worked on stretch without a http_proxy
[00:23:04] it works with host but ping fails
[00:23:16] what about curl
[00:23:41] curl -vvv
[00:24:21] i haven't tried curl.
[00:33:30] paladox: not familiar with mw vagrant, but maybe it's the IPv6 address returned by the resolver? `[IP: 2001:4f8:1:c::15 80]`
[00:33:54] oh
[00:33:58] * paladox tries ipv4
[00:34:24] happens with ipv4 too
[00:34:25] ping -4 google.com
[00:34:25] PING google.com (172.217.5.238) 56(84) bytes of data.
[00:34:25] ^C
[00:34:25] --- google.com ping statistics ---
[00:34:26] 3 packets transmitted, 0 received, 100% packet loss, time 2047ms
[00:35:16] well, that's good to know at least :)
[00:36:47] heh
[00:37:09] looking in default config
[00:37:14] it uses virbr0
[00:37:19] looking with ip addr i see:
[00:37:41] https://phabricator.wikimedia.org/P9558
[00:38:11] how about `brctl show`?
[00:39:05] * paladox does
[00:39:30] jeh https://phabricator.wikimedia.org/P9559
[00:42:51] how about `sysctl net.ipv4.ip_forward`?
[00:43:01] * jeh checks https://gerrit.wikimedia.org/r/c/operations/puppet/+/546373
[00:43:08] sysctl net.ipv4.ip_forward
[00:43:08] net.ipv4.ip_forward = 1
[00:43:50] well pinging gerrit-test6 works inside the container
[00:46:14] do you mind if I connect to gerrit-test6 and take a look?
[00:47:05] jeh sure
[00:47:36] jeh i see this for lxc-net:
[00:47:51] https://phabricator.wikimedia.org/P9560
[00:48:04] "DHCP, sockets bound exclusively to interface lxcbr0"
[00:48:15] i guess it should be virbr0?
[00:52:05] it seems odd that there are no interfaces attached to lxcbr0 (seen with brctl show)
[00:55:05] hmm
[01:04:25] it looks like the containers are set to use virbr0, but the iptables rules only forward lxcbr0
[01:04:33] oh!
[01:04:53] well seems like i broke the vps
[01:04:57] by doing a network change
[01:05:03] https://wiki.debian.org/LXC/SimpleBridge
[01:05:10] I wonder if we need to have lxc-net running, libvirt should handle the network here alone I think
[01:05:41] jeh are you still ssh'd in?
[01:05:50] no, I lost connection
[01:05:59] oh
[01:06:17] looking at the puppet module, I think it was following this setup: https://wiki.debian.org/LXC#Network_setup_in_buster
[01:06:29] yeh
[01:08:56] I'm connected to the console of gerrit-test6 now
[01:09:06] ah
[01:09:20] jeh can you remove the change i did to /etc/networking/interface* please?
[01:09:27] (was at the bottom)
[01:10:36] sure, the lxcbr0 section?
[01:10:39] yup
[01:10:41] please :)
[01:10:44] and thanks!
[01:11:35] no problem, going to restart to flush out all the network things
[01:12:26] thanks!
[01:13:13] it's back, the iptables rules look a lot better now
[01:13:21] thanks!
[01:14:14] jeh much better!
[01:14:22] i can ping google now!
[01:14:24] thank you!
[01:14:26] great
[01:15:26] prior to the reboot it was missing rules like `-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT`
[01:15:39] not sure why, but it's something to check if it happens again
[01:15:45] oh
[01:29:26] jeh hmm seems broken again
[01:32:49] I’ll take a look, did it break after a new container was created? Or just stop on its own?
[01:33:03] new container
[01:33:17] i destroyed the other one as it was complaining about port 8080 already in use
[01:35:35] looks like we lost all the iptables rules again (checked with `iptables-save -c`)
[01:35:39] oh
[01:35:57] could puppet have overwritten it? Or is it vagrant/
[01:35:58] *?
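Editor's note: the check described above (looking in `iptables-save -c` output for FORWARD rules that mention the bridge the containers actually use) could be scripted roughly as follows. This is only a sketch: the function name `forward_rules_for` is made up for illustration, and it assumes the short `-i` / `-o` flags that `iptables-save` normally prints.

```python
def forward_rules_for(iptables_save_text, interface):
    """Return FORWARD-chain rules from `iptables-save` output that
    reference `interface` through -i or -o."""
    matches = []
    for line in iptables_save_text.splitlines():
        tokens = line.split()
        # `iptables-save -c` prefixes each rule with a [packets:bytes] counter.
        if tokens and tokens[0].startswith("["):
            tokens = tokens[1:]
        # Skip table headers, chain policies, comments and other chains.
        if tokens[:2] != ["-A", "FORWARD"]:
            continue
        # Look at each flag together with the value that follows it.
        if any(flag in ("-i", "-o") and value == interface
               for flag, value in zip(tokens, tokens[1:])):
            matches.append(line.strip())
    return matches
```

An empty result for `virbr0` would match the "lost all the iptables rules" state seen in the log, while a healthy host would be expected to return rules like the conntrack one quoted at 01:15:26.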
[01:36:28] feels like it's something between lxc and libvirt
[01:36:38] ok
[01:40:07] I'd like to disable puppet, edit `/etc/default/lxc-net` then run `systemctl stop lxc-net && systemctl restart libvirtd`
[01:40:15] +1, thanks
[01:40:17] *!
[01:42:23] jeh works!
[01:42:30] at least root@mediawiki-vagrant:/home/vagrant#
[01:42:38] @mediawiki-vagrant is correct
[01:43:13] ok cool, I completed the tasks. want to do a reboot (to ensure we have a clean config) then run a few tests?
[01:43:28] yup
[01:43:47] few tests?
[01:44:18] just the same vagrant stuff you've been trying to do :)
[01:44:33] ah!
[01:44:41] so mwvagrant destroy and mwvagrant up
[01:45:13] disable lxc-net too `systemctl disable lxc-net`
[01:45:39] sure!
[01:45:51] OK, it's back online
[01:45:58] * paladox does
[01:46:19] jeh i run `sudo service lxc-net stop` too, right?
[01:46:37] nope, that should be disabled now
[01:46:44] libvirtd should handle the networking
[01:46:50] oh
[01:46:52] * jeh hopes
[01:46:55] it seemed to have started
[01:47:01] i just disabled it and stopped it
[01:47:15] root@gerrit-test6:/home/paladox# systemctl disable lxc-net
[01:47:15] Synchronizing state of lxc-net.service with SysV service script with /lib/systemd/systemd-sysv-install.
[01:47:15] Executing: /lib/systemd/systemd-sysv-install disable lxc-net
[01:50:02] paladox: In the last week I have set up 5 MediaWiki-Vagrant on Buster instances using the Puppet role with literally no problems. Are you mixing MediaWiki-Vagrant and other roles on the same instance? What manual things have you done to this server?
[01:50:36] bd808 i haven't done anything manual apart from adding a local.yaml file && also copying the gerrit.git.wmflabs.org cert to /etc/acme.
[01:50:47] I also have the gerrit role applied along with lvm (srv)
[01:50:54] that's really weird :/
[01:51:13] oh, wait. You have the prod gerrit role on the same instance?
[01:51:17] yup
[01:51:39] that seems like a likely cause of issues
[01:51:45] oh
[01:51:55] oh!!
[01:51:56] ferm and LXC fighting over iptables rules
[01:51:56] hold on
[01:52:00] i think i know why
[01:52:05] wrong ip
[01:52:32] since i changed from gerrit-test5 -> 6 i forgot to change the ip
[01:52:57] gerrit::service::ipv4: 172.16.4.51
[01:53:56] yeah, iptables rules are gone again. Ferm sounds like the likely culprit
[01:54:52] https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/8b299229d17aa8a2360ef4bb4fb00254a2b29a23%5E%21/#F0
[01:55:24] jeh bd808 https://phabricator.wikimedia.org/P9561
[01:55:28] I'll re-enable puppet and revert my local changes
[01:55:46] jeh ah too late :)
[01:55:51] re-enabled puppet and rebooted!
[01:55:58] +1 :)
[01:57:00] I need to head out, but I'll circle back a bit later
[01:57:14] thanks!
[02:00:30] hmm
[02:00:36] still doesn't work :(
[02:00:44] thought changing the ip would fix it
[11:38:01] !log toolsbeta new k8s: refresh deployment for nginx-ingress with latest changes from puppet
[11:38:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[11:58:13] !log toolsbeta adding `profile::toolforge::bastion::nproc: 100` to puppet prefix `toolsbeta-sgebastion` (T236202)
[11:58:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[11:58:17] T236202: Modify webservice and maintain-kubeusers to allow switching to the new cluster - https://phabricator.wikimedia.org/T236202
[12:58:37] !log git rebuilding gerrit-test6 as gerrit-test7
[12:58:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL
[13:13:03] paladox: how's it going? any luck with the container networking?
[13:13:23] jeh i'm currently rebuilding gerrit-test6 as 7 (without gerrit first)
[13:13:25] seems to be working
[13:13:32] i'll add gerrit after it's done!
[13:18:07] !log puppet-diff syncing puppet facts from puppetmaster1001.eqiad.wmnet
[13:18:08] arturo: Unknown project "puppet-diff"
[13:18:13] !log puppet-diffs syncing puppet facts from puppetmaster1001.eqiad.wmnet
[13:18:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Puppet-diffs/SAL
[13:21:49] !log puppet-diffs syncing puppet facts from tools-puppetmaster-01.eqiad.wmflabs
[13:21:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Puppet-diffs/SAL
[13:57:44] ==> default: Notice: /Stage[main]/Role::Echo/Mediawiki::User[Selenium Echo user a]/Mediawiki::Maintenance[mediawiki_user_Selenium Echo user a_wiki]/Exec[mediawiki_user_Selenium Echo user a_wiki]/returns: [efe89eb135db7c54efe9e69b] [no req] PasswordError from line 137 of /vagrant/mediawiki/maintenance/createAndPromote.php: The password entered is in a list of very commonly used passwords. Please choose a more unique password.
[14:47:04] jeh it seems that trying to create an account through https://ldapauth-gitldap.wmflabs.org/wiki/Main_Page fails. (i presume it's failing to add it to ldap for some reason)
[14:50:11] Nov 08 14:49:56 mediawiki-vagrant slapd[21594]: conn=1021 op=3 do_add: invalid dn (cn:caseExactMatch:=Paladox,ou=People,dc=wmftest,dc=net)
[15:00:20] I don't know what the ldap schema looks like for ldapauth-gitldap, do you know if it's using the same DN defined in mw-vagrant?
[15:00:30] Yeh, it should
[15:00:43] (it's a fresh clone with no modifications apart from adding the local.yaml file)
[15:03:24] I had this issue with gerrit-test5 too.
[15:14:44] oh!
[15:14:50] jeh found the issue
[15:15:05] $wgLDAPSearchAttributes = array( 'ldap' => 'cn:caseExactMatch:' );
[15:15:12] needs to be $wgLDAPSearchAttributes = array( 'ldap' => 'cn:caseExactMatch' );
[15:15:13] i think
[15:15:20] at least i did $wgLDAPSearchAttributes = array( 'ldap' => 'cn' );
[15:17:27] I'm not sure about the fix, but your user account is indeed there now.
`ldapsearch -x -LLL -b ou=People,dc=wmftest,dc=net cn=Paladox`
[15:17:43] yup
[15:17:54] cn:caseExactMatch:=Paladox,ou=People,dc=wmftest,dc=net
[15:17:58] would that be valid syntax?
[15:18:03] the := part?
[15:18:55] hmm
[15:19:08] seems to work logging in with := but fails when creating an account
[15:19:28] yeah, I think so
[15:21:21] tested caseExactMatch syntax with ldapsearch `ldapsearch -x -b ou=People,dc=wmftest,dc=net "(cn:caseExactMatch:=Paladox)"`
[15:22:01] oh
[15:26:55] jeh can you write with it?
[15:28:54] might be able to use ldapmodify
[15:29:40] i guess if caseExactMatch doesn't match it throws something? Or at least the ldap extension thinks so.
[17:15:24] !log tools pushed new buster images with the prefix name "toolforge"
[17:15:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:22:28] !log tools.quickcategories deployed 32bb3fcae6 (silence bs4 warning)
[17:22:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[17:46:06] !log deployment-prep Upgraded php7.2 on deployment-mwmaint01, was too old for MW
[17:46:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[18:36:58] !log tools pushed buster-sssd images to the docker repo
[18:37:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[18:37:37] !log tools pushed new webservice package supporting buster containers to repo T230961
[18:37:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[18:37:40] T230961: Install a version of Python newer than 3.5.3 in Toolforge - https://phabricator.wikimedia.org/T230961
[18:40:16] !log tools pushed new webservice package to the bastions T230961
[18:40:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:36:56] !log toolsbeta rebooted the proxy server just in case that fixes something.
[19:36:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[20:34:36] !log language Added 10G to RAM quota to allow spinning up a bigmem instance T237354
[20:34:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Language/SAL
[20:34:40] T237354: "bigram" instance for Language team - https://phabricator.wikimedia.org/T237354
[21:53:56] does anyone know how I would get Jenkins to run on PHP 7.3? https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GraphQL/+/547844
[22:05:45] most likely someone in -releng would know
[22:39:49] Krenair thanks, I'm asking in there
[22:47:32] !log tools adding rsync::server::wrap_with_stunnel: false to the tools-docker-registry-03/4 servers to unbreak puppet
[22:47:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[23:05:08] bstorm_: tell cdanis about that
[23:05:52] 👍🏻
[23:06:00] There have been a few.
[23:07:35] i've had that too.
[23:10:29] best leave a comment on https://phabricator.wikimedia.org/T237424
[23:10:59] https://gerrit.wikimedia.org/r/c/operations/puppet/+/547527
[23:41:05] paladox: sudo -u broken on buster?
[23:41:17] Seems so per your finding.
[23:41:27] It's using your local user rather than using the set user.
[23:41:43] unfortunate. it seems to break crons
[23:43:00] paladox: https://linux.die.net/man/5/sssd-sudo hmmm
[23:43:10] yup
[23:44:44] i don't want to move the crons to root if i can avoid it
[23:46:58] sigh.. ariel's law strikes again
[23:47:47] https://twitter.com/GVKitchen/status/1176249436768485378
[23:47:52] err wrong channel
[23:48:01] "it never takes 5 minutes" and already migrated to buster but now this
[23:48:04] paladox: hehe
[23:48:11] :P
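Editor's note on the `invalid dn` error from 14:50:11: a string like `cn:caseExactMatch:` is fine on the left side of a search filter (it is the extensible-match form, which is why the `ldapsearch` test at 15:21:21 worked), but the attribute type in a DN must be a plain short name or a numeric OID. The sketch below illustrates that difference. It assumes, as the slapd log suggests but the log does not confirm, that the same configured string was used to build both the filter and the DN; the helper names are hypothetical, and values are not escaped.

```python
import re

# Attribute type allowed in a DN: a short name (letter, then letters,
# digits or hyphens) or a dotted numeric OID.
_DN_ATTR_TYPE = re.compile(r"^(?:[A-Za-z][A-Za-z0-9-]*|\d+(?:\.\d+)+)$")


def is_valid_dn_attribute_type(attr):
    """True if `attr` can appear as the attribute type of an RDN."""
    return bool(_DN_ATTR_TYPE.match(attr))


def build_search_filter(attr, value):
    """Search filter; `attr` may carry a matching rule such as
    'cn:caseExactMatch:' (extensible match)."""
    return f"({attr}={value})"


def build_dn(attr, value, base_dn):
    """DN for a new entry; rejects attribute strings that are only
    valid inside a search filter."""
    if not is_valid_dn_attribute_type(attr):
        raise ValueError(f"not a valid DN attribute type: {attr!r}")
    return f"{attr}={value},{base_dn}"
```

With `cn:caseExactMatch:` the filter builder produces the same filter that was tested with `ldapsearch`, while `build_dn` refuses it; with plain `cn` both succeed, which is consistent with the workaround reported at 15:15:20.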