[17:01:18] * anarcat waves
[17:01:28] i wonder if https://rachelbythebay.com/w/2020/10/26/num/ could be a good addition to Cumin
[17:01:28] :)
[17:02:08] anarcat: https://gerrit.wikimedia.org/r/c/operations/software/cumin/+/636729 ;)
[17:02:20] haha
[17:02:25] this world is small :D
[17:02:31] volans always one step ahead of me
[17:02:45] actually was bblack that pointed that out to me
[17:02:51] i also found out about spicerack and "cookbooks" recently and my mind was kind of blown
[17:03:02] there was a small discussion about the thousand separator point, feel free to comment in the CR
[17:03:07] i have been using Fabric to do stuff like that at tpo recently, and getting increasingly frustrated with it
[17:04:13] that could be counter-intuitive, yes
[17:04:15] thanks, I just jumped into a meeting, but can give more info if you want later, keep in mind that spicerack/cookbooks are really tied to WMF infra
[17:04:18] should tolerate them, but not enforce them
[17:04:22] yeah
[17:04:34] would be happy to chat about fabric (and mitogen, actually) later if you're interested
[17:04:40] have a good meeting
[17:05:59] thx
[17:06:02] ttyl
[18:42:43] * volans back anarcat
[19:35:32] hey volans
[19:36:16] volans: so does the cookbook stuff execute Python code on remote servers (like mitogen), or does it just run shell commands (like fabric)?
[19:36:40] i ask because one thing i find frustrating with Fabric is that i end up writing a lot of python code that's just a wrapper around shell
[19:36:46] i'd much rather have real python run on remote servers
[19:37:34] so i could use mitogen for that but (a) the future of mitogen is uncertain (https://github.com/dw/mitogen/issues/751 although less so than fabric) and (b) mitogen doesn't have a good inventory / commandline story (like fabric does, but cumin libs could help here)
[19:37:43] volans: also, how's cumin debian packaging going?
:)
[19:40:40] so, cookbooks are just pieces of python code that use mostly spicerack as a library and run on the centralized hosts (cumin hosts) to perform tasks that interact with our infrastructure, so let's say depool a server from the load balancer talking to etcd, running some remote command on the hosts via cumin (that is integrated into spicerack), post something on a phabricator task, etc..
[19:41:13] so the tl;dr of remote commands is ssh via cumin, but on the other side we might run a more complex python script if needed
[19:41:26] so like fabric
[19:41:39] ie. you can run a python command on the other side, but only if that code was already deployed there
[19:41:44] yes
[19:41:46] right
[19:42:10] technically you could wrap a python script and send it over, but it's not RPC
[19:42:51] yeah, i've been considering that, but at that point i'd probably rewrite with mitogen
[19:43:03] or just make this a lib that i ship everywhere
[19:43:14] which i am trying to avoid because i am using this to bootstrap hosts
[19:44:08] not your problem, obviously :)
[19:44:19] lol, sure :) but most of the remote commands are abstracted via the spicerack modules, so for example if you look at a decommission cookbook
[19:44:19] you folks use FAI for bootstrap, do i remember this right?
[19:44:22] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/cookbooks/+/refs/heads/master/cookbooks/sre/hosts/downtime.py
[19:44:32] line 52 and later
[19:44:55] that's pretty
[19:45:01] it's kinda self-explanatory in what it does
[19:45:10] but it is probably less pretty underneath the spice rack
[19:45:16] and that's where all the bugs live
[19:45:24] but that's all well tested :D
[19:45:25] where all the crumbs from the last meal end up ;)
[19:45:28] neat
[19:45:31] yeah :D
[19:45:34] like unit tested you mean?
[19:45:48] yes, integration tests for this is super complex
[19:45:59] and we don't have a proper "staging" with all the bits right now
[19:46:00] for sure
[19:46:06] unit tests might be hard, even
[19:46:11] i haven't written any on top of fabric
[19:46:29] as for previous questions:
[19:46:50] - debian packaging, still TBD to send it upstream, we should resume that
[19:47:56] - FAI, no we don't use it
[19:48:21] (and my laptop just crashed, from mobile)
[19:48:52] fun times
[19:49:03] we use preseed extensively and a reimage script that predates spicerack that I'm converting to cookbook for the other bits
[19:49:25] that basically starts the first puppet run
[19:49:42] oh wow, so partman wrangling
[19:49:45] i gave up on that
[19:49:54] i ended up just writing my own installer, which is kind of crazy
[19:50:00] but i'm reusing bits of FAI
[19:50:16] the partition manager ("setup-storage") mostly, so it's less crazy
[19:50:36] eheh yes partman that friendly piece of software™ :-P
[19:50:48] interesting
[19:50:56] yeah
[19:51:28] it's still kind of rough, but it looks like this https://gitweb.torproject.org/admin/tsa-misc.git/tree/fabric_tpa/host.py#n507
[19:51:44] a lot of it relies on grml-debootstrap as well, which does magic things like install grub and run post-install scripts in the chroot
[19:52:41] recently a colleague made a partman integration that allows to retain specific partitions, very useful to reimage a stateful server like a db from say stretch to buster keeping the data partition
[19:55:28] interesting
[19:55:43] i suspect fai-setup-storage might have trouble doing that, but i haven't investigated that
[19:55:53] as soon as I get back my laptop can point you to it
[19:58:56] oh and i had another question, sorry if you've been asked this a billion times already
[19:59:18] https://www.mediawiki.org/wiki/GitLab_consultation#Outcome affects only mediawiki.org and the wiki engine, or all of wikipedia?
[19:59:27] ie.
will you folks stick to gerrit or are migrating to gitlab as well?
[19:59:43] the HN thread is ambiguous there as well https://news.ycombinator.com/item?id=24919569
[19:59:49] "Wikimedia is moving to GitLab"
[20:00:15] yeah, AFAIK the plan in the long term is to move everything from gerrit to a gitlab installation
[20:00:39] oh cool
[20:00:48] will take quite some time and I think that puppet and other SRE stuff will probably be towards the end
[20:00:51] i must say that will make collaboration easier for people like me ;)
[20:00:54] *be migrated
[20:00:56] yeah i can imagine
[20:03:28] anarcat: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/install_server/files/autoinstall/scripts/reuse-parts.sh
[20:04:11] and
[20:04:11] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/install_server/files/autoinstall/reuse-parts.cfg
[20:05:03] that is suspiciously short
[20:05:09] that ends up being used like this
[20:05:09] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/install_server/files/autoinstall/partman/custom/reuse-db.cfg
[20:05:43] where 'keep' is the important part :D
[20:06:04] someone with deep partman knowledge wrote this
[20:06:05] scary
[20:07:40] actually I think was all knowledge gained in the process :)
[20:07:49] scarier :)
[20:07:56] but works!
[20:08:01] and has even some tests
[20:08:12] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/install_server/files/autoinstall/reuse-parts-test.cfg
[20:08:26] I mean, a way to test it
[20:09:11] but yeah nobody wants to interact with partman, even less its internals :D
[20:10:48] your host.py is not that rough, quite clean actually
[20:11:19] thanks
[20:11:48] by not clean i mean stuff like this:
[20:11:56] res = con.run('.
/tmp/fai/disk_var.sh && mkdir -p /target && mount "$ROOT_PARTITION" /target ; mkdir -p /target/boot && mount "$BOOT_PARTITION" /target/boot', warn=True) # noqa: E501
[20:12:22] or the bits after logging.info('uploading post-scripts %s to %s', # [...] line 607+
[20:12:31] which basically hide all the tricky bits under a pile of horrible shell scripts
[20:12:48] e.g. https://gitweb.torproject.org/admin/tsa-misc.git/tree/installer/post-scripts/50-tor-install-luks-setup
[20:12:59] and so on https://gitweb.torproject.org/admin/tsa-misc.git/tree/installer/post-scripts
[20:13:53] eheheh the pile of horrible shell is always hard to get completely rid of
[20:14:02] yeah
[20:48:33] [[Tech]]; 97.89.109.8; [none]; https://meta.wikimedia.org/w/index.php?diff=20615617&oldid=20613166&rcid=16588331
[20:59:15] [[Tech]]; Hasley; Reverted changes by [[Special:Contributions/97.89.109.8|97.89.109.8]] ([[User talk:97.89.109.8|talk]]) to last version by AntiCompositeNumber; https://meta.wikimedia.org/w/index.php?diff=20615708&oldid=20615617&rcid=16588639