On reproducible distro configuration

Or how I learned to stop worrying and use custom shell scripts

Modern Linux distros aim to equip both casual users and developers with many of their tools out of the box. Given the vast repositories of useful software, though, you'll likely install additional tools for specific use cases over time. Eventually, your OS installation becomes something unique to your workflow. Many distros also offer in-place upgrades between releases, or a rolling-release model, so that in theory this mutable configuration can stay on your machine throughout its life.

But what happens when a reinstall is required, perhaps due to data corruption (?!), or simply desired? If you were thinking "I'll painstakingly recall the additional software I installed and custom configuration and manually set up the same state!" let's think again. We can encode the custom state of our post-installation OS to automate its recreation, thus satisfying our objective: reproducibility! 🤯

Reproducible OSes?

Some modern distros offer a mechanism to reproduce immutable installations, which addresses the software installation concerns.

However, it may not address the custom configuration that often lives outside the jurisdiction of system-level package managers; think ~/.config and friends.

The solution explored below could be used alongside such a declarative, immutable architecture to set up fully reproducible systems, which is a fine approach if you use one of these newer OSes.

But I don't: I run my own software projects, which I reinstall whenever they change, and I have yet to explore how this workflow could be easily adapted to an immutable system.

So, what might be some alternative approaches?

Ansible and friends?

How about existing "infrastructure automation" software such as Ansible or Puppet?

For a long time, I used Ansible, and this worked well enough. But because I rarely used it outside of this use case, and my configuration remained relatively stable, I often had to consult the docs whenever I needed to make a change, which was an unacceptably unwieldy process. Additionally, the forced directory structure and the method of injecting variables became cumbersome. Essentially, Ansible was overly complex for my use case, but it gets credit for its declarative configuration, which will make an appearance in my solution.

It is worth noting that while using Ansible I was installing an X11 window manager and associated configuration, significantly complicating the typical post-installation state. When I switched to GNOME Shell, much of the Ansible config could be eliminated, which only made the aforementioned shortcomings all the more obvious.

Check out my obsolete Ansible-based setup to see what this looked like in practice.

Custom scripts!

Why not write custom shell scripts? One could say: because then managing reproducible configuration becomes a software project all of its own. The antidote here is declarative configs, like Ansible's YAML-based architecture. However, one could retort: this simply moves the complexity from the user-facing configuration to the business logic of the software, which we will have to write if creating a custom solution! Fear not, in my case this code is as simple as:

dnf() {
    sudo dnf -q -y "$@"
}

enable_coprs() {
    echo '--- Enabling coprs ---'
    while read -r copr; do
        dnf copr enable "$copr"
    done < "$WORKDIR"/coprs.txt
}

install_packages() {
    echo '--- Installing packages ---'
    while read -r package; do
        dnf install "$package"
    done < "$WORKDIR"/packages.txt
}

The $WORKDIR variable allows these "library" functions to be applied to any directory in the project's root (facilitating modular configuration, like Ansible's roles), and the only structure enforced is the naming of the plain-text declaration files containing repositories to enable and software packages to install. Additional functions can be written as needed. No more documentation to reference!
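
As an illustration of writing an additional function, here is a minimal sketch of a helper that lets declaration files carry comments and blank lines. The name read_declarations and the sample file contents are hypothetical and not part of the original project; the demo uses a throwaway directory in place of a real $WORKDIR.

```shell
#!/bin/sh
# Hypothetical helper (not from the original project): print each entry of a
# declaration file, one per line, dropping '#' comments and blank lines.
read_declarations() {
    # $1: path to the declaration file
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Demo: a temporary directory stands in for $WORKDIR.
WORKDIR="$(mktemp -d)"
cat > "$WORKDIR"/packages.txt <<'EOF'
# Editors
neovim

git   # version control
EOF

# Prints the two package names, one per line.
read_declarations "$WORKDIR"/packages.txt
rm -r "$WORKDIR"
```

A library function could then consume it with something like `read_declarations "$WORKDIR"/packages.txt | while read -r package; do dnf install "$package"; done`, keeping the declaration files plain text while allowing a few notes inside them.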

A main configure script employs these library functions as such:

#!/bin/bash

# Set the chassis type (desktop, laptop) to include hardware-specific configuration.
export CHASSIS="$(hostnamectl | awk '$1 == "Chassis:" {print $2}')"
export WORKDIR=.

# Source the library functions.
source lib/dnf.sh

# This function clones and configures dotfile repositories.
configure() {
    [ ! -e "$2" ] && git clone https://github.com/jcrd/"$1" "$2"
    pushd "$2" || return
    ./configure
    popd
}

# Configure dotfiles.
configure zsh-config ~/.config/zsh

# Enable copr repos with a library function.
enable_coprs

# Install packages with another library function!
install_packages

# Custom state is easy to recreate: just include the shell commands
# you already know and love!
# Set user's shell to zsh. 😎
sudo usermod --shell /usr/bin/zsh "$USER"

# Chassis-specific configuration, like Ansible's roles.
export WORKDIR=chassis/"$CHASSIS"
if [ -e "$WORKDIR" ]; then
    ./"$WORKDIR"/configure
fi

See my live configuration for the whole thing in action, complete with more library functions and the above desktop role!
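
To make the chassis detection step concrete, here is a small sketch that runs the same awk program against canned input instead of a live hostnamectl call. The parse_chassis wrapper and the sample text are assumptions for illustration, modelled on typical hostnamectl output.

```shell
#!/bin/sh
# Extract the chassis type from hostnamectl-style text on stdin.
# The awk program is the one used in the configure script; the wrapper
# function name is hypothetical.
parse_chassis() {
    awk '$1 == "Chassis:" {print $2}'
}

# Assumed sample input, standing in for real `hostnamectl` output.
sample=' Static hostname: example
       Icon name: computer-laptop
         Chassis: laptop
Operating System: Fedora Linux'

CHASSIS="$(printf '%s\n' "$sample" | parse_chassis)"
WORKDIR=chassis/"$CHASSIS"
echo "$WORKDIR"

# Mirror the existence check: only run a chassis-specific script if present.
if [ -e "$WORKDIR" ]; then
    echo "would run ./$WORKDIR/configure"
else
    echo "no chassis-specific config for '$CHASSIS', skipping"
fi
```

Because awk prints only the second field of the matching line, anything that follows the chassis type on that line is ignored, and a machine with no matching directory simply skips the chassis step.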

Now, after a fresh installation of Fedora, reproducing my configuration is as easy as:

git clone https://github.com/jcrd/fedora-workstation-config
cd fedora-workstation-config
./configure

Say goodbye to complexity and hello to easily reproducible Linux installations for maximum automation ⚙️ and peace of mind 🙏!