- Overview
- Basic concept of Debian Live
- Install packages
- Prepare the work directory
- Main configuration
- Extra packages
- Installing a backports kernel
- Set a static /etc/resolv.conf
- Set an explanatory footer text in /etc/issue.footer
- Set a random password at boot
- Fix initial networking setup
- Fix the shutdown process
- Build it
- Booting it
- In action
- Improvements
Overview ^
BitFolk's Rescue VM is a live system based on the Debian Live project. You boot it, it finds its root filesystem over read-only NFS, and then it mounts a unionfs RAM disk over that so that you can make changes (e.g. install packages) that don't persist. People generally use it to repair broken operating systems, reset root passwords etc.
Every few years I have to rebuild it, because it’s important that it’s new enough to be able to effectively poke around in guest filesystems. Each time I have to try to remember how I did it. It’s not that difficult but it’s well past time that I document how it’s done.
Basic concept of Debian Live ^
The idea is that everything under the config/ directory of your build area is one of:
- a set of configuration options for the process itself,
- some files to put in the image,
- some scripts to run while building the image, or
- some scripts to run while booting the image.
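For orientation, here's a sketch of how the config/ tree used in the rest of this post maps onto those categories. The directory names are live-build's conventions; the leaf file names are the ones created in the sections below.

```shell
# Sketch of the config/ tree this post builds up. The directories are
# live-build conventions; the files that go in them are created below.
mkdir -p config/package-lists                    # package lists (*.list.chroot)
mkdir -p config/hooks/live                       # build-time hooks (*.hook.chroot)
mkdir -p config/includes.chroot/etc              # files copied verbatim into the image
mkdir -p config/includes.chroot/lib/live/config  # boot-time config scripts
find config -type d | sort
```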
Install packages ^
Pick a host running at least the latest Debian stable. It might be possible to build a live image for a newer version of Debian, but the live-build system and its dependencies like debootstrap might end up being too old.
$ sudo apt install live-build live-boot live-config
Prepare the work directory ^
$ sudo mkdir -vp /srv/lb/auto
$ cd /srv/lb
Main configuration ^
All of these config options are described in the lb_config man page.
$ sudo tee auto/config >/dev/null <<'_EOF_'
#!/bin/sh

set -e

cacher_prefix="apt-cacher.lon.bitfolk.com/debian"
mirror_host="deb.debian.org"
main_mirror="http://${cacher_prefix}/${mirror_host}/debian/"
sec_mirror="http://${cacher_prefix}/${mirror_host}/debian-security/"

lb config noauto \
    --architectures amd64 \
    --distribution bullseye \
    --binary-images netboot \
    --archive-areas main \
    --apt-source-archives false \
    --apt-indices false \
    --backports true \
    --mirror-bootstrap "$main_mirror" \
    --mirror-chroot-security "$sec_mirror" \
    --mirror-binary "$main_mirror" \
    --mirror-binary-security "$sec_mirror" \
    --memtest none \
    --net-tarball true \
    "${@}"
_EOF_
The variables at the top just save me having to repeat myself for all the mirrors. They make both the build process and the resulting image use BitFolk’s apt-cacher to proxy the deb.debian.org mirror.
I'm not going to describe every config option as you can just look them up in the man page. The most important one is --binary-images netboot, which makes sure it builds an image that can be booted over the network.
Extra packages ^
There are some extra packages I want available in the rescue image. Here's how to get them installed.
$ sudo tee config/package-lists/bitfolk_rescue.list.chroot >/dev/null <<_EOF_
pwgen
less
binutils
build-essential
bzip2
gnupg
openssh-client
openssh-server
perl
perl-modules
telnet
screen
tmux
rpm
_EOF_
Installing a backports kernel ^
I want the rescue system to be Debian 11 (bullseye), but with a bullseye-backports kernel.
We already used --backports true to make sure that we have access to the backports package mirrors, but we need to run a script hook to actually install the backports kernel in the image while it's being built.
$ sudo tee config/hooks/live/9000-install-backports-kernel.hook.chroot >/dev/null <<'_EOF_'
#!/bin/sh

set -e

apt -y install -t bullseye-backports linux-image-amd64
apt -y purge -t bullseye linux-image-amd64
apt -y purge -t bullseye 'linux-image-5.10.*'
_EOF_
Set a static /etc/resolv.conf ^
This image will only be booted on one network where I know what the nameservers are, so may as well statically override them. If you were building an image to use on different networks you’d probably instead want to use one of the public resolvers or accept what DHCP gives you.
$ sudo tee config/includes.chroot/etc/resolv.conf >/dev/null <<_EOF_
nameserver 85.119.80.232
nameserver 85.119.80.233
_EOF_
Set an explanatory footer text in /etc/issue.footer ^
The people using this rescue image don't necessarily know what it is and how to use it. I take the opportunity to put some basic info in the file /etc/issue.footer in the image, which will later end up in the real /etc/issue.
$ sudo tee config/includes.chroot/etc/issue.footer >/dev/null <<_EOF_
BitFolk Rescue Environment - https://tools.bitfolk.com/wiki/Rescue

Blah blah about what this is and how to use it
_EOF_
Set a random password at boot ^
By default a Debian Live image has a user name of "user" and a password of "live". This isn't suitable for a networked service that will have sshd active from the start, so we will install a hook script that sets a random password. This will be run near the end of the image's boot process.
$ sudo tee config/includes.chroot/lib/live/config/2000-passwd >/dev/null <<'_EOF_'
#!/bin/sh

set -e

echo -n " random-password "

NEWPASS=$(/usr/bin/pwgen -c -N 1)
printf "user:%s\n" "$NEWPASS" | chpasswd

RED='\033[0;31m'
NORMAL='\033[0m'

{
    printf "****************************************\n";
    printf "Resetting user password to random value:\n";
    printf "\t${RED}New user password:${NORMAL} %s\n" "$NEWPASS";
    printf "****************************************\n";
    cat /etc/issue.footer
} >> /etc/issue
_EOF_
This script puts the random password and the footer text into the /etc/issue file which is displayed above the console login prompt, so the user can see what the password is.
Fix initial networking setup ^
This one’s a bit unfortunate and is a huge hack, but I’m not sure enough of the details to report a bug yet.
The live image when booted is supposed to be able to set up its network by a number of different ways. DHCP would be the most sensible for an image you take with you to different networks.
The BitFolk Rescue VM is only ever booted in one network though, and we don't use DHCP. I want to set static networking through the ip=… syntax of the kernel command line.
Unfortunately it doesn’t seem to work properly with live-boot as shipped. I had to hack the /lib/live/boot/9990-networking.sh file to make it parse the values out of the kernel command line.
Here’s a diff. Copy /lib/live/boot/9990-networking.sh to config/includes.chroot/usr/lib/live/boot/9990-networking.sh and then apply that patch to it.
It's simple enough that you could probably edit it by hand. All it does is comment out one section and replace it with some bits that parse IP setup out of the $STATICIP variable.
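As an illustration only (this is not the actual patch, and the variable names here are just assumptions), the replacement amounts to splitting the colon-separated ip= value into its fields, somewhat like this:

```shell
# Hypothetical sketch of the kind of parsing involved; NOT the real diff.
# Assumes $STATICIP carries the ip= value from the kernel command line in
# client:nfsserver:gateway:netmask:hostname form.
STATICIP="192.168.0.225:192.168.0.243:192.168.0.1:255.255.248.0:rescue"

oldifs="$IFS"
IFS=:
set -- $STATICIP
IFS="$oldifs"

IPV4ADDR="$1"
NFSSERVER="$2"
IPV4GATEWAY="$3"
IPV4NETMASK="$4"
HOSTNAME="$5"

echo "$IPV4ADDR/$IPV4NETMASK via $IPV4GATEWAY ($HOSTNAME)"
```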
Fix the shutdown process ^
Again this is a horrible hack and I’m sure there is a better way to handle it, but I couldn’t work out anything better and this works.
This image will be running with its root filesystem on NFS. When a shutdown or halt command is issued however, systemd seems extremely keen to shut off the network as soon as possible. That leaves the shutdown process unable to continue because it can’t read or write its root filesystem any more. The shutdown process stalls forever.
As this is a read-only system with no persistent state I don't care how brutal the shutdown process is. I care more that it does actually shut down. So, I have added a systemd service that issues systemctl --force --force poweroff any time that it's about to shut down by any means.
$ sudo tee config/includes.chroot/etc/systemd/system/always-brutally-poweroff.service >/dev/null <<_EOF_
[Unit]
Description=Every kind of shutdown will be brutal poweroff
DefaultDependencies=no
After=final.target

[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl --force --force poweroff

[Install]
WantedBy=final.target
_EOF_
And to force it to be enabled at boot time:
$ sudo tee config/includes.chroot/etc/rc.local >/dev/null <<_EOF_
#!/bin/sh

set -e

systemctl enable always-brutally-poweroff
_EOF_
Build it ^
At last we’re ready to build the image.
$ sudo lb clean && sudo lb config && sudo lb build
The “lb clean” is there because you probably won’t get this right first time and will want to iterate on it.
Once complete you'll find the files to put on your NFS server in binary/, and the kernel and initramfs to boot on your client machine in tftpboot/live/.
$ sudo rsync -av binary/ my.nfs.server:/srv/rescue/
Booting it ^
The details of exactly how I boot the client side (which in BitFolk’s case is a customer VM) are out of scope here, but this is sort of what the kernel command line looks like on the client (normally all on one line):
root=/dev/nfs
ip=192.168.0.225:192.168.0.243:192.168.0.1:255.255.248.0:rescue
hostname=rescue
nfsroot=192.168.0.243:/srv/rescue
nfsopts=tcp
boot=live
persistent
Explained:
- root=/dev/nfs
- Get root filesystem from NFS.
- ip=192.168.0.225:192.168.0.243:192.168.0.1:255.255.248.0:rescue
- Static IP configuration on kernel command line. Separated by colons:
- Client’s IP
- NFS server’s IP
- Default gateway
- Netmask
- Host name
- hostname=rescue
- Host name.
- nfsroot=192.168.0.243:/srv/rescue
- Where to mount root from on NFS server.
- nfsopts=tcp
- NFS client options to use.
- boot=live
- Tell live-boot that this is a live image.
- persistent
- Look for persistent data.
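If you want to check from inside the booted system that these options actually reached the kernel, you can pick them out of /proc/cmdline. Shown here against a sample string so it's self-contained; in the VM you'd use cmdline=$(cat /proc/cmdline).

```shell
# Sample command line; inside the booted VM use: cmdline=$(cat /proc/cmdline)
cmdline="root=/dev/nfs ip=192.168.0.225:192.168.0.243:192.168.0.1:255.255.248.0:rescue hostname=rescue nfsroot=192.168.0.243:/srv/rescue nfsopts=tcp boot=live persistent"

# Pull out a single option, e.g. the ip= setting:
for word in $cmdline; do
    case "$word" in
        ip=*) ip_value="${word#ip=}" ;;
    esac
done
echo "$ip_value"
```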
In action ^
Here’s an Asciinema of this image in action.
Improvements ^
There are a few things in here which are hacks. What I have works, but no doubt I am doing some things wrong. If you know better please do let me know in comments or whatever. Ideally I'd like to stick with Debian Live though, because it's got a lot of problems solved already.