KDE Linux/RootFSv2
This is still WIP and not currently in production :)
RootFSv2 is a far-reaching change to the rootfs setup, made during pre-alpha.
Transition to v2 happens automatically. After it is completed you may wish to clean up /system subvolumes (OTHER THAN @system!) to liberate some disk space from the clutches of the overlays. Mind that after cleanup you cannot reboot into v1 snapshots anymore!
If you have manual fstab setups you may need to prepare @system manually. The section Manual Transition below describes how that works.
Background
v1
v1 was the original prototype. It treated all of / as readonly and mounted writable directories in the relevant places. It turned out the relevant places are numerous and spread out all over the file tree.
v1 had a number of problems. For one, it originally used overlays. That was a poor choice: overlayfs limitations came into play and the affected directories were no longer proper btrfs (no subvolumes, no snapshots, no quotas, etc.). To work around that, some directories in /var were turned into subvolumes, further complicating the mount lineup. The lineup also proved incomplete at several points during development: /root behaves more like /home and is expected to be writable by many tools, /snap needed to be under direct control of snapd, /nix was required by nix packages, and /opt was expected to be writable by proprietary third-party installers (arguably they should be sysexts, but that would be a terrible experience). Introducing new subvolumes was complicated as well, because the previous directories may already have been populated, raising the question of how best to replicate the data into the new subvolume. On top of that, early boot setup constantly had to ensure that all necessary subvolumes actually exist. The whole thing is also very hard to wrap your head around.
Additionally, handling of the @kde-linux_* subvolume was expected to get complicated moving forward. Because the subvolume is an unpacked tarball, we'd have to store the tarball in addition to the subvolume to enable delta downloads (the delta would need to be against the tarball, not the subvolume). Verity checking the subvolume also doesn't seem to be possible any time soon, as btrfs only has file-based verity checks via fs-verity.
The actual subvolume lineup was:
- @kde-linux_*: the versioned /
- @etc-overlay: /etc (as writable overlay)
- @var-overlay: /var (as writable overlay)
- @home: /home
- @root: /root
- @snap: /snap
- @containers: /var/lib/containers (separate from the var overlay so it can use snapshots)
- @docker: /var/lib/docker (separate from the var overlay so it can use snapshots)
v2
To deal with the problems of v1 a new lineup v2 was suggested. It'd be simpler and hopefully prove less troublesome in the long run.
v2 turns v1 inside out: instead of considering / readonly we now consider /usr readonly. This follows directly from the fact that basically everything that isn't /usr is expected to be writable in some form or fashion. It also means tools like snap and nix can create their directories in / without us having to care about or explicitly support that.
Somewhat unrelated, but done at the same time, is a migration from the unpacked btrfs subvolume to erofs images that get mounted into /usr. Because the image is a single file, this should hopefully allow fs-verity checks as well as delta downloads.
The v2 lineup:
- @system: the actual / containing subvolumes at the user's discretion
- kde-linux_*.erofs: (subdir /usr) mounted into /usr
/var and /etc are populated as necessary via systemd-tmpfiles copying from /usr/share/factory (inside the erofs). This also improves the factory reset story because we can technically bootstrap out of an empty / now via systemd.
Transition
During early initrd stages, before the rootfs is actually mounted, all data gets replicated into a new @system subvolume. For /etc and /var we first copy all data from /usr/share/factory and then overwrite it with the original overlay/upper data. This effectively replicates the overlay, natively. For the original subvolumes we create subvolume snapshots inside @system, relying on copy-on-write to do its job here. Once @system is created the boot continues as usual.
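The copy-then-overwrite step for /etc and /var can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual initrd code: all function names are hypothetical, file ownership and overlayfs opaque-directory markers are ignored, and special files other than overlayfs whiteouts are not handled.

```python
import os
import shutil
import stat
from pathlib import Path


def is_whiteout(path: Path) -> bool:
    """overlayfs records a deleted file as a character device with device number 0/0."""
    st = os.lstat(path)
    return stat.S_ISCHR(st.st_mode) and st.st_rdev == 0


def remove(path: Path) -> None:
    """Delete whatever currently sits at path, if anything."""
    if path.is_dir() and not path.is_symlink():
        shutil.rmtree(path)
    elif path.is_symlink() or path.exists():
        path.unlink()


def apply_upper(upper: Path, target: Path) -> None:
    """Layer the overlay upper directory over target; the upper entry always wins."""
    for entry in os.scandir(upper):
        src, dst = Path(entry.path), target / entry.name
        if is_whiteout(src):
            remove(dst)  # deleted in the overlay, so drop the factory copy too
        elif entry.is_dir(follow_symlinks=False):
            if dst.is_symlink() or not dst.is_dir():
                remove(dst)  # entry changed type between the two layers
            dst.mkdir(exist_ok=True)
            apply_upper(src, dst)
        else:
            remove(dst)
            shutil.copy2(src, dst, follow_symlinks=False)


def replicate_overlay(factory: Path, upper: Path, target: Path) -> None:
    """Step 1: copy the factory data. Step 2: overwrite it with the upper data."""
    shutil.copytree(factory, target, symlinks=True, dirs_exist_ok=True)
    apply_upper(upper, target)
```

Calling `replicate_overlay` with the factory directory for /etc, the old overlay's upper directory, and the etc directory inside @system would produce a plain directory that has the same content the overlay mount used to present, within the limitations listed above.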
Manual Transition
Because of the complexity brought in by the overlays it is strongly recommended to start with a fresh /var. That means docker images, flatpaks, and the like won't make it over into v2. You get a much more "pristine" setup this way though.
- The root of the btrfs partition is mounted in /system
- Create a @system subvolume
- Inside you want to do some initial population:
  - For /etc and /var you may choose to start with empty directories and only copy the overlay/upper data into @system
  - For the subvolumes you can use btrfs subvolume snapshot as desired
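The steps above can be turned into a dry-run plan. The Python sketch below only builds the command lines and never executes them, so the plan can be reviewed first. The function name and its parameters are hypothetical, and the location of each overlay's upper directory is an assumption that the caller has to supply. Note that a plain `cp -a` does not interpret overlayfs whiteouts, so files deleted through the overlay need manual attention.

```python
from pathlib import Path


def plan_manual_transition(
    btrfs_root: Path,
    snapshots: dict[str, str],
    upper_dirs: dict[str, Path],
) -> list[list[str]]:
    """Return the commands for a manual transition without running any of them.

    snapshots maps an existing v1 subvolume to its directory name inside @system.
    upper_dirs maps a directory name inside @system to the overlay upper
    directory whose content should be copied into it (caller-supplied assumption).
    """
    system = btrfs_root / "@system"
    commands = [["btrfs", "subvolume", "create", str(system)]]
    for name, upper in upper_dirs.items():
        dest = system / name
        # Start from an empty directory and copy only the overlay's upper data.
        commands.append(["mkdir", "-p", str(dest)])
        commands.append(["cp", "-a", f"{upper}/.", str(dest)])
    for subvolume, name in snapshots.items():
        # Copy-on-write snapshot of the old subvolume, placed inside @system.
        commands.append(
            ["btrfs", "subvolume", "snapshot", str(btrfs_root / subvolume), str(system / name)]
        )
    # /usr stays empty; it only serves as the mount point for the erofs image.
    commands.append(["mkdir", "-p", str(system / "usr")])
    return commands
```

For example, `plan_manual_transition(Path("/system"), {"@home": "home"}, {"etc": Path("/system/@etc-overlay/upper")})` yields a list that can be printed with `shlex.join` and checked before anything is run as root. The upper path in this example is a guess at the v1 layout, not something the page confirms.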
The additions to /system should look something like this:
- @system (everything below can be a directory or subvolume, entirely up to you)
- @system/etc/...
- @system/home/...
- @system/usr (empty -- only serves as mount point)
- @system/var/...
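Before rebooting, the expected layout can be sanity-checked. The following Python sketch is a hypothetical helper, not part of KDE Linux. It only tests for the entries listed above and, as an extra assumption, for an etc/passwd file as a rough indicator that /etc was populated.

```python
from pathlib import Path


def check_system_layout(system: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means the layout looks sane."""
    problems = []
    for name in ("etc", "home", "usr", "var"):
        if not (system / name).is_dir():
            problems.append(f"missing directory: {name}")
    usr = system / "usr"
    if usr.is_dir() and any(usr.iterdir()):
        problems.append("usr is not empty; it should only serve as a mount point")
    # Assumption: a populated /etc contains passwd, otherwise user accounts are unknown.
    if not (system / "etc" / "passwd").is_file():
        problems.append("etc/passwd is missing; /etc may not be populated")
    return problems
```

Running `check_system_layout(Path("/system/@system"))` and getting an empty list back does not prove the transition is correct. It only rules out the most obvious mistakes.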
Please note that if you do not populate /etc correctly you may be greeted by systemd-firstboot and asked to set up timezones and the like.
Should anything go wrong you can always boot into an earlier snapshot - it will use the v1 data - and start over.
Sitter Says
- If you don't have important stuff in /var it may be beneficial to throw it away and start fresh. That way you get a much more streamlined /var than the automatic transition creates.
- Similarly for /etc, except there you want to make sure you pre-fill @system/etc/ with the actual overlay/upper data, otherwise your user account won't be known to the system.
Known Issues
- It is unclear how to deal with updates to /etc files that we want to push. Is it the user's responsibility to keep a file updated once they have changed it? What about systemd presets (enabled/disabled services)?