Arch Linux Root on ZFS (Basic Guide)

This document is a step-by-step guide for installing Arch Linux with the root filesystem on ZFS, using UEFI and systemd-boot. It is written for beginners and explains each step, so that users new to Linux or ZFS can follow along comfortably. Experienced Linux users who want to try ZFS as the filesystem for their root partition should find it just as usable.

This guide assumes that your system has a single disk, referred to as /dev/sda. If your system has more than one disk or uses a different naming convention (for example, NVMe devices such as /dev/nvme0n1), adjust the commands to match your hardware.
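As a small sketch of that adjustment, the helper below derives partition device names from a disk name. The function name part_path is hypothetical and not part of any tool. It assumes the usual Linux convention that disks whose names end in a digit (such as /dev/nvme0n1 or /dev/mmcblk0) take a "p" before the partition number.

```shell
# part_path DISK N: print the device path of partition N on DISK.
# Hypothetical helper for illustration; it only builds a string and
# never touches the disk.
part_path() {
    case "$1" in
        *[0-9]) printf '%s\n' "${1}p${2}" ;;  # e.g. /dev/nvme0n1 -> /dev/nvme0n1p2
        *)      printf '%s\n' "${1}${2}" ;;   # e.g. /dev/sda -> /dev/sda2
    esac
}

DISK=/dev/sda                 # change this to match your hardware
echo "EFI partition: $(part_path "$DISK" 1)"
echo "ZFS partition: $(part_path "$DISK" 2)"
```

With DISK set once, the later commands could use "$(part_path "$DISK" 1)" wherever this guide writes /dev/sda1.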

This installation process is designed for systems booted in UEFI mode. The systemd-boot bootloader is installed to the EFI System Partition, which this guide mounts at /boot. If the machine is booted in legacy BIOS mode instead, the bootloader cannot be installed correctly and the system will fail to boot. Make sure your firmware is set to boot in UEFI mode before following this guide.
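One way to confirm the boot mode from the live environment is sketched below. It relies on the Linux kernel exposing the /sys/firmware/efi directory only when the system was booted through UEFI. The function name boot_mode and its optional argument are illustrative, not part of any tool.

```shell
# boot_mode [SYSFS_DIR]: print "uefi" if the EFI directory exists, else "bios".
# The optional argument exists only so the check can be tried against
# another directory; by default it inspects /sys/firmware/efi.
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo uefi
    else
        echo bios
    fi
}

boot_mode
```

If this prints "bios", reboot the installation medium in UEFI mode before continuing.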

Starting from the live environment

# The loadkeys command switches the console keyboard layout to match your preference.
# For example: us, es, it...

loadkeys us

# Wipes all the partition and filesystem signatures from the disk /dev/sda. It essentially resets the disk to a 'clean' state, removing any data structures that might interfere with new partitioning.

sgdisk --zap-all /dev/sda

# Creates a new partition (numbered 1). The partition starts at the default starting point (0) and extends 512 MB beyond this point.
# Sets the type of the first partition to ef00, which is for an EFI System Partition required for UEFI boot.

sgdisk -n1:0:+512M -t1:ef00 /dev/sda

# Creates a second partition on /dev/sda, starting right after the first, extending to the end of the available disk space, and using up all remaining space on the disk.
# Sets the partition type to bf00, which is used for ZFS filesystems.

sgdisk -n2:0:0 -t2:bf00 /dev/sda

# Formats the first partition with the FAT32 filesystem, which is commonly used for EFI System Partitions (ESP) required for systems that boot using UEFI.
# -F 32: Forces FAT32 instead of letting the tool pick a FAT size automatically.

mkfs.vfat -F 32 /dev/sda1

# Install the ZFS file system on Arch Linux using a script hosted on GitHub. The script is designed to automate the installation process, ensuring compatibility with your current kernel version.

curl -s https://raw.githubusercontent.com/eoli3n/archiso-zfs/master/init | bash

# zpool create: This is the base command to create a new ZFS storage pool.
# ashift=12: Sets the alignment shift for the storage pool, which can optimize performance with newer drives that have 4K sectors. An ashift of 12 means 2^12 = 4096-byte sectors.
# autotrim=on: Enables the automatic TRIM operation on the pool, which can improve the performance and lifespan of SSDs by informing them about unused blocks.
# relatime=on: Updates a file's access time only when the previous access time is older than the last modification or change time, or more than 24 hours old. This causes far fewer writes than updating the access time on every read.
# acltype=posixacl: Enables POSIX ACLs (Access Control Lists), which allow for more fine-grained permission settings compared to traditional Unix file permissions.
# canmount=off: Prevents automatic mounting of the dataset. This is useful for parent datasets that serve as containers for other datasets.
# compression=lz4: Enables LZ4 compression for the pool, which provides an excellent balance between compression ratio and performance, reducing disk space usage with minimal overhead.
# dnodesize=legacy: Sets the size of Dnodes to "legacy", which is the default size used in earlier ZFS versions. This option can affect how metadata is stored and accessed.
# normalization=formD: Sets Unicode normalization to Form D (decomposed), which can be useful for ensuring consistent file naming and access in environments that use different Unicode forms.
# xattr=sa: Stores extended attributes as system attributes within the inode itself, reducing overhead when accessing or storing extended attributes.
# devices=off: Prevents device files within the ZFS filesystem from being interpreted as device nodes, enhancing security.
# mountpoint=none: Specifies that the filesystem should not be mounted at a specific mount point.
# -R /mnt rpool /dev/sda2: Creates the ZFS pool 'rpool' on partition /dev/sda2, with /mnt as the alternate root under which its datasets are mounted during system setup.

zpool create \
    -o ashift=12 \
    -o autotrim=on \
    -O relatime=on \
    -O acltype=posixacl -O canmount=off -O compression=lz4 \
    -O dnodesize=legacy -O normalization=formD \
    -O xattr=sa -O devices=off -O mountpoint=none -R /mnt rpool /dev/sda2
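
# The ashift value is the base-2 logarithm of the sector size, so it can be worked out from the
# physical sector size of the disk (shown, for example, by: lsblk -o NAME,PHY-SEC).
# A minimal sketch of that arithmetic; the function name ashift_for is hypothetical and it
# assumes the sector size is a power of two.

```shell
# ashift_for BYTES: print the base-2 logarithm of a power-of-two sector size.
ashift_for() (
    size=$1
    shift_bits=0
    # Halve the size until it reaches 1, counting the steps.
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        shift_bits=$((shift_bits + 1))
    done
    echo "$shift_bits"
)

ashift_for 4096   # prints 12
ashift_for 512    # prints 9
```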

# Create a ZFS dataset named ROOT under the rpool pool
# Dataset not set to mount automatically (canmount=off) and without any designated mount point (mountpoint=none), serving as a container for other datasets.

zfs create -o canmount=off -o mountpoint=none rpool/ROOT

# Creates a ZFS dataset named default under rpool/ROOT
# Configured to mount at the root directory (mountpoint=/) but not to mount automatically on system boot (canmount=noauto)

zfs create -o mountpoint=/ -o canmount=noauto rpool/ROOT/default

# This command creates a new ZFS volume (zvol) to be used as swap.
# -V 16G: Specifies a volume size of 16 GiB.
# -b $(getconf PAGESIZE): Sets the block size for the volume to the system's memory page size, which can optimize memory usage.
# -o compression=zle: Enables zero-length encoding (ZLE), which only compresses runs of zeros, as often found in swap.
# -o logbias=throughput: Optimizes operations for throughput, suitable for swap usage where latency is less of a concern.
# -o sync=always: Ensures data integrity by writing data to disk immediately, important for swap to ensure that no data is lost.
# -o primarycache=metadata: Only metadata is cached, not data, which is ideal for swap to ensure actual data is written out immediately rather than being cached.
# -o secondarycache=none: Disables the secondary cache (L2ARC), as caching is not beneficial for swap data that should not be read back frequently.
# -o com.sun:auto-snapshot=false: Disables automatic snapshots for the swap dataset, as snapshots of swap are not useful and consume unnecessary resources.
zfs create -V 16G -b $(getconf PAGESIZE) -o compression=zle -o logbias=throughput -o sync=always -o primarycache=metadata -o secondarycache=none -o com.sun:auto-snapshot=false rpool/swap

# Prepares the specified block device to be used as swap space. 

mkswap /dev/zvol/rpool/swap

# Enables the device for swapping, adding the ZFS swap volume to the system's swap space.
# This allows the system to start using it as additional virtual memory.

swapon /dev/zvol/rpool/swap

# This command unmounts ZFS filesystems.
# -a: This option specifies that all mounted ZFS filesystems should be unmounted.

zfs umount -a

# Deletes all files and directories within the /mnt directory.

rm -rf /mnt/*

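# The swap volume keeps the pool busy while it is active, so it is deactivated before the export.

swapoff /dev/zvol/rpool/swap

# This command safely disconnects the ZFS pool named 'rpool' from the system.

zpool export rpool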

# This command is used to bring a previously exported ZFS pool back into the system.
# -d /dev/sda2: Specifies the disk device where the pool is located.
# -R /mnt: Sets an alternate root for all mount points within the pool. 
# -N: This option tells zpool import not to mount the filesystems automatically.

zpool import -d /dev/sda2 -R /mnt -N rpool

# This command specifically mounts the dataset rpool/ROOT/default

zfs mount rpool/ROOT/default

# This command mounts all ZFS filesystems that are not currently mounted but are set to be mounted automatically. The -a flag stands for "all."

zfs mount -a

# This command creates a directory named boot inside /mnt. This directory will serve as the mount point for the boot partition.

mkdir /mnt/boot

# This command mounts the EFI System Partition /dev/sda1 at /mnt/boot.

mount /dev/sda1 /mnt/boot

# Create a directory named etc within the /mnt directory to hold the fstab file.

mkdir /mnt/etc

# genfstab: Generate the fstab file. This file is used by the system on startup to mount filesystems automatically.
# -U: Use UUIDs for identifying partitions instead of device names (like /dev/sda1). UUIDs are preferred because they are unique and do not change even if the order of the drives is changed.
# /mnt: Specifies the root directory of the file system for which fstab is being generated. 
# >> /mnt/etc/fstab: This redirects the output of genfstab into the file /mnt/etc/fstab.

genfstab -U /mnt >> /mnt/etc/fstab

# Add an entry for a swap space into the fstab file.
# /dev/zvol/rpool/swap: This is the path to the swap device, which in this case is a ZFS volume designated for swap.
# none: This indicates that no mount point is needed since this is swap space.
# swap: Specifies the type of the filesystem, which is swap.
# discard: An option that enables discard/TRIM support on the swap, which can improve performance on SSDs by allowing the system to inform the SSD about free blocks.
# 0: The first '0' indicates the dump flag, which is not used with swap (hence it's set to 0).
# 0: The second '0' indicates the pass during boot-up filesystem checking; for swap, this is also set to 0, meaning it does not require fsck (filesystem check).
# >> /mnt/etc/fstab: This redirects the output into the file /mnt/etc/fstab.

echo "/dev/zvol/rpool/swap    none       swap  discard                    0  0" >> /mnt/etc/fstab
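
# The six fields described above can be checked mechanically. A minimal sketch, assuming every
# entry is written with all six fields as in this guide; the function name check_fstab is
# hypothetical.

```shell
# check_fstab FILE: report every entry that does not have exactly six
# whitespace-separated fields. Comments and blank lines are ignored.
# The exit status is non-zero if any such entry is found.
check_fstab() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }
        NF != 6 { printf "line %d has %d fields: %s\n", NR, NF, $0; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Tried here on a scratch file holding the swap entry from this guide.
sample=$(mktemp)
echo "/dev/zvol/rpool/swap    none       swap  discard                    0  0" > "$sample"
check_fstab "$sample" && echo "fstab format looks fine"
rm -f "$sample"
```

# During the installation the file to check would be /mnt/etc/fstab.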

# pacstrap is a script used in Arch Linux that simplifies the process of installing the base system onto a new root filesystem.
# /mnt: This is the target directory where the new system will be installed.
# base: This is a meta-package that contains all the essential packages and scripts necessary to set up a minimal Arch Linux base system.
# base-devel: This group of packages includes development tools such as compilers and utilities necessary for building software from source.
# linux: This package installs the Linux kernel, the core of the Arch Linux operating system.
# linux-firmware: This package provides essential firmware files for various hardware devices, such as network interfaces, sound cards, and other peripherals...
# vim: This package installs Vim, a powerful text editor.

pacstrap /mnt base base-devel linux linux-firmware vim

# This command allows you to enter and work inside your newly installed Arch Linux system as if it were already running, enabling you to set up and configure it before you actually boot into it for the first time.

arch-chroot /mnt

# Installs Reflector, a utility that rewrites your Arch Linux mirror list. The command below keeps HTTPS mirrors in the given country that synchronized within the last 12 hours and sorts them by download rate. Replace 'United States' with your own country.

pacman -S reflector
reflector --country 'United States' --age 12 --protocol https --sort rate --save /etc/pacman.d/mirrorlist

# Keep only the boot and swap entries in /etc/fstab.
# Boot: This is formatted as VFAT, typically used for EFI boot partitions. It must be explicitly listed to ensure it mounts correctly when your computer starts.
# Swap: If using ZFS, it’s handled differently but still needs to be in fstab to activate on boot, helping your system manage memory better.
# Remove ZFS Root Entries: ZFS manages its own mounts automatically, so you don’t need to list it in fstab, which avoids potential conflicts during boot.
# Adjust fmask and dmask for Boot: These settings control file and directory permissions on the VFAT-formatted boot partition.
# Set to 0077 to restrict access to the root user only, protecting sensitive boot files from other users.

vim /etc/fstab

Example:


# /dev/sda1
UUID=C050-B254          /boot           vfat            rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro        0 2
/dev/zvol/rpool/swap    none       swap  discard                    0  0
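
# The same clean-up can be previewed with a filter instead of editing by hand. A minimal sketch;
# the function name keep_boot_and_swap is hypothetical, and the sample zfs line only imitates
# what genfstab might have written.

```shell
# keep_boot_and_swap FILE: print comments, blank lines, and only those
# entries whose filesystem type (third field) is vfat or swap.
keep_boot_and_swap() {
    awk '/^[ \t]*#/ || NF == 0 || $3 == "vfat" || $3 == "swap"' "$1"
}

# Scratch file standing in for a generated fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
rpool/ROOT/default      /          zfs   rw,relatime,xattr,posixacl  0  0
UUID=C050-B254          /boot      vfat  rw,relatime,fmask=0077,dmask=0077  0  2
/dev/zvol/rpool/swap    none       swap  discard                    0  0
EOF

keep_boot_and_swap "$sample"   # the zfs entry is dropped
rm -f "$sample"
```

# The filter only prints its result, so the output can be reviewed before it replaces /etc/fstab.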

# Add the archzfs repository by appending the section below to the end of /etc/pacman.conf.
# [archzfs]: This is the repository label.
# Server = https://archzfs.com/$repo/x86_64: This line specifies the URL of the repository server, where $repo is a variable that pacman replaces with the repository name (archzfs in this case).

vim /etc/pacman.conf

[archzfs]
Server = https://archzfs.com/$repo/x86_64
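
# As an alternative to editing the file by hand, the section can be appended from the shell.
# A minimal sketch; the function name add_archzfs_repo is hypothetical. It appends the section
# only when it is not there yet, and single quotes keep $repo literal so that pacman, not the
# shell, expands it.

```shell
# add_archzfs_repo CONF: append the [archzfs] section to CONF unless it is
# already present.
add_archzfs_repo() {
    if grep -q '^\[archzfs\]' "$1"; then
        echo "archzfs repository already configured in $1"
    else
        printf '\n[archzfs]\nServer = https://archzfs.com/$repo/x86_64\n' >> "$1"
    fi
}

# Tried here on a scratch file; the second call changes nothing.
conf=$(mktemp)
add_archzfs_repo "$conf"
add_archzfs_repo "$conf"
cat "$conf"
rm -f "$conf"
```

# Inside the chroot the call would be: add_archzfs_repo /etc/pacman.conf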

# Adding GPG keys for a repository is an essential security step to ensure the authenticity of the packages you download and install
# curl -O https://archzfs.com/archzfs.gpg: Downloading files from the internet, to fetch the GPG key file (archzfs.gpg) from the ArchZFS website.
# pacman-key -a archzfs.gpg: Adds the downloaded GPG key file to the pacman keyring. Allows pacman to verify the signatures of packages from the ArchZFS repository.
# pacman-key --lsign-key DDF....: This command locally signs the imported key, which means you trust this key on your local machine. 
# The long string is the fingerprint of the GPG key for the ArchZFS repository. 
# Local signing is necessary to fully enable the usage of a repository in pacman since it prevents warnings about untrusted keys.
# pacman -Syy: This command forces pacman to refresh all package databases, ignoring the local cache. The double y flags tell pacman to refresh the databases even if they seem to be up-to-date. 
# This is particularly useful after adding a new repository to ensure your package lists are current.

curl -O https://archzfs.com/archzfs.gpg
pacman-key -a archzfs.gpg
pacman-key --lsign-key DDF7DB817396A49B2A2723F7403BD972F75D9D76
pacman -Syy

# Creates a symbolic link from your time zone file to /etc/localtime, setting the system's time zone. Replace Region/City with your own, for example Europe/Rome.
ln -sf /usr/share/zoneinfo/Region/City /etc/localtime

# Sets the hardware clock from the current system time.

hwclock --systohc

# Modifies the locale.gen file to activate the en_US.UTF-8 locale by removing the comment symbol (#), preparing it to be generated. 
# Affecting how your system handles language and regional formatting (like date, time, and number formats).

sed -i 's/#\(en_US\.UTF-8\)/\1/' /etc/locale.gen
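
# To see what the substitution does before touching the real file, it can be tried on a scratch
# copy. A minimal sketch; the function name uncomment_en_us and the two sample lines are
# illustrative, and sed -i is used in its GNU form as in the command above.

```shell
# uncomment_en_us FILE: apply the same substitution as above to FILE.
uncomment_en_us() {
    sed -i 's/#\(en_US\.UTF-8\)/\1/' "$1"
}

# Scratch file that mimics two lines of /etc/locale.gen.
scratch=$(mktemp)
printf '#en_GB.UTF-8 UTF-8\n#en_US.UTF-8 UTF-8\n' > "$scratch"

uncomment_en_us "$scratch"

# Lines that do not start with "#" are the locales locale-gen will build.
grep -v '^#' "$scratch"       # prints: en_US.UTF-8 UTF-8
rm -f "$scratch"
```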

# This command reads the updated locale.gen file and generates the locales enabled within it. 
# Generating locales allows the system to support various language and regional settings, ensuring that system messages and interfaces can be presented in the chosen language.
locale-gen

# This file configures the primary language setting for the system, affecting all users by default unless overridden by user-specific settings
echo "LANG=en_US.UTF-8" > /etc/locale.conf

# This command writes the console keyboard layout and font settings to /etc/vconsole.conf. It sets KEYMAP to "us" for the US keyboard layout.
# FONT selects the font used on the text console during non-graphical sessions. It is written commented out (with #), so the default font is kept; remove the # to enable it.

echo -e "KEYMAP=us\n#FONT=latarcyrheb-sun32" > /etc/vconsole.conf

# This command sets the system's hostname to "arch" by writing it to the /etc/hostname file. 
# The hostname is a unique identifier for your device on a network, essential for networking and system administration.

echo arch > /etc/hostname

# This command appends the local IP addresses for IPv4 (127.0.0.1) and IPv6 (::1) associated with the hostname "localhost" to the /etc/hosts file.
# The hosts file maps IP addresses to hostnames, allowing the system to resolve hostnames to IP addresses quickly, especially important for local testing and network configuration.
# Adding these entries ensures that "localhost" resolves correctly whether accessed via IPv4 or IPv6, facilitating smooth operation of applications and services that rely on these standard IP addresses for system operations.

echo -e "127.0.0.1 localhost\n::1 localhost" >> /etc/hosts

# This command prompts you to enter a new password for the root account when run within a root session

passwd

# pacman: The package manager used in Arch Linux for installing, updating, and managing software.
# -Syu: This option updates all the packages on your system to the latest version available from the repositories before installing the new packages. 
# The 'S' stands for sync. 
# The 'y' for refresh the repositories to get the latest list of packages. 
# The 'u' for upgrading the installed packages.
# archzfs-linux: Installs the Arch Linux ZFS packages, enabling support for ZFS, a robust and flexible filesystem renowned for its data integrity and a vast array of features.
# amd-ucode: Installs the microcode updates for AMD processors. Microcode is low-level software that helps correct the processor's behavior according to the latest updates from the manufacturer, essential for system stability and security.
# For Intel processors install intel-ucode instead of amd-ucode.
# networkmanager: Installs NetworkManager, which simplifies the use of networking on your system. It provides easy configuration and management of network connections via a graphical or command-line interface.
# sudo: Installs sudo, a program that allows users to run programs with the security privileges of another user, typically the superuser or root. This is crucial for managing permissions and carrying out administrative tasks securely.
# openssh: Installs OpenSSH, a suite of secure networking utilities based on the Secure Shell (SSH) protocol, essential for managing your system remotely in a secure manner.
# rsync: Installs rsync, a fast and versatile file copying tool widely used for backups and mirroring files and directories locally and remotely.
# borg: Installs BorgBackup (Borg), a deduplicating backup program that optionally supports compression and authenticated encryption. It's highly efficient in managing backup storage space and securing the backups.
# Choose the default option (`all`) for the *archzfs* group.

pacman -Syu archzfs-linux amd-ucode networkmanager sudo openssh rsync borg
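
# Which microcode package applies can be read from the CPU vendor string. A minimal sketch,
# assuming an x86 machine whose /proc/cpuinfo contains a vendor_id line; the function name
# ucode_package is hypothetical.

```shell
# ucode_package VENDOR_ID: map a /proc/cpuinfo vendor_id to a microcode package.
ucode_package() {
    case "$1" in
        AuthenticAMD) echo amd-ucode ;;
        GenuineIntel) echo intel-ucode ;;
        *)            echo "unknown CPU vendor: $1" >&2; return 1 ;;
    esac
}

# Read the vendor of the first CPU listed in /proc/cpuinfo.
vendor=$(awk -F': ' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null || true)
ucode_package "$vendor" || echo "pick the microcode package manually"
```

# The printed name can then take the place of amd-ucode in the pacman command above.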

# Sets a unique host identifier, which ZFS records in the pool to track which system last imported it.
# zgenhostid: Writes the given host ID to the /etc/hostid file.
# ZFS uses this value to detect whether a pool was last imported by a different system.
# hostid: Prints the numeric identifier of the host as a 32-bit hexadecimal number. When /etc/hostid does not exist yet, it is typically derived from the IP address of the machine.

zgenhostid $(hostid)
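
# Because the host ID is a 32-bit number, the resulting /etc/hostid file is 4 bytes of binary
# data. A minimal sketch of a sanity check based on that size; the function name hostid_file_ok
# is hypothetical.

```shell
# hostid_file_ok FILE: succeed if FILE exists and is exactly 4 bytes long.
hostid_file_ok() {
    [ -f "$1" ] && [ "$(wc -c < "$1")" -eq 4 ]
}

if hostid_file_ok /etc/hostid; then
    echo "/etc/hostid looks valid"
else
    echo "/etc/hostid is missing or has an unexpected size"
fi
```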

# Configures the ZFS pool named rpool to save its configuration data to a specific cache file, improving pool detection and management during system boot.

zpool set cachefile=/etc/zfs/zpool.cache rpool

# mkinitcpio.conf in Arch Linux is a configuration file used to specify how the initial ramdisk (initrd) image is created.
# The initrd is a temporary root file system loaded into memory when the Linux system boots.
# This file is crucial because it determines which modules and scripts are included in the initrd, affecting the early boot process of the Linux kernel.
# HOOKS in mkinitcpio.conf define a series of scripts that mkinitcpio executes in order to create the initrd.
# Each hook can add specific modules or functionalities necessary for the boot process, such as hardware detection, file system access, and other necessary services.
# Remove `fsck` and add `zfs` after `keyboard` and before `filesystems`.
# The fsck hook is responsible for checking file systems for errors at boot time. 
# This is not needed for ZFS because ZFS has its own mechanisms for ensuring file system integrity and does not rely on traditional file system checks like fsck.
# Using fsck with ZFS could potentially disrupt the boot process or lead to unnecessary delays.
# Including the zfs hook ensures that ZFS file systems are recognized and available during the boot process. 
# Placing it after keyboard and before filesystems ensures that keyboard input is available if the zfs hook needs it and that the ZFS pool is imported before the root file system is mounted, which is critical for systems where the root file system itself is on ZFS.
# This order ensures that all necessary ZFS modules and scripts are loaded before the system tries to mount file systems.

vim /etc/mkinitcpio.conf

# Example

HOOKS=(base udev autodetect modconf kms keyboard keymap consolefont block zfs filesystems)
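
# The ordering rules above can be checked before regenerating the initramfs. A minimal sketch;
# the function name check_hooks is hypothetical and it assumes HOOKS is written on a single
# line, as in the example.

```shell
# check_hooks FILE: verify the last uncommented HOOKS=(...) line in FILE.
# Fails if fsck is present, zfs is missing, zfs comes after filesystems,
# or keyboard comes after zfs.
check_hooks() (
    hooks=$(sed -n 's/^HOOKS=(\(.*\))[[:space:]]*$/\1/p' "$1" | tail -n 1)
    pos=0; zfs_pos=0; fs_pos=0; kbd_pos=0; status=0
    for hook in $hooks; do
        pos=$((pos + 1))
        case "$hook" in
            zfs)         zfs_pos=$pos ;;
            filesystems) fs_pos=$pos ;;
            keyboard)    kbd_pos=$pos ;;
            fsck)        echo "fsck hook is still present"; status=1 ;;
        esac
    done
    if [ "$zfs_pos" -eq 0 ]; then
        echo "zfs hook is missing"; status=1
    elif [ "$fs_pos" -gt 0 ] && [ "$zfs_pos" -gt "$fs_pos" ]; then
        echo "zfs must come before filesystems"; status=1
    elif [ "$kbd_pos" -gt "$zfs_pos" ]; then
        echo "keyboard should come before zfs"; status=1
    fi
    exit "$status"
)

# Tried here on a scratch file holding the example line.
conf=$(mktemp)
echo 'HOOKS=(base udev autodetect modconf kms keyboard keymap consolefont block zfs filesystems)' > "$conf"
check_hooks "$conf" && echo "HOOKS order looks fine"
rm -f "$conf"
```

# Inside the chroot the call would be: check_hooks /etc/mkinitcpio.conf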

# This command triggers the regeneration of the initramfs image for the kernel preset named "linux," based on the configurations defined in /etc/mkinitcpio.conf.

mkinitcpio -p linux

# Enables the zfs.target, which is a systemd target unit for ZFS. 
# A target unit is used to group together and manage several units. 
# Enabling this target ensures that all ZFS-related services grouped under this target are considered during system startup.

systemctl enable zfs.target

# This service is responsible for importing ZFS pools based on a cache file that remembers previous pool imports. 
# Enabling this service allows your system to automatically recognize and import ZFS pools at boot, using the configuration stored in the cache to speed up the boot process.

systemctl enable zfs-import-cache.service

# By enabling this service, you ensure that ZFS filesystems are automatically mounted when the system boots. 

systemctl enable zfs-mount.service

# Similar to zfs.target, this is another systemd target that specifically deals with the importing of ZFS pools. 
# Enabling this target ensures that any services that depend on ZFS pools being available are correctly sequenced at startup.

systemctl enable zfs-import.target

# Installs the systemd-boot bootloader into the EFI System Partition located at /boot, setting up the initial boot management mechanism to allow the system to start up and load the operating system.

bootctl --esp-path=/boot install

# Automatically update the systemd-boot bootloader whenever the systemd package is upgraded.
# [Trigger]: This section specifies the conditions under which the hook should be activated
# Type = Package: Specifies that the trigger is related to package operations.
# Operation = Upgrade: The trigger activates specifically when an upgrade operation occurs.
# Target = systemd: Limits the trigger to operations involving the systemd package.
# [Action]: This section details what should be done when the trigger conditions are met
# Description = update systemd-boot: A brief description of what the action does.
# When = PostTransaction: Specifies that the action should occur after the transaction (i.e., the upgrade of the systemd package) is completed.
# Exec = /usr/bin/bootctl update: The command to be executed, which updates systemd-boot using the bootctl tool.

mkdir -p /etc/pacman.d/hooks
vim /etc/pacman.d/hooks/100-systemd-boot.hook

[Trigger]
Type = Package
Operation = Upgrade
Target = systemd

[Action]
Description = update systemd-boot
When = PostTransaction
Exec = /usr/bin/bootctl update
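
# The hook file can also be written without an editor. A minimal sketch; the function name
# write_boot_hook is hypothetical. It takes the target directory as an argument and creates it
# if it does not exist.

```shell
# write_boot_hook DIR: create DIR if needed and write 100-systemd-boot.hook into it.
write_boot_hook() {
    mkdir -p "$1"
    cat > "$1/100-systemd-boot.hook" <<'EOF'
[Trigger]
Type = Package
Operation = Upgrade
Target = systemd

[Action]
Description = update systemd-boot
When = PostTransaction
Exec = /usr/bin/bootctl update
EOF
}

# Tried here against a scratch directory.
scratch=$(mktemp -d)
write_boot_hook "$scratch/hooks"
cat "$scratch/hooks/100-systemd-boot.hook"
rm -rf "$scratch"
```

# Inside the chroot the call would be: write_boot_hook /etc/pacman.d/hooks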

# This file is part of the configuration for systemd-boot, a simple UEFI bootloader managed by systemd
# default arch.conf: This line sets the default boot entry, named after its file in /boot/loader/entries, that systemd-boot will use if no other selection is made by the user.
# timeout 3: This specifies the timeout in seconds before the default boot entry is automatically selected.

vim /boot/loader/loader.conf

default arch.conf
timeout 3

# title Arch Linux: This line sets the title of the boot entry as displayed on the boot menu.
# linux /vmlinuz-linux: Specifies the path to the Linux kernel. /vmlinuz-linux is the typical name for the compressed, bootable Linux kernel in Arch Linux.
# initrd /amd-ucode.img: Specifies the initial ramdisk image that contains microcode updates for AMD processors. 
# This line is crucial for ensuring that the processor firmware is updated to its latest version before the kernel starts, which can improve hardware compatibility and system stability.
# If using an Intel processor, replace with /intel-ucode.img.
# initrd /initramfs-linux.img: Points to the general initramfs image. The initramfs (initial RAM filesystem) is a temporary root filesystem loaded into memory during the boot process. 
# It contains necessary drivers and scripts used to mount the real root filesystem.
# options zfs=rpool/ROOT/default rw: This line specifies kernel parameters. 
# The zfs=rpool/ROOT/default parameter tells the system that the root filesystem is on a ZFS pool located at rpool/ROOT/default. 
# The rw at the end stands for "read-write," indicating that the root filesystem should be mounted with read and write permissions.

vim /boot/loader/entries/arch.conf

title Arch Linux
linux /vmlinuz-linux
initrd /amd-ucode.img
initrd /initramfs-linux.img
options zfs=rpool/ROOT/default rw
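
# A boot entry that points at a file missing from the EFI System Partition will not boot, so it
# can help to check the paths first. A minimal sketch; the function name check_entry is
# hypothetical. It resolves each linux and initrd path relative to the given partition root.

```shell
# check_entry ESP ENTRY: confirm every linux/initrd path in ENTRY exists under ESP.
check_entry() (
    status=0
    while read -r key path rest; do
        case "$key" in
            linux|initrd)
                [ -f "$1$path" ] || { echo "missing: $1$path"; status=1; } ;;
        esac
    done < "$2"
    exit "$status"
)

# Tried here against a scratch directory with amd-ucode.img left out on purpose.
esp=$(mktemp -d)
touch "$esp/vmlinuz-linux" "$esp/initramfs-linux.img"
cat > "$esp/arch.conf" <<'EOF'
title Arch Linux
linux /vmlinuz-linux
initrd /amd-ucode.img
initrd /initramfs-linux.img
options zfs=rpool/ROOT/default rw
EOF
check_entry "$esp" "$esp/arch.conf" || echo "fix the entry or install the missing package"
rm -rf "$esp"
```

# Inside the chroot the call would be: check_entry /boot /boot/loader/entries/arch.conf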

# This command creates a new user account named 'username', includes a home directory with the -m option, and adds the new user to the 'wheel' group using the -G wheel option.
# The 'wheel' group is configured to allow its members to execute commands with elevated privileges, similar to those of the root user, via sudo.
useradd -mG wheel username

# This command sets or changes the password for the user 'username'.

passwd username

# This command opens the sudoers file in the Vim editor for safe editing. visudo checks the syntax of the file to prevent configuration errors from blocking sudo operations.
# You’ll need to find the line that includes the wheel group and uncomment it (remove the # at the beginning of the line).
# This line typically looks something like %wheel ALL=(ALL:ALL) ALL. Uncommenting this line grants all members of the wheel group the ability to execute any command as any user, including root, which is essential for administrative tasks.

EDITOR=vim visudo

# This command configures the NetworkManager service to start automatically at boot, managing network connections to ensure your system is connected to the internet or local network as soon as it starts.

systemctl enable NetworkManager

# This command sets the SSH daemon (sshd) to start automatically when the system boots, enabling remote access to your system via the SSH protocol for secure command-line administration from other computers.

systemctl enable sshd

# exit: This command exits from the chroot environment
# zfs umount -a: This ZFS command unmounts all ZFS-mounted filesystems.
# umount -R /mnt: This command recursively unmounts /mnt together with every mount point beneath it.

exit
zfs umount -a
umount -R /mnt

# Safely detach the ZFS storage pool named rpool from the system

zpool export rpool

# Restart the machine to boot into the newly installed system.

reboot

A minimal Arch Linux system with root on ZFS should now be configured.