ZFS as Root Filesystem

From Funtoo
Revision as of 18:04, September 19, 2013 by Oleg (talk | contribs)

Introduction

This tutorial will show you how to install Funtoo on ZFS (rootfs). It is meant to be an "overlay" over the Regular Funtoo Installation: follow the normal installation and use this guide only for steps 2, 3, and 8.

Introduction to ZFS

Since ZFS is a new technology for Linux, it can be helpful to understand some of its benefits, particularly in comparison to btrfs, another popular next-generation Linux filesystem:

  • On Linux, the ZFS code can be updated independently of the kernel to obtain the latest fixes. btrfs is exclusive to Linux and you need to build the latest kernel sources to get the latest fixes.
  • ZFS is supported on multiple platforms. The platforms with the best support are Solaris, FreeBSD and Linux. Other platforms with varying degrees of support are NetBSD, Mac OS X and Windows. btrfs is exclusive to Linux.
  • ZFS has the Adaptive Replacement Cache replacement algorithm while btrfs uses the Linux kernel's Least Recently Used replacement algorithm. The former often has an overwhelmingly superior hit rate, which means fewer disk accesses.
  • ZFS has the ZFS Intent Log and SLOG devices, which accelerate small synchronous writes.
  • ZFS handles internal fragmentation gracefully, such that you can fill it until 100%. Internal fragmentation in btrfs can make btrfs think it is full at 10%. Btrfs has no automatic rebalancing code, so it requires a manual rebalance to correct it.
  • ZFS has raidz, which is like RAID 5/6 (or a hypothetical RAID 7 that supports 3 parity disks), except it does not suffer from the RAID write hole issue thanks to its use of CoW and a variable stripe size. btrfs gained integrated RAID 5/6 functionality in Linux 3.9. However, its implementation uses a stripe cache that can only partially mitigate the effect of the RAID write hole.
  • ZFS send/receive implementation supports incremental update when doing backups. btrfs' send/receive implementation requires sending the entire snapshot.
  • ZFS supports data deduplication, which is a memory hog and only works well for specialized workloads. btrfs has no equivalent.
  • ZFS datasets have a hierarchical namespace while btrfs subvolumes have a flat namespace.
  • ZFS has the ability to create virtual block devices called zvols in its namespace. btrfs has no equivalent and must rely on the loop device for this functionality, which is cumbersome.

The only area where btrfs is ahead of ZFS is in the area of small file efficiency. btrfs supports a feature called block suballocation, which enables it to store small files far more efficiently than ZFS. It is possible to use another filesystem (e.g. reiserfs) on top of a ZFS zvol to obtain similar benefits (with arguably better data integrity) when dealing with many small files (e.g. the portage tree).

Disclaimers

   Warning

This guide is a work in progress. Expect some quirks.

   Important

Since ZFS was really designed for 64-bit systems, we are only recommending and supporting 64-bit platforms and installations. We will not be supporting 32-bit platforms!

Video Tutorial

A YouTube video tutorial of the ZFS install is available as a companion to the install instructions below.

Downloading the ISO (With ZFS)

In order for us to install Funtoo on ZFS, you will need an environment that provides the ZFS tools. Therefore we will download a customized version of System Rescue CD with ZFS already included.

Name: sysresccd-3.7.1_zfs_0.6.2.iso   (492 MB)
Release Date: 2013-08-27
md5sum e6cbebfafb3c32c97be4acd1bb099743


Download System Rescue CD with ZFS
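
Before writing the ISO to a USB stick, it is worth checking the download against the md5sum listed above. A minimal sketch, assuming the coreutils `md5sum` tool is available; `verify_md5` is a hypothetical helper name and not part of System Rescue CD:

```shell
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: succeeds only if FILE's MD5 digest equals EXPECTED_MD5.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{ print $1 }')
    [ "$actual" = "$expected" ]
}

# Example usage (path assumed to match the mount command used later):
#   verify_md5 /root/sysresccd-3.7.1_zfs_0.6.2.iso e6cbebfafb3c32c97be4acd1bb099743 \
#       && echo "checksum OK"
```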

Creating a bootable USB from ISO

After you download the ISO, you can do the following steps to create a bootable USB:

Make a temporary directory
root # mkdir /tmp/loop

Mount the ISO
root # mount -o ro,loop /root/sysresccd-3.7.1_zfs_0.6.2.iso /tmp/loop

Run the usb installer
root # /tmp/loop/usb_inst.sh

That should be all you need to do to get your flash drive working.

Creating partitions

There are two ways to partition your disk: You can use your entire drive and let ZFS automatically partition it for you, or you can do it manually.

We will show you how to partition it manually, because manual partitioning lets you create your own layout and keep a separate /boot partition. A separate /boot is useful since not every bootloader supports booting from ZFS pools, and it lets you boot into RAID10, RAID5 (RAIDZ), and any other pool layouts.

fdisk (MBR Style)

A Fresh Start:

First, let's make sure that the disk is completely wiped of any previous disk labels and partitions. We will also assume that /dev/sda is the target drive.

root # fdisk /dev/sda
Command (m for help): o ↵
Building a new DOS disklabel with disk identifier 0xbeead864.
   Warning

This is a destructive operation. Make sure you really don't want anything on this disk.

Now that we have a clean drive, we will create the new layout.

Create Partition 1 (boot):

Command: n ↵
Partition type: 
Partition number: 
First sector: 
Last sector: +250M ↵

Create Partition 2 (ZFS):

Command: n ↵
Partition type: 
Partition number: 
First sector: 
Last sector: 

Command: t ↵
Partition number: 2 ↵
Hex code: bf ↵

Command: p ↵

Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048      514047      256000   83  Linux
/dev/sda2          514048  1953525167   976505560   bf  Solaris

gdisk (GPT Style)

A Fresh Start:

First, let's make sure that the disk is completely wiped of any previous disk labels and partitions. We will also assume that /dev/sda is the target drive.

root # gdisk /dev/sda

Command: x ↵
Expert command: z ↵
About to wipe out GPT on /dev/sda. Proceed?: y ↵
GPT data structures destroyed! You may now partition the disk using fdisk or other utilities.
Blank out MBR?: y ↵
   Warning

This is a destructive operation. Make sure you really don't want anything on this disk.

Now that we have a clean drive, we will create the new layout.

Create Partition 1 (boot):

Command: n ↵
Partition Number: 
First sector: 
Last sector: +250M ↵
Hex Code: 

Create Partition 2 (BIOS Boot Partition):

Command: n ↵
Partition Number: 
First sector: 
Last sector: +32M ↵
Hex Code: EF02 ↵
   Important

Only make the above BIOS Boot Partition if you are using GRUB 2 on GPT. If you are using the extlinux bootloader, this partition is not necessary.

Create Partition 3 (ZFS):

Command: n ↵
Partition Number: 
First sector: 
Last sector: 
Hex Code: bf00 ↵

Command: p ↵

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          514047   250.0 MiB   8300  Linux filesystem
   2          514048          579583   32.0 MiB    EF02  BIOS boot partition
   3          579584      1953525134   931.2 GiB   BF00  Solaris root

Command: w ↵
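
The sector numbers in the fdisk and gdisk tables above follow directly from the sizes entered. As a worked check, assuming 512-byte logical sectors and a first usable sector of 2048 (both taken from the sample output, not guaranteed for every disk), a hypothetical `last_sector` helper reproduces the end sectors of the +250M and +32M partitions:

```shell
# last_sector FIRST_SECTOR SIZE_MIB [SECTOR_BYTES]
# Hypothetical helper: last sector of a partition that starts at FIRST_SECTOR
# and spans SIZE_MIB mebibytes. SECTOR_BYTES defaults to 512 (an assumption).
last_sector() {
    first=$1
    size_mib=$2
    sector_bytes=${3:-512}
    echo $(( $first + $size_mib * 1024 * 1024 / $sector_bytes - 1 ))
}

last_sector 2048 250     # boot partition: prints 514047
last_sector 514048 32    # BIOS boot partition on GPT: prints 579583
```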


Format your boot volume

Format your separate /boot partition:

root # mkfs.ext2 /dev/sda1


Encryption (Optional)

If you want encryption, then create your encrypted vault(s) now by doing the following:

root # cryptsetup luksFormat /dev/sda2
root # cryptsetup luksOpen /dev/sda2 vault_1
   Important

If you followed the manual GPT partitioning instructions, you should change /dev/sda2 to /dev/sda3.

Create the zpool

We will first create the pool. The pool will be named `rpool` and the disk will be aligned to 4096-byte sectors (using ashift=12).

root # zpool create -f -o ashift=12 -o cachefile= -O compression=on -m none -R /mnt/funtoo rpool /dev/sda2
   Important

If you followed the manual GPT partitioning instructions, you should change /dev/sda2 to /dev/sda3. If you are using encrypted root, then change /dev/sda2 to vault_1.
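
The device passed to zpool create therefore depends on the choices made earlier. The mapping described in the note above can be sketched as a small shell function; `pool_device` and its two arguments are hypothetical names used only for illustration:

```shell
# pool_device SCHEME ENCRYPTED
#   SCHEME     "mbr" (fdisk layout) or "gpt" (gdisk layout with a BIOS boot partition)
#   ENCRYPTED  "yes" if the partition was opened with cryptsetup luksOpen as vault_1
pool_device() {
    scheme=$1
    encrypted=$2
    if [ "$encrypted" = "yes" ]; then
        echo "vault_1"
    elif [ "$scheme" = "gpt" ]; then
        echo "/dev/sda3"
    else
        echo "/dev/sda2"
    fi
}
```

For example, `pool_device gpt no` prints /dev/sda3, which would then take the place of /dev/sda2 in the zpool create command.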

   Note

ashift=12 should be used if you have a newer, advanced format disk that has a sector size of 4096 bytes. If you have an older disk with 512 byte sectors, you should use ashift=9, or leave the option out to let ZFS detect the sector size automatically.
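
ashift is the base-2 logarithm of the sector size, which is why 4096-byte sectors map to 12 and 512-byte sectors map to 9. A minimal sketch of that relationship; `ashift_for_sector_size` is a hypothetical helper, and it assumes the size is a power of two:

```shell
# ashift_for_sector_size SECTOR_BYTES
# Hypothetical helper: prints log2(SECTOR_BYTES) by counting halvings.
ashift_for_sector_size() {
    size=$1
    ashift=0
    while [ "$size" -gt 1 ]; do
        size=$(( size / 2 ))
        ashift=$(( ashift + 1 ))
    done
    echo "$ashift"
}

# Example usage (sda is the target drive assumed throughout this guide):
#   ashift_for_sector_size "$(cat /sys/block/sda/queue/physical_block_size)"
```

Some advanced format drives report 512 bytes even though they use 4096-byte sectors internally, so treat the result as a hint rather than a final answer.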

   Note

If you have a previous pool that you would like to import, you can do a: zpool import -f -R /mnt/funtoo <pool_name>

Create the zfs datasets

We will now create some datasets. For this installation, we will create a small but future-proof set of datasets. We will have a dataset for the OS (/), and your swap. We will also show you how to create some optional datasets: /home, /var, /usr/src, and /usr/portage.

Create some empty containers for organization purposes, and make the dataset that will hold /
root # zfs create rpool/ROOT
root # zfs create -o mountpoint=/ rpool/ROOT/funtoo

Optional, but recommended datasets: /home
root # zfs create -o mountpoint=/home rpool/HOME

Optional, portage tree, distfiles, and binary packages:
root # zfs create rpool/FUNTOO
root # zfs create -o mountpoint=/usr/portage -o compression=off rpool/FUNTOO/portage
root # zfs create -o mountpoint=/usr/portage/distfiles rpool/FUNTOO/portage/distfiles
root # zfs create -o mountpoint=/usr/portage/packages rpool/FUNTOO/portage/packages

Optional datasets: /usr/src
root # zfs create -o mountpoint=/usr/src rpool/FUNTOO/src

Create your swap zvol

Make your swap 1G larger than your RAM. An 8G machine would have 9G of swap, which is rather big; on machines with that much memory, you could just make it 2G if you don't run into any problems. Adjust the -V value in the command below to the size you want.

root # zfs create -o sync=always -o primarycache=metadata -o secondarycache=none -o volblocksize=4K -V 1G rpool/swap
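
The "RAM plus 1G" rule can be computed from /proc/meminfo. A rough sketch, assuming MemTotal is reported in kB as on current Linux kernels; `swap_size` is a hypothetical helper that rounds RAM up to whole gigabytes before adding one:

```shell
# swap_size MEM_KIB
# Hypothetical helper: prints a size such as "9G" suitable for zfs create -V.
swap_size() {
    mem_kib=$1
    # Round up to whole GiB, since MemTotal is slightly below the installed RAM.
    ram_gib=$(( (mem_kib + 1048575) / 1048576 ))
    echo "$(( ram_gib + 1 ))G"
}

# Example usage:
#   swap_size "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
```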

Format your swap zvol

root # mkswap -f /dev/zvol/rpool/swap
root # swapon /dev/zvol/rpool/swap


Last minute checks and touches

Check to make sure everything appears fine. Your output may differ depending on the choices you made above:

root # zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          sda2      ONLINE       0     0     0

errors: No known data errors

root # zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
rpool              3.10G  15.5G   136K  none
rpool/HOME          136K  15.5G   136K  /mnt/funtoo/home
rpool/ROOT          308K  15.5G   136K  none
rpool/ROOT/funtoo   172K  15.5G   172K  /mnt/funtoo
rpool/swap         3.09G  18.6G    76K  -

Now we will continue to install Funtoo.

Installing Funtoo

Download and extract the Funtoo stage3 and continue installation as normal.

Then, once you've extracted the stage3, chroot into your new Funtoo environment:

Go into the directory that you will chroot into
root # cd /mnt/funtoo

Mount your boot drive
root # mount /dev/sda1 /mnt/funtoo/boot

Bind the kernel related directories
root # mount -t proc none /mnt/funtoo/proc
root # mount --rbind /dev /mnt/funtoo/dev
root # mount --rbind /sys /mnt/funtoo/sys

Copy network settings
root # cp /etc/resolv.conf /mnt/funtoo/etc/

chroot into your new funtoo environment
root # env -i HOME=/root TERM=$TERM chroot /mnt/funtoo /bin/bash --login

Place your mountpoints into your /etc/mtab file
root # cat /proc/mounts > /etc/mtab

Sync your tree
root # emerge --sync

Add filesystems to /etc/fstab

Before we compile or install our kernel in the next step, we will edit the /etc/fstab file. If we decide to install our kernel through portage, portage needs to know where /boot is so that it can place the files there. We also need to update /etc/mtab so that our system knows what is mounted.

root # nano /etc/fstab

# <fs>                  <mountpoint>    <type>          <opts>          <dump/pass>
# Do not add the /boot line below if you are using whole-disk zfs
/dev/sda1               /boot           ext2            defaults        0 2
/dev/zvol/rpool/swap    none            swap            sw              0 0

Kernel Configuration

Your kernel should have the options below enabled where they apply to your setup.

   Note

The below configurations are the requirements for "Bliss Initramfs Creator". Some of these might not be needed for genkernel.

When you configure the kernel, make sure that you disable the CFQ scheduler and turn on No-op (it is the default once you disable all other schedulers). ZFS has its own scheduler, and the CFQ one conflicts with it. Go to your kernel sources tree (normally /usr/src/linux) and make sure the following options are enabled in the kernel config:

- Linux Kernel
ZLIB_INFLATE/ZLIB_DEFLATE can be compiled as a module but must be declared 
in the ADDON_MODS variable in hooks/addon.sh.

General setup --->
> [*] Initial RAM filesystem and RAM disk (initramfs/initrd) support
> () Initramfs source file(s)

Device Drivers --->
> Generic Driver Options --->
>> [*] Maintain a devtmpfs filesystem to mount at /dev
>> [*] Automount devtmpfs at /dev, after the kernel mounted the rootfs

* All other drivers required to see your PATA/SATA drives (or USB devices) need to be compiled in
or you can compile them as a module and declare them in the ADDON_MODS variable.

For LUKS support:

- Linux Kernel
Device Drivers --->
[*] Multiple devices driver support (RAID and LVM) --->
<*> Device mapper support
<*> Crypt target support

Cryptographic API --->
<*> XTS support
-*- AES cipher algorithms
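
The menu entries above correspond to symbols in the kernel's .config file, so they can be checked without opening menuconfig. The sketch below is an illustration only: `missing_kernel_options` is a hypothetical helper, and the symbol names in the usage comment (CONFIG_BLK_DEV_INITRD, CONFIG_DEVTMPFS, CONFIG_DEVTMPFS_MOUNT, CONFIG_BLK_DEV_DM, CONFIG_DM_CRYPT, CONFIG_CRYPTO_XTS, CONFIG_CRYPTO_AES) are assumed to match the entries listed here and should be verified against your kernel version.

```shell
# missing_kernel_options CONFIG_FILE OPTION...
# Hypothetical helper: prints every OPTION that is not built in (=y) in CONFIG_FILE.
missing_kernel_options() {
    config=$1
    shift
    for opt in "$@"; do
        grep -q "^${opt}=y\$" "$config" || echo "$opt"
    done
}

# Example usage:
#   missing_kernel_options /usr/src/linux/.config \
#       CONFIG_BLK_DEV_INITRD CONFIG_DEVTMPFS CONFIG_DEVTMPFS_MOUNT \
#       CONFIG_BLK_DEV_DM CONFIG_DM_CRYPT CONFIG_CRYPTO_XTS CONFIG_CRYPTO_AES
```

Options built as modules (=m) are reported as missing by this check, which is stricter than necessary if you declare them in ADDON_MODS.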

Continue and compile/install your kernel:

root # make bzImage modules
root # make install
root # make modules_install

Installing the ZFS userspace tools

root # emerge -av zfs

Check to make sure that the ZFS tools are working. The pool and datasets that you created earlier should be displayed.

root # zpool status
root # zfs list

If everything worked, continue.

Install the bootloader

GRUB 2

Before you do this, make sure this checklist is followed:

  • Installed kernel and kernel modules
  • Installed zfs package from the tree
  • /dev, /proc, /sys are mounted in the chroot environment
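
The last checklist item can be verified from inside the chroot by looking at /proc/mounts. A minimal sketch; `is_mounted` is a hypothetical helper, and its optional second argument exists only so that the function can be pointed at a different mounts file:

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: succeeds if MOUNTPOINT appears as the second field
# of any line in MOUNTS_FILE (default: /proc/mounts).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example usage (paths as seen from inside the chroot):
#   for mp in /dev /proc /sys; do
#       is_mounted "$mp" || echo "$mp is not mounted"
#   done
```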

Once all this is checked, let's install grub2. First we need to enable the "libzfs" USE flag so that ZFS support is compiled into grub2.

root # echo "sys-boot/grub libzfs" >> /etc/portage/package.use

Then we will compile grub2:

root # emerge -av grub

Once this is done, you can check that grub is version 2.00 by running the following command:

root # grub-install --version
grub-install (GRUB) 2.00

Now try to install grub2:

root # grub-install --no-floppy /dev/sda

You should receive the following message

Installation finished. No error reported.

If not, then go back to the above checklist.

LILO

Before you do this, make sure the following checklist is followed:

  • /dev/, /proc and /sys are mounted.
  • Installed the sys-fs/zfs package from the tree.

Once the above requirements are met, we can install LILO.

root # emerge -av sys-boot/lilo

Once the installation of LILO is complete we will need to edit the lilo.conf file.

root # nano /etc/lilo.conf
boot=/dev/sda
prompt
timeout=4
default=Funtoo

image=/boot/bzImage
      label=Funtoo
      read-only
      append="root=rpool/ROOT/funtoo"
      initrd=/boot/initramfs

All that is left now is to install the bootcode to the MBR.

This can be accomplished by running:

root # /sbin/lilo

If it is successful you should see:

Warning: LBA32 addressing assumed
Added Funtoo + *
One warning was issued

Create the initramfs

There are two ways to do this: you can use genkernel, or you can use my Bliss Initramfs Creator. I will show you both.

genkernel

root # emerge -av sys-kernel/genkernel
# You only need to add --luks if you used encryption
root # genkernel --zfs --luks initramfs

Bliss Initramfs Creator

If you are encrypting your drives, then add the "luks" USE flag to your package.use before emerging:

root # echo "sys-kernel/bliss-initramfs luks" >> /etc/portage/package.use

Now install the creator:

root # emerge bliss-initramfs


Then go into the install directory, run the script as root, and move the resulting image into /boot:

root # cd /opt/bliss-initramfs
root # ./createInit
root # mv initrd-<kernel_name> /boot

<kernel_name> is the kernel you selected in the initramfs creator, and it appears in the name of the generated file.

Using boot-update

/boot on separate partition

If you created a separate non-zfs partition for boot then configuring boot-update is almost exactly the same as a normal install except that auto detection for root does not work. You must tell boot-update what your root is.

Genkernel

If you're using genkernel, you must add 'real_root=ZFS=<root>' and 'dozfs' to your params. Example entry for boot.conf:

"Funtoo ZFS" {
        kernel vmlinuz[-v]
        initrd initramfs-genkernel-x86_64[-v]
        params real_root=ZFS=rpool/ROOT/funtoo
        params += dozfs
        # Also add 'params += crypt_root=/dev/sda2' if you used encryption
        # Adjust the above setting to your system if needed
}
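
The params lines in the entry above end up as a single kernel command line. As an illustration of how the pieces fit together, a hypothetical `genkernel_params` helper could assemble them, adding crypt_root only when an encrypted device is given:

```shell
# genkernel_params ROOT_DATASET [ENCRYPTED_DEVICE]
# Hypothetical helper: prints the kernel parameters that genkernel's
# initramfs needs for a ZFS root, as described in the entry above.
genkernel_params() {
    params="real_root=ZFS=$1 dozfs"
    if [ -n "${2:-}" ]; then
        params="$params crypt_root=$2"
    fi
    echo "$params"
}

genkernel_params rpool/ROOT/funtoo
# prints: real_root=ZFS=rpool/ROOT/funtoo dozfs
genkernel_params rpool/ROOT/funtoo /dev/sda2
# prints: real_root=ZFS=rpool/ROOT/funtoo dozfs crypt_root=/dev/sda2
```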

Bliss Initramfs Creator

If you used the Bliss Initramfs Creator then all you need to do is add 'root=<root>' to your params. Example entry for boot.conf:

"Funtoo ZFS" {
        kernel vmlinuz[-v]
        initrd initrd[-v]
        params root=rpool/ROOT/funtoo quiet
        # If you have an encrypted device with a regular passphrase,
        # you can add the following line
        params += enc_root=/dev/sda3 enc_type=pass
}

After editing /etc/boot.conf, you just need to run boot-update to update grub.cfg

root # boot-update

/boot on ZFS

TBC - pending update to boot-update to support this

Final configuration

Add the zfs tools to openrc

root # rc-update add zfs boot

Clean up and reboot

We are almost done. We just need to clean up, set our root password, unmount whatever we mounted, and get out.

Delete the stage3 tarball that you downloaded earlier so it doesn't take up space.
root # cd /
root # rm stage3-latest.tar.xz

Set your root password
root # passwd
>> Enter your password, you won't see what you are writing (for security reasons), but it is there!

Get out of the chroot environment
root # exit

Unmount all the kernel filesystem stuff and boot (if you have a separate /boot)
root # umount -l proc dev sys boot

Turn off the swap
root # swapoff /dev/zvol/rpool/swap

Export the zpool
root # cd /
root # zpool export rpool

Reboot
root # reboot
   Important

Don't forget to set your root password as stated above before exiting chroot and rebooting. If you don't set the root password, you won't be able to log into your new system.

That should be enough to get your system to boot on ZFS.

After reboot

Create initial ZFS Snapshot

Continue to set up anything you need in terms of /etc configurations. Once you have everything the way you like it, take a snapshot of your system. You will use this snapshot to revert to this state if anything ever happens to your system down the road. Snapshots are cheap and almost instant.

To take the snapshot of your system, type the following:

root # zfs snapshot -r rpool@install

To see if your snapshot was taken, type:

root # zfs list -t snapshot

If your machine ever fails and you need to get back to this state, type the following. This reverts only your / dataset and keeps the rest of your data intact:

root # zfs rollback rpool/ROOT/funtoo@install
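
Before relying on a rollback, it can help to confirm that the snapshot actually exists. A sketch, assuming the `-H` (no header) and `-o name` options of zfs list; `snapshot_in_list` is a hypothetical helper that reads snapshot names on standard input:

```shell
# snapshot_in_list SNAPSHOT_NAME
# Hypothetical helper: succeeds if SNAPSHOT_NAME appears as a whole line
# on standard input.
snapshot_in_list() {
    grep -Fxq -- "$1"
}

# Example usage:
#   zfs list -H -o name -t snapshot | snapshot_in_list rpool/ROOT/funtoo@install \
#       && zfs rollback rpool/ROOT/funtoo@install
```

Because the snapshot was taken with -r, every dataset in the pool has its own @install snapshot, and each one can be checked and rolled back the same way.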
   Important

For a detailed overview, presentation of ZFS' capabilities, as well as usage examples, please refer to the ZFS Fun page.