Aaron’s ZFS Guide: Best Practices and Caveats

9 months ago

🗂️ Aaron’s ZFS Guide – Table of Contents

Zpool Administration	ZFS Administration	Appendices
0. Install ZFS on Debian GNU/Linux	9. Copy-on-write	A. Visualizing The ZFS Intent Log (ZIL)
1. VDEVs	10. Creating Filesystems	B. Using USB Drives
2. RAIDZ	11. Compression and Deduplication	C. Why You Should Use ECC RAM
3. The ZFS Intent Log (ZIL)	12. Snapshots and Clones	D. The True Cost Of Deduplication
4. The Adjustable Replacement Cache (ARC)	13. Sending and Receiving Filesystems
5. Exporting and Importing Storage Pools	14. ZVOLs	Linked Content / Mirrored
6. Scrub and Resilver	15. iSCSI, NFS and Samba	Aaron’s ZFS Guide Linked: ZFS RAIDZ stripe width
7. Getting and Setting Properties	16. Getting and Setting Properties	Aaron’s ZFS Guide Linked: How A ZIL Improves Disk Latencies
8. Best Practices and Caveats	17. Best Practices and Caveats

Best Practices

As with all recommendations, some of these guidelines carry a great amount of weight, while others might not. You may not even be able to follow them as rigidly as you would like. Regardless, you should be aware of them. I’ll try to provide a reason for each. They’re listed in no specific order. The idea of “best practices” is to optimize space efficiency and performance while ensuring maximum data integrity.

Always enable compression. There is almost certainly no reason to keep it disabled. It hardly touches the CPU and hardly touches throughput to the drive, yet the benefits are amazing.
Unless you have the RAM, avoid using deduplication. Unlike compression, deduplication is very costly on the system. The deduplication table consumes massive amounts of RAM.
Avoid running a ZFS root filesystem on GNU/Linux for the time being. It’s a bit too experimental for /boot and GRUB. However, do create datasets for /home/, /var/log/ and /var/cache/.
Snapshot frequently and regularly. Snapshots are cheap, and can keep a plethora of file versions over time. Consider using something like the zfs-auto-snapshot script.
Snapshots are not a backup. Use “zfs send” and “zfs receive” to send your ZFS snapshots to external storage.
If using NFS, use ZFS NFS rather than your native exports. This can ensure that the dataset is mounted and online before NFS clients begin sending data to the mountpoint.
Don’t mix NFS kernel exports and ZFS NFS exports. This is difficult to administer and maintain.
For /home/ ZFS installations, set up nested datasets for each user, for example pool/home/atoponce and pool/home/dobbs. Consider using quotas on the datasets.
When using “zfs send” and “zfs receive”, send incremental streams with the “zfs send -i” switch. This can be an exceptional time saver.
Consider using “zfs send” over “rsync”, as the “zfs send” command can preserve dataset properties.
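
A few of the practices above can be sketched as shell commands. This is only an illustration under stated assumptions: the pool name "pool" and the user datasets come from the examples in the list, while the 10G quota, the snapshot names @monday and @tuesday, the host "backuphost" and the target dataset "backup/atoponce" are placeholders made up for this sketch. The function prints the commands for review rather than running them.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of executing them.
# POOL and the user datasets follow the examples in the list above; the quota,
# snapshot names, backup host and target dataset are hypothetical placeholders.
POOL="pool"

best_practice_plan() {
    cat <<EOF
zfs set compression=on $POOL
zfs create $POOL/home
zfs create -o quota=10G $POOL/home/atoponce
zfs create -o quota=10G $POOL/home/dobbs
zfs snapshot $POOL/home/atoponce@monday
zfs send $POOL/home/atoponce@monday | ssh backuphost zfs receive backup/atoponce
zfs snapshot $POOL/home/atoponce@tuesday
zfs send -i $POOL/home/atoponce@monday $POOL/home/atoponce@tuesday | ssh backuphost zfs receive backup/atoponce
EOF
}

best_practice_plan
```

The first send transfers the full @monday snapshot. The second sends only the changes between @monday and @tuesday, which is where the time savings of “zfs send -i” come from.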

Caveats

The point of the caveat list is by no means to discourage you from using ZFS. Instead, as a storage administrator planning out your ZFS storage server, these are things that you should be aware of, so that you are not caught with your pants down and without your data. If you don’t heed these warnings, you could end up with corrupted data. The line may be blurred with the “best practices” list above. I’ve tried to limit this list to items that can lead to data corruption if not heeded. Read and heed the caveats, and you should be good.

When loading the “zfs” kernel module, make sure to set a maximum size for the ARC. Doing a lot of “zfs send” or snapshot operations will cache the data. If the limit is not set, RAM will slowly fill until the kernel invokes the OOM killer and the system becomes unresponsive. I have set “options zfs zfs_arc_max=2147483648” in my /etc/modprobe.d/zfs.conf file, which is a 2 GB limit for the ARC.
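
The value given to zfs_arc_max is in bytes, so it helps to compute it rather than type it by hand. A minimal sketch, assuming the limit is chosen in whole GiB; the function names are hypothetical helpers, and nothing is written to disk:

```shell
#!/bin/sh
# Sketch: turn an ARC limit given in GiB into the byte value zfs_arc_max expects.
arc_max_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the line for /etc/modprobe.d/zfs.conf; the file itself is not touched.
arc_options_line() {
    echo "options zfs zfs_arc_max=$(arc_max_bytes "$1")"
}

arc_options_line 2    # prints: options zfs zfs_arc_max=2147483648
```

On ZFS on Linux, the value currently in effect can usually be read from /sys/module/zfs/parameters/zfs_arc_max once the module is loaded.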

A “zfs destroy” can cause downtime for other datasets. A “zfs destroy” will touch every file in the dataset that resides in the storage pool. The larger the dataset, the longer this will take, and it will use all the possible IOPS out of your drives to make it happen. Thus, if it takes 2 hours to destroy the dataset, that’s 2 hours of potential downtime for the other datasets in the pool.

Debian and Ubuntu will not start the NFS daemon without a valid export in the /etc/exports file. You must either modify the /etc/init.d/nfs init script to start without an export, or create a local dummy export.
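
One way to check whether a dummy export is needed is to look for any active entry in the exports file. A rough sketch: the helper name is hypothetical, and the suggested entry "/mnt localhost(ro)" is only an assumed example of a harmless local export, not something the system requires verbatim.

```shell
#!/bin/sh
# Sketch: report whether an exports file has at least one active entry,
# i.e. a line that is neither blank nor a comment.
has_active_export() {
    grep -Eqv '^[[:space:]]*(#|$)' "$1" 2>/dev/null
}

# The path defaults to /etc/exports; the dummy entry below is a hypothetical example.
EXPORTS_FILE="${EXPORTS_FILE:-/etc/exports}"
if has_active_export "$EXPORTS_FILE"; then
    echo "$EXPORTS_FILE already has an export"
else
    echo "no export found; consider adding a line such as: /mnt localhost(ro)"
fi
```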

Debian and Ubuntu, and probably other systems, use a parallelized boot. As such, init script execution order is no longer prioritized. This creates problems for mounting ZFS datasets on boot. For Debian and Ubuntu, touch the “/etc/init.d/.legacy-bootordering” file, and make sure that the /etc/init.d/zfs init script is the first to start, before all other services in that runlevel.

Do not create ZFS storage pools from files in other ZFS datasets. This will cause all sorts of headaches and problems.

When creating ZVOLs, make sure to set the block size to the same as, or a multiple of, the block size that you will be formatting the ZVOL with. If the block sizes do not align, performance issues could arise.
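
The alignment rule can be checked with simple arithmetic before the ZVOL is created. A minimal sketch, assuming both sizes are given in bytes; the helper name, the 10G volume size, the 8192/4096 values and the dataset name pool/vol are all hypothetical, and the command is only printed:

```shell
#!/bin/sh
# Sketch: succeed when the ZVOL block size equals, or is a whole multiple of,
# the block size of the filesystem that will be created on it (both in bytes).
blocks_aligned() {
    volblocksize=$1
    fs_blocksize=$2
    [ $(( volblocksize % fs_blocksize )) -eq 0 ]
}

# Hypothetical values: an 8 KiB volblocksize under a 4 KiB filesystem block.
if blocks_aligned 8192 4096; then
    echo "zfs create -V 10G -o volblocksize=8192 pool/vol"
else
    echo "block sizes do not align" >&2
fi
```

The filesystem would then be created with the matching block size, for example with “mkfs.ext4 -b 4096” on the ZVOL device, which ZFS on Linux typically exposes under /dev/zvol/.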

This article includes content by Aaron Toponce, originally published on pthree.org in 2012, which is unfortunately no longer available online. I’ve mirrored his valuable work here to ensure that readers continue to have access to this information.