Tadeu Bento

Aaron’s ZFS Guide: Best Practices and Caveats


🗂️ Aaron’s ZFS Guide – Table of Contents

Zpool Administration
0. Install ZFS on Debian GNU/Linux
1. VDEVs
2. RAIDZ
3. The ZFS Intent Log (ZIL)
4. The Adjustable Replacement Cache (ARC)
5. Exporting and Importing Storage Pools
6. Scrub and Resilver
7. Getting and Setting Properties
8. Best Practices and Caveats

ZFS Administration
9. Copy-on-write
10. Creating Filesystems
11. Compression and Deduplication
12. Snapshots and Clones
13. Sending and Receiving Filesystems
14. ZVOLs
15. iSCSI, NFS and Samba
16. Getting and Setting Properties
17. Best Practices and Caveats

Appendices
A. Visualizing The ZFS Intent Log (ZIL)
B. Using USB Drives
C. Why You Should Use ECC RAM
D. The True Cost Of Deduplication

Linked Content / Mirrored
Aaron’s ZFS Guide Linked: ZFS RAIDZ stripe width
Aaron’s ZFS Guide Linked: How A ZIL Improves Disk Latencies

Best Practices

As with all recommendations, some of these guidelines carry a great amount of weight, while others might not. You may not even be able to follow them as rigidly as you would like. Regardless, you should be aware of them. I’ll try to provide a reason for each. They’re listed in no specific order. The idea of “best practices” is to optimize space efficiency and performance, and to ensure maximum data integrity.

Caveats

The point of the caveat list is by no means to discourage you from using ZFS. Instead, as a storage administrator planning out your ZFS storage server, these are things that you should be aware of, so that you are not caught with your pants down, and without your data. If you don’t heed these warnings, you could end up with corrupted data. The line may be blurred with the “best practices” list above. I’ve tried to keep this list about things that lead to data corruption if not heeded. Read and heed the caveats, and you should be good.

When loading the “zfs” kernel module, make sure to set a maximum size for the ARC. Doing a lot of “zfs send” or snapshot operations will cache the data. If the maximum is not set, RAM will slowly fill until the kernel invokes the OOM killer and the system becomes unresponsive. I have set “options zfs zfs_arc_max=2147483648” in my /etc/modprobe.d/zfs.conf file, which is a 2 GiB limit for the ARC.
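The value of zfs_arc_max is given in bytes, so the figure quoted above is simply 2 × 1024 × 1024 × 1024. As a minimal sketch, the shell function below turns a limit in whole GiB into the matching option line. The name arc_max_line is a hypothetical helper, not part of ZFS, and the output still has to be placed in /etc/modprobe.d/zfs.conf by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): print the modprobe option line
# for an ARC limit given in whole GiB.
arc_max_line() {
    gib="$1"
    # 1 GiB = 1024 * 1024 * 1024 bytes
    bytes=$(( gib * 1024 * 1024 * 1024 ))
    printf 'options zfs zfs_arc_max=%s\n' "$bytes"
}

# Print the option line for a 2 GiB limit.
arc_max_line 2
```

For a 2 GiB limit this prints the same line quoted above. The setting is read when the module loads, and on ZFS on Linux the active value can usually be read back from /sys/module/zfs/parameters/zfs_arc_max.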

A “zfs destroy” can cause downtime for other datasets. A “zfs destroy” will touch every file in the dataset that resides in the storage pool. The larger the dataset, the longer this will take, and it will use all the possible IOPS out of your drives to make it happen. Thus, if it takes 2 hours to destroy the dataset, that’s 2 hours of potential downtime for the other datasets in the pool.

Debian and Ubuntu will not start the NFS daemon without a valid export in the /etc/exports file. You must either modify the /etc/init.d/nfs init script to start without an export, or create a local dummy export.
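A local dummy export can be as small as one line in /etc/exports. The sketch below assumes /mnt as the exported path and localhost as the only client, read-only. Both are placeholder choices for illustration, and dummy_export_line is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: build a read-only export entry for a single client.
# An /etc/exports entry has the form: <path> <client>(<options>)
dummy_export_line() {
    path="$1"
    client="$2"
    printf '%s %s(ro)\n' "$path" "$client"
}

# Print the entry. As root, it could be appended to /etc/exports with:
#   dummy_export_line /mnt localhost >> /etc/exports
dummy_export_line /mnt localhost
```

Exporting a harmless path to localhost only keeps the placeholder from exposing anything on the network while still giving the NFS daemon a valid entry to start with.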

Debian and Ubuntu, and probably other systems, use a parallelized boot. As such, init script execution order is no longer prioritized. This creates problems for mounting ZFS datasets on boot. For Debian and Ubuntu, touch the “/etc/init.d/.legacy-bootordering” file, and make sure that the /etc/init.d/zfs init script is the first to start, before all other services in that runlevel.

Do not create ZFS storage pools from files in other ZFS datasets. This will cause all sorts of headaches and problems.

When creating ZVOLs, make sure to set the block size to the same size as, or a multiple of, the block size of the filesystem you will be formatting the ZVOL with. If the block sizes do not align, performance issues could arise.
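The alignment rule can be checked before creating anything. The sketch below tests whether a ZVOL block size is the same as, or a whole multiple of, the filesystem block size, both in bytes. The function name check_alignment, the pool/volume name tank/vol, and the sizes are illustrative assumptions only.

```shell
#!/bin/sh
# Hypothetical helper: report whether the ZVOL block size (bytes) equals,
# or is a whole multiple of, the filesystem block size (bytes).
check_alignment() {
    volblock="$1"
    fsblock="$2"
    if [ "$fsblock" -gt 0 ] && [ $(( volblock % fsblock )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

check_alignment 8192 4096   # ZVOL block is 2x the filesystem block
check_alignment 4096 8192   # ZVOL block is smaller than the filesystem block

# With sizes that line up, the ZVOL and filesystem could then be created
# along these lines (tank/vol is a placeholder pool/volume name):
#   zfs create -V 10G -o volblocksize=4096 tank/vol
#   mkfs.ext4 -b 4096 /dev/zvol/tank/vol
```

The block size of a ZVOL is fixed when the volume is created, so it is worth settling on the filesystem block size first.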


This article includes content by Aaron Toponce, originally published on pthree.org in 2012, which is unfortunately no longer available online. I’ve mirrored his valuable work here to ensure that readers continue to have access to this information.
