Storage Backend Comparison

I’m currently using FileThin on an XFS-formatted partition as my storage backend in LINSTOR, but I’ve seen recommendations to use LVM Thin instead.

What are the downsides of using FileThin compared to LVM Thin? Does LVM Thin have less overhead than FileThin?

I took the time to run some performance benchmarks and wanted to share the results. Hopefully, this can spark some discussion.

I compared the following storage backends:

  • Raw host devices (for baseline performance)
  • FileThin
  • LVM Thin
  • ZFS Thin

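For anyone who wants to recreate the setup, this is roughly how each backend maps onto a LINSTOR storage pool. It’s a sketch rather than my exact configuration: the node, pool, volume group, zpool, and directory names (node1, vg_nvme/thinpool, tank, /var/lib/linstor-pools) are placeholders.

# LVM Thin: thin pool carved out of an LVM volume group (placeholder names)
linstor storage-pool create lvmthin node1 pool-lvmthin vg_nvme/thinpool

# FileThin: sparse files on an existing filesystem (XFS in my case)
linstor storage-pool create filethin node1 pool-filethin /var/lib/linstor-pools

# ZFS Thin: sparse zvols inside an existing zpool
linstor storage-pool create zfsthin node1 pool-zfsthin tank
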
Since I wanted to test the storage backends rather than the network, no replication was used during these tests. I ran 3 passes of the following fio command, mainly looking at random write performance with a 4K block size.

fio --name=fio-test --filename=test0 --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --runtime=30s --numjobs=4 --iodepth=32 --size=10G --group_reporting --rwmixwrite=100
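
If anyone wants to script the passes, a minimal sketch is below; the JSON output and the jq step are not how I collected the numbers above, just one way to do it.

# run three passes and keep fio’s JSON output per pass
for i in 1 2 3; do
  fio --name=fio-test --filename=test0 --ioengine=libaio --direct=1 \
      --rw=randwrite --bs=4k --runtime=30s --numjobs=4 --iodepth=32 \
      --size=10G --group_reporting --rwmixwrite=100 \
      --output-format=json --output=pass${i}.json
done

# pull the write IOPS and mean completion latency (ns) out of each pass
jq '.jobs[0].write | {iops, clat_mean_ns: .clat_ns.mean}' pass*.json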

CSI Parameters
Each StorageClass was defined with the following settings:

provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  linstor.csi.linbit.com/placementCount: "1"
  linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
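
For completeness, a full StorageClass for one backend would look roughly like this. The name and the storagePool value are placeholders that would need to match a pool you actually created, so treat it as a sketch.

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-lvmthin                                  # placeholder name
provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  linstor.csi.linbit.com/storagePool: "pool-lvmthin"     # assumed pool name
  linstor.csi.linbit.com/placementCount: "1"
  linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
EOF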

Test Results

Pass  Storage Backend  IOPS    Bandwidth (MiB/s)  Avg Latency (µs)  CPU Usage (%)
1     Raw Device       44,751  175                2,859             46.98
2     Raw Device       64,978  254                1,968             29.87
3     Raw Device       68,665  267                1,868             23.77
1     FileThin          8,547  33.4               14,970            12.03
2     FileThin         14,800  57.9               8,640             12.18
3     FileThin         19,127  74.6               6,702             12.31
1     LVM Thin         20,563  80.4               6,214             31.04
2     LVM Thin         37,936  148                3,377             38.73
3     LVM Thin         44,686  174                2,876             36.79
1     ZFS Thin         34,400  134                3,716             12.92
2     ZFS Thin         35,300  138                3,625             11.21
3     ZFS Thin         37,400  146                3,425             12.01

My Observations

Performance Improvements Across Runs:

  • The first pass was always slower than the subsequent runs. FileThin and LVM Thin showed the most improvement across passes, which suggests that caching mechanisms might be in play (see the preconditioning sketch after this list).
  • ZFS Thin was more stable, with less variation between runs.
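
One thing I might try to separate first-pass effects from steady-state numbers is an untimed preconditioning pass that writes the whole test file sequentially before the timed random-write runs. This is only a sketch; it was not part of the results above.

# untimed preconditioning pass: sequentially write the full 10G file once,
# so the timed random-write passes are not paying for first-time allocation
fio --name=precondition --filename=test0 --ioengine=libaio --direct=1 \
    --rw=write --bs=1M --size=10G

# then run the timed 4k randwrite passes as before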

Open Questions

  • Has anyone else observed caching effects in similar tests?
  • Are there any recommended tunings for LVM Thin or ZFS Thin to improve performance? I’ve already reviewed Performance Tuning for LINSTOR Persistent Storage in Kubernetes - LINBIT, which mentions that disk-flushes and md-flushes can possibly be disabled. But is that only safe with hardware RAID? I could use some help figuring out whether I can disable these.

Some good reads, which I should have found sooner:

Seems LVM Thin beats ZFS Thin in those tests; the gap was less obvious in mine.

  1. I wasn’t sure if you were running the same tests using the same PV, or if you were deleting the PV between “passes”. If you were using the same PV, the performance hit could be due to block allocation during the first pass, which doesn’t have to happen during subsequent passes.

  2. For hardware installations, you would only want to disable flushes if you have battery-backed write caches on your storage controllers. Otherwise, you could end up with partial writes or corruption during hard crashes. In virtual machines, you could enable “writethrough” on the virtual storage to let the host machine’s caches handle the flushing, but you’d still leave yourself open to corruption if the host crashes and doesn’t have battery-backed write caches.
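
If you do have battery-backed (or flash-backed) caches, the flushes can be relaxed through LINSTOR’s DRBD options rather than by editing DRBD configuration by hand. The commands below are only a sketch: the resource-group name is a placeholder, and the exact option flags should be confirmed against your linstor client version first.

# confirm the flags exist in your linstor client before changing anything
linstor resource-group drbd-options --help | grep -i flush

# hypothetical: relax flushing for the resource group backing the storage class
linstor resource-group drbd-options my-resource-group --disk-flushes no --md-flushes no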
