I’m getting a strange result when doing some performance testing with KubeVirt on a bare-metal Kubernetes cluster with Piraeus 2.8.1. It’s 3 servers (all control plane nodes), each with 4x 1.8TB SSDs for Piraeus. I’ve configured the storagecluster with:
I used the linstor CLI to create the data-nic interfaces on a dedicated 25Gbps NIC before enabling the linstorNodeConnection.
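For reference, the interfaces were created roughly like this (node names and IP addresses below are just placeholders, not the real ones):

linstor node interface create node-1 data-nic 192.168.100.11
linstor node interface create node-2 data-nic 192.168.100.12
linstor node interface create node-3 data-nic 192.168.100.13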
When I start a Windows VM, performance seems to be just fine. CrystalDiskMark shows numbers in line with expectations. But on a Linux VM, performance is terrible for some reason. FIO shows no more than 100 IOPS and dd tops out at 2.4MB/sec.
Nothing I do changes this, so there must be something wrong somewhere.
CrystalDiskMark and dd are probably not testing the same things.
What settings are you using with CrystalDiskMark?
Running dd like you’ve shown is effectively testing an IO depth (queue depth) of 1, meaning each of those 10,000 4k writes has to wait for the previous write to complete before the next one is placed in the queue. It’s also testing sequential writes rather than random writes, which are what people more commonly look at with 4k writes.
I’m not super familiar with Windows benchmarking tools, but it’s also possible that CrystalDiskMark’s IO is being multi-threaded, which dd isn’t going to do on its own.
If you want to do more robust benchmarking in Linux, I would recommend looking at fio. With it you can much more closely replicate real-world IO patterns.
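For example, something along these lines would exercise random 4k IO at a realistic queue depth (the target directory, size, runtime, and job count are just placeholders to adjust for your environment):

fio --name=randrw-test --directory=/mnt/test --size=4G \
    --rw=randrw --rwmixread=65 --bs=4k --direct=1 \
    --ioengine=libaio --iodepth=32 --numjobs=4 \
    --runtime=60 --time_based --group_reporting

Raising --iodepth and --numjobs is what actually keeps the virtio queues full, which is exactly what a queue-depth-1 dd can’t show you.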
fio gave me only ~100 IOPS in a 65/35 read/write mixed workload. Running the exact same fio test on the same hardware, but using Portworx Enterprise as the CSI, results in ~24,000 IOPS.
The problem isn’t with the testing methodology; the problem is with Piraeus. I’m looking for some pointers on where the issue could be.
You haven’t shared the testing parameters, only that you’re using CrystalDiskMark in Windows and some combination of fio and dd in Linux, so making sure you’re comparing at least somewhat similar tests seems important. If you’re sure it’s not methodology, I’ll move on.
If Windows VMs perform as expected but Linux VMs are giving you trouble, both while using the same Piraeus storage class, that would suggest there is an issue above the storage, no? Windows VMs and Linux VMs are going to use different storage drivers and IO scheduling methods.
I would look at these things within the Linux guest:
lspci | grep Virtio # ensure NICs and disks are using virtio, not e1000/IDE
dmesg | grep -i virtio # confirm drivers loaded correctly
cat /sys/block/sdb/queue/scheduler # ensure the disk is not using CFQ
cat /proc/cpuinfo # see if CPU flags (e.g., AES, AVX, etc.) are being exposed
If everything looks okay there, can you share more about the guest OSes you’re testing with?
Earlier testing on other hardware didn’t show any issues, so I’m hoping the issue is something hardware-specific. We’re using Ubuntu for the Linux VMs; I tried both 22.04 and 24.04 with the same result.
The Linux guest shows:
root@vm1x-ubuntu-fio:~# lspci | grep Virtio
01:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
05:00.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI (rev 01)
06:00.0 Communication controller: Red Hat, Inc. Virtio console (rev 01)
07:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
08:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
09:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
0a:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
0b:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
0c:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon (rev 01)
0d:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio RNG (rev 01)
root@vm1x-ubuntu-fio:~# dmesg | grep -i virtio
[ 2.065290] scsi host0: Virtio SCSI HBA
[ 2.070584] virtio_blk virtio3: [vda] 209719464 512-byte logical blocks (107 GB/100 GiB)
[ 2.123908] virtio_blk virtio4: [vdb] 2048 512-byte logical blocks (1.05 MB/1.00 MiB)
[ 2.135260] virtio_blk virtio5: [vdc] 209719464 512-byte logical blocks (107 GB/100 GiB)
[ 2.135555] virtio_net virtio0 enp1s0: renamed from eth0
[ 2.141024] virtio_blk virtio6: [vdd] 209719464 512-byte logical blocks (107 GB/100 GiB)
[ 2.144539] virtio_blk virtio7: [vde] 209719464 512-byte logical blocks (107 GB/100 GiB)
root@vm1x-ubuntu-fio:~# cat /sys/block/vdb/queue/scheduler
[none] mq-deadline
root@vm1x-ubuntu-fio:~# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel Core i7 9xx (Nehalem Class Core i7)
stepping : 3
microcode : 0x1
cpu MHz : 1995.317
cache size : 16384 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm cpuid_fault pti
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 3990.63
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel Core i7 9xx (Nehalem Class Core i7)
stepping : 3
microcode : 0x1
cpu MHz : 1995.317
cache size : 16384 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm cpuid_fault pti
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 3990.63
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel Core i7 9xx (Nehalem Class Core i7)
stepping : 3
microcode : 0x1
cpu MHz : 1995.317
cache size : 16384 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm cpuid_fault pti
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 3990.63
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel Core i7 9xx (Nehalem Class Core i7)
stepping : 3
microcode : 0x1
cpu MHz : 1995.317
cache size : 16384 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm cpuid_fault pti
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 3990.63
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
You could try running a blkdiscard from within the guest against the affected devices to see if there might be some stale blocks that can be discarded from previous tests:
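For example, assuming /dev/vdb is one of the scratch devices used for testing (blkdiscard discards every block on the device, so only run it against disks whose contents you don’t need):

blkdiscard /dev/vdb # discard all blocks on the test device before re-running fio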