LINSTOR on Proxmox VE: Newbie Experiences

I have worked with DRBD 8/9 for some years. Now I’m trying LINSTOR, and I’ll write down some experiences here…

I have a two-node Proxmox VE cluster and installed LINSTOR successfully.
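
For anyone following along: with the linstor-proxmox plugin installed, the storage entry in /etc/pve/storage.cfg looks roughly like the sketch below. The storage name linstor01 is the one used later in this thread; the controller IP and the resource group name are placeholders for my setup.

drbd: linstor01
    content images,rootdir
    controller 192.168.10.10
    resourcegroup pve-rg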

Typical Newbie Error #1: DRBD 8 loaded instead of DRBD 9

When I try to migrate an existing VM to the LINSTOR storage, I get the following error message in Proxmox:

TASK ERROR: storage migration failed: API Return-Code: 500. Message: Could not autoplace resource pm-8fb7414f, because: [{"ret_code":-4611686018407201828,"message":"Satellite 'pve01' does not support the following layers: [DRBD]" .... (Hundreds lines more ...)

Reason

DRBD 8 was loaded instead of DRBD 9
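
Two quick checks that point to this (node name pve01 is the one from the error above; with the in-kernel DRBD 8 module loaded, /proc/drbd reports version 8.4.x):

cat /proc/drbd       # shows "version: 8.4.x" if the old in-kernel module is loaded
linstor node info    # lists which layers each satellite supports; DRBD missing for pve01 matches the error above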

Fix (possibly overcomplicated, or not the proper way):

  1. Unload DRBD with the command rmmod drbd
  2. Load DRBD with the command modprobe drbd
  3. Check that DRBD 9 is loaded with cat /proc/drbd. It should look like this:
version: 9.2.10 (api:2/proto:86-122)
GIT-hash: b92a320cb72a0b85144e742da5930f2d3b6ce30c build by root@cln22-felina, 2024-08-01 22:25:24
Transports (api:21): tcp (9.2.10)
  4. Recreate the initramfs with the command update-initramfs -k all -c
  5. Reboot the system (to make sure the correct module is loaded at system boot)
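
To double-check afterwards that the DRBD 9 module ended up in the new initramfs and is the one modprobe will pick (a quick sketch, assuming the module comes from the drbd-dkms package; the initrd path depends on the kernel version):

modinfo drbd | grep ^version                              # version of the module modprobe would load (should be 9.x)
lsinitramfs /boot/initrd.img-$(uname -r) | grep drbd.ko   # the DRBD module baked into the initramfs
cat /proc/drbd                                            # version of the currently loaded module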

The specification of the system is shown in this other thread:

Basic VM Creation on Linstor Storage: Working Fine

Speed is quite nice (dedicated 1 GbE network link for LINSTOR).

  • Online Storage Moving of an existing VM to Linstor: Not working (see the thread linked above).

  • Online Storage Moving of newly created VMs from Linstor to LVM-Storage: Not working:

create full clone of drive scsi0 (linstor01:pm-db982f51_102)
TASK ERROR: storage migration failed: target storage is known to cause issues with aio=io_uring (used by current drive)

Strange: my other LVM-backed VMs are all using io_uring, so the message does not make sense to me. (A possible workaround is sketched after this list.)

  • Live Migration to the other Node (Linstor Satellite with disk): Working Fine

  • Cross Cluster Live Migration to another Node: Working Fine

  • VM Deletion: Working Fine

  • Cross Cluster Live Migration from another node to local cluster on Linstor Storage: Working Fine

  • Offline Storage Moving of existing VM to Linstor: Working fine

  • Offline Storage Moving of newly created VMs from Linstor to LVM-Storage: Working fine
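
Regarding the aio=io_uring error above: I have not verified this as a proper fix, but Proxmox lets you set the AIO mode per drive, so switching the drive to aio=native (or threads) before the move should avoid that check. A sketch, assuming VM 102 and the scsi0 volume from the log above (the change only takes effect after the VM is restarted, and re-setting the drive replaces its whole option string, so carry over the existing options from qm config):

qm config 102 | grep ^scsi0                               # current drive line incl. aio setting
qm set 102 --scsi0 linstor01:pm-db982f51_102,aio=native   # same volume, but with aio=native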

Oh interesting, I’ve not tried this myself. Do other storage plugins/types support moving VMs between them?

Oops, probably better to follow along in the other post :slight_smile:

So far, all non-LINSTOR types I used supported that. (I’m mostly using directory-type storages (qcow2), sometimes ZFS and Ceph.)

I just tried online moving between LINSTOR and directory storage. Moving from LINSTOR to directory storage mostly worked. The one issue I hit was that deleting the source disk after the migration was not possible. I got this error:

drive-scsi0: transferred 11.0 GiB of 11.0 GiB (100.00%) in 2m 51s
drive-scsi0: transferred 11.0 GiB of 11.0 GiB (100.00%) in 2m 52s, ready
all 'mirror' jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0: mirror-job finished
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
trying to acquire cfs lock 'storage-linstor01' ...
cfs-lock 'storage-linstor01' error: got lock request timeout
TASK OK

So the source disk could not be deleted and I had to do that manually.

I also could not delete the leftover virtual disk on LINSTOR via the GUI, because I got the error message: “Cannot remove this resource because a vm with vmid (vmid of migrated vm) is existing.” That seems to be a precaution check which does not help much here: the VM in question does exist, but it no longer uses the LINSTOR resource. I would suggest fixing the LINSTOR plugin here, i.e. execute the delete if the corresponding VM is not configured to use the LINSTOR resource.
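
A quick way to see that the precaution check is too strict in this case (VM ID 102 is just an example standing in for the vmid of the migrated VM):

qm config 102 | grep linstor01   # no output: the VM config no longer references any disk on linstor01
pvesm list linstor01             # the leftover volume is still listed on the storage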

So I had to remove the resource from both nodes manually via “linstor resource delete …”.
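
For completeness, the manual cleanup looked roughly like this (the resource name pm-db982f51_102 is taken from an earlier log and just stands for the leftover resource; pve01 is the node from the earlier error, pve02 stands for the second node):

linstor resource list | grep pm-db982f51_102          # check which nodes still hold the resource
linstor resource delete pve01 pm-db982f51_102
linstor resource delete pve02 pm-db982f51_102
linstor resource-definition delete pm-db982f51_102    # optional: drop the definition once no resource is left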

This behaviour seems to be linked to:

Doing it a second time did not show the issue described here. It seems I had a concurrent LINSTOR task running while migrating.


Online moving from directory storage to linstor worked fine.


A short test with ZFS showed the same “different sizes problem” as with LVM.
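
To illustrate what I mean, the sizes reported by Proxmox and by LINSTOR for the same volume can be compared like this (storage and volume names are just examples from earlier in the thread):

pvesm list linstor01 | grep pm-db982f51_102               # size as seen by Proxmox
linstor volume-definition list | grep pm-db982f51_102     # size of the LINSTOR volume definition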


Fast incremental Backups with Proxmox Backup Server: Working Fine
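
For reference, these were plain vzdump runs to a PBS-backed storage (the storage name pbs01 is a placeholder); with a running VM, follow-up runs only transfer the changed blocks thanks to the dirty bitmap:

vzdump 102 --storage pbs01 --mode snapshot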