Trying to Move VM to LINSTOR Storage: storage migration failed: Source and target image have different sizes

When I try to move a VM's disk from local LVM storage to LINSTOR storage, I get this error:

create full clone of drive scsi0 (lvmlocal01:vm-100-disk-1)

NOTICE
  Trying to create diskful resource (pm-83eb69d4) on (cln21-blabla).
drive mirror is starting for drive-scsi0
drive-scsi0: Cancelling block job
drive-scsi0: Done.
TASK ERROR: storage migration failed: block job (mirror) error: drive-scsi0: Source and target image have different sizes (io-status: ok)

Any hints as to what is wrong?

This is my server and LINSTOR configuration:

# lsblk
NAME                                  MAJ:MIN  RM  SIZE RO TYPE MOUNTPOINTS
sda                                     8:0     0  3,6T  0 disk 
├─sda1                                  8:1     0 93,1G  0 part
│ ├─SYS-root                          252:1     0 83,8G  0 lvm  /
│ └─SYS-swap                          252:2     0  9,3G  0 lvm  [SWAP]
├─sda2                                  8:2     0   99M  0 part
├─sda3                                  8:3     0  1,5T  0 part
│ └─lvmlocal-vm--501--disk--0         252:0     0  100G  0 lvm
└─sda4                                  8:4     0    2T  0 part
  ├─linstor_vg-thinpool_tmeta         252:3     0  108M  0 lvm
  │ └─linstor_vg-thinpool-tpool       252:5     0  1,6T  0 lvm
  │   ├─linstor_vg-thinpool           252:6     0  1,6T  1 lvm
  │   └─linstor_vg-pm--d0e23104_00000 252:8     0   11G  0 lvm
  │     └─drbd1000                    147:1000  0   11G  0 disk
  └─linstor_vg-thinpool_tdata         252:4     0  1,6T  0 lvm
    └─linstor_vg-thinpool-tpool       252:5     0  1,6T  0 lvm
      ├─linstor_vg-thinpool           252:6     0  1,6T  1 lvm
      └─linstor_vg-pm--d0e23104_00000 252:8     0   11G  0 lvm
        └─drbd1000                    147:1000  0   11G  0 disk

cat /etc/pve/storage.cfg

...
drbd: linstor01
    content images, rootdir
    controller 192.168.204.4
    resourcegroup pve-rg

linstor sp list

+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| StoragePool          | Node         | Driver   | PoolName            | FreeCapacity | TotalCapacity | CanSnapshots | State | SharedName                        |
|================================================================================================================================================================|
| DfltDisklessStorPool | cln21-blabla | DISKLESS |                     |              |               | False        | Ok    | cln21-blabla;DfltDisklessStorPool |
| DfltDisklessStorPool | cln22-blabla | DISKLESS |                     |              |               | False        | Ok    | cln22-blabla;DfltDisklessStorPool |
| pve-storage          | cln21-blabla | LVM_THIN | linstor_vg/thinpool |     1.64 TiB |      1.64 TiB | True         | Ok    | cln21-blabla;pve-storage          |
| pve-storage          | cln22-blabla | LVM_THIN | linstor_vg/thinpool |     1.64 TiB |      1.64 TiB | True         | Ok    | cln22-blabla;pve-storage          |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

linstor n l
+----------------------------------------------------------------+
| Node         | NodeType  | Addresses                  | State  |
|================================================================|
| cln21-blabla | SATELLITE | 192.168.204.3:3366 (PLAIN) | Online |
| cln22-blabla | SATELLITE | 192.168.204.4:3366 (PLAIN) | Online |
+----------------------------------------------------------------+

linstor rg list
╭────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter                ┊ VlmNrs ┊ Description ┊
╞════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp    ┊ PlaceCount: 2               ┊        ┊             ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ pve-rg        ┊ PlaceCount: 2               ┊        ┊             ┊
┊               ┊ StoragePool(s): pve-storage ┊        ┊             ┊
╰────────────────────────────────────────────────────────────────────╯

linstor resource list
╭────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞════════════════════════════════════════════════════════════════╡
╰────────────────────────────────────────────────────────────────╯

Here are the software versions used:

8.0.7        pve-cluster
5.1.12       pve-container
8.2.2        pve-docs
4.2023.08-4  pve-edk2-firmware
4.2023.08-4  pve-edk2-firmware-legacy
4.2023.08-4  pve-edk2-firmware-ovmf
0.7.1        pve-esxi-import-tools
5.0.7        pve-firewall
4.0.5        pve-ha-manager
8.2.0        pve-headers
3.2.2        pve-i18n
1.3.0        pve-lxc-syscalld
8.2.4        pve-manager
9.0.2-1      pve-qemu-kvm
5.3.0-3      pve-xtermjs
2.3.1        pve-zsync
1.23.0-1     linstor-client
1.29.0-1     linstor-common
1.29.0-1     linstor-controller
8.0.4-1      linstor-proxmox
1.29.0-1     linstor-satellite
1.23.0-1     python-linstor

The problem is a known bug. The message appears because the DRBD metadata is added to the guest storage, so the source and target images end up with different sizes. A workaround exists: using external metadata. LINBIT discourages this, however, because the linstor-proxmox plugin is designed for use with internal metadata only.
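To see the mismatch for yourself, you can compare the byte sizes of the source and target block devices. This is only a minimal sketch: `size_diff` and `device_size` are hypothetical helper names, and the device paths in the usage comment are assumptions derived from the lsblk output above, so adjust them to your setup.

```shell
#!/bin/sh
# size_diff SRC_BYTES DST_BYTES
# Prints how many bytes the target is larger than the source
# (negative if the target is smaller).
size_diff() {
    echo $(( $2 - $1 ))
}

# device_size DEV
# Prints the size of a block device in bytes (usually needs root).
device_size() {
    blockdev --getsize64 "$1"
}

# Usage on a real host (paths are assumptions, not verified):
#   src=$(device_size /dev/lvmlocal/vm-100-disk-1)
#   dst=$(device_size /dev/drbd1000)
#   size_diff "$src" "$dst"
# A non-zero result is the condition the mirror job complains about.
```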

A developer's comment on the bug:

tl;dr: the problem is now fully understood, but it will take time until this is fully resolved.

Look here for details:

The problem is fixed in the latest version of the linstor-proxmox plugin. To make online storage migration work, set “exactsize yes” as an option for the LINSTOR storage in /etc/pve/storage.cfg.

This is my /etc/pve/storage.cfg:

drbd: linstor01
        resourcegroup pve-rg
        content rootdir,images
        controller 192.168.204.4
        exactsize yes
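If you want to double-check that the option really ended up in the right section, something like the following could be used. It is a sketch only: `has_exactsize` is a hypothetical helper, and it assumes the usual storage.cfg layout in which a section header such as `drbd: linstor01` starts in column one and its options are indented.

```shell
#!/bin/sh
# has_exactsize CFG STORAGE_ID
# Exit status 0 if the "drbd: STORAGE_ID" section of CFG contains
# "exactsize yes", non-zero otherwise.
has_exactsize() {
    awk -v id="$2" '
        # An unindented line starts a new section.
        /^[^[:space:]]/ { in_sec = ($1 == "drbd:" && $2 == id) }
        # Inside the wanted section, look for the option.
        in_sec && $1 == "exactsize" && $2 == "yes" { found = 1 }
        END { exit !found }
    ' "$1"
}

# Usage:
#   has_exactsize /etc/pve/storage.cfg linstor01 && echo "exactsize is set"
```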

For details read:


I think it’s obvious that having to set an additional flag is not ideal, and I would love LINSTOR/DRBD even more if it were software that just works™. But so far: thanks for the fix!

I ran another online disk move test and it failed.

I then realized that an orphaned LINSTOR resource definition from a VM was still around. This loop showed me the properties of all resource definitions:

for rd in $(linstor -m resource-definition list | jq -r '.[][].name'); do
    linstor resource-definition list-properties "$rd"
done

The following was the orphaned resource-definition:

# linstor resource-definition list-properties pm-d0e23104

╭─────────────────────────────────────────────╮
┊ Key                                 ┊ Value ┊
╞═════════════════════════════════════════════╡
┊ Aux/pm/vmid                         ┊ 102   ┊
┊ DrbdOptions/ExactSize               ┊ false ┊
┊ DrbdOptions/Net/allow-two-primaries ┊ yes   ┊
┊ DrbdOptions/Resource/quorum         ┊ off   ┊
╰─────────────────────────────────────────────╯

This prevented a successful test of online storage moving from LVM to LINSTOR. The exactsize yes setting was overridden by this resource definition, and I got the error again.

I had to delete that resource definition. After that, live storage moving worked again.
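To spot such a leftover faster, the table output of list-properties can be reduced to just the ExactSize value. A small sketch, assuming the table layout shown above (key in the second whitespace-separated field, value in the fourth); `exactsize_of` is a hypothetical helper name.

```shell
#!/bin/sh
# exactsize_of
# Reads `linstor resource-definition list-properties` table output on
# stdin and prints the value of DrbdOptions/ExactSize, if the key exists.
exactsize_of() {
    awk '$2 == "DrbdOptions/ExactSize" { print $4 }'
}

# Usage on a LINSTOR node, reusing the loop from above:
#   for rd in $(linstor -m resource-definition list | jq -r '.[][].name'); do
#       printf '%s %s\n' "$rd" \
#           "$(linstor resource-definition list-properties "$rd" | exactsize_of)"
#   done
```

Resource definitions that print a value other than the one you expect, or none at all, are the ones worth a closer look.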

I’m sure someone else will struggle with this before it receives a proper fix. Thank you for noting these issues and workarounds.

I still have unresolved trouble migrating storage and VMs. There is one VM that I migrated back and forth, and it can no longer be moved between LVM and LINSTOR because of the different-sizes problem, even though exactsize is set to yes in the storage configuration and every resource definition has ExactSize set to yes.

I have been using a Proxmox cluster for testing lately and I am able to reproduce this.

First, if I create a new disk for a running VM backed by LINSTOR with the exactsize yes parameter specified in /etc/pve/storage.cfg, I can move this disk image back and forth between LVM ↔ LINSTOR repeatedly without issue while the VM is running. Everything works as expected here.

However, exactsize yes is only supposed to be set temporarily for migration…

On the flip side, creating a new disk image for a running VM backed by LINSTOR with exactsize no presents an issue. Trying to move a running VM’s disk from LINSTOR → LVM at this stage results in:

TASK ERROR: storage migration failed: block job (mirror) error: drive-virtio2: Source and target image have different sizes (io-status: ok)

So, when prepping for migration and toggling exactsize yes in /etc/pve/storage.cfg, and also ensuring that the current resource definition is adjusted (linstor rd set-property <res_id> DrbdOptions/ExactSize yes), migration still fails in Proxmox:

TASK ERROR: storage migration failed: block job (mirror) error: drive-virtio2: Source and target image have different sizes (io-status: ok)

The workaround is to detach the disk, re-attach and then attempt live migration once more. Power cycling the virtual machine also allows the migration to happen, and of course, offline migration works without issue.
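On the command line, the detach and re-attach workaround might look roughly like the following. This is a dry-run sketch that only prints the commands: `reattach_plan` is a hypothetical helper, the volume ID and target storage are placeholders you have to supply, and it assumes disk hotplug is enabled so that `qm set --delete` on a running VM detaches the disk and leaves it as an unused volume.

```shell
#!/bin/sh
# reattach_plan VMID SLOT VOLID TARGET_STORAGE
# Prints (does not run) the qm commands for: detach the disk,
# attach the same volume again, then retry the live disk move.
reattach_plan() {
    echo "qm set $1 --delete $2"
    echo "qm set $1 --$2 $3"
    echo "qm disk move $1 $2 $4"
}

# Usage (review the output before running anything by hand):
#   reattach_plan 100 virtio2 '<volid>' '<target-storage>'
```

Printing instead of executing is deliberate here, since detaching a disk from a running guest is disruptive if the guest is still using it.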

I’m guessing the background DRBD adjust command either needs to do some extra operations for this to work, or DRBD has a technical reason it needs to be demoted to Secondary (inactive) when DrbdOptions/ExactSize is toggled from no to yes.

Thanks for powering through this one, I’ll comment on the GitHub issue with the findings above.
