Resource UpToDate but the size is different

Hi,
I have just set up a Proxmox cluster with 3 nodes, using LINSTOR as the storage backend with 3 replicas, and created some VMs on LINSTOR storage. I also installed the LINSTOR GUI. However, when I look at a resource in the LINSTOR GUI, the sizes are different. Please see below.

Is it OK to have this condition? Will it be a problem in the future?

When using thinly provisioned storage, the exact amount of consumed storage for replicated volumes may differ slightly between nodes; this is normal.
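
If you want to double-check the per-node allocation outside of the GUI, the LINSTOR client can show it as well; a quick sketch (run on the controller node):

# lists each volume once per node, including its allocated size;
# on thin-provisioned storage the allocated size of replicas usually differs slightly
linstor volume list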

Hi,
Thanks for the clarification.

Hi,

I was monitoring the system and found this DRBD status. The size difference seems quite large.

Is this normal?

What should I do? Should I delete the resource and re-sync it?

If you want extra assurance your data is not corrupted or out of sync, you can always perform a quick validation on the resource:

  1. The resource cannot be Primary (or InUse), so make sure the VM has been stopped.
  2. On each node (pve21, pve22, pve23) perform a sha256sum /dev/drbd1019. Do this in order, one at a time. Each host should show as Primary when calculating the sum.
  3. Verify each host has matching hash values for the /dev/drbd1019 volume.

Assuming the hash values match, there isn’t anything you need to do, but you can delete the volumes containing the larger data allocations and re-add them. This should have the added effect of consuming only ~10 MiB on each node, because LINSTOR only synchronizes the allocated data when using thinly provisioned volumes.
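
Roughly, that would look something like this from the LINSTOR command line (the node, resource, and storage pool names below are placeholders; adjust them for your cluster):

# remove the replica that holds the larger allocation ...
linstor resource delete <node> <resource-name>
# ... and re-create it; DRBD only resyncs the blocks that are actually
# allocated on the thin volume, so the new replica stays small
linstor resource create <node> <resource-name> --storage-pool <pool-name>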

With that said, LINSTOR cannot control the “drift” that is possible with thinly provisioned storage across multiple nodes in the cluster.

Hope this helps.

Hi Ryan,

Thanks for your help. Below are the results:

root@pve21:~# sha256sum /dev/drbd1019

4d8c267d5eb283fbc70148c520936e27db8941e66eb7b2d5646a7fa5ebd00bde  /dev/drbd1019

root@pve22:~# sha256sum /dev/drbd1019
2a2a7626443160a94e0d0a4e666c055b4b58e41162624197175e00ee5f1b3de0  /dev/drbd1019


root@pve23:~# sha256sum /dev/drbd1019
4d8c267d5eb283fbc70148c520936e27db8941e66eb7b2d5646a7fa5ebd00bde  /dev/drbd1019

Additional information: the resource on pve22 was previously diskless. It was created automatically when I ticked “Diskless on remaining” on the resource group.

unrelated to the rest, but: don’t do that. the proxmox plugin creates (and deletes) them on the fly as required. “diskless on remaining” is only there if there is no other way to create diskless assignments on the fly, but in this case there is.

other than that: I would have used drbdadm verify
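
roughly, something like this (resource name is a placeholder here):

# start an online verify against all connected peers
drbdadm verify <resource>
# watch the progress
drbdadm status <resource>
# blocks that differ between the peers are logged by the kernel
dmesg | grep -i "out of sync"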

Hi Rck,

Thanks for the info. I ran drbdadm verify and this is the result:

pm-08172708 role:Secondary
  disk:UpToDate open:no
  pve21 role:Secondary
    replication:VerifyS peer-disk:UpToDate done:33.47
  pve23 role:Secondary
    replication:VerifyS peer-disk:UpToDate done:32.82

Given that the replica on pve22 also shows almost zero usage, it does seem like that replica is broken. It would be interesting to know exactly how you got it into that state.

I suggest removing and re-adding the resource on pve22, and checking the responses carefully for any errors while you do that.

Hi Candlerb,

Here is what I did:

I removed the resource (from the LINSTOR GUI) on pve22 so it became diskless. Below is the status:

I edited the resource group and ticked “Diskless on remaining”. Actually, I had no idea what it is for; I was just curious and ticked it :smiley:

I clicked Submit on the modified resource group. Below is the result:

Another test was to remove the disk and re-add it. The results were the same. Please see the gif file below.

drbd-remove-add-resource

As you’ve already been told, please don’t tick this. You don’t want it. Proxmox will add diskless resources where required.

Once you’ve re-added the disk, you can repeat the sha256sum exercise given before. If pve22’s replica is still different, then something is bad. You might want to start by describing the exact versions of Linstor, drbd9 kernel module, and underlying OS that you’re using.

Hi, I am just answering your question about how I got into this situation :smiley:. So let me explain how I got there. Below is my Proxmox cluster:

root@pve22:~# dpkg -l | grep -i drbd | awk '/^ii/ {print $1, $2, $3}' | column -t
ii  drbd-dkms           9.2.14-1
ii  drbd-reactor        1.9.0-1
ii  drbd-utils          9.32.0-1
ii  linstor-common      1.31.3-1
ii  linstor-controller  1.31.3-1
ii  linstor-proxmox     8.1.3-1
ii  linstor-satellite   1.31.3-1

root@pve22:~# pveversion 
pve-manager/9.0.5/9c5600b249dbfd2f (running kernel: 6.14.8-2-pve)

root@pve22:~# cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 13 (trixie)"
NAME="Debian GNU/Linux"
VERSION_ID="13"
VERSION="13 (trixie)"
VERSION_CODENAME=trixie
DEBIAN_VERSION_FULL=13.0
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
root@pve22:~# 

What does the sha256sum exercise show now?

If pve22 is still different… I am wondering whether there’s some chance that when the logical volume is being recreated, it’s reusing the same extents without zeroing them - i.e. there is valid metadata, and so drbd thinks the sync is already complete.

In that case, it might be best to force a full resync of the volume.

I think you would do that by logging into pve22 and doing drbdsetup invalidate XXXX (where XXXX is the minor number, matching /dev/drbdXXXX) but I’m not 100% sure.
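
For reference, the per-resource drbdadm form should also work; a rough sketch (resource name is a placeholder, and run it only on the node whose copy should be discarded):

# marks the local data as Inconsistent and triggers a full resync from an UpToDate peer
drbdadm invalidate <resource>
# watch the resync progress
drbdadm status <resource>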

Are you also changing the place count from 2 (the LINSTOR default) to 3?

Just trying to fully understand how this is happening, or what is triggering a resource to think it is UpToDate.

Hi,
Thanks for your help. The resource matches now on pve21, pve22, and pve23.


Hi,
Yes, I am. Sometimes I change it from 3 to 2 when there are not enough nodes (i.e. when one of the satellite nodes is down).

I was unable to recreate this in my test cluster using the same steps you mentioned using in the LINSTOR GUI. Did you see this happen more than once, or was it something you could reproduce?

Currently just once. However, I found 2 resources that did not match, and then used drbdadm invalidate to make them sync again.