I have a Proxmox production cluster with DRBD/LINSTOR that has been working flawlessly for the past year. Suddenly I can no longer create new VMs or add disks to existing VMs.
When I try to add a new disk to a VM, it fails with:
update VM 121: -scsi1 linstor_storage:40,iothread=on
NOTICE
Trying to create diskful resource (pm-ca42ac92) on (wirt23a).
Diskfull assignment on wirt23a failed, let's autoplace it.
TASK ERROR: API Return-Code: 500. Message:
Could not autoplace resource pm-ca42ac92, because:
read timeout at /usr/share/perl5/Net/HTTP/Methods.pm line 274.
at /usr/share/perl5/PVE/Storage/Custom/LINSTORPlugin.pm line 433.
[...]
I searched this forum and found the thread "Cannot create new VM - fresh install". My resource groups look like this:
wirt23a:~# linstor rg l
╭────────────────────────────────────────────────────────────────────╮
┊ ResourceGroup ┊ SelectFilter ┊ VlmNrs ┊ Description ┊
╞════════════════════════════════════════════════════════════════════╡
┊ DfltRscGrp ┊ PlaceCount: 2 ┊ ┊ ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ pve-rg ┊ PlaceCount: 2 ┊ 0 ┊ ┊
┊ ┊ StoragePool(s): pve-storage ┊ ┊ ┊
╰────────────────────────────────────────────────────────────────────╯
Spawning a new resource from the resource group fails as well:
wirt23a:~# linstor rg spawn pve-rg testres_0 1GiB
Error: Socket timeout, no data received for more than 300s.
After the 5-minute timeout, I see the resource in the volume list:
wirt23a:~# linstor v l -r testres_0
╭─────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ wirt23a ┊ testres_0 ┊ pve-storage ┊ 0 ┊ 1059 ┊ None ┊ ┊ ┊ Unknown ┊
┊ wirt23b ┊ testres_0 ┊ pve-storage ┊ 0 ┊ 1059 ┊ /dev/drbd1059 ┊ 315 KiB ┊ Unused ┊ UpToDate ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────╯
On wirt23b I see:
wirt23b:~# linstor r l -r testres_0
╭───────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════╡
┊ testres_0 ┊ wirt23a ┊ 7059 ┊ ┊ ┊ Unknown ┊ ┊
┊ testres_0 ┊ wirt23b ┊ 7059 ┊ Unused ┊ Connecting(wirt23a) ┊ UpToDate ┊ 2024-11-29 14:06:21 ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────╯
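To pick out the affected replicas without reading the whole table, the listing above can be filtered for rows whose State is Unknown. This is only a minimal sketch: it assumes the column order shown above (ResourceName, Node, Port, Usage, Conns, State, CreatedOn), and unknown_rows is a hypothetical helper name, not part of LINSTOR.

```shell
# Hypothetical helper: reads `linstor r l` table output on stdin and prints
# "node resource" for every row whose State column is Unknown.
# Assumes the column order ResourceName, Node, Port, Usage, Conns, State, CreatedOn.
unknown_rows() {
  awk -F '┊' '
    NF >= 8 {
      # Trim the padding around each cell before comparing.
      for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
      # Field 1 is the empty text before the first separator,
      # so ResourceName is $2, Node is $3 and State is $7.
      if ($7 == "Unknown") print $3, $2
    }'
}
```

It would be used as `linstor r l -r testres_0 | unknown_rows`. If the table layout differs between LINSTOR client versions, the field numbers need adjusting.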
drbdadm status shows the resource as Connecting on wirt23b, but on wirt23a it does not see the resource at all.
Does anyone have any hints on how to solve this issue?
Thanks,
Philipp
PS: The nodes seem happy:
wirt23b:~# linstor n l
╭────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞════════════════════════════════════════════════════════╡
┊ wirt23a ┊ COMBINED ┊ 172.16.31.1:3366 (PLAIN) ┊ Online ┊
┊ wirt23b ┊ COMBINED ┊ 172.16.31.2:3366 (PLAIN) ┊ Online ┊
╰────────────────────────────────────────────────────────╯