Proxmox: no DRBD storage while LINBIT controller is down

Hmm, maybe I should learn more about kernels and kernel modules, but I am pretty sure that this has given me version 9 before (all on PVE-1):

root@pve-1:~# apt install proxmox-default-headers drbd-dkms drbd-utils
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
drbd-dkms is already the newest version (9.2.12-2).
drbd-utils is already the newest version (9.29.0-1).
<snip>
...
<snap>
root@pve-1:~# modprobe drbd

root@pve-1:~# cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: 211FB288A383ED945B83420

root@pve-1:~# dkms status
drbd/9.2.12-2, 6.8.12-4-pve, x86_64: installed

Any explanation of what’s happening? As I said: on this very machine, DRBD 9 was running before the updates and is, according to apt, still installed. But it doesn’t get loaded anymore.

But even better than an explanation of what might have happened would be a step-by-step procedure for fixing a situation like this.
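One possible cause, not confirmed anywhere in this thread, is that the node booted into a newer kernel than the one DKMS built DRBD 9 for, so modprobe falls back to the in-tree 8.4 module. A minimal sketch for checking that, assuming the `dkms status` and `/proc/drbd` formats shown above; the function names `drbd_loaded_major` and `dkms_has_drbd_for` are made up for illustration:

```shell
#!/usr/bin/env bash

# Print the major version of the loaded DRBD module.
# Reads a /proc/drbd-style file whose first line looks like:
#   version: 8.4.11 (api:1/proto:86-101)
drbd_loaded_major() {
    local procfile="${1:-/proc/drbd}"
    sed -n 's/^version: \([0-9][0-9]*\)\..*/\1/p' "$procfile" | head -n 1
}

# Succeed if the `dkms status` listing on stdin contains an installed
# drbd build for the kernel release given as $1.
# Assumes lines shaped like: drbd/9.2.12-2, 6.8.12-4-pve, x86_64: installed
dkms_has_drbd_for() {
    local kernel="$1"
    grep -E '^drbd/' | grep -F ", ${kernel}, " | grep -q 'installed'
}

# Possible use on an affected node (not executed here):
#   uname -r                                    # running kernel
#   modinfo -F filename drbd                    # which drbd.ko modprobe would pick
#   dkms status | dkms_has_drbd_for "$(uname -r)" \
#       || echo "no DKMS build of drbd for the running kernel"
#   [ "$(drbd_loaded_major)" = "9" ] || echo "loaded module is not DRBD 9"
```

If the running kernel turns out to have no DKMS build, installing the matching headers and rebuilding the module for that kernel would be the obvious next thing to try, but that is a guess based on the output above rather than a verified fix.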

It seems that the ppa now has updated packages as well. So…

systemctl disable --now linstor-satellite # on both PVE nodes
apt dist-upgrade # on the controller (still running in the old VM)
systemctl enable linstor-controller --now # on the controller
apt dist-upgrade # on the PVE nodes
systemctl enable --now linstor-satellite # on the PVE nodes

After this:

# on the controller
root@linstor-controller:~# apt list linst* --installed
Listing... Done
linstor-client/unknown,now 1.24.0-1 all [installed]
linstor-common/unknown,now 1.30.2-1 all [installed]
linstor-controller/unknown,now 1.30.2-1 all [installed]
# on the PVE nodes
root@pve-2:~# apt list linst* --installed
Listing... Done
linstor-client/unknown,now 1.24.0-1 all [installed]
linstor-common/unknown,now 1.30.2-1 all [installed]
linstor-controller/unknown,now 1.30.2-1 all [installed]
linstor-proxmox/unknown,now 8.0.4-1 all [installed]
linstor-satellite/unknown,now 1.30.2-1 all [installed]

Now I was able to delete the raspi using linstor node delete and linstor node lost.
Status at this time:

root@linstor-controller:~# linstor node list -p
+-------------------------------------------------------------------------------+
| Node    | NodeType  | Addresses                   | State                     |
|===============================================================================|
| pve-1   | SATELLITE | 192.168.113.21:3366 (PLAIN) | Online                    |
| pve-2   | SATELLITE | 192.168.113.22:3366 (PLAIN) | Online                    |
+-------------------------------------------------------------------------------+

I wiped and reinstalled Ubuntu on the raspi. Then:

add-apt-repository ppa:linbit/linbit-drbd9-stack && sudo apt update
apt install linux-headers-6.8.0-1017-raspi # the raspi is NOT a PVE node, so didn't install Proxmox headers
apt install drbd-dkms drbd-utils linstor-satellite
modprobe drbd

root@raspi-1:/home/a-tupti# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ 28e2ab938fe5e99fdcb27c0c393a9f2a3fb8fdee\ build\ by\ buildd@bos03-arm64-100\,\ 2024-10-29\ 09:17:58
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x09020c
DRBD_KERNEL_VERSION=9.2.12
DRBDADM_VERSION_CODE=0x091d00
DRBDADM_VERSION=9.29.0

root@raspi-1:/home/a-tupti# cat /proc/drbd 
version: 9.2.12 (api:2/proto:118-122)
GIT-hash: 2da6f528dc4ab3fd25c511f7b03531100e54ab08 build by root@raspi-1, 2024-12-22 14:21:05
Transports (api:21):

root@raspi-1:/home/a-tupti# apt list linst* --installed
Listing... Done
linstor-common/noble,now 1.30.1-1ppa1~noble1 all [installed,automatic]
linstor-satellite/noble,now 1.30.1-1ppa1~noble1 all [installed]

Back on the controller

linstor node create raspi-1 192.168.111.20

root@linstor-controller:~# linstor node list -p
+-------------------------------------------------------------------------------+
| Node    | NodeType  | Addresses                   | State                     |
|===============================================================================|
| pve-1   | SATELLITE | 192.168.113.21:3366 (PLAIN) | Online                    |
| pve-2   | SATELLITE | 192.168.113.22:3366 (PLAIN) | Online                    |
| raspi-1 | SATELLITE | 192.168.111.20:3366 (PLAIN) | OFFLINE(VERSION MISMATCH) |
+-------------------------------------------------------------------------------+

Really? A mismatch in the third component of the version (controller 1.30.2 vs. satellite 1.30.1)? There are no exactly matching 1.30.x packages available between the official repo and the PPA!
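To catch this before adding a node, the package versions can be compared up front. A rough sketch, assuming (as the output above suggests) that controller and satellite have to agree on the full upstream version including the patch level; `upstream_version` and `versions_match` are hypothetical helper names, and Debian epochs such as `1:` are not handled:

```shell
#!/usr/bin/env bash

# Strip the Debian/PPA revision from a package version string,
# e.g. "1.30.1-1ppa1~noble1" becomes "1.30.1".
upstream_version() {
    printf '%s\n' "${1%%-*}"
}

# Succeed only if both package versions share the same upstream version.
versions_match() {
    [ "$(upstream_version "$1")" = "$(upstream_version "$2")" ]
}

# Possible use (not executed here), with the version strings read
# on each host via dpkg-query:
#   ctrl=$(dpkg-query -W -f='${Version}' linstor-controller)   # on the controller
#   sat=$(dpkg-query -W -f='${Version}' linstor-satellite)     # on the satellite
#   versions_match "$ctrl" "$sat" || echo "mismatch: $ctrl vs $sat"
```

With the versions from this thread, `versions_match "1.30.2-1" "1.30.1-1ppa1~noble1"` fails, which lines up with the OFFLINE(VERSION MISMATCH) state reported for raspi-1.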

That is very annoying. I got hit by that as well. Now I have no way back, because the PPA doesn’t seem to keep prior versions. So there is no tiebreaker until this is resolved. At least not one running on a Raspberry Pi.