Proxmox: no DRBD storage while LINBIT controller is down

Hmm, maybe I should learn more about kernels and kernel modules, but I am pretty sure that this has given me version 9 before (all on pve-1):

root@pve-1:~# apt install proxmox-default-headers drbd-dkms drbd-utils
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
drbd-dkms is already the newest version (9.2.12-2).
drbd-utils is already the newest version (9.29.0-1).
<snip>
...
<snap>
root@pve-1:~# modprobe drbd

root@pve-1:~# cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: 211FB288A383ED945B83420

root@pve-1:~# dkms status
drbd/9.2.12-2, 6.8.12-4-pve, x86_64: installed

Any explanation of what’s happening? As I said: on this very machine, DRBD9 was running before the updates and is, according to apt, still installed. But it doesn’t get loaded anymore.

But even better than an explanation of what might have happened would be a step-by-step procedure for fixing a situation like this.
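
A rough diagnostic sketch for this situation, not a confirmed fix. One plausible cause is that the running kernel is not the one DKMS built the 9.x module for, so modprobe falls back to the in-tree 8.4 module. The helper names dkms_has_drbd_for and drbd_major are made up for illustration, and the parsing assumes the dkms status format shown above ("drbd/9.2.12-2, 6.8.12-4-pve, x86_64: installed").

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the "dkms status" text on stdin lists
# an installed drbd build for the kernel release given as $1.
dkms_has_drbd_for() {
    awk -F', ' -v k="$1" \
        '$1 ~ /^drbd\// && $2 == k && /installed/ { found = 1 } END { exit !found }'
}

# Hypothetical helper: prints the major version from /proc/drbd-style text on stdin.
drbd_major() {
    sed -n 's/^version: \([0-9][0-9]*\)\..*/\1/p'
}

if command -v dkms >/dev/null 2>&1; then
    running="$(uname -r)"
    if dkms status | dkms_has_drbd_for "$running"; then
        echo "DKMS drbd build found for running kernel $running"
    else
        echo "No DKMS drbd build for running kernel $running"
        echo "Check that matching headers are installed, then try: dkms autoinstall"
    fi
fi

if [ -r /proc/drbd ]; then
    echo "Loaded DRBD major version: $(drbd_major < /proc/drbd)"
    # Shows which module file modprobe would pick (in-tree vs. DKMS build).
    modinfo -F filename drbd || true
fi
```

If the DKMS build is missing for the running kernel, installing the matching headers and rebuilding, then unloading and reloading the module (rmmod drbd; modprobe drbd), is the usual next thing to try.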

It seems that the ppa now has updated packages as well. So…

systemctl disable --now linstor-satellite # on both PVE nodes
apt dist-upgrade # on the controller (still running in the old VM)
systemctl enable linstor-controller --now # on the controller
apt dist-upgrade # on the PVE nodes
systemctl enable --now linstor-satellite # on the PVE nodes

After this:

# on the controller
root@linstor-controller:~# apt list linst* --installed
Listing... Done
linstor-client/unknown,now 1.24.0-1 all [installed]
linstor-common/unknown,now 1.30.2-1 all [installed]
linstor-controller/unknown,now 1.30.2-1 all [installed]
# on the PVE nodes
root@pve-2:~# apt list linst* --installed
Listing... Done
linstor-client/unknown,now 1.24.0-1 all [installed]
linstor-common/unknown,now 1.30.2-1 all [installed]
linstor-controller/unknown,now 1.30.2-1 all [installed]
linstor-proxmox/unknown,now 8.0.4-1 all [installed]
linstor-satellite/unknown,now 1.30.2-1 all [installed]

Now I was able to delete the raspi using linstor node delete and linstor node lost.
Status at this time:

root@linstor-controller:~# linstor node list -p
+-------------------------------------------------------------------------------+
| Node    | NodeType  | Addresses                   | State                     |
|===============================================================================|
| pve-1   | SATELLITE | 192.168.113.21:3366 (PLAIN) | Online                    |
| pve-2   | SATELLITE | 192.168.113.22:3366 (PLAIN) | Online                    |
+-------------------------------------------------------------------------------+

I wiped and reinstalled Ubuntu on the raspi. Then:

add-apt-repository ppa:linbit/linbit-drbd9-stack && sudo apt update
apt install linux-headers-6.8.0-1017-raspi # the raspi is NOT a PVE node, so I didn't install Proxmox headers
apt install drbd-dkms drbd-utils linstor-satellite
modprobe drbd

root@raspi-1:/home/a-tupti# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ 28e2ab938fe5e99fdcb27c0c393a9f2a3fb8fdee\ build\ by\ buildd@bos03-arm64-100\,\ 2024-10-29\ 09:17:58
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x09020c
DRBD_KERNEL_VERSION=9.2.12
DRBDADM_VERSION_CODE=0x091d00
DRBDADM_VERSION=9.29.0

root@raspi-1:/home/a-tupti# cat /proc/drbd 
version: 9.2.12 (api:2/proto:118-122)
GIT-hash: 2da6f528dc4ab3fd25c511f7b03531100e54ab08 build by root@raspi-1, 2024-12-22 14:21:05
Transports (api:21):

root@raspi-1:/home/a-tupti# apt list linst* --installed
Listing... Done
linstor-common/noble,now 1.30.1-1ppa1~noble1 all [installed,automatic]
linstor-satellite/noble,now 1.30.1-1ppa1~noble1 all [installed]

Back on the controller

linstor node create raspi-1 192.168.111.20

root@linstor-controller:~# linstor node list -p
+-------------------------------------------------------------------------------+
| Node    | NodeType  | Addresses                   | State                     |
|===============================================================================|
| pve-1   | SATELLITE | 192.168.113.21:3366 (PLAIN) | Online                    |
| pve-2   | SATELLITE | 192.168.113.22:3366 (PLAIN) | Online                    |
| raspi-1 | SATELLITE | 192.168.111.20:3366 (PLAIN) | OFFLINE(VERSION MISMATCH) |
+-------------------------------------------------------------------------------+

Really? A mismatch in the third place of the version? There are no exactly matching 1.30.x packages available in both the official repo and the ppa!
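
A small sketch for spotting nodes like this quickly by filtering the pastable table. The helper name nodes_not_online is made up for illustration, and the parsing assumes the column layout of linstor node list -p exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads "linstor node list -p" output on stdin and
# prints "name: state" for every node whose State column is not "Online".
nodes_not_online() {
    awk -F'|' 'NF >= 5 && $3 !~ /NodeType/ {
        name = $2; state = $5
        gsub(/^ +| +$/, "", name)
        gsub(/^ +| +$/, "", state)
        if (state != "Online") print name ": " state
    }'
}

# Only query LINSTOR if the client is installed on this machine.
if command -v linstor >/dev/null 2>&1; then
    linstor node list -p | nodes_not_online
fi
```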

That is very annoying. I got hit by that as well. Now I have no way back, because the ppa doesn’t seem to keep prior versions. So there is no tiebreaker until this is resolved, at least not on a Raspberry Pi.

Even more annoying is the lack of caring by LINBIT.

The version mismatch between the PPA and the public Proxmox repos is annoying, and was an oversight on LINBIT’s part. Our devs and tests didn’t consider users mixing where they’re getting packages from.

We’re resolving that now. The build pipeline for the PPA is running and LINSTOR 1.30.2 should be uploaded in a few hours.

While I hear you, comments like this are a little rude. The forums, the PPA, and the public Proxmox VE repositories are provided and supported completely for free. All of the software found in the PPA and LINBIT’s Proxmox plugin repos is open source. The majority of LINBIT was in customer response mode for the holidays, including myself; I just returned to the office from holiday with my family. We do care, but we are human, and do make mistakes/oversights.

@proxmeup I’m using several open-source products, and in 70% of the cases, there is a complete lack of interest on the side of the developers in interacting with the community. I’m talking about issues being closed before they are even looked at, just by some robot, or issues, even actual confirmed bugs, remaining open for years. This is how it often works in open source: unless the developer/community is personally bothered by a bug or needs a feature, not much will happen.

Heck, I’m using certain proprietary products, software and hardware, some of them even with paid maintenance. While they may offer support, their first level is sometimes so terrible that you’ll still go for weeks without sensible feedback.

As such, I’m really thankful that Linbit has not only open-sourced DRBD9/Linstor but even provides free support.

I understand that you probably wrote that in a moment of frustration. Still, please consider there is a paid option available if you require support. Otherwise, let’s all show a bit more appreciation for companies providing support for open source software and keep our helpers such as Matt and Rick motivated to keep coming back to the forum! :+1::blush:

With that in mind, thanks to you and @mtisza, too, for sharing your experience! It was/is an interesting read.

Hello,

sorry, but I’ve been sick for 2 days and didn’t feel like writing.

Anyway, this has now become a mixed thread: code and dealing with each other. For all readers following along: I will deal with code first, 'cause that’s what most likely has brought you here, but not without my sincere apology for my words if they sounded rude.

The problem has been fixed. As @kermat said, the build pipeline was activated and the versions are matching again.

Maybe someone is looking for a summary, so here it is:

I aimed for a 2-node PVE cluster with a Raspberry Pi as the PVE quorum and diskless DRBD tie-breaker. Strictly home use, but this setup tries to achieve higher availability. If you’re using the LINBIT-recommended way with LINSTOR, DRBD9 and Proxmox, that means that you need an HA linstor-controller as well. As @kermat recommends: the controller should be running on the PVE nodes, not in a VM.

On the PVE nodes you should install the software as described by LINBIT: How to Setup LINSTOR on Proxmox VE

apt -y install pve-headers-$(uname -r) proxmox-default-headers drbd-dkms drbd-utils linstor-common linstor-client linstor-controller python-linstor linstor-satellite

Just follow the LINBIT docs and everything will be alright.

On the raspi, you need to know that you can’t get the required software packages from the official LINBIT repos if you’re not a (paying) customer. Fortunately, some nice LINBIT people are running a ppa with the packages. But getting software from a ppa also means that you can’t run Raspbian OS on your raspi; you have to use Ubuntu instead. (Many thanks to @mtisza.)

For the tie-breaker role the raspi needs:
add-apt-repository ppa:linbit/linbit-drbd9-stack && sudo apt update
apt install linux-headers-6.8.0-1017-raspi # make sure to install the ones matching your current kernel!
apt install drbd-dkms drbd-utils linstor-common linstor-satellite
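
A small sketch of deriving the header package name from the running kernel instead of typing the version by hand. The helper name headers_pkg is made up for illustration, and the linux-headers-<release> naming is an assumption that fits the Ubuntu raspi kernel used here; the PVE nodes use different header packages.

```shell
#!/bin/sh
# Hypothetical helper: prints the header package name for the kernel release
# given as $1, or for the running kernel if no argument is passed.
headers_pkg() {
    printf 'linux-headers-%s\n' "${1:-$(uname -r)}"
}

echo "would install: $(headers_pkg)"
# To actually install (as root):
# apt install "$(headers_pkg)"
```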

AND: make sure that the version numbers of the linstor-* packages in the ppa match the ones in the official LINBIT repo to the last digit. If they don’t match, just wait a bit until LINBIT updates the ppa. The versions in the official repo lead; the ppa has to follow them.
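
A minimal sketch of such a check on the raspi before starting the satellite. The helper names upstream_version and same_linstor_version are made up for illustration, the controller version is passed in by hand as the first argument, and the comparison assumes that only the part before the first "-" (for example 1.30.2) has to match; versions with an epoch prefix are not handled.

```shell
#!/bin/sh
# Hypothetical helper: strips the Debian revision / ppa suffix,
# e.g. "1.30.1-1ppa1~noble1" becomes "1.30.1".
upstream_version() {
    printf '%s\n' "${1%%-*}"
}

# Hypothetical helper: succeeds if both package versions share the same upstream version.
same_linstor_version() {
    [ "$(upstream_version "$1")" = "$(upstream_version "$2")" ]
}

# Usage: pass the controller's package version as the first argument, e.g. 1.30.2-1
controller_version="${1:-}"
if [ -n "$controller_version" ] && command -v dpkg-query >/dev/null 2>&1; then
    local_version="$(dpkg-query -W -f='${Version}' linstor-satellite 2>/dev/null)"
    if same_linstor_version "$controller_version" "$local_version"; then
        echo "OK: satellite $local_version matches controller $controller_version"
    else
        echo "MISMATCH: satellite '$local_version' vs controller '$controller_version'"
    fi
fi
```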

Again, you can follow the LINBIT docs for configuring LINSTOR. Make only one PVE node the LINSTOR controller for now. Make both PVE nodes and the raspi LINSTOR satellite nodes, and create DRBD resources on the PVE nodes. Maybe set

linstor controller set-property DrbdOptions/AutoEvictAllowEviction false

at controller level, or use linstor node set-property <node> DrbdOptions/AutoEvictAllowEviction false at node level.

At this point you’ll have a working DRBD cluster on Proxmox. With the linstor-proxmox package installed, disk resources for LXC containers and VMs can be created from the Proxmox UI.

What you don’t have at this point is the LINSTOR controller HA, which I haven’t done yet.
Again, I’d like to refer to @mtisza and his valuable contribution: you may want to keep the DRBD/LINSTOR related packages under tighter control and apt-mark hold them.
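
A sketch of what that hold could look like. The package list is taken from the raspi install command in this thread and needs adjusting per node; the PKGS and APPLY variables and the hold_packages helper are made up for illustration. By default the script only prints the command.

```shell
#!/bin/sh
# Packages to pin; adjust per node (the PVE nodes also have
# linstor-controller, linstor-client and linstor-proxmox installed).
PKGS="drbd-dkms drbd-utils linstor-common linstor-satellite"

# Hypothetical helper: dry run by default, set APPLY=1 to really hold the packages.
hold_packages() {
    if [ "${APPLY:-0}" = "1" ]; then
        # Intentionally unquoted so each package becomes its own argument.
        apt-mark hold $PKGS
    else
        echo "apt-mark hold $PKGS"
    fi
}

hold_packages
```

apt-mark showhold lists what is currently held, and apt-mark unhold releases the packages again once matching versions are available everywhere.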

Now for the other part.
Family resilience is the target ;-). HA is the way to get closer.
Energy costs are a leading factor; for this reason, (continuously running) shared storage should be avoided.
Searching for synchronous replication of local disks across different systems brought me to DRBD. Ceph isn’t an option for a 2-PVE-RasPi-HomeUse-Setup. DRBD has some reputation, a lot of it from being integrated into the Linux kernel.

Luckily, I haven’t experienced open source support the way @vik-t has. I really appreciate the passion and devotion of open source developers. When choosing open source software I look at how many people are contributing and how lively the community is, and I factor in what I see as “reputation”. This might help to choose software with good “community interaction”.
The Proxmox forums have been referring to LINBIT since the change to DRBD9. I guess you would agree that the LINBIT forum is not the hotspot of the internet.
But I saw reputation, and the contributors are a company. That convinced me. And the company sells support to its customers, so these customers wouldn’t appear in the forum.

I’ve gone up and down every LINBIT doc that seems to relate to my setup. I asked “How to config the raspi?” and, now that I know, I did a Google search for “linbit ppa”. Although the raspi has been talked about a lot, it took a community member, @mtisza, to disclose the needed information. Hint for LINBIT: although the raspi is mentioned in your docs, the ppa seems to be missing or hidden too deep.

By coincidence, I then hit the build version mismatch. I guess at that moment, acting in short time frames during the holiday season, my thoughts were: what the…it’s supposed to be an automated build pipeline…how can there be a mismatch…in no way should a mismatch in the third place lead to such a problem…
Well, I guess the build pipelines are not running in the same cycle as the ones for the commercial repo, which I absolutely understand from a technical perspective.

Anyway: again my apology.
