Proxmox: no DRBD storage while LINBIT controller is down

Hmm, maybe I should learn more about kernels and kernel modules, but I am pretty sure that this has given me version 9 before (all on pve-1):

root@pve-1:~# apt install proxmox-default-headers drbd-dkms drbd-utils
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
drbd-dkms is already the newest version (9.2.12-2).
drbd-utils is already the newest version (9.29.0-1).
<snip>
...
<snap>
root@pve-1:~# modprobe drbd

root@pve-1:~# cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: 211FB288A383ED945B83420

root@pve-1:~# dkms status
drbd/9.2.12-2, 6.8.12-4-pve, x86_64: installed

Any explanation of what’s happening? As I said: on this very machine, DRBD9 was running before the updates and is, according to apt, still installed. But it doesn’t get loaded anymore.

But even better than an explanation of what might have happened would be a step-by-step procedure for fixing a situation like this.
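One way to narrow this down is to compare what is loaded with what DKMS actually built. The sketch below is not a confirmed diagnosis for this machine: it assumes the common case where a kernel update leaves no DKMS build for the running kernel, so that `modprobe drbd` falls back to the in-tree 8.4 module. The helper names `drbd_major` and `dkms_built_for` are made up for illustration, and the parsing relies only on the output formats shown above.

```shell
#!/bin/sh
# Hypothetical helpers; input formats are taken from the outputs in this post.

# Print the major version from a /proc/drbd "version:" line read on stdin.
drbd_major() {
    sed -n 's/^version: \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Succeed if `dkms status` output (stdin) lists a drbd build for kernel $1.
dkms_built_for() {
    awk -F', ' -v k="$1" '$1 ~ /^drbd\// && $2 == k { found = 1 } END { exit !found }'
}

# On a live node (not executed here):
#   drbd_major < /proc/drbd
#   dkms status | dkms_built_for "$(uname -r)" \
#       || echo "no DKMS drbd build for running kernel $(uname -r)"
#   modinfo -F filename drbd   # shows which drbd.ko modprobe would pick
```

If the running kernel from `uname -r` is newer than the one in the `dkms status` line, the missing headers for that kernel are the first thing to check.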

It seems that the ppa now has updated packages as well. So…

systemctl disable --now linstor-satellite # on both PVE nodes
apt dist-upgrade # on the controller (still running in the old VM)
systemctl enable linstor-controller --now # on the controller
apt dist-upgrade # on the PVE nodes
systemctl enable --now linstor-satellite # on the PVE nodes

After this:

# on the controller
root@linstor-controller:~# apt list linst* --installed
Listing... Done
linstor-client/unknown,now 1.24.0-1 all [installed]
linstor-common/unknown,now 1.30.2-1 all [installed]
linstor-controller/unknown,now 1.30.2-1 all [installed]
# on the PVE nodes
root@pve-2:~# apt list linst* --installed
Listing... Done
linstor-client/unknown,now 1.24.0-1 all [installed]
linstor-common/unknown,now 1.30.2-1 all [installed]
linstor-controller/unknown,now 1.30.2-1 all [installed]
linstor-proxmox/unknown,now 8.0.4-1 all [installed]
linstor-satellite/unknown,now 1.30.2-1 all [installed]

Now I was able to delete the raspi using linstor node delete and linstor node lost.
Status at this time:

root@linstor-controller:~# linstor node list -p
+-------------------------------------------------------------------------------+
| Node    | NodeType  | Addresses                   | State                     |
|===============================================================================|
| pve-1   | SATELLITE | 192.168.113.21:3366 (PLAIN) | Online                    |
| pve-2   | SATELLITE | 192.168.113.22:3366 (PLAIN) | Online                    |
+-------------------------------------------------------------------------------+

I wiped and reinstalled Ubuntu on the raspi. Then:

add-apt-repository ppa:linbit/linbit-drbd9-stack && sudo apt update
apt install linux-headers-6.8.0-1017-raspi # the raspi is NOT a PVE node, so I didn't install the Proxmox headers
apt install drbd-dkms drbd-utils linstor-satellite
modprobe drbd

root@raspi-1:/home/a-tupti# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ 28e2ab938fe5e99fdcb27c0c393a9f2a3fb8fdee\ build\ by\ buildd@bos03-arm64-100\,\ 2024-10-29\ 09:17:58
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x09020c
DRBD_KERNEL_VERSION=9.2.12
DRBDADM_VERSION_CODE=0x091d00
DRBDADM_VERSION=9.29.0

root@raspi-1:/home/a-tupti# cat /proc/drbd 
version: 9.2.12 (api:2/proto:118-122)
GIT-hash: 2da6f528dc4ab3fd25c511f7b03531100e54ab08 build by root@raspi-1, 2024-12-22 14:21:05
Transports (api:21):

root@raspi-1:/home/a-tupti# apt list linst* --installed
Listing... Done
linstor-common/noble,now 1.30.1-1ppa1~noble1 all [installed,automatic]
linstor-satellite/noble,now 1.30.1-1ppa1~noble1 all [installed]

Back on the controller

linstor node create raspi-1 192.168.111.20

root@linstor-controller:~# linstor node list -p
+-------------------------------------------------------------------------------+
| Node    | NodeType  | Addresses                   | State                     |
|===============================================================================|
| pve-1   | SATELLITE | 192.168.113.21:3366 (PLAIN) | Online                    |
| pve-2   | SATELLITE | 192.168.113.22:3366 (PLAIN) | Online                    |
| raspi-1 | SATELLITE | 192.168.111.20:3366 (PLAIN) | OFFLINE(VERSION MISMATCH) |
+-------------------------------------------------------------------------------+

Really? A mismatch in the third place of the version? There are no exactly matching packages with version 1.30.x available in the official and ppa repo!
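To spot this before adding a node, the package versions on both sides can be compared with the Debian revision stripped. This is a rough sketch that assumes the controller wants the same upstream version on every satellite, as the output above suggests. `upstream_version` and `versions_match` are hypothetical helper names, and `1.30.2-1ppa1~noble1` in the comments is an invented version string used only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for comparing LINSTOR package versions.

# Strip the Debian revision: "1.30.1-1ppa1~noble1" -> "1.30.1"
upstream_version() {
    printf '%s\n' "${1%%-*}"
}

# Succeed only if both package versions share the same upstream version.
versions_match() {
    [ "$(upstream_version "$1")" = "$(upstream_version "$2")" ]
}

# With the versions from this thread:
#   versions_match "1.30.2-1" "1.30.1-1ppa1~noble1"   # fails: 1.30.2 vs 1.30.1
#   versions_match "1.30.2-1" "1.30.2-1ppa1~noble1"   # succeeds (invented PPA version)
#
# On a live node, the installed version can be read with:
#   dpkg-query -W -f='${Version}\n' linstor-satellite
```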

That is very annoying. I got hit by that as well. Now I have no way back, because the PPA doesn’t seem to keep prior versions. So there is no tiebreaker until this is resolved, at least not on a Raspberry Pi.

Even more annoying is the lack of caring by LINBIT.

The version mismatch between the PPA and the public Proxmox repos is annoying, and was an oversight on LINBIT’s part. Our devs and tests didn’t consider users mixing where they’re getting packages from.

We’re resolving that now. The build pipeline for the PPA is running and LINSTOR 1.30.2 should be uploaded in a few hours.

While I hear you, comments like this are a little rude. The forums, the PPA, and the public Proxmox VE repositories are provided and supported completely for free. All of the software found in the PPA and LINBIT’s Proxmox plugin repos is open source. The majority of LINBIT was in customer response mode for the holidays, including myself, who just returned to the office from holiday with my family. We do care, but we are human, and do make mistakes/oversights.

@proxmeup I’m using several open-source products, and in 70% of the cases, there is a complete lack of interest on the side of the developers in interacting with the community. I’m talking about issues being closed before they are even looked at, just by some robot, or issues, even confirmed bugs, remaining open for years. This is how it often works in open source: unless the developer/community is personally bothered by a bug or needs a feature, not much will happen.

Heck, I’m using certain proprietary products, software and hardware, some of them even with paid maintenance. While they may offer support, their first level is sometimes so terrible that you’ll still go for weeks without sensible feedback.

As such, I’m really thankful that Linbit has not only open-sourced DRBD9/Linstor but even provides free support.

I understand that you probably wrote that in a moment of frustration. Still, please consider there is a paid option available if you require support. Otherwise, let’s all show a bit more appreciation for companies providing support for open source software and keep our helpers such as Matt and Rick motivated to keep coming back to the forum! :+1::blush:

With that in mind, thanks to you and @mtisza, too, for sharing your experience! It was/is an interesting read.

Hello,

sorry, but I’ve been sick for 2 days and didn’t feel like writing.

Anyway, this has now become a mixed thread: code and how we deal with each other. For all readers following along: I will deal with the code first, because that’s most likely what brought you here, but not without my sincere apology for my words if they sounded rude.

The problem has been fixed. As @kermat said, the build pipeline was activated and the versions are matching again.

Maybe someone is looking for a summary, so here it is:

I aimed for a 2-node PVE cluster with a Raspberry Pi as the PVE quorum device and diskless DRBD tie-breaker. Strictly home use, but this setup tries to achieve higher availability. If you’re going the (LINBIT-)recommended way with LINSTOR, DRBD9, and Proxmox, that means that you need an HA linstor-controller as well. As @kermat recommends: the controller should be running on the PVE nodes, not in a VM.

On the PVE nodes you should install the software as described by LINBIT: How to Setup LINSTOR on Proxmox VE

apt -y install pve-headers-$(uname -r) proxmox-default-headers drbd-dkms drbd-utils linstor-common linstor-client linstor-controller python-linstor linstor-satellite

Just follow the LINBIT docs and everything will be alright.

On the raspi, you need to know that you can’t get the required software packages from the official LINBIT repos if you’re not a (paying) customer. Fortunately, some nice LINBIT people are running a PPA with the packages. But getting software from a PPA also means that you can’t run Raspbian on your raspi; you have to use Ubuntu instead. (Thanks a lot to @mtisza.)

For the tie-breaker role the raspi needs:
add-apt-repository ppa:linbit/linbit-drbd9-stack && sudo apt update
apt install linux-headers-6.8.0-1017-raspi (make sure to install the ones matching your current kernel!)
apt install drbd-dkms drbd-utils linstor-common linstor-satellite
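To avoid hard-coding the kernel version, the headers package name can be derived from the running kernel. A small sketch, assuming Ubuntu’s usual `linux-headers-<kernel release>` naming; `headers_pkg` is a hypothetical helper name.

```shell
#!/bin/sh
# Build the headers package name for a given kernel release string.
headers_pkg() {
    printf 'linux-headers-%s\n' "$1"
}

# On the raspi (not executed here):
#   apt install "$(headers_pkg "$(uname -r)")"
# Afterwards, /lib/modules/$(uname -r)/build should exist, which is
# where DKMS looks when it builds the drbd module.
```

For the kernel mentioned above, `headers_pkg 6.8.0-1017-raspi` yields the same package name as in the manual command.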

AND: make sure that the version numbers of the linstor-* packages in the PPA match the ones in the official LINBIT repo to the last digit. If they don’t match, just wait a bit until LINBIT updates the PPA. The versions in the official repo lead; the PPA must follow.

Again you can follow the LINBIT docs for configuring LINSTOR. Make only one PVE node the LINSTOR controller for now. Make both PVE nodes and the raspi LINSTOR satellite nodes, create DRBD resources on the PVE nodes. Maybe set

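linstor controller set-property DrbdOptions/AutoEvictAllowEviction false
linstor node set-property <node-name> DrbdOptions/AutoEvictAllowEviction false

on controller or node level, respectively.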

At this point you’ll have a working DRBD cluster on Proxmox. The LINSTOR PVE packages will create disk resources for LXC containers and VMs from the Proxmox UI.

What you don’t have at this point is LINSTOR controller HA, which I haven’t done yet.
Again, I’d like to refer to @mtisza and his valuable contribution: you may want to keep the DRBD/LINSTOR-related packages under tighter control and apt-mark hold them.
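Holding the packages could look like the sketch below. The package list is taken from the install commands in this thread and should be adjusted per node role. `hold_linbit_pkgs` and the `DRY_RUN` switch are hypothetical names added so the command can be previewed before it changes anything.

```shell
#!/bin/sh
# Packages named in this thread; adjust to what is installed on the node.
LINBIT_PKGS="drbd-dkms drbd-utils linstor-common linstor-satellite"

# Print the hold command by default; run it only when DRY_RUN=0 is set.
hold_linbit_pkgs() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "apt-mark hold $LINBIT_PKGS"
    else
        # Word splitting of $LINBIT_PKGS is intended here.
        apt-mark hold $LINBIT_PKGS
    fi
}

# Preview:          hold_linbit_pkgs
# Apply (as root):  DRY_RUN=0 hold_linbit_pkgs
# Inspect / undo:   apt-mark showhold ; apt-mark unhold <package>
```

Held packages are skipped by `apt upgrade` and `apt dist-upgrade` until they are released again with `apt-mark unhold`, so upgrades of the controller and satellites can be done together on purpose.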

Now for the other part.
Family resilience is the target ;-). HA is the way to get closer.
Energy costs are a leading factor; for this reason, (continuously running) shared storage should be avoided.
Searching for synchronous replication of local disks across different systems brought me to DRBD. Ceph isn’t an option for a 2-PVE-RasPi-HomeUse-Setup. DRBD has a good reputation, much of it from being integrated into the Linux kernel.

Luckily, I haven’t experienced open source support the way @vik-t has. I really appreciate the passion and devotion of open source developers. When choosing open source software I look at how many people are contributing and how lively the community is, and I factor in what I see as “reputation”. This might help to choose software with good “community interaction”.
The Proxmox forums have been referring to LINBIT since the change to DRBD9. I guess you would agree that the LINBIT forum is not the hotspot of the internet.
But I saw the reputation, and the contributors are a company. That convinced me. And the company sells support to its customers, so these customers wouldn’t appear in the forum.

I’ve gone up and down every LINBIT doc that seems to relate to my setup. I asked “How to config the raspi?” and, now that I know the answer, I did a Google search for “linbit ppa”. Although the raspi has been talked about a lot, it took a community member, @mtisza, to disclose the needed information. Hint for LINBIT: although the raspi is mentioned in your docs, the PPA seems to be missing or hidden too deep.

By coincidence I then hit the build-version mismatch. I guess at that moment, acting in short time frames during the holiday season, my thoughts were: what the…it’s supposed to be an automated build pipeline…how can there be a mismatch…in no way should a mismatch in the third place lead to such a problem…
Well, I guess the build pipelines for the PPA are not running on the same cycle as the ones for the commercial repo, which I absolutely understand from a technical perspective.

Anyway: again my apology.

Just want to say thank you for taking the time to post about your experiences here and I’m glad to learn that you’re reporting being in a better state now.

The LINBIT team just launched this community forum space earlier in 2024. If you have any interest in the background of that, there are some words about the transition in this article on the LINBIT blog. That said, you and other community users taking the time to post experiences, ask questions, and help one another, are building this space into the hotspot that it might not be right now and we’re again thankful for you taking the time to do that.

On another front, I work on documentation at LINBIT and I appreciate you and others who mention some of the issues that you might be having on your journeys through the docs. Blunt hints like these are also appreciated.

“Hint for LINBIT: although the raspi is mentioned in your docs, the ppa seems missing or is hidden too deep.”

Revisiting it, it does seem as if the installing LINSTOR section of the LINSTOR User Guide could be improved. I did not find any mention of the LINBIT PPA as there is in the more comprehensive installing section in the DRBD User Guide.

I will work to improve the LINSTOR UG.

Also, a parting thought about using the PPA and other public repositories… the LINBIT public repos are intended for community, testing, and non-production use. The LINBIT team does push release candidates to the PPA and other public repos. If that might concern you for your use case, I recommend that you monitor the “Release Announcements” topic in this forum. The LINBIT team will announce release candidates to that topic. Being aware that there is a release candidate version in the public repos newer than the version installed on your system might be especially important before you do any system updates, if you haven’t taken the precaution to ignore/freeze LINBIT packages that you are getting from the PPA or other public repositories.

Hello there at LINBIT and in particular @Michael,

it’s really hard to write on the internet (and social media platforms as well), as words are quickly misunderstood. By saying “not the hotspot” I meant in comparison with other big - because used by many, many users - communities. I haven’t had any disparaging or disrespectful intentions.

I think your software is a great product, which clearly has its place. I already said why I chose LINSTOR and not another product. I am grateful that you open sourced it, and it’s really great that you are going to improve the docs and listen to your community. As stated by another community user, @vik-t, not every open source developer (and not every business) does this.

What made it hard for me to come to terms with LINSTOR and DRBD in my Proxmox (and SOHO) environment?

I had started rebuilding my home IT just a couple of weeks ago. It was my first contact with Proxmox and my wishes were:

  • kind of HA for some services (OPNsense, DNS, Unifi, reverse proxy, Keycloak, metrics and monitoring this and other IT services at home)
  • use cheap and power efficient systems (currently Lenovo M920q) with local NVMe storage as Proxmox cluster nodes
  • use an even cheaper and more efficient Raspberry Pi as quorum / tie-breaker for HA
  • use storage replication of logical volumes on the PVE nodes instead of (usually complex) distributed (cluster) filesystems or an ever-running shared-storage system.

Right from the beginning I had difficulty following the LINBIT documentation:

  • missing clear statement, that LINSTOR is the way to configure DRBD9 (spent time trying to omit LINSTOR)
  • hard to follow guides, because of “all one page” and missing reference section for the LINSTOR commands
  • missing thorough explanation for LINSTOR command options or links to the related DRBD command
  • missing a clear big picture of what LINSTOR is and how it fits into the DRBD universe
  • total lack of understanding of the role of the controller and of DRBD’s dependencies on it
  • misled by references to Raspberry Pi and quorum/tie-breakers without thorough explanation. On top of my other listed difficulties, showing just one “you can do it this way with DRBD Reactor”, bringing in another LINBIT software instead of a common one like Pacemaker, with no explanation of the controller role and of DRBD not even starting without the controller, is - IMHO - not enough. Missing the PPA for Ubuntu on the RasPi added to this.

I hope these points give you an idea of the point of view of someone new to your software and maybe help to improve your documentation.

Also hope you will stay committed to the open source community, which comprises a lot of SOHO users.

Kind regards!

Thanks for all the details here, and once again, you taking the time to provide them here. This is very helpful to me!

Re: it being hard to write words on the Internet without the risk of being misunderstood…

I agree with you here. If it matters, I didn’t take your “not the hotspot” comment in a disparaging way. I recognize that it’s the reality as we just launched the forum space here less than a year ago.

Also, regarding your difficulty points (and thank you again for these!)… wondering if you read the Introduction to LINSTOR chapter in the LINSTOR User Guide? I think that that chapter addresses some of your difficulty points but perhaps I am too close to the material and my thinking is biased because of that. If you have any additional suggestions at some future point in time for improving that section, I would be happy to receive them.

Re: the point about “hard to follow guides, because of “all one page”…”

There are efforts we’re making at the moment to revamp how LINBIT presents documentation, including offering split-paged presentation of user guides.

“I hope these point give you an idea of the point of view of someone new to your software and maybe helps to improve your documentation.”

Yes! And thank you again for providing them.

Hi everyone, first of all, thanks for sharing your thoughts. I was now able to create a two-node HA PVE cluster with pve-backup, which provides quorum and manages backups (VMs and other stuff). I’ve made a detailed guide on how to set it up, including pictures. It would be my pleasure if the admins allow me to post it here. I hope there aren’t any errors. Anyway, I will follow the instructions step by step myself, because I want to install it in a production environment.

Special thanks to @proxmeup

Hello! Have you been able to publish the guide anywhere?

Thank you!

Hello, yes. You can download it here.

Don’t forget to substitute your OWN IPs for:

IPv4 pve-primary
IPv4 pve-secondary
IPv4 pve-backup

Error: Not permitted

Sorry for the trouble, correct URL here.

Hey, hi kentril, I used your guide and it works great!
But I have some questions; I’m a noob with LINBIT.

I get some warnings when launching some commands (in the end everything still works, so it’s okay-ish).
As far as I understand, they say that I am not using DRBD 9.
I was surprised, so I checked my installed version, and it is 8.4.
Is that normal?
Does version 9 require a paid subscription?
I’m quite sure I added the right repositories, so why am I on 8 instead of 9?