Proxmox V9 (fresh node) - can't install drbd

Hi,

I’ve just installed a fresh Proxmox VE 9 node, installed all updates, and tried to install DRBD.

apt install proxmox-default-headers drbd-dkms drbd-utils linstor-proxmox

Building initial module drbd/9.2.15-1 for 6.17.2-1-pve
Sign command: /lib/modules/6.17.2-1-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Building module(s)…(bad exit status: 2)
Failed command:
make -j16 KERNELRELEASE=6.17.2-1-pve -C src/drbd KDIR=/lib/modules/6.17.2-1-pve/build

Error! Bad return status for module build on kernel: 6.17.2-1-pve (x86_64)
Consult /var/lib/dkms/drbd/9.2.15-1/build/make.log for more information.
dpkg: error processing package drbd-dkms (--configure):
installed drbd-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
drbd-dkms
Error: Sub-process /usr/bin/dpkg returned an error code (1)

Error in make.log:

CC [M] drbd_transport_tcp.o
CC [M] drbd_transport_lb-tcp.o
/var/lib/dkms/drbd/9.2.15-1/build/src/drbd/drbd_dax_pmem.c:25:10: fatal error: linux/pfn_t.h: No such file or directory
25 | #include <linux/pfn_t.h>
| ^~~~~~~~~~~~~~~
compilation terminated.

Any suggestions?


The DRBD kmod 9.2.15 does not build for the 6.17 kernel.

You may wish to boot the latest compatible kernel, which would be one from the 6.14 series.
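For example, one possible approach (a sketch; adjust the version to whichever 6.14 kernel is actually installed on your node):

proxmox-boot-tool kernel list
proxmox-boot-tool kernel pin 6.14.11-4-pve
reboot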

Otherwise, the next DRBD release, 9.2.16, will be compatible with 6.17 and is expected to be released in one week, assuming everything goes as expected with testing of the release candidates.

Thanks for the quick response.
So I’ll act chill and lay low till the update is ready :cool:

Hello, same issue here :slight_smile:

Issues with the kernel and DRBD:

root@pivoine:/home/interlope# cat /etc/apt/sources.list.d/linbit.list
deb [signed-by=/etc/apt/trusted.gpg.d/linbit-keyring.gpg] http://packages.linbit.com/public/ proxmox-9 drbd-9

root@pivoine:/home/interlope# apt upgrade -y
The following packages were automatically installed and are no longer required:
proxmox-headers-6.14 proxmox-headers-6.14.11-4-pve
Use ‘sudo apt autoremove’ to remove them.

Summary:
Upgrading: 0, Installing: 0, Removing: 0, Not Upgrading: 0
3 not fully installed or removed.
Space needed: 0 B / 473 GB available

Setting up proxmox-kernel-6.17.2-1-pve-signed (6.17.2-1) …
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/dkms 6.17.2-1-pve /boot/vmlinuz-6.17.2-1-pve
Sign command: /lib/modules/6.17.2-1-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Autoinstall of module drbd/9.2.15-1 for kernel 6.17.2-1-pve (x86_64)
Building module(s)…(bad exit status: 2)
Failed command:
make -j16 KERNELRELEASE=6.17.2-1-pve -C src/drbd KDIR=/lib/modules/6.17.2-1-pve/build

Error! Bad return status for module build on kernel: 6.17.2-1-pve (x86_64)
Consult /var/lib/dkms/drbd/9.2.15-1/build/make.log for more information.

Autoinstall on 6.17.2-1-pve failed for module(s) drbd(10).

Error! One or more modules failed to install during autoinstall.
Refer to previous errors for more information.
run-parts: /etc/kernel/postinst.d/dkms exited with return code 1
Failed to process /etc/kernel/postinst.d at /var/lib/dpkg/info/proxmox-kernel-6.17.2-1-pve-signed.postinst line 20.
dpkg: error processing package proxmox-kernel-6.17.2-1-pve-signed (--configure):
installed proxmox-kernel-6.17.2-1-pve-signed package post-installation script subprocess returned error exit status 2
dpkg: dependency problems prevent configuration of proxmox-kernel-6.17:
proxmox-kernel-6.17 depends on proxmox-kernel-6.17.2-1-pve-signed | proxmox-kernel-6.17.2-1-pve; however:
Package proxmox-kernel-6.17.2-1-pve-signed is not configured yet.
Package proxmox-kernel-6.17.2-1-pve is not installed.
Package proxmox-kernel-6.17.2-1-pve-signed which provides proxmox-kernel-6.17.2-1-pve is not configured yet.

dpkg: error processing package proxmox-kernel-6.17 (--configure):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-default-kernel:
proxmox-default-kernel depends on proxmox-kernel-6.17; however:
Package proxmox-kernel-6.17 is not configured yet.

dpkg: error processing package proxmox-default-kernel (--configure):
dependency problems - leaving unconfigured
Errors were encountered while processing:
proxmox-kernel-6.17.2-1-pve-signed
proxmox-kernel-6.17
proxmox-default-kernel
Error: Sub-process /usr/bin/dpkg returned an error code (1)

root@pivoine:/home/interlope# cat /var/lib/dkms/drbd/9.2.15-1/build/make.log | grep fatal
/var/lib/dkms/drbd/9.2.15-1/build/src/drbd/drbd_dax_pmem.c:25:10: fatal error: linux/pfn_t.h: No such file or directory

For now, you can pin the kernel so it doesn’t use the latest one Proxmox provides :slight_smile:

proxmox-boot-tool kernel pin 6.14.11-4-pve

@spleenftw Pinning the kernel does not seem to be the solution on a fresh, fully updated install where drbd-dkms was never installed first; I am getting stuck because it still tries to build against kernel 6.17.

EDIT: Sorry, I think I was going crazy with my testing; it does work when you pin the kernel!

Oh sorry, mine is not a fresh install.

I just pinned it in case of a reboot, but I’ll simply wait for the release next week.

I’ve pinned the kernel to 6.14.11-4-pve. Rebooted the host. Checked the kernel - it’s 6.14.

Started apt update / apt upgrade

Same build error.

After pinning the kernel and rebooting, did you try installing the matching kernel headers?
apt install pve-headers-$(uname -r)

By doing this, drbd-dkms compiles and installs without errors; I have also pinned my kernel.
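A quick way to double-check that the module really got built and loaded for the running kernel (a sketch):

dkms status drbd
modprobe drbd
cat /proc/drbd    # should report the out-of-tree 9.2.x module version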


Just tried it; it seems there are still some references to the 6.17 kernel in the system:

Loading new drbd/9.2.15-1 DKMS files…
Building for 6.14.11-4-pve and 6.17.2-1-pve

…..

Errors were encountered while processing:
drbd-dkms

I had to manually remove the 6.17 headers with:

apt remove proxmox-headers-6.17

apt remove proxmox-headers-6.17.2-1-pve

Then it works :slight_smile:
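If dpkg was left with half-configured packages from the earlier failed build, it may also help to finish the pending configuration afterwards (a sketch):

dpkg --configure -a
apt -f install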

Still struggling with the 6.17 kernel issue…

The kernel is fixed now and the correct DRBD version is installed and running.

But when I check my resources, I get this:

root@pveAMD01:~# linstor resource list
┊ ResourceName ┊ Node ┊ Layers ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
┊ pm-0c4aa4f8 ┊ pveAMD01 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-09-05 15:36:09 ┊
┊ pm-0c4aa4f8 ┊ pveAMD02 ┊ DRBD,STORAGE ┊ InUse ┊ Connecting(pveAMD01) ┊ UpToDate ┊ 2025-09-05 15:36:05 ┊
┊ pm-1be83a14 ┊ pveAMD01 ┊ DRBD,STORAGE ┊ InUse ┊ Connecting(pveAMD02) ┊ UpToDate ┊ 2025-05-27 08:09:13 ┊
┊ pm-1be83a14 ┊ pveAMD02 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-05-27 08:09:12 ┊
┊ pm-004a3fcd ┊ pveAMD01 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-09-05 11:24:18 ┊
┊ pm-004a3fcd ┊ pveAMD02 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-09-05 11:24:17 ┊
┊ pm-6dbb53f8 ┊ pveAMD01 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-11-01 16:38:07 ┊
┊ pm-6dbb53f8 ┊ pveAMD02 ┊ DRBD,STORAGE ┊ InUse ┊ Connecting(pveAMD01) ┊ UpToDate ┊ 2025-11-01 16:38:06 ┊
┊ pm-6dbf727c ┊ pveAMD01 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-09-05 11:24:15 ┊
┊ pm-6dbf727c ┊ pveAMD02 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-09-05 11:24:14 ┊
┊ pm-6f98e2c2 ┊ pveAMD01 ┊ DRBD,STORAGE ┊ ┊ ┊ Unknown ┊ 2025-09-10 17:33:30 ┊
┊ pm-6f98e2c2 ┊ pveAMD02 ┊ DRBD,STORAGE ┊ InUse ┊ Connecting(pveAMD01) ┊ UpToDate ┊ 2025-09-10 17:33:29 ┊
┊ pm-8b1faeab ┊ pveAMD01 ┊ DRBD,STORAGE ┊ InUse ┊ Connecting(pveAMD02) ┊ UpToDate ┊ 2025-06-10 11:49:22 ┊

Any idea how to fix this?

I also can’t migrate any VM.

Can you reach your controller? For example, are there no errors on linstor node list?
If you can, do you have the following on all nodes:

nano /etc/linstor/linstor-client.conf
[global]
controllers = xxx.xxx.xxx.xxx:3370

systemctl restart linstor-satellite

Looks like connection issues?
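Since the resource list shows the peers stuck in Connecting, it may also be worth checking the DRBD-level connections directly on each node (a sketch):

drbdadm status
linstor node list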

I have another question myself; maybe someone has already noticed this, or I’ll create another topic about the exact right configuration. My setup: 2 PVE nodes and 1 PBS node, which hosts a qdevice for PVE HA and acts as a quorum node for LINSTOR.
When one PVE node is down, the other one survives, but when I select the storage on the surviving PVE node, it no longer shows the storage capacity.

Is this normal behaviour or a misconfiguration?

Hello,

TL;DR: I am quite sure that is expected / normal behavior.


Longer explanation: If I remember correctly, our Proxmox plugin uses linstor resource-group query-size-info ${resource_group_name} internally to get an answer to the question “What is the largest size of a resource I could create?”. This question is not bound to a specific node but rather to a resource group. Here is a very simple setup to demonstrate my point:

$ linstor sp l
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node ┊ Driver   ┊ PoolName     ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName                ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ lin2 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Ok    ┊ lin2;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ lin3 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Ok    ┊ lin3;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ lin4 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Ok    ┊ lin4;DfltDisklessStorPool ┊
┊ lvmthinpool          ┊ lin2 ┊ LVM_THIN ┊ scratch/thin ┊        1 GiB ┊         1 GiB ┊ True         ┊ Ok    ┊ lin2;lvmthinpool          ┊
┊ lvmthinpool          ┊ lin3 ┊ LVM_THIN ┊ scratch/thin ┊        1 GiB ┊         1 GiB ┊ True         ┊ Ok    ┊ lin3;lvmthinpool          ┊
┊ lvmthinpool          ┊ lin4 ┊ LVM_THIN ┊ scratch/thin ┊        1 GiB ┊         1 GiB ┊ True         ┊ Ok    ┊ lin4;lvmthinpool          ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

(Yes, very tiny storage pools, but they are enough for our purposes.)

$ linstor rg qsi dfltrscgrp
╭────────────────────────────────────────────────────────────────╮
┊ MaxVolumeSize ┊ AvailableSize ┊ Capacity ┊ Next Spawn Result   ┊
╞════════════════════════════════════════════════════════════════╡
┊        20 GiB ┊         1 GiB ┊    1 GiB ┊ lvmthinpool on lin2 ┊
┊               ┊               ┊          ┊ lvmthinpool on lin3 ┊
╰────────────────────────────────────────────────────────────────╯

These numbers would be used by the Proxmox plugin to show the available storage.

Now if I shut down lin3 node for example:

$ linstor rg qsi dfltrscgrp
╭────────────────────────────────────────────────────────────────╮
┊ MaxVolumeSize ┊ AvailableSize ┊ Capacity ┊ Next Spawn Result   ┊
╞════════════════════════════════════════════════════════════════╡
┊        20 GiB ┊         1 GiB ┊    1 GiB ┊ lvmthinpool on lin2 ┊
┊               ┊               ┊          ┊ lvmthinpool on lin4 ┊
╰────────────────────────────────────────────────────────────────╯

The command still shows the same numbers, but please note that the second line changed from lin3 to lin4. That means that LINSTOR internally runs a “dry-run” autoplacement and uses its result to present the available storage.

Now if I also take lin4 offline, we have exactly your situation, where fewer nodes are left online than configured in your resource group (1 node online, but the resource group’s place-count is 2 by default):

$ linstor rg qsi dfltrscgrp
╭──────────────────────────────────────────────────────────────╮
┊ MaxVolumeSize ┊ AvailableSize ┊ Capacity ┊ Next Spawn Result ┊
╞══════════════════════════════════════════════════════════════╡
┊         0 KiB ┊         0 KiB ┊    0 KiB ┊ -                 ┊
╰──────────────────────────────────────────────────────────────╯

If we now think again about the original question, “What is the largest resource we could spawn next?”, the result above is actually the correct answer.

In case you still want to know the available space per node, regardless of any resource-group, you can always use linstor storage-pool list:

$ linstor sp l
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node ┊ Driver   ┊ PoolName     ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State   ┊ SharedName                ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ lin2 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Ok      ┊ lin2;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ lin3 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Warning ┊ lin3;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ lin4 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Warning ┊ lin4;DfltDisklessStorPool ┊
┊ lvmthinpool          ┊ lin2 ┊ LVM_THIN ┊ scratch/thin ┊        1 GiB ┊         1 GiB ┊ True         ┊ Ok      ┊ lin2;lvmthinpool          ┊
┊ lvmthinpool          ┊ lin3 ┊ LVM_THIN ┊ scratch/thin ┊              ┊               ┊ True         ┊ Warning ┊ lin3;lvmthinpool          ┊
┊ lvmthinpool          ┊ lin4 ┊ LVM_THIN ┊ scratch/thin ┊              ┊               ┊ True         ┊ Warning ┊ lin4;lvmthinpool          ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
WARNING:
Description:
    No active connection to satellite 'lin3'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
WARNING:
Description:
    No active connection to satellite 'lin4'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.

Hope this helps!


Thanks - you saved my day.

The service restart on the satellite fixed the issue.

Update: DRBD 9.2.16 has just been released. This version restores compatibility with the 6.17 kernel and should resolve this issue.
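On an affected node, picking up the fix should just be a regular package update once 9.2.16 is available in the configured LINBIT repository (a sketch):

apt update
apt upgrade
modinfo drbd | grep ^version    # should show 9.2.16 after the DKMS rebuild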


Make sure to reinstall the kernel headers before updating the DRBD packages so that the modules get rebuilt and installed. I forgot and wasted a bunch of time.
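A sketch of that order of operations, using the package names mentioned earlier in this thread (adjust to the kernel you are actually running):

apt install --reinstall pve-headers-$(uname -r)
apt update && apt upgrade
dkms status drbd    # the module should now build against the running kernel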

Thanks for the update.

I’ve also noticed the kernel header reinstall hack :slight_smile:

Now it works like a charm again - thanks!


Things are working well for me with 6.17.2-2-pve