Fixed: Linstor/DRBD 9.2.16 broken after update

I’m running two PVE nodes with LINSTOR 1.32. I updated one node (Heisenberg), which also runs the linstor-controller, to kernel 6.17 and DRBD 9.2.16. After the reboot, every volume on that node shows no device and state Unknown:

root@Heisenberg:/var/log/linstor-satellite# linstor v l
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Resource ┊ Node ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊ Repl ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ iso-store ┊ Heisenberg ┊ pve-storage ┊ 0 ┊ 1019 ┊ None ┊ ┊ ┊ Unknown ┊ ┊
┊ iso-store ┊ Oppenheimer ┊ pve-storage ┊ 0 ┊ 1019 ┊ /dev/drbd1019 ┊ 30.73 MiB ┊ Unused ┊ UpToDate ┊ ┊
┊ linstor_db ┊ Heisenberg ┊ pve-storage ┊ 0 ┊ 1033 ┊ None ┊ ┊ ┊ Unknown ┊ ┊
┊ linstor_db ┊ Oppenheimer ┊ pve-storage ┊ 0 ┊ 1033 ┊ /dev/drbd1033 ┊ 15.69 MiB ┊ Unused ┊ UpToDate ┊ ┊
┊ pm-04382a99 ┊ Heisenberg ┊ pve-storage ┊ 0 ┊ 1026 ┊ None ┊ ┊ ┊ Unknown ┊ ┊

┊ pm-fcd1dcba ┊ Oppenheimer ┊ pve-storage ┊ 0 ┊ 1027 ┊ /dev/drbd1027 ┊ 5.03 GiB ┊ InUse ┊ UpToDate ┊ ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

and is offline:

linstor n l

╭─────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════╡
┊ Heisenberg ┊ COMBINED ┊ 172.18.0.12:3366 (PLAIN) ┊ OFFLINE(MISSING EXTERNAL TOOLS) ┊
┊ Oppenheimer ┊ COMBINED ┊ 172.18.0.11:3366 (PLAIN) ┊ Online ┊
╰─────────────────────────────────────────────────────────────────────────────────────╯

Linstor Satellite error log shows:
root@Heisenberg:/var/log/linstor-satellite# cat ErrorReport-69282569-96C93-000000.log
ERROR REPORT 69282569-96C93-000000

============================================================

Application: LINBIT® LINSTOR
Module: Satellite
Version: 1.32.3
Build ID: 6dac06aed233f2c89ac7cc6b1185d6dce9ec74c4
Build time: 2025-10-13T06:37:58+00:00
Error time: 2025-11-27 10:18:28
Node: Heisenberg
Thread: MainWorkerPool-2

============================================================

Reported error:

Category: LinStorException
Class name: MissingRequiredExtToolsStorageException
Class canonical name: com.linbit.linstor.core.apicallhandler.StltApiCallHandler.MissingRequiredExtToolsStorageException
Generated at: Method 'checkLayersForExtToolsSupport', Source file 'StltApiCallHandler.java', Line #660

Error message: Received a resource that requires DRBD9_KERNEL but that external tool is not supported on this satellite

ErrorContext:

Call backtrace:

Method                                   Native Class:Line number
checkLayersForExtToolsSupport            N      com.linbit.linstor.core.apicallhandler.StltApiCallHandler:660
applyFullSync                            N      com.linbit.linstor.core.apicallhandler.StltApiCallHandler:395
execute                                  N      com.linbit.linstor.api.protobuf.FullSync:115
executeNonReactive                       N      com.linbit.linstor.proto.CommonMessageProcessor:541
lambda$execute$14                        N      com.linbit.linstor.proto.CommonMessageProcessor:512
doInScope                                N      com.linbit.linstor.core.apicallhandler.ScopeRunner:178
lambda$fluxInScope$0                     N      com.linbit.linstor.core.apicallhandler.ScopeRunner:101
call                                     N      reactor.core.publisher.MonoCallable:72
trySubscribeScalarMap                    N      reactor.core.publisher.FluxFlatMap:128
subscribeOrReturn                        N      reactor.core.publisher.MonoFlatMapMany:49
subscribe                                N      reactor.core.publisher.Flux:8833
onNext                                   N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:196
request                                  N      reactor.core.publisher.Operators$ScalarSubscription:2570
onSubscribe                              N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:141
subscribe                                N      reactor.core.publisher.MonoJust:55
subscribe                                N      reactor.core.publisher.MonoDeferContextual:55
subscribe                                N      reactor.core.publisher.Flux:8848
onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:430
slowPath                                 N      reactor.core.publisher.FluxArray$ArraySubscription:126
request                                  N      reactor.core.publisher.FluxArray$ArraySubscription:99
onSubscribe                              N      reactor.core.publisher.FluxFlatMap$FlatMapMain:373
subscribe                                N      reactor.core.publisher.FluxMerge:73
subscribe                                N      reactor.core.publisher.Flux:8848
onComplete                               N      reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber:238
subscribe                                N      reactor.core.publisher.FluxConcatArray:79
subscribe                                N      reactor.core.publisher.InternalFluxOperator:68
subscribe                                N      reactor.core.publisher.FluxDefer:54
subscribe                                N      reactor.core.publisher.Flux:8848
onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:430
drainAsync                               N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:453
drain                                    N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:724
onNext                                   N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:256
drainFused                               N      reactor.core.publisher.SinkManyUnicast:321
drain                                    N      reactor.core.publisher.SinkManyUnicast:363
tryEmitNext                              N      reactor.core.publisher.SinkManyUnicast:239
tryEmitNext                              N      reactor.core.publisher.SinkManySerialized:100
processInOrder                           N      com.linbit.linstor.netcom.TcpConnectorPeer:442
doProcessMessage                         N      com.linbit.linstor.proto.CommonMessageProcessor:228
lambda$processMessage$2                  N      com.linbit.linstor.proto.CommonMessageProcessor:165
onNext                                   N      reactor.core.publisher.FluxPeek$PeekSubscriber:185
runAsync                                 N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:446
run                                      N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:533
call                                     N      reactor.core.scheduler.WorkerTask:84
call                                     N      reactor.core.scheduler.WorkerTask:37
run                                      N      java.util.concurrent.FutureTask:317
run                                      N      java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask:304
runWorker                                N      java.util.concurrent.ThreadPoolExecutor:1144
run                                      N      java.util.concurrent.ThreadPoolExecutor$Worker:642
run                                      N      java.lang.Thread:1583

END OF ERROR REPORT.

lsmod shows the kernel module is loaded:

lsmod | grep drbd
drbd 471040 0
lru_cache 16384 1 drbd

with kernel Linux Heisenberg 6.17.2-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.17.2-1 (2025-10-21T11:55Z) x86_64 GNU/Linux

Before the update, all workloads were migrated to the other node, which is still running kernel 6.14 and DRBD 9.2.15.

What does dmesg say? I would assume that something went wrong since the drbd_transport_tcp module has not been loaded.

That’s an interesting one…

[ 23.745987] drbd: initialized. Version: 8.4.11 (api:1/proto:86-101)
[ 23.745994] drbd: srcversion: 900745622289D38C0BAB129
[ 23.745996] drbd: registered as block device major 147

So it loaded a DRBD 8.4 module instead of the DRBD 9 one.

Is the drbd9 module installed in the right place?

find /lib/modules -name drbd.ko

If so, try insmoding the module for the correct kernel version directly. That should at least give you an error in dmesg.

If there is no module, you should install it :slight_smile:
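For anyone hitting the same thing, here is a rough sketch of that check as a script. It asks modinfo which drbd.ko modprobe would pick for the running kernel and whether it reports a 9.x version. The function names is_drbd9 and check_drbd_module are made up for this example, and it assumes the module exposes a version field.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the version string has major version 9.
is_drbd9() {
    case "$1" in
        9.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Hypothetical helper: show which drbd module modprobe would load for the
# running kernel and whether it looks like DRBD 9.
check_drbd_module() {
    path=$(modinfo -n drbd 2>/dev/null) || {
        echo "no drbd module found for kernel $(uname -r)"
        return 1
    }
    ver=$(modinfo -F version drbd 2>/dev/null)
    echo "module:  $path"
    echo "version: $ver"
    if is_drbd9 "$ver"; then
        echo "a DRBD 9 module would be loaded"
    else
        echo "not a DRBD 9 module, check 'dkms status' for this kernel"
        return 1
    fi
}
```

Run check_drbd_module on the affected node. If the path points into the kernel's own tree rather than a DKMS location such as updates/dkms, the DRBD 9 build for that kernel is probably missing.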

I have no idea why DKMS did not work. I just reinstalled drbd-dkms and now it’s all fine. Weird.
However, thanks for the hint :wink:
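In case it helps someone later, the fix roughly looks like the sketch below. This is not an official procedure: the function names are invented for the example, it assumes a Debian/PVE system with the drbd-dkms package, and it should only be run as root on a node with no active DRBD resources.

```shell
#!/bin/sh
# Hypothetical helper: pull the version number out of a
# "version: X.Y.Z (api:.../proto:...)" line as found in /proc/drbd.
parse_drbd_version() {
    sed -n 's/^version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: rebuild the DKMS module, reload it and print the
# version that is now loaded.
reinstall_drbd_dkms() {
    apt install --reinstall drbd-dkms || return 1
    dkms status              # drbd should show as installed for the running kernel
    modprobe -r drbd         # unload the old 8.4 module (fails if it is in use)
    modprobe drbd || return 1
    parse_drbd_version < /proc/drbd
}
```

If reinstall_drbd_dkms prints a 9.x version, the satellite probably needs a restart (for example systemctl restart linstor-satellite, or simply a reboot) before the controller sees the node as Online again.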