Trouble during cluster cold start: linstor_db Diskless + Outdated

zrav · May 23, 2025, 7:43am

We recently experienced an issue that I’d like to get some feedback on.

Our setup is a small Proxmox cluster with linstor-controller configured for HA using drbd-reactor. We had shut down the whole cluster for maintenance on our networking infrastructure. When we powered the nodes back up, the linstor_db volume failed to come up, and with it the controller and the whole cluster.

Specifically, drbdadm showed that the primary node was Diskless, while all other nodes were Outdated. This is a logical conflict that prevented the resource from becoming available. The backing storage volume was fine, and I spent some time trying to get the primary to become diskful, but with no luck.

Finally, I force-promoted one of the Outdated nodes to primary, which made the linstor_db resource available, so the controller could start on that node and the satellites came back online. At this point the original primary node was diskful again. The force-promoted node was in StandAlone mode, but this was expected. Making the original primary node the primary again, then recreating the replica on the force-promoted node by running toggle-disk twice, solved all issues.
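
For readers wondering what "toggle-disk twice" looks like, here is a rough sketch. It assumes the replica is first converted to diskless and then back to diskful so that it resyncs from the good copy. The function name recreate_replica and the node and storage pool arguments are placeholders for illustration, not values from this cluster.

```shell
#!/bin/sh
# Sketch only: drop and recreate one replica with two toggle-disk calls.
# recreate_replica is a hypothetical helper, not a LINSTOR command.
recreate_replica() {
    node="$1"   # node whose replica should be rebuilt (placeholder)
    res="$2"    # resource name, e.g. linstor_db
    pool="$3"   # storage pool to recreate the disk in (placeholder)

    # First call: convert the replica to diskless, discarding its local data.
    linstor resource toggle-disk "$node" "$res" --diskless || return 1

    # Second call: make it diskful again; DRBD then resyncs it from a peer.
    linstor resource toggle-disk "$node" "$res" --storage-pool "$pool"
}
```

It would be called as recreate_replica <node> linstor_db <storage-pool>, and only once another node holds an UpToDate copy, since the first step throws away the local data on that node.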

My questions:

How could this logical deadlock occur?
Is there a better way to solve this situation? Could I somehow have convinced the Diskless primary to become diskful?

Devin · May 28, 2025, 8:26pm

Without logs or further information, I can only guess as to what happened.

My guess at this point would be that something prevented the disk from attaching at startup. I would be curious whether an attempt to reattach the disk might have resolved things. If you run into this issue again, try a quick drbdadm adjust <res> to force the resource to try to attach to the disk again.
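
As a rough sketch of that suggestion, something like the following could be run on the node that shows Diskless. The wrapper name retry_attach is made up for illustration; the only real commands here are drbdadm status and drbdadm adjust.

```shell
#!/bin/sh
# Sketch only: check the resource, retry the attach, check again.
# retry_attach is a hypothetical helper name, not part of drbd-utils.
retry_attach() {
    res="$1"

    # Show role, disk state, and peer states before the attempt.
    drbdadm status "$res"

    # Re-apply the configuration; for a resource that has a backing disk
    # configured but is currently Diskless, this includes an attach attempt.
    drbdadm adjust "$res" || return 1

    # Show the state again to see whether the disk left Diskless.
    drbdadm status "$res"
}
```

For the resource in this thread that would be retry_attach linstor_db. If the adjust fails, the kernel log on that node is the most likely place to find out why the attach did not go through.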
