I am a newbie with drbd-utils but I have experience with Debian GNU/Linux.
My setup is very simple: Debian 12, LINBIT DRBD 9.2.x, on top of /dev/sda (all flash) on one machine and a bcache device on the secondary node. For now I am not using any HA software, so I can put this third pair into production quickly. It is the third pair of nodes I have set up with DRBD 9.2.x, and while I was doing the last checks something broke.
While rebooting to test my manual procedures for promoting a node to primary or demoting it to secondary, DRBD stopped working: synchronization no longer flowed between the nodes, and there was no error message anywhere to explain the problem, only kernel messages about drbd tasks being stuck for too long.
Some reboots later it started working again, without explanation. What I found, without having run any apt or dpkg commands, is that the faster node is using the DRBD 8.4 kernel module while the other is still on 9.x: cat /proc/drbd works as before on one node, while on the other only drbdadm status gives information. Should I extract the little data I have on the pair and reinstall the machines, or can I follow the procedure to upgrade the metadata from 8.x to 9.x?
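This is how I have been checking which module each node actually loaded (as opposed to which drbd-dkms package is installed). A minimal sketch; the captured strings below are hypothetical stand-ins for real modinfo / /proc/drbd output:

```shell
# Helper: pull the first X.Y or X.Y.Z version number out of a line.
parse_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1
}

# On a live node I would pipe the real commands instead:
#   modinfo drbd | grep '^version:' | parse_version   # module file on disk
#   head -n1 /proc/drbd | parse_version               # module actually loaded
# Hypothetical captured lines standing in for the two nodes' output:
faster_node='version:        8.4.11'
other_node='version:        9.2.14'

printf '%s\n' "$faster_node" | parse_version
printf '%s\n' "$other_node"  | parse_version
```

If the two numbers disagree on the same node (module file says 9.2.x but the loaded one reports 8.4.x), that would confirm the kernel fell back to the in-tree 8.4 module instead of the DKMS build.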
The data on /dev/drbd0 is not very important, but I want to treat it as real data, to learn how to use DRBD properly.
Some extra information that may be useful.
Faster node:
dpkg --list | grep drbd
ii drbd-dkms 9.2.14-1 all RAID 1 over TCP/IP for Linux module source
ii drbd-utils 9.32.0-1 amd64 RAID 1 over TCP/IP for Linux (user utilities)
wipefs /dev/sda
DEVICE OFFSET TYPE UUID LABEL
sda 0x37e2bffff03c drbd 263774bd15de66f5
sda 0x0 xfs cf3596a0-ac15-4821-a591-4ab4e67d8486
Other node:
ii drbd-dkms 9.2.14-1 all RAID 1 over TCP/IP for Linux module source
ii drbd-utils 9.31.0-1 amd64 RAID 1 over TCP/IP for Linux (user utilities)
wipefs /dev/bcache0
DEVICE OFFSET TYPE UUID LABEL
bcache0 0x3a36ffffd03c drbd f3957fac668095f2
bcache0 0x0 xfs cf3596a0-ac15-4821-a591-4ab4e67d8486
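In case it matters for the recommendation: if the metadata conversion route is viable, this is roughly the procedure I think I would follow, from my reading of the upgrade documentation (with drbd-utils 9, drbdadm create-md is documented to detect existing v08 metadata and offer to convert it in place). The resource name r0 is a hypothetical placeholder, and the DRY_RUN guard only echoes the commands so nothing is touched by accident:

```shell
# Sketch of the v08 -> v09 metadata conversion path. "r0" is a placeholder
# for the real resource name; set DRY_RUN=0 only on the actual node.
DRY_RUN=1
run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }

run drbdadm down r0        # take the resource down first
run drbdadm create-md r0   # should detect v08 metadata and ask to convert to v09
run drbdadm up r0
run drbdadm status r0      # verify the resource comes back and connects
```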
What are your recommendations?