I have a three-node Proxmox cluster with DRBD + LINSTOR set up according to the official instructions. I upgraded the cluster to PVE 9 and changed the repo files as per the instructions to match Proxmox 9. The upgrade was successful and the cluster has been running without issues for some time.
After yesterday’s apt full-upgrade and a reboot of the third node, I suddenly get the following output from linstor node list on node 1. Node 3 is the only one I have rebooted since the upgrade.
╭─────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════╡
┊ pve1 ┊ SATELLITE ┊ 192.168.***.1:3366 (PLAIN) ┊ Online ┊
┊ pve2 ┊ SATELLITE ┊ 192.168.***.2:3366 (PLAIN) ┊ Online ┊
┊ pve3 ┊ SATELLITE ┊ 192.168.***.3:3366 (PLAIN) ┊ OFFLINE(MISSING EXTERNAL TOOLS) ┊
╰─────────────────────────────────────────────────────────────────────────────────╯
On nodes 2 and 3 I get:
root@pve3:~# linstor node list
Error: Unable to connect to linstor://localhost:3370: [Errno 111] Connection refused
I have tried to rebuild the kernel module:
apt install drbd-dkms --reinstall
sudo rmmod drbd
modprobe drbd
I have also tried restarting the LINSTOR controller and satellite systemd services.
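A minimal sketch of further checks that could be run on pve3 to narrow this down. The tool list (drbdadm, drbdsetup, zfs, dkms) is only an assumption about what the satellite might need in this setup, and the helper name have is made up for illustration. The idea is to see whether the userland tools are on the PATH, which drbd module version would be loaded for the running kernel, and whether DKMS actually built the module for it.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# "have" is a hypothetical helper name used only in this sketch.
have() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Assumed list of tools relevant to a DRBD-on-ZFS setup; adjust as needed.
for tool in drbdadm drbdsetup zfs dkms; do
    have "$tool"
done

# Version of the drbd module that modprobe would pick for the running kernel.
if command -v modinfo >/dev/null 2>&1; then
    modinfo -F version drbd 2>/dev/null \
        || echo "drbd: no module found for kernel $(uname -r)"
fi

# Whether DKMS reports the drbd module as built/installed for this kernel.
if command -v dkms >/dev/null 2>&1; then
    dkms status 2>/dev/null || true
fi
```

If the reported module version is older than what drbd-dkms should provide, that would suggest the rebuilt module is not the one being loaded after the kernel upgrade, but this is a guess rather than a confirmed cause.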
Does anyone have a tip for how to troubleshoot?
I have a ZFS snapshot of the root filesystem that I can use to roll back, but that is extra hassle if there is an easy fix. I use LINSTOR/DRBD on top of ZFS, so all the datasets for my container disk images are still intact. Is there a way in Proxmox to move these datasets to local storage so I can get my containers up and running?