PVE Cluster broken after update

I have a three-node Proxmox cluster with DRBD+LINSTOR, set up according to the official instructions. I upgraded the cluster to PVE 9 and changed the repo files as per the instructions to match Proxmox 9. This was successful, and the cluster has been running without issues for some time.

After yesterday’s apt full-upgrade and a reboot of the third node, I suddenly get the following output from linstor node list on node 1. Node 3 is the only one I have rebooted since the upgrade.

╭─────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType  ┊ Addresses                  ┊ State                           ┊
╞═════════════════════════════════════════════════════════════════════════════════╡
┊ pve1 ┊ SATELLITE ┊ 192.168.***.1:3366 (PLAIN) ┊ Online                          ┊
┊ pve2 ┊ SATELLITE ┊ 192.168.***.2:3366 (PLAIN) ┊ Online                          ┊
┊ pve3 ┊ SATELLITE ┊ 192.168.***.3:3366 (PLAIN) ┊ OFFLINE(MISSING EXTERNAL TOOLS) ┊
╰─────────────────────────────────────────────────────────────────────────────────╯

On node 2 and 3 I get:

root@pve3:~# linstor node list
Error: Unable to connect to linstor://localhost:3370: [Errno 111] Connection refused

I have tried to rebuild the kernel module:

apt install drbd-dkms --reinstall
sudo rmmod drbd
modprobe drbd

I have also tried to restart the linstor controller and satellite systemd services.
Does anyone have a tip for how to troubleshoot?

I have a ZFS snapshot of the root filesystem that I can use to roll back, but that is extra hassle if this turns out to be an easy fix. I use LINSTOR/DRBD on top of ZFS, so all datasets for my container disk images are still intact. Is there a way in Proxmox to move these datasets to local storage so I can get my containers up and running?

I’d assume it is normal that you cannot connect to the controller on localhost on pve2 and pve3, because the controller is most likely active on pve1. You could point the clients at that IP, or just use the CLI on pve1.
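A minimal sketch of pointing the client at a remote controller, assuming the controller really is the one on pve1 and that the hostname resolves from the other nodes (otherwise substitute its IP). It uses the client's --controllers option and the LS_CONTROLLERS environment variable; the guard is only there so the snippet does nothing harmful on a host without the client.

```shell
# Assumption: the controller is active on pve1; replace with its real hostname or IP.
CONTROLLER="pve1"

# The linstor client reads LS_CONTROLLERS instead of defaulting to localhost.
export LS_CONTROLLERS="$CONTROLLER"

if command -v linstor >/dev/null 2>&1; then
    # Same query as before, now sent to the remote controller.
    linstor node list || echo "could not reach controller at $CONTROLLER"
    # One-off alternative that does not depend on the environment variable.
    linstor --controllers "$CONTROLLER" node list || echo "could not reach controller at $CONTROLLER"
else
    echo "linstor client not found on this host"
fi
```

If this works from pve2 and pve3, the "Connection refused" error there is just the missing local controller and not part of the actual problem.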

The missing external tools are interesting. Restart the satellite on pve3 and collect the syslogs; that might give some hints about what is missing.
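Roughly, that could look like the sketch below, run as root on pve3. The unit name linstor-satellite is the usual one from the LINBIT packages, the log path is a hypothetical choice, and the grep patterns are guesses at useful keywords, not known log strings. The last two commands print the versions of the DRBD userland tools and the kernel module, which are worth comparing after an upgrade.

```shell
# Run as root on pve3. Unit name assumed to be the one shipped in the LINBIT packages.
UNIT="linstor-satellite"
LOG="/tmp/linstor-satellite.log"   # hypothetical path, pick any

if command -v systemctl >/dev/null 2>&1 && command -v journalctl >/dev/null 2>&1; then
    systemctl restart "$UNIT" || echo "restart of $UNIT failed"
    sleep 5   # give the satellite a moment to start and write its log
    # Save everything the satellite logged since the current boot.
    journalctl -u "$UNIT" -b --no-pager > "$LOG" 2>&1 || true
    # Keyword guesses only; read the full log if nothing matches.
    grep -iE 'drbd|extern|version|not found' "$LOG" || echo "no matching lines in $LOG"
fi

# Userland tools and kernel module that the DRBD layer depends on.
if command -v drbdadm >/dev/null 2>&1; then
    drbdadm --version || echo "drbdadm failed to report its version"
else
    echo "drbdadm not installed"
fi
if command -v modinfo >/dev/null 2>&1; then
    modinfo -F version drbd || echo "no drbd module found for the running kernel"
fi
```

If modinfo finds no drbd module for the running kernel, or reports a different version than on pve1 and pve2, the module was probably not built for the new kernel, which would be one possible explanation for the satellite's complaint.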