Linstor on Proxmox utilising NVMe-oF with RDMA over RoCE

I had one more question. Do you know of anyone having issues with TCP tiebreakers? I've been trying to pin this issue down for 3 days or so while writing an Ansible script to set up my LINSTOR/DRBD cluster again.

I have 5 nodes: pve1, pve2, and pve3 (my Proxmox cluster), plus pvs1 and pvs2.

pvs1 and pvs2 are diskful. They are both running Proxmox with LINSTOR installed on top, because Proxmox is a supported OS (and it was easier).

pvs3 is a diskless Proxmox VM running as a guest on my pve3 node. (I plan to run a separate, very small Ceph cluster between my 3 pve nodes for HA.)

Then I realized that I probably wasn't using RDMA as the replication transport between pvs1 and pvs2, so I tried to correct that, because with synchronous replication I assumed TCP would be a bottleneck. I used variations of commands like

linstor node-connection drbd-peer-options --transport rdma pvs1 pvs2

#linstor resource-connection drbd-peer-options --transport rdma pvs1 pvs2 rg-pve0

I've even tried all the resource-connection variations as well:
#linstor node-connection drbd-peer-options --transport tcp pvs1 pvs3
#linstor node-connection drbd-peer-options --transport tcp pvs2 pvs3
#linstor node-connection drbd-peer-options --transport rdma pvs1 pvs2
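
For what it's worth, this is how I've been checking what actually got applied on each node (rg-pve0 is my resource name; the paths and grep patterns are just from my setup, nothing official):

# LINSTOR renders the DRBD config on every satellite; the transport for
# each peer connection should show up in the generated .res file
grep -B2 -A2 transport /var/lib/linstor.d/rg-pve0.res

# or dump the live configuration straight from DRBD
drbdsetup show rg-pve0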

But when I reboot one of my diskful nodes, the diskful nodes can connect to each other, yet not to the diskless node (pvs3).

I'm still using LINSTOR + linstor-gateway to set up my NVMe-oF target and handle failover to the other mirrored node.
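
For context, the gateway side is just the stock workflow, roughly like this (the NQN, service IP, and size below are placeholders, and I'm going from memory on the exact argument order):

# create the HA NVMe-oF target, then confirm it's up
linstor-gateway nvme create nqn.2024-01.com.example:nvme:pve-storage 192.168.1.100/24 500G
linstor-gateway nvme list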

I suspect that changing the transport type between pvs1 and pvs2 is the culprit, but I have no good way to confirm this, other than that when I don't try to change the transport to RDMA, the rebooted node does seem able to reconnect to pvs3.
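
In case it helps anyone reproduce this, my debugging loop on the rebooted diskful node looks like this (again, rg-pve0 is my resource; this is just how I poke at it):

# connection state per peer; the pvs3 connection is the one that stays down
drbdadm status rg-pve0

# the kernel log usually says why a connection attempt failed
dmesg | grep -i drbd | tail -20

# tcp and rdma are separate DRBD transport kernel modules; if a node can't
# load the one a connection asks for, that connection won't come up
lsmod | grep drbd_transport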

This forum post was a good reference for me, but they don't seem to have my issue.