We have a LINSTOR setup with Proxmox and Kubernetes.
Nodes
$ linstor --controllers 10.10.20.1 n list
╭──────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════╡
┊ prod-pve1 ┊ SATELLITE ┊ 10.11.20.251:3366 (PLAIN) ┊ Online ┊
┊ prod-pve2 ┊ SATELLITE ┊ 10.11.20.252:3366 (PLAIN) ┊ Online ┊
┊ prod-pve3 ┊ SATELLITE ┊ 10.11.20.253:3366 (PLAIN) ┊ Online ┊
┊ prod-tages-k8s-worker-1 ┊ SATELLITE ┊ 10.11.20.11:3366 (PLAIN) ┊ Online ┊
┊ prod-tages-k8s-worker-2 ┊ SATELLITE ┊ 10.11.20.12:3366 (PLAIN) ┊ Online ┊
┊ prod-tages-k8s-worker-3 ┊ SATELLITE ┊ 10.11.20.18:3366 (PLAIN) ┊ Online ┊
┊ prod-tages-k8s-worker-4 ┊ SATELLITE ┊ 10.11.20.19:3366 (PLAIN) ┊ Online ┊
╰──────────────────────────────────────────────────────────────────────────╯
pve1 and pve2 are the diskful Proxmox nodes; the k8s nodes are just VMs in the Proxmox cluster.
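For context, the diskful vs diskless layout can also be seen from the storage pools (a sketch, output omitted):
$ linstor --controllers 10.10.20.1 storage-pool list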
So we have a PVC for a deployment:
$ linstor --controllers 10.10.20.1 r list -r pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b ┊ prod-pve1 ┊ 7057 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2025-02-11 09:30:23 ┊
┊ pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b ┊ prod-pve2 ┊ 7057 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2025-02-11 09:30:22 ┊
┊ pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b ┊ prod-pve3 ┊ 7057 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2025-02-11 09:30:22 ┊
┊ pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b ┊ prod-tages-k8s-worker-4 ┊ 7057 ┊ InUse ┊ Ok ┊ Diskless ┊ 2025-02-13 08:05:41 ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
pve3 is the TieBreaker and all k8s nodes are diskless.
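As far as we understand, the TieBreaker on pve3 was added automatically by LINSTOR; the controller property that governs this can be checked like so (a sketch, output omitted):
$ linstor --controllers 10.10.20.1 controller list-properties | grep -i quorum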
When I reset pve2 (which takes down the prod-tages-k8s-worker-2 and prod-tages-k8s-worker-4 VMs with it), the pod gets rescheduled to prod-tages-k8s-worker-3 and we lose quorum.
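We see the rescheduling on the Kubernetes side with something like this (the pod name is just a placeholder):
$ kubectl get pod <our-pod> -o wide   # NODE column shows prod-tages-k8s-worker-3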
pve1 hosts the VMs prod-tages-k8s-worker-1 and prod-tages-k8s-worker-3; pve2 hosts the VMs prod-tages-k8s-worker-2 and prod-tages-k8s-worker-4.
So when we reset pve2, we have:
- pve2 (Unknown) and prod-tages-k8s-worker-4 (Unknown) are offline
- pve1 (UpToDate) and pve3 (TieBreaker) are alive
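To confirm what DRBD itself reports in that state, we can check directly on pve1 (a sketch, output omitted; the statistics include a quorum: field):
$ drbdsetup status pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b --verbose --statistics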
So how does DRBD decide whether it has quorum?
For example, does DRBD first count all the diskful nodes:
- pve1: online
- pve2: offline
and then count the connections to diskless resources:
- pve1: the tiebreaker (1 diskless node)
- pve2: prod-tages-k8s-worker-4 (1 diskless node)
(a rough count of how we think this adds up is sketched below)
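Our current guess at the arithmetic (an assumption on our side, based on the quorum=majority description in drbd.conf):
voters for this resource: pve1, pve2, pve3 (tiebreaker), prod-tages-k8s-worker-4 (diskless) = 4
reachable after resetting pve2: pve1, pve3 = 2
2 is not more than 4/2, so no majority -> quorum lost?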
How can we solve this situation? I don't want the VMs to participate in quorum, because we lose the VMs whenever we lose a PVE node.
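For reference, these are the quorum-related knobs we have found so far (just a sketch of what the LINSTOR docs mention, not sure it is the right direction; <our-resource-group> is a placeholder):
$ linstor --controllers 10.10.20.1 resource-definition list-properties pvc-dccc4c78-c28a-4ad3-a010-f8fcdf20408b
$ linstor --controllers 10.10.20.1 resource-group set-property <our-resource-group> DrbdOptions/Resource/quorum majority
$ linstor --controllers 10.10.20.1 resource-group set-property <our-resource-group> DrbdOptions/Resource/on-no-quorum io-error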