Differences in DRBD resources

I’m not sure whether this is the correct category, because this topic stems from a problem I see with DRBD resources created by LINSTOR through the LINBIT Proxmox plugin. It could therefore concern DRBD as well as LINSTOR as well as LINBIT SDS. @Mods: feel free to move it.

After my journey with LINBIT SDS, and before going further with LINSTOR controller HA, I checked the already created and running resources. Because of the problems I had along the way, the DRBD resources on my Proxmox cluster were created under different conditions: with or without the presence of a tie-breaker / DRBD quorum. But every DRBD resource was created using the LINBIT SDS plugin!

The differences concern the connections / peer nodes I see for different resources. I think this is best explained by command output. First, the environment:

pve-1: Proxmox cluster node with disks
pve-2: Proxmox cluster node with disks
raspi-1: diskless node acting as Proxmox qDevice quorum and LINBIT tie-breaker / DRBD quorum
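
For reference, the node roles can be double-checked on the controller with a standard LINSTOR client command (output omitted here):

root@pve-1:~# linstor node list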

root@pve-1:~# drbdsetup status pm-575f24e5
pm-575f24e5 role:Secondary
  disk:UpToDate open:no
  pve-2 role:Primary
    peer-disk:UpToDate
  raspi-1 role:Secondary
    peer-disk:Diskless

(Hint: raspi-1 is missing in the following output)
root@pve-1:~# drbdsetup status pm-335cc2f2
pm-335cc2f2 role:Secondary
  disk:UpToDate open:no
  pve-2 role:Primary
    peer-disk:UpToDate


(Hint: compare the connection sections of the following two outputs)
root@pve-1:~# drbdsetup show pm-575f24e5
resource "pm-575f24e5" {
    options {
        quorum          	majority;
        on-no-quorum    	io-error;
    }
    _this_host {
        node-id			1;
        volume 0 {
            device			minor 1006;
            disk			"/dev/pve/pm-575f24e5_00000";
            meta-disk			internal;
            disk {
                rs-discard-granularity	65536; # bytes
            }
        }
    }
    connection {
        _peer_node_id 0;
        path {
            _this_host ipv4 192.168.113.21:7006;
            _remote_host ipv4 192.168.113.22:7006;
        }
        net {
            allow-two-primaries	yes;
            cram-hmac-alg   	"sha1";
            shared-secret   	"4irwAj6sEmAAGiVune2z";
            verify-alg      	"crct10dif";
            _name           	"pve-2";
        }
    }
    connection {
        _peer_node_id 2;
        path {
            _this_host ipv4 192.168.113.21:7006;
            _remote_host ipv4 192.168.111.20:7006;
        }
        net {
            allow-two-primaries	yes;
            cram-hmac-alg   	"sha1";
            shared-secret   	"4irwAj6sEmAAGiVune2z";
            verify-alg      	"crct10dif";
            _name           	"raspi-1";
        }
        volume 0 {
            disk {
                bitmap          	no;
            }
        }
    }
}

root@pve-1:~# drbdsetup show pm-335cc2f2
resource "pm-335cc2f2" {
    _this_host {
        node-id			0;
        volume 0 {
            device			minor 1005;
            disk			"/dev/pve/pm-335cc2f2_00000";
            meta-disk			internal;
            disk {
                rs-discard-granularity	65536; # bytes
            }
        }
    }
    connection {
        _peer_node_id 1;
        path {
            _this_host ipv4 192.168.113.21:7005;
            _remote_host ipv4 192.168.113.22:7005;
        }
        net {
            allow-two-primaries	yes;
            cram-hmac-alg   	"sha1";
            shared-secret   	"fI2MVgaBxnvFvsj4aB9s";
            verify-alg      	"crct10dif";
            _name           	"pve-2";
        }
    }
}
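
By the way, the structural difference is easy to spot by diffing the two outputs (plain bash process substitution, nothing LINSTOR-specific):

root@pve-1:~# diff <(drbdsetup show pm-575f24e5) <(drbdsetup show pm-335cc2f2)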

Running certain commands also produces different output:

(Hint: resource pm-335cc2f2 is missing for raspi-1 in the following)
root@pve-1:~# linstor node set-property pve-1 DrbdOptions/AutoEvictAllowEviction false
SUCCESS:
    Successfully set property key(s): DrbdOptions/AutoEvictAllowEviction
SUCCESS:
Description:
    Node 'pve-1' modified.
Details:
    Node 'pve-1' UUID is: bfe6c590-9157-45fc-810a-a1b410d097f5
SUCCESS:
    (raspi-1) Node changes applied.
SUCCESS:
    (raspi-1) Resource 'pm-575f24e5' [DRBD] adjusted.
SUCCESS:
    (pve-2) Node changes applied.
SUCCESS:
    (pve-2) Resource 'pm-335cc2f2' [DRBD] adjusted.
SUCCESS:
    (pve-2) Resource 'pm-575f24e5' [DRBD] adjusted.

(Hint: resource pm-335cc2f2 is completely missing in the following)
root@pve-1:~# linstor node set-property raspi-1 DrbdOptions/AutoEvictAllowEviction false
SUCCESS:
    Successfully set property key(s): DrbdOptions/AutoEvictAllowEviction
SUCCESS:
Description:
    Node 'raspi-1' modified.
Details:
    Node 'raspi-1' UUID is: 722433a6-b8a4-4756-b73d-ace7055b55c3
SUCCESS:
    (raspi-1) Node changes applied.
SUCCESS:
    (raspi-1) Resource 'pm-575f24e5' [DRBD] adjusted.
SUCCESS:
    (pve-1) Node changes applied.
SUCCESS:
    (pve-1) Resource 'pm-575f24e5' [DRBD] adjusted.
SUCCESS:
    (pve-2) Node changes applied.
SUCCESS:
    (pve-2) Resource 'pm-575f24e5' [DRBD] adjusted.
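
For completeness, the same asymmetry shows up in linstor resource list on the controller: pm-335cc2f2 is only listed for pve-1 and pve-2 (output omitted):

root@pve-1:~# linstor resource list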

Now the questions:

  • What has happened?
  • Is this still correct?
  • Should it be corrected? If so, how?

Can anybody shed some light on this, please?

It looks like you created the pm-335cc2f2 VM while the raspi-1 node was offline. LINSTOR couldn’t assign a diskless tie-breaker resource to it, so it never got one.

Is it still correct? Not if you intend to achieve quorum for that resource.

Should it be corrected? Yes. You can manually assign the diskless resource to raspi-1:

linstor resource create raspi-1 pm-335cc2f2 --drbd-diskless
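
Afterwards you can verify the assignment via the resource list, optionally filtered to that single resource (the -r/--resources filter should be available in current linstor clients):

linstor resource list -r pm-335cc2f2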

Thanks for your answer. In my case, raspi-1 is a diskless tie-breaker; raspi-1 is not meant to serve access to the resource.

If I do linstor resource list, every “correct” resource has three nodes (pve-1, pve-2 and raspi-1) with raspi-1’s state column saying TieBreaker.

If I issue

linstor resource create raspi-1 pm-335cc2f2 --drbd-diskless

the state column for raspi-1 will be Diskless. Shouldn’t it be TieBreaker?

Hmmm, looking through the resource list, linstor_db is also Diskless instead of TieBreaker. linstor_db is a resource built following your LINSTOR controller HA guide.

May I kindly ask for a final answer?

This is completely fine and normal and nothing to worry about.

To begin the explanation: a TieBreaker is just a special kind of diskless resource. The only difference in LINSTOR between a TieBreaker and a Diskless resource is how LINSTOR deals with the resource when the deployment changes, for example when an additional diskful replica is added.

You usually only need a TieBreaker if you would otherwise have only 2 peers. If the connection between these two peers breaks, neither peer would know whether the other one is still alive or could continue the service. This situation does not occur if you have 3 peers (regardless of whether they are diskful or diskless): if you lose one connection, the majority (2 peers) still see each other and can determine that they hold the majority, so they keep quorum, while the minority (the isolated peer) also knows that it no longer has quorum.

So, if you have 2 diskful + 1 TieBreaker resource in LINSTOR and you either delete 1 diskful or add another diskful, you end up with either just 1 or 3 diskful peers. In neither case would the additional TieBreaker help you. That is why LINSTOR is “brave enough” to delete such a TieBreaker resource on its own.

A Diskless resource, in contrast, will under no circumstances be deleted by LINSTOR on its own, even in the same scenarios as above. That means if you go from 2 diskful + 1 diskless (not a tie-breaker this time) to 1 or 3 diskful, LINSTOR will not delete the diskless resource for you. Having 1 or 3 diskful + 1 diskless does not help you with quorum, but LINSTOR still assumes that you want this diskless resource to stay there (maybe a VM will try to access it soon).

Another (slightly briefer) way to explain this: a TieBreaker is just a “LINSTOR-managed Diskless resource”. A TieBreaker is automatically created by LINSTOR and automatically deleted by LINSTOR when no longer needed. The user can only create Diskless resources, never TieBreakers.
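
If you would rather have LINSTOR manage the tie-breaker for pm-335cc2f2 again, one approach (an assumption on my side, based on how the auto-tiebreaker mechanism behaves; it is enabled by default) would be to make sure the controller-level option is set and then delete the manually created diskless resource, so that LINSTOR can re-create it as a managed TieBreaker:

# make sure automatic tie-breaker placement is enabled (it defaults to true)
linstor controller set-property DrbdOptions/auto-add-quorum-tiebreaker true

# delete the manually created diskless resource; with the option enabled,
# LINSTOR should re-create it on its own as a TieBreaker
linstor resource delete raspi-1 pm-335cc2f2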


A side note: if a TieBreaker resource becomes Primary on the DRBD level, it automatically switches to the Diskless state/type, since LINSTOR can no longer assume that the resource can safely be removed once it is no longer needed as a tie-breaker.

Thank you very much for the explanation. My 2+1 config should be fine now.
