Split-brain issue in DRBD 9.2.13

I am running a three-node DRBD setup with version 9.2.13. All three replicas are diskful, and the auto-promote option is enabled. The following sequence of operations triggers a split-brain between two nodes:

  1. Mount the storage volume on Node C to a directory, causing C to automatically become the primary node.

  2. Disconnect the network between Nodes B and C, then write data to C.

  3. Restore the B-C network connection. At this point, B synchronizes data from C, and its state becomes Inconsistent.

  4. During synchronization, unmount C and mount B, causing B to automatically become the primary node.

  5. Disconnect B and C again. Now, C is Outdated, and B remains Inconsistent.

  6. Restore the B-C connection and wait for synchronization to complete. Now, both B and C become Outdated, but B remains the primary node. At this stage, all three nodes (A, B, C) share the same current UUID.

  7. Unmount B. During this step, the current UUID of A and C sometimes changes while B’s current UUID remains unchanged, resulting in a split-brain condition, as shown in the figure below.

Step 7 does not fail every time, so reproducing the problem requires repeating the test several times.

I suspect the issue occurs because the unmount in Step 7 might write data. If that write is recorded on A and C but not on B, it could lead to a mismatch in the current UUIDs.
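To make the mismatch easier to spot across repeated test runs, here is a minimal Python sketch that groups nodes by the current UUID they report. The function names and the UUID values are hypothetical placeholders, not output from my cluster, and the sketch assumes the UUIDs have already been collected from each node as hex strings and that an exact string comparison is sufficient.

```python
from collections import defaultdict


def group_by_current_uuid(current_uuids):
    """Group node names by the current UUID each node reports.

    current_uuids: dict mapping node name -> current UUID as a hex string.
    Returns a dict mapping normalized UUID -> sorted list of node names.
    """
    groups = defaultdict(list)
    for node, uuid in current_uuids.items():
        # Normalize whitespace and case so formatting differences are ignored.
        groups[uuid.strip().upper()].append(node)
    return {uuid: sorted(nodes) for uuid, nodes in groups.items()}


def has_diverged(current_uuids):
    """Return True if the nodes do not all report the same current UUID."""
    return len(group_by_current_uuid(current_uuids)) > 1


# Placeholder values (assumptions for illustration only).
after_step_6 = {
    "A": "1111111111111110",
    "B": "1111111111111110",
    "C": "1111111111111110",
}
after_step_7 = {
    "A": "2222222222222220",
    "B": "1111111111111110",
    "C": "2222222222222220",
}

print(has_diverged(after_step_6))           # all three nodes agree
print(group_by_current_uuid(after_step_7))  # B is alone in its group
```

Running this check after Step 6 and again after Step 7 in each iteration would show in which runs A and C move to a new current UUID while B stays on the old one.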

I would greatly appreciate your help in resolving this issue. Thank you very much!