Drbd-reactor and node affinity

A drbd-reactor instance runs on each LINSTOR satellite. What is it for?

There are also several node allocation methods for pod scheduling, such as the HA Controller, STORK, the Node Affinity Controller, CSI topology, and the linstor-scheduler-extender.

Are all these methods necessary?
Can I use just one?

DRBD Reactor is used to expose Prometheus metrics for the DRBD devices running on the Satellites.
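If you want to scrape those metrics yourself, a Prometheus job along these lines should work. This is only a sketch: 9942 is, as far as I know, the prometheus plugin's default listen port, and the satellite hostnames are placeholders.

```yaml
# Hypothetical scrape job for drbd-reactor's prometheus plugin.
# 9942 is (to my knowledge) the plugin's default listen port;
# the target addresses are placeholders for your satellite nodes.
scrape_configs:
  - job_name: drbd-reactor
    static_configs:
      - targets:
          - satellite-1.example.com:9942
          - satellite-2.example.com:9942
```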

We recommend using a StorageClass with volumeBindingMode: WaitForFirstConsumer and the LINSTOR parameter allowRemoteVolumeAccess: "false" to enforce data locality.
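A minimal sketch of such a StorageClass (the provisioner is the LINSTOR CSI driver; the storage pool name and placement count below are assumptions, so substitute values that match your cluster):

```yaml
# Sketch of a StorageClass that enforces data locality.
# storagePool and placementCount are example values -- replace them.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-local
provisioner: linstor.csi.linbit.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  linstor.csi.linbit.com/storagePool: pool1
  linstor.csi.linbit.com/placementCount: "2"
  linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
```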

The LINSTOR Affinity Controller is used to update PV node affinities in case replicas are redistributed in the cluster after they are first created (which would otherwise leave the PV node affinities outdated).
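To illustrate what it keeps up to date: with strict locality, the provisioned PV carries a nodeAffinity pinning it to the nodes that hold a replica, roughly like the excerpt below (node names are placeholders, and the exact topology key may differ in your cluster):

```yaml
# Illustrative excerpt of a PV under strict data locality; when replicas
# move, the Affinity Controller rewrites the values list to match.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-example   # hypothetical name
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: linbit.com/hostname   # topology key; may differ per setup
              operator: In
              values:
                - node-a
                - node-b
```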

The linstor-scheduler-extender is no longer recommended, as there are generally better alternatives that integrate more easily with Kubernetes.

STORK was used with LINSTOR Operator v1, but was deprecated in favor of CSI topology (WaitForFirstConsumer) in Operator v2.

The LINSTOR HA Controller is only there to speed up failover times in Kubernetes. If the HA Controller detects that a DRBD Primary has lost quorum, it will reschedule the pod using the DRBD device sooner than Kubernetes would on its own.

Thanks.
I have additional questions.

When a node becomes NotReady, it normally takes a little over 5 minutes for the pod to be moved.
But there is a demo that takes 15 minutes.
See: https://github.com/piraeusdatastore/piraeus-ha-controller
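For reference, I believe the roughly 5 minutes comes from the default tolerations that Kubernetes adds to every pod, something like:

```yaml
# Default tolerations injected by Kubernetes (sketch); tolerationSeconds: 300
# is where the ~5 minute delay before eviction comes from.
tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 300
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 300
```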

Please answer my questions with reference to what is explained there.

  1. Does this apply only to stateful apps?

  2. The pod is redeployed after the eviction timeout (5m).
    However, a FailedAttachVolume error then occurs on the new node because the volume is still busy, so Kubernetes spends additional time resolving this.
    In total this takes more than 15 minutes.
    If we use STORK or the Node Affinity Controller, can the pod be assigned to an appropriate node immediately, without the FailedAttachVolume error?
    If so, is it possible to redistribute the pod in about 5 minutes without using the Piraeus HA Controller?

  3. So, if we use something like STORK, can pod redistribution due to node problems be completed in just over 5 minutes, i.e. the default eviction timeout?
    In other words, the worst case of the pod being stuck Pending for more than 15 minutes would not occur.
    Is that correct?

Thanks again.

The reactor running on the satellites seems to use the prometheus plugin.
Do the following setups use the promoter plugin?

  1. LINSTOR Controller HA (https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-linstor_ha)

  2. ownCloud HA (https://www.youtube.com/watch?v=hi-Yqee6u_k)

Yes. It uses DRBD’s quorum state to help make decisions, which implies that the pod is using a DRBD-backed persistent volume.

If you’re seeing FailedAttachVolume, the original node is holding the DRBD device open and is still “connected” (in terms of DRBD’s replication link) to the peer that is attempting to become Primary.

15 minutes should be the worst case. If you test hard node failures, such as hung kernels or lost power, failover should happen faster than that.

That’s what I said in my first reply:

  1. DRBD Reactor’s promoter plugin is the recommended method for making a LINSTOR Controller highly available.

  2. DRBD Reactor’s promoter plugin is used in the linked video to make OCIS (ownCloud Infinite Scale) highly available.

These questions seem unrelated to the original topic. In the future, please open new topics for new questions to help keep things organized and tidy. Thanks!