DRBD multipath load balancing and failover test

There are two directly connected servers. For the direct connection, a single dual-port NVIDIA ConnectX-5 NIC is used on each server.

So there are two DRBD RDMA links (paths) between the servers.
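For context, the resource is defined with a single connection containing two path sections over the rdma transport, roughly like this (host names and node IDs match the logs below; the device, backing disk, addresses, and port are placeholders, not my exact values):

resource resource0 {
    device    /dev/drbd0;        # placeholder
    disk      /dev/nvme0n1;      # placeholder backing device
    meta-disk internal;

    net {
        transport rdma;          # use the drbd_transport_rdma module instead of TCP
    }

    on memverge  { node-id 0; }
    on memverge2 { node-id 1; }

    connection {
        # one path per ConnectX-5 port; addresses are placeholders
        path {
            host memverge  address 10.0.1.1:7789;
            host memverge2 address 10.0.1.2:7789;
        }
        path {
            host memverge  address 10.0.2.1:7789;
            host memverge2 address 10.0.2.2:7789;
        }
    }
}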

While a sequential write from an NFS client was running, I tested DRBD RDMA link failover by pulling the cables one at a time: unplug a cable, wait, plug it back in, wait, unplug the other cable, wait, plug it back in, wait, unplug a cable again, wait, …
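The write load and monitoring were nothing fancy; roughly along these lines (mount point, file size, and polling interval are just examples, not the exact commands I used):

# on the NFS client: sequential write to the DRBD-backed export
dd if=/dev/zero of=/mnt/nfs/testfile bs=1M count=100000 oflag=direct status=progress

# on the DRBD Primary: watch the connection state while pulling cables
watch -n1 drbdadm status resource0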

Eventually I ended up in a situation where the DRBD connection status on the Primary keeps changing continuously.

On the Primary

[root@memverge ~]# drbdadm status
resource0 role:Primary
  disk:UpToDate
  memverge2 connection:Connecting

[root@memverge ~]# drbdadm status
resource0 role:Primary
  disk:UpToDate
  memverge2 connection:Unconnected

[root@memverge ~]# drbdadm status
resource0 role:Primary
  disk:UpToDate
  memverge2 connection:Connecting

[root@memverge ~]# drbdadm status
resource0 role:Primary
  disk:UpToDate
  memverge2 connection:Unconnected
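As far as I understand, per-path state can also be queried with drbdsetup events2 (on recent DRBD 9 it lists each path and whether it is established), e.g.:

drbdsetup events2 --now resource0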

dmesg

[ 2800.565640] drbd resource0 memverge2: timeout while waiting for feature packet
[ 2800.591914] drbd resource0 memverge2: Terminating sender thread
[ 2800.591941] drbd resource0 memverge2: Starting sender thread (peer-node-id 1)
[ 2800.593624] drbd resource0 memverge2: meta connection shut down by peer.
[ 2800.595280] drbd resource0 memverge2: pp_in_use = 97, expected 0
[ 2800.595284] drbd resource0 memverge2: Connection closed
[ 2800.595291] drbd resource0 memverge2: helper command: /sbin/drbdadm disconnected
[ 2800.596787] drbd resource0 memverge2: helper command: /sbin/drbdadm disconnected exit code 0
[ 2800.596795] drbd resource0 memverge2: conn( BrokenPipe -> Unconnected ) [disconnected]
[ 2801.653596] drbd resource0 memverge2: conn( Unconnected -> Connecting ) [connecting]
[ 2803.701592] drbd resource0 memverge2: sock_recvmsg returned -11
[ 2803.701601] drbd resource0 memverge2: conn( Connecting -> BrokenPipe )
[ 2803.701608] drbd resource0 memverge2: short read (expected size 8)
[ 2803.701609] drbd resource0 memverge2: timeout while waiting for feature packet

On the Secondary

[root@memverge2 ~]# drbdadm status
resource0 role:Secondary
  disk:UpToDate
  memverge role:Primary
    peer-disk:UpToDate

dmesg

[ 2863.619102] dtr_dec_rx_descs: 2592 callbacks suppressed
[ 2863.619103] drbd resource0 rdma:memverge: rx_descs_posted underflow avoided
[ 2863.619105] drbd resource0 rdma:memverge: rx_descs_posted underflow avoided
[ 2863.619105] drbd resource0 rdma:memverge: rx_descs_posted underflow avoided
[ 2863.619106] drbd resource0 rdma:memverge: rx_descs_posted underflow avoided
[ 2869.891094] dtr_dec_rx_descs: 2592 callbacks suppressed

So what is rx_descs_posted? Maybe I need to tune this?