Hi,
I have two links configured for synchronous (protocol C) DRBD replication over TCP.
From the *.res file:
connection
{
    path
    {
        host "memverge" address ipv4 192.168.0.6:7900;
        host "memverge2" address ipv4 192.168.0.8:7900;
    }
    path
    {
        host "memverge" address ipv4 1.1.1.6:7900;
        host "memverge2" address ipv4 1.1.1.8:7900;
    }
    net
    {
        # load-balance-paths yes;
        transport tcp;
        protocol C;
        sndbuf-size 10M;
        rcvbuf-size 10M;
        max-buffers 80K;
        max-epoch-size 20000;
        timeout 90;
        ping-timeout 10;
        ping-int 15;
        connect-int 15;
        fencing resource-and-stonith;
    }
}
From corosync.conf:
nodelist {
    node {
        name: memverge
        nodeid: 27
        ring0_addr: 192.168.0.6
        ring1_addr: 1.1.1.6
    }
    node {
        name: memverge2
        nodeid: 28
        ring0_addr: 192.168.0.8
        ring1_addr: 1.1.1.8
    }
}
But when I unplug the cable from one link, DRBD replication freezes until I plug the cable back in. Kernel log from the node where the cable was pulled:
[ 1268.618724] mlx5_core 0000:86:00.0: Port module event: module 0, Cable unplugged
[ 1268.643924] mlx5_core 0000:86:00.0 ens5f0np0: Link down
[ 1284.707415] drbd ha-nfs memverge2: PingAck did not arrive in time.
[ 1284.707448] drbd ha-nfs: susp-io( no -> fencing )
[ 1284.707455] drbd ha-nfs memverge2: conn( Connected -> NetworkFailure ) peer( Secondary -> Unknown )
[ 1284.707462] drbd ha-nfs/29 drbd1 memverge2: pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
[ 1284.707470] drbd ha-nfs/30 drbd2 memverge2: repl( SyncSource -> Off )
[ 1284.707725] drbd ha-nfs tcp:memverge2: dtt_send_page: size=4096 len=2900 sent=-4
[ 1284.707943] drbd ha-nfs memverge2: Terminating sender thread
[ 1284.708011] drbd ha-nfs memverge2: Starting sender thread (peer-node-id 28)
[ 1284.716036] drbd ha-nfs memverge2: Connection closed
[ 1284.716119] drbd ha-nfs memverge2: helper command: /sbin/drbdadm disconnected
[ 1284.716138] drbd ha-nfs memverge2: helper command: /sbin/drbdadm fence-peer
[ 1284.717060] drbd ha-nfs memverge2: helper command: /sbin/drbdadm disconnected exit code 0
[ 1284.717069] drbd ha-nfs memverge2: conn( NetworkFailure -> Unconnected ) [disconnected]
[ 1284.717075] drbd ha-nfs memverge2: Restarting receiver thread
[ 1284.717079] drbd ha-nfs memverge2: conn( Unconnected -> Connecting ) [connecting]
[ 1284.717093] drbd ha-nfs memverge2: Configured local address not found, retrying every 15 sec, err=-99
[ 1284.756086] drbd ha-nfs memverge2: helper command: /sbin/drbdadm fence-peer exit code 4 (0x400)
[ 1284.756109] drbd ha-nfs memverge2: fence-peer helper returned 4 (peer was fenced)
[ 1284.756115] drbd ha-nfs/29 drbd1 memverge2: pdsk( DUnknown -> Outdated ) [outdate-async]
[ 1284.756313] drbd ha-nfs/29 drbd1: new current UUID: C81E77D9A9B8DE3B weak: FFFFFFFFF7FFFFFF
[ 1284.756321] drbd ha-nfs: susp-io( fencing -> no ) [after-fencing]
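To put a number on how long the connection stayed up after the cable was pulled, the timestamps in the log above can be compared with the keep-alive settings from the net section. This is only a minimal sketch: it assumes ping-int is given in seconds and ping-timeout in tenths of a second, as I read the drbd.conf man page, and the helper name timestamp_of is made up for this example.

```python
import re

# Two lines copied from the kernel log above.
LOG = """\
[ 1268.643924] mlx5_core 0000:86:00.0 ens5f0np0: Link down
[ 1284.707415] drbd ha-nfs memverge2: PingAck did not arrive in time.
"""

# Values from the net section of the resource file.
PING_INT_S = 15        # ping-int, assumed to be in seconds
PING_TIMEOUT_DS = 10   # ping-timeout, assumed to be in tenths of a second


def timestamp_of(log: str, needle: str) -> float:
    """Return the dmesg timestamp (seconds) of the first line containing needle."""
    for line in log.splitlines():
        if needle in line:
            match = re.match(r"\[\s*(\d+\.\d+)\]", line)
            if match:
                return float(match.group(1))
    raise ValueError(f"no line containing {needle!r}")


link_down = timestamp_of(LOG, "Link down")
ping_failed = timestamp_of(LOG, "PingAck did not arrive")

delay = ping_failed - link_down                 # observed gap in the log
budget = PING_INT_S + PING_TIMEOUT_DS / 10      # ping interval plus ping timeout

print(f"link down -> PingAck failure: {delay:.2f} s")
print(f"ping-int + ping-timeout:      {budget:.2f} s")
```

The observed gap is close to ping-int plus ping-timeout, so it looks as if the connection was only declared dead once the keep-alive ping timed out, and the second path was not tried before that. That is my reading of the log, not something I have confirmed.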
So is there no failover to the second, healthy link?
Anton