Slowest Sync Ever

forbin · June 10, 2025, 6:04pm

Just built a new 2-node cluster and created a 500G resource. The sync speed is extremely slow. After 12+ hours, it is at 65.8% and still running at about 5M/sec.
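For a rough sense of scale, the figures above can be turned into a time-remaining estimate. This is only a back-of-the-envelope sketch: it assumes "500G" means GiB, "5M/sec" means MiB/s, and that the rate stays constant. The function name `resync_hours_left` is made up for illustration.

```shell
# Rough hours remaining for a resync, assuming binary units (GiB, MiB/s)
# and a constant rate. All three inputs are read off the post above.
resync_hours_left() {
    # $1 = resource size in GiB, $2 = percent already synced, $3 = rate in MiB/s
    awk -v size="$1" -v pct="$2" -v rate="$3" \
        'BEGIN { printf "%.1f\n", size * 1024 * (100 - pct) / 100 / rate / 3600 }'
}

resync_hours_left 500 65.8 5
```

Under those assumptions, the remaining 34.2% would take roughly another 9.7 hours.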

I have confirmed that the network, hardware, and disks are fast.

[root@ha57a 0]# linstor rg lp rg0
╭────────────────────────────────────────────────╮
┊ Key                                  ┊ Value   ┊
╞════════════════════════════════════════════════╡
┊ DrbdOptions/Net/max-buffers          ┊ 81920   ┊
┊ DrbdOptions/PeerDevice/c-fill-target ┊ 2048    ┊
┊ DrbdOptions/PeerDevice/c-max-rate    ┊ 2048000 ┊
┊ DrbdOptions/Resource/auto-promote    ┊ no      ┊
┊ PeerSlotsNewResource                 ┊ 3       ┊
╰────────────────────────────────────────────────╯

The build is…

Rocky 9.6
DRBD 9.31.0-3.el9
Linstor 1.31.1-1.el9
6 x nvme disks

Devin · June 11, 2025, 8:01pm

Resyncs are pretty slow by default, because that is safest. Resync IO is “background” IO: it is effectively a catch-up after a disconnect, and it runs while new writes and application IO are also occurring. If the resync goes too fast, it begins to slow down application performance, which is usually undesirable. Are you observing these sync speeds while the volumes are in use, or while they are idle?

I have a KB article on tuning the resync speeds here: Tuning the DRBD Resync Controller | Knowledge Base

Looking at what you have presently, I would advise you:
Set c-fill-target to 1M. I can’t really explain it, but 1M just seems to always work.
Tune the max-buffers down to 40k
Verify the c-max-rate. 2048000KiB/s is roughly 17Gbits/s. Can your network support 17 gigabit? Is there any competing traffic on this network, such as the “foreground” application IO replication I mentioned earlier? Setting the c-max-rate too high can result in slower resyncs.
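To sanity-check a c-max-rate against the link speed, the conversion can be sketched as below. It assumes the property value is in KiB/s and reports decimal gigabits per second; the helper name `kib_per_sec_to_gbit` is hypothetical, not part of DRBD or LINSTOR.

```shell
# Convert a rate in KiB/s to Gbit/s (decimal gigabits).
# Assumption: the c-max-rate property is expressed in KiB/s.
kib_per_sec_to_gbit() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib * 1024 * 8 / 1e9 }'
}

# c-max-rate as listed in the rg0 properties
kib_per_sec_to_gbit 2048000
```

The result can then be compared with what a tool such as iperf3 reports for the replication link.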

Rocky 9.6
DRBD 9.31.0-3.el9
Linstor 1.31.1-1.el9
6 x nvme disks

Please note that DRBD 9.31.0-3.el9 is purely the version of the installed userland utilities. It is also important to note the kernel module version (the true DRBD software), which you can query via /proc/drbd.
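A minimal sketch of pulling just the module version out of that file follows. The helper name `drbd_module_version` is made up for illustration, and the sample input mirrors the first line of /proc/drbd on a DRBD 9 system so the snippet can be tried on a machine without the module loaded.

```shell
# Print the DRBD kernel module version from /proc/drbd-style text on stdin.
drbd_module_version() {
    # The first field of the matching line is the literal "version:",
    # the second is the version string itself.
    awk '$1 == "version:" { print $2; exit }'
}

# On a node with the module loaded:
#   drbd_module_version < /proc/drbd
# Sample line in the /proc/drbd format:
printf 'version: 9.2.13 (api:2/proto:118-122)\n' | drbd_module_version
```

That version is the one to compare against the package version of the userland utilities.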

forbin · June 11, 2025, 8:23pm

Hi Devin,

This is a new cluster, but we have others, with about 400 DRBD resources between them. We’re accustomed to seeing initial syncs perform at up to 2GB/sec, and rarely lower than about 300MB/sec, so 5MB/sec is painful.

The network is a 100Gb backbone with 25Gb servers, with low utilization and plenty of bandwidth available. iperf3 tests show approx. 24 Gbits/sec. Since this is the first resource on the cluster, nothing is competing for resources.

[root@ha57a ~]# cat /proc/drbd
version: 9.2.13 (api:2/proto:118-122)
GIT-hash: 0457237e0448663529fe161781873b356f17b3c5 build by @buildsystem, 2025-05-13 09:42:39
Transports (api:21): tcp (9.2.13)

I’ll try your recommendations!
