[SOLVED] File corruption with NFS HA Cluster

Hello everyone,

I’ve created a test environment with 3 nodes based on AlmaLinux 9 and followed the how-to guide “NFS High Availability Clustering Using DRBD and Pacemaker on RHEL 9”. I now have an NFS share visible on the network.

The issue is with testing the failover as described in Chapter 5.

I did not create a file using “dd”; instead I copied a video file of ~2 GB. When I did a “hard” reset, as described in the aforementioned chapter, the cluster did its thing and failed over to one of the available nodes, and the file continued copying, but the resulting file was corrupted (I checked the file hash, as seen in the following screenshot).
[Screenshot 2024-07-24 at 14.31.12]

If I do the same test but with a normal “reboot” of the primary node instead of a “hard” reset, the resulting file is good (checked with the file hash).

I do understand why the file gets corrupted: the primary DRBD node was “hard” killed before it was able to send the data to the secondary DRBD node.

So my question is: is there a way to configure DRBD to mitigate this situation?

Best regards

I suspect it might just be write buffers not getting flushed to disk with the hard reboot versus the soft one.

Try mounting the filesystem with the -o sync option. In Pacemaker speak, that would mean adding options=sync as a parameter to the ocf:heartbeat:Filesystem resource.
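Something along these lines; the resource name, device, mount point, and filesystem type below are just placeholders, not the exact values from your guide:

    pcs resource update fs_nfs options=sync

or, if you were creating the Filesystem resource from scratch:

    pcs resource create fs_nfs ocf:heartbeat:Filesystem device=/dev/drbd0 directory=/srv/nfs fstype=xfs options=sync

If the resource already carries other mount options, append sync to them (e.g. options=noatime,sync) rather than replacing them.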

This will likely cause a noticeable hit to write performance, but test with the above and see if you can recreate the corruption.


Thank you @Devin. That was the missing piece of information.

Once the sync option was added to the options in the pcs command from Chapter 4.2 of the how-to guide, the file copied while doing a “hard” reset had the same hash as the original file.
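For anyone following along, this is roughly how I checked that the option actually took effect; the resource name and mount point are placeholders, not the exact ones from the guide:

    pcs resource config <filesystem_resource>
    findmnt -o TARGET,OPTIONS <nfs_mount_point>

On the active node, findmnt should list sync among the mount options.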

The performance hit is low, as the test nodes are on a system with NVMe drives (I will run some tests and get the numbers to see exactly what the real hit on performance is). I will also test it on a SATA/SAS system and get back with real-life stats.

Hi @Argadonis, I’m curious, were you able to test it on SATA/SAS and did you collect any stats?

@Devin Do I understand correctly that on a system with enterprise disks that support PLP (power-loss protection) this would not have happened, and the -o sync option is not necessary?

As I understand it, PLP is just a capacitor-backed cache on the disk itself. While that would certainly help in the event of a power outage, I don’t think it would do anything to protect writes still sitting in system memory that have not yet been flushed down to the disk.

I suspect you would still want the -o sync option set.