Block drbd0: local WRITE IO error sector $SECTOR on mdXpY

robin-checkmk · March 21, 2025, 1:44pm

On a two-node DRBD cluster we are seeing the following error messages:

[11560350.682797] block drbd0: drbd_md_sync_page_io(,3742095912s,WRITE) failed with error -5
[11560350.682808] block drbd0: disk( UpToDate -> Failed )
[11560350.682819] block drbd0: Local IO failed in __al_write_transaction. Detaching...
[11560350.682832] block drbd0: local WRITE IO error sector 218585232+32 on md0p2
[11560350.682842] block drbd0: local WRITE IO error sector 218586616+8 on md0p2
[11560350.682849] block drbd0: local WRITE IO error sector 218588448+16 on md0p2
[11560350.682855] block drbd0: local WRITE IO error sector 218589224+32 on md0p2
[11560350.682864] block drbd0: local WRITE IO error sector 218589848+8 on md0p2
[11560350.682901] block drbd0: disk is Failed, cannot start al transaction
[11560350.683632] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
[11560350.683641] block drbd0: disk( Failed -> Diskless )
[11560350.683810] block drbd0: receiver updated UUIDs to effective data uuid: 78189A5EBB2C6954
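
To see which sectors of the backing device are affected, lines like the ones above can be summarized with a short script. This is only a minimal sketch: the regular expression is derived from the log excerpt in this post, and the function name `summarize_io_errors` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches kernel log lines of the form seen in this thread, e.g.
# "[...] block drbd0: local WRITE IO error sector 218585232+32 on md0p2"
# Assumption: other DRBD versions may word this message differently.
IO_ERROR = re.compile(
    r"block (?P<drbd>\S+): local (?P<op>\w+) IO error "
    r"sector (?P<start>\d+)\+(?P<count>\d+) on (?P<dev>\S+)"
)


def summarize_io_errors(lines):
    """Group (start_sector, sector_count) pairs by backing device."""
    errors = defaultdict(list)
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            errors[match.group("dev")].append(
                (int(match.group("start")), int(match.group("count")))
            )
    return dict(errors)


if __name__ == "__main__":
    # Two lines copied from the excerpt above, plus one that should be ignored.
    sample = [
        "[11560350.682832] block drbd0: local WRITE IO error sector 218585232+32 on md0p2",
        "[11560350.682842] block drbd0: local WRITE IO error sector 218586616+8 on md0p2",
        "[11560350.682901] block drbd0: disk is Failed, cannot start al transaction",
    ]
    for dev, ranges in summarize_io_errors(sample).items():
        print(dev, ranges)
```

Feeding it the full kernel log instead of the sample list would show whether the failures cluster in one region of the md device or are spread out.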

OS: Debian 12
DRBD Kernel Module: 8.4.11
DRBD Admin Tools: 9.22.0

The hardware vendor tells us that the physical disks are as healthy as they can be.
There were no obvious changes to the system, other than OS updates.

Does anyone have an idea where to look to isolate this problem? Right now, I am pretty much clueless on where to even start. Thanks in advance!

robin-checkmk · August 22, 2025, 8:04am

Hi everyone,

For those running systems on the Debian LTS kernel (version 6.1) and experiencing this issue, we may have found a potential solution.

The Problem

We’ve identified that certain failures seem to be caused by the kernel function md_submit_flush_data.

The Cause & Solution

After investigating the kernel’s changelog, it appears a patch was committed that completely removes this function. A subsequent follow-up patch was also merged to address related issues.

The key issue is that these patches were not backported to the 6.1 kernel series. They are only included in kernel versions 6.12 and newer.

How to Fix It on Debian

If you are running Debian Bookworm, you can try to resolve this by upgrading your kernel. The necessary patches are included in the linux-image-amd64 package from the bookworm-backports repository, which provides the 6.12 kernel series or newer.

Upgrading your kernel from the backports repository should incorporate the fix and might resolve the problem.
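
To check whether a node is still on an affected kernel series, something like the following could be used. It is a rough sketch: the 6.12 threshold is taken from this post and not independently verified, and the helper names are hypothetical.

```python
import platform
import re

# Assumption from this thread: the patches are only in 6.12 and newer.
FIXED_SINCE = (6, 12)


def kernel_major_minor(release):
    """Extract (major, minor) from a release string such as '6.1.0-32-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_fixed_series(release):
    """True if the release is at least the series said to contain the patches."""
    return kernel_major_minor(release) >= FIXED_SINCE


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    status = "6.12 or newer" if is_fixed_series(running) else "older than 6.12"
    print(f"{running}: {status}")
```

If the node turns out to be older than 6.12, and assuming bookworm-backports is already enabled in the APT sources, the usual way to pull the package from backports is `apt install -t bookworm-backports linux-image-amd64` followed by a reboot.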

We are currently testing this solution, and I will post an update here once we know whether it actually fixes the problem.
