Block drbd0: local WRITE IO error sector $SECTOR on mdXpY

On a two-node DRBD cluster, we are seeing the following error messages:

[11560350.682797] block drbd0: drbd_md_sync_page_io(,3742095912s,WRITE) failed with error -5
[11560350.682808] block drbd0: disk( UpToDate -> Failed )
[11560350.682819] block drbd0: Local IO failed in __al_write_transaction. Detaching...
[11560350.682832] block drbd0: local WRITE IO error sector 218585232+32 on md0p2
[11560350.682842] block drbd0: local WRITE IO error sector 218586616+8 on md0p2
[11560350.682849] block drbd0: local WRITE IO error sector 218588448+16 on md0p2
[11560350.682855] block drbd0: local WRITE IO error sector 218589224+32 on md0p2
[11560350.682864] block drbd0: local WRITE IO error sector 218589848+8 on md0p2
[11560350.682901] block drbd0: disk is Failed, cannot start al transaction
[11560350.683632] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
[11560350.683641] block drbd0: disk( Failed -> Diskless )
[11560350.683810] block drbd0: receiver updated UUIDs to effective data uuid: 78189A5EBB2C6954
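
To see which regions of the backing device are affected, the `local WRITE IO error` lines above can be pulled apart with a small script. This is only a rough sketch: the regex is written against the log lines in this post, `parse_io_errors` is a made-up helper name, and it assumes the number after `+` is the length of the request in sectors.

```python
import re

# Matches lines such as:
#   block drbd0: local WRITE IO error sector 218585232+32 on md0p2
# Assumption: "<start>+<count>" is the start sector and length in sectors.
IO_ERROR = re.compile(
    r"block (?P<drbd>[^:\s]+): local (?P<op>\w+) IO error "
    r"sector (?P<start>\d+)\+(?P<count>\d+) on (?P<backing>\S+)"
)


def parse_io_errors(log_text):
    """Return one dict per DRBD local IO error line found in log_text."""
    errors = []
    for match in IO_ERROR.finditer(log_text):
        errors.append({
            "drbd": match["drbd"],
            "op": match["op"],
            "start": int(match["start"]),
            "count": int(match["count"]),
            "backing": match["backing"],
        })
    return errors


# Log lines copied from the post above.
SAMPLE = """\
[11560350.682832] block drbd0: local WRITE IO error sector 218585232+32 on md0p2
[11560350.682842] block drbd0: local WRITE IO error sector 218586616+8 on md0p2
[11560350.682849] block drbd0: local WRITE IO error sector 218588448+16 on md0p2
[11560350.682855] block drbd0: local WRITE IO error sector 218589224+32 on md0p2
[11560350.682864] block drbd0: local WRITE IO error sector 218589848+8 on md0p2
"""

if __name__ == "__main__":
    for err in parse_io_errors(SAMPLE):
        end = err["start"] + err["count"] - 1
        print(f'{err["op"]} on {err["backing"]}: sectors {err["start"]}..{end}')
```

In practice the text would come from `dmesg` or the journal rather than a pasted string. If the failing sectors are spread out like this rather than clustered, that would fit with the vendor's statement that the disks themselves are fine.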

OS: Debian 12
DRBD Kernel Module: 8.4.11
DRBD Admin Tools: 9.22.0

The hardware vendor tells us that the physical disks are as healthy as they can be.
There were no obvious changes to the system, other than OS updates.

Does anyone have an idea where to look to isolate this problem? Right now, I am pretty much clueless on where to even start. Thanks in advance!

Hi everyone,

For those running systems on the Debian LTS kernel (version 6.1) and experiencing this issue, we may have found a potential solution.

The Problem

As far as we can tell, these write failures are caused by the kernel function md_submit_flush_data in the md (software RAID) layer.

The Cause & Solution

After investigating the kernel’s changelog, it appears a patch was committed that completely removes this function. A subsequent follow-up patch was also merged to address related issues.

The key issue is that these patches were not backported to the 6.1 kernel series. They are only included in kernel versions 6.12 and newer.

How to Fix It on Debian

If you are running Debian Bookworm, you can pick up the patches by upgrading your kernel. They are included in the linux-image-amd64 package from the bookworm-backports repository, which provides the 6.12 kernel series or newer.

Upgrading from backports should incorporate the fix and might resolve the problem.
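
To check quickly whether a node is already on a kernel series that should contain the patches, something like the following could be used. It is a minimal sketch: `kernel_series` and `has_fix` are made-up helper names, the 6.12 threshold is taken from this post, and it only compares the major and minor version, so it cannot detect a distribution kernel that carries the patches as a backport.

```python
import platform
import re

# Per this post: the patches are only in kernel 6.12 and newer.
FIXED_SERIES = (6, 12)


def kernel_series(release):
    """Extract (major, minor) from a release string such as '6.1.0-18-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def has_fix(release):
    """True if the kernel series is at or above the assumed fixed series."""
    return kernel_series(release) >= FIXED_SERIES


if __name__ == "__main__":
    release = platform.release()  # release string of the running kernel
    status = "should include" if has_fix(release) else "likely lacks"
    print(f"Kernel {release} {status} the md flush patches")
```

The comparison is done on integer tuples rather than strings, so 6.12 correctly sorts after 6.2 and 6.1.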

We are currently testing this solution, and I will post an update here if it actually proves to be the fix.
