fix: retry mkfs on next reconciliation if interrupted by LoneExile · Pull Request #481 · LINBIT/linstor-server

LoneExile · 2026-02-15T03:17:09Z

Problem

When mkfs is interrupted during initial volume provisioning (e.g., timeout on large volumes, or transient failure while the satellite stays running), the DRBD device is left without a filesystem. The volume appears successfully provisioned but gets stuck in a permanent FailedMount loop because fsck fails on the unformatted device (exit code 8: "Bad magic number in super-block").

This does not self-heal because two one-shot gate flags — checkFileSystem and createPrimary — are cleared before their respective mkfs blocks run. If mkfs throws, both flags are already false and mkfs is never retried on subsequent reconciliations.

| Scenario | checkFileSystem | createPrimary | Self-heals? |
| --- | --- | --- | --- |
| Node reboot (satellite restarts) | Resets to `true` (in-memory) | Controller re-sends via `StltRscDfnApiCallHandler` | Yes |
| mkfs timeout (satellite running) | Already `false` | Already `false` | No, stuck forever |
| mkfs failure (satellite running) | Already `false` | Already `false` | No, stuck forever |

Root Cause

Flag 1: `checkFileSystem` in `MkfsUtils.makeFileSystemOnMarked()`

disableCheckFileSystem() is called at line 136 of MkfsUtils.java, before the mkfs loop begins. If any mkfs call throws StorageException (timeout or failure), the flag is already cleared and the next reconciliation skips the entire block.

- checkFileSystem is a plain boolean field in AbsRscData.java (line 59), initialized to true in the constructor (line 92)
- Not persisted to the database, so it resets to true on satellite restart
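
The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the LINSTOR code: `EarlyClearGate`, `reconcile`, and the simulated failure are hypothetical names chosen for illustration, and only the flag name `checkFileSystem` comes from the description above.

```java
import java.util.List;

// Hypothetical model of a one-shot gate that is cleared too early.
// Nothing here is a real LINSTOR class; only the flag name mirrors the description.
class EarlyClearGate {
    private boolean checkFileSystem = true; // in-memory only, starts as true
    int mkfsCalls = 0;

    // One reconciliation pass. failOn names a device whose mkfs should fail (or null).
    void reconcile(List<String> devices, String failOn) {
        if (!checkFileSystem) {
            return; // gate already closed: the whole block is skipped
        }
        checkFileSystem = false; // cleared BEFORE the loop, as in the unpatched ordering
        for (String device : devices) {
            mkfsCalls++;
            if (device.equals(failOn)) {
                throw new IllegalStateException("simulated mkfs failure on " + device);
            }
        }
    }
}
```

In this model, a first pass that throws leaves the gate closed, so a second pass over the same device never reaches the loop.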

Flag 2: `createPrimary` in `DrbdLayer.condInitialOrSkipSync()`

rsc.unsetCreatePrimary() is called at line 1773 of DrbdLayer.java, before both setResourceUpToDate() and the mkfs block. After clearing:

- PROP_PRIMARY_SET is already set by the controller → Branch A (request primary) is skipped
- createPrimary is false → Branch B (go primary + mkfs) is skipped

Even if checkFileSystem were fixed independently, this outer gate blocks re-entry to the DRBD mkfs path.

- createPrimary is a plain boolean field in Resource.java (line 86), default false
- Set by the controller via StltRscDfnApiCallHandler.setCreatePrimary() (line 70) after the satellite requests primary
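
The two-branch decision can be reduced to a small truth table. This is a hedged sketch under the assumption that Branch A and Branch B are mutually exclusive; `PrimaryGate`, `Branch`, and `selectBranch` are hypothetical names, and the real `condInitialOrSkipSync()` has more conditions than shown here.

```java
// Hypothetical reduction of the two-branch decision described above.
// The real DrbdLayer.condInitialOrSkipSync() has more conditions than this.
class PrimaryGate {
    enum Branch { REQUEST_PRIMARY, GO_PRIMARY_AND_MKFS, SKIP }

    // primarySetProp: whether the controller has already set PROP_PRIMARY_SET.
    // createPrimary:  the in-memory flag on the resource.
    static Branch selectBranch(boolean primarySetProp, boolean createPrimary) {
        if (!primarySetProp) {
            return Branch.REQUEST_PRIMARY;      // Branch A
        }
        if (createPrimary) {
            return Branch.GO_PRIMARY_AND_MKFS;  // Branch B
        }
        return Branch.SKIP;                     // neither branch runs
    }
}
```

After an interrupted mkfs the state is `selectBranch(true, false)`, which in this model yields `SKIP` on every later pass.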

Timeout context

The default external command timeout is 45 seconds (ChildProcessHandler.java, line 20). For large volumes (100+ GB), mkfs can exceed this timeout (see #371).

Fix

Move both flags to after their respective mkfs blocks complete successfully. If mkfs throws, the exception exits the method before the flag is cleared, so the next reconciliation retries.

MkfsUtils.java: Move disableCheckFileSystem() from line 136 (before the mkfs loop) to after the loop completes (line 253 in the patched file).

DrbdLayer.java: Move unsetCreatePrimary() from line 1773 (before the mkfs/sync block) to after it completes (line 1797 in the patched file).
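
The patched ordering, sketched outside of LINSTOR. This is an illustration only: `ClearAfterSuccessGate`, `reconcile`, and `isGateOpen` are hypothetical names, and the `Runnable` stands in for the real formatting step.

```java
// Hypothetical sketch of the patched ordering; not the actual MkfsUtils or DrbdLayer code.
class ClearAfterSuccessGate {
    private boolean checkFileSystem = true; // in-memory only, starts as true
    int mkfsCalls = 0;

    // One reconciliation pass. mkfs is a stand-in for the formatting step and may throw.
    void reconcile(Runnable mkfs) {
        if (!checkFileSystem) {
            return; // gate closed: a previous pass already succeeded
        }
        mkfsCalls++;
        mkfs.run();              // if this throws, the next line is never reached
        checkFileSystem = false; // cleared only after success
    }

    boolean isGateOpen() {
        return checkFileSystem;
    }
}
```

A failed pass leaves the gate open, the next pass retries, and once a pass succeeds further passes are no-ops.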

Safety

- The existing blkid check in hasFileSystem() (MkfsUtils.java line 70) prevents reformatting volumes that already have a filesystem, so partially completed multi-volume runs are safe
- setResourceUpToDate() (initial sync trigger) may run again on retry; DRBD handles redundant primary/secondary transitions gracefully
- No change to persisted state: both flags are held in memory only
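
Why a retry after a partial multi-volume run is safe can be shown with a small model. This is a sketch under stated assumptions: `GuardedFormatter` and `formatAll` are hypothetical, and the in-memory set stands in for what blkid would report on real devices.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for devices and the blkid-style check.
class GuardedFormatter {
    private final Set<String> formatted = new HashSet<>();
    int mkfsCalls = 0;

    // Stand-in for the hasFileSystem() check described above.
    boolean hasFileSystem(String device) {
        return formatted.contains(device);
    }

    // Formats every device that lacks a filesystem; failOn simulates an interrupted mkfs (or null).
    void formatAll(List<String> devices, String failOn) {
        for (String device : devices) {
            if (hasFileSystem(device)) {
                continue; // already formatted in an earlier pass, never reformatted
            }
            mkfsCalls++;
            if (device.equals(failOn)) {
                throw new IllegalStateException("simulated mkfs failure on " + device);
            }
            formatted.add(device);
        }
    }
}
```

If the first pass formats one device and fails on the second, the retry skips the first device and only formats the second.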

Related

- Problem while creating large (100+ GB) volume #371: mkfs timeout on large volumes (200GB+), same root cause
- StorageException: Failed to mkfs /dev/drbd1002 piraeusdatastore/piraeus-operator#641: StorageException: Failed to mkfs; resource stuck InUse, not demoted to secondary after mkfs failure, volume left without filesystem
- fix: handle unformatted DRBD devices in NodePublishVolume piraeusdatastore/linstor-csi#408: CSI-level workaround

Both MkfsUtils.makeFileSystemOnMarked() and DrbdLayer.condInitialOrSkipSync() clear their one-shot gate flags before mkfs runs. If mkfs is interrupted (timeout or failure while the satellite stays running), the flags are already cleared and mkfs is never retried, leaving the DRBD device without a filesystem and the volume stuck in FailedMount.

Move both flags to after mkfs succeeds:

- MkfsUtils: move disableCheckFileSystem() from before the mkfs loop to after it completes. If mkfs throws, the exception exits the method before the flag is cleared, so the next reconciliation retries.
- DrbdLayer: move unsetCreatePrimary() from before the mkfs block to after it completes. This keeps the createPrimary gate open on failure so the DRBD path re-enters on the next device manager run.

The existing blkid check (hasFileSystem) already guards against reformatting volumes that have a filesystem, so successfully formatted volumes from a partial run are not reformatted.

ghernadi · 2026-02-16T16:26:54Z

Honestly I am not sure what to make of this. Why should a second attempt succeed if the first one failed?

Besides, we are about to improve the timeout handling for mkfs as well as for drbdadm create-md to not timeout for large devices. Would that already help your use case or would you still want this PR to be applied?

LoneExile · 2026-02-24T09:19:01Z

The timeout fix would cover the most common trigger, but the underlying issue remains: if mkfs is interrupted for any reason while the satellite stays running (transient I/O error, OOM-killed child process, etc.), the one-shot flags are already cleared and the volume is stuck forever. Only a satellite restart recovers it.

This PR just makes flag clearing conditional on success: no behavior change when mkfs succeeds, and a safe retry on the next reconciliation when it doesn't (guarded by blkid).

Happy to wait or adapt if you'd prefer to land the timeout fix first.

LoneExile wants to merge 1 commit into LINBIT:master from LoneExile:fix/mkfs-retry-on-interrupted-provisioning
