
Add heal_orphaned_dtl import option for crash recovery during topology changes #18172

@nmarasoiu

Description


Describe the feature you would like to see added to OpenZFS

Add a targeted recovery option (heal_orphaned_dtl) for pool import that detects and clears orphaned DTL (dirty time log) entries on hole/missing vdevs after a crash during topology changes.

TL;DR: when a system crashes mid-detach, orphaned DTL entries can remain on hole vdevs, triggering a phantom resilver (no target device, scans the entire pool). Recovery currently requires disabling all validation (spa_load_verify_*=0), which is a blunt instrument. A targeted healing option would be safer and more user-friendly.

How will this feature improve OpenZFS?

Current situation after crash during vdev detach:

  • Pool import fails validation or triggers a phantom resilver
  • zpool status shows "resilver in progress" but NO device carries the "(resilvering)" marker
  • The resilver scans the entire pool (13.8T in my case) with 0% progress forever
  • zpool scrub -s returns EBUSY
  • Only workarounds: disable all validation OR let the resilver hammer the drives for hours

With this feature:

  • Pool import detects orphaned DTL on holes automatically
  • Clear error message directs user to: zpool import -o heal_orphaned_dtl=on poolname
  • User explicitly consents to healing action
  • Recovery is logged to pool history
  • No phantom resilver triggered

Why not just fix the crash atomicity? Crashes during topology changes are inherently difficult to make fully atomic: hardware failures, kernel panics, and D-state deadlocks can interrupt any transaction. The existing spa_load_verify_* tunables acknowledge this reality. This feature adds a targeted recovery path rather than requiring users to disable all validation.

Additional context

Environment:

  • OpenZFS: zfs-2.2.2-0ubuntu9.4
  • Kernel: 6.8.0-90-generic
  • Pool: dRAID1 + special vdevs + log vdev, ~14TB

What happened:

Mass detach on 2026-01-26 (5 devices in ~1 second):

2026-01-26.19:29:08 [txg:2675586] detach wwn-...-part1
2026-01-26.19:29:08 [txg:2675593] detach wwn-...-part2
2026-01-26.19:29:08 [txg:2675600] detach wwn-...-part3
2026-01-26.19:29:08 [txg:2675607] detach wwn-...-part4
2026-01-26.19:29:09 [txg:2675614] detach wwn-...-part5

System crash (USB drive issues → kernel hang → sysrq-b)

Pool import on 2026-01-29 triggers phantom resilver

Current pool state (6 holes from detached devices):
hole_array[0]: 2
hole_array[1]: 5
hole_array[2]: 7
hole_array[3]: 9
hole_array[4]: 10
hole_array[5]: 11

Phantom resilver (no target):
scan: resilver in progress since Thu Jan 29 06:21:39 2026
0B / 13.8T scanned, 0B / 13.8T issued
0B resilvered, 0.00% done, no estimated completion time

Current workarounds required:

To import at all:

options zfs spa_load_verify_data=0
options zfs spa_load_verify_metadata=0

To prevent system meltdown from phantom resilver:

options zfs zfs_scan_suspend_progress=1
options zfs zfs_vdev_scrub_max_active=0

Proposed implementation: PR at https://github.com/nmarasoiu/zfs/tree/fix/orphaned-dtl-healing

Adds vdev_dtl_check_orphaned(), called during import, which (rough sketch after this list):

  1. Detects DTL objects on hole/missing vdevs
  2. By default: fails import with helpful error message
  3. With -o heal_orphaned_dtl=on: clears orphaned DTL, logs to history, continues
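For concreteness, here is a minimal sketch of the shape such a check could take. It is not the code in the branch above: it assumes the 2.2-era OpenZFS internals I'm running (vdev_ishole, vdev_missing_ops, the vdev_dtl[] range trees, range_tree_is_empty(), range_tree_vacate(), spa_history_log_internal(), cmn_err()), the heal flag, error code, and message text are placeholders for however the import option gets plumbed through, and a real patch would also need proper locking and to free the on-disk DTL space map.

static int
vdev_dtl_check_orphaned(vdev_t *vd, boolean_t heal)
{
    spa_t *spa = vd->vdev_spa;

    /* Walk the whole vdev tree depth-first. */
    for (uint64_t c = 0; c < vd->vdev_children; c++) {
        int err = vdev_dtl_check_orphaned(vd->vdev_child[c], heal);
        if (err != 0)
            return (err);
    }

    /* Only detached (hole) or missing vdevs can carry an orphaned DTL. */
    if (!vd->vdev_ishole && vd->vdev_ops != &vdev_missing_ops)
        return (0);

    /* A fuller check would inspect the on-disk DTL object, not just the in-core tree. */
    if (range_tree_is_empty(vd->vdev_dtl[DTL_MISSING]))
        return (0);

    if (!heal) {
        /* Default: refuse the import and point at the new option. */
        cmn_err(CE_WARN, "ZFS: vdev %llu is a hole but still has DTL "
            "entries; re-import with -o heal_orphaned_dtl=on",
            (u_longlong_t)vd->vdev_id);
        return (SET_ERROR(EINVAL));
    }

    /* Explicit consent: drop the stale in-core DTL and record it. */
    range_tree_vacate(vd->vdev_dtl[DTL_MISSING], NULL, NULL);
    spa_history_log_internal(spa, "heal_orphaned_dtl", NULL,
        "cleared orphaned DTL on hole vdev %llu",
        (u_longlong_t)vd->vdev_id);
    return (0);
}

The refuse-and-explain default branch is what would produce the clear error message described above; actually clearing anything stays behind the explicit -o heal_orphaned_dtl=on opt-in.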

Happy to refine the PR based on feedback. The alternative for me is recreating the entire pool from scratch, which works but feels like a sledgehammer for what could be a scalpel fix.
