Add option to btrfs replace to remove unrecoverable files instead of aborting #932

Open
@benpicco

When a btrfs replace is unable to recover a file on a raid56 array, it just aborts:

[167573.709048] BTRFS error (device sdf): unrepaired sectors detected, full stripe 49505622556672 data stripe 2 errors 8-15
[167573.847875] BTRFS error (device sdf): btrfs_scrub_dev(/dev/sdg, 2, /dev/sdd) failed -5

There is a formula to translate those magic numbers back to a file; for convenience I put it into a shell script:

#!/bin/sh
# Usage: $0 <full_stripe> <data_stripe> <err_start> <err_end>

MNT="/mnt/data"

# unrepaired sectors detected, full stripe 49505622556672 data stripe 2 errors 8-15
#                                                 |                   |        |  |
#                                                $1                  $2       $3 $4

STRIPE=$1
INDEX=$2
E_START=$3
E_END=$4

# logical byte offset = full stripe start + stripe index * 64K + error sector * 4K
sudo btrfs inspect-internal logical-resolve -o $(($STRIPE + $INDEX * 65536 + $E_START * 4096)) "$MNT"
sudo btrfs inspect-internal logical-resolve -o $(($STRIPE + $INDEX * 65536 + $E_END * 4096)) "$MNT"

(assuming 4 KiB sectors and a 64 KiB stripe length)
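
For the kernel message quoted above, saving the script as e.g. resolve-stripe.sh (the file name is just for illustration), the invocation would be:

$ ./resolve-stripe.sh 49505622556672 2 8 15

which should print the path(s) of the file(s) that own the first and the last bad sector.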

Instead of dumping fs internals on the user, having them figure out what to do with that information, and making them restart the replace job from the beginning (only to abort again at the next unrecoverable error), it would be much better if the filesystem could automatically remove those unrecoverable files and continue with the device replace.
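
Until something like that exists, the workaround has to be scripted around the kernel log. A minimal sketch, assuming the message lands in dmesg, that simply deleting the affected files is acceptable, and reusing the resolve-stripe.sh helper and the device names from the log above (all of these are placeholders):

#!/bin/sh
# Hypothetical workaround: parse the last "unrepaired sectors" message,
# resolve and delete the affected files, then restart the replace.
MNT="/mnt/data"

# Extract full stripe, data stripe index and error sector range into $1..$4.
set -- $(dmesg | grep 'unrepaired sectors detected' | tail -n 1 | sed -n \
    's/.*full stripe \([0-9]*\) data stripe \([0-9]*\) errors \([0-9]*\)-\([0-9]*\).*/\1 \2 \3 \4/p')
[ $# -eq 4 ] || { echo "no unrepaired-sectors message found" >&2; exit 1; }

# Resolve the offsets to paths and remove the files.
# (Naive: word splitting breaks on paths containing whitespace.)
for FILE in $(./resolve-stripe.sh "$1" "$2" "$3" "$4" | sort -u); do
    echo "removing unrecoverable file: $FILE"
    sudo rm -f -- "$FILE"
done

# Restart the replace from the beginning (device names as in the log above).
sudo btrfs replace start /dev/sdg /dev/sdd "$MNT"

Each pass only gets as far as the next unrecoverable file, so this has to be re-run until the replace completes, which is exactly the round trip the requested option would eliminate.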
