zdb: ASSERT at module/zfs/ddt.c:1238:ddt_lookup() while trying to get block stats on active pool #16824

Open
intelfx opened this issue Dec 1, 2024 · 0 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Arch |
| Distribution Version | current |
| Kernel Version | 6.11.10 |
| Architecture | x86_64 |
| OpenZFS Version | 2.3.0-rc3 |

Describe the problem you're observing

Invoking zdb -Lbbbs on a pool with fast dedup, while the pool is under write load, results in a crash due to an assertion:

# zdb -Lbbbs htank

Traversing all blocks ...

2.75T completed (8950MB/s) estimated time remaining: 0hr 23min 05sec
ASSERT at module/zfs/ddt.c:1238:ddt_lookup()
error == 0 (0x34 == 0)
  PID: 2746952   COMM: zdb
  TID: 2746952   NAME: zdb
Call trace:
<...>

As far as I'm aware, zdb is supposed to be safe to run on imported and active pools. The man page warns that it's "possible, though unlikely, that zdb may interpret inconsistent pool data and behave erratically". However, this crash has been reproducible in 100% of invocations so far, so I assume something is wrong beyond the expected level of breakage.
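
For reference, the `0x34` in the assertion message is the failing `error` value in hexadecimal. A minimal sketch for converting it to decimal follows; `hex_to_dec` is a hypothetical helper name, not part of zdb or OpenZFS. On Linux, errno 52 is `EBADE`, and as far as I can tell OpenZFS maps `ECKSUM` to `EBADE` on Linux, so this may be a checksum error surfacing from the DDT lookup. That mapping is an assumption and should be checked against the headers of the build in question.

```shell
#!/bin/sh
# Convert the hexadecimal error value printed by the assertion to decimal.
# hex_to_dec is a hypothetical helper for illustration only.
hex_to_dec() {
    # printf accepts C-style hex constants for %d
    printf '%d\n' "$1"
}

hex_to_dec 0x34   # prints 52
```

The decimal value can then be looked up in the platform's errno list (for example in `errno.h`) to see which error `ddt_lookup()` received.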

Describe how to reproduce the problem

Unknown.
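
Without a minimal reproducer, one way to quantify how often the assertion fires is to repeat the invocation while the pool is under write load and count the runs that exit non-zero (an abort from a failed assertion does). The following is only a sketch: `count_failures` is a hypothetical helper, and the number of runs is arbitrary.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Run CMD N times, discard its output, and print how many runs exited non-zero.
# Hypothetical helper for illustration; not part of zdb or OpenZFS.
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed"
}

# Example (not executed here): count_failures 5 zdb -Lbbbs htank
```

A result equal to the number of runs would match the 100% reproduction rate observed so far.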

`zpool create` invocation
zpool create \
    -o ashift=12 \
    -o cachefile=/etc/zfs/zpool.cache \
    -o feature@fast_dedup=enabled \
    -o feature@block_cloning=enabled \
    -o feature@empty_bpobj=enabled \
    -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
    -O compression=zstd-11 \
    -O checksum=sha256 \
    -O dedup=sha256 \
    -O atime=off \
    -O relatime=off \
    -O recordsize=1M \
    htank -m /mnt/htank \
    raidz /dev/mapper/htank-{1,2,3,4} \
    log /dev/mapper/htank-log-1 \
    cache /dev/mapper/htank-cache-1 \
    special mirror /dev/mapper/htank-special-{1,2}

Include any warning/errors/backtraces from the system logs

zdb backtrace (optimized build w/debuginfo):

#0  0x00007d36e4ca53f4 in ?? () from /usr/lib/libc.so.6
#1  0x00007d36e4c4c120 in raise () from /usr/lib/libc.so.6
#2  0x00007d36e4c334c3 in abort () from /usr/lib/libc.so.6
#3  0x00007d36e59f9054 in libspl_assertf (file=file@entry=0x7d36e5704db9 "module/zfs/ddt.c", func=func@entry=0x7d36e56f0298 <__FUNCTION__.35.lto_priv.3> "ddt_lookup", line=line@entry=1238, format=format@entry=0x7d36e5703903 "%s == 0 (0x%llx == 0)") at lib/libspl/assert.c:114
#4  0x00007d36e594d814 in ddt_lookup (ddt=ddt@entry=0x55a649566110, bp=bp@entry=0x55a6a6c82080) at module/zfs/ddt.c:1238
#5  0x000055a630bcd021 in zdb_count_block (zcb=zcb@entry=0x55a65ecb5010, zilog=0x0, bp=bp@entry=0x55a6a6c82080, type=type@entry=DMU_OT_PLAIN_FILE_CONTENTS) at cmd/zdb/zdb.c:5808
#6  0x000055a630bcd948 in zdb_blkptr_cb (spa=0x55a64952db60, zilog=<optimized out>, bp=0x55a6a6c82080, zb=0x55a66fb75550, dnp=<optimized out>, arg=0x55a65ecb5010) at cmd/zdb/zdb.c:6109
#7  0x00007d36e5924183 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x7d33ec98b400, bp=0x55a6a6c82080, zb=zb@entry=0x55a66fb75550) at module/zfs/dmu_traverse.c:290
#8  0x00007d36e59242a7 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x7d33ec98b400, bp=bp@entry=0x7d33ec98b440, zb=zb@entry=0x7ffcf6fffd00) at module/zfs/dmu_traverse.c:357
#9  0x00007d36e5924d66 in traverse_dnode (td=td@entry=0x55a64b358f90, bp=bp@entry=0x7d33f0276a80, dnp=dnp@entry=0x7d33ec98b400, objset=781, object=5794) at module/zfs/dmu_traverse.c:537
#10 0x00007d36e5924859 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x55a66ed5c000, bp=0x7d33f0276a80, zb=zb@entry=0x55a673b64390) at module/zfs/dmu_traverse.c:393
#11 0x00007d36e59242a7 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x55a66ed5c000, bp=0x7d33fc03f000, zb=zb@entry=0x55a673ba5c30) at module/zfs/dmu_traverse.c:357
#12 0x00007d36e59242a7 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x55a66ed5c000, bp=0x7d33a8800000, zb=zb@entry=0x55a66b54e4a0) at module/zfs/dmu_traverse.c:357
#13 0x00007d36e59242a7 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x55a66ed5c000, bp=0x7d33981b7000, zb=zb@entry=0x7d33b0007ec0) at module/zfs/dmu_traverse.c:357
#14 0x00007d36e59242a7 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x55a66ed5c000, bp=0x7d33bc13a000, zb=zb@entry=0x7d33b0007e90) at module/zfs/dmu_traverse.c:357
#15 0x00007d36e59242a7 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x55a66ed5c000, bp=bp@entry=0x55a66ed5c040, zb=zb@entry=0x7ffcf7000150) at module/zfs/dmu_traverse.c:357
#16 0x00007d36e5924d66 in traverse_dnode (td=td@entry=0x55a64b358f90, bp=bp@entry=0x7d3584001ab0, dnp=dnp@entry=0x55a66ed5c000, objset=781, object=object@entry=0) at module/zfs/dmu_traverse.c:537
#17 0x00007d36e59245e4 in traverse_visitbp (td=td@entry=0x55a64b358f90, dnp=dnp@entry=0x0, bp=bp@entry=0x7d3584001ab0, zb=zb@entry=0x55a6688eb630) at module/zfs/dmu_traverse.c:434
#18 0x00007d36e599cc22 in traverse_impl (spa=0x55a64952db60, ds=<optimized out>, objset=<optimized out>, rootbp=0x7d3584001ab0, txg_start=<optimized out>, resume=resume@entry=0x0, flags=53, func=0x55a630bcd8a0 <zdb_blkptr_cb>, arg=0x55a65ecb5010) at module/zfs/dmu_traverse.c:703
#19 0x00007d36e599cfba in traverse_dataset_resume (ds=<optimized out>, txg_start=<optimized out>, resume=resume@entry=0x0, flags=flags@entry=53, func=func@entry=0x55a630bcd8a0 <zdb_blkptr_cb>, arg=arg@entry=0x55a65ecb5010) at module/zfs/dmu_traverse.c:731
#20 0x00007d36e599cfd4 in traverse_dataset (ds=<optimized out>, txg_start=<optimized out>, flags=flags@entry=53, func=func@entry=0x55a630bcd8a0 <zdb_blkptr_cb>, arg=arg@entry=0x55a65ecb5010) at module/zfs/dmu_traverse.c:739
#21 0x00007d36e599d2f6 in traverse_pool (spa=spa@entry=0x55a64952db60, txg_start=txg_start@entry=0, flags=flags@entry=53, func=func@entry=0x55a630bcd8a0 <zdb_blkptr_cb>, arg=arg@entry=0x55a65ecb5010) at module/zfs/dmu_traverse.c:795
#22 0x000055a630bc6513 in dump_block_stats (spa=spa@entry=0x55a64952db60) at cmd/zdb/zdb.c:7060
#23 0x000055a630bd2659 in dump_zpool (spa=<optimized out>) at cmd/zdb/zdb.c:8474
#24 main (argc=<optimized out>, argv=<optimized out>) at cmd/zdb/zdb.c:9808

intelfx added the "Type: Defect" label (Incorrect behavior, e.g. crash, hang) on Dec 1, 2024