Description
System information
| Type | Version/Name |
|---|---|
| Distribution Name | FreeBSD |
| Distribution Version | |
| Kernel Version | 14.3-RELEASE-p3 |
| Architecture | x86_64 |
| OpenZFS Version | zfs-2.2.7-FreeBSD_ge269af1b3 |
Describe the problem you're observing
The datasets A and B were created with the default 128k recordsize and have been in use for about 8 years. They have been used as a primary/DR replication pair, and the replication direction has changed over time.
A few months ago, the recordsize property on A was changed to 512k while A was replicating to B without the -L flag. The recordsize on B was not changed because B was the read-only replica. Recently, the -L and -c flags were added to the incremental sends from the A side. After B received one or two replication streams sent with -L and -c, we noticed that at least one file had been zeroed, with stat showing 512k as st_blksize, 1 as st_blocks, and the correct file size. This file was created with recordsize 512k and replicated to B months ago. We were able to reproduce the issue on A and B by rolling back to a good snapshot and receiving incremental snapshots sent with the -L flag. After confirming this and reproducing it with fresh datasets created on B's pool as detailed below, we rolled dataset B back to a good snapshot and removed the -L flag from subsequent incremental sends.
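To look for other affected files on the replica, a check along these lines may help. It is only a sketch: it assumes "zeroed" means that every byte read back from the file is NUL, and the helper names `is_all_zero` and `list_zeroed` are hypothetical, not part of any ZFS tooling.

```shell
# Succeeds if the file is non-empty and every byte in it is NUL.
is_all_zero() {
    # An empty file is not treated as "zeroed".
    [ -s "$1" ] || return 1
    # Delete all NUL bytes and count what is left; 0 means only NULs.
    remaining=$(tr -d '\000' < "$1" | wc -c | tr -d ' ')
    [ "$remaining" -eq 0 ]
}

# Print every regular file under a directory that reads back as all zeros.
list_zeroed() {
    find "$1" -type f | while IFS= read -r path; do
        if is_all_zero "$path"; then
            printf '%s\n' "$path"
        fi
    done
}
```

Running `list_zeroed` against the mountpoint of B reads every file in full, so it can take a long time on a large dataset. Files that legitimately contain only zeros will also be listed, so each hit still needs to be compared against the copy on A.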
I am aware of #6224 and, to the best of my knowledge, the no-L to -L toggle bug has been addressed, while the -L to no-L toggle is prohibited once a large block receive has happened. However, I am not certain that the no-L to -L bug has been fully addressed, considering what we observed.
Describe how to reproduce the problem
At the time we experienced the issue, we were able to reproduce it with two newly created datasets using the following steps:
- Create a new dataset (foo) with recordsize=128k.
- Create file1 on foo using dd if=/dev/urandom bs=128k count=100.
- Take a snapshot and replicate to another new dataset (bar).
- Set recordsize to 512k on foo.
- Create file2 on foo using the same steps from above.
- Take a snapshot and send incremental snapshot to bar without -L flag.
- Create file3 on foo using the same steps from above.
- Take a snapshot and send incremental snapshot to bar with -L flag.
At this point, we saw that file2 on bar had been zeroed.
However, we are no longer able to reproduce the issue with the steps above.
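For anyone who wants to retry this, the steps above can be sketched as a small shell helper. The pool name, the snapshot names (s1, s2, s3), the file paths, and the function name `repro_commands` are placeholders rather than the exact commands we ran, and the sketch assumes the datasets are mounted at their default mountpoints under `/<pool>`. The function only prints the commands so they can be reviewed first.

```shell
# Print the reproduction commands for a given pool; nothing is executed.
repro_commands() {
    pool="$1"
    cat <<EOF
zfs create -o recordsize=128k $pool/foo
dd if=/dev/urandom of=/$pool/foo/file1 bs=128k count=100
zfs snapshot $pool/foo@s1
zfs send $pool/foo@s1 | zfs receive $pool/bar
zfs set recordsize=512k $pool/foo
dd if=/dev/urandom of=/$pool/foo/file2 bs=128k count=100
zfs snapshot $pool/foo@s2
zfs send -i $pool/foo@s1 $pool/foo@s2 | zfs receive $pool/bar
dd if=/dev/urandom of=/$pool/foo/file3 bs=128k count=100
zfs snapshot $pool/foo@s3
zfs send -L -i $pool/foo@s2 $pool/foo@s3 | zfs receive $pool/bar
EOF
}
```

`repro_commands tank` prints the sequence for a pool named tank; piping the output to `sh -ex` would run it. If bar is mounted and gets modified between receives, the incremental receives may be refused, so keeping bar read-only (as B was in our setup) is probably the closest match to our environment.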