btrfs-progs: subvolume: use BTRFS_IOC_SUBVOL_SYNC_WAIT for sync #989

Open: wants to merge 15 commits into base: devel

SidongYang

This patch uses the BTRFS_IOC_SUBVOL_SYNC_WAIT ioctl in the subvolume sync command before checking periodically, and adds an option that skips the sync wait ioctl and forces periodic checking. The patch calls a new function, wait_for_subvolume_sync(), which issues BTRFS_IOC_SUBVOL_SYNC_WAIT for each subvolume.

Issue: #953

@kdave kdave force-pushed the devel branch 5 times, most recently from 27c7abb to 0e51d55 Compare May 30, 2025 11:36
This new option allows end users to specify certain per-inode flags for
a given file or directory inside rootdir.

And mkfs will follow the kernel behavior by inheriting the inode flag
from the parent.

For example:

 rootdir
 |- file1
 |- file2
 |- dir1/
 |  |- file3
 |- subv/     << will be created as a subvolume using --subvol option
    |- dir2/
    |  |- file4
    |- file5

When running `mkfs.btrfs --rootdir rootdir --subvol subv --inode-flags
nodatacow:dir1 --inode-flags nodatacow:subv`, the following files
and directories will have the *nodatacow* flag set:

- dir1
- file3
- subv
- dir2
- file4
- file5

For now only two flags are supported:

- nodatacow
  Disable data COW, implies *nodatasum* for regular files

- nodatasum
  Disable data checksum only.

This also works with --compress option, and files with nodatasum or
nodatacow flag will skip compression.
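
The inheritance rule above can be sketched as follows. This is only an illustration, not the mkfs.btrfs implementation: the `SKETCH_*` flag bits, `struct flag_rule`, `path_is_under()` and `effective_flags()` are all hypothetical names, and each rule is assumed to hold one `<flag>:<path>` pair with the path relative to rootdir.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for this sketch; not the btrfs on-disk values. */
#define SKETCH_NODATACOW (1u << 0)
#define SKETCH_NODATASUM (1u << 1)

/* One "<flag>:<path>" rule, with the path relative to rootdir. */
struct flag_rule {
	unsigned int flags;
	const char *path;
};

/* True if @path is @prefix itself or lies underneath it. */
static bool path_is_under(const char *path, const char *prefix)
{
	size_t len = strlen(prefix);

	if (strncmp(path, prefix, len) != 0)
		return false;
	return path[len] == '\0' || path[len] == '/';
}

/*
 * Collect the flags of every rule whose path covers @path, so children
 * inherit from a flagged parent. For regular files, nodatacow also
 * implies nodatasum.
 */
static unsigned int effective_flags(const struct flag_rule *rules, size_t nr,
				    const char *path, bool is_regular_file)
{
	unsigned int flags = 0;

	for (size_t i = 0; i < nr; i++) {
		if (path_is_under(path, rules[i].path))
			flags |= rules[i].flags;
	}
	if (is_regular_file && (flags & SKETCH_NODATACOW))
		flags |= SKETCH_NODATASUM;
	return flags;
}
```

With rules for `dir1` and `subv`, a lookup of `subv/dir2/file4` picks up nodatacow from `subv`, while `file1` gets no flags. The component boundary check in `path_is_under()` keeps a sibling such as `dir10` from matching `dir1`.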

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
The simple test will create a layout like the following:

rootdir
|- file1
|- file2
|- subv/		<< Regular subvolume
|  |- file3
|- nocow_subv/		<< NODATACOW subvolume
|  |- file4
|- nocow_dir/		<< NODATACOW directory
|  |- dir2
|  |  |- file5
|  |- file6
|- nocow_file1		<< NODATACOW file

Any files under NODATACOW subvolume/directory should also be NODATACOW.
The explicitly specified single file should also be NODATACOW.

Issue: kdave#984
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
morbidrsa and others added 2 commits May 30, 2025 16:20
Create a second data block-group to be used for relocation, in case a
zoned filesystem is created.

This second data block-group will then be picked up by the kernel as the
default data relocation block-group on mount.

This ensures we always have a target to relocate good data to when we
need to do garbage collection.

Signed-off-by: Johannes Thumshirn <[email protected]>
Signed-off-by: David Sterba <[email protected]>
If in btrfs_check_super() we find that the superblock has a csum
mismatch, print the wanted and found values, just as we do for metadata
in __csum_tree_block_size().

When hex-editing a btrfs image, it's useful to use btrfs check to
calculate what the new csum should be. Unfortunately at present this
only works for trees and not for the superblock, meaning you have to use
the much more wordy `btrfs inspect-internal`.
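
A minimal sketch of reporting both checksum values on a mismatch, assuming nothing about the real btrfs_check_super() code. The names `SKETCH_CSUM_MAX`, `csum_to_hex()` and `check_super_csum()` are hypothetical, and the message wording is illustrative only. Which of the two values is the on-disk one is left to the caller here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed upper bound on checksum length for this sketch. */
#define SKETCH_CSUM_MAX 32

/* Render @len checksum bytes as lowercase hex; @out needs 2 * len + 1 bytes. */
static void csum_to_hex(const uint8_t *csum, size_t len, char *out)
{
	for (size_t i = 0; i < len; i++)
		snprintf(out + 2 * i, 3, "%02x", (unsigned int)csum[i]);
	out[2 * len] = '\0';
}

/*
 * Return 0 if the checksums match. Return -1 on a mismatch or an
 * oversized length; on a mismatch, both values are formatted into @msg.
 * @msg_size must be at least 1.
 */
static int check_super_csum(const uint8_t *wanted, const uint8_t *found,
			    size_t len, char *msg, size_t msg_size)
{
	char wanted_hex[2 * SKETCH_CSUM_MAX + 1];
	char found_hex[2 * SKETCH_CSUM_MAX + 1];

	msg[0] = '\0';
	if (len > SKETCH_CSUM_MAX)
		return -1;
	if (memcmp(wanted, found, len) == 0)
		return 0;

	csum_to_hex(wanted, len, wanted_hex);
	csum_to_hex(found, len, found_hex);
	snprintf(msg, msg_size,
		 "superblock csum mismatch: wanted 0x%s found 0x%s",
		 wanted_hex, found_hex);
	return -1;
}
```

Printing both values lets someone editing an image copy the expected checksum straight from the error message.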

Pull-request: kdave#985
Signed-off-by: Mark Harmstone <[email protected]>
Signed-off-by: David Sterba <[email protected]>
@kdave
Owner

kdave commented May 30, 2025

Thanks, this is still missing some usability bits. The new ioctl should be preferred if it exists; there's a mode to check it. Either it works or it's -ENOTTY. The force argument is misleading: both ways work, but with different capabilities. This should be explained in the help text and documentation.
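
One possible reading of this suggestion, sketched with hypothetical names rather than the code in the patch: try the ioctl first and fall back to periodic checking only when it reports -ENOTTY or the caller asked for periodic mode. `struct sync_ops`, `wait_for_subvol()` and the `stub_*` helpers are assumptions made for illustration; `wait_ioctl` stands in for a BTRFS_IOC_SUBVOL_SYNC_WAIT call so the sketch does not depend on kernel headers.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical callbacks: wait_ioctl() stands in for the ioctl call and
 * returns 0 or a negative errno; wait_periodic() stands in for the
 * polling loop.
 */
struct sync_ops {
	int (*wait_ioctl)(int fd, uint64_t subvolid);
	int (*wait_periodic)(int fd, uint64_t subvolid);
};

/*
 * Prefer the ioctl. Fall back to periodic checking only when the kernel
 * reports -ENOTTY (ioctl not supported) or @periodic is set.
 */
static int wait_for_subvol(const struct sync_ops *ops, int fd,
			   uint64_t subvolid, int periodic)
{
	if (!periodic) {
		int ret = ops->wait_ioctl(fd, subvolid);

		if (ret != -ENOTTY)
			return ret;
	}
	return ops->wait_periodic(fd, subvolid);
}

/* Stand-ins used only to demonstrate the control flow. */
static int periodic_calls;

static int stub_ioctl_ok(int fd, uint64_t subvolid)
{
	(void)fd;
	(void)subvolid;
	return 0;
}

static int stub_ioctl_unsupported(int fd, uint64_t subvolid)
{
	(void)fd;
	(void)subvolid;
	return -ENOTTY;
}

static int stub_periodic(int fd, uint64_t subvolid)
{
	(void)fd;
	(void)subvolid;
	periodic_calls++;
	return 0;
}
```

Any other error from the ioctl is returned as is, so a real failure is not hidden by the fallback path.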

ozraru and others added 11 commits May 30, 2025 20:40
This feature is provided by kernel commit fc5c0c58258748 ("btrfs:
defrag: extend ioctl to accept compression levels"), which is included
in 6.15 but not in 6.14.

[skip ci]

Pull-request: kdave#983
Signed-off-by: David Sterba <[email protected]>
Block group tree requires the no-holes and free-space-tree features; add
the same check that mkfs does.
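
The dependency check can be sketched as a bitmask test. The `SKETCH_FEAT_*` bits and `features_valid()` are hypothetical and do not match the real btrfs feature flag values or the function used in btrfs-progs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for this sketch only. */
#define SKETCH_FEAT_NO_HOLES         (1ull << 0)
#define SKETCH_FEAT_FREE_SPACE_TREE  (1ull << 1)
#define SKETCH_FEAT_BLOCK_GROUP_TREE (1ull << 2)

/* Block group tree is only valid together with both prerequisites. */
static bool features_valid(uint64_t features)
{
	const uint64_t bgt_deps = SKETCH_FEAT_NO_HOLES |
				  SKETCH_FEAT_FREE_SPACE_TREE;

	if (!(features & SKETCH_FEAT_BLOCK_GROUP_TREE))
		return true;
	return (features & bgt_deps) == bgt_deps;
}
```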

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
The bytenr sequence of all roots is controlled by our code, so if
something went wrong with the sequence, it's a bug.

A UASSERT() is more suitable for this case.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
The function requires parameters @slot and @itemoff to record where the
next item should land.

But this is overkill: after inserting an item, the temporary extent
buffer will have its header nritems and the item pointer updated.  We
can use that header nritems and item pointer to get where the next item
should land.

This removes the external counter to record @slot and @itemoff.
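
The idea can be sketched with a simplified leaf model. This is an assumption-laden illustration, not the btrfs-convert code: `struct sketch_leaf`, `next_slot()`, `next_data_end()` and `insert_item()` are hypothetical, and the model ignores the space that item headers take in a real leaf. Item data is assumed to grow from the end of the data area toward the front.

```c
#include <stdint.h>

#define SKETCH_LEAF_DATA_SIZE 1024u
#define SKETCH_MAX_ITEMS 16u

/* Simplified item pointer: where the item's data starts and how big it is. */
struct sketch_item {
	uint32_t offset;
	uint32_t size;
};

/* Simplified leaf: a header count plus the item pointer array. */
struct sketch_leaf {
	uint32_t nritems;
	struct sketch_item items[SKETCH_MAX_ITEMS];
};

/* The next free slot is simply the current item count. */
static uint32_t next_slot(const struct sketch_leaf *leaf)
{
	return leaf->nritems;
}

/* New data ends where the last item's data begins (or at the leaf end). */
static uint32_t next_data_end(const struct sketch_leaf *leaf)
{
	if (leaf->nritems == 0)
		return SKETCH_LEAF_DATA_SIZE;
	return leaf->items[leaf->nritems - 1].offset;
}

/* Insert an item of @size bytes; no external slot/itemoff counter needed. */
static int insert_item(struct sketch_leaf *leaf, uint32_t size)
{
	uint32_t slot = next_slot(leaf);
	uint32_t end = next_data_end(leaf);

	if (slot >= SKETCH_MAX_ITEMS || size > end)
		return -1;
	leaf->items[slot].offset = end - size;
	leaf->items[slot].size = size;
	leaf->nritems++;
	return 0;
}
```

Because both values are derived from the leaf itself, callers cannot pass a slot or offset that disagrees with what is already in the buffer.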

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
…_chunk_item()

These functions require parameters @slot and @itemoff to record where the
next item should land.

But this is overkill: after inserting an item, the temporary extent
buffer will have its header nritems and the item pointer updated.

We can use that header nritems and item pointer to get where the next
item should land.

This removes the external counter to record @slot and @itemoff.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
This function requires parameters @slot and @itemoff to record where the
next item should land.

But this is overkill: after inserting an item, the temporary extent
buffer will have its header nritems and the item pointer updated.

We can use that header nritems and item pointer to get where the next
item should land.

This removes the external counter to record @slot and @itemoff.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
…emp_block_group()

These functions require parameters @slot and @itemoff to record where the
next item should land.

But this is overkill: after inserting an item, the temporary extent
buffer will have its header nritems and the item pointer updated.

We can use that header nritems and item pointer to get where the next
item should land.

This removes the external counter to record @slot and @itemoff.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
…tree()

Both fs and csum trees are empty at make_convert_btrfs(), no need to use
two different functions to do that.

Merge them into a common setup_temp_empty_tree() instead.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Previously btrfs-convert bgt support did not work at all, for the
following reasons:

- We never update the super block with extra compat ro flags
  Even if we set "-O bgt" flags, it will not set the compat ro flags,
  and everything just goes through the non-bgt routine.

  Meanwhile the other compat ro flags are for free-space-tree, and
  free-space-tree is rebuilt after the full convert is done.
  Thus this bug causes no problem for fst features and so far only
  affects bgt.

- No extra handling to create block group tree

Fix the above problems by:

- Set the proper compat RO flag for the temporary super block
  We should set all compat RO flags except the two FST related
  bits.  As FST is handled after conversion, we should not set those
  flags at that point.

- Add block group tree root item and its backrefs
  So the initial temporary fs will have a proper block group tree.

  The only tricky part is for the extent tree population, where we have
  to put all block group items into the block group tree rather than the
  extent tree.

With these two points addressed, now block group tree can be properly
enabled for btrfs-convert.
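
The flag handling for the temporary super block can be sketched as a mask operation. The `SKETCH_RO_*` names, their bit values and `temp_super_compat_ro()` are assumptions for illustration; the real definitions live in the btrfs headers.

```c
#include <stdint.h>

/* Hypothetical compat RO bits for this sketch only. */
#define SKETCH_RO_FREE_SPACE_TREE       (1ull << 0)
#define SKETCH_RO_FREE_SPACE_TREE_VALID (1ull << 1)
#define SKETCH_RO_BLOCK_GROUP_TREE      (1ull << 3)

/*
 * Keep every requested compat RO flag except the two FST related bits,
 * which are assumed to be set later, once the free space tree is rebuilt.
 */
static uint64_t temp_super_compat_ro(uint64_t requested)
{
	const uint64_t fst_bits = SKETCH_RO_FREE_SPACE_TREE |
				  SKETCH_RO_FREE_SPACE_TREE_VALID;

	return requested & ~fst_bits;
}
```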

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Previously "btrfs-convert -O bgt" would not cause any error, but the
resulting fs has no block-group-tree feature at all, making it no
different than "btrfs-convert -O ^bgt".

This is a big bug that was never caught by our existing convert runs.
001-ext2-basic and 003-ext4-basic both test the bgt feature, but neither
checks whether the resulting fs actually has the bgt flag set.

To fix that, add a new test case, which will do the regular bgt convert,
but at the end also do a super block dump and verify the
BLOCK_GROUP_TREE flag is properly set.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
This patch uses the BTRFS_IOC_SUBVOL_SYNC_WAIT ioctl in the subvolume
sync command before checking periodically, and adds an option that
skips the sync wait ioctl and forces periodic checking. The patch
calls a new function, wait_for_subvolume_sync(), which issues
BTRFS_IOC_SUBVOL_SYNC_WAIT for each subvolume.

Issue: kdave#953
Signed-off-by: Sidong Yang <[email protected]>
@SidongYang SidongYang force-pushed the subvol-sync-with-ioctl branch from 5f72fd7 to 4f3c256 Compare May 31, 2025 15:48
@SidongYang
Author

Hi, @kdave!
Thanks for the comment. I've fixed this.

  • print error message for -ENOTTY
  • force_to_check -> periodic
  • change usage message for this patch

7 participants