Measure free space fragmentation #50
Comments
Hey! Yes, there is an example script that I wrote for this: https://github.com/knorrie/python-btrfs/blob/master/examples/show_free_space_fragmentation.py
There's also an old branch in the repo that still contains an adjusted version of it. That version creates a picture of each block group using btrfs-heatmap and then calls btrfs balance to clean them up, one at a time: https://github.com/knorrie/python-btrfs/blob/meuk/balance_fragmented/examples/balance_fragmented.py
It should still 'just' work, but if you want to use it, I'd recommend copying it out of that old branch and running it with the current version of the python-btrfs lib.
Here's an example of the collection of pictures it produced, from back when I was running this: https://syrinx.knorrie.org/~knorrie/btrfs/fragmented/
The second field in the filename (e.g. 000291) is the score for how bad the free space fragmentation is. Browsing those should give an idea of which score roughly corresponds to which layout inside the block group. Have fun!
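To give a feel for what such a measurement involves, here is a minimal sketch in plain Python. It does not use python-btrfs and it is not the scoring formula of the linked script. The names `free_gaps` and `fragmentation_score`, and the score itself (one minus the share of free space held by the largest gap), are assumptions made up for illustration.

```python
def free_gaps(bg_start, bg_length, extents):
    """Return (start, length) for every free gap inside one block group.

    extents: iterable of (start, length) pairs for the allocated extents,
    all assumed to lie inside [bg_start, bg_start + bg_length).
    """
    gaps = []
    cursor = bg_start
    for start, length in sorted(extents):
        if start > cursor:
            # Unused space between the previous extent and this one.
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    bg_end = bg_start + bg_length
    if cursor < bg_end:
        # Unused tail of the block group.
        gaps.append((cursor, bg_end - cursor))
    return gaps


def fragmentation_score(gaps):
    """Hypothetical score: 0.0 if all free space is one gap, closer to 1.0
    the more the free space is split over many small gaps."""
    total = sum(length for _, length in gaps)
    if total == 0:
        return 0.0
    largest = max(length for _, length in gaps)
    return 1.0 - largest / total


MiB = 1024 * 1024
# Made-up layout of a 1 GiB block group with three allocated extents.
extents = [(0, 256 * MiB), (300 * MiB, 100 * MiB), (900 * MiB, 24 * MiB)]
gaps = free_gaps(0, 1024 * MiB, extents)
score = fragmentation_score(gaps)
print(gaps, score)
```

In a real tool the extent list would come from the filesystem's extent tree for each block group; here it is hard-coded so the gap walk itself is easy to follow.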
Also, note that free space fragmentation can affect HDD performance very differently from actual file fragmentation. Using btrfs balance compacts data storage by tetrissing data extents into free space that is already available. This is very different from actually defragmenting individual files, which increases the chance that the contents of a single file are stored close together. IOW, btrfs balance does not defragment / optimize file content, it just shovels existing small or large file extents around to some other random place on disk.
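The difference can be seen in a toy model. This is only a sketch under loud assumptions: block groups are short lists of slots, each letter names the file an extent belongs to, and the hypothetical `relocate` function simply drops each moved extent into the first free slot it finds. It is not how btrfs chooses target locations.

```python
def relocate(block_groups, victim):
    """Toy stand-in for balancing one block group: move every extent out of
    block group `victim` into the first free slots of the other block groups."""
    moving = [e for e in block_groups[victim] if e is not None]
    block_groups[victim] = [None] * len(block_groups[victim])
    for extent in moving:
        for i, bg in enumerate(block_groups):
            if i != victim and None in bg:
                bg[bg.index(None)] = extent
                break
        else:
            raise RuntimeError("no free slot left for relocation")
    return block_groups


# Each letter names the file an extent belongs to; None is free space.
bgs = [
    ["A", "B", None, "C"],
    ["B", None, "A", None],
    [None, "A", None, None],
]
relocate(bgs, victim=2)
for bg in bgs:
    print(bg)
```

After the move, the third block group is completely empty and could be dropped, so free space is less scattered. File A, however, still consists of three extents that are not next to each other.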
As discussed in the other thread, I confuse the terms "data chunks", "block groups", "blocks" and "extents". These concepts should be centrally described in the man-pages or some wiki.
So for me, when the man-page talks about block groups or chunks, it's almost the same thing. Which still makes me ask: what is a block then? Is it the same as a dev_extent?
So dev_extents are different from extents: dev_extents are physically contiguous on a device, while extents are logically contiguous within block groups? And:
This is what you metaphorically call tetrissing. It compacts block groups, filling the existing ones and dropping the empty ones. As for free space, there is the free area inside block groups that are already allocated, and the real free space that is not occupied by any data or metadata block group. About your
Block groups map logical addresses to chunks, chunks map those to devices and dev_extents. Block groups contain extents, extents contain blocks.
This depends on the block group. dev_extents are always physically contiguous and exist only on one device. Chunks point to (device, dev_extent) pairs, and combine them with a raid profile to specify the translation from logical to physical addresses. The raid profile defines how many dev_extents there are and how the chunk uses them. e.g. Single profile: chunk contains one dev_extent from one device.
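As a rough illustration of the single-profile case described above, the following sketch translates a logical address to a (device, physical offset) pair. The `DevExtent` and `Chunk` classes and the `logical_to_physical` function are hypothetical names invented here; they are neither python-btrfs objects nor the on-disk structures, and other raid profiles are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class DevExtent:
    devid: int    # device the extent lives on
    offset: int   # physical start offset on that device
    length: int   # physically contiguous length


@dataclass
class Chunk:
    logical: int  # start of the logical address range
    length: int   # size of the logical address range
    profile: str  # only "single" is handled in this sketch
    stripes: list # list of DevExtent; exactly one for "single"


def logical_to_physical(chunk, logical):
    """Translate a logical address inside `chunk` to (devid, physical offset)."""
    if not chunk.logical <= logical < chunk.logical + chunk.length:
        raise ValueError("address outside this chunk")
    if chunk.profile != "single":
        raise NotImplementedError("sketch covers the single profile only")
    dev_extent = chunk.stripes[0]
    return dev_extent.devid, dev_extent.offset + (logical - chunk.logical)


GiB = 1024 ** 3
# Made-up example: a 1 GiB chunk at logical 5 GiB, stored at 3 GiB on device 1.
chunk = Chunk(
    logical=5 * GiB,
    length=GiB,
    profile="single",
    stripes=[DevExtent(devid=1, offset=3 * GiB, length=GiB)],
)
result = logical_to_physical(chunk, 5 * GiB + 4096)
print(result)
```

With a single dev_extent the translation is just an offset shift. Profiles with more dev_extents would need their own rule for picking the device and offset.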
Yes.
dev_extents are typically large, most are only one size (1G) while extents are smaller and variable size (4K to 128M). dev_extent fragmentation is possible, especially when frequently resizing a striped profile. It is typically many orders of magnitude less than extent fragmentation because the free spaces between extents are much smaller and have many different sizes.
Hi,
from Zygo/bees#298 (comment) I learned that btrfs-balance-least-used can help to reduce free space fragmentation, which reduces btrfs performance on HDDs a lot. Is there any tool to measure free space fragmentation?