sdruzkin
Contributor

Summary:
Today FBW encoding uses byte-aligned bit packing, meaning that the bit width is rounded up to the next byte boundary (6->8, 9->16). We do this because we noticed that compression does not work well on optimally bit-packed data.

This has two drawbacks:

  1. Byte-aligned uncompressed data can require from about 1.6% (63->64 bits) to 700% (1->8 bits, i.e. 8x the size) more memory.
  2. Compressed byte-aligned data can be larger than uncompressed optimally packed data.
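
To make the first point concrete, here is a minimal sketch of the overhead arithmetic. It assumes the width is rounded up to the next multiple of 8, which matches the 6->8 and 9->16 examples; the helper names are hypothetical and not taken from the codebase. If the real implementation rounds to 8/16/32/64 only, some widths would cost more than shown here.

```python
def byte_aligned_width(bit_width: int) -> int:
    """Round a bit width up to the next multiple of 8 (assumed rounding rule)."""
    if bit_width < 1:
        raise ValueError("bit_width must be positive")
    return (bit_width + 7) // 8 * 8


def overhead_percent(bit_width: int) -> float:
    """Extra space used by byte-aligned packing relative to optimal bit packing."""
    aligned = byte_aligned_width(bit_width)
    return (aligned - bit_width) / bit_width * 100.0


if __name__ == "__main__":
    for width in (1, 6, 9, 63):
        print(width, byte_aligned_width(width), f"{overhead_percent(width):.1f}%")
```

A 1-bit value stored in 8 bits occupies 8x the space, so the overhead is largest for the narrowest widths.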

To partially mitigate the issue, this change introduces a fallback to uncompressed bit-packed data if the ratio between the sizes of the byte-aligned compressed data and the uncompressed bit-packed data is below a certain threshold. The threshold is currently hardcoded at 0.99.
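
The fallback decision could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual writer code: `zlib` stands in for whatever compressor the writer uses, `pack_bits` and `choose_encoding` are hypothetical names, and the comparison direction (bit-packed size divided by compressed size) is one reading of the description above.

```python
import zlib

FALLBACK_RATIO = 0.99  # threshold value quoted in the description


def pack_bits(values, bit_width):
    """Pack non-negative ints using exactly bit_width bits each (little-endian)."""
    acc = 0
    for i, value in enumerate(values):
        if value < 0 or value >> bit_width:
            raise ValueError(f"{value} does not fit in {bit_width} bits")
        acc |= value << (i * bit_width)
    total_bits = len(values) * bit_width
    return acc.to_bytes((total_bits + 7) // 8, "little")


def choose_encoding(values, bit_width):
    """Return (kind, payload) for the smaller of the two candidate layouts."""
    aligned_width = (bit_width + 7) // 8 * 8  # assumed rounding rule
    compressed = zlib.compress(pack_bits(values, aligned_width))
    bit_packed = pack_bits(values, bit_width)  # the extra repacking cost
    # Assumption: fall back when bit-packed size / compressed size < 0.99.
    if len(bit_packed) < FALLBACK_RATIO * len(compressed):
        return "bit_packed_uncompressed", bit_packed
    return "byte_aligned_compressed", compressed
```

With a threshold just under 1.0, the writer only pays the read-side cost of bit-packed data when it saves at least about 1% in size.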

The downside of this change is that the writer has to spend extra CPU cycles repacking the data with bit packing. To save CPU cycles, I also decided not to attempt compressing the bit-packed data, but nothing stops us from making that second attempt: I'm pretty sure that at certain bit/byte packed size ratios a compression attempt could succeed.

Differential Revision: D82491030

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Sep 15, 2025
@facebook-github-bot
Contributor

@sdruzkin has exported this pull request. If you are a Meta employee, you can view the originating diff in D82491030.

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported meta-exported
