-
Notifications
You must be signed in to change notification settings - Fork 158
[18053251438] Parallelise bool unpacking for arrow write #2795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: mem-block-refactor
Are you sure you want to change the base?
Conversation
596235c to
a98c46f
Compare
7822686 to
21acd7d
Compare
a98c46f to
1c974c7
Compare
21acd7d to
1753b0d
Compare
alexowens90
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you done any benchmarking to see what the performance improvement is compared to the previous implementation?
9d9b0bc to
0662547
Compare
cpp/arcticdb/column_store/block.cpp
Outdated
|
|
||
| // ExternalPackedMemBlock implementation | ||
| ExternalPackedMemBlock::ExternalPackedMemBlock( | ||
| const uint8_t* data, size_t size, size_t shift, size_t offset, entity::timestamp ts, bool owning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does size mean number of bits? I was left with the impression that all block constructors take in the number of bytes and this confused me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On ExternalMemBlocks the size is the logical_size. Renamed in previous PR to reflect that.
| util::check( | ||
| block->get_type() == MemBlockType::EXTERNAL_PACKED, | ||
| "Expected to see a packed external block but got: {}", | ||
| static_cast<int8_t>(block->get_type()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't mean much to people reading the error. Can you add fmt override for the enum values
| "Expected to see a packed external block but got: {}", | ||
| static_cast<int8_t>(block->get_type()) | ||
| ); | ||
| const auto packed_block = dynamic_cast<ExternalPackedMemBlock*>(block); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've already established that it's ExternalPackedMemBlock, static_casting will be safe if performance is crucial. If you insist on using dynamic_cast you can then then check if the result is !nullptr to avoid the virtual call to block->get_type() on the happy path. If it's not you'd still want to call block->get_type() but then we're throwing so the virtual call will be dwarfed by the cost of the exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to static cast
| static_cast<int8_t>(block->get_type()) | ||
| ); | ||
| const auto packed_block = dynamic_cast<ExternalPackedMemBlock*>(block); | ||
| auto num_bits = std::min(packed_block->logical_bytes() - offset_in_block, bytes - pos_in_res); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be multiplied by 8 to be number of bits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed in previous PR to logical_size to reflect that this is indeed the number of bits.
a739aec to
d3efb32
Compare
949528b to
b5978cb
Compare
d3efb32 to
1aa9c4f
Compare
b5978cb to
650e430
Compare
1aa9c4f to
5ebe971
Compare
After the MemBlock refactor we can now store packed bits inside the ChunkedBuffer in `arrow_data_to_segment`. We later unpack the bools inside `WriteToSegmentTask` which runs in parallel for all segment slices.
650e430 to
7283525
Compare
Reference Issues/PRs
Monday ref: 18053251438
What does this implement or fix?
After the MemBlock refactor we can now store packed bits inside the ChunkedBuffer in
arrow_data_to_segment.We later unpack the bools inside
WriteToSegmentTaskwhich runs in parallel for all segment slices.Any other comments?
Timings to write a 10million rows x 10 bool columns
Note that there was still big variance of +- 20ms on my machine but the above is averaged over 200 runs
Dummy benchmar script because of unreliable asv benchmarks:
Checklist
Checklist for code changes...