Minor: reserve space for output views in ByteViewArrayDecoderDictionary #7338

Draft: wants to merge 2 commits into base: main
12 changes: 9 additions & 3 deletions parquet/src/arrow/array_reader/byte_view_array.rs
@@ -432,11 +432,13 @@ impl ByteViewArrayDecoderDictionary {
         }
     }

-    /// Reads the next indexes from self.decoder
-    /// the indexes are assumed to be indexes into `dict`
+    /// Reads the next `len` indexes from self.decoder
+    ///
+    /// The indexes are assumed to be indexes into `dict`
     /// the output values are written to output
     ///
-    /// Assumptions / Optimization
+    /// # Assumptions / Optimization
+    ///
     /// This function checks if dict.buffers() are the last buffers in `output`, and if so
     /// reuses the dictionary page buffers directly without copying data
     fn read(&mut self, output: &mut ViewBuffer, dict: &ViewBuffer, len: usize) -> Result<usize> {
@@ -458,6 +460,10 @@ impl ByteViewArrayDecoderDictionary {
             }
         };

+        // we are going to append `len` views to the output buffer so reserve
+        // the space for them to avoid reallocations
+        output.reserve_views(len);
+
         if need_to_create_new_buffer {
             for b in dict.buffers.iter() {
                 output.buffers.push(b.clone());
5 changes: 5 additions & 0 deletions parquet/src/arrow/buffer/view_buffer.rs
@@ -37,6 +37,11 @@ impl ViewBuffer {
         self.views.is_empty()
     }

+    /// Reserve capacity for `additional` views
+    pub fn reserve_views(&mut self, additional: usize) {
+        self.views.reserve(additional);
+    }
+
     pub fn append_block(&mut self, block: Buffer) -> u32 {
         let block_id = self.buffers.len() as u32;
         self.buffers.push(block);
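
For reviewers, a minimal sketch of why reserving up front should help. It assumes `views` behaves like a plain `Vec`, and it uses a simplified, hypothetical stand-in for `ViewBuffer` (a single `Vec<u128>` field). The `append_views` helper is illustrative only and is not part of this PR.

```rust
/// Simplified, hypothetical stand-in for the parquet crate's `ViewBuffer`;
/// only the `views` vector matters for this sketch.
#[derive(Default)]
struct ViewBuffer {
    views: Vec<u128>,
}

impl ViewBuffer {
    /// Reserve capacity for `additional` views (mirrors the method added in this PR).
    fn reserve_views(&mut self, additional: usize) {
        self.views.reserve(additional);
    }
}

/// Illustrative helper: appends `len` placeholder views and returns how many
/// times the vector's capacity changed (i.e. it reallocated) while pushing.
fn append_views(output: &mut ViewBuffer, len: usize, reserve_first: bool) -> usize {
    if reserve_first {
        output.reserve_views(len);
    }
    let mut capacity_changes = 0;
    let mut last_capacity = output.views.capacity();
    for i in 0..len {
        output.views.push(i as u128);
        if output.views.capacity() != last_capacity {
            capacity_changes += 1;
            last_capacity = output.views.capacity();
        }
    }
    capacity_changes
}

fn main() {
    let mut reserved = ViewBuffer::default();
    let mut unreserved = ViewBuffer::default();
    let with = append_views(&mut reserved, 1024, true);
    let without = append_views(&mut unreserved, 1024, false);
    println!("capacity changes with reserve: {with}, without: {without}");
}
```

`Vec::reserve` guarantees room for at least `additional` more elements, so the reserved path never grows inside the loop, while the unreserved path grows at least once when starting from an empty vector. Whether this is measurable in the real decoder depends on how often `read` is called per batch, which this sketch does not model.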