Skip to content

Extract arrowarray and arrowschema creation#669

Draft
Alex-PLACET wants to merge 7 commits intoman-group:mainfrom
Alex-PLACET:extract_arrowarray_and_arrowschema_creation
Draft

Extract arrowarray and arrowschema creation#669
Alex-PLACET wants to merge 7 commits intoman-group:mainfrom
Alex-PLACET:extract_arrowarray_and_arrowschema_creation

Conversation

@Alex-PLACET
Copy link
Copy Markdown
Collaborator

No description provided.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 99.64912% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.61%. Comparing base (9edcc99) to head (01b3a3b).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...lude/sparrow/layout/arrow_schema_array_factory.hpp 99.14% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #669      +/-   ##
==========================================
+ Coverage   88.53%   88.61%   +0.07%     
==========================================
  Files         115      117       +2     
  Lines        9545     9651     +106     
==========================================
+ Hits         8451     8552     +101     
- Misses       1094     1099       +5     
Flag Coverage Δ
unittests 88.61% <99.64%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR centralizes Arrow C Data Interface (ArrowArray / ArrowSchema) construction logic into a shared factory to reduce duplication across array implementations.

Changes:

  • Introduces include/sparrow/layout/arrow_schema_array_factory.hpp and src/layout/arrow_schema_array_factory.cpp to build common ArrowSchema/ArrowArray instances for multiple layouts.
  • Refactors several array types (primitive, list, map, union, timestamp, decimal, dictionary-encoded, variable-size binary/view, run-end encoded, null) to use the new factory helpers.
  • Wires the new header/source into the build via CMakeLists.txt.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
include/sparrow/layout/arrow_schema_array_factory.hpp New shared factory helpers for Arrow schema/array creation (contains several correctness/contract issues for empty children).
src/layout/arrow_schema_array_factory.cpp Implements the non-template factory functions declared in the new header.
include/sparrow/layout/primitive_array_impl.hpp Uses factory helpers for primitive Arrow schema/array creation.
include/sparrow/list_array.hpp Migrates list array schema/array creation to factory helpers.
include/sparrow/map_array.hpp Migrates map array schema/array creation to factory helpers and passes keys_sorted explicitly.
include/sparrow/struct_array.hpp Migrates struct schema/array creation to factory helpers; changes sizing logic.
include/sparrow/union_array.hpp Migrates union schema/array creation to factory helpers.
include/sparrow/run_end_encoded_array.hpp Migrates run-end encoded schema/array creation to factory helpers.
include/sparrow/null_array.hpp Migrates null schema/array creation to factory helpers.
include/sparrow/timestamp_array.hpp Migrates timestamp schema/array creation to factory helpers.
include/sparrow/decimal_array.hpp Migrates decimal schema/array creation to factory helpers.
include/sparrow/dictionary_encoded_array.hpp Migrates dictionary-encoded schema/array creation to factory helpers.
include/sparrow/variable_size_binary_array.hpp Migrates variable-size binary schema/array creation to factory helpers.
include/sparrow/variable_size_binary_view_array.hpp Migrates binary-view schema/array creation to factory helpers and introduces create_proxy_impl.
include/sparrow/fixed_width_binary_array.hpp Migrates fixed-width binary schema/array creation to factory helpers.
CMakeLists.txt Adds the new header and source file to the build.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +291 to +307
const std::size_t n = child_arrays.size();
ArrowArray** ptrs = new ArrowArray*[n];
for (std::size_t i = 0; i < n; ++i)
{
ptrs[i] = new ArrowArray(std::move(child_arrays[i]));
}
std::vector<buffer<uint8_t>> buffers;
buffers.reserve(1);
buffers.emplace_back(std::move(bitmap_buf));
return make_arrow_array(
size,
null_count,
0,
std::move(buffers),
ptrs,
repeat_view<bool>(true, n),
nullptr,
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_struct_arrow_array allocates ptrs = new ArrowArray*[n] even when n == 0 and passes it to make_arrow_array. make_arrow_array asserts (children_ownership.size() == 0) == (children == nullptr), so this will fail (and leak) for empty structs. Avoid allocating and pass nullptr for children when there are zero children.

Copilot uses AI. Check for mistakes.
@Alex-PLACET Alex-PLACET requested a review from Hind-M April 1, 2026 09:49
@Alex-PLACET Alex-PLACET force-pushed the extract_arrowarray_and_arrowschema_creation branch from f964c7c to 6f70f44 Compare April 1, 2026 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants