pickling support #5011
Codecov Report: ✅ All modified and coverable lines are covered by tests.
vortex-python/Cargo.toml
tokio = { workspace = true, features = ["fs", "rt-multi-thread"] }
url = { workspace = true }
vortex = { workspace = true, features = ["object_store", "python", "tokio"] }
vortex-ipc = { workspace = true }
I think you should expose this via the main vortex crate
You're gonna need a PR description for this one 😅

I don't understand the two screenshots you sent? Also, is there any point implementing P4 at all? We're way beyond 3.8 as a min version requirement.

🤦♂️ right. I initially added 4 as a baseline to see if 5 would make a difference. The screenshots show the benchmark runs comparing 4 vs 5. You are right, we require a minimum of 3.11 for vortex python so v4 would not be used, I will remove it.

@gatesn protocol 5 was introduced in py3.8, but only became the default recently in py3.14, so we still need both
Signed-off-by: Onur Satici <[email protected]>
CodSpeed Performance Report: merging #5011 will not alter performance.


Implements __reduce__ and __reduce_ex__ methods to enable pickling of Vortex arrays in Python. Arrays are serialized using the Vortex IPC format.

For pickle protocol 5+ (Python 3.8+, PEP 574), uses PickleBuffer to keep array buffers separate from the main pickle stream rather than copying them inline. This enables us to use shared memory in the future to potentially zero-copy large arrays even across process boundaries. Protocol 4 and below serialize buffers inline as bytes.

Both protocols share the same deserialization path via decode_ipc_array_buffers, which reconstructs arrays from IPC-encoded buffer lists (or memoryviews).
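
To illustrate the protocol split described above, here is a minimal sketch using only the standard library. It is not the Vortex implementation: the Blob class and the _rebuild function are hypothetical stand-ins for a Vortex array and decode_ipc_array_buffers, and no IPC encoding is performed. It only shows how __reduce_ex__ can hand out PickleBuffer objects for protocol 5 and fall back to inline bytes otherwise, with both paths rebuilding through the same function.

```python
import pickle


def _rebuild(buffers):
    # Hypothetical shared decode path: each item may be bytes, bytearray,
    # a memoryview, or a PickleBuffer, depending on how it was pickled.
    return Blob(b"".join(bytes(memoryview(b)) for b in buffers))


class Blob:
    """Hypothetical stand-in for an array that owns one large buffer."""

    def __init__(self, data):
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Protocol 5: expose the buffer so pickle can keep it out-of-band.
            return (_rebuild, ([pickle.PickleBuffer(self.data)],))
        # Protocol 4 and below: the buffer is copied inline as bytes.
        return (_rebuild, ([bytes(self.data)],))


blob = Blob(b"x" * 1024)

# Out-of-band: buffers are collected separately from the pickle stream.
out_of_band = []
payload = pickle.dumps(blob, protocol=5, buffer_callback=out_of_band.append)
restored = pickle.loads(payload, buffers=out_of_band)

# Inline: the same object round-trips through protocol 4 without a callback.
inline_payload = pickle.dumps(blob, protocol=4)
restored_inline = pickle.loads(inline_payload)
```

With a buffer_callback supplied, the protocol 5 payload holds only a small reference to the rebuild function, and the 1024 bytes travel in the out_of_band list; the caller decides how to transport them (for example, through shared memory). Without a callback, protocol 5 still works but writes the buffer into the stream.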