-
Notifications
You must be signed in to change notification settings - Fork 419
refactor: convert to mysql values directly from arrow #7096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the MySQL writer to convert Arrow arrays directly to MySQL values instead of using intermediate GreptimeDB Value objects, resulting in significant performance improvements for iterating record batches.
Key changes:
- Replaced Value-based iteration with direct Arrow array processing in the MySQL writer
- Added benchmark tests to demonstrate performance improvements
- Updated dependencies to support new direct conversion approach
Reviewed Changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 8 comments.
File | Description |
---|---|
src/servers/src/mysql/writer.rs | Complete refactor of record batch processing to use direct Arrow array access instead of Value objects |
src/servers/Cargo.toml | Added common-decimal dependency for direct decimal handling |
src/common/recordbatch/benches/iter_record_batch_rows.rs | New benchmark file comparing different record batch iteration approaches |
src/common/recordbatch/Cargo.toml | Added criterion benchmark dependency and configuration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
f74b7a3
to
e47fed7
Compare
How will we do with the new Json type, since there is no concrete json type in datafusion? |
Signed-off-by: luofucong <[email protected]>
e47fed7
to
e023788
Compare
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your intention?
so that there are no intermediate
Value
s to occupy memory, and the performance for iterating record batch is significantly better:from
iter_record_batch_rows.rs
:PR Checklist
Please convert it to a draft if some of the following conditions are not met.