feat: tune constants by waynexia · Pull Request #7851 · GreptimeTeam/greptimedb

waynexia · 2026-03-23T23:22:08Z

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

Adjust constants to make them more streamlined

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

I have written the necessary rustdoc comments.
I have added the necessary unit tests and integration tests.
This PR requires documentation updates.
API changes are backward compatible.
Schema or data changes are backward compatible.

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

gemini-code-assist · 2026-03-23T23:22:37Z

Summary of Changes

This pull request focuses on refining several internal constants and mechanisms to enhance the system's performance and resource utilization. The changes include optimizing Parquet file metadata loading, improving the file cache's capacity distribution, and aligning query execution batch sizes with DataFusion defaults for more efficient data processing. These adjustments aim to streamline operations and improve overall system responsiveness.

Highlights

Parquet Metadata Prefetching: Implemented support for metadata_size_hint in the Parquet file reader, allowing for optimized prefetching of metadata and potentially faster file access.
File Cache Capacity Management: Refactored the file cache's capacity allocation logic to ensure a more robust and balanced distribution between Parquet and Puffin caches, respecting overall budget and minimum capacity requirements.
Optimized Parquet Read Batch Size: Increased the default Parquet read batch size to align with DataFusion's default, which helps reduce rebatching and concatenation overhead in the query pipeline.
Dynamic Batch Sizing for Query Plans: Modified Absent and RangeSelect query execution plans to dynamically utilize the session's configured batch size, removing previously hardcoded values and improving flexibility.
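
The capacity split described in the highlights can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `split_capacity`, the percentage parameter, and the even-split fallback are all assumptions made for the example. It only shows one way to guarantee that the two shares never exceed the total budget while still honoring a per-cache minimum when the budget allows it.

```rust
/// Hypothetical helper (not from the PR): splits a total file-cache budget
/// into `(parquet, puffin)` shares.
///
/// Invariant: the two returned shares always sum to exactly `total`, so the
/// caches can never be allocated more than the overall budget.
pub fn split_capacity(total: u64, puffin_percent: u8, min_each: u64) -> (u64, u64) {
    // Treat anything above 100% as 100%.
    let percent = u64::from(puffin_percent.min(100));

    // Ideal Puffin share; computed in u128 so `total * percent` cannot overflow.
    let ideal = (u128::from(total) * u128::from(percent) / 100) as u64;

    // Assumed fallback: if the budget cannot hold both minimums, split it
    // evenly instead of overallocating.
    if min_each > total / 2 {
        let puffin = total / 2;
        return (total - puffin, puffin);
    }

    // Here `total - min_each >= min_each`, so the clamp bounds are valid.
    let puffin = ideal.clamp(min_each, total - min_each);
    (total - puffin, puffin)
}
```

With a budget of 1000, a 20% Puffin share, and a minimum of 100 per cache, this yields 800 for Parquet and 200 for Puffin. With a budget of 150 and the same minimum, both minimums cannot be met, so the sketch falls back to 75 each rather than handing out 200 in total.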

gemini-code-assist

Code Review

This pull request introduces a set of valuable improvements focused on tuning constants and enhancing data processing logic. Key changes include adjusting Parquet read/write constants to improve performance and on-disk format stability, and refactoring several components to use the configurable session batch size, which increases consistency across the system. Additionally, the logic for splitting file cache capacity has been made more robust, fixing a potential overallocation bug. The adoption of a more modern Parquet metadata reader is also a welcome improvement. Overall, these changes are well-implemented and positively contribute to the codebase's quality and configurability.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 930f70b052

src/common/datasource/src/file_format/parquet.rs

src/promql/src/extension_plan/absent.rs

discord9

absent could impl output batch size, rest LGTM

evenyag · 2026-03-26T03:01:14Z

src/query/src/range_select/plan.rs

+        let num_rows = self.output_batch.as_ref().unwrap().num_rows();
+        if num_rows == 0 {
+            self.output_batch_offset = 0;
+            return Ok(self.output_batch.take());
+        }


Is it expected to return an empty record batch in range select?
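
To make the question concrete, here is a small stand-alone sketch of the pattern the hunk appears to follow: a buffered output is handed out in slices of at most `batch_size` rows, with an offset cursor tracking progress. It uses a plain `Vec<i64>` in place of a record batch, and the `ChunkedOutput` type and `next_chunk` method are hypothetical names for this example. Like the hunk, the sketch passes an empty buffer through once; whether that is the desired behavior is exactly what the question asks, and the sketch does not settle it.

```rust
/// Hypothetical stand-in for a stream that emits a buffered result in
/// slices of at most `batch_size` rows.
pub struct ChunkedOutput {
    output_batch: Option<Vec<i64>>,
    output_batch_offset: usize,
    batch_size: usize,
}

impl ChunkedOutput {
    pub fn new(rows: Vec<i64>, batch_size: usize) -> Self {
        Self {
            output_batch: Some(rows),
            output_batch_offset: 0,
            // Guard against a zero batch size, which would never make progress.
            batch_size: batch_size.max(1),
        }
    }

    /// Returns the next slice, or `None` once the buffer is exhausted.
    pub fn next_chunk(&mut self) -> Option<Vec<i64>> {
        let num_rows = self.output_batch.as_ref()?.len();

        // Mirrors the hunk above: an empty buffer is emitted once, as-is.
        if num_rows == 0 {
            self.output_batch_offset = 0;
            return self.output_batch.take();
        }

        let start = self.output_batch_offset;
        let end = (start + self.batch_size).min(num_rows);
        let chunk = self.output_batch.as_ref()?[start..end].to_vec();

        if end == num_rows {
            // Fully consumed: drop the buffer and reset the cursor.
            self.output_batch = None;
            self.output_batch_offset = 0;
        } else {
            self.output_batch_offset = end;
        }
        Some(chunk)
    }
}
```

Five rows with a batch size of 2 come out as slices of 2, 2, and 1 rows. An empty input produces a single empty slice and then `None`; dropping the empty-buffer branch in favor of returning `None` directly would be the alternative if empty batches are not wanted downstream.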

evenyag · 2026-03-26T03:21:15Z

src/promql/src/extension_plan/absent.rs

+                if self.output_timestamps.len() >= self.batch_size {
+                    return Ok(());
+                }


If we return early here, do we need to update the input_timestamp_offset?
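
The concern can be illustrated with a simplified model. This sketch is not the `absent` implementation: `AbsentState`, its fields other than the names visible in the hunk, and the `fill` and `take_output` methods are assumptions for illustration. It shows one way an early return stays safe, namely by keeping both cursors (the next grid timestamp and the input offset) in the struct and advancing them before each check, so a later call resumes where the previous one stopped.

```rust
/// Hypothetical, simplified model of an "absent" scan: for every step on
/// the grid `[next_ts, end]`, emit the timestamp if the input has no sample.
pub struct AbsentState {
    input_timestamps: Vec<i64>, // assumed sorted ascending
    input_timestamp_offset: usize,
    next_ts: i64,
    end: i64,
    step: i64,
    batch_size: usize,
    output_timestamps: Vec<i64>,
}

impl AbsentState {
    pub fn new(input: Vec<i64>, start: i64, end: i64, step: i64, batch_size: usize) -> Self {
        assert!(step > 0, "step must be positive");
        Self {
            input_timestamps: input,
            input_timestamp_offset: 0,
            next_ts: start,
            end,
            step,
            batch_size: batch_size.max(1),
            output_timestamps: Vec::new(),
        }
    }

    /// Fills `output_timestamps` up to `batch_size` entries, then returns.
    pub fn fill(&mut self) {
        while self.next_ts <= self.end {
            // Early return: both cursors already live in `self`, so nothing
            // is lost or replayed when `fill` is called again.
            if self.output_timestamps.len() >= self.batch_size {
                return;
            }
            // Skip input samples that fall before the current grid point.
            while self.input_timestamp_offset < self.input_timestamps.len()
                && self.input_timestamps[self.input_timestamp_offset] < self.next_ts
            {
                self.input_timestamp_offset += 1;
            }
            let present =
                self.input_timestamps.get(self.input_timestamp_offset) == Some(&self.next_ts);
            if !present {
                self.output_timestamps.push(self.next_ts);
            }
            self.next_ts += self.step;
        }
    }

    /// Hands the accumulated output to the caller and clears the buffer.
    pub fn take_output(&mut self) -> Vec<i64> {
        std::mem::take(&mut self.output_timestamps)
    }
}
```

If the input offset were instead a local variable that is only written back at the end of the function, the early return would skip that write-back and the next call would rescan from a stale position, which is the situation the question above is probing.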

evenyag · 2026-03-26T03:28:24Z

@codex review

chatgpt-codex-connector · 2026-03-26T03:34:16Z

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

feat: tune constants

930f70b

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

waynexia requested review from a team, discord9, evenyag and v0y4g3r as code owners March 23, 2026 23:22

github-actions bot added the size/S and docs-not-required labels Mar 23, 2026

gemini-code-assist bot reviewed Mar 23, 2026

chatgpt-codex-connector bot reviewed Mar 23, 2026

src/common/datasource/src/file_format/parquet.rs (resolved)

src/promql/src/extension_plan/absent.rs (outdated, resolved)

discord9 approved these changes Mar 24, 2026

cap output batch size

b1330a8

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

github-actions bot added size/M and removed size/S labels Mar 24, 2026

evenyag reviewed Mar 26, 2026

handle empty input

c867522

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

waynexia wants to merge 3 commits into main from tune-consts