feat: implement process manager and information_schema.process_list #5865


Open
wants to merge 16 commits into main

Conversation

Contributor

@v0y4g3r v0y4g3r commented Apr 9, 2025

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

This PR adds an implementation of `ProcessManager` and the `information_schema.process_list` table, used to track running queries.

This is the first step towards process management. To keep the PR size manageable, no queries are registered with `ProcessManager` yet.
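As a rough illustration only (not this PR's actual code), a query-tracking registry with register/deregister/list operations can be sketched in plain std Rust. The field names mirror the `ProcessValue` struct shown later in the diff; everything else is an assumption:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Illustrative process record; fields follow the `ProcessValue`
/// struct shown in the diff below.
#[derive(Debug, Clone)]
pub struct Process {
    pub id: u64,
    pub database: String,
    pub query: String,
    pub start_timestamp_ms: i64,
}

/// Hypothetical in-memory registry sketch, not the PR's implementation.
#[derive(Default)]
pub struct ProcessManager {
    next_id: AtomicU64,
    processes: Mutex<HashMap<u64, Process>>,
}

impl ProcessManager {
    /// Registers a running query and returns its process id.
    pub fn register_query(&self, database: &str, query: &str, start_timestamp_ms: i64) -> u64 {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let process = Process {
            id,
            database: database.to_string(),
            query: query.to_string(),
            start_timestamp_ms,
        };
        self.processes.lock().unwrap().insert(id, process);
        id
    }

    /// Removes a finished query from the registry.
    pub fn deregister_query(&self, id: u64) {
        self.processes.lock().unwrap().remove(&id);
    }

    /// Lists all currently registered queries.
    pub fn list_all_processes(&self) -> Vec<Process> {
        self.processes.lock().unwrap().values().cloned().collect()
    }
}
```

A `SELECT * FROM information_schema.process_list` would then, conceptually, render the output of `list_all_processes` as table rows.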

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.


coderabbitai bot commented Apr 9, 2025

Review skipped: auto reviews are disabled on this repository.



@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Apr 9, 2025
@v0y4g3r v0y4g3r mentioned this pull request Apr 9, 2025
@github-actions github-actions bot added docs-required This change requires docs update. and removed docs-not-required This change does not impact docs. labels Apr 10, 2025
v0y4g3r added 13 commits April 14, 2025 13:10
 Refactor Process Management in Meta Module

 - Introduced `ProcessManager` for handling process registration and deregistration.
 - Added methods for managing and querying process states, including `register_query`, `deregister_query`, and `list_all_processes`.
 - Removed redundant process management code from the query module.
 - Updated error handling to reflect changes in process management.
 - Enhanced test coverage for process management functionalities.
 **Add Process Management Enhancements**

 - **`manager.rs`**: Introduced `process_manager` to `SystemCatalog` and `KvBackendCatalogManager` for improved process handling.
 - **`information_schema.rs`**: Updated table insertion logic to conditionally include `PROCESS_LIST`.
 - **`frontend.rs`, `standalone.rs`**: Enhanced `StartCommand` to clone `process_manager` for better resource management.
 - **`instance.rs`, `builder.rs`**: Integrated `ProcessManager` into `Instance` and `FrontendBuilder` to manage query
 ### Add Process Listing and Error Handling Enhancements

 - **Error Handling**: Introduced a new error variant `ListProcess` in `error.rs` to handle failures when listing running processes.
 - **Process List Implementation**: Enhanced `InformationSchemaProcessList` in `process_list.rs` to track running queries, including defining column names and implementing the `make_process_list` function to build the process list.
 - **Frontend Builder**: Added a `#[allow(clippy::too_many_arguments)]` attribute in `builder.rs` to suppress Clippy warnings for the `FrontendBuilder::new` function.

 These changes improve error handling and process tracking capabilities within the system.
 Refactor imports in `process_list.rs`

 - Updated import paths for `Predicates` and `InformationTable` in `process_list.rs` to align with the new module structure.
 Refactor process list generation in `process_list.rs`

 - Simplified the process list generation by removing intermediate row storage and directly building vectors.
 - Updated `process_to_row` function to use a mutable vector for current row data, improving memory efficiency.
 - Removed `rows_to_record_batch` function, integrating its logic directly into the main loop for streamlined processing.
 - **Refactor Row Construction**: Updated row construction in multiple files to use references for `Value` objects, improving memory efficiency. Affected files include:
   - `cluster_info.rs`
   - `columns.rs`
   - `flows.rs`
   - `key_column_usage.rs`
   - `partitions.rs`
   - `procedure_info.rs`
   - `process_list.rs`
   - `region_peers.rs`
   - `region_statistics.rs`
   - `schemata.rs`
   - `table_constraints.rs`
   - `tables.rs`
   - `views.rs`
   - `pg_class.rs`
   - `pg_database.rs`
   - `pg_namespace.rs`
 - **Remove Unused Code**: Deleted unused functions and error variants related to process management in `process_list.rs` and `error.rs`.
 - **Predicate Evaluation Update**: Modified predicate evaluation functions in `predicate.rs` to work with references, enhancing performance.
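The direct column-building refactor described in these commit notes can be illustrated with a minimal std-only analogue: one builder per column, a value pushed per row, no intermediate row buffer. `StringColumnBuilder` here is a simplified stand-in for GreptimeDB's real vector builders (`StringVectorBuilder` and friends):

```rust
/// Simplified stand-in for a columnar string builder.
struct StringColumnBuilder {
    values: Vec<Option<String>>,
}

impl StringColumnBuilder {
    /// Pre-allocates capacity for the expected number of rows.
    fn with_capacity(cap: usize) -> Self {
        Self { values: Vec::with_capacity(cap) }
    }

    /// Appends one (possibly null) value to the column.
    fn push(&mut self, v: Option<&str>) {
        self.values.push(v.map(str::to_string));
    }

    /// Finishes the column, yielding the built vector.
    fn finish(self) -> Vec<Option<String>> {
        self.values
    }
}

/// Builds the `query` column directly from process rows, without an
/// intermediate row buffer (mirroring the refactor described above).
fn build_query_column(queries: &[&str]) -> Vec<Option<String>> {
    let mut builder = StringColumnBuilder::with_capacity(queries.len());
    for q in queries {
        builder.push(Some(q));
    }
    builder.finish()
}
```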
@v0y4g3r v0y4g3r force-pushed the feat/show-process-list branch from 931d85b to ab8ea60 Compare April 14, 2025 13:20
v0y4g3r added 2 commits April 14, 2025 13:24
 ### Implement Process Management Enhancements

 - **Error Handling Enhancements**:
   - Added new error variants `BumpSequence`, `StartReportTask`, `ReportProcess`, and `BuildProcessManager` in `error.rs` to improve error handling for process management tasks.
   - Updated `ErrorExt` implementations to handle new error types.

 - **Process Manager Improvements**:
   - Introduced `ProcessManager` enhancements in `process_manager.rs` to manage process states using `ProcessWithState` and `ProcessState` enums.
   - Implemented periodic task `ReportTask` to report running queries to the KV backend.
   - Modified `register_query` and `deregister_query` methods to use the new state management system.

 - **Testing and Validation**:
   - Updated tests in `process_manager.rs` to validate new process management logic.
   - Replaced `dump` method with `list_all_processes` for listing processes.

 - **Integration with Frontend and Standalone**:
   - Updated `frontend.rs` and `standalone.rs` to handle `ProcessManager` initialization errors using `BuildProcessManager` error variant.

 - **Schema Adjustments**:
   - Modified `process_list.rs` in `system_schema/information_schema` to use the updated process listing method.

 - **Key-Value Conversion**:
   - Added `TryFrom` implementation for converting `Process` to `KeyValue` in `process_list.rs`.
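A hedged sketch of what such a `TryFrom` conversion from `Process` to `KeyValue` could look like; the key layout and value encoding below are purely illustrative assumptions, not the PR's actual format:

```rust
use std::convert::TryFrom;

/// Illustrative process record (subset of fields).
#[derive(Debug)]
struct Process {
    id: u64,
    database: String,
    query: String,
}

/// Illustrative key-value pair for a KV backend.
#[derive(Debug)]
struct KeyValue {
    key: Vec<u8>,
    value: Vec<u8>,
}

impl TryFrom<&Process> for KeyValue {
    type Error = String;

    fn try_from(p: &Process) -> Result<Self, Self::Error> {
        if p.query.is_empty() {
            return Err("process has no query".to_string());
        }
        // Hypothetical key layout: scope keys by database/catalog so that
        // listings can later be filtered per tenant.
        let key = format!("__process/{}/{}", p.database, p.id).into_bytes();
        let value = p.query.clone().into_bytes();
        Ok(KeyValue { key, value })
    }
}
```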
@v0y4g3r v0y4g3r force-pushed the feat/show-process-list branch from ab8ea60 to caadc7b Compare April 14, 2025 13:41
@v0y4g3r v0y4g3r changed the title feat: support information_schema.process_list table to show running queries feat: implement process manager and information_schema.process_list Apr 14, 2025
@v0y4g3r v0y4g3r marked this pull request as ready for review April 14, 2025 13:43
@v0y4g3r v0y4g3r requested review from MichaelScofield and a team as code owners April 14, 2025 13:43
@v0y4g3r v0y4g3r requested a review from sunng87 April 14, 2025 13:43
@github-actions github-actions bot removed the docs-required This change requires docs update. label Apr 14, 2025
@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Apr 14, 2025
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct ProcessValue {
/// Database name.
pub database: String,
Member

Note that a query can cross schemas, so it's hard to judge which "schema" a query belongs to. We can use the catalog here: when running GreptimeDB in multi-tenant mode, we use the catalog to isolate tenants, and each connection should belong to only one catalog.

/// The running query sql.
pub query: String,
/// Query start timestamp in milliseconds.
pub start_timestamp_ms: i64,
Member

We will also need to include some information about the client; multiple clients may issue the same query at the same time, so it's hard to identify the process owner from start_timestamp_ms alone.

let mut query_builder = StringVectorBuilder::with_capacity(queries.len());
let mut start_time_builder = TimestampMillisecondVectorBuilder::with_capacity(queries.len());
let mut elapsed_time_builder = DurationMillisecondVectorBuilder::with_capacity(queries.len());

Member

We need to ensure this table only contains queries from the current catalog. When running in multi-tenant mode, we should not allow one tenant to see the global process list.

There is an exception: when connected to the greptime catalog, we can show the global state.

Member

Client information such as the user and client IP/port should also be included in this table; compare MySQL's PROCESSLIST table:
https://dev.mysql.com/doc/refman/8.4/en/information-schema-processlist-table.html

@sunng87
Member

sunng87 commented Apr 15, 2025

One more idea: instead of this push model, where each frontend reports this information to the meta service, can we make it a pull model, where a query against information_schema.process_list makes the frontend issue requests to fetch the instant state from all of its siblings?

Pros:

  1. More instant data
  2. No overhead when no query is on process_list

Cons:

  1. Need a cache when there is a large quantity of queries in process_list

@v0y4g3r
Contributor Author

v0y4g3r commented Apr 15, 2025

I evaluated that approach at first, but a frontend does not know the addresses of the other frontends; to implement this we would need some kind of service discovery. Maybe information_schema.cluster_info will do the trick, but the frontend list in that table is only updated along with heartbeats, which is not realtime.

@sunng87
Member

sunng87 commented Apr 16, 2025

If we use pull mode, we can first query meta to get a list of live frontend nodes, then broadcast to all these frontends. I think we can afford the cost because it won't be a frequent query.

The open question is that, as I remember, we cannot do async operations in information schema. Not sure if that's still the case.
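The pull flow described above (ask meta for live frontend nodes, then fan out a "list processes" request to each) can be sketched with hypothetical synchronous traits; all names and signatures here are assumptions, and a real implementation would use async RPC clients:

```rust
/// Hypothetical client for the meta service.
trait MetaClient {
    /// Returns the addresses of live frontend nodes.
    fn live_frontends(&self) -> Vec<String>;
}

/// Hypothetical client for peer frontends.
trait FrontendClient {
    /// Fetches the running queries on the frontend at `addr`.
    fn list_processes(&self, addr: &str) -> Vec<String>;
}

/// Pull-model aggregation: discover live frontends via meta, then
/// collect the instant process state from each sibling.
fn gather_process_list(meta: &dyn MetaClient, client: &dyn FrontendClient) -> Vec<String> {
    meta.live_frontends()
        .iter()
        .flat_map(|addr| client.list_processes(addr))
        .collect()
}
```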

@@ -331,11 +332,17 @@ impl StartCommand {

let information_extension =
Arc::new(DistributedInformationExtension::new(meta_client.clone()));

let process_manager = Arc::new(
ProcessManager::new(opts.grpc.server_addr.clone(), cached_meta_backend.clone())
Collaborator

Prefer to use the method that resolves the frontend address, as in the frontend's heartbeat:

peer_addr: addrs::resolve_addr(&opts.grpc.bind_addr, Some(&opts.grpc.server_addr)),

@MichaelScofield
Collaborator

There's another issue regarding the push mode: the pressure on the metasrv. The process list might be huge in a busy cluster, and the reporting tasks may fire too frequently for the metasrv in a large cluster. Both add extra network and storage load on the metasrv.

@v0y4g3r
Contributor Author

v0y4g3r commented Apr 16, 2025

Most queries live for less than the report interval, so they won't be stored in the metasrv or backend storage. No cluster is expected to have many long-running queries.

@killme2008
Contributor

What's the status of this PR?

Labels
docs-not-required This change does not impact docs.