Skip to content

refactor(tool-parser): rename Glm4MoeParser to GlmParser with catch-all "glm" pattern#1568

Open
1195343015 wants to merge 3 commits into
lightseekorg:mainfrom
1195343015:refactor/glm-tool-parser-rename
Open

refactor(tool-parser): rename Glm4MoeParser to GlmParser with catch-all "glm" pattern#1568
1195343015 wants to merge 3 commits into
lightseekorg:mainfrom
1195343015:refactor/glm-tool-parser-rename

Conversation

@1195343015

@1195343015 1195343015 commented May 28, 2026

Copy link
Copy Markdown
Contributor

Description

Problem

Glm4MoeParser is misleadingly named — the "MoE" suffix is unrelated to the tool call format. The parser key glm47_moe only covers GLM-4.7, while glm-* falls back to the generic json parser, so GLM-5/5.1 tool calls fail to parse.

Solution

Rename Glm4MoeParser to GlmParser. Make the glm47 format (no newline between function name and args, used by GLM-4.7/5/5.1) the default. Keep a dedicated glm45 variant for GLM-4.5/4.6 where the function name is separated by a newline.

Replace glm-* → json with glm-* → glm so all GLM models are routed to the correct tool call parser.

Drop the misleading _moe suffix from all parser keys and pattern names.

Changes

  • Rename glm4_moe.rsglm.rs, Glm4MoeParserGlmParser
  • Default now uses 4.7+ format (was 4.5 format)
  • glm45() constructor preserved for 4.5/4.6 backward compat
  • glm47() visibility narrowed to pub(crate)
  • Factory: glm (default) + glm45 parser keys; glm-4.5*/4.6* → glm45, glm-* → glm patterns
  • Test files: tool_parser_glm.rs (default format) + tool_parser_glm45.rs (old format)
  • Docs and benchmark updated

Test Plan

  • cargo test -p tool-parser — 59 unit + 334 integration = all passed
Checklist
  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

  • New Features

    • Unified GLM tool parser now supports GLM-4.5 through GLM-5.1 with consistent XML-style tool-call handling.
  • Documentation

    • Updated parser identifiers to "glm" and "glm45" and revised GLM compatibility tables.
  • Tests & Benchmarks

    • Integration tests and benchmarks updated to use the new parser identifiers and mappings.

Review Change Stack

@github-actions github-actions Bot added documentation Improvements or additions to documentation benchmarks Benchmark changes tests Test changes tool-parser Tool/function call parser changes model-gateway Model gateway crate changes labels May 28, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames and refactors the GLM parser from Glm4MoeParser to GlmParser to support GLM-4.5 through GLM-5.1 tool call formats. The default format is updated to GLM-4.7+ (whitespace-based), while GLM-4.5/4.6 (newline-based) is registered under glm45. Documentation, benchmarks, and tests are updated to reflect these changes. The reviewer suggested narrowing the visibility of the glm47() constructor to pub(crate) to keep the public API clean and align with the intended design.

Comment thread crates/tool_parser/src/parsers/glm.rs Outdated
@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR consolidates GLM tool-call parsing by replacing the MoE-specific Glm4MoeParser with a unified GlmParser that supports GLM-4.5 through GLM-5.1 via variant constructors, updating factory registrations, module exports, integration tests, and documentation accordingly.

Changes

GLM Parser Consolidation

Layer / File(s) Summary
Core parser struct and trait impl refactoring
crates/tool_parser/src/parsers/glm.rs
Glm4MoeParser renamed to GlmParser; new regex-based new, glm45(), glm47() constructors; Default and ToolParser implementations updated; incremental parsing/reset adjustments.
Module declaration and crate re-exports
crates/tool_parser/src/parsers/mod.rs, crates/tool_parser/src/lib.rs
Replace glm4_moe module with glm; re-export GlmParser and remove Glm4MoeParser from crate root.
Factory parser registration and model routing
crates/tool_parser/src/factory.rs
Import GlmParser; register glm (default) and glm45 constructors; map glm-4.5*/glm-4.6*glm45, and glm-*glm.
Integration test updates
crates/tool_parser/tests/tool_parser_glm.rs, crates/tool_parser/tests/tool_parser_glm45.rs
Tests updated to instantiate GlmParser::default() or GlmParser::glm45() as appropriate; added test ensuring default parser parses GLM-47-style format.
Documentation and benchmark updates
crates/tool_parser/README.md, docs/concepts/architecture/grpc-pipeline.md, docs/getting-started/grpc-workers.md, model_gateway/benches/tool_parser_benchmark.rs
Parser reference tables and benchmark cases updated to use glm and glm45 identifiers and descriptions instead of MoE-specific entries.

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • CatherineSue
  • slin1237
  • key4ng

🐰 A parser tidy, keen and bright,
I hopped through regex through the night,
From MoE shadows to one clear name,
GlmParser hops into the frame,
New tests and docs—what a delight!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: renaming Glm4MoeParser to GlmParser and updating the factory pattern key from MoE-specific identifiers to a catch-all 'glm' pattern.
Docstring Coverage ✅ Passed Docstring coverage is 88.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d623e7c18b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +321 to +322
registry.register_parser("glm", || Box::new(GlmParser::default()));
registry.register_parser("glm45", || Box::new(GlmParser::glm45()));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve the old GLM parser aliases

Deployments that explicitly configure the previously registered parser names (--tool-call-parser glm45_moe or glm47_moe) now fail startup because AppContextBuilder validates the configured name with factory.has_parser(name) and rejects unknown tool-call parsers. This change only registers the new glm/glm45 names, so existing configs that worked before this refactor cannot start unless they are updated; keeping the old names as aliases would avoid that compatibility break.

Useful? React with 👍 / 👎.

…ll "glm" pattern

Rename Glm4MoeParser to GlmParser. Use glm47 format (whitespace-only,
GLM-4.7/5/5.1) as the default, with specialized glm45 format for
GLM-4.5/4.6 where the function name is separated by a newline.

Replace glm-* -> json fallback with glm-* -> glm so that GLM-5/5.1
models are routed to the correct tool call parser instead of the
generic JSON parser.

Drop the misleading "_moe" suffix from parser keys.

Signed-off-by: Jiayi Yan <1195343015@qq.com>
@1195343015 1195343015 force-pushed the refactor/glm-tool-parser-rename branch from d623e7c to 20abf42 Compare May 28, 2026 12:12

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/concepts/architecture/grpc-pipeline.md`:
- Line 276: Update the table cell that currently reads "`GLM 4.7+ format
(4.5/4.6 fall back to `glm45`)`" to clearly state which versions the `glm` /
`glm-*` pattern covers and how older 4.5/4.6 models are routed; e.g., indicate
"`GLM 4.7 and newer (models using 4.5/4.6 should be matched to the `glm45`
pattern)`" so the mapping behavior between `glm-*` and `glm45` is unambiguous
and does not imply automatic fallback by the parser.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 418cd796-9210-48b0-8b77-ebc54fba81ab

📥 Commits

Reviewing files that changed from the base of the PR and between c72ae0e and 20abf42.

📒 Files selected for processing (10)
  • crates/tool_parser/README.md
  • crates/tool_parser/src/factory.rs
  • crates/tool_parser/src/lib.rs
  • crates/tool_parser/src/parsers/glm.rs
  • crates/tool_parser/src/parsers/mod.rs
  • crates/tool_parser/tests/tool_parser_glm.rs
  • crates/tool_parser/tests/tool_parser_glm45.rs
  • docs/concepts/architecture/grpc-pipeline.md
  • docs/getting-started/grpc-workers.md
  • model_gateway/benches/tool_parser_benchmark.rs

Comment thread docs/concepts/architecture/grpc-pipeline.md Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 20abf42889

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// GLM models (4.5/4.6 use newline format, 4.7+ uses whitespace-only format)
registry.map_model("glm-4.5*", "glm45");
registry.map_model("glm-4.6*", "glm45");
registry.map_model("glm-*", "glm");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep legacy GLM models on JSON parser

This catch-all now routes every glm-* model, including older IDs such as glm-4/glm-4-air that previously fell through to the explicit JSON fallback, into GlmParser. That parser only extracts calls when the response contains <tool_call> markers (GlmParser::parse_complete returns no calls otherwise), so JSON-formatted tool calls from those older GLM models stop being parsed. Since the new parser is documented as covering GLM-4.5 through GLM-5.1, keep a narrower mapping for the native XML families and preserve the JSON fallback for the rest.

Useful? React with 👍 / 👎.

Replace misleading "fall back" phrasing with an explicit note that
glm-4.5*/glm-4.6* patterns are matched to glm45 before the catch-all.

Signed-off-by: Jiayi Yan <1195343015@qq.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e86bb28ae

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +80 to 81
pub(crate) fn glm47() -> Self {
Self::new(r"(?s)<tool_call>\s*([^<\s]+)\s*(.*?)</tool_call>")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve the public GLM 4.7 constructor

For downstream Rust users of the published tool-parser crate, this narrows the previously public Glm4MoeParser::glm47() constructor to pub(crate) on the renamed type, so code that explicitly selected the GLM-4.7 parser can no longer compile while glm45() remains public for compatibility. Keeping glm47() public and/or adding a Glm4MoeParser alias would make this a non-breaking refactor for direct crate consumers.

Useful? React with 👍 / 👎.

Shortened GlmParser name allows JsonParser to fit on the same line.

Signed-off-by: Jiayi Yan <1195343015@qq.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/tool_parser/src/factory.rs (2)

321-322: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Box::new(GlmParser::default()) triggers clippy::box_default; CI treats clippy warnings as errors (cargo clippy ... -- -D warnings).

In crates/tool_parser/src/factory.rs (glm registration), Box::new(GlmParser::default()) matches clippy::box_default; this would break the build unless the lint is explicitly allowed in clippy.toml / Cargo.toml lints (only the glm registration uses Default).

♻️ Proposed fix
-        registry.register_parser("glm", || Box::new(GlmParser::default()));
+        registry.register_parser("glm", || Box::<GlmParser>::default());

Confirm whether clippy::box_default is allowed/suppressed anywhere in clippy.toml or via [lints] in Cargo.toml (the workflow runs cargo clippy -- ... -D warnings).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/tool_parser/src/factory.rs` around lines 321 - 322, The registration
for "glm" uses Box::new(GlmParser::default()) which triggers
clippy::box_default; update the factory registration to use the Box::new_default
helper (e.g., replace the closure to return Box::new_default::<GlmParser>() ) so
the Default allocation doesn't trigger the lint, and verify clippy.toml /
Cargo.toml do not rely on suppressing clippy::box_default; adjust configs only
if you intentionally want to allow the lint. Target the
registry.register_parser("glm", || ...) closure and keep
registry.register_parser("glm45", || Box::new(GlmParser::glm45())) unchanged.

389-392: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix GLM tool-parser routing for org-prefixed/uppercase model IDs

ParserRegistry::resolve_model_to_parser does exact/prefix matching on the model string with no lowercasing/normalization, and gRPC dispatch/conversion passes model through unchanged—so IDs like zai-org/GLM-4.5 / zai-org/GLM-4.6 won’t match the current GLM mappings (glm-4.5*, glm-4.6*, glm-*) and fall back to the default parser. Other families in the same file already include org-prefixed/cased variants, but GLM doesn’t.

♻️ Suggested additional mappings
         // GLM models (4.5/4.6 use newline format, 4.7+ uses whitespace-only format)
         registry.map_model("glm-4.5*", "glm45");
         registry.map_model("glm-4.6*", "glm45");
         registry.map_model("glm-*", "glm");
+        registry.map_model("zai-org/GLM-4.5*", "glm45");
+        registry.map_model("zai-org/GLM-4.6*", "glm45");
+        registry.map_model("GLM-4.5*", "glm45");
+        registry.map_model("GLM-4.6*", "glm45");
+        registry.map_model("zai-org/GLM-*", "glm");
+        registry.map_model("GLM-*", "glm");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/tool_parser/src/factory.rs` around lines 389 - 392, ParserRegistry
currently maps "glm-4.5*", "glm-4.6*", and "glm-*" but does exact/prefix
matching with no normalization, so org-prefixed or uppercase IDs (e.g.,
"zai-org/GLM-4.5") won't match; update the mapping calls where
registry.map_model is invoked for GLM (the lines registering "glm-4.5*",
"glm-4.6*", "glm-*") to also register org-prefixed and uppercase/cased variants
(e.g., "zai-org/glm-4.5*", "zai-org/GLM-4.5*", "zai-org/glm-4.6*",
"zai-org/GLM-4.6*", and similar for the generic "glm*") so
ParserRegistry::resolve_model_to_parser will route those IDs to the GLM parser.
Ensure you add both org-prefixed and common-casing variants consistent with
other families in the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/tool_parser/src/factory.rs`:
- Around line 321-322: The registration for "glm" uses
Box::new(GlmParser::default()) which triggers clippy::box_default; update the
factory registration to use the Box::new_default helper (e.g., replace the
closure to return Box::new_default::<GlmParser>() ) so the Default allocation
doesn't trigger the lint, and verify clippy.toml / Cargo.toml do not rely on
suppressing clippy::box_default; adjust configs only if you intentionally want
to allow the lint. Target the registry.register_parser("glm", || ...) closure
and keep registry.register_parser("glm45", || Box::new(GlmParser::glm45()))
unchanged.
- Around line 389-392: ParserRegistry currently maps "glm-4.5*", "glm-4.6*", and
"glm-*" but does exact/prefix matching with no normalization, so org-prefixed or
uppercase IDs (e.g., "zai-org/GLM-4.5") won't match; update the mapping calls
where registry.map_model is invoked for GLM (the lines registering "glm-4.5*",
"glm-4.6*", "glm-*") to also register org-prefixed and uppercase/cased variants
(e.g., "zai-org/glm-4.5*", "zai-org/GLM-4.5*", "zai-org/glm-4.6*",
"zai-org/GLM-4.6*", and similar for the generic "glm*") so
ParserRegistry::resolve_model_to_parser will route those IDs to the GLM parser.
Ensure you add both org-prefixed and common-casing variants consistent with
other families in the file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fb0d3fa7-3da3-46b4-9e47-7ecd718bd0ef

📥 Commits

Reviewing files that changed from the base of the PR and between 5e86bb2 and a75ec56.

📒 Files selected for processing (1)
  • crates/tool_parser/src/factory.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

benchmarks Benchmark changes documentation Improvements or additions to documentation model-gateway Model gateway crate changes tests Test changes tool-parser Tool/function call parser changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant