refactor(tool-parser): rename Glm4MoeParser to GlmParser with catch-all "glm" pattern#1568
refactor(tool-parser): rename Glm4MoeParser to GlmParser with catch-all "glm" pattern#15681195343015 wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request renames and refactors the GLM parser from Glm4MoeParser to GlmParser to support GLM-4.5 through GLM-5.1 tool call formats. The default format is updated to GLM-4.7+ (whitespace-based), while GLM-4.5/4.6 (newline-based) is registered under glm45. Documentation, benchmarks, and tests are updated to reflect these changes. The reviewer suggested narrowing the visibility of the glm47() constructor to pub(crate) to keep the public API clean and align with the intended design.
📝 WalkthroughWalkthroughThis PR consolidates GLM tool-call parsing by replacing the MoE-specific ChangesGLM Parser Consolidation
🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d623e7c18b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| registry.register_parser("glm", || Box::new(GlmParser::default())); | ||
| registry.register_parser("glm45", || Box::new(GlmParser::glm45())); |
There was a problem hiding this comment.
Preserve the old GLM parser aliases
Deployments that explicitly configure the previously registered parser names (--tool-call-parser glm45_moe or glm47_moe) now fail startup because AppContextBuilder validates the configured name with factory.has_parser(name) and rejects unknown tool-call parsers. This change only registers the new glm/glm45 names, so existing configs that worked before this refactor cannot start unless they are updated; keeping the old names as aliases would avoid that compatibility break.
Useful? React with 👍 / 👎.
…ll "glm" pattern Rename Glm4MoeParser to GlmParser. Use glm47 format (whitespace-only, GLM-4.7/5/5.1) as the default, with specialized glm45 format for GLM-4.5/4.6 where the function name is separated by a newline. Replace glm-* -> json fallback with glm-* -> glm so that GLM-5/5.1 models are routed to the correct tool call parser instead of the generic JSON parser. Drop the misleading "_moe" suffix from parser keys. Signed-off-by: Jiayi Yan <1195343015@qq.com>
d623e7c to
20abf42
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/concepts/architecture/grpc-pipeline.md`:
- Line 276: Update the table cell that currently reads "`GLM 4.7+ format
(4.5/4.6 fall back to `glm45`)`" to clearly state which versions the `glm` /
`glm-*` pattern covers and how older 4.5/4.6 models are routed; e.g., indicate
"`GLM 4.7 and newer (models using 4.5/4.6 should be matched to the `glm45`
pattern)`" so the mapping behavior between `glm-*` and `glm45` is unambiguous
and does not imply automatic fallback by the parser.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 418cd796-9210-48b0-8b77-ebc54fba81ab
📒 Files selected for processing (10)
crates/tool_parser/README.mdcrates/tool_parser/src/factory.rscrates/tool_parser/src/lib.rscrates/tool_parser/src/parsers/glm.rscrates/tool_parser/src/parsers/mod.rscrates/tool_parser/tests/tool_parser_glm.rscrates/tool_parser/tests/tool_parser_glm45.rsdocs/concepts/architecture/grpc-pipeline.mddocs/getting-started/grpc-workers.mdmodel_gateway/benches/tool_parser_benchmark.rs
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 20abf42889
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // GLM models (4.5/4.6 use newline format, 4.7+ uses whitespace-only format) | ||
| registry.map_model("glm-4.5*", "glm45"); | ||
| registry.map_model("glm-4.6*", "glm45"); | ||
| registry.map_model("glm-*", "glm"); |
There was a problem hiding this comment.
Keep legacy GLM models on JSON parser
This catch-all now routes every glm-* model, including older IDs such as glm-4/glm-4-air that previously fell through to the explicit JSON fallback, into GlmParser. That parser only extracts calls when the response contains <tool_call> markers (GlmParser::parse_complete returns no calls otherwise), so JSON-formatted tool calls from those older GLM models stop being parsed. Since the new parser is documented as covering GLM-4.5 through GLM-5.1, keep a narrower mapping for the native XML families and preserve the JSON fallback for the rest.
Useful? React with 👍 / 👎.
Replace misleading "fall back" phrasing with an explicit note that glm-4.5*/glm-4.6* patterns are matched to glm45 before the catch-all. Signed-off-by: Jiayi Yan <1195343015@qq.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e86bb28ae
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pub(crate) fn glm47() -> Self { | ||
| Self::new(r"(?s)<tool_call>\s*([^<\s]+)\s*(.*?)</tool_call>") |
There was a problem hiding this comment.
Preserve the public GLM 4.7 constructor
For downstream Rust users of the published tool-parser crate, this narrows the previously public Glm4MoeParser::glm47() constructor to pub(crate) on the renamed type, so code that explicitly selected the GLM-4.7 parser can no longer compile while glm45() remains public for compatibility. Keeping glm47() public and/or adding a Glm4MoeParser alias would make this a non-breaking refactor for direct crate consumers.
Useful? React with 👍 / 👎.
Shortened GlmParser name allows JsonParser to fit on the same line. Signed-off-by: Jiayi Yan <1195343015@qq.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/tool_parser/src/factory.rs (2)
321-322: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Box::new(GlmParser::default())triggersclippy::box_default; CI treats clippy warnings as errors (cargo clippy ... -- -D warnings).In
crates/tool_parser/src/factory.rs(glm registration),Box::new(GlmParser::default())matchesclippy::box_default; this would break the build unless the lint is explicitly allowed inclippy.toml/Cargo.tomllints (only theglmregistration usesDefault).♻️ Proposed fix
- registry.register_parser("glm", || Box::new(GlmParser::default())); + registry.register_parser("glm", || Box::<GlmParser>::default());Confirm whether
clippy::box_defaultis allowed/suppressed anywhere inclippy.tomlor via[lints]inCargo.toml(the workflow runscargo clippy -- ... -D warnings).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/tool_parser/src/factory.rs` around lines 321 - 322, The registration for "glm" uses Box::new(GlmParser::default()) which triggers clippy::box_default; update the factory registration to use the Box::new_default helper (e.g., replace the closure to return Box::new_default::<GlmParser>() ) so the Default allocation doesn't trigger the lint, and verify clippy.toml / Cargo.toml do not rely on suppressing clippy::box_default; adjust configs only if you intentionally want to allow the lint. Target the registry.register_parser("glm", || ...) closure and keep registry.register_parser("glm45", || Box::new(GlmParser::glm45())) unchanged.
389-392:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix GLM tool-parser routing for org-prefixed/uppercase model IDs
ParserRegistry::resolve_model_to_parserdoes exact/prefix matching on the model string with no lowercasing/normalization, and gRPC dispatch/conversion passesmodelthrough unchanged—so IDs likezai-org/GLM-4.5/zai-org/GLM-4.6won’t match the current GLM mappings (glm-4.5*,glm-4.6*,glm-*) and fall back to the default parser. Other families in the same file already include org-prefixed/cased variants, but GLM doesn’t.♻️ Suggested additional mappings
// GLM models (4.5/4.6 use newline format, 4.7+ uses whitespace-only format) registry.map_model("glm-4.5*", "glm45"); registry.map_model("glm-4.6*", "glm45"); registry.map_model("glm-*", "glm"); + registry.map_model("zai-org/GLM-4.5*", "glm45"); + registry.map_model("zai-org/GLM-4.6*", "glm45"); + registry.map_model("GLM-4.5*", "glm45"); + registry.map_model("GLM-4.6*", "glm45"); + registry.map_model("zai-org/GLM-*", "glm"); + registry.map_model("GLM-*", "glm");🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/tool_parser/src/factory.rs` around lines 389 - 392, ParserRegistry currently maps "glm-4.5*", "glm-4.6*", and "glm-*" but does exact/prefix matching with no normalization, so org-prefixed or uppercase IDs (e.g., "zai-org/GLM-4.5") won't match; update the mapping calls where registry.map_model is invoked for GLM (the lines registering "glm-4.5*", "glm-4.6*", "glm-*") to also register org-prefixed and uppercase/cased variants (e.g., "zai-org/glm-4.5*", "zai-org/GLM-4.5*", "zai-org/glm-4.6*", "zai-org/GLM-4.6*", and similar for the generic "glm*") so ParserRegistry::resolve_model_to_parser will route those IDs to the GLM parser. Ensure you add both org-prefixed and common-casing variants consistent with other families in the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@crates/tool_parser/src/factory.rs`:
- Around line 321-322: The registration for "glm" uses
Box::new(GlmParser::default()) which triggers clippy::box_default; update the
factory registration to use the Box::new_default helper (e.g., replace the
closure to return Box::new_default::<GlmParser>() ) so the Default allocation
doesn't trigger the lint, and verify clippy.toml / Cargo.toml do not rely on
suppressing clippy::box_default; adjust configs only if you intentionally want
to allow the lint. Target the registry.register_parser("glm", || ...) closure
and keep registry.register_parser("glm45", || Box::new(GlmParser::glm45()))
unchanged.
- Around line 389-392: ParserRegistry currently maps "glm-4.5*", "glm-4.6*", and
"glm-*" but does exact/prefix matching with no normalization, so org-prefixed or
uppercase IDs (e.g., "zai-org/GLM-4.5") won't match; update the mapping calls
where registry.map_model is invoked for GLM (the lines registering "glm-4.5*",
"glm-4.6*", "glm-*") to also register org-prefixed and uppercase/cased variants
(e.g., "zai-org/glm-4.5*", "zai-org/GLM-4.5*", "zai-org/glm-4.6*",
"zai-org/GLM-4.6*", and similar for the generic "glm*") so
ParserRegistry::resolve_model_to_parser will route those IDs to the GLM parser.
Ensure you add both org-prefixed and common-casing variants consistent with
other families in the file.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fb0d3fa7-3da3-46b4-9e47-7ecd718bd0ef
📒 Files selected for processing (1)
crates/tool_parser/src/factory.rs
Description
Problem
Glm4MoeParseris misleadingly named — the "MoE" suffix is unrelated to the tool call format. The parser keyglm47_moeonly covers GLM-4.7, whileglm-*falls back to the genericjsonparser, so GLM-5/5.1 tool calls fail to parse.Solution
Rename
Glm4MoeParsertoGlmParser. Make the glm47 format (no newline between function name and args, used by GLM-4.7/5/5.1) the default. Keep a dedicatedglm45variant for GLM-4.5/4.6 where the function name is separated by a newline.Replace
glm-* → jsonwithglm-* → glmso all GLM models are routed to the correct tool call parser.Drop the misleading
_moesuffix from all parser keys and pattern names.Changes
glm4_moe.rs→glm.rs,Glm4MoeParser→GlmParserDefaultnow uses 4.7+ format (was 4.5 format)glm45()constructor preserved for 4.5/4.6 backward compatglm47()visibility narrowed topub(crate)glm(default) +glm45parser keys;glm-4.5*/4.6* → glm45,glm-* → glmpatternstool_parser_glm.rs(default format) +tool_parser_glm45.rs(old format)Test Plan
cargo test -p tool-parser— 59 unit + 334 integration = all passedChecklist
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspassesSummary by CodeRabbit
New Features
Documentation
Tests & Benchmarks