fix(kernel): lenient binding parsing — one typo no longer drops the entire bindings table#1146
Open
benhoverter wants to merge 1 commit intoRightNow-AI:mainfrom
Open
fix(kernel): lenient binding parsing — one typo no longer drops the entire bindings table#1146benhoverter wants to merge 1 commit intoRightNow-AI:mainfrom
benhoverter wants to merge 1 commit intoRightNow-AI:mainfrom
Conversation
A typo in any binding's match_rule no longer drops the entire bindings
table. Each entry is parsed independently; malformed entries log an
ERROR with index, agent name, and the underlying serde error, then are
skipped. A single WARN summarizes total dropped vs. surviving bindings.
Per-entry deny_unknown_fields is preserved so silent typos still fail
loudly — just no longer catastrophically.
Before this change, a single misspelled field anywhere in [[bindings]]
caused the whole table to fail parsing, silently unbinding every
agent — the worst possible failure mode for a routing config.
- New `lenient_extract_bindings` runs after include-merge / [api]
migration, before `try_into::<KernelConfig>()`.
- 7 new config tests cover the reproducer, happy path, all-malformed,
no-bindings, missing-agent, survivor-order preservation, and
top-level field typos:
* test_lenient_bindings_drops_typo_keeps_rest
* test_lenient_bindings_all_valid_unchanged
* test_lenient_bindings_all_malformed_yields_empty_but_keeps_rest_of_config
* test_lenient_bindings_no_bindings_section_is_noop
* test_lenient_bindings_missing_agent_field_dropped
* test_lenient_bindings_preserves_survivor_order — locks in that
first-match-wins routing semantics cannot silently regress when
a middle entry is dropped
* test_lenient_bindings_top_level_field_typo_dropped — locks in
that deny_unknown_fields catches operator typos at the binding
top level (e.g. \`agnt = ...\`), not just inside match_rule
- TODO marker added on the remaining \`warn!\` fallback in \`load_config\`
for the non-binding silent-default path (follow-up work).
Tested live: typo'd \`hannel\` field on binding #2 logged as expected;
remaining 5 bindings loaded and routed correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
7 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #1145.
A single misspelled field anywhere in
[[bindings]]currently causes the entire bindings table to fail parsing at config load. BecauseKernelConfigdeliberately does not usedeny_unknown_fieldsat the top level (forward-compat with new sections), the failure surfaces as a silent default: every agent ends up unbound, the daemon boots clean, and routing simply stops working — with no log line pointing at the typo.This PR makes binding parsing partial-success: each entry is parsed independently, malformed entries are dropped with a clear ERROR naming the agent and the bad field, and survivors load normally. Per-entry
deny_unknown_fieldsis preserved, so silent typos still fail loudly per entry — just no longer catastrophically across the whole table.Changes
lenient_extract_bindings(&toml::Value)inopenfang-kernel/src/config.rsruns after include-merge /[api]migration, before the maintry_into::<KernelConfig>()call. Each[[bindings]]entry istry_into::<AgentBinding>()'d on its own; failures log an ERROR with index, best-effort-extracted agent name, and the underlying serde error message (which already lists valid field names). A single aggregate WARN reports dropped vs. surviving counts. Survivors are reinjected into the config value before the normal deserialize proceeds.AgentBindingorBindingMatchRuleschemas —deny_unknown_fieldsis preserved on both. Behavior change is scoped strictly to the failure mode of the bindings array.warn!-on-load fallback inload_configfor the non-binding silent-default path (follow-up work).Out of scope: the broader silent-default path for non-binding config sections; surfacing the dropped-binding count in any non-log channel.
Testing
cargo clippy --workspace --all-targets -- -D warningspassescargo test --workspacepasses (+7 new tests, all green)Live smoke test on a daemon built from this branch stacked on a hardened base: typo'd a binding field (
cccchannel) inconfig.tomland bounced the daemon. Log output:The error message names the agent, the typo'd field, and the valid alternatives. Surviving bindings loaded and routed correctly.
7 new unit tests cover:
test_lenient_bindings_drops_typo_keeps_rest— the issue reproducer.test_lenient_bindings_all_valid_unchanged— happy path: no behavioral change for clean configs.test_lenient_bindings_all_malformed_yields_empty_but_keeps_rest_of_config— every binding malformed → empty bindings, but the rest of the config still loads.test_lenient_bindings_no_bindings_section_is_noop— configs without[[bindings]]are untouched.test_lenient_bindings_missing_agent_field_dropped— entry without anagentfield is dropped (cannot be routed).test_lenient_bindings_preserves_survivor_order— locks in that first-match-wins routing semantics cannot silently regress when a middle entry is dropped.test_lenient_bindings_top_level_field_typo_dropped— locks in thatdeny_unknown_fieldscatches operator typos at the binding top level (e.g.agnt = ...), not just insidematch_rule.CI note: the same pre-existing
cargo fmt --all -- --checkfailure onmain(tracked by #1121, inherited fromda6b567aand earlier) applies here. None of the failing fmt lines are incrates/openfang-kernel/src/config.rs— the only file this PR modifies. Happy to address as a separate fmt-only PR if maintainers prefer.Security
Note on relationship to #1144
This PR pairs naturally with #1144 (channel_id binding hardening). Together they make binding misconfig caught early (#1144) and survivable (this PR) instead of silent-and-catastrophic. The two PRs touch disjoint code paths and can land in either order.