
Commit 80b4593

Avoid single-failure active skill patches
1 parent f246212 commit 80b4593

11 files changed

Lines changed: 19 additions & 13 deletions

README.md

Lines changed: 2 additions & 1 deletion
@@ -113,6 +113,7 @@ Both backends feed the same Skill ranking, posterior audit rendering, and rewrit
 - **Evidence-weighted Skill evolution**: update Skill beliefs from verified success and failure trajectories.
 - **Bayesian Skill registry**: maintain Bayesian Evidence Model beliefs, optional Beta-Bernoulli posteriors, failure modes, token cost, latency, turns, and context distribution.
 - **Failure-mode-aware repair**: identify recurring errors and generate focused repair plans.
+- **Overfitting-resistant patch activation**: keep single failures as audit evidence, and promote a failure-mode patch into the benchmark prompt only after at least two verified occurrences.
 - **Token-aware context building**: select concise, evidence-backed Skill/SOP text; benchmark prompts receive executable patches and guardrails, while posterior numbers stay in artifacts.
 - **Full self-evolution from scratch**: run all tasks, collect evidence online, and evolve Skills without prior traces.
 - **Incremental repair for existing agents**: consume failed trajectories from a baseline agent and rerun only the failed tasks.
@@ -262,7 +263,7 @@ skill_context = SkillContextBuilder(registry).render(task_context="sop_bench")
 print(skill_context)
 ```

-`SkillContextBuilder` renders a compact posterior audit view. The built-in SOP/Lifelong runners convert posterior decisions into executable failure-mode patches and guardrails before adding them to model prompts.
+`SkillContextBuilder` renders a compact posterior audit view. The built-in SOP/Lifelong runners convert recurring posterior-backed failure modes into executable patches and guardrails before adding them to model prompts.

 ## 🔁 Three Operating Patterns

README_ZH.md

Lines changed: 2 additions & 1 deletion
@@ -113,6 +113,7 @@ E[p_k | D_k] = (alpha_0 + s_k) / (alpha_0 + beta_0 + s_k + f_k)
 - **Evidence-weighted Skill evolution**: update Skill beliefs from verified success/failure trajectories.
 - **Bayesian Skill Registry**: maintain Bayesian Evidence Model beliefs, optional Beta-Bernoulli posteriors, failure modes, token cost, latency, turns, and context distribution.
 - **Failure-mode-oriented repair**: identify recurring errors and generate focused repair plans.
+- **Overfitting-resistant patch activation**: a single failure is kept only as audit evidence; a patch is promoted into the benchmark prompt only after the same failure mode has at least two verified failures.
 - **Token-aware context building**: select concise, evidence-backed Skill/SOP text; benchmark prompts receive executable patches and guardrails, while posterior numbers stay in artifacts.
 - **Full self-evolution from scratch**: run all tasks, collect evidence online, and evolve Skills without historical traces.
 - **Incremental repair layer for existing agents**: read the baseline agent's failed trajectories and rerun only the failed tasks.
@@ -262,7 +263,7 @@ skill_context = SkillContextBuilder(registry).render(task_context="sop_bench")
 print(skill_context)
 ```

-`SkillContextBuilder` renders a concise posterior audit view. The built-in SOP/Lifelong runners first convert posterior decisions into executable failure-mode patches and guardrails, then add them to the model prompt.
+`SkillContextBuilder` renders a concise posterior audit view. The built-in SOP/Lifelong runners first convert recurring failure modes that are backed by posterior evidence into executable patches and guardrails, then add them to the model prompt.

 ## 🔁 Three Operating Patterns

bayesian_agent/benchmarks/evolution.py

Lines changed: 4 additions & 1 deletion
@@ -12,6 +12,9 @@
 from bayesian_agent.core.registry import BayesianSkillRegistry


+ACTIVE_PATCH_MIN_SUPPORT = 2
+
+
 def classify_failure(benchmark: str, run: Mapping[str, Any]) -> str:
     """Classify common benchmark failures into reusable evidence labels."""

@@ -161,7 +164,7 @@ def _failure_mode_patch_rules(benchmark: str, registry: BayesianSkillRegistry):
         if belief.skill_id != f"benchmark/{benchmark}" and benchmark not in belief.contexts:
             continue
         for failure_mode, count in belief.failure_modes.items():
-            if count > 0:
+            if count >= ACTIVE_PATCH_MIN_SUPPORT:
                 counts[failure_mode] = counts.get(failure_mode, 0) + int(count)

     patches = []
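
The second hunk applies the threshold to each belief's own count before aggregating. The sketch below reproduces that loop in isolation to show the effect. `Belief` is a hypothetical stand-in for whatever the registry yields (only `skill_id`, `contexts`, and `failure_modes` are taken from the diff), and `active_failure_counts` is an illustrative name, not a function in the repository.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable

ACTIVE_PATCH_MIN_SUPPORT = 2  # same constant the commit introduces


@dataclass
class Belief:
    """Hypothetical belief record with only the fields the hunk reads."""
    skill_id: str
    contexts: Dict[str, int] = field(default_factory=dict)
    failure_modes: Dict[str, int] = field(default_factory=dict)


def active_failure_counts(benchmark: str, beliefs: Iterable[Belief]) -> Dict[str, int]:
    """Aggregate failure-mode counts, skipping counts below the support threshold."""
    counts: Dict[str, int] = {}
    for belief in beliefs:
        # Skip beliefs that are neither the benchmark Skill nor seen in its context.
        if belief.skill_id != f"benchmark/{benchmark}" and benchmark not in belief.contexts:
            continue
        for failure_mode, count in belief.failure_modes.items():
            if count >= ACTIVE_PATCH_MIN_SUPPORT:
                counts[failure_mode] = counts.get(failure_mode, 0) + int(count)
    return counts


beliefs = [
    Belief(
        "benchmark/sop_bench",
        failure_modes={"left_expected_output_blank": 2, "invented_unrequested_column": 1},
    ),
    Belief(
        "other/skill",
        contexts={"sop_bench": 3},
        failure_modes={"invented_unrequested_column": 1},
    ),
    Belief("unrelated", failure_modes={"left_expected_output_blank": 5}),
]
active = active_failure_counts("sop_bench", beliefs)
print(active)
```

Under this reading, two beliefs that each recorded `invented_unrequested_column` once do not activate a patch, because the threshold is checked per belief rather than on the summed count. The commit does not say whether that is intended.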

docs/articles/bayesian-evidence-acquired-learning.md

Lines changed: 1 addition & 1 deletion
@@ -716,7 +716,7 @@ rewrite = patch
 reason = failures cluster around left_expected_output_blank
 ```

-The rewritten Skill context does not vaguely say "be more careful"; it turns the failure mode into an executable constraint. The current v0.x implementation injects a patch section like this into the next round's prompt:
+The rewritten Skill context does not vaguely say "be more careful"; it turns recurring failure modes into executable constraints. The current v0.x implementation first keeps a single failure as candidate evidence in the audit artifact; only after the same failure mode has occurred at least twice does it inject an active patch section like this into the next round's prompt:

 ```text
 ### Bayesian Failure-Mode Patches: sop_bench

docs/articles/complex-bayesian-rewrite-example.md

Lines changed: 2 additions & 2 deletions
@@ -434,7 +434,7 @@ P_h(failure | x_risk) ≈ 0.997
 2. Failure clusters such as left_expected_output_blank still need to be constrained by a guardrail.
 ```

-So in the current v0.x, even if a later repair succeeds, the `failure_modes` counts remain in the registry. As long as this recurring failure mode is still there, the context keeps the related patch. This is conservative, but safer for both benchmark repair and production environments
+So in the current v0.x, even if a later repair succeeds, the `failure_modes` counts remain in the registry. A failure mode seen for the first time is kept only as candidate evidence in the audit artifact; only after the same failure mode has occurred at least twice does the context retain the related active patch. This is steadier than "rewrite the skill on every error" and lowers the risk of overfitting to a single anomalous sample

 ## 10. What this example shows

@@ -457,7 +457,7 @@ rewrite trigger:
 the current RewritePolicy sees the same failure mode occur 2 times and triggers patch

 context rewrite:
-benchmark-specific patch rules are injected into the next round's prompt
+benchmark-specific patch rules are injected into the next round's prompt; a single failure only enters the audit, not the active prompt patch

 repair success:
 success evidence is written back to the registry, and posterior_success for the healthy trajectory rises
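
The lifecycle summarised in this hunk can be sketched as a toy model. Everything here is illustrative: `ToyBelief`, its Beta(1, 1) prior, and `active_patches` are assumptions, not the project's `BayesianSkillRegistry` API. The only details taken from the commit are the threshold of 2 and the behaviour that failure counts are not decremented after a successful repair.

```python
from collections import Counter
from typing import List, Optional

ACTIVE_PATCH_MIN_SUPPORT = 2


class ToyBelief:
    """Hypothetical belief: Beta(1, 1) prior plus per-mode failure counts."""

    def __init__(self) -> None:
        self.successes = 0
        self.failures = 0
        self.failure_modes: Counter = Counter()

    def record(self, success: bool, failure_mode: Optional[str] = None) -> None:
        """Record one verified outcome; failure counts are never decremented."""
        if success:
            self.successes += 1
        else:
            self.failures += 1
            if failure_mode:
                self.failure_modes[failure_mode] += 1

    @property
    def posterior_success(self) -> float:
        # Posterior mean of a Beta-Bernoulli model with alpha_0 = beta_0 = 1.
        return (1 + self.successes) / (2 + self.successes + self.failures)

    def active_patches(self) -> List[str]:
        """Failure modes with enough support to reach the prompt."""
        return sorted(
            mode
            for mode, count in self.failure_modes.items()
            if count >= ACTIVE_PATCH_MIN_SUPPORT
        )


belief = ToyBelief()
belief.record(False, "left_expected_output_blank")
after_one = belief.active_patches()     # audit-only: no active patch yet
belief.record(False, "left_expected_output_blank")
after_two = belief.active_patches()     # second occurrence activates the patch
before_repair = belief.posterior_success
belief.record(True)                     # repair succeeds
after_repair = belief.active_patches()  # patch stays active
after_repair_posterior = belief.posterior_success
```

In this sketch the success raises `posterior_success` while the patch remains active, which matches the conservative behaviour the article describes.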

docs/articles/zhihu-bayesian-agent.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ P(success | theta, C, skill)
 - `C` is the inference environment, including prompt, context, tools, memory, and harness feedback
 - `skill` is a reusable task procedure or SOP

-After each Agent task run, Bayesian-Agent reads verified trajectory evidence, updates the Skill's posterior belief, and on the next run generates posterior-driven Skill patches, guardrails, or compressed SOP text. Raw posterior numbers are kept in artifacts for auditing rather than being put directly into the benchmark prompt by default.
+After each Agent task run, Bayesian-Agent reads verified trajectory evidence, updates the Skill's posterior belief, and on the next run generates posterior-driven Skill patches, guardrails, or compressed SOP text. Raw posterior numbers are kept in artifacts for auditing rather than being put directly into the benchmark prompt by default. To avoid overfitting, a single failure counts only as candidate evidence; a patch that enters the benchmark prompt is activated only after the same failure mode has occurred at least twice.

 In other words, it does not "stuff every failure experience into memory"; instead it asks:

docs/core-concepts.md

Lines changed: 1 addition & 1 deletion
@@ -108,4 +108,4 @@ The default policy maps posterior state to small, inspectable actions:

 These actions are recommendations. External harnesses decide how to rewrite, rerun, or retire Skills.

-The bundled SOP-Bench and Lifelong runners implement one concrete `patch` behavior: known failure modes are converted into short failure-mode-specific guardrails in the next prompt. This keeps the current v0.x implementation honest: it patches the inference context for the same Skill belief, rather than silently creating a separate child Skill hypothesis.
+The bundled SOP-Bench and Lifelong runners implement one concrete `patch` behavior: recurring known failure modes are converted into short failure-mode-specific guardrails in the next prompt. A single failure is recorded in `belief_*.json` and `posterior_context_*.md` as candidate evidence, but it is not promoted into model-facing patch text until the same failure mode has at least two verified occurrences. This keeps the current v0.x implementation honest: it patches the inference context for the same Skill belief, rather than silently creating a separate child Skill hypothesis.

docs/experiments.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ Bayesian modes now persist per-task Skill evolution artifacts under:
 snapshot_after.json
 ```

-`skill_context_before.md` is the exact model-facing Skill/SOP text injected into that task. For the built-in benchmarks, it contains executable `Bayesian Failure-Mode Patches` plus stable benchmark guardrails, not raw posterior numbers. `skill_context_after.md` is the next model-facing Skill/SOP text after verifier feedback is recorded.
+`skill_context_before.md` is the exact model-facing Skill/SOP text injected into that task. For the built-in benchmarks, it contains stable benchmark guardrails and any active `Bayesian Failure-Mode Patches`. A patch becomes active only after the same failure mode has at least two verified occurrences, so single failures stay audit-only. `skill_context_after.md` is the next model-facing Skill/SOP text after verifier feedback is recorded.

 `posterior_context_before.md` and `posterior_context_after.md` are audit artifacts for the Bayesian belief state. They may include posterior summaries such as `posterior_success`, `alpha`, `beta`, observations, and rewrite decisions, but those numeric summaries are not injected into the benchmark prompt.

docs/experiments/index.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ Bayesian runs also write a per-task Skill evolution trail:
 snapshot_after.json
 ```

-The `before` Skill context is the exact model-facing Skill/SOP text injected into the model for that task. For the built-in benchmarks, it contains executable `Bayesian Failure-Mode Patches` plus stable benchmark guardrails, not raw posterior numbers.
+The `before` Skill context is the exact model-facing Skill/SOP text injected into the model for that task. For the built-in benchmarks, it contains stable benchmark guardrails and any active `Bayesian Failure-Mode Patches`, not raw posterior numbers. A patch becomes active only after the same failure mode has at least two verified occurrences.

 The `after` Skill context is rendered after the verifier result is recorded, so it represents the next model-facing Skill version produced by the Bayesian update. The paired `posterior_context_before.md` and `posterior_context_after.md` files keep the posterior summaries for audit/debugging.

docs/method.md

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ The default policy maps posterior state to actions:

 The policy is intentionally small in v0.4. It is designed to be replaced by project-specific policies.

-For the built-in SOP-Bench and Lifelong AgentBench runners, `patch` is not only a label in the posterior audit context. Observed benchmark failure modes are mapped to concrete patch rules and injected into the next prompt under `Bayesian Failure-Mode Patches`. The prompt does not include raw posterior numbers such as `posterior_success`, `alpha`, or `beta`; those stay in `belief_*.json` and `posterior_context_*.md` artifacts. For example, `left_expected_output_blank` adds a CSV writeback verification rule, and `invented_unrequested_column` adds SQL column-use constraints. v0.x records post-patch evidence back to the same benchmark Skill; later releases may split recurring patches into separate child Skill hypotheses.
+For the built-in SOP-Bench and Lifelong AgentBench runners, `patch` is not only a label in the posterior audit context. Observed benchmark failure modes are mapped to concrete patch rules, but they are injected into the next prompt under `Bayesian Failure-Mode Patches` only after the same failure mode has at least two verified occurrences. A single failure remains candidate evidence in `belief_*.json` and `posterior_context_*.md`, which reduces overfitting to one-off mistakes. The prompt does not include raw posterior numbers such as `posterior_success`, `alpha`, or `beta`. For example, repeated `left_expected_output_blank` failures add a CSV writeback verification rule, and repeated `invented_unrequested_column` failures add SQL column-use constraints. v0.x records post-patch evidence back to the same benchmark Skill; later releases may split recurring patches into separate child Skill hypotheses.

 ## Full Mode
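
A minimal sketch of how active failure modes might be turned into the model-facing section described in this hunk. The heading format follows the `### Bayesian Failure-Mode Patches: sop_bench` example quoted in the articles diff. The rule wording in `PATCH_RULES` and the function `render_patch_section` are hypothetical placeholders, since the commit does not show the actual rule text.

```python
from typing import Dict, Mapping

ACTIVE_PATCH_MIN_SUPPORT = 2

# Hypothetical rule text; the real wording is not shown in this commit.
PATCH_RULES: Dict[str, str] = {
    "left_expected_output_blank": "After writing the CSV, re-read it and verify expected_output is filled.",
    "invented_unrequested_column": "Use only columns that the task or schema explicitly names.",
}


def render_patch_section(benchmark: str, failure_counts: Mapping[str, int]) -> str:
    """Render patch rules for failure modes with enough verified support."""
    active = [
        mode
        for mode, count in sorted(failure_counts.items())
        if count >= ACTIVE_PATCH_MIN_SUPPORT and mode in PATCH_RULES
    ]
    if not active:
        return ""  # single failures stay audit-only: nothing reaches the prompt
    lines = [f"### Bayesian Failure-Mode Patches: {benchmark}"]
    lines += [f"- {mode}: {PATCH_RULES[mode]}" for mode in active]
    return "\n".join(lines)


section = render_patch_section(
    "sop_bench",
    {"left_expected_output_blank": 2, "invented_unrequested_column": 1},
)
print(section)
```

Only the failure mode with two occurrences appears in the rendered section. The one seen once is left out of the prompt text and would live only in the audit artifacts.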