Skip to content

Commit 7efb64b

Browse files
committed
feat(async-context-compression): release v1.4.0 with structure-aware grouping and session locking
- Introduced Atomic Message Grouping to prevent tool-calling corruption (Issue #56) - Implemented Tail Boundary Alignment for deterministic context truncation - Added per-chat asynchronous session locking to prevent duplicate background tasks - Enhanced summarization traceability with message IDs and names - Synchronized version and changelog across all documentation files - Optimized release-prep skill to remove redundant H1 titles Closes #56
1 parent 2eee7c5 commit 7efb64b

28 files changed

Lines changed: 3544 additions & 290 deletions

.gemini/skills/release-prep/SKILL.md

Lines changed: 19 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -73,11 +73,21 @@ Create two versioned release notes files:
7373
#### Required Sections
7474

7575
Each file must include:
76-
1. **Title**: `# v{version} Release Notes` (EN) / `# v{version} 版本发布说明` (CN)
77-
2. **Overview**: One paragraph summarizing this release
78-
3. **New Features** / **新功能**: Bulleted list of features
79-
4. **Bug Fixes** / **问题修复**: Bulleted list of fixes
80-
5. **Migration Notes** / **迁移说明**: Breaking changes or Valve key renames (omit section if none)
76+
0. **Marketplace Badge**: A prominent button linking to the plugin on openwebui.com using shields.io (e.g., `[![](https://img.shields.io/badge/OpenWebUI%20Community-Get%20Plugin-blue?style=for-the-badge)](URL)`).
77+
1. **Overview Header**: Use `## Overview` as the first header.
78+
2. **Summary Paragraph**: A paragraph summarizing the release. **NEVER** include the version number as a title.
79+
3. **README Link**: Direct link to the plugin's README file on GitHub.
80+
4. **New Features** / **新功能**: Bulleted list of features
81+
5. **Bug Fixes** / **问题修复**: Bulleted list of fixes
82+
6. **Related Issues** / **相关 Issue**: Link to GitHub Issues. **ONLY** include if a specific issue is resolved. **NEVER use placeholders.**
83+
7. **Related PRs** / **相关 PR**: Link to the Pull Request. **ONLY** include if the PR is already created and the ID is known. **NEVER use placeholders.**
84+
8. **Migration Notes**: Breaking changes or Valve key renames (omit section if none)
85+
86+
---
87+
88+
## Language Standard
89+
90+
- **Release Notes Files**: Use **English ONLY** for the final `.md` files to maintain professional consistency on GitHub. Avoid bilingual content in the release description.
8191
6. **Companion Plugins** / **配套插件** (optional): If a companion plugin was updated
8292

8393
If a release notes file already exists for this version, update it rather than creating a new one.
@@ -98,8 +108,10 @@ Generate the commit message following `commit-message.instructions.md` rules:
98108
- **Language**: English ONLY
99109
- **Format**: `type(scope): subject` + blank line + body bullets
100110
- **Scope**: use plugin folder name (e.g., `github-copilot-sdk`)
101-
- **Body**: 1-3 bullets summarizing key changes
102-
- Explicitly mention "READMEs and docs synced" if version was bumped
111+
- **Body**:
112+
- 1-3 bullets summarizing key changes
113+
- Explicitly mention "READMEs and docs synced" if version was bumped
114+
- **MUST** end with `Closes #XX` or `Fixes #XX` if an issue is being resolved.
103115

104116
Present the full commit message to the user for review before executing.
105117

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ A collection of enhancements, plugins, and prompts for [open-webui](https://gith
2727
| 🥈 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | ![v](https://img.shields.io/badge/v-1.5.0-blue?style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
2828
| 🥉 | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![v](https://img.shields.io/badge/v-1.2.7-blue?style=flat) | ![p3_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_dl.json&style=flat) | ![p3_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
2929
| 4️⃣ | [Export to Word Enhanced](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | ![v](https://img.shields.io/badge/v-0.4.4-blue?style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
30-
| 5️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.3.0-blue?style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
30+
| 5️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.4.0-blue?style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--09-gray?style=flat) |
3131
| 6️⃣ | [AI Task Instruction Generator](https://openwebui.com/posts/ai_task_instruction_generator_9bab8b37) | ![v](https://img.shields.io/badge/v-N/A-gray?style=flat) | ![p6_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_dl.json&style=flat) | ![p6_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
3232

3333
### 📈 Total Downloads Trend

README_CN.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ OpenWebUI 增强功能集合。包含个人开发与收集的插件、提示词
2424
| 🥈 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | ![v](https://img.shields.io/badge/v-1.5.0-blue?style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
2525
| 🥉 | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![v](https://img.shields.io/badge/v-1.2.7-blue?style=flat) | ![p3_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_dl.json&style=flat) | ![p3_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
2626
| 4️⃣ | [Export to Word Enhanced](https://openwebui.com/posts/export_to_word_enhanced_formatting_fca6a315) | ![v](https://img.shields.io/badge/v-0.4.4-blue?style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
27-
| 5️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.3.0-blue?style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
27+
| 5️⃣ | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.4.0-blue?style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--09-gray?style=flat) |
2828
| 6️⃣ | [AI Task Instruction Generator](https://openwebui.com/posts/ai_task_instruction_generator_9bab8b37) | ![v](https://img.shields.io/badge/v-N/A-gray?style=flat) | ![p6_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_dl.json&style=flat) | ![p6_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p6_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--03--08-gray?style=flat) |
2929

3030
### 📈 总下载量累计趋势
Lines changed: 124 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,124 @@
1+
# Fix: OpenAI API Error "messages with role 'tool' must be a response to a preceding message with 'tool_calls'"
2+
3+
## Problem Description
4+
In the `async-context-compression` filter, chat history can be trimmed or summarized when the conversation grows. If the retained tail starts in the middle of a native tool-calling sequence, the next request may begin with a `tool` message whose triggering `assistant` message is no longer present.
5+
6+
That produces the OpenAI API error:
7+
`"messages with role 'tool' must be a response to a preceding message with 'tool_calls'"`
8+
9+
## Root Cause
10+
History compression boundaries were not fully aware of atomic tool-call chains. A valid chain may include:
11+
12+
1. An `assistant` message with `tool_calls`
13+
2. One or more `tool` messages
14+
3. An optional assistant follow-up that consumes the tool results
15+
16+
If truncation happens inside that chain, the request sent to the model becomes invalid.
17+
18+
## Solution: Atomic Boundary Alignment
19+
The fix groups tool-call sequences into atomic units and aligns trim boundaries to those groups.
20+
21+
### 1. `_get_atomic_groups()`
22+
This helper groups message indices into units that must be kept or dropped together. It explicitly recognizes native tool-calling patterns such as:
23+
24+
- `assistant(tool_calls)`
25+
- `tool`
26+
- assistant follow-up response
27+
28+
Conceptually, it treats the whole sequence as one atomic block instead of independent messages.
29+
30+
```python
31+
def _get_atomic_groups(self, messages: List[Dict]) -> List[List[int]]:
32+
groups = []
33+
current_group = []
34+
35+
for i, msg in enumerate(messages):
36+
role = msg.get("role")
37+
has_tool_calls = bool(msg.get("tool_calls"))
38+
39+
if role == "assistant" and has_tool_calls:
40+
if current_group:
41+
groups.append(current_group)
42+
current_group = [i]
43+
elif role == "tool":
44+
if not current_group:
45+
groups.append([i])
46+
else:
47+
current_group.append(i)
48+
elif (
49+
role == "assistant"
50+
and current_group
51+
and messages[current_group[-1]].get("role") == "tool"
52+
):
53+
current_group.append(i)
54+
groups.append(current_group)
55+
current_group = []
56+
else:
57+
if current_group:
58+
groups.append(current_group)
59+
current_group = []
60+
groups.append([i])
61+
62+
if current_group:
63+
groups.append(current_group)
64+
65+
return groups
66+
```
67+
68+
### 2. `_align_tail_start_to_atomic_boundary()`
69+
This helper checks whether a proposed trim point falls inside one of those atomic groups. If it does, the start index is moved backward to the beginning of that group.
70+
71+
```python
72+
def _align_tail_start_to_atomic_boundary(
73+
self, messages: List[Dict], raw_start_index: int, protected_prefix: int
74+
) -> int:
75+
aligned_start = max(raw_start_index, protected_prefix)
76+
77+
if aligned_start <= protected_prefix or aligned_start >= len(messages):
78+
return aligned_start
79+
80+
trimmable = messages[protected_prefix:]
81+
local_start = aligned_start - protected_prefix
82+
83+
for group in self._get_atomic_groups(trimmable):
84+
group_start = group[0]
85+
group_end = group[-1] + 1
86+
87+
if local_start == group_start:
88+
return aligned_start
89+
90+
if group_start < local_start < group_end:
91+
return protected_prefix + group_start
92+
93+
return aligned_start
94+
```
95+
96+
### 3. Applied to Tail Retention and Summary Progress
97+
The aligned boundary is now used when rebuilding the retained tail and when calculating how much history can be summarized safely.
98+
99+
Example from the current implementation:
100+
101+
```python
102+
raw_start_index = max(compressed_count, effective_keep_first)
103+
start_index = self._align_tail_start_to_atomic_boundary(
104+
messages, raw_start_index, effective_keep_first
105+
)
106+
tail_messages = messages[start_index:]
107+
```
108+
109+
And during summary progress calculation:
110+
111+
```python
112+
raw_target_compressed_count = max(0, len(messages) - self.valves.keep_last)
113+
target_compressed_count = self._align_tail_start_to_atomic_boundary(
114+
messages, raw_target_compressed_count, effective_keep_first
115+
)
116+
```
117+
118+
## Verification Results
119+
- **First compression boundary**: When history first crosses the compression threshold, the retained tail no longer starts inside a tool-call block.
120+
- **Complex sessions**: Real-world testing with 30+ messages, multiple tool calls, and failed calls remained stable during background summarization.
121+
- **Regression behavior**: The filter now prefers a valid boundary even if that means retaining slightly more context than a naive raw slice would allow.
122+
123+
## Conclusion
124+
The fix prevents orphaned `tool` messages by making history trimming and summary progress aware of atomic tool-call groups. This eliminates the 400 error during long conversations and background compression.
Lines changed: 126 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,126 @@
1+
# 修复:OpenAI API 错误 "messages with role 'tool' must be a response to a preceding message with 'tool_calls'"
2+
3+
## 问题描述
4+
`async-context-compression` 过滤器中,当对话历史变长时,系统会对消息进行裁剪或摘要。如果保留下来的尾部历史恰好从一个原生工具调用序列的中间开始,那么下一次请求就可能以一条 `tool` 消息开头,而触发它的 `assistant` 消息已经被裁掉。
5+
6+
这就会触发 OpenAI API 的错误:
7+
`"messages with role 'tool' must be a response to a preceding message with 'tool_calls'"`
8+
9+
## 根本原因
10+
11+
真正的缺陷在于历史压缩边界没有完整识别工具调用链的“原子性”。一个合法的工具调用链通常包括:
12+
13+
1. 一条带有 `tool_calls``assistant` 消息
14+
2. 一条或多条 `tool` 消息
15+
3. 一条可选的 assistant 跟进回复,用于消费工具结果
16+
17+
如果裁剪点落在这段链条内部,发给模型的消息序列就会变成非法格式。
18+
19+
## 解决方案:对齐原子边界
20+
修复通过把工具调用序列分组为原子单元,并使裁剪边界对齐到这些单元。
21+
22+
### 1. `_get_atomic_groups()`
23+
这个辅助函数会把消息索引分组为“必须一起保留或一起丢弃”的原子单元。它显式识别以下原生工具调用模式:
24+
25+
- `assistant(tool_calls)`
26+
- `tool`
27+
- assistant 跟进回复
28+
29+
也就是说,它不再把这些消息看成彼此独立的单条消息,而是把整段序列视为一个原子块。
30+
31+
```python
32+
def _get_atomic_groups(self, messages: List[Dict]) -> List[List[int]]:
33+
groups = []
34+
current_group = []
35+
36+
for i, msg in enumerate(messages):
37+
role = msg.get("role")
38+
has_tool_calls = bool(msg.get("tool_calls"))
39+
40+
if role == "assistant" and has_tool_calls:
41+
if current_group:
42+
groups.append(current_group)
43+
current_group = [i]
44+
elif role == "tool":
45+
if not current_group:
46+
groups.append([i])
47+
else:
48+
current_group.append(i)
49+
elif (
50+
role == "assistant"
51+
and current_group
52+
and messages[current_group[-1]].get("role") == "tool"
53+
):
54+
current_group.append(i)
55+
groups.append(current_group)
56+
current_group = []
57+
else:
58+
if current_group:
59+
groups.append(current_group)
60+
current_group = []
61+
groups.append([i])
62+
63+
if current_group:
64+
groups.append(current_group)
65+
66+
return groups
67+
```
68+
69+
### 2. `_align_tail_start_to_atomic_boundary()`
70+
这个辅助函数会检查一个拟定的裁剪起点是否落在某个原子块内部。如果是,它会把起点向前回退到该原子块的开头位置。
71+
72+
```python
73+
def _align_tail_start_to_atomic_boundary(
74+
self, messages: List[Dict], raw_start_index: int, protected_prefix: int
75+
) -> int:
76+
aligned_start = max(raw_start_index, protected_prefix)
77+
78+
if aligned_start <= protected_prefix or aligned_start >= len(messages):
79+
return aligned_start
80+
81+
trimmable = messages[protected_prefix:]
82+
local_start = aligned_start - protected_prefix
83+
84+
for group in self._get_atomic_groups(trimmable):
85+
group_start = group[0]
86+
group_end = group[-1] + 1
87+
88+
if local_start == group_start:
89+
return aligned_start
90+
91+
if group_start < local_start < group_end:
92+
return protected_prefix + group_start
93+
94+
return aligned_start
95+
```
96+
97+
### 3. 应用于尾部保留和摘要进度计算
98+
这个对齐后的边界现在被用于重建保留尾部消息,以及计算可以安全摘要的历史范围。
99+
100+
当前实现中的示例:
101+
102+
```python
103+
raw_start_index = max(compressed_count, effective_keep_first)
104+
start_index = self._align_tail_start_to_atomic_boundary(
105+
messages, raw_start_index, effective_keep_first
106+
)
107+
tail_messages = messages[start_index:]
108+
```
109+
110+
在摘要进度计算中同样如此:
111+
112+
```python
113+
raw_target_compressed_count = max(0, len(messages) - self.valves.keep_last)
114+
target_compressed_count = self._align_tail_start_to_atomic_boundary(
115+
messages, raw_target_compressed_count, effective_keep_first
116+
)
117+
```
118+
119+
## 验证结果
120+
121+
- **首次压缩边界**:当历史第一次越过压缩阈值时,保留尾部不再从工具调用块中间开始。
122+
- **复杂会话验证**:在 30+ 条消息、多个工具调用和失败调用的真实场景下,后台摘要过程保持稳定。
123+
- **回归行为更安全**:过滤器现在会优先选择合法边界,即使这意味着比原始的朴素切片稍微多保留一点上下文。
124+
125+
## 结论
126+
通过让历史裁剪与摘要进度计算具备"工具调用原子块感知"能力,避免孤立的 `tool` 消息出现,消除长对话与后台压缩期间的 400 错误。

0 commit comments

Comments
 (0)