Problem
AgentMiddleware is the core runtime of UniHarness (libs/uniharness/uniharness/langchain/middleware.py). It coordinates the full pre-model pipeline including:
- Compaction 3-phase state machine (
NONE → REQUESTING → APPLYING → NONE)
- Permission gating (
awrap_tool_call with PermissionGate)
- Skill injection (detects Skill tool calls, appends skill content)
- Image extraction for non-Anthropic providers (
_extract_tool_images)
- System reminder rules (annotates the last message before model)
However, the tests/unit_tests/middleware/ directory only contains an empty __init__.py with no tests whatsoever:
```
tests/unit_tests/middleware/
└── init.py # empty
```
For comparison, other modules all have accompanying tests:
harness/ → tests/unit_tests/harness/
tools/ → tests/unit_tests/tools/
mcp/ → tests/unit_tests/mcp/
This is a significant gap given that AgentMiddleware is the most critical runtime component.
Suggested Test Coverage
The following behaviors should be covered:
abefore_agent system prompt injection (idempotency) — injects SystemMessage on first turn; does not duplicate if already present
abefore_model compaction REQUESTING phase — appends compaction prompt without touching existing messages
_extract_tool_images — extracts image blocks from ToolMessage into follow-up HumanMessage; returns None when no images; skips already-processed messages (idempotency via _IMAGE_EXTRACTED marker)
_detect_skill_call — correctly identifies the most recent Skill tool call; returns None when a user message already follows (already injected)
_supports_tool_images — returns True for ChatAnthropic, False for other models
awrap_tool_call DENIED — returns error ToolMessage, does not call the handler
awrap_tool_call ALLOWED — calls through to the handler and returns its result
aafter_model compaction trigger — triggers when token count exceeds threshold
Environment
libs/uniharness/uniharness/langchain/middleware.py — the module under test
- Test infra:
pytest-asyncio with asyncio_mode = "auto", mypy --strict, ruff
- Pattern:
test_<action>_<condition>_<expected_result>
I'd be happy to submit a PR adding these tests if this direction looks good.
Problem
AgentMiddlewareis the core runtime of UniHarness (libs/uniharness/uniharness/langchain/middleware.py). It coordinates the full pre-model pipeline including:NONE → REQUESTING → APPLYING → NONE)awrap_tool_callwithPermissionGate)_extract_tool_images)However, the
tests/unit_tests/middleware/directory only contains an empty__init__.pywith no tests whatsoever:```
tests/unit_tests/middleware/
└── init.py # empty
```
For comparison, other modules all have accompanying tests:
harness/→tests/unit_tests/harness/tools/→tests/unit_tests/tools/mcp/→tests/unit_tests/mcp/This is a significant gap given that
AgentMiddlewareis the most critical runtime component.Suggested Test Coverage
The following behaviors should be covered:
abefore_agentsystem prompt injection (idempotency) — injectsSystemMessageon first turn; does not duplicate if already presentabefore_modelcompaction REQUESTING phase — appends compaction prompt without touching existing messages_extract_tool_images— extracts image blocks fromToolMessageinto follow-upHumanMessage; returnsNonewhen no images; skips already-processed messages (idempotency via_IMAGE_EXTRACTEDmarker)_detect_skill_call— correctly identifies the most recent Skill tool call; returnsNonewhen a user message already follows (already injected)_supports_tool_images— returnsTrueforChatAnthropic,Falsefor other modelsawrap_tool_callDENIED — returns errorToolMessage, does not call the handlerawrap_tool_callALLOWED — calls through to the handler and returns its resultaafter_modelcompaction trigger — triggers when token count exceeds thresholdEnvironment
libs/uniharness/uniharness/langchain/middleware.py— the module under testpytest-asynciowithasyncio_mode = "auto",mypy --strict,rufftest_<action>_<condition>_<expected_result>I'd be happy to submit a PR adding these tests if this direction looks good.