test(langchain): add unit tests for AgentMiddleware — core runtime has no test coverage

## Problem

`AgentMiddleware` is the core runtime of UniHarness (`libs/uniharness/uniharness/langchain/middleware.py`). It coordinates the full pre-model pipeline including:

- **Compaction 3-phase state machine** (`NONE → REQUESTING → APPLYING → NONE`)
- **Permission gating** (`awrap_tool_call` with `PermissionGate`)
- **Skill injection** (detects Skill tool calls, appends skill content)
- **Image extraction** for non-Anthropic providers (`_extract_tool_images`)
- **System reminder rules** (annotates the last message before model)

However, the `tests/unit_tests/middleware/` directory only contains an empty `__init__.py` with **no tests whatsoever**:

\`\`\`
tests/unit_tests/middleware/
└── __init__.py   # empty
\`\`\`

For comparison, other modules all have accompanying tests:
- `harness/` → `tests/unit_tests/harness/`
- `tools/` → `tests/unit_tests/tools/`
- `mcp/` → `tests/unit_tests/mcp/`

This is a significant gap given that `AgentMiddleware` is the most critical runtime component.

## Suggested Test Coverage

The following behaviors should be covered:

1. **`abefore_agent` system prompt injection (idempotency)** — injects `SystemMessage` on first turn; does not duplicate if already present
2. **`abefore_model` compaction REQUESTING phase** — appends compaction prompt without touching existing messages
3. **`_extract_tool_images`** — extracts image blocks from `ToolMessage` into follow-up `HumanMessage`; returns `None` when no images; skips already-processed messages (idempotency via `_IMAGE_EXTRACTED` marker)
4. **`_detect_skill_call`** — correctly identifies the most recent Skill tool call; returns `None` when a user message already follows (already injected)
5. **`_supports_tool_images`** — returns `True` for `ChatAnthropic`, `False` for other models
6. **`awrap_tool_call` DENIED** — returns error `ToolMessage`, does not call the handler
7. **`awrap_tool_call` ALLOWED** — calls through to the handler and returns its result
8. **`aafter_model` compaction trigger** — triggers when token count exceeds threshold

## Environment

- `libs/uniharness/uniharness/langchain/middleware.py` — the module under test
- Test infra: `pytest-asyncio` with `asyncio_mode = "auto"`, `mypy --strict`, `ruff`
- Pattern: `test_<action>_<condition>_<expected_result>`

I'd be happy to submit a PR adding these tests if this direction looks good.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(langchain): add unit tests for AgentMiddleware — core runtime has no test coverage #27

Problem

Suggested Test Coverage

Environment

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

test(langchain): add unit tests for AgentMiddleware — core runtime has no test coverage #27

Description

Problem

Suggested Test Coverage

Environment

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions