Skip to content

test(langchain): add unit tests for AgentMiddleware — core runtime has no test coverage #27

@caoergou

Description

@caoergou

Problem

AgentMiddleware is the core runtime of UniHarness (libs/uniharness/uniharness/langchain/middleware.py). It coordinates the full pre-model pipeline including:

  • Compaction 3-phase state machine (NONE → REQUESTING → APPLYING → NONE)
  • Permission gating (awrap_tool_call with PermissionGate)
  • Skill injection (detects Skill tool calls, appends skill content)
  • Image extraction for non-Anthropic providers (_extract_tool_images)
  • System reminder rules (annotates the last message before model)

However, the tests/unit_tests/middleware/ directory only contains an empty __init__.py with no tests whatsoever:

```
tests/unit_tests/middleware/
└── init.py # empty
```

For comparison, other modules all have accompanying tests:

  • harness/tests/unit_tests/harness/
  • tools/tests/unit_tests/tools/
  • mcp/tests/unit_tests/mcp/

This is a significant gap given that AgentMiddleware is the most critical runtime component.

Suggested Test Coverage

The following behaviors should be covered:

  1. abefore_agent system prompt injection (idempotency) — injects SystemMessage on first turn; does not duplicate if already present
  2. abefore_model compaction REQUESTING phase — appends compaction prompt without touching existing messages
  3. _extract_tool_images — extracts image blocks from ToolMessage into follow-up HumanMessage; returns None when no images; skips already-processed messages (idempotency via _IMAGE_EXTRACTED marker)
  4. _detect_skill_call — correctly identifies the most recent Skill tool call; returns None when a user message already follows (already injected)
  5. _supports_tool_images — returns True for ChatAnthropic, False for other models
  6. awrap_tool_call DENIED — returns error ToolMessage, does not call the handler
  7. awrap_tool_call ALLOWED — calls through to the handler and returns its result
  8. aafter_model compaction trigger — triggers when token count exceeds threshold

Environment

  • libs/uniharness/uniharness/langchain/middleware.py — the module under test
  • Test infra: pytest-asyncio with asyncio_mode = "auto", mypy --strict, ruff
  • Pattern: test_<action>_<condition>_<expected_result>

I'd be happy to submit a PR adding these tests if this direction looks good.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions