Add Pydantic AI tool-call schema validator #240

## Why

"Wrong tool args" is one of the top regression modes for tool-calling agents: the right tool is picked but called with a bad payload, and without validation the diff just looks like an output change.

Pydantic AI already declares typed tool schemas. We can use those schemas at diff time to flag "the tool was called with arguments that don't validate" as a first-class regression class, separate from `TOOLS_CHANGED`.
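A minimal sketch of what the diff-time check could look like, assuming the tool's parameters are available as a Pydantic model. `SearchArgs` and `validate_tool_args` are hypothetical names used only for illustration. They are not part of EvalView or Pydantic AI. The only real APIs used are Pydantic v2's `BaseModel`, `ValidationError`, and `model_validate`.

```python
from typing import Any

from pydantic import BaseModel, ValidationError


class SearchArgs(BaseModel):
    """Hypothetical argument schema for a `search` tool."""

    query: str
    limit: int = 10


def validate_tool_args(schema: type[BaseModel], args: dict[str, Any]) -> list[str]:
    """Return readable validation errors; an empty list means the args validate."""
    try:
        schema.model_validate(args)
    except ValidationError as exc:
        # Each error dict carries a field path ("loc") and a message ("msg").
        return [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
    return []
```

Note that Pydantic's default (lax) mode coerces compatible values such as `"3"` to `3`, so whether coerced args count as valid is a decision the evaluator would need to make explicitly.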

## What

Extend the Pydantic AI adapter (`evalview/adapters/pydantic_ai_adapter.py`) and the tool-call evaluator to surface schema-validation failures as a distinct reason code.

## Acceptance criteria

- [ ] Tool-call evaluator validates captured args against the Pydantic schema when available
- [ ] New `ReasonCode` (or extension of an existing one) for "tool args failed schema validation"
- [ ] Severity ranking documented (where does it sit vs. `TOOLS_CHANGED` and `REGRESSION`?)
- [ ] Test in `tests/evaluators/` covering: valid args pass, invalid args flagged
- [ ] Docs updated in the evaluators section
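To make the reason-code and severity criteria concrete, here is one possible shape, offered as a sketch rather than a decision. The `ReasonCode` below is a stand-in: the real enum in `evalview/core/types.py` may look different, and whether `TOOLS_CHANGED` and `REGRESSION` are members of that enum is not confirmed here. `TOOL_ARGS_INVALID`, `SEVERITY`, and `most_severe` are hypothetical names, and the ranking itself is a placeholder to be debated.

```python
from enum import Enum


class ReasonCode(str, Enum):
    """Stand-in for the real enum in evalview/core/types.py (shape assumed)."""

    TOOLS_CHANGED = "tools_changed"
    TOOL_ARGS_INVALID = "tool_args_invalid"  # hypothetical new code
    REGRESSION = "regression"


# Placeholder ranking: a higher number means more severe. Placing
# TOOL_ARGS_INVALID between the other two is an assumption, not a decision.
SEVERITY = {
    ReasonCode.TOOLS_CHANGED: 1,
    ReasonCode.TOOL_ARGS_INVALID: 2,
    ReasonCode.REGRESSION: 3,
}


def most_severe(codes: list[ReasonCode]) -> ReasonCode:
    """Pick the highest-ranked reason code from a non-empty list."""
    return max(codes, key=SEVERITY.__getitem__)
```

Keeping the ranking in a single table like this would make the documented ordering and the implemented ordering easy to compare in review.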

## Hints

- `evalview/evaluators/` has the orchestrator and per-eval modules.
- `evalview/core/types.py` is where `ReasonCode` lives.
- Keep it Pydantic-AI-specific for now; we can generalize to a `ToolSchemaProvider` ABC in a later PR if other adapters want to opt in.
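For reference, the later generalization mentioned in the last hint might look roughly like the sketch below. Only the name `ToolSchemaProvider` comes from this issue. The method name `get_tool_schema`, its return type, and `StaticSchemaProvider` are assumptions for illustration, and none of this is in scope for this PR.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class ToolSchemaProvider(ABC):
    """Hypothetical interface an adapter could implement to expose tool schemas."""

    @abstractmethod
    def get_tool_schema(self, tool_name: str) -> Optional[dict[str, Any]]:
        """Return a JSON-Schema-style dict for the tool, or None if unknown."""


class StaticSchemaProvider(ToolSchemaProvider):
    """Toy implementation backed by a fixed mapping, useful in tests."""

    def __init__(self, schemas: dict[str, dict[str, Any]]) -> None:
        # Copy so later mutation of the caller's dict does not leak in.
        self._schemas = dict(schemas)

    def get_tool_schema(self, tool_name: str) -> Optional[dict[str, Any]]:
        return self._schemas.get(tool_name)
```

Returning `None` for unknown tools would let the evaluator skip validation when no schema is available, which matches the "when available" wording in the acceptance criteria.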

## Size

~2-3 hours.
