BREAKING CHANGE: Make `EvaluationReport` and `ReportCase` into generic dataclasses #1799

dmontagu · 2025-05-21T15:29:35Z

Right now, these types are (non-generic) BaseModels. We received a request in the public slack to have these types be generic. As I noted there, it's much more difficult to do that with BaseModel than with a dataclass, so I converted the types into dataclasses and made things generic.

I'm not 100% convinced yet that we should make this change (though I'd be easily convinced if other maintainers consider the consequences and agree with the change), but the changes in this PR make these classes generic so you can have type-checked access to the report case inputs, outputs, and metadata.

The main downside (and reason this is a breaking change) is the loss of model_* methods on EvaluationReport and ReportCase. I introduced EvaluationReportAdapter and ReportCaseAdapter to make it somewhat easier to serialize these things on your own, but either way this is definitely less ergonomic than BaseModel, it's just harder to deal with generics properly in this way with a generic BaseModel subclass. (Generic BaseModels are more designed to reduce boilerplate when making many concrete instantiations of a parametrized type, and not designed for more type-theoretic applications like what is happening here.)

dmontagu · 2025-05-21T15:33:14Z

pydantic_evals/pydantic_evals/reporting/__init__.py

+    scores: dict[str, EvaluationResult[int | float]]
+    labels: dict[str, EvaluationResult[str]]
+    assertions: dict[str, EvaluationResult[bool]]


The = field(init=False) currently on these fields is just a straight-up mistake in the current implementation (maybe it was supposed to be there at some point, but I'm 99% sure it shouldn't be there now).

I think the only reason it didn't cause issues was because these objects are generally created by us, and in a way that it was never created without explicitly specifying the value. (Because this is currently a BaseModel, the =field(init=False) doesn't remove the field from the __init__.)

github-actions · 2025-05-21T15:37:26Z

Docs Preview

commit:	`6b499f5`
Preview URL:	https://de8356b4-pydantic-ai-previews.pydantic.workers.dev

DouweM · 2025-05-21T16:53:55Z

@dmontagu To not lose the model_dump[_json] methods and maintain backward compatibility (at least of the BaseModel bits we think people are likely to use), would it be crazy to implement them explicitly on the classes in question, delegating to the TypeAdapter?

I could imagine a decorator that works on any dataclass and does just that -- that may be useful more generally to get consistent validation/serialization APIs between BaseModels and dataclasses.

dmontagu · 2025-05-21T17:37:24Z

I thought about that but the reason I didn't do it is because it won't be type-aware (well, specifically, aware of the generic parameters involved; they will always be Any), so may end up with some surprising behavior. Admittedly it would be compatible with the way things work now, but I also feel it's a bit confusing to have model_X methods on a dataclass, it might make you think it's not actually a dataclass. Given we aren't in v1 yet and most people aren't manually serializing the reports, my feeling is it's not unreasonable to make a breaking change here and just make it a proper dataclass now.

If people are making serious use of model_dump etc. on the ReportCase / EvaluationReport classes then I might feel differently, but I would be surprised if that was the case and I'm not sure we have a good way to determine that anyway (short of user feedback, which I assume we won't get until breaking their code 😞).

Make reports generic

6b499f5

dmontagu commented May 21, 2025

View reviewed changes

dmontagu requested a review from DouweM May 21, 2025 16:25

dmontagu assigned DouweM May 21, 2025

DouweM assigned dmontagu and unassigned DouweM May 21, 2025

DouweM approved these changes May 21, 2025

View reviewed changes

DouweM added the breaking change label May 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

BREAKING CHANGE: Make `EvaluationReport` and `ReportCase` into generic dataclasses #1799

BREAKING CHANGE: Make `EvaluationReport` and `ReportCase` into generic dataclasses #1799

Uh oh!

dmontagu commented May 21, 2025

Uh oh!

dmontagu May 21, 2025

Uh oh!

github-actions bot commented May 21, 2025

Uh oh!

DouweM commented May 21, 2025

Uh oh!

dmontagu commented May 21, 2025 •

edited

Loading

Uh oh!

Uh oh!

BREAKING CHANGE: Make EvaluationReport and ReportCase into generic dataclasses #1799

Are you sure you want to change the base?

BREAKING CHANGE: Make EvaluationReport and ReportCase into generic dataclasses #1799

Uh oh!

Conversation

dmontagu commented May 21, 2025

Uh oh!

dmontagu May 21, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented May 21, 2025

Docs Preview

Uh oh!

DouweM commented May 21, 2025

Uh oh!

dmontagu commented May 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

BREAKING CHANGE: Make `EvaluationReport` and `ReportCase` into generic dataclasses #1799

BREAKING CHANGE: Make `EvaluationReport` and `ReportCase` into generic dataclasses #1799

dmontagu commented May 21, 2025 •

edited

Loading