This is a draft repository for OpenSpec‑AE, a specification for vendor‑agnostic AI agent evaluation.
The lethal trifecta is what happens when you ship a chatbot and call it an agent.
The idea is to define an open standard for platforms where AI engineers can develop and test their AI agents. The spec enables the basic interoperability and standardisation that an ecosystem needs.
This spec targets community research and collaboration to push the state of the art. To achieve that:
- all benchmarks are required to be publicly shareable, favouring synthetic data or data that is already publicly available;
- all benchmark runs against the platform are recorded and made available for everybody to analyse, gain insights from, and use to improve their own agents.