### Evaluation short description
AIME26 is the most recent AIME-style math reasoning benchmark, used to evaluate LLMs on competition-level problems that require multi-step reasoning and an exact final answer. Like earlier AIME versions, it is widely used in the community as a standard reference benchmark for mathematical reasoning.
### Evaluation metadata
- Paper URL: N/A (AIME is an AMC competition, not a published benchmark paper)
- GitHub URL: N/A
- Dataset URL: https://huggingface.co/datasets/EleutherAI/aime_2024 (AIME-style datasets; AIME26 variants are also available in community repos)
Hi LightEval team,
Does LightEval currently support evaluating models on AIME26?
If not, is there a recommended way to add it as a custom task, or any plan to support it officially?
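For context, here is a minimal sketch of what I imagine the custom-task route would look like, following the community-task pattern in the repo. The dataset repo name (`your-org/aime_2026`), the column names (`problem`, `answer`), and the metric choice are assumptions on my part, not a working config:

```python
# aime26_task.py -- minimal sketch of a LightEval community/custom task.
# Dataset repo, subset, and column names below are placeholders.
from lighteval.metrics.metrics import Metrics
from lighteval.tasks.lighteval_task import LightevalTaskConfig
from lighteval.tasks.requests import Doc


def aime26_prompt(line, task_name: str = None) -> Doc:
    # Map one dataset row to a Doc. AIME answers are integers in 0-999,
    # so exact-match-style scoring against the gold string seems reasonable.
    return Doc(
        task_name=task_name,
        query=f"Problem: {line['problem']}\nAnswer:",
        choices=[str(line["answer"])],
        gold_index=0,
    )


aime26 = LightevalTaskConfig(
    name="aime26",
    prompt_function=aime26_prompt,
    suite=["community"],
    hf_repo="your-org/aime_2026",  # hypothetical dataset repo
    hf_subset="default",
    hf_avail_splits=["train"],
    evaluation_splits=["train"],
    generation_size=32768,  # leave headroom for long chain-of-thought
    metric=[Metrics.quasi_exact_match],  # or a math-specific extraction metric?
    stop_sequence=None,
)

# LightEval discovers custom tasks through this module-level table.
TASKS_TABLE = [aime26]
```

The idea would then be to pass this file via `--custom-tasks` and select the task under the `community` suite (exact CLI syntax seems to vary across LightEval versions). Is that the recommended approach, or is there a better path?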
Thanks!