This folder contains all the benchmarks for the project. Each benchmark lives in its own subfolder as a git submodule and includes the code and data needed to run it.
- AI Idea Bench 2025: A benchmark for evaluating AI systems on generating novel, creative, and feasible research ideas.
- MLE-bench: A benchmark for evaluating AI agents on machine learning engineering tasks drawn from Kaggle competitions.
- SciCodeBench: A benchmark for scientific code generation and understanding.
To fetch all benchmark submodules after cloning this repository, run:

```shell
git submodule update --init --recursive
```
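Alternatively, the submodules can be fetched in a single step at clone time. This is a sketch; `<repo-url>` is a placeholder for this repository's URL, which is not stated here.

```shell
# Clone the repository and initialize all benchmark submodules in one step.
# Replace <repo-url> with the actual URL of this repository.
git clone --recurse-submodules <repo-url>
```

If a submodule later falls out of date, re-running `git submodule update --init --recursive` inside the checkout brings it back in sync.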