Benchmarks

This folder contains all the benchmarks for the project. Each benchmark lives in its own subfolder, which includes the code and data needed to run it.

  • AI Idea Bench 2025: A benchmark for evaluating how well AI systems generate novel, creative, and feasible research ideas.
  • MLE-bench: A benchmark for evaluating how well AI agents perform machine learning engineering tasks.
  • SciCodeBench: A benchmark for scientific code generation and understanding.

To fetch all the benchmark submodules after cloning this repository, run the following command:

git submodule update --init --recursive
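
If only one benchmark is needed, the same command can be limited to a single submodule path. The sketch below wraps that in a small helper; the function name `init_benchmark` and the example path are assumptions for illustration, so check `.gitmodules` for the paths this repository actually uses.

```shell
# Hypothetical helper: initialize and fetch a single benchmark submodule.
# "$1" is the submodule path as listed in .gitmodules (not confirmed by this README).
init_benchmark() {
  git submodule update --init --recursive -- "$1"
}

# Example call (the path is illustrative only):
# init_benchmark MLE-bench
```

Running `git submodule status` afterwards lists every submodule; entries prefixed with `-` have not been initialized yet.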