Skip to content

Conversation

@DeepeshWR
Copy link

This PR removes the large benchmarks/ directory (>378 MB) from the main Enzyme repository
and recommends users clone it from a new standalone repository:
https://github.com/EnzymeAD/enzyme-benchmarks

  • The benchmarks folder takes up ~378 MB.

  • It is not needed for most users building Enzyme.

  • Moving it out improves clone speed and CI efficiency.

  • Removed benchmarks/ directory.

  • Updated CMakeLists.txt to display an informative message if the folder is missing.

The benchmarks will be moved to a new repository:
https://github.com/EnzymeAD/enzyme-benchmarks

This repository can be cloned into enzyme/benchmarks/ for benchmarking.

Prior Discussion : #2309

@ZuseZ4
Copy link
Collaborator

ZuseZ4 commented Oct 27, 2025

Thanks. Do you also have commands to clone this as standalone and run benchmarks when one has an llvm+enzyme build folder already somewhere else? It might be that this is too hard, I don't know how tight the cmake integration is.

cc: @tgymnich

@tgymnich
Copy link
Member

tgymnich commented Oct 27, 2025

@DeepeshWR Good idea. Does this work with Bazel? I went ahead and created a blank repo for the benchmarks: https://github.com/EnzymeAD/enzyme-benchmarks

if (ENZYME_ENABLE_PLUGINS AND (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"))
add_subdirectory(benchmarks)
else()
message(STATUS "Benchmarks not found. To include them, clone https://github.com/EnzymeAD/enzyme-benchmarks into the Enzyme source tree.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of asking to clone, can we use something like

ExternalProject_Add(gsl

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-if (ENZYME_ENABLE_PLUGINS AND (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"))
-    add_subdirectory(benchmarks)
+if (ENZYME_ENABLE_PLUGINS)
+    if (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks")
+        # Use existing benchmarks in the source tree
+        add_subdirectory(benchmarks)
+    elseif (ENZYME_RUN_BENCHMARKS)
+        execute_process(
+            COMMAND git clone https://github.com/DeepeshWR/benchmarks.git "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"
+            RESULT_VARIABLE result
+        )
+        if(NOT result EQUAL 0)
+            message(FATAL_ERROR "Failed to clone benchmarks repository")
+        endif()
+        add_subdirectory(benchmarks)
+    else()
+       message(STATUS "Benchmarks not found. To include them, clone https://github.com/DeepeshWR/benchmarks.git into the EnzymE source tree, or reconfigure with -DENZYME_RUN_BENCHMARKS=ON.")
+    endif()
 endif()

Implemented the changes as described above, ensuring that nothing is modified except for enabling the "-DENZYME_RUN_BENCHMARKS=ON" flag to run benchmarks. ExternalProject_Add() functions differently by cloning and making the benchmark sources available only during the Enzyme build. Consequently, using add_subdirectory(benchmarks) would fail at configuration time. This approach allows us to include the benchmarks while preserving the original behaviour of how they were configured and built in Enzyme’s CMakeLists.txt. For reference, I have pushed the benchmark sources to my GitHub. Please let me know if this approach is acceptable to you.

[Testing]
cd /path/to/Enzyme/enzyme
rmdir benchmarks
mkdir build && cd build
cmake -G Ninja .. -DLLVM_DIR=/path/to/llvm/lib/cmake/llvm -DLLVM_EXTERNAL_LIT=/path/to/lit/lit.py -DENZYME_ENABLE_PLUGINS=ON -DENZYME_RUN_BENCHMARKS=ON
ninja

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be instead preferable to move clear the soruce out of the benchmarks/CMakeLists.txt dir, and add another ExternalProejct add to get the source / input data instead [that cmake already has externalproject adds for the other tools].

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-# The benchmarks data are not in git-exported source archives to minimize size.
-# Only add the benchmarks if the directory exists.
-if (ENZYME_ENABLE_PLUGINS AND (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"))
-    add_subdirectory(benchmarks)
+# Benchmarks have been moved to a separate repository to reduce repo size.
+# OPTIONAL: To clone and build benchmarks "-DENZYME_RUN_BENCHMARKS=ON"
+if (ENZYME_ENABLE_PLUGINS AND ENZYME_RUN_BENCHMARKS)
+    include(ExternalProject)
+
+    message(STATUS "Configuring Enzyme benchmarks via ExternalProject_Add")
+
+    set(BENCH_SRC_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/src)
+    set(BENCH_BUILD_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/build)
+    set(BENCH_INSTALL_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/install)
+    set(BENCH_STAMP_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/stamp)
+    file(MAKE_DIRECTORY ${BENCH_SRC_DIR})
+    file(MAKE_DIRECTORY ${BENCH_BUILD_DIR})
+    file(MAKE_DIRECTORY ${BENCH_INSTALL_DIR})
+    file(MAKE_DIRECTORY ${BENCH_STAMP_DIR})
+
+    ExternalProject_Add(EnzymeBenchmarks
+        GIT_REPOSITORY https://github.com/DeepeshWR/benchmarks.git
+        GIT_TAG main
+        SOURCE_DIR ${BENCH_SRC_DIR}
+        BINARY_DIR ${BENCH_BUILD_DIR}
+        INSTALL_DIR ${BENCH_INSTALL_DIR}
+       STAMP_DIR ${BENCH_STAMP_DIR}
+
+        CMAKE_ARGS
+            -DLLVM_DIR=${LLVM_DIR}
+           -DLLVM_VERSION_MAJOR=${LLVM_VERSION_MAJOR}
+            -DLLVM_EXTERNAL_LIT=${LLVM_EXTERNAL_LIT}
+            -DENZYME_ENABLE_PLUGINS=${ENZYME_ENABLE_PLUGINS}
+           -DENZYME_PLUGIN_PATH=${CMAKE_CURRENT_BINARY_DIR}/Enzyme/LLVMEnzyme-${LLVM_VERSION_MAJOR}${CMAKE_SHARED_LIBRARY_SUFFIX}
+
+        BUILD_COMMAND ${CMAKE_COMMAND} --build ${BENCH_BUILD_DIR} --target bench-enzyme
+        INSTALL_COMMAND ""
+        UPDATE_COMMAND ""
+        TEST_COMMAND ""
+    )
+
+    add_dependencies(EnzymeBenchmarks LLVMEnzyme-${LLVM_VERSION_MAJOR})
+
+    set_target_properties(EnzymeBenchmarks PROPERTIES EXCLUDE_FROM_ALL TRUE)
 endif()

Is my approach is fine or am I overlooking something ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if anyone has had the opportunity to review the above patch? Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gentle reminder on this PR.

@wsmoses
Copy link
Member

wsmoses commented Oct 27, 2025

existing benchmark setup doesnt use bazel iirc, so don't worry about that for now

@wsmoses
Copy link
Member

wsmoses commented Oct 27, 2025

Thanks. Do you also have commands to clone this as standalone and run benchmarks when one has an llvm+enzyme build folder already somewhere else? It might be that this is too hard, I don't know how tight the cmake integration is.

cc: @tgymnich

this can be done as a follow up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants