-
Notifications
You must be signed in to change notification settings - Fork 145
Move benchmarks to separate repository to reduce repo size #2504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Deepesh Varatharajan <[email protected]>
|
Thanks. Do you also have commands to clone this as standalone and run benchmarks when one has an llvm+enzyme build folder already somewhere else? It might be that this is too hard, I don't know how tight the cmake integration is. cc: @tgymnich |
|
@DeepeshWR Good idea. Does this work with Bazel? I went ahead and created a blank repo for the benchmarks: https://github.com/EnzymeAD/enzyme-benchmarks |
| if (ENZYME_ENABLE_PLUGINS AND (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks")) | ||
| add_subdirectory(benchmarks) | ||
| else() | ||
| message(STATUS "Benchmarks not found. To include them, clone https://github.com/EnzymeAD/enzyme-benchmarks into the Enzyme source tree.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of asking to clone, can we use something like
Enzyme/enzyme/BCLoad/CMakeLists.txt
Line 23 in aecafad
| ExternalProject_Add(gsl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-if (ENZYME_ENABLE_PLUGINS AND (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"))
- add_subdirectory(benchmarks)
+if (ENZYME_ENABLE_PLUGINS)
+ if (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks")
+ # Use existing benchmarks in the source tree
+ add_subdirectory(benchmarks)
+ elseif (ENZYME_RUN_BENCHMARKS)
+ execute_process(
+ COMMAND git clone https://github.com/DeepeshWR/benchmarks.git "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"
+ RESULT_VARIABLE result
+ )
+ if(NOT result EQUAL 0)
+ message(FATAL_ERROR "Failed to clone benchmarks repository")
+ endif()
+ add_subdirectory(benchmarks)
+ else()
+ message(STATUS "Benchmarks not found. To include them, clone https://github.com/DeepeshWR/benchmarks.git into the EnzymE source tree, or reconfigure with -DENZYME_RUN_BENCHMARKS=ON.")
+ endif()
endif()
Implemented the changes as described above, ensuring that nothing is modified except for enabling the "-DENZYME_RUN_BENCHMARKS=ON" flag to run benchmarks. ExternalProject_Add() functions differently by cloning and making the benchmark sources available only during the Enzyme build. Consequently, using add_subdirectory(benchmarks) would fail at configuration time. This approach allows us to include the benchmarks while preserving the original behaviour of how they were configured and built in Enzyme’s CMakeLists.txt. For reference, I have pushed the benchmark sources to my GitHub. Please let me know if this approach is acceptable to you.
[Testing]
cd /path/to/Enzyme/enzyme
rmdir benchmarks
mkdir build && cd build
cmake -G Ninja .. -DLLVM_DIR=/path/to/llvm/lib/cmake/llvm -DLLVM_EXTERNAL_LIT=/path/to/lit/lit.py -DENZYME_ENABLE_PLUGINS=ON -DENZYME_RUN_BENCHMARKS=ON
ninja
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be instead preferable to move clear the soruce out of the benchmarks/CMakeLists.txt dir, and add another ExternalProejct add to get the source / input data instead [that cmake already has externalproject adds for the other tools].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-# The benchmarks data are not in git-exported source archives to minimize size.
-# Only add the benchmarks if the directory exists.
-if (ENZYME_ENABLE_PLUGINS AND (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/benchmarks"))
- add_subdirectory(benchmarks)
+# Benchmarks have been moved to a separate repository to reduce repo size.
+# OPTIONAL: To clone and build benchmarks "-DENZYME_RUN_BENCHMARKS=ON"
+if (ENZYME_ENABLE_PLUGINS AND ENZYME_RUN_BENCHMARKS)
+ include(ExternalProject)
+
+ message(STATUS "Configuring Enzyme benchmarks via ExternalProject_Add")
+
+ set(BENCH_SRC_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/src)
+ set(BENCH_BUILD_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/build)
+ set(BENCH_INSTALL_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/install)
+ set(BENCH_STAMP_DIR ${CMAKE_CURRENT_BINARY_DIR}/benchmarks/stamp)
+ file(MAKE_DIRECTORY ${BENCH_SRC_DIR})
+ file(MAKE_DIRECTORY ${BENCH_BUILD_DIR})
+ file(MAKE_DIRECTORY ${BENCH_INSTALL_DIR})
+ file(MAKE_DIRECTORY ${BENCH_STAMP_DIR})
+
+ ExternalProject_Add(EnzymeBenchmarks
+ GIT_REPOSITORY https://github.com/DeepeshWR/benchmarks.git
+ GIT_TAG main
+ SOURCE_DIR ${BENCH_SRC_DIR}
+ BINARY_DIR ${BENCH_BUILD_DIR}
+ INSTALL_DIR ${BENCH_INSTALL_DIR}
+ STAMP_DIR ${BENCH_STAMP_DIR}
+
+ CMAKE_ARGS
+ -DLLVM_DIR=${LLVM_DIR}
+ -DLLVM_VERSION_MAJOR=${LLVM_VERSION_MAJOR}
+ -DLLVM_EXTERNAL_LIT=${LLVM_EXTERNAL_LIT}
+ -DENZYME_ENABLE_PLUGINS=${ENZYME_ENABLE_PLUGINS}
+ -DENZYME_PLUGIN_PATH=${CMAKE_CURRENT_BINARY_DIR}/Enzyme/LLVMEnzyme-${LLVM_VERSION_MAJOR}${CMAKE_SHARED_LIBRARY_SUFFIX}
+
+ BUILD_COMMAND ${CMAKE_COMMAND} --build ${BENCH_BUILD_DIR} --target bench-enzyme
+ INSTALL_COMMAND ""
+ UPDATE_COMMAND ""
+ TEST_COMMAND ""
+ )
+
+ add_dependencies(EnzymeBenchmarks LLVMEnzyme-${LLVM_VERSION_MAJOR})
+
+ set_target_properties(EnzymeBenchmarks PROPERTIES EXCLUDE_FROM_ALL TRUE)
endif()
Is my approach is fine or am I overlooking something ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if anyone has had the opportunity to review the above patch? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gentle reminder on this PR.
|
existing benchmark setup doesnt use bazel iirc, so don't worry about that for now |
this can be done as a follow up |
This PR removes the large
benchmarks/directory (>378 MB) from the main Enzyme repositoryand recommends users clone it from a new standalone repository:
https://github.com/EnzymeAD/enzyme-benchmarks
The benchmarks folder takes up ~378 MB.
It is not needed for most users building Enzyme.
Moving it out improves clone speed and CI efficiency.
Removed
benchmarks/directory.Updated
CMakeLists.txtto display an informative message if the folder is missing.The benchmarks will be moved to a new repository:
https://github.com/EnzymeAD/enzyme-benchmarks
This repository can be cloned into
enzyme/benchmarks/for benchmarking.Prior Discussion : #2309