Profiler class and native code to support self-profiling #2066

Merged on May 28, 2024 (9 commits)

@jlowe jlowe commented May 22, 2024

Contributes to NVIDIA/spark-rapids#10632. Adds a new Profiler class and native code that can be used to collect CUDA and NVTX profiling data for the current process via CUPTI. It supports

The profiler is intended to be started before the process triggers any CUDA events so that it can capture everything. It ships as a separate shared library in the spark-rapids-jni jar, so it can be loaded before the main spark-rapids-jni shared native library that contains the cudf and spark-rapids-jni CUDA kernels.

The profiling data is serialized as a series of size-prefixed flatbuffers. See profiler.fbs for the schema. This PR also includes a converter tool, built at target/cmake-build/profiler/spark_rapids_profile_converter, which converts the flatbuffer profiling data into other formats. It currently supports JSON (useful for custom analysis of the raw data) and the Nsight NVTXT format. NVTXT can in turn be converted into an nsys-rep file, which the Nsight Systems viewer can load, via the ImportNvtxt tool supplied with Nsight Systems.
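
For readers unfamiliar with the "series of size-prefixed flatbuffers" layout, here is a minimal sketch of how such a stream could be split into individual records. It is not the converter's actual code. It assumes each record is preceded by a 4-byte little-endian length (the default FlatBuffers size prefix) and that the stream has no extra file header. The names `record_view`, `read_u32_le`, and `split_size_prefixed` are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Location of one record's payload inside the input buffer (hypothetical type).
struct record_view {
  std::size_t offset;  // index of the first payload byte, just past the prefix
  std::uint32_t size;  // payload length in bytes, as stated by the prefix
};

// Decode a 32-bit unsigned integer stored in little-endian byte order.
inline std::uint32_t read_u32_le(std::uint8_t const* p)
{
  return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walk the buffer and return one view per size-prefixed record.
// Assumption: records are packed back to back with no header or padding.
std::vector<record_view> split_size_prefixed(std::vector<std::uint8_t> const& data)
{
  std::vector<record_view> records;
  std::size_t pos = 0;
  while (pos < data.size()) {
    if (data.size() - pos < 4) { throw std::runtime_error("truncated size prefix"); }
    std::uint32_t const size = read_u32_le(data.data() + pos);
    pos += 4;
    if (data.size() - pos < size) { throw std::runtime_error("truncated record"); }
    records.push_back({pos, size});
    pos += size;
  }
  return records;
}
```

Each returned view could then be handed to the accessors generated from profiler.fbs. Because every record carries its own length, a reader can process a capture incrementally without knowing the number of records up front.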


jlowe commented May 22, 2024

build

revans2 previously approved these changes May 24, 2024

@revans2 revans2 left a comment

Really just nits, but be aware that I do not know CMake, so if there are scary things in there I probably missed them. I am also not an expert on the libraries or formats you are using, so I am not 100% sure they are all correct, but things appeared to line up properly.

src/main/cpp/profiler/profiler_debug.cpp (outdated, resolved)
src/main/cpp/profiler/ProfilerJni.cpp (resolved)
src/main/cpp/profiler/ProfilerJni.cpp (resolved)
src/main/cpp/profiler/ProfilerJni.cpp (outdated, resolved)
src/main/cpp/profiler/ProfilerJni.cpp (outdated, resolved)
src/main/cpp/profiler/ProfilerJni.cpp (outdated, resolved)
src/main/cpp/profiler/ProfilerJni.cpp (outdated, resolved)
src/main/cpp/profiler/ProfilerJni.cpp (resolved)
src/main/cpp/profiler/spark_rapids_profile_converter.cpp (outdated, resolved)
src/main/cpp/profiler/spark_rapids_profile_converter.cpp (outdated, resolved)

jlowe commented May 24, 2024

build

revans2 previously approved these changes May 24, 2024

jlowe commented May 24, 2024

build


jlowe commented May 24, 2024

build


jlowe commented May 28, 2024

build

@jlowe jlowe merged commit 092fdb8 into NVIDIA:branch-24.06 May 28, 2024
3 checks passed
@jlowe jlowe deleted the profiler branch May 28, 2024 14:35