@aather aather commented May 20, 2025

Traces CUDA GPU kernel functions via BPF and provides in-depth analysis through visualizations and optional LLM (Large Language Model)-powered summaries.

@facebook-github-bot
Contributor

Hi @aather!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot added the "CLA Signed" label May 20, 2025
@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

if (link) {
  links.emplace_back(link);
  /* Attach Uprobes for CUDA API tracepoints */
  for (const auto& symbol : kCudaSymbols) {
Contributor


Can we make the list of events to trace optional (passed as an argument to the program) instead of collecting all events by default?
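
One way this suggestion could look on the user-space side, as a minimal sketch only: the symbol list is filtered against a comma-separated argument before any uprobes are attached. The helper names `splitCsv` and `selectSymbols` and the contents of `kCudaSymbols` are assumptions made for illustration and are not taken from the PR.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Placeholder list: the PR's actual kCudaSymbols contents are not shown
// in this thread, so these entries are assumptions.
static const std::vector<std::string> kCudaSymbols = {
    "cudaLaunchKernel", "cudaMalloc", "cudaMemcpy"};

// Splits a comma-separated string into its non-empty tokens.
std::vector<std::string> splitCsv(const std::string& csv) {
  std::vector<std::string> out;
  std::stringstream ss(csv);
  std::string tok;
  while (std::getline(ss, tok, ',')) {
    if (!tok.empty()) out.push_back(tok);
  }
  return out;
}

// Returns the known symbols the user asked for, in kCudaSymbols order.
// An empty request selects nothing, so no events are traced by default.
// Unknown names are silently ignored in this sketch.
std::vector<std::string> selectSymbols(const std::string& requestedCsv) {
  const std::vector<std::string> req = splitCsv(requestedCsv);
  const std::set<std::string> wanted(req.begin(), req.end());
  std::vector<std::string> selected;
  for (const auto& symbol : kCudaSymbols) {
    if (wanted.count(symbol) > 0) selected.push_back(symbol);
  }
  return selected;
}
```

The attach loop would then iterate over the result of `selectSymbols(...)` rather than over `kCudaSymbols` directly. Whether an empty argument should mean "trace nothing" or "trace everything" is a design choice left to the PR author.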

  bool capture_args;
  bool capture_stack;
} prog_cfg = {
    // These defaults will be overridden from user space
Contributor


Can we leave the comments?
The purpose of the defaults is to let us exercise these specific code paths with veristat, so that verifier errors are caught early.

bpf_printk(fmt, ##__VA_ARGS__); \
})

// The caller uses registers to pass the first 6 arguments to the callee. Given
Contributor


Can we keep this comment as well?

bpf_probe_read_user(&e->args[i], sizeof(arg_addr), arg_addr);

struct gpukern_sample* e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0);
if (!e) return 0;
Contributor


The bpf_printk_debug("Failed to allocate ringbuf entry"); call can be useful for debugging, especially if we overrun the ring buffer. It should not cause noise or an increase in instruction count as long as the .debug flag is set to false from user space.

    bpf_probe_read_user(&arg_addr, sizeof(u64), (const void*)(argv + i * sizeof(u64)));
    bpf_probe_read_user(&e->args[i], sizeof(arg_addr), arg_addr);
  }
}
Contributor


Same about leaving the comments :)

for (auto& frame : stack) {
  frame.print();
  // Print function arguments if requested
  if (env.args) {
Contributor


This should probably be specific to EVENT_CUDA_LAUNCH_KERNEL events.
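
A minimal sketch of what such a guard might look like, assuming the event carries a type tag. Only `EVENT_CUDA_LAUNCH_KERNEL` and `env.args` come from this thread; the `EventType` enum layout, the `EVENT_CUDA_OTHER` placeholder, the `Env` struct, and the `shouldPrintArgs` helper are hypothetical.

```cpp
#include <cstdint>

// Hypothetical event kinds: only EVENT_CUDA_LAUNCH_KERNEL is named in the
// review; EVENT_CUDA_OTHER is a placeholder for any other traced API call.
enum EventType : std::uint8_t {
  EVENT_CUDA_LAUNCH_KERNEL,
  EVENT_CUDA_OTHER,
};

// Hypothetical stand-in for the tool's options (`env.args` in the diff).
struct Env {
  bool args;
};

// Arguments are printed only when the user requested them AND the event
// is a kernel launch, since other event types carry no kernel arguments
// in this sketch.
bool shouldPrintArgs(const Env& env, EventType type) {
  return env.args && type == EVENT_CUDA_LAUNCH_KERNEL;
}
```

In the print loop, `if (env.args)` would then become something like `if (shouldPrintArgs(env, type))`, where `type` is however the PR exposes the event kind to user space.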


3 participants