Unable to run nsys or CUPTI profiling on K8 cluster with gpu-operator #1158
Comments
Hello, can you please check the output of this command?
This is likely the cause of the issue. The kernel module parameter NVreg_RestrictProfilingToAdminUsers is documented in the kernel driver source as defaulting to "not restrict": https://github.com/NVIDIA/open-gpu-kernel-modules/blob/550/kernel-open/nvidia/nv-reg.h#L526
I can't find any mention of this parameter in gpu-operator, searching either for "RestrictProfilingToAdminUsers" (https://github.com/search?q=org%3ANVIDIA+RestrictProfiling&type=code) or for "RmProfilingAdminOnly" (https://github.com/search?q=org%3ANVIDIA+ProfilingAdmin&type=code).
Update: the driver README at https://download.nvidia.com/XFree86/Linux-x86_64/550.67/README/knownissues.html states that, by default, access is limited to the root user.
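For anyone who wants to check which value the loaded driver is actually using, here is a minimal sketch. It assumes the driver exposes its parameters as `Key: value` lines in `/proc/driver/nvidia/params` on the node (or inside the driver container) and that the profiling restriction shows up there as `RmProfilingAdminOnly`. The function names are made up for illustration.

```python
from pathlib import Path
from typing import Dict, Optional

# Assumed location of the loaded driver's parameters on the node.
PARAMS_PATH = Path("/proc/driver/nvidia/params")


def parse_driver_params(text: str) -> Dict[str, str]:
    """Parse 'Key: value' lines into a dict, skipping lines without a colon."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            params[key.strip()] = value.strip()
    return params


def profiling_restricted(params: Dict[str, str]) -> Optional[bool]:
    """True if profiling is admin-only, False if not, None if the key is absent."""
    value = params.get("RmProfilingAdminOnly")
    if value is None:
        return None
    return value != "0"


if __name__ == "__main__":
    if PARAMS_PATH.exists():
        params = parse_driver_params(PARAMS_PATH.read_text())
        print("RmProfilingAdminOnly restricted:", profiling_restricted(params))
    else:
        print(f"{PARAMS_PATH} not found (is the NVIDIA driver loaded here?)")
```

If this reports that profiling is restricted, non-root processes in the pod would be expected to fail when nsys or CUPTI request GPU performance counters.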
gpu-operator supports custom kernel module parameters through the kernelModuleConfig ConfigMap. Could you please try these instructions?
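As a rough sketch of what such a ConfigMap could look like, the snippet below builds a manifest carrying an `nvidia.conf` entry with one `name=value` line per module parameter. The ConfigMap name `kernel-module-params`, the `gpu-operator` namespace, the `nvidia.conf` key, and the helper function are assumptions for illustration and should be checked against the gpu-operator documentation for your version.

```python
import json
from typing import Dict, Union


def build_kernel_module_configmap(
    name: str, namespace: str, params: Dict[str, Union[int, str]]
) -> dict:
    """Build a ConfigMap manifest whose nvidia.conf holds one 'key=value' per line."""
    conf = "".join(f"{key}={value}\n" for key, value in params.items())
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        # Assumed key: the file name the driver container reads parameters from.
        "data": {"nvidia.conf": conf},
    }


if __name__ == "__main__":
    manifest = build_kernel_module_configmap(
        "kernel-module-params",  # hypothetical ConfigMap name
        "gpu-operator",          # assumed operator namespace
        {"NVreg_RestrictProfilingToAdminUsers": 0},
    )
    # kubectl accepts JSON manifests, so this can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

The ConfigMap name would then be referenced from the driver's kernelModuleConfig setting. The driver presumably has to be reloaded before the new parameter takes effect.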
I could not run GPU profiling on a K8 cluster with gpu-operator.
I tested the following:
Reproducible steps: