@barrycheng05
No description provided.

You can also test by running the example deployment YAML under the [example](./example) folder.

**Note:** If you want to execute `nvidia-smi` in the example deployment, you need to add the following snippet to the `deployment.yml` file. Replace `<node-name>` with the node name you labeled during installation:

@nayihz, Feb 28, 2025

What does `<node-name>` represent? What do you mean by "the node name you labeled during installation"?

`nvidia-smi` still cannot be executed after adding:

env:
- name: NODE_NAME
  value: default

Error message:

# k exec -it sleepy-deployment-6bddfbb7f4-s8mc4 -- sh
/ # nvidia-smi
sh: nvidia-smi: not found

Is my understanding incorrect?

Author

This appears to be a different issue. You might want to check if the device plugin is running correctly.

@gshaibi
Contributor

gshaibi commented Sep 7, 2025

Hi @barrycheng05, sorry for the very late response.
From what I can see, this is now covered in the README.
Let me know if you think that is not enough.
