Reverting the TGI image version for LLAMA multiple GPUs in GKE samples #931

Conversation

raushan2016
Member

The current image overrides HF_HOME, changing it from /data to /tmp. Even after changing the mountPath to /tmp, a regression in the newer TGI image causes out-of-GPU-memory errors on L4 and requires at least an A2 node. This PR rolls back the image version to get the sample working while the investigation continues in the background.

Issue: GoogleCloudPlatform/kubernetes-engine-samples#1581

@annapendleton
Collaborator

/gcbrun

@chengcongdu chengcongdu merged commit c985e95 into GoogleCloudPlatform:main Jan 15, 2025
6 checks passed
3 participants