Reverting the TGI image version for the Llama multi-GPU GKE samples #1591

@@ -29,7 +29,7 @@ spec:
     spec:
       containers:
       - name: llm
-        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-3.ubuntu2204.py311
+        image: ghcr.io/huggingface/text-generation-inference:1.4.3
         resources:
           requests:
             cpu: "10"
@@ -51,6 +51,9 @@ spec:
         volumeMounts:
        - mountPath: /dev/shm
           name: dshm
+        # mountPath is set to /data because that is where the HF_HOME environment
+        # variable points in the TGI container image, i.e. where models
+        # downloaded from the Hub are stored
         - mountPath: /data
           name: ephemeral-volume
       volumes:
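For context, the mount that the added comment describes can be sketched as a minimal Deployment fragment. This is a hypothetical illustration, not the sample file itself: the metadata names are invented, and since the volumes section of the sample is truncated in the diff above, emptyDir volumes are used here as stand-ins. The one grounded detail, per the comment in the change, is that HF_HOME points to /data inside the TGI image, so mounting a volume there is what caches models downloaded from the Hub.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-sketch            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-sketch
  template:
    metadata:
      labels:
        app: llm-sketch
    spec:
      containers:
      - name: llm
        image: ghcr.io/huggingface/text-generation-inference:1.4.3
        volumeMounts:
        - mountPath: /dev/shm   # shared memory for inter-process tensor exchange
          name: dshm
        - mountPath: /data      # HF_HOME target; downloaded models land here
          name: ephemeral-volume
      volumes:
      - name: dshm
        emptyDir:
          medium: Memory
      - name: ephemeral-volume
        emptyDir: {}            # stand-in; the sample uses an ephemeral volume
```

Because the mount path matches HF_HOME, the model weights survive container restarts within the pod's lifetime without re-downloading from the Hub.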
@@ -29,7 +29,7 @@ spec:
     spec:
       containers:
       - name: llm
-        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-3.ubuntu2204.py311
+        image: ghcr.io/huggingface/text-generation-inference:1.4.3
         resources:
           requests:
             cpu: "10"
@@ -56,6 +56,9 @@ spec:
         volumeMounts:
         - mountPath: /dev/shm
           name: dshm
+        # mountPath is set to /data because that is where the HF_HOME environment
+        # variable points in the TGI container image, i.e. where models
+        # downloaded from the Hub are stored
         - mountPath: /data
           name: ephemeral-volume
       volumes:
@@ -29,7 +29,7 @@ spec:
     spec:
       containers:
       - name: llm
-        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-3.ubuntu2204.py311
+        image: ghcr.io/huggingface/text-generation-inference:2.0.4
         resources:
           requests:
             cpu: "10"
@@ -58,6 +58,9 @@ spec:
         volumeMounts:
         - mountPath: /dev/shm
           name: dshm
+        # mountPath is set to /data because that is where the HF_HOME environment
+        # variable points in the TGI container image, i.e. where models
+        # downloaded from the Hub are stored
         - mountPath: /data
           name: ephemeral-volume
       volumes:
@@ -29,7 +29,7 @@ spec:
     spec:
       containers:
       - name: llm
-        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-3.ubuntu2204.py311
+        image: ghcr.io/huggingface/text-generation-inference:1.4.3
         resources:
           requests:
             cpu: "5"
@@ -56,6 +56,9 @@ spec:
         volumeMounts:
         - mountPath: /dev/shm
           name: dshm
+        # mountPath is set to /data because that is where the HF_HOME environment
+        # variable points in the TGI container image, i.e. where models
+        # downloaded from the Hub are stored
         - mountPath: /data
           name: ephemeral-volume
       volumes: