Update aks-extension-kaito.md #232

Status: Open. Wants to merge 1 commit into base branch `main`.
5 changes: 3 additions & 2 deletions articles/aks/aks-extension-kaito.md
@@ -14,7 +14,8 @@ In this article, you learn how to use the AI toolchain operator (KAITO) add-on i

## Prerequisites

- * The Azure Kubernetes Service (AKS) extension for Visual Studio Code downloaded. For more information, see [Install the Azure Kubernetes Service (AKS) extension for Visual Studio Code][install-aks-vscode].
+ * The Azure Kubernetes Service (AKS) extension for Visual Studio Code installed. For more information, see [Install the Azure Kubernetes Service (AKS) extension for Visual Studio Code][install-aks-vscode].
+ * The cluster that you're deploying to is a Standard cluster. _(KAITO can't currently be installed on Automatic clusters.)_
* Verify that your Azure subscription has GPU quota for your chosen model by checking the [KAITO model workspaces](https://github.com/kaito-project/kaito/tree/main/presets).
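The GPU quota check in the last prerequisite can also be scripted with the Azure CLI. A minimal sketch, assuming the `eastus` region and an NC-series GPU family (substitute the location and VM family that your chosen KAITO preset requires):

```shell
# Show current usage vs. quota limits for NC-series (GPU) vCPUs in a region.
# "eastus" and the "NC" family filter are placeholders for your deployment.
az vm list-usage --location eastus \
  --query "[?contains(name.value, 'NC')]" \
  --output table
```

If the `CurrentValue` column is at or near `Limit` for the family you need, request a quota increase before deploying a model.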

## Install KAITO on your cluster
@@ -69,7 +70,7 @@ For more information, see [AKS extension for Visual Studio Code features][aks-vs
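Under the hood, deploying a model with KAITO creates a `Workspace` custom resource on the cluster. A hand-written equivalent of what the extension sets up might look like the following sketch; the preset name, instance type, and label are illustrative values drawn from the upstream KAITO presets, not output from the extension itself:

```shell
# Create a KAITO inference workspace directly with kubectl.
# The falcon-7b preset and Standard_NC12s_v3 GPU size are example values.
kubectl apply -f - <<'EOF'
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"
EOF
```

KAITO then provisions the GPU node and serves the preset model behind a cluster service, which is the same resource the extension manages for you.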
## Delete your model inference deployment

1. Once you've finished testing the models and want to free up the allocated GPU resources on your cluster, go to the Kubernetes tab and, under **Clouds** > **Azure** > **your subscription** > **Deploy a LLM with KAITO**, right-click your cluster and select **Manage KAITO models**.
- 2. For each deployed model, select **Delete Workspace** to clear all allocated resources created by the inferencing deployment.
+ 2. For each deployed model, select **Delete Workspace** to clear all allocated resources created by the inference deployment.
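Deleting a workspace in the extension is equivalent to removing the KAITO `Workspace` resource from the cluster. Assuming `kubectl` access to the cluster, a sketch (`workspace-falcon-7b` is a placeholder for your workspace's name):

```shell
# List the KAITO workspaces currently on the cluster,
# then delete one to release its GPU node(s).
kubectl get workspaces
kubectl delete workspace workspace-falcon-7b
```

Deleting the `Workspace` tears down the associated inference deployment and lets the provisioned GPU nodes be reclaimed.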

## Product support and feedback
