This guide provides step-by-step instructions for deploying the Intel® AI for Enterprise RAG solution on Amazon Elastic Kubernetes Service (EKS).
- Prerequisites
- AWS Configuration
- EKS Cluster Access
- ECR Registry Setup (Optional)
- Service Account and RBAC Configuration
- Storage Configuration
- Pipeline Configuration (Optional)
- Application Configuration
- Deployment
Before deploying to EKS, ensure you have:
- AWS CLI installed and configured
- Terraform installed
- kubectl installed
- Access to an existing EKS cluster or permissions to create one (see below for Terraform deployment instructions)
- Appropriate AWS IAM permissions for EKS
- Appropriate AWS IAM permissions for ECR (Optional)
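Before starting, it can help to confirm that the required CLIs are actually on your PATH. A minimal POSIX-shell sketch (the helper name `missing_tools` is purely illustrative, not part of the repository):

```shell
# List any of the given tools that are not found on PATH.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  printf '%s\n' "${missing# }"
}

# The tools this guide relies on:
missing_tools aws terraform kubectl
```

If the last command prints nothing, all three tools were found.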
Configure your AWS credentials:
```bash
aws configure
```

You will be prompted to enter:

- AWS Access Key ID
- AWS Secret Access Key
- Default region (e.g., `us-east-1`)
- Default output format (e.g., `json`)
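Equivalently, `aws configure` simply writes these values to files under `~/.aws/`; if you prefer, they can be created directly (the values below are placeholders):

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = <AWS Access Key ID>
aws_secret_access_key = <AWS Secret Access Key>

# ~/.aws/config
[default]
region = us-east-1
output = json
```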
Follow AWS instructions to create an EKS cluster manually if you prefer not to use Terraform automated deployment.
You can deploy the Intel® AI for Enterprise RAG solution on an existing EKS cluster, including one that may have been created during the installation of Nutanix Enterprise AI on AWS EKS. In that case, skip the cluster setup and proceed to EKS Cluster Access.
```bash
git clone https://github.com/opea-project/Enterprise-RAG.git
cd Enterprise-RAG/deployment/terraform/aws/nutanix/eks-erag
```

Modify the `locals` block in `Enterprise-RAG/deployment/terraform/aws/nutanix/eks-erag/main.tf`:
```hcl
locals {
  name   = "your-cluster-name" # Update with your desired name
  region = "us-east-1"         # Update with your preferred region
  tags = {
    Owner    = "your@email.com"
    Project  = "Intel AI for Enterprise RAG"
    Duration = "0"
  }
}
```

Run the following Terraform commands:
```bash
terraform init
terraform plan
terraform apply
```

Ensure your IAM user or role has been granted access to the EKS cluster:
- Add the user/role to the cluster's IAM access entries
- Configure EKS cluster endpoint access:
- Add your development machine's public IP to the cluster's networking endpoint allowlist (if using private endpoint)
- Or ensure the cluster has public endpoint access enabled
```bash
# Verify AWS identity
aws sts get-caller-identity

# Configure kubectl for your EKS cluster
aws eks update-kubeconfig --region us-east-1 --name $(terraform output -raw cluster_name)

# Test the connection
kubectl get svc

# Verify nodes
kubectl get nodes
```

**Important:** This step is only needed if you don't have access to docker.io or your images are not available in a release version.
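The ECR registry URL used throughout this section always follows the pattern `<account-id>.dkr.ecr.<region>.amazonaws.com`. A small helper to compose it (hypothetical, for illustration only):

```shell
# Compose an ECR registry hostname from an AWS account ID and region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

ecr_registry 123456789012 us-east-1
# → 123456789012.dkr.ecr.us-east-1.amazonaws.com
```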
Create ECR repositories for your container images (if not already created):
```bash
aws ecr create-repository --repository-name <repository-name> --region <your-region>
```

Authenticate Docker to your ECR registry:

```bash
aws ecr get-login-password --region <your-region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<your-region>.amazonaws.com
```

Update and push images to ECR using the provided script:

```bash
cd deployment
./update_images.sh -- --registry <account-id>.dkr.ecr.<your-region>.amazonaws.com --build --push --tag <version> -j 20
```

After pushing images to ECR, update the registry configuration in `deployment/inventory/<your-environment>/config.yaml`:
```yaml
registry: "<account-id>.dkr.ecr.<your-region>.amazonaws.com/erag"
tag: "<version>"
setup_registry: false
```

Create a service account for administrative access:
```bash
kubectl -n default create serviceaccount $USER-sa
```

Bind the service account to the cluster-admin role:
```bash
kubectl create clusterrolebinding kubeconfig-cluster-admin-token \
  --clusterrole=cluster-admin \
  --serviceaccount=default:$USER-sa
```

Apply the secret (relies on the USER environment variable):
```bash
cd deployment/terraform/aws/nutanix/eks-erag
envsubst < token-admin.yaml | kubectl apply -f -
```

Extract the token for kubeconfig:

```bash
kubectl -n default get secret kubeconfig-cluster-admin-token -o jsonpath='{.data.token}' | base64 --decode
```

Edit your kubeconfig file to use the service account token. Replace the existing authentication method generated by aws-cli with:
```yaml
users:
- name: <your-username>
  user:
    token: <token-from-previous-step>
```

Configure the default storage class for persistent volumes:
```bash
kubectl patch storageclass gp2 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```

Verify the default storage class:

```bash
kubectl get storageclass
```

The default storage class should have `(default)` next to its name.
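If you need the default class name programmatically (e.g., in a script), the `(default)` marker can be parsed out of the `kubectl get storageclass` table output. A hypothetical sketch, assuming the standard layout where the marker appears in the second column:

```shell
# Print the name of the storage class marked "(default)" from
# `kubectl get storageclass` table output read on stdin.
default_storageclass() {
  awk '$2 == "(default)" {print $1}'
}

# Usage (requires a configured cluster):
#   kubectl get storageclass | default_storageclass
```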
If your environment requires a proxy, set the `K8S_AUTH_PROXY` environment variable:

```bash
export K8S_AUTH_PROXY="http://your-proxy-server:port"
```

On EKS it is recommended to disable trace telemetry in `deployment/inventory/<your-environment>/config.yaml`:
```yaml
telemetry:
  enabled: true
  traces:
    enabled: false
```

EKS doesn't allow the ingress controller to bind the service's default host ports, so remove the following section from `deployment/components/ingress/values.yaml` to unblock the ingress service ports:

```yaml
hostPort:
  enabled: true
```

Additionally, change the service type in `deployment/inventory/<your-environment>/config.yaml` to LoadBalancer:
```yaml
ingress:
  enabled: true
  service_type: LoadBalancer
```

For proper DNS configuration, change the FQDN in `deployment/inventory/<your-environment>/config.yaml` to the domain that you are planning to use:
```yaml
FQDN: "erag.com" # Provide the FQDN for the deployment
```

After deployment, retrieve the LoadBalancer URL:
```bash
kubectl get --namespace ingress-nginx svc/ingress-nginx-controller
```

Configure your DNS provider to create a CNAME record pointing your domain to the LoadBalancer URL. Note that Amazon ELB DNS names do not natively resolve subdomains, so you'll need to configure individual DNS records for each subdomain or use a wildcard DNS entry at your DNS provider.
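For example, with a wildcard entry, the records at your DNS provider might look like the sketch below. The `<elb-hostname>` target is a placeholder for the LoadBalancer hostname returned by the command above, and note that a plain CNAME is not valid at the zone apex, so the root domain typically needs a provider-specific ALIAS/ANAME record (or a Route 53 alias) instead:

```
; hypothetical zone entries
erag.com.     ALIAS  <elb-hostname>.elb.<your-region>.amazonaws.com.  ; apex: ALIAS/ANAME, not CNAME
*.erag.com.   CNAME  <elb-hostname>.elb.<your-region>.amazonaws.com.
```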
For detailed TLS configuration options, see Security Settings in the Advanced Configuration guide.
Follow the deployment instructions in the Deploy the Intel® AI for Enterprise RAG Application section.
After the installation completes, verify the deployment status and test the pipeline by following the instructions in Interact with the Deployed Pipeline.