Amazon EKS Deployment Guide

This guide provides step-by-step instructions for deploying the Intel® AI for Enterprise RAG solution on Amazon Elastic Kubernetes Service (EKS).

Prerequisites

Before deploying to EKS, ensure you have:

  • AWS CLI installed and configured
  • Terraform installed
  • kubectl installed
  • Access to an existing EKS cluster, or permissions to create one (see EKS Terraform Automated Deployment)
  • Appropriate AWS IAM permissions for EKS
  • Appropriate AWS IAM permissions for ECR (optional, only needed for ECR Registry Setup)
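The tool prerequisites above can be checked up front. The following is a minimal sketch, not part of the project's tooling; the helper name `missing_tools` is hypothetical. It only checks that each CLI is on `PATH`, not its version or configuration.

```shell
# Print the name of every listed command that is not found on PATH,
# one per line. Prints nothing when all commands are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report each prerequisite CLI that is not installed.
missing_tools aws terraform kubectl | sed 's/^/Missing prerequisite: /'
```

If the command prints nothing, all three CLIs were found.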

AWS Configuration

Configure your AWS credentials:

aws configure

You will be prompted to enter:

  • AWS Access Key ID
  • AWS Secret Access Key
  • Default region (e.g., us-east-1)
  • Default output format (e.g., json)

Manual EKS Deployment

If you prefer not to use the automated Terraform deployment, create the EKS cluster manually by following the AWS documentation.

Installation on an existing EKS cluster

You can deploy the Intel® AI for Enterprise RAG solution on an existing EKS cluster, including one that may have been created during the installation of Nutanix Enterprise AI on AWS EKS. In that case, skip the cluster setup and proceed to EKS Cluster Access.

EKS Terraform Automated Deployment

Clone the Repository

git clone https://github.com/opea-project/Enterprise-RAG.git
cd Enterprise-RAG/deployment/terraform/aws/nutanix/eks-erag

Configure Deployment Settings

Modify the locals block in Enterprise-RAG/deployment/terraform/aws/nutanix/eks-erag/main.tf:

locals {
  name   = "your-cluster-name"    # Update with your desired name
  region = "us-east-1"           # Update with your preferred region
  tags = {
    Owner    = "your@email.com"
    Project  = "Intel AI for Enterprise RAG"
    Duration = "0"              
  }
}

Deploy Infrastructure

Run the following Terraform commands:

terraform init
terraform plan
terraform apply
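As an optional variant, the plan can be saved to a file so that the apply step executes exactly what was reviewed. This is a sketch under assumptions, not the project's documented flow: the function name `deploy_infra` and the plan file name `tfplan` are illustrative. Note that applying a saved plan does not prompt for confirmation.

```shell
# Initialize, write the plan to a file, then apply that saved plan.
# Each step runs only if the previous one succeeded.
deploy_infra() {
  terraform init &&
    terraform plan -out=tfplan &&
    terraform apply tfplan
}

# Run from Enterprise-RAG/deployment/terraform/aws/nutanix/eks-erag:
#   deploy_infra
```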

EKS Cluster Access

Grant User Access

Ensure your IAM user or role has been granted access to the EKS cluster:

  1. Add the user/role to the cluster's IAM access entries
  2. Configure EKS cluster endpoint access:
    • If access to the cluster endpoint is restricted by IP, add your development machine's public IP to the endpoint allowlist
    • Otherwise, ensure the cluster has public endpoint access enabled

Configure kubectl for EKS Cluster

# Verify AWS identity
aws sts get-caller-identity

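# Configure kubectl for your EKS cluster.
# Run this from the Terraform directory so `terraform output` can read the cluster name
# (for an existing cluster, pass its name to --name directly).
# Use the same region you set in main.tf.
aws eks update-kubeconfig --region us-east-1 --name "$(terraform output -raw cluster_name)"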

# Test the connection
kubectl get svc

# Verify Nodes
kubectl get nodes
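To make the node check scriptable, the output of `kubectl get nodes --no-headers` can be filtered for nodes that are not yet ready. The sketch below assumes the default column layout (NAME, STATUS, ...); the helper name `not_ready_nodes` and the sample node names are made up for illustration.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the name
# of every node whose STATUS column does not start with "Ready".
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Demonstration on sample output; prints only the NotReady node.
printf '%s\n' \
  'ip-10-0-1-10.ec2.internal   Ready      <none>   5m   v1.30.0' \
  'ip-10-0-2-20.ec2.internal   NotReady   <none>   1m   v1.30.0' |
  not_ready_nodes

# Against a live cluster:
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reports a status beginning with `Ready`.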

ECR Registry Setup (Optional)

Important

This step is only needed if you cannot pull images from docker.io, or if you are deploying images that are not a released version.

Create ECR Repositories

Create ECR repositories for your container images (if not already created):

aws ecr create-repository --repository-name <repository-name> --region <your-region>
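Because `create-repository` fails when the repository already exists, a small wrapper can make the step safe to re-run. This is a sketch only: the helper name `ensure_ecr_repo` is hypothetical, and it treats any failure of `describe-repositories` (including a permissions error) as "repository missing".

```shell
# Create an ECR repository only if `describe-repositories` cannot find it.
# Arguments: repository name, AWS region.
ensure_ecr_repo() {
  repo="$1"
  region="$2"
  if aws ecr describe-repositories --repository-names "$repo" \
    --region "$region" >/dev/null 2>&1; then
    echo "exists: $repo"
  else
    aws ecr create-repository --repository-name "$repo" \
      --region "$region" >/dev/null &&
      echo "created: $repo"
  fi
}

# Usage, with the same placeholders as above:
#   ensure_ecr_repo <repository-name> <your-region>
```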

Login to ECR

Authenticate Docker to your ECR registry:

aws ecr get-login-password --region <your-region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<your-region>.amazonaws.com
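The registry hostname used in this and the following steps is composed from the account ID and region. A minimal sketch of that composition follows; the helper name `ecr_registry_host` is hypothetical, `123456789012` is a dummy account ID, and the `amazonaws.com` suffix assumes a standard commercial AWS region.

```shell
# Build the ECR registry hostname from an AWS account ID and a region.
ecr_registry_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Example with placeholder values.
ecr_registry_host 123456789012 us-east-1

# With configured credentials, the account ID can be looked up instead of typed:
#   account_id="$(aws sts get-caller-identity --query Account --output text)"
#   registry="$(ecr_registry_host "$account_id" us-east-1)"
#   aws ecr get-login-password --region us-east-1 |
#     docker login --username AWS --password-stdin "$registry"
```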

Push Images

Update and push images to ECR using the provided script:

cd deployment
./update_images.sh -- --registry <account-id>.dkr.ecr.<your-region>.amazonaws.com --build --push --tag <version> -j 20

Update Registry Configuration

After pushing images to ECR, update the registry configuration in deployment/inventory/<your-environment>/config.yaml:

registry: "<account-id>.dkr.ecr.<your-region>.amazonaws.com/erag"
tag: "<version>"
setup_registry: false

Service Account and RBAC Configuration

Create Service Account

Create a service account for administrative access:

kubectl -n default create serviceaccount $USER-sa

Create Cluster Role Binding

Bind the service account to the cluster-admin role:

kubectl create clusterrolebinding kubeconfig-cluster-admin-token \
  --clusterrole=cluster-admin \
  --serviceaccount=default:$USER-sa

Create Service Account Token Secret

Apply the secret manifest. The command uses envsubst to substitute the USER environment variable into token-admin.yaml, so USER must be set:

cd deployment/terraform/aws/nutanix/eks-erag
envsubst < token-admin.yaml | kubectl apply -f -

Retrieve Service Account Token

Extract the token for kubeconfig:

kubectl -n default get secret kubeconfig-cluster-admin-token -o jsonpath='{.data.token}' | base64 --decode

Update Kubeconfig with Token

Edit your kubeconfig file to use the service account token. Replace the existing authentication method generated by aws-cli with:

users:
- name: <your-username>
  user:
    token: <token-from-previous-step>
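Instead of editing the file by hand, the same result can be approximated with `kubectl config`. The sketch below is an assumption about workflow, not the guide's prescribed method: it adds a separate token-based user entry and points the current context at it, leaving the entry generated by aws-cli in place. The helper name `use_sa_token` is hypothetical.

```shell
# Add a kubeconfig user that authenticates with a bearer token, then switch
# the current context to that user.
# Arguments: kubeconfig user name, service account token.
use_sa_token() {
  kubectl config set-credentials "$1" --token="$2" &&
    kubectl config set-context --current --user="$1"
}

# Usage, reusing the token retrieval command from the previous step:
#   token="$(kubectl -n default get secret kubeconfig-cluster-admin-token \
#     -o jsonpath='{.data.token}' | base64 --decode)"
#   use_sa_token "$USER-sa" "$token"
```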

Storage Configuration

Set Default Storage Class

Configure the default storage class for persistent volumes:

kubectl patch storageclass gp2 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Verify the default storage class:

kubectl get storageclass 

The default storage class should have (default) next to it.
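The check can also be scripted by filtering the table for the `(default)` marker. This sketch assumes the standard `kubectl get storageclass` table layout, where the marker follows the name; the helper name `default_storage_classes` and the sample row are illustrative.

```shell
# Read `kubectl get storageclass` output on stdin and print the name of
# every storage class marked "(default)". The header row is skipped.
default_storage_classes() {
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}

# Demonstration on sample output; prints the default class name.
printf '%s\n' \
  'NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      AGE' \
  'gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   10m' |
  default_storage_classes

# Against a live cluster:
#   kubectl get storageclass | default_storage_classes
```

Exactly one name should be printed. No output means no default is set, and more than one line means several classes are marked as default.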


Application Configuration

Proxy Configuration (if needed)

If your environment requires a proxy, set the K8S_AUTH_PROXY environment variable:

export K8S_AUTH_PROXY="http://your-proxy-server:port"

Telemetry Configuration

On EKS, it is recommended to disable telemetry traces in deployment/inventory/<your-environment>/config.yaml:

telemetry:
  enabled: true
  traces:
    enabled: false

Ingress Configuration

EKS does not allow the ingress service to use its default host ports, so remove the following section from deployment/components/ingress/values.yaml to unblock the ingress service ports:

  hostPort:
    enabled: true

Additionally, change the ingress service type to LoadBalancer in deployment/inventory/<your-environment>/config.yaml:

ingress:
  enabled: true
  service_type: LoadBalancer

DNS Configuration

For proper DNS configuration, set FQDN in deployment/inventory/<your-environment>/config.yaml to the domain you plan to use:

FQDN: "erag.com" # Provide the FQDN for the deployment

After deployment, retrieve the LoadBalancer URL:

kubectl get --namespace ingress-nginx svc/ingress-nginx-controller

Configure your DNS provider to create a CNAME record pointing your domain to the LoadBalancer URL. Note that Amazon ELB DNS names do not natively resolve subdomains, so you'll need to configure individual DNS records for each subdomain or use a wildcard DNS entry at your DNS provider.
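The two DNS options can be illustrated by generating the records in zone-file style. This is a sketch: the helper name `cname_records` and the LoadBalancer hostname are made up, and the exact record syntax depends on your DNS provider.

```shell
# Print one zone-file-style CNAME line per subdomain label.
# Arguments: base domain, LoadBalancer hostname, then one or more labels.
cname_records() {
  fqdn="$1"
  lb_host="$2"
  shift 2
  for label in "$@"; do
    printf '%s.%s. CNAME %s.\n' "$label" "$fqdn" "$lb_host"
  done
}

# Wildcard option: a single record covering every subdomain.
cname_records erag.com example-lb.elb.us-east-1.amazonaws.com '*'

# The LoadBalancer hostname can be read directly from the service:
#   kubectl get --namespace ingress-nginx svc/ingress-nginx-controller \
#     -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```

For the per-subdomain option, pass each subdomain label used by your deployment in place of `'*'`.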

TLS Configuration

For detailed TLS configuration options, see Security Settings in the Advanced Configuration guide.

Deployment

Start Deployment

Follow the deployment instructions in the Deploy the Intel® AI for Enterprise RAG Application section.

Verify Deployment

After the installation completes, verify the deployment status and test the pipeline by following the instructions in Interact with the Deployed Pipeline.