
Quick Start

To set up prerequisites and quickly deploy Intel® AI for Enterprise Inference on a single node, follow the steps in the Single Node Deployment Guide. Otherwise, proceed to the section below for all deployment options.

🚀 New: Automated Intel® AI Accelerator firmware and driver management! See Intel® AI Accelerator Prerequisites for automated setup scripts.

Complete Intel® AI for Enterprise Inference Cluster Setup

Prerequisites

Complete all prerequisites.

Deployment Options

| Deployment Type | Description |
| --- | --- |
| Single Node (vLLM, non-production) | For quick testing on Intel® Xeon® processors using vLLM Docker (Guide) |
| Single Node | Quick start for testing or lightweight workloads (Guide) |
| Single Master, Multiple Workers | For higher-throughput workloads (Guide) |
| Multi-Master, Multiple Workers | Recommended for highly available (HA) enterprise clusters (Guide) |

Supported Models

💡 Both validated and custom models are supported to meet diverse enterprise needs.


Configuration Files

Two files are required before deployment:

  • inventory/hosts.yaml – Cluster inventory and topology (for both single-node and multi-node deployments)
  • inference-config.cfg – Component-level deployment configuration
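As a rough illustration, an Ansible-style `inventory/hosts.yaml` for a single node could look like the sketch below. This is an assumption for orientation only: the group names, host name, and address are invented here and are not taken from the repository, so consult the deployment guides for the exact expected schema.

```yaml
# Hypothetical single-node inventory sketch; group names, host name,
# and IP address are illustrative examples, not the project's schema.
all:
  children:
    master:
      hosts:
        node1:
          ansible_host: 192.168.1.10
    workers:
      hosts: {}
```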

Deployment Command

Run the following script to deploy the inference platform:

bash inference-stack-deploy.sh
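Since the script depends on both configuration files described above, it can help to confirm they exist before deploying. The sketch below is our own pre-flight helper (the `check_configs` function is not part of the repository; only the file names and script name come from this guide):

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check before running inference-stack-deploy.sh.
# The check_configs helper is our own illustration, not part of the project.
check_configs() {
  local f missing=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Report readiness only if both required files are present.
if check_configs inventory/hosts.yaml inference-config.cfg; then
  echo "configs present; ready to run: bash inference-stack-deploy.sh"
fi
```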

Post-Deployment

Intel® AI for Enterprise Inference - Brownfield Deployment

Intel® AI for Enterprise Inference supports brownfield deployment, allowing you to deploy the inference stack on an existing Kubernetes cluster without disrupting current workloads. This approach leverages your current infrastructure and preserves existing workloads and configurations.
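Before a brownfield deployment, it is worth confirming that the cluster is reachable and checking for naming collisions with existing resources. The sketch below is an assumption-laden example: the namespace name `inference` is invented for illustration, and `ns_in_use` is our own helper, not part of the project.

```shell
#!/usr/bin/env bash
# Hypothetical brownfield pre-check. The namespace name 'inference' and the
# ns_in_use helper are illustrative assumptions, not taken from the project.
ns_in_use() {
  # Pure helper: succeed if name ($1) appears in a space-separated list ($2).
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query the cluster if kubectl is available on this machine.
if command -v kubectl >/dev/null 2>&1; then
  existing="$(kubectl get namespaces -o name 2>/dev/null | sed 's|namespace/||' | tr '\n' ' ')"
  if ns_in_use inference "$existing"; then
    echo "namespace 'inference' already exists; review it before deploying"
  fi
fi
```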

For step-by-step instructions, refer to the Brownfield Deployment Guide.