Set vLLM as the default LLM serving backend, and add related Docker Compose files, READMEs, and test scripts.
Fix issue #1436
Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
DocSum/docker_compose/intel/cpu/xeon/README.md (+35 −2)

@@ -2,6 +2,8 @@
This document outlines the deployment process for a Document Summarization application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to Docker Hub soon, which will simplify the deployment process for this service.

The default pipeline deploys with vLLM as the LLM serving component. It also provides the option of using a TGI backend for the LLM microservice; refer to the [start-microservice-docker-containers](#start-microservice-docker-containers) section of this page.

## 🚀 Apply Intel Xeon Server on AWS
To apply an Intel Xeon server on AWS, start by creating an AWS account if you don't have one already. Then, head to the [EC2 Console](https://console.aws.amazon.com/ec2/v2/home) to begin the process. Within the EC2 service, select the Amazon EC2 M7i or M7i-flex instance type to leverage 4th Generation Intel Xeon Scalable processors. These instances are optimized for high-performance computing and demanding workloads.
@@ -116,9 +118,20 @@ To set up environment variables for deploying Document Summarization services, f
```bash
cd GenAIExamples/DocSum/docker_compose/intel/cpu/xeon
```
If using vLLM as the LLM serving backend:
```bash
docker compose -f compose.yaml up -d
```
If using TGI as the LLM serving backend:
```bash
docker compose -f compose_tgi.yaml up -d
```
You will have the following Docker Images:
1. `opea/docsum-ui:latest`
@@ -128,10 +141,30 @@ You will have the following Docker Images:
### Validate Microservices
- 1. TGI Service
+ 1. LLM backend Service
During the first startup, this service takes extra time to download, load, and warm up the model. Once that finishes, the service is ready.
Try the command below to check whether the LLM serving is ready.
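A readiness check along those lines can be sketched as a small polling helper. The `/health` endpoint and port `8008` shown in the usage comment are assumptions for illustration; use the LLM endpoint exported by your environment setup script.

```shell
# Poll a health endpoint until it responds successfully or the retry
# budget is exhausted. URL, retry count, and delay are parameters.
wait_for_service() {
  local url="$1" retries="${2:-30}" delay="${3:-10}"
  local i
  for i in $(seq 1 "$retries"); do
    if curl -sf "$url" > /dev/null 2>&1; then
      echo "ready"
      return 0
    fi
    sleep "$delay"
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Example usage (hypothetical host/port; vLLM exposes a /health route):
# wait_for_service http://localhost:8008/health
```

The function returns non-zero on timeout, so it can gate follow-up validation steps in a script.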
DocSum/docker_compose/intel/hpu/gaudi/README.md (+35 −2)

@@ -2,6 +2,8 @@
This document outlines the deployment process for a Document Summarization application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an Intel Gaudi server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to Docker Hub soon, which will simplify the deployment process for this service.

The default pipeline deploys with vLLM as the LLM serving component. It also provides the option of using a TGI backend for the LLM microservice; refer to the [start-microservice-docker-containers](#start-microservice-docker-containers) section of this page.

## 🚀 Build Docker Images
### 1. Build MicroService Docker Image
@@ -108,9 +110,20 @@ To set up environment variables for deploying Document Summarization services, f
```bash
cd GenAIExamples/DocSum/docker_compose/intel/hpu/gaudi
```
If using vLLM as the LLM serving backend:
```bash
docker compose -f compose.yaml up -d
```
If using TGI as the LLM serving backend:
```bash
docker compose -f compose_tgi.yaml up -d
```
You will have the following Docker Images:
1. `opea/docsum-ui:latest`
@@ -120,10 +133,30 @@ You will have the following Docker Images:
### Validate Microservices
- 1. TGI Service
+ 1. LLM backend Service
During the first startup, this service takes extra time to download, load, and warm up the model. Once that finishes, the service is ready.
Try the command below to check whether the LLM serving is ready.
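Once the backend reports ready, a quick functional check can be sketched by building a request for the OpenAI-compatible API that vLLM exposes. The endpoint, port, and default model name below are assumptions for illustration; substitute the values from your environment setup script.

```shell
# Build a sample chat-completions payload for vLLM's OpenAI-compatible API.
# LLM_MODEL_ID and the fallback model name are assumptions; match set_env.sh.
MODEL="${LLM_MODEL_ID:-meta-llama/Meta-Llama-3-8B-Instruct}"
PAYLOAD=$(printf '{"model": "%s", "messages": [{"role": "user", "content": "Summarize: OPEA provides open enterprise AI building blocks."}], "max_tokens": 64}' "$MODEL")
echo "$PAYLOAD"

# Send it once the service is up (hypothetical host/port):
# curl -s http://localhost:8008/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$PAYLOAD"
```

Building the payload in a variable keeps the model name in one place and makes the same request reusable against either the vLLM or TGI deployment.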