Commit e092a1d
Add ChatQnA docker-compose example on Intel Xeon using MariaDB Vector
Example of how to deploy the ChatBot on Intel Xeon using MariaDB Server as a vector store.

- Use MariaDB Server as the backend database. The minimum required version is 11.7.
- Use the OPEA_DATAPREP_MARIADBVECTOR component for the dataprep microservice.
- Use the OPEA_RETRIEVER_MARIADBVECTOR component for the retriever microservice.

How to test: set the HF API token environment variable and run:

```
cd ChatQnA/tests
bash test_compose_mariadb_on_xeon.sh
```

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
1 parent 9259ba4 commit e092a1d

4 files changed (+642, −0 lines)

# Deploying ChatQnA with MariaDB Vector on Intel® Xeon® Processors

This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel® Xeon® servers. The pipeline integrates **MariaDB Vector** as the vector database and includes microservices such as `embedding`, `retriever`, `rerank`, and `llm`.

---

## Table of Contents

1. [Build Docker Images](#build-docker-images)
2. [Validate Microservices](#validate-microservices)
3. [Launch the UI](#launch-the-ui)
4. [Launch the Conversational UI (Optional)](#launch-the-conversational-ui-optional)

---

## Build Docker Images

First of all, you need to build the required Docker images locally. Start by cloning the GenAIComps repository:
```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 1. Build Retriever Image

```bash
docker build --no-cache -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

### 2. Build Dataprep Image

```bash
docker build --no-cache -t opea/dataprep:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/src/Dockerfile .
cd ..
```

### 3. Build MegaService Docker Image

To construct the MegaService, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image with the command below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA/
docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
cd ../../..
```
### 4. Build UI Docker Image

Build the frontend Docker image with the command below:

```bash
cd GenAIExamples/ChatQnA/ui
docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
cd ../../../..
```

### 5. Build Conversational React UI Docker Image (Optional)

Build the frontend Docker image that enables a conversational experience with the ChatQnA MegaService using the commands below:

**Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

```bash
cd GenAIExamples/ChatQnA/ui
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8912/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6043/v1/dataprep/ingest"
docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg BACKEND_SERVICE_ENDPOINT=$BACKEND_SERVICE_ENDPOINT --build-arg DATAPREP_SERVICE_ENDPOINT=$DATAPREP_SERVICE_ENDPOINT -f ./docker/Dockerfile.react .
cd ../../../..
```
### 6. Build Nginx Docker Image

```bash
cd GenAIComps
docker build -t opea/nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/nginx/src/Dockerfile .
```

Then run `docker images`; you should see the following 5 Docker images:

1. `opea/dataprep:latest`
2. `opea/retriever:latest`
3. `opea/chatqna:latest`
4. `opea/chatqna-ui:latest`
5. `opea/nginx:latest`
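
As an optional quick check, you can filter the output of `docker images` for just the images built above:

```bash
# Optional: show only the OPEA images built in the previous steps
docker images | grep -E 'opea/(dataprep|retriever|chatqna|chatqna-ui|nginx)'
```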

## Start Microservices

### Required Models

By default, the embedding, reranking, and LLM models are set to the values listed below:

| Service   | Model                               |
| --------- | ----------------------------------- |
| Embedding | BAAI/bge-base-en-v1.5               |
| Reranking | BAAI/bge-reranker-base              |
| LLM       | meta-llama/Meta-Llama-3-8B-Instruct |

Change the `xxx_MODEL_ID` values below to suit your needs.
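
The model IDs are consumed through the environment variables set up in the next section. As an illustration only (the particular alternatives named here are assumptions, not part of this example), switching models is just a matter of exporting different Hugging Face model IDs; note that a different embedding model also changes the vector dimension used later in the retriever validation:

```bash
# Illustration only: point the stack at alternative Hugging Face models
export EMBEDDING_MODEL_ID="BAAI/bge-large-en-v1.5"   # 1024-dimensional embeddings
export RERANK_MODEL_ID="BAAI/bge-reranker-large"
export LLM_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3"
```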

### Setup Environment Variables

Since `compose_mariadb.yaml` consumes several environment variables, you need to set them up in advance as shown below.

**Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

> Replace `External_Public_IP` below with the actual IPv4 value

```bash
export host_ip="External_Public_IP"
```
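
If you are not sure which address to use, one convenient way to pick up the host's primary IPv4 address on a typical Linux server is shown below (any method that yields the externally reachable address works):

```bash
# Convenience only: use the first address reported by the host (Linux)
export host_ip=$(hostname -I | awk '{print $1}')
echo ${host_ip}
```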

> Replace the value below with your actual Hugging Face API token

```bash
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

**Append the value of the public IP address to the no_proxy list if you are in a proxy environment**

```bash
export no_proxy=${your_no_proxy},chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-mariadb-vector,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service
```

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export MARIADB_DATABASE="vectordb"
export MARIADB_USER="chatqna"
export MARIADB_PASSWORD="password"
```

Note: Please replace `host_ip` with your external IP address; do not use localhost.
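
Before bringing up the stack, a quick sanity check that the variables are set in the current shell can save a failed startup (the Hugging Face token is left out of the grep so it is not echoed to the terminal):

```bash
# Confirm the key variables are exported in this shell
env | grep -E 'host_ip|MODEL_ID|MARIADB'
```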

### Start the Docker Containers for All Services

> Before running the docker compose command, make sure you are in the folder that contains the Docker Compose YAML file.

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
docker compose -f compose_mariadb.yaml up -d
```
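
Once Compose returns, confirm that all containers came up before moving on to validation. The MariaDB check below is optional and assumes the database container is named `mariadb-server` (check `docker ps` for the actual name) and uses the credentials exported earlier:

```bash
# List the services started from this compose file and their status
docker compose -f compose_mariadb.yaml ps

# Optional: confirm MariaDB is reachable and is version 11.7 or newer
# (the container name is an assumption; adjust it to match your deployment)
docker exec mariadb-server mariadb -u chatqna -ppassword -e "SELECT VERSION();"
```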

### Validate Microservices

Follow the instructions below to validate the microservices.
For details on how to verify the correctness of the response, refer to [how-to-validate_service](../../hpu/gaudi/how_to_validate_service.md).

1. TEI Embedding Service

```bash
curl ${host_ip}:6040/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```

2. Retriever Microservice

To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector is determined by the embedding model.
Here we use `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

Check the vector dimension of your embedding model and set the `your_embedding` dimension to match.
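
If you switch embedding models and are unsure of the new dimension, one way to check it is to reuse the TEI endpoint validated above, which returns a JSON array of embedding vectors, and count the values in the first vector:

```bash
# Embed a test string and print the vector length (expect 768 for bge-base-en-v1.5)
curl -s ${host_ip}:6040/embed \
    -X POST \
    -d '{"inputs":"dimension check"}' \
    -H 'Content-Type: application/json' | python3 -c "import sys, json; print(len(json.load(sys.stdin)[0]))"
```

With the dimension confirmed, generate a mock embedding of that length and query the retriever: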

```bash
export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:6045/v1/retrieval \
    -X POST \
    -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \
    -H 'Content-Type: application/json'
```

3. TEI Reranking Service

```bash
curl http://${host_ip}:6041/rerank \
    -X POST \
    -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json'
```

4. LLM Backend Service

On the first startup, this service takes extra time to download, load, and warm up the model. Once that finishes, the service is ready.

Try the command below to check whether the LLM service is ready.

```bash
docker logs vllm-service 2>&1 | grep complete
```

If the service is ready, you will see a log line like the one below.

```text
INFO: Application startup complete.
```

Then try the `curl` command below to validate the vLLM service.

```bash
curl http://${host_ip}:6042/v1/chat/completions \
    -X POST \
    -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
    -H 'Content-Type: application/json'
```
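
Because vLLM serves an OpenAI-compatible API, you can optionally pipe the response through `jq` (if it is installed) to read only the generated text; the field path follows the standard chat-completions response format:

```bash
# Optional: extract only the generated message content (requires jq)
curl -s http://${host_ip}:6042/v1/chat/completions \
    -X POST \
    -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
    -H 'Content-Type: application/json' | jq -r '.choices[0].message.content'
```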

5. MegaService

```bash
curl http://${host_ip}:8912/v1/chatqna -H "Content-Type: application/json" -d '{
     "messages": "What is the revenue of Nike in 2023?"
     }'
```

6. Dataprep Microservice (Optional)

If you want to update the default knowledge base, you can use the following commands:

Update the knowledge base via local file upload:

```bash
curl -X POST "http://${host_ip}:6043/v1/dataprep/ingest" \
    -H "Content-Type: multipart/form-data" \
    -F "files=@./your_file.pdf"
```

This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment.

Add to the knowledge base via HTTP links:

```bash
curl -X POST "http://${host_ip}:6043/v1/dataprep/ingest" \
    -H "Content-Type: multipart/form-data" \
    -F 'link_list=["https://opea.dev"]'
```
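
After ingestion finishes, you can re-run a MegaService query that touches the newly added content to confirm that retrieval now draws on it; the question below is only an illustration:

```bash
# Example follow-up query against content ingested from https://opea.dev
curl http://${host_ip}:8912/v1/chatqna -H "Content-Type: application/json" -d '{
     "messages": "What is OPEA?"
     }'
```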

## Launch the UI

To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose_mariadb.yaml` file as shown below:

```yaml
  chatqna-xeon-ui-server:
    image: opea/chatqna-ui:latest
    ...
    ports:
      - "80:5173"
```
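
If you change the mapping, re-apply the compose file so the UI container is recreated with the new port, then browse to the new host port (port 80 in the example above):

```bash
# Recreate only the UI service so the updated port mapping takes effect
docker compose -f compose_mariadb.yaml up -d chatqna-xeon-ui-server
```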

![project-screenshot](../../../../assets/img/chat_ui_init.png)

Here is an example of running ChatQnA:

![project-screenshot](../../../../assets/img/chat_ui_response.png)
