From b0cf5934a0cbfe220758935f23330fa21567c919 Mon Sep 17 00:00:00 2001
From: Razvan-Liviu Varzaru
Date: Wed, 7 May 2025 16:13:09 +0300
Subject: [PATCH 1/3] Add ChatQnA docker-compose example on Intel Xeon using MariaDB Vector

Example on how to deploy the ChatBot on Intel Xeon by using MariaDB Server as a vectorstore.

- use MariaDB Server as the backend database. Minimum required version is 11.7
- use the OPEA_DATAPREP_MARIADBVECTOR component for dataprep microservice
- use the OPEA_RETRIEVER_MARIADBVECTOR component for retriever microservice

How to test

Set the HF API token environment variable and:

```
cd ChatQnA/tests
bash test_compose_mariadb_on_xeon.sh
```

Signed-off-by: Razvan-Liviu Varzaru
---
 .../intel/cpu/xeon/README_mariadb.md          | 259 ++++++++++++++++++
 .../intel/cpu/xeon/compose_mariadb.yaml       | 182 ++++++++++++
 .../intel/cpu/xeon/set_env_mariadb.sh         |  25 ++
 ChatQnA/tests/test_compose_mariadb_on_xeon.sh | 176 ++++++++++++
 4 files changed, 642 insertions(+)
 create mode 100644 ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md
 create mode 100644 ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml
 create mode 100755 ChatQnA/docker_compose/intel/cpu/xeon/set_env_mariadb.sh
 create mode 100644 ChatQnA/tests/test_compose_mariadb_on_xeon.sh

diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md b/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md
new file mode 100644
index 0000000000..2e178a029f
--- /dev/null
+++ b/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md
@@ -0,0 +1,259 @@
# Deploying ChatQnA with MariaDB Vector on Intel® Xeon® Processors

This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel® Xeon® servers. The pipeline integrates **MariaDB Vector** as the vector database and includes microservices such as `embedding`, `retriever`, `rerank`, and `llm`.

---

## Table of Contents

1. [Build Docker Images](#build-docker-images)
2. [Start Microservices](#start-microservices)
3. [Validate Microservices](#validate-microservices)
4. [Launch the UI](#launch-the-ui)
5. [Launch the Conversational UI (Optional)](#launch-the-conversational-ui-optional)

---

## Build Docker Images

First of all, you need to build the required Docker images locally from the GenAIComps sources.

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 1. Build Retriever Image

```bash
docker build --no-cache -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
```

### 2. Build Dataprep Image

```bash
docker build --no-cache -t opea/dataprep:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/src/Dockerfile .
cd ..
```

### 3. Build MegaService Docker Image

To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build MegaService Docker image via below command:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA/
docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
cd ../../..
```

### 4. Build UI Docker Image

Build frontend Docker image via below command:

```bash
cd GenAIExamples/ChatQnA/ui
docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
cd ../../../..
```

### 5. Build Conversational React UI Docker Image (Optional)

Build frontend Docker image that enables Conversational experience with ChatQnA megaservice via below command:

**Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

```bash
cd GenAIExamples/ChatQnA/ui
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg BACKEND_SERVICE_ENDPOINT=$BACKEND_SERVICE_ENDPOINT --build-arg DATAPREP_SERVICE_ENDPOINT=$DATAPREP_SERVICE_ENDPOINT -f ./docker/Dockerfile.react .
cd ../../../..
```

### 6. Build Nginx Docker Image

```bash
cd GenAIComps
docker build -t opea/nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/nginx/src/Dockerfile .
```

Then run the `docker images` command; you should see the following 5 Docker images:

1. `opea/dataprep:latest`
2. `opea/retriever:latest`
3. `opea/chatqna:latest`
4. `opea/chatqna-ui:latest`
5. `opea/nginx:latest`

## Start Microservices

### Required Models

By default, the embedding, reranking and LLM models are set to the values listed below:

| Service   | Model                               |
| --------- | ----------------------------------- |
| Embedding | BAAI/bge-base-en-v1.5               |
| Reranking | BAAI/bge-reranker-base              |
| LLM       | meta-llama/Meta-Llama-3-8B-Instruct |

Change the `xxx_MODEL_ID` below for your needs.

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.

**Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

> Replace External_Public_IP below with the actual IPv4 value

```bash
export host_ip="External_Public_IP"
```

> Change to your actual Huggingface API Token value

```bash
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

**Append the value of the public IP address to the no_proxy list if you are in a proxy environment**

```bash
export no_proxy=${your_no_proxy},chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-mariadb-vector,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service
```

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export MARIADB_DATABASE="vectordb"
export MARIADB_USER="chatqna"
export MARIADB_PASSWORD="password"
```

Note: Please replace `host_ip` with your external IP address; do not use localhost.
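
Optionally, before starting the services you can run a quick sanity check that the variables above are set. This is a minimal helper sketch (it is not part of the repository scripts); the variable names simply follow the exports listed above:

```bash
# Warn about any required variable that is still empty before running docker compose.
for var in host_ip HUGGINGFACEHUB_API_TOKEN EMBEDDING_MODEL_ID RERANK_MODEL_ID \
           LLM_MODEL_ID MARIADB_DATABASE MARIADB_USER MARIADB_PASSWORD; do
  if [ -z "${!var}" ]; then
    echo "WARNING: $var is not set"
  fi
done
```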

### Start all the services Docker Containers

> Before running the docker compose command, you need to be in the folder that has the docker compose yaml file.

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
docker compose -f compose_mariadb.yaml up -d
```

### Validate Microservices

Follow the instructions below to validate the microservices.
For details on how to verify the correctness of the response, refer to [how-to-validate_service](../../hpu/gaudi/how_to_validate_service.md).

1. TEI Embedding Service

   ```bash
   curl ${host_ip}:6006/embed \
       -X POST \
       -d '{"inputs":"What is Deep Learning?"}' \
       -H 'Content-Type: application/json'
   ```

2. Retriever Microservice

   To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector
   is determined by the embedding model.
   Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

   Check the vector dimension of your embedding model and set the `your_embedding` dimension to match it.

   ```bash
   export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
   curl http://${host_ip}:7000/v1/retrieval \
     -X POST \
     -d '{"text":"What is the revenue of Nike in 2023?","embedding":'"${your_embedding}"'}' \
     -H 'Content-Type: application/json'
   ```

3. TEI Reranking Service

   ```bash
   curl http://${host_ip}:8808/rerank \
       -X POST \
       -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
       -H 'Content-Type: application/json'
   ```

4. LLM Backend Service

   On the first startup, this service takes extra time to download, load, and warm up the model. Once that finishes, the service is ready.

   Try the command below to check whether the LLM service is ready.

   ```bash
   docker logs vllm-service 2>&1 | grep complete
   ```

   If the service is ready, you will see a response like the one below.

   ```text
   INFO: Application startup complete.
   ```

   Then try the `cURL` command below to validate the vLLM service.

   ```bash
   curl http://${host_ip}:9009/v1/chat/completions \
     -X POST \
     -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens":17}' \
     -H 'Content-Type: application/json'
   ```

5. MegaService

   ```bash
   curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{
        "messages": "What is the revenue of Nike in 2023?"
        }'
   ```

6. Dataprep Microservice (Optional)

   If you want to update the default knowledge base, you can use the following commands:

   Update Knowledge Base via Local File Upload:

   ```bash
   curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \
        -H "Content-Type: multipart/form-data" \
        -F "files=@./your_file.pdf"
   ```

   This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment.

   Add Knowledge Base via HTTP Links:

   ```bash
   curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \
        -H "Content-Type: multipart/form-data" \
        -F 'link_list=["https://opea.dev"]'
   ```

## Launch the UI

To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. 
If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose.yaml` file as shown below: + +```yaml + chatqna-xeon-ui-server: + image: opea/chatqna-ui:latest + ... + ports: + - "80:5173" +``` + +![project-screenshot](../../../../assets/img/chat_ui_init.png) + +Here is an example of running ChatQnA: + +![project-screenshot](../../../../assets/img/chat_ui_response.png) diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml b/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml new file mode 100644 index 0000000000..41106ac7eb --- /dev/null +++ b/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml @@ -0,0 +1,182 @@ +# Copyright (C) 2025 MariaDB Foundation +# SPDX-License-Identifier: Apache-2.0 + +services: + mariadb-server: + image: mariadb:latest + container_name: mariadb-server + ports: + - "3306:3306" + environment: + - MARIADB_DATABASE=${MARIADB_DATABASE} + - MARIADB_USER=${MARIADB_USER} + - MARIADB_PASSWORD=${MARIADB_PASSWORD} + - MARIADB_RANDOM_ROOT_PASSWORD=1 + healthcheck: + test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"] + start_period: 10s + interval: 10s + timeout: 5s + retries: 3 + dataprep-mariadb-vector: + image: ${REGISTRY:-opea}/dataprep:${TAG:-latest} + container_name: dataprep-mariadb-vector + depends_on: + - mariadb-server + - tei-embedding-service + ports: + - "6007:5000" + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + DATAPREP_COMPONENT_NAME: "OPEA_DATAPREP_MARIADBVECTOR" + MARIADB_CONNECTION_URL: mariadb+mariadbconnector://${MARIADB_USER}:${MARIADB_PASSWORD}@mariadb-server:3306/${MARIADB_DATABASE} + TEI_ENDPOINT: http://tei-embedding-service:80 + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + healthcheck: + test: ["CMD-SHELL", "curl -f http://localhost:5000/v1/health_check || exit 1"] + interval: 10s + timeout: 5s + retries: 50 + restart: unless-stopped + tei-embedding-service: + image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 + container_name: tei-embedding-server + ports: + - "6006:80" + volumes: + - "${MODEL_CACHE:-./data}:/data" + shm_size: 1g + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate + retriever: + image: ${REGISTRY:-opea}/retriever:${TAG:-latest} + container_name: retriever-mariadb-vector + depends_on: + - mariadb-server + ports: + - "7000:7000" + ipc: host + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + MARIADB_CONNECTION_URL: mariadb+mariadbconnector://${MARIADB_USER}:${MARIADB_PASSWORD}@mariadb-server:3306/${MARIADB_DATABASE} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + LOGFLAG: ${LOGFLAG} + RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_MARIADBVECTOR" + restart: unless-stopped + tei-reranking-service: + image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 + container_name: tei-reranking-server + ports: + - "8808:80" + volumes: + - "${MODEL_CACHE:-./data}:/data" + shm_size: 1g + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + HF_HUB_DISABLE_PROGRESS_BARS: 1 + HF_HUB_ENABLE_HF_TRANSFER: 0 + command: --model-id ${RERANK_MODEL_ID} --auto-truncate + vllm-service: + image: ${REGISTRY:-opea}/vllm:${TAG:-latest} + container_name: vllm-service + ports: + - "9009:80" + volumes: + - 
"${MODEL_CACHE:-./data}:/root/.cache/huggingface/hub" + shm_size: 128g + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + LLM_MODEL_ID: ${LLM_MODEL_ID} + VLLM_TORCH_PROFILER_DIR: "/mnt" + VLLM_CPU_KVCACHE_SPACE: 40 + healthcheck: + test: ["CMD-SHELL", "curl -f http://$host_ip:9009/health || exit 1"] + interval: 10s + timeout: 10s + retries: 100 + command: --model $LLM_MODEL_ID --host 0.0.0.0 --port 80 + chatqna-xeon-backend-server: + image: ${REGISTRY:-opea}/chatqna:${TAG:-latest} + container_name: chatqna-xeon-backend-server + depends_on: + mariadb-server: + condition: service_started + dataprep-mariadb-vector: + condition: service_healthy + tei-embedding-service: + condition: service_started + retriever: + condition: service_started + tei-reranking-service: + condition: service_started + vllm-service: + condition: service_healthy + ports: + - "8888:8888" + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - MEGA_SERVICE_HOST_IP=chatqna-xeon-backend-server + - EMBEDDING_SERVER_HOST_IP=tei-embedding-service + - EMBEDDING_SERVER_PORT=${EMBEDDING_SERVER_PORT:-80} + - RETRIEVER_SERVICE_HOST_IP=retriever + - RERANK_SERVER_HOST_IP=tei-reranking-service + - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80} + - LLM_SERVER_HOST_IP=vllm-service + - LLM_SERVER_PORT=80 + - LLM_MODEL=${LLM_MODEL_ID} + - LOGFLAG=${LOGFLAG} + ipc: host + restart: always + chatqna-xeon-ui-server: + image: ${REGISTRY:-opea}/chatqna-ui:${TAG:-latest} + container_name: chatqna-xeon-ui-server + depends_on: + - chatqna-xeon-backend-server + ports: + - "5173:5173" + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + ipc: host + restart: always + chatqna-xeon-nginx-server: + image: ${REGISTRY:-opea}/nginx:${TAG:-latest} + container_name: chatqna-xeon-nginx-server + depends_on: + - chatqna-xeon-backend-server + - chatqna-xeon-ui-server + ports: + - "${NGINX_PORT:-80}:80" + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - FRONTEND_SERVICE_IP=chatqna-xeon-ui-server + - FRONTEND_SERVICE_PORT=5173 + - BACKEND_SERVICE_NAME=chatqna + - BACKEND_SERVICE_IP=chatqna-xeon-backend-server + - BACKEND_SERVICE_PORT=8888 + - DATAPREP_SERVICE_IP=dataprep-mariadb-vector + - DATAPREP_SERVICE_PORT=5000 + ipc: host + restart: always + +networks: + default: + driver: bridge diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/set_env_mariadb.sh b/ChatQnA/docker_compose/intel/cpu/xeon/set_env_mariadb.sh new file mode 100755 index 0000000000..88ae5c0eec --- /dev/null +++ b/ChatQnA/docker_compose/intel/cpu/xeon/set_env_mariadb.sh @@ -0,0 +1,25 @@ +#!/usr/bin/env bash + +# Copyright (C) 2025 MariaDB Foundation +# SPDX-License-Identifier: Apache-2.0 + +pushd "../../../../../" > /dev/null +source .set_env.sh +popd > /dev/null + +if [ -z "${HUGGINGFACEHUB_API_TOKEN}" ]; then + echo "Error: HUGGINGFACEHUB_API_TOKEN is not set. Please set HUGGINGFACEHUB_API_TOKEN." 
+fi + +export host_ip=$(hostname -I | awk '{print $1}') +export MARIADB_DATABASE="vectordb" +export MARIADB_USER="chatqna" +export MARIADB_PASSWORD="password" +export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} +export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" +export RERANK_MODEL_ID="BAAI/bge-reranker-base" +export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct" +export LOGFLAG="" +export no_proxy="$no_proxy,chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm-service,jaeger,prometheus,grafana,node-exporter" +export LLM_SERVER_PORT=9000 +export NGINX_PORT=80 diff --git a/ChatQnA/tests/test_compose_mariadb_on_xeon.sh b/ChatQnA/tests/test_compose_mariadb_on_xeon.sh new file mode 100644 index 0000000000..45f5f99e47 --- /dev/null +++ b/ChatQnA/tests/test_compose_mariadb_on_xeon.sh @@ -0,0 +1,176 @@ +#!/bin/bash +# Copyright (C) 2025 MariaDB Foundation +# SPDX-License-Identifier: Apache-2.0 + +set -e +IMAGE_REPO=${IMAGE_REPO:-"opea"} +IMAGE_TAG=${IMAGE_TAG:-"latest"} +echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}" +echo "TAG=IMAGE_TAG=${IMAGE_TAG}" +export REGISTRY=${IMAGE_REPO} +export TAG=${IMAGE_TAG} +export MODEL_CACHE=${model_cache:-"./data"} + +WORKPATH=$(dirname "$PWD") +LOG_PATH="$WORKPATH/tests" +ip_address=$(hostname -I | awk '{print $1}') + +function build_docker_images() { + opea_branch=${opea_branch:-"main"} + cd $WORKPATH/docker_image_build + git clone --depth 1 --branch ${opea_branch} https://github.com/opea-project/GenAIComps.git + pushd GenAIComps + echo "GenAIComps test commit is $(git rev-parse HEAD)" + docker build --no-cache -t ${REGISTRY}/comps-base:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile . + popd && sleep 1s + git clone https://github.com/vllm-project/vllm.git && cd vllm + VLLM_VER="v0.8.3" + echo "Check out vLLM tag ${VLLM_VER}" + git checkout ${VLLM_VER} &> /dev/null + # make sure NOT change the pwd + cd ../ + + echo "Build all the images with --no-cache, check docker_image_build.log for details..." + service_list="chatqna chatqna-ui dataprep retriever vllm nginx" + docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log + + docker images && sleep 1s +} + +function start_services() { + cd $WORKPATH/docker_compose/intel/cpu/xeon + export MARIADB_DATABASE="vectordb" + export MARIADB_USER="chatqna" + export MARIADB_PASSWORD="test" + export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" + export RERANK_MODEL_ID="BAAI/bge-reranker-base" + export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct" + export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} + export host_ip=${ip_address} + + # Start Docker Containers + docker compose -f compose_mariadb.yaml up -d > ${LOG_PATH}/start_services_with_compose.log + n=0 + until [[ "$n" -ge 100 ]]; do + docker logs vllm-service > ${LOG_PATH}/vllm_service_start.log 2>&1 + if grep -q complete ${LOG_PATH}/vllm_service_start.log; then + break + fi + sleep 5s + n=$((n+1)) + done +} + +function validate_service() { + local URL="$1" + local EXPECTED_RESULT="$2" + local SERVICE_NAME="$3" + local DOCKER_NAME="$4" + local INPUT_DATA="$5" + + local HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST -d "$INPUT_DATA" -H 'Content-Type: application/json' "$URL") + if [ "$HTTP_STATUS" -eq 200 ]; then + echo "[ $SERVICE_NAME ] HTTP status is 200. Checking content..." 
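        # Query the endpoint again to capture the response body; the first curl above only recorded the HTTP status code.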
+ + local CONTENT=$(curl -s -X POST -d "$INPUT_DATA" -H 'Content-Type: application/json' "$URL" | tee ${LOG_PATH}/${SERVICE_NAME}.log) + + if echo "$CONTENT" | grep -q "$EXPECTED_RESULT"; then + echo "[ $SERVICE_NAME ] Content is as expected." + else + echo "[ $SERVICE_NAME ] Content does not match the expected result: $CONTENT" + docker logs ${DOCKER_NAME} >> ${LOG_PATH}/${SERVICE_NAME}.log + exit 1 + fi + else + echo "[ $SERVICE_NAME ] HTTP status is not 200. Received status was $HTTP_STATUS" + docker logs ${DOCKER_NAME} >> ${LOG_PATH}/${SERVICE_NAME}.log + exit 1 + fi + sleep 1s +} + +function validate_microservices() { + # Check if the microservices are running correctly. + sleep 3m + + # tei for embedding service + validate_service \ + "${ip_address}:6006/embed" \ + "\[\[" \ + "tei-embedding" \ + "tei-embedding-server" \ + '{"inputs":"What is Deep Learning?"}' + + # retrieval microservice + test_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + validate_service \ + "${ip_address}:7000/v1/retrieval" \ + " " \ + "retrieval" \ + "retriever-mariadb-vector" \ + "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${test_embedding}}" + + # tei for rerank microservice + validate_service \ + "${ip_address}:8808/rerank" \ + '{"index":1,"score":' \ + "tei-rerank" \ + "tei-reranking-server" \ + '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' + + # vllm for llm service + validate_service \ + "${ip_address}:9009/v1/chat/completions" \ + "content" \ + "vllm-llm" \ + "vllm-service" \ + '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens": 17}' +} + +function validate_megaservice() { + # Curl the Mega Service + validate_service \ + "${ip_address}:8888/v1/chatqna" \ + "Nike" \ + "mega-chatqna" \ + "chatqna-xeon-backend-server" \ + '{"messages": "What is the revenue of Nike in 2023?"}' + +} + +function stop_docker() { + cd $WORKPATH/docker_compose/intel/cpu/xeon + docker compose down +} + +function main() { + + echo "::group::stop_docker" + stop_docker + echo "::endgroup::" + + echo "::group::build_docker_images" + if [[ "$IMAGE_REPO" == "opea" ]]; then build_docker_images; fi + echo "::endgroup::" + + echo "::group::start_services" + start_services + echo "::endgroup::" + + echo "::group::validate_microservices" + validate_microservices + echo "::endgroup::" + + echo "::group::validate_megaservice" + validate_megaservice + echo "::endgroup::" + + echo "::group::stop_docker" + stop_docker + echo "::endgroup::" + + docker system prune -f + +} + +main From 1f84e0dae5ddcffd7793f1ecd04c1924175a7bf4 Mon Sep 17 00:00:00 2001 From: Razvan-Liviu Varzaru Date: Thu, 8 May 2025 09:55:32 +0300 Subject: [PATCH 2/3] Address review comments Signed-off-by: Razvan-Liviu Varzaru --- ChatQnA/docker_compose/intel/cpu/xeon/README.md | 1 + .../docker_compose/intel/cpu/xeon/README_mariadb.md | 8 ++++---- .../intel/cpu/xeon/compose_mariadb.yaml | 11 +++++++---- 3 files changed, 12 insertions(+), 8 deletions(-) diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/README.md b/ChatQnA/docker_compose/intel/cpu/xeon/README.md index e2e2deaaa6..c47dfe5022 100644 --- a/ChatQnA/docker_compose/intel/cpu/xeon/README.md +++ b/ChatQnA/docker_compose/intel/cpu/xeon/README.md @@ -156,6 +156,7 @@ In the context of deploying a ChatQnA pipeline on an Intel® Xeon® platform, we | [compose_faqgen_tgi.yaml](./compose_faqgen_tgi.yaml) | Enables 
FAQ generation using TGI as the LLM serving framework. For more details, refer to [README_faqgen.md](./README_faqgen.md). |
 | [compose.telemetry.yaml](./compose.telemetry.yaml) | Helper file for telemetry features for vllm. Can be used along with any compose files that serves vllm |
 | [compose_tgi.telemetry.yaml](./compose_tgi.telemetry.yaml) | Helper file for telemetry features for tgi. Can be used along with any compose files that serves tgi |
+| [compose_mariadb.yaml](./compose_mariadb.yaml) | Uses MariaDB Server as the vector database. All other configurations remain the same as the default

 ## ChatQnA with Conversational UI (Optional)

diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md b/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md
index 2e178a029f..4717e61109 100644
--- a/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md
+++ b/ChatQnA/docker_compose/intel/cpu/xeon/README_mariadb.md
@@ -43,7 +43,7 @@ To construct the Mega Service, we utilize the [GenAIComps](https://github.com/op
 git clone https://github.com/opea-project/GenAIExamples.git
 cd GenAIExamples/ChatQnA/
 docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
-cd ../../..
+cd ../..
 ```

 ### 4. Build UI Docker Image
@@ -53,7 +53,7 @@ Build frontend Docker image via below command:
 ```bash
 cd GenAIExamples/ChatQnA/ui
 docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
-cd ../../../..
+cd ../../..
 ```

 ### 5. Build Conversational React UI Docker Image (Optional)
@@ -67,7 +67,7 @@ cd GenAIExamples/ChatQnA/ui
 export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
 export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep/ingest"
 docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg BACKEND_SERVICE_ENDPOINT=$BACKEND_SERVICE_ENDPOINT --build-arg DATAPREP_SERVICE_ENDPOINT=$DATAPREP_SERVICE_ENDPOINT -f ./docker/Dockerfile.react .
-cd ../../../..
+cd ../../..
 ```

 ### 6. Build Nginx Docker Image
@@ -101,7 +101,7 @@ Change the `xxx_MODEL_ID` below for your needs.

 ### Setup Environment Variables

-Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
+Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
 **Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml b/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml
index 41106ac7eb..9e109e6144 100644
--- a/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml
+++ b/ChatQnA/docker_compose/intel/cpu/xeon/compose_mariadb.yaml
@@ -22,8 +22,10 @@ services:
     image: ${REGISTRY:-opea}/dataprep:${TAG:-latest}
     container_name: dataprep-mariadb-vector
     depends_on:
-      - mariadb-server
-      - tei-embedding-service
+      mariadb-server:
+        condition: service_healthy
+      tei-embedding-service:
+        condition: service_started
     ports:
       - "6007:5000"
     environment:
@@ -57,7 +59,8 @@ services:
     image: ${REGISTRY:-opea}/retriever:${TAG:-latest}
     container_name: retriever-mariadb-vector
     depends_on:
-      - mariadb-server
+      mariadb-server:
+        condition: service_healthy
     ports:
       - "7000:7000"
     ipc: host
@@ -113,7 +116,7 @@ services:
     container_name: chatqna-xeon-backend-server
     depends_on:
       mariadb-server:
-        condition: service_started
+        condition: service_healthy
       dataprep-mariadb-vector:
         condition: service_healthy
       tei-embedding-service:

From 1557571ff46e87be5cee00d7ae563dd82df265c6 Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Thu, 8 May 2025 06:58:10 +0000
Subject: [PATCH 3/3] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 ChatQnA/docker_compose/intel/cpu/xeon/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/README.md b/ChatQnA/docker_compose/intel/cpu/xeon/README.md
index c47dfe5022..eea4c6132d 100644
--- a/ChatQnA/docker_compose/intel/cpu/xeon/README.md
+++ b/ChatQnA/docker_compose/intel/cpu/xeon/README.md
@@ -156,7 +156,7 @@ In the context of deploying a ChatQnA pipeline on an Intel® Xeon® platform, we
 | [compose_faqgen_tgi.yaml](./compose_faqgen_tgi.yaml) | Enables FAQ generation using TGI as the LLM serving framework. For more details, refer to [README_faqgen.md](./README_faqgen.md). |
 | [compose.telemetry.yaml](./compose.telemetry.yaml) | Helper file for telemetry features for vllm. Can be used along with any compose files that serves vllm |
 | [compose_tgi.telemetry.yaml](./compose_tgi.telemetry.yaml) | Helper file for telemetry features for tgi. Can be used along with any compose files that serves tgi |
-| [compose_mariadb.yaml](./compose_mariadb.yaml) | Uses MariaDB Server as the vector database. All other configurations remain the same as the default
+| [compose_mariadb.yaml](./compose_mariadb.yaml) | Uses MariaDB Server as the vector database. All other configurations remain the same as the default |

 ## ChatQnA with Conversational UI (Optional)