AgentQnA - add support for remote server #1900
Open

alexsin368 wants to merge 7 commits into opea-project:main from alexsin368:agentqna-iaas
Commits (7):
- c442768 add support for remote server (alexsin368)
- a9cbcbe [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 1ccdda0 address comments, simplify compose_remote.yaml (alexsin368)
- 65f803d Merge branch 'agentqna-iaas' of https://github.com/alexsin368/GenAIEx… (alexsin368)
- 2137e6f [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- c3738c4 simplify compose_remote.yaml (alexsin368)
- b87c25c Merge branch 'agentqna-iaas' of https://github.com/alexsin368/GenAIEx… (alexsin368)
````diff
@@ -99,7 +99,7 @@ flowchart LR

 #### First, clone the `GenAIExamples` repo.

-```
+```bash
 export WORKDIR=<your-work-directory>
 cd $WORKDIR
 git clone https://github.com/opea-project/GenAIExamples.git
@@ -109,7 +109,7 @@ git clone https://github.com/opea-project/GenAIExamples.git

 ##### For proxy environments only

-```
+```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
 # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
@@ -118,31 +118,43 @@ export no_proxy="Your_No_Proxy"

 ##### For using open-source llms

-```
+Set up a [HuggingFace](https://huggingface.co/) account and generate a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
+
+Then set an environment variable with the token and another for a directory to download the models:
+
+```bash
 export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
-export HF_CACHE_DIR=<directory-where-llms-are-downloaded> #so that no need to redownload every time
+export HF_CACHE_DIR=<directory-where-llms-are-downloaded> # to avoid redownloading models
 ```

-##### [Optional] OPANAI_API_KEY to use OpenAI models
+##### [Optional] OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference

-```
+To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).
+
+To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.
+
+Then set the environment variable `OPENAI_API_KEY` with the key contents:
+
+```bash
 export OPENAI_API_KEY=<your-openai-key>
 ```

 #### Third, set up environment variables for the selected hardware using the corresponding `set_env.sh`

 ##### Gaudi

-```
+```bash
 source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/set_env.sh
 ```

 ##### Xeon

-```
+```bash
 source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
 ```

 For running

 ### 2. Launch the multi-agent system. </br>

 We make it convenient to launch the whole system with docker compose, which includes microservices for LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. There are 3 docker compose files, which make it easy for users to pick and choose. Users can choose a different retrieval tool other than the `DocIndexRetriever` example provided in our GenAIExamples repo. Users can choose not to launch the telemetry containers.
````
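A quick way to confirm the HuggingFace token works before launching anything: a minimal sketch, assuming the `huggingface_hub` package is installed (this check is not part of the PR's changes).

```bash
# Sanity check (assumption: `pip install huggingface_hub` has been run):
# prints the account name if HUGGINGFACEHUB_API_TOKEN is valid.
python -c "import os; from huggingface_hub import whoami; print(whoami(token=os.environ['HUGGINGFACEHUB_API_TOKEN'])['name'])"
```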
````diff
@@ -184,14 +196,37 @@ docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/

 #### Launch on Xeon

-On Xeon, only OpenAI models are supported. The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
+On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.

 ```bash
 export OPENAI_API_KEY=<your-openai-key>
 cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
 ```

+##### OpenAI Models
+
+The command below will launch the multi-agent system with the `DocIndexRetriever` as the retrieval tool for the Worker RAG agent.
+
+```bash
 docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
 ```

+##### Models on Remote Server
+
+When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.
+
+###### Notes
+
+- `OPENAI_API_KEY` is already set in a previous step.
+- `model` is used to overwrite the value set for this environment variable in `set_env.sh`.
+- `LLM_ENDPOINT_URL` is the base URL given from the owner of the on-prem machine or cloud service provider. It will follow this format: "https://<DNS>". Here is an example: "https://api.inference.example.com".
+
+```bash
+export model=<name-of-model-card>
+export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
+docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
+```

 ### 3. Ingest Data into the vector database

 The `run_ingest_data.sh` script will use an example jsonl file to ingest example documents into a vector database. Other ways to ingest data and other types of documents supported can be found in the OPEA dataprep microservice located in the opea-project/GenAIComps repo.
````

Review comment on `export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>`:

> you probably need to give an example to explain how it works for Denvr.

Author:

> I added some notes.
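Picking up the reviewer's request for a concrete example: a minimal pre-flight check of the remote credentials, assuming the Enterprise Inference server exposes the standard OpenAI-compatible `/v1/models` route under the base URL (an assumption, not confirmed in this diff).

```bash
# List the model cards the key can access; a returned "id" is the value
# to export as `model`. The route is assumed to be OpenAI-compatible.
curl -s "$LLM_ENDPOINT_URL/v1/models" \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```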
````diff
@@ -208,12 +243,18 @@ bash run_ingest_data.sh

 The UI microservice is launched in the previous step with the other microservices.
 To see the UI, open a web browser to `http://${ip_address}:5173` to access the UI. Note the `ip_address` here is the host IP of the UI microservice.

-1. `create Admin Account` with a random value
-2. add opea agent endpoint `http://$ip_address:9090/v1` which is a openai compatible api
+1. Click on the arrow above `Get started`. Create an admin account with a name, email, and password.
+2. Add an OpenAI-compatible API endpoint. In the upper right, click on the circle button with the user's initial, go to `Admin Settings`->`Connections`. Under `Manage OpenAI API Connections`, click on the `+` to add a connection. Fill in these fields:
+
+   - **URL**: `http://${ip_address}:9090/v1`, do not forget the `v1`
+   - **Key**: any value
+   - **Model IDs**: any name i.e. `opea-agent`, then press `+` to add it
+
+   Click "Save".
+
+   
+
-3. test opea agent with ui
+3. Test OPEA agent with UI. Return to `New Chat` and ensure the model (i.e. `opea-agent`) is selected near the upper left. Enter in any prompt to interact with the agent.
+
+   
````
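The endpoint registered in step 2 is OpenAI-compatible, so it can also be smoke-tested without the UI. A sketch assuming the standard chat completions route; the model name is whatever ID was registered in the connection (e.g. `opea-agent`).

```bash
# Hypothetical direct call to the agent's OpenAI-compatible endpoint:
curl -s "http://${ip_address}:9090/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "opea-agent", "messages": [{"role": "user", "content": "Hello"}]}'
```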
AgentQnA/docker_compose/intel/cpu/xeon/compose_remote.yaml (new file, 18 additions, 0 deletions)
```diff
@@ -0,0 +1,18 @@
+# Copyright (C) 2025 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+services:
+  worker-rag-agent:
+    environment:
+      llm_endpoint_url: ${LLM_ENDPOINT_URL}
+      api_key: ${OPENAI_API_KEY}
+
+  worker-sql-agent:
+    environment:
+      llm_endpoint_url: ${LLM_ENDPOINT_URL}
+      api_key: ${OPENAI_API_KEY}
+
+  supervisor-react-agent:
+    environment:
+      llm_endpoint_url: ${LLM_ENDPOINT_URL}
+      api_key: ${OPENAI_API_KEY}
```
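Since `compose_remote.yaml` only layers environment overrides onto the three agent services, the merged result can be previewed before launch. A minimal sketch using `docker compose config`, with paths assumed to match the launch command above:

```bash
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
# Render the merged config and confirm llm_endpoint_url/api_key reach each agent:
docker compose \
  -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml \
  -f compose_openai.yaml -f compose_remote.yaml config | grep -B2 -A3 "llm_endpoint_url"
```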
Reviewer:

> What is the difference between this `model` variable and `MODEL_ID` in `set_env.sh`? Why do users need to set the model name in two different places?

Author:

> For the agent, `model` is the env variable used. There is no `LLM_MODEL_ID`. `EMBEDDING_MODEL_ID` and `RERANK_MODEL_ID` are used by the retriever. We need to set `model` here to overwrite the original value ("gpt-4o-mini-2024-07-18") set in `set_env.sh`. Added a note on this.
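To make the override mechanics concrete: a sketch of the order of operations, assuming `set_env.sh` exports `model` directly as the reply indicates, and that the compose files pass `${model}` through to the agent containers.

```bash
# Order matters: source the defaults first, then re-export to override.
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh  # default: model="gpt-4o-mini-2024-07-18"
export model=<name-of-model-card>  # the later export wins; compose resolves ${model} at launch
```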