AgentQnA - add support for remote server #1900
base: main
Conversation
Signed-off-by: alexsin368 <[email protected]>
Dependency Review: ✅ No vulnerabilities or license issues found. Scanned files: none.
for more information, see https://pre-commit.ci
Could you just create a yaml file that overrides the original compose.yaml instead of creating a new one? A separate file increases maintenance effort, and you probably only change a couple of values for vllm. Please check compose.telemetry.yaml as an example.
If every new feature creates its own compose.yaml, we will soon be overwhelmed with compose.yaml files.
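For illustration, a minimal sketch of the override pattern being suggested, following the compose.telemetry.yaml example: the base compose.yaml stays untouched and a small overlay file only redefines the remote-inference values. The file and variable names below are taken from this PR but the layering itself is an assumption, not the final implementation.

```bash
# Layer a small remote-inference overlay on top of the base file,
# the same way compose.telemetry.yaml is applied as an override.
export model=<name-of-model-card>
export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>

docker compose -f compose.yaml -f compose_remote.yaml up -d
```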
To run on Xeon with models deployed on a remote server, run with `compose_remote.yaml` instead. Additional environment variables also need to be set.

```bash
export model=<name-of-model-card>
```
What is the difference between this model variable and MODEL_ID in set_env.sh? Why do users need to set the model name in two different places?
For the agent, `model` is the env variable used; there is no LLM_MODEL_ID. EMBEDDING_MODEL_ID and RERANK_MODEL_ID are used by the retriever.
We need to set `model` here to overwrite the original value (`gpt-4o-mini-2024-07-18`) set in set_env.sh. Added a note on this.
```bash
export model=<name-of-model-card>
export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
```
you probably need to give an example to explain how it works for Denvr.
I added some notes
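To sketch the kind of Denvr example being asked for (this is an illustration, not the exact notes added in the PR; the URL, model card, and API-key variable below are placeholders):

```bash
# Point the agent at a remotely hosted inference endpoint, e.g. Enterprise Inference on Denvr.
export LLM_ENDPOINT_URL=https://<your-remote-inference-host>   # placeholder endpoint issued by the provider
export model=<name-of-model-card-served-by-the-endpoint>       # must match a model card the endpoint serves
export OPENAI_API_KEY=<your-api-key>                           # assumed auth variable; check set_env.sh for the real name

docker compose -f compose_remote.yaml up -d
```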
Signed-off-by: alexsin368 <[email protected]>
addressed comments
…amples into agentqna-iaas Signed-off-by: alexsin368 <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: alexsin368 <[email protected]>
…amples into agentqna-iaas
The hyperlink check is failing on https://platform.openai.com/api-keys. I checked that it is a valid link; it just requires the user to log in to their OpenAI account. Can we bypass this error?
Pull Request Overview
A pull request to add support for remote server inference, including configuration updates and documentation enhancements.
- Introduces a new docker-compose file with environment variables for remote inference.
- Updates the README with corrected typos and detailed instructions for setting up both OpenAI and remote server inference.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| AgentQnA/docker_compose/intel/cpu/xeon/compose_remote.yaml | New configuration file for setting environment variables for remote inference. |
| AgentQnA/README.md | Updated documentation with instructions for configuring remote inference and fixes. |
Comments suppressed due to low confidence (1)
AgentQnA/README.md:225
- [nitpick] The variable name 'model' is generic and may lead to confusion; consider renaming it to something more descriptive, such as MODEL_ID, to clearly indicate its purpose.
export model=<name-of-model-card>
Hi @vikramts @NeoZhangJianyu, the Hyperlink check CI reports an error here. Could you please help resolve it?
@alexsin368 - If the https://platform.openai.com/api-keys URL is valid, then I don't see why we cannot bypass the error. Technically, this is not an error, and we can expect the user to understand that they need to log into their OpenAI account. So I do not see a real problem here.
Description
Add support for using a remote server inference endpoint. Supports Enterprise Inference.
Clean up the README: fix typos and add instructions for setting up the OpenAI endpoint on the WebUI server.
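As a rough illustration of the OpenAI setup the README now covers (assuming the key is passed through an OPENAI_API_KEY environment variable; the exact variable name should be taken from set_env.sh and the README):

```bash
# Generate a key at https://platform.openai.com/api-keys (requires an OpenAI account),
# then export it before bringing the stack up so the WebUI server can reach the OpenAI endpoint.
export OPENAI_API_KEY=<your-openai-api-key>
```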
Related dependent PR: opea-project/GenAIComps#1644
Issues
N/A
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
None
Tests
Tested AgentQnA on the UI and verified that chat completions are working.
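For anyone reproducing the check, a hedged sketch of a chat-completion request against the deployed stack; the port and route are placeholders and should be read from the compose file actually used:

```bash
# Ask the agent a simple question and confirm a completion is returned.
curl -s http://localhost:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is OPEA?"}]}'
```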