-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Migrate AzureOpenAI constructors to standard OpenAI client #2752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: pamelafox <[email protected]>
Co-authored-by: pamelafox <[email protected]>
| azure_deployment=self.open_ai_deployment, | ||
| api_version=self.open_ai_api_version, | ||
| # For Azure OpenAI, we need to include the deployment in the URL | ||
| base_url = f"{self.open_ai_endpoint}/openai/deployments/{self.open_ai_deployment}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot This should be {self.open_ai_endpoint}/openai/v1 now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the Azure OpenAI endpoint URL format to use /openai/v1 and updated the model parameter to use the deployment name for Azure OpenAI API calls. Commit: 76e5cb6
app/backend/prepdocs.py
Outdated
| logger.info("AZURE_OPENAI_API_KEY_OVERRIDE found, using as api_key for Azure OpenAI client") | ||
| openai_client = AsyncAzureOpenAI( | ||
| api_version=azure_openai_api_version, azure_endpoint=endpoint, api_key=azure_openai_api_key | ||
| openai_client = AsyncOpenAI( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot The endpoint must end with "/openai/v1" now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed all Azure OpenAI endpoints in prepdocs.py to use the correct /openai/v1 format. Commit: 5d23055
| ) | ||
|
|
||
| # https://learn.microsoft.com/azure/ai-services/openai/api-version-deprecation#latest-ga-api-release | ||
| azure_openai_api_version = os.getenv("AZURE_OPENAI_API_VERSION") or "2024-06-01" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Isn't AZURE_OPENAI_API_VERSION in other files as well, like the GitHub actions workflows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right! Found and removed AZURE_OPENAI_API_VERSION references from GitHub Actions workflows (.github/workflows/azure-dev.yml, .github/workflows/evaluate.yaml), Azure DevOps pipeline (.azdo/pipelines/azure-dev.yml), and documentation (docs/reasoning.md). The only remaining reference is in evals/generate_ground_truth.py which uses LangChain wrappers, not the OpenAI SDK directly. Commit: 5d23055
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments
Co-authored-by: pamelafox <[email protected]>
Co-authored-by: pamelafox <[email protected]>
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR migrates the embeddings implementation from separate AzureOpenAIEmbeddingService and OpenAIEmbeddingService classes to a unified OpenAIEmbeddings class, simplifying the architecture by accepting a pre-configured OpenAI client. The change removes the need for API version configuration and consolidates authentication logic.
Key Changes:
- Unified embeddings service using a single
OpenAIEmbeddingsclass - Removed
AZURE_OPENAI_API_VERSIONconfiguration parameter across all infrastructure and application code - Updated OpenAI SDK from version 1.99.8 to 2.6.1
- Refactored test mocks to work with the new unified embeddings approach
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
app/backend/prepdocslib/embeddings.py |
Removed separate Azure/OpenAI service classes; unified into OpenAIEmbeddings with client injection |
app/backend/prepdocs.py |
Updated setup_embeddings_service to accept pre-configured client instead of creating clients internally |
app/backend/app.py |
Removed API version parameter; updated client setup and endpoint construction |
app/backend/requirements.in |
Updated OpenAI SDK minimum version requirement to 1.109.1 |
app/backend/requirements.txt |
Updated OpenAI SDK to version 2.6.1 |
tests/test_*.py |
Refactored tests to use unified embeddings class and simplified mocking |
infra/main.bicep |
Removed AZURE_OPENAI_API_VERSION parameter and environment variable |
docs/reasoning.md |
Removed API version configuration instructions |
Comments suppressed due to low confidence (1)
tests/test_prepdocs.py:33
- The
MockEmbeddingsClientandMockClientclasses are duplicated fromtests/mocks.py(lines 333-343). This duplication creates a maintenance burden. Consider importing these classes fromtests.mocksinstead of redefining them, similar to howMOCK_EMBEDDING_DIMENSIONSandMOCK_EMBEDDING_MODEL_NAMEare imported.
class MockEmbeddingsClient:
def __init__(self, create_embedding_response: openai.types.CreateEmbeddingResponse):
self.create_embedding_response = create_embedding_response
async def create(self, *args, **kwargs) -> openai.types.CreateEmbeddingResponse:
return self.create_embedding_response
class MockClient:
def __init__(self, embeddings_client):
self.embeddings = embeddings_client
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
| dimensions: int | ||
|
|
||
|
|
||
| class OpenAIEmbeddings(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simplified this file to reuse the openai_client that we set up in another part of the code, as I didn't like that we were doing the same setup in two places.
|
|
||
|
|
||
| class IntegratedVectorizerStrategy(Strategy): | ||
| class IntegratedVectorizerStrategy(Strategy): # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This currently has no tests, and it doesn't feel necessary to add tests given the plans to move to prepdocs-as-a-function.
app/backend/app.py
Outdated
| AZURE_OPENAI_CUSTOM_URL = os.getenv("AZURE_OPENAI_CUSTOM_URL") | ||
| # https://learn.microsoft.com/azure/ai-services/openai/api-version-deprecation#latest-ga-api-release | ||
| AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION") or "2024-10-21" | ||
| AZURE_OPENAI_ENDPOINT = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Remove this line that sets AZURE_OPENAI_ENDPOINT, and pass azure_openai_endpoint from setup_openai_client to setup_embeddings instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in commit ee14c51 - removed the redundant AZURE_OPENAI_ENDPOINT computation and now pass the azure_openai_endpoint value returned from setup_openai_client directly to setup_embeddings_service.
Co-authored-by: Copilot <[email protected]>
| openai_api_key=clean_key_if_exists(os.getenv("OPENAI_API_KEY")), | ||
| openai_organization=os.getenv("OPENAI_ORGANIZATION"), | ||
| ) | ||
| openai_embeddings_service = setup_embeddings_service( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved after, since its now dependent on the client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, and it's simpler at the end of the day
| azure-identity | ||
| quart | ||
| quart-cors | ||
| flask==3.1.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quart should bring in flask
…putation, and revert unrelated type annotation changes Co-authored-by: pamelafox <[email protected]>
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
Check Broken URLsWe have automatically detected the following broken URLs in your files. Review and fix the paths to resolve this issue. Check the file paths and associated broken URLs inside them.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, this is simpler now that we don't have separate AOAI and OAI clients
The migration modernizes the codebase to use the current OpenAI SDK API surface, removing deprecated Azure-specific constructors while maintaining full compatibility with Azure OpenAI services (passing in token_provider as api_key). Requires recent version of openai SDK.
Fixes #2751.
Steps
/openai/v1formatKey Changes Made:
AsyncAzureOpenAI()→AsyncOpenAI()azure_endpoint→base_url,azure_ad_token_provider→api_keyapi_version=arguments and related environment variables/openai/v1formatValidation Results:
✅ All Python files compile without syntax errors
✅ JSON configuration files are valid
✅ Requirements.in updated to openai>=1.108.1
✅ No remaining AZURE_OPENAI_API_VERSION references in core application code
✅ All AsyncAzureOpenAI constructor calls replaced with AsyncOpenAI
✅ Azure OpenAI endpoint URLs use correct
/openai/v1format✅ Workflow files and documentation updated
✅ PR review feedback addressed