-
Notifications
You must be signed in to change notification settings - Fork 71
feat(mcp): Add AIRBYTE_CLOUD_CONFIG_API_URL environment variable #920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(mcp): Add AIRBYTE_CLOUD_CONFIG_API_URL environment variable #920
Conversation
Add support for overriding the Config API URL via the AIRBYTE_CLOUD_CONFIG_API_URL environment variable. This is needed because the Config API URL cannot always be derived from the public API URL when AIRBYTE_CLOUD_API_URL is overridden. Changes: - Add CLOUD_CONFIG_API_ROOT_ENV_VAR constant in constants.py - Add resolve_cloud_config_api_url() function in cloud/auth.py - Update get_config_api_root() in api_util.py to check env var first Co-Authored-By: AJ Steers <[email protected]>
Original prompt from AJ Steers |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1766002911-add-config-api-root-env-var' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1766002911-add-config-api-root-env-var'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
📝 WalkthroughWalkthroughAdds CLOUD_CONFIG API constants and env-var resolver, updates get_config_api_root to prefer the env override or derive from CLOUD_API_ROOT (else raise NotImplementedError), expands SDKError wrapping to include request_url/status_code context across many API calls, and adds two integration tests for URL handling. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Repository UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)
61-91: Well-structured resolution logic!The updated
get_config_api_rootfunction has clear precedence rules that are well-documented in the docstring. The error message on lines 87-91 is particularly helpful - it tells the user exactly which environment variable to set when the Config API URL can't be derived.One small thought: would it be worth logging a debug message when using the environment variable override? This could help with troubleshooting in production scenarios, wdyt?
# First, check if the Config API URL is explicitly set via environment variable config_api_override = resolve_cloud_config_api_url() if config_api_override: + logger.debug("Using Config API URL from environment variable: %s", config_api_override) return config_api_overrideThis is totally optional - the current implementation is solid as-is.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte/_util/api_util.py(2 hunks)airbyte/cloud/auth.py(1 hunks)airbyte/constants.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte/_util/api_util.py (1)
airbyte/cloud/auth.py (1)
resolve_cloud_config_api_url(69-84)
airbyte/cloud/auth.py (1)
airbyte/secrets/util.py (1)
try_get_secret(33-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte/constants.py (1)
222-230: LGTM!The new constant and its documentation are well-structured. The naming follows the existing pattern (e.g.,
CLOUD_API_ROOT_ENV_VAR), and the docstring clearly explains when and why this override would be needed. Nice work!airbyte/cloud/auth.py (1)
69-84: LGTM!The new resolver function follows the established pattern nicely. I appreciate that it returns
Nonewhen not set, leaving the fallback logic to the caller (get_config_api_root). This keeps the responsibility clear and consistent with howresolve_cloud_bearer_tokenhandles optional values.airbyte/_util/api_util.py (1)
24-25: LGTM!Clean import additions that bring in the new resolver and constant. These are properly placed with related cloud constants.
… 404 errors Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)
61-91: Nice implementation! Clear resolution order with helpful error messaging.The logic correctly prioritizes the environment override, falls back to defaults, and provides actionable guidance when neither applies. The docstring is comprehensive and explains the behavior well.
One optional thought: should we consider validating that the override URL is well-formed before returning it? This could catch configuration issues earlier, though the API calls would fail anyway if the URL is invalid. Wdyt?
Example validation (optional):
# First, check if the Config API URL is explicitly set via environment variable config_api_override = resolve_cloud_config_api_url() if config_api_override: + # Optional: validate URL format + if not config_api_override.startswith(("http://", "https://")): + raise PyAirbyteInputError( + message=f"Invalid Config API URL format: {config_api_override}", + guidance=f"The '{CLOUD_CONFIG_API_ROOT_ENV_VAR}' environment variable must be a valid HTTP(S) URL." + ) return config_api_override
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/_util/api_util.py(8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (1)
airbyte/cloud/auth.py (1)
resolve_cloud_config_api_url(69-84)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/_util/api_util.py (3)
24-25: LGTM! Clean imports for the new functionality.The imports are well-organized and bring in exactly what's needed for the environment-driven Config API URL override.
196-203: Excellent observability improvements!Consistently enriching error contexts with
api_rootandstatus_codeacross multiple API wrappers will make debugging much easier, especially when custom API roots are in play. The additions are non-breaking and follow a consistent pattern throughout the file.Also applies to: 449-453, 551-553, 584-587, 660-663, 907-910
78-81: Integration verified and working correctly.The
resolve_cloud_config_api_url()integration is solid. The function properly wraps environment variable resolution viatry_get_secret()withCLOUD_CONFIG_API_ROOT_ENV_VAR, and the usage pattern in lines 78-81 correctly checks for an override before falling back to the constant. The constants are well-defined, imports are clean, and the fallback logic handles differentapi_rootscenarios appropriately. This follows good error handling and code organization patterns—nice work maintaining backwards compatibility while adding the override capability!
…of api_root Co-Authored-By: AJ Steers <[email protected]>
Co-Authored-By: AJ Steers <[email protected]>
…config-api-root-env-var
|
/prerelease
|
Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)
87-91: Consider using a custom exception for better telemetry hygiene.The error message includes
api_rootdirectly in the f-string, which could potentially expose sensitive information in logs or telemetry. Based on the learnings about decoupling dynamic values from error messages, would it make sense to raise aPyAirbyteErrorsubclass withapi_rootin the context dict instead? This would align with the project's structlog-inspired design. WDYT?Based on learnings, error messages should avoid including dynamic values (which may contain PII) directly in message strings.
Here's a possible approach:
- raise NotImplementedError( - f"Configuration API root not implemented for api_root='{api_root}'. " - f"Set the '{CLOUD_CONFIG_API_ROOT_ENV_VAR}' environment variable " - "to specify the Config API URL." - ) + raise PyAirbyteInputError( + message=( + "Configuration API root cannot be automatically derived. " + f"Set the '{CLOUD_CONFIG_API_ROOT_ENV_VAR}' environment variable " + "to specify the Config API URL." + ), + context={"api_root": api_root}, + )
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/_util/api_util.py(27 hunks)tests/integration_tests/cloud/test_cloud_api_util.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-08T15:34:31.026Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Applied to files:
airbyte/_util/api_util.py
🧬 Code graph analysis (2)
tests/integration_tests/cloud/test_cloud_api_util.py (2)
airbyte/_util/api_util.py (1)
list_sources(315-363)airbyte/exceptions.py (1)
AirbyteError(432-447)
airbyte/_util/api_util.py (2)
airbyte/secrets/util.py (1)
try_get_secret(33-60)airbyte/exceptions.py (1)
AirbyteError(432-447)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (4)
tests/integration_tests/cloud/test_cloud_api_util.py (2)
307-360: Consider the reliability of external HTTP calls in integration tests.The test makes actual network calls to
httpbin.org. While this validates real URL construction behavior, it could introduce flakiness if httpbin.org is unavailable or slow. Have you considered whether this test should remain as an integration test, or if key aspects could be tested with mocked responses to improve reliability? WDYT?
362-390: Verify the assumption about httpbin response behavior.This test assumes httpbin.org will return an empty list (or at least something that evaluates to
[]afterlist_sourcesprocessing). While httpbin's/anythingendpoint typically echoes request data, the SDK's response parsing might handle unexpected formats differently. Have you verified this test passes reliably, and could the assertion message be updated to reflect that we're actually testing URL construction rather than expecting "no real sources"? WDYT?airbyte/_util/api_util.py (2)
24-24: LGTM on the new imports!The addition of
CLOUD_CONFIG_API_ROOT,CLOUD_CONFIG_API_ROOT_ENV_VARconstants and thetry_get_secrethelper are essential for the new environment variable override feature. Clean integration!Also applies to: 33-33
200-202: Excellent systematic error context enhancement!The consistent addition of
request_urlandstatus_codeto error contexts across all API wrapper functions is a great improvement for debugging. This will make it much easier to diagnose issues related to custom API roots and URL construction. The pattern is applied consistently and correctly throughout the file.Also applies to: 250-252, 300-302, 356-358, 407-409, 454-456, 495-496, 557-558, 591-593, 636-638, 671-673, 749-750, 804-805, 872-874, 925-927, 1004-1005, 1059-1060, 1124-1125, 1257-1258, 1321-1322, 1569-1570, 1609-1610, 1654-1655, 1867-1868
This change catches SDKError exceptions from the Speakeasy SDK and wraps them with additional context including the actual request URL that was attempted. This makes debugging API issues (like 404 errors) much easier by showing exactly which URL was requested. Functions updated with SDKError wrapping: - get_workspace - list_connections - list_workspaces - list_sources - list_destinations - get_connection The error context now includes: - request_url: The actual URL that was requested - request_method: The HTTP method used - status_code: The HTTP status code - response_content_type: The content type of the response - api_root: The configured API root - workspace_id/connection_id: Resource identifiers Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)
62-96: Nice error context extraction! One minor consideration about PII in error messages.The helper functions look solid and will definitely improve debugging. The pattern of extracting request context from SDKError is clean and thorough.
One thing to consider: on line 93, the error message includes
error.messagedirectly in the message string:message=f"API error occurred: {error.message}",Based on learnings from the codebase, error messages are designed to decouple message text from dynamic values to avoid PII in telemetry. Since
error.messagefrom the SDK could potentially contain URLs, IDs, or other sensitive data, and it's already captured in the context as"error_message", you might want to consider using a static message instead. Something like:message="API error occurred",wdyt? The dynamic details would still be available via the context dict for debugging.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/_util/api_util.py(27 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-08T15:34:31.026Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Applied to files:
airbyte/_util/api_util.py
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (3)
airbyte/secrets/util.py (1)
try_get_secret(33-60)airbyte/exceptions.py (2)
AirbyteError(432-447)AirbyteMissingResourceError(505-509)airbyte/cloud/workspaces.py (4)
list_connections(635-661)list_sources(663-689)list_destinations(691-717)get_connection(315-327)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/_util/api_util.py (3)
98-128: Excellent implementation of environment variable override!The resolution logic is clean and follows the documented precedence order perfectly. The updated docstring clearly explains the behavior, and the error message provides actionable guidance when the Config API root can't be determined. Using
try_get_secretensures graceful handling when the environment variable isn't set.
225-233: SDKError wrapping looks good! Quick question about coverage.The SDKError wrapping pattern is clean and consistent across the functions where it's applied. The base_context approach provides excellent debugging information.
I noticed that the try/except wrapping was added to some functions (get_workspace, list_connections, list_workspaces, list_sources, list_destinations, get_connection) but not others (run_connection, get_job_logs, get_job_info, create_source, etc.), even though those also call the Speakeasy SDK. Is this intentional - perhaps prioritizing the most commonly used functions first, or planning to expand coverage in a future PR? Just want to make sure the partial coverage is by design. wdyt?
Also applies to: 279-288, 505-517
242-244: Great error context enrichment throughout!Adding
request_urlandstatus_codeto error contexts will significantly improve debugging, especially for API-root-related failures. The pattern is consistent and the context is being captured at the right points - after the API call completes but when status checks fail.The access to
response.raw_response.urllooks safe since it's only accessed after getting past the SDKError handler, meaning we have a valid response object.Also applies to: 297-299, 415-417, 527-529
…ess failures Co-Authored-By: AJ Steers <[email protected]>
|
/prerelease
|
Summary
Adds support for overriding the Config API URL via the
AIRBYTE_CLOUD_CONFIG_API_URLenvironment variable. This is needed because the Config API URL (https://cloud.airbyte.com/api/v1) cannot always be derived from the public API URL whenAIRBYTE_CLOUD_API_URLis overridden to point to a different endpoint.Changes:
CLOUD_CONFIG_API_ROOT_ENV_VARconstant inconstants.pyresolve_cloud_config_api_url()function incloud/auth.pyfollowing the existing resolver patternget_config_api_root()inapi_util.pyto check the env var first, with improved error messagingResolution order for Config API URL:
AIRBYTE_CLOUD_CONFIG_API_URLenvironment variable (if set)NotImplementedErrorwith guidance on which env var to setReview & Testing Checklist for Human
AIRBYTE_CLOUD_CONFIG_API_URLand verifying it's used for Config API operations (e.g., connector builder projects)resolve_cloud_config_api_url()and the updatedget_config_api_root()logicSuggested test plan:
AIRBYTE_CLOUD_API_URLto a custom valueAIRBYTE_CLOUD_CONFIG_API_URL)AIRBYTE_CLOUD_CONFIG_API_URLand verify the operation succeedsNotes
Summary by CodeRabbit
New Features
Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.