support pyarrow AzureFileSystem parameters (client_id, client_secret, tenant_id)
#2301
Pull Request Overview
This PR adds support for additional Azure authentication parameters in PyArrow's AzureFileSystem integration. It extends the existing Azure file system initialization to include client ID, client secret, and tenant ID parameters for service principal authentication.
- Adds three new Azure configuration constants for client credentials authentication
- Updates Azure file system initialization to pass client authentication parameters to PyArrow
- Includes reference documentation link for the PyArrow AzureFileSystem API
This reverts commit 7c89a9b.

    if client_id := self.properties.get(ADLS_CLIENT_ID):
        client_kwargs["client_id"] = client_id
    if client_secret := self.properties.get(ADLS_CLIENT_SECRET):
        client_kwargs["client_secret"] = client_secret
    if tenant_id := self.properties.get(ADLS_TENANT_ID):
        client_kwargs["tenant_id"] = tenant_id
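
For readers who want to try the pattern outside the codebase, here is a minimal standalone sketch of the same conditional forwarding. The helper name `build_client_kwargs` is hypothetical, and the property key strings are assumptions based on the Azure Data Lake configuration page, not copied from this diff. The sketch deliberately does not import PyArrow; it only shows which keyword arguments would be assembled.

```python
from typing import Dict

# Assumed property keys (taken from the Azure Data Lake configuration page;
# the constants in the actual codebase may be defined differently).
ADLS_CLIENT_ID = "adls.client-id"
ADLS_CLIENT_SECRET = "adls.client-secret"
ADLS_TENANT_ID = "adls.tenant-id"


def build_client_kwargs(properties: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: forward only the credentials that are actually set."""
    client_kwargs: Dict[str, str] = {}
    # The walrus operator binds the looked-up value and tests it in one step,
    # so missing or empty properties are skipped rather than passed as None.
    if client_id := properties.get(ADLS_CLIENT_ID):
        client_kwargs["client_id"] = client_id
    if client_secret := properties.get(ADLS_CLIENT_SECRET):
        client_kwargs["client_secret"] = client_secret
    if tenant_id := properties.get(ADLS_TENANT_ID):
        client_kwargs["tenant_id"] = tenant_id
    return client_kwargs


# Example with placeholder values; a real setup would read these from catalog config.
full = build_client_kwargs(
    {
        "adls.client-id": "my-client-id",
        "adls.client-secret": "my-secret",
        "adls.tenant-id": "my-tenant",
    }
)
partial = build_client_kwargs({"adls.tenant-id": "my-tenant"})
```

The resulting dictionary would then be unpacked into the filesystem constructor alongside the other arguments it requires. Skipping unset keys leaves PyArrow's own defaults in place for any credential the user did not configure.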
makes sense, added!
Nice one @kevinjqliu, thanks for adding this 👍
# Rationale for this change
Similar to apache#2299.
This PR adds the remaining parameters (`client_id`, `client_secret`, `tenant_id`) to
[`pyarrow.fs.AzureFileSystem`](https://arrow.apache.org/docs/python/generated/pyarrow.fs.AzureFileSystem.html).
Note that the [Azure Data Lake configuration
page](https://github.com/apache/iceberg-python/blob/main/mkdocs/docs/configuration.md#azure-data-lake)
already documents these three parameters.
# Are these changes tested?
# Are there any user-facing changes?
