@robtandy robtandy commented Mar 7, 2025

Basic Object store support

Changes:

  • Updated register_parquet and register_listing_table, and added register_csv, so that each automatically registers an object store based on the URL of the path provided. Local paths, relative or absolute, are interpreted as file:// URLs, using the parsing functionality of ListingTableUrl from datafusion.
  • Added example/http_csv.py, a port of https://github.com/apache/datafusion/blob/45.0.0/datafusion-examples/examples/query-http-csv.rs
  • Updated the example in Readme.md to be a working example using a GitHub-hosted CSV file.
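
To illustrate the path handling described in the first bullet, here is a minimal Python sketch of the idea. It is not the code in this PR, which delegates to ListingTableUrl on the Rust side; the helper names `to_table_url` and `store_key` are hypothetical, and using the scheme plus authority as the key for an object store is an assumption made for illustration.

```python
from pathlib import Path
from urllib.parse import urlsplit


def to_table_url(path: str) -> str:
    """Return `path` unchanged if it already looks like a URL, else a file:// URL.

    Hypothetical helper: a rough stand-in for the parsing the PR delegates
    to ListingTableUrl.
    """
    if "://" in path:
        return path
    # Relative paths are resolved against the current working directory.
    return Path(path).resolve().as_uri()


def store_key(url: str) -> str:
    """Scheme plus authority, e.g. 's3://bucket' (assumed registration key)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


if __name__ == "__main__":
    for p in ("data/table.parquet", "/tmp/table.csv", "s3://bucket/table.parquet"):
        url = to_table_url(p)
        print(p, "->", url, "| store:", store_key(url))
```

Under these assumptions, every local path maps to the same `file://` store, while each remote bucket or host gets its own entry.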

Object store credentials

In this PR, all object stores are registered automatically, which is convenient. The trade-off is that we rely on the cloud provider SDKs to gather credentials from the environment or from local files, rather than accepting them explicitly.

In a future release we can add more configurability here.
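
Since credentials are discovered rather than passed in, a quick pre-flight check can help when a registration fails. The sketch below is only an illustration: the PR does not say which variables are consulted, so the names in `ASSUMED_S3_VARS` are an assumption based on the variables AWS tooling commonly reads, and `missing_vars` is a hypothetical helper.

```python
import os

# Assumption: variable names commonly read by AWS tooling; the PR does not
# list which ones the registered store actually uses.
ASSUMED_S3_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_vars(env=None, names=ASSUMED_S3_VARS):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Not set in the environment:", ", ".join(missing))
    else:
        print("All assumed variables are set.")
```

A missing variable is not necessarily an error, because credentials may also come from local files as noted above.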

Looks like a few formatting changes snuck in here from editing. I would like to defer sorting out those details until we have commit hooks to ensure proper and consistent file formatting.


@andygrove andygrove left a comment

LGTM. Thanks @robtandy.

@andygrove andygrove merged commit 42681a1 into apache:main Mar 9, 2025
20 checks passed