Make ListingTable obey collect_statistics config #16080

Open
wants to merge 1 commit into
base: main

adriangb
Contributor

While working on #16014, I think I found that ListingTable collects Parquet statistics by default, even though the collect_statistics config option defaults to false.

I believe no one has noticed this because:

  1. It's hard to detect any side effects.
  2. A lot of companies using DataFusion don't use ListingTable.
  3. In the tests where it matters it gets set explicitly (but the default behavior is not tested).
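
A minimal sketch of the kind of mismatch described above, assuming a session-level flag that defaults to false and a per-table options struct that carries its own copy of the flag. `SessionSettings`, `TableOptions`, and both constructors are hypothetical stand-ins for illustration, not DataFusion's actual types or the change made in this PR:

```rust
// Hypothetical stand-ins for illustration only; these are NOT DataFusion's real types.

/// Session-level setting, modelled on the `collect_statistics` option.
#[derive(Debug, Clone, Copy)]
pub struct SessionSettings {
    pub collect_statistics: bool,
}

impl Default for SessionSettings {
    fn default() -> Self {
        // Assumption taken from the description above: the session default is off.
        Self { collect_statistics: false }
    }
}

/// Per-table options that carry their own copy of the flag.
#[derive(Debug, Clone, Copy)]
pub struct TableOptions {
    pub collect_stat: bool,
}

impl TableOptions {
    /// The mismatch: the flag is hard-coded to `true` and the session is never consulted.
    pub fn with_hardcoded_default() -> Self {
        Self { collect_stat: true }
    }

    /// One possible fix: seed the flag from the session settings instead.
    pub fn from_session(session: &SessionSettings) -> Self {
        Self { collect_stat: session.collect_statistics }
    }
}
```

With a default `SessionSettings`, `TableOptions::with_hardcoded_default()` still turns statistics collection on, while `TableOptions::from_session(&session)` follows the session value. An explicit per-table override would still be possible by setting the field after construction.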

@github-actions github-actions bot added the core Core DataFusion crate label May 18, 2025
@adriangb adriangb changed the title Don't collect statistics by default Make ListingTable obey collect_statistics config May 18, 2025
@adriangb
Contributor Author

adriangb commented May 18, 2025

We could also flip the default of ListingTableOptions, which IMO is reasonable (it should match the default in SessionConfig). #1347 was a long time ago, and it seems like we decided to go in the other direction but never committed?

@adriangb
Contributor Author

cc @alamb, am I missing something here?

@adriangb
Contributor Author

Looks like the config option was added in #3846, and it has just never agreed with ListingTableOptions.

@adriangb
Contributor Author

cc @Dandandan since you added the config option originally in #3846
