Integrate hf datasets as a data source #11915

Description

@ParagEkbote

Is your feature request related to a problem? Please describe.

The Hugging Face Hub recently surpassed 1M datasets. Many users store and share datasets on the Hub and use the Hugging Face `datasets` library as a standard interface for accessing them. However, great_expectations does not currently support Hugging Face datasets as a native datasource.

Describe the solution you'd like
Add support for Hugging Face datasets as a datasource in great_expectations. This would allow users to load datasets directly from the Hugging Face Hub and apply expectations to their data.
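
For context, here is a minimal sketch of the manual route that seems to be available today: load a split with `datasets`, convert it to pandas, and validate it through the pandas datasource. It assumes the great_expectations 1.x fluent API and an installed `datasets` package. The helper names (`hf_split_to_dataframe`, `validate_not_null`), the datasource/asset names, and the `imdb` dataset and `text` column in the usage comment are illustrative choices only, not part of any existing integration.

```python
def hf_split_to_dataframe(dataset):
    """Convert a single Hugging Face split (datasets.Dataset) to a pandas DataFrame.

    Hypothetical helper. A DatasetDict (several splits) has no to_pandas(),
    so the caller must pick one split first, e.g. load_dataset(..., split="train").
    """
    if not hasattr(dataset, "to_pandas"):
        raise TypeError("expected a single split (datasets.Dataset), not a DatasetDict")
    return dataset.to_pandas()


def validate_not_null(df, column):
    """Check that `column` has no nulls, using the GX pandas datasource.

    Assumes great_expectations 1.x; imported lazily so that the conversion
    helper above can be used without GX installed.
    """
    import great_expectations as gx

    context = gx.get_context()
    # The names below are arbitrary labels chosen for this sketch.
    data_source = context.data_sources.add_pandas("hf_pandas")
    asset = data_source.add_dataframe_asset(name="hf_split")
    batch_definition = asset.add_batch_definition_whole_dataframe("whole_split")
    batch = batch_definition.get_batch(batch_parameters={"dataframe": df})

    expectation = gx.expectations.ExpectColumnValuesToNotBeNull(column=column)
    return batch.validate(expectation).success


# Example usage (needs network access plus the `datasets` and
# `great_expectations` packages; dataset and column are illustrative):
#
#   from datasets import load_dataset
#   split = load_dataset("imdb", split="train")
#   df = hf_split_to_dataframe(split)
#   print(validate_not_null(df, "text"))
```

A native datasource would remove the manual conversion step and could, in principle, handle splits and streaming datasets directly. How that should look is left open here.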

Describe alternatives you've considered
N.A.

Additional context
I would be interested in contributing this integration, having contributed a similar integration to dagster.
