This repository was archived by the owner on Jul 25, 2022. It is now read-only.

Brainstorming how to improve the docs #55

@MrPowers

I'm trying to get started with DataFusion and would like to run some basic operations to try out the library. I'd like to read a CSV file into a DataFrame and run some queries.

I took a look at the docs and tried to run datafusion.ExecutionContext(), but got a "module 'datafusion' has no attribute 'ExecutionContext'" error. The project README shows that the current name is datafusion.SessionContext(), so the docs appear to be out of date.

From the Rust docs, I worked out that the Python syntax for reading a CSV is something like this:

ctx.register_csv("something", "../tmp/N_1e7_K_1e2_single.csv")
ctx.sql("SELECT v1 FROM something LIMIT 5").show()
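
For anyone who wants to try the same thing without the benchmark file above, here is a minimal sketch of the two calls as a small script. It assumes a datafusion release that exposes SessionContext. The sample file, its column names (id1, v1), and the helper names write_sample_csv and query_csv are made up for illustration and are not part of the library.

```python
import csv
from pathlib import Path


def write_sample_csv(directory: str) -> Path:
    """Write a tiny CSV so the example does not need the file from the post."""
    path = Path(directory) / "sample.csv"
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id1", "v1"])  # hypothetical column names
        writer.writerows([["a", 1], ["b", 2], ["c", 3]])
    return path


def query_csv(path: Path, limit: int = 5):
    """Register the CSV as a table, print the result, and return it."""
    # Imported here so the CSV helper above works without datafusion installed.
    from datafusion import SessionContext

    ctx = SessionContext()
    # Expose the file to SQL under the table name "something".
    ctx.register_csv("something", str(path))
    df = ctx.sql(f"SELECT v1 FROM something LIMIT {limit}")
    df.show()  # prints the rows to stdout
    return df.collect()  # the same rows as Arrow record batches
```

To run it, create a temporary directory with tempfile.TemporaryDirectory() and call query_csv(write_sample_csv(tmp)). The query is built lazily by sql(); show() and collect() are what actually execute it.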

Are you OK if I send a PR to add some more detailed usage instructions to this project README? Even basic stuff like documenting show() would help (I just guessed that would work, haha).

Once the README is updated, hopefully we can sync the latest version with the arrow.apache.org docs.

Thanks for making this cool library. I am excited to play around with it!
