Commit a446cdb

docs: consolidate datafusion-cli docs (#6218)
1 parent 06e9f53 commit a446cdb

2 files changed, 6 additions and 92 deletions

datafusion-cli/README.md

Lines changed: 5 additions & 91 deletions
@@ -17,98 +17,12 @@
 under the License.
 -->

-# DataFusion Command-line Interface
-
-[DataFusion](https://github.com/apache/arrow-datafusion) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
-
-The DataFusion CLI allows SQL queries to be executed by an in-process DataFusion context.
-
-```ignore
-USAGE:
-datafusion-cli [OPTIONS]
-
-OPTIONS:
--c, --batch-size <BATCH_SIZE> The batch size of each query, or use DataFusion default
--f, --file <FILE>... Execute commands from file(s), then exit
---format <FORMAT> [default: table] [possible values: csv, tsv, table, json,
-nd-json]
--h, --help Print help information
--p, --data-path <DATA_PATH> Path to your data, default to current directory
--q, --quiet Reduce printing other than the results and work quietly
--r, --rc <RC>... Run the provided files on startup instead of ~/.datafusionrc
--V, --version Print version information
-
-```
-
-## Example
-
-Create a CSV file to query.
-
-```bash,ignore
-$ echo "1,2" > data.csv
-```
-
-```sql,ignore
-$ datafusion-cli
-
-DataFusion CLI v12.0.0
-
-> CREATE EXTERNAL TABLE foo (a INT, b INT) STORED AS CSV LOCATION 'data.csv';
-0 rows in set. Query took 0.001 seconds.
+<!-- Note this file is included in the crates.io page as well https://crates.io/crates/datafusion-cli -->

-> SELECT * FROM foo;
-+---+---+
-| a | b |
-+---+---+
-| 1 | 2 |
-+---+---+
-1 row in set. Query took 0.017 seconds.
-```
-
-## Querying S3 Data Sources
-
-The CLI can query data in S3 if the following environment variables are defined:
-
-- `AWS_REGION`
-- `AWS_ACCESS_KEY_ID`
-- `AWS_SECRET_ACCESS_KEY`
-
-Alternatively, you can supply a profile name via the `AWS_PROFILE` environment variable. When using a [named profile](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html), the CLI obtains credentials from the profile configuration and thus does not require `AWS_ACCESS_KEY_ID` or `AWS_SECRET_ACCESS_KEY` environment variables to be set.
-
-Note that the region must be set to the region where the bucket exists until the following issue is resolved:
-
-- https://github.com/apache/arrow-rs/issues/2795
-
-Example:
-
-```bash
-$ aws s3 cp test.csv s3://my-bucket/
-upload: ./test.csv to s3://my-bucket/test.csv
-
-$ export AWS_REGION=us-east-1
-$ export AWS_SECRET_ACCESS_KEY=***************************
-$ export AWS_ACCESS_KEY_ID=**************
-
-$ ./target/release/datafusion-cli
-DataFusion CLI v12.0.0
-❯ create external table test stored as csv location 's3://my-bucket/test.csv';
-0 rows in set. Query took 0.374 seconds.
-select * from test;
-+----------+----------+
-| column_1 | column_2 |
-+----------+----------+
-| 1        | 2        |
-+----------+----------+
-1 row in set. Query took 0.171 seconds.
-```
-
-## DataFusion-Cli
+# DataFusion Command-line Interface

-Build the `datafusion-cli` by `cd` into the sub-directory:
+[DataFusion](https://arrow.apache.org/datafusion/) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.

-```bash
-cd datafusion-cli
-cargo build
-```
+The DataFusion CLI is a command line utility that runs SQL queries using the DataFusion engine.

-[df]: https://crates.io/crates/datafusion
+See the [`datafusion-cli` documentation](https://arrow.apache.org/datafusion/user-guide/cli.html) for further information.

docs/source/user-guide/cli.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@

 The DataFusion CLI is a command-line interactive SQL utility for executing
 queries against any supported data files. It is a convenient way to
-try DataFusion out with your own data sources, and test out its SQL support.
+try DataFusion's SQL support with your own data.

 ## Example
