-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add insta
/ snapshot testing to CLI & set up AWS mock
#13672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# Conflicts: # datafusion-cli/Cargo.lock
# Conflicts: # .github/workflows/extended.yml
Because CLI integration tests don't break very often, I moved them to the extended tests https://github.com/blaginin/datafusion/actions/runs/13679702615/job/38249126001#step:8:2585 |
Also resolved conflicts with main. I remember @alamb highlighted external data sources as a priority for this year, so I think adding tests is important 👊 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @blaginin -- I think this PR is a nice improvement
In my opinion, switching the Ci tests to run on each PR for a while to make sure they are passing is required prior to merge
I think both inline snapshots and moving the docs are important, but we could do them as a follow on PR
.github/workflows/extended.yml
Outdated
rustup toolchain install | ||
- name: Install Protobuf Compiler | ||
run: sudo apt-get install -y protobuf-compiler | ||
- name: Setup Minio - S3-compatible storage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am worried that an extended test may fail (as it is only run on commits to main). And I have spent a bunch of time recently keeping the extended tests gree.
Maybe we can put this test in the normal rust.yaml that runs on each PR for a while until we are sure it is stable 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair! I've added them back to the rust flow. Fells like testing on the existing linux check should be enough (to save time) but happy to add to mac os / move to a new check if preferred
datafusion-cli/CONTRIBUTING.md
Outdated
|
||
```shell | ||
cargo install cargo-insta | ||
cargo insta review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
datafusion-cli/CONTRIBUTING.md
Outdated
cargo test | ||
``` | ||
|
||
## Snapshot testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please put these instructions in the existing https://datafusion.apache.org/contributor-guide/testing.html instead?
And then perhaps you can add a link to that page in datafusion-cli/CONTRIBUTING.md
(this is partly so when we start using insta for other tests, we can reuse the same instructions)
|
||
glob!("sql/*.sql", |path| { | ||
let input = fs::read_to_string(path).unwrap(); | ||
assert_cmd_snapshot!(cli().pass_stdin(input)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is absolutely possible! My idea was to make it closer to slt and separate the code from the data.
FWIW I think .slt keeps the tests setup and expected output together (the slt driver is separate for sure, but most of the tests are self contained in the .slt file
insta
/ snapshot testing to CLI & set up AWS mock
.github/workflows/extended.yml
Outdated
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y | ||
source $HOME/.cargo/env | ||
rustup toolchain install | ||
- name: Setup Rust toolchain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will fail because (as I found out earlier in the week) the ubuntu builder doens't start with rust
|
||
### AWS | ||
|
||
To test the S3 integration against [Minio](https://github.com/minio/minio) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the instructions locally and they worked great 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much @blaginin -- I think this PR looks really good and will make DataFusion testing better and easier
I expect over time the insta
style snapshot tests will become widely used (and loved!)
I also verified that
cargo test -p datafusion-cli
works without running docker / minio and I also followed the instructions locally
I will likely try a follow on PR to try and run the datafusion-cli tests separately from the main test run to make things clearer but this looks 👍
Thanks again for sticking with it
First start up a container with Minio and load test files. | ||
|
||
```shell | ||
docker run -d \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we consider having the tests themselves start the containers they need?
In Trino land this is done with testcontainers-java and it works awesomely well from developer experience perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s a nice improvement if tests just handle the setup automatically! Created an issue to discuss: #15092
I plan to merge this on Tuesday unless anyone needs additional time to review. I may then organize some tickets to port existing tests to use the new framework (especially as we seem to be a little short on good first issue tickets) |
I merged up to resolve conflicts and plan to merge this when the tests pass |
Thanks again @blaginin |
Which issue does this PR close?
Rationale for this change
We currently don't test whether our external integrations actually work: as a result #13576 has happened.
What changes are included in this PR?
Integration tests for S3.
Are there any user-facing changes?
No.