Skip to content

Conversation

@blaginin
Copy link
Collaborator

@blaginin blaginin commented Oct 20, 2025

blaginin and others added 30 commits October 16, 2025 10:58
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- smj_join_key_ordering
- reorder_join_keys_to_left_input (inline snapshots with filter)
- parallelization_ignores_limit
- parallelization_prior_to_sort_preserving_merge
- parallelization_sort_preserving_merge_with_union
- parallelization_does_not_benefit

Replace old-style expected string arrays with insta inline snapshot assertions.
For reorder_join_keys_to_left_input, use regex filter to replace all join_type
values with '...' and add separate assertion for top join type.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Migrated the following tests to use inline insta snapshots:
- repartition_sorted_limit
- repartition_sorted_limit_with_filter
- repartition_ignores_limit
- repartition_ignores_union
- repartition_through_sort_preserving_merge
- repartition_ignores_sort_preserving_merge
- repartition_ignores_sort_preserving_merge_with_union
- repartition_does_not_destroy_sort
- repartition_does_not_destroy_sort_more_complex
- repartition_ignores_transitively_with_projection
- repartition_transitively_past_sort_with_projection

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Migrated the following tests to use inline insta snapshots:
- remove_redundant_roundrobins
- remove_unnecessary_spm_after_filter
- preserve_ordering_through_repartition
- no_need_for_sort_after_filter
- do_not_preserve_ordering_through_repartition3
- do_not_add_unnecessary_hash
- do_not_add_unnecessary_hash2
- optimize_away_unnecessary_repartition

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Migrated the following tests to use inline insta snapshots:
- optimize_away_unnecessary_repartition2
- parallelization_ignores_transitively_with_projection_csv

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Migrated the following tests to use inline insta snapshots:
- parallelization_single_partition
- parallelization_multiple_files

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…tputs

Migrated the following tests to use inline insta snapshots:
- repartition_transitively_with_projection
- repartition_transitively_past_sort_with_filter
- repartition_transitively_past_sort_with_projection_and_filter

These tests have different outputs for DISTRIB_DISTRIB_SORT vs SORT_DISTRIB_DISTRIB,
so each gets two separate inline snapshots.

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Migrated test_distribute_sort_parquet to use inline insta snapshots.

This test uses different optimizer configurations:
- Initial plan before optimization
- After Run::Distribution
- After Run::Distribution + Run::Sorting

All three states now use inline snapshots instead of string arrays.

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Migrated test_distribute_sort_memtable to use inline insta snapshot.

This test checks the final optimized plan after creating and executing
a SQL query on a MemTable.

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
This test uses a loop to test different compression types with conditional
logic for expected outputs. Used insta::allow_duplicates! to handle inline
snapshots within the loop.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…to insta snapshots

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@blaginin blaginin changed the title Move enforce distrivution Insta for enforce_distrubution Oct 20, 2025
@github-actions github-actions bot added the core Core DataFusion crate label Oct 20, 2025
let mut settings = Settings::clone_current();
settings.add_filter(&format!("join_type={join_type}"), "join_type=...");

#[rustfmt::skip]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screen.Recording.2025-10-21.at.11.25.18.mov

@alamb
Copy link
Contributor

alamb commented Oct 22, 2025

So close...

@blaginin
Copy link
Collaborator Author

blaginin commented Oct 22, 2025

This happened to be a bigger change than I expected, there are some really big and complicated tests, and if we just compare with a separate snapshot on every case, we'll add +1K more lines (like in the original PR). I'm doing something more complicated from the code point of view, but on the positive side the file size is the same.

I also think we may want to split some tests / and the file itself, but this may be the one to do on top of this PR

@alamb
Copy link
Contributor

alamb commented Oct 22, 2025

I think a sequence of PRs sounds like a good plan to me

blaginin added a commit to blaginin/datafusion that referenced this pull request Oct 23, 2025
@blaginin
Copy link
Collaborator Author

Sure, moved all easy cases to #18248

github-merge-queue bot pushed a commit that referenced this pull request Oct 29, 2025
- part of #15791

All easy cases from #18185
(that are nicely-ish displayed in git diff).

Note on preserving comments: if it was note about what should happen (or
what will be tested), it's placed on top of the snapshot. If that's
something that comments part of the plan, I put it below the plan
tobixdev pushed a commit to tobixdev/datafusion that referenced this pull request Nov 2, 2025
- part of apache#15791

All easy cases from apache#18185
(that are nicely-ish displayed in git diff).

Note on preserving comments: if it was note about what should happen (or
what will be tested), it's placed on top of the snapshot. If that's
something that comments part of the plan, I put it below the plan
# Conflicts:
#	datafusion/core/tests/physical_optimizer/enforce_distribution.rs
@blaginin blaginin marked this pull request as ready for review November 3, 2025 20:29
@blaginin blaginin self-assigned this Nov 3, 2025
@blaginin blaginin requested a review from alamb November 3, 2025 20:29
@alamb alamb changed the title Insta for enforce_distrubution Complete migrating enforce_distrubution tests to insta Nov 3, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing -- we (well you mostly and @Chen-Yuan-Lai ) did it!

@alamb alamb added the development-process Related to development process of DataFusion label Nov 3, 2025
@alamb alamb enabled auto-merge November 4, 2025 18:13
@alamb alamb added this pull request to the merge queue Nov 5, 2025
Merged via the queue into apache:main with commit acdd263 Nov 5, 2025
28 checks passed
@blaginin
Copy link
Collaborator Author

blaginin commented Nov 5, 2025

Thank you for the review!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate development-process Related to development process of DataFusion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate core tests to insta [Epic] Add snapshot tests (migrate to insta for tests)

3 participants