Skip to content

feat(fabric): rewrite = ANY(ARRAY[...]) and = ANY((...)) to IN#182

Closed
VaibhaveS wants to merge 7 commits into
tobilg:mainfrom
VaibhaveS:vsekar/any-to-in
Closed

feat(fabric): rewrite = ANY(ARRAY[...]) and = ANY((...)) to IN#182
VaibhaveS wants to merge 7 commits into
tobilg:mainfrom
VaibhaveS:vsekar/any-to-in

Conversation

@VaibhaveS
Copy link
Copy Markdown
Contributor

@VaibhaveS VaibhaveS commented Apr 25, 2026

pg_get_querydef in PostgreSQL emits ScalarArrayOpExpr in two forms:

col = ANY(ARRAY['a', 'b', 'c'])
col = ANY(('a', 'b', 'c'))

Neither is valid in T-SQL / Microsoft Fabric. Both should be rewritten to: col IN ('a', 'b', 'c')

col = ANY(ARRAY['a','b','c'])  ->  col IN ('a','b','c')
col = ANY(('a','b','c'))       ->  col IN ('a','b','c')
col <> ANY(ARRAY['a','b'])     ->  unchanged
col = ANY(SELECT id FROM s)    ->  unchanged

@tobilg
Copy link
Copy Markdown
Owner

tobilg commented Apr 25, 2026

Thanks for the PR. The core rewrite is useful, and I verified that the advertised non-empty Fabric cases work:

  • col = ANY(ARRAY['a', 'b'])col IN ('a', 'b')
  • col = ANY(('a', 'b'))col IN ('a', 'b')

I would not merge this as-is yet because there are two gaps:

  • col = ANY(ARRAY[]) currently becomes col IN (), which is invalid T-SQL/Fabric SQL. This should be handled explicitly, e.g. left unchanged or rewritten to an always-false
    predicate.
  • The PR description/comment mentions Fabric/TSQL, but the implementation is Fabric-only. PostgreSQL → TSQL still emits col = ANY(ARRAY['a', 'b']).

Could you add regression tests for the two advertised forms and the empty-array case? If TSQL remains part of the intended scope, please include PostgreSQL → TSQL coverage as
well; otherwise the description/comment should be narrowed to Fabric-only.

@VaibhaveS VaibhaveS closed this Apr 25, 2026
@VaibhaveS VaibhaveS reopened this Apr 25, 2026
@VaibhaveS
Copy link
Copy Markdown
Contributor Author

Updated.

@tobilg
Copy link
Copy Markdown
Owner

tobilg commented Apr 25, 2026

Thanks for the update. The core implementation looks sound to me: I verified the implementation behavior locally for both Fabric and T-SQL, including:

  • col = ANY(ARRAY['a', 'b', 'c']) -> col IN ('a', 'b', 'c')
  • col = ANY(('a', 'b', 'c')) -> col IN ('a', 'b', 'c')
  • col = ANY(ARRAY[]) -> 1 = 0
  • col <> ANY(...) remains unchanged
  • col = ANY(SELECT ...) remains unchanged

cargo check -p polyglot-sql also passed with the implementation applied.

Before merging, this needs a small rebase/update against current main: the PR currently adds fabric_regression.rs and tsql_regression.rs as new files, but those files already exist on main now, so GitHub marks the PR as not mergeable.

One small test-quality request: please append the new cases to the existing regression files and use exact assert_eq! output assertions instead of substring checks. Once that is done, I think this makes sense to merge.

@VaibhaveS
Copy link
Copy Markdown
Contributor Author

Thanks for the update. The core implementation looks sound to me: I verified the implementation behavior locally for both Fabric and T-SQL, including:

  • col = ANY(ARRAY['a', 'b', 'c']) -> col IN ('a', 'b', 'c')
  • col = ANY(('a', 'b', 'c')) -> col IN ('a', 'b', 'c')
  • col = ANY(ARRAY[]) -> 1 = 0
  • col <> ANY(...) remains unchanged
  • col = ANY(SELECT ...) remains unchanged

cargo check -p polyglot-sql also passed with the implementation applied.

Before merging, this needs a small rebase/update against current main: the PR currently adds fabric_regression.rs and tsql_regression.rs as new files, but those files already exist on main now, so GitHub marks the PR as not mergeable.

One small test-quality request: please append the new cases to the existing regression files and use exact assert_eq! output assertions instead of substring checks. Once that is done, I think this makes sense to merge.

I added assert_eq!, I saw files like clickhouse_regression.rs etc and used the same pattern for fabric & tsql.

@VaibhaveS
Copy link
Copy Markdown
Contributor Author

copilot iterated a bit for the CI/CD to pass...

@tobilg
Copy link
Copy Markdown
Owner

tobilg commented Apr 27, 2026

Thanks for the update. The = ANY(...) -> IN (...) part looks good to me now:

  • PR is mergeable.
  • The tests are appended to the existing regression files.
  • The new cases use exact assert_eq! assertions.
  • I verified the focused regression tests pass with:
    cargo test -p polyglot-sql --test fabric_regression --test tsql_regression
  • cargo check -p polyglot-sql also passes.

I would hold off on merging as-is because the PR now includes a couple of unrelated changes that look like correctness regressions:

  • TSQL/Fabric DATEADD(MONTH, 1, order_date) to PostgreSQL now becomes order_date + 1, which drops the MONTH unit.
  • BigQuery ARRAY_LENGTH(items) to PostgreSQL now becomes ARRAY_LENGTH(items), but PostgreSQL expects the dimension argument, e.g. ARRAY_LENGTH(items, 1).

I will strip the releavant portions of this PR and apply it locally, then close the PR. Thanks.

@tobilg tobilg closed this Apr 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants