[substrait] Scalar subquery in select not supported #18066

@bvolpato-dd

Describe the bug

For certain queries with scalar subqueries, DataFusion's ScalarSubqueryToJoin optimizer rule rewrites the subquery as a LEFT join with no join condition.

A recent change that requires joins to have a condition (#15334) added validation that now rejects this type of query:

---- cases::roundtrip_logical_plan::scalar_subquery_in_select stdout ----
Error: Plan("join condition should not be empty")

To Reproduce

SELECT a, (SELECT MAX(b) FROM data2) as max_b FROM data

Returns error:

Error: Plan("join condition should not be empty")

Expected behavior

The plan should round-trip without error, with Boolean(true) in the filter:

Projection: data.a, max(data2.b) AS max_b
  Left Join: 
    TableScan: data projection=[a]
    Aggregate: groupBy=[[]], aggr=[[max(data2.b)]]
      TableScan: data2 projection=[b], partial_filters=[Boolean(true)]
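
To make the difference between the current and the expected behavior concrete, here is a minimal sketch in plain Rust. It is a toy model only: `Expr`, `conjunction`, `join_condition_strict` and `join_condition_or_true` are hypothetical names invented for illustration, not DataFusion or Substrait APIs, and this is not necessarily how the fix should be implemented. It only contrasts rejecting an empty join condition with falling back to a literal `true`.

```rust
// Toy model only: these types and functions are hypothetical and do NOT
// correspond to DataFusion's or Substrait's actual API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(bool),
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Combine all join predicates with AND. Returns None when there are none,
/// which is the case for the join produced from the scalar subquery.
fn conjunction(conds: &[Expr]) -> Option<Expr> {
    conds
        .iter()
        .cloned()
        .reduce(|acc, e| Expr::And(Box::new(acc), Box::new(e)))
}

/// Strict variant: an empty condition is an error, mirroring the message
/// reported in this issue.
fn join_condition_strict(conds: &[Expr]) -> Result<Expr, String> {
    conjunction(conds).ok_or_else(|| "join condition should not be empty".to_string())
}

/// Lenient variant: an empty condition is replaced by a literal `true`,
/// so a condition-less join can still be represented.
fn join_condition_or_true(conds: &[Expr]) -> Expr {
    conjunction(conds).unwrap_or(Expr::Literal(true))
}
```

With an empty slice of predicates, the strict variant returns the error string while the lenient one yields `Expr::Literal(true)`; with a single predicate, both return that predicate unchanged.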

Additional context

No response

Labels: bug
