@duongcongtoai
Contributor

@duongcongtoai duongcongtoai commented Nov 18, 2025

Which issue does this PR close?

Original PR/discussions: #16186

There was some discussion about handling subqueries in a depth-aware way at the planning stage. That would be nice to have, but until something like it is implemented, we cannot continue implementing query decorrelation. However, I realized that a minor change to how we plan subqueries is enough to at least avoid the error caused by an ambiguous schema, as in #15558:

  • PlannerContext maintains an optional outer schema; we just need to replace this field with a stack of outer schemas
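
To make the idea concrete, here is a minimal sketch of resolving an outer reference against a stack of schemas, innermost enclosing query first. It does not use DataFusion's real types: `Schema`, `with_outer_schemas`, and `resolve_outer` are hypothetical names, and `Schema` is only a simplified stand-in for `DFSchemaRef`.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a schema (NOT DataFusion's `DFSchema`):
/// a relation name plus the names of its columns.
#[derive(Debug, Clone)]
pub struct Schema {
    pub relation: String,
    pub columns: HashSet<String>,
}

impl Schema {
    pub fn new(relation: &str, columns: &[&str]) -> Self {
        Schema {
            relation: relation.to_string(),
            columns: columns.iter().map(|c| c.to_string()).collect(),
        }
    }
}

/// Hypothetical planner context holding one schema per enclosing query level.
#[derive(Debug, Default)]
pub struct PlannerContext {
    /// Outermost query first, directly enclosing query last.
    outer_schemas: Vec<Schema>,
}

impl PlannerContext {
    pub fn with_outer_schemas(outer_schemas: Vec<Schema>) -> Self {
        PlannerContext { outer_schemas }
    }

    /// Resolve `column` against the enclosing queries, innermost first.
    /// Returns the relation that owns the column and its depth, where
    /// depth 1 is the directly enclosing query.
    pub fn resolve_outer(&self, column: &str) -> Option<(&str, usize)> {
        self.outer_schemas
            .iter()
            .rev() // walk from the innermost enclosing query outwards
            .enumerate()
            .find(|(_, schema)| schema.columns.contains(column))
            .map(|(i, schema)| (schema.relation.as_str(), i + 1))
    }
}
```

Searching innermost first means a name that exists at several levels binds to the closest enclosing query, which is one way to avoid reporting it as ambiguous. Whether the PR resolves names in exactly this order is an assumption here.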

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels Nov 18, 2025
@duongcongtoai duongcongtoai changed the title feat: Support parsing subqueries with OuterReferenceColumn belongs to non-adjacent outer relations attemp 2 feat: Support parsing subqueries with OuterReferenceColumn belongs to non-adjacent outer relations attemp 2 Nov 18, 2025
@duongcongtoai duongcongtoai marked this pull request as draft November 19, 2025 07:56
@duongcongtoai duongcongtoai marked this pull request as ready for review November 22, 2025 16:57
@kosiew
Contributor

kosiew commented Jan 14, 2026

@duongcongtoai

You'll get more helpful review comments if you tell us more about your PR

@alamb alamb changed the title feat: Support parsing subqueries with OuterReferenceColumn belongs to non-adjacent outer relations attemp 2 feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations attemp 2 Jan 14, 2026
@alamb
Contributor

alamb commented Jan 14, 2026

Thanks @duongcongtoai and @duongcongtoai

It turns out I got a ping yesterday from Andy Pavlo and his student @yliang412 about the status of subquery support.

I just filed a ticket to track just the planning aspect of this query (which I think this PR handles)

@alamb
Contributor

alamb commented Jan 14, 2026

I will try and give this a look shortly

@alamb alamb changed the title feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations attemp 2 feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations Jan 14, 2026

/// The queries schemas of outer query relations, used to resolve the outer referenced
/// columns in subquery (recursive aware)
outer_queries_schemas_stack: Vec<DFSchemaRef>,

I do think this is the key change -- rather than having a single outer query schema, we need a stack with one schema for each enclosing query level.

However, it seems like we shouldn't need both outer_query_schema as well as a stack of them -- the stack should always be enough -- maybe you can add push and pop methods that add/remove from the schema stack
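
A rough sketch of what such push and pop methods could look like, assuming the stack fully replaces the single field. The method names (`push_outer_schema`, `pop_outer_schema`, `with_outer_schema`) are hypothetical, and `SchemaRef` is a simplified stand-in for `DFSchemaRef`, not the real type.

```rust
use std::sync::Arc;

/// Simplified stand-in for `DFSchemaRef`: just a shared list of column names.
pub type SchemaRef = Arc<Vec<String>>;

/// Hypothetical context where the stack is the only source of outer schemas.
#[derive(Debug, Default)]
pub struct PlannerContext {
    /// Outermost query first, directly enclosing query last.
    outer_schemas: Vec<SchemaRef>,
}

impl PlannerContext {
    /// Enter a subquery: record the schema of the query that encloses it.
    pub fn push_outer_schema(&mut self, schema: SchemaRef) {
        self.outer_schemas.push(schema);
    }

    /// Leave a subquery: drop the most recently pushed schema.
    pub fn pop_outer_schema(&mut self) -> Option<SchemaRef> {
        self.outer_schemas.pop()
    }

    /// The directly enclosing query's schema, i.e. what the single
    /// optional field used to hold.
    pub fn outer_query_schema(&self) -> Option<&SchemaRef> {
        self.outer_schemas.last()
    }

    /// Current subquery nesting depth.
    pub fn depth(&self) -> usize {
        self.outer_schemas.len()
    }

    /// Run `plan` with `schema` pushed, then pop it again so that
    /// push and pop cannot get out of balance on the normal return path.
    pub fn with_outer_schema<T>(
        &mut self,
        schema: SchemaRef,
        plan: impl FnOnce(&mut Self) -> T,
    ) -> T {
        self.push_outer_schema(schema);
        let result = plan(self);
        self.pop_outer_schema();
        result
    }
}
```

Keeping `outer_query_schema()` as an accessor over the top of the stack would let existing callers of the single field keep working while the field itself goes away.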

One thought I had while looking at this PR is that a more logical structure would be for each PlannerContext to have a reference to its parent -- something like

pub struct PlannerContext<'a> {
  /// When planning a subquery, the context of the outer query
  parent_context: Option<&'a PlannerContext<'a>>,
  ...
}

And remove the explicit outer_query_schema
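
For illustration, a self-contained sketch of the parent-reference design, where outer columns are found by walking the chain of parent contexts. The `columns` field and the `root`, `child`, and `resolve_depth` helpers are hypothetical simplifications and not part of DataFusion.

```rust
/// Hypothetical planner context that links to the context of its enclosing query.
pub struct PlannerContext<'a> {
    /// Column names visible at this query level (simplified stand-in for a schema).
    columns: Vec<String>,
    /// When planning a subquery, the context of the outer query.
    parent_context: Option<&'a PlannerContext<'a>>,
}

impl<'a> PlannerContext<'a> {
    /// Context for a top-level query, which has no parent.
    pub fn root(columns: &[&str]) -> Self {
        PlannerContext {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            parent_context: None,
        }
    }

    /// Context for a subquery nested directly inside `self`.
    pub fn child(&'a self, columns: &[&str]) -> PlannerContext<'a> {
        PlannerContext {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            parent_context: Some(self),
        }
    }

    /// How many levels up `column` is defined: `Some(0)` for the current
    /// level, `Some(1)` for the directly enclosing query, and so on.
    pub fn resolve_depth(&self, column: &str) -> Option<usize> {
        let mut current = Some(self);
        let mut depth = 0;
        while let Some(ctx) = current {
            if ctx.columns.iter().any(|c| c.as_str() == column) {
                return Some(depth);
            }
            current = ctx.parent_context;
            depth += 1;
        }
        None
    }
}
```

With this shape there is no stack to keep balanced: the nesting depth falls out of the borrow chain, and a child context cannot outlive its parent.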

@alamb alamb left a comment

Thank you for this @duongcongtoai and @kosiew

Our friends at CMU were just asking about this so it was great to see a PR already started

I left some comments. My biggest ones are:

  1. Can we find a way to unify the schema stacks (rather than having both a field and a stack)?
  2. Can we restrict this PR to just planning (aka no changes to the optimizer or other stages) -- maybe with tests focused only on planning, such as
    #[test]
    fn test_ambiguous_column_references_with_in_using_join() {
        let sql = "select p1.id, p1.age, p2.id
            from person as p1
            INNER JOIN person as p2
            using(id)";
        let plan = logical_plan(sql).unwrap();
        assert_snapshot!(
            plan,
            @r"
        Projection: p1.id, p1.age, p2.id
          Inner Join: Using p1.id = p2.id
            SubqueryAlias: p1
              TableScan: person
            SubqueryAlias: p2
              TableScan: person
        "
        );
    }

Development

Successfully merging this pull request may close these issues.

Supporting planning (binding) Nested correlated subquery error with a depth exceeding 1

3 participants