Description
Is your feature request related to a problem or challenge?
- Broken out of "Support Nested correlated subquery error with a depth exceeding 1" (#15558) from @irenjj
I believe the CMU optd project is trying to plan multi-level correlated subqueries (specifically @yliang412 👋 )
CREATE TABLE employees (
employee_id INTEGER,
employee_name VARCHAR,
dept_id INTEGER,
salary DECIMAL
);
CREATE TABLE project_assignments (
project_id INTEGER,
employee_id INTEGER,
priority INTEGER
);
Currently, running this query fails:
SELECT e1.employee_name, e1.salary
FROM employees e1
WHERE e1.salary > (
SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.dept_id = e1.dept_id
AND e2.salary > (
SELECT AVG(e3.salary)
FROM employees e3
WHERE e3.dept_id = e1.dept_id
)
);
Schema error: No field named e1.dept_id. Did you mean 'e3.dept_id'?.
Feedback from @yliang412 :
I believe it couldn't find the e1.dept_id. Even without unnesting, the planner should still correctly infer that e1.dept_id is from the employees table in the top-level scope. I suspect it has something to do with resolving the OuterReferenceColumn when it is not from the immediate outer scope.
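To make the suspicion above concrete, here is a minimal, self-contained sketch of column resolution over a chain of nested scopes. It is not DataFusion's implementation: `Scope`, `Resolved`, `resolve`, `resolve_immediate_outer_only`, and `resolve_from_innermost` are all hypothetical names invented for illustration. The sketch only contrasts a lookup that stops at the immediate outer scope (which would miss `e1.dept_id` when asked from the `e3` subquery) with one that walks every enclosing scope.

```rust
/// Columns visible in one query block, plus a link to the enclosing block.
/// Hypothetical type for illustration; not a DataFusion structure.
struct Scope<'a> {
    columns: Vec<&'static str>,
    outer: Option<&'a Scope<'a>>,
}

/// Where a column name was found relative to the scope that asked for it.
#[derive(Debug, PartialEq)]
enum Resolved {
    /// Found in the asking scope itself.
    Local,
    /// Found `depth` levels up (1 = immediate outer scope).
    OuterReference { depth: usize },
}

/// Walks the whole chain of enclosing scopes, innermost first.
fn resolve(scope: &Scope, name: &str) -> Option<Resolved> {
    let mut current = Some(scope);
    let mut depth = 0;
    while let Some(s) = current {
        if s.columns.iter().any(|c| *c == name) {
            return Some(if depth == 0 {
                Resolved::Local
            } else {
                Resolved::OuterReference { depth }
            });
        }
        current = s.outer;
        depth += 1;
    }
    None
}

/// Assumed failure mode: only the local and immediate outer scope are checked.
fn resolve_immediate_outer_only(scope: &Scope, name: &str) -> Option<Resolved> {
    if scope.columns.iter().any(|c| *c == name) {
        return Some(Resolved::Local);
    }
    match scope.outer {
        Some(o) if o.columns.iter().any(|c| *c == name) => {
            Some(Resolved::OuterReference { depth: 1 })
        }
        _ => None,
    }
}

/// Builds the three scopes of the query in this issue (e1 > e2 > e3) and
/// resolves `name` from the innermost (e3) subquery with the given resolver.
fn resolve_from_innermost(
    resolver: fn(&Scope, &str) -> Option<Resolved>,
    name: &str,
) -> Option<Resolved> {
    let e1 = Scope { columns: vec!["e1.dept_id", "e1.salary"], outer: None };
    let e2 = Scope { columns: vec!["e2.dept_id", "e2.salary"], outer: Some(&e1) };
    let e3 = Scope { columns: vec!["e3.dept_id", "e3.salary"], outer: Some(&e2) };
    resolver(&e3, name)
}
```

In this sketch, `resolve_from_innermost(resolve, "e1.dept_id")` finds the column two levels up, while `resolve_from_innermost(resolve_immediate_outer_only, "e1.dept_id")` finds nothing, which mirrors the "No field named e1.dept_id" error under the assumption that only one outer level is consulted.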
Describe the solution you'd like
We would like DataFusion to at least bind the query correctly (in DataFusion terms, have SqlToRel successfully create a LogicalPlan).
Note that successfully executing the query and getting the correct answer likely requires more unnesting rules, which we don't yet have. This ticket focuses only on the logical planning of the query; a separate ticket covers actually executing it:
Describe alternatives you've considered
It looks like @duongcongtoai had a PR from a while ago to resolve this problem.
Additional context
No response