Stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators #16030

Closed
fmonjalet opened this issue May 12, 2025 · 1 comment · Fixed by #16031
Labels
bug Something isn't working

Comments

@fmonjalet
Contributor

Describe the bug

When translating a Substrait scalar function call to a DataFusion logical plan, if the function name maps to a binary operator and the call has a large number of arguments (say 2000), further processing of the resulting logical plan results in a stack overflow.

To Reproduce

This commit contains a reproducer plan and test. The plan is pretty large (because a lot of arguments are required to see the stack overflow), but pretty simple: it's an OR(col == a, col == b, col == c, ..., col == x). We could argue that this precise example should actually be a Substrait SingularOrList, but the issue can be triggered all the same with other operators and expressions.
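
To illustrate why the argument count matters, here is a minimal, self-contained sketch. It uses a toy `Expr` type, not DataFusion's, and it assumes (this is not confirmed above) that the variadic call is folded into a left-deep chain of binary nodes. Under that assumption the tree depth grows linearly with the number of arguments, which is what hurts recursive traversals. The `balanced` helper shows one possible mitigation; all names here are hypothetical.

```rust
// Toy expression type for illustration only; NOT DataFusion's `Expr`.
#[derive(Debug)]
enum Expr {
    Leaf(u32),
    Or(Box<Expr>, Box<Expr>),
}

/// Left-deep fold: ((a OR b) OR c) OR d ...
/// Assumed shape of the naive translation (hypothetical).
fn left_deep(args: Vec<Expr>) -> Option<Expr> {
    args.into_iter()
        .reduce(|acc, e| Expr::Or(Box::new(acc), Box::new(e)))
}

/// Balanced fold: split the arguments in halves and combine the results,
/// so the depth grows logarithmically with the number of arguments.
fn balanced(mut args: Vec<Expr>) -> Option<Expr> {
    match args.len() {
        0 => None,
        1 => args.pop(),
        n => {
            let right = args.split_off(n / 2);
            let l = balanced(args)?;
            let r = balanced(right)?;
            Some(Expr::Or(Box::new(l), Box::new(r)))
        }
    }
}

/// Depth of the tree (a single leaf has depth 1), computed with an
/// explicit work list so the measurement itself does not recurse.
fn depth(root: &Expr) -> usize {
    let mut max = 0;
    let mut stack = vec![(root, 1usize)];
    while let Some((node, d)) = stack.pop() {
        max = max.max(d);
        if let Expr::Or(l, r) = node {
            stack.push((&**l, d + 1));
            stack.push((&**r, d + 1));
        }
    }
    max
}

fn main() {
    let leaves = |n: u32| (0..n).map(Expr::Leaf).collect::<Vec<_>>();
    let deep = left_deep(leaves(2000)).expect("non-empty argument list");
    let flat = balanced(leaves(2000)).expect("non-empty argument list");
    println!("left-deep depth: {}", depth(&deep));
    println!("balanced depth:  {}", depth(&flat));
}
```

With 2000 leaves, the left-deep tree has depth 2000 while the balanced one has depth 12. The balanced shape is only valid for associative operators such as OR and AND; non-associative operators would need a different approach, for example a nesting limit that produces a clean error.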

Expected behavior

No stack overflow or crash. At best, the plan is accepted and executed; at worst, DataFusion returns a clean error.

Additional context

No response

@fmonjalet fmonjalet added the bug Something isn't working label May 12, 2025
@fmonjalet
Contributor Author

take
