@Sl1mb0
Contributor

@Sl1mb0 Sl1mb0 commented Oct 16, 2025

Closes #25

This adds a couple of types that statelessly handle UDF registration and invocation within a query. I've opened it as a draft PR because things may still change significantly, and so that I can lean on CI for testing.

  • I've read the contributing section of the project CONTRIBUTING.md.
  • Signed CLA (if not already signed).

@Sl1mb0 Sl1mb0 force-pushed the tm/stateless-udf branch 11 times, most recently from 8ea54a1 to ad19165 Compare October 17, 2025 02:18
Collaborator

@crepererum crepererum left a comment

Before we merge this, can you check whether this API works for monolith?

@Sl1mb0 Sl1mb0 marked this pull request as ready for review October 23, 2025 23:50
@Sl1mb0 Sl1mb0 force-pushed the tm/stateless-udf branch 2 times, most recently from 2959efe to b36562c Compare October 23, 2025 23:56
@Sl1mb0 Sl1mb0 requested a review from crepererum October 24, 2025 01:33
Comment on lines +36 to +50
CREATE FUNCTION add_one()
LANGUAGE python
AS '
def add_one(x: int) -> int:
    return x + 1
';
CREATE FUNCTION multiply_two()
LANGUAGE python
AS '
def multiply_two(x: int) -> int:
    return x * 2
';
SELECT add_one(1), multiply_two(3);
Collaborator

@crepererum crepererum Oct 24, 2025

That's a good test, but ON TOP of that, could we ALSO test that this works:

CREATE FUNCTION add_one()
LANGUAGE python
AS '
def add_one(x: int) -> int:
    return x + 1

def multiply_two(x: int) -> int:
    return x * 2
';


SELECT add_one(1), multiply_two(3);

(from skimming through the code I think it should)
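
For illustration only, here is a minimal Python sketch of why a single code block could plausibly expose both functions. It assumes the functions are discovered by inspecting the Python source rather than by the name in `CREATE FUNCTION`; that is an assumption, not a description of what this PR does, and `top_level_functions` is a hypothetical helper.

```python
import ast


def top_level_functions(source: str) -> list[str]:
    """Return the names of functions defined at the top level of `source`.

    Hypothetical helper: it only illustrates source-based discovery and is
    not taken from the PR.
    """
    tree = ast.parse(source)
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


# The code block from the suggested test: one CREATE FUNCTION, two defs.
SOURCE = '''
def add_one(x: int) -> int:
    return x + 1

def multiply_two(x: int) -> int:
    return x * 2
'''

names = top_level_functions(SOURCE)
print(names)
```

Under that assumption both `add_one` and `multiply_two` are found, independent of the name given in the SQL statement.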

Collaborator

And related, could we test the behavior of this nonsense here:

CREATE FUNCTION add_one()
LANGUAGE python
AS '';

SELECT 1;

(i.e. the python code block contains NO functions)
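
A small Python sketch of the edge case, assuming the code block is executed into a fresh namespace and the functions are then collected from it. `load_functions` is a hypothetical helper and not the PR's implementation; the point is only that an empty string is valid Python and simply defines nothing.

```python
import inspect


def load_functions(source: str) -> dict:
    """Execute `source` in a fresh namespace and return the functions it defines.

    Hypothetical helper for illustration; the real registration path may differ.
    """
    namespace: dict = {}
    exec(source, namespace)  # an empty string is valid Python and defines nothing
    return {
        name: obj for name, obj in namespace.items() if inspect.isfunction(obj)
    }


empty = load_functions("")
one = load_functions("def add_one(x: int) -> int:\n    return x + 1\n")

print(len(empty), sorted(one))
```

Whether the empty case should be accepted silently or rejected with an error is a design decision for the PR; the sketch only shows that nothing gets registered.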

Comment on lines 57 to 58
/// Invoke the query, returning a result
pub async fn invoke(&mut self, udf_query: UdfQuery) -> DataFusionResult<Vec<Vec<String>>> {
Collaborator

I think for the integration it would be easier if you return the physical plan here and let the caller decide if they wanna call collect or whatever.
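
To make the shape of that suggestion concrete, here is a language-agnostic sketch written in Python. `Plan` and `plan_query` are hypothetical stand-ins and do not use DataFusion's actual types; the idea is just that the function returns a description of the work, and the caller chooses between streaming and collecting.

```python
from typing import Callable, Iterator, List


class Plan:
    """Hypothetical stand-in for a physical plan: nothing runs until asked."""

    def __init__(self, produce: Callable[[], Iterator[List[str]]]):
        self._produce = produce

    def stream(self) -> Iterator[List[str]]:
        # Caller pulls rows one at a time.
        return self._produce()

    def collect(self) -> List[List[str]]:
        # Caller materializes every row at once.
        return list(self._produce())


def plan_query(values: List[int]) -> Plan:
    """Build a plan instead of returning already-collected rows."""

    def produce() -> Iterator[List[str]]:
        for v in values:
            yield [str(v + 1)]  # stand-in for applying a UDF such as add_one

    return Plan(produce)


plan = plan_query([1, 2, 3])
first_row = next(plan.stream())  # caller chose to stream
all_rows = plan.collect()        # or to collect everything
```

With this shape the decision to buffer results moves out of `invoke` and into the integration code.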

@Sl1mb0 Sl1mb0 requested a review from crepererum October 27, 2025 02:41
Development

Successfully merging this pull request may close these issues.

Stateless Python UDF Registration

2 participants