-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed Point Queries #81
Comments
Hi Julian,
Datafun's David |
Hi Julian, I'm trying to read the code below.
My intuition tells me that line 4 should be:
Could you tell me what I've missed? David |
Isn't this undecidable? |
Many fixed point algorithms, including this one, will only care about the latest set. Algorithms that need to be efficient may find it useful to know what was added to the set in the previous iteration, for these we provide |
Undecidable in general. But if the language can recognize particular patterns and prove those to terminate, that may be very useful in practice. I am inspired by Total Functional Programming. Like SQL, it skirts the edge of Turing completeness and can solve most real-world problems. Total Functional Programming uses techniques like descending natural numbers to ensure that its loops terminate. Morel is Turing-complete, and therefore a partial functional programming language, but it would be useful if it can recognize total sub-programs. By the way, I'm spitballing here. I said that I haven't thought this through, and I still haven't. |
I wasn't previously aware of Datafun, even though I know of some of Michael Arntzenius's more recent work. Datafun is an impressive piece of work, with many similarities to Morel (especially the goal to use DB evaluation techniques). A quick review of what the various languages have achieved:
The most obvious way to define recursive sets in a functional programming language is with a recursive function. I would like to get to that. Here's why I think it is possible. Consider an regular recursive function such as Now let's consider a function such as
This recurses infinitely, because a call to The problem seems to be the interpretation what a "call" means. Rather than returning a set that is unioned with a set in the caller, we evaluate (This "receiver" model is, in my experience, how most users think about SQL's How would a Morel function know that it should be called using this "receiver" calling convention? Perhaps a new keyword associated with the function declaration. Or perhaps Morel could invoke this behavior when it sees the |
maybe not useful... but I'm thinking this. If the recursive functions are tail recursive (explicitly or translated to it behind the scene) than we can translate the functions in iterative form and so obtain two benefits:
suppose the function return an iterator that at some point keeps yelding a special value endofset (like EOF), the caller can just stop reading when it's reached In iterative form it is easy to see when the loops parameters become fixed and so to return EOS when it is reached |
Yes, useful. Thanks. As I was writing my comment I was thinking how similar 'receivers' were to tail recursion (and also continuations). So I'm glad to hear that someone else had the same intuition. It makes me more confident that we're onto something. The metaphor isn't perfect - tail recursion is an implementation technique, and the implementation technique for recursively defined sets would instead be some form of "incrementalization", such as Datalog-style seminaive evaluation. But I guess what they have in common we are cutting and splicing the call graph, adding and removing loops. |
I've been exploring these ideas with python as the host language. Here is a naive fix() implementation using the emp dataset.
The structure is similar but it lacks explicit self-reference. Here it is running in a datalore notebook: |
Thanks @DavidPratten! It's good to know that this can be done in Python. It is more concise than I expected, but it took a little while to get my head around the So, I claim that there are many programmers (and query writers) who would feel comfortable writing a recursive function (in Morel or Python) and would be uncomfortable writing one or two lambdas. And therefore it's worth finding a formulation using self-referential functions. |
I agree. @julianhyde Here is a possible way forward. If we focus on overloading the Morel UNION the syntax could be quite simple, and natural, and removes cruft from the SQL version. First, here is a motivating example. Given a tree "Items" (n.b. the root has id = parent_id)
Compute the transitive closure
Here is the solution using SQL WITH RECURSIVE (Sqlite).
Here it is in Morel
In this formulation,
Thoughts? David PS I wonder if the SQL drafters considered this light-weight form for recursive queries?
|
@julianhyde any thoughts on the above? The ideas also here Recursive Relational Queries Simplified |
@DavidPratten, Thanks for this great work. You avoid introducing a named collection by implicitly using the name of the operator ( I see a few problems. First, if we want this process to work for operators other than union (probably any monotonic operator would suffice) then that operator might not be named (e.g. might be a lambda or other expression). Second, if there are two occurrences of To implement this, we would have to hack Morel's scoping mechanism - by which variables are added to the scope as we go deeper into an expression - to notice all calls to certain, special operators. So the third problem is that we would have to make certain operators 'special' in terms of how they are added to the available names in a given scope. I don't want any 'special' operators. I'd rather do the reverse - stick with and build on the Now, if you try use the (Credit to Datafun here for the idea of leaning heavily on the concept to monotonicity. I don't think we we should go as far as Datafun in making every type monotonic -- that seems to make the language rather impractical -- and we might allow users to write expressions that we can't prove terminate, e.g. increase forever but never get to +infinity. It's possible that we drop the monotonicity requirement altogether if the operator is "differenceable".) |
Hi thanks for your feedback. I'm grateful to find someone with enough shared interest to engage! I don't know enough ML to comment on the implementation challenges. I'll restrict my comments to the issues that are common to composable relational queries whether in Morel or SQL.
This reference could be more general that to the specific operator. If the query is a collection of sub-queries unified by a multi-level tree breakdown of row-wise unifying operators (that can be monotonic) the reference would be required to be to the result at the top of all of them.
This problem is removed by changing the expression from "from union" to "from query". Which then works equally well for UNION and all other row-wise unifying operators that sit above this sub-query, whether or not the unifiers are lambdas, or named operators like UNION.
This problem has to be solved whatever the linguistic form. The problem exists and has been solved in SQL CTEs. Here are SQLITE's multi-UNION rules. My intuition is that Sqlite has "nailed" this. All non-recursive sub-queries are constrained to be first (in the case of a multi-level tree of unifying operators this would be in tree walk pre-order) and all recursive sub-queries follow (in tree walk pre-order). And each recursive sub-query is presented with all generated rows from every other subquery in the query. Thanks again David |
Is the introduction of a name required to express recursive relational queries? In the case of a composable CTE the name is purely ceremonial: Could we reserve the introduction of names to use cases where the resultant relation is consumed more than once? David |
I've thought a lot about this issue. I came to the conclusion that the best way to recursively define relations is to define them using a function that returns See full discussion: #106 |
I agree that these kinds of queries are important. (I tend to call them 'iterative queries' but others call them 'recursive queries', and 'incremental queries' are closely related.) A functional programming language that treats relations as data values would seem to be a natural approach to such problems.
In a development branch (commit f99d4a9) I have added the
Relational.iterate
function, which runs a computation (query) until it reaches a fixed point. Like this:Such queries can run via Calcite, being translated to Calcite's RepeatUnion relational operator.
I am also experimenting with writing such queries as recursive functions (you can see some examples in that commit). The tricky part is for the language to know that when I call a function recursively and use its return in a union, that has a chance of reaching a fixed point, and therefore the recursion will not go on forever. That would need us to give union some special properties, and I'm not sure what those properties are, or whether they belong to union alone.
Originally posted by @julianhyde in https://github.com/julianhyde/morel/issues/74#issuecomment-951223078
The text was updated successfully, but these errors were encountered: