type inference: fix bug in union-bound algorithm #289

apoelstra · 2025-04-14T14:53:50Z

We have implemented the "halving" variant of the union-bound algorithm, but my code for it was wrong in a way that sometimes causes types to "come ununified", i.e. a single type bound is split into two, with different instances in the program more or less randomly assigned to each side of the split.

This causes all sorts of weird behavior: later unifications appear to be only half-applied, and programs that were constructed successfully can fail type-checking or sharing-checking after being encoded.

Fixes #286

We have logic to output `...` when a type gets too deep and to stop recursing at that point. Most notably this is useful when outputting an occurs-check-failing type. But the max depth is 64, so you can make a type of size 2^64 without hitting this limit, which will still blow out your memory and effectively stall.

This can create pretty confusing symptoms when fuzzing for pathological programs -- you will get an apparent stall with no output, and no matter how many debug printlns you add it will appear that they all execute before the stall -- and it turns out that in fact all your code was working correctly up to the point of the final panic, where it tries to *display* an occurs-check error.

Cut off the display at 10000 nodes. If somebody really needs to output a text representation of such a thing, they can write their own logic.
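As a rough illustration of why a depth limit alone is not enough, here is a minimal sketch of a renderer that enforces both a depth limit and a total node budget. The `Ty` type, `render`, `render_inner` and `doubled` are hypothetical names invented for this example and are not the library's API; only the constants 64 and 10000 come from the commit message above.

```rust
use std::rc::Rc;

/// Toy type tree (hypothetical). Children are shared via `Rc`, so a type
/// with few allocations can still unfold into an exponentially large tree.
enum Ty {
    Unit,
    Product(Rc<Ty>, Rc<Ty>),
}

const MAX_DEPTH: usize = 64; // depth limit mentioned in the commit message
const MAX_NODES: usize = 10_000; // node budget mentioned in the commit message

/// Render a type, emitting `...` once the depth limit or node budget is hit.
fn render(ty: &Ty) -> String {
    let mut out = String::new();
    let mut budget = MAX_NODES;
    render_inner(ty, 0, &mut budget, &mut out);
    out
}

fn render_inner(ty: &Ty, depth: usize, budget: &mut usize, out: &mut String) {
    if depth >= MAX_DEPTH || *budget == 0 {
        out.push_str("...");
        return;
    }
    *budget -= 1; // every rendered node consumes one unit of the shared budget
    match ty {
        Ty::Unit => out.push('1'),
        Ty::Product(l, r) => {
            out.push('(');
            render_inner(l, depth + 1, budget, out);
            out.push_str(" * ");
            render_inner(r, depth + 1, budget, out);
            out.push(')');
        }
    }
}

/// Build a type of depth `n` whose unfolded tree has 2^(n+1) - 1 nodes,
/// using only `n + 1` allocations.
fn doubled(n: usize) -> Rc<Ty> {
    let mut t = Rc::new(Ty::Unit);
    for _ in 0..n {
        t = Rc::new(Ty::Product(Rc::clone(&t), Rc::clone(&t)));
    }
    t
}
```

With this sketch, `render(&doubled(40))` stays under the depth limit of 64 but would visit roughly 2^41 nodes without the budget; with the budget it returns promptly, and the output contains `...` where the cut-off happened.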

In the union-bound algorithm we track our unified type variables as sets, which are represented by trees of pointers, where the roots of the trees are used as set representatives for determining equality etc. When unifying variables, we move one's tree into the tree of the other, and then copy all its bounds to the root of the new unified tree. Most operations within union-bound are given an arbitrary set element and immediately locate its root before doing anything.

We obtain better asymptotics by using the "halving" variant of union-bound, wherein on each root lookup, as we follow the path from the element to its root, we update the links to skip every other one. Basically, if node x points to its parent, which points to a grandparent, we update x to point to the grandparent directly.

*However*, our existing code, rather than making x point to the grandparent, would just copy the grandparent over the parent. I'm not sure what I was thinking here -- I think I got tangled up trying to deal with Rust's multiple-aliasing rules without deadlocking and lost track of the actual algorithm, or maybe I misread Russell's C code which I was mostly copying from -- but the result is both weird and incorrect. In particular, if the grandparent is a root, then by copying its data into the parent, we duplicate the root, effectively un-unifying the variable at some hard-to-predict point. Once the un-unification happens, future unifications will appear to only get half-applied, the resulting type inference will be slightly wrong, and we will produce programs which fail type-checking or sharing-checking after being encoded, even though they were successfully constructed.

The fix, amusingly, actually reduces the amount of locking, which might provide a slight speedup, and will certainly make it easier to convince yourself that this code is deadlock-free. (Probably we should drop all these mutexes and just use raw pointers and unsafe code, which would be easier to follow; the mutexes serve no purpose other than to mollify the Rust typechecker; we assure type-safety by locking the entire slab, and locking individual set elements is neither necessary nor sufficient to get a thread-safe type inference algorithm.)

Thanks to Russell O'Connor for identifying this bug, and some thanks to Claude 3.7-sonnet for producing weird not-quite-sensible "fixes" that changed observed behavior on our bad test cases and pointed us to roughly the correct location.
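The difference between the intended halving step and the bug described above can be sketched in a toy model. This is not the actual code from the PR: it assumes an index-based slab instead of mutex-protected pointers, and `Node`, `Slab`, `chain`, `parent`, `root` and `root_buggy` are hypothetical names. `root_buggy` is only a reconstruction of the behavior as described in the commit message.

```rust
/// Toy set element (hypothetical). A root carries the bound for its set;
/// a child only stores the slab index of its parent.
#[derive(Clone, Debug, PartialEq)]
enum Node {
    Root(&'static str),
    Child(usize),
}

struct Slab(Vec<Node>);

impl Slab {
    /// A three-element chain 2 -> 1 -> 0, where element 0 is the root.
    fn chain() -> Slab {
        Slab(vec![Node::Root("bound"), Node::Child(0), Node::Child(1)])
    }

    /// Parent index of `x`, or `None` if `x` is a root.
    fn parent(&self, x: usize) -> Option<usize> {
        match &self.0[x] {
            Node::Child(p) => Some(*p),
            Node::Root(_) => None,
        }
    }

    /// Root lookup with path halving: when `x` has a grandparent, re-point
    /// `x` at it and continue from there, so every other link is skipped.
    fn root(&mut self, mut x: usize) -> usize {
        while let Some(p) = self.parent(x) {
            match self.parent(p) {
                Some(gp) => {
                    self.0[x] = Node::Child(gp);
                    x = gp;
                }
                None => x = p,
            }
        }
        x
    }

    /// Reconstruction of the bug: the grandparent's *data* is copied over
    /// the parent. If the grandparent is a root, the parent becomes a
    /// second, independent root for part of the set.
    fn root_buggy(&mut self, mut x: usize) -> usize {
        while let Some(p) = self.parent(x) {
            if let Some(gp) = self.parent(p) {
                self.0[p] = self.0[gp].clone(); // BUG: duplicates a root
            }
            x = p;
        }
        x
    }
}
```

On `Slab::chain()`, `root(2)` returns 0 and leaves element 2 pointing directly at 0. With `root_buggy(2)`, element 1 is overwritten with a copy of the root, so elements 2 and 0 now report different representatives, which matches the "un-unified" symptom in this toy model.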

You can rebase this PR to move the unit tests in front of the fix, and you will see that they both fail.

…code/decode roundtrip

This construction algorithm is a bit fragile and will often fail, but the fuzzer should be able to make some progress with it. In particular, it fails if it encounters any type inference error during construction. I used this to generate the unit tests in this PR.

apoelstra

On f821e22 successfully ran local tests

roconnor-blockstream

utACK

apoelstra · 2025-04-14T18:23:15Z

@uncomputable can you ack (or utack) this?

uncomputable

utACK f821e22

apoelstra force-pushed the 2025-04--invalid-encode branch from c61d31f to 0a3fe2a on April 14, 2025 15:04

apoelstra added 3 commits April 14, 2025 15:17

test: add unit tests which demonstrate problems fixed by the last commit

bd3fe4e

apoelstra force-pushed the 2025-04--invalid-encode branch from 0a3fe2a to f821e22 on April 14, 2025 15:17

roconnor-blockstream approved these changes Apr 14, 2025

uncomputable approved these changes Apr 14, 2025

uncomputable mentioned this pull request Apr 14, 2025

Use released simplicity-lang BlockstreamResearch/SimplicityHL#121

Closed

uncomputable merged commit 5c3ff26 into master Apr 14, 2025
35 checks passed

uncomputable deleted the 2025-04--invalid-encode branch April 14, 2025 19:49
