We've spent a ton of time over the last week looking at the internals of LazySeq and options for avoiding synchronization. The general guidance from Java is to replace synchronized with ReentrantLock (which has virtual thread coordination), but this advice leaves out the inherent tradeoffs in that change. synchronized relies on object monitors which are built into every Java object at the JVM level, whereas ReentrantLocks are additional Java objects (which hold a reference to an internal Sync object). Clojure makes a lot of lazy seqs and allocating two objects (plus adding an additional field to LazySeq) for every lazy seq is a real cost in allocation, heap size, and GC. Additionally, while ReentrantLock seems to be a bit faster than synchronized in Java 21, LazySeq makes one reentrant call, and reentrant calls seems to be noticeably slower than synchronized. There are lots of options though. We think it's relatively easy to make lazy seq walking faster, but a lot harder to keep realization costs under control (as making locks takes non-zero time). One interesting branch we have explored is making one lock per seq and passing it through the seq as we go - lots of tradeoffs in that.
0 commit comments