
Sylvan 2.0 #54

@trolando

Description


I want to get started on Sylvan 2.0 with an opportunity to redesign parts of the API.

Some currently planned action points:

  • Add more benchmarks, including benchmarks that would benefit from automatic variable reordering
  • Integrate Lace 2.0 (this would hopefully make the code look cleaner)
    🌟 Appears to be neutral wrt performance (more benchmarks are needed)
  • Get rid of the "sylvan_granularity" (it doesn't seem to have impact and it clutters the code)
    🌟 Appears to be neutral wrt performance (more benchmarks are needed)
  • Fix a long-standing race condition in the operation cache (affected modern Apple ARM chips)
    🌟 Appears to be neutral wrt performance (more benchmarks are needed)
  • Determine the system's cache line size automatically when configuring CMake
  • Set up GitHub Actions to be functional again
  • Include testing "make install" in GitHub Actions
  • Fix a strange bug in zdd_isop that causes the zdd_isop_random test to fail.
    🌟 Turns out it was just an array overflow bug in the test...
  • Make the C++ wrapper optional to build (via CMake setting)
  • Do something more useful with sylvan_config.h, maybe as CMake options?
  • Clean up the API, perform some renaming
    • Restructure header files, move public headers to an include directory
    • Renamed the header files; they don't all need to start with sylvan_
    • Revised the CMake scripts to ensure that building and installing work reliably
    • Rename llmsset to nodes to prepare for different ways to handle the nodes.
    • Rename "_t" structures to follow naming conventions more closely
    • Probably rename many of the sylvan_ operations to mtbdd_ and bdd_ operations?
    • Consistently use ldd and LDD instead of lddmc and MDD
    • Change all functions that return a DD to expect a result pointer as the first parameter. This is fairly important: it eliminates the complexity of preventing garbage-collection problems when running multiple Sylvan operations from different threads.
    • Functions like sylvan_makenode should be fully internal, since they are not safe to use from outside Lace threads.
    • Do not typically return the internal "level" of a DD node; when returning, for example, the variable of the DD root, return an ithvar (a DD with just that variable) instead.
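The result-pointer convention could look roughly like the following sketch; `bdd_and` and the `BDD` typedef here are illustrative stand-ins, not the actual Sylvan API:

```c
#include <stdint.h>

/* Hypothetical index-based DD handle; names are illustrative, not Sylvan's API. */
typedef uint64_t BDD;

/* Old style (for contrast): BDD bdd_and(BDD a, BDD b);
 * The returned handle is unprotected until the caller registers it, which is
 * racy when other threads may trigger garbage collection in the meantime.
 *
 * New style: the caller passes a result slot that the implementation can
 * treat as a root, so the result is protected from the moment it is written. */
static void bdd_and(BDD *result, BDD a, BDD b)
{
    /* toy placeholder: real code would compute the conjunction node */
    *result = a & b;
}
```

The caller-owned slot can be registered as a root before the operation runs, which is the point of the convention.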
  • Experiment with replacing the hash table by a structure that only looks at parents of nodes (similar to Ruddy)
    • Check what happens if the hash table doesn't rehash, but just follows the linear probe sequence...
    • One version with a simple linked list, using two arrays: "first" (first parent) and "next" (next parent). When a slot is empty, use CAS with a magic value to claim it, then create the actual node and release. A wait-free alternative could first create the actual node and then use CAS to update.
    • Another version with a trie based on the hash of the node, also with two arrays "first" and "next", but where "next" stores two values, "next0" and "next1". The same update mechanisms as in the first version apply.
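The claim-with-magic-value protocol from the first version could look like this minimal sketch, here applied to a toy linear-probing table (no rehashing, just following the probe sequence) rather than the actual first/next parent arrays; all names and sizes are illustrative:

```c
#include <stdatomic.h>
#include <stdint.h>

#define TABLE_SIZE 1024
#define EMPTY   ((uint64_t)0)
#define CLAIMED ((uint64_t)-1)   /* magic value: slot is under construction */

static _Atomic uint64_t table[TABLE_SIZE];

/* Node values 0 and ~0 are reserved here; a real table would avoid that.
 * Returns the slot index, or -1 if the table is full. */
static int insert_node(uint64_t node, uint64_t hash)
{
    for (uint64_t i = 0; i < TABLE_SIZE; i++) {
        uint64_t idx = (hash + i) % TABLE_SIZE;
        uint64_t cur = atomic_load(&table[idx]);
        if (cur == EMPTY) {
            uint64_t expected = EMPTY;
            if (atomic_compare_exchange_strong(&table[idx], &expected, CLAIMED)) {
                /* claimed: a real implementation would build the node data
                 * here, then publish ("release") it with the store below */
                atomic_store(&table[idx], node);
                return (int)idx;
            }
            cur = expected;  /* CAS failed; expected now holds the winner's value */
        }
        while (cur == CLAIMED)                  /* another thread is constructing */
            cur = atomic_load(&table[idx]);
        if (cur == node)
            return (int)idx;                    /* node already in the table */
        /* slot holds a different node: continue along the probe sequence */
    }
    return -1;
}
```

The wait-free alternative mentioned above would create the node first and CAS it in directly, at the cost of occasionally constructing a node that loses the race.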
  • Consider using just malloc and realloc rather than mmap, since they use mmap internally anyway for large allocations
    ⚠️ Only if we actually use realloc to grow. Otherwise Sylvan startup times are huge!!
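A realloc-based growth scheme could look like this sketch (the `node_table` type and sizes are illustrative, not Sylvan's actual table layout):

```c
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* Growable nodes array: start small and realloc to grow, instead of
 * mmap'ing the maximum table size up front (which hurts startup time). */
typedef struct {
    uint64_t *nodes;
    size_t    size;
} node_table;

static int table_init(node_table *t, size_t initial)
{
    t->nodes = calloc(initial, sizeof(uint64_t));
    t->size  = initial;
    return t->nodes != NULL;
}

/* With realloc the new size need not be a power of two, which fits the
 * grow-policy point later in this list. */
static int table_grow(node_table *t, size_t new_size)
{
    uint64_t *p = realloc(t->nodes, new_size * sizeof(uint64_t));
    if (p == NULL) return 0;
    /* zero the newly added tail; realloc leaves it uninitialized */
    memset(p + t->size, 0, (new_size - t->size) * sizeof(uint64_t));
    t->nodes = p;
    t->size  = new_size;
    return 1;
}
```

One caveat: realloc may move the allocation, which only works if everything refers to nodes by index rather than by pointer, as the next item requires.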
  • Ensure that there are never pointers into the nodes table, only indices, even inside internal methods.
  • Experiment with different versions of the operation cache, specifically an RCU cache instead of the seqlock, and thread-local operation caches. Daniel Basgöze implemented a version with its own cache garbage collection: it stores indices to the cache in an array (which adds one level of indirection, mitigated by storing part of the hash in each bucket) and uses a third array to record potentially freeable slots in the data array.
    • I'm not quite sure whether a different operation cache is better, whether there could be a better policy for when to resize it, or which metrics would show how effective the cache is
    • One option (probably bad) is thread-local operation caches, so no mutexes are needed
    • RCU version: each bucket only holds a pointer to the actual data (never cleared until GC, so it can be safely read until GC; maybe thread-local?). E.g., reserve a fixed amount of space for new cache entries and, once it is full, run GC and rebalance memory?
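The tag-in-bucket idea (storing part of the hash in each bucket to detect most misses cheaply) can be sketched with a toy direct-mapped operation cache. This is single-threaded and purely illustrative; the seqlock/RCU question above is precisely about how to make such a cache safe concurrently:

```c
#include <stdint.h>
#include <stdbool.h>

#define CACHE_SIZE 4096   /* illustrative size */

/* One bucket: the key is (opid, a, b); the stored tag keeps the top half
 * of the hash so most misses are detected without a full key comparison. */
typedef struct {
    uint32_t tag;
    uint32_t opid;
    uint64_t a, b;
    uint64_t result;
    bool     used;
} cache_entry;

static cache_entry cache[CACHE_SIZE];

static uint64_t cache_hash(uint32_t opid, uint64_t a, uint64_t b)
{
    /* simple multiplicative mix; a placeholder, not Sylvan's hash */
    uint64_t h = opid * 0x9E3779B97F4A7C15ULL;
    h ^= a * 0xC2B2AE3D27D4EB4FULL;
    h ^= b * 0x165667B19E3779F9ULL;
    return h;
}

static void cache_put(uint32_t opid, uint64_t a, uint64_t b, uint64_t result)
{
    uint64_t h = cache_hash(opid, a, b);
    cache_entry *e = &cache[h % CACHE_SIZE];
    e->tag = (uint32_t)(h >> 32);
    e->opid = opid; e->a = a; e->b = b;
    e->result = result;
    e->used = true;
}

static bool cache_get(uint32_t opid, uint64_t a, uint64_t b, uint64_t *result)
{
    uint64_t h = cache_hash(opid, a, b);
    cache_entry *e = &cache[h % CACHE_SIZE];
    if (!e->used || e->tag != (uint32_t)(h >> 32)) return false;
    if (e->opid != opid || e->a != a || e->b != b) return false;
    *result = e->result;
    return true;
}
```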
  • Consider different hash functions, although this is a low priority. One could record all the operations stored in the cache and compare histograms of different hash functions. CPU time is likely not very important, since memory accesses are the bottleneck, but the quality of the hashes might also not matter much for the operation cache.
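A histogram comparison along these lines could be as simple as hashing a recorded key stream, counting bucket occupancy, and comparing the maximum bucket load across candidate hash functions (a perfectly uniform hash keeps it close to n/buckets). A sketch, with a generic 64-bit mixer standing in for the candidate hash:

```c
#include <stdint.h>
#include <stddef.h>

#define BUCKETS 256

/* example hash under test: a standard xor-shift/multiply 64-bit mixer */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 33; x *= 0xFF51AFD7ED558CCDULL;
    x ^= x >> 33; x *= 0xC4CEB9FE1A85EC53ULL;
    x ^= x >> 33;
    return x;
}

/* build the bucket-occupancy histogram and return its maximum load */
static size_t max_bucket_load(const uint64_t *keys, size_t n)
{
    size_t hist[BUCKETS] = {0};
    size_t max = 0;
    for (size_t i = 0; i < n; i++) {
        size_t b = mix64(keys[i]) % BUCKETS;
        if (++hist[b] > max) max = hist[b];
    }
    return max;
}
```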
  • Consider different node layouts:
    • Maybe a 64-bit version, with a 22-bit index (4M nodes) and a 10-bit variable (1024 variables). This could be useful for cases where we just don't expect such large BDDs, or where we want to smoothly transition to bigger nodes when we reach the 4M mark.
    • Maybe a 256-bit version, with a 128-bit "core" BDD node plus 2×64 bits for an edge label (see if this can be compatible with Q-Sylvan)
    • Note: the current 128-bit version has a 40-bit index (1T nodes) and a 24-bit variable (16M variables).
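One possible packing for the 64-bit layout above: two 22-bit child indices plus a 10-bit variable fills 54 bits, leaving 10 bits spare for marks/flags. The exact field order here is an assumption, purely for illustration:

```c
#include <stdint.h>

#define IDX_BITS 22
#define VAR_BITS 10
#define IDX_MASK ((1u << IDX_BITS) - 1)
#define VAR_MASK ((1u << VAR_BITS) - 1)

/* layout (low to high bits): low child (22) | high child (22) | variable (10) */
static uint64_t node_pack(uint32_t var, uint32_t low, uint32_t high)
{
    return ((uint64_t)(var & VAR_MASK) << (2 * IDX_BITS))
         | ((uint64_t)(high & IDX_MASK) << IDX_BITS)
         |  (uint64_t)(low & IDX_MASK);
}

static uint32_t node_var(uint64_t n)  { return (uint32_t)(n >> (2 * IDX_BITS)) & VAR_MASK; }
static uint32_t node_high(uint64_t n) { return (uint32_t)(n >> IDX_BITS) & IDX_MASK; }
static uint32_t node_low(uint64_t n)  { return (uint32_t)n & IDX_MASK; }
```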
  • Reconsider how external references are done, i.e., the API for adding custom references, etc. Rewrite the mark function to cope with millions of roots (in some use cases)
  • Give a better way to configure "grow policies": as the table grows in size, at what point(s) do we actually want to do garbage collection? When do we increase the size of the operations cache, etc.?
    • In fact, if the new nodes implementation is successful (no significant performance loss, or even improved performance), then we don't need to grow the nodes table in powers of 2! It could be any arbitrary size. See the discussion below for a potential API.
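One way a configurable grow policy could be shaped is as a callback that maps the current size and fill level to the next desired size, which can be any size once power-of-two growth is no longer required. The names and the example thresholds here are illustrative assumptions, not a proposed final API:

```c
#include <stddef.h>

/* Hypothetical grow-policy hook: after a garbage-collection round, the
 * policy returns the new desired table size (any size, not necessarily
 * a power of two). Returning the current size means "GC sufficed". */
typedef size_t (*grow_policy_fn)(size_t current_size, size_t used);

/* Example policy: grow by 50% once the table is more than 75% full. */
static size_t grow_by_half(size_t current_size, size_t used)
{
    if (used * 4 > current_size * 3)
        return current_size + current_size / 2;
    return current_size;
}
```

A similar hook could decide when to resize the operations cache alongside the nodes table.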
  • Add fast algorithms for contains, subset, subseteq, etc.
  • Implement mtbdd_subsume
  • Implement mtbdd_satcount with GMP values
  • Improve support for ZDD (consider what functionality is missing but critical... I had a list somewhere...)
  • Integrate support for TBDD
    • Rewrite to use 40-bit indices, 14-bit variables to work with the new nodes implementation.
  • Consider putting ZDD, TBDD, LDD in separate unique tables / node arrays. This would be useful for implementing sifting, because you can't do sifting if there are also non-BDD nodes in the array!
  • Integrate / Reimplement dynamic variable ordering (sifting) and benchmark different tuning parameters
    • "Grouping" of variables that should always move together
    • "Barriers" to forbid moving variables beyond
  • Improve documentation and website
    • Set up Doxygen and automation
    • Maintainer's guide
    • Tutorial
    • Including how to "install" (and to a custom directory)
  • Better testing! Testing is very basic at the moment. Use the Unity framework, add a lot more unit tests just to ensure that things work.
  • Reconsider the design of sylvan_stats, maybe allow programmatic access rather than just dumping to screen?
  • In general, look at the CPU profile on benchmarks, what is dominating now?
  • Rewrite wrappers for C++, Python, Rust, Java.
  • Add an option to run single-threaded with a "fake" Lace?
    ⚠️ Probably not doing this. Even with a fake Lace, Sylvan has overhead anyway with atomic operations.
  • Consider some kind of support for sentential decision diagrams (http://reasoning.cs.ucla.edu/sdd/)
    ⚠️ Not for Sylvan 2.0, maybe Sylvan 2.1?
  • Add more timer counters for operations, e.g. the percentage of time spent in cache operations versus Sylvan operations proper
