Prototype SMT solver supporting Datalog as a monotonic theory.
The canonical version of this tool is in the Zenodo artifact for the POPL'23 paper "From SMT to ASP: Solver-Based Approaches to Solving Datalog Synthesis-as-Rule-Selection Problems" by Aaron Bembenek, Michael Greenberg, and Stephen Chong.
This repository includes a (modified) copy of the SMT solver CVC4, which has its own license (see CVC4-1.8/COPYING).
CVC4-1.8/: a version of the SMT solver CVC4 extended with our theory of Datalogdlog/: the Datalog source code that will be compiled and linked into the SMT solverexamples/: a couple example SMT-LIB programs using Datalog-based theoriesscripts/: various scripts for building different parts of the systembuild_cvc4.sh: build CVC4; needs to be run whenever you change the CVC4 source code or if you build a new Datalog librarybuild_dlog.sh: compile the Datalog code into a librarybuild_monosat.sh: build the MonoSAT system (to compare against)
benchmarks/: material for evaluating the correctness and efficiency of our system, and comparing it to other systemsreach/: scripts for benchmarking graph reachabilityworlds/: scripts for benchmarking galaxy generationsynthesis/: scripts for benchmarking Datalog synthesis
The project is set-up to run in a Docker container; we have some docker-compose configurations for convenience. If you use the container, the only dependencies should be Docker and docker-compose.
I'd recommend booting up the container like this:
docker-compose run eval
Using this command, the CVC4-1.8/, dlog/, scripts/, and tests/
directories will be mounted into the container, so that you can edit files in
these directories locally (outside the container). In what follows, I will
assume that you have booted up the container this way; all commands should be
issued from inside the home directory of the container.
To build our modified version of CVC4, we first need to compile our Datalog code, and then we can build the SMT solver itself:
scripts/build_dlog.sh && scripts/build_cvc4.sh
If you want to compare against MonoSAT, you can build it with this command:
scripts/build_monosat.sh
We have extended SMT-LIB with a new predicate dlog, which is in the form:
(dlog n relname i_1 ... i_m b_1 ... b_n)
where:
nis a natural number specifying the number of boolean parametersrelnameis a string specifing the name of the Datalog relation- the
i_jare integer arguments to the Datalog relation - the
b_kare boolean parameters
So, assuming we have a Datalog theory defining graph reachability (which we
do), we can say that node 1 is reachable from node 0 under the edge
assignments e0_0, e0_1, e1_0, and e1_1:
(set-logic ALL)
(declare-fun e0_0 () Bool)
(declare-fun e0_1 () Bool)
(declare-fun e1_0 () Bool)
(declare-fun e1_1 () Bool)
(assert (dlog 4 "reach" 0 1 e0_0 e0_1 e1_0 e1_1))
(check-sat)
If you saved this program in foo.smt2, you can run it by using the command:
cvc4 foo.smt2
The Datalog predicate is also available through the Python API as
pycvc4.kinds.DatalogPred; see the function test_datalog in
tests/reach/bm.py for example usage.
We currently support the following options:
datalog-why-provenance: use "why" provenance in conflict justifications (default=true)datalog-why-not-provenance: use "why not" provenance in conflict justifications (default=true)datalog-parallelism=N: useNthreads in the Datalog solver (default=1)datalog-buffer-size=N: wait forNboolean parameters to become assigned between theory checks (default=5)datalog-theory-propagation: propagate Datalog theory atoms (default=true)
Any boolean option <x> also has an opposite option no-<x>; for example, we
can run the command:
cvc4 foo.smt2 --no-datalog-why-not-provenance --datalog-parallelism=4
- Predicate:
reach/2- Denotes that there is a path in the graph from the vertex identified by the first argument to the vertex identified by the second one
- If a graph has
nvertices, they are numbered from0ton-1 - Parameter arity: equal to the square of the number of vertices in the graph
- Each boolean parameter represents the presence of an edge: if there are
nvertices in the graph, then there is a correspondence between the edge fromitojand thei * n + jth (0-indexed) parameter
This theory is used to generate a map for a hypothetical spaced-based game,
where there are a bunch of worlds distributed across a pool of solar systems,
with each world varying in its benevolence, its level of technology, and its
level of resources. There is a set of rules that determine whether two worlds
are in contact with each other and, if so, whether they trade with each other
and whether they are at war with each other. There are 100 candidate worlds; a
choice of worlds can be constrained using the following predicates. For all
predicates, parameter i corresponds to world i.
- Predicate:
Worlds.Trades/1- True if the world corresponding to the argument trades with some world
- Predicate:
Worlds.AtWar/1- True if the world corresponding to the argument is at war with some world
- Predicate:
Worlds.Trade/0- True if there is some world that trades
- Predicate:
Worlds.War/0- True if there is some world that is at war
- Predicate:
Worlds.Smuggling/0- True if there is some world that trades with a world it is at war with
- Predicate:
Worlds.IntraPlanetaryStrife/0- True if there is some world that is at war with itself
Note: Because of a bug in Souffle's provenance mechanism, the nullary
predicates listed above actually need to be unary predicates; the argument
should always be 0.
First, you need to write the program and store it in the dlog/souffle/
directory. Nonnegative integers are the only type of term we allow. You then
need to add the program to dlog/souffle/CMakeLists.txt. See
dlog/souffle/reach.dl for an example.
Now that you have the compiled Datalog code, you need to add some C++ glue code
to the CVC4 source code that tells our "theory of Datalog" implementation how
to interact with the new Datalog program. First, register the program in
cvc4/src/theory/datalog/souffle_solver.cpp (using the name from
dlog/souffle/CMakeLists.txt).
Then you need to implement a class representing the Datalog program and classes
for each output relation in the program. See the existing implementations in
cvc4/src/theory/datalog/relations/ for guidance. Once you've done this,
register the output relations in the function DatalogRelation::initialize()
in cvc4/src/theory/datalog/datalog_relation.cpp.
At this point, rebuild CVC4 (scripts/build_cvc4.sh). Fingers crossed, your
new theory should work!