This repository contains the code and the data used for the experiments in the paper "Rule Induction in Knowledge Graphs Using Linear Programming" by Sanjeeb Dash and Joao Goncalves, AAAI-23.
- The code was tested only on Linux.
- The code is written in C++ and requires a C++ compiler.
- The code uses the commercial solver IBM ILOG CPLEX.
- `code`: contains the code and the Makefile.
- `data`: contains the 5 datasets used in the paper.
- `runs`: contains the parameter files and scripts to run the 5 datasets.
- Install IBM ILOG CPLEX.
- Edit the file `Makefile` in the directory `code` and add the paths to the directories `cplex` and `concert` in the lines:

      CPLEXDIR = path_to_cplex/cplex
      CONCERTDIR = path_to_cplex/concert

- In the directory `code`, type `make`.
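
The Makefile edit described above can also be scripted. The following is a minimal sketch and not part of the repository: the helper name `set_cplex_paths` is hypothetical, and it assumes the Makefile contains lines that start with `CPLEXDIR =` and `CONCERTDIR =`, as in the two lines quoted above.

```shell
# set_cplex_paths MAKEFILE CPLEX_ROOT
# Hypothetical helper: rewrites the CPLEXDIR and CONCERTDIR lines of MAKEFILE
# so that they point to CPLEX_ROOT/cplex and CPLEX_ROOT/concert.
# Assumes CPLEX_ROOT contains no '|' or '&' characters (they are special to sed).
set_cplex_paths() {
  makefile=$1
  root=$2
  tmp=$(mktemp) || return 1
  if sed -e "s|^CPLEXDIR *=.*|CPLEXDIR = ${root}/cplex|" \
         -e "s|^CONCERTDIR *=.*|CONCERTDIR = ${root}/concert|" \
         "$makefile" > "$tmp"; then
    # Write back with cat so the Makefile keeps its original permissions.
    cat "$tmp" > "$makefile"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}

# Example usage (the install root below is a placeholder, not a known path):
#   set_cplex_paths code/Makefile /path/to/your/cplex_install
#   (cd code && make)
```

Editing the file by hand, as described in the steps above, is equivalent; the helper may be convenient only when the build is repeated on several machines.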
The following instructions show how to run LPRules with the dataset UMLS.
- Go to the directory `run/UMLS`.
- Execute the command:

      ./run_lprules_in_parallel.sh p_UMLS.txt outUMLS 46

  where `p_UMLS.txt` is the parameter file residing in the directory `run/UMLS`, `outUMLS` is the name to be used in output files, and 46 is the number of relations in the UMLS dataset.
- The results are presented at the end of the file `results_outUMLS.txt`.
The number of relations in each dataset is: UMLS 46, Kinship 25, WN18RR 11, FB15k-237 237, YAGO3-10 37.
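
Since the last argument of the run script is the number of relations, a small lookup can help avoid passing the wrong value. The sketch below is illustrative and not part of the repository; the function name `num_relations` is hypothetical, and the counts are the ones listed above.

```shell
# num_relations DATASET
# Hypothetical helper: prints the number of relations for a dataset,
# using the counts listed in this README.
num_relations() {
  case "$1" in
    UMLS)      echo 46 ;;
    Kinship)   echo 25 ;;
    WN18RR)    echo 11 ;;
    FB15k-237) echo 237 ;;
    YAGO3-10)  echo 37 ;;
    *) echo "unknown dataset: $1" >&2; return 1 ;;
  esac
}

# Example usage for UMLS (run from the directory run/UMLS):
#   ./run_lprules_in_parallel.sh p_UMLS.txt outUMLS "$(num_relations UMLS)"
```

The parameter file and output name are still passed explicitly, because this README documents them only for UMLS.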
The following instructions show how to run LPRules with the dataset UMLS, assuming the existence of a set of rules in the file `input_rules.txt`.
- Go to the directory `run/UMLS`.
- Edit the file `p_UMLS.txt` and set the parameter `run_mode` to 1, 2, or 3. The meaning of each value is:
  - 1: scenario B in the paper
  - 2: scenario C in the paper
  - 3: scenario D in the paper
- Execute the command:

      ./run_lprules_in_parallel_read_rules.sh p_UMLS.txt outUMLS 46 input_rules.txt

  where `p_UMLS.txt` is the parameter file residing in the directory `run/UMLS`, `outUMLS` is the name to be used in output files, 46 is the number of relations in the UMLS dataset, and `input_rules.txt` is the file containing the rules.
- The results are presented at the end of the file `results_outUMLS.txt`.
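
The mapping from `run_mode` values to the scenarios in the paper can be written as a small check to run before launching. This is a minimal sketch and not part of the repository; the function name `scenario_for_run_mode` is hypothetical, and it does not read the parameter file, whose format is not described here.

```shell
# scenario_for_run_mode RUN_MODE
# Hypothetical helper: prints the scenario letter from the paper that
# corresponds to a run_mode value, and fails for any other value.
scenario_for_run_mode() {
  case "$1" in
    1) echo B ;;
    2) echo C ;;
    3) echo D ;;
    *) echo "run_mode must be 1, 2, or 3 (got: $1)" >&2; return 1 ;;
  esac
}

# Example usage:
#   scenario_for_run_mode 2   # prints C
```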