- Java JDK 1.8
- Scala 2.12.10
- Maven 3.8.6
- Python version >= 3.9
- Python package requirements: docopt, requests, flask, openpyxl
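Before launching anything, it can help to confirm that the Python side of these requirements is met. The following is a minimal sketch and not part of the repository: the helper names `python_ok` and `missing_packages` are hypothetical, and the check only verifies that each package can be located for import, not which version is installed.

```python
import importlib.util
import sys

# Package names taken from the requirements list above.
REQUIRED_PACKAGES = ["docopt", "requests", "flask", "openpyxl"]
MIN_PYTHON = (3, 9)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level package names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python >= 3.9 is required, found", sys.version.split()[0])
    for name in missing_packages():
        print("Missing package:", name)
```

The Java, Scala, and Maven versions are not covered by this sketch and still need to be checked separately.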
- Preprocessing [optional]. To generate new statistics (`cost.csv`), we provide DuckDB versions of the scripts `query/preprocess.sh` and `query/gen_cost.sh`. Modify the configurations in them, then run them. For web-ui, move the generated statistics files to the folders `graph/q1a/`, `tpch/q2/`, `lsqb/q1/`, `job/1a/`, and `custom/q1/` respectively; for command-line operations, move them to the corresponding query folders.
- We provide two execution modes. The default mode is web-ui execution. To switch modes, modify the value of `EXEC_MODE` at Line 755 in `main.py`.
- Execute main.py to launch the Python backend rewriter component.
$ python main.py
- Execute the Java backend parser component, which is included as a submodule at `SparkSQLPlus/*`. Use the following commands to initialize and update it.
$ git submodule init
$ git submodule update [--remote]
or
$ git submodule update --init --recursive
- Open the webpage at http://localhost:8848.
- Begin submitting queries for execution on the webpage.
- Modify the Python path (`PYTHON_ENV`) in `auto_rewrite.sh`.
- Execute the following commands to get the rewritten queries. The rewrite time is shown in `rewrite_time.txt`.
- OPTIONS
  - Mode: Set the code generation mode, D (DuckDB) or M (MySQL) [default: D]
  - Yannakakis/Yannakakis-Plus: Set Y for Yannakakis, N for Yannakakis-Plus [default: N]
$ bash start_parser.sh
$ Parser started.
$ ./auto_rewrite.sh ${DDL_NAME} ${QUERY_DIR} [OPTIONS]
e.g. ./auto_rewrite.sh lsqb lsqb M N
- Modify the configurations in `query/load_XXX.sql` (load table schemas) and `query/auto_run_XXX.sh` (auto-run script for different DBMSs).
- Execute the following command to run the queries in different DBMSs.
$ ./auto_run_XXX.sh [OPTIONS]
- To run a single query, edit the code marked `# NOTE: single query keeps here` in the function `init_global_vars` (Line 575 - Line 577 in `main.py`), and comment out the code block labeled `# NOTE: auto-rewrite keeps here` (the code between the two blank lines, Line 598 - Line 617 in `main.py`).
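The `auto_rewrite.sh` invocation in the command-line steps takes two positional arguments followed by the two options. As a sketch of how those arguments fit together, the hypothetical helpers below (not part of the repository) assemble the argument list and reject option values other than those listed under OPTIONS. The positional order of the options is inferred from the example `./auto_rewrite.sh lsqb lsqb M N`.

```python
import subprocess

VALID_MODES = {"D", "M"}       # D = DuckDB, M = MySQL
VALID_ALGORITHMS = {"Y", "N"}  # Y = Yannakakis, N = Yannakakis-Plus


def build_rewrite_command(ddl_name, query_dir, mode="D", yannakakis="N"):
    """Return the argument list for auto_rewrite.sh.

    Defaults mirror the documented defaults (D and N). The option order
    is an assumption based on the example in the steps above.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if yannakakis not in VALID_ALGORITHMS:
        raise ValueError(
            f"yannakakis must be one of {sorted(VALID_ALGORITHMS)}, got {yannakakis!r}"
        )
    return ["./auto_rewrite.sh", ddl_name, query_dir, mode, yannakakis]


def run_rewrite(ddl_name, query_dir, mode="D", yannakakis="N"):
    """Run the script from the repository root.

    Assumes the parser has already been started with start_parser.sh.
    """
    command = build_rewrite_command(ddl_name, query_dir, mode, yannakakis)
    return subprocess.run(command, check=True)
```

Validating the options before calling the script keeps a mistyped mode from reaching the rewriter.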
- Web-based Interface
- Java Parser Backend
- Python Optimizer & Rewriter Backend
- `./query/[graph|lsqb|tpch|job]`: plans for different DBMSs
- `./query/*.sh`: auto-run scripts
- `./query/*.sql`: load data scripts
- `./query/[src|Schema]`: files for auto-run SparkSQL
- `./*.py`: code for rewriter and optimizer
- `./sparksql-plus-web-jar-with-dependencies.jar`: parser jar file
- For queries like `SELECT DISTINCT ...`, remove the `DISTINCT` keyword before parsing.
- Use the `jps` command to get the pid of the parser, which is listed under the name `jar`, and then kill it.
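For the note about `SELECT DISTINCT`, a naive way to drop the keyword before handing a query to the parser is a regular-expression substitution. This is only a sketch: `strip_select_distinct` is a hypothetical helper, it handles only a `DISTINCT` that directly follows `SELECT`, and it does not recognize string literals or comments, so a match inside quoted text would also be rewritten.

```python
import re

# Matches SELECT followed by whitespace and the whole word DISTINCT,
# case-insensitively. SELECT is captured so its original casing is kept.
_SELECT_DISTINCT = re.compile(r"\b(SELECT)\s+DISTINCT\b", re.IGNORECASE)


def strip_select_distinct(sql):
    """Remove DISTINCT where it immediately follows SELECT."""
    return _SELECT_DISTINCT.sub(r"\1", sql)
```

Occurrences such as `COUNT(DISTINCT x)` are left untouched by this pattern, since the note above only concerns `SELECT DISTINCT`.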