hkustDB/Quorion

Query running too slow? Rewrite it with Quorion!

Requirements

  • Java JDK 1.8
  • Scala 2.12.10
  • Maven 3.8.6
  • Python version >= 3.9
  • Python package requirements: docopt, requests, flask, openpyxl

Steps

  1. Preprocessing (optional). To generate new statistics (cost.csv), we provide DuckDB-based scripts, query/preprocess.sh and query/gen_cost.sh. Modify the configurations in them, then run them. For the web UI, move the generated statistics files to the folders graph/q1a/, tpch/q2/, lsqb/q1/, job/1a/, and custom/q1/ respectively; for command-line operation, move them to the corresponding query folders.
  2. We provide two execution modes: web UI (the default) and command line. To switch modes, modify the value of EXEC_MODE at Line 755 in main.py.
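The file placement described in the preprocessing step can be scripted. The following is a minimal sketch, not part of Quorion: the helper name copy_statistics and the root parameter are assumptions made for illustration, and the folder list is copied from the step above. It copies rather than moves the file, so the generated original is kept.

```python
import shutil
from pathlib import Path

# Folders named in the preprocessing step for the web UI mode.
WEB_UI_QUERY_DIRS = ["graph/q1a", "tpch/q2", "lsqb/q1", "job/1a", "custom/q1"]


def copy_statistics(stats_file, query_dir, root="."):
    """Copy one generated statistics file (e.g. cost.csv) into a query folder.

    `root` is a hypothetical parameter: the directory that the query folders
    are relative to. Adjust it to match your checkout.
    """
    src = Path(stats_file)
    if not src.is_file():
        raise FileNotFoundError(f"statistics file not found: {src}")
    dest_dir = Path(root) / query_dir
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return Path(shutil.copy2(src, dest_dir / src.name))
```

For example, copy_statistics("cost.csv", "lsqb/q1") places the LSQB statistics where the web UI expects them. Each benchmark has its own statistics file, so call the helper once per file and folder pair.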

Web-UI

  1. Execute main.py to launch the Python backend rewriter component.
$ python main.py
  2. Execute the Java backend parser component, which is included as a submodule at SparkSQLPlus/*. Use one of the following commands to initialize and update it.
$ git submodule init
$ git submodule update [--remote]
    or
$ git submodule update --init --recursive
  3. Open the webpage at http://localhost:8848.
  4. Begin submitting queries for execution on the webpage.
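Before opening the browser, it can help to confirm that something is listening on the web UI port. The sketch below uses only the Python standard library; the function name is_listening is hypothetical, and the default port 8848 is taken from the URL above. It checks only that a TCP connection succeeds, not that the responding service is Quorion.

```python
import socket


def is_listening(host="localhost", port=8848, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

A False result after starting main.py usually means the backend has not finished starting or failed to bind the port.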

Command Line

  1. Modify the Python path (PYTHON_ENV) in auto_rewrite.sh.
  2. Execute the following commands to get the rewritten queries. The rewrite time is recorded in rewrite_time.txt.
  3. OPTIONS
  • Mode: code generation target, D (DuckDB) or M (MySQL) [default: D]
  • Yannakakis/Yannakakis-Plus: Y for Yannakakis, N for Yannakakis-Plus [default: N]
$ bash start_parser.sh
Parser started.
$ ./auto_rewrite.sh ${DDL_NAME} ${QUERY_DIR} [OPTIONS]
e.g. ./auto_rewrite.sh lsqb lsqb M N
  4. Modify the configurations in query/load_XXX.sql (loads table schemas) and query/auto_run_XXX.sh (auto-run script for different DBMSs).
  5. Execute the following command to run the queries in different DBMSs.
$ ./auto_run_XXX.sh [OPTIONS]
  6. To run a single query, edit the code marked # NOTE: single query keeps here in function init_global_vars (Line 575 - Line 577 in main.py), and comment out the code block labeled # NOTE: auto-rewrite keeps here (the code between the two blank lines, Line 598 - Line 617 in main.py).
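When driving auto_rewrite.sh from another script, validating the options first avoids a failed run caused by a mistyped flag. This is a minimal sketch under stated assumptions: the helper build_rewrite_command is hypothetical, and the positional order (mode first, then the Yannakakis flag) is inferred from the example ./auto_rewrite.sh lsqb lsqb M N rather than from the script itself.

```python
VALID_MODES = {"D", "M"}        # D = DuckDB, M = MySQL
VALID_YANNAKAKIS = {"Y", "N"}   # Y = Yannakakis, N = Yannakakis-Plus


def build_rewrite_command(ddl_name, query_dir, mode="D", yannakakis="N"):
    """Build the argument list for auto_rewrite.sh after checking the options.

    Argument order is an assumption based on the README example.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if yannakakis not in VALID_YANNAKAKIS:
        raise ValueError(
            f"yannakakis must be one of {sorted(VALID_YANNAKAKIS)}, got {yannakakis!r}"
        )
    return ["./auto_rewrite.sh", ddl_name, query_dir, mode, yannakakis]
```

The returned list can be passed to subprocess.run(cmd, check=True) once the parser has been started with start_parser.sh.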

Structure

Overview

  • Web-based Interface
  • Java Parser Backend
  • Python Optimizer & Rewriter Backend

Files

  • ./query/[graph|lsqb|tpch|job]: plans for different DBMSs
  • ./query/*.sh: auto-run scripts
  • ./query/*.sql: load data scripts
  • ./query/[src|Schema]: files for auto-run SparkSQL
  • ./*.py: code for rewriter and optimizer
  • ./sparksql-plus-web-jar-with-dependencies.jar: parser jar file

Demonstration

Step 1

[Screenshot: Step 1]

Step 2

[Screenshot: Step 2]

Step 3

[Screenshot: Step 3]

Step 4

[Screenshot: Step 4]

NOTE

  • For queries of the form SELECT DISTINCT ..., remove the DISTINCT keyword before parsing.
  • To stop the parser, use the jps command to find the pid of the process named jar, then kill it.
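Both notes can be handled with small helpers. The sketch below is illustrative only and not part of Quorion: the function names are hypothetical, the DISTINCT handling is deliberately naive (it touches only a DISTINCT that directly follows the leading SELECT, not subqueries or aggregates such as COUNT(DISTINCT ...)), and the jps parsing assumes the usual one-line-per-process output of a pid followed by a name.

```python
import re


def strip_leading_distinct(sql):
    """Remove DISTINCT when it directly follows the leading SELECT keyword."""
    return re.sub(
        r"^(\s*SELECT)\s+DISTINCT\b", r"\1", sql, count=1, flags=re.IGNORECASE
    )


def find_parser_pids(jps_output, name="jar"):
    """Return the pids from `jps` output whose process name equals `name`."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split(None, 1)  # split into "<pid>" and "<name>"
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip() == name:
            pids.append(int(parts[0]))
    return pids
```

The text passed to find_parser_pids would typically come from running jps and capturing its standard output; each returned pid can then be stopped with kill.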
