Baseline Experiments. #345

Closed
51 commits
d971c15
Enhance provisioning.
amlatyrngom Jul 14, 2023
bf4ba31
Enhance Provisioning.
amlatyrngom Jul 21, 2023
a1a9603
Add TIDB Setup and Connection
amlatyrngom Jul 31, 2023
01993b7
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom Jul 31, 2023
db6b608
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom Aug 29, 2023
c9bc6c7
TiDB Benchmarking
amlatyrngom Sep 1, 2023
bc5a137
Fix linting.
amlatyrngom Sep 1, 2023
26a1e9f
Fix tests.
amlatyrngom Sep 1, 2023
30a3eb5
Fixes
amlatyrngom Sep 1, 2023
9f0b3ab
Setting up baseline
amlatyrngom Oct 5, 2023
617f421
Baseline txns
amlatyrngom Oct 5, 2023
4d1314e
Fixing merge conflicts
amlatyrngom Oct 5, 2023
d7ad9d2
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom Oct 5, 2023
9ca64ab
Fix seq num problem
amlatyrngom Oct 5, 2023
47c2258
Adding downscale expts
amlatyrngom Oct 12, 2023
4f39e8b
Baseline expts
amlatyrngom Nov 2, 2023
2470049
Reconcile baseline code.
amlatyrngom Nov 2, 2023
b5e4054
Manual scale down expt
amlatyrngom Nov 9, 2023
868417f
Add tidb and daylong
amlatyrngom Nov 13, 2023
eb49c1e
Add sine wave to txns.
amlatyrngom Nov 14, 2023
e98c2d3
Linting fixes
amlatyrngom Nov 14, 2023
d338d8e
Modifying expts
amlatyrngom Nov 21, 2023
75c8d64
Baseline expts
amlatyrngom Nov 23, 2023
23ad0dd
Add ad hoc queries and scale up.
amlatyrngom Nov 27, 2023
d0dd0e9
Downscale
amlatyrngom Nov 27, 2023
3e4d9de
Fix ad hoc path
amlatyrngom Nov 27, 2023
377d367
Add specialized expts
amlatyrngom Dec 3, 2023
b7f44a2
Update specialized workload.
amlatyrngom Dec 8, 2023
ad44ee2
Checking in scale up expts.
amlatyrngom Dec 13, 2023
7044c4f
Starting daylong.
amlatyrngom Dec 27, 2023
d55904e
Starting daylong.
amlatyrngom Dec 27, 2023
1392296
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom Jan 22, 2024
5f20b33
Fix merge issues
amlatyrngom Jan 22, 2024
c59ba05
Update expts
amlatyrngom Jan 22, 2024
92a2fe0
Update expts
amlatyrngom Jan 22, 2024
d4c3856
Update expts
amlatyrngom Jan 22, 2024
a8d31b5
Update expts
amlatyrngom Jan 22, 2024
f0b73ca
Fix expts
amlatyrngom Jan 22, 2024
619efb0
Fix expts
amlatyrngom Jan 22, 2024
7e088d8
Cause redshift scaleup
amlatyrngom Jan 22, 2024
07f634d
Change issue slots
amlatyrngom Jan 25, 2024
fb47d92
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom Jan 26, 2024
3ecc61f
Add new trace
amlatyrngom Jan 27, 2024
225a6dc
Add new trace
amlatyrngom Jan 30, 2024
9d68ec6
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom Apr 26, 2024
4ca9d6e
Add slo change expt.
amlatyrngom Apr 26, 2024
6891b5f
New scale down
amlatyrngom May 6, 2024
2948b5e
Adding chbench
amlatyrngom May 13, 2024
0792ddd
Merge remote-tracking branch 'origin/main' into aln-provisioning
amlatyrngom May 13, 2024
97765f0
Running chbench
amlatyrngom May 14, 2024
7433714
Fix tidb chbench
amlatyrngom May 14, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -14,6 +14,9 @@ brad.egg-info
config/config.yml
config/config_local.yml
config/manifests/manifest.yml
config/tidb.yml
config/baseline.yml
config/baseline-*.yml
config/temp_config.yml
query_logs/
config/physical_config*.yml
25 changes: 25 additions & 0 deletions config/baseline.sample.yml
@@ -0,0 +1,25 @@
s3_bucket: brad-personal-data
bucket_region: us-east-1
redshift:
  host: fillme
  user: fillme
  password: fillme
  database: fillme
  port: fillme
  iam: fillme
aurora:
  host: fillme
  user: fillme
  password: fillme
  database: fillme
  port: fillme
  access_key: fillme
  secret_key: fillme
tidb:
  host: fillme
  user: fillme
  password: fillme
  port: fillme
  public_key: fillme
  private_key: fillme

2 changes: 1 addition & 1 deletion config/schemas/imdb_extended.yml
@@ -17,7 +17,7 @@ tables:
data_type: SERIAL
primary_key: true
- name: name
- data_type: TEXT
+ data_type: VARCHAR(256)
- name: location_x
data_type: DECIMAL(10)
- name: location_y
6 changes: 6 additions & 0 deletions config/tidb.sample.yml
@@ -0,0 +1,6 @@
host: fillme
user: fillme
password: fillme
port: 4000
public_key: fillme # TIDB Cloud Public Key
private_key: fillme # TIDB Cloud Private Key
4 changes: 4 additions & 0 deletions experiments/15-e2e-scenarios-v2-baselines/common.cond
@@ -0,0 +1,4 @@
# Relative to individual experiment scenario directories.
IMDB_20GB_REGULAR_QUERY_BANK = "workloads/IMDB_20GB/regular_test/queries.sql"
IMDB_100GB_REGULAR_QUERY_BANK = "workloads/IMDB_100GB/regular_test/queries.sql"
IMDB_100GB_REGULAR_QUERY_FREQS = "workloads/IMDB_100GB/regular_test/query_frequency.npy"
232 changes: 232 additions & 0 deletions experiments/15-e2e-scenarios-v2-baselines/common.sh
@@ -0,0 +1,232 @@
function cancel_experiment() {
  for pid_var in "$@"; do
    kill -INT $pid_var
  done
}

function graceful_shutdown() {
  for pid_var in "$@"; do
    kill -INT $pid_var
  done
  for pid_var in "$@"; do
    wait $pid_var
  done
}

function log_workload_point() {
  msg=$1
  now=$(date "+%Y-%m-%d %H:%M:%S")
  msg="$now,$msg"
  echo "$msg" >> $EXPT_OUT/points.log
  echo "$msg"
}
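
`log_workload_point` appends one `<timestamp>,<label>` line to `$EXPT_OUT/points.log` per call. A minimal sketch of reading those labels back is shown here. It copies the helper so that it runs on its own, and it assumes a temporary directory in place of the real `EXPT_OUT`; the label `timeout_poll_scale_up` is a hypothetical event name chosen for illustration.

```shell
#!/usr/bin/env bash
# Assumption: EXPT_OUT is normally provided by the experiment driver;
# a temporary directory stands in for it here.
EXPT_OUT="$(mktemp -d)"

# Copy of the helper from common.sh so the sketch is self-contained.
log_workload_point() {
  local msg=$1
  local now
  now=$(date "+%Y-%m-%d %H:%M:%S")
  msg="$now,$msg"
  echo "$msg" >> "$EXPT_OUT/points.log"
  echo "$msg"
}

log_workload_point "txn_4" > /dev/null
log_workload_point "timeout_poll_scale_up" > /dev/null

# Each line is "<timestamp>,<label>"; the timestamp contains no comma,
# so the label is the second comma-separated field.
labels=$(cut -d, -f2 "$EXPT_OUT/points.log")
echo "$labels"
```

Labels that themselves contain commas would need a different split, which this sketch does not handle.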

function poll_file_for_event() {
  local file="$1"
  local event_name="$2"
  local timeout_minutes="$3"
  local previous_size=$(stat -c %s "$file")
  local current_size
  local last_line

  local start_time
  local elapsed_time
  start_time=$(date +%s)

  while true; do
    current_size=$(stat -c %s "$file")

    if [[ $current_size -ne $previous_size ]]; then
      last_line=$(tail -n 1 "$file")

      if [[ $last_line == *"$event_name"* ]]; then
        >&2 echo "Detected new $event_name!"
        break
      fi
    fi

    elapsed_time=$(( $(date +%s) - $start_time ))
    if [[ $elapsed_time -ge $((timeout_minutes * 60)) ]]; then
      >&2 echo "Timeout reached. Did not detect $event_name within $timeout_minutes minutes."
      log_workload_point "timeout_poll_${event_name}"
      break
    fi

    sleep 30
  done
}

function start_repeating_olap_runner() {
  local ra_clients=$1
  local ra_gap_s=$2
  local ra_gap_std_s=$3
  local query_indexes=$4
  local results_name=$5

  results_dir=$EXPT_OUT/$results_name
  mkdir -p $results_dir

  # If result name contains "vec", use $VECTOR_ENGINE, otherwise use $ANALYTICS_ENGINE
  if [[ $results_name == *"vec"* ]]; then
    engine=$VECTOR_ENGINE
    bank_file=$vector_query_bank_file
  else
    engine=$ANALYTICS_ENGINE
    bank_file=$ra_query_bank_file
  fi


  local args=(
    --num-clients $ra_clients
    --query-indexes "$query_indexes"
    --query-bank-file $bank_file
    --avg-gap-s $ra_gap_s
    --avg-gap-std-s $ra_gap_std_s
    --baseline $engine
    --output-dir $results_dir
  )

  if [[ ! -z $ra_query_frequency_path ]]; then
    args+=(--query-frequency-path $ra_query_frequency_path)
  fi

  >&2 echo "[Repeating Analytics] Running with $ra_clients..."

  log_workload_point $results_name
  python3 workloads/IMDB_extended/run_repeating_analytics.py "${args[@]}" &

  # This is a special return value variable that we use.
  runner_pid=$!
}
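
The runner functions hand the background PID back through the global `runner_pid` rather than printing it. A likely reason is that capturing output with `$(...)` would run the function in a subshell, so the caller could not `wait` on the job. The sketch below illustrates the convention with a hypothetical `start_worker` that uses `sleep` in place of a Python runner; it is not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a workload runner: sleeps briefly in the background.
start_worker() {
  sleep 0.2 &
  # Same convention as common.sh: publish the PID through a global variable.
  runner_pid=$!
}

start_worker
first_pid=$runner_pid    # copy before the next call overwrites runner_pid
start_worker
second_pid=$runner_pid

# Both jobs are children of this shell, so wait can reap them.
wait "$first_pid" "$second_pid"
wait_status=$?
echo "workers $first_pid and $second_pid exited with status $wait_status"
```

Because every call overwrites `runner_pid`, a caller that starts several runners has to copy the value into its own variable after each call, as done with `first_pid` and `second_pid` here.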

function start_seq_olap_runner() {
  local num_clients=$1
  local gap_s=$2
  local gap_std_s=$3
  local results_name=$4

  results_dir=$EXPT_OUT/$results_name
  mkdir -p $results_dir

  local args=(
    --num-clients $num_clients
    --query-sequence-file $seq_query_bank_file
    --avg-gap-s $gap_s
    --avg-gap-std-s $gap_std_s
    --baseline $ANALYTICS_ENGINE
    --output-dir $results_dir
  )

  >&2 echo "[Seq Analytics] Running with $num_clients..."

  log_workload_point $results_name
  python3 workloads/IMDB_extended/run_query_sequence.py "${args[@]}" &

  # This is a special return value variable that we use.
  runner_pid=$!
}

function start_txn_runner() {
  t_clients=$1

  >&2 echo "[Transactions] Running with $t_clients..."
  results_dir=$EXPT_OUT/t_${t_clients}
  mkdir -p $results_dir

  log_workload_point "txn_${t_clients}"
  python3 workloads/IMDB_extended/run_transactions.py \
    --num-clients $t_clients \
    --output-dir $results_dir \
    --baseline $TRANSACTION_ENGINE \
    &

  # This is a special return value variable that we use.
  runner_pid=$!
}

function start_txn_runner_serial() {
  t_clients=$1

  >&2 echo "[Serial Transactions] Running with $t_clients..."
  results_dir=$EXPT_OUT/t_${t_clients}
  mkdir -p $results_dir

  local args=(
    --num-clients $t_clients
    --output-dir $results_dir
    --baseline $TRANSACTION_ENGINE
  )

  log_workload_point "txn_${t_clients}"
  python3 workloads/IMDB_extended/run_transactions_serial.py \
    "${args[@]}" &

  # This is a special return value variable that we use.
  runner_pid=$!
}


function extract_named_arguments() {
  # Evaluates any environment variables in this script's arguments. This script
  # should only be run on trusted input.
  orig_args=($@)
  for val in "${orig_args[@]}"; do
    phys_arg=$(eval "echo $val")

    if [[ $phys_arg =~ --ra-clients=.+ ]]; then
      ra_clients=${phys_arg:13}
    fi

    if [[ $phys_arg =~ --t-clients-lo=.+ ]]; then
      t_clients_lo=${phys_arg:15}
    fi

    if [[ $phys_arg =~ --t-clients-hi=.+ ]]; then
      t_clients_hi=${phys_arg:15}
    fi

    if [[ $phys_arg =~ --ra-query-indexes=.+ ]]; then
      ra_query_indexes=${phys_arg:19}
    fi

    if [[ $phys_arg =~ --ra-query-bank-file=.+ ]]; then
      ra_query_bank_file=${phys_arg:21}
    fi

    if [[ $phys_arg =~ --ra-gap-s=.+ ]]; then
      ra_gap_s=${phys_arg:11}
    fi

    if [[ $phys_arg =~ --ra-gap-std-s=.+ ]]; then
      ra_gap_std_s=${phys_arg:15}
    fi

    if [[ $phys_arg =~ --ra-query-frequency-path=.+ ]]; then
      ra_query_frequency_path=${phys_arg:26}
    fi

    if [[ $phys_arg =~ --num-front-ends=.+ ]]; then
      num_front_ends=${phys_arg:17}
    fi

    if [[ $phys_arg =~ --run-for-s=.+ ]]; then
      run_for_s=${phys_arg:12}
    fi

    if [[ $phys_arg =~ --config-file=.+ ]]; then
      config_file=${phys_arg:14}
    fi

    if [[ $phys_arg =~ --planner-config-file=.+ ]]; then
      planner_config_file=${phys_arg:22}
    fi

    if [[ $phys_arg =~ --skip-replan=.+ ]]; then
      skip_replan=${phys_arg:14}
    fi

    if [[ $phys_arg =~ --schema-name=.+ ]]; then
      schema_name=${phys_arg:14}
    fi
  done
}
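
Each branch of `extract_named_arguments` pairs a flag pattern with a hard-coded offset (13, 15, 19, ...) that must equal the length of that flag's `--name=` prefix. As a possible alternative, and only a sketch under that reading, the same `--key=value` split can be done with parameter expansion so that no offsets need to be maintained. The function name `parse_named_argument` is hypothetical and not part of the PR, and the sketch omits the `eval` step of the original.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): split one --key=value argument at the
# first '=' and assign the value to a shell variable named after the key,
# with dashes mapped to underscores (--ra-clients=4 sets ra_clients=4).
parse_named_argument() {
  local arg="$1"
  local key value
  [[ $arg == --*=* ]] || return 1
  key="${arg%%=*}"      # text before the first '=', e.g. --ra-clients
  value="${arg#*=}"     # text after the first '='
  [[ -n $value ]] || return 1
  key="${key#--}"       # drop the leading dashes
  key="${key//-/_}"     # ra-clients -> ra_clients
  # Only accept names that are valid shell identifiers.
  [[ $key =~ ^[a-z_][a-z0-9_]*$ ]] || return 1
  printf -v "$key" '%s' "$value"
}

for arg in "--ra-clients=4" "--config-file=config/baseline.yml"; do
  parse_named_argument "$arg"
done

echo "$ra_clients $config_file"   # prints: 4 config/baseline.yml
```

Unlike the original, this accepts any well-formed flag name rather than a fixed list, so a caller that wants to reject unknown flags would still need an explicit allow-list.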