Getting Started with Xinda

1. Configuring a Xinda Experiment

Xinda is configured through command-line arguments that specify the system under test, the benchmark, and the slow faults to inject. The main arguments are described below; the remaining ones are defined in main.py and are intended for development. To list all arguments:

python3 main.py -h

# Xinda: A slow-fault testing pipeline for distributed systems.
# optional arguments:
# ...

To reset and clean up the working environment before the next Xinda test:

python3 cleanup.py

1.1 Experiment Configuration

| Field | Default Value | Description |
| --- | --- | --- |
| log_root_dir | $HOME/workdir/data/default | The root directory to store logs (data) |
| data_dir | REQUIRED | Name of the experiment. Results will be stored in $log_root_dir/$sys_name/$data_dir |
| bench_exec_time* | 150 | Benchmark duration in seconds |
| iter | 1 | Label for repeated experiments |

* For Hadoop, the benchmark duration is controlled by dedicated flags instead: the number of MRBench runs (--mrbench_num_iter) and the number of 100-byte rows generated by TeraGen (--terasort_num_of_100_byte_rows), which is followed by the TeraSort benchmark.
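The results location given in the data_dir description can be sketched in Python as follows. This is a minimal illustration, not code from Xinda: the helper name `results_dir` and the example values `etcd` and `my-first-test` are assumptions chosen for the example.

```python
from pathlib import Path


def results_dir(sys_name: str, data_dir: str,
                log_root_dir: str = "~/workdir/data/default") -> Path:
    """Compose $log_root_dir/$sys_name/$data_dir (hypothetical helper).

    The default mirrors the documented log_root_dir default, with "~"
    standing in for $HOME.
    """
    return Path(log_root_dir).expanduser() / sys_name / data_dir


# Example values are placeholders, not names required by Xinda.
print(results_dir("etcd", "my-first-test"))
```

Keeping this path in one place makes it easier to point the analysis step at the right directory later.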

1.2 System Configuration

| Field | Default Value | Description |
| --- | --- | --- |
| sys_name | REQUIRED | Name of the distributed system to be tested |
| cpu_limit | 4 | The number of CPU cores allocated to each container instance |
| mem_limit | 32G | The amount of memory allocated to each container instance |

1.3 Benchmark Configuration

| Field | Default Value | Description |
| --- | --- | --- |
| benchmark | REQUIRED | The benchmark used to test the system |
| ycsb_wkl | mixed | YCSB workload. Other options are available in xinda-software/ycsb-workloads |
| openmsg_driver | kafka-latency | The yaml filename of the OpenMessaging Kafka driver. Available options are listed in xinda-software/openmessaging/driver-kafka |
| openmsg_workload | simple-workload | The yaml filename of the OpenMessaging workload. Available options are listed in xinda-software/openmessaging/workloads |
| sysbench_lua_scheme | oltp_write_only | The Lua scheme used to run the SysBench workload on crdb |
| etcd_official_wkl | lease-keepalive | Workload for the official etcd v3 benchmark tool |

1.4 Slow-Fault Configuration

| Field | Default Value | Description |
| --- | --- | --- |
| fault_type | REQUIRED | Type of slow fault to inject. One of {nw, fs, None} |
| fault_location | REQUIRED | Fault injection location. Available hostnames are listed in ./xinda/systems/container.yaml* |
| fault_duration | REQUIRED | Duration of the fault injection in seconds |
| fault_severity | REQUIRED | Severity of the fault injection. For network slow faults, available options are listed in ./tools/blockade (e.g., flaky-p10 or slow-100ms). For filesystem delays, pass the delay to inject in microseconds (e.g., 1000 for 1 ms or 100000 for 100 ms) |
| fault_start_time | REQUIRED | Number of seconds after the benchmark starts at which the slow fault is injected |

* For etcd, specify the fault location directly as leader or follower, because leaders are elected dynamically.
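The fields in Sections 1.1 to 1.4 can be assembled into a single invocation. The sketch below is one hedged way to do that in Python, and it rests on several assumptions that are not confirmed by this guide: that every field maps to a `--<field>` flag of main.py (as the two Hadoop flags suggest), that the fault window should end before the benchmark does, and that `ycsb` is an accepted benchmark value. The helpers `ms_to_us` and `build_command` are hypothetical names.

```python
import shlex

# Fields marked REQUIRED in the tables of Sections 1.1 to 1.4.
REQUIRED = ("sys_name", "data_dir", "benchmark", "fault_type",
            "fault_location", "fault_duration", "fault_severity",
            "fault_start_time")


def ms_to_us(delay_ms: float) -> int:
    """Convert a filesystem delay from milliseconds to microseconds."""
    return int(round(delay_ms * 1000))


def build_command(config: dict, bench_exec_time: int = 150) -> list:
    """Build an argv list for main.py (hypothetical helper).

    Assumption: each field is passed as --<field> <value>.
    """
    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    # Assumption: a fault that outlasts the benchmark is a config mistake.
    fault_end = config["fault_start_time"] + config["fault_duration"]
    if fault_end > bench_exec_time:
        raise ValueError(
            f"fault ends at {fault_end}s, after the {bench_exec_time}s benchmark")

    argv = ["python3", "main.py", "--bench_exec_time", str(bench_exec_time)]
    for key, value in config.items():
        argv += [f"--{key}", str(value)]
    return argv


example = {
    "sys_name": "etcd",
    "data_dir": "my-first-test",       # placeholder experiment name
    "benchmark": "ycsb",               # placeholder; check main.py -h
    "fault_type": "fs",
    "fault_location": "leader",        # etcd accepts leader or follower
    "fault_duration": 30,
    "fault_severity": ms_to_us(1),     # 1 ms filesystem delay = 1000 us
    "fault_start_time": 60,
}
print(shlex.join(build_command(example)))
```

Before running the printed command, compare its flags against the output of `python3 main.py -h`, since the flag spelling here is inferred rather than taken from the tool.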

2. Results Analysis

By default, each Xinda test generates a directory at $HOME/workdir/data/default/${data_dir}. The directory contains the system logs and runtime stats of the test. To analyze the results, run the following command:

python3 $HOME/workdir/xinda/data-analysis/process.py \
    --data_dir PATH_TO_DATA_DIR \
    --output_dir PATH_TO_OUTPUT_DIR

The script writes a summary of the test results to meta.csv in the output directory. It also parses runtime logs (e.g., fault injection timestamps) and stats (e.g., system throughput time series).
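Once meta.csv exists, it can be inspected with Python's standard csv module. The sketch below assumes that meta.csv begins with a header row; this guide does not list the column names, so the code only reports whatever columns it finds. The function names `load_meta` and `describe` are hypothetical.

```python
import csv
from pathlib import Path


def load_meta(output_dir) -> list:
    """Read <output_dir>/meta.csv into a list of dicts, one per row.

    Assumption: the first line of meta.csv is a header row.
    """
    meta_path = Path(output_dir) / "meta.csv"
    with meta_path.open(newline="") as handle:
        return list(csv.DictReader(handle))


def describe(rows: list) -> str:
    """Return a one-line description: row count and column names."""
    columns = list(rows[0].keys()) if rows else []
    return f"{len(rows)} row(s); columns: {', '.join(columns)}"
```

For example, `print(describe(load_meta("PATH_TO_OUTPUT_DIR")))` shows how many tests were summarized and which columns are available before any further analysis.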