Xinda takes a number of arguments to configure the system under test, the benchmark used, and the slow faults to inject. The main arguments are described below; the remaining arguments can be found in `main.py` and are for development purposes.
```shell
python3 main.py -h
# Xinda: A slow-fault testing pipeline for distributed systems.
# optional arguments:
# ...
```
To reset/clean up the working environment for the next Xinda test:

```shell
python3 cleanup.py
```
Field | Default Value | Description
---|---|---
`log_root_dir` | `$HOME/workdir/data/default` | Root directory for storing logs (data)
`data_dir` | REQUIRED | Name of the experiment. Results are stored in `$log_root_dir/$sys_name/$data_dir`
`bench_exec_time`* | 150 | Benchmark duration in seconds
`iter` | 1 | Label for repeated experiments

\* Note that for Hadoop, special flags control the benchmark duration instead: the number of MRBench runs (`--mrbench_num_iter`) and the size of the data generated by TeraGen (`--terasort_num_of_100_byte_rows`) and then sorted by TeraSort.
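For example, controlling a Hadoop run's duration would use those two flags rather than `--bench_exec_time`. This is a sketch: the flag values are illustrative, and the other required arguments (system, benchmark, fault settings) are omitted.

```shell
# Hadoop-only duration controls (values illustrative; other required
# arguments omitted for brevity).
python3 main.py \
    --mrbench_num_iter 5 \
    --terasort_num_of_100_byte_rows 1000000
```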
Field | Default Value | Description
---|---|---
`sys_name` | REQUIRED | Name of the distributed system to be tested
`cpu_limit` | 4 | Number of CPU cores allocated to each container instance
`mem_limit` | 32G | Amount of memory allocated to each container instance
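To run on a smaller machine, the two resource limits can be lowered. A minimal sketch, assuming each table field maps to a same-named `main.py` flag; the required arguments are again omitted:

```shell
# Illustrative resource limits: 2 CPU cores and 8 GB of memory per container.
python3 main.py --cpu_limit 2 --mem_limit 8G
```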
Field | Default Value | Description
---|---|---
`benchmark` | REQUIRED | The benchmark used to test the system
`ycsb_wkl` | mixed | YCSB workload. Other options are available in `xinda-software/ycsb-workloads`
`openmsg_driver` | kafka-latency | YAML filename of the OpenMessaging Kafka driver. Available options are listed in `xinda-software/openmessaging/driver-kafka`
`openmsg_workload` | simple-workload | YAML filename of the OpenMessaging workload. Available options are listed in `xinda-software/openmessaging/workloads`
`sysbench_lua_scheme` | oltp_write_only | The Lua scheme used to run the SysBench workload on crdb
`etcd_official_wkl` | lease-keepalive | Workload for the official etcd v3 benchmark tool
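Selecting a benchmark then looks like the sketch below. The value `ycsb` for `--benchmark` is an assumption not confirmed by this table, and the remaining required arguments are omitted:

```shell
# Run the default "mixed" YCSB workload (hypothetical benchmark name;
# other required arguments omitted).
python3 main.py --benchmark ycsb --ycsb_wkl mixed
```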
Field | Default Value | Description
---|---|---
`fault_type` | REQUIRED | Type of slow fault to be injected. Can be {nw, fs, None}
`fault_location` | REQUIRED | Fault injection location. Available hostnames are listed in `./xinda/systems/container.yaml`*
`fault_duration` | REQUIRED | Duration of the fault injection in seconds
`fault_severity` | REQUIRED | Severity of the fault injection. For network slow faults, available options are listed in `./tools/blockade` (e.g., flaky-p10 or slow-100ms). For filesystem delays, pass the delay to be injected in microseconds (e.g., 1000 for 1 ms or 100000 for 100 ms)
`fault_start_time` | REQUIRED | Inject the slow fault X seconds after the benchmark starts

\* For etcd, specify the fault location directly as `leader` or `follower`, since leaders are elected dynamically.
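Putting the pieces together, a complete invocation might look like the sketch below. The flag names are assumed to mirror the field names above, and `ycsb` is an assumed value for `--benchmark`:

```shell
# Test etcd under YCSB, injecting a 100 ms network slowdown on the leader
# for 60 s, starting 30 s into the benchmark run.
python3 main.py \
    --sys_name etcd \
    --data_dir etcd-nw-demo \
    --benchmark ycsb \
    --fault_type nw \
    --fault_location leader \
    --fault_duration 60 \
    --fault_severity slow-100ms \
    --fault_start_time 30
```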
By default, each Xinda test generates a directory in `$HOME/workdir/data/default/${data_dir}` containing the system logs and runtime stats of the test. To analyze the results, use the following command:

```shell
python3 $HOME/workdir/xinda/data-analysis/process.py \
    --data_dir PATH_TO_DATA_DIR \
    --output_dir PATH_TO_OUTPUT_DIR
```
The script writes a summary of the test results to `meta.csv` in the output directory. It also parses runtime logs (e.g., fault injection timestamps) and stats (e.g., system throughput time series).