Xinda takes a number of arguments to configure the system under test, the benchmark used, and the slow faults to inject. The main arguments are described below; the remaining arguments can be found in `main.py` and are for development purposes.
```shell
python3 main.py -h
# Xinda: A slow-fault testing pipeline for distributed systems.
# optional arguments:
# ...
```
To reset/clean up the working environment for the next Xinda test:

```shell
python3 cleanup.py
```
Field | Default Value | Description
---|---|---
`log_root_dir` | `$HOME/workdir/data/default` | Root directory for storing logs (data)
`data_dir` | REQUIRED | Name of the experiment. Results are stored in `$log_root_dir/$sys_name/$data_dir`
`bench_exec_time`* | 150 | Benchmark duration in seconds
`iter` | 1 | Label for repeated experiments

\* Note that for Hadoop, special flags control the benchmark duration instead: the number of MRBench runs (`--mrbench_num_iter`) and the size of the data generated by TeraGen (`--terasort_num_of_100_byte_rows`) and then sorted by TeraSort.
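For example, controlling a Hadoop run's duration would use those two flags rather than `--bench_exec_time`. This is a sketch: the flag values are illustrative, and the other required arguments (system, benchmark, fault settings) are omitted.

```shell
# Hadoop-only duration controls (values illustrative; other required
# arguments omitted for brevity).
python3 main.py \
    --mrbench_num_iter 5 \
    --terasort_num_of_100_byte_rows 1000000
```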
Field | Default Value | Description
---|---|---
`sys_name` | REQUIRED | Name of the distributed system to be tested
`cpu_limit` | 4 | Number of CPU cores allocated to each container instance
`mem_limit` | 32G | Amount of memory allocated to each container instance
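To run on a smaller machine, the two resource limits can be lowered. A minimal sketch, assuming each table field maps to a same-named `main.py` flag; the required arguments are again omitted:

```shell
# Illustrative resource limits: 2 CPU cores and 8 GB of memory per container.
python3 main.py --cpu_limit 2 --mem_limit 8G
```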
Field | Default Value | Description
---|---|---
`benchmark` | REQUIRED | The benchmark used to test the system
`ycsb_wkl` | mixed | YCSB workload. Other options are available in `xinda-software/ycsb-workloads`
`openmsg_driver` | kafka-latency | YAML filename of the OpenMessaging Kafka driver. Available options are listed in `xinda-software/openmessaging/driver-kafka`
`openmsg_workload` | simple-workload | YAML filename of the OpenMessaging workload. Available options are listed in `xinda-software/openmessaging/workloads`
`sysbench_lua_scheme` | oltp_write_only | The Lua scheme used to run the SysBench workload on crdb
`etcd_official_wkl` | lease-keepalive | Workload for the official etcd v3 benchmark tool
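Selecting a benchmark then looks like the sketch below. The value `ycsb` for `--benchmark` is an assumption not confirmed by this table, and the remaining required arguments are omitted:

```shell
# Run the default "mixed" YCSB workload (hypothetical benchmark name;
# other required arguments omitted).
python3 main.py --benchmark ycsb --ycsb_wkl mixed
```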
Field | Default Value | Description
---|---|---
`fault_type` | REQUIRED | Type of slow fault to be injected. Can be {nw, fs, None}
`fault_location` | REQUIRED | Fault injection location. Available hostnames are listed in `./xinda/systems/container.yaml`*
`fault_duration` | REQUIRED | Duration of the fault injection in seconds
`fault_severity` | REQUIRED | Severity of the fault injection. For network slow faults, available options are listed in `./tools/blockade` (e.g., flaky-p10 or slow-100ms). For filesystem delays, pass the delay to be injected in microseconds (e.g., 1000 for 1 ms or 100000 for 100 ms)
`fault_start_time` | REQUIRED | Inject the slow fault X seconds after the benchmark starts

\* For etcd, specify the fault location directly as `leader` or `follower`, since leaders are elected dynamically.
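Putting the pieces together, a complete invocation might look like the sketch below. The flag names are assumed to mirror the field names above, and `ycsb` is an assumed value for `--benchmark`:

```shell
# Test etcd under YCSB, injecting a 100 ms network slowdown on the leader
# for 60 s, starting 30 s into the benchmark run.
python3 main.py \
    --sys_name etcd \
    --data_dir etcd-nw-demo \
    --benchmark ycsb \
    --fault_type nw \
    --fault_location leader \
    --fault_duration 60 \
    --fault_severity slow-100ms \
    --fault_start_time 30
```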
By default, each Xinda test generates a directory in `$HOME/workdir/data/default/${data_dir}` containing the system logs and runtime stats of the test. To analyze the results, use the following command:

```shell
python3 $HOME/workdir/xinda/data-analysis/process.py \
    --data_dir PATH_TO_DATA_DIR \
    --output_dir PATH_TO_OUTPUT_DIR
```
The script writes a summary of the test results to `meta.csv` in the output directory. It also parses runtime logs (e.g., fault injection timestamps) and stats (e.g., system throughput time series).