Commit 4e43ef8

feat: MCP server docs (#409)
Signed-off-by: PriteshKiri <[email protected]>
1 parent 3e2285a commit 4e43ef8

6 files changed: 508 additions, 0 deletions

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
---
id: available-tools
title: Available Tools
---

In MCP, a “tool” is an action you can call programmatically: it takes well-defined inputs and returns structured outputs.

Litmus MCP tools map to common chaos engineering workflows: managing experiments, monitoring runs, connecting infrastructures, organizing environments, defining resilience probes, and discovering faults and analytics. This lets assistants and automations perform these tasks reliably.

Use this page as a practical reference. Each section explains what a tool does, when to use it, typical inputs, and the kind of output you can expect.
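
To make the idea of a tool call concrete, the sketch below builds the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` fields follow the MCP specification. The helper name `build_tool_call` and the `environment` argument key are assumptions for illustration, because this page does not list exact parameter names.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument key; check the server's tool schema for the real one.
request = build_tool_call(1, "list_chaos_experiments", {"environment": "staging"})
print(request)
```

In practice an MCP client library or an assistant builds this message for you; the sketch only shows what travels over the wire.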

## Overview of tool categories

- Experiment Management: list and inspect experiments, and run or stop them.
- Execution Monitoring: see execution history and drill into run details and logs.
- Infrastructure Management: register and inspect Kubernetes infrastructures.
- Environment Organization: group experiments and resources by environment.
- Resilience Validation: define probes to validate steady state and SLOs.
- Discovery & Analytics: explore available faults and review platform statistics.

The 16 tools documented on this page are listed below, organized by category.

## Experiment Management

These tools help you find, inspect, run, and stop chaos experiments.

### list_chaos_experiments
List all chaos experiments with optional filtering.
- What it does: Returns a list of experiments. You can filter by project, environment, tags, or name.
- When to use: To see which experiments exist before selecting one to run or inspect.
- Typical input: Project ID or name, optional filters (environment, labels, search text), pagination.
- Output: A paginated list of experiments with key fields like name, ID, environment, and status.
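
Several list tools accept pagination. A client-side loop for collecting every page might look like the sketch below. The `fetch_page` callable, the `page`/`limit` parameters, and the in-memory sample data are all hypothetical stand-ins; the real parameter names are defined by the server's tool schema.

```python
from typing import Callable, Iterator

Page = list[dict]


def iter_all(fetch_page: Callable[[int, int], Page], limit: int = 2) -> Iterator[dict]:
    """Yield items page by page until a short (or empty) page signals the end."""
    page = 0
    while True:
        items = fetch_page(page, limit)
        yield from items
        if len(items) < limit:
            return
        page += 1


# Fake data standing in for a list_chaos_experiments response.
_EXPERIMENTS = [{"id": f"exp-{n}", "name": f"experiment-{n}"} for n in range(5)]


def fake_fetch(page: int, limit: int) -> Page:
    """Return one slice of the fake data, like a paginated tool call would."""
    start = page * limit
    return _EXPERIMENTS[start:start + limit]


all_ids = [item["id"] for item in iter_all(fake_fetch)]
print(all_ids)
```

Stopping on a short page avoids needing a total count, at the cost of one extra request when the total is an exact multiple of the page size.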

### get_chaos_experiment
Get detailed information about a specific chaos experiment.
- What it does: Fetches full details for a single experiment.
- When to use: To review experiment structure, faults used, probes, and configuration before running it.
- Typical input: Experiment ID (or name + project/environment context).
- Output: Experiment spec including faults, probes, parameters, schedules, and metadata.

### run_chaos_experiment
Execute a chaos experiment immediately (on-demand run).
- What it does: Triggers an on-demand run of the selected experiment.
- When to use: To start a test now (outside of any scheduled cadence) for debugging or validation.
- Typical input: Experiment ID and optional overrides (variables, run labels, dry-run flag if supported).
- Output: A run ID (or execution reference) you can use to monitor progress.

### stop_chaos_experiment
Stop an in-progress chaos experiment run.
- What it does: Attempts to stop an in-progress experiment run.
- When to use: If a test must be halted due to impact, misconfiguration, or a time limit.
- Typical input: Experiment ID or Run ID.
- Output: Confirmation that the stop request was accepted; subsequent run status should show as stopped/terminated.
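
Because a stop request is only accepted, not completed, a client typically polls the run until it reaches a terminal state. The sketch below shows that loop. `get_status` is a hypothetical callable standing in for a `get_experiment_run_details` call, and the status strings are assumptions based on the wording on this page, not confirmed server values.

```python
import time
from typing import Callable

# Assumed terminal states; the server's actual status values may differ.
TERMINAL = {"Succeeded", "Failed", "Stopped", "Terminated"}


def wait_for_terminal(get_status: Callable[[], str], interval: float = 0.0,
                      max_polls: int = 10) -> str:
    """Poll until the run reports a terminal status or the poll budget runs out."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        time.sleep(interval)
    raise TimeoutError(f"run still not terminal after {max_polls} polls")


# Fake status source: the run reports Running twice, then Stopped.
_statuses = iter(["Running", "Running", "Stopped"])
final_status = wait_for_terminal(lambda: next(_statuses))
print(final_status)
```

The zero-second default interval keeps the demo instant; a real client would likely wait a few seconds between polls.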

## Execution Monitoring

These tools help you track experiment execution over time and inspect an individual run in depth.

### list_experiment_runs
List experiment execution history with flexible filters.
- What it does: Lists runs across experiments, with filters such as experiment, environment, status, or time range.
- When to use: To review what ran recently, identify failed runs, or audit changes over time.
- Typical input: Experiment ID (optional), status filters (Succeeded/Failed/Running), time window, pagination.
- Output: A list of runs with IDs, timestamps, duration, status, and basic metadata.

### get_experiment_run_details
Get detailed information about a single run, including timeline, logs, and probe results.
- What it does: Shows a single run’s timeline, step status, logs, and probe results.
- When to use: For debugging failures, verifying probe outcomes, or sharing evidence of success.
- Typical input: Run ID.
- Output: Detailed run record including events, steps, logs, artifacts, and final status.

## Infrastructure Management

Use these tools to manage and view the Kubernetes infrastructures where experiments run.

### list_chaos_infrastructures
List all registered infrastructures (for example, Kubernetes clusters/agents).
- What it does: Returns all infrastructures registered to the project.
- When to use: To confirm which clusters are connected and healthy.
- Typical input: Project ID or name, optional filters (status, type), pagination.
- Output: A list of infrastructures with IDs, names, types, connection status, and last heartbeat.

### get_infrastructure_details
Get detailed information about a specific infrastructure.
- What it does: Shows full details about a specific infrastructure.
- When to use: To review configuration, connected namespaces, resource quotas, and health.
- Typical input: Infrastructure ID.
- Output: Detailed infrastructure profile including metadata, status, capabilities, and version info.

### register_chaos_infrastructure
Register a new Kubernetes infrastructure to run experiments.
- What it does: Starts the registration/handshake for a new Kubernetes cluster or agent.
- When to use: When onboarding a new cluster to run chaos experiments.
- Typical input: Project context, cluster name, and registration parameters.
- Output: Registration info and next steps (for example, a YAML manifest to install or a token to use with the agent).

## Environment Organization

Organize experiments and resources into environments (for example, dev, staging, prod).

### list_environments
List all environments defined in the project.
- What it does: Lists environments defined in the project.
- When to use: To pick the right environment for creating or running experiments.
- Typical input: Project ID or Name, pagination.
- Output: A list of environments with IDs, names, and basic metadata.

### create_environment
Create a new environment for organizing experiments and resources.
- What it does: Creates a new environment grouping.
- When to use: When you need a separate space for a team, app, or lifecycle stage.
- Typical input: Environment name, description, optional tags/labels.
- Output: The newly created environment with its ID and details.

## Resilience Validation

Probes validate steady state or desired outcomes before, during, and after experiments.

### list_resilience_probes
List all configured resilience probes.
- What it does: Lists probes available in the project or environment.
- When to use: To see what checks exist and reuse them across experiments.
- Typical input: Optional filters like environment, probe type (HTTP, CMD, K8s, Prometheus), pagination.
- Output: A list of probes with IDs, names, types, and brief specs.

### create_resilience_probe
Create a new probe (HTTP, CMD, K8s, or Prometheus) for resilience validation.
- What it does: Creates a new probe definition to validate resilience signals.
- When to use: To add new SLO checks or steady-state validations.
- Typical input: Probe name, type, and spec:
  - HTTP: URL, method, headers, expected status/body.
  - CMD: Command, arguments, timeout, expected exit code.
  - K8s: Resource query (pods/deployments), conditions, namespace.
  - Prometheus: Query, comparison operator, threshold, evaluation window.
- Output: The created probe with ID and full spec.
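
The per-type inputs listed above can be pictured as a small validation step performed before calling `create_resilience_probe`. The field names below (`url`, `method`, `command`, `query`, and so on) are illustrative guesses derived from the descriptions on this page, not the server's confirmed schema.

```python
# Required spec fields per probe type; names are assumptions for illustration.
REQUIRED_FIELDS = {
    "HTTP": {"url", "method"},
    "CMD": {"command"},
    "K8s": {"resource", "namespace"},
    "Prometheus": {"query", "comparator", "threshold"},
}


def build_probe(name: str, probe_type: str, spec: dict) -> dict:
    """Return a probe definition, rejecting unknown types and missing fields."""
    if probe_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown probe type: {probe_type}")
    missing = REQUIRED_FIELDS[probe_type] - set(spec)
    if missing:
        raise ValueError(f"{probe_type} probe is missing: {sorted(missing)}")
    return {"name": name, "type": probe_type, "spec": dict(spec)}


probe = build_probe(
    "health-check",
    "HTTP",
    {"url": "https://myapp.example.com/health", "method": "GET", "expected_status": 200},
)
print(probe["type"], probe["spec"]["url"])
```

Checking the spec locally gives a clearer error than a rejected tool call, though the server remains the authority on what a valid probe looks like.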

## Discovery & Analytics

Explore the chaos library and get high-level insights about usage and outcomes.

### list_chaos_hubs
List available ChaosHubs that provide faults and experiments.
- What it does: Lists connected ChaosHubs that provide faults/experiments.
- When to use: To discover which hubs are available (public or private) and browse their content.
- Typical input: Optional filters, pagination.
- Output: A list of hubs with IDs, names, types, and availability.

### get_chaos_faults
Browse available chaos faults from connected hubs.
- What it does: Returns chaos faults available from hubs, with metadata like category, platform, and parameters.
- When to use: To select a fault to add to an experiment.
- Typical input: Hub ID (optional), search query, categories, pagination.
- Output: A list of faults with names, descriptions, supported platforms, and input parameters.

### get_experiment_statistics
Get comprehensive platform-level statistics and recent activity.
- What it does: Provides aggregate stats such as number of experiments, runs, success/failure rates, and recent activity.
- When to use: For reporting, governance, and tracking adoption over time.
- Typical input: Optional time range, environment, or project filters.
- Output: Summary metrics, chart-ready aggregates, and counts.
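
As a rough picture of what "success/failure rates" means here, the sketch below derives such aggregates from a list of run records. The record shape and status strings are assumptions for illustration, and the sample data is made up; the actual tool returns its own precomputed summary.

```python
from collections import Counter


def summarize_runs(runs: list[dict]) -> dict:
    """Count runs per status and compute the success rate over finished runs."""
    counts = Counter(run["status"] for run in runs)
    finished = counts["Succeeded"] + counts["Failed"]
    rate = counts["Succeeded"] / finished if finished else None
    return {"total": len(runs), "by_status": dict(counts), "success_rate": rate}


# Made-up sample data, not real output.
sample_runs = [
    {"id": "run-1", "status": "Succeeded"},
    {"id": "run-2", "status": "Failed"},
    {"id": "run-3", "status": "Succeeded"},
    {"id": "run-4", "status": "Running"},
]
summary = summarize_runs(sample_runs)
print(summary)
```

Runs that are still in progress are excluded from the rate so that an unfinished run does not count as a failure.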

## Learn more

- [Installation](./installation)
- [Example Interactions](./examples)
- [Troubleshooting](./troubleshooting)
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
---
id: examples
title: Example Interactions
---

This page collects copy-paste-ready interactions you can perform through the MCP server. Each example pairs a natural-language prompt with a short description of what the server does in response, so you can replicate the workflow quickly.

If you’re new to the tool surface, see `mcp-server/available-tools.md` for a reference of each tool's purpose, typical inputs, and outputs.

## Quick Start

A common end-to-end flow looks like this:

1. List experiments → pick one to run.
2. Run an experiment → capture the Run ID.
3. Monitor the run → check steps, logs, and probe results.
4. (Optional) Stop the run if needed.

Below, you’ll find detailed examples for these and more scenarios.
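
As a first illustration, the four-step Quick Start flow can be written as an ordered plan of tool calls. The tool names come from the Available Tools page; the argument keys (`environment`, `experimentId`, `runId`) and the placeholder IDs are hypothetical.

```python
from typing import Optional


def quick_start_plan(environment: str, experiment_id: str,
                     run_id: Optional[str] = None, stop: bool = False) -> list[dict]:
    """Return the ordered tool calls for list -> run -> monitor -> (optional) stop."""
    plan = [
        {"tool": "list_chaos_experiments", "arguments": {"environment": environment}},
        {"tool": "run_chaos_experiment", "arguments": {"experimentId": experiment_id}},
    ]
    # Steps 3 and 4 need the run ID returned by step 2.
    if run_id is not None:
        plan.append({"tool": "get_experiment_run_details", "arguments": {"runId": run_id}})
        if stop:
            plan.append({"tool": "stop_chaos_experiment", "arguments": {"runId": run_id}})
    return plan


for step in quick_start_plan("staging", "exp-123", run_id="run-456", stop=True):
    print(step["tool"], step["arguments"])
```

The run ID is only known after step 2, which is why the monitoring and stop steps are added conditionally.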

## Sample Prompts

- Prompt
  ```text
  Show me available chaos experiments in the staging environment that target Kubernetes.
  ```
  It will: List chaos experiments filtered by environment and platform so you can choose one to run.

- Prompt
  ```text
  Run experiment "pod-delete-basic" now in staging and return the run ID.
  ```
  It will: Trigger an on-demand run of the chosen experiment and return the Run ID.

- Prompt
  ```text
  Show me the latest status, timeline, and probe results for run ID <RUN_ID>.
  ```
  It will: Retrieve detailed run information including step timeline and probe outcomes.

- Prompt
  ```text
  Stop the currently running run <RUN_ID> and confirm the termination.
  ```
  It will: Attempt to stop the in-progress run and report acceptance.

- Prompt
  ```text
  List all registered infrastructures and show which ones are healthy.
  ```
  It will: Return the registered infrastructures with their connection/health status.

- Prompt
  ```text
  Onboard a new Kubernetes cluster named "edge-lab" and provide the registration steps.
  ```
  It will: Initiate registration and return the manifest or token with next steps.

- Prompt
  ```text
  Create an HTTP probe that checks GET https://myapp.example.com/health returns 200 in under 2s.
  ```
  It will: Create a reusable HTTP probe definition for steady-state validation.

- Prompt
  ```text
  List all probes so I can attach one to my next experiment.
  ```
  It will: List available resilience probes with IDs and brief specs.

- Prompt
  ```text
  Show me available ChaosHubs and then list Kubernetes pod-level faults.
  ```
  It will: List connected hubs and then fetch faults filtered by platform/category.

- Prompt
  ```text
  Create a new environment called "chaos-lab" for ad-hoc testing.
  ```
  It will: Create a new environment grouping that you can target in experiments.

- Prompt
  ```text
  List environments so I can verify "chaos-lab" exists.
  ```
  It will: List all environments with their IDs and names.

## Tips and Good Practices

- Start broad with the listing tools (`list_chaos_experiments`, `list_experiment_runs`, `list_chaos_infrastructures`, `list_environments`) before drilling down.
- Prefer IDs over names for precision when running or stopping experiments.
- After `run_chaos_experiment`, immediately capture the `runId` so you can monitor or stop the run later.
- Reuse probes across experiments to standardize resilience checks.
- Keep filters narrow and specific to reduce noise in large projects.

For more detail on each tool's typical inputs and outputs, see `mcp-server/available-tools.md`.
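
The "prefer IDs over names" tip can be enforced with a small lookup that refuses ambiguous names. This is a sketch under assumptions: the `id` and `name` keys mirror the fields described for `list_chaos_experiments`, but the exact response shape is not specified here, and the sample records are invented.

```python
def resolve_experiment_id(experiments: list[dict], name: str) -> str:
    """Return the ID of the single experiment with this exact name."""
    matches = [exp["id"] for exp in experiments if exp["name"] == name]
    if not matches:
        raise LookupError(f"no experiment named {name!r}")
    if len(matches) > 1:
        raise LookupError(f"{name!r} is ambiguous: {matches}")
    return matches[0]


# Invented sample data; two experiments share the name "node-drain".
listing = [
    {"id": "exp-1", "name": "pod-delete-basic"},
    {"id": "exp-2", "name": "node-drain"},
    {"id": "exp-3", "name": "node-drain"},
]
print(resolve_experiment_id(listing, "pod-delete-basic"))
```

Failing on duplicates, rather than picking the first match, keeps a run or stop request from silently targeting the wrong experiment.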

## Learn more

- [Installation](./installation)
- [Available Tools](./available-tools)
- [Troubleshooting](./troubleshooting)
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
---
id: installation
title: Installation
---

You can install Litmus MCP Server from source, via `go install`, or using Docker. Refer to the repository for the most up-to-date commands.

## From Source

```bash
# Clone the repository
git clone https://github.com/litmuschaos/litmus-mcp-server.git
cd litmus-mcp-server

# Build the binary
make build

# Or install directly
make install
```

## Using Go Install

```bash
go install github.com/litmuschaos/litmus-mcp-server@latest
```

## Using Docker

```bash
# Build the Docker image
make docker-build

# Run with Docker
docker run --rm -it \
  -e CHAOS_CENTER_ENDPOINT=http://your-chaos-center:8080 \
  -e LITMUS_PROJECT_ID=your-project-id \
  -e LITMUS_ACCESS_TOKEN=your-token \
  litmuschaos-mcp-server:latest
```

## Configuration

Configure Litmus MCP Server using environment variables.

### Environment Variables

```bash
# Required Configuration
export CHAOS_CENTER_ENDPOINT=http://your-chaos-center:8080
export LITMUS_PROJECT_ID=your-project-id
export LITMUS_ACCESS_TOKEN=your-access-token

# Optional Defaults
export DEFAULT_INFRA_ID=your-default-infrastructure-id
export DEFAULT_ENVIRONMENT_ID=production
```
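
A wrapper script could check these variables before launching the server. The sketch below is one hypothetical pre-flight check, not the server's own behavior: the variable names are taken from the block above, while `load_config` and the fail-fast policy are assumptions for illustration.

```python
import os
from typing import Mapping

REQUIRED = ("CHAOS_CENTER_ENDPOINT", "LITMUS_PROJECT_ID", "LITMUS_ACCESS_TOKEN")
OPTIONAL = ("DEFAULT_INFRA_ID", "DEFAULT_ENVIRONMENT_ID")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the documented variables, failing fast if a required one is unset."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    config = {key: env[key] for key in REQUIRED}
    config.update({key: env[key] for key in OPTIONAL if env.get(key)})
    return config


# Placeholder values copied from the example above.
example = load_config({
    "CHAOS_CENTER_ENDPOINT": "http://your-chaos-center:8080",
    "LITMUS_PROJECT_ID": "your-project-id",
    "LITMUS_ACCESS_TOKEN": "your-access-token",
})
print(sorted(example))
```

Passing a plain mapping instead of reading `os.environ` directly makes the check easy to test without touching the real environment.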

### Getting Your Credentials

1. **Chaos Center Endpoint**: URL of your LitmusChaos installation
2. **Project ID**: Found in Chaos Center project settings
3. **Access Token**: Generate from Chaos Center → Settings → Access Tokens

## Usage

You can run Litmus MCP Server standalone or integrate it with Claude Desktop via the MCP configuration.

### With Claude Desktop

Add to your Claude Desktop MCP configuration:

```json
{
  "mcpServers": {
    "litmuschaos": {
      "command": "/path/to/litmuschaos-mcp-server",
      "env": {
        "CHAOS_CENTER_ENDPOINT": "http://localhost:8080",
        "LITMUS_PROJECT_ID": "your-project-id",
        "LITMUS_ACCESS_TOKEN": "your-token"
      }
    }
  }
}
```
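
If your configuration already lists other MCP servers, the `litmuschaos` entry should be merged in rather than overwriting them. The sketch below shows one way to do that merge in memory. The `other-server` entry is a made-up example, and locating, reading, and writing the actual configuration file is left out.

```python
import json


def add_litmus_server(config: dict, command: str, env: dict) -> dict:
    """Return a copy of an MCP client config with a `litmuschaos` server entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["litmuschaos"] = {"command": command, "env": dict(env)}
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "/path/to/other"}}}
updated = add_litmus_server(
    existing,
    "/path/to/litmuschaos-mcp-server",
    {
        "CHAOS_CENTER_ENDPOINT": "http://localhost:8080",
        "LITMUS_PROJECT_ID": "your-project-id",
        "LITMUS_ACCESS_TOKEN": "your-token",
    },
)
print(json.dumps(updated, indent=2))
```

Returning a new dictionary leaves the original configuration untouched, so a failed write cannot leave a half-edited structure in memory.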

### Standalone Usage

```bash
# Run the built binary; it reads its configuration from the environment variables
./bin/litmuschaos-mcp-server

# Or with make
make run
```

## Learn more

- [Available Tools](./available-tools)
- [Example Interactions](./examples)
- [Troubleshooting](./troubleshooting)
