
Commit 2df7233

dosumis and claude committed
docs: Add test outputs documentation and gitignore rules
Adds comprehensive documentation for the --save-outputs feature:

- test_outputs/README.md with usage examples and use cases
- .gitignore rules to exclude saved outputs (keep README)

The README documents:

- How to use the --save-outputs flag
- Directory structure and output file formats
- Common use cases (debugging, comparison, citation review)
- Implementation details and references

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent a56929a commit 2df7233

File tree

2 files changed (+194 −0 lines)


.gitignore

Lines changed: 3 additions & 0 deletions
```diff
@@ -8,6 +8,9 @@ __pycache__/
 .pytest_cache/
 docs/_build/
 outputs/
+# Test outputs (keep README for documentation)
+test_outputs/
+!test_outputs/README.md
 .idea/
 tmp/
 # Distribution / packaging
```

test_outputs/README.md

Lines changed: 191 additions & 0 deletions
# Test Outputs

This directory contains saved outputs from integration tests for inspection and debugging.

## Purpose

Integration tests make real API calls, which are expensive and time-consuming. The `--save-outputs` flag lets you save these responses for later inspection without re-running the tests.
## Usage

### Saving Test Outputs

Run integration tests with the `--save-outputs` flag:

```bash
# Save outputs from all integration tests
pytest -m integration --save-outputs

# Save outputs from a specific test file
pytest tests/integration/test_pydantic_validation_e2e.py --save-outputs

# Save outputs from a specific test
pytest tests/integration/test_pydantic_validation_e2e.py::test_deepsearch_pydantic_validation_e2e --save-outputs
```
### Running Without Saving

By default, tests run normally without saving outputs:

```bash
# No outputs saved (default behavior)
pytest -m integration
```
## Directory Structure

Outputs are organized by test file and test name, with timestamps:

```
test_outputs/
└── integration/
    ├── deepsearch_service_integration/
    │   ├── test_deepsearch_service_preset_integration_20231209_143052/
    │   │   ├── raw_response.json
    │   │   └── extracted_json.json
    │   └── test_deepsearch_service_backward_compatibility_20231209_143105/
    │       └── raw_response.json
    └── pydantic_validation_e2e/
        ├── test_deepsearch_pydantic_validation_e2e_20231209_143210/
        │   ├── raw_response.json
        │   ├── extracted_json.json
        │   └── validation_result.json
        └── test_deepsearch_pydantic_validation_with_moderate_genes_20231209_143245/
            ├── raw_response.json
            ├── extracted_json.json
            └── validation_result.json
```
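The path for one run follows directly from this layout. As a minimal sketch (the helper name `output_dir_for` and its signature are illustrative, not the project's actual API):

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

def output_dir_for(test_file: str, test_name: str,
                   base: Path = Path("test_outputs"),
                   now: Optional[datetime] = None) -> Path:
    """Build the timestamped directory for one test run.

    Mirrors the layout above:
    test_outputs/integration/<module>/<test_name>_<YYYYMMDD_HHMMSS>/
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # "test_pydantic_validation_e2e.py" -> "pydantic_validation_e2e"
    module = Path(test_file).stem
    if module.startswith("test_"):
        module = module[len("test_"):]
    return base / "integration" / module / f"{test_name}_{stamp}"
```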
### Output Files

Each test run creates up to 3 JSON files:

#### 1. `raw_response.json`

Contains the full API response with metadata:

```json
{
  "test_context": {
    "genes": ["TMEM14E"],
    "context": "cells",
    "preset": "perplexity-sonar-pro"
  },
  "response": {
    "markdown": "# Research Results\n\n...",
    "citations": [
      {
        "source_id": "1",
        "notes": "PubMed article..."
      }
    ],
    "provider": "perplexity",
    "model": "sonar-reasoning-pro",
    "duration_seconds": 45.2
  },
  "saved_at": "2023-12-09T14:32:10.123456"
}
```
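Because the `YYYYMMDD_HHMMSS` suffix sorts lexicographically in chronological order, the newest run can be found with a plain sort. A minimal sketch, assuming the directory layout above (the function name is hypothetical):

```python
import json
from pathlib import Path

def latest_raw_response(test_dir: Path) -> dict:
    """Load raw_response.json from the newest run directory under test_dir.

    Timestamped names sort lexicographically in time order, so the
    last directory in sorted order is the most recent run.
    """
    runs = sorted(p for p in test_dir.iterdir() if p.is_dir())
    if not runs:
        raise FileNotFoundError(f"no saved runs under {test_dir}")
    return json.loads((runs[-1] / "raw_response.json").read_text())
```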
#### 2. `extracted_json.json` (when available)

Contains the extracted and parsed JSON from the API response:

```json
{
  "context": {
    "cell_type": "cells",
    "disease": "",
    "tissue": ""
  },
  "input_genes": ["TMEM14E"],
  "programs": [
    {
      "program_name": "Example Program",
      "description": "Program description",
      "atomic_biological_processes": [],
      "atomic_cellular_components": [],
      "predicted_cellular_impact": []
    }
  ],
  "version": "1.0"
}
```
#### 3. `validation_result.json` (when available)

Contains pydantic validation metadata:

```json
{
  "success": true,
  "retry_count": 0,
  "validation_time_ms": 12.5,
  "error": null
}
```
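The `success` flag makes it easy to triage many saved runs at once. A sketch under the directory layout above (the helper name `failed_validations` is illustrative):

```python
import json
from pathlib import Path

def failed_validations(root: Path) -> list:
    """Return run directories whose validation_result.json reports failure."""
    return [
        p.parent
        for p in sorted(root.rglob("validation_result.json"))
        if not json.loads(p.read_text())["success"]
    ]
```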
## Use Cases

### 1. Debug Validation Failures

When a test fails validation, inspect the saved outputs:

```bash
# Run the test with output saving
pytest tests/integration/test_pydantic_validation_e2e.py::test_deepsearch_pydantic_validation_e2e --save-outputs

# Inspect the extracted JSON
cat test_outputs/integration/pydantic_validation_e2e/test_deepsearch_pydantic_validation_e2e_*/extracted_json.json | jq .

# Check validation errors
cat test_outputs/integration/pydantic_validation_e2e/test_deepsearch_pydantic_validation_e2e_*/validation_result.json | jq .
```
### 2. Compare API Responses

Save outputs from different test runs to compare:

```bash
# Run with different presets
pytest tests/integration/test_deepsearch_service_integration.py --save-outputs

# Compare responses
diff test_outputs/integration/deepsearch_service_integration/test_*_preset_*/raw_response.json
```

Note that `diff` compares exactly two files, so the glob must match exactly two saved runs.
### 3. Inspect Citations

Review citation quality without re-running expensive API calls:

```bash
# Extract citations from saved outputs
cat test_outputs/integration/*/test_*/raw_response.json | jq '.response.citations'
```
### 4. Performance Analysis

Analyze API response times:

```bash
# Extract duration from all saved outputs
find test_outputs -name "raw_response.json" -exec jq -r '.response.duration_seconds' {} \;
```
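The same analysis can be done in Python when you want summary statistics rather than raw values. A sketch, assuming the saved-file layout above (`response_durations` is a hypothetical helper):

```python
import json
from pathlib import Path
from statistics import mean

def response_durations(root: Path) -> list:
    """Collect duration_seconds from every saved raw_response.json under root."""
    return [
        json.loads(p.read_text())["response"]["duration_seconds"]
        for p in sorted(root.rglob("raw_response.json"))
    ]

# Example, once some outputs have been saved:
# durations = response_durations(Path("test_outputs"))
# if durations:
#     print(f"mean: {mean(durations):.1f}s over {len(durations)} runs")
```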
## Notes

- Outputs are **not committed to git** (excluded via `.gitignore`)
- Each test run creates a **new timestamped directory**, so outputs never overwrite each other
- The directory is **only created when `--save-outputs` is used**
- Test performance is **not impacted** when the flag is not used
- Outputs contain **real API responses**; do not commit them if they contain sensitive data
## Implementation Details

The output-saving infrastructure consists of:

- **`tests/utils/test_output_saver.py`**: core utility class for saving outputs
- **`tests/conftest.py`**: pytest fixtures providing the `save_test_output` function
- **Integration tests**: updated to call `save_test_output()` when the fixture is available

See `tests/unit/test_output_saver.py` for unit tests of the output-saving infrastructure.
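The wiring between the flag and the fixture can be sketched as follows. This is a minimal illustration of how such pieces typically fit together in `conftest.py`, not the project's actual implementation; `make_saver` is a hypothetical helper:

```python
import json
from datetime import datetime
from pathlib import Path

import pytest

def pytest_addoption(parser):
    # Registers the flag; pytest calls this hook automatically from conftest.py.
    parser.addoption("--save-outputs", action="store_true", default=False,
                     help="Save integration test outputs for inspection")

def make_saver(enabled: bool, out_dir: Path):
    """Return a callable that writes <name>.json under out_dir, or a no-op."""
    if not enabled:
        return lambda name, data: None  # zero cost when the flag is absent
    def _save(name, data):
        out_dir.mkdir(parents=True, exist_ok=True)
        (out_dir / f"{name}.json").write_text(json.dumps(data, indent=2))
    return _save

@pytest.fixture
def save_test_output(request):
    """Per-test saver; the timestamped directory matches the layout above."""
    enabled = request.config.getoption("--save-outputs")
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    out_dir = Path("test_outputs") / "integration" / f"{request.node.name}_{stamp}"
    return make_saver(enabled, out_dir)
```

A test would then call `save_test_output("raw_response", response_dict)` and get either a saved file or a no-op, depending on the flag.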
