DiskANN Performance Experiments

This directory contains structured records of all performance experiments conducted on sqlite-diskann. Each experiment documents the hypothesis, methodology, results, and conclusions to build institutional knowledge.

Experiment Format

Each experiment is documented in a separate markdown file with the following structure:

# Experiment: [Short Title]

**Date:** YYYY-MM-DD
**Engineer:** [Name]
**Status:** [Planned | Running | Complete | Abandoned]

## Hypothesis

What we believe will happen and why.

## Motivation

Why we're running this experiment. What problem are we trying to solve?

## Test Setup

- Parameters tested
- Dataset size and characteristics
- Hardware/environment
- Comparison baseline

## Expected Results

Quantitative predictions with reasoning.

## Actual Results

Raw data, tables, graphs. Link to benchmark output files.

## Analysis

What the results mean. Surprises? Confirmations?

## Conclusions

- What we learned
- Impact on defaults/recommendations
- Follow-up experiments needed

## Artifacts

- Benchmark profiles: `benchmarks/profiles/experiment-XXX-*.json`
- Results: `results/experiment-XXX-*.json`
- Graphs: `experiments/graphs/experiment-XXX-*.png`

Experiment Index

| ID | Date | Title | Status | Key Finding |
|-----|------------|-------|--------|-------------|
| 001 | 2026-02-11 | Cache + Hash Set Optimization | Complete | 37% build speedup from BLOB caching |
| 002 | 2026-02-11 | insert_list_size Reduction (200→100) | Complete | Only 2% improvement due to cache masking |
| 003 | 2026-02-14 | max_neighbors Impact on Recall | Complete | searchListSize bottleneck; keep default=32 |
| 004 | 2026-02-12 | Scaling Test (10k→200k) | Planned | Find crossover vs brute-force |
| 005 | 2026-02-12 | Block Size Fix at 100k | Complete | 98% recall (maxDeg=64), 64% (maxDeg=32) |
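
Keeping the index in sync with the experiment files is easier if the header fields can be read mechanically. The following is a minimal sketch, assuming every experiment file follows the header layout shown under Experiment Format. The helper names `parse_header` and `index_columns` and the sample text are hypothetical and not part of the repository.

```python
import re

# Matches header lines such as "**Status:** Complete" from the template.
FIELD_RE = re.compile(r"^\*\*(Date|Engineer|Status):\*\*[ \t]*(.+?)[ \t]*$", re.MULTILINE)
TITLE_RE = re.compile(r"^# Experiment:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def parse_header(markdown_text):
    """Extract title, date, engineer, and status from an experiment file."""
    fields = {name.lower(): value for name, value in FIELD_RE.findall(markdown_text)}
    title = TITLE_RE.search(markdown_text)
    fields["title"] = title.group(1) if title else ""
    return fields


def index_columns(exp_id, fields, key_finding):
    """Return the five index columns: ID, Date, Title, Status, Key Finding."""
    return (exp_id, fields.get("date", ""), fields["title"],
            fields.get("status", ""), key_finding)


# Hypothetical sample text, not a real experiment record.
sample = """# Experiment: Example Title

**Date:** 2026-01-01
**Engineer:** Example Name
**Status:** Complete
"""
print(index_columns("999", parse_header(sample), "placeholder finding"))
```

The key finding is passed in by hand because the template has no dedicated field for it.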

Quick Start

Document a New Experiment

# Create from template
cp experiments/template.md experiments/experiment-XXX-short-name.md

# Edit with your hypothesis and setup
vim experiments/experiment-XXX-short-name.md

# Run benchmark
cd benchmarks
npm run bench -- --profile=profiles/experiment-XXX.json > ../experiments/experiment-XXX-output.txt

# Update experiment file with results

Search Past Experiments

# Find experiments testing specific parameters
grep -r "max_neighbors" experiments/

# Find experiments reporting recall of 95-99% (case-insensitive; does not match 100%)
grep -ri "recall.*9[5-9]%" experiments/

# List all completed experiments (the template writes the field as "**Status:** Complete")
grep -lF 'Status:** Complete' experiments/*.md

Experiment Guidelines

Before Running

Check for similar past experiments - Don't repeat work
Document hypothesis clearly - Make predictions falsifiable
Plan for automation - Use benchmark profiles, not manual tests
Estimate time/cost - Large benchmarks can take hours

During Execution

Capture raw output - Redirect to experiment-XXX-output.txt
Save result JSON - Link to timestamped result files
Note anomalies - Document anything unexpected immediately
Take screenshots - For interactive visualizations

After Completion

Update experiment status - Mark as Complete
Add to index - Update table above with key finding
Update docs - If defaults change, update PARAMETERS.md
Link from issues/PRs - Reference experiment IDs in commits

Analysis Tools

Compare Experiments

# Compare build times across experiments
jq '.[] | {experiment: .name, build_time: .build_time}' \
  results/experiment-*.json

# Plot recall vs build time
python3 experiments/scripts/plot-pareto.py results/experiment-*.json
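
The contents of `plot-pareto.py` are not shown here, so the following is only a sketch of the selection step behind a recall vs build time Pareto plot. It assumes each result is a dict with `name` and `build_time` (as in the `jq` query above) plus a `recall` field, which is an assumption. The sample values are hypothetical placeholders, not measured results.

```python
def pareto_front(results):
    """Return the results not dominated on (lower build_time, higher recall)."""
    front = []
    best_recall = float("-inf")
    # Sort by build time ascending; on ties, put the higher recall first.
    for r in sorted(results, key=lambda r: (r["build_time"], -r["recall"])):
        # A point survives only if it beats every cheaper point on recall.
        if r["recall"] > best_recall:
            front.append(r)
            best_recall = r["recall"]
    return front


# Hypothetical placeholder data for illustration only.
sample_results = [
    {"name": "a", "build_time": 10.0, "recall": 0.90},
    {"name": "b", "build_time": 12.0, "recall": 0.85},  # slower and worse than "a"
    {"name": "c", "build_time": 20.0, "recall": 0.97},
    {"name": "d", "build_time": 25.0, "recall": 0.97},  # slower than "c", same recall
]
print([r["name"] for r in pareto_front(sample_results)])
```

Only the configurations on the front represent a real trade-off; the others are slower without being more accurate.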

Statistical Significance

# Run t-test between two experiments
python3 experiments/scripts/ttest.py \
  results/experiment-001.json \
  results/experiment-002.json
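
The contents of `ttest.py` are not shown here. As a sketch of what such a comparison could compute, the example below implements Welch's t-test (a choice made for this illustration, since it does not assume equal variances) using only the standard library. The function name `welch_t` and the sample timings are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-trial build times in seconds, for illustration only.
baseline = [10.0, 12.0, 11.0, 13.0, 14.0]
candidate = [8.0, 9.0, 7.0, 10.0, 6.0]
t, df = welch_t(baseline, candidate)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Turning `t` and `df` into a p-value needs a t-distribution CDF, which the standard library lacks. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.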

Best Practices

Hypothesis-Driven

❌ Bad: "Let's try max_neighbors=48 and see what happens"

✅ Good: "Hypothesis: Increasing max_neighbors from 32→48 will improve recall@10 from 95%→97% but increase index size by 50% and build time by 10%"

Reproducible

Use benchmark profiles (JSON configs)
Document exact versions (git commit hash)
Note hardware specs
Seed random number generators
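
The items above can be captured automatically at the start of a run. The following is a minimal sketch, assuming a Python helper is acceptable alongside the benchmark tooling; `git_commit` and `run_metadata` are hypothetical names. Note that `random.seed` only seeds Python's own generator, so the recorded seed would still need to be passed to whatever the benchmark itself uses.

```python
import json
import platform
import random
import subprocess


def git_commit():
    """Return the current commit hash, or "unknown" outside a git checkout."""
    try:
        out = subprocess.run(["git", "rev-parse", "HEAD"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def run_metadata(seed):
    """Seed Python's RNG and collect details needed to reproduce a run."""
    random.seed(seed)  # affects only Python's `random` module
    return {
        "git_commit": git_commit(),
        "seed": seed,
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
    }


meta = run_metadata(seed=42)
print(json.dumps(meta, indent=2))
```

Pasting this JSON into the Test Setup section of an experiment file records the exact version and environment next to the results.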

Incremental

Change one variable at a time (when possible)
Build on previous experiments
Reference prior work

Shareable

Write for future you (6 months from now)
Assume reader doesn't have context
Include enough detail to reproduce exactly

Common Pitfalls

Not documenting baseline - Always measure before/after
Cherry-picking results - Document failures too
Ignoring variance - Run multiple trials, report stddev
Confounding variables - Did something else change? (OS update, etc.)
Premature conclusions - Correlation ≠ causation
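
For the baseline and variance pitfalls, a small summary step makes the numbers explicit. This is a sketch under the assumption that each configuration is run several times and the per-trial timings are available as a list. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def summarize(trials):
    """Return mean, sample standard deviation, and trial count."""
    if len(trials) < 2:
        raise ValueError("need at least two trials to estimate variance")
    return {"mean": mean(trials), "stddev": stdev(trials), "n": len(trials)}


def relative_change(baseline_mean, candidate_mean):
    """Fractional change versus the baseline (negative means lower)."""
    return (candidate_mean - baseline_mean) / baseline_mean


# Hypothetical build times in seconds, for illustration only.
summary = summarize([10.0, 12.0, 14.0])
print(f"{summary['mean']:.2f} +/- {summary['stddev']:.2f} (n={summary['n']})")
print(f"{relative_change(200.0, 150.0):+.0%}")
```

Reporting the trial count next to the standard deviation lets a reader judge how much weight a difference between two experiments can bear.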

Templates

template.md - Blank experiment template
template-param-sweep.md - For parameter sweeps
template-scaling.md - For dataset size scaling tests
template-regression.md - For performance regression investigations
