Releases: IronAdamant/Chisel
v0.14.0
Added
- Claude Code skill (`skills/SKILL.md`): an installable agent skill distilling the day-to-day Chisel protocol (analyze → diff_impact → `chisel run` → triage), v0.13 behavior notes, `source` trust ordering, and empty-result remedies. Copy to `~/.claude/skills/chisel/SKILL.md`; mirrors the Wikifier skill pattern.
Changed
- Agent-facing docs refreshed for v0.13: `docs/AGENT_PLAYBOOK.md` gains v0.13 behavior notes (gitignore-aware scanning + `CHISEL_INCLUDE_IGNORED`, real CLI exit codes, `edge_rebuild_skipped`) and links the skill; README features list now documents gitignore-aware scanning.
v0.13.0
Fixed
- gitignore-aware scanning (resolves `Logged_issues/2026-06-11_directory-scoped-analyze-walks-all-tests.md`): the engine code scan and test discovery walked everything under the project root, including gitignored trees (vendored deps, build output, bulk fixture dirs). On a repo with a 10k-file ignored fixture tree, test discovery found 3,895 test files and updates ran 18+ minutes. Both walks now filter through `git ls-files --cached --others --exclude-standard` (tracked + untracked-but-not-ignored, so working-tree analysis still sees new files) and never traverse ignored trees. Set `CHISEL_INCLUDE_IGNORED=1` to disable; non-git projects are unfiltered as before.
- CLI exit codes: the console script ran `sys.exit(main())`, and `main()` returns the result dict, so every successful `chisel` command exited 1 and dumped the raw dict to stderr, breaking `chisel analyze && ...` scripting. The new `cli_entry` entry point converts results to real exit codes (`status: error|git_error` → 1, everything else → 0; `chisel run` passes through the test runner's code).
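The CLI exit-code fix above can be sketched roughly as follows. This is a minimal illustration, not Chisel's actual `cli_entry`: the helper names `result_to_exit_code` and `cli_entry_sketch`, and the `exit_code` key used for the `chisel run` pass-through, are assumptions made for the example.

```python
import sys

ERROR_STATUSES = {"error", "git_error"}


def result_to_exit_code(result):
    """Map a result dict to a process exit code (hypothetical helper)."""
    if not isinstance(result, dict):
        return 0
    # Assumed key: a wrapped test runner's own code passes straight through.
    if isinstance(result.get("exit_code"), int):
        return result["exit_code"]
    return 1 if result.get("status") in ERROR_STATUSES else 0


def cli_entry_sketch(main):
    """Call main() and exit with an integer, never with the dict itself."""
    # sys.exit(<dict>) prints the dict to stderr and exits with status 1,
    # which is the failure mode described in the release note.
    sys.exit(result_to_exit_code(main()))
```

Keeping `main()` returning a dict and converting only at the entry point leaves programmatic callers unaffected while giving shell scripts a usable status.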
Changed
- `build_test_edges` is ~76× faster with byte-identical output (A/B-verified against the previous algorithm on the same input: 14,667 edges, 17.6s → 0.23s). Match results are memoized per module path and dependency extraction per test file; previously every dep of every test unit scanned every code unit (33M `_matches_import_path` calls on this repo alone, the dominant cost of `analyze`/`update` and the real culprit behind multi-minute updates on monorepos).
- `update()` skips edge rebuilding when nothing changed: with no changed files and no new commits, the discover/rebuild phases are skipped and the result includes `edge_rebuild_skipped: true`. Per-function `git log -L` churn is now restricted to changed files (plus files touched by new commits) instead of every function in the project. Net on this repo: no-op update ~18s → 0.14s; single-file update ~19s → 1.4s; full analyze 19.4s → 2.3s.
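The per-module-path memoization behind the `build_test_edges` speedup might look something like the sketch below. Everything here is illustrative: `units_for_module`, `build_edges`, the toy `CODE_UNITS` data, and the suffix-matching rule are assumptions, not the project's real matching logic.

```python
from functools import lru_cache

CODE_UNITS = ("pkg/core.py", "pkg/utils/io.py", "pkg/api.py")  # toy data
calls = {"scans": 0}


@lru_cache(maxsize=None)
def units_for_module(module_path):
    """Scan every code unit once per distinct module path, then reuse the result."""
    calls["scans"] += 1
    suffix = module_path.replace(".", "/") + ".py"  # simplified matching rule
    return tuple(unit for unit in CODE_UNITS if unit.endswith(suffix))


def build_edges(test_deps):
    """test_deps maps a test id to the module paths it imports."""
    edges = []
    for test_id, modules in test_deps.items():
        for module in modules:
            for unit in units_for_module(module):
                edges.append((test_id, unit))
    return edges


edges = build_edges({
    "test_a": ["pkg.core", "pkg.api"],
    "test_b": ["pkg.core"],
    "test_c": ["pkg.core", "pkg.utils.io"],
})
```

Five dependencies are resolved here, but the code-unit scan runs only once per distinct module path (three times), which is the effect the memoization is after.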
v0.12.0
Added
- `cancel_job` MCP schema: the tool existed in the dispatch table but had no `_TOOL_SCHEMAS` entry, so it was invisible in `GET /tools` and stdio tool listings (callable only if an agent already knew it existed). All 26 tools are now advertised; a regression test enforces schema/dispatch parity.
- `shard` parameter exposed end-to-end: `analyze`, `update`, and `start_job` now accept `shard` through the MCP schemas + dispatch table, and the CLI gains `--shard` on `analyze`, `update`, and `start-job`. Previously the engine supported it but neither interface forwarded it, despite `docs/AGENT_PLAYBOOK.md` documenting the usage. `start_job` also validates unknown shard keys up front and returns a clean error.
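A schema/dispatch parity check of the kind mentioned above can be as small as a set comparison. The sketch below uses toy stand-in tables; `check_parity`, `TOOL_SCHEMAS`, and `DISPATCH` are hypothetical names and contents, not the project's real structures.

```python
def check_parity(schemas, dispatch):
    """Return (unadvertised, undispatchable) tool names; both empty means parity."""
    schema_names = set(schemas)
    dispatch_names = set(dispatch)
    unadvertised = sorted(dispatch_names - schema_names)
    undispatchable = sorted(schema_names - dispatch_names)
    return unadvertised, undispatchable


# Toy stand-ins for the real tables (contents are illustrative only).
TOOL_SCHEMAS = {"analyze": {"type": "object"}, "update": {"type": "object"}}
DISPATCH = {
    "analyze": lambda **kwargs: {},
    "update": lambda **kwargs: {},
    "cancel_job": lambda **kwargs: {},  # callable, but never advertised
}

missing_schema, missing_handler = check_parity(TOOL_SCHEMAS, DISPATCH)
```

A regression test would then assert that both lists are empty for the real tables.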
Fixed
- `chisel run` was completely broken (two independent bugs, found by dogfooding): (1) its positional argument was named `command`, which collided with the subparser dest, so `main()` crashed with `TypeError: unhashable type: 'list'` on every invocation; (2) `_PYTEST_RESULT_RE` anchored the status at end-of-line, but real `pytest -v` output appends a progress suffix (`PASSED [ 16%]`), so even a direct call never parsed real output. Both fixed; `chisel run -- pytest tests/` now records results end-to-end, with regression tests through `main()` and against real-format output.
- Cross-thread shard leak: `_with_shard()` mutated shared engine attributes (storage/lock/impact/process lock), so a background job analyzing one shard could silently redirect concurrent tool calls on other threads (HTTP MCP server, parallel agents) to the wrong shard DB. The override is now thread-local via properties; a regression test exercises two threads concurrently.
- Unknown-shard errors crashed: the error path used `sorted(self._shard_engines)` over keys containing `None`, raising `TypeError` instead of returning the error dict.
- Swift Testing `@Test` detection was dead code: a framework-name mismatch (`swift_test` vs the actual `xctest`) made the Swift regex unreachable, so Swift `@Test` functions were only matched by accident via the Java regex. Now routed correctly, with support for `@Test("description", ...)` arguments, stacked attributes (`@MainActor`, `@available(...)`), and same-line `@Test func f()` declarations.
- Java/Kotlin parameterized annotations hid test methods: `@ValueSource(ints = {1, 2, 3})`, `@CsvSource(...)`, and `@Test(expected = ...)` broke the annotation regex, so common JUnit 5 parameterized tests were not detected.
- C# stacked attributes hid test methods: `[Fact]` followed by `[Trait("a", "b")]` (standard xUnit categorization) was not matched.
- `test_gaps` could erase all gaps in barrel/re-export-heavy projects: when static-import resolution claimed every gap file was covered, the filter emptied the list and reported a false "all code units have test coverage". If filtering would drop everything, the original gaps are now preserved.
- `suggest_tests` working-tree source labeling: a constant ternary labeled every stem-matched suggestion `working_tree`; DB-known tests matched by stem now correctly report `source: "fallback"`.
- Storage dir validation: passing `:memory:` (or a `file:` URI) as `CHISEL_STORAGE_DIR`/`storage_dir` silently created a literal `:memory:` directory on disk. These are now rejected with a clear `ValueError` (multi-process coordination requires an on-disk lock file).
- `_quantize_gap` clamps to 1.0 to guard against float rounding pushing coverage gap above the documented bound.
- README: replaced 23 typographic smart quotes; the MCP config JSON example was not copy-paste valid.
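To illustrate the end-of-line anchoring problem in the `chisel run` fix, the sketch below contrasts a strictly anchored pattern with one that tolerates the `pytest -v` progress suffix. Both patterns are hypothetical; the real `_PYTEST_RESULT_RE` is not reproduced in these notes.

```python
import re

# Hypothetical patterns for illustration only.
ANCHORED_RE = re.compile(r"^(\S+::\S+)\s+(PASSED|FAILED|ERROR|SKIPPED)$")
TOLERANT_RE = re.compile(
    r"^(\S+::\S+)\s+(PASSED|FAILED|ERROR|SKIPPED)(?:\s+\[\s*\d+%\])?\s*$"
)

# A line shaped like real `pytest -v` output, including the progress suffix.
line = "tests/test_engine.py::test_update PASSED                     [ 16%]"

anchored = ANCHORED_RE.match(line)  # fails: the status is not at end-of-line
tolerant = TOLERANT_RE.match(line)  # succeeds: the suffix is optional
```

Testing the parser against captured real-format output, rather than hand-written status lines, is what catches this class of bug.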
Changed
- `next_steps`: the `diff_impact` hint for `record_result` now includes concrete `test_id`/`passed` arguments instead of an empty args dict (`record_result` has required fields).
Removed
- Dead code: `Storage._ensure_main_conn()`, `ImpactAnalyzer._import_graph_undirected_neighbors()`, the unreachable `swift_test` branches, and a redundant duplicated conditional in `_suggest_tests_impl`.
v0.11.0 — Phase 15/16 MCP gap closures (file locks, record_result visibility, uniform risk warnings, working_tree coupling)
Added
- File lock integration: Advisory locks (`acquire_file_lock` etc.) are now visible in `analyze`/`update` (`active_file_locks`, `locked_files_sample`) and `risk_map`/`triage` (`locked_by` on files + `active_locks`/`lock_claims` in `_meta`). Fully wired to the main DB and working-tree flows.
- Observable `record_result`: `suggest_tests` now returns `failure_rate` and `failure_boost` on every item so the effect of recording flaky tests is directly visible.
- Uniform risk detection: `_meta` in `risk_map` and `triage` now includes `uniform_risk_groups`, `max_identical_risk_files`, and warnings when multiple files share identical risk scores (the exact gap reported by the Phase 15 CrossMCP probes).
Changed
- `coupling` supports `working_tree=true`: On-disk static import extraction for untracked files → immediate non-zero `import_coupling`/`import_partners` (huge win during active refactoring).
- `stale_tests` now accepts `working_tree` (MCP parity).
- Job system robustness: Guard against thread launch failures after the in-progress flag is set (reduces "background thread not returning" errors).
Fixed
- Multiple surfaces for parallel-agent + heavy working-tree scenarios, as stress-tested by the full 25-tool Phase 15 validation matrix in RecipeLab_alt (`ChiselJobLockFlakinessOrchestrator`, 200+ file moves, BaseImporter, RouteLoader split, etc.).
Grok Build edited this iteration while closing the chisel_open.md findings (renamed to chisel_closed.md).
Full details: CHANGELOG.md
v0.10.0 — Dynamic-require edges, symbol-collision fix, cycle SCC
Fixes 5 issues surfaced by external MCP testing in chisel_open.md, plus a substantial new feature for dynamic plugin loaders.
Bug fixes
- `suggest_tests` symbol-collision: the name-only edge fallback (tier 4) leaked across files, so a bare `dispatch()` call edged to every file defining a `dispatch` symbol. Now scoped to files the test actually imports. Eliminates `loadPlugin`/`dispatch`/`listFunctions` reproductions.
- `diff_impact` `stale_db` over-trigger: tracked diff files are now filtered by indexed code extensions, so `.md`/`.json`/`.db` edits no longer flag `stale_db`.
- Cycle false-positives: `_find_circular_dependencies` was being fed undirected neighbors, which collapsed bow-tie DAGs into giant fake cycles. New `get_imported_files_batch` (directed, hard edges only) is now used for cycle detection.
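The cycle false-positive is easy to reproduce on a toy graph. The sketch below runs Tarjan's strongly-connected-components algorithm on a bow-tie DAG, first with directed edges and then with the same edges symmetrized into undirected neighbors. It is a generic illustration under assumed names (`strongly_connected_components`, `cycles`), not the project's `_find_circular_dependencies`.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps node -> iterable of successors."""
    index, low = {}, {}
    stack, on_stack, result = [], set(), []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for nxt in graph.get(node, ()):
            if nxt not in index:
                visit(nxt)
                low[node] = min(low[node], low[nxt])
            elif nxt in on_stack:
                low[node] = min(low[node], index[nxt])
        if low[node] == index[node]:  # node is the root of a component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            result.append(sorted(component))

    for node in list(graph):
        if node not in index:
            visit(node)
    return result


def cycles(graph):
    """Components with more than one member are import cycles."""
    return [c for c in strongly_connected_components(graph) if len(c) > 1]


# Bow-tie DAG: a and b import c; c imports d and e. No real cycle exists.
directed = {"a": ["c"], "b": ["c"], "c": ["d", "e"], "d": [], "e": []}
undirected = {node: set() for node in directed}
for src, targets in directed.items():
    for dst in targets:
        undirected[src].add(dst)
        undirected[dst].add(src)

directed_cycles = cycles(directed)
undirected_cycles = cycles(undirected)
```

With directed edges no cycle is reported; with undirected neighbors every edge becomes a two-way link and the whole connected component collapses into one fake cycle.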
New feature: dynamic-require edge resolution
`require()` patterns the static parser can partially see are now first-class import edges, scaled by confidence:
- Hard imports (`import`/`from … import`) → confidence 1.0
- Tainted imports (`const P = './x'; require(P)`) → confidence 1.0 (variable taint already resolved)
- Dynamic imports (template literal, string concat, `path.join(__dirname, 'plugins', name)`) → confidence 0.2–0.4
Schema migration: `import_edges` gains a `confidence REAL NOT NULL DEFAULT 1.0` column (existing rows default to 1.0, fully backward compatible).
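A migration of this shape can be sketched with the standard `sqlite3` module. The pre-migration table layout (`source`, `target`) and the `migrate` helper are assumptions for the example; only the added column definition comes from the note above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed pre-migration shape; the real import_edges columns are not listed here.
conn.execute("CREATE TABLE import_edges (source TEXT NOT NULL, target TEXT NOT NULL)")
conn.execute("INSERT INTO import_edges VALUES ('a.js', 'b.js')")


def migrate(connection):
    """Add the confidence column once; safe to call repeatedly."""
    columns = [row[1] for row in connection.execute("PRAGMA table_info(import_edges)")]
    if "confidence" not in columns:
        connection.execute(
            "ALTER TABLE import_edges ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0"
        )


migrate(conn)
migrate(conn)  # second call is a no-op
conn.execute("INSERT INTO import_edges VALUES ('a.js', 'plugins/x.js', 0.3)")
rows = conn.execute(
    "SELECT target, confidence FROM import_edges ORDER BY target"
).fetchall()
```

Because the column carries a non-NULL default, the row inserted before the migration reads back with confidence 1.0 and needs no backfill.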
The closure traversal in `get_impacted_tests` accumulates `_IMPORT_HOP_DECAY × edge_confidence` per hop, so tests of dynamically-loaded plugins now surface, at proportionally lower relevance and tagged via dynamic `require()` in the reason text.
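As a rough model of that accumulation, the sketch below multiplies a relevance score by a hop decay and the edge confidence at every hop, keeping the best score per file. The decay value of 0.5, the function name `reachable_scores`, the traversal direction, and the best-path policy are all assumptions; the notes only state that the two factors are combined per hop.

```python
import heapq

IMPORT_HOP_DECAY = 0.5  # assumed value; the real constant is not given in these notes


def reachable_scores(edges, start):
    """Best-path relevance from `start` to every file reachable over import edges.

    edges maps file -> list of (imported_file, confidence).
    """
    best = {start: 1.0}
    heap = [(-1.0, start)]  # max-heap via negated scores
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, 0.0):
            continue  # stale heap entry
        for nxt, confidence in edges.get(node, ()):
            candidate = score * IMPORT_HOP_DECAY * confidence
            if candidate > best.get(nxt, 0.0):
                best[nxt] = candidate
                heapq.heappush(heap, (-candidate, nxt))
    return best


edges = {
    "test_app.js": [("app.js", 1.0)],
    "app.js": [("loader.js", 1.0)],
    "loader.js": [("plugins/csv.js", 0.25)],  # dynamic require, low confidence
}
scores = reachable_scores(edges, "test_app.js")
```

Under these assumptions the plugin is still reached, but its score is a quarter of what a hard import at the same depth would give.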
Safety guards:
- Cycle detection uses `min_confidence=1.0`, so soft edges can't create false cycles.
- `risk_map` proximity adjustment uses hard edges only, so soft edges don't understate `coverage_gap`.
- Fan-out cap of 50 files per dynamic-require dep; the pattern still surfaces via `unknown_require_count` and `hidden_risk_factor` even when edges aren't emitted.
New regex: `_JS_REQUIRE_PATH_JOIN_RE` captures `require(path.join(__dirname, '<dir>', var))`, which was previously invisible.
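A pattern for this form could look roughly like the one below. It is a hypothetical reconstruction written for illustration; Chisel's actual `_JS_REQUIRE_PATH_JOIN_RE` may differ in what it accepts and in its group names.

```python
import re

# Hypothetical pattern; the real _JS_REQUIRE_PATH_JOIN_RE is not shown in these notes.
REQUIRE_PATH_JOIN_RE = re.compile(
    r"""require\(\s*path\.join\(\s*__dirname\s*,\s*
        (['"])(?P<dir>[^'"]+)\1\s*,\s*
        (?P<var>[A-Za-z_$][\w$]*)\s*\)\s*\)""",
    re.VERBOSE,
)

source = "const plugin = require(path.join(__dirname, 'plugins', name));"
match = REQUIRE_PATH_JOIN_RE.search(source)

# A fully static join has a string literal in the last position, so the
# variable group does not match and this pattern leaves it alone.
static = REQUIRE_PATH_JOIN_RE.search("require(path.join(__dirname, 'a', 'b'))")
```

The captured directory is what lets a resolver enumerate candidate files under it, subject to the fan-out cap listed under the safety guards.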
Agent UX
- `suggest_tests` schema now documents source confidence ranking: `hybrid > direct > import_graph > co_change > static_require > working_tree > fallback`.
- `dispatch_tool` appends a `warn_eval_used` next-step when `suggest_tests` is queried on a JS/TS file containing `eval(...)` or `new Function(...)`, so agents now see when the import graph is necessarily incomplete for that file.
Tests
795 passed (was 777 before this release). +18 regression tests covering symbol collision, diff_impact extension filter, directed-SCC cycle detection, dynamic-require resolution (5 forms), fan-out cap, proximity-vs-soft-edges, and eval-warning surfacing.
v0.9.2 — CI lint fixes
Fixed
Resolved 29 ruff errors that were blocking the CI matrix. No behavior changes.
- `chisel/engine.py`: Moved `JobCancelledError` class and `logger` assignment below the import block (was triggering 12× E402). Removed unused `StaticImportIndex` import (F401). Removed unused `lang = self.mapper.detect_framework(tf)` local (F841).
- `chisel/mcp_server.py`: Removed unused `from socketserver import ThreadingMixIn` (F401).
- `examples/extractors/lsp_symbol_extractor.py`: Removed unused `import uuid` (F401).
- `examples/extractors/swift_syntax_extractor.py`: Added `# noqa: F403 / F405` on the star import and the symbols it provides (intentional: optional third-party dep).
- `tests/test_language_frameworks.py`: Removed unused `import os` (F401).

`ruff check .` now passes; 777 tests pass locally.
Install
```bash
pip install chisel-test-impact==0.9.2
```
v0.9.1 — silent-swallow logging, auto_update hints, doc parity
Fixed
- Silent exception swallowing: `engine._load_shard_config` now logs a warning when `.chisel/shards.toml` fails to parse instead of silently returning an empty config. `storage._execute`/`_executemany` emit a warning when `SQLITE_BUSY` persists past the retry cap (in addition to existing debug logs). `static_test_imports.StaticImportIndex` debug-logs the path and error when an untracked test file can't be read, instead of silently skipping.
- `_try_auto_update` race window: `_scan_code_files()` now runs inside the exclusive lock held by `_try_auto_update`, so concurrent writers can't add files between the scan and the hash-based change check.
- `auto_update` skip reasons surfaced to agents: When `auto_update=True` is skipped (bg job running or >50 files changed), the response includes an explicit `auto_update_skip_reason` field and a reason-specific hint. Applied to `suggest_tests` and `diff_impact` stale-DB envelopes; `test_gaps` logs a warning. `risk_map` and `triage` already exposed this via `_meta`.
Changed
- Documentation parity sweep: Tool count (22/24 → 26 = 20 functional + 6 file-lock), CLI subcommand count (17/18 → 28), and SQLite table count (10/13 → 17) corrected across CLAUDE.md, README.md, CONTRIBUTING.md, ARCHITECTURE.md, COMPLETE_PROJECT_DOCUMENTATION.md, and `wiki-local/spec-project.md`. ARCHITECTURE.md tool table gained `optimize_storage` and `cancel_job` and fixed the file-lock tool names (`acquire_lock` → `acquire_file_lock`, etc.) to match `schemas.py`.
v0.9.0 — Monorepo SQLite sharding, auto-fallback, chisel run
Highlights
Added
- Monorepo SQLite sharding: Large repos can shard analysis data across multiple SQLite databases. Set `CHISEL_SHARDS=frontend,backend` (or create `.chisel/shards.toml`) to split data by top-level directory. All query tools auto-aggregate; write tools route to the correct shard by file path.
- `analyze` auto-fallback to background job: With `force=True` on repos with >300 code files, `analyze` automatically queues a background job and returns `{status: "auto_queued", job_id: ..., kind: "analyze"}` to avoid MCP timeouts.
- `exclude_new_file_boost` parameter: `risk_map` and `triage` accept `exclude_new_file_boost=True` to suppress the 0.5 new-file boost for stable long-term audits.
- `auto_update` parameter for read-only tools: `diff_impact`, `suggest_tests`, `risk_map`, `test_gaps`, and `triage` accept `auto_update=True`; Chisel then does a lightweight inline `update()` if the DB is stale. Capped at 50 changed files; skipped when a bg job is running.
- `chisel run` CLI subcommand: `chisel run -- <test-command>` runs tests and calls `record_result` for each detected test. Supports pytest and Jest out of the box; Go and Rust scaffolded.
- Extractor plugin examples: `examples/extractors/tree_sitter_js_extractor.py`, `swift_syntax_extractor.py`, `lsp_symbol_extractor.py`, plus `docs/EXTRACTOR_ECOSYSTEM.md`.
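Routing a write to a shard by top-level directory can be pictured as below. This is a sketch under assumptions: `parse_shards` and `shard_for_path` are hypothetical helpers, and returning `None` for paths outside every shard (meaning the default database) is a guess at the fallback, not documented behavior.

```python
import os
from pathlib import PurePosixPath


def parse_shards(env=None):
    """Read shard names from CHISEL_SHARDS (comma-separated top-level directories)."""
    env = os.environ if env is None else env
    raw = env.get("CHISEL_SHARDS", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


def shard_for_path(file_path, shards):
    """Return the shard owning file_path, or None for the unsharded default DB."""
    parts = PurePosixPath(file_path).parts
    if parts and parts[0] in shards:
        return parts[0]
    return None  # assumed fallback for files outside every shard


shards = parse_shards({"CHISEL_SHARDS": "frontend,backend"})
frontend_shard = shard_for_path("frontend/src/app.ts", shards)
docs_shard = shard_for_path("docs/readme.md", shards)
```

Query tools would then fan out over every shard and merge results, while writes touch only the database selected here.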
Changed
- Documentation: README, `docs/LLM_CONTRACT.md`, `docs/AGENT_PLAYBOOK.md`, `docs/CUSTOM_EXTRACTORS.md`, and ARCHITECTURE updated for sharding, auto-fallback, and the extractor ecosystem.
See CHANGELOG.md for the full list.
v0.8.3 — version sync fix
Fixed
- Version sync: `chisel.__version__` was out of sync with `pyproject.toml` in the 0.8.2 release, causing CI failures. Realigned to 0.8.3.
v0.8.2 — optimize_storage, job cancellation, framework fixtures
Highlights
Added
- `optimize_storage` MCP tool: Runs `PRAGMA optimize` and conditional `VACUUM` when the WAL grows large.
- Incremental import graph rebuilds: `_rebuild_import_edges()` now only rebuilds edges for changed files (`O(all_files)` → `O(changed_files)`). Keeps 1k+ file monorepo updates under 3 seconds.
- Directory-scoped `suggest_tests`: accepts a `directory` parameter and aggregates suggestions for all code files under that path.
- Background job cancellation & events: `cancel_job` tool, `cancel_requested_at` flag, `JobCancelledError`, and a `job_events` table. `analyze()`/`update()` check for cancellation at phase boundaries.
- Framework fixture test suite (`tests/test_language_frameworks.py`): C#, Java, Rust, Swift, and Go module-aware import resolution.
- `risk_map` `working_tree` parameter: includes untracked code files in risk scoring with a `new_file_boost` of 0.5 so new files surface.
- `test_gaps` working-tree elevation: gaps from untracked files sort to the top of the list.
- `suggest_tests`/`diff_impact` directory-aware stem matching: same-directory tests strongly preferred over fuzzy substring matches.
- `diff_impact` stale-DB detection: returns `{status: "stale_db", ...}` when changed files aren't in the DB.
- `suggest_tests` auto-fallback: self-healing for newly tracked/created files with no edges.
- `start_job`/`job_status` progress tracking: jobs report `progress_pct` (0–100).
- Heuristic edge backfill during `analyze`/`update`: filename-based edges auto-created for test files with no DB edges.
- Project fingerprint: stored in `meta.project_fingerprint`; warns on cross-project DB reuse.
- MCP timeout hints: `tool_analyze`/`tool_update` recommend `start_job` for large repos.
Changed
- Risk formula: `new_file_boost` (0.0/0.5) added. Files with no history and no tests score ~0.75 instead of ~0.25.
- Coupling formula: import-graph coupling is now first-class: `max(cochange, import, 0.5*cochange + 0.5*import)`.
- `risk_map` defaults: `coverage_mode="line"`, `proximity_adjustment=True`.
- Single-author co-change threshold halving: solo-dev commit patterns now surface coupling signal.
- Coverage gap granularity: 4 → 20 quantization steps (0.05 increments).
- Risk reweighting threshold: triggers on 2+ uniform components or any zero-valued uniform component.
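The coupling formula listed above translates directly into code. The function and parameter names in this sketch are assumptions (`import` is a reserved word in Python, hence `import_coupling`); only the expression itself comes from the release note.

```python
def coupling(cochange, import_coupling):
    """Combine co-change and import-graph coupling as stated in the release note."""
    blended = 0.5 * cochange + 0.5 * import_coupling
    return max(cochange, import_coupling, blended)


# A new file with no commit history but a strong import relationship.
new_file_score = coupling(0.0, 0.8)

# A pair that changes together often but shares few imports.
cochange_heavy_score = coupling(0.6, 0.2)
```

Since the blended term is an average of the two inputs, it never exceeds the larger one, so as written the result equals whichever signal is stronger. That is what lets import coupling surface for files with no co-change history.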
Fixed
- SQLite concurrency stability in `storage.py` (restored `with self._conn as conn:` wrappers).
- `risk_map` crash with `working_tree=true` (`KeyError: 'heuristic'`).
- `suggest_tests`/`diff_impact` timeouts under working-tree load (`StaticImportIndex` caching).
- `storage.py` read-only transaction error on SELECT queries.
See CHANGELOG.md for the full list.