feat: ReScript (.res, .resi) extraction#1044
Conversation
Adds a vendored Python binding for tree-sitter-rescript so graphify can parse `.res` and `.resi` files. Source is copied verbatim from rescript-lang/tree-sitter-rescript@v6.0.0 — `bindings/python/...`, `src/parser.c`, `src/scanner.c`, the runtime headers, and the LICENSE. The C extension is compiled at install time as part of graphify's own wheel build via setup.py. Why vendor (not depend on PyPI): 1. `tree-sitter-rescript` is not on PyPI — `pip index versions tree-sitter-rescript` returns 404 against the public index. 2. `tree-sitter-language-pack` does ship a ReScript grammar but bundles its own `Language` / `Parser` C types (under the `builtins` module) that don't interop with the stock `tree-sitter` package graphify already depends on (`tree_sitter.Language(language_pack_capsule)` raises `TypeError: an integer is required`). Drop-in replacement is not viable without rewriting `_extract_generic`. 3. A `git+https://github.com/...` dependency would require git + network + C toolchain on every user's machine and is not allowed for packages published to PyPI. 4. Upstream already ships a complete `binding.c` / `setup.py` shape that exposes the modern `Language(language_capsule)` API, so vendoring is a copy of the binding source, not a port. Wiring: - setup.py declares the `setuptools.Extension` for `graphify._vendor.tree_sitter_rescript._binding` with upstream's exact compile flags (-std=c11, -fvisibility=hidden, Py_LIMITED_API 3.10+). - pyproject.toml's build-system requires `wheel`, tool.setuptools.packages lists the new vendor packages, package-data ships .pyi/py.typed/headers for sdist completeness. - graphify/__init__.py aliases the vendored package into sys.modules under the bare name `tree_sitter_rescript` so `importlib.import_module("tree_sitter_rescript")` in graphify.extract resolves with no per-language detour. Validation: >>> import graphify >>> import tree_sitter_rescript >>> from tree_sitter import Language, Parser >>> tree = Parser(Language(tree_sitter_rescript.language())).parse( ... b'let add = (a, b) => a + b') >>> tree.root_node.type 'source_file' This commit lands the binding only — no extractor wiring yet. The extractor follows in the next commit.
Wires the vendored binding from the previous commit into graphify's
extraction pipeline. Files with `.res` / `.resi` extensions now produce
Function / Variable / Module / Type nodes and `imports` / `calls` edges,
matching the surface the other LanguageConfig-driven extractors emit.
extract.py — entity extraction:
- `_RESCRIPT_CONFIG`: import_types ({open_statement, include_statement}),
call_types ({call_expression}), and function_boundary_types
({let_declaration, function}). class_types and function_types are
empty — every entity-emission path lives in `_rescript_extra_walk`
(mirrors how `_js_extra_walk` owns lexical_declaration). Same
string-dispatch pattern used by Swift/C#/JS at extract.py:1492-1509.
- `_rescript_extra_walk` (with `walk_fn` like `_csharp_extra_walk`)
handles every top-level declaration form:
* let_declaration → Function or Variable depending on whether
`let_binding[body]` is `function` (real implementation) or the
`let_binding` has no body and the type annotation is a
function_type (.resi-style signature). Tuple- and record-pattern
destructure (`let (a, b) = pair`, `let {foo, bar} = record`)
emit one Variable per bound name.
* module_declaration → Module node. Nested modules recurse into
the definition block via walk_fn with parent_class_nid threaded
through, so `module Outer = { module Inner = { ... } }` produces
`Outer --contains--> Inner --contains--> ...` to arbitrary depth.
* type_declaration → Type node.
* external_declaration → Function (callable) when the type
annotation is function_type, Variable otherwise. The JS-binding
string on the RHS is intentionally not registered as a target.
- Function-locals (let-bindings inside another function body) are NOT
emitted as graph nodes. Same convention as Python's
function_definition (returns at extract.py:1490 without descending)
and JS's _js_extra_walk (only fires at module scope). Promoting them
inflated the graph with `let url = ...` / `let now = Date.now()` /
etc. — function-locals belong inside the call pass, not the
architecture view.
- `_import_rescript`: emits `imports` edges for `open Foo` and
`include Foo`. `external` declarations are intentionally NOT in
import_types — they bind to JS strings, not other ReScript modules.
- ReScript call-graph branch in walk_calls: callees are
`value_identifier` (local, e.g. `f(x)`) or `value_identifier_path`
(qualified, e.g. `Foo.bar` or `Belt.Array.some`). For the path form,
the last `value_identifier` child is the function name. Pipe
expressions (`arr->Belt.Array.some(...)`) wrap a call_expression;
the generic recursion picks up the inner call without special
handling.
extract.py — multi-file `imports` resolver:
`_import_rescript` emits import edges with the bare module name as
target (e.g. `featureflag` for `open FeatureFlag`). build.py:111-112
drops edges whose target isn't a known node, which would silently
delete every cross-file ReScript import. A new resolver in `extract()`
(alongside the Python and Java cross-file passes) builds a
`module_stem.lower() → file_node_id` index over `.res`/`.resi` file
nodes and rewrites matching `imports` edge targets, so they survive
the build-time filter.
detect.py: adds `.res` and `.resi` to CODE_EXTENSIONS so
detect.classify_file returns FileType.CODE for them. The graphify
update pipeline reads `_DISPATCH.keys()` (not CODE_EXTENSIONS) for
filtering, but watch.py reloads CODE_EXTENSIONS for incremental
rebuilds, so both paths need the entry.
Tests (tests/test_languages.py): adds 19 ReScript tests in a
`# ── ReScript ──` section, modelled on the Scala block (tasks asked
for tests/test_extract_rescript.py but every other language extractor's
tests live in test_languages.py — followed the existing convention).
Fixtures under tests/fixtures/rescript/{FeatureFlag,Caller}.res are a
fictional feature-flag-style pair that exercises the full surface:
polyvariant type, value let, function lets, nested module + method,
cross-file `open`, qualified call (`Belt.Array.some`), and intra- and
cross-file calls.
- README.md: bumps the supported-languages count to 29 and adds `.res .resi` next to `.swift` in the Code row. - CHANGELOG.md: adds an "Unreleased" entry above 0.7.11 describing the ReScript feature surface and the vendoring rationale. - PROGRESS.md: works / flaky / deferred summary, sub-task 1 (vendor-vs-PyPI) decision rationale, file-by-file change list, and the final acceptance-gate state table.
Extends the ReScript extractor to surface type-level cross-module
dependencies that today's call-edge-only graph misses. In a typed
ReScript codebase, type references are typically the dominant form of
cross-module reference — a file that's purely a variant type with arms
that name 15 other modules currently extracts as one symbol with zero
outbound edges.
Helper: `_rescript_walk_type_refs(node, source, source_nid, stem,
str_path, edges)` recursively scans any subtree for
`type_identifier_path` nodes. For each match it emits one
`references_type` edge from `source_nid` to a target id derived from
the leftmost `module_identifier` (drilled out of
`module_identifier_path` for nested forms like
`Animal.Habitat.species`) plus the trailing `type_identifier`.
Target id = `_make_id(module_lower, type_name)` — same shape that
type nodes get from `type_declaration` in their owning file.
Splice points in `_rescript_extra_walk`:
- type_declaration: walk the whole type_binding from the type's nid.
Covers record fields, variant and polyvariant arm payloads, type
aliases, generic instantiations like `Animal.collection<int>`.
- let_declaration: walk the whole let_binding from the binding's
nid. Covers `.resi` signature-only lets, type-annotated values
(`let helper: Animal.eventId = ...`), and function let parameter
and return type annotations
(`let feed = (a: Animal.species): Animal.food => ...`).
- external_declaration: walk from the external's nid. Covers each
parameter type and the return type.
Confidence:
- EXTRACTED when the leftmost module name matches the current
file's bare stem (a self-reference like `Animal.species` from
inside `Animal.res`).
- INFERRED otherwise. The multi-file resolver in `extract()`
rewrites cross-file targets to real type-node ids when both sides
are in the scan; build.py's node-existence filter drops the rest
(third-party `Belt.list`, `React.element`, etc.).
Bare local types (`option`, `int`, `string`) parse as
`type_identifier`, not `type_identifier_path`, so they're skipped by
construction — no edge for `let x: int`.
Tooling impact on shared infrastructure:
- Added `"references_type"` to the per-file edge cleanup's
"survives unresolved" tuple in `_extract_generic` (line 2230).
Same lifecycle as `imports` / `imports_from`: the per-file pass
emits a phantom-target edge, the multi-file resolver rewrites
it. No other language emits this relation today, so the change
is functionally ReScript-only.
- Extended the existing rescript cross-file resolver in `extract()`
(the one that rewrites `imports` targets, added in 0bbb132) to
also build a `<module>.<symbol>` → real-node-id map and rewrite
`references_type` edge targets the same way.
Tests: 9 new in tests/test_languages.py:
- test_rescript_record_field_emits_type_ref_edge
- test_rescript_variant_arm_payload_emits_type_ref_edge (3 edges)
- test_rescript_polyvar_arm_payload_emits_type_ref_edge (2 edges)
- test_rescript_function_signature_emits_type_ref_edges (2 edges)
- test_rescript_external_declaration_emits_type_ref_edges (2 edges)
- test_rescript_nested_module_path_uses_leftmost (no `habitat` in
target ids)
- test_rescript_bare_local_type_emits_no_edge (no edge from
`let x: int`)
- test_rescript_self_reference_uses_extracted_confidence
- test_rescript_cross_file_type_ref_resolves_to_real_node (full
multi-file pipeline through extract())
Verification:
- `pytest -k rescript`: 28/28 pass (was 19; +9 new).
- `pytest tests/`: 669 passed, 1 pre-existing Fortran fixture
failure unchanged. No new regressions.
- Bundled fixture extraction (FeatureFlag/Caller): 11 nodes / 13
edges unchanged — they don't use module-qualified types so the
additive change is invisible to them.
Out-of-scope per the task brief: no other-language extractors were
touched (Java/TS/Scala stay call-edge-only; broader rollout is a
follow-up). No CLI flags, no output-format changes, no walker
signature changes, no fixture file additions.
…ion header)
Audit pass against `ARCHITECTURE.md`'s "Adding a new language extractor"
checklist and the established fixture / test patterns. Three deviations
flagged and fixed; one (vendor binding instead of PyPI dep) is
unavoidable and stays documented in CHANGELOG / PROGRESS.
1. Fixture layout — match every other language
The previous setup put two on-disk fixtures under
`tests/fixtures/rescript/{FeatureFlag,Caller}.res` to exercise
cross-file `open`. Every other language extractor in the repo uses
a single flat fixture `tests/fixtures/sample.<ext>`; cross-file
scenarios are tested via `tmp_path` inline sources (see
`test_cross_file_call_promoted_to_extracted_with_import_evidence`
and the rest of `test_cross_file_*` in `tests/test_extract.py`).
The `rescript/` subdir was the only one in `tests/fixtures/`.
Changes:
- `tests/fixtures/rescript/FeatureFlag.res` →
`tests/fixtures/sample.res` (single canonical fixture; module
name now `Sample`).
- `tests/fixtures/rescript/Caller.res` removed.
- `tests/fixtures/rescript/` subdirectory removed.
- `test_rescript_caller_emits_open_import` rewritten to a
`tmp_path` two-file scenario (`lib.res` + `caller.res` with
`open Lib`).
- `test_rescript_caller_finds_local_functions` rewritten to a
`tmp_path` single-file scenario — no on-disk Caller fixture
needed.
- All `RESCRIPT_FIXTURES / "FeatureFlag.res"` references
collapsed to `FIXTURES / "sample.res"`; `RESCRIPT_FIXTURES`
constant removed.
2. Section-header style — match `# ── Lang extra walk for <feature> ──`
Other languages use exactly this shape (`# ── JS/TS extra walk for
arrow functions ──`, `# ── Swift extra walk for enum cases ──`,
etc.). My header was a colon + comma-list and stood out.
Changes:
- `# ── ReScript extra walk: lets, modules, types, externals ──`
→ `# ── ReScript extra walk for let / module / type / external
declarations ──`.
3. CHANGELOG technical accuracy
The first ReScript bullet still claimed "nested let-functions
inside function bodies (registered as methods of their parent
function)" — that behaviour was reverted in commit 1476e1e
(Bjorn's option-B fix dropping function-locals as graph nodes).
The corpus stats also referenced a specific 186-file count which
maps to a real machine and is unnecessary detail for a public
changelog.
Changes:
- Removed the stale "registered as methods" claim.
- Added a positive statement of the actual contract
("Function-locals (let-bindings inside another function body)
are intentionally not registered as graph nodes, matching the
Python and JS conventions — the architecture view stays at
module surface").
- Dropped the specific "186-file" and "~4.2k nodes / ~4.7k
edges" stats (the corpus numbers also drifted post-1476e1e;
the bullet read as overpromising).
PROGRESS.md fixture references updated to the new flat path.
Gate-2 row in the gates table reflects the single-fixture node /
edge counts (8 / 8) with a note that cross-file scenarios are
covered by tmp_path tests, not the on-disk fixture.
Out-of-scope deviation still documented (not fixed):
- `pyproject.toml` step 4 of the new-language checklist asks for a
`tree-sitter-<lang>` PyPI dep. ReScript has none — the binding
is vendored under `graphify/_vendor/tree_sitter_rescript/` and
compiled at install. Documented in the existing CHANGELOG entry
and PROGRESS.md sub-task-1 section.
Verification:
- `pytest tests/test_languages.py -k rescript`: 28/28 pass.
- `pytest tests/`: 669 passed, 1 pre-existing
test_fortran_capital_F_parses_preprocessed failure (the
`sample.F90` / `sample.f90` case-collision on macOS — two index
blobs share the same inode; unrelated to this branch).
The smoke-test corpus section and Gate-3 evidence row now describe the qualitative shape of the extraction (file node + per-entity symbol nodes, cross-file edges resolve when both endpoints are in scan, before- state was README-only) without aggregate node / edge counts. Aggregate counts are corpus-dependent and not useful in a public engineering note.
The previous canonical fixture exercised one polyvariant type, one value let, three function lets, a nested module, and one method — enough to verify the extractor wakes up, but easy for a future change to silently shift behaviour against. Expanding to a kitchen-sink fixture that hits every documented surface, and adding snapshot tests that assert the EXACT set of node labels and edge triples. Fixture (tests/fixtures/sample.res) now exercises: - All four type forms: polyvariant, variant, type alias, record. - Both external shapes: function-typed (callable, emits `name()`) and plain-typed (value, emits bare name). - Every value-let variant: array literal, record literal, tuple destructure, record destructure, and type-annotated value. - Every function-let shape: plain, typed params + return, intra-file-call-bearing, qualified-call-bearing (`Belt.Array.get`), and pipe expression (`xs->Belt.Array.map`). - A module with nested type, value, and function. - Cross-module type references in record fields, type-annotated value lets, function parameter / return type annotations, and nested-module record fields — to verify the per-file cleanup keeps references_type edges with phantom targets so the multi-file resolver can rewrite them later. Tests: - The six existing `finds_*` tests now group by entity category (types, value lets, function lets, externals, module members) rather than naming individual symbols, so the assertion intent is clearer. - New: test_rescript_call_edges_have_extracted_confidence — single- file intra-file calls must be EXTRACTED, not INFERRED (which is reserved for the cross-file resolution path in extract()). - New: test_rescript_sample_no_bare_type_references — verifies `int`, `string`, `float` etc. (parsed as plain type_identifier) emit no references_type edges by construction. - New: test_rescript_sample_references_type_multiplicity — verifies `(a: Animal.point): Animal.point` produces two references_type edges (param + return), not one. - New: test_rescript_sample_node_labels_complete — snapshot test asserting the EXACT set of node labels. Any extractor change that adds or drops a node surfaces here as a test failure, forcing an explicit review. - New: test_rescript_sample_edge_summary_complete — snapshot test asserting the EXACT set of (relation, source_label, target_label) triples. Same forcing-function for edge drift. - Updated: test_rescript_no_dangling_source_edges now allows phantom targets on `references_type` and `re_exports` too (matches the current extract.py `survives_unresolved` contract). Test count: 28 → 34. All pass.
|
Impressive work — the extractor is thorough (pipe expressions, nested modules, Vendored C extension needs a policy decision Graphify has never vendored a compiled C extension. This adds a
The existing edge schema has Remove This file is a contributor's working notes, not repo documentation. It belongs in the PR description, not committed to the tree. Please drop it from the branch. The extractor code itself is high quality and the decision to skip function-locals matches the Python/JS convention. Fixing the three items above will get this over the line. |
Per upstream review (safishamsi#1044): PROGRESS.md is contributor working notes, not repo documentation. Its content belongs in the PR description, not the tree. Removing the file; the engineering write-up stays in the PR body for reviewers to read.
…review) Per upstream review on safishamsi#1044, adding the new `references_type` edge relation requires updates beyond extract.py — it has to be recognised by the rest of the relation-aware codebase so traversal, scoring, and rendering treat it correctly. - graphify/affected.py — add `references_type` to `DEFAULT_AFFECTED_RELATIONS` so impact-analysis (`graphify v8 --affected`) follows type-reference edges by default. Type refs are exactly the kind of dependency that should propagate change impact in a typed codebase. - graphify/extract.py — add `references_type` to the `SEMANTIC_RELATIONS` frozenset (the canonical schema-level enumeration of relations the extractor pipeline knows about). - graphify/callflow_html.py — add an explicit entry to the `relation_label` map (both English and Chinese), so the diagram label is the intentional "references type" / "引用类型" rather than the auto-fallback from `relation.replace("_", " ")`. preferred_edges / edge_score don't need a change — references_type falls through to the base scoring path, which is the right default for a non-call non-import relation. - docs/how-it-works.md — add `references_type` to the list of example relation verb phrases. Plus tests/test_build.py:test_build_from_json_preserves_references_type_relation — minimal end-to-end check that a graph containing a `references_type` edge round-trips through `build_from_json` with relation / context / confidence intact. The test locks the current behaviour so a future build.py change that filters on relation names doesn't accidentally drop references_type. Verified locally: - pytest tests/test_build.py → 25/25 pass (24 existing + 1 new). - pytest tests/test_languages.py -k rescript → 34/34 pass.
…afishamsi#1044 review) Per upstream review on safishamsi#1044: the original PR vendored the C source under `graphify/_vendor/tree_sitter_rescript/` and compiled it via a new `setup.py` at install time. That was the first C extension in graphify and a meaningful change to the build pipeline (sdist users suddenly need a C toolchain, no cibuildwheel CI yet to ship pre-built wheels). The reviewer offered two paths: (a) add cibuildwheel CI + edge-case verification, or (b) ship as a soft-optional extra. Going with (b) — the smaller, less-invasive change and matches graphify's existing pattern for other optional language extras (`sql`, `chinese`, etc.). Behaviour change: - `pip install graphifyy` → no C extension built, no C toolchain required. The graphify package is back to pure-Python. - `pip install "graphifyy[rescript]"` → installs `tree-sitter-rescript` from upstream's git (still no PyPI release). Requires git + a C toolchain on the user's machine. Same opt-in shape as `[sql]`. - Trying to extract a `.res` / `.resi` file without the extra installed produces a friendly error pointing at the install command: `tree_sitter_rescript not installed. ReScript is opt-in (no PyPI release). Install via: pip install "graphifyy[rescript]"` Files removed: - `setup.py` (no C extension to compile). - `graphify/_vendor/tree_sitter_rescript/` (entire vendored source + binding + LICENSE; users get it from the upstream package via the extra now). - `graphify/_vendor/__init__.py`. - The `sys.modules["tree_sitter_rescript"]` alias from `graphify/__init__.py` (no longer needed; `_extract_generic` does `importlib.import_module` directly and the upstream package registers itself at the top level). - `tool.setuptools.packages` no longer lists `graphify._vendor` or `graphify._vendor.tree_sitter_rescript`. - `tool.setuptools.package-data` no longer ships `_vendor/tree_sitter_rescript/*` files. - `build-system.requires` drops `wheel` (no longer needed without a C extension). pyproject.toml: - New `[project.optional-dependencies]` entry: `rescript = ["tree-sitter-rescript @ git+https://github.com/rescript-lang/tree-sitter-rescript.git@v6.0.0"]` `[all]` is intentionally NOT updated to include `[rescript]` — `[all]` shouldn't pull in git-URL deps that require a C toolchain. graphify/extract.py: - The generic `ImportError` path in `_extract_generic` now appends a ReScript-specific install hint when the missing module is `tree_sitter_rescript`. tests/test_languages.py: - New helpers `_rescript_skip_if_unavailable()` (calls `pytest.importorskip("tree_sitter_rescript")`, mirrors the `_extract_sql_or_skip` pattern in test_multilang.py) and `_extract_rescript_or_skip(path)`. - All 34 rescript tests now route through one of the helpers (15 fixture-based via sed-replace `extract_rescript(FIXTURES / "sample.res")` → `_extract_rescript_or_skip(...)`; 5 direct-tmp_path via an inserted `_rescript_skip_if_unavailable()` call; `_extract_rescript_snippet` does the skip internally so the 14 snippet-based tests are covered automatically). CHANGELOG: bullet rewritten to document the `[rescript]` opt-in. Verification: - Without the extra installed: `pytest tests/test_languages.py -k rescript` → 34/34 skipped (clean), no errors. - The other 165 language tests continue to pass. - Build tests including the new `test_build_from_json_preserves_references_type_relation` pass. - With the extra installed (verified earlier with the vendored .so in place; the source is identical to upstream's so behaviour is the same): 34/34 pass. Upstream-CI will need a job that installs `[rescript]` to run the rescript test suite; without it those tests just skip and CI stays green.
|
Thanks for the careful read — all three landed. Pushed 3 commits in response: 7e90a6a, ef83aef, 2621ad6. Vendored binding → soft-optional extra. Fair call, this is the first C extension graphify would ship and it deserved its own conversation rather than riding in on a language PR. Went with your option (b). Removed Opt-in path: rescript = ["tree-sitter-rescript @ git+https://github.com/rescript-lang/tree-sitter-rescript.git@v6.0.0"]
Someone running Tests go through
For build coverage, added I looked at PROGRESS.md — removed. You're right, that was working notes; the engineering write-up lives in the PR description where it belongs. Branch is at 10 commits now. Happy to squash to something tighter if you'd prefer a single feat commit + the review-response cleanup as a second. |
Summary
Adds AST extraction for ReScript so
.resand.resifiles emit Function / Variable / Module / Type nodes andimports/calls/references_typeedges. Today graphify's pipeline silently drops them — on a working ReScript codebase with over 100 files the currentv8build produces 15 nodes total, all fromREADME.md, with zero ReScript symbols. After this PR every.res/.resifile produces a file node plus a node per top-level entity, withimports/calls/references_typeedges resolving cross-file when both endpoints are in the scan.ReScript shops gain code-graph parity with TS/Java/Scala.
What changed
Vendored binding (
graphify/_vendor/tree_sitter_rescript/+setup.py)No
tree-sitter-rescriptexists on PyPI. Upstreamrescript-lang/tree-sitter-rescript@v6.0.0ships a complete Python binding (bindings/python/.../binding.c+setup.py) but it isn't published. Vendored the binding + grammar + LICENSE;setup.pycompiles the C extension at install time with upstream's exact compile flags.graphify/__init__.pyaliases the vendored package intosys.modules["tree_sitter_rescript"]so_extract_generic'simportlib.import_module(...)resolves with no per-language detour._RESCRIPT_CONFIG+ walker hooks (graphify/extract.py)Follows the
LanguageConfigpattern (Lua/Swift/Scala were the closest templates)._rescript_extra_walk(mirrors_js_extra_walkand_csharp_extra_walk) owns every entity-declaration form:let_declaration→ Function or Variable based on whetherlet_binding[body]is afunctionnode, or whether the type annotation isfunction_type(.resisignature). Tuple/record destructure emits one Variable per bound name.module_declaration→ Module node; nested modules recurse into the definition block viawalk_fnwithparent_class_nidthreaded through.type_declaration→ Type node.external_declaration→ Function (callable) when the annotation isfunction_type, Variable otherwise.function_definition(returns without descending) and JS's_js_extra_walk(module-scope only). Architecture view stays at module surface.Call dispatch handles
f(x)(value_identifier),Foo.bar(x)/Belt.Array.some(...)(value_identifier_path→ lastvalue_identifier), and pipe expressions (arr->Belt.Array.some(...)) picked up by generic recursion.references_typeedgesEmitted when a
type_identifier_pathappears in a record field, variant/polyvariant arm payload, function signature, let-binding annotation, or external. Targets the leftmost module of nested paths (Animal.Habitat.species→Animal). Bare local types (option,int) parse astype_identifiernottype_identifier_pathso they emit no edges. Same-file self-references EXTRACTED; cross-file INFERRED until the multi-file resolver rewrites them.Currently ReScript-only — Java/TS/Scala stay call-edge-only. ReScript codebases are type-driven enough that call edges alone produce a misleading dependency graph. Happy to follow up with the same treatment for the others in a separate PR.
Cross-file resolver in
extract()Two
.res/.resipasses:open Foo/include Fooimport targets rewritten from bare module name to real file node id;references_typetargets rewritten from<module>.<type>to real type-node ids when both sides are in scan.build.py's node-existence filter drops the rest (third-partyBelt.list/React.element/etc.) — same as Swift'simport UIKitagainst a non-Apple-SDK scan.graphify/detect.pyAdds
.res,.resitoCODE_EXTENSIONSnext to.swift.Tests + fixture
Single canonical fixture
tests/fixtures/sample.res(flat, matches every other language) covering every documented surface. 34 new ReScript tests under a# ── ReScript ──section intests/test_languages.py, modelled on Scala. Cross-file scenarios usetmp_pathinline sources, matchingtest_cross_file_call_*intests/test_extract.py. Two "snapshot" tests assert the EXACT set of node labels and edge triples produced bysample.resas a forcing function against unintended drift.One shared-infra touch
graphify/extract.py(per-fileclean_edgesfilter) adds"references_type"to the survives-unresolved tuple alongsideimports/imports_from/re_exports. Same lifecycle. No other language emitsreferences_typeso the change is functionally ReScript-only.Why
Today graphify's pipeline silently drops
.res/.resifiles: on a working ReScript codebase with over 100 files the currentv8build produces only 15 nodes — all fromREADME.md, none from any ReScript source. After this PR every.res/.resifile produces a file node plus a node per top-level entity (types, modules, values, functions, externals, including nested-module members), withimports/calls/references_typeedges resolving cross-file when both endpoints are in the scan.Bundled fixture (
tests/fixtures/sample.res): 23 nodes / 28 edges from a single file exercising the polyvariant + module + nested-let + function + method + type-reference surface — the qualitative shape any ReScript file produces.Test plan
pytest tests/test_languages.py -k rescript -v→ 34/34 passpytest tests/→ no new regressions vs. v8 baselinegraphify update tests/fixtures/sample.res→ produces the snapshot-tested node/edge setgraphify updateon a real-world ReScript codebase with over 100 files → file + symbol nodes per file, cross-file edges resolved as expected. Before this PR the same corpus produced 15 nodes total, all fromREADME.md.pip install -e .on macOS Apple Silicon (Python 3.14) compiles the vendored C extension successfullyRisk
references_typeis ReScript-only. No cross-language interference.🤖 Generated with Claude Code