fix(extract): resolve TypeScript wildcard path aliases (@*, @*/foo) #927
ghaiat-yoobic wants to merge 2 commits into
Two related bugs in _read_tsconfig_aliases / _resolve_js_import_target
that left wildcard aliases unusable on real-world Angular/Nx monorepos
where most cross-pillar imports go through a generic catch-all alias.
## What was broken
`_read_tsconfig_aliases` was stripping trailing /* from both the alias
key and the target. Patterns like:
"@*": ["features/*/src/"]
"@*/interfaces": ["features/*/src/interfaces.ts"]
became { "@": "<repo>/features" } in the aliases dict, losing the
substitution group entirely.
`_resolve_js_import_target` then matched aliases by literal prefix
only — so `@communicate/documentv2` matched the bare `@` prefix and
joined the rest directly to a stripped target. No wildcard substitution,
no real file resolved, edge collapsed to bare basename `documentv2`.
## Fix
1. `_read_tsconfig_aliases` keeps both the alias key and the target
verbatim when either contains *. Non-wildcard aliases still get the
trailing-/ strip they had before.
2. `_resolve_js_import_target` splits wildcard aliases on the * into
(prefix, suffix), matches the import as raw.startswith(prefix) +
raw.endswith(suffix) with at least one captured char, then substitutes
the captured segment into the target's *.
3. Aliases are sorted by specificity before matching, per the TypeScript
path-mapping spec: exact (non-wildcard) entries win over wildcard
entries; among wildcards, the longer non-wildcard prefix wins. So
`@common/integration/*` beats `@*` when both could match.
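
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `specificity_key` and `resolve_alias` are hypothetical names, the alias table maps each key to a single target string (real tsconfig `paths` values are lists), non-wildcard aliases are matched exactly for simplicity, and the `@common/integration/*` target is invented for the example.

```python
from typing import Dict, Optional, Tuple


def specificity_key(alias: str) -> Tuple[int, int]:
    """Exact aliases sort first; wildcards sort by longest literal prefix."""
    if "*" not in alias:
        return (0, -len(alias))
    prefix = alias.split("*", 1)[0]
    return (1, -len(prefix))


def resolve_alias(raw: str, aliases: Dict[str, str]) -> Optional[str]:
    """Return the substituted target for the most specific matching alias."""
    for alias in sorted(aliases, key=specificity_key):
        target = aliases[alias]
        if "*" not in alias:
            # Simplification in this sketch: exact match only.
            if raw == alias:
                return target
            continue
        prefix, suffix = alias.split("*", 1)
        # The wildcard must capture at least one character.
        if len(raw) <= len(prefix) + len(suffix):
            continue
        if raw.startswith(prefix) and raw.endswith(suffix):
            captured = raw[len(prefix):len(raw) - len(suffix)]
            return target.replace("*", captured, 1)
    return None


# Hypothetical alias table; only the "@*" entry comes from the PR text.
aliases = {
    "@*": "features/*/src/",
    "@common/integration/*": "libs/common/integration/*",
}
```

With this table, `resolve_alias("@communicate/documentv2", aliases)` falls through to the `@*` catch-all, while `@common/integration/foo` is claimed by the longer-prefix alias before `@*` is tried.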
## Phantom-node fix
Even when alias resolution produced the right target Path, the node id
was derived from the absolute path (`/Users/.../features/...`) while
the file extractor uses `str(path)` against the project-relative path
it was given. The ids never matched, so alias-resolved imports pointed
at phantom nodes.
New `_resolved_path_to_nid` helper relativizes the resolved Path to
Path.cwd() before computing the node id, falling back to the absolute
form only for paths outside cwd.
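
A rough sketch of the relativize-then-hash idea, under stated assumptions: `make_id` below is a hypothetical stand-in for the real `_make_id` (its exact normalisation is not shown in this PR text), and the `root` parameter is added here only so the example can run without depending on the current directory.

```python
import re
from pathlib import Path
from typing import Optional


def make_id(text: str) -> str:
    """Hypothetical stand-in for _make_id: lowercase, non-alphanumerics to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def resolved_path_to_nid(resolved: Path, root: Optional[Path] = None) -> str:
    """Compute a node id from the project-relative form of `resolved`."""
    root = Path.cwd() if root is None else root
    try:
        # Same relative form the file extractor sees for in-project files.
        rel = resolved.relative_to(root)
    except ValueError:
        # Path lies outside the project root: keep the absolute form.
        rel = resolved
    return make_id(str(rel))
```

Under these assumptions, an absolute path inside the root and the relative path handed to the file extractor produce the same id, which is the property the phantom-node fix relies on.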
## Measured impact
Same Nx/Angular monorepo (7,759 TS files, ~50 tsconfig path entries
including `@*` catch-all and `@*/interfaces`/`@*/widgets`/`@*/test-data`):
- AST edges: 164,041 → 193,970 (+30K wildcard-resolved imports)
- Final graph edges (after build_from_json dedup): 99,871 → 107,369
- Specific failing case from triage:
  `import { FileChipComponent } from '@communicate/documentv2'`
  Before: zero imports_from edges on the importing file, only internal
  contains edges.
  After: 1 imports_from edge to
  `features_communicate_documentv2_src_index_ts` + 1 symbol-level
  imports edge to FileChipComponent.
Bare/scoped imports (npm packages, unmatched aliases) keep the existing
basename fallback so external imports continue to behave the same way.
When tsconfig aliases resolve through a barrel (`features/<topic>/src/
index.ts` that does `export { X } from './x'`), the file-level
`imports_from` edge correctly targets the barrel — but the symbol-level
`imports` edge was synthesized as `_make_id(barrel_stem, symbol)`. That
id matches no node (the symbol's real id is derived from the file where
it's actually defined), so build_from_json silently dropped every
barrel-mediated symbol edge.
Common monorepo pattern (Nx, Angular, Lerna, Turborepo). On a 7,761-file
Angular monorepo: every class consumed via `@pillar/topic` barrels
stayed degree-1, masking the actual cross-pillar coupling that the
graph's community detection and god-node analysis are meant to surface.
## Approach
1. New `_export_js` handler emits a record per `export ... from '...'`
statement (named with optional aliasing, `export *`, `export * as ns`).
Wired into JS_CONFIG / TS_CONFIG via new `export_types` /
`export_handler` LanguageConfig fields. `_extract_generic` now
exposes the export handler dispatch in its walk; the `reexports`
accumulator must be initialised before `walk(root)` since every TS
file with re-exports hits it.
2. `_import_js` attaches `_reexport_target_path` (the import target
file's absolute path) and `_reexport_symbol` (the imported name) to
every symbol-level `imports` edge as metadata.
3. `extract()` post-processes after id_remap: builds
`named_reexports[(from_path, exposed)] = (to_path, original)` and
`star_reexports[from_path] = [to_paths]`. For each symbol-level edge,
calls `_resolve_reexport_chain` (visited set, max depth 16) to find
the symbol's defining file. If the synthesized target matches an
existing node id, the edge is rewritten. Metadata fields are
stripped before the dict is serialised.
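
Step 3 can be illustrated with a small sketch. The record field names (`kind`, `from_path`, `exposed`, `to_path`, `original`) and both function names are assumptions made for this example; only the map shapes, the visited set, and the depth limit of 16 come from the description above. Star re-exports are collected but not chased here.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (file path, symbol name)


def build_reexport_maps(records: Iterable[dict]):
    """Index export records into the named and star re-export maps."""
    named: Dict[Key, Key] = {}
    star: Dict[str, List[str]] = {}
    for rec in records:
        if rec["kind"] == "named":
            named[(rec["from_path"], rec["exposed"])] = (rec["to_path"], rec["original"])
        elif rec["kind"] == "star":
            star.setdefault(rec["from_path"], []).append(rec["to_path"])
    return named, star


def resolve_named_chain(path: str, symbol: str, named: Dict[Key, Key],
                        max_depth: int = 16) -> Key:
    """Follow named re-exports until a file that does not re-export the symbol."""
    visited = set()
    key = (path, symbol)
    for _ in range(max_depth):
        # Stop on a cycle or when the current file is not a barrel for this symbol.
        if key in visited or key not in named:
            break
        visited.add(key)
        key = named[key]
    return key
```

The visited set guards against barrels that re-export each other, and the depth cap bounds the walk even if the maps are malformed.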
## Measured impact
7,761-file Angular monorepo, holding everything else constant:
- Symbol-level imports edges rewritten through barrels: **19,108**
- Total edges in final graph: 107,369 → 126,485 (+19K)
- Communities: 2,902 → 2,692 (cross-pillar coupling collapses fragmented
per-pillar islands into shared communities)
- Top in-degree symbol nodes are now real architectural anchors:
TranslatePipe (1,044 callers)
YobiFlexDirective (983)
TranslateService (746)
YobiTextDirective (692)
BrokerService (354)
AuthenticationService (350)
All these had degree 1 (just their own `contains` edge) before the
barrel chase.
Tally's specific failing case from triage:
// features/cross/agent/src/context/context-providers.ts
import { FileChipComponent } from '@communicate/documentv2';
// features/communicate/documentv2/src/index.ts
export { FileChipComponent } from './file/file-chip/file-chip.component';
Before: context-providers ↔ FileChipComponent had no edge.
After: 1 `imports` edge directly to
`file_chip_file_chip_component_filechipcomponent` (the class
node).
## Star re-exports
`export * from './x'` is treated as a passthrough: when chasing a
`(barrel, sym)` key, if no named re-export matches but the barrel has
`*` re-exports, the chain advances to the first `*` target that
itself either has a named re-export of `sym` or further `*`
re-exports. Best-effort — ambiguous multi-`*` cases land on the first
target. Test plan should include the multi-`*` case explicitly.
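
The passthrough rule might look something like the sketch below. `pick_star_target` is a hypothetical name, and the final fallback to the first `*` target is this sketch's reading of "ambiguous multi-`*` cases land on the first target", not confirmed behaviour.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (file path, symbol name)


def pick_star_target(barrel: str, sym: str,
                     named: Dict[Key, Key],
                     star: Dict[str, List[str]]) -> Optional[str]:
    """Choose which `export *` target to follow for `sym` from `barrel`."""
    targets = star.get(barrel, [])
    for target in targets:
        # Prefer a target that re-exports `sym` by name or has more `*` hops.
        if (target, sym) in named or target in star:
            return target
    # Assumed fallback: best-effort, take the first `*` target if any.
    return targets[0] if targets else None
```

A multi-`*` test case would exercise exactly the branch where two targets qualify and only the first is followed.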
Strong work on the wildcard alias resolution. The measured impact (+30K AST edges on a real Nx monorepo) confirms this is a real problem. A few asks before merge:
1. Rebase onto v8: this is based on
v7's tsconfig path-alias resolution drops the substitution wildcard, so paths like `"@*": ["features/*/src/"]` or `"@*/interfaces": ["features/*/src/interfaces.ts"]` never resolve to a real file. This is a common pattern on Nx/Angular monorepos that use a generic catch-all alongside specific aliases: `_load_tsconfig_aliases` and `_resolve_js_import_target` need to preserve and substitute the `*`.
## What was broken
`_read_tsconfig_aliases` strips trailing `/*` from both the alias key and the target, so the two patterns above become `{ "@": "<repo>/features", "@*/interfaces": "<repo>/features/*/src/interfaces.ts" }`. The substitution group is lost for `@*`, and the resolver has no concept of wildcards anyway, so the second entry is dead code.
`_resolve_js_import_target` then matches by literal prefix only. `@communicate/documentv2` matches the bare `@` prefix, joins the rest, and produces `<repo>/features/communicate/documentv2`. No file exists there, so resolution falls through to the basename `documentv2`. Every import that goes through a wildcard alias becomes a stub.
## Fix
1. Preserve wildcards. `_read_tsconfig_aliases` keeps both the alias key and the target verbatim when either contains `*`. Non-wildcard aliases still get the trailing-`/` strip.
2. Wildcard substitution in the resolver. `_resolve_js_import_target` splits a wildcard alias on the `*` into `(prefix, suffix)`, matches the import as `raw.startswith(prefix) and raw.endswith(suffix)` with at least one captured char, then substitutes the captured segment into the target's `*`.
3. Specificity sorting. Aliases are sorted before matching per the TypeScript path-mapping spec: exact (non-wildcard) entries win over wildcard entries; among wildcards, the longer non-wildcard prefix wins. `@common/integration/*` beats `@*` when both could match.
## Sub-bug fixed too: phantom-node ids on alias resolution
Even when alias resolution found the right file, `_make_id(str(resolved))` was computing a node id from the absolute path (`/Users/.../features/foo/bar.ts`) while the file extractor uses `_make_id(str(path))` against the project-relative path it was given. The two ids never matched, so every alias-resolved import pointed at a phantom node.
New `_resolved_path_to_nid` helper relativizes the resolved Path to `Path.cwd()` before computing the node id, falling back to the absolute form only for paths outside cwd.
## Measured impact
Same Nx/Angular monorepo (7,759 TS files, ~50 tsconfig path entries including a `@*` catch-all and three `@*/foo` patterns):
Specific failing case from triage on the source repo:
The repo's tsconfig.base.json defines `@communicate/chat/*`, `@communicate/document/*`, `@communicate/feed/*`, plus a catch-all `@*`: `features/*/src/`. The `documentv2` import has no specific alias and relies on the wildcard.
- Before: zero `imports_from` edges, only internal `contains` edges. The `@communicate/documentv2` import collapsed to a bare `documentv2` basename node disconnected from the real file.
- After: 1 `imports_from` edge to `features_communicate_documentv2_src_index_ts` at L5, plus a symbol-level `imports` edge to `FileChipComponent`. 29 total edges now target the documentv2 index node.
Bare/scoped imports (npm packages, unmatched aliases) keep the existing basename fallback unchanged.
## Test plan
I didn't add tests in this PR: the existing `tests/` directory has no coverage for `_read_tsconfig_aliases` or `_resolve_js_import_target` that I could extend. Happy to add a fixture + tests as a follow-up if you'd like, or attach them to this PR before merge.
The three behaviours that need tests:
- `@*` → `features/*/src/` resolves `@foo/bar` to `features/foo/bar/src/`
- `@common/integration/*` beats `@*` for `@common/integration/foo`
The fix is also straightforward to verify against the spec at https://www.typescriptlang.org/docs/handbook/module-resolution.html#path-mapping.