fix(extract): resolve TypeScript wildcard path aliases (@*, @*/foo)#927

Open
ghaiat-yoobic wants to merge 2 commits into
safishamsi:v7from
ghaiat-yoobic:fix/tsconfig-wildcard-aliases

@ghaiat-yoobic

v7's tsconfig path-alias resolution drops the substitution wildcard, so paths like "@*": ["features/*/src/"] or "@*/interfaces": ["features/*/src/interfaces.ts"] never resolve to a real file. Common pattern on Nx/Angular monorepos that use a generic catch-all alongside specific aliases — _load_tsconfig_aliases and _resolve_js_import_target need to preserve and substitute the *.

What was broken

_read_tsconfig_aliases strips trailing /* from both the alias key and the target, so:

"@*": ["features/*/src/"]
"@*/interfaces": ["features/*/src/interfaces.ts"]

become { "@": "<repo>/features", "@*/interfaces": "<repo>/features/*/src/interfaces.ts" } — the substitution group is lost for @*, and the resolver has no concept of wildcards anyway, so the second entry is dead code.

_resolve_js_import_target then matches by literal prefix only. @communicate/documentv2 matches the bare @ prefix, joins the rest, and produces <repo>/features/communicate/documentv2. No file there, falls through to basename documentv2. Every import that goes through a wildcard alias becomes a stub.

Fix

  1. Preserve wildcards. _read_tsconfig_aliases keeps both the alias key and the target verbatim when either contains *. Non-wildcard aliases still get the trailing-/ strip.

  2. Wildcard substitution in the resolver. _resolve_js_import_target splits a wildcard alias on the * into (prefix, suffix), matches the import as raw.startswith(prefix) and raw.endswith(suffix) with at least one captured char, then substitutes the captured segment into the target's *.

  3. Specificity sorting. Aliases are sorted before matching per the TypeScript path-mapping spec: exact (non-wildcard) entries win over wildcard entries; among wildcards, the longer non-wildcard prefix wins. @common/integration/* beats @* when both could match.
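A minimal sketch of the prefix/suffix match and substitution described in step 2, not the PR's actual code. The helper name `match_wildcard_alias` is hypothetical, and treating a non-wildcard alias as an exact match is a simplifying assumption.

```python
from typing import Optional


def match_wildcard_alias(raw: str, alias: str, target: str) -> Optional[str]:
    """Return the substituted target for `raw`, or None if `alias` does not match.

    Hypothetical helper for illustration only.
    """
    if "*" not in alias:
        # Simplification: a non-wildcard alias only matches literally.
        return target if raw == alias else None

    prefix, suffix = alias.split("*", 1)
    # Require at least one captured character, and make sure the prefix
    # and suffix cannot overlap inside a short import string.
    if len(raw) <= len(prefix) + len(suffix):
        return None
    if not (raw.startswith(prefix) and raw.endswith(suffix)):
        return None

    captured = raw[len(prefix): len(raw) - len(suffix)]
    # Substitute the captured segment into the target's `*`.
    return target.replace("*", captured, 1)


print(match_wildcard_alias("@communicate/documentv2", "@*", "features/*/src/"))
# features/communicate/documentv2/src/
```

The same function handles a suffix pattern: with alias `@*/interfaces` and target `features/*/src/interfaces.ts`, the import `@communicate/chat/interfaces` captures `communicate/chat`.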

Sub-bug fixed too: phantom-node ids on alias resolution

Even when alias resolution found the right file, _make_id(str(resolved)) was computing a node id from the absolute path (/Users/.../features/foo/bar.ts) while the file extractor uses _make_id(str(path)) against the project-relative path it was given. The two ids never matched, so every alias-resolved import pointed at a phantom node.

New _resolved_path_to_nid helper relativizes the resolved Path to Path.cwd() before computing the node id, falling back to the absolute form only for paths outside cwd.
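The relativize-then-hash idea can be sketched as follows. This is an illustration under assumptions: `make_id` is a hypothetical stand-in for the project's `_make_id` (whose real normalisation may differ), and the `root` parameter is added here only so the behaviour can be shown without depending on the current directory.

```python
import re
from pathlib import Path
from typing import Optional


def make_id(text: str) -> str:
    # Hypothetical stand-in for _make_id: collapse non-alphanumerics to "_".
    return re.sub(r"[^0-9a-zA-Z]+", "_", text).strip("_").lower()


def resolved_path_to_nid(resolved: Path, root: Optional[Path] = None) -> str:
    """Compute a node id from a path made relative to `root` (default: cwd)."""
    root = Path.cwd() if root is None else root
    try:
        rel = resolved.relative_to(root)
    except ValueError:
        # Path lies outside the root: fall back to the absolute form.
        rel = resolved
    return make_id(rel.as_posix())


root = Path("/repo")
inside = Path("/repo/features/communicate/documentv2/src/index.ts")
print(resolved_path_to_nid(inside, root))
# features_communicate_documentv2_src_index_ts
```

Because both the extractor and the resolver would then hash the same project-relative string, the import edge and the file node end up with the same id.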

Measured impact

Same Nx/Angular monorepo (7,759 TS files, ~50 tsconfig path entries including a @* catch-all and three @*/foo patterns):

| | Edges (AST) | Edges (final graph, after dedup) |
|---|---|---|
| v7 baseline | 164,041 | 99,871 |
| With this fix | 193,970 (+30K) | 107,369 (+7.5K) |

Specific failing case from triage on the source repo:

// features/cross/agent/src/context/context-providers.ts
import { FileChipComponent, FilesComboListFormComponent } from '@communicate/documentv2';

The repo's tsconfig.base.json defines @communicate/chat/*, @communicate/document/*, @communicate/feed/*, plus a catch-all @*: features/*/src/. The documentv2 import has no specific alias and relies on the wildcard.

  • Before: importing file had zero imports_from edges, only internal contains edges. The @communicate/documentv2 import collapsed to a bare documentv2 basename node disconnected from the real file.
  • After: 1 imports_from edge to features_communicate_documentv2_src_index_ts at L5, plus a symbol-level imports edge to FileChipComponent. 29 total edges now target the documentv2 index node.

Bare/scoped imports (npm packages, unmatched aliases) keep the existing basename fallback unchanged.

Test plan

I didn't add tests in this PR — the existing tests/ directory has no coverage for _read_tsconfig_aliases or _resolve_js_import_target that I could extend. Happy to add a fixture + tests as a follow-up if you'd like, or attach them to this PR before merge.

The three behaviours that need tests:

  • Wildcard alias substitution: @* → features/*/src/ resolves @foo/bar to features/foo/bar/src/
  • Specificity ordering: @common/integration/* beats @* for @common/integration/foo
  • Non-wildcard aliases keep working unchanged

The fix is also straightforward to verify against the spec at https://www.typescriptlang.org/docs/handbook/module-resolution.html#path-mapping.

Two related bugs in _read_tsconfig_aliases / _resolve_js_import_target
that left wildcard aliases unusable on real-world Angular/Nx monorepos
where most cross-pillar imports go through a generic catch-all alias.

## What was broken

`_read_tsconfig_aliases` was stripping trailing /* from both the alias
key and the target. Patterns like:

    "@*": ["features/*/src/"]
    "@*/interfaces": ["features/*/src/interfaces.ts"]

became { "@": "<repo>/features" } in the aliases dict, losing the
substitution group entirely.

`_resolve_js_import_target` then matched aliases by literal prefix
only — so `@communicate/documentv2` matched the bare `@` prefix and
joined the rest directly to a stripped target. No wildcard substitution,
no real file resolved, edge collapsed to bare basename `documentv2`.

## Fix

1. `_read_tsconfig_aliases` keeps both the alias key and the target
   verbatim when either contains *. Non-wildcard aliases still get the
   trailing-/ strip they had before.

2. `_resolve_js_import_target` splits wildcard aliases on the * into
   (prefix, suffix), matches the import as raw.startswith(prefix) +
   raw.endswith(suffix) with at least one captured char, then substitutes
   the captured segment into the target's *.

3. Aliases are sorted by specificity before matching, per the TypeScript
   path-mapping spec: exact (non-wildcard) entries win over wildcard
   entries; among wildcards, the longer non-wildcard prefix wins. So
   `@common/integration/*` beats `@*` when both could match.
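One way to express the ordering in step 3 is a sort key, sketched below. The function name `alias_sort_key` is hypothetical. The third key component (longer suffix first when prefixes tie) is an assumption added here so that `@*/interfaces` is tried before `@*`; the commit message only specifies the first two rules.

```python
from typing import Tuple


def alias_sort_key(alias: str) -> Tuple[int, int, int]:
    """Sort key: exact aliases first, then wildcards by longest prefix."""
    if "*" not in alias:
        # Exact entries win over every wildcard entry.
        return (0, -len(alias), 0)
    prefix, suffix = alias.split("*", 1)
    # Longer non-wildcard prefix wins; the suffix tie-break is an assumption.
    return (1, -len(prefix), -len(suffix))


aliases = ["@*", "@common/integration/*", "@*/interfaces", "@common/utils"]
ordered = sorted(aliases, key=alias_sort_key)
print(ordered)
# ['@common/utils', '@common/integration/*', '@*/interfaces', '@*']
```

With the aliases tried in this order, the first match is also the most specific one, so the resolver can stop at the first hit.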

## Phantom-node fix

Even when alias resolution produced the right target Path, the node id
was derived from the absolute path (`/Users/.../features/...`) while
the file extractor uses `str(path)` against the project-relative path
it was given. The ids never matched, so alias-resolved imports pointed
at phantom nodes.

New `_resolved_path_to_nid` helper relativizes the resolved Path to
Path.cwd() before computing the node id, falling back to the absolute
form only for paths outside cwd.

## Measured impact

Same Nx/Angular monorepo (7,759 TS files, ~50 tsconfig path entries
including `@*` catch-all and `@*/interfaces`/`@*/widgets`/`@*/test-data`):

- AST edges: 164,041 → 193,970 (+30K wildcard-resolved imports)
- Final graph edges (after build_from_json dedup): 99,871 → 107,369
- Specific failing case from triage:
    `import { FileChipComponent } from '@communicate/documentv2'`
    Before: zero imports_from edges on the importing file, all internal
    contains-only.
    After: 1 imports_from edge to
    `features_communicate_documentv2_src_index_ts` + 1 symbol-level
    imports edge to FileChipComponent.

Bare/scoped imports (npm packages, unmatched aliases) keep the existing
basename fallback so external imports continue to behave the same way.

When tsconfig aliases resolve through a barrel (`features/<topic>/src/
index.ts` that does `export { X } from './x'`), the file-level
`imports_from` edge correctly targets the barrel — but the symbol-level
`imports` edge was synthesized as `_make_id(barrel_stem, symbol)`. That
id matches no node (the symbol's real id is derived from the file where
it's actually defined), so build_from_json silently dropped every
barrel-mediated symbol edge.

Common monorepo pattern (Nx, Angular, Lerna, Turborepo). On a 7,761-file
Angular monorepo: every class consumed via `@pillar/topic` barrels
stayed degree-1, masking the actual cross-pillar coupling that the
graph's community detection and god-node analysis are meant to surface.

## Approach

1. New `_export_js` handler emits a record per `export ... from '...'`
   statement (named with optional aliasing, `export *`, `export * as ns`).
   Wired into JS_CONFIG / TS_CONFIG via new `export_types` /
   `export_handler` LanguageConfig fields. `_extract_generic` now
   exposes the export handler dispatch in its walk; the `reexports`
   accumulator must be initialised before `walk(root)` since every TS
   file with re-exports hits it.

2. `_import_js` attaches `_reexport_target_path` (the import target
   file's absolute path) and `_reexport_symbol` (the imported name) to
   every symbol-level `imports` edge as metadata.

3. `extract()` post-processes after id_remap: builds
   `named_reexports[(from_path, exposed)] = (to_path, original)` and
   `star_reexports[from_path] = [to_paths]`. For each symbol-level edge,
   calls `_resolve_reexport_chain` (visited set, max depth 16) to find
   the symbol's defining file. If the synthesized target matches an
   existing node id, the edge is rewritten. Metadata fields are
   stripped before the dict is serialised.
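The chain chase in step 3 might look roughly like this for named re-exports only. This is a sketch, not the PR's `_resolve_reexport_chain`: the signature, the return value on a cycle, and the example file paths are all assumptions.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (file path, exposed symbol name)

MAX_DEPTH = 16  # depth limit stated in the commit message


def resolve_reexport_chain(
    start: Key,
    named_reexports: Dict[Key, Key],
    max_depth: int = MAX_DEPTH,
) -> Key:
    """Follow named re-exports from `start` until no further hop exists."""
    current = start
    visited: Set[Key] = {current}
    for _ in range(max_depth):
        nxt = named_reexports.get(current)
        if nxt is None or nxt in visited:
            break  # reached the defining file, or detected a cycle
        visited.add(nxt)
        current = nxt
    return current


# Hypothetical paths mirroring the barrel example in this message.
barrel = "features/communicate/documentv2/src/index.ts"
component = "features/communicate/documentv2/src/file/file-chip/file-chip.component.ts"
named = {(barrel, "FileChipComponent"): (component, "FileChipComponent")}

print(resolve_reexport_chain((barrel, "FileChipComponent"), named))
```

The visited set guards against barrels that re-export each other, and the depth limit bounds the work even if the set check were somehow bypassed.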

## Measured impact

7,761-file Angular monorepo, holding everything else constant:

- Symbol-level imports edges rewritten through barrels: **19,108**
- Total edges in final graph: 107,369 → 126,485 (+19K)
- Communities: 2,902 → 2,692 (cross-pillar coupling collapses fragmented
  per-pillar islands into shared communities)
- Top in-degree symbol nodes are now real architectural anchors:
    TranslatePipe       (1,044 callers)
    YobiFlexDirective   (983)
    TranslateService    (746)
    YobiTextDirective   (692)
    BrokerService       (354)
    AuthenticationService (350)
  All these had degree 1 (just their own `contains` edge) before the
  barrel chase.

Tally's specific failing case from triage:

    // features/cross/agent/src/context/context-providers.ts
    import { FileChipComponent } from '@communicate/documentv2';

    // features/communicate/documentv2/src/index.ts
    export { FileChipComponent } from './file/file-chip/file-chip.component';

Before: context-providers ↔ FileChipComponent had no edge.
After:  1 `imports` edge directly to
        `file_chip_file_chip_component_filechipcomponent` (the class
        node).

## Star re-exports

`export * from './x'` is treated as a passthrough: when chasing a
`(barrel, sym)` key, if no named re-export matches but the barrel has
`*` re-exports, the chain advances to the first `*` target that
itself either has a named re-export of `sym` or further `*`
re-exports. Best-effort — ambiguous multi-`*` cases land on the first
target. Test plan should include the multi-`*` case explicitly.
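A single hop of the star passthrough could be sketched as below. The function name `next_hop` and the map shapes are assumptions modelled on the `named_reexports` / `star_reexports` description. Falling back to the first `*` target when no candidate qualifies is one reading of "land on the first target", not confirmed behaviour.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (file path, symbol name)


def next_hop(
    key: Key,
    named_reexports: Dict[Key, Key],
    star_reexports: Dict[str, List[str]],
) -> Optional[Key]:
    """Return the next (file, symbol) in the chain, or None at a dead end."""
    if key in named_reexports:
        return named_reexports[key]  # a named re-export always wins

    path, sym = key
    targets = star_reexports.get(path, [])
    for target in targets:
        # Prefer the first target that can carry the chase further.
        if (target, sym) in named_reexports or target in star_reexports:
            return (target, sym)
    # Assumed best-effort fallback: the first `*` target, if there is one.
    return (targets[0], sym) if targets else None


named = {("mid.ts", "Foo"): ("foo.ts", "Foo")}
star = {"index.ts": ["other.ts", "mid.ts"]}
print(next_hop(("index.ts", "Foo"), named, star))
# ('mid.ts', 'Foo')
```

In the ambiguous case, for example `{"index.ts": ["a.ts", "b.ts"]}` with no named entries, this sketch returns `("a.ts", "Foo")` even if `Foo` is really defined in `b.ts`, which is why a multi-`*` test matters.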
@safishamsi
Owner

Strong work on the wildcard alias resolution; the measured impact (+30K AST edges on a real Nx monorepo) confirms this is a real problem. A few asks before merge:

1. Rebase onto v8: this is based on v7, and v8 has substantial extract.py changes including the MCP extractor.
2. Split out the barrel re-export work: it's a significant undocumented feature (barrel resolution, _reexport_target_path, 16-deep cycle guard, named/star/namespace maps) that deserves its own PR and review surface.
3. Tests: the three behaviors you called out (wildcard substitution, specificity sort, non-wildcard regression) need tests; the barrel path needs even more coverage (named, star, cycle, max-depth).

The wildcard alias fix + _resolved_path_to_nid helper on their own would be very mergeable. Happy to review a scoped version.
