@cristianoc
Collaborator

This PR introduces a planning document for refactoring the reanalyze dead code analysis into a pure, order-independent pipeline.

  • Adds a planning document, which:
    • Describes the desired end-state architecture for dead code analysis.
    • Catalogues current sources of mutation and order dependence.
    • Breaks the refactor into small, behaviour-preserving steps that can be tackled in independent PRs.

There are no behavioural changes in this PR: it only adds documentation to coordinate subsequent work on the dead code analysis pipeline.

@pkg-pr-new

pkg-pr-new bot commented Nov 29, 2025

rescript

npm i https://pkg.pr.new/rescript-lang/rescript@8043

@rescript/darwin-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-arm64@8043

@rescript/darwin-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/darwin-x64@8043

@rescript/linux-arm64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-arm64@8043

@rescript/linux-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/linux-x64@8043

@rescript/runtime

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/runtime@8043

@rescript/win32-x64

npm i https://pkg.pr.new/rescript-lang/rescript/@rescript/win32-x64@8043

commit: 7936d69

@fhammerschmidt
Member

I don't see an explicit mention of monorepos or npm/yarn/pnpm workspaces. In this context "dependencies" only refers to other files, but that leaves out the use case where shared code is laid out like this:

.
├── common
│   ├── package.json
│   ├── rescript.json
│   └── src/
├── mobile
│   ├── package.json
│   ├── rescript.json
│   └── src/
├── web
│   ├── package.json
│   ├── rescript.json
│   └── src/
├── package.json
└── rescript.json

where the root package.json contains a workspaces field like so:

"workspaces": [
  "common",
  "mobile",
  "web"
]

and the root rescript.json has

"dependencies": [
  "common",
  "mobile",
  "web"
]

And common/rescript.json is independent of any internal package (may still have external dependencies in a real-world setup):

"dependencies": []

Whereas both web/rescript.json and mobile/rescript.json are dependent on the common package and thus contain:

"dependencies": [
  "common"
]

And we want DCE to mark as dead anything that is part of common but not used anywhere in the mobile or web packages.

This needs to be configurable because, e.g. when developing a library with tests, the tests depend on the main package but may not give 100% coverage.

The knowledge about package-manager monorepos lives mostly in rewatch. I imagine that once this is all pure and separated, the final result could be recalculated based on a reanalyze option such as includeDependencies: ["common"] or something similar.

Also tagging @nojaf and @DZakh who are users of bun and pnpm monorepos.

@cristianoc
Collaborator Author

This is a behavior-preserving refactor, which, by the way, makes it easier to add features in the future.

- Introduce BindingContext in DeadValue to track the current binding location during dead-code traversal, so binding context is explicit and locally encapsulated.
- Introduce ReportingContext in DeadCommon to track, per file, the end position of the last reported value when deciding whether to suppress nested warnings.
- Replace addValueReference_state with addValueReference ~binding, so value-reference bookkeeping is driven by an explicit binding location rather than a threaded analysis state.
- Update dead-code value and exception handling to use the new addValueReference API.
- Refresh DEADCODE_REFACTOR_PLAN.md to mark these state-localisation steps as completed and to narrow the remaining follow-up to making the binding context fully pure.
- Verified with make test-analysis that behaviour and expected outputs remain unchanged.

- Remove BindingContext module wrapper (was just forwarding to Current)
- Remove Current module entirely (unnecessary abstraction)
- Simplify to pass Location.t directly instead of record type
- Remove unused max_value_pos_end field
- Refactor traverseStructure to use pure functional mapper creation
- Update DEADCODE_REFACTOR_PLAN.md to mark task 4.3 as complete

This eliminates ~40 lines of wrapper code and makes the binding state
tracking pure and simpler to understand.
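
As a rough illustration of the commit above, passing the binding location directly (rather than reading it from a mutable context) might look like the sketch below. The `loc` record, the `expr` type, and `collect` are hypothetical stand-ins for `Location.t` and the compiler's typed AST, and `add_value_reference` only mirrors the shape of `addValueReference ~binding`, not its real signature.

```ocaml
(* Hypothetical stand-in for Location.t. *)
type loc = {file: string; line: int}

(* References are accumulated functionally as (binding, target) edges. *)
type refs = (loc * loc) list

(* The binding location is an explicit labelled argument, not hidden state. *)
let add_value_reference ~(binding : loc) ~(target : loc) (refs : refs) : refs =
  (binding, target) :: refs

(* Toy expression language, only for this sketch. *)
type expr =
  | Ident of loc (* use of a value declared at this location *)
  | Let of loc * expr * expr (* binding location, bound expression, body *)

let rec collect ~(binding : loc) (e : expr) (refs : refs) : refs =
  match e with
  | Ident target -> add_value_reference ~binding ~target refs
  | Let (loc, bound, body) ->
    (* Uses inside the bound expression are attributed to the new binding. *)
    let refs = collect ~binding:loc bound refs in
    (* Uses in the body are attributed to the enclosing binding. *)
    collect ~binding body refs
```

Because the enclosing binding is an argument, nothing has to be saved and restored around nested bindings, and the traversal result depends only on its inputs.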
The original plan was too granular with many 'add scaffolding but don't
use it yet' tasks. This rewrite focuses on:

- Problem-first structure: each task solves a real architectural issue
- Combined related changes: no pointless intermediate states
- Clear value propositions: why each task matters
- Testable success criteria: how we know it worked
- Realistic effort estimates

Reduces 14 fine-grained tasks down to 10 focused tasks that each leave
the codebase measurably better.

Signed-off-by: Cursor AI <[email protected]>

Replace global config reads with explicit ~config parameter threading
throughout the DCE analysis pipeline. This makes the analysis pure
and testable with different configurations.

## Changes

### New module
- DceConfig: Encapsulates DCE configuration (cli + run config)
  - DceConfig.current() captures global state once
  - All analysis functions now take explicit ~config parameter

### DCE Analysis (fully pure - no global reads)
- DeadCode: threads config to all Dead* modules
- DeadValue: replaced ~15 !Cli.debug reads with config.cli.debug
- DeadType: replaced ~7 !Cli.debug reads with config.cli.debug
- DeadOptionalArgs: takes ~config, passes to Log_.warning
- DeadModules: uses config.run.transitive
- DeadCommon: threads config through reporting pipeline
- WriteDeadAnnotations: uses config.cli.write/json
- ProcessDeadAnnotations: uses config.cli.live_names/live_paths

### Logging infrastructure
- Log_.warning: now requires ~config (no optional)
- Log_.logIssue: now requires ~config (no optional)
- Log_.Stats.report: now requires ~config (no optional)
- Consistent API - no conditional logic on Some/None

### Non-DCE analyses (call DceConfig.current() at use sites)
- Exception: 4 call sites updated
- Arnold: 7 call sites updated
- TODO: Thread config through these for full purity

### Other
- Common.ml: removed unused lineAnnotationStr field
- Reanalyze: single DceConfig.current() call at entry point
- DEADCODE_REFACTOR_PLAN.md: updated Task 2, added verification task

## Impact

✅ DCE analysis is now pure - takes explicit config, no global reads
✅ All config parameters required (zero 'config option' types)
✅ Can run analysis with different configs without mutating globals
✅ All tests pass - no regressions

## Remaining Work (Task 2)

- Thread config through Exception/Arnold to eliminate DceConfig.current()
- Verify zero DceConfig.current() calls in analysis code

Signed-off-by: Cursor AI <[email protected]>
Signed-off-by: Cristiano Calcagno <[email protected]>

- Replace DceConfig.current() and !Common.Cli.debug with explicit config parameter
- Thread config through Arnold.ml functions (Stats, ExtendFunctionTable, CheckExpressionWellFormed, Compile, Eval)
- Thread config through Exception.ml functions (Event.combine, Checks.doCheck/doChecks, traverseAst)
- Update Reanalyze.ml to pass config to all analysis functions
- Improves testability and eliminates global state dependencies
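
The config threading described in the two commits above can be sketched roughly as follows. This is a minimal illustration under assumptions: the `Cli` and `RunConfig` modules here are simplified stand-ins for the real globals, the record fields are limited to `debug` and `transitive` (the only ones named above), and `describe_decl` is a hypothetical analysis function.

```ocaml
(* Simplified stand-ins for the global mutable flags. *)
module Cli = struct
  let debug = ref false
end

module RunConfig = struct
  let transitive = ref true
end

module DceConfig = struct
  type cli = {debug: bool}
  type run = {transitive: bool}
  type t = {cli: cli; run: run}

  (* Snapshot the globals once; intended to be called only at the entry point. *)
  let current () =
    {cli = {debug = !Cli.debug}; run = {transitive = !RunConfig.transitive}}
end

(* Hypothetical analysis function: reads only its ~config argument. *)
let describe_decl ~(config : DceConfig.t) name =
  if config.cli.debug then "[debug] checking " ^ name else name

let () =
  (* Single capture at the entry point, then explicit threading. *)
  let config = DceConfig.current () in
  print_endline (describe_decl ~config "Foo.bar")
```

With this shape, a test can build a `DceConfig.t` value directly and run the same function under different configurations without touching any global.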
Task 1 of the dead code refactor plan: eliminate global mutable state
for current file context.

Changes:
- Add DeadCommon.FileContext.t with source_path, module_name, is_interface
- Thread ~file parameter through DeadCode, DeadValue, DeadType, DeadCommon
- Thread ~file through Exception.processCmt and Arnold.processCmt
- Remove Common.currentSrc, currentModule, currentModuleName globals

Design improvement:
- FileContext.module_name is now a raw string (e.g. "ExnB"), not Name.t
- Added FileContext.module_name_tagged helper to create Name.t when needed
- This avoids confusion: raw name for hashtable keys, tagged name for paths
- Previously the interface encoding (+prefix) leaked into code that expected raw names

This makes it possible to process files concurrently or out of order,
as analysis no longer depends on hidden global state for file context.
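
A minimal sketch of the file-context record described above, using the field names from the commit message. It assumes that the interface encoding is a "+" prefix on the module name and models the tagged name as a plain string (the real `Name.t` is a separate type); `decl_path` is a hypothetical consumer added only for illustration.

```ocaml
module FileContext = struct
  type t = {source_path: string; module_name: string; is_interface: bool}

  (* Assumption: interfaces are tagged with a "+" prefix. The raw
     module_name stays untagged so it can be used as a hashtable key. *)
  let module_name_tagged (file : t) : string =
    if file.is_interface then "+" ^ file.module_name else file.module_name
end

(* Hypothetical consumer: everything it needs about the current file
   arrives through ~file, so files can be handled in any order. *)
let decl_path ~(file : FileContext.t) decl_name =
  FileContext.module_name_tagged file ^ "." ^ decl_name
```

Keeping the raw and tagged names as separate accessors makes each call site state which of the two it wants, which is the confusion the commit says it avoids.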