feat: add --fresh flag to parse for forced reparse #38
Draft
joshbouncesecurity wants to merge 3 commits into knostic:master from
The parse step's unit generator merges new units into an existing dataset.json, preserving old units as-is. This means changes to the parser (e.g., improved call graph resolution) don't take effect for previously parsed units unless the dataset is deleted manually.

Add --fresh flag to parse (and ensure scan --fresh also clears the dataset) so users can force a full reparse when needed.

- Go CLI: add --fresh flag to parse command, pass through to Python
- Python CLI: add --fresh arg to parse subparser
- parser_adapter: delete existing dataset.json when fresh=True
- scanner: include dataset.json in fresh cleanup alongside checkpoints
- unit_generator: add stderr note when existing units are reused

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
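As a rough illustration of the "Python CLI" bullet above, a `--fresh` argument on a `parse` subparser could look like the sketch below. The function name `build_parser`, the `repo` positional, and the help strings are assumptions made for illustration; they are not taken from the actual openant source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI with a `parse` subcommand that accepts --fresh.

    Hypothetical layout: only the flag name --fresh and the subcommand
    name `parse` come from the PR text.
    """
    parser = argparse.ArgumentParser(prog="openant")
    subparsers = parser.add_subparsers(dest="command", required=True)

    parse_cmd = subparsers.add_parser("parse", help="Parse a repository into units")
    parse_cmd.add_argument("repo", help="Path to the repository to parse")
    parse_cmd.add_argument(
        "--fresh",
        action="store_true",  # defaults to False, so existing units are reused
        help="Delete the existing dataset.json and reparse from scratch",
    )
    return parser


# Explicit argv lists keep the demo independent of the real command line.
default_args = build_parser().parse_args(["parse", "./repo"])
fresh_args = build_parser().parse_args(["parse", "./repo", "--fresh"])
print(default_args.fresh, fresh_args.fresh)
```

Using `store_true` keeps the default behaviour (reuse existing units) unchanged for anyone who does not pass the flag.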
- Extract buildParsePyArgs from runParse so the helper is the source of truth (tests no longer keep a parallel copy with 'keep in sync')
- Replace exists()+remove() with try/except FileNotFoundError to avoid TOCTOU race when two --fresh parses run concurrently
- Clarify --fresh help text and docstring: only dataset.json is deleted; other artifacts in the output dir are preserved
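The TOCTOU fix in the second bullet can be sketched as follows. This is a minimal illustration, not the actual parser_adapter code: the function name `clear_dataset` and its return value are assumptions. The point is that attempting the delete and catching `FileNotFoundError` leaves no window between a check and the removal in which a concurrent run could delete the file first.

```python
import tempfile
from pathlib import Path


def clear_dataset(output_dir: str, fresh: bool) -> bool:
    """Delete dataset.json when fresh is True; return True if a file was removed.

    Hypothetical helper. Only dataset.json is touched; other artifacts
    in the output directory are left alone.
    """
    if not fresh:
        return False
    dataset_path = Path(output_dir) / "dataset.json"
    try:
        dataset_path.unlink()
    except FileNotFoundError:
        # Already gone: never created, or a concurrent --fresh run removed it first.
        return False
    return True


with tempfile.TemporaryDirectory() as out_dir:
    (Path(out_dir) / "dataset.json").write_text("{}")
    (Path(out_dir) / "other.txt").write_text("kept")
    first = clear_dataset(out_dir, fresh=True)   # file existed, so it is removed
    second = clear_dataset(out_dir, fresh=True)  # nothing left to remove, no error
    other_kept = (Path(out_dir) / "other.txt").exists()
```

An `exists()` check followed by `remove()` would raise in the second process if both processes passed the check before either deleted the file; the version above treats that case as a no-op.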
Manual verification
Local test results
Built the Go CLI from this branch and exercised
Commands run:
Outcome:
Summary
Adds --fresh to openant parse to force a full reparse from scratch without manually deleting dataset.json. Useful when parser improvements are deployed and the existing dataset needs to be regenerated.

The JS parser also now logs a hint pointing at --fresh when existing units are reused, so users discover the flag when they need it.

Addresses item 19 from #16 (does not close the issue).
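To make the motivation concrete, the sketch below models the merge-and-hint behaviour in Python. The PR says the hint lives in the JS parser, so this is a translation for illustration only: the `id` key, the unit shape, the function name `merge_units`, and the exact hint wording are all assumptions.

```python
import sys
from typing import Any, Dict, List

Unit = Dict[str, Any]


def merge_units(existing: List[Unit], parsed: List[Unit]) -> List[Unit]:
    """Merge freshly parsed units into an existing dataset, keeping old units as-is.

    Hypothetical model: a unit is a dict with an "id" key. When an id is
    already present, the old unit wins and the new one is discarded.
    """
    known_ids = {unit["id"] for unit in existing}
    added = [unit for unit in parsed if unit["id"] not in known_ids]
    reused = len(parsed) - len(added)
    if reused:
        # Hint wording is illustrative, not the real log line.
        print(
            f"note: reused {reused} existing unit(s); pass --fresh to force a full reparse",
            file=sys.stderr,
        )
    return existing + added


old_units = [{"id": "a", "calls": []}]
# An improved parser now resolves a call from "a" to "b".
new_units = [{"id": "a", "calls": ["b"]}, {"id": "b", "calls": []}]
merged = merge_units(old_units, new_units)
```

In this model the improved call graph for unit `a` is lost because the stale entry is preserved, which is the situation `--fresh` is meant to resolve.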
Test plan
- openant parse <repo> reuses existing units when dataset.json already exists (default).
- openant parse <repo> --fresh deletes the cached dataset and reparses from scratch.
- With --fresh, the JS parser hint about reused units no longer fires for that run.