-
Notifications
You must be signed in to change notification settings - Fork 5
Add analyze references cmd #86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a high level, this seems like useful functionality, but is missing some considerations that would make it function as-advertised. It's currently going to miss references via a few different avenues.
audit-cli/README.md
Outdated
| However, the tree view will show it in all locations where it appears, with subsequent occurrences marked as circular | ||
| includes in verbose mode. | ||
|
|
||
| #### `analyze references` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One high-level request: can we rename this to analyze file-references or something like that? I want to distinguish it from ref references... JW wrote a script recently to analyze refs and we might want to add that to the CLI, so I'd like to disambiguate.
audit-cli/README.md
Outdated
| - Understand the impact of changes to a file (what pages will be affected) | ||
| - Find all usages of an include file across the documentation | ||
| - Track where code examples are referenced | ||
| - Identify orphaned files (files with no references) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're trying to find orphaned files, I think we also need to check ToC entries. Many .txt files may have no references through include, literalinclude, or io-code-block directives - they may only be referenced through ToC entries, but that doesn't make them "unused."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added the toctree resolution as shared functionality
audit-cli/README.md
Outdated
| │ │ │ ├── analyzer.go # Include tree building | ||
| │ │ │ ├── output.go # Output formatting | ||
| │ │ │ └── types.go # Type definitions | ||
| │ │ └── references/ # References analysis subcommand |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do rename this command, presumably we'd also need to rename this dir.
| return nil | ||
| } | ||
|
|
||
| // Only process RST files (.rst and .txt extensions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is an issue - there are yaml files that may also include references to files, so we probably need to process those, too. i.e. the extracts files, like here - are .yaml files that contain include, literalinclude, or io-code-block directives - but we're not seeing them here if we're not processing those files.
|
|
||
| // findReferencesInFile searches a single file for references to the target file. | ||
| // | ||
| // This function scans through the file line by line looking for include, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, given this implementation, we will definitely be missing ToC references and therefore some pages will falsely show as orphaned when they're in fact included via ToC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Back to you for some fixups - mainly related to the changes to the analyze includes command.
audit-cli/README.md
Outdated
| #### `analyze includes` | ||
|
|
||
| Analyze `include` directive relationships in RST files to understand file dependencies. | ||
| Analyze `include` directive and `toctree` relationships in RST files to understand file dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is an interesting interpretation.
I would prefer this did not include toctree relationships by default. Can we make this an optional flag? I was specifically trying to inspect all the content that goes into creating this file or page. The toctree references are links to other pages, but they are not pulled into the page content (transcluded) the same way that includes are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we revert this functionality or at least gate it behind an optional flag (for toctrees)?
audit-cli/README.md
Outdated
| - `-v, --verbose` - Show detailed information including line numbers and reference paths | ||
| - `-c, --count-only` - Only show the count of references (useful for quick checks and scripting) | ||
| - `--paths-only` - Only show the file paths, one per line (useful for piping to other commands) | ||
| - `-t, --directive-type <type>` - Filter by directive type: `include`, `literalinclude`, or `io-code-block` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add toctree as a filter here since we're now checking those?
(Ahh, I see in the implementation it's supported - so I guess this README section just needs some updates?)
|
|
||
| This helps identify both the impact scope (how many files) and duplicate includes (when references > files). | ||
|
|
||
| **Supported Directive Types:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here - I presume we should add toctree to this list.
| trimmedLine := strings.TrimSpace(line) | ||
|
|
||
| // Check for toctree start (use shared regex from rst package) | ||
| if rst.ToctreeDirectiveRegex.MatchString(trimmedLine) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this inconsistency is bugging me. It looks like we're defining the toctree matching regex in the rst package, but all the other regex is defined as vars at the top of this file. Could we move all of these to rst, maybe as a directive_regex.go file or something?
e92aaa5 to
1f63808
Compare
Adds a new
analyze referencescommand that finds all files that reference a target file.internal/pathresolverinclude,literalinclude,io-code-block