@cbullinger
Collaborator

Adds a new `analyze references` command that finds all files that reference a target file.

  • Path resolution logic is centralized in internal/pathresolver
  • All existing commands continue to work with refactored path resolution
  • Command supports multiple directive types: include, literalinclude, io-code-block
  • Command shows directive type in output (e.g., "referenced by literalinclude in page.rst")
  • Command works with both versioned and non-versioned projects
  • Command is limited to a single version (no cross-version search)
  • All tests pass
  • Documentation is complete and accurate
  • Code follows existing patterns and conventions
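For readers skimming the thread, the core idea can be sketched as a line-by-line scan that records which directive produced each match. This is a minimal illustration, not the PR's code: `directiveRegex`, `Reference`, and `findReferences` are hypothetical names, the pattern is an assumption, and `io-code-block` is left out because its file paths sit on nested lines rather than on the directive line itself.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// directiveRegex matches lines such as ".. literalinclude:: /path/to/file.py".
// The pattern is an illustrative assumption, not the regex used in the PR.
var directiveRegex = regexp.MustCompile(`^\.\.\s+(include|literalinclude)::\s+(\S+)`)

// Reference records one directive that points at the target file.
type Reference struct {
	Directive string // directive name, e.g. "include"
	Line      int    // 1-based line number of the directive
}

// findReferences scans content line by line and returns every directive
// whose argument is exactly equal to target.
func findReferences(content, target string) []Reference {
	var refs []Reference
	scanner := bufio.NewScanner(strings.NewReader(content))
	for lineNum := 1; scanner.Scan(); lineNum++ {
		m := directiveRegex.FindStringSubmatch(strings.TrimSpace(scanner.Text()))
		if m != nil && m[2] == target {
			refs = append(refs, Reference{Directive: m[1], Line: lineNum})
		}
	}
	return refs
}

func main() {
	page := "Title\n=====\n\n.. include:: /includes/intro.rst\n\n.. literalinclude:: /code/example.py\n"
	for _, r := range findReferences(page, "/code/example.py") {
		fmt.Printf("referenced by %s in page.rst (line %d)\n", r.Directive, r.Line)
	}
}
```

A real implementation would also need to resolve relative and source-rooted paths before comparing them, which this sketch skips.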

Collaborator

@dacharyc left a comment

At a high level, this seems like useful functionality, but it's missing some considerations that would make it function as advertised. As written, it will miss references that come in through a few different avenues.

However, the tree view will show it in all locations where it appears, with subsequent occurrences marked as circular includes in verbose mode.

#### `analyze references`
Collaborator

One high-level request: can we rename this to `analyze file-references` or something like that? I want to distinguish it from ref references. JW recently wrote a script to analyze refs that we might want to add to the CLI, so I'd like to disambiguate.

- Understand the impact of changes to a file (what pages will be affected)
- Find all usages of an include file across the documentation
- Track where code examples are referenced
- Identify orphaned files (files with no references)
Collaborator

If we're trying to find orphaned files, I think we also need to check ToC entries. Many .txt files may have no references through `include`, `literalinclude`, or `io-code-block` directives. They may only be referenced through ToC entries, but that doesn't make them "unused."

Collaborator Author

added the toctree resolution as shared functionality

│ │ │ ├── analyzer.go # Include tree building
│ │ │ ├── output.go # Output formatting
│ │ │ └── types.go # Type definitions
│ │ └── references/ # References analysis subcommand
Collaborator

If we do rename this command, presumably we'd also need to rename this dir.

return nil
}

// Only process RST files (.rst and .txt extensions)
Collaborator

Ah, this is an issue: some YAML files may also reference other files, so we probably need to process those too. For example, the extracts files (like here) are .yaml files that contain `include`, `literalinclude`, or `io-code-block` directives, but we won't see those references if we only process .rst and .txt files.
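One way to widen the filter is a small extension allow-list, sketched below. `scannableExts` and `shouldScan` are hypothetical names, and whether `.yml` should also be included is an open question this sketch does not answer.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// scannableExts lists the extensions worth scanning for directives.
// Adding ".yaml" covers extracts files, whose directives live inside YAML values.
var scannableExts = map[string]bool{
	".rst":  true,
	".txt":  true,
	".yaml": true,
}

// shouldScan reports whether path has an extension that may contain directives.
// The comparison is case-insensitive.
func shouldScan(path string) bool {
	return scannableExts[strings.ToLower(filepath.Ext(path))]
}

func main() {
	for _, p := range []string{"source/index.txt", "source/includes/extracts-example.yaml", "source/images/diagram.png"} {
		fmt.Printf("%s -> scan: %v\n", p, shouldScan(p))
	}
}
```

Directives inside YAML are usually indented as part of a block value, so the matching code would still need to trim leading whitespace before applying its regexes.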


// findReferencesInFile searches a single file for references to the target file.
//
// This function scans through the file line by line looking for include,
Collaborator

Yeah, given this implementation, we will definitely be missing ToC references and therefore some pages will falsely show as orphaned when they're in fact included via ToC.
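To make the gap concrete, here is a rough sketch of collecting `toctree` entries so they can count as references during orphan detection. `toctreeStart` and `toctreeEntries` are hypothetical names, and the parsing rules (indented body, `:option:` lines, `Title </path>` entries) are a simplified reading of the directive rather than a full RST parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// toctreeStart matches the line that opens a toctree directive (assumed pattern).
var toctreeStart = regexp.MustCompile(`^\.\.\s+toctree::`)

// toctreeEntries returns the document paths listed in toctree directives.
func toctreeEntries(content string) []string {
	var entries []string
	inToctree := false
	for _, line := range strings.Split(content, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case toctreeStart.MatchString(trimmed):
			inToctree = true
		case !inToctree:
			continue
		case trimmed == "" || strings.HasPrefix(trimmed, ":"):
			// Blank lines and options such as ":titlesonly:" carry no entries.
			continue
		case !strings.HasPrefix(line, " ") && !strings.HasPrefix(line, "\t"):
			// Unindented content ends the directive body.
			inToctree = false
		default:
			// Entries may be written as "Title </path>"; keep only the path.
			if i := strings.LastIndex(trimmed, "<"); i >= 0 && strings.HasSuffix(trimmed, ">") {
				trimmed = trimmed[i+1 : len(trimmed)-1]
			}
			entries = append(entries, trimmed)
		}
	}
	return entries
}

func main() {
	index := ".. toctree::\n   :titlesonly:\n\n   Overview </overview>\n   /tutorial/install\n\nSome other paragraph.\n"
	fmt.Println(toctreeEntries(index))
}
```

With something like this in place, a page would only be reported as orphaned when it has neither a directive reference nor a `toctree` entry pointing at it.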

@cbullinger cbullinger marked this pull request as ready for review October 29, 2025 14:56
@cbullinger cbullinger requested a review from dacharyc October 29, 2025 15:13
Collaborator

@dacharyc left a comment

Back to you for some fixups - mainly related to the changes to the analyze includes command.

#### `analyze includes`

- Analyze `include` directive relationships in RST files to understand file dependencies.
+ Analyze `include` directive and `toctree` relationships in RST files to understand file dependencies.
Collaborator

Oh, this is an interesting interpretation.

I would prefer this did not include toctree relationships by default. Can we make this an optional flag? I was specifically trying to inspect all the content that goes into creating this file or page. The toctree references are links to other pages, but they are not pulled into the page content (transcluded) the same way that includes are.

Collaborator

Can we revert this functionality or at least gate it behind an optional flag (for toctrees)?
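A sketch of what gating could look like, with `toctree` entries followed only on request. The flag name `--include-toctree` and the `options`, `parseOptions`, and `dependencies` names are hypothetical, and the standard library `flag` package is used here only to keep the example self-contained; the CLI's actual flag library is not shown in this thread.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the settings relevant to this sketch.
type options struct {
	includeToctree bool
}

// parseOptions parses command-line arguments. Toctree traversal is off by default.
func parseOptions(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("analyze includes", flag.ContinueOnError)
	fs.BoolVar(&opts.includeToctree, "include-toctree", false, "also follow toctree entries")
	err := fs.Parse(args)
	return opts, err
}

// dependencies returns the transcluded files, plus toctree entries only
// when the caller opted in.
func dependencies(includes, toctree []string, opts options) []string {
	deps := append([]string{}, includes...)
	if opts.includeToctree {
		deps = append(deps, toctree...)
	}
	return deps
}

func main() {
	includes := []string{"/includes/intro.rst"}
	toctree := []string{"/tutorial/install"}

	defaults, _ := parseOptions(nil)
	fmt.Println("default:", dependencies(includes, toctree, defaults))

	optedIn, _ := parseOptions([]string{"--include-toctree"})
	fmt.Println("with flag:", dependencies(includes, toctree, optedIn))
}
```

Keeping the default limited to `include` preserves the original intent of the command, which is to show only the content that is transcluded into a page.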

- `-v, --verbose` - Show detailed information including line numbers and reference paths
- `-c, --count-only` - Only show the count of references (useful for quick checks and scripting)
- `--paths-only` - Only show the file paths, one per line (useful for piping to other commands)
- `-t, --directive-type <type>` - Filter by directive type: `include`, `literalinclude`, or `io-code-block`
Collaborator

Should we add toctree as a filter here since we're now checking those?

(Ahh, I see in the implementation it's supported - so I guess this README section just needs some updates?)
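For the README update, the filter behavior could be summarized with a sketch like the one below, where `toctree` sits alongside the three documented types. `Reference`, `supportedTypes`, and `filterByDirective` are hypothetical names, not the PR's identifiers.

```go
package main

import (
	"fmt"
)

// Reference is a simplified record of one reference to the target file.
type Reference struct {
	File      string
	Directive string
}

// supportedTypes lists the values accepted by the directive-type filter,
// assuming toctree is added to the three documented types.
var supportedTypes = map[string]bool{
	"include":        true,
	"literalinclude": true,
	"io-code-block":  true,
	"toctree":        true,
}

// filterByDirective keeps only references of the given type.
// An empty type means no filtering; an unknown type is an error.
func filterByDirective(refs []Reference, directiveType string) ([]Reference, error) {
	if directiveType == "" {
		return refs, nil
	}
	if !supportedTypes[directiveType] {
		return nil, fmt.Errorf("unsupported directive type %q", directiveType)
	}
	var out []Reference
	for _, r := range refs {
		if r.Directive == directiveType {
			out = append(out, r)
		}
	}
	return out, nil
}

func main() {
	refs := []Reference{{"index.txt", "toctree"}, {"page.rst", "include"}}
	filtered, err := filterByDirective(refs, "toctree")
	fmt.Println(filtered, err)
}
```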


This helps identify both the impact scope (how many files) and duplicate includes (when references > files).
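The two numbers can be derived as in the sketch below, which counts distinct referencing files separately from total references. `Reference` and `summarize` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
)

// Reference is a simplified record of one reference to the target file.
type Reference struct {
	File string
	Line int
}

// summarize returns the number of distinct referencing files and the
// total number of references.
func summarize(refs []Reference) (files, total int) {
	seen := make(map[string]struct{})
	for _, r := range refs {
		seen[r.File] = struct{}{}
	}
	return len(seen), len(refs)
}

func main() {
	refs := []Reference{
		{File: "page-a.rst", Line: 10},
		{File: "page-a.rst", Line: 42}, // same file includes the target twice
		{File: "page-b.rst", Line: 7},
	}
	files, total := summarize(refs)
	fmt.Printf("files: %d, references: %d, duplicates present: %v\n", files, total, total > files)
}
```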

**Supported Directive Types:**
Collaborator

Same here - I presume we should add toctree to this list.

trimmedLine := strings.TrimSpace(line)

// Check for toctree start (use shared regex from rst package)
if rst.ToctreeDirectiveRegex.MatchString(trimmedLine) {
Collaborator

Hmm, this inconsistency is bugging me. It looks like we're defining the toctree matching regex in the `rst` package, but all the other regexes are defined as vars at the top of this file. Could we move all of these to `rst`, maybe as a `directive_regex.go` file or something?
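Something along these lines is what I have in mind, shown as a standalone program so it runs on its own. In the repo these vars would live in the `rst` package. Only `ToctreeDirectiveRegex` is a name taken from this PR; the other names and all of the patterns are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// Directive regexes gathered in one place. In the repo this block would sit in
// a file such as rst/directive_regex.go; the patterns here are illustrative.
var (
	IncludeDirectiveRegex        = regexp.MustCompile(`^\.\.\s+include::\s+(\S+)`)
	LiteralIncludeDirectiveRegex = regexp.MustCompile(`^\.\.\s+literalinclude::\s+(\S+)`)
	IOCodeBlockDirectiveRegex    = regexp.MustCompile(`^\.\.\s+io-code-block::`)
	ToctreeDirectiveRegex        = regexp.MustCompile(`^\.\.\s+toctree::`)
)

func main() {
	line := ".. literalinclude:: /code/example.py"
	if m := LiteralIncludeDirectiveRegex.FindStringSubmatch(line); m != nil {
		fmt.Println("literalinclude target:", m[1])
	}
	fmt.Println("is toctree:", ToctreeDirectiveRegex.MatchString(".. toctree::"))
}
```

Callers would then reference every pattern through the same package, which removes the split between shared and file-local regexes.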
