Skip to content

feat: submodule discovery#23

Open
janbajc wants to merge 5 commits into
asispts:mainfrom
janbajc:feature/submodule-discovery
Open

feat: submodule discovery#23
janbajc wants to merge 5 commits into
asispts:mainfrom
janbajc:feature/submodule-discovery

Conversation

@janbajc
Copy link
Copy Markdown

@janbajc janbajc commented Mar 27, 2026

Adds submodule discovery so repositories declared in .gitmodules can be detected and included in the extension’s repo set (including nested submodules).

@janbajc janbajc requested a review from asispts as a code owner March 27, 2026 05:23
Copilot AI review requested due to automatic review settings March 27, 2026 05:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds submodule discovery so repositories declared in .gitmodules can be detected and included in the extension’s repo set (including nested submodules).

Changes:

  • Add DataSource.getSubmodules() to enumerate initialized submodules from .gitmodules.
  • Extend RepoManager to discover submodules when adding repos, on startup, and when .gitmodules changes.
  • Add Vitest coverage for getSubmodules.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
tests/backend/dataSource.submodules.test.ts New tests that create temp git repos and validate getSubmodules behavior.
src/repoManager.ts Hooks submodule discovery into repo addition, startup, and watcher event handling.
src/dataSource.ts Implements .gitmodules reading/parsing and filters to initialized submodule repos.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/dataSource.ts Outdated
Comment on lines +470 to +509
fs.readFile(path.join(repo, ".gitmodules"), { encoding: "utf8" }, async (err, data) => {
const submodules: string[] = [];
if (err) {
resolve(submodules);
return;
}

const lines = data.split(eolRegex);
const sectionRegex = /^\s*\[.*\]\s*$/;
const submoduleSectionRegex = /^\s*\[submodule "([^"]+)"\]\s*$/;
const pathRegex = /^\s*path\s*=\s*(.*)$/;
let inSubmoduleSection = false;

for (let i = 0; i < lines.length; i++) {
const line = lines[i];
if (sectionRegex.test(line)) {
inSubmoduleSection = submoduleSectionRegex.test(line);
continue;
}

if (!inSubmoduleSection) continue;

const match = line.match(pathRegex);
if (match === null) continue;

const resolvedPath = path.resolve(repo, match[1].trim());
const relativeToRepo = path.relative(repo, resolvedPath);
// Ignore entries that escape the repository root.
if (relativeToRepo.startsWith("..") || path.isAbsolute(relativeToRepo)) continue;

const submodulePath = getPathFromStr(resolvedPath);
if (submodules.includes(submodulePath)) continue;

if (await this.isGitRepository(submodulePath)) {
submodules.push(submodulePath);
}
}

resolve(submodules);
});
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSubmodules manually parses .gitmodules using regexes. .gitmodules uses git-config syntax (quoting/escaping, comments, multi-line values), so this parsing can mis-handle valid files (e.g., quoted path = "foo bar"). Consider querying paths via git config -f .gitmodules --get-regexp (or similar) and parsing git’s output instead of re-implementing config parsing.

Suggested change
fs.readFile(path.join(repo, ".gitmodules"), { encoding: "utf8" }, async (err, data) => {
const submodules: string[] = [];
if (err) {
resolve(submodules);
return;
}
const lines = data.split(eolRegex);
const sectionRegex = /^\s*\[.*\]\s*$/;
const submoduleSectionRegex = /^\s*\[submodule "([^"]+)"\]\s*$/;
const pathRegex = /^\s*path\s*=\s*(.*)$/;
let inSubmoduleSection = false;
for (let i = 0; i < lines.length; i++) {
const line = lines[i];
if (sectionRegex.test(line)) {
inSubmoduleSection = submoduleSectionRegex.test(line);
continue;
}
if (!inSubmoduleSection) continue;
const match = line.match(pathRegex);
if (match === null) continue;
const resolvedPath = path.resolve(repo, match[1].trim());
const relativeToRepo = path.relative(repo, resolvedPath);
// Ignore entries that escape the repository root.
if (relativeToRepo.startsWith("..") || path.isAbsolute(relativeToRepo)) continue;
const submodulePath = getPathFromStr(resolvedPath);
if (submodules.includes(submodulePath)) continue;
if (await this.isGitRepository(submodulePath)) {
submodules.push(submodulePath);
}
}
resolve(submodules);
});
cp.execFile(
"git",
["-C", repo, "config", "-f", ".gitmodules", "--get-regexp", "^submodule\\..*\\.path$"],
async (err, stdout, _stderr) => {
const submodules: string[] = [];
if (err) {
// If `.gitmodules` does not exist or there are no matching entries,
// resolve with an empty list to preserve existing behavior.
resolve(submodules);
return;
}
const lines = stdout.split(eolRegex);
for (const line of lines) {
if (!line.trim()) continue;
const firstSpace = line.indexOf(" ");
if (firstSpace === -1) continue;
const value = line.substring(firstSpace + 1).trim();
if (!value) continue;
const resolvedPath = path.resolve(repo, value);
const relativeToRepo = path.relative(repo, resolvedPath);
// Ignore entries that escape the repository root.
if (relativeToRepo.startsWith("..") || path.isAbsolute(relativeToRepo)) continue;
const submodulePath = getPathFromStr(resolvedPath);
if (submodules.includes(submodulePath)) continue;
if (await this.isGitRepository(submodulePath)) {
submodules.push(submodulePath);
}
}
resolve(submodules);
}
);

Copilot uses AI. Check for mistakes.
Comment thread src/repoManager.ts
Comment on lines 319 to +334
@@ -298,7 +329,9 @@ export class RepoManager {
let path,
changes = false;
while ((path = this.changeEventPaths.shift())) {
if (!(await doesPathExist(path))) {
if (path.endsWith("/.gitmodules")) {
if (await this.checkReposForNewSubmodules()) changes = true;
} else if (!(await doesPathExist(path))) {
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On .gitmodules create/change events, processCreateEvents / processChangeEvents call checkReposForNewSubmodules(), which scans all known repos and runs submodule discovery for each. Since the watcher path is the specific .gitmodules file, this can be much more expensive than necessary in large workspaces. Prefer deriving the repo root from the event path (e.g., dirname(path)), and only scanning that repo (plus optionally its known descendants).

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +65
afterAll(() => {
fs.rmSync(parentRepo, { recursive: true, force: true });
fs.rmSync(submoduleRepo, { recursive: true, force: true });
fs.rmSync(standaloneRepo, { recursive: true, force: true });
});
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afterAll unconditionally calls fs.rmSync on parentRepo/submoduleRepo/standaloneRepo. If beforeAll throws before these variables are assigned, Vitest will still run afterAll and this teardown will throw (masking the real failure). Guard the cleanup (e.g., check the variable is a non-empty string / path exists) or initialize the vars to empty strings and conditionalize the rmSync calls.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/dataSource.ts Outdated
Comment on lines +469 to +472
this.execGit(
'config -f .gitmodules --get-regexp "^submodule\\..*\\.path$"',
repo,
async (err, stdout) => {
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSubmodules wraps the entire git command in single quotes, which makes the shell pass it to git as a single argument (e.g., git 'config -f ...') and will fail (and is especially problematic on Windows cmd.exe). Build the command without wrapping quotes, or preferably switch to spawnGit/execFile with an args array so the --get-regexp pattern is passed as its own argument without shell-escaping issues.

Copilot uses AI. Check for mistakes.
Comment thread src/dataSource.ts
Comment on lines +1 to 2
import * as path from "node:path";
import * as cp from "node:child_process";
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing node:path as path introduces identifier shadowing with existing parameters named path (e.g., isGitRepository(path: string)). Consider renaming those parameters (e.g., repoPath/dir) to avoid confusion and make future use of the path module inside those methods possible.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +67 to +74
describe("getSubmodules", () => {
it("returns initialized submodules declared in .gitmodules", async () => {
const dataSource = new DataSource();

const submodules = await dataSource.getSubmodules(parentRepo);

expect(submodules).toEqual([path.join(parentRepo, "modules/child").replace(/\\/g, "/")]);
});
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description mentions nested submodules, but the test suite here only covers a single-level submodule and the no-.gitmodules case. Add coverage for nested submodules (e.g. parent -> child -> grandchild) and/or an uninitialized/missing submodule entry to ensure discovery behaves correctly across the scenarios this feature targets.

Copilot uses AI. Check for mistakes.
Comment thread src/dataSource.ts
const resolvedPath = path.resolve(repo, rawPath);
const relativeToRepo = path.relative(repo, resolvedPath);
// Ignore entries that escape the repository root.
if (relativeToRepo.startsWith("..") || path.isAbsolute(relativeToRepo)) continue;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repo-escape guard is overly broad: relativeToRepo.startsWith("..") will also reject legitimate submodule paths whose first path segment merely starts with .. (e.g. ..my-submodule). Consider tightening the check to only reject .. as an actual path segment (e.g. relativeToRepo === ".." or relativeToRepo.startsWith(".." + path.sep)), while keeping the cross-drive absolute-path guard.

Suggested change
if (relativeToRepo.startsWith("..") || path.isAbsolute(relativeToRepo)) continue;
if (
relativeToRepo === ".." ||
relativeToRepo.startsWith(".." + path.sep) ||
path.isAbsolute(relativeToRepo)
) {
continue;
}

Copilot uses AI. Check for mistakes.
@asispts asispts changed the title Feature/submodule discovery feat: submodule discovery Mar 27, 2026
Comment thread src/repoManager.ts
if (isRepo) {
this.addRepo(directory);
resolve(true);
this.addRepo(directory).then(resolve);
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to change this logic?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants