feat: Workspace filesystem cleanup #391
6 issues
find-bugs: Found 6 issues (1 high, 2 medium, 3 low)
High
Daemon socket directory in shared tmpdir is vulnerable to symlink pre-creation attack - `src/daemon/socket-path.ts:89-95`
daemonDirForWorkspaceKey places the socket directory under tmpdir() (typically /tmp), which is writable by any local user. The workspace hash is derived deterministically from a path that may be predictable (e.g., /Users/<name>/projects/<name>), so a local attacker can pre-create xcodebuildmcp-<hash> as a symlink pointing at a directory of the attacker's choosing before the daemon starts. mkdirSync with recursive: true silently succeeds when the path already exists, including when it is a symlink to an existing directory. validateSocketDir rejects symlinks afterwards via lstatSync, but the daemon then refuses to start, so the attacker has effectively achieved a denial of service. If the symlink target is owned by the user and is not itself a symlink (e.g., the user's ~/.ssh), the ownership check passes and chmodSync may then alter permissions on that target.
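One possible mitigation, sketched here under assumptions rather than taken from the PR: create the directory non-recursively so a pre-existing entry surfaces as EEXIST, then accept an existing entry only if lstatSync shows a real, user-owned, private directory. The helper name `ensurePrivateDir` and the fail-closed behaviour on loose permissions are hypothetical choices for illustration.

```typescript
import { lstatSync, mkdirSync, mkdtempSync, rmSync, symlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper, not the project's actual API.
// Creates `dir` with mode 0o700, or accepts an existing entry only if it is a
// real directory (not a symlink), owned by the current user, with no
// group/other permission bits set.
export function ensurePrivateDir(dir: string): void {
  try {
    // Non-recursive mkdir fails with EEXIST if anything, including a
    // symlink, already occupies the path.
    mkdirSync(dir, { mode: 0o700 });
    return;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
  }

  const st = lstatSync(dir); // lstat inspects the entry itself, not its target
  if (st.isSymbolicLink() || !st.isDirectory()) {
    throw new Error(`refusing to use ${dir}: not a real directory`);
  }
  const uid = process.getuid?.(); // undefined on platforms without POSIX uids
  if (uid !== undefined && st.uid !== uid) {
    throw new Error(`refusing to use ${dir}: owned by uid ${st.uid}`);
  }
  if ((st.mode & 0o077) !== 0) {
    // Fail closed instead of calling chmod on a path that could be swapped.
    throw new Error(`refusing to use ${dir}: permissions too loose`);
  }
}

// Usage sketch: a pre-created symlink is rejected instead of being followed.
const base = mkdtempSync(join(tmpdir(), "socket-dir-demo-"));
ensurePrivateDir(join(base, "fresh")); // created with mode 0o700
symlinkSync(base, join(base, "link"));
let rejected = false;
try {
  ensurePrivateDir(join(base, "link"));
} catch {
  rejected = true;
}
console.log({ rejected });
rmSync(base, { recursive: true, force: true });
```

This does not remove the denial-of-service angle, since an attacker can still occupy the name first. Placing the directory under a parent that only the user can write to would be needed for that.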
Medium
chmodSync after statSync allows TOCTOU permission change on attacker-swapped path - `src/daemon/socket-path.ts:84-86`
validateSocketDir calls statSync(dir) then chmodSync(dir, 0o700) if permissions are too loose. Between these calls, a local attacker who can win the race could swap the path to a symlink (or replace the directory) so the chmod applies to a different filesystem object owned by the same user. Combined with the predictable path under tmpdir(), this enables tampering with the permissions of arbitrary user-owned directories. Using fchmod on an opened file descriptor (with O_NOFOLLOW) would close the window.
Also found at:
src/utils/fs-lock.ts:113-121
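The fchmod suggestion above can be sketched as follows. This is a minimal illustration, assuming a POSIX platform where O_NOFOLLOW and O_DIRECTORY are available in fs.constants. The helper name `tightenDirPermissions` is hypothetical and not part of the code under review.

```typescript
import {
  chmodSync,
  closeSync,
  constants,
  fchmodSync,
  fstatSync,
  mkdtempSync,
  openSync,
  rmSync,
  statSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: check and tighten permissions through one file
// descriptor so both operations apply to the same filesystem object.
export function tightenDirPermissions(dir: string): void {
  // O_NOFOLLOW makes open fail if the final path component is a symlink;
  // O_DIRECTORY makes it fail if the entry is not a directory.
  const fd = openSync(
    dir,
    constants.O_RDONLY | constants.O_DIRECTORY | constants.O_NOFOLLOW,
  );
  try {
    const st = fstatSync(fd);
    const uid = process.getuid?.();
    if (uid !== undefined && st.uid !== uid) {
      throw new Error(`refusing to chmod ${dir}: owned by uid ${st.uid}`);
    }
    if ((st.mode & 0o077) !== 0) {
      // Operates on the already-opened object, so swapping the path between
      // the check and the chmod has no effect.
      fchmodSync(fd, 0o700);
    }
  } finally {
    closeSync(fd);
  }
}

// Usage sketch: loosen a scratch directory, then tighten it again.
const demoDir = mkdtempSync(join(tmpdir(), "fchmod-demo-"));
chmodSync(demoDir, 0o755);
tightenDirPermissions(demoDir);
const demoMode = statSync(demoDir).mode & 0o777;
rmSync(demoDir, { recursive: true, force: true });
```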
Pre-key cooldown skip can starve scheduling when sweeps never complete - `src/utils/workspace-filesystem-lifecycle.ts:401-412`
The buildSchedulePreKey-based skip in scheduleWorkspaceFilesystemLifecycleSweep reads lastScheduledAtByPreKey, but that map is only updated after a non-skipped sweep completes. The entry is set at completedAt (after the schedule delay plus sweep duration), and the same value is used to short-circuit future schedule calls; on its own this is benign. However, the pre-key check uses options.now ?? Date.now() while the post-completion write uses Date.now(). If a caller passes a fixed options.now in the past (for example, in tests), the stored timestamp is later than the supplied time, so the cooldown never appears to expire and the check can skip indefinitely. Scheduled sweeps can then be silently dropped in test or time-controlled environments.
Also found at:
src/utils/workspace-filesystem-lifecycle.ts:426-444
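A minimal sketch of one way to avoid the clock mismatch described above: read and write the cooldown timestamp through the same injectable clock, and treat a negative elapsed time as expired. `createCooldownGate` and its parameters are hypothetical names for illustration, not the module's real API.

```typescript
type Clock = () => number;

// Hypothetical sketch: a cooldown gate that uses one clock for both the
// comparison and the recorded timestamp.
export function createCooldownGate(cooldownMs: number, now: Clock = Date.now) {
  const lastAtByKey = new Map<string, number>();

  // Returns true if work for `key` may be scheduled now, and records the
  // attempt with the same clock used for the comparison.
  return function tryAcquire(key: string): boolean {
    const t = now();
    const last = lastAtByKey.get(key);
    if (last !== undefined) {
      const elapsed = t - last;
      // A negative elapsed time means the clock moved backwards (for example
      // a fixed test clock); treat the cooldown as expired rather than
      // skipping forever.
      if (elapsed >= 0 && elapsed < cooldownMs) return false;
    }
    lastAtByKey.set(key, t);
    return true;
  };
}

// Usage sketch with a controllable clock.
let fakeNow = 1_000;
const tryAcquire = createCooldownGate(500, () => fakeNow);
const first = tryAcquire("ws"); // no previous entry
fakeNow = 1_200;
const second = tryAcquire("ws"); // 200 ms elapsed, inside the cooldown
fakeNow = 900;
const third = tryAcquire("ws"); // clock moved backwards
console.log({ first, second, third });
```

Recording the timestamp at acquisition rather than at sweep completion is a design choice of this sketch. The reviewed code records it at completion.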
Low
Quarantined lock directory is leaked when restore rename fails - `src/utils/fs-lock.ts:60-67`
restoreQuarantinedLockDir swallows rename errors and intentionally leaves the quarantined directory in place when restoration fails. Over time, repeated contention failures (e.g., another contender created a new lockDir before the validation rejected our recovery) accumulate '.{name}.stale.{pid}.{uuid}' directories under the lock parent with no cleanup path, causing unbounded disk usage in long-lived workspaces.
Also found at:
src/utils/fs-lock.ts:184-191
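One way to give the leaked directories a cleanup path is an age-based sweep of the lock parent. The sketch below assumes the '.{name}.stale.{pid}.{uuid}' naming described above. The helper `removeOldQuarantinedDirs` and the age threshold are hypothetical.

```typescript
import { lstatSync, mkdirSync, mkdtempSync, readdirSync, rmSync, utimesSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: remove quarantined lock directories older than
// `maxAgeMs`, judged by mtime. Returns the names that were removed.
export function removeOldQuarantinedDirs(
  lockParent: string,
  lockName: string,
  maxAgeMs: number,
  now: number = Date.now(),
): string[] {
  const prefix = `.${lockName}.stale.`;
  const removed: string[] = [];
  for (const entry of readdirSync(lockParent)) {
    if (!entry.startsWith(prefix)) continue;
    const full = join(lockParent, entry);
    // throwIfNoEntry: false tolerates another process cleaning up first.
    const st = lstatSync(full, { throwIfNoEntry: false });
    if (st === undefined || !st.isDirectory()) continue;
    if (now - st.mtimeMs < maxAgeMs) continue; // too recent, may still be in use
    rmSync(full, { recursive: true, force: true });
    removed.push(entry);
  }
  return removed;
}

// Usage sketch: one old and one fresh quarantined directory.
const parent = mkdtempSync(join(tmpdir(), "lock-sweep-demo-"));
mkdirSync(join(parent, "build")); // the live lock, never matched by the prefix
mkdirSync(join(parent, ".build.stale.123.abc"));
mkdirSync(join(parent, ".build.stale.456.def"));
utimesSync(join(parent, ".build.stale.123.abc"), new Date(0), new Date(0));
const removed = removeOldQuarantinedDirs(parent, "build", 60_000);
console.log(removed);
rmSync(parent, { recursive: true, force: true });
```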
normalizeWorkspaceKey allows '..' and other traversal-adjacent values - `src/utils/log-paths.ts:38-46`
normalizeWorkspaceKey only rejects empty strings and forward slashes or backslashes; it does not reject '..', '.', null bytes, or other characters that can affect path resolution on some platforms. If a workspace key of '..' is ever passed in, path.join resolves the workspace root to the parent of the workspaces directory, allowing cleanup logic to operate outside the intended workspace tree. If an attacker or buggy caller controls the workspace key, the result could be deletion of, or lock contention on, unintended directories.
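A stricter check could combine an allowlist with a containment test on the resolved path. The sketch below is illustrative only: the allowed character set is an assumption about what workspace keys look like, and `resolveWorkspaceRoot` is a hypothetical name.

```typescript
import { resolve, sep } from "node:path";

// Assumed key shape: starts with an alphanumeric character, then up to 127
// alphanumerics, dots, underscores, or hyphens. This rejects '.', '..',
// slashes, backslashes, and null bytes.
const SAFE_KEY = /^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$/;

// Hypothetical helper: validate the key, then confirm the resolved root
// still lies inside the workspaces directory.
export function resolveWorkspaceRoot(workspacesDir: string, key: string): string {
  if (!SAFE_KEY.test(key)) {
    throw new Error(`invalid workspace key: ${JSON.stringify(key)}`);
  }
  const base = resolve(workspacesDir);
  const root = resolve(base, key);
  const prefix = base.endsWith(sep) ? base : base + sep;
  if (!root.startsWith(prefix)) {
    // Defense in depth in case the allowlist is loosened later.
    throw new Error(`workspace key escapes ${base}: ${JSON.stringify(key)}`);
  }
  return root;
}

// Usage sketch: traversal-adjacent keys are rejected, a plain key resolves.
const rejectedKeys = ["..", ".", "a/b", "a\\b", "bad\0key", ""].filter((key) => {
  try {
    resolveWorkspaceRoot("/tmp/workspaces", key);
    return false;
  } catch {
    return true;
  }
});
const acceptedRoot = resolveWorkspaceRoot("/tmp/workspaces", "abc123");
console.log(rejectedKeys.length, acceptedRoot);
```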
isPidAlive treats EPERM as alive without verifying PID ownership - `src/utils/process-liveness.ts:7-12`
process.kill(pid, 0) throws an error with code EPERM when the PID exists but belongs to a process the caller is not permitted to signal, such as one owned by another user. The function correctly treats this as 'alive' (only ESRCH yields false), but on systems where PID reuse occurs across users, an unrelated process owned by another user could be considered a live workspace-owned helper. Given the skill's emphasis on 'ownership checks' for multi-process cleanup, this liveness primitive alone cannot establish ownership. Callers must combine it with workspace ownership verification; otherwise stale helper records whose PIDs have been reused by other users' processes may be treated as alive and never reconciled.
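To make the distinction visible to callers, a liveness probe could report EPERM as its own state instead of folding it into a boolean. This is a sketch under assumptions: `probePid` and the `PidStatus` values are hypothetical, and it still does not prove workspace ownership on its own.

```typescript
// Hypothetical result type: "signalable" means the caller may signal the
// process; "exists-not-signalable" means the PID exists but signalling it is
// not permitted (EPERM), typically because another user owns the process.
export type PidStatus = "dead" | "signalable" | "exists-not-signalable";

export function probePid(pid: number): PidStatus {
  // Zero and negative values address process groups, not a single process.
  if (!Number.isInteger(pid) || pid <= 0) return "dead";
  try {
    process.kill(pid, 0); // signal 0 performs existence and permission checks only
    return "signalable";
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    if (code === "ESRCH") return "dead";
    if (code === "EPERM") return "exists-not-signalable";
    throw err;
  }
}

// Usage sketch: the current process can always be probed.
const selfStatus = probePid(process.pid);
const invalidStatus = probePid(-1);
console.log({ selfStatus, invalidStatus });
```

A caller reconciling workspace helpers could treat "exists-not-signalable" as a cue to check ownership by other means before deciding that the helper is still alive.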
Duration: 17m 6s · Tokens: 1.3M in / 19.9k out · Cost: $7.57 (+extraction: $0.00, +merge: $0.00)
Annotations
Check failure on line 95 in src/daemon/socket-path.ts
github-actions / warden: find-bugs
Daemon socket directory in shared tmpdir is vulnerable to symlink pre-creation attack
`daemonDirForWorkspaceKey` places the socket directory under `tmpdir()` (typically `/tmp`), which is writable by any local user. The workspace hash is derived deterministically from a path that may be predictable (e.g., `/Users/<name>/projects/<name>`), so a local attacker can pre-create `xcodebuildmcp-<hash>` as a symlink pointing at a directory of the attacker's choosing before the daemon starts. `mkdirSync` with `recursive: true` silently succeeds when the path already exists, including when it is a symlink to an existing directory. `validateSocketDir` rejects symlinks afterwards via `lstatSync`, but the daemon then refuses to start, so the attacker has effectively achieved a denial of service. If the symlink target is owned by the user and is not itself a symlink (e.g., the user's `~/.ssh`), the ownership check passes and `chmodSync` may then alter permissions on that target.
Check warning on line 86 in src/daemon/socket-path.ts
github-actions / warden: find-bugs
chmodSync after statSync allows TOCTOU permission change on attacker-swapped path
`validateSocketDir` calls `statSync(dir)` then `chmodSync(dir, 0o700)` if permissions are too loose. Between these calls, a local attacker who can win the race could swap the path to a symlink (or replace the directory) so the chmod applies to a different filesystem object owned by the same user. Combined with the predictable path under `tmpdir()`, this enables tampering with the permissions of arbitrary user-owned directories. Using `fchmod` on an opened file descriptor (with `O_NOFOLLOW`) would close the window.
Check warning on line 121 in src/utils/fs-lock.ts
github-actions / warden: find-bugs
[HZS-JCC] chmodSync after statSync allows TOCTOU permission change on attacker-swapped path (additional location)
Check warning on line 412 in src/utils/workspace-filesystem-lifecycle.ts
github-actions / warden: find-bugs
Pre-key cooldown skip can starve scheduling when sweeps never complete
The buildSchedulePreKey-based skip in scheduleWorkspaceFilesystemLifecycleSweep reads lastScheduledAtByPreKey, but that map is only updated after a non-skipped sweep completes. The entry is set at completedAt (after the schedule delay plus sweep duration), and the same value is used to short-circuit future schedule calls; on its own this is benign. However, the pre-key check uses options.now ?? Date.now() while the post-completion write uses Date.now(). If a caller passes a fixed options.now in the past (for example, in tests), the stored timestamp is later than the supplied time, so the cooldown never appears to expire and the check can skip indefinitely. Scheduled sweeps can then be silently dropped in test or time-controlled environments.
Check warning on line 444 in src/utils/workspace-filesystem-lifecycle.ts
github-actions / warden: find-bugs
[BVG-EGW] Pre-key cooldown skip can starve scheduling when sweeps never complete (additional location)