feat: Workspace filesystem cleanup #391
find-bugs: Found 6 issues (3 medium, 3 low)
Medium
Forced-shutdown timeout is never cleared, causing duplicate cleanup and exit-code race - `src/daemon.ts:344-351`
When `shutdown()` runs, it schedules a 5-second `setTimeout` that is never cleared once `server.close()` succeeds. If cleanup and `flushAndCloseSentry(2000)` together take longer than 5 seconds (the Sentry flush alone allows up to 2s, and `stopOwnedSimulatorLaunchOsLogSessions` defaults to 1s, plus daemon-file cleanup), the timeout fires and runs `cleanupArtifacts()` a second time, concurrently with the in-progress cleanup. It then races `process.exit(1)` against the success path's `process.exit(exitCode)`. This can corrupt the daemon-file/socket cleanup (two concurrent `cleanupWorkspaceDaemonFiles` and `stopOwnedSimulatorLaunchOsLogSessions` calls) and override the intended exit code with 1.
Also found at:
src/daemon.ts:334-351
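One possible shape for a fix, as a minimal sketch rather than the actual `src/daemon.ts` code: keep the timer handle, clear it once cleanup settles, and make the forced path exit without re-running cleanup. `createShutdown`, `ShutdownDeps`, and the injected `cleanup`/`exit` callbacks are hypothetical names introduced only for illustration.

```typescript
// All names here are hypothetical stand-ins, not the actual src/daemon.ts code.
type ShutdownDeps = {
  cleanup: () => Promise<void>; // stands in for cleanupArtifacts() plus the Sentry flush
  exit: (code: number) => void; // stands in for process.exit
  forceAfterMs: number; // stands in for the 5-second limit
};

function createShutdown(deps: ShutdownDeps): (exitCode: number) => Promise<void> {
  let started = false;
  let exited = false;

  const exitOnce = (code: number): void => {
    if (exited) return; // first caller wins, so the exit code cannot be overridden
    exited = true;
    deps.exit(code);
  };

  return async (exitCode: number): Promise<void> => {
    if (started) return; // a repeated signal must not start a second cleanup
    started = true;

    // The forced path only exits; it never re-runs cleanup concurrently.
    const forceTimer = setTimeout(() => exitOnce(1), deps.forceAfterMs);

    let code = exitCode;
    try {
      await deps.cleanup();
    } catch {
      code = 1; // cleanup failed: report failure instead of the requested code
    } finally {
      clearTimeout(forceTimer); // cleanup settled, so disarm the forced exit
    }
    exitOnce(code);
  };
}
```

Injecting `exit` instead of calling `process.exit` directly is a choice made here so the behaviour can be exercised in a test; a real daemon would likely pass `process.exit` itself.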
removeStaleSocket unlinks paths under /tmp without owner/type validation - `src/daemon/socket-path.ts:29-36`
`removeStaleSocket(socketPath)` now operates on a path inside `tmpdir()`. Combined with the predictable workspace-derived directory name from `daemonDirForWorkspaceKey`, a local attacker who controls the parent directory could place a symlink at `d.sock` pointing at a victim-owned file. `unlinkSync` on the symlink itself is harmless because it removes only the link, not its target. However, if the attacker pre-creates the directory and the daemon proceeds to mkdir/bind without ownership checks (see related finding), arbitrary deletion is possible via earlier startup paths. The change broadens the trust boundary because the original `~/.xcodebuildmcp` was not multi-user writable.
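The owner/type validation this finding asks for could look roughly like the sketch below, which assumes a POSIX host. `removeStaleSocketChecked` is a hypothetical name and the real `removeStaleSocket` may be structured differently. It uses only `fs.lstatSync`, `fs.Stats`, and `fs.unlinkSync` from Node's standard library.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical hardened variant; the real removeStaleSocket may look different.
function removeStaleSocketChecked(socketPath: string): boolean {
  // process.getuid is unavailable on Windows, where the uid checks are skipped.
  const uid = typeof process.getuid === 'function' ? process.getuid() : undefined;

  let dirStat: fs.Stats;
  let entryStat: fs.Stats;
  try {
    // lstat does not follow a symlink in the final path component.
    dirStat = fs.lstatSync(path.dirname(socketPath));
    entryStat = fs.lstatSync(socketPath);
  } catch (error) {
    if ((error as { code?: string }).code === 'ENOENT') return false; // nothing to remove
    throw error;
  }

  const dirIsPrivate =
    dirStat.isDirectory() &&
    (uid === undefined || dirStat.uid === uid) &&
    (dirStat.mode & 0o022) === 0; // not writable by group or others
  if (!dirIsPrivate) {
    throw new Error(`Refusing to clean socket in untrusted directory: ${path.dirname(socketPath)}`);
  }

  if (!entryStat.isSocket() || (uid !== undefined && entryStat.uid !== uid)) {
    throw new Error(`Refusing to unlink non-socket or foreign-owned path: ${socketPath}`);
  }

  fs.unlinkSync(socketPath);
  return true;
}
```

A check followed by an unlink still leaves a small window between the two calls, so this narrows the exposure but does not remove it. Creating the daemon directory with mode `0700` and verifying ownership at startup would address the underlying trust-boundary concern more directly.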
Cooldown markers updated before sweep runs, suppressing later sweeps if scheduled run fails - `src/utils/workspace-filesystem-lifecycle.ts:411-415`
`scheduleWorkspaceFilesystemLifecycleSweep` writes `lastScheduledAtByScope` and `lastScheduledAtByPreKey` synchronously, before the `setTimeout` fires. If the deferred `runWorkspaceFilesystemLifecycleSweep` throws synchronously during `resolveOptions` or is skipped by lock/cooldown, subsequent schedule attempts within `WORKSPACE_FILESYSTEM_LIFECYCLE_COOLDOWN_MS` are silently dropped even though no sweep ever executed. Under transient errors, this can cause log retention to stop working until the cooldown elapses.
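A minimal sketch of the alternative ordering, assuming the cooldown should start only after a sweep has actually run. A separate pending set handles de-duplication while a run is queued. `createSweepScheduler`, `SweepOutcome`, and the `defer` hook are hypothetical names, not the module's real API.

```typescript
// Hypothetical scheduler; names and option shapes are illustrative only.
type SweepOutcome = 'ran' | 'skipped';

type SchedulerOptions = {
  cooldownMs: number;
  now?: () => number;
  defer?: (task: () => void) => void; // defaults to a zero-delay setTimeout
};

function createSweepScheduler(options: SchedulerOptions) {
  const now = options.now ?? (() => Date.now());
  const defer = options.defer ?? ((task: () => void) => { setTimeout(task, 0); });
  const lastRanAt = new Map<string, number>();
  const pending = new Set<string>();

  return function schedule(scope: string, sweep: () => SweepOutcome): boolean {
    if (pending.has(scope)) return false; // de-duplicate while a run is queued
    const last = lastRanAt.get(scope);
    if (last !== undefined && now() - last < options.cooldownMs) return false;

    pending.add(scope);
    defer(() => {
      try {
        // Start the cooldown only when a sweep actually executed.
        if (sweep() === 'ran') lastRanAt.set(scope, now());
      } catch {
        // A failed sweep leaves no marker, so the next schedule call can retry.
      } finally {
        pending.delete(scope);
      }
    });
    return true;
  };
}
```

With this ordering, a sweep that throws or reports `'skipped'` does not consume the cooldown window. A real implementation would probably log the caught error rather than swallow it.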
Low
compactWorkspaceKey may collide across distinct workspace keys - `src/daemon/socket-path.ts:16-19`
`compactWorkspaceKey` extracts the trailing `-<12-hex>` suffix when present; otherwise it hashes the entire key to 12 hex chars. If a caller passes a workspace key that does not match the canonical `name-<hash>` shape (e.g. a raw path or legacy key), the function falls back to hashing the whole input. Two different inputs that share the same 12-hex suffix produced by `workspaceKeyForRoot` will map to the same daemon directory, leading to socket/registry collisions across workspaces. The probability is low (12 hex chars are 48 bits, so about 2^-48 for any given pair of keys, assuming uniformly distributed suffixes), but the function silently merges namespaces without any check.
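To make the merge concrete, the sketch below reproduces the behaviour as described in this finding and adds one possible guard that records which full key first claimed a compact key. `compactWorkspaceKeySketch` and `createCompactKeyRegistry` are hypothetical names, and the use of SHA-256 for the fallback hash is an assumption because the finding does not name the hash function.

```typescript
import { createHash } from 'node:crypto';

// Assumed shape of the behaviour described above; the real function may differ.
function compactWorkspaceKeySketch(workspaceKey: string): string {
  const suffix = /-([0-9a-f]{12})$/.exec(workspaceKey)?.[1];
  if (suffix !== undefined) return suffix; // trust the canonical name-<hash> suffix
  // Fallback for raw paths or legacy keys (SHA-256 is an assumption here).
  return createHash('sha256').update(workspaceKey).digest('hex').slice(0, 12);
}

// Hypothetical guard: remember the full key that owns each compact key.
function createCompactKeyRegistry(): (workspaceKey: string) => string {
  const owners = new Map<string, string>();
  return (workspaceKey: string): string => {
    const compact = compactWorkspaceKeySketch(workspaceKey);
    const owner = owners.get(compact);
    if (owner !== undefined && owner !== workspaceKey) {
      throw new Error(`Compact key ${compact} already belongs to ${owner}`);
    }
    owners.set(compact, workspaceKey);
    return compact;
  };
}
```

An in-memory map only detects collisions within one process. Detecting them across daemons would need a marker stored next to the socket, for example a file holding the full workspace key.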
isPidAlive returns incorrect results for pid <= 0 or non-integer inputs - `src/utils/process-liveness.ts:1-8`
`process.kill(0, 0)` signals the caller's entire process group and returns true unconditionally; negative pids target a process group instead of a single process. Non-integer/NaN pids cause `process.kill` to throw a `TypeError` whose `.code` is not `'ESRCH'` (Node reports `ERR_INVALID_ARG_TYPE`), so the `code !== 'ESRCH'` check falsely reports the process as alive. Callers like `fs-lock.ts` and `daemon-registry.ts` pass `staleOwner.pid` / `entry.pid` without first validating `pid > 0` within the visible code paths, so a malformed lock file or registry entry could cause the recovery logic to permanently treat a non-existent owner as alive and refuse to reclaim locks or clean up artifacts.
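A guarded variant might validate the input before probing, as in this minimal sketch. `isPidAliveChecked` is a hypothetical name, and the existing `isPidAlive` signature is not shown in this report. The sketch keeps the `code !== 'ESRCH'` rule for valid pids, so `EPERM` (the process exists but cannot be signalled) still counts as alive.

```typescript
// Hypothetical guarded variant of the liveness probe described above.
function isPidAliveChecked(pid: unknown): boolean {
  // Reject 0, negatives (process groups), fractions, NaN, and non-numbers up front.
  if (typeof pid !== 'number' || !Number.isInteger(pid) || pid <= 0) {
    return false;
  }
  try {
    process.kill(pid, 0); // signal 0 performs the existence check without signalling
    return true;
  } catch (error) {
    // ESRCH means no such process; other codes such as EPERM mean it exists.
    return (error as { code?: string }).code !== 'ESRCH';
  }
}
```

Treating malformed pids as dead lets lock and registry recovery proceed when the owner record is corrupt. Whether that is the right default for `fs-lock.ts` and `daemon-registry.ts` is a judgment call for the authors.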
cleanupOwnedWorkspaceFilesystemArtifacts skips daemon cleanup when daemonCleanup is not provided - `src/utils/workspace-filesystem-lifecycle.ts:482-484`
On shutdown/force-stop, `cleanupWorkspaceDaemonFiles` is only invoked if the caller passes `options.daemonCleanup`. In the unconfigured-workspace branch (no `workspaceKey`), the function returns a zero result without attempting any simulator session stop or daemon cleanup, even though stale artifacts may still exist for the runtime's workspace. Callers relying on this for a complete shutdown sweep may leave daemon files behind.
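One way to make the gap visible to callers, sketched under the assumption that a default daemon cleanup can be supplied and that the result type may be extended. Every name below (`cleanupOwnedArtifactsSketch`, `CleanupResult`, `skipped`) is hypothetical, and the sketch omits the simulator session stop for brevity.

```typescript
// Hypothetical result and option shapes; the real types are not shown in this report.
type CleanupResult = { removed: number; skipped: string[] };

type CleanupOptions = {
  workspaceKey?: string;
  daemonCleanup?: (workspaceKey: string) => number; // returns the number of removed files
};

function cleanupOwnedArtifactsSketch(
  options: CleanupOptions,
  defaultDaemonCleanup: (workspaceKey: string) => number,
): CleanupResult {
  if (options.workspaceKey === undefined) {
    // Report the skipped step instead of returning an indistinguishable zero result.
    return { removed: 0, skipped: ['daemon cleanup: no workspace key configured'] };
  }
  // Fall back to a default so that omitting daemonCleanup does not skip the step.
  const daemonCleanup = options.daemonCleanup ?? defaultDaemonCleanup;
  return { removed: daemonCleanup(options.workspaceKey), skipped: [] };
}
```

Callers that expect a complete shutdown sweep can then check `skipped` and log or escalate when it is non-empty.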
Duration: 13m 10s · Tokens: 1.3M in / 22.8k out · Cost: $7.83 (+extraction: $0.00, +merge: $0.00, +fix_gate: $0.00)