perf: optimize sync interval and use mtime-based change detection #267

@troshab

Problem

The current sync mechanism (packages/mcp/src/sync.ts) runs every 5 minutes and does a full directory traversal with SHA-256 hashing of every file on each sync. This has two issues:

  1. 5 minutes is too long for LLM-assisted workflows where code changes frequently. Search results can be stale while the LLM has already rewritten multiple files.

  2. Full content hashing is expensive for large codebases. Every sync reads and SHA-256 hashes every file, even when nothing changed. For a codebase with 1000 files, this means ~1000 file reads + hashes every 5 minutes.

Proposed Changes

1. Configurable sync interval via environment variable

Add a SYNC_INTERVAL_MS (or SYNC_INTERVAL_SECONDS) environment variable with a default of 60 seconds (1 minute) instead of the current 300 seconds (5 minutes).

const syncIntervalMs = parseInt(process.env.SYNC_INTERVAL_MS || '60000', 10);
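One caveat with the one-liner above: parseInt returns NaN for a value such as "abc", and a NaN delay is generally treated by timers as the minimum delay, so a typo in the variable could make the sync loop run continuously. A slightly more defensive variant might look like the sketch below. The helper name parseSyncIntervalMs and the 1-second lower bound are assumptions made for illustration; neither exists in the codebase.

```typescript
// Hypothetical helper, not part of the existing sync.ts.
const DEFAULT_SYNC_INTERVAL_MS = 60_000; // proposed default: 1 minute
const MIN_SYNC_INTERVAL_MS = 1_000; // assumed lower bound, to avoid a near-busy loop

function parseSyncIntervalMs(raw: string | undefined): number {
  // Unset or empty variable: use the default.
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_SYNC_INTERVAL_MS;
  }
  const parsed = Number(raw);
  // Reject NaN, Infinity, fractional, zero, and negative values.
  if (!Number.isInteger(parsed) || parsed <= 0) {
    return DEFAULT_SYNC_INTERVAL_MS;
  }
  // Clamp very small values up to the assumed minimum.
  return Math.max(parsed, MIN_SYNC_INTERVAL_MS);
}
```

It would be called as `const syncIntervalMs = parseSyncIntervalMs(process.env.SYNC_INTERVAL_MS);`. Falling back to the default on bad input (rather than throwing) is a design choice; failing fast at startup would be equally defensible.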

2. mtime + size based change detection

Instead of reading and hashing every file on every sync:

  1. stat() each file to get mtime + size (~100x faster than read + hash)
  2. Only read + hash files where mtime or size changed
  3. Use the content hash for final verification (avoids false positives from mtime-only changes such as touch)

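Checking metadata before content is a well-established technique: rsync's default quick check compares size and modification time, and make decides what to rebuild by comparing modification times.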
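The three steps could be sketched roughly as follows. This is a minimal illustration, not the existing sync.ts logic: the names FileSnapshot, FileAccess, and detectChanges are hypothetical, the cache is a plain in-memory Map, and deleted files (for which stat throws) are not handled.

```typescript
import { createHash } from "node:crypto";
import { readFileSync, statSync } from "node:fs";

// Hypothetical per-file record kept between syncs.
interface FileSnapshot {
  mtimeMs: number;
  size: number;
  hash: string;
}

// Minimal file access abstraction so the logic can be exercised without disk I/O.
interface FileAccess {
  stat(path: string): { mtimeMs: number; size: number };
  read(path: string): Uint8Array;
}

const nodeFileAccess: FileAccess = {
  stat: (path) => statSync(path),
  read: (path) => readFileSync(path),
};

function sha256(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Returns the paths whose content differs from the cached snapshot
// and updates the cache in place.
function detectChanges(
  paths: string[],
  cache: Map<string, FileSnapshot>,
  fsAccess: FileAccess = nodeFileAccess,
): string[] {
  const changed: string[] = [];
  for (const path of paths) {
    // Step 1: cheap metadata check.
    const { mtimeMs, size } = fsAccess.stat(path);
    const prev = cache.get(path);
    if (prev && prev.mtimeMs === mtimeMs && prev.size === size) {
      continue; // Assumed unchanged: no read, no hash.
    }
    // Step 2: metadata differs (or file is new), so read and hash it.
    const hash = sha256(fsAccess.read(path));
    // Step 3: only report a change if the content hash really differs,
    // which filters out mtime-only updates such as touch.
    if (!prev || prev.hash !== hash) {
      changed.push(path);
    }
    cache.set(path, { mtimeMs, size, hash });
  }
  return changed;
}
```

One known limitation of any mtime + size check: an edit that keeps the file size identical and lands within the filesystem's timestamp granularity would be missed until the next real modification. Whether that risk is acceptable here is a judgment call for the maintainers.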

Impact

  • For a 1000-file project with 0 changes: ~5ms (stat only) vs ~500ms (read + hash)
  • Enables shorter sync intervals without performance penalty
  • Merkle tree structure can be preserved on top of this optimization
