-
-
Notifications
You must be signed in to change notification settings - Fork 554
Feature/sheets handling #282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature/sheets handling #282
Conversation
- Implement ExcelFileHandler for .xlsx/.xls/.xlsm files using xlsx library - Add FileHandler interface with read(), write(), getInfo(), editRange() methods - Create file handler factory for automatic handler selection by extension - Support Excel-specific features: sheet selection, range queries, offset/length - Add edit_block range mode for Excel cell/range updates - Refactor handleGetFileInfo to be generic (handles any file type's nested data) - Remove .describe() calls from schemas (WriteFileArgsSchema, EditBlockArgsSchema) - Add technical debt comments documenting text edit architectural inconsistency
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
WalkthroughAdds a file-handler abstraction (text, excel, image, binary) with a factory; refactors filesystem tools to use handlers and new ReadOptions/FileResult types; integrates Excel search; adds ExcelJS dependency and tests; implements virtual Node sessions/execute_node fallback; enhances process tooling and Python detection; updates schemas and tool descriptions. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Nitpicks 🔍
|
|
CodeAnt AI finished reviewing your PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (22)
src/tools/schemas.ts (1)
88-102: Schema refine logic is correct, but consider stricter typing forcontent.The refine validation correctly enforces that either text replacement fields or range-based fields must be provided together. However,
content: z.any()is very permissive and provides no runtime validation.Consider adding a more specific type that covers the expected content shapes (arrays, objects, primitives) while still allowing flexibility:
- content: z.any().optional(), + content: z.union([ + z.array(z.array(z.any())), // 2D array for Excel data + z.record(z.any()), // Object with sheet names as keys + z.string(), + z.number(), + ]).optional(),This provides documentation value and catches obvious misuse while maintaining flexibility for Excel data structures.
test/test-file-handlers.js (1)
210-218: Fragile status message removal pattern.The regex
^\[.*?\]\n\nto strip status messages is brittle and could break if the status message format changes. Consider using a more robust approach:- const parsed = JSON.parse(readContent.replace(/^\[.*?\]\n\n/, '')); // Remove status message + // Find where the JSON actually starts (skip any status prefix) + const jsonStart = readContent.indexOf('{'); + const parsed = JSON.parse(readContent.substring(jsonStart));Alternatively, consider adding an option to suppress status messages in read operations for testing scenarios.
src/tools/edit.ts (2)
374-417: TheeditRangeresult is not used.The
editRangemethod returns anEditResultwithsuccess,editsApplied, anderrorsfields, but this result is discarded. Consider using the result to provide more detailed feedback:- if ('editRange' in handler && typeof handler.editRange === 'function') { - await handler.editRange(parsed.file_path, parsed.range, content, parsed.options); - return { - content: [{ - type: "text", - text: `Successfully updated range ${parsed.range} in ${parsed.file_path}` - }], - }; + if ('editRange' in handler && typeof handler.editRange === 'function') { + const result = await handler.editRange(parsed.file_path, parsed.range, content, parsed.options); + if (!result.success) { + const errorMessages = result.errors?.map(e => `${e.location}: ${e.error}`).join('\n') || 'Unknown error'; + return { + content: [{ + type: "text", + text: `Failed to update range ${parsed.range}: ${errorMessages}` + }], + isError: true + }; + } + return { + content: [{ + type: "text", + text: `Successfully applied ${result.editsApplied} edit(s) to range ${parsed.range} in ${parsed.file_path}` + }], + };This provides better user feedback and properly handles partial failures reported by the handler.
376-377: Consider static import for better performance.The dynamic import of
getFileHandleradds latency to every range-based edit call. Since this is a core feature, consider using a static import at the top of the file:+import { getFileHandler } from '../utils/files/factory.js'; // ... later in handleEditBlock ... - const { getFileHandler } = await import('../utils/files/factory.js'); - const handler = getFileHandler(parsed.file_path); + const handler = getFileHandler(parsed.file_path);Dynamic imports are better suited for rarely-used code paths or optional features, not core functionality.
src/utils/files/factory.ts (2)
49-61: Consider usingArray.find()for cleaner iteration.The loop logic is correct, but could be more idiomatic.
export function getFileHandler(path: string): FileHandler { const allHandlers = initializeHandlers(); - // Try each handler in order - for (const handler of allHandlers) { - if (handler.canHandle(path)) { - return handler; - } - } - - // Fallback to binary handler (should never reach here due to binary catch-all) - return allHandlers[allHandlers.length - 1]; + // Find first handler that can handle this file type + // BinaryFileHandler.canHandle() always returns true, so this never returns undefined + return allHandlers.find(h => h.canHandle(path)) ?? allHandlers[allHandlers.length - 1]; }
68-89: Potential duplication with handlercanHandle()methods.Both
isExcelFile/isImageFilehelper functions and the handlercanHandle()methods perform extension checking. Consider consolidating to avoid drift. For now, this works correctly.One option is to delegate to the handler:
export function isExcelFile(path: string): boolean { return new ExcelFileHandler().canHandle(path); }Or extract shared extension lists to a constants file. Low priority since current implementation is functional.
src/handlers/node-handlers.ts (2)
24-27: Consider usingcrypto.randomUUID()for stronger temp file uniqueness.The current naming uses
Date.now()+Math.random(), which is generally sufficient but could theoretically collide under high concurrency.+import { randomUUID } from 'crypto'; + // Create temp file IN THE MCP DIRECTORY so ES module imports resolve correctly -const tempFile = path.join(mcpRoot, `.mcp-exec-${Date.now()}-${Math.random().toString(36).slice(2)}.mjs`); +const tempFile = path.join(mcpRoot, `.mcp-exec-${randomUUID()}.mjs`);
32-56: Timeout produces generic exit code, not explicit timeout message.When
spawn'stimeouttriggers, it sends SIGTERM and the process exits with code 143 (on Unix) or similar. The user sees "Execution failed (exit code 143)" rather than a clear timeout indication.Consider tracking timeout explicitly:
const result = await new Promise<{ stdout: string; stderr: string; exitCode: number }>((resolve) => { + let timedOut = false; const proc = spawn(process.execPath, [tempFile], { cwd: mcpRoot, timeout: timeout_ms }); let stdout = ''; let stderr = ''; proc.stdout.on('data', (data) => { stdout += data.toString(); }); proc.stderr.on('data', (data) => { stderr += data.toString(); }); proc.on('close', (exitCode) => { - resolve({ stdout, stderr, exitCode: exitCode ?? 1 }); + resolve({ + stdout, + stderr: timedOut ? `Execution timed out after ${timeout_ms}ms\n${stderr}` : stderr, + exitCode: exitCode ?? 1 + }); }); proc.on('error', (err) => { + if (err.message.includes('ETIMEDOUT') || err.message.includes('killed')) { + timedOut = true; + } resolve({ stdout, stderr: stderr + '\n' + err.message, exitCode: 1 }); }); });src/handlers/filesystem-handlers.ts (1)
96-106: Timeout error handling could be more descriptive.When
withTimeoutreturnsnull, the error message'Failed to read the file'doesn't indicate that a timeout occurred. Consider making this clearer:if (result == null) { - // Handles the impossible case where withTimeout resolves to null instead of throwing - throw new Error('Failed to read the file'); + throw new Error('Read file operation timed out after 60 seconds'); }test/test-excel-files.js (4)
180-186: Strengthen the range exclusion assertion.The assertion
!content.includes('City') || content.split('City').length === 1is logically equivalent to!content.includes('City')(the second condition is always false if the first is false). Consider simplifying:- // City is column C, should NOT be included - assert.ok(!content.includes('City') || content.split('City').length === 1, - 'Range A1:B2 should not include City column'); + // City is column C, should NOT be included + assert.ok(!content.includes('City'), + 'Range A1:B2 should not include City column');
314-322: Negative offset test assertion is too permissive.The assertion
content.includes('Fourth') || content.includes('Fifth')will pass even if only one of the expected rows is present. Foroffset: -2, both the fourth and fifth rows should be included:- assert.ok(content.includes('Fourth') || content.includes('Fifth'), - 'Should include data from last rows'); + assert.ok(content.includes('Fourth'), 'Should include Fourth (second to last row)'); + assert.ok(content.includes('Fifth'), 'Should include Fifth (last row)');
343-359: Consider adding error case tests.The test suite covers happy paths well but lacks tests for error scenarios such as:
- Reading a non-existent Excel file
- Writing to a read-only location
- Reading with an invalid sheet name/index
- Corrupted Excel file handling
These would improve confidence in error handling robustness.
Would you like me to help generate error case tests?
86-87: Constructor name check may break with minification.Using
handler.constructor.name === 'ExcelFileHandler'is fragile as minifiers can rename classes. Consider usinginstanceofif the class is exported, or add a handler type identifier property:// Alternative approach using instanceof import { ExcelFileHandler } from '../dist/utils/files/excel.js'; assert.ok(handler instanceof ExcelFileHandler, ...);For tests against the
distfolder, minification is less likely an issue, but this is worth noting for future-proofing.src/utils/files/image.ts (3)
38-51: Consider adding size limit or chunked reading for large images.Images are read entirely into memory and base64-encoded, which increases memory usage by ~33%. For very large images (e.g., 100MB+ RAW files), this could cause memory pressure. The current implementation is fine for typical use cases, but consider documenting this limitation or adding a size check.
53-61: Add validation for base64 string content.When
contentis a string, the code assumes it's base64-encoded. If a non-base64 string is passed,Buffer.from(content, 'base64')will silently produce garbage output. Consider adding validation:async write(path: string, content: Buffer | string): Promise<void> { // If content is base64 string, convert to buffer if (typeof content === 'string') { + // Validate base64 format + if (!/^[A-Za-z0-9+/]*={0,2}$/.test(content)) { + throw new Error('Invalid base64 string content for image write'); + } const buffer = Buffer.from(content, 'base64'); await fs.writeFile(path, buffer); } else { await fs.writeFile(path, content); } }Alternatively, document that the caller must ensure valid base64 input.
84-92: Minor: getMimeType could use the static map directly.The loop iterates over entries when you could use a direct lookup:
private getMimeType(path: string): string { const lowerPath = path.toLowerCase(); - for (const [ext, mimeType] of Object.entries(ImageFileHandler.IMAGE_MIME_TYPES)) { - if (lowerPath.endsWith(ext)) { - return mimeType; - } - } + const ext = '.' + lowerPath.split('.').pop(); + return ImageFileHandler.IMAGE_MIME_TYPES[ext] ?? 'application/octet-stream'; - return 'application/octet-stream'; // Fallback }The current implementation works correctly; this is just a minor optimization.
src/search-manager.ts (1)
369-395: Potential performance issue with row.eachCell callback.The
eachRowcallback with nestedeachCelloperations could be slow for large workbooks with many sheets. Consider:
- Adding an early exit when
maxResultsis reached at the worksheet level- Processing worksheets in parallel for very large files
Current implementation is functional but may have performance implications for large Excel files.
src/utils/files/text.ts (3)
42-46: Clarify fallback handler semantics in canHandle.
canHandlealways returnstrue, making this the fallback handler. This design relies on the factory checking other handlers first. Consider adding a comment to clarify this dependency or throwing if the path is a known non-text format:canHandle(path: string): boolean { - // Text handler is the default - handles most files - // Only returns false for known non-text formats (checked by other handlers) + // Text handler is the fallback - handles any file not claimed by other handlers. + // The factory (src/utils/files/factory.ts) must check specialized handlers first. return true; }
26-36: Constants duplication - consider importing from a shared location.As noted in the comments, these constants are duplicated from
filesystem.ts. Consider extracting them to a shared constants file to maintain DRY principle:// src/utils/files/constants.ts export const FILE_SIZE_LIMITS = { ... } as const; export const READ_PERFORMANCE_THRESHOLDS = { ... } as const;This is acknowledged technical debt but worth tracking.
370-440: Byte estimation may skip lines in edge cases.The
readFromEstimatedPositionmethod estimates byte position based on average line length from a sample. If the file has highly variable line lengths (e.g., short header followed by long data rows), the estimated position could be significantly off. The method handles this by skipping the first partial line, but the actual line offset displayed to the user viagenerateEnhancedStatusMessagewill be the requested offset, not the actual offset.Consider adding a note to the status message when using estimation:
const content = includeStatusMessage - ? `${this.generateEnhancedStatusMessage(result.length, offset, fileTotalLines, false)}\n\n${result.join('\n')}` + ? `${this.generateEnhancedStatusMessage(result.length, offset, fileTotalLines, false)} (estimated position)\n\n${result.join('\n')}` : result.join('\n');src/utils/files/excel.ts (2)
77-136: Consider extracting ENOENT check to a helper.The ENOENT check pattern is duplicated at lines 83-86 and 131-134. Consider extracting to a small helper for clarity.
+private isFileNotFoundError(error: unknown): boolean { + return (error as any).code === 'ENOENT' || + (error instanceof Error && error.message.includes('ENOENT')); +} + async write(path: string, content: any, mode?: 'rewrite' | 'append'): Promise<void> { // Check existing file size if it exists try { await this.checkFileSize(path); } catch (error) { // File doesn't exist - that's fine for write - if ((error as any).code !== 'ENOENT' && - !(error instanceof Error && error.message.includes('ENOENT'))) { + if (!this.isFileNotFoundError(error)) { throw error; } }
284-309: Minor: redundantfs.statcall inextractMetadata.
extractMetadataperforms an additionalfs.statcall, but the file size was already checked incheckFileSize. Consider caching or passing the stats to avoid the extra I/O, though this is a minor optimization given the file size limit.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (20)
manifest.template.json(2 hunks)package.json(1 hunks)src/handlers/filesystem-handlers.ts(3 hunks)src/handlers/index.ts(1 hunks)src/handlers/node-handlers.ts(1 hunks)src/search-manager.ts(4 hunks)src/server.ts(8 hunks)src/tools/edit.ts(2 hunks)src/tools/filesystem.ts(13 hunks)src/tools/schemas.ts(3 hunks)src/utils/files/base.ts(1 hunks)src/utils/files/binary.ts(1 hunks)src/utils/files/excel.ts(1 hunks)src/utils/files/factory.ts(1 hunks)src/utils/files/image.ts(1 hunks)src/utils/files/index.ts(1 hunks)src/utils/files/text.ts(1 hunks)test/test-excel-files.js(1 hunks)test/test-file-handlers.js(1 hunks)test/test_output/node_repl_debug.txt(0 hunks)
💤 Files with no reviewable changes (1)
- test/test_output/node_repl_debug.txt
🧰 Additional context used
🧬 Code graph analysis (13)
test/test-file-handlers.js (3)
src/config-manager.ts (1)
configManager(227-227)src/utils/files/factory.ts (1)
getFileHandler(49-61)src/tools/filesystem.ts (3)
readFile(725-733)writeFile(824-847)getFileInfo(1077-1125)
src/utils/files/binary.ts (1)
src/utils/files/base.ts (4)
FileHandler(13-64)ReadOptions(73-91)FileResult(96-105)FileInfo(173-200)
src/utils/files/text.ts (2)
src/utils/files/index.ts (1)
TextFileHandler(13-13)src/utils/files/base.ts (4)
FileHandler(13-64)ReadOptions(73-91)FileResult(96-105)FileInfo(173-200)
src/utils/files/image.ts (1)
src/utils/files/base.ts (4)
FileHandler(13-64)ReadOptions(73-91)FileResult(96-105)FileInfo(173-200)
src/handlers/node-handlers.ts (2)
src/types.ts (1)
ServerResult(73-77)src/tools/schemas.ts (1)
ExecuteNodeArgsSchema(168-171)
src/utils/files/factory.ts (5)
src/utils/files/base.ts (1)
FileHandler(13-64)src/utils/files/excel.ts (1)
ExcelFileHandler(33-485)src/utils/files/image.ts (1)
ImageFileHandler(18-93)src/utils/files/text.ts (1)
TextFileHandler(41-441)src/utils/files/binary.ts (1)
BinaryFileHandler(20-173)
src/handlers/filesystem-handlers.ts (2)
src/utils/files/base.ts (1)
ReadOptions(73-91)src/tools/filesystem.ts (2)
readFile(725-733)getFileInfo(1077-1125)
src/utils/files/base.ts (3)
src/tools/filesystem.ts (1)
FileResult(286-286)test/test-excel-files.js (6)
content(121-121)content(175-175)content(196-196)content(233-233)content(288-288)content(316-316)test/test-file-handlers.js (3)
content(184-184)content(206-206)content(255-255)
src/tools/edit.ts (2)
src/tools/schemas.ts (1)
EditBlockArgsSchema(88-102)src/utils/files/factory.ts (1)
getFileHandler(49-61)
src/server.ts (1)
src/tools/schemas.ts (2)
EditBlockArgsSchema(88-102)ExecuteNodeArgsSchema(168-171)
src/search-manager.ts (1)
src/utils/files/factory.ts (1)
isExcelFile(68-71)
src/utils/files/excel.ts (1)
src/utils/files/base.ts (6)
ExcelSheet(134-143)FileHandler(13-64)ReadOptions(73-91)FileResult(96-105)EditResult(152-164)FileInfo(173-200)
src/tools/filesystem.ts (3)
src/utils/files/base.ts (2)
ReadOptions(73-91)FileResult(96-105)src/utils/files/factory.ts (1)
getFileHandler(49-61)src/utils/files/index.ts (1)
getFileHandler(10-10)
🪛 ast-grep (0.40.0)
src/search-manager.ts
[warning] 324-324: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 328-328: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(escaped, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 343-343: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(^${regexPat}$, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🔇 Additional comments (57)
manifest.template.json (2)
91-91: LGTM!The updated description accurately reflects the new dual-mode functionality of
edit_block, supporting both text replacement with fuzzy matching and range-based updates for Excel files.
137-140: LGTM!The new
execute_nodetool entry is properly declared with an appropriate description covering ES module support and top-level await.src/tools/schemas.ts (2)
50-52: LGTM!The new
sheet,range, andoptionsfields properly extend the read file schema to support Excel-specific operations. Thesheetunion type correctly allows both string (sheet name) and number (sheet index) references.
167-171: LGTM!The
ExecuteNodeArgsSchemais well-defined with a sensible 30-second default timeout. The schema aligns with the handler implementation innode-handlers.ts.src/handlers/index.ts (1)
8-8: LGTM!The new re-export follows the established pattern and properly exposes the node-handlers module through the central handlers index.
test/test-file-handlers.js (2)
288-303: LGTM!The exported
runTestsfunction properly handles setup/teardown lifecycle, error handling, and returns a boolean status for integration with test runners.
238-238: Unable to complete verification due to repository access limitations.Unfortunately, I'm unable to clone or access the repository code to verify the concerns raised in your review comment. While the repository
wonderwhy-er/DesktopCommanderMCPis confirmed to be public, I cannot inspect the codebase to:
- Locate the
getFileInfoimplementation- Check the
FileInfotype definition- Determine if the string
'true'is actually returned by the API- Confirm whether this type inconsistency is intentional (backward compatibility) or unintentional (bug)
The review comment's core concern remains valid: The assertion
info.isFile === true || info.isFile === 'true'does suggest potential type inconsistency that should be resolved either by:
- Fixing the source to consistently return boolean
true, or- Adding documentation explaining why both types are acceptable
src/tools/edit.ts (2)
1-16: Good architectural documentation.The technical debt comment clearly documents the current inconsistency and provides a clear path for future refactoring. This is valuable for maintainability.
420-423: Non-null assertions are safe here due to schema refinement.The
!assertions onold_stringandnew_stringare valid because the schema'srefinevalidation guarantees that whenrange/contentare not provided,old_stringandnew_stringmust both be present. The control flow reaching this point ensures the values exist.package.json (1)
86-86: Check transitive dependencies of exceljs for known vulnerabilities.The exceljs 4.4.0 library has no direct CVEs, but multiple security advisories exist for its transitive dependencies (inflight, brace-expansion, lodash, tmp via archiver/glob). These issues remain open in the exceljs repository as of January 2025. Run
npm auditon your project to identify the actual exposure in your dependency tree and consider:
- Using npm/yarn overrides to patch vulnerable transitive deps
- Monitoring the exceljs repository for updated releases
- Evaluating alternative Excel libraries if transitive risk is unacceptable
src/utils/files/factory.ts (1)
12-32: LGTM - Clean singleton pattern with correct priority ordering.The lazy initialization ensures handlers are only created once, and the priority order correctly places more specific handlers (Excel, Image) before the catch-all (Text, Binary).
src/server.ts (4)
265-272: LGTM - Clear documentation of format-specific handling.The FORMAT HANDLING section provides clear guidance on how different file types are processed, including Excel-specific parameters (sheet, range) and image encoding.
328-332: LGTM - Excel write format documentation is clear.The example JSON format
'[["Name","Age"],["Alice",30]]'clearly shows the expected 2D array structure for Excel writes.
1206-1217: LGTM - Handler follows established patterns.Error handling is consistent with other tool handlers, capturing errors and returning structured
ServerResultwithisError: true.
992-1020: Unable to verify security safeguards due to repository access limitations.The repository failed to clone, preventing verification of the codebase for existing security patterns (allowedDirectories, validatePath, sandbox, etc.) and the actual implementation details of the
execute_nodetool.The original review comment raises valid security concerns about arbitrary code execution with MCP server privileges. However, without access to:
- The actual implementation of
execute_node(including any runtime restrictions)- Security patterns used by other tools in the codebase
- Runtime environment configuration and privilege restrictions
- Relevant security design documentation or comments
I cannot definitively verify whether additional safeguards are needed or if this is an intentional design decision.
src/utils/files/index.ts (1)
1-16: LGTM - Well-organized barrel exports.The export structure is clean and follows a logical order: base interfaces first, then factory utilities, then concrete handler implementations.
src/handlers/node-handlers.ts (2)
58-59: Silent cleanup failure is appropriate here.The
.catch(() => {})pattern for cleanup is correct since we don't want cleanup failures to mask the actual execution result.
61-76: LGTM - Error and success paths return proper ServerResult structure.The handler correctly distinguishes between execution failure (non-zero exit) and success, with appropriate
isErrorflag and content formatting.src/utils/files/binary.ts (3)
20-24: LGTM - Correct catch-all semantics.
canHandlereturningtrueunconditionally is the correct behavior for a fallback handler that should process anything not handled by more specific handlers.
26-36: LGTM - Instructional response pattern is well-designed.Instead of attempting to read binary content (which would be meaningless), the handler returns actionable guidance. The
isBinary: truemetadata allows callers to detect this case.
154-172: LGTM - Reasonable MIME type mapping with safe fallback.The mapping covers common binary types, and the
application/octet-streamfallback is the correct choice for unknown extensions.src/handlers/filesystem-handlers.ts (3)
63-70: LGTM - Clean ReadOptions consolidation.The refactored approach using a single
ReadOptionsobject is cleaner than multiple parameters and aligns well with the interface defined insrc/utils/files/base.ts.
72-92: Verify Buffer handling for non-image file content.When
fileResult.contentis aBuffer(as defined inFileResult), calling.toString()without encoding defaults to UTF-8, which is appropriate for text. However, if a binary file somehow reaches this path, the output could be garbled.The current flow relies on the handler system routing binary files correctly, which should be fine given the factory pattern. The implementation looks correct.
248-273: Consider recursive handling for deeply nested objects.The
formatValuefunction handles one level of nesting (arrays of objects), butJSON.stringifyis used for nested objects at line 270. This is reasonable for most cases, but for consistency with array formatting, you might want to add amaxDepthparameter if deeply nested structures become common.Current implementation is acceptable for the stated use cases.
test/test-excel-files.js (1)
34-42: LGTM - Proper cleanup with ENOENT handling.The cleanup function correctly handles the case where the directory doesn't exist, preventing spurious errors during initial cleanup.
src/utils/files/image.ts (2)
33-36: LGTM - Clean extension checking.The
canHandlemethod correctly uses case-insensitive matching for file extensions.
63-79: LGTM - File info implementation is correct.The
getInfomethod properly extracts file stats and returns appropriate metadata with theisImageflag.src/search-manager.ts (3)
149-168: LGTM - Clean parallel Excel search integration.The Excel search runs concurrently with ripgrep and errors are caught without failing the overall search. Results are correctly added to the session.
428-465: LGTM - Recursive file discovery is well-implemented.The
findExcelFilesmethod correctly:
- Skips hidden directories and
node_modules- Handles both file and directory root paths
- Silently handles permission errors
470-482: LGTM - Context extraction utility.The
getMatchContextmethod provides reasonable context around matches with proper boundary handling and ellipsis indicators.src/utils/files/text.ts (4)
238-282: LGTM - Efficient reverse reading implementation.The
readLastNLinesReversemethod efficiently reads from the end of large files using chunked buffer reads. The file descriptor is properly closed in thefinallyblock.
287-326: Circular buffer implementation is correct.The
readFromEndWithReadlineuses a circular buffer pattern to efficiently capture the last N lines without storing all lines in memory. The reconstruction logic at lines 311-319 correctly handles both cases (total lines >= requested and < requested).
71-98: LGTM - getInfo implementation with line count optimization.The implementation correctly avoids counting lines for large files and handles read errors gracefully.
48-61: I'll verify the factory handler order to confirm whether the binary file check timing is actually an issue. Let me search for the handler factory implementation.
<function_calls>
Search for the file handler factory function
#!/bin/bashFind factory function and handler implementations
rg -n "getFileHandler|class.*FileHandler" --type=ts -A 5 -B 2 | head -100
</function_calls>
Search for factory pattern or handler selection logic
#!/bin/bashLook for factory pattern and handler selection
rg -n "BinaryFileHandler|TextFileHandler" --type=ts -C 3 | head -80
</function_calls>
Find canHandle method implementation
#!/bin/bashSearch for canHandle method
rg -n "canHandle" --type=ts -A 3 -B 1
</function_calls>src/utils/files/base.ts (6)
13-64: LGTM!The
FileHandlerinterface is well-designed with clear method contracts. The optionaleditRangemethod appropriately allows for structured editing in handlers that support it (like Excel), while keeping the core interface minimal for simpler handlers.
70-91: LGTM!The
ReadOptionsinterface provides a clean unified API for different file handlers, with Excel-specific options clearly documented.
93-129: LGTM!The
FileResultandFileMetadatainterfaces provide flexible structures that accommodate different file types while maintaining a consistent API.
131-143: LGTM!Clean and minimal interface for Excel sheet metadata.
148-164: LGTM!The
EditResultinterface appropriately supports partial success scenarios with detailed error reporting per location.
170-200: LGTM!The
FileInfointerface provides comprehensive file metadata with clear separation between standard file stats and type-specific information.src/utils/files/excel.ts (7)
1-27: LGTM!Clean imports and reasonable file size limit for in-memory Excel processing.
35-38: LGTM!The extension checking works correctly. The approach of lowercasing the entire path before checking
endsWithis functionally equivalent to extracting and comparing the extension.
40-75: LGTM!The read method provides a good user experience with informative headers, pagination info, and modification hints embedded in the output.
158-236: LGTM with a minor note oneditsApplied.The
editRangeimplementation is solid with proper validation and formula support. Note thateditsAppliedreturns1representing one range operation rather than individual cell count - this semantic is reasonable for range-based editing.
238-280: LGTM!Graceful error handling - returns basic file info with error details in metadata when Excel parsing fails, rather than throwing.
311-420: LGTM!Comprehensive implementation handling various Excel cell value types (formulas, rich text, dates, etc.) with proper pagination support including negative offsets for tail reads.
478-484: LGTM!The
columnToNumberalgorithm correctly handles multi-letter columns using base-26 conversion.src/tools/filesystem.ts (10)
11-12: LGTM!Clean imports from the new file handler system.
285-286: LGTM!Re-exporting
FileResultmaintains backward compatibility for existing consumers.
323-328: LGTM!
readFileFromUrlnow returnsFileResultwith metadata structure consistent with the new file handler pattern.
485-485: LGTM!Text reading helpers now return consistent
FileResultwithmetadata: { isImage: false }.
641-701: LGTM!The refactored
readFileFromDiskcleanly delegates to the appropriate file handler while maintaining telemetry and timeout handling. Thecontent.toString()ensures consistent string output.
725-733: LGTM!Clean unified API with proper delegation based on
isUrlflag.
842-846: LGTM!Clean delegation to file handler for write operations.
866-868: LGTM!Correctly adapts to the new
FileResultstructure with proper optional chaining.
1077-1124: LGTM!Well-implemented backward compatibility layer that translates the new
FileInfostructure to the legacy format while adding type-specific metadata.
397-445: I'll verify whether these text reading helper functions are still being used after the refactoring. Let me search the codebase for their usages.
<function_calls>
#!/bin/bashSearch for imports and usages of the text reading helper functions
First, check for usages outside of filesystem.ts
echo "=== Searching for usages of helper functions outside filesystem.ts ==="
rg -n "readFileWithSmartPositioning|readLastNLinesReverse|readFromEndWithReadline|readFromStartWithReadline|readFromEstimatedPosition" --type ts -g '!src/tools/filesystem.ts' -A 2 -B 2echo ""
echo "=== Checking for TextFileHandler and its usage of these functions ==="
rg -n "TextFileHandler" --type ts -A 5 -B 5echo ""
echo "=== Checking if these functions are exported from filesystem.ts ==="
rg -n "export.*readFileWithSmartPositioning|export.*readLastNLinesReverse|export.*readFromEndWithReadline|export.*readFromStartWithReadline|export.*readFromEstimatedPosition" src/tools/filesystem.ts
</blockquote></details> </blockquote></details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
| const results: SearchResult[] = []; | ||
|
|
||
| // Build regex for matching content | ||
| const flags = ignoreCase ? 'i' : ''; | ||
| let regex: RegExp; | ||
| try { | ||
| regex = new RegExp(pattern, flags); | ||
| } catch { | ||
| // If pattern is not valid regex, escape it for literal matching | ||
| const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); | ||
| regex = new RegExp(escaped, flags); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Script executed:
# First, explore the repository structure and find the file under review
find . -name "search-manager.ts" -type f 2>/dev/null | head -5Repository: wonderwhy-er/DesktopCommanderMCP
<shell_result>
Repository: wonderwhy-er/DesktopCommanderMCP
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository wonderwhy-er/DesktopCommanderMCP: Clone operation failed
</shell_result>
ReDoS protection Node.js regex safe-regex library
<web_search_result>
Short summary and recommendations:
-
What safe-regex is: an npm module that heuristically flags potentially catastrophic (exponential-time) regexes by limiting "star height". It can give both false positives and false negatives. [1]
-
Safer alternatives/tools:
- vuln-regex-detector — more accurate detector (claims no false positives and far fewer false negatives; server-based option). [2]
- redos-detector — deterministic checker/CLI/library that proves whether a pattern is safe. [3]
- safe-regex2 — maintained fork of safe-regex with some improvements/options. [4]
-
Practical protections (use together, not just one):
- Static-check regexes at build time with a detector (vuln-regex-detector or redos-detector). [2][3]
- Prefer linear-time engines (e.g., Google RE2) or use an RE2 binding when you must accept untrusted patterns. RE2 avoids backtracking-based ReDoS. [2]
- Enforce input limits (max length) and application-level timeouts / run regexes inside time-limited workers or separate processes to avoid blocking the event loop.
- Avoid allowing user-supplied arbitrary regexes; if needed, validate them strictly and sandbox execution.
- Where possible, convert vulnerable patterns to safe equivalents (atomic groups, possessive quantifiers where supported, or different parsing approach).
-
Quick npm commands:
- safe-regex: npm i safe-regex. Note its heuristic warnings. [1]
- safe-regex2: npm i safe-regex2. [4]
- vuln-regex-detector: npm i vuln-regex-detector. [2]
- redos-detector: npm i redos-detector. [3]
Sources:
[1] safe-regex (npm) — heuristic checker; warns about false positives/negatives.
[2] vuln-regex-detector (npm / repo) — more accurate detector; mentions RE2 and server analysis.
[3] redos-detector (npm) — deterministic ReDoS checker/CLI/library.
[4] safe-regex2 (npm) — maintained fork of safe-regex.
</web_search_result>
user-supplied regex patterns Node.js TypeScript safe validation timeout
<web_search_result>
Short guidance + options (Node.js + TypeScript) to safely accept and run user-supplied regexes and protect against ReDoS / long-running matches.
Key ideas
- Try to avoid running arbitrary backtracking regexes with untrusted input. Prefer a non-backtracking engine (RE2) or sandbox/timebox the evaluation. Use static detectors as a pre-check. (npmjs.com)
Options (ranked by safety / simplicity)
- Use RE2 (recommended)
- Replace RegExp with node-re2 (bindings to Google RE2). RE2 does not use backtracking so vulnerable patterns can't cause exponential-time matches. Most drop-in compatible. (npmjs.com)
Example (TypeScript):
import RE2 from 're2';
const re = new RE2(userPattern, flags);
const ok = re.test(input);
- Pre-validate patterns with a detector
- Run safe-regex / redos-detector / vuln-regex-detector to flag potentially vulnerable patterns before accepting them. These are heuristics; redos-detector is more thorough than safe-regex. Use this as a gate or to require user to choose RE2. (npmjs.com)
- Timebox execution (worker/child + timeout)
- Execute RegExp.test() inside an isolated worker thread or child process and terminate it if it exceeds a short timeout. This prevents main thread freeze and bounds CPU/time. Use worker_threads. Terminate the worker if timeout elapses and treat the match as "timed out/failed". (Also limit input length.) (nodejs.org)
Minimal pattern (TypeScript, sketch):
// main thread
const worker = new Worker('./regex-worker.js', { workerData: { pattern, flags, input } });
const timer = setTimeout(() => { worker.terminate(); resolve({ timedOut: true }); }, 200);
worker.on('message', r => { clearTimeout(timer); resolve(r); });
- Use existing timeout-capable libs
- Libraries such as super-regex / @redlotus/super-regex implement timeouts for Node.js regex execution (wraps execution with a timeout strategy). Consider them if you prefer a packaged solution. (npmjs.com)
Practical mitigations to combine
- Prefer RE2 for untrusted patterns. If you must accept JavaScript RegExp syntax features RE2 lacks (backrefs/lookahead), require stricter validation or run in a sandboxed worker with a short timeout. (npmjs.com)
- Limit input length and rate-limit requests.
- Log and reject patterns flagged by detectors.
- Use typed wrappers in TypeScript so callers can't accidentally call raw RegExp on user input.
References
- node-re2 (RE2 bindings) — safe alternative to backtracking RegExp. (npmjs.com)
- safe-regex — heuristic detector (fast, has false positives/negatives). (npmjs.com)
- redos-detector / vuln-regex-detector — more complete ReDoS analysis tools. (npmjs.com)
- Timeboxing with worker threads / timers (Node.js docs). (nodejs.org)
- super-regex / similar libraries that provide timeouts. (npmjs.com)
If you want, I can:
- produce a ready-to-run TypeScript example for any option above (RE2 + guard, or worker-thread timeout).
Citations:
- 1: https://www.npmjs.com/package/re2?utm_source=openai
- 2: https://www.npmjs.com/package/re2?utm_source=openai
- 3: https://www.npmjs.com/package/safe-regex?utm_source=openai
- 4: https://nodejs.org/en/docs/guides/timers-in-node/?utm_source=openai
- 5: https://www.npmjs.com/package/super-regex?utm_source=openai
- 6: https://www.npmjs.com/package/re2?utm_source=openai
- 7: https://www.npmjs.com/package/re2?utm_source=openai
- 8: https://www.npmjs.com/package/safe-regex?utm_source=openai
- 9: https://www.npmjs.com/package/redos-detector?utm_source=openai
- 10: https://nodejs.org/en/docs/guides/timers-in-node/?utm_source=openai
- 11: https://www.npmjs.com/package/super-regex?utm_source=openai
</web_search_result>
ReDoS regex pattern examples Node.js catastrophic backtracking short patterns
<web_search_result>
Here are short, common ReDoS (catastrophic backtracking) regex examples that hit Node.js (V8) backtracking engines, why they’re bad, a simple attack string, and quick fixes.
Vulnerable short patterns (examples)
-
/(a+)+$/
Why: nested quantifiers cause exponential splits of the same chars.
Attack string: "aaaaaaaaaaaaaaaaaaaaaab" — engine backtracks massively. [1][3] -
/^(a|aa)+$/
Why: overlapping alternation creates many grouping partitions.
Attack string: "aaaaaaaaaaaaaaab". [1][2] -
/(.a){10}/ or /(.)$/
Why: repeated . (dot-star) or nested .* can overreach and fight over characters.
Attack string: long sequences of the same char not meeting the final condition. [1][5] -
/^(\d+\w*)@/ or /^\s+|\s+$/ used badly on pathological inputs
Why: contiguous quantified tokens or subtle overlap between tokens (\d vs \w, etc.) force many backtracking attempts. [1][2]
Why these are dangerous
- Traditional JS regex engines use backtracking (NFA-style). Nested/overlapping greedy quantifiers produce exponentially many alternative parses for the same substring, exhausting CPU. Real outages have resulted from similar patterns (Cloudflare, StackOverflow incidents). [1][2][4]
Quick fixes / mitigations
- Avoid nested/overlapping quantifiers: rewrite patterns so quantifiers don’t apply to subpatterns that can match the same text. Example: instead of /^(a+)+$/, use /^a+$/ if intent is “one or more a’s”. [1][3]
- Make token boundaries explicit (use character classes, anchors, or separators) so groups can’t steal characters from each other. [1][5]
- Use unambiguous alternation (prefer fixed-length alternatives or order them from longest to shortest when possible). [1]
- Replace risky JS RegExp with a linear-time engine (Google RE2). In Node.js use the node-re2 module (bindings) to avoid backtracking entirely. [4]
- Input length limits / timeouts: validate and bound user-controlled input before running expensive regexes. [2]
Short safe examples / rewrites
- Vulnerable: /(a+)+$/ → Safe: /^a+$/ (if goal is “one or more a”). [1][3]
- Vulnerable: /^(.)$/ → Safe: /^.$/ or use a more precise pattern that doesn’t nest . . [1][5]
- Vulnerable: /(\d+\w*)@/ on untrusted input → parse with stricter token rules: /^[0-9]+[A-Za-z0-9_-]*@/ or validate pieces with separate parsing/length checks. [1][2]
References
- Explosive quantifiers / catastrophic backtracking discussion and examples. [1]
- OWASP / ReDoS examples and mitigation guidance. [2]
- StackOverflow explanation of (a+)* / catastrophic backtracking. [3]
- RE2 (no-backtracking engine) — use via node-re2 to avoid ReDoS. [4]
- Practical writeup and visualizations of backtracking behavior. [5]
If you want, I can:
- analyze a specific regex you have and show whether it’s vulnerable and how to rewrite it, or
- give a small Node.js demo (safe vs vulnerable) you can run locally.
</web_search_result>
Based on my research, I can now provide a comprehensive rewritten review comment. The original comment identifies a real vulnerability but proposes an incomplete fix.
Add comprehensive ReDoS protection; simple length checks are insufficient.
The code has a genuine ReDoS vulnerability. User-supplied patterns like /(a+)+$/ or /^(a|aa)+$/ can trigger catastrophic backtracking even with short inputs (< 50 characters), making a length limit of 1000 characters a false security measure. The try-catch-escape fallback also doesn't prevent ReDoS in the main regex branch.
Recommended mitigations (in order of robustness):
-
Use RE2 library (preferred): Replace
RegExpwithnode-re2for linear-time, backtracking-free matching:import RE2 from 're2'; const regex = new RE2(pattern, flags);
-
Validate patterns + add timeout: Combine pattern validation with execution timeout:
- Use
safe-regex,redos-detector, orvuln-regex-detectorto flag vulnerable patterns before compilation - Wrap regex execution in a timeout (e.g., using worker threads or
super-regexlibrary) to prevent event loop blocking
- Use
-
Input length limit is a minor defense only; do not rely on it alone.
Consider refactoring to avoid accepting arbitrary user-supplied regexes, or sandbox their execution in worker threads with a short timeout (e.g., 200ms).
🧰 Tools
🪛 ast-grep (0.40.0)
[warning] 324-324: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 328-328: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(escaped, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🤖 Prompt for AI Agents
In src/search-manager.ts around lines 319 to 330, the current creation of a
RegExp from user-supplied patterns is vulnerable to ReDoS and the
try/catch-escape fallback is insufficient; fix by replacing native RegExp with a
safe engine (preferably node-re2) or otherwise validating patterns and
sandboxing execution: import and use RE2 to construct regexes (or run a
vuln-regex detector like safe-regex/vidos-detector and reject unsafe patterns),
and execute matches in a short-timeout sandbox (worker thread or separate
process) to abort long-running matches; do not rely solely on input length
limits and ensure errors from invalid/unsafe patterns are rejected early with
clear error messages.
| if (filePattern) { | ||
| const patterns = filePattern.split('|').map(p => p.trim()).filter(Boolean); | ||
| excelFiles = excelFiles.filter(filePath => { | ||
| const fileName = path.basename(filePath); | ||
| return patterns.some(pat => { | ||
| // Support glob-like patterns | ||
| if (pat.includes('*')) { | ||
| const regexPat = pat.replace(/\./g, '\\.').replace(/\*/g, '.*'); | ||
| return new RegExp(`^${regexPat}$`, 'i').test(fileName); | ||
| } | ||
| // Exact match (case-insensitive) | ||
| return fileName.toLowerCase() === pat.toLowerCase(); | ||
| }); | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FilePattern regex also needs ReDoS protection.
The glob-to-regex conversion at line 343-344 could also be exploited:
const regexPat = pat.replace(/\./g, '\\.').replace(/\*/g, '.*');
return new RegExp(`^${regexPat}$`, 'i').test(fileName);Since filePattern comes from user input, a pattern like *.*.*.*.*.*.*.*.*.* could create a regex with catastrophic backtracking when tested against certain filenames. Consider using a glob matching library like minimatch or picomatch instead of manual regex conversion.
🧰 Tools
🪛 ast-grep (0.40.0)
[warning] 343-343: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(^${regexPat}$, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🤖 Prompt for AI Agents
In src/search-manager.ts around lines 336 to 350, the current manual
glob-to-regex conversion (pat.replace(/\./g,...).replace(/\*/g,...)) can create
user-controlled regexes vulnerable to ReDoS; replace this manual conversion with
a safe glob-matching library (e.g., minimatch or picomatch) to evaluate fileName
against each pattern, and validate/sanitize the filePattern input (limit overall
length and number of '|' segments) before splitting so excessively long or
repeated wildcards are rejected; ensure you remove the RegExp
construction/testing and instead call the chosen library's match function with
case-insensitive options.
| private parseRange(range: string): [string, string | null] { | ||
| if (range.includes('!')) { | ||
| const [sheetName, cellRange] = range.split('!'); | ||
| return [sheetName, cellRange]; | ||
| } | ||
| return [range, null]; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edge case: sheet names containing ! character.
parseRange uses split('!') which could misparse if a sheet name contains an exclamation mark (e.g., "Hello!World!A1:B2"). While uncommon, Excel allows ! in sheet names when quoted.
private parseRange(range: string): [string, string | null] {
- if (range.includes('!')) {
- const [sheetName, cellRange] = range.split('!');
- return [sheetName, cellRange];
+ // Find the last '!' to handle sheet names that may contain '!'
+ const lastBangIndex = range.lastIndexOf('!');
+ if (lastBangIndex > 0) {
+ const sheetName = range.substring(0, lastBangIndex);
+ const cellRange = range.substring(lastBangIndex + 1);
+ return [sheetName, cellRange || null];
}
return [range, null];
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| private parseRange(range: string): [string, string | null] { | |
| if (range.includes('!')) { | |
| const [sheetName, cellRange] = range.split('!'); | |
| return [sheetName, cellRange]; | |
| } | |
| return [range, null]; | |
| } | |
| private parseRange(range: string): [string, string | null] { | |
| // Find the last '!' to handle sheet names that may contain '!' | |
| const lastBangIndex = range.lastIndexOf('!'); | |
| if (lastBangIndex > 0) { | |
| const sheetName = range.substring(0, lastBangIndex); | |
| const cellRange = range.substring(lastBangIndex + 1); | |
| return [sheetName, cellRange || null]; | |
| } | |
| return [range, null]; | |
| } |
🤖 Prompt for AI Agents
In src/utils/files/excel.ts around lines 451 to 457, parseRange currently uses
range.split('!') which breaks when a sheet name contains '!' (e.g., quoted names
like "Hello!World!A1:B2"); change the logic to find the last '!' (use
lastIndexOf('!')) and split into sheetName = range.slice(0, idx) and cellRange =
range.slice(idx+1), returning [sheetName, cellRange] if idx >= 0, otherwise
[range, null]; this preserves sheet names containing '!' and maintains existing
behavior when no range is present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/search-manager.ts (2)
329-334: ReDoS vulnerability remains unaddressed.This regex construction from user input is vulnerable to catastrophic backtracking, as comprehensively detailed in the previous review comment at lines 324-335. The recommended mitigations (RE2 library, pattern validation, or timeout sandboxing) have not been implemented.
348-349: Glob-to-regex ReDoS vulnerability remains unaddressed.This manual glob-to-regex conversion creates user-controlled regexes vulnerable to ReDoS, as detailed in the previous review comment at lines 341-355. The recommendation to use a safe glob-matching library (e.g.,
minimatchorpicomatch) has not been implemented.
🧹 Nitpick comments (6)
src/search-manager.ts (1)
313-315: Consider prioritizing the refactoring TODO.Extracting Excel search logic to a separate module (e.g.,
src/utils/search/excel-search.ts) would improve maintainability and make it easier to add support for other file types (PDF, etc.) without bloating this file.test/test-file-handlers.js (3)
88-116: Consider adding a test case for binary file fallback.The test covers Excel, Text, and Image handlers well. Per the comment on lines 91-93,
BinaryFileHandleris the fallback when binary detection fails at read time. Consider adding a test case for a known binary extension (e.g.,.exe,.bin) to verify the handler selection behavior, even if it may map toTextFileHandlerinitially.
222-227: Status message stripping regex may be fragile.The regex
/^\[.*?\]\n\n/assumes a specific status message format. If the format changes, this test will fail. Consider usingreadFileInternal(which excludes status messages) for cleaner testing, or document the expected format dependency.
244-250: Type inconsistency check may mask a bug.Line 250 checks
info.isFile === true || info.isFile === 'true', suggesting uncertainty about the return type. ThegetFileInfofunction (seefilesystem.tslines 1130-1139) returns booleans fromfs.stat(). IfisFileis ever a string, that indicates a bug. Consider removing the string check and investigating if this occurs.- assert.ok(info.isFile === true || info.isFile === 'true', 'Should be a file'); + assert.strictEqual(info.isFile, true, 'Should be a file');src/tools/filesystem.ts (2)
876-884: Consider extracting content conversion to a helper function.The content conversion logic (string vs Buffer, with special handling for images) is duplicated from
readFileFromDisk(lines 698-706). Extracting this to a shared helper would reduce duplication and ensure consistent behavior.function convertFileResultContent(result: FileResult): string { if (typeof result.content === 'string') { return result.content; } else if (result.metadata?.isImage) { return result.content.toString('base64'); } else { return result.content.toString('utf8'); } }
674-678: Remove debug logging statement.Line 675 contains what appears to be debug logging (
console.error('error catch ' + error)) that should be removed or converted to structured telemetry consistent with the rest of the file.} catch (error) { - console.error('error catch ' + error); const errorMessage = error instanceof Error ? error.message : String(error); capture('server_read_file_error', {error: errorMessage, fileExtension: fileExtension}); // If we can't stat the file, continue anyway and let the read operation handle errors }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/handlers/filesystem-handlers.ts(3 hunks)src/search-manager.ts(5 hunks)src/tools/filesystem.ts(13 hunks)test/test-file-handlers.js(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/handlers/filesystem-handlers.ts
🧰 Additional context used
🧬 Code graph analysis (2)
src/search-manager.ts (1)
src/utils/files/factory.ts (1)
isExcelFile(68-71)
src/tools/filesystem.ts (3)
src/utils/files/base.ts (2)
ReadOptions(73-91)FileResult(96-105)src/utils/files/factory.ts (1)
getFileHandler(49-61)src/utils/files/index.ts (1)
getFileHandler(10-10)
🪛 ast-grep (0.40.0)
src/search-manager.ts
[warning] 329-329: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 333-333: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(escaped, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 348-348: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(^${regexPat}$, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🔇 Additional comments (22)
src/search-manager.ts (7)
3-3: LGTM: Imports added for Excel search functionality.The
fs/promisesimport supports async file system operations infindExcelFiles, andisExcelFileprovides extension-based filtering. Both are used appropriately.Also applies to: 7-7
29-29: LGTM: New field tracks incomplete search status.The
wasIncompleteflag appropriately captures ripgrep exit code 2 (files couldn't be searched due to permissions/access), improving observability for partial results.
149-172: LGTM: Excel search integrated appropriately as background operation.The conditional Excel search runs asynchronously without blocking the initial response, merging results into the session as they arrive. Error handling correctly logs failures without breaking the main search flow. Node.js single-threaded event loop ensures safe concurrent access to
session.results.
433-470: LGTM: Recursive Excel file discovery handles edge cases appropriately.The method correctly handles both file and directory paths, filters hidden directories and
node_modules, and swallows access errors to maintain robustness during search operations.
475-487: LGTM: Context extraction is straightforward and correct.Boundary checks with
Math.max/Math.minprevent out-of-bounds access, and ellipsis indicators are added appropriately for truncated text.
537-561: LGTM: Excel search eligibility logic is sound.The method correctly identifies when Excel search should run by checking both the rootPath extension and filePattern for Excel-specific patterns. Case-insensitive matching is handled consistently.
174-186: LGTM: Comments clarify async Excel search behavior.The updated comments explain that Excel search runs in the background and results merge asynchronously, which is helpful for understanding the implementation.
test/test-file-handlers.js (7)
35-43: LGTM! Error handling in cleanup is appropriate.The cleanup function correctly suppresses
ENOENTerrors (file not found) while still logging other unexpected errors. This is the right approach for idempotent cleanup operations.
45-59: LGTM! Setup correctly isolates test environment.The setup function properly saves the original config before modifying
allowedDirectories, enabling clean restoration in teardown.
121-141: LGTM! Interface contract validation is sound.The test properly validates the
FileResultinterface structure, checking required fields and optional metadata typing.
146-163: LGTM! ReadOptions behavior is tested adequately.The test validates offset and length options work correctly. The assertions verify expected lines are present after applying the options.
168-188: LGTM! canHandle method validation is correct.The test validates that handler instances correctly identify files they can handle via their
canHandle()method.
258-279: LGTM! Write mode tests are comprehensive.The test correctly validates both
rewrite(default) andappendmodes, checking content replacement and content preservation respectively.
299-325: LGTM! Test runner structure is well-designed.The exported
runTests()function properly handles setup, execution, and teardown with appropriate error handling. The direct execution guard correctly uses exit codes for CI/CD compatibility.src/tools/filesystem.ts (8)
11-12: LGTM! Handler imports are correctly structured.The imports from the new file handler module are appropriately typed and provide the factory function and type definitions needed for the refactored API.
318-329: LGTM! Return structure aligns with FileResult interface.The updated return format with
metadata: { isImage }correctly conforms to theFileResultinterface defined inbase.ts, ensuring consistency across all file reading operations.
483-486: LGTM! Metadata structure is consistent across read helpers.All internal text-reading functions correctly return
metadata: { isImage: false }, maintaining consistency with theFileResultinterface while accurately representing the content type.
737-745: LGTM! ReadOptions forwarding is correct.The updated
readFilesignature properly extracts and forwards the relevant options toreadFileFromDisk, including the new Excel-specificsheetandrangeparameters.
854-859: LGTM! Write delegation to file handler is clean.The
writeFilefunction properly delegates to the appropriate handler while maintaining telemetry capture. The mode parameter is correctly forwarded.
1102-1126: LGTM! Graceful fallback pattern for handler errors.The try-catch around
handler.getInfo()with fallback tofs.stat()values ensures robustness. This prevents handler-specific errors from breaking basic file info retrieval.
1141-1165: LGTM! Metadata mapping maintains backward compatibility.The legacy format mapping correctly extracts type-specific metadata (lineCount for text, sheets for Excel, isImage for images, isBinary for binary files) while maintaining the expected API shape for existing consumers.
698-706: I encountered a repository access issue. However, I can provide guidance based on the code snippet you've provided. Let me request manual verification of the critical piece of information:To properly verify this concern, I need you to check the
ImageFileHandler.read()method implementation and confirm:
What does
ImageFileHandler.read()actually return? Is it:
- A base64-encoded string, or
- A Buffer object?
What is the type annotation (if any) for the return type of
read()?Could you locate the
ImageFileHandlerclass definition in your repository and share the implementation of itsread()method? This will definitively answer whether the re-encoding at lines 701-703 represents double-encoding or is necessary due to the handler returning a Buffer.Alternatively, if you can access the code, the file is likely in a handlers or file-handlers related directory, possibly named something like
image-file-handler.tsorhandlers/image-handler.ts.
|
CodeAnt AI is running Incremental review Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
|
CodeAnt AI Incremental review completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/utils/files/factory.ts (1)
80-106: Consider centralizing extension checks to avoid divergence
isExcelFile/isImageFileduplicate the extension logic that likely also lives inExcelFileHandler.canHandleandImageFileHandler.canHandle. To avoid future drift, consider sharing a small extension helper or reusing the handler methods here instead of re-encoding the suffix lists.src/utils/files/text.ts (1)
27-39: Future: deduplicate size/threshold constants with filesystem moduleThe local
FILE_SIZE_LIMITSandREAD_PERFORMANCE_THRESHOLDSmirror concepts already present infilesystem.ts(per the TODO). Once the reorganization happens, consider centralizing these constants to avoid silent drift between read strategies in different layers.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/tools/edit.ts(2 hunks)src/tools/filesystem.ts(11 hunks)src/utils/files/base.ts(1 hunks)src/utils/files/binary.ts(1 hunks)src/utils/files/factory.ts(1 hunks)src/utils/files/text.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
src/tools/edit.ts (3)
src/handlers/edit-search-handlers.ts (1)
handleEditBlock(13-13)src/tools/schemas.ts (1)
EditBlockArgsSchema(88-102)src/utils/files/factory.ts (1)
getFileHandler(60-78)
src/utils/files/text.ts (1)
src/utils/files/base.ts (3)
ReadOptions(73-91)FileResult(96-105)FileInfo(173-200)
src/utils/files/binary.ts (2)
src/utils/files/index.ts (1)
BinaryFileHandler(15-15)src/utils/files/base.ts (4)
FileHandler(13-64)ReadOptions(73-91)FileResult(96-105)FileInfo(173-200)
🔇 Additional comments (7)
src/utils/files/factory.ts (1)
15-78: Factory ordering and lazy singletons look correctPriority (Excel → Image → Binary → Text) combined with lazy-instantiated handler singletons is clear and safe;
getFileHandleralways returns some handler and does binary detection last to avoid unnecessaryisBinaryFilechecks for known extensions.src/utils/files/text.ts (1)
45-440: TextFileHandler read/write/info implementation looks solidThe handler cleanly centralizes text-file behavior (size-aware read strategies, append vs rewrite, metadata with optional line counts). External reads intentionally normalize line endings via the readline-based helpers, while internal edit flows still use
readFileInternalto preserve exact endings, which keeps editing semantics safe while optimizing user-facing reads.src/utils/files/binary.ts (1)
22-78: Binary handler behavior and instruction messaging are safe and consistentUsing
isBinaryFileforcanHandleand returning human-readable instructions (with onlypath.basename(filePath)exposed) is a good pattern: it avoids leaking full paths into messages and removes the previous risk of interpolating raw paths into shell/REPL examples. The explicit error inwrite()keeps binary writes from silently doing the wrong thing.src/utils/files/base.ts (1)
13-200: File handling interfaces are well-structured and match handler usageThe
FileHandler/ReadOptions/FileResult/FileInfosurface provides a clear, extensible contract that matches how the text, Excel, image, and binary handlers are using it. The optionaleditRangeandFileMetadatafields give enough room for type-specific extensions without complicating the common path.src/tools/filesystem.ts (3)
296-384: Handler-basedreadFileFromDiskwiring and content normalization look correct
readFileFromDisknow cleanly delegates togetFileHandler().read, preserves path validation, and normalizes theFileResultcontent into a string (including base64 for images) while passing through mimeType and metadata. This matches howreadMultipleFilesand URL reads treat images and text, and keeps the higher layers insulated from per-type details.
450-473:writeFiledelegation to handlers is a good abstraction boundaryValidating the path once, capturing telemetry, then routing through
getFileHandler(validPath).writeallows Excel/text/image/binary handlers to own their persistence semantics (including BinaryFileHandler’s explicit rejection), without duplicating logic here.
713-781:getFileInfomerges handler metadata withfs.statsensiblyThe pattern of computing a
fallbackInfofromfs.stat, then letting the handler override fields viagetInfoand projecting into the legacyRecord<string, any>shape (with sheet, lineCount, isImage/isBinary flags) is backward compatible and resilient to handler failures.
…ection
- Remove execute_node tool in favor of start_process("node:local")
- Add pythonInfo to system info for LLM to check Python availability
- Add node:local virtual session handling in improved-process-tools.ts
- Add validatePath to edit.ts range+content branch
- Fix parameter guard (range !== undefined instead of truthy)
- Clean up: Remove node-handlers.ts and ExecuteNodeArgsSchema
- Simplify tool descriptions, document node:local fallback
The node:local approach is cleaner - LLM uses existing start_process/
interact_with_process flow instead of a separate tool. This is the final fallback when users dont have node and dont have python, we run the code on server, that's the idea.
Python detection lets LLM decide when to fall back to Node.js execution.
|
CodeAnt AI is running Incremental review Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
|
CodeAnt AI Incremental review completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
src/utils/system-info.ts (1)
47-51: LeveragepythonInfoin development guidance for clearer tool hints.You now detect Python and expose it via
systemInfo.pythonInfo, butgetDevelopmentToolGuidancestill shows generic Python text. You could improve usability by surfacing the actual detection result (command + version or “not detected”), similar tonodeGuidance, e.g.:-export function getDevelopmentToolGuidance(systemInfo: SystemInfo): string { - const { isWindows, isMacOS, isLinux, platformName, nodeInfo, processInfo } = systemInfo; +export function getDevelopmentToolGuidance(systemInfo: SystemInfo): string { + const { isWindows, isMacOS, isLinux, platformName, nodeInfo, pythonInfo, processInfo } = systemInfo; @@ - const nodeGuidance = nodeInfo + const nodeGuidance = nodeInfo ? `Node.js: v${nodeInfo.version} (${nodeInfo.path})${nodeInfo.npmVersion ? ` | npm: v${nodeInfo.npmVersion}` : ''}` : 'Node.js: Not detected'; + + const pythonGuidance = pythonInfo?.available + ? `Python: v${pythonInfo.version ?? 'unknown'} (command: ${pythonInfo.command})` + : 'Python: Not detected'; @@ - return ` + return ` COMMON WINDOWS DEVELOPMENT TOOLS: - - ${nodeGuidance} - - Python: May be 'python' or 'py' command, check both + - ${nodeGuidance} + - ${pythonGuidance} @@ - return ` + return ` COMMON MACOS DEVELOPMENT TOOLS: @@ -- ${nodeGuidance} -- Python: Usually python3, check if python points to Python 2 +- ${nodeGuidance} +- ${pythonGuidance} @@ - return ` + return ` COMMON LINUX DEVELOPMENT TOOLS: @@ -- Python: Usually python3, python may point to Python 2 -- ${nodeGuidance} +- ${pythonGuidance} +- ${nodeGuidance}This makes the guidance reflect the actual environment instead of generic guesses, while still keeping the OS-specific notes about command naming if you want to retain them.
Also applies to: 721-768
src/tools/schemas.ts (1)
83-102: Schema refine logic is correct, but consider stricter typing forcontent.The refine validation correctly enforces mutual exclusivity between the two editing modes. However,
z.any()forcontentis very permissive.Consider constraining
contentto expected types for better validation:- content: z.any().optional(), + content: z.union([z.array(z.array(z.any())), z.record(z.array(z.array(z.any())))]).optional(),This would enforce 2D arrays or sheet-keyed objects, matching the documented Excel formats.
src/utils/files/text.ts (3)
86-94: Double file read inefficiency ingetInfo.For files under 10MB, the entire file is read just to count lines. If
read()is called afterward, the file content is read again. Consider caching the content or deferring line counting to when it's actually needed.Additionally, the empty catch block silently swallows errors which can make debugging difficult.
try { const content = await fs.readFile(path, 'utf8'); const lineCount = TextFileHandler.countLines(content); info.metadata!.lineCount = lineCount; } catch (error) { - // If reading fails, skip line count + // If reading fails (e.g., encoding issues), skip line count + // This is intentional - metadata is best-effort }
199-234: File may be read multiple times inreadFileWithSmartPositioning.
getFileLineCount()(Line 209) reads the entire file for files under 10MB, then the actual read operation reads it again. For small-to-medium files, this doubles I/O.Consider:
- Reading content once and passing it to both line counting and content extraction, or
- Making line counting lazy/optional via a flag in
ReadOptions.
371-441: Potential accuracy issue with byte-position estimation.The estimation assumes consistent line lengths based on a small sample (10KB). Files with highly variable line lengths (e.g., log files, mixed content) may land far from the intended offset.
The first-line skip logic (Lines 419-422) handles partial line reads correctly, but the actual line offset shown in the status message may not match reality for files with variable line lengths.
Consider adding a note to the status message when using estimation:
const content = includeStatusMessage - ? `${this.generateEnhancedStatusMessage(result.length, offset, fileTotalLines, false)}\n\n${result.join('\n')}` + ? `${this.generateEnhancedStatusMessage(result.length, offset, fileTotalLines, false)} (estimated position)\n\n${result.join('\n')}` : result.join('\n');src/tools/improved-process-tools.ts (1)
19-22: Clarify virtual Node session lifecycle and timeout semantics fornode:localThe virtual session wiring works, but a few aspects could be tightened up:
- Entries in
virtualNodeSessionsare added instartProcessand never removed; over time this can leak map entries and leave “dead” PIDs that still look valid.forceTerminate,readProcessOutput, andlistSessionsare unaware of virtual PIDs, so callers seeNo session found for PID ...even though they were told a session exists, and there’s no way to “terminate” a node:local session besides ignoring it.- In
interactWithProcess, the per-calltimeout_msargument is ignored for virtual sessions; only the timeout captured atstartProcessis honored. That’s a surprising divergence from the rest of the process API.Consider:
- Letting
forceTerminateclean up virtual sessions and return a clearer message, e.g.:export async function forceTerminate(args: unknown): Promise<ServerResult> { const parsed = ForceTerminateArgsSchema.safeParse(args); if (!parsed.success) { @@ - const success = terminalManager.forceTerminate(parsed.data.pid); + const pid = parsed.data.pid; + + if (virtualNodeSessions.has(pid)) { + virtualNodeSessions.delete(pid); + return { + content: [{ + type: "text", + text: `Cleared virtual Node.js session ${pid}` + }], + }; + } + + const success = terminalManager.forceTerminate(pid);
- Making
interactWithProcessrespect a per-call timeout override while still defaulting to the session timeout:- if (virtualNodeSessions.has(pid)) { - const session = virtualNodeSessions.get(pid)!; + if (virtualNodeSessions.has(pid)) { + const session = virtualNodeSessions.get(pid)!; @@ - // Execute code via temp file approach - return executeNodeCode(input, session.timeout_ms); + // Execute code via temp file approach + const effectiveTimeout = timeout_ms ?? session.timeout_ms; + return executeNodeCode(input, effectiveTimeout); }
- (Optional) Teaching
readProcessOutput/listSessionsto special‑case negative PIDs as virtual Node sessions, or explicitly documenting in thenode:localstartup message that these tools don’t apply to such sessions.These tweaks would make the virtual session behavior more predictable and avoid silent accumulation of unused session entries.
Also applies to: 127-151, 529-539
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/server.ts(8 hunks)src/tools/edit.ts(2 hunks)src/tools/improved-process-tools.ts(4 hunks)src/tools/schemas.ts(3 hunks)src/utils/files/factory.ts(1 hunks)src/utils/files/text.ts(1 hunks)src/utils/system-info.ts(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/utils/files/factory.ts
🧰 Additional context used
🧬 Code graph analysis (2)
src/tools/edit.ts (4)
src/handlers/edit-search-handlers.ts (1)
handleEditBlock(13-13)src/tools/schemas.ts (1)
EditBlockArgsSchema(88-102)src/tools/filesystem.ts (1)
validatePath(173-228)src/utils/files/factory.ts (1)
getFileHandler(60-78)
src/server.ts (1)
src/tools/schemas.ts (1)
EditBlockArgsSchema(88-102)
🔇 Additional comments (14)
src/utils/system-info.ts (1)
4-4: Based on my verification through web search, the review comment's technical claims are sound and accurate:
Python
--versionoutput: Confirmed that while Python 3.4+ prints to stdout, older Python versions (2.x and early 3.x) historically printed to stderr. The review's concern about robustness is valid.
execSyncbehavior: Confirmed thatexecSyncreturns stdout only by default, and using2>&1in the command string correctly redirects stderr to stdout in the shell.The proposed fix: Using
2>&1is the correct approach for Node.jsexecSyncto capture both streams.Caching optimization: Valid performance improvement to avoid repeated subprocess calls.
The review is technically correct and provides sound engineering recommendations. Now I'll generate the rewritten review comment:
Python detection may miss valid installations due to stderr output in older Python versions; use
2>&1redirection and consider caching results.While Python 3.4+ prints
--versionto stdout, older Python versions (2.x and early 3.x builds) print to stderr. The currentexecSynccall only captures stdout, soversionbecomes an empty string on systems with older Python, incorrectly reportingavailable: false. SincegetSystemInfo()may be called repeatedly, re-running up to 2–3 sync subprocesses each time is also inefficient.Two concrete suggestions:
- Fix version capture to include stderr (important for robustness):
- const version = execSync(`${cmd} --version`, { + const version = execSync(`${cmd} --version 2>&1`, { encoding: 'utf8', timeout: 5000, stdio: ['pipe', 'pipe', 'pipe'] }).trim();
- Optionally cache detection to avoid repeated sync subprocesses (performance improvement):
+let cachedPythonInfo: SystemInfo['pythonInfo'] | undefined; + function detectPythonInfo(): SystemInfo['pythonInfo'] { + if (cachedPythonInfo) { + return cachedPythonInfo; + } + // Try python commands in order of preference const pythonCommands = process.platform === 'win32' ? ['python', 'python3', 'py'] : ['python3', 'python']; for (const cmd of pythonCommands) { try { - if (version.includes('Python 3')) { - return { + if (version.includes('Python 3')) { + cachedPythonInfo = { available: true, command: cmd, version: version.replace('Python ', '') - }; + }; + return cachedPythonInfo; } } catch {} } - return { available: false, command: '' }; + cachedPythonInfo = { available: false, command: '' }; + return cachedPythonInfo; }This keeps the public
SystemInfo['pythonInfo']shape unchanged while making detection more reliable across Python versions and more efficient when called repeatedly.Likely an incorrect or invalid review comment.
src/tools/schemas.ts (1)
50-52: LGTM! Schema extensions for Excel support.The new optional fields (
sheet,range,options) appropriately extend the read file schema to support Excel-specific features while maintaining backward compatibility with text file operations.src/tools/edit.ts (3)
1-16: Good documentation of technical debt.The file-level comment clearly explains the architectural inconsistency between Excel and text editing paths, and outlines a future refactoring plan. This is helpful for maintainability.
374-420: Previous review concerns have been addressed.The range-based edit path now:
- Uses
parsed.range !== undefined(Line 374) instead of truthy check, fixing the empty-string edge case.- Calls
validatePath()(Line 377) before filesystem operations, addressing the security bypass concern.- Uses
validatedPathconsistently for handler resolution andeditRangecalls.
422-432: Good defensive validation for text replacement path.Explicitly checking for
undefinedonold_string/new_stringbefore proceeding toperformSearchReplaceprevents runtime errors from undefined values reaching.lengthoperations.src/utils/files/text.ts (2)
27-40: Good identification of technical debt.The TODO comment about centralizing constants with
filesystem.tsis valuable. Consider creating a shared constants module during the planned reorganization to prevent silent drift between duplicated values.
239-283: Solid implementation of reverse file reading.The
readLastNLinesReversemethod correctly:
- Uses file descriptor for efficient random access
- Processes chunks from end to start
- Handles partial lines at chunk boundaries
- Properly closes the file descriptor in the
finallyblocksrc/server.ts (4)
264-271: Clear format handling documentation for read_file.The FORMAT HANDLING section provides good guidance:
- Explicit sheet parameter types (string name or 0-based index)
- Clear range format requirement (FROM:TO like "A1:D100")
- Notes that offset/length serve as fallback row pagination
327-331: Excel write format documentation is helpful.The example
'[["Name","Age"],["Alice",30]]'clearly shows the expected JSON 2D array format. The multi-sheet format{"SheetName": [[...]]}is also documented.
593-607: Excel edit_block documentation is clear and consistent.The range format matches read_file documentation ("SheetName!A1:C10"). The 2D array content example is consistent with write_file format.
694-702: Unable to verify ExcelJS availability—sandbox environment cannot access repository files.The documentation's claim that "ExcelJS available for Excel files (NOT xlsx library)" cannot be verified because the repository cannot be accessed in the current sandbox environment. Manual verification of
package.jsonis required to confirm ExcelJS is actually listed as a dependency and that the guidance accurately reflects the project's capabilities.src/tools/improved-process-tools.ts (3)
9-18: Imports and MCP root resolution look goodUsing
fileURLToPath(import.meta.url)pluspath.dirname/path.resolveto derivemcpRootis a solid way to anchor temp execution to the MCP install directory for ES module imports. No issues from a correctness standpoint here.
171-175: ReusingcommandToRunimproves consistencySwitching the
executeCommandcall to usecommandToRunrather thanparsed.data.commandkeeps a single source of truth for the command string (and aligns with the newnode:localpath above). This looks correct and reduces future refactor friction.
27-91: Based on the Node.js official documentation search results, the original review comment contains a critical inaccuracy. The Node.js documentation forchild_process.spawn()(v22.19.0) clearly states that thetimeoutoption does apply to async spawn and will send the configuredkillSignal(default "SIGTERM") when exceeded.The review claims:
"for asynchronous
child_process.spawnthis option is not honored; it's only meaningful forexec*/spawnSync"However, the official Node.js documentation confirms that:
timeoutoption applies to all spawn calls (including async)- When timeout is exceeded, Node sends the configured killSignal
- For async APIs, AbortSignal via
options.signalis also available as an alternative
The timeout option in spawn() is actually enforced by Node.js; the code does not have the critical bug claimed.
The
spawn()call with{ timeout: timeout_ms }inexecuteNodeCodeis correctly implemented. The timeout parameter is not ignored for async spawn in current Node.js versions. The proposed diff to add manual timer management, while not harmful, is addressing a non-existent issue based on outdated or incorrect assumptions about Node.js behavior.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/tools/improved-process-tools.ts (1)
521-527: Per‑session timeout fornode:localis never used due to destructuring default.In
interactWithProcess,timeout_mshas a default (= 8000), so it’s neverundefined. That means:const effectiveTimeout = timeout_ms ?? session.timeout_ms;will always pick the per‑call value and ignore the
timeout_msstored invirtualNodeSessionsfromstartProcess. It also means the same 8s default is used for both shell sessions and node:local even if the user configured a different default when starting the Node session.You can fix this by removing the default on
timeout_ms, introducing a resolved timeout for the non‑Node path, and using the rawtimeout_mswhen computingeffectiveTimeout, for example:- const { - pid, - input, - timeout_ms = 8000, - wait_for_prompt = true, - verbose_timing = false - } = parsed.data; + const { + pid, + input, + timeout_ms, + wait_for_prompt = true, + verbose_timing = false + } = parsed.data; + + const resolvedTimeoutMs = timeout_ms ?? 8000; @@ - if (virtualNodeSessions.has(pid)) { + if (virtualNodeSessions.has(pid)) { const session = virtualNodeSessions.get(pid)!; @@ - const effectiveTimeout = timeout_ms ?? session.timeout_ms; + const effectiveTimeout = timeout_ms ?? session.timeout_ms; return executeNodeCode(input, effectiveTimeout); } @@ - const pollIntervalMs = 50; // Poll every 50ms for faster response - const maxAttempts = Math.ceil(timeout_ms / pollIntervalMs); + const pollIntervalMs = 50; // Poll every 50ms for faster response + const maxAttempts = Math.ceil(resolvedTimeoutMs / pollIntervalMs); @@ - if (attempts >= maxAttempts) { + if (attempts >= maxAttempts) { @@ - timeout = setTimeout(() => { + timeout = setTimeout(() => { @@ - }, timeout_ms); + }, resolvedTimeoutMs);This preserves the 8s default for “normal” processes, while letting
node:localhonor the session‑level timeout unless the caller explicitly overrides it.Also applies to: 529-541, 599-605, 661-665, 669-669
🧹 Nitpick comments (2)
src/tools/improved-process-tools.ts (2)
763-777: ClarifyforceTerminatesemantics fornode:localsessions.For
node:local,forceTerminateonly removes the virtual session entry and doesn’t actually kill any running child process (eachinteract_with_processrun is one‑shot). That’s functionally fine, but it might be worth making the message explicit (e.g., “Cleared virtual Node.js session metadata; each script run is independent”) so users don’t assume it cancels an in‑flight Node execution.
793-809: AlignlistSessions/readProcessOutputUX fornode:localsessions.
listSessionsnow showsnode:localsessions with negative PIDs, butreadProcessOutputdoesn’t know aboutvirtualNodeSessions, so callingread_process_outputon one of these PIDs will currently respond with “No session found for PID …”. To avoid confusion, consider either:
- teaching
readProcessOutputto special‑casevirtualNodeSessionsand return a tailored message like “node:local sessions don’t keep persistent output; use interact_with_process to run a fresh script”, or- omitting
node:localsessions fromlistSessionsentirely if they’re not meant to be readable.The first option is probably more user‑friendly and keeps the listing complete.
Also applies to: 813-816
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/tools/improved-process-tools.ts(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/tools/improved-process-tools.ts (2)
src/types.ts (1)
ServerResult(73-77)src/terminal-manager.ts (1)
terminalManager(422-422)
🔇 Additional comments (2)
src/tools/improved-process-tools.ts (2)
127-151: Nodenode:localsession wiring andcommandToRunusage look sound.The
node:localspecial case, virtual PID allocation, and reuse ofcommandToRunfor non‑Node commands are coherent and keep existing terminalManager behavior intact while cleanly short‑circuiting into the new path. The guidance text is clear about stateless execution per interact call.Also applies to: 170-175
9-22: I'll verify the review comment by examining the actual code and confirming the issues raised.
<function_calls>
First, let's look at the specific file and the executeNodeCode function
find . -name "improved-process-tools.ts" -type f
</function_calls>Let me search for the file and examine its contents:
<function_calls>
Find the file
fd "improved-process-tools.ts" -type f
</function_calls><function_calls>
List available TypeScript files to locate the target
fd ".ts$" -type f | head -20
</function_calls><function_calls>
Search for the file in the entire repo
find . -type f -name "process-tools" 2>/dev/null
</function_calls><function_calls>
Let me check the overall repo structure
ls -la
</function_calls><function_calls>
Try listing all files in src/tools
ls -la src/tools/ 2>/dev/null || echo "src/tools does not exist"
</function_calls><function_calls>
Check if src directory exists
ls -la src/ 2>/dev/null || echo "src does not exist"
</function_calls>Let me check what branch is currently checked out and verify the repository state:
<function_calls>
Check git status and current branch
git status
git branch
</function_calls><function_calls>
List repo contents to understand structure
git ls-files | head -30
</function_calls><function_calls>
Search for any TypeScript files mentioning "spawn" or "executeNodeCode"
rg "executeNodeCode|spawn.*timeout" -t ts --no-heading -n
</function_calls><function_calls>
Look for the file in git history or search more broadly
git log --oneline --all | head -5
</function_calls>Let me check what files exist in the repository more comprehensively:
<function_calls>
Use git ls-files to get exact file listing
git ls-files | grep -i process
</function_calls><function_calls>
Let me just list what's in the repo root
pwd
find . -maxdepth 3 -type f -name ".ts" -o -name ".js" | head -30
</function_calls><function_calls>
Check git remote and current state
git remote -v
git log --oneline -5
</function_calls><function_calls>
See all files in repo
git ls-files 2>/dev/null | wc -l
git ls-files 2>/dev/null | head -50
</function_calls>Let me take a different approach and search for any Node-related spawn usage:
<function_calls>
Search recursively for spawn usage anywhere
rg "spawn" --type ts --type js -n -C 3 2>/dev/null | head -100
</function_calls><function_calls>
Check if the file path in the review comment actually exists, accounting for various encodings/states
stat "src/tools/improved-process-tools.ts" 2>&1 ||
file "src/tools/improved-process-tools.ts" 2>&1 ||
echo "File does not exist at that path"
</function_calls>Let me check if there's a PR context I should be aware of:
<function_calls>
Try to get PR information if available via gh CLI
gh pr view --json files 2>/dev/null | head -100 || echo "No PR context or gh CLI not configured"
</function_calls><function_calls>
Check git branches
git branch -a
</function_calls>Based on my investigation, I cannot locate the file
src/tools/improved-process-tools.tsin the current repository. The repository appears to not have this file, or it's not on the current branch. However, I can still verify the technical accuracy of the review comment's claims aboutchild_process.spawnand the proposed fix.Let me verify the Node.js spawn API to confirm the review's assertion:
<function_calls>
child_process.spawn timeout option Node.js API
</function_calls>
I need to gather information about the Node.js
spawnAPI signature and timeout behavior to verify the review comment's technical claims:
<function_calls>
Node.js child_process spawn options timeout parameter documentation
</function_calls>Once I receive the search results, I'll be able to verify whether the review comment's technical claims are accurate and provide a definitive assessment.
User description
Sheets handling, write, edit and read features. For complex operations runs nodejs scripts (created separate tool because felt appropriate, maybe makes sense to integrate with start process). Did some organizing and re-structuring that directly impacted me and also added some tech. debt comments. I already felt a bit too invasive.
Add Excel file handling support with file handler architecture
Summary by CodeRabbit
New Features
Enhancements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.
CodeAnt-AI Description
Add Excel-aware file handler workflow and tests
What Changed
Impact
✅ Excel sheets readable with sheet/range pagination✅ Excel ranges editable via edit_block✅ Excel-aware searches surface matches when requested💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.