fix(core): capture PTY errors in result instead of throwing from async callback#3162
fix(core): capture PTY errors in result instead of throwing from async callback#3162henryhwang wants to merge 3 commits intoQwenLM:mainfrom
Conversation
…c callback Fixes silent crash in SSH/remote environments where unexpected PTY errors thrown from an async event callback become uncaught exceptions, causing the process to exit silently. Errors are now properly captured in the ShellExecutionResult. Also expands isExpectedPtyReadExitError to cover EINTR and ENODEV errno codes that occur in SSH environments. Fixes #3161 Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
| const isExpectedPtyReadExitError = (error: unknown): boolean => { | ||
| const code = getErrnoCode(error); | ||
| if (code === 'EIO') { | ||
| if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') { |
There was a problem hiding this comment.
[Critical] message.includes('pty') is overly broad — will silently suppress any error containing the substring "pty", including legitimate errors like "pty allocation failed" that should be surfaced to the user.
| if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') { | |
| (message.includes('read') && message.toLowerCase().includes('pty')) |
— qwen3.6-plus via Qwen Code /review
| if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') { | ||
| return true; | ||
| } | ||
|
|
There was a problem hiding this comment.
[Suggestion] When this error handler runs, it sets exited = true, removes the abort listener, deletes from activePtys, and resolves. If onExit subsequently fires, cleanup code runs redundantly. Consider adding a guard in onExit to skip if error is already set.
| ptyProcess.onExit((exitInfo) => { | |
| if (error) { | |
| // Error handler already resolved, skip | |
| return; | |
| } |
— qwen3.6-plus via Qwen Code /review
Review Findings[Suggestion] packages/core/src/core/coreToolScheduler.ts:1258 Now that [Suggestion] packages/core/src/services/shellExecutionService.ts:819 When the error handler runs, it sets — qwen3.6-plus via Qwen Code /review |
| signal: null, | ||
| error, | ||
| aborted: abortSignal.aborted, | ||
| pid: ptyProcess.pid, |
There was a problem hiding this comment.
[Suggestion] This change starts returning unexpected runtime PTY failures through ShellExecutionResult.error, but downstream consumers still interpret any non-aborted error as a startup failure. For example, shellProcessor will now report Failed to start shell command even when the command already started and produced partial output. Consider either reserving error for spawn failures only, or updating consumers to distinguish startup failures from runtime PTY failures before surfacing the message.
— gpt-5.4 via Qwen Code /review
Review Fixes1. onExit guard in shellExecutionService.ts — Added a guard in 2. .catch() on openIdeDiffIfEnabled in coreToolScheduler.ts — The fire-and-forget call to |
- Add onExit guard to skip redundant cleanup when error handler already resolved - Add .catch() to openIdeDiffIfEnabled fire-and-forget to prevent unhandled rejections Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
00c1162 to
4b62a1b
Compare
- Tighten message.includes('pty') filter to require both 'read' and 'pty'
in the message, avoiding false positives like 'pty allocation failed'
- Reserve error field for spawn failures only; include runtime PTY error
message in output field so downstream consumers (ShellProcessor) don't
misinterpret it as a startup failure
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Additional Review Fixes3. Tightened 4. Reserved |
|
Thanks for looking into this, @henryhwang! This touches a core area (PTY error handling), so we'd like to be extra careful before merging. Could you add some reproducible test steps or before/after screenshots showing the issue and the fix in action? That would help us review with more confidence. Appreciate it! |
@tanzhenxin after careful investigation, I figured out that my original issue #3161 is from out-of-memory (OOM) problem which happened on my 2G RAM VPS running Debian without swap. although the root cause of #3161 is not PTY problem, the pr #3162 itself is still valid according to the analysis from qwen code. The OOM problem is real, on the other hand. the current version of qwen code somehow generate a memory spike with bash tool calls which make itself quit by oom-kill on small RAM (2GB, not really that small in fact) VPS. anyway, because of the root cause is oom and turning on the swap will resolve it, I am going to close this pr. |
Fix silent crash when PTY error occurs in SSH environments
Summary
Fixes a bug where the program exits silently when a shell command confirmation dialog is displayed in SSH/remote environments. The root cause was throwing an error from an async event callback, which becomes an uncaught exception.
Changes
1. Fix error handling in PTY error handler (
shellExecutionService.ts:809-835)Before:
After:
2. Fix
const errortolet error(shellExecutionService.ts:625)The
errorvariable needs to be mutable to store the error from the event handler.Before:
After:
3. Expand
isExpectedPtyReadExitErrorfilter (shellExecutionService.ts:199-207)The filter was too narrow, only catching
EIOerrors. PTY race conditions in SSH environments can produce other error codes.Before:
After:
4. Update unit test (
shellExecutionService.test.ts)Updated the test
'should throw unexpected PTY errors from error event'to'should capture unexpected PTY errors in result instead of throwing'tomatch the new correct behavior:
expect(...).toThrow()→expect(...).not.toThrow()expect(result.error).toBe(unexpectedError)andexpect(result.exitCode).toBe(1)'unexpected pty error'to'connection broken'to avoid matching the newmessage.includes('pty')filterTesting
shellExecutionService.test.tsRelated Issue
Fixes #3161