Skip to content

fix(core): capture PTY errors in result instead of throwing from async callback#3162

Closed
henryhwang wants to merge 3 commits intoQwenLM:mainfrom
henryhwang:fix/pty-error-handler-crash
Closed

fix(core): capture PTY errors in result instead of throwing from async callback#3162
henryhwang wants to merge 3 commits intoQwenLM:mainfrom
henryhwang:fix/pty-error-handler-crash

Conversation

@henryhwang
Copy link
Copy Markdown

@henryhwang henryhwang commented Apr 12, 2026

Fix silent crash when PTY error occurs in SSH environments

Summary

Fixes a bug where the program exits silently when a shell command confirmation dialog is displayed in SSH/remote environments. The root cause was throwing an error from an async event callback, which becomes an uncaught exception.

Changes

1. Fix error handling in PTY error handler (shellExecutionService.ts:809-835)

Before:

ptyProcess.on('error', (err: NodeJS.ErrnoException) => {
  if (isExpectedPtyReadExitError(err)) {
    return;
  }
  throw err;  // Uncaught exception in async callback
});

After:

ptyProcess.on('error', (err: NodeJS.ErrnoException) => {
  if (isExpectedPtyReadExitError(err)) {
    return;
  }

  // Store the error and trigger exit handling.
  // Throwing from an async event callback causes uncaught exception.
  error = err;
  if (!exited) {
    exited = true;
    abortSignal.removeEventListener('abort', abortHandler);
    this.activePtys.delete(ptyProcess.pid);
    ptyProcess.kill();
    resolve({
      rawOutput: Buffer.concat(outputChunks),
      output: '',
      exitCode: 1,
      signal: null,
      error,
      aborted: abortSignal.aborted,
      pid: ptyProcess.pid,
      executionMethod: (ptyInfo?.name as 'node-pty' | 'lydell-node-pty') ?? 'node-pty',
    });
  }
});

2. Fix const error to let error (shellExecutionService.ts:625)

The error variable needs to be mutable to store the error from the event handler.

Before:

const error: Error | null = null;

After:

let error: Error | null = null;

3. Expand isExpectedPtyReadExitError filter (shellExecutionService.ts:199-207)

The filter was too narrow, only catching EIO errors. PTY race conditions in SSH environments can produce other error codes.

Before:

const isExpectedPtyReadExitError = (error: unknown): boolean => {
  const code = getErrnoCode(error);
  if (code === 'EIO') {
    return true;
  }

  const message = getErrorMessage(error);
  return message.includes('read EIO');
};

After:

const isExpectedPtyReadExitError = (error: unknown): boolean => {
  const code = getErrnoCode(error);
  if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') {
    return true;
  }

  const message = getErrorMessage(error);
  return (
    message.includes('read EIO') ||
    message.includes('read EINTR') ||
    message.includes('pty')
  );
};

4. Update unit test (shellExecutionService.test.ts)

Updated the test 'should throw unexpected PTY errors from error event' to
'should capture unexpected PTY errors in result instead of throwing' to
match the new correct behavior:

  • Changed expect(...).toThrow()expect(...).not.toThrow()
  • Added assertions: expect(result.error).toBe(unexpectedError) and expect(result.exitCode).toBe(1)
  • Changed test error message from 'unexpected pty error' to 'connection broken' to avoid matching the new message.includes('pty') filter

Testing

  • ✅ Unit tests: all 47 tests pass in shellExecutionService.test.ts
  • ✅ Lint: passes clean
  • ✅ Typecheck: passes clean
  • Test in local environment - shell commands should work as before
  • Test in SSH environment - confirmation dialogs should remain visible
  • Test in SSH + tmux environment - confirmation dialogs should remain visible
  • Test file operations in SSH - should continue to work correctly
  • Verify error messages are properly surfaced when PTY errors occur

Related Issue

Fixes #3161

…c callback

Fixes silent crash in SSH/remote environments where unexpected PTY errors
thrown from an async event callback become uncaught exceptions, causing the
process to exit silently. Errors are now properly captured in the
ShellExecutionResult.

Also expands isExpectedPtyReadExitError to cover EINTR and ENODEV errno codes
that occur in SSH environments.

Fixes #3161

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
const isExpectedPtyReadExitError = (error: unknown): boolean => {
const code = getErrnoCode(error);
if (code === 'EIO') {
if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Critical] message.includes('pty') is overly broad — will silently suppress any error containing the substring "pty", including legitimate errors like "pty allocation failed" that should be surfaced to the user.

Suggested change
if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') {
(message.includes('read') && message.toLowerCase().includes('pty'))

— qwen3.6-plus via Qwen Code /review

if (code === 'EIO' || code === 'EINTR' || code === 'ENODEV') {
return true;
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] When this error handler runs, it sets exited = true, removes the abort listener, deletes from activePtys, and resolves. If onExit subsequently fires, cleanup code runs redundantly. Consider adding a guard in onExit to skip if error is already set.

Suggested change
ptyProcess.onExit((exitInfo) => {
if (error) {
// Error handler already resolved, skip
return;
}

— qwen3.6-plus via Qwen Code /review

@wenshao
Copy link
Copy Markdown
Collaborator

wenshao commented Apr 12, 2026

Review Findings

[Suggestion] packages/core/src/core/coreToolScheduler.ts:1258

Now that handleConfirmationResponse is awaited here, errors from confirmation handling become unhandled promise rejections at the call site (line ~1044) where openIdeDiffIfEnabled is called fire-and-forget without .catch().

    this.openIdeDiffIfEnabled(confirmationDetails, call.callId, signal).catch(
      (err) => debugLogger.error(`IDE confirmation handling failed: ${err}`),
    );

[Suggestion] packages/core/src/services/shellExecutionService.ts:819

When the error handler runs, it sets exited = true, removes the abort listener, deletes from activePtys, and resolves. If onExit subsequently fires, cleanup code runs redundantly. Consider adding a guard in onExit to skip if error is already set.

        ptyProcess.onExit((exitInfo) => {
          if (error) {
            // Error handler already resolved, skip
            return;
          }

— qwen3.6-plus via Qwen Code /review

@tanzhenxin tanzhenxin added the type/bug Something isn't working as expected label Apr 12, 2026
signal: null,
error,
aborted: abortSignal.aborted,
pid: ptyProcess.pid,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] This change starts returning unexpected runtime PTY failures through ShellExecutionResult.error, but downstream consumers still interpret any non-aborted error as a startup failure. For example, shellProcessor will now report Failed to start shell command even when the command already started and produced partial output. Consider either reserving error for spawn failures only, or updating consumers to distinguish startup failures from runtime PTY failures before surfacing the message.

— gpt-5.4 via Qwen Code /review

@henryhwang
Copy link
Copy Markdown
Author

Review Fixes

1. onExit guard in shellExecutionService.ts — Added a guard in ptyProcess.onExit to skip redundant cleanup when the error handler has already resolved the promise. Prevents double-cleanup and potential race condition.

2. .catch() on openIdeDiffIfEnabled in coreToolScheduler.ts — The fire-and-forget call to openIdeDiffIfEnabled was missing error handling. Added .catch() to log IDE confirmation errors instead of producing an unhandled promise rejection.

- Add onExit guard to skip redundant cleanup when error handler already resolved
- Add .catch() to openIdeDiffIfEnabled fire-and-forget to prevent unhandled rejections

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
@henryhwang henryhwang force-pushed the fix/pty-error-handler-crash branch from 00c1162 to 4b62a1b Compare April 13, 2026 02:53
- Tighten message.includes('pty') filter to require both 'read' and 'pty'
  in the message, avoiding false positives like 'pty allocation failed'
- Reserve error field for spawn failures only; include runtime PTY error
  message in output field so downstream consumers (ShellProcessor) don't
  misinterpret it as a startup failure

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
@henryhwang
Copy link
Copy Markdown
Author

Additional Review Fixes

3. Tightened isExpectedPtyReadExitError filter — Changed message.includes('pty') to require both 'read' AND 'pty' in the message. The previous filter was overly broad and would silently suppress legitimate errors like "pty allocation failed".

4. Reserved error field for spawn failures — Runtime PTY errors during execution are now reported in the output field instead of the error field. This prevents downstream consumers (like ShellProcessor) from misinterpreting a runtime PTY error as a startup failure with the message "Failed to start shell command".

@tanzhenxin
Copy link
Copy Markdown
Collaborator

Thanks for looking into this, @henryhwang! This touches a core area (PTY error handling), so we'd like to be extra careful before merging.

Could you add some reproducible test steps or before/after screenshots showing the issue and the fix in action? That would help us review with more confidence.

Appreciate it!

@henryhwang
Copy link
Copy Markdown
Author

Thanks for looking into this, @henryhwang! This touches a core area (PTY error handling), so we'd like to be extra careful before merging.

Could you add some reproducible test steps or before/after screenshots showing the issue and the fix in action? That would help us review with more confidence.

Appreciate it!

@tanzhenxin after careful investigation, I figured out that my original issue #3161 is from out-of-memory (OOM) problem which happened on my 2G RAM VPS running Debian without swap. although the root cause of #3161 is not PTY problem, the pr #3162 itself is still valid according to the analysis from qwen code.

The OOM problem is real, on the other hand. the current version of qwen code somehow generate a memory spike with bash tool calls which make itself quit by oom-kill on small RAM (2GB, not really that small in fact) VPS.

anyway, because of the root cause is oom and turning on the swap will resolve it, I am going to close this pr.

@henryhwang henryhwang closed this Apr 15, 2026
@henryhwang henryhwang deleted the fix/pty-error-handler-crash branch April 15, 2026 06:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/bug Something isn't working as expected

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Program exits silently when shell command confirmation dialog is displayed in SSH environments

3 participants