Skip to content

fix(tools): harden bash tool argument handling#269

Open
SuperMarioYL wants to merge 3 commits into
huggingface:mainfrom
SuperMarioYL:fix/coerce-bash-timeout-to-int
Open

fix(tools): harden bash tool argument handling#269
SuperMarioYL wants to merge 3 commits into
huggingface:mainfrom
SuperMarioYL:fix/coerce-bash-timeout-to-int

Conversation

@SuperMarioYL
Copy link
Copy Markdown

What

_bash_handler (the bash tool) had two argument-handling defects. This PR hardens both and adds regression tests.

1. A string timeout crashed the agent turn

timeout is computed before the handler's try block:

timeout = min(args.get("timeout") or DEFAULT_TIMEOUT, MAX_TIMEOUT)

The bash tool schema declares timeout as an integer, but models occasionally emit schema-typed integers as JSON strings ("120"). min("120", 36000) then raises an uncaught TypeError that aborts the whole turn instead of returning a recoverable tool error:

TypeError: '<' not supported between instances of 'int' and 'str'

The value is now coerced with int(), falling back to DEFAULT_TIMEOUT on a non-numeric input. A non-positive value (e.g. "-5" or "0") — which would otherwise make subprocess.run time out instantly with a misleading message — also falls back to the default.

2. An invalid work_dir surfaced a raw OSError

A non-existent work_dir reached subprocess.run(cwd=...) and surfaced a raw FileNotFoundError string. It is now validated up front with a clear, actionable message so the model can correct the path.

Tests

Adds tests/unit/test_local_tools.py (10 tests) covering string / missing / integer / oversized / oversized-string / non-numeric / non-positive timeout, and missing / file / valid work_dir. The string-timeout test is red on main and green here.

uv run pytest tests/unit/test_local_tools.py -> 10 passed. ruff check / ruff format --check clean.

Notes

Behavior for valid integer input is unchanged. This is the bash-tool sibling of the read-tool coercion in #110 — it touches a different function (_bash_handler, not _read_handler) and does not overlap that PR.

The bash tool computes `timeout = min(args.get("timeout") or
DEFAULT_TIMEOUT, MAX_TIMEOUT)` outside the handler's try block. The tool
schema declares `timeout` as an integer, but model providers occasionally
serialize schema-typed integers as JSON strings (e.g. "120"). In that case
`min("120", 36000)` raises an uncaught TypeError that aborts the whole
agent turn instead of returning a recoverable tool error.

Coerce the value with int() and fall back to DEFAULT_TIMEOUT on a
non-numeric input, then clamp with MAX_TIMEOUT. Behavior for valid integer
input is unchanged.

Add tests/unit/test_local_tools.py with regression coverage for string,
missing, integer, oversized, oversized-string, and non-numeric timeouts.
A non-existent `work_dir` reached `subprocess.run(cwd=...)` and surfaced a
raw `FileNotFoundError` string. Check the directory up front and return an
actionable tool error so the model can correct the path.

Extends tests/unit/test_local_tools.py with work_dir coverage.
A negative or zero `timeout` (e.g. "-5" or "0") survived coercion and was
passed straight to `subprocess.run`, which then timed out instantly with a
misleading message. Fall back to DEFAULT_TIMEOUT for any non-positive value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant