feat: add shell tool #739
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured. Use the following commands to manage reviews:
📝 Walkthrough

Adds a new Shell tool node: config parsing, global lifecycle state, a tool instance exposing execute with allowlist/env/timeout handling, a subprocess executor with truncation and timeout logic, service registration, documentation, and unit tests.

Changes: Shell tool node
Sequence Diagram

```mermaid
sequenceDiagram
    participant Agent as Agent/Caller
    participant IInst as IInstance
    participant Global as IGlobal
    participant Exec as shell_executor
    participant Sub as Subprocess
    Agent->>IInst: execute({command, working_dir?, env?, timeout?})
    IInst->>IInst: validate input, enforce allowlist
    IInst->>Global: read defaults (cwd, timeout, max_output, env, allow_external)
    IInst->>IInst: resolve cwd and timeout
    IInst->>Exec: build_environment(base_env=None, config_env, call_env, allow_external_env)
    Exec-->>IInst: merged_env
    IInst->>Exec: execute_command(command, cwd, env, timeout, max_output_bytes)
    Exec->>Sub: spawn subprocess (shell=True, cwd, env)
    Sub-->>Exec: stdout, stderr, returncode (or timeout)
    Exec-->>IInst: {stdout, stderr, exit_code, timed_out, truncated}
    IInst-->>Agent: response dict
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
No description provided.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_shell/config_parser.py`:
- Around line 79-83: The code currently coerces env values with
str(row.get('envValue') or ''), which turns valid falsy values like 0 or False
into empty strings; change the assignment to preserve falsy but non-None values
by first checking for None and only defaulting to '' in that case (e.g. value =
row.get('envValue'); env[name] = '' if value is None else str(value)). Update
the block that builds the env dict (the variables name and env) to use this
None-aware conversion so 0/False remain "0"/"False" instead of ''.
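A minimal standalone sketch of the None-aware coercion the comment asks for. The `envName`/`envValue` field names come from the comment itself; `parse_env_vars` here is an illustration, not the node's actual function:

```python
def parse_env_vars(rows):
    """Build an env dict, preserving falsy but non-None values."""
    env = {}
    for row in rows:
        name = row.get('envName')
        if not name:
            continue
        value = row.get('envValue')
        # Default to '' only when the value is truly missing, so falsy
        # values like 0 or False become '0' / 'False' instead of ''.
        env[name] = '' if value is None else str(value)
    return env
```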
In `@nodes/src/nodes/tool_shell/IInstance.py`:
- Around line 126-127: The default node-level working_dir
(self.IGlobal.working_dir) is not being validated like per-call override paths;
update the logic to run the same validation routine used for override values on
self.IGlobal.working_dir before returning it (and raise the same error/exception
used for invalid overrides), and apply the identical change to the other
occurrence around lines 133-135 so both the override branch and the default
branch use the same validation path-checking code (reference: the variables
override and self.IGlobal.working_dir).
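One way the shared validation could look. `resolve_working_dir` and its error message are hypothetical; only the both-branches-use-the-same-validation behavior is taken from the comment:

```python
import os

def resolve_working_dir(override, default):
    # Apply the same validation to both the per-call override and the
    # node-level default, as the comment recommends.
    def _validated(path):
        if not os.path.isdir(path):
            raise ValueError(f'working_dir is not a directory: {path}')
        return path

    if override:
        return _validated(override)
    if default:
        return _validated(default)
    return os.getcwd()
```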
In `@nodes/src/nodes/tool_shell/services.json`:
- Around line 51-52: The description for maxOutputBytes is inaccurate: the
executor enforces a per-stream cap rather than a combined stdout+stderr limit.
Update the description of maxOutputBytes to state that it is the maximum size
applied to each of stdout and stderr individually (e.g., "Maximum size per
stream (stdout and stderr); each is truncated beyond this limit. Default 1
MiB."), or alternatively change the executor logic (where output truncation is
implemented) to combine stdout and stderr before enforcing the 1048576
limit—ensure the symbol maxOutputBytes and the executor's truncation behavior
remain consistent.
In `@nodes/src/nodes/tool_shell/shell_executor.py`:
- Around line 55-63: The subprocess.run call that uses capture_output=True must
be replaced with a streaming implementation that enforces max_output_bytes while
data is produced so memory can't grow unbounded: use subprocess.Popen([...],
shell=True, cwd=cwd, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
and read stdout/stderr in chunks (select/poll or threads) feeding each chunk
into the existing _decode_and_cap routine (or an incremental variant) to
accumulate decoded output up to max_output_bytes, and if the cap is reached
terminate the child (proc.kill()) and stop reading; preserve timeout by waiting
with proc.wait(timeout) and ensure completed.returncode, stdout/stderr, and
timeout behavior match the original contract. Apply the same change to the other
run site referenced (the block around lines 81-90) and keep using the completed
variable and _decode_and_cap to integrate with existing logic.
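A rough sketch of the streaming approach described above, assuming POSIX and a reader-thread design; the function name, chunk size, and return shape are illustrative, not the executor's actual API:

```python
import subprocess
import threading

def run_capped(command, max_output_bytes, timeout):
    # Stream stdout/stderr in chunks and enforce the cap while the child
    # runs, instead of buffering unbounded output in memory.
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    buffers = {'stdout': bytearray(), 'stderr': bytearray()}
    truncated = {'stdout': False, 'stderr': False}

    def _reader(stream, key):
        while True:
            chunk = stream.read(65536)
            if not chunk:
                break
            room = max_output_bytes - len(buffers[key])
            if room > 0:
                buffers[key] += chunk[:room]
            if len(chunk) > room:
                truncated[key] = True
                proc.kill()  # cap reached: stop the child and stop reading
                break

    threads = [threading.Thread(target=_reader, args=(proc.stdout, 'stdout')),
               threading.Thread(target=_reader, args=(proc.stderr, 'stderr'))]
    for t in threads:
        t.start()
    try:
        proc.wait(timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        timed_out = True
    for t in threads:
        t.join()
    return {'stdout': buffers['stdout'].decode(errors='replace'),
            'stderr': buffers['stderr'].decode(errors='replace'),
            'exit_code': proc.returncode, 'timed_out': timed_out,
            'truncated': truncated['stdout'] or truncated['stderr']}
```

The kill-on-cap keeps memory bounded even for a child that streams gigabytes; the original contract (returncode, timeout flag, truncation flag) is preserved.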
In `@nodes/test/tool_shell/test_config_parser.py`:
- Around line 99-113: Add a regression test ensuring falsy envValue values are
coerced to their string representations rather than an empty string: add a new
test (e.g., test_preserves_falsy_values) that calls parse_env_vars with cfg
containing envVars entries like {'envName': 'ZERO', 'envValue': 0} and
{'envName': 'FLAG', 'envValue': False} (also include a JSON-encoded variant if
desired) and assert the result equals {'ZERO': '0', 'FLAG': 'False'} so
parse_env_vars preserves and stringifies falsy values.
In `@nodes/test/tool_shell/test_shell_executor.py`:
- Around line 166-175: The test currently assumes a fixed sentinel 'PATH'
exists; make it robust by selecting the sentinel key from the live environment:
import os in the test and set sentinel = next(iter(os.environ.keys()), 'PATH')
(or similar) before calling build_environment so the assertion checks a key that
actually exists in the test process; keep the reference to the test helper
build_environment and replace the hard-coded 'PATH' sentinel with the dynamic
sentinel variable.
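The dynamic-sentinel selection suggested above, as a tiny standalone sketch (`pick_sentinel` is a hypothetical helper name):

```python
import os

def pick_sentinel(env, fallback='PATH'):
    # Choose a key that actually exists in the given environment mapping,
    # falling back only when the mapping is empty.
    return next(iter(env.keys()), fallback)

# In the test this would be called with the live environment:
sentinel = pick_sentinel(os.environ)
```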
- Around line 17-19: Replace the sys.path.insert workaround with an
importlib-based module load to avoid mutating global import state: locate the
shell_executor.py file (the module that defines build_environment and
execute_command) using Path(__file__).resolve().parent.parent.parent / 'src' /
'nodes' / 'tool_shell', create a spec via importlib.util.spec_from_file_location
and load the module via importlib.util.module_from_spec/exec_module, then access
build_environment and execute_command from that loaded module instead of
importing via sys.path manipulation; update any references to the imported names
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d9bf715a-a5ce-4be7-bfd0-92b1b06e5d34
📒 Files selected for processing (10)
nodes/src/nodes/tool_shell/IGlobal.py
nodes/src/nodes/tool_shell/IInstance.py
nodes/src/nodes/tool_shell/README.md
nodes/src/nodes/tool_shell/__init__.py
nodes/src/nodes/tool_shell/config_parser.py
nodes/src/nodes/tool_shell/services.json
nodes/src/nodes/tool_shell/shell_executor.py
nodes/test/tool_shell/__init__.py
nodes/test/tool_shell/test_config_parser.py
nodes/test/tool_shell/test_shell_executor.py
```python
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent / 'src' / 'nodes' / 'tool_shell'))

from shell_executor import build_environment, execute_command  # noqa: E402
```
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
echo "Top-level module names in nodes/src/nodes/tool_shell:"
fd -a -e py . nodes/src/nodes/tool_shell | xargs -I{} basename {} .py | sort -u
echo
echo "Potential conflicting imports in test files:"
rg -nP --type=py -C2 '^\s*(import|from)\s+(shell_executor|config_parser)\b' -g 'nodes/test/**'
```

Repository: rocketride-org/rocketride-server
Length of output: 1091

🏁 Script executed:

```bash
head -40 nodes/src/nodes/tool_shell/shell_executor.py
```

Repository: rocketride-org/rocketride-server
Length of output: 1616

🏁 Script executed:

```bash
head -40 nodes/src/nodes/tool_shell/__init__.py
```

Repository: rocketride-org/rocketride-server
Length of output: 1443

🏁 Script executed:

```bash
rg -A5 -B5 'without triggering' nodes/test/tool_shell/
```

Repository: rocketride-org/rocketride-server
Length of output: 1820
Consider using importlib.util for cleaner module loading.
Lines 17–19 use sys.path.insert() to avoid triggering the __init__.py import (which requires engine runtime). While this is an intentional workaround with clear documentation, importlib.util.spec_from_file_location() achieves the same isolation more cleanly without modifying the global import path. This improves test determinism and follows Python testing best practices.
Suggested refactor

```diff
+import importlib.util
 import os
 import sys
 from pathlib import Path
-# Add the node source directory to sys.path so we can import the module
-# without triggering the top-level nodes/__init__.py (which requires the
-# engine runtime).
-sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent / 'src' / 'nodes' / 'tool_shell'))
-
-from shell_executor import build_environment, execute_command  # noqa: E402
+_SHELL_EXECUTOR_PATH = (
+    Path(__file__).resolve().parent.parent.parent
+    / 'src'
+    / 'nodes'
+    / 'tool_shell'
+    / 'shell_executor.py'
+)
+_SPEC = importlib.util.spec_from_file_location(
+    'tool_shell_shell_executor',
+    _SHELL_EXECUTOR_PATH,
+)
+assert _SPEC is not None and _SPEC.loader is not None
+_MODULE = importlib.util.module_from_spec(_SPEC)
+_SPEC.loader.exec_module(_MODULE)
+
+build_environment = _MODULE.build_environment
+execute_command = _MODULE.execute_command
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/test/tool_shell/test_shell_executor.py` around lines 17 - 19, Replace
the sys.path.insert workaround with an importlib-based module load to avoid
mutating global import state: locate the shell_executor.py file (the module that
defines build_environment and execute_command) using
Path(__file__).resolve().parent.parent.parent / 'src' / 'nodes' / 'tool_shell',
create a spec via importlib.util.spec_from_file_location and load the module via
importlib.util.module_from_spec/exec_module, then access build_environment and
execute_command from that loaded module instead of importing via sys.path
manipulation; update any references to the imported names accordingly.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_shell/IInstance.py`:
- Around line 122-126: The allowlist check in _validate_command currently uses
regex search (self.IGlobal.command_patterns) which permits allowed fragments
inside longer/chained commands; update _validate_command to require full-string
matches by replacing p.search(command) with p.fullmatch(command) when checking
patterns, keeping the same behavior of raising ValueError('Command is not
permitted by the configured allowlist.') if no patterns fully match the command.
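The search-versus-fullmatch difference the comment relies on, as a standalone sketch (the helper name and sample pattern are illustrative):

```python
import re

def is_allowed(command, patterns):
    # fullmatch anchors at both ends, so an allowed fragment embedded in a
    # longer or chained command no longer passes the check.
    return any(p.fullmatch(command) for p in patterns)
```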
In `@nodes/src/nodes/tool_shell/services.json`:
- Around line 56-60: Change the insecure default for environment injection by
setting "tool_shell.allowExternalEnv" default to false and require an explicit
command allowlist to enable it; update the corresponding allowlist configuration
(the tool_shell allowlist/allowed-commands setting referenced around lines
92-101) so it is empty/absent by default and fail-safe (reject execution if no
allowlist is provided), and update any production profile
templates/documentation/tests to require an explicit non-empty allowlist before
enabling allowExternalEnv.
In `@nodes/src/nodes/tool_shell/shell_executor.py`:
- Around line 58-65: The subprocess is started with shell=True so proc.kill()
only kills the shell, leaving background children (and t_out/t_err reader
threads) alive; modify the process creation in shell_executor.py (the proc
setup) to create a new process group: on POSIX set preexec_fn=os.setsid, and on
Windows add creationflags=subprocess.CREATE_NEW_PROCESS_GROUP; then change the
timeout cleanup logic that currently calls proc.kill() to kill the entire group
(use os.killpg(proc.pid, signal.SIGKILL) on POSIX and
proc.send_signal(signal.CTRL_BREAK_EVENT) or appropriate group termination on
Windows) so background children are terminated and t_out/t_err readers can
finish.
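A POSIX-only sketch of the process-group approach described above (`start_new_session=True` has the same effect as `preexec_fn=os.setsid`; the Windows branch is omitted, and the helper name is illustrative):

```python
import os
import signal
import subprocess

def run_with_group_kill(command, timeout):
    # Run the shell in its own session/process group so a timeout kills
    # background children too, not just the shell itself.
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            start_new_session=True)
    try:
        out, err = proc.communicate(timeout=timeout)
        return out, err, proc.returncode, False
    except subprocess.TimeoutExpired:
        # Kill the whole group; proc.pid is the group leader here.
        os.killpg(proc.pid, signal.SIGKILL)
        out, err = proc.communicate()
        return out, err, proc.returncode, True
```

Because the group is killed rather than just the shell, the pipe writers all go away, the readers see EOF, and `communicate()` returns instead of hanging on an orphaned grandchild.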
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 64bcc2db-012e-4a05-b356-8ee4f8ce36b6
📒 Files selected for processing (9)
nodes/src/nodes/tool_shell/IGlobal.py
nodes/src/nodes/tool_shell/IInstance.py
nodes/src/nodes/tool_shell/__init__.py
nodes/src/nodes/tool_shell/config_parser.py
nodes/src/nodes/tool_shell/services.json
nodes/src/nodes/tool_shell/shell_executor.py
nodes/test/tool_shell/__init__.py
nodes/test/tool_shell/test_config_parser.py
nodes/test/tool_shell/test_shell_executor.py
Rod-Christensen
left a comment
I can definitely see why we would want this, and I can also see how incredibly dangerous it is... I'd like to get your thoughts on how we could lock this down a bit more?
Force-pushed from 1b7eafe to 859e9a2
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/tool_shell/IGlobal.py`:
- Around line 82-85: The current guard only raises if there are compile errors
(invalid_pattern_errors) but misses the case where a configured allowlist
(commandAllowlist) yields no usable patterns because entries were blank; update
the check after parse_command_patterns so that if the config indicates an
allowlist was provided (e.g., cfg.commandAllowlist / presence of any
commandPattern entries) but compiled_patterns is empty, raise the ValueError
instead of allowing an empty list; modify the logic around
parse_command_patterns, invalid_pattern_errors and compiled_patterns in IGlobal
(the block that currently sets self.command_patterns) to treat "configured but
produced no usable patterns" as a failure case so IInstance._validate_command
cannot short-circuit and allow all commands.
In `@nodes/src/nodes/tool_shell/README.md`:
- Line 53: Update the README description for the exit_code handling to broaden
the meaning of 127: change the current line that says "`127` indicates the host
shell could not be launched" to something like "`127` indicates either a command
was not found or the host shell could not be launched, depending on context" so
readers understand both common POSIX uses; target the `exit_code` documentation
entry in nodes/src/nodes/tool_shell/README.md and update the sentence that
references `127`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 11160dbc-2053-4cac-90b3-3efd418affa7
📒 Files selected for processing (10)
nodes/src/nodes/tool_shell/IGlobal.pynodes/src/nodes/tool_shell/IInstance.pynodes/src/nodes/tool_shell/README.mdnodes/src/nodes/tool_shell/__init__.pynodes/src/nodes/tool_shell/config_parser.pynodes/src/nodes/tool_shell/services.jsonnodes/src/nodes/tool_shell/shell_executor.pynodes/test/tool_shell/__init__.pynodes/test/tool_shell/test_config_parser.pynodes/test/tool_shell/test_shell_executor.py
```python
compiled_patterns = parse_command_patterns(cfg, on_invalid=_on_invalid_pattern)
if invalid_pattern_errors and not compiled_patterns:
    raise ValueError(f'commandAllowlist is configured but every pattern failed to compile; refusing to start with a non-functional allowlist (would silently allow all commands). First error: {invalid_pattern_errors[0]}')
self.command_patterns = compiled_patterns
```
Allowlist still fails open when configured with only blank patterns.
The fail-closed guard only triggers when at least one pattern produces a compile error (invalid_pattern_errors). However, parse_command_patterns silently skips blank/whitespace commandPattern entries without calling on_invalid, so a config like:
```json
{"commandAllowlist": [{"commandPattern": ""}, {"commandPattern": " "}]}
```

ends with compiled_patterns == [] and invalid_pattern_errors == []. No ValueError is raised, and IInstance._validate_command then short-circuits on the empty list and admits every command — exactly the silent-allow-all outcome this check is meant to prevent.
Tighten the guard to fire whenever an allowlist was configured but produced no usable patterns.
💡 Proposed fix

```diff
- compiled_patterns = parse_command_patterns(cfg, on_invalid=_on_invalid_pattern)
- if invalid_pattern_errors and not compiled_patterns:
-     raise ValueError(f'commandAllowlist is configured but every pattern failed to compile; refusing to start with a non-functional allowlist (would silently allow all commands). First error: {invalid_pattern_errors[0]}')
- self.command_patterns = compiled_patterns
+ compiled_patterns = parse_command_patterns(cfg, on_invalid=_on_invalid_pattern)
+ raw_allowlist = cfg.get('commandAllowlist')
+ if raw_allowlist and not compiled_patterns:
+     first = invalid_pattern_errors[0] if invalid_pattern_errors else 'all entries were blank or invalid'
+     raise ValueError(
+         'commandAllowlist is configured but no usable patterns were compiled; '
+         f'refusing to start with a non-functional allowlist (would silently allow all commands). Detail: {first}'
+     )
+ self.command_patterns = compiled_patterns
```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nodes/src/nodes/tool_shell/IGlobal.py` around lines 82 - 85, The current
guard only raises if there are compile errors (invalid_pattern_errors) but
misses the case where a configured allowlist (commandAllowlist) yields no usable
patterns because entries were blank; update the check after
parse_command_patterns so that if the config indicates an allowlist was provided
(e.g., cfg.commandAllowlist / presence of any commandPattern entries) but
compiled_patterns is empty, raise the ValueError instead of allowing an empty
list; modify the logic around parse_command_patterns, invalid_pattern_errors and
compiled_patterns in IGlobal (the block that currently sets
self.command_patterns) to treat "configured but produced no usable patterns" as
a failure case so IInstance._validate_command cannot short-circuit and allow all
commands.
`exit_code` is the process return code. `-1` indicates the command was killed due to timeout; `127` indicates the host shell could not be launched.
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verify typical shell semantics for exit code 127 without repository access.
set -euo pipefail
# 1) Command-not-found path (usually returns 127 from /bin/sh)
if command -v sh >/dev/null 2>&1; then
  sh -c 'definitely_not_a_real_command_12345' >/tmp/rr_out 2>/tmp/rr_err || true
  echo "sh command-not-found exit code: $?"
fi
# 2) Launch failure path (simulate missing shell binary)
python - <<'PY'
import subprocess
try:
    subprocess.run(["/definitely/missing/shell", "-c", "echo hi"], check=False)
except FileNotFoundError:
    print("launch failure produces FileNotFoundError before process exit code")
PY
```

Repository: rocketride-org/rocketride-server
Length of output: 177

🏁 Script executed:

```bash
cat -n nodes/src/nodes/tool_shell/README.md | sed -n '45,60p'
```

Repository: rocketride-org/rocketride-server
Length of output: 1166

🏁 Script executed:

```bash
#!/bin/bash
# Corrected verification: capture exit code BEFORE || operator
set -euo pipefail
# 1) Correct way to check command-not-found exit code
sh -c 'definitely_not_a_real_command_12345' 2>/dev/null || exit_code=$?
echo "sh command-not-found exit code: ${exit_code:-0}"
# 2) Verify on bash as well
bash -c 'definitely_not_a_real_command_12345' 2>/dev/null || exit_code=$?
echo "bash command-not-found exit code: ${exit_code:-0}"
```

Repository: rocketride-org/rocketride-server
Length of output: 150
Broaden exit_code=127 explanation to include command-not-found.
Line 53 states that 127 indicates the host shell could not be launched, but in standard POSIX shells, 127 is also (and more commonly) returned when a command is not found. The documentation should cover both cases to prevent troubleshooting confusion. Suggest: "127 indicates either a command was not found or the host shell could not be launched, depending on context."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nodes/src/nodes/tool_shell/README.md` at line 53, Update the README
description for the exit_code handling to broaden the meaning of 127: change the
current line that says "`127` indicates the host shell could not be launched" to
something like "`127` indicates either a command was not found or the host shell
could not be launched, depending on context" so readers understand both common
POSIX uses; target the `exit_code` documentation entry in
nodes/src/nodes/tool_shell/README.md and update the sentence that references
`127`.
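The command-not-found meaning of 127 can be demonstrated from Python with a small helper (the name is illustrative, and the behavior assumes a POSIX shell):

```python
import subprocess

def shell_exit_code(command):
    # Run through the host shell and return only the exit code.
    return subprocess.run(command, shell=True,
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode
```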
Actionable comments posted: 1
♻️ Duplicate comments (1)
nodes/src/nodes/tool_shell/IGlobal.py (1)
82-87: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Allowlist still fails open when configured entries are all blank/whitespace.

The guard only fires when `invalid_pattern_errors` is non-empty. As `parse_command_patterns` silently skips blank/whitespace `commandPattern` entries (it does not call `on_invalid` for them), a config like `{"commandAllowlist": [{"commandPattern": ""}, {"commandPattern": " "}]}` produces `compiled_patterns == []` and `invalid_pattern_errors == []`. No `ValueError` is raised, and `IInstance._validate_command` then short-circuits on the empty list (`if patterns and ...`), admitting every command — the exact silent-allow-all outcome this guard was added to prevent.

Tighten the predicate to "allowlist was configured but nothing usable was compiled", independent of whether errors were captured.

💡 Proposed fix

```diff
-        compiled_patterns = parse_command_patterns(cfg, on_invalid=_on_invalid_pattern)
-        if invalid_pattern_errors and not compiled_patterns:
-            raise ValueError(
-                f'commandAllowlist is configured but every pattern failed to compile; refusing to start with a non-functional allowlist (would silently allow all commands). First error: {invalid_pattern_errors[0]}'
-            )
-        self.command_patterns = compiled_patterns
+        compiled_patterns = parse_command_patterns(cfg, on_invalid=_on_invalid_pattern)
+        raw_allowlist = cfg.get('commandAllowlist')
+        if raw_allowlist and not compiled_patterns:
+            first = invalid_pattern_errors[0] if invalid_pattern_errors else 'all entries were blank or invalid'
+            raise ValueError(
+                'commandAllowlist is configured but no usable patterns were compiled; '
+                f'refusing to start with a non-functional allowlist (would silently allow all commands). Detail: {first}'
+            )
+        self.command_patterns = compiled_patterns
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@nodes/src/nodes/tool_shell/IGlobal.py` around lines 82 - 87, The current guard only raises when invalid_pattern_errors is non-empty, letting a configured-but-empty (blank/whitespace-only) commandAllowlist slip through; change the predicate so we raise whenever the allowlist was provided in the config but no usable patterns were compiled: after calling parse_command_patterns(cfg, ...), check for the presence of the allowlist config (e.g. cfg contains "commandAllowlist" or similar) and if compiled_patterns is empty, raise the ValueError regardless of invalid_pattern_errors; update the block around compiled_patterns / invalid_pattern_errors (the code handling parse_command_patterns and self.command_patterns) to implement this tighter check so blank/whitespace-only entries don't silently fail open.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nodes/src/nodes/tool_shell/IInstance.py`:
- Around line 87-94: The description in the description=lambda of IInstance
currently advertises "max {MAX_TIMEOUT}s" which is incorrect because
_resolve_timeout clamps per-call overrides to self.IGlobal.timeout, not
MAX_TIMEOUT; update the description string to remove the MAX_TIMEOUT
parenthetical and instead state the configured global timeout (e.g., "Timeout:
{self.IGlobal.timeout}s (configured global timeout)") so the text matches the
actual behavior of _resolve_timeout and IGlobal.timeout; alternatively, if you
prefer the other approach, change _resolve_timeout to clamp to MAX_TIMEOUT
instead of self.IGlobal.timeout, but do not leave the mismatch between the
description lambda, _resolve_timeout, IGlobal.timeout, and MAX_TIMEOUT.
---
Duplicate comments:
In `@nodes/src/nodes/tool_shell/IGlobal.py`:
- Around line 82-87: The current guard only raises when invalid_pattern_errors
is non-empty, letting a configured-but-empty (blank/whitespace-only)
commandAllowlist slip through; change the predicate so we raise whenever the
allowlist was provided in the config but no usable patterns were compiled: after
calling parse_command_patterns(cfg, ...), check for the presence of the
allowlist config (e.g. cfg contains "commandAllowlist" or similar) and if
compiled_patterns is empty, raise the ValueError regardless of
invalid_pattern_errors; update the block around compiled_patterns /
invalid_pattern_errors (the code handling parse_command_patterns and
self.command_patterns) to implement this tighter check so blank/whitespace-only
entries don’t silently fail open.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9a07c17e-1a52-4077-931c-483b309f39d1
📒 Files selected for processing (2)
nodes/src/nodes/tool_shell/IGlobal.py
nodes/src/nodes/tool_shell/IInstance.py
```python
description=lambda self: (
    'Execute a shell command on the host and return stdout, stderr, and exit code. '
    'Use for build scripts (npm/pip/make), package management, file operations, process management, '
    'environment inspection, and host-installed git operations. '
    f'Timeout: {self.IGlobal.timeout}s (max {MAX_TIMEOUT}s). '
    f'Default working directory: {self.IGlobal.working_dir or "host process CWD"}. '
    'For portable git operations that do not depend on the host having git installed, prefer the Git node.'
),
```
Tool description overstates the per‑call timeout ceiling.
_resolve_timeout clamps the per‑call override to self.IGlobal.timeout (line 168), not to MAX_TIMEOUT. The description string f'Timeout: {self.IGlobal.timeout}s (max {MAX_TIMEOUT}s). ' advertises MAX_TIMEOUT as the ceiling, which is what an LLM consumer will use when constructing tool calls. Requests with a timeout between IGlobal.timeout and MAX_TIMEOUT will be silently clamped down, contradicting the description.
Either drop the MAX_TIMEOUT mention from the per‑call description (since it’s really the configurable upper bound, not the per‑call ceiling), or change the clamp to MAX_TIMEOUT.
💡 Proposed fix (description-only)

```diff
-        f'Timeout: {self.IGlobal.timeout}s (max {MAX_TIMEOUT}s). '
+        f'Timeout: {self.IGlobal.timeout}s (per-call cap, also the maximum). '
```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nodes/src/nodes/tool_shell/IInstance.py` around lines 87 - 94, The
description in the description=lambda of IInstance currently advertises "max
{MAX_TIMEOUT}s" which is incorrect because _resolve_timeout clamps per-call
overrides to self.IGlobal.timeout, not MAX_TIMEOUT; update the description
string to remove the MAX_TIMEOUT parenthetical and instead state the configured
global timeout (e.g., "Timeout: {self.IGlobal.timeout}s (configured global
timeout)") so the text matches the actual behavior of _resolve_timeout and
IGlobal.timeout; alternatively, if you prefer the other approach, change
_resolve_timeout to clamp to MAX_TIMEOUT instead of self.IGlobal.timeout, but do
not leave the mismatch between the description lambda, _resolve_timeout,
IGlobal.timeout, and MAX_TIMEOUT.
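The clamping behavior the comment describes, as a standalone sketch with illustrative names (the real `_resolve_timeout` lives in IInstance.py):

```python
def resolve_timeout(override, global_timeout):
    # Per-call overrides are clamped to the configured global timeout,
    # which is why the global value, not MAX_TIMEOUT, is the real ceiling.
    if override is None:
        return global_timeout
    return min(max(int(override), 1), global_timeout)
```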
Security Concern

The node is capable of executing arbitrary shell commands on the host and returning their full output. In a cloud

Current Implementation: Why the Allowlist Is Not Enough

The only guard is a regex allowlist (

Required Improvements

1. Static Command-Line Checks
2. Restricted Execution Environment
Agree on containment-over-detection. The allowlist alone was never going to hold, and the sandbox tiers you laid out are the right shape. A few additions and one gap worth flagging before the design locks in.

Gap: the user's working directory. The proposal protects the host but not the contents of

Drop

The executor currently runs

Quick wins to layer with rlimits:
Tier above bwrap. For hostile-tenant cloud, bwrap shares a kernel with the host. Worth adding

Output redaction. Even with network isolation, stdout flows back to the model. If the agent is chained to anything with outbound calls, secrets in stdout become the exfiltration path. Scrub

Audit log to a separate sink. Append-only log of every command, argv, exit code, truncation flag, written somewhere the sandboxed process can't reach. Containment fails silently otherwise.

One pushback on the LLM validator. I'd list it as advisory only, not a gating tier. Adversarial inputs against small validators are well-documented, and once it's in the trust chain it becomes load-bearing. The AST plus denylist plus sandbox should do the actual gating. The LLM check is fine as belt-and-suspenders, not as a gate.
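The rlimit quick wins mentioned above could be layered via `preexec_fn` on POSIX. A minimal sketch; the specific limits are illustrative, not proposed defaults, and the helper names are hypothetical:

```python
import resource
import subprocess

def _limited_preexec():
    # Runs in the child between fork and exec (POSIX only), so the limits
    # apply to the shell and everything it spawns.
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))                      # CPU seconds
    resource.setrlimit(resource.RLIMIT_FSIZE, (10 * 2**20, 10 * 2**20))  # max file size

def run_limited(command):
    return subprocess.run(command, shell=True, preexec_fn=_limited_preexec,
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode
```

rlimits are cheap and stack cleanly under any of the sandbox tiers, since they travel with the process rather than requiring a containment layer of their own.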
Summary
Executes shell commands in the host environment. Use it to run scripts, manage processes, install packages, and interact with the operating system via the command line.
Type
Feature
Testing
Tests added or updated
Tested locally
./builder test passes
Checklist
Linked Issue
Fixes #653
Summary by CodeRabbit
New Features
Documentation
Tests