feat(nodes): add self-contained Git tool node (#654)#731
feat(nodes): add self-contained Git tool node (#654)#731asclearuc merged 15 commits intorocketride-org:developfrom
Conversation
|
No description provided. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new pygit2-backed Git tool node: GitRepo library and GitError; IGlobal lifecycle that can auto-clone or open repos; IInstance dispatcher exposing many git.* tools to agents; service metadata, tests, and an example pipeline; supports token/SSH auth and safe-mode guards. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Agent as LangChain Agent
participant IInstance as Git Tool (IInstance)
participant Dispatcher as _dispatch/_call
participant IGlobal as IGlobal
participant GitRepo as GitRepo (pygit2)
Agent->>IInstance: invoke(param {tool_name, input})
IInstance->>Dispatcher: _dispatch(tool_name, args)
Dispatcher->>IGlobal: ensure repo available
alt repo available
Dispatcher->>GitRepo: call specific git method
GitRepo-->>Dispatcher: result or GitError
else repo missing
Dispatcher-->>IInstance: JSON error
end
Dispatcher-->>IInstance: JSON response
IInstance-->>Agent: JSON string
sequenceDiagram
autonumber
participant Pipeline as Pipeline
participant IGlobal as Git Tool (IGlobal)
participant TempFS as Temp Dir
participant GitRepo as GitRepo (pygit2)
Pipeline->>IGlobal: beginGlobal()
IGlobal->>IGlobal: load config (repoPath, authType, safeMode)
alt repoPath is URL
IGlobal->>TempFS: create temp dir
IGlobal->>GitRepo: clone(url, temp_path) with auth callbacks
GitRepo-->>IGlobal: opened repo
else repoPath is local
IGlobal->>GitRepo: open(local_path)
GitRepo-->>IGlobal: opened repo
else empty
IGlobal->>IGlobal: defer init to runtime
end
Pipeline->>IGlobal: endGlobal()
IGlobal->>TempFS: remove temp dir if created
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested Reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_git/git_repo.py`:
- Line 479: The loop uses an unused enumerate index; update the loop in
git_repo.py that currently reads "for i, hunk in enumerate(blame_obj):" to drop
the unused index by changing it to "for hunk in blame_obj:" (or, if you prefer
to keep enumerate for clarity, use an underscore for the index: "for _, hunk in
enumerate(blame_obj):") so the unused variable `i` is removed and lint warnings
are resolved.
- Around line 940-943: The bare except around "text = obj.data.decode('utf-8',
errors='replace')" silently swallows unexpected errors; change it to catch only
expected exceptions (e.g., except (AttributeError, TypeError): continue) and/or
add a short comment explaining why other exceptions are intentionally ignored so
future readers know this only protects against missing/invalid obj.data types
(keep the continue behavior). Refer to the decode call on obj.data in the same
loop to find the location to update.
- Around line 471-476: The code always reads file content from repo.head
(repo.head.peel(pygit2.Tree)[path]) so when a non-HEAD ref is supplied the blame
hunks can mismatch; change the tree lookup to peel the provided ref (e.g., use
repo.revparse_single(ref).peel(pygit2.Tree) or repo.get(ref).peel(...) depending
on how refs are represented) instead of repo.head.peel, then get the blob from
that tree entry and decode it as before; update the KeyError message/context to
indicate the lookup used the provided ref when raising GitError for path not
found.
- Around line 665-675: The branch_delete method currently calls b.delete()
unconditionally and can remove unmerged branches even when force=False; update
branch_delete to first resolve the branch tip (repo.branches[name]) and
determine whether that tip is an ancestor/merged into the current HEAD (e.g.,
via merge-base/ancestor check) and if not merged and force is False raise a
GitError instead of deleting; keep the existing _safe_guard('force-deleting a
branch') path to allow deletion when force=True and only call b.delete() after
the merged check or when force is asserted.
In `@nodes/src/nodes/tool_git/IGlobal.py`:
- Around line 71-76: The except block that reads "except (GitError, Exception)"
should be simplified: either catch all exceptions with "except Exception as exc"
or split into two handlers to treat GitError specially (e.g., "except GitError
as exc:" followed by "except Exception as exc:") while preserving the cleanup
(shutil.rmtree(tmp, ignore_errors=True)) and re-raising the ValueError for
git.clone failures (using the same message and "from exc"); update the
handler(s) around the git.clone call in IGlobal (references: git.clone,
repo_path, tmp, self._tmp_dir) accordingly.
In `@nodes/src/nodes/tool_git/IInstance.py`:
- Line 68: The description for the 'max_count' property in IInstance.py contains
an en-dash character in "(1–200" which should be replaced with the ASCII
hyphen-minus to avoid encoding issues; open the 'max_count' entry and change the
en-dash (U+2013) to a regular hyphen-minus (U+002D) so the description reads
"(1-200, default 20)".
In `@nodes/src/nodes/tool_git/README.md`:
- Around line 38-87: Add blank lines before and after the markdown tables under
each heading (e.g., the "### Repository", "### Status & Info", "### Diff &
Inspection", "### Staging & Commits", "### Branches", "### Remote", and "###
Search" headings) so the tables that start with rows like `| Tool | Description
|` and entries such as `git.clone`, `git.status`, `git.diff`, `git.stage`,
`git.branch_list`, `git.fetch`, `git.grep` are preceded by a blank line after
the heading and followed by a blank line after the table; update README.md
accordingly to satisfy the markdown linter.
In `@nodes/test/tool_git/test_tools.py`:
- Around line 449-450: Move the top-level imports "import os" and "import
pytest" from their current location near the bottom into the module's main
import block at the top of the file with the other imports; remove the "# noqa:
E402" markers because they will no longer be needed, and ensure import ordering
follows the project's import grouping (stdlib first, third-party next) so
functions and tests in this module can use os and pytest without conditional or
late import workarounds.
- Around line 94-101: The _invoke function currently uses a mutable default for
args (args: Dict[str, Any] = {}); change the signature to accept
Optional[Dict[str, Any]] (e.g., args: Optional[Dict[str, Any]] = None) and
inside _invoke set args = {} if args is None before assigning param.input and
calling inst.invoke(param); keep references to the same symbols (function
_invoke, parameter args, param.input, and inst.invoke) so the change is
localized and preserves behavior while avoiding the mutable default pitfall.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: dd7f3d85-b9bf-4c77-8d50-d973a5d730c1
📒 Files selected for processing (10)
nodes/src/nodes/tool_git/IGlobal.pynodes/src/nodes/tool_git/IInstance.pynodes/src/nodes/tool_git/README.mdnodes/src/nodes/tool_git/__init__.pynodes/src/nodes/tool_git/git_repo.pynodes/src/nodes/tool_git/requirements.txtnodes/src/nodes/tool_git/services.jsonnodes/test/tool_git/__init__.pynodes/test/tool_git/test_tools.pypipelines/git_agent_example.pipe
9189643 to
c84ae23
Compare
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_git/git_repo.py`:
- Around line 804-832: The push method checks force via _safe_guard but never
applies it to the refspec; update the refspec construction in push (currently
building refspec = f'refs/heads/{target}:refs/heads/{target}') to prefix the
source ref with '+' when force is True (e.g., '+refs/heads/...:refs/heads/...')
so rem.push([...], **kwargs) performs a force push when requested, keeping the
existing _safe_guard and exception handling around rem.push.
- Around line 913-919: The initial-commit branch currently only compares
top-level entry names (using commit.peel and _path_matches(entry.name, path)),
which fails for nested paths like "src/utils/foo.py"; update the logic in the
initial-commit handling (the commit.peel(pygit2.Tree) block) to test whether the
full requested path exists in the committed tree rather than just matching root
entry names — e.g., attempt a tree lookup for the full path (using tree[path] or
by iteratively traversing tree entries by path segments) and return True if that
lookup succeeds (keep existing GitError handling to return False on exceptions),
otherwise fall back to any existing root-level _path_matches checks.
- Around line 109-130: The temp SSH key file created in GitRepoSSHKeyProvider
(self._tmp via NamedTemporaryFile) is written but never closed, causing unlink
to fail on Windows and leaving the private key on disk; after writing and
flushing in the key creation path (before returning the pygit2.Keypair) call
self._tmp.close() to close the file handle, and update the close() method to
first attempt to close the file handle if it exists (e.g., call
self._tmp.close() inside the try) before calling os.unlink(self._tmp.name), then
set self._tmp = None; ensure delete=False remains so the explicit unlink still
happens.
In `@nodes/src/nodes/tool_git/IGlobal.py`:
- Around line 125-130: The _parse_bool function currently treats any
unrecognized string as False which can disable safeMode unintentionally; update
_parse_bool so that for str inputs it returns True only for an explicit true set
(e.g., {'1','true','yes','on'}), returns False only for an explicit false set
(e.g., {'0','false','no','off'}), and returns the provided default for any other
string (thus preserving safeMode when the string is unknown); retain existing
handling for bool, None (return default), and non-str non-bool types
(bool(raw)).
In `@nodes/src/nodes/tool_git/IInstance.py`:
- Around line 170-178: The git.stash node's input_schema and dispatch currently
send a message parameter but GitRepo.stash expects an index parameter, causing
runtime failures; update the git.stash input_schema (and any dispatch/argument
construction for the 'git.stash' action) to match GitRepo.stash's signature by
replacing/adding the 'index' property (and removing/ignoring 'message') or
otherwise map incoming 'message' to the correct index argument, and ensure the
code path that calls GitRepo.stash (referencing GitRepo.stash and the
'git.stash' node) constructs and passes the index parameter correctly so both
schema and call site are aligned.
- Around line 329-335: The code currently swallows JSON parse errors by
converting invalid JSON strings to an empty dict, which can cause unintended
default operations; update the logic in the method that handles the args
variable so that if isinstance(args, str) and json.loads(args) raises an
exception you propagate a proper error instead of setting args = {} — for
example, catch the exception and raise a ValueError (or return an error result)
with a clear message about invalid JSON input so _dispatch is never called with
coerced empty args; adjust the code paths that assign param.output and return
param to only run after successful parsing.
In `@nodes/test/tool_git/test_tools.py`:
- Around line 90-94: The mock for GitRepo in _make_instance is too permissive;
replace inst.IGlobal.repo = MagicMock(spec=GitRepo) with a stricter autospec
such as using unittest.mock.create_autospec(GitRepo, spec_set=True) so calls to
inst.IGlobal.repo will enforce the GitRepo call signatures and kwarg names;
update the assignment in _make_instance to use create_autospec for the repo
mock.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8b023f08-0c94-49d6-b40f-bc8c951a0056
📒 Files selected for processing (10)
nodes/src/nodes/tool_git/IGlobal.pynodes/src/nodes/tool_git/IInstance.pynodes/src/nodes/tool_git/README.mdnodes/src/nodes/tool_git/__init__.pynodes/src/nodes/tool_git/git_repo.pynodes/src/nodes/tool_git/requirements.txtnodes/src/nodes/tool_git/services.jsonnodes/test/tool_git/__init__.pynodes/test/tool_git/test_tools.pypipelines/git_agent_example.pipe
c84ae23 to
7d49bb4
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_git/git_repo.py`:
- Line 353: The computed 'clean' status only checks staged and unstaged flags
and misses untracked files; update the calculation where 'clean' is assigned
(the line containing "'clean': not staged and not unstaged") to also consider
untracked file state (e.g., include "and not untracked" or the equivalent
variable/flag used in this module) so that repositories with untracked files are
not reported as clean; locate the assignment in git_repo.py (the dict or
status-building code that defines 'clean', 'staged', 'unstaged') and incorporate
the untracked indicator into the boolean expression.
- Around line 290-291: In clone(), avoid calling Path(path).iterdir() without
confirming the path is a directory: first check Path(path).exists(); if it
exists and Path(path).is_file() raise a GitError that the destination exists as
a file; otherwise if it exists and Path(path).is_dir() use
any(Path(path).iterdir()) (or check emptiness) and raise the existing GitError
for non-empty directories; this prevents NotADirectoryError and preserves the
intended controlled GitError behavior.
- Around line 929-930: _commit_touches_path only inspects d.delta.new_file.path
so commits that delete the target path are missed; update the check after
repo.diff(parent.tree, commit.tree) to consider both d.delta.new_file.path and
d.delta.old_file.path (safely handling missing/None values) when calling
_path_matches so deletion commits are included; reference the diff generation
(repo.diff(parent.tree, commit.tree)), the loop/any expression that calls
_path_matches, and ensure both file path candidates are tested before returning
the any() result.
In `@nodes/src/nodes/tool_git/IInstance.py`:
- Around line 369-370: The code is currently using truthy coercion on the args
dict (variable a) when calling git.log and other git methods, which
misinterprets strings and single-value paths; replace those inline coercions
with strict typed extractions: implement and use helpers like parse_bool(a,
'force'), parse_bool(a, 'staged') that accept booleans or explicit
"true"/"false" strings (reject/raise on other types), parse_list(a, 'paths')
that returns a list (wrap single string into [string] and reject non-iterable
types), and parse_int_range(a, 'max_count', 1, 200) for git.log and any other
calls; update the calls that reference a.get('max_count'), a.get('paths'),
a.get('force'), etc., to use these helpers and raise/return validation errors
for malformed inputs so downstream code receives well-shaped values.
In `@nodes/src/nodes/tool_git/services.json`:
- Line 96: Update the "safeMode" description in services.json to accurately
reflect the implemented behavior: state that safeMode blocks destructive
operations including force-push and force branch deletion, and that branch
deletion is allowed only when the branch is fully merged into HEAD (otherwise
deletion requires force=true); reference the "safeMode" key so reviewers can
locate and replace the current misleading sentence "Normal branch deletion is
always allowed" with the corrected merge-condition wording and keep the
recommendation for agent use.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6aebee9b-a3ae-42af-987d-19053ee4751b
📒 Files selected for processing (10)
nodes/src/nodes/tool_git/IGlobal.pynodes/src/nodes/tool_git/IInstance.pynodes/src/nodes/tool_git/README.mdnodes/src/nodes/tool_git/__init__.pynodes/src/nodes/tool_git/git_repo.pynodes/src/nodes/tool_git/requirements.txtnodes/src/nodes/tool_git/services.jsonnodes/test/tool_git/__init__.pynodes/test/tool_git/test_tools.pypipelines/git_agent_example.pipe
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_git/git_repo.py`:
- Around line 550-568: The stage method currently passes verbatim paths to
repo.index.add/remove allowing "../" to escape the repository; update stage to
resolve each provided path against Path(repo.workdir) (e.g. full =
(Path(repo.workdir) / p).resolve()) and verify that the resolved path is inside
the repo root (compare .resolve() with Path(repo.workdir).resolve() or use
relative_to) before calling repo.index.add or repo.index.remove; if a path is
outside the repo reject it (raise ValueError or skip and record an error) and do
not mutate the index, ensuring all index operations in stage only operate on
normalized, in-repo paths.
- Around line 467-475: The patch returned by git_diff API is being post-filtered
with _filter_diff_by_path using a substring match which incorrectly matches
paths like "src/abc.py" when filtering "src/a", and the returned
files_changed/insertions/deletions still reflect the full diff (d.stats) rather
than the filtered patch; update _filter_diff_by_path to parse each "diff --git
a/… b/…" header and only include file entries where the path equals the target
or is a path-prefix that respects path boundaries (e.g., target == path or
path.startswith(target + "/")), then after building the filtered patch
recalculate files_changed by counting diff entries and recompute
insertions/deletions by counting added ('+' not in header/+++ lines) and removed
('-' not in header/--- lines) lines within hunks of the filtered patch, and
return those computed numbers in the dict returned by the function that
currently returns {'patch': patch, 'files_changed': d.stats.files_changed,
'insertions': d.stats.insertions, 'deletions': d.stats.deletions}.
- Around line 533-544: The write_file method currently only prevents escaping
the worktree but allows writes into repository internals like .git; update
write_file (and use repo = self._require_repo(), workdir = Path(repo.workdir))
to detect if the resolved target path is the .git directory or any descendant
(e.g., path equal to ".git" or having a ".git" component / being under workdir /
".git") and raise GitError in that case instead of writing; keep the existing
relative_to(workdir) check and return structure, but add the .git guard before
creating parents or writing the file.
In `@nodes/src/nodes/tool_git/IGlobal.py`:
- Around line 72-76: The exception message currently includes the full repo_path
from git.clone which may contain credentials; update the except block around
git.clone to avoid leaking userinfo by either omitting repo_path from the raised
ValueError or by stripping the userinfo (username:password@) before formatting
the message (use URL parsing to remove netloc credentials). Ensure the cleanup
(shutil.rmtree(tmp, ...)) still runs, then raise ValueError with a redacted or
sanitized repo identifier instead of the original repo_path in this try/except
handling.
In `@nodes/src/nodes/tool_git/IInstance.py`:
- Around line 408-414: In the git.diff dispatch branch (the block handling name
== 'git.diff' in IInstance), add an upfront validation that if a contains
'ref_b' but does not contain 'ref_a' you reject the call (e.g., raise ValueError
or return a clear validation error) instead of falling through to GitRepo.diff;
perform this check before calling git.diff() and keep existing argument
extraction (ref_a, ref_b, path, staged) intact so normal calls continue to work.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 144a5755-9cae-4b48-88b9-2c81245fbfe9
📒 Files selected for processing (4)
nodes/src/nodes/tool_git/IGlobal.pynodes/src/nodes/tool_git/IInstance.pynodes/src/nodes/tool_git/git_repo.pynodes/test/tool_git/test_tools.py
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/tool_git/IGlobal.py`:
- Around line 77-79: The current redaction builds redacted using parsed.hostname
or parsed.netloc which can leak userinfo/port or fail for SSH-style URLs;
instead, when handling repo_path, extract parsed = urlparse(repo_path) but
construct a safe_netloc: if parsed.hostname is present use parsed.hostname and
include parsed.port (formatted as hostname:port) otherwise remove any leading
userinfo by splitting parsed.netloc on "@" and taking the last segment; then
scrub credentials embedded in parsed.path by removing any user:pass@ patterns
before calling urlunparse to build redacted; update references to parsed,
repo_path, and redacted accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 19973935-26e4-4ea2-90cc-1bb30d0cd188
📒 Files selected for processing (4)
nodes/src/nodes/tool_git/IGlobal.pynodes/src/nodes/tool_git/IInstance.pynodes/src/nodes/tool_git/git_repo.pynodes/test/tool_git/test_tools.py
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
00197dc to
3509336
Compare
|
|
|
@stepmikhaylov — Implemented readOnlyMode as requested. ✅ Overview Added a new readOnlyMode configuration flag (default true) that complements safeMode by blocking all write operations at the tool dispatch layer. When enabled, agents can only invoke inspection tools (status, log, show, diff, blame, file_at, branch_list, grep, ls_files); all mutation attempts return a JSON error without reaching the repository layer. Implementation details Commit: — feat(tool_git): add readOnlyMode to block all write operations at dispatch Key files:
Design:
Blocked when readOnlyMode=true: Testing & validation ✅ 63 unit tests pass (all existing tests green + 12 new readOnlyMode cases) Recommended usage For inspection-only workflows (code analysis, RAG, auditing), use the default readOnlyMode: true. For write-capable pipelines, set readOnlyMode: false and tune safeMode to block destructive ops (force-push, force delete) as needed. |
Implements tool_git using pygit2 (libgit2) so no host git binary is required. All git operations run inside the process via libgit2, making the node safe to use in sandboxed or minimal agent environments. Capabilities: - Clone, init - Status, log, show, diff, blame, file_at, ls_files, grep - Stage, commit, stash - Branch create/checkout/merge/delete - Fetch, pull, push (token + SSH auth) - write_file with path-traversal guard Auth integrates with RocketRide's existing secrets system via secure fields (token, sshKey, sshPassphrase) in services.json. Safe mode (default on) blocks force-push and force branch deletion. Also includes: - Full unit test suite (mocked pygit2, no binary required) - Integration test class (activated via GIT_TEST_REPO_PATH env var) - git_agent_example.pipe: prompt-driven example pipeline using the node Closes rocketride-org#654 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Include untracked files in clean status check - Guard clone() against file-path destinations (fixes NotADirectoryError) - Check both new_file and old_file paths in _commit_touches_path so deletions are included in log --path filtering - Narrow bare except to (AttributeError, TypeError) in _grep_tree (already done in prior round, re-verified) - Correct safeMode description in services.json to reflect merge-check behaviour Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add docstrings across git_repo.py, IInstance.py, IGlobal.py, and test_tools.py to bring docstring coverage to 100% - Add _bool_arg and _int_arg_in_range helpers to IInstance for typed input validation; replace all bool() / int() coercions and the paths list check - Fix git.stash schema and dispatch to include index parameter for pop/drop Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add path resolution and bounds checks to stage() to prevent "../" escapes - Add .git directory protection to write_file() - Fix _filter_diff_by_path to use proper path matching and recalculate stats from the filtered patch (not the full diff stats) - Redact credentials from clone() error messages to prevent leaking auth info - Add validation in git.diff dispatch to reject ref_b without ref_a Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…PI consistency - Implemented robust path traversal guards using .is_relative_to() - Added protection against writes and indexing inside the .git directory - Hardened URL redaction for both HTTPS and git@ SSH shorthand - Implemented _scrub_exc to redact credentials from libgit2 error messages - Fixed _filter_diff_by_path to handle filenames with spaces correctly via regex - Updated git.diff schema and validation to ensure ref_b requires ref_a - Cleaned up imports and added clarifying comments
…onflict - IInstance._dispatch now catches pygit2.GitError so unwrapped libgit2 errors (e.g. malformed refs in revparse_single) return a JSON error instead of faulting out of tool.invoke. Credentials in the message are scrubbed via the existing _scrub_exc helper. - merge(): on conflict, call repo.state_cleanup() and hard-reset to HEAD before raising. The previous behavior left MERGE_HEAD set, silently corrupting the next git.commit (which would create a non-merge commit with conflict markers). - Test stub: pygit2.GitError is now a distinct Exception subclass; using bare Exception made the new except arm swallow KeyError in tests. - Add regression test for pygit2.GitError -> JSON error path with credential scrubbing.
…overage
- _resolve_ref now also catches pygit2.GitError (raised by revparse_single
on malformed refs) and wraps it as a clean GitError.
- Token/SSH credential callbacks check allowed_types and raise a
GitError with actionable guidance when the server doesn't accept the
configured auth method, instead of letting libgit2 emit an opaque
error.
- Add public GitRepo.open(path) so IGlobal no longer pokes _repo /
_repo_path private attributes; path existence/dir validation moved
with it.
- Rename _scrub_exc -> scrub_credentials and export it; IGlobal reuses
it instead of duplicating the regex inline.
- Apply the standard MIT license header to IGlobal.py, IInstance.py,
and git_repo.py to match the convention used by sibling tool nodes.
- Tests:
* TestPathTraversalGuards exercises write_file and stage against ../
traversal, absolute paths outside the workdir, and .git/ paths,
plus a happy-path write.
* TestFilterDiffByPath covers prefix-collision, exact match,
filenames with spaces, empty patches, and the +++/--- header
exclusion in insertion/deletion counters.
…patch Adds a readOnlyMode flag (default true) that is strictly stronger than safeMode: it blocks every mutating tool the agent can invoke, leaving only the inspection tools available. Recommended for analysis/RAG workflows where the agent must not be able to alter the repository. The guard is enforced at the IInstance dispatch layer rather than inside GitRepo, for two reasons: - It is an agent-policy concern, not a libgit2-wrapper concern, and keeps GitRepo's API clean (matches the encapsulation direction of 4466ca2). - IGlobal.beginGlobal auto-clones a remote-URL repoPath at pipeline start by calling GitRepo.clone() directly. Putting the guard inside clone() would have broken that setup path; gating only at agent dispatch leaves the engine free to do its own initialisation. Behaviour: - Blocked when readOnlyMode=true: git.clone, git.init, git.write_file, git.stage, git.commit, git.stash (push/pop/drop), git.branch_create, git.checkout, git.branch_delete, git.merge, git.fetch, git.pull, git.push. - Always allowed: git.status, git.log, git.show, git.diff, git.blame, git.file_at, git.branch_list, git.grep, git.ls_files, and git.stash with op="list". - The guard fails closed: getattr(git, 'read_only_mode', True) — if a future refactor drops the attribute, writes are blocked rather than silently allowed. Files: - nodes/src/nodes/tool_git/IInstance.py * New _WRITE_TOOLS frozenset (always-mutating tools) with a comment flagging the git.stash op-sensitive case as the lone exception. * New _is_write_tool() helper handling the stash op-sensitivity in one place — extend this, not the frozenset, for future op-sensitive tools. * _dispatch() now checks read_only_mode after the no-repo check and before calling _call(); blocked calls return a JSON error and never reach GitRepo. - nodes/src/nodes/tool_git/git_repo.py * GitRepo.__init__ takes a new keyword-only read_only_mode parameter (default True) and stores it on self.read_only_mode. * Class-level annotations safe_mode: bool = True / read_only_mode: bool = True added so the attributes are visible to introspection (unittest.mock.create_autospec) and to match the IGlobal style of declaring instance attributes at class level. - nodes/src/nodes/tool_git/IGlobal.py * beginGlobal() reads readOnlyMode from node config (default True) and forwards it into the GitRepo constructor. - nodes/src/nodes/tool_git/services.json * New git.readOnlyMode boolean field with title, description, default true, and yes/no enum. * preconfig.profiles.default includes readOnlyMode: true. * shape[].properties includes git.readOnlyMode after git.safeMode. - nodes/src/nodes/tool_git/README.md * Configuration table row for readOnlyMode. * New "Read-only mode" section listing blocked vs always-allowed tools and noting the interaction with safeMode. - nodes/test/tool_git/test_tools.py * New TestReadOnlyMode class — 12 cases covering: every blocked tool (commit, write_file, push, branch_delete, checkout, clone), two allowed reads (status, grep), the git.stash op-sensitive boundary (list allowed; push and default-op blocked), and the disable path (writes succeed when read_only_mode=False). * Each blocked test asserts both the error message contains "read-only mode" AND that the corresponding GitRepo method was never called — guarantees the guard short-circuits before _call(). * _make_instance() now sets read_only_mode=False on the autospec'd mock, with a docstring explaining why; existing per-tool dispatch tests are otherwise unchanged. Verified: 63 unit tests pass, 5 integration tests skip as expected (GIT_TEST_REPO_PATH unset). Ruff lint and format both clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…authType - Use the public pygit2.UserPass / pygit2.Keypair spelling consistently in credential callback type hints (was a mix of pygit2.* and pygit2.credentials.*). - git.grep grows a max_results parameter (1-10000, default 1000); the internal _grep_tree walker stops once the cap is reached so pathological queries against large repos can't OOM the agent. - README: add a "Security note: write scope" subsection making explicit that safe mode does not restrict file writes — anything outside .git/ is fair game for git.write_file. - Replace shutil.rmtree(..., ignore_errors=True) with a small _rmtree helper that uses an onexc/onerror callback to clear the read-only bit and retry. libgit2 leaves pack files in .git/objects/pack/ read-only on Windows, so the previous cleanup silently left orphan rocketride_git_* directories in %TEMP%. - Centralise the valid authType set as _VALID_AUTH_TYPES and check it in both validateConfig (warning at config time) and beginGlobal (runtime warning + fall-back to "none") so a typo like "tokn" surfaces immediately instead of degrading to a confusing 401 from the remote. - Tests: add coverage for grep max_results forwarding and out-of-range validation.
Production failure: an agent passed include_remote: true to git.branch_list
(the schema declares the parameter as remote). The dispatcher silently
dropped the unknown key, defaulted remote to false, and returned only
local branches. The agent then misread "no remote-tracking branches"
as "the tool doesn't support remotes" and gave up.
Fix:
- _strip_envelope() drops harness-supplied keys (input, repo_path,
security_context) that aren't part of any tool's schema, and merges
a nested input dict into the top level so {"input": {"x": 1}} becomes
{"x": 1} (with top-level keys winning on conflict).
- _validate_args_against_schema() raises ValueError listing the allowed
parameters when an unknown key remains. The agent now sees a
self-correcting error message instead of a silently-empty result.
- _dispatch wires both into the existing try/except so the validation
error is serialised as {"error": "..."} on the standard arm.
Tests: TestDispatchSchemaValidation reproduces the exact production
payload, plus envelope stripping, nested-input merge, top-level
override, and zero-property-tool rejection.
Replace the custom invoke()/_dispatch/_call chain with a single
@_tool decorator that combines @tool_function (framework discovery)
with the policy and validation logic that previously lived in
_dispatch.
Each git operation is now a method on IInstance decorated with @_tool.
The decorator runs, in order: envelope strip, repo-initialised check,
read-only guard (with op-sensitive write detection for stash via the
is_write= callable form), strict schema validation, then the body —
catching GitError / ValueError / KeyError / pygit2.GitError uniformly
and returning {'error': '...'} dicts rather than raising.
Tool names now match Python method names (no git. prefix). git.clone
becomes clone, git.branch_list becomes branch_list, etc. — required
because @tool_function discovery uses the method's __name__ as the
tool name. The agent learns the names from tool.query descriptors at
each session start, so this is observable to the LLM but doesn't break
any pipeline that doesn't hardcode tool names.
Removed:
- _TOOLS list (~310 lines of schema entries)
- _TOOL_MAP, _WRITE_TOOLS frozenset, _is_write_tool static method
- custom invoke(), _dispatch, _call methods
The framework's IInstanceBase.invoke now handles all dispatch by
walking the class for __tool_meta__-stamped methods.
Other:
- Module-level _strip_envelope, _validate_args, _bool_arg,
_int_arg_in_range helpers replace the per-instance versions.
- README, services.json description, and example pipeline updated
to reference the unprefixed tool names.
- Test stub adds a real _stub_tool_function that mirrors the
framework decorator's behaviour (stamps __tool_meta__, returns fn);
using bare MagicMock left every method as a non-callable mock.
- Two legacy tests dropped (test_unknown_tool_returns_error,
test_json_string_input_is_parsed) — both drove the removed custom
invoke() and have no equivalent under framework dispatch.
- TestToolQuery now verifies framework-discoverability (every method
has __tool_meta__, schema/description shape) instead of driving the
old tool.query op.
69 unit tests pass, 5 integration tests skip as expected. Smoke tests
confirm the production scenario (the include_remote bug) still
returns a self-correcting error, read-only mode still blocks writes,
and stash op=list is still allowed under read-only.
…ptional_int helpers Three new tool-input helpers in rocketlib.filters, completing the set that tool nodes need at dispatch entry: - validate_tool_input_schema(input_schema, args, *, tool_name=''): rejects any args key not declared in input_schema['properties'] with a clean ValueError listing the allowed parameters. Closes the include_remote- class bug (silent drop of hallucinated param names) for any tool node that opts in. Extracted from tool_git's local _validate_args. - optional_bool(args, key, *, default=None, tool_name=''): strict bool typing when key is present (no truthy coercion of 1/0/"true"), default returned untouched when absent or None. Mirrors optional_str. - optional_int(args, key, *, default=None, lo=None, hi=None, tool_name=''): same absent/None handling as optional_str, delegates to require_int for type and bounds checking when present (so the rules stay in sync between required and optional variants). Default is NOT range-checked — an out-of-range default is an author-side bug, not an agent-side bug. All three exported from rocketlib's top-level namespace. Tests: 46 new cases across TestValidateToolInputSchema (15), TestOptionalBool (12), and TestOptionalInt (19). Each covers happy path, defaults, type rejections, range bounds (where applicable), tool_name prefix on/off, and no-mutation guarantees on args/schema. tool_git migrated to use the shared helpers: - _validate_args removed (~15 lines), call site uses validate_tool_input_schema directly. - 7 args.setdefault(key, False) + require_bool(args, key) pairs collapsed to single optional_bool(..., default=False) calls in diff / branch_list / branch_delete / push / grep / ls_files. - 2 args.setdefault + require_int pairs collapsed to optional_int calls in log / grep. - require_bool and require_int dropped from tool_git's rocketlib imports — neither is used directly by tool_git anymore. Net effect: tool_git's IInstance.py shrinks by ~50 lines; every optional-with-default arg now reads as a single expression. Other tool nodes (tool_github, tool_firecrawl, tool_exa_search, tool_pipe, tool_filesystem) can adopt the same helpers when refactored.
Thank you @mithileshgau for this valuable contribution! I picked it up to help push it forward — refactored it to align with the project's Thank you again!!! |

Summary
Type
Feature
Testing
./builder testpassesChecklist
Linked Issue
Fixes #654
Summary by CodeRabbit
New Features
Documentation
Tests
Chores