test(e2e): migrate Telegram injection to Vitest#5576

Merged
jyaunches merged 6 commits into main from e2e-phase6-telegram-injection
Jun 22, 2026

Conversation

@cv

@cv cv commented Jun 22, 2026

Collaborator

Summary

Migrates the Telegram injection E2E contract from the legacy bash entry point into the live Vitest E2E system. The replacement keeps the real install/OpenShell sandbox boundary for shell metacharacter payloads, process-table leak checks, and sandbox name validation.

Related Issue

Refs #5098

Changes

  • Add test/e2e-scenario/live/telegram-injection.test.ts as Vitest coverage for test/e2e/test-telegram-injection.sh.
  • Add test/e2e-scenario/live/phase6-messaging-helpers.ts for shared Phase 6 install, cleanup, env, registry, and sandbox helpers.
  • Wire a dispatchable telegram-injection-vitest job into .github/workflows/e2e-vitest-scenarios.yaml.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • PR description includes the DCO sign-off declaration and every commit appears as Verified in GitHub
  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Tests

    • Added a new live end-to-end security scenario that treats shell metacharacters as data to validate command-injection resistance.
    • Confirms safe handling outcomes (“SAFE”), verifies no secret leakage via parameter expansion, checks sandbox name rejection, and performs benign passthrough coverage.
  • CI/CD

    • Added a dedicated live Vitest job for the scenario and included it in pull request status reporting.
    • Improves reliability with rate-limit-aware sandbox setup, isolated Docker auth handling, and artifact uploads.
    • Added CI workflow validation to enforce correct job configuration and secret safety.

@cv cv self-assigned this Jun 22, 2026
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 23118ac8-32a7-4ea1-bdef-2896a29612e2

📥 Commits

Reviewing files that changed from the base of the PR and between d5d1d2f and c493191.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/telegram-injection.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/e2e-vitest-scenarios.yaml

📝 Walkthrough

Adds a live Vitest e2e scenario (telegram-injection.test.ts) that validates shell injection safety in the OpenShell/OpenClaw sandbox via base64-encoded stdin payloads delivered over both direct exec and SSH pipelines. A shared helper library (phase6-messaging-helpers.ts) provides sandbox lifecycle, command execution, and environment assembly. The new CI job telegram-injection-vitest is wired into .github/workflows/e2e-vitest-scenarios.yaml with Docker auth, CLI build, Vitest run, and artifact upload, and is added to the report-to-pr fan-in. A corresponding workflow boundary validator (validateTelegramInjectionVitestJob) enforces job configuration safety.
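
The base64 stdin transport described in the walkthrough can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `base64` and `shellQuote` share names with helpers listed for phase6-messaging-helpers.ts, but their bodies here are guesses, and `buildStdinCommand` with its `cat` consumer is a hypothetical stand-in for `openshellStdinCommand`.

```typescript
import { Buffer } from "node:buffer";

// Assumed implementation: encode a UTF-8 string as base64 so the payload
// contains only [A-Za-z0-9+/=] while it travels on a shell command line.
function base64(value: string): string {
  return Buffer.from(value, "utf8").toString("base64");
}

// Assumed implementation: POSIX single-quote escaping. Text inside single
// quotes is literal; an embedded ' is closed, escaped, and reopened.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical helper: build a pipeline that decodes the payload and feeds
// it to a consumer on stdin, so the payload is never parsed as shell code.
function buildStdinCommand(payload: string, consumer: string): string {
  return `printf '%s' ${shellQuote(base64(payload))} | base64 -d | ${consumer}`;
}

// Illustrative payload mixing command substitution, backticks, and quotes.
const payload = "hello $(touch /tmp/injected) `id` 'quoted'";
const command = buildStdinCommand(payload, "cat");
```

Because the command line carries only the base64 alphabet, the metacharacters reach the consumer as data and are never seen by the shell that parses the pipeline.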

Changes

Telegram Injection Live E2E Scenario

Layer: Phase6 helper configuration and utilities
File(s): test/e2e-scenario/live/phase6-messaging-helpers.ts
Summary: Exports constants (REPO_ROOT, CLI, INSTALL_TIMEOUT_MS, COMMAND_TIMEOUT_MS), AgentKind type, and utility functions (stripAnsi, resultText, shellQuote, base64) for environment configuration and output processing.

Layer: Phase6 environment and sandbox lifecycle
File(s): test/e2e-scenario/live/phase6-messaging-helpers.ts
Summary: Implements phase6Env for NemoClaw environment construction with sandbox validation; exports redaction utilities (redactionValues, bestEffort, expectExitZero); implements sandbox cleanup (precleanSandbox, cleanupSandbox), installation with rate-limit handling (installSandbox, installSandboxOrSkipOnRateLimit), and readiness polling (expectSandboxReady).

Layer: Phase6 command execution and diagnostics
File(s): test/e2e-scenario/live/phase6-messaging-helpers.ts
Summary: Exports sandboxSh for trusted shell script execution with artifact naming and redaction; exports dockerInfo helper to check Docker availability with fixed 30s timeout.

Layer: Telegram injection payload delivery and secret assertions
File(s): test/e2e-scenario/live/telegram-injection.test.ts
Summary: Constructs openshellStdinCommand and openshellSshStdinCommand helpers that base64-encode payloads for remote execution via exec and SSH pipelines; wraps them via sendPayloadViaSandboxStdin and sendPayloadViaOpenShellSshStdin host runners; implements parameter-expansion assertions and process-table secret-exposure checks for both execution paths.

Layer: Telegram injection test implementation
File(s): test/e2e-scenario/live/telegram-injection.test.ts
Summary: Test scaffold: loads API key, derives env/redactions, writes scenario.json, registers cleanup, checks Docker, installs sandbox with rate-limit skipping, waits for readiness. Runs three metacharacter injection probes via both exec and SSH paths asserting marker files absent. Validates invalid sandbox name rejection. Asserts benign message passthrough. Performs best-effort post-assertion status check.

Layer: CI workflow job and boundary validation
File(s): .github/workflows/e2e-vitest-scenarios.yaml, tools/e2e-scenarios/workflow-boundary.mts
Summary: Adds telegram-injection-vitest job with conditional execution, Docker auth (anonymous fallback), Node setup, CLI build, Vitest run with NVIDIA_INFERENCE_API_KEY secret, artifact upload, and Docker cleanup. Wires into report-to-pr needs list. Introduces validateTelegramInjectionVitestJob boundary validator and wires it into the overall workflow boundary validation.

Sequence Diagram(s)

sequenceDiagram
    participant Test as telegram-injection.test
    participant Host as HostCliClient
    participant Sandbox as OpenShell Sandbox
    
    Test->>Host: precleanSandbox
    Test->>Host: installSandbox (or skip on rate-limit)
    Host->>Host: expectSandboxReady (poll)
    Test->>Host: dockerInfo (assert available)
    
    loop metacharacter payloads (${}, ``, quote)
        Test->>Host: sendPayloadViaSandboxStdin(payload)
        Host->>Host: openshellStdinCommand(base64(payload))
        Host->>Sandbox: execShell with stdin pipeline
        Sandbox->>Sandbox: base64 decode + execute
        Sandbox-->>Test: assert marker file absent (SAFE)
    end
    
    loop SSH variant for same payloads
        Test->>Host: sendPayloadViaOpenShellSshStdin(payload)
        Host->>Host: openshellSshStdinCommand(base64(payload))
        Host->>Host: ssh via sandbox ssh-config
        Sandbox->>Sandbox: base64 decode + execute
        Sandbox-->>Test: assert SSH marker file absent (SAFE)
    end
    
    Test->>Host: assertParameterPayloadStaysLiteral (exec path)
    Sandbox-->>Test: assert ${NVIDIA_INFERENCE_API_KEY} not expanded
    
    Test->>Host: assertSshParameterPayloadStaysLiteral (SSH path)
    Sandbox-->>Test: assert ${NVIDIA_INFERENCE_API_KEY} not expanded
    
    Test->>Host: assertHostProcessTableDoesNotExposeSecret
    Host-->>Test: assert key prefix not in ps aux
    
    Test->>Sandbox: assertSandboxProcessTableDoesNotExposeSecret
    Sandbox-->>Test: assert key prefix not in ps aux
    
    Test->>Sandbox: validateName (invalid names loop)
    Sandbox-->>Test: assert all rejected
    
    Test->>Host: sendPayloadViaSandboxStdin (benign + special chars)
    Sandbox-->>Test: assert passthrough success
    
    Test->>Host: CLI status (best-effort)
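
The marker-file probes in the diagram can be illustrated with a short sketch. The exact payload strings used by the PR are not shown in this thread, so `buildProbePayloads`, `verdict`, and the three payload shapes below are hypothetical; they only show the idea that each payload would create a marker file if it were ever evaluated as shell.

```typescript
// Hypothetical probe builder: each payload would create `markerPath` if the
// receiving side evaluated it as shell instead of treating it as data.
function buildProbePayloads(markerPath: string): Record<string, string> {
  return {
    commandSubstitution: `$(touch ${markerPath})`,
    backticks: `\`touch ${markerPath}\``,
    quoteBreak: `'; touch ${markerPath}; echo '`,
  };
}

// Hypothetical verdict helper: a probe is SAFE only when the marker file
// was not created inside the sandbox after the payload was delivered.
function verdict(markerExists: boolean): "SAFE" | "INJECTED" {
  return markerExists ? "INJECTED" : "SAFE";
}

const probes = buildProbePayloads("/tmp/injection-marker");
```

In a live run, the `markerExists` flag would come from a separate trusted check inside the sandbox (for example a file-existence test), not from the output of the payload itself.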

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5346: Both PRs wire new free-standing live Vitest jobs into .github/workflows/e2e-vitest-scenarios.yaml and add them to the report-to-pr dependency/needs so their scenario results are reported in the same PR comment table.
  • NVIDIA/NemoClaw#5347: Both PRs update the same e2e-vitest-scenarios.yaml workflow by adding a new free-standing Vitest job and wiring it into the report-to-pr job needs so the PR results table includes the scenario outcome.
  • NVIDIA/NemoClaw#5349: Both PRs add new free-standing Vitest E2E jobs to the same .github/workflows/e2e-vitest-scenarios.yaml and wire them into the PR result aggregation (via needs), though they run different scenario tests.

Suggested labels

area: e2e, area: ci, chore, v0.0.66

Poem

🐇 A sandbox awaits, shell metacharacters abound,
I encode my payloads in base64, safely bound.
No marker file appears—the test asserts SAFE!
The API key hides; no process table reveals its face.
Hop, hop—the CI job wires and the PR report's found! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main change: migrating the Telegram injection test from bash/shell to Vitest, which is the primary objective across all modified files.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-code-quality

github-code-quality Bot commented Jun 22, 2026

Contributor

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File c493191 +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 46%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File c493191 +/-
src/lib/state/o...oard-session.ts 91%
src/lib/inference/local.ts 76%
src/lib/sandbox/config.ts 72%
src/lib/actions...dbox/rebuild.ts 67%
src/lib/onboard/preflight.ts 64%
src/lib/actions...licy-channel.ts 56%
src/lib/state/sandbox.ts 55%
src/lib/policy/index.ts 49%
src/lib/onboard...er-gpu-patch.ts 44%
src/lib/onboard.ts 18%

Updated June 22, 2026 14:23 UTC

@github-actions

github-actions Bot commented Jun 22, 2026

Contributor

E2E Advisor Recommendation

Required E2E: telegram-injection-vitest
Optional E2E: messaging-providers-vitest, channels-add-remove-vitest

Dispatch hint: telegram-injection-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • telegram-injection-vitest (medium): This PR introduces the telegram-injection live E2E job and its test. Run it as merge-blocking to prove the new workflow job is dispatchable and that the security-boundary assertions pass against a real OpenShell/OpenClaw sandbox.

Optional E2E

  • messaging-providers-vitest (high): Adjacent confidence for live messaging/provider setup and NVIDIA endpoint/rate-limit handling that the new phase6 messaging helper imports and mirrors, but the PR does not change production messaging code.
  • channels-add-remove-vitest (high): Useful adjacent check because the new job is inserted near existing channel/messaging workflow jobs and follows similar OpenShell install, Docker auth, artifact upload, and sandbox lifecycle patterns.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: telegram-injection-vitest

@github-actions

github-actions Bot commented Jun 22, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: telegram-injection-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=telegram-injection-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • telegram-injection-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/telegram-injection.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=telegram-injection-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/phase6-messaging-helpers.ts
  • test/e2e-scenario/live/telegram-injection.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

@github-actions

github-actions Bot commented Jun 22, 2026

Contributor

PR Review Advisor — No blocking findings

Merge posture: No blocking advisor findings
Primary next action: Add or justify PRA-T1 and any related test follow-ups.
Open items: 0 required · 0 warnings · 1 suggestion · 4 test follow-ups
Since last review: 2 prior items resolved · 0 still apply · 1 new item found

Action checklist

  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
  • PRA-T4 Add or justify test follow-up: Acceptance clause
  • PRA-1 In-scope improvement: Consider narrowing the single-use Phase 6 helper module until it has a second consumer in test/e2e-scenario/live/phase6-messaging-helpers.ts:19

Findings index

ID | Severity | Category | Location | Required action
PRA-1 | Improvement | architecture | test/e2e-scenario/live/phase6-messaging-helpers.ts:19 | Prefer shrinking this in the current PR by moving Telegram-only helpers into `telegram-injection.test.ts`, or make the functions non-exported/narrowly named until another Phase 6 migration reuses them. Preserve the security-relevant boundaries: explicit redaction values, sandbox-name validation, shell quoting/base64 payload transport, and real OpenShell/SSH execution.
Review findings by urgency: 0 required fixes, 0 items to resolve/justify, 1 in-scope improvement

⚠️ Resolve or justify before merge

Investigate these in the current review; either fix them, explain why they are not applicable, or document the accepted risk.

  • None.

💡 In-scope improvements

These are lower-risk, not throwaway. Prefer fixing them in this PR when they are local to changed code; defer only with rationale or a linked follow-up.

PRA-1 Improvement — Consider narrowing the single-use Phase 6 helper module until it has a second consumer

  • Location: test/e2e-scenario/live/phase6-messaging-helpers.ts:19
  • Category: architecture
  • Problem: The PR adds a 201-line exported `phase6-messaging-helpers.ts` module with generalized install, cleanup, env, Docker, shell quoting, and sandbox helpers, but the current tree only imports it from `telegram-injection.test.ts`. This is smaller than a new runner/framework and does not weaken the real shell/OpenShell boundary, but the exported shared shape is ahead of demonstrated reuse.
  • Impact: Keeping broad exported helpers before a second consumer exists can make future E2E migrations couple to accidental APIs, increasing review surface in the already high-churn Vitest E2E workflow area.
  • Suggested action: Prefer shrinking this in the current PR by moving Telegram-only helpers into `telegram-injection.test.ts`, or make the functions non-exported/narrowly named until another Phase 6 migration reuses them. Preserve the security-relevant boundaries: explicit redaction values, sandbox-name validation, shell quoting/base64 payload transport, and real OpenShell/SSH execution.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Read `test/e2e-scenario/live/phase6-messaging-helpers.ts` and grep for `phase6-messaging-helpers`; the only current import is in `test/e2e-scenario/live/telegram-injection.test.ts`.
  • Missing regression test: No additional behavior test is required if this is only a mechanical shrink; the existing live test `Telegram bridge-style message handling treats shell metacharacters as data` should continue covering install, OpenShell exec/SSH payload delivery, validateName rejection, and redaction-sensitive probes.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: `phase6-messaging-helpers.ts` exports `CLI`, env builders, cleanup/install helpers, sandbox shell wrappers, and Docker helpers; `grep` found only `live/telegram-injection.test.ts` importing that module.
Simplification opportunities: 1 possible cut

These are safe simplification checks only. Do not remove validation, security controls, data-loss prevention, or required tests.

  • PRA-1 yagni (test/e2e-scenario/live/phase6-messaging-helpers.ts:19): The exported, Phase-6-wide helper surface that currently has a single consumer.
    • Replacement: Inline Telegram-specific helpers into `telegram-injection.test.ts`, or keep a smaller local module with only non-exported/internal functions needed by this test.
    • Safety boundary: Do not remove sandbox-name validation, explicit credential redaction, base64/shellQuote handling for malicious payloads, or the real OpenShell exec and ssh-config/ssh paths.
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — workflow boundary rejects telegram-injection-vitest Docker auth under github.workspace and requires RUNNER_TEMP DOCKER_CONFIG. The changed behavior crosses workflow, installer, Docker/OpenShell, SSH, sandbox shell execution, validateName, and credential-redaction boundaries. The PR adds the relevant live runtime test and a workflow-boundary validator; additional negative static mutation tests would improve confidence in the new validator but are not required by the static review findings.
  • PRA-T2 Runtime validation — workflow boundary rejects NVIDIA_INFERENCE_API_KEY outside Run Telegram injection live test. The changed behavior crosses workflow, installer, Docker/OpenShell, SSH, sandbox shell execution, validateName, and credential-redaction boundaries. The PR adds the relevant live runtime test and a workflow-boundary validator; additional negative static mutation tests would improve confidence in the new validator but are not required by the static review findings.
  • PRA-T3 Runtime validation — workflow boundary rejects telegram-injection-vitest artifact paths outside e2e-artifacts/vitest/telegram-injection and requires include-hidden-files false. The changed behavior crosses workflow, installer, Docker/OpenShell, SSH, sandbox shell execution, validateName, and credential-redaction boundaries. The PR adds the relevant live runtime test and a workflow-boundary validator; additional negative static mutation tests would improve confidence in the new validator but are not required by the static review findings.
  • PRA-T4 Acceptance clause — Refs #5098 (Epic: Migrate legacy bash E2E into the Vitest E2E system): add test evidence or identify existing coverage. The deterministic validation context did not include the body or comments of linked issue #5098 (`linkedIssues` was empty), so literal issue acceptance clauses could not be extracted or mapped.
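
The three workflow-boundary follow-ups (PRA-T1 to PRA-T3) describe negative checks that could be sketched as below. This is an illustration only: the real `validateTelegramInjectionVitestJob` in tools/e2e-scenarios/workflow-boundary.mts is not shown in this thread, so the `Step` and `Job` shapes, the `validateJob` function, and the way `RUNNER_TEMP` is detected are all assumptions.

```typescript
// Hypothetical, simplified shapes for a parsed workflow job.
interface Step {
  name?: string;
  env?: Record<string, string>;
  with?: Record<string, string | boolean>;
}

interface Job {
  env?: Record<string, string>;
  steps: Step[];
}

const SECRET_NAME = "NVIDIA_INFERENCE_API_KEY";
const SECRET_STEP = "Run Telegram injection live test";
const ARTIFACT_ROOT = "e2e-artifacts/vitest/telegram-injection";

// Returns a list of human-readable violations; an empty list means the job passes.
function validateJob(job: Job): string[] {
  const errors: string[] = [];

  // PRA-T1: Docker auth must live under RUNNER_TEMP, never the workspace.
  const dockerConfig = job.env?.DOCKER_CONFIG ?? "";
  if (dockerConfig.includes("github.workspace") || !/runner\.temp|RUNNER_TEMP/.test(dockerConfig)) {
    errors.push("DOCKER_CONFIG must be under RUNNER_TEMP");
  }

  // PRA-T2: the API key may only be exposed to the live test step.
  if (job.env !== undefined && SECRET_NAME in job.env) {
    errors.push(`${SECRET_NAME} must not be set at job level`);
  }
  for (const step of job.steps) {
    if (step.env !== undefined && SECRET_NAME in step.env && step.name !== SECRET_STEP) {
      errors.push(`${SECRET_NAME} exposed outside "${SECRET_STEP}"`);
    }

    // PRA-T3: uploads stay inside the scenario directory, without hidden files.
    // Simplification: any step with a string `with.path` is treated as an upload.
    const path = step.with?.path;
    if (typeof path === "string") {
      if (path !== ARTIFACT_ROOT && !path.startsWith(`${ARTIFACT_ROOT}/`)) {
        errors.push(`artifact path outside ${ARTIFACT_ROOT}`);
      }
      if (step.with?.["include-hidden-files"] !== false) {
        errors.push("include-hidden-files must be false");
      }
    }
  }
  return errors;
}
```

A negative mutation test would then take a known-good job object, change one field (for example pointing `DOCKER_CONFIG` at `github.workspace`), and assert that exactly the expected violation is reported.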

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 4

🧹 Nitpick comments (1)
.github/workflows/e2e-vitest-scenarios.yaml (1)

4080-4093: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Inconsistent Docker login retry logic; align with channels-add-remove-vitest pattern.

The telegram-injection-vitest job uses a simplified Docker login (line 4093: || echo "::warning::..."), while the channels-add-remove-vitest job (lines 3972-3993) implements a robust 3-attempt retry loop with backoff. For consistency and reliability, adopt the retry pattern here as well.

♻️ Adopt the retry-loop pattern from channels-add-remove-vitest
       - name: Authenticate to Docker Hub
         env:
           DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
           DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
         shell: bash
         run: |
           set -euo pipefail
           if [[ -z "${DOCKERHUB_USERNAME}" || -z "${DOCKERHUB_TOKEN}" ]]; then
             echo "::notice::Docker Hub credentials not configured; continuing with anonymous pulls."
             exit 0
           fi
           mkdir -p "${DOCKER_CONFIG}"
           chmod 700 "${DOCKER_CONFIG}"
-          echo "${DOCKERHUB_TOKEN}" | timeout 30s docker login docker.io --username "${DOCKERHUB_USERNAME}" --password-stdin || echo "::warning::Docker Hub login failed; continuing with anonymous pulls."
+          login_succeeded=0
+          for attempt in 1 2 3; do
+            if echo "${DOCKERHUB_TOKEN}" | timeout 30s docker login docker.io --username "${DOCKERHUB_USERNAME}" --password-stdin; then
+              login_succeeded=1
+              break
+            fi
+            if [[ "$attempt" -lt 3 ]]; then
+              echo "::warning::Docker Hub login attempt ${attempt} failed; retrying."
+              sleep 5
+            fi
+          done
+          if [[ "$login_succeeded" -ne 1 ]]; then
+            echo "::warning::Docker Hub login failed after 3 attempts; continuing with anonymous pulls."
+          fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 4080 - 4093, The
Docker login command in the "Authenticate to Docker Hub" step handles failure
with a single warning message, but the channels-add-remove-vitest job implements
a more robust 3-attempt retry loop with a delay between attempts. Replace the
current one-line Docker login with the retry loop pattern from the
channels-add-remove-vitest job (lines 3972-3993) so that both jobs handle
transient failures consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 4107-4117: The "Run Telegram injection live test" step is missing
required OpenShell installation and PATH configuration. Add a new "Install
OpenShell" step before the test step that runs scripts/install-openshell.sh with
NEMOCLAW_NON_INTERACTIVE set to "1", then update the "Run Telegram injection
live test" step to export PATH with the necessary bin directories, verify the
openshell binary location by checking if command -v finds it or if it exists at
$HOME/.local/bin/openshell, export OPENSHELL_BIN with the resolved binary path,
and run openshell --version to confirm installation before executing the vitest
command for telegram-injection.test.ts.

In `@test/e2e-scenario/live/telegram-injection.test.ts`:
- Line 177: The test includes "UPPERCASE" in the invalidNames array, but the
validateName regex pattern at test/e2e-scenario/fixtures/clients/sandbox.ts
explicitly allows uppercase letters in the A-Z character class. Remove
"UPPERCASE" from the invalidNames array in the telegram-injection.test.ts file
since it will actually pass validation, not fail it.
- Around line 88-99: The test body contains a conditional if statement checking
for NVIDIA_ENDPOINT_RATE_LIMIT which violates the linear test structure
guardrail. Extract the rate-limit error handling logic from the test into a new
helper function (e.g., installSandboxOrSkipOnRateLimit in the test helpers file)
that wraps the installSandbox call with the try-catch and rate-limit check,
accepts the skip function as a parameter, and returns the result or triggers
skip internally. Then replace the entire try-catch block in the test with a
single call to this new helper function, keeping the test body free of
conditional logic.
- Around line 30-32: Remove the duplicate base64 function definition from the
test file (the function that converts a string value to base64 encoding using
Buffer.from and toString). Instead, import the base64 function from the
phase6-messaging-helpers module at the top of the file alongside other imports,
then delete the local function definition. Any existing calls to base64 within
this test file will automatically use the shared implementation from
phase6-messaging-helpers.

---

Nitpick comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 4080-4093: The Docker login command in the "Authenticate to Docker
Hub" step handles failure with a single warning message, but the
channels-add-remove-vitest job implements a more robust 3-attempt retry loop
with a delay between attempts. Replace the current one-line Docker login with
the retry loop pattern from the channels-add-remove-vitest job (lines 3972-3993)
so that both jobs handle transient failures consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1884fea5-c762-46a1-8cf7-ad6b16b00679

📥 Commits

Reviewing files that changed from the base of the PR and between cf403cf and 98ccb03.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/phase6-messaging-helpers.ts
  • test/e2e-scenario/live/telegram-injection.test.ts

Comment thread .github/workflows/e2e-vitest-scenarios.yaml
Comment thread test/e2e-scenario/live/telegram-injection.test.ts Outdated
Comment thread test/e2e-scenario/live/telegram-injection.test.ts Outdated
Comment thread test/e2e-scenario/live/telegram-injection.test.ts
@cv

cv commented Jun 22, 2026

Collaborator Author

Addressed review feedback in d45654f/d5d1d2f64:

  • Added explicit OpenShell install/PATH/OPENSHELL_BIN setup to the telegram-injection Vitest workflow job.
  • Moved Docker auth to RUNNER_TEMP and aligned login retry behavior with nearby E2E jobs.
  • Removed the duplicate local base64 helper and the invalid UPPERCASE rejection case.
  • Kept the test entrypoint conditional-free by moving rate-limit handling into the helper.
  • Reworked credential leak probes so raw command output is checked inside the shell before redacted evidence is returned/written, and process-table probes fail instead of passing inconclusive output.
  • Shrunk the Phase 6 helper module to the helpers used by this PR, removing unused provider/registry/sandboxNode scaffolding.
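
The reworked leak probes check the raw output first and only then hand back redacted evidence. The PR does the check inside the shell; the sketch below shows the same ordering in TypeScript as an illustration only. `checkThenRedact` and `assertProcessTableClean` are hypothetical names, and the `[REDACTED]` placeholder is an assumption.

```typescript
// Hypothetical helper: decide whether raw output leaks the secret, then
// return only a redacted copy that is safe to write to artifacts.
function checkThenRedact(raw: string, secret: string): { leaked: boolean; evidence: string } {
  if (secret.length === 0) {
    throw new Error("secret must be non-empty");
  }
  // The check runs on the raw text, before any redaction can hide a leak.
  const leaked = raw.includes(secret);
  const evidence = raw.split(secret).join("[REDACTED]");
  return { leaked, evidence };
}

// Hypothetical probe: an empty process table is treated as a failure
// (inconclusive) rather than silently passing.
function assertProcessTableClean(psOutput: string, secret: string): string {
  if (psOutput.trim().length === 0) {
    throw new Error("inconclusive: process table output was empty");
  }
  const { leaked, evidence } = checkThenRedact(psOutput, secret);
  if (leaked) {
    throw new Error("secret visible in process table");
  }
  return evidence;
}
```

Redacting before checking would make every probe pass, which is why the order matters here.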

Comment thread test/e2e-scenario/live/telegram-injection.test.ts Fixed
Comment thread test/e2e-scenario/live/telegram-injection.test.ts Fixed
Comment thread test/e2e-scenario/live/telegram-injection.test.ts Fixed
Comment thread test/e2e-scenario/live/telegram-injection.test.ts Fixed
@github-actions

Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27928266806
Workflow ref: e2e-phase6-telegram-injection
Requested scenarios: (default — all supported)
Requested jobs: telegram-injection-vitest
Summary: 1 passed, 1 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
telegram-injection-vitest ❌ failure
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

Failed jobs: telegram-injection-vitest. Check run artifacts for logs.

@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27962694052
Workflow ref: e2e-phase6-telegram-injection
Requested scenarios: (default — all supported)
Requested jobs: telegram-injection-vitest
Summary: 2 passed, 0 failed, 53 skipped

Job Result
agent-turn-latency-vitest ⏭️ skipped
bedrock-runtime-compatible-anthropic-vitest ⏭️ skipped
brave-search-vitest ⏭️ skipped
channels-add-remove-vitest ⏭️ skipped
cloud-inference-vitest ⏭️ skipped
cloud-onboard-vitest ⏭️ skipped
common-egress-agent-vitest ⏭️ skipped
concurrent-gateway-ports-vitest ⏭️ skipped
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
cron-preflight-inference-local-vitest ⏭️ skipped
device-auth-health-vitest ⏭️ skipped
diagnostics-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
full-e2e-vitest ⏭️ skipped
gateway-drift-preflight-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
gateway-health-honest-vitest ⏭️ skipped
generate-matrix ✅ success
gpu-double-onboard-vitest ⏭️ skipped
gpu-e2e-vitest ⏭️ skipped
hermes-e2e-vitest ⏭️ skipped
hermes-inference-switch-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-2478-crash-loop-recovery-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
issue-4462-scope-upgrade-approval-vitest ⏭️ skipped
kimi-inference-compat-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
messaging-compatible-endpoint-vitest ⏭️ skipped
messaging-providers-vitest ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
ollama-auth-proxy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
onboard-repair-vitest ⏭️ skipped
onboard-resume-vitest ⏭️ skipped
openclaw-inference-switch-vitest ⏭️ skipped
openclaw-skill-cli-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
sessions-agents-cli-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
snapshot-commands-vitest ⏭️ skipped
state-backup-restore-vitest ⏭️ skipped
telegram-injection-vitest ✅ success
token-rotation-vitest ⏭️ skipped
upgrade-stale-sandbox-vitest ⏭️ skipped

@jyaunches jyaunches left a comment

Contributor

Reviewed the follow-up at c4931918.

This addresses the earlier coverage concerns:

  • added Vitest coverage for the prior bash-test OpenShell SSH route via openshell sandbox ssh-config + ssh -F, alongside the existing openshell sandbox exec checks;
  • restored uppercase sandbox-name rejection coverage and fixed the --help argument handling so invalid-name cases reach validateName;
  • added a workflow-boundary validator for telegram-injection-vitest to keep Docker auth in ``, scope NVIDIA_INFERENCE_API_KEY to the live test step, and keep artifact upload narrowed to `e2e-artifacts/vitest/telegram-injection/`.
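
To make the sandbox-name rejection cases above concrete, here is a small table-driven TypeScript sketch. The rule it encodes (lowercase letters, digits, and inner hyphens only) is an assumption for illustration; the repository's real `validateName` may accept or reject different inputs. `validateNameSketch` and `NAME_PATTERN` are hypothetical names.

```typescript
// Assumed rule, for illustration only: lowercase letters, digits, inner hyphens.
// The real validateName in the repository may differ.
const NAME_PATTERN = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/;

function validateNameSketch(name: string): boolean {
  return NAME_PATTERN.test(name);
}

// Uppercase and shell-metacharacter payloads should all be rejected as names.
const invalidNames: string[] = [
  "MySandbox",
  "box; rm -rf /",
  "$(whoami)",
  "a`id`b",
  "",
];

const rejected = invalidNames.filter((name) => !validateNameSketch(name));
```

Under this assumed rule, an allowlist pattern rejects uppercase names and metacharacter payloads in one check, which is why both kinds of case can share a single table.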

Validation I checked:

  • PR Review Advisor reran on c4931918 and reports merge_as_is / no blocking findings.
  • Required telegram-injection-vitest passed in run 27962694052.
  • Local checks before push: Biome lint, typecheck, build:cli, workflow-boundary validator, and diff whitespace check.

Approving.

@jyaunches jyaunches merged commit 60b0a00 into main Jun 22, 2026
100 checks passed
@jyaunches jyaunches deleted the e2e-phase6-telegram-injection branch June 22, 2026 15:57
@cv cv added the v0.0.66 Release target label Jun 22, 2026