Skip to content

test(e2e): migrate device auth health to vitest#5543

Open
cv wants to merge 6 commits into
mainfrom
e2e-migrate/test-device-auth-health
Open

test(e2e): migrate device auth health to vitest#5543
cv wants to merge 6 commits into
mainfrom
e2e-migrate/test-device-auth-health

Conversation

@cv

@cv cv commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Summary

Migrates test/e2e/test-device-auth-health.sh into a typed live Vitest scenario. The new coverage preserves the #2342 device-auth health contract through real install.sh onboard, sandbox HTTP probes, host port-forward probing, nemoclaw status, and gateway recovery checks.

Related Issue

Refs #5098

Changes

  • Add test/e2e-scenario/live/device-auth-health.test.ts with typed artifact capture, cleanup, and secret redaction.
  • Wire device-auth-health-vitest into .github/workflows/e2e-vitest-scenarios.yaml as a free-standing dispatchable Vitest job.
  • Preserve legacy shell deletion for Phase 11 cleanup per the migration governance in Epic: Migrate legacy bash E2E into the Vitest E2E system #5098.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • PR description includes the DCO sign-off declaration and every commit appears as Verified in GitHub
  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Targeted commands run:

  • npx biome check --write test/e2e-scenario/live/device-auth-health.test.ts
  • NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/device-auth-health.test.ts -t __compile_only_nomatch__ --silent=false --reporter=default --passWithNoTests
  • npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • npx tsx scripts/check-test-file-size-budget.ts test/e2e-scenario/live/device-auth-health.test.ts

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Tests

    • Added a new live end-to-end device authentication health verification scenario with sandbox provisioning, /health checks, expected dashboard behavior (including correct handling of authenticated responses), and validation that health remains correct after a simulated gateway failure and recovery.
  • CI / Release Operations

    • Added a dedicated end-to-end CI job for this scenario and included its pass/fail status in pull request reporting, along with uploaded test artifacts.
    • Improved workflow boundary test timing by setting an explicit timeout.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv self-assigned this Jun 18, 2026
@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f812271a-a2bb-4dbb-a8a1-678940cb72b6

📥 Commits

Reviewing files that changed from the base of the PR and between 370db91 and 78439d2.

📒 Files selected for processing (1)
  • .github/workflows/e2e-vitest-scenarios.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/e2e-vitest-scenarios.yaml

📝 Walkthrough

Walkthrough

Adds a new live Vitest e2e test file device-auth-health.test.ts that provisions a sandbox, installs NemoClaw with retry logic, validates /health and dashboard HTTP responses, simulates gateway failure and recovery, and persists artifacts. A corresponding device-auth-health-vitest CI job is wired into .github/workflows/e2e-vitest-scenarios.yaml, added to report-to-pr.needs, and the workflow metadata tests are updated to reflect the new job.

Changes

Device Auth Health E2E Vitest Migration

Layer / File(s) Summary
Test module constants and helper functions
test/e2e-scenario/live/device-auth-health.test.ts
Defines sandbox name, dashboard port, install attempt count, and live timeout constants; adds helpers for env construction, best-effort cleanup, offline-status assertion, and sandbox HTTP probing with artifact capture.
Core live scenario implementation
test/e2e-scenario/live/device-auth-health.test.ts
Implements the full live test: gating/skip checks, NVIDIA API key requirement, install retry/backoff, /health (200) and dashboard root (200/401) validation, nemoclaw status non-offline assertion, gateway process kill, recovery status check, and up to 30-poll /health loop writing a success or inconclusive JSON artifact.
CI job and PR reporting wiring
.github/workflows/e2e-vitest-scenarios.yaml
Adds the device-auth-health-vitest free-standing job with Docker Hub auth, isolated Docker config dir, Node/OpenShell setup, environment variables for port/gateway/sandbox, device-auth-health artifact upload, and Docker cleanup; extends report-to-pr.needs to include the new job for PR result reporting.
Workflow metadata test updates
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
Reorders test module imports and adds explicit 60-second timeout option to the workflow job inventory derivation test to accommodate the new device-auth-health job metadata.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Modifies the same e2e-vitest-scenarios.yaml workflow to wire free-standing Vitest jobs into the shared job-selector and report-to-pr.needs reporting flow, directly mirroring the pattern used here.
  • NVIDIA/NemoClaw#5347: Both PRs add new free-standing live Vitest jobs to .github/workflows/e2e-vitest-scenarios.yaml and update the workflow boundary/inventory test suite (e2e-scenarios-workflow.test.ts) plus report-to-pr.needs wiring to reflect those new job outcomes.
  • NVIDIA/NemoClaw#5493: Both PRs modify .github/workflows/e2e-vitest-scenarios.yaml by adding a new free-standing live Vitest job and wiring it into the report-to-pr.needs list for PR-comment aggregation.

Suggested labels

area: e2e, chore

Poem

🐇 Hop, hop! The gateway falls—
But health checks bounce right off the walls.
Retry the install, poll the /health,
Collect those artifacts by stealth.
If 401 says "who goes there?"
That's not offline—it's just auth-aware! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test(e2e): migrate device auth health to vitest' directly and accurately summarizes the main change: migrating a legacy bash E2E test to a TypeScript/Vitest test scenario.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch e2e-migrate/test-device-auth-health

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-code-quality

github-code-quality Bot commented Jun 18, 2026

Copy link
Copy Markdown

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 78439d2 +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 46%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 78439d2 +/-
src/lib/state/o...oard-session.ts 90%
src/lib/inference/local.ts 76%
src/lib/sandbox/config.ts 72%
src/lib/actions...dbox/rebuild.ts 67%
src/lib/onboard/preflight.ts 64%
src/lib/actions...licy-channel.ts 56%
src/lib/state/sandbox.ts 55%
src/lib/onboard...er-gpu-patch.ts 50%
src/lib/policy/index.ts 49%
src/lib/onboard.ts 18%

Updated June 19, 2026 23:33 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: device-auth-health-vitest
Optional E2E: gateway-health-honest-vitest, onboard-negative-paths-vitest

Dispatch hint: device-auth-health-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • device-auth-health-vitest (high): This is the newly added free-standing live E2E job. It should be run before merge to validate the new workflow wiring and the real install/onboard/sandbox/device-auth health contract it introduces.

Optional E2E

  • gateway-health-honest-vitest (medium): Adjacent health-reporting coverage: this existing job validates that gateway crashes are not falsely reported healthy. It provides extra confidence around the same status/health classification area touched by the new device-auth-health scenario.
  • onboard-negative-paths-vitest (medium): Optional adjacent onboarding confidence because the new scenario invokes real install/onboard paths with credential-sensitive environment handling, but the PR does not modify runtime onboarding code.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: device-auth-health-vitest

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: device-auth-health-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=device-auth-health-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • device-auth-health-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/device-auth-health.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=device-auth-health-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/device-auth-health.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/live/device-auth-health.test.ts`:
- Around line 233-239: The recovery status validation in the nemoclaw status
check only validates the output text does not contain offline but does not
verify that the nemoclaw command itself succeeded. Update the test to check that
the recoveryStatus indicates successful command execution (verify the exit code
or success property) in addition to calling assertStatusNotOffline on the
result. This ensures that a failed nemoclaw status command does not silently
pass the recovery check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1141229f-aedd-4378-9701-4501bbbd4efc

📥 Commits

Reviewing files that changed from the base of the PR and between c4bd014 and 679b7b3.

📒 Files selected for processing (2)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/device-auth-health.test.ts

Comment thread test/e2e-scenario/live/device-auth-health.test.ts
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 7 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 6 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: Dashboard root 200/401 tolerance in test/e2e-scenario/live/device-auth-health.test.ts: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: The new test accepts `expect(["200", "401"]).toContain(root.stdout.trim())` while the test title says 401 responses are treated as live.
  • Source-of-truth review needed: Dashboard port selection in test/e2e-scenario/live/device-auth-health.test.ts: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: Legacy `test/e2e/test-device-auth-health.sh` updated DASHBOARD_PORT from `openshell forward list`; the new Vitest test uses a constant `DASHBOARD_PORT` for all probes.
  • Keep Docker auth out of the OpenShell installer step (.github/workflows/e2e-vitest-scenarios.yaml:3527): The new device-auth-health job logs in to Docker Hub and persists DOCKER_CONFIG through GITHUB_ENV before running scripts/install-openshell.sh. That makes the OpenShell installer run inside a secret-bearing Docker credential context, widening the trusted-code boundary for an installer step.
    • Recommendation: Install OpenShell before Docker login, or run the installer with Docker/token-related environment removed, for example: env -u DOCKER_CONFIG -u DOCKERHUB_USERNAME -u DOCKERHUB_TOKEN -u NVIDIA_INFERENCE_API_KEY -u GITHUB_TOKEN bash scripts/install-openshell.sh.
    • Evidence: The job writes DOCKER_CONFIG at .github/workflows/e2e-vitest-scenarios.yaml:3482, authenticates with DOCKERHUB_TOKEN, then runs `bash scripts/install-openshell.sh`. Nearby jobs use `env -u DOCKER_CONFIG -u DOCKERHUB_USERNAME -u DOCKERHUB_TOKEN ... bash scripts/install-openshell.sh`.
  • Validate the dashboard port before sandbox shell interpolation (test/e2e-scenario/live/device-auth-health.test.ts:30): NEMOCLAW_DASHBOARD_PORT is read as a raw string and later interpolated into a command executed via `openshell sandbox exec ... sh -lc`. The workflow sets a numeric value, but local or self-hosted overrides could include malformed values or shell metacharacters that alter the command run inside the sandbox.
    • Recommendation: Parse and validate NEMOCLAW_DASHBOARD_PORT as an integer TCP port in range 1-65535 before use, or avoid shell interpolation by passing validated arguments through a non-shell helper.
    • Evidence: `const DASHBOARD_PORT = process.env.NEMOCLAW_DASHBOARD_PORT ?? "18789";` is interpolated into `trustedSandboxShellScript(`curl ... http://localhost:${DASHBOARD\_PORT}${urlPath}\`\)\`. The trustedSandboxShellScript helper only brands non-empty strings; it does not escape embedded values.
  • Do not silently pass the 401 regression path when the dashboard root returns 200 (test/e2e-scenario/live/device-auth-health.test.ts:185): The test title and scenario metadata claim to verify that device-auth 401 responses are treated as live, but the root probe accepts either 200 or 401 and then continues to assert the 401-specific regression path. If the environment returns 200, the test can pass without exercising the device-auth 401 behavior it is meant to protect.
  • Preserve the legacy actual-dashboard-port detection (test/e2e-scenario/live/device-auth-health.test.ts:30): The legacy shell test updated DASHBOARD_PORT from `openshell forward list` after install because the forwarded host port may differ from the requested default if the port was already taken. The Vitest migration always uses the configured/default port for sandbox and host probes, which can fail or probe the wrong endpoint in reused or self-hosted environments.
    • Recommendation: After install/onboard, detect the actual forwarded dashboard port for the sandbox and use that value for subsequent sandbox and host HTTP probes, or explicitly assert that this free-standing job requires the configured port to be honored.
    • Evidence: Legacy `test/e2e/test-device-auth-health.sh` assigned `ACTUAL_PORT=$(openshell forward list ...); DASHBOARD_PORT="$ACTUAL_PORT"` when present. The new test sets `const DASHBOARD_PORT = process.env.NEMOCLAW_DASHBOARD_PORT ?? "18789"` once and never updates it.
  • Add targeted boundary tests for the new free-standing secret-bearing job (test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts): The workflow inventory tests validate free-standing metadata generically, but this PR does not add behavior-specific assertions for the new `device-auth-health` selector or for the installer secret-boundary expectation. Because the added job is secret-bearing and dispatch-selectable, targeted regression coverage would improve confidence.
    • Recommendation: Add explicit selector tests showing `scenarios=device-auth-health` selects only `device-auth-health-vitest` and `jobs=device-auth-health-vitest` selects only that job. Also add or identify coverage that the OpenShell installer step runs without Docker/NVIDIA/GitHub token environment in scope.
    • Evidence: The diff adds workflow metadata `FREE_STANDING_SCENARIO_ID: "device-auth-health"`, but the support test file has no explicit `device-auth-health` assertions. The new job also runs `scripts/install-openshell.sh` after Docker auth, and no support test asserts that boundary.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — NEMOCLAW_DASHBOARD_PORT rejects empty, non-numeric, shell-metacharacter, and out-of-range values before any sandbox `sh -lc` curl command is constructed.. This PR changes a secret-bearing GitHub Actions workflow plus a live installer/OpenShell/sandbox E2E path. The highest-risk behavior crosses Docker credentials, installer trust, shell execution inside a sandbox, and runtime port-forwarding boundaries.
  • **Runtime validation** — Workflow dispatch with `scenarios=device-auth-health` selects only `device-auth-health-vitest`, and `jobs=device-auth-health-vitest` selects only that job.. This PR changes a secret-bearing GitHub Actions workflow plus a live installer/OpenShell/sandbox E2E path. The highest-risk behavior crosses Docker credentials, installer trust, shell execution inside a sandbox, and runtime port-forwarding boundaries.
  • **Runtime validation** — When dashboard `/` returns 401, `nemoclaw status` exits 0, does not contain `Offline`, and shows a live/ready/running indicator.. This PR changes a secret-bearing GitHub Actions workflow plus a live installer/OpenShell/sandbox E2E path. The highest-risk behavior crosses Docker credentials, installer trust, shell execution inside a sandbox, and runtime port-forwarding boundaries.
  • **Runtime validation** — When dashboard `/` returns 200, the test records that the 401-specific device-auth premise was not exercised instead of claiming the 401 regression path passed.. This PR changes a secret-bearing GitHub Actions workflow plus a live installer/OpenShell/sandbox E2E path. The highest-risk behavior crosses Docker credentials, installer trust, shell execution inside a sandbox, and runtime port-forwarding boundaries.
  • **Runtime validation** — If onboarding chooses a forwarded dashboard port different from `NEMOCLAW_DASHBOARD_PORT`, the Vitest scenario discovers and uses the actual port for sandbox and host probes.. This PR changes a secret-bearing GitHub Actions workflow plus a live installer/OpenShell/sandbox E2E path. The highest-risk behavior crosses Docker credentials, installer trust, shell execution inside a sandbox, and runtime port-forwarding boundaries.
  • **Add targeted boundary tests for the new free-standing secret-bearing job** — Add explicit selector tests showing `scenarios=device-auth-health` selects only `device-auth-health-vitest` and `jobs=device-auth-health-vitest` selects only that job. Also add or identify coverage that the OpenShell installer step runs without Docker/NVIDIA/GitHub token environment in scope.
  • **Acceptance clause:** No deterministic linked issue clauses were provided in the review context. — add test evidence or identify existing coverage. The validation context reported `linkedIssues: []`. PR-body references such as `Refs Epic: Migrate legacy bash E2E into the Vitest E2E system #5098` and comments about `[NemoClaw][Brev Launchable] OpenClaw Gateway Dashboard shows "Version n/a" and "Health Offline" after Brev Launchable deployment succeeds #2342` were treated as untrusted context rather than deterministic acceptance clauses.
  • **Dashboard root 200/401 tolerance in test/e2e-scenario/live/device-auth-health.test.ts** — Missing for the 401-required path when `/` returns 200; the test currently passes both 200 and 401.. The new test accepts `expect(["200", "401"]).toContain(root.stdout.trim())` while the test title says 401 responses are treated as live.
Since last review details

Current findings:

  • Source-of-truth review needed: Dashboard root 200/401 tolerance in test/e2e-scenario/live/device-auth-health.test.ts: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: The new test accepts `expect(["200", "401"]).toContain(root.stdout.trim())` while the test title says 401 responses are treated as live.
  • Source-of-truth review needed: Dashboard port selection in test/e2e-scenario/live/device-auth-health.test.ts: The advisor marked localized patch analysis as missing.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: Legacy `test/e2e/test-device-auth-health.sh` updated DASHBOARD_PORT from `openshell forward list`; the new Vitest test uses a constant `DASHBOARD_PORT` for all probes.
  • Keep Docker auth out of the OpenShell installer step (.github/workflows/e2e-vitest-scenarios.yaml:3527): The new device-auth-health job logs in to Docker Hub and persists DOCKER_CONFIG through GITHUB_ENV before running scripts/install-openshell.sh. That makes the OpenShell installer run inside a secret-bearing Docker credential context, widening the trusted-code boundary for an installer step.
    • Recommendation: Install OpenShell before Docker login, or run the installer with Docker/token-related environment removed, for example: env -u DOCKER_CONFIG -u DOCKERHUB_USERNAME -u DOCKERHUB_TOKEN -u NVIDIA_INFERENCE_API_KEY -u GITHUB_TOKEN bash scripts/install-openshell.sh.
    • Evidence: The job writes DOCKER_CONFIG at .github/workflows/e2e-vitest-scenarios.yaml:3482, authenticates with DOCKERHUB_TOKEN, then runs `bash scripts/install-openshell.sh`. Nearby jobs use `env -u DOCKER_CONFIG -u DOCKERHUB_USERNAME -u DOCKERHUB_TOKEN ... bash scripts/install-openshell.sh`.
  • Validate the dashboard port before sandbox shell interpolation (test/e2e-scenario/live/device-auth-health.test.ts:30): NEMOCLAW_DASHBOARD_PORT is read as a raw string and later interpolated into a command executed via `openshell sandbox exec ... sh -lc`. The workflow sets a numeric value, but local or self-hosted overrides could include malformed values or shell metacharacters that alter the command run inside the sandbox.
    • Recommendation: Parse and validate NEMOCLAW_DASHBOARD_PORT as an integer TCP port in range 1-65535 before use, or avoid shell interpolation by passing validated arguments through a non-shell helper.
    • Evidence: `const DASHBOARD_PORT = process.env.NEMOCLAW_DASHBOARD_PORT ?? "18789";` is interpolated into `trustedSandboxShellScript(`curl ... http://localhost:${DASHBOARD\_PORT}${urlPath}\`\)\`. The trustedSandboxShellScript helper only brands non-empty strings; it does not escape embedded values.
  • Do not silently pass the 401 regression path when the dashboard root returns 200 (test/e2e-scenario/live/device-auth-health.test.ts:185): The test title and scenario metadata claim to verify that device-auth 401 responses are treated as live, but the root probe accepts either 200 or 401 and then continues to assert the 401-specific regression path. If the environment returns 200, the test can pass without exercising the device-auth 401 behavior it is meant to protect.
  • Preserve the legacy actual-dashboard-port detection (test/e2e-scenario/live/device-auth-health.test.ts:30): The legacy shell test updated DASHBOARD_PORT from `openshell forward list` after install because the forwarded host port may differ from the requested default if the port was already taken. The Vitest migration always uses the configured/default port for sandbox and host probes, which can fail or probe the wrong endpoint in reused or self-hosted environments.
    • Recommendation: After install/onboard, detect the actual forwarded dashboard port for the sandbox and use that value for subsequent sandbox and host HTTP probes, or explicitly assert that this free-standing job requires the configured port to be honored.
    • Evidence: Legacy `test/e2e/test-device-auth-health.sh` assigned `ACTUAL_PORT=$(openshell forward list ...); DASHBOARD_PORT="$ACTUAL_PORT"` when present. The new test sets `const DASHBOARD_PORT = process.env.NEMOCLAW_DASHBOARD_PORT ?? "18789"` once and never updates it.
  • Add targeted boundary tests for the new free-standing secret-bearing job (test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts): The workflow inventory tests validate free-standing metadata generically, but this PR does not add behavior-specific assertions for the new `device-auth-health` selector or for the installer secret-boundary expectation. Because the added job is secret-bearing and dispatch-selectable, targeted regression coverage would improve confidence.
    • Recommendation: Add explicit selector tests showing `scenarios=device-auth-health` selects only `device-auth-health-vitest` and `jobs=device-auth-health-vitest` selects only that job. Also add or identify coverage that the OpenShell installer step runs without Docker/NVIDIA/GitHub token environment in scope.
    • Evidence: The diff adds workflow metadata `FREE_STANDING_SCENARIO_ID: "device-auth-health"`, but the support test file has no explicit `device-auth-health` assertions. The new job also runs `scripts/install-openshell.sh` after Docker auth, and no support test asserts that boundary.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@cv cv added the v0.0.66 Release target label Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v0.0.66 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant