1 change: 1 addition & 0 deletions CLAUDE.md
@@ -39,6 +39,7 @@ Web-based Robot Framework test management (Git, GUI execution, reports, environm
- **Recorder capture script MUST use `composedPath()[0]`, not `ev.target`**: events inside an open shadow root surface with `ev.target` retargeted to the *host*. `recording/capture_script.py::realTarget(ev)` returns `ev.composedPath()[0]` (falls back to `ev.target` for closed roots); every handler (click, dblclick, change, keydown, dragstart, drop) routes through it. The ancestor walk crosses shadow boundaries via `crossShadow(el)` (jump to `getRootNode().host` when `parentElement` is null); each ancestor's `is_shadow_host` flag makes synthesis emit a `host >> inner` chained locator (`selector_synthesis.py::_shadow_chain`). Pinned by `test_capture_script.py::TestShadowDomAwareness`. New event listeners must route through `realTarget`.
- **Recorder selector synthesis MUST emit a parent-context CSS variant**: `selector_synthesis.py::_css` emits `<ancestor#id|testid> tag.classes` whenever a stable ancestor exists, not just `tag.classes`. A bare `button.submit-btn` matching every submit is the top Playwright strict-mode failure source at replay; pinning the nearest stable-id ancestor (quality_score `+10`) cuts misfires by orders of magnitude. `_with_nth_match` (`selector_verification.py`) is the last-resort fallback (penalty `-15`). Pinned by `test_selector_synthesis.py::TestParentContextCss`.
- **Online-dist `.bat` files MUST be CRLF; the Windows online installer MUST pass `--find-links wheels`**: `scripts/build-online-mac-and-linux.sh` generates `install/start/stop-windows.bat` via Bash heredocs (→ LF line endings) on the Linux CI host. cmd.exe mis-parses LF-only batch files — it loses the first token of each line, runs `::` comments as commands, and never reaches `uv venv`, so the install silently no-ops and ships no `.venv`. A trailing awk pass normalises every `*.bat` to CRLF; **any new `.bat` added to that script must go through the same loop** or it ships broken on Windows. Separately, the bundle vendors the `robotframework-roboscopeheal` wheel in `wheels/` (not on PyPI), so the generated `install-windows.bat` must install with `--find-links wheels` (mirror the mac/linux branch) or uv 404s on roboheal. The *offline* `build-windows.ps1` is immune to both: PowerShell `Set-Content` writes CRLF and it already uses `--no-index --find-links=wheels`. Both failure modes are pinned by **Gate 6/7 in `.github/workflows/phase4-gates.yml`** (build online dist on Linux → run the real `install-windows.bat` on `windows-latest` → assert `import RoboScopeHeal` + `/health` boot). Regression history: missing `--find-links` hit main 2026-06-01 (PR #48) and the LF breakage was caught by the new gate 2026-06-15 (PR #49).
- **Deployment feature flags (Epic GOV) resolve ENV > DB > default-ON; enforcement is per-endpoint, NOT in middleware**: `src/governance/flags.py::resolve_flag` resolves a flag from `ROBOSCOPE_FEATURE_<UPPER_SNAKE>` (e.g. `ROBOSCOPE_FEATURE_PACKAGE_MANAGEMENT`) → `app_settings` row `features.<flag>` (category `features`, seeded) → registry default (ON). The ENV value wins and marks the flag `locked` (the Settings UI disables that toggle — `SettingsView.vue::settingLocked`). To govern an endpoint, add `Depends(require_package_op("<op>"))` (or `require_feature`) from `src/governance/dependencies.py` — the flag gate is **absolute** (403s even for ADMIN when off) and runs BEFORE the configurable role floor (`features.packageManagement.role.<op>`, default EDITOR). **Trap: the audit middleware skips responses with status ≥ 400** (`audit/middleware.py`), so a 403 raised from a dependency is NOT auto-logged — `require_package_op` therefore writes its own `AuditLog` (`action="blocked"`, detail `feature_disabled:…` / `insufficient_role:…`) and `db.commit()`s before raising. Any new governed endpoint that must be audited has to do the same. Frontend gate: `useFeatureFlags()` (singleton, token-guarded like `useBypassStatus` to avoid the 401-redirect loop; default-enabled while loading since the server is authoritative). Pinned by `tests/governance/` + `e2e/tests/gov-feature-lockdown.spec.ts`.

## Release

44 changes: 44 additions & 0 deletions _bmad-output/planning-artifacts/aix-prd.md
@@ -0,0 +1,44 @@
# Epic AIX — AI Provider & Output Enhancements — PRD + Design

**Status**: Planning → ready for implementation
**Date**: 2026-06-18
**PM/Architect**: John / Winston
**Parent**: [presentation-feedback-epics.md](./presentation-feedback-epics.md)
**Epic**: AIX

## What already exists

- `call_llm` (`ai/llm_client.py`) routes every non-`anthropic` provider through `_call_openai_compatible`, honoring `provider.api_base_url`. So an OpenAI-compatible **LiteLLM gateway** already works today via a generic provider with `api_base_url` set — it just isn't offered as a labeled choice.
- The analysis already threads a `language` param (request → `dispatch_task(run_analyze, job.id, language)` → prompt directive). Verbosity mirrors this exactly.

## AIX-1 — LiteLLM provider type

**JTBD**: *"As an admin, I want to point RoboScope at our LiteLLM gateway so I can use any model we proxy and centralize keys/spend."*

- FE: add `{ value: 'litellm', label: 'LiteLLM (Gateway)' }` to `ProviderConfig.vue` provider types; freeform model list; empty default model; a hint that **the base URL is the gateway endpoint and is required**.
- BE: `litellm` already resolves to `_call_openai_compatible`. Add a guard: if `provider_type == 'litellm'` and no `api_base_url`, the call raises a clear error (no silent fallback to api.openai.com).
- i18n: `ai.litellmHint` in EN/DE/FR/ES/ZH.
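
The backend guard described above could look roughly like the following. This is a minimal sketch, not the shipped code: `ProviderConfigError` and `resolve_base_url` are hypothetical names introduced for illustration, and the PRD does not say which exception type `call_llm` actually raises.

```python
from typing import Optional


class ProviderConfigError(ValueError):
    """Hypothetical error type; the PRD only asks for 'a clear error'."""


def resolve_base_url(provider_type: str, api_base_url: Optional[str]) -> Optional[str]:
    """Return the base URL to call, refusing a LiteLLM provider without one.

    Other provider types may legitimately have no base URL (assumption:
    the OpenAI-compatible path then uses its own default endpoint).
    """
    url = (api_base_url or "").strip()
    if provider_type == "litellm" and not url:
        # Fail loudly instead of silently falling back to a default endpoint.
        raise ProviderConfigError(
            "LiteLLM provider requires api_base_url (the gateway endpoint)"
        )
    return url or None
```

Treating a whitespace-only URL as missing is a choice of this sketch; it keeps a blank form field from slipping past the guard.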

## AIX-2 — Analysis verbosity control

**JTBD**: *"As a user, I want a concise summary or a deep dive on demand, so the analysis fits the moment."*

- `concise | standard | detailed` → a prompt directive (appended to `SYSTEM_PROMPT_ANALYZE`, like the language directive) and an effective `max_tokens` cap (concise ≈ 600, standard = provider default, detailed = provider default). Default `standard`.
- Plumbing mirrors `language`: `AnalyzeRequest.verbosity` → `dispatch_task(run_analyze, job.id, language, verbosity)` → `verbosity_directive(verbosity)` composed into the system prompt. Frontend sends the chosen verbosity (a small select near the Analyze button; default standard).
- i18n: `reportDetail.analysis.verbosity.*` in all 5 locales.
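
A minimal sketch of the directive and token cap, under stated assumptions: the directive wording is invented for illustration, `effective_max_tokens` is a hypothetical helper, and only the concise cap of roughly 600 tokens comes from this PRD.

```python
from typing import Optional

# Illustrative wording only; the real directive text is not given in the PRD.
_DIRECTIVES = {
    "concise": "Answer briefly. Keep code, keywords and patches verbatim.",
    "detailed": "Give a thorough analysis. Keep code, keywords and patches verbatim.",
}
# Only 'concise' gets a cap; standard and detailed use the provider default.
_MAX_TOKENS = {"concise": 600}


def verbosity_directive(verbosity: Optional[str]) -> str:
    """Text appended to the system prompt; empty for standard, None or unknown."""
    return _DIRECTIVES.get((verbosity or "standard").lower(), "")


def effective_max_tokens(verbosity: Optional[str], provider_default: int) -> int:
    """Cap the provider default for concise runs; never raise it."""
    cap = _MAX_TOKENS.get((verbosity or "standard").lower(), provider_default)
    return min(cap, provider_default)
```

Returning an empty string for `standard` and `None` is one way to satisfy FR-3: the composed system prompt is then identical to the prompt used today.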

## Functional requirements

- **FR-1** LiteLLM is selectable; configuring it with a base URL + model produces working analyses; without a base URL it fails with a clear error (not a wrong-endpoint call).
- **FR-2** Verbosity is selectable per analysis (default standard); `concise` yields a materially shorter prompt directive + lower max_tokens; code/keywords/patches stay verbatim; composes with the language directive.
- **FR-3** Non-breaking: existing providers + analyses behave exactly as before when verbosity is unset (treated as standard).

## Acceptance

1. `verbosity_directive` unit-pinned (concise/standard/detailed/None); analyze dispatch passes verbosity through.
2. LiteLLM provider type appears in the form and round-trips; backend guard unit-pinned (litellm + no base_url → error).
3. i18n complete in EN/DE/FR/ES/ZH (Gate 8 stays green).
4. Real-UI E2E: the analysis card shows a verbosity selector.

## Handoff
→ Implementation (Amelia) → review → E2E.
38 changes: 38 additions & 0 deletions _bmad-output/planning-artifacts/exec-prd.md
@@ -0,0 +1,38 @@
# Epic EXEC — RF Execution Configuration — PRD + scope

**Status**: EXEC-1/EXEC-3 shipped; EXEC-2/4/5/6/7 deferred (backlog)
**Date**: 2026-06-18
**PM/Architect**: John / Winston
**Parent**: [presentation-feedback-epics.md](./presentation-feedback-epics.md)

## What already existed (discovery)

The execution pipeline already modeled and applied tags + variables:
`RunCreate`/`ExecutionRun` carry `tags_include`, `tags_exclude`, `variables`, and
`subprocess_runner.py` (and the Docker runner) already build
`robot --include/--exclude/--variable` from them. **But the run dialog never
exposed them** — so users couldn't actually use the capability. That was the gap.
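
For orientation, the tag and variable handling can be sketched as below. This is not the code in `subprocess_runner.py`: `split_tags` and `build_robot_args` are hypothetical names, and the sketch assumes tags arrive as a comma-separated string and variables as a name-to-value mapping. The `--include`, `--exclude` and `--variable name:value` options are standard Robot Framework command-line options.

```python
from typing import List, Mapping, Optional


def split_tags(raw: Optional[str]) -> List[str]:
    """Split a comma-separated tag string, dropping blanks and whitespace."""
    return [tag.strip() for tag in (raw or "").split(",") if tag.strip()]


def build_robot_args(
    tags_include: Optional[str],
    tags_exclude: Optional[str],
    variables: Optional[Mapping[str, str]],
) -> List[str]:
    """Build an argv list; passing a list (not a shell string) avoids quoting issues."""
    args = ["robot"]
    for tag in split_tags(tags_include):
        args += ["--include", tag]
    for tag in split_tags(tags_exclude):
        args += ["--exclude", tag]
    for name, value in (variables or {}).items():
        args += ["--variable", f"{name}:{value}"]
    return args
```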

## Shipped this iteration

- **EXEC-1 / EXEC-3 — tag selection in the run dialog**: the New Run dialog now has **Include tags** and **Exclude tags** inputs (comma-separated), threaded through `runForm` → `RunCreateRequest` (already typed) → the existing backend that converts them to `robot --include/--exclude`. i18n in EN/DE/FR/ES/ZH. Real-UI E2E (`e2e/tests/run-tags.spec.ts`) asserts the create-run request carries `tags_include`; the runner application is covered by existing backend tests.

This is the migration-free, backend-ready slice that delivers the most-requested
"manage what robot runs" capability (tag-based selection) end to end.

## Deferred to backlog (each its own future cycle)

These are larger initiatives, intentionally not built in this iteration:

- **EXEC-1b — free-form `robot` args + `--variable` UI**: a guarded "advanced args" field (and a variables key/value editor) needs a new `ExecutionRun` column + Alembic migration + runner arg-merge + injection-safe parsing. (Variables are already modeled/applied; only the UI is missing.)
- **EXEC-2 — PreRunModifiers** (`--prerunmodifier`): config + project-provided module resolution; security note (arbitrary code in the env).
- **EXEC-4 — Long Name / unique ID surfacing → Jira association** (feeds the Phase-6 Jira plugin).
- **EXEC-5 — `__init__.robot`** suite-init editing in Explorer/Flow.
- **EXEC-6 — DataDriver / dynamic test generation** (spike-first).
- **EXEC-7 — RF best-practices research spike** → UI-surfacing backlog (RF Certified Professional rubric).

## Acceptance (shipped scope)

1. The run dialog exposes Include/Exclude tags; submitting a run sends `tags_include`/`tags_exclude` — pinned by real-UI E2E.
2. No regression to the existing runner tag/variable handling (existing backend tests green).
3. i18n complete in all 5 locales (Gate 8 green).
120 changes: 120 additions & 0 deletions _bmad-output/planning-artifacts/gov-architecture.md
@@ -0,0 +1,120 @@
# Epic GOV — Architecture

**Status**: Planning → ready for implementation
**Date**: 2026-06-18
**Architect**: Winston
**Parent**: [gov-prd.md](./gov-prd.md)

## Guiding principle

Boring technology. We already have a typed key-value `app_settings` table (`settings/service.py::get_setting_value`) and an ordered-role `require_role(min_role)` dependency. The whole epic is a thin resolver + two FastAPI dependencies + one read endpoint + one frontend composable. No new framework, no new table, no migration.

## 1. Flag registry & resolver

New module `backend/src/governance/flags.py`:

```
FEATURE_FLAGS: dict[str, bool] = {
    "packageManagement": True,  # GOV-2: install/uninstall/upgrade/build/rfbrowser-init
}
# Per-op role floor (GOV-4); resolved separately, only consulted when the area is ON.
PACKAGE_OP_ROLE_DEFAULT = Role.EDITOR  # matches today's behavior
```

**Resolution precedence (per flag): ENV > DB > default.**
- ENV: `ROBOSCOPE_FEATURE_<UPPER_SNAKE>` (e.g. `packageManagement` → `ROBOSCOPE_FEATURE_PACKAGE_MANAGEMENT`). Parsed as bool (`1/true/yes/on` → true, `0/false/no/off` → false; anything else ignored). Read once at process start into a frozen dict.
- DB: `app_settings` row, `category="features"`, `key=<flag>`, `value_type="bool"`. Read via `get_setting_value`.
- default: `FEATURE_FLAGS[flag]` (ON).

```
def resolve_flag(db, key) -> ResolvedFlag # {value: bool, locked: bool}
def resolve_all(db) -> dict[str, ResolvedFlag]
```
`locked=True` when the ENV override is set (the UI renders it as "managed by your administrator", non-editable). DB-or-default → `locked=False`.
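
The precedence rules can be sketched as a pure function. This is an illustration under assumptions, not the module itself: `env_name`, `parse_bool` and `resolve_flag_sketch` are hypothetical names, the environment and DB values are passed in as plain mappings rather than read through `get_setting_value`, and matching the boolean tokens case-insensitively is a choice of this sketch.

```python
import re
from dataclasses import dataclass
from typing import Mapping, Optional

FEATURE_FLAGS = {"packageManagement": True}
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


@dataclass(frozen=True)
class ResolvedFlag:
    value: bool
    locked: bool


def env_name(flag: str) -> str:
    """camelCase flag -> ROBOSCOPE_FEATURE_<UPPER_SNAKE>."""
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", flag).upper()
    return f"ROBOSCOPE_FEATURE_{snake}"


def parse_bool(raw: Optional[str]) -> Optional[bool]:
    """Return True/False for recognised tokens, None for anything else."""
    token = (raw or "").strip().lower()
    if token in _TRUE:
        return True
    if token in _FALSE:
        return False
    return None  # unrecognised values are ignored, so lower layers decide


def resolve_flag_sketch(
    env: Mapping[str, str], db_values: Mapping[str, bool], flag: str
) -> ResolvedFlag:
    """ENV > DB > registry default; only an ENV override locks the flag."""
    from_env = parse_bool(env.get(env_name(flag)))
    if from_env is not None:
        return ResolvedFlag(from_env, locked=True)
    if flag in db_values:
        return ResolvedFlag(db_values[flag], locked=False)
    return ResolvedFlag(FEATURE_FLAGS[flag], locked=False)
```

Note that an unparseable ENV value falls through to the DB row and leaves the flag unlocked, which follows from "anything else ignored" above.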

**Caching:** none in v1. `app_settings` is tiny and reads are indexed; a per-request lookup on a mutating env call is negligible. (If a hot path ever needs it, add a TTL cache invalidated in `update_settings` — noted, not built.)

**Seeding:** extend `seed_default_settings` with the `features` category rows so they appear in the Settings UI as editable toggles (value defaults to the registry default). Idempotent — existing installs get the row on next boot, value ON, behavior unchanged.

## 2. Backend enforcement dependency

New `backend/src/governance/dependencies.py`:

```
from fastapi import Depends, HTTPException

# get_db, get_current_user, resolve_flag, resolve_package_op_role, ROLE_RANK and
# User are project-local names; their import paths are not shown here.

def require_feature(flag: str):
    """Dependency factory: 403 when the given feature flag resolves to OFF."""
    def dep(db = Depends(get_db)):
        if not resolve_flag(db, flag).value:
            raise HTTPException(403, detail=f"feature_disabled:{flag}")
    return dep

def require_package_op(op: str):
    """Compose flag gate + configurable role floor for a package operation."""
    def dep(db = Depends(get_db), user = Depends(get_current_user)) -> User:
        # Flag gate first. It is absolute: it applies even to ADMIN.
        if not resolve_flag(db, "packageManagement").value:
            raise HTTPException(403, detail="feature_disabled:packageManagement")
        # Role floor second: settings key features.packageManagement.role.<op>, default EDITOR.
        floor = resolve_package_op_role(db, op)
        if ROLE_RANK[user.role] < ROLE_RANK[floor]:
            raise HTTPException(403, detail="insufficient_role")
        return user
    return dep
```

**Wiring (GOV-2 + GOV-4):** on the mutating env endpoints (`install_package` L529, `upgrade_package` L572, `retry_package_install` L612, `uninstall_package` L694, `docker_build` L264, `rfbrowser-init` L654), replace `Depends(require_role(Role.EDITOR))` with `Depends(require_package_op("<op>"))`. Read endpoints (list/installed/keywords/dockerfile/search/popular) are untouched → stay 200 in locked mode (FR-3). `create`/`clone`/`delete` **environment** are governance-neutral for v1 (the customer concern is *package* mutation on a managed env) — left on `require_role` as today; revisit if needed.

**Order matters:** flag check first (absolute policy), role floor second (permission). A locked deployment 403s before any role consideration.
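
The ordering can be checked in isolation with a framework-free sketch. Everything here is hypothetical scaffolding: `package_op_denial` is an invented helper, and the role ladder is assumed (only EDITOR and ADMIN are named in this document).

```python
from typing import Optional

# Hypothetical role ladder; only EDITOR and ADMIN appear in this document.
ROLE_RANK = {"VIEWER": 0, "RUNNER": 1, "EDITOR": 2, "ADMIN": 3}


def package_op_denial(flag_on: bool, user_role: str, floor: str = "EDITOR") -> Optional[str]:
    """Return the 403 detail string, or None when the operation is allowed."""
    if not flag_on:
        # Policy check first: the role is never consulted when the flag is off.
        return "feature_disabled:packageManagement"
    if ROLE_RANK[user_role] < ROLE_RANK[floor]:
        return "insufficient_role"
    return None
```

With the flag off, an ADMIN and a VIEWER get the same `feature_disabled` answer, which is the property the unit tests in section 6 are meant to pin.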

**Audit (FR-6):** the existing audit middleware already logs every POST/PUT/PATCH/DELETE with user/IP and the response — a 403'd mutation is captured. We add the `feature_disabled` / `insufficient_role` detail string so blocked attempts are greppable; no new audit pipeline.

## 3. Read contract — `GET /config/features`

New tiny router `backend/src/governance/router.py`, mounted on `api_router` → `/api/v1/config/features`:

```
GET /config/features (auth: any logged-in user)
→ { "flags": { "packageManagement": true },
"locked": { "packageManagement": false } } # locked=true ⇒ set via ENV, UI shows non-editable
```

Admin **editing** flags needs no new endpoint — flags are `app_settings` rows, edited through the existing Settings update path (PUT). GOV-1 only adds the seed rows + this read endpoint + the resolver.

## 4. Frontend — `useFeatureFlags()` composable

`frontend/src/composables/useFeatureFlags.ts` — singleton, mirrors `useUserSettings`/`useBypassStatus` pattern:

- **MUST early-return when `localStorage.getItem('access_token')` is null** (CLAUDE.md redirect-loop gotcha) — it's consumed by global layout.
- Fetches `/config/features` once, caches; exposes `isEnabled(flag): boolean` (default true while loading, so nothing flickers hidden) and `isLocked(flag): boolean`.
- Refetch hook after login + after an admin saves settings.

**Consumption (GOV-2 + GOV-3):** `EnvironmentsView` / package components gate every mutating control (`+ Install`, uninstall ✕, upgrade, Docker build, rfbrowser-init) on `isEnabled('packageManagement')`. When disabled: controls are hidden, and a localized banner renders — "Package management is managed by your administrator" (EN/DE/FR/ES). The package list, versions, and Docker image stay visible (read-only). **GOV-3 is this locked-state UX, not a second flag** — read-only environments == `packageManagement` off. (Simplification vs. PRD's separate sub-mode; one flag is cleaner and covers the requirement.)

## 5. File layout

```
backend/src/governance/
__init__.py
flags.py # registry + resolver (env/db/default)
dependencies.py # require_feature, require_package_op
router.py # GET /config/features
schemas.py # FeaturesResponse
backend/tests/governance/
test_flags.py # precedence unit tests
test_package_lockdown.py # every mutating env endpoint → 403 when off; role floor
frontend/src/composables/useFeatureFlags.ts
frontend/src/tests/composables/useFeatureFlags.spec.ts
```

## 6. Test strategy (feeds the E2E story)

- **Unit (backend):** precedence (env beats db beats default; bool parsing; locked flag); `require_package_op` 403 paths (flag off → 403 even for ADMIN; role floor below → 403).
- **Endpoint:** parametrized test hitting every mutating env endpoint with flag OFF → 403, and a read endpoint → 200.
- **Unit (frontend):** composable token-guard (no fetch without token), isEnabled/isLocked.
- **E2E (real UI, Playwright):** boot backend with `ROBOSCOPE_FEATURE_PACKAGE_MANAGEMENT=false`, log in, open Environments → assert no install/uninstall/build controls, banner visible, package list still rendered; assert a direct API mutation returns 403. Second run with flag unset → controls present (default unchanged).

## 7. Risks / decisions

- **Env read timing:** env flags frozen at process start — changing the env var requires a restart (correct for a deployment-level lock; documented).
- **`create/clone/delete environment` left ungoverned in v1** — scoped to package mutation per the customer's actual concern; trivially extendable by adding `require_feature` later.
- **No new migration** — `app_settings` rows are seeded, not schema'd.

## 8. Handoff
→ **Implementation (Amelia):** GOV-1 (flags module + resolver + `/config/features` + seed + `useFeatureFlags`), then GOV-2 (wire `require_package_op` + UI gating + banner i18n), GOV-3 (locked-state UX polish), GOV-4 (role-floor settings + resolution). Then code review + full UI E2E.