Browser E2E Tests (Playwright)

This guide explains how to run the browser-based end-to-end test suite at tests/playwright/ — locally during development and in CI — and how to write new tests against the shared fixture surface.

tests/playwright/conftest.py is the source of truth for fixture behaviour. This doc paraphrases it for discoverability; if the two ever disagree, the conftest wins.

Overview

The suite uses pytest-playwright to drive a real Chromium / Firefox / WebKit browser against an in-process FastAPI server. It is intentionally isolated from the default pytest run for two reasons:

1. Coverage instrumentation. Browser processes do not play nicely with pytest --cov's subprocess instrumentation: coverage measurement interferes with browser-process isolation, so the suite must be invoked with --override-ini='addopts=', which strips the entire project-wide addopts list (not only the --cov* flags). See "Running locally" below for the full implications and the flags you must re-add.
2. No Playwright dependency on the default path. tests/conftest.py gates collection on a try: import playwright. If Playwright is not installed (the default for most contributors), the directory is silently added to collect_ignore_glob and skipped. The dedicated CI job (which does install Playwright) collects the directory and runs the suite.

Current coverage (see tests/playwright/):

test_smoke.py — page-load smoke across every UI route
test_setup_wizard_flow.py — first-run setup wizard happy path
test_file_browser_desktop.py — desktop-mode file browser
test_desktop_api_contract.py — window.pywebview.api contract via the mock fixture

Running locally

Prerequisites

pip install -e ".[dev]"          # provides pytest-playwright
playwright install chromium      # or: firefox, webkit

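pytest-playwright is part of the [dev] extra (see pyproject.toml); the playwright install command downloads the actual browser binaries under ~/.cache/ms-playwright/ on Linux (~/Library/Caches/ms-playwright on macOS, %USERPROFILE%\AppData\Local\ms-playwright on Windows). You need to run it once per machine per browser, and again after upgrading Playwright, because each Playwright release pins its own browser builds.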

The canonical command

pytest tests/playwright/ \
    --browser chromium \
    --override-ini='addopts=' \
    --strict-markers \
    --timeout=60

--browser chromium — pick the browser. firefox and webkit are the other valid values. CI runs all three in parallel (see below); locally you usually only need one.
--override-ini='addopts=' — required. pyproject.toml's project-wide addopts includes --cov / --cov-fail-under (which break browser-process isolation), --strict-markers, and --timeout=30. Stripping addopts clears all of those flags for this run.
--strict-markers — re-adds strict-marker enforcement so pytest still fails if any test uses an unregistered marker (stripped from addopts above).
--timeout=60 — re-adds a per-test timeout appropriate for browser tests. The project-wide 30 s is too tight for browser startup; 60 s matches what CI uses (stripped from addopts above).
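If you would rather not retype the flags, they can be assembled in a small launcher. The following is only a sketch: playwright_pytest_args is a hypothetical helper that does not exist in the repository, and it does nothing beyond building the argument list documented above.

```python
from __future__ import annotations

VALID_BROWSERS = ("chromium", "firefox", "webkit")


def playwright_pytest_args(
    browser: str = "chromium", target: str = "tests/playwright/"
) -> list[str]:
    """Build the pytest argv for the Playwright suite (hypothetical helper)."""
    if browser not in VALID_BROWSERS:
        raise ValueError(f"unknown browser {browser!r}; expected one of {VALID_BROWSERS}")
    return [
        "pytest",
        target,
        "--browser",
        browser,
        # Clears the project-wide addopts (coverage, strict markers, timeout).
        "--override-ini=addopts=",
        # Re-add the two flags that clearing addopts removed.
        "--strict-markers",
        "--timeout=60",
    ]
```

Passing the returned list to subprocess.run() is equivalent to typing the canonical command; the single argv element --override-ini=addopts= is what the shell produces from --override-ini='addopts='.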

Interactive debugging

# Run headed so you can watch the browser
pytest tests/playwright/ --browser chromium --headed --override-ini='addopts='

# Slow every action so you can see what is happening
pytest tests/playwright/ --browser chromium --headed --slowmo=500 --override-ini='addopts='

# Drop into the Playwright Inspector on the first action
PWDEBUG=1 pytest tests/playwright/ --browser chromium --override-ini='addopts='

Trace viewer

When a test fails (or when you want to inspect a passing run), record a trace and open it in Playwright's trace viewer:

pytest tests/playwright/test_smoke.py \
    --browser chromium \
    --override-ini='addopts=' \
    --strict-markers \
    --timeout=60 \
    --tracing=retain-on-failure \
    --output=playwright-artifacts

# After a failure:
playwright show-trace playwright-artifacts/<test-name>/trace.zip

The trace viewer shows every browser action with a before/after DOM snapshot, network activity, and console output. It is by far the fastest way to diagnose a broken UI test — use it before resorting to print() or page.pause().

Upstream Playwright docs cover the trace viewer in detail: https://playwright.dev/python/docs/trace-viewer.

Running in CI

The Playwright job is defined in .github/workflows/ci.yml under the playwright: job (currently around line 243, added in commit 6711bec0 — "ci: add Playwright E2E job to PR/push CI"). It runs on every pull_request and push.

Matrix

Browser	Runner	fail-fast
chromium	ubuntu-latest	no
firefox	ubuntu-latest	no
webkit	ubuntu-latest	no

All three run in parallel. fail-fast: false is deliberate: a Firefox regression should not cancel the in-progress WebKit leg, because the review value of seeing all three results outweighs the runner-minute cost.

What the job does (condensed)

1. pip install -e ".[dev,search]", which installs pytest-playwright and the standard test deps.
2. pip install "pytest-rerunfailures>=14.0", installed inline (not in [dev]) because only this job uses it for flake tolerance.
3. Cache ~/.cache/ms-playwright keyed on pyproject.toml hash + browser name.
4. python -m playwright install --with-deps ${browser}, which downloads the browser binary and pulls the system libs a fresh Ubuntu runner needs.
5. pytest tests/playwright/ --browser ${{ matrix.browser }} --tracing retain-on-failure --screenshot only-on-failure --video retain-on-failure --output=playwright-artifacts --reruns 2 --reruns-delay 2 --timeout=60 --strict-markers --override-ini="addopts="

Debugging a CI failure

When a Playwright leg fails on a PR:

1. Open the failing GitHub Actions run.
2. Go to the Artifacts panel (bottom of the run summary page).
3. Download playwright-artifacts-<browser>. The browser-specific name prevents matrix legs from clobbering each other.
4. Unzip locally. You will find trace files, screenshots, and videos for every failed test. Passing tests leave nothing behind, because retention is retain-on-failure / only-on-failure to keep artifact size sane.
5. Open the trace:

playwright show-trace <unzipped>/<test-name>/trace.zip

Retention is 7 days. If you need the artifact longer, download and stash it locally.
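When several tests fail in one leg, it can be quicker to list every trace in the unzipped artifact than to browse for them. A minimal sketch, assuming the <unzipped>/<test-name>/trace.zip layout described above; find_traces and show_trace_commands are hypothetical helpers, not part of the repository.

```python
from __future__ import annotations

from pathlib import Path


def find_traces(artifact_root: Path) -> list[Path]:
    """Return every trace.zip below an unzipped artifact directory, sorted."""
    return sorted(artifact_root.rglob("trace.zip"))


def show_trace_commands(artifact_root: Path) -> list[str]:
    """Build one 'playwright show-trace' command line per trace found."""
    return [f"playwright show-trace {trace}" for trace in find_traces(artifact_root)]
```

Printing the result of show_trace_commands(Path("playwright-artifacts-chromium")) gives one ready-to-paste command per failed test.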

Flake tolerance

--reruns 2 --reruns-delay 2 retries each failing test up to twice with a 2-second delay, via pytest-rerunfailures. This absorbs transient CI-side flakes (browser launch races, network hiccups on the GitHub-hosted runner) without hiding real regressions — a genuinely broken test still fails on the third attempt. If a test starts needing more than 2 reruns, treat it as broken and fix the root cause rather than bumping the retry count.
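The attempt arithmetic can be sketched as follows. This illustrates the semantics only (one initial run plus up to two reruns); it is not how pytest-rerunfailures is implemented, and run_with_reruns is a hypothetical name.

```python
from __future__ import annotations

import time
from typing import Callable


def run_with_reruns(test: Callable[[], None], reruns: int = 2, delay: float = 2.0) -> int:
    """Run `test`, retrying on AssertionError; return the number of attempts used.

    The final failure is re-raised once the initial run and all reruns fail.
    """
    max_attempts = reruns + 1  # the initial run plus `reruns` retries
    for attempt in range(1, max_attempts + 1):
        try:
            test()
            return attempt
        except AssertionError:
            if attempt == max_attempts:
                raise  # a genuinely broken test still fails
            time.sleep(delay)
    raise ValueError("reruns must be >= 0")
```

With the defaults, a test that fails twice and then passes is reported as passing on attempt 3, while a test that fails three times in a row is reported as a failure.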

Fixture contracts

All fixtures live in tests/playwright/conftest.py. Scopes and line numbers below are current as of the most recent edit to this doc — if you are about to write a test and any of this looks stale, re-read the conftest.

`playwright_config_dir` (session-scoped)

Defined at tests/playwright/conftest.py:117.

Returns a per-session temporary directory that is used as the FastAPI server's XDG_CONFIG_HOME. This isolates the server's ConfigManager from your real ~/.config/file-organizer so tests cannot pollute your dev environment (and vice versa).

When test authors care about it: any test that needs to reset or mutate persistent config mid-session must write to or delete files under playwright_config_dir / "file-organizer", not the real home directory. The canonical reset pattern is in tests/playwright/test_smoke.py — it deletes <playwright_config_dir>/file-organizer/config.yaml immediately before navigation so ConfigManager.load() returns AppConfig defaults (setup_completed=False), making the test order-independent.
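The reset itself is a single file deletion. A minimal sketch, assuming config_home is the value of playwright_config_dir; reset_persisted_config is a hypothetical helper name, and tests/playwright/test_smoke.py remains the authoritative version of the pattern.

```python
from pathlib import Path


def reset_persisted_config(config_home: Path) -> Path:
    """Delete file-organizer/config.yaml under config_home, if it exists."""
    config_file = config_home / "file-organizer" / "config.yaml"
    # missing_ok=True makes the reset idempotent: it is safe to call
    # whether or not an earlier test wrote a config file.
    config_file.unlink(missing_ok=True)
    return config_file
```

Calling this at the top of a test (before page.goto()) means the next ConfigManager.load() finds no file, regardless of which tests ran earlier.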

If you do not touch persistent state, you can ignore this fixture — it is wired into live_server_url already and does its job transparently.

`live_server_url` (session-scoped)

Defined at tests/playwright/conftest.py:127.

Starts the FastAPI app in-process on a random free port in a daemon thread and yields a base URL like http://127.0.0.1:54321.

What it does under the hood:

1. Pulls playwright_config_dir to get an isolated config home.
2. Sets XDG_CONFIG_HOME and monkeypatches file_organizer.config.manager.DEFAULT_CONFIG_DIR before importing the API modules, so the module-level constant captures the tmp dir instead of the user's real config.
3. Constructs ApiSettings(auth_enabled=False, allowed_paths=[tmp], auth_db_path=<tmp>/auth.db).
4. Builds the app via create_app(settings) and runs it in a daemon uvicorn.Server thread.
5. Waits up to 20 seconds for the port to accept TCP connections (_wait_for_port). If the server never comes up, raises a RuntimeError that includes any exception raised from the server thread, so you get a real stack trace instead of a silent timeout.
6. On teardown, sets server.should_exit = True, joins the thread, and restores environment variables. All of this is wrapped in try/finally so cleanup runs even if startup fails before yield.
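The overall shape of these steps can be sketched with the standard library alone. This is not the conftest code: http.server stands in for create_app(settings) plus uvicorn.Server, live_server and _OkHandler are hypothetical names, the OS picks the port at bind time instead of via _find_free_port, and the capture of server-thread exceptions is omitted.

```python
from __future__ import annotations

import contextlib
import http.server
import os
import socket
import threading
import time
from collections.abc import Iterator


class _OkHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in app: answers every GET with 200 'ok'."""

    def do_GET(self) -> None:
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep test output quiet


def _wait_for_port(host: str, port: int, timeout: float = 20.0) -> None:
    """Poll until host:port accepts a TCP connection, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return
        except OSError:
            time.sleep(0.05)
    raise RuntimeError(f"server on {host}:{port} did not start within {timeout}s")


@contextlib.contextmanager
def live_server(config_home: str) -> Iterator[str]:
    """Serve from a daemon thread on a free port and yield the base URL."""
    previous = os.environ.get("XDG_CONFIG_HOME")
    os.environ["XDG_CONFIG_HOME"] = config_home
    server = None
    thread = None
    try:
        # Binding port 0 lets the OS choose a free port.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), _OkHandler)
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        _wait_for_port("127.0.0.1", port)
        yield f"http://127.0.0.1:{port}"
    finally:
        # Runs even if startup failed before the yield.
        if thread is not None:
            server.shutdown()  # stand-in for server.should_exit = True
            thread.join(timeout=5)
        if server is not None:
            server.server_close()
        if previous is None:
            os.environ.pop("XDG_CONFIG_HOME", None)
        else:
            os.environ["XDG_CONFIG_HOME"] = previous
```

In a real fixture the body of the with block is the whole test session, which is why the environment restore has to live in the finally clause rather than after the yield.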

How to override settings: live_server_url is session-scoped, so you cannot override its ApiSettings per-test. That is deliberate: per-test overrides would force a server restart for every test and sharply increase wall-clock time. If you need an authenticated variant for a test, epic B2 will add a separate session fixture that builds the app with auth_enabled=True; this doc will be updated when that lands.

`base_url` (session-scoped, overrides `pytest-playwright`)

Defined at tests/playwright/conftest.py:243.

Returns live_server_url as the default URL Playwright resolves relative paths against. With this fixture in place you can write:

page.goto("/ui/files")

…and Playwright rewrites it to http://127.0.0.1:<port>/ui/files. Without the override, pytest-playwright's built-in base_url reads from the --base-url CLI flag, which this project does not pass.
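The resolution rule is ordinary URL joining. As an illustration only, Python's urljoin stands in here for Playwright's own resolution, and the port is the example value used earlier in this doc:

```python
from urllib.parse import urljoin


def resolve(base_url: str, path: str) -> str:
    """Resolve a page.goto() argument against base_url (illustration only)."""
    return urljoin(base_url, path)


base = "http://127.0.0.1:54321"
print(resolve(base, "/ui/files"))  # http://127.0.0.1:54321/ui/files
# An absolute URL is left untouched, so tests can still leave the app.
print(resolve(base, "http://example.com/x"))  # http://example.com/x
```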

`pywebview_mock` (function-scoped)

Defined at tests/playwright/conftest.py:338.

Injects a stub window.pywebview.api into the page via page.add_init_script(). Because an init script is evaluated in every new document before any of the page's own scripts run, the mock is always present when desktop_api.js runs its if (window.pywebview) feature detection, which sets document.body.dataset.desktopApp = "1" and so enables any elements decorated with [data-desktop-only].

The fixture returns a PywebviewMockHandle (defined at tests/playwright/conftest.py:278) with the following methods (see the conftest for full docstrings):

Method	Purpose
`set_browse_directory_result(p)`	Override what `window.pywebview.api.browse_directory()` resolves to.
`set_browse_file_result(p)`	Override the `browse_file()` return value.
`set_save_file_result(p)`	Override the `save_file()` return value.
`set_open_path_result(bool)`	Override the `open_path()` return value (success / failure).
`get_open_path_calls()`	Return the ordered list of paths `open_path()` was called with.

Caveat: mock state lives in window.__mockPyw. Because add_init_script re-runs on every page load, the state resets on every navigation — if you need to assert "the page called open_path('/foo')" you must read get_open_path_calls() before the next navigation.

Adding a new test

1. File placement. New files go under tests/playwright/ and must be named test_<feature>.py.
2. Markers. Apply both e2e and playwright at module level via pytestmark rather than as per-function decorators; the existing suite uses the list form consistently.
   - The e2e marker keeps the suite visible to developers running pytest -m "integration or e2e" and keeps docs/developer/testing.md's marker taxonomy consistent. Omitting it excludes the test from those selectors.
   - The playwright marker enables pytest -m playwright as a convenience selector and satisfies the project-wide --strict-markers flag (which makes pytest fail collection if any test uses an unregistered marker). Omitting it means pytest -m playwright deselects the test.
   Note: tests/conftest.py's collect_ignore_glob gate is import-based (it skips the directory when import playwright raises ImportError), not marker-based. CI selects the suite by directory (pytest tests/playwright/), not by marker selector. Neither mechanism is affected by which markers your test carries.
3. Fixtures. Take page (from pytest-playwright) and live_server_url (or any of the other session fixtures above). Because the base_url fixture is wired in, you can use relative paths in page.goto().
4. Skeleton. Copy the module-level marker block from tests/playwright/test_smoke.py and add one test:

import pytest
from playwright.sync_api import Page

pytestmark = [
    pytest.mark.e2e,
    pytest.mark.playwright,
    pytest.mark.timeout(60),  # browser ops need more headroom than unit tests
]


def test_my_new_page(page: Page, live_server_url: str) -> None:
    page.goto("/ui/files")
    assert page.locator("h1").is_visible()

Run it locally using the canonical command from "Running locally" above, targeting just your new file:

pytest tests/playwright/test_my_feature.py \
    --browser chromium \
    --override-ini='addopts=' \
    --strict-markers \
    --timeout=60

tests/playwright/test_smoke.py is the canonical template — if you are unsure about style or structure, mirror it.

Gotchas

`collect_ignore_glob` is gated on the Playwright import

tests/conftest.py adds playwright/** to collect_ignore_glob only when import playwright raises ImportError. This means:

Default developer environment (Playwright not installed): pytest tests/ silently skips the directory. Surprising the first time you hit it, but intentional — the suite cannot run without a browser anyway.
Dedicated CI job (Playwright installed): the directory is collected, and your new tests run on every PR.
Developer machine with Playwright installed: the directory is collected on every pytest tests/ run. If you were not expecting that and your suite is slow, run with --ignore=tests/playwright explicitly.
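The gate amounts to a few lines in a conftest.py. A minimal sketch of the idea, not a copy of tests/conftest.py: ignore_globs_for is a hypothetical helper introduced so the behaviour can be exercised for any module name, while collect_ignore_glob is the module-level name pytest reads during collection.

```python
from __future__ import annotations

import importlib


def ignore_globs_for(module_name: str = "playwright") -> list[str]:
    """Return the globs to skip when module_name cannot be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Not installed: hide the browser suite from collection.
        return ["playwright/**"]
    return []


# Empty when Playwright is importable, ["playwright/**"] otherwise.
collect_ignore_glob = ignore_globs_for()
```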

State leakage between tests

The live_server_url fixture is session-scoped, so every Playwright test in a run shares the same FastAPI process and the same playwright_config_dir. A test that flips setup_completed=True (for example by completing the setup wizard) will break sibling tests under random ordering unless it resets the state.

The canonical reset pattern, from test_smoke.py: delete <playwright_config_dir>/file-organizer/config.yaml before the navigation under test, so ConfigManager.load() returns AppConfig() defaults. Do this in a fixture or at the top of the test — never in teardown, because another test may run first in random order.

Random-port TOCTOU window

_find_free_port binds port 0, reads the assigned port, and releases the socket — then uvicorn re-binds the same port. There is a small window where another process could steal the port. This is negligible on developer machines but occasionally trips on heavily-loaded CI runners. The CI job's --reruns 2 covers it.

If local runs start failing with OSError: ... Address already in use during fixture setup, simply re-run — do not bump _wait_for_port's timeout.
