Load Testing Mode

This document describes the current (incomplete but runnable) "load testing mode" implementation and how to run it end-to-end manually.

Load testing mode consists of:

  • A fixture generator that builds a single JSON fixture from an existing DB
  • A DB preparation command that creates a fresh load test DB, migrates it, and loads the fixture while suppressing all Django signals
  • A load test settings file that points the app at the load test DB and disables Turnstile blocking
  • A Locust script (locustfile.py) plus a wrapper shell script (load_test.sh) to run a headless load test

The intended lifecycle is:

  1. Generate a fixture from a DB with real-ish data
  2. Create and populate a fresh concordia_lt database from that fixture
  3. Run the web app against concordia_lt using load test settings
  4. Run Locust against that host

The load test database is intended to be single-use.

Files

  • concordia/management/commands/create_load_test_fixtures.py
  • concordia/management/commands/prepare_load_test_db.py
  • concordia/settings_loadtest.py (or your own concordia/settings_loadtest_<name>.py)
  • locustfile.py (repo root)
  • load_test.sh (repo root)

Safety notes

  • create_load_test_fixtures is read-only against the source DB and only writes a JSON file. It is safe to run against production, though it is normally run against a refreshed copy of production.
  • prepare_load_test_db creates and optionally drops a separate database (concordia_lt), runs migrations, and loads fixtures into it.
    • It requires PostgreSQL credentials with CREATE DATABASE privileges.
    • If recreating or dropping, it terminates active connections to the target DB.
  • During fixture load, all Django signals are suppressed to avoid side effects (Celery tasks, storage writes, cache updates, derived fields, etc.).
  • Storage in load test mode is configured to use dev/staging buckets for safety. The workflow is designed to avoid writes to external systems.
  • Locust defaults to a non-production host to reduce risk.

Prerequisites

  • VPN access to the target environment
  • PostgreSQL credentials available via environment variables
    • The DB user must be able to connect to dbname=postgres and create databases
  • Python environment with the normal dev dependencies installed (Locust is a dev dependency)
  • Ability to restart the web app with a different settings module
  • A reachable host running the app in load test mode

Fixture contents

The fixture generated by create_load_test_fixtures contains:

  • Up to 2 published Topics, chosen by ascending ordering
  • Up to 5 published Campaigns, preferring Topic-linked Campaigns and filling with additional published Campaigns by ascending ordering
  • Up to --assets-limit Assets (default 10,000), collected by walking:
    • Topic-linked Projects first, then
    • Campaign-linked Projects if needed
  • Closure of referenced Items, Projects, Campaigns and Topics for the chosen Assets
  • All Transcriptions for selected Assets
  • Anonymized fixtures for any Users referenced by those Transcriptions (user and reviewed_by)
  • A synthetic pool of test users:
    • Default: 10,000 users named locusttest00001..locusttest10000
    • All share the same password: locustpass123
    • Email: <username>@example.test
    • Users are created with explicit PKs beyond the existing fixture user PKs to avoid collisions
  • ProjectTopic rows for selected Topic+Project links (preserves the M2M)
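
The synthetic user pool described above can be sketched as follows. This is a minimal illustration, not the code in create_load_test_fixtures: the helper name build_test_users, the "auth.user" model label, and the placeholder password value are assumptions. A real fixture would need a hashed password (for example one produced by Django's make_password).

```python
def build_test_users(existing_pks, count=10000, prefix="locusttest"):
    """Return fixture-style dicts for synthetic users (hypothetical helper).

    PKs start just above the highest PK already present in the fixture,
    so the synthetic users cannot collide with anonymized real users.
    """
    start_pk = max(existing_pks, default=0) + 1
    users = []
    for offset in range(count):
        # Zero-padded to five digits: locusttest00001 .. locusttest10000
        username = f"{prefix}{offset + 1:05d}"
        users.append(
            {
                "model": "auth.user",  # assumed model label
                "pk": start_pk + offset,
                "fields": {
                    "username": username,
                    "email": f"{username}@example.test",
                    # Placeholder only; a loadable fixture needs a real hash.
                    "password": "<hashed locustpass123>",
                },
            }
        )
    return users


pool = build_test_users(existing_pks={3, 7, 42}, count=3)
print([(u["pk"], u["fields"]["username"]) for u in pool])
# [(43, 'locusttest00001'), (44, 'locusttest00002'), (45, 'locusttest00003')]
```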

Notes:

  • Selection is best-effort. If there are fewer than --assets-limit Assets, the fixture is still written.
  • The output is a single JSON file (default loadtest_fixture.json).
  • Unless --no-validate is provided, the command validates the fixture by calling prepare_load_test_db.

Commands

1) Create the fixture

Run against a DB with real data (usually a refreshed prod copy):

python manage.py create_load_test_fixtures

Common options:

python manage.py create_load_test_fixtures \
  --assets-limit 10000 \
  --test-users 10000 \
  --test-user-prefix locusttest \
  --test-user-password locustpass123 \
  --output loadtest_fixture.json

Validation options:

  • --no-validate to skip validation
  • --validate-db-name NAME to override the validation DB name
  • --validate-recreate to recreate the validation DB if it exists
  • --validate-drop to drop the validation DB after loading

2) Create and populate the load test DB

Standard DB name: concordia_lt

python manage.py prepare_load_test_db \
  --db-name concordia_lt \
  --recreate \
  --fixtures loadtest_fixture.json

Behavior:

  • Creates or recreates concordia_lt
  • Runs migrations
  • Loads fixtures with all signals suppressed by default
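
One way to suppress signals during a fixture load is to temporarily replace the dispatch methods on the signal class. The sketch below is an assumption about the general technique, not a copy of what prepare_load_test_db does. The name suppress_signals is hypothetical, and DemoSignal is a stand-in so the example runs without Django. In a Django process you would pass django.dispatch.Signal instead; its async variants (asend, asend_robust) are not covered here.

```python
from contextlib import contextmanager


@contextmanager
def suppress_signals(signal_cls, method_names=("send", "send_robust")):
    """Temporarily turn the given dispatch methods into no-ops."""
    originals = {name: getattr(signal_cls, name) for name in method_names}

    def _noop(self, sender, **named):
        # Same return shape as a signal with no receivers.
        return []

    try:
        for name in method_names:
            setattr(signal_cls, name, _noop)
        yield
    finally:
        # Always restore, even if the fixture load raises.
        for name, original in originals.items():
            setattr(signal_cls, name, original)


class DemoSignal:
    """Stand-in with the same send/send_robust shape as a Django Signal."""

    def __init__(self):
        self.delivered = 0

    def send(self, sender, **named):
        self.delivered += 1
        return [("receiver", None)]

    def send_robust(self, sender, **named):
        return self.send(sender, **named)


post_save = DemoSignal()
with suppress_signals(DemoSignal):
    post_save.send(sender=object)  # swallowed, no receiver runs
print(post_save.delivered)  # 0
```

Patching at the class level silences every signal instance at once, which matches the "all signals" behavior described above, at the cost of being process-wide for the duration of the block.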

Running the app in load test mode

Settings file

concordia/settings_loadtest.py is an override layer on top of settings_template.py. It:

  • Points the DB at concordia_lt
  • Disables rate limiting
  • Forces Turnstile to always-pass test keys by default
  • Uses console email backend
  • Uses dev buckets for safety
  • Adjusts logging to be visible in common run contexts

If you need a different DB name, do not edit settings_loadtest.py directly. Create a personal settings file, following the local dev convention:

  • concordia/settings_loadtest_<username>.py
  • Override DATABASES["default"]["NAME"] (and any other local overrides)
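
A personal settings file along these lines might look like the fragment below. It is a sketch that assumes settings_loadtest.py defines DATABASES["default"]; the file name suffix and the database name are placeholders, not values from the repository.

```python
# concordia/settings_loadtest_<username>.py (placeholder file name)
# Start from the shared load test settings, then override locally.
from .settings_loadtest import *  # noqa: F401,F403

# Hypothetical personal DB name; use whatever you passed to --db-name.
DATABASES["default"]["NAME"] = "concordia_lt_example"  # noqa: F405
```

You would then select it with DJANGO_SETTINGS_MODULE set to the dotted path of your personal module.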

Selecting settings at runtime

Local example:

DJANGO_SETTINGS_MODULE=concordia.settings_loadtest \
  python manage.py runserver 0.0.0.0:8000

Server/container example:

  • Set DJANGO_SETTINGS_MODULE=concordia.settings_loadtest
  • Restart/redeploy the web process so it actually uses the load test settings

Important:

  • Creating concordia_lt does not affect any running web process. You must restart the app with the load test settings selected.

Locust

Overview

The load test simulates three flows:

  • Anonymous browsing/transcription page interactions
  • Authenticated users who transcribe
  • Authenticated users who review

The script uses these endpoints:

  • / (homepage)
  • /next-transcribable-asset/ (redirect to next asset)
  • /next-reviewable-asset/ (redirect to next reviewable asset)
  • /account/login/ (login)
  • /account/ajax-status/ and /account/ajax-messages/ (simulate the background requests of a normal page load)

The script parses asset pages to find:

  • The transcription form action (<form id="transcription-editor" ...>)
  • Reservation endpoint (<script id="asset-reservation-data" data-reserve-asset-url="...">)
  • Review endpoints (data-review-url, data-submit-url)

If parsing fails, the failure is treated as a fundamental mismatch between the Locust script and the UI.
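
The page parsing described above could be done with the standard library alone. The following is a sketch, not the code in locustfile.py: the class and function names are hypothetical, the sample URLs are invented, and it assumes the data-review-url and data-submit-url attributes may appear on any element.

```python
from html.parser import HTMLParser


class AssetPageParser(HTMLParser):
    """Collect the URLs a load test needs from an asset page."""

    def __init__(self):
        super().__init__()
        self.found = {
            "form_action": None,
            "reserve_url": None,
            "review_url": None,
            "submit_url": None,
        }

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form" and attr.get("id") == "transcription-editor":
            self.found["form_action"] = attr.get("action")
        if tag == "script" and attr.get("id") == "asset-reservation-data":
            self.found["reserve_url"] = attr.get("data-reserve-asset-url")
        # Assumption: review attributes can sit on any element.
        if "data-review-url" in attr:
            self.found["review_url"] = attr["data-review-url"]
        if "data-submit-url" in attr:
            self.found["submit_url"] = attr["data-submit-url"]


def parse_asset_page(html):
    """Return a dict of discovered URLs; missing ones stay None."""
    parser = AssetPageParser()
    parser.feed(html)
    return parser.found


sample = (
    '<form id="transcription-editor" action="/example/save/" '
    'data-review-url="/example/review/" data-submit-url="/example/submit/">'
    "</form>"
    '<script id="asset-reservation-data" '
    'data-reserve-asset-url="/example/reserve/"></script>'
)
print(parse_asset_page(sample)["form_action"])  # /example/save/
```

A caller can treat a None form_action as the parsing failure mentioned above and fail the request loudly rather than continue with a guessed URL.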

"No work" abort behavior

The Locust run aborts the entire test if it determines there is no work available. "No work" is defined as either:

  • A /next-* redirect eventually landing on / (homepage), or
  • An asset page not containing the transcription form

This is controlled by:

  • ABORT_WHEN_NO_WORK = True (default)
  • NO_WORK_DUMP_HTML = False (set True to dump a debug HTML file on abort)

The abort is coordinated across master/workers in distributed mode via a custom message (global-abort). Locust is forced to exit with a non-zero exit code.
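
The two "no work" conditions can be expressed as a small predicate. This is a simplified sketch: the function names are hypothetical, the form check is a plain substring test rather than real HTML parsing, and the cross-worker global-abort messaging is left out.

```python
from urllib.parse import urlparse

ABORT_WHEN_NO_WORK = True  # mirrors the default described above


def is_no_work(final_url, html):
    """Return True if a /next-* request ended without usable work."""
    # Condition 1: the redirect chain ended on the homepage.
    path = urlparse(final_url).path or "/"
    if path == "/":
        return True
    # Condition 2: the asset page has no transcription form.
    return 'id="transcription-editor"' not in html


def should_abort(final_url, html):
    """Combine the predicate with the ABORT_WHEN_NO_WORK switch."""
    return ABORT_WHEN_NO_WORK and is_no_work(final_url, html)


print(should_abort("https://loadtest-host.example/", ""))  # True
```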

load_test.sh

load_test.sh runs Locust in headless mode with defaults that can be overridden via environment variables.

Defaults:

  • Users: 100
  • Spawn rate: 2
  • Run time: 1m30s
  • Host: https://crowd-dev.loc.gov

Override example:

LOCUST_USERS=500 \
LOCUST_SPAWN_RATE=10 \
LOCUST_RUN_TIME=10m \
LOCUST_HOST=https://your-loadtest-host.example \
./load_test.sh
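
The override behavior amounts to reading each variable with a fallback to its default. The sketch below rebuilds an equivalent headless Locust command in Python. It is an approximation of what a wrapper like load_test.sh might run, not a transcription of it: the function name is hypothetical, and the exact flags the shell script passes are an assumption (the flags themselves are standard Locust options).

```python
import os

# Defaults as listed above.
DEFAULTS = {
    "LOCUST_USERS": "100",
    "LOCUST_SPAWN_RATE": "2",
    "LOCUST_RUN_TIME": "1m30s",
    "LOCUST_HOST": "https://crowd-dev.loc.gov",
}


def build_locust_command(env=None):
    """Return the argv list for a headless Locust run."""
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return [
        "locust",
        "-f", "locustfile.py",
        "--headless",
        "--users", cfg["LOCUST_USERS"],
        "--spawn-rate", cfg["LOCUST_SPAWN_RATE"],
        "--run-time", cfg["LOCUST_RUN_TIME"],
        "--host", cfg["LOCUST_HOST"],
    ]


print(" ".join(build_locust_command({"LOCUST_USERS": "500"})))
```

Passing an explicit env dict keeps the function easy to check without touching the real process environment.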

End-to-end manual runbook

This is the current manual process. Nothing here is automated end-to-end yet.

  1. Choose the environment to test
  • Typically your personal environment, or dev or staging, prepared from a refreshed DB copy of production
  2. Generate a fixture
python manage.py create_load_test_fixtures \
  --output loadtest_fixture.json

If you want a smaller dataset for quicker iteration, lower --assets-limit and/or --test-users.

  3. Create and populate the load test DB
python manage.py prepare_load_test_db \
  --db-name concordia_lt \
  --recreate \
  --fixtures loadtest_fixture.json
  4. Switch the web app to load test settings and restart it
  • Set DJANGO_SETTINGS_MODULE=concordia.settings_loadtest
  • Restart/redeploy the web process so it uses:
    • DATABASES["default"]["NAME"] = "concordia_lt"
    • Turnstile always-pass test keys

Sanity checks:

  • Visit the site and confirm pages load without Turnstile blocking.
  • Attempt login with a known test user:
    • Username: locusttest00001
    • Password: locustpass123
  5. Run Locust
./load_test.sh

Tune parameters if needed:

LOCUST_USERS=200 LOCUST_SPAWN_RATE=5 LOCUST_RUN_TIME=5m ./load_test.sh
  6. Common failure modes
  • Immediate login failures:
    • App not pointing at concordia_lt
    • Fixture not loaded or test users missing
    • Turnstile not disabled for load test mode
  • Global abort "no work":
    • /next-* redirects to / because there is no eligible work
    • This is most likely caused by running the script multiple times without recreating the load test DB
  • Lots of 403s:
    • Turnstile still active
    • CSRF issues (the script attempts to seed and send CSRF tokens correctly)
  7. Cleanup

There is no automated cleanup step. The DB is intended to be thrown away or recreated for each run.

To recreate on the next run, rerun step (3) with --recreate.

Known gaps / Next development priorities

  • No single "one command" workflow; all steps are manual.
  • No automated mechanism to build and deploy a load-test-mode container in AWS.
  • No automated environment switching between normal and load test settings.
  • No automated teardown of the load test DB after a run.