AutoPTZ

AI-driven PTZ camera tracking — detect people, lock onto a target, and move the camera to follow them automatically.

Installation · Configuration · Performance · Building · Architecture · Troubleshooting


AutoPTZ is a cross-platform desktop app (native Qt Widgets / PySide6) that runs a real-time vision pipeline per camera — detect → track → re-identify → pose → aim → drive PTZ — and sends smooth pan/tilt/zoom commands so a PTZ camera keeps the chosen person framed. It is built for live production: multi-camera, stable target identity across occlusions, and graceful degradation when a model or device is missing (it always keeps live preview).

Highlights

  • Multi-camera — each camera runs its own worker; identities stay stable per camera with no cross-camera state bugs.
  • Identity-gated tracking — click a person to target them; optional face recognition + appearance ReID re-bind the right person after occlusions.
  • Smooth PTZ control — motion prediction, one-euro smoothing, PD + velocity feed-forward, an adjustable framing "safe zone", auto-zoom, and loss recovery.
  • Runs anywhere, fast — ONNX Runtime picks the best accelerator per platform (Apple CoreML, NVIDIA TensorRT/CUDA, Windows DirectML, Intel OpenVINO, CPU) with per-EP tuning (FP16, persistent TensorRT engine cache, full graph optimization). See Performance.
  • PTZ backends — VISCA over USB, VISCA over IP, ONVIF, and NDI.
  • In-app updates — downloads the matching GitHub Release asset for your OS and starts the installer/new AppImage.
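
The smoothing and control terms named in the PTZ bullet can be illustrated with a minimal sketch. It is not taken from the AutoPTZ source: the class and function names, the gains, and the convention that framing error is normalized to roughly -1..1 of the frame are all assumptions made for illustration. It shows a one-euro filter on the target position and a PD law with velocity feed-forward, a safe-zone dead band, and a speed clamp for one axis.

```python
import math
from typing import Optional


def smoothing_factor(cutoff_hz: float, dt: float) -> float:
    """Blend weight of a first-order low-pass filter with the given cutoff."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)
    return 1.0 / (1.0 + tau / dt)


class OneEuroFilter:
    """One-euro filter: heavy smoothing when slow, light smoothing when fast."""

    def __init__(self, min_cutoff: float = 1.0, beta: float = 0.05,
                 d_cutoff: float = 1.0) -> None:
        self.min_cutoff = min_cutoff  # Hz, baseline cutoff at rest
        self.beta = beta              # how fast the cutoff grows with speed
        self.d_cutoff = d_cutoff      # Hz, cutoff for the velocity estimate
        self.x: Optional[float] = None
        self.dx = 0.0

    def update(self, x: float, dt: float) -> float:
        if self.x is None:            # first sample: nothing to smooth against
            self.x = x
            return x
        a_d = smoothing_factor(self.d_cutoff, dt)
        self.dx = a_d * ((x - self.x) / dt) + (1.0 - a_d) * self.dx
        a = smoothing_factor(self.min_cutoff + self.beta * abs(self.dx), dt)
        self.x = a * x + (1.0 - a) * self.x
        return self.x


def axis_speed(error: float, prev_error: float, target_velocity: float,
               dt: float, kp: float = 1.5, kd: float = 0.2, kff: float = 1.0,
               safe_zone: float = 0.05, max_speed: float = 1.0) -> float:
    """PD on framing error plus feed-forward on target velocity, clamped."""
    if abs(error) <= safe_zone:       # target is framed well enough: hold still
        return 0.0
    derivative = (error - prev_error) / dt
    command = kp * error + kd * derivative + kff * target_velocity
    return max(-max_speed, min(max_speed, command))
```

In this sketch the filter would run on the detected target position each frame, and `axis_speed` would be called once for pan and once for tilt. The feed-forward term lets the camera keep pace with a steadily moving subject instead of waiting for error to build up. The real tuning knobs are described in the Configuration doc.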

Quick start (from source)

Requires Python 3.12+.

git clone https://github.com/AutoPTZ/autoptz
cd autoptz

# Create a venv at the PROJECT ROOT (not inside autoptz/)
python3.12 -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

# Full stack (detection + tracking + UI), editable source checkout:
python tools/install.py --editable

python -m autoptz                # launch the app
python -m autoptz --selftest     # verify the foundations and exit

The first launch downloads the detector model (YOLO11) into the platform app-data dir; without it the app still runs in live-preview-only mode.

Picking your accelerator

tools/install.py detects the OS and GPU and prints every pip command before running it. Use --dry-run to review the plan first. A static requirements file cannot inspect CUDA/TensorRT, which is why the installer script does the detection. You can also bypass detection and force the ONNX Runtime wheel explicitly with --accelerator:

python tools/install.py --dry-run
python tools/install.py --accelerator nvidia --editable
python tools/install.py --accelerator openvino --editable
python tools/install.py --accelerator cpu --editable

Only one onnxruntime* wheel can be installed at a time — see Performance.
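
To check which execution providers the installed wheel actually exposes, a short script along these lines may help. The provider names and `onnxruntime.get_available_providers()` are part of ONNX Runtime. The preference order and the `order_providers` helper are assumptions for illustration, not AutoPTZ's actual selection policy.

```python
# Assumed preference order for illustration only: fastest accelerators first,
# CPU last as the universal fallback.
PREFERRED = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CoreMLExecutionProvider",
    "DmlExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
]


def order_providers(available):
    """Return the available providers in preference order, always ending in CPU."""
    chosen = [p for p in PREFERRED if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        # The resulting list could be passed as providers=... to
        # onnxruntime.InferenceSession.
        print(order_providers(ort.get_available_providers()))
```

If the printed list contains only `CPUExecutionProvider` on a machine with a GPU, the most likely cause is that the CPU-only wheel is the one installed.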

Installers

Pre-built installers are published on the Releases page: a macOS .dmg, a Windows installer (.exe), and a Linux AppImage. To build them yourself, see docs/building.md.

After install, Help -> Check for Updates... downloads the matching OS asset, starts it, and closes AutoPTZ so the update can finish. If a release is missing your OS asset, AutoPTZ opens the release page instead.
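
The asset-matching step can be pictured roughly as follows. This is a sketch under assumptions, not the updater's code: the suffix map is inferred from the installer types listed above, and the `pick_asset` helper and the example file names are hypothetical.

```python
import sys

# Assumed mapping from sys.platform to the installer type named in this README.
ASSET_SUFFIX = {"darwin": ".dmg", "win32": ".exe", "linux": ".AppImage"}


def pick_asset(asset_names, platform=sys.platform):
    """Return the first asset name matching this OS, or None if there is none."""
    suffix = ASSET_SUFFIX.get(platform)
    if suffix is None:
        return None
    for name in asset_names:
        if name.lower().endswith(suffix.lower()):
            return name
    return None


# Hypothetical file names, for illustration only.
assets = ["AutoPTZ-setup.exe", "AutoPTZ.dmg", "AutoPTZ-x86_64.AppImage"]
match = pick_asset(assets)
if match is None:
    print("No asset for this OS; fall back to opening the release page")
else:
    print("Would download", match)
```

Returning `None` rather than raising keeps the fallback path simple: the caller opens the release page whenever no asset matches.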

Documentation

| Doc | What's in it |
| --- | --- |
| Installation | From source + pre-built installers, per platform. |
| Configuration | Every tuning knob: model tier, detect interval, framing, smoothing, PTZ gains. |
| Performance | Cross-platform device/precision matrix + the ep_compare benchmark. |
| Building | PyInstaller bundles → DMG / Windows installer / AppImage. |
| Architecture | Module map and the per-frame data flow. |
| Troubleshooting | Common issues (no boxes, wrong camera, slow tracking). |
| Contributing | Dev setup, lint/type/test gates, branch policy. |

License

See LICENSE.md.
