Select GPU adapters by backend to fix multi-backend duplication and OOM by n13 · Pull Request #67 · Quantus-Network/quantus-miner

n13 · 2026-06-12T03:00:27Z

Overview

Clean reimplementation of the fix attempted in #62, addressing the issues raised in its review. Closes #61.

On Windows, wgpu enumerates each physical GPU once per backend (Vulkan + DX12) plus a CPU-emulated software fallback (Microsoft Basic Render Driver). GpuEngine::init builds a mining context for every entry, so a hybrid laptop yields 5 contexts competing for the same VRAM and the process OOMs during benchmark/serve startup.

What changed (engine-gpu only)

Rather than deduplicating adapters by (vendor, device) PCI IDs — which collapses rigs with multiple identical cards into one context, including on Linux where no cross-backend duplicates exist — adapter selection works by backend:

Drop DeviceType::Cpu adapters (software fallbacks are never useful for PoW, and this also filters lavapipe/llvmpipe on Linux).
Keep all adapters from the highest-ranked backend present (Vulkan/Metal, then DX12, then others). Within a single backend each physical GPU appears exactly once, so identical cards survive by construction — no physical-ID matching needed, and backends that report vendor/device = 0 (e.g. Metal) can't cause false merges.
Order the selection discrete-first, so --gpu-devices 1 on a hybrid laptop picks the discrete card instead of the first enumerated iGPU (also raised in #61).
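
The three rules above can be sketched as a pure function over adapter metadata. This is a minimal illustration, not the code from this PR: the `DeviceType`, `Backend`, and `AdapterInfo` types below are simplified stand-ins for the wgpu types, the `adapter` helper and the adapter names are hypothetical, and the enumeration order in `main` is assumed for the example rather than taken from the #61 report.

```rust
// Simplified stand-ins for wgpu's DeviceType / Backend / AdapterInfo (assumption:
// only the fields and variants this sketch needs are modelled here).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DeviceType { DiscreteGpu, IntegratedGpu, VirtualGpu, Cpu, Other }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Backend { Vulkan, Metal, Dx12, Gl }

#[derive(Clone, Debug)]
struct AdapterInfo { name: String, device_type: DeviceType, backend: Backend }

// Hypothetical convenience constructor used only by this example.
fn adapter(name: &str, device_type: DeviceType, backend: Backend) -> AdapterInfo {
    AdapterInfo { name: name.to_string(), device_type, backend }
}

// Lower rank is preferred: Vulkan/Metal, then DX12, then everything else.
fn backend_rank(backend: Backend) -> u8 {
    match backend {
        Backend::Vulkan | Backend::Metal => 0,
        Backend::Dx12 => 1,
        Backend::Gl => 2,
    }
}

// Returns indices into `infos`, discrete GPUs first.
fn select_adapters(infos: &[AdapterInfo]) -> Vec<usize> {
    // Rule 1: drop CPU-emulated (software) adapters.
    let hardware: Vec<usize> = infos
        .iter()
        .enumerate()
        .filter(|(_, info)| info.device_type != DeviceType::Cpu)
        .map(|(index, _)| index)
        .collect();

    // Rule 2: keep only adapters on the best-ranked backend that is present.
    let Some(best) = hardware.iter().map(|&i| backend_rank(infos[i].backend)).min() else {
        return Vec::new(); // nothing usable; the caller decides how to report it
    };
    let mut selected: Vec<usize> = hardware
        .into_iter()
        .filter(|&i| backend_rank(infos[i].backend) == best)
        .collect();

    // Rule 3: stable sort so discrete GPUs come first and enumeration order
    // is otherwise preserved.
    selected.sort_by_key(|&i| u8::from(infos[i].device_type != DeviceType::DiscreteGpu));
    selected
}

fn main() {
    // Assumed hybrid-laptop enumeration: 2 GPUs x 2 backends + 1 software fallback.
    let infos = vec![
        adapter("AMD iGPU", DeviceType::IntegratedGpu, Backend::Vulkan),
        adapter("NVIDIA dGPU", DeviceType::DiscreteGpu, Backend::Vulkan),
        adapter("AMD iGPU", DeviceType::IntegratedGpu, Backend::Dx12),
        adapter("NVIDIA dGPU", DeviceType::DiscreteGpu, Backend::Dx12),
        adapter("Microsoft Basic Render Driver", DeviceType::Cpu, Backend::Dx12),
    ];
    for (slot, &i) in select_adapters(&infos).iter().enumerate() {
        let a = &infos[i];
        println!("GPU device {slot}: {} ({:?}, {:?})", a.name, a.device_type, a.backend);
    }
}
```

Returning indices rather than adapters keeps the function independent of GPU handles, which is what allows it to be tested without hardware.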

Each skipped adapter is logged at info level, each selected device gets an info log with name/type/backend, and init fails with an explicit error (pointing at --gpu-devices 0) if nothing usable remains. The selection logic is a pure index-based function (select_adapters) over AdapterInfo, unit-tested without GPU hardware. Also removes the dead adapter_infos local from init.

No public API changes; try_new and device_count() signatures are untouched.

Validation

cargo fmt --all -- --check, cargo clippy --workspace --all-targets -- -D warnings: clean
cargo test --workspace --locked: all pass, including 5 new unit tests covering the exact #61 enumeration (5 entries → 2 contexts, discrete first), identical multi-GPU rigs (all cards kept), DX12-only machines, software-only environments, and empty enumeration
Runtime sanity on macOS (Apple M5 Pro / Metal): single context selected, benchmark mines normally

Not validated on multi-GPU Windows hardware — @adamtpang, if you can run your #61 repro on this branch, that would confirm the fix end-to-end (expect 2 contexts: GPU device 0: NVIDIA ... (DiscreteGpu, Vulkan), GPU device 1: AMD ... (IntegratedGpu, Vulkan)).

Risks and mitigations

A GPU exposed only through a lower-ranked backend (e.g. discrete card with broken Vulkan driver, iGPU with working one) would be skipped. Rare, explicitly logged, and recoverable by fixing the driver; a --gpu-adapter selector remains a possible follow-up.
Setups that intentionally mine on a software adapter now get an init error instead — the CPU engine is the right tool there, and --gpu-devices 0 silences GPU probing.

Follow-ups

#61 also suggests lowering the default batch size for integrated GPUs sharing system RAM; out of scope here.

Note

Medium Risk
Changes which GPUs get mining contexts at startup (behavioral fix on Windows); a GPU only exposed via a lower-ranked backend could be skipped, though that is logged.

Overview
Fixes Windows multi-backend enumeration where wgpu listed the same GPUs on Vulkan and DX12 plus CPU fallbacks, causing multiple mining contexts per card and VRAM OOM (#61).

select_adapters replaces “init every enumerated adapter + skip by name” with: drop DeviceType::Cpu, keep only adapters on the highest-priority backend (Vulkan/Metal, then DX12), and order indices discrete GPU first so a limited --gpu-devices count picks the dGPU. Init now builds contexts only for those indices, logs skips at info, and fails with a clearer “no usable adapters” message (hinting at --gpu-devices 0). Five unit tests cover the #61 case, multi-identical-GPU rigs, DX12-only, and software-only setups.

miner-service: adds a 🎯 prefix to the CPU/GPU “found solution” log line only.

Reviewed by Cursor Bugbot for commit 2225161.

On Windows wgpu enumerates each physical GPU once per backend (Vulkan + DX12) plus a CPU-emulated fallback ("Microsoft Basic Render Driver"). Building a mining context for every entry causes VRAM contention and OOMs the process during benchmark/serve startup (#61). Instead of deduplicating by (vendor, device) PCI IDs - which would collapse rigs with multiple identical cards into a single context - drop CPU-emulated adapters and keep all adapters from the highest-ranked backend present (Vulkan/Metal, then DX12). Within a single backend each physical GPU appears exactly once, so identical cards are preserved by construction and no physical-ID matching is needed. Selected adapters are ordered discrete-first so `--gpu-devices 1` picks the discrete card on hybrid laptops. Skipped adapters are logged at info level; if nothing usable remains, init fails with an explicit error. Selection logic is a pure index-based function unit-tested against the exact enumeration reported in #61, identical multi-GPU rigs, DX12-only machines, and software-only environments.
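
To make the rejected alternative concrete, here is a small sketch of deduplication keyed on (vendor, device), showing the two failure modes described in this PR: identical cards collapse into one entry, and adapters that report 0/0 merge with each other. The `PciId` type and `dedupe_by_pci_id` function are hypothetical and written only for this illustration; the device ID value is a placeholder.

```rust
use std::collections::HashSet;

// Hypothetical key type: the (vendor, device) pair an adapter reports.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct PciId { vendor: u32, device: u32 }

// Keeps the index of the first adapter seen for each (vendor, device) pair.
fn dedupe_by_pci_id(ids: &[PciId]) -> Vec<usize> {
    let mut seen = HashSet::new();
    ids.iter()
        .enumerate()
        .filter(|(_, id)| seen.insert(**id)) // insert returns false for repeats
        .map(|(index, _)| index)
        .collect()
}

fn main() {
    // Two physically separate but identical cards (0x10DE is NVIDIA's vendor ID;
    // the device ID is a placeholder value).
    let card = PciId { vendor: 0x10DE, device: 0x1234 };
    let rig = [card, card];
    // Only one of the two cards survives, so half the rig would sit idle.
    assert_eq!(dedupe_by_pci_id(&rig), vec![0]);

    // Two different adapters on a backend that reports vendor/device = 0.
    let unknown = PciId { vendor: 0, device: 0 };
    // They are indistinguishable by ID and get merged as well.
    assert_eq!(dedupe_by_pci_id(&[unknown, unknown]), vec![0]);

    println!("PCI-ID dedupe kept {} of {} cards", dedupe_by_pci_id(&rig).len(), rig.len());
}
```

Selecting by backend avoids both cases because it never compares adapters to each other by hardware identity.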

This was referenced Jun 12, 2026

benchmark/serve OOM under wgpu when auto-detect picks all adapters on multi-GPU Windows #61

Closed

fix(gpu): filter CPU-emulated adapters and dedupe per physical GPU #62

Closed

illuzen added 2 commits June 30, 2026 12:57

Merge branch 'main' into fix/gpu-adapter-selection

7fa5486

merge with main, add success emoji

2225161

illuzen approved these changes Jun 30, 2026

View reviewed changes

illuzen merged commit d99be29 into main Jun 30, 2026
5 checks passed

illuzen merged 3 commits into main from fix/gpu-adapter-selection
