Improve Windows gpu support by illuzen · Pull Request #70 · Quantus-Network/quantus-miner

illuzen · 2026-06-29T06:19:05Z

Add comprehensive GPU detection support for AMD, NVIDIA, Intel, and Qualcomm

Overview

This PR significantly expands GPU detection and tier configuration to support a much wider range of graphics hardware. It addresses user reports of unrecognized GPUs (specifically Vega 8 APU and RX 560X) and proactively adds support for many other common GPUs that were previously falling back to conservative default settings.

What Changed

Bug Fixes

Fixed critical pattern matching bug: The "mi" pattern for AMD Instinct matched any GPU name containing the substring "mi" (e.g., "Family"). Replaced it with specific patterns: "instinct mi", "mi100", "mi200", "mi250", "mi300".
Fixed RX 560X misdetection: RX 560X incorrectly matched the "5600" pattern (RDNA 1) instead of Polaris. Reordered detection to check Polaris patterns before RDNA 1.
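
To illustrate why a bare substring check over-matches, here is a minimal std-only sketch. The helper `contains_word` and the adapter name "Example Family Adapter" are hypothetical and not taken from this PR. The helper only accepts a match when the characters on both sides are not ASCII alphanumerics.

```rust
/// Hypothetical helper (not from the PR): true if `needle` occurs in
/// `haystack` as a whole word, compared case-insensitively for ASCII.
/// Only non-overlapping occurrences are examined.
fn contains_word(haystack: &str, needle: &str) -> bool {
    let hay = haystack.to_ascii_lowercase();
    let needle = needle.to_ascii_lowercase();
    if needle.is_empty() {
        return false;
    }
    let bytes = hay.as_bytes();
    hay.match_indices(needle.as_str()).any(|(begin, m)| {
        let end = begin + m.len();
        // A boundary is the start/end of the string or any byte that is
        // not an ASCII letter or digit.
        let left_ok = begin == 0 || !bytes[begin - 1].is_ascii_alphanumeric();
        let right_ok = end == bytes.len() || !bytes[end].is_ascii_alphanumeric();
        left_ok && right_ok
    })
}
```

With this check, "mi" no longer matches inside "Family", while "mi100" still matches "AMD Instinct MI100". A multi-word pattern such as "instinct mi" would not pass a trailing-boundary check when digits follow it, so that pattern suits plain substring matching only.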

NVIDIA Additions

Tier	Models	Workgroup Formula
GTX 900 (Maxwell)	980, 970, 960, 950	`max/18, min 768`
GTX 700 (Kepler/Maxwell)	780, 770, 760, 750	`max/20, min 512`
GTX Legacy (Fermi/Kepler)	GTX 600/500/400 series	`max/24, min 384`
MX (Mobile)	MX550, MX450, MX350, etc.	`max/24, min 384`
GT (Entry-Level)	GeForce GT series	`max/28, min 256`
Professional	Added A100, H100, L4 datacenter GPUs	`max/10, min 2560`

Also added missing patterns: RTX 3060, 3050, GTX 1050, 1030
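
For readers of the Workgroup Formula columns: `max/N, min M` appears to mean "divide the limit by N, but never go below M", which matches the `(max/N).max(M)` form mentioned in the review. The sketch below assumes `max` is an adapter-reported workgroup limit held as a `u32`. The function name and parameter names are illustrative, not the PR's.

```rust
/// Illustrative only: applies a tier's "max/N, min M" rule.
/// `max_workgroups` is assumed to be the adapter-reported limit,
/// `divisor` is N (must be non-zero) and `floor` is M.
fn optimal_workgroups(max_workgroups: u32, divisor: u32, floor: u32) -> u32 {
    (max_workgroups / divisor).max(floor)
}
```

For the GTX 900 row (`max/18, min 768`), a limit of 65535 gives 3640, while a small limit of 4096 is raised to the floor of 768.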

AMD Additions

Tier	Models	Workgroup Formula
Radeon 780M (RDNA 3 APU)	780M	`max/12, min 2048`
Radeon 7x0M (RDNA 3 APU)	760M, 740M	`max/16, min 1024`
Radeon 680M (RDNA 2 APU)	680M	`max/16, min 1536`
Radeon 6x0M (RDNA 2 APU)	660M, 610M	`max/22, min 768`
RX 6500/6400 (Entry RDNA 2)	6500 XT, 6400	`max/22, min 512`
Radeon VII (Vega 20)	Radeon VII	`max/12, min 2048`
Vega 64 (Discrete)	Vega 64	`max/14, min 1536`
Vega 56 (Discrete)	Vega 56	`max/16, min 1280`
Vega (APU)	Vega 8, Vega 11, etc.	`max/28, min 384`
R9 Fury/Nano (Fiji)	Fury X, Fury, Nano	`max/16, min 1280`
R9 (GCN)	390, 380, 290, 280	`max/20, min 768`
R7 (GCN)	370, 360, 270, 260	`max/22, min 512`
Radeon OEM	Radeon 600/700 (rebadged Polaris)	`max/24, min 512`
Radeon Graphics (APU)	Generic APU fallback	`max/26, min 384`

Also added: RX 590, RX 480/470/460, RX 6950/6750/6650 patterns

Intel Additions

Tier	Models	Workgroup Formula
Arc A5 (Desktop)	A580	`max/14, min 1536`
Arc A3 (Desktop)	A380, A310	`max/18, min 768`
Arc A7 Mobile	A770M, A730M	`max/14, min 1536`
Arc A5 Mobile	A550M, A570M	`max/16, min 1024`
Arc A3 Mobile	A370M, A350M	`max/20, min 512`
Iris Xe Max (Discrete)	DG1-based	`max/20, min 512`
Iris Pro	Haswell/Broadwell	`max/26, min 256`
Iris Plus	Ice Lake, etc.	`max/26, min 320`
UHD 700	Alder Lake+	`max/26, min 320`
UHD 600	Coffee Lake, Comet Lake	`max/28, min 256`
HD Graphics	Older generations	`max/30, min 192`

Qualcomm (New Vendor)

Tier	Models	Workgroup Formula
Adreno X1 (Snapdragon X)	X Elite, X Plus	`max/14, min 1536`
Adreno 700	Snapdragon 8 Gen 1/2/3	`max/16, min 1024`
Adreno 600	Snapdragon 800 series	`max/20, min 512`
Adreno 500	Older Snapdragon	`max/24, min 384`

Validation

`cargo check -p engine-gpu`: passes
`cargo clippy -p engine-gpu -- -D warnings`: passes, no warnings

Risks and Mitigations

Risk: New patterns could potentially match unintended GPU names
- Mitigation: Patterns are ordered from most specific to least specific; more conservative fallbacks are used for ambiguous cases
Risk: Workgroup formulas for new tiers may not be optimal
- Mitigation: Values are based on relative GPU compute capabilities; users can still override via CLI flags if needed; fallback detection still triggers a warning asking users to report unrecognized GPUs

Testing Notes

The originally reported GPUs should now be detected as:

Radeon(TM) Vega 8 Graphics → AMD Vega (APU) tier
Radeon RX 560X → AMD RX 500/400 (Polaris) tier (previously misdetected as RDNA 1)

Note

Medium Risk
Large heuristic-only change: mis-matched adapter name substrings could pick wrong workgroup limits (performance/stability), though users can still override via CLI and unknown GPUs still log fallback warnings.

Overview
Expands get_vendor_specific_dispatch in engine-gpu so more adapters get a named tier and tuned optimal_workgroups instead of generic fallbacks.

NVIDIA gains extra string matches (e.g. RTX 3060/3050, GTX 1050/1030) and new branches for GTX 900/700/legacy, MX mobile, GT entry-level, and datacenter names (A100/H100/L4).

AMD detection is reordered and broadened: Polaris is checked before RDNA 1 so names like RX 560X no longer hit 5600-style patterns; Instinct matching drops the loose "mi" substring in favor of explicit mi100/instinct mi-style patterns. New tiers cover RDNA APUs, RX 6500/6400, Vega (discrete and APU), GCN R9/R7, OEM rebadges, and generic Radeon Graphics APUs.

Intel splits Arc into finer desktop/mobile buckets and adds more integrated paths (Iris variants, UHD 700/600, HD). A new Qualcomm Adreno block (vendor ID + name patterns) covers Snapdragon X and Adreno 5/6/7 series on Windows ARM.

Reviewed by Cursor Bugbot for commit 911463c.

cursor

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

n13

Review: Improve Windows GPU support

Reviewed at head 3656655. Pulled the branch and ran the repo's validation checklist locally — all green:

cargo test -p engine-gpu → 5 passed (the new gpu_tiers tests)
cargo clippy -p engine-gpu --all-targets -- -D warnings → clean
cargo fmt --all -- --check → clean
cargo check --workspace --locked → clean

Verdict: Approve (with non-blocking follow-ups)

Solid net improvement that meets its stated goal. The big win is replacing the brittle String::contains heuristics with a table-driven, word-boundary regex approach that is much more maintainable and DRY (per-tier workgroup math is now one centralized expression instead of duplicated (max/N).max(M) everywhere).

The two Bugbot findings are resolved

Both inline comments were filed against the older contains() commit 911463c. The later refactor into gpu_tiers.rs fixes both, and there are now unit tests proving it:

Polaris vs RDNA1: \b…\b boundaries + RDNA checked before Polaris → RX 5500/5600/5700 map to RDNA1 and RX 560X/580/550 map to Polaris. Verified by test_amd_rdna_vs_polaris.
Arc mobile vs desktop: mobile tiers are checked first and \ba7[57]0\b won't match a770m. Verified by test_intel_arc_mobile_vs_desktop.
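
A rough sketch of the "check the more specific tier first" idea, using plain substring patterns instead of the regex table in `gpu_tiers.rs`. The `Tier` struct, the tier names, and the function are assumptions for illustration. With substring matching, "a770" also occurs in "a770m", so first-match-wins lookup only works if the mobile tier is listed first.

```rust
/// Illustrative tier entry (not the PR's type).
struct Tier {
    name: &'static str,
    patterns: &'static [&'static str],
}

/// Order matters: mobile patterns come before the desktop ones they contain.
const INTEL_ARC_TIERS: &[Tier] = &[
    Tier { name: "Arc A7 Mobile", patterns: &["a770m", "a730m"] },
    Tier { name: "Arc A7 (Desktop)", patterns: &["a770", "a750"] },
];

/// Returns the name of the first tier with a pattern contained in the
/// lowercased adapter name, or None if no tier matches.
fn first_matching_tier(adapter_name: &str, tiers: &[Tier]) -> Option<&'static str> {
    let lower = adapter_name.to_ascii_lowercase();
    tiers
        .iter()
        .find(|tier| tier.patterns.iter().any(|p| lower.contains(*p)))
        .map(|tier| tier.name)
}
```

Word-boundary patterns such as `\ba7[57]0\b` remove the dependence on ordering for this pair, since the trailing `m` prevents a boundary after the digits.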

I spot-checked extra cases not covered by the tests: RTX A6000 → Professional (not swallowed by consumer tiers), the GT tier does not match GTX names, and RX 6500 XT → RDNA2 Entry. All correct.

Strong positive: fixes a real silent hang

The previous buffer-map path only set its flag on success and looped forever on a map error with no logging — a silent infinite hang, which violates the project's fail-early / always-log rule. The new AtomicU8 state (pending/success/error) + 30s poll timeout logs and bails out. Good catch.
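
For anyone skimming, the tri-state plus deadline pattern can be sketched roughly as below. This is a std-only approximation, not the code in this PR: the constant names, `MapWaitError`, and `wait_for_map` are made up, and the `thread::sleep` stands in for whatever device polling the real loop performs so the map callback can fire.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical state values written by the buffer-map callback.
const MAP_PENDING: u8 = 0;
const MAP_SUCCESS: u8 = 1;
const MAP_ERROR: u8 = 2;

#[derive(Debug, PartialEq)]
enum MapWaitError {
    Failed,
    TimedOut,
}

/// Waits until the callback reports success or error, or the timeout elapses.
fn wait_for_map(state: &AtomicU8, timeout: Duration) -> Result<(), MapWaitError> {
    let deadline = Instant::now() + timeout;
    loop {
        match state.load(Ordering::Acquire) {
            MAP_SUCCESS => return Ok(()),
            MAP_ERROR => return Err(MapWaitError::Failed),
            _ => {} // still pending
        }
        if Instant::now() >= deadline {
            return Err(MapWaitError::TimedOut);
        }
        // Placeholder for polling the device; avoids a hot spin in this sketch.
        thread::sleep(Duration::from_millis(1));
    }
}
```

The key difference from a success-only flag is that both the error state and the deadline end the loop, so the caller can log and bail out instead of hanging.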

Non-blocking follow-ups

unmap() on a non-mapped buffer (crates/engine-gpu/src/lib.rs ~L664 and ~L675): on the error path the buffer was never mapped, and on the timeout path the map is still pending. wgpu 27 flags this as a validation error (noisy, not a hard panic). Suggest guarding unmap() so it only runs when the buffer is actually mapped.
Persistent device-loss spins: returning NotFound { hash_count: 0 } is the right move for a transient failure, but on a genuinely lost device every batch will log-error and burn cycles at 0 H/s indefinitely. Consider tearing down / removing the dead context so the engine fails loudly, consistent with the repo's fail-early guidance.
Init "timeout" is post-hoc: init_start.elapsed() is checked after request_device().await returns (the comment acknowledges this), so a driver that truly hangs inside the await still blocks. It only guards slow-but-completed init. Fine as-is; if true hang-protection is the goal it needs to race the future against a timer / off-thread init.
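
On the last point, one way to get real hang protection is the "off-thread init" option: run the init on a worker thread and wait on a channel with a timeout. The sketch below is std-only and purely illustrative. `run_with_timeout` is a made-up helper, and racing the future against a timer in an async runtime would be the other route.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `init` on a worker thread and returns its result, or None if it
/// does not finish within `timeout`. A hung worker is not cancelled; it is
/// simply abandoned, which is the main cost of this approach.
fn run_with_timeout<T, F>(init: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails if the caller already gave up; that is fine.
        let _ = tx.send(init());
    });
    rx.recv_timeout(timeout).ok()
}
```

Unlike checking `elapsed()` after the call returns, the caller regains control at the deadline even if the init never completes.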

Nits

Qualcomm tiers use greedy adreno.*7 / adreno.*6, so e.g. "Adreno 627" buckets as 700-series. Tuning-only and Qualcomm-scoped — low impact.
Duplicate "AMD RX 5000 (RDNA 1)" tier (\b5[56]00\b and rx 5\d{3}, identical params) could be merged into a single pattern.
regex + once_cell are added as direct deps; once_cell::sync::Lazy could be std::sync::LazyLock (Rust ≥ 1.80) if you want to keep the dependency surface minimal.
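
For the `LazyLock` nit, a minimal sketch of the std-only shape, assuming Rust 1.80 or newer. The static name and its contents are placeholders (the three strings come from the datacenter names in the PR description). The real table would hold the compiled tier patterns.

```rust
use std::sync::LazyLock;

/// Placeholder table, built once on first access.
static DATACENTER_PATTERNS: LazyLock<Vec<String>> = LazyLock::new(|| {
    ["a100", "h100", "l4"].iter().map(|s| s.to_string()).collect()
});
```

Usage is the same as `once_cell::sync::Lazy`: the static dereferences to the inner value, e.g. `DATACENTER_PATTERNS.iter()`.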

None of the above blocks merge.

illuzen added 2 commits June 29, 2026 14:09

support AMD GPUs on windows

f1e6de1

add support for more GPUs

911463c

cursor Bot reviewed Jun 29, 2026

Comment thread crates/engine-gpu/src/lib.rs Outdated

Comment thread crates/engine-gpu/src/lib.rs Outdated

illuzen added 6 commits June 29, 2026 14:50

remove shader dump from logs

543e680

handle gpu initialization failure better

d5191f2

fmt

e75a266

re-order matching to avoid mobile / desktop false positives

ee42035

harden gpu matching

417f4e1

add tests to engine-gpu module

3656655

n13 approved these changes Jun 29, 2026

illuzen added 5 commits June 30, 2026 12:32

handle dead device and failed mapping correctly

3c64c1b

handle timeout properly with tokio

84e68bb

nits

429d8fe

fix benchmarks

37ee281

fmt

19af203

illuzen merged commit 7296c00 into main Jun 30, 2026
4 checks passed

illuzen merged 13 commits into main from illuzen/windows-gpu
