High-performance Super Nintendo emulator for modern Windows PCs
Fork of Snes9x — optimized for speed, stripped for clarity
Benchmarked on Intel i5-8265U @ 1.60GHz (laptop), Parodius (Europe) PAL, 3000 frames unthrottled
| Metric | Snes9x | Snes10x v0.5 | Gain |
|---|---|---|---|
| Score | 15,723 | 16,257 | +3.4% |
| Raw FPS | 943 | 975 | +3.4% |
| Median Frame | 886 us | 583 us | -34% |
| Min Frame | 577 us | 370 us | -36% |
| Stability | — | 3.7% | Ultra-stable |
| Binary Size | 13.4 MB | 5.4 MB | -60% |
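The Gain column is consistent with a plain relative change against the Snes9x baseline. The sketch below assumes that is how the figures were derived; `percent_change` and `round1` are illustrative names, not functions from the codebase.

```cpp
#include <cassert>
#include <cmath>

// Relative change of `candidate` versus `baseline`, in percent.
// Positive means the candidate value is larger than the baseline.
double percent_change(double baseline, double candidate) {
    assert(baseline != 0.0 && "baseline must be non-zero");
    return (candidate - baseline) / baseline * 100.0;
}

// Round to one decimal place, the precision used for the FPS row.
double round1(double x) {
    return std::round(x * 10.0) / 10.0;
}
```

For the frame-time and binary-size rows a negative result is the improvement, since lower is better there.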
Snes10x Threading Model
+---------------------------------------------------------+
|                                                         |
|  [Thread 1: CPU Emulation]                              |
|  65C816 dispatch (switch jump table, M0X0 inlined)      |
|  DMA/HDMA, Memory Map LUT, branch hints                 |
|        |                            |                   |
|        | frame buffer copy          | scanline sync     |
|        v                            v                   |
|  [Thread 2: D3D11 Render]     [Thread 3: SPC700 APU]    |
|  Filter + GPU color conv      Audio DSP + resampling    |
|  Texture upload + Present     Lock-free ring buffer     |
|                                     |                   |
|                                     v                   |
|                               [Thread 4: XAudio2]       |
|                               Low-latency output        |
+---------------------------------------------------------+
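The hand-off from the emulation thread to the render thread can be pictured as a single-slot mailbox. The emulator publishes a finished frame and blocks only if the renderer has not yet picked up the previous one, which gives a one-frame overlap with back-pressure. This is a minimal sketch under that assumption; `Frame`, `FrameSlot` and `run_pipeline` are hypothetical names and the real synchronization in the codebase may differ.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical frame type: a sequence number plus a 16-bit pixel buffer.
struct Frame {
    std::uint32_t index = 0;
    std::vector<std::uint16_t> pixels;
};

// Single-slot mailbox between a producer and a consumer thread.
class FrameSlot {
public:
    // Producer side. Blocks while the previous frame has not been taken yet.
    void publish(Frame f) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return !slot_.has_value(); });
        slot_ = std::move(f);
        cv_.notify_all();
    }

    // Producer side. Signals that no more frames will be published.
    void close() {
        std::lock_guard<std::mutex> lock(m_);
        closed_ = true;
        cv_.notify_all();
    }

    // Consumer side. Returns nullopt once the slot is closed and drained.
    std::optional<Frame> take() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return slot_.has_value() || closed_; });
        if (!slot_.has_value()) return std::nullopt;
        std::optional<Frame> out = std::move(slot_);
        slot_.reset();  // free the slot so the producer can continue
        cv_.notify_all();
        return out;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::optional<Frame> slot_;
    bool closed_ = false;
};

// Runs one producer and one consumer and returns the frame indices in the
// order the consumer received them.
std::vector<std::uint32_t> run_pipeline(std::uint32_t frame_count) {
    FrameSlot slot;
    std::vector<std::uint32_t> seen;

    std::thread render([&] {
        while (auto f = slot.take()) seen.push_back(f->index);
    });

    for (std::uint32_t i = 0; i < frame_count; ++i) {
        Frame f;
        f.index = i;
        f.pixels.assign(256 * 224, static_cast<std::uint16_t>(i));
        slot.publish(std::move(f));  // may block: back-pressure
    }
    slot.close();
    render.join();

    assert(seen.size() == frame_count);
    return seen;
}
```

With one slot, the producer can be at most one frame ahead of the consumer, so no frame is dropped or reordered.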
| CPU emulation technique | Description |
|---|---|
| Direct dispatch (switch) | 256 M0X0 opcodes via switch(Op) — compiler generates jump table with inlined handlers |
| Flatten inlining | __attribute__((flatten)) on main loop + cpuops.cpp in same translation unit (manual LTO) |
| Batch event checking | NMI/IRQ/Timer checks merged into single S9X_UNLIKELY branch per opcode |
| Branch prediction hints | S9X_LIKELY/S9X_UNLIKELY on 65C816 loop, memory access, SPC700, DMA |
| Cache-line alignment | alignas(64) on SCPUState, SICPU, SRegisters with hot fields first |
| ROM prefetch | _mm_prefetch 64 bytes ahead on opcode fetch |
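To illustrate how switch dispatch, branch hints and batched event checks fit together, here is a toy interpreter loop. It is a sketch, not the 65C816 core: the opcodes, `ToyCpu`, and the `HINT_LIKELY`/`HINT_UNLIKELY` macros are invented stand-ins for the `S9X_LIKELY`/`S9X_UNLIKELY` macros named in the table, whose actual definitions may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the branch-hint macros named above.
#if defined(__GNUC__) || defined(__clang__)
#  define HINT_LIKELY(x)   __builtin_expect(!!(x), 1)
#  define HINT_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define HINT_LIKELY(x)   (x)
#  define HINT_UNLIKELY(x) (x)
#endif

// Toy opcodes, invented for this example.
enum : std::uint8_t { OP_INC = 0x00, OP_DEC = 0x01, OP_HALT = 0xFF };

// All pending events share one word, so the hot loop tests a single value.
enum : std::uint32_t { EV_NMI = 1u << 0, EV_IRQ = 1u << 1, EV_TIMER = 1u << 2 };

struct ToyCpu {
    std::uint32_t pending_events = 0;
    std::uint32_t service_calls = 0;
    std::int32_t acc = 0;
    std::size_t pc = 0;
};

// Cold path: a real core would handle each event bit here.
static void service_events(ToyCpu& cpu) {
    cpu.service_calls += 1;
    cpu.pending_events = 0;
}

// Executes until OP_HALT or the end of the program; returns the accumulator.
std::int32_t run(ToyCpu& cpu, const std::uint8_t* program, std::size_t size) {
    assert(program != nullptr || size == 0);
    while (HINT_LIKELY(cpu.pc < size)) {
        // One rarely-taken branch per opcode covers every event source.
        if (HINT_UNLIKELY(cpu.pending_events != 0)) service_events(cpu);

        const std::uint8_t op = program[cpu.pc++];
        switch (op) {  // handlers are written inline in the switch body
            case OP_INC:  ++cpu.acc; break;
            case OP_DEC:  --cpu.acc; break;
            case OP_HALT: return cpu.acc;
            default:      break;  // unknown opcodes are no-ops in this toy
        }
    }
    return cpu.acc;
}
```

With a dense 256-case opcode space a compiler will typically lower the switch to a jump table, but that is an optimizer decision rather than a language guarantee.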
| Rendering technique | Description |
|---|---|
| Async render thread | 1-frame pipeline overlap with back-pressure sync |
| GPU color conversion | HLSL pixel shader converts R16_UINT (RGB565) to RGBA on GPU |
| Non-temporal stores | _mm256_stream_si256 for CPU fallback path (bypasses L1/L2 cache) |
| AVX2 SIMD | 8-pixel color conversion using 256-bit vector instructions |
| FLIP_DISCARD | Modern DXGI swap chain with tearing support |
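The color conversion itself is simple per pixel. Below is a scalar reference version of an RGB565 to RGBA8 expansion, shown in place of the HLSL shader and the AVX2 path, which apply the same idea to many pixels at once. The bit-replication rounding and the names `Rgba8`, `rgb565_to_rgba8` and `convert_row` are assumptions for illustration; the shader may scale channels differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Rgba8 {
    std::uint8_t r, g, b, a;
};

// Expand one RGB565 pixel to 8 bits per channel. The high bits of each
// channel are replicated into the low bits so that full scale (0x1F or
// 0x3F) maps to 0xFF rather than 0xF8 or 0xFC.
constexpr Rgba8 rgb565_to_rgba8(std::uint16_t p) {
    const std::uint32_t r5 = (p >> 11) & 0x1F;
    const std::uint32_t g6 = (p >> 5) & 0x3F;
    const std::uint32_t b5 = p & 0x1F;
    return Rgba8{
        static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
        static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
        static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2)),
        0xFF};  // opaque alpha
}

// Convert `count` pixels from `src` into `dst`, one at a time.
void convert_row(const std::uint16_t* src, Rgba8* dst, std::size_t count) {
    assert((src != nullptr && dst != nullptr) || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = rgb565_to_rgba8(src[i]);
    }
}
```

A SIMD version would perform the same shifts and masks on eight pixels per 256-bit register.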
| Audio technique | Description |
|---|---|
| APU threading | SPC700 on dedicated thread with mutex+condvar sync |
| Atomic resampler | std::atomic with memory_order_acquire/release barriers |
| Safe buffer management | SubmitSourceBuffer verified before InterlockedIncrement |
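The acquire/release pairing mentioned above is the usual way to build a single-producer, single-consumer ring buffer without locks. The following is a minimal sketch of that pattern, not the resampler's actual code; `SpscRing` and its method names are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Single-producer / single-consumer ring of 16-bit samples.
// head_ and tail_ count samples ever written / read; they only grow and
// rely on unsigned wraparound, so (head - tail) is the fill level.
template <std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer only. Returns how many samples were actually written.
    std::size_t push(const std::int16_t* src, std::size_t count) {
        assert(src != nullptr || count == 0);
        const std::size_t head = head_.load(std::memory_order_relaxed);
        // Acquire: see the consumer's reads as finished before reusing slots.
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t free_slots = Capacity - (head - tail);
        const std::size_t n = count < free_slots ? count : free_slots;
        for (std::size_t i = 0; i < n; ++i) {
            data_[(head + i) & (Capacity - 1)] = src[i];
        }
        // Release: publish the samples written above to the consumer.
        head_.store(head + n, std::memory_order_release);
        return n;
    }

    // Consumer only. Returns how many samples were actually read.
    std::size_t pop(std::int16_t* dst, std::size_t count) {
        assert(dst != nullptr || count == 0);
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        // Acquire: pairs with the release store in push().
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t available = head - tail;
        const std::size_t n = count < available ? count : available;
        for (std::size_t i = 0; i < n; ++i) {
            dst[i] = data_[(tail + i) & (Capacity - 1)];
        }
        // Release: hand the consumed slots back to the producer.
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    std::array<std::int16_t, Capacity> data_{};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};
```

The design is only safe with exactly one thread calling `push` and one calling `pop`. Each side owns one counter and reads the other with acquire semantics, so no mutex is needed on the audio path.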
| Removed | Lines Saved |
|---|---|
| OpenGL + CG shaders | ~3,000 |
| Vulkan + SPIRV-Cross + glslang | ~73,000 |
| DirectDraw (legacy DX) | ~700 |
| GTK / macOS / Qt platforms | ~7,700 |
| Movie, AVI, Netplay | ~5,000 |
| Total | ~85,000+ |
clang-cl -O3 -march=haswell -mtune=native -fno-strict-aliasing
| Component | Minimum |
|---|---|
| OS | Windows 10/11 64-bit |
| CPU | Intel Haswell (4th gen, 2013+) or AMD with AVX2 |
| GPU | Any DirectX 11 GPU |
| RAM | 256 MB free |
Requires Visual Studio 2022 with the clang-cl (LLVM) toolset.
:: Full rebuild (dependencies + main project)
build.bat
:: Fast incremental build (main project only)
build_fast.bat

| Setting | Value |
|---|---|
| Configuration | Release Unicode, x64 |
| Compiler | clang-cl (LLVM 19) |
| Optimization | -O3 -march=haswell -mtune=native |
| Output | win32\snes9x-x64.exe |
Snes10x stands on the shoulders of the incredible Snes9x project and its contributors:
| Period | Contributors |
|---|---|
| 1996-2002 | Gary Henderson, Jerremy Koot — Original authors |
| 2002-2004 | Matthew Kendora |
| 2002-2005 | Peter Bortas |
| 2004-2005 | Joel Yliluoma |
| 2001-2006 | John Weidman |
| 2002-2010 | Brad Jorsch, funkyass, Kris Bleakley, Nach, zones |
| 2006-2007 | nitsuja |
| 2009-2023 | BearOso, OV2 |
| Win32 port | Matthew Kendora, funkyass, nitsuja, Nach, blip, OV2 |
Thank you for creating and maintaining one of the best SNES emulators ever made.
Original project: github.com/snes9xgit/snes9x
Licensed under the Snes9x License. See LICENSE for details.
Custom build by Ayi NEDJIMI (2024-2026)
Built with the assistance of Claude (Anthropic)