NVidia-Maxwell.md

Examples

  • GTX 9xx
  • Titan X
  • Quadro Mxxxx
  • Tegra X1
  • Nintendo Switch
  • Shield TV
  • MX110, MX130

References

  1. CUDA Tuning Guide, [backup]
  2. Architecture Whitepaper, [backup]
  3. Dissecting GPU Memory Hierarchy through Microbenchmarking, [backup]
  4. Examining the Nintendo Switch (Tegra X1) Video Engine
  5. Maxwell: Nvidia’s Silver 28nm Hammer
  6. Nintendo Switch’s iGPU: Maxwell Nerfed Edition
  7. Compute Capability 5.x
  8. Tegra X1 Whitepaper, [backup]
  9. Vulkan features for GTX 980

Notes

  • FP16 data storage.

  • Unified L1/Texture cache.

  • Native shared memory atomic operations for 32-bit integer arithmetic, along with native 32 or 64-bit compare-and-swap (CAS).

  • Async transfer queue.

  • The IMAD and IMUL instructions have long latency because they are emulated. ref
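
The first note can be read as: values are kept in memory as 16-bit halves and widened before arithmetic. That reading is an assumption about the note, not something the references above are quoted on. A minimal host-side sketch of the storage saving and the rounding it costs, using Python's `struct` half-precision format (the helper names `store_fp16` and `load_widened` are hypothetical, and Python's double stands in for the fp32 compute type):

```python
import struct

def store_fp16(values):
    """Pack floats as IEEE 754 binary16: 2 bytes per value."""
    return struct.pack(f"<{len(values)}e", *values)

def load_widened(buf):
    """Unpack binary16 values into Python floats for arithmetic."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))

values = [1.0, 0.1, 1000.5, 65504.0]  # 65504 is the largest finite fp16 value
buf = store_fp16(values)
loaded = load_widened(buf)

# Same data stored as fp32 takes twice the space.
fp32_bytes = len(struct.pack(f"<{len(values)}f", *values))

print(len(buf), fp32_bytes)
print(loaded)
```

Values that are not exactly representable in 16 bits, such as 0.1, come back slightly rounded; the arithmetic itself then runs at the wider precision.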

Specs

  • ops/clock per SM: [7]
    • 128 fp32 FMA
    • 4 fp64 FMA
    • 32 fp32 rcp, rsqrt, log2, exp2, sin, cos
    • 128 i32 add/sub
    • 64 i32 shift
    • 64 cmp, min, max
    • 64 i32 bit reverse
    • 64 i32 bit extract, insert
    • 128 i32 and, or, xor
    • 32 i32 MSB (find most significant bit)
    • 32 pop count
    • 32 i8/i16 to i32
    • 4 to/from i64/fp64
    • 32 type conversions
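
The per-SM rates above convert to a theoretical peak by multiplying by SM count and clock. A small sketch of that arithmetic follows; the function name `peak_gops` is hypothetical, and the 16 SMs at 1.0 GHz are illustrative inputs rather than the specification of any particular card. An FMA is counted as two floating-point operations.

```python
# Ops per clock per SM, copied from the Specs list above.
OPS_PER_CLOCK_PER_SM = {
    "fp32_fma": 128,
    "fp64_fma": 4,
    "fp32_special": 32,  # rcp, rsqrt, log2, exp2, sin, cos
}

def peak_gops(op, sm_count, clock_ghz, flops_per_op=1):
    """Theoretical peak in giga-operations per second."""
    return OPS_PER_CLOCK_PER_SM[op] * flops_per_op * sm_count * clock_ghz

# Illustrative configuration (assumed, not a real product spec).
SM_COUNT = 16
CLOCK_GHZ = 1.0

fp32_gflops = peak_gops("fp32_fma", SM_COUNT, CLOCK_GHZ, flops_per_op=2)
fp64_gflops = peak_gops("fp64_fma", SM_COUNT, CLOCK_GHZ, flops_per_op=2)
special_gops = peak_gops("fp32_special", SM_COUNT, CLOCK_GHZ)

print(fp32_gflops, fp64_gflops, special_gops)
```

With these inputs the fp32 peak is 32 times the fp64 peak, which follows directly from the 128:4 ratio in the list. Real kernels will land below these figures because of memory bandwidth, occupancy, and instruction mix.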