Spectron: Super-Linear Spectral Attention for Efficient Long-Context Language Modeling. Replaces O(n²) self-attention with the O(n log n) spectral filter IFFT(W ⊙ FFT(x)); claims a 15x speedup, 1000x less memory, and a 10M+ token context window.
deep-learning pytorch fft language-model ai-research efficient-attention long-context transformer-alternative spectral-attention super-linear
Updated May 23, 2026 - Python
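
As a rough illustration of the IFFT(W ⊙ FFT(x)) operation named in the description, here is a minimal NumPy sketch. It is not taken from the Spectron code. The function name `spectral_mix`, the filter shape `(n // 2 + 1, d)`, and the choice to apply the FFT along the sequence axis are all assumptions made for the example. The sketch also checks the convolution theorem: multiplying by W in the frequency domain matches a circular convolution over the sequence, computed here the slow O(n²) way for comparison.

```python
import numpy as np

def spectral_mix(x, w):
    """Mix tokens along the sequence axis via IFFT(W * FFT(x)).

    x: real array of shape (n, d), a sequence of n tokens with d channels.
    w: filter of shape (n // 2 + 1, d), one weight per frequency and channel
       (hypothetical shape; the real project may parameterize W differently).
    """
    n = x.shape[0]
    x_hat = np.fft.rfft(x, axis=0)               # FFT over the sequence axis
    return np.fft.irfft(w * x_hat, n=n, axis=0)  # elementwise filter, then inverse FFT

rng = np.random.default_rng(0)
n, d = 8, 4
x = rng.standard_normal((n, d))

# An all-ones filter leaves the input unchanged.
identity = np.ones((n // 2 + 1, d))
assert np.allclose(spectral_mix(x, identity), x)

# A filter built from a real kernel k is equivalent to circular convolution with k.
k = rng.standard_normal((n, d))
w = np.fft.rfft(k, axis=0)
fast = spectral_mix(x, w)                        # O(n log n)
slow = np.stack([sum(k[s] * x[(t - s) % n] for s in range(n)) for t in range(n)])  # O(n^2)
assert np.allclose(fast, slow)
```

Note that this plain form is a circular convolution, so every output position depends on every input position, including later ones. A causal language model would need extra handling such as zero-padding and a one-sided kernel, which this sketch does not attempt.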