
differential-attention

Here are 3 public repositories matching this topic...

From-scratch PyTorch implementation and controlled benchmark of 7 attention mechanisms (MHA, GQA, MQA, SWA, DiffAttn, MLA, MoH), all trained on FineWeb-Edu with identical hyperparameters to isolate the architectural impact on perplexity, throughput, and KV cache efficiency.

  • Updated May 17, 2026
  • Python

Differential Attention-Augmented BiomedCLIP for multi-label VCE classification - ICPR 2026 RARE-VISION Competition submission from MINDH Lab, IIT Hyderabad. Achieves mAP@0.5 of 0.2456 on temporal event detection across 17 anatomical and pathological classes.

  • Updated Mar 25, 2026
  • Python
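
For readers new to the topic, the following is a minimal sketch of the core differential attention operation as it is usually described: the difference of two softmax attention maps, weighted by a scalar λ, applied to a shared value matrix. It is not taken from either repository above. The function names (`attention_weights`, `differential_attention`) and the fixed scalar `lam` are assumptions for illustration; published variants typically learn λ and add per-head normalization and causal masking, all of which are omitted here. Pure Python is used so the sketch runs without PyTorch.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, k):
    """Row-wise softmax(q @ k^T / sqrt(d)) for lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q_vec, k_vec)) for k_vec in k])
        for q_vec in q
    ]


def differential_attention(q1, k1, q2, k2, v, lam):
    """Return (weights, output) for (softmax(q1 k1^T) - lam * softmax(q2 k2^T)) @ v.

    `lam` is a fixed scalar here (an assumption); no masking or normalization.
    """
    a1 = attention_weights(q1, k1)
    a2 = attention_weights(q2, k2)
    # Subtract the second attention map from the first, element by element.
    weights = [[x - lam * y for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    # Apply the differential weights to the shared value vectors.
    output = [
        [sum(w * v_vec[j] for w, v_vec in zip(row, v)) for j in range(len(v[0]))]
        for row in weights
    ]
    return weights, output


# Toy example: 2 tokens, head dimension 2.
q1 = [[1.0, 0.0], [0.0, 1.0]]
k1 = [[1.0, 0.0], [0.0, 1.0]]
q2 = [[0.0, 1.0], [1.0, 0.0]]
k2 = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]

weights, output = differential_attention(q1, k1, q2, k2, v, lam=0.5)
```

Because each softmax row sums to 1, every row of `weights` sums to `1 - lam`, and individual entries can be negative. Setting `lam=0` reduces the sketch to ordinary single-head softmax attention.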
