Add PUMA: Semantic-Preserving Early Exit for Reasoning Models by ZhishanQ · Pull Request #10 · testtimescaling/testtimescaling.github.io

ZhishanQ · 2026-05-25T01:45:52Z

Adding PUMA as a new row in the main taxonomy table in `index.html`.

PUMA fits the inference-time / "How Well: Token Cost + Speedup" corner of the test-time-scaling landscape — it studies when to stop scaling rather than how to scale up. It uses reasoning-level semantic redundancy (via a lightweight fine-tuned Qwen3-Embedding-0.6B detector) as the candidate-exit signal, with an answer-verification window confirming exits.
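For reviewers who want a concrete picture of the two-stage idea described above (a redundancy signal proposes candidate exits, and an answer-verification window confirms them), here is a toy sketch. It is not PUMA's implementation. The bag-of-words `embed` function stands in for the fine-tuned embedding detector, and `early_exit_index`, `sim_threshold`, `window`, and the precomputed `trial_answers` list are all hypothetical names and assumptions made only for illustration.

```python
import math
from collections import Counter


def embed(step: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(step.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def early_exit_index(steps, trial_answers, sim_threshold=0.8, window=2):
    """Return the index of the step to stop after, or None to run to the end.

    steps: reasoning steps as strings.
    trial_answers: trial_answers[i] is the answer obtained if decoding
        stopped after step i (assumed to be available; hypothetical input).
    A step is a candidate exit when it is highly similar to an earlier step.
    The exit is confirmed once `window` consecutive candidates agree on the
    trial answer.
    """
    seen = []      # embeddings of all previous steps
    recent = []    # trial answers at consecutive candidate exits
    for i, step in enumerate(steps):
        vec = embed(step)
        redundant = any(cosine(vec, prev) >= sim_threshold for prev in seen)
        seen.append(vec)
        if not redundant:
            recent.clear()  # a novel step resets the verification window
            continue
        recent.append(trial_answers[i])
        if len(recent) >= window and len(set(recent[-window:])) == 1:
            return i
    return None


steps = [
    "compute 12 times 7 to get 84",
    "add 16 to 84 to get 100",
    "wait let me check add 16 to 84 to get 100",
    "double check add 16 to 84 to get 100",
    "so the answer is 100",
]
trial_answers = ["84", "100", "100", "100", "100"]
print(early_exit_index(steps, trial_answers))  # prints 3
```

In this toy run the third and fourth steps repeat the second, both trial answers agree, and decoding stops after index 3, so the final step is never generated. If the trial answers at the two candidates disagreed, the function would return `None` and decoding would continue to the end.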

Taxonomy classification (please feel free to adjust)

| Field | Value | Reasoning |
| --- | --- | --- |
| What | Internal | Modifies the reasoning trajectory in place (early-exit) rather than parallel/sequential sampling |
| SFT | ✗ | The LRM is frozen; only a small auxiliary embedding model is trained |
| RL | ✗ | n/a |
| STI | Redundancy-Aware Early Exit | The detector intervenes mid-decoding to flag candidate exit points |
| SEA | ✗ | No tree/graph search |
| VER | Trial-Answer Verifier | The answer-verification window confirms whether a flagged candidate exit is safe |
| AGG | ✗ | n/a |
| Where | Math, Code, General | MATH-500, AIME24/25, OlympiadBench, GPQA-Diamond + LiveCodeBench + MathVista, MathVision |
| How Well | Pass@1, Token Cost, Speedup | 26.2% average token reduction; 1.40× / 1.28× speedup on DS-7B/14B |

The PUMA detector is contrastively trained, so SFT could arguably be marked ✓ — but I went with ✗ because the LRM itself is untouched and the framework is plug-and-play. Happy to flip if you'd prefer.

Links

Paper: https://arxiv.org/abs/2605.17672
Code: https://github.com/giovanni-vaccarino/PUMA

I did not modify `papers.json` (which currently lists only the survey itself) or `arxiv_citations.json` (bot-managed). Let me know if I should add to papers.json too.

Thanks!

Add PUMA: Semantic-Preserving Early Exit for Reasoning Models

2ac8636

ZhishanQ wants to merge 1 commit into testtimescaling:main from ZhishanQ:add-puma
