
Add Distributional AGI Safety simulation framework#8

Open
rsavitt wants to merge 1 commit into Giskard-AI:main from rsavitt:add-distributional-agi-safety

Conversation


@rsavitt rsavitt commented Feb 10, 2026

New Resource

Adding a multi-agent simulation framework for studying distributional safety in AI systems using probabilistic soft labels.

Resource type: Open-source framework + paper
Section: General ML Testing
Tags: #Robustness #Fairness

The framework models governance trade-offs (taxes, staking, audits, collusion detection) across cooperative, contested, and adversarial regimes. It replaces binary safety classifications with calibrated probabilities (p = P(v = +1)) to surface adverse selection dynamics invisible to hard labels.
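The adverse-selection point can be illustrated with a small sketch. This is not the framework's actual API; the payoff values and helper below are hypothetical, chosen only to show how a calibrated p = P(v = +1) exposes risk that a hard accept/reject label at a 0.5 threshold hides.

```python
def expected_value(p, payoff_good=1.0, payoff_bad=-3.0):
    """Expected payoff of admitting an agent with calibrated P(v = +1) = p.

    The asymmetric payoffs (illustrative, not from the framework) model a
    setting where one bad interaction outweighs several good ones.
    """
    return p * payoff_good + (1 - p) * payoff_bad

# A hard label admits every agent with p > 0.5, treating p = 0.51 and
# p = 0.99 identically; the soft label keeps expected payoff visible.
pool = [0.51, 0.60, 0.75, 0.99]
hard_admitted = [p for p in pool if p > 0.5]  # all four pass the hard label
soft_values = {p: expected_value(p) for p in pool}

# Under these payoffs the marginal agent is a net loss despite its
# "safe" hard label: 0.51 * 1.0 + 0.49 * (-3.0) = -0.96.
```

The hard classifier pools marginal and high-confidence agents together, so marginal (likely adversarial) agents are over-represented among those admitted; the calibrated probability makes that selection pressure directly measurable.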

Results across 11 scenarios identify a critical adversarial threshold (37.5–50%) and show that structural collusion detection provides qualitatively different protection from individual-level governance levers.

Framework: https://github.com/swarm-ai-safety/swarm
