The Self-Evolving Platform for OpenHands & OpenSpace
Updated May 2, 2026 - Python
Scientific framework for iterative LLM prompt improvement using multi-dimensional scoring, threshold optimization, cross-validation, and an OPRO-style agent loop. Built on AWS Bedrock with a React + FastAPI observation GUI.
Benchmark for whether LLMs flatten contested scientific mechanisms into false consensus
Benchmark for multimodal contradiction and evidence reconciliation in biological research