scientific-evaluation

Here are 4 public repositories matching this topic...

firefox-669 / Self_Optimizing_Holo_Half

The Self-Evolving Platform for OpenHands & OpenSpace

python machine-learning automation ab-testing openspace ai-agent self-evolving openhands scientific-evaluation

Updated May 2, 2026
Python

adityaarunsinghal / LLM-As-A-Judge-Prompt-Improver

Scientific framework for iterative LLM prompt improvement using multi-dimensional scoring, threshold optimization, cross-validation, and an OPRO-style agent loop. Built on AWS Bedrock with a React + FastAPI observation GUI.

react python typescript cross-validation active-learning fastapi llm prompt-engineering aws-bedrock prompt-optimization llm-as-a-judge scientific-evaluation

Updated Mar 11, 2026
Python

suholeee / OpenQuestion

Benchmark for whether LLMs flatten contested scientific mechanisms into false consensus

evaluation evaluation-framework claude ai-for-science llm-evaluation scientific-reasoning scientific-evaluation

Updated May 7, 2026
Python

suholeee / Multimodal-AI-reliability

Benchmark for multimodal contradiction and evidence reconciliation in biological research

evaluation computational-biology evaluation-framework claude ai-for-science llm-evaluation scientific-reasoning scientific-evaluation

Updated Jun 8, 2026
Python
