confidence-scoring

Here are 25 public repositories matching this topic...

tznthou / claude-prism

Cross-provider AI code review for Claude Code — evidence-based confidence scoring with Codex, Gemini & Claude

bash gemini code-review fact-checking codex multi-provider github-actions ai-code-review claude-code confidence-scoring cross-provider

Updated May 23, 2026
Shell

goergen95 / seapig

Uncertainty-based selection of compatible inputs

deep-learning pytorch remote-sensing uncertainty-estimation selective-prediction torchgeo geospatial-ai confidence-scoring

Updated May 21, 2026
Python

metareason-ai / metareason-core

Open-source LLM evaluation engine with statistical confidence scoring

statistical-analysis bayesian-inference ai-governance llm-evaluation confidence-scoring

Updated Mar 24, 2026
Python

obielin / llm-extract

Extract structured data from any document — PDF, DOCX, HTML, CSV, plain text — using LLMs with Pydantic schema validation, per-field confidence scores, and source grounding.

python nlp pdf extraction structured-output pydantic llm document-parsing anthropic confidence-scoring

Updated Apr 5, 2026
Python

laundromatic / shopgraph

The extraction API that shows its work. Product data extraction with per-field confidence scoring and extraction provenance. REST API + MCP server. 50 free calls/month.

ecommerce ucp schema-org structured-data ai-agents product-data mcp-server confidence-scoring agent-commerce stripe-mpp shopgraph extraction-provenance

Updated May 17, 2026
TypeScript

seljicom / selji-zero-noise

Zero-Noise utilities for safer product research and review signal analysis.

ecommerce decision-support consumer-research review-analysis product-research confidence-scoring zero-noise buyer-tools shopping-tools

Updated Feb 7, 2026
JavaScript

lorenzespinosa / n8n-ai-agent-delegator

Multi-agent AI task delegation architecture for n8n: orchestrator routes natural-language commands to specialist agents with confidence scoring and human-in-the-loop gates.

automation orchestration multi-agent openai ai-agents n8n llm confidence-scoring

Updated Mar 31, 2026

dakshjain-1616 / AgentLiar

Verification system that catches coding agents falsely claiming task completion. Runs 4 parallel checks (file integrity, test quality, scope narrowing, optional LLM judge) over task+claim+diff and returns a weighted 0-100 confidence score with evidence.

verification asyncio agents github-actions pydantic fastapi ai-evaluation openrouter coding-agents code-review-automation llm-judge test-quality confidence-scoring scope-detection agent-overclaim

Updated May 21, 2026
Python

SouravUpadhyay7 / self_correcting_rag

Research-grade Self-Correcting RAG agent built with LangGraph that retrieves knowledge, generates answers, evaluates grounding/relevance/completeness, and iteratively self-improves with confidence scoring and memory.

python rag streamlit langchain llm-agent openrouter hallucination-detection langgraph knowledge-retrieval huggingface-embeddings confidence-scoring self-correcting-ai

Updated Mar 20, 2026
Python

theangelofwill / CrossModel-Consensus

System that aggregates outputs from multiple Large Language Models (GPT-4, Claude-3, custom models) to generate reliable, high-confidence results through consensus-based reasoning evaluation. Demonstrates sophisticated AI orchestration with 92.7% accuracy improvement over single-model.

python api docker portfolio machine-learning ai deep-learning orchestration pytorch neural-networks multi-model consensus-algorithm model-comparison mlflow fastapi ai-engineering llm prompt-engineering confidence-scoring

Updated Dec 22, 2025
Python

simply-mihir / nistula-technical-assessment

AI-powered concierge that normalises guest messages from WhatsApp, Booking.com, Airbnb, Instagram and direct channels, drafts a reply with Claude, and routes responses through a deterministic confidence-scoring pipeline. Built with FastAPI + Claude Sonnet 4.

Updated May 18, 2026
Python

wjddusrb03 / docforge

Smart Document Conversion for the AI Era - CPU-only, fast, with confidence scoring. Converts PDF, DOCX, PPTX, HTML, EPUB to Markdown, JSON, HTML, Text.

Updated Mar 29, 2026
Python

obinexus / gating

m2ai-portfolio / hallucination-hunter-vscode-extension-for-real-time-ai-answer-validation

Catch AI‑code hallucinations instantly: real‑time sandbox validation scores suggestions, flags low‑confidence snippets, so solo devs avoid wasted debugging and regain trust in assistants.

vscode-extension cli-tool trust-in-ai real-time-validation confidence-scoring sandbox-testing solo-developers linter-integration reduce-rework ai-assistant-users

Updated Apr 16, 2026
Python

rudrasingh-007 / Aegis-SOC

Automated L1/L2 SOC triage platform — real log ingestion, dual-source threat intel, MITRE ATT&CK tagging, kill-chain detection, ML anomaly detection, alert explainability, and confidence scoring. Built in Python.

python flask log-analysis incident-response cybersecurity suricata siem log-parser security-automation threat-intelligence wazuh anomaly-detection confidence-scoring surisuriwazuhwazuh

Updated May 23, 2026
Python

JLHC-AI-portfolio / community-fair-supplier-packet-review

Supplier PDF-to-Excel/CSV workflow with structured extraction, confidence scoring, validation flags, and human-review cues.

nodejs express validation data-cleaning csv-export excel-automation pdf-extraction document-automation confidence-scoring ai-assisted-extraction

Updated Apr 28, 2026
JavaScript

selfradiance / memledger

Append-only CLI ledger for structured agent memory claims with provenance, confidence, contestability, and immutable history.

nodejs cli typescript sqlite provenance developer-tools ai-agents audit-trail append-only zod local-first agent-memory confidence-scoring memory-integrity claim-ledger

Updated Apr 28, 2026
TypeScript

Jh-justinHarmon / knowledge-ingestion-engine

Ingestion pipelines with artifact lineage, replayable stages, and append-only persistence.

telemetry systems-engineering append-only pipeline-architecture confidence-scoring knowledge-ingestion deterministic-processing artifact-lineage replayable-pipeline

Updated Apr 6, 2026
Python

raksh-dev / inventory-data-standardization

A modular AI-driven pipeline for cleaning, normalizing, and standardizing large-scale inventory data with automated SKU generation, confidence scoring, and human-in-the-loop validation.

python machine-learning pandas data-engineering data-normalization data-cleaning human-in-the-loop ai-agents etl-pipeline fastapi sku-generation google-gemini confidence-scoring inventory-standardization

Updated Jan 26, 2026
Python

sirmaxworld / ai-solver

AI-powered problem solver using dual-AI validation with 88%+ confidence scoring. By Yourox.ai

ai agpl developer-tools problem-solving gpt claude confidence-scoring dual-ai

Updated Aug 13, 2025
HTML
