[FEATURE] Add ContextualFaithfulnessEvaluator for RAG evals #65

@stefanoamorelli

Problem Statement

There is currently no way to evaluate whether a RAG response is grounded in the retrieved context. The existing FaithfulnessEvaluator checks a response against the conversation history, but RAG systems need validation against the context actually retrieved from the vector store.

Proposed Solution

Add a ContextualFaithfulnessEvaluator that validates responses against a retrieval_context field on test Cases.

Use Case

When using RAG, I need to detect hallucinations:

case = Case(
    input="What is the refund policy?",
    # Proposed new field: the passages the retriever returned for this input.
    retrieval_context=[
        "Refunds available within 30 days of purchase.",
        "Items must be unopened for full refund."
    ]
)

The evaluator would then score how well the response is grounded in the retrieval_context defined on the Case.
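To make the intended scoring concrete, here is a minimal, self-contained sketch. It is not the library's implementation. The `Case` dataclass is a hypothetical stand-in for the real class, `contextual_faithfulness` is an illustrative name, and the lexical-overlap check with its `threshold` parameter is an assumption standing in for what would more likely be an LLM-judge call.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Case:
    """Hypothetical stand-in for the library's Case, with the proposed field."""
    input: str
    retrieval_context: list[str] = field(default_factory=list)


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens; a crude proxy for semantic content.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def contextual_faithfulness(case: Case, response: str, threshold: float = 0.6) -> float:
    """Return the fraction of response sentences supported by the context.

    A sentence counts as supported when at least `threshold` of its tokens
    appear in a single retrieved passage. Lexical overlap is an assumption
    made for illustration; a real evaluator would likely ask an LLM judge.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    if not sentences:
        return 0.0
    passages = [_tokens(p) for p in case.retrieval_context]
    supported = 0
    for sentence in sentences:
        words = _tokens(sentence)
        # Sentences without tokens (e.g. "...") are treated as unsupported.
        if words and any(len(words & p) / len(words) >= threshold for p in passages):
            supported += 1
    return supported / len(sentences)


case = Case(
    input="What is the refund policy?",
    retrieval_context=[
        "Refunds available within 30 days of purchase.",
        "Items must be unopened for full refund.",
    ],
)
score = contextual_faithfulness(
    case,
    "Refunds are available within 30 days of purchase. "
    "Shipping is always free worldwide.",
)
print(score)  # 0.5: the first sentence is supported, the second is not
```

The point of the sketch is the shape of the check: the response is scored only against `retrieval_context`, not against the conversation history.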

Alternative Solutions

No response

Additional Context

No response

Metadata

Labels: enhancement (New feature or request)