How can I see the LLM Judge evaluation in Phoenix UI? #11137
Replies: 4 comments
Hi @ralucaginga, how are you creating your correctness_annotations? Are you using evals 2? If you show an example of your workflow up to that point, I can help explain how to get up and running.
I'll add that.
It also doesn't display anything when running a simple example: Could you please help with how to do that? Unfortunately, I cannot access the Slack community; the link seems to have expired. Versions:
Hi @ralucaginga

For your evaluation example: I don't see where df comes from. For annotations to work, the DataFrame needs to contain valid span IDs that exist in Phoenix. Our recommended workflow is:

1. Start by fetching spans from Phoenix
   spans_df = client.spans.get_spans_dataframe(project_name="your-project")
2. Prepare DataFrame while preserving span_id
   eval_df = spans_df[["context.span_id", "attributes.input.value"]].copy()
3. Run evaluation
   results_df = await async_evaluate_dataframe(eval_df, evaluators=[relevance_classifier])
4. Convert and log with sync=True to see any errors
   annotation_df = to_annotation_dataframe(results_df)

For your simple example: The span IDs "span_123" and "span_456" need to be real span IDs from traces in Phoenix. With sync=False (the default), invalid span IDs silently fail with no error. To debug:
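One check that could help here, offered as an assumption rather than as the steps this reply originally listed: since invalid span IDs fail silently, compare the IDs you are about to annotate against the IDs Phoenix actually returned. The sketch below uses plain lists as stand-ins for the DataFrame columns so that it runs without a Phoenix server. `find_unknown_span_ids` is a hypothetical helper (not part of Phoenix), and the "known" IDs are invented for illustration.

```python
from typing import Iterable, List


def find_unknown_span_ids(
    eval_span_ids: Iterable[str], known_span_ids: Iterable[str]
) -> List[str]:
    """Return the eval span IDs missing from the known set, in first-seen order."""
    known = set(known_span_ids)
    seen = set()
    unknown = []
    for span_id in eval_span_ids:
        if span_id not in known and span_id not in seen:
            seen.add(span_id)
            unknown.append(span_id)
    return unknown


# Stand-in for spans_df["context.span_id"].tolist(); these IDs are made up.
known_ids = ["a1b2c3d4e5f60718", "0f1e2d3c4b5a6978"]

# Stand-in for the span ID column of the DataFrame being annotated.
eval_ids = ["a1b2c3d4e5f60718", "span_123", "span_456"]

unknown_ids = find_unknown_span_ids(eval_ids, known_ids)
if unknown_ids:
    print(f"{len(unknown_ids)} span ID(s) not found in the project: {unknown_ids}")
```

If the list of unknown IDs is non-empty, those rows would likely be the ones that never show up as annotations in the UI.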
Hello, I am still struggling to follow the documentation on the website, and I see that a lot is changing between versions of Arize Phoenix. Could you please help me set up an evaluator (correctness/relevance) and show the evaluation results in the Phoenix UI as well?
Unfortunately, nothing is shown in the UI. I tried to follow the open issues, but nothing helped.
I tried the following pieces of code, but nothing is shown in the UI:
Thanks.