
Fix/tool call accuracy (#2079) #2092


Open · wants to merge 2 commits into main

Conversation

sdivye92

  • Fix ToolCallAccuracy returning scores higher than 1.0 (#2079)
    • Use zip to pair reference_tool_calls and pred_tool_calls one-to-one
  • Updated the is_sequence_aligned method to check that the predicted and reference sequences have the same length
    • This fixes the case where tool call accuracy was 1.0 even though pred_tool_calls contained more tool calls than reference_tool_calls.
    • The equality check between pred_sequence and ref_sequence verifies that both sequences have the same length and that their elements occur in the same order (see the sketch after this list).
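
A minimal sketch of the idea, assuming simplified stand-ins for the code in ragas/src/ragas/metrics/_tool_call_accuracy.py — the ToolCall class, function names, and scoring loop below are illustrative, not the actual ragas implementation:

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    # Simplified stand-in for ragas' tool call object.
    name: str
    args: dict = field(default_factory=dict)

def is_sequence_aligned(pred_sequence: list[str], ref_sequence: list[str]) -> bool:
    # After the fix: a plain equality check, so the sequences must have
    # the same length AND list the tool names in the same order; extra
    # predicted calls can no longer slip through.
    return pred_sequence == ref_sequence

def tool_call_accuracy(pred_tool_calls: list[ToolCall],
                       ref_tool_calls: list[ToolCall]) -> float:
    if not ref_tool_calls:
        return 0.0
    pred_names = [tc.name for tc in pred_tool_calls]
    ref_names = [tc.name for tc in ref_tool_calls]
    if not is_sequence_aligned(pred_names, ref_names):
        return 0.0
    # zip() pairs each reference call with exactly one predicted call,
    # so no match is counted twice and the score cannot exceed 1.0.
    matched = sum(
        ref.name == pred.name and ref.args == pred.args
        for ref, pred in zip(ref_tool_calls, pred_tool_calls)
    )
    return matched / len(ref_tool_calls)
```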

sdivye92 added 2 commits June 26, 2025 08:28
- equality check between pred_sequence and ref_sequence now verifies that both are the same length
- sequences compare equal only when their elements occur in the same order
@dosubot added the size:S label (This PR changes 10-29 lines, ignoring generated files.) on Jun 26, 2025

@greptile-apps bot left a comment

PR Summary

Fixed a critical scoring bug in the ToolCallAccuracy metric, which could incorrectly return scores higher than 1.0 due to double-counting matches.

  • Modified ragas/src/ragas/metrics/_tool_call_accuracy.py to use zip() for one-to-one matching between reference and predicted tool calls
  • Updated is_sequence_aligned to enforce equal lengths between the prediction and reference sequences
  • Added validation to ensure tool calls are made in the correct order and quantity
  • Fixed the edge case where perfect scores were given despite extra predicted tool calls (illustrated below)
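
A hypothetical before/after illustration using the sketch above; the tool names and arguments are made up, not taken from the PR or its tests:

```python
# One reference call, but the agent predicted an extra duplicate call.
ref = [ToolCall("search", {"q": "weather"})]
pred = [ToolCall("search", {"q": "weather"}),
        ToolCall("search", {"q": "weather"})]

# Previously, double-counted matches could report a perfect (or >1.0)
# score despite the extra call. With the equal-length alignment check,
# the mismatched sequence lengths fail alignment and score 0.0.
print(tool_call_accuracy(pred, ref))  # 0.0

# An exact match still scores 1.0, and zip() caps the score there.
print(tool_call_accuracy(ref, ref))   # 1.0
```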

1 file reviewed, 1 comment
