fix: include tool executions in _extract_trace_level #77

razkenari · 2025-12-30T05:37:56Z

The TraceLevelInput type supports list[ToolExecution] in session_history, but _extract_trace_level was not populating it. This caused FaithfulnessEvaluator to evaluate without tool context, even though its prompt mentions tool outputs.

This fix adds tool execution extraction to match _extract_tool_level behavior.

Description

_extract_trace_level() now includes tool executions in session_history, matching the behavior of _extract_tool_level(). This ensures FaithfulnessEvaluator receives tool call/result data for proper faithfulness evaluation.

Related Issues

Fixes #76

Documentation PR

N/A - no documentation changes needed

Type of Change

Bug fix

Testing

Verified tool executions now appear in TraceLevelInput.session_history
Compared output with _extract_tool_level() to confirm consistency
I ran hatch run prepare

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

The TraceLevelInput type supports list[ToolExecution] in session_history, but _extract_trace_level was not populating it. This caused FaithfulnessEvaluator to evaluate without tool context, even though its prompt mentions tool outputs. This fix adds tool execution extraction to match _extract_tool_level behavior.

cagataycali

LGTM! ✅

This PR correctly addresses issue #76 where FaithfulnessEvaluator was missing tool execution context.

Review Summary

Change Analysis:

Adds tool execution extraction to _extract_trace_level()
Matches existing behavior in _extract_tool_level()
Small, focused change (+14/-1 lines)
Single file modified (trace_extractor.py)

Code Quality:

✅ Fix is consistent with existing patterns in the codebase
✅ PR description clearly explains the problem and solution
✅ Links to related issue (#76)
✅ Checklist completed

Impact:

FaithfulnessEvaluator will now properly receive tool call/result data
Enables accurate faithfulness evaluation for agents using tools
No breaking changes expected

Suggestion for Maintainers:
Consider if tests should be added to verify tool executions appear in TraceLevelInput.session_history. The PR mentions manual verification but automated tests would prevent regression.

Review by strands-coder autonomous agent 🤖

cagataycali · 2026-01-10T04:28:51Z

🤖 Merge Readiness Check

Status: ⚠️ Almost Ready (CI Pending)

Criteria	Status
Review Decision	✅ APPROVED
CI Status	⏳ PENDING
Mergeable	✅ No conflicts

This fix addresses tool execution extraction in trace-level operations - important for accurate evaluation traces.

Recommendation: CI needs to run. Once CI passes, this PR is ready for merge.

Automated analysis by strands-coder 🤖

razkenari requested a deployment to manual-approval December 30, 2025 05:38 — with GitHub Actions Waiting

cagataycali approved these changes Jan 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: include tool executions in _extract_trace_level #77

fix: include tool executions in _extract_trace_level #77

Uh oh!

razkenari commented Dec 30, 2025 •

edited

Loading

Uh oh!

cagataycali left a comment

Uh oh!

cagataycali commented Jan 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: include tool executions in _extract_trace_level #77

Are you sure you want to change the base?

fix: include tool executions in _extract_trace_level #77

Uh oh!

Conversation

razkenari commented Dec 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Documentation PR

Type of Change

Testing

Checklist

Uh oh!

cagataycali left a comment

Choose a reason for hiding this comment

LGTM! ✅

Review Summary

Uh oh!

cagataycali commented Jan 10, 2026

🤖 Merge Readiness Check

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

razkenari commented Dec 30, 2025 •

edited

Loading