Skip to content

Conversation

@xyliugo
Copy link
Collaborator

@xyliugo xyliugo commented Sep 1, 2025

Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 👷 build
  • ⚡️ perf
  • 📝 docs
  • 🔨 chore

Description of Change

  • Switched to LiteLLM, which is simpler and more flexible and general than the OpenAI Agents SDK for testing MCP capabilities.
  • Started building a new agent (mcpmark-agent) on top of LiteLLM.
  • Dropped Stream mode since we already handle looped tool-calling.
  • Added reasoning_effort support (comes built-in with LiteLLM).
  • Got rid of the heavy OpenAI Agents SDK dependency.
  • Handled Anthropic extended thinking + tool use logic

Additional Information

@xyliugo xyliugo changed the title ✨ feat: ✨ feat: introduce mcpmark-agent Sep 1, 2025
@github-actions
Copy link

github-actions bot commented Sep 1, 2025

🐳 Docker Build Completed!

Version: pr-dev-9d99679
Build Time: 2025-09-01T10:35:29.572Z
🔗 View all tags on Docker Hub: https://hub.docker.com/r/evalsysorg/mcpmark/tags

Pull Image

Download the Docker image to your local machine:

docker pull evalsysorg/mcpmark:pr-dev-9d99679

Run Eval

Execute evaluation tasks using the built image:

DOCKER_IMAGE_VERSION=pr-dev-9d99679 ./run-task.sh --models gpt-4.1-mini --tasks file_context/uppercase

Important

This build is for testing and validation purposes.

Copy link
Member

@arvinxx arvinxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@xyliugo xyliugo merged commit f350cd4 into main Sep 1, 2025
3 checks passed
@xyliugo xyliugo deleted the dev branch September 1, 2025 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants