Improvements for MCP-based agents #111
Replies: 8 comments 14 replies
-
I think these are all great topics.
I think this should enable permission requests to bubble up from sub-agents to higher-level agents and, ultimately, to the top-level interaction.
Something important here, imo, would be letting the top level (or any intermediate layer) be aware of the topology of all nested agentic activity.
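A minimal sketch of the two ideas above (permission bubbling and topology awareness), assuming a plain in-memory tree. `AgentNode`, `approve`, `request_permission` and `topology` are hypothetical names for illustration only; none of them are part of MCP or any SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AgentNode:
    """Hypothetical node in a tree of agents (not an MCP type)."""
    name: str
    parent: Optional["AgentNode"] = field(default=None, repr=False, compare=False)
    children: list["AgentNode"] = field(default_factory=list)
    # Hypothetical hook: only a node that can actually ask the user sets this.
    approve: Optional[Callable[[str, str], bool]] = None

    def spawn(self, name: str) -> "AgentNode":
        child = AgentNode(name=name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        return self.name if self.parent is None else f"{self.parent.path()}/{self.name}"

    def request_permission(self, action: str, origin: Optional[str] = None) -> bool:
        """Forward the request upward until some ancestor is able to decide."""
        origin = origin or self.path()
        if self.approve is not None:
            return self.approve(origin, action)
        if self.parent is None:
            return False  # nobody can approve, so deny by default
        return self.parent.request_permission(action, origin)

    def topology(self) -> dict:
        """Nested view of this node and everything below it."""
        return {self.name: [child.topology() for child in self.children]}


log: list[tuple[str, str]] = []


def ask_user(origin: str, action: str) -> bool:
    log.append((origin, action))
    return action == "read_file"  # stand-in for a real user prompt


root = AgentNode("top", approve=ask_user)
worker = root.spawn("research").spawn("fetcher")
allowed = worker.request_permission("read_file")
denied = worker.request_permission("delete_repo")
```

Here the request carries the full path of the sub-agent that raised it, so the layer that prompts the user can show where in the tree the request came from.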
-
It would be useful to understand better how you see agents fitting into MCP at a conceptual level and in terms of user experience. Right now clients like Claude seem to play the role of a unified agent, since they execute the agent loop, and servers are simply capabilities/context being offered to this singular agent/client. Tools can do whatever they want, so they can be agentic, but this is irrelevant to the top-level agent/client. What is meant by "agent support" in the roadmap? Is MCP working toward a vision where there is a singular client/agent that users interact with, and this agent is empowered by being connected to many MCP servers? Or are you thinking about switching to something more akin to GPTs/Gems/Agents, where users interact with many different top-level agents that each have a clear identity? The gaps mentioned in this issue seem to be mostly about supporting multiple servers better and providing additional capabilities to tools (e.g. direct responses and elicitation), rather than supporting multiple agents explicitly.
-
"Structured, formatted intermediate updates from server -> client, so a deep agent graph can provide information to the user even while a top-level tool is still being run"
Wouldn't this be costly in context? Does it mean the model stops and makes another function call each time, or how would it work?
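One possible reading, sketched under assumptions: intermediate updates could travel as JSON-RPC notifications that the client renders in its UI without adding them to the model's context, so no extra model call is involved. The message shape below follows my understanding of MCP's `notifications/progress` (with `progressToken`, `progress`, `total`, `message`); `SketchClient` and its two lists are hypothetical, and whether a real client keeps updates out of context is that client's choice.

```python
import json


def progress_notification(token: str, progress: float, total: float, message: str) -> str:
    """Server side: encode an update as a JSON-RPC notification (no "id", so no reply)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "notifications/progress",
        "params": {
            "progressToken": token,
            "progress": progress,
            "total": total,
            "message": message,
        },
    })


class SketchClient:
    """Hypothetical client that separates UI updates from model context."""

    def __init__(self) -> None:
        self.ui_updates: list[str] = []     # shown to the user only
        self.model_context: list[str] = []  # what the model would actually see

    def handle(self, raw: str) -> None:
        msg = json.loads(raw)
        if msg.get("method") == "notifications/progress":
            p = msg["params"]
            self.ui_updates.append(f"{p['progress']}/{p['total']} {p['message']}")
        elif "result" in msg:
            # Only the final tool result is passed on to the model.
            self.model_context.append(json.dumps(msg["result"]))


client = SketchClient()
client.handle(progress_notification("tok-1", 1, 3, "searching"))
client.handle(progress_notification("tok-1", 2, 3, "summarizing"))
client.handle(json.dumps({"jsonrpc": "2.0", "id": 7, "result": {"text": "done"}}))
```

In this sketch the two updates reach the user while the tool call is still open, and the model's context grows only by the final result.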
-
Let MCP be simple and focus on standardizing tools, resources, etc., rather than defining standards for agentic workflows. The protocol should not overcomplicate matters by forcing diverse use cases to conform to a single MCP way of organizing/orchestrating agents.
-
Hi! Here are a few points I'd like to share with you:
To me, it seems like the trees-of-agents concept breaks the existing client-server architecture.
For trees of agents, it seems like every node can both contain LLM-calling logic (act as an MCP client) and provide resources/tools (act as an MCP server).
I really miss Server --> Client capabilities for my application; this made me look through these issues and discussions to see if other people run into the same questions. For example, let's take a Twitter MCP server. Imagine that this server can poll Twitter to receive new mentions from users, and then use the MCP client (LLM) to create responses to them. Currently, this logic can be executed in 2 ways:
The first approach is okay, but only if you are connecting to one server (the Twitter server). If your agent should act on various platforms (Discord, YouTube, TikTok, Telegram...) you'll have to poll all those MCP servers as well, creating a lot of traffic and breaking the single responsibility principle inside the client.
The second approach is much better with multiple MCP servers, but again, you'll have to subscribe to all of those resources and maintain some resource registry, which does not feel right to me. I would instead love to be able to push all updates directly to some message broker, and have the MCP client consume those messages, idk.
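A rough sketch of the broker idea, assuming an in-process `asyncio.Queue` as a stand-in for a real message broker. `Event`, `platform_server` and `client` are hypothetical names; nothing here is an MCP API.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Event:
    source: str  # which platform server produced the event
    text: str


async def platform_server(name: str, mentions: list[str], broker: asyncio.Queue) -> None:
    """Stand-in for a server that pushes new mentions instead of being polled."""
    for text in mentions:
        await broker.put(Event(name, text))


async def client(broker: asyncio.Queue, expected: int) -> list[str]:
    """Single consumer: one loop handles every platform, with no per-server polling."""
    handled = []
    for _ in range(expected):
        event = await broker.get()
        handled.append(f"reply to {event.source}: {event.text}")  # an LLM call would go here
    return handled


async def main() -> list[str]:
    broker: asyncio.Queue = asyncio.Queue()
    consumer = asyncio.create_task(client(broker, expected=3))
    await asyncio.gather(
        platform_server("twitter", ["@bot hi"], broker),
        platform_server("discord", ["ping", "help"], broker),
    )
    return await consumer


replies = asyncio.run(main())
```

The client only knows about the queue, so adding another platform means adding another producer rather than another polling loop or subscription entry.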
-
Proposal: Enhancing MCP to Support Provider-Independent Capabilities
Hello from Block,
While considering improvements to the Agent UI for Goose, I realized that MCP might need enhancements to better support provider-independent capabilities. Let me explain in detail.
Problem Statement
Imagine an agent that allows the use of any provider, such as Goose. Now, suppose we want to enable real-time audio communication. Since we don't want to tie this feature to a specific provider, it makes sense to introduce it at the MCP server level. However, the UI should also be able to reflect this newly added capability, for example, by displaying a microphone button when audio communication is available.
Proposed Solution
If the MCP specification allowed servers or tools to declare the types of capabilities they provide (potentially as an enum of known categories), the Agent UI could dynamically adapt its controls based on available features. This could include elements such as:
Additionally, this would allow for more flexible settings management. If multiple MCPs provide the same capability, users could select which MCP instance should handle a given capability. This would enable the installation of MCPs with overlapping capabilities while ensuring that the preferred provider is chosen for each feature.
Benefits
Would love to hear your thoughts on this approach!
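A minimal sketch of how the proposal might look from the UI side, assuming servers declare capabilities from a shared enum. Only the real-time audio / microphone button pairing comes from the proposal; `FILE_UPLOAD`, the `CONTROLS` mapping, the server names and both helper functions are hypothetical and not part of the MCP specification.

```python
from enum import Enum
from typing import Optional


class Capability(Enum):
    REALTIME_AUDIO = "realtime_audio"  # example from the proposal
    FILE_UPLOAD = "file_upload"        # hypothetical extra category


# Hypothetical mapping from a declared capability to a UI control.
CONTROLS = {
    Capability.REALTIME_AUDIO: "microphone button",
    Capability.FILE_UPLOAD: "attach button",
}


def ui_controls(servers: dict[str, set[Capability]]) -> list[str]:
    """Controls the UI should show, given what the connected servers declare."""
    available = set().union(*servers.values())
    return sorted(CONTROLS[cap] for cap in available)


def pick_provider(
    servers: dict[str, set[Capability]],
    capability: Capability,
    preferred: dict[Capability, str],
) -> Optional[str]:
    """Resolve overlapping providers: the user's preference wins if it is valid."""
    candidates = sorted(name for name, caps in servers.items() if capability in caps)
    if not candidates:
        return None
    choice = preferred.get(capability)
    return choice if choice in candidates else candidates[0]


servers = {
    "voice-a": {Capability.REALTIME_AUDIO},
    "voice-b": {Capability.REALTIME_AUDIO, Capability.FILE_UPLOAD},
}
controls = ui_controls(servers)
chosen = pick_provider(servers, Capability.REALTIME_AUDIO, {Capability.REALTIME_AUDIO: "voice-b"})
fallback = pick_provider(servers, Capability.REALTIME_AUDIO, {})
```

Falling back to the first provider by name is an arbitrary choice for the sketch; a real client would more likely ask the user.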
-
Great discussion here. One of the UX patterns that I've been thinking about requires the LLM to initiate a long-running task using a tool while it continues interacting with the user. It would only get an update when the long-running task is complete, so that it can give the user the desired information or take further action based on the output. Think of something like Deep Research where, instead of the agent being locked until the research is done, it can continue the conversation. Can this be done with the current implementation (using the sampling method), or would this require changes in the protocol? This could be further expanded to allow the agent to multi-task, where it can invoke multiple tools and synthesize and refine its output as more information is provided to it.
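A minimal client-side sketch of this pattern, assuming the tool call can be awaited as a background `asyncio` task. `deep_research` and `conversation` are hypothetical stand-ins; this does not answer whether the protocol itself supports it.

```python
import asyncio


async def deep_research(topic: str) -> str:
    """Stand-in for a long-running tool call."""
    await asyncio.sleep(0.05)
    return f"report on {topic}"


async def conversation() -> list[str]:
    transcript = []
    # Start the tool call without waiting for it to finish.
    task = asyncio.create_task(deep_research("MCP agents"))

    # The agent keeps talking to the user while the task is running.
    for user_msg in ["hello", "what else can you do?"]:
        transcript.append(f"user: {user_msg}")
        status = "done" if task.done() else "still running"
        transcript.append(f"agent: (research {status})")
        await asyncio.sleep(0)  # yield so the background task can make progress

    # Once the result is available, the agent acts on it.
    result = await task
    transcript.append(f"agent: {result}")
    return transcript


transcript = asyncio.run(conversation())
```

Multi-tasking would follow the same shape, with several tasks created up front and their results folded in as each one completes.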
-
In addition, we're working with multi-agent flows, which brings up the discussion of whether servers should be thought of as tool providers or as fully agentic. Instead of expecting servers to act as an agent, could we not have the agent abstraction at the top level? A session could have multiple "Agents", each with their own servers, and this information could be accessible to individual servers in case they want to invoke an available agent (like sampling). I think it would be beneficial to have an agent abstraction instead of nesting agents within tools. If a tool requires access to a specific agent with certain capabilities, it can ask the client to enable/download the agent and authenticate/authorize it to act. This would be cleaner from a privacy, transparency, and security perspective, since the user would have more visibility and control over how their data is being passed around behind the tool call.
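A small sketch of what a top-level agent abstraction could look like, assuming the client owns a session-wide registry. `Agent`, `Session`, `invoke` and the example names are hypothetical; MCP does not define any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical top-level agent with its own set of servers."""
    name: str
    servers: list[str]
    authorized: bool = False  # set by the user, not by a server


@dataclass
class Session:
    agents: dict[str, Agent] = field(default_factory=dict)
    audit_log: list[str] = field(default_factory=list)

    def add_agent(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def list_agents(self) -> list[str]:
        """What a server could see when looking for an agent to invoke."""
        return sorted(self.agents)

    def invoke(self, caller_server: str, agent_name: str, prompt: str) -> str:
        """A server asks the client to run an agent; the client checks authorization."""
        self.audit_log.append(f"{caller_server} -> {agent_name}")  # visible to the user
        agent = self.agents.get(agent_name)
        if agent is None or not agent.authorized:
            raise PermissionError(f"{agent_name} is not enabled for this session")
        return f"[{agent_name}] handled: {prompt}"


session = Session()
session.add_agent(Agent("researcher", servers=["search"], authorized=True))
session.add_agent(Agent("coder", servers=["git", "shell"]))

answer = session.invoke("search", "researcher", "summarize issue 111")
try:
    session.invoke("search", "coder", "push to main")
    blocked = False
except PermissionError:
    blocked = True
```

Because every invocation goes through the session, the user gets a single audit trail of which server asked which agent to act.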
-
Starting a tracking discussion of various things that could improve MCP's suitability for agents and agentic workflows (especially trees of agents).
Recording this quickly, without enough explanation—hopefully I can backfill that later. 😅
Feel free to add other thoughts!
Scope