BUG: Managed agents share state causing identical parallel execution results #1781

@Zoe14

Problem
When calling the same managed agent multiple times in one step in parallel, all calls produce identical or very similar results instead of executing independently. This affects all agent types that support managed agents (ToolCallingAgent, CodeAgent, etc.) due to shared instance state.

In agents.py, a managed agent is executed like a tool call. However, every __call__ invocation runs on the same agent instance, so parallel calls share and overwrite that instance's state.
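
To illustrate the suspected mechanism without involving an LLM, here is a minimal sketch. ToyAgent is a hypothetical stand-in, not the smolagents class: it only assumes that an agent writes per-run state (here, the task) onto the instance and reads it back later. The barrier is there just to force the three calls to overlap so the outcome is deterministic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyAgent:
    """Hypothetical stand-in for an agent that keeps per-run state on the instance."""

    def __init__(self):
        self.task = None

    def __call__(self, task, barrier):
        self.task = task   # per-run state written onto the shared instance
        barrier.wait()     # make every call overlap before any of them reads back
        return f"result for: {self.task}"


def run_parallel(tasks):
    agent = ToyAgent()  # one instance shared by every call, as with a managed agent
    barrier = threading.Barrier(len(tasks))
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(lambda t: agent(t, barrier), tasks))


results = run_parallel(["history", "current", "future"])
print(results)  # three entries, all built from whichever task was written last
```

All three calls return the same string, because each one reads the task left behind by the last writer. This only models the shared-state pattern; it does not show where in agents.py the overwrite actually happens.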

Steps to reproduce
It is hard to provide a minimal, self-contained example for this bug, because in a simple multi-agent setup the LLM is smart enough to recognize that only one of the tool calls was executed.
I tried with this example:

from smolagents import ToolCallingAgent, WebSearchTool

# load_model comes from the reporter's own project, not from smolagents
from agents.shared.llm_models import load_model

model = load_model("gpt-4.1")

# Create a managed agent that needs multiple steps
research_agent = ToolCallingAgent(
    model=model,
    tools=[WebSearchTool()],  # Give it tools
    name="research_agent", 
    description="A research agent that searches and analyzes data",
    max_steps=5,
    verbosity_level=2
)

# Create main agent
main_agent = ToolCallingAgent(
    model=model,
    tools=[],
    managed_agents=[research_agent],
    verbosity_level=2
)

# This will also cause the issue
result = main_agent.run("""
# Research three different topics in parallel

ai_history = research_agent(
    task="Research the complete history of AI from 1950-2024 with detailed timeline and key figures",
    additional_args={"focus": "historical_milestones"}
)

ml_current = research_agent(
    task="Research current state of machine learning in 2024, latest models, and industry applications", 
    additional_args={"focus": "current_technology"}
)

ai_future = research_agent(
    task="Research future of AI for 2025-2030, emerging trends, and potential challenges",
    additional_args={"focus": "future_predictions"}
)

And summarize the results
""")

Actual behavior and error logs
Looking at the step logs of the main agent, in step 1 it clearly tried to call the research agent three times with the three different tasks, but the observation contains only the result of the last call. In step 2 it called the agent twice for the missing searches, and again only one observation came back. I only posted part of the output because of size limits.

[{'task': '\n# Research three different topics in parallel\n\nai_history = research_agent(\n    task="Research the complete history of AI from 1950-2024 with detailed timeline and key figures",\n    additional_args={"focus": "historical_milestones"}\n)\n\nml_current = research_agent(\n    task="Research current state of machine learning in 2024, latest models, and industry applications", \n    additional_args={"focus": "current_technology"}\n)\n\nai_future = research_agent(\n    task="Research future of AI for 2025-2030, emerging trends, and potential challenges",\n    additional_args={"focus": "future_predictions"}\n)\n\nAnd summarize the results\n',
  'task_images': None},
 {'step_number': 1,
  'timing': {'start_time': 1758569039.867606,
   'end_time': 1758569067.09487,
   'duration': 27.227264165878296},
  'model_input_messages': [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, .......n\nNow Begin!'}], tool_calls=None, raw=None, token_usage=None),
   ChatMessage(role=<MessageRole.USER: 'user'>, content=[{'type': 'text', 'text': 'New task:\n\n# Research three different topics in parallel\n\nai_history = research_agent(\n    task="Research the complete history of AI from 1950-2024 with detailed timeline and key figures",\n    additional_args={"focus": "historical_milestones"}\n)\n\nml_current = research_agent(\n    task="Research current state of machine learning in 2024, latest models, and industry applications", \n    additional_args={"focus": "current_technology"}\n)\n\nai_future = research_agent(\n    task="Research future of AI for 2025-2030, emerging trends, and potential challenges",\n    additional_args={"focus": "future_predictions"}\n)\n\nAnd summarize the results\n'}], tool_calls=None, raw=None, token_usage=None)],
  'tool_calls': [{'id': 'call_PSktJzk845VQkInWunda8vgw',
    'type': 'function',
    'function': {'name': 'research_agent',
     'arguments': {'task': 'Research current state of machine learning in 2024, latest models, and industry applications',
      'additional_args': {'focus': 'current_technology'}}}},
   {'id': 'call_gjtJOzkBmVFwrwyUbcpgU32c',
    'type': 'function',
    'function': {'name': 'research_agent',
     'arguments': {'task': 'Research future of AI for 2025-2030, emerging trends, and potential challenges',
      'additional_args': {'focus': 'future_predictions'}}}},
   {'id': 'call_ylt3M1NIh9LkKubUM3vhcZNl',
    'type': 'function',
    'function': {'name': 'research_agent',
     'arguments': {'task': 'Research the complete history of AI from 1950-2024 with detailed timeline and key figures',
      'additional_args': {'focus': 'historical_milestones'}}}}],
  'error': None,
  'model_output_message': {'role': 'assistant',
   'content': None,
   'tool_calls': [{'function': {'arguments': {'task': 'Research the complete history of AI from 1950-2024 with detailed timeline and key figures',
       'additional_args': {'focus': 'historical_milestones'}},
      'name': 'research_agent',
      'description': None},
     'id': 'call_ylt3M1NIh9LkKubUM3vhcZNl',
     'type': 'function'},
    {'function': {'arguments': {'task': 'Research current state of machine learning in 2024, latest models, and industry applications',
       'additional_args': {'focus': 'current_technology'}},
      'name': 'research_agent',
      'description': None},
     'id': 'call_PSktJzk845VQkInWunda8vgw',
     'type': 'function'},
    {'function': {'arguments': {'task': 'Research future of AI for 2025-2030, emerging trends, and potential challenges',
       'additional_args': {'focus': 'future_predictions'}},
      'name': 'research_agent',
      'description': None},
     'id': 'call_gjtJOzkBmVFwrwyUbcpgU32c',
     'type': 'function'}],
   'raw': ModelResponse(id='chatcmpl-CIgQibbG90r9MEV8nWRE1x8u1f858', created=1758569040, model='gpt-4.1-2025-04-14', object='chat.completion', system_fingerprint='fp_daf5fcc80a', choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(function=Function(arguments='{"task": "Research the complete history of AI from 1950-2024 with detailed timeline and key figures", "additional_args": {"focus": "historical_milestones"}}', name='research_agent'), id='call_ylt3M1NIh9LkKubUM3vhcZNl', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"task": "Research current state of machine learning in 2024, latest models, and industry applications", "additional_args": {"focus": "current_technology"}}', name='research_agent'), id='call_PSktJzk845VQkInWunda8vgw', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"task": "Research future of AI for 2025-2030, emerging trends, and potential challenges", "additional_args": {"focus": "future_predictions"}}', name='research_agent'), id='call_gjtJOzkBmVFwrwyUbcpgU32c', type='function')], function_call=None, provider_specific_fields={'refusal': None}, annotations=[]), provider_specific_fields={})], usage=Usage(completion_tokens=141, prompt_tokens=1178, total_tokens=1319, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0, text_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=0, cached_tokens=0, text_tokens=None, image_tokens=None)), service_tier='default'),
   'token_usage': {'input_tokens': 1178,
    'output_tokens': 141,
    'total_tokens': 1319}},
  'model_output': None,
  'code_action': None,
  'observations': 'Here is the final answer from your managed agent \'research_agent\':\n### 1. Task outcome (short version):\nA comprehensive overview and synthesis of expert predictions for Artificial Intelligence (AI) between 2025-2030, detailing the most prominent emerging trends,......e targeted insights.',
  'observations_images': None,
  'action_output': None,
  'token_usage': {'input_tokens': 1178,
   'output_tokens': 141,
   'total_tokens': 1319},
  'is_final_answer': False},
 {'step_number': 2,
  'timing': {'start_time': 1758569067.095399,
   'end_time': 1758569100.092018,
   'duration': 32.99661898612976},
  'model_input_messages': [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content=[{'type': 'text', 'text': 'You are an expert assistant who can solve any task using tool calls. ..... please specify for more targeted insights.'}], tool_calls=None, raw=None, token_usage=None)],
  'tool_calls': [{'id': 'call_NZXLHtaKnA9swUJTVI9i0Ytn',
    'type': 'function',
    'function': {'name': 'research_agent',
     'arguments': {'task': 'Research current state of machine learning in 2024, latest models, and industry applications',
      'additional_args': {'focus': 'current_technology'}}}},
   {'id': 'call_VEbXcWSIFrwpamjaC09CpgYe',
    'type': 'function',
    'function': {'name': 'research_agent',
     'arguments': {'task': 'Research the complete history of AI from 1950-2024 with detailed timeline and key figures',
      'additional_args': {'focus': 'historical_milestones'}}}}],
  'error': None,
  'model_output_message': {'role': 'assistant',
   'content': None,
   'tool_calls': [{'function': {'arguments': {'task': 'Research the complete history of AI from 1950-2024 with detailed timeline and key figures',
       'additional_args': {'focus': 'historical_milestones'}},
      'name': 'research_agent',
      'description': None},
     'id': 'call_VEbXcWSIFrwpamjaC09CpgYe',
     'type': 'function'},
    {'function': {'arguments': {'task': 'Research current state of machine learning in 2024, latest models, and industry applications',
       'additional_args': {'focus': 'current_technology'}},
      'name': 'research_agent',
      'description': None},
     'id': 'call_NZXLHtaKnA9swUJTVI9i0Ytn',
     'type': 'function'}],
   'raw': ModelResponse(id='chatcmpl-CIgR9iYLnAZBl1sw29OKl2WSlCX3v', created=1758569067, model='gpt-4.1-2025-04-14', object='chat.completion', system_fingerprint='fp_daf5fcc80a', choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(function=Function(arguments='{"task": "Research the complete history of AI from 1950-2024 with detailed timeline and key figures", "additional_args": {"focus": "historical_milestones"}}', name='research_agent'), id='call_VEbXcWSIFrwpamjaC09CpgYe', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"task": "Research current state of machine learning in 2024, latest models, and industry applications", "additional_args": {"focus": "current_technology"}}', name='research_agent'), id='call_NZXLHtaKnA9swUJTVI9i0Ytn', type='function')], function_call=None, provider_specific_fields={'refusal': None}, annotations=[]), provider_specific_fields={})], usage=Usage(completion_tokens=101, prompt_tokens=4178, total_tokens=4279, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0, text_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=0, cached_tokens=0, text_tokens=None, image_tokens=None)), service_tier='default'),
   'token_usage': {'input_tokens': 4178,
    'output_tokens': 101,
    'total_tokens': 4279}},
  'model_output': None,
  'code_action': None,
  'observations': "Here is the final answer from your managed agent 'research_agent':\n### 1. Task outcome (short version):\n\nIn 2024, machine learning (ML) is witnessing rapid expansion marked........The field is highly dynamic—changes and updates occur rapidly, so continuous monitoring is advised for long-term strategies.\n\nIf more specific details or tailored industry-by-industry deep dives are required, further targeted research is recommended.",
  'observations_images': None,
  'action_output': None,
  'token_usage': {'input_tokens': 4178,
   'output_tokens': 101,
   'total_tokens': 4279},
  'is_final_answer': False},
 {'step_number': 3,
  'timing': {'start_time': 1758569100.092465,
   'end_time': 1758569136.04459,
   'duration': 35.95212507247925},
  'model_input_messages': [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content=[{'type': 'text', 'text': 'You are an expert assistant who can solve any task using tool calls. ........further targeted research is recommended."}], tool_calls=None, raw=None, token_usage=None)],
  'tool_calls': [{'id': 'call_UUWlsgVWSdfUW5KRaUaBs4gZ',
    'type': 'function',
    'function': {'name': 'research_agent',
     'arguments': {'task': 'Research the complete history of AI from 1950-2024 with detailed timeline and key figures',
      'additional_args': {'focus': 'historical_milestones'}}}}],
  'error': None,
  'model_output_message': {'role': 'assistant',
   'content': None,
   'tool_calls': [{'function': {'arguments': {'task': 'Research the complete history of AI from 1950-2024 with detailed timeline and key figures',
       'additional_args': {'focus': 'historical_milestones'}},
      'name': 'research_agent',
      'description': None},
     'id': 'call_UUWlsgVWSdfUW5KRaUaBs4gZ',
     'type': 'function'}],
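
The mismatch in the log above can also be checked mechanically. The sketch below assumes a list of step dicts with the same keys as the excerpt (step_number, tool_calls, observations), and it assumes that every managed-agent answer in observations starts with the prefix seen in the log. The function name and the sample data are hypothetical; the sample only mirrors the call counts from steps 1 and 2.

```python
ANSWER_PREFIX = "Here is the final answer from your managed agent"


def steps_with_missing_observations(steps):
    """Return (step_number, calls, answers) for steps with fewer answers than calls."""
    flagged = []
    for step in steps:
        calls = len(step.get("tool_calls") or [])
        # Assumption: each returned answer contributes one prefix to the observation text.
        answers = (step.get("observations") or "").count(ANSWER_PREFIX)
        if answers < calls:
            flagged.append((step["step_number"], calls, answers))
    return flagged


# Trimmed stand-in for the log: same keys, placeholder values.
sample_steps = [
    {"task": "...", "task_images": None},
    {"step_number": 1, "tool_calls": [{}, {}, {}],
     "observations": ANSWER_PREFIX + " 'research_agent': ..."},
    {"step_number": 2, "tool_calls": [{}, {}],
     "observations": ANSWER_PREFIX + " 'research_agent': ..."},
]

flagged = steps_with_missing_observations(sample_steps)
print(flagged)
```

On the stand-in data this flags step 1 (three calls, one answer) and step 2 (two calls, one answer), which matches what the log shows.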

Expected behavior
Each call to a managed agent should execute independently, even when called multiple times in parallel. Different task descriptions should produce different results based on their specific inputs.
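
One possible way to get this behavior, sketched under assumptions: build a fresh agent for every call, so no per-run state is shared. ToyAgent and IsolatedAgent are hypothetical names for illustration, not smolagents classes, and this is not a proposed patch for agents.py. The barrier again forces the calls to overlap.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyAgent:
    """Hypothetical agent that keeps per-run state on the instance."""

    def __init__(self):
        self.task = None

    def run(self, task, barrier):
        self.task = task
        barrier.wait()  # overlap all calls before reading state back
        return f"result for: {self.task}"


class IsolatedAgent:
    """Hypothetical wrapper: every call gets its own agent from a factory."""

    def __init__(self, factory):
        self._factory = factory

    def __call__(self, task, barrier):
        return self._factory().run(task, barrier)  # fresh instance per call


def run_parallel(agent, tasks):
    barrier = threading.Barrier(len(tasks))
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(lambda t: agent(t, barrier), tasks))


isolated_results = run_parallel(IsolatedAgent(ToyAgent), ["history", "current", "future"])
print(isolated_results)
```

Each call now returns the result for its own task, in input order. The trade-off is that anything meant to persist across calls, such as accumulated memory, would have to be passed in explicitly.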

Environment:
  • OS: macOS
  • Python version: 3.12.11
  • Package version: latest

Checklist

  • I have searched the existing issues and have not found a similar bug report.
  • I have provided a minimal, reproducible example.
  • I have provided the full traceback of the error.
  • I have provided my environment details.
  • I am willing to work on this issue and submit a pull request. (optional)
