Autogen + local LLM = Messy conversations. #471
-
Hello everyone! I hope you're all doing well. Like many others, I've been exploring Autogen, but for cost reasons I've opted for LM Studio as a substitute for ChatGPT. I've tested various models, and with all of them I run into an issue in conversations that use GroupChat and GroupChatManager: the agents seem to struggle to finish their sentences, and one agent's speech overlaps and merges into the next agent's. The same issue is visible in a video shared by another user here (around the 12-minute mark): https://youtu.be/5f7MQDSNxmk?t=736

I'm facing a similar problem, as shown in the image below. Here's the code I'm using, in this gist and at the end of this message: https://gist.github.com/macintoxic/064f478b312e516b24dcffc9f2c3f5ce

I've been troubleshooting this for a while now but haven't found a solution. If anyone has encountered and resolved a similar issue, or has insights into optimizing the conversation flow with Autogen and LM Studio, I would greatly appreciate your assistance. Interestingly, when testing against the official OpenAI API everything works flawlessly; the problem only appears with a local LLM.

```python
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager, ChatCompletion, retrieve_utils

# Default data and functions
def write_file(file_name, file_content):
    with open(file_name, "w") as file1:
        file1.writelines(file_content)

function_map_definition = []
function_map_definition.append({
    'name': 'write_file',
    'description': 'write a file to disk',
    'parameters': {
        'type': 'object',
        'properties': {
            'file_name': {
                'type': 'string',
                'description': 'A valid file name'
            },
            'file_content': {
                'type': 'string',
                'description': 'The content of the file.'
            }
        },
        # 'required' belongs inside 'parameters' in the OpenAI function schema
        'required': ['file_name', 'file_content']
    }
})

config_list = [{
    "api_type": "open_ai",
    "api_key": "NULL",
    'model': 'gpt-3.5-turbo',
    # "api_base": "https://api.openai.com/v1",   # official OpenAI endpoint
    "api_base": "http://localhost:1234/v1",      # LM Studio local server
    'functions': function_map_definition,
}]

llm_config = {
    "request_timeout": 9600,
    "seed": 42,
    "config_list": config_list,
    "temperature": 0.1,
    "max_tokens": 4096,
}

ChatCompletion.start_logging()

user_proxy = UserProxyAgent(
    name="Admin",
    system_message="""A human admin.
Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.
Reply TERMINATE if the task has been solved to full satisfaction; otherwise reply CONTINUE, or explain why the task is not solved yet.
""",
    code_execution_config=False,
    human_input_mode="TERMINATE",
    llm_config=llm_config,
)

engineer = AssistantAgent(
    name="Engineer",
    llm_config=llm_config,
    system_message='''Engineer. You follow an approved plan. You write python/shell or csharp code to solve tasks.
Wrap the code in a code block that specifies the script type.
The user can't modify your code, so do not suggest incomplete code which requires others to modify it.
Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response.
Do not ask others to copy and paste the result.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes.
If the error can't be fixed, or if the task is not solved even after the code executes successfully, analyze the problem, revisit your assumptions,
collect any additional info you need, and think of a different approach to try.
''',
    max_consecutive_auto_reply=2,
    code_execution_config={"work_dir": "coding"},
)

planner = AssistantAgent(
    name="Planner",
    system_message='''Planner. Suggest a plan. Break the task down into smaller steps. Revise the plan based on feedback from the admin and the critic, until admin approval.
The plan may involve an engineer who can write code.
Explain the plan first. Be clear about which step is performed by an engineer. Do not write code; ask an engineer to do it.
''',
    llm_config=llm_config,
    max_consecutive_auto_reply=5,
)
planner.register_function(function_map={"write_file": write_file})

executor = UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineers and report the result. When you receive a csharp or sql file, don't execute it; just write it down.",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=2,
    llm_config=llm_config,
    code_execution_config={"work_dir": "coding"},
)

sr_python = AssistantAgent(
    name='sr_python_engineer',
    system_message='''Python engineer. You follow an approved plan. You write python/shell code to solve tasks.
Wrap the code in a code block that specifies the script type.
The user can't modify your code, so do not suggest incomplete code which requires others to modify it.
Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response.
Do not ask others to copy and paste the result. Check the execution result returned by the executor.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes.
If the error can't be fixed, or if the task is not solved even after the code executes successfully, analyze the problem, revisit your assumptions,
collect any additional info you need, and think of a different approach to try.
''',
    llm_config=llm_config,
    code_execution_config={"work_dir": "coding"},
)
sr_python.register_function(function_map={"write_file": write_file})

sr_dot_net = AssistantAgent(
    name='csharp_engineer',
    system_message='''Csharp engineer. You follow an approved plan. You write csharp code to solve tasks.
Wrap the code in a code block that specifies the script type.
The user can't modify your code, so do not suggest incomplete code which requires others to modify it.
Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response.
Do not ask others to copy and paste the result.
After each file you generate, write it to disk using the write_file function.
''',
    # is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    llm_config=llm_config,
    code_execution_config={"work_dir": "coding"},
    max_consecutive_auto_reply=2,
)
sr_dot_net.register_function(function_map={"write_file": write_file})

sr_sql = AssistantAgent(
    name='sql_engineer',
    system_message='''Sql engineer. You follow an approved plan. You write sql code to solve tasks.
Wrap the code in a code block that specifies the script type.
The user can't modify your code, so do not suggest incomplete code which requires others to modify it.
Don't use a code block if it's not intended to be executed by the executor.
Don't include multiple code blocks in one response.
Do not ask others to copy and paste the result.
Unless specified otherwise, your code targets a postgres database.
After each file you generate, write it to disk using the write_file function.
''',
    llm_config=llm_config,
)
sr_sql.register_function(function_map={"write_file": write_file})

critic = AssistantAgent(
    name="Critic",
    system_message="Critic. Double-check the plan, claims and code from the other agents and provide feedback. Check whether the plan includes adding verifiable info such as a source URL.",
    llm_config=llm_config,
)

groupchat = GroupChat(agents=[user_proxy, planner, sr_dot_net, sr_sql, critic], messages=[], max_round=30)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

task = """
Given the following class, follow these steps:
Write a csharp controller for the CRUD methods.
Write a csharp service for the CRUD methods.
Write a csharp repository for the CRUD methods using Entity Framework.
Write unit tests aiming for 100% code coverage using xunit, moq and bogus.
Write the SQL command to create the table in a postgres database.
The default namespace is BuscaCep.
The class:

//Cep.cs
public class Cep
{
    [Key]
    //max length 8
    public string ZipCode { get; set; } = null!;
    public string TipoLogradouro { get; set; } = null!;
    public string Logradouro { get; set; } = null!;
    public string Complemento { get; set; } = null!;
    public string Local { get; set; } = null!;
    public string Bairro { get; set; } = null!;
    public string Cidade { get; set; } = null!;
    public string CodCidade { get; set; } = null!;
    public string Uf { get; set; } = null!;
    public string Estado { get; set; } = null!;
    public string CodEstado { get; set; } = null!;
}

Planner, break this plan down for best execution. You have a csharp engineer, a sql engineer and a critic that verifies what was done.
Please use the function write_file to write the files to disk, and wait for the planner to finish planning before continuing.
"""

try:
    # proxy_agent.initiate_chat(assistant, message=task3)
    user_proxy.initiate_chat(
        manager,
        message=task,
        clear_history=True,
    )
except Exception as ex:
    print(50 * '*')
    print(ex)
```
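One way to narrow this down is to check whether the local server is cutting replies off at a token limit before AutoGen ever sees a complete message. Below is a minimal sketch of that check, assuming the OpenAI-compatible response shape that LM Studio emulates; the payloads are made up for illustration, not real server output.

```python
# Sketch: detect replies that were cut off by a token limit, which is one
# plausible cause of agents "failing to finish their sentences".
# Assumes the OpenAI-compatible chat completion response shape.

def is_truncated(response: dict) -> bool:
    """Return True if the first choice stopped because of the token limit."""
    choice = response["choices"][0]
    return choice.get("finish_reason") == "length"

def merge_continuation(first: dict, continuation: dict) -> str:
    """Join a truncated reply with its continuation into one message."""
    head = first["choices"][0]["message"]["content"]
    tail = continuation["choices"][0]["message"]["content"]
    return head + tail

# Illustrative payloads only:
truncated = {"choices": [{"finish_reason": "length",
                          "message": {"content": "public class CepController "}}]}
rest = {"choices": [{"finish_reason": "stop",
                     "message": {"content": ": ControllerBase { }"}}]}

print(is_truncated(truncated))             # True
print(merge_continuation(truncated, rest))
```

If the local server reports `finish_reason: "length"` where the OpenAI API reports `"stop"`, that would point at a token-limit setting in LM Studio rather than at AutoGen itself.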
Replies: 4 comments 6 replies
-
Hi @macintoxic. Are you using gpt-3.5-turbo for all these experiments? If so, the difference must be in LM Studio or in how it's being called. I'm not familiar with LM Studio myself, but there are discussions about it on our Discord channel, like this one.
-
I am facing the same issue, and the only explanation I have found is that LM Studio is "pausing" at around 199 tokens and then continuing with the rest. AutoGen then attaches the remainder of the text coming from LM Studio to the next agent's turn. I nevertheless haven't found a workaround for this.
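If LM Studio really does flush a reply in two chunks like that, a crude guard on the calling side is to check whether a message looks finished before the floor is handed to the next agent. A minimal sketch; the heuristic and the punctuation set are my own assumptions, not anything from AutoGen or LM Studio:

```python
# Sketch of a heuristic for spotting messages that were cut mid-sentence.
# Purely illustrative; tune the punctuation set for your own output style.

def looks_cut_off(message: str) -> bool:
    """True if a message ends without terminal punctuation or a closed fence."""
    text = message.rstrip()
    if not text:
        return False
    if text.endswith("```"):          # a closed code block is a clean ending
        return False
    return text[-1] not in ".!?:\"'`)"

print(looks_cut_off("The plan has three steps."))   # False
print(looks_cut_off("and then the repository sh"))  # True
```

A message flagged this way could be held back and re-requested (or merged with the next chunk) instead of being treated as a complete turn.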
-
Ok, reality check here: the LM tools available today are nothing more than simple pattern-based predictive algorithms. The patterns they were trained on were sourced from sites like Stack Overflow, Quora, etc. So stop yourself here a moment and do two things.

1. Read the text again as though you were encountering it on Stack Overflow or in another user's GitHub repository. Is this a question you'd answer? If you were searching for an answer, would you expect this question to attract high-quality answers? How do you anticipate Stack Overflow or Reddit posters would respond to this wording? That's what will have determined how it shapes the patterns that emerge in the prediction.
2. The paradigm-shifting component in this tech is 'attention', and you are absolutely wasting it on negatives here. Referring back to the first item, we're basically back in the 'prompt crafting' business, where the goal is to find the right combination of seed and prelude patterns to guide the algorithm towards the subset of training documents that provides the reasoning and knowledge required to answer your question. In ChatDev, for example, a lot of the coder prompts list things the coder should not do, such as write methods consisting of nothing but 'pass'. Should those elements end up spread across attention boundaries ('should' and 'should not' are fairly weak signals to start with), then 1000 tokens later what attention sees amounts to 'methods ... pass'. Guess what ChatDev likes to do when you aren't asking it to verbatim recreate an example it was explicitly trained on?
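Following that argument, the negation-heavy lines in the system messages above could be rephrased as positive instructions. A hypothetical before/after, purely to illustrate the idea (these strings are mine, not from any framework):

```python
# Illustrative only: rephrasing a negation-heavy instruction positively,
# per the argument above that 'do not' is a weak signal for attention.

negative_prompt = (
    "Don't include multiple code blocks in one response. "
    "Do not suggest incomplete code which requires others to modify it."
)

positive_prompt = (
    "Reply with exactly one code block per response. "
    "Always provide complete, self-contained code that runs as-is."
)

# The positive version states the desired behaviour directly,
# so a partial read still conveys the intended action.
print(positive_prompt)
```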
-
I've got it working really well with two local LLMs; I'd been at it for three weeks straight trying to get it to work. I pieced parts together from three different sources, and it took lots of trial and error. Both the models and the chat conversation have to be right for it to work. Try a different model and see if you get better results.