Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrating MemGPT-like Functionality #2937
Integrating MemGPT-like Functionality #2937
Changes from 38 commits
fd5adae
78be88f
efc1c55
9ee63e1
a4e2a18
507a67e
ccffdcc
4c3482f
4b1bf53
de1b94b
5fc4ce4
85a8f4b
2845c2c
7a8299b
a7a4c8a
0428dcc
bd9d14f
22e92d7
d2b1ae1
1afd574
c43ed97
6080071
515e038
44d3c9d
d93f5ee
0fece3f
d59be73
85b715d
7c5606d
d9b3aae
2a81073
140253c
c7a3713
754a9c3
490b192
2162c91
a90edc8
8d7bc30
34caa62
f30a572
13a9f64
d25e19e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We gave up using State, but then we still need to keep the user message that is likely to be a task. That is computed here: https://github.com/OpenDevin/OpenDevin/blob/84a6e90dc2e8b1e096af33f7545fa1969853c7d4/opendevin/controller/state/state.py#L169
Maybe we can set it as non-condensable in the agent, before we get here? Otherwise we're losing relevant information, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, something needs to be done about user messages. It is currently losing relevant information. But only marking user messages as non-condensable is not sufficient ig? Some user messages might not contribute much to the relevant information about the final task.?
Currently
summarize_messages()
hassummarized_actions
&summarized_observations
as args. I was thinking, maybe adding one more arg,summarized_user_messages
might also help. This will summarize all the user messages till now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to not do it for all user messages, but the most likely task. The method linked above,
get_current_user_intent()
, returns only one user message: what is likely to be the current task. If we set that one as non-condensable, it will already improve the behavior. For the SWE-bench, that method will work to return the initial user message, which is definitely the task itself.For interactive user sessions, you're right it's more complicated. The method tries to guess what's the best user message, and returns the last one after a FinishAction, so it's the user message just after a task ended, and a new one started. That's a best-guess, of course, that the new task is in that message, but IMHO it will do for now. Do you think it's likely it won't work?
I see what you mean. In the previous version (PR 2021), we were summarizing the user messages in a summary of their own, as you suggest, when there were no more agent messages, so at a later stage, but again except the
current user intent
. The current intent as detected (or attempted 😅) remained uncondensed as long as possible. Doing something like this might still be a good option, but I don't think it's necessary for this PR.The important part, it seems to me, is that if we miss the initial user message, then we risk that the agent will not have enough information about the task to perform correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to note, a minor thought, 2 or 4 messages would seem a better default, because a lot of the time, they will be
action, obs, action, obs...
. Not all the time, but well, if we will pick something we might want to target a common behavior.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, makes sense to keep it 2 or 4.