MischaPanch commented Jun 5, 2025

This PR solves task 13974 from SWE-bench Verified with Claude Opus 4 and Serena. Most of the diff stems from the onboarding step and can be ignored by the reader; it is included for completeness and transparency.

Onboarding conversation:
https://claude.ai/share/1dfa8555-1218-4888-8df9-739137f37448

Task solution conversation:
https://claude.ai/share/5cc6565d-c645-47dd-8c07-350c28353452

Displayed capabilities:

Opus and Serena form a dream team when it comes to efficient operation. In particular, the solution conversation displays the following behavior:

  1. Intelligent, token-frugal use of symbolic reads: Claude first finds the file, then looks at its symbol overview, then reads the body of the relevant symbol, then reads the class's symbol overview (not its entire body). Claude is great at using find_symbol and get_symbol_overview.
  2. Surgical insertion of a new symbol, maximally efficient in terms of output tokens.
  3. Surgical edit with replace_regex and no follow-up reads. This takes very few output tokens, in contrast to other approaches.
  4. Frugal reading of non-symbolic parts such as imports, using read_file on the top 50 lines.
  5. Insertion of new tests using insert_after_symbol, again with perfect efficiency in terms of output tokens.
  6. Domain-specific thought tools helped Claude stay on track and focus on the user's instructions (here, e.g., not to run tests, since I didn't want to install the dependencies).
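
To illustrate why a regex-scoped edit (point 3) costs so few output tokens, here is a minimal Python sketch. It is not Serena's implementation: `replace_regex_once` is a hypothetical helper written only for this illustration, the sample source is made up, and character counts are used as a rough stand-in for output tokens.

```python
import re


def replace_regex_once(text: str, pattern: str, repl: str) -> str:
    """Hypothetical helper: replace exactly one match of `pattern` in `text`.

    Raises ValueError if the pattern matches zero or several times, so an
    ambiguous edit fails loudly instead of silently changing the wrong place.
    """
    new_text, count = re.subn(pattern, repl, text, flags=re.DOTALL)
    if count != 1:
        raise ValueError(f"expected exactly 1 match, found {count}")
    return new_text


# Made-up file content, standing in for a real source file.
source = (
    "import math\n"
    "\n"
    "def area(r):\n"
    "    return 3.14 * r * r\n"
    "\n"
    "def perimeter(r):\n"
    "    return 2 * 3.14 * r\n"
)

# The edit only has to name the span to change and its replacement.
pattern = r"return 3\.14 \* r \* r"
repl = "return math.pi * r * r"
patched = replace_regex_once(source, pattern, repl)

# Rough proxy (an assumption): characters emitted for the edit
# versus characters emitted when rewriting the whole file.
edit_cost = len(pattern) + len(repl)
rewrite_cost = len(patched)
print(f"regex edit: {edit_cost} chars, full rewrite: {rewrite_cost} chars")
```

The gap grows with file size, since the cost of the regex edit depends only on the changed span. Requiring exactly one match is a design choice of this sketch; it is one way to make an edit safe enough to skip a follow-up read.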
