[Bug]: Reproduce results in swebench #5924

Open
1 task done
Hodge931 opened this issue Dec 30, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@Hodge931

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Describe the bug and reproduction steps

Is following the README at https://github.com/All-Hands-AI/OpenHands/tree/main/evaluation/benchmarks/swe_bench sufficient to reproduce the SWE-bench results (53.8% on the Verified set)?

Thanks so much!

OpenHands Installation

Docker command in README

OpenHands Version

No response

Operating System

None

Logs, Errors, Screenshots, and Additional Context

No response

Hodge931 added the bug label Dec 30, 2024
@neubig
Contributor

neubig commented Dec 30, 2024

Hi @Hodge931, yes, that should be sufficient, although it would be best to use the exact commit we used when we generated those results (maybe @xingyaoww can provide that).

@fishmingyu

I just reported a bug. Could you take a look at it, @neubig?

@fishmingyu

I just reported another bug in a new issue.


3 participants