ObjectNav results on HM3D (test-standard) #3

Open
zhangpingrui opened this issue Dec 17, 2023 · 14 comments

@zhangpingrui

Hello! Sorry to bother you again, but I would like to know whether the reported performance of PEANUT on HM3D was evaluated on 500 episodes or on all 2000 episodes. With my reproduced checkpoint I get comparable performance when I run 500 episodes, but on all 2000 episodes the performance is lower.

@ajzhai
Owner

ajzhai commented Dec 17, 2023

Yeah, the ablation study in our paper is on 500 episodes from the HM3D val set.

@zhangpingrui
Author

Ah, so all the results in Table 1 (including other methods, such as ProcTHOR) were evaluated on 500 episodes?

@ajzhai
Owner

ajzhai commented Dec 18, 2023

No, that evaluation uses the standard test set on EvalAI: here

@zhangpingrui
Author

The results in Table 1 were evaluated online? So the code in nav/collect.py is only for the ablations? I would like to know whether the code includes an evaluation path for the HM3D test-standard split.

@zhangpingrui
Author

I really can't find the test-standard split of HM3D on this website (https://github.com/matterport/habitat-matterport-3dresearch); there are only train and val. Could you please tell me where to download the test split?

@ajzhai
Owner

ajzhai commented Dec 19, 2023

We cannot download the test dataset. It is intentionally kept hidden from the public so that people cannot optimize their agents based on test-set performance, which is common for many ML benchmarks. The only way to evaluate test-set performance is by submitting your Docker image to EvalAI. If you want to do that, a few steps need to be done first (a sketch of these edits follows below):

  • Change nav_exp.sh to run python nav/eval.py (not collect.py)
  • Uncomment this line in the Dockerfile:
    # ADD nav /nav
  • You need to docker build the Docker image, but there is no need for docker run
  • Follow directions here to submit.
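
For concreteness, here is a minimal shell sketch of those edits; the file names (nav_exp.sh, Dockerfile, nav/eval.py) come from this repo, while the image tag peanut-agent is only a placeholder:

    # 1) In nav_exp.sh, replace the collection entrypoint with the evaluation one,
    #    so that the script ends up running:
    #        python nav/eval.py
    # 2) Uncomment the ADD line in the Dockerfile so nav/ is copied into the image:
    sed -i 's|^# *ADD nav /nav|ADD nav /nav|' Dockerfile
    # 3) Build the image; no `docker run` is needed because EvalAI runs the container:
    docker build -t peanut-agent .
    # 4) Submit the built image by following the EvalAI challenge instructions.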

Note that you should ideally only evaluate the test set performance to report metrics for a paper, not for improving your agent. Use the train and val data to optimize your agent.

@zhangpingrui
Author

Thanks for your reply! I'll follow your guidance to try to get the metrics!

@zhangpingrui
Author

Today I tried to do this, but I found that the link to the test-standard submit page was not working. (orz)
[screenshot]
If I want to report a metric to justify my approach, what should I do? Can I just use the metrics on HM3D (val)?

@ajzhai
Owner

ajzhai commented Dec 21, 2023

I think you need an account first. Can you make an EvalAI account and try again?

@zhangpingrui
Author

Yes, I got that page after I logged in. Maybe you can get in because you participated in habitat2022. Can you browse the habitat2021 submit page?

@ajzhai
Owner

ajzhai commented Dec 22, 2023

Yes, you can click on the "Participate" button to register a team.

@zhangpingrui
Author

Nope, I think you can only register a team while the challenge is running. (orz)
[screenshot]

@ajzhai
Owner

ajzhai commented Dec 22, 2023

Hmm, that's weird; I thought it was available at any time. You can try asking the Habitat people about that; they can probably help. If there's no solution, then I guess you'll have to use the val split.

@zhangpingrui
Author

Good, I'll try. Thanks for your help!
