ObjectNav results on HM3D (test-standard) #3

Open
zhangpingrui opened this issue Dec 17, 2023 · 14 comments

@zhangpingrui

Hello! Sorry to bother you again, but I would like to know whether the reported performance of PEANUT on HM3D was evaluated on 500 episodes or on all 2000 episodes. With my reproduced checkpoint I get comparable performance when I run 500 episodes, but on all 2000 episodes the performance is lower.

@ajzhai
Owner

ajzhai commented Dec 17, 2023

Yeah, the ablation study in our paper is on 500 episodes from the HM3D val set.

@zhangpingrui
Author

Ah, so all the results in Table 1 (including other methods, such as ProcTHOR) were evaluated on 500 episodes?

@ajzhai
Owner

ajzhai commented Dec 18, 2023

No, that evaluation uses the standard test set on EvalAI: here

@zhangpingrui
Author

The results in Table 1 were evaluated online? So the code in nav/collect.py is only for the ablations? I would like to know whether the code includes an evaluation path for the HM3D test-standard split.

@zhangpingrui
Author

I really can't find the test-standard split of HM3D on this website (https://github.com/matterport/habitat-matterport-3dresearch); there are only train and val. Could you please tell me where to download the test split?

@ajzhai
Owner

ajzhai commented Dec 19, 2023

We cannot download the test dataset. It is intentionally kept hidden from the public so that people cannot optimize their agents based on test-set performance, which is common for many ML benchmarks. The only way to evaluate test-set performance is by submitting your Docker image to EvalAI. If you want to do that, a few steps need to be done first (a sketch of these edits follows below):

  • Change nav_exp.sh to run python nav/eval.py (not collect.py)
  • Uncomment this line in the Dockerfile:
    # ADD nav /nav
  • You need to docker build the Docker image, but there is no need for docker run
  • Follow directions here to submit.
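
For concreteness, here is a minimal shell sketch of those edits; the file names (nav_exp.sh, Dockerfile, nav/eval.py) come from this repo, while the image tag peanut-agent is only a placeholder:

    # 1) In nav_exp.sh, replace the collection entrypoint with the evaluation one,
    #    so that the script ends up running:
    #        python nav/eval.py
    # 2) Uncomment the ADD line in the Dockerfile so nav/ is copied into the image:
    sed -i 's|^# *ADD nav /nav|ADD nav /nav|' Dockerfile
    # 3) Build the image; no `docker run` is needed because EvalAI runs the container:
    docker build -t peanut-agent .
    # 4) Submit the built image by following the EvalAI challenge instructions.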

Note that you should ideally only evaluate the test set performance to report metrics for a paper, not for improving your agent. Use the train and val data to optimize your agent.

@zhangpingrui
Author

Thanks for your reply! I'll follow your guidance to try to get the metrics!

@zhangpingrui
Author

Today I tried to do this, but I found that the link to the test-standard submit page was not working. (orz)
[screenshot]
If I want to report a metric to justify my approach, what should I do? Can I just use the metrics on HM3D (val)?

@ajzhai
Owner

ajzhai commented Dec 21, 2023

I think you need an account first. Can you make an EvalAI account and try again?

@zhangpingrui
Author

Yes, I got that page after I logged in. Maybe you can get in because you participated in habitat2022. Can you browse the habitat2021 submit page?

@ajzhai
Owner

ajzhai commented Dec 22, 2023

Yes, you can click on the "Participate" button to register a team.

@zhangpingrui
Author

Nope, I think you can only register a team while the challenge is running. (orz)
[screenshot]

@ajzhai
Owner

ajzhai commented Dec 22, 2023

Hmm, that's weird; I thought it was available at any time. You can try asking the Habitat people about that; they can probably help. If there's no solution, then I guess you'll have to use the val split.

@zhangpingrui
Author

Good, I'll try. Thanks for your help!
