
About ScanNet #44

Open
LifeBeyondExpectations opened this issue Feb 25, 2021 · 6 comments
@LifeBeyondExpectations

Thanks for sharing the wonderful work.

I have a question about the usage of the scenes in the ScanNet dataset.
While ScanNet itself provides train/val/test splits, it seems that this paper uses specific scenes, such as the one below.

/local-scratch/scannet/scene0688_00/rgb/frame-000346.color.jpg

I want to double-check whether I have understood the authors' intentions correctly.


LifeBeyondExpectations commented Feb 25, 2021

I have one more question about ScanNet.
Here are some example images that the authors used for evaluation, as in the path below:

/local-scratch/scannet/scene0688_00/rgb/frame-000346.color.jpg

[screenshot: the two example frames, captured 2021-02-25 16:51:49]

I cannot find how the authors extracted the image sequences from the ScanNet dataset.
Did you extract all of the images from the *.sens files without skipping any frames?

For instance, in another paper,
https://github.com/ardaduz/deep-video-mvs/blob/043f25703e5135661a62c9d85f994ecd4ebf1dd0/dataset/scannet-export/scannet-export.py#L226
they explicitly set this hyperparameter to frame_skip = 1.

So I wonder how the authors extracted the images and depths from the original ScanNet v2.
To me, the two images above show only a small amount of relative camera motion.

@LifeBeyondExpectations

I have one more question.
As the authors described in the paper,
"DSO fails to initialize or loses tracking on some of the test sequences so we only evaluate on sequences where DSO is successful."

  • Can you also provide the samples on which DSO succeeds?
  • Do you measure the depth accuracy only within the subset of test sequences where DSO succeeds?

Currently, I cannot reproduce the reported results (Table 2 of the main paper).


zachteed commented Mar 3, 2021

Hi, I used the split from the BA-Net paper in order to compare to BA-Net. The images/depths/poses were extracted from the .sens file with frame_skip = 1.
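
For anyone reproducing this step, here is a minimal sketch of what that extraction looks like, assuming the SensorData reader from the official ScanNet repo (ScanNet/SensReader/python); the scene name and paths are illustrative, and the export signatures should be checked against the version of the reader you have:

import os
from SensorData import SensorData  # official ScanNet SensReader class

scene = 'scene0688_00'  # illustrative scene
sens_file = '/local-scratch/scannet/{0}/{0}.sens'.format(scene)
out_dir = '/local-scratch/scannet/{0}'.format(scene)

sd = SensorData(sens_file)  # parses the .sens stream (color, depth, poses)
# frame_skip=1 exports every frame, which is why consecutive test frames
# can show very little relative camera motion
sd.export_color_images(os.path.join(out_dir, 'rgb'), frame_skip=1)
sd.export_depth_images(os.path.join(out_dir, 'depth'), frame_skip=1)
sd.export_poses(os.path.join(out_dir, 'pose'), frame_skip=1)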

I evaluated the depth/pose accuracy of DeepV2D on all test samples. For DSO, I only reported the results on the videos where DSO succeeds.
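
For anyone checking numbers against Table 2, the depth entries in such comparisons are usually the standard monocular-depth error metrics. A minimal sketch of the common definitions (not necessarily the exact evaluation code used here; pred, gt, and the validity threshold are assumptions):

import numpy as np

def depth_metrics(pred, gt, min_depth=0.1):
    # evaluate only where ground-truth depth is valid
    valid = gt > min_depth
    pred, gt = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(pred - gt) / gt)   # absolute relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))   # root mean squared error
    ratio = np.maximum(pred / gt, gt / pred)
    delta = np.mean(ratio < 1.25)               # threshold accuracy, delta < 1.25
    return abs_rel, rmse, delta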

Which results in Table 2 are you having trouble reproducing, and what results are you getting? Are you using the pretrained model or running the training script?

@LifeBeyondExpectations

I think I am currently stuck on the subset where DSO succeeds.
Can you provide the specific image indices for which DSO succeeds?
I cannot reproduce the same number of successful cases on the ScanNet dataset.


zachteed commented Mar 4, 2021

I will post a .txt file with the cases where DSO succeeds later today or tomorrow. I have the logs from this experiment archived, but I will need to parse them to give you the exact cases.

The evaluation used by BA-Net is performed on pairs of frames, but by default DSO only outputs the poses of keyframes. I needed to use a modified version of DSO to ensure that poses for all frames were recorded. I ran DSO on the full sequences and recorded camera poses for all frames; missing poses indicated a tracking failure, so I only evaluated pairs of frames for which DSO produced results.


zachteed commented Mar 6, 2021

These are the poses I got from running DSO:
dso_poses.zip

You can use this script to parse the results; you should find that DSO has poses for 1665 of the 2000 pairs:

import pickle
import numpy as np
import re
import os

# scannet_test.txt lists frames in blocks of 4 lines; the first two lines of
# each block are the two RGB frames of an evaluation pair
test_frames = np.loadtxt('scannet_test.txt', dtype=np.unicode_)  # use dtype=str on NumPy >= 2.0
test_data = []

for i in range(0, len(test_frames), 4):
    test_frame_1 = str(test_frames[i]).split('/')
    test_frame_2 = str(test_frames[i+1]).split('/')
    scan = test_frame_1[3]  # e.g. 'scene0688_00' in /local-scratch/scannet/<scan>/rgb/...

    # extract the integer frame index from 'frame-XXXXXX.color.jpg'
    imageid_1 = int(re.findall(r'frame-(.+?).color.jpg', test_frame_1[-1])[0])
    imageid_2 = int(re.findall(r'frame-(.+?).color.jpg', test_frame_2[-1])[0])
    test_data.append((scan, imageid_1, imageid_2))

# a pair counts as a DSO success only if DSO recorded poses for both frames
count = 0
for scan, i1, i2 in test_data:
    pose_path = "tmp/" + scan + ".pickle"

    if os.path.isfile(pose_path):
        with open(pose_path, 'rb') as f:
            poses = pickle.load(f)
        if i1 in poses and i2 in poses:
            count += 1

print(count, len(test_data))
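
If you need the subset itself rather than just the count, a small extension of the same loop (same assumptions about the tmp/ directory and the pickle layout) would collect the successful pairs:

# hypothetical follow-up: gather the DSO-success pairs so the depth/pose
# evaluation can be restricted to exactly this subset
dso_success = []
for scan, i1, i2 in test_data:
    pose_path = os.path.join('tmp', scan + '.pickle')
    if os.path.isfile(pose_path):
        with open(pose_path, 'rb') as f:
            poses = pickle.load(f)
        if i1 in poses and i2 in poses:
            dso_success.append((scan, i1, i2))

print(len(dso_success))  # should print 1665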
