
Custom object detection fails even if using a training image as input #194

Open · andrei91ro opened this issue Nov 27, 2021 · 7 comments

@andrei91ro

Hello,

First of all, congratulations on your hard work on DOPE and NDDS!

I used NDDS to generate a dataset of 20K images by following the instructions from the wiki.
The dataset looks good even when loaded using nvdu_viz:
[screenshot: dataset visualized with nvdu_viz]

Since I don't yet have the real object assembled (a Crazyflie drone with a QR code beneath), I tried pointing the webcam towards my monitor, where one of the training images was displayed. Unfortunately, nothing happened.

[screenshot: webcam pointed at the monitor, no detection]

Afterwards, I tried using the ROS image_publisher node to stream a single image as a webcam-like stream, and thus avoid the occasional noise caused by the monitor refresh rate.
I also published the camera_info topic along with image_raw.
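Roughly, the publisher side looks like this (a minimal rospy sketch; the topic names, file path, and intrinsics are placeholders, not my exact setup):

```python
#!/usr/bin/env python
# Minimal sketch: publish one static image plus a matching CameraInfo.
# Topic names, file path, and intrinsics below are placeholders.
import cv2
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import CameraInfo, Image

rospy.init_node("static_image_publisher")
img_pub = rospy.Publisher("/dope/webcam/image_raw", Image, queue_size=1)
info_pub = rospy.Publisher("/dope/webcam/camera_info", CameraInfo, queue_size=1)

frame = cv2.imread("training_image.png")          # placeholder path
msg = CvBridge().cv2_to_imgmsg(frame, encoding="bgr8")

info = CameraInfo()
info.width, info.height = frame.shape[1], frame.shape[0]
fx = fy = 768.0                                   # placeholder focal lengths
cx, cy = info.width / 2.0, info.height / 2.0
info.K = [fx, 0, cx, 0, fy, cy, 0, 0, 1]
info.P = [fx, 0, cx, 0, 0, fy, cy, 0, 0, 0, 1, 0]

rate = rospy.Rate(10)                             # ~10 Hz, like a slow webcam
while not rospy.is_shutdown():
    stamp = rospy.Time.now()
    msg.header.stamp = stamp
    info.header.stamp = stamp
    img_pub.publish(msg)
    info_pub.publish(info)
    rate.sleep()
```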

However, even when provided with a clear static image, the model does not detect the custom object:
[screenshot: static training image as input, still no detection]

At this point, I don't know the cause of my problems, so any expert advice on this matter is highly appreciated :)

My possible theories are:

  1. The object is symmetrical, and that is why it is not detected. I did read issue "DOPE detection cube not detected on custom symmetrical object (bowl) with symmetrical texture" #37, but my object is not that round and the camera orientation is limited.
  2. There is some mismatch in camera intrinsics between the NDDS training camera, my Logitech C922 webcam, and the synthetic image published by ros image_publisher (see the sketch after this list).
  3. The dataset is insufficiently varied.
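For theory 2, a quick sanity check is to compare the fx/fy/cx/cy that NDDS recorded against the webcam's calibration. A minimal sketch, assuming the usual layout of the _camera_settings.json that NDDS writes next to the images (adjust the field names if your export differs):

```python
# Minimal sketch: read the intrinsics NDDS stored with the dataset so they
# can be compared against the webcam calibration. Field names assume the
# usual NDDS _camera_settings.json layout; adjust if your export differs.
import json

with open("_camera_settings.json") as f:
    settings = json.load(f)

cam = settings["camera_settings"][0]
intr = cam["intrinsic_settings"]
size = cam["captured_image_size"]

print("NDDS camera: %dx%d" % (size["width"], size["height"]))
print("fx=%.2f fy=%.2f cx=%.2f cy=%.2f" %
      (intr["fx"], intr["fy"], intr["cx"], intr["cy"]))
```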

If anyone wants to try out the dataset, I can provide it.
Since this work is part of a research project, I intend to write a white paper on the exact training steps and the hiccups along the way, and publish it on GitHub along with all of the required training data.

@sejmoonwei

I ran into the same problems; maybe we can discuss the following:

  1. Do you use your camera's intrinsics instead of the intrinsics from the camera_settings.json in the NDDS dataset? I got mine from my D435i camera and used them for inference. It looks better, but still not good.

  2. How do you generate the different poses of your object? I use random rotation/movement, and I also randomize the background and lighting in the UE4 scene.

  3. I found that training on a symmetrical object causes the loss to plateau at a high value, but DOPE will still give a feasible-looking result at inference time.

  4. As the earlier issue said, nvisii works better than NDDS. I'm now working with nvisii to see if it helps; a minimal render sketch is below.
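For anyone curious, the starting point is small. A minimal sketch based on the published nvisii examples (pip install nvisii; the exact API may differ between versions), just one object rendered to a file, with no domain randomization yet:

```python
# Minimal sketch after the published nvisii examples: render one object to a
# file. No domain randomization yet; that is the part that matters for DOPE.
import nvisii

nvisii.initialize(headless=True)

camera = nvisii.entity.create(
    name="camera",
    transform=nvisii.transform.create("camera_tf"),
    camera=nvisii.camera.create_from_fov(
        name="camera_cam", field_of_view=0.785, aspect=640.0 / 480.0),
)
camera.get_transform().look_at(at=(0, 0, 0), up=(0, 0, 1), eye=(1, 1, 1))
nvisii.set_camera_entity(camera)
nvisii.set_dome_light_intensity(1.0)

obj = nvisii.entity.create(
    name="obj",
    mesh=nvisii.mesh.create_teapotahedron("obj_mesh"),
    transform=nvisii.transform.create("obj_tf"),
    material=nvisii.material.create("obj_mat"),
)

nvisii.render_to_file(width=640, height=480,
                      samples_per_pixel=64, file_path="out.png")
nvisii.deinitialize()
```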

@andrei91ro

Hello @sejmoonwei, sorry for the late response.

  1. Unfortunately, I did not use a camera intrinsics file, either for NDDS or for the live prediction.
    Of course, I will have to try that out too.

  2. Exactly the same.

  3. I will try that out, possibly limiting the rotation angle of the camera around the object.

  4. I will try it out, as I am indeed still wrapping my head around Unreal Engine.
    As an alternative, I am currently trying out BlenderProc for generating synthetic data using Python and Blender.
    The only downside is that you have to handle the format conversion yourself, as it does not output the native format used by DOPE; a rough conversion sketch follows below.
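For the conversion, the gist is writing one NDDS-style JSON per frame. A rough sketch of the writer side, with the field names I believe DOPE's trainer reads (location, quaternion_xyzw, projected_cuboid, projected_cuboid_centroid); all values shown are placeholders, and in practice they come from the BlenderProc camera and object poses:

```python
# Rough sketch of the conversion target: one NDDS-style JSON per frame with
# the fields I believe DOPE's trainer reads. All values below are placeholders;
# in practice they come from the BlenderProc camera and object poses.
import json

def write_dope_frame(path, class_name, location, quat_xyzw,
                     projected_cuboid, projected_centroid):
    """projected_cuboid: the 8 projected 3D bounding-box corners in pixels;
    projected_centroid: the projected cuboid center in pixels."""
    frame = {
        "objects": [{
            "class": class_name,
            "location": location,              # object pose in the camera frame
            "quaternion_xyzw": quat_xyzw,
            "projected_cuboid": projected_cuboid,
            "projected_cuboid_centroid": projected_centroid,
        }]
    }
    with open(path, "w") as f:
        json.dump(frame, f, indent=2)

# Placeholder call just to show the shapes involved.
write_dope_frame(
    "000001.json", "crazyflie",
    location=[0.0, 0.0, 60.0],
    quat_xyzw=[0.0, 0.0, 0.0, 1.0],
    projected_cuboid=[[310, 230], [330, 230], [330, 250], [310, 250],
                      [312, 232], [328, 232], [328, 248], [312, 248]],
    projected_centroid=[320, 240],
)
```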

@sejmoonwei

Well, do you get correct results on your training images, or does detection fail only on the real scene? I recommend not using ROS yet; a plain inference script will simplify this. I can share one if you need.

@andrei91ro

No, not even on training images. I would be grateful if you could provide such a script, as it would allow me to iterate faster in the training/testing process.
I will also try using a single AprilTag in the square underneath the drone instead of the current four.
From other tests using YOLOv4, I noticed that the detector is quite undecided when presented with fiducial tags, which differ from one another far less than classes like human and dog.

@sejmoonwei

This is the one I currently use for inference. It shows the belief maps and the cuboid of the object, and it takes a single image as input. Please resize your image to 640x480 before input, or it may raise a runtime error (a resize sketch is below).
https://github.com/sejmoonwei/inference_on_img/blob/main/belief_maps.py
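In case it saves someone a step, the resize I mean is just this (a minimal OpenCV sketch; the file names are placeholders):

```python
# Minimal sketch: force the input to 640x480 before running the script,
# otherwise the network input size will not match. File names are placeholders.
import cv2

img = cv2.imread("input.png")
img = cv2.resize(img, (640, 480))  # OpenCV takes (width, height) order
cv2.imwrite("input_640x480.png", img)
```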

@andrei91ro

Thank you for the link! I will try it sometime during the winter holidays, as for now I am tied up in bureaucratic work.
To return the favor, I can recommend my personal backup solution, which is an integration of a ZED camera and a custom-trained YOLOv4 (yolo_integration) bounding-box detector (Custom_3D_detection_and_tracking).

The camera SDK is capable of estimating the 3D position of the detected object, and there is a Python API available for further processing of real-time data.

Of course, DOPE is the better alternative, as it does not depend on RGB-D images, but the ZED camera is also an option.
I think others have achieved the same thing using Intel RGB-D cameras. A minimal sketch of the ZED part follows below.
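For context, the part of the ZED Python API I rely on is roughly this (a minimal sketch, from memory of pyzed 3.x; the u, v pixel coordinates are placeholders that would come from the YOLO bounding-box center):

```python
# Minimal sketch (pyzed 3.x, from memory): look up the 3D point behind a
# pixel, e.g. the center of a YOLO bounding box. u, v are placeholders.
import pyzed.sl as sl

zed = sl.Camera()
init = sl.InitParameters()
init.coordinate_units = sl.UNIT.METER
if zed.open(init) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("could not open ZED camera")

point_cloud = sl.Mat()
if zed.grab(sl.RuntimeParameters()) == sl.ERROR_CODE.SUCCESS:
    zed.retrieve_measure(point_cloud, sl.MEASURE.XYZRGBA)
    u, v = 320, 240  # placeholder: YOLO bbox center in pixels
    err, xyz = point_cloud.get_value(u, v)
    if err == sl.ERROR_CODE.SUCCESS:
        print("3D position (m): x=%.3f y=%.3f z=%.3f"
              % (xyz[0], xyz[1], xyz[2]))

zed.close()
```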

@sejmoonwei

I'll read them, thanks. Let's keep in touch on this (custom object detection, DOPE).
