Complex-YOLO: Real-time 3D Object Detection on Point Clouds
Authors: Martin Simon, Stefan Milz, Karl Amende, Horst-Michael Gross
PDF, code-pytorch


Summary

  • The authors present a single-stage, LIDAR-only model for 3D object localization of vehicles on the bird's-eye-view (BEV) plane.

  • The point cloud is converted into a BEV map, which they call a BEV RGB-map - R: density channel, G: height channel, B: intensity channel. The BEV covers the front 80m x 40m of the point cloud. The grid map size is defined by n = 1024 and m = 512, so the grid resolution is about g = 8cm. In summary, the covering area is Ω = {P = [x, y, z]ᵀ | x ∈ [0m, 40m], y ∈ [−40m, 40m], z ∈ [−2m, 1.25m]} (see the BEV-encoding sketch below the list).

  • YOLO-v2 is then applied to the BEV map; for each anchor it predicts the object geometry (6: x, y, w, l, im, re), a confidence score (1), and class probabilities (7). Because the orientation is regressed along with the location, the model is named Complex-YOLO. With 5 anchors, the output map is W x H x C = 32 x 16 x 70 (see the output-decoding sketch below the list).

  • As the object height is not regressed, only the object's position and orientation are detected. The orientation is recovered from the im and re parameters as obj_ori = arctan2(im, re) (see the orientation snippet below the list).

  • The Complex-YOLO loss function is an extension of the YOLO loss function: loss = loss_yolo + loss_euler (see the loss sketch below the list).
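
The following Python sketches illustrate the steps above; all helper names are hypothetical. First, a minimal NumPy encoding of the BEV RGB-map: the density normalization min(1.0, log(N+1)/log 64) follows the paper, while the height shift is my own simplification to keep empty cells at zero.

```python
import numpy as np

def pointcloud_to_bev(points, n=1024, m=512,
                      x_range=(0.0, 40.0), y_range=(-40.0, 40.0),
                      z_range=(-2.0, 1.25)):
    """Hypothetical sketch: encode a LIDAR scan (N x 4 array of
    x, y, z, intensity) as the 3-channel BEV RGB-map. The n = 1024
    cells span the 80m y-extent and the m = 512 cells the 40m
    x-extent, giving g ~= 8cm per cell."""
    x, y, z, i = points.T

    # Keep only points inside the covering area Omega.
    mask = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z <= z_range[1]))
    x, y, z, i = x[mask], y[mask], z[mask], i[mask]

    # Map metric coordinates to grid indices.
    row = ((y - y_range[0]) / (y_range[1] - y_range[0]) * (n - 1)).astype(int)
    col = ((x - x_range[0]) / (x_range[1] - x_range[0]) * (m - 1)).astype(int)

    bev = np.zeros((n, m, 3), dtype=np.float32)
    counts = np.zeros((n, m), dtype=np.int32)
    for r, c, height, intensity in zip(row, col, z, i):
        counts[r, c] += 1
        # G: max height, shifted by +2m (assumption) so empty cells stay 0.
        bev[r, c, 1] = max(bev[r, c, 1], height - z_range[0])
        # B: max intensity.
        bev[r, c, 2] = max(bev[r, c, 2], intensity)
    # R: normalized density, min(1.0, log(N+1)/log 64) as in the paper.
    bev[:, :, 0] = np.minimum(1.0, np.log1p(counts) / np.log(64))
    return bev
```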
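
Next, a sketch of how the 32 x 16 x 70 output map splits into per-anchor predictions. The channel layout and names here are assumptions for illustration, not the repository's actual code; only the arithmetic 5 x (6 + 1 + 7) = 70 comes from the summary.

```python
import numpy as np

N_ANCHORS, N_BOX, N_CLASSES = 5, 7, 7   # (x, y, w, l, im, re, conf) + 7 classes
H, W = 16, 32                           # output grid height and width

def decode_output(raw):
    """Split a raw (70, H, W) feature map into geometry, confidence,
    and class scores for each of the 5 anchors."""
    per_anchor = raw.reshape(N_ANCHORS, N_BOX + N_CLASSES, H, W)
    geometry = per_anchor[:, 0:6]       # x, y, w, l, im, re
    conf     = per_anchor[:, 6]         # objectness score
    classes  = per_anchor[:, 7:]        # class probabilities
    return geometry, conf, classes

raw = np.random.randn(N_ANCHORS * (N_BOX + N_CLASSES), H, W)
geometry, conf, classes = decode_output(raw)
print(geometry.shape, conf.shape, classes.shape)
# (5, 6, 16, 32) (5, 16, 32) (5, 7, 16, 32)
```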
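
The orientation recovery is a direct arctan2 call; regressing the complex pair (im, re) avoids the wrap-around ambiguity of regressing a single angle.

```python
import numpy as np

# Example: a box regressed with im = 0.5, re = -0.5.
# arctan2 resolves the angle over the full (-pi, pi] range,
# which a plain arctan(im / re) could not.
t_im, t_re = 0.5, -0.5
obj_ori = np.arctan2(t_im, t_re)
print(np.degrees(obj_ori))   # 135.0
```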
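
Finally, a hedged PyTorch sketch of the Euler term alone. The YOLO term is the standard multi-part detection loss; the masking and lambda_coord weighting here are simplified assumptions, and all names are mine.

```python
import torch

def euler_loss(pred_im, pred_re, gt_angle, obj_mask, lambda_coord=5.0):
    """Squared error between predicted and ground-truth complex
    orientation components, counted only in cells that hold an object:
    |e^(i*phi) - e^(i*phi_hat)|^2 = (im - im_hat)^2 + (re - re_hat)^2."""
    gt_im, gt_re = torch.sin(gt_angle), torch.cos(gt_angle)
    err = (pred_im - gt_im) ** 2 + (pred_re - gt_re) ** 2
    return lambda_coord * (obj_mask * err).sum()

# Toy usage on a 16 x 32 grid with a single responsible cell:
pred_im, pred_re = torch.zeros(16, 32), torch.ones(16, 32)
gt_angle = torch.full((16, 32), 0.3)
obj_mask = torch.zeros(16, 32)
obj_mask[8, 10] = 1.0
print(euler_loss(pred_im, pred_re, gt_angle, obj_mask))
```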