Integration of Depth Anything V2, Omni3D, and YOLOPv2 for robust Bird's Eye View space generation from monocular video.
Paper: [Coming Soon - Title TBD]
ArXiv: [Link will be added upon publication]
Project Page: [Link will be added]
This project combines state-of-the-art computer vision models to create accurate bird's eye view representations from standard camera footage. The pipeline integrates:
- Depth Anything V2: Monocular depth estimation for metric 3D scene understanding
- Omni3D: 3D object detection and localization in the wild
- YOLOPv2: Multi-task panoptic driving perception (lane detection, drivable area segmentation, object detection)
- Real-time and batch video processing
- GPS data integration for georeferenced outputs
- Frame extraction and video reconstruction
- BEV projection with depth-aware transformations
- Overlay generation combining multiple perception modalities
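The depth-aware BEV projection can be illustrated with a minimal sketch: back-project each pixel through pinhole intrinsics and rasterize it onto a top-down grid. The intrinsics (`fx`, `fy`, `cx`, `cy`), cell size, and grid dimensions below are illustrative assumptions, not the project's calibration, and this stand-in is far simpler than the pipeline's actual projection utilities.

```python
import numpy as np

def pixels_to_bev(u, v, depth, fx, fy, cx, cy, cell_size=0.1, grid=(200, 200)):
    """Back-project pixels with metric depth into a top-down occupancy grid.

    u, v   : pixel coordinates (arrays); v (image row) maps to height,
             which a top-down grid discards
    depth  : metric depth per pixel, in meters
    Returns a (H, W) uint8 BEV grid; camera at bottom-center, x right, z forward.
    """
    # Pinhole back-projection: X = (u - cx) * Z / fx, with Z = depth
    X = (u - cx) * depth / fx
    Z = depth
    H, W = grid
    col = np.round(X / cell_size + W / 2).astype(int)   # lateral offset -> column
    row = np.round(H - 1 - Z / cell_size).astype(int)   # forward range -> row
    bev = np.zeros(grid, dtype=np.uint8)
    keep = (col >= 0) & (col < W) & (row >= 0) & (row < H)
    bev[row[keep], col[keep]] = 255
    return bev

# Example: a pixel at the principal point, 5 m away, lands on the center column
u = np.array([320.0]); v = np.array([240.0]); d = np.array([5.0])
bev = pixels_to_bev(u, v, d, fx=700.0, fy=700.0, cx=320.0, cy=240.0)
```

With a 0.1 m cell size, the 5 m point falls 50 rows above the bottom edge of the grid, centered laterally.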
BEV/
├── OmniLineDepth.py # Main inference pipeline
├── realTimeOmni.py # Real-time BEV generation
│
├── DepthAnythingV2/ # Depth Anything V2 model integration
├── omni3d/ # Omni3D 3D object detection
├── YOLOPv2/ # YOLOPv2 driving perception
├── IMG_GPS/ # GPS data processing
├── overlay/ # Overlay generation
├── projection/ # Geometric projection utilities
│
└── demoImages/ # Demo images for inference
We recommend using Python 3.9 for stability and dependency compatibility:
# Clone repository
git clone https://github.com/fantasybarry/BEV.git
cd BEV
# Install dependencies
pip install -r requirements.txt

Note: Update the repository URL once published.
# Run an example inference demo on our demo Images
python OmniLineDepth.py \
--config-file cubercnn://outdoor/cubercnn_DLA34_FPN.yaml \
--input-folder "demoGPS" \
--source "demoGPS" \
--threshold 0.50 \
--launch-app \
MODEL.WEIGHTS cubercnn://outdoor/cubercnn_DLA34_FPN.pth \
OUTPUT_DIR output/demo

- demoGPS/ - demo images
- overlay/gps/gps_data_sample.json - demo GPS coordinates
The pipeline generates:
- Bird's eye view projections
- Depth maps
- 3D object detections
- Lane and drivable area segmentation
- GPS-referenced overlays
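Producing GPS-referenced overlays requires converting raw latitude/longitude into local metric offsets. Below is a minimal sketch of one common approach (an equirectangular approximation around a reference point, which is an assumption here, not necessarily this project's method); it would pair with coordinates like those in `overlay/gps/gps_data_sample.json`, whose exact format is not shown here.

```python
import math

def gps_to_local_xy(lat, lon, ref_lat, ref_lon):
    """Convert WGS-84 lat/lon to local east/north offsets in meters
    relative to a reference point. The equirectangular approximation
    is adequate over short driving distances."""
    R = 6371000.0  # mean Earth radius in meters
    # Longitude degrees shrink with latitude, hence the cos() factor
    east = math.radians(lon - ref_lon) * R * math.cos(math.radians(ref_lat))
    north = math.radians(lat - ref_lat) * R
    return east, north

# Example: one arc-second of latitude is roughly 31 m of northing
e, n = gps_to_local_xy(40.0002778, -83.0, 40.0, -83.0)
```

For higher accuracy or longer trajectories, a proper projection library (e.g. a UTM conversion) would be the usual choice.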
This project is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) due to the licensing requirements of integrated components. See LICENSE for full details.
If you find our work useful for your research, please kindly cite our paper:
@article{tan2025bev,
title={[Monocular 3D Perception and Lane-Aware Bird’s-Eye-View Mapping for Autonomous Driving]},
author={Tan, Lin and Wang, Hanchen and Li, Taozhe and Hajnorouzali, Yasaman and Burch, Collin and Lee, Victoria and Xu, Bin and Arjmanzdadeh, Ziba},
journal={[arxiv: TBA]},
year={2025}
}

This work builds upon the following models. Please also cite their papers:
Depth Anything V2:
@article{depthanythingv2,
title={Depth Anything V2},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
journal={arXiv:2406.09414},
year={2024}
}

Omni3D:
@inproceedings{brazil2023omni3d,
author = {Garrick Brazil and Abhinav Kumar and Julian Straub and Nikhila Ravi and Justin Johnson and Georgia Gkioxari},
title = {{Omni3D}: A Large Benchmark and Model for {3D} Object Detection in the Wild},
booktitle = {CVPR},
address = {Vancouver, Canada},
month = {June},
year = {2023},
organization = {IEEE},
}

YOLOPv2:
@article{han2022yolopv2,
title={YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception},
author={Han, Cheng and Zhao, Qichao and Zhang, Shuyi and Chen, Yinzi and Zhang, Zhenlin and Yuan, Jinwei},
journal={arXiv preprint arXiv:2208.11434},
year={2022}
}

- Depth Anything V2: https://github.com/DepthAnything/Depth-Anything-V2
- Omni3D: https://github.com/facebookresearch/omni3d
- YOLOPv2: https://github.com/CAIC-AD/YOLOPv2
This project builds upon the excellent work of multiple research teams:
- The Depth Anything V2 team (Lihe Yang, Bingyi Kang, Zilong Huang, Zhen Zhao, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao) for their robust monocular depth estimation model presented at NeurIPS 2024
- The Meta AI Research (FAIR) team (Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari) for the Omni3D benchmark and 3D detection framework presented at CVPR 2023
- The YOLOPv2 authors (Cheng Han, Qichao Zhao, Shuyi Zhang, Yinzi Chen, Zhenlin Zhang, Jinwei Yuan) for their efficient multi-task panoptic driving perception system
We are grateful for their contributions to the computer vision and autonomous driving research communities, and for making their code publicly available.
