A lightweight deep learning approach to road segmentation in remote sensing imagery, using the ERFNet architecture with fused NAIP and LiDAR data.
This project explores the effectiveness of ERFNet (Efficient Residual Factorized ConvNet) for semantic road segmentation in scenarios with limited training data. The study compares performance between standard 4-channel NAIP imagery and enhanced 6-channel datasets that incorporate LiDAR-derived features.
- Lightweight Architecture: Uses ERFNet for efficient real-time semantic segmentation
- Multi-band Input: Supports both 4-channel NAIP and 6-channel NAIP+LiDAR configurations
- Small Dataset Optimization: Designed to work effectively with limited training data
- Geographic Focus: Tested on the complex terrain of Monterey Peninsula, California
- NAIP 4-band: Red, Green, Blue, Near-Infrared (NIR)
- LiDAR-derived: Normalized Digital Surface Model (NDSM), Intensity
- Total: Up to 6 input channels for enhanced spatial context
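
The channel layout above can be sketched as a simple stacking step. This is a minimal illustration, not the project's actual loader: the function name `stack_channels`, the channel-first `(C, H, W)` layout, the channel order (R, G, B, NIR, NDSM, Intensity), and the 256×256 tile size are all assumptions.

```python
import numpy as np

def stack_channels(naip, ndsm=None, intensity=None):
    """Return a (4, H, W) or (6, H, W) float32 array (hypothetical helper)."""
    naip = np.asarray(naip, dtype=np.float32)
    if naip.ndim != 3 or naip.shape[0] != 4:
        raise ValueError("expected NAIP array of shape (4, H, W)")
    if ndsm is None and intensity is None:
        return naip  # 4-channel configuration
    if ndsm is None or intensity is None:
        raise ValueError("provide both NDSM and intensity, or neither")
    extra = []
    for layer in (ndsm, intensity):
        layer = np.asarray(layer, dtype=np.float32)
        if layer.shape != naip.shape[1:]:
            raise ValueError("LiDAR raster must match NAIP height and width")
        extra.append(layer[np.newaxis])  # (H, W) -> (1, H, W)
    # 6-channel configuration: NAIP bands followed by the LiDAR layers
    return np.concatenate([naip, *extra], axis=0)

# Placeholder tiles; the tile size is an assumption for illustration only.
naip_tile = np.zeros((4, 256, 256), dtype=np.float32)
ndsm_tile = np.ones((256, 256), dtype=np.float32)
intensity_tile = np.ones((256, 256), dtype=np.float32)
six_band = stack_channels(naip_tile, ndsm_tile, intensity_tile)
```

The LiDAR rasters are assumed to be already resampled to the NAIP grid; the sketch only checks that the shapes agree.
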
- Location: Monterey Peninsula, California
- Rationale: Complex geography and varied terrain features provide robust testing conditions
- Data Availability: Consistent LiDAR survey data available for the region
- Base Model: ERFNet (Efficient Residual Factorized ConvNet)
- Input: Variable channel input (4 or 6 channels)
- Output: Binary road segmentation masks
- Optimization: Factorized (1D) convolutions, dilated convolutions, and residual connections for efficiency
- Cross-validation: 3-fold (K-fold with K = 3)
- Optimizer: Adam with learning rate 2e-4
- Epochs: 10 per fold
- Batch Size: 32
- Loss Function: Combined Dice Loss and Binary Cross-Entropy with positive-class weighting
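
A combined Dice and weighted BCE loss of the kind listed above might look like the following NumPy sketch. The equal weighting of the two terms, the `smooth` constant, and the `pos_weight` value are assumptions; the source does not state them, and the project presumably uses a deep learning framework's own loss functions rather than this hand-written version.

```python
import numpy as np

def dice_bce_loss(logits, targets, pos_weight=1.0, smooth=1.0):
    """Sum of positive-weighted BCE and (1 - Dice), computed from raw logits."""
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)

    # Numerically stable pieces: softplus(-x) = -log(sigmoid(x)),
    # softplus(x) = -log(1 - sigmoid(x)).
    neg_log_p = np.logaddexp(0.0, -logits)
    neg_log_1mp = np.logaddexp(0.0, logits)

    # BCE with extra weight on the positive (road) class.
    bce = np.mean(pos_weight * targets * neg_log_p + (1.0 - targets) * neg_log_1mp)

    # Soft Dice on predicted probabilities.
    probs = np.exp(-neg_log_p)  # sigmoid(logits)
    intersection = np.sum(probs * targets)
    dice = (2.0 * intersection + smooth) / (np.sum(probs) + np.sum(targets) + smooth)

    return bce + (1.0 - dice)
```

Raising `pos_weight` above 1 penalizes missed road pixels more heavily, which is one common way to counter the class imbalance between road and background.
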
- Filtering of background-only images
- Pixel value normalization across all channels
- Mask scaling to [0, 1] range
- Data augmentation (flipping, rotation)
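
The preprocessing steps above can be sketched as follows. Several details are assumptions, since the source does not specify them: masks are taken to be 8-bit with road pixels at 255, normalization is per-channel min-max over the dataset, flips are horizontal, and rotations are multiples of 90 degrees. The function names are hypothetical.

```python
import numpy as np

def filter_and_scale(images, masks, eps=1e-8):
    """images: (N, C, H, W); masks: (N, H, W) with values 0 or 255 (assumed)."""
    images = np.asarray(images, dtype=np.float32)
    masks = np.asarray(masks, dtype=np.float32) / 255.0  # scale masks to [0, 1]

    # Drop samples whose mask contains no road pixels (background only).
    keep = masks.reshape(len(masks), -1).any(axis=1)
    images, masks = images[keep], masks[keep]

    # Per-channel min-max normalization (one assumed choice of normalization).
    lo = images.min(axis=(0, 2, 3), keepdims=True)
    hi = images.max(axis=(0, 2, 3), keepdims=True)
    images = (images - lo) / (hi - lo + eps)
    return images, masks

def augment(image, mask, rng):
    """Apply the same random flip and 90-degree rotation to image and mask."""
    if rng.random() < 0.5:
        image, mask = image[..., ::-1], mask[..., ::-1]  # horizontal flip
    k = int(rng.integers(0, 4))  # number of quarter turns
    image = np.rot90(image, k, axes=(-2, -1))
    mask = np.rot90(mask, k, axes=(-2, -1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)
```

Applying the identical geometric transform to the image and its mask keeps the labels aligned with the pixels they describe.
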
| Model | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| NAIP 4-band | 0.7965 | 0.3371 | 0.6396 | 0.4404 |
| NAIP + LiDAR 6-band | 0.7830 | 0.4949 | 0.8666 | 0.6040 |
- Enhanced Precision: 47% relative improvement with LiDAR integration (0.3371 to 0.4949)
- Better Recall: 35% relative improvement in road pixel detection (0.6396 to 0.8666)
- Improved F1 Score: 37% relative increase in overall segmentation quality (0.4404 to 0.6040)
- Reduced False Positives: Higher precision means fewer non-road pixels are labeled as road
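
The percentages above are relative changes against the 4-band baseline, and they can be reproduced from the results table with a few lines of arithmetic. The helper name `relative_change` is hypothetical; the metric values are copied from the table.

```python
# Metric values taken directly from the results table.
baseline = {"accuracy": 0.7965, "precision": 0.3371, "recall": 0.6396, "f1": 0.4404}
fused = {"accuracy": 0.7830, "precision": 0.4949, "recall": 0.8666, "f1": 0.6040}

def relative_change(old, new):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0

changes = {name: round(relative_change(baseline[name], fused[name]))
           for name in baseline}

for name, pct in changes.items():
    print(f"{name}: {pct:+d}%")
```

The same calculation shows that overall accuracy moves slightly in the other direction (0.7965 to 0.7830), which is worth keeping in mind when reading the precision, recall, and F1 gains.
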
