20 changes: 20 additions & 0 deletions HISTORY.md
@@ -2,8 +2,27 @@

## Version 2.0.0 (Date: TBD)

The major innovations are:

1. **Migration from albumentations to kornia for data augmentations** - Replaced albumentations with kornia for better PyTorch integration and GPU acceleration

Additional features and enhancements include:

- **Enhancement:** Better PyTorch integration with kornia transforms
- **Enhancement:** Simplified API without bbox parameter complexity
- **Enhancement:** GPU acceleration support for augmentation transforms
- **Enhancement:** More consistent with PyTorch ecosystem
- **Documentation:** Updated augmentation documentation with kornia examples

### Breaking Changes - Deprecated Items Removed:

**Augmentation Changes:**
- **Migration from albumentations to kornia** - All augmentation transforms now use kornia instead of albumentations
- Some augmentation parameter names have changed (e.g., `scale_range` → `scale`, `height/width` → `size`)
- Custom transforms now use `torch.nn.Sequential` instead of `A.Compose`
- No longer requires bbox parameter configuration
- See migration guide in documentation for detailed parameter changes
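
The renames listed above can be applied mechanically to an existing config. The sketch below is illustrative only: `translate_params`, `_RENAMES`, and `_SIZE_PAIRS` are hypothetical names, not part of DeepForest, and the table covers only the renames visible in this PR's diff. It also assumes `size` is ordered as (height, width).

```python
# Hypothetical helper: translate Albumentations-style parameter names to the
# kornia-style names used in this PR. Covers only renames seen in the diff.
_RENAMES = {
    "scale_range": "scale",
    "limit": "degrees",
    "brightness_limit": "brightness",
    "contrast_limit": "contrast",
}

# Pairs of (height key, width key) that collapse into a single `size` tuple.
_SIZE_PAIRS = [("height", "width"), ("min_height", "min_width")]


def translate_params(params: dict) -> dict:
    """Return a copy of `params` with old names replaced by new ones."""
    out = dict(params)
    for h_key, w_key in _SIZE_PAIRS:
        if h_key in out and w_key in out:
            # Assumed ordering: size = (height, width).
            out["size"] = (out.pop(h_key), out.pop(w_key))
    for old, new in _RENAMES.items():
        if old in out:
            out[new] = out.pop(old)
    return out


old_crop = {"height": 400, "width": 400, "p": 0.3}
print(translate_params(old_crop))
```

Parameters with no direct equivalent (for example `var_limit` on `GaussNoise`, which becomes `std` with a different meaning) are deliberately left out; those need a manual decision rather than a rename.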

**Removed Functions:**
- `xml_to_annotations()` - Use `utilities.read_pascal_voc(path)` or the general `utilities.read_file(path)`.
- `boxes_to_shapefile()` - Use `image_to_geo_coordinates()`.
@@ -23,6 +42,7 @@
- `raster_path` parameter from predict_tile() - Use `path` parameter instead

**Migration Guide:**
- **Augmentations:** Update parameter names and use kornia transforms (see documentation)
- Replace `xml_to_annotations(xml_path)` with `read_pascal_voc(xml_path)`
- Replace `boxes_to_shapefile(df, root_dir)` with `image_to_geo_coordinates(df, root_dir)`
- Replace `plot_points(image, points)` with `plot_results(results)`
54 changes: 26 additions & 28 deletions docs/user_guide/11_training.md
@@ -271,7 +271,7 @@ Note that if you trained on GPU and restore on cpu, you will need the map_locati

### Data Augmentations

DeepForest supports configurable data augmentations using [Albumentations](https://albumentations.ai/docs/3-basic-usage/bounding-boxes-augmentations/) to improve model generalization across different sensors and acquisition conditions. Augmentations can be specified through the configuration file or passed directly to the model.
DeepForest supports configurable data augmentations using [Kornia](https://kornia.readthedocs.io/en/latest/augmentation.html) to improve model generalization across different sensors and acquisition conditions. Augmentations can be specified through the configuration file or passed directly to the model.

#### Configuration-based Augmentations

@@ -285,9 +285,9 @@ train:
# Or as a list of custom parameters
augmentations:
- HorizontalFlip: {p: 0.5}
- Downscale: {scale_range: [0.25, 0.75], p: 0.5}
- RandomSizedBBoxSafeCrop: {height: 400, width: 400, p: 0.3}
- PadIfNeeded: {min_height: 400, min_width: 400, p: 1.0}
- Downscale: {scale: [0.25, 0.75], p: 0.5}
- RandomSizedBBoxSafeCrop: {size: [400, 400], scale: [0.5, 1.0], p: 0.3}
- PadIfNeeded: {size: [400, 400], p: 1.0}
```

Note that augmentations are provided as a list (prepended with a `-` in YAML). If you omit this, the parameter will be interpreted as a dictionary and the config parser may fail. If you provide only the augmentation name, default settings will be used. These have been chosen to reflect sensible parameters for different transformations, as it's quite easy to "over augment" which can make models harder to train. By default, if you enable augmentation and do not specify a transform explicitly, only `HorizontalFlip` will be used.
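
To make the accepted shapes concrete, here is a minimal sketch of how a name, a list, or a dictionary could be normalized into `(name, params)` pairs, with empty params standing for "use the defaults". It follows the type hints on `get_transform` in this PR (`str | list[str] | dict[str, Any] | None`); `normalize_augmentations` is a hypothetical name and is not the actual `_parse_augmentations` implementation, which is not shown in this diff.

```python
from typing import Any


def normalize_augmentations(augs: Any) -> list[tuple[str, dict]]:
    """Normalize the accepted config shapes into (name, params) pairs."""
    if augs is None:
        return []
    if isinstance(augs, str):
        # A bare name means "use the default parameters".
        return [(augs, {})]
    if isinstance(augs, dict):
        return [(name, dict(params or {})) for name, params in augs.items()]
    pairs = []
    for item in augs:
        if isinstance(item, str):
            pairs.append((item, {}))
        elif isinstance(item, dict):
            # One YAML list entry such as `- HorizontalFlip: {p: 0.5}`.
            pairs.extend((name, dict(params or {})) for name, params in item.items())
        else:
            raise ValueError(f"Unable to parse augmentation entry: {item!r}")
    return pairs


print(normalize_augmentations(["HorizontalFlip", {"Downscale": {"p": 0.5}}]))
```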
@@ -310,7 +310,7 @@ config_args = {
"train": {
"augmentations": [
"HorizontalFlip": {"p": 0.8},
"Downscale": {"scale_range": (0.5, 0.9), "p": 0.3}
"Downscale": {"scale": (0.5, 0.9), "p": 0.3}
]
}
}
@@ -321,16 +321,16 @@ model = main.deepforest(config_args=config_args)

DeepForest supports the following augmentations optimized for object detection:

- **[HorizontalFlip](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/flip/#HorizontalFlip)**: Randomly flip images horizontally
- **[VerticalFlip](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/flip/#VerticalFlip)**: Randomly flip images vertically
- **[Downscale](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#Downscale)**: Randomly downscale images to simulate different resolutions
- **[RandomSizedBBoxSafeCrop](https://albumentations.ai/docs/api-reference/albumentations/augmentations/crops/transforms/#RandomSizedBBoxSafeCrop)**: Crop image while preserving bounding boxes
- **[PadIfNeeded](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/pad/#PadIfNeeded)**: Pad images to minimum size
- **[Rotate](https://albumentations.ai/docs/api-reference/albumentations/augmentations/geometric/rotate/#Rotate)**: Rotate images by small angles
- **[RandomBrightnessContrast](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#RandomBrightnessContrast)**: Adjust brightness and contrast
- **[HueSaturationValue](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#HueSaturationValue)**: Adjust color properties
- **[GaussNoise](https://albumentations.ai/docs/api-reference/albumentations/augmentations/pixel/transforms/#GaussNoise)**: Add gaussian noise
- **[Blur](https://albumentations.ai/docs/api-reference/albumentations/augmentations/blur/transforms/#Blur)**: Apply blur effect
- **[HorizontalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomHorizontalFlip)**: Randomly flip images horizontally
- **[VerticalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomVerticalFlip)**: Randomly flip images vertically
- **[Downscale](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Randomly downscale images to simulate different resolutions
- **[RandomSizedBBoxSafeCrop](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Crop image while preserving bounding boxes
Collaborator

We should remove this (or warn + ignore in the pipeline) as the expected output is not necessarily the same. Could re-implement it easily enough.
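
As a rough illustration of what a re-implementation would need to guarantee, the sketch below samples a crop window that always contains every box, which is the property `K.RandomResizedCrop` does not provide. It is plain Python with hypothetical names (`sample_bbox_safe_window`, `shift_boxes`), assumes boxes are `(x_min, y_min, x_max, y_max)` in pixels, and leaves out the final resize to the target size.

```python
import random


def sample_bbox_safe_window(boxes, image_height, image_width, rng=random):
    """Sample a crop window (x1, y1, x2, y2) that contains every box.

    `boxes` is a list of (x_min, y_min, x_max, y_max) tuples in pixels.
    """
    if not boxes:
        # Nothing to protect: keep the full image.
        return 0, 0, image_width, image_height
    # Union of all boxes: the region the crop must keep.
    union_x1 = min(b[0] for b in boxes)
    union_y1 = min(b[1] for b in boxes)
    union_x2 = max(b[2] for b in boxes)
    union_y2 = max(b[3] for b in boxes)
    # Each crop edge is drawn between the image border and the union edge.
    x1 = rng.uniform(0, union_x1)
    y1 = rng.uniform(0, union_y1)
    x2 = rng.uniform(union_x2, image_width)
    y2 = rng.uniform(union_y2, image_height)
    return x1, y1, x2, y2


def shift_boxes(boxes, window):
    """Express boxes in the coordinate frame of the cropped window."""
    x1, y1, _, _ = window
    return [(bx1 - x1, by1 - y1, bx2 - x1, by2 - y1) for bx1, by1, bx2, by2 in boxes]


boxes = [(50, 60, 120, 140), (200, 220, 260, 300)]
window = sample_bbox_safe_window(boxes, image_height=400, image_width=400)
print(window, shift_boxes(boxes, window))
```

Resizing the crop to a fixed output size would then scale the shifted boxes by the same per-axis factors as the image.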

- **[PadIfNeeded](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.PadTo)**: Pad images to minimum size
Collaborator

I suggest for these naming changes, allow mapping from either the kornia expected one or the Albumentations one. (e.g. "PadIfNeeded" + "PadTo" should map to the same transform).
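
One way to support both spellings is a small alias table that is consulted before the transform lookup. The sketch below is hypothetical (`_ALIASES` and `canonical_name` do not exist in the PR); the name pairs are taken from the transform mappings in this diff.

```python
# Hypothetical alias table: an Albumentations-style name and its kornia-style
# counterpart both resolve to one canonical key.
_ALIASES = {
    "PadIfNeeded": "PadTo",
    "HorizontalFlip": "RandomHorizontalFlip",
    "VerticalFlip": "RandomVerticalFlip",
    "Rotate": "RandomRotation",
    "GaussNoise": "RandomGaussianNoise",
}


def canonical_name(name: str) -> str:
    """Map either naming scheme to the canonical (kornia-style) name."""
    return _ALIASES.get(name, name)


# Both spellings land on the same registry key.
print(canonical_name("PadIfNeeded"), canonical_name("PadTo"))
```

With this in place, `_SUPPORTED_TRANSFORMS` would only need one entry per transform, keyed by the canonical name.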

- **[Rotate](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomRotation)**: Rotate images by small angles
- **[RandomBrightnessContrast](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.ColorJitter)**: Adjust brightness and contrast
- **[HueSaturationValue](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.ColorJitter)**: Adjust color properties
- **[GaussNoise](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomGaussianNoise)**: Add gaussian noise
- **[Blur](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomGaussianBlur)**: Apply blur effect

#### Zoom Augmentations for Multi-Resolution Training
Collaborator

Zoom is not implemented in kornia, so we would need to re-add that at some point.


@@ -342,13 +342,13 @@ config_args = {
"train": {
"augmentations": [
# Simulate different acquisition heights/resolutions
"Downscale": {"scale_range": (0.25, 0.75), "p": 0.5},
"Downscale": {"scale": (0.25, 0.75), "p": 0.5},

# Crop at different scales while preserving objects
Collaborator

As above, this guarantee is no longer possible

"RandomSizedBBoxSafeCrop": {"height": 400, "width": 400, "p": 0.3},
"RandomSizedBBoxSafeCrop": {"size": (400, 400), "scale": (0.5, 1.0), "p": 0.3},

# Ensure minimum image size
"PadIfNeeded": {"min_height": 400, "min_width": 400, "p": 1.0},
"PadIfNeeded": {"size": (400, 400), "p": 1.0},

# Basic data augmentation
"HorizontalFlip": {"p": 0.5}
@@ -364,26 +364,24 @@ model = main.deepforest(config_args=config_args)
For complete control over augmentations, you can still provide custom transforms:

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
import torch
import kornia.augmentation as K

def get_transform(augment):
"""Custom transform function"""
if augment:
transform = A.Compose([
A.HorizontalFlip(p=0.5),
A.Downscale(scale_range=(0.25, 0.75), p=0.5),
ToTensorV2()
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=["category_ids"]))
transform = torch.nn.Sequential([
Collaborator

We should use kornia's `AugmentationSequential`: https://kornia.readthedocs.io/en/latest/augmentation.container.html

`nn.Sequential` only transforms the image and doesn't modify boxes. We need to specify one of `bbox`/`bbox_xyxy`/`bbox_xywh` and maybe change the dataset to return the correct key.

https://kornia.readthedocs.io/en/latest/applications/image_augmentations.html

We should convert all inputs to tensor first I think.

K.RandomHorizontalFlip(p=0.5),
K.RandomResizedCrop(size=(200, 200), scale=(0.25, 0.75), p=0.5)
])
else:
transform = A.Compose([ToTensorV2()],
bbox_params=A.BboxParams(format='pascal_voc', label_fields=["category_ids"]))
transform = torch.nn.Identity()
return transform

model = main.deepforest(transforms=get_transform)
```

**Note**: When creating custom transforms, always include `ToTensorV2()` and properly configure `bbox_params` for object detection. If your augmentation pipeline does not contain any geometric transformations, `bbox_params` is not required. Otherwise it's important that you keep the format as `pascal_voc` so that the boxes are correctly interpreted by Albumentations.
**Note**: When creating custom transforms, use PyTorch's `torch.nn.Sequential` to compose multiple augmentations. Kornia transforms work directly with PyTorch tensors and don't require special bbox parameter handling like Albumentations.
Collaborator

As above, for box transforms AugmentationSequential requires one of: “bbox”, “bbox_xyxy”, “bbox_xywh”. It's not possible to do bounding box transformations without the pipeline knowing the coordinate format.


**How do I make training faster?**

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -34,12 +34,12 @@ authors = [
dependencies = [
"aiohttp",
"aiolimiter",
"albumentations>=2.0.0",
"faster-coco-eval>=1.6.8",
"geopandas",
"h5py",
"huggingface_hub>=0.25.0",
"hydra-core",
"kornia",
"matplotlib",
"numpy<2.0",
"omegaconf",
81 changes: 46 additions & 35 deletions src/deepforest/augmentations.py
@@ -1,4 +1,4 @@
"""Augmentation module for DeepForest using albumentations.
"""Augmentation module for DeepForest using kornia.

This module provides configurable augmentations for training and
validation that can be specified through configuration files or direct
@@ -7,36 +7,48 @@

from typing import Any

import albumentations as A
from albumentations.pytorch import ToTensorV2
import kornia.augmentation as K
import torch
from omegaconf import OmegaConf
from omegaconf.dictconfig import DictConfig
from omegaconf.listconfig import ListConfig

_SUPPORTED_TRANSFORMS = {
"HorizontalFlip": (A.HorizontalFlip, {"p": 0.5}),
"VerticalFlip": (A.VerticalFlip, {"p": 0.5}),
"Downscale": (A.Downscale, {"scale_range": (0.25, 0.5), "p": 0.5}),
"RandomCrop": (A.RandomCrop, {"height": 200, "width": 200, "p": 0.5}),
"HorizontalFlip": (K.RandomHorizontalFlip, {"p": 0.5}),
"VerticalFlip": (K.RandomVerticalFlip, {"p": 0.5}),
"Downscale": (
K.RandomResizedCrop,
{"size": (200, 200), "scale": (0.25, 0.5), "p": 0.5},
),
"RandomCrop": (K.RandomCrop, {"size": (200, 200), "p": 0.5}),
"RandomSizedBBoxSafeCrop": (
A.RandomSizedBBoxSafeCrop,
{"height": 200, "width": 200, "p": 0.5},
K.RandomResizedCrop,
{"size": (200, 200), "scale": (0.5, 1.0), "p": 0.5},
),
"PadIfNeeded": (A.PadIfNeeded, {"min_height": 800, "min_width": 800, "p": 1.0}),
"Rotate": (A.Rotate, {"limit": 15, "p": 0.5}),
"PadIfNeeded": (K.PadTo, {"size": (800, 800), "p": 1.0}),

Bug: Kornia Migration Fails Augmentation Mapping

The Kornia migration introduces several issues with augmentation mappings. Downscale and RandomSizedBBoxSafeCrop both incorrectly map to K.RandomResizedCrop, losing distinct behaviors and critical bounding box safety for the latter. Furthermore, Downscale and PadIfNeeded pass an unsupported p parameter to their Kornia transforms, leading to a TypeError.

Collaborator

I agree (partially) with the first point; the second is incorrect.

Downscale is incorrectly used here. The documentation for albumentations says that it downscales an image and then upscales it (to simulate lossy rescaling). https://explore.albumentations.ai/transform/Downscale

If that's actually what we want, then we should pick a different transform (we could replace with a random scale < 1 and a transform to the original image size - bit awkward, but could be done). I think we missed this in the last PR.

p is clearly supported in kornia transforms. https://kornia.readthedocs.io/en/latest/augmentation.module.html#kornia.augmentation.RandomResizedCrop
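
If the downscale-then-upscale behaviour is what is wanted, it can be sketched directly with `torch.nn.functional.interpolate`. This is an illustrative sketch, not code from the PR: `downscale_upscale` is a hypothetical name, and the choice of bilinear downsampling with nearest-neighbour upsampling is an assumption.

```python
import torch
import torch.nn.functional as F


def downscale_upscale(image: torch.Tensor, scale: float) -> torch.Tensor:
    """Downscale a (B, C, H, W) batch by `scale`, then resize back to (H, W).

    The output has the same shape as the input but less detail, which
    simulates imagery captured at a coarser resolution.
    """
    height, width = image.shape[-2:]
    small = F.interpolate(image, scale_factor=scale, mode="bilinear", align_corners=False)
    return F.interpolate(small, size=(height, width), mode="nearest")


image = torch.rand(1, 3, 64, 64)
degraded = downscale_upscale(image, scale=0.5)
print(degraded.shape)
```

Because the image size is unchanged, bounding boxes need no adjustment, unlike with `K.RandomResizedCrop`.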

"Rotate": (K.RandomRotation, {"degrees": 15, "p": 0.5}),
"RandomBrightnessContrast": (
A.RandomBrightnessContrast,
{"brightness_limit": 0.2, "contrast_limit": 0.2, "p": 0.5},
K.ColorJitter,
{"brightness": 0.2, "contrast": 0.2, "p": 0.5},
),
"HueSaturationValue": (
A.HueSaturationValue,
{"hue_shift_limit": 10, "sat_shift_limit": 10, "val_shift_limit": 10, "p": 0.5},
K.ColorJitter,
{"hue": 0.1, "saturation": 0.1, "p": 0.5},
),
"GaussNoise": (K.RandomGaussianNoise, {"std": 0.1, "p": 0.3}),
"Blur": (
K.RandomGaussianBlur,
{"kernel_size": (3, 3), "sigma": (0.1, 2.0), "p": 0.3},
),
"GaussianBlur": (
K.RandomGaussianBlur,
{"kernel_size": (3, 3), "sigma": (0.1, 2.0), "p": 0.3},
),
"MotionBlur": (
K.RandomMotionBlur,
{"kernel_size": 3, "angle": 45, "direction": 0.0, "p": 0.3},
),
"GaussNoise": (A.GaussNoise, {"var_limit": (5.0, 20.0), "p": 0.3}),
"Blur": (A.Blur, {"blur_limit": 2, "p": 0.3}),
"GaussianBlur": (A.GaussianBlur, {"blur_limit": 2, "p": 0.3}),
"MotionBlur": (A.MotionBlur, {"blur_limit": 2, "p": 0.3}),
"ZoomBlur": (A.ZoomBlur, {"max_factor": 1.05, "p": 0.3}),
"ZoomBlur": (K.RandomAffine, {"degrees": 0, "scale": (1.0, 1.05), "p": 0.3}),

Bug: ZoomBlur Replaces Radial Blur with Resize

The ZoomBlur augmentation now uses K.RandomAffine, which applies geometric scaling instead of the radial motion blur effect of the original Albumentations ZoomBlur. This changes the augmentation's visual outcome from a zoom blur to a simple resize.

Collaborator

We should re-implement ZoomBlur.
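
One possible starting point is to average the image with several progressively zoomed, centre-cropped copies of itself, which gives a radial blur while leaving the image size unchanged. The sketch below is an assumption about how this could be done in plain PyTorch; `zoom_blur` and the `steps` parameter are hypothetical, and it is not claimed to match the Albumentations implementation exactly.

```python
import torch
import torch.nn.functional as F


def zoom_blur(image: torch.Tensor, max_factor: float = 1.05, steps: int = 5) -> torch.Tensor:
    """Approximate a zoom blur on a float (B, C, H, W) batch.

    Averages the original image with `steps - 1` zoomed copies, each
    centre-cropped back to the original size.
    """
    height, width = image.shape[-2:]
    accumulated = image.clone()
    # Skip the first factor (1.0), which is the original image itself.
    for factor in torch.linspace(1.0, max_factor, steps)[1:].tolist():
        zoomed = F.interpolate(image, scale_factor=factor, mode="bilinear", align_corners=False)
        top = (zoomed.shape[-2] - height) // 2
        left = (zoomed.shape[-1] - width) // 2
        accumulated += zoomed[..., top:top + height, left:left + width]
    return accumulated / steps


image = torch.rand(1, 3, 128, 128)
blurred = zoom_blur(image, max_factor=1.05, steps=5)
print(blurred.shape)
```

With a `max_factor` close to 1 the geometric shift is small, so treating this as a pixel-level transform that leaves boxes untouched seems reasonable, though that is a judgment call.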

}


@@ -51,8 +63,8 @@ def get_available_augmentations() -> list[str]:

def get_transform(
augmentations: str | list[str] | dict[str, Any] | None = None,
) -> A.Compose:
"""Create Albumentations transform for bounding boxes.
) -> torch.nn.Module:
"""Create Kornia transform for bounding boxes.

Args:
augmentations: Augmentation configuration:
@@ -62,10 +74,10 @@ def get_transform(
- None: No augmentations

Returns:
Composed albumentations transform
Composed kornia transform

Examples:
>>> # Default behavior, returns a ToTensorV2 transform
>>> # Default behavior, returns a basic transform
>>> transform = get_transform()

>>> # Single augmentation
@@ -77,11 +89,10 @@ def get_transform(
>>> # Augmentations with parameters
>>> transform = get_transform(augmentations={
... "HorizontalFlip": {"p": 0.5},
... "Downscale": {"scale_min": 0.25, "scale_max": 0.75}
... "Downscale": {"scale": (0.25, 0.75)}
... })
"""
transforms_list = []
bbox_params = None

if augmentations is not None:
augment_configs = _parse_augmentations(augmentations)
@@ -90,12 +101,12 @@ def get_transform(
aug_transform = _create_augmentation(aug_name, aug_params)
transforms_list.append(aug_transform)

bbox_params = A.BboxParams(format="pascal_voc", label_fields=["category_ids"])

# Always add ToTensorV2 at the end
transforms_list.append(ToTensorV2())

return A.Compose(transforms_list, bbox_params=bbox_params)
# Create a sequential container for all transforms
if transforms_list:
return torch.nn.Sequential(*transforms_list)
Collaborator

See the earlier comment; this should use kornia's `AugmentationSequential`.

else:
# Return identity transform if no augmentations
return torch.nn.Identity()


def _parse_augmentations(
@@ -151,15 +162,15 @@ def _parse_augmentations(
raise ValueError(f"Unable to parse augmentation parameters: {augmentations}")


def _create_augmentation(name: str, params: dict[str, Any]) -> A.BasicTransform | None:
"""Create an albumentations transform by name with given parameters.
def _create_augmentation(name: str, params: dict[str, Any]) -> torch.nn.Module | None:
"""Create a kornia transform by name with given parameters.

Args:
name: Name of the augmentation
params: Parameters to pass to the augmentation

Returns:
Albumentations transform or None if name not recognized
Kornia transform or None if name not recognized
"""

if name not in get_available_augmentations():