Skip to content

Conversation

@henrykironde
Copy link
Contributor

No description provided.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is being reviewed by Cursor Bugbot

Details

You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

),
"PadIfNeeded": (A.PadIfNeeded, {"min_height": 800, "min_width": 800, "p": 1.0}),
"Rotate": (A.Rotate, {"limit": 15, "p": 0.5}),
"PadIfNeeded": (K.PadTo, {"size": (800, 800), "p": 1.0}),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Kornia Migration Fails Augmentation Mapping

The Kornia migration introduces several issues with augmentation mappings. Downscale and RandomSizedBBoxSafeCrop both incorrectly map to K.RandomResizedCrop, losing distinct behaviors and critical bounding box safety for the latter. Furthermore, Downscale and PadIfNeeded pass an unsupported p parameter to their Kornia transforms, leading to a TypeError.

Fix in Cursor Fix in Web

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First comment agree (partially), second is incorrect.

Downscale is incorrectly used here. The documentation for albumentations says that it downscales an image and then upscales it (to simulate lossy rescaling rescaling). https://explore.albumentations.ai/transform/Downscale

If that's actually what we want, then we should pick a different transform (we could replace with a random scale < 1 and a transform to the original image size - bit awkward, but could be done). I think we missed this in the last PR.

p is clearly supported in kornia transforms. https://kornia.readthedocs.io/en/latest/augmentation.module.html#kornia.augmentation.RandomResizedCrop

"GaussianBlur": (A.GaussianBlur, {"blur_limit": 2, "p": 0.3}),
"MotionBlur": (A.MotionBlur, {"blur_limit": 2, "p": 0.3}),
"ZoomBlur": (A.ZoomBlur, {"max_factor": 1.05, "p": 0.3}),
"ZoomBlur": (K.RandomAffine, {"degrees": 0, "scale": (1.0, 1.05), "p": 0.3}),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: ZoomBlur Replaces Radial Blur with Resize

The ZoomBlur augmentation now uses K.RandomAffine, which applies geometric scaling instead of the radial motion blur effect of the original Albumentations ZoomBlur. This changes the augmentation's visual outcome from a zoom blur to a simple resize.

Fix in Cursor Fix in Web

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should re-implement zoomblur.

Copy link
Collaborator

@jveitchmichaelis jveitchmichaelis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Henry, a few comments:

  • We should remove/raise for transforms which aren't implemented yet. This mainly affects ZoomBlur and "safe" cropping. We do have a fallback which adds a dummy box if the pipeline picks an empty image by accident.
  • I think we're incorrectly using Downscale currently(?), but resize is the correct solution
  • Should use AugmentSequential instead of nn.Sequential and check that the bounding boxes are in the right place.
  • Instead of passing an identity transform, if there are really no transforms we should skip altogether at the dataset level.
  • Probably should update the tests to make sure we cover everything.

- **[HorizontalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomHorizontalFlip)**: Randomly flip images horizontally
- **[VerticalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomVerticalFlip)**: Randomly flip images vertically
- **[Downscale](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Randomly downscale images to simulate different resolutions
- **[RandomSizedBBoxSafeCrop](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Crop image while preserving bounding boxes
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this (or warn + ignore in the pipeline) as the expected output is not necessarily the same. Could re-implement it easily enough.

- **[VerticalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomVerticalFlip)**: Randomly flip images vertically
- **[Downscale](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Randomly downscale images to simulate different resolutions
- **[RandomSizedBBoxSafeCrop](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Crop image while preserving bounding boxes
- **[PadIfNeeded](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.PadTo)**: Pad images to minimum size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest for these naming changes, allow mapping from either the kornia expected one or the Albumentations one. (e.g. "PadIfNeeded" + "PadTo" should map to the same transform).

- **[GaussNoise](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomGaussianNoise)**: Add gaussian noise
- **[Blur](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomGaussianBlur)**: Apply blur effect

#### Zoom Augmentations for Multi-Resolution Training
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zoom is not implemented in kornia, so we would need to re-add that at some point.

"Downscale": {"scale_range": (0.25, 0.75), "p": 0.5},
"Downscale": {"scale": (0.25, 0.75), "p": 0.5},
# Crop at different scales while preserving objects
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, this guarantee is no longer possible

A.Downscale(scale_range=(0.25, 0.75), p=0.5),
ToTensorV2()
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=["category_ids"]))
transform = torch.nn.Sequential([
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should usekornia.AugmentationSequential https://kornia.readthedocs.io/en/latest/augmentation.container.html

nn.sequential only works for geometric transforms and doesn't modify boxes. We need to specify one of bbox/bbox_xy/etc and maybe change the dataset to return the correct key.

https://kornia.readthedocs.io/en/latest/applications/image_augmentations.html

We should convert all inputs to tensor first I think.

```

**Note**: When creating custom transforms, always include `ToTensorV2()` and properly configure `bbox_params` for object detection. If your augmentation pipeline does not contain any geometric transformations, `bbox_params` is not required. Otherwise it's important that you keep the format as `pascal_voc` so that the boxes are correctly interpreted by Albumentations.
**Note**: When creating custom transforms, use PyTorch's `torch.nn.Sequential` to compose multiple augmentations. Kornia transforms work directly with PyTorch tensors and don't require special bbox parameter handling like Albumentations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, for box transforms AugmentationSequential requires one of: “bbox”, “bbox_xyxy”, “bbox_xywh”. It's not possible to do bounding box transformations without the pipeline knowing the coordinate format.

),
"PadIfNeeded": (A.PadIfNeeded, {"min_height": 800, "min_width": 800, "p": 1.0}),
"Rotate": (A.Rotate, {"limit": 15, "p": 0.5}),
"PadIfNeeded": (K.PadTo, {"size": (800, 800), "p": 1.0}),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First comment agree (partially), second is incorrect.

Downscale is incorrectly used here. The documentation for albumentations says that it downscales an image and then upscales it (to simulate lossy rescaling rescaling). https://explore.albumentations.ai/transform/Downscale

If that's actually what we want, then we should pick a different transform (we could replace with a random scale < 1 and a transform to the original image size - bit awkward, but could be done). I think we missed this in the last PR.

p is clearly supported in kornia transforms. https://kornia.readthedocs.io/en/latest/augmentation.module.html#kornia.augmentation.RandomResizedCrop

"GaussianBlur": (A.GaussianBlur, {"blur_limit": 2, "p": 0.3}),
"MotionBlur": (A.MotionBlur, {"blur_limit": 2, "p": 0.3}),
"ZoomBlur": (A.ZoomBlur, {"max_factor": 1.05, "p": 0.3}),
"ZoomBlur": (K.RandomAffine, {"degrees": 0, "scale": (1.0, 1.05), "p": 0.3}),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should re-implement zoomblur.

return A.Compose(transforms_list, bbox_params=bbox_params)
# Create a sequential container for all transforms
if transforms_list:
return torch.nn.Sequential(*transforms_list)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment earlier, should use AugmentSequential from kornia.

@jveitchmichaelis jveitchmichaelis self-assigned this Nov 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants