-
Notifications
You must be signed in to change notification settings - Fork 223
Replace albumentations with kornia #1169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
| ), | ||
| "PadIfNeeded": (A.PadIfNeeded, {"min_height": 800, "min_width": 800, "p": 1.0}), | ||
| "Rotate": (A.Rotate, {"limit": 15, "p": 0.5}), | ||
| "PadIfNeeded": (K.PadTo, {"size": (800, 800), "p": 1.0}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Kornia Migration Fails Augmentation Mapping
The Kornia migration introduces several issues with augmentation mappings. Downscale and RandomSizedBBoxSafeCrop both incorrectly map to K.RandomResizedCrop, losing distinct behaviors and critical bounding box safety for the latter. Furthermore, Downscale and PadIfNeeded pass an unsupported p parameter to their Kornia transforms, leading to a TypeError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First comment agree (partially), second is incorrect.
Downscale is incorrectly used here. The documentation for albumentations says that it downscales an image and then upscales it (to simulate lossy rescaling rescaling). https://explore.albumentations.ai/transform/Downscale
If that's actually what we want, then we should pick a different transform (we could replace with a random scale < 1 and a transform to the original image size - bit awkward, but could be done). I think we missed this in the last PR.
p is clearly supported in kornia transforms. https://kornia.readthedocs.io/en/latest/augmentation.module.html#kornia.augmentation.RandomResizedCrop
| "GaussianBlur": (A.GaussianBlur, {"blur_limit": 2, "p": 0.3}), | ||
| "MotionBlur": (A.MotionBlur, {"blur_limit": 2, "p": 0.3}), | ||
| "ZoomBlur": (A.ZoomBlur, {"max_factor": 1.05, "p": 0.3}), | ||
| "ZoomBlur": (K.RandomAffine, {"degrees": 0, "scale": (1.0, 1.05), "p": 0.3}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should re-implement zoomblur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Henry, a few comments:
- We should remove/raise for transforms which aren't implemented yet. This mainly affects ZoomBlur and "safe" cropping. We do have a fallback which adds a dummy box if the pipeline picks an empty image by accident.
- I think we're incorrectly using Downscale currently(?), but resize is the correct solution
- Should use AugmentSequential instead of nn.Sequential and check that the bounding boxes are in the right place.
- Instead of passing an identity transform, if there are really no transforms we should skip altogether at the dataset level.
- Probably should update the tests to make sure we cover everything.
| - **[HorizontalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomHorizontalFlip)**: Randomly flip images horizontally | ||
| - **[VerticalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomVerticalFlip)**: Randomly flip images vertically | ||
| - **[Downscale](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Randomly downscale images to simulate different resolutions | ||
| - **[RandomSizedBBoxSafeCrop](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Crop image while preserving bounding boxes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should remove this (or warn + ignore in the pipeline) as the expected output is not necessarily the same. Could re-implement it easily enough.
| - **[VerticalFlip](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomVerticalFlip)**: Randomly flip images vertically | ||
| - **[Downscale](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Randomly downscale images to simulate different resolutions | ||
| - **[RandomSizedBBoxSafeCrop](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomResizedCrop)**: Crop image while preserving bounding boxes | ||
| - **[PadIfNeeded](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.PadTo)**: Pad images to minimum size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest for these naming changes, allow mapping from either the kornia expected one or the Albumentations one. (e.g. "PadIfNeeded" + "PadTo" should map to the same transform).
| - **[GaussNoise](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomGaussianNoise)**: Add gaussian noise | ||
| - **[Blur](https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomGaussianBlur)**: Apply blur effect | ||
|
|
||
| #### Zoom Augmentations for Multi-Resolution Training |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zoom is not implemented in kornia, so we would need to re-add that at some point.
| "Downscale": {"scale_range": (0.25, 0.75), "p": 0.5}, | ||
| "Downscale": {"scale": (0.25, 0.75), "p": 0.5}, | ||
| # Crop at different scales while preserving objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this guarantee is no longer possible
| A.Downscale(scale_range=(0.25, 0.75), p=0.5), | ||
| ToTensorV2() | ||
| ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=["category_ids"])) | ||
| transform = torch.nn.Sequential([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should usekornia.AugmentationSequential https://kornia.readthedocs.io/en/latest/augmentation.container.html
nn.sequential only works for geometric transforms and doesn't modify boxes. We need to specify one of bbox/bbox_xy/etc and maybe change the dataset to return the correct key.
https://kornia.readthedocs.io/en/latest/applications/image_augmentations.html
We should convert all inputs to tensor first I think.
| ``` | ||
|
|
||
| **Note**: When creating custom transforms, always include `ToTensorV2()` and properly configure `bbox_params` for object detection. If your augmentation pipeline does not contain any geometric transformations, `bbox_params` is not required. Otherwise it's important that you keep the format as `pascal_voc` so that the boxes are correctly interpreted by Albumentations. | ||
| **Note**: When creating custom transforms, use PyTorch's `torch.nn.Sequential` to compose multiple augmentations. Kornia transforms work directly with PyTorch tensors and don't require special bbox parameter handling like Albumentations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, for box transforms AugmentationSequential requires one of: “bbox”, “bbox_xyxy”, “bbox_xywh”. It's not possible to do bounding box transformations without the pipeline knowing the coordinate format.
| ), | ||
| "PadIfNeeded": (A.PadIfNeeded, {"min_height": 800, "min_width": 800, "p": 1.0}), | ||
| "Rotate": (A.Rotate, {"limit": 15, "p": 0.5}), | ||
| "PadIfNeeded": (K.PadTo, {"size": (800, 800), "p": 1.0}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First comment agree (partially), second is incorrect.
Downscale is incorrectly used here. The documentation for albumentations says that it downscales an image and then upscales it (to simulate lossy rescaling rescaling). https://explore.albumentations.ai/transform/Downscale
If that's actually what we want, then we should pick a different transform (we could replace with a random scale < 1 and a transform to the original image size - bit awkward, but could be done). I think we missed this in the last PR.
p is clearly supported in kornia transforms. https://kornia.readthedocs.io/en/latest/augmentation.module.html#kornia.augmentation.RandomResizedCrop
| "GaussianBlur": (A.GaussianBlur, {"blur_limit": 2, "p": 0.3}), | ||
| "MotionBlur": (A.MotionBlur, {"blur_limit": 2, "p": 0.3}), | ||
| "ZoomBlur": (A.ZoomBlur, {"max_factor": 1.05, "p": 0.3}), | ||
| "ZoomBlur": (K.RandomAffine, {"degrees": 0, "scale": (1.0, 1.05), "p": 0.3}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should re-implement zoomblur.
| return A.Compose(transforms_list, bbox_params=bbox_params) | ||
| # Create a sequential container for all transforms | ||
| if transforms_list: | ||
| return torch.nn.Sequential(*transforms_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment earlier, should use AugmentSequential from kornia.
No description provided.