Automatic Pipeline for Classification (using Swin Transformer), Detection (using RT-DETR) and Segmentation (using SwinUNETR) of Bleeding and Non-Bleeding frames in Wireless Capsule Endoscopy
- Team name: failed wizards
- Team member names:
  - Sasidhar Alavala (MS(R), IIT Tirupati)
  - Anil Kumar V (BTech, IIT Tirupati)
  - Aparnamala K (BTech, IIT Tirupati)
  - Dr. Subrahmanyam Gorthi (Assoc. Professor, IIT Tirupati)
- Abstract:
  - This pipeline combines three state-of-the-art models: Swin Transformer for classification, RT-DETR for detection, and SwinUNETR for segmentation. Every frame first passes through a series of preprocessing steps that enhance image features: conversion to the LAB colour space, CLAHE (Contrast Limited Adaptive Histogram Equalization), and Gaussian blur. The same preprocessing is applied to both training and validation data.
  - The training data is additionally augmented to improve model robustness and generalization. The augmentations are random horizontal and vertical flips, random rotations, Gaussian blurring, random affine transformations, random perspective distortions, and MixUp.
- Model Weights, README & Predictions
- Liu, Ze, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012-10022. 2021.
- Lv, Wenyu, Shangliang Xu, Yian Zhao, Guanzhong Wang, Jinman Wei, Cheng Cui, Yuning Du, Qingqing Dang, and Yi Liu. "DETRs Beat YOLOs on Real-time Object Detection." arXiv preprint arXiv:2304.08069 (2023).