📌 Project Name
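=> Deep learning-based automatic detection of diseases in gastroscopy and colonoscopy images
📌 Project Goal
=> Study a range of model architectures, from 1-stage to 2-stage models, and improve our understanding of the medical imaging domain
📌 Lesion Detection
=> AI can detect ulcers, polyps, and cancer in the stomach and colon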
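Address (download is available to Koreans only)
=> https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71666
Introduction
=> Based on real gastroscopy and colonoscopy images of ulcers, polyps, and cancer, a total of 40,000 synthetic endoscopy images were generated: 20,000 gastric (5,000 ulcer, 5,000 polyp, 10,000 cancer) and 20,000 colorectal (5,000 ulcer, 5,000 polyp, 10,000 cancer)
Purpose
=> To distribute healthcare data that anyone can use without privacy concerns, gastroscopy and colonoscopy images were synthesized with generative AI from real gastroscopy and colonoscopy images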
Address
=> Gastroscopy 256X256: https://www.kaggle.com/datasets/seoyunje/gastroscopy-256x256-resized-png
=> Gastroscopy 1024X1024: https://www.kaggle.com/datasets/seoyunje/gastroscopy-1024x1024-resized-png
=> Annotation: https://www.kaggle.com/datasets/msyu78/gastroscopy-meta
Purpose
=> Limited compute resources make it impractical to train on the full 40,000-image dataset, so a smaller dataset was built to allow model training and optimization within the available resources.
Address
=> Colonoscopy 256X256: https://www.kaggle.com/datasets/seoyunje/colonoscopy-256x256-resized-png
=> Colonoscopy 1024X1024: https://www.kaggle.com/datasets/seoyunje/colonoscopy-1024x1024-resized-png
=> Annotation: https://www.kaggle.com/datasets/msyu78/metadataset
Purpose
=> Limited compute resources make it impractical to train on the full 40,000-image dataset, so a smaller dataset was built to allow model training and optimization within the available resources.
metadata
Meta | Count |
---|---|
Total Image with bbox annotation | 1997 |
Average of bboxes per image | 1 |
The most bboxes in one image | 29 |
The least bboxes in one image | 1 |
metadata
Meta | Count |
---|---|
Total Image with bbox annotation | 1996 |
Average of bboxes per image | 1 |
The most bboxes in one image | 10 |
The least bboxes in one image | 1 |
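The counts in the metadata tables above can be reproduced with a few lines of Python. The sketch below assumes a hypothetical annotation layout (one record per bbox with an `image_id` key); the actual Kaggle annotation files may be structured differently, and the records shown are toy data, not the dataset.

```python
from collections import Counter

# Toy records in an ASSUMED layout: one dict per bounding box.
annotations = [
    {"image_id": "img_001", "bbox": [30, 40, 120, 160], "label": "polyp"},
    {"image_id": "img_002", "bbox": [10, 10, 60, 60], "label": "ulcer"},
    {"image_id": "img_002", "bbox": [70, 20, 110, 90], "label": "ulcer"},
    {"image_id": "img_002", "bbox": [5, 100, 50, 150], "label": "ulcer"},
    {"image_id": "img_003", "bbox": [80, 80, 200, 210], "label": "cancer"},
]


def bbox_stats(records):
    """Summarize how many bboxes each annotated image carries."""
    per_image = Counter(r["image_id"] for r in records)
    counts = list(per_image.values())
    return {
        "total_images": len(counts),
        "mean_bboxes": sum(counts) / len(counts),
        "max_bboxes": max(counts),
        "min_bboxes": min(counts),
    }


stats = bbox_stats(annotations)
print(stats)
```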
GastroScopy
- Yolov11 Model Pipeline
- No augmentation
- Adding HSV-Hue augmentation (hsv_h)
- Adding HSV-Saturation augmentation (hsv_s)
- Adding HSV-Value augmentation (hsv_v)
- Adding image rotation (degrees)
- Adding image translation (translate)
- Adding image flipping (flipud, fliplr)
- Adding Image Enhancement Transform (CLAHE)
- Adding Random brightness, contrast (RandomBrightnessContrast)
- Adding Channel Shuffling Transform (ChannelShuffle)
- Adding Defocus Blur Transform (Defocus)
- Adding Glass Blur Transform (GlassBlur)
- optimizer = auto
- Delete Random brightness, contrast (RandomBrightnessContrast)
- Delete up-down flipping (flipud)
- Adding mosaic
- Adding mixup
- Inference, set conf = 0.001, iou = 0.7, max_det = 100
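As a rough illustration of how the settings listed above map onto the Ultralytics API, the sketch below collects them as keyword arguments for `train` and `val`. The argument names (`hsv_h`, `flipud`, `mosaic`, `conf`, `max_det`, and so on) are Ultralytics options; the numeric augmentation values, the weights file `yolo11n.pt`, and the dataset file `data.yaml` are placeholders assumed for this example, not the settings used in the notebooks. The Albumentations transforms (CLAHE, ChannelShuffle, Defocus, GlassBlur) are not covered here.

```python
# Sketch only: augmentation strengths are PLACEHOLDERS, not the authors' values.
TRAIN_KWARGS = dict(
    data="data.yaml",      # assumed dataset config path
    imgsz=256,             # 256X256 training resolution
    optimizer="auto",
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,   # HSV hue / saturation / value jitter
    degrees=10.0,          # rotation range in degrees
    translate=0.1,         # translation fraction
    flipud=0.0,            # up-down flip disabled (it was later removed)
    fliplr=0.5,            # left-right flip probability
    mosaic=1.0,
    mixup=0.1,
)

# Inference/validation settings taken from the list above.
VAL_KWARGS = dict(data="data.yaml", imgsz=256, conf=0.001, iou=0.7, max_det=100)


def run(weights="yolo11n.pt"):
    """Train and validate; requires the `ultralytics` package."""
    from ultralytics import YOLO  # imported lazily so the config is usable alone

    model = YOLO(weights)
    model.train(**TRAIN_KWARGS)
    return model.val(**VAL_KWARGS)
```

A very low `conf` such as 0.001 keeps almost every candidate box, which is the usual choice when the goal is to measure mAP rather than to produce clean visual output.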
ColonoScopy
GastroScopy
- Build Detectron2 Model Pipeline
- Backbone: faster_rcnn_R_50_FPN_1x
- Adding Flip Transform (Horizontal Flip)
- Adding Image Enhancement Transform (CLAHE)
- Adding Channel Transform (ToGray, ChannelDropout, ChannelShift, RGBShift)
- Adding Dropout Transform (XYMasking, CoarseDropout, BBoxSafeRandomCrop)
- Adding Noisy Transform (RandomGravel, RandomSnow)
- Adding Custom MixUp Transform (probability=0.1)
- Adding Custom Mosaic Transform (probability=0.25)
- Batch Size: 16 -> 8
- base_lr_end: 0 -> 1e-9
- weight_decay: 1e-5 -> 1e-4
- warmup_factor: 1e-3 -> 1e-4
- warmup_iters: max_iters//40 -> max_iters//20
- RPN.BBOX_REG_LOSS_TYPE: smooth_l1 -> ciou
- ROI.BOX_HEAD.BBOX_REG_LOSS_TYPE: smooth_l1 -> ciou
- base_lr: 1e-2 -> 5e-3
- weight_decay: 1e-4 -> 5e-5
- Pooler_Resolution: 14 -> 28
- Img_SIZE: 256 -> 512
- RPN.BATCH_SIZE_PER_IMAGE: 256 -> 384
- cfg.MODEL.ANCHOR_GENERATOR.SIZES: [[32, 64, 128, 256, 512]] -> [[16, 32, 64, 128, 192, 256, 512]]
- cfg.MODEL.RPN.IN_FEATURES: ["p2", "p3", "p4","p5", "p6"] -> ["p2", "p2", "p3", "p4","p4","p5", "p6"]
- MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE: 512 -> 256
- MODEL.ROI_HEADS.POSITIVE_FRACTION: 0.25 -> 0.5
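The solver, loss, and sampling changes listed above could be applied to a Detectron2 config roughly as follows. This is a sketch under assumptions: the mapping of the shorthand names in the list (for example `Pooler_Resolution`) to full config keys is our reading, the final value is used wherever a setting was changed more than once, and the anchor and `IN_FEATURES` changes are left out.

```python
# Final values from the list above, keyed by ASSUMED full Detectron2 config names.
OVERRIDES = {
    "SOLVER.IMS_PER_BATCH": 8,
    "SOLVER.BASE_LR": 5e-3,
    "SOLVER.BASE_LR_END": 1e-9,
    "SOLVER.WEIGHT_DECAY": 5e-5,
    "SOLVER.WARMUP_FACTOR": 1e-4,
    "MODEL.RPN.BBOX_REG_LOSS_TYPE": "ciou",
    "MODEL.ROI_BOX_HEAD.BBOX_REG_LOSS_TYPE": "ciou",
    "MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION": 28,
    "MODEL.RPN.BATCH_SIZE_PER_IMAGE": 384,
    "MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE": 256,
    "MODEL.ROI_HEADS.POSITIVE_FRACTION": 0.5,
}


def warmup_iters(max_iters):
    """warmup_iters: max_iters//40 -> max_iters//20."""
    return max_iters // 20


def build_cfg(max_iters):
    """Build a Faster R-CNN R50-FPN config; requires the `detectron2` package."""
    from detectron2 import model_zoo
    from detectron2.config import get_cfg

    cfg = get_cfg()
    cfg.merge_from_file(
        model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml")
    )
    flat = []
    for key, value in OVERRIDES.items():
        flat += [key, value]          # merge_from_list expects [k1, v1, k2, v2, ...]
    cfg.merge_from_list(flat)
    cfg.SOLVER.MAX_ITER = max_iters
    cfg.SOLVER.WARMUP_ITERS = warmup_iters(max_iters)
    return cfg
```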
ColonoScopy
- Build Detectron2 Model Pipeline
- Backbone: faster_rcnn_R_50_FPN_3x
- Adding Data augmentation
- Adding RandomFlip with (prob=0.4, horizontal=True, vertical=False)
- Adding RandomContrast with (intensity_min=0.5, intensity_max=1.5)
- Adding RandomBrightness with (intensity_min=0.5, intensity_max=1.5)
- Adding Custom Mosaic Transform with probability = 0.2
- Adding RandomLighting with scale=0.1
- Adding RandomRotation with angle=[-10, 10]
- Backbone: faster_rcnn_R_50_FPN_1x
- scheduler: WarmupCosineLR
- Adding Mosaic with probability 0.2
- batch size: 8
- warmup factor: 5e-4
- scheduler: WarmupMultiStepLR
- batch size: 4
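The built-in augmentations named in the list above exist in `detectron2.data.transforms` with these parameter names. A minimal sketch of assembling them is shown below; the helper name `build_augmentations` is ours, and the custom Mosaic transform is not included because its implementation is specific to the notebooks.

```python
# (class name, keyword arguments) pairs taken from the list above.
AUG_SPECS = [
    ("RandomFlip", dict(prob=0.4, horizontal=True, vertical=False)),
    ("RandomContrast", dict(intensity_min=0.5, intensity_max=1.5)),
    ("RandomBrightness", dict(intensity_min=0.5, intensity_max=1.5)),
    ("RandomLighting", dict(scale=0.1)),
    ("RandomRotation", dict(angle=[-10, 10])),
]


def build_augmentations(specs=AUG_SPECS):
    """Instantiate the augmentations; requires the `detectron2` package."""
    import detectron2.data.transforms as T

    return [getattr(T, name)(**kwargs) for name, kwargs in specs]
```

The resulting list would typically be handed to a `DatasetMapper` through its `augmentations` argument when building the training data loader.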
GastroScopy
- Build EfficientDet Model Pipeline
- Backbone: tf_efficientnet_b0
- Model: tf_efficientdet_d0
- Adding HorizontalFlip(p=0.5)
- Batch Size: 4
- lr: 1e-3
- weight_decay: 1e-5
- Scheduler_Type: CosineAnnealingLR
- num_epochs: 20
- fpn_name: BiFpn
- FPN Node Weight Method: Fast Attention
- fpn_cell_repeats: 3
- fpn_channels: 384
- fpn_activation: swish
- label_smoothing: 0.0 -> 0.15
- num_scales: 3 -> 4 => [2^0, 2^0.33, 2^0.66] -> [2^0, 2^0.25, 2^0.5, 2^0.75]
- anchor_scale: 3
- anchor_ratio: [0.5, 1.0, 2.0]
- Backbone: tf_efficientnet_b0 -> tf_efficientnet_b0.ns_jft_in1k
- img_size: 256X256 -> 512X512
- Backbone: tf_efficientnet_b0 -> tf_efficientnet_b1
- Model: tf_efficientdet_d0 -> tf_efficientdet_d1
- num_epochs: 20 -> 40
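The `num_scales` change above follows from the usual octave-scale rule, where scale i equals 2^(i / num_scales). The short sketch below reproduces both scale lists and shows how the number of anchors per feature-map location grows; the function names are ours.

```python
def anchor_octave_scales(num_scales):
    """Scales 2^(i / num_scales) for i = 0 .. num_scales - 1."""
    return [2 ** (i / num_scales) for i in range(num_scales)]


def anchors_per_location(num_scales, aspect_ratios=(0.5, 1.0, 2.0)):
    """Each location gets one anchor per (scale, aspect ratio) pair."""
    return num_scales * len(aspect_ratios)


print([round(s, 3) for s in anchor_octave_scales(3)])  # [1.0, 1.26, 1.587]
print([round(s, 3) for s in anchor_octave_scales(4)])  # [1.0, 1.189, 1.414, 1.682]
print(anchors_per_location(3), "->", anchors_per_location(4))  # 9 -> 12
```

With the three aspect ratios listed above, moving from 3 to 4 scales raises the anchor count per location from 9 to 12, giving finer size coverage at the cost of more candidate boxes.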
ColonoScopy
- Build EfficientDet Model Pipeline
- Backbone: resdet50
- Adding HorizontalFlip(p=0.5)
- scheduler: CosineAnnealingLR
- Adding RandomBrightnessContrast
- Adding RandomRotated
- Adding CLAHE
- 50 epochs
- train batch size: 16 -> 8
- train batch size: 8 -> 4
- weight_decay: 1e-4
- scheduler -> CosineAnnealingWarmRestarts
GastroScopy 📌 Metric mAP50, 75
Version | name | Train | Test | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|---|
V1 | yolo11n | NoteBook | NoteBook | 0.466 | 0.148 | 0.540 | 0.594 | 0.229 | 0.090 | 0.240 | 0.356 |
V2 | yolo11n | NoteBook | NoteBook | 0.574 | 0.420 | 0.631 | 0.671 | 0.324 | 0.209 | 0.348 | 0.397 |
V3 | yolo11n | NoteBook | NoteBook | 0.532 | 0.390 | 0.548 | 0.658 | 0.301 | 0.180 | 0.309 | 0.412 |
V4 | yolo11n | NoteBook | NoteBook | 0.529 | 0.354 | 0.598 | 0.636 | 0.312 | 0.218 | 0.313 | 0.404 |
V5 | yolo11n | NoteBook | NoteBook | 0.610 | 0.459 | 0.660 | 0.712 | 0.353 | 0.178 | 0.413 | 0.468 |
V6 | yolo11n | NoteBook | NoteBook | 0.658 | 0.519 | 0.704 | 0.750 | 0.354 | 0.207 | 0.357 | 0.498 |
ColonoScopy 📌 Metric mAP50, 75
Version | name | Train | Test | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|---|
V1 | rtdetr-large | NoteBook | NoteBook | 0.618 | 0.408 | 0.656 | 0.791 | 0.426 | 0.116 | 0.537 | 0.626 |
GastroScopy 📌 Metric mAP50, 75
Version | Train | Infer | Config | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|---|
V1 | NoteBook | NoteBook | Config | 0.587 | 0.410 | 0.637 | 0.716 | 0.281 | 0.134 | 0.285 | 0.423 |
V1 with TTA | NoteBook | NoteBook | Config | 0.616 | 0.440 | 0.672 | 0.736 | 0.286 | 0.146 | 0.314 | 0.397 |
V2 | NoteBook | NoteBook | Config | 0.639 | 0.457 | 0.697 | 0.763 | 0.294 | 0.141 | 0.316 | 0.425 |
V3 | NoteBook | NoteBook | Config | 0.665 | 0.504 | 0.712 | 0.779 | 0.319 | 0.158 | 0.340 | 0.460 |
V4 | NoteBook | NoteBook | Config | 0.668 | 0.521 | 0.717 | 0.767 | 0.322 | 0.183 | 0.329 | 0.455 |
V5 | NoteBook | NoteBook | Config | 0.701 | 0.552 | 0.745 | 0.805 | 0.343 | 0.199 | 0.399 | 0.430 |
V6 | NoteBook | NoteBook | Config | 0.675 | 0.533 | 0.700 | 0.792 | 0.363 | 0.188 | 0.419 | 0.480 |
V7 | X | NoteBook | Config | 0.671 | 0.532 | 0.695 | 0.786 | 0.373 | 0.195 | 0.422 | 0.502 |
ColonoScopy 📌 Metric mAP50, 75
Version | Train | Inference | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|
V1 | NoteBook | NoteBook | 0.445 | 0.107 | 0.576 | 0.652 | 0.230 | 0.008 | 0.319 | 0.362 |
V2 | NoteBook | NoteBook | 0.534 | 0.281 | 0.619 | 0.701 | 0.309 | 0.033 | 0.437 | 0.458 |
V3 | NoteBook | NoteBook | 0.565 | 0.316 | 0.634 | 0.744 | 0.267 | 0.035 | 0.337 | 0.431 |
V4 | NoteBook | NoteBook | 0.552 | 0.316 | 0.611 | 0.729 | 0.239 | 0.050 | 0.320 | 0.347 |
V5 | NoteBook | NoteBook | 0.619 | 0.411 | 0.655 | 0.792 | 0.188 | 0.029 | 0.211 | 0.324 |
V6 | NoteBook | NoteBook | 0.612 | 0.404 | 0.640 | 0.792 | 0.228 | 0.027 | 0.293 | 0.366 |
Version | Name | Train | Infer | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|---|
V1 | MViTv2_T(Faster-RCNN) | NoteBook | NoteBook | 0.678 | 0.447 | 0.718 | 0.870 | 0.333 | 0.0412 | 0.430 | 0.527 |
V1 | MViTv2_T(Faster-RCNN) | NoteBook | NoteBook | 0.667 | 0.423 | 0.720 | 0.857 | 0.351 | 0.0469 | 0.447 | 0.560 |
V1 | MViTv2_S(Cascade Faster-RCNN) | NoteBook | NoteBook | 0.667 | 0.431 | 0.690 | 0.879 | 0.394 | 0.091 | 0.464 | 0.628 |
GastroScopy 📌 Metric mAP50, 75
Version | Name | Train | Infer | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|---|
V1 | tf-efficientdet_d0 | NoteBook | NoteBook | 0.399 | 0.335 | 0.167 | 0.695 | 0.139 | 0.092 | 0.020 | 0.305 |
V2 | tf-efficientdet_d0 | NoteBook | NoteBook | 0.591 | 0.550 | 0.448 | 0.775 | 0.213 | 0.129 | 0.148 | 0.363 |
V3 | tf-efficientdet_d0 | NoteBook | NoteBook | 0.624 | 0.528 | 0.535 | 0.809 | 0.279 | 0.180 | 0.223 | 0.433 |
V4 | tf-efficientdet_d0 | NoteBook | NoteBook | 0.620 | 0.536 | 0.528 | 0.794 | 0.293 | 0.212 | 0.207 | 0.458 |
V5 | tf-efficientdet_d1 | NoteBook | NoteBook | 0.689 | 0.551 | 0.713 | 0.803 | 0.376 | 0.199 | 0.432 | 0.498 |
ColonoScopy 📌 Metric mAP50, 75
Version | Name | Train | Infer | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|---|---|
V1 | Resdet50 | NoteBook | NoteBook | 0.625 | 0.428 | 0.617 | 0.830 | 0.315 | 0.046 | 0.398 | 0.500 |
V2 | Resdet50 | NoteBook | NoteBook | 0.662 | 0.452 | 0.671 | 0.862 | 0.357 | 0.080 | 0.468 | 0.523 |
V2 | Resdet50 | NoteBook | NoteBook | 0.665 | 0.458 | 0.695 | 0.842 | 0.385 | 0.092 | 0.515 | 0.548 |
To see the results of applying WBF, NMS, NMW, and Soft-NMS, click here
- YoloV11n: train on 256X256, infer on 256X256 - mAP50: 0.658, mAP75: 0.354
- Detectron2: train on 512X512, infer on 512X512 - mAP50: 0.671, mAP75: 0.373
- EfficientDet0: train on 512X512, infer on 512X512 - mAP50: 0.689, mAP75: 0.376
📌 After applying Weighted Boxes Fusion (WBF), mAP50 increases by +0.05 and mAP75 increases by +0.075
=> WBF - mAP50: 0.735, mAP75: 0.443
To see the result report, click here
Techniques to Combine Box | Notebook | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|
NMS | Notebook | 0.717 | 0.578 | 0.759 | 0.815 | 0.398 | 0.200 | 0.484 | 0.511 |
Soft-NMS | Notebook | 0.662 | 0.534 | 0.677 | 0.774 | 0.417 | 0.224 | 0.483 | 0.545 |
NMW | Notebook | 0.721 | 0.589 | 0.765 | 0.819 | 0.430 | 0.230 | 0.501 | 0.561 |
WBF | Notebook | 0.735 | 0.616 | 0.763 | 0.826 | 0.443 | 0.257 | 0.485 | 0.586 |
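To make the idea behind WBF concrete, here is a minimal single-class sketch in plain Python: boxes from all models are clustered by IoU, each cluster is replaced by a confidence-weighted average box, and the fused score is scaled down when fewer models agree. It is a simplified illustration, not the code used in the notebooks; the function names and the toy predictions are assumptions. In practice a ready-made implementation such as `weighted_boxes_fusion` from the `ensemble_boxes` package is a common choice.

```python
def iou(a, b):
    """IoU of two boxes given as [xmin, ymin, xmax, ymax]."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster):
    """Confidence-weighted average box and mean score of a cluster."""
    total = sum(score for _, score in cluster)
    box = [sum(b[i] * s for b, s in cluster) / total for i in range(4)]
    return box, total / len(cluster)


def wbf_single_class(preds_per_model, iou_thr=0.55):
    """preds_per_model: one list of (box, score) pairs per model."""
    n_models = len(preds_per_model)
    dets = sorted((d for preds in preds_per_model for d in preds),
                  key=lambda d: d[1], reverse=True)
    clusters, fused = [], []
    for box, score in dets:
        best, best_iou = -1, iou_thr
        for idx, (fused_box, _) in enumerate(fused):
            overlap = iou(box, fused_box)
            if overlap > best_iou:
                best, best_iou = idx, overlap
        if best < 0:                      # no match: start a new cluster
            clusters.append([(box, score)])
            fused.append(fuse(clusters[-1]))
        else:                             # match: add and re-fuse
            clusters[best].append((box, score))
            fused[best] = fuse(clusters[best])
    # Down-weight boxes that only some of the models predicted.
    return [(b, s * min(len(c), n_models) / n_models)
            for (b, s), c in zip(fused, clusters)]


# Toy predictions from three hypothetical models.
model_a = [([10, 10, 50, 50], 0.9)]
model_b = [([12, 12, 52, 52], 0.8), ([100, 100, 140, 140], 0.6)]
model_c = [([11, 9, 49, 51], 0.7)]
result = wbf_single_class([model_a, model_b, model_c])
for box, score in result:
    print([round(v, 1) for v in box], round(score, 3))
```

Unlike NMS, which keeps one box and discards the rest, every overlapping box contributes to the fused coordinates, which is one plausible reason for the larger gain at the stricter mAP75 threshold.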
By adjusting only the confidence scores of your bboxes, without changing their xmin, xmax, ymin, ymax, you can increase your mAP. In the following example, the entire plot is one class, each dot is one bbox, and the numbers are confidence scores. If the confidence scores of bboxes D and G are changed, the mAP for this class increases by +0.12. It is therefore important to calibrate bbox probabilities.
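The effect is easy to reproduce on toy data, because AP depends only on how detections are ranked by confidence. The sketch below uses a simplified, non-interpolated AP (not the COCO evaluator) and made-up detections; it does not reproduce the +0.12 figure, only the mechanism.

```python
def average_precision(dets, num_gt):
    """Non-interpolated AP for one class.

    dets: list of (confidence, is_true_positive) pairs.
    num_gt: number of ground-truth boxes for the class.
    """
    ranked = sorted(dets, key=lambda d: d[0], reverse=True)
    tp, ap = 0, 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            tp += 1
            ap += tp / rank          # precision at this recall step
    return ap / num_gt


# Same five boxes in both cases; only two confidence scores differ.
before = [(0.9, True), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
after = [(0.9, True), (0.4, False), (0.7, True), (0.6, False), (0.65, True)]

ap_before = average_precision(before, num_gt=3)
ap_after = average_precision(after, num_gt=3)
print(round(ap_before, 3), round(ap_after, 3))
```

Lowering the score of one false positive and raising the score of one true positive moves all true positives to the top of the ranking, so AP rises even though no box coordinates changed.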
The final confidence score combines two probabilities. The first comes from the object detection model and represents the likelihood that a detected object belongs to a specific class. The second comes from the classifier model, which refines this prediction by providing an independent confidence estimate.
To explore the relative importance of these two components, we ran experiments that adjust their contributions to the final confidence score. As a result, we modified the formula as follows:
$$\text{Confidence Score} = P(\text{class} \mid \text{detection})^{\alpha} \times \sigma(\text{classifier output})^{\beta}$$
where α and β control the balance between the object detection model and the classifier model, allowing for optimal confidence calibration.
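A minimal sketch of this formula is given below, assuming the classifier output is a raw logit that is passed through a sigmoid. The function and argument names are ours, and the default α and β are simply one of the settings explored in the experiments.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def calibrated_confidence(det_prob, clf_logit, alpha=0.7, beta=0.3):
    """P(class | detection)^alpha * sigmoid(classifier output)^beta."""
    return (det_prob ** alpha) * (sigmoid(clf_logit) ** beta)


# alpha=1, beta=0 leaves the detector score unchanged.
print(calibrated_confidence(0.8, 2.0, alpha=1.0, beta=0.0))
# A confident classifier (large positive logit) barely lowers the score,
# while an unsure one (logit near 0) pulls it down.
print(round(calibrated_confidence(0.8, 4.0), 3))
print(round(calibrated_confidence(0.8, 0.0), 3))
```

Because only the scores change and not the box coordinates, this re-ranking affects mAP purely through the ordering of detections.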
📌 After calibrating the confidence score, mAP50 increases by +0.018 and mAP75 increases by +0.006
Alpha | Beta | mAP@50 | mAP@50-Ulcer | mAP@50-Polyp | mAP@50-Cancer | mAP@75 | mAP@75-Ulcer | mAP@75-Polyp | mAP@75-Cancer |
---|---|---|---|---|---|---|---|---|---|
1.0 | 0.0 | 0.735 | 0.616 | 0.763 | 0.826 | 0.443 | 0.257 | 0.485 | 0.586 |
0.7 | 0.3 | 0.753 | 0.622 | 0.796 | 0.841 | 0.449 | 0.255 | 0.501 | 0.592 |
0.6 | 0.4 | 0.753 | 0.621 | 0.797 | 0.841 | 0.447 | 0.253 | 0.498 | 0.591 |
0.5 | 0.5 | 0.752 | 0.618 | 0.797 | 0.841 | 0.445 | 0.251 | 0.496 | 0.589 |
0.4 | 0.6 | 0.749 | 0.613 | 0.794 | 0.838 | 0.443 | 0.249 | 0.492 | 0.588 |
0.3 | 0.7 | 0.744 | 0.608 | 0.789 | 0.834 | 0.438 | 0.243 | 0.487 | 0.585 |
To see more detail, click here
Custom BBox Loss
Custom FPN Architecture
Adding Attention Techniques
Exploration of other object detection library
Exploration of other backbone model
{
Author = {서윤제, 유민선},
Title = {Endoscope Object Detection Model},
Year = {2025},
Publisher = {GitHub},
Journal = {GitHub repository},
Howpublished = {\url{https://github.com/SEOYUNJE/Endoscope-Object-Detection}}
}
=> 서윤제's email: [email protected]
=> 유민선's email: [email protected]