Clone the repository and install dependencies:
git clone https://github.com/yourusername/ss25_Hierarchical_Multiscale_Image_Classification.git
cd ss25_Hierarchical_Multiscale_Image_Classification
pip install -r requirements.txt
All commands are run from the root of the repository:
python src/main.py [OPTIONS]
CLI options and flags:
- --download: Download the CAMELYON16 dataset.
- --base_dir BASE_DIR: Set the base directory for downloaded files (default: ./data).
- --remote: Download all files (default downloads only a subset for testing).
- -p, --patch: Extract patches from WSIs.
- --patch_level LEVEL: WSI level for patch extraction (0, 1, 2, 3, or 'all').
  - Level 0: 1792x1792
  - Level 1: 896x896
  - Level 2: 448x448
  - Level 3: 224x224
  - Example:
    python src/main.py --patch --patch_level 0
    python src/main.py --patch --patch_level all
- -prep, --prepare: Prepare data (create validation set, extract masks, etc.).
- -val, --validation: Create a validation set (5 normal + 5 tumor images).
- -train, --train: Train a ResNet18 classifier on extracted patches (default strategy: weighted loss to counter class imbalance).
- --train_strategy: Train a ResNet18 classifier with a specific strategy. Use with --strategy.
- --strategy STRATEGY: Training strategy for ResNet classifier. Options:
  - self_supervised: Use SimCLR pretraining for feature extraction.
  - balanced: Balance the number of tumor and normal patches in the training set.
  - weighted_loss: Use weighted loss for class imbalance (default for --train).
  - Example:
    python src/main.py --train_strategy --strategy balanced
    python src/main.py --train_strategy --strategy self_supervised
    python src/main.py --train_strategy --strategy weighted_loss
  - If you encounter CUDA errors or want to debug GPU operations, you can run with:
    CUDA_LAUNCH_BLOCKING=1 python src/main.py --train_strategy --strategy self_supervised
    This will force synchronous CUDA execution and provide more informative error messages.
- --extract_features: Extract feature vectors from patches using ResNet18.
- --check_structure: Check if the directory structure is correct.
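Two of the options above can be illustrated with a short, self-contained sketch: the patch size per level and the two non-SimCLR strategies for class imbalance. This is not the repository's implementation. It assumes the sizes listed for --patch_level are patch side lengths in pixels, that weighted_loss uses inverse-frequency class weights (the common "balanced" heuristic), and that balanced undersamples the larger class. The function names are hypothetical.

```python
import random
from collections import Counter

# Sizes as listed under --patch_level (assumed to be side lengths in pixels).
# Each level halves the previous one: 1792 // 2**level.
PATCH_SIZES = {0: 1792, 1: 896, 2: 448, 3: 224}


def patch_size_for_level(level: int) -> int:
    """Return the patch side length for a WSI level (0-3)."""
    if level not in PATCH_SIZES:
        raise ValueError(f"level must be one of {sorted(PATCH_SIZES)}, got {level}")
    return PATCH_SIZES[level]


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def undersample(labels, seed=0):
    """Assumed balancing scheme: keep an equal number of indices per class."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    # The smallest class determines how many samples every class keeps.
    per_class = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, per_class))
    return sorted(kept)
```

For 90 normal and 10 tumor patches, inverse_frequency_weights gives the tumor class a weight of 5.0 and the normal class about 0.56, while undersample keeps 10 patches of each class.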
| Task | Command |
| --- | --- |
| Download a small subset for testing | python src/main.py --download |
| Download the full dataset | python src/main.py --download --remote |
| Extract patches at a level | python src/main.py --patch --patch_level 1 |
| Extract patches at all levels | python src/main.py --patch --patch_level all |
| Prepare data (validation set, masks) | python src/main.py --prep |
| Create validation set only | python src/main.py --val |
| Train ResNet18 classifier | python src/main.py --train |
| Extract features from patches | python src/main.py --extract_features |
| Check directory structure | python src/main.py --check_structure |

Directory structure:
data/
└── camelyon16/
    ├── train/
    │   └── img/
    ├── val/
    │   └── img/
    ├── test/
    │   └── img/
    ├── masks/
    │   ├── lesion_annotations.zip
    │   └── annotations/
    └── patches/
        ├── level_0/
        │   ├── normal_001/
        │   ├── tumor_001/
        │   └── ...
        ├── level_1/
        ├── level_2/
        └── level_3/
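The layout above can be verified by hand with a few lines of Python. The sketch below is only an illustration of such a check, not the code behind --check_structure. The helper name missing_dirs and the EXPECTED_DIRS list are hypothetical, and the paths are copied from the tree.

```python
from pathlib import Path

# Directories taken from the tree above, relative to the base directory.
EXPECTED_DIRS = [
    "camelyon16/train/img",
    "camelyon16/val/img",
    "camelyon16/test/img",
    "camelyon16/masks/annotations",
    "camelyon16/patches",
]


def missing_dirs(base_dir="./data"):
    """Return the expected directories that do not exist under base_dir."""
    base = Path(base_dir)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
```

Calling missing_dirs("./data") returns an empty list when every directory is present, and otherwise the relative paths that still need to be created.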
- Modify src/config.py to adjust paths, hyperparameters, and experiment settings.
If you use this codebase, please cite the repository and the CAMELYON16 dataset.
This project is licensed under the MIT License. See LICENSE for details.

