This code reproduces part of the results presented in the paper Interpretation of Disease Evidence for Medical Images Using Adversarial Deformation Fields, to be presented at MICCAI 2020. This code implementation is done in Linux, using Python 3.7.3 and PyTorch 1.2, and was partially inspired by Orobix VAGAN code. Part of the code is modified from Baumgartner's VAGAN code. `git-lfs` is needed to clone the repository. You can download the zip file if you do not have `git-lfs`.
The DeFI-GAN algorithm can be used to generate spatial explanations of disease evidence. An adversarially trained generator produces deformation fields that modify images of diseased patients to resemble images of healthy patients. The deformation field presents, by the use of quiver plots, changes in structure position, shape, and size. These changes represent what the method found as evidence of disease in the dataset and can also be used to evaluate unexpected dataset biases. The DeFI-GAN method presented higher normalized cross-correlation (NCC) scores than a baseline (VA-GAN). NCC was calculated between disease effect maps generated by the methods and disease effect maps calculated using longitudinal data. The following image is an overview of the DeFI-GAN method. For more details, consult the paper.
The COPD dataset used in the paper is private and is not available for outside investigators. The ADNI dataset has limited access. Downloading and preprocessing the ADNI dataset as used in the paper may take several weeks. We also provide a toy dataset for quick testing of the algorithm without worrying about setting up data. The first run of the code will automatically generate the synthetic dataset.
- To install all the needed libraries, you can use the `requirements.sh` file. It assumes you have conda or miniconda installed and creates a conda environment called `defigan`, with prerequisites installed. Activate the environment before running the code using `conda activate defigan`.
- For running the method for the ADNI dataset, install the SimpleITK library following these instructions. After compiling it, make sure you have the conda environment `defigan` activated when following the instructions in the SimpleITK Python module installation section. The compilation may take a couple hours, but SimpleITK is not necessary for playing with the toy dataset.
- For details on how to set up the ADNI dataset, go to the "ADNI data setup" section at the end of this README.
This section provides the commands you should use to train and test the proposed method, DeFI-GAN, and the baseline, VA-GAN. A few details:
- To reproduce the image provided in Results, run the "Test pre-trained model" commands.
- All commands select the GPU with index 0, but you can change the argument `gpus` according to your needs.
- For the test commands, replace the `<timestamp-id>` expression with the respective value of the training experiment folder.
- You can run `python -m src.train --help` to see all available options for modifying the runs.
- All commands should be run from the project base folder.
- To check test NCC scores, open the log.txt file inside the experiment folder (`./runs/...`).
- The first time a dataset is used, H5 files are created for faster loading of datasets in subsequent runs, so the first run may take an unusually long time to start producing outputs.
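The `<timestamp-id>` part of a run folder can be copied by hand from `./runs`, or looked up with a few lines of Python. The sketch below is only an illustration: the helper name `latest_checkpoint` is hypothetical and not part of the repository, and it ranks matching folders by modification time because the exact timestamp format is not assumed here.

```python
import glob
import os


def latest_checkpoint(experiment, runs_dir="./runs",
                      checkpoint_name="generator_state_dict_best_epoch"):
    """Return the checkpoint path inside the newest run folder of an experiment.

    Hypothetical helper: folders are matched as <runs_dir>/<experiment>_* and
    ranked by modification time. An experiment name that is a prefix of another
    one (for example "my_toy" and "my_toy_training_flow") matches both.
    """
    pattern = os.path.join(runs_dir, glob.escape(experiment) + "_*")
    candidates = [p for p in glob.glob(pattern) if os.path.isdir(p)]
    if not candidates:
        raise FileNotFoundError(f"no run folder matches {pattern}")
    newest = max(candidates, key=os.path.getmtime)
    return os.path.join(newest, checkpoint_name)
```

Calling `latest_checkpoint("my_toy_training_flow")` from the project base folder would give a path that can be passed to `--load_checkpoint_g`.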
This dataset contains centered squares of several sizes. Larger squares are considered "healthy" (class 0) and smaller squares are considered "abnormal" (class 1). Squares are generated at a fixed size, with random smoothed Gaussian noise added to their interior. Each square is then resized to a random side length drawn from a Weibull distribution. An arbitrary threshold on the side length defines the class of the square. Ground truths for the differences between an "abnormal" square and its "healthy" equivalent are calculated by resizing the square and then performing a subtraction. For this dataset, a batch size of 8 needs a GPU with at least 8 GB of memory.
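The sampling logic described above can be sketched in plain Python. This is a simplified illustration, not the repository's generator: it draws the square directly at its final size and omits the smoothed noise and the resize step, and the function name and every numeric default (image size, Weibull scale and shape, threshold) are assumptions chosen for the example.

```python
import random


def sample_toy_square(rng, image_size=64, weibull_scale=20.0,
                      weibull_shape=3.0, threshold=18.0):
    """Draw one centered square and its class label.

    All numeric defaults are illustrative assumptions, not the paper's values.
    """
    # Side length follows a Weibull distribution, clipped to fit in the image.
    side = rng.weibullvariate(weibull_scale, weibull_shape)
    side = int(round(min(max(side, 1.0), image_size)))
    # Larger squares are "healthy" (class 0), smaller ones "abnormal" (class 1).
    # Treating a side equal to the threshold as healthy is an assumption.
    label = 0 if side >= threshold else 1
    start = (image_size - side) // 2
    image = [[0.0] * image_size for _ in range(image_size)]
    for row in range(start, start + side):
        for col in range(start, start + side):
            image[row][col] = 1.0
    return image, side, label
```

Passing a seeded `random.Random` instance as `rng` keeps the samples reproducible.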
DeFI-GAN training command:
python -m src.train --experiment=my_toy_training_flow --gpus=0 --lambda_regg=25 --generator_output=flow --nepochs=120 --constant_to_multiply_flow=10
Test command:
python -m src.train --experiment=my_toy_test_flow --gpus=0 --lambda_regg=25 --generator_output=flow --nepochs=1 --constant_to_multiply_flow=10 --skip_train=true --load_checkpoint_g=./runs/my_toy_training_flow_<timestamp-id>/generator_state_dict_best_epoch --split_validation=test
Test pretrained model:
python -m src.train --experiment=my_toy_test_pretrained_flow --gpus=0 --lambda_regg=25 --generator_output=flow --nepochs=1 --constant_to_multiply_flow=10 --skip_train=true --load_checkpoint_g=./pretrained_models/toy_flow --split_validation=test
VA-GAN (baseline) training command:
python -m src.train --experiment=my_toy_training_add --gpus=0 --lambda_regg=50 --generator_output=residual --dataset_to_use=toy --nepochs=120
Test command:
python -m src.train --experiment=my_toy_test_add --gpus=0 --lambda_regg=50 --generator_output=residual --dataset_to_use=toy --nepochs=1 --skip_train=true --load_checkpoint_g=./runs/my_toy_training_add_<timestamp-id>/generator_state_dict_best_epoch --split_validation=test
Test pretrained model:
python -m src.train --experiment=my_toy_test_pretrained_add --gpus=0 --lambda_regg=50 --generator_output=residual --dataset_to_use=toy --skip_train=true --nepochs=1 --load_checkpoint_g=./pretrained_models/toy_add --split_validation=test
The ADNI dataset contains brain MRI volumes of patients in different stages of the progression of Alzheimer's disease (AD). We used cases with mild cognitive impairment (MCI) as class 0, and cases with AD as class 1. The subtraction of registered longitudinal data of patients who evolved from MCI to AD was used as ground truth.
DeFI-GAN training command:
python -m src.train --dataset_to_use=adni --batch_size=4 --experiment=my_adni_training_flow --n_dimensions_data=3 --generator_output=flow --lambda_regg=10 --nepochs=60 --use_old_schedule=true --length_initialization_old_schedule=60 --gpus=0 --ADNI_images_location=<ADNI data location>
Test command:
python -m src.train --dataset_to_use=adni --batch_size=4 --experiment=my_adni_test_flow --n_dimensions_data=3 --generator_output=flow --lambda_regg=10 --use_old_schedule=true --length_initialization_old_schedule=60 --gpus=0 --ADNI_images_location=<ADNI data location> --skip_train=true --nepochs=1 --load_checkpoint_g=./runs/my_adni_training_flow_<timestamp-id>/generator_state_dict_best_epoch --split_validation=test
Test pretrained model:
python -m src.train --dataset_to_use=adni --batch_size=4 --experiment=my_adni_test_pretrained_flow --n_dimensions_data=3 --generator_output=flow --lambda_regg=10 --use_old_schedule=true --length_initialization_old_schedule=60 --gpus=0 --ADNI_images_location=<ADNI data location> --skip_train=true --nepochs=1 --load_checkpoint_g=./pretrained_models/brain_flow --split_validation=test --scale_flow_arrow=5 --gap_between_arrows=2
For DeFI-GAN on the ADNI dataset, a batch size of 4 requires a GPU with at least 24 GB of memory.
VA-GAN (baseline) training command:
python -m src.vagan_tensorflow.vagan_train --gpus=0 --experiment_name=my_adni_training_add --data_root=<ADNI data location>
Test command:
python -m src.vagan_tensorflow.test_pretrained_vagan_tensorflow --gpus=0 --experiment=my_adni_test_add --ADNI_images_location=<ADNI data location> --load_checkpoint_g=./logdir_tensorflow/gan/my_adni_training_add/model_best_ncc --split_validation=test --batch_size=4 --n_dimensions_data=3
Test pretrained model:
python -m src.vagan_tensorflow.test_pretrained_vagan_tensorflow --gpus=0 --experiment=my_adni_test_pretrained_add --ADNI_images_location=<ADNI data location> --load_checkpoint_g=./pretrained_models/brain_add/model_best_ncc --split_validation=test --batch_size=4 --n_dimensions_data=3
To run the baseline for this dataset, you will need a GPU with at least 11 GB of memory. The difference in memory usage between methods for this dataset comes mainly from the framework (PyTorch vs. Tensorflow) and the inference batch size during training (4 vs. 2). For the Tensorflow VA-GAN implementation, the effective batch size in terms of gradient step is 12, since it uses gradient accumulation of 6 training inferences for each gradient step. Other differences in the Tensorflow implementation include:
For the toy dataset, DeFI-GAN gets an NCC of ~0.7 in the validation set, while VA-GAN gets an NCC of ~0.5. We did not perform hyperparameter optimization, and these results are just for illustration purposes. For the other datasets, expected results are in the table below:
Dataset\Method | VA-GAN (baseline) | DeFI-GAN (proposed method) |
---|---|---|
ADNI | 0.332 | 0.365 |
COPD | 0.174 | 0.204 |
For more detailed results and images, check Interpretation of Disease Evidence for Medical Images Using Adversarial Deformation Fields.
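The NCC scores reported here compare a generated disease effect map with a reference map. As a rough illustration, the sketch below computes the common zero-mean form of normalized cross-correlation on flattened maps. The exact definition used in the paper may differ in details (for example, how constant maps are handled), and the function name is an assumption.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-mean NCC between two equally sized flat sequences of floats."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0  # convention chosen for this sketch when a map is constant
    return sum(x * y for x, y in zip(da, db)) / norm
```

A value close to 1 means the two maps vary together, a value close to -1 means they vary in opposite directions, and a value near 0 means little linear relationship.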

Example outputs with a deformation field (image cells are not reproduced here):

Dataset | NCC | Input image | Desired change | Deformation Field | Produced change | over | Modified image |
---|---|---|---|---|---|---|---|
Toy | 0.710 | | | | | | |
ADNI (coronal view) | 0.369 | | | | | | |

Example outputs without a deformation field (image cells are not reproduced here):

Dataset | NCC | Input image | Desired change | Produced change | over | Modified image |
---|---|---|---|---|---|---|
Toy | 0.532 | | | | | |
ADNI (coronal view) | 0.348 | | | | | |

All the outputs of the model are saved in the `runs` folder, inside a folder for the specific experiment you are running (`<experiment name>_<timestamp-id>`). These are the files that are saved:
- tensorboard/events.out.tfevents.<...>: tensorboard file for following the training losses and validation score in real-time and for checking their evolution through the epochs.
- x1: a fixed batch of validation examples for which outputs will be printed
- x1_label.txt: the label for each of the fixed validation images
- xhat0_<epoch>: modified input image at the end of that epoch.
- flow_quiver_<epoch>: quiver plot of the deformation vector field given by the generator. If the option `scale_flow_arrow` is not modified, the length of the arrows represents the exact displacement that a pixel undergoes (pixel intensity is transferred from the beginning of the arrow to its head). Only saved if `generator_output=='flow'`.
- flow_x_<epoch>: x component of the deformation vector field given by the generator. Only saved if `generator_output=='flow'`.
- flow_y_<epoch>: y component of the deformation vector field given by the generator. Only saved if `generator_output=='flow'`.
- delta_x_<epoch>.png: disease effect map output of the baseline generator. Only non-zero if `generator_output=='residual'`.
- difference_<epoch>: the difference between xhat0 and x1
- difference_overlaid_<epoch>: the difference between xhat0 and x1 colored with a pink/green colormap, overlaid over x1.
- difference_gt.png: ground truth for the difference between xhat0 and x1.
- generator_state_dict_best_epoch: checkpoint for the generator model for the epoch with the highest validation NCC.
- critic_state_dict_best_epoch: checkpoint for the critic model for the epoch with the highest validation NCC.
- log.txt: a way to check the configurations used for that run and check the losses and scores of the model in text format, without loading tensorboard.
- command: command used to run the python script, including all the parser arguments.
- In the case of the 3D ADNI dataset, images are suffixed with "_y_slice" (coronal view), "_x_slice" (sagittal view) and "_z_slice" (axial view).
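The quiver plot description in the list above says that pixel intensity moves from the tail of each arrow to its head. The sketch below is a literal, simplified reading of that sentence for a 2D image. It is not the repository's warping code: it assumes integer displacements, does no interpolation, and leaves unfilled positions at zero. The function name is hypothetical.

```python
def push_pixels(image, flow_x, flow_y):
    """Move each pixel's intensity from the arrow tail (x, y) to its head.

    Illustrative only: integer displacements, no interpolation, and pixels
    pushed outside the image are dropped.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            head_x = x + flow_x[y][x]
            head_y = y + flow_y[y][x]
            if 0 <= head_x < width and 0 <= head_y < height:
                out[head_y][head_x] = image[y][x]
    return out
```

With a flow of one pixel to the right everywhere, a bright pixel in the middle of a 3x3 image ends up one column to the right.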
Steps to organize the ADNI data as used in the DeFI-GAN paper:
- You should not use this dataset if you are just playing with the algorithm. Only go through this process if you plan on using ADNI for your own non-commercial research project. Register to get access to the dataset. You will need a summary of your related project. The approval process can take up to one week.
- Download the data from https://ida.loni.usc.edu/login.jsp?project=ADNI
  - First, go to Download > Study Data > ALL > Select ALL tabular data (csv format) > Download. This should give you all the csv tables of the dataset. You only need these ones: ADNIMERGE.csv, DXSUM_PDXCONV_ADNIALL.csv, MRI3META.csv, MRIMETA.csv, VITALS.csv.
  - Second, go to Download > Image Collections > Advanced Search, and change the search options as:
    - IMAGE TYPES: Check only "Preprocessed"
    - Phase: Check ADNI1, ADNIGO and ADNI2
    - Image Description: `*N3*`
  - Follow: Search > Select All > Add to collection > Enter a name for the collection > OK
  - Follow: Data Collections > Select newly created collection > All 🗹 > Advanced Download, and then select download options that suit your needs and download the dataset.
  - There should be around 600 GB of data to download
- Download the brain MRI atlas, mni_icbm152_t1_tal_nlin_asym_09a.nii.
- Edit the `preprocess_adni_all.py` script.
  - `bmicdatasets_root`: location of your ADNI data. The data should be organized inside this folder following:

        └───<ADNI data folder>/
            ├───Tables/
            |   ├───ADNIMERGE.csv
            |   ├───DXSUM_PDXCONV_ADNIALL.csv
            |   ├───MRI3META.csv
            |   ├───MRIMETA.csv
            |   └───VITALS.csv
            ├───atlas/
            |   └───mni_icbm152_t1_tal_nlin_asym_09a.nii
            └───ADNI/
                ├─── ...
                ├───941_S_4377/
                |   └───MT1__GradWarp__N3m/
                |       ├───2012-01-04_13_43_59.0/
                |       |   └───S135356
                |       |       └───ADNI_941_S_4377_MR_MT1__GradWarp__N3m_Br_20120106140231171_S135356_I275760.nii
                |       └─── ...
                └─── ...

  - `N4_executable`: location of `N4BiasFieldCorrection`. You have to install it if it is not yet installed on your computer.
  - `robex_executable`: location of `runROBEX.sh`. You have to install it if it is not yet installed on your computer.
  - If not already on your computer, FSL should be installed and its bin folder included in the system path.
- Run the script by using `python -m src.adni_preprocess_all`. This preprocessing may take a long time (in the order of weeks if running on only one computer). It might be advantageous to divide the data into different folders and run the script on several servers in parallel. Results will then be saved to `<bmicdatasets_root>/Processed_part_2/ADNI_all_no_PP_3`.
- Edit the value of variable `Processed_part_2_folder` in the `correct_brain_processing.py` script to point to the location of the `Processed_part_2` folder. Run `python -m src.correct_brain_processing`. Results will be saved to `<bmicdatasets_root>/Processed_part_3/ADNI_all_no_PP_3`.
- Check every resulting file for outliers in terms of file size. Sometimes preprocessing will fail, and, in most of these cases, it will produce either a file larger than usual (when the skull was not completely removed) or smaller than usual (when more parts of the brain were removed than desired). Rerun all these steps for all corrupted files you found.
- For some volumes, even rerunning the whole preprocessing will give wrong results. These volumes can be ignored by the algorithm by including them in `images_where_preprocessing_failed`, a list defined in `brain_loader.py`, following the format `<rid>_<viscode>`. Seven volumes were removed for the results of the DeFI-GAN paper (855_0, 424_4, 4741_2, 4741_1, 4741_5, 4741_7, 4892_0).
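The file size check in the steps above can be partly automated. The sketch below is one possible approach under stated assumptions: the helper name `find_size_outliers` is hypothetical, and the 20% relative tolerance around the median is an arbitrary starting value, not a number from the paper. Flagged files still need a visual check before deciding to rerun or exclude them.

```python
import os
import statistics


def find_size_outliers(folder, tolerance=0.2):
    """Return files whose size deviates from the median by more than `tolerance`.

    `tolerance` is a relative threshold chosen for illustration (0.2 = 20%).
    """
    sizes = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            sizes[path] = os.path.getsize(path)
    if not sizes:
        return []
    median = statistics.median(sizes.values())
    return sorted(path for path, size in sizes.items()
                  if abs(size - median) > tolerance * median)
```

Pointing it at the folder that holds the preprocessed volumes returns a sorted list of suspicious paths, covering both unusually large and unusually small files.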
This project is licensed under the GNU General Public License v3.0. Some of the files in this repository contain code snippets originating from files licensed under the MIT License.
By: Ricardo Bigolin Lanfredi, [email protected], ricbl.github.io.