A multimodal quantum AI solution that integrates clinical and imaging data in a Graph Neural Network (GNN) to classify cancer progression. The project aims to improve healthcare access, in line with UN Sustainable Development Goal 3 (SDG 3, Good Health and Well-being), by integrating real-world data.
Below is an overview of the project structure.
.
|-test
| |-clinical-classification.ipynb
| |-mwe.ipynb
|-docs
|-results
|-data
| |-clinical_data
| | |-clinical_Data.csv
| | |-train_embeddings.npy
| | |-test_embeddings.npy
| |-labels
| | |-train_labels.csv
| | |-test_labels.csv
| |-image_data
| | |-png
| | |-train_embeddings.npy
| | |-test_embeddings.npy
| |-test_data.csv
| |-train_data.csv
| |-data_loader.py
|-src
| |-pvem
| | |-clinical_data_embeddings.ipynb
| | |-clinical_data_embeddings.py
| |-qcnn
| | |-quanvolution.py
| |-rmrm
| | |-utils
| | | |-utils.py
| | |-embeddings.py
| | |-model
| | | |-intialisation.py
| | | |-network.py
| | | |-components.py
| |-config
| | |-config_train.json
| | |-config_test.json
| |-test.py
| |-train.py
|-pipeline.py
|-requirements.txt
The network implementation is in the src folder, where:
- pvem: performs clinical preprocessing and generates clinical embeddings
- qcnn: performs the quantum convolutions and generates image embeddings
- rmrm: combines the previous embeddings in a graph neural network and performs classification
- config: contains configuration files used at train and test time
Additionally, the data folder contains:
- clinical_data: the CSV data used at train and test time to generate clinical embeddings and the corresponding target labels
- image_data: a folder containing a PNG image for each patient (name format: x.png, where x is the patient ID)
After the embeddings are generated, they are stored as .npy files in the respective folder.
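The x.png naming convention described above makes it straightforward to index images by patient ID. The following is a minimal sketch under that assumption only; the function name index_patient_images is hypothetical and is not part of the project's data_loader.py. IDs are kept as strings because the README does not state whether they are numeric.

```python
from pathlib import Path


def index_patient_images(png_dir):
    """Map patient ID -> image path for files named '<id>.png'.

    Hypothetical helper for illustration; not taken from the project code.
    """
    images = {}
    for path in sorted(Path(png_dir).glob("*.png")):
        # The file stem is the patient ID, e.g. '27.png' -> '27'.
        images[path.stem] = path
    return images
```

For example, index_patient_images("data/image_data/png") would return a dictionary keyed by patient ID, assuming the images live in the png subfolder shown in the project tree.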
The whole pipeline is handled by pipeline.py in the root of the project. Refer to the following section for some example use cases.
This project requires Python 3.9. The dependencies are listed in the requirements.txt file and can be installed as follows:
pip install -r requirements.txt
Liver Cancer Dataset from The Cancer Imaging Archive: HCC-TACE-Seg
Run embedding generation, training, and testing through the pipeline.py script. To generate clinical and image embeddings, use mode data:
pipeline.py --mode data
Results are saved in the data directory and include embeddings for both training and testing.
Use modes train and test to train and test the model. Parameters for each mode can be configured in the respective JSON config files.
pipeline.py --config src/config/config_train.json --mode train
pipeline.py --config src/config/config_test.json --mode test
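For orientation, a command-line entry point that accepts the --mode and --config flags used above could be sketched as follows. This is not the project's actual pipeline.py: the function names parse_args and load_config, the choice to make --config optional, and the behavior of returning an empty configuration when no file is given are all assumptions made for illustration.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the flags used in the commands above (assumed interface)."""
    parser = argparse.ArgumentParser(
        description="Run embedding generation, training, or testing."
    )
    parser.add_argument(
        "--mode",
        required=True,
        choices=["data", "train", "test"],
        help="Pipeline stage to run.",
    )
    parser.add_argument(
        "--config",
        default=None,
        help="Path to a JSON config file (assumed optional, as in data mode).",
    )
    return parser.parse_args(argv)


def load_config(path):
    """Read a JSON config file; return an empty dict if no path is given."""
    if path is None:
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

With this sketch, parse_args(["--config", "src/config/config_train.json", "--mode", "train"]) yields a namespace whose mode is "train", and load_config can then read the parameters for that mode. Restricting --mode with choices makes an unsupported mode fail early with a usage message.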