Description
From May to August 2024 I worked on late fusion of copy number variation (CNV) data and histopathology microscopy images (H) to predict cancer in patients with Barrett's oesophagus (BE). Killcoyne et al.'s 2020 paper showed that CNV can predict cancer in BE patients up to 15 years in advance, whereas H is what pathologists use for diagnosis. We fed the CNV data into a gradient boosting model to get the probability of BE progressing to a worse state. We passed the images through a pre-trained feature extraction model (the UNI model by the Mahmood lab), then trained a ResNet model (the CLAM model, also by the Mahmood lab) on those features to get the same probability from the images. There were two prediction tasks: a 3-class task (Risk Factor), which predicts whether the condition became more severe on a graded pathology scale, and a 2-class task (Progressor), which predicts whether the patient became a progressor or not. It is important to note that every BE patient starts with BE and no cancer, so that is the baseline we predicted from. We evaluated the models individually, then evaluated whether overall accuracy improved when we merged their probabilities. We found the following results:
As can be seen in the first column ("Modality Type"), the last row is called "Early Fusion". This is when we merge the CNV data and the H images into a single data modality that can then be fed into one model. In the research project we:
- extracted the features from the images with the UNI model;
- normalised the CNV data with the mean and standard deviation of the images' features;
- appended the CNV data to the image feature vectors;
- trained a model with the CLAM architecture to predict the Progressor status.
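The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `early_fuse`, the array shapes, and the random placeholder data are all assumptions, and "normalised with the mean and standard deviation of the images' features" is read here as subtracting the image features' overall mean and dividing by their overall standard deviation, which is only one possible interpretation.

```python
import numpy as np


def early_fuse(patch_features, cnv):
    """Append a patient's CNV vector to every image patch feature vector.

    patch_features: (n_patches, d_img) array, e.g. from a feature extractor.
    cnv:            (d_cnv,) array of CNV values for the same patient.
    Returns an (n_patches, d_img + d_cnv) array.
    """
    patch_features = np.asarray(patch_features, dtype=float)
    cnv = np.asarray(cnv, dtype=float)

    # Assumed reading of the normalisation step: use the image features'
    # overall mean and standard deviation to rescale the CNV values.
    mu = patch_features.mean()
    sigma = patch_features.std()
    if sigma == 0:
        sigma = 1.0  # avoid division by zero for constant features
    cnv_norm = (cnv - mu) / sigma

    # Repeat the CNV vector once per patch and append it column-wise.
    tiled = np.tile(cnv_norm, (patch_features.shape[0], 1))
    return np.concatenate([patch_features, tiled], axis=1)


# Placeholder data: 5 patches with 8 image features, 3 CNV values.
rng = np.random.default_rng(0)
patch_features = rng.normal(size=(5, 8))
cnv = rng.normal(size=3)

fused = early_fuse(patch_features, cnv)
print(fused.shape)
```

The fused array keeps one row per image patch, so it can still be passed to a model that expects a bag of patch-level feature vectors.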
We got good results from this method on the Progressor task, but ran out of time to apply it to the Risk Factor task.
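For comparison, the late fusion described earlier merges the per-class probabilities of the two separately trained models. The description does not say how the probabilities were merged, so the weighted average below is an assumption, as are the function name `late_fuse`, the weight `w_cnv`, and the made-up probabilities.

```python
import numpy as np


def late_fuse(p_cnv, p_img, w_cnv=0.5):
    """Merge two models' class probabilities with a weighted average.

    p_cnv, p_img: (n_patients, n_classes) arrays of predicted probabilities.
    w_cnv:        weight given to the CNV model (hypothetical parameter).
    """
    p_cnv = np.asarray(p_cnv, dtype=float)
    p_img = np.asarray(p_img, dtype=float)
    if p_cnv.shape != p_img.shape:
        raise ValueError("both models must predict the same patients and classes")
    if not 0.0 <= w_cnv <= 1.0:
        raise ValueError("w_cnv must be between 0 and 1")
    # A convex combination of two probability rows is still a probability row.
    return w_cnv * p_cnv + (1.0 - w_cnv) * p_img


# Made-up 2-class (Progressor) probabilities for two patients.
p_cnv = np.array([[0.8, 0.2], [0.3, 0.7]])
p_img = np.array([[0.6, 0.4], [0.1, 0.9]])

p_fused = late_fuse(p_cnv, p_img)
pred = p_fused.argmax(axis=1)  # predicted class index per patient
print(p_fused, pred)
```

The same function works unchanged for the 3-class Risk Factor task, since it only requires both models to output arrays of the same shape.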
