RHEED (Reflection High-Energy Electron Diffraction) is a technique for characterizing the surface of crystalline materials. An electron gun directs a high-energy electron beam at the surface at a grazing angle, where the electrons are diffracted. The diffracted electrons strike a detector and form a pattern that is characteristic of the material.
These patterns have been analyzed with many machine-learning methods, such as PCA, a classical technique, to classify a material's pattern and understand its properties. We adopted state-of-the-art self-supervised learning methods developed by Google and Meta.
CNNs revolutionized computer vision, but they have a critical limitation: building and labeling a dataset for the analysis you want is expensive. Even a real laboratory rarely has the budget to have experts label all of its data, and the work consumes a great deal of poor graduate students' time. To address this, Google and Meta proposed algorithms that let a model learn the differences between images and compute a loss from them, so it can learn features useful for classification without any labels.
These algorithms proved valuable for diagnosing COVID-19. Unlike a graduate student's time, which is cheap, a doctor's time is too expensive to spend labeling thousands of X-ray images. A physicist's time is priceless as well, so we adopted these algorithms for RHEED.
This research was supported by the Smart Surface Material Lab at UoS and KRICT (Korea Research Institute of Chemical Technology). The data was provided by the Smart Surface Material Lab, and I took part in the data analysis using self-supervised learning.
The main idea was that our images closely resemble medical images: both are grayscale, and both are used for anomaly detection and classification. So I found a paper, "How Transferable are Self-supervised Features in Medical Image Classification Tasks?" [1], from Bayer in Germany. Following its approach, we chose our models and picked up some clues about how to preprocess the data and how to validate the results.
Unfortunately, I am not very good at solid-state physics. Many colleagues helped me understand the significance of the data and how to interpret it. I proposed this project, but it was clear that this research would have been impossible without their help.
The data was provided by the Smart Surface Lab.
- More than 1,000 images.
- Split into training, validation, and test sets.
- Normalized with Torch, as in other medical image analyses on Kaggle.
- Trained with ResNet as the backbone.
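The split and normalization steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual pipeline: the file names, the 80/10/10 split fractions, the seed, and the helper names `split_dataset` and `normalize` are all assumptions made for the example. In PyTorch, the normalization step would typically be done with `torchvision.transforms.Normalize(mean, std)`.

```python
import random

def split_dataset(paths, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle file paths reproducibly and split them into train/val/test."""
    paths = sorted(paths)            # fixed starting order, so the seed fully decides the split
    rng = random.Random(seed)
    rng.shuffle(paths)
    n_val = int(len(paths) * val_frac)
    n_test = int(len(paths) * test_frac)
    val = paths[:n_val]
    test = paths[n_val:n_val + n_test]
    train = paths[n_val + n_test:]
    return train, val, test

def normalize(pixels, mu, sigma):
    """Standardize grayscale pixel values: (p - mean) / std."""
    return [(p - mu) / sigma for p in pixels]

# Hypothetical file names standing in for the RHEED images.
paths = [f"rheed_{i:04d}.png" for i in range(1000)]
train, val, test = split_dataset(paths)
print(len(train), len(val), len(test))
```

A common design choice, assumed here rather than taken from the project, is to compute the mean and standard deviation on the training set only and reuse them for the validation and test sets, so that no information leaks from the held-out images.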
Based on [1], and taking into account the performance rankings on Papers with Code, we chose the following models:
- MoCo
- SImCLR
- SwAV
- BYOL
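To give a feel for how such models learn without labels, here is a small sketch of the NT-Xent contrastive loss used by SimCLR, written in plain Python for readability. It is not the code used in this project, and the function names and the temperature value are assumptions for illustration. Two augmented views of the same image form a positive pair, and every other embedding in the batch serves as a negative. MoCo uses a closely related contrastive loss, while BYOL and SwAV rely on different objectives.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss for paired embeddings.

    z1[i] and z2[i] are embeddings of two augmented views of image i.
    """
    z = [l2_normalize(v) for v in z1 + z2]
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)      # index of the other view of the same image
        # Cosine similarity to every embedding, scaled by the temperature.
        sims = [sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(2 * n)]
        # Numerically stable log-sum-exp over all candidates except the anchor itself.
        others = [s for j, s in enumerate(sims) if j != i]
        m = max(others)
        log_denom = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denom - sims[pos]
    return total / (2 * n)

# Toy embeddings: matched views give a lower loss than mismatched ones.
views_a = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = nt_xent(views_a, [[1.0, 0.0], [0.0, 1.0]])
loss_mismatched = nt_xent(views_a, [[0.0, 1.0], [1.0, 0.0]])
print(loss_matched < loss_mismatched)
```

The loss only needs to know which two embeddings came from the same image, which is why no expert labels are required.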
Using Lightly, I generated the embedding space and observed its trends. Colleagues working in solid-state physics gave me advice and feedback, and based on it I adjusted some hyperparameters and preprocessing steps.
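One simple way to inspect an embedding space like this is to look up the nearest neighbors of a given image and check with a domain expert whether they make physical sense. The sketch below illustrates that idea in plain Python. It does not use Lightly's API, and the toy embedding values and the helper names `cosine` and `nearest_neighbors` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(embeddings, query_idx, k=3):
    """Return the indices of the k embeddings most similar to the query."""
    q = embeddings[query_idx]
    scored = [(cosine(q, e), i) for i, e in enumerate(embeddings) if i != query_idx]
    scored.sort(reverse=True)        # highest similarity first
    return [i for _, i in scored[:k]]

# Toy 2-D embeddings (hypothetical values, not real RHEED features).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
print(nearest_neighbors(embeddings, query_idx=0, k=1))
```

If images that an expert considers physically similar end up far apart, that is a hint to revisit the augmentations, preprocessing, or hyperparameters.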
Medical_Example_Data_from_Dropbox
https://www.dropbox.com/sh/jszs15w46dlwa6i/AACiRn7A4FwUkMsN2Ax_Q7M_a?dl=0
Kaggles
- https://www.kaggle.com/code/yasufuminakama/ranzcr-resnext50-32x4d-starter-training
- https://www.kaggle.com/code/shivanandmn/self-supervise-momentum-contrast-great-idea/notebook?scriptVersionId=54013004
- https://www.kaggle.com/code/ayuraj/v2-self-supervised-pretraining-with-swav/notebook?scriptVersionId=60346393
Models
Lightly (Zurich)