.
├── embeddings # precomputed embeddings in .csv
│ ├── BarlowTwins
│ │ ├── train
│ │ └── ...
│ └── SimCLR
│ ├── train
│ └── test
│
├── examples
│ ├── DistributionPlotting.ipynb # measure 2 metrics with bootstrap
│ ├── BarlowTwinsEmbeddings.ipynb # train and make embeddings via BarlowTwins
│ └── SimClrEmbeddings.ipynb # train and make embeddings via SimCLR
│
├── src # utils .py files and models architectures
│ └── ...
│
├── weights # precomputed NN weights
│ ├── weights_barlow_twins
│ └── weights_simclr
│
├── requirements.txt # necessary packages
│
└── README.md
- Agafonova Ekaterina ([email protected])
- Volkov Dmitry ([email protected])
- Sidnov Kirill ([email protected])
- Dembitskiy Artem ([email protected])
TA: Nikita Balabin
This work explores the problem of point-cloud similarity estimation in a self-supervised learning (SSL) framework. We compared the performance of the Hausdorff and MTopDiv metrics on the CIFAR10 dataset, using embeddings extracted from the linear layer of BarlowTwins and SimCLR models. To extract the embeddings, we used a single-class learning strategy. We conclude that the metrics failed to distinguish embeddings of the augmented classes due to their low robustness to non-rigid augmentations.
For training we use the set of augmentations suggested by the authors of lightly and implemented in their ImageCollateFunction (we use the default parameters and an input size of 32, matching CIFAR10 images; the full list of augmentations can be found in the lightly documentation).
SimCLR. In the Simple framework for Contrastive Learning of visual Representations (“SimCLR”), two separate data-augmentation operators are sampled from the same family of augmentations and applied to each data example to obtain two correlated views.
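The two correlated views are pulled together by the NT-Xent (normalized temperature-scaled cross-entropy) loss. A minimal NumPy sketch of that objective (illustrative only, not the repository's training code; the function name and the toy inputs are our own):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss used by SimCLR.

    z1, z2: (N, D) embeddings of the two augmented views; row i of z1 and
    row i of z2 form a positive pair, all other rows act as negatives.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)              # (2N, D) anchors
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # so dot product = cosine sim
    sim = z @ z.T / temperature                       # (2N, 2N) similarity logits
    np.fill_diagonal(sim, -np.inf)                    # exclude self-comparisons
    # Index of the positive partner for each anchor.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Cross-entropy: -log softmax(sim)[i, pos[i]], averaged over all 2N anchors.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return (logsumexp - sim[np.arange(2 * n), pos]).mean()
```

Feeding two identical views yields a lower loss than two independent batches, since each positive pair then has maximal cosine similarity.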
Barlow Twins. The objective function in “Barlow Twins” measures the cross-correlation matrix between the embeddings of two identical networks fed with distorted versions of a batch of samples, and tries to make this matrix close to the identity.
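The Barlow Twins objective can be sketched in a few lines of NumPy (again a toy illustration under our own naming, not the repository's implementation):

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins objective on two batches of embeddings.

    z1, z2: (N, D) embeddings of two distorted views of the same batch.
    lam: weight of the redundancy-reduction (off-diagonal) term.
    """
    n = z1.shape[0]
    # Normalize each embedding dimension over the batch (zero mean, unit std).
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-8)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-8)
    # Cross-correlation matrix between the two views.
    c = z1.T @ z2 / n
    # Push diagonal entries to 1 (invariance) and off-diagonal entries
    # to 0 (redundancy reduction).
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()
    return on_diag + lam * off_diag
```

When the two views are identical, the cross-correlation matrix is (up to noise) the identity and the loss is near zero; uncorrelated views produce a much larger value.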
We picked and evaluated the metrics using several criteria:
- The metric value should measure how stable the SSL-generated embedding representation is to data augmentation. In other words, from the metric's point of view, the SSL model should produce sufficiently close embeddings for augmented images of the same class.
- The numerical value of the metric should distinguish images of different classes, i.e., images that we consider anomalies.
- Metrics evaluation should not be computationally demanding.
Hausdorff distance. The Hausdorff distance (HD) between two point sets is a commonly used dissimilarity measure for comparing point sets and image segmentations.
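For two finite point sets, the symmetric Hausdorff distance is the largest distance from any point to its nearest neighbour in the other set. A minimal NumPy sketch (the `scipy.spatial.distance.directed_hausdorff` routine offers an optimized directed variant):

```python
import numpy as np

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between point clouds a (n, d) and b (m, d)."""
    # Pairwise Euclidean distance matrix of shape (n, m).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Directed distances: for each point, the distance to the nearest point
    # in the other cloud; Hausdorff takes the worst such match in either direction.
    return max(d.min(axis=1).max(), d.min(axis=0).max())

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 0.0], [4.0, 0.0]])
print(hausdorff_distance(a, b))  # 3.0: (4, 0) is 3 away from its nearest neighbour in a
```

The brute-force distance matrix is O(n·m) in memory, which is fine for the point-cloud sizes used here but worth keeping in mind for larger clouds.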
MTopDiv. Manifold Topology Divergence is a framework for comparing data manifolds, aimed, in particular, towards the evaluation of deep generative models.
Euclidean distance. Computes the point-to-point distance for a cloud using Euclidean distance (2-norm) as the distance metric between the points.
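As the simplest baseline, the point-to-point comparison can be sketched as a mean Euclidean distance over matched points (a sketch under our own naming, assuming the two clouds are compared in matched order; the repository's implementation may pair points differently):

```python
import numpy as np

def mean_euclidean_distance(a, b):
    """Mean point-to-point Euclidean (2-norm) distance between two point
    clouds of equal size, comparing points in matched order."""
    return np.linalg.norm(a - b, axis=1).mean()

a = np.array([[0.0, 0.0], [3.0, 4.0]])
b = np.zeros((2, 2))
print(mean_euclidean_distance(a, b))  # (0 + 5) / 2 = 2.5
```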
All tests were performed on a Google Colab Tesla P100 GPU.
```
git clone https://github.com/melhaud/proj18.git
```
The required packages can be installed from requirements.txt:

```
pip install -r requirements.txt
```
Note that Ripser++, a dependency of MTopDiv, installs and runs on GPU only.
For the baseline, we propose using a combination of SimCLR and the Euclidean distance. The plot below demonstrates plausible results for the embeddings extracted from the backbone linear layer trained on the single `dog` class from CIFAR10. However, it fails to distinguish the `cat` class from the `dog` one.
It is worth mentioning that we did not perform a grid search over the number of points in the compared point clouds. This means that the proposed anomaly-detection pipeline might be improved both from the viewpoint of the architecture/training approach and through a better selection of point-cloud sizes or of the metrics used to compare them.
In general, we consider this direction rather promising and a fruitful field for further theoretical and experimental research.