Machine Learning and Shape Recognition
├── LICENSE
├── README.md          <- The top-level README for developers using this project.
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   ├── figures        <- Generated graphics and figures to be used in reporting
│   └── results        <- Generated model results
│
└── src                <- Source code for use in this project.
    ├── data           <- Scripts to download or generate data
    │
    ├── features       <- Scripts to turn raw data into features for modeling
    │
    ├── models         <- Scripts to train models and then use trained models to make predictions
    │
    └── visualization  <- Scripts to create exploratory and results-oriented visualizations
Pre-trained models are available here: https://drive.google.com/file/d/129uGZ0P8uAIr4rGbLuZmc40LpUzzWMB1/view?usp=sharing
Make sure to put them in the ./models folder.
All commands must be run from the project root (mlrf/).
First, install Python and set up a virtual environment:
python -m venv venv
Then activate it (the command below is for Windows; on Linux/macOS use `source venv/bin/activate`) and install the required packages:
venv/Scripts/activate
pip install -r requirements.txt
To run the whole pipeline in one go:
python ./src/process_all.py
First, let's download the data:
python ./src/data/get.py
You can then process it into an intermediate dataset:
train_batches=5
: Number of training batches to take
python ./src/data/make.py --train_batches=5
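For reference, here is a minimal sketch of what this step could involve, assuming the raw data comes as CIFAR-10-style pickled batches under ./data/raw (the file names, keys, and array shapes below are assumptions, not necessarily what make.py does):

```python
# Hypothetical loader sketch: concatenate the first `train_batches` pickled batches.
# Assumes CIFAR-10-style files named data_batch_1..data_batch_5 in ./data/raw.
import pickle
from pathlib import Path

import numpy as np

def load_batches(raw_dir="./data/raw", train_batches=5):
    """Load and stack the requested number of training batches."""
    images, labels = [], []
    for i in range(1, train_batches + 1):
        with open(Path(raw_dir) / f"data_batch_{i}", "rb") as f:
            batch = pickle.load(f, encoding="bytes")
        images.append(batch[b"data"])    # (10000, 3072) uint8 rows in CIFAR-10
        labels.extend(batch[b"labels"])
    return np.concatenate(images), np.array(labels)
```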
Let's have a look at what the images in the train set look like.
label=0
: Images with this label will be displayed
amount=9
: Size of the tile (9 means 81 images)
python ./src/visualization/mosaic.py --label=5 --amount=500
All figures are saved in ./reports/figures
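If you want to adapt the mosaic, here is a hedged sketch of how such a figure can be assembled with matplotlib; the data layout (an (N, 32, 32, 3) image array and an (N,) label array) and the function name are assumptions, not the actual mosaic.py:

```python
# Illustrative sketch only: tile `amount` x `amount` images of a given label.
import matplotlib.pyplot as plt

def plot_mosaic(images, labels, label=0, amount=9, out="./reports/figures/mosaic.png"):
    selected = images[labels == label][: amount * amount]
    fig, axes = plt.subplots(amount, amount, figsize=(amount, amount), squeeze=False)
    for ax in axes.ravel():
        ax.axis("off")                 # hide ticks even for empty tiles
    for ax, img in zip(axes.ravel(), selected):
        ax.imshow(img)
    fig.savefig(out, bbox_inches="tight")
```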
You can also visualize the label distribution:
dataset=train
: Display the train or test dataset
python ./src/visualization/repartition.py
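As a rough sketch of what such a plot involves (a simple label array is assumed; this is not the actual repartition.py):

```python
# Illustrative sketch: bar chart of how many images carry each label.
import matplotlib.pyplot as plt
import numpy as np

def plot_repartition(labels, out="./reports/figures/repartition.png"):
    values, counts = np.unique(labels, return_counts=True)
    plt.figure()
    plt.bar(values, counts)
    plt.xlabel("label")
    plt.ylabel("count")
    plt.savefig(out, bbox_inches="tight")
```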
Now you can compute the features. Flatten, HOG, and LBP are available:
features=all
: Features to build, e.g. flatten,hog
python ./src/features/build.py
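For orientation, here is a sketch of how the three feature types can be computed with scikit-image; the parameter values are illustrative assumptions and may differ from what build.py uses:

```python
# Illustrative sketch: one feature vector per RGB image (flatten, HOG, or LBP histogram).
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog, local_binary_pattern

def extract_features(image, kind="flatten"):
    if kind == "flatten":
        return image.reshape(-1)                       # raw pixel values
    gray = rgb2gray(image)
    if kind == "hog":
        return hog(gray, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    if kind == "lbp":
        lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
        hist, _ = np.histogram(lbp, bins=10, range=(0, 10))   # P + 2 uniform patterns
        return hist / hist.sum()
    raise ValueError(f"unknown feature kind: {kind}")
```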
To check feature correlations:
python ./src/visualization/corr.py
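A correlation heatmap of the kind corr.py produces can be sketched as follows (X is an assumed samples-by-features matrix; this is not the actual implementation):

```python
# Illustrative sketch: Pearson correlation between feature columns.
import matplotlib.pyplot as plt
import numpy as np

def plot_correlation(X, out="./reports/figures/corr.png"):
    corr = np.corrcoef(X, rowvar=False)
    plt.figure()
    plt.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
    plt.colorbar()
    plt.savefig(out, bbox_inches="tight")
```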
Build and train your models:
hyper_params={"svm": {"tol": 1e-4, "C": 1.0, "max_iter": 50}, "k-means": {"n_neighbors": 10}, "xg-boost": {"max_depth": 15, "epochs": 50, "learning_rate": 0.1}}
: Dictionary of hyper-parameters
override=False
: If True, retrain and overwrite an existing model; if False, existing models are not recomputed
python ./src/models/train.py
The leaf_size for KMeans will be tuned with the DE (differential evolution) algorithm.
Note that the loss history for the XG-Boost model will be displayed.
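To show how the hyper_params dictionary could map onto the three models, here is a hedged sketch; the library choices (scikit-learn and xgboost), the mapping of "epochs" to n_estimators, and the reading of the "k-means" entry as a nearest-neighbour classifier are all assumptions, not the actual train.py:

```python
# Illustrative sketch: build, fit, and serialize the three models from hyper_params.
import joblib
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from xgboost import XGBClassifier

def train_models(X, y, hyper_params, out_dir="./models"):
    svm_p, knn_p, xgb_p = hyper_params["svm"], hyper_params["k-means"], hyper_params["xg-boost"]
    models = {
        "svm": LinearSVC(tol=svm_p["tol"], C=svm_p["C"], max_iter=svm_p["max_iter"]),
        # leaf_size could then be tuned separately, e.g. with scipy.optimize.differential_evolution
        "k-means": KNeighborsClassifier(n_neighbors=knn_p["n_neighbors"]),
        "xg-boost": XGBClassifier(max_depth=xgb_p["max_depth"],
                                  n_estimators=xgb_p["epochs"],      # "epochs" read as boosting rounds
                                  learning_rate=xgb_p["learning_rate"]),
    }
    for name, model in models.items():
        model.fit(X, y)
        joblib.dump(model, f"{out_dir}/{name}.joblib")
    return models
```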
You can now test your models to get performance metrics. Each test result is saved in ./reports/results with the date as its key.
delete_hist=False
: Remove the results history
python ./src/models/test.py
If you have downloaded the pre-trained models and set override to False, they will not be retrained.
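As a sketch of how date-keyed results could be written under ./reports/results (the file layout and metric choice are assumptions, not the actual test.py):

```python
# Illustrative sketch: append today's metrics to a per-model JSON history file.
import json
from datetime import datetime
from pathlib import Path

from sklearn.metrics import accuracy_score, f1_score

def save_results(model_name, y_true, y_pred, results_dir="./reports/results"):
    path = Path(results_dir) / f"{model_name}.json"
    history = json.loads(path.read_text()) if path.exists() else {}
    history[datetime.now().isoformat()] = {
        "accuracy": float(accuracy_score(y_true, y_pred)),
        "f1_macro": float(f1_score(y_true, y_pred, average="macro")),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2))
```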
After testing the models, you can have a look at the figures:
python ./src/visualization/performances.py
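A comparison figure could be built from those result files roughly as follows (this assumes the JSON layout from the sketch above, not the actual performances.py):

```python
# Illustrative sketch: bar chart of the latest accuracy recorded for each model.
import json
from pathlib import Path

import matplotlib.pyplot as plt

def plot_performances(results_dir="./reports/results", out="./reports/figures/performances.png"):
    names, scores = [], []
    for path in sorted(Path(results_dir).glob("*.json")):
        history = json.loads(path.read_text())
        latest = history[max(history)]         # ISO date strings sort chronologically
        names.append(path.stem)
        scores.append(latest["accuracy"])
    plt.figure()
    plt.bar(names, scores)
    plt.ylabel("accuracy")
    plt.savefig(out, bbox_inches="tight")
```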
Finally, you can clean the interim data:
python ./src/data/clean.py
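A minimal sketch of such a cleanup, assuming it simply removes everything under ./data/interim (the actual clean.py may behave differently):

```python
# Illustrative sketch: delete all files and folders inside ./data/interim.
import shutil
from pathlib import Path

def clean_interim(interim_dir="./data/interim"):
    root = Path(interim_dir)
    if not root.exists():
        return
    for entry in root.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
```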