Food Vision is a multi-class image classification project that classifies images into the 101 food classes of the Food101 dataset.
- Importing food101 dataset.
- Preprocessing the data.
- Using TensorFlow to prepare the dataset.
- Exploring the data.
- Creating model callbacks.
- Importing EfficientNet-B2 architecture for transfer learning.
- Fine-Tuning the model.
- Experimenting with Data Augmentation.
- Evaluating the model (Finding our model's most wrong predictions).
- Making predictions with our Food Vision model on custom images of food.
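The "most wrong predictions" step in the list above means the misclassified images on which the model was most confident. A minimal sketch in plain Python follows. It assumes the labels and per-class prediction probabilities are already available as lists. The function name `most_wrong_predictions` and the sample data are hypothetical and not taken from the project code.

```python
def most_wrong_predictions(y_true, pred_probs, class_names, top_k=5):
    """Return the misclassified samples with the highest predicted confidence.

    y_true      -- list of integer class labels
    pred_probs  -- list of per-class probability lists, one per sample
    class_names -- list mapping class index to a readable name
    """
    wrong = []
    for index, (label, probs) in enumerate(zip(y_true, pred_probs)):
        # Predicted class is the index of the largest probability.
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted != label:
            wrong.append({
                "index": index,
                "true": class_names[label],
                "pred": class_names[predicted],
                "confidence": probs[predicted],
            })
    # Most confident mistakes first.
    wrong.sort(key=lambda row: row["confidence"], reverse=True)
    return wrong[:top_k]


# Illustrative data only: three Food101 class names and made-up probabilities.
class_names = ["pizza", "sushi", "steak"]
y_true = [0, 1, 2, 1]
pred_probs = [
    [0.90, 0.05, 0.05],  # correct
    [0.20, 0.70, 0.10],  # correct
    [0.60, 0.30, 0.10],  # wrong: steak predicted as pizza
    [0.05, 0.15, 0.80],  # wrong: sushi predicted as steak
]

for row in most_wrong_predictions(y_true, pred_probs, class_names):
    print(row)
```

Inspecting these samples often reveals mislabeled images or classes that look alike, which is why the step is part of the evaluation.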
To get better results and save training time, the model is trained with transfer learning: EfficientNet-B2, pretrained on the ImageNet dataset, is used as the feature extractor.
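The feature-extraction setup described above might look roughly like the sketch below, written with the Keras API in TensorFlow. The input resolution, the pooling layer, the optimizer, the number of unfrozen layers, and the fine-tuning learning rate are all assumptions made for illustration. The function names `build_feature_extractor` and `unfreeze_top_layers` are hypothetical, and the project's actual code may differ.

```python
import tensorflow as tf

NUM_CLASSES = 101  # Food101 has 101 classes
IMG_SIZE = 224     # assumed input resolution; not stated in this README


def build_feature_extractor(num_classes=NUM_CLASSES, weights="imagenet"):
    """Frozen EfficientNet-B2 backbone with a new classification head."""
    base = tf.keras.applications.EfficientNetB2(include_top=False, weights=weights)
    base.trainable = False  # feature extraction: backbone weights stay fixed

    # Keras EfficientNet models rescale inputs internally,
    # so raw pixel values in the 0-255 range are expected here.
    inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    x = base(inputs, training=False)  # keep batch norm in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer=tf.keras.optimizers.Adam(),
        metrics=["accuracy"],
    )
    return model, base


def unfreeze_top_layers(model, base, num_layers=10, learning_rate=1e-4):
    """Fine-tuning: unfreeze the last `num_layers` backbone layers (assumed value)."""
    base.trainable = True
    for layer in base.layers[:-num_layers]:
        layer.trainable = False
    # Recompile so the change takes effect; a lower learning rate
    # reduces the risk of destroying the pretrained features.
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        metrics=["accuracy"],
    )
    return model
```

Calling `build_feature_extractor()` downloads the ImageNet weights on first use. Passing `weights=None` builds the same architecture with random weights, which is useful for a quick shape check.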
EfficientNet-V2M could also have been considered, but it was not chosen for several reasons. The project was trained on Google Colaboratory's Tesla T4 GPU. The T4 is not a low-end GPU, but a single one is not sufficient for training a model with a large number of parameters. The following requirements were therefore taken into consideration when choosing the architecture:
- Lower Depth (Faster Gradient Descent).
- The higher the ratio of accuracy to number of parameters, the better (faster training epochs).
- The model's predecessors have been shown to give good results on the dataset under consideration.
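The accuracy-to-parameters criterion can be made concrete with a small calculation. The sketch below is only illustrative: the ImageNet top-1 accuracies and parameter counts are approximate values as commonly listed for the Keras Applications models and should be verified against the official table before being relied on. The helper name `accuracy_per_million_params` is hypothetical.

```python
# Approximate ImageNet figures (assumption: verify against the Keras Applications table).
candidates = {
    "EfficientNet-B2": {"top1_acc": 80.1, "params_millions": 9.2},
    "EfficientNet-V2M": {"top1_acc": 85.3, "params_millions": 54.4},
}


def accuracy_per_million_params(top1_acc, params_millions):
    """Top-1 accuracy (percent) per million parameters; higher is better."""
    return top1_acc / params_millions


# Rank the candidates by the ratio, best first.
ranked = sorted(
    candidates,
    key=lambda name: accuracy_per_million_params(**candidates[name]),
    reverse=True,
)

for name in ranked:
    ratio = accuracy_per_million_params(**candidates[name])
    print(f"{name}: {ratio:.2f} accuracy points per million parameters")
```

Under these figures the smaller model scores far higher on the ratio, even though the larger one has the better absolute accuracy. That is the trade-off the criterion is meant to capture on a single GPU.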
The fine-tuned EfficientNet-B2 model has the best accuracy among the three models.
The following are a couple of the correct predictions generated by the model.