tiffany-su2004/Allergies-Detection

Food Allergy Detection System (Flask + Machine Learning)

Overview

This project is a Flask-based machine learning web application that predicts potential food allergy categories based on structured ingredient attributes. It demonstrates the end-to-end deployment of a trained ML model into an interactive web interface, with a clear separation between data preprocessing, model training, and inference.

The current implementation focuses on controlled, dropdown-based inputs to ensure prediction consistency and robustness. The system is intentionally designed to support future extension to text-based ingredient analysis using NLP techniques.

Key Features

Web-based prediction interface built with Flask

Decision Tree classifier trained on categorical food attributes

Dynamic dropdown inputs populated directly from training encoders

End-to-end pipeline: preprocessing → training → model persistence → inference

Clean separation between current functionality and future roadmap

Technical Stack

Backend: Python, Flask

Machine Learning: scikit-learn (Decision Tree Classifier)

Model Persistence: joblib

Data Processing: NumPy, Pandas

Frontend: HTML (Jinja2 templating)

How the System Works

  1. Input Handling

Users select values for the following categorical features:

Food Product

Main Ingredient

Sweetener

Fat / Oil

Seasoning

Dropdown options are generated directly from the trained LabelEncoder classes, ensuring all inputs are valid and consistent with the model’s training data.
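As a minimal sketch of how dropdown options can be derived from fitted encoders, the snippet below reads the `classes_` attribute of each `LabelEncoder`. It assumes the feature encoders are stored as a dict keyed by feature name; that layout, the `dropdown_options` helper, and the sample labels are illustrative assumptions, not taken from the repository.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the contents of encoders.pkl: one fitted
# LabelEncoder per feature (the real file layout is an assumption).
# The label values are made up for illustration.
encoders = {
    "Food Product": LabelEncoder().fit(["Chicken Curry", "Almond Cookies"]),
    "Main Ingredient": LabelEncoder().fit(["Chicken", "Almonds"]),
}


def dropdown_options(encoders):
    """Map each feature name to the labels its encoder was fitted on."""
    return {
        name: [str(label) for label in encoder.classes_]
        for name, encoder in encoders.items()
    }


options = dropdown_options(encoders)
```

Because `classes_` holds exactly the labels seen during fitting (in sorted order), any value chosen from these lists can later be passed to `transform` without raising an unseen-label error.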

  2. Prediction Pipeline

User selections are encoded using pre-trained encoders.

Encoded features are passed to the trained Decision Tree model.

The predicted class is decoded using the target encoder.

The allergy category is displayed on the web interface.
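The four steps above can be sketched as a single function. This is an illustration under stated assumptions: the `predict_allergy` helper, the two-feature toy model, and the label values are hypothetical, and the real application uses five features and its own trained artifacts.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier


def predict_allergy(selection, encoders, model, target_encoder, feature_order):
    """Encode the user's selections, predict, and decode the class label."""
    # Step 1: encode each selected value with its feature's encoder.
    encoded = [
        encoders[name].transform([selection[name]])[0] for name in feature_order
    ]
    # Step 2: pass a single-row feature matrix to the model.
    predicted_code = model.predict(np.array([encoded]))[0]
    # Step 3: decode the integer class back to its label.
    return target_encoder.inverse_transform([predicted_code])[0]


# Toy artifacts standing in for the pickled model and encoders (made-up data).
feature_order = ["Food Product", "Main Ingredient"]
encoders = {
    "Food Product": LabelEncoder().fit(["Almond Cookies", "Chicken Curry"]),
    "Main Ingredient": LabelEncoder().fit(["Almonds", "Chicken"]),
}
target_encoder = LabelEncoder().fit(["None", "Nuts"])
model = DecisionTreeClassifier(random_state=0).fit(
    np.array([[0, 0], [1, 1]]), target_encoder.transform(["Nuts", "None"])
)

result = predict_allergy(
    {"Food Product": "Almond Cookies", "Main Ingredient": "Almonds"},
    encoders, model, target_encoder, feature_order,
)
```

Step 4, displaying the result, is left to the web layer. Note that `feature_order` must match the column order used when the model was fitted.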

Machine Learning Pipeline

Data Preparation

Raw food and ingredient datasets were cleaned and normalized.

Categorical variables were encoded using LabelEncoder.

Target allergy labels were encoded separately.
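A minimal sketch of this preparation step, assuming one `LabelEncoder` per feature column plus a separate encoder for the target. The `encode_dataset` helper, the `"Allergy"` target column name, the whitespace stripping, and the sample rows are assumptions for illustration; the actual cleaning done in the notebooks is not described here.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def encode_dataset(df, feature_columns, target_column):
    """Label-encode each feature column and the target column separately."""
    encoders = {}
    X = pd.DataFrame(index=df.index)
    for column in feature_columns:
        encoder = LabelEncoder()
        # Normalize stray whitespace so "Almonds " and "Almonds" share a code.
        X[column] = encoder.fit_transform(df[column].astype(str).str.strip())
        encoders[column] = encoder
    target_encoder = LabelEncoder()
    y = target_encoder.fit_transform(df[target_column].astype(str).str.strip())
    return X, y, encoders, target_encoder


# Made-up rows; "Allergy" is a hypothetical name for the target column.
raw = pd.DataFrame({
    "Main Ingredient": ["Almonds ", "Chicken", "Almonds"],
    "Sweetener": ["Sugar", "Honey", "Sugar"],
    "Allergy": ["Nuts", "None", "Nuts"],
})
X, y, encoders, target_encoder = encode_dataset(
    raw, ["Main Ingredient", "Sweetener"], "Allergy"
)
```

Keeping each fitted encoder is what allows the web application to reuse the same label-to-integer mapping at inference time. scikit-learn documents `LabelEncoder` as intended for target labels and offers `OrdinalEncoder` for features, but the per-column approach shown here matches what this README describes.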

Model Training

A Decision Tree Classifier was trained on encoded categorical features.

Trained artifacts were saved for deployment:

allergy_model.pkl

encoders.pkl

target_encoder.pkl
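Training and persistence might look roughly like the sketch below. The three file names come from the list above; the `train_and_save` helper, the `random_state` value, the default hyperparameters, and the toy data are assumptions, since the contents of `train.ipynb` are not shown here.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier


def train_and_save(X, y, encoders, target_encoder, out_dir="."):
    """Fit a decision tree and write the three artifacts with joblib."""
    model = DecisionTreeClassifier(random_state=42)  # seed is an assumption
    model.fit(X, y)
    joblib.dump(model, os.path.join(out_dir, "allergy_model.pkl"))
    joblib.dump(encoders, os.path.join(out_dir, "encoders.pkl"))
    joblib.dump(target_encoder, os.path.join(out_dir, "target_encoder.pkl"))
    return model


# Toy, already-encoded data standing in for the preprocessed dataset.
X = np.array([[0, 1], [1, 0], [0, 0], [1, 1]])
y = np.array([1, 0, 1, 0])
encoders = {
    "Main Ingredient": LabelEncoder().fit(["Almonds", "Chicken"]),
    "Sweetener": LabelEncoder().fit(["Honey", "Sugar"]),
}
target_encoder = LabelEncoder().fit(["None", "Nuts"])

out_dir = tempfile.mkdtemp()  # written to a temp folder to keep the demo tidy
model = train_and_save(X, y, encoders, target_encoder, out_dir)
reloaded = joblib.load(os.path.join(out_dir, "allergy_model.pkl"))
```

Saving the encoders alongside the model matters because the integer codes are only meaningful relative to the encoders that produced them.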

Deployment

The trained model and encoders are loaded at application startup.

Predictions are performed in real time via Flask routes.
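The sketch below shows one way a Flask route can serve predictions from artifacts that are loaded once, before any request arrives. It is not the repository's `app.py`: the `create_app` factory, the inline `PAGE` template (used instead of `templates/index.html` to keep the example self-contained), and the single-feature toy model are all assumptions. In a real deployment the three objects would come from `joblib.load(...)` on the `.pkl` files at startup.

```python
import numpy as np
from flask import Flask, render_template_string, request
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Inline stand-in for templates/index.html (hypothetical markup).
PAGE = """
<form method="post">
  {% for name, choices in options.items() %}
    <select name="{{ name }}">
      {% for choice in choices %}<option>{{ choice }}</option>{% endfor %}
    </select>
  {% endfor %}
  <button type="submit">Predict</button>
</form>
{% if prediction %}<p>Predicted category: {{ prediction }}</p>{% endif %}
"""


def create_app(model, encoders, target_encoder):
    """Build a Flask app around artifacts that were loaded once at startup."""
    app = Flask(__name__)
    features = list(encoders)
    options = {n: [str(c) for c in e.classes_] for n, e in encoders.items()}

    @app.route("/", methods=["GET", "POST"])
    def index():
        prediction = None
        if request.method == "POST":
            # Encode form values, predict, then decode the class label.
            row = [encoders[n].transform([request.form[n]])[0] for n in features]
            code = model.predict(np.array([row]))[0]
            prediction = target_encoder.inverse_transform([code])[0]
        return render_template_string(PAGE, options=options, prediction=prediction)

    return app


# Toy artifacts with made-up labels; replace with joblib.load(...) in practice.
encoders = {"Main Ingredient": LabelEncoder().fit(["Almonds", "Chicken"])}
target_encoder = LabelEncoder().fit(["None", "Nuts"])
model = DecisionTreeClassifier(random_state=0).fit(
    np.array([[0], [1]]), target_encoder.transform(["Nuts", "None"])
)
app = create_app(model, encoders, target_encoder)
```

Calling `app.run()` would then start the development server. The sketch does no error handling, so a form value outside the encoder's known classes would surface as a server error.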

Project Structure

Allergies_Detection/
│
├── app.py                              # Flask application entry point
├── allergy_model.pkl                   # Trained ML model
├── encoders.pkl                        # Feature encoders
├── target_encoder.pkl                  # Target label encoder
│
├── train.ipynb                         # Model training notebook
├── test2.ipynb                         # Model evaluation / testing
│
├── cleaned_food_allergy_dataset.csv
├── food_allergy_preprocessed.csv
├── food_ingredients_and_allergens.csv
│
└── templates/
    └── index.html                      # Web interface

How to Run Locally

  1. Install Dependencies

pip install flask numpy pandas scikit-learn joblib

Note: For full reproducibility, dependency versions should match those used during model training.

  2. Start the Application

python app.py

  3. Access the App

Open a browser and navigate to:

http://127.0.0.1:5000

Current Limitations

Input is limited to dropdown-based categorical selections

Free-text ingredient descriptions are not yet supported

Model artifacts were pickled under a specific scikit-learn version and may not load correctly under a different one

No authentication or database persistence implemented

Future Enhancements (Planned)

Text-based ingredient input using NLP techniques

Feature extraction via TF-IDF or embedding-based methods

Model retraining to support unstructured inputs

Enhanced UI/UX and prediction explanations

Improved model performance using ensemble methods
