This project demonstrates how to use a pre-trained DeepLabV3 model from the torchvision
library for semantic segmentation, separating the foreground and background of an image. Additionally, it allows adding custom text to the background, making it useful for various applications such as image editing or creative content generation.
- Semantic Segmentation: Using a pre-trained DeepLabV3 model with ResNet-101 backbone to segment the image into various categories (e.g., person, animal, background, etc.).
- Foreground-Background Separation: The foreground and background of the image are separated based on the segmentation map.
- Text Overlay: Customizable text can be added to the background of the image, allowing for further creative manipulation.
- Visualization: The original image, segmentation map, foreground, and background with text are displayed using
matplotlib
for easy visualization.
- Python 3.x
- PyTorch and torchvision
- PIL (Pillow)
- matplotlib
- NumPy
Clone this repository and install the required dependencies:
git clone https://github.com/yourusername/deeplabv3-image-segmentation.git
cd deeplabv3-image-segmentation
pip install -r requirements.txt
- Make sure to replace the
image_path
variable with the path to your image. - Run the script to perform segmentation, separate the foreground and background, and add text to the background.
- The results (original image, segmentation map, foreground, and background with text) will be displayed using
matplotlib
.
python segmentation_script.py
-
Run the Flask Application:
- Ensure you have all the dependencies installed by running the following command:
pip install -r requirements.txt
-
Start the Flask App:
- Replace the
image_path
and any necessary configurations in theapp.py
file if required. - Start the Flask server by running:
python app.py
- This will start a local web server, and you can access the app in your browser by navigating to
http://127.0.0.1:5000/
.
- Replace the
-
Uploading an Image:
- On the web app interface, use the file upload option to select and upload an image.
- The app will process the image, perform segmentation using the pre-trained DeepLabV3 model, separate the foreground and background, and add text to the background.
-
View Results:
- After uploading, the app will display:
- Original Image
- Segmentation Map (color-coded)
- Foreground Image
- Background with Text Overlay (in a separate preview section)
- After uploading, the app will display:
-
Customization:
- You can modify the text that is added to the background through the web interface.
- The output will be displayed dynamically in a preview area.
The project follows a simple structure to keep the code organized:
deeplabv3-image-segmentation/
│
├── app.py # Flask application for the web interface
├── segmentation_script.py # Standalone script for running segmentation
├── static/ # Static folder for serving uploaded images
│ ├── uploads/ # Folder for storing uploaded images temporarily
│ └── results/ # Folder for saving processed results
├── templates/ # HTML templates for rendering the web interface
│ └── index.html # Main HTML page for the web app
├── requirements.txt # List of Python dependencies
├── README.md # Project documentation
└── LICENSE # License file
-
app.py
: This is the main entry point for the web application. It handles routing, processing of images, and rendering the results on the frontend. -
segmentation_script.py
: A standalone script that can be used to run the segmentation process outside of the web app context. Useful for testing or batch processing. -
static/
: A folder for serving files such as uploaded images and result images. Theuploads/
folder stores the raw images that are uploaded by the user, andresults/
stores the output images after processing. -
templates/
: Contains the HTML files used to render the web interface. Theindex.html
file is the primary interface where users can upload images and view the results. -
requirements.txt
: Contains the Python dependencies required to run the project, including Flask, PyTorch, torchvision, and others. -
README.md
: This file, which contains the documentation for the project.
- Original Image
- Segmentation Map (using a jet color map)
- Foreground Image (only the segmented foreground)
- Background with Text (background with customizable text added)
This project is licensed under the MIT License - see the LICENSE file for details.
- DeepLabV3 model from
torchvision
. - The idea for separating foreground and background is inspired by image segmentation applications in computer vision.