This project implements a semantic search system for e-commerce products using pre-trained transformer embeddings.
The system allows searching Flipkart product descriptions based on the meaning of user queries, not just keyword matching.
- Uses the Flipkart products dataset from Kaggle.
- Employs the `sentence-transformers` library to generate embeddings.
- Performs semantic similarity search using cosine similarity.
- Returns the top-k most relevant products for any input query.
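The cosine-similarity ranking described in the list above can be illustrated with a minimal sketch. The toy 3-dimensional vectors stand in for real embeddings, and `top_k_by_cosine` is a hypothetical helper name, not part of this project's code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def top_k_by_cosine(query_vec, item_vecs, top_k=5):
    """Return (index, score) pairs for the top_k items most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(item_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors standing in for product embeddings (illustrative values only).
items = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
ranking = top_k_by_cosine(query, items, top_k=2)
```

Because cosine similarity depends only on the angle between vectors, products whose embeddings point in nearly the same direction as the query rank highest, regardless of vector length.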
The dataset used in this project is the Flipkart products dataset from Kaggle.
- Python 3.7+
- pandas
- sentence-transformers
- torch
You can install the dependencies with:
`pip install pandas sentence-transformers torch`
- Load the dataset.
- Clean and preprocess product data.
- Encode product descriptions using a pre-trained SentenceTransformer model.
- Use the `search_products(query, top_k)` function to find products semantically similar to the query.
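The steps above might be wired together roughly as follows. This is a sketch under assumptions, not the project's actual code: the `description` and `product_name` column names, the `all-MiniLM-L6-v2` model, and the helper names `clean_text`, `make_search_products`, and `load_flipkart_search` are all placeholders. The `encode` callable is injected so that any embedding function can be swapped in.

```python
import math
import re

def clean_text(text):
    """Lowercase, replace non-alphanumeric characters, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", str(text).lower())
    return re.sub(r"\s+", " ", text).strip()

def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def make_search_products(names, descriptions, encode):
    """Encode descriptions once and return a search_products(query, top_k) function.

    `encode` maps a list of strings to a sequence of vectors.
    """
    cleaned = [clean_text(d) for d in descriptions]
    doc_vecs = [list(v) for v in encode(cleaned)]

    def search_products(query, top_k=5):
        q = list(encode([clean_text(query)])[0])
        scored = sorted(
            ((_cosine(q, v), name) for name, v in zip(names, doc_vecs)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        return [(name, score) for score, name in scored[:top_k]]

    return search_products

def load_flipkart_search(csv_path):
    """Assumed wiring with pandas and sentence-transformers (imported lazily)."""
    import pandas as pd
    from sentence_transformers import SentenceTransformer

    # Column names are assumptions about the Kaggle CSV; adjust as needed.
    df = pd.read_csv(csv_path).dropna(subset=["description"])
    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
    return make_search_products(
        df["product_name"].tolist(), df["description"].tolist(), model.encode
    )
```

Encoding the product descriptions once up front means each query only needs a single extra encoding call plus the similarity ranking.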
Example:
`search_products("wireless bluetooth headphones", top_k=5)`
- Integrate with a web app using Gradio or Streamlit for an interactive search experience.
- Fine-tune transformer models using e-commerce data for enhanced accuracy.
- Expand to multi-modal search, including images and reviews.
This project is licensed under the MIT License.
Made with ❤️ by Özlem