Semantic Book Recommender 🚀

LLM-powered, emotion-aware book discovery built with LangChain, Chroma, Hugging Face Transformers, and a Gradio front-end, packaged as a Docker container.

Project motivation

The goal is to recommend books from free-text descriptions, while letting readers filter by broad category (fiction, nonfiction, children’s …) and mood (joy, suspense, sadness, etc.).
Under the hood we combine semantic search with lightweight sentiment inference to return covers, titles and punchy blurbs in a scrollable gallery.

UI preview

Short demo

Data

Source: Kaggle public dump of the Google Books API (~6.8k rows).
The original books.csv lives in data/.
Cleaning & filtering
- Remove rows missing description, num_pages, average_rating, or published_year.
- Keep books whose description has at least 25 words.
- Add title_and_subtitle, fix missing thumbnails, and append &fife=w800 to the thumbnail URL to request higher-resolution images.
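The cleaning rules above can be sketched in plain Python. This is a minimal, dependency-free illustration rather than the notebook's actual code: the `title`, `subtitle`, and `thumbnail` column names, the `large_thumbnail` output field, the `"Title: Subtitle"` join, and the `cover_not_found.jpg` placeholder are assumptions, and `clean_books` is a hypothetical helper.

```python
REQUIRED_FIELDS = ("description", "num_pages", "average_rating", "published_year")
MIN_DESCRIPTION_WORDS = 25


def clean_books(rows, placeholder="cover_not_found.jpg"):
    """Apply the cleaning rules to an iterable of dict rows (e.g. from csv.DictReader)."""
    cleaned = []
    for row in rows:
        # Drop rows where any required field is missing or empty.
        if any(row.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Keep only descriptions with at least 25 whitespace-separated words.
        if len(row["description"].split()) < MIN_DESCRIPTION_WORDS:
            continue
        book = dict(row)
        # Assumed column names: "title" and "subtitle".
        title, subtitle = row.get("title", ""), row.get("subtitle")
        book["title_and_subtitle"] = f"{title}: {subtitle}" if subtitle else title
        # Assumed column name "thumbnail"; fall back to a placeholder when it is missing.
        thumbnail = row.get("thumbnail")
        book["large_thumbnail"] = f"{thumbnail}&fife=w800" if thumbnail else placeholder
        cleaned.append(book)
    return cleaned
```

Rows can come straight from `csv.DictReader(open("data/books.csv"))`; missing values then arrive as empty strings, which the first check treats as missing.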
Engineered metadata
- simple_categories: maps 500+ noisy Google genres into 4 high-level buckets using rule-based mapping plus BART-MNLI zero-shot back-fill (≈ 78 % accuracy on a held-out sample).
- Emotions (anger, fear, joy, sadness, surprise, neutral, disgust): scored per book with a DistilRoBERTa emotion classifier; the dashboard lets users rank by the dominant tone.
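One way to turn sentence-level predictions into per-book emotion scores is to keep the highest score seen for each label. The sketch below assumes the input has the shape a `transformers` text-classification pipeline returns with `top_k=None` (one list of `{"label", "score"}` dicts per sentence); `max_emotion_scores` and `dominant_emotion` are hypothetical helper names, and the model itself is not called here.

```python
EMOTIONS = ("anger", "disgust", "fear", "joy", "sadness", "surprise", "neutral")


def max_emotion_scores(sentence_predictions):
    """Return the maximum score per emotion across all sentences of one description."""
    best = {label: 0.0 for label in EMOTIONS}
    for predictions in sentence_predictions:
        for prediction in predictions:
            label = prediction["label"]
            if label in best:
                best[label] = max(best[label], prediction["score"])
    return best


def dominant_emotion(scores):
    """Return the label with the highest aggregated score."""
    return max(scores, key=scores.get)
```

Taking the maximum rather than the mean lets a single strongly emotional sentence characterize the book, which suits short blurbs where most sentences are neutral.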

Pipeline overview

| Stage | Package | What happens |
| --- | --- | --- |
| Embedding | `langchain_openai` | OpenAI ADA embeddings for each tagged description (`isbn13 + description`) |
| Vector DB | `Chroma` | In-memory store; cosine similarity search driven by LangChain retriever |
| Sentiment | `transformers` | `j-hartmann/emotion-english-distilroberta-base`: max emotion score across sentences |
| Category back-fill | `facebook/bart-large-mnli` | Zero-shot classification between Fiction and Non-fiction |
| Frontend | `gradio` | Glass theme; textbox + 2 dropdowns → gallery of 16 covers (+ captions) |
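The retrieval step can be illustrated without any external service. The sketch below swaps in toy vectors and plain Python for the OpenAI embeddings and the Chroma store, so it only shows the idea: rank tagged descriptions by cosine similarity to the query vector, then read the `isbn13` back from the front of each tagged line. The exact tagged format (`isbn13`, a space, then the description) is an assumption, and `top_k_isbns` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_isbns(query_vec, tagged_docs, doc_vecs, k=3):
    """Return the isbn13 of the k tagged descriptions closest to the query."""
    ranked = sorted(
        zip(tagged_docs, doc_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    # Assumed format: the isbn13 is the first whitespace-separated token.
    return [int(doc.split(maxsplit=1)[0]) for doc, _ in ranked[:k]]
```

The returned identifiers can then be used to look up the full rows (cover, title, emotion scores) in the cleaned table.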

Quick start

# clone repo
git clone https://github.com/hamzahassan9320/llm-semantic-book-recommender.git
cd llm-semantic-book-recommender

# (optional) create env
conda create -n bookrec python=3.11
conda activate bookrec

# install deps
pip install -r requirements.txt

# add data locally (≈ 30 MB)
mkdir -p data notebook
cp /path/to/books.csv data/
cp /path/to/books_cleaned.csv notebook/
cp /path/to/tagged_description.txt notebook/

# launch dashboard
python src/gradio-dashboard.py

Tested with Python 3.11 on CUDA 12.4
(GPU strongly recommended for the one-off sentiment-scoring notebook).
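Before launching, it can help to confirm that the copied files are where the dashboard is expected to find them. This is a minimal preflight sketch under two assumptions: that the app reads the three data files from the paths used in the quick start, and that the OpenAI key is supplied through `OPENAI_API_KEY`, as in the Docker commands. `missing_prerequisites` is a hypothetical helper, not part of the repository.

```python
import os
from pathlib import Path

# Paths taken from the quick-start commands, relative to the repo root.
REQUIRED_FILES = (
    "data/books.csv",
    "notebook/books_cleaned.csv",
    "notebook/tagged_description.txt",
)


def missing_prerequisites(root=".", env=None):
    """Return the names of required files or settings that are absent."""
    env = os.environ if env is None else env
    problems = [rel for rel in REQUIRED_FILES if not (Path(root) / rel).is_file()]
    # Assumption: the key is read from the environment.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY")
    return problems


if __name__ == "__main__":
    for item in missing_prerequisites():
        print(f"missing: {item}")
```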

Docker

You can either pull the prebuilt image from Docker Hub or build it yourself.

Option A: Pull & run

# Pull the image
docker pull hamza9320/gradio-app:latest

# Run (map host 7860 → container 7860)
docker run -d \
  -p 7860:7860 \
  --name gradio-app \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  hamza9320/gradio-app:latest

# Then browse to:
#   http://localhost:7860

Option B: Build from source

# From the repo root (where your Dockerfile lives)
docker build -t hamza9320/gradio-app:latest .

# Then the same run command
docker run -d \
  -p 7860:7860 \
  --name gradio-app \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  hamza9320/gradio-app:latest

# Browse to http://localhost:7860
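Because the container runs detached (`-d`), the app may take a moment to start accepting connections. A small check like the following can confirm that something is listening on the mapped port; it is only a sketch, `is_port_open` is a hypothetical helper, and an open port does not by itself prove the Gradio UI has finished loading.

```python
import socket


def is_port_open(host="localhost", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("listening" if is_port_open() else "not reachable yet")
```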

Code & notebook guide

| File / notebook | Purpose |
| --- | --- |
| `notebook/data_exploration.ipynb` | EDA, cleaning, feature engineering, zero-shot labelling & emotion scoring |
| `src/gradio-dashboard.py` | End-to-end pipeline (load → embed → search → serve) |
| `requirements.txt` | Pinned runtime dependencies |
| `LICENSE` | MIT license text |

Using the Gradio app

1. Describe the book you’re craving, for example “coming-of-age on a distant planet with a hopeful vibe”.
2. Optionally pick a category (Fiction, Non-fiction, Children’s …) or tone (Happy, Sad, Suspenseful …).
3. Click Find recommendations.
4. The top 16 matches (semantic search + filters + emotion ranking) appear as large covers with bite-sized blurbs. Click a cover to zoom.
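The filter-and-rank step applied to the semantic matches might look like the sketch below. It assumes each candidate is a dict carrying `simple_categories` and one score per emotion, that `"All"` means no filter, and that the tones map to emotions as shown; the mapping and the `rank_recommendations` helper are illustrative guesses, not the dashboard's confirmed logic.

```python
# Assumed mapping from dropdown tones to emotion score fields.
TONE_TO_EMOTION = {"Happy": "joy", "Sad": "sadness", "Suspenseful": "fear"}


def rank_recommendations(candidates, category="All", tone="All", final_top_k=16):
    """Filter semantic matches by category, then re-rank by the chosen tone."""
    books = list(candidates)
    if category != "All":
        books = [book for book in books if book["simple_categories"] == category]
    emotion = TONE_TO_EMOTION.get(tone)
    if emotion:
        # sorted() is stable, so ties keep their semantic-similarity order.
        books = sorted(books, key=lambda book: book[emotion], reverse=True)
    return books[:final_top_k]
```

With no tone selected, the semantic ordering is returned unchanged and simply truncated to the gallery size.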

👉 Live demo on Hugging Face Spaces: https://huggingface.co/spaces/Hamza9320/semantic-book-recommender

Directory layout

.
├── README.md
├── LICENSE
├── requirements.txt
├── data/
│   └── books.csv
├── notebook/
│   ├── data_exploration.ipynb
│   ├── books_cleaned.csv
│   └── tagged_description.txt
└── src/
    └── gradio-dashboard.py

License

Released under the MIT License.
