A comprehensive Neural Matrix Factorization (NeuMF) recommender system that provides personalized movie recommendations using genre-aware collaborative filtering and natural language intent understanding. Built with PyTorch for training, FastAPI for serving, and React for the frontend interface.
- Project Overview
- Technologies Used
- Architecture
- Mathematical Foundations
- NLP Intent System
- Recent Improvements
- System Workflow
- Installation & Setup
- Usage
- Project Structure
This project implements a recommendation system that combines:
- Neural Matrix Factorization (NeuMF): A hybrid deep learning model that fuses Generalized Matrix Factorization (GMF) and Multi-Layer Perceptron (MLP) architectures to learn user-item interactions
- Genre-Aware Filtering: Incorporates movie genre information as multi-hot vectors to enhance recommendation quality
- NLP Intent Understanding: Uses semantic embeddings to interpret free-form user queries (e.g., "I want something exciting and thrilling") and map them to appropriate movie recommendations
- Advanced Affect Detection: Automatically detects emotional intents (sad, funny, scary, romantic, etc.) with keyword-aware boosting to better interpret user intent
- Intent Tower Integration: Optional neural network processing of semantic intent vectors during training and inference
The system supports both MovieLens 100K (small, older movies from 1990s) and MovieLens 25M (large, recent movies up to 2019) datasets, automatically detecting the format. It provides both genre-based and intent-based recommendation endpoints through a RESTful API, accessible via a modern React frontend.
Recommended: Use MovieLens 25M for better performance and more recent movies (default).
- Python 3.9+: Core programming language
- PyTorch: Deep learning framework for model training and inference
- FastAPI: High-performance web framework for the recommendation API
- sentence-transformers: Semantic text embeddings for NLP intent processing
- NumPy & Pandas: Data processing and manipulation
- Uvicorn: ASGI server for FastAPI
- YAML: Configuration file parsing
- React: UI framework
- Vite: Modern build tool and dev server
- Anime.js: Smooth animations for UI interactions
- Axios: HTTP client with interceptors for API communication
- MovieLens 100K: Movie rating dataset with 100,000 ratings from 943 users on 1,682 movies (1990s movies)
- MovieLens 25M: Movie rating dataset with 25,000,000 ratings from 162,541 users on 62,423 movies (recent movies up to 2019) - Recommended
The system consists of three main components:
- Data Loading (data.py): Automatically detects and loads MovieLens 100K or 25M format, creates train/val/test splits using leave-one-out methodology, supports optional sampling for faster experiments
- Model Definition (model.py): Implements the NeuMF architecture with optional intent tower
- Training (train.py): Trains the model using binary cross-entropy loss with negative sampling, memory-efficient intent vector handling
- Evaluation (eval.py): Evaluates model performance using Hit Rate (HR@K) and Normalized Discounted Cumulative Gain (NDCG@K)
- FastAPI Service (main.py): Serves recommendation endpoints with config-based hyperparameter loading
  - /recommendations: Genre-based recommendations
  - /intent_recommendations: NLP-based intent recommendations with advanced affect detection
  - /genres: List available genres (dynamically loaded from dataset)
  - /users: List available user IDs
- Embedding System: Loads pre-computed item embeddings for semantic search
- Intent Mapping: Maps natural language queries to genre weights and affects with keyword-aware boosting
- React Application: Interactive interface with error handling and loading states
- Selecting users and genres
- Free-text prompt search with tunable alpha parameters
- Displaying recommendations with scores
- Real-time API request/response logging
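The leave-one-out split mentioned for the data loader can be sketched as follows. This is a minimal illustration, not the code in data.py: the function name and the rule that the latest interaction goes to test and the second-latest to validation are assumptions.

```python
from collections import defaultdict

def leave_one_out_split(ratings):
    """Split (user, item, timestamp) tuples into train/val/test per user.

    Assumption: the latest interaction goes to test and the second-latest
    to validation; users with fewer than 3 interactions stay in train.
    """
    by_user = defaultdict(list)
    for user, item, ts in ratings:
        by_user[user].append((ts, item))

    train, val, test = [], [], []
    for user, events in by_user.items():
        events.sort()  # oldest interaction first
        items = [item for _, item in events]
        if len(items) < 3:
            train.extend((user, i) for i in items)
            continue
        train.extend((user, i) for i in items[:-2])
        val.append((user, items[-2]))
        test.append((user, items[-1]))
    return train, val, test

ratings = [(1, 10, 100), (1, 11, 200), (1, 12, 300), (2, 10, 50)]
train, val, test = leave_one_out_split(ratings)
```

Holding out exactly one item per user is what makes HR@K and NDCG@K simple to compute, since each user has a single ground-truth item.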
GMF captures linear user-item interactions through elementwise product of embeddings:
Given user embedding p_u ∈ ℝ^k and item embedding q_i ∈ ℝ^k (where k = emb_dim_gmf), the GMF output is:
h_GMF = p_u ⊙ q_i
Where ⊙ denotes the elementwise (Hadamard) product:
h_GMF[j] = p_u[j] × q_i[j] for j = 1, ..., k
This captures multiplicative interactions between user and item latent factors, similar to matrix factorization but in a neural framework.
Embedding Lookup:
- p_u = user_gmf_emb(u) where user_gmf_emb: U → ℝ^k
- q_i = item_gmf_emb(i) where item_gmf_emb: I → ℝ^k
Both embeddings are learned during training via backpropagation.
NeuMF combines GMF with a Multi-Layer Perceptron (MLP) to capture both linear and non-linear interactions:
GMF Output: h_GMF = user_gmf_emb(u) ⊙ item_gmf_emb(i)
h_GMF ∈ ℝ^k where k = emb_dim_gmf
The MLP processes concatenated user, item, genre, and optional intent embeddings:
MLP Input Construction:
x_MLP = [user_mlp_emb(u) || item_mlp_emb(i) || genre_proj(g) || intent_tower(v)]
Where:
- user_mlp_emb(u): User embedding in MLP space ∈ ℝ^(d_mlp)
- item_mlp_emb(i): Item embedding in MLP space ∈ ℝ^(d_mlp)
- genre_proj(g): Projected genre vector ∈ ℝ^(d_genre)
  - g is a multi-hot genre vector (e.g., [0,1,1,0,0,...] for Action + Adventure in the MovieLens 100K genre order, where index 0 is "unknown")
  - genre_proj(g) = W_genre^T · g + b_genre
  - W_genre ∈ ℝ^(|G|×d_genre), where |G| is the number of genres (19 for 100K, dynamic for 25M)
- intent_tower(v): Optional intent embedding processed through a neural network ∈ ℝ^(d_intent)
  - If present: intent_tower(v) = ReLU(W_2^T · Dropout(ReLU(W_1^T · v + b_1)) + b_2)
  - Where v ∈ ℝ^(d_embed) (typically 384 for all-MiniLM-L6-v2)
  - W_1 ∈ ℝ^(d_embed×h), W_2 ∈ ℝ^(h×d_intent)
  - If not present: intent_tower(v) = 0 (zero vector)
Total MLP Input Dimension: d_in = 2·d_mlp + d_genre + d_intent
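To make the multi-hot genre vector g concrete, here is a small sketch that encodes a pipe-separated genre string (the 25M format). The helper name `multi_hot` and the four-genre list are illustrative assumptions, not names from the codebase.

```python
def multi_hot(genres_field, genre_list):
    """Encode a pipe-separated genre string as a multi-hot vector."""
    present = set(genres_field.split("|"))
    return [1.0 if g in present else 0.0 for g in genre_list]

# Illustrative subset; the real list is loaded from the dataset.
GENRES = ["Action", "Adventure", "Comedy", "Drama"]
vec = multi_hot("Action|Adventure", GENRES)
```

The length of this vector is |G|, the input width of the genre projection.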
The MLP then applies multiple fully-connected layers:
h_1 = ReLU(W_1 · x_MLP + b_1)
h_2 = ReLU(W_2 · h_1 + b_2)
...
h_MLP = h_L (final MLP layer output)
Where the layer sizes are specified by mlp_layers (e.g., [128, 64]).
The model concatenates GMF and MLP outputs and applies a final linear layer:
Final Input: z = [h_GMF || h_MLP]
z ∈ ℝ^(k + m), where m = mlp_layers[-1] is the width of the last MLP layer
Final Score: ŷ = W_final^T · z + b_final
Where W_final ∈ ℝ^((k + m)×1)
Predicted Probability: p(interaction | u, i, g, v) = σ(ŷ) = 1 / (1 + exp(-ŷ))
Where σ is the sigmoid function.
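The fusion step can be traced with a tiny numeric example. This is a plain-Python sketch of the formulas above, not the PyTorch module in model.py; the function names and the toy values are made up for illustration, and h_mlp stands in for the output of the MLP tower.

```python
import math

def hadamard(p, q):
    """Elementwise product used by the GMF branch."""
    return [a * b for a, b in zip(p, q)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neumf_score(p_u, q_i, h_mlp, w_final, b_final):
    """Fuse GMF and MLP outputs: y_hat = w_final . [h_GMF || h_MLP] + b_final."""
    z = hadamard(p_u, q_i) + list(h_mlp)  # list concatenation
    y_hat = sum(w * x for w, x in zip(w_final, z)) + b_final
    return sigmoid(y_hat)

# h_GMF = [1.0, -0.5], z = [1.0, -0.5, 0.3], so y_hat = 0.8 - 0.8 = 0.0
prob = neumf_score(p_u=[0.5, -1.0], q_i=[2.0, 0.5], h_mlp=[0.3],
                   w_final=[1.0, 1.0, 1.0], b_final=-0.8)
```

A raw score of zero maps to a predicted interaction probability of 0.5.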
The model is trained using Binary Cross-Entropy Loss with negative sampling:
L = -[y · log(σ(ŷ)) + (1-y) · log(1-σ(ŷ))]
Where:
- y = 1 for positive user-item interactions
- y = 0 for negative samples (randomly sampled non-interacted items)
Batch Loss: For a batch of size B:
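L_batch = (1/B) · Σ_{i=1}^{B} L_i
Optimization: Adam optimizer with learning rate η. Unlike plain gradient descent, Adam rescales each step by running estimates of the gradient's first and second moments:
θ_{t+1} = θ_t - η · m̂_t / (√v̂_t + ε)
Where m̂_t and v̂_t are the bias-corrected first and second moment estimates of ∇_θ L_batch, and ε is a small constant for numerical stability.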
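The loss and the negative sampling it relies on can be sketched in plain Python. The function names are assumptions for illustration, and the sampler assumes the user has not interacted with every item (otherwise it would never terminate); train.py may sample differently.

```python
import math
import random

def sample_negatives(user_items, num_items, neg_per_pos, rng):
    """Draw neg_per_pos random items the user has not interacted with."""
    negatives = []
    while len(negatives) < neg_per_pos:
        candidate = rng.randrange(num_items)
        if candidate not in user_items:
            negatives.append(candidate)
    return negatives

def bce_loss(logits, labels, eps=1e-12):
    """Mean binary cross-entropy over a batch of raw scores y_hat."""
    total = 0.0
    for y_hat, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-y_hat))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)

rng = random.Random(0)
negs = sample_negatives(user_items={0, 1}, num_items=10, neg_per_pos=4, rng=rng)
# One positive plus four negatives, all scored 0.0 by an untrained model.
loss = bce_loss(logits=[0.0] * 5, labels=[1, 0, 0, 0, 0])
```

With every logit at zero the model predicts 0.5 everywhere, so the loss equals ln 2, a useful sanity check for the first training step.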
The NLP intent system enables users to express preferences in natural language (e.g., "I want something exciting and funny") and maps these queries to personalized recommendations. The system operates in several stages with advanced keyword-aware affect boosting:
User queries are converted to dense vectors using a sentence transformer model:
q_raw = SentenceTransformer(text) ∈ ℝ^(d_embed)
Where d_embed = 384 for all-MiniLM-L6-v2.
The embedding is L2-normalized: q_raw = q_raw / ||q_raw||_2
This ensures unit-length vectors for proper cosine similarity computation.
For each genre, a centroid is computed as the mean embedding of all movies in that genre:
For genre g, with items I_g = {i : genre_i[g] = 1}:
c_g = (1/|I_g|) · Σ_{i ∈ I_g} e_i
Where e_i is the pre-computed embedding for item i (computed from "title + genres" text).
The centroid is normalized: c_g = c_g / ||c_g||_2
Genre Centroid Matrix: C = [c_1, c_2, ..., c_|G|]^T ∈ ℝ^(|G|×d_embed)
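A minimal sketch of the centroid computation, using plain lists in place of NumPy arrays. The function names are illustrative, and mapping a genre with no items to a zero vector is an assumption rather than documented behavior.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def genre_centroids(item_embs, genre_matrix):
    """Return one unit-length centroid per genre (row order follows genres)."""
    num_genres = len(genre_matrix[0])
    dim = len(item_embs[0])
    centroids = []
    for g in range(num_genres):
        members = [e for e, row in zip(item_embs, genre_matrix) if row[g] == 1]
        if not members:
            centroids.append([0.0] * dim)  # assumption: empty genre -> zero vector
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        centroids.append(l2_normalize(mean))
    return centroids

item_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
genre_matrix = [[1, 0], [1, 1], [0, 1]]  # rows: items, columns: genres
C = genre_centroids(item_embs, genre_matrix)
```

Because the centroids are renormalized, the later product C · q gives cosine similarities directly.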
The query vector is "steered" toward relevant genre centroids and away from irrelevant ones:
Step 1: Compute similarities
sims = C · q_raw ∈ ℝ^|G|
Step 2: Identify top and bottom genres
top_genres = argmax_k(sims) (top K most similar genres, default k=3)
bot_genres = argmin_k(sims) (bottom K least similar genres, default k=2)
Step 3: Compute direction vectors
c_top = mean({c_g : g ∈ top_genres})
c_bot = mean({c_g : g ∈ bot_genres})
Step 4: Steer query vector
q_steered = q_raw + α_pos · c_top - α_neg · c_bot
Where α_pos and α_neg are adaptive scaling factors (typically 0.8-1.1 for α_pos, 0.9-1.0 for α_neg).
Step 5: Renormalize
q_steered = q_steered / ||q_steered||_2
This sharpens the query vector to better match user intent.
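The five steering steps can be sketched as one function. This is an illustration under stated assumptions: the function and helper names are made up, and the default alphas of 0.9 are simply values inside the documented ranges.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def mean_vec(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def steer_query(q_raw, centroids, alpha_pos=0.9, alpha_neg=0.9, k_top=3, k_bot=2):
    """Pull q_raw toward the most similar genre centroids and away from the least."""
    sims = [dot(c, q_raw) for c in centroids]                      # step 1
    order = sorted(range(len(sims)), key=lambda g: sims[g], reverse=True)
    top, bot = order[:k_top], order[-k_bot:]                       # step 2
    c_top = mean_vec([centroids[g] for g in top])                  # step 3
    c_bot = mean_vec([centroids[g] for g in bot])
    steered = [q + alpha_pos * t - alpha_neg * b                   # step 4
               for q, t, b in zip(q_raw, c_top, c_bot)]
    return l2_normalize(steered)                                   # step 5

centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
q_raw = [0.8, 0.6]
q_steered = steer_query(q_raw, centroids, k_top=1, k_bot=1)
```

In this toy case the query already leans toward the first centroid, and steering increases that component while keeping the vector at unit length.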
The system detects emotional intents using predefined affect anchors with keyword-aware boosting:
For each affect a ∈ {sad, funny, scary, romantic, exciting, inspiring, family, dark} with anchor phrase p_a:
a_emb = embed(p_a) / ||embed(p_a)||_2
These are precomputed and cached for efficiency.
For the steered query vector q_steered, compute cosine similarity with each affect:
affect_score(a) = a_emb^T · q_steered
affect_scores = {a: affect_score(a) for a ∈ AFFECTS}
Action Keyword Detection: Define action keywords: Kaction = {'pumping', 'pump', 'adrenaline', 'heart racing', 'action', 'thrilling', 'awesome'}
If any keyword k ∈ Kaction appears in query q:
If affect_score('exciting') > 0 and affect_score('scary') > 0:
exciting_boost = min(0.15, affect_score('scary') × 0.3)
affect_score('exciting') ← affect_score('exciting') + exciting_boost
If affect_score('scary') > affect_score('exciting'):
affect_score('scary') ← affect_score('scary') × 0.85
This ensures action-oriented queries prioritize "exciting" over "scary".
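The boosting rules above can be sketched as a small function. The nesting of the two conditions and the choice to compare pre-boost scores in the damping check are assumptions, since the flattened rules do not make them explicit; substring matching is also a simplification (it would match "action" inside "reaction").

```python
ACTION_KEYWORDS = {"pumping", "pump", "adrenaline", "heart racing",
                   "action", "thrilling", "awesome"}

def boost_exciting(query, affect_scores):
    """Apply the keyword-aware boost and return a new score dict."""
    scores = dict(affect_scores)
    text = query.lower()
    if not any(kw in text for kw in ACTION_KEYWORDS):
        return scores
    exciting = scores.get("exciting", 0.0)
    scary = scores.get("scary", 0.0)
    if exciting > 0 and scary > 0:
        scores["exciting"] = exciting + min(0.15, scary * 0.3)
        # Assumption: the damping check compares the pre-boost scores.
        if scary > exciting:
            scores["scary"] = scary * 0.85
    return scores

boosted = boost_exciting("make my blood pumping",
                         {"exciting": 0.36, "scary": 0.47})
```

With the scores quoted elsewhere in this document (scary 0.47, exciting 0.36), the boost moves "exciting" to about 0.50 and "scary" to about 0.40, which flips the ranking.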
If top_affect == 'exciting' (after boosting):
α_pos = 1.1 (stronger pull toward action genres)
If top_affect == 'scary':
α_pos = 1.0
For other affects: α_pos = 0.9 (moderate pull)
Keep only positive affect scores: aff_items = {(a, s) : (a, s) ∈ affect_scores.items() ∧ s > 0}
total = Σ_{(a,s) ∈ aff_items} s
Normalized scores: aff_norm(a) = s / total for each (a, s) ∈ aff_items
Top affect: top_affect = argmax_a aff_norm(a)
Affect confidence: affect_conf = max(aff_norm)
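A short sketch of the normalization step. The function name and the return value for the case where no affect is positive are assumptions.

```python
def normalize_affects(affect_scores):
    """Drop non-positive scores, normalize the rest to sum to 1."""
    positive = {a: s for a, s in affect_scores.items() if s > 0}
    total = sum(positive.values())
    if total == 0:
        return {}, None, 0.0  # assumption: no affect detected
    aff_norm = {a: s / total for a, s in positive.items()}
    top_affect = max(aff_norm, key=aff_norm.get)
    return aff_norm, top_affect, aff_norm[top_affect]

aff_norm, top_affect, affect_conf = normalize_affects(
    {"exciting": 0.6, "scary": 0.2, "sad": -0.1, "dark": 0.2}
)
```

Here "sad" is discarded for being negative, and the remaining scores already sum to 1, so "exciting" ends up with a confidence of 0.6.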
Genre weights are computed by combining centroid similarities and affect priors:
sim_g = c_g^T · q_steered
sims_genres = [sim_1, sim_2, ..., sim_|G|]
Clip negative similarities: sims_genres = max(0, sims_genres) (elementwise)
For each affect a with normalized score aff_norm(a), apply genre priors:
P(g | a) is predefined (e.g., P(Action | exciting) = 0.7, P(Thriller | exciting) = 0.25)
affect_weights_g = Σ_a [aff_norm(a) × P(g | a)]
For all genres g: affect_weights = [affect_weights_1, affect_weights_2, ..., affect_weights_|G|]
Exclude weakly related genres: exclude_thresh = 0.25 if affect_conf ≥ 0.35 else 0.20
sims_genres = sims_genres if sims_genres ≥ exclude_thresh else 0 (elementwise)
Combine signals: β = 0.9 if affect_conf ≥ 0.35 else 0.7
combined = sims_genres + β × affect_weights
If top_affect ∈ CONFLICT_SUPPRESS:
For each conflicting genre g ∈ CONFLICT_SUPPRESS[top_affect]:
combined[g] ← combined[g] × (0.2 if affect_conf ≥ 0.35 else 0.5)
This reduces weight for genres that conflict with the detected affect (e.g., Horror suppressed for "exciting" queries).
If Σg combined[g] > 0:
genre_weights[g] = combined[g] / Σg' combined[g']
Else:
genre_weights = combined (unchanged)
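Putting the clipping, thresholding, blending, conflict suppression, and normalization together gives roughly the following. The function name and the `suppressed` argument (a set of genre indices to damp) are illustrative assumptions; the constants are the ones listed above.

```python
def genre_weights(sims, affect_weights, affect_conf, suppressed=()):
    """Blend centroid similarities with affect priors into normalized weights."""
    confident = affect_conf >= 0.35
    thresh = 0.25 if confident else 0.20
    beta = 0.9 if confident else 0.7
    damp = 0.2 if confident else 0.5
    combined = []
    for g, (s, a) in enumerate(zip(sims, affect_weights)):
        s = max(0.0, s)                    # clip negative similarities
        s = s if s >= thresh else 0.0      # exclude weakly related genres
        w = s + beta * a                   # add the affect prior
        if g in suppressed:
            w *= damp                      # conflict suppression
        combined.append(w)
    total = sum(combined)
    return [w / total for w in combined] if total > 0 else combined

weights = genre_weights(sims=[0.5, 0.1, 0.4],
                        affect_weights=[0.7, 0.0, 0.0],
                        affect_conf=0.5,
                        suppressed={2})
```

In this example the second genre falls below the threshold and gets zero weight, while the third is kept but damped to a small share.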
Compute item similarities: item_sims = E · q_steered ∈ ℝ^|I|
Where E is the item embedding matrix ∈ ℝ^(|I|×d_embed)
Select top candidates: candidates = argtop_k(item_sims, candidate_pool)
embed_bonus = item_sims[candidates] (embedding similarity scores for candidates)
For each candidate item i in candidates:
base_score[i] = σ(ŷ) = model.predict(user_id, i, genre_vector[i], intent_vector)
Where intent_vector = qsteered if intent tower is enabled.
Genre Bonus: genre_bonus[i] = Σg [genre_weights[g] × genre_vector[i][g]]
This is the dot product between genre weights and the item's genre vector.
Popularity Bonus: pop_bonus[i] = popularity[i] × pop_w
Where popularity[i] is normalized interaction count for item i, and pop_w is popularity weight (typically 0 for affect-based queries).
Embedding Bonus: embed_bonus[i] (already computed during candidate retrieval)
final_score[i] = base_score[i] + α_genre × genre_bonus[i] + α_pop × pop_bonus[i] + α_embed × embed_bonus[i]
Where:
- α_genre: User-tunable weight for genre matching (default: 0.35)
- α_pop: User-tunable weight for popularity (default: 0.05)
- α_embed: User-tunable weight for semantic similarity (default: 0.60)
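As a worked example of the blend, the sketch below computes the genre bonus as a dot product and then the final score with the default alphas. The function names are illustrative; the parameter names mirror the API's genre_alpha, pop_alpha, and embed_alpha.

```python
def genre_bonus(genre_weights, genre_vector):
    """Dot product between inferred genre weights and an item's multi-hot genres."""
    return sum(w * g for w, g in zip(genre_weights, genre_vector))

def final_score(base, g_bonus, pop_bonus, embed_bonus,
                genre_alpha=0.35, pop_alpha=0.05, embed_alpha=0.60):
    return (base + genre_alpha * g_bonus
            + pop_alpha * pop_bonus + embed_alpha * embed_bonus)

bonus = genre_bonus([0.8, 0.2], [1, 0])            # 0.8
score = final_score(base=0.5, g_bonus=bonus,
                    pop_bonus=0.0, embed_bonus=0.5)  # 0.5 + 0.28 + 0 + 0.30
```

Because the base score is a probability in [0, 1] and the bonuses are added on top, the final score is a ranking value and can exceed 1.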
Priority-based ranking:
top_genres_final = argmax_2(genre_weights) (top 2 inferred genres)
For each candidate i:
match_mask[i] = (Σ_{g ∈ top_genres_final} genre_vector[i][g]) > 0
Sort candidates by final_score in descending order:
order = argsort(-final_score)
Split into primary (matching top genres) and secondary:
primary = [i ∈ order : match_mask[i]]
secondary = [i ∈ order : ¬match_mask[i]]
Final selection: selected = (primary || secondary)[:top_k]
This ensures items matching inferred genres are prioritized even when strict=False.
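The priority-based ranking can be sketched as a stable partition of the score-sorted candidates. The function name and argument layout are assumptions for illustration.

```python
def prioritized_top_k(final_scores, genre_vectors, genre_weights, top_k,
                      n_top_genres=2):
    """Rank by score, but list items matching the top inferred genres first."""
    top_genres = sorted(range(len(genre_weights)),
                        key=lambda g: genre_weights[g],
                        reverse=True)[:n_top_genres]
    order = sorted(range(len(final_scores)),
                   key=lambda i: final_scores[i], reverse=True)

    def matches(i):
        return any(genre_vectors[i][g] for g in top_genres)

    primary = [i for i in order if matches(i)]
    secondary = [i for i in order if not matches(i)]
    return (primary + secondary)[:top_k]

selected = prioritized_top_k(
    final_scores=[0.9, 0.8, 0.7],
    genre_vectors=[[0, 0, 1], [1, 0, 0], [0, 1, 0]],
    genre_weights=[0.5, 0.4, 0.1],
    top_k=2,
)
```

Item 0 has the highest score but matches neither of the top two genres, so it is pushed behind items 1 and 2 and drops out of the top 2.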
Auto-detection: The system automatically detects whether you're using MovieLens 100K or 25M format and adjusts accordingly.
Format Differences:
- 100K: Tab-separated files, 19 fixed genres, binary genre encoding
- 25M: CSV files, dynamic genre list (pipe-separated), more recent movies (up to 2019)
Memory Optimization: Added max_ratings parameter to sample subsets for faster experimentation:
python main.py --data ./data/ml-25m --max-ratings 500000 --epochs 3
Problem: Original implementation tried to store all intent vectors in memory, requiring ~86.8 GB for 25M dataset.
Solution: Implemented lazy loading - intent vectors are looked up on-the-fly during batch iteration:
- Store only item indices (8 bytes each) instead of full embeddings (1536 bytes each)
- 99.4% memory reduction: ~86.8 GB → ~485 MB
- No performance degradation in practice
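The idea behind the lazy lookup can be sketched with a generator that keeps only item indices and resolves them to embedding rows one batch at a time. This is an illustration, not the loader in train.py; the function name is an assumption, and plain lists stand in for arrays.

```python
def iter_batches(item_indices, item_embeddings, batch_size):
    """Yield batches of intent vectors, looked up only when a batch is built."""
    for start in range(0, len(item_indices), batch_size):
        batch = item_indices[start:start + batch_size]
        yield [item_embeddings[i] for i in batch]

BYTES_PER_INDEX = 8             # one 64-bit integer index per sample
BYTES_PER_EMBEDDING = 384 * 4   # 384 float32 values = 1536 bytes per sample
ratio = BYTES_PER_INDEX / BYTES_PER_EMBEDDING

item_embeddings = [[0.0], [1.0], [2.0]]   # toy 1-dim embeddings
item_indices = [2, 0, 2, 1]               # one index per training sample
batches = list(iter_batches(item_indices, item_embeddings, batch_size=3))
```

Per sample, the stored index is roughly 0.5% of the size of the full vector, which is where the bulk of the reported reduction comes from; the shared embedding matrix is stored once regardless of how many samples reference it.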
Problem: Queries like "make my blood pumping" incorrectly prioritized "scary" (0.47) over "exciting" (0.36), leading to horror movie recommendations instead of action.
Solution:
- Detects action-oriented keywords: 'pumping', 'adrenaline', 'thrilling', 'awesome', etc.
- Boosts "exciting" affect when action keywords are present: exciting_score ← exciting_score + min(0.15, scary_score × 0.3)
- Reduces "scary" when it dominates: scary_score ← scary_score × 0.85
- Enhanced steering for exciting queries: αpos = 1.1 (vs 1.0 for scary)
Config-based hyperparameters: Backend loads model hyperparameters from configs/starter.yaml to match training exactly.
Dynamic genre lists: Genre lists are dynamically loaded from the dataset (19 genres for 100K, 19-20+ for 25M including IMAX).
Improved genre centroids: Genre centroids are computed from actual embeddings, ensuring semantic coherence.
Error handling: Added comprehensive error handling and logging:
- API request/response interceptors
- Network error handling
- 30-second timeout for API requests
Loading states: Proper loading indicators during initialization and recommendation fetching.
User feedback: Console logging for debugging and user feedback.
Flexible format support: Single codebase supports both 100K and 25M with automatic detection:
- Detects format based on file structure
- Handles different genre encoding (binary vs. pipe-separated)
- Adapts metadata loading accordingly
1. Data Preparation
└─> Load MovieLens dataset (auto-detect 100K or 25M format)
└─> Create user/item ID mappings (internal indexing)
└─> Build genre matrix (multi-hot vectors)
• 100K: 19 fixed genres with binary encoding
• 25M: Dynamic genres with pipe-separated encoding
└─> Split into train/val/test (leave-one-out methodology)
└─> Generate negative samples for training (4 per positive by default)
2. Embedding Generation (Optional, for NLP Intent System)
└─> Build text descriptions: "Movie Title. Genres: Action, Adventure."
└─> Encode using SentenceTransformer (all-MiniLM-L6-v2)
└─> Normalize embeddings to unit length
└─> Save to item_embeddings.npy (shape: [num_items, 384])
3. Model Training
└─> Initialize NeuMF model:
├─> GMF embeddings: [num_users, emb_dim_gmf], [num_items, emb_dim_gmf]
├─> MLP embeddings: [num_users, emb_dim_mlp], [num_items, emb_dim_mlp]
├─> Genre projector: [num_genres, genre_proj_dim]
└─> Intent tower (if embeddings available): [384 → 128 → 64]
└─> For each epoch:
├─> Sample positive interactions from train set
├─> Sample negative interactions (neg_per_pos per positive)
├─> Forward pass: compute predictions
├─> Compute BCE loss: L = -[y·log(σ(ŷ)) + (1-y)·log(1-σ(ŷ))]
└─> Backpropagate and update weights (Adam optimizer)
└─> Evaluate on test set (HR@10, NDCG@10)
└─> Save model checkpoint to checkpoints/neumf_final.pt
4. API Startup
└─> Load config from configs/starter.yaml
└─> Load dataset and build mappings
└─> Load trained model weights (with compatibility checking)
└─> Initialize model with exact hyperparameters from config
└─> Load item embeddings (if available)
└─> Compute genre centroids from embeddings
└─> Pre-compute popularity scores (normalized interaction counts)
└─> Cache affect anchor embeddings
└─> Start FastAPI server with CORS enabled
5. Recommendation Request (Genre-based)
└─> Receive: user_id, genre, top_k, strict
└─> Map user_id to internal index
└─> Get all items user hasn't interacted with
└─> Filter candidates by genre:
• Strict: Only items with selected genre
• Soft: All items, but genre matching gets bonus
└─> Run NeuMF model on candidates (batch inference)
└─> Apply genre bonus if soft mode
└─> Rank by prediction scores
└─> Return top_k movies with metadata
6. Recommendation Request (Intent-based)
└─> Receive: query text, user_id, top_k, alphas, strict
└─> Embed query text → q_raw (384-dim vector)
└─> Detect action keywords → boost exciting affect if present
└─> Compute affect scores → normalize and identify top_affect
└─> Adjust steering parameters based on top_affect
└─> Steer query vector toward relevant genres
└─> Compute genre weights (centroid sims + affect priors)
└─> Apply conflict suppression if needed
└─> Filter candidates by top genres if strict=True
└─> Retrieve candidates via embedding similarity (top candidate_pool)
└─> Run NeuMF model with intent vector on candidates
└─> Combine scores: base + α_genre·genre + α_pop·pop + α_embed·embed
└─> Prioritize items matching top genres
└─> Return top_k movies
- Python 3.9+ (3.10+ recommended; note that PyTorch with CUDA requires Python ≤3.12)
- Node.js 18+ and npm
- Optional: CUDA-enabled GPU for faster training/inference
# Create virtual environment
python -m venv .venv
# Activate (Windows PowerShell)
.\.venv\Scripts\Activate.ps1
# Activate (macOS/Linux)
source .venv/bin/activate
# Install core dependencies
pip install -r requirements.txt
# Install backend dependencies
pip install -r backend/requirements.txt
Important: The data files are not included in the repository (they are too large for GitHub). You must download them using the provided script or manually from MovieLens.
Option 1: MovieLens 25M (Recommended - larger, more recent movies)
python scripts/download_mlwk.py --dataset 25m --target ./data/ml-25m
Option 2: MovieLens 100K (Smaller, older movies from 1990s)
python scripts/download_mlwk.py --dataset 100k --target ./data/ml-100k
Note: MovieLens 25M is ~250MB download and may take a few minutes. The system automatically detects the dataset format.
Essential Files Required:
For MovieLens 25M:
data/ml-25m/ml-25m/
├─ ratings.csv ✓ REQUIRED (ratings: userId,movieId,rating,timestamp)
├─ movies.csv ✓ REQUIRED (movie metadata: movieId,title,genres)
└─ ... (other files are optional and excluded from git)
For MovieLens 100K:
data/ml-100k/ml-100k/
├─ u.data ✓ REQUIRED (ratings)
├─ u.item ✓ REQUIRED (movie metadata with genres)
└─ ... (other files are optional and excluded from git)
What's Excluded: The repository uses .gitignore to exclude non-essential files like pre-split test sets (u1.base, u1.test, etc.), documentation (README, u.info), additional metadata (u.user, genome-tags.csv), and scripts (allbut.pl). Only the essential rating and movie metadata files listed above are needed for both training and backend runtime.
For MovieLens 25M (recommended):
python main.py --data ./data/ml-25m --epochs 10
For MovieLens 100K:
python main.py --data ./data/ml-100k --epochs 10
Optional (faster experiments): limit the number of ratings by sampling from the dataset:
python main.py --data ./data/ml-25m --max-ratings 500000 --epochs 3
Note: The system automatically detects the dataset format. For MovieLens 25M, you may want to adjust hyperparameters in configs/starter.yaml or use the --max-ratings parameter to limit training data size for faster experimentation.
This will:
- Load and preprocess the dataset
- Train the NeuMF model
- Evaluate on test set (reports HR@10 and NDCG@10)
- Save model to
./checkpoints/neumf_final.pt
Configuration: Edit configs/starter.yaml to adjust hyperparameters:
- emb_dim_gmf: GMF embedding dimension (default: 32)
- emb_dim_mlp: MLP embedding dimension (default: 64)
- mlp_layers: MLP layer sizes (default: [128, 64])
- lr: Learning rate (default: 0.001)
- batch_size: Training batch size (default: 256)
- epochs: Number of training epochs (default: 10)
- neg_per_pos: Negative samples per positive (default: 4)
For MovieLens 25M:
python scripts/build_item_embeddings.py --data ./data/ml-25m --out ./checkpoints/item_embeddings.npy
For MovieLens 100K:
python scripts/build_item_embeddings.py --data ./data/ml-100k --out ./checkpoints/item_embeddings.npy
This creates semantic embeddings for each movie (title + genres) using sentence-transformers, enabling the NLP intent recommendation feature. The embeddings are stored as a NumPy array with shape [num_items, 384].
Windows PowerShell:
$env:MOVIELENS_PATH = ".\data\ml-25m" # or ".\data\ml-100k" for 100K dataset
$env:MODEL_PATH = ".\checkpoints\neumf_final.pt"
$env:EMB_PATH = ".\checkpoints\item_embeddings.npy" # optional
$env:EMB_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2" # optional
uvicorn backend.main:app --reload --port 8000
macOS/Linux:
export MOVIELENS_PATH=./data/ml-25m # or ./data/ml-100k for 100K dataset
export MODEL_PATH=./checkpoints/neumf_final.pt
export EMB_PATH=./checkpoints/item_embeddings.npy # optional
export EMB_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2 # optional
uvicorn backend.main:app --reload --port 8000
The API will be available at http://localhost:8000
Note: On first startup, the API loads the dataset and model into memory. This may take 1-3 minutes for MovieLens 25M. Wait for "Application startup complete" before making requests.
API Endpoints:
- GET /genres → List of available genres (dynamically loaded from dataset)
- GET /users → List of user IDs (original MovieLens user IDs)
- GET /recommendations?user_id=<id>&genre=<name>&top_k=<n>&strict=<true|false> → Genre-based recommendations
- GET /intent_recommendations?q=<prompt>&user_id=<id>&top_k=<n>&strict=<bool>&genre_alpha=<f>&pop_alpha=<f>&embed_alpha=<f> → NLP intent-based recommendations
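For calling the intent endpoint from Python, the query string can be built with the standard library. This is a sketch: the helper name `intent_url` is made up, and it assumes the backend runs on the default http://localhost:8000 and accepts the query parameters listed above.

```python
from urllib.parse import urlencode, quote

BASE_URL = "http://localhost:8000"  # assumed default backend address

def intent_url(q, user_id, top_k=10, strict=False,
               genre_alpha=0.35, pop_alpha=0.05, embed_alpha=0.60):
    """Build the /intent_recommendations URL with percent-encoded parameters."""
    params = {
        "q": q,
        "user_id": user_id,
        "top_k": top_k,
        "strict": str(strict).lower(),
        "genre_alpha": genre_alpha,
        "pop_alpha": pop_alpha,
        "embed_alpha": embed_alpha,
    }
    # quote_via=quote encodes spaces as %20 instead of '+'
    return f"{BASE_URL}/intent_recommendations?{urlencode(params, quote_via=quote)}"

url = intent_url("exciting thrilling", user_id=1)
```

The resulting URL can be passed to `urllib.request.urlopen` (or any HTTP client) once the backend is running.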
cd frontend
npm install
npm run dev
Open the URL shown by Vite (typically http://localhost:5173)
- Select a user from the dropdown
- Select a genre (e.g., "Action", "Comedy")
- Choose "Strict" mode (only movies with selected genre) or "Soft" mode (prefers matching genre but allows others)
- Adjust "Top K" to control number of recommendations
- Recommendations appear automatically
Soft Mode Formula: final_score = base_score + 0.20 × genre_match_bonus
Where genre_match_bonus = 1.0 if item has selected genre, 0.0 otherwise.
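The difference between the two modes can be sketched as a single scoring function. The name `genre_mode_score` and the use of `None` to mark a filtered-out item are assumptions for illustration; the 0.20 bonus is the value from the formula above.

```python
def genre_mode_score(base_score, has_genre, strict, bonus=0.20):
    """Return the ranking score, or None if strict mode filters the item out."""
    if strict:
        return base_score if has_genre else None
    return base_score + bonus * (1.0 if has_genre else 0.0)

soft_match = genre_mode_score(0.5, has_genre=True, strict=False)     # 0.5 + 0.20
soft_other = genre_mode_score(0.5, has_genre=False, strict=False)    # unchanged
strict_other = genre_mode_score(0.5, has_genre=False, strict=True)   # filtered out
```

In soft mode a non-matching item can still outrank a matching one if its base score is more than 0.20 higher.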
- Enter a free-form text query in the prompt box, e.g.:
- "I want something exciting and thrilling"
- "show me a sad romantic movie"
- "the movie should be awesome and make my blood pumping"
- "something funny for the family"
- Adjust the alpha parameters (live tuning):
- Genre α: Weight for genre matching (0.0-1.0, default: 0.35)
- Popularity α: Weight for popular movies (0.0-1.0, default: 0.05)
- Embedding α: Weight for semantic similarity (0.0-1.0, default: 0.60)
- Click "Recommend"
- The system will:
- Parse your query semantically using sentence transformers
- Detect emotional affects (sad, funny, scary, etc.) with keyword-aware boosting
- Boost "exciting" affect if action keywords detected
- Infer genre preferences through centroid similarity and affect priors
- Steer query vector toward relevant genres
- Retrieve candidates via embedding similarity
- Score using NeuMF model + genre/popularity/embedding bonuses
- Prioritize items matching top inferred genres
- Return personalized recommendations
Example Query Processing:
- Query: "make my blood pumping"
- Action keywords detected: "pumping" → boost exciting
- Affects: exciting (boosted), scary (reduced), dark
- Genres inferred: Action (0.15), Thriller (0.12), Horror (suppressed)
- Results: High-energy action/thriller movies
Genre-based:
curl "http://localhost:8000/recommendations?user_id=1&genre=Action&top_k=10&strict=true"
Intent-based:
curl "http://localhost:8000/intent_recommendations?q=exciting%20thrilling&user_id=1&top_k=10&strict=false&genre_alpha=0.35&pop_alpha=0.05&embed_alpha=0.60"
Intent-based with action keywords:
curl "http://localhost:8000/intent_recommendations?q=awesome%20blood%20pumping&user_id=1&top_k=10&strict=false"
NeuMF-Movie-Recommendation-Engine/
├── recsys/ # Core recommendation system
│ ├── data.py # Data loading and preprocessing (supports 100K & 25M)
│ ├── model.py # NeuMF model definition with optional intent tower
│ ├── train.py # Training loop with memory-efficient intent handling
│ └── eval.py # Evaluation metrics (HR@K, NDCG@K)
│
├── backend/ # FastAPI service
│ ├── main.py # API endpoints and advanced intent system
│ └── utils.py # Helper functions for metadata loading
│
├── frontend/ # React UI
│ ├── src/
│ │ ├── App.jsx # Main application with error handling
│ │ ├── api.js # API client with interceptors and timeout
│ │ └── components/
│ │ ├── GenreSelector.jsx
│ │ ├── PromptSearch.jsx # NLP intent UI with alpha tuning
│ │ ├── Recommendations.jsx
│ │ └── StrictToggle.jsx
│ └── package.json
│
├── scripts/ # Utility scripts
│ ├── download_mlwk.py # Download MovieLens 100K or 25M
│ └── build_item_embeddings.py # Generate semantic embeddings
│
├── configs/ # Configuration files
│ └── starter.yaml # Training hyperparameters (loaded by backend)
│
├── checkpoints/ # Saved models and embeddings
│ ├── neumf_final.pt # Trained model weights
│ └── item_embeddings.npy # Pre-computed embeddings [num_items, 384]
│
├── data/ # Dataset (not in repository - must download)
│ ├── ml-100k/ # MovieLens 100K data (download via script)
│ │ └── ml-100k/
│ │ ├── u.data ✓ Essential
│ │ └── u.item ✓ Essential
│ └── ml-25m/ # MovieLens 25M data (download via script)
│ └── ml-25m/
│ ├── ratings.csv ✓ Essential
│ └── movies.csv ✓ Essential
│
├── main.py # CLI entry point for training
├── requirements.txt # Python dependencies
└── README.md # This file
The system uses standard recommendation metrics:
Hit Rate (HR@K): Fraction of users whose held-out test item appears in the top-K recommendations:
HR@K = (1/|U|) · Σ_{u ∈ U} I(test_item_u ∈ top_K_u)
Where:
- U: Set of all users in test set
- top_K_u: Top K recommended items for user u
- test_item_u: Ground truth test item for user u
- I(·): Indicator function (1 if condition is true, 0 otherwise)
Normalized Discounted Cumulative Gain (NDCG@K): Measures ranking quality, giving higher weight to items ranked higher:
DCG@K = Σ_{i=1}^{K} (2^(rel_i) - 1) / log2(i + 1)
Where:
- rel_i: Relevance of item at position i (1 if item matches test item, 0 otherwise)
IDCG@K: Ideal DCG (DCG if test item is ranked first), which equals 1 when each user has a single test item
NDCG@K = DCG@K / IDCG@K
NDCG ranges from 0 to 1, where 1 indicates perfect ranking.
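Under leave-one-out evaluation each user has exactly one relevant item, so IDCG@K is 1 and NDCG@K reduces to 1 / log2(rank + 1) for a hit. A minimal sketch of both metrics (function names are illustrative, not the ones in eval.py):

```python
import math

def hit_rate_at_k(ranked_lists, test_items, k):
    """Fraction of users whose test item is in their top-k list."""
    hits = sum(1 for ranked, t in zip(ranked_lists, test_items) if t in ranked[:k])
    return hits / len(test_items)

def ndcg_at_k(ranked_lists, test_items, k):
    """Mean NDCG@k with a single relevant item per user (IDCG = 1)."""
    total = 0.0
    for ranked, t in zip(ranked_lists, test_items):
        top = ranked[:k]
        if t in top:
            position = top.index(t) + 1          # 1-based rank of the hit
            total += 1.0 / math.log2(position + 1)
    return total / len(test_items)

ranked = [[5, 3, 9], [1, 2, 4]]   # top-3 lists for two users
tests = [3, 7]                    # user 1 hits at rank 2, user 2 misses
hr = hit_rate_at_k(ranked, tests, k=3)
ndcg = ndcg_at_k(ranked, tests, k=3)
```

A hit at rank 1 contributes 1.0 to NDCG, while a hit at rank 2 contributes 1 / log2(3), about 0.63, which is why NDCG is always at most HR.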
- Backend fails to find dataset:
  - Check MOVIELENS_PATH points to the directory containing either ml-100k/ (for 100K) or ml-25m/ (for 25M) subfolder, or ratings.csv in the root
  - Ensure the essential files exist:
    - MovieLens 100K: ml-100k/u.data and ml-100k/u.item must be present
    - MovieLens 25M: ml-25m/ratings.csv and ml-25m/movies.csv must be present
  - If you cloned the repo, remember to download the dataset using scripts/download_mlwk.py (data files are not in the repository)
- Backend fails to load model: Ensure MODEL_PATH points to checkpoints/neumf_final.pt created after training. Check that hyperparameters in configs/starter.yaml match training configuration.
- NLP intent system not working: Ensure item_embeddings.npy exists (run build_item_embeddings.py first). Check that embedding shape matches number of items in dataset.
- CORS errors: Confirm backend runs on http://localhost:8000 and frontend on http://localhost:5173
- Out of memory with 25M dataset:
  - Reduce batch_size in configs/starter.yaml
  - Use --max-ratings parameter to sample subset (e.g., --max-ratings 500000)
  - Use CPU instead of CUDA if GPU memory is insufficient
- Training is slow: MovieLens 25M is much larger - consider using GPU or reducing training epochs for experimentation. Use --max-ratings for quick tests.
- Frontend hangs on startup:
  - Wait for backend to fully start ("Application startup complete" message)
  - Check browser console for API errors
  - Verify backend is responding: curl http://localhost:8000/genres
  - Check Network tab in browser DevTools for pending requests
- PyTorch CUDA not available: PyTorch CUDA builds require Python ≤3.12. Use CPU mode or create a Python 3.12 environment.
- Intent recommendations skewed wrong: Adjust alpha parameters in UI or via API. Increase genre_alpha for stronger genre matching, increase embed_alpha for semantic similarity.
For research and educational purposes.
- MovieLens dataset: https://grouplens.org/datasets/movielens/
- NeuMF paper: "Neural Collaborative Filtering" by He et al., 2017 (WWW)
- Sentence Transformers: https://www.sbert.net/
- FastAPI: https://fastapi.tiangolo.com/
- React: https://react.dev/
If you use this codebase in your research, please cite:
@misc{neumf-movie-recommendation,
title={NeuMF Genre-Aware Movie Recommendation Engine},
author={Your Name},
year={2025},
howpublished={\url{https://github.com/yourusername/NeuMF-Movie-Recommendation-Engine}}
}