
DLRM example for making predictions with the TorchRec library. This is a basic example showing how to get started with TorchRec quickly on a local machine. #3043


Open: 1 commit to be merged into base branch `main`.
47 changes: 47 additions & 0 deletions examples/prediction/__init__.py
@@ -0,0 +1,47 @@
#!/usr/bin/env python3
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# pyre-strict

"""
TorchRec recommendation model examples.

This package contains examples of different recommendation models implemented using TorchRec:
- DLRM (Deep Learning Recommendation Model)
- Two-Tower Model

Each model is organized in its own subdirectory with complete implementation, tests, and documentation.
"""

# Import main components from DLRM
from torchrec.github.examples.prediction.dlrm.predict_using_dlrm import (
    create_kjt_from_batch,
    DLRMRatingWrapper,
    RecommendationDataset as DLRMDataset,
    TorchRecDLRM,
)

# Import main components from Two-Tower
from torchrec.github.examples.prediction.twoTower.predict_using_twotower import (
    create_kjt_from_ids,
    RecommendationDataset as TwoTowerDataset,
    TwoTowerModel,
    TwoTowerRatingWrapper,
)

__all__ = [
    # DLRM components
    "DLRMRatingWrapper",
    "DLRMDataset",
    "TorchRecDLRM",
    "create_kjt_from_batch",
    # Two-Tower components
    "TwoTowerModel",
    "TwoTowerRatingWrapper",
    "TwoTowerDataset",
    "create_kjt_from_ids",
]
125 changes: 125 additions & 0 deletions examples/prediction/dlrm/README.md
@@ -0,0 +1,125 @@
# DLRM Prediction Example

This example demonstrates how to use a Deep Learning Recommendation Model (DLRM) for making predictions using TorchRec capabilities. The code includes:

1. A DLRM implementation using TorchRec's EmbeddingBagCollection and KeyedJaggedTensor
2. Training with random data
3. Evaluation
4. Making sample predictions

## DLRM Architecture

The Deep Learning Recommendation Model (DLRM) is a widely used architecture for recommendation systems that combines:

- **Bottom MLP**: Processes dense features
- **Embedding Tables**: Process sparse features
- **Feature Interaction**: Computes pairwise dot products among the sparse-feature embeddings and the bottom MLP's output for the dense features
- **Top MLP**: Processes the combined features to produce predictions

This architecture is particularly effective for CTR prediction and ranking tasks in recommendation systems.
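The feature-interaction step can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not the code in `predict_using_dlrm.py`: the function name `dot_interaction` and the tensor shapes are hypothetical, and every embedding is assumed to have the same dimension as the bottom MLP output.

```python
import torch


def dot_interaction(dense: torch.Tensor, sparse: torch.Tensor) -> torch.Tensor:
    """Pairwise dot-product interaction in the style of DLRM (illustrative).

    dense:  (B, D)    output of the bottom MLP
    sparse: (B, F, D) one pooled embedding per sparse feature
    returns (B, D + (F + 1) * F // 2)
    """
    # Stack the dense vector with the F embeddings: (B, F + 1, D)
    combined = torch.cat([dense.unsqueeze(1), sparse], dim=1)
    # Dot product of every vector with every other vector: (B, F + 1, F + 1)
    interactions = torch.bmm(combined, combined.transpose(1, 2))
    # Keep each unordered pair once by taking the strictly lower triangle
    n = combined.size(1)
    rows, cols = torch.tril_indices(n, n, offset=-1)
    pairwise = interactions[:, rows, cols]
    # The top MLP would consume the dense vector plus the pairwise terms
    return torch.cat([dense, pairwise], dim=1)


batch_size, num_sparse, dim = 4, 3, 8
dense = torch.randn(batch_size, dim)
sparse = torch.randn(batch_size, num_sparse, dim)
out = dot_interaction(dense, sparse)
print(out.shape)  # torch.Size([4, 14])
```

With 3 sparse features plus the dense vector there are 4 vectors and therefore 6 unordered pairs, so the output width is 8 + 6 = 14.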

## TorchRec Integration

This implementation uses TorchRec's capabilities:
- Uses `KeyedJaggedTensor` for sparse features
- Uses `EmbeddingBagCollection` for embedding tables
- Follows the DLRM architecture as described in the paper: https://arxiv.org/abs/1906.00091

The example demonstrates how to leverage TorchRec's efficient sparse feature handling for recommendation models.

## Dependencies

Install the required dependencies:

```bash
# Install PyTorch
pip install torch torchvision

# Install NumPy
pip install numpy

# Install TorchRec
pip install torchrec
```

**Important**: This implementation requires torchrec to run, as it uses TorchRec's specialized modules for recommendation systems.

## Running the Example Locally

1. Download the `predict_using_dlrm.py` file to your local machine.

2. Run the example:

```bash
python3 predict_using_dlrm.py
```

3. If you're using a different Python environment:

```bash
# For conda environments
conda activate your_environment_name
python predict_using_dlrm.py

# For virtual environments
source your_venv/bin/activate
python predict_using_dlrm.py
```

## What to Expect

When you run the example, you'll see:

1. Training progress for 10 epochs with loss and learning rate information
2. Evaluation results showing MSE and RMSE metrics
3. Sample predictions for a specific user on multiple items
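For reference, the two evaluation metrics relate as follows. This is a framework-free sketch with made-up ratings; the helper names `mse` and `rmse` are hypothetical and the example script may compute the metrics differently (for instance with PyTorch loss functions).

```python
import math


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


def rmse(predictions, targets):
    """Root mean squared error, in the same units as the ratings."""
    return math.sqrt(mse(predictions, targets))


# Made-up ratings on a 0-5 scale, for illustration only.
predicted = [4.0, 3.0, 5.0, 1.0]
actual = [5.0, 3.0, 4.0, 3.0]

print(mse(predicted, actual))             # 1.5
print(round(rmse(predicted, actual), 4))  # 1.2247
```

RMSE is often easier to read than MSE because it is expressed in rating points: an RMSE near 1.2 means predictions are off by a bit more than one point in the root-mean-square sense.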

## Implementation Details

This example uses TorchRec's capabilities to implement a DLRM model that:

- Takes dense features and sparse features (as KeyedJaggedTensor) as input
- Processes dense features through a bottom MLP
- Processes sparse features through EmbeddingBagCollection
- Computes feature interactions using dot products
- Processes the interactions through a top MLP
- Outputs rating predictions on a 0-5 scale

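The implementation demonstrates how to use TorchRec's specialized modules for recommendation systems. These modules are designed for sparse, variable-length features and large embedding tables, so they are generally a better starting point for scaling than a hand-written embedding lookup.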
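One common way to constrain a model's raw output to a 0-5 rating range is to squash it with a sigmoid and rescale. The sketch below shows that idea only; `ScaledRatingHead` is a hypothetical name, and whether `DLRMRatingWrapper` uses this approach is an assumption, not something stated in this README.

```python
import torch
from torch import nn


class ScaledRatingHead(nn.Module):
    """Maps unbounded logits to ratings in the open interval (0, max_rating).

    Hypothetical helper for illustration; not part of TorchRec.
    """

    def __init__(self, max_rating: float = 5.0) -> None:
        super().__init__()
        self.max_rating = max_rating

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # sigmoid maps to (0, 1); scaling stretches that to (0, max_rating)
        return torch.sigmoid(logits) * self.max_rating


head = ScaledRatingHead()
logits = torch.tensor([-10.0, 0.0, 10.0])
ratings = head(logits)
print(ratings)  # values close to 0, exactly 2.5 at logit 0, and close to 5
```

An alternative is to train on unbounded outputs and clamp at inference time; the sigmoid variant keeps gradients defined everywhere but makes ratings of exactly 0 or 5 unreachable.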

## Key TorchRec Components Used

1. **KeyedJaggedTensor**: Efficiently represents sparse features with variable lengths
2. **EmbeddingBagConfig**: Configures embedding tables with parameters like dimensions and feature names
3. **EmbeddingBagCollection**: Manages multiple embedding tables for different categorical features
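To make "variable lengths" concrete, the sketch below shows the flat values-plus-lengths layout that jagged tensors are built on, using plain Python lists for a single feature. It illustrates the idea only and does not use TorchRec; the sample IDs are made up.

```python
from itertools import accumulate

# Hypothetical batch of 3 samples for one sparse feature.
# Sample 1 has no IDs at all, which a dense 2-D tensor could not
# represent without padding.
rows = [[10, 11], [], [12, 13, 14]]

# Jagged layout: one flat list of values plus the length of each row.
values = [v for row in rows for v in row]
lengths = [len(row) for row in rows]
# Offsets are the running sum of lengths and mark where each row starts.
offsets = [0, *accumulate(lengths)]


def row_at(index: int) -> list:
    """Recover one sample's IDs from the flat layout."""
    return values[offsets[index]:offsets[index + 1]]


print(values)     # [10, 11, 12, 13, 14]
print(lengths)    # [2, 0, 3]
print(offsets)    # [0, 2, 2, 5]
print(row_at(2))  # [12, 13, 14]
```

A `KeyedJaggedTensor` extends this layout to several named features at once by concatenating each feature's values and lengths in key order.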

## Comparison with Two-Tower Model

While both DLRM and Two-Tower models are used for recommendation systems, they have different architectures and use cases:

- **DLRM**: Combines multiple categorical features and dense features with feature interactions, suitable for CTR prediction and ranking tasks.
- **Two-Tower**: Separates user and item processing into distinct towers, making it more suitable for retrieval tasks and large-scale recommendation systems.

## Troubleshooting

If you encounter any issues:

1. **Python version**: This code has been tested with Python 3.8+. Make sure you're using a compatible version.

2. **PyTorch and TorchRec installation**: If you have issues with PyTorch or TorchRec, try installing specific versions:
```bash
pip install torch==2.0.0 torchvision==0.15.0
pip install torchrec==0.5.0
```

3. **Memory issues**: If you run out of memory, try reducing the batch size by modifying this line in the code:
```python
batch_size = 256 # Try a smaller value like 64 or 32
```

4. **CPU vs GPU**: The code automatically uses CUDA if available. To force CPU usage, set the device explicitly:
```python
device = torch.device("cpu")
```

5. **TorchRec compatibility**: If you encounter compatibility issues with TorchRec, make sure you're using compatible versions of PyTorch and TorchRec.
8 changes: 8 additions & 0 deletions examples/prediction/dlrm/__init__.py
@@ -0,0 +1,8 @@
#!/usr/bin/env python3
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

# pyre-strict