Source code for SIGIR 2023 paper: Prompt Learning for News Recommendation
The 12 directories correspond to 12 prompt templates: three types of templates (Discrete, Continuous, Hybrid) from four perspectives (Relevance, Emotion, Action, Utility):
- Discrete-Relevance, Discrete-Emotion, Discrete-Action, Discrete-Utility
- Continuous-Relevance, Continuous-Emotion, Continuous-Action, Continuous-Utility
- Hybrid-Relevance, Hybrid-Emotion, Hybrid-Action, Hybrid-Utility
The experiments are based on the public dataset MIND; we use its small version, MIND-Small.
For our paper, we preprocessed the original dataset and stored it as binary files via "pickle". Although the files use the ".txt" extension, they are binary pickle files and can be loaded directly with the pickle package (see the loading sketch after the download link below). They include:
- train.txt: training set
- val.txt: validation set
- test.txt: testing set
- news.txt: information of all news articles
We have shared the preprocessed dataset on Google Drive:
https://drive.google.com/drive/folders/1_3ffZvEPKD5deHbTU_mVGp6uEaLhyM7c?usp=sharing
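A minimal loading sketch (the variable names train_set and news_info are illustrative; the path assumes the files are placed under ../DATA/MIND-Small as in run.sh below):

import pickle

# The files are binary pickle dumps despite the ".txt" extension.
with open("../DATA/MIND-Small/train.txt", "rb") as f:
    train_set = pickle.load(f)
with open("../DATA/MIND-Small/news.txt", "rb") as f:
    news_info = pickle.load(f)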
In each directory, there is a script called run.sh that runs the code for the corresponding template. Taking the "Discrete-Relevance" template as an example, its run.sh is as follows:
python main-multigpu.py --data_path ../DATA/MIND-Small --epochs 4 --batch_size 16 --test_batch_size 100 --wd 1e-3 --max_tokens 500 --log True --model_save True
python predict.py --data_path ../DATA/MIND-Small --test_batch_size 100 --max_tokens 500 --model_file ./temp/BestModel.pt --log True
- The first line trains the model on the training set and evaluates it on the validation set at each epoch; the model with the best validation performance is saved (see the sketch after this list).
- The second line evaluates the saved "best" model on the testing set to obtain the final performance.
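The best-model selection in the first step follows the usual validation-based checkpointing pattern, roughly as sketched below (illustrative only; train_one_epoch and evaluate stand in for the actual training and evaluation routines, and the checkpoint path matches the --model_file argument above):

import torch

def train_and_select_best(model, train_one_epoch, evaluate, epochs, ckpt_path="./temp/BestModel.pt"):
    # Keep only the checkpoint with the best validation score.
    best_score = float("-inf")
    for epoch in range(epochs):
        train_one_epoch(model)      # run one training epoch
        score = evaluate(model)     # e.g. AUC on the validation set
        if score > best_score:
            best_score = score
            torch.save(model.state_dict(), ckpt_path)
    return best_score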
We implement the source code with the Distributed Data Parallel (DDP) module provided by PyTorch, so our code is a multi-GPU version. You are welcome to modify it to obtain a single-GPU version.
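For reference, the usual difference between the two settings is only the process-group setup and the DDP wrapping (an illustrative sketch of the general PyTorch pattern, not our exact code):

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def prepare_model(model, multi_gpu: bool, local_rank: int = 0):
    if multi_gpu:
        # Multi-GPU: one process per GPU, launched e.g. with torchrun.
        dist.init_process_group(backend="nccl")
        torch.cuda.set_device(local_rank)
        return DDP(model.cuda(local_rank), device_ids=[local_rank])
    # Single-GPU: no process group, no DDP wrapper, no DistributedSampler needed.
    return model.cuda()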
Requirements:
- python==3.7
- pytorch==1.13.0
- cuda==11.6
- transformers==4.27.0
If you use this code, please cite our paper:
@inproceedings{zhang2023prompt,
author = {Zhang, Zizhuo and Wang, Bang},
title = {Prompt Learning for News Recommendation},
year = {2023},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages = {227–237},
numpages = {11},
location = {Taipei, Taiwan},
series = {SIGIR '23}
}