This repository contains the code and resources for the project titled "LLM Prompt Recovery," an ECE 5424 final project. The project aims to develop methodologies for recovering the original prompts used in text rewriting tasks performed by large language models (LLMs).
- Introduction
- Dataset Compilation
- Methodology
- Experimental Setup and Evaluation
- Results and Discussion
- Conclusion
- Appendix
The project explores the relationship between input prompts and the generated outputs in LLMs, focusing on text rewriting tasks. The primary objectives are:
- Develop techniques to recover original prompts.
- Gain insights into LLMs' interpretation of prompts.
- Compare different prompt recovery approaches.
The original texts were drawn from the summary attribute of the Wikipedia Movie Plots dataset, which offers a diverse range of genres and writing styles.
Prompts were generated using the Claude 3 Opus model across ten defined categories, including content modification, cultural adaptation, and emotion and sentiment changes.
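A minimal sketch of this prompt-generation step, assuming the Anthropic Python SDK is called directly, is shown below; the model identifier, category list, and instruction wording are illustrative assumptions rather than the project's exact script.

```python
# Hedged sketch: generating rewrite prompts with Claude 3 Opus via the Anthropic SDK.
# The model ID, categories, and instruction text are illustrative assumptions.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

CATEGORIES = ["content modification", "cultural adaptation", "emotion and sentiment change"]

def generate_rewrite_prompt(category: str) -> str:
    """Ask Claude 3 Opus for a single rewrite instruction in the given category."""
    message = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=128,
        messages=[{
            "role": "user",
            "content": (
                "Write one short instruction asking a model to rewrite a movie plot "
                f"summary. The instruction should perform a {category}."
            ),
        }],
    )
    return message.content[0].text.strip()

if __name__ == "__main__":
    print(generate_rewrite_prompt(CATEGORIES[0]))
```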
The rewritten texts were generated using the Gemma 7B-it model. Each original text was paired with multiple prompts to introduce variability.
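The rewriting step could be reproduced along the following lines with the Hugging Face transformers library; the chat formatting and generation settings are assumptions, not the project's exact configuration.

```python
# Hedged sketch: producing rewritten texts with Gemma 7B-it via Hugging Face transformers.
# Generation hyperparameters here are illustrative, not the project's exact settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it", torch_dtype=torch.bfloat16, device_map="auto"
)

def rewrite(original_text: str, prompt: str) -> str:
    """Apply a rewrite prompt to an original plot summary and return the rewritten text."""
    chat = [{"role": "user", "content": f"{prompt}\n\n{original_text}"}]
    inputs = tokenizer.apply_chat_template(
        chat, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
    # Keep only the newly generated tokens (the model's answer).
    return tokenizer.decode(outputs[0, inputs.shape[-1]:], skip_special_tokens=True)
```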
The dataset comprises 6,000 samples, split 80/20 into a training set (4,800 samples) and a test set (1,200 samples).
Key preprocessing steps include text cleaning, tokenization, sequence length handling, prompt formatting, and train-test splitting.
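A condensed sketch of this pipeline is given below, assuming the dataset is stored as a CSV with original_text, rewritten_text, and rewrite_prompt columns; the file name, column names, and tokenizer choice are assumptions.

```python
# Hedged sketch of the preprocessing steps: cleaning, prompt formatting,
# tokenization with length handling, and the 80/20 train-test split.
import re
import pandas as pd
from sklearn.model_selection import train_test_split
from transformers import AutoTokenizer

def clean(text: str) -> str:
    """Basic text cleaning: collapse whitespace and strip surrounding spaces."""
    return re.sub(r"\s+", " ", str(text)).strip()

df = pd.read_csv("prompt_recovery_dataset.csv")  # hypothetical file name
for col in ("original_text", "rewritten_text", "rewrite_prompt"):
    df[col] = df[col].map(clean)

# Format the model input as an "original + rewritten" pair; the prompt is the target.
df["input_text"] = (
    "Original text: " + df["original_text"] + "\nRewritten text: " + df["rewritten_text"]
)

# Tokenize with truncation so sequences fit the model's context window.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")
df["input_ids"] = df["input_text"].map(
    lambda t: tokenizer(t, truncation=True, max_length=1024)["input_ids"]
)

# 80/20 train-test split (4,800 / 1,200 samples).
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
```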
These models use a combination of autoregressive and bidirectional language modeling.
This model has been trained on a massive corpus of text data and shows strong language understanding capabilities.
These models have been trained on extensive text data and are evaluated for their performance in zero-shot, few-shot, and fine-tuning settings.
These transformer-based models are fine-tuned for the prompt recovery task.
In this setting, models are evaluated without additional training. Beam search decoding is used for prompt generation.
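A sketch of zero-shot prompt recovery with beam search decoding via the transformers generate API is shown below; the instruction template and beam settings are assumptions.

```python
# Hedged sketch: zero-shot prompt recovery with beam search decoding.
# The query template and decoding parameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf"  # one of the evaluated models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

def recover_prompt_zero_shot(original: str, rewritten: str) -> str:
    """Ask the model to infer the rewrite prompt without any in-context examples."""
    query = (
        "Below is an original text and its rewritten version.\n"
        f"Original: {original}\nRewritten: {rewritten}\n"
        "What prompt was most likely used to produce the rewrite?"
    )
    inputs = tokenizer(query, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
        num_beams=4,          # beam search decoding
        early_stopping=True,
        do_sample=False,
    )
    return tokenizer.decode(
        outputs[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```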
Models are provided with a small number of in-context training examples to adapt to the task before generating prompts.
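One way to construct such a few-shot query is to prepend a handful of solved examples from the training set, as in the sketch below; the example format and number of shots are assumptions, and generation reuses the model from the zero-shot sketch above.

```python
# Hedged sketch: building a few-shot query by prepending k solved examples.
def build_few_shot_query(examples, original: str, rewritten: str) -> str:
    """examples: list of (original, rewritten, prompt) triples drawn from the training set."""
    shots = "\n\n".join(
        f"Original: {o}\nRewritten: {r}\nPrompt: {p}" for o, r, p in examples
    )
    return f"{shots}\n\nOriginal: {original}\nRewritten: {rewritten}\nPrompt:"
```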
Models are fine-tuned on the training dataset to improve prompt recovery performance.
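A compact sketch of this step with the Hugging Face Trainer is shown below, reusing the train_df split from the preprocessing sketch; the hyperparameters and input formatting are illustrative assumptions rather than the project's exact setup.

```python
# Hedged sketch: fine-tuning a causal LM on (input_text, rewrite_prompt) pairs.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "google/gemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def to_features(example):
    """Concatenate the task input and target prompt into one training sequence."""
    text = example["input_text"] + "\nPrompt: " + example["rewrite_prompt"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=1024)

train_ds = Dataset.from_pandas(
    train_df[["input_text", "rewrite_prompt"]].reset_index(drop=True)
)
train_ds = train_ds.map(to_features, remove_columns=train_ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="prompt-recovery-ft",
        num_train_epochs=3,
        per_device_train_batch_size=2,
        learning_rate=2e-5,
    ),
    train_dataset=train_ds,
    # Causal LM collator copies input_ids into labels (mlm=False).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```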
Performance is assessed using cosine similarity and ROUGE scores between the recovered and original prompts.
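The two metrics can be computed roughly as follows; the specific embedding model (all-MiniLM-L6-v2) and ROUGE variant (ROUGE-L) used below are assumptions.

```python
# Hedged sketch of the evaluation metrics: embedding cosine similarity and ROUGE-L F1.
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def evaluate_prompt(predicted: str, reference: str) -> dict:
    """Score a recovered prompt against the original prompt."""
    emb = embedder.encode([predicted, reference], convert_to_tensor=True)
    cosine = util.cos_sim(emb[0], emb[1]).item()
    rouge_l = scorer.score(reference, predicted)["rougeL"].fmeasure
    return {"cosine_similarity": cosine, "rougeL_f1": rouge_l}

print(evaluate_prompt("Rewrite this plot as a comedy.", "Turn the plot summary into a comedy."))
```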
LLaMA2 7B showed the highest performance in the zero-shot setting.
All models improved with few-shot examples, with LLaMA2 7B leading the performance.
Fine-tuning yielded the highest performance, with Gemma 7B achieving the best results.
Performance trends show improvements with task-specific training. Fine-tuning provides the best results but requires more resources.
The project successfully demonstrates the feasibility of prompt recovery using LLMs. Fine-tuning transformer models yields the best performance, contributing to more interpretable and controllable language models.
Examples of zero-shot and few-shot learning prompts used for model predictions.
Examples of qualitative results for zero-shot, few-shot, and fine-tuning settings.
Refer to the report for a detailed list of references.
For more details, please refer to the project report included in this repository.