This repository is for the fastai Modern LLM paper reading study group. Here you can find the papers we have covered, along with any extra resources.
We are working our way through the seminal LLM papers, starting with the GPT-3 paper, Language Models are Few-Shot Learners.
The plan is to read our way through all the modern LLM methods mentioned by Andrej Karpathy in his talk The State of GPT, along with any new developments since then.
The study group is coordinated through the fastai discord in the #cluster-of-stars text channel and currently meets weekly on Fridays at 2300 UTC (7pm Eastern) in the #fastai-study-groups voice channel.
Each paper has its own README with a direct link, a summary, further reading (for most papers), and supporting materials in that section's references folder.
- Language Models are Few-Shot Learners
- Finetuned Language Models Are Zero-Shot Learners
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Training language models to follow instructions with human feedback
- LoRA: Low-Rank Adaptation of Large Language Models
- Evaluating Large Language Models Trained on Code
- Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
- Atlas: Few-shot Learning with Retrieval Augmented Language Models
- In-Context Retrieval-Augmented Language Models
- ReAct: Synergizing Reasoning and Acting in Language Models
- Toolformer: Language Models Can Teach Themselves to Use Tools
- SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking
- Chain of Papers: Multiple Chain of Thought Papers
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
- TeacherLM: Teaching to Fish Rather Than Giving the Fish: Language Modeling Likewise
- The Pile: An 800GB Dataset of Diverse Text for Language Modeling
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
- LLaMA: Open and Efficient Foundation Language Models
- D4: Improving LLM Pretraining via Document De-Duplication and Diversification
- DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
- Training Data for the Price of a Sandwich: Common Crawl's Impact on Generative AI
- How to Train Data-Efficient LLMs
- Training Language Models to Follow Instructions with Human Feedback
- Constitutional AI: Harmlessness from AI Feedback
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- KTO: Model Alignment as Prospect Theoretic Optimization
- ORPO: Monolithic Preference Optimization without Reference Model
- RewardBench: Evaluating Reward Models for Language Modeling