This week's paper is QLoRA: Efficient Finetuning of Quantized LLMs. QLoRA introduces a way to save memory in LoRA training by quantizing the frozen base model to 4-bit while backpropagating gradients into the trainable LoRA adapters. The authors also introduce the Guanaco family of models and run several analyses of fine-tuning data.
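To make the recipe concrete, here is a minimal sketch of a QLoRA-style setup using the Hugging Face transformers, peft, and bitsandbytes libraries. The model name, LoRA rank, and target modules below are illustrative assumptions, not the paper's exact configuration:

```python
# Minimal QLoRA-style sketch: 4-bit NF4 quantized base model + LoRA adapters.
# Model name and hyperparameters are illustrative, not the paper's exact setup.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantize the frozen base model to 4-bit NF4, with double quantization of the
# quantization constants and bfloat16 compute, following the QLoRA recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative model choice
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach trainable LoRA adapters; only these low-rank matrices receive
# gradients, so optimizer state stays small. r, alpha, and target modules
# are assumptions for this sketch.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirms only adapter weights train
```

From here the quantized model with adapters can be passed to any standard training loop; only the adapter parameters carry gradients, while the base weights stay frozen in 4-bit.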
Further Reading:
- LoRA: Low-Rank Adaptation of Large Language Models - finetuning with adapters, if you missed it (arxiv.org/abs/2106.09685)
- Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning - another memory-efficient LoRA-style technique, which removes the need to store activations
- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models - a better initialization for quantized LoRA finetuning
- Accurate LoRA-Finetuning Quantization of LLMs via Information Retention - improves the accuracy of quantized LLMs fine-tuned with LoRA by preserving information from the original weights