diff --git a/Transformers/llama/README.md b/Transformers/llama/README.md
new file mode 100644
index 0000000..ba3ec07
--- /dev/null
+++ b/Transformers/llama/README.md
@@ -0,0 +1,16 @@
+# Llama
+
+[LLaMA](https://arxiv.org/abs/2302.13971) is a foundation language model from Meta.
+Its model weights are openly released.
+
+## Llama 3
+
+- 8B and 70B models are available, and a 400B model is coming soon (as of 2024-06-16)
+- The Llama 3 tokenizer uses a 128K-token vocabulary, up from 32K in Llama 2
+- The context window is 8192 tokens, versus 4096 for Llama 2 and 2048 for Llama 1
+- Uses grouped-query attention (GQA), which is more efficient than standard multi-head attention
+
+## References
+
+- [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
+- [Introducing Meta Llama 3: The most capable openly available LLM to date](https://ai.meta.com/blog/meta-llama-3/)
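
The grouped-query attention mentioned in the README can be sketched as follows. This is a minimal NumPy illustration, not Llama's actual implementation: the head counts, dimensions, and the function name are illustrative. The key idea is that many query heads share a smaller number of key/value heads, shrinking the KV cache relative to standard multi-head attention.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Grouped-query attention sketch (illustrative, not Llama's code).

    q: (seq, n_q_heads, d) -- query heads
    k, v: (seq, n_kv_heads, d) -- fewer shared key/value heads,
          where n_q_heads % n_kv_heads == 0.
    """
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_q_heads // n_kv_heads
    # Each K/V head serves a group of query heads: repeat along the head axis.
    k_rep = np.repeat(k, group, axis=1)  # (seq, n_q_heads, d)
    v_rep = np.repeat(v, group, axis=1)
    d = q.shape[-1]
    # Scaled dot-product attention, computed per head.
    scores = np.einsum("qhd,khd->hqk", q, k_rep) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("hqk,khd->qhd", weights, v_rep)

# Toy example: 8 query heads sharing 2 KV heads.
rng = np.random.default_rng(0)
seq, d = 4, 16
q = rng.standard_normal((seq, 8, d))
k = rng.standard_normal((seq, 2, d))
v = rng.standard_normal((seq, 2, d))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (4, 8, 16)
```

With `n_kv_heads == n_q_heads` this reduces to standard multi-head attention, and with `n_kv_heads == 1` to multi-query attention; GQA sits in between, which is where its efficiency claim comes from.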