
Build a Large Language Model From Scratch

Overview

This repository contains a custom implementation of a Large Language Model (LLM) based on the GPT-2 architecture. The project demonstrates the process of building a transformer-based model from scratch, loading pre-trained weights, and generating text using causal language modeling techniques. It is heavily inspired by Sebastian Raschka's book "Build a Large Language Model (From Scratch)" and Vizuara's YouTube playlist, which provide comprehensive guidance on understanding and implementing LLMs.
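For orientation, below is a minimal sketch of the causal self-attention block at the core of the GPT-2 architecture, written in PyTorch; the class and parameter names here are illustrative assumptions, not this repository's actual modules.

```python
# A minimal sketch of GPT-2-style causal self-attention (assumes PyTorch).
# Names and defaults are illustrative, not this repository's actual API.
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, emb_dim, num_heads, context_len, dropout=0.1):
        super().__init__()
        assert emb_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = emb_dim // num_heads
        self.qkv = nn.Linear(emb_dim, 3 * emb_dim)  # joint query/key/value projection
        self.proj = nn.Linear(emb_dim, emb_dim)     # output projection
        self.dropout = nn.Dropout(dropout)
        # Upper-triangular mask blocks attention to future tokens (causal LM).
        mask = torch.triu(torch.ones(context_len, context_len), diagonal=1).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, tokens, head_dim) for multi-head attention.
        q = q.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / self.head_dim**0.5
        att = att.masked_fill(self.mask[:t, :t], float("-inf"))
        att = self.dropout(torch.softmax(att, dim=-1))
        out = (att @ v).transpose(1, 2).contiguous().view(b, t, d)
        return self.proj(out)
```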

Key Features

Full implementation of the GPT-2 architecture.

Pre-trained weight loading for text generation tasks.

Customizable sampling parameters (e.g., temperature and top-k) for coherent text generation; see the sketch after this list.

Modular design for easy experimentation and extension.
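To illustrate the sampling features above, here is a hedged sketch of autoregressive generation with temperature scaling and top-k filtering; `model`, `token_ids`, and the function signature are assumptions for illustration, not the repository's actual interface.

```python
# A sketch of temperature and top-k sampling for causal text generation.
# `model` and the signature below are assumed placeholders, not this
# repository's actual API.
import torch

@torch.no_grad()
def generate(model, token_ids, max_new_tokens, context_len,
             temperature=1.0, top_k=50):
    for _ in range(max_new_tokens):
        # Condition only on the most recent context_len tokens.
        logits = model(token_ids[:, -context_len:])[:, -1, :]
        logits = logits / temperature
        if top_k is not None:
            # Mask out every logit below the k-th largest one.
            kth = torch.topk(logits, top_k).values[:, -1, None]
            logits = logits.masked_fill(logits < kth, float("-inf"))
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        token_ids = torch.cat([token_ids, next_id], dim=1)
    return token_ids
```

Lower temperatures sharpen the distribution toward greedy decoding, while smaller top-k values restrict sampling to the most likely tokens; both trade diversity for coherence.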

Resources

This project draws inspiration from the following resources:

Book: Build a Large Language Model (From Scratch) by Sebastian Raschka

The book provides step-by-step guidance on creating LLMs, including coding attention mechanisms, pretraining, fine-tuning for classification, and instruction fine-tuning.

Learn more about the book here: https://www.manning.com/books/build-a-large-language-model-from-scratch

YouTube Playlist: Vizuara

A detailed video series explaining LLM concepts and implementation strategies.

Check out their playlist here: https://www.youtube.com/playlist?list=PLPTV0NXA_ZSgsLAr8YCgCwhPIJNNtexWu

Acknowledgments

This project is inspired by the following:

Sebastian Raschka's book, which offers an in-depth exploration of LLM development.

Vizuara's YouTube Playlist, which provides practical insights into building LLMs step-by-step.

Special thanks to these resources for making complex concepts accessible to learners and developers!

Feel free to explore, contribute, or use this repository as a foundation for your own projects!
