A micro GPT implementation and training pipeline in PyTorch.
```python
from microgpt.model import (
    load_model,
    PretrainedModelConfig,
)

model = await load_model(
    config=PretrainedModelConfig(),
)
generated_text = model.generate_text(
    text="Hi, I'm a language model,",
    max_new_tokens=50,
)
```
The pretrained model can be found in the `pretrained` directory. It was trained in 2 stages:
- Stage 1: Training using large amounts of mostly web based data
- Stage 2: Training using 3 runs of smaller amounts of high quality data and combining/souping the model weights
- 8x H200 SXM GPUs (80GB) on runpod.io
  - Time taken: ~4 hours
  - Hourly cost: $32 per hour
  - Total cost: ~$128
- 1 c8g.4xlarge instance on AWS
  - Time taken: ~16 hours
  - Hourly cost: $0.43184 per hour
  - Total cost: ~$6.91
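The stage-2 combining/souping step boils down to uniformly averaging the weights of the independently trained runs. A minimal sketch of that idea using plain PyTorch state dicts (the helper name and the three `nn.Linear` stand-in "runs" are illustrative, not this project's actual API):

```python
import torch
import torch.nn as nn

def soup_state_dicts(state_dicts):
    # Model souping: uniformly average each parameter across runs.
    # All state dicts must come from the same architecture.
    souped = {}
    for key in state_dicts[0]:
        souped[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return souped

# Three "runs" of the same architecture with different weights.
runs = [nn.Linear(4, 2) for _ in range(3)]

souped_model = nn.Linear(4, 2)
souped_model.load_state_dict(soup_state_dicts([m.state_dict() for m in runs]))
```

Averaging works here because the runs share an architecture and a common stage-1 starting point, so their weights stay close enough to interpolate.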
- Tokenizer
  - Loading pretrained gpt tokenizers
  - Training custom byte-pair encoding tokenizers
  - Loading custom byte-pair encoding tokenizers from files
- Micro GPT model implementation
  - Loading pretrained gpt models
  - Training custom gpt models with support for DDP
  - Training checkpoints
  - Loading custom gpt models from files
  - Training using text, files, urls or huggingface datasets
  - RoPE implementation
  - Reproducing GPT-2 with a custom tokenizer and model
  - HellaSwag eval
  - 2-stage model training, combining/souping the model weights from 3 runs of smaller amounts of high-quality data in the second stage
  - Supervised finetuning
  - Reinforcement learning
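RoPE encodes position by rotating pairs of query/key features through position-dependent angles, so attention scores depend only on relative offsets. A minimal, self-contained sketch of rotary embeddings (using the half-split pairing convention; independent of this project's actual module):

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    # x: (seq_len, head_dim) with even head_dim.
    # Pairs feature i with feature i + head_dim//2 and rotates each pair
    # by an angle that grows with position and shrinks with frequency index.
    seq_len, head_dim = x.shape
    half = head_dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)        # (half,)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs     # (seq_len, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

In attention this is applied to both queries and keys before their dot product; because each application is a pure rotation, it preserves vector norms and leaves position 0 unchanged.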
- Create a virtual environment:

  ```shell
  uv venv --python 3.12
  source .venv/bin/activate
  ```

- Install dependencies:

  ```shell
  make sync
  ```
- Go through the notebooks to understand how to use the library.
This project is licensed under the MIT License. See the LICENSE file for details.