TheBloke's Dockerfiles

TheBlokeAI/dockerLLM


TheBloke's Docker templates

Update: 16th December 2023 - Rebuild to add Mixtral support

  • Should now support Mixtral, with updated AutoGPTQ 0.6 and llama-cpp-python 0.2.23
  • Updated PyTorch to 2.1.1

Update: 11th October 2023 - Update API command line option

  • Container will now launch text-generation-webui with arg --extensions openai
  • Logs from text-generation-webui will now appear in the Runpod log viewer, as well as /workspace/logs/text-generation-webui.log
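
The dual logging described above is the classic `tee` pattern. The sketch below is a hypothetical illustration, not the container's actual start script: the function of `tee` is to send output both to stdout (where the Runpod log viewer picks it up) and to the persistent log file. The launch command is shown as a comment and demonstrated with a placeholder, and the log directory default here is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch of the logging pattern described above (not the
# container's real start script). Output reaches both stdout and a file.
LOGDIR="${LOGDIR:-/tmp/logs}"   # the container itself uses /workspace/logs
mkdir -p "$LOGDIR"
# The real launch would look something like:
#   python3 server.py --listen --extensions openai 2>&1 \
#     | tee -a "$LOGDIR/text-generation-webui.log"
# Demonstrated here with a placeholder command:
echo "text-generation-webui starting" 2>&1 | tee -a "$LOGDIR/text-generation-webui.log"
```

Using `tee -a` (append) rather than plain redirection means restarts add to the existing log instead of truncating it.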

Update: 8th October 2023 - CUDA 12.1.1, fixed ExLlamav2 issues

  • The instances now use CUDA 12.1.1, which fixes issues with EXL2
  • Note that for now the main container is still called cuda11.8.0-ubuntu22.04-oneclick
  • This is because I need to get in touch with Runpod to update the name of the container used in their instances
  • This is just a naming issue; the container does now use CUDA 12.1.1 and EXL2 is confirmed to work again.

Update: 23rd July 2023 - Llama 2 support, including Llama 2 70B in ExLlama

  • Llama 2 models, including Llama 2 70B, are now fully supported
  • Updated to latest text-generation-webui requirements.txt
  • Removed the exllama pip package installed by text-generation-webui
    • Therefore the ExLlama kernel will build automatically on first use
    • This ensures that ExLlama is always up-to-date with any new ExLlama commits (which are pulled automatically on each boot)
  • Added simple build script for building the Docker containers
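
As a rough idea of what such a build script can look like, here is a minimal sketch. The image name matches the one mentioned later in this README; the registry prefix, tag handling, and Dockerfile location are assumptions, and the command is printed rather than executed for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a simple container build script (not the
# repository's actual script). Registry prefix and layout are assumed.
IMAGE="thebloke/cuda11.8.0-ubuntu22.04-oneclick"
TAG="${TAG:-latest}"
build_cmd() {
  # Assemble the docker build command for the given image and tag.
  printf 'docker build -t %s:%s .\n' "$IMAGE" "$TAG"
}
# Print the command; a real script would execute it (and then docker push).
build_cmd
```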

Update: 28th June 2023 - SuperHOT fixed

  • Updated to latest ExLlama code, fixing issue with SuperHOT GPTQs
  • ExLlama now automatically updates on boot, as text-generation-webui already did
    • This should result in the template automatically supporting new ExLlama features in future
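
The boot-time auto-update described above can be sketched roughly as follows. The paths, function name, and error handling here are assumptions, not the container's real script; the point is the pattern of fast-forward-pulling each checkout before launch.

```shell
#!/bin/sh
# Hypothetical sketch of the boot-time auto-update described above: pull
# the latest commits for each tracked repository before launching the UI.
update_repo() {
  if [ -d "$1/.git" ]; then
    # Fast-forward only, so a locally modified checkout is never clobbered.
    git -C "$1" pull --ff-only || echo "warning: could not update $1" >&2
  else
    echo "skipping $1 (not a git checkout)" >&2
  fi
}

update_repo /workspace/text-generation-webui
update_repo /workspace/exllama
```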

Update: 19th June 2023

  • Major update to the template
  • text-generation-webui is now integrated with:
    • AutoGPTQ with support for all Runpod GPU types
    • ExLlama, turbo-charged Llama GPTQ engine - performs 2x faster than AutoGPTQ (Llama 4bit GPTQs only)
    • CUDA-accelerated GGML, supported on all Runpod systems and GPUs.
  • All text-generation-webui extensions are included and supported (Chat, SuperBooga, Whisper, etc).
  • text-generation-webui is always up-to-date with the latest code and features.
  • Automatic model download and loading via environment variable MODEL.
  • Pass text-generation-webui parameters via environment variable UI_ARGS.
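
The MODEL and UI_ARGS variables above might be consumed along these lines. This is a hypothetical sketch, not the template's actual start script: the variable names come from this README, but the function, default flags, and model-name handling are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of how a start script might turn the MODEL and
# UI_ARGS environment variables into text-generation-webui arguments.
build_args() {
  args="--listen"
  if [ -n "$MODEL" ]; then
    # After the automatic download step, ask the webui to load the model.
    args="$args --model $(basename "$MODEL")"
  fi
  if [ -n "$UI_ARGS" ]; then
    # Extra user-supplied parameters are appended verbatim.
    args="$args $UI_ARGS"
  fi
  printf '%s\n' "$args"
}

MODEL="TheBloke/Llama-2-7B-GPTQ"
UI_ARGS="--loader exllama"
build_args
```

Because UI_ARGS is appended verbatim, any flag text-generation-webui understands can be passed through without changes to the container.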

Runpod: TheBloke's Local LLMs UI

Runpod template link

Full documentation is available here

Runpod: TheBloke's Local LLMs UI & API

Runpod template link

Full documentation is available here