This repository has been archived by the owner on Jun 12, 2024. It is now read-only.

Hugging Face TextGen Inference

Text Generation Inference is a Rust, Python and gRPC server for serving text generation models.

This allows you to run Hugging Face Hub models and other LLMs on your own infrastructure.

Set up

  • Set up the Text Generation Inference server.
  • Download the aifile and load it in ownAI: click the logo in the upper left corner to open the menu, then select "AI Workshop", "New AI" and "Load Aifile".
  • Set the inference_server_url setting in the aifile to the URL of your server.
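Once the server is running, you can verify that the URL you intend to use as inference_server_url is reachable. The following is a minimal sketch, assuming a Text Generation Inference server is listening at the placeholder URL below and exposes TGI's standard /generate endpoint; the URL and parameter values are examples, not part of the aifile itself:

```python
import json
from urllib import request

# Placeholder: use the same URL you set as inference_server_url in the aifile.
INFERENCE_SERVER_URL = "http://localhost:8080"


def build_payload(prompt, max_new_tokens=64):
    """Build the JSON body expected by TGI's /generate endpoint."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }


def generate(prompt, url=INFERENCE_SERVER_URL):
    """POST the prompt to the inference server and return the generated text."""
    req = request.Request(
        f"{url}/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["generated_text"]
```

If `generate("Hello")` returns text, the server is up and the same URL will work in the aifile.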

Privacy

These AIs run on your own machine or on the server where you installed the inference server.