New Features 🌟
- Offline Llama CPP Integration: run LLMs locally using llama.cpp through a unified chatbot or flow interface.
- Multiple Model Support: switch between different GGUF models, such as TinyLlama and DeepSeek-R1 (see the sketch after this list).
- Enhanced Prompt Formatting: support for model-specific prompt formats.
- Log Control: added options to suppress verbose llama.cpp logs.
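
Switching models only requires pointing the configuration at a different GGUF file. A minimal sketch, assuming a locally downloaded DeepSeek-R1 GGUF (the file name and `n_ctx` value below are illustrative, not shipped defaults):

```python
from intelli.function.chatbot import Chatbot, ChatProvider

# Hypothetical local path; point this at whichever GGUF file you downloaded
deepseek_options = {
    "model_path": "./models/deepseek-r1-distill-qwen-1.5b.Q4_K_M.gguf",
    "model_params": {
        "n_ctx": 2048,     # larger context window for longer reasoning outputs
        "verbose": False
    }
}

# Same chatbot interface as the TinyLlama example below, only the options differ
deepseek_bot = Chatbot(provider=ChatProvider.LLAMACPP, options=deepseek_options)
```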
Using Llama CPP Chat Features 💻
```python
from intelli.function.chatbot import Chatbot, ChatProvider
from intelli.model.input.chatbot_input import ChatModelInput

# Configure a TinyLlama chatbot
options = {
    "model_path": "./models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf",
    "model_params": {
        "n_ctx": 512,
        "embedding": False,  # True if you need embeddings
        "verbose": False     # Suppress llama.cpp internal logs
    }
}
llama_bot = Chatbot(provider=ChatProvider.LLAMACPP, options=options)

# Prepare a chat input and get a response
chat_input = ChatModelInput("You are a helpful assistant.", model="llamacpp", max_tokens=64, temperature=0.7)
chat_input.add_user_message("What is the capital of France?")
response = llama_bot.chat(chat_input)
print(response)
```
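
The `embedding` flag corresponds to llama.cpp's embedding mode. If you need raw vectors rather than chat completions, the underlying llama-cpp-python library can be called directly; a minimal sketch (this shows the underlying library, not an intelli wrapper API, and reuses the TinyLlama GGUF path from above):

```python
from llama_cpp import Llama

# Load the model in embedding mode (mirrors the "embedding" option above)
embedder = Llama(
    model_path="./models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf",
    embedding=True,
    verbose=False,
)

# embed() returns the embedding vector for the given text
vector = embedder.embed("What is the capital of France?")
print(len(vector))  # dimensionality of the embedding
```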
For more details, check the llama.cpp documentation.