Stuck at output #66
Comments
@fahdmirza I will look into this.
Thank you. I am doing a review of this for my channel https://www.youtube.com/@fahdmirza , as it looks so promising.
Hello @fahdmirza, for this, did you install llama-cpp-python with CUDA? We have just tested it on an A100 on HF Spaces and it works fine, so I don't understand your problem.
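For reference, a GPU-enabled build of llama-cpp-python is usually obtained by passing the CUDA CMake flag at install time; a minimal sketch, assuming a recent release (the flag name has changed across versions, with newer releases using GGML_CUDA and older ones LLAMA_CUBLAS):

```shell
# Reinstall llama-cpp-python compiled with CUDA support
# (use -DLLAMA_CUBLAS=on instead on older versions)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```

Without a CUDA-enabled build, n_gpu_layers is silently ignored and everything runs on the CPU.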
In the code you shared here, you are using a Mistral model with the CHATML message format; I think you have to use MISTRAL for the chat template.
Nevertheless, check this: we have everything set up with some examples in HF Spaces, in case you want to make a review for your YouTube channel hehe https://huggingface.co/poscye
This would work for you; always use the correct chat template for the model:

```python
from llama_cpp import Llama
from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.providers import LlamaCppPythonProvider

llama_model = Llama(r"mistral-7b-instruct-v0.2.Q5_K_S.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)
provider = LlamaCppPythonProvider(llama_model)
agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.MISTRAL)
agent_output = agent.get_chat_response("Hello, World!")
```
Hi,
On Ubuntu 22.04, it just gets stuck generating output for hours:

```python
# Import the Llama class of llama-cpp-python and the LlamaCppPythonProvider of llama-cpp-agent
from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider

# Create an instance of the Llama class and load the model
llama_model = Llama(r"mistral-7b-instruct-v0.2.Q5_K_S.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)

# Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class
provider = LlamaCppPythonProvider(llama_model)

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType

agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.CHATML)
agent_output = agent.get_chat_response("Hello, World!")
```

It gets stuck here.

I have an NVIDIA A6000 GPU and plenty of memory. I have also tried installing llama.cpp from source, but the issue is the same. Any ideas?
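One quick check when generation appears to hang is whether the installed llama-cpp-python wheel was actually built with GPU offload: a CPU-only wheel ignores n_gpu_layers and runs every layer on the CPU, which can look like a hang with a 7B model. A diagnostic sketch, assuming a recent llama-cpp-python that exposes llama_supports_gpu_offload:

```shell
# Prints True only if the installed llama-cpp-python build supports GPU offload
python -c "from llama_cpp import llama_supports_gpu_offload; print(llama_supports_gpu_offload())"
```

Passing verbose=True to Llama(...) also prints the model load log, where an "offloaded ... layers to GPU" line confirms the layers actually landed on the A6000.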