
Stuck at output #66

Open
fahdmirza opened this issue May 30, 2024 · 6 comments
@fahdmirza

Hi,
On Ubuntu 22.04, it just gets stuck generating output for hours:

Import the Llama class from llama-cpp-python and the LlamaCppPythonProvider from llama-cpp-agent

from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider

Create an instance of the Llama class and load the model

llama_model = Llama(r"mistral-7b-instruct-v0.2.Q5_K_S.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)

Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class

provider = LlamaCppPythonProvider(llama_model)

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType

agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.CHATML)

agent_output = agent.get_chat_response("Hello, World!")

It gets stuck here.

I have an NVIDIA A6000 GPU and plenty of memory. I have also tried installing llama.cpp from source, but it's still the same issue. Any ideas?
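
A rough sketch of a check that might help narrow this down (it uses llama-cpp-python's standard verbose flag and is not code from the original report): construct the model with verbose logging and confirm in the log that the expected number of layers is actually offloaded to the GPU.

from llama_cpp import Llama

# Verbose logging makes llama.cpp print, among other things, how many
# layers were offloaded to the GPU when the model is loaded.
llama_model = Llama(
    r"mistral-7b-instruct-v0.2.Q5_K_S.gguf",
    n_batch=1024,
    n_threads=10,
    n_gpu_layers=40,
    verbose=True,
)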

@Maximilian-Winter
Owner

@fahdmirza I will look into this.

@fahdmirza
Author

> @fahdmirza I will look into this.

Thank you. I am doing a review of this for my channel https://www.youtube.com/@fahdmirza, as it looks very promising.

@pabl-o-ce
Collaborator

Hello @fahdmirza, did you install llama-cpp-python with CUDA support for this? We have now tested on an A100 on HF Spaces and it works nicely; I don't understand your problem.
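
For a CUDA-enabled build, llama-cpp-python is typically reinstalled with the CUDA CMake flag; the exact flag depends on the version (newer releases use GGML_CUDA, older ones used LLAMA_CUBLAS), so treat this as a sketch rather than a command verified in this thread:

CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir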

@pabl-o-ce
Collaborator

In the code you shared here, you are using a Mistral model with the CHATML message format; I think you have to use MISTRAL for the chat template.
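
As a quick way to see which chat templates are available (assuming MessagesFormatterType behaves as a plain Python Enum), you can list its members:

from llama_cpp_agent import MessagesFormatterType

# Print every predefined message formatter, e.g. MISTRAL and CHATML
for formatter in MessagesFormatterType:
    print(formatter.name)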

@pabl-o-ce
Collaborator

pabl-o-ce commented Jul 13, 2024

Nevertheless, check this out: we have set up some examples in HF Spaces, in case you want to make a review for your YouTube channel hehe https://huggingface.co/poscye

@pabl-o-ce
Collaborator

pabl-o-ce commented Jul 13, 2024

This should work for you; always use the correct chat template for the model:

from llama_cpp import Llama
from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.providers import LlamaCppPythonProvider

llama_model = Llama(r"mistral-7b-instruct-v0.2.Q5_K_S.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)
provider = LlamaCppPythonProvider(llama_model)

# Use the MISTRAL formatter so the prompt matches the chat template the model was trained on
agent = LlamaCppAgent(provider, system_prompt="You are a helpful assistant.", predefined_messages_formatter_type=MessagesFormatterType.MISTRAL)

agent_output = agent.get_chat_response("Hello, World!")
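
If it still hangs with the MISTRAL formatter, a rough way to tell whether the stall is in the model itself or in the agent layer is to bypass the agent and call the Llama object directly through llama-cpp-python's completion API (a sketch, not something tested in this thread):

# If this direct call also hangs, the problem is in the llama.cpp / CUDA
# setup rather than in llama-cpp-agent.
output = llama_model("[INST] Hello, World! [/INST]", max_tokens=64)
print(output["choices"][0]["text"])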
