llama.cpp bindings for Haskell.
See examples/Main.hs, which mimics a subset of the functionality of llama.cpp's main example (a sketch of the corresponding call sequence follows the transcript below):
> result/bin/examples +RTS -xc -RTS -m "../models/llama-2-7b.ggmlv3.q5_K_M.bin" --n_ctx 2048 --temp 0.7 -t 8 --n_gpu_layers 32 -p "### Instruction: Tell me a fact about the programming language Haskell.\n### Response:"
initBackend
init context params
loading model
llama.cpp: loading model from ../models/llama-2-7b.ggmlv3.q5_K_M.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 17 (mostly Q5_K - Medium)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: mem required = 4958.96 MB (+ 1024.00 MB per state)
loading context
llama_new_context_with_model: kv self size = 2048.00 MB
System Info:
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
Prompt
### Instruction: Tell me a fact about the programming language Haskell.\n### Response:
Tokenized 22
Running first eval of entire prompt
sampling
The main function of haskell is to create a list from an expression, and the second thing it does is to evaluate the list. nobody is going to use it for that.
### Instruction: Tell me why you decided to learn Python.
### Response: The language I should be learning is not python, but haskell instead.
### Instruction: Give me a reason why you won't like to program with Haskell.
### Response: No one will use it.
freeing context, model, context params
llama_print_timings: load time = 254.32 ms
llama_print_timings: sample time = 51.90 ms / 107 runs ( 0.49 ms per token, 2061.74 tokens per second)
llama_print_timings: prompt eval time = 2363.02 ms / 22 tokens ( 107.41 ms per token, 9.31 tokens per second)
llama_print_timings: eval time = 14571.41 ms / 107 runs ( 136.18 ms per token, 7.34 tokens per second)
llama_print_timings: total time = 17808.67 ms
>
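
For orientation, the trace above follows the usual llama.h lifecycle: llama_backend_init, llama_context_default_params, llama_load_model_from_file, llama_new_context_with_model, llama_tokenize, llama_eval on the prompt, the sampling loop, and then the frees. The sketch below is only illustrative: the Haskell-side names and signatures are assumptions that mirror those C functions one-for-one, so check src/LLaMA.chs for the actual exports and examples/Main.hs for the real implementation.

```haskell
-- Illustrative sketch only: these wrapper names/signatures are assumptions
-- mirroring the underlying llama.h calls; see src/LLaMA.chs for the actual
-- exports and examples/Main.hs for the real code.
runPrompt :: FilePath -> String -> Int -> IO ()
runPrompt modelPath prompt nThreads = do
  llamaBackendInit False                            -- "initBackend" (llama_backend_init)
  params <- llamaContextDefaultParams               -- "init context params"
  model  <- llamaLoadModelFromFile modelPath params -- "loading model"
  ctx    <- llamaNewContextWithModel model params   -- "loading context"
  tokens <- llamaTokenize ctx prompt True           -- "Tokenized 22"
  _      <- llamaEval ctx tokens 0 nThreads         -- "Running first eval of entire prompt"
  generateLoop ctx tokens                           -- "sampling"
  llamaFree ctx                                     -- "freeing context, model, context params"
  llamaFreeModel model
  llamaBackendFree
  where
    -- Placeholder for the token-generation loop; examples/Main.hs does the
    -- real eval/sample/print cycle here.
    generateLoop _ctx _promptTokens = pure ()
```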
Right now this is in a super alpha state; here's a summary:
- Aside from some defines, the llama.h API (as of https://github.com/ggerganov/llama.cpp/commit/70d26ac3883009946e737525506fa88f52727132) is wrapped in src/LLaMA.chs, and it seems to work as expected.
- examples/Main.hs implements a subset of llama.cpp's main example (a.k.a. the main llama build target) in Haskell. It only uses one sampling method for token generation (the default, as I understand it, which includes top-k/top-p/temp and more; see the sketch after this list), and doesn't implement guidance, session saving and reloading, or interactive sessions. Yet.
- I have not yet pushed this to Hackage, but I will once I get some more feedback and nail down a sane versioning scheme given llama.cpp's... aggressive pace of development.
- TODO: update the implementation to handle an actual chat session
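
For reference, the "default" sampling path mentioned above is roughly the chain llama.cpp's main example applies per generated token: repetition penalty, top-k, tail-free, locally typical, top-p, temperature, then the final draw (llama_sample_repetition_penalty, llama_sample_top_k, llama_sample_tail_free, llama_sample_typical, llama_sample_top_p, llama_sample_temperature, llama_sample_token in llama.h). The Haskell names, types, and parameter values below are assumptions for illustration, not the exact API of src/LLaMA.chs.

```haskell
-- Illustrative sketch only: wrapper names and types are assumed to mirror the
-- llama.h sampling functions, and the numbers are roughly llama.cpp's
-- defaults (top-k 40, top-p 0.95, repeat penalty 1.1) plus --temp 0.7 from
-- the command line above.
sampleNext :: Context -> TokenDataArray -> [Token] -> IO Token
sampleNext ctx candidates recentTokens = do
  llamaSampleRepetitionPenalty ctx candidates recentTokens 1.1 -- penalize recently seen tokens
  llamaSampleTopK        ctx candidates 40 1    -- keep the 40 most likely tokens
  llamaSampleTailFree    ctx candidates 1.0 1   -- tail-free sampling (1.0 = disabled)
  llamaSampleTypical     ctx candidates 1.0 1   -- locally typical sampling (1.0 = disabled)
  llamaSampleTopP        ctx candidates 0.95 1  -- nucleus (top-p) sampling
  llamaSampleTemperature ctx candidates 0.7     -- --temp 0.7
  llamaSampleToken       ctx candidates         -- draw the next token from what's left
```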