[Refactor] Migrate to Nexa AI llama.cpp build

### Feature Request

The Nomic build of llama.cpp is outdated (e.g. #3523, #3537, #3540) and due to be replaced. This FR is an alternative to the proposed switch to Ollama (#3542), which I reckon would be about as welcome as an outhouse breeze.

Nexa AI have their own build of [llama.cpp](https://github.com/NexaAI/llama.cpp) that is used in their [Python SDK](https://github.com/NexaAI/nexa-sdk) and supports a [diverse range](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-2-python-package) of hardware types. It may be preferable to migrate to this third-party build rather than persisting with the Ollama PR. The larger Nexa SDK package is likely to continue being updated regularly. Even if some adaptation work is required, integrating the Nexa llama.cpp build would still be easier than continuing to maintain Nomic's own fork, and resource constraints were no doubt a key consideration behind #3542.

The most important compatibility issue on the Nexa side, as far as GPT4All is concerned, is the current lack of Vulkan builds for Linux ([nexa-sdk#380](https://github.com/NexaAI/nexa-sdk/issues/380)). However, this is probably an oversight rather than a deliberate omission. There is also a separate branch for Qualcomm (arm64) NPUs [here](https://github.com/Davidqian123/llama.cpp-qnn), although it may not yet be production-ready.

_Note: I am not a representative of Nexa AI_

cc @Davidqian123

[Refactor] Migrate to Nexa AI llama.cpp build #3547