Feature Request
The Nomic build of llama.cpp is outdated (see e.g. #3523, #3537, #3540) and due to be replaced. This FR is an alternative to the proposed switch to Ollama (#3542), which I reckon would be about as welcome as an outhouse breeze.
Nexa AI have their own build of llama.cpp, used in their Python SDK, that supports a diverse range of hardware. It may be preferable to migrate to this third-party build rather than persisting with the Ollama PR. The wider Nexa SDK package is likely to keep receiving regular updates, and even if some adaptation work is required, integrating the Nexa llama.cpp build would still be easier than maintaining Nomic's own fork (resource constraints no doubt being a key consideration behind #3542!).
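For context, text inference through the Nexa SDK looks roughly like the sketch below. The constructor parameters follow the nexa-sdk README at the time of writing; the `create_completion` call is an assumption on my part, based on the SDK wrapping a llama-cpp-python-style API, so treat the exact names as indicative rather than guaranteed.

```python
# Minimal sketch of text inference via the Nexa Python SDK (`pip install nexaai`).
# NOTE: constructor parameters follow the nexa-sdk README; the completion call
# below assumes a llama-cpp-python-style `create_completion` method, which may
# differ between SDK versions.
from nexa.gguf import NexaTextInference

inference = NexaTextInference(
    model_path="llama3.2",  # model identifier resolved/downloaded by the SDK
    local_path=None,        # or a path to an already-downloaded GGUF file
    temperature=0.7,
    max_new_tokens=256,
    top_k=50,
    top_p=0.9,
)

# Assumed llama-cpp-python-style streaming completion API.
for chunk in inference.create_completion(
    "Why does a maintained llama.cpp build matter?", stream=True
):
    print(chunk["choices"][0]["text"], end="", flush=True)
```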
The most important compatibility issue on the Nexa side is the current lack of Vulkan builds for Linux (nexa-sdk#380), though this is probably an oversight rather than a missing feature. There is also a separate branch for Qualcomm (arm64) NPUs here, although it may not yet be production-ready.
Note: I am not a representative of Nexa AI.