[Refactor] Migrate to Nexa AI llama.cpp build #3547

@iwr-redmond

Feature Request

The Nomic build of llama.cpp is outdated (e.g. #3523, #3537, #3540) and due to be replaced. This FR is an alternative to the proposed switch to Ollama (#3542), which I reckon would be about as welcome as an outhouse breeze.

Nexa AI have their own build of llama.cpp, which is used in their Python SDK and supports a diverse range of hardware types. It may be preferable to migrate to this third-party build rather than persisting with the Ollama PR. The larger Nexa SDK package is likely to continue being updated regularly, and even if some adaptation work is required, integrating the Nexa llama.cpp build would still be easier than continuing to maintain Nomic's own fork (resource constraints no doubt being a key consideration behind #3542!).

The most important compatibility issue on the Nexa side, as far as GPT4All is concerned, is the present lack of Vulkan builds for Linux (nexa-sdk#380). However, this is probably an oversight rather than a deliberate omission. There is also a separate branch for Qualcomm (arm64) NPUs, although it may not yet be production ready.

Note: I am not a representative of Nexa AI.

cc @Davidqian123

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
