I have a suggestion: I just found this project, [LLMFarm](https://github.com/guinmoon/LLMFarm). FYI, GGUF models work with it, but I am not sure whether GGML models do.