@renovate renovate bot commented Nov 1, 2025

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| localai/localai | minor | `v3.6.0-aio-cpu` -> `v3.7.0-aio-cpu` |

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

mudler/LocalAI (localai/localai)

v3.7.0

Compare Source

🚀 LocalAI 3.7.0




Welcome to LocalAI 3.7.0 👋

This release introduces Agentic MCP support with full WebUI integration, a brand-new neutts TTS backend, fuzzy model search, long-form TTS chunking for chatterbox, and a complete WebUI overhaul.

We’ve also fixed critical bugs, improved stability, and enhanced compatibility with OpenAI’s APIs.


📌 TL;DR – What’s New in LocalAI 3.7.0
| Feature | Summary |
|---|---|
| 🤖 Agentic MCP Support (WebUI-enabled) | Build AI agents that use real tools (web search, code execution). Fully OpenAI-compatible and integrated into the WebUI. |
| 🎙️ neutts TTS Backend (Neuphonic-powered) | Generate natural, high-quality speech with low-latency audio, ideal for voice assistants. |
| 🖼️ WebUI Enhancements | Faster, cleaner UI with real-time updates and full YAML model control. |
| 💬 Long-Text TTS Chunking (Chatterbox) | Generate natural-sounding long-form audio by intelligently splitting text and preserving context. |
| 🧩 Advanced Agent Controls | Fine-tune agent behavior with new options for retries, reasoning, and re-evaluation. |
| 📸 New Video Creation Endpoint | We now support the OpenAI-compatible /v1/videos endpoint for text-to-video generation. |
| 🐍 Enhanced Whisper Compatibility | whisper.cpp is now built for several CPU variants (AVX, AVX2, etc.) to prevent illegal-instruction crashes. |
| 🔍 Fuzzy Gallery Search | Find models in the gallery even with typos (e.g., gema finds gemma). |
| 📦 Easier Model & Backend Management | Import, edit, and delete models directly via clean YAML in the WebUI. |
| ▶️ Realtime Example | Check out the new realtime voice assistant example (multilingual). |
| ⚠️ Security, Stability & API Compliance | Fixed critical crashes, deadlocks, session events, OpenAI compliance issues, and JSON schema panics. |
| 🧠 Qwen 3 VL | Support for Qwen 3 VL with llama.cpp/gguf models. |
🔥 What’s New in Detail
🤖 Agentic MCP Support – Build Intelligent, Tool-Using AI Agents

We're proud to announce full Agentic MCP support: a feature for building AI agents that can reason, plan, and execute actions using external tools such as web search, code execution, and data retrieval. You use the standard chat/completions request format, but it is powered by an agent in the background.

Full documentation is available here

Now in WebUI: A dedicated toggle appears in the chat interface when a model supports MCP. Just click to enable agent mode.

✨ Key Features:
  • New Endpoint: POST /mcp/v1/chat/completions (OpenAI-compatible).
  • Flexible Tool Configuration:

```yaml
mcp:
  stdio: |
    {
      "mcpServers": {
        "duckduckgo": {
          "command": "docker",
          "args": ["run", "-i", "--rm", "ghcr.io/mudler/mcps/duckduckgo:master"]
        }
      }
    }
```
  • Advanced Agent Control via agent config:

```yaml
agent:
  max_attempts: 3
  max_iterations: 5
  enable_reasoning: true
  enable_re_evaluation: true
```
    • max_attempts: Retry failed tool calls up to N times.
    • max_iterations: Limit how many times the agent can loop through reasoning.
    • enable_reasoning: Allow step-by-step thought processes (e.g., chain-of-thought).
    • enable_re_evaluation: Re-analyze decisions when tool results are ambiguous.

You can find some plug-and-play MCPs here: https://github.com/mudler/MCPs
Under the hood, MCP functionality is powered by https://github.com/mudler/cogito
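As a sketch of what a request looks like (assuming a local instance on port 8080; the model name `my-agent` is a placeholder for any model with an `mcp:` section configured), the agentic endpoint accepts the same payload as chat completions:

```shell
# Illustrative request; "my-agent" stands for any model with mcp settings enabled
curl http://localhost:8080/mcp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-agent",
    "messages": [
      {"role": "user", "content": "Search the web for the latest LocalAI release"}
    ]
  }'
```

Because the endpoint is OpenAI-compatible, existing OpenAI client libraries should work by pointing their base URL at `/mcp/v1`.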

🖼️ WebUI enhancements

WebUI had a major overhaul:

  • The chat view now has an MCP toggle for models that have mcp settings enabled in the model config file.
  • The model editor has been simplified to show and edit the model's YAML settings directly.
  • More reactive: dropped HTMX in favor of Alpine.js and vanilla JavaScript.
  • Various fixes, including model deletion.
🎙️ Introducing neutts TTS Backend – Natural Speech, Low Latency

Say hello to neutts, a new, lightweight TTS backend powered by Neuphonic, delivering high-quality, natural-sounding speech with minimal overhead.

🎛️ Setup Example
```yaml
name: neutts-english
backend: neutts
parameters:
  model: neuphonic/neutts-air
tts:
  audio_path: "./output.wav"
  streaming: true
options:
  # text transcription of the provided audio file
  - ref_text: "So I'm live on radio..."
known_usecases:
  - tts
```
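With the model above configured, speech can then be requested over LocalAI's OpenAI-compatible audio API (a sketch; the output filename is arbitrary and the instance is assumed to run locally on port 8080):

```shell
# Assumes a running LocalAI instance with the neutts-english model available
curl http://localhost:8080/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "neutts-english", "input": "Hello from LocalAI"}' \
  --output hello.wav
```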
🐍 Whisper.cpp enhancements

whisper.cpp CPU variants are now available for:

  • avx
  • avx2
  • avx512
  • fallback (no optimized instructions available)

These variants are optimized for specific instruction sets and reduce crashes on older or non-AVX CPUs.
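To check which variant matches your hardware, you can inspect the CPU flags (a Linux-only sketch; on other platforms, or when no flag is listed, use the fallback variant):

```shell
# Prints the highest SIMD level this CPU advertises (Linux);
# empty output means none of the optimized variants apply
grep -o -w -E 'avx512f|avx2|avx' /proc/cpuinfo | sort -u | tail -n 1
```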

🔍 Smarter Gallery Search: Fuzzy & Case-Insensitive Matching

Searching for gemma now finds gemma-3, gemma2, etc. — even with typos like gemaa or gema.
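The behavior can be approximated with a few lines of Python's difflib (an illustrative sketch only, not the gallery's actual implementation; the 0.6 similarity threshold is an assumption):

```python
from difflib import SequenceMatcher

def fuzzy_find(query, names, threshold=0.6):
    """Return gallery names matching query, case-insensitively and typo-tolerantly."""
    q = query.lower()
    scored = []
    for name in names:
        # Keep exact substring hits, plus near-misses above the similarity threshold
        ratio = SequenceMatcher(None, q, name.lower()).ratio()
        if q in name.lower() or ratio >= threshold:
            scored.append((ratio, name))
    # Best matches first
    return [n for _, n in sorted(scored, reverse=True)]

gallery = ["gemma-3", "gemma2", "qwen3-vl-4b-instruct", "neutts-air"]
print(fuzzy_find("gema", gallery))  # the typo still matches the gemma family
```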

🧩 Improved Tool & Schema Handling – No More Crashes

We’ve fixed multiple edge cases that caused crashes or silent failures in tool usage.

✅ Fixes:
  • Nullable JSON Schemas: "type": ["string", "null"] now works without panics.
  • Empty Parameters: Tools with missing or empty parameters now handled gracefully.
  • Strict Mode Enforcement: When strict_mode: true, the model must pick a tool — no more skipping.
  • Multi-Type Arrays: Safe handling of ["string", "null"] in function definitions.

🔄 Interaction with Grammar Triggers: strict_mode and grammar rules work together — if a tool is required and the function definition is invalid, the server returns a clear JSON error instead of crashing.
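For example, a tool definition like the following (an illustrative fragment; the function name and fields are made up) previously triggered a panic on the multi-type array and is now accepted:

```json
{
  "type": "function",
  "function": {
    "name": "lookup_user",
    "description": "Look up a user by email, which may be unknown",
    "parameters": {
      "type": "object",
      "properties": {
        "email": { "type": ["string", "null"] }
      },
      "required": ["email"]
    }
  }
}
```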

📸 New Video Creation Endpoint: OpenAI-Compatible

LocalAI now supports OpenAI’s /v1/videos endpoint for generating videos from text prompts.

📌 Usage Example:
```shell
curl http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-..." \
  -d '{
    "model": "sora",
    "prompt": "A cat walking through a forest at sunset",
    "size": "1024x576"
  }'
```
🧠 Qwen 3 VL in llama.cpp

Support has been added for Qwen 3 VL in llama.cpp, and we have updated llama.cpp to the latest version. As a reminder, Qwen 3 VL and other multimodal models are also compatible with our vLLM and MLX backends. Qwen 3 VL models are already available in the model gallery:

  • qwen3-vl-30b-a3b-instruct
  • qwen3-vl-30b-a3b-thinking
  • qwen3-vl-4b-instruct
  • qwen3-vl-32b-instruct
  • qwen3-vl-4b-thinking
  • qwen3-vl-2b-thinking
  • qwen3-vl-2b-instruct

Note: upgrading the llama.cpp backend is necessary if you already have a LocalAI installation.
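On an existing installation, the upgrade and install can be sketched with the CLI (the exact subcommands may vary by install method and version; the WebUI gallery works as well):

```shell
# Upgrade the llama.cpp backend first, then pull a gallery model by name
local-ai backends install llama-cpp
local-ai models install qwen3-vl-4b-instruct
```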

🚀 (CI) Gallery Updater Agent: Auto-Detect & Suggest New Models

We’ve added an autonomous CI agent that scans Hugging Face daily for new models and opens PRs to update the gallery.

✨ How It Works:
  1. Scans Hugging Face for new, trending models.
  2. Extracts base model, quantization, and metadata.
  3. Uses cogito (our agentic framework) to assign the model to the correct family and to gather the model information.
  4. Opens a PR with:
    • Suggested name, family, and usecases
    • Link to HF model
    • YAML snippet for import
🔧 Critical Bug Fixes & Stability Improvements
| Issue | Fix | Impact |
|---|---|---|
| 📌 WebUI crash on model load | Fixed the `can't evaluate field Name in type string` error | Models now render even without config files |
| 🔁 Deadlock in model load/idle checks | Guarded against race conditions during model loading | Improved performance under load |
| 📞 Realtime API compliance | Added the session.created event; removed the redundant conversation.created | Works with VoxInput, OpenAI clients, and more |
| 📥 MCP response formatting | Output wrapped in a message field | Matches the OpenAI spec for better client compatibility |
| 🛑 JSON error responses | Now return clean JSON instead of HTML | Scripts and libraries no longer break on auth failures |
| 🔄 Session registration | Fixed initial MCP calls failing due to cache issues | Reliable first-time use |
| 🎧 kokoro TTS | Returns full audio, not partial | Better for long-form TTS |
🚀 The Complete Local Stack for Privacy-First AI

LocalAI

The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI


LocalAGI

A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI


LocalRecall

A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall


❤️ Thank You!

A huge THANK YOU to our growing community! With over 35,000 stars, LocalAI is a true FOSS movement — built by people, for people, with no corporate backing.

If you love privacy-first AI and open source, please:

  • Star the repo
  • 💬 Contribute code, docs, or feedback
  • 📣 Share with others

Your support keeps this stack alive and evolving!


✅ Full Changelog

Full Changelog: mudler/LocalAI@v3.6.0...v3.7.0


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Never, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.


f2c-ci-robot bot commented Nov 1, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


f2c-ci-robot bot commented Nov 1, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wanghe-fit2cloud wanghe-fit2cloud merged commit 4bdbf93 into dev Nov 2, 2025
1 check was pending
@wanghe-fit2cloud wanghe-fit2cloud deleted the renovate/localai-localai-3.x branch November 2, 2025 12:35
