1.3.1: Intel® AI for Enterprise RAG - patch release
Release Notes
Highlights:
- Enhanced model support with six additional LLMs including Meta-Llama-3.1, Qwen3, and Mistral variants
- Upgraded vLLM version to 0.9.2
- Expanded testing capabilities with PubMed dataset support and fixes for e2e performance tests
Detailed Changes
AI/Development
- Added support for the following models:
- hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
- meta-llama/Llama-3.1-8B-Instruct
- Qwen/Qwen3-14B-AWQ
- Qwen/Qwen3-14B
- solidrust/Mistral-7B-Instruct-v0.3-AWQ
- mistralai/Mistral-7B-Instruct-v0.3
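Each new model above ships in both a full-precision and an AWQ-INT4 variant. A rough back-of-envelope estimate shows why the quantized variants matter on memory-constrained hosts (illustrative only: it counts weight storage for an 8B-parameter model at 2 bytes/weight for FP16 vs. roughly 0.5 bytes/weight for INT4, and ignores KV cache, activations, and quantization overhead):

```python
# Rough weight-memory estimate for an 8B-parameter model.
# Real footprints also include KV cache, activations, and AWQ scale/zero-point
# overhead, so treat these numbers as lower bounds.
PARAMS = 8_000_000_000
GIB = 2**30

fp16_gib = PARAMS * 2 / GIB    # 2 bytes per weight (FP16)
int4_gib = PARAMS * 0.5 / GIB  # ~0.5 bytes per weight (AWQ INT4)

print(f"FP16 weights:     ~{fp16_gib:.1f} GiB")  # ~14.9 GiB
print(f"AWQ-INT4 weights: ~{int4_gib:.1f} GiB")  # ~3.7 GiB
```

The ~4x reduction in weight memory is the usual reason to pick the AWQ variant when vRAM or host memory is tight.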
- Upgraded vLLM version to 0.9.2
- Updated default resource limits for the standard Redis and text-splitter microservices to avoid OOM errors
- Added support for custom templates in resources-model-cpu.yaml
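These release notes do not show the schema of resources-model-cpu.yaml, so the fragment below is only a hypothetical sketch of what a per-model resource override might look like; the `models`/`name` key names are assumptions, and only the nested `resources` block follows the standard Kubernetes requests/limits convention:

```yaml
# Hypothetical sketch only -- top-level key names are assumptions,
# not the actual schema of resources-model-cpu.yaml.
models:
  - name: Qwen/Qwen3-14B-AWQ
    resources:          # standard Kubernetes-style requests/limits
      requests:
        cpu: "16"
        memory: 64Gi
      limits:
        cpu: "32"
        memory: 96Gi
```

Consult the file's own comments or the project documentation for the supported template keys.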
- Added support for the PubMed dataset and fixed input token length in e2e performance tests
- Added a "Performance Tuning Guide" for Xeon deployment
Known issues
- For Qwen models, artifacts may occasionally appear in the response.