1.3.1: Intel® AI for Enterprise RAG - patch release

@kkurzacz-intel released this 18 Jul 14:11
· 20 commits to main since this release
c08d914

Release Notes

Highlights:

  • Enhanced model support with six additional LLMs, including Meta-Llama-3.1, Qwen3, and Mistral variants
  • Upgraded vLLM version to 0.9.2
  • Expanded testing capabilities with PubMed dataset support and fixes for e2e performance tests

Detailed Changes

AI/Development

  • Added support for the following models:

    • hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
    • meta-llama/Llama-3.1-8B-Instruct
    • Qwen/Qwen3-14B-AWQ
    • Qwen/Qwen3-14B
    • solidrust/Mistral-7B-Instruct-v0.3-AWQ
    • mistralai/Mistral-7B-Instruct-v0.3
  • Upgraded vLLM version to 0.9.2

  • Updated default resources for the standard Redis and text-splitter microservices to avoid OOM errors

  • Added support for custom templates in resources-model-cpu.yaml

  • Added support for the PubMed dataset and fixed input token length in e2e performance tests

  • Added "Performance Tuning Guide" for Xeon deployment
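
The newly supported models listed above are served through vLLM, which exposes an OpenAI-compatible API. A minimal sketch of building a chat-completion request for one of them; the endpoint URL and helper name are hypothetical, and the actual service address depends on your deployment:

```python
import json

# Hypothetical endpoint of a deployed vLLM 0.9.2 instance exposing the
# standard OpenAI-compatible /v1/chat/completions route.
VLLM_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> str:
    """Build the JSON body for an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return json.dumps(payload)


# One of the models added in this release.
body = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "What is RAG?")
```

The resulting `body` can then be POSTed to the deployed endpoint with any HTTP client.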

Known Issues

  • For Qwen models, artifacts may occasionally appear in the response.
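
The release notes do not specify the form of these artifacts; one plausible source is stray reasoning tags (e.g. `<think>…</think>` blocks) that Qwen3 models can emit. A minimal post-processing sketch under that assumption; the function name is hypothetical:

```python
import re


def strip_think_blocks(text: str) -> str:
    # Remove <think>...</think> reasoning blocks sometimes emitted by Qwen3
    # models, along with any whitespace immediately following them.
    # Assumption: the observed artifacts take this tagged form.
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)


cleaned = strip_think_blocks("<think>reasoning steps...</think>The answer is 42.")
```

Text without such tags passes through unchanged, so the filter is safe to apply to all model outputs.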