1.3.0: Intel® AI for Enterprise RAG
Getting Started
To deploy your Chat Q&A RAG application, please follow the instructions.
Highlights:
- Retriever RBAC support: Document filtering based on the user's access privileges to the underlying S3 storage, enhancing security and data access control.
- Enhanced text extraction: Improved extraction for PDF, DOC, DOCX, and images including better hyperlink, table, and image text processing.
- Microservice architecture improvements: Split Dataprep into separate TextExtractor and TextSplitter services and added a new TextCompression microservice that cleans and compresses document text.
- Advanced retrieval algorithms: Added similarity_search_with_siblings algorithm to improve response accuracy by including adjacent chunks.
- Improved Redis implementation: Migrated Redis to a standalone namespace, deployed via Helm chart with support for both single-node and cluster setups for improved performance.
- Backup/restore functionality: Added Velero-based backup and restore capabilities for Keycloak, EDP, and vector store database.
- UI Accessibility: Enhanced accessibility with React ARIA components and added syntax highlighting for code snippets.
Detailed changes
AI/Development
- Added Retriever RBAC support: document filtering based on the user's access privileges to the underlying S3 storage.
- Enhanced text extraction for PDF, DOC, DOCX, and images - improved hyperlink extraction, table text extraction, and image text extraction.
- Migrated text extraction from custom loader classes to Markitdown for ADOC, TXT, JSON, JSONL, CSV, XLSX, XLS, HTML, MD, XML, and YAML file formats.
- Introduced MarkdownSplitter for ADOC, MD, and HTML files to split text by sections and add this information to metadata.
- Added filename/URL and Section information to prompt template, improving responses to questions about document names.
- Split Dataprep microservice into separate TextExtractor and TextSplitter services.
- Introduced TextCompression microservice between TextExtractor and TextSplitter to clean and compress document text. More details here.
- Added similarity_search_with_siblings algorithm to retriever, configurable in Admin Panel, which improves response accuracy by including adjacent chunks.
- Enabled semantic chunking in Ansible and debug feature, with fixes for large files.
- Introduced Hierarchical Indexing for PDF files as an experimental feature, configurable via config.yaml. Learn more here.
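The idea behind similarity_search_with_siblings, returning each matching chunk together with its adjacent chunks, can be sketched roughly as follows. This is a minimal illustration and not the retriever's actual implementation: the `Chunk` fields, the `search_with_siblings` function, the `window` parameter, and the dot-product scoring are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str  # hypothetical: identifier of the source document
    index: int   # hypothetical: position of the chunk within its document
    text: str


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def search_with_siblings(query_vec, chunks, vectors, k=2, window=1):
    """Return the top-k chunks by similarity plus their neighbouring chunks.

    Illustrative only: a real vector store would perform the ranking step.
    """
    # Rank chunk positions by similarity to the query, best first.
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: dot(query_vec, vectors[i]),
        reverse=True,
    )
    hits = [chunks[i] for i in ranked[:k]]

    # Index chunks by (document, position) so siblings can be looked up.
    by_position = {(c.doc_id, c.index): c for c in chunks}

    # Collect each hit and up to `window` chunks on either side of it.
    selected = {}
    for hit in hits:
        for offset in range(-window, window + 1):
            sibling = by_position.get((hit.doc_id, hit.index + offset))
            if sibling is not None:
                selected[(sibling.doc_id, sibling.index)] = sibling

    # Return in document order so adjacent chunks read contiguously.
    return [selected[key] for key in sorted(selected)]
```

With `window=1`, a hit on chunk 2 of a document also returns chunks 1 and 3, so the LLM sees the surrounding context rather than an isolated fragment. Overlapping neighbourhoods are deduplicated through the `selected` dictionary.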
User Interface
- Improved accessibility by refactoring UI components with React ARIA.
- Added syntax highlighting for code snippets in Chat.
- Implemented automatic scaling of ChatQnA pipeline graph size in Admin Panel - Control Plane.
Deployment
- Migrated Redis vector database from ChatQnA pipeline to standalone namespace.
- Deployed Redis via Helm chart - supporting both single node Redis and Redis-cluster for improved performance.
- Implemented balloons policy as an alternative method of pinning vLLM resources.
- Created backup/restore functionality using Velero for Keycloak, EDP, and vector store database. Installation steps and the update and restore procedures are described in the documentation.
- Added support for deployment under user-defined domain names.
- Created Ansible scripts for simplified Kubernetes deployment.
- Added Ansible scripts for deploying Gaudi via operator.
Security
- Removed non-functional scanners from guardrails.
- Enabled remaining input guardrails in UI.
- Fixed and enhanced guardrails end-to-end tests.
- Enabled fingerprint capability for dataprep guardrail.
- Upgraded LLM Guard package to version 3.16.
Known issues
- When Redis is used as the vector database, the default resource settings are not optimized, so Redis starts with a configuration that is unsuitable for production environments or intensive testing. To address this, remove the existing resource and persistence node configurations from here and replace them with the following settings:
  redis:
    (...)
    master:
      persistence:
        enabled: true
        size: "10Gi"
      resources:
        requests:
          cpu: 2
          memory: 4Gi
        limits:
          cpu: 16
          memory: 16Gi
    replica:
      persistence:
        enabled: true
        size: "10Gi"
      resources:
        requests:
          cpu: 2
          memory: 4Gi
        limits:
          cpu: 16
          memory: 16Gi

  Note: The resource configuration for redis-cluster is not affected and is correctly set up by default.