A multilingual RAG (Retrieval-Augmented Generation) chatbot system that provides encyclopedic knowledge across multiple languages using local LLM inference.
The Encyclopaedic Polyglot Machine (EPM) is designed to provide accurate, encyclopedic information across multiple languages without requiring an internet connection or API keys. It uses local LLM inference combined with specialized vector databases for each supported language, enabling users to get high-quality information in their preferred language.
Key capabilities include:
- Answering factual questions with proper citations
- Switching between languages seamlessly
- Maintaining conversation context across language switches
- Admin interface for managing documents and users
- Local operation with no data sent to external services
- Frontend: Flask templates with responsive design
- Backend: Python Flask server
- LLM: Llama models via Ollama for local inference
- Vector Database: ChromaDB for efficient document retrieval
- Embeddings: Multilingual MiniLM for cross-language semantic search
- Document Processing: PDF, DOCX, and TXT processing pipeline
- User Management: SQLite database for authentication and chat history
- Containerization: Docker for cross-platform deployment
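To make the retrieval layer concrete, here is a minimal sketch of cross-language search against a language-specific ChromaDB collection. It assumes a standard multilingual MiniLM checkpoint and a hypothetical per-language collection naming scheme; the repository's actual identifiers may differ:

```python
import chromadb
from sentence_transformers import SentenceTransformer

# Multilingual MiniLM embeds queries from different languages into one shared
# vector space, so a French question can match English source passages.
# (The exact checkpoint is an assumption, not the repository's pinned model.)
embedder = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)

client = chromadb.PersistentClient(path="chroma_db")
# Hypothetical naming: one collection per supported language
collection = client.get_or_create_collection("encyclopedia_fr")

query_vec = embedder.encode("Qui a conçu la tour Eiffel ?").tolist()
hits = collection.query(query_embeddings=[query_vec], n_results=4)
for doc in hits["documents"][0]:
    print(doc[:80])
```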
- Multilingual support with language-specific vector databases
- Local LLM inference using Ollama and Llama models
- Persistent chat history and user management
- Admin dashboard for system management and document uploads
- Vector search across encyclopedic knowledge sources
- Wikipedia integration for additional knowledge retrieval
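The README doesn't pin down how the Wikipedia integration is implemented; the sketch below assumes the `wikipedia` PyPI package and shows one way a language-aware fallback lookup could work:

```python
import wikipedia  # pip install wikipedia

def wiki_fallback(query: str, lang: str = "en", sentences: int = 3) -> str:
    """Hypothetical helper: fetch a short Wikipedia summary in the
    requested language when local vector search has no strong match."""
    wikipedia.set_lang(lang)
    try:
        return wikipedia.summary(query, sentences=sentences)
    except wikipedia.exceptions.DisambiguationError as exc:
        # Ambiguous query: fall back to the first suggested page
        return wikipedia.summary(exc.options[0], sentences=sentences)

print(wiki_fallback("Tour Eiffel", lang="fr"))
```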
You can run the EPM system using one of these two methods:
The fastest way to get started with minimal setup:
```bash
# Pull the image directly from Docker Hub
docker pull eishaenan/polyglot_app:latest

# Run the container
docker run -p 5001:5000 -p 11434:11434 \
  -v polyglot_data:/app/data \
  -v polyglot_chroma:/app/chroma_db \
  -v polyglot_ollama:/root/.ollama \
  eishaenan/polyglot_app:latest
```

This will:
- Pull the pre-built Docker image with all dependencies
- Start Ollama and download the Llama model if needed (~8GB download on first run)
- Initialize the authentication database with default admin user
- Build the vector database from the data sources
- Start the Flask web application
Note: The initial startup may take several minutes while the model downloads and the vector database is built. Docker performance may also be slower than a native installation, since LLM inference inside the container is CPU-only.
Access the Application:
- Web Interface: http://localhost:5001
- Default Admin Credentials:
  - Username: `admin`
  - Password: `admin123`
If you prefer to run without Docker:
- Clone the Repository

  ```bash
  git clone https://github.com/wsu-comp3018/final-system-pa2509.git
  cd final-system-pa2509
  ```
- Create and Activate a Virtual Environment

  ```bash
  python3 -m venv venv
  source venv/bin/activate   # Mac/Linux
  venv\Scripts\activate      # Windows (Command Prompt)
  ```
- Install Dependencies

  ```bash
  pip3 install -r requirements.txt
  ```
- Install Ollama (If Not Installed Yet)
  - Download from https://ollama.com
  - For Linux:

    ```bash
    curl -fsSL https://ollama.com/install.sh | sh
    ```
- Start Ollama in a Separate Terminal

  ```bash
  ollama serve
  ```
- Pull the Required Model

  ```bash
  ollama pull llama3.1:8b-instruct-q8_0
  ```

  Note: This will download approximately 8GB of data and may take some time depending on your internet connection.
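  Optionally, confirm that Ollama is serving and the model finished downloading by querying its local REST API (`/api/tags` lists the models that have been pulled); a quick check from Python:

  ```python
  import json
  import urllib.request

  # Ask the local Ollama server which models it has pulled
  with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
      models = [m["name"] for m in json.load(resp)["models"]]

  print("llama3.1:8b-instruct-q8_0" in models)  # should print True
  ```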
- Run the Flask Application

  ```bash
  python src/app.py
  ```
- Access the Application
  - Web Interface: http://localhost:5000
  - Default Admin Credentials:
    - Username: `admin`
    - Password: `admin123`
- Flask: Web framework for the application
- LangChain: Orchestrates the RAG pipeline
- ChromaDB: Vector database for document storage and retrieval
- Ollama: Local LLM inference using Llama models
- SQLite: Database for chat history and user management
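As a rough illustration of how these pieces compose (a sketch under assumed model names and paths, not the repository's exact code), a LangChain RAG chain over one language's ChromaDB store might look like this:

```python
from langchain.chains import RetrievalQA
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma

# Assumed embedding model and store path for illustration only
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)
store = Chroma(persist_directory="chroma_db/en", embedding_function=embeddings)
llm = Ollama(model="llama3.1:8b-instruct-q8_0")

qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=store.as_retriever(search_kwargs={"k": 4}),
    return_source_documents=True,  # keep retrieved passages for citations
)
result = qa.invoke({"query": "Who designed the Eiffel Tower?"})
print(result["result"])
```

Passing `return_source_documents=True` keeps the retrieved passages alongside the generated answer, which is how a chain like this can attach the citations mentioned above.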
The system is fully dockerized for easy deployment across platforms:
- Persistent Volumes:
  - `chat_db`: Stores the SQLite database for chat history and user data
  - `chroma_db`: Stores vector embeddings (3.5GB+)
  - `ollama_models`: Stores downloaded LLM models
- Ports:
  - `5001`: Web interface (Flask)
  - `11434`: Ollama API
- Cross-Platform Support:
  - Works on Linux, macOS, and Windows with Docker installed
  - All paths are relative and compatible across operating systems
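For reference, a minimal docker-compose.yml consistent with the ports and volumes above might look like the following; the service name and container paths are assumptions carried over from the docker run example earlier, not the repository's actual file:

```yaml
services:
  polyglot:
    image: eishaenan/polyglot_app:latest
    ports:
      - "5001:5000"      # Web interface (Flask)
      - "11434:11434"    # Ollama API
    volumes:
      - chat_db:/app/data            # SQLite chat history and user data
      - chroma_db:/app/chroma_db     # Vector embeddings
      - ollama_models:/root/.ollama  # Downloaded LLM models

volumes:
  chat_db:
  chroma_db:
  ollama_models:
```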
Common docker-compose commands:

```bash
docker-compose build     # Build the image
docker-compose up        # Start the stack in the foreground
docker-compose up -d     # Start in detached mode
docker-compose down      # Stop and remove the containers
docker-compose logs -f   # Follow the logs
```

If you're contributing to the project, follow these guidelines:
- Clone the repository and set up a local Python environment as described above
- Make sure you have Ollama installed locally for testing
- Create a feature branch for your changes
- Test thoroughly before committing
- Update documentation as needed
- Test your changes with Docker to ensure cross-platform compatibility
- Verify that volumes are properly persisted
- Include detailed description of changes
- Reference any related issues
- Port Conflicts: If port 5001 or 11434 is already in use, modify the port mapping in docker-compose.yml.
- Vector Database Building: The first run will take time to build the vector database. Subsequent runs will be faster.
- Model Download: The first run will download the Llama model (~8GB), which may take time depending on your internet connection.
This project is licensed under the MIT License - see the LICENSE file for details.
- The LangChain team for their excellent RAG framework
- The Ollama project for making local LLM inference accessible
- All contributors to the project
