
Encyclopaedic Polyglot Machine (EPM)

A multilingual RAG (Retrieval-Augmented Generation) chatbot system that provides encyclopedic knowledge across multiple languages using local LLM inference.

EPM Screenshot

About the Project

The Encyclopaedic Polyglot Machine (EPM) is designed to provide accurate, encyclopedic information across multiple languages without requiring an internet connection or API keys. It uses local LLM inference combined with specialized vector databases for each supported language, enabling users to get high-quality information in their preferred language.

Key capabilities include:

  • Answering factual questions with proper citations
  • Switching between languages seamlessly
  • Maintaining conversation context across language switches
  • Admin interface for managing documents and users
  • Local operation with no data sent to external services

Tech Stack

  • Frontend: Flask templates with responsive design
  • Backend: Python Flask server
  • LLM: Llama models via Ollama for local inference
  • Vector Database: ChromaDB for efficient document retrieval
  • Embeddings: Multilingual MiniLM for cross-language semantic search (see the sketch after this list)
  • Document Processing: PDF, DOCX, and TXT processing pipeline
  • User Management: SQLite database for authentication and chat history
  • Containerization: Docker for cross-platform deployment
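
To make the cross-language search concrete, here is a minimal sketch of how a multilingual MiniLM model scores an English query against a French passage. It assumes the sentence-transformers package and the paraphrase-multilingual-MiniLM-L12-v2 checkpoint; the exact model EPM loads may differ.

# Cross-lingual similarity sketch (assumes sentence-transformers is installed;
# the exact MiniLM checkpoint EPM uses may differ)
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
query = "What is the capital of France?"         # English query
passage = "Paris est la capitale de la France."  # French passage
q_emb, p_emb = model.encode([query, passage], convert_to_tensor=True)
print(float(util.cos_sim(q_emb, p_emb)))         # high similarity despite different languages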

Features

  • Multilingual support with language-specific vector databases
  • Local LLM inference using Ollama and Llama models
  • Persistent chat history and user management
  • Admin dashboard for system management and document uploads
  • Vector search across encyclopedic knowledge sources
  • Wikipedia integration for additional knowledge retrieval (see the sketch below)
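
As an illustration of what the Wikipedia lookup might look like, the sketch below uses the third-party wikipedia package with a per-language setting; EPM's actual integration may use a different client.

# Hypothetical Wikipedia lookup (assumes the third-party `wikipedia` package;
# EPM's actual integration may differ)
import wikipedia

wikipedia.set_lang("es")  # match the user's active language
print(wikipedia.summary("Charles Darwin", sentences=2))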

Installation Options

You can run the EPM system using one of these two methods:

Option 1: Using Docker Hub (Recommended)

The fastest way to get started with minimal setup:

# Pull the image directly from Docker Hub
docker pull eishaenan/polyglot_app:latest

# Run the container
docker run -p 5001:5000 -p 11434:11434 \
  -v polyglot_data:/app/data \
  -v polyglot_chroma:/app/chroma_db \
  -v polyglot_ollama:/root/.ollama \
  eishaenan/polyglot_app:latest

This will:

  • Pull the pre-built Docker image with all dependencies
  • Start Ollama and download the Llama model if needed (~8GB download on first run)
  • Initialize the authentication database with default admin user
  • Build the vector database from the data sources
  • Start the Flask web application

Note: The initial startup may take several minutes while the model downloads and the vector database is built. Performance in Docker may also be slower than a native installation, since LLM inference inside the container runs on CPU only.

Access the Application: Once startup completes, open http://localhost:5001 in your browser.
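
Optionally, you can verify that both services are reachable with a quick check. This is a sketch using the requests package; the ports match the docker run command above.

# Readiness check (assumes the `requests` package is installed)
import requests

web = requests.get("http://localhost:5001", timeout=5)
ollama = requests.get("http://localhost:11434/api/tags", timeout=5)  # Ollama's model list endpoint
print("web app:", web.status_code)
print("models:", [m["name"] for m in ollama.json().get("models", [])])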

Option 2: Local Python Environment

If you prefer to run without Docker:

  1. Clone the Repository

    git clone https://github.com/EishaEnan/epm-polyglot-chatbot.git
    cd epm-polyglot-chatbot
  2. Create and Activate a Virtual Environment

    python3 -m venv venv
    source venv/bin/activate  # Mac/Linux
    venv\Scripts\activate     # Windows (Command Prompt)
  3. Install Dependencies

    pip3 install -r requirements.txt
  4. Install Ollama (If Not Installed Yet)

    • Download from https://ollama.com
    • For Linux:
      curl -fsSL https://ollama.com/install.sh | sh
  5. Start Ollama in a Separate Terminal

    ollama serve
  6. Pull the Required Model

    ollama pull llama3.1:8b-instruct-q8_0

    Note: This will download approximately 8GB of data and may take some time depending on your internet connection.

  7. Run the Flask Application

    python src/app.py
  8. Access the Application

    Open the address printed in the Flask startup log (typically http://localhost:5000).

System Architecture

System Overview Diagram

System Architecture Diagram

Backend

  • Flask: Web framework for the application
  • LangChain: Orchestrates the RAG pipeline (a sketch follows this list)
  • ChromaDB: Vector database for document storage and retrieval
  • Ollama: Local LLM inference using Llama models
  • SQLite: Database for chat history and user management
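
To show how these backend pieces fit together, here is a minimal RAG sketch assuming the langchain-community integrations. The model name, Chroma directory, and prompt are illustrative; exact import paths vary across LangChain versions, and the repository's actual chain configuration may differ.

# Minimal RAG sketch (assumes langchain-community; paths and model name are illustrative)
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama

embeddings = HuggingFaceEmbeddings(model_name="paraphrase-multilingual-MiniLM-L12-v2")
store = Chroma(persist_directory="chroma_db/en", embedding_function=embeddings)
llm = Ollama(model="llama3.1:8b-instruct-q8_0")

question = "Who wrote On the Origin of Species?"
docs = store.similarity_search(question, k=4)        # retrieve the most relevant chunks
context = "\n\n".join(d.page_content for d in docs)  # assemble the grounding context
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer)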

Docker Configuration

The system is fully dockerized for easy deployment across platforms (a sample compose file is sketched after this list):

  • Persistent Volumes:

    • chat_db: Stores SQLite database for chat history and user data
    • chroma_db: Stores vector embeddings (3.5GB+)
    • ollama_models: Stores downloaded LLM models
  • Ports:

    • 5001: Web interface (Flask)
    • 11434: Ollama API
  • Cross-Platform Support:

    • Works on Linux, macOS, and Windows with Docker installed
    • All paths are relative and compatible across operating systems
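
For reference, a compose service consistent with the ports and volumes above might look like the following. This is a sketch only, and the service name is an assumption; the repository's actual docker-compose.yml is authoritative.

services:
  polyglot_app:                     # service name is an assumption
    image: eishaenan/polyglot_app:latest
    ports:
      - "5001:5000"                 # web interface (Flask)
      - "11434:11434"               # Ollama API
    volumes:
      - chat_db:/app/data           # SQLite chat history and user data
      - chroma_db:/app/chroma_db    # vector embeddings
      - ollama_models:/root/.ollama # downloaded LLM models

volumes:
  chat_db:
  chroma_db:
  ollama_models: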

Using the Dockerized Application

Building the Image

docker-compose build

Running the Container

docker-compose up

Running in Background

docker-compose up -d

Stopping the Container

docker-compose down

Viewing Logs

docker-compose logs -f

Development Workflow

If you're contributing to the project, follow these guidelines:

1. Setup Development Environment

  • Clone the repository and set up a local Python environment as described above
  • Make sure you have Ollama installed locally for testing

2. Make Changes

  • Create a feature branch for your changes
  • Test thoroughly before committing
  • Update documentation as needed

3. Build and Test Docker Image

  • Test your changes with Docker to ensure cross-platform compatibility
  • Verify that volumes are properly persisted

4. Submit Pull Request

  • Include a detailed description of your changes
  • Reference any related issues

Troubleshooting

Common Issues

  1. Port Conflicts: If port 5001 or 11434 is already in use, modify the port mapping in docker-compose.yml

  2. Vector Database Building: The first run will take time to build the vector database. Subsequent runs will be faster.

  3. Model Download: The first run will download the Llama model (~8GB), which may take time depending on your internet connection.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • The LangChain team for their excellent RAG framework
  • The Ollama project for making local LLM inference accessible
  • All contributors to the project
