ZAS/EAK CopilotGPT

Welcome to the official repository of the ZAS/EAK CopilotGPT challenge, developed as part of the Innovation Fellowship 2024. This project is designed to enhance workplace efficiency and foster innovation by providing AI-supported tools that assist employees in their daily tasks.

This repository serves as a proof of concept (PoC), which is slated to conclude in February 2025. However, we are optimistic that the momentum generated by this innovative challenge will attract continued support and development beyond this timeframe.

Documentation

Check the documentation website for more information, in-depth procedures, and examples.

Demo Video

A 45-second demo video showcasing the autocomplete and RAG features of the first prototype.

Updates

Coming Soon

  • Livingdocs Adapter
  • Confluence Adapter
  • Local Private Embeddings
  • Local Private LLMs

Version 0.3.0

  • Ollama open-source LLMs
  • Agentic RAG
  • Source validation
  • Topic Check
  • Commands
    • /summarize
    • /translate

Version 0.2.0

  • llama.cpp open-source models

Version 0.1.0

  • Authentication
  • User feedback
  • Conversational Memory
  • Chat History
  • Autocomplete
  • Optimized RAG
  • Survey Pipeline
  • Local Private LLM
    • llama.cpp (all compatible models, e.g. from Hugging Face)
    • mlx (Apple Silicon)
  • GUI
  • GUI Styleguide Bundle

Challenge Vision

COMING SOON: a detailed overview of our project's vision and strategic alignment.

Features

  • Automation of Routine Tasks: Reduces repetitive research and information lookup, allowing employees to focus on higher-value work.
  • Decision Support: Provides real-time assistance for decision-making through retrieval-augmented generation and LLM-based tools.

How To Contribute

Please check the CONTRIBUTORS.md file to contribute to the ZAS/EAK CopilotGPT project.

How it works

The ZAS/EAK CopilotGPT currently features:

  • Question autosuggest: High-quality curated questions (from the FAQ) are suggested in the chat bar based on the user input. Validated answers with sources are then returned in the chat. Autocomplete currently supports (an illustrative sketch of these strategies follows the list below):
    • exact match
    • fuzzy match (Levenshtein match, trigram match)
    • semantic similarity match
  • RAG: When no known question/answer pairs are found through autosuggest, RAG is initiated. A semantic similarity search will match the most relevant indexed documents in a vector database and an LLM will generate an answer based on these documents, providing the source of the answer.
  • Agentic RAG: A query is analyzed and routed to a suitable agent (e.g. the FAK-EAK agent), which can execute specialized tools (e.g. calculate_reduction_rate_and_supplement, calculate_reference_age, etc.) and perform agentic RAG steps:
    • Ask follow-up questions
    • Refine user query
    • Expand search
    • Perform multiple retrieval rounds
    • Evaluate retrieval results
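
The autosuggest matching strategies can be pictured with a minimal Python sketch. This is an illustration only, not the project's implementation: the actual service matches against FAQ questions stored in the database and uses embedding-based semantic search for the third strategy, which is omitted here.

    # Illustrative only -- not the CopilotGPT code base.
    from difflib import SequenceMatcher

    FAQ_QUESTIONS = [
        "How is the reduction rate calculated?",
        "How do I calculate my reference age?",
    ]

    def exact_match(user_input: str) -> list[str]:
        # Strategy 1: exact (case-insensitive) match against curated FAQ questions.
        return [q for q in FAQ_QUESTIONS if q.lower() == user_input.lower()]

    def fuzzy_match(user_input: str, threshold: float = 0.6) -> list[str]:
        # Strategy 2a: Levenshtein-style similarity via difflib's SequenceMatcher.
        return [
            q for q in FAQ_QUESTIONS
            if SequenceMatcher(None, user_input.lower(), q.lower()).ratio() >= threshold
        ]

    def trigram_match(user_input: str, threshold: float = 0.3) -> list[str]:
        # Strategy 2b: Jaccard similarity over character trigrams.
        def trigrams(s: str) -> set[str]:
            s = s.lower()
            return {s[i:i + 3] for i in range(len(s) - 2)}
        t = trigrams(user_input)
        return [
            q for q in FAQ_QUESTIONS
            if t and len(t & trigrams(q)) / len(t | trigrams(q)) >= threshold
        ]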

Getting Started

Here you will find instructions for installing and setting up ZAS/EAK CopilotGPT:

Prerequisites

Before starting, ensure you have the following software installed on your computer:

  • Git: Needed for source code management. Install from https://git-scm.com/downloads.
  • Docker: Required for running containers. Install from https://docs.docker.com/get-docker/.

Linux users may need to prepend sudo to Docker commands depending on their Docker configuration.
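
You can quickly verify both installations from a terminal:

    git --version
    docker --version
    docker-compose --version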

Installation

  1. Clone the Repository

    Begin by cloning the ZAS/EAK CopilotGPT repository to your local machine.

    git clone https://github.com/CdC-SI/ZAS-EAK-CopilotGPT.git
    cd ZAS-EAK-CopilotGPT
  2. Setting Up Environment Variables

    To use the ZAS/EAK Copilot, you need to set up some environment variables in .env.

    Copy the .env.example file to a new file named .env and fill in the appropriate values:

    cp .env.example .env

    Minimal requirements:

    • OPENAI_API_KEY
    • COHERE_API_KEY
    • DEEPL_API_KEY (optional, required for translation capabilities)
    • LANGFUSE_SECRET_KEY (obtain from the Langfuse UI at http://localhost:3000)
    • LANGFUSE_PUBLIC_KEY (obtain from the Langfuse UI at http://localhost:3000)
    • ENCRYPTION_KEY (generate with openssl rand -hex 32)

    To use a different LLM, embedding, or reranking provider, set the corresponding API key(s) in .env.

    If using a local open-source LLM API, set:

    • LOCAL_LLM_GENERATION_ENDPOINT (URL and port, e.g. http://host.docker.internal:11434 for Ollama)

    All other fields are preconfigured with sensible defaults and can be adjusted as needed.
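
    For illustration, a minimal .env might look like this (placeholder values; variable names as listed above):

    OPENAI_API_KEY=your-openai-api-key
    COHERE_API_KEY=your-cohere-api-key
    DEEPL_API_KEY=your-deepl-api-key              # optional, enables translation
    LANGFUSE_SECRET_KEY=your-langfuse-secret-key  # from http://localhost:3000
    LANGFUSE_PUBLIC_KEY=your-langfuse-public-key  # from http://localhost:3000
    ENCRYPTION_KEY=your-32-byte-hex-key           # openssl rand -hex 32
    # Only when using a local open-source LLM API:
    # LOCAL_LLM_GENERATION_ENDPOINT=http://host.docker.internal:11434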

  3. OPTIONAL: Advanced Copilot Configuration

    The Copilot is configured with default settings, but you can customize every aspect (e.g. LLM model, embedding model, autocomplete, RAG, etc.).

    Here are some tips to customize parameters in src/copilot/app/config/config.yaml:

    • rag/llm/model: set a specific LLM model:
      • gpt-4o-mini with an OpenAI API key
      • claude-3-5-sonnet-20241022 with an Anthropic API key
      • llama-3.1-8b-instant with a GROQ API key
      • Note: set a valid LLM API model name for a given provider.
      • Note: if using an Open Source LLM:
        • with llama.cpp: prefix the model name with llama-cpp: (e.g. llama-cpp:qwen2.5-7b-instruct-q2_k.gguf)
        • with mlx: prefix the model name with mlx-community: (e.g. mlx-community:Nous-Hermes-2-Mistral-7B-DPO-4bit-MLX)
        • with ollama: prefix the model name with ollama: (e.g. ollama:deepseek-r1:8b)
    • rag/embedding/model: set a specific embedding model
      • text-embedding-3-small with an OpenAI API key
      • Note: RAG performance might vary if you don't embed all your data with the same embedding model.
    • rag/retrieval/retrieval_method: add different retrievers
      • top_k_retriever
      • query_rewriting_retriever
      • bm25_retriever
      • contextual_compression_retriever
      • reranking
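
    As an illustrative example (the keys mirror the rag/llm, rag/embedding, and rag/retrieval paths listed above; check src/copilot/app/config/config.yaml for the authoritative structure and defaults), a customized configuration could look roughly like:

    rag:
      llm:
        model: gpt-4o-mini             # or e.g. ollama:deepseek-r1:8b
      embedding:
        model: text-embedding-3-small
      retrieval:
        retrieval_method:
          - top_k_retriever
          - reranking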
  4. Build Docker Images

    Build the Docker images using the Docker Compose configuration. This step builds the images and starts all services in detached mode.

    docker-compose up --build -d

    Note: at first startup, you might need to stop and restart the services once due to database connectivity issues.
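
    If that happens, a simple restart is enough:

    docker-compose down
    docker-compose up -d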

  5. Verifying the Installation

    Check the status of the containers to confirm everything is running as expected:

    docker-compose ps

    After the containers are successfully started, verify that the application is running correctly by accessing it through your web browser at http://localhost:4200.

  6. Index some data

    • To index sample data for FAQ and RAG, navigate to the indexing Swagger UI and make a request to:

      • /upload_csv_rag
      • /upload_csv_faq
      • Note: set the embed parameter to true to enable semantic search
    • To index more extensive RAG data from an official government website, navigate to the indexing Swagger UI and make a request to:

      • /index_html_from_sitemap (scrapes any *.admin.ch website given its sitemap URL)
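
    For example, a request to index sample FAQ data could look like the following (the base URL, parameter style, and file field name are assumptions; consult the indexing Swagger UI for the exact request format):

      # Hypothetical request -- adjust host, port, and payload to your deployment
      curl -X POST "http://localhost:8000/upload_csv_faq?embed=true" \
           -F "file=@sample_faq.csv"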
