LaurentVeyssier/Document-Review-AI-Assistant

Model Context Protocol + Tool-Rendered UI Document Review Assistant

A minimal document review AI assistant with an interactive in-browser editor and LLM-powered review tools, where the UI is generated by AI tools at runtime via MCP.

Instead of a traditional frontend rendering backend data, MCP tools return fully interactive UI components (HTML/CSS/JS). The frontend is a thin static shell; the components are delivered by MCP tools and inserted into the page dynamically.

  • On the practical side, this project provides a lightweight, minimal, interactive document reviewer that can manage in-situ modifications and verify facts using real-time web search and semantic reranking.

  • On the technical side, it is a full-stack implementation of an Agentic AI application using the Model Context Protocol (MCP) and UIResource components. It highlights the seamless integration between MCP server-side logic and dynamic WebUI rendering, where the server provides both the agentic tools and the sandboxed interface (UIResource) required for the user to interact with the document directly within the browser in real-time.

This is not a classic REST backend. Instead, it uses:

  • MCP tools as callable functions
  • UIResources returned directly from MCP tools
  • A bridge that converts browser RPC → MCP tool calls

In this system:

  • The user selects document text
  • The model calls a review tool
  • The MCP server returns a live Review Card widget
  • The UI injects it instantly
  • The user can apply edits, save, or download

The model doesn't just generate text; it generates interface.

This demonstrates a tool-rendered UI pattern, where the server defines interactive components and the model orchestrates interface composition dynamically.

Stack summary:

  • MCP tool protocol
  • UIResource rendering
  • Tool-driven UI generation
  • LLM tool-calling loop
  • Bridge architecture
  • Interactive document editor

🌟 Features

  • Interactive UI: A rich, web-based document viewer that supports highlighting and editing.
  • Agentic Review: Select text to have an AI (Gemini) perform the modifications you request and check for factual errors using web search.
  • Fact-Checking Pipeline: Integrates LangSearch for deep web search and Semantic Reranking to provide the LLM with high-quality, relevant context.
  • Full-Stack MCP: Showcases a complete architecture where the MCP Server serves dynamic UIResource components. This iframe-based Web UI provides a sandboxed environment for document interaction, allowing the server to inject custom tools and interfaces directly into the client while maintaining clean separation between the document viewer and the host application logic.

✨ User Experience

This application transforms static documents into interactive, AI-augmented workspaces.

What you can do:

  • Browse & View: Open and read local files (.docx, .md, .txt, .py) through a clean, modern web interface.
  • Smart Selection: Highlight any text you want to verify, expand upon, or fact-check.
  • One-Click Research: Trigger an autonomous agent that searches the web, reranks sources, and validates claims in real-time.
  • Direct Editing: Review AI-generated suggestions and apply them to your document with a single click.

🔍 Fact-Checking Workflow

  1. Select Text: Highlight any text in the document.
  2. Review: Ask the AI to review the selected text.
  3. Agentic Loop:
    • The server receives the text.
    • The AI detects factual claims.
    • The AI calls web_search("...").
    • LangSearch returns results -> Reranker sorts them -> Filter extracts snippets.
    • The AI compares facts and generates a suggestion.
  4. Edit & Apply: Review the AI's suggestion, make manual edits if needed, and click "Apply" to update the document.
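The search → rerank → filter step of the loop can be sketched as follows. This is a minimal illustration, not the project's code: the `SearchResult` type, the `top_k` and `max_chars` parameters, and the keyword-overlap score (a toy stand-in for the semantic reranker, which is an external service) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Hypothetical shape of one web search hit."""
    url: str
    text: str


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def rerank_and_filter(query, results, top_k=3, max_chars=200):
    """Sort hits by relevance, keep the best top_k, and trim each to a snippet."""
    ranked = sorted(results, key=lambda r: overlap_score(query, r.text), reverse=True)
    return [{"url": r.url, "snippet": r.text[:max_chars]} for r in ranked[:top_k]]


hits = [
    SearchResult("https://example.com/a", "The Eiffel Tower is in Paris"),
    SearchResult("https://example.com/b", "Bananas are rich in potassium"),
    SearchResult("https://example.com/c", "Eiffel Tower height is 330 metres"),
]
context = rerank_and_filter("Eiffel Tower height", hits, top_k=2)
```

Only the trimmed snippets in `context` would be handed to the LLM, which keeps the prompt small while preserving the most relevant evidence.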

📄 Supported Files & Locations

The application works with text-based documents and Word files.

  • Supported Formats: .docx, .pptx, .txt, .md, .py, .json
  • Default Location: By default, the application lists files from the docs/ folder in the project root.
  • Manual Selection: You can use the "Browse..." button in the UI to load files from any absolute path on your system.
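As a rough sketch of how a default listing could work, the snippet below filters a folder by the supported extensions listed above. The function name `list_documents` and the choice to ignore subfolders are assumptions; the project's actual listing logic may differ.

```python
import tempfile
from pathlib import Path

# Extensions taken from the "Supported Formats" list above.
SUPPORTED_EXTENSIONS = {".docx", ".pptx", ".txt", ".md", ".py", ".json"}


def list_documents(folder):
    """Return sorted names of supported files directly inside folder."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


# Demo against a throwaway folder standing in for docs/.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("notes.md", "report.docx", "image.png"):
        (Path(tmp) / name).write_text("demo")
    found = list_documents(tmp)
```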

🏗️ Architecture

The system consists of three main layers working together (MCP-UI Server, Client and Frontend UI):

flowchart TD
    %% Client-side components
    User["User / Browser"] <-->|HTTP / WebSocket| HostUI["Host UI (host.html)"]
    HostUI <-->|postMessage| ViewerUI["Viewer UI (iframe / viewer.html)"]
    
    %% Bridge between client and server
    HostUI <-->|HTTP RPC| ClientBridge["Client Bridge (client.py)"]
    
    %% Server-side components
    ClientBridge <-->|Stdio / MCP| MCPServer["MCP Server (server.py)"]
    
    %% External services
    MCPServer <-->|External API| LangSearch["LangSearch API"]
    MCPServer <-->|External API| Gemini["Gemini LLM"]

MCP-UI Architecture

Browser UI (host.html + viewer.html) - Frontend UI Layer
        ↓
HTTP Bridge (client.py) - Bridge Layer
        ↓
MCP Tool Server (server.py with FastMCP)
        ↓
LLM + Tools (Gemini + Web Search + File handlers)

System Flow

Browser selection
→ POST /rpc review_selection
→ client.py queue
→ MCP tool call
→ Gemini + optional web_search
→ JSON comment + suggestion
→ browser displays inline feedback

Typical User Workflow:

User selects text → AI tool called → MCP returns Review Card widget →
UI injects editable suggestion → user applies change → save tool invoked

1. MCP Server (server/server.py)

  • Built with mcp.server.fastmcp and mcp-ui SDK.
  • Tools:
    • get_document_ui: Reads a file and returns a UIResource (HTML content) that is passed through the Client Bridge to the Host UI and rendered within the Viewer UI iframe.
    • web_search: Performs search + reranking + filtering.
    • review_selection: Orchestrates the Agentic loop (LLM + Search).
    • save_file: Persists changes to disk.
    • download_document: Downloads the document.
    • upload_document: Uploads a document.
  • Templates: Uses server/assets/viewer.html as the template for the document view.

2. Client Bridge (client/client.py)

  • A custom Python MCP Client.

  • Role: Connects the Web UI to the Stdio-based MCP Server.

  • Functionality:

    • Hosts a local web server (port 8081).
    • Spawns the MCP Server as a subprocess (uv run server.py).
    • Exposes an /rpc endpoint to allow the browser to call specific MCP tools.
      • The POST /rpc endpoint receives tool calls from the Host UI, pushes each request onto a queue, waits for the MCP result via a Future, and returns a JSON response to the browser.

        queue → call MCP tool → return result → resolve Future → HTTP response

    So the browser never talks directly to MCP, only via this bridge.
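The queue-and-Future handoff described above can be sketched with `asyncio`. This is an illustration under assumptions, not the contents of client.py: `fake_call_tool` is a hypothetical stand-in for the real MCP session call, and `handle_rpc` stands in for whatever the HTTP handler does.

```python
import asyncio


async def fake_call_tool(name, params):
    """Hypothetical stand-in for the MCP session's tool call."""
    await asyncio.sleep(0)
    return {"tool": name, "echo": params}


async def worker(queue):
    """Pull requests off the queue, run the tool, and resolve each Future."""
    while True:
        name, params, future = await queue.get()
        try:
            outcome = await fake_call_tool(name, params)
            if not future.done():
                future.set_result(outcome)
        except Exception as exc:  # surface tool errors to the waiting handler
            if not future.done():
                future.set_exception(exc)
        finally:
            queue.task_done()


async def handle_rpc(queue, name, params):
    """What an HTTP handler would do: enqueue the request, then await the result."""
    future = asyncio.get_running_loop().create_future()
    await queue.put((name, params, future))
    return await future


async def main():
    queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue))
    result = await handle_rpc(queue, "review_selection", {"text": "hello"})
    task.cancel()  # stop the worker once the demo request is answered
    return result


result = asyncio.run(main())
```

A single worker draining the queue serializes tool calls over the one stdio connection, while each HTTP request only waits on its own Future.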

3. Frontend UI (host.html & viewer.html)

  • Host UI: The "Shell" application. It lists files and manages the main window.
  • Viewer UI: A sandboxed environment (iframe) that renders the UIResource (HTML/JS) served by the MCP Server. It captures user interactions, such as text selection, and communicates them to the Host UI via window.postMessage. This allows the Host UI to act as a bridge, forwarding UI events to the MCP Server (via the Client Bridge) to trigger tool executions or state updates, while the server sends updated UI back through the same chain.

host.html acts as the main shell UI:

  • Sidebar layout
  • Loads assets
  • Lets user:
    • Upload documents
    • Select files
    • Open document viewer
  • Sends tool calls via POST /rpc

viewer.html is the interactive document viewer/editor template. The server injects {{DOCUMENT_CONTENT}} and {{FILEPATH}}. This is not just a display: it is an interactive editing surface.
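Placeholder injection of this kind might look like the sketch below. The `render_viewer` helper, the inline `TEMPLATE`, and the decision to HTML-escape every value are assumptions for illustration; the real server may inject pre-rendered HTML for some formats instead of escaped text.

```python
import html
import re

# Tiny stand-in for viewer.html; the real template is a full HTML app.
TEMPLATE = "<title>{{FILEPATH}}</title><main>{{DOCUMENT_CONTENT}}</main>"


def render_viewer(template, values):
    """Fill {{NAME}} placeholders in one pass, HTML-escaping each value."""
    def substitute(match):
        key = match.group(1)
        # Unknown placeholders are left untouched rather than raising.
        return html.escape(values[key]) if key in values else match.group(0)

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


page = render_viewer(
    TEMPLATE,
    {"FILEPATH": "docs/a.md", "DOCUMENT_CONTENT": "1 < 2 & 3"},
)
```

Substituting in a single pass means a document that happens to contain the text `{{FILEPATH}}` is not expanded a second time.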

Features:

  • Styled document view
  • Editable content
  • Text selection
  • Review selected text
  • Save button → calls MCP tool save_file
  • Download button → calls download_document
  • Review selection → calls review_selection

💡 MCP-UI Implementation Details

This project illustrates a powerful pattern for building rich MCP interfaces: Frontend-Backend Bridging via UIResources.

The "UIResource" Pattern

Instead of just returning text like most MCP tools, the get_document_ui tool returns a UIResource. This allows the server to dictate the interface the user sees.

# server.py
@mcp.tool()
def get_document_ui(filepath: str) -> List[UIResource]:
    # ... read file ...
    # ... inject into viewer.html template ...
    return [create_ui_resource({
        "uri": f"ui://viewer/{filename}",
        "content": {"type": "rawHtml", "htmlString": html}, # The full HTML app
        # ...
    })]

This is what enables:

MCP tool → returns UI → rendered in client

This is a core MCP UI pattern.

The Iframe Bridge

To make the UI interactive (e.g., clicking a button in the UI calls a Python tool), we use a postMessage bridge:

  1. Viewer (iframe): User clicks "Review".
    // viewer.html
    window.parent.postMessage({
        type: 'mcp-tool-call',
        tool: 'review_selection',
        params: { text: "...", context: "..." }
    }, '*');
  2. Host (Parent): Listens for message and calls the backend.
    // host.html
    window.addEventListener('message', async (event) => {
        if (event.data.type === 'mcp-tool-call') {
            // Call Python client bridge via HTTP
            const result = await callTool(event.data.tool, event.data.params);
            // Send result back to iframe
            frame.contentWindow.postMessage({ type: 'tool-response', ... }, '*');
        }
    });
  3. Client (Python): Proxies request to MCP Server.
    # client.py
    # Receives POST /rpc -> calls session.call_tool() -> Returns result
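Before proxying, a bridge like this would typically validate the request body. The sketch below assumes a JSON payload of the form `{"tool": ..., "params": {...}}`, mirroring the postMessage fields shown above; the payload shape, the `parse_rpc_body` helper, and the allowlist check are assumptions, not confirmed behavior of client.py.

```python
import json

# Tool names taken from the MCP Server section of this README.
ALLOWED_TOOLS = {
    "get_document_ui",
    "web_search",
    "review_selection",
    "save_file",
    "download_document",
    "upload_document",
}


def parse_rpc_body(raw):
    """Validate a raw /rpc request body and return (tool, params)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("body must be a JSON object")
    tool = payload.get("tool")
    params = payload.get("params", {})
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    if not isinstance(params, dict):
        raise ValueError("params must be a JSON object")
    return tool, params


tool, params = parse_rpc_body('{"tool": "save_file", "params": {"filepath": "docs/a.md"}}')
```

Rejecting unknown tool names at the bridge keeps the browser limited to the tools the server intends to expose.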

📂 Project Structure

interactive-doc-mcp/
├── client/
│   ├── assets/
│   │   └── host.html       # The main "Shell" UI
│   └── client.py           # The MCP Client Bridge
├── server/
│   ├── assets/
│   │   └── viewer.html     # The interactive document template
│   ├── .env                # API keys and configuration
│   └── server.py           # The MCP Server logic
├── docs/                   # Default folder for documents
├── requirements.txt        # Shared dependencies
└── README.md

🚀 Setup & Usage

Prerequisites

  • Python 3.10+
  • uv (Package manager)
  • API Keys:
    • GOOGLE_API_KEY: For Gemini LLM.
    • LANGSEARCH_API_KEY: For Search & Reranking.

Installation

  1. Clone the repo.
  2. Install dependencies (shared for both client and server):
    uv pip install -r requirements.txt
  3. Create a .env file in server/ with your keys.
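A quick way to confirm both keys are available is a check like the one below. It is a sketch that only inspects an already-loaded environment; how the project reads server/.env is not stated here, and the `missing_keys` helper is hypothetical.

```python
import os

# Key names taken from the Prerequisites list above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "LANGSEARCH_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in env."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
example_missing = missing_keys({"GOOGLE_API_KEY": "dummy-value"})
```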

Running the App

Run the client bridge from the root directory. This will automatically start the server.

uv run client/client.py

The application will open in your default browser at http://localhost:8080.
