A minimal document review AI assistant with an interactive in-browser editor and LLM-powered review tools, where the UI is generated by AI tools at runtime via MCP.
Instead of a traditional frontend rendering backend data, MCP tools return fully interactive UI components (HTML/CSS/JS). The frontend is a thin static shell; interactive UI components are delivered by MCP tools and inserted dynamically into the user interface.
- On the practical side, this project provides a lightweight, minimal, interactive document reviewer that can manage in-situ modifications and verify facts using real-time web search and semantic reranking.
- On the technical side, it is a full-stack implementation of an agentic AI application using the Model Context Protocol (MCP) and UIResource components. It highlights the seamless integration between MCP server-side logic and dynamic Web UI rendering, where the server provides both the agentic tools and the sandboxed interface (UIResource) required for the user to interact with the document directly within the browser in real time.
This is not a classic REST backend. Instead, it uses:
- MCP tools as callable functions
- UIResources returned directly from MCP tools
- A bridge that converts browser RPC → MCP tool calls
In this system:
- The user selects document text
- The model calls a review tool
- The MCP server returns a live Review Card widget
- The UI injects it instantly
- The user can apply edits, save, or download
The model doesn't just generate text; it generates interface.
This demonstrates a tool-rendered UI pattern, where the server defines interactive components and the model orchestrates interface composition dynamically.
Stack summary:
- MCP tool protocol
- UIResource rendering
- Tool-driven UI generation
- LLM tool-calling loop
- Bridge architecture
- Interactive document editor
- Interactive UI: A rich, web-based document viewer that supports highlighting and editing.
- Agentic Review: Select text and have an AI (Gemini) perform the modifications you request on it and check it for factual errors using web search.
- Fact-Checking Pipeline: Integrates LangSearch for deep web search and Semantic Reranking to provide the LLM with high-quality, relevant context.
- Full-Stack MCP: Showcases a complete architecture where the MCP Server serves dynamic UIResource components. This iframe-based Web UI provides a sandboxed environment for document interaction, allowing the server to inject custom tools and interfaces directly into the client while maintaining clean separation between the document viewer and the host application logic.
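The search-rerank-filter stage can be sketched as below. This is a hedged sketch: the real pipeline uses LangSearch's semantic reranker, and here a simple keyword-overlap score stands in for it so the shape of the pipeline is testable offline. The function names are illustrative.

```python
def rerank(query: str, results: list[dict], top_k: int = 3) -> list[dict]:
    """Order search results by relevance (keyword-overlap stand-in for semantic reranking)."""
    q_terms = set(query.lower().split())

    def score(result: dict) -> float:
        terms = set(result["snippet"].lower().split())
        return len(q_terms & terms) / max(len(q_terms), 1)

    return sorted(results, key=score, reverse=True)[:top_k]

def filter_snippets(results: list[dict], max_chars: int = 500) -> str:
    """Concatenate top snippets into a compact context block for the LLM."""
    out, used = [], 0
    for r in results:
        if used >= max_chars:
            break
        snippet = r["snippet"][:max_chars - used]
        out.append(f"[{r['url']}] {snippet}")
        used += len(snippet)
    return "\n".join(out)
```

The point of the reranking step is that raw web results are ordered by the search engine's notion of relevance, not the claim's; rescoring against the selected text keeps the LLM's context window filled with the most useful evidence.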
This application transforms static documents into interactive, AI-augmented workspaces.
- Browse & View: Open and read local files (`.docx`, `.md`, `.txt`, `.py`) through a clean, modern web interface.
- Smart Selection: Highlight any text you want to verify, expand upon, or fact-check.
- One-Click Research: Trigger an autonomous agent that searches the web, reranks sources, and validates claims in real-time.
- Direct Editing: Review AI-generated suggestions and apply them to your document with a single click.
- Select Text: Highlight any text in the document.
- Review: Ask the AI to review the selected text.
- Agentic Loop:
- The server receives the text.
- The AI detects factual claims.
- The AI calls `web_search("...")`.
- LangSearch returns results -> Reranker sorts them -> Filter extracts snippets.
- The AI compares facts and generates a suggestion.
- Edit & Apply: Review the AI's suggestion, make manual edits if needed, and click "Apply" to update the document.
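The agentic loop above can be sketched as a single orchestration function. `detect_claims`, `web_search`, and `suggest_fix` are hypothetical stand-ins for the Gemini and LangSearch calls the real server makes; they are passed in so the loop's structure is clear and testable.

```python
def review_selection(text: str, detect_claims, web_search, suggest_fix) -> dict:
    """Run the agentic loop: detect claims, gather evidence, propose an edit."""
    claims = detect_claims(text)              # 1. the AI detects factual claims
    evidence = []
    for claim in claims:                      # 2. search + rerank per claim
        evidence.extend(web_search(claim))
    suggestion = suggest_fix(text, evidence)  # 3. compare facts, generate a fix
    return {"original": text, "claims": claims, "suggestion": suggestion}
```
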
The application works with text-based documents and Word files.
- Supported Formats: `.docx`, `.pptx`, `.txt`, `.md`, `.py`, `.json`
- Default Location: By default, the application lists files from the `docs/` folder in the project root.
- Manual Selection: You can use the "Browse..." button in the UI to load files from any absolute path on your system.
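The default listing behavior might look like the following sketch; `list_documents` is an illustrative name, not the server's actual code.

```python
from pathlib import Path

# Extensions from the Supported Formats list above.
SUPPORTED = {".docx", ".pptx", ".txt", ".md", ".py", ".json"}

def list_documents(folder: str = "docs") -> list[str]:
    """Return supported files in the folder, sorted by name."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.iterdir() if p.suffix.lower() in SUPPORTED)
```
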
The system consists of three main layers working together (MCP-UI Server, Client and Frontend UI):
```mermaid
flowchart TD
    %% Client-side components
    User["User / Browser"] <-->|HTTP / WebSocket| HostUI["Host UI (host.html)"]
    HostUI <-->|postMessage| ViewerUI["Viewer UI (iframe / viewer.html)"]

    %% Bridge between client and server
    HostUI <-->|HTTP RPC| ClientBridge["Client Bridge (client.py)"]

    %% Server-side components
    ClientBridge <-->|Stdio / MCP| MCPServer["MCP Server (server.py)"]

    %% External services
    MCPServer <-->|External API| LangSearch["LangSearch API"]
    MCPServer <-->|External API| Gemini["Gemini LLM"]
```
```
Browser UI (host.html + viewer.html) - Frontend UI Layer
        ↓
HTTP Bridge (client.py) - Bridge Layer
        ↓
MCP Tool Server (server.py with FastMCP)
        ↓
LLM + Tools (Gemini + Web Search + File handlers)
```
```
Browser selection
  → POST /rpc review_selection
  → client.py queue
  → MCP tool call
  → Gemini + optional web_search
  → JSON comment + suggestion
  → browser displays inline feedback
```

User selects text → AI tool called → MCP returns Review Card widget → UI injects editable suggestion → user applies change → save tool invoked

- Built with `mcp.server.fastmcp` and the `mcp-ui` SDK.
- Tools:
  - `get_document_ui`: Reads a file and returns a `UIResource` (HTML content) that is passed through the Client Bridge to the Host UI and rendered within the Viewer UI iframe.
  - `web_search`: Performs search + reranking + filtering.
  - `review_selection`: Orchestrates the agentic loop (LLM + Search).
  - `save_file`: Persists changes to disk.
  - `download_document`: Downloads the document.
  - `upload_document`: Uploads a document.
- Templates: Uses `server/assets/viewer.html` as the template for the document view.
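As a sketch, the `save_file` tool might look like the following. In the actual server it would carry FastMCP's `@mcp.tool()` decorator; it is shown here as a plain function (with illustrative, not actual, implementation details) so the snippet runs standalone:

```python
from pathlib import Path

# In server.py this would be registered with @mcp.tool(); shown
# undecorated here so the logic is clear and testable in isolation.
def save_file(filepath: str, content: str) -> str:
    """Persist edited document content back to disk and report the result."""
    path = Path(filepath)
    path.parent.mkdir(parents=True, exist_ok=True)  # ensure the folder exists
    path.write_text(content, encoding="utf-8")
    return f"Saved {len(content)} characters to {filepath}"
```
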
- A custom Python MCP Client.
- Role: Connects the Web UI to the Stdio-based MCP Server.
- Functionality:
  - Hosts a local web server (port 8081).
  - Spawns the MCP Server as a subprocess (`uv run server.py`).
  - Exposes an `/rpc` endpoint to allow the browser to call specific MCP tools.
  - This POST `/rpc` endpoint receives tool calls from the Host UI, pushes each request into a queue, waits for the MCP result via a Future, and returns a JSON response to the browser: queue → call MCP tool → return result → resolve Future → HTTP response.
- So the browser never talks directly to MCP; it always goes through this bridge.
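The queue-and-Future handoff can be sketched as follows. This is simplified: `rpc_handler` and `mcp_worker` are illustrative names, and the real bridge in client.py also manages the MCP server subprocess and session.

```python
import asyncio

async def rpc_handler(queue: asyncio.Queue, tool: str, params: dict):
    """Called per POST /rpc: enqueue the tool call, then await its result."""
    future = asyncio.get_running_loop().create_future()
    await queue.put((tool, params, future))
    return await future  # resolved by the MCP worker below

async def mcp_worker(queue: asyncio.Queue, call_tool):
    """Drain the queue, call the MCP tool, resolve each waiting Future."""
    while True:
        tool, params, future = await queue.get()
        future.set_result(await call_tool(tool, params))
```

The Future is what lets a synchronous-looking HTTP request block until the asynchronous MCP call completes, without the browser ever touching the MCP session directly.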
- Host UI: The "Shell" application. It lists files and manages the main window.
- Viewer UI: A sandboxed environment (iframe) that renders the `UIResource` (HTML/JS) served by the MCP Server. It captures user interactions, such as text selection, and communicates them to the Host UI via `window.postMessage`. This allows the Host UI to act as a bridge, forwarding UI events to the MCP Server (via the Client Bridge) to trigger tool executions or state updates, while the server sends updated UI back through the same chain.
host.html acts as the main shell UI:
- Sidebar layout
- Loads assets
- Lets user:
- Upload documents
- Select files
- Open document viewer
- Sends tool calls via POST /rpc
viewer.html is the interactive document viewer/editor template. The server injects `{{DOCUMENT_CONTENT}}` and `{{FILEPATH}}`. This is not just a display; it is an interactive editing surface.
Features:
- Styled document view
- Editable content
- Text selection
- Review selected text
- Save button → calls MCP tool `save_file`
- Download button → calls `download_document`
- Review selection → calls `review_selection`
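The placeholder injection the server performs before returning the viewer can be sketched as below; `render_viewer` is an illustrative name, not the server's actual function.

```python
def render_viewer(template: str, document_content: str, filepath: str) -> str:
    """Fill the viewer template's placeholders with the document data."""
    return (template
            .replace("{{DOCUMENT_CONTENT}}", document_content)
            .replace("{{FILEPATH}}", filepath))
```
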
This project illustrates a powerful pattern for building rich MCP interfaces: Frontend-Backend Bridging via UIResources.
Instead of just returning text like most MCP tools, the `get_document_ui` tool returns a `UIResource`. This allows the server to dictate the interface the user sees.
```python
# server.py
@mcp.tool()
def get_document_ui(filepath: str) -> List[UIResource]:
    # ... read file ...
    # ... inject into viewer.html template ...
    return [create_ui_resource({
        "uri": f"ui://viewer/(unknown)",
        "content": {"type": "rawHtml", "htmlString": html},  # The full HTML app
        # ...
    })]
```

This is what enables:
MCP tool → returns UI → rendered in client
This is a core MCP UI pattern.
To make the UI interactive (e.g., clicking a button in the UI calls a Python tool), we use a postMessage bridge:
- Viewer (iframe): User clicks "Review".
  ```js
  // viewer.html
  window.parent.postMessage({
    type: 'mcp-tool-call',
    tool: 'review_selection',
    params: { text: "...", context: "..." }
  }, '*');
  ```
- Host (Parent): Listens for message and calls the backend.
  ```js
  // host.html
  window.addEventListener('message', async (event) => {
    if (event.data.type === 'mcp-tool-call') {
      // Call Python client bridge via HTTP
      const result = await callTool(event.data.tool, event.data.params);
      // Send result back to iframe
      frame.contentWindow.postMessage({ type: 'tool-response', ... }, '*');
    }
  });
  ```
- Client (Python): Proxies request to MCP Server.
  ```python
  # client.py
  # Receives POST /rpc -> calls session.call_tool() -> Returns result
  ```
```
interactive-doc-mcp/
├── client/
│   ├── assets/
│   │   └── host.html      # The main "Shell" UI
│   └── client.py          # The MCP Client Bridge
├── server/
│   ├── assets/
│   │   └── viewer.html    # The interactive document template
│   ├── .env               # API keys and configuration
│   └── server.py          # The MCP Server logic
├── docs/                  # Default folder for documents
├── requirements.txt       # Shared dependencies
└── README.md
```
- Python 3.10+
- `uv` (package manager)
- API Keys:
  - GOOGLE_API_KEY: For Gemini LLM.
  - LANGSEARCH_API_KEY: For Search & Reranking.
- Clone the repo.
- Install dependencies (shared for both client and server):

  ```shell
  uv pip install -r requirements.txt
  ```
- Create a `.env` file in `server/` with your keys.
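A `.env` file of the following shape should work; the key names come from the Prerequisites section above, and the values are placeholders:

```
GOOGLE_API_KEY=your-gemini-api-key
LANGSEARCH_API_KEY=your-langsearch-api-key
```
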
Run the client bridge from the root directory. This will automatically start the server.
```shell
uv run client/client.py
```

The application will open in your default browser at http://localhost:8080.