Pinot MCP Server

Overview

This project is a Python-based Model Context Protocol (MCP) server for interacting with Apache Pinot. It is built using the FastMCP framework. It is designed to integrate with Claude Desktop to enable real-time analytics and metadata queries on a Pinot cluster.

It allows you to

List tables, segments, and schema info from Pinot
Execute read-only SQL queries
View index/column-level metadata
Designed to assist business users via Claude integration
and much more.

StarTree Server for Apache Pinot MCP server

Pinot MCP in Action

See Pinot MCP in action below:

Fetching Metadata

Fetching Data, followed by analysis

Prompt: Can you do a histogram plot on the GitHub events against time

Sample Prompts

Once Claude is running, click the hammer 🛠️ icon and try these prompts:

Can you help me analyse my data in Pinot? Use the Pinot tool and look at the list of tables to begin with.
Can you do a histogram plot on the GitHub events against time

Quick Start

Prerequisites

Install uv (if not already installed)

uv is a fast Python package installer and resolver, written in Rust. It's designed to be a drop-in replacement for pip with significantly better performance.

curl -LsSf https://astral.sh/uv/install.sh | sh

# Reload your bashrc/zshrc to take effect. Alternatively, restart your terminal
# source ~/.bashrc

Installation

# Clone the repository
git clone https://github.com/startreedata/mcp-pinot.git
cd mcp-pinot
uv pip install -e . # Install dependencies

# For development dependencies (including testing tools), use:
# uv pip install -e .[dev]

Configure Pinot Cluster

The MCP server expects a uvicorn config style .env file in the root directory to configure the Pinot cluster connection. This repo includes a sample .env.example file that assumes a pinot quickstart setup.

mv .env.example .env

Configure Table Filtering (Optional)

⚠️ Security Note: For production access control, use Pinot's native table-level ACLs (available since Pinot 0.8.0+). Table filtering in this MCP server is a convenience feature for organizing tables and improving UX, not a security boundary. It uses best-effort SQL parsing and should not be relied upon for security.

Table filtering allows you to control which Pinot tables are visible through the MCP server. This is useful for:

Reduce Cognitive Load: Focus on relevant tables when your Pinot cluster has hundreds or thousands of tables
Multi-Tenancy UX: Run multiple MCP server instances against the same Pinot cluster, each showing different table subsets for different teams or use cases
Environment Separation: Deploy different MCP server instances (dev, staging, prod) that show only environment-specific tables
Hide System Tables: Filter out internal, test, or deprecated tables from end-user view

When table filtering is enabled, all table operations are filtered to show only the configured tables.

What Gets Filtered

Table filtering applies across all MCP operations:

Table Listing - Only configured tables appear in table lists
Query Execution - SQL queries are checked to ensure all referenced tables (in FROM, JOIN, subqueries, CTEs, etc.) match the configured patterns
Table Operations - Direct table access operations filter by table name:
- Get table details, size, and metadata
- Get table segments and segment metadata
- Get index/column details
- Get/update table configurations
Schema Operations - Schema operations filter by schema name:
- Get/create/update schemas
- Create table configurations

Setup

Copy the example configuration file:

cp table_filters.yaml.example table_filters.yaml

Edit table_filters.yaml to specify which tables to include:

included_tables:
  - production_*        # All tables starting with "production_"
  - analytics_events    # Specific table name
  - metrics_*          # All tables starting with "metrics_"

Configure the filter file path in your .env:

PINOT_TABLE_FILTER_FILE=table_filters.yaml

Pattern Matching

The filter supports glob-style patterns using standard Unix filename pattern matching:

exact_table_name - Matches exactly this table
prefix_* - Matches all tables starting with "prefix_"
*_suffix - Matches all tables ending with "_suffix"
*pattern* - Matches all tables containing "pattern"
sharded_table_? - Matches tables with exactly one character after the underscore (e.g., sharded_table_1, sharded_table_a)

Query Filtering

When filtering is enabled, SQL queries are checked before execution:

Supported SQL Features: FROM clauses, JOIN clauses (INNER, LEFT, RIGHT, OUTER, CROSS), subqueries, CTEs (WITH), UNION queries, comma-separated table lists
Quoted Identifiers: Supports both double-quoted ("table name") and backtick-quoted (`table_name`) table names
Schema Prefixes: Handles schema-qualified table names (e.g., database.schema.table)
Comments: Removes SQL comments before checking

Example filtered query:

SELECT * FROM allowed_table
JOIN other_table ON allowed_table.id = other_table.id

Error: Query references unauthorized tables: other_table. Allowed tables: allowed_table, prod_*

Configuration Features

Fail-Fast Validation:

⚠️ If PINOT_TABLE_FILTER_FILE is configured but the file doesn't exist, the server will fail to start with a FileNotFoundError
This prevents accidentally showing all tables due to misconfiguration
Empty filter files or missing included_tables key will show all tables (no filtering)

Comprehensive Filtering:

All MCP tools that access tables apply filtering before execution
Consistent filtering across all table access points
Clear error messages indicate which tables don't match the configured patterns

Disabling Table Filtering

To disable table filtering, either:

Remove the PINOT_TABLE_FILTER_FILE environment variable, or
Don't configure it in your .env file

When not configured, all tables in the Pinot cluster are visible.

Configure OAuth Authentication (Optional)

To enable OAuth authentication, set the following environment variables in your .env file:

Required variables (when OAUTH_ENABLED=true):

OAUTH_CLIENT_ID: OAuth client ID
OAUTH_CLIENT_SECRET: OAuth client secret
OAUTH_BASE_URL: Your MCP server base URL
OAUTH_AUTHORIZATION_ENDPOINT: OAuth authorization endpoint URL
OAUTH_TOKEN_ENDPOINT: OAuth token endpoint URL
OAUTH_JWKS_URI: JSON Web Key Set URI for token verification
OAUTH_ISSUER: Token issuer identifier

Optional variables:

OAUTH_AUDIENCE: Expected audience claim for token validation
OAUTH_EXTRA_AUTH_PARAMS: Additional authorization parameters as JSON object (e.g., {"scope": "openid profile"})

Example configuration:

OAUTH_ENABLED=true
OAUTH_CLIENT_ID=client-id
OAUTH_CLIENT_SECRET=client-secret
OAUTH_BASE_URL=http://localhost:8000
OAUTH_AUTHORIZATION_ENDPOINT=https://example.com/oauth/authorize
OAUTH_TOKEN_ENDPOINT=https://example.com/oauth/token
OAUTH_JWKS_URI=https://example.com/.well-known/jwks.json
OAUTH_ISSUER=https://example.com
OAUTH_AUDIENCE=client-id
OAUTH_EXTRA_AUTH_PARAMS={"scope": "openid profile"}

Run the server

uv --directory . run mcp_pinot/server.py

You should see logs indicating that the server is running.

Launch Pinot Quickstart (Optional)

Start Pinot QuickStart using docker:

docker run --name pinot-quickstart -p 2123:2123 -p 9000:9000 -p 8000:8000 -d apachepinot/pinot:latest QuickStart -type batch

Query MCP Server

uv --directory . run examples/example_client.py

This quickstart just checks all the tools and queries the airlineStats table.

Claude Desktop Integration

Open Claude's config file

vi ~/Library/Application\ Support/Claude/claude_desktop_config.json

Add an MCP server entry

{
  "mcpServers": {
      "pinot_mcp": {
          "command": "/path/to/uv",
          "args": [
              "--directory",
              "/path/to/mcp-pinot-repo",
              "run",
              "mcp_pinot/server.py"
          ],
          "env": {
            // You can also include your .env config here
          }
      }
  }
}

Replace /path/to/uv with the absolute path to the uv command, you can run which uv to figure it out.

Replace /path/to/mcp-pinot with the absolute path to the folder where you cloned this repo.

Note: you must use stdio transport when running your server to use with Claude desktop.

You could also configure environment variables here instead of the .env file, in case you want to connect to multiple pinot clusters as MCP servers.

Restart Claude Desktop

Claude will now auto-launch the MCP server on startup and recognize the new Pinot-based tools.

Using DXT Extension

Apache Pinot MCP server now supports DXT desktop extensions file

To use it, you first need to install dxt via

npm install -g @anthropic-ai/dxt

then you can run the following commands:

uv pip install -r pyproject.toml --target mcp_pinot/lib
uv pip install . --target mcp_pinot/lib 
dxt pack

After this you'll get a .dxt file in your dir. Double click on that file to install it in claude desktop

Developer

All tools are defined in the Pinot class in utils/pinot_client.py

Build

Build the project with

pip install -e ".[dev]"

Test

Test the repo with:

pytest

Build the Docker image

docker build -t mcp-pinot .

Run the container

docker run -v $(pwd)/.env:/app/.env mcp-pinot

Note: Make sure to have your .env file configured with the appropriate Pinot cluster settings before running the container.

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.github		.github
assets		assets
examples		examples
helm		helm
mcp_pinot		mcp_pinot
tests		tests
.dxtignore		.dxtignore
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Dockerfile		Dockerfile
LICENSE		LICENSE
PR_DESCRIPTION.md		PR_DESCRIPTION.md
README.md		README.md
glama.json		glama.json
manifest.json		manifest.json
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
query_simple.sh		query_simple.sh
query_tables.py		query_tables.py
run.sh		run.sh
simple_query_builtin.py		simple_query_builtin.py
table_filters.yaml.example		table_filters.yaml.example
test_query.sh		test_query.sh
test_rest_api.sh		test_rest_api.sh
user_guide.md		user_guide.md
uv.lock		uv.lock

License

etsy/mcp-pinot

Folders and files

Latest commit

History

Repository files navigation