
🧠 Casual MCP


Casual MCP is a Python framework for building, evaluating, and serving LLMs with tool-calling capabilities using Model Context Protocol (MCP).
It includes:

  • ✅ A multi-server MCP client using FastMCP
  • ✅ Provider support for OpenAI (and OpenAI-compatible APIs)
  • ✅ A recursive tool-calling chat loop
  • ✅ System prompt templating with Jinja2
  • ✅ A basic API exposing a chat endpoint

✨ Features

  • Plug-and-play multi-server tool orchestration
  • Prompt templating with Jinja2
  • Configurable via JSON
  • CLI and API access
  • Extensible architecture

🔧 Installation

pip install casual-mcp

Or for development:

git clone https://github.com/AlexStansfield/casual-mcp.git
cd casual-mcp
uv pip install -e .[dev]

🧩 Providers

Providers allow access to LLMs. Currently, only an OpenAI provider is included. However, the model configuration accepts an optional endpoint, letting you use any OpenAI-compatible API (e.g., LM Studio).
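
For example, pointing a model at LM Studio's local server can look like the sketch below. It uses the exported OpenAIModelConfig model described under Programmatic Usage; the endpoint shown is LM Studio's default and should be adjusted to your setup.

from casual_mcp.models import OpenAIModelConfig

# Any OpenAI-compatible backend can be used by overriding the endpoint
lm_studio_model = OpenAIModelConfig(
    model="qwen3-8b",
    endpoint="http://localhost:1234/v1",
)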

Ollama support is planned for a future version, along with support for custom pluggable providers via a standard interface.

🧩 System Prompt Templates

System prompts are defined as Jinja2 templates in the prompt-templates/ directory.

They are used in the config file to specify a system prompt to use per model.

This allows you to define custom prompts for each model, which is useful when using models that do not natively support tools. Templates receive the tool list in the tools variable.

# prompt-templates/example_prompt.j2
Here is a list of functions in JSON format that you can invoke:
[
{% for tool in tools %}
  {
    "name": "{{ tool.name }}",
    "description": "{{ tool.description }}",
    "parameters": {
    {% for param_name, param in tool.inputSchema.items() %}
      "{{ param_name }}": {
        "description": "{{ param.description }}",
        "type": "{{ param.type }}"{% if param.default is defined %},
        "default": "{{ param.default }}"{% endif %}
      }{% if not loop.last %},{% endif %}
    {% endfor %}
    }
  }{% if not loop.last %},{% endif %}
{% endfor %}
]
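
To preview what a template produces, you can render it directly with Jinja2. This is only a sketch: the tool entry below is a made-up stand-in for whatever your MCP servers actually expose through the tools variable.

from jinja2 import Environment, FileSystemLoader

# Hypothetical tool shaped the way the template expects (name, description, inputSchema)
tools = [
    {
        "name": "get_time",
        "description": "Get the current time for a city",
        "inputSchema": {
            "city": {"description": "City name", "type": "string"},
        },
    }
]

env = Environment(loader=FileSystemLoader("prompt-templates"))
template = env.get_template("example_prompt.j2")
print(template.render(tools=tools))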

βš™οΈ Configuration File (casual_mcp_config.json)

📄 See the Programmatic Usage section to build configs and messages with typed models.

The CLI and API can be configured using a casual_mcp_config.json file that defines:

  • 🔧 Available models and their providers
  • 🧰 Available MCP tool servers
  • 🧩 Optional tool namespacing behavior

🔸 Example

{
  "models": {
    "lm-qwen-3": {
      "provider": "openai",
      "endpoint": "http://localhost:1234/v1",
      "model": "qwen3-8b",
      "template": "lm-studio-native-tools"
    },
    "gpt-4.1": {
        "provider": "openai",
        "model": "gpt-4.1"
    }
  },
  "servers": {
    "time": {
      "command": "python",
      "args": ["mcp-servers/time/server.py"]
    },
    "weather": {
      "url": "http://localhost:5050/mcp"
    }
  }
}

🔹 models

Each model has:

  • provider: "openai" (more to come)
  • model: the model name (e.g., gpt-4.1, qwen3-8b)
  • endpoint: required for custom OpenAI-compatible backends (e.g., LM Studio)
  • template: optional name used to apply model-specific tool calling formatting

🔹 servers

Servers can either be local (over stdio) or remote.

Local Config:

  • command: the command used to run the server, e.g. python, npm
  • args: the arguments to pass to the server, as a list, e.g. ["time/server.py"]
  • Optional: env (environment variables for the subprocess) and system_prompt (overrides the server's system prompt)

Remote Config:

  • url: the URL of the MCP server
  • Optional: transport: the transport type (http, sse, or streamable-http). Defaults to http
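
As a sketch, the same local and remote options can be expressed with the exported server config models described under Programmatic Usage; the env, system_prompt, and transport keyword names are assumed to mirror the JSON keys above.

from casual_mcp.models import RemoteServerConfig, StdioServerConfig

# Local server run over stdio, with an optional subprocess environment and prompt override
time_server = StdioServerConfig(
    command="python",
    args=["mcp-servers/time/server.py"],
    env={"TZ": "UTC"},  # assumed keyword, mirrors the "env" key
    system_prompt="Only answer questions about time",  # assumed keyword
)

# Remote server reached over HTTP
weather_server = RemoteServerConfig(
    url="http://localhost:5050/mcp",
    transport="http",  # assumed keyword, defaults to http
)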

Environment Variables

There are two environment variables:

  • OPEN_AI_API_KEY: required when using the openai provider; if you are using a local model with an OpenAI-compatible API it can be any string
  • TOOL_RESULT_FORMAT: adjusts the format of the tool result given back to the LLM. Options are result, function_result, and function_args_result. Defaults to result

You can set them using export or by creating a .env file.
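
If you are driving the framework from Python rather than the CLI, the same variables can also be set in-process before anything else runs. A minimal sketch (the values are placeholders; use your real API key when targeting the hosted OpenAI API):

import os

# Placeholder values; any string works for a local OpenAI-compatible backend
os.environ.setdefault("OPEN_AI_API_KEY", "not-needed-for-local-models")
os.environ.setdefault("TOOL_RESULT_FORMAT", "function_result")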

🛠 CLI Reference

casual-mcp serve

Start the API server.

Options:

  • --host: Host to bind (default 0.0.0.0)
  • --port: Port to serve on (default 8000)

casual-mcp servers

Loads the config and outputs the list of MCP servers you have configured.

Example Output

$ casual-mcp servers
┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━┓
┃ Name    ┃ Type   ┃ Command / Url                 ┃ Env ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━┩
│ math    │ local  │ mcp-servers/math/server.py    │     │
│ time    │ local  │ mcp-servers/time-v2/server.py │     │
│ weather │ local  │ mcp-servers/weather/server.py │     │
│ words   │ remote │ https://localhost:3000/mcp    │     │
└─────────┴────────┴───────────────────────────────┴─────┘

casual-mcp models

Loads the config and outputs the list of models you have configured.

Example Output

$ casual-mcp models
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name              ┃ Provider ┃ Model                     ┃ Endpoint               ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ lm-phi-4-mini     │ openai   │ phi-4-mini-instruct       │ http://kovacs:1234/v1  │
│ lm-hermes-3       │ openai   │ hermes-3-llama-3.2-3b     │ http://kovacs:1234/v1  │
│ lm-groq           │ openai   │ llama-3-groq-8b-tool-use  │ http://kovacs:1234/v1  │
│ gpt-4o-mini       │ openai   │ gpt-4o-mini               │                        │
│ gpt-4.1-nano      │ openai   │ gpt-4.1-nano              │                        │
│ gpt-4.1-mini      │ openai   │ gpt-4.1-mini              │                        │
│ gpt-4.1           │ openai   │ gpt-4.1                   │                        │
└───────────────────┴──────────┴───────────────────────────┴────────────────────────┘

🧠 Programmatic Usage

You can import and use the core framework in your own Python code.

✅ Exposed Interfaces

McpToolChat

Orchestrates LLM interaction with tools using a recursive loop.

from casual_mcp import McpToolChat
from casual_mcp.tool_cache import ToolCache
from casual_mcp.models import SystemMessage, UserMessage

tool_cache = ToolCache(mcp_client)
chat = McpToolChat(mcp_client, provider, system_prompt, tool_cache=tool_cache)

# The generate method takes a user prompt
response = await chat.generate("What time is it in London?")

# Generate method with session
response = await chat.generate("What time is it in London?", "my-session-id")

# The chat method takes a list of chat messages
# note: the constructor's system prompt is ignored when a system message is included, so it is not set here
chat = McpToolChat(mcp_client, provider, tool_cache=tool_cache)
messages = [
  SystemMessage(content="You are a cool dude who likes to help the user"),
  UserMessage(content="What time is it in London?")
]
response = await chat.chat(messages)

ProviderFactory

Instantiates LLM providers based on the selected model config.

from casual_mcp import ProviderFactory

provider_factory = ProviderFactory(mcp_client, tool_cache=tool_cache)
provider = await provider_factory.get_provider("lm-qwen-3", model_config)

ℹ️ Tool catalogues are cached to avoid repeated ListTools calls. The cache refreshes every 30 seconds by default. Override this with the MCP_TOOL_CACHE_TTL environment variable (set to 0 or a negative value to cache indefinitely).
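
For example, to lengthen the cache window you could set the variable before building a ToolCache. A minimal sketch, assuming the variable is read when the cache is created and that mcp_client comes from load_mcp_client as shown below:

import os

from casual_mcp.tool_cache import ToolCache

# Cache the tool catalogue for five minutes instead of the default 30 seconds
os.environ["MCP_TOOL_CACHE_TTL"] = "300"
tool_cache = ToolCache(mcp_client)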

load_config

Loads your casual_mcp_config.json into a validated config object.

from casual_mcp import load_config

config = load_config("casual_mcp_config.json")

load_mcp_client

Creates a multi-server FastMCP client from the config object.

from casual_mcp import load_mcp_client

mcp_client = load_mcp_client(config)

Model and Server Configs

Exported models:

  • StdioServerConfig
  • RemoteServerConfig
  • OpenAIModelConfig

Use these types to build valid configs:

from casual_mcp.models import OpenAIModelConfig, StdioServerConfig

model = OpenAIModelConfig(model="llama3", endpoint="http://...")
server = StdioServerConfig(command="python", args=["time/server.py"])

Chat Messages

Exported models:

  • AssistantMessage
  • SystemMessage
  • ToolResultMessage
  • UserMessage

Use these types to build message chains:

from casual_mcp.models import SystemMessage, UserMessage

messages = [
  SystemMessage(content="You are a friendly tool calling assistant."),
  UserMessage(content="What is the time?")
]

Example

from casual_mcp import McpToolChat, load_config, load_mcp_client, ProviderFactory
from casual_mcp.models import SystemMessage, UserMessage

model = "gpt-4.1-nano"
messages = [
  SystemMessage(content="""You are a tool calling assistant. 
You have access to up-to-date information through the tools. 
Respond naturally and confidently, as if you already know all the facts."""),
  UserMessage(content="Will I need to take my umbrella to London today?")
]

# Load the Config from the File
config = load_config("casual_mcp_config.json")

# Setup the MCP Client
mcp_client = load_mcp_client(config)

# Get the Provider for the Model
provider_factory = ProviderFactory(mcp_client)
provider = await provider_factory.get_provider(model, config.models[model])

# Perform the Chat and Tool calling
chat = McpToolChat(mcp_client, provider)
response_messages = await chat.chat(messages)

🚀 API Usage

Start the API Server

casual-mcp serve --host 0.0.0.0 --port 8000

Chat

Endpoint: POST /chat

Request Body:

  • model: the LLM model to use
  • messages: a list of chat messages (system, assistant, user, etc.). Passing the full history lets the client calling the API keep its own chat session

Example:

{
    "model": "gpt-4.1-nano",
    "messages": [
        {
            "role": "user",
            "content": "can you explain what the word consistent means?"
        }
    ]
}
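
A minimal client sketch using the requests library, assuming the API started with casual-mcp serve is running on localhost:8000 (the response schema isn't shown here, so the body is simply printed):

import requests

payload = {
    "model": "gpt-4.1-nano",
    "messages": [
        {"role": "user", "content": "can you explain what the word consistent means?"}
    ],
}

# Send the chat request to the running API server
response = requests.post("http://localhost:8000/chat", json=payload, timeout=60)
response.raise_for_status()
print(response.json())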

Generate

The generate endpoint allows you to send a user prompt as a string.

It also supports sessions, which keep a record of all messages in the session and feed them back into the LLM for context. Sessions are stored in memory, so they are cleared when the server is restarted.

Endpoint: POST /generate

Request Body:

  • model: the LLM model to use
  • prompt: the user prompt
  • session_id: an optional ID; all messages in the session are stored under it and provided back to the LLM for context

Example:

{
    "session_id": "my-session",
    "model": "gpt-4o-mini",
    "prompt": "can you explain what the word consistent means?"
}

Get Session

Get all the messages from a session

Endpoint: GET /generate/session/{session_id}
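
A sketch tying /generate and the session endpoint together, again assuming a local server on port 8000 and printing the raw JSON since the response schema isn't documented above:

import requests

BASE_URL = "http://localhost:8000"

# Send a prompt within a named session so the server keeps the message history
generate = requests.post(
    f"{BASE_URL}/generate",
    json={
        "session_id": "my-session",
        "model": "gpt-4o-mini",
        "prompt": "can you explain what the word consistent means?",
    },
    timeout=60,
)
generate.raise_for_status()
print(generate.json())

# Fetch every message recorded in that session
session = requests.get(f"{BASE_URL}/generate/session/my-session", timeout=30)
session.raise_for_status()
print(session.json())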

License

This software is released under the MIT License
