HayaoAkiyama/sim-mcp

sim-mcp

An MCP (Model Context Protocol) server that lets AI agents control the Xcode iOS Simulator — tap, type, swipe, read screens, launch apps — all through standard MCP tools.

Built as an open-source alternative to Mirroir for the iOS Simulator. Zero external dependencies beyond the MCP SDK — uses only macOS-native APIs.

How It Works

Claude Code / MCP Client
        │
        │  MCP protocol (stdio, JSON-RPC)
        ▼
   ┌─────────┐
   │ server.py│  ← FastMCP tool definitions (16 tools)
   └────┬─────┘
        │
        ▼
┌──────────────┐
│ sim_control.py│  ← Core engine
└──┬───┬───┬───┘
   │   │   │
   │   │   └── xcrun simctl ─── screenshots, app launch, URLs, clipboard
   │   │
   │   └── CoreGraphics CGEvent (ctypes) ─── tap, swipe, type, keyboard
   │
   └── Apple Vision framework (Swift OCR helper, compiled on first use) ─── describe_screen

No pip packages needed for the core. sim_control.py works standalone using only macOS system frameworks via Python ctypes. The mcp package is only needed for the MCP server wrapper.

Requirements

  • macOS (tested on macOS 15 Sequoia)
  • Xcode with iOS Simulator installed
  • Python 3.10+ (Python 3.12 recommended)
  • A booted iOS Simulator (xcrun simctl boot <device>)

Quick Start

1. Clone and set up

git clone https://github.com/HayaoAkiyama/sim-mcp.git
cd sim-mcp

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Boot a Simulator

# List available devices
xcrun simctl list devices available

# Boot one (e.g., iPhone 16 Pro)
xcrun simctl boot "iPhone 16 Pro"

# Open the Simulator app so the window is visible
open -a Simulator

3. Register with Claude Code

Add to your project's .mcp.json (or create one in your project root):

{
  "mcpServers": {
    "sim": {
      "command": "/path/to/sim-mcp/.venv/bin/python3",
      "args": ["/path/to/sim-mcp/server.py"]
    }
  }
}

Restart Claude Code. The tools will appear as mcp__sim__tap, mcp__sim__screenshot, etc.

4. Or use as a CLI

sim_control.py also works as a standalone command-line tool without the MCP server:

python3 sim_control.py screenshot /tmp/screen.png
python3 sim_control.py tap 200 400
python3 sim_control.py type "Hello world"
python3 sim_control.py describe
python3 sim_control.py launch com.apple.mobilesafari
python3 sim_control.py home
python3 sim_control.py swipe 200 600 200 200
python3 sim_control.py status

Tools

Tool             Parameters                     Description
screenshot       (none)                         Capture the screen as base64 PNG
describe_screen  (none)                         OCR the screen; returns text elements with tap coordinates
get_orientation  (none)                         Returns "portrait" or "landscape"
tap              x, y                           Tap at device-point coordinates
double_tap       x, y                           Double-tap (select text, zoom)
long_press       x, y, duration_ms              Long press (context menus, drag mode)
type_text        text                           Type text into the focused field
swipe            from_x/y, to_x/y, duration_ms  Fast swipe gesture
drag             from_x/y, to_x/y, duration_ms  Slow drag (reorder items)
scroll_to        direction, amount              Scroll up/down/left/right from center
press_home       (none)                         Go to home screen
press_key        key, modifiers                 Press any key with modifiers
launch_app       identifier                     Launch by bundle ID or app name
open_url         url                            Open URL in Safari
status           (none)                         Device info and coordinate mapping
check_health     (none)                         Verify Simulator is running
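
As an illustration of how scroll_to could reduce to a swipe that starts at the screen center, here is a minimal sketch. The function name, the choice to treat amount as device points, and the convention that the finger moves opposite to the scroll direction are all assumptions, not confirmed behavior of sim-mcp.

```python
def scroll_to_swipe(direction, amount, width, height):
    """Return (start, end) swipe points for a scroll starting at the screen center.

    Assumption: scrolling "down" reveals content further down, so the
    finger travels upward (and likewise for the other directions).
    """
    deltas = {
        "down": (0, -amount),
        "up": (0, amount),
        "right": (-amount, 0),
        "left": (amount, 0),
    }
    if direction not in deltas:
        raise ValueError(f"unknown direction: {direction!r}")
    cx, cy = width / 2, height / 2
    dx, dy = deltas[direction]
    # Clamp the end point so the swipe stays on the device screen.
    end_x = min(max(cx + dx, 0), width)
    end_y = min(max(cy + dy, 0), height)
    return (cx, cy), (end_x, end_y)
```

For an iPhone 16 Pro (402 x 874 points), scroll_to_swipe("down", 300, 402, 874) yields a swipe from the center straight up by 300 points.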

Coordinate System

All coordinates are in device points (not pixels), with (0, 0) at the top-left corner of the device screen.

Device         Width (pt)  Height (pt)  Scale
iPhone 16 Pro  402         874          3x
iPhone 16      393         852          3x
iPhone SE      375         667          2x
iPad Pro 13"   1024        1366         2x

The controller auto-detects dimensions from the booted Simulator.
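
The relationship between device points and screenshot pixels is a multiplication by the scale factor. The sketch below encodes two rows of the table; the Device class and its method names are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    width_pt: int
    height_pt: int
    scale: int

    def pixel_size(self):
        """Size of a full-resolution screenshot in pixels."""
        return (self.width_pt * self.scale, self.height_pt * self.scale)

    def contains(self, x, y):
        """True if (x, y) is a valid device-point coordinate."""
        return 0 <= x < self.width_pt and 0 <= y < self.height_pt


DEVICES = {
    "iPhone 16 Pro": Device("iPhone 16 Pro", 402, 874, 3),
    "iPhone SE": Device("iPhone SE", 375, 667, 2),
}
```

A tap at (200, 400) is therefore valid on both devices, while (402, 0) is already one point past the right edge of an iPhone 16 Pro.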

Typical Workflow

1. describe_screen  →  find UI elements and their coordinates
2. tap(x, y)        →  tap a button/field at the OCR coordinates
3. type_text("...")  →  enter text into the focused field
4. screenshot       →  verify the result
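
Steps 1 and 2 amount to searching the OCR results for a label and tapping its coordinates. This README does not document the exact shape of the describe_screen result, so the sketch below assumes a list of dicts with "text", "x" and "y" keys; both that shape and the helper name are hypothetical.

```python
def find_tap_target(elements, label):
    """Return (x, y) of the first element whose text contains label, else None.

    Matching is case-insensitive. The element shape is an assumption.
    """
    needle = label.casefold()
    for element in elements:
        if needle in element["text"].casefold():
            return element["x"], element["y"]
    return None


# Made-up OCR output for illustration only.
sample_elements = [
    {"text": "Welcome back", "x": 201, "y": 180},
    {"text": "Sign In", "x": 201, "y": 640},
]
```

With this sample data, find_tap_target(sample_elements, "sign in") gives the point to pass to tap.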

Architecture

sim_control.py — Core Engine

Input (tap, swipe, type): Uses CoreGraphics CGEvent via Python ctypes to post synthetic mouse and keyboard events directly to the macOS event system. The Simulator window receives these events as if they came from a real mouse/keyboard.
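
A swipe built from synthetic mouse events is typically a mouse-down, a series of drag events along the path, and a mouse-up. The sketch below only generates that event sequence by linear interpolation; posting the events (for example through CGEventCreateMouseEvent and CGEventPost) is left out, and the function name and step count are assumptions rather than what sim_control.py does.

```python
def swipe_events(start, end, steps=10):
    """Return a list of (kind, x, y) tuples describing a straight-line swipe.

    kind is "down", "drag" or "up". Coordinates are whatever space the
    caller uses; for posting they would be macOS screen coordinates.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    events = [("down", x0, y0)]
    for i in range(1, steps + 1):
        t = i / steps  # fraction of the path covered so far
        events.append(("drag", x0 + (x1 - x0) * t, y0 + (y1 - y0) * t))
    events.append(("up", x1, y1))
    return events
```

Spreading the drag events over duration_ms (sleeping duration_ms / steps between them) would be one way to distinguish a fast swipe from a slow drag.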

Coordinate mapping: On init, queries the Simulator window's AXGroup (the device screen view) via macOS Accessibility API to get its exact screen position and size. Maps device-point coordinates → macOS screen coordinates using scale factors derived from the AXGroup size and the device's pixel dimensions.
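
The mapping itself is a scale and an offset. A minimal sketch, assuming the AXGroup frame is already known and the device size is given in points (pixel dimensions divided by the scale factor); Frame and device_to_screen are hypothetical names.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    """Position and size of the device screen view in macOS screen coordinates."""
    x: float
    y: float
    width: float
    height: float


def device_to_screen(px, py, frame, device_width, device_height):
    """Map a device-point coordinate to a macOS screen coordinate."""
    scale_x = frame.width / device_width
    scale_y = frame.height / device_height
    return (frame.x + px * scale_x, frame.y + py * scale_y)
```

For example, with the device view at (100, 50) and drawn at half size (201 x 437) for a 402 x 874 device, the point (200, 400) maps to (200.0, 250.0) on the Mac's screen.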

Text input: Types characters one-by-one via CGEvent keystrokes. Handles uppercase (Shift flag), symbols (Shift + base key), and falls back to clipboard paste for Unicode characters not in the US keyboard layout.
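
The three cases (plain key, Shift plus base key, clipboard fallback) can be sketched as a per-character decision. The table and function names below are illustrative assumptions for a US keyboard layout; translating each base key to a virtual key code and posting it is omitted.

```python
import string

# Shifted symbol -> base key on a US layout.
SHIFTED_SYMBOLS = {
    "!": "1", "@": "2", "#": "3", "$": "4", "%": "5", "^": "6", "&": "7",
    "*": "8", "(": "9", ")": "0", "_": "-", "+": "=", "{": "[", "}": "]",
    "|": "\\", ":": ";", '"': "'", "<": ",", ">": ".", "?": "/", "~": "`",
}
UNSHIFTED_KEYS = set(string.ascii_lowercase + string.digits + " -=[]\\;',./`\n\t")


def plan_keystroke(ch):
    """Return (action, key, shift) for one character.

    action is "key" for a direct keystroke or "paste" for the clipboard fallback.
    """
    if ch in UNSHIFTED_KEYS:
        return ("key", ch, False)
    if ch in string.ascii_uppercase:
        return ("key", ch.lower(), True)
    if ch in SHIFTED_SYMBOLS:
        return ("key", SHIFTED_SYMBOLS[ch], True)
    return ("paste", ch, False)


def plan_text(text):
    """Plan every character of text in order."""
    return [plan_keystroke(ch) for ch in text]
```

Under these assumptions, "Hi!" becomes Shift+h, i, Shift+1, while a character such as "é" falls through to the paste path.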

OCR (describe_screen): Compiles a Swift helper (/tmp/sim_ocr) on first use that runs Apple's Vision framework VNRecognizeTextRequest. Returns text bounding boxes with pixel coordinates, which are converted to device points.
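
Vision reports bounding boxes normalized to 0..1 with the origin at the bottom-left, so getting a tap point involves a y-axis flip, a scale to pixels, and a division by the device scale factor. The sketch below shows that arithmetic in Python; this README does not say whether the flip happens in the Swift helper or in sim_control.py, and the function names are hypothetical.

```python
def normalized_to_pixel_box(nbox, image_width, image_height):
    """Convert a normalized (x, y, w, h) box with bottom-left origin
    into a pixel box with top-left origin."""
    nx, ny, nw, nh = nbox
    x = nx * image_width
    y = (1.0 - ny - nh) * image_height  # flip the y axis
    return (x, y, nw * image_width, nh * image_height)


def box_center_in_points(pixel_box, scale):
    """Return the center of a pixel box, expressed in device points."""
    x, y, w, h = pixel_box
    return ((x + w / 2) / scale, (y + h / 2) / scale)
```

The center in device points is what a caller would pass to tap.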

Screenshots: Uses xcrun simctl io <udid> screenshot for pixel-perfect device screenshots (no window chrome).
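
A minimal sketch of capturing a screenshot and returning it as base64, as the screenshot tool is described to do. It shells out to the simctl command named above; the function names and the use of a temporary file are assumptions, and "booted" is simctl's alias for the currently booted device.

```python
import base64
import subprocess
import tempfile
from pathlib import Path


def screenshot_command(udid, out_path):
    """Build the simctl argv that writes a PNG screenshot to out_path."""
    return ["xcrun", "simctl", "io", udid, "screenshot", out_path]


def capture_base64_png(udid="booted"):
    """Capture the device screen and return the PNG as a base64 string.

    Requires macOS with Xcode and a booted Simulator.
    """
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "screen.png"
        subprocess.run(
            screenshot_command(udid, str(out)),
            check=True,
            capture_output=True,
        )
        return base64.b64encode(out.read_bytes()).decode("ascii")
```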

server.py — MCP Wrapper

Thin wrapper using the MCP Python SDK's FastMCP class. Each tool is a decorated function that delegates to SimController. Communicates via stdio using JSON-RPC 2.0.
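
On the wire, a tool invocation is a JSON-RPC 2.0 request with the MCP method tools/call, sent as a single line over stdio. The sketch below builds such a message for the tap tool; the helper name is hypothetical, and a real session also needs the MCP initialize handshake first, which an MCP client normally handles.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Serialize an MCP tools/call request as one line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no newlines by default, as the stdio transport requires.
    return json.dumps(message)


if __name__ == "__main__":
    print(tool_call_request(1, "tap", {"x": 200, "y": 400}))
```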

Troubleshooting

"No booted simulator found"

Boot a Simulator first (xcrun simctl boot "iPhone 16 Pro"), then open the Simulator app (open -a Simulator).

Text input produces Japanese/wrong characters

The Simulator's iOS keyboard must be set to English. Fix by resetting keyboard preferences:

UDID=$(xcrun simctl list devices booted -j | python3 -c "import sys,json; d=json.load(sys.stdin); print([x['udid'] for v in d['devices'].values() for x in v if x['state']=='Booted'][0])")

xcrun simctl spawn $UDID defaults write -g AppleKeyboards -array "en_US@sw=QWERTY;hw=Automatic"
xcrun simctl spawn $UDID defaults write -g ApplePasscodeKeyboards -array "en_US@sw=QWERTY;hw=Automatic" "emoji@sw=Emoji"
xcrun simctl spawn $UDID defaults write com.apple.keyboard.preferences KeyboardLastUsed -string "en_US@sw=QWERTY;hw=Automatic"
xcrun simctl spawn $UDID launchctl stop com.apple.SpringBoard
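
The UDID lookup in the first command can be written out more readably. The sketch below parses the same JSON (a "devices" object mapping each runtime to a list of devices) and raises a clear error when nothing is booted; the function names are hypothetical.

```python
import json
import subprocess


def booted_udids(simctl_json):
    """Return the UDIDs of all booted devices in simctl's JSON listing."""
    data = json.loads(simctl_json)
    return [
        device["udid"]
        for runtime_devices in data["devices"].values()
        for device in runtime_devices
        if device.get("state") == "Booted"
    ]


def first_booted_udid():
    """Query simctl and return the first booted UDID (macOS with Xcode only)."""
    result = subprocess.run(
        ["xcrun", "simctl", "list", "devices", "booted", "-j"],
        check=True,
        capture_output=True,
        text=True,
    )
    udids = booted_udids(result.stdout)
    if not udids:
        raise RuntimeError("No booted simulator found")
    return udids[0]
```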

Taps land in the wrong place

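Run python3 sim_control.py calibrate to tap the four corners and the center of the screen. If taps are offset, the Simulator window may have been resized since startup. The controller recalculates window geometry each time it initializes, so restart the server (or rerun the CLI command) after resizing.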
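
For reference, a set of calibration targets for a given device size could be computed as below. This is only a sketch: the 20-point inset from each edge and the function name are assumptions, and the points the calibrate command actually uses may differ.

```python
def calibration_points(width, height, inset=20):
    """Return four inset corner points plus the center, in device points."""
    left, top = inset, inset
    right, bottom = width - inset, height - inset
    return [
        (left, top),
        (right, top),
        (left, bottom),
        (right, bottom),
        (width // 2, height // 2),
    ]
```

Tapping each point and checking where the touch lands in a screenshot shows whether the offset is constant (a window position problem) or grows toward the edges (a scale problem).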

"Connect Hardware Keyboard" must be enabled

In Simulator menu: I/O > Keyboard > Connect Hardware Keyboard (should be checked). This allows CGEvent keyboard events to reach iOS.

License

MIT
