An MCP (Model Context Protocol) server that lets AI agents control the Xcode iOS Simulator — tap, type, swipe, read screens, launch apps — all through standard MCP tools.
Built as an open-source alternative to Mirroir for the iOS Simulator. Zero external dependencies beyond the MCP SDK — uses only macOS-native APIs.
```
Claude Code / MCP Client
        │
        │  MCP protocol (stdio, JSON-RPC)
        ▼
  ┌───────────┐
  │ server.py │ ← FastMCP tool definitions (16 tools)
  └─────┬─────┘
        │
        ▼
  ┌────────────────┐
  │ sim_control.py │ ← Core engine
  └──┬────┬────┬───┘
     │    │    │
     │    │    └── xcrun simctl ─── screenshots, app launch, URLs, clipboard
     │    │
     │    └── CoreGraphics CGEvent (ctypes) ─── tap, swipe, type, keyboard
     │
     └── Apple Vision framework (Swift OCR helper, compiled on first use) ─── describe_screen
```
No pip packages are needed for the core: `sim_control.py` works standalone using only macOS system frameworks via Python ctypes. The `mcp` package is needed only for the MCP server wrapper.
- macOS (tested on macOS 15 Sequoia)
- Xcode with iOS Simulator installed
- Python 3.10+ (Python 3.12 recommended)
- A booted iOS Simulator (`xcrun simctl boot <device>`)
```bash
git clone https://github.com/HayaoAkiyama/sim-mcp.git
cd sim-mcp
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

```bash
# List available devices
xcrun simctl list devices available

# Boot one (e.g., iPhone 16 Pro)
xcrun simctl boot "iPhone 16 Pro"

# Open the Simulator app so the window is visible
open -a Simulator
```

Add to your project's `.mcp.json` (or create one in your project root):
```json
{
  "mcpServers": {
    "sim": {
      "command": "/path/to/sim-mcp/.venv/bin/python3",
      "args": ["/path/to/sim-mcp/server.py"]
    }
  }
}
```

Restart Claude Code. The tools will appear as `mcp__sim__tap`, `mcp__sim__screenshot`, etc.
`sim_control.py` also works as a standalone command-line tool without the MCP server:

```bash
python3 sim_control.py screenshot /tmp/screen.png
python3 sim_control.py tap 200 400
python3 sim_control.py type "Hello world"
python3 sim_control.py describe
python3 sim_control.py launch com.apple.mobilesafari
python3 sim_control.py home
python3 sim_control.py swipe 200 600 200 200
python3 sim_control.py status
```

| Tool | Parameters | Description |
|---|---|---|
| `screenshot` | — | Capture the screen as base64 PNG |
| `describe_screen` | — | OCR the screen; returns text elements with tap coordinates |
| `get_orientation` | — | Returns "portrait" or "landscape" |
| `tap` | `x`, `y` | Tap at device-point coordinates |
| `double_tap` | `x`, `y` | Double-tap (select text, zoom) |
| `long_press` | `x`, `y`, `duration_ms` | Long press (context menus, drag mode) |
| `type_text` | `text` | Type text into the focused field |
| `swipe` | `from_x/y`, `to_x/y`, `duration_ms` | Fast swipe gesture |
| `drag` | `from_x/y`, `to_x/y`, `duration_ms` | Slow drag (reorder items) |
| `scroll_to` | `direction`, `amount` | Scroll up/down/left/right from center |
| `press_home` | — | Go to home screen |
| `press_key` | `key`, `modifiers` | Press any key with modifiers |
| `launch_app` | `identifier` | Launch by bundle ID or app name |
| `open_url` | `url` | Open URL in Safari |
| `status` | — | Device info and coordinate mapping |
| `check_health` | — | Verify Simulator is running |
All coordinates are in device points (not pixels), with (0, 0) at the top-left corner of the device screen.
| Device | Width | Height | Scale |
|---|---|---|---|
| iPhone 16 Pro | 402 | 874 | 3x |
| iPhone 16 | 393 | 852 | 3x |
| iPhone SE | 375 | 667 | 2x |
| iPad Pro 13" | 1024 | 1366 | 2x |
The controller auto-detects dimensions from the booted Simulator.
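Points and pixels differ only by the scale factor. A quick sanity check in Python, using the table above (the helper name is illustrative):

```python
def points_to_pixels(width_pt: int, height_pt: int, scale: int) -> tuple[int, int]:
    """Convert device-point dimensions to pixel dimensions.

    Screenshots taken via `xcrun simctl io ... screenshot` are in pixels,
    while all tool coordinates are in points.
    """
    return width_pt * scale, height_pt * scale

# iPhone 16 Pro: 402x874 points at 3x -> 1206x2622-pixel screenshots
print(points_to_pixels(402, 874, 3))  # -> (1206, 2622)
```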
1. `describe_screen` → find UI elements and their coordinates
2. `tap(x, y)` → tap a button/field at the OCR coordinates
3. `type_text("...")` → enter text into the focused field
4. `screenshot` → verify the result
Input (tap, swipe, type): Uses CoreGraphics CGEvent via Python ctypes to post synthetic mouse and keyboard events directly to the macOS event system. The Simulator window receives these events as if they came from a real mouse/keyboard.
Coordinate mapping: On init, queries the Simulator window's AXGroup (the device screen view) via macOS Accessibility API to get its exact screen position and size. Maps device-point coordinates → macOS screen coordinates using scale factors derived from the AXGroup size and the device's pixel dimensions.
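The mapping itself is linear in each axis. A minimal sketch of the arithmetic with hypothetical parameter names; the real controller reads the AXGroup frame via the Accessibility API rather than taking it as arguments:

```python
def device_point_to_screen(
    x_pt: float, y_pt: float,
    ax_origin: tuple[float, float],    # AXGroup top-left in macOS screen coords
    ax_size: tuple[float, float],      # AXGroup width/height on screen
    device_size: tuple[float, float],  # device logical size in device points
) -> tuple[float, float]:
    """Map a device-point coordinate to a macOS screen coordinate.

    The Simulator window may render the device scaled down, so the scale
    factors are derived from the AXGroup size vs. the device size.
    """
    sx = ax_size[0] / device_size[0]
    sy = ax_size[1] / device_size[1]
    return ax_origin[0] + x_pt * sx, ax_origin[1] + y_pt * sy

# Screen center of an iPhone 16 Pro, window at (100, 50), shown at half size:
print(device_point_to_screen(201, 437, (100, 50), (201, 437), (402, 874)))
# -> (200.5, 268.5)
```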
Text input: Types characters one-by-one via CGEvent keystrokes. Handles uppercase (Shift flag), symbols (Shift + base key), and falls back to clipboard paste for Unicode characters not in the US keyboard layout.
OCR (describe_screen): Compiles a Swift helper (`/tmp/sim_ocr`) on first use that runs Apple's Vision framework `VNRecognizeTextRequest`. Returns text bounding boxes with pixel coordinates, which are converted to device points.
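Vision reports bounding boxes in a normalized space with a bottom-left origin, so the y axis must be flipped on the way to top-left device points. A sketch of that conversion (a simplification: the actual helper goes through pixel coordinates first):

```python
def vision_box_to_device_point(
    bbox: tuple[float, float, float, float],  # (x, y, w, h), normalized, bottom-left origin
    device_w_pt: float, device_h_pt: float,
) -> tuple[float, float]:
    """Convert a Vision-style normalized bounding box to a tap point
    (the box center) in device points with a top-left origin.

    Vision's boundingBox uses a normalized, bottom-left-origin space,
    so the y coordinate is flipped."""
    x, y, w, h = bbox
    cx = (x + w / 2) * device_w_pt
    cy = (1 - (y + h / 2)) * device_h_pt
    return cx, cy

# A box spanning the bottom-left quadrant of an iPhone 16 Pro screen:
print(vision_box_to_device_point((0.0, 0.0, 0.5, 0.5), 402, 874))
# -> (100.5, 655.5)
```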
Screenshots: Uses `xcrun simctl io <udid> screenshot` for pixel-perfect device screenshots (no window chrome).
Thin wrapper using the MCP Python SDK's FastMCP class. Each tool is a decorated function that delegates to SimController. Communicates via stdio using JSON-RPC 2.0.
Boot a Simulator first: `xcrun simctl boot "iPhone 16 Pro"`, then open the Simulator app: `open -a Simulator`.
The Simulator's iOS keyboard must be set to English. Fix by resetting keyboard preferences:
```bash
UDID=$(xcrun simctl list devices booted -j | python3 -c "import sys,json; d=json.load(sys.stdin); print([x['udid'] for v in d['devices'].values() for x in v if x['state']=='Booted'][0])")
xcrun simctl spawn $UDID defaults write -g AppleKeyboards -array "en_US@sw=QWERTY;hw=Automatic"
xcrun simctl spawn $UDID defaults write -g ApplePasscodeKeyboards -array "en_US@sw=QWERTY;hw=Automatic" "emoji@sw=Emoji"
xcrun simctl spawn $UDID defaults write com.apple.keyboard.preferences KeyboardLastUsed -string "en_US@sw=QWERTY;hw=Automatic"
xcrun simctl spawn $UDID launchctl stop com.apple.SpringBoard
```

Run `python3 sim_control.py calibrate` to tap the four corners and center. If taps are offset, the Simulator window may have been resized — the controller recalculates geometry on each init.
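The inline `python3 -c` snippet used to find the booted UDID can be written as a small standalone helper; a sketch of the same JSON walk (function names are illustrative):

```python
import json
import subprocess

def booted_udids() -> list[str]:
    """Return UDIDs of all booted Simulators, parsed from
    `xcrun simctl list devices booted -j`."""
    out = subprocess.run(
        ["xcrun", "simctl", "list", "devices", "booted", "-j"],
        capture_output=True, text=True, check=True,
    ).stdout
    return extract_booted(json.loads(out))

def extract_booted(data: dict) -> list[str]:
    # "devices" maps runtime name -> list of device dicts
    return [
        d["udid"]
        for devices in data["devices"].values()
        for d in devices
        if d["state"] == "Booted"
    ]
```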
In Simulator menu: I/O > Keyboard > Connect Hardware Keyboard (should be checked). This allows CGEvent keyboard events to reach iOS.
MIT