
SnapScribe

A macOS menu bar app that captures screenshots and converts them to searchable, formatted text using 100% local ML inference. No cloud APIs, no subscriptions, no data leaving your machine.

Built for academics, developers, researchers, and knowledge workers who deal with technical content — code, math, structured documents — and value privacy.

How It Works

Capture → Process → Store → Search
  1. Capture a region, window, or full screen via ScreenCaptureKit
  2. Process with local ML — Apple Vision OCR, IBM Docling, or Google Gemma
  3. Store automatically in a local SQLite database with full-text search
  4. Search across all your captures instantly

Processing Modes

Hold modifier keys during capture to select a pipeline:

| Mode | Modifier | Pipeline | Best For |
| --- | --- | --- | --- |
| Vision | (none) | Apple Vision OCR | Fast narrative text |
| Docling | ⌥ Option | granite-docling-258M | Tables, forms, structured docs |
| Vision + Gemma | ⇧ Shift | Vision OCR → Gemma LLM | Enhanced formatting, LaTeX |
| Docling + Gemma | ⌥⇧ Both | Docling → Gemma LLM | Academic papers, complex LaTeX |

Gemma is available in two sizes (4B and 12B) — selectable in Settings.
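
The modifier-to-pipeline selection above can be sketched as a simple lookup. This is illustrative Python only — the key names and `select_pipeline` are hypothetical; the app implements this in Swift:

```python
# Illustrative mapping of held modifier keys to processing pipelines.
# Names are hypothetical; SnapScribe implements this selection in Swift.
PIPELINES = {
    frozenset(): "vision",                            # no modifier
    frozenset({"option"}): "docling",                 # ⌥
    frozenset({"shift"}): "vision+gemma",             # ⇧
    frozenset({"option", "shift"}): "docling+gemma",  # ⌥⇧
}

def select_pipeline(held_modifiers):
    """Map the set of modifier keys held during capture to a pipeline."""
    return PIPELINES[frozenset(held_modifiers)]

print(select_pipeline(["option", "shift"]))  # docling+gemma
```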

Comparison Mode

Enable in Settings to run all 6 pipelines in parallel on a single capture (Vision, Docling, each with no LLM / Gemma 4B / Gemma 12B) and compare results side-by-side.
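
The six pipelines are just the Cartesian product of the two OCR backends and three LLM options — a quick enumeration (illustrative names):

```python
from itertools import product

# Two OCR backends crossed with three LLM stages = six comparison pipelines.
backends = ["vision", "docling"]
llm_stages = [None, "gemma-3-4b", "gemma-3-12b"]

pipelines = list(product(backends, llm_stages))
print(len(pipelines))  # 6
```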

Requirements

  • macOS 14.0+ (Sonoma)
  • Apple Silicon (M1 or later) — required for MLX inference
  • Xcode 15+ with Swift 5.9
  • Python 3.12+ (via pyenv)
  • ~8 GB disk for ML models
  • ~6.5 GB RAM if using comparison mode (both Gemma models loaded)

Setup

1. Python Environment

pyenv install 3.12.10
pyenv shell 3.12.10
pip install mlx-lm mlx-vlm pillow docling-core huggingface-hub

Or use the setup script:

./scripts/dev_setup.sh

2. Download Models

cd models/

# Document understanding (~500MB)
huggingface-cli download ibm-granite/granite-docling-258M-mlx \
  --local-dir granite-docling-258M-mlx

# Gemma 3 4B — lighter, faster (~2GB)
huggingface-cli download mlx-community/gemma-3-4b-it-4bit \
  --local-dir gemma-3-4b-it-4bit

# Gemma 3 12B — higher quality (~4GB)
huggingface-cli download mlx-community/gemma-3-12b-it-4bit \
  --local-dir gemma-3-12b-it-4bit

Or use the download script:

./scripts/download_model.sh

3. Build & Run

xcodebuild -scheme SnapScribe -configuration Debug build

Then open the built app:

open ~/Library/Developer/Xcode/DerivedData/SnapScribe-*/Build/Products/Debug/SnapScribe.app

Or open SnapScribe.xcodeproj in Xcode and hit ⌘R.

Note: Debug builds use hardcoded paths to ~/.pyenv/versions/3.12.10/bin/python3.12 and the local models/ directory. See DoclingProcessor.swift, GemmaProcessor.swift, and GemmaModel.swift.

Architecture

┌─────────────────────────────────────────────────────────┐
│                  SwiftUI Menu Bar App                   │
│                                                         │
│   MenuBarContentView ── AppState ── LibraryWindow       │
│        (Capture/History)     │      (3-column browser)  │
│                              │                          │
│                     ProcessorManager                    │
│                      (Singleton)                        │
└──────────────────────────┬──────────────────────────────┘
                           │
              ┌────────────┼────────────────┐
              │            │                │
        ┌─────▼─────┐  ┌───▼──────────┐  ┌──▼──────────┐
        │ VisionOCR │  │ Docling      │  │ Gemma       │
        │ (native)  │  │ (Python IPC) │  │ (Python IPC)│
        └─────┬─────┘  └──────────────┘  └─────────────┘
              │
        ┌─────▼──────┐
        │CaptureStore│ ← SQLite + FTS5
        └────────────┘

Key Directories

| Directory | Contents |
| --- | --- |
| SnapScribe/App/ | UI views — menu bar, history, library, comparison |
| SnapScribe/Capture/ | ScreenCaptureKit integration, region selection |
| SnapScribe/Inference/ | ML processors — Vision, Docling, Gemma, embeddings |
| SnapScribe/Storage/ | SQLite database, data models, FTS5 search |
| SnapScribe/Settings/ | Settings window |
| SnapScribe/Resources/ | Python inference servers |
| models/ | ML model weights (git-ignored, ~8GB) |
| scripts/ | Dev setup, model download, bundling, signing |

Python IPC

Swift communicates with Python inference servers over stdin/stdout JSON:

Request:  {"id": "uuid", "action": "convert|enhance|ping", "image": "base64", ...}
Response: {"id": "uuid", "status": "success|error", "markdown": "...", ...}

Servers emit {"status": "ready"} once the model is loaded into memory. Models stay resident via ProcessorManager — first call takes 5-10s, subsequent calls are fast.
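
A minimal Python client for this protocol might look like the sketch below. The function names, the server launch command, and any payload fields beyond `id`/`action`/`image` are assumptions based on the shapes shown above, not the app's actual code:

```python
import base64
import json
import subprocess
import uuid

def build_request(action, image_bytes):
    """Build one JSON-line request in the shape shown above (illustrative)."""
    return {
        "id": str(uuid.uuid4()),
        "action": action,
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }

def start_server(cmd):
    """Spawn an inference server and block until its ready handshake."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    # Server prints {"status": "ready"} once the model is resident in memory.
    if json.loads(proc.stdout.readline()).get("status") != "ready":
        raise RuntimeError("server failed to load its model")
    return proc

def convert(proc, image_bytes):
    """Send a convert request over stdin and read the JSON reply from stdout."""
    proc.stdin.write(json.dumps(build_request("convert", image_bytes)) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())
```

Keeping the subprocess alive between calls is what makes the model stay resident — only the first request pays the load cost.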

Database

Stored at ~/Library/Application Support/SnapScribe/captures.db (SQLite, auto-migrating schema).

  • captures — screenshot metadata, OCR text, enhanced text, user edits, notes, tags, thumbnails
  • folders — hierarchical organization
  • comparison_results — parallel pipeline comparison data
  • captures_fts — FTS5 virtual table for full-text search across all text fields
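
The FTS5 search can be exercised directly with Python's built-in `sqlite3` module. A throwaway in-memory sketch — the real `captures_fts` schema indexes more columns than the two shown here:

```python
import sqlite3

# In-memory stand-in for captures.db; the real table indexes more fields.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE captures_fts USING fts5(ocr_text, notes)")
db.execute(
    "INSERT INTO captures_fts VALUES (?, ?)",
    ("def fibonacci(n): ...", "screenshot of recursion lecture"),
)
db.execute(
    "INSERT INTO captures_fts VALUES (?, ?)",
    ("E = mc^2", "physics slide"),
)

# MATCH runs a full-text query across all indexed columns.
rows = db.execute(
    "SELECT notes FROM captures_fts WHERE captures_fts MATCH ?", ("recursion",)
).fetchall()
print(rows)  # [('screenshot of recursion lecture',)]
```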

Scripts

| Script | Purpose |
| --- | --- |
| scripts/dev_setup.sh | Full dev environment setup (validates Apple Silicon, macOS, Python) |
| scripts/download_model.sh | Download ML models from HuggingFace |
| scripts/bundle_python.sh | Bundle Python runtime for distribution |
| scripts/create_dmg.sh | Create distributable DMG |
| scripts/sign_app.sh | Code signing |

Known Limitations

  • Debug paths are hardcoded — Python and model paths point to the dev machine. Release builds will need a bundled runtime.
  • App Sandbox disabled in Debug to allow Python subprocess access.
  • Not App Store distributable — requires screen recording permission (direct distribution only).
  • No global keyboard shortcuts yet — KeyboardShortcuts package is integrated but not wired up.

License

All rights reserved.
