Convert web pages to Markdown using the Cloudflare Workers AI toMarkdown API.
Features:
- Fetches any URL and converts the content to a clean
.mdfile - Recursive crawling — follow links to a configurable depth
- Smart link filtering — skips nav, footers, login pages, file downloads, etc.
- Post-processing — strips boilerplate ("Skip to main content", "On this page") and collapses excess blank lines
- Filename generation from page titles with collision avoidance
- Rust toolchain (1.85+, edition 2024)
- A Cloudflare account with access to the Workers AI
toMarkdownAPI - Cloudflare Account ID and API Token
git clone <repo-url> && cd url2md
cargo build --releaseThe binary is produced at ./target/release/url2md.
Provide your Cloudflare credentials via environment variables or CLI flags:
# environment variables (recommended)
export CLOUDFLARE_ACCOUNT_ID="your-account-id"
export CLOUDFLARE_API_TOKEN="your-api-token"
# or pass as flags
url2md --account-id <ID> --api-token <TOKEN> <URL>url2md https://example.com -o ./output# immediate links (depth 1)
url2md https://example.com -o ./output -r 1
# two levels deep
url2md https://example.com -o ./output -r 2cargo run -- https://example.com -o ./output
cargo run -- https://example.com -o ./output -r 2url2md [OPTIONS] <URL>
Arguments:
<URL> URL to fetch and convert
Options:
-o, --output <PATH> Output directory [default: .]
-r, --recursive <DEPTH> Recursively convert linked pages [default: 0]
--account-id <ID> Cloudflare account ID (or CLOUDFLARE_ACCOUNT_ID)
--api-token <TOKEN> Cloudflare API token (or CLOUDFLARE_API_TOKEN)
-h, --help Print help
-V, --version Print version
cargo build # debug build
cargo build --release # optimised release build
cargo check # fast compile check (no binary produced)cargo testcargo clippy # check for lint warnings
cargo clippy -- -D warnings # treat warnings as errorscargo fmt # auto-format code
cargo fmt -- --check # check only — does not modify filesRun formatting, linting, and tests together:
cargo fmt -- --check && cargo clippy -- -D warnings && cargo testurl2md/
├── Cargo.toml # package manifest and dependencies
├── Cargo.lock # locked dependency versions
├── README.md
└── src/
├── main.rs # CLI, HTTP fetching, link extraction, crawl loop
└── config.rs # markdown post-processing and unit tests
See repository for license details.