A powerful, production-ready REST API for searching and retrieving book information from Anna's Archive. Features bot-detection bypass via stealth Playwright, intelligent domain rotation across all mirrors, and a multi-layer caching system.
- 🤖 Stealth Scraping — Playwright + stealth plugin bypasses Cloudflare & bot detection
- 🔄 Domain Rotation — Automatically rotates across all known Anna's Archive mirrors, falling back gracefully when one is down
- ⚡ Smart Caching — Multi-TTL in-memory cache: search results (5 min), book details (1 hour)
- 📄 Pagination — Full pagination support on search results
- 🏊 Browser Pool — Reusable browser instances for high performance
- 🛡️ Rate Limiting — Built-in per-IP rate limiting
- 📊 Health Endpoint — Domain health & cache statistics
# Install dependencies
npm install
# Install Playwright's Chromium browser
npm run install:browsers
# Start in development mode
npm run dev
# Start in production
npm start

Search for books by any query (title, author, ISBN, DOI, MD5, etc.).
Query Parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| `q` | string | required | Search query |
| `page` | number | 1 | Page number |
| `lang` | string | — | Language filter (e.g. `en`) |
| `ext` | string | — | File extension filter (e.g. `pdf`, `epub`) |
| `sort` | string | — | Sort order (`newest`, `oldest`, `largest`, `smallest`) |
| `content` | string | — | Content type filter |
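As a rough illustration of how these parameters combine into a request, here is a minimal sketch of a URL builder. The `/search` path and the `http://localhost:3000` base are assumptions made for the example (only the default port of 3000 comes from this README), and `buildSearchUrl` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: the "/search" path is an assumption for illustration.
function buildSearchUrl(baseUrl, { q, page, lang, ext, sort, content } = {}) {
  if (!q) throw new Error("q is required");

  const url = new URL("/search", baseUrl);
  url.searchParams.set("q", q);

  // Only send the optional filters the caller actually provided,
  // so the server-side defaults apply to everything else.
  const optional = { page, lang, ext, sort, content };
  for (const [key, value] of Object.entries(optional)) {
    if (value !== undefined && value !== null && value !== "") {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

const exampleSearchUrl = buildSearchUrl("http://localhost:3000", {
  q: "python programming",
  page: 2,
  ext: "pdf",
});
console.log(exampleSearchUrl);
// → http://localhost:3000/search?q=python+programming&page=2&ext=pdf
```

Using `URL` and `URLSearchParams` leaves the escaping of spaces and special characters to the platform rather than to manual string concatenation.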
Response:
{
"success": true,
"query": "python programming",
"page": 1,
"results": [...],
"cached": false,
"domain": "annas-archive.gl",
"responseTime": 1234
}

Get full details for a specific book by its MD5 hash.
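Because the lookup key is an MD5 hash, a client may want to check the value before sending a request. The sketch below is one way to do that; `normalizeMd5` is a hypothetical helper, and whether the API itself accepts uppercase hashes is not stated here, so lowercasing is an assumption.

```javascript
// An MD5 digest is 128 bits, written as exactly 32 hexadecimal characters.
const MD5_PATTERN = /^[0-9a-f]{32}$/;

// Returns the trimmed, lowercased hash, or null when the input
// is not a well-formed MD5 string.
function normalizeMd5(input) {
  if (typeof input !== "string") return null;
  const candidate = input.trim().toLowerCase();
  return MD5_PATTERN.test(candidate) ? candidate : null;
}

console.log(normalizeMd5("D64EFD386ED7227592499460ACA2044B"));
// → d64efd386ed7227592499460aca2044b
console.log(normalizeMd5("not-a-hash"));
// → null
```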
Response:
{
"success": true,
"md5": "d64efd386ed7227592499460aca2044b",
"book": {
"title": "Data Science Essentials in Python",
"author": "Dmitry Zinoviev",
"publisher": "Pragmatic Bookshelf",
"year": "2016",
"language": "en",
"filesize": 6432380,
"extension": "pdf",
"isbn": ["9781680501841", "1680501844"],
"description": "...",
"cover": "https://...",
"md5": "d64efd386ed7227592499460aca2044b",
"downloadLinks": {
"fast": [...],
"slow": [...],
"external": [...]
},
"metadata": {...}
},
"cached": true,
"responseTime": 45
}

Returns API health, domain status, cache stats, and browser pool status.
Clears the entire cache. Useful for forced refresh.
The API will automatically try these domains in order:
1. annas-archive.gl
2. annas-archive.org
3. annas-archive.se
4. annas-archive.gs
5. annas-archive.gd
6. annas-archive.pk
If a domain is unreachable, it is temporarily blacklisted and the next one is tried.
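The failover behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `DomainRotator`, `requestWithFailover`, and the 60-second cooldown are all hypothetical, since the README only says a failing domain is blacklisted "temporarily".

```javascript
class DomainRotator {
  // cooldownMs is an assumed value; the clock is injectable for testing.
  constructor(domains, { cooldownMs = 60_000, now = Date.now } = {}) {
    this.domains = [...domains];
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.blockedUntil = new Map(); // domain -> timestamp when it may be retried
  }

  // Domains in their configured order, minus those still blacklisted.
  available() {
    const t = this.now();
    return this.domains.filter((d) => (this.blockedUntil.get(d) ?? 0) <= t);
  }

  // Temporarily blacklist a domain after a failed request.
  markFailed(domain) {
    this.blockedUntil.set(domain, this.now() + this.cooldownMs);
  }
}

// Tries each available domain in order; `attempt` is a caller-supplied
// async function that performs the real request against one domain.
async function requestWithFailover(rotator, attempt) {
  for (const domain of rotator.available()) {
    try {
      return await attempt(domain);
    } catch (err) {
      rotator.markFailed(domain);
    }
  }
  throw new Error("All domains are currently unavailable");
}

const exampleRotator = new DomainRotator([
  "annas-archive.gl",
  "annas-archive.org",
  "annas-archive.se",
]);
exampleRotator.markFailed("annas-archive.gl");
console.log(exampleRotator.available());
// → [ 'annas-archive.org', 'annas-archive.se' ]
```

Keeping the blacklist as expiry timestamps means a failed domain rejoins the rotation without a background timer.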
See .env for all configuration options. Key settings include:
- `PORT`: Server port (default: 3000)
- `CACHE_TYPE`: `memory` (default) or `redis`
- `REDIS_URL`: Redis connection string (e.g. `redis://localhost:6379`)
- `REDIS_PREFIX`: Prefix for keys in Redis (default: `annas-api:`)
- `CACHE_TTL_SEARCH`: TTL for search results in seconds (default: 300)
- `CACHE_TTL_BOOK`: TTL for book details in seconds (default: 3600)
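To make the defaults concrete, here is a minimal sketch of how these variables might be read into a config object. `loadConfig` and its property names are hypothetical, and rejecting unknown `CACHE_TYPE` values is an assumption; only the variable names and default values are taken from the list above.

```javascript
// Hypothetical loader; accepts any env-like object so it can be tested
// without touching the real process environment.
function loadConfig(env = process.env) {
  // Parse an integer setting, falling back when it is unset or not numeric.
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value ?? "", 10);
    return Number.isNaN(n) ? fallback : n;
  };

  const cacheType = env.CACHE_TYPE ?? "memory";
  // Assumption: anything other than the two documented engines is an error.
  if (cacheType !== "memory" && cacheType !== "redis") {
    throw new Error(`Unsupported CACHE_TYPE: ${cacheType}`);
  }

  return {
    port: toInt(env.PORT, 3000),
    cacheType,
    redisUrl: env.REDIS_URL, // no documented default, so left undefined
    redisPrefix: env.REDIS_PREFIX ?? "annas-api:",
    cacheTtlSearch: toInt(env.CACHE_TTL_SEARCH, 300), // seconds
    cacheTtlBook: toInt(env.CACHE_TTL_BOOK, 3600), // seconds
  };
}

console.log(loadConfig({ PORT: "8080", CACHE_TYPE: "redis", REDIS_URL: "redis://localhost:6379" }));
```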
The API supports two caching engines:
In-memory: the default option. Best for single-instance deployments.
- Set `CACHE_TYPE=memory`
- Automatic cleanup of expired keys
- Lightning-fast retrieval
Redis: best for multi-instance deployments or persistent caching.
- Set `CACHE_TYPE=redis`
- Configure via `REDIS_URL`
- Shared cache across multiple API nodes
- Survives application restarts
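For readers curious how a per-entry TTL cache with expired-key cleanup can work, the following is a minimal in-memory sketch. It is not the project's implementation: `TtlCache`, its method names, and the cache key format are hypothetical, and a real deployment would likely run `sweep()` on a timer.

```javascript
class TtlCache {
  // The clock is injectable so expiry can be tested deterministically.
  constructor({ now = Date.now } = {}) {
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Store a value that expires ttlSeconds from now.
  set(key, value, ttlSeconds) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  // Return the cached value, or undefined if it is missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazy cleanup on read
      return undefined;
    }
    return entry.value;
  }

  // Remove every expired entry; returns how many were removed.
  sweep() {
    const t = this.now();
    let removed = 0;
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= t) {
        this.entries.delete(key);
        removed += 1;
      }
    }
    return removed;
  }
}

const exampleCache = new TtlCache();
exampleCache.set("search:python programming:1", { results: [] }, 300); // 5 min
exampleCache.set("book:d64efd386ed7227592499460aca2044b", { title: "..." }, 3600); // 1 hour
console.log(exampleCache.get("search:python programming:1"));
```

Passing the TTL per call is what allows search results and book details to live in the same cache with different lifetimes.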