A collection of Crawl4AI-based scrapers to download and maintain offline copies of Indian stock broker API documentation.
| Broker | Sections | Lines | Status |
|---|---|---|---|
| Kite Connect v3 (Zerodha) | 18 | ~4,700 | ✅ Complete |
| Angel One SmartAPI | 19 | ~161,740 | ✅ Complete |
| Upstox Open API | 24 | ~1,200 | ✅ Complete |
| Groww Trade API | 27 | ~6,200 | ✅ Complete |
Total: 88 files, ~173,840 lines of documentation
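
The per-broker totals above could be recomputed from the `output/` tree. The following is a minimal sketch, assuming every section is stored as a `README.md` somewhere below its broker directory; `count_docs` is a hypothetical helper and is not one of the repository's scripts.

```python
from pathlib import Path


def count_docs(output_dir: Path) -> dict[str, tuple[int, int]]:
    """Return {broker_dir_name: (readme_count, total_lines)} for each broker."""
    totals: dict[str, tuple[int, int]] = {}
    for broker_dir in sorted(p for p in output_dir.iterdir() if p.is_dir()):
        files = 0
        lines = 0
        # Assumption: each section is a README.md somewhere below the broker directory.
        for readme in broker_dir.rglob("README.md"):
            files += 1
            lines += len(readme.read_text(encoding="utf-8").splitlines())
        totals[broker_dir.name] = (files, lines)
    return totals
```

Calling `count_docs(Path("output"))` from the repository root would return one `(files, lines)` pair per broker directory, which can be compared with the table by hand.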
output/
├── kite-connect-v3/ # Zerodha Kite Connect
│   ├── orders/README.md
│   ├── user/README.md
│   ├── websocket/README.md
│   └── ... (18 sections)
│
├── angel-one-smartapi/ # Angel One
│   ├── Orders/README.md
│   ├── User/README.md
│   ├── Gtt/README.md
│   └── ... (19 sections)
│
├── upstox-open-api/ # Upstox
│   ├── orders/README.md
│   ├── gtt-orders/README.md
│   ├── authentication/README.md
│   └── ... (24 sections)
│
└── groww-trade-api/ # Groww
    ├── python-sdk/orders/README.md
    ├── python-sdk/smart-orders/README.md
    ├── python-sdk/portfolio/README.md
    ├── curl/orders/README.md
    └── ... (27 sections)
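
Because every section in this layout lives at `<broker>/<section>/README.md`, a completeness check can be reduced to a file-existence test. The sketch below is only an illustration under that assumption; `missing_sections` is a hypothetical helper and does not describe what `verify_all_docs.py` actually does.

```python
from pathlib import Path


def missing_sections(broker_dir: Path, expected: list[str]) -> list[str]:
    """Return the expected sections that have no README.md under broker_dir."""
    return [
        section
        for section in expected
        # A section counts as present only if <broker_dir>/<section>/README.md exists.
        if not (broker_dir / section / "README.md").is_file()
    ]
```

For example, `missing_sections(Path("output/kite-connect-v3"), ["orders", "user", "websocket"])` would return an empty list when all three sections are present. Nested section names such as `python-sdk/orders` work the same way.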
# Install Crawl4AI
pip install crawl4ai
# Install Playwright browser
playwright install chromium

Kite Connect (Zerodha):
python kite_crawler.py
python clean_docs.py

Angel One SmartAPI:
python angel_crawler.py
python clean_angel_docs.py

Upstox Open API:
python upstox_crawler.py
python clean_upstox_docs.py

python verify_all_docs.py

| Script | Purpose | Broker |
|---|---|---|
| `kite_crawler.py` | Scrape Kite Connect docs | Zerodha |
| `angel_crawler.py` | Scrape Angel One docs | Angel Broking |
| `upstox_crawler.py` | Scrape Upstox docs | Upstox |
| `groww_crawler.py` | Scrape Groww docs | Groww |
| `clean_docs.py` | Clean Kite markdown | Zerodha |
| `clean_angel_docs.py` | Clean Angel markdown | Angel Broking |
| `clean_upstox_docs.py` | Clean Upstox markdown | Upstox |
| `clean_groww_docs.py` | Clean Groww markdown | Groww |
| `verify_all_docs.py` | Verify completeness | All |

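
The scripts listed above are run as crawl, then clean, then verify. As a rough sketch, the same sequence could be driven from a single Python file. The pairing of crawler and cleaner is taken from the script names; `build_commands` and `run_all` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys

# Script names come from the table above; the pairing and order are an assumption.
PIPELINE = [
    ("kite_crawler.py", "clean_docs.py"),
    ("angel_crawler.py", "clean_angel_docs.py"),
    ("upstox_crawler.py", "clean_upstox_docs.py"),
    ("groww_crawler.py", "clean_groww_docs.py"),
]


def build_commands(pipeline=PIPELINE, verify="verify_all_docs.py"):
    """Return the commands to run: crawl then clean per broker, then verify once."""
    commands = []
    for crawler, cleaner in pipeline:
        commands.append([sys.executable, crawler])
        commands.append([sys.executable, cleaner])
    commands.append([sys.executable, verify])
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
```

Running `run_all(build_commands())` from the repository root would execute all nine scripts in order. Stopping at the first failure keeps a broken crawl from being cleaned and verified as if it had succeeded.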
# Re-scrape everything
python kite_crawler.py
python angel_crawler.py
python upstox_crawler.py
python groww_crawler.py
# Clean all
python clean_docs.py
python clean_angel_docs.py
python clean_upstox_docs.py
python clean_groww_docs.py
# Verify
python verify_all_docs.py
# Commit updates
git add output/
git commit -m "Refresh all broker docs - $(date +%Y-%m-%d)"

# Delete existing output
rm -rf output/kite-connect-v3
rm -rf output/angel-one-smartapi
rm -rf output/upstox-open-api
rm -rf output/groww-trade-api
# Re-scrape from scratch
python kite_crawler.py && python clean_docs.py
python angel_crawler.py && python clean_angel_docs.py
python upstox_crawler.py && python clean_upstox_docs.py
python groww_crawler.py && python clean_groww_docs.py

To scrape a new broker's documentation:
- Copy an existing crawler as a template: `cp kite_crawler.py newbroker_crawler.py`
- Edit the configuration in `newbroker_crawler.py`: set `BASE_URL = "https://api.newbroker.com/docs/"` and `OUTPUT_DIR = Path(__file__).parent / "output" / "newbroker-api"`
- Run the crawler: `python newbroker_crawler.py`
- Create a cleanup script if needed (this depends on the site's HTML structure)
- Commit: `git add newbroker_crawler.py output/newbroker-api/`, then `git commit -m "Add NewBroker API documentation"`

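
To make the configuration step more concrete, here is a minimal sketch of what a new crawler might look like. It is not the contents of `kite_crawler.py`. The URL and output directory are the placeholder values from the steps above, `section_path` and `crawl_section` are hypothetical names, and `OUTPUT_DIR` is written relative to the working directory rather than `__file__` so the snippet can be pasted into a REPL. It uses Crawl4AI's `AsyncWebCrawler` with its default settings.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical values for illustration, matching the placeholders in the steps above.
BASE_URL = "https://api.newbroker.com/docs/"
OUTPUT_DIR = Path("output") / "newbroker-api"


def section_path(url: str, base_url: str = BASE_URL, output_dir: Path = OUTPUT_DIR) -> Path:
    """Map a documentation URL to <output_dir>/<section>/README.md."""
    base = urlparse(base_url).path
    path = urlparse(url).path
    # Drop the base path prefix, then trim slashes to get the section name.
    section = path[len(base):].strip("/") if path.startswith(base) else path.strip("/")
    return output_dir / (section or "index") / "README.md"


async def crawl_section(url: str) -> Path:
    """Fetch one page with Crawl4AI and save its markdown."""
    # Imported lazily so the path helper works without crawl4ai installed.
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
    if not result.success:
        raise RuntimeError(f"Crawl failed for {url}: {result.error_message}")
    target = section_path(url)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(str(result.markdown), encoding="utf-8")
    return target
```

A single page could then be fetched with `asyncio.run(crawl_section(BASE_URL + "orders/"))`. A real crawler would also need to discover the list of section URLs, which depends on the site's structure.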
| Frequency | Action |
|---|---|
| Monthly | Check broker API changelogs for updates |
| Quarterly | Full re-scrape of all documentation |
| After announcements | Re-scrape affected broker |
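
The quarterly re-scrape in the schedule above can be checked against the last update date. This is a small sketch that approximates a quarter as 90 days; both that threshold and the helper name `is_rescrape_due` are assumptions made for illustration.

```python
from datetime import date, timedelta


def is_rescrape_due(last_scraped: date, today: date, max_age_days: int = 90) -> bool:
    """Return True when the last scrape is at least max_age_days old."""
    # Assumption: "quarterly" is approximated as 90 days.
    return today - last_scraped >= timedelta(days=max_age_days)
```

In practice `today` would be `date.today()`, and `last_scraped` could be read from the "Last Updated" line or from the date of the latest commit that touched `output/`.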
- Crawl4AI - Async web crawler with JavaScript rendering
- Playwright - Headless browser automation
- Python 3.10+ - Runtime environment
- ✅ Authentication & Login
- ✅ Orders (Regular, GTT, AMO, CO, Smart Orders)
- ✅ Portfolio & Holdings
- ✅ Market Data & Quotes
- ✅ Historical Data
- ✅ User Profile & Funds
- ✅ Margins & Charges
- ✅ WebSocket Streaming / Feed
- ✅ Rate Limiting
- ✅ Error Codes & Exceptions
- ✅ SDKs & Libraries
- ✅ Instruments & Symbols
- ✅ Backtesting (Groww)
- Navigation Cleanup: Some markdown files may still contain residual navigation links. The cleanup scripts remove most of this noise automatically.
- Dynamic Content: All scrapers use a headless browser to handle JavaScript-rendered documentation sites.
- Rate Limiting: Scrapers respect website load times. Avoid running them too often, or you may be blocked.
- API Changes: Broker APIs evolve. Re-scrape quarterly or after major announcements.
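
As an illustration of the navigation cleanup mentioned above, the sketch below drops lines that consist of nothing but a single markdown link. The heuristic and the name `strip_nav_lines` are assumptions; the repository's cleanup scripts may use different rules.

```python
import re

# Assumption: a "navigation" line is only a markdown link, optionally written as a bullet.
NAV_LINE = re.compile(r"^\s*(?:[-*+]\s+)?\[[^\]]*\]\([^)]*\)\s*$")


def strip_nav_lines(markdown: str) -> str:
    """Drop lines that contain nothing but a single markdown link."""
    kept = [line for line in markdown.splitlines() if not NAV_LINE.match(line)]
    return "\n".join(kept)
```

Links embedded in a sentence are kept, since only link-only lines match. The heuristic would also remove a legitimate line that happens to be a bare link, so the output is worth reviewing after a run.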
This repository contains scraped documentation for personal, offline use only. Copyright in all content belongs to the respective brokers:
- Zerodha (Kite Connect)
- Angel One (SmartAPI)
- Upstox (Open API)
- Groww (Trade API)
For production use, always refer to the official documentation.
Found a broker with missing docs? Want to add a new scraper?
- Fork the repository
- Create a new crawler script
- Test thoroughly
- Submit a pull request
Last Updated: 2026-02-26