This directory contains executable examples for both Python and TypeScript demonstrating the core capabilities of the Hyperion API Gateway.
- /python: Examples using the
hyperion-aiPython package - /typescript: Examples using the
@hyperion-ai/sdkTypeScript package
01_exact_match_caching: Shows how identical requests are served in milliseconds from the L1 Redis cache without hitting upstream. Demonstrates reading latency and token savings from the.hyperionresponse metadata.02_semantic_cache: Shows how conceptually similar requests (different phrasing, same meaning) are caught by the L2 vector database, printing thesimilarity_score.03_smart_routing: Demonstrates dynamic model selection using theautomodel parameter, where simple requests go to fast/cheap models.04_budget_enforcement: Explains how Hyperion intercepts requests with a 402 error if a key has exceeded its database spending limit.05_provider_failover: Demonstrates configuring thehyperion.fallbacksarray so that if a primary model is down, the request transparently reroutes without throwing a client error.
- Make sure your local Hyperion Gateway is running (
docker compose up -d). - Generate an API Key via the Admin Dashboard (
http://localhost:3000). - Export your key as an environment variable:
export HYPERION_API_KEY="your-api-key"
- Run the scripts using your respective language runner (
python3orts-node).