One interface for every model. Authentication, routing, streaming, retries, caching — handled.
eyrie is the LLM provider runtime that powers the hawk coding agent. It handles everything between your application and LLM APIs — authentication, model resolution, streaming, retries, rate limiting, and caching.
When your app calls a model, eyrie figures out which provider to use, how to talk to it, and how to stream the response back. Switch from Anthropic to Ollama? eyrie handles the translation. API returns 529? eyrie retries with backoff. Response hits max_tokens? eyrie continues automatically.
Your app never talks to an LLM API directly. eyrie does.
go get github.com/GrayCodeAI/eyrie
Requires Go 1.26+. Zero external dependencies.
import "github.com/GrayCodeAI/eyrie/client"

// Create a client; the provider is auto-detected from the environment.
c := client.NewEyrieClient(&client.EyrieConfig{
	Provider: client.DetectProvider(),
})

// Stream a response.
sr, err := c.StreamChat(ctx, messages, client.ChatOptions{
	Model: "claude-sonnet-4-6",
})
if err != nil {
	return err // do not touch sr when StreamChat fails
}
defer sr.Close()

for evt := range sr.Events {
	switch evt.Type {
	case "content": // stream text
	case "tool_call": // execute tool
	case "done": // response complete
	}
}

Automatically detects and routes to the right provider based on environment variables, config files, or explicit selection.
Maps abstract tiers (opus/sonnet/haiku) to concrete model IDs per provider. Ships with an embedded catalog of pricing, context windows, and capabilities.
Parses SSE for Anthropic and OpenAI formats — text, tool calls, and thinking blocks.
- Retries on 429/500/529 with exponential backoff and Retry-After support
- Auto-continuation when stop_reason == max_tokens
- Provider fallback chains for high availability
Token bucket rate limiter per provider — prevents hitting API limits before they happen.
- Response caching with configurable TTL
- Semantic similarity caching for repeated prompts
- Anthropic prompt caching breakpoints on system prompt and conversation prefix
Built-in cost estimation per call, with per-provider pricing from the embedded model catalog.
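Per-call cost estimation from a pricing table usually reduces to one formula: tokens divided by one million, times the per-million price, summed over input and output. The sketch below assumes that shape; `pricing`, `estimateCost`, the model name, and the prices are placeholders, not values from eyrie's catalog.

```go
package main

import "fmt"

// pricing holds hypothetical prices in USD per one million tokens.
type pricing struct {
	InputPerMTok  float64
	OutputPerMTok float64
}

// estimateCost returns the USD cost of one call given its token counts.
func estimateCost(p pricing, inputTokens, outputTokens int) float64 {
	return float64(inputTokens)/1e6*p.InputPerMTok +
		float64(outputTokens)/1e6*p.OutputPerMTok
}

// estimateFor looks the model up in a price table first and reports
// false when the model is unknown, instead of silently returning zero.
func estimateFor(table map[string]pricing, model string, inputTokens, outputTokens int) (float64, bool) {
	p, ok := table[model]
	if !ok {
		return 0, false
	}
	return estimateCost(p, inputTokens, outputTokens), true
}

func main() {
	// Placeholder prices for illustration only.
	table := map[string]pricing{
		"example-model": {InputPerMTok: 2, OutputPerMTok: 10},
	}
	if cost, ok := estimateFor(table, "example-model", 12000, 800); ok {
		fmt.Printf("estimated cost: $%.4f\n", cost)
	}
}
```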
| Provider | Env Variable | Notes |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | Default · thinking, caching |
| OpenAI | OPENAI_API_KEY | Full tool use + reasoning |
| OpenRouter | OPENROUTER_API_KEY | 200+ models via one key |
| Grok (xAI) | XAI_API_KEY | |
| Gemini | GEMINI_API_KEY | |
| CanopyWave | CANOPYWAVE_API_KEY | |
| Ollama | OLLAMA_BASE_URL | Local models, no key needed |
| OpenCodeGo | OPENCODEGO_API_KEY | |
Providers are detected automatically in the order listed above.
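Ordered detection of this kind can be sketched as a first-match scan over the environment variables in the table. This is an illustration, not the body of `client.DetectProvider`: the lowercase provider labels and the `detectProvider` helper are assumptions, and the lookup function is injected so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// detectionOrder mirrors the table order; the provider labels are hypothetical.
var detectionOrder = []struct{ provider, envVar string }{
	{"anthropic", "ANTHROPIC_API_KEY"},
	{"openai", "OPENAI_API_KEY"},
	{"openrouter", "OPENROUTER_API_KEY"},
	{"grok", "XAI_API_KEY"},
	{"gemini", "GEMINI_API_KEY"},
	{"canopywave", "CANOPYWAVE_API_KEY"},
	{"ollama", "OLLAMA_BASE_URL"},
	{"opencodego", "OPENCODEGO_API_KEY"},
}

// detectProvider returns the first provider whose variable is non-empty
// according to getenv, or false when none is set.
func detectProvider(getenv func(string) string) (string, bool) {
	for _, c := range detectionOrder {
		if getenv(c.envVar) != "" {
			return c.provider, true
		}
	}
	return "", false
}

func main() {
	if p, ok := detectProvider(os.Getenv); ok {
		fmt.Println("detected provider:", p)
	} else {
		fmt.Println("no provider configured")
	}
}
```

With both OPENAI_API_KEY and OLLAMA_BASE_URL set, a scan like this picks OpenAI because it appears earlier in the order.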
resp, err := c.Chat(ctx, messages, client.ChatOptions{
	Model: "gpt-4o",
})

// Auto-continues when max_tokens is hit
resp, err := client.ChatWithContinuation(ctx, provider, messages,
	client.ChatOptions{Model: model},
	client.DefaultContinuationConfig(),
)

mock := client.NewMockProvider(client.MockModeFixed)
mock.Response = "Here is the code you asked for..."
resp, _ := mock.Chat(ctx, messages, opts)
// No real API calls; well suited to tests

cat := catalog.DefaultModelCatalog()
// Get the best model for a tier
model := catalog.GetPreferredProviderModel("anthropic", catalog.TierSonnet, &cat)
// → "claude-sonnet-4-6"
// Check deprecation warnings
warn := catalog.GetModelDeprecationWarning("claude-3-7-sonnet", "anthropic")

cfg := config.LoadProviderConfig("") // load from disk
config.ApplyProviderConfigToEnv(cfg, false, nil) // apply to environment
config.SaveProviderConfig(cfg, "") // save changes

eyrie/
├── cmd/eyrie/ # CLI binary
├── internal/
│ ├── client/ # Provider implementations (51 files)
│ │ ├── providers/ # Anthropic, OpenAI, Azure, Vertex, etc.
│ │ ├── middleware/ # Retry, rate limit, cache, fallback
│ │ └── metrics/ # Cost, call, analytics
│ ├── server/ # HTTP API and gateway
│ ├── config/ # Provider configuration & routing
│ ├── catalog/ # Model catalog & tier system
│ ├── registry/ # Runtime manifest & routing policies
│ ├── routing/ # Weighted provider router
│ ├── storage/ # SQLite conversation DAG store
│ ├── conversation/ # Conversation engine with branching
│ ├── observability/ # OpenTelemetry spans & metrics
│ ├── health/ # Provider health checker
│ ├── cache/ # Response cache warmer
│ ├── types/ # Branded types & API errors
│ ├── errors/ # Error message constants
│ ├── constants/ # API limits
│ ├── utils/ # Error utilities
│ └── sdk/ # Go, Python, TypeScript client SDKs
└── assets/ # Logo and branding
eyrie is part of the hawk-eco:
| Component | Repository | Purpose |
|---|---|---|
| hawk | GrayCodeAI/hawk | AI coding agent |
| eyrie | This repo | LLM provider runtime |
| tok | GrayCodeAI/tok | Tokenizer & compression |
| yaad | GrayCodeAI/yaad | Graph-based memory |
| trace | GrayCodeAI/trace | Session capture |
- Go 1.26+
go build ./cmd/eyrie # Build binary
go test -race ./... # Run all tests with race detector
make ci # Run full CI suite (lint, test, security)
make cover # Generate coverage report

We welcome contributions! Please see CONTRIBUTING.md for development setup, commit conventions, and the PR process.
Quick start:
- Fork and create a branch: git checkout -b feat/short-description
- Make changes in small, focused commits
- Run make ci locally
- Open a pull request
Use Conventional Commits for commit messages — release-please uses them for versioning.
MIT — see LICENSE for details.
© 2026 GrayCode AI