Provider-agnostic LLM architecture for TypeScript.
Switch providers without changing code.
Avoid vendor lock-in.
Control cost.
Reuse prompts as capabilities.
Multi-provider routing • fallback chains • USD cost gating • capability factories • tool-use security • observability
Most LLM applications break in predictable ways:
- SDK upgrades touch too many files
- Switching providers requires refactoring
- Prompt logic is duplicated across features
- Cost and routing logic are scattered
- Business logic becomes coupled to provider-specific SDKs
This is not just an SDK problem.
It is an architecture problem.
llm-ports applies the ports-and-adapters pattern to LLM systems.
Only two files in your codebase should know the LLM SDK exists.
Everything else talks to a typed interface.
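A minimal sketch of that split, to make the shape concrete. Every name here (`TextPort`, `FakeVendorClient`, `createFakeVendorAdapter`, `triageEmail`) is hypothetical and stands in for the real llm-ports types, which this README does not spell out.

```typescript
// Hypothetical port: the only LLM surface that application code is allowed to see.
interface TextPort {
  generateText(req: { taskType: string; prompt: string }): Promise<{ text: string }>;
}

// Stand-in for a vendor SDK client. In a real project this is the SDK import.
class FakeVendorClient {
  async complete(input: { model: string; userContent: string }): Promise<{ outputText: string }> {
    return { outputText: `[${input.model}] ${input.userContent}` };
  }
}

// Adapter: the single place that translates between the port and the vendor's shapes.
function createFakeVendorAdapter(client: FakeVendorClient, model: string): TextPort {
  return {
    async generateText({ prompt }) {
      const raw = await client.complete({ model, userContent: prompt });
      return { text: raw.outputText };
    },
  };
}

// Application code depends on TextPort only, so swapping vendors never touches it.
async function triageEmail(port: TextPort, body: string): Promise<string> {
  const { text } = await port.generateText({ taskType: "triage", prompt: `Classify: ${body}` });
  return text;
}
```

Under this shape, replacing the vendor means writing another adapter; `triageEmail` stays as it is.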
Instead of calling models directly, your application uses reusable capabilities:
- classify
- draft
- score
- summarize
- extract
- plan
- analyze
The LLM stops being a dependency you manage.
It becomes infrastructure you configure.
- Multi-provider LLM routing across OpenAI, Anthropic, Ollama, Vercel AI SDK, and compatible providers
- Fallback chains when a provider fails or exceeds budget
- USD-based cost gating with hourly, daily, and monthly limits
- Reusable prompt capabilities so prompts are defined once and reused everywhere
- Validation recovery for structured output failures
- Tool-use safety primitives for destructive or confirmation-required actions
- Observability hooks for cost, latency, quality, and outcomes
- TypeScript-first API with full type support
- No runtime dependency on LangChain, LlamaIndex, or heavy frameworks
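The USD cost gating listed above can be pictured as a price lookup plus a spend ledger. The sketch below is illustrative only: `costUsd`, `SpendLedger`, the rolling-window behaviour, and the fixed 30-day month are assumptions, not the llm-ports implementation.

```typescript
type BudgetWindow = "hour" | "day" | "month";

const WINDOW_MS: Record<BudgetWindow, number> = {
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
  month: 30 * 24 * 60 * 60 * 1000, // simplification: a fixed 30-day window
};

// Prices are USD per million tokens; the caller supplies its own numbers.
function costUsd(
  usage: { inputTokens: number; outputTokens: number },
  price: { inputPerMTok: number; outputPerMTok: number },
): number {
  return (usage.inputTokens * price.inputPerMTok + usage.outputTokens * price.outputPerMTok) / 1_000_000;
}

class SpendLedger {
  private entries: { at: number; usd: number }[] = [];

  // Record the cost of one completed call at timestamp `at` (ms since epoch).
  record(usd: number, at: number): void {
    this.entries.push({ at, usd });
  }

  // Sum of spend inside the rolling window that ends at `now`.
  spentWithin(window: BudgetWindow, now: number): number {
    const cutoff = now - WINDOW_MS[window];
    return this.entries.filter((e) => e.at > cutoff).reduce((sum, e) => sum + e.usd, 0);
  }

  // True while the provider is still under its limit and may be called.
  allows(limitUsd: number, window: BudgetWindow, now: number): boolean {
    return this.spentWithin(window, now) < limitUsd;
  }
}
```

A gate of this kind is checked before a call and updated after it, which is why an over-budget provider can be skipped without any network traffic.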
```bash
LLM_PROVIDER_FAST=anthropic|<model>|cost:50/day
LLM_PROVIDER_SMART=anthropic|<model>|cost:200/day
LLM_TASK_ROUTE_TRIAGE=fast,smart
```

```typescript
import { createRegistryFromEnv } from "@llm-ports/core";
import { createAnthropicAdapter } from "@llm-ports/adapter-anthropic";

export const llm = createRegistryFromEnv({
  adapters: {
    anthropic: createAnthropicAdapter({
      apiKey: process.env.ANTHROPIC_API_KEY!,
    }),
  },
}).getPort();
```

```typescript
const result = await llm.generateText({
  taskType: "triage",
  prompt: "Classify this email...",
});
```

The registry:
- selects the right model for the task
- enforces cost limits
- falls back through the provider chain on failure
- records usage, cost, and latency
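To illustrate how the environment variables above could map to providers and task routes, here is a small parser written from the visible format alone. It is a reading of the `adapter|model|cost:N/day` shape, not the parser inside `createRegistryFromEnv`; `ProviderSpec`, `parseProviderSpec`, `parseEnv`, and support for `/hour` and `/month` suffixes are assumptions.

```typescript
interface ProviderSpec {
  alias: string;   // "fast" for LLM_PROVIDER_FAST
  adapter: string; // key into the adapters map, e.g. "anthropic"
  model: string;
  limitUsd?: number;
  per?: "hour" | "day" | "month";
}

function parseProviderSpec(alias: string, raw: string): ProviderSpec {
  const [adapter, model, ...options] = raw.split("|");
  if (!adapter || !model) throw new Error(`Malformed provider spec for ${alias}: ${raw}`);
  const spec: ProviderSpec = { alias, adapter, model };
  for (const option of options) {
    // Assumed option grammar: cost:<usd>/<window>
    const match = /^cost:(\d+(?:\.\d+)?)\/(hour|day|month)$/.exec(option);
    if (match) {
      spec.limitUsd = Number(match[1]);
      spec.per = match[2] as ProviderSpec["per"];
    }
  }
  return spec;
}

function parseEnv(env: Record<string, string | undefined>) {
  const providers = new Map<string, ProviderSpec>();
  const routes = new Map<string, string[]>();
  for (const [key, value] of Object.entries(env)) {
    if (!value) continue;
    if (key.startsWith("LLM_PROVIDER_")) {
      const alias = key.slice("LLM_PROVIDER_".length).toLowerCase();
      providers.set(alias, parseProviderSpec(alias, value));
    } else if (key.startsWith("LLM_TASK_ROUTE_")) {
      // The route is an ordered list of aliases: first entry is primary, the rest are fallbacks.
      const task = key.slice("LLM_TASK_ROUTE_".length).toLowerCase();
      routes.set(task, value.split(",").map((s) => s.trim()).filter(Boolean));
    }
  }
  return { providers, routes };
}
```

Read this way, `taskType: "triage"` resolves to the ordered aliases `fast` then `smart`, each carrying its own daily USD limit.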
Instead of duplicating prompt logic across files, define a capability once and reuse it.
```typescript
import { createClassifier } from "@llm-ports/capabilities";
import { z } from "zod";

const IntentSchema = z.object({
  intent: z.enum(["question", "request", "complaint", "feedback", "other"]),
  urgency: z.enum(["low", "normal", "high"]),
  reasoning: z.string(),
});

export const classifyIntent = createClassifier({
  port: llm,
  schema: IntentSchema,
  schemaName: "user-intent",
  rubric: `
    question: asking for information
    request: wants something done
    complaint: reports a problem
    feedback: opinion only
    other: anything else
  `,
});
```

Now call it anywhere:

```typescript
const result = await classifyIntent({ content: userMessage });
```

Example output:

```typescript
{
  intent: "request",
  urgency: "high",
  reasoning: "The user is asking for a concrete action."
}
```

Why this matters:
- Improve a prompt once, and every call site benefits
- Keep behavior consistent across the system
- Make debugging and evaluation easier
- Keep business logic free from provider-specific SDK details
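The reason one prompt fix reaches every call site is that a capability is a closure over the port and the prompt. The sketch below shows that idea without any library code; `PromptPort`, `makeClassifier`, and the `parse` callback are hypothetical and are not the `createClassifier` API.

```typescript
interface PromptPort {
  generateText(req: { taskType: string; prompt: string }): Promise<{ text: string }>;
}

// Hypothetical factory: binds the rubric and validator once, returns a plain async function.
function makeClassifier<T>(opts: {
  port: PromptPort;
  taskType: string;
  rubric: string;
  parse: (raw: unknown) => T; // expected to throw when the model output does not match
}): (input: { content: string }) => Promise<T> {
  return async ({ content }) => {
    // The prompt template lives here and nowhere else.
    const prompt = [
      "Classify the content using this rubric:",
      opts.rubric,
      "Content:",
      content,
      "Reply with JSON only.",
    ].join("\n");
    const { text } = await opts.port.generateText({ taskType: opts.taskType, prompt });
    return opts.parse(JSON.parse(text));
  };
}
```

Call sites receive only the returned function, so they cannot drift from the shared rubric or bypass validation.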
Before:

```text
Application code
├─ direct SDK call
├─ direct SDK call
├─ direct SDK call
└─ model router leaking SDK types
```

After:

```text
Application code
↓
Capabilities
↓
LLM Port
↓
Adapters and Provider Registry
↓
LLM providers
```
The key shift:
Application code stops calling models directly. It calls capabilities.
| Package | Purpose |
|---|---|
| `@llm-ports/core` | Port interfaces, registry, routing, cost gating, validation strategies, content blocks |
| `@llm-ports/capabilities` | Reusable LLM operation factories |
| `@llm-ports/adapter-openai` | OpenAI SDK adapter with baseURL support for compatible providers |
| `@llm-ports/adapter-anthropic` | Anthropic SDK adapter |
| `@llm-ports/adapter-google` | Google Gemini native adapter (`@google/genai` SDK): full multimodal, bundled pricing |
| `@llm-ports/adapter-ollama` | Ollama native adapter with model management |
| `@llm-ports/adapter-vercel` | Vercel AI SDK adapter for migration and compatibility |

`@llm-ports/observability` (quality tracking hooks, sinks, deterministic edit-diff helpers) is planned for v0.2.
Seven runnable examples in examples/, each its own pnpm workspace package with a README walking through the code:
| Example | What it shows |
|---|---|
| `basic` | The smallest possible end-to-end. One adapter, one task type, one `generateText` call. The 60-second-setup demo. |
| `multi-provider` | Fallback chain (Anthropic primary → OpenAI backup), USD cost gating per provider, capability factory. |
| `email-triage` | The most common production use case, condensed into ~150 lines. Inbound email → classify (intent + urgency + sentiment) → policy gate → draft brand-voiced reply → queue for human review. Capability composition story. |
| `streaming-chat` | Express server with three routes: `POST /chat` (one-shot), `POST /chat/stream` (Server-Sent Events), `POST /chat/agent` (tool-augmented). The most common LLM UX patterns in ~30 lines of glue. |
| `extract-from-pdf` | Document extraction: raw OCR'd invoice text → fully-typed structured object via Zod. Demonstrates `generateStructured`, validation-retry-with-feedback, and the `createExtractor` factory. |
| `agent-with-approval` | Tool-use agent with first-class security primitives. `destructive`, `requiresConfirmation`, `maxOutputBytes` flags + an approval-gate wrapper. The differentiation example. |
| `migrate-from-vercel-ai` | Two migration paths for users on Vercel AI SDK: (a) wrap your existing model factories with `@llm-ports/adapter-vercel`, (b) replace `@ai-sdk/*` with native llm-ports adapters. Side-by-side before/after diffs. |
Each example is runnable from the monorepo root:
```bash
pnpm --filter @llm-ports/example-<name> start
```

Set the relevant API key (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) before running. Each example's README documents which keys it needs.
Use llm-ports when you need:
- multi-provider LLM routing
- LLM fallback chains
- TypeScript LLM abstraction
- OpenAI and Anthropic provider switching
- cost control for production LLM applications
- reusable prompt capabilities
- structured output validation and recovery
- tool-use security in agent workflows
- observability for LLM cost, latency, and quality
- vendor-neutral AI architecture
Use llm-ports if:
- you use 2 or more LLM providers
- you may switch providers later
- SDK upgrades have caused multi-file changes
- prompt logic is duplicated
- cost control matters
- you want business logic decoupled from provider SDKs
Skip it if:
- you have 1 or 2 LLM calls
- you are only prototyping
- you are intentionally building around one provider-specific feature
- you want a full agent framework, memory layer, RAG framework, or hosted gateway
| Tool | How llm-ports relates |
|---|---|
| Vercel AI SDK | Vercel unifies provider calls. llm-ports adds registry, fallback chains, USD cost gating, validation recovery, and capability factories on top. |
| LiteLLM | LiteLLM is a Python-first HTTP proxy. llm-ports is TypeScript and runs in-process with no extra network hop. |
| Portkey | Portkey is a commercial hosted gateway. llm-ports is MIT, in-process, and has no hosted dependency. |
| LangChain.js | LangChain is a framework. llm-ports is a lightweight architecture and control layer. |
| LlamaIndex.TS | LlamaIndex is retrieval-first. llm-ports handles LLM invocation, routing, fallback, and cost control. |
| Mastra | Mastra is agent-first with built-in memory and workflow primitives. llm-ports provides lower-level LLM primitives beneath that layer. |
llm-ports is pre-release. The core architecture is stable and the offline regression suite is comprehensive (250+ tests, latency p99 under 1 ms, no doc-rot detected across 110+ snippets). Some adapter and agent paths are still being hardened.
Fourteen medium-impact alpha-bake issues (#1, #3, #4, #5, #6, #9, #12, #14, #16, #19, #20, #21, #24, #32) shipped in 0.1.0-alpha.1 → 0.1.0-alpha.13 and are now closed. The alpha line completes the v0.1 surface:
- Gemini multi-turn `runAgent` + native `responseSchema`
- runtime model discovery (`LLMPort.listModels()` across 4 adapters + `Registry.checkPricingFreshness()`)
- `useStrictResponseFormat` on adapter-openai for Cerebras strict-JSON
- `dangerouslyAllowBrowser` opt-in on openai + anthropic
- `reasoningEffort` parameter for o-series / gpt-5-nano / Groq gpt-oss-120b reasoning depth control
- capability factories propagating `reasoningEffort` + `signal` + `forceProviderAlias` to the underlying port call
- an expanded `attemptValidationRepair` pass that catches markdown-wrapped enums, trailing punctuation, stringified-JSON-as-object, and array-with-single-object misreads

The full per-surface inventory lives at the v0.1 status page.
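For a sense of what the validation-repair cases named above look like in practice, here is a small, independent sketch. It is not the library's `attemptValidationRepair`; `cleanEnumString` and `repairOutput` are hypothetical helpers, and the caller is assumed to know which fields are enum-valued.

```typescript
// Strips markdown emphasis or backticks around a value, plus trailing punctuation.
function cleanEnumString(value: string): string {
  return value.replace(/^[\s*_`]+|[\s*_`.,;:!?]+$/g, "");
}

function repairOutput(raw: unknown, enumFields: string[]): unknown {
  let value = raw;

  // Case: the model returned stringified JSON where an object was expected.
  if (typeof value === "string") {
    try {
      value = JSON.parse(value);
    } catch {
      return raw; // not JSON at all, leave it for the validator to reject
    }
  }

  // Case: a single object wrapped in an array.
  if (Array.isArray(value) && value.length === 1) value = value[0];

  if (typeof value !== "object" || value === null || Array.isArray(value)) return value;

  // Cases: markdown-wrapped enums and trailing punctuation on enum fields.
  const repaired: Record<string, unknown> = { ...(value as Record<string, unknown>) };
  for (const field of enumFields) {
    const current = repaired[field];
    if (typeof current === "string") repaired[field] = cleanEnumString(current);
  }
  return repaired;
}
```

A pass like this would run between the raw model output and schema validation, so the schema still has the final say on what is accepted.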
What's still open:
- Some compat-provider models (Groq, Together AI, Fireworks, Clarifai, SambaNova) may require a `pricingOverrides` entry to satisfy the registry's pricing-validation step. Bundled pricing tables cover OpenAI, Anthropic, Google, and Ollama by default. Worked examples for Clarifai's Qwen3.6 35B A3B FP8 and SambaNova's MiniMax-M2.7 are in the openai adapter docs.
- Vercel adapter `runAgent` is single-turn only (multi-turn lands in v0.2).
- Registry walks the chain on budget gating AND on runtime errors (alpha.7+, default predicate: `ProviderUnavailableError`). Configurable via `runtimeFallback: "none" | "default" | { shouldFallback }`. Streaming methods walk only on stream-creation failure, not mid-iteration.
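The chain-walking behaviour in the last item can be sketched as a loop with two exits: skip on budget, fall through on an error the predicate accepts. This is a simplified model for non-streaming calls only. `ChainEntry`, `toPredicate`, and `walkChain` are hypothetical, and matching the default predicate on the error's `name` is a stand-in for however the library detects `ProviderUnavailableError`.

```typescript
type RuntimeFallback = "none" | "default" | { shouldFallback: (err: unknown) => boolean };

interface ChainEntry {
  alias: string;
  withinBudget: () => boolean;
  call: (prompt: string) => Promise<string>;
}

function toPredicate(mode: RuntimeFallback): (err: unknown) => boolean {
  if (mode === "none") return () => false;
  if (mode === "default") {
    return (err) => err instanceof Error && err.name === "ProviderUnavailableError";
  }
  return mode.shouldFallback;
}

async function walkChain(
  chain: ChainEntry[],
  prompt: string,
  mode: RuntimeFallback = "default",
): Promise<{ alias: string; text: string }> {
  const shouldFallback = toPredicate(mode);
  let lastError: unknown = new Error("No provider in the chain was within budget");
  for (const entry of chain) {
    if (!entry.withinBudget()) continue; // budget gate: skip without calling the provider
    try {
      return { alias: entry.alias, text: await entry.call(prompt) };
    } catch (err) {
      if (!shouldFallback(err)) throw err; // not a fallback-worthy error: surface it now
      lastError = err;
    }
  }
  throw lastError;
}
```

With `"none"`, the first runtime error propagates to the caller even if later providers are healthy, which is the trade-off the setting exposes.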
If you hit something not listed here, please open an issue — the bug-report template captures the version + repro shape we need.
llm-ports is in alpha. All 7 packages are at v0.1.0-alpha.13. Stable v0.1 lands after a short alpha bake — see the v0.1 status page for what's stable today vs still being hardened.
```bash
npm install @llm-ports/core
```

Install adapters as needed:

```bash
npm install @llm-ports/adapter-anthropic
npm install @llm-ports/adapter-openai
npm install @llm-ports/adapter-google
npm install @llm-ports/adapter-ollama
npm install @llm-ports/adapter-vercel
npm install @llm-ports/capabilities
```

(All seven packages are scoped under @llm-ports. They're versioned together via changesets.)
Peer dependency: zod >=3.24.0 <5. Bring your own SDKs (@anthropic-ai/sdk, openai, ollama, ai).
Documentation site (auto-deployed from docs/ on every push to main):
https://baabakk.github.io/llm-ports/
Pages:
- Getting Started
- Concepts: ports, adapters, task routing, cost gating, content blocks, validation strategies
- Guides: multi-provider routing, local-to-cloud, cost control, custom adapters, observability, security
- Capabilities: one page per capability
- Adapters: one page per adapter and feature matrix
- Migration: from Vercel AI SDK, LangChain.js, and direct provider SDKs
Tool use without a threat model is dangerous.
llm-ports treats security as a first-class part of the API:
- destructive tool markers
- confirmation-required actions
- max output byte limits
- redaction capability
- explicit guidance for prompt injection and tool abuse
See SECURITY.md.
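As an illustration of how the markers above can be enforced, the sketch below wraps a tool so that destructive or confirmation-required calls pass through an approver first, and output is capped by byte length. The flag names `destructive`, `requiresConfirmation`, and `maxOutputBytes` come from this README; `ToolDef`, `Approver`, and `withApprovalGate` are hypothetical and do not describe the library's actual wrapper.

```typescript
interface ToolDef {
  toolName: string;
  destructive?: boolean;
  requiresConfirmation?: boolean;
  maxOutputBytes?: number;
  run: (args: Record<string, unknown>) => Promise<string>;
}

// Typically a human-in-the-loop prompt or a policy check.
type Approver = (tool: ToolDef, args: Record<string, unknown>) => Promise<boolean>;

function withApprovalGate(tool: ToolDef, approve: Approver): ToolDef {
  return {
    ...tool,
    async run(args) {
      const needsApproval = Boolean(tool.destructive || tool.requiresConfirmation);
      if (needsApproval && !(await approve(tool, args))) {
        // Return a message instead of throwing so the agent loop can continue safely.
        return `Tool "${tool.toolName}" was not approved; no action taken.`;
      }
      const output = await tool.run(args);
      const limit = tool.maxOutputBytes;
      if (limit === undefined) return output;
      // Cap by UTF-8 bytes. Cutting mid-character decodes to a replacement character.
      const bytes = new TextEncoder().encode(output);
      return bytes.length <= limit ? output : new TextDecoder().decode(bytes.slice(0, limit));
    },
  };
}
```

Keeping the gate outside the tool body means a tool author cannot forget it, and the model never sees more output than the cap allows.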
Contributions are welcome after the initial v0.1 scaffolding lands.
See CONTRIBUTING.md.
MIT. See LICENSE.
Pre-release.
Current target:
- v0.1: core, adapters, cost gating, 7 capability factories
- v0.2: expanded capabilities and observability package
- v0.3: additional adapters and markdown skill format evaluation
llm-ports is pre-release. To get notified when v0.1 lands on the latest tag (and for every minor release after):
- Click the Watch button at the top of the GitHub repo
- Choose Custom
- Enable Releases
You'll get an email or notification only when a real version ships. No PR or commit noise.