Add custom providers and models (Ollama, vLLM, LM Studio, proxies) via `~/.pi/agent/models.json`.
- Minimal Example
- Full Example
- Supported APIs
- Provider Configuration
- Model Configuration
- Overriding Built-in Providers
- Per-model Overrides
- OpenAI Compatibility
## Minimal Example

For local models (Ollama, LM Studio, vLLM), only `id` is required per model:

```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        { "id": "llama3.1:8b" },
        { "id": "qwen2.5-coder:7b" }
      ]
    }
  }
}
```

The `apiKey` field is required, but Ollama ignores it, so any value works.
Some OpenAI-compatible servers do not understand the `developer` role used for reasoning-capable models. For those providers, set `compat.supportsDeveloperRole` to `false` so pi sends the system prompt as a `system` message instead. If the server also does not support `reasoning_effort`, set `compat.supportsReasoningEffort` to `false` as well.

You can set `compat` at the provider level to apply to all models, or at the model level to override a specific model. This commonly applies to Ollama, vLLM, SGLang, and similar OpenAI-compatible servers.
```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false
      },
      "models": [
        {
          "id": "gpt-oss:20b",
          "reasoning": true
        }
      ]
    }
  }
}
```

## Full Example

Override defaults when you need specific values:
```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        {
          "id": "llama3.1:8b",
          "name": "Llama 3.1 8B (Local)",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 32000,
          "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
        }
      ]
    }
  }
}
```

The file is reloaded each time you open `/model`, so you can edit it mid-session; no restart is needed.

## Supported APIs
| API | Description |
|---|---|
| `openai-completions` | OpenAI Chat Completions (most compatible) |
| `openai-responses` | OpenAI Responses API |
| `anthropic-messages` | Anthropic Messages API |
| `google-generative-ai` | Google Generative AI |
Set `api` at the provider level (the default for all models) or at the model level (a per-model override).
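As a sketch of the two levels (the provider name, URL, and model ids below are placeholders, not real entries): the first model inherits the provider's `api`, while the second overrides it.

```json
{
  "providers": {
    "my-gateway": {
      "baseUrl": "https://gateway.example.com/v1",
      "apiKey": "GATEWAY_API_KEY",
      "api": "openai-completions",
      "models": [
        { "id": "chat-model" },
        { "id": "responses-model", "api": "openai-responses" }
      ]
    }
  }
}
```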
## Provider Configuration

| Field | Description |
|---|---|
| `baseUrl` | API endpoint URL |
| `api` | API type (see above) |
| `apiKey` | API key (see value resolution below) |
| `headers` | Custom headers (see value resolution below) |
| `authHeader` | Set `true` to add `Authorization: Bearer <apiKey>` automatically |
| `models` | Array of model configurations |
| `modelOverrides` | Per-model overrides for built-in models on this provider |
The `apiKey` and `headers` fields support three formats:

- **Shell command:** `"!command"` executes the command and uses its stdout.
  - `"apiKey": "!security find-generic-password -ws 'anthropic'"`
  - `"apiKey": "!op read 'op://vault/item/credential'"`
- **Environment variable:** uses the value of the named environment variable, e.g. `"apiKey": "MY_API_KEY"`.
- **Literal value:** used directly, e.g. `"apiKey": "sk-..."`.
```json
{
  "providers": {
    "custom-proxy": {
      "baseUrl": "https://proxy.example.com/v1",
      "apiKey": "MY_API_KEY",
      "api": "anthropic-messages",
      "headers": {
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-secret": "!op read 'op://vault/item/secret'"
      },
      "models": [...]
    }
  }
}
```

## Model Configuration

| Field | Required | Default | Description |
|---|---|---|---|
| `id` | Yes | — | Model identifier (passed to the API) |
| `name` | No | `id` | Human-readable model label. Used for matching (`--model` patterns) and shown in model details/status text. |
| `api` | No | provider's `api` | Override the provider's API for this model |
| `reasoning` | No | `false` | Supports extended thinking |
| `input` | No | `["text"]` | Input types: `["text"]` or `["text", "image"]` |
| `contextWindow` | No | `128000` | Context window size in tokens |
| `maxTokens` | No | `16384` | Maximum output tokens |
| `cost` | No | all zeros | `{"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}` (per million tokens) |
| `compat` | No | provider `compat` | OpenAI compatibility overrides; merged with provider-level `compat` when both are set |
Current behavior:

- `/model` and `--list-models` list entries by model `id`.
- The configured `name` is used for model matching and detail/status text.
## Overriding Built-in Providers

Route a built-in provider through a proxy without redefining models:

```json
{
  "providers": {
    "anthropic": {
      "baseUrl": "https://my-proxy.example.com/v1"
    }
  }
}
```

All built-in Anthropic models remain available, and existing OAuth or API key auth continues to work.
To merge custom models into a built-in provider, include the `models` array:

```json
{
  "providers": {
    "anthropic": {
      "baseUrl": "https://my-proxy.example.com/v1",
      "apiKey": "ANTHROPIC_API_KEY",
      "api": "anthropic-messages",
      "models": [...]
    }
  }
}
```

Merge semantics:

- Built-in models are kept.
- Custom models are upserted by `id` within the provider.
- If a custom model `id` matches a built-in model `id`, the custom model replaces that built-in model.
- If a custom model `id` is new, it is added alongside the built-in models.
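For illustration, assuming `claude-sonnet-4` is a built-in model id on the `anthropic` provider (check your build's model list), the first entry below would replace that built-in model, while `my-custom-model` is a new id and would be added alongside the built-ins:

```json
{
  "providers": {
    "anthropic": {
      "models": [
        {
          "id": "claude-sonnet-4",
          "name": "Claude Sonnet 4 (Custom Limits)",
          "maxTokens": 8192
        },
        { "id": "my-custom-model" }
      ]
    }
  }
}
```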
## Per-model Overrides

Use `modelOverrides` to customize specific built-in models without replacing the provider's full model list.
```json
{
  "providers": {
    "openrouter": {
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "name": "Claude Sonnet 4 (Bedrock Route)",
          "compat": {
            "openRouterRouting": {
              "only": ["amazon-bedrock"]
            }
          }
        }
      }
    }
  }
}
```

`modelOverrides` supports these fields per model: `name`, `reasoning`, `input`, `cost` (partial), `contextWindow`, `maxTokens`, `headers`, `compat`.

Behavior notes:

- `modelOverrides` are applied to built-in provider models.
- Unknown model IDs are ignored.
- You can combine provider-level `baseUrl`/`headers` with `modelOverrides`.
- If `models` is also defined for a provider, custom models are merged after the built-in overrides. A custom model with the same `id` replaces the overridden built-in model entry.
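Combining a provider-level `baseUrl` with `modelOverrides` might look like the sketch below (the proxy URL is a placeholder; the routed model keeps all its built-in settings except the overridden fields):

```json
{
  "providers": {
    "openrouter": {
      "baseUrl": "https://my-proxy.example.com/api/v1",
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "name": "Claude Sonnet 4 (Proxied)",
          "maxTokens": 16384
        }
      }
    }
  }
}
```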
## OpenAI Compatibility

For providers with partial OpenAI compatibility, use the `compat` field.

- Provider-level `compat` applies defaults to all models under that provider.
- Model-level `compat` overrides provider-level values for that model.
```json
{
  "providers": {
    "local-llm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "compat": {
        "supportsUsageInStreaming": false,
        "maxTokensField": "max_tokens"
      },
      "models": [...]
    }
  }
}
```

| Field | Description |
|---|---|
| `supportsStore` | Provider supports the `store` field |
| `supportsDeveloperRole` | Use `developer` vs `system` role |
| `supportsReasoningEffort` | Support for the `reasoning_effort` parameter |
| `reasoningEffortMap` | Map pi thinking levels to provider-specific `reasoning_effort` values |
| `supportsUsageInStreaming` | Supports `stream_options: { include_usage: true }` (default: `true`) |
| `maxTokensField` | Use `max_completion_tokens` or `max_tokens` |
| `requiresToolResultName` | Include `name` on tool result messages |
| `requiresAssistantAfterToolResult` | Insert an assistant message before a user message after tool results |
| `requiresThinkingAsText` | Convert thinking blocks to plain text |
| `thinkingFormat` | Use `reasoning_effort`, `zai`, `qwen`, or `qwen-chat-template` thinking parameters |
| `supportsStrictMode` | Include the `strict` field in tool definitions |
| `openRouterRouting` | OpenRouter routing config passed to OpenRouter for model/provider selection |
| `vercelGatewayRouting` | Vercel AI Gateway routing config for provider selection (`only`, `order`) |
`qwen` uses top-level `enable_thinking`. Use `qwen-chat-template` for local Qwen-compatible servers that require `chat_template_kwargs.enable_thinking`.
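A minimal sketch for such a local Qwen-style server (the provider name, port, and model id are placeholders):

```json
{
  "providers": {
    "local-qwen": {
      "baseUrl": "http://localhost:8000/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "compat": {
        "thinkingFormat": "qwen-chat-template"
      },
      "models": [
        { "id": "qwen3-8b", "reasoning": true }
      ]
    }
  }
}
```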
Example:

```json
{
  "providers": {
    "openrouter": {
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "OPENROUTER_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "openrouter/anthropic/claude-3.5-sonnet",
          "name": "OpenRouter Claude 3.5 Sonnet",
          "compat": {
            "openRouterRouting": {
              "order": ["anthropic"],
              "fallbacks": ["openai"]
            }
          }
        }
      ]
    }
  }
}
```

Vercel AI Gateway example:
```json
{
  "providers": {
    "vercel-ai-gateway": {
      "baseUrl": "https://ai-gateway.vercel.sh/v1",
      "apiKey": "AI_GATEWAY_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "moonshotai/kimi-k2.5",
          "name": "Kimi K2.5 (Fireworks via Vercel)",
          "reasoning": true,
          "input": ["text", "image"],
          "cost": { "input": 0.6, "output": 3, "cacheRead": 0, "cacheWrite": 0 },
          "contextWindow": 262144,
          "maxTokens": 262144,
          "compat": {
            "vercelGatewayRouting": {
              "only": ["fireworks", "novita"],
              "order": ["fireworks", "novita"]
            }
          }
        }
      ]
    }
  }
}
```