- Implement a provider module under `lib/req_llm/providers/`, use `ReqLLM.Provider.DSL` + `Defaults`, and only override what the API actually deviates on.
- The default provider implementation is OpenAI-compatible.
- Non-streaming requests run through Req with `attach/3` + `encode_body/1` + `decode_response/1`; streaming runs through Finch with `attach_stream/4` + `decode_stream_event/2` or `/3`.
- Add models via `priv/models_local/` when you want shared registry coverage, then add tests using the three-tier strategy and record fixtures with `LIVE=true`. For one-off invocation or early development, ReqLLM can also use explicit model specs; see Model Specs.
Adding a provider means implementing a single Elixir module that:
- Translates between canonical types (`Model`, `Context`, `Message`, `ContentPart`, `Tool`) and the provider HTTP API
- Implements the `ReqLLM.Provider` behaviour via the DSL and default callbacks
- Provides SSE-to-`StreamChunk` decoding for streaming when applicable
You should know:
- Provider's API paths, request/response JSON, auth, and streaming protocol
- Req basics (request/response steps) and Finch for streaming
- ReqLLM canonical types (see Data Structures) and normalization principles (Core Concepts)
- Confirm provider supports needed capabilities (chat, tools, images, streaming)
- Gather API key/env var name and any extra headers or versions
- Start with the OpenAI-compatible defaults if at all possible
Create lib/req_llm/providers/<provider>.ex
Use the DSL to register:
- `id` (atom) - Provider identifier
- `base_url` - Default API endpoint
- `metadata` - Path to metadata file (`priv/models_dev/<provider>.json`)
- `default_env_key` - Fallback environment variable for API key
- `provider_schema` - Provider-only options
Required vs optional callbacks:
Required for non-streaming:
- `prepare_request/4` - Configure operation-specific requests
- `attach/3` - Set up authentication and Req pipeline steps
- `encode_body/1` - Transform context to provider JSON
- `decode_response/1` - Parse API responses
Streaming (recommended):
- `attach_stream/4` - Build the complete Finch streaming request
- `decode_stream_event/2` or `/3` - Decode provider SSE events to StreamChunk structs
Optional:
- `extract_usage/2` - Extract usage/cost data
- `translate_options/3` - Provider-specific parameter translation
- `normalize_model_id/1` - Handle model ID aliases
- `parse_stream_protocol/2` - Custom streaming protocol handling
- `init_stream_state/1` - Initialize stateful streaming
- `flush_stream_state/2` - Flush accumulated stream state
Response Assembly (Optional):
- `ResponseBuilder.build_response/3` - Custom response assembly from StreamChunks
Prefer `use ReqLLM.Provider.Defaults` to get robust OpenAI-style defaults and override only when needed.
If you are developing a provider outside of the req_llm library (e.g., in your own application), you must register it so req_llm can discover it.
Option 1: Config-based registration (recommended)
Add the module to your config.exs:
# In config/config.exs
config :req_llm, :custom_providers, [MyApp.Providers.Acme]

This tells ReqLLM to automatically load your provider at application startup.
Option 2: Manual registration in Application.start/2
defmodule MyApp.Application do
use Application
def start(_type, _args) do
ReqLLM.Providers.register(MyApp.Providers.Acme)
# ... rest of supervision tree
end
end

Custom providers are not in the LLMDB catalog, so you cannot use string specs like "acme:model-name". Instead, use map-based model specs:
{:ok, model} = ReqLLM.model(%{id: "acme-chat-mini", provider: :acme})
{:ok, response} = ReqLLM.generate_text(model, "Hello!")

Or pass the model struct directly:
model = LLMDB.Model.new!(%{id: "acme-chat-mini", provider: :acme})
{:ok, response} = ReqLLM.generate_text(model, "Hello!")

Note: The `mix mc` (model compatibility) task is for validating models in the LLMDB catalog. It does not apply to custom providers.
Version Note: The `mix mc` alias requires ReqLLM >= 1.1. If you see `** (Mix) The task "mc" could not be found`, use `mix req_llm.model_compat` instead, or upgrade ReqLLM.
This example shows a provider that reuses defaults and only adds custom headers:
defmodule ReqLLM.Providers.Acme do
@moduledoc "Acme – OpenAI-compatible chat API."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :acme,
base_url: "https://api.acme.ai/v1",
metadata: "priv/models_dev/acme.json",
default_env_key: "ACME_API_KEY",
provider_schema: [
organization: [type: :string, doc: "Tenant/Org header"]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = super(request, model_input, user_opts)
org = user_opts[:organization]
case org do
nil -> request
_ -> Req.Request.put_header(request, "x-acme-organization", org)
end
end
end

What you get for free:
- Non-streaming: Req pipeline with Bearer auth, JSON encode/decode in OpenAI shape
- Streaming: Finch request builder with OpenAI-compatible body and SSE decoding
- Usage extraction from response body
- Error handling and retry logic
This example shows custom encoding/decoding for a provider with different JSON schema:
defmodule ReqLLM.Providers.Zephyr do
@moduledoc "Zephyr – custom JSON schema, SSE streaming."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :zephyr,
base_url: "https://api.zephyr.ai",
metadata: "priv/models_dev/zephyr.json",
default_env_key: "ZEPHYR_API_KEY",
provider_schema: [
version: [type: :string, default: "2024-10-01"],
tenant: [type: :string]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = ReqLLM.Provider.Defaults.default_attach(__MODULE__, request, model_input, user_opts)
request
|> Req.Request.put_header("x-zephyr-version", user_opts[:version] || "2024-10-01")
|> then(fn req ->
case user_opts[:tenant] do
nil -> req
t -> Req.Request.put_header(req, "x-zephyr-tenant", t)
end
end)
end
@impl ReqLLM.Provider
def encode_body(%Req.Request{} = request) do
context = request.options[:context]
model = request.options[:model]
stream = request.options[:stream] == true
tools = request.options[:tools] || []
provider_opts = request.options[:provider_options] || []
messages =
Enum.map(context.messages, fn m ->
%{
role: Atom.to_string(m.role),
parts: Enum.map(m.content, &encode_part/1)
}
end)
body =
%{
model: model,
messages: messages,
stream: stream
}
|> maybe_put(:temperature, request.options[:temperature])
|> maybe_put(:max_output_tokens, request.options[:max_tokens])
|> maybe_put(:tools, encode_tools(tools))
|> Map.merge(Map.new(provider_opts))
encoded = Jason.encode!(body)
request
|> Req.Request.put_header("content-type", "application/json")
|> Map.put(:body, encoded)
end
@impl ReqLLM.Provider
def decode_response({req, resp}) do
case resp.status do
200 ->
body = ensure_parsed_body(resp.body)
with {:ok, response} <- decode_chat_response(body, req) do
{req, %{resp | body: response}}
else
{:error, reason} ->
{req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
end
status ->
{req,
ReqLLM.Error.API.Response.exception(
reason: "Zephyr API error",
status: status,
response_body: resp.body
)}
end
end
@impl ReqLLM.Provider
def attach_stream(model, context, opts, _finch_name) do
api_key = ReqLLM.Keys.get!(model, opts)
url = Keyword.get(opts, :base_url, default_base_url()) <> "/chat:stream"
headers = [
{"authorization", "Bearer " <> api_key},
{"content-type", "application/json"},
{"accept", "text/event-stream"}
]
req = %Req.Request{
options: %{
model: model.model,
context: context,
stream: true,
provider_options: opts[:provider_options] || []
}
}
body = encode_body(req).body
{:ok, Finch.build(:post, url, headers, body)}
end
@impl ReqLLM.Provider
def decode_stream_event(%{data: data}, model) do
case Jason.decode(data) do
{:ok, %{"type" => "delta", "text" => text}} when is_binary(text) and text != "" ->
[ReqLLM.StreamChunk.text(text)]
{:ok, %{"type" => "reasoning", "text" => think}} when is_binary(think) and think != "" ->
[ReqLLM.StreamChunk.thinking(think)]
{:ok, %{"type" => "tool_call", "name" => name, "arguments" => args}} ->
[ReqLLM.StreamChunk.tool_call(name, Map.new(args))]
{:ok, %{"type" => "usage", "usage" => usage}} ->
[ReqLLM.StreamChunk.meta(%{usage: normalize_usage(usage), model: model.model})]
{:ok, %{"type" => "done", "finish_reason" => reason}} ->
[ReqLLM.StreamChunk.meta(%{
finish_reason: normalize_finish_reason(reason),
terminal?: true
})]
_ ->
[]
end
end
@impl ReqLLM.Provider
def extract_usage(body, _model) when is_map(body) do
case body do
%{"usage" => u} -> {:ok, normalize_usage(u)}
_ -> {:error, :no_usage}
end
end
@impl ReqLLM.Provider
def translate_options(:chat, _model, opts) do
    {max_tokens, opts} = Keyword.pop(opts, :max_tokens)

    translated =
      opts
      |> Keyword.drop([:presence_penalty])
      |> then(fn o -> if max_tokens, do: Keyword.put(o, :max_output_tokens, max_tokens), else: o end)

    {translated, []}
end
# Helper functions
defp encode_part(%ReqLLM.Message.ContentPart{type: :text, text: t}),
do: %{"type" => "text", "text" => t}
defp encode_part(%ReqLLM.Message.ContentPart{type: :image_url, url: url}),
do: %{"type" => "image_url", "url" => url}
defp encode_part(%ReqLLM.Message.ContentPart{type: :image, data: bin, media_type: mt}),
do: %{"type" => "image", "data" => Base.encode64(bin), "media_type" => mt}
defp encode_part(%ReqLLM.Message.ContentPart{type: :file, data: bin, media_type: mt, name: name}),
do: %{"type" => "file", "name" => name, "data" => Base.encode64(bin), "media_type" => mt}
defp encode_part(%ReqLLM.Message.ContentPart{type: :thinking, text: t}),
do: %{"type" => "thinking", "text" => t}
defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: a}),
do: %{"type" => "tool_call", "name" => n, "arguments" => a}
defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: a}),
do: %{"type" => "tool_result", "name" => n, "result" => a}
defp decode_chat_response(body, req) do
with %{"message" => %{"role" => role, "content" => content}} <- body,
{:ok, message} <- to_message(role, content) do
{:ok,
%ReqLLM.Response{
id: body["id"] || "zephyr_" <> Integer.to_string(System.unique_integer([:positive])),
model: req.options[:model],
context: req.options[:context] || ReqLLM.Context.new([]),
message: message,
usage: normalize_usage(body["usage"] || %{}),
stream?: false
}}
else
_ -> {:error, :unexpected_body}
end
end
defp to_message(role, parts) do
content_parts =
Enum.flat_map(parts, fn
%{"type" => "text", "text" => t} ->
[%ReqLLM.Message.ContentPart{type: :text, text: t}]
%{"type" => "thinking", "text" => t} ->
[%ReqLLM.Message.ContentPart{type: :thinking, text: t}]
%{"type" => "tool_call", "name" => n, "arguments" => a} ->
[%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: Map.new(a)}]
%{"type" => "tool_result", "name" => n, "result" => r} ->
[%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: Map.new(r)}]
_ -> []
end)
{:ok, %ReqLLM.Message{role: String.to_existing_atom(role), content: content_parts}}
end
defp encode_tools([]), do: nil
defp encode_tools(tools) do
Enum.map(tools, &ReqLLM.Tool.to_schema(&1, :openai))
end
defp maybe_put(map, _k, nil), do: map
defp maybe_put(map, k, v), do: Map.put(map, k, v)
defp ensure_parsed_body(body) when is_binary(body), do: Jason.decode!(body)
defp ensure_parsed_body(body), do: body
defp normalize_usage(%{"prompt" => i, "completion" => o}),
do: %{input_tokens: i, output_tokens: o, total_tokens: (i || 0) + (o || 0)}
defp normalize_usage(%{"input_tokens" => i, "output_tokens" => o, "total_tokens" => t}),
do: %{input_tokens: i || 0, output_tokens: o || 0, total_tokens: t || (i || 0) + (o || 0)}
defp normalize_usage(_),
do: %{input_tokens: 0, output_tokens: 0, total_tokens: 0}
defp normalize_finish_reason("stop"), do: :stop
defp normalize_finish_reason("length"), do: :length
defp normalize_finish_reason("tool"), do: :tool_calls
defp normalize_finish_reason(_), do: :error
end

Always convert ReqLLM.Context (a list of Messages with ContentParts) to provider JSON.
Message structure:
- `role` is `:system` | `:user` | `:assistant` | `:tool`
- `content` is a list of `ContentPart`
ContentPart variants to handle:
text("...")- Plain text contentimage_url("...")- Image from URLimage(binary, mime)- Base64-encoded imagefile(binary, name, mime)- File attachmentthinking("...")- Reasoning tokens (for models that expose them)tool_call(name, map)- Function call requesttool_result(tool_call_id_or_name, map)- Function call result
Non-streaming:
Decode provider JSON into a single assistant ReqLLM.Message with canonical ContentParts and fill ReqLLM.Response:
- `Response.message` is the assistant message
- `Response.usage` is normalized when available
- For object generation, preserve `tool_call`/`tool_result` or JSON content so `ReqLLM.Response.object/1` works consistently
Streaming (SSE):
Map each provider event into one or more ReqLLM.StreamChunk:
- `:content` — Text tokens
- `:thinking` — Reasoning tokens
- `:tool_call` — Function name + arguments (may arrive in fragments)
- `:meta` — Usage deltas, finish_reason, `terminal?: true` on completion
One conversation model, one streaming shape, one response shape: Never leak provider specifics to callers; normalize at the adapter boundary.
Create priv/models_local/<provider>.json to seed/supplement models before syncing:
{
"provider": {
"id": "acme",
"name": "Acme AI"
},
"models": [
{
"id": "acme-chat-mini",
"name": "Acme Chat Mini",
"type": "chat",
"capabilities": {
"stream": true,
"tool_call": true,
"vision": true
},
"modalities": {
"input": ["text","image"],
"output": ["text"]
},
"cost": {
"input": 0.00015,
"output": 0.0006
}
}
]
}

Model metadata is provided by the llm_db dependency. For custom providers not yet in llm_db, add a local patch file in priv/models_local/ when you want registry and tooling support. That is not required just to call a model through an explicit %LLMDB.Model{} or ReqLLM.model!/1.
The registry enables:
- Validation with `mix mc`
- Model lookup by `"acme:acme-chat-mini"` (see the example below)
- Capability gating in tests
ReqLLM uses a three-tier testing architecture:
Under test/req_llm/ for core types/helpers.
Under test/providers/, unit-testing your encoding/decoding and options behavior with small bodies.
Example:
defmodule Providers.AcmeTest do
use ExUnit.Case, async: true
alias ReqLLM.Message.ContentPart
test "encode_body: text + tools into OpenAI shape" do
ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
{:ok, model} = ReqLLM.model("acme:acme-chat-mini")
req =
Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
|> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
|> ReqLLM.Providers.Acme.encode_body()
assert is_binary(req.body)
body = Jason.decode!(req.body)
assert body["model"] =~ "acme-chat-mini"
assert body["messages"] |> is_list()
end
end

Under test/coverage/, using the fixture system for integration against the high-level API.
Example:
defmodule Coverage.AcmeChatTest do
use ExUnit.Case, async: false
use ReqLLM.Test.LiveFixture, provider: :acme
test "basic text generation" do
{:ok, response} =
use_fixture(:provider, "acme-basic", fn ->
ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
end)
assert ReqLLM.Response.text(response) =~ "hi"
end
test "streaming tokens" do
{:ok, sr} =
use_fixture(:provider, "acme-stream", fn ->
ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
end)
tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
assert length(tokens) >= 3
end
end

# Record fixtures during live test runs
LIVE=true mix test --only provider:acme
# Or use model compatibility tool
mix mc "acme:*" --record# Quick validation
mix mc
# Sample models during development
mix mc --sample

Always use ReqLLM.Keys for key retrieval. Never read System.get_env/1 directly.
api_key = ReqLLM.Keys.get!(model, opts)

The DSL's default_env_key is the fallback env var name. ReqLLM.Keys also supports:
- Application config
- Per-call override via `opts[:api_key]` (see the sketch after this list)
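For example, a per-call override might look like the following sketch; `my_runtime_key` is a hypothetical value your application resolved at runtime, and the acme spec is the example model used throughout this guide:

```elixir
# opts[:api_key] takes precedence; otherwise ReqLLM.Keys falls back to
# application config and finally the provider's default_env_key (e.g. ACME_API_KEY).
{:ok, response} =
  ReqLLM.generate_text("acme:acme-chat-mini", "Hello!", api_key: my_runtime_key)
```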
Attach Bearer header in attach/3 or use Defaults (already sets authorization):
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
api_key = ReqLLM.Keys.get!(model_input, user_opts)
request
|> Req.Request.put_header("authorization", "Bearer #{api_key}")
|> Req.Request.put_header("content-type", "application/json")
end

- `ReqLLM.Error.Auth` - Missing/invalid API keys
- `ReqLLM.Error.API.Request` - HTTP request issues
- `ReqLLM.Error.API.Response` - HTTP response errors
- `ReqLLM.Error.Parse` - JSON/body shape issues
In decode_response/1, return {req, exception} for non-200 or malformed payloads:
@impl ReqLLM.Provider
def decode_response({req, resp}) do
case resp.status do
200 ->
body = ensure_parsed_body(resp.body)
with {:ok, response} <- decode_chat_response(body, req) do
{req, %{resp | body: response}}
else
{:error, reason} ->
{req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
end
status ->
{req,
ReqLLM.Error.API.Response.exception(
reason: "API error",
status: status,
response_body: resp.body
)}
end
end

The pipeline will propagate errors consistently to callers.
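At the call site, these exceptions typically surface as error tuples from the high-level API. A sketch of how a caller might branch on them, assuming the `{:error, exception}` return shape:

```elixir
case ReqLLM.generate_text("acme:acme-chat-mini", "Hello!") do
  {:ok, response} ->
    ReqLLM.Response.text(response)

  # HTTP-level failure surfaced by decode_response/1
  {:error, %ReqLLM.Error.API.Response{status: status}} ->
    {:error, {:api_error, status}}

  # Auth, request, or parse errors
  {:error, error} ->
    {:error, error}
end
```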
Different LLM providers have subtle differences in how they represent responses, tool calls, finish reasons, and metadata. Previously, these differences were handled in multiple places (streaming vs non-streaming, provider-specific decoders), leading to behavioral inconsistencies.
The ResponseBuilder behaviour centralizes provider-specific Response assembly logic, ensuring that:
- Streaming and non-streaming produce identical Response structs
- Provider quirks are handled in one place per provider
- New providers have a clear extension point
Both streaming and non-streaming paths converge on ResponseBuilder:
- Decode wire format to `[StreamChunk.t()]`
- Collect metadata (usage, finish_reason, provider-specific)
- Call the appropriate builder:
builder = ResponseBuilder.for_model(model)
{:ok, response} = builder.build_response(chunks, metadata, opts)

`ResponseBuilder.for_model/1` routes to provider-specific builders:
- Anthropic models → `Anthropic.ResponseBuilder`
- Google/Vertex models → `Google.ResponseBuilder`
- OpenAI Responses API models → `OpenAI.ResponsesAPI.ResponseBuilder`
- All others → `Provider.Defaults.ResponseBuilder`
Most providers can use Provider.Defaults.ResponseBuilder. Implement a custom builder when:
- Content block requirements: Anthropic requires content blocks to never be empty
- Provider-specific metadata: OpenAI Responses API needs to propagate `response_id` for stateless multi-turn
- Finish reason detection: Google needs to detect `functionCall` to set the correct finish_reason
- Custom tool call handling: Provider has a non-standard tool call representation
defmodule ReqLLM.Providers.Zephyr.ResponseBuilder do
@moduledoc "Custom ResponseBuilder for Zephyr provider."
@behaviour ReqLLM.Provider.ResponseBuilder
alias ReqLLM.Provider.Defaults.ResponseBuilder, as: DefaultBuilder
@impl true
def build_response(chunks, metadata, opts) do
# Delegate to default builder for standard processing
with {:ok, response} <- DefaultBuilder.build_response(chunks, metadata, opts) do
# Apply provider-specific post-processing
response = apply_zephyr_quirks(response, metadata)
{:ok, response}
end
end
defp apply_zephyr_quirks(response, metadata) do
# Example: Zephyr includes session_id in metadata
case metadata[:session_id] do
nil -> response
sid -> %{response | provider_meta: Map.put(response.provider_meta, :session_id, sid)}
end
end
end

Then register the builder by adding a clause to ResponseBuilder.for_model/1 (for built-in providers) or by pattern matching on your model in your provider's streaming/non-streaming paths.
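For a built-in provider, the routing clause might look like the following sketch; the exact match shape and module should mirror the existing clauses in `ResponseBuilder.for_model/1`:

```elixir
# Hypothetical clause added alongside the existing ones in ResponseBuilder.for_model/1,
# matching any model whose provider is :zephyr.
def for_model(%{provider: :zephyr}), do: ReqLLM.Providers.Zephyr.ResponseBuilder
```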
Let's add a fictional provider called "Acme" from start to finish.
File: lib/req_llm/providers/acme.ex
defmodule ReqLLM.Providers.Acme do
@moduledoc "Acme – OpenAI-compatible chat API."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :acme,
base_url: "https://api.acme.ai/v1",
metadata: "priv/models_dev/acme.json",
default_env_key: "ACME_API_KEY",
provider_schema: [
organization: [type: :string, doc: "Tenant/Org header"]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = super(request, model_input, user_opts)
org = user_opts[:organization]
case org do
nil -> request
_ -> Req.Request.put_header(request, "x-acme-organization", org)
end
end
end

File: priv/models_local/acme.json
{
"provider": {
"id": "acme",
"name": "Acme AI"
},
"models": [
{
"id": "acme-chat-mini",
"name": "Acme Chat Mini",
"type": "chat",
"capabilities": {
"stream": true,
"tool_call": true,
"vision": true
},
"modalities": {
"input": ["text","image"],
"output": ["text"]
},
"cost": {
"input": 0.00015,
"output": 0.0006
}
}
]
}

export ACME_API_KEY=sk-...
mix req_llm.gen "Hello" --model acme:acme-chat-mini

File: test/providers/acme_test.exs
defmodule Providers.AcmeTest do
use ExUnit.Case, async: true
alias ReqLLM.Message.ContentPart
test "encode_body: text + tools into OpenAI shape" do
ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
{:ok, model} = ReqLLM.model("acme:acme-chat-mini")
req =
Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
|> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
|> ReqLLM.Providers.Acme.encode_body()
assert is_binary(req.body)
body = Jason.decode!(req.body)
assert body["model"] =~ "acme-chat-mini"
assert body["messages"] |> is_list()
end
end

File: test/coverage/acme_chat_test.exs
defmodule Coverage.AcmeChatTest do
use ExUnit.Case, async: false
use ReqLLM.Test.LiveFixture, provider: :acme
test "basic text generation" do
{:ok, response} =
use_fixture(:provider, "acme-basic", fn ->
ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
end)
assert ReqLLM.Response.text(response) =~ "hi"
end
test "streaming tokens" do
{:ok, sr} =
use_fixture(:provider, "acme-stream", fn ->
ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
end)
tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
assert length(tokens) >= 3
end
end

# Option 1: During test run
LIVE=true mix test --only provider:acme
# Option 2: Using model compat tool
mix mc "acme:*" --record# Validate Acme models
mix req_llm.model_compat acme
# List all registered providers/models
mix mc --available

- Prefer using `ReqLLM.Provider.Defaults`. Only override what the provider truly deviates on
- Keep `prepare_request/4` a thin dispatcher; centralize option prep in `attach/3` and the defaults pipeline
- No comments inside function bodies. Use clear naming and module docs
- Prefer pattern matching to conditionals
- Use `{:ok, result}` | `{:error, reason}` tuples for fallible helpers
- Use `translate_options/3` to rename/drop provider-specific params (e.g., `max_tokens` → `max_output_tokens`)
- Always map tools via `ReqLLM.Tool.to_schema/2`
- Respect `ContentPart` variants for images/files. Base64-encode if the provider requires it
- Build the Finch request in `attach_stream/4`
- Decode events to `StreamChunk` in `decode_stream_event/2` or `/3`
- Emit a terminal meta chunk with `finish_reason` and usage if provided
- Start with non-streaming happy path, then add streaming and tools
- Record minimal, deterministic fixtures (`temperature: 0`)
- Provider uses non-SSE streaming (binary protocol) or chunked JSON requiring stateful accumulation
- Models with unique parameter semantics that demand `translate_options/3` and capability gating
- Complex multimodal tool invocation requiring custom mapping of multi-part tool args/results
- Implement `parse_stream_protocol/2` for custom binary protocols (e.g., AWS Event Stream)
- Implement `init_stream_state/1`, `decode_stream_event/3`, and `flush_stream_state/2` to accumulate partial tool_call args or demultiplex multi-channel events (see the accumulator sketch after this list)
- Implement `normalize_model_id/1` for regional aliases and `translate_options/3` with warning aggregation
- Provide provider-specific usage accounting that merges multi-phase usage deltas
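For the stateful-accumulation case, the stream state can be as simple as a map of partial tool-call fragments keyed by index. The sketch below shows only the accumulation logic that `init_stream_state/1`, `decode_stream_event/3`, and `flush_stream_state/2` could delegate to; the callbacks' exact signatures should be taken from the `ReqLLM.Provider` behaviour docs:

```elixir
defmodule MyApp.ToolCallAccumulator do
  @moduledoc """
  Hypothetical helper that accumulates partial tool_call argument fragments
  across stream events and emits ReqLLM.StreamChunk.tool_call/2 chunks once
  the JSON arguments are complete.
  """

  # State shape: %{index => %{name: String.t(), fragments: [String.t()]}}
  def init, do: %{}

  # Each delta event contributes a fragment of the JSON-encoded arguments.
  def add_fragment(state, index, name, fragment) do
    Map.update(state, index, %{name: name, fragments: [fragment]}, fn acc ->
      %{acc | fragments: [fragment | acc.fragments]}
    end)
  end

  # On the terminal event, decode each accumulated argument string into a chunk.
  def flush(state) do
    state
    |> Enum.sort_by(fn {index, _acc} -> index end)
    |> Enum.flat_map(fn {_index, %{name: name, fragments: fragments}} ->
      json = fragments |> Enum.reverse() |> Enum.join()

      case Jason.decode(json) do
        {:ok, args} when is_map(args) -> [ReqLLM.StreamChunk.tool_call(name, args)]
        _ -> []
      end
    end)
  end
end
```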
prepare_request/4
- Build Req for the operation
- Defaults cover `:chat`, `:object`, and `:embedding`
attach/3
- Set headers, auth, and pipeline steps
- Defaults add Bearer, retry, error, usage, fixture steps
encode_body/1
- Transform options/context to provider JSON
- Defaults are OpenAI-compatible; override for custom wire formats
decode_response/1
- Map provider body to Response or error
- Defaults map OpenAI-style bodies; override if your shape differs
attach_stream/4
- Must return `{:ok, Finch.Request.t()}`
- Defaults build OpenAI-compatible streaming requests; override for custom endpoints/headers
decode_stream_event/2 or /3
- Map provider events to StreamChunk
- Defaults handle OpenAI-compatible deltas
extract_usage/2
- Normalize usage tokens/cost if provider deviates from standard usage shape
translate_options/3
- Rename/drop options per model or operation
ResponseBuilder.build_response/3
- Build final Response struct from accumulated StreamChunks and metadata
- Defaults handle OpenAI-compatible responses; override for provider-specific quirks
- Required parameters: `chunks` (list of StreamChunk), `metadata` (map with usage, finish_reason, etc.), `opts` (keyword list with `:context` and `:model`)
Adding a provider to ReqLLM involves:
- Creating a provider module with the DSL and behavior implementation
- Implementing encoding/decoding for the provider's wire format
- Optionally implementing a custom `ResponseBuilder` for provider-specific response assembly
- Adding model metadata and syncing the registry
- Writing tests at all three tiers (core, provider, coverage)
- Recording fixtures for validation
By following these guidelines and leveraging the defaults, you can add robust, well-tested provider support that maintains ReqLLM's normalization principles across all AI interactions.