- Implement a provider module under `lib/req_llm/providers/`, use `ReqLLM.Provider.DSL` + `Defaults`, and only override what the API actually deviates on.
- The default provider implementation is OpenAI-compatible.
- Non-streaming requests run through Req with `attach/3` + `encode_body/1` + `decode_response/1`; streaming runs through Finch with `attach_stream/4` + `decode_stream_event/2` or `/3`.
- Add models via `priv/models_local/` when you want shared registry coverage, then add tests using the three-tier strategy and record fixtures with `LIVE=true`. For one-off invocation or early development, ReqLLM can also use explicit model specs; see Model Specs.
Adding a provider means implementing a single Elixir module that:
- Translates between canonical types (`Model`, `Context`, `Message`, `ContentPart`, `Tool`) and the provider HTTP API
- Implements the `ReqLLM.Provider` behaviour via the DSL and default callbacks
- Provides SSE-to-`StreamChunk` decoding for streaming when applicable
You should know:
- Provider's API paths, request/response JSON, auth, and streaming protocol
- Req basics (request/response steps) and Finch for streaming
- ReqLLM canonical types (see Data Structures) and normalization principles (Core Concepts)
- Confirm provider supports needed capabilities (chat, tools, images, streaming)
- Gather API key/env var name and any extra headers or versions
- Start with the OpenAI-compatible defaults if at all possible
Create lib/req_llm/providers/<provider>.ex
Use the DSL to register:
- `id` (atom) - Provider identifier
- `base_url` - Default API endpoint
- `metadata` - Path to metadata file (`priv/models_dev/<provider>.json`)
- `default_env_key` - Fallback environment variable for API key
- `provider_schema` - Provider-only options
Required vs optional callbacks:
Required for non-streaming:
- `prepare_request/4` - Configure operation-specific requests
- `attach/3` - Set up authentication and Req pipeline steps
- `encode_body/1` - Transform context to provider JSON
- `decode_response/1` - Parse API responses
Streaming (recommended):
- `attach_stream/4` - Build the complete Finch streaming request
- `decode_stream_event/2` or `/3` - Decode provider SSE events to StreamChunk structs
Optional:
- `extract_usage/2` - Extract usage/cost data
- `translate_options/3` - Provider-specific parameter translation
- `normalize_model_id/1` - Handle model ID aliases
- `parse_stream_protocol/2` - Custom streaming protocol handling
- `init_stream_state/1` - Initialize stateful streaming
- `flush_stream_state/2` - Flush accumulated stream state
Response Assembly (Optional):
- `ResponseBuilder.build_response/3` - Custom response assembly from StreamChunks
Prefer `use ReqLLM.Provider.Defaults` to get robust OpenAI-style defaults and override only when needed.
If you are developing a provider outside of the req_llm library (e.g., in your own application), you must register it so req_llm can discover it.
Option 1: Config-based registration (recommended)
Add the module to your config.exs:
# In config/config.exs
config :req_llm, :custom_providers, [MyApp.Providers.Acme]

This tells ReqLLM to automatically load your provider at application startup.
Option 2: Manual registration in Application.start/2
defmodule MyApp.Application do
use Application
def start(_type, _args) do
ReqLLM.Providers.register(MyApp.Providers.Acme)
# ... rest of supervision tree
end
end

Custom providers are not in the LLMDB catalog, so you cannot use string specs like "acme:model-name". Instead, use map-based model specs:
{:ok, model} = ReqLLM.model(%{id: "acme-chat-mini", provider: :acme})
{:ok, response} = ReqLLM.generate_text(model, "Hello!")

Or pass the model struct directly:
model = LLMDB.Model.new!(%{id: "acme-chat-mini", provider: :acme})
{:ok, response} = ReqLLM.generate_text(model, "Hello!")

Note: The `mix mc` (model compatibility) task is for validating models in the LLMDB catalog. It does not apply to custom providers.
Version Note: The `mix mc` alias requires ReqLLM >= 1.1. If you see `** (Mix) The task "mc" could not be found`, use `mix req_llm.model_compat` instead, or upgrade ReqLLM.
This example shows a provider that reuses defaults and only adds custom headers:
defmodule ReqLLM.Providers.Acme do
@moduledoc "Acme – OpenAI-compatible chat API."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :acme,
base_url: "https://api.acme.ai/v1",
metadata: "priv/models_dev/acme.json",
default_env_key: "ACME_API_KEY",
provider_schema: [
organization: [type: :string, doc: "Tenant/Org header"]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = super(request, model_input, user_opts)
org = user_opts[:organization]
case org do
nil -> request
_ -> Req.Request.put_header(request, "x-acme-organization", org)
end
end
end

What you get for free:
- Non-streaming: Req pipeline with Bearer auth, JSON encode/decode in OpenAI shape
- Streaming: Finch request builder with OpenAI-compatible body and SSE decoding
- Usage extraction from response body
- Error handling and retry logic
This example shows custom encoding/decoding for a provider with different JSON schema:
defmodule ReqLLM.Providers.Zephyr do
@moduledoc "Zephyr – custom JSON schema, SSE streaming."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :zephyr,
base_url: "https://api.zephyr.ai",
metadata: "priv/models_dev/zephyr.json",
default_env_key: "ZEPHYR_API_KEY",
provider_schema: [
version: [type: :string, default: "2024-10-01"],
tenant: [type: :string]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = ReqLLM.Provider.Defaults.default_attach(__MODULE__, request, model_input, user_opts)
request
|> Req.Request.put_header("x-zephyr-version", user_opts[:version] || "2024-10-01")
|> then(fn req ->
case user_opts[:tenant] do
nil -> req
t -> Req.Request.put_header(req, "x-zephyr-tenant", t)
end
end)
end
@impl ReqLLM.Provider
def encode_body(%Req.Request{} = request) do
context = request.options[:context]
model = request.options[:model]
stream = request.options[:stream] == true
tools = request.options[:tools] || []
provider_opts = request.options[:provider_options] || []
messages =
Enum.map(context.messages, fn m ->
%{
role: Atom.to_string(m.role),
parts: Enum.map(m.content, &encode_part/1)
}
end)
body =
%{
model: model,
messages: messages,
stream: stream
}
|> maybe_put(:temperature, request.options[:temperature])
|> maybe_put(:max_output_tokens, request.options[:max_tokens])
|> maybe_put(:tools, encode_tools(tools))
|> Map.merge(Map.new(provider_opts))
encoded = Jason.encode!(body)
request
|> Req.Request.put_header("content-type", "application/json")
|> Map.put(:body, encoded)
end
@impl ReqLLM.Provider
def decode_response({req, resp}) do
case resp.status do
200 ->
body = ensure_parsed_body(resp.body)
with {:ok, response} <- decode_chat_response(body, req) do
{req, %{resp | body: response}}
else
{:error, reason} ->
{req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
end
status ->
{req,
ReqLLM.Error.API.Response.exception(
reason: "Zephyr API error",
status: status,
response_body: resp.body
)}
end
end
@impl ReqLLM.Provider
def attach_stream(model, context, opts, _finch_name) do
api_key = ReqLLM.Keys.get!(model, opts)
url = Keyword.get(opts, :base_url, default_base_url()) <> "/chat:stream"
headers = [
{"authorization", "Bearer " <> api_key},
{"content-type", "application/json"},
{"accept", "text/event-stream"}
]
req = %Req.Request{
options: %{
model: model.model,
context: context,
stream: true,
provider_options: opts[:provider_options] || []
}
}
body = encode_body(req).body
{:ok, Finch.build(:post, url, headers, body)}
end
@impl ReqLLM.Provider
def decode_stream_event(%{data: data}, model) do
case Jason.decode(data) do
{:ok, %{"type" => "delta", "text" => text}} when is_binary(text) and text != "" ->
[ReqLLM.StreamChunk.text(text)]
{:ok, %{"type" => "reasoning", "text" => think}} when is_binary(think) and think != "" ->
[ReqLLM.StreamChunk.thinking(think)]
{:ok, %{"type" => "tool_call", "name" => name, "arguments" => args}} ->
[ReqLLM.StreamChunk.tool_call(name, Map.new(args))]
{:ok, %{"type" => "usage", "usage" => usage}} ->
[ReqLLM.StreamChunk.meta(%{usage: normalize_usage(usage), model: model.model})]
{:ok, %{"type" => "done", "finish_reason" => reason}} ->
[ReqLLM.StreamChunk.meta(%{
finish_reason: normalize_finish_reason(reason),
terminal?: true
})]
_ ->
[]
end
end
@impl ReqLLM.Provider
def extract_usage(body, _model) when is_map(body) do
case body do
%{"usage" => u} -> {:ok, normalize_usage(u)}
_ -> {:error, :no_usage}
end
end
@impl ReqLLM.Provider
def translate_options(:chat, _model, opts) do
    {max_tokens, opts} = Keyword.pop(opts, :max_tokens)

    translated =
      opts
      |> Keyword.drop([:presence_penalty])
      |> then(fn o -> if max_tokens, do: Keyword.put(o, :max_output_tokens, max_tokens), else: o end)

    {translated, []}
end
# Helper functions
defp encode_part(%ReqLLM.Message.ContentPart{type: :text, text: t}),
do: %{"type" => "text", "text" => t}
defp encode_part(%ReqLLM.Message.ContentPart{type: :image_url, url: url}),
do: %{"type" => "image_url", "url" => url}
defp encode_part(%ReqLLM.Message.ContentPart{type: :image, data: bin, media_type: mt}),
do: %{"type" => "image", "data" => Base.encode64(bin), "media_type" => mt}
defp encode_part(%ReqLLM.Message.ContentPart{type: :file, data: bin, media_type: mt, name: name}),
do: %{"type" => "file", "name" => name, "data" => Base.encode64(bin), "media_type" => mt}
defp encode_part(%ReqLLM.Message.ContentPart{type: :thinking, text: t}),
do: %{"type" => "thinking", "text" => t}
defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: a}),
do: %{"type" => "tool_call", "name" => n, "arguments" => a}
defp encode_part(%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: a}),
do: %{"type" => "tool_result", "name" => n, "result" => a}
defp decode_chat_response(body, req) do
with %{"message" => %{"role" => role, "content" => content}} <- body,
{:ok, message} <- to_message(role, content) do
{:ok,
%ReqLLM.Response{
id: body["id"] || "zephyr_" <> Integer.to_string(System.unique_integer([:positive])),
model: req.options[:model],
context: req.options[:context] || ReqLLM.Context.new([]),
message: message,
usage: normalize_usage(body["usage"] || %{}),
stream?: false
}}
else
_ -> {:error, :unexpected_body}
end
end
defp to_message(role, parts) do
content_parts =
Enum.flat_map(parts, fn
%{"type" => "text", "text" => t} ->
[%ReqLLM.Message.ContentPart{type: :text, text: t}]
%{"type" => "thinking", "text" => t} ->
[%ReqLLM.Message.ContentPart{type: :thinking, text: t}]
%{"type" => "tool_call", "name" => n, "arguments" => a} ->
[%ReqLLM.Message.ContentPart{type: :tool_call, name: n, arguments: Map.new(a)}]
%{"type" => "tool_result", "name" => n, "result" => r} ->
[%ReqLLM.Message.ContentPart{type: :tool_result, name: n, arguments: Map.new(r)}]
_ -> []
end)
{:ok, %ReqLLM.Message{role: String.to_existing_atom(role), content: content_parts}}
end
defp encode_tools([]), do: nil
defp encode_tools(tools) do
Enum.map(tools, &ReqLLM.Tool.to_schema(&1, :openai))
end
defp maybe_put(map, _k, nil), do: map
defp maybe_put(map, k, v), do: Map.put(map, k, v)
defp ensure_parsed_body(body) when is_binary(body), do: Jason.decode!(body)
defp ensure_parsed_body(body), do: body
defp normalize_usage(%{"prompt" => i, "completion" => o}),
do: %{input_tokens: i, output_tokens: o, total_tokens: (i || 0) + (o || 0)}
defp normalize_usage(%{"input_tokens" => i, "output_tokens" => o, "total_tokens" => t}),
do: %{input_tokens: i || 0, output_tokens: o || 0, total_tokens: t || (i || 0) + (o || 0)}
defp normalize_usage(_),
do: %{input_tokens: 0, output_tokens: 0, total_tokens: 0}
defp normalize_finish_reason("stop"), do: :stop
defp normalize_finish_reason("length"), do: :length
defp normalize_finish_reason("tool"), do: :tool_calls
defp normalize_finish_reason(_), do: :error
end

Always convert ReqLLM.Context (a list of Messages with ContentParts) to provider JSON.
Message structure:
- `role` is `:system` | `:user` | `:assistant` | `:tool`
- `content` is a list of `ContentPart`
ContentPart variants to handle:
text("...")- Plain text contentimage_url("...")- Image from URLimage(binary, mime)- Base64-encoded imagefile(binary, name, mime)- File attachmentthinking("...")- Reasoning tokens (for models that expose them)tool_call(name, map)- Function call requesttool_result(tool_call_id_or_name, map)- Function call result
Non-streaming:
Decode provider JSON into a single assistant ReqLLM.Message with canonical ContentParts and fill ReqLLM.Response:
- `Response.message` is the assistant message
- `Response.usage` is normalized when available
- For object generation, preserve `tool_call`/`tool_result` or JSON content so `ReqLLM.Response.object/1` works consistently
Streaming (SSE):
Map each provider event into one or more ReqLLM.StreamChunk:
- `:content` — Text tokens
- `:thinking` — Reasoning tokens
- `:tool_call` — Function name + arguments (may arrive in fragments)
- `:meta` — Usage deltas, finish_reason, `terminal?: true` on completion
One conversation model, one streaming shape, one response shape: Never leak provider specifics to callers; normalize at the adapter boundary.
Create priv/models_local/<provider>.json to seed/supplement models before syncing:
{
"provider": {
"id": "acme",
"name": "Acme AI"
},
"models": [
{
"id": "acme-chat-mini",
"name": "Acme Chat Mini",
"type": "chat",
"capabilities": {
"stream": true,
"tool_call": true,
"vision": true
},
"modalities": {
"input": ["text","image"],
"output": ["text"]
},
"cost": {
"input": 0.00015,
"output": 0.0006
}
}
]
}

Model metadata is provided by the llm_db dependency. For custom providers not yet in llm_db, add a local patch file in priv/models_local/ when you want registry and tooling support. That is not required just to call a model through an explicit %LLMDB.Model{} or ReqLLM.model!/1.
The registry enables:
- Validation with `mix mc`
- Model lookup by `"acme:acme-chat-mini"` (see the example below)
- Capability gating in tests
ReqLLM uses a three-tier testing architecture:
Under test/req_llm/ for core types/helpers.
Under test/providers/, unit-testing your encoding/decoding and options behavior with small bodies.
Example:
defmodule Providers.AcmeTest do
use ExUnit.Case, async: true
alias ReqLLM.Message.ContentPart
test "encode_body: text + tools into OpenAI shape" do
ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
{:ok, model} = ReqLLM.model("acme:acme-chat-mini")
req =
Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
|> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
|> ReqLLM.Providers.Acme.encode_body()
assert is_binary(req.body)
body = Jason.decode!(req.body)
assert body["model"] =~ "acme-chat-mini"
assert body["messages"] |> is_list()
end
end

Under test/coverage/, using the fixture system for integration against the high-level API.
Example:
defmodule Coverage.AcmeChatTest do
use ExUnit.Case, async: false
use ReqLLM.Test.LiveFixture, provider: :acme
test "basic text generation" do
{:ok, response} =
use_fixture(:provider, "acme-basic", fn ->
ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
end)
assert ReqLLM.Response.text(response) =~ "hi"
end
test "streaming tokens" do
{:ok, sr} =
use_fixture(:provider, "acme-stream", fn ->
ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
end)
tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
assert length(tokens) >= 3
end
end

# Record fixtures during live test runs
LIVE=true mix test --only provider:acme
# Or use model compatibility tool
mix mc "acme:*" --record# Quick validation
mix mc
# Sample models during development
mix mc --sample

Always use ReqLLM.Keys for key retrieval. Never read System.get_env/1 directly.
api_key = ReqLLM.Keys.get!(model, opts)

The DSL's default_env_key is the fallback env var name. ReqLLM.Keys also supports:
- Application config
- Per-call override via `opts[:api_key]` (see the sketch after this list)
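For example, a per-call override might look like the following sketch; `my_runtime_key` is a hypothetical value your application resolved at runtime, and the acme spec is the example model used throughout this guide:

```elixir
# opts[:api_key] takes precedence; otherwise ReqLLM.Keys falls back to
# application config and finally the provider's default_env_key (e.g. ACME_API_KEY).
{:ok, response} =
  ReqLLM.generate_text("acme:acme-chat-mini", "Hello!", api_key: my_runtime_key)
```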
Attach Bearer header in attach/3 or use Defaults (already sets authorization):
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
api_key = ReqLLM.Keys.get!(model_input, user_opts)
request
|> Req.Request.put_header("authorization", "Bearer #{api_key}")
|> Req.Request.put_header("content-type", "application/json")
end

- `ReqLLM.Error.Auth` - Missing/invalid API keys
- `ReqLLM.Error.API.Request` - HTTP request issues
- `ReqLLM.Error.API.Response` - HTTP response errors
- `ReqLLM.Error.Parse` - JSON/body shape issues
In decode_response/1, return {req, exception} for non-200 or malformed payloads:
@impl ReqLLM.Provider
def decode_response({req, resp}) do
case resp.status do
200 ->
body = ensure_parsed_body(resp.body)
with {:ok, response} <- decode_chat_response(body, req) do
{req, %{resp | body: response}}
else
{:error, reason} ->
{req, ReqLLM.Error.Parse.exception(reason: inspect(reason))}
end
status ->
{req,
ReqLLM.Error.API.Response.exception(
reason: "API error",
status: status,
response_body: resp.body
)}
end
end

The pipeline will propagate errors consistently to callers.
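At the call site, these exceptions typically surface as error tuples from the high-level API. A sketch of how a caller might branch on them, assuming the `{:error, exception}` return shape:

```elixir
case ReqLLM.generate_text("acme:acme-chat-mini", "Hello!") do
  {:ok, response} ->
    ReqLLM.Response.text(response)

  # HTTP-level failure surfaced by decode_response/1
  {:error, %ReqLLM.Error.API.Response{status: status}} ->
    {:error, {:api_error, status}}

  # Auth, request, or parse errors
  {:error, error} ->
    {:error, error}
end
```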
Different LLM providers have subtle differences in how they represent responses, tool calls, finish reasons, and metadata. Previously, these differences were handled in multiple places (streaming vs non-streaming, provider-specific decoders), leading to behavioral inconsistencies.
The ResponseBuilder behaviour centralizes provider-specific Response assembly logic, ensuring that:
- Streaming and non-streaming produce identical Response structs
- Provider quirks are handled in one place per provider
- New providers have a clear extension point
Both streaming and non-streaming paths converge on ResponseBuilder:
- Decode wire format to `[StreamChunk.t()]`
- Collect metadata (usage, finish_reason, provider-specific)
- Call the appropriate builder:
builder = ResponseBuilder.for_model(model)
{:ok, response} = builder.build_response(chunks, metadata, opts)

`ResponseBuilder.for_model/1` routes to provider-specific builders:
- Anthropic models → `Anthropic.ResponseBuilder`
- Google/Vertex models → `Google.ResponseBuilder`
- OpenAI Responses API models → `OpenAI.ResponsesAPI.ResponseBuilder`
- All others → `Provider.Defaults.ResponseBuilder`
Most providers can use Provider.Defaults.ResponseBuilder. Implement a custom builder when:
- Content block requirements: Anthropic requires content blocks to never be empty
- Provider-specific metadata: OpenAI Responses API needs to propagate `response_id` for stateless multi-turn
- Finish reason detection: Google needs to detect `functionCall` to set the correct finish_reason
- Custom tool call handling: Provider has a non-standard tool call representation
defmodule ReqLLM.Providers.Zephyr.ResponseBuilder do
@moduledoc "Custom ResponseBuilder for Zephyr provider."
@behaviour ReqLLM.Provider.ResponseBuilder
alias ReqLLM.Provider.Defaults.ResponseBuilder, as: DefaultBuilder
@impl true
def build_response(chunks, metadata, opts) do
# Delegate to default builder for standard processing
with {:ok, response} <- DefaultBuilder.build_response(chunks, metadata, opts) do
# Apply provider-specific post-processing
response = apply_zephyr_quirks(response, metadata)
{:ok, response}
end
end
defp apply_zephyr_quirks(response, metadata) do
# Example: Zephyr includes session_id in metadata
case metadata[:session_id] do
nil -> response
sid -> %{response | provider_meta: Map.put(response.provider_meta, :session_id, sid)}
end
end
end

Then register the builder by adding a clause to ResponseBuilder.for_model/1 (for built-in providers) or by pattern matching on your model in your provider's streaming/non-streaming paths.
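For a built-in provider, the routing clause might look like the following sketch; the exact match shape and module should mirror the existing clauses in `ResponseBuilder.for_model/1`:

```elixir
# Hypothetical clause added alongside the existing ones in ResponseBuilder.for_model/1,
# matching any model whose provider is :zephyr.
def for_model(%{provider: :zephyr}), do: ReqLLM.Providers.Zephyr.ResponseBuilder
```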
Let's add a fictional provider called "Acme" from start to finish.
File: lib/req_llm/providers/acme.ex
defmodule ReqLLM.Providers.Acme do
@moduledoc "Acme – OpenAI-compatible chat API."
@behaviour ReqLLM.Provider
use ReqLLM.Provider.DSL,
id: :acme,
base_url: "https://api.acme.ai/v1",
metadata: "priv/models_dev/acme.json",
default_env_key: "ACME_API_KEY",
provider_schema: [
organization: [type: :string, doc: "Tenant/Org header"]
]
use ReqLLM.Provider.Defaults
@impl ReqLLM.Provider
def attach(request, model_input, user_opts) do
request = super(request, model_input, user_opts)
org = user_opts[:organization]
case org do
nil -> request
_ -> Req.Request.put_header(request, "x-acme-organization", org)
end
end
end

File: priv/models_local/acme.json
{
"provider": {
"id": "acme",
"name": "Acme AI"
},
"models": [
{
"id": "acme-chat-mini",
"name": "Acme Chat Mini",
"type": "chat",
"capabilities": {
"stream": true,
"tool_call": true,
"vision": true
},
"modalities": {
"input": ["text","image"],
"output": ["text"]
},
"cost": {
"input": 0.00015,
"output": 0.0006
}
}
]
}

export ACME_API_KEY=sk-...
mix req_llm.gen "Hello" --model acme:acme-chat-mini

File: test/providers/acme_test.exs
defmodule Providers.AcmeTest do
use ExUnit.Case, async: true
alias ReqLLM.Message.ContentPart
test "encode_body: text + tools into OpenAI shape" do
ctx = ReqLLM.Context.new([ReqLLM.Context.user([ContentPart.text("Hello")])])
{:ok, model} = ReqLLM.model("acme:acme-chat-mini")
req =
Req.new(url: "/chat/completions", method: :post, base_url: "https://example.test")
|> ReqLLM.Providers.Acme.attach(model, context: ctx, stream: false, temperature: 0.0)
|> ReqLLM.Providers.Acme.encode_body()
assert is_binary(req.body)
body = Jason.decode!(req.body)
assert body["model"] =~ "acme-chat-mini"
assert body["messages"] |> is_list()
end
end

File: test/coverage/acme_chat_test.exs
defmodule Coverage.AcmeChatTest do
use ExUnit.Case, async: false
use ReqLLM.Test.LiveFixture, provider: :acme
test "basic text generation" do
{:ok, response} =
use_fixture(:provider, "acme-basic", fn ->
ReqLLM.generate_text("acme:acme-chat-mini", "Say hi", temperature: 0)
end)
assert ReqLLM.Response.text(response) =~ "hi"
end
test "streaming tokens" do
{:ok, sr} =
use_fixture(:provider, "acme-stream", fn ->
ReqLLM.stream_text("acme:acme-chat-mini", "Count 1..3", temperature: 0)
end)
tokens = ReqLLM.StreamResponse.tokens(sr) |> Enum.take(3)
assert length(tokens) >= 3
end
end

# Option 1: During test run
LIVE=true mix test --only provider:acme
# Option 2: Using model compat tool
mix mc "acme:*" --record# Validate Acme models
mix req_llm.model_compat acme
# List all registered providers/models
mix mc --available

- Prefer using `ReqLLM.Provider.Defaults`. Only override what the provider truly deviates on
- Keep `prepare_request/4` a thin dispatcher; centralize option prep in `attach/3` and the defaults pipeline
- No comments inside function bodies. Use clear naming and module docs
- Prefer pattern matching to conditionals
- Use `{:ok, result}` | `{:error, reason}` tuples for fallible helpers
- Use `translate_options/3` to rename/drop provider-specific params (e.g., `max_tokens` → `max_output_tokens`)
- Always map tools via `ReqLLM.Tool.to_schema/2`
- Respect `ContentPart` variants for images/files. Base64-encode if the provider requires it
- Build the Finch request in `attach_stream/4`
- Decode events to `StreamChunk` in `decode_stream_event/2` or `/3`
- Emit a terminal meta chunk with `finish_reason` and usage if provided
- Start with non-streaming happy path, then add streaming and tools
- Record minimal, deterministic fixtures (`temperature: 0`)
- Provider uses non-SSE streaming (binary protocol) or chunked JSON requiring stateful accumulation
- Models with unique parameter semantics that demand `translate_options/3` and capability gating
- Complex multimodal tool invocation requiring custom mapping of multi-part tool args/results
- Implement `parse_stream_protocol/2` for custom binary protocols (e.g., AWS Event Stream)
- Implement `init_stream_state/1`, `decode_stream_event/3`, and `flush_stream_state/2` to accumulate partial tool_call args or demultiplex multi-channel events (see the accumulator sketch after this list)
- Implement `normalize_model_id/1` for regional aliases and `translate_options/3` with warning aggregation
- Provide provider-specific usage accounting that merges multi-phase usage deltas
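For the stateful-accumulation case, the stream state can be as simple as a map of partial tool-call fragments keyed by index. The sketch below shows only the accumulation logic that `init_stream_state/1`, `decode_stream_event/3`, and `flush_stream_state/2` could delegate to; the callbacks' exact signatures should be taken from the `ReqLLM.Provider` behaviour docs:

```elixir
defmodule MyApp.ToolCallAccumulator do
  @moduledoc """
  Hypothetical helper that accumulates partial tool_call argument fragments
  across stream events and emits ReqLLM.StreamChunk.tool_call/2 chunks once
  the JSON arguments are complete.
  """

  # State shape: %{index => %{name: String.t(), fragments: [String.t()]}}
  def init, do: %{}

  # Each delta event contributes a fragment of the JSON-encoded arguments.
  def add_fragment(state, index, name, fragment) do
    Map.update(state, index, %{name: name, fragments: [fragment]}, fn acc ->
      %{acc | fragments: [fragment | acc.fragments]}
    end)
  end

  # On the terminal event, decode each accumulated argument string into a chunk.
  def flush(state) do
    state
    |> Enum.sort_by(fn {index, _acc} -> index end)
    |> Enum.flat_map(fn {_index, %{name: name, fragments: fragments}} ->
      json = fragments |> Enum.reverse() |> Enum.join()

      case Jason.decode(json) do
        {:ok, args} when is_map(args) -> [ReqLLM.StreamChunk.tool_call(name, args)]
        _ -> []
      end
    end)
  end
end
```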
prepare_request/4
- Build Req for the operation
- Defaults cover `:chat`, `:object`, and `:embedding`
attach/3
- Set headers, auth, and pipeline steps
- Defaults add Bearer, retry, error, usage, fixture steps
encode_body/1
- Transform options/context to provider JSON
- Defaults are OpenAI-compatible; override for custom wire formats
decode_response/1
- Map provider body to Response or error
- Defaults map OpenAI-style bodies; override if your shape differs
attach_stream/4
- Must return `{:ok, Finch.Request.t()}`
- Defaults build OpenAI-compatible streaming requests; override for custom endpoints/headers
decode_stream_event/2 or /3
- Map provider events to StreamChunk
- Defaults handle OpenAI-compatible deltas
extract_usage/2
- Normalize usage tokens/cost if provider deviates from standard usage shape
translate_options/3
- Rename/drop options per model or operation
ResponseBuilder.build_response/3
- Build final Response struct from accumulated StreamChunks and metadata
- Defaults handle OpenAI-compatible responses; override for provider-specific quirks
- Required parameters: `chunks` (list of StreamChunk), `metadata` (map with usage, finish_reason, etc.), `opts` (keyword list with `:context` and `:model`)
Adding a provider to ReqLLM involves:
- Creating a provider module with the DSL and behavior implementation
- Implementing encoding/decoding for the provider's wire format
- Optionally implementing a custom `ResponseBuilder` for provider-specific response assembly
- Adding model metadata and syncing the registry
- Writing tests at all three tiers (core, provider, coverage)
- Recording fixtures for validation
By following these guidelines and leveraging the defaults, you can add robust, well-tested provider support that maintains ReqLLM's normalization principles across all AI interactions.