[Feature]: Research and add OpenAI WebSocket support (Responses API + Realtime) #476

@mikehostetler

Description

Summary

OpenAI now documents WebSocket support for LLM workflows, including:

  • Responses API WebSocket mode
  • Realtime API over WebSocket

ReqLLM currently has strong support for non-streaming requests and SSE-based streaming, but does not yet expose a dedicated OpenAI WebSocket transport path.

Why this matters

WebSockets can reduce repeated request overhead and enable persistent, low-latency interactions. For some workloads (especially conversational/session-based flows), this can provide better UX and more efficient transport semantics than one-off HTTP requests.

Proposed work

Research and propose an implementation plan for adding OpenAI WebSocket support to ReqLLM, then implement incrementally.

Areas to evaluate:

  • API surface design in ReqLLM (how WebSocket session semantics should map to current high-level APIs)
  • Provider architecture impact (new callbacks vs extending existing streaming callbacks)
  • Event decoding and response assembly parity with existing ReqLLM.Response / ReqLLM.StreamChunk
  • Error handling, reconnect behavior, and timeout/session lifecycle
  • Usage reporting and compatibility with existing provider defaults
  • Fixture/testing strategy for deterministic coverage
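
To make the event-decoding/assembly question concrete, here is a minimal, language-agnostic sketch (in Python for illustration only; ReqLLM itself is Elixir) of folding streamed text-delta events into both per-chunk pieces and a final response, analogous to accumulating `ReqLLM.StreamChunk` values into a `ReqLLM.Response`. The event type strings are assumptions loosely modeled on OpenAI's Realtime event vocabulary; the actual wire format needs to be confirmed during the research spike.

```python
import json

# Hypothetical event names, loosely modeled on OpenAI's Realtime
# vocabulary; the real wire format must be confirmed during the spike.
DELTA = "response.output_text.delta"
DONE = "response.completed"

def assemble(frames):
    """Fold raw WebSocket text frames into (chunks, final_text)."""
    chunks, parts = [], []
    for frame in frames:
        event = json.loads(frame)
        if event["type"] == DELTA:
            chunks.append(event["delta"])  # StreamChunk-like piece
            parts.append(event["delta"])
        elif event["type"] == DONE:
            break  # the socket may stay open for further turns
    return chunks, "".join(parts)

frames = [
    json.dumps({"type": DELTA, "delta": "Hel"}),
    json.dumps({"type": DELTA, "delta": "lo"}),
    json.dumps({"type": DONE}),
]
chunks, text = assemble(frames)
```

The key parity question is whether this fold can reuse the existing SSE chunk-assembly path or needs a parallel decoder, since one WebSocket session can carry multiple responses.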

Suggested rollout

  1. Research/design spike with architecture notes and recommended API shape
  2. Implement Responses API WebSocket mode (server-side first)
  3. Evaluate/implement Realtime WebSocket support (possibly behind experimental API)
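
Whichever rollout order is chosen, reconnect behavior (noted in the evaluation list above) will need a backoff policy. A minimal sketch, again in Python for illustration; the function name and defaults are illustrative, and production code would typically add jitter, which is omitted here to keep the schedule deterministic:

```python
def backoff_schedule(attempts, base=0.5, cap=30.0, factor=2.0):
    """Exponential backoff delays in seconds, capped at `cap`.

    Jitter is deliberately omitted so the schedule is deterministic;
    a real reconnect loop would add randomized jitter to avoid
    thundering-herd reconnects after a server-side disconnect.
    """
    return [min(cap, base * factor ** n) for n in range(attempts)]
```

A reconnect also raises the session-lifecycle question: whether in-flight conversation state can be replayed onto a fresh socket or the session must surface an error to the caller.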

Labels

enhancement (New feature or request)
