Description
At the time of writing, the Triton OpenAI frontend uses strict schemas for its request bodies.
For example:
class CreateChatCompletionRequest(BaseModel):
    # Explicitly return errors for unknown fields.
    model_config: ConfigDict = ConfigDict(extra="forbid")
This causes the following error whenever a request contains an attribute that is not declared in the Pydantic model:
INFO: 10.165.103.167:0 - "POST /chat/completions HTTP/1.1" 422 Unprocessable Entity
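The rejection can be reproduced outside the server with a minimal sketch. StrictRequest below is a hypothetical stand-in for the frontend's request model, not the real class, and it assumes Pydantic v2. The 422 status itself would come from the web framework turning the validation error into an HTTP response, which this sketch does not cover.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictRequest(BaseModel):
    # Hypothetical, cut-down stand-in for the frontend's request model.
    model_config = ConfigDict(extra="forbid")

    model: str


def validate_strict(payload: dict) -> list[str]:
    """Return the error types Pydantic reports for the payload, if any."""
    try:
        StrictRequest.model_validate(payload)
    except ValidationError as exc:
        return [err["type"] for err in exc.errors()]
    return []


# A request that carries an undeclared field is rejected.
print(validate_strict({"model": "my-model", "service_tier": "auto"}))
# A request with only declared fields passes.
print(validate_strict({"model": "my-model"}))
```

The first call reports an extra_forbidden error for service_tier; the second returns an empty list.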
In this issue, I want to point out one side of the tradeoff. While I am sure there are valid reasons for that strictness, it does impact adoption.
The OpenAI-compatible frontend in Triton probably does not want to play catch-up with the OpenAI request spec.
For example, OpenAI has a service_tier attribute, which probably does not make sense for this OpenAI-compatible frontend. But ConfigDict(extra="forbid") prevents this server from being an easy drop-in replacement for existing OpenAI calls if the callers send service_tier.
Those callers have to be modified to omit the service_tier field.
In addition to the service_tier field, the existing openai-python API also lets callers pass arbitrary undocumented fields via the extra_body argument.
Any Python client that relies on that feature becomes harder to move to the Triton OpenAI frontend, because the frontend's Pydantic request models reject extra fields.
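For comparison, here is a rough sketch of how the two more lenient Pydantic settings treat the same kind of payload. LenientRequest, PassthroughRequest, and the custom_flag field are hypothetical names used only for illustration, and Pydantic v2 is assumed; this is not a proposal for the exact change the frontend should make.

```python
from pydantic import BaseModel, ConfigDict


class LenientRequest(BaseModel):
    # Hypothetical model: unknown fields are silently dropped.
    model_config = ConfigDict(extra="ignore")

    model: str


class PassthroughRequest(BaseModel):
    # Hypothetical model: unknown fields are kept and exposed via model_extra.
    model_config = ConfigDict(extra="allow")

    model: str


# Payload with one OpenAI-specific field and one made-up extra_body style field.
payload = {"model": "my-model", "service_tier": "auto", "custom_flag": True}

lenient = LenientRequest.model_validate(payload)
passthrough = PassthroughRequest.model_validate(payload)

print(lenient.model_dump())      # only the declared fields survive
print(passthrough.model_extra)   # the undeclared fields, kept as a dict
```

With extra="ignore" the server would accept the request and discard what it does not understand; with extra="allow" it could still log or inspect the unknown fields before deciding what to do with them.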