Support per user api-key for multi-tenant use case #753

Jeffwan · 2025-02-26T23:18:33Z

🚀 Feature Description and Motivation

Background

Currently, vLLM only supports a single API key for authentication, making it difficult to share the inference engine across multiple tenants. Extending vLLM to support multiple keys is an option, but this would be a static solution. A more flexible approach is needed to handle multi-tenant API key management dynamically.

Proposed Solutions

Option 1: Extend vLLM to Support External Authentication

vLLM would integrate with an external authentication server to validate API keys dynamically.
This approach allows for greater flexibility but introduces an external dependency. The added per-request overhead of the validation call is another concern.
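
To make the overhead concern concrete, here is a minimal sketch of one way to soften it: cache the external server's verdict for a short time so that most requests skip the remote call. Nothing here is an existing vLLM API. `CachedKeyValidator`, `validate_remote`, and `ttl_seconds` are hypothetical names, and the remote check is passed in as a plain callable so the sketch stays independent of any particular authentication server.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class CachedKeyValidator:
    """Hypothetical wrapper that caches verdicts from an external auth check."""

    def __init__(
        self,
        validate_remote: Callable[[str], bool],  # assumed: calls the auth server
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._validate_remote = validate_remote
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a SHA-256 digest of the key to (verdict, expiry time).
        self._cache: Dict[str, Tuple[bool, float]] = {}

    def is_valid(self, api_key: str) -> bool:
        # Cache by digest so raw keys are not kept in memory longer than needed.
        digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
        now = self._clock()
        cached = self._cache.get(digest)
        if cached is not None and cached[1] > now:
            return cached[0]
        # Cache miss or expired entry: ask the external server again.
        verdict = self._validate_remote(api_key)
        self._cache[digest] = (verdict, now + self._ttl)
        return verdict
```

The trade-off is that a revoked key can stay accepted for up to `ttl_seconds`, so the TTL bounds both the saved latency and the revocation delay.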

Option 2: Manage API Keys Outside of vLLM

Option 2a: User-Managed Authentication (Bring Your Own Stack)

Users adopt an external authentication solution (e.g., Istio, OAuth, or API gateways) to manage API keys.

Option 2b: Extend AIBrix Gateway for Multi-Tenant API Key Management

AIBrix Gateway already has a basic user concept and rate-limiting control.
The extension would associate users with API keys, providing built-in multi-tenancy support.
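
As a rough illustration of what "associating users with API keys" could look like, the sketch below keeps an in-memory map from key digests to users. It is written in Python for brevity and does not reflect the actual AIBrix Gateway code. `Tenant`, `ApiKeyRegistry`, the `sk-` prefix, and the `requests_per_minute` field are all assumptions made for the example, and a real gateway would persist this state rather than hold it in a dict.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Tenant:
    """Hypothetical user record; the rate-limit field is illustrative only."""
    user: str
    requests_per_minute: int


class ApiKeyRegistry:
    """Hypothetical in-memory registry mapping API keys to users."""

    def __init__(self) -> None:
        # Only SHA-256 digests are stored, never the raw keys.
        self._tenants_by_digest: Dict[str, Tenant] = {}

    @staticmethod
    def _digest(api_key: str) -> str:
        return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

    def issue_key(self, tenant: Tenant) -> str:
        """Create a new random key for the tenant and return it once."""
        api_key = "sk-" + secrets.token_urlsafe(32)
        self._tenants_by_digest[self._digest(api_key)] = tenant
        return api_key

    def revoke_key(self, api_key: str) -> bool:
        """Remove a key at runtime; returns False if it was unknown."""
        return self._tenants_by_digest.pop(self._digest(api_key), None) is not None

    def authenticate(self, authorization_header: Optional[str]) -> Optional[Tenant]:
        """Resolve an 'Authorization: Bearer <key>' header to a tenant, or None."""
        if not authorization_header:
            return None
        scheme, _, token = authorization_header.partition(" ")
        if scheme.lower() != "bearer" or not token.strip():
            return None
        return self._tenants_by_digest.get(self._digest(token.strip()))
```

Because keys can be issued and revoked at runtime, this avoids the static configuration that the Background section describes as a limitation.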

Future Considerations

In addition to authentication, we want to support tenant-aware optimizations within vLLM. The gateway should attach tenant metadata (e.g., X-Tenant-ID, X-Priority, JWT claims) before forwarding the request to vLLM. This would enable the inference engine to make tenant-aware optimizations, such as priority-based scheduling or resource allocation.
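
The header-attachment step could be sketched as follows. The header names `X-Tenant-ID` and `X-Priority` come from the paragraph above; the function name `build_upstream_headers`, the integer priority, and the decision to drop the tenant's `Authorization` header before forwarding are assumptions for illustration. Client-supplied copies of the tenant headers are removed first so a caller cannot spoof another tenant's identity or priority.

```python
from typing import Dict, Mapping

# Headers the gateway owns; client-supplied values must not reach the engine.
GATEWAY_OWNED_HEADERS = ("x-tenant-id", "x-priority")


def build_upstream_headers(
    incoming: Mapping[str, str], tenant_id: str, priority: int
) -> Dict[str, str]:
    """Return the headers to forward to the engine for an authenticated tenant."""
    forwarded = {
        name: value
        for name, value in incoming.items()
        # Drop spoofable tenant headers and the tenant's own API key (assumption:
        # the engine does not need the per-user key once the gateway has checked it).
        if name.lower() not in GATEWAY_OWNED_HEADERS
        and name.lower() != "authorization"
    }
    forwarded["X-Tenant-ID"] = tenant_id
    forwarded["X-Priority"] = str(priority)
    return forwarded
```

How the engine would consume these headers (for example, for priority-based scheduling) is left open here, as it is in the proposal.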

Open Questions

Which approach aligns best with the vLLM architecture?
Should vLLM natively support dynamic authentication, or should this be handled externally?
How can we ensure a smooth integration between vLLM and the authentication layer without introducing significant overhead?

/cc @simon-mo @robertgshaw2-redhat @gaocegege @kerthcet

Use Case

Support multi-tenancy for vLLM

Proposed Solution

No response

jolfr · 2025-03-01T02:46:43Z

One use-case I'd love to see supported as a tenant-aware optimization is tenant-based LoRA adapters.
