Allow the user to pass the client object #301

Open
RyanMarten opened this issue Jan 5, 2025 · 3 comments
Labels
enhancement (New feature or request), task (This is a task)

Comments

Contributor

RyanMarten commented Jan 5, 2025

It may be a good idea to let the user pass in the client object. That way the user can configure the client however they want. This might also simplify the interface, since we wouldn't have to add parameters like base_url (as mentioned in #238) or full retry behavior (related to #279, by passing an httpx object).

This is similar to how instructor has from_gemini, from_anthropic, from_litellm, etc., each of which takes in a client object and returns a standardized client object:
https://github.com/instructor-ai/instructor/blob/main/instructor/client_anthropic.py
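
A minimal sketch of that pattern, to make the idea concrete. Everything here is hypothetical: `StandardClient`, `FakeOpenAIClient`, and this `from_openai` are illustrative names, not the API of curator or instructor, and the fake client only stands in for whatever object the user has already configured.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class StandardClient:
    """Hypothetical uniform wrapper the library would work against."""

    backend: str
    complete: Callable[[str], str]


class FakeOpenAIClient:
    """Stand-in for a client the user configured themselves (URL, retries, auth)."""

    def __init__(self, base_url: str):
        self.base_url = base_url

    def create(self, prompt: str) -> str:
        # A real client would make a network call here.
        return f"[{self.base_url}] {prompt}"


def from_openai(client: Any) -> StandardClient:
    """Wrap an already-configured client without caring how it was configured."""
    return StandardClient(backend="openai", complete=client.create)


user_client = FakeOpenAIClient(base_url="https://example.invalid/v1")
wrapped = from_openai(user_client)
print(wrapped.complete("hello"))
```

The point of the design is that the library never sees base_url or retry settings; it only calls the wrapped object.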

We can do this for both online and batch (and for offline via a vllm object, based on #298).

  • litellm client
  • openai client (for batch)
  • anthropic client (for batch)

RyanMarten added the enhancement (New feature or request) label Jan 5, 2025
Contributor Author

RyanMarten commented Jan 10, 2025

This could be done instead of #331.

Example for vllm offline: https://github.com/bespokelabsai/curator/blob/main/src/bespokelabs/curator/request_processor/offline/vllm_offline_request_processor.py#L45C1-L54C10

This way we automatically support all current and future configurations of the backend and don't have to manage them ourselves.

Contributor Author

RyanMarten commented Jan 10, 2025

Current params that would fall into this:

base_url: str | None = None,
response_format: Type[BaseModel] | None = None,
batch: bool = False,
backend: str | None = None,
max_requests_per_minute: int | None = None,
max_tokens_per_minute: int | None = None,
batch_size: int | None = None,
batch_check_interval: int | None = None,
delete_successful_batch_files: bool | None = None,
delete_failed_batch_files: bool | None = None,
max_retries: int | None = None,
require_all_responses: bool | None = None,
generation_params: dict | None = None,
seconds_to_pause_on_rate_limit: int | None = None,
tensor_parallel_size: int | None = None,
enforce_eager: bool | None = None,
max_model_length: int | None = None,
max_tokens: int | None = None,
gpu_memory_utilization: float | None = None,

litellm / openai / anthropic
base_url: str | None = None,
max_retries: int | None = None,

litellm
base_url is instead called litellm.api_base
max_retries is not set on the client; it is num_retries in the .completion call
So the user would have to keep track of more things, and we may actually provide value by offering a uniform API.
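
To illustrate what a uniform API would have to do, here is a rough sketch of translating the two shared names into where each backend expects them. The function `translate_params` and its return shape are assumptions made up for this example; only the names base_url, max_retries, api_base, and num_retries come from the notes above.

```python
from __future__ import annotations

from typing import Any


def translate_params(
    backend: str,
    base_url: str | None = None,
    max_retries: int | None = None,
) -> dict[str, dict[str, Any]]:
    """Map uniform names to backend-specific ones (hypothetical helper).

    "client" holds settings applied when configuring the client or module,
    "call" holds settings passed on each completion call.
    """
    client: dict[str, Any] = {}
    call: dict[str, Any] = {}
    if backend in ("openai", "anthropic"):
        if base_url is not None:
            client["base_url"] = base_url
        if max_retries is not None:
            client["max_retries"] = max_retries
    elif backend == "litellm":
        if base_url is not None:
            client["api_base"] = base_url  # litellm.api_base
        if max_retries is not None:
            call["num_retries"] = max_retries  # goes on the .completion call
    else:
        raise ValueError(f"unknown backend: {backend}")
    return {"client": client, "call": call}


print(translate_params("litellm", base_url="http://localhost:4000", max_retries=3))
```

With a passed-in client object this translation becomes the user's job, which is the trade-off noted above.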

vllm
tensor_parallel_size: int | None = None,
enforce_eager: bool | None = None,
max_model_length: int | None = None,
max_tokens: int | None = None,
gpu_memory_utilization: float | None = None,

Params that don't fall into this (we manage them ourselves)
seconds_to_pause_on_rate_limit: int | None = None,
require_all_responses: bool | None = None,
batch_size: int | None = None,
batch_check_interval: int | None = None,
delete_successful_batch_files: bool | None = None,
delete_failed_batch_files: bool | None = None,
response_format: Type[BaseModel] | None = None,
batch: bool = False,
backend: str | None = None,
max_requests_per_minute: int | None = None,
max_tokens_per_minute: int | None = None,
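
The split above could be sketched as a simple partition of keyword arguments. This is only an illustration: `split_params` and the two set names are hypothetical, and the sets just mirror the lists in this comment. Note that generation_params appears in the full list but in neither group, so the sketch leaves it in a third "unclassified" bucket rather than guessing.

```python
from __future__ import annotations

from typing import Any

# Params that a user-supplied client (or vllm object) would already carry.
CLIENT_PARAMS = {
    "base_url", "max_retries",
    "tensor_parallel_size", "enforce_eager", "max_model_length",
    "max_tokens", "gpu_memory_utilization",
}

# Params the library would keep managing itself.
MANAGED_PARAMS = {
    "seconds_to_pause_on_rate_limit", "require_all_responses", "batch_size",
    "batch_check_interval", "delete_successful_batch_files",
    "delete_failed_batch_files", "response_format", "batch", "backend",
    "max_requests_per_minute", "max_tokens_per_minute",
}


def split_params(params: dict[str, Any]) -> tuple[dict, dict, dict]:
    """Partition kwargs into (client-owned, library-managed, unclassified)."""
    client = {k: v for k, v in params.items() if k in CLIENT_PARAMS}
    managed = {k: v for k, v in params.items() if k in MANAGED_PARAMS}
    other = {k: v for k, v in params.items()
             if k not in CLIENT_PARAMS and k not in MANAGED_PARAMS}
    return client, managed, other


client, managed, other = split_params(
    {"base_url": "http://localhost:8000", "batch_size": 10, "generation_params": {}}
)
print(client, managed, other)
```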

Actually, thinking about this further, it is still messy, and this doesn't entirely solve the issue.

adamoptimizer added the task (This is a task) label Jan 28, 2025
Contributor Author

RyanMarten commented Jan 31, 2025

See https://x.com/kgourg/status/1885396492162121797?s=46 for an example of when this is useful

Due to a custom OpenAI client from my workplace, I had to jump inside the library to find a place to pass my client so that I could use the library. But I’ve had the same issue with other libs, like Outlines.

The only thing that is special about it is that it requires additional security artifacts, i.e., not just an API key and a base URL, for access. Otherwise it implements everything a standard client has.
