Allow the user to pass the `client` object #301

RyanMarten · 2025-01-05T23:08:53Z

It may be a good idea to let the user pass in the client object directly. That way the user can configure the client however they want. It might also simplify the interface, since we would not have to add parameters like base_url (as mentioned in #238) or full retry behavior (related to #279, by passing an httpx object).

This is similar to how instructor has from_gemini, from_anthropic, from_litellm, etc., each of which takes in a client object and returns a standardized client object:
https://github.com/instructor-ai/instructor/blob/main/instructor/client_anthropic.py

We can do this for both online and batch (and offline via a vllm object, based on #298):

- litellm client
- openai client (for batch)
- anthropic client (for batch)
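
A minimal sketch of the `from_*` idea, assuming nothing about curator's real internals. The stand-in client classes, `StandardClient`, and the `from_openai` / `from_anthropic` functions below are all hypothetical names made up for illustration; real clients would come from the respective SDKs and be configured by the user before being passed in.

```python
from dataclasses import dataclass
from typing import Any, Callable


class FakeOpenAIClient:
    """Stand-in for a user-configured OpenAI-style client (hypothetical)."""

    def __init__(self, base_url: str, max_retries: int = 2):
        self.base_url = base_url
        self.max_retries = max_retries

    def create_chat(self, prompt: str) -> dict:
        return {"choices": [{"text": f"openai:{prompt}"}]}


class FakeAnthropicClient:
    """Stand-in for a user-configured Anthropic-style client (hypothetical)."""

    def create_message(self, prompt: str) -> dict:
        return {"content": f"anthropic:{prompt}"}


@dataclass
class StandardClient:
    """Uniform wrapper the library would use internally (hypothetical)."""

    raw: Any                         # the user's client, left untouched
    _complete: Callable[[str], str]  # backend-specific call, normalized to str

    def complete(self, prompt: str) -> str:
        return self._complete(prompt)


def from_openai(client: FakeOpenAIClient) -> StandardClient:
    # Adapt the OpenAI-style response shape to a plain string.
    return StandardClient(client, lambda p: client.create_chat(p)["choices"][0]["text"])


def from_anthropic(client: FakeAnthropicClient) -> StandardClient:
    # Adapt the Anthropic-style response shape to a plain string.
    return StandardClient(client, lambda p: client.create_message(p)["content"])


# The user configures the client however they want, then hands it over.
user_client = FakeOpenAIClient(base_url="https://example.invalid/v1", max_retries=5)
wrapped = from_openai(user_client)
print(wrapped.complete("hello"))
```

The point of the pattern is that options such as base_url or retry behavior live on the user's object, so the library never has to mirror them as its own parameters.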

RyanMarten · 2025-01-10T20:14:51Z

This could be done instead of #331.

Example for vllm offline: https://github.com/bespokelabsai/curator/blob/main/src/bespokelabs/curator/request_processor/offline/vllm_offline_request_processor.py#L45C1-L54C10

This way we automatically support all current and future configurations of the backend and don't have to manage them ourselves.

RyanMarten · 2025-01-10T20:21:18Z

Current params that would fall into this:

curator/src/bespokelabs/curator/llm/llm.py

Lines 45 to 63 in 07cff6f

    base_url: str | None = None,
    response_format: Type[BaseModel] | None = None,
    batch: bool = False,
    backend: str | None = None,
    max_requests_per_minute: int | None = None,
    max_tokens_per_minute: int | None = None,
    batch_size: int | None = None,
    batch_check_interval: int | None = None,
    delete_successful_batch_files: bool | None = None,
    delete_failed_batch_files: bool | None = None,
    max_retries: int | None = None,
    require_all_responses: bool | None = None,
    generation_params: dict | None = None,
    seconds_to_pause_on_rate_limit: int | None = None,
    tensor_parallel_size: int | None = None,
    enforce_eager: bool | None = None,
    max_model_length: int | None = None,
    max_tokens: int | None = None,
    gpu_memory_utilization: float | None = None,

Params that belong to the client for litellm / openai / anthropic:

    base_url: str | None = None,
    max_retries: int | None = None,

litellm names these differently, though:

- Instead of base_url, the setting is called litellm.api_base.
- Instead of max_retries on the client, it is num_retries in the .completion call.

So maybe with a passed-in client the user has to keep track of more things, and we can actually provide value by offering a uniform API.
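
To illustrate what a uniform API would have to do, here is a rough sketch that translates the two shared options into backend-specific names. The function `to_backend_kwargs` is hypothetical. The thread mentions `litellm.api_base`; passing `api_base` per call, as done here, is an assumption made to keep the example free of global state.

```python
from __future__ import annotations


def to_backend_kwargs(
    backend: str,
    base_url: str | None = None,
    max_retries: int | None = None,
) -> dict[str, dict]:
    """Map uniform options to client-constructor kwargs and per-call kwargs.

    Hypothetical helper: names on the right-hand side follow the thread
    (base_url / max_retries for openai and anthropic clients, api_base /
    num_retries for litellm).
    """
    client_kwargs: dict = {}
    call_kwargs: dict = {}

    if backend in ("openai", "anthropic"):
        # Both options are set once, on the client object.
        if base_url is not None:
            client_kwargs["base_url"] = base_url
        if max_retries is not None:
            client_kwargs["max_retries"] = max_retries
    elif backend == "litellm":
        # litellm uses different names, and retries are set per completion call.
        if base_url is not None:
            call_kwargs["api_base"] = base_url
        if max_retries is not None:
            call_kwargs["num_retries"] = max_retries
    else:
        raise ValueError(f"unknown backend: {backend!r}")

    return {"client": client_kwargs, "call": call_kwargs}


print(to_backend_kwargs("openai", base_url="https://example.invalid/v1", max_retries=3))
print(to_backend_kwargs("litellm", base_url="https://example.invalid/v1", max_retries=3))
```

This is the bookkeeping a user would otherwise do by hand when constructing each client themselves.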

Params that don't fall into this (we manage them ourselves):

    seconds_to_pause_on_rate_limit: int | None = None,
    require_all_responses: bool | None = None,
    batch_size: int | None = None,
    batch_check_interval: int | None = None,
    delete_successful_batch_files: bool | None = None,
    delete_failed_batch_files: bool | None = None,
    response_format: Type[BaseModel] | None = None,
    batch: bool = False,
    backend: str | None = None,
    max_requests_per_minute: int | None = None,
    max_tokens_per_minute: int | None = None,
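
The split above can be sketched as a small partition function. This is only an illustration: `split_params` is a hypothetical helper, and the third bucket holds whatever the two lists in this comment do not classify (for example the vLLM-related and generation-related parameters).

```python
from __future__ import annotations

# Options that would live on a user-supplied client (per the list above).
CLIENT_LEVEL = {"base_url", "max_retries"}

# Options the library keeps managing itself (per the list above).
MANAGED = {
    "seconds_to_pause_on_rate_limit",
    "require_all_responses",
    "batch_size",
    "batch_check_interval",
    "delete_successful_batch_files",
    "delete_failed_batch_files",
    "response_format",
    "batch",
    "backend",
    "max_requests_per_minute",
    "max_tokens_per_minute",
}


def split_params(params: dict) -> tuple[dict, dict, dict]:
    """Partition keyword arguments into (client-level, managed, unclassified)."""
    client = {k: v for k, v in params.items() if k in CLIENT_LEVEL}
    managed = {k: v for k, v in params.items() if k in MANAGED}
    other = {k: v for k, v in params.items() if k not in CLIENT_LEVEL | MANAGED}
    return client, managed, other


client, managed, other = split_params(
    {"base_url": "https://example.invalid/v1", "batch_size": 10, "tensor_parallel_size": 2}
)
print(client, managed, other)
```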

Actually, thinking about this further, it is still messy and doesn't entirely solve the issue.

RyanMarten · 2025-01-31T19:32:36Z

See https://x.com/kgourg/status/1885396492162121797?s=46 for an example of when this is useful:

> Due to a custom OpenAI client from my workplace, I had to jump inside the library to find a place to pass my client so that I could use the library. But I’ve had the same issue with other libs, like Outlines.
>
> The only thing that is special about it is that it requires additional security artifacts, i.e., not just an API key and a base url, for access. Otherwise it implements everything a standard client has.

RyanMarten added the "enhancement" (New feature or request) label on Jan 5, 2025

adamoptimizer added the "task" (This is a task) label on Jan 28, 2025
