
[Cost] Custom input/output/etc cost that user can specify from our curator interface #569

Open
CharlieJCJ opened this issue Mar 2, 2025 · 4 comments · May be fixed by #575
Labels
curator-tracker Related to tracking - costs, requests, etc.

Comments

@CharlieJCJ
Contributor

CharlieJCJ commented Mar 2, 2025

One way to do this is through backend_params; alternatively, we could create a separate field for specifying cost-related config.
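
For concreteness, a rough sketch of what the two options could look like. The backend_params key names below match the usage example later in this thread; the separate-field shape is purely hypothetical, and Poet stands in for a curator LLM subclass:

    # Option 1: reuse backend_params (key names as used later in this thread;
    # values are illustrative USD-per-million-token rates, not real prices)
    poet = Poet(
        model_name="my-custom-model",
        backend_params={"in_mtok_cost": 0.15, "out_mtok_cost": 0.60},
    )

    # Option 2: a dedicated cost-config field (hypothetical shape, not implemented)
    poet = Poet(
        model_name="my-custom-model",
        cost_config={"input_mtok_cost": 0.15, "output_mtok_cost": 0.60},
    )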

@RyanMarten
Contributor

It would be great to have this through backend_params! Thanks!

@kartik4949 please provide insight on the backend implementation of this since you were looking at registering new model costs previously

RyanMarten added the enhancement and curator-tracker labels and removed the enhancement label Mar 2, 2025
@kartik4949
Contributor

@RyanMarten @CharlieJCJ
#481

@CharlieJCJ
Contributor Author

CharlieJCJ commented Mar 3, 2025

Thanks!

Example usage:

poet = Poet(model_name="gpt-4o-mini", backend_params={"in_mtok_cost": 1000000, "out_mtok_cost": 1000000})
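
Judging from the "Override cost per input token: 1.0" log line below, these values appear to be interpreted as cost per million tokens, so the effective per-token rate would be mtok_cost / 1e6:

    # Conversion inferred from the logs below (an assumption, not documented API):
    in_mtok_cost = 1_000_000
    cost_per_input_token = in_mtok_cost / 1e6  # 1.0, matching the logged override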

I did a simple test from the main branch (overriding the cost of gpt-4o-mini), but right now it doesn't return a cost based on the user-specified values.


Selected logs are included below:

                    INFO     Override cost per input token: 1.0, cost per output token: 1.0 for model: gpt-4o-mini                             cost.py:24
                    INFO     Model cost: {'max_tokens': 8192, 'max_input_tokens': 128000, 'max_output_tokens': 16384,        online_status_tracker.py:147
                             'input_cost_per_token': 1.0, 'output_cost_per_token': 1.0, 'input_cost_per_token_batches':                                  
                             7.5e-08, 'output_cost_per_token_batches': 3e-07, 'cache_read_input_token_cost': 7.5e-08,                                    
                             'litellm_provider': 'openai', 'mode': 'chat', 'supports_function_calling': True,                                            
                             'supports_parallel_function_calling': True, 'supports_response_schema': True,                                               
                             'supports_vision': True, 'supports_prompt_caching': True, 'supports_system_messages': True,                                 
                             'supports_tool_choice': True, 'key': 'gpt-4o-mini', 'cache_creation_input_token_cost': None,                                
                             'input_cost_per_character': None, 'input_cost_per_token_above_128k_tokens': None,                                           
                             'input_cost_per_query': None, 'input_cost_per_second': None, 'input_cost_per_audio_token':                                  
                             None, 'output_cost_per_audio_token': None, 'output_cost_per_character': None,                                               
                             'output_cost_per_token_above_128k_tokens': None, 'output_cost_per_character_above_128k_tokens':                             
                             None, 'output_cost_per_second': None, 'output_cost_per_image': None, 'output_vector_size':                                  
                             None, 'supports_assistant_prefill': False, 'supports_audio_input': False,                                                   
                             'supports_audio_output': False, 'supports_pdf_input': False, 'supports_embedding_image_input':                              
                             False, 'supports_native_streaming': None, 'tpm': None, 'rpm': None, 'supported_openai_params':                              
                             ['frequency_penalty', 'logit_bias', 'logprobs', 'top_logprobs', 'max_tokens',                                               
                             'max_completion_tokens', 'modalities', 'prediction', 'n', 'presence_penalty', 'seed', 'stop',                               
                             'stream', 'stream_options', 'temperature', 'top_p', 'tools', 'tool_choice', 'function_call',                                
                             'functions', 'max_retries', 'extra_headers', 'parallel_tool_calls', 'response_format', 'user']}                             
                    INFO     I use litellm to calculate cost, cost info: {'max_tokens': 16384, 'max_input_tokens': 128000,                     cost.py:45
                             'max_output_tokens': 16384, 'input_cost_per_token': 1.5e-07, 'output_cost_per_token': 6e-07,                                
                             'input_cost_per_token_batches': 7.5e-08, 'output_cost_per_token_batches': 3e-07, 'cache_read_input_token_cost':             
                             7.5e-08, 'litellm_provider': 'openai', 'mode': 'chat', 'supports_function_calling': True,                                   
                             'supports_parallel_function_calling': True, 'supports_response_schema': True, 'supports_vision': True,                      
                             'supports_prompt_caching': True, 'supports_system_messages': True, 'supports_tool_choice': True}                            
                    INFO     response_cost: 0.00025649999999999995                                                   base_online_request_processor.py:489

TL;DR: online_status_tracker picks up the user-specified cost override, but base_online_request_processor (OpenAI online) doesn't seem to use the override when computing response_cost?
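
A back-of-the-envelope check supports this. The token counts below are hypothetical, just one split consistent with the logged response_cost, but under the override the cost would be orders of magnitude larger:

    # litellm's default gpt-4o-mini rates, from the second log entry above
    default_in, default_out = 1.5e-07, 6e-07
    # e.g. 1350 input + 90 output tokens (a hypothetical split)
    print(1350 * default_in + 90 * default_out)  # 0.0002565 -> matches response_cost
    print((1350 + 90) * 1.0)                     # 1440.0 under the 1.0/token override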

I haven't tested the other models/providers yet.
@RyanMarten @kartik4949

@kartik4949
Contributor

@CharlieJCJ Thanks for flagging. It was a bug in the latest litellm version; I fixed it in the cost revamp PR.
Thanks

kartik4949 linked a pull request Mar 4, 2025 that will close this issue