
[Cost] Custom input/output/etc cost that user can specify from our curator interface #569

Open
CharlieJCJ opened this issue Mar 2, 2025 · 4 comments · May be fixed by #575
Labels
curator-tracker Related to tracking - costs, requests, etc.

Comments

@CharlieJCJ
Contributor

CharlieJCJ commented Mar 2, 2025

One way to do this is through backend_params; alternatively, we could create a separate field for specifying cost-related config.
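
For concreteness, a rough sketch of what the two options could look like. The backend_params key names below match the usage example later in this thread; the separate-field shape is purely hypothetical, and Poet stands in for a curator LLM subclass:

    # Option 1: reuse backend_params (key names as used later in this thread;
    # values are illustrative USD-per-million-token rates, not real prices)
    poet = Poet(
        model_name="my-custom-model",
        backend_params={"in_mtok_cost": 0.15, "out_mtok_cost": 0.60},
    )

    # Option 2: a dedicated cost-config field (hypothetical shape, not implemented)
    poet = Poet(
        model_name="my-custom-model",
        cost_config={"input_mtok_cost": 0.15, "output_mtok_cost": 0.60},
    )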

@RyanMarten
Contributor

It would be great to have this through backend_params! Thanks!

@kartik4949 please provide insight on the backend implementation of this since you were looking at registering new model costs previously

RyanMarten added the enhancement and curator-tracker labels and removed the enhancement label Mar 2, 2025
@kartik4949
Contributor

@RyanMarten @CharlieJCJ
#481

@CharlieJCJ
Contributor Author

CharlieJCJ commented Mar 3, 2025

Thanks!

Example usage:

poet = Poet(model_name="gpt-4o-mini", backend_params={"in_mtok_cost": 1000000, "out_mtok_cost": 1000000})
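
Judging from the "Override cost per input token: 1.0" log line below, these values appear to be interpreted as cost per million tokens, so the effective per-token rate would be mtok_cost / 1e6:

    # Conversion inferred from the logs below (an assumption, not documented API):
    in_mtok_cost = 1_000_000
    cost_per_input_token = in_mtok_cost / 1e6  # 1.0, matching the logged override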

I did a simple test from the main branch (overriding the cost of gpt-4o-mini), but right now it doesn't return a cost based on the user-specified values.


Selected logs are included below:

                    INFO     Override cost per input token: 1.0, cost per output token: 1.0 for model: gpt-4o-mini                             cost.py:24
                    INFO     Model cost: {'max_tokens': 8192, 'max_input_tokens': 128000, 'max_output_tokens': 16384,        online_status_tracker.py:147
                             'input_cost_per_token': 1.0, 'output_cost_per_token': 1.0, 'input_cost_per_token_batches':                                  
                             7.5e-08, 'output_cost_per_token_batches': 3e-07, 'cache_read_input_token_cost': 7.5e-08,                                    
                             'litellm_provider': 'openai', 'mode': 'chat', 'supports_function_calling': True,                                            
                             'supports_parallel_function_calling': True, 'supports_response_schema': True,                                               
                             'supports_vision': True, 'supports_prompt_caching': True, 'supports_system_messages': True,                                 
                             'supports_tool_choice': True, 'key': 'gpt-4o-mini', 'cache_creation_input_token_cost': None,                                
                             'input_cost_per_character': None, 'input_cost_per_token_above_128k_tokens': None,                                           
                             'input_cost_per_query': None, 'input_cost_per_second': None, 'input_cost_per_audio_token':                                  
                             None, 'output_cost_per_audio_token': None, 'output_cost_per_character': None,                                               
                             'output_cost_per_token_above_128k_tokens': None, 'output_cost_per_character_above_128k_tokens':                             
                             None, 'output_cost_per_second': None, 'output_cost_per_image': None, 'output_vector_size':                                  
                             None, 'supports_assistant_prefill': False, 'supports_audio_input': False,                                                   
                             'supports_audio_output': False, 'supports_pdf_input': False, 'supports_embedding_image_input':                              
                             False, 'supports_native_streaming': None, 'tpm': None, 'rpm': None, 'supported_openai_params':                              
                             ['frequency_penalty', 'logit_bias', 'logprobs', 'top_logprobs', 'max_tokens',                                               
                             'max_completion_tokens', 'modalities', 'prediction', 'n', 'presence_penalty', 'seed', 'stop',                               
                             'stream', 'stream_options', 'temperature', 'top_p', 'tools', 'tool_choice', 'function_call',                                
                             'functions', 'max_retries', 'extra_headers', 'parallel_tool_calls', 'response_format', 'user']}                             
                    INFO     I use litellm to calculate cost, cost info: {'max_tokens': 16384, 'max_input_tokens': 128000,                     cost.py:45
                             'max_output_tokens': 16384, 'input_cost_per_token': 1.5e-07, 'output_cost_per_token': 6e-07,                                
                             'input_cost_per_token_batches': 7.5e-08, 'output_cost_per_token_batches': 3e-07, 'cache_read_input_token_cost':             
                             7.5e-08, 'litellm_provider': 'openai', 'mode': 'chat', 'supports_function_calling': True,                                   
                             'supports_parallel_function_calling': True, 'supports_response_schema': True, 'supports_vision': True,                      
                             'supports_prompt_caching': True, 'supports_system_messages': True, 'supports_tool_choice': True}                            
                    INFO     response_cost: 0.00025649999999999995                                                   base_online_request_processor.py:489

TL;DR: online_status_tracker picks up the user-specified cost override, but base_online_request_processor (OpenAI online) doesn't seem to use the override when computing response_cost?
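
A back-of-the-envelope check supports this. The token counts below are hypothetical, just one split consistent with the logged response_cost, but under the override the cost would be orders of magnitude larger:

    # litellm's default gpt-4o-mini rates, from the second log entry above
    default_in, default_out = 1.5e-07, 6e-07
    # e.g. 1350 input + 90 output tokens (a hypothetical split)
    print(1350 * default_in + 90 * default_out)  # 0.0002565 -> matches response_cost
    print((1350 + 90) * 1.0)                     # 1440.0 under the 1.0/token override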

I haven't tested the other models/providers yet.
@RyanMarten @kartik4949

@kartik4949
Contributor

@CharlieJCJ Thanks for flagging. It was a bug in the latest litellm version; I fixed it in the cost revamp PR.
Thanks

kartik4949 linked a pull request Mar 4, 2025 that will close this issue