Description
When processing large images or documents, the file is sent as encoded data with every request. This increases token usage, since the file has to be ingested again each time. With tool calling in particular, this leads to very high token consumption.
To mitigate this, some platforms provide a way to cache input files. It would be nice to find a way to enable this caching.
For reference:
- https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
- https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
- https://platform.openai.com/docs/guides/prompt-caching
- Prompt caching is enabled by default for OpenAI
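For illustration, here is a minimal sketch of how Anthropic-style prompt caching could be applied to this use case, following the payload shape described in the first link above: the large, stable document is placed in a content block marked with `cache_control`, so repeated requests (e.g. a tool-calling loop) do not re-send and re-ingest it. Model name and texts are placeholders, not a proposal for a concrete API in this project.

```python
# Sketch: marking a large document for server-side caching using the
# Anthropic Messages API "cache_control" mechanism (see the prompt-caching
# docs linked above). The payload is only constructed here, not sent;
# model name and document text are placeholders.

def build_cached_request(document_text: str, question: str) -> dict:
    """Build a Messages API payload that caches the large document block.

    Blocks marked with cache_control {"type": "ephemeral"} are cached by
    the provider, so subsequent requests with the same prefix (common in
    tool-calling loops) avoid re-ingesting the document each turn.
    """
    return {
        "model": "claude-3-5-sonnet-20241022",  # placeholder model id
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": "You answer questions about the attached document.",
            },
            {
                "type": "text",
                "text": document_text,  # large, stable content -> cache it
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [{"role": "user", "content": question}],
    }

payload = build_cached_request("<large document text>", "Summarize section 2.")
```

The key point is that only the large, unchanging prefix carries the cache marker, while the per-turn user message stays outside it; OpenAI reaches a similar effect automatically by caching repeated prompt prefixes without any explicit marker.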