Problem
Spring AI 1.0.3+ has comprehensive Anthropic prompt caching support through AnthropicChatOptions.cacheOptions(AnthropicCacheOptions), but Embabel's LlmOptions doesn't expose this configuration.
Current State:
- Spring AI: AnthropicChatOptions.cacheOptions ✅ Full support
- Embabel: LlmOptions ❌ No caching fields
- Embabel Converter: AnthropicOptionsConverter ❌ Doesn't map caching
Impact: Users must create custom bean overrides, losing Embabel's abstraction benefits and missing out on up to 90% cost savings on cached prompt reads.
Proposed Solution
1. Add Caching Fields to LlmOptions
// embabel-common-ai/.../LlmOptions.kt
data class LlmOptions(
    // ... existing fields ...

    /**
     * Anthropic prompt caching strategy.
     * Options: NONE, SYSTEM_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY
     */
    val promptCacheStrategy: String? = null, // Maps to AnthropicCacheStrategy

    /**
     * Cache TTL: "5m" (default) or "1h".
     * 5m = 1.25x write cost, 1h = 2x write cost;
     * read cost = 0.1x for both.
     */
    val promptCacheTtl: String? = null,

    /**
     * Minimum content length (in characters) to cache.
     * Haiku requires 2048+; Opus/Sonnet require 1024+.
     */
    val promptCacheMinLength: Int? = null,
)
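Since promptCacheStrategy is a plain String mapped to the Spring AI enum via valueOf(), an invalid value would only fail at request time. A minimal sketch of up-front, lenient validation, using a hypothetical stand-in enum (CacheStrategy here is not the real Spring AI type, just a placeholder for illustration):

```kotlin
// Stand-in for Spring AI's AnthropicCacheStrategy enum, used only to
// illustrate validating the promptCacheStrategy string up front instead
// of letting valueOf() throw deep inside the converter.
enum class CacheStrategy { NONE, SYSTEM_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY }

// Returns the parsed strategy, or null for absent/unknown values, so a
// converter could fall back to "no caching" rather than failing the request.
fun parseCacheStrategy(raw: String?): CacheStrategy? =
    raw?.trim()?.uppercase()?.let { value ->
        CacheStrategy.values().firstOrNull { it.name == value }
    }
```

An alternative design would be to make promptCacheStrategy an enum on LlmOptions itself, which would push this validation to compile time.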
2. Update AnthropicOptionsConverter
// embabel-agent-anthropic-autoconfigure/.../AnthropicOptionsConverter.kt
fun convertOptions(llmOptions: LlmOptions): AnthropicChatOptions {
    val builder = AnthropicChatOptions.builder()
        .model(llmOptions.model)
    // ... existing mappings ...

    // Map cache configuration (captured in a local val so Kotlin can smart-cast)
    val strategy = llmOptions.promptCacheStrategy
    if (strategy != null) {
        val cacheOptions = AnthropicCacheOptions.builder()
            .strategy(AnthropicCacheStrategy.valueOf(strategy.uppercase()))

        // Apply TTL if specified
        if (llmOptions.promptCacheTtl != null) {
            val ttl = when (llmOptions.promptCacheTtl) {
                "1h" -> AnthropicCacheTtl.ONE_HOUR
                else -> AnthropicCacheTtl.FIVE_MINUTES
            }
            // Apply to SYSTEM and TOOL messages
            cacheOptions
                .messageTypeTtl(MessageType.SYSTEM, ttl)
                .messageTypeTtl(MessageType.TOOL, ttl)
        }

        // Apply minimum length if specified
        val minLength = llmOptions.promptCacheMinLength
        if (minLength != null) {
            cacheOptions
                .messageTypeMinContentLength(MessageType.SYSTEM, minLength)
                .messageTypeMinContentLength(MessageType.TOOL, minLength)
        }

        builder.cacheOptions(cacheOptions.build())
    }
    return builder.build()
}
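Note that the TTL mapping above silently treats anything other than the exact string "1h" (including typos like "1H" or "60m") as the five-minute default. A self-contained sketch of that mapping as a pure function, with a hypothetical stand-in for Spring AI's AnthropicCacheTtl so the behavior can be exercised in isolation:

```kotlin
// Stand-in for Spring AI's AnthropicCacheTtl (illustration only).
enum class CacheTtl { FIVE_MINUTES, ONE_HOUR }

// Mirrors the `when` in the converter: only the exact string "1h"
// selects the one-hour TTL; everything else falls back to 5 minutes.
fun mapTtl(raw: String?): CacheTtl = when (raw) {
    "1h" -> CacheTtl.ONE_HOUR
    else -> CacheTtl.FIVE_MINUTES
}
```

If the silent fallback is undesirable, the converter could instead reject unknown TTL strings so misconfiguration surfaces early.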
3. Usage Example
@Agent
class MyAgent(private val context: OperationContext) {

    @Action
    fun performTask(input: TaskInput): TaskOutput {
        val options = LlmOptions.builder()
            .withModel("claude-3-5-haiku-20241022")
            .withTemperature(0.7)
            .withMaxTokens(8000)
            .withPromptCacheStrategy("SYSTEM_AND_TOOLS") // Enable caching
            .withPromptCacheTtl("5m")                    // 5-minute TTL
            .withPromptCacheMinLength(2048)              // Haiku minimum
            .build()

        return context.ai().createObject(
            prompt = "Analyze: ${input.data}",
            options = options
        )
    }
}
Benefits
- ✅ Up to 90% cost savings: cache reads are billed at ~10% of the base input-token price
- ✅ Latency reduction: faster subsequent requests within the TTL window
- ✅ Framework consistency: configured through Embabel's LlmOptions API
- ✅ Backward compatible: all new fields are optional and default to null
- ✅ Type safe: optional typed fields (a dedicated enum for the strategy would extend this to compile-time validation)
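A back-of-envelope check of the savings claim, using the multipliers quoted in the field docs above (5m TTL: 1.25x write, 0.1x read, vs. 1.0x per uncached request). This is a sketch of the relative cost model, not Anthropic's exact billing:

```kotlin
// Relative cost (in units of base input-token price) of one cache write
// followed by `hits` cache reads, using the 5m-TTL multipliers above.
fun cachedCost(hits: Int, writeMultiplier: Double = 1.25, readMultiplier: Double = 0.1): Double =
    writeMultiplier + hits * readMultiplier

// Relative cost of the same number of requests without caching.
fun uncachedCost(requests: Int): Double = requests.toDouble()
```

Under this model, caching is already cheaper after a single cache hit (1.25 + 0.1 = 1.35 vs. 2.0 for two uncached requests), and the savings approach the quoted 90% as the number of hits within the TTL grows.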
References
[Spring AI AnthropicChatOptions](https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-anthropic/src/main/java/org/springframework/ai/anthropic/AnthropicChatOptions.java)
[Spring AI AnthropicCacheOptions](https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-anthropic/src/main/java/org/springframework/ai/anthropic/api/AnthropicCacheOptions.java)
[Anthropic Caching Docs](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)