Description
Expected Behavior
VertexAiGeminiChatModel should support retry options in the same way as other chat models, e.g. OpenAiChatModel.
Current Behavior
VertexAiGeminiChatModel does not retry failed requests.
Context
The Gemini 1.5 Pro model sometimes returns the following error:
java.lang.RuntimeException: Failed to generate content
at org.springframework.ai.vertexai.gemini.VertexAiGeminiChatModel.getContentResponse(VertexAiGeminiChatModel.java:532)
at org.springframework.ai.vertexai.gemini.VertexAiGeminiChatModel.call(VertexAiGeminiChatModel.java:173)
...
Caused by: com.google.api.gax.rpc.ResourceExhaustedException: io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: Unable to submit request because the service is temporarily out of capacity. Try again later.
Because this error is transient, retrying in such cases is crucial for stable application operation.
More information on resource exhaustion: https://cloud.google.com/vertex-ai/generative-ai/docs/quotas#troubleshoot-dynamic-shared-quota
There is already an issue to add the spring-ai-retry dependency (#832), but adding the dependency alone does not solve the problem: VertexAiGeminiChatModel would still not retry failed calls.
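To illustrate the behavior being requested, here is a minimal sketch of retrying a call with exponential backoff when the failure looks like RESOURCE_EXHAUSTED. It is plain Java and does not use Spring Retry or the spring-ai-retry module; `RetrySketch`, `callWithRetry`, and `isResourceExhausted` are hypothetical names invented for this example, and the attempt count and backoff values are arbitrary assumptions, not Spring AI defaults.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not part of Spring AI: retries a call with exponential backoff.
final class RetrySketch {

    // Runs the action, retrying up to maxAttempts times while the failure is retryable.
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts, long initialBackoffMillis,
                               double multiplier, Predicate<Throwable> retryable) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or on errors that are not transient.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
                sleep(backoff);
                backoff = (long) (backoff * multiplier);
            }
        }
    }

    // Walks the cause chain looking for the RESOURCE_EXHAUSTED status text.
    // Assumption: matching on the message keeps this sketch dependency-free; real code
    // would more likely check for com.google.api.gax.rpc.ResourceExhaustedException.
    static boolean isResourceExhausted(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String message = c.getMessage();
            if (message != null && message.contains("RESOURCE_EXHAUSTED")) {
                return true;
            }
        }
        return false;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated model call: fails twice the way the stack trace above does, then succeeds.
        String answer = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("Failed to generate content",
                        new RuntimeException("RESOURCE_EXHAUSTED: temporarily out of capacity"));
            }
            return "generated content";
        }, 5, 100L, 2.0, RetrySketch::isResourceExhausted);
        System.out.println(answer + " after " + calls[0] + " attempts");
    }
}
```

Until the model retries internally, application code could wrap its own `chatModel.call(...)` invocation in a helper like this as a workaround. The request in this issue is for VertexAiGeminiChatModel to do the equivalent itself, with configurable retry options.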