fix(antigravity): clamp upstream retryDelay/reset to prevent excessive cooldowns #2302

Open
ryneivy wants to merge 1 commit into Wei-Shaw:main from ryneivy:fix/antigravity-cooldown-clamp

ryneivy commented May 8, 2026

Antigravity occasionally returns very long retry windows (hours or days, e.g. on geo-restricted accounts or true quota exhaustion). The gateway currently propagates these values verbatim to SetModelRateLimit / SetRateLimited, locking accounts out of scheduling for the full duration.

Mirror the OpenAI fix from #2290 by capping the value at a local upper bound:

  • Add antigravityMaxRateLimitCooldown = 2h (same magnitude as OpenAI's maxRateLimit429CooldownSeconds = 7200).
  • Apply the cap in two places that consume upstream-supplied wait time:
    • parseAntigravitySmartRetryInfo: clamps retryDelay parsed from google.rpc.RetryInfo (covers smart-retry and "switch account" paths).
    • (*AntigravityGatewayService).resolveResetTime: clamps the absolute reset timestamp used by the 429 fallback handler.
  • Log a single line when a clamp fires, so operators can spot pathological upstream responses.

Intentionally out of scope:

  • Account-recovery hook analogous to recoverOpenAIRateLimitedAccountBeforeNoAvailable (touches the shared selector path; left for a follow-up PR).
  • antigravityRateLimitThreshold / antigravitySmartRetryMaxAttempts.

Tests added in antigravity_cooldown_clamp_test.go cover both helpers and verify parseAntigravitySmartRetryInfo clamps a 24h retryDelay while leaving short delays (0.4s) untouched.
