feat(fleet): add refresh endpoint#1128
Conversation
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 77a27fa3c2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| usage_written = await UsageUpdater( | ||
| usage_repo, | ||
| context.repository, | ||
| additional_usage_repo, | ||
| ).refresh_accounts(eligible_accounts, latest_primary) |
There was a problem hiding this comment.
Use a background session for shielded refreshes
When POST /api/fleet/refresh is cancelled or times out while an upstream fetch is in flight, UsageUpdater.refresh_accounts() runs the per-account work inside _USAGE_REFRESH_SINGLEFLIGHT.run() via an asyncio.shielded task. Because the repos passed here all share the request-scoped context.session, FastAPI can close or roll back that session while the shielded task keeps using it, leading to failed refreshes or checked-out connections not returning under disconnects; create and own a background session inside the refresh task instead.
Useful? React with 👍 / 👎.
| router = APIRouter( | ||
| prefix="/api/fleet", | ||
| tags=["fleet"], | ||
| dependencies=[Security(validate_usage_api_key), Depends(set_dashboard_error_format)], |
There was a problem hiding this comment.
Preserve API-key scope in fleet handlers
For deployments that issue account-scoped API keys to clients, this router-level auth dependency only validates the token and discards the ApiKeyData, so both fleet handlers operate as an unscoped key: GET /api/fleet/summary lists every account/email and POST /api/fleet/refresh refreshes every eligible account instead of just assignedAccountIds. The existing API-key contract treats non-empty assigned_account_ids as account scope, so these handlers should receive the key data and filter accounts/usage queries before returning or refreshing.
Useful? React with 👍 / 👎.
|
Addressed the two Codex P1 findings on new head
Verification after the fix: @codex review |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8fbb99a8d3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| accounts = await context.service.list_accounts(account_ids=_visible_account_ids(api_key)) | ||
| return FleetSummaryResponse(accounts=build_fleet_account_summaries(accounts)) |
There was a problem hiding this comment.
Honor API-key quota visibility for fleet summary
For API keys deliberately configured not to see upstream quota data, this new path bypasses the existing visibility controls: /v1/usage parses api_key.usage_sections and suppresses account_pool_usage when hide_upstream_quota_from_api_keys is enabled, but GET /api/fleet/summary returns per-account remaining percentages and reset times for the same key. When usageSections excludes account_pool_usage/upstream_limits or the global privacy toggle is on, callers can recover the hidden quota details by calling this endpoint instead, so the summary should apply the same gates or require a separate explicit permission.
Useful? React with 👍 / 👎.
|
Addressed the latest Codex P1 on new head
Verification after the fix: @codex review |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e293ec254d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| usage_written = await UsageUpdater( | ||
| usage_repo, | ||
| accounts_repo, | ||
| additional_usage_repo, | ||
| ).refresh_accounts(eligible_accounts, latest_primary) |
There was a problem hiding this comment.
Run fleet refresh with a task-owned DB session
When a client disconnects/cancels POST /api/fleet/refresh while an upstream usage fetch is in flight, UsageUpdater.refresh_accounts runs the per-account refresh through the module-level singleflight using asyncio.shield, so that task can continue after this endpoint unwinds. Because the repositories passed here all share the surrounding get_background_session() context, its finalizer rolls back/closes the same AsyncSession while the surviving task may still write usage rows, token metadata, or status updates through it, leading to lost refresh work or broken/leaked pooled connections. Use a session owned by the shielded refresh task or ensure the singleflight is awaited/cancelled before leaving the context.
Useful? React with 👍 / 👎.
|
Addressed the current-head Codex P2 on new head
Verification after the fix: @codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db1d932613
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| account_id=account.account_id, | ||
| display_name=account.display_name, | ||
| email=account.email, | ||
| status=account.status, |
There was a problem hiding this comment.
Hide usage-derived status when quota visibility is disabled
When include_usage is false because the API key lacks account_pool_usage or hide_upstream_quota_from_api_keys is enabled, this still forwards AccountSummary.status. That status is computed by the accounts summary mapper from the latest usage rows, so an otherwise active account with primary or secondary usage at 100% is serialized as rate_limited/quota_exceeded while the window fields are nulled, leaking the hidden quota state through the new fleet endpoint.
Useful? React with 👍 / 👎.
|
Addressed both current-head Codex P2 findings on new head
Regression coverage added/updated:
Verification after the fix: @codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 50f704041c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| result = await _USAGE_REFRESH_SINGLEFLIGHT.run( | ||
| account.id, | ||
| lambda account=account: self._refresh_account_if_stale( | ||
| account, | ||
| usage_account_id=account.chatgpt_account_id, | ||
| interval_seconds=interval, | ||
| ), | ||
| refresh_factory, | ||
| ) |
There was a problem hiding this comment.
Avoid joining non-owned singleflight refreshes
When a fleet refresh overlaps an already-running refresh for the same account from paths that call refresh_accounts without own_singleflight_sessions, this call keeps the default join_existing=True, so the owned-session factory above is never executed. If that original waiter is cancelled and its DB context exits, the fleet refresh is still awaiting work tied to the original session, which reintroduces the closed-session failure this flag is meant to avoid; the owned path should avoid joining non-owned in-flight work or key it separately.
Useful? React with 👍 / 👎.
|
Addressed the current-head Codex P2 on new head
Verification after the fix: @codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 94270c59dd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
|
|
||
| def _usage_refresh_singleflight_key(account_id: str, *, own_singleflight_session: bool = False) -> str: | ||
| return f"owned-session:{account_id}" if own_singleflight_session else account_id |
There was a problem hiding this comment.
Use collision-free keys for owned refreshes
If an imported/upstream account id starts with owned-session:, the normal singleflight key for that account is identical to the owned-session key for the account id after the prefix (for example normal owned-session:abc vs owned abc). In that case a fleet refresh can still join an unrelated in-flight request-scoped refresh, leaving the intended account unrefreshed and reintroducing the session-lifetime coupling this change is trying to avoid; use a structured key or namespace both modes unambiguously.
Useful? React with 👍 / 👎.
|
Addressed the current-head Codex P2 on new head
Verification after the fix: @codex review |
|
Codex Review: Didn't find any major issues. 🎉 Reviewed commit: ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 104aa13496
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
|
|
||
| async def _can_view_fleet_usage(api_key: ApiKeyData) -> bool: | ||
| if "account_pool_usage" not in _usage_sections(api_key.usage_sections): |
There was a problem hiding this comment.
Gate reset/window fields on upstream_limits permission
Fresh evidence after the quota-visibility fix is that this predicate still treats account_pool_usage alone as permission to return the entire fleet summary. /v1/usage gates upstream_limits separately, but the fleet mapper returns per-account resetAt and windowMinutes; a key configured with usage_sections="account_pool_usage" and no upstream_limits can recover reset/window quota details that are hidden from /v1/usage. Require upstream_limits for those fields or null them independently.
Useful? React with 👍 / 👎.
Summary
Add API-key-authenticated fleet summary and refresh endpoints so trusted local dashboards can distinguish a read of codex-lb state from an upstream usage-refresh attempt.
Type of change
feat:— new user-facing feature or capabilityfix:— bug fix (no behavior change beyond the bug)refactor:— internal refactor (no behavior change, no API change)docs:— documentation onlychore:/ci:/build:— tooling, CI, packagingtest:— test-only changeLinked issue: N/A
OpenSpec
Change directory: openspec/changes/add-fleet-refresh-endpoint/
Changes
GET /api/fleet/summarywith a minimal non-sensitive account/window projection.POST /api/fleet/refreshbehind the existing always-required usage API key dependency.UsageUpdater.refresh_accountsoutside proxy request selection, skip paused/reauth/deactivated accounts, and invalidate relevant caches only when usage was written.Test plan
Also attempted:
openspecis not installed in this local shell (command not found/ failed spawn). Fulluv run pytestreached 4104 passed / 45 skipped but failed three existing account-recovery / usage-updater tests outside this change; rerunning those exact tests in isolation reproduces the same failures.Screenshots / output (optional)
N/A — backend/API change.
Checklist
<type>(<scope>)?: <subject>).openspec validate --specspasses and/opsx:verifyis clean. Blocked locally because the OpenSpec CLI is unavailable.