@fernando-villalba
Collaborator

  • Change Cell.Spec.TopoServer to a pointer (*LocalTopoServerSpec). This allows the field to be omitted (nil) when using the global default topology server, resolving validation errors where the empty struct failed the "oneOf" requirement (etcd vs external).

  • Lowered validation limits because the CRD exceeded the CEL cost budget during integration tests. These limits may need further tuning as we go.
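For readers unfamiliar with why a pointer helps here, the following is a minimal sketch rather than the operator's actual types. `LocalTopoServerSpec`'s fields, `CellSpec`, the JSON tags, and the helpers `encodeCell` and `decodeCell` are all assumptions made for illustration. It relies on one documented behavior of Go's `encoding/json`: `omitempty` never omits a struct value, but it does omit a nil pointer, so a pointer field lets "not configured" disappear from the serialized object instead of showing up as `{}`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LocalTopoServerSpec is a stand-in for the real type. The field names
// and types are hypothetical; only the etcd-vs-external choice comes
// from the PR description.
type LocalTopoServerSpec struct {
	Etcd     *string `json:"etcd,omitempty"`
	External *string `json:"external,omitempty"`
}

// CellSpec holds TopoServer as a pointer so that "not set" (nil) is
// distinguishable from "set but empty" (a zero-value struct).
type CellSpec struct {
	Name       string               `json:"name"`
	TopoServer *LocalTopoServerSpec `json:"topoServer,omitempty"`
}

// encodeCell marshals a CellSpec to its JSON form.
func encodeCell(c CellSpec) string {
	b, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decodeCell unmarshals JSON into a CellSpec.
func decodeCell(s string) CellSpec {
	var c CellSpec
	if err := json.Unmarshal([]byte(s), &c); err != nil {
		panic(err)
	}
	return c
}

func main() {
	// A nil pointer is dropped entirely, so no empty object is emitted.
	fmt.Println(encodeCell(CellSpec{Name: "zone-a"}))

	// An omitted field decodes back to nil, meaning "use the global default".
	fmt.Println(decodeCell(`{"name":"zone-a"}`).TopoServer == nil)
}
```

With a non-pointer field, the same `CellSpec` would serialize an empty `topoServer` object, which is the kind of value that cannot satisfy a "choose etcd or external" rule.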

@fernando-villalba
Collaborator Author

To provide more context on what caused the CEL validation budget errors and why I was conservative and adjusted all the limits down:

The Kubernetes API server rejected the CRD because its estimated validation cost exceeded the allowed budget by more than 100x. This happened because we had deep nesting (Databases -> TableGroups -> Shards -> Pools/Orch) combined with CEL rules on maps (podAnnotations, podLabels, pools), and the API server estimates the cost for the worst-case scenario.

The specific fields that exceeded the budget:

  1. overrides.pools
     • Path: spec.databases[].tablegroups[].shards[].overrides.pools
     • Violation: Cost exceeds budget by >100x.
  2. overrides.multiorch.podAnnotations
     • Path: spec.databases[].tablegroups[].shards[].overrides.multiorch.podAnnotations
     • Violation: Cost exceeds budget by >100x.
  3. overrides.multiorch.podLabels
     • Path: spec.databases[].tablegroups[].shards[].overrides.multiorch.podLabels
     • Violation: Cost exceeds budget by >100x.
  4. spec.pools (Inline Spec)
     • Path: spec.databases[].tablegroups[].shards[].spec.pools
     • Violation: Cost exceeds budget by >100x.
  5. spec.multiorch.podLabels (Inline Spec)
     • Path: spec.databases[].tablegroups[].shards[].spec.multiorch.podLabels
     • Violation: Cost exceeds budget by >100x.
  6. spec.multiorch.podAnnotations (Inline Spec)
     • Path: spec.databases[].tablegroups[].shards[].spec.multiorch.podAnnotations
     • Violation: Cost exceeds budget by >100x.
  7. shards[] List
     • Path: spec.databases[].tablegroups[].shards
     • Violation: Cost exceeds budget by 1.9x (on the list itself, independent of the items).

Summary: Because of the deep nesting, the estimated cost of a single validation rule on podAnnotations is multiplied by the maximum size of every parent list. Without strict MaxItems and MaxProperties limits at every level, the worst-case cost runs into billions of operations.
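To make the multiplication concrete, here is a rough sketch of the arithmetic. It is not the API server's actual estimator, which accounts for more than list sizes. The `level` type, the `worstCaseCost` function, the per-rule cost of 10, and every limit below are hypothetical values chosen only to show how the product grows.

```go
package main

import "fmt"

// level is one nesting level together with the maximum number of
// elements its schema allows (MaxItems or MaxProperties).
// All values used in main are hypothetical.
type level struct {
	name string
	max  uint64
}

// worstCaseCost multiplies the cost of evaluating a rule once by the
// maximum size of every enclosing collection. This is a simplified
// model of worst-case estimation, not the real Kubernetes algorithm.
func worstCaseCost(ruleCost uint64, levels []level) uint64 {
	total := ruleCost
	for _, l := range levels {
		total *= l.max
	}
	return total
}

func main() {
	// Loose limits: every parent list may hold many items.
	loose := []level{
		{"databases", 100},
		{"tablegroups", 100},
		{"shards", 100},
		{"podAnnotations", 64},
	}
	// Tight limits: each level is capped at a small number.
	tight := []level{
		{"databases", 8},
		{"tablegroups", 8},
		{"shards", 8},
		{"podAnnotations", 8},
	}

	fmt.Println(worstCaseCost(10, loose)) // 640000000
	fmt.Println(worstCaseCost(10, tight)) // 40960
}
```

Under this simplified model, tightening any single level reduces the product proportionally, which is why lowering limits on parent lists helps as much as lowering them on the map that carries the rule.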

Member

@rytswd left a comment

LGTM, just one question left for clarification

  // Pools overrides. Keyed by pool name.
  // +optional
- // +kubebuilder:validation:MaxProperties=32
+ // +kubebuilder:validation:MaxProperties=8
Member

Just curious, where is this number coming from?

Collaborator Author

From my desperation and frugality in trying to keep everything under the CEL budget 😭

Don't take any of these limits as gospel; my guess is that we will need to adjust them further as we go along.

@fernando-villalba merged commit e060d24 into main on Dec 22, 2025
@fernando-villalba deleted the adjusting-cel-budgets branch on December 22, 2025 21:11
3 participants