Support GKE Workload Identity Federation for sharded controllers in multi-cluster deployments #5646

@papaslon

Note: If you're requesting a Helm chart option that may be very niche and not useful to the community at large, please consider using Kustomize to apply "last mile" tweaks to the output of helm template to suit your needs instead.

Checklist

  • I've searched the issue queue to verify this is not a duplicate feature request.
  • I've pasted the output of kargo version, if applicable.
  • I've pasted logs, if applicable.

Proposed Feature

Kargo version: 1.8.6

Support GKE Workload Identity Federation in sharded multi-cluster deployments by allowing the management controller to create RoleBindings with kind: User subjects instead of kind: ServiceAccount.

When a sharded kargo-controller runs in a different GKE cluster and authenticates to the control plane cluster via Workload Identity Federation, the control plane's API server sees it as a User identity, not as a ServiceAccount. The RoleBinding subject for such a cross-cluster identity has the form:

subjects:
  - kind: User
    name: "serviceAccount:<GCP_PROJECT>.svc.id.goog[<NAMESPACE>/<SERVICE_ACCOUNT>]"

However, the management controller currently hardcodes RoleBinding subjects as:

subjects:
  - kind: ServiceAccount
    name: <sa-name>
    namespace: <kargo-namespace>

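A kind: ServiceAccount subject only matches ServiceAccounts in the same cluster as the RoleBinding, so it never matches a controller authenticating from another cluster.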

Suggested Implementation

Add configuration options to specify remote controller identities. For example:

controller:
  remoteControllers:
    - name: kargo-controller-shard-a
      subject:
        kind: User
        name: "serviceAccount:my-gcp-project.svc.id.goog[kargo/kargo-controller]"

The management controller would then create RoleBindings in Project namespaces (and optionally global credential namespaces) with the correct User subjects.
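
For illustration, given the example values above, the management controller might generate a RoleBinding like the following in each Project namespace. This is only a sketch of the proposed behavior: the namespace my-project and the binding name are hypothetical, and the ClusterRole name is borrowed from the manual workaround described in this issue rather than from any confirmed implementation.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  # Hypothetical name, derived here from the remoteControllers entry name
  name: kargo-controller-shard-a-read-secrets
  # Hypothetical Project namespace
  namespace: my-project
subjects:
  - kind: User
    # For User subjects, apiGroup defaults to rbac.authorization.k8s.io;
    # it is written out here only for clarity
    apiGroup: rbac.authorization.k8s.io
    name: "serviceAccount:my-gcp-project.svc.id.goog[kargo/kargo-controller]"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  # Assumed to be the same ClusterRole used in the manual workaround
  name: kargo-controller-read-secrets
```

The only difference from the bindings created today would be the subject: a kind: User entry taken from configuration instead of a hardcoded kind: ServiceAccount entry.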

Motivation

Use Case: Multi-Cluster GKE Sharded Deployment

In a sharded Kargo architecture on GKE:

  1. Control plane cluster hosts all Kargo CRDs, the API server, and management controller
  2. Workload clusters run sharded kargo-controllers that connect back to the control plane
  3. GKE Workload Identity Federation is used for secure cross-cluster authentication (no long-lived credentials)

The sharded controllers need to read Secrets in Project namespaces (for repository credentials) and in global credential namespaces. The management controller automatically creates RoleBindings for local ServiceAccounts, but those bindings do not match a controller that authenticates from another cluster via Workload Identity Federation.

Current Workaround

Operators must manually create RoleBindings in every Project namespace and global credential namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kargo-controller-shard-read-secrets
  namespace: <each-project-namespace>
subjects:
  - kind: User
    name: "serviceAccount:<GCP_PROJECT>.svc.id.goog[kargo/kargo-controller]"
roleRef:
  kind: ClusterRole
  name: kargo-controller-read-secrets
  apiGroup: rbac.authorization.k8s.io

This is error-prone and doesn't scale, especially since new RoleBindings must be created every time a new Project is added.

Why This Matters

  • GKE Workload Identity Federation is the recommended authentication method for GKE workloads accessing Google Cloud and cross-cluster resources
  • Multi-cluster/sharded deployments are a key scaling pattern for Kargo
  • Security best practices discourage long-lived credentials or cluster-wide Secret access

Labels

  • kind/enhancement: An entirely new feature
  • kind/proposal: Indicates maintainers have not yet committed to a feature request
  • needs/area: Issue or PR needs to be labeled to indicate what parts of the code base are affected
  • needs/priority: Priority has not yet been determined; a good signal that maintainers aren't fully committed
