Skip to content

RBAC: +kubebuilder:rbac markers are incomplete chart and \make manifests\ have drifted #390

Description

@WentingWu666666

Problem

The operator''s ClusterRole (rendered by operator/documentdb-helm-chart/templates/05_clusterrole.yaml) has been hand-maintained and has drifted significantly from what make manifests would generate from +kubebuilder:rbac markers.

Concretely:

  • Only 4 files in operator/src/internal/ carry +kubebuilder:rbac markers:
    • certificate_controller.go
    • documentdb_controller.go
    • pv_controller.go
    • (partial coverage)
  • These files have zero RBAC markers despite making client.* calls:
    • backup_controller.go
    • scheduledbackup_controller.go
    • physical_replication.go
    • internal/utils/util.go
    • internal/utils/replication_context.go
    • internal/cnpg/cnpg_sync.go
  • As a result, make manifests regenerates operator/src/config/rbac/role.yaml with only a small subset of the permissions the operator actually needs (documentdb.io CRDs, cert-manager objects, PVC/PV/StorageClass, plus events). Everything else (postgresql.cnpg.io, snapshot.storage.k8s.io, fleet, services, secrets, configmaps, serviceaccounts, namespaced RBAC, leader-election leases, backup/scheduledbackup CRs) is missing.
  • The Helm chart''s ClusterRole has therefore been edited by hand to fill the gap. PR chore(helm): trim over-broad ClusterRole permissions #386 cleans up the worst dead grants that hand-editing accumulated, but the underlying drift remains.

Why this matters for GA

  • We cannot rely on make manifests as a source of truth for RBAC reviews.
  • New controllers / new client calls require remembering to update the chart''s ClusterRole by hand. Easy to miss; easy to over-grant defensively.
  • Reviewers cannot cheaply verify "this PR''s code matches its RBAC delta" they have to grep the codebase by hand (as chore(helm): trim over-broad ClusterRole permissions #386 had to).

Proposed fix

  1. Add a complete set of +kubebuilder:rbac markers to each controller / utility function that makes Kubernetes API calls. Markers should be co-located with the call site (typically on the Reconcile method, or at file top for cross-cutting utilities).
  2. Run make manifests and commit the regenerated operator/src/config/rbac/role.yaml.
  3. Switch the chart''s templates/05_clusterrole.yaml to render from (or be asserted equal to) the generated role.yaml. Options:
    a. Replace the hand-written chart template with a generated one (chart copies role.yaml at build time).
    b. Add a CI check that diffs the chart''s rendered ClusterRole against make manifests output and fails on drift.
  4. Add a make target (e.g. make verify-rbac) wired into CI so any future RBAC drift fails the build.

Acceptance criteria

  • make manifests on a fresh checkout produces a role.yaml that, when applied, allows every controller path to run successfully against a kind cluster (verified by E2E test suite passing with no forbidden errors in operator logs).
  • A CI check rejects any PR that introduces a client.* call without a corresponding +kubebuilder:rbac marker, OR any PR that changes the chart''s ClusterRole without a matching marker change.
  • The chart''s ClusterRole template either is generated, or contains a comment pointing reviewers at the make verify-rbac target.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions