Skip to content

proposal to remove list functionality#383

Open
ssyno wants to merge 3 commits intomainfrom
security-update-proposal
Open

proposal to remove list functionality#383
ssyno wants to merge 3 commits intomainfrom
security-update-proposal

Conversation

@ssyno
Copy link
Collaborator

@ssyno ssyno commented Mar 12, 2026

Towards: https://github.com/giantswarm/giantswarm/issues/35198

What this PR does / why we need it

This PR:

  • Token TTL reduced to 1h with proactive 30-minute renewal, so leaked tokens are short-lived by design
  • Old tokens are explicitly revoked from Teleport immediately after rotation, closing the residual window down to near-zero
  • IsTokenValid now calls GetToken(name) directly — the token name is already in the Secret, no listing needed
  • DeleteTokenByName replaces the label-scan DeleteToken — the controller reads UUIDs from the Secret and ConfigMap before deleting them; both per-cluster tokens are now cleaned up (previously only the first match was deleted)
  • The Teleport operator role can now drop list from its token resource verbs, preventing cross-tenant enumeration entirely
  • Secret-backed token no longer carries the node role — kube-agent registration only requires kube/app

Checklist

  • Update changelog in CHANGELOG.md.

@ssyno ssyno requested a review from a team as a code owner March 12, 2026 13:08
"joinToken": token,
}

if err := ctrlClient.Patch(ctx, existing, patch); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the old implementation used Update(), and that is generally considered better practice, why is it necessary and important now to use Patch() instead?

// TeleportKubeTokenValidity is the TTL for all join tokens generated by this
// operator. Reduced from 720h to 1h to limit the blast radius of a leaked token.
// Tokens are renewed proactively via TokenRenewalThreshold before they expire.
TeleportKubeTokenValidity = 1 * time.Hour
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like 1hr in theory, but I want to be sure I understand the scope here. This is the validity of the WC join token, right? If teleport-operator loses connectivity for 6 hours (but WCs are still connected), does this mean WCs will be disconnected, or do they use a different token after the initial join?

I'm worried about what happens in an incident or network outage, so I just want to make sure this won't boot WCs out and will self-heal if TO can't reach teleport for some reason for a few hours. Once it regains connectivity, would oncallers need to manually do anything to restore TKA access?

// to make the name unique across cluster re-creations with the same name.
// An empty clusterUID falls back to the legacy two-component format to preserve
// backwards compatibility with already-registered clusters.
func GetRegisterName(managementClusterName, clusterName, clusterUID string) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the change to GetRegisterName() for?

This seems extremely likely to break things. For example, tsh kube login would then need to know the uid of the target cluster, right?

if !cluster.DeletionTimestamp.IsZero() {
// Delete teleport token for the cluster
if err := r.Teleport.DeleteToken(ctx, log, registerName); err != nil {
// Read token names from K8s resources before deleting them so we can
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to clarify - all the deletion changes besides using DeleteByTokenName are only necessary if changing the register name, right? Or is there a second purpose here?

Comment on lines +193 to +204
// secretLabels are applied to every Secret this operator manages so that
// ownership can be determined without relying on OwnerReferences, and so
// that secrets can be scoped to a specific cluster UID for auditability.
secretLabels := map[string]string{
"app.kubernetes.io/managed-by": "teleport-operator",
"teleport-operator/cluster-uid": string(cluster.UID),
}

// Check and update Secret if necessary.
// The Secret uses the same roles as the ConfigMap (kube + optionally app)
// because the node role is not required for kube-agent registration and
// granting excess roles violates least-privilege.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What is the difference between the ConfigMap and Secret that we create here?
  2. If we're adding labels, let's apply the same labels to both CM and Secret

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants