feat: Add ConfigMap watching for faster token re-issuance#313

Open

ssyno wants to merge 11 commits intomainfrom

feat/configmap-watching-for-faster-token-reissuance

Collaborator

ssyno commented Sep 3, 2025

This implementation adds a new ConfigMap controller that watches for changes to the teleport-operator ConfigMap and triggers immediate token re-issuance when configuration updates occur, improving response time from 5+ minutes to seconds.

Key features:

Event-driven ConfigMap watching with predicate filtering
Intelligent change detection with 4-level impact classification:
- Critical (ProxyAddr): Forces reconnection + token regeneration
- High (ManagementClusterName): Invalidates all tokens
- Medium (TeleportVersion/AppName): Updates ConfigMaps
- Low (AppVersion/AppCatalog): No immediate action required
Immediate cluster reconciliation triggering via annotations
Comprehensive unit tests covering all change scenarios
Backward compatible - no breaking changes

Performance improvements:

ProxyAddr changes: 5+ minutes → ~seconds (>100x faster)
ManagementClusterName changes: 5+ minutes → ~seconds (>100x faster)
TeleportVersion changes: Next restart → ~seconds (near-instant)

Files added:

internal/controller/config_controller.go - Main implementation
internal/controller/config_controller_test.go - Unit tests

Files modified:

main.go - Added ConfigMap controller registration

What this PR does / why we need it

Checklist

Update changelog in CHANGELOG.md.


          feat: Add ConfigMap watching for faster token re-issuance

f1350d7

This implementation adds a new ConfigMap controller that watches for changes
to the teleport-operator ConfigMap and triggers immediate token re-issuance
when configuration updates occur, improving response time from 5+ minutes to seconds.

Key features:
- Event-driven ConfigMap watching with predicate filtering
- Intelligent change detection with 4-level impact classification:
  * Critical (ProxyAddr): Forces reconnection + token regeneration
  * High (ManagementClusterName): Invalidates all tokens
  * Medium (TeleportVersion/AppName): Updates ConfigMaps
  * Low (AppVersion/AppCatalog): No immediate action required
- Immediate cluster reconciliation triggering via annotations
- Comprehensive unit tests covering all change scenarios
- Backward compatible - no breaking changes

Performance improvements:
- ProxyAddr changes: 5+ minutes → ~seconds (>100x faster)
- ManagementClusterName changes: 5+ minutes → ~seconds (>100x faster)
- TeleportVersion changes: Next restart → ~seconds (near-instant)

Files added:
- internal/controller/config_controller.go - Main implementation
- internal/controller/config_controller_test.go - Unit tests

Files modified:
- main.go - Added ConfigMap controller registration

ssyno requested a review from a team as a code owner

September 3, 2025 12:59


          Merge branch 'main' into feat/configmap-watching-for-faster-token-rei…

06d7170

…ssuance

stone-z reviewed

View reviewed changes

internal/controller/config_controller.go Outdated

Comment on lines +64 to +65

		log.Info("ConfigMap deleted, but we continue with cached config")
		return ctrl.Result{}, nil

Contributor

stone-z Sep 8, 2025

This controller does not currently continue with cached config. Why should the controller cache any config?

Collaborator Author

ssyno Sep 9, 2025

thats right, the log message is misleading since we don't actually cache config in this controller. I'll update it.

internal/controller/config_controller.go Outdated

Comment on lines +56 to +59

+              	// Only process the teleport-operator ConfigMap
+              	if req.Name != key.TeleportOperatorConfigName {
+              		return ctrl.Result{}, nil
+              	}

Contributor

stone-z Sep 8, 2025

This is theoretically handled by the predicate function, right?

Collaborator Author

ssyno Sep 9, 2025

removing it.

internal/controller/config_controller.go Outdated

Comment on lines +127 to +132

+              	var changes []ConfigChange
+              	if oldConfig == nil {
+              		// First time seeing config, no changes to process
+              		return changes
+              	}

Contributor

stone-z Sep 8, 2025

IMO the logic should differentiate between "nothing changed between old and new" and "there was no old config". Wouldn't everything in a new config be a change if there was no old config?

Collaborator Author

ssyno Sep 9, 2025

during startup when there's no old config, do we want to treat the initial configuration as changes that trigger reconciliation actions. The system is initializing and all components will naturally use the new config through the normal startup flow.

internal/controller/config_controller.go Outdated

Comment on lines +235 to +242

+              	// Log all changes for audit purposes
+              	for _, change := range changes {
+              		log.Info("Configuration change processed",
+              			"field", change.Field,
+              			"oldValue", change.OldValue,
+              			"newValue", change.NewValue,
+              			"impact", r.impactString(change.Impact))
+              	}

Contributor

stone-z Sep 8, 2025

This audit is lost if any of the previous steps error out

Collaborator Author

ssyno Sep 9, 2025

I will move it to happen immediately after change detection

internal/controller/config_controller.go Outdated

+              }
+              // handleConfigChanges processes detected configuration changes
+              func (r *ConfigReconciler) handleConfigChanges(ctx context.Context, log logr.Logger, changes []ConfigChange) error {

Contributor

stone-z Sep 8, 2025

AFAICT the only difference in any of the code paths from here are log lines and clearing the local object teleport identity.

Why is this whole impact system necessary if the outcome is always the same?

Collaborator Author

ssyno Sep 9, 2025

I am going to refactor this one

internal/controller/config_controller.go Outdated

+              		}
+              		// Add annotation to trigger reconciliation
+              		cluster.Annotations["teleport-operator.giantswarm.io/config-updated"] = timestamp

Contributor

stone-z Sep 8, 2025

The annotation string should be a const.

What were your thoughts on using a timestamp vs a config hash?

Collaborator Author

ssyno Sep 9, 2025

am going to move this one to the key package

Collaborator Author

ssyno Sep 9, 2025

timestamp works well since we've already filtered for meaningful changes in detectConfigChanges(). If we find that identical configs are causing issues, we could switch to a config hash

internal/controller/config_controller.go Outdated

+              	return ctrl.NewControllerManagedBy(mgr).
+              		For(&corev1.ConfigMap{}).
+              		WithOptions(controller.Options{
+              			MaxConcurrentReconciles: 1, // Process config changes sequentially

Contributor

stone-z Sep 8, 2025

Why does sequence / concurrency matter for this controller?

Collaborator Author

ssyno Sep 9, 2025

config changes are rare events so they have predictable behavior without significant performance impact using sequential processing

Contributor

stone-z Sep 9, 2025

I was confused because 1 is already the default for that setting, so by explicitly setting it I thought you might have already identified a race condition or something that limits concurrency. If that's not the case, I don't think you really need to set the value

Collaborator Author

ssyno Sep 9, 2025

yeah we can get rid of it, since 1 is already the default for MaxConcurrentReconciles


          updates

02a3f19

Contributor

stone-z commented Sep 9, 2025

To generalize my feedback:

The purpose of this controller is to force the existing cluster controller to re-reconcile Cluster CRs when a single ConfigMap changes.
So, the minimum functionality for this controller IIUC would be to 1. ignore CM deletions, and otherwise: 2. apply any "new" value as an annotation to the Cluster CRs (hash, timestamp, random number, etc.). The contents of the CM don't matter to this controller.

So, is the other logic necessary? Are there more use cases I missed that you're trying to support?

ssyno and others added 8 commits

September 9, 2025 15:37


          Merge branch 'main' into feat/configmap-watching-for-faster-token-rei…

32b1150

…ssuance


          remove MaxConcurrentReconciles

e2bf96b


          simplify config controller

fab7111


          fix test

1cda127

fmt

70bfedf


          fix main

bc1a924


          Merge branch 'main' into feat/configmap-watching-for-faster-token-rei…

…ssuance


          Merge branch 'main' into feat/configmap-watching-for-faster-token-rei…

3ba052d

…ssuance

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet