Skip to content

feat(config, images): Auto-clean stale images, introduce configs#133

Merged
nfebe merged 3 commits into
mainfrom
feat/auto-cleanup-stale-images
May 25, 2026
Merged

feat(config, images): Auto-clean stale images, introduce configs#133
nfebe merged 3 commits into
mainfrom
feat/auto-cleanup-stale-images

Conversation

@nfebe
Copy link
Copy Markdown
Contributor

@nfebe nfebe commented May 25, 2026

After a successful pull (whether via the standalone pull endpoint or as part of a deploy with pull), the agent now removes images that this deployment previously used but no longer references. An image is only removed if no container references it and no other deployment's compose references it either, so cleanup never deletes anything still in use locally. The default is on; pass cleanup=false on the request to skip.

Two new endpoints make the cleanup callable on demand: one scoped to a single deployment and one for host-wide pruning of dangling layers. A dry_run flag returns the would-be removals without deleting anything.

Closes #124

@sourceant
Copy link
Copy Markdown

sourceant Bot commented May 25, 2026

Code Review Summary

This PR introduces a robust system for cleaning up stale Docker images after deployments and pulls. It also adds a generic configuration management API that allows updating agent settings at runtime with persistence.

🚀 Key Improvements

  • Implemented safe image cleanup that respects container usage and other deployments.
  • Added reflection-based configuration API with support for sensitive fields.
  • Introduced configurable timeouts for Docker operations.
  • Improved image reference parsing using official Docker distribution libraries.

💡 Minor Suggestions

  • Normalize image references from container list to improve matching reliability.
  • Improve error handling consistency when configuration persistence fails.

🚨 Critical Issues

  • Reflection-based field resolution could panic if nested pointers in the config struct are nil.

Copy link
Copy Markdown

@sourceant sourceant Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review complete. See the overview comment for a summary.

Comment thread internal/docker/cleanup.go Outdated
Comment thread internal/docker/cleanup.go
Copy link
Copy Markdown

@sourceant sourceant Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review complete. See the overview comment for a summary.

Comment thread internal/docker/cleanup.go
Comment thread internal/docker/cleanup.go Outdated
nfebe added 2 commits May 25, 2026 21:22
After a successful pull (whether via the standalone pull endpoint or as
part of a deploy with pull), the agent now removes images that this
deployment previously used but no longer references. An image is only
removed if no container references it and no other deployment's compose
references it either, so cleanup never deletes anything still in use
locally. The default is on; pass cleanup=false on the request to skip.

Two new endpoints make the cleanup callable on demand: one scoped to a
single deployment and one for host-wide pruning of dangling layers. A
dry_run flag returns the would-be removals without deleting anything.

Closes #124
Clients can now list, fetch, and update the agent's YAML configuration
through dotted keys instead of one bespoke endpoint per setting. Sensitive
keys are masked, string inputs are coerced to the field's real type, and
runtime-applicable keys take effect immediately. The new endpoints sit
behind dedicated config permissions granted only to admins, so the
discovery surface is admin-only by default.

A top-level default timeout setting is the fallback for any block that
does not specify its own, used today by the image cleanup loop. The
legacy /settings endpoints are unchanged.
@nfebe nfebe force-pushed the feat/auto-cleanup-stale-images branch from 14c1ce1 to 262c8aa Compare May 25, 2026 20:22
Copy link
Copy Markdown

@sourceant sourceant Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review complete. See the overview comment for a summary.

Comment thread internal/docker/cleanup.go
Comment thread pkg/config/registry.go
@nfebe nfebe changed the title feat(images): Auto-clean stale images after a deployment pulls feat(config, images): Auto-clean stale images, introduce configs May 25, 2026
…hash tags

Image refs are now parsed with the official Docker reference package, so
edge cases like registries with ports and digest references are handled
the same way Docker itself handles them. The agent and the daemon now
normalize refs identically, which removes a class of false negatives in
the cross-deployment match.

The auto-cleanup is also gated by tag shape: an image is only removed if
every tag on it looks like a content hash (sha-* prefixes, git shas,
sha256 digests). Floating labels (latest, edge, stable, semver) are
preserved even when otherwise eligible, since they often still mean
something to a human operator.
Copy link
Copy Markdown

@sourceant sourceant Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review complete. See the overview comment for a summary.

return
}

if s.configPath != "" {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code currently returns a 500 error if the configuration is updated in memory but fails to persist to disk. Since the in-memory state and the file are now out of sync, it is safer to return the error and avoid proceeding to apply the configuration to the runtime.

Suggested change
if s.configPath != "" {
if s.configPath != "" {
if err := config.Save(s.config, s.configPath); err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to persist configuration: " + err.Error()})
return
}
}

Comment on lines +218 to +219
}
if c.Image != "" {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When collecting image references from containers, using the Image field (which can be a tag or a name) is often redundant if ImageID is present. However, to ensure perfect matching against RepoTags, it's better to specifically check for and include the full tag reference if it's available in the container list to avoid missing matches during selectStaleImages.

Suggested change
}
if c.Image != "" {
if c.Image != "" {
out[normalizeRef(c.Image)] = true
}

Comment thread pkg/config/registry.go
Comment on lines +146 to +147
}
return resolveField(v.Field(i), f.Type, parts[1])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When resolving nested struct fields via reflection, if an intermediate pointer is nil, the code will panic when calling v.Field(i) or f.Type recursively. Add a check to initialize pointers if they are nil during resolution.

Suggested change
}
return resolveField(v.Field(i), f.Type, parts[1])
if v.Field(i).Kind() == reflect.Ptr && v.Field(i).IsNil() {
v.Field(i).Set(reflect.New(v.Field(i).Type().Elem()))
}
return resolveField(v.Field(i), f.Type, parts[1])

@nfebe nfebe merged commit 74dfbd1 into main May 25, 2026
5 checks passed
@nfebe nfebe deleted the feat/auto-cleanup-stale-images branch May 25, 2026 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

enh: Delete stale unused images when new ones are pulled

1 participant