feat: add durable backup history #11

Merged: steipete merged 2 commits into main from codex/crawlkit-backup-migration on Jun 19, 2026

@steipete

Contributor

Summary

  • add named backup tags, snapshot history, and backup pull --ref restores without changing the backup checkout
  • move encrypted snapshot IO and durable Git synchronization to CrawlKit
  • adopt the shared contact-export contract and safe FTS5 term builder
  • update CrawlKit, SQLite, gotd/td, and all current Go dependencies

The archive schema, backup manifest JSON shape, current restore behavior, and existing CLI JSON contracts remain unchanged.
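For readers unfamiliar with the "safe FTS5 term builder" item, here is a minimal sketch of what such a helper might do. It assumes the builder quotes each user-supplied term as an FTS5 string (wrapped in double quotes, with embedded quotes doubled) so that input such as OR or a trailing * is matched as literal text rather than parsed as query syntax. The names quoteFTS5Term and buildFTS5Query are hypothetical and are not the CrawlKit API.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteFTS5Term wraps a term in double quotes and doubles any embedded
// double quotes, which is how FTS5 string literals are escaped.
func quoteFTS5Term(term string) string {
	return `"` + strings.ReplaceAll(term, `"`, `""`) + `"`
}

// buildFTS5Query (hypothetical name) splits raw input on whitespace and
// joins the quoted terms with spaces, which FTS5 treats as an implicit AND.
func buildFTS5Query(raw string) string {
	fields := strings.Fields(raw)
	quoted := make([]string, 0, len(fields))
	for _, f := range fields {
		quoted = append(quoted, quoteFTS5Term(f))
	}
	return strings.Join(quoted, " ")
}

func main() {
	// Operators and wildcards in the input become plain quoted terms.
	fmt.Println(buildFTS5Query(`hello OR foo*`))
}
```

The resulting string would typically be passed as a bound parameter to a MATCH expression rather than concatenated into SQL.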

Durability

  • preserves legacy backup branch names and local-only tags across failed-push rebases
  • keeps recipient-rotation backups idempotent
  • preserves zero count keys and idempotence for empty archives
  • keeps first-run --no-push usable when the configured remote does not yet exist

Proof

  • GOWORK=off go test -count=1 ./...
  • GOWORK=off go test -race -count=1 ./...
  • GOWORK=off go vet ./...
  • ./scripts/coverage.sh 35.0 (56.1%)
  • golangci-lint, staticcheck, deadcode, gofumpt, and diff checks clean
  • autoreview: no accepted/actionable findings

No release or tag requested.

@clawsweeper

clawsweeper Bot commented Jun 19, 2026

Codex review: needs real behavior proof before merge. Reviewed June 19, 2026, 6:57 AM ET / 10:57 UTC.

Summary
The branch adds backup snapshot history, named backup tags, backup pull --ref, CrawlKit-backed backup mechanics, shared contact-export/FTS helpers, tests, docs, and Go dependency updates.

Reproducibility: not applicable; this is a feature PR rather than a current-main bug report. The changelog finding is source-reproducible from the PR diff.

Review metrics: 3 noteworthy metrics.

  • Diff surface: 14 files changed, +573/-462. The PR mixes feature code, tests, docs, changelog, dependency updates, and a small Telegram API compatibility adjustment.
  • Backup package scope: 5 backup files changed, 1 backup file added. Most of the user-facing behavior depends on the backup package migration and new history path.
  • Dependency surface: go.mod/go.sum updated, CrawlKit v0.7.0 -> v0.12.3 pseudo-version. The feature relies on newly pinned shared helpers, which broadens compatibility review beyond local telecrawl code.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
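The "weaker of the two" rule amounts to taking a minimum over an ordered scale. The sketch below assumes the ranks are ordered as listed in the legend further down (challenger crab strongest, unranked krab weakest) and leaves out the "off-meta tidepool" case; the type and function names are illustrative, not ClawSweeper's code.

```go
package main

import "fmt"

// Rank orders readiness tiers from weakest to strongest (assumed ordering).
type Rank int

const (
	UnrankedKrab Rank = iota
	SilverShellfish
	GoldShrimp
	PlatinumHermit
	DiamondLobster
	ChallengerCrab
)

var rankNames = [...]string{
	"unranked krab", "silver shellfish", "gold shrimp",
	"platinum hermit", "diamond lobster", "challenger crab",
}

func (r Rank) String() string { return rankNames[r] }

// overall returns the weaker of the proof and patch-quality ratings.
func overall(proof, patch Rank) Rank {
	if proof < patch {
		return proof
	}
	return patch
}

func main() {
	// Proof is silver shellfish and patch quality is platinum hermit,
	// so the overall rating is capped at silver shellfish.
	fmt.Println(overall(SilverShellfish, PlatinumHermit))
}
```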

Rank-up moves:

  • [P1] Add redacted real behavior proof for backup push --tag, backup snapshots, and backup pull --ref; update the PR body so ClawSweeper re-reviews, or ask a maintainer for @clawsweeper re-review if it does not trigger.
  • Remove the CHANGELOG.md hunk from the branch.

Proof guidance:

  • [P1] Needs real behavior proof before merge: Only tests and static checks are listed; please add redacted terminal output, logs, screenshots, or a recording from real backup push/tag, snapshots, and pull --ref runs before merge. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] The PR migrates encrypted snapshot IO and Git synchronization from telecrawl-local code to a newer CrawlKit dependency, so existing backup-repo compatibility needs upgrade proof beyond unit tests.
  • [P1] The contributor has only listed tests and static checks; real behavior proof from a redacted backup tag, snapshot listing, and historical restore run is still missing.

Maintainer options:

  1. Prove backup upgrade compatibility (recommended)
    Require redacted real-run proof or focused upgrade evidence that a current-main backup repo can still push, tag, list snapshots, and restore current plus historical refs after the CrawlKit migration.
  2. Accept the shared-helper migration risk
    Maintainers can intentionally land the CrawlKit migration as-is if they are comfortable owning any backup compatibility regression that tests did not catch.

Next step before merge

  • [P1] The remaining blockers are contributor-provided real behavior proof and maintainer comfort with the backup compatibility migration, so this should stay on human review rather than an automated repair lane.

Security
Cleared: No concrete security or supply-chain regression was found in the diff; the dependency migration is tracked as compatibility risk instead.

Review findings

  • [P3] Remove the release-owned changelog hunk — CHANGELOG.md:9-13

Review details

Best possible solution:

Land the backup history implementation after removing the CHANGELOG.md hunk, preserving existing backup contracts, and adding redacted real-run proof for push/tag, snapshots, and pull --ref upgrade behavior.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is a feature PR rather than a current-main bug report. The changelog finding is source-reproducible from the PR diff.

Is this the best way to solve the issue?

Mostly yes: the implementation path is coherent and covered by focused tests, but it is not merge-ready until the release-owned changelog hunk is removed and real backup-run proof is added.

Full review comments:

  • [P3] Remove the release-owned changelog hunk — CHANGELOG.md:9-13
    OpenClaw keeps CHANGELOG.md release-owned for normal PRs, and this PR already carries the release-note context in its body. Please drop this changelog hunk so release notes can be assembled by the release process.
    Confidence: 0.9

Overall correctness: patch is correct
Overall confidence: 0.78

AGENTS.md: not found in the target repository.

Codex review notes: model internal, reasoning high; reviewed against c0b341495848.

Label changes

Label changes:

  • add P2: This is a normal-priority backup durability feature with focused but compatibility-sensitive CLI and storage impact.
  • add merge-risk: 🚨 compatibility: Moving encrypted snapshot and Git synchronization behavior into CrawlKit could break existing backup repositories if the helper contracts differ from current telecrawl behavior.
  • add rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: Only tests and static checks are listed; please add redacted terminal output, logs, screenshots, or a recording from real backup push/tag, snapshots, and pull --ref runs before merge. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Evidence reviewed

What I checked:

  • Target AGENTS check: No AGENTS.md exists inside the telecrawl target checkout; the only AGENTS.md found was in the parent ClawSweeper workspace and was not target-repository policy.
  • Current main lacks the requested backup history surface: Current main only routes backup subcommands init, push, pull, and status; there is no snapshots subcommand or ref/tag history flow on the checked-out default branch. (internal/cli/cli.go:658, c0b341495848)
  • PR adds the new CLI flags and subcommand: The PR branch adds snapshots routing plus --ref, --tag, and --limit backup flags. (internal/cli/cli.go:654, cea247977b0f)
  • PR includes historical restore coverage: The PR adds TestHistoricalSnapshotRestore, covering tag creation, snapshot listing, non-mutating historical pull, and immutable tag behavior. (internal/backup/backup_test.go:171, cea247977b0f)
  • Shared CrawlKit history dependency inspected: The new telecrawl history path delegates snapshot history and ref reads to CrawlKit backup/mirror helpers pinned by the PR dependency update. (github.com/openclaw/crawlkit/backup/history.go:21, 6e14735bb248)
  • Release-owned changelog hunk: The PR adds unreleased changelog entries even though OpenClaw PR review policy keeps CHANGELOG.md release-owned and asks for release context in the PR body or commits instead. (CHANGELOG.md:9, cea247977b0f)

Likely related people:

  • steipete: Current main's encrypted backup implementation and docs were introduced by Peter Steinberger in the v0.2.0 release commit, which is the central behavior this PR extends. (role: feature owner; confidence: high; commits: 49930fd7e801, 07ae062edf7a; files: internal/backup/backup.go, internal/backup/git.go, internal/backup/config.go)
  • joshp123: Recent merged work by joshp123 changed the CLI, contact export, store search, dependencies, and native Telegram implementation areas also touched by this PR. (role: recent adjacent contributor; confidence: medium; commits: 01eeecc55fc0, 9df81dcf3183, 9ef3b111c008; files: internal/cli/cli.go, internal/store/store.go, internal/telegramdesktop/tdata.go)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added the following labels on Jun 19, 2026:

  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • P2: Normal priority bug or improvement with limited blast radius.
  • merge-risk: 🚨 compatibility: 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cea247977b

Comment thread internal/backup/backup.go
Comment on lines +111 to +113
    tag, err := tagSnapshot(ctx, cfg, opts.Tag)
    if err != nil {
        return Result{}, err

[P2] Preflight tag conflicts before committing snapshots

When the requested tag already exists on a different commit and this push produces a new snapshot, the immutable-tag error is discovered only here, after writeSnapshot and commitAndPush have already advanced the backup repo's HEAD. The command returns an error but leaves that untagged commit as the current backup, so a later unqualified backup pull or backup snapshots can use a snapshot from a failed backup push --tag .... Check for an existing conflicting tag before writing/committing, or roll back the local commit on this error path.

steipete merged commit e07a15d into main on Jun 19, 2026 (9 checks passed).
steipete deleted the codex/crawlkit-backup-migration branch on June 19, 2026 at 11:00.
@steipete

Contributor Author

Landed as e07a15d9b42f7355ef7a699d3215d2ec0df0c819.

Proof before merge:

  • GOWORK=off go test -count=1 ./...
  • GOWORK=off go test -race -count=1 ./...
  • GOWORK=off go vet ./...
  • ./scripts/coverage.sh 35.0 (56.1%)
  • golangci-lint, staticcheck, deadcode, gofumpt, and diff checks clean
  • autoreview: no accepted/actionable findings
  • CI, Docker, CodeQL, dependency, secret, and release-snapshot checks green

Post-merge checkout: main, clean and synchronized with origin/main. No release or tag created.
