release: develop → main (v0.7.0 — mirror scheduler + last-run status)#479

Merged
chronoai-shining merged 8 commits into main from develop
May 13, 2026
@chronoai-shining

Collaborator

Release cut promoting develop → main. After merge, .github/workflows/changeset-release.yml will:

  1. Open a release/v<next> PR that consumes the 6 pending changesets and bumps both packages to v0.7.0 (highest pending bump = minor).
  2. On that PR's merge, tag v0.7.0 and create a GitHub Release using .github/release-notes-20260513.md as the body.
  3. Open sync/post-release-v0.7.0 → develop, which the workflow auto-merges via the GitHub API as a merge commit (preserves the merge-base, prevents phantom-bump drift).

What ships

New Feature

  • Mirror reconcile now runs in-process; no external Kubernetes CronJob.
  • Mirror schedule editable from admin settings — presets or custom cron, Singapore time, defaults to daily 02:00 SGT.
  • "Last run" status on mirror settings page and dashboard: succeeded, failed (with error message), running, or never.

Changed

  • Legacy mirror dashboard's "Last reconcile" tile is now consistent across pods and survives restarts (reads from persisted scheduler state).
  • Mirror configuration unified under SettingsService — one-shot boot migration copies any legacy platform_settings.githubMirror values into the new section. The legacy field is gone from types/repo/service.

Fixed

  • Release machinery: handle release bodies that exceed GitHub's 125 000-char cap, restructure to template + per-release dated notes files, gate PRs targeting main on a curated notes file being present.

Issues closed by this release

(#430 / #432 / #433 / #435 already shipped + closed earlier as the release-machinery changesets.)

Test plan

chore: sync main → develop after v0.6.0 (must merge-commit, not squash)
…0-char cap (#430)

* fix(ci): release-body fallback when CHANGELOG section exceeds GitHub's 125 000-char cap (#429)

v0.6.0 (87 consumed changesets) tripped GitHub's hard 125 000-char
limit on release-note bodies — `gh release create` returned
`HTTP 422: Validation Failed / body is too long`. Recovery required
manual GH-Release creation + a manual sync PR (#428).

The 125 000 cap is a GitHub platform constraint, not a repo setting;
no flag raises it. Fix on our side: detect when the combined inline
CHANGELOG body would exceed the cap and fall back to a short body
that links to the in-repo CHANGELOG files on the tag. The full notes
still live in the repo (as they always did), just rendered there
instead of duplicated into the release-create call.

Threshold: 120 000 chars (5 000-char headroom under the platform cap
to account for the small framing text wrapped around the notes).
Small releases keep the nice inline notes; only outliers like v0.6.0
hit the fallback.
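
A minimal sketch of the threshold check described above, not the workflow's actual script. The function name, the fallback wording, and the `CHANGELOG.md` path in the usage are assumptions; only the 120 000-char threshold comes from the commit message. JavaScript's `length` counts UTF-16 code units, which may differ slightly from how GitHub counts characters.

```ts
// Assumed constant name; the value is the threshold stated in the commit message
// (5 000 chars of headroom under GitHub's 125 000-char cap).
const INLINE_BODY_THRESHOLD = 120000;

// Returns the inline notes when they fit, otherwise a short body that
// links to the in-repo CHANGELOG files on the tag.
function chooseReleaseBody(inlineNotes: string, tag: string, changelogPaths: string[]): string {
  if (inlineNotes.length <= INLINE_BODY_THRESHOLD) {
    return inlineNotes;
  }
  const links = changelogPaths.map((path) => `- ${path} (at ${tag})`).join("\n");
  return `Release notes for ${tag} are too long to inline. See:\n${links}`;
}
```

Treating a body of exactly 120 000 chars as still inline is a choice made here; the commit message does not say which side of the boundary it falls on.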

* docs: changeset for #429
#432)

* feat(ci): release body from curated `.github/release-notes-next.md` (#431)

Replaces the raw-CHANGELOG-dump release body that blew past GitHub's
125 000-char API limit on v0.6.0 (#429) with a maintainer-curated,
user-facing three-section summary read from
`.github/release-notes-next.md`.

## How it works

  1. Before the next release, the maintainer (or their local Claude)
     edits `.github/release-notes-next.md` into three sections —
     Fixed / New Feature / Changed — with brief one-liners. Technical-
     only work is clustered into a single trailing bullet per section
     so engineer-speak doesn't bleed into the user-facing release page.

  2. `changeset-release.yml` State B reads the file at release time.
     If the file exists and the `(write here)` placeholder string is
     gone, the workflow uses the file's prose (HTML comment block
     stripped) as the release body, with a CHANGELOG-links footer
     appended.

  3. If the file is missing OR still contains the placeholder, the
     workflow falls back to a short link-to-CHANGELOG body. Release
     still publishes; the body just isn't custom.

  4. Final length guard: even a curated body is capped at 120 000
     chars (5 000-char headroom under the platform limit). Pasting a
     novel into the file falls back to the short body too.
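
The four steps above could be sketched roughly as follows. This is an illustration under assumptions, not the code in `changeset-release.yml`: the function and parameter names are hypothetical, and the HTML comment block is stripped with a simple non-greedy regex.

```ts
const PLACEHOLDER_TEXT = "(write here)";
const CURATED_BODY_LIMIT = 120000; // same headroom as the fallback threshold

// fileContent is null when the curated file does not exist.
function resolveCuratedBody(fileContent: string | null, footer: string, fallbackBody: string): string {
  // Missing file or untouched template: use the short link-to-CHANGELOG body.
  if (fileContent === null || fileContent.includes(PLACEHOLDER_TEXT)) {
    return fallbackBody;
  }
  // Drop the author-facing HTML comment block, keep the prose.
  const prose = fileContent.replace(/<!--[\s\S]*?-->/g, "").trim();
  const body = `${prose}\n\n${footer}`;
  // Final length guard: an oversized curated body also falls back.
  return body.length > CURATED_BODY_LIMIT ? fallbackBody : body;
}
```

The placeholder check runs on the raw file, so it assumes the instructions inside the comment block never contain the literal placeholder string themselves.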

## Why not call an LLM at release time

Considered. Means an API key as a repo secret + network dependency in
the critical-path State B step + cost per release. The in-repo file
approach has zero external dependencies and gives the author final
review of the published text. Authors can still ask their local
Claude to write the file — that step is just outside the workflow,
which is the right boundary.

## Initial template

`.github/release-notes-next.md` ships with a placeholder template and
an HTML comment block targeted at the author (or their Claude) with
the formatting rules. Update once per release.

* docs: changeset for #431
) (#434)

The release workflow added in #431 falls back to a short link-to-
CHANGELOG body when the file is missing or still has the
(write here) placeholder. The fallback is fine as a safety net but
wrong as a default — by the time the workflow runs (post-merge), the
release is already published with the generic body and there's no
easy way to retro-add curated notes.

This new workflow gates every PR targeting main on the file being
properly filled in. Three structural checks:

  1. File exists.
  2. Headings ## Fixed, ## New Feature, ## Changed are all present.
  3. No (write here) placeholder anywhere in the file.

If any of the three fails, the workflow emits an ::error::
annotation pointing at the file and exits non-zero.
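
As a rough sketch of the three checks (the real gate is a workflow file whose contents are not shown here), the logic might look like this. The function name and message wording are assumptions; the `::error file=...::` prefix is the standard GitHub Actions workflow-command form.

```ts
const REQUIRED_HEADINGS = ["## Fixed", "## New Feature", "## Changed"];

// Returns one ::error:: annotation per failed check; an empty array means the gate passes.
// content is null when the file does not exist.
function checkReleaseNotesFile(path: string, content: string | null): string[] {
  if (content === null) {
    return [`::error file=${path}::release notes file is missing`];
  }
  const errors: string[] = [];
  const lines = content.split(/\r?\n/).map((line) => line.replace(/\s+$/, ""));
  for (const heading of REQUIRED_HEADINGS) {
    if (!lines.includes(heading)) {
      errors.push(`::error file=${path}::missing heading "${heading}"`);
    }
  }
  if (content.includes("(write here)")) {
    errors.push(`::error file=${path}::placeholder "(write here)" is still present`);
  }
  return errors;
}
```

A caller would print each annotation and exit non-zero when the array is non-empty.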

Scope: PRs with base=main. develop→main is the primary case;
release/v*→main passes automatically because the file content came
in via the previous develop→main merge that was already gated.

Required-status check binding still needs to happen in repo Settings
(workflow file alone isn't load-bearing until added to main branch
protection's required-checks list).
…ted files (#435) (#436)

Replaces the single mutable `.github/release-notes-next.md` with a
two-file model:

  - `.github/release-notes-template.md` — immutable template. Stays
    in the repo forever; holds the format and instructions. Never
    edited.
  - `.github/release-notes-<yyyymmdd>.md` — one per release. Copied
    from the template, filled in by the maintainer before opening
    the develop → main PR, retained in the repo after release as a
    historical record of that release's user-facing notes.

This gives us a git-tracked archive of every release's user-facing
notes (previously only on the GitHub Releases page) and removes the
"file always shows last release's content" awkwardness of the
single-file model.

## Workflow updates

  - `.github/workflows/changeset-release.yml` State B finds the most
    recent file matching `release-notes-[0-9]{8}.md` (descending
    sort, template excluded by pattern). Reads that file, strips
    the HTML comment block, appends the CHANGELOG-links footer,
    uses as GH Release body. Falls back to short link body if no
    dated file exists or the `(write here)` placeholder is still
    in place.

  - `.github/workflows/check-release-notes.yml` (gate on PRs to main)
    validates the most recent dated file. Fails on missing template,
    no dated file present, filename pattern mismatch, missing any of
    `## Fixed` / `## New Feature` / `## Changed`, or `(write here)`
    placeholder still present.
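
The "most recent dated file" lookup used by both workflows can be illustrated with a small sketch. The function name is hypothetical and the workflows themselves likely do this in shell; the point is that `yyyymmdd` names sort lexicographically in date order and the template never matches the pattern.

```ts
const DATED_NOTES_PATTERN = /^release-notes-[0-9]{8}\.md$/;

// Given the file names in .github/, returns the newest dated notes file, or null if none exists.
function pickLatestDatedNotes(fileNames: string[]): string | null {
  const dated = fileNames.filter((name) => DATED_NOTES_PATTERN.test(name));
  // Eight-digit dates compare correctly as plain strings; reverse for descending order.
  dated.sort().reverse();
  return dated[0] ?? null;
}
```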

## Docs

CLAUDE.md release flow gains "Step 0 — curate release notes" before
"Step 1 — promote develop → main". CONTRIBUTING.md's "Release notes —
maintainer task per release" section updated for the two-file model.

## Out of scope

Historical changesets (in `.changeset/*.md`) that mention the old
`release-notes-next.md` filename are left as-is — they're accurate
records of what shipped at the time.
…adence (#455)

* refactor(api): re-point MirrorService + mirror routes to SettingsService.getMirror() (#437)

MirrorService and the `/github/repo` + `/admin/mirror/status` routes
previously read mirror config from the legacy `PlatformSettingsService`
(`platform_settings.githubMirror` field). The new multi-section
`SettingsService` (`settings.mirror` doc) is the migration target —
already edited by the admin settings page MirrorSection, but until
now nothing load-bearing consumed it.

This commit re-points MirrorService at `settingsService.getMirror()`
and rewires POST /github/repo to write through
`settingsService.putSection("mirror", ...)`. The legacy
`PlatformSettings.githubMirror` field is left in place for the boot
migration in the next commit to read on first deploy; subsequent
commits drop the legacy field entirely.

The `MirrorSection` shape in SettingsService is a strict superset of
the old `GithubMirrorConfig` shape (same field names), so the
behaviour change is purely "which doc do we read." No HTTP contract
change, no breaking change for the legacy `/admin/mirror` UI.

scripts/reconcile-mirror.ts updated in lockstep so the manual
debugging shim still composes against the new dep graph (it was
already broken against the post-#270 `SkillService` shape — fixed
that drive-by since the file gets rewritten anyway).

Refs #437.

* feat(api): boot migration — copy legacy platform_settings.githubMirror into settings.mirror (#437)

One-shot, idempotent migration that runs once per pod boot in
`bootstrap.ts`. Reads the legacy `_id: "ornn"` doc's `githubMirror`
field and writes it to the new `_id: "mirror"` per-section doc that
`SettingsService.getMirror()` (now the only reader, since commit 1)
consumes.

Idempotency contract:
  • New `_id: "mirror"` doc already exists → skip (treat as
    authoritative; never overwrite operator's choice).
  • Legacy doc absent / lacks `githubMirror` → skip cleanly.
  • Running twice with no input change → second run touches nothing
    (covered by test).
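
The idempotency contract can be modelled with in-memory maps standing in for the two Mongo collections. This is a sketch of the contract only; the type and function names are assumptions and the real migration in `bootstrap.ts` talks to the database.

```ts
type SettingsDoc = Record<string, unknown>;
type MigrationOutcome = "migrated" | "skipped_target_exists" | "skipped_no_legacy";

// legacyDocs models platform_settings keyed by _id; sectionDocs models the per-section docs.
function migrateLegacyMirror(
  legacyDocs: Map<string, SettingsDoc>,
  sectionDocs: Map<string, SettingsDoc>,
): MigrationOutcome {
  // An existing "mirror" doc is authoritative and is never overwritten.
  if (sectionDocs.has("mirror")) {
    return "skipped_target_exists";
  }
  const legacyMirror = legacyDocs.get("ornn")?.["githubMirror"];
  if (legacyMirror === undefined || legacyMirror === null || typeof legacyMirror !== "object") {
    return "skipped_no_legacy";
  }
  // Fields are copied as-is, so an encrypted value stays the same ciphertext.
  sectionDocs.set("mirror", { ...(legacyMirror as SettingsDoc) });
  return "migrated";
}
```

Running it a second time hits the first branch, which is what makes the boot-time call safe to repeat on every pod start.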

Crypto: the `appPrivateKey` field is AES-256-GCM ciphertext on both
sides, derived from the same `ENCRYPTION_KEY`. Migration copies the
ciphertext byte-for-byte (no decrypt/re-encrypt roundtrip — same
plaintext, different IV would just be churn). `SettingsServiceImpl`
already degrades a corrupted ciphertext to empty + a loud log, so
a key rotation that happened mid-deploy doesn't crash the boot path.

Failure is logged + non-fatal — operators can always re-save mirror
config through the admin UI after boot.

Refs #437.

* chore(api): drop githubMirror from legacy PlatformSettings (#437)

Now that MirrorService reads from `SettingsService.getMirror()` and
the boot migration in the previous commit moved any existing legacy
values into the new section, the `githubMirror` field on
`PlatformSettings` is dead code. Drop it:

  • Remove `GithubMirrorConfig` type, the `githubMirror` field on
    `PlatformSettings`, and the entry in `DEFAULT_PLATFORM_SETTINGS`.
  • Remove `PlatformSettingsRepository` read/write paths for the field.
  • Remove `PlatformSettingsService.get`'s decrypt/encrypt branch and
    the `getGithubMirrorConfig()` accessor.
  • Trim `maskSensitiveSettings` so it no longer touches a field that
    isn't there.

`PlatformSettingsService` still owns `auditWaiverThreshold` and the
legacy `llmProvider` override (those have their own migration paths
into the new settings model — out of scope here).

DB-side: the existing `platform_settings:{_id:"ornn"}.githubMirror`
sub-document is left in place after this PR — the migration in the
previous commit already extracted it, and dropping the embedded field
on an existing doc would require a separate cleanup migration. It
costs nothing to leave (no reader). A follow-up can `$unset` it in a
later release once we're confident every cluster has rolled past
this version.

Refs #437.

* feat(api): add reconcileSchedule field to mirror section schema (#437)

Adds `reconcileSchedule: string` to `MirrorSection` so the in-process
scheduler (next commit) can read its cadence from admin-editable
settings instead of a redeploy-locked manifest.

Semantics:
  • Default: `"0 2 * * *"` — daily at 02:00 Singapore time.
  • Empty string disables the scheduled reconcile entirely (publish-
    time webhooks still fire — this is independent of the master
    `enabled` kill switch).
  • Cron syntax is validated by `cron-parser` (new dep, 5.5.0) at
    settings-write time. Invalid expressions reject with a 400 from
    the standard SettingsService Zod path.
  • Timezone is NOT stored in the schema — it's pinned to
    `Asia/Singapore` by the scheduler itself (UTC+8, no DST). See
    the mirror.ts comment block.
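
To make the pinned-timezone default concrete: because Singapore is a fixed UTC+8 with no DST, the default `0 2 * * *` always fires at 18:00 UTC on the previous calendar day. The helper below is a hand-rolled illustration for a daily hour/minute schedule only; it is not `cron-parser` and not the scheduler's code, and its name is an assumption.

```ts
const SGT_OFFSET_MS = 8 * 60 * 60 * 1000; // UTC+8, no DST
const DAY_MS = 24 * 60 * 60 * 1000;

// Next instant (as a real UTC Date) at which a daily hh:mm Singapore-time schedule fires.
function nextDailySgtRun(now: Date, hourSgt: number, minuteSgt: number): Date {
  // Shift the clock so that the UTC getters read as Singapore wall time.
  const sgtNow = new Date(now.getTime() + SGT_OFFSET_MS);
  const todayFire = Date.UTC(
    sgtNow.getUTCFullYear(),
    sgtNow.getUTCMonth(),
    sgtNow.getUTCDate(),
    hourSgt,
    minuteSgt,
  );
  const fireSgt = todayFire > sgtNow.getTime() ? todayFire : todayFire + DAY_MS;
  // Shift back to a real UTC instant.
  return new Date(fireSgt - SGT_OFFSET_MS);
}
```

For example, at 2026-05-13 18:00 SGT (10:00 UTC) the next 02:00 SGT fire is 2026-05-14 02:00 SGT, that is 2026-05-13T18:00:00Z.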

Test fixtures across `service.test`, `exportImport/*.test`, and
`mirrorService.test` updated to include the new field on every hand-
rolled mirror payload.

Schema-level coverage in `sections.test.ts`:
  • UT-SCHEMA-MIRROR-002: accepts valid cron + empty string.
  • UT-SCHEMA-MIRROR-003: rejects invalid cron expressions.
  • UT-SCHEMA-MIRROR-004: defaults parse cleanly + value is correct.

Refs #437.

* feat(api): in-process mirror reconcile scheduler via Agenda (#437)

Replaces the soon-to-be-removed k8s `CronJob` (next commit deletes
the manifest) with an Agenda-backed in-process scheduler that lives
inside the long-running `ornn-api` pod.

Two recurring Agenda jobs:

  1. `mirror-reconcile` — calls `MirrorService.reconcileAll()`. The
     schedule (cron expression) is driven by the new
     `settings.mirror.reconcileSchedule` field and registered with
     `timezone: "Asia/Singapore"` so admins typing `0 2 * * *` get
     literal 2am Singapore time (UTC+8, no DST — pinned).

  2. `mirror-sync-schedule` — runs every minute on every pod. Reads
     settings; if the cron changed, calls `agenda.every(cron, name)`
     to update the recurring-job doc. Because `every()` upserts by
     job name on the shared `agendaJobs` collection, every pod's
     Agenda picks up the new cadence via its 5s poll. No cross-pod
     messaging needed. Convergence ≤ ~65s from admin save.

Empty `reconcileSchedule` → `agenda.cancel({ name: "mirror-reconcile" })`,
removing the recurring row. Independent of the master `enabled`
kill switch — admins can pause the cron while keeping the publish-
time webhook path alive.
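
The decision the `mirror-sync-schedule` job makes each minute can be sketched as a pure function. The names below are assumptions, and the returned action is only a description: a caller would translate `every` into `agenda.every(...)` with the Singapore timezone and `cancel` into `agenda.cancel({ name: "mirror-reconcile" })`, as described above.

```ts
type ScheduleSyncAction =
  | { kind: "none" }
  | { kind: "cancel"; jobName: string }
  | { kind: "every"; jobName: string; cron: string; timezone: string };

const RECONCILE_JOB_NAME = "mirror-reconcile";

// lastApplied is the cron this pod most recently applied (null before the first sync).
function planScheduleSync(lastApplied: string | null, configured: string): ScheduleSyncAction {
  const cron = configured.trim();
  if (cron === lastApplied) {
    return { kind: "none" };
  }
  if (cron === "") {
    // Empty schedule disables the recurring job.
    return { kind: "cancel", jobName: RECONCILE_JOB_NAME };
  }
  return { kind: "every", jobName: RECONCILE_JOB_NAME, cron, timezone: "Asia/Singapore" };
}
```

Keeping the comparison local to each pod is one possible design; it works because re-applying the same cron to the shared job row would be harmless anyway.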

Multi-pod safety: Agenda's per-fire row lock on `agendaJobs`. Exactly
one pod claims each scheduled fire; others see it locked and skip.
`defaultLockLifetime: 10min` provides stale-lock recovery if a pod
dies mid-reconcile. The split-brain failure mode common to all
TTL-lock systems (paused leader resumes thinking it still holds the
lock) is benign here because `reconcileAll` is idempotent.
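
The claim-or-skip behaviour with stale-lock recovery can be pictured with a toy in-memory model. This is not Agenda's implementation (which performs the claim as an atomic database update); the row shape and function name are assumptions used only to show the TTL rule.

```ts
interface FireLockRow {
  lockedAt: number | null; // epoch ms when the fire was claimed
  lockedBy: string | null; // which pod claimed it
}

const LOCK_LIFETIME_MS = 10 * 60 * 1000; // mirrors defaultLockLifetime: 10min

// Returns true if podId now holds the lock, false if another pod's live lock blocks it.
function tryClaimFire(row: FireLockRow, podId: string, nowMs: number): boolean {
  const isStale = row.lockedAt !== null && nowMs - row.lockedAt >= LOCK_LIFETIME_MS;
  if (row.lockedAt !== null && !isStale) {
    return false;
  }
  row.lockedAt = nowMs;
  row.lockedBy = podId;
  return true;
}
```

In this model a pod that stalls past the lifetime can lose its lock to another pod while still running, which is the split-brain case the commit message accepts because `reconcileAll` is idempotent.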

Deps added:
  - `agenda@6.2.5` — main library, ESM, actively maintained as of
    2026-05 (last commit 2026-05-10, 10 npm maintainers, 96k weekly
    downloads).
  - `@agendajs/mongo-backend@4.0.2` — Agenda v6+ moved storage
    backends into separate packages; this one accepts our existing
    `Db` so the scheduler shares our Mongo connection.

Wired in `bootstrap.ts` after `mirrorService` is constructed.
Scheduler start failure is logged + non-fatal — the pod still serves
HTTP. Shutdown chain stops the scheduler before closing Mongo.

Tests mock the Agenda surface so the assertions stay deterministic;
the integration-level multipod claim race is Agenda's own test
suite's job, not duplicated here.

Refs #437.

* feat(web): mirror schedule control on admin settings page (#437)

Adds the `Reconcile schedule` field to `MirrorSection.tsx` so admins
can manage the in-process scheduler's cadence from the GitHub mirror
settings page — no redeploy.

UX shape (matching what GitHub Actions / Vercel / Netlify do):
  • Preset dropdown: Disabled / Daily 02:00 / Every 6h / Every 12h /
    Hourly / Custom…
  • "Custom (cron expression)…" reveals a text input. Validated
    client-side via `cron-parser` — invalid expressions tint the
    input red with an inline hint, and the existing Zod schema on
    the form rejects them before save.
  • Below the field: a "Next run: <ISO>" preview computed from
    `cron-parser` with `tz: "Asia/Singapore"` so the operator sees
    exactly when the next fire will be in SGT.
  • A persistent note: "Schedules are interpreted in Singapore time
    (UTC+8, no DST)." — pinned timezone is enforced server-side too,
    so this matches reality regardless of where the operator's
    browser thinks it lives.
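
One way the preset dropdown could map to stored cron strings is sketched below. Only `0 2 * * *` (the default) and the empty string (disabled) are stated in this PR; the other three expressions and all identifier names are assumptions for illustration.

```ts
// Preset labels follow the dropdown described above; cron values other than the
// daily default and the empty "disabled" value are assumed.
const SCHEDULE_PRESETS: ReadonlyArray<{ label: string; cron: string }> = [
  { label: "Disabled", cron: "" },
  { label: "Daily 02:00", cron: "0 2 * * *" },
  { label: "Every 6h", cron: "0 */6 * * *" },
  { label: "Every 12h", cron: "0 */12 * * *" },
  { label: "Hourly", cron: "0 * * * *" },
];

// Picks the dropdown entry for a stored schedule; anything else shows as Custom.
function presetLabelFor(storedCron: string): string {
  const match = SCHEDULE_PRESETS.find((preset) => preset.cron === storedCron.trim());
  return match ? match.label : "Custom";
}
```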

Wired up the `MirrorSection` API type with the new
`reconcileSchedule: string` field. i18n strings added to both en.json
and zh.json under the existing `adminSettings.sections.mirror`
namespace.

Adds `cron-parser` as an ornn-web dep (same version pinned by ornn-api
in an earlier commit — keeps validation behavior identical client-
and server-side).

Refs #437.

* chore(deployment): delete mirror-cronjob.yaml + update stale UI copy (#437)

Removes `deployment/ornn-api/mirror-cronjob.yaml`. The in-process
scheduler in the previous commits now owns the periodic reconcile;
no external Kubernetes object is involved.

Also updates two pieces of stale operator-facing copy that referenced
"the hourly cron at :17":

  • `MirrorSetupHelp` step 4 — now points operators at the
    settings-page schedule control instead of a redeploy-locked
    manifest.
  • `MirrorPage` reconcile + repo-form hints — say "scheduler"
    rather than "cron" to match the new in-process model.

Refs #437.

* docs: changeset for #437

In-process mirror reconcile scheduler with admin-configurable cadence.
Both packages bumped at minor — backend gains a new dep (`agenda`) +
new field on the mirror settings section; web gains the schedule
control UI. No breaking change for callers; legacy
`PlatformSettings.githubMirror` already had a boot migration into
the new section.

* feat(api): MirrorScheduler exposes getScheduledRunStatus() (#475)

Adds a method on `MirrorScheduler` that returns the last scheduled
fire's outcome — read from Agenda's persisted recurring-job doc in
the `agendaJobs` collection, so it survives pod restarts and
aggregates correctly across replicas.

```ts
interface ScheduledRunStatus {
  status: "succeeded" | "failed" | "running" | "never_run";
  lastRunAt: Date | null;
  lastFinishedAt: Date | null;
  lastDurationMs: number | null;
  lastError: string | null;
  nextRunAt: Date | null;
}
```

Derivation priority (covered by unit tests):
  1. `lockedAt` set → `running` (some pod is executing it right now).
  2. `failedAt` newer than `lastFinishedAt` (or `lastFinishedAt` never
     set while `failedAt` is) → `failed`; populates `lastError` from
     `failReason`.
  3. `lastFinishedAt` set, no recent failure → `succeeded`.
  4. Nothing set yet, or the doc doesn't exist (fresh boot, or
     schedule was just disabled via `agenda.cancel`) → `never_run`.
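
The derivation priority above can be sketched as a pure function over the persisted job fields. The field names follow the ones mentioned in the list; the type and function names are assumptions, and the sketch reads "newer than" literally, so a `failedAt` equal to `lastFinishedAt` is not treated as a failure (the commit message does not specify the tie case).

```ts
interface PersistedJobFields {
  lockedAt?: Date | null;
  lastFinishedAt?: Date | null;
  failedAt?: Date | null;
  failReason?: string | null;
}

type DerivedRunStatus = "succeeded" | "failed" | "running" | "never_run";

// doc is null when the recurring-job document does not exist.
function deriveRunStatus(doc: PersistedJobFields | null): { status: DerivedRunStatus; lastError: string | null } {
  if (doc === null) {
    return { status: "never_run", lastError: null };
  }
  // 1. A lock means some pod is executing the job right now.
  if (doc.lockedAt) {
    return { status: "running", lastError: null };
  }
  const failedAt = doc.failedAt ?? null;
  const finishedAt = doc.lastFinishedAt ?? null;
  // 2. A failure newer than the last finish (or with no finish at all) wins.
  if (failedAt !== null && (finishedAt === null || failedAt.getTime() > finishedAt.getTime())) {
    return { status: "failed", lastError: doc.failReason ?? null };
  }
  // 3. Finished with no more recent failure.
  if (finishedAt !== null) {
    return { status: "succeeded", lastError: null };
  }
  // 4. Nothing recorded yet.
  return { status: "never_run", lastError: null };
}
```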

`queryJobs` failures (Mongo unreachable mid-poll) are swallowed and
treated as `never_run` so the admin status endpoint doesn't 500 when
the DB blips. Recovery happens on the next poll.

No public-facing behaviour change yet — the route layer consumes
this in the next commit.

Refs #475.

* feat(api): GET /admin/mirror/status returns scheduledRun (#475)

Threads the in-process `mirrorScheduler` into `createMirrorRoutes`
and replaces the old per-pod in-process `lastReconcile` block in the
status response with a persisted `scheduledRun` block sourced from
`mirrorScheduler.getScheduledRunStatus()`.

Response shape under `data`:
```
{
  enabled, repo, appId, installationId, appPrivateKey,
  counts: { ... },
  scheduledRun: {
    status: "succeeded" | "failed" | "running" | "never_run",
    lastRunAt: ISO|null,
    lastFinishedAt: ISO|null,
    lastDurationMs: number|null,
    lastError: string|null,
    nextRunAt: ISO|null,
  }
}
```

The previous `lastReconcile.{status, startedAt, finishedAt,
durationMs, result, error}` payload — which was driven by the
per-pod in-process state of manual `Reconcile now` clicks — is gone.
Nothing in this PR's tree consumes it, and the frontend updates in
the next two commits migrate over to `scheduledRun`. Manual
`Reconcile now` still works the same way; the in-process
`reconcileState` lives on inside `mirror/routes.ts` for its 409
"already running" guard, just not surfaced in the status response.

Bootstrap reorders `createMirrorScheduler` before `createMirrorRoutes`
so the routes receive a constructed scheduler. If the scheduler
fails to start (any boot-time error), routes get `mirrorScheduler:
null` and the status endpoint reports `never_run` for `scheduledRun`
rather than 500ing — same defensive shape the existing scheduler
startup uses.

Refs #475.

* feat(web): show last scheduled-run status on settings page + dashboard (#475)

Migrates both mirror UI surfaces off the (now-removed) per-pod
`lastReconcile` block and onto the persisted `scheduledRun` block
from `GET /admin/mirror/status`. Single source of truth across pods
and across restarts.

**Settings page (`/admin/settings/mirror`)**: adds a new line below
"Next run" in the schedule control:

```
Last run: 2026-05-13 18:00 SGT · ✓ Succeeded · 4.2s
```
or, on failure, with the error message wrapping freely on a second line:
```
Last run: 2026-05-13 18:00 SGT · ✗ Failed · 4.2s
          github 502: Bad Gateway
```
`never_run` shows `Last run: —`. `running` shows `⟳ Running…`
without a duration (no `lastFinishedAt` until the fire lands).

The settings page polls `/admin/mirror/status` every 30s for this
data via a dedicated TanStack Query, independent of the form's
React-Query cache so saving doesn't invalidate the poll.

**Legacy dashboard (`/admin/mirror`)**: repoints the existing
"Last reconcile" tile at the same persisted data. Label widened to
"Last *scheduled* reconcile" since manual `Reconcile now` clicks
don't update `scheduledRun`. Drops the "+X ~Y -Z =W" delta-counts
line that the old in-process state carried; that info is implicit
in the synced / lagging / never-synced count cards already on the
page, and the persisted scheduler doc doesn't store it. The
`useMirrorStatus` hook's fast-poll-while-running heuristic now keys
off `scheduledRun.status === "running"`.

Type update: `githubMirrorApi.MirrorStatus.lastReconcile` removed
and replaced with `scheduledRun: MirrorScheduledRun`. No consumers
depend on the old shape after this commit.
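
The settings-page "Last run" line described in this commit could be rendered along these lines. This is a sketch, not the component's code: the type and function names are assumptions, timestamps are converted with a fixed UTC+8 offset, and showing the start time next to `⟳ Running…` is a guess since the commit only says the duration is omitted.

```ts
interface ScheduledRunView {
  status: "succeeded" | "failed" | "running" | "never_run";
  lastRunAt: string | null; // ISO timestamp
  lastDurationMs: number | null;
  lastError: string | null;
}

// Formats an ISO instant as "YYYY-MM-DD HH:mm SGT" using the fixed UTC+8 offset.
function formatSgt(iso: string): string {
  const shifted = new Date(new Date(iso).getTime() + 8 * 60 * 60 * 1000);
  const pad = (n: number): string => String(n).padStart(2, "0");
  const date = `${shifted.getUTCFullYear()}-${pad(shifted.getUTCMonth() + 1)}-${pad(shifted.getUTCDate())}`;
  return `${date} ${pad(shifted.getUTCHours())}:${pad(shifted.getUTCMinutes())} SGT`;
}

// Returns the display lines; a failure with an error message yields a second line.
function formatLastRun(run: ScheduledRunView): string[] {
  if (run.status === "never_run" || run.lastRunAt === null) {
    return ["Last run: —"];
  }
  const when = formatSgt(run.lastRunAt);
  if (run.status === "running") {
    return [`Last run: ${when} · ⟳ Running…`];
  }
  const duration = run.lastDurationMs === null ? "" : ` · ${(run.lastDurationMs / 1000).toFixed(1)}s`;
  if (run.status === "failed") {
    const lines = [`Last run: ${when} · ✗ Failed${duration}`];
    if (run.lastError !== null) {
      lines.push(`          ${run.lastError}`);
    }
    return lines;
  }
  return [`Last run: ${when} · ✓ Succeeded${duration}`];
}
```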

i18n strings added in en + zh.

Refs #475.

* docs: changeset for #475

Surface last-run status of the scheduled mirror reconcile on both the
admin settings page and the legacy dashboard. Backend swaps the
per-pod `lastReconcile` block on `GET /admin/mirror/status` for a
persisted `scheduledRun` block; frontend migrates both consumers.
Curated user-facing notes for the next release: in-process mirror
reconcile scheduler with admin-configurable cadence (#437) and the
last-run status widget on the mirror settings page + legacy
dashboard (#475).

Empty changeset — release-notes files are CI-only, no package bump.

@chronoai-shining merged commit 0406492 into main on May 13, 2026
11 checks passed