
docs(compute): document scale-to-zero default and Fly.io mapping #64

Closed
tonychang04 wants to merge 3 commits into main from docs/compute-scale-to-zero

Conversation

@tonychang04
Contributor

@tonychang04 tonychang04 commented May 11, 2026

Summary

Companion to InsForge/InsForge#1251, which makes compute services deploy with full scale-to-zero by default (auto_stop_machines: stop, auto_start_machines: true, min_machines_running: 0). This PR teaches the skill about the new behavior.
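
As a rough illustration of the defaults named above, the sketch below models them as a plain Python dict and checks whether a given config matches. The names `SCALE_TO_ZERO_DEFAULTS` and `matches_default` are hypothetical and not part of InsForge or Fly; only the three field names and their values come from this PR.

```python
# Hypothetical sketch: the three defaults named in this PR, written with the
# fly.toml-style field names used in the summary above.
SCALE_TO_ZERO_DEFAULTS = {
    "auto_stop_machines": "stop",
    "auto_start_machines": True,
    "min_machines_running": 0,
}


def matches_default(config: dict) -> bool:
    """Return True if every default field carries its default value in `config`."""
    return all(
        config.get(key) == value for key, value in SCALE_TO_ZERO_DEFAULTS.items()
    )


if __name__ == "__main__":
    # An illustrative always-on variant (not something the CLI can produce).
    always_on = {
        **SCALE_TO_ZERO_DEFAULTS,
        "auto_stop_machines": "off",
        "min_machines_running": 1,
    }
    print(matches_default(SCALE_TO_ZERO_DEFAULTS))
    print(matches_default(always_on))
```

A check like this could help when deciding whether an already-deployed service still carries a pre-default config, assuming the stored config is available as a dict.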

What changes for agents reading the skill

  • SKILL.md: adds a short note under the compute command list — scale-to-zero is the default, ~1s cold start, the CLI/skill can only dial it up toward always-on (never below zero, since it's already the floor on Fly).
  • references/compute-deploy.md: adds a full "Scale-to-zero (default)" section with:
    • A table mapping InsForge defaults to Fly's exact field names (auto_stop_machines, auto_start_machines, min_machines_running) — agents familiar with Fly can predict what they're getting.
    • "What the CLI/skill can change" matrix — covers what users can ask for (faster wake / always-on / N warm replicas) and notes those flags aren't yet exposed; route through support.
    • Anti-pattern callout: don't run flyctl machine update directly to bypass the default, for the same reason as the rest of the page (the Fly machine isn't yours, and its state will drift from InsForge's view).
    • A migration note for already-deployed pre-default services: redeploy to pick up the new config.

Both edits keep the existing voice and the "DO NOT call flyctl directly" stance.

Test plan

  • Re-render the skill into Claude and ask "is compute scale-to-zero by default?" — agent should answer yes with the Fly mapping
  • Ask "how do I make my compute service always-on" — agent should explain CLI doesn't expose it yet, point to support, NOT recommend flyctl machine update
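
The second check in the test plan could be approximated with a crude keyword screen like the one below. This is only a sketch: the function name `review_always_on_answer` is hypothetical, and it flags any mention of `flyctl machine update`, including warnings against it, so a flagged answer still needs a human read.

```python
def review_always_on_answer(answer: str) -> list[str]:
    """Return a list of concerns about an agent's answer; empty means nothing was flagged."""
    text = answer.lower()
    concerns = []
    # The test plan expects the agent to point the user to support.
    if "support" not in text:
        concerns.append("answer does not point the user to support")
    # Any mention is flagged; a reviewer must confirm it is a warning, not advice.
    if "flyctl machine update" in text:
        concerns.append("answer mentions flyctl machine update; confirm it is only a warning")
    return concerns


if __name__ == "__main__":
    good = "The CLI does not expose an always-on option. Please contact support."
    bad = "Run flyctl machine update to turn autostop off."
    print(review_always_on_answer(good))
    print(review_always_on_answer(bad))
```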

🤖 Generated with Claude Code


Summary by cubic

Documents scale-to-zero as the only v1 mode for compute services (no override flags). Includes the Fly Machines API mapping (autostop: "stop", autostart: true, min_machines_running: 0) with fly.toml aliases, ~1s cold-start expectations, and a warning not to use flyctl. Also notes that min_machines_running is honored only in the app’s primary region. For always-on or warm replicas, contact support (don’t ask the CLI to change autostop or keep N warm).

  • Migration
    • Existing services keep the old config until you redeploy or run compute update <id>.

Written for commit ed592d8. Summary will update on new commits.

Summary by CodeRabbit

  • Documentation
    • Added detailed notes for Backend Compute Services explaining default scale-to-zero behavior, Fly defaults used, and cold-start performance expectations.
    • Clarified that the CLI cannot opt out of scale-to-zero, how configuration changes are applied on subsequent deployments, and which availability/tuning options are not yet exposed.


Compute services now deploy with Fly's full scale-to-zero
(auto_stop_machines: stop, auto_start_machines: true,
min_machines_running: 0) — see InsForge/InsForge#1251.

Document this in the skill so agents:
- Know scale-to-zero is the default behavior
- Set the right cold-start expectation (~1s on shared-1x)
- Understand the CLI/skill can only dial it *up* toward always-on,
  never below zero (it's already the floor on Fly)
- Map InsForge's behavior to Fly's exact field names so users
  familiar with Fly can predict what they're getting
- Know that pre-existing services need a redeploy to pick up the
  new defaults

Also notes that flags for `auto_stop_machines: off|suspend` and
`min_machines_running > 0` aren't yet exposed via the CLI —
route latency-critical services through support for now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3adbab3e-d674-4774-bcb8-5e6b19cc7c26

📥 Commits

Reviewing files that changed from the base of the PR and between cd1bda8 and ed592d8.

📒 Files selected for processing (2)
  • skills/insforge-cli/SKILL.md
  • skills/insforge-cli/references/compute-deploy.md
✅ Files skipped from review due to trivial changes (2)
  • skills/insforge-cli/references/compute-deploy.md
  • skills/insforge-cli/SKILL.md

Walkthrough

This PR adds documentation stating compute services deploy with Fly scale-to-zero defaults (autostop: "stop", autostart: true, min_machines_running: 0), describes cold-start expectations, notes the CLI has no override flags, and documents migration behavior for existing services.

Changes

Scale-to-zero default behavior documentation

| Layer / File(s) | Summary |
|---|---|
| Skill Overview (skills/insforge-cli/SKILL.md) | Brief paragraph added under Backend Compute Services (Fly.io) noting scale-to-zero defaults, cold-start (~1s on shared-1x), and that the CLI provides no override flags. |
| Detailed Reference (skills/insforge-cli/references/compute-deploy.md) | New "Scale-to-zero (v1 — the only mode)" section detailing the exact Machines API services fields sent (autostop, autostart, min_machines_running), fly.toml mappings, cold-start expectations, explanation of no CLI override flags, and migration behavior for already-deployed always-on services. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

A rabbit reads the docs by night,
Whispering how machines take flight.
They sleep, they wake with just a ping—
A tiny cold-start, then they sing.
Hooray for clear and cozy text! 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding documentation for the scale-to-zero default behavior and Fly.io API field mapping in compute service deployment. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
skills/insforge-cli/SKILL.md (1)

169-169: ⚡ Quick win

Clarify "in flight" to avoid ambiguity.

The phrase "A flag to opt out is not yet exposed via the CLI" followed by "in flight" is clear in context, but "in flight" could be misread. Consider "while a CLI flag is being developed" or "until a CLI flag lands" for consistency with the detailed reference doc.

✏️ Suggested rewording
-> 💤 **Scale-to-zero is the default.** Every compute service deploys with Fly's `auto_stop_machines: "stop"` + `auto_start_machines: true` + `min_machines_running: 0`. When traffic is idle the machine stops; the next incoming request wakes it (~1s cold start on `shared-1x`). This is the most aggressive cost-saving setting Fly offers — the CLI and skill cannot dial it down further. The only direction to adjust is *up* (more warm capacity): always-on or `N` warm replicas. A flag to opt out is not yet exposed via the CLI; if you need always-on for a latency-critical service, contact support. Semantics mirror Fly.io exactly — `auto_stop_machines` accepts `off | stop | suspend` and `min_machines_running` is an integer ≥ 0 on the Fly side.
+> 💤 **Scale-to-zero is the default.** Every compute service deploys with Fly's `auto_stop_machines: "stop"` + `auto_start_machines: true` + `min_machines_running: 0`. When traffic is idle the machine stops; the next incoming request wakes it (~1s cold start on `shared-1x`). This is the most aggressive cost-saving setting Fly offers — the CLI and skill cannot dial it down further. The only direction to adjust is *up* (more warm capacity): always-on or `N` warm replicas. A flag to opt out is not yet exposed via the CLI; if you need always-on for a latency-critical service, contact support. Semantics mirror Fly.io exactly — `auto_stop_machines` accepts `off | stop | suspend` and `min_machines_running` is an integer ≥ 0 on the Fly side.

Note: Since this note mentions "A flag to opt out is not yet exposed," the "in flight" reference in the reference doc at line 239 might be clearer if it matches this phrasing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/insforge-cli/SKILL.md` at line 169, Replace the ambiguous phrase "in
flight" in the SKILL.md section that currently says "A flag to opt out is not
yet exposed via the CLI; if you need always-on..." with clearer wording such as
"while a CLI flag is being developed" or "until a CLI flag lands" so the
sentence reads consistently; locate the sentence in SKILL.md (the paragraph
starting "A flag to opt out is not yet exposed via the CLI") and swap "in
flight" for one of the suggested phrasings to eliminate ambiguity.
skills/insforge-cli/references/compute-deploy.md (1)

239-239: ⚡ Quick win

Clarify "in flight" to avoid ambiguity.

The phrase "while a CLI flag is in flight" could be misread as "while a command is executing." Consider rephrasing to "while a CLI flag is being developed" or "until a CLI flag lands."

✏️ Suggested rewording
-If you need always-on for a latency-sensitive service today, contact support — we can adjust the machine config directly while a CLI flag is in flight. **Do not run `flyctl machine update` against the service yourself** — same reason as the rest of this page: the Fly machine isn't yours to manage, and operating on it directly will drift state away from the InsForge cloud's view.
+If you need always-on for a latency-sensitive service today, contact support — we can adjust the machine config directly while a CLI flag is being developed. **Do not run `flyctl machine update` against the service yourself** — same reason as the rest of this page: the Fly machine isn't yours to manage, and operating on it directly will drift state away from the InsForge cloud's view.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/insforge-cli/references/compute-deploy.md` at line 239, The phrase
"while a CLI flag is in flight" is ambiguous; change that clause in the sentence
mentioning contacting support so it reads clearly (e.g., replace "while a CLI
flag is in flight" with "while a CLI flag is being developed" or "until a CLI
flag lands") in the paragraph that begins "If you need always-on for a
latency-sensitive service today, contact support — we can adjust the machine
config directly..."; keep the warning about not running `flyctl machine update`
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/insforge-cli/references/compute-deploy.md`:
- Around line 220-224: The table currently shows backslash-escaped pipes in the
`auto_stop_machines` row (`"off" \| "stop" \| "suspend"`); remove the
backslashes and wrap the union-type string in code formatting so it renders
correctly (e.g., use `"off" | "stop" | "suspend"` inside backticks) and verify
other type cells (`auto_start_machines`, `min_machines_running`) use consistent
code formatting if needed.

---

Nitpick comments:
In `@skills/insforge-cli/references/compute-deploy.md`:
- Line 239: The phrase "while a CLI flag is in flight" is ambiguous; change that
clause in the sentence mentioning contacting support so it reads clearly (e.g.,
replace "while a CLI flag is in flight" with "while a CLI flag is being
developed" or "until a CLI flag lands") in the paragraph that begins "If you
need always-on for a latency-sensitive service today, contact support — we can
adjust the machine config directly..."; keep the warning about not running
`flyctl machine update` unchanged.

In `@skills/insforge-cli/SKILL.md`:
- Line 169: Replace the ambiguous phrase "in flight" in the SKILL.md section
that currently says "A flag to opt out is not yet exposed via the CLI; if you
need always-on..." with clearer wording such as "while a CLI flag is being
developed" or "until a CLI flag lands" so the sentence reads consistently;
locate the sentence in SKILL.md (the paragraph starting "A flag to opt out is
not yet exposed via the CLI") and swap "in flight" for one of the suggested
phrasings to eliminate ambiguity.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c0d0cd41-477b-4d92-ab6f-c664137766c5

📥 Commits

Reviewing files that changed from the base of the PR and between ed14fea and 1a6331a.

📒 Files selected for processing (2)
  • skills/insforge-cli/SKILL.md
  • skills/insforge-cli/references/compute-deploy.md

Comment on lines +220 to +224
| Field | InsForge default | Fly's range | What it does |
|---|---|---|---|
| `auto_stop_machines` | `"stop"` | `"off" \| "stop" \| "suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |
| `auto_start_machines` | `true` | `bool` | Wake the machine when a request arrives at its endpoint. |
| `min_machines_running` | `0` | `int ≥ 0` | Minimum warm replicas. `0` = full scale-to-zero. |
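
The "Fly's range" column above can be read as a small validation rule set. The following sketch checks a config dict against those ranges as listed in the table; the function name `validate_scaling_fields` is hypothetical, and this is not how InsForge or Fly validates anything.

```python
# Allowed values for auto_stop_machines, as listed in the table above.
AUTO_STOP_VALUES = {"off", "stop", "suspend"}


def validate_scaling_fields(config: dict) -> list[str]:
    """Return human-readable errors for fields outside the table's ranges."""
    errors = []

    stop_mode = config.get("auto_stop_machines")
    if not isinstance(stop_mode, str) or stop_mode not in AUTO_STOP_VALUES:
        errors.append(f"auto_stop_machines must be one of {sorted(AUTO_STOP_VALUES)}")

    if not isinstance(config.get("auto_start_machines"), bool):
        errors.append("auto_start_machines must be a bool")

    minimum = config.get("min_machines_running")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(minimum, bool) or not isinstance(minimum, int) or minimum < 0:
        errors.append("min_machines_running must be an int >= 0")

    return errors


if __name__ == "__main__":
    default = {
        "auto_stop_machines": "stop",
        "auto_start_machines": True,
        "min_machines_running": 0,
    }
    print(validate_scaling_fields(default))
    print(validate_scaling_fields({"auto_stop_machines": "pause"}))
```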

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Escaped pipe characters may render incorrectly in the table.

Line 222 contains "off" \| "stop" \| "suspend" with backslash-escaped pipes. In most Markdown renderers, the backslashes will appear literally. If the intent is to show the literal pipe character as a type separator (union type syntax), consider wrapping in backticks: `"off" | "stop" | "suspend"`.

📝 Proposed fix
-| `auto_stop_machines` | `"stop"` | `"off" \| "stop" \| "suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |
+| `auto_stop_machines` | `"stop"` | `"off" | "stop" | "suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |

Or, if avoiding pipe characters in the cell is preferred:

-| `auto_stop_machines` | `"stop"` | `"off" \| "stop" \| "suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |
+| `auto_stop_machines` | `"stop"` | `"off"`, `"stop"`, or `"suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| Field | InsForge default | Fly's range | What it does |
|---|---|---|---|
| `auto_stop_machines` | `"stop"` | `"off" \| "stop" \| "suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |
| `auto_start_machines` | `true` | `bool` | Wake the machine when a request arrives at its endpoint. |
| `min_machines_running` | `0` | `int ≥ 0` | Minimum warm replicas. `0` = full scale-to-zero. |
| Field | InsForge default | Fly's range | What it does |
|---|---|---|---|
| `auto_stop_machines` | `"stop"` | `"off" | "stop" | "suspend"` | `stop` = fully stop on idle (cheapest, ~1s cold start). `suspend` = pause RAM in place (faster wake, more $). `off` = never stop. |
| `auto_start_machines` | `true` | `bool` | Wake the machine when a request arrives at its endpoint. |
| `min_machines_running` | `0` | `int ≥ 0` | Minimum warm replicas. `0` = full scale-to-zero. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/insforge-cli/references/compute-deploy.md` around lines 220 - 224, The
table currently shows backslash-escaped pipes in the `auto_stop_machines` row
(`"off" \| "stop" \| "suspend"`); remove the backslashes and wrap the union-type
string in code formatting so it renders correctly (e.g., use `"off" | "stop" |
"suspend"` inside backticks) and verify other type cells (`auto_start_machines`,
`min_machines_running`) use consistent code formatting if needed.


@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 2 files


tonychang04 and others added 2 commits May 11, 2026 13:45
The first draft documented the fly.toml field names
(`auto_stop_machines` / `auto_start_machines`). Companion fix in
InsForge/InsForge#1251 corrected the backend to send the Machines API
short names — `autostop` / `autostart` — because that's what the
`POST api.machines.dev/v1/apps/<app>/machines` body actually accepts
(`auto_*_machines` are silently ignored).

Rewrites the scale-to-zero section to:
- Show the body InsForge actually sends (short names)
- Cross-reference both spellings in one table so users coming from
  fly.toml docs and users debugging via `flyctl machines list --json`
  both see what they expect
- Note that `min_machines_running` is honored only in the app's
  primary region

Authoritative schema reference: fly.MachineService in
https://docs.machines.dev/spec/openapi3.json
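
The rename described in this commit can be sketched as a key translation. This is an illustration only, not InsForge's backend code: `TOML_TO_API` and `to_machines_api_fields` are hypothetical names, and a real Machines API service entry carries other fields that are omitted here.

```python
# fly.toml spelling -> Machines API spelling, per the commit message above.
# min_machines_running is spelled the same way in both.
TOML_TO_API = {
    "auto_stop_machines": "autostop",
    "auto_start_machines": "autostart",
    "min_machines_running": "min_machines_running",
}


def to_machines_api_fields(toml_fields: dict) -> dict:
    """Rename fly.toml-style keys to the Machines API short names.

    Unknown keys raise instead of passing through, since the commit message
    notes that unrecognized names are silently ignored by the API.
    """
    unknown = set(toml_fields) - set(TOML_TO_API)
    if unknown:
        raise KeyError(f"no Machines API mapping for: {sorted(unknown)}")
    return {TOML_TO_API[key]: value for key, value in toml_fields.items()}


if __name__ == "__main__":
    toml_style = {
        "auto_stop_machines": "stop",
        "auto_start_machines": True,
        "min_machines_running": 0,
    }
    print(to_machines_api_fields(toml_style))
```

Raising on unmapped keys is a design choice for this sketch: it surfaces a misspelled field early rather than letting it be dropped without notice.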

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Earlier draft hedged with "not yet exposed via the CLI" language for
autostop/min_machines override flags. Decision: don't build those for
v1. One mode, less surface area, simpler support story.

Rewrites both files to be explicit:
- "v1 is the only mode" — every service is scale-to-zero, no flags
- Removes the "What the CLI/skill can change" matrix that implied
  flags were coming. Replaced with a short "Why no override flags"
  paragraph explaining the tradeoff and the support escape hatch.
- Adds a direct instruction for agents: don't ask "set autostop to
  off" or "keep N warm" — there's no flag, nothing the skill can
  do, route to support instead. Prevents the agent from invoking
  imaginary flags and confusing the user.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tonychang04
Contributor Author

Consolidated into #62 (scale-to-zero compute docs are now part of the unified PR for combined review/merge).
