refactor(orchestrator): move busybox from compile-time embed to runtime disk read#2326
refactor(orchestrator): move busybox from compile-time embed to runtime disk read#2326tomassrnka merged 15 commits intomainfrom
Conversation
PR SummaryHigh Risk Overview Reviewed by Cursor Bugbot for commit dee65be. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Fetch script version parameter ignored in download URL
- Added version to GCS URL path in both fetch-busybox.sh and upload-busybox.sh so downloads are version-specific at busybox/${VERSION}/${ARCH}/busybox.
- ✅ Fixed: AWS nodepool-client missing busybox ARN in IAM policy
- Added fc_busybox_bucket_arn to the client_node_policy IAM document resources alongside the other bucket ARNs.
Or push these changes by commenting:
@cursor push b31c321f9e
Preview (b31c321f9e)
diff --git a/iac/provider-aws/modules/nodepool-client/main.tf b/iac/provider-aws/modules/nodepool-client/main.tf
--- a/iac/provider-aws/modules/nodepool-client/main.tf
+++ b/iac/provider-aws/modules/nodepool-client/main.tf
@@ -66,6 +66,9 @@
"${var.fc_versions_bucket_arn}/*",
var.fc_versions_bucket_arn,
+
+ "${var.fc_busybox_bucket_arn}/*",
+ var.fc_busybox_bucket_arn,
]
}
diff --git a/packages/orchestrator/scripts/fetch-busybox.sh b/packages/orchestrator/scripts/fetch-busybox.sh
--- a/packages/orchestrator/scripts/fetch-busybox.sh
+++ b/packages/orchestrator/scripts/fetch-busybox.sh
@@ -21,7 +21,7 @@
exit 0
fi
-GCS_URL="https://storage.googleapis.com/e2b-prod-public-builds/busybox/${ARCH}/busybox"
+GCS_URL="https://storage.googleapis.com/e2b-prod-public-builds/busybox/${VERSION}/${ARCH}/busybox"
echo "Downloading busybox v${VERSION} (${ARCH}) from GCS..."
diff --git a/packages/orchestrator/scripts/upload-busybox.sh b/packages/orchestrator/scripts/upload-busybox.sh
--- a/packages/orchestrator/scripts/upload-busybox.sh
+++ b/packages/orchestrator/scripts/upload-busybox.sh
@@ -17,7 +17,7 @@
RELEASE_URL="https://github.com/e2b-dev/fc-busybox/releases/download/v${VERSION}"
BINARY="busybox_v${VERSION}_${ARCH}"
-GCS_PATH="gs://e2b-prod-public-builds/busybox/${ARCH}/busybox"
+GCS_PATH="gs://e2b-prod-public-builds/busybox/${VERSION}/${ARCH}/busybox"
echo "Downloading busybox v${VERSION} (${ARCH}) from GitHub..."This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
…me disk read
Busybox was embedded into the orchestrator binary via //go:embed at compile
time. This broke orchestrator-ee in the belt repo because the binary isn't
in the Go module cache (it's gitignored and fetched at build time).
Instead, busybox is now read from disk at runtime via HOST_BUSYBOX_DIR
(default /fc-busybox), matching how kernels, firecracker, and envd already
work. A new GCS bucket ({project}-fc-busybox) is created per environment
and mounted via gcsfuse (GCP) or s3fs (AWS) on orchestrator hosts.
Changes:
- Delete pkg/template/build/core/systeminit/ package entirely
- Read busybox from disk in rootfs.go using os.ReadFile
- Add HostBusyboxDir to BuilderConfig (HOST_BUSYBOX_DIR env var)
- Create fc-busybox bucket in Terraform (GCP + AWS)
- Add gcsfuse/s3fs mount in host startup scripts
- Add upload-busybox.sh for publishing to GCS public bucket
- Update fetch-busybox.sh to download from GCS instead of GitHub
- Add busybox to copy-public-builds and host-init
- Update cmd/create-build local mode to handle busybox path
- Remove fetch-busybox from CI lint/build (no longer compile-time dep)
Deployment: run upload-busybox, terraform apply, copy-public-builds before
deploying orchestrator binaries that include this change.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Include version in GCS URL path (busybox/{version}/{arch}/busybox)
so a version bump in the Makefile fails loudly if the binary hasn't
been uploaded, instead of silently downloading the old one.
- Add BusyboxVersion to BuilderConfig (BUSYBOX_VERSION env, default 1.36.1)
so the runtime path includes the version component.
- Add fc_busybox_bucket_arn to AWS nodepool-client IAM read-access policy
(was missing, only worked because the parent cluster policy included it).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Upload a busybox.sha256 sidecar file alongside the binary in GCS. The fetch script now downloads and verifies the checksum, restoring the integrity check that was lost when moving from GitHub releases (which had SHA256SUMS) to GCS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Busybox is no longer embedded at compile time, so tests don't need the binary on disk. The rootfs_test mocks busybox with temp files. Removing the fetch step unblocks CI before busybox is uploaded to GCS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The smoketest (cmd/smoketest) does a real template build that needs busybox on disk. Restore fetch-busybox in test workflows with || true so CI doesn't fail when GCS isn't populated yet (the smoketest will fail with a clear error instead of the setup step crashing). Also make host-init busybox download non-fatal for integration tests. Set HOST_BUSYBOX_DIR via GITHUB_ENV so the test steps pick up the local .busybox directory. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Host-init should fail hard if busybox isn't available, same as kernels and firecracker. Upload busybox to GCS before deploying. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
9faf869 to
f05c47f
Compare
Co-authored-by: Jakub Novák <jakub@e2b.dev>
Co-authored-by: Jakub Novák <jakub@e2b.dev>
The fc-busybox CI workflow handles uploads to GCS automatically on tag push. No need for a manual upload target in infra. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 416b1a5698
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
The build-template target runs locally where /fc-busybox doesn't exist. Fetch busybox first and point HOST_BUSYBOX_DIR to the local .busybox dir so cmd/create-build can find the binary at runtime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| os.Setenv("HOST_BUSYBOX_DIR", busyboxDir) | ||
| } | ||
| busyboxVersion := os.Getenv("BUSYBOX_VERSION") | ||
| if busyboxVersion == "" { | ||
| busyboxVersion = "1.36.1" | ||
| } |
There was a problem hiding this comment.
🟡 In setupEnv() (create-build/main.go:176), the busybox version fallback is hardcoded to "1.36.1" independently of cfg/model.go's envDefault:"1.36.1", creating two sources of truth. If the canonical default in cfg/model.go is ever bumped without updating this hardcoded fallback, the local-dev diagnostic will check the wrong version path and print a misleading warning even when busybox is present at the correct new-version path. The fix is to derive the version from cfg.ParseBuilder() instead of repeating the literal.
Extended reasoning...
What the bug is
In packages/orchestrator/cmd/create-build/main.go lines 172-177, setupEnv() reads BUSYBOX_VERSION from the environment and falls back to the hardcoded string "1.36.1" when the variable is unset. Separately, cfg/model.go declares BusyboxVersion with envDefault:"1.36.1". These two defaults are independent; nothing enforces they stay in sync.
The specific code path
When a developer runs create-build locally without setting BUSYBOX_VERSION, setupEnv() constructs busyboxBin using its own hardcoded "1.36.1". It then performs an os.Stat() check and prints either a success or warning message. After setupEnv() returns, cfg.ParseBuilder() is called and produces a BuilderConfig whose BusyboxVersion comes from the env tag default in model.go. The actual build uses BuilderConfig.BusyboxVersion, not the fallback in setupEnv().
Why existing code does not prevent it
There is no compile-time linkage between the two defaults. In a bare go run invocation without environment variables, both sources are read independently.
Impact
If cfg/model.go is updated to e.g. envDefault:"1.37.0" and create-build/main.go is not, the diagnostic check would stat .busybox/1.36.1/amd64/busybox and print a false warning, even though busybox is present at .busybox/1.37.0/amd64/busybox and the build itself succeeds. The impact is misleading developer output only (local dev mode diagnostic); production builds are unaffected because BUSYBOX_VERSION is set explicitly in production environments.
How to fix
Move cfg.ParseBuilder() before setupEnv() and pass builderConfig.BusyboxVersion into setupEnv() as a parameter, eliminating the second hardcoded fallback. Alternatively, expose the default as an exported constant from the cfg package and reference it in both places.
Step-by-step proof
- Developer bumps BusyboxVersion envDefault in cfg/model.go from "1.36.1" to "1.37.0" and runs make fetch-busybox, which downloads to .busybox/1.37.0/amd64/busybox.
- Developer runs go run ./cmd/create-build -storage ./local-build -to-build test without setting BUSYBOX_VERSION.
- setupEnv() reads os.Getenv("BUSYBOX_VERSION") which is empty and falls back to hardcoded "1.36.1".
- busyboxBin is set to .busybox/1.36.1/amd64/busybox which does not exist, prints the warning message.
- cfg.ParseBuilder() then returns BusyboxVersion="1.37.0" from the updated envDefault.
- rootfs.additionalOCILayers reads from .busybox/1.37.0/amd64/busybox and succeeds silently.
- Result: build succeeds but developer saw a false warning that busybox is missing.
Use cfg.DefaultBusyboxVersion instead of hardcoding "1.36.1" in cmd/create-build, avoiding two sources of truth. Same pattern as featureflags.DefaultKernelVersion and DefaultFirecrackerVersion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous version wrote the binary to the final path before downloading the SHA256 sidecar. If the sidecar download failed, set -e left an unverified binary on disk without a stamp. Now both files go to temp paths first, verification runs, then the binary is moved to the output path — matching the old /tmp staging pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consistent with how kernels and firecracker are handled — the Makefile doesn't set HOST_KERNELS_DIR or pre-fetch those either. The cmd/create-build setupEnv() handles local paths when -storage is used. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…plate" This reverts commit 15cdbdf.
Keep fetch-busybox dependency but don't override the env var — consistent with how other host paths are handled. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit dee65be. Configure here.
Commit bba0313 moved busybox from compile-time go:embed to runtime disk read via HOST_BUSYBOX_DIR, but didn't add the variable to .env.local. Without it, local-dev builds fail trying to read busybox from /fc-busybox (the production default). Point HOST_BUSYBOX_DIR at .busybox, the local fetch-busybox output directory. Fixes: bba0313 ("refactor(orchestrator): move busybox from compile-time embed to runtime disk read (#2326)")
Commit bba0313 moved busybox from compile-time go:embed to runtime disk read via HOST_BUSYBOX_DIR, but didn't add the variable to .env.local. Without it, local-dev builds fail trying to read busybox from /fc-busybox (the production default). Point HOST_BUSYBOX_DIR at .busybox, the local fetch-busybox output directory. Fixes: bba0313 ("refactor(orchestrator): move busybox from compile-time embed to runtime disk read (#2326)")


Summary
//go:embedat compile time toos.ReadFileat runtime, matching how kernels, firecracker, and envd already worksysteminitpackage that caused orchestrator-ee build failures in belt (the binary wasn't in the Go module cache because it's gitignored)fc-busyboxGCS/S3 bucket per environment, mounted via gcsfuse (GCP) / s3fs (AWS) at/fc-busybox/{arch}/busyboxDeployment steps (before merging)
cd packages/orchestrator make upload-busybox BUILD_ARCH=amd64 make upload-busybox BUILD_ARCH=arm64fc-busyboxbuckets:make plan && make applyTest plan
go build ./...compiles without busybox on disk (no embed dependency)go test ./...passes (rootfs_test mocks busybox with temp files)make upload-busyboxfor both archesfc-busyboxbucket existsmake copy-public-builds, verify files in per-env bucket/fc-busybox/{arch}/busybox)🤖 Generated with Claude Code