
feat: add composite upload option for large file writes#1254

Open
mishushakov wants to merge 13 commits into main from mishushakov/composite-upload

Conversation

@mishushakov
Member

@mishushakov mishushakov commented Apr 3, 2026

Summary

  • Adds automatic composite upload for large files (>64MB) in both JS and Python SDKs — write() transparently splits data into 64MB chunks, uploads them in parallel, then composes them server-side using zero-copy concatenation via the new POST /files/compose endpoint
  • Adds /files/compose endpoint to the envd OpenAPI spec with ComposeRequest schema (source_paths, destination, username) and regenerates JS SDK types
  • Files at or below 64MB use the normal single upload path — no user-facing API changes needed
  • Supports gzip compression for chunk uploads
  • Supports both sync and async Python SDKs
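
The chunking rule described above can be sketched in a few lines of Python. This is an illustrative helper under assumed semantics, not the SDK's actual code (`split_into_chunks` is a hypothetical name):

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64MB threshold and chunk size, per this PR


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return None when the normal single-upload path applies,
    otherwise the list of fixed-size chunks to upload in parallel."""
    if len(data) <= chunk_size:
        return None  # at or below 64MB: normal write path, no API change
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
```

Only the last chunk may be shorter than 64MB; every chunk is uploaded to a temporary path and then composed server-side.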

Test plan

  • Upload a file >64MB and verify it is chunked and composed correctly
  • Upload a file <64MB and verify normal upload path is used
  • Test with gzip: true on large file uploads
  • Test with both JS and Python SDKs (sync and async)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@changeset-bot

changeset-bot bot commented Apr 3, 2026

🦋 Changeset detected

Latest commit: 111dd1f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| ---- | ---- |
| @e2b/python-sdk | Minor |
| e2b | Minor |


@cursor

cursor bot commented Apr 3, 2026

PR Summary

Medium Risk
Changes the file upload path for large writes by introducing parallel chunk uploads and a server-side compose step, which could affect reliability/timeouts and temporary file handling. Risk is limited to SDK write flows and gated by envd version support.

Overview
Adds composite uploads for large file writes (>64MB). The JS SDK Filesystem.write() now transparently converts large writes to a 64MB-chunked parallel upload to temporary /tmp/.e2b-upload-* paths, then calls the new POST /files/compose endpoint to assemble the final file (with optional gzip).

Extends the envd API spec and generated types with POST /files/compose and the ComposeRequest schema (source_paths, destination, optional username), and updates the async Python SDK to use the same chunk+compose strategy for oversized writes. A changeset bumps @e2b/python-sdk and e2b as minor releases.
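
A ComposeRequest payload for the new endpoint might look like the following. This is a hypothetical example built only from the schema fields named in this PR (source_paths, destination, optional username); the paths and values are made up for illustration:

```python
import json

# Hypothetical POST /files/compose body; field names come from the
# ComposeRequest schema described in this PR, values are invented.
compose_request = {
    "source_paths": [
        "/tmp/.e2b-upload-chunk-0",
        "/tmp/.e2b-upload-chunk-1",
    ],
    "destination": "/home/user/large-file.bin",
    "username": "user",  # optional per the schema
}

print(json.dumps(compose_request, indent=2))
```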

Written by Cursor Bugbot for commit 111dd1f. This will update automatically on new commits. Configure here.

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

Package Artifacts

Built from 72d3970. Download artifacts from this workflow run.

JS SDK (e2b@2.19.1-mishushakov-composite-upload.0):

npm install ./e2b-2.19.1-mishushakov-composite-upload.0.tgz

CLI (@e2b/cli@2.9.1-mishushakov-composite-upload.0):

npm install ./e2b-cli-2.9.1-mishushakov-composite-upload.0.tgz

Python SDK (e2b==2.20.0+mishushakov-composite-upload):

pip install ./e2b-2.20.0+mishushakov.composite.upload-py3-none-any.whl


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details


- Build chunk_paths deterministically before asyncio.gather in async _composite_write
- Use Username type instead of bare string in JS compositeWrite

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mishushakov and others added 5 commits April 3, 2026 13:01
Use already-materialized blob/content instead of re-reading original data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When data fits in a single chunk, fall through to the normal write path
instead of duplicating the upload logic inside compositeWrite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the `composite` option from `write()`. Files over 64MB are now
automatically chunked and uploaded via the composite path when the envd
version supports it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When data is an IO object and ≤64MB, to_upload_body() consumes the
stream. Pass the materialized bytes to write_files() instead of the
exhausted IO object.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mishushakov mishushakov marked this pull request as ready for review April 3, 2026 13:36
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9de5bc1fe8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +418 to 437
            "destination": destination,
        }
        if username:
            body["username"] = username

        r = await self._envd_api.post(
            ENVD_API_FILES_COMPOSE_ROUTE,
            json=body,
            timeout=self._connection_config.get_request_timeout(request_timeout),
        )

        err = await _ahandle_filesystem_envd_api_exception(r)
        if err:
            raise err

        return WriteInfo(**r.json())

    async def list(
        self,
        path: str,


🔴 The new _composite_write method introduces two related bugs in the _upload_chunk coroutines run via asyncio.gather:

  • When use_gzip=True, to_upload_body(chunk_data, use_gzip) calls gzip.compress(raw) synchronously. This CPU-bound call can take seconds on a 64MB chunk and fully blocks the asyncio event loop before the first await, freezing all other async operations during every chunk's compression.
  • asyncio.gather without return_exceptions=True (or a TaskGroup) does not cancel the remaining in-flight chunk uploads when one fails, so they keep uploading to /tmp even though the overall operation has already failed, wasting bandwidth and creating additional orphaned temp files.

Fix the first by using asyncio.to_thread(gzip.compress, chunk_data) and the second by using asyncio.TaskGroup (Python 3.11+) or explicit task cancellation on error.

Extended reasoning...

Issue 1 — Synchronous gzip.compress blocks the event loop

In _composite_write (lines 418–437 of sandbox_async/filesystem/filesystem.py), the inner _upload_chunk(i) coroutine calls upload_content = to_upload_body(chunk_data, use_gzip) before its first await. The to_upload_body function (in sandbox/filesystem/filesystem.py) calls gzip.compress(raw) when use_gzip=True. gzip.compress is a pure CPU-bound operation with no await points—it holds the GIL and blocks the asyncio event loop for the entire duration of the compression, typically hundreds of milliseconds to several seconds for a 64MB chunk.

Why asyncio.gather does not help here

Python's asyncio is single-threaded. All coroutines run cooperatively on one thread, and a coroutine can only yield control at an await point. Because gzip.compress runs before the first await in _upload_chunk, it holds the event loop thread continuously. With asyncio.gather launching chunk_count coroutines, each coroutine blocks the event loop sequentially at its compression step. Concretely, for a 192MB file (3 chunks), the event loop is blocked during 3 sequential gzip.compress calls—potentially 3–6+ seconds total—preventing all other async operations in the application from progressing. The fix is await asyncio.to_thread(gzip.compress, chunk_data) to offload compression to a thread pool, releasing the event loop between chunks.

Step-by-step proof for Issue 1

Consider a 128MB upload with use_gzip=True, producing 2 chunks of 64MB each:

  1. asyncio.gather(_upload_chunk(0), _upload_chunk(1)) schedules both coroutines.
  2. _upload_chunk(0) runs first, calls gzip.compress(chunk_data_0) — event loop blocked ~1–3s.
  3. Control never yields to _upload_chunk(1) until compression finishes.
  4. _upload_chunk(0) hits await self._envd_api.post(...) — now _upload_chunk(1) can start.
  5. _upload_chunk(1) calls gzip.compress(chunk_data_1) — event loop blocked again ~1–3s.
  6. Net result: compressions run serially (2–6s of event loop blockage), defeating the purpose of parallel uploads and freezing any other async tasks (e.g., heartbeats, other sandbox operations).

Issue 2 — asyncio.gather does not cancel in-flight tasks on failure

Per the Python docs: "If return_exceptions is False (default), the first raised exception is immediately propagated to the task that awaits on gather(). Other awaitables in the aws sequence won't be cancelled and will continue to run." In _composite_write, if _upload_chunk(0) raises (e.g., server error, network timeout, or disk full), asyncio.gather propagates the exception to the caller immediately, but _upload_chunk(1) through _upload_chunk(N-1) continue uploading data in the background. Since the overall _composite_write operation has already failed and the compose step will never run, these uploads produce orphaned /tmp/.e2b-upload-* files that waste network bandwidth and sandbox disk space. This compounds with any existing orphaned-file cleanup issue.
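
The documented gather behavior is easy to reproduce with a minimal, self-contained demo (no SDK code involved): after gather has already propagated the first exception, the other task keeps running instead of being cancelled.

```python
import asyncio


async def demo() -> str:
    async def fail():
        raise RuntimeError("chunk upload failed")

    async def slow_upload():
        await asyncio.sleep(0.2)  # stands in for an in-flight chunk upload
        return "uploaded"

    survivor = asyncio.create_task(slow_upload())
    try:
        await asyncio.gather(fail(), survivor)
    except RuntimeError:
        pass
    # gather propagated the error, but the other task was NOT cancelled
    assert not survivor.done()
    return await survivor  # still completes normally
```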

How to fix

For Issue 1: replace upload_content = to_upload_body(chunk_data, use_gzip) with upload_content = await asyncio.to_thread(gzip.compress, chunk_data) (and handle the non-gzip path inline), or refactor to_upload_body to have an async variant. For Issue 2: replace asyncio.gather with asyncio.TaskGroup (Python 3.11+), which automatically cancels all remaining tasks when one raises. Both fixes are confined to _composite_write in the async filesystem module and have no impact on the sync SDK.

mishushakov and others added 3 commits April 3, 2026 15:59
Composite upload's primary benefit is parallel chunk uploading, which
the sync SDK cannot leverage (sequential HTTP requests negate the
performance advantage). Only the async Python SDK and JS SDK retain
composite upload support via asyncio.gather() and Promise.all().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
write_files() already calls to_upload_body internally, so the
pre-materialization in write() was unnecessary after removing the
composite upload size check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace asyncio.gather with asyncio.TaskGroup for structured
concurrency, and offload gzip compression to a thread to avoid
blocking the event loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


    throw composeErr
  }

  return composeRes.data as WriteInfo


Orphaned chunk files leak disk on upload failure

Medium Severity

Both compositeWrite (JS) and _composite_write (Python) upload chunks to /tmp/.e2b-upload-{uuid}-{i} but never clean them up if a chunk upload partially fails or if the compose step fails. Promise.all / asyncio.TaskGroup will reject on first error, leaving already-uploaded chunks orphaned on disk. For a 1GB file, that's ~1GB of wasted space per failed attempt, which could exhaust the sandbox's limited disk and cause subsequent NotEnoughDiskSpace errors. A try/finally block around the upload+compose steps that removes successfully uploaded chunks on error would prevent this.

Additional Locations (1)
