Turbopack: Fix near-duplicate chunks and non-deterministic ordering in production builds#90710

Draft
lukesandberg wants to merge 4 commits into canary from lukesandberg/shared_chunks_for_layouts_exploration

@lukesandberg
Contributor

What?

Four changes to the Turbopack production chunking algorithm:

  1. Absorption pass for near-duplicate chunks — After the main merge loop, small remaining candidates that couldn't find a merge partner are now checked against existing heap chunks. If a small item's bitmap overlaps sufficiently (duplication cost > extra download cost), the small item is absorbed rather than creating a near-duplicate chunk.

  2. Extract merge algorithm into merge.rs — The merge/absorption logic is extracted from production.rs into a standalone merge.rs module with pure (Vc-free) types, enabling direct unit testing. production.rs now calls merge_grouped_chunks() instead of duplicating the algorithm.

  3. Sort chunk items by ModuleId in content emitters — Different routes perform independent DFS traversals producing different module orderings. Sorting by ModuleId before serialization in both browser and Node.js content emitters ensures identical module sets always produce identical content hashes.

  4. Sort group indices after merge — The merge algorithm's group_indices and batch_group_ids are collected in non-deterministic order (hash maps + heap). Sorting them in production.rs before collecting chunk items ensures deterministic output.

Why?

Users reported that Turbopack creates duplicate (or nearly duplicate) chunks for shared layouts, leading to excess download bytes. Two root causes:

  • Near-duplicate chunks: A tiny module (e.g. 360B) with bitmap {0,1,2} would create a separate chunk instead of being absorbed into a large chunk (e.g. 60KB) with bitmap {0,1}, resulting in two nearly identical chunks being downloaded.
  • Non-deterministic ordering: The same set of modules could produce different content hashes depending on route traversal order or merge algorithm iteration order, causing the same logical chunk to get different filenames across routes — leading to duplicate downloads.

How?

merge.rs (new file):

  • merge_grouped_chunks(): Core merge algorithm operating on GroupInput (size + bitmap + batch_group_id). Used by production.rs.
  • merge_chunks(): Test-only wrapper that groups items by bitmap first, then delegates to merge_grouped_chunks().
  • Absorption pass after the main merge loop: iterates small unmerged candidates and finds the best-overlapping heap chunk to absorb them into.
  • 7 unit tests covering near-duplicate scenarios, layout sharing, deep nesting, input-order stability, and no-merge configs.
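
The grouping step that `merge_chunks()` performs before delegating can be sketched as follows. This is a minimal illustration, not the PR's code: `Item`, `Group`, and `group_by_bitmap` are hypothetical names, the bitmap is simplified to a `u64`, and `batch_group_id` is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a chunk item; not the PR's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Item {
    pub size: usize,
    /// Bit i is set when chunk group i needs this item.
    pub bitmap: u64,
}

/// Hypothetical, simplified stand-in for `GroupInput` (no batch_group_id).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Group {
    pub bitmap: u64,
    pub size: usize,
    /// Indices into the input slice, in ascending order.
    pub item_indices: Vec<usize>,
}

/// Groups items that share an identical bitmap and sums their sizes.
/// A BTreeMap keeps the groups ordered by bitmap, so the group order does
/// not depend on the order of the input or on hash map iteration.
pub fn group_by_bitmap(items: &[Item]) -> Vec<Group> {
    let mut groups: BTreeMap<u64, Group> = BTreeMap::new();
    for (index, item) in items.iter().enumerate() {
        let group = groups.entry(item.bitmap).or_insert_with(|| Group {
            bitmap: item.bitmap,
            size: 0,
            item_indices: Vec::new(),
        });
        group.size += item.size;
        group.item_indices.push(index);
    }
    groups.into_values().collect()
}
```

Using an ordered map here is a design choice of the sketch; the PR description only states that items are grouped by bitmap.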

production.rs (refactored):

  • Resolves Vc values and groups items by bitmap (unchanged).
  • Converts groups to GroupInput, calls merge_grouped_chunks().
  • Sorts group_indices and batch_group_ids before iterating to ensure deterministic chunk item collection.
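
A rough sketch of the sorting step, assuming a merged chunk carries its `group_indices` and `batch_group_ids` as plain vectors. The struct name, the element types, and the `collect_items` helper are assumptions made for illustration only.

```rust
/// Hypothetical result of the merge step; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MergedChunk {
    pub group_indices: Vec<usize>,
    pub batch_group_ids: Vec<u32>,
}

impl MergedChunk {
    /// Puts both lists into ascending order so that later iteration no
    /// longer depends on the order in which the merge encountered groups.
    pub fn sort_for_determinism(&mut self) {
        self.group_indices.sort_unstable();
        self.batch_group_ids.sort_unstable();
    }
}

/// Collects the items of every referenced group, in `group_indices` order.
pub fn collect_items<'a>(chunk: &MergedChunk, groups: &'a [Vec<&'a str>]) -> Vec<&'a str> {
    chunk
        .group_indices
        .iter()
        .flat_map(move |&index| groups[index].iter().copied())
        .collect()
}
```

Two merge results that contain the same groups in a different internal order become equal after `sort_for_determinism()`, and `collect_items()` then yields the same item sequence for both.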

content.rs (browser + nodejs):

  • Collects all chunk items, sorts by ModuleId, then serializes. Ensures identical content hashes regardless of DFS traversal order.
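
To illustrate why ordering matters for content hashes, here is a small sketch under simplifying assumptions: `ModuleId` is reduced to a `u32` wrapper, the serialization format is invented, and the standard library's `DefaultHasher` stands in for whatever hash the real emitters use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified module id; only used for ordering here.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct ModuleId(pub u32);

/// Hypothetical chunk item: an id plus its already generated code.
#[derive(Debug, Clone)]
pub struct ChunkItem {
    pub id: ModuleId,
    pub code: String,
}

/// Sorts the items by id, then serializes them in that order.
pub fn serialize_sorted(mut items: Vec<ChunkItem>) -> String {
    items.sort_by(|a, b| a.id.cmp(&b.id));
    let mut out = String::new();
    for item in &items {
        out.push_str(&format!("{}: {},\n", item.id.0, item.code));
    }
    out
}

/// Stand-in content hash over the serialized bytes.
pub fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}
```

Feeding the same items in two different traversal orders through `serialize_sorted()` gives byte-identical output, so `content_hash()` agrees for both.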

Testing

  • 7 unit tests in merge.rs covering the merge algorithm directly (near-duplicate absorption, layout sharing, deep nesting, input-order stability, characterization, no-merge config)
  • Existing production chunking integration tests exercise the full pipeline

lukesandberg and others added 4 commits February 28, 2026 11:36
Add absorption pass to the production chunking merge algorithm. After the
main merge loop, small remaining candidates that couldn't find a merge
partner are now checked against existing heap chunks. If a small item's
bitmap overlaps sufficiently with a heap chunk's bitmap (duplication cost
exceeds extra download cost), the small item is absorbed into that chunk
rather than creating a near-duplicate.

This fixes cases where a tiny module (e.g. 360B) with bitmap {0,1,2}
would create a separate near-identical chunk instead of being absorbed
into a large chunk (e.g. 60KB) with bitmap {0,1}.

Also adds merge.rs with a pure Vc-free extraction of the merge algorithm
and 7 unit tests covering the near-duplicate scenarios.

Refactor production.rs to call merge_grouped_chunks() from merge.rs
instead of having its own copy of the merge algorithm. This eliminates
~500 lines of duplicated merge logic.

production.rs now:
1. Resolves Vc values and groups items by bitmap (unchanged)
2. Converts groups to GroupInput (size + bitmap + batch_group_id)
3. Calls merge_grouped_chunks() for the merge + absorption pass
4. Maps results back to make_chunk() calls

merge.rs now exposes two entry points:
- merge_grouped_chunks(): takes pre-grouped items (used by production.rs)
- merge_chunks(): groups items by bitmap first, then calls
  merge_grouped_chunks() (used by unit tests)

Test-only types (ChunkItemForMerging, MergedChunkInfo, merge_chunks,
hash_bitmap) are gated with #[cfg(test)].

…shes

Different routes perform independent DFS traversals from different entry
points, producing different topological orderings of shared modules.
Since production chunk filenames are derived from content hashes of the
serialized JS bytes, this ordering instability causes the same set of
modules to produce chunks with different filenames across routes,
leading to duplicate downloads.

Fix by sorting chunk items by ModuleId before serialization in both
browser and Node.js chunk content emitters. This ensures identical
module sets always produce identical content hashes regardless of
traversal order.

…ks integration

After integrating merge_chunks into production.rs, chunk items were being
collected in the order that groups were merged, which is non-deterministic
due to hash map iteration order and heap operations in the merge algorithm.

The merge algorithm extends group_indices and batch_group_ids in whatever
order groups are encountered during merging. When production.rs iterates
over these unsorted indices to collect chunk items, the same set of modules
produces different output orders across builds.

Fix by sorting group_indices and batch_group_ids before iterating over them
to collect chunk items. This ensures deterministic output regardless of the
merge algorithm's internal ordering.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@nextjs-bot added the labels "created-by: Turbopack team" and "Turbopack" on Mar 1, 2026
codspeed-hq bot commented Mar 1, 2026

Merging this PR will not alter performance

✅ 17 untouched benchmarks
⏩ 3 skipped benchmarks [1]


Comparing lukesandberg/shared_chunks_for_layouts_exploration (5aa9876) with canary (fb694e9)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

if should_absorb && heap_chunks[idx].size + small.size <= max_size {
let hc = &mut heap_chunks[idx];
hc.size += small.size;
hc.group_indices.extend(small.group_indices);

Absorption pass in merge algorithm silently drops batch_group_ids when absorbing small candidates into heap chunks, causing missing batch group metadata in output chunks.
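
One possible shape of a fix, sketched with hypothetical types: the absorbing chunk takes over the small candidate's `batch_group_ids` alongside its `group_indices`. The `Candidate` struct and its field types are assumptions, and the `should_absorb` overlap check from the snippet above is left out.

```rust
/// Hypothetical merge candidate; field types are assumed for illustration.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub size: usize,
    pub group_indices: Vec<usize>,
    pub batch_group_ids: Vec<u32>,
}

/// Absorbs `small` into `heap_chunk` if the result stays within `max_size`.
/// Returns true when the absorption happened.
pub fn absorb(heap_chunk: &mut Candidate, small: Candidate, max_size: usize) -> bool {
    if heap_chunk.size + small.size > max_size {
        return false;
    }
    heap_chunk.size += small.size;
    heap_chunk.group_indices.extend(small.group_indices);
    // Carry the batch group metadata over as well so it is not lost.
    heap_chunk.batch_group_ids.extend(small.batch_group_ids);
    true
}
```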

@nextjs-bot
Collaborator

nextjs-bot commented Mar 1, 2026

Failing test suites

Commit: 5aa9876 | About building and testing Next.js

pnpm test-dev-turbo test/development/acceptance-app/error-recovery.test.ts (turbopack) (job)

  • Error recovery app > can recover from a event handler error (DD)
  • Error recovery app > render error not shown right after syntax error (DD)

● Error recovery app › can recover from a event handler error

expect(received).toMatchInlineSnapshot(snapshot)

Snapshot name: `Error recovery app can recover from a event handler error 1`

- Snapshot  - 1
+ Received  + 1

@@ -7,8 +7,8 @@
       |           ^",
    "stack": [
      "Index.useCallback[increment] index.js (7:11)",
      "button <anonymous>",
      "Index index.js (12:7)",
-     "Page index.js (10:5)",
+     "Page app/page.js (4:10)",
    ],
  }

  328 |       // So the SourceMap is updated to reflect that. But the browser still has the old bundled file loaded.
  329 |       // So we look up locations from the old bundle in the new source maps, which leads to mismatched locations.
> 330 |       await expect(browser).toDisplayCollapsedRedbox(`
      |                             ^
  331 |        {
  332 |          "description": "oops",
  333 |          "environmentLabel": null,

  at Object.toDisplayCollapsedRedbox (development/acceptance-app/error-recovery.test.ts:330:29)

● Error recovery app › render error not shown right after syntax error

expect(received).toMatchInlineSnapshot(snapshot)

Snapshot name: `Error recovery app render error not shown right after syntax error 3`

- Snapshot  - 1
+ Received  + 1

@@ -5,8 +5,8 @@
    "source": "index.js (5:11) @ ClassDefault.render
  > 5 |     throw new Error('nooo');
      |           ^",
    "stack": [
      "ClassDefault.render index.js (5:11)",
-     "Page index.js (10:16)",
+     "Page app/page.js (4:10)",
    ],
  }

  1075 |     if (isTurbopack) {
  1076 |       // TODO(veil): Location of Page should be app/page.js
> 1077 |       await expect(browser).toDisplayRedbox(`
       |                             ^
  1078 |        {
  1079 |          "description": "nooo",
  1080 |          "environmentLabel": null,

  at Object.toDisplayRedbox (development/acceptance-app/error-recovery.test.ts:1077:29)

pnpm test-dev-turbo test/development/acceptance/ReactRefreshLogBox.test.ts (turbopack) (job)

  • ReactRefreshLogBox > module init error not shown (DD)

● ReactRefreshLogBox › module init error not shown

expect(received).toMatchInlineSnapshot(snapshot)

Snapshot name: `ReactRefreshLogBox module init error not shown 1`

- Snapshot  - 1
+ Received  + 1

@@ -7,9 +7,9 @@
  > 3 | throw new Error('no')
      |       ^",
    "stack": [
      "module evaluation index.js (3:7)",
      "module evaluation pages/index.js (1:1)",
-     "module evaluation pages/index.js (1:1)",
+     "module evaluation index.js (9:16)",
      "<FIXME-next-dist-dir>",
    ],
  }

  160 |     } else {
  161 |       if (isTurbopack) {
> 162 |         await expect(browser).toDisplayRedbox(`
      |                               ^
  163 |          {
  164 |            "code": "E394",
  165 |            "description": "no",

  at Object.toDisplayRedbox (development/acceptance/ReactRefreshLogBox.test.ts:162:31)

pnpm test-dev-turbo test/development/acceptance-app/ReactRefreshLogBox.test.ts (turbopack) (job)

  • ReactRefreshLogBox app > logbox: anchors links in error messages (DD)
  • ReactRefreshLogBox app > Should not show webpack_exports when exporting anonymous arrow function (DD)
  • ReactRefreshLogBox app > Unhandled errors and rejections opens up in the minimized state (DD)

● ReactRefreshLogBox app › logbox: anchors links in error messages

expect(received).toMatchInlineSnapshot(snapshot)

Snapshot name: `ReactRefreshLogBox app logbox: anchors links in error messages 1`

- Snapshot  - 1
+ Received  + 1

@@ -7,8 +7,8 @@
      |           ^",
    "stack": [
      "Index.useCallback[boom] index.js (5:11)",
      "button <anonymous>",
      "Index index.js (9:7)",
-     "Page index.js (9:30)",
+     "Page app/page.js (4:10)",
    ],
  }

  618 |     // TODO(veil): Why Owner Stack location different?
  619 |     if (isTurbopack) {
> 620 |       await expect(browser).toDisplayCollapsedRedbox(`
      |                             ^
  621 |        {
  622 |          "description": "end https://nextjs.org",
  623 |          "environmentLabel": null,

  at Object.toDisplayCollapsedRedbox (development/acceptance-app/ReactRefreshLogBox.test.ts:620:29)

● ReactRefreshLogBox app › Should not show webpack_exports when exporting anonymous arrow function

expect(received).toMatchInlineSnapshot(snapshot)

Snapshot name: `ReactRefreshLogBox app Should not show __webpack_exports__ when exporting anonymous arrow function 1`

- Snapshot  - 1
+ Received  + 1

@@ -5,8 +5,8 @@
    "source": "index.js (3:11) @ {default export}
  > 3 |     throw new Error('test')
      |           ^",
    "stack": [
      "{default export} index.js (3:11)",
-     "Page app/page.js (2:1)",
+     "Page app/page.js (4:10)",
    ],
  }

  1040 |
  1041 |     if (isTurbopack) {
> 1042 |       await expect(browser).toDisplayRedbox(`
       |                             ^
  1043 |        {
  1044 |          "description": "test",
  1045 |          "environmentLabel": null,

  at Object.toDisplayRedbox (development/acceptance-app/ReactRefreshLogBox.test.ts:1042:29)

● ReactRefreshLogBox app › Unhandled errors and rejections opens up in the minimized state

expect(received).toMatchInlineSnapshot(snapshot)

Snapshot name: `ReactRefreshLogBox app Unhandled errors and rejections opens up in the minimized state 1`

- Snapshot  - 1
+ Received  + 1

@@ -5,8 +5,8 @@
    "source": "index.js (2:44) @ Index
  > 2 |   if (typeof window !== 'undefined') throw new Error('Component error')
      |                                            ^",
    "stack": [
      "Index index.js (2:44)",
-     "Page index.js (16:8)",
+     "Page app/page.js (4:10)",
    ],
  }

  1155 |     // TODO(veil): Why Owner Stack location different?
  1156 |     if (isTurbopack) {
> 1157 |       await expect(browser).toDisplayRedbox(`
       |                             ^
  1158 |        {
  1159 |          "description": "Component error",
  1160 |          "environmentLabel": null,

  at Object.toDisplayRedbox (development/acceptance-app/ReactRefreshLogBox.test.ts:1157:29)

pnpm test-start-turbo test/e2e/prerender.test.ts (turbopack) (job)

  • deterministic build - changing deployment id > build output API - builder > should produce identical build outputs even when changing deployment id (DD)

● deterministic build - changing deployment id › build output API - builder › should produce identical build outputs even when changing deployment id

thrown: "Exceeded timeout of 60000 ms for a test.
Add a timeout value to this test to increase the timeout, if this is a long-running test. See https://jestjs.io/docs/api#testname-fn-timeout."

  50 |       }
  51 |
> 52 |       const result = Reflect.apply(target, thisArg, args)
     |                              ^
  53 |       return typeof result === 'function' ? wrapJestTestFn(result) : result
  54 |     },
  55 |     get(target, prop, receiver) {

  at Object.apply (lib/e2e-utils/index.ts:52:30)
  at it (production/deterministic-build/deployment-id.test.ts:247:7)
      at Array.forEach (<anonymous>)
  at production/deterministic-build/deployment-id.test.ts:218:41
  at Object.<anonymous> (production/deterministic-build/deployment-id.test.ts:197:58)

@nextjs-bot
Collaborator

Stats from current PR

✅ No significant changes detected

📊 All Metrics
📖 Metrics Glossary

Dev Server Metrics:

  • Listen = TCP port starts accepting connections
  • First Request = HTTP server returns successful response
  • Cold = Fresh build (no cache)
  • Warm = With cached build artifacts

Build Metrics:

  • Fresh = Clean build (no .next directory)
  • Cached = With existing .next directory

Change Thresholds:

  • Time: Changes < 50ms AND < 10%, OR < 2% are insignificant
  • Size: Changes < 1KB AND < 1% are insignificant
  • All other changes are flagged to catch regressions

⚡ Dev Server

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Cold (Listen) | 915ms | 914ms | | ▁▁▁█▁ |
| Cold (Ready in log) | 893ms | 888ms | | ▂▂▁█▁ |
| Cold (First Request) | 1.666s | 1.585s | | ▄▄▁█▁ |
| Warm (Listen) | 912ms | 913ms | | ▁▁▁█▁ |
| Warm (Ready in log) | 883ms | 887ms | | ▁▁▁█▁ |
| Warm (First Request) | 684ms | 669ms | | ▁▁▁█▁ |

📦 Dev Server (Webpack) (Legacy)

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Cold (Listen) | 456ms | 456ms | | ▁▁▁▁▁ |
| Cold (Ready in log) | 437ms | 438ms | | ▂▃▁▁▃ |
| Cold (First Request) | 1.939s | 1.922s | | ▂▁▁▁▂ |
| Warm (Listen) | 456ms | 457ms | | ▁▁▁▁▁ |
| Warm (Ready in log) | 438ms | 438ms | | ▂▂▁▁▃ |
| Warm (First Request) | 1.957s | 1.951s | | ▁▁▁▁▂ |

⚡ Production Builds

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Fresh Build | 6.204s | 6.328s | | ▁▁▃█▁ |
| Cached Build | 6.231s | 6.277s | | ▁▁▃█▁ |

📦 Production Builds (Webpack) (Legacy)

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Fresh Build | 14.035s | 14.086s | | ▁▁▁▁▁ |
| Cached Build | 14.189s | 14.216s | | ▁▁▁▁▁ |
| node_modules Size | 475 MB | 475 MB | | ▁▁▁▁▁ |

📦 Bundle Sizes

⚡ Turbopack

Client

Main Bundles: **401 kB** → **401 kB** ✅ -14 B

80 files with content-based hashes (individual files not comparable between builds)

Server

Middleware
Canary PR Change
middleware-b..fest.js gzip 761 B 766 B
Total 761 B 766 B ⚠️ +5 B
Build Details
Build Manifests
Canary PR Change
_buildManifest.js gzip 451 B 450 B
Total 451 B 450 B ✅ -1 B

📦 Webpack

Client

Main Bundles
Canary PR Change
5528-HASH.js gzip 5.54 kB N/A -
6280-HASH.js gzip 58.4 kB N/A -
6335.HASH.js gzip 169 B N/A -
912-HASH.js gzip 4.59 kB N/A -
e8aec2e4-HASH.js gzip 62.6 kB N/A -
framework-HASH.js gzip 59.7 kB 59.7 kB
main-app-HASH.js gzip 254 B 254 B
main-HASH.js gzip 39.1 kB 39.1 kB
webpack-HASH.js gzip 1.68 kB 1.68 kB
262-HASH.js gzip N/A 4.59 kB -
2889.HASH.js gzip N/A 169 B -
5602-HASH.js gzip N/A 5.55 kB -
6948ada0-HASH.js gzip N/A 62.6 kB -
9544-HASH.js gzip N/A 59.1 kB -
Total 232 kB 233 kB ⚠️ +701 B
Polyfills
Canary PR Change
polyfills-HASH.js gzip 39.4 kB 39.4 kB
Total 39.4 kB 39.4 kB
Pages
Canary PR Change
_app-HASH.js gzip 194 B 194 B
_error-HASH.js gzip 183 B 180 B 🟢 3 B (-2%)
css-HASH.js gzip 331 B 330 B
dynamic-HASH.js gzip 1.81 kB 1.81 kB
edge-ssr-HASH.js gzip 256 B 256 B
head-HASH.js gzip 351 B 352 B
hooks-HASH.js gzip 384 B 383 B
image-HASH.js gzip 580 B 581 B
index-HASH.js gzip 260 B 260 B
link-HASH.js gzip 2.5 kB 2.5 kB
routerDirect..HASH.js gzip 320 B 319 B
script-HASH.js gzip 386 B 386 B
withRouter-HASH.js gzip 315 B 315 B
1afbb74e6ecf..834.css gzip 106 B 106 B
Total 7.97 kB 7.97 kB ✅ -2 B

Server

Edge SSR
Canary PR Change
edge-ssr.js gzip 125 kB 125 kB
page.js gzip 254 kB 255 kB
Total 379 kB 379 kB ⚠️ +331 B
Middleware
Canary PR Change
middleware-b..fest.js gzip 616 B 614 B
middleware-r..fest.js gzip 156 B 155 B
middleware.js gzip 43.8 kB 44 kB
edge-runtime..pack.js gzip 842 B 842 B
Total 45.4 kB 45.6 kB ⚠️ +203 B
Build Details
Build Manifests
Canary PR Change
_buildManifest.js gzip 715 B 718 B
Total 715 B 718 B ⚠️ +3 B
Build Cache
Canary PR Change
0.pack gzip 4.03 MB 4.03 MB 🔴 +7.24 kB (+0%)
index.pack gzip 102 kB 102 kB
index.pack.old gzip 103 kB 103 kB
Total 4.23 MB 4.24 MB ⚠️ +7.51 kB

🔄 Shared (bundler-independent)

Runtimes
Canary PR Change
app-page-exp...dev.js gzip 320 kB 320 kB
app-page-exp..prod.js gzip 170 kB 170 kB
app-page-tur...dev.js gzip 320 kB 320 kB
app-page-tur..prod.js gzip 170 kB 170 kB
app-page-tur...dev.js gzip 316 kB 316 kB
app-page-tur..prod.js gzip 168 kB 168 kB
app-page.run...dev.js gzip 317 kB 317 kB
app-page.run..prod.js gzip 168 kB 168 kB
app-route-ex...dev.js gzip 70.8 kB 70.8 kB
app-route-ex..prod.js gzip 49.2 kB 49.2 kB
app-route-tu...dev.js gzip 70.9 kB 70.9 kB
app-route-tu..prod.js gzip 49.3 kB 49.3 kB
app-route-tu...dev.js gzip 70.4 kB 70.4 kB
app-route-tu..prod.js gzip 49 kB 49 kB
app-route.ru...dev.js gzip 70.4 kB 70.4 kB
app-route.ru..prod.js gzip 49 kB 49 kB
dist_client_...dev.js gzip 324 B 324 B
dist_client_...dev.js gzip 326 B 326 B
dist_client_...dev.js gzip 318 B 318 B
dist_client_...dev.js gzip 317 B 317 B
pages-api-tu...dev.js gzip 43.2 kB 43.2 kB
pages-api-tu..prod.js gzip 32.9 kB 32.9 kB
pages-api.ru...dev.js gzip 43.2 kB 43.2 kB
pages-api.ru..prod.js gzip 32.9 kB 32.9 kB
pages-turbo....dev.js gzip 52.6 kB 52.6 kB
pages-turbo...prod.js gzip 38.5 kB 38.5 kB
pages.runtim...dev.js gzip 52.6 kB 52.6 kB
pages.runtim..prod.js gzip 38.5 kB 38.5 kB
server.runti..prod.js gzip 61.9 kB 61.9 kB
Total 2.82 MB 2.82 MB ✅ -4 B
📎 Tarball URL
next@https://vercel-packages.vercel.app/next/prs/90710/next

write!(code, ",")?;
}

// Sort chunk items by module ID to ensure deterministic output regardless of

That's not good for compression (I accidentally did this with webpack). Instead, sort by identifier, which gives path order and compresses much better.

But sorting shouldn't happen during code generation; it should happen when the chunk is created. The list of chunk items can be sorted there.
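
A minimal sketch of that suggestion, assuming each chunk item exposes its identifier as a path-like string. The `Chunk` and `ChunkItem` types here are hypothetical and do not mirror Turbopack's actual structures.

```rust
/// Hypothetical chunk item keyed by a path-like identifier.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChunkItem {
    pub ident: String,
    pub code: String,
}

/// Hypothetical chunk whose items are ordered once, at construction time.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    items: Vec<ChunkItem>,
}

impl Chunk {
    /// Sorts by identifier when the chunk is created, so code generation
    /// can iterate the items as they are without sorting again.
    pub fn new(mut items: Vec<ChunkItem>) -> Self {
        items.sort_by(|a, b| a.ident.cmp(&b.ident));
        Chunk { items }
    }

    /// Identifiers in the order they will be emitted.
    pub fn idents(&self) -> Vec<&str> {
        self.items.iter().map(|item| item.ident.as_str()).collect()
    }
}
```

With path ordering, modules from the same directory end up next to each other in the chunk, which is the property the comment above ties to better compression.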

// Absorption pass: try to absorb small remaining candidates into existing heap chunks.
// This prevents tiny modules with slightly different bitmaps from creating near-duplicate
// chunks. A small module (e.g. 360B) with bitmap {0,1,2} should be absorbed into a large
// chunk (e.g. 60KB) with bitmap {0,1} rather than creating a separate near-identical chunk.

The explanation doesn't make sense: doing so would make chunk group 2 ship an extra 60 KB.

The reverse case might make sense, but it would also overship modules, which we previously avoided. (We never send modules you don't need.)

I think what would make sense is to duplicate the small module: put it in the {0,1} chunk and leave a chunk with {2} and the small module.
That would reduce the requests for groups 0 and 1.
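
The trade-offs discussed in this thread can be worked through with a small model. Everything below is an illustrative assumption: chunks are reduced to a size and a `u64` bitmap of the groups that load them, and `duplicate_small` assumes the large chunk's groups are a subset of the small module's groups.

```rust
/// A chunk and the set of chunk groups that download it (bit i = group i).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    pub size: usize,
    pub bitmap: u64,
}

/// What one chunk group has to download.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GroupCost {
    pub requests: usize,
    pub bytes: usize,
}

pub fn cost_for_group(chunks: &[Chunk], group: u32) -> GroupCost {
    let mut cost = GroupCost { requests: 0, bytes: 0 };
    for chunk in chunks {
        if chunk.bitmap & (1u64 << group) != 0 {
            cost.requests += 1;
            cost.bytes += chunk.size;
        }
    }
    cost
}

/// Baseline: keep the large chunk and the small module as separate chunks.
pub fn keep_separate(large: &Chunk, small: &Chunk) -> Vec<Chunk> {
    vec![large.clone(), small.clone()]
}

/// Merge both into one chunk loaded by every group that needs either part.
pub fn absorb_by_widening(large: &Chunk, small: &Chunk) -> Vec<Chunk> {
    vec![Chunk {
        size: large.size + small.size,
        bitmap: large.bitmap | small.bitmap,
    }]
}

/// Copy the small module into the large chunk and keep a second copy for
/// the groups that the large chunk does not cover.
pub fn duplicate_small(large: &Chunk, small: &Chunk) -> Vec<Chunk> {
    let mut chunks = vec![Chunk {
        size: large.size + small.size,
        bitmap: large.bitmap,
    }];
    let remaining = small.bitmap & !large.bitmap;
    if remaining != 0 {
        chunks.push(Chunk { size: small.size, bitmap: remaining });
    }
    chunks
}
```

Taking 60,000 bytes with bitmap {0,1} and 360 bytes with bitmap {0,1,2} as round stand-ins for the sizes in the discussion: keeping the chunks separate costs group 0 two requests (60,360 bytes); widening makes group 2 download 60,360 bytes instead of 360; duplicating gives groups 0 and 1 a single request of 60,360 bytes while group 2 still downloads only 360 bytes.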

@sokra
Member

sokra commented Mar 1, 2026

I think extracting the logic and adding test cases is valuable.

It would be cool if that were a separate PR.

Contributor Author

Yeah, good idea on that. v-work got a little aggressive with this one.
