[DTensor] Add Strategy B (local-shard transfer + redistribute)#1645

Open
vmoens wants to merge 6 commits into gh/vmoens/86/base from gh/vmoens/86/head

vmoens (Collaborator) commented Mar 9, 2026

[ghstack-poisoned]

github-actions bot (Contributor) commented Mar 9, 2026

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add Strategy B (local-shard transfer + redistribute)

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
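
The prefix rule above can be sketched in a few lines of Python. This is only an illustration of the check as described in the bot comment, not the workflow's actual implementation, which is not shown here. The names `PREFIX_LABELS` and `label_for_title` are hypothetical, and the mapping is transcribed from the table above.

```python
import re

# Prefix -> label pairs transcribed from the table in the bot comment.
# Keys are lowercased because matching is stated to be case-insensitive.
PREFIX_LABELS = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}

# A single bracketed token at the very start of the title.
_PREFIX_RE = re.compile(r"^\[([^\]]+)\]")


def label_for_title(title):
    """Return the label for a PR title, or None if the prefix is unknown."""
    match = _PREFIX_RE.match(title.strip())
    if match is None:
        return None
    return PREFIX_LABELS.get(match.group(1).strip().lower())
```

Under this sketch, `label_for_title("[DTensor] Add Strategy B (local-shard transfer + redistribute)")` returns `None`, which corresponds to the "Unknown or invalid prefix" error reported for this PR, while a title beginning with `[Distributed]` would map to the `Distributed` label.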


meta-cla bot added the CLA Signed label Mar 9, 2026

github-actions bot (Contributor) commented Mar 9, 2026

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 261. Improved: 27. Worsened: 9.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 40.8310μs 14.8348μs 67.4089 KOps/s 67.5143 KOps/s $\color{#d91a1a}-0.16\%$
test_plain_set_stack_nested 39.0800μs 15.3459μs 65.1638 KOps/s 65.4865 KOps/s $\color{#d91a1a}-0.49\%$
test_plain_set_nested_inplace 47.9210μs 16.7478μs 59.7092 KOps/s 59.0472 KOps/s $\color{#35bf28}+1.12\%$
test_plain_set_stack_nested_inplace 40.4500μs 16.6021μs 60.2332 KOps/s 59.9761 KOps/s $\color{#35bf28}+0.43\%$
test_items 30.2400μs 6.0411μs 165.5316 KOps/s 161.3198 KOps/s $\color{#35bf28}+2.61\%$
test_items_nested 0.5755ms 0.4728ms 2.1149 KOps/s 2.1383 KOps/s $\color{#d91a1a}-1.09\%$
test_items_nested_locked 0.5347ms 0.4769ms 2.0969 KOps/s 2.1171 KOps/s $\color{#d91a1a}-0.95\%$
test_items_nested_leaf 0.1530ms 97.8857μs 10.2160 KOps/s 10.1988 KOps/s $\color{#35bf28}+0.17\%$
test_items_stack_nested 0.6712ms 0.4725ms 2.1165 KOps/s 2.1510 KOps/s $\color{#d91a1a}-1.60\%$
test_items_stack_nested_leaf 0.1434ms 99.3607μs 10.0643 KOps/s 10.0774 KOps/s $\color{#d91a1a}-0.13\%$
test_items_stack_nested_locked 0.5660ms 0.4712ms 2.1221 KOps/s 2.1150 KOps/s $\color{#35bf28}+0.34\%$
test_keys 32.1010μs 4.2560μs 234.9626 KOps/s 235.1632 KOps/s $\color{#d91a1a}-0.09\%$
test_keys_nested 0.1797ms 0.1307ms 7.6504 KOps/s 7.6551 KOps/s $\color{#d91a1a}-0.06\%$
test_keys_nested_locked 2.1434ms 0.1391ms 7.1884 KOps/s 7.1888 KOps/s $-0.01\%$
test_keys_nested_leaf 0.1718ms 0.1213ms 8.2467 KOps/s 8.2974 KOps/s $\color{#d91a1a}-0.61\%$
test_keys_stack_nested 0.1825ms 0.1310ms 7.6357 KOps/s 7.6006 KOps/s $\color{#35bf28}+0.46\%$
test_keys_stack_nested_leaf 0.1681ms 0.1217ms 8.2200 KOps/s 8.2482 KOps/s $\color{#d91a1a}-0.34\%$
test_keys_stack_nested_locked 0.2068ms 0.1383ms 7.2281 KOps/s 7.2222 KOps/s $\color{#35bf28}+0.08\%$
test_values 6.5502μs 1.0279μs 972.8770 KOps/s 972.1296 KOps/s $\color{#35bf28}+0.08\%$
test_values_nested 89.9410μs 52.5243μs 19.0388 KOps/s 19.1603 KOps/s $\color{#d91a1a}-0.63\%$
test_values_nested_locked 87.7010μs 56.1097μs 17.8222 KOps/s 17.8892 KOps/s $\color{#d91a1a}-0.37\%$
test_values_nested_leaf 0.1085ms 60.6622μs 16.4847 KOps/s 16.6349 KOps/s $\color{#d91a1a}-0.90\%$
test_values_stack_nested 81.5210μs 53.2616μs 18.7753 KOps/s 18.9505 KOps/s $\color{#d91a1a}-0.92\%$
test_values_stack_nested_leaf 92.7010μs 60.6591μs 16.4856 KOps/s 16.6011 KOps/s $\color{#d91a1a}-0.70\%$
test_values_stack_nested_locked 0.1478ms 56.3667μs 17.7410 KOps/s 17.8504 KOps/s $\color{#d91a1a}-0.61\%$
test_membership 6.2617μs 0.8562μs 1.1680 MOps/s 1.1636 MOps/s $\color{#35bf28}+0.37\%$
test_membership_nested 29.3200μs 2.9130μs 343.2838 KOps/s 332.6158 KOps/s $\color{#35bf28}+3.21\%$
test_membership_nested_leaf 31.4100μs 2.9103μs 343.6096 KOps/s 335.8217 KOps/s $\color{#35bf28}+2.32\%$
test_membership_stacked_nested 34.0000μs 2.9296μs 341.3429 KOps/s 332.4589 KOps/s $\color{#35bf28}+2.67\%$
test_membership_stacked_nested_leaf 31.7610μs 2.8891μs 346.1293 KOps/s 335.2976 KOps/s $\color{#35bf28}+3.23\%$
test_membership_nested_last 34.3310μs 4.3805μs 228.2834 KOps/s 224.2675 KOps/s $\color{#35bf28}+1.79\%$
test_membership_nested_leaf_last 37.6610μs 4.3980μs 227.3757 KOps/s 226.4939 KOps/s $\color{#35bf28}+0.39\%$
test_membership_stacked_nested_last 30.5510μs 4.3657μs 229.0569 KOps/s 224.2438 KOps/s $\color{#35bf28}+2.15\%$
test_membership_stacked_nested_leaf_last 39.9700μs 4.3405μs 230.3881 KOps/s 224.4404 KOps/s $\color{#35bf28}+2.65\%$
test_nested_getleaf 53.9510μs 21.5542μs 46.3948 KOps/s 45.9244 KOps/s $\color{#35bf28}+1.02\%$
test_nested_get 48.3010μs 20.3879μs 49.0487 KOps/s 48.1736 KOps/s $\color{#35bf28}+1.82\%$
test_stacked_getleaf 62.0710μs 21.3328μs 46.8762 KOps/s 46.3531 KOps/s $\color{#35bf28}+1.13\%$
test_stacked_get 88.4210μs 20.6469μs 48.4335 KOps/s 48.1608 KOps/s $\color{#35bf28}+0.57\%$
test_nested_getitemleaf 51.0410μs 21.9925μs 45.4700 KOps/s 44.8499 KOps/s $\color{#35bf28}+1.38\%$
test_nested_getitem 48.4800μs 21.1663μs 47.2449 KOps/s 47.2499 KOps/s $\color{#d91a1a}-0.01\%$
test_stacked_getitemleaf 50.4610μs 22.1854μs 45.0748 KOps/s 45.5722 KOps/s $\color{#d91a1a}-1.09\%$
test_stacked_getitem 58.5610μs 21.0833μs 47.4308 KOps/s 47.8148 KOps/s $\color{#d91a1a}-0.80\%$
test_lock_nested 0.5533ms 0.4791ms 2.0875 KOps/s 2.0915 KOps/s $\color{#d91a1a}-0.19\%$
test_lock_stack_nested 0.5950ms 0.4831ms 2.0702 KOps/s 2.0556 KOps/s $\color{#35bf28}+0.71\%$
test_unlock_nested 0.4644ms 0.3944ms 2.5358 KOps/s 2.5614 KOps/s $\color{#d91a1a}-1.00\%$
test_unlock_stack_nested 0.4418ms 0.3890ms 2.5710 KOps/s 2.5277 KOps/s $\color{#35bf28}+1.72\%$
test_flatten_speed 0.1567ms 0.1230ms 8.1292 KOps/s 8.2036 KOps/s $\color{#d91a1a}-0.91\%$
test_unflatten_speed 0.6865ms 0.5715ms 1.7499 KOps/s 1.7404 KOps/s $\color{#35bf28}+0.55\%$
test_common_ops 0.8404ms 0.6921ms 1.4449 KOps/s 1.4377 KOps/s $\color{#35bf28}+0.50\%$
test_creation 0.1073ms 3.1448μs 317.9853 KOps/s 319.2008 KOps/s $\color{#d91a1a}-0.38\%$
test_creation_empty 43.5810μs 6.9614μs 143.6499 KOps/s 142.7651 KOps/s $\color{#35bf28}+0.62\%$
test_creation_nested_1 35.4910μs 11.5628μs 86.4845 KOps/s 85.5819 KOps/s $\color{#35bf28}+1.05\%$
test_creation_nested_2 47.1310μs 13.4064μs 74.5913 KOps/s 78.1229 KOps/s $\color{#d91a1a}-4.52\%$
test_creation_many_keys[10] 48.6110μs 21.0609μs 47.4815 KOps/s 47.1214 KOps/s $\color{#35bf28}+0.76\%$
test_creation_many_keys[50] 0.1615ms 89.9179μs 11.1213 KOps/s 10.9889 KOps/s $\color{#35bf28}+1.20\%$
test_creation_many_keys[100] 0.2189ms 0.1763ms 5.6707 KOps/s 5.5754 KOps/s $\color{#35bf28}+1.71\%$
test_creation_nested_many_keys[10] 93.9820μs 44.8615μs 22.2909 KOps/s 22.0988 KOps/s $\color{#35bf28}+0.87\%$
test_creation_nested_many_keys[50] 0.2452ms 0.1837ms 5.4447 KOps/s 5.3760 KOps/s $\color{#35bf28}+1.28\%$
test_clone 45.4010μs 13.4465μs 74.3689 KOps/s 74.0602 KOps/s $\color{#35bf28}+0.42\%$
test_getitem[int] 1.7286ms 15.1922μs 65.8234 KOps/s 59.1130 KOps/s $\textbf{\color{#35bf28}+11.35\%}$
test_getitem[slice_int] 0.1381ms 24.0305μs 41.6137 KOps/s 41.4779 KOps/s $\color{#35bf28}+0.33\%$
test_getitem[range] 0.1728ms 64.1073μs 15.5989 KOps/s 15.6695 KOps/s $\color{#d91a1a}-0.45\%$
test_getitem[tuple] 0.1427ms 24.0967μs 41.4994 KOps/s 41.9872 KOps/s $\color{#d91a1a}-1.16\%$
test_getitem[list] 0.2023ms 61.6473μs 16.2213 KOps/s 16.9668 KOps/s $\color{#d91a1a}-4.39\%$
test_setitem_dim[int] 45.0110μs 25.7041μs 38.9044 KOps/s 38.2092 KOps/s $\color{#35bf28}+1.82\%$
test_setitem_dim[slice_int] 67.8410μs 43.5706μs 22.9513 KOps/s 22.9555 KOps/s $\color{#d91a1a}-0.02\%$
test_setitem_dim[range] 0.1320ms 96.5965μs 10.3523 KOps/s 10.5793 KOps/s $\color{#d91a1a}-2.15\%$
test_setitem_dim[tuple] 77.5510μs 41.6925μs 23.9852 KOps/s 24.8871 KOps/s $\color{#d91a1a}-3.62\%$
test_setitem 57.0000μs 17.6548μs 56.6417 KOps/s 56.3790 KOps/s $\color{#35bf28}+0.47\%$
test_set 51.2600μs 16.7403μs 59.7359 KOps/s 59.0686 KOps/s $\color{#35bf28}+1.13\%$
test_set_shared 0.5065ms 0.2049ms 4.8800 KOps/s 4.8786 KOps/s $\color{#35bf28}+0.03\%$
test_update 0.2006ms 21.5987μs 46.2992 KOps/s 46.2204 KOps/s $\color{#35bf28}+0.17\%$
test_update_nested 70.8810μs 32.6490μs 30.6288 KOps/s 30.2807 KOps/s $\color{#35bf28}+1.15\%$
test_update__nested 0.5192ms 34.3329μs 29.1266 KOps/s 29.0457 KOps/s $\color{#35bf28}+0.28\%$
test_set_nested 64.5410μs 18.8922μs 52.9319 KOps/s 53.0377 KOps/s $\color{#d91a1a}-0.20\%$
test_set_nested_new 62.2010μs 23.5400μs 42.4809 KOps/s 41.8699 KOps/s $\color{#35bf28}+1.46\%$
test_select 75.8110μs 40.7737μs 24.5256 KOps/s 24.7706 KOps/s $\color{#d91a1a}-0.99\%$
test_select_nested 0.1062ms 74.7972μs 13.3695 KOps/s 13.3137 KOps/s $\color{#35bf28}+0.42\%$
test_exclude_nested 0.1455ms 91.9230μs 10.8787 KOps/s 10.6839 KOps/s $\color{#35bf28}+1.82\%$
test_empty[True] 0.4602ms 0.3986ms 2.5087 KOps/s 2.4982 KOps/s $\color{#35bf28}+0.42\%$
test_empty[False] 9.5152μs 1.3083μs 764.3272 KOps/s 762.6905 KOps/s $\color{#35bf28}+0.21\%$
test_to 0.1052ms 72.5123μs 13.7908 KOps/s 12.9554 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_to_nonblocking 1.2350ms 68.1303μs 14.6778 KOps/s 15.3044 KOps/s $\color{#d91a1a}-4.09\%$
test_unbind_speed 0.3967ms 0.3333ms 3.0005 KOps/s 2.9912 KOps/s $\color{#35bf28}+0.31\%$
test_unbind_speed_stack0 0.3883ms 0.3330ms 3.0028 KOps/s 2.9989 KOps/s $\color{#35bf28}+0.13\%$
test_unbind_speed_stack1 0.1032s 0.9222ms 1.0844 KOps/s 1.1807 KOps/s $\textbf{\color{#d91a1a}-8.16\%}$
test_split 1.2106ms 1.1388ms 878.0928 Ops/s 779.6477 Ops/s $\textbf{\color{#35bf28}+12.63\%}$
test_chunk 0.1035s 1.2053ms 829.6713 Ops/s 915.1955 Ops/s $\textbf{\color{#d91a1a}-9.34\%}$
test_to_cpu_blocking 29.1969ms 28.8476ms 34.6650 Ops/s 45.9701 Ops/s $\textbf{\color{#d91a1a}-24.59\%}$
test_to_cpu_global_sync 11.6476ms 11.5150ms 86.8431 Ops/s 86.5873 Ops/s $\color{#35bf28}+0.30\%$
test_to_cpu_event_sync 12.6843ms 12.4977ms 80.0148 Ops/s 80.1953 Ops/s $\color{#d91a1a}-0.23\%$
test_to_cpu_default 0.1163s 13.8070ms 72.4268 Ops/s 80.0505 Ops/s $\textbf{\color{#d91a1a}-9.52\%}$
test_consolidate[False-None] 4.3172ms 4.1652ms 240.0838 Ops/s 243.1612 Ops/s $\color{#d91a1a}-1.27\%$
test_consolidate[default-None] 2.2115ms 2.0268ms 493.3992 Ops/s 473.1098 Ops/s $\color{#35bf28}+4.29\%$
test_consolidate[reduce-overhead-None] 2.0156ms 1.9352ms 516.7549 Ops/s 489.6527 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_consolidate_njt[False-None] 0.1892s 10.0235ms 99.7654 Ops/s 113.4438 Ops/s $\textbf{\color{#d91a1a}-12.06\%}$
test_to[False-False-None] 2.2826ms 2.1031ms 475.4874 Ops/s 466.7357 Ops/s $\color{#35bf28}+1.88\%$
test_to[True-False-None] 2.1896ms 1.9497ms 512.8969 Ops/s 514.7273 Ops/s $\color{#d91a1a}-0.36\%$
test_to[within-False-None] 6.3893ms 6.1570ms 162.4171 Ops/s 164.0818 Ops/s $\color{#d91a1a}-1.01\%$
test_to[True-default-None] 9.4853ms 8.9255ms 112.0384 Ops/s 111.7918 Ops/s $\color{#35bf28}+0.22\%$
test_to_njt[False-False-None] 8.9485ms 8.5318ms 117.2086 Ops/s 116.5858 Ops/s $\color{#35bf28}+0.53\%$
test_to_njt[True-False-None] 7.0508ms 6.9323ms 144.2517 Ops/s 141.2861 Ops/s $\color{#35bf28}+2.10\%$
test_to_njt[within-False-None] 16.4654ms 15.6399ms 63.9389 Ops/s 63.2533 Ops/s $\color{#35bf28}+1.08\%$
test_creation[device0] 0.4045ms 0.1152ms 8.6829 KOps/s 8.6714 KOps/s $\color{#35bf28}+0.13\%$
test_creation_from_tensor 0.4119ms 0.1135ms 8.8090 KOps/s 8.7767 KOps/s $\color{#35bf28}+0.37\%$
test_add_one[memmap_tensor0] 0.4038ms 6.6553μs 150.2573 KOps/s 149.0716 KOps/s $\color{#35bf28}+0.80\%$
test_contiguous[memmap_tensor0] 16.9210μs 0.6718μs 1.4885 MOps/s 2.1289 MOps/s $\textbf{\color{#d91a1a}-30.08\%}$
test_stack[memmap_tensor0] 28.3810μs 4.6275μs 216.0977 KOps/s 214.3122 KOps/s $\color{#35bf28}+0.83\%$
test_memmaptd_index 0.9555ms 0.2659ms 3.7605 KOps/s 3.6752 KOps/s $\color{#35bf28}+2.32\%$
test_memmaptd_index_astensor 0.5209ms 0.3681ms 2.7169 KOps/s 2.6490 KOps/s $\color{#35bf28}+2.56\%$
test_memmaptd_index_op 0.9475ms 0.6234ms 1.6042 KOps/s 1.5820 KOps/s $\color{#35bf28}+1.40\%$
test_serialize_model 0.3095s 0.1612s 6.2025 Ops/s 7.2671 Ops/s $\textbf{\color{#d91a1a}-14.65\%}$
test_serialize_model_pickle 1.3702s 1.2210s 0.8190 Ops/s 0.8223 Ops/s $\color{#d91a1a}-0.39\%$
test_serialize_weights 0.1375s 0.1354s 7.3856 Ops/s 7.3501 Ops/s $\color{#35bf28}+0.48\%$
test_serialize_weights_returnearly 0.4256s 88.5827ms 11.2889 Ops/s 6.4677 Ops/s $\textbf{\color{#35bf28}+74.54\%}$
test_serialize_weights_pickle 1.3678s 1.2140s 0.8237 Ops/s 0.8226 Ops/s $\color{#35bf28}+0.13\%$
test_reshape_pytree 0.1988ms 32.7733μs 30.5127 KOps/s 30.6823 KOps/s $\color{#d91a1a}-0.55\%$
test_reshape_td 83.0710μs 45.7872μs 21.8402 KOps/s 20.8109 KOps/s $\color{#35bf28}+4.95\%$
test_view_pytree 0.2362ms 32.7573μs 30.5275 KOps/s 30.8110 KOps/s $\color{#d91a1a}-0.92\%$
test_view_td 92.1810μs 53.2472μs 18.7803 KOps/s 18.9675 KOps/s $\color{#d91a1a}-0.99\%$
test_unbind_pytree 0.2222ms 36.3372μs 27.5200 KOps/s 26.7975 KOps/s $\color{#35bf28}+2.70\%$
test_unbind_td 0.1835ms 49.7653μs 20.0943 KOps/s 19.7091 KOps/s $\color{#35bf28}+1.95\%$
test_split_pytree 0.2497ms 43.0775μs 23.2140 KOps/s 23.5903 KOps/s $\color{#d91a1a}-1.60\%$
test_split_td 0.1238ms 65.3657μs 15.2985 KOps/s 15.4282 KOps/s $\color{#d91a1a}-0.84\%$
test_add_pytree 0.2302ms 42.3795μs 23.5963 KOps/s 23.6347 KOps/s $\color{#d91a1a}-0.16\%$
test_add_td 98.3310μs 56.9676μs 17.5539 KOps/s 18.2794 KOps/s $\color{#d91a1a}-3.97\%$
test_compile_add_one_nested[tensordict-compile] 0.1920ms 0.1400ms 7.1405 KOps/s 6.4717 KOps/s $\textbf{\color{#35bf28}+10.33\%}$
test_compile_add_one_nested[tensordict-eager] 0.2834ms 0.2035ms 4.9134 KOps/s 5.0219 KOps/s $\color{#d91a1a}-2.16\%$
test_compile_add_one_nested[pytree-compile] 0.1401ms 0.1090ms 9.1771 KOps/s 9.1253 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_nested[pytree-eager] 0.4317ms 0.1800ms 5.5547 KOps/s 5.4334 KOps/s $\color{#35bf28}+2.23\%$
test_compile_copy_nested[tensordict-compile] 0.2482ms 10.3509μs 96.6099 KOps/s 97.1886 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_copy_nested[tensordict-eager] 89.5910μs 54.8854μs 18.2198 KOps/s 18.4989 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_copy_nested[pytree-compile] 0.1179ms 9.9171μs 100.8355 KOps/s 102.6630 KOps/s $\color{#d91a1a}-1.78\%$
test_compile_copy_nested[pytree-eager] 0.4666ms 70.3507μs 14.2145 KOps/s 14.5035 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_add_one_flat[tensordict-compile] 0.4530ms 0.1762ms 5.6744 KOps/s 5.3945 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_compile_add_one_flat[tensordict-eager] 0.3378ms 0.2881ms 3.4706 KOps/s 3.5269 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_add_one_flat[tensorclass-compile] 0.2100ms 0.1174ms 8.5207 KOps/s 8.3424 KOps/s $\color{#35bf28}+2.14\%$
test_compile_add_one_flat[tensorclass-eager] 0.1251ms 78.4360μs 12.7492 KOps/s 12.9381 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_add_one_flat[pytree-compile] 0.2467ms 0.1580ms 6.3274 KOps/s 6.2003 KOps/s $\color{#35bf28}+2.05\%$
test_compile_add_one_flat[pytree-eager] 0.8048ms 0.5225ms 1.9137 KOps/s 1.8489 KOps/s $\color{#35bf28}+3.51\%$
test_compile_add_self_flat[tensordict-eager] 0.5026ms 0.3364ms 2.9726 KOps/s 2.9599 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_self_flat[tensordict-compile] 0.2242ms 0.1790ms 5.5857 KOps/s 5.1578 KOps/s $\textbf{\color{#35bf28}+8.30\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1531ms 94.3356μs 10.6005 KOps/s 11.1603 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1952ms 0.1193ms 8.3830 KOps/s 8.0047 KOps/s $\color{#35bf28}+4.73\%$
test_compile_add_self_flat[pytree-eager] 0.6422ms 0.4341ms 2.3036 KOps/s 2.2142 KOps/s $\color{#35bf28}+4.04\%$
test_compile_add_self_flat[pytree-compile] 0.3415ms 0.1588ms 6.2955 KOps/s 6.1589 KOps/s $\color{#35bf28}+2.22\%$
test_compile_copy_flat[tensordict-compile] 0.1228ms 13.5385μs 73.8636 KOps/s 74.7487 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_copy_flat[tensordict-eager] 77.1210μs 41.8400μs 23.9006 KOps/s 23.7841 KOps/s $\color{#35bf28}+0.49\%$
test_compile_copy_flat[pytree-compile] 0.1955ms 10.7142μs 93.3344 KOps/s 93.1592 KOps/s $\color{#35bf28}+0.19\%$
test_compile_copy_flat[pytree-eager] 0.4142ms 52.7737μs 18.9488 KOps/s 18.9803 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_assign_and_add[tensordict-compile] 2.0090ms 0.1747ms 5.7252 KOps/s 5.4883 KOps/s $\color{#35bf28}+4.32\%$
test_compile_assign_and_add[tensordict-eager] 3.3955ms 3.3121ms 301.9198 Ops/s 300.6912 Ops/s $\color{#35bf28}+0.41\%$
test_compile_assign_and_add[pytree-compile] 1.9608ms 0.1615ms 6.1922 KOps/s 6.1083 KOps/s $\color{#35bf28}+1.37\%$
test_compile_assign_and_add[pytree-eager] 2.9413ms 2.7987ms 357.3088 Ops/s 348.4650 Ops/s $\color{#35bf28}+2.54\%$
test_compile_indexing[tensor-tensordict-compile] 0.2004ms 0.1093ms 9.1526 KOps/s 8.8369 KOps/s $\color{#35bf28}+3.57\%$
test_compile_indexing[tensor-tensordict-eager] 0.3258ms 74.5468μs 13.4144 KOps/s 13.5680 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1370ms 96.1646μs 10.3988 KOps/s 10.2660 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2522ms 44.7744μs 22.3342 KOps/s 20.9584 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_compile_indexing[tensor-pytree-compile] 0.1376ms 96.9670μs 10.3128 KOps/s 10.2982 KOps/s $\color{#35bf28}+0.14\%$
test_compile_indexing[tensor-pytree-eager] 0.2510ms 44.6930μs 22.3748 KOps/s 21.0247 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_compile_indexing[slice-tensordict-compile] 0.2055ms 56.8838μs 17.5797 KOps/s 16.4919 KOps/s $\textbf{\color{#35bf28}+6.60\%}$
test_compile_indexing[slice-tensordict-eager] 0.2141ms 27.8012μs 35.9697 KOps/s 35.9414 KOps/s $\color{#35bf28}+0.08\%$
test_compile_indexing[slice-tensorclass-compile] 0.1297ms 44.5715μs 22.4359 KOps/s 22.1979 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[slice-tensorclass-eager] 0.2679ms 22.7106μs 44.0323 KOps/s 43.9430 KOps/s $\color{#35bf28}+0.20\%$
test_compile_indexing[slice-pytree-compile] 0.2162ms 45.1469μs 22.1499 KOps/s 21.7634 KOps/s $\color{#35bf28}+1.78\%$
test_compile_indexing[slice-pytree-eager] 0.2528ms 22.5417μs 44.3623 KOps/s 44.4516 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[int-tensordict-compile] 0.1045ms 57.1098μs 17.5101 KOps/s 16.6028 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_compile_indexing[int-tensordict-eager] 0.2257ms 27.7627μs 36.0196 KOps/s 36.7616 KOps/s $\color{#d91a1a}-2.02\%$
test_compile_indexing[int-tensorclass-compile] 82.9110μs 45.0408μs 22.2021 KOps/s 21.3535 KOps/s $\color{#35bf28}+3.97\%$
test_compile_indexing[int-tensorclass-eager] 0.2735ms 22.4250μs 44.5930 KOps/s 44.3181 KOps/s $\color{#35bf28}+0.62\%$
test_compile_indexing[int-pytree-compile] 87.0910μs 45.9542μs 21.7608 KOps/s 22.2291 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_indexing[int-pytree-eager] 0.2572ms 22.6578μs 44.1350 KOps/s 44.3582 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_replace[single-eager] 0.1037ms 47.6052μs 21.0061 KOps/s 20.4431 KOps/s $\color{#35bf28}+2.75\%$
test_compile_replace[single-compile] 0.1789ms 0.1052ms 9.5065 KOps/s 9.0018 KOps/s $\textbf{\color{#35bf28}+5.61\%}$
test_compile_replace[multi-eager] 0.6965ms 0.5664ms 1.7657 KOps/s 1.7841 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_replace[multi-compile] 0.2909ms 0.1112ms 8.9901 KOps/s 8.8860 KOps/s $\color{#35bf28}+1.17\%$
test_compile_tc_getattr_20[eager] 0.2317ms 0.1677ms 5.9637 KOps/s 5.9238 KOps/s $\color{#35bf28}+0.67\%$
test_compile_tc_getattr_20[compile] 0.1839ms 0.1210ms 8.2627 KOps/s 8.2380 KOps/s $\color{#35bf28}+0.30\%$
test_compile_clone_shallow[20-eager] 52.5200μs 19.3137μs 51.7768 KOps/s 51.1365 KOps/s $\color{#35bf28}+1.25\%$
test_compile_clone_shallow[20-compile] 62.3210μs 11.6380μs 85.9252 KOps/s 87.9237 KOps/s $\color{#d91a1a}-2.27\%$
test_compile_clone_shallow[40-eager] 66.4710μs 33.2598μs 30.0663 KOps/s 29.0132 KOps/s $\color{#35bf28}+3.63\%$
test_compile_clone_shallow[40-compile] 51.4810μs 12.5983μs 79.3758 KOps/s 69.4117 KOps/s $\textbf{\color{#35bf28}+14.36\%}$
test_compile_clone_shallow[80-eager] 0.1260ms 63.3705μs 15.7802 KOps/s 15.7715 KOps/s $\color{#35bf28}+0.06\%$
test_compile_clone_shallow[80-compile] 88.4910μs 15.6438μs 63.9231 KOps/s 66.3864 KOps/s $\color{#d91a1a}-3.71\%$
test_compile_update_inplace[eager] 0.1045ms 58.3853μs 17.1276 KOps/s 16.8882 KOps/s $\color{#35bf28}+1.42\%$
test_compile_update_inplace[compile] 0.2411ms 0.1415ms 7.0668 KOps/s 6.9357 KOps/s $\color{#35bf28}+1.89\%$
test_mod_add[eager] 0.1066ms 51.0324μs 19.5954 KOps/s 20.6186 KOps/s $\color{#d91a1a}-4.96\%$
test_mod_add[compile] 0.3732ms 0.1090ms 9.1778 KOps/s 9.2736 KOps/s $\color{#d91a1a}-1.03\%$
test_mod_add[compile-overhead] 0.5479ms 0.1502ms 6.6566 KOps/s 6.5746 KOps/s $\color{#35bf28}+1.25\%$
test_mod_wrap[eager] 0.4512ms 0.2938ms 3.4034 KOps/s 3.3722 KOps/s $\color{#35bf28}+0.93\%$
test_mod_wrap[compile] 0.5159ms 0.3612ms 2.7688 KOps/s 2.8004 KOps/s $\color{#d91a1a}-1.13\%$
test_mod_wrap[compile-overhead] 7.2748ms 3.9949ms 250.3201 Ops/s 248.1475 Ops/s $\color{#35bf28}+0.88\%$
test_mod_wrap_and_backward[eager] 1.5977ms 1.4949ms 668.9542 Ops/s 597.8570 Ops/s $\textbf{\color{#35bf28}+11.89\%}$
test_mod_wrap_and_backward[compile] 1.8580ms 1.4484ms 690.3983 Ops/s 681.6746 Ops/s $\color{#35bf28}+1.28\%$
test_mod_wrap_and_backward[compile-overhead] 1.2365ms 0.8879ms 1.1262 KOps/s 1.1076 KOps/s $\color{#35bf28}+1.68\%$
test_seq_add[eager] 0.2053ms 0.1543ms 6.4801 KOps/s 6.5325 KOps/s $\color{#d91a1a}-0.80\%$
test_seq_add[compile] 0.3669ms 0.1128ms 8.8624 KOps/s 8.0063 KOps/s $\textbf{\color{#35bf28}+10.69\%}$
test_seq_add[compile-overhead] 0.3189ms 0.1537ms 6.5068 KOps/s 5.8220 KOps/s $\textbf{\color{#35bf28}+11.76\%}$
test_seq_wrap[eager] 0.6164ms 0.5197ms 1.9241 KOps/s 1.8061 KOps/s $\textbf{\color{#35bf28}+6.53\%}$
test_seq_wrap[compile] 0.4512ms 0.3640ms 2.7469 KOps/s 2.5536 KOps/s $\textbf{\color{#35bf28}+7.57\%}$
test_seq_wrap[compile-overhead] 0.3508ms 0.2655ms 3.7664 KOps/s 3.5734 KOps/s $\textbf{\color{#35bf28}+5.40\%}$
test_func_call_runtime[False-eager] 0.9035ms 0.8419ms 1.1878 KOps/s 1.1099 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_func_call_runtime[False-compile] 1.0953ms 0.9108ms 1.0979 KOps/s 1.0830 KOps/s $\color{#35bf28}+1.38\%$
test_func_call_runtime[False-compile-overhead] 0.5168ms 0.4644ms 2.1531 KOps/s 2.1435 KOps/s $\color{#35bf28}+0.45\%$
test_func_call_runtime[True-eager] 1.1980ms 1.0776ms 927.9459 Ops/s 905.8089 Ops/s $\color{#35bf28}+2.44\%$
test_func_call_runtime[True-compile] 1.0164ms 0.9206ms 1.0862 KOps/s 1.0300 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_func_call_runtime[True-compile-overhead] 0.5907ms 0.4773ms 2.0950 KOps/s 2.0554 KOps/s $\color{#35bf28}+1.93\%$
test_func_call_cm_runtime[False-eager] 0.9584ms 0.8701ms 1.1493 KOps/s 1.1786 KOps/s $\color{#d91a1a}-2.48\%$
test_func_call_cm_runtime[False-compile] 1.0282ms 0.9181ms 1.0892 KOps/s 1.0829 KOps/s $\color{#35bf28}+0.58\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5114ms 0.4657ms 2.1474 KOps/s 2.1156 KOps/s $\color{#35bf28}+1.50\%$
test_func_call_cm_runtime[True-eager] 1.3103ms 1.2195ms 820.0415 Ops/s 803.4360 Ops/s $\color{#35bf28}+2.07\%$
test_func_call_cm_runtime[True-compile] 1.0097ms 0.9621ms 1.0393 KOps/s 1.0294 KOps/s $\color{#35bf28}+0.96\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5634ms 0.5101ms 1.9605 KOps/s 1.9050 KOps/s $\color{#35bf28}+2.91\%$
test_vmap_func_call_cm_runtime[eager] 2.8699ms 2.3677ms 422.3565 Ops/s 413.7972 Ops/s $\color{#35bf28}+2.07\%$
test_vmap_func_call_cm_runtime[compile] 1.0843ms 0.9788ms 1.0216 KOps/s 966.8806 Ops/s $\textbf{\color{#35bf28}+5.66\%}$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6301ms 0.5175ms 1.9324 KOps/s 1.9005 KOps/s $\color{#35bf28}+1.68\%$
test_distributed 0.9960ms 0.1530ms 6.5362 KOps/s 6.4920 KOps/s $\color{#35bf28}+0.68\%$
test_tdmodule 0.2846ms 27.6568μs 36.1574 KOps/s 36.5537 KOps/s $\color{#d91a1a}-1.08\%$
test_tdmodule_dispatch 74.5010μs 44.6028μs 22.4201 KOps/s 22.6879 KOps/s $\color{#d91a1a}-1.18\%$
test_tdseq 45.4000μs 26.5700μs 37.6364 KOps/s 36.9763 KOps/s $\color{#35bf28}+1.79\%$
test_tdseq_dispatch 86.0710μs 46.2315μs 21.6303 KOps/s 20.7614 KOps/s $\color{#35bf28}+4.19\%$
test_instantiation_functorch 2.2034ms 2.0963ms 477.0265 Ops/s 477.7465 Ops/s $\color{#d91a1a}-0.15\%$
test_exec_functorch 0.2425ms 0.1800ms 5.5557 KOps/s 5.5651 KOps/s $\color{#d91a1a}-0.17\%$
test_exec_functional_call 0.2232ms 0.1617ms 6.1825 KOps/s 6.1607 KOps/s $\color{#35bf28}+0.35\%$
test_exec_td_decorator 0.4740ms 0.2395ms 4.1749 KOps/s 4.1850 KOps/s $\color{#d91a1a}-0.24\%$
test_vmap_mlp_speed_decorator[True-True] 1.0259ms 0.8265ms 1.2100 KOps/s 1.1925 KOps/s $\color{#35bf28}+1.47\%$
test_vmap_mlp_speed_decorator[True-False] 1.0003ms 0.8270ms 1.2092 KOps/s 1.1894 KOps/s $\color{#35bf28}+1.67\%$
test_vmap_mlp_speed_decorator[False-True] 0.9098ms 0.7132ms 1.4021 KOps/s 1.3828 KOps/s $\color{#35bf28}+1.40\%$
test_vmap_mlp_speed_decorator[False-False] 0.8857ms 0.7115ms 1.4056 KOps/s 1.3860 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_transformer_speed_decorator[True-True] 21.4595ms 20.5890ms 48.5697 Ops/s 48.2602 Ops/s $\color{#35bf28}+0.64\%$
test_vmap_transformer_speed_decorator[True-False] 21.2069ms 20.5542ms 48.6519 Ops/s 48.2158 Ops/s $\color{#35bf28}+0.90\%$
test_vmap_transformer_speed_decorator[False-True] 21.0956ms 20.3503ms 49.1393 Ops/s 48.7010 Ops/s $\color{#35bf28}+0.90\%$
test_vmap_transformer_speed_decorator[False-False] 20.8946ms 20.4087ms 48.9987 Ops/s 48.7588 Ops/s $\color{#35bf28}+0.49\%$
test_to_module_speed[True] 1.5837ms 1.4810ms 675.2342 Ops/s 675.3895 Ops/s $\color{#d91a1a}-0.02\%$
test_to_module_speed[False] 1.5746ms 1.4653ms 682.4683 Ops/s 691.0312 Ops/s $\color{#d91a1a}-1.24\%$
test_tc_init 68.8610μs 44.6479μs 22.3975 KOps/s 22.0423 KOps/s $\color{#35bf28}+1.61\%$
test_tc_init_tensor_only 36.9210μs 9.7133μs 102.9514 KOps/s 103.1590 KOps/s $\color{#d91a1a}-0.20\%$
test_tc_init_nested 0.1225ms 87.4584μs 11.4340 KOps/s 11.2940 KOps/s $\color{#35bf28}+1.24\%$
test_tc_init_many_fields 59.2210μs 16.3049μs 61.3314 KOps/s 62.4550 KOps/s $\color{#d91a1a}-1.80\%$
test_tc_first_layer_tensor 29.7710μs 1.7953μs 557.0174 KOps/s 557.7563 KOps/s $\color{#d91a1a}-0.13\%$
test_tc_first_layer_tensor_only 3.1431μs 0.4046μs 2.4714 MOps/s 2.5396 MOps/s $\color{#d91a1a}-2.69\%$
test_tc_first_layer_tensor_set 34.7600μs 3.9364μs 254.0416 KOps/s 252.8614 KOps/s $\color{#35bf28}+0.47\%$
test_tc_first_layer_tensor_only_set 23.4610μs 3.2527μs 307.4349 KOps/s 306.1626 KOps/s $\color{#35bf28}+0.42\%$
test_tc_first_layer_nontensor 8.0168ms 6.7822μs 147.4445 KOps/s 158.5641 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_tc_second_layer_tensor 26.7010μs 4.4116μs 226.6744 KOps/s 224.8003 KOps/s $\color{#35bf28}+0.83\%$
test_tc_second_layer_nontensor 38.7600μs 9.1365μs 109.4505 KOps/s 112.4267 KOps/s $\color{#d91a1a}-2.65\%$
test_unbind 0.2499s 14.2243ms 70.3020 Ops/s 56.3898 Ops/s $\textbf{\color{#35bf28}+24.67\%}$
test_full_like 7.5611ms 4.3955ms 227.5058 Ops/s 227.9905 Ops/s $\color{#d91a1a}-0.21\%$
test_zeros_like 5.0579ms 4.3672ms 228.9775 Ops/s 137.3324 Ops/s $\textbf{\color{#35bf28}+66.73\%}$
test_ones_like 4.9305ms 4.3728ms 228.6854 Ops/s 228.5342 Ops/s $\color{#35bf28}+0.07\%$
test_clone 6.6695ms 6.4224ms 155.7061 Ops/s 155.7894 Ops/s $\color{#d91a1a}-0.05\%$
test_squeeze 84.3110μs 13.8840μs 72.0253 KOps/s 68.6368 KOps/s $\color{#35bf28}+4.94\%$
test_unsqueeze 0.1626ms 0.1120ms 8.9263 KOps/s 9.0217 KOps/s $\color{#d91a1a}-1.06\%$
test_split 0.2493ms 0.1855ms 5.3908 KOps/s 5.3784 KOps/s $\color{#35bf28}+0.23\%$
test_permute 0.2488ms 0.2047ms 4.8852 KOps/s 4.8347 KOps/s $\color{#35bf28}+1.05\%$
test_stack 51.2219ms 50.8967ms 19.6477 Ops/s 19.6529 Ops/s $\color{#d91a1a}-0.03\%$
test_cat 51.1477ms 50.8561ms 19.6633 Ops/s 19.7246 Ops/s $\color{#d91a1a}-0.31\%$
test_sequential_tensordict 0.2970ms 0.2187ms 4.5719 KOps/s 4.3002 KOps/s $\textbf{\color{#35bf28}+6.32\%}$
test_sequential_graph_module 0.1671ms 0.1186ms 8.4331 KOps/s 7.9809 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_nested_tensordict 0.4367ms 0.2910ms 3.4364 KOps/s 3.4615 KOps/s $\color{#d91a1a}-0.73\%$
test_nested_graph_module 0.1736ms 0.1305ms 7.6657 KOps/s 7.5708 KOps/s $\color{#35bf28}+1.25\%$
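
For readers checking individual rows, the Change column appears to be the relative difference between the "Ops" and "Ops on Repo HEAD" columns. The Python sketch below reproduces that arithmetic under two assumptions that the bot does not state: that the cells use the units `Ops/s`, `KOps/s`, and `MOps/s`, and that a row is bolded and counted as improved or worsened once the change reaches 5% in magnitude. The 5% threshold is inferred from which rows are bold in the table above; the function names are hypothetical.

```python
# Scale factors for the throughput units seen in the table.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '1.0216 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def pct_change(ops, ops_head):
    """Relative change of this PR's throughput against repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change, threshold=5.0):
    """Bucket a change; the 5% threshold is an inference, not a documented value."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"
```

For example, the `test_vmap_func_call_cm_runtime[compile]` row mixes units (1.0216 KOps/s against 966.8806 Ops/s), and `pct_change(parse_ops("1.0216 KOps/s"), parse_ops("966.8806 Ops/s"))` rounds to the +5.66% shown in that row.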

github-actions bot (Contributor) commented Mar 9, 2026

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 261. Improved: 9. Worsened: 13.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.9010μs 14.8839μs 67.1867 KOps/s 67.4432 KOps/s $\color{#d91a1a}-0.38\%$
test_plain_set_stack_nested 36.5600μs 15.4389μs 64.7714 KOps/s 65.7056 KOps/s $\color{#d91a1a}-1.42\%$
test_plain_set_nested_inplace 46.9800μs 16.7791μs 59.5980 KOps/s 59.0559 KOps/s $\color{#35bf28}+0.92\%$
test_plain_set_stack_nested_inplace 60.0010μs 16.7395μs 59.7389 KOps/s 59.2602 KOps/s $\color{#35bf28}+0.81\%$
test_items 38.1200μs 6.0611μs 164.9863 KOps/s 167.4624 KOps/s $\color{#d91a1a}-1.48\%$
test_items_nested 0.6426ms 0.4740ms 2.1095 KOps/s 2.1218 KOps/s $\color{#d91a1a}-0.58\%$
test_items_nested_locked 0.5586ms 0.4734ms 2.1124 KOps/s 2.1181 KOps/s $\color{#d91a1a}-0.27\%$
test_items_nested_leaf 0.1575ms 97.6795μs 10.2376 KOps/s 10.0610 KOps/s $\color{#35bf28}+1.75\%$
test_items_stack_nested 0.5521ms 0.4649ms 2.1512 KOps/s 2.1403 KOps/s $\color{#35bf28}+0.51\%$
test_items_stack_nested_leaf 0.1705ms 98.1917μs 10.1842 KOps/s 10.3003 KOps/s $\color{#d91a1a}-1.13\%$
test_items_stack_nested_locked 0.5149ms 0.4727ms 2.1153 KOps/s 2.1102 KOps/s $\color{#35bf28}+0.24\%$
test_keys 22.6810μs 4.1751μs 239.5156 KOps/s 235.1571 KOps/s $\color{#35bf28}+1.85\%$
test_keys_nested 0.1778ms 0.1295ms 7.7245 KOps/s 7.7816 KOps/s $\color{#d91a1a}-0.73\%$
test_keys_nested_locked 2.1315ms 0.1389ms 7.1972 KOps/s 7.2274 KOps/s $\color{#d91a1a}-0.42\%$
test_keys_nested_leaf 0.1580ms 0.1205ms 8.2995 KOps/s 8.3017 KOps/s $\color{#d91a1a}-0.03\%$
test_keys_stack_nested 0.1754ms 0.1311ms 7.6283 KOps/s 7.6904 KOps/s $\color{#d91a1a}-0.81\%$
test_keys_stack_nested_leaf 0.1582ms 0.1218ms 8.2104 KOps/s 8.2782 KOps/s $\color{#d91a1a}-0.82\%$
test_keys_stack_nested_locked 0.1884ms 0.1378ms 7.2574 KOps/s 7.3132 KOps/s $\color{#d91a1a}-0.76\%$
test_values 6.3662μs 1.0166μs 983.6594 KOps/s 989.3225 KOps/s $\color{#d91a1a}-0.57\%$
test_values_nested 84.7910μs 52.9648μs 18.8805 KOps/s 19.0431 KOps/s $\color{#d91a1a}-0.85\%$
test_values_nested_locked 96.6720μs 55.7532μs 17.9362 KOps/s 18.0071 KOps/s $\color{#d91a1a}-0.39\%$
test_values_nested_leaf 91.5410μs 60.2657μs 16.5932 KOps/s 16.7067 KOps/s $\color{#d91a1a}-0.68\%$
test_values_stack_nested 84.3510μs 52.7572μs 18.9547 KOps/s 18.8732 KOps/s $\color{#35bf28}+0.43\%$
test_values_stack_nested_leaf 85.9520μs 60.7314μs 16.4659 KOps/s 16.6702 KOps/s $\color{#d91a1a}-1.23\%$
test_values_stack_nested_locked 0.1266ms 55.4184μs 18.0445 KOps/s 18.0318 KOps/s $\color{#35bf28}+0.07\%$
test_membership 5.6433μs 0.8170μs 1.2241 MOps/s 1.1832 MOps/s $\color{#35bf28}+3.45\%$
test_membership_nested 29.9910μs 2.8632μs 349.2614 KOps/s 351.5292 KOps/s $\color{#d91a1a}-0.65\%$
test_membership_nested_leaf 25.9810μs 2.8814μs 347.0578 KOps/s 348.4387 KOps/s $\color{#d91a1a}-0.40\%$
test_membership_stacked_nested 25.2810μs 2.8844μs 346.6866 KOps/s 345.6680 KOps/s $\color{#35bf28}+0.29\%$
test_membership_stacked_nested_leaf 34.5200μs 2.9078μs 343.8972 KOps/s 347.6154 KOps/s $\color{#d91a1a}-1.07\%$
test_membership_nested_last 31.8200μs 4.2493μs 235.3310 KOps/s 228.6653 KOps/s $\color{#35bf28}+2.92\%$
test_membership_nested_leaf_last 31.1410μs 4.4072μs 226.9024 KOps/s 229.5143 KOps/s $\color{#d91a1a}-1.14\%$
test_membership_stacked_nested_last 28.3110μs 4.3601μs 229.3523 KOps/s 232.5882 KOps/s $\color{#d91a1a}-1.39\%$
test_membership_stacked_nested_leaf_last 31.9910μs 4.3922μs 227.6749 KOps/s 229.9623 KOps/s $\color{#d91a1a}-0.99\%$
test_nested_getleaf 47.7110μs 21.9262μs 45.6076 KOps/s 47.2950 KOps/s $\color{#d91a1a}-3.57\%$
test_nested_get 46.5510μs 20.7590μs 48.1719 KOps/s 49.8981 KOps/s $\color{#d91a1a}-3.46\%$
test_stacked_getleaf 50.1110μs 21.5943μs 46.3086 KOps/s 47.5911 KOps/s $\color{#d91a1a}-2.69\%$
test_stacked_get 48.1000μs 20.7672μs 48.1529 KOps/s 49.9413 KOps/s $\color{#d91a1a}-3.58\%$
test_nested_getitemleaf 47.8110μs 21.9882μs 45.4789 KOps/s 45.3745 KOps/s $\color{#35bf28}+0.23\%$
test_nested_getitem 46.7700μs 20.9438μs 47.7469 KOps/s 47.8915 KOps/s $\color{#d91a1a}-0.30\%$
test_stacked_getitemleaf 48.8010μs 22.0084μs 45.4371 KOps/s 45.8756 KOps/s $\color{#d91a1a}-0.96\%$
test_stacked_getitem 45.5310μs 20.9546μs 47.7223 KOps/s 48.6569 KOps/s $\color{#d91a1a}-1.92\%$
test_lock_nested 0.5780ms 0.4866ms 2.0552 KOps/s 2.1037 KOps/s $\color{#d91a1a}-2.31\%$
test_lock_stack_nested 0.5612ms 0.4902ms 2.0401 KOps/s 2.0662 KOps/s $\color{#d91a1a}-1.26\%$
test_unlock_nested 0.4852ms 0.3985ms 2.5094 KOps/s 2.5933 KOps/s $\color{#d91a1a}-3.24\%$
test_unlock_stack_nested 0.4312ms 0.3986ms 2.5085 KOps/s 2.5623 KOps/s $\color{#d91a1a}-2.10\%$
test_flatten_speed 0.1667ms 0.1214ms 8.2390 KOps/s 8.1984 KOps/s $\color{#35bf28}+0.50\%$
test_unflatten_speed 0.6438ms 0.5667ms 1.7647 KOps/s 1.7631 KOps/s $\color{#35bf28}+0.09\%$
test_common_ops 0.8980ms 0.7158ms 1.3971 KOps/s 1.4286 KOps/s $\color{#d91a1a}-2.21\%$
test_creation 79.1410μs 3.1149μs 321.0403 KOps/s 319.6905 KOps/s $\color{#35bf28}+0.42\%$
test_creation_empty 28.1410μs 7.0308μs 142.2311 KOps/s 143.5621 KOps/s $\color{#d91a1a}-0.93\%$
test_creation_nested_1 44.0210μs 11.6346μs 85.9502 KOps/s 86.7159 KOps/s $\color{#d91a1a}-0.88\%$
test_creation_nested_2 53.5500μs 13.3977μs 74.6397 KOps/s 75.3641 KOps/s $\color{#d91a1a}-0.96\%$
test_creation_many_keys[10] 50.8010μs 20.9854μs 47.6522 KOps/s 47.6802 KOps/s $\color{#d91a1a}-0.06\%$
test_creation_many_keys[50] 0.1352ms 88.8833μs 11.2507 KOps/s 10.9389 KOps/s $\color{#35bf28}+2.85\%$
test_creation_many_keys[100] 0.2062ms 0.1746ms 5.7263 KOps/s 5.5315 KOps/s $\color{#35bf28}+3.52\%$
test_creation_nested_many_keys[10] 76.8010μs 44.6984μs 22.3722 KOps/s 22.1730 KOps/s $\color{#35bf28}+0.90\%$
test_creation_nested_many_keys[50] 0.2405ms 0.1816ms 5.5072 KOps/s 5.3760 KOps/s $\color{#35bf28}+2.44\%$
test_clone 40.8910μs 13.2657μs 75.3824 KOps/s 73.9413 KOps/s $\color{#35bf28}+1.95\%$
test_getitem[int] 1.5386ms 15.3883μs 64.9845 KOps/s 60.9023 KOps/s $\textbf{\color{#35bf28}+6.70\%}$
test_getitem[slice_int] 0.1421ms 24.8367μs 40.2629 KOps/s 41.3750 KOps/s $\color{#d91a1a}-2.69\%$
test_getitem[range] 0.1886ms 63.6937μs 15.7001 KOps/s 15.7303 KOps/s $\color{#d91a1a}-0.19\%$
test_getitem[tuple] 0.1374ms 24.1731μs 41.3683 KOps/s 41.9732 KOps/s $\color{#d91a1a}-1.44\%$
test_getitem[list] 0.1924ms 57.4966μs 17.3923 KOps/s 17.2610 KOps/s $\color{#35bf28}+0.76\%$
test_setitem_dim[int] 45.6710μs 25.4313μs 39.3217 KOps/s 38.1948 KOps/s $\color{#35bf28}+2.95\%$
test_setitem_dim[slice_int] 66.7820μs 42.2920μs 23.6451 KOps/s 23.1034 KOps/s $\color{#35bf28}+2.34\%$
test_setitem_dim[range] 0.1171ms 94.1259μs 10.6241 KOps/s 10.4827 KOps/s $\color{#35bf28}+1.35\%$
test_setitem_dim[tuple] 61.7810μs 38.9506μs 25.6736 KOps/s 24.8449 KOps/s $\color{#35bf28}+3.34\%$
test_setitem 48.4910μs 17.5960μs 56.8311 KOps/s 56.6129 KOps/s $\color{#35bf28}+0.39\%$
test_set 44.6910μs 16.9330μs 59.0562 KOps/s 58.5708 KOps/s $\color{#35bf28}+0.83\%$
test_set_shared 0.4980ms 0.2047ms 4.8851 KOps/s 4.8773 KOps/s $\color{#35bf28}+0.16\%$
test_update 0.3585ms 21.8236μs 45.8221 KOps/s 45.7303 KOps/s $\color{#35bf28}+0.20\%$
test_update_nested 74.0310μs 33.6854μs 29.6864 KOps/s 29.8426 KOps/s $\color{#d91a1a}-0.52\%$
test_update__nested 0.4341ms 34.1891μs 29.2491 KOps/s 28.7883 KOps/s $\color{#35bf28}+1.60\%$
test_set_nested 54.0310μs 18.7495μs 53.3346 KOps/s 52.8231 KOps/s $\color{#35bf28}+0.97\%$
test_set_nested_new 60.0410μs 23.7928μs 42.0296 KOps/s 41.6141 KOps/s $\color{#35bf28}+1.00\%$
test_select 81.5820μs 40.5694μs 24.6491 KOps/s 23.8753 KOps/s $\color{#35bf28}+3.24\%$
test_select_nested 0.1025ms 73.8889μs 13.5338 KOps/s 13.3498 KOps/s $\color{#35bf28}+1.38\%$
test_exclude_nested 0.1270ms 91.0455μs 10.9835 KOps/s 10.8830 KOps/s $\color{#35bf28}+0.92\%$
test_empty[True] 0.4868ms 0.3987ms 2.5079 KOps/s 2.5194 KOps/s $\color{#d91a1a}-0.46\%$
test_empty[False] 7.2300μs 1.3004μs 769.0212 KOps/s 758.0321 KOps/s $\color{#35bf28}+1.45\%$
test_to 0.1055ms 74.6511μs 13.3956 KOps/s 13.1878 KOps/s $\color{#35bf28}+1.58\%$
test_to_nonblocking 0.1103ms 67.9329μs 14.7204 KOps/s 15.4394 KOps/s $\color{#d91a1a}-4.66\%$
test_unbind_speed 0.3878ms 0.3416ms 2.9277 KOps/s 3.0102 KOps/s $\color{#d91a1a}-2.74\%$
test_unbind_speed_stack0 0.4289ms 0.3386ms 2.9535 KOps/s 3.0082 KOps/s $\color{#d91a1a}-1.82\%$
test_unbind_speed_stack1 0.1050s 0.9368ms 1.0674 KOps/s 1.1880 KOps/s $\textbf{\color{#d91a1a}-10.15\%}$
test_split 1.2415ms 1.1393ms 877.7650 Ops/s 784.8170 Ops/s $\textbf{\color{#35bf28}+11.84\%}$
test_chunk 0.1048s 1.2087ms 827.3477 Ops/s 923.0060 Ops/s $\textbf{\color{#d91a1a}-10.36\%}$
test_to_cpu_blocking 29.0982ms 28.8132ms 34.7063 Ops/s 35.0451 Ops/s $\color{#d91a1a}-0.97\%$
test_to_cpu_global_sync 11.6557ms 11.5147ms 86.8453 Ops/s 88.5908 Ops/s $\color{#d91a1a}-1.97\%$
test_to_cpu_event_sync 12.6491ms 12.4593ms 80.2616 Ops/s 81.6905 Ops/s $\color{#d91a1a}-1.75\%$
test_to_cpu_default 0.1169s 13.7641ms 72.6529 Ops/s 81.3717 Ops/s $\textbf{\color{#d91a1a}-10.71\%}$
test_consolidate[False-None] 4.3855ms 4.1864ms 238.8687 Ops/s 242.6339 Ops/s $\color{#d91a1a}-1.55\%$
test_consolidate[default-None] 2.1985ms 2.0763ms 481.6350 Ops/s 490.1776 Ops/s $\color{#d91a1a}-1.74\%$
test_consolidate[reduce-overhead-None] 2.0782ms 1.9922ms 501.9467 Ops/s 510.1810 Ops/s $\color{#d91a1a}-1.61\%$
test_consolidate_njt[False-None] 8.7487ms 8.4806ms 117.9156 Ops/s 118.5688 Ops/s $\color{#d91a1a}-0.55\%$
test_to[False-False-None] 2.2208ms 2.1056ms 474.9228 Ops/s 480.3764 Ops/s $\color{#d91a1a}-1.14\%$
test_to[True-False-None] 2.2195ms 1.9481ms 513.3235 Ops/s 528.9654 Ops/s $\color{#d91a1a}-2.96\%$
test_to[within-False-None] 6.2693ms 6.1602ms 162.3324 Ops/s 163.6426 Ops/s $\color{#d91a1a}-0.80\%$
test_to[True-default-None] 9.0716ms 8.7932ms 113.7248 Ops/s 111.9064 Ops/s $\color{#35bf28}+1.62\%$
test_to_njt[False-False-None] 8.8877ms 8.4700ms 118.0639 Ops/s 117.7471 Ops/s $\color{#35bf28}+0.27\%$
test_to_njt[True-False-None] 7.2830ms 6.9928ms 143.0046 Ops/s 144.4577 Ops/s $\color{#d91a1a}-1.01\%$
test_to_njt[within-False-None] 0.1860s 18.1448ms 55.1123 Ops/s 63.0028 Ops/s $\textbf{\color{#d91a1a}-12.52\%}$
test_creation[device0] 0.3971ms 0.1161ms 8.6110 KOps/s 8.6960 KOps/s $\color{#d91a1a}-0.98\%$
test_creation_from_tensor 0.4095ms 0.1127ms 8.8729 KOps/s 8.5264 KOps/s $\color{#35bf28}+4.06\%$
test_add_one[memmap_tensor0] 0.3027ms 6.6493μs 150.3918 KOps/s 152.3968 KOps/s $\color{#d91a1a}-1.32\%$
test_contiguous[memmap_tensor0] 27.1110μs 0.6735μs 1.4848 MOps/s 2.1553 MOps/s $\textbf{\color{#d91a1a}-31.11\%}$
test_stack[memmap_tensor0] 58.1910μs 4.7147μs 212.1028 KOps/s 216.9211 KOps/s $\color{#d91a1a}-2.22\%$
test_memmaptd_index 1.0749ms 0.2754ms 3.6316 KOps/s 3.7592 KOps/s $\color{#d91a1a}-3.39\%$
test_memmaptd_index_astensor 0.5436ms 0.3787ms 2.6403 KOps/s 2.7177 KOps/s $\color{#d91a1a}-2.85\%$
test_memmaptd_index_op 1.0199ms 0.6329ms 1.5800 KOps/s 1.6074 KOps/s $\color{#d91a1a}-1.70\%$
test_serialize_model 0.1379s 0.1360s 7.3553 Ops/s 7.4120 Ops/s $\color{#d91a1a}-0.76\%$
test_serialize_model_pickle 1.3614s 1.1836s 0.8449 Ops/s 0.8230 Ops/s $\color{#35bf28}+2.66\%$
test_serialize_weights 0.1375s 0.1345s 7.4374 Ops/s 7.4268 Ops/s $\color{#35bf28}+0.14\%$
test_serialize_weights_returnearly 0.4320s 87.0403ms 11.4889 Ops/s 11.9424 Ops/s $\color{#d91a1a}-3.80\%$
test_serialize_weights_pickle 1.3674s 1.2131s 0.8244 Ops/s 0.8216 Ops/s $\color{#35bf28}+0.34\%$
test_reshape_pytree 0.2075ms 32.7865μs 30.5004 KOps/s 30.5654 KOps/s $\color{#d91a1a}-0.21\%$
test_reshape_td 75.4010μs 44.9839μs 22.2302 KOps/s 22.5079 KOps/s $\color{#d91a1a}-1.23\%$
test_view_pytree 0.2313ms 32.4522μs 30.8146 KOps/s 30.6605 KOps/s $\color{#35bf28}+0.50\%$
test_view_td 0.1053ms 54.7389μs 18.2685 KOps/s 18.9569 KOps/s $\color{#d91a1a}-3.63\%$
test_unbind_pytree 0.2330ms 36.3629μs 27.5005 KOps/s 27.0158 KOps/s $\color{#35bf28}+1.79\%$
test_unbind_td 0.2006ms 50.3752μs 19.8510 KOps/s 20.2403 KOps/s $\color{#d91a1a}-1.92\%$
test_split_pytree 0.2939ms 42.5633μs 23.4944 KOps/s 23.6048 KOps/s $\color{#d91a1a}-0.47\%$
test_split_td 0.1276ms 65.2356μs 15.3290 KOps/s 15.6560 KOps/s $\color{#d91a1a}-2.09\%$
test_add_pytree 0.2311ms 42.7601μs 23.3863 KOps/s 23.5767 KOps/s $\color{#d91a1a}-0.81\%$
test_add_td 0.2106ms 57.8138μs 17.2969 KOps/s 18.0387 KOps/s $\color{#d91a1a}-4.11\%$
test_compile_add_one_nested[tensordict-compile] 0.1933ms 0.1400ms 7.1407 KOps/s 6.7761 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_compile_add_one_nested[tensordict-eager] 0.3712ms 0.2022ms 4.9447 KOps/s 4.9972 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_add_one_nested[pytree-compile] 0.1727ms 0.1078ms 9.2737 KOps/s 9.1815 KOps/s $\color{#35bf28}+1.00\%$
test_compile_add_one_nested[pytree-eager] 0.4564ms 0.1835ms 5.4485 KOps/s 5.6012 KOps/s $\color{#d91a1a}-2.72\%$
test_compile_copy_nested[tensordict-compile] 0.3301ms 10.8491μs 92.1739 KOps/s 94.7535 KOps/s $\color{#d91a1a}-2.72\%$
test_compile_copy_nested[tensordict-eager] 97.8220μs 54.0908μs 18.4874 KOps/s 18.4413 KOps/s $\color{#35bf28}+0.25\%$
test_compile_copy_nested[pytree-compile] 0.1169ms 9.8307μs 101.7222 KOps/s 101.9246 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_copy_nested[pytree-eager] 0.4089ms 69.0247μs 14.4876 KOps/s 14.3539 KOps/s $\color{#35bf28}+0.93\%$
test_compile_add_one_flat[tensordict-compile] 0.2822ms 0.1760ms 5.6807 KOps/s 5.4438 KOps/s $\color{#35bf28}+4.35\%$
test_compile_add_one_flat[tensordict-eager] 0.3398ms 0.2778ms 3.5997 KOps/s 3.5826 KOps/s $\color{#35bf28}+0.48\%$
test_compile_add_one_flat[tensorclass-compile] 0.2055ms 0.1167ms 8.5673 KOps/s 7.8755 KOps/s $\textbf{\color{#35bf28}+8.78\%}$
test_compile_add_one_flat[tensorclass-eager] 0.1085ms 73.7313μs 13.5628 KOps/s 13.6186 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_add_one_flat[pytree-compile] 0.2229ms 0.1581ms 6.3265 KOps/s 6.0157 KOps/s $\textbf{\color{#35bf28}+5.17\%}$
test_compile_add_one_flat[pytree-eager] 0.8366ms 0.5335ms 1.8745 KOps/s 1.8963 KOps/s $\color{#d91a1a}-1.15\%$
test_compile_add_self_flat[tensordict-eager] 0.4993ms 0.3333ms 3.0003 KOps/s 2.9636 KOps/s $\color{#35bf28}+1.24\%$
test_compile_add_self_flat[tensordict-compile] 0.2776ms 0.1791ms 5.5837 KOps/s 5.2729 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1364ms 91.2261μs 10.9618 KOps/s 11.1417 KOps/s $\color{#d91a1a}-1.61\%$
test_compile_add_self_flat[tensorclass-compile] 0.2690ms 0.1216ms 8.2260 KOps/s 8.0718 KOps/s $\color{#35bf28}+1.91\%$
test_compile_add_self_flat[pytree-eager] 0.7010ms 0.4422ms 2.2617 KOps/s 2.2861 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_add_self_flat[pytree-compile] 0.3189ms 0.1593ms 6.2773 KOps/s 5.9827 KOps/s $\color{#35bf28}+4.92\%$
test_compile_copy_flat[tensordict-compile] 0.1106ms 14.1227μs 70.8078 KOps/s 75.0283 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_compile_copy_flat[tensordict-eager] 72.2210μs 41.6491μs 24.0101 KOps/s 24.5280 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_copy_flat[pytree-compile] 0.1318ms 10.8652μs 92.0369 KOps/s 91.9878 KOps/s $\color{#35bf28}+0.05\%$
test_compile_copy_flat[pytree-eager] 0.4287ms 52.1059μs 19.1917 KOps/s 19.2027 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_assign_and_add[tensordict-compile] 2.0450ms 0.1765ms 5.6668 KOps/s 5.5794 KOps/s $\color{#35bf28}+1.57\%$
test_compile_assign_and_add[tensordict-eager] 3.5049ms 3.3118ms 301.9512 Ops/s 303.9276 Ops/s $\color{#d91a1a}-0.65\%$
test_compile_assign_and_add[pytree-compile] 1.9855ms 0.1636ms 6.1107 KOps/s 6.1010 KOps/s $\color{#35bf28}+0.16\%$
test_compile_assign_and_add[pytree-eager] 2.9226ms 2.7968ms 357.5484 Ops/s 360.4683 Ops/s $\color{#d91a1a}-0.81\%$
test_compile_indexing[tensor-tensordict-compile] 0.1473ms 0.1087ms 9.2024 KOps/s 9.1550 KOps/s $\color{#35bf28}+0.52\%$
test_compile_indexing[tensor-tensordict-eager] 0.3237ms 77.7366μs 12.8639 KOps/s 13.4958 KOps/s $\color{#d91a1a}-4.68\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2301ms 95.5329μs 10.4676 KOps/s 10.4864 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2523ms 47.3279μs 21.1292 KOps/s 22.3496 KOps/s $\textbf{\color{#d91a1a}-5.46\%}$
test_compile_indexing[tensor-pytree-compile] 0.1558ms 96.6009μs 10.3519 KOps/s 10.4736 KOps/s $\color{#d91a1a}-1.16\%$
test_compile_indexing[tensor-pytree-eager] 0.2377ms 46.8523μs 21.3437 KOps/s 22.3237 KOps/s $\color{#d91a1a}-4.39\%$
test_compile_indexing[slice-tensordict-compile] 0.1928ms 56.6454μs 17.6537 KOps/s 17.4378 KOps/s $\color{#35bf28}+1.24\%$
test_compile_indexing[slice-tensordict-eager] 0.2270ms 27.7549μs 36.0297 KOps/s 36.6754 KOps/s $\color{#d91a1a}-1.76\%$
test_compile_indexing[slice-tensorclass-compile] 0.1658ms 44.2720μs 22.5876 KOps/s 22.8950 KOps/s $\color{#d91a1a}-1.34\%$
test_compile_indexing[slice-tensorclass-eager] 0.2491ms 22.6693μs 44.1125 KOps/s 44.2313 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_indexing[slice-pytree-compile] 0.1040ms 44.6910μs 22.3759 KOps/s 22.1977 KOps/s $\color{#35bf28}+0.80\%$
test_compile_indexing[slice-pytree-eager] 0.2510ms 22.7263μs 44.0018 KOps/s 44.3899 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_indexing[int-tensordict-compile] 91.0220μs 57.6760μs 17.3382 KOps/s 17.6401 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[int-tensordict-eager] 0.3053ms 28.0356μs 35.6690 KOps/s 37.3377 KOps/s $\color{#d91a1a}-4.47\%$
test_compile_indexing[int-tensorclass-compile] 86.4210μs 44.3853μs 22.5300 KOps/s 22.2422 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[int-tensorclass-eager] 0.2614ms 22.6285μs 44.1920 KOps/s 44.0027 KOps/s $\color{#35bf28}+0.43\%$
test_compile_indexing[int-pytree-compile] 86.5310μs 44.8756μs 22.2838 KOps/s 22.0986 KOps/s $\color{#35bf28}+0.84\%$
test_compile_indexing[int-pytree-eager] 0.2581ms 22.6280μs 44.1931 KOps/s 44.5566 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_replace[single-eager] 86.8020μs 47.2307μs 21.1727 KOps/s 20.9513 KOps/s $\color{#35bf28}+1.06\%$
test_compile_replace[single-compile] 0.2171ms 0.1047ms 9.5508 KOps/s 9.6011 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_replace[multi-eager] 0.6234ms 0.5642ms 1.7723 KOps/s 1.8264 KOps/s $\color{#d91a1a}-2.96\%$
test_compile_replace[multi-compile] 0.1966ms 0.1114ms 8.9738 KOps/s 9.0365 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_tc_getattr_20[eager] 0.2227ms 0.1707ms 5.8592 KOps/s 6.1093 KOps/s $\color{#d91a1a}-4.09\%$
test_compile_tc_getattr_20[compile] 0.2840ms 0.1184ms 8.4490 KOps/s 8.4183 KOps/s $\color{#35bf28}+0.36\%$
test_compile_clone_shallow[20-eager] 49.8010μs 19.5160μs 51.2400 KOps/s 51.7538 KOps/s $\color{#d91a1a}-0.99\%$
test_compile_clone_shallow[20-compile] 52.8910μs 11.4703μs 87.1818 KOps/s 86.8162 KOps/s $\color{#35bf28}+0.42\%$
test_compile_clone_shallow[40-eager] 0.1198ms 34.1166μs 29.3113 KOps/s 29.6810 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_clone_shallow[40-compile] 46.3710μs 13.0290μs 76.7522 KOps/s 80.0917 KOps/s $\color{#d91a1a}-4.17\%$
test_compile_clone_shallow[80-eager] 0.1011ms 63.7100μs 15.6961 KOps/s 15.9362 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_clone_shallow[80-compile] 54.5610μs 15.2151μs 65.7244 KOps/s 65.3351 KOps/s $\color{#35bf28}+0.60\%$
test_compile_update_inplace[eager] 94.4320μs 60.7331μs 16.4655 KOps/s 17.1292 KOps/s $\color{#d91a1a}-3.87\%$
test_compile_update_inplace[compile] 0.6843ms 0.1401ms 7.1365 KOps/s 6.7903 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_mod_add[eager] 0.1041ms 50.0789μs 19.9685 KOps/s 20.4108 KOps/s $\color{#d91a1a}-2.17\%$
test_mod_add[compile] 0.5004ms 0.1049ms 9.5328 KOps/s 9.5027 KOps/s $\color{#35bf28}+0.32\%$
test_mod_add[compile-overhead] 0.2365ms 0.1483ms 6.7446 KOps/s 6.6476 KOps/s $\color{#35bf28}+1.46\%$
test_mod_wrap[eager] 0.3728ms 0.2908ms 3.4391 KOps/s 3.4431 KOps/s $\color{#d91a1a}-0.11\%$
test_mod_wrap[compile] 0.4480ms 0.3477ms 2.8757 KOps/s 2.8448 KOps/s $\color{#35bf28}+1.09\%$
test_mod_wrap[compile-overhead] 4.7624ms 2.6698ms 374.5608 Ops/s 249.1080 Ops/s $\textbf{\color{#35bf28}+50.36\%}$
test_mod_wrap_and_backward[eager] 1.6443ms 1.5114ms 661.6468 Ops/s 671.0626 Ops/s $\color{#d91a1a}-1.40\%$
test_mod_wrap_and_backward[compile] 1.6704ms 1.5536ms 643.6773 Ops/s 691.0024 Ops/s $\textbf{\color{#d91a1a}-6.85\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4526ms 0.9867ms 1.0135 KOps/s 1.1106 KOps/s $\textbf{\color{#d91a1a}-8.75\%}$
test_seq_add[eager] 0.2223ms 0.1541ms 6.4912 KOps/s 6.4936 KOps/s $\color{#d91a1a}-0.04\%$
test_seq_add[compile] 0.1981ms 0.1130ms 8.8526 KOps/s 8.4678 KOps/s $\color{#35bf28}+4.54\%$
test_seq_add[compile-overhead] 0.4034ms 0.1530ms 6.5380 KOps/s 6.3003 KOps/s $\color{#35bf28}+3.77\%$
test_seq_wrap[eager] 0.6287ms 0.5224ms 1.9141 KOps/s 1.9191 KOps/s $\color{#d91a1a}-0.26\%$
test_seq_wrap[compile] 0.4670ms 0.3661ms 2.7317 KOps/s 2.7180 KOps/s $\color{#35bf28}+0.50\%$
test_seq_wrap[compile-overhead] 0.3528ms 0.2654ms 3.7678 KOps/s 3.7333 KOps/s $\color{#35bf28}+0.92\%$
test_func_call_runtime[False-eager] 0.9093ms 0.8449ms 1.1836 KOps/s 1.1897 KOps/s $\color{#d91a1a}-0.52\%$
test_func_call_runtime[False-compile] 1.1016ms 0.9173ms 1.0902 KOps/s 1.0970 KOps/s $\color{#d91a1a}-0.63\%$
test_func_call_runtime[False-compile-overhead] 0.5780ms 0.4613ms 2.1680 KOps/s 2.1613 KOps/s $\color{#35bf28}+0.31\%$
test_func_call_runtime[True-eager] 1.1502ms 1.0722ms 932.6604 Ops/s 929.4673 Ops/s $\color{#35bf28}+0.34\%$
test_func_call_runtime[True-compile] 0.9979ms 0.9316ms 1.0735 KOps/s 1.0671 KOps/s $\color{#35bf28}+0.60\%$
test_func_call_runtime[True-compile-overhead] 0.5457ms 0.4768ms 2.0974 KOps/s 2.0905 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_cm_runtime[False-eager] 0.9681ms 0.8409ms 1.1892 KOps/s 1.1336 KOps/s $\color{#35bf28}+4.90\%$
test_func_call_cm_runtime[False-compile] 1.0271ms 0.9170ms 1.0905 KOps/s 1.0911 KOps/s $\color{#d91a1a}-0.05\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5345ms 0.4650ms 2.1505 KOps/s 2.1532 KOps/s $\color{#d91a1a}-0.13\%$
test_func_call_cm_runtime[True-eager] 1.3152ms 1.2240ms 817.0262 Ops/s 815.4659 Ops/s $\color{#35bf28}+0.19\%$
test_func_call_cm_runtime[True-compile] 1.0214ms 0.9614ms 1.0402 KOps/s 1.0458 KOps/s $\color{#d91a1a}-0.54\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5598ms 0.5101ms 1.9604 KOps/s 1.9449 KOps/s $\color{#35bf28}+0.80\%$
test_vmap_func_call_cm_runtime[eager] 2.8685ms 2.3789ms 420.3695 Ops/s 419.5055 Ops/s $\color{#35bf28}+0.21\%$
test_vmap_func_call_cm_runtime[compile] 1.0337ms 0.9845ms 1.0157 KOps/s 1.0213 KOps/s $\color{#d91a1a}-0.55\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5595ms 0.5148ms 1.9426 KOps/s 1.9297 KOps/s $\color{#35bf28}+0.67\%$
test_distributed 2.8514ms 0.1684ms 5.9393 KOps/s 6.0710 KOps/s $\color{#d91a1a}-2.17\%$
test_tdmodule 0.3757ms 28.2052μs 35.4545 KOps/s 36.3839 KOps/s $\color{#d91a1a}-2.55\%$
test_tdmodule_dispatch 76.6110μs 45.7037μs 21.8801 KOps/s 21.9757 KOps/s $\color{#d91a1a}-0.44\%$
test_tdseq 46.4110μs 26.7351μs 37.4041 KOps/s 37.1168 KOps/s $\color{#35bf28}+0.77\%$
test_tdseq_dispatch 70.6110μs 47.6692μs 20.9779 KOps/s 20.9324 KOps/s $\color{#35bf28}+0.22\%$
test_instantiation_functorch 2.2046ms 2.1094ms 474.0592 Ops/s 480.0678 Ops/s $\color{#d91a1a}-1.25\%$
test_exec_functorch 0.2591ms 0.1790ms 5.5875 KOps/s 5.5788 KOps/s $\color{#35bf28}+0.16\%$
test_exec_functional_call 0.2589ms 0.1600ms 6.2487 KOps/s 6.3466 KOps/s $\color{#d91a1a}-1.54\%$
test_exec_td_decorator 0.5058ms 0.2368ms 4.2223 KOps/s 4.2378 KOps/s $\color{#d91a1a}-0.37\%$
test_vmap_mlp_speed_decorator[True-True] 1.0337ms 0.8240ms 1.2137 KOps/s 1.2167 KOps/s $\color{#d91a1a}-0.25\%$
test_vmap_mlp_speed_decorator[True-False] 0.9986ms 0.8261ms 1.2105 KOps/s 1.2140 KOps/s $\color{#d91a1a}-0.29\%$
test_vmap_mlp_speed_decorator[False-True] 0.8756ms 0.7121ms 1.4044 KOps/s 1.4099 KOps/s $\color{#d91a1a}-0.39\%$
test_vmap_mlp_speed_decorator[False-False] 0.8895ms 0.7115ms 1.4055 KOps/s 1.4090 KOps/s $\color{#d91a1a}-0.25\%$
test_vmap_transformer_speed_decorator[True-True] 21.1226ms 20.5707ms 48.6129 Ops/s 48.7208 Ops/s $\color{#d91a1a}-0.22\%$
test_vmap_transformer_speed_decorator[True-False] 21.1447ms 20.5117ms 48.7527 Ops/s 48.8121 Ops/s $\color{#d91a1a}-0.12\%$
test_vmap_transformer_speed_decorator[False-True] 20.9081ms 20.3429ms 49.1572 Ops/s 49.1829 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed_decorator[False-False] 20.5821ms 20.3479ms 49.1452 Ops/s 49.2389 Ops/s $\color{#d91a1a}-0.19\%$
test_to_module_speed[True] 1.5577ms 1.4808ms 675.2978 Ops/s 683.0797 Ops/s $\color{#d91a1a}-1.14\%$
test_to_module_speed[False] 1.5556ms 1.4709ms 679.8436 Ops/s 692.1223 Ops/s $\color{#d91a1a}-1.77\%$
test_tc_init 0.1230ms 43.2597μs 23.1162 KOps/s 22.0801 KOps/s $\color{#35bf28}+4.69\%$
test_tc_init_tensor_only 39.1610μs 9.6632μs 103.4857 KOps/s 101.5284 KOps/s $\color{#35bf28}+1.93\%$
test_tc_init_nested 0.1647ms 86.8593μs 11.5129 KOps/s 11.3659 KOps/s $\color{#35bf28}+1.29\%$
test_tc_init_many_fields 53.4200μs 16.2222μs 61.6440 KOps/s 60.8640 KOps/s $\color{#35bf28}+1.28\%$
test_tc_first_layer_tensor 26.8300μs 1.8183μs 549.9618 KOps/s 548.5220 KOps/s $\color{#35bf28}+0.26\%$
test_tc_first_layer_tensor_only 2.4521μs 0.3908μs 2.5589 MOps/s 2.4983 MOps/s $\color{#35bf28}+2.42\%$
test_tc_first_layer_tensor_set 31.7400μs 3.8886μs 257.1601 KOps/s 251.7063 KOps/s $\color{#35bf28}+2.17\%$
test_tc_first_layer_tensor_only_set 28.1210μs 3.2604μs 306.7086 KOps/s 301.5672 KOps/s $\color{#35bf28}+1.70\%$
test_tc_first_layer_nontensor 34.3210μs 6.1960μs 161.3953 KOps/s 162.1877 KOps/s $\color{#d91a1a}-0.49\%$
test_tc_second_layer_tensor 27.5000μs 4.4512μs 224.6570 KOps/s 226.4449 KOps/s $\color{#d91a1a}-0.79\%$
test_tc_second_layer_nontensor 73.8310μs 8.7442μs 114.3615 KOps/s 113.3684 KOps/s $\color{#35bf28}+0.88\%$
test_unbind 0.2715s 16.4519ms 60.7831 Ops/s 55.7618 Ops/s $\textbf{\color{#35bf28}+9.01\%}$
test_full_like 17.7045ms 17.5361ms 57.0252 Ops/s 226.0408 Ops/s $\textbf{\color{#d91a1a}-74.77\%}$
test_zeros_like 18.5542ms 17.4519ms 57.3005 Ops/s 113.9673 Ops/s $\textbf{\color{#d91a1a}-49.72\%}$
test_ones_like 17.0058ms 16.6901ms 59.9157 Ops/s 229.1918 Ops/s $\textbf{\color{#d91a1a}-73.86\%}$
test_clone 17.8397ms 17.4010ms 57.4680 Ops/s 155.5197 Ops/s $\textbf{\color{#d91a1a}-63.05\%}$
test_squeeze 96.3520μs 14.2359μs 70.2450 KOps/s 70.1452 KOps/s $\color{#35bf28}+0.14\%$
test_unsqueeze 0.2594ms 0.1109ms 9.0181 KOps/s 9.1054 KOps/s $\color{#d91a1a}-0.96\%$
test_split 0.2407ms 0.1817ms 5.5035 KOps/s 5.4246 KOps/s $\color{#35bf28}+1.45\%$
test_permute 0.2578ms 0.2020ms 4.9506 KOps/s 4.9619 KOps/s $\color{#d91a1a}-0.23\%$
test_stack 51.2593ms 50.9572ms 19.6243 Ops/s 19.6266 Ops/s $\color{#d91a1a}-0.01\%$
test_cat 51.1861ms 50.9048ms 19.6445 Ops/s 19.6827 Ops/s $\color{#d91a1a}-0.19\%$
test_sequential_tensordict 0.2732ms 0.2167ms 4.6142 KOps/s 4.5421 KOps/s $\color{#35bf28}+1.59\%$
test_sequential_graph_module 0.1739ms 0.1186ms 8.4339 KOps/s 8.3091 KOps/s $\color{#35bf28}+1.50\%$
test_nested_tensordict 0.3577ms 0.2831ms 3.5320 KOps/s 3.4821 KOps/s $\color{#35bf28}+1.43\%$
test_nested_graph_module 0.1887ms 0.1318ms 7.5876 KOps/s 7.4899 KOps/s $\color{#35bf28}+1.31\%$
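
For readers checking the Change column by hand, the sketch below shows one way to reproduce it from the two throughput columns. It assumes Change is the relative difference between Ops and Ops on Repo HEAD, and that a row counts as improved or worsened (the bolded rows) once it moves by more than roughly 5%. The bot does not state that threshold; it is inferred from the rows above. The helper names `relative_change` and `classify` are hypothetical, not part of the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to repo HEAD.

    Both values must be in the same unit (e.g. both Ops/s or both KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Bucket a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from rows of the GPU table above, in Ops/s.
rows = {
    "test_full_like": (57.0252, 226.0408),
    "test_split": (877.7650, 784.8170),
    "test_cat": (19.6445, 19.6827),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three rows come out at -74.77%, +11.84% and -0.19%, matching the values reported in the table.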

[ghstack-poisoned]
[ghstack-poisoned]
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

1 participant