[DTensor] Add Strategy C (optimal P2P using transfer plan)#1646

Open
vmoens wants to merge 6 commits into gh/vmoens/87/base from gh/vmoens/87/head

@vmoens
Collaborator

@vmoens vmoens commented Mar 9, 2026

Stack from ghstack (oldest at bottom):

Strategy C computes the minimal set of P2P transfers between the
source and destination meshes using `_compute_transfer_plan`.
Each rank sends only the data slices that specific destination
ranks need, avoiding redundant data movement. This is the most
bandwidth-efficient strategy and is selected by default when
mesh and placement information are provided ("auto" strategy).
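
To make the idea of a transfer plan concrete, here is a minimal sketch for the simplest case: a 1-D tensor resharded from one even split to another. It is not the PR's `_compute_transfer_plan`, whose signature and internals are not shown here. The names `Transfer`, `shard_bounds`, and `sketch_transfer_plan` are hypothetical, and even chunked sharding is an assumption made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Transfer(NamedTuple):
    """One point-to-point send of the global slice [start, stop)."""
    src_rank: int
    dst_rank: int
    start: int  # global start index (inclusive)
    stop: int   # global stop index (exclusive)


def shard_bounds(length: int, num_shards: int, rank: int) -> Tuple[int, int]:
    """Global [start, stop) owned by `rank` under an even chunked split (assumed)."""
    chunk = -(-length // num_shards)  # ceiling division
    start = min(rank * chunk, length)
    stop = min(start + chunk, length)
    return start, stop


def sketch_transfer_plan(length: int, num_src: int, num_dst: int) -> List[Transfer]:
    """Hypothetical helper: emit one transfer per overlapping (src, dst) shard pair."""
    plan = []
    for src in range(num_src):
        s_lo, s_hi = shard_bounds(length, num_src, src)
        for dst in range(num_dst):
            d_lo, d_hi = shard_bounds(length, num_dst, dst)
            # Only the intersection of the two shards has to move.
            lo, hi = max(s_lo, d_lo), min(s_hi, d_hi)
            if lo < hi:
                plan.append(Transfer(src, dst, lo, hi))
    return plan


# 12 elements, resharded from 2 source ranks to 3 destination ranks.
plan = sketch_transfer_plan(12, 2, 3)
for transfer in plan:
    print(transfer)
```

In this toy case every element appears in exactly one transfer, so the total volume moved equals the tensor size. A broadcast-style strategy would instead send each source shard to every destination rank.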

Made-with: Cursor

[ghstack-poisoned]
@github-actions
Contributor

github-actions bot commented Mar 9, 2026

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add Strategy C (optimal P2P using transfer plan)

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
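
The rule above can be sketched as a small lookup. This is an illustrative reconstruction from the table only, not the bot's actual implementation, which is not shown in this thread. The function name `label_for_title` and the regular expression are assumptions.

```python
import re
from typing import Optional

# Lower-cased prefix -> label, copied from the "Supported Prefixes" table.
PREFIX_TO_LABEL = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}

# Assumed shape of a prefix: a bracketed token at the very start of the title.
_PREFIX_RE = re.compile(r"^\[([^\]]+)\]")


def label_for_title(title: str) -> Optional[str]:
    """Hypothetical helper: return the label for a PR title, or None if unsupported."""
    match = _PREFIX_RE.match(title)
    if match is None:
        return None
    return PREFIX_TO_LABEL.get(match.group(1).lower())


print(label_for_title("[DTensor] Add Strategy C (optimal P2P using transfer plan)"))
print(label_for_title("[Distributed] Add scatter collective"))
```

Under this sketch the current title maps to no label, which matches the error reported above, while a `[Distributed]` or `[Feature]` prefix would pass.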

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 9, 2026
@github-actions
Contributor

github-actions bot commented Mar 9, 2026

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 261. Improved: 10. Worsened: 14.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.1400μs 14.8953μs 67.1352 KOps/s 68.1976 KOps/s $\color{#d91a1a}-1.56\%$
test_plain_set_stack_nested 35.9310μs 15.0444μs 66.4698 KOps/s 65.6650 KOps/s $\color{#35bf28}+1.23\%$
test_plain_set_nested_inplace 44.1800μs 16.7591μs 59.6691 KOps/s 59.6045 KOps/s $\color{#35bf28}+0.11\%$
test_plain_set_stack_nested_inplace 52.1700μs 16.3958μs 60.9910 KOps/s 59.9425 KOps/s $\color{#35bf28}+1.75\%$
test_items 27.9700μs 5.9987μs 166.7024 KOps/s 164.1738 KOps/s $\color{#35bf28}+1.54\%$
test_items_nested 0.5517ms 0.4684ms 2.1349 KOps/s 2.1125 KOps/s $\color{#35bf28}+1.06\%$
test_items_nested_locked 0.5696ms 0.4668ms 2.1424 KOps/s 2.1098 KOps/s $\color{#35bf28}+1.54\%$
test_items_nested_leaf 0.1396ms 98.1307μs 10.1905 KOps/s 10.1122 KOps/s $\color{#35bf28}+0.77\%$
test_items_stack_nested 0.8023ms 0.4617ms 2.1661 KOps/s 2.1108 KOps/s $\color{#35bf28}+2.62\%$
test_items_stack_nested_leaf 0.1314ms 98.2021μs 10.1831 KOps/s 10.1218 KOps/s $\color{#35bf28}+0.61\%$
test_items_stack_nested_locked 0.6781ms 0.4672ms 2.1406 KOps/s 2.1191 KOps/s $\color{#35bf28}+1.01\%$
test_keys 32.4900μs 4.2578μs 234.8625 KOps/s 235.0620 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_nested 0.1918ms 0.1311ms 7.6265 KOps/s 7.5725 KOps/s $\color{#35bf28}+0.71\%$
test_keys_nested_locked 2.0703ms 0.1382ms 7.2350 KOps/s 7.1335 KOps/s $\color{#35bf28}+1.42\%$
test_keys_nested_leaf 0.1664ms 0.1203ms 8.3148 KOps/s 8.1956 KOps/s $\color{#35bf28}+1.45\%$
test_keys_stack_nested 0.1736ms 0.1296ms 7.7177 KOps/s 7.6153 KOps/s $\color{#35bf28}+1.34\%$
test_keys_stack_nested_leaf 0.1613ms 0.1193ms 8.3802 KOps/s 8.2358 KOps/s $\color{#35bf28}+1.75\%$
test_keys_stack_nested_locked 0.1977ms 0.1380ms 7.2485 KOps/s 7.2179 KOps/s $\color{#35bf28}+0.42\%$
test_values 7.5280μs 1.0266μs 974.1334 KOps/s 980.5856 KOps/s $\color{#d91a1a}-0.66\%$
test_values_nested 80.6910μs 52.7306μs 18.9643 KOps/s 18.9030 KOps/s $\color{#35bf28}+0.32\%$
test_values_nested_locked 90.6110μs 55.7893μs 17.9246 KOps/s 17.8335 KOps/s $\color{#35bf28}+0.51\%$
test_values_nested_leaf 98.9120μs 60.1255μs 16.6319 KOps/s 16.3980 KOps/s $\color{#35bf28}+1.43\%$
test_values_stack_nested 87.1510μs 52.9745μs 18.8770 KOps/s 18.7723 KOps/s $\color{#35bf28}+0.56\%$
test_values_stack_nested_leaf 0.1018ms 60.3811μs 16.5615 KOps/s 16.3035 KOps/s $\color{#35bf28}+1.58\%$
test_values_stack_nested_locked 86.9510μs 55.8970μs 17.8901 KOps/s 17.7560 KOps/s $\color{#35bf28}+0.76\%$
test_membership 5.4567μs 0.8578μs 1.1658 MOps/s 1.1945 MOps/s $\color{#d91a1a}-2.41\%$
test_membership_nested 0.1039ms 2.9140μs 343.1750 KOps/s 350.3141 KOps/s $\color{#d91a1a}-2.04\%$
test_membership_nested_leaf 26.2500μs 2.8580μs 349.8902 KOps/s 347.1621 KOps/s $\color{#35bf28}+0.79\%$
test_membership_stacked_nested 26.6200μs 2.8996μs 344.8766 KOps/s 343.4283 KOps/s $\color{#35bf28}+0.42\%$
test_membership_stacked_nested_leaf 38.3500μs 2.8940μs 345.5458 KOps/s 349.7938 KOps/s $\color{#d91a1a}-1.21\%$
test_membership_nested_last 25.8610μs 4.3807μs 228.2750 KOps/s 234.6175 KOps/s $\color{#d91a1a}-2.70\%$
test_membership_nested_leaf_last 40.6410μs 4.4132μs 226.5949 KOps/s 232.4133 KOps/s $\color{#d91a1a}-2.50\%$
test_membership_stacked_nested_last 33.2300μs 4.3743μs 228.6072 KOps/s 230.5373 KOps/s $\color{#d91a1a}-0.84\%$
test_membership_stacked_nested_leaf_last 36.7710μs 4.3629μs 229.2036 KOps/s 230.8205 KOps/s $\color{#d91a1a}-0.70\%$
test_nested_getleaf 50.4910μs 21.9446μs 45.5694 KOps/s 46.6045 KOps/s $\color{#d91a1a}-2.22\%$
test_nested_get 45.9000μs 20.6953μs 48.3201 KOps/s 48.6157 KOps/s $\color{#d91a1a}-0.61\%$
test_stacked_getleaf 46.8610μs 21.8381μs 45.7915 KOps/s 46.4970 KOps/s $\color{#d91a1a}-1.52\%$
test_stacked_get 49.5810μs 20.5437μs 48.6767 KOps/s 48.9073 KOps/s $\color{#d91a1a}-0.47\%$
test_nested_getitemleaf 48.7410μs 22.2723μs 44.8988 KOps/s 45.3269 KOps/s $\color{#d91a1a}-0.94\%$
test_nested_getitem 93.1610μs 21.3133μs 46.9191 KOps/s 47.4081 KOps/s $\color{#d91a1a}-1.03\%$
test_stacked_getitemleaf 47.4710μs 22.2301μs 44.9840 KOps/s 45.4915 KOps/s $\color{#d91a1a}-1.12\%$
test_stacked_getitem 55.0910μs 21.1750μs 47.2254 KOps/s 47.7043 KOps/s $\color{#d91a1a}-1.00\%$
test_lock_nested 0.5446ms 0.4769ms 2.0971 KOps/s 2.0866 KOps/s $\color{#35bf28}+0.50\%$
test_lock_stack_nested 0.5794ms 0.4807ms 2.0801 KOps/s 2.0590 KOps/s $\color{#35bf28}+1.03\%$
test_unlock_nested 0.4592ms 0.3900ms 2.5643 KOps/s 2.5404 KOps/s $\color{#35bf28}+0.94\%$
test_unlock_stack_nested 0.4706ms 0.3891ms 2.5698 KOps/s 2.5367 KOps/s $\color{#35bf28}+1.30\%$
test_flatten_speed 0.1669ms 0.1237ms 8.0838 KOps/s 8.0814 KOps/s $\color{#35bf28}+0.03\%$
test_unflatten_speed 0.6226ms 0.5708ms 1.7519 KOps/s 1.7346 KOps/s $\color{#35bf28}+0.99\%$
test_common_ops 0.8994ms 0.6969ms 1.4350 KOps/s 1.4390 KOps/s $\color{#d91a1a}-0.28\%$
test_creation 0.1228ms 3.1898μs 313.4946 KOps/s 315.7697 KOps/s $\color{#d91a1a}-0.72\%$
test_creation_empty 27.4600μs 7.0790μs 141.2623 KOps/s 141.4366 KOps/s $\color{#d91a1a}-0.12\%$
test_creation_nested_1 28.5300μs 11.7122μs 85.3814 KOps/s 86.8177 KOps/s $\color{#d91a1a}-1.65\%$
test_creation_nested_2 38.5610μs 13.4082μs 74.5814 KOps/s 74.2693 KOps/s $\color{#35bf28}+0.42\%$
test_creation_many_keys[10] 0.2111ms 21.2739μs 47.0060 KOps/s 47.7655 KOps/s $\color{#d91a1a}-1.59\%$
test_creation_many_keys[50] 0.1277ms 91.2768μs 10.9557 KOps/s 11.1033 KOps/s $\color{#d91a1a}-1.33\%$
test_creation_many_keys[100] 0.2178ms 0.1796ms 5.5690 KOps/s 5.6152 KOps/s $\color{#d91a1a}-0.82\%$
test_creation_nested_many_keys[10] 78.3900μs 45.5969μs 21.9313 KOps/s 22.1961 KOps/s $\color{#d91a1a}-1.19\%$
test_creation_nested_many_keys[50] 0.2407ms 0.1855ms 5.3921 KOps/s 5.4382 KOps/s $\color{#d91a1a}-0.85\%$
test_clone 46.3310μs 13.6149μs 73.4488 KOps/s 74.7143 KOps/s $\color{#d91a1a}-1.69\%$
test_getitem[int] 1.5039ms 15.4296μs 64.8104 KOps/s 60.8366 KOps/s $\textbf{\color{#35bf28}+6.53\%}$
test_getitem[slice_int] 0.1447ms 24.4494μs 40.9008 KOps/s 41.3324 KOps/s $\color{#d91a1a}-1.04\%$
test_getitem[range] 0.2038ms 62.4712μs 16.0074 KOps/s 15.7251 KOps/s $\color{#35bf28}+1.80\%$
test_getitem[tuple] 0.1430ms 24.1663μs 41.3800 KOps/s 41.9049 KOps/s $\color{#d91a1a}-1.25\%$
test_getitem[list] 0.1867ms 59.9300μs 16.6861 KOps/s 17.1582 KOps/s $\color{#d91a1a}-2.75\%$
test_setitem_dim[int] 54.3310μs 27.9489μs 35.7796 KOps/s 38.6698 KOps/s $\textbf{\color{#d91a1a}-7.47\%}$
test_setitem_dim[slice_int] 76.2410μs 45.4601μs 21.9973 KOps/s 23.1745 KOps/s $\textbf{\color{#d91a1a}-5.08\%}$
test_setitem_dim[range] 0.1272ms 0.1003ms 9.9715 KOps/s 10.3542 KOps/s $\color{#d91a1a}-3.70\%$
test_setitem_dim[tuple] 80.0310μs 40.5086μs 24.6861 KOps/s 24.9183 KOps/s $\color{#d91a1a}-0.93\%$
test_setitem 54.1500μs 17.9414μs 55.7371 KOps/s 56.3907 KOps/s $\color{#d91a1a}-1.16\%$
test_set 48.2110μs 17.1669μs 58.2516 KOps/s 58.4975 KOps/s $\color{#d91a1a}-0.42\%$
test_set_shared 0.4989ms 0.2030ms 4.9258 KOps/s 4.8828 KOps/s $\color{#35bf28}+0.88\%$
test_update 0.3372ms 21.9398μs 45.5792 KOps/s 46.1302 KOps/s $\color{#d91a1a}-1.19\%$
test_update_nested 0.1178ms 32.9888μs 30.3133 KOps/s 30.0528 KOps/s $\color{#35bf28}+0.87\%$
test_update__nested 0.5285ms 34.4021μs 29.0680 KOps/s 28.7706 KOps/s $\color{#35bf28}+1.03\%$
test_set_nested 52.6310μs 19.1675μs 52.1718 KOps/s 52.3776 KOps/s $\color{#d91a1a}-0.39\%$
test_set_nested_new 65.8010μs 23.8679μs 41.8973 KOps/s 41.8070 KOps/s $\color{#35bf28}+0.22\%$
test_select 79.4310μs 41.1032μs 24.3290 KOps/s 24.9813 KOps/s $\color{#d91a1a}-2.61\%$
test_select_nested 0.1105ms 74.7082μs 13.3854 KOps/s 13.3630 KOps/s $\color{#35bf28}+0.17\%$
test_exclude_nested 0.1327ms 92.4456μs 10.8172 KOps/s 10.8478 KOps/s $\color{#d91a1a}-0.28\%$
test_empty[True] 0.7768ms 0.4033ms 2.4796 KOps/s 2.5062 KOps/s $\color{#d91a1a}-1.06\%$
test_empty[False] 8.2678μs 1.3348μs 749.1944 KOps/s 768.5754 KOps/s $\color{#d91a1a}-2.52\%$
test_to 0.1062ms 72.9771μs 13.7029 KOps/s 13.6629 KOps/s $\color{#35bf28}+0.29\%$
test_to_nonblocking 0.1122ms 65.8668μs 15.1822 KOps/s 15.5196 KOps/s $\color{#d91a1a}-2.17\%$
test_unbind_speed 0.3644ms 0.3321ms 3.0112 KOps/s 2.9924 KOps/s $\color{#35bf28}+0.63\%$
test_unbind_speed_stack0 0.4233ms 0.3350ms 2.9852 KOps/s 3.0835 KOps/s $\color{#d91a1a}-3.19\%$
test_unbind_speed_stack1 0.1030s 0.8396ms 1.1911 KOps/s 1.1804 KOps/s $\color{#35bf28}+0.90\%$
test_split 0.1033s 1.2671ms 789.2077 Ops/s 786.8975 Ops/s $\color{#35bf28}+0.29\%$
test_chunk 0.1035s 1.2107ms 825.9609 Ops/s 931.4663 Ops/s $\textbf{\color{#d91a1a}-11.33\%}$
test_to_cpu_blocking 28.8600ms 28.6980ms 34.8457 Ops/s 50.7916 Ops/s $\textbf{\color{#d91a1a}-31.39\%}$
test_to_cpu_global_sync 11.6221ms 11.4800ms 87.1079 Ops/s 78.8578 Ops/s $\textbf{\color{#35bf28}+10.46\%}$
test_to_cpu_event_sync 12.6587ms 12.4146ms 80.5502 Ops/s 80.7241 Ops/s $\color{#d91a1a}-0.22\%$
test_to_cpu_default 0.1152s 13.7236ms 72.8669 Ops/s 80.3641 Ops/s $\textbf{\color{#d91a1a}-9.33\%}$
test_consolidate[False-None] 4.4006ms 4.2213ms 236.8915 Ops/s 215.3116 Ops/s $\textbf{\color{#35bf28}+10.02\%}$
test_consolidate[default-None] 2.8227ms 2.0837ms 479.9143 Ops/s 481.5324 Ops/s $\color{#d91a1a}-0.34\%$
test_consolidate[reduce-overhead-None] 2.0795ms 1.9881ms 502.9923 Ops/s 499.7359 Ops/s $\color{#35bf28}+0.65\%$
test_consolidate_njt[False-None] 0.1899s 10.1655ms 98.3723 Ops/s 117.2376 Ops/s $\textbf{\color{#d91a1a}-16.09\%}$
test_to[False-False-None] 2.2489ms 2.1319ms 469.0578 Ops/s 470.3355 Ops/s $\color{#d91a1a}-0.27\%$
test_to[True-False-None] 2.1774ms 1.9499ms 512.8370 Ops/s 519.4785 Ops/s $\color{#d91a1a}-1.28\%$
test_to[within-False-None] 6.4345ms 6.2855ms 159.0961 Ops/s 162.0755 Ops/s $\color{#d91a1a}-1.84\%$
test_to[True-default-None] 9.1127ms 8.9935ms 111.1917 Ops/s 110.8537 Ops/s $\color{#35bf28}+0.30\%$
test_to_njt[False-False-None] 8.6767ms 8.4953ms 117.7115 Ops/s 116.5723 Ops/s $\color{#35bf28}+0.98\%$
test_to_njt[True-False-None] 7.1187ms 6.9522ms 143.8393 Ops/s 141.0832 Ops/s $\color{#35bf28}+1.95\%$
test_to_njt[within-False-None] 15.9037ms 15.6612ms 63.8519 Ops/s 63.6388 Ops/s $\color{#35bf28}+0.33\%$
test_creation[device0] 0.3893ms 0.1156ms 8.6476 KOps/s 8.7918 KOps/s $\color{#d91a1a}-1.64\%$
test_creation_from_tensor 0.4008ms 0.1126ms 8.8814 KOps/s 8.8885 KOps/s $\color{#d91a1a}-0.08\%$
test_add_one[memmap_tensor0] 0.2113ms 6.5729μs 152.1402 KOps/s 154.3008 KOps/s $\color{#d91a1a}-1.40\%$
test_contiguous[memmap_tensor0] 37.9010μs 0.6795μs 1.4717 MOps/s 2.1132 MOps/s $\textbf{\color{#d91a1a}-30.36\%}$
test_stack[memmap_tensor0] 29.8200μs 4.7412μs 210.9157 KOps/s 215.8089 KOps/s $\color{#d91a1a}-2.27\%$
test_memmaptd_index 1.0123ms 0.2702ms 3.7014 KOps/s 3.7444 KOps/s $\color{#d91a1a}-1.15\%$
test_memmaptd_index_astensor 0.5478ms 0.3751ms 2.6662 KOps/s 2.6844 KOps/s $\color{#d91a1a}-0.68\%$
test_memmaptd_index_op 0.8009ms 0.6323ms 1.5815 KOps/s 1.6123 KOps/s $\color{#d91a1a}-1.91\%$
test_serialize_model 0.3066s 0.1602s 6.2437 Ops/s 7.2961 Ops/s $\textbf{\color{#d91a1a}-14.42\%}$
test_serialize_model_pickle 1.3491s 1.2139s 0.8238 Ops/s 0.8383 Ops/s $\color{#d91a1a}-1.73\%$
test_serialize_weights 0.1361s 0.1351s 7.4019 Ops/s 7.3522 Ops/s $\color{#35bf28}+0.68\%$
test_serialize_weights_returnearly 0.4125s 89.0543ms 11.2291 Ops/s 15.3062 Ops/s $\textbf{\color{#d91a1a}-26.64\%}$
test_serialize_weights_pickle 1.3647s 1.2145s 0.8234 Ops/s 0.8239 Ops/s $\color{#d91a1a}-0.06\%$
test_reshape_pytree 0.2019ms 33.5564μs 29.8006 KOps/s 30.5488 KOps/s $\color{#d91a1a}-2.45\%$
test_reshape_td 77.8810μs 45.4516μs 22.0014 KOps/s 21.7560 KOps/s $\color{#35bf28}+1.13\%$
test_view_pytree 0.2142ms 33.2890μs 30.0400 KOps/s 30.9739 KOps/s $\color{#d91a1a}-3.02\%$
test_view_td 90.6110μs 54.8281μs 18.2388 KOps/s 18.8674 KOps/s $\color{#d91a1a}-3.33\%$
test_unbind_pytree 0.2463ms 37.0064μs 27.0223 KOps/s 27.3673 KOps/s $\color{#d91a1a}-1.26\%$
test_unbind_td 0.1464ms 50.1595μs 19.9364 KOps/s 19.9410 KOps/s $\color{#d91a1a}-0.02\%$
test_split_pytree 0.2421ms 44.0152μs 22.7194 KOps/s 23.5495 KOps/s $\color{#d91a1a}-3.52\%$
test_split_td 0.2110ms 68.8477μs 14.5248 KOps/s 15.5420 KOps/s $\textbf{\color{#d91a1a}-6.54\%}$
test_add_pytree 0.2144ms 42.9235μs 23.2973 KOps/s 23.8637 KOps/s $\color{#d91a1a}-2.37\%$
test_add_td 0.1554ms 55.8831μs 17.8945 KOps/s 18.1894 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_add_one_nested[tensordict-compile] 0.2274ms 0.1435ms 6.9667 KOps/s 6.9170 KOps/s $\color{#35bf28}+0.72\%$
test_compile_add_one_nested[tensordict-eager] 0.4942ms 0.2035ms 4.9141 KOps/s 4.9295 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_add_one_nested[pytree-compile] 0.2473ms 0.1099ms 9.1018 KOps/s 8.9564 KOps/s $\color{#35bf28}+1.62\%$
test_compile_add_one_nested[pytree-eager] 0.4366ms 0.1855ms 5.3901 KOps/s 5.5298 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_copy_nested[tensordict-compile] 0.2985ms 10.0880μs 99.1281 KOps/s 96.9904 KOps/s $\color{#35bf28}+2.20\%$
test_compile_copy_nested[tensordict-eager] 83.9910μs 54.9978μs 18.1825 KOps/s 18.3518 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_copy_nested[pytree-compile] 0.1110ms 9.7253μs 102.8242 KOps/s 101.4444 KOps/s $\color{#35bf28}+1.36\%$
test_compile_copy_nested[pytree-eager] 0.4663ms 70.2254μs 14.2399 KOps/s 14.4426 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_add_one_flat[tensordict-compile] 0.2809ms 0.1798ms 5.5620 KOps/s 3.2232 KOps/s $\textbf{\color{#35bf28}+72.56\%}$
test_compile_add_one_flat[tensordict-eager] 0.3276ms 0.2805ms 3.5654 KOps/s 3.4825 KOps/s $\color{#35bf28}+2.38\%$
test_compile_add_one_flat[tensorclass-compile] 0.2197ms 0.1196ms 8.3606 KOps/s 8.0446 KOps/s $\color{#35bf28}+3.93\%$
test_compile_add_one_flat[tensorclass-eager] 0.1254ms 74.3996μs 13.4409 KOps/s 13.4479 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_add_one_flat[pytree-compile] 0.2167ms 0.1602ms 6.2426 KOps/s 6.1010 KOps/s $\color{#35bf28}+2.32\%$
test_compile_add_one_flat[pytree-eager] 0.8444ms 0.5560ms 1.7985 KOps/s 1.9284 KOps/s $\textbf{\color{#d91a1a}-6.73\%}$
test_compile_add_self_flat[tensordict-eager] 0.3998ms 0.3342ms 2.9924 KOps/s 2.9405 KOps/s $\color{#35bf28}+1.76\%$
test_compile_add_self_flat[tensordict-compile] 0.2856ms 0.1815ms 5.5084 KOps/s 5.0934 KOps/s $\textbf{\color{#35bf28}+8.15\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1402ms 90.7344μs 11.0212 KOps/s 10.9831 KOps/s $\color{#35bf28}+0.35\%$
test_compile_add_self_flat[tensorclass-compile] 0.2679ms 0.1242ms 8.0541 KOps/s 7.7297 KOps/s $\color{#35bf28}+4.20\%$
test_compile_add_self_flat[pytree-eager] 0.6653ms 0.4475ms 2.2346 KOps/s 2.3152 KOps/s $\color{#d91a1a}-3.48\%$
test_compile_add_self_flat[pytree-compile] 0.4628ms 0.1599ms 6.2549 KOps/s 6.0927 KOps/s $\color{#35bf28}+2.66\%$
test_compile_copy_flat[tensordict-compile] 0.1324ms 14.0139μs 71.3577 KOps/s 74.1772 KOps/s $\color{#d91a1a}-3.80\%$
test_compile_copy_flat[tensordict-eager] 72.3810μs 42.2409μs 23.6737 KOps/s 23.8852 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_copy_flat[pytree-compile] 46.7910μs 10.8474μs 92.1882 KOps/s 92.2041 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_copy_flat[pytree-eager] 0.4129ms 52.6440μs 18.9955 KOps/s 18.9869 KOps/s $\color{#35bf28}+0.05\%$
test_compile_assign_and_add[tensordict-compile] 2.0779ms 0.1764ms 5.6702 KOps/s 5.4476 KOps/s $\color{#35bf28}+4.09\%$
test_compile_assign_and_add[tensordict-eager] 3.5427ms 3.3465ms 298.8169 Ops/s 296.1359 Ops/s $\color{#35bf28}+0.91\%$
test_compile_assign_and_add[pytree-compile] 1.9916ms 0.1621ms 6.1682 KOps/s 6.0582 KOps/s $\color{#35bf28}+1.82\%$
test_compile_assign_and_add[pytree-eager] 3.0329ms 2.8367ms 352.5185 Ops/s 359.9451 Ops/s $\color{#d91a1a}-2.06\%$
test_compile_indexing[tensor-tensordict-compile] 0.1746ms 0.1108ms 9.0269 KOps/s 8.8110 KOps/s $\color{#35bf28}+2.45\%$
test_compile_indexing[tensor-tensordict-eager] 0.3204ms 74.9167μs 13.3482 KOps/s 13.5388 KOps/s $\color{#d91a1a}-1.41\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1875ms 97.7388μs 10.2314 KOps/s 10.2855 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2563ms 45.7911μs 21.8383 KOps/s 22.4211 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_indexing[tensor-pytree-compile] 0.1371ms 98.1502μs 10.1885 KOps/s 10.2749 KOps/s $\color{#d91a1a}-0.84\%$
test_compile_indexing[tensor-pytree-eager] 0.2468ms 45.4953μs 21.9803 KOps/s 22.1358 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_indexing[slice-tensordict-compile] 0.2471ms 56.2677μs 17.7722 KOps/s 16.7054 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_compile_indexing[slice-tensordict-eager] 0.2231ms 27.6216μs 36.2035 KOps/s 35.3344 KOps/s $\color{#35bf28}+2.46\%$
test_compile_indexing[slice-tensorclass-compile] 0.1256ms 44.1262μs 22.6623 KOps/s 22.4223 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[slice-tensorclass-eager] 0.2564ms 22.8946μs 43.6783 KOps/s 43.8822 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_indexing[slice-pytree-compile] 91.6020μs 44.5346μs 22.4544 KOps/s 21.6964 KOps/s $\color{#35bf28}+3.49\%$
test_compile_indexing[slice-pytree-eager] 0.2662ms 22.9175μs 43.6348 KOps/s 44.0923 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_indexing[int-tensordict-compile] 99.5310μs 56.7490μs 17.6215 KOps/s 16.5378 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_compile_indexing[int-tensordict-eager] 0.2299ms 27.5752μs 36.2644 KOps/s 36.0428 KOps/s $\color{#35bf28}+0.61\%$
test_compile_indexing[int-tensorclass-compile] 89.0910μs 45.2773μs 22.0861 KOps/s 22.2518 KOps/s $\color{#d91a1a}-0.74\%$
test_compile_indexing[int-tensorclass-eager] 0.2598ms 22.9538μs 43.5658 KOps/s 44.2598 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_indexing[int-pytree-compile] 0.2195ms 45.1730μs 22.1371 KOps/s 21.9945 KOps/s $\color{#35bf28}+0.65\%$
test_compile_indexing[int-pytree-eager] 0.2699ms 22.7791μs 43.8998 KOps/s 44.0267 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_replace[single-eager] 89.2410μs 49.8617μs 20.0555 KOps/s 20.9880 KOps/s $\color{#d91a1a}-4.44\%$
test_compile_replace[single-compile] 0.1889ms 0.1061ms 9.4260 KOps/s 9.3781 KOps/s $\color{#35bf28}+0.51\%$
test_compile_replace[multi-eager] 0.6707ms 0.5729ms 1.7455 KOps/s 1.7885 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_replace[multi-compile] 0.2434ms 0.1151ms 8.6904 KOps/s 8.8286 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_tc_getattr_20[eager] 0.2227ms 0.1763ms 5.6705 KOps/s 5.9985 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_compile_tc_getattr_20[compile] 0.4617ms 0.1218ms 8.2074 KOps/s 8.1725 KOps/s $\color{#35bf28}+0.43\%$
test_compile_clone_shallow[20-eager] 58.5910μs 19.5817μs 51.0681 KOps/s 52.6690 KOps/s $\color{#d91a1a}-3.04\%$
test_compile_clone_shallow[20-compile] 60.6900μs 11.1404μs 89.7630 KOps/s 86.1126 KOps/s $\color{#35bf28}+4.24\%$
test_compile_clone_shallow[40-eager] 95.0310μs 34.4155μs 29.0566 KOps/s 29.1954 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_clone_shallow[40-compile] 67.3710μs 12.3707μs 80.8362 KOps/s 79.2337 KOps/s $\color{#35bf28}+2.02\%$
test_compile_clone_shallow[80-eager] 91.4510μs 64.0364μs 15.6161 KOps/s 15.6708 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_clone_shallow[80-compile] 66.3010μs 14.8484μs 67.3474 KOps/s 66.8236 KOps/s $\color{#35bf28}+0.78\%$
test_compile_update_inplace[eager] 99.8520μs 59.3369μs 16.8529 KOps/s 16.5668 KOps/s $\color{#35bf28}+1.73\%$
test_compile_update_inplace[compile] 0.3293ms 0.1402ms 7.1349 KOps/s 6.8959 KOps/s $\color{#35bf28}+3.47\%$
test_mod_add[eager] 0.1007ms 49.2188μs 20.3174 KOps/s 20.4003 KOps/s $\color{#d91a1a}-0.41\%$
test_mod_add[compile] 0.1691ms 0.1089ms 9.1820 KOps/s 8.9282 KOps/s $\color{#35bf28}+2.84\%$
test_mod_add[compile-overhead] 0.2349ms 0.1506ms 6.6392 KOps/s 6.6482 KOps/s $\color{#d91a1a}-0.13\%$
test_mod_wrap[eager] 0.3784ms 0.2892ms 3.4581 KOps/s 3.3995 KOps/s $\color{#35bf28}+1.72\%$
test_mod_wrap[compile] 0.4856ms 0.3501ms 2.8564 KOps/s 2.8039 KOps/s $\color{#35bf28}+1.87\%$
test_mod_wrap[compile-overhead] 7.1970ms 3.9626ms 252.3589 Ops/s 247.9920 Ops/s $\color{#35bf28}+1.76\%$
test_mod_wrap_and_backward[eager] 1.5860ms 1.4876ms 672.2268 Ops/s 669.6426 Ops/s $\color{#35bf28}+0.39\%$
test_mod_wrap_and_backward[compile] 1.5976ms 1.4542ms 687.6661 Ops/s 637.1431 Ops/s $\textbf{\color{#35bf28}+7.93\%}$
test_mod_wrap_and_backward[compile-overhead] 1.2588ms 0.8916ms 1.1216 KOps/s 988.2418 Ops/s $\textbf{\color{#35bf28}+13.49\%}$
test_seq_add[eager] 0.2199ms 0.1546ms 6.4689 KOps/s 6.4641 KOps/s $\color{#35bf28}+0.07\%$
test_seq_add[compile] 0.1700ms 0.1140ms 8.7707 KOps/s 8.1595 KOps/s $\textbf{\color{#35bf28}+7.49\%}$
test_seq_add[compile-overhead] 0.4162ms 0.1557ms 6.4225 KOps/s 6.3071 KOps/s $\color{#35bf28}+1.83\%$
test_seq_wrap[eager] 0.6206ms 0.5158ms 1.9387 KOps/s 1.9191 KOps/s $\color{#35bf28}+1.02\%$
test_seq_wrap[compile] 0.4139ms 0.3655ms 2.7363 KOps/s 2.6778 KOps/s $\color{#35bf28}+2.18\%$
test_seq_wrap[compile-overhead] 0.4177ms 0.2670ms 3.7456 KOps/s 3.6949 KOps/s $\color{#35bf28}+1.37\%$
test_func_call_runtime[False-eager] 0.9374ms 0.8459ms 1.1822 KOps/s 1.2048 KOps/s $\color{#d91a1a}-1.88\%$
test_func_call_runtime[False-compile] 1.0714ms 0.9205ms 1.0864 KOps/s 1.0806 KOps/s $\color{#35bf28}+0.54\%$
test_func_call_runtime[False-compile-overhead] 0.5742ms 0.4668ms 2.1421 KOps/s 2.1228 KOps/s $\color{#35bf28}+0.91\%$
test_func_call_runtime[True-eager] 1.3953ms 1.0864ms 920.4903 Ops/s 935.8363 Ops/s $\color{#d91a1a}-1.64\%$
test_func_call_runtime[True-compile] 1.0339ms 0.9264ms 1.0794 KOps/s 1.0710 KOps/s $\color{#35bf28}+0.79\%$
test_func_call_runtime[True-compile-overhead] 0.5350ms 0.4818ms 2.0757 KOps/s 2.0577 KOps/s $\color{#35bf28}+0.87\%$
test_func_call_cm_runtime[False-eager] 0.9491ms 0.8366ms 1.1953 KOps/s 1.2101 KOps/s $\color{#d91a1a}-1.22\%$
test_func_call_cm_runtime[False-compile] 1.0330ms 0.9621ms 1.0394 KOps/s 1.0853 KOps/s $\color{#d91a1a}-4.23\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5439ms 0.4707ms 2.1246 KOps/s 2.1184 KOps/s $\color{#35bf28}+0.29\%$
test_func_call_cm_runtime[True-eager] 1.3138ms 1.2270ms 815.0063 Ops/s 819.2552 Ops/s $\color{#d91a1a}-0.52\%$
test_func_call_cm_runtime[True-compile] 1.0221ms 0.9659ms 1.0353 KOps/s 1.0279 KOps/s $\color{#35bf28}+0.72\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5762ms 0.5138ms 1.9463 KOps/s 1.9281 KOps/s $\color{#35bf28}+0.94\%$
test_vmap_func_call_cm_runtime[eager] 2.8691ms 2.3783ms 420.4772 Ops/s 423.0646 Ops/s $\color{#d91a1a}-0.61\%$
test_vmap_func_call_cm_runtime[compile] 1.1337ms 0.9878ms 1.0123 KOps/s 1.0096 KOps/s $\color{#35bf28}+0.27\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6087ms 0.5189ms 1.9272 KOps/s 1.8849 KOps/s $\color{#35bf28}+2.24\%$
test_distributed 0.6407ms 0.1531ms 6.5321 KOps/s 6.4832 KOps/s $\color{#35bf28}+0.75\%$
test_tdmodule 59.8810μs 27.0903μs 36.9135 KOps/s 35.5445 KOps/s $\color{#35bf28}+3.85\%$
test_tdmodule_dispatch 74.8910μs 45.1499μs 22.1484 KOps/s 21.6399 KOps/s $\color{#35bf28}+2.35\%$
test_tdseq 57.9410μs 26.8737μs 37.2110 KOps/s 36.6268 KOps/s $\color{#35bf28}+1.60\%$
test_tdseq_dispatch 66.8400μs 47.0024μs 21.2755 KOps/s 20.7944 KOps/s $\color{#35bf28}+2.31\%$
test_instantiation_functorch 2.1885ms 2.0928ms 477.8185 Ops/s 475.8930 Ops/s $\color{#35bf28}+0.40\%$
test_exec_functorch 0.2462ms 0.1803ms 5.5457 KOps/s 5.5717 KOps/s $\color{#d91a1a}-0.47\%$
test_exec_functional_call 0.2022ms 0.1609ms 6.2152 KOps/s 6.2474 KOps/s $\color{#d91a1a}-0.52\%$
test_exec_td_decorator 0.4512ms 0.2369ms 4.2212 KOps/s 4.2578 KOps/s $\color{#d91a1a}-0.86\%$
test_vmap_mlp_speed_decorator[True-True] 1.0189ms 0.8235ms 1.2144 KOps/s 1.2088 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[True-False] 0.9977ms 0.8203ms 1.2191 KOps/s 1.2078 KOps/s $\color{#35bf28}+0.93\%$
test_vmap_mlp_speed_decorator[False-True] 0.8845ms 0.7085ms 1.4115 KOps/s 1.4017 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_mlp_speed_decorator[False-False] 0.8835ms 0.7076ms 1.4133 KOps/s 1.4113 KOps/s $\color{#35bf28}+0.14\%$
test_vmap_transformer_speed_decorator[True-True] 21.3439ms 20.5688ms 48.6174 Ops/s 48.4830 Ops/s $\color{#35bf28}+0.28\%$
test_vmap_transformer_speed_decorator[True-False] 20.7685ms 20.5907ms 48.5655 Ops/s 48.5379 Ops/s $\color{#35bf28}+0.06\%$
test_vmap_transformer_speed_decorator[False-True] 20.5747ms 20.3616ms 49.1120 Ops/s 49.1323 Ops/s $\color{#d91a1a}-0.04\%$
test_vmap_transformer_speed_decorator[False-False] 20.4742ms 20.3470ms 49.1473 Ops/s 48.7692 Ops/s $\color{#35bf28}+0.78\%$
test_to_module_speed[True] 1.5788ms 1.4742ms 678.3140 Ops/s 681.6393 Ops/s $\color{#d91a1a}-0.49\%$
test_to_module_speed[False] 1.5782ms 1.4548ms 687.3684 Ops/s 679.3855 Ops/s $\color{#35bf28}+1.18\%$
test_tc_init 79.7510μs 44.9563μs 22.2438 KOps/s 22.5273 KOps/s $\color{#d91a1a}-1.26\%$
test_tc_init_tensor_only 30.4000μs 9.8980μs 101.0304 KOps/s 102.8888 KOps/s $\color{#d91a1a}-1.81\%$
test_tc_init_nested 0.1271ms 89.5690μs 11.1646 KOps/s 11.4127 KOps/s $\color{#d91a1a}-2.17\%$
test_tc_init_many_fields 45.2400μs 16.2779μs 61.4329 KOps/s 60.7870 KOps/s $\color{#35bf28}+1.06\%$
test_tc_first_layer_tensor 19.9400μs 1.8347μs 545.0502 KOps/s 544.7623 KOps/s $\color{#35bf28}+0.05\%$
test_tc_first_layer_tensor_only 3.6870μs 0.4025μs 2.4844 MOps/s 2.5293 MOps/s $\color{#d91a1a}-1.77\%$
test_tc_first_layer_tensor_set 29.8010μs 3.9379μs 253.9434 KOps/s 253.2634 KOps/s $\color{#35bf28}+0.27\%$
test_tc_first_layer_tensor_only_set 37.0810μs 3.3002μs 303.0153 KOps/s 302.0014 KOps/s $\color{#35bf28}+0.34\%$
test_tc_first_layer_nontensor 42.6310μs 6.2413μs 160.2219 KOps/s 159.5754 KOps/s $\color{#35bf28}+0.41\%$
test_tc_second_layer_tensor 25.5110μs 4.4723μs 223.5973 KOps/s 224.6140 KOps/s $\color{#d91a1a}-0.45\%$
test_tc_second_layer_nontensor 36.7600μs 8.8144μs 113.4505 KOps/s 113.1819 KOps/s $\color{#35bf28}+0.24\%$
test_unbind 0.2479s 17.0554ms 58.6325 Ops/s 66.7870 Ops/s $\textbf{\color{#d91a1a}-12.21\%}$
test_full_like 4.7497ms 4.3258ms 231.1710 Ops/s 228.7843 Ops/s $\color{#35bf28}+1.04\%$
test_zeros_like 4.9650ms 4.3699ms 228.8368 Ops/s 228.8900 Ops/s $\color{#d91a1a}-0.02\%$
test_ones_like 4.7539ms 4.3643ms 229.1320 Ops/s 228.6035 Ops/s $\color{#35bf28}+0.23\%$
test_clone 6.7556ms 6.4051ms 156.1258 Ops/s 154.3121 Ops/s $\color{#35bf28}+1.18\%$
test_squeeze 0.1724ms 14.3234μs 69.8160 KOps/s 69.4068 KOps/s $\color{#35bf28}+0.59\%$
test_unsqueeze 0.1631ms 0.1125ms 8.8928 KOps/s 9.0071 KOps/s $\color{#d91a1a}-1.27\%$
test_split 0.2392ms 0.1867ms 5.3569 KOps/s 5.3577 KOps/s $\color{#d91a1a}-0.02\%$
test_permute 0.2541ms 0.2051ms 4.8756 KOps/s 4.8209 KOps/s $\color{#35bf28}+1.13\%$
test_stack 51.0125ms 50.3299ms 19.8689 Ops/s 23.3613 Ops/s $\textbf{\color{#d91a1a}-14.95\%}$
test_cat 51.2316ms 50.4308ms 19.8291 Ops/s 19.6964 Ops/s $\color{#35bf28}+0.67\%$
test_sequential_tensordict 0.2537ms 0.2139ms 4.6757 KOps/s 4.5342 KOps/s $\color{#35bf28}+3.12\%$
test_sequential_graph_module 0.5427ms 0.1174ms 8.5208 KOps/s 8.3929 KOps/s $\color{#35bf28}+1.52\%$
test_nested_tensordict 0.5378ms 0.2929ms 3.4142 KOps/s 3.5136 KOps/s $\color{#d91a1a}-2.83\%$
test_nested_graph_module 0.5212ms 0.1366ms 7.3225 KOps/s 7.6304 KOps/s $\color{#d91a1a}-4.03\%$
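
For readers checking the table by hand, the Change column is consistent with the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces two rows. The ±5% cut-off for counting a benchmark as improved or worsened is an assumption inferred from which rows are highlighted, not a documented setting, and the function names are hypothetical.

```python
def relative_change_pct(ops: float, ops_on_head: float) -> float:
    """Percent change of this PR's throughput against repo HEAD.

    Both arguments must use the same unit (some rows mix KOps/s and Ops/s).
    """
    return (ops - ops_on_head) / ops_on_head * 100.0


# Assumption: inferred from the highlighted rows, not taken from the bot's config.
ASSUMED_THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = ASSUMED_THRESHOLD_PCT) -> str:
    """Hypothetical helper: bucket a change as improved, worsened, or unchanged."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# test_plain_set_nested: 67.1352 KOps/s vs 68.1976 KOps/s on HEAD
set_nested = relative_change_pct(67.1352, 68.1976)
# test_chunk: 825.9609 Ops/s vs 931.4663 Ops/s on HEAD
chunk = relative_change_pct(825.9609, 931.4663)

print(f"{set_nested:+.2f}% {classify(set_nested)}")
print(f"{chunk:+.2f}% {classify(chunk)}")
```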

@github-actions
Contributor

github-actions bot commented Mar 9, 2026

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 261. Improved: 29. Worsened: 8.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 35.3910μs 14.9723μs 66.7899 KOps/s 66.8608 KOps/s $\color{#d91a1a}-0.11\%$
test_plain_set_stack_nested 43.8810μs 15.2830μs 65.4323 KOps/s 66.0761 KOps/s $\color{#d91a1a}-0.97\%$
test_plain_set_nested_inplace 56.0510μs 16.9309μs 59.0635 KOps/s 58.8241 KOps/s $\color{#35bf28}+0.41\%$
test_plain_set_stack_nested_inplace 40.1610μs 16.7245μs 59.7926 KOps/s 60.0285 KOps/s $\color{#d91a1a}-0.39\%$
test_items 26.0210μs 5.9304μs 168.6233 KOps/s 165.8638 KOps/s $\color{#35bf28}+1.66\%$
test_items_nested 0.6796ms 0.4686ms 2.1338 KOps/s 2.1316 KOps/s $\color{#35bf28}+0.10\%$
test_items_nested_locked 0.6796ms 0.4632ms 2.1587 KOps/s 2.0995 KOps/s $\color{#35bf28}+2.82\%$
test_items_nested_leaf 0.1574ms 97.1424μs 10.2942 KOps/s 10.1291 KOps/s $\color{#35bf28}+1.63\%$
test_items_stack_nested 0.5602ms 0.4638ms 2.1560 KOps/s 2.1389 KOps/s $\color{#35bf28}+0.80\%$
test_items_stack_nested_leaf 0.1533ms 97.8682μs 10.2178 KOps/s 10.1887 KOps/s $\color{#35bf28}+0.29\%$
test_items_stack_nested_locked 0.5701ms 0.4650ms 2.1503 KOps/s 2.1249 KOps/s $\color{#35bf28}+1.20\%$
test_keys 50.4810μs 4.2251μs 236.6801 KOps/s 234.3704 KOps/s $\color{#35bf28}+0.99\%$
test_keys_nested 0.2007ms 0.1297ms 7.7085 KOps/s 7.6436 KOps/s $\color{#35bf28}+0.85\%$
test_keys_nested_locked 2.1030ms 0.1381ms 7.2428 KOps/s 7.1504 KOps/s $\color{#35bf28}+1.29\%$
test_keys_nested_leaf 0.1916ms 0.1192ms 8.3877 KOps/s 8.2515 KOps/s $\color{#35bf28}+1.65\%$
test_keys_stack_nested 0.2022ms 0.1285ms 7.7793 KOps/s 7.6742 KOps/s $\color{#35bf28}+1.37\%$
test_keys_stack_nested_leaf 0.1750ms 0.1199ms 8.3420 KOps/s 8.2461 KOps/s $\color{#35bf28}+1.16\%$
test_keys_stack_nested_locked 0.2190ms 0.1361ms 7.3492 KOps/s 7.2146 KOps/s $\color{#35bf28}+1.87\%$
test_values 5.3040μs 1.0232μs 977.2800 KOps/s 978.4221 KOps/s $\color{#d91a1a}-0.12\%$
test_values_nested 95.3020μs 52.4817μs 19.0543 KOps/s 18.9036 KOps/s $\color{#35bf28}+0.80\%$
test_values_nested_locked 93.5920μs 55.7703μs 17.9307 KOps/s 17.8345 KOps/s $\color{#35bf28}+0.54\%$
test_values_nested_leaf 0.1106ms 59.8670μs 16.7037 KOps/s 16.6340 KOps/s $\color{#35bf28}+0.42\%$
test_values_stack_nested 86.5720μs 52.4530μs 19.0647 KOps/s 18.9175 KOps/s $\color{#35bf28}+0.78\%$
test_values_stack_nested_leaf 0.1248ms 59.8865μs 16.6982 KOps/s 16.5428 KOps/s $\color{#35bf28}+0.94\%$
test_values_stack_nested_locked 89.5620μs 56.2238μs 17.7861 KOps/s 17.7666 KOps/s $\color{#35bf28}+0.11\%$
test_membership 4.8567μs 0.8519μs 1.1738 MOps/s 1.1756 MOps/s $\color{#d91a1a}-0.16\%$
test_membership_nested 32.5600μs 2.8761μs 347.6936 KOps/s 350.9213 KOps/s $\color{#d91a1a}-0.92\%$
test_membership_nested_leaf 33.1610μs 2.8506μs 350.7975 KOps/s 344.6430 KOps/s $\color{#35bf28}+1.79\%$
test_membership_stacked_nested 57.2810μs 2.8653μs 349.0004 KOps/s 345.2944 KOps/s $\color{#35bf28}+1.07\%$
test_membership_stacked_nested_leaf 39.6000μs 2.9022μs 344.5618 KOps/s 344.3944 KOps/s $\color{#35bf28}+0.05\%$
test_membership_nested_last 30.4300μs 4.3403μs 230.3973 KOps/s 228.7415 KOps/s $\color{#35bf28}+0.72\%$
test_membership_nested_leaf_last 32.9000μs 4.4020μs 227.1699 KOps/s 237.8298 KOps/s $\color{#d91a1a}-4.48\%$
test_membership_stacked_nested_last 52.3020μs 4.3363μs 230.6117 KOps/s 230.0793 KOps/s $\color{#35bf28}+0.23\%$
test_membership_stacked_nested_leaf_last 33.2300μs 4.3457μs 230.1100 KOps/s 229.0139 KOps/s $\color{#35bf28}+0.48\%$
test_nested_getleaf 49.3210μs 21.8753μs 45.7136 KOps/s 45.9728 KOps/s $\color{#d91a1a}-0.56\%$
test_nested_get 86.6720μs 20.3989μs 49.0224 KOps/s 48.2292 KOps/s $\color{#35bf28}+1.64\%$
test_stacked_getleaf 55.3510μs 21.3656μs 46.8041 KOps/s 46.7281 KOps/s $\color{#35bf28}+0.16\%$
test_stacked_get 43.0310μs 20.1726μs 49.5721 KOps/s 48.7055 KOps/s $\color{#35bf28}+1.78\%$
test_nested_getitemleaf 64.3810μs 21.5897μs 46.3183 KOps/s 45.2606 KOps/s $\color{#35bf28}+2.34\%$
test_nested_getitem 42.0610μs 20.8603μs 47.9379 KOps/s 47.2250 KOps/s $\color{#35bf28}+1.51\%$
test_stacked_getitemleaf 50.2310μs 21.9228μs 45.6146 KOps/s 46.1747 KOps/s $\color{#d91a1a}-1.21\%$
test_stacked_getitem 48.4710μs 20.8699μs 47.9160 KOps/s 47.5780 KOps/s $\color{#35bf28}+0.71\%$
test_lock_nested 0.5720ms 0.4796ms 2.0852 KOps/s 2.1170 KOps/s $\color{#d91a1a}-1.50\%$
test_lock_stack_nested 0.5203ms 0.4825ms 2.0726 KOps/s 2.0736 KOps/s $\color{#d91a1a}-0.05\%$
test_unlock_nested 0.4931ms 0.3925ms 2.5478 KOps/s 2.5832 KOps/s $\color{#d91a1a}-1.37\%$
test_unlock_stack_nested 0.4239ms 0.3897ms 2.5658 KOps/s 2.5661 KOps/s $\color{#d91a1a}-0.01\%$
test_flatten_speed 0.1650ms 0.1225ms 8.1615 KOps/s 8.2088 KOps/s $\color{#d91a1a}-0.58\%$
test_unflatten_speed 0.6239ms 0.5709ms 1.7516 KOps/s 1.7339 KOps/s $\color{#35bf28}+1.03\%$
test_common_ops 0.8299ms 0.6940ms 1.4409 KOps/s 1.4227 KOps/s $\color{#35bf28}+1.28\%$
test_creation 71.3510μs 3.1591μs 316.5414 KOps/s 316.5122 KOps/s $+0.01\%$
test_creation_empty 29.3300μs 7.0026μs 142.8044 KOps/s 143.2746 KOps/s $\color{#d91a1a}-0.33\%$
test_creation_nested_1 44.2310μs 11.5860μs 86.3108 KOps/s 87.5113 KOps/s $\color{#d91a1a}-1.37\%$
test_creation_nested_2 47.6310μs 13.3005μs 75.1853 KOps/s 74.8104 KOps/s $\color{#35bf28}+0.50\%$
test_creation_many_keys[10] 61.1920μs 21.1819μs 47.2102 KOps/s 48.2463 KOps/s $\color{#d91a1a}-2.15\%$
test_creation_many_keys[50] 0.1331ms 92.4520μs 10.8164 KOps/s 11.2012 KOps/s $\color{#d91a1a}-3.44\%$
test_creation_many_keys[100] 0.2575ms 0.1792ms 5.5795 KOps/s 5.7062 KOps/s $\color{#d91a1a}-2.22\%$
test_creation_nested_many_keys[10] 84.6420μs 45.4704μs 21.9923 KOps/s 22.3856 KOps/s $\color{#d91a1a}-1.76\%$
test_creation_nested_many_keys[50] 0.2510ms 0.1851ms 5.4014 KOps/s 5.5120 KOps/s $\color{#d91a1a}-2.01\%$
test_clone 43.9910μs 13.4605μs 74.2912 KOps/s 74.2777 KOps/s $\color{#35bf28}+0.02\%$
test_getitem[int] 1.6187ms 15.3690μs 65.0660 KOps/s 60.4168 KOps/s $\textbf{\color{#35bf28}+7.70\%}$
test_getitem[slice_int] 0.1339ms 24.4610μs 40.8814 KOps/s 40.7903 KOps/s $\color{#35bf28}+0.22\%$
test_getitem[range] 0.1714ms 63.6160μs 15.7193 KOps/s 15.7871 KOps/s $\color{#d91a1a}-0.43\%$
test_getitem[tuple] 0.1412ms 24.2258μs 41.2783 KOps/s 41.4831 KOps/s $\color{#d91a1a}-0.49\%$
test_getitem[list] 0.1791ms 57.5611μs 17.3729 KOps/s 17.1181 KOps/s $\color{#35bf28}+1.49\%$
test_setitem_dim[int] 54.2010μs 25.9435μs 38.5453 KOps/s 38.2455 KOps/s $\color{#35bf28}+0.78\%$
test_setitem_dim[slice_int] 64.6910μs 42.7909μs 23.3694 KOps/s 22.9032 KOps/s $\color{#35bf28}+2.04\%$
test_setitem_dim[range] 0.1431ms 96.0882μs 10.4071 KOps/s 10.4513 KOps/s $\color{#d91a1a}-0.42\%$
test_setitem_dim[tuple] 59.0810μs 39.6403μs 25.2268 KOps/s 24.7458 KOps/s $\color{#35bf28}+1.94\%$
test_setitem 47.7410μs 17.7578μs 56.3132 KOps/s 56.0608 KOps/s $\color{#35bf28}+0.45\%$
test_set 44.7310μs 17.3460μs 57.6502 KOps/s 58.3135 KOps/s $\color{#d91a1a}-1.14\%$
test_set_shared 0.5072ms 0.2056ms 4.8645 KOps/s 4.9076 KOps/s $\color{#d91a1a}-0.88\%$
test_update 0.3205ms 21.6810μs 46.1233 KOps/s 46.5528 KOps/s $\color{#d91a1a}-0.92\%$
test_update_nested 69.2320μs 33.9039μs 29.4951 KOps/s 30.8475 KOps/s $\color{#d91a1a}-4.38\%$
test_update__nested 0.4638ms 34.2778μs 29.1734 KOps/s 29.1852 KOps/s $\color{#d91a1a}-0.04\%$
test_set_nested 65.2110μs 18.9893μs 52.6612 KOps/s 53.3394 KOps/s $\color{#d91a1a}-1.27\%$
test_set_nested_new 57.9510μs 24.2915μs 41.1667 KOps/s 41.6687 KOps/s $\color{#d91a1a}-1.20\%$
test_select 67.7020μs 40.2704μs 24.8321 KOps/s 24.4834 KOps/s $\color{#35bf28}+1.42\%$
test_select_nested 0.1193ms 74.8374μs 13.3623 KOps/s 13.2481 KOps/s $\color{#35bf28}+0.86\%$
test_exclude_nested 0.1473ms 91.9816μs 10.8717 KOps/s 10.7953 KOps/s $\color{#35bf28}+0.71\%$
test_empty[True] 0.4444ms 0.3998ms 2.5012 KOps/s 2.4793 KOps/s $\color{#35bf28}+0.88\%$
test_empty[False] 17.1503μs 1.3093μs 763.7459 KOps/s 751.9976 KOps/s $\color{#35bf28}+1.56\%$
test_to 0.1058ms 72.1215μs 13.8655 KOps/s 13.1159 KOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_to_nonblocking 0.1076ms 65.1273μs 15.3545 KOps/s 15.5097 KOps/s $\color{#d91a1a}-1.00\%$
test_unbind_speed 0.3727ms 0.3385ms 2.9545 KOps/s 3.0208 KOps/s $\color{#d91a1a}-2.20\%$
test_unbind_speed_stack0 0.4457ms 0.3343ms 2.9918 KOps/s 3.0336 KOps/s $\color{#d91a1a}-1.38\%$
test_unbind_speed_stack1 0.1048s 0.9228ms 1.0837 KOps/s 961.5343 Ops/s $\textbf{\color{#35bf28}+12.70\%}$
test_split 1.2145ms 1.1455ms 873.0075 Ops/s 879.0211 Ops/s $\color{#d91a1a}-0.68\%$
test_chunk 0.1047s 1.2154ms 822.7528 Ops/s 922.7470 Ops/s $\textbf{\color{#d91a1a}-10.84\%}$
test_to_cpu_blocking 19.9641ms 19.5554ms 51.1368 Ops/s 46.8970 Ops/s $\textbf{\color{#35bf28}+9.04\%}$
test_to_cpu_global_sync 11.6448ms 11.5017ms 86.9437 Ops/s 88.6010 Ops/s $\color{#d91a1a}-1.87\%$
test_to_cpu_event_sync 12.7607ms 12.4191ms 80.5211 Ops/s 81.8642 Ops/s $\color{#d91a1a}-1.64\%$
test_to_cpu_default 0.1168s 13.6564ms 73.2259 Ops/s 81.4709 Ops/s $\textbf{\color{#d91a1a}-10.12\%}$
test_consolidate[False-None] 4.3773ms 4.1813ms 239.1579 Ops/s 217.6148 Ops/s $\textbf{\color{#35bf28}+9.90\%}$
test_consolidate[default-None] 2.1591ms 2.0598ms 485.4758 Ops/s 486.5741 Ops/s $\color{#d91a1a}-0.23\%$
test_consolidate[reduce-overhead-None] 2.0589ms 1.9856ms 503.6257 Ops/s 507.6965 Ops/s $\color{#d91a1a}-0.80\%$
test_consolidate_njt[False-None] 8.7951ms 8.5541ms 116.9030 Ops/s 117.3420 Ops/s $\color{#d91a1a}-0.37\%$
test_to[False-False-None] 2.2533ms 2.1359ms 468.1801 Ops/s 477.8175 Ops/s $\color{#d91a1a}-2.02\%$
test_to[True-False-None] 2.1849ms 1.9643ms 509.0749 Ops/s 518.0176 Ops/s $\color{#d91a1a}-1.73\%$
test_to[within-False-None] 6.3975ms 6.2594ms 159.7587 Ops/s 163.0284 Ops/s $\color{#d91a1a}-2.01\%$
test_to[True-default-None] 9.4362ms 9.0351ms 110.6799 Ops/s 113.2152 Ops/s $\color{#d91a1a}-2.24\%$
test_to_njt[False-False-None] 8.8861ms 8.4840ms 117.8696 Ops/s 117.2302 Ops/s $\color{#35bf28}+0.55\%$
test_to_njt[True-False-None] 7.3984ms 7.2061ms 138.7705 Ops/s 143.1529 Ops/s $\color{#d91a1a}-3.06\%$
test_to_njt[within-False-None] 16.3167ms 15.5354ms 64.3692 Ops/s 63.7793 Ops/s $\color{#35bf28}+0.93\%$
test_creation[device0] 0.4494ms 0.1201ms 8.3279 KOps/s 8.6435 KOps/s $\color{#d91a1a}-3.65\%$
test_creation_from_tensor 0.4796ms 0.1193ms 8.3819 KOps/s 8.8783 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_add_one[memmap_tensor0] 0.3934ms 6.6968μs 149.3243 KOps/s 148.7723 KOps/s $\color{#35bf28}+0.37\%$
test_contiguous[memmap_tensor0] 25.5800μs 0.6829μs 1.4644 MOps/s 2.1618 MOps/s $\textbf{\color{#d91a1a}-32.26\%}$
test_stack[memmap_tensor0] 37.1210μs 4.7761μs 209.3769 KOps/s 215.8714 KOps/s $\color{#d91a1a}-3.01\%$
test_memmaptd_index 1.0247ms 0.2759ms 3.6246 KOps/s 3.7236 KOps/s $\color{#d91a1a}-2.66\%$
test_memmaptd_index_astensor 0.5266ms 0.3797ms 2.6338 KOps/s 2.6915 KOps/s $\color{#d91a1a}-2.14\%$
test_memmaptd_index_op 0.7747ms 0.6341ms 1.5770 KOps/s 1.5859 KOps/s $\color{#d91a1a}-0.56\%$
test_serialize_model 0.1366s 0.1348s 7.4177 Ops/s 7.3790 Ops/s $\color{#35bf28}+0.52\%$
test_serialize_model_pickle 1.3510s 1.1939s 0.8376 Ops/s 0.8384 Ops/s $\color{#d91a1a}-0.09\%$
test_serialize_weights 0.1365s 0.1347s 7.4251 Ops/s 7.4223 Ops/s $\color{#35bf28}+0.04\%$
test_serialize_weights_returnearly 0.4343s 87.9008ms 11.3765 Ops/s 15.5709 Ops/s $\textbf{\color{#d91a1a}-26.94\%}$
test_serialize_weights_pickle 1.3651s 1.2144s 0.8234 Ops/s 0.8389 Ops/s $\color{#d91a1a}-1.84\%$
test_reshape_pytree 0.2088ms 33.3624μs 29.9739 KOps/s 29.3190 KOps/s $\color{#35bf28}+2.23\%$
test_reshape_td 78.4710μs 46.3858μs 21.5583 KOps/s 21.8380 KOps/s $\color{#d91a1a}-1.28\%$
test_view_pytree 0.2154ms 32.4569μs 30.8101 KOps/s 30.4746 KOps/s $\color{#35bf28}+1.10\%$
test_view_td 0.2171ms 55.0047μs 18.1803 KOps/s 18.4322 KOps/s $\color{#d91a1a}-1.37\%$
test_unbind_pytree 0.2428ms 36.5795μs 27.3377 KOps/s 27.1667 KOps/s $\color{#35bf28}+0.63\%$
test_unbind_td 0.1964ms 50.5159μs 19.7957 KOps/s 20.2591 KOps/s $\color{#d91a1a}-2.29\%$
test_split_pytree 0.2475ms 43.2600μs 23.1160 KOps/s 23.3369 KOps/s $\color{#d91a1a}-0.95\%$
test_split_td 0.2148ms 64.8373μs 15.4232 KOps/s 15.2192 KOps/s $\color{#35bf28}+1.34\%$
test_add_pytree 0.2363ms 43.5187μs 22.9786 KOps/s 23.1359 KOps/s $\color{#d91a1a}-0.68\%$
test_add_td 0.1095ms 55.5947μs 17.9873 KOps/s 17.7762 KOps/s $\color{#35bf28}+1.19\%$
test_compile_add_one_nested[tensordict-compile] 0.2126ms 0.1415ms 7.0678 KOps/s 6.6188 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_compile_add_one_nested[tensordict-eager] 0.2976ms 0.2027ms 4.9334 KOps/s 5.0019 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_add_one_nested[pytree-compile] 0.1685ms 0.1092ms 9.1584 KOps/s 9.0867 KOps/s $\color{#35bf28}+0.79\%$
test_compile_add_one_nested[pytree-eager] 0.4350ms 0.1818ms 5.5007 KOps/s 5.6010 KOps/s $\color{#d91a1a}-1.79\%$
test_compile_copy_nested[tensordict-compile] 0.3181ms 10.2305μs 97.7471 KOps/s 96.0028 KOps/s $\color{#35bf28}+1.82\%$
test_compile_copy_nested[tensordict-eager] 81.0410μs 54.0296μs 18.5084 KOps/s 18.5414 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_copy_nested[pytree-compile] 31.6610μs 9.9241μs 100.7643 KOps/s 101.5495 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_copy_nested[pytree-eager] 0.4430ms 68.4386μs 14.6116 KOps/s 14.7923 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_add_one_flat[tensordict-compile] 0.3125ms 0.1771ms 5.6454 KOps/s 5.1886 KOps/s $\textbf{\color{#35bf28}+8.80\%}$
test_compile_add_one_flat[tensordict-eager] 0.3495ms 0.2791ms 3.5833 KOps/s 3.5439 KOps/s $\color{#35bf28}+1.11\%$
test_compile_add_one_flat[tensorclass-compile] 0.3401ms 0.1180ms 8.4716 KOps/s 8.1791 KOps/s $\color{#35bf28}+3.58\%$
test_compile_add_one_flat[tensorclass-eager] 0.1201ms 72.7869μs 13.7387 KOps/s 13.1331 KOps/s $\color{#35bf28}+4.61\%$
test_compile_add_one_flat[pytree-compile] 0.2259ms 0.1594ms 6.2720 KOps/s 6.1650 KOps/s $\color{#35bf28}+1.74\%$
test_compile_add_one_flat[pytree-eager] 0.8143ms 0.5336ms 1.8742 KOps/s 1.9218 KOps/s $\color{#d91a1a}-2.47\%$
test_compile_add_self_flat[tensordict-eager] 0.4831ms 0.3334ms 2.9997 KOps/s 2.9659 KOps/s $\color{#35bf28}+1.14\%$
test_compile_add_self_flat[tensordict-compile] 0.3082ms 0.1801ms 5.5517 KOps/s 5.2856 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1296ms 89.3264μs 11.1949 KOps/s 11.0892 KOps/s $\color{#35bf28}+0.95\%$
test_compile_add_self_flat[tensorclass-compile] 0.2784ms 0.1211ms 8.2596 KOps/s 7.4810 KOps/s $\textbf{\color{#35bf28}+10.41\%}$
test_compile_add_self_flat[pytree-eager] 0.6569ms 0.4392ms 2.2766 KOps/s 2.3174 KOps/s $\color{#d91a1a}-1.76\%$
test_compile_add_self_flat[pytree-compile] 0.6080ms 0.1604ms 6.2342 KOps/s 5.8978 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_compile_copy_flat[tensordict-compile] 54.5120μs 13.3831μs 74.7211 KOps/s 74.1398 KOps/s $\color{#35bf28}+0.78\%$
test_compile_copy_flat[tensordict-eager] 90.9420μs 41.5962μs 24.0407 KOps/s 24.5063 KOps/s $\color{#d91a1a}-1.90\%$
test_compile_copy_flat[pytree-compile] 0.1115ms 10.8349μs 92.2940 KOps/s 91.9356 KOps/s $\color{#35bf28}+0.39\%$
test_compile_copy_flat[pytree-eager] 0.4147ms 52.8562μs 18.9193 KOps/s 19.0180 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_assign_and_add[tensordict-compile] 2.0258ms 0.1757ms 5.6927 KOps/s 5.3500 KOps/s $\textbf{\color{#35bf28}+6.41\%}$
test_compile_assign_and_add[tensordict-eager] 3.4226ms 3.3129ms 301.8473 Ops/s 301.4253 Ops/s $\color{#35bf28}+0.14\%$
test_compile_assign_and_add[pytree-compile] 2.0246ms 0.1647ms 6.0702 KOps/s 6.0498 KOps/s $\color{#35bf28}+0.34\%$
test_compile_assign_and_add[pytree-eager] 2.9262ms 2.8201ms 354.6004 Ops/s 355.2700 Ops/s $\color{#d91a1a}-0.19\%$
test_compile_indexing[tensor-tensordict-compile] 0.2461ms 0.1106ms 9.0442 KOps/s 8.7261 KOps/s $\color{#35bf28}+3.65\%$
test_compile_indexing[tensor-tensordict-eager] 0.3103ms 74.4385μs 13.4339 KOps/s 13.6325 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2173ms 96.7765μs 10.3331 KOps/s 10.0336 KOps/s $\color{#35bf28}+2.98\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2490ms 44.5671μs 22.4381 KOps/s 21.8572 KOps/s $\color{#35bf28}+2.66\%$
test_compile_indexing[tensor-pytree-compile] 0.1388ms 97.5857μs 10.2474 KOps/s 9.7346 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_compile_indexing[tensor-pytree-eager] 0.2671ms 44.8435μs 22.2998 KOps/s 21.4701 KOps/s $\color{#35bf28}+3.86\%$
test_compile_indexing[slice-tensordict-compile] 0.2074ms 55.5906μs 17.9887 KOps/s 16.4432 KOps/s $\textbf{\color{#35bf28}+9.40\%}$
test_compile_indexing[slice-tensordict-eager] 0.2153ms 27.8662μs 35.8858 KOps/s 35.4191 KOps/s $\color{#35bf28}+1.32\%$
test_compile_indexing[slice-tensorclass-compile] 0.1589ms 44.2200μs 22.6142 KOps/s 21.7380 KOps/s $\color{#35bf28}+4.03\%$
test_compile_indexing[slice-tensorclass-eager] 0.2497ms 22.5557μs 44.3346 KOps/s 43.7844 KOps/s $\color{#35bf28}+1.26\%$
test_compile_indexing[slice-pytree-compile] 85.3510μs 44.4234μs 22.5106 KOps/s 20.5540 KOps/s $\textbf{\color{#35bf28}+9.52\%}$
test_compile_indexing[slice-pytree-eager] 0.2765ms 22.4463μs 44.5508 KOps/s 43.9086 KOps/s $\color{#35bf28}+1.46\%$
test_compile_indexing[int-tensordict-compile] 94.9920μs 57.4352μs 17.4109 KOps/s 15.9536 KOps/s $\textbf{\color{#35bf28}+9.14\%}$
test_compile_indexing[int-tensordict-eager] 0.2826ms 28.0888μs 35.6014 KOps/s 35.3255 KOps/s $\color{#35bf28}+0.78\%$
test_compile_indexing[int-tensorclass-compile] 84.1920μs 45.2654μs 22.0919 KOps/s 21.3555 KOps/s $\color{#35bf28}+3.45\%$
test_compile_indexing[int-tensorclass-eager] 0.2546ms 22.4973μs 44.4498 KOps/s 44.0812 KOps/s $\color{#35bf28}+0.84\%$
test_compile_indexing[int-pytree-compile] 80.7620μs 43.9050μs 22.7764 KOps/s 20.8920 KOps/s $\textbf{\color{#35bf28}+9.02\%}$
test_compile_indexing[int-pytree-eager] 0.2571ms 22.5252μs 44.3948 KOps/s 44.3501 KOps/s $\color{#35bf28}+0.10\%$
test_compile_replace[single-eager] 86.9420μs 47.7035μs 20.9628 KOps/s 20.4169 KOps/s $\color{#35bf28}+2.67\%$
test_compile_replace[single-compile] 0.1552ms 0.1056ms 9.4684 KOps/s 8.8964 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_compile_replace[multi-eager] 0.6297ms 0.5686ms 1.7586 KOps/s 1.6602 KOps/s $\textbf{\color{#35bf28}+5.93\%}$
test_compile_replace[multi-compile] 0.1883ms 0.1119ms 8.9353 KOps/s 8.3945 KOps/s $\textbf{\color{#35bf28}+6.44\%}$
test_compile_tc_getattr_20[eager] 0.2243ms 0.1700ms 5.8838 KOps/s 5.9156 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_tc_getattr_20[compile] 0.2575ms 0.1200ms 8.3311 KOps/s 7.8189 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_compile_clone_shallow[20-eager] 49.1610μs 19.4645μs 51.3756 KOps/s 51.1696 KOps/s $\color{#35bf28}+0.40\%$
test_compile_clone_shallow[20-compile] 48.4510μs 11.4792μs 87.1140 KOps/s 86.8169 KOps/s $\color{#35bf28}+0.34\%$
test_compile_clone_shallow[40-eager] 65.8320μs 34.0674μs 29.3536 KOps/s 29.7939 KOps/s $\color{#d91a1a}-1.48\%$
test_compile_clone_shallow[40-compile] 70.9210μs 12.8444μs 77.8547 KOps/s 80.7968 KOps/s $\color{#d91a1a}-3.64\%$
test_compile_clone_shallow[80-eager] 98.4620μs 62.3879μs 16.0288 KOps/s 15.8952 KOps/s $\color{#35bf28}+0.84\%$
test_compile_clone_shallow[80-compile] 95.9020μs 14.7345μs 67.8680 KOps/s 66.1377 KOps/s $\color{#35bf28}+2.62\%$
test_compile_update_inplace[eager] 0.1055ms 59.7496μs 16.7365 KOps/s 16.8271 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_update_inplace[compile] 0.2064ms 0.1399ms 7.1468 KOps/s 6.8529 KOps/s $\color{#35bf28}+4.29\%$
test_mod_add[eager] 97.6120μs 49.7569μs 20.0977 KOps/s 20.2965 KOps/s $\color{#d91a1a}-0.98\%$
test_mod_add[compile] 0.4364ms 0.1038ms 9.6341 KOps/s 9.3027 KOps/s $\color{#35bf28}+3.56\%$
test_mod_add[compile-overhead] 0.2337ms 0.1491ms 6.7085 KOps/s 6.3888 KOps/s $\textbf{\color{#35bf28}+5.00\%}$
test_mod_wrap[eager] 0.3764ms 0.2918ms 3.4269 KOps/s 3.4166 KOps/s $\color{#35bf28}+0.30\%$
test_mod_wrap[compile] 0.8023ms 0.3503ms 2.8547 KOps/s 2.8122 KOps/s $\color{#35bf28}+1.51\%$
test_mod_wrap[compile-overhead] 7.2931ms 4.0051ms 249.6832 Ops/s 247.9184 Ops/s $\color{#35bf28}+0.71\%$
test_mod_wrap_and_backward[eager] 1.7264ms 1.5885ms 629.5124 Ops/s 655.8110 Ops/s $\color{#d91a1a}-4.01\%$
test_mod_wrap_and_backward[compile] 1.7397ms 1.5515ms 644.5316 Ops/s 683.2133 Ops/s $\textbf{\color{#d91a1a}-5.66\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4502ms 0.9943ms 1.0057 KOps/s 1.0867 KOps/s $\textbf{\color{#d91a1a}-7.45\%}$
test_seq_add[eager] 0.2152ms 0.1541ms 6.4904 KOps/s 6.1425 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_seq_add[compile] 0.2013ms 0.1135ms 8.8127 KOps/s 7.9646 KOps/s $\textbf{\color{#35bf28}+10.65\%}$
test_seq_add[compile-overhead] 0.4471ms 0.1534ms 6.5204 KOps/s 6.2363 KOps/s $\color{#35bf28}+4.56\%$
test_seq_wrap[eager] 0.6632ms 0.5240ms 1.9083 KOps/s 1.9306 KOps/s $\color{#d91a1a}-1.16\%$
test_seq_wrap[compile] 0.4577ms 0.3664ms 2.7295 KOps/s 2.6777 KOps/s $\color{#35bf28}+1.93\%$
test_seq_wrap[compile-overhead] 0.3372ms 0.2649ms 3.7747 KOps/s 3.6886 KOps/s $\color{#35bf28}+2.33\%$
test_func_call_runtime[False-eager] 0.9779ms 0.8922ms 1.1209 KOps/s 1.1747 KOps/s $\color{#d91a1a}-4.58\%$
test_func_call_runtime[False-compile] 1.0696ms 0.9298ms 1.0755 KOps/s 1.0840 KOps/s $\color{#d91a1a}-0.79\%$
test_func_call_runtime[False-compile-overhead] 0.5707ms 0.4637ms 2.1565 KOps/s 2.1192 KOps/s $\color{#35bf28}+1.76\%$
test_func_call_runtime[True-eager] 1.3264ms 1.1028ms 906.8173 Ops/s 921.3248 Ops/s $\color{#d91a1a}-1.57\%$
test_func_call_runtime[True-compile] 1.0768ms 0.9231ms 1.0833 KOps/s 1.0708 KOps/s $\color{#35bf28}+1.17\%$
test_func_call_runtime[True-compile-overhead] 0.5232ms 0.4796ms 2.0849 KOps/s 2.0636 KOps/s $\color{#35bf28}+1.03\%$
test_func_call_cm_runtime[False-eager] 1.0713ms 0.8955ms 1.1166 KOps/s 1.1869 KOps/s $\textbf{\color{#d91a1a}-5.92\%}$
test_func_call_cm_runtime[False-compile] 1.1217ms 0.9227ms 1.0838 KOps/s 1.0785 KOps/s $\color{#35bf28}+0.49\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6121ms 0.4663ms 2.1445 KOps/s 2.1130 KOps/s $\color{#35bf28}+1.49\%$
test_func_call_cm_runtime[True-eager] 1.3274ms 1.2228ms 817.7920 Ops/s 804.9083 Ops/s $\color{#35bf28}+1.60\%$
test_func_call_cm_runtime[True-compile] 1.0182ms 0.9573ms 1.0446 KOps/s 1.0118 KOps/s $\color{#35bf28}+3.24\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5966ms 0.5114ms 1.9554 KOps/s 1.9285 KOps/s $\color{#35bf28}+1.40\%$
test_vmap_func_call_cm_runtime[eager] 2.8722ms 2.3904ms 418.3429 Ops/s 417.8385 Ops/s $\color{#35bf28}+0.12\%$
test_vmap_func_call_cm_runtime[compile] 1.0484ms 0.9811ms 1.0192 KOps/s 1.0100 KOps/s $\color{#35bf28}+0.91\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5704ms 0.5169ms 1.9346 KOps/s 1.9104 KOps/s $\color{#35bf28}+1.27\%$
test_distributed 3.0860ms 0.1594ms 6.2728 KOps/s 6.4850 KOps/s $\color{#d91a1a}-3.27\%$
test_tdmodule 0.1677ms 27.1718μs 36.8029 KOps/s 36.4632 KOps/s $\color{#35bf28}+0.93\%$
test_tdmodule_dispatch 75.4910μs 45.8100μs 21.8293 KOps/s 21.8174 KOps/s $\color{#35bf28}+0.05\%$
test_tdseq 45.6610μs 26.5242μs 37.7014 KOps/s 36.7179 KOps/s $\color{#35bf28}+2.68\%$
test_tdseq_dispatch 68.0220μs 47.0572μs 21.2508 KOps/s 20.9072 KOps/s $\color{#35bf28}+1.64\%$
test_instantiation_functorch 2.1894ms 2.0862ms 479.3350 Ops/s 481.7811 Ops/s $\color{#d91a1a}-0.51\%$
test_exec_functorch 0.2402ms 0.1799ms 5.5601 KOps/s 5.5111 KOps/s $\color{#35bf28}+0.89\%$
test_exec_functional_call 0.2143ms 0.1621ms 6.1708 KOps/s 6.2725 KOps/s $\color{#d91a1a}-1.62\%$
test_exec_td_decorator 0.4655ms 0.2362ms 4.2336 KOps/s 4.2147 KOps/s $\color{#35bf28}+0.45\%$
test_vmap_mlp_speed_decorator[True-True] 1.0894ms 0.8290ms 1.2063 KOps/s 1.2028 KOps/s $\color{#35bf28}+0.29\%$
test_vmap_mlp_speed_decorator[True-False] 1.0319ms 0.8268ms 1.2095 KOps/s 1.2067 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_mlp_speed_decorator[False-True] 0.9177ms 0.7177ms 1.3933 KOps/s 1.4048 KOps/s $\color{#d91a1a}-0.82\%$
test_vmap_mlp_speed_decorator[False-False] 0.8897ms 0.7147ms 1.3991 KOps/s 1.4072 KOps/s $\color{#d91a1a}-0.57\%$
test_vmap_transformer_speed_decorator[True-True] 20.7016ms 20.5592ms 48.6401 Ops/s 48.3504 Ops/s $\color{#35bf28}+0.60\%$
test_vmap_transformer_speed_decorator[True-False] 21.1163ms 20.6194ms 48.4979 Ops/s 48.1717 Ops/s $\color{#35bf28}+0.68\%$
test_vmap_transformer_speed_decorator[False-True] 20.9615ms 20.3672ms 49.0986 Ops/s 49.0667 Ops/s $\color{#35bf28}+0.07\%$
test_vmap_transformer_speed_decorator[False-False] 21.1114ms 20.4082ms 49.0000 Ops/s 49.0736 Ops/s $\color{#d91a1a}-0.15\%$
test_to_module_speed[True] 2.0108ms 1.4851ms 673.3542 Ops/s 675.0584 Ops/s $\color{#d91a1a}-0.25\%$
test_to_module_speed[False] 1.9655ms 1.4693ms 680.6027 Ops/s 692.7334 Ops/s $\color{#d91a1a}-1.75\%$
test_tc_init 74.1120μs 44.5205μs 22.4616 KOps/s 22.2240 KOps/s $\color{#35bf28}+1.07\%$
test_tc_init_tensor_only 32.3410μs 9.8063μs 101.9753 KOps/s 100.5589 KOps/s $\color{#35bf28}+1.41\%$
test_tc_init_nested 0.1226ms 89.0707μs 11.2270 KOps/s 11.2024 KOps/s $\color{#35bf28}+0.22\%$
test_tc_init_many_fields 41.2710μs 16.4958μs 60.6214 KOps/s 60.9338 KOps/s $\color{#d91a1a}-0.51\%$
test_tc_first_layer_tensor 34.5710μs 1.8333μs 545.4584 KOps/s 561.3942 KOps/s $\color{#d91a1a}-2.84\%$
test_tc_first_layer_tensor_only 1.4320μs 0.3949μs 2.5321 MOps/s 2.5756 MOps/s $\color{#d91a1a}-1.69\%$
test_tc_first_layer_tensor_set 36.5110μs 3.9760μs 251.5062 KOps/s 253.7546 KOps/s $\color{#d91a1a}-0.89\%$
test_tc_first_layer_tensor_only_set 34.6100μs 3.2817μs 304.7247 KOps/s 308.1873 KOps/s $\color{#d91a1a}-1.12\%$
test_tc_first_layer_nontensor 30.8010μs 6.1407μs 162.8492 KOps/s 162.6267 KOps/s $\color{#35bf28}+0.14\%$
test_tc_second_layer_tensor 29.3310μs 4.4877μs 222.8308 KOps/s 229.3029 KOps/s $\color{#d91a1a}-2.82\%$
test_tc_second_layer_nontensor 45.0810μs 8.8512μs 112.9789 KOps/s 114.6941 KOps/s $\color{#d91a1a}-1.50\%$
test_unbind 0.2624s 16.3754ms 61.0674 Ops/s 54.6368 Ops/s $\textbf{\color{#35bf28}+11.77\%}$
test_full_like 11.9876ms 8.8359ms 113.1745 Ops/s 94.8322 Ops/s $\textbf{\color{#35bf28}+19.34\%}$
test_zeros_like 11.7088ms 8.8156ms 113.4348 Ops/s 73.6605 Ops/s $\textbf{\color{#35bf28}+54.00\%}$
test_ones_like 4.8043ms 4.3678ms 228.9480 Ops/s 73.7285 Ops/s $\textbf{\color{#35bf28}+210.53\%}$
test_clone 11.4761ms 9.2368ms 108.2625 Ops/s 56.8406 Ops/s $\textbf{\color{#35bf28}+90.47\%}$
test_squeeze 0.1675ms 13.9296μs 71.7896 KOps/s 62.2899 KOps/s $\textbf{\color{#35bf28}+15.25\%}$
test_unsqueeze 0.1600ms 0.1145ms 8.7374 KOps/s 8.9618 KOps/s $\color{#d91a1a}-2.50\%$
test_split 0.2491ms 0.1818ms 5.4992 KOps/s 5.3966 KOps/s $\color{#35bf28}+1.90\%$
test_permute 0.2889ms 0.2071ms 4.8291 KOps/s 4.8757 KOps/s $\color{#d91a1a}-0.96\%$
test_stack 53.2036ms 51.2510ms 19.5118 Ops/s 19.5219 Ops/s $\color{#d91a1a}-0.05\%$
test_cat 51.4673ms 50.8303ms 19.6733 Ops/s 19.5566 Ops/s $\color{#35bf28}+0.60\%$
test_sequential_tensordict 0.2821ms 0.2265ms 4.4157 KOps/s 4.3229 KOps/s $\color{#35bf28}+2.15\%$
test_sequential_graph_module 0.5350ms 0.1261ms 7.9275 KOps/s 7.8874 KOps/s $\color{#35bf28}+0.51\%$
test_nested_tensordict 0.6854ms 0.2881ms 3.4714 KOps/s 3.3917 KOps/s $\color{#35bf28}+2.35\%$
test_nested_graph_module 0.1900ms 0.1340ms 7.4623 KOps/s 7.5449 KOps/s $\color{#d91a1a}-1.10\%$
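The Change column in the table above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, after converting both to the same unit. The sketch below reproduces that arithmetic for a few rows copied from the table. The helper names are hypothetical, and the 5% cut-off used to call a row improved or worsened is only inferred from which rows are shown in bold; the bot comment does not state it.

```python
# Assumed unit multipliers for the throughput columns of the benchmark table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_sec(cell: str) -> float:
    """Convert a cell such as '1.0837 KOps/s' to plain operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    new, base = to_ops_per_sec(ops), to_ops_per_sec(ops_head)
    return (new - base) / base * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    # The 5% threshold is an assumption inferred from the bold rows.
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (name, Ops, Ops on Repo HEAD) copied from the GPU table.
rows = [
    ("test_ones_like", "228.9480 Ops/s", "73.7285 Ops/s"),
    ("test_contiguous[memmap_tensor0]", "1.4644 MOps/s", "2.1618 MOps/s"),
    ("test_unbind_speed_stack1", "1.0837 KOps/s", "961.5343 Ops/s"),
    ("test_keys", "236.6801 KOps/s", "234.3704 KOps/s"),
]

for name, ops, head in rows:
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table reports rounded throughput values, a recomputed percentage can differ from the printed one in the last digit, which matters for rows sitting right at the assumed threshold.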

[ghstack-poisoned]
@github-actions
Contributor

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add Strategy C (optimal P2P using transfer plan)

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
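
As a rough illustration of the rule described in this comment, the sketch below maps a PR title to a label using the prefixes listed in the table above. It is not the repository's actual workflow code: the function name `label_for_title` and the dictionary `PREFIX_TO_LABEL` are hypothetical, and only the prefixes spelled out in the table are handled (the "common variations" mentioned in the note are not modeled).

```python
import re
from typing import Optional

# Prefix -> label pairs copied from the table in the bot comment.
# Keys are lower-cased because matching is described as case-insensitive.
PREFIX_TO_LABEL = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}


def label_for_title(title: str) -> Optional[str]:
    """Return the label for a PR title, or None if the prefix is not recognised."""
    # The title must start with a bracketed prefix such as "[Feature]".
    match = re.match(r"\s*\[([^\]]+)\]", title)
    if match is None:
        return None
    return PREFIX_TO_LABEL.get(match.group(1).strip().lower())


print(label_for_title("[DTensor] Add Strategy C (optimal P2P using transfer plan)"))
print(label_for_title("[Distributed] Add Strategy C (optimal P2P using transfer plan)"))
```

Under these assumptions the current title is rejected because `[DTensor]` is not in the table, while a title starting with `[Distributed]` or `[Feature]` would be accepted.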

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
