[DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A #1644

Open
vmoens wants to merge 6 commits into gh/vmoens/85/base from gh/vmoens/85/head

@vmoens (Collaborator) commented Mar 9, 2026

Stack from ghstack (oldest at bottom):

Public API for cross-mesh DTensor transfer on TensorDictBase:

  • dtensor_send(dst, strategy, transport, ...) / dtensor_recv(src, ...)
  • Strategy selection: "materialize", "redistribute", "optimal", "auto"
  • Transport selection: "torch_distributed", "ucxx", "auto"

Strategy A (materialize): gathers each DTensor via full_tensor(), then
sends the full tensor over P2P. Simple, but memory-intensive because the
full tensor is materialized before it is sent.
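
As a rough illustration of Strategy A, the toy model below uses plain Python lists in place of DTensor shards and a `queue.Queue` in place of the P2P link. It does not call torch or the PR's `dtensor_send`/`dtensor_recv`. The helper names (`gather_full`, `materialize_send`, `materialize_recv`) and the even re-sharding on the receiving side are assumptions made for this sketch, not part of the PR.

```python
from queue import Queue


def gather_full(shards):
    """Toy stand-in for full_tensor(): concatenate all shards into one list."""
    full = []
    for shard in shards:
        full.extend(shard)
    return full


def materialize_send(shards, channel):
    """Strategy A, sender side: gather every shard, then send one full payload."""
    channel.put(gather_full(shards))


def materialize_recv(channel, num_shards):
    """Receiver side (assumed): take the full payload and split it evenly."""
    full = channel.get()
    size = -(-len(full) // num_shards)  # ceiling division
    return [full[i * size:(i + 1) * size] for i in range(num_shards)]


channel = Queue()                    # hypothetical stand-in for the P2P link
src_shards = [[0, 1, 2], [3, 4, 5]]  # data sharded two ways on the source side
materialize_send(src_shards, channel)
dst_shards = materialize_recv(channel, num_shards=3)
print(dst_shards)  # [[0, 1], [2, 3], [4, 5]]
```

The whole payload exists as a single object on the sender before anything is sent, which is the memory cost the description refers to.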

Made-with: Cursor

[ghstack-poisoned]

@github-actions bot (Contributor) commented Mar 9, 2026

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
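
The rule stated above (a leading bracketed prefix, matched case-insensitively against the supported list) can be approximated locally before pushing a title. This is a minimal sketch, not the bot's actual implementation. The function names `title_prefix` and `is_valid_title` are hypothetical, and the sketch only inspects the first bracketed token.

```python
import re

# Prefixes copied from the "Supported Prefixes" list, lowercased for matching.
SUPPORTED_PREFIXES = {
    "bugfix", "fix", "feature", "doc", "docs", "refactor", "ci",
    "test", "tests", "compile", "performance", "perf", "deprecation",
    "setup", "distributed", "dist", "benchmark", "bench", "typing",
    "type", "bc-breaking", "bc", "formatting", "format", "quality",
}


def title_prefix(title):
    """Return the text inside the leading [...] of a title, or None."""
    match = re.match(r"\[([^\]]+)\]", title)
    return match.group(1) if match else None


def is_valid_title(title):
    """True if the title starts with a supported prefix (case-insensitive)."""
    prefix = title_prefix(title)
    return prefix is not None and prefix.lower() in SUPPORTED_PREFIXES


current = "[DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A"
print(is_valid_title(current))                         # False
print(is_valid_title("[distributed] Add scatter collective"))  # True
```

Under this check the current title is rejected because `DTensor` is not in the list, which matches the error reported above.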

@github-actions bot (Contributor) commented Mar 9, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 261. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 30.8110μs 14.7728μs 67.6921 KOps/s 66.8796 KOps/s $\color{#35bf28}+1.21\%$
test_plain_set_stack_nested 36.3800μs 15.0513μs 66.4393 KOps/s 66.3600 KOps/s $\color{#35bf28}+0.12\%$
test_plain_set_nested_inplace 0.4598ms 16.9245μs 59.0858 KOps/s 59.5060 KOps/s $\color{#d91a1a}-0.71\%$
test_plain_set_stack_nested_inplace 41.2310μs 16.7079μs 59.8519 KOps/s 60.0740 KOps/s $\color{#d91a1a}-0.37\%$
test_items 0.4361ms 6.0838μs 164.3708 KOps/s 166.5426 KOps/s $\color{#d91a1a}-1.30\%$
test_items_nested 0.9609ms 0.4617ms 2.1657 KOps/s 2.1147 KOps/s $\color{#35bf28}+2.41\%$
test_items_nested_locked 0.9397ms 0.4681ms 2.1361 KOps/s 2.0967 KOps/s $\color{#35bf28}+1.88\%$
test_items_nested_leaf 0.5194ms 98.9047μs 10.1107 KOps/s 10.2686 KOps/s $\color{#d91a1a}-1.54\%$
test_items_stack_nested 0.8930ms 0.4672ms 2.1406 KOps/s 2.1240 KOps/s $\color{#35bf28}+0.78\%$
test_items_stack_nested_leaf 0.5297ms 99.1918μs 10.0815 KOps/s 10.3559 KOps/s $\color{#d91a1a}-2.65\%$
test_items_stack_nested_locked 0.9045ms 0.4690ms 2.1320 KOps/s 2.1034 KOps/s $\color{#35bf28}+1.36\%$
test_keys 33.4910μs 4.2414μs 235.7717 KOps/s 236.7429 KOps/s $\color{#d91a1a}-0.41\%$
test_keys_nested 0.5670ms 0.1299ms 7.7000 KOps/s 7.6351 KOps/s $\color{#35bf28}+0.85\%$
test_keys_nested_locked 2.0500ms 0.1389ms 7.1970 KOps/s 7.1725 KOps/s $\color{#35bf28}+0.34\%$
test_keys_nested_leaf 0.5482ms 0.1204ms 8.3046 KOps/s 8.2488 KOps/s $\color{#35bf28}+0.68\%$
test_keys_stack_nested 0.5989ms 0.1292ms 7.7422 KOps/s 7.6443 KOps/s $\color{#35bf28}+1.28\%$
test_keys_stack_nested_leaf 0.5525ms 0.1198ms 8.3468 KOps/s 8.2594 KOps/s $\color{#35bf28}+1.06\%$
test_keys_stack_nested_locked 0.5857ms 0.1374ms 7.2805 KOps/s 7.2285 KOps/s $\color{#35bf28}+0.72\%$
test_values 89.3496μs 1.0289μs 971.9134 KOps/s 978.2629 KOps/s $\color{#d91a1a}-0.65\%$
test_values_nested 81.3820μs 52.5444μs 19.0315 KOps/s 17.9051 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_values_nested_locked 88.6110μs 55.6920μs 17.9559 KOps/s 17.0137 KOps/s $\textbf{\color{#35bf28}+5.54\%}$
test_values_nested_leaf 0.4827ms 59.7657μs 16.7320 KOps/s 16.0984 KOps/s $\color{#35bf28}+3.94\%$
test_values_stack_nested 0.4849ms 52.1370μs 19.1802 KOps/s 17.9791 KOps/s $\textbf{\color{#35bf28}+6.68\%}$
test_values_stack_nested_leaf 0.4853ms 60.4710μs 16.5368 KOps/s 15.7418 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_values_stack_nested_locked 0.4813ms 55.2714μs 18.0925 KOps/s 17.0257 KOps/s $\textbf{\color{#35bf28}+6.27\%}$
test_membership 72.1780μs 0.8284μs 1.2071 MOps/s 1.1636 MOps/s $\color{#35bf28}+3.74\%$
test_membership_nested 33.9400μs 2.8826μs 346.9042 KOps/s 345.7061 KOps/s $\color{#35bf28}+0.35\%$
test_membership_nested_leaf 75.4510μs 2.8832μs 346.8427 KOps/s 338.8272 KOps/s $\color{#35bf28}+2.37\%$
test_membership_stacked_nested 0.4463ms 2.9183μs 342.6686 KOps/s 342.8778 KOps/s $\color{#d91a1a}-0.06\%$
test_membership_stacked_nested_leaf 60.9310μs 2.8759μs 347.7155 KOps/s 347.7266 KOps/s $-0.00\%$
test_membership_nested_last 0.4760ms 4.3371μs 230.5672 KOps/s 228.3801 KOps/s $\color{#35bf28}+0.96\%$
test_membership_nested_leaf_last 24.1800μs 4.3653μs 229.0813 KOps/s 228.2297 KOps/s $\color{#35bf28}+0.37\%$
test_membership_stacked_nested_last 29.0310μs 4.3162μs 231.6866 KOps/s 228.0133 KOps/s $\color{#35bf28}+1.61\%$
test_membership_stacked_nested_leaf_last 27.1300μs 4.2955μs 232.8027 KOps/s 227.8956 KOps/s $\color{#35bf28}+2.15\%$
test_nested_getleaf 0.4688ms 21.2732μs 47.0076 KOps/s 47.0178 KOps/s $\color{#d91a1a}-0.02\%$
test_nested_get 0.4589ms 19.8585μs 50.3563 KOps/s 49.5658 KOps/s $\color{#35bf28}+1.59\%$
test_stacked_getleaf 53.3810μs 20.8116μs 48.0500 KOps/s 47.1676 KOps/s $\color{#35bf28}+1.87\%$
test_stacked_get 0.4656ms 19.7482μs 50.6375 KOps/s 49.3464 KOps/s $\color{#35bf28}+2.62\%$
test_nested_getitemleaf 0.4652ms 21.5794μs 46.3405 KOps/s 45.9297 KOps/s $\color{#35bf28}+0.89\%$
test_nested_getitem 0.4761ms 20.3242μs 49.2023 KOps/s 48.0155 KOps/s $\color{#35bf28}+2.47\%$
test_stacked_getitemleaf 0.4590ms 21.4774μs 46.5606 KOps/s 45.8948 KOps/s $\color{#35bf28}+1.45\%$
test_stacked_getitem 48.0610μs 20.4192μs 48.9736 KOps/s 48.7870 KOps/s $\color{#35bf28}+0.38\%$
test_lock_nested 0.7153ms 0.4769ms 2.0967 KOps/s 2.1095 KOps/s $\color{#d91a1a}-0.60\%$
test_lock_stack_nested 0.5368ms 0.4842ms 2.0654 KOps/s 2.0570 KOps/s $\color{#35bf28}+0.41\%$
test_unlock_nested 0.6151ms 0.3922ms 2.5500 KOps/s 2.5670 KOps/s $\color{#d91a1a}-0.66\%$
test_unlock_stack_nested 0.4316ms 0.3922ms 2.5494 KOps/s 2.5360 KOps/s $\color{#35bf28}+0.53\%$
test_flatten_speed 0.1954ms 0.1236ms 8.0879 KOps/s 8.0352 KOps/s $\color{#35bf28}+0.65\%$
test_unflatten_speed 0.6974ms 0.5694ms 1.7564 KOps/s 1.7744 KOps/s $\color{#d91a1a}-1.02\%$
test_common_ops 0.8363ms 0.7033ms 1.4220 KOps/s 1.4277 KOps/s $\color{#d91a1a}-0.40\%$
test_creation 0.1155ms 3.1863μs 313.8465 KOps/s 317.1807 KOps/s $\color{#d91a1a}-1.05\%$
test_creation_empty 40.3600μs 7.0450μs 141.9437 KOps/s 143.5350 KOps/s $\color{#d91a1a}-1.11\%$
test_creation_nested_1 33.0400μs 11.6778μs 85.6328 KOps/s 87.2543 KOps/s $\color{#d91a1a}-1.86\%$
test_creation_nested_2 50.8810μs 13.4396μs 74.4068 KOps/s 74.7689 KOps/s $\color{#d91a1a}-0.48\%$
test_creation_many_keys[10] 97.0420μs 21.1585μs 47.2623 KOps/s 48.0656 KOps/s $\color{#d91a1a}-1.67\%$
test_creation_many_keys[50] 0.1430ms 91.0090μs 10.9879 KOps/s 11.1168 KOps/s $\color{#d91a1a}-1.16\%$
test_creation_many_keys[100] 0.2721ms 0.1782ms 5.6110 KOps/s 5.6561 KOps/s $\color{#d91a1a}-0.80\%$
test_creation_nested_many_keys[10] 84.5720μs 45.5894μs 21.9349 KOps/s 22.4744 KOps/s $\color{#d91a1a}-2.40\%$
test_creation_nested_many_keys[50] 0.2684ms 0.1855ms 5.3917 KOps/s 5.3990 KOps/s $\color{#d91a1a}-0.14\%$
test_clone 52.0310μs 13.5694μs 73.6954 KOps/s 73.7771 KOps/s $\color{#d91a1a}-0.11\%$
test_getitem[int] 1.5632ms 15.4344μs 64.7905 KOps/s 60.0473 KOps/s $\textbf{\color{#35bf28}+7.90\%}$
test_getitem[slice_int] 0.1361ms 24.8189μs 40.2918 KOps/s 40.7719 KOps/s $\color{#d91a1a}-1.18\%$
test_getitem[range] 0.1679ms 63.3548μs 15.7841 KOps/s 15.4953 KOps/s $\color{#35bf28}+1.86\%$
test_getitem[tuple] 0.1404ms 24.1988μs 41.3243 KOps/s 41.3352 KOps/s $\color{#d91a1a}-0.03\%$
test_getitem[list] 0.1778ms 57.9212μs 17.2648 KOps/s 16.7307 KOps/s $\color{#35bf28}+3.19\%$
test_setitem_dim[int] 45.7710μs 25.7773μs 38.7938 KOps/s 37.5733 KOps/s $\color{#35bf28}+3.25\%$
test_setitem_dim[slice_int] 80.9620μs 42.6258μs 23.4600 KOps/s 22.5193 KOps/s $\color{#35bf28}+4.18\%$
test_setitem_dim[range] 0.1384ms 94.8956μs 10.5379 KOps/s 10.4217 KOps/s $\color{#35bf28}+1.11\%$
test_setitem_dim[tuple] 72.6110μs 39.7469μs 25.1592 KOps/s 24.0886 KOps/s $\color{#35bf28}+4.44\%$
test_setitem 68.1610μs 17.8109μs 56.1452 KOps/s 55.0249 KOps/s $\color{#35bf28}+2.04\%$
test_set 69.0120μs 17.1345μs 58.3617 KOps/s 58.4537 KOps/s $\color{#d91a1a}-0.16\%$
test_set_shared 0.6672ms 0.2131ms 4.6921 KOps/s 4.8814 KOps/s $\color{#d91a1a}-3.88\%$
test_update 0.3461ms 22.0692μs 45.3121 KOps/s 45.3482 KOps/s $\color{#d91a1a}-0.08\%$
test_update_nested 0.4797ms 33.9100μs 29.4898 KOps/s 28.4639 KOps/s $\color{#35bf28}+3.60\%$
test_update__nested 0.4436ms 35.4759μs 28.1881 KOps/s 27.7202 KOps/s $\color{#35bf28}+1.69\%$
test_set_nested 0.4797ms 19.6233μs 50.9599 KOps/s 50.4270 KOps/s $\color{#35bf28}+1.06\%$
test_set_nested_new 61.3910μs 23.7533μs 42.0994 KOps/s 40.8377 KOps/s $\color{#35bf28}+3.09\%$
test_select 0.4735ms 40.3914μs 24.7577 KOps/s 24.0199 KOps/s $\color{#35bf28}+3.07\%$
test_select_nested 0.5358ms 75.3325μs 13.2745 KOps/s 13.5219 KOps/s $\color{#d91a1a}-1.83\%$
test_exclude_nested 0.1301ms 92.3415μs 10.8294 KOps/s 10.9299 KOps/s $\color{#d91a1a}-0.92\%$
test_empty[True] 0.8283ms 0.3986ms 2.5086 KOps/s 2.5104 KOps/s $\color{#d91a1a}-0.07\%$
test_empty[False] 0.1170ms 1.3025μs 767.7281 KOps/s 780.6264 KOps/s $\color{#d91a1a}-1.65\%$
test_to 0.1093ms 77.4114μs 12.9180 KOps/s 12.8935 KOps/s $\color{#35bf28}+0.19\%$
test_to_nonblocking 0.1194ms 65.5957μs 15.2449 KOps/s 15.3700 KOps/s $\color{#d91a1a}-0.81\%$
test_unbind_speed 0.3774ms 0.3400ms 2.9409 KOps/s 2.9973 KOps/s $\color{#d91a1a}-1.88\%$
test_unbind_speed_stack0 0.4875ms 0.3332ms 3.0008 KOps/s 3.0518 KOps/s $\color{#d91a1a}-1.67\%$
test_unbind_speed_stack1 0.1036s 0.9187ms 1.0885 KOps/s 1.1810 KOps/s $\textbf{\color{#d91a1a}-7.83\%}$
test_split 1.3648ms 1.1440ms 874.1024 Ops/s 784.8590 Ops/s $\textbf{\color{#35bf28}+11.37\%}$
test_chunk 0.1040s 1.2087ms 827.3377 Ops/s 924.0608 Ops/s $\textbf{\color{#d91a1a}-10.47\%}$
test_to_cpu_blocking 29.4776ms 28.8151ms 34.7040 Ops/s 34.3621 Ops/s $\color{#35bf28}+0.99\%$
test_to_cpu_global_sync 11.8145ms 11.3815ms 87.8618 Ops/s 78.1115 Ops/s $\textbf{\color{#35bf28}+12.48\%}$
test_to_cpu_event_sync 12.9077ms 12.3637ms 80.8817 Ops/s 79.6462 Ops/s $\color{#35bf28}+1.55\%$
test_to_cpu_default 0.1157s 13.6803ms 73.0977 Ops/s 79.4915 Ops/s $\textbf{\color{#d91a1a}-8.04\%}$
test_consolidate[False-None] 4.7267ms 4.2612ms 234.6746 Ops/s 215.7281 Ops/s $\textbf{\color{#35bf28}+8.78\%}$
test_consolidate[default-None] 2.2473ms 2.1152ms 472.7753 Ops/s 483.2645 Ops/s $\color{#d91a1a}-2.17\%$
test_consolidate[reduce-overhead-None] 2.1591ms 2.0097ms 497.5982 Ops/s 500.6777 Ops/s $\color{#d91a1a}-0.62\%$
test_consolidate_njt[False-None] 8.7601ms 8.5328ms 117.1949 Ops/s 116.0198 Ops/s $\color{#35bf28}+1.01\%$
test_to[False-False-None] 2.2554ms 2.1357ms 468.2396 Ops/s 467.5465 Ops/s $\color{#35bf28}+0.15\%$
test_to[True-False-None] 2.3031ms 1.9931ms 501.7186 Ops/s 510.1243 Ops/s $\color{#d91a1a}-1.65\%$
test_to[within-False-None] 6.4572ms 6.3639ms 157.1355 Ops/s 161.9196 Ops/s $\color{#d91a1a}-2.95\%$
test_to[True-default-None] 9.6326ms 9.1196ms 109.6542 Ops/s 110.3812 Ops/s $\color{#d91a1a}-0.66\%$
test_to_njt[False-False-None] 8.6683ms 8.4893ms 117.7954 Ops/s 114.6614 Ops/s $\color{#35bf28}+2.73\%$
test_to_njt[True-False-None] 7.1496ms 6.9665ms 143.5439 Ops/s 144.0085 Ops/s $\color{#d91a1a}-0.32\%$
test_to_njt[within-False-None] 16.3858ms 15.6618ms 63.8498 Ops/s 63.1588 Ops/s $\color{#35bf28}+1.09\%$
test_creation[device0] 0.4589ms 0.1159ms 8.6249 KOps/s 8.7172 KOps/s $\color{#d91a1a}-1.06\%$
test_creation_from_tensor 0.4559ms 0.1140ms 8.7745 KOps/s 8.8044 KOps/s $\color{#d91a1a}-0.34\%$
test_add_one[memmap_tensor0] 0.1995ms 6.7391μs 148.3877 KOps/s 150.6323 KOps/s $\color{#d91a1a}-1.49\%$
test_contiguous[memmap_tensor0] 17.4200μs 0.6846μs 1.4606 MOps/s 2.1475 MOps/s $\textbf{\color{#d91a1a}-31.99\%}$
test_stack[memmap_tensor0] 34.0200μs 4.9067μs 203.8042 KOps/s 217.8997 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_memmaptd_index 1.0131ms 0.2752ms 3.6335 KOps/s 3.7674 KOps/s $\color{#d91a1a}-3.56\%$
test_memmaptd_index_astensor 0.5593ms 0.3804ms 2.6287 KOps/s 2.7114 KOps/s $\color{#d91a1a}-3.05\%$
test_memmaptd_index_op 0.8931ms 0.6417ms 1.5583 KOps/s 1.6146 KOps/s $\color{#d91a1a}-3.49\%$
test_serialize_model 0.1377s 0.1363s 7.3356 Ops/s 5.9672 Ops/s $\textbf{\color{#35bf28}+22.93\%}$
test_serialize_model_pickle 1.3602s 1.2148s 0.8232 Ops/s 0.8256 Ops/s $\color{#d91a1a}-0.30\%$
test_serialize_weights 0.1363s 0.1345s 7.4332 Ops/s 7.4043 Ops/s $\color{#35bf28}+0.39\%$
test_serialize_weights_returnearly 0.4313s 88.2353ms 11.3333 Ops/s 11.3976 Ops/s $\color{#d91a1a}-0.56\%$
test_serialize_weights_pickle 1.3680s 1.2155s 0.8227 Ops/s 0.8226 Ops/s $+0.01\%$
test_reshape_pytree 0.2090ms 33.3096μs 30.0213 KOps/s 30.9741 KOps/s $\color{#d91a1a}-3.08\%$
test_reshape_td 75.6220μs 46.4635μs 21.5223 KOps/s 21.5745 KOps/s $\color{#d91a1a}-0.24\%$
test_view_pytree 0.2208ms 33.0865μs 30.2238 KOps/s 31.3024 KOps/s $\color{#d91a1a}-3.45\%$
test_view_td 98.8710μs 59.0929μs 16.9225 KOps/s 19.1459 KOps/s $\textbf{\color{#d91a1a}-11.61\%}$
test_unbind_pytree 0.2427ms 36.4828μs 27.4102 KOps/s 27.4847 KOps/s $\color{#d91a1a}-0.27\%$
test_unbind_td 0.1617ms 50.4786μs 19.8104 KOps/s 20.2109 KOps/s $\color{#d91a1a}-1.98\%$
test_split_pytree 0.2483ms 43.2648μs 23.1135 KOps/s 23.6553 KOps/s $\color{#d91a1a}-2.29\%$
test_split_td 0.1143ms 64.8040μs 15.4311 KOps/s 15.3879 KOps/s $\color{#35bf28}+0.28\%$
test_add_pytree 0.2536ms 43.8080μs 22.8269 KOps/s 23.7242 KOps/s $\color{#d91a1a}-3.78\%$
test_add_td 0.2146ms 61.2006μs 16.3397 KOps/s 18.3083 KOps/s $\textbf{\color{#d91a1a}-10.75\%}$
test_compile_add_one_nested[tensordict-compile] 0.2328ms 0.1433ms 6.9807 KOps/s 6.9294 KOps/s $\color{#35bf28}+0.74\%$
test_compile_add_one_nested[tensordict-eager] 0.6710ms 0.2152ms 4.6476 KOps/s 5.0415 KOps/s $\textbf{\color{#d91a1a}-7.81\%}$
test_compile_add_one_nested[pytree-compile] 0.3513ms 0.1090ms 9.1726 KOps/s 9.1279 KOps/s $\color{#35bf28}+0.49\%$
test_compile_add_one_nested[pytree-eager] 0.4358ms 0.1821ms 5.4919 KOps/s 5.6310 KOps/s $\color{#d91a1a}-2.47\%$
test_compile_copy_nested[tensordict-compile] 0.3793ms 10.4956μs 95.2779 KOps/s 95.3274 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_nested[tensordict-eager] 85.5920μs 54.1664μs 18.4616 KOps/s 18.1588 KOps/s $\color{#35bf28}+1.67\%$
test_compile_copy_nested[pytree-compile] 0.1555ms 10.0391μs 99.6105 KOps/s 101.7709 KOps/s $\color{#d91a1a}-2.12\%$
test_compile_copy_nested[pytree-eager] 0.4256ms 68.0098μs 14.7038 KOps/s 14.6396 KOps/s $\color{#35bf28}+0.44\%$
test_compile_add_one_flat[tensordict-compile] 0.2207ms 0.1767ms 5.6586 KOps/s 5.4930 KOps/s $\color{#35bf28}+3.02\%$
test_compile_add_one_flat[tensordict-eager] 0.3802ms 0.2826ms 3.5388 KOps/s 3.5765 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_add_one_flat[tensorclass-compile] 0.1669ms 0.1185ms 8.4357 KOps/s 8.3498 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_one_flat[tensorclass-eager] 0.1322ms 74.3681μs 13.4466 KOps/s 13.4636 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_add_one_flat[pytree-compile] 0.1919ms 0.1589ms 6.2952 KOps/s 6.2458 KOps/s $\color{#35bf28}+0.79\%$
test_compile_add_one_flat[pytree-eager] 0.8188ms 0.5342ms 1.8721 KOps/s 1.9156 KOps/s $\color{#d91a1a}-2.27\%$
test_compile_add_self_flat[tensordict-eager] 0.3879ms 0.3376ms 2.9624 KOps/s 2.9868 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_add_self_flat[tensordict-compile] 0.3115ms 0.1799ms 5.5599 KOps/s 3.3274 KOps/s $\textbf{\color{#35bf28}+67.10\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1255ms 91.0046μs 10.9885 KOps/s 11.1959 KOps/s $\color{#d91a1a}-1.85\%$
test_compile_add_self_flat[tensorclass-compile] 0.3065ms 0.1218ms 8.2074 KOps/s 7.8613 KOps/s $\color{#35bf28}+4.40\%$
test_compile_add_self_flat[pytree-eager] 0.6802ms 0.4459ms 2.2426 KOps/s 2.2716 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_add_self_flat[pytree-compile] 0.1978ms 0.1600ms 6.2516 KOps/s 6.1318 KOps/s $\color{#35bf28}+1.95\%$
test_compile_copy_flat[tensordict-compile] 0.1161ms 13.7571μs 72.6897 KOps/s 75.2425 KOps/s $\color{#d91a1a}-3.39\%$
test_compile_copy_flat[tensordict-eager] 68.6810μs 41.8039μs 23.9212 KOps/s 23.7796 KOps/s $\color{#35bf28}+0.60\%$
test_compile_copy_flat[pytree-compile] 0.1094ms 10.8896μs 91.8310 KOps/s 93.1020 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_copy_flat[pytree-eager] 0.4607ms 51.9108μs 19.2638 KOps/s 18.9803 KOps/s $\color{#35bf28}+1.49\%$
test_compile_assign_and_add[tensordict-compile] 2.0363ms 0.1744ms 5.7326 KOps/s 5.3655 KOps/s $\textbf{\color{#35bf28}+6.84\%}$
test_compile_assign_and_add[tensordict-eager] 3.5881ms 3.3141ms 301.7391 Ops/s 299.3317 Ops/s $\color{#35bf28}+0.80\%$
test_compile_assign_and_add[pytree-compile] 1.9998ms 0.1627ms 6.1456 KOps/s 6.0639 KOps/s $\color{#35bf28}+1.35\%$
test_compile_assign_and_add[pytree-eager] 2.9889ms 2.8062ms 356.3521 Ops/s 355.1221 Ops/s $\color{#35bf28}+0.35\%$
test_compile_indexing[tensor-tensordict-compile] 0.1706ms 0.1100ms 9.0920 KOps/s 8.8125 KOps/s $\color{#35bf28}+3.17\%$
test_compile_indexing[tensor-tensordict-eager] 0.2938ms 73.5620μs 13.5940 KOps/s 13.2943 KOps/s $\color{#35bf28}+2.25\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1642ms 96.9480μs 10.3148 KOps/s 10.3730 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2401ms 45.0028μs 22.2208 KOps/s 21.8800 KOps/s $\color{#35bf28}+1.56\%$
test_compile_indexing[tensor-pytree-compile] 0.1533ms 98.6607μs 10.1357 KOps/s 10.2522 KOps/s $\color{#d91a1a}-1.14\%$
test_compile_indexing[tensor-pytree-eager] 0.2512ms 44.9804μs 22.2319 KOps/s 21.9785 KOps/s $\color{#35bf28}+1.15\%$
test_compile_indexing[slice-tensordict-compile] 0.1283ms 58.0772μs 17.2185 KOps/s 16.4256 KOps/s $\color{#35bf28}+4.83\%$
test_compile_indexing[slice-tensordict-eager] 0.2118ms 27.9193μs 35.8175 KOps/s 36.1918 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_indexing[slice-tensorclass-compile] 0.1404ms 45.2078μs 22.1201 KOps/s 22.2778 KOps/s $\color{#d91a1a}-0.71\%$
test_compile_indexing[slice-tensorclass-eager] 0.2686ms 22.7444μs 43.9669 KOps/s 42.5969 KOps/s $\color{#35bf28}+3.22\%$
test_compile_indexing[slice-pytree-compile] 83.3220μs 45.0041μs 22.2202 KOps/s 21.3592 KOps/s $\color{#35bf28}+4.03\%$
test_compile_indexing[slice-pytree-eager] 0.2663ms 22.4466μs 44.5501 KOps/s 43.0629 KOps/s $\color{#35bf28}+3.45\%$
test_compile_indexing[int-tensordict-compile] 0.2379ms 57.9286μs 17.2626 KOps/s 16.5045 KOps/s $\color{#35bf28}+4.59\%$
test_compile_indexing[int-tensordict-eager] 0.4919ms 27.1875μs 36.7816 KOps/s 36.4781 KOps/s $\color{#35bf28}+0.83\%$
test_compile_indexing[int-tensorclass-compile] 77.4310μs 46.1232μs 21.6811 KOps/s 22.1003 KOps/s $\color{#d91a1a}-1.90\%$
test_compile_indexing[int-tensorclass-eager] 0.4634ms 22.4957μs 44.4528 KOps/s 42.4549 KOps/s $\color{#35bf28}+4.71\%$
test_compile_indexing[int-pytree-compile] 0.4780ms 44.2899μs 22.5785 KOps/s 21.7615 KOps/s $\color{#35bf28}+3.75\%$
test_compile_indexing[int-pytree-eager] 0.2744ms 22.4138μs 44.6154 KOps/s 42.9272 KOps/s $\color{#35bf28}+3.93\%$
test_compile_replace[single-eager] 0.5004ms 48.4777μs 20.6280 KOps/s 20.7112 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_replace[single-compile] 0.6198ms 0.1093ms 9.1516 KOps/s 9.4079 KOps/s $\color{#d91a1a}-2.73\%$
test_compile_replace[multi-eager] 1.0616ms 0.5769ms 1.7334 KOps/s 1.7979 KOps/s $\color{#d91a1a}-3.59\%$
test_compile_replace[multi-compile] 0.2556ms 0.1123ms 8.9031 KOps/s 8.9187 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_tc_getattr_20[eager] 0.6638ms 0.1731ms 5.7762 KOps/s 6.0138 KOps/s $\color{#d91a1a}-3.95\%$
test_compile_tc_getattr_20[compile] 0.6180ms 0.1214ms 8.2351 KOps/s 8.3239 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_clone_shallow[20-eager] 0.4635ms 19.4110μs 51.5172 KOps/s 51.5833 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_clone_shallow[20-compile] 58.9810μs 11.6353μs 85.9455 KOps/s 88.2439 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_clone_shallow[40-eager] 0.4930ms 34.0053μs 29.4072 KOps/s 29.5910 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_clone_shallow[40-compile] 0.4605ms 12.7648μs 78.3405 KOps/s 81.5950 KOps/s $\color{#d91a1a}-3.99\%$
test_compile_clone_shallow[80-eager] 0.5008ms 62.4966μs 16.0009 KOps/s 15.6829 KOps/s $\color{#35bf28}+2.03\%$
test_compile_clone_shallow[80-compile] 0.4605ms 15.3798μs 65.0202 KOps/s 64.7780 KOps/s $\color{#35bf28}+0.37\%$
test_compile_update_inplace[eager] 0.4839ms 58.8621μs 16.9889 KOps/s 16.6151 KOps/s $\color{#35bf28}+2.25\%$
test_compile_update_inplace[compile] 0.2303ms 0.1428ms 7.0014 KOps/s 7.0193 KOps/s $\color{#d91a1a}-0.26\%$
test_mod_add[eager] 0.4999ms 52.8834μs 18.9095 KOps/s 20.2115 KOps/s $\textbf{\color{#d91a1a}-6.44\%}$
test_mod_add[compile] 0.5904ms 0.1111ms 8.9992 KOps/s 9.4379 KOps/s $\color{#d91a1a}-4.65\%$
test_mod_add[compile-overhead] 0.2386ms 0.1505ms 6.6424 KOps/s 6.5328 KOps/s $\color{#35bf28}+1.68\%$
test_mod_wrap[eager] 0.7934ms 0.3071ms 3.2563 KOps/s 3.2926 KOps/s $\color{#d91a1a}-1.10\%$
test_mod_wrap[compile] 0.4149ms 0.3513ms 2.8469 KOps/s 2.8388 KOps/s $\color{#35bf28}+0.28\%$
test_mod_wrap[compile-overhead] 7.1809ms 3.9875ms 250.7839 Ops/s 248.4667 Ops/s $\color{#35bf28}+0.93\%$
test_mod_wrap_and_backward[eager] 1.6154ms 1.5147ms 660.1979 Ops/s 669.1080 Ops/s $\color{#d91a1a}-1.33\%$
test_mod_wrap_and_backward[compile] 1.5680ms 1.4555ms 687.0622 Ops/s 685.2161 Ops/s $\color{#35bf28}+0.27\%$
test_mod_wrap_and_backward[compile-overhead] 1.2868ms 0.9031ms 1.1073 KOps/s 1.1022 KOps/s $\color{#35bf28}+0.46\%$
test_seq_add[eager] 0.2167ms 0.1567ms 6.3830 KOps/s 6.4918 KOps/s $\color{#d91a1a}-1.68\%$
test_seq_add[compile] 0.1702ms 0.1150ms 8.6950 KOps/s 8.5446 KOps/s $\color{#35bf28}+1.76\%$
test_seq_add[compile-overhead] 0.2243ms 0.1541ms 6.4891 KOps/s 6.3056 KOps/s $\color{#35bf28}+2.91\%$
test_seq_wrap[eager] 0.6043ms 0.5441ms 1.8380 KOps/s 1.9098 KOps/s $\color{#d91a1a}-3.76\%$
test_seq_wrap[compile] 0.4458ms 0.3688ms 2.7116 KOps/s 2.7256 KOps/s $\color{#d91a1a}-0.52\%$
test_seq_wrap[compile-overhead] 0.3494ms 0.2668ms 3.7480 KOps/s 3.7350 KOps/s $\color{#35bf28}+0.35\%$
test_func_call_runtime[False-eager] 0.9149ms 0.8514ms 1.1745 KOps/s 1.1928 KOps/s $\color{#d91a1a}-1.54\%$
test_func_call_runtime[False-compile] 0.9900ms 0.9204ms 1.0865 KOps/s 1.0957 KOps/s $\color{#d91a1a}-0.84\%$
test_func_call_runtime[False-compile-overhead] 0.5418ms 0.4664ms 2.1441 KOps/s 2.1584 KOps/s $\color{#d91a1a}-0.66\%$
test_func_call_runtime[True-eager] 1.2142ms 1.0865ms 920.3631 Ops/s 923.4570 Ops/s $\color{#d91a1a}-0.34\%$
test_func_call_runtime[True-compile] 0.9763ms 0.9295ms 1.0759 KOps/s 1.0793 KOps/s $\color{#d91a1a}-0.32\%$
test_func_call_runtime[True-compile-overhead] 0.5203ms 0.4793ms 2.0863 KOps/s 2.0722 KOps/s $\color{#35bf28}+0.68\%$
test_func_call_cm_runtime[False-eager] 0.9671ms 0.8428ms 1.1866 KOps/s 1.1732 KOps/s $\color{#35bf28}+1.14\%$
test_func_call_cm_runtime[False-compile] 1.1037ms 0.9217ms 1.0850 KOps/s 1.0884 KOps/s $\color{#d91a1a}-0.31\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5924ms 0.4646ms 2.1522 KOps/s 2.1443 KOps/s $\color{#35bf28}+0.37\%$
test_func_call_cm_runtime[True-eager] 1.3427ms 1.2341ms 810.2907 Ops/s 818.3478 Ops/s $\color{#d91a1a}-0.98\%$
test_func_call_cm_runtime[True-compile] 1.0256ms 0.9618ms 1.0398 KOps/s 1.0354 KOps/s $\color{#35bf28}+0.42\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6631ms 0.5107ms 1.9580 KOps/s 1.9387 KOps/s $\color{#35bf28}+0.99\%$
test_vmap_func_call_cm_runtime[eager] 2.8539ms 2.3835ms 419.5516 Ops/s 417.6482 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_func_call_cm_runtime[compile] 1.1530ms 0.9812ms 1.0191 KOps/s 1.0143 KOps/s $\color{#35bf28}+0.47\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5902ms 0.5173ms 1.9330 KOps/s 1.9197 KOps/s $\color{#35bf28}+0.69\%$
test_distributed 2.8434ms 0.1705ms 5.8643 KOps/s 5.6553 KOps/s $\color{#35bf28}+3.70\%$
test_tdmodule 0.2646ms 27.9520μs 35.7756 KOps/s 35.5140 KOps/s $\color{#35bf28}+0.74\%$
test_tdmodule_dispatch 75.9620μs 45.9981μs 21.7400 KOps/s 21.9436 KOps/s $\color{#d91a1a}-0.93\%$
test_tdseq 51.3310μs 26.8238μs 37.2804 KOps/s 37.0372 KOps/s $\color{#35bf28}+0.66\%$
test_tdseq_dispatch 66.7510μs 47.5188μs 21.0443 KOps/s 21.1933 KOps/s $\color{#d91a1a}-0.70\%$
test_instantiation_functorch 2.4316ms 2.1112ms 473.6680 Ops/s 477.2485 Ops/s $\color{#d91a1a}-0.75\%$
test_exec_functorch 0.2365ms 0.1818ms 5.4991 KOps/s 5.5281 KOps/s $\color{#d91a1a}-0.53\%$
test_exec_functional_call 0.2095ms 0.1611ms 6.2085 KOps/s 6.2060 KOps/s $\color{#35bf28}+0.04\%$
test_exec_td_decorator 0.4529ms 0.2398ms 4.1693 KOps/s 4.1900 KOps/s $\color{#d91a1a}-0.50\%$
test_vmap_mlp_speed_decorator[True-True] 1.0688ms 0.8282ms 1.2074 KOps/s 1.2026 KOps/s $\color{#35bf28}+0.40\%$
test_vmap_mlp_speed_decorator[True-False] 1.0226ms 0.8271ms 1.2091 KOps/s 1.2080 KOps/s $\color{#35bf28}+0.09\%$
test_vmap_mlp_speed_decorator[False-True] 0.9138ms 0.7150ms 1.3986 KOps/s 1.3991 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[False-False] 0.9398ms 0.7147ms 1.3992 KOps/s 1.3923 KOps/s $\color{#35bf28}+0.49\%$
test_vmap_transformer_speed_decorator[True-True] 21.1235ms 20.6289ms 48.4757 Ops/s 48.1464 Ops/s $\color{#35bf28}+0.68\%$
test_vmap_transformer_speed_decorator[True-False] 21.2167ms 20.6613ms 48.3997 Ops/s 48.2565 Ops/s $\color{#35bf28}+0.30\%$
test_vmap_transformer_speed_decorator[False-True] 21.0438ms 20.4384ms 48.9274 Ops/s 48.7480 Ops/s $\color{#35bf28}+0.37\%$
test_vmap_transformer_speed_decorator[False-False] 21.0322ms 20.4542ms 48.8898 Ops/s 48.6861 Ops/s $\color{#35bf28}+0.42\%$
test_to_module_speed[True] 2.0858ms 1.4784ms 676.4270 Ops/s 676.6796 Ops/s $\color{#d91a1a}-0.04\%$
test_to_module_speed[False] 1.9839ms 1.4310ms 698.7936 Ops/s 682.2133 Ops/s $\color{#35bf28}+2.43\%$
test_tc_init 73.6710μs 44.8054μs 22.3188 KOps/s 22.1403 KOps/s $\color{#35bf28}+0.81\%$
test_tc_init_tensor_only 39.4410μs 9.6478μs 103.6508 KOps/s 101.5166 KOps/s $\color{#35bf28}+2.10\%$
test_tc_init_nested 0.3771ms 89.1944μs 11.2115 KOps/s 11.4584 KOps/s $\color{#d91a1a}-2.15\%$
test_tc_init_many_fields 46.5010μs 16.1137μs 62.0589 KOps/s 61.5096 KOps/s $\color{#35bf28}+0.89\%$
test_tc_first_layer_tensor 22.0400μs 1.7979μs 556.1957 KOps/s 559.5306 KOps/s $\color{#d91a1a}-0.60\%$
test_tc_first_layer_tensor_only 1.6426μs 0.3906μs 2.5602 MOps/s 2.5505 MOps/s $\color{#35bf28}+0.38\%$
test_tc_first_layer_tensor_set 33.3800μs 3.8510μs 259.6744 KOps/s 254.8223 KOps/s $\color{#35bf28}+1.90\%$
test_tc_first_layer_tensor_only_set 20.0810μs 3.2837μs 304.5377 KOps/s 303.5636 KOps/s $\color{#35bf28}+0.32\%$
test_tc_first_layer_nontensor 39.8910μs 6.1060μs 163.7743 KOps/s 161.9368 KOps/s $\color{#35bf28}+1.13\%$
test_tc_second_layer_tensor 29.4310μs 4.3944μs 227.5639 KOps/s 227.8685 KOps/s $\color{#d91a1a}-0.13\%$
test_tc_second_layer_nontensor 32.6000μs 8.6552μs 115.5381 KOps/s 114.6426 KOps/s $\color{#35bf28}+0.78\%$
test_unbind 0.2673s 17.7023ms 56.4897 Ops/s 66.2239 Ops/s $\textbf{\color{#d91a1a}-14.70\%}$
test_full_like 4.9814ms 4.3789ms 228.3663 Ops/s 226.5552 Ops/s $\color{#35bf28}+0.80\%$
test_zeros_like 4.9371ms 4.3864ms 227.9777 Ops/s 228.2019 Ops/s $\color{#d91a1a}-0.10\%$
test_ones_like 5.1721ms 4.3821ms 228.2027 Ops/s 231.5702 Ops/s $\color{#d91a1a}-1.45\%$
test_clone 6.6267ms 6.4749ms 154.4424 Ops/s 153.3191 Ops/s $\color{#35bf28}+0.73\%$
test_squeeze 0.1858ms 14.2295μs 70.2765 KOps/s 69.6737 KOps/s $\color{#35bf28}+0.87\%$
test_unsqueeze 0.2718ms 0.1147ms 8.7195 KOps/s 8.8046 KOps/s $\color{#d91a1a}-0.97\%$
test_split 0.2504ms 0.1833ms 5.4550 KOps/s 5.2912 KOps/s $\color{#35bf28}+3.10\%$
test_permute 0.2925ms 0.2139ms 4.6747 KOps/s 4.6251 KOps/s $\color{#35bf28}+1.07\%$
test_stack 43.3416ms 42.9708ms 23.2716 Ops/s 19.6592 Ops/s $\textbf{\color{#35bf28}+18.38\%}$
test_cat 43.3199ms 42.9260ms 23.2959 Ops/s 19.7728 Ops/s $\textbf{\color{#35bf28}+17.82\%}$
test_sequential_tensordict 0.2850ms 0.2233ms 4.4776 KOps/s 4.4417 KOps/s $\color{#35bf28}+0.81\%$
test_sequential_graph_module 0.1913ms 0.1232ms 8.1156 KOps/s 8.4270 KOps/s $\color{#d91a1a}-3.70\%$
test_nested_tensordict 0.3754ms 0.2931ms 3.4117 KOps/s 3.4805 KOps/s $\color{#d91a1a}-1.97\%$
test_nested_graph_module 0.2009ms 0.1346ms 7.4279 KOps/s 7.4970 KOps/s $\color{#d91a1a}-0.92\%$
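
The bot does not say how the Change column is computed. The numbers are consistent with a relative difference between "Ops" and "Ops on Repo HEAD", and the bold entries (all those at or beyond roughly ±5%) match the Improved and Worsened counts in the summary. The sketch below checks that reading against two rows of the table. Both the formula and the 5% threshold are inferred from the data, not confirmed by the source, and the function names are hypothetical.

```python
def relative_change(ops, ops_head):
    """Percent change of this PR's throughput relative to repo HEAD (assumed formula)."""
    return (ops - ops_head) / ops_head * 100.0


def is_flagged(change_pct, threshold=5.0):
    """Assumed rule for bold entries: absolute change at or above the threshold."""
    return abs(change_pct) >= threshold


# test_plain_set_nested: 67.6921 KOps/s vs 66.8796 KOps/s on HEAD
small = relative_change(67.6921, 66.8796)
# test_contiguous[memmap_tensor0]: 1.4606 MOps/s vs 2.1475 MOps/s on HEAD
large = relative_change(1.4606, 2.1475)

print(f"{small:+.2f}% flagged={is_flagged(small)}")  # +1.21% flagged=False
print(f"{large:+.2f}% flagged={is_flagged(large)}")  # -31.99% flagged=True
```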

@github-actions bot (Contributor) commented Mar 9, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 261. Improved: $\large\color{#35bf28}33$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 33.7210μs 14.3153μs 69.8554 KOps/s 68.2741 KOps/s $\color{#35bf28}+2.32\%$
test_plain_set_stack_nested 34.0710μs 14.7432μs 67.8278 KOps/s 68.2900 KOps/s $\color{#d91a1a}-0.68\%$
test_plain_set_nested_inplace 56.1310μs 16.1542μs 61.9036 KOps/s 61.7802 KOps/s $\color{#35bf28}+0.20\%$
test_plain_set_stack_nested_inplace 35.6410μs 16.1678μs 61.8515 KOps/s 61.8047 KOps/s $\color{#35bf28}+0.08\%$
test_items 25.8910μs 5.4887μs 182.1922 KOps/s 180.3641 KOps/s $\color{#35bf28}+1.01\%$
test_items_nested 0.5527ms 0.4500ms 2.2222 KOps/s 2.2339 KOps/s $\color{#d91a1a}-0.52\%$
test_items_nested_locked 0.5456ms 0.4518ms 2.2136 KOps/s 2.2085 KOps/s $\color{#35bf28}+0.23\%$
test_items_nested_leaf 0.1341ms 92.0818μs 10.8599 KOps/s 10.7998 KOps/s $\color{#35bf28}+0.56\%$
test_items_stack_nested 0.5309ms 0.4519ms 2.2127 KOps/s 2.2510 KOps/s $\color{#d91a1a}-1.70\%$
test_items_stack_nested_leaf 0.1396ms 92.8681μs 10.7680 KOps/s 10.7463 KOps/s $\color{#35bf28}+0.20\%$
test_items_stack_nested_locked 0.5545ms 0.4545ms 2.2002 KOps/s 2.2255 KOps/s $\color{#d91a1a}-1.14\%$
test_keys 36.0010μs 4.1392μs 241.5946 KOps/s 242.6018 KOps/s $\color{#d91a1a}-0.42\%$
test_keys_nested 0.1837ms 0.1268ms 7.8860 KOps/s 7.8789 KOps/s $\color{#35bf28}+0.09\%$
test_keys_nested_locked 2.2607ms 0.1354ms 7.3871 KOps/s 7.3780 KOps/s $\color{#35bf28}+0.12\%$
test_keys_nested_leaf 0.1610ms 0.1171ms 8.5398 KOps/s 8.5291 KOps/s $\color{#35bf28}+0.13\%$
test_keys_stack_nested 0.1866ms 0.1259ms 7.9441 KOps/s 7.8467 KOps/s $\color{#35bf28}+1.24\%$
test_keys_stack_nested_leaf 0.1630ms 0.1174ms 8.5144 KOps/s 8.5097 KOps/s $\color{#35bf28}+0.06\%$
test_keys_stack_nested_locked 0.1837ms 0.1344ms 7.4397 KOps/s 7.4695 KOps/s $\color{#d91a1a}-0.40\%$
test_values 3.7891μs 1.0203μs 980.0871 KOps/s 995.8847 KOps/s $\color{#d91a1a}-1.59\%$
test_values_nested 78.8820μs 50.9319μs 19.6340 KOps/s 19.4400 KOps/s $\color{#35bf28}+1.00\%$
test_values_nested_locked 84.1920μs 53.9998μs 18.5186 KOps/s 18.3824 KOps/s $\color{#35bf28}+0.74\%$
test_values_nested_leaf 82.7120μs 58.6435μs 17.0522 KOps/s 17.0460 KOps/s $\color{#35bf28}+0.04\%$
test_values_stack_nested 0.1029ms 51.0528μs 19.5876 KOps/s 19.3944 KOps/s $\color{#35bf28}+1.00\%$
test_values_stack_nested_leaf 0.1048ms 58.3708μs 17.1319 KOps/s 17.0262 KOps/s $\color{#35bf28}+0.62\%$
test_values_stack_nested_locked 89.5930μs 54.5709μs 18.3248 KOps/s 18.4478 KOps/s $\color{#d91a1a}-0.67\%$
test_membership 4.2785μs 0.8144μs 1.2279 MOps/s 1.2622 MOps/s $\color{#d91a1a}-2.72\%$
test_membership_nested 31.2000μs 2.7321μs 366.0180 KOps/s 375.1385 KOps/s $\color{#d91a1a}-2.43\%$
test_membership_nested_leaf 16.3755μs 2.6802μs 373.1122 KOps/s 373.1612 KOps/s $\color{#d91a1a}-0.01\%$
test_membership_stacked_nested 38.6310μs 2.7362μs 365.4681 KOps/s 366.1607 KOps/s $\color{#d91a1a}-0.19\%$
test_membership_stacked_nested_leaf 21.9110μs 2.7182μs 367.8965 KOps/s 364.8961 KOps/s $\color{#35bf28}+0.82\%$
test_membership_nested_last 39.4610μs 4.1409μs 241.4936 KOps/s 241.8805 KOps/s $\color{#d91a1a}-0.16\%$
test_membership_nested_leaf_last 35.6110μs 4.1646μs 240.1163 KOps/s 240.4891 KOps/s $\color{#d91a1a}-0.15\%$
test_membership_stacked_nested_last 21.2410μs 4.1518μs 240.8614 KOps/s 241.8731 KOps/s $\color{#d91a1a}-0.42\%$
test_membership_stacked_nested_leaf_last 28.8900μs 4.1380μs 241.6624 KOps/s 242.5370 KOps/s $\color{#d91a1a}-0.36\%$
test_nested_getleaf 44.3910μs 20.8442μs 47.9751 KOps/s 48.5033 KOps/s $\color{#d91a1a}-1.09\%$
test_nested_get 42.4710μs 19.7921μs 50.5253 KOps/s 50.8171 KOps/s $\color{#d91a1a}-0.57\%$
test_stacked_getleaf 45.6210μs 21.0501μs 47.5056 KOps/s 48.8199 KOps/s $\color{#d91a1a}-2.69\%$
test_stacked_get 42.3610μs 19.8665μs 50.3360 KOps/s 50.9272 KOps/s $\color{#d91a1a}-1.16\%$
test_nested_getitemleaf 43.8110μs 21.2254μs 47.1133 KOps/s 47.7218 KOps/s $\color{#d91a1a}-1.28\%$
test_nested_getitem 43.1110μs 20.2463μs 49.3918 KOps/s 50.2306 KOps/s $\color{#d91a1a}-1.67\%$
test_stacked_getitemleaf 44.7010μs 21.2814μs 46.9893 KOps/s 47.5901 KOps/s $\color{#d91a1a}-1.26\%$
test_stacked_getitem 39.8210μs 20.0914μs 49.7726 KOps/s 50.5732 KOps/s $\color{#d91a1a}-1.58\%$
test_lock_nested 0.6230ms 0.4518ms 2.2133 KOps/s 2.1903 KOps/s $\color{#35bf28}+1.05\%$
test_lock_stack_nested 0.6021ms 0.4565ms 2.1907 KOps/s 2.1439 KOps/s $\color{#35bf28}+2.18\%$
test_unlock_nested 0.5498ms 0.3702ms 2.7012 KOps/s 2.6778 KOps/s $\color{#35bf28}+0.87\%$
test_unlock_stack_nested 0.3986ms 0.3696ms 2.7053 KOps/s 2.6463 KOps/s $\color{#35bf28}+2.23\%$
test_flatten_speed 0.1659ms 0.1188ms 8.4210 KOps/s 8.5756 KOps/s $\color{#d91a1a}-1.80\%$
test_unflatten_speed 0.6367ms 0.5455ms 1.8331 KOps/s 1.8493 KOps/s $\color{#d91a1a}-0.87\%$
test_common_ops 0.8442ms 0.6712ms 1.4900 KOps/s 1.3928 KOps/s $\textbf{\color{#35bf28}+6.97\%}$
test_creation 71.3420μs 2.9897μs 334.4784 KOps/s 344.0864 KOps/s $\color{#d91a1a}-2.79\%$
test_creation_empty 33.9810μs 6.5687μs 152.2368 KOps/s 153.5486 KOps/s $\color{#d91a1a}-0.85\%$
test_creation_nested_1 32.1410μs 11.0303μs 90.6593 KOps/s 92.3286 KOps/s $\color{#d91a1a}-1.81\%$
test_creation_nested_2 48.0710μs 12.7436μs 78.4706 KOps/s 79.8036 KOps/s $\color{#d91a1a}-1.67\%$
test_creation_many_keys[10] 88.8520μs 20.1905μs 49.5281 KOps/s 49.9805 KOps/s $\color{#d91a1a}-0.90\%$
test_creation_many_keys[50] 0.1247ms 86.3629μs 11.5790 KOps/s 11.7105 KOps/s $\color{#d91a1a}-1.12\%$
test_creation_many_keys[100] 0.2049ms 0.1722ms 5.8079 KOps/s 5.9901 KOps/s $\color{#d91a1a}-3.04\%$
test_creation_nested_many_keys[10] 90.1220μs 42.8470μs 23.3389 KOps/s 23.5456 KOps/s $\color{#d91a1a}-0.88\%$
test_creation_nested_many_keys[50] 0.2446ms 0.1775ms 5.6323 KOps/s 5.7934 KOps/s $\color{#d91a1a}-2.78\%$
test_clone 54.6510μs 12.5917μs 79.4171 KOps/s 78.1215 KOps/s $\color{#35bf28}+1.66\%$
test_getitem[int] 1.6151ms 14.5714μs 68.6276 KOps/s 62.3711 KOps/s $\textbf{\color{#35bf28}+10.03\%}$
test_getitem[slice_int] 0.1396ms 22.8742μs 43.7173 KOps/s 40.9764 KOps/s $\textbf{\color{#35bf28}+6.69\%}$
test_getitem[range] 0.1750ms 60.9785μs 16.3992 KOps/s 15.2507 KOps/s $\textbf{\color{#35bf28}+7.53\%}$
test_getitem[tuple] 0.1400ms 22.6238μs 44.2013 KOps/s 43.6915 KOps/s $\color{#35bf28}+1.17\%$
test_getitem[list] 0.1827ms 56.1408μs 17.8124 KOps/s 17.9733 KOps/s $\color{#d91a1a}-0.90\%$
test_setitem_dim[int] 46.4210μs 24.7840μs 40.3487 KOps/s 39.7052 KOps/s $\color{#35bf28}+1.62\%$
test_setitem_dim[slice_int] 79.3820μs 41.0278μs 24.3737 KOps/s 24.1869 KOps/s $\color{#35bf28}+0.77\%$
test_setitem_dim[range] 0.1186ms 90.9441μs 10.9958 KOps/s 11.0566 KOps/s $\color{#d91a1a}-0.55\%$
test_setitem_dim[tuple] 77.5120μs 38.2291μs 26.1581 KOps/s 26.0731 KOps/s $\color{#35bf28}+0.33\%$
test_setitem 65.6520μs 16.9295μs 59.0686 KOps/s 58.4300 KOps/s $\color{#35bf28}+1.09\%$
test_set 61.1310μs 16.2949μs 61.3690 KOps/s 61.3302 KOps/s $\color{#35bf28}+0.06\%$
test_set_shared 0.5112ms 0.2015ms 4.9619 KOps/s 4.8789 KOps/s $\color{#35bf28}+1.70\%$
test_update 0.2009ms 20.4609μs 48.8737 KOps/s 44.8399 KOps/s $\textbf{\color{#35bf28}+9.00\%}$
test_update_nested 70.6320μs 31.3716μs 31.8759 KOps/s 30.8743 KOps/s $\color{#35bf28}+3.24\%$
test_update__nested 0.5418ms 33.1152μs 30.1976 KOps/s 29.4563 KOps/s $\color{#35bf28}+2.52\%$
test_set_nested 72.6120μs 18.1034μs 55.2383 KOps/s 54.3230 KOps/s $\color{#35bf28}+1.68\%$
test_set_nested_new 68.4220μs 22.8314μs 43.7993 KOps/s 43.3408 KOps/s $\color{#35bf28}+1.06\%$
test_select 85.7920μs 38.8495μs 25.7403 KOps/s 25.3912 KOps/s $\color{#35bf28}+1.38\%$
test_select_nested 0.1385ms 70.5978μs 14.1647 KOps/s 14.4094 KOps/s $\color{#d91a1a}-1.70\%$
test_exclude_nested 0.1224ms 87.2392μs 11.4627 KOps/s 11.6103 KOps/s $\color{#d91a1a}-1.27\%$
test_empty[True] 0.7421ms 0.3867ms 2.5862 KOps/s 2.6170 KOps/s $\color{#d91a1a}-1.18\%$
test_empty[False] 7.5075μs 1.2575μs 795.2251 KOps/s 800.7755 KOps/s $\color{#d91a1a}-0.69\%$
test_to 0.1020ms 70.1904μs 14.2470 KOps/s 14.1288 KOps/s $\color{#35bf28}+0.84\%$
test_to_nonblocking 0.1274ms 63.5190μs 15.7433 KOps/s 15.8112 KOps/s $\color{#d91a1a}-0.43\%$
test_unbind_speed 0.3920ms 0.3144ms 3.1803 KOps/s 3.1252 KOps/s $\color{#35bf28}+1.76\%$
test_unbind_speed_stack0 0.4378ms 0.3157ms 3.1673 KOps/s 3.1078 KOps/s $\color{#35bf28}+1.92\%$
test_unbind_speed_stack1 0.1037s 0.8822ms 1.1336 KOps/s 1.2280 KOps/s $\textbf{\color{#d91a1a}-7.69\%}$
test_split 1.2271ms 1.0885ms 918.6845 Ops/s 825.7254 Ops/s $\textbf{\color{#35bf28}+11.26\%}$
test_chunk 0.1037s 1.1593ms 862.5573 Ops/s 965.7841 Ops/s $\textbf{\color{#d91a1a}-10.69\%}$
test_to_cpu_blocking 29.1580ms 28.4568ms 35.1410 Ops/s 31.5808 Ops/s $\textbf{\color{#35bf28}+11.27\%}$
test_to_cpu_global_sync 11.3968ms 11.2171ms 89.1498 Ops/s 89.0295 Ops/s $\color{#35bf28}+0.14\%$
test_to_cpu_event_sync 12.4238ms 12.1977ms 81.9824 Ops/s 82.1521 Ops/s $\color{#d91a1a}-0.21\%$
test_to_cpu_default 0.1163s 13.4650ms 74.2664 Ops/s 82.0383 Ops/s $\textbf{\color{#d91a1a}-9.47\%}$
test_consolidate[False-None] 4.1866ms 4.0036ms 249.7757 Ops/s 251.0442 Ops/s $\color{#d91a1a}-0.51\%$
test_consolidate[default-None] 2.0551ms 1.9625ms 509.5547 Ops/s 501.1209 Ops/s $\color{#35bf28}+1.68\%$
test_consolidate[reduce-overhead-None] 1.9425ms 1.8692ms 534.9787 Ops/s 513.1241 Ops/s $\color{#35bf28}+4.26\%$
test_consolidate_njt[False-None] 8.4053ms 8.1368ms 122.8980 Ops/s 121.1974 Ops/s $\color{#35bf28}+1.40\%$
test_to[False-False-None] 2.1725ms 2.0419ms 489.7359 Ops/s 484.4751 Ops/s $\color{#35bf28}+1.09\%$
test_to[True-False-None] 2.1137ms 1.8827ms 531.1430 Ops/s 541.1689 Ops/s $\color{#d91a1a}-1.85\%$
test_to[within-False-None] 6.2541ms 5.9409ms 168.3257 Ops/s 169.3990 Ops/s $\color{#d91a1a}-0.63\%$
test_to[True-default-None] 8.6801ms 8.5024ms 117.6144 Ops/s 113.3951 Ops/s $\color{#35bf28}+3.72\%$
test_to_njt[False-False-None] 8.3129ms 8.1851ms 122.1728 Ops/s 117.5869 Ops/s $\color{#35bf28}+3.90\%$
test_to_njt[True-False-None] 6.8310ms 6.6471ms 150.4413 Ops/s 142.6445 Ops/s $\textbf{\color{#35bf28}+5.47\%}$
test_to_njt[within-False-None] 15.1683ms 14.8593ms 67.2979 Ops/s 66.1676 Ops/s $\color{#35bf28}+1.71\%$
test_creation[device0] 0.3949ms 0.1127ms 8.8751 KOps/s 8.7680 KOps/s $\color{#35bf28}+1.22\%$
test_creation_from_tensor 0.4088ms 0.1106ms 9.0427 KOps/s 8.9457 KOps/s $\color{#35bf28}+1.08\%$
test_add_one[memmap_tensor0] 0.2016ms 6.4074μs 156.0706 KOps/s 153.4400 KOps/s $\color{#35bf28}+1.71\%$
test_contiguous[memmap_tensor0] 17.3010μs 0.6052μs 1.6523 MOps/s 2.1877 MOps/s $\textbf{\color{#d91a1a}-24.47\%}$
test_stack[memmap_tensor0] 28.0500μs 4.7300μs 211.4173 KOps/s 212.8123 KOps/s $\color{#d91a1a}-0.66\%$
test_memmaptd_index 1.0929ms 0.2646ms 3.7794 KOps/s 3.8024 KOps/s $\color{#d91a1a}-0.60\%$
test_memmaptd_index_astensor 0.5259ms 0.3668ms 2.7261 KOps/s 2.7776 KOps/s $\color{#d91a1a}-1.85\%$
test_memmaptd_index_op 0.8372ms 0.6143ms 1.6278 KOps/s 1.6407 KOps/s $\color{#d91a1a}-0.78\%$
test_serialize_model 0.1385s 0.1366s 7.3184 Ops/s 7.3491 Ops/s $\color{#d91a1a}-0.42\%$
test_serialize_model_pickle 1.3644s 1.2139s 0.8238 Ops/s 0.8251 Ops/s $\color{#d91a1a}-0.16\%$
test_serialize_weights 0.1372s 0.1335s 7.4909 Ops/s 7.3974 Ops/s $\color{#35bf28}+1.26\%$
test_serialize_weights_returnearly 0.4282s 83.4414ms 11.9845 Ops/s 6.1209 Ops/s $\textbf{\color{#35bf28}+95.79\%}$
test_serialize_weights_pickle 1.3720s 1.1876s 0.8420 Ops/s 0.8240 Ops/s $\color{#35bf28}+2.18\%$
test_reshape_pytree 0.2065ms 31.9073μs 31.3408 KOps/s 31.4924 KOps/s $\color{#d91a1a}-0.48\%$
test_reshape_td 82.2920μs 43.3358μs 23.0756 KOps/s 22.3965 KOps/s $\color{#35bf28}+3.03\%$
test_view_pytree 0.2188ms 31.4537μs 31.7927 KOps/s 32.3580 KOps/s $\color{#d91a1a}-1.75\%$
test_view_td 98.1130μs 50.9141μs 19.6409 KOps/s 19.2287 KOps/s $\color{#35bf28}+2.14\%$
test_unbind_pytree 0.2343ms 35.1815μs 28.4241 KOps/s 27.7717 KOps/s $\color{#35bf28}+2.35\%$
test_unbind_td 0.1141ms 46.8055μs 21.3650 KOps/s 20.0377 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_split_pytree 0.2461ms 41.7020μs 23.9797 KOps/s 24.4276 KOps/s $\color{#d91a1a}-1.83\%$
test_split_td 0.2080ms 61.9281μs 16.1478 KOps/s 16.1746 KOps/s $\color{#d91a1a}-0.17\%$
test_add_pytree 0.1908ms 41.6765μs 23.9943 KOps/s 24.8065 KOps/s $\color{#d91a1a}-3.27\%$
test_add_td 0.1054ms 53.5704μs 18.6670 KOps/s 18.6775 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_one_nested[tensordict-compile] 0.2359ms 0.1427ms 7.0079 KOps/s 6.6682 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_compile_add_one_nested[tensordict-eager] 0.6198ms 0.1968ms 5.0822 KOps/s 5.2440 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_add_one_nested[pytree-compile] 0.1537ms 0.1051ms 9.5127 KOps/s 9.3567 KOps/s $\color{#35bf28}+1.67\%$
test_compile_add_one_nested[pytree-eager] 0.5906ms 0.1755ms 5.6974 KOps/s 5.6908 KOps/s $\color{#35bf28}+0.12\%$
test_compile_copy_nested[tensordict-compile] 0.4496ms 9.8285μs 101.7450 KOps/s 100.4194 KOps/s $\color{#35bf28}+1.32\%$
test_compile_copy_nested[tensordict-eager] 0.1187ms 51.9461μs 19.2507 KOps/s 19.5250 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_copy_nested[pytree-compile] 0.1062ms 9.3832μs 106.5732 KOps/s 105.9018 KOps/s $\color{#35bf28}+0.63\%$
test_compile_copy_nested[pytree-eager] 0.4386ms 65.9313μs 15.1673 KOps/s 15.1755 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_add_one_flat[tensordict-compile] 0.2458ms 0.1795ms 5.5695 KOps/s 5.3012 KOps/s $\textbf{\color{#35bf28}+5.06\%}$
test_compile_add_one_flat[tensordict-eager] 0.3866ms 0.2752ms 3.6340 KOps/s 3.6269 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.1889ms 0.1131ms 8.8438 KOps/s 8.2509 KOps/s $\textbf{\color{#35bf28}+7.19\%}$
test_compile_add_one_flat[tensorclass-eager] 0.1276ms 72.3104μs 13.8293 KOps/s 13.7798 KOps/s $\color{#35bf28}+0.36\%$
test_compile_add_one_flat[pytree-compile] 0.3861ms 0.1544ms 6.4775 KOps/s 6.2964 KOps/s $\color{#35bf28}+2.88\%$
test_compile_add_one_flat[pytree-eager] 0.7945ms 0.5168ms 1.9350 KOps/s 1.9367 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_add_self_flat[tensordict-eager] 0.3746ms 0.3270ms 3.0582 KOps/s 3.0305 KOps/s $\color{#35bf28}+0.91\%$
test_compile_add_self_flat[tensordict-compile] 0.2544ms 0.1750ms 5.7139 KOps/s 5.3170 KOps/s $\textbf{\color{#35bf28}+7.46\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1404ms 87.8195μs 11.3870 KOps/s 11.4216 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_add_self_flat[tensorclass-compile] 0.2271ms 0.1152ms 8.6777 KOps/s 7.9187 KOps/s $\textbf{\color{#35bf28}+9.59\%}$
test_compile_add_self_flat[pytree-eager] 0.6330ms 0.4318ms 2.3160 KOps/s 2.2909 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_self_flat[pytree-compile] 0.2198ms 0.1542ms 6.4864 KOps/s 6.2476 KOps/s $\color{#35bf28}+3.82\%$
test_compile_copy_flat[tensordict-compile] 0.1163ms 12.8427μs 77.8652 KOps/s 76.8758 KOps/s $\color{#35bf28}+1.29\%$
test_compile_copy_flat[tensordict-eager] 87.0420μs 39.9714μs 25.0179 KOps/s 24.9949 KOps/s $\color{#35bf28}+0.09\%$
test_compile_copy_flat[pytree-compile] 31.4610μs 10.4261μs 95.9134 KOps/s 94.7779 KOps/s $\color{#35bf28}+1.20\%$
test_compile_copy_flat[pytree-eager] 0.4033ms 51.4491μs 19.4367 KOps/s 19.4011 KOps/s $\color{#35bf28}+0.18\%$
test_compile_assign_and_add[tensordict-compile] 1.9610ms 0.1717ms 5.8229 KOps/s 5.4703 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_compile_assign_and_add[tensordict-eager] 3.3563ms 3.2288ms 309.7114 Ops/s 308.4342 Ops/s $\color{#35bf28}+0.41\%$
test_compile_assign_and_add[pytree-compile] 1.9333ms 0.1586ms 6.3063 KOps/s 6.2332 KOps/s $\color{#35bf28}+1.17\%$
test_compile_assign_and_add[pytree-eager] 2.8292ms 2.7136ms 368.5192 Ops/s 365.8082 Ops/s $\color{#35bf28}+0.74\%$
test_compile_indexing[tensor-tensordict-compile] 0.2471ms 0.1057ms 9.4584 KOps/s 9.0617 KOps/s $\color{#35bf28}+4.38\%$
test_compile_indexing[tensor-tensordict-eager] 0.3091ms 71.9033μs 13.9076 KOps/s 14.1087 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2016ms 93.7457μs 10.6672 KOps/s 10.3743 KOps/s $\color{#35bf28}+2.82\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2597ms 44.2657μs 22.5908 KOps/s 23.3003 KOps/s $\color{#d91a1a}-3.04\%$
test_compile_indexing[tensor-pytree-compile] 0.1480ms 98.8534μs 10.1160 KOps/s 10.0169 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[tensor-pytree-eager] 0.2648ms 46.4450μs 21.5309 KOps/s 23.2983 KOps/s $\textbf{\color{#d91a1a}-7.59\%}$
test_compile_indexing[slice-tensordict-compile] 0.1572ms 54.2125μs 18.4459 KOps/s 17.1200 KOps/s $\textbf{\color{#35bf28}+7.75\%}$
test_compile_indexing[slice-tensordict-eager] 0.2225ms 26.3265μs 37.9846 KOps/s 38.7554 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_indexing[slice-tensorclass-compile] 0.1654ms 42.6440μs 23.4499 KOps/s 22.4446 KOps/s $\color{#35bf28}+4.48\%$
test_compile_indexing[slice-tensorclass-eager] 0.2794ms 21.3426μs 46.8546 KOps/s 46.2711 KOps/s $\color{#35bf28}+1.26\%$
test_compile_indexing[slice-pytree-compile] 85.9920μs 42.6694μs 23.4360 KOps/s 22.2529 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_compile_indexing[slice-pytree-eager] 0.2551ms 21.4865μs 46.5409 KOps/s 46.5144 KOps/s $\color{#35bf28}+0.06\%$
test_compile_indexing[int-tensordict-compile] 0.1024ms 55.9624μs 17.8691 KOps/s 16.5092 KOps/s $\textbf{\color{#35bf28}+8.24\%}$
test_compile_indexing[int-tensordict-eager] 0.2983ms 26.2020μs 38.1650 KOps/s 39.0623 KOps/s $\color{#d91a1a}-2.30\%$
test_compile_indexing[int-tensorclass-compile] 67.4120μs 42.8654μs 23.3288 KOps/s 22.1440 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_compile_indexing[int-tensorclass-eager] 0.2669ms 21.4812μs 46.5524 KOps/s 46.5163 KOps/s $\color{#35bf28}+0.08\%$
test_compile_indexing[int-pytree-compile] 0.1008ms 42.7290μs 23.4033 KOps/s 22.1151 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_compile_indexing[int-pytree-eager] 0.2848ms 21.6103μs 46.2742 KOps/s 46.4970 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_replace[single-eager] 86.6320μs 46.3808μs 21.5606 KOps/s 21.3581 KOps/s $\color{#35bf28}+0.95\%$
test_compile_replace[single-compile] 0.1987ms 0.1036ms 9.6494 KOps/s 9.0677 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_compile_replace[multi-eager] 0.6367ms 0.5497ms 1.8191 KOps/s 1.8121 KOps/s $\color{#35bf28}+0.39\%$
test_compile_replace[multi-compile] 0.1453ms 0.1082ms 9.2395 KOps/s 8.5499 KOps/s $\textbf{\color{#35bf28}+8.06\%}$
test_compile_tc_getattr_20[eager] 0.2261ms 0.1695ms 5.8984 KOps/s 5.9745 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_tc_getattr_20[compile] 0.2340ms 0.1214ms 8.2353 KOps/s 7.9660 KOps/s $\color{#35bf28}+3.38\%$
test_compile_clone_shallow[20-eager] 55.5910μs 18.6977μs 53.4825 KOps/s 54.7022 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_clone_shallow[20-compile] 58.0110μs 10.9805μs 91.0709 KOps/s 90.3379 KOps/s $\color{#35bf28}+0.81\%$
test_compile_clone_shallow[40-eager] 63.0720μs 32.4005μs 30.8637 KOps/s 30.9165 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_clone_shallow[40-compile] 63.1320μs 11.7826μs 84.8706 KOps/s 81.6137 KOps/s $\color{#35bf28}+3.99\%$
test_compile_clone_shallow[80-eager] 0.1287ms 61.0246μs 16.3868 KOps/s 16.4773 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_clone_shallow[80-compile] 52.0110μs 14.2233μs 70.3070 KOps/s 66.3879 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_compile_update_inplace[eager] 93.1430μs 58.1623μs 17.1933 KOps/s 16.6430 KOps/s $\color{#35bf28}+3.31\%$
test_compile_update_inplace[compile] 0.1734ms 0.1343ms 7.4447 KOps/s 6.9374 KOps/s $\textbf{\color{#35bf28}+7.31\%}$
test_mod_add[eager] 0.1123ms 47.1302μs 21.2178 KOps/s 19.3620 KOps/s $\textbf{\color{#35bf28}+9.59\%}$
test_mod_add[compile] 0.2986ms 0.1022ms 9.7850 KOps/s 9.1832 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_mod_add[compile-overhead] 0.3330ms 0.1451ms 6.8902 KOps/s 6.3762 KOps/s $\textbf{\color{#35bf28}+8.06\%}$
test_mod_wrap[eager] 0.3654ms 0.2847ms 3.5123 KOps/s 3.3344 KOps/s $\textbf{\color{#35bf28}+5.33\%}$
test_mod_wrap[compile] 0.4854ms 0.3394ms 2.9466 KOps/s 2.9008 KOps/s $\color{#35bf28}+1.58\%$
test_mod_wrap[compile-overhead] 7.2517ms 4.0088ms 249.4520 Ops/s 248.7882 Ops/s $\color{#35bf28}+0.27\%$
test_mod_wrap_and_backward[eager] 1.7151ms 1.4806ms 675.4002 Ops/s 675.1808 Ops/s $\color{#35bf28}+0.03\%$
test_mod_wrap_and_backward[compile] 1.5310ms 1.4140ms 707.2229 Ops/s 701.9208 Ops/s $\color{#35bf28}+0.76\%$
test_mod_wrap_and_backward[compile-overhead] 1.2222ms 0.8662ms 1.1544 KOps/s 1.1381 KOps/s $\color{#35bf28}+1.44\%$
test_seq_add[eager] 0.2054ms 0.1508ms 6.6292 KOps/s 6.6447 KOps/s $\color{#d91a1a}-0.23\%$
test_seq_add[compile] 0.2040ms 0.1106ms 9.0396 KOps/s 8.6825 KOps/s $\color{#35bf28}+4.11\%$
test_seq_add[compile-overhead] 0.2006ms 0.1493ms 6.6974 KOps/s 6.3816 KOps/s $\color{#35bf28}+4.95\%$
test_seq_wrap[eager] 0.5728ms 0.5091ms 1.9644 KOps/s 1.9602 KOps/s $\color{#35bf28}+0.22\%$
test_seq_wrap[compile] 0.4442ms 0.3544ms 2.8219 KOps/s 2.7582 KOps/s $\color{#35bf28}+2.31\%$
test_seq_wrap[compile-overhead] 0.3440ms 0.2564ms 3.9005 KOps/s 3.8195 KOps/s $\color{#35bf28}+2.12\%$
test_func_call_runtime[False-eager] 0.9635ms 0.8274ms 1.2086 KOps/s 1.2123 KOps/s $\color{#d91a1a}-0.31\%$
test_func_call_runtime[False-compile] 0.9661ms 0.8885ms 1.1255 KOps/s 1.1279 KOps/s $\color{#d91a1a}-0.22\%$
test_func_call_runtime[False-compile-overhead] 0.5045ms 0.4420ms 2.2623 KOps/s 2.2208 KOps/s $\color{#35bf28}+1.87\%$
test_func_call_runtime[True-eager] 1.1561ms 1.0416ms 960.0871 Ops/s 947.1289 Ops/s $\color{#35bf28}+1.37\%$
test_func_call_runtime[True-compile] 0.9678ms 0.8858ms 1.1289 KOps/s 1.0706 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_func_call_runtime[True-compile-overhead] 0.5640ms 0.4558ms 2.1940 KOps/s 2.1598 KOps/s $\color{#35bf28}+1.58\%$
test_func_call_cm_runtime[False-eager] 0.9378ms 0.8199ms 1.2197 KOps/s 1.1972 KOps/s $\color{#35bf28}+1.88\%$
test_func_call_cm_runtime[False-compile] 1.1076ms 0.9166ms 1.0910 KOps/s 1.1224 KOps/s $\color{#d91a1a}-2.79\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4960ms 0.4446ms 2.2494 KOps/s 2.2199 KOps/s $\color{#35bf28}+1.33\%$
test_func_call_cm_runtime[True-eager] 1.3593ms 1.1942ms 837.3956 Ops/s 830.4734 Ops/s $\color{#35bf28}+0.83\%$
test_func_call_cm_runtime[True-compile] 1.0605ms 0.9245ms 1.0817 KOps/s 1.0667 KOps/s $\color{#35bf28}+1.41\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5445ms 0.4876ms 2.0510 KOps/s 2.0066 KOps/s $\color{#35bf28}+2.21\%$
test_vmap_func_call_cm_runtime[eager] 2.8017ms 2.3196ms 431.1091 Ops/s 427.9471 Ops/s $\color{#35bf28}+0.74\%$
test_vmap_func_call_cm_runtime[compile] 1.1347ms 0.9494ms 1.0533 KOps/s 1.0396 KOps/s $\color{#35bf28}+1.32\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5484ms 0.4932ms 2.0277 KOps/s 1.9905 KOps/s $\color{#35bf28}+1.87\%$
test_distributed 3.0652ms 0.1547ms 6.4652 KOps/s 6.4612 KOps/s $\color{#35bf28}+0.06\%$
test_tdmodule 55.9520μs 28.0171μs 35.6924 KOps/s 37.1744 KOps/s $\color{#d91a1a}-3.99\%$
test_tdmodule_dispatch 76.7520μs 44.1616μs 22.6441 KOps/s 22.4689 KOps/s $\color{#35bf28}+0.78\%$
test_tdseq 47.2610μs 26.3178μs 37.9971 KOps/s 37.4283 KOps/s $\color{#35bf28}+1.52\%$
test_tdseq_dispatch 74.6320μs 46.5572μs 21.4790 KOps/s 21.4105 KOps/s $\color{#35bf28}+0.32\%$
test_instantiation_functorch 2.0722ms 1.9657ms 508.7186 Ops/s 500.6383 Ops/s $\color{#35bf28}+1.61\%$
test_exec_functorch 0.2278ms 0.1726ms 5.7930 KOps/s 5.7123 KOps/s $\color{#35bf28}+1.41\%$
test_exec_functional_call 0.1975ms 0.1542ms 6.4832 KOps/s 6.4920 KOps/s $\color{#d91a1a}-0.14\%$
test_exec_td_decorator 0.4204ms 0.2250ms 4.4435 KOps/s 4.3747 KOps/s $\color{#35bf28}+1.57\%$
test_vmap_mlp_speed_decorator[True-True] 1.0327ms 0.8045ms 1.2430 KOps/s 1.2347 KOps/s $\color{#35bf28}+0.67\%$
test_vmap_mlp_speed_decorator[True-False] 1.0059ms 0.8038ms 1.2440 KOps/s 1.2346 KOps/s $\color{#35bf28}+0.76\%$
test_vmap_mlp_speed_decorator[False-True] 0.8673ms 0.6885ms 1.4524 KOps/s 1.4319 KOps/s $\color{#35bf28}+1.43\%$
test_vmap_mlp_speed_decorator[False-False] 0.9352ms 0.6971ms 1.4346 KOps/s 1.4278 KOps/s $\color{#35bf28}+0.48\%$
test_vmap_transformer_speed_decorator[True-True] 20.6418ms 20.0573ms 49.8571 Ops/s 49.4695 Ops/s $\color{#35bf28}+0.78\%$
test_vmap_transformer_speed_decorator[True-False] 21.0903ms 20.3080ms 49.2416 Ops/s 49.5376 Ops/s $\color{#d91a1a}-0.60\%$
test_vmap_transformer_speed_decorator[False-True] 20.6313ms 19.8934ms 50.2680 Ops/s 49.9265 Ops/s $\color{#35bf28}+0.68\%$
test_vmap_transformer_speed_decorator[False-False] 20.8405ms 20.0899ms 49.7762 Ops/s 50.0136 Ops/s $\color{#d91a1a}-0.47\%$
test_to_module_speed[True] 1.9892ms 1.3918ms 718.4794 Ops/s 714.9245 Ops/s $\color{#35bf28}+0.50\%$
test_to_module_speed[False] 1.9121ms 1.3544ms 738.3309 Ops/s 729.8336 Ops/s $\color{#35bf28}+1.16\%$
test_tc_init 66.8020μs 42.7221μs 23.4071 KOps/s 23.3268 KOps/s $\color{#35bf28}+0.34\%$
test_tc_init_tensor_only 39.6110μs 9.1482μs 109.3109 KOps/s 108.8719 KOps/s $\color{#35bf28}+0.40\%$
test_tc_init_nested 0.1327ms 84.3753μs 11.8518 KOps/s 11.7742 KOps/s $\color{#35bf28}+0.66\%$
test_tc_init_many_fields 37.5310μs 15.2816μs 65.4383 KOps/s 64.9983 KOps/s $\color{#35bf28}+0.68\%$
test_tc_first_layer_tensor 17.2510μs 1.6546μs 604.3747 KOps/s 585.0597 KOps/s $\color{#35bf28}+3.30\%$
test_tc_first_layer_tensor_only 1.7981μs 0.3813μs 2.6229 MOps/s 2.6071 MOps/s $\color{#35bf28}+0.60\%$
test_tc_first_layer_tensor_set 24.8110μs 3.7263μs 268.3630 KOps/s 270.8307 KOps/s $\color{#d91a1a}-0.91\%$
test_tc_first_layer_tensor_only_set 22.6310μs 3.1158μs 320.9427 KOps/s 316.9816 KOps/s $\color{#35bf28}+1.25\%$
test_tc_first_layer_nontensor 28.5700μs 5.8355μs 171.3659 KOps/s 170.2360 KOps/s $\color{#35bf28}+0.66\%$
test_tc_second_layer_tensor 26.5910μs 4.1461μs 241.1929 KOps/s 238.1256 KOps/s $\color{#35bf28}+1.29\%$
test_tc_second_layer_nontensor 38.6310μs 8.1109μs 123.2916 KOps/s 120.2792 KOps/s $\color{#35bf28}+2.50\%$
test_unbind 0.2631s 15.2045ms 65.7700 Ops/s 73.0128 Ops/s $\textbf{\color{#d91a1a}-9.92\%}$
test_full_like 9.3746ms 7.3832ms 135.4431 Ops/s 227.3324 Ops/s $\textbf{\color{#d91a1a}-40.42\%}$
test_zeros_like 5.4772ms 4.3699ms 228.8404 Ops/s 228.3364 Ops/s $\color{#35bf28}+0.22\%$
test_ones_like 4.4791ms 4.3761ms 228.5145 Ops/s 227.8743 Ops/s $\color{#35bf28}+0.28\%$
test_clone 7.2841ms 6.5780ms 152.0226 Ops/s 152.1320 Ops/s $\color{#d91a1a}-0.07\%$
test_squeeze 88.4520μs 13.4854μs 74.1541 KOps/s 73.4284 KOps/s $\color{#35bf28}+0.99\%$
test_unsqueeze 0.1651ms 0.1108ms 9.0264 KOps/s 9.2042 KOps/s $\color{#d91a1a}-1.93\%$
test_split 0.2285ms 0.1789ms 5.5906 KOps/s 5.5882 KOps/s $\color{#35bf28}+0.04\%$
test_permute 0.2600ms 0.2039ms 4.9050 KOps/s 4.9886 KOps/s $\color{#d91a1a}-1.67\%$
test_stack 43.3200ms 43.0301ms 23.2396 Ops/s 19.4066 Ops/s $\textbf{\color{#35bf28}+19.75\%}$
test_cat 43.4711ms 43.0741ms 23.2158 Ops/s 19.4676 Ops/s $\textbf{\color{#35bf28}+19.25\%}$
test_sequential_tensordict 0.3718ms 0.2161ms 4.6274 KOps/s 4.6034 KOps/s $\color{#35bf28}+0.52\%$
test_sequential_graph_module 0.4875ms 0.1138ms 8.7853 KOps/s 8.2947 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_nested_tensordict 0.3517ms 0.2719ms 3.6783 KOps/s 3.5456 KOps/s $\color{#35bf28}+3.74\%$
test_nested_graph_module 0.5168ms 0.1273ms 7.8538 KOps/s 7.6344 KOps/s $\color{#35bf28}+2.87\%$
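
The Change column is not defined in the comment itself. The rows are consistent with it being the relative difference between Ops (this PR) and Ops on Repo HEAD. A minimal sketch under that assumption, using two rows from the table above (the helper name `relative_change` is hypothetical and not part of the benchmark tooling):

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the repo HEAD baseline.

    Assumed definition: (ops - ops_head) / ops_head * 100.
    """
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_stack": (23.2396, 19.4066),
    "test_full_like": (135.4431, 227.3324),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
# test_stack: +19.75%
# test_full_like: -40.42%
```

Both values match the Change entries reported for those rows, which supports the assumed definition without confirming it for every row.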

[ghstack-poisoned]
@github-actions
Contributor

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
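
The rule above can be sketched as a case-insensitive lookup on the bracketed prefix at the start of the title. This is only an illustration: the prefix-to-label mapping is copied from the table, while the function name `label_for_title` and the regex are assumptions, not the bot's actual implementation.

```python
import re
from typing import Optional

# Lower-cased prefixes mapped to the label listed in the "Label Applied" column.
PREFIX_TO_LABEL = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}


def label_for_title(title: str) -> Optional[str]:
    """Return the label for a PR title, or None if the prefix is unknown or missing."""
    match = re.match(r"\s*\[([^\]]+)\]", title)
    if match is None:
        return None
    return PREFIX_TO_LABEL.get(match.group(1).strip().lower())


print(label_for_title("[DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A"))
print(label_for_title("[Distributed] Add unified dtensor_send/dtensor_recv API with Strategy A"))
```

Under this sketch the current title maps to no label, which corresponds to the error reported here, while the same title with a supported prefix such as `[Distributed]` or `[Feature]` would resolve to a label.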

Labels

CLA Signed, tensorclass
