[DTensor] Add transfer plan computation and transport abstraction #1643

Open
vmoens wants to merge 5 commits into gh/vmoens/84/base from gh/vmoens/84/head

vmoens (Collaborator) commented Mar 9, 2026

Stack from ghstack (oldest at bottom):

Pure-logic module for computing optimal P2P transfers between
different DeviceMesh sharding layouts. Includes:

  • Shard-algebra functions for computing local slices and slice
    intersections between source and destination meshes.
  • _compute_transfer_plan: computes minimal P2P transfers (testable
    without GPUs or a distributed runtime).
  • _TransportBackend protocol with _TorchDistributedBackend (NCCL-safe
    JSON-over-CUDA metadata serialization) and _UCXXBackend implementations.
  • DeviceMesh helper utilities (_mesh_to_rank_map, _mesh_all_ranks).
  • Comprehensive CPU-only unit tests.
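
The slice-intersection idea behind the plan computation can be illustrated with a minimal one-dimensional sketch. This is not the PR's code: `local_slice`, `intersect`, `compute_transfer_plan`, and `Transfer` are hypothetical names, and the sketch assumes a single tensor dimension split into contiguous, ceil-sized chunks (the same convention as `torch.chunk`), with one shard per rank. Like the module described above, it needs no GPUs or distributed runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One point-to-point copy of the global index range [start, stop)."""
    src_rank: int
    dst_rank: int
    start: int
    stop: int


def local_slice(dim_size, num_shards, shard_idx):
    """Half-open global range owned by one shard (assumed ceil-sized chunks)."""
    chunk = -(-dim_size // num_shards)  # ceiling division
    start = min(shard_idx * chunk, dim_size)
    stop = min(start + chunk, dim_size)
    return start, stop


def intersect(a, b):
    """Overlap of two half-open ranges, or None if they are disjoint."""
    start, stop = max(a[0], b[0]), min(a[1], b[1])
    return (start, stop) if start < stop else None


def compute_transfer_plan(dim_size, src_ranks, dst_ranks):
    """List every non-empty overlap between a source and a destination shard."""
    plan = []
    for dst_idx, dst_rank in enumerate(dst_ranks):
        dst_range = local_slice(dim_size, len(dst_ranks), dst_idx)
        for src_idx, src_rank in enumerate(src_ranks):
            src_range = local_slice(dim_size, len(src_ranks), src_idx)
            overlap = intersect(src_range, dst_range)
            if overlap is not None:
                plan.append(Transfer(src_rank, dst_rank, *overlap))
    return plan


# Reshard 12 elements from 2 source ranks (0, 1) to 3 destination ranks (2, 3, 4).
plan = compute_transfer_plan(12, src_ranks=[0, 1], dst_ranks=[2, 3, 4])
for t in plan:
    print(t)
```

Because each destination element is owned by exactly one source shard in this setting, every element is sent once, so the plan contains no redundant copies. Replicated or multi-dimensional layouts would need additional logic that this sketch does not attempt.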

Made-with: Cursor

[ghstack-poisoned]
meta-cla bot added the CLA Signed label Mar 9, 2026

github-actions bot (Contributor) commented Mar 9, 2026

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add transfer plan computation and transport abstraction

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
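
For checking a title locally before pushing, the rule above can be approximated with a short sketch. This is an assumption about how such a check might work, not the workflow's actual implementation; `PREFIX_LABELS` and `label_for_title` are hypothetical names, and the mapping is copied from the list of supported prefixes.

```python
import re

# Lower-cased prefix -> label, taken from the supported-prefix list above.
PREFIX_LABELS = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}

# A title must start with a single bracketed prefix such as "[BugFix]".
_PREFIX_RE = re.compile(r"^\[([^\]]+)\]")


def label_for_title(title):
    """Return the label for a PR title, or None if the prefix is unknown."""
    match = _PREFIX_RE.match(title.strip())
    if match is None:
        return None
    return PREFIX_LABELS.get(match.group(1).lower())


print(label_for_title("[DTensor] Add transfer plan computation and transport abstraction"))
print(label_for_title("[Distributed] Add scatter collective"))
```

Under this sketch the current title maps to no label, which matches the error reported above, while a title starting with `[Distributed]` or `[Feature]` would be accepted.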

github-actions bot (Contributor) commented Mar 9, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 261. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.7410μs 14.8350μs 67.4082 KOps/s 67.5859 KOps/s $\color{#d91a1a}-0.26\%$
test_plain_set_stack_nested 38.0910μs 15.2387μs 65.6226 KOps/s 66.1767 KOps/s $\color{#d91a1a}-0.84\%$
test_plain_set_nested_inplace 45.1410μs 17.0949μs 58.4969 KOps/s 59.5600 KOps/s $\color{#d91a1a}-1.78\%$
test_plain_set_stack_nested_inplace 42.4210μs 16.8543μs 59.3321 KOps/s 59.4904 KOps/s $\color{#d91a1a}-0.27\%$
test_items 30.4010μs 6.1200μs 163.3980 KOps/s 164.2061 KOps/s $\color{#d91a1a}-0.49\%$
test_items_nested 0.5949ms 0.4671ms 2.1410 KOps/s 2.1401 KOps/s $\color{#35bf28}+0.04\%$
test_items_nested_locked 0.6284ms 0.4705ms 2.1253 KOps/s 2.1356 KOps/s $\color{#d91a1a}-0.48\%$
test_items_nested_leaf 0.1579ms 97.9143μs 10.2130 KOps/s 10.2942 KOps/s $\color{#d91a1a}-0.79\%$
test_items_stack_nested 0.6256ms 0.4703ms 2.1265 KOps/s 2.1531 KOps/s $\color{#d91a1a}-1.24\%$
test_items_stack_nested_leaf 0.1344ms 98.6255μs 10.1394 KOps/s 10.3247 KOps/s $\color{#d91a1a}-1.80\%$
test_items_stack_nested_locked 0.6559ms 0.4656ms 2.1476 KOps/s 2.1316 KOps/s $\color{#35bf28}+0.75\%$
test_keys 25.1500μs 4.2371μs 236.0121 KOps/s 237.0399 KOps/s $\color{#d91a1a}-0.43\%$
test_keys_nested 0.2011ms 0.1308ms 7.6447 KOps/s 7.7900 KOps/s $\color{#d91a1a}-1.86\%$
test_keys_nested_locked 0.7410ms 0.1388ms 7.2049 KOps/s 7.2151 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_nested_leaf 0.1789ms 0.1213ms 8.2424 KOps/s 8.3163 KOps/s $\color{#d91a1a}-0.89\%$
test_keys_stack_nested 0.2192ms 0.1303ms 7.6723 KOps/s 7.7172 KOps/s $\color{#d91a1a}-0.58\%$
test_keys_stack_nested_leaf 0.2027ms 0.1219ms 8.2053 KOps/s 8.3483 KOps/s $\color{#d91a1a}-1.71\%$
test_keys_stack_nested_locked 0.2008ms 0.1388ms 7.2068 KOps/s 7.2501 KOps/s $\color{#d91a1a}-0.60\%$
test_values 10.0722μs 1.0206μs 979.8000 KOps/s 991.0020 KOps/s $\color{#d91a1a}-1.13\%$
test_values_nested 90.4820μs 52.2804μs 19.1276 KOps/s 19.1887 KOps/s $\color{#d91a1a}-0.32\%$
test_values_nested_locked 0.1049ms 56.0571μs 17.8389 KOps/s 18.1235 KOps/s $\color{#d91a1a}-1.57\%$
test_values_nested_leaf 99.0620μs 60.8606μs 16.4310 KOps/s 15.4669 KOps/s $\textbf{\color{#35bf28}+6.23\%}$
test_values_stack_nested 90.8420μs 52.4409μs 19.0691 KOps/s 19.0485 KOps/s $\color{#35bf28}+0.11\%$
test_values_stack_nested_leaf 0.1036ms 60.2870μs 16.5873 KOps/s 15.6032 KOps/s $\textbf{\color{#35bf28}+6.31\%}$
test_values_stack_nested_locked 0.1014ms 57.2208μs 17.4762 KOps/s 18.1405 KOps/s $\color{#d91a1a}-3.66\%$
test_membership 5.8552μs 0.8595μs 1.1635 MOps/s 1.1975 MOps/s $\color{#d91a1a}-2.84\%$
test_membership_nested 21.2910μs 2.9295μs 341.3595 KOps/s 348.6713 KOps/s $\color{#d91a1a}-2.10\%$
test_membership_nested_leaf 25.8210μs 2.9444μs 339.6321 KOps/s 348.4161 KOps/s $\color{#d91a1a}-2.52\%$
test_membership_stacked_nested 36.0700μs 2.8883μs 346.2221 KOps/s 351.0882 KOps/s $\color{#d91a1a}-1.39\%$
test_membership_stacked_nested_leaf 32.1800μs 2.9012μs 344.6800 KOps/s 348.1371 KOps/s $\color{#d91a1a}-0.99\%$
test_membership_nested_last 32.7310μs 4.4482μs 224.8115 KOps/s 230.2563 KOps/s $\color{#d91a1a}-2.36\%$
test_membership_nested_leaf_last 39.4410μs 4.5038μs 222.0339 KOps/s 230.2478 KOps/s $\color{#d91a1a}-3.57\%$
test_membership_stacked_nested_last 24.0000μs 4.4840μs 223.0138 KOps/s 232.0694 KOps/s $\color{#d91a1a}-3.90\%$
test_membership_stacked_nested_leaf_last 40.5610μs 4.4266μs 225.9062 KOps/s 229.9663 KOps/s $\color{#d91a1a}-1.77\%$
test_nested_getleaf 57.0310μs 21.9119μs 45.6373 KOps/s 46.1585 KOps/s $\color{#d91a1a}-1.13\%$
test_nested_get 49.8310μs 20.6045μs 48.5332 KOps/s 48.5698 KOps/s $\color{#d91a1a}-0.08\%$
test_stacked_getleaf 48.8610μs 21.5700μs 46.3607 KOps/s 46.0071 KOps/s $\color{#35bf28}+0.77\%$
test_stacked_get 63.9510μs 20.3410μs 49.1619 KOps/s 48.7758 KOps/s $\color{#35bf28}+0.79\%$
test_nested_getitemleaf 51.8610μs 22.1386μs 45.1700 KOps/s 45.1416 KOps/s $\color{#35bf28}+0.06\%$
test_nested_getitem 64.9520μs 21.2535μs 47.0511 KOps/s 47.7199 KOps/s $\color{#d91a1a}-1.40\%$
test_stacked_getitemleaf 52.5710μs 21.9374μs 45.5842 KOps/s 45.3123 KOps/s $\color{#35bf28}+0.60\%$
test_stacked_getitem 53.1110μs 20.9854μs 47.6522 KOps/s 47.2239 KOps/s $\color{#35bf28}+0.91\%$
test_lock_nested 7.8389ms 0.4927ms 2.0295 KOps/s 2.1054 KOps/s $\color{#d91a1a}-3.61\%$
test_lock_stack_nested 0.5458ms 0.4858ms 2.0586 KOps/s 2.0645 KOps/s $\color{#d91a1a}-0.29\%$
test_unlock_nested 0.4706ms 0.3989ms 2.5072 KOps/s 2.5717 KOps/s $\color{#d91a1a}-2.51\%$
test_unlock_stack_nested 0.4715ms 0.3998ms 2.5012 KOps/s 2.5309 KOps/s $\color{#d91a1a}-1.17\%$
test_flatten_speed 0.2294ms 0.1228ms 8.1410 KOps/s 8.1472 KOps/s $\color{#d91a1a}-0.08\%$
test_unflatten_speed 0.6705ms 0.5755ms 1.7377 KOps/s 1.7276 KOps/s $\color{#35bf28}+0.58\%$
test_common_ops 0.8548ms 0.6903ms 1.4486 KOps/s 1.4316 KOps/s $\color{#35bf28}+1.19\%$
test_creation 58.2510μs 3.1658μs 315.8735 KOps/s 315.3711 KOps/s $\color{#35bf28}+0.16\%$
test_creation_empty 45.5810μs 7.0470μs 141.9036 KOps/s 141.2766 KOps/s $\color{#35bf28}+0.44\%$
test_creation_nested_1 36.4910μs 11.5913μs 86.2718 KOps/s 85.7901 KOps/s $\color{#35bf28}+0.56\%$
test_creation_nested_2 60.1310μs 13.4303μs 74.4585 KOps/s 74.3777 KOps/s $\color{#35bf28}+0.11\%$
test_creation_many_keys[10] 56.8210μs 21.1396μs 47.3047 KOps/s 47.2827 KOps/s $\color{#35bf28}+0.05\%$
test_creation_many_keys[50] 0.1529ms 91.5510μs 10.9229 KOps/s 11.0159 KOps/s $\color{#d91a1a}-0.84\%$
test_creation_many_keys[100] 0.2758ms 0.1805ms 5.5393 KOps/s 5.6364 KOps/s $\color{#d91a1a}-1.72\%$
test_creation_nested_many_keys[10] 78.5720μs 45.2529μs 22.0980 KOps/s 22.1989 KOps/s $\color{#d91a1a}-0.45\%$
test_creation_nested_many_keys[50] 0.2643ms 0.1848ms 5.4112 KOps/s 5.3761 KOps/s $\color{#35bf28}+0.65\%$
test_clone 50.9610μs 13.3451μs 74.9340 KOps/s 73.4103 KOps/s $\color{#35bf28}+2.08\%$
test_getitem[int] 1.5963ms 15.4544μs 64.7067 KOps/s 59.9698 KOps/s $\textbf{\color{#35bf28}+7.90\%}$
test_getitem[slice_int] 0.1395ms 24.3702μs 41.0337 KOps/s 41.1931 KOps/s $\color{#d91a1a}-0.39\%$
test_getitem[range] 0.1961ms 63.7679μs 15.6819 KOps/s 15.5757 KOps/s $\color{#35bf28}+0.68\%$
test_getitem[tuple] 0.1416ms 24.0199μs 41.6322 KOps/s 41.1604 KOps/s $\color{#35bf28}+1.15\%$
test_getitem[list] 0.1844ms 59.3656μs 16.8448 KOps/s 16.7702 KOps/s $\color{#35bf28}+0.44\%$
test_setitem_dim[int] 46.1610μs 25.4950μs 39.2234 KOps/s 38.2704 KOps/s $\color{#35bf28}+2.49\%$
test_setitem_dim[slice_int] 61.2610μs 42.2885μs 23.6471 KOps/s 22.9992 KOps/s $\color{#35bf28}+2.82\%$
test_setitem_dim[range] 0.1233ms 94.9410μs 10.5329 KOps/s 10.4135 KOps/s $\color{#35bf28}+1.15\%$
test_setitem_dim[tuple] 66.7710μs 39.6240μs 25.2372 KOps/s 24.4398 KOps/s $\color{#35bf28}+3.26\%$
test_setitem 62.2910μs 18.1094μs 55.2201 KOps/s 55.3938 KOps/s $\color{#d91a1a}-0.31\%$
test_set 88.1120μs 16.7432μs 59.7258 KOps/s 57.2459 KOps/s $\color{#35bf28}+4.33\%$
test_set_shared 0.6280ms 0.2015ms 4.9619 KOps/s 4.8761 KOps/s $\color{#35bf28}+1.76\%$
test_update 0.4556ms 22.2787μs 44.8859 KOps/s 45.1437 KOps/s $\color{#d91a1a}-0.57\%$
test_update_nested 64.6810μs 33.4139μs 29.9277 KOps/s 30.5345 KOps/s $\color{#d91a1a}-1.99\%$
test_update__nested 0.4725ms 34.5399μs 28.9521 KOps/s 28.6426 KOps/s $\color{#35bf28}+1.08\%$
test_set_nested 54.3010μs 19.1307μs 52.2721 KOps/s 52.0664 KOps/s $\color{#35bf28}+0.40\%$
test_set_nested_new 60.9120μs 23.8494μs 41.9298 KOps/s 41.5076 KOps/s $\color{#35bf28}+1.02\%$
test_select 68.0220μs 40.7338μs 24.5496 KOps/s 24.3655 KOps/s $\color{#35bf28}+0.76\%$
test_select_nested 0.4976ms 75.5310μs 13.2396 KOps/s 13.3224 KOps/s $\color{#d91a1a}-0.62\%$
test_exclude_nested 0.5179ms 93.7233μs 10.6697 KOps/s 10.6797 KOps/s $\color{#d91a1a}-0.09\%$
test_empty[True] 0.8185ms 0.4015ms 2.4905 KOps/s 2.4883 KOps/s $\color{#35bf28}+0.09\%$
test_empty[False] 0.1051ms 1.3047μs 766.4444 KOps/s 757.3352 KOps/s $\color{#35bf28}+1.20\%$
test_to 0.1125ms 75.1491μs 13.3069 KOps/s 13.6172 KOps/s $\color{#d91a1a}-2.28\%$
test_to_nonblocking 0.2149ms 66.3078μs 15.0812 KOps/s 15.4927 KOps/s $\color{#d91a1a}-2.66\%$
test_unbind_speed 0.7791ms 0.3387ms 2.9524 KOps/s 3.0007 KOps/s $\color{#d91a1a}-1.61\%$
test_unbind_speed_stack0 0.5337ms 0.3371ms 2.9667 KOps/s 3.0159 KOps/s $\color{#d91a1a}-1.63\%$
test_unbind_speed_stack1 0.1045s 0.8429ms 1.1864 KOps/s 1.1901 KOps/s $\color{#d91a1a}-0.31\%$
test_split 0.1045s 1.2780ms 782.4536 Ops/s 778.1599 Ops/s $\color{#35bf28}+0.55\%$
test_chunk 0.1046s 1.2185ms 820.6931 Ops/s 912.8000 Ops/s $\textbf{\color{#d91a1a}-10.09\%}$
test_to_cpu_blocking 29.1710ms 28.7423ms 34.7919 Ops/s 45.6986 Ops/s $\textbf{\color{#d91a1a}-23.87\%}$
test_to_cpu_global_sync 11.7845ms 11.3870ms 87.8193 Ops/s 87.5224 Ops/s $\color{#35bf28}+0.34\%$
test_to_cpu_event_sync 12.8260ms 12.3989ms 80.6523 Ops/s 80.0696 Ops/s $\color{#35bf28}+0.73\%$
test_to_cpu_default 0.1157s 13.6920ms 73.0351 Ops/s 80.0434 Ops/s $\textbf{\color{#d91a1a}-8.76\%}$
test_consolidate[False-None] 4.6791ms 4.2506ms 235.2594 Ops/s 239.1573 Ops/s $\color{#d91a1a}-1.63\%$
test_consolidate[default-None] 2.2051ms 2.0751ms 481.9121 Ops/s 479.0317 Ops/s $\color{#35bf28}+0.60\%$
test_consolidate[reduce-overhead-None] 2.1178ms 2.0039ms 499.0286 Ops/s 500.2465 Ops/s $\color{#d91a1a}-0.24\%$
test_consolidate_njt[False-None] 0.1932s 10.2074ms 97.9678 Ops/s 116.9762 Ops/s $\textbf{\color{#d91a1a}-16.25\%}$
test_to[False-False-None] 2.2947ms 2.1199ms 471.7266 Ops/s 469.9086 Ops/s $\color{#35bf28}+0.39\%$
test_to[True-False-None] 2.2062ms 1.9726ms 506.9394 Ops/s 509.1492 Ops/s $\color{#d91a1a}-0.43\%$
test_to[within-False-None] 6.4089ms 6.2964ms 158.8202 Ops/s 161.8525 Ops/s $\color{#d91a1a}-1.87\%$
test_to[True-default-None] 9.1505ms 8.9708ms 111.4726 Ops/s 109.6696 Ops/s $\color{#35bf28}+1.64\%$
test_to_njt[False-False-None] 8.5954ms 8.4820ms 117.8968 Ops/s 116.6221 Ops/s $\color{#35bf28}+1.09\%$
test_to_njt[True-False-None] 7.1835ms 7.0180ms 142.4900 Ops/s 140.7611 Ops/s $\color{#35bf28}+1.23\%$
test_to_njt[within-False-None] 15.9030ms 15.6883ms 63.7418 Ops/s 62.9305 Ops/s $\color{#35bf28}+1.29\%$
test_creation[device0] 0.4481ms 0.1143ms 8.7456 KOps/s 8.7454 KOps/s $+0.00\%$
test_creation_from_tensor 0.5844ms 0.1121ms 8.9174 KOps/s 8.9397 KOps/s $\color{#d91a1a}-0.25\%$
test_add_one[memmap_tensor0] 0.3770ms 6.5694μs 152.2202 KOps/s 147.6164 KOps/s $\color{#35bf28}+3.12\%$
test_contiguous[memmap_tensor0] 24.7710μs 0.6733μs 1.4852 MOps/s 2.1552 MOps/s $\textbf{\color{#d91a1a}-31.09\%}$
test_stack[memmap_tensor0] 24.0410μs 4.7865μs 208.9191 KOps/s 217.7404 KOps/s $\color{#d91a1a}-4.05\%$
test_memmaptd_index 1.1789ms 0.2714ms 3.6852 KOps/s 3.7124 KOps/s $\color{#d91a1a}-0.73\%$
test_memmaptd_index_astensor 0.5224ms 0.3734ms 2.6784 KOps/s 2.6712 KOps/s $\color{#35bf28}+0.27\%$
test_memmaptd_index_op 0.8701ms 0.6271ms 1.5946 KOps/s 1.5936 KOps/s $\color{#35bf28}+0.06\%$
test_serialize_model 0.3134s 0.1643s 6.0872 Ops/s 7.3614 Ops/s $\textbf{\color{#d91a1a}-17.31\%}$
test_serialize_model_pickle 2.0852s 1.4011s 0.7137 Ops/s 0.8227 Ops/s $\textbf{\color{#d91a1a}-13.25\%}$
test_serialize_weights 0.1357s 0.1337s 7.4801 Ops/s 7.3399 Ops/s $\color{#35bf28}+1.91\%$
test_serialize_weights_returnearly 0.4235s 87.9757ms 11.3668 Ops/s 6.1880 Ops/s $\textbf{\color{#35bf28}+83.69\%}$
test_serialize_weights_pickle 1.3752s 1.2158s 0.8225 Ops/s 0.8216 Ops/s $\color{#35bf28}+0.12\%$
test_reshape_pytree 0.2030ms 32.9061μs 30.3895 KOps/s 30.4017 KOps/s $\color{#d91a1a}-0.04\%$
test_reshape_td 82.9310μs 46.0283μs 21.7257 KOps/s 21.8317 KOps/s $\color{#d91a1a}-0.49\%$
test_view_pytree 0.2124ms 32.5683μs 30.7047 KOps/s 30.8098 KOps/s $\color{#d91a1a}-0.34\%$
test_view_td 94.3010μs 53.6181μs 18.6504 KOps/s 19.0540 KOps/s $\color{#d91a1a}-2.12\%$
test_unbind_pytree 0.2316ms 36.3178μs 27.5347 KOps/s 27.1214 KOps/s $\color{#35bf28}+1.52\%$
test_unbind_td 0.1277ms 50.1613μs 19.9357 KOps/s 19.9160 KOps/s $\color{#35bf28}+0.10\%$
test_split_pytree 0.2479ms 42.7393μs 23.3977 KOps/s 23.5533 KOps/s $\color{#d91a1a}-0.66\%$
test_split_td 0.1981ms 65.7178μs 15.2166 KOps/s 15.4116 KOps/s $\color{#d91a1a}-1.27\%$
test_add_pytree 0.1915ms 42.5586μs 23.4970 KOps/s 23.5900 KOps/s $\color{#d91a1a}-0.39\%$
test_add_td 0.1008ms 56.1230μs 17.8180 KOps/s 17.8820 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_add_one_nested[tensordict-compile] 0.2110ms 0.1430ms 6.9924 KOps/s 6.6678 KOps/s $\color{#35bf28}+4.87\%$
test_compile_add_one_nested[tensordict-eager] 0.3001ms 0.2021ms 4.9471 KOps/s 5.0219 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_add_one_nested[pytree-compile] 0.1975ms 0.1091ms 9.1676 KOps/s 8.8387 KOps/s $\color{#35bf28}+3.72\%$
test_compile_add_one_nested[pytree-eager] 0.4285ms 0.1812ms 5.5175 KOps/s 5.5586 KOps/s $\color{#d91a1a}-0.74\%$
test_compile_copy_nested[tensordict-compile] 0.3507ms 10.3113μs 96.9813 KOps/s 98.2303 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_copy_nested[tensordict-eager] 90.5720μs 54.2980μs 18.4169 KOps/s 18.3620 KOps/s $\color{#35bf28}+0.30\%$
test_compile_copy_nested[pytree-compile] 0.1475ms 9.8266μs 101.7642 KOps/s 100.7732 KOps/s $\color{#35bf28}+0.98\%$
test_compile_copy_nested[pytree-eager] 0.4530ms 67.7347μs 14.7635 KOps/s 14.4393 KOps/s $\color{#35bf28}+2.25\%$
test_compile_add_one_flat[tensordict-compile] 0.2381ms 0.1802ms 5.5507 KOps/s 5.3918 KOps/s $\color{#35bf28}+2.95\%$
test_compile_add_one_flat[tensordict-eager] 0.3707ms 0.2812ms 3.5565 KOps/s 3.5411 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_one_flat[tensorclass-compile] 0.1684ms 0.1177ms 8.4988 KOps/s 8.2983 KOps/s $\color{#35bf28}+2.42\%$
test_compile_add_one_flat[tensorclass-eager] 0.1151ms 73.1718μs 13.6665 KOps/s 12.9773 KOps/s $\textbf{\color{#35bf28}+5.31\%}$
test_compile_add_one_flat[pytree-compile] 0.2445ms 0.1592ms 6.2829 KOps/s 6.1914 KOps/s $\color{#35bf28}+1.48\%$
test_compile_add_one_flat[pytree-eager] 0.8163ms 0.5359ms 1.8659 KOps/s 1.9173 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_add_self_flat[tensordict-eager] 0.4822ms 0.3373ms 2.9646 KOps/s 2.9787 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_add_self_flat[tensordict-compile] 0.2150ms 0.1786ms 5.5991 KOps/s 5.1246 KOps/s $\textbf{\color{#35bf28}+9.26\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1255ms 89.4686μs 11.1771 KOps/s 11.1179 KOps/s $\color{#35bf28}+0.53\%$
test_compile_add_self_flat[tensorclass-compile] 0.3470ms 0.1204ms 8.3066 KOps/s 7.7936 KOps/s $\textbf{\color{#35bf28}+6.58\%}$
test_compile_add_self_flat[pytree-eager] 0.6525ms 0.4451ms 2.2467 KOps/s 2.2984 KOps/s $\color{#d91a1a}-2.25\%$
test_compile_add_self_flat[pytree-compile] 0.2025ms 0.1586ms 6.3045 KOps/s 6.1531 KOps/s $\color{#35bf28}+2.46\%$
test_compile_copy_flat[tensordict-compile] 78.0120μs 13.3868μs 74.7007 KOps/s 73.8989 KOps/s $\color{#35bf28}+1.08\%$
test_compile_copy_flat[tensordict-eager] 70.6320μs 42.1507μs 23.7244 KOps/s 23.9433 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_copy_flat[pytree-compile] 0.1466ms 10.9137μs 91.6275 KOps/s 92.6157 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_copy_flat[pytree-eager] 0.4003ms 53.0409μs 18.8534 KOps/s 18.8983 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_assign_and_add[tensordict-compile] 2.0146ms 0.1754ms 5.7024 KOps/s 5.1489 KOps/s $\textbf{\color{#35bf28}+10.75\%}$
test_compile_assign_and_add[tensordict-eager] 3.4437ms 3.3490ms 298.5958 Ops/s 292.7712 Ops/s $\color{#35bf28}+1.99\%$
test_compile_assign_and_add[pytree-compile] 2.0205ms 0.1643ms 6.0871 KOps/s 6.0510 KOps/s $\color{#35bf28}+0.60\%$
test_compile_assign_and_add[pytree-eager] 2.9696ms 2.8399ms 352.1235 Ops/s 358.2927 Ops/s $\color{#d91a1a}-1.72\%$
test_compile_indexing[tensor-tensordict-compile] 0.1464ms 0.1098ms 9.1065 KOps/s 8.7976 KOps/s $\color{#35bf28}+3.51\%$
test_compile_indexing[tensor-tensordict-eager] 0.3142ms 74.5395μs 13.4157 KOps/s 13.4654 KOps/s $\color{#d91a1a}-0.37\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2215ms 97.3559μs 10.2716 KOps/s 10.3401 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2473ms 44.6869μs 22.3779 KOps/s 22.5023 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_indexing[tensor-pytree-compile] 0.1694ms 98.4152μs 10.1610 KOps/s 10.2170 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_indexing[tensor-pytree-eager] 0.2757ms 44.7893μs 22.3268 KOps/s 22.5701 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_indexing[slice-tensordict-compile] 0.1956ms 57.3267μs 17.4439 KOps/s 17.3516 KOps/s $\color{#35bf28}+0.53\%$
test_compile_indexing[slice-tensordict-eager] 0.2216ms 28.2986μs 35.3375 KOps/s 36.0629 KOps/s $\color{#d91a1a}-2.01\%$
test_compile_indexing[slice-tensorclass-compile] 0.1452ms 45.3752μs 22.0385 KOps/s 22.3186 KOps/s $\color{#d91a1a}-1.26\%$
test_compile_indexing[slice-tensorclass-eager] 0.2499ms 22.4867μs 44.4708 KOps/s 43.8094 KOps/s $\color{#35bf28}+1.51\%$
test_compile_indexing[slice-pytree-compile] 87.5510μs 45.7970μs 21.8355 KOps/s 21.6282 KOps/s $\color{#35bf28}+0.96\%$
test_compile_indexing[slice-pytree-eager] 0.2646ms 22.5560μs 44.3341 KOps/s 43.6497 KOps/s $\color{#35bf28}+1.57\%$
test_compile_indexing[int-tensordict-compile] 93.4820μs 56.9722μs 17.5524 KOps/s 16.9039 KOps/s $\color{#35bf28}+3.84\%$
test_compile_indexing[int-tensordict-eager] 0.2446ms 28.0737μs 35.6205 KOps/s 35.5539 KOps/s $\color{#35bf28}+0.19\%$
test_compile_indexing[int-tensorclass-compile] 79.7620μs 45.8733μs 21.7992 KOps/s 21.6208 KOps/s $\color{#35bf28}+0.83\%$
test_compile_indexing[int-tensorclass-eager] 0.2504ms 22.5075μs 44.4297 KOps/s 44.1217 KOps/s $\color{#35bf28}+0.70\%$
test_compile_indexing[int-pytree-compile] 99.4620μs 45.3780μs 22.0371 KOps/s 21.6089 KOps/s $\color{#35bf28}+1.98\%$
test_compile_indexing[int-pytree-eager] 0.2655ms 22.3054μs 44.8322 KOps/s 44.1378 KOps/s $\color{#35bf28}+1.57\%$
test_compile_replace[single-eager] 83.3120μs 47.1276μs 21.2190 KOps/s 21.4974 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_replace[single-compile] 0.1714ms 0.1050ms 9.5221 KOps/s 9.4295 KOps/s $\color{#35bf28}+0.98\%$
test_compile_replace[multi-eager] 0.6397ms 0.5635ms 1.7747 KOps/s 1.8142 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_replace[multi-compile] 0.2411ms 0.1119ms 8.9342 KOps/s 8.9273 KOps/s $\color{#35bf28}+0.08\%$
test_compile_tc_getattr_20[eager] 0.2186ms 0.1740ms 5.7481 KOps/s 5.8275 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_tc_getattr_20[compile] 0.2966ms 0.1206ms 8.2893 KOps/s 8.2736 KOps/s $\color{#35bf28}+0.19\%$
test_compile_clone_shallow[20-eager] 82.0110μs 19.5420μs 51.1719 KOps/s 51.1337 KOps/s $\color{#35bf28}+0.07\%$
test_compile_clone_shallow[20-compile] 0.1059ms 11.4647μs 87.2244 KOps/s 88.3936 KOps/s $\color{#d91a1a}-1.32\%$
test_compile_clone_shallow[40-eager] 66.7110μs 34.3095μs 29.1465 KOps/s 29.2054 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_clone_shallow[40-compile] 50.7210μs 12.5060μs 79.9616 KOps/s 58.0139 KOps/s $\textbf{\color{#35bf28}+37.83\%}$
test_compile_clone_shallow[80-eager] 93.3120μs 63.6633μs 15.7076 KOps/s 15.7708 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_clone_shallow[80-compile] 49.6910μs 15.3578μs 65.1135 KOps/s 68.0628 KOps/s $\color{#d91a1a}-4.33\%$
test_compile_update_inplace[eager] 0.1031ms 58.8925μs 16.9801 KOps/s 16.8399 KOps/s $\color{#35bf28}+0.83\%$
test_compile_update_inplace[compile] 0.3235ms 0.1411ms 7.0876 KOps/s 6.8439 KOps/s $\color{#35bf28}+3.56\%$
test_mod_add[eager] 89.9020μs 50.2692μs 19.8929 KOps/s 20.3551 KOps/s $\color{#d91a1a}-2.27\%$
test_mod_add[compile] 0.1530ms 0.1044ms 9.5824 KOps/s 9.3872 KOps/s $\color{#35bf28}+2.08\%$
test_mod_add[compile-overhead] 0.3613ms 0.1520ms 6.5793 KOps/s 6.5591 KOps/s $\color{#35bf28}+0.31\%$
test_mod_wrap[eager] 0.3880ms 0.3029ms 3.3016 KOps/s 3.4074 KOps/s $\color{#d91a1a}-3.10\%$
test_mod_wrap[compile] 0.4819ms 0.3479ms 2.8740 KOps/s 2.7580 KOps/s $\color{#35bf28}+4.21\%$
test_mod_wrap[compile-overhead] 7.2784ms 4.0399ms 247.5296 Ops/s 246.0818 Ops/s $\color{#35bf28}+0.59\%$
test_mod_wrap_and_backward[eager] 1.6700ms 1.4997ms 666.8096 Ops/s 667.6074 Ops/s $\color{#d91a1a}-0.12\%$
test_mod_wrap_and_backward[compile] 1.5630ms 1.4451ms 692.0056 Ops/s 640.7736 Ops/s $\textbf{\color{#35bf28}+8.00\%}$
test_mod_wrap_and_backward[compile-overhead] 1.2427ms 0.8896ms 1.1241 KOps/s 1.0045 KOps/s $\textbf{\color{#35bf28}+11.91\%}$
test_seq_add[eager] 0.2262ms 0.1551ms 6.4463 KOps/s 6.5384 KOps/s $\color{#d91a1a}-1.41\%$
test_seq_add[compile] 0.1795ms 0.1144ms 8.7378 KOps/s 8.1767 KOps/s $\textbf{\color{#35bf28}+6.86\%}$
test_seq_add[compile-overhead] 0.4563ms 0.1545ms 6.4736 KOps/s 6.3201 KOps/s $\color{#35bf28}+2.43\%$
test_seq_wrap[eager] 0.6178ms 0.5412ms 1.8476 KOps/s 1.8896 KOps/s $\color{#d91a1a}-2.22\%$
test_seq_wrap[compile] 0.4765ms 0.3842ms 2.6029 KOps/s 2.7212 KOps/s $\color{#d91a1a}-4.35\%$
test_seq_wrap[compile-overhead] 0.3316ms 0.2661ms 3.7586 KOps/s 3.7049 KOps/s $\color{#35bf28}+1.45\%$
test_func_call_runtime[False-eager] 0.9363ms 0.8409ms 1.1892 KOps/s 1.1809 KOps/s $\color{#35bf28}+0.70\%$
test_func_call_runtime[False-compile] 1.0343ms 0.9143ms 1.0937 KOps/s 1.1023 KOps/s $\color{#d91a1a}-0.78\%$
test_func_call_runtime[False-compile-overhead] 0.5783ms 0.4660ms 2.1459 KOps/s 2.1395 KOps/s $\color{#35bf28}+0.30\%$
test_func_call_runtime[True-eager] 1.2501ms 1.0777ms 927.8924 Ops/s 910.9300 Ops/s $\color{#35bf28}+1.86\%$
test_func_call_runtime[True-compile] 1.0556ms 0.9265ms 1.0793 KOps/s 1.0884 KOps/s $\color{#d91a1a}-0.84\%$
test_func_call_runtime[True-compile-overhead] 0.5976ms 0.4789ms 2.0882 KOps/s 2.0641 KOps/s $\color{#35bf28}+1.17\%$
test_func_call_cm_runtime[False-eager] 0.9526ms 0.8730ms 1.1455 KOps/s 1.1483 KOps/s $\color{#d91a1a}-0.25\%$
test_func_call_cm_runtime[False-compile] 1.1052ms 0.9129ms 1.0954 KOps/s 1.0957 KOps/s $\color{#d91a1a}-0.03\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7000ms 0.4690ms 2.1320 KOps/s 2.1183 KOps/s $\color{#35bf28}+0.65\%$
test_func_call_cm_runtime[True-eager] 1.3550ms 1.2372ms 808.2938 Ops/s 808.8674 Ops/s $\color{#d91a1a}-0.07\%$
test_func_call_cm_runtime[True-compile] 1.1060ms 0.9753ms 1.0253 KOps/s 1.0474 KOps/s $\color{#d91a1a}-2.11\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6341ms 0.5125ms 1.9511 KOps/s 1.9251 KOps/s $\color{#35bf28}+1.35\%$
test_vmap_func_call_cm_runtime[eager] 2.8680ms 2.3815ms 419.9028 Ops/s 421.3745 Ops/s $\color{#d91a1a}-0.35\%$
test_vmap_func_call_cm_runtime[compile] 1.0931ms 0.9835ms 1.0168 KOps/s 1.0267 KOps/s $\color{#d91a1a}-0.97\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6255ms 0.5182ms 1.9298 KOps/s 1.9093 KOps/s $\color{#35bf28}+1.08\%$
test_distributed 0.5614ms 0.1527ms 6.5482 KOps/s 6.4722 KOps/s $\color{#35bf28}+1.17\%$
test_tdmodule 70.8710μs 27.6234μs 36.2012 KOps/s 35.6299 KOps/s $\color{#35bf28}+1.60\%$
test_tdmodule_dispatch 73.5420μs 45.2259μs 22.1112 KOps/s 22.1852 KOps/s $\color{#d91a1a}-0.33\%$
test_tdseq 58.8710μs 27.2623μs 36.6807 KOps/s 36.6835 KOps/s $-0.01\%$
test_tdseq_dispatch 80.3920μs 47.5111μs 21.0477 KOps/s 20.7627 KOps/s $\color{#35bf28}+1.37\%$
test_instantiation_functorch 2.1710ms 2.0976ms 476.7404 Ops/s 473.0298 Ops/s $\color{#35bf28}+0.78\%$
test_exec_functorch 0.2398ms 0.1794ms 5.5755 KOps/s 5.4970 KOps/s $\color{#35bf28}+1.43\%$
test_exec_functional_call 0.2191ms 0.1627ms 6.1451 KOps/s 6.1730 KOps/s $\color{#d91a1a}-0.45\%$
test_exec_td_decorator 0.4391ms 0.2383ms 4.1959 KOps/s 4.2153 KOps/s $\color{#d91a1a}-0.46\%$
test_vmap_mlp_speed_decorator[True-True] 1.1075ms 0.8279ms 1.2079 KOps/s 1.1849 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_mlp_speed_decorator[True-False] 1.0518ms 0.8287ms 1.2067 KOps/s 1.1972 KOps/s $\color{#35bf28}+0.79\%$
test_vmap_mlp_speed_decorator[False-True] 0.8908ms 0.7114ms 1.4056 KOps/s 1.3841 KOps/s $\color{#35bf28}+1.56\%$
test_vmap_mlp_speed_decorator[False-False] 0.8969ms 0.7182ms 1.3924 KOps/s 1.3940 KOps/s $\color{#d91a1a}-0.12\%$
test_vmap_transformer_speed_decorator[True-True] 21.3721ms 20.5401ms 48.6853 Ops/s 48.7008 Ops/s $\color{#d91a1a}-0.03\%$
test_vmap_transformer_speed_decorator[True-False] 21.3726ms 20.5941ms 48.5576 Ops/s 48.8009 Ops/s $\color{#d91a1a}-0.50\%$
test_vmap_transformer_speed_decorator[False-True] 21.1982ms 20.4906ms 48.8029 Ops/s 49.2495 Ops/s $\color{#d91a1a}-0.91\%$
test_vmap_transformer_speed_decorator[False-False] 21.2611ms 20.4946ms 48.7933 Ops/s 48.8362 Ops/s $\color{#d91a1a}-0.09\%$
test_to_module_speed[True] 1.5883ms 1.4900ms 671.1253 Ops/s 675.5545 Ops/s $\color{#d91a1a}-0.66\%$
test_to_module_speed[False] 1.5851ms 1.4881ms 671.9902 Ops/s 691.6929 Ops/s $\color{#d91a1a}-2.85\%$
test_tc_init 74.5020μs 43.7872μs 22.8377 KOps/s 22.2644 KOps/s $\color{#35bf28}+2.58\%$
test_tc_init_tensor_only 42.9210μs 9.8128μs 101.9073 KOps/s 103.2558 KOps/s $\color{#d91a1a}-1.31\%$
test_tc_init_nested 0.1296ms 88.4875μs 11.3010 KOps/s 11.2428 KOps/s $\color{#35bf28}+0.52\%$
test_tc_init_many_fields 54.9110μs 16.4424μs 60.8183 KOps/s 60.7143 KOps/s $\color{#35bf28}+0.17\%$
test_tc_first_layer_tensor 29.9900μs 1.8375μs 544.2283 KOps/s 549.0406 KOps/s $\color{#d91a1a}-0.88\%$
test_tc_first_layer_tensor_only 2.5890μs 0.3994μs 2.5037 MOps/s 2.4439 MOps/s $\color{#35bf28}+2.45\%$
test_tc_first_layer_tensor_set 31.6310μs 3.9539μs 252.9131 KOps/s 251.4138 KOps/s $\color{#35bf28}+0.60\%$
test_tc_first_layer_tensor_only_set 61.1310μs 3.1418μs 318.2936 KOps/s 304.0345 KOps/s $\color{#35bf28}+4.69\%$
test_tc_first_layer_nontensor 33.6010μs 6.1609μs 162.3149 KOps/s 161.2139 KOps/s $\color{#35bf28}+0.68\%$
test_tc_second_layer_tensor 30.2900μs 4.4109μs 226.7090 KOps/s 225.8417 KOps/s $\color{#35bf28}+0.38\%$
test_tc_second_layer_nontensor 38.6910μs 8.7737μs 113.9775 KOps/s 114.5568 KOps/s $\color{#d91a1a}-0.51\%$
test_unbind 0.2518s 16.5531ms 60.4117 Ops/s 54.8492 Ops/s $\textbf{\color{#35bf28}+10.14\%}$
test_full_like 11.2434ms 4.4072ms 226.9011 Ops/s 227.5458 Ops/s $\color{#d91a1a}-0.28\%$
test_zeros_like 4.5373ms 4.3600ms 229.3574 Ops/s 228.6445 Ops/s $\color{#35bf28}+0.31\%$
test_ones_like 4.5170ms 4.3671ms 228.9828 Ops/s 228.1659 Ops/s $\color{#35bf28}+0.36\%$
test_clone 6.5761ms 6.4212ms 155.7352 Ops/s 154.1234 Ops/s $\color{#35bf28}+1.05\%$
test_squeeze 65.1910μs 14.3362μs 69.7533 KOps/s 70.3355 KOps/s $\color{#d91a1a}-0.83\%$
test_unsqueeze 0.1907ms 0.1106ms 9.0390 KOps/s 8.4255 KOps/s $\textbf{\color{#35bf28}+7.28\%}$
test_split 0.3518ms 0.1858ms 5.3825 KOps/s 5.1766 KOps/s $\color{#35bf28}+3.98\%$
test_permute 0.2828ms 0.2046ms 4.8866 KOps/s 4.5996 KOps/s $\textbf{\color{#35bf28}+6.24\%}$
test_stack 51.2805ms 50.8312ms 19.6729 Ops/s 28.8883 Ops/s $\textbf{\color{#d91a1a}-31.90\%}$
test_cat 42.6875ms 42.5017ms 23.5285 Ops/s 28.9235 Ops/s $\textbf{\color{#d91a1a}-18.65\%}$
test_sequential_tensordict 0.2734ms 0.2140ms 4.6720 KOps/s 4.5568 KOps/s $\color{#35bf28}+2.53\%$
test_sequential_graph_module 0.2465ms 0.1179ms 8.4822 KOps/s 8.0810 KOps/s $\color{#35bf28}+4.96\%$
test_nested_tensordict 0.3528ms 0.2929ms 3.4141 KOps/s 3.3681 KOps/s $\color{#35bf28}+1.37\%$
test_nested_graph_module 0.2170ms 0.1293ms 7.7324 KOps/s 7.2508 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
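
The Change column in the table above is consistent with a relative difference between the Ops and Ops on Repo HEAD columns. The sketch below reproduces a few rows under that reading. The 5% cut-off used to flag a row as improved or worsened is an assumption inferred from which rows appear in bold, not something the report states, and `relative_change` and `classify` are hypothetical helper names.

```python
def relative_change(ops, ops_head):
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct, threshold=5.0):
    """Flag a row; the 5% threshold is an assumed value, not a documented one."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD) copied from the CPU table; units match per row.
rows = [
    ("test_plain_set_nested", 67.4082, 67.5859),
    ("test_values_nested_leaf", 16.4310, 15.4669),
    ("test_chunk", 820.6931, 912.8000),
]

for name, ops, ops_head in rows:
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Many of the flagged rows have a Max time far above their Mean (for example `test_chunk`), so a rerun may be worth considering before treating a single flagged row as a real regression.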

github-actions bot (Contributor) commented Mar 9, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 261. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.4310μs 14.8480μs 67.3490 KOps/s 67.4070 KOps/s $\color{#d91a1a}-0.09\%$
test_plain_set_stack_nested 38.6710μs 15.1778μs 65.8857 KOps/s 67.1537 KOps/s $\color{#d91a1a}-1.89\%$
test_plain_set_nested_inplace 46.8110μs 16.7443μs 59.7220 KOps/s 60.1316 KOps/s $\color{#d91a1a}-0.68\%$
test_plain_set_stack_nested_inplace 37.4210μs 16.7024μs 59.8715 KOps/s 60.8365 KOps/s $\color{#d91a1a}-1.59\%$
test_items 38.8810μs 5.9929μs 166.8637 KOps/s 168.5054 KOps/s $\color{#d91a1a}-0.97\%$
test_items_nested 0.5137ms 0.4701ms 2.1274 KOps/s 2.1650 KOps/s $\color{#d91a1a}-1.74\%$
test_items_nested_locked 0.5651ms 0.4749ms 2.1057 KOps/s 2.1343 KOps/s $\color{#d91a1a}-1.34\%$
test_items_nested_leaf 0.1306ms 98.6902μs 10.1327 KOps/s 10.3485 KOps/s $\color{#d91a1a}-2.09\%$
test_items_stack_nested 0.5412ms 0.4737ms 2.1111 KOps/s 2.1708 KOps/s $\color{#d91a1a}-2.75\%$
test_items_stack_nested_leaf 0.1383ms 97.1736μs 10.2909 KOps/s 10.3418 KOps/s $\color{#d91a1a}-0.49\%$
test_items_stack_nested_locked 0.7382ms 0.4679ms 2.1372 KOps/s 2.1569 KOps/s $\color{#d91a1a}-0.91\%$
test_keys 27.3910μs 4.2728μs 234.0376 KOps/s 238.4868 KOps/s $\color{#d91a1a}-1.87\%$
test_keys_nested 0.1777ms 0.1290ms 7.7543 KOps/s 7.7055 KOps/s $\color{#35bf28}+0.63\%$
test_keys_nested_locked 0.7399ms 0.1348ms 7.4204 KOps/s 7.2914 KOps/s $\color{#35bf28}+1.77\%$
test_keys_nested_leaf 0.2594ms 0.1204ms 8.3066 KOps/s 8.3822 KOps/s $\color{#d91a1a}-0.90\%$
test_keys_stack_nested 0.1752ms 0.1297ms 7.7100 KOps/s 7.7266 KOps/s $\color{#d91a1a}-0.21\%$
test_keys_stack_nested_leaf 0.1784ms 0.1203ms 8.3158 KOps/s 8.3830 KOps/s $\color{#d91a1a}-0.80\%$
test_keys_stack_nested_locked 0.2031ms 0.1380ms 7.2438 KOps/s 7.3156 KOps/s $\color{#d91a1a}-0.98\%$
test_values 6.5262μs 1.0192μs 981.1472 KOps/s 986.7173 KOps/s $\color{#d91a1a}-0.56\%$
test_values_nested 78.0020μs 52.8326μs 18.9277 KOps/s 19.0619 KOps/s $\color{#d91a1a}-0.70\%$
test_values_nested_locked 93.0530μs 56.5342μs 17.6884 KOps/s 17.9973 KOps/s $\color{#d91a1a}-1.72\%$
test_values_nested_leaf 0.1025ms 61.1708μs 16.3477 KOps/s 16.7632 KOps/s $\color{#d91a1a}-2.48\%$
test_values_stack_nested 89.1520μs 53.5445μs 18.6761 KOps/s 19.1710 KOps/s $\color{#d91a1a}-2.58\%$
test_values_stack_nested_leaf 96.7920μs 61.0196μs 16.3882 KOps/s 16.6764 KOps/s $\color{#d91a1a}-1.73\%$
test_values_stack_nested_locked 83.3520μs 56.7280μs 17.6280 KOps/s 18.4957 KOps/s $\color{#d91a1a}-4.69\%$
test_membership 5.2517μs 0.8597μs 1.1632 MOps/s 1.1811 MOps/s $\color{#d91a1a}-1.52\%$
test_membership_nested 30.5510μs 2.8655μs 348.9826 KOps/s 346.3378 KOps/s $\color{#35bf28}+0.76\%$
test_membership_nested_leaf 28.7610μs 2.8218μs 354.3897 KOps/s 360.4697 KOps/s $\color{#d91a1a}-1.69\%$
test_membership_stacked_nested 22.2210μs 2.9180μs 342.6995 KOps/s 346.2923 KOps/s $\color{#d91a1a}-1.04\%$
test_membership_stacked_nested_leaf 26.5200μs 2.9016μs 344.6359 KOps/s 347.2052 KOps/s $\color{#d91a1a}-0.74\%$
test_membership_nested_last 25.3800μs 4.4707μs 223.6794 KOps/s 230.6209 KOps/s $\color{#d91a1a}-3.01\%$
test_membership_nested_leaf_last 39.9410μs 4.4592μs 224.2574 KOps/s 230.1453 KOps/s $\color{#d91a1a}-2.56\%$
test_membership_stacked_nested_last 25.5710μs 4.4496μs 224.7390 KOps/s 230.6251 KOps/s $\color{#d91a1a}-2.55\%$
test_membership_stacked_nested_leaf_last 20.4300μs 4.4445μs 224.9978 KOps/s 230.3708 KOps/s $\color{#d91a1a}-2.33\%$
test_nested_getleaf 55.0410μs 21.7247μs 46.0305 KOps/s 45.9888 KOps/s $\color{#35bf28}+0.09\%$
test_nested_get 55.9120μs 20.5695μs 48.6157 KOps/s 48.4520 KOps/s $\color{#35bf28}+0.34\%$
test_stacked_getleaf 51.4610μs 21.8989μs 45.6644 KOps/s 45.6818 KOps/s $\color{#d91a1a}-0.04\%$
test_stacked_get 95.7530μs 20.6191μs 48.4988 KOps/s 48.3185 KOps/s $\color{#35bf28}+0.37\%$
test_nested_getitemleaf 51.0620μs 22.1824μs 45.0807 KOps/s 44.4895 KOps/s $\color{#35bf28}+1.33\%$
test_nested_getitem 51.8710μs 20.9815μs 47.6610 KOps/s 47.1653 KOps/s $\color{#35bf28}+1.05\%$
test_stacked_getitemleaf 45.7810μs 22.0543μs 45.3427 KOps/s 44.9653 KOps/s $\color{#35bf28}+0.84\%$
test_stacked_getitem 45.6710μs 21.2201μs 47.1251 KOps/s 47.3579 KOps/s $\color{#d91a1a}-0.49\%$
test_lock_nested 7.8324ms 0.4932ms 2.0277 KOps/s 2.0947 KOps/s $\color{#d91a1a}-3.20\%$
test_lock_stack_nested 0.5693ms 0.4885ms 2.0472 KOps/s 2.0546 KOps/s $\color{#d91a1a}-0.36\%$
test_unlock_nested 0.4663ms 0.3953ms 2.5296 KOps/s 2.5508 KOps/s $\color{#d91a1a}-0.83\%$
test_unlock_stack_nested 0.4846ms 0.3944ms 2.5352 KOps/s 2.5269 KOps/s $\color{#35bf28}+0.33\%$
test_flatten_speed 0.1763ms 0.1223ms 8.1743 KOps/s 8.1367 KOps/s $\color{#35bf28}+0.46\%$
test_unflatten_speed 0.6790ms 0.5757ms 1.7371 KOps/s 1.7539 KOps/s $\color{#d91a1a}-0.96\%$
test_common_ops 0.8443ms 0.6979ms 1.4329 KOps/s 1.4350 KOps/s $\color{#d91a1a}-0.15\%$
test_creation 0.1053ms 3.1708μs 315.3799 KOps/s 314.8631 KOps/s $\color{#35bf28}+0.16\%$
test_creation_empty 30.6110μs 7.0167μs 142.5173 KOps/s 143.6992 KOps/s $\color{#d91a1a}-0.82\%$
test_creation_nested_1 83.8320μs 11.5300μs 86.7306 KOps/s 86.6763 KOps/s $\color{#35bf28}+0.06\%$
test_creation_nested_2 43.3010μs 13.3718μs 74.7842 KOps/s 75.1634 KOps/s $\color{#d91a1a}-0.50\%$
test_creation_many_keys[10] 48.6410μs 21.0545μs 47.4957 KOps/s 47.4591 KOps/s $\color{#35bf28}+0.08\%$
test_creation_many_keys[50] 0.1280ms 90.8640μs 11.0055 KOps/s 10.9981 KOps/s $\color{#35bf28}+0.07\%$
test_creation_many_keys[100] 0.2315ms 0.1787ms 5.5945 KOps/s 5.5514 KOps/s $\color{#35bf28}+0.77\%$
test_creation_nested_many_keys[10] 66.3720μs 45.0899μs 22.1779 KOps/s 22.1490 KOps/s $\color{#35bf28}+0.13\%$
test_creation_nested_many_keys[50] 0.2885ms 0.1840ms 5.4345 KOps/s 5.3548 KOps/s $\color{#35bf28}+1.49\%$
test_clone 38.9110μs 13.2779μs 75.3133 KOps/s 75.1943 KOps/s $\color{#35bf28}+0.16\%$
test_getitem[int] 1.4821ms 15.3869μs 64.9901 KOps/s 59.4387 KOps/s $\textbf{\color{#35bf28}+9.34\%}$
test_getitem[slice_int] 0.1369ms 24.6315μs 40.5985 KOps/s 41.2835 KOps/s $\color{#d91a1a}-1.66\%$
test_getitem[range] 0.1797ms 63.1625μs 15.8322 KOps/s 15.8451 KOps/s $\color{#d91a1a}-0.08\%$
test_getitem[tuple] 0.1403ms 24.3095μs 41.1362 KOps/s 41.8686 KOps/s $\color{#d91a1a}-1.75\%$
test_getitem[list] 0.1806ms 57.8496μs 17.2862 KOps/s 17.2276 KOps/s $\color{#35bf28}+0.34\%$
test_setitem_dim[int] 60.4910μs 26.1453μs 38.2478 KOps/s 37.8364 KOps/s $\color{#35bf28}+1.09\%$
test_setitem_dim[slice_int] 66.1220μs 43.3183μs 23.0849 KOps/s 22.9563 KOps/s $\color{#35bf28}+0.56\%$
test_setitem_dim[range] 0.1173ms 94.3787μs 10.5956 KOps/s 10.5310 KOps/s $\color{#35bf28}+0.61\%$
test_setitem_dim[tuple] 61.8120μs 39.2737μs 25.4623 KOps/s 25.5379 KOps/s $\color{#d91a1a}-0.30\%$
test_setitem 54.6810μs 17.8382μs 56.0594 KOps/s 56.4916 KOps/s $\color{#d91a1a}-0.77\%$
test_set 42.5010μs 17.2441μs 57.9910 KOps/s 59.5765 KOps/s $\color{#d91a1a}-2.66\%$
test_set_shared 0.4884ms 0.2034ms 4.9155 KOps/s 4.9140 KOps/s $\color{#35bf28}+0.03\%$
test_update 0.3239ms 22.0750μs 45.3001 KOps/s 46.3833 KOps/s $\color{#d91a1a}-2.34\%$
test_update_nested 70.6820μs 33.5481μs 29.8079 KOps/s 30.2386 KOps/s $\color{#d91a1a}-1.42\%$
test_update__nested 0.4530ms 34.3051μs 29.1502 KOps/s 29.2888 KOps/s $\color{#d91a1a}-0.47\%$
test_set_nested 59.4110μs 20.3388μs 49.1670 KOps/s 53.0083 KOps/s $\textbf{\color{#d91a1a}-7.25\%}$
test_set_nested_new 57.2810μs 23.9477μs 41.7576 KOps/s 42.5149 KOps/s $\color{#d91a1a}-1.78\%$
test_select 79.6320μs 40.3234μs 24.7995 KOps/s 24.4876 KOps/s $\color{#35bf28}+1.27\%$
test_select_nested 95.4520μs 74.7358μs 13.3805 KOps/s 13.3542 KOps/s $\color{#35bf28}+0.20\%$
test_exclude_nested 0.1264ms 92.1791μs 10.8484 KOps/s 10.8581 KOps/s $\color{#d91a1a}-0.09\%$
test_empty[True] 0.5044ms 0.3999ms 2.5005 KOps/s 2.5181 KOps/s $\color{#d91a1a}-0.70\%$
test_empty[False] 7.4378μs 1.3280μs 752.9850 KOps/s 769.5914 KOps/s $\color{#d91a1a}-2.16\%$
test_to 0.1114ms 77.8706μs 12.8418 KOps/s 13.5777 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_to_nonblocking 0.1095ms 64.3672μs 15.5359 KOps/s 15.5048 KOps/s $\color{#35bf28}+0.20\%$
test_unbind_speed 0.3894ms 0.3377ms 2.9616 KOps/s 2.9670 KOps/s $\color{#d91a1a}-0.18\%$
test_unbind_speed_stack0 0.3919ms 0.3364ms 2.9724 KOps/s 3.1138 KOps/s $\color{#d91a1a}-4.54\%$
test_unbind_speed_stack1 0.1044s 0.8478ms 1.1795 KOps/s 1.1740 KOps/s $\color{#35bf28}+0.47\%$
test_split 0.1043s 1.2764ms 783.4726 Ops/s 777.9090 Ops/s $\color{#35bf28}+0.72\%$
test_chunk 0.1045s 1.2218ms 818.4571 Ops/s 912.5112 Ops/s $\textbf{\color{#d91a1a}-10.31\%}$
test_to_cpu_blocking 29.0445ms 28.7236ms 34.8146 Ops/s 46.4259 Ops/s $\textbf{\color{#d91a1a}-25.01\%}$
test_to_cpu_global_sync 11.5356ms 11.3431ms 88.1594 Ops/s 88.0325 Ops/s $\color{#35bf28}+0.14\%$
test_to_cpu_event_sync 12.5333ms 12.2159ms 81.8602 Ops/s 80.9140 Ops/s $\color{#35bf28}+1.17\%$
test_to_cpu_default 0.1162s 13.5195ms 73.9671 Ops/s 81.1415 Ops/s $\textbf{\color{#d91a1a}-8.84\%}$
test_consolidate[False-None] 4.2911ms 4.2250ms 236.6855 Ops/s 240.4390 Ops/s $\color{#d91a1a}-1.56\%$
test_consolidate[default-None] 3.0535ms 2.0504ms 487.7132 Ops/s 469.7533 Ops/s $\color{#35bf28}+3.82\%$
test_consolidate[reduce-overhead-None] 2.1052ms 1.9783ms 505.4906 Ops/s 485.1919 Ops/s $\color{#35bf28}+4.18\%$
test_consolidate_njt[False-None] 8.7990ms 8.6115ms 116.1236 Ops/s 112.1070 Ops/s $\color{#35bf28}+3.58\%$
test_to[False-False-None] 2.2286ms 2.1041ms 475.2539 Ops/s 470.6209 Ops/s $\color{#35bf28}+0.98\%$
test_to[True-False-None] 2.2107ms 1.9390ms 515.7371 Ops/s 511.0153 Ops/s $\color{#35bf28}+0.92\%$
test_to[within-False-None] 6.3367ms 6.2489ms 160.0274 Ops/s 161.2249 Ops/s $\color{#d91a1a}-0.74\%$
test_to[True-default-None] 9.0416ms 8.7769ms 113.9349 Ops/s 111.4640 Ops/s $\color{#35bf28}+2.22\%$
test_to_njt[False-False-None] 8.7771ms 8.5135ms 117.4606 Ops/s 115.4385 Ops/s $\color{#35bf28}+1.75\%$
test_to_njt[True-False-None] 7.1235ms 6.9478ms 143.9309 Ops/s 139.9219 Ops/s $\color{#35bf28}+2.87\%$
test_to_njt[within-False-None] 15.7622ms 15.6496ms 63.8993 Ops/s 62.8746 Ops/s $\color{#35bf28}+1.63\%$
test_creation[device0] 0.2898ms 0.1165ms 8.5804 KOps/s 8.6727 KOps/s $\color{#d91a1a}-1.06\%$
test_creation_from_tensor 0.4055ms 0.1140ms 8.7695 KOps/s 8.6682 KOps/s $\color{#35bf28}+1.17\%$
test_add_one[memmap_tensor0] 0.2110ms 6.5381μs 152.9507 KOps/s 151.3260 KOps/s $\color{#35bf28}+1.07\%$
test_contiguous[memmap_tensor0] 20.7210μs 0.6661μs 1.5013 MOps/s 2.1396 MOps/s $\textbf{\color{#d91a1a}-29.83\%}$
test_stack[memmap_tensor0] 35.2210μs 4.7466μs 210.6781 KOps/s 211.2343 KOps/s $\color{#d91a1a}-0.26\%$
test_memmaptd_index 1.0930ms 0.2809ms 3.5596 KOps/s 3.6000 KOps/s $\color{#d91a1a}-1.12\%$
test_memmaptd_index_astensor 0.5389ms 0.3852ms 2.5962 KOps/s 2.6418 KOps/s $\color{#d91a1a}-1.72\%$
test_memmaptd_index_op 0.9559ms 0.6388ms 1.5653 KOps/s 1.5192 KOps/s $\color{#35bf28}+3.04\%$
test_serialize_model 0.3076s 0.1603s 6.2400 Ops/s 7.3740 Ops/s $\textbf{\color{#d91a1a}-15.38\%}$
test_serialize_model_pickle 1.3490s 1.2106s 0.8260 Ops/s 0.8328 Ops/s $\color{#d91a1a}-0.81\%$
test_serialize_weights 0.1375s 0.1351s 7.4044 Ops/s 7.3362 Ops/s $\color{#35bf28}+0.93\%$
test_serialize_weights_returnearly 0.4535s 88.5806ms 11.2892 Ops/s 6.1874 Ops/s $\textbf{\color{#35bf28}+82.45\%}$
test_serialize_weights_pickle 1.3684s 1.2136s 0.8240 Ops/s 0.8227 Ops/s $\color{#35bf28}+0.15\%$
test_reshape_pytree 0.2094ms 33.4388μs 29.9054 KOps/s 30.1401 KOps/s $\color{#d91a1a}-0.78\%$
test_reshape_td 82.1520μs 46.9869μs 21.2825 KOps/s 22.1320 KOps/s $\color{#d91a1a}-3.84\%$
test_view_pytree 0.2119ms 33.4556μs 29.8903 KOps/s 30.9273 KOps/s $\color{#d91a1a}-3.35\%$
test_view_td 97.5820μs 54.3571μs 18.3969 KOps/s 18.4999 KOps/s $\color{#d91a1a}-0.56\%$
test_unbind_pytree 0.2354ms 37.0089μs 27.0205 KOps/s 26.8639 KOps/s $\color{#35bf28}+0.58\%$
test_unbind_td 0.1637ms 50.5587μs 19.7790 KOps/s 19.7753 KOps/s $\color{#35bf28}+0.02\%$
test_split_pytree 0.2264ms 43.2024μs 23.1468 KOps/s 23.5356 KOps/s $\color{#d91a1a}-1.65\%$
test_split_td 0.1842ms 67.5042μs 14.8139 KOps/s 15.4222 KOps/s $\color{#d91a1a}-3.94\%$
test_add_pytree 0.2407ms 43.0028μs 23.2543 KOps/s 24.1154 KOps/s $\color{#d91a1a}-3.57\%$
test_add_td 0.1013ms 54.7817μs 18.2543 KOps/s 18.6901 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_add_one_nested[tensordict-compile] 0.2061ms 0.1426ms 7.0148 KOps/s 6.7226 KOps/s $\color{#35bf28}+4.35\%$
test_compile_add_one_nested[tensordict-eager] 0.5720ms 0.2014ms 4.9651 KOps/s 4.9934 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_add_one_nested[pytree-compile] 0.4178ms 0.1107ms 9.0297 KOps/s 8.8948 KOps/s $\color{#35bf28}+1.52\%$
test_compile_add_one_nested[pytree-eager] 0.6076ms 0.1850ms 5.4046 KOps/s 5.4880 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_copy_nested[tensordict-compile] 0.3023ms 10.3629μs 96.4982 KOps/s 95.9914 KOps/s $\color{#35bf28}+0.53\%$
test_compile_copy_nested[tensordict-eager] 91.5230μs 54.3642μs 18.3945 KOps/s 18.2538 KOps/s $\color{#35bf28}+0.77\%$
test_compile_copy_nested[pytree-compile] 45.5910μs 9.8679μs 101.3382 KOps/s 99.7046 KOps/s $\color{#35bf28}+1.64\%$
test_compile_copy_nested[pytree-eager] 0.4393ms 69.9695μs 14.2919 KOps/s 14.6166 KOps/s $\color{#d91a1a}-2.22\%$
test_compile_add_one_flat[tensordict-compile] 0.2131ms 0.1767ms 5.6601 KOps/s 5.2547 KOps/s $\textbf{\color{#35bf28}+7.72\%}$
test_compile_add_one_flat[tensordict-eager] 0.3565ms 0.2787ms 3.5885 KOps/s 3.5713 KOps/s $\color{#35bf28}+0.48\%$
test_compile_add_one_flat[tensorclass-compile] 0.1695ms 0.1178ms 8.4900 KOps/s 8.1288 KOps/s $\color{#35bf28}+4.44\%$
test_compile_add_one_flat[tensorclass-eager] 0.1044ms 73.4884μs 13.6076 KOps/s 13.6758 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_one_flat[pytree-compile] 0.2007ms 0.1578ms 6.3374 KOps/s 6.1861 KOps/s $\color{#35bf28}+2.45\%$
test_compile_add_one_flat[pytree-eager] 0.8926ms 0.5438ms 1.8387 KOps/s 1.8492 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_add_self_flat[tensordict-eager] 0.4193ms 0.3316ms 3.0153 KOps/s 2.9985 KOps/s $\color{#35bf28}+0.56\%$
test_compile_add_self_flat[tensordict-compile] 0.2149ms 0.1791ms 5.5842 KOps/s 5.2009 KOps/s $\textbf{\color{#35bf28}+7.37\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1624ms 89.4496μs 11.1795 KOps/s 11.2606 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_add_self_flat[tensorclass-compile] 0.1927ms 0.1198ms 8.3462 KOps/s 8.0802 KOps/s $\color{#35bf28}+3.29\%$
test_compile_add_self_flat[pytree-eager] 0.6576ms 0.4477ms 2.2336 KOps/s 2.2613 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_add_self_flat[pytree-compile] 0.2385ms 0.1591ms 6.2851 KOps/s 6.1435 KOps/s $\color{#35bf28}+2.31\%$
test_compile_copy_flat[tensordict-compile] 62.5320μs 13.0420μs 76.6752 KOps/s 74.1977 KOps/s $\color{#35bf28}+3.34\%$
test_compile_copy_flat[tensordict-eager] 71.7810μs 41.5088μs 24.0913 KOps/s 24.3536 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_copy_flat[pytree-compile] 38.5710μs 10.7811μs 92.7547 KOps/s 93.2226 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_copy_flat[pytree-eager] 0.4073ms 52.7505μs 18.9572 KOps/s 19.0477 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_assign_and_add[tensordict-compile] 2.0333ms 0.1736ms 5.7602 KOps/s 5.3783 KOps/s $\textbf{\color{#35bf28}+7.10\%}$
test_compile_assign_and_add[tensordict-eager] 3.4364ms 3.3265ms 300.6175 Ops/s 302.5844 Ops/s $\color{#d91a1a}-0.65\%$
test_compile_assign_and_add[pytree-compile] 1.9732ms 0.1628ms 6.1421 KOps/s 5.9238 KOps/s $\color{#35bf28}+3.69\%$
test_compile_assign_and_add[pytree-eager] 2.9654ms 2.8187ms 354.7787 Ops/s 352.7546 Ops/s $\color{#35bf28}+0.57\%$
test_compile_indexing[tensor-tensordict-compile] 0.1581ms 0.1093ms 9.1504 KOps/s 8.8007 KOps/s $\color{#35bf28}+3.97\%$
test_compile_indexing[tensor-tensordict-eager] 0.3133ms 73.6091μs 13.5853 KOps/s 13.7651 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1401ms 96.6273μs 10.3490 KOps/s 10.1996 KOps/s $\color{#35bf28}+1.47\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2518ms 45.0761μs 22.1847 KOps/s 21.0891 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_compile_indexing[tensor-pytree-compile] 0.1446ms 97.0734μs 10.3015 KOps/s 10.2255 KOps/s $\color{#35bf28}+0.74\%$
test_compile_indexing[tensor-pytree-eager] 0.2574ms 44.8519μs 22.2956 KOps/s 22.6221 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_indexing[slice-tensordict-compile] 99.2420μs 56.3077μs 17.7596 KOps/s 16.9161 KOps/s $\color{#35bf28}+4.99\%$
test_compile_indexing[slice-tensordict-eager] 0.2194ms 27.8620μs 35.8912 KOps/s 36.0989 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_indexing[slice-tensorclass-compile] 93.2120μs 44.6030μs 22.4200 KOps/s 21.9304 KOps/s $\color{#35bf28}+2.23\%$
test_compile_indexing[slice-tensorclass-eager] 0.2690ms 22.8553μs 43.7536 KOps/s 44.5956 KOps/s $\color{#d91a1a}-1.89\%$
test_compile_indexing[slice-pytree-compile] 83.1020μs 44.2714μs 22.5879 KOps/s 21.4616 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_compile_indexing[slice-pytree-eager] 0.2777ms 22.6297μs 44.1897 KOps/s 44.7689 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_indexing[int-tensordict-compile] 93.1220μs 57.2454μs 17.4686 KOps/s 16.6321 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_compile_indexing[int-tensordict-eager] 0.2832ms 28.2226μs 35.4326 KOps/s 36.8333 KOps/s $\color{#d91a1a}-3.80\%$
test_compile_indexing[int-tensorclass-compile] 86.0520μs 44.6529μs 22.3950 KOps/s 21.5474 KOps/s $\color{#35bf28}+3.93\%$
test_compile_indexing[int-tensorclass-eager] 0.2662ms 22.6864μs 44.0792 KOps/s 44.5158 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_indexing[int-pytree-compile] 81.7620μs 44.5219μs 22.4609 KOps/s 21.5754 KOps/s $\color{#35bf28}+4.10\%$
test_compile_indexing[int-pytree-eager] 0.2747ms 22.7383μs 43.9786 KOps/s 44.7885 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_replace[single-eager] 0.1012ms 47.6776μs 20.9742 KOps/s 21.2434 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_replace[single-compile] 0.1447ms 0.1047ms 9.5490 KOps/s 9.3061 KOps/s $\color{#35bf28}+2.61\%$
test_compile_replace[multi-eager] 0.6365ms 0.5651ms 1.7695 KOps/s 1.7790 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_replace[multi-compile] 0.1622ms 0.1128ms 8.8651 KOps/s 8.8814 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_tc_getattr_20[eager] 0.2178ms 0.1750ms 5.7128 KOps/s 5.9126 KOps/s $\color{#d91a1a}-3.38\%$
test_compile_tc_getattr_20[compile] 0.1740ms 0.1194ms 8.3779 KOps/s 8.1934 KOps/s $\color{#35bf28}+2.25\%$
test_compile_clone_shallow[20-eager] 52.9020μs 19.5172μs 51.2368 KOps/s 52.2831 KOps/s $\color{#d91a1a}-2.00\%$
test_compile_clone_shallow[20-compile] 52.7410μs 11.7383μs 85.1912 KOps/s 85.2134 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_clone_shallow[40-eager] 60.9920μs 34.1464μs 29.2856 KOps/s 29.5577 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_clone_shallow[40-compile] 49.6910μs 12.7251μs 78.5850 KOps/s 78.8690 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_clone_shallow[80-eager] 0.1747ms 61.4896μs 16.2629 KOps/s 15.7289 KOps/s $\color{#35bf28}+3.40\%$
test_compile_clone_shallow[80-compile] 40.2410μs 15.0043μs 66.6475 KOps/s 64.7920 KOps/s $\color{#35bf28}+2.86\%$
test_compile_update_inplace[eager] 0.1242ms 59.8931μs 16.6964 KOps/s 16.7293 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_update_inplace[compile] 0.1918ms 0.1404ms 7.1248 KOps/s 6.9418 KOps/s $\color{#35bf28}+2.64\%$
test_mod_add[eager] 0.1177ms 49.9102μs 20.0360 KOps/s 20.1400 KOps/s $\color{#d91a1a}-0.52\%$
test_mod_add[compile] 0.1487ms 0.1058ms 9.4521 KOps/s 9.4495 KOps/s $\color{#35bf28}+0.03\%$
test_mod_add[compile-overhead] 0.2325ms 0.1496ms 6.6862 KOps/s 6.4990 KOps/s $\color{#35bf28}+2.88\%$
test_mod_wrap[eager] 0.3636ms 0.2925ms 3.4189 KOps/s 3.4452 KOps/s $\color{#d91a1a}-0.76\%$
test_mod_wrap[compile] 0.8315ms 0.3547ms 2.8189 KOps/s 2.7142 KOps/s $\color{#35bf28}+3.86\%$
test_mod_wrap[compile-overhead] 7.2512ms 4.0155ms 249.0346 Ops/s 247.5443 Ops/s $\color{#35bf28}+0.60\%$
test_mod_wrap_and_backward[eager] 1.9425ms 1.5085ms 662.9031 Ops/s 659.6341 Ops/s $\color{#35bf28}+0.50\%$
test_mod_wrap_and_backward[compile] 1.9626ms 1.4523ms 688.5602 Ops/s 680.3849 Ops/s $\color{#35bf28}+1.20\%$
test_mod_wrap_and_backward[compile-overhead] 1.2654ms 0.8898ms 1.1238 KOps/s 1.1066 KOps/s $\color{#35bf28}+1.56\%$
test_seq_add[eager] 0.6370ms 0.1560ms 6.4100 KOps/s 6.4170 KOps/s $\color{#d91a1a}-0.11\%$
test_seq_add[compile] 0.6107ms 0.1163ms 8.5957 KOps/s 8.5217 KOps/s $\color{#35bf28}+0.87\%$
test_seq_add[compile-overhead] 0.6001ms 0.1567ms 6.3829 KOps/s 6.2174 KOps/s $\color{#35bf28}+2.66\%$
test_seq_wrap[eager] 0.9654ms 0.5225ms 1.9137 KOps/s 1.9051 KOps/s $\color{#35bf28}+0.45\%$
test_seq_wrap[compile] 0.8736ms 0.3680ms 2.7177 KOps/s 2.7139 KOps/s $\color{#35bf28}+0.14\%$
test_seq_wrap[compile-overhead] 0.3413ms 0.2664ms 3.7537 KOps/s 3.7001 KOps/s $\color{#35bf28}+1.45\%$
test_func_call_runtime[False-eager] 1.2829ms 0.8422ms 1.1874 KOps/s 1.2054 KOps/s $\color{#d91a1a}-1.49\%$
test_func_call_runtime[False-compile] 1.4088ms 0.9193ms 1.0878 KOps/s 1.0883 KOps/s $\color{#d91a1a}-0.04\%$
test_func_call_runtime[False-compile-overhead] 0.9048ms 0.4654ms 2.1488 KOps/s 2.1317 KOps/s $\color{#35bf28}+0.80\%$
test_func_call_runtime[True-eager] 1.5263ms 1.0878ms 919.3103 Ops/s 928.1597 Ops/s $\color{#d91a1a}-0.95\%$
test_func_call_runtime[True-compile] 1.4396ms 0.9307ms 1.0745 KOps/s 1.0765 KOps/s $\color{#d91a1a}-0.19\%$
test_func_call_runtime[True-compile-overhead] 0.9212ms 0.4785ms 2.0899 KOps/s 2.0635 KOps/s $\color{#35bf28}+1.28\%$
test_func_call_cm_runtime[False-eager] 1.2917ms 0.8681ms 1.1519 KOps/s 1.2028 KOps/s $\color{#d91a1a}-4.23\%$
test_func_call_cm_runtime[False-compile] 1.1656ms 0.9265ms 1.0793 KOps/s 1.0855 KOps/s $\color{#d91a1a}-0.57\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5712ms 0.4660ms 2.1459 KOps/s 2.1217 KOps/s $\color{#35bf28}+1.14\%$
test_func_call_cm_runtime[True-eager] 1.3070ms 1.2253ms 816.1113 Ops/s 813.4923 Ops/s $\color{#35bf28}+0.32\%$
test_func_call_cm_runtime[True-compile] 1.0251ms 0.9606ms 1.0410 KOps/s 1.0344 KOps/s $\color{#35bf28}+0.64\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5610ms 0.5110ms 1.9569 KOps/s 1.9333 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_func_call_cm_runtime[eager] 2.8531ms 2.3739ms 421.2492 Ops/s 420.3055 Ops/s $\color{#35bf28}+0.22\%$
test_vmap_func_call_cm_runtime[compile] 1.1358ms 0.9868ms 1.0134 KOps/s 1.0132 KOps/s $\color{#35bf28}+0.02\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5955ms 0.5149ms 1.9421 KOps/s 1.9094 KOps/s $\color{#35bf28}+1.72\%$
test_distributed 0.5470ms 0.1522ms 6.5706 KOps/s 6.5085 KOps/s $\color{#35bf28}+0.95\%$
test_tdmodule 0.2486ms 27.4147μs 36.4767 KOps/s 36.2088 KOps/s $\color{#35bf28}+0.74\%$
test_tdmodule_dispatch 77.4020μs 45.5802μs 21.9394 KOps/s 22.4956 KOps/s $\color{#d91a1a}-2.47\%$
test_tdseq 53.5910μs 26.5465μs 37.6697 KOps/s 37.2483 KOps/s $\color{#35bf28}+1.13\%$
test_tdseq_dispatch 67.3010μs 46.4783μs 21.5154 KOps/s 21.1420 KOps/s $\color{#35bf28}+1.77\%$
test_instantiation_functorch 2.2018ms 2.1025ms 475.6165 Ops/s 480.5947 Ops/s $\color{#d91a1a}-1.04\%$
test_exec_functorch 0.2407ms 0.1795ms 5.5704 KOps/s 5.5451 KOps/s $\color{#35bf28}+0.46\%$
test_exec_functional_call 0.2285ms 0.1593ms 6.2775 KOps/s 6.2424 KOps/s $\color{#35bf28}+0.56\%$
test_exec_td_decorator 0.4326ms 0.2346ms 4.2627 KOps/s 4.2311 KOps/s $\color{#35bf28}+0.75\%$
test_vmap_mlp_speed_decorator[True-True] 1.0064ms 0.8204ms 1.2190 KOps/s 1.2115 KOps/s $\color{#35bf28}+0.62\%$
test_vmap_mlp_speed_decorator[True-False] 0.9999ms 0.8197ms 1.2200 KOps/s 1.2144 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[False-True] 0.8536ms 0.7048ms 1.4187 KOps/s 1.4041 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed_decorator[False-False] 0.8753ms 0.7081ms 1.4123 KOps/s 1.4002 KOps/s $\color{#35bf28}+0.86\%$
test_vmap_transformer_speed_decorator[True-True] 21.4274ms 20.5676ms 48.6203 Ops/s 48.4663 Ops/s $\color{#35bf28}+0.32\%$
test_vmap_transformer_speed_decorator[True-False] 20.6841ms 20.5664ms 48.6230 Ops/s 48.4720 Ops/s $\color{#35bf28}+0.31\%$
test_vmap_transformer_speed_decorator[False-True] 20.5470ms 20.3188ms 49.2155 Ops/s 48.9142 Ops/s $\color{#35bf28}+0.62\%$
test_vmap_transformer_speed_decorator[False-False] 20.4746ms 20.3529ms 49.1332 Ops/s 48.8916 Ops/s $\color{#35bf28}+0.49\%$
test_to_module_speed[True] 1.8673ms 1.4789ms 676.1605 Ops/s 676.7710 Ops/s $\color{#d91a1a}-0.09\%$
test_to_module_speed[False] 1.5720ms 1.4552ms 687.1957 Ops/s 689.3482 Ops/s $\color{#d91a1a}-0.31\%$
test_tc_init 81.3120μs 44.7082μs 22.3673 KOps/s 22.2063 KOps/s $\color{#35bf28}+0.72\%$
test_tc_init_tensor_only 84.8020μs 9.7830μs 102.2177 KOps/s 101.2725 KOps/s $\color{#35bf28}+0.93\%$
test_tc_init_nested 0.1238ms 88.8926μs 11.2495 KOps/s 11.2355 KOps/s $\color{#35bf28}+0.12\%$
test_tc_init_many_fields 65.3710μs 16.5153μs 60.5500 KOps/s 60.8630 KOps/s $\color{#d91a1a}-0.51\%$
test_tc_first_layer_tensor 73.2820μs 1.8213μs 549.0657 KOps/s 554.8517 KOps/s $\color{#d91a1a}-1.04\%$
test_tc_first_layer_tensor_only 2.7267μs 0.4011μs 2.4932 MOps/s 2.4802 MOps/s $\color{#35bf28}+0.53\%$
test_tc_first_layer_tensor_set 33.7110μs 3.9616μs 252.4209 KOps/s 252.8804 KOps/s $\color{#d91a1a}-0.18\%$
test_tc_first_layer_tensor_only_set 28.0810μs 3.2719μs 305.6372 KOps/s 303.4225 KOps/s $\color{#35bf28}+0.73\%$
test_tc_first_layer_nontensor 27.7100μs 6.1529μs 162.5246 KOps/s 157.1311 KOps/s $\color{#35bf28}+3.43\%$
test_tc_second_layer_tensor 27.8510μs 4.4365μs 225.4014 KOps/s 224.8127 KOps/s $\color{#35bf28}+0.26\%$
test_tc_second_layer_nontensor 56.9020μs 8.6910μs 115.0616 KOps/s 111.0009 KOps/s $\color{#35bf28}+3.66\%$
test_unbind 0.2498s 16.4056ms 60.9548 Ops/s 53.8745 Ops/s $\textbf{\color{#35bf28}+13.14\%}$
test_full_like 16.9958ms 16.5246ms 60.5157 Ops/s 73.4493 Ops/s $\textbf{\color{#d91a1a}-17.61\%}$
test_zeros_like 17.3524ms 16.8474ms 59.3563 Ops/s 74.1591 Ops/s $\textbf{\color{#d91a1a}-19.96\%}$
test_ones_like 16.9585ms 16.5884ms 60.2831 Ops/s 73.9715 Ops/s $\textbf{\color{#d91a1a}-18.50\%}$
test_clone 17.8337ms 17.5699ms 56.9154 Ops/s 67.6819 Ops/s $\textbf{\color{#d91a1a}-15.91\%}$
test_squeeze 97.4530μs 14.4489μs 69.2095 KOps/s 64.2986 KOps/s $\textbf{\color{#35bf28}+7.64\%}$
test_unsqueeze 0.1712ms 0.1104ms 9.0599 KOps/s 8.5937 KOps/s $\textbf{\color{#35bf28}+5.43\%}$
test_split 0.3520ms 0.1853ms 5.3965 KOps/s 5.1184 KOps/s $\textbf{\color{#35bf28}+5.43\%}$
test_permute 0.2700ms 0.2113ms 4.7325 KOps/s 4.5282 KOps/s $\color{#35bf28}+4.51\%$
test_stack 51.8540ms 51.2154ms 19.5254 Ops/s 19.4851 Ops/s $\color{#35bf28}+0.21\%$
test_cat 51.4389ms 50.9110ms 19.6421 Ops/s 19.4963 Ops/s $\color{#35bf28}+0.75\%$
test_sequential_tensordict 0.6098ms 0.2223ms 4.4987 KOps/s 4.5426 KOps/s $\color{#d91a1a}-0.97\%$
test_sequential_graph_module 0.1993ms 0.1230ms 8.1288 KOps/s 8.4607 KOps/s $\color{#d91a1a}-3.92\%$
test_nested_tensordict 0.7246ms 0.2820ms 3.5457 KOps/s 3.5493 KOps/s $\color{#d91a1a}-0.10\%$
test_nested_graph_module 0.1710ms 0.1311ms 7.6257 KOps/s 7.6847 KOps/s $\color{#d91a1a}-0.77\%$
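
For readers checking the Change column by hand, here is a minimal sketch of how the figures appear to be derived: the relative difference between the Ops column and the Ops on Repo HEAD column. The function names and the 5% cutoff are assumptions, not the benchmark bot's actual code. The cutoff is inferred from the rows above, where only changes beyond roughly ±5% are bolded, and the bolded rows match the 12 improved and 11 worsened counts in the summary.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of the PR's throughput versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Bucket a change as improved, worsened, or unchanged.

    threshold_pct is an assumption inferred from which rows are bolded
    in the table; it is not taken from the bot's configuration.
    """
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# Values copied from the test_chunk row of the GPU table above.
chunk_change = percent_change(818.4571, 912.5112)
print(f"test_chunk: {chunk_change:+.2f}% -> {classify(chunk_change)}")
```

Because throughput is the inverse of mean time, a negative value here means the PR branch ran that benchmark slower than HEAD.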

[ghstack-poisoned]
@github-actions
Contributor

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Add transfer plan computation and transport abstraction

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

Prefix Label Applied Example
[BugFix] or [Fix] bug [BugFix] Fix memory leak in TensorDict
[Feature] Feature [Feature] Add new storage backend
[Doc] or [Docs] documentation [Doc] Update installation guide
[Refactor] Refactor [Refactor] Clean up module imports
[CI] CI [CI] Fix workflow permissions
[Test] or [Tests] Test [Test] Add unit tests for nn module
[Compile] Compile [Compile] Fix torch.compile issue
[Performance] or [Perf] Performance [Perf] Optimize tensor operations
[Deprecation] Deprecation [Deprecation] Mark old function
[Setup] setup [Setup] Update build configuration
[Distributed] or [Dist] Distributed [Distributed] Add scatter collective
[Benchmark] or [Bench] Benchmarks [Benchmark] Add compile benchmark
[Typing] or [Type] Typing [Typing] Add type stubs
[BC-breaking] or [BC] BC-breaking [BC-breaking] Remove deprecated API
[Formatting] or [Format] Formatting [Format] Fix code style
[Quality] Quality [Quality] Improve error messages

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
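
The rule above can be sketched as a small lookup: take the leading bracketed prefix, lowercase it, and map it to a label. This is only an illustration of the rule as stated in the comment, not the workflow's real implementation. `PREFIX_TO_LABEL` and `label_for_title` are hypothetical names, and the extra singular/plural variations mentioned in the note are not modeled beyond the prefixes listed in the table.

```python
import re
from typing import Optional

# Prefix -> label pairs copied from the table above (keys lowercased).
PREFIX_TO_LABEL = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}


def label_for_title(title: str) -> Optional[str]:
    """Return the label for a PR title, or None if the prefix is unknown."""
    match = re.match(r"\s*\[([^\]]+)\]", title)
    if match is None:
        return None
    return PREFIX_TO_LABEL.get(match.group(1).strip().lower())


title = "[DTensor] Add transfer plan computation and transport abstraction"
print(label_for_title(title))                   # unknown prefix
print(label_for_title("[feature] Add transfer plan computation"))
```

Under this reading, `[DTensor]` is rejected because it is not in the table, while a listed prefix such as `[Feature]` or `[Distributed]` would be accepted.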

[ghstack-poisoned]
Labels: CLA Signed, Test
