[DTensor] Add Strategy B (local-shard transfer + redistribute) #1645
vmoens wants to merge 6 commits into gh/vmoens/86/base
PR Title Label Error: Unknown or invalid prefix
Current title:
Supported Prefixes: Your PR title must start with exactly one of these prefixes (case-insensitive):
Note: Matching is case-insensitive. Common variations (singular/plural) are supported.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 40.8310μs | 14.8348μs | 67.4089 KOps/s | 67.5143 KOps/s | |
| test_plain_set_stack_nested | 39.0800μs | 15.3459μs | 65.1638 KOps/s | 65.4865 KOps/s | |
| test_plain_set_nested_inplace | 47.9210μs | 16.7478μs | 59.7092 KOps/s | 59.0472 KOps/s | |
| test_plain_set_stack_nested_inplace | 40.4500μs | 16.6021μs | 60.2332 KOps/s | 59.9761 KOps/s | |
| test_items | 30.2400μs | 6.0411μs | 165.5316 KOps/s | 161.3198 KOps/s | |
| test_items_nested | 0.5755ms | 0.4728ms | 2.1149 KOps/s | 2.1383 KOps/s | |
| test_items_nested_locked | 0.5347ms | 0.4769ms | 2.0969 KOps/s | 2.1171 KOps/s | |
| test_items_nested_leaf | 0.1530ms | 97.8857μs | 10.2160 KOps/s | 10.1988 KOps/s | |
| test_items_stack_nested | 0.6712ms | 0.4725ms | 2.1165 KOps/s | 2.1510 KOps/s | |
| test_items_stack_nested_leaf | 0.1434ms | 99.3607μs | 10.0643 KOps/s | 10.0774 KOps/s | |
| test_items_stack_nested_locked | 0.5660ms | 0.4712ms | 2.1221 KOps/s | 2.1150 KOps/s | |
| test_keys | 32.1010μs | 4.2560μs | 234.9626 KOps/s | 235.1632 KOps/s | |
| test_keys_nested | 0.1797ms | 0.1307ms | 7.6504 KOps/s | 7.6551 KOps/s | |
| test_keys_nested_locked | 2.1434ms | 0.1391ms | 7.1884 KOps/s | 7.1888 KOps/s | |
| test_keys_nested_leaf | 0.1718ms | 0.1213ms | 8.2467 KOps/s | 8.2974 KOps/s | |
| test_keys_stack_nested | 0.1825ms | 0.1310ms | 7.6357 KOps/s | 7.6006 KOps/s | |
| test_keys_stack_nested_leaf | 0.1681ms | 0.1217ms | 8.2200 KOps/s | 8.2482 KOps/s | |
| test_keys_stack_nested_locked | 0.2068ms | 0.1383ms | 7.2281 KOps/s | 7.2222 KOps/s | |
| test_values | 6.5502μs | 1.0279μs | 972.8770 KOps/s | 972.1296 KOps/s | |
| test_values_nested | 89.9410μs | 52.5243μs | 19.0388 KOps/s | 19.1603 KOps/s | |
| test_values_nested_locked | 87.7010μs | 56.1097μs | 17.8222 KOps/s | 17.8892 KOps/s | |
| test_values_nested_leaf | 0.1085ms | 60.6622μs | 16.4847 KOps/s | 16.6349 KOps/s | |
| test_values_stack_nested | 81.5210μs | 53.2616μs | 18.7753 KOps/s | 18.9505 KOps/s | |
| test_values_stack_nested_leaf | 92.7010μs | 60.6591μs | 16.4856 KOps/s | 16.6011 KOps/s | |
| test_values_stack_nested_locked | 0.1478ms | 56.3667μs | 17.7410 KOps/s | 17.8504 KOps/s | |
| test_membership | 6.2617μs | 0.8562μs | 1.1680 MOps/s | 1.1636 MOps/s | |
| test_membership_nested | 29.3200μs | 2.9130μs | 343.2838 KOps/s | 332.6158 KOps/s | |
| test_membership_nested_leaf | 31.4100μs | 2.9103μs | 343.6096 KOps/s | 335.8217 KOps/s | |
| test_membership_stacked_nested | 34.0000μs | 2.9296μs | 341.3429 KOps/s | 332.4589 KOps/s | |
| test_membership_stacked_nested_leaf | 31.7610μs | 2.8891μs | 346.1293 KOps/s | 335.2976 KOps/s | |
| test_membership_nested_last | 34.3310μs | 4.3805μs | 228.2834 KOps/s | 224.2675 KOps/s | |
| test_membership_nested_leaf_last | 37.6610μs | 4.3980μs | 227.3757 KOps/s | 226.4939 KOps/s | |
| test_membership_stacked_nested_last | 30.5510μs | 4.3657μs | 229.0569 KOps/s | 224.2438 KOps/s | |
| test_membership_stacked_nested_leaf_last | 39.9700μs | 4.3405μs | 230.3881 KOps/s | 224.4404 KOps/s | |
| test_nested_getleaf | 53.9510μs | 21.5542μs | 46.3948 KOps/s | 45.9244 KOps/s | |
| test_nested_get | 48.3010μs | 20.3879μs | 49.0487 KOps/s | 48.1736 KOps/s | |
| test_stacked_getleaf | 62.0710μs | 21.3328μs | 46.8762 KOps/s | 46.3531 KOps/s | |
| test_stacked_get | 88.4210μs | 20.6469μs | 48.4335 KOps/s | 48.1608 KOps/s | |
| test_nested_getitemleaf | 51.0410μs | 21.9925μs | 45.4700 KOps/s | 44.8499 KOps/s | |
| test_nested_getitem | 48.4800μs | 21.1663μs | 47.2449 KOps/s | 47.2499 KOps/s | |
| test_stacked_getitemleaf | 50.4610μs | 22.1854μs | 45.0748 KOps/s | 45.5722 KOps/s | |
| test_stacked_getitem | 58.5610μs | 21.0833μs | 47.4308 KOps/s | 47.8148 KOps/s | |
| test_lock_nested | 0.5533ms | 0.4791ms | 2.0875 KOps/s | 2.0915 KOps/s | |
| test_lock_stack_nested | 0.5950ms | 0.4831ms | 2.0702 KOps/s | 2.0556 KOps/s | |
| test_unlock_nested | 0.4644ms | 0.3944ms | 2.5358 KOps/s | 2.5614 KOps/s | |
| test_unlock_stack_nested | 0.4418ms | 0.3890ms | 2.5710 KOps/s | 2.5277 KOps/s | |
| test_flatten_speed | 0.1567ms | 0.1230ms | 8.1292 KOps/s | 8.2036 KOps/s | |
| test_unflatten_speed | 0.6865ms | 0.5715ms | 1.7499 KOps/s | 1.7404 KOps/s | |
| test_common_ops | 0.8404ms | 0.6921ms | 1.4449 KOps/s | 1.4377 KOps/s | |
| test_creation | 0.1073ms | 3.1448μs | 317.9853 KOps/s | 319.2008 KOps/s | |
| test_creation_empty | 43.5810μs | 6.9614μs | 143.6499 KOps/s | 142.7651 KOps/s | |
| test_creation_nested_1 | 35.4910μs | 11.5628μs | 86.4845 KOps/s | 85.5819 KOps/s | |
| test_creation_nested_2 | 47.1310μs | 13.4064μs | 74.5913 KOps/s | 78.1229 KOps/s | |
| test_creation_many_keys[10] | 48.6110μs | 21.0609μs | 47.4815 KOps/s | 47.1214 KOps/s | |
| test_creation_many_keys[50] | 0.1615ms | 89.9179μs | 11.1213 KOps/s | 10.9889 KOps/s | |
| test_creation_many_keys[100] | 0.2189ms | 0.1763ms | 5.6707 KOps/s | 5.5754 KOps/s | |
| test_creation_nested_many_keys[10] | 93.9820μs | 44.8615μs | 22.2909 KOps/s | 22.0988 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2452ms | 0.1837ms | 5.4447 KOps/s | 5.3760 KOps/s | |
| test_clone | 45.4010μs | 13.4465μs | 74.3689 KOps/s | 74.0602 KOps/s | |
| test_getitem[int] | 1.7286ms | 15.1922μs | 65.8234 KOps/s | 59.1130 KOps/s | |
| test_getitem[slice_int] | 0.1381ms | 24.0305μs | 41.6137 KOps/s | 41.4779 KOps/s | |
| test_getitem[range] | 0.1728ms | 64.1073μs | 15.5989 KOps/s | 15.6695 KOps/s | |
| test_getitem[tuple] | 0.1427ms | 24.0967μs | 41.4994 KOps/s | 41.9872 KOps/s | |
| test_getitem[list] | 0.2023ms | 61.6473μs | 16.2213 KOps/s | 16.9668 KOps/s | |
| test_setitem_dim[int] | 45.0110μs | 25.7041μs | 38.9044 KOps/s | 38.2092 KOps/s | |
| test_setitem_dim[slice_int] | 67.8410μs | 43.5706μs | 22.9513 KOps/s | 22.9555 KOps/s | |
| test_setitem_dim[range] | 0.1320ms | 96.5965μs | 10.3523 KOps/s | 10.5793 KOps/s | |
| test_setitem_dim[tuple] | 77.5510μs | 41.6925μs | 23.9852 KOps/s | 24.8871 KOps/s | |
| test_setitem | 57.0000μs | 17.6548μs | 56.6417 KOps/s | 56.3790 KOps/s | |
| test_set | 51.2600μs | 16.7403μs | 59.7359 KOps/s | 59.0686 KOps/s | |
| test_set_shared | 0.5065ms | 0.2049ms | 4.8800 KOps/s | 4.8786 KOps/s | |
| test_update | 0.2006ms | 21.5987μs | 46.2992 KOps/s | 46.2204 KOps/s | |
| test_update_nested | 70.8810μs | 32.6490μs | 30.6288 KOps/s | 30.2807 KOps/s | |
| test_update__nested | 0.5192ms | 34.3329μs | 29.1266 KOps/s | 29.0457 KOps/s | |
| test_set_nested | 64.5410μs | 18.8922μs | 52.9319 KOps/s | 53.0377 KOps/s | |
| test_set_nested_new | 62.2010μs | 23.5400μs | 42.4809 KOps/s | 41.8699 KOps/s | |
| test_select | 75.8110μs | 40.7737μs | 24.5256 KOps/s | 24.7706 KOps/s | |
| test_select_nested | 0.1062ms | 74.7972μs | 13.3695 KOps/s | 13.3137 KOps/s | |
| test_exclude_nested | 0.1455ms | 91.9230μs | 10.8787 KOps/s | 10.6839 KOps/s | |
| test_empty[True] | 0.4602ms | 0.3986ms | 2.5087 KOps/s | 2.4982 KOps/s | |
| test_empty[False] | 9.5152μs | 1.3083μs | 764.3272 KOps/s | 762.6905 KOps/s | |
| test_to | 0.1052ms | 72.5123μs | 13.7908 KOps/s | 12.9554 KOps/s | |
| test_to_nonblocking | 1.2350ms | 68.1303μs | 14.6778 KOps/s | 15.3044 KOps/s | |
| test_unbind_speed | 0.3967ms | 0.3333ms | 3.0005 KOps/s | 2.9912 KOps/s | |
| test_unbind_speed_stack0 | 0.3883ms | 0.3330ms | 3.0028 KOps/s | 2.9989 KOps/s | |
| test_unbind_speed_stack1 | 0.1032s | 0.9222ms | 1.0844 KOps/s | 1.1807 KOps/s | |
| test_split | 1.2106ms | 1.1388ms | 878.0928 Ops/s | 779.6477 Ops/s | |
| test_chunk | 0.1035s | 1.2053ms | 829.6713 Ops/s | 915.1955 Ops/s | |
| test_to_cpu_blocking | 29.1969ms | 28.8476ms | 34.6650 Ops/s | 45.9701 Ops/s | |
| test_to_cpu_global_sync | 11.6476ms | 11.5150ms | 86.8431 Ops/s | 86.5873 Ops/s | |
| test_to_cpu_event_sync | 12.6843ms | 12.4977ms | 80.0148 Ops/s | 80.1953 Ops/s | |
| test_to_cpu_default | 0.1163s | 13.8070ms | 72.4268 Ops/s | 80.0505 Ops/s | |
| test_consolidate[False-None] | 4.3172ms | 4.1652ms | 240.0838 Ops/s | 243.1612 Ops/s | |
| test_consolidate[default-None] | 2.2115ms | 2.0268ms | 493.3992 Ops/s | 473.1098 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.0156ms | 1.9352ms | 516.7549 Ops/s | 489.6527 Ops/s | |
| test_consolidate_njt[False-None] | 0.1892s | 10.0235ms | 99.7654 Ops/s | 113.4438 Ops/s | |
| test_to[False-False-None] | 2.2826ms | 2.1031ms | 475.4874 Ops/s | 466.7357 Ops/s | |
| test_to[True-False-None] | 2.1896ms | 1.9497ms | 512.8969 Ops/s | 514.7273 Ops/s | |
| test_to[within-False-None] | 6.3893ms | 6.1570ms | 162.4171 Ops/s | 164.0818 Ops/s | |
| test_to[True-default-None] | 9.4853ms | 8.9255ms | 112.0384 Ops/s | 111.7918 Ops/s | |
| test_to_njt[False-False-None] | 8.9485ms | 8.5318ms | 117.2086 Ops/s | 116.5858 Ops/s | |
| test_to_njt[True-False-None] | 7.0508ms | 6.9323ms | 144.2517 Ops/s | 141.2861 Ops/s | |
| test_to_njt[within-False-None] | 16.4654ms | 15.6399ms | 63.9389 Ops/s | 63.2533 Ops/s | |
| test_creation[device0] | 0.4045ms | 0.1152ms | 8.6829 KOps/s | 8.6714 KOps/s | |
| test_creation_from_tensor | 0.4119ms | 0.1135ms | 8.8090 KOps/s | 8.7767 KOps/s | |
| test_add_one[memmap_tensor0] | 0.4038ms | 6.6553μs | 150.2573 KOps/s | 149.0716 KOps/s | |
| test_contiguous[memmap_tensor0] | 16.9210μs | 0.6718μs | 1.4885 MOps/s | 2.1289 MOps/s | |
| test_stack[memmap_tensor0] | 28.3810μs | 4.6275μs | 216.0977 KOps/s | 214.3122 KOps/s | |
| test_memmaptd_index | 0.9555ms | 0.2659ms | 3.7605 KOps/s | 3.6752 KOps/s | |
| test_memmaptd_index_astensor | 0.5209ms | 0.3681ms | 2.7169 KOps/s | 2.6490 KOps/s | |
| test_memmaptd_index_op | 0.9475ms | 0.6234ms | 1.6042 KOps/s | 1.5820 KOps/s | |
| test_serialize_model | 0.3095s | 0.1612s | 6.2025 Ops/s | 7.2671 Ops/s | |
| test_serialize_model_pickle | 1.3702s | 1.2210s | 0.8190 Ops/s | 0.8223 Ops/s | |
| test_serialize_weights | 0.1375s | 0.1354s | 7.3856 Ops/s | 7.3501 Ops/s | |
| test_serialize_weights_returnearly | 0.4256s | 88.5827ms | 11.2889 Ops/s | 6.4677 Ops/s | |
| test_serialize_weights_pickle | 1.3678s | 1.2140s | 0.8237 Ops/s | 0.8226 Ops/s | |
| test_reshape_pytree | 0.1988ms | 32.7733μs | 30.5127 KOps/s | 30.6823 KOps/s | |
| test_reshape_td | 83.0710μs | 45.7872μs | 21.8402 KOps/s | 20.8109 KOps/s | |
| test_view_pytree | 0.2362ms | 32.7573μs | 30.5275 KOps/s | 30.8110 KOps/s | |
| test_view_td | 92.1810μs | 53.2472μs | 18.7803 KOps/s | 18.9675 KOps/s | |
| test_unbind_pytree | 0.2222ms | 36.3372μs | 27.5200 KOps/s | 26.7975 KOps/s | |
| test_unbind_td | 0.1835ms | 49.7653μs | 20.0943 KOps/s | 19.7091 KOps/s | |
| test_split_pytree | 0.2497ms | 43.0775μs | 23.2140 KOps/s | 23.5903 KOps/s | |
| test_split_td | 0.1238ms | 65.3657μs | 15.2985 KOps/s | 15.4282 KOps/s | |
| test_add_pytree | 0.2302ms | 42.3795μs | 23.5963 KOps/s | 23.6347 KOps/s | |
| test_add_td | 98.3310μs | 56.9676μs | 17.5539 KOps/s | 18.2794 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1920ms | 0.1400ms | 7.1405 KOps/s | 6.4717 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.2834ms | 0.2035ms | 4.9134 KOps/s | 5.0219 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1401ms | 0.1090ms | 9.1771 KOps/s | 9.1253 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4317ms | 0.1800ms | 5.5547 KOps/s | 5.4334 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.2482ms | 10.3509μs | 96.6099 KOps/s | 97.1886 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 89.5910μs | 54.8854μs | 18.2198 KOps/s | 18.4989 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1179ms | 9.9171μs | 100.8355 KOps/s | 102.6630 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4666ms | 70.3507μs | 14.2145 KOps/s | 14.5035 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.4530ms | 0.1762ms | 5.6744 KOps/s | 5.3945 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3378ms | 0.2881ms | 3.4706 KOps/s | 3.5269 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.2100ms | 0.1174ms | 8.5207 KOps/s | 8.3424 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1251ms | 78.4360μs | 12.7492 KOps/s | 12.9381 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2467ms | 0.1580ms | 6.3274 KOps/s | 6.2003 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8048ms | 0.5225ms | 1.9137 KOps/s | 1.8489 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.5026ms | 0.3364ms | 2.9726 KOps/s | 2.9599 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2242ms | 0.1790ms | 5.5857 KOps/s | 5.1578 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1531ms | 94.3356μs | 10.6005 KOps/s | 11.1603 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.1952ms | 0.1193ms | 8.3830 KOps/s | 8.0047 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6422ms | 0.4341ms | 2.3036 KOps/s | 2.2142 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.3415ms | 0.1588ms | 6.2955 KOps/s | 6.1589 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 0.1228ms | 13.5385μs | 73.8636 KOps/s | 74.7487 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 77.1210μs | 41.8400μs | 23.9006 KOps/s | 23.7841 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1955ms | 10.7142μs | 93.3344 KOps/s | 93.1592 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4142ms | 52.7737μs | 18.9488 KOps/s | 18.9803 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 2.0090ms | 0.1747ms | 5.7252 KOps/s | 5.4883 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.3955ms | 3.3121ms | 301.9198 Ops/s | 300.6912 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9608ms | 0.1615ms | 6.1922 KOps/s | 6.1083 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9413ms | 2.7987ms | 357.3088 Ops/s | 348.4650 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.2004ms | 0.1093ms | 9.1526 KOps/s | 8.8369 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3258ms | 74.5468μs | 13.4144 KOps/s | 13.5680 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1370ms | 96.1646μs | 10.3988 KOps/s | 10.2660 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2522ms | 44.7744μs | 22.3342 KOps/s | 20.9584 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1376ms | 96.9670μs | 10.3128 KOps/s | 10.2982 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2510ms | 44.6930μs | 22.3748 KOps/s | 21.0247 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.2055ms | 56.8838μs | 17.5797 KOps/s | 16.4919 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2141ms | 27.8012μs | 35.9697 KOps/s | 35.9414 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1297ms | 44.5715μs | 22.4359 KOps/s | 22.1979 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2679ms | 22.7106μs | 44.0323 KOps/s | 43.9430 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 0.2162ms | 45.1469μs | 22.1499 KOps/s | 21.7634 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2528ms | 22.5417μs | 44.3623 KOps/s | 44.4516 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1045ms | 57.1098μs | 17.5101 KOps/s | 16.6028 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2257ms | 27.7627μs | 36.0196 KOps/s | 36.7616 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 82.9110μs | 45.0408μs | 22.2021 KOps/s | 21.3535 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2735ms | 22.4250μs | 44.5930 KOps/s | 44.3181 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 87.0910μs | 45.9542μs | 21.7608 KOps/s | 22.2291 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2572ms | 22.6578μs | 44.1350 KOps/s | 44.3582 KOps/s | |
| test_compile_replace[single-eager] | 0.1037ms | 47.6052μs | 21.0061 KOps/s | 20.4431 KOps/s | |
| test_compile_replace[single-compile] | 0.1789ms | 0.1052ms | 9.5065 KOps/s | 9.0018 KOps/s | |
| test_compile_replace[multi-eager] | 0.6965ms | 0.5664ms | 1.7657 KOps/s | 1.7841 KOps/s | |
| test_compile_replace[multi-compile] | 0.2909ms | 0.1112ms | 8.9901 KOps/s | 8.8860 KOps/s | |
| test_compile_tc_getattr_20[eager] | 0.2317ms | 0.1677ms | 5.9637 KOps/s | 5.9238 KOps/s | |
| test_compile_tc_getattr_20[compile] | 0.1839ms | 0.1210ms | 8.2627 KOps/s | 8.2380 KOps/s | |
| test_compile_clone_shallow[20-eager] | 52.5200μs | 19.3137μs | 51.7768 KOps/s | 51.1365 KOps/s | |
| test_compile_clone_shallow[20-compile] | 62.3210μs | 11.6380μs | 85.9252 KOps/s | 87.9237 KOps/s | |
| test_compile_clone_shallow[40-eager] | 66.4710μs | 33.2598μs | 30.0663 KOps/s | 29.0132 KOps/s | |
| test_compile_clone_shallow[40-compile] | 51.4810μs | 12.5983μs | 79.3758 KOps/s | 69.4117 KOps/s | |
| test_compile_clone_shallow[80-eager] | 0.1260ms | 63.3705μs | 15.7802 KOps/s | 15.7715 KOps/s | |
| test_compile_clone_shallow[80-compile] | 88.4910μs | 15.6438μs | 63.9231 KOps/s | 66.3864 KOps/s | |
| test_compile_update_inplace[eager] | 0.1045ms | 58.3853μs | 17.1276 KOps/s | 16.8882 KOps/s | |
| test_compile_update_inplace[compile] | 0.2411ms | 0.1415ms | 7.0668 KOps/s | 6.9357 KOps/s | |
| test_mod_add[eager] | 0.1066ms | 51.0324μs | 19.5954 KOps/s | 20.6186 KOps/s | |
| test_mod_add[compile] | 0.3732ms | 0.1090ms | 9.1778 KOps/s | 9.2736 KOps/s | |
| test_mod_add[compile-overhead] | 0.5479ms | 0.1502ms | 6.6566 KOps/s | 6.5746 KOps/s | |
| test_mod_wrap[eager] | 0.4512ms | 0.2938ms | 3.4034 KOps/s | 3.3722 KOps/s | |
| test_mod_wrap[compile] | 0.5159ms | 0.3612ms | 2.7688 KOps/s | 2.8004 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.2748ms | 3.9949ms | 250.3201 Ops/s | 248.1475 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.5977ms | 1.4949ms | 668.9542 Ops/s | 597.8570 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.8580ms | 1.4484ms | 690.3983 Ops/s | 681.6746 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2365ms | 0.8879ms | 1.1262 KOps/s | 1.1076 KOps/s | |
| test_seq_add[eager] | 0.2053ms | 0.1543ms | 6.4801 KOps/s | 6.5325 KOps/s | |
| test_seq_add[compile] | 0.3669ms | 0.1128ms | 8.8624 KOps/s | 8.0063 KOps/s | |
| test_seq_add[compile-overhead] | 0.3189ms | 0.1537ms | 6.5068 KOps/s | 5.8220 KOps/s | |
| test_seq_wrap[eager] | 0.6164ms | 0.5197ms | 1.9241 KOps/s | 1.8061 KOps/s | |
| test_seq_wrap[compile] | 0.4512ms | 0.3640ms | 2.7469 KOps/s | 2.5536 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3508ms | 0.2655ms | 3.7664 KOps/s | 3.5734 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9035ms | 0.8419ms | 1.1878 KOps/s | 1.1099 KOps/s | |
| test_func_call_runtime[False-compile] | 1.0953ms | 0.9108ms | 1.0979 KOps/s | 1.0830 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5168ms | 0.4644ms | 2.1531 KOps/s | 2.1435 KOps/s | |
| test_func_call_runtime[True-eager] | 1.1980ms | 1.0776ms | 927.9459 Ops/s | 905.8089 Ops/s | |
| test_func_call_runtime[True-compile] | 1.0164ms | 0.9206ms | 1.0862 KOps/s | 1.0300 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5907ms | 0.4773ms | 2.0950 KOps/s | 2.0554 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.9584ms | 0.8701ms | 1.1493 KOps/s | 1.1786 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.0282ms | 0.9181ms | 1.0892 KOps/s | 1.0829 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5114ms | 0.4657ms | 2.1474 KOps/s | 2.1156 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3103ms | 1.2195ms | 820.0415 Ops/s | 803.4360 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.0097ms | 0.9621ms | 1.0393 KOps/s | 1.0294 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5634ms | 0.5101ms | 1.9605 KOps/s | 1.9050 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8699ms | 2.3677ms | 422.3565 Ops/s | 413.7972 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0843ms | 0.9788ms | 1.0216 KOps/s | 966.8806 Ops/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.6301ms | 0.5175ms | 1.9324 KOps/s | 1.9005 KOps/s | |
| test_distributed | 0.9960ms | 0.1530ms | 6.5362 KOps/s | 6.4920 KOps/s | |
| test_tdmodule | 0.2846ms | 27.6568μs | 36.1574 KOps/s | 36.5537 KOps/s | |
| test_tdmodule_dispatch | 74.5010μs | 44.6028μs | 22.4201 KOps/s | 22.6879 KOps/s | |
| test_tdseq | 45.4000μs | 26.5700μs | 37.6364 KOps/s | 36.9763 KOps/s | |
| test_tdseq_dispatch | 86.0710μs | 46.2315μs | 21.6303 KOps/s | 20.7614 KOps/s | |
| test_instantiation_functorch | 2.2034ms | 2.0963ms | 477.0265 Ops/s | 477.7465 Ops/s | |
| test_exec_functorch | 0.2425ms | 0.1800ms | 5.5557 KOps/s | 5.5651 KOps/s | |
| test_exec_functional_call | 0.2232ms | 0.1617ms | 6.1825 KOps/s | 6.1607 KOps/s | |
| test_exec_td_decorator | 0.4740ms | 0.2395ms | 4.1749 KOps/s | 4.1850 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0259ms | 0.8265ms | 1.2100 KOps/s | 1.1925 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 1.0003ms | 0.8270ms | 1.2092 KOps/s | 1.1894 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.9098ms | 0.7132ms | 1.4021 KOps/s | 1.3828 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8857ms | 0.7115ms | 1.4056 KOps/s | 1.3860 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.4595ms | 20.5890ms | 48.5697 Ops/s | 48.2602 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.2069ms | 20.5542ms | 48.6519 Ops/s | 48.2158 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 21.0956ms | 20.3503ms | 49.1393 Ops/s | 48.7010 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.8946ms | 20.4087ms | 48.9987 Ops/s | 48.7588 Ops/s | |
| test_to_module_speed[True] | 1.5837ms | 1.4810ms | 675.2342 Ops/s | 675.3895 Ops/s | |
| test_to_module_speed[False] | 1.5746ms | 1.4653ms | 682.4683 Ops/s | 691.0312 Ops/s | |
| test_tc_init | 68.8610μs | 44.6479μs | 22.3975 KOps/s | 22.0423 KOps/s | |
| test_tc_init_tensor_only | 36.9210μs | 9.7133μs | 102.9514 KOps/s | 103.1590 KOps/s | |
| test_tc_init_nested | 0.1225ms | 87.4584μs | 11.4340 KOps/s | 11.2940 KOps/s | |
| test_tc_init_many_fields | 59.2210μs | 16.3049μs | 61.3314 KOps/s | 62.4550 KOps/s | |
| test_tc_first_layer_tensor | 29.7710μs | 1.7953μs | 557.0174 KOps/s | 557.7563 KOps/s | |
| test_tc_first_layer_tensor_only | 3.1431μs | 0.4046μs | 2.4714 MOps/s | 2.5396 MOps/s | |
| test_tc_first_layer_tensor_set | 34.7600μs | 3.9364μs | 254.0416 KOps/s | 252.8614 KOps/s | |
| test_tc_first_layer_tensor_only_set | 23.4610μs | 3.2527μs | 307.4349 KOps/s | 306.1626 KOps/s | |
| test_tc_first_layer_nontensor | 8.0168ms | 6.7822μs | 147.4445 KOps/s | 158.5641 KOps/s | |
| test_tc_second_layer_tensor | 26.7010μs | 4.4116μs | 226.6744 KOps/s | 224.8003 KOps/s | |
| test_tc_second_layer_nontensor | 38.7600μs | 9.1365μs | 109.4505 KOps/s | 112.4267 KOps/s | |
| test_unbind | 0.2499s | 14.2243ms | 70.3020 Ops/s | 56.3898 Ops/s | |
| test_full_like | 7.5611ms | 4.3955ms | 227.5058 Ops/s | 227.9905 Ops/s | |
| test_zeros_like | 5.0579ms | 4.3672ms | 228.9775 Ops/s | 137.3324 Ops/s | |
| test_ones_like | 4.9305ms | 4.3728ms | 228.6854 Ops/s | 228.5342 Ops/s | |
| test_clone | 6.6695ms | 6.4224ms | 155.7061 Ops/s | 155.7894 Ops/s | |
| test_squeeze | 84.3110μs | 13.8840μs | 72.0253 KOps/s | 68.6368 KOps/s | |
| test_unsqueeze | 0.1626ms | 0.1120ms | 8.9263 KOps/s | 9.0217 KOps/s | |
| test_split | 0.2493ms | 0.1855ms | 5.3908 KOps/s | 5.3784 KOps/s | |
| test_permute | 0.2488ms | 0.2047ms | 4.8852 KOps/s | 4.8347 KOps/s | |
| test_stack | 51.2219ms | 50.8967ms | 19.6477 Ops/s | 19.6529 Ops/s | |
| test_cat | 51.1477ms | 50.8561ms | 19.6633 Ops/s | 19.7246 Ops/s | |
| test_sequential_tensordict | 0.2970ms | 0.2187ms | 4.5719 KOps/s | 4.3002 KOps/s | |
| test_sequential_graph_module | 0.1671ms | 0.1186ms | 8.4331 KOps/s | 7.9809 KOps/s | |
| test_nested_tensordict | 0.4367ms | 0.2910ms | 3.4364 KOps/s | 3.4615 KOps/s | |
| test_nested_graph_module | 0.1736ms | 0.1305ms | 7.6657 KOps/s | 7.5708 KOps/s |
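
The Change column in the table above is empty, so the comparison against Repo HEAD has to be read off the two Ops columns. The following is a minimal sketch, not part of the benchmark tooling, of how a row could be checked by hand: it assumes that Ops is the reciprocal of Mean (which matches the values shown) and defines the relative change as (Ops - Ops on HEAD) / Ops on HEAD. The helper names `parse_time`, `parse_rate`, and `relative_change` are hypothetical.

```python
# Unit multipliers for the cell formats used in the table (assumed from the values shown).
TIME_UNITS = {"μs": 1e-6, "ms": 1e-3, "s": 1.0}
RATE_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_time(cell: str) -> float:
    """Convert a Max/Mean cell such as '14.8348μs' to seconds."""
    cell = cell.strip()
    # Check the two-character suffixes before the bare 's'.
    for suffix in ("μs", "ms", "s"):
        if cell.endswith(suffix):
            return float(cell[: -len(suffix)]) * TIME_UNITS[suffix]
    raise ValueError(f"unrecognized time cell: {cell!r}")


def parse_rate(cell: str) -> float:
    """Convert an Ops cell such as '67.4089 KOps/s' to operations per second."""
    value, unit = cell.strip().split()
    return float(value) * RATE_UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative difference of the PR throughput versus Repo HEAD (negative = slower)."""
    pr, head = parse_rate(ops_pr), parse_rate(ops_head)
    return (pr - head) / head


if __name__ == "__main__":
    # Row 'test_plain_set_nested': Ops should be close to 1 / Mean.
    mean_s = parse_time("14.8348μs")
    print(1.0 / mean_s, parse_rate("67.4089 KOps/s"))
    # Row 'test_to_cpu_blocking': fill in the missing Change value.
    print(f"{relative_change('34.6650 Ops/s', '45.9701 Ops/s'):+.1%}")
```

Single-run differences of a few percent are within typical benchmark noise, so a value computed this way is only a starting point for deciding whether a row deserves a rerun.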

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 41.9010μs | 14.8839μs | 67.1867 KOps/s | 67.4432 KOps/s | |
| test_plain_set_stack_nested | 36.5600μs | 15.4389μs | 64.7714 KOps/s | 65.7056 KOps/s | |
| test_plain_set_nested_inplace | 46.9800μs | 16.7791μs | 59.5980 KOps/s | 59.0559 KOps/s | |
| test_plain_set_stack_nested_inplace | 60.0010μs | 16.7395μs | 59.7389 KOps/s | 59.2602 KOps/s | |
| test_items | 38.1200μs | 6.0611μs | 164.9863 KOps/s | 167.4624 KOps/s | |
| test_items_nested | 0.6426ms | 0.4740ms | 2.1095 KOps/s | 2.1218 KOps/s | |
| test_items_nested_locked | 0.5586ms | 0.4734ms | 2.1124 KOps/s | 2.1181 KOps/s | |
| test_items_nested_leaf | 0.1575ms | 97.6795μs | 10.2376 KOps/s | 10.0610 KOps/s | |
| test_items_stack_nested | 0.5521ms | 0.4649ms | 2.1512 KOps/s | 2.1403 KOps/s | |
| test_items_stack_nested_leaf | 0.1705ms | 98.1917μs | 10.1842 KOps/s | 10.3003 KOps/s | |
| test_items_stack_nested_locked | 0.5149ms | 0.4727ms | 2.1153 KOps/s | 2.1102 KOps/s | |
| test_keys | 22.6810μs | 4.1751μs | 239.5156 KOps/s | 235.1571 KOps/s | |
| test_keys_nested | 0.1778ms | 0.1295ms | 7.7245 KOps/s | 7.7816 KOps/s | |
| test_keys_nested_locked | 2.1315ms | 0.1389ms | 7.1972 KOps/s | 7.2274 KOps/s | |
| test_keys_nested_leaf | 0.1580ms | 0.1205ms | 8.2995 KOps/s | 8.3017 KOps/s | |
| test_keys_stack_nested | 0.1754ms | 0.1311ms | 7.6283 KOps/s | 7.6904 KOps/s | |
| test_keys_stack_nested_leaf | 0.1582ms | 0.1218ms | 8.2104 KOps/s | 8.2782 KOps/s | |
| test_keys_stack_nested_locked | 0.1884ms | 0.1378ms | 7.2574 KOps/s | 7.3132 KOps/s | |
| test_values | 6.3662μs | 1.0166μs | 983.6594 KOps/s | 989.3225 KOps/s | |
| test_values_nested | 84.7910μs | 52.9648μs | 18.8805 KOps/s | 19.0431 KOps/s | |
| test_values_nested_locked | 96.6720μs | 55.7532μs | 17.9362 KOps/s | 18.0071 KOps/s | |
| test_values_nested_leaf | 91.5410μs | 60.2657μs | 16.5932 KOps/s | 16.7067 KOps/s | |
| test_values_stack_nested | 84.3510μs | 52.7572μs | 18.9547 KOps/s | 18.8732 KOps/s | |
| test_values_stack_nested_leaf | 85.9520μs | 60.7314μs | 16.4659 KOps/s | 16.6702 KOps/s | |
| test_values_stack_nested_locked | 0.1266ms | 55.4184μs | 18.0445 KOps/s | 18.0318 KOps/s | |
| test_membership | 5.6433μs | 0.8170μs | 1.2241 MOps/s | 1.1832 MOps/s | |
| test_membership_nested | 29.9910μs | 2.8632μs | 349.2614 KOps/s | 351.5292 KOps/s | |
| test_membership_nested_leaf | 25.9810μs | 2.8814μs | 347.0578 KOps/s | 348.4387 KOps/s | |
| test_membership_stacked_nested | 25.2810μs | 2.8844μs | 346.6866 KOps/s | 345.6680 KOps/s | |
| test_membership_stacked_nested_leaf | 34.5200μs | 2.9078μs | 343.8972 KOps/s | 347.6154 KOps/s | |
| test_membership_nested_last | 31.8200μs | 4.2493μs | 235.3310 KOps/s | 228.6653 KOps/s | |
| test_membership_nested_leaf_last | 31.1410μs | 4.4072μs | 226.9024 KOps/s | 229.5143 KOps/s | |
| test_membership_stacked_nested_last | 28.3110μs | 4.3601μs | 229.3523 KOps/s | 232.5882 KOps/s | |
| test_membership_stacked_nested_leaf_last | 31.9910μs | 4.3922μs | 227.6749 KOps/s | 229.9623 KOps/s | |
| test_nested_getleaf | 47.7110μs | 21.9262μs | 45.6076 KOps/s | 47.2950 KOps/s | |
| test_nested_get | 46.5510μs | 20.7590μs | 48.1719 KOps/s | 49.8981 KOps/s | |
| test_stacked_getleaf | 50.1110μs | 21.5943μs | 46.3086 KOps/s | 47.5911 KOps/s | |
| test_stacked_get | 48.1000μs | 20.7672μs | 48.1529 KOps/s | 49.9413 KOps/s | |
| test_nested_getitemleaf | 47.8110μs | 21.9882μs | 45.4789 KOps/s | 45.3745 KOps/s | |
| test_nested_getitem | 46.7700μs | 20.9438μs | 47.7469 KOps/s | 47.8915 KOps/s | |
| test_stacked_getitemleaf | 48.8010μs | 22.0084μs | 45.4371 KOps/s | 45.8756 KOps/s | |
| test_stacked_getitem | 45.5310μs | 20.9546μs | 47.7223 KOps/s | 48.6569 KOps/s | |
| test_lock_nested | 0.5780ms | 0.4866ms | 2.0552 KOps/s | 2.1037 KOps/s | |
| test_lock_stack_nested | 0.5612ms | 0.4902ms | 2.0401 KOps/s | 2.0662 KOps/s | |
| test_unlock_nested | 0.4852ms | 0.3985ms | 2.5094 KOps/s | 2.5933 KOps/s | |
| test_unlock_stack_nested | 0.4312ms | 0.3986ms | 2.5085 KOps/s | 2.5623 KOps/s | |
| test_flatten_speed | 0.1667ms | 0.1214ms | 8.2390 KOps/s | 8.1984 KOps/s | |
| test_unflatten_speed | 0.6438ms | 0.5667ms | 1.7647 KOps/s | 1.7631 KOps/s | |
| test_common_ops | 0.8980ms | 0.7158ms | 1.3971 KOps/s | 1.4286 KOps/s | |
| test_creation | 79.1410μs | 3.1149μs | 321.0403 KOps/s | 319.6905 KOps/s | |
| test_creation_empty | 28.1410μs | 7.0308μs | 142.2311 KOps/s | 143.5621 KOps/s | |
| test_creation_nested_1 | 44.0210μs | 11.6346μs | 85.9502 KOps/s | 86.7159 KOps/s | |
| test_creation_nested_2 | 53.5500μs | 13.3977μs | 74.6397 KOps/s | 75.3641 KOps/s | |
| test_creation_many_keys[10] | 50.8010μs | 20.9854μs | 47.6522 KOps/s | 47.6802 KOps/s | |
| test_creation_many_keys[50] | 0.1352ms | 88.8833μs | 11.2507 KOps/s | 10.9389 KOps/s | |
| test_creation_many_keys[100] | 0.2062ms | 0.1746ms | 5.7263 KOps/s | 5.5315 KOps/s | |
| test_creation_nested_many_keys[10] | 76.8010μs | 44.6984μs | 22.3722 KOps/s | 22.1730 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2405ms | 0.1816ms | 5.5072 KOps/s | 5.3760 KOps/s | |
| test_clone | 40.8910μs | 13.2657μs | 75.3824 KOps/s | 73.9413 KOps/s | |
| test_getitem[int] | 1.5386ms | 15.3883μs | 64.9845 KOps/s | 60.9023 KOps/s | |
| test_getitem[slice_int] | 0.1421ms | 24.8367μs | 40.2629 KOps/s | 41.3750 KOps/s | |
| test_getitem[range] | 0.1886ms | 63.6937μs | 15.7001 KOps/s | 15.7303 KOps/s | |
| test_getitem[tuple] | 0.1374ms | 24.1731μs | 41.3683 KOps/s | 41.9732 KOps/s | |
| test_getitem[list] | 0.1924ms | 57.4966μs | 17.3923 KOps/s | 17.2610 KOps/s | |
| test_setitem_dim[int] | 45.6710μs | 25.4313μs | 39.3217 KOps/s | 38.1948 KOps/s | |
| test_setitem_dim[slice_int] | 66.7820μs | 42.2920μs | 23.6451 KOps/s | 23.1034 KOps/s | |
| test_setitem_dim[range] | 0.1171ms | 94.1259μs | 10.6241 KOps/s | 10.4827 KOps/s | |
| test_setitem_dim[tuple] | 61.7810μs | 38.9506μs | 25.6736 KOps/s | 24.8449 KOps/s | |
| test_setitem | 48.4910μs | 17.5960μs | 56.8311 KOps/s | 56.6129 KOps/s | |
| test_set | 44.6910μs | 16.9330μs | 59.0562 KOps/s | 58.5708 KOps/s | |
| test_set_shared | 0.4980ms | 0.2047ms | 4.8851 KOps/s | 4.8773 KOps/s | |
| test_update | 0.3585ms | 21.8236μs | 45.8221 KOps/s | 45.7303 KOps/s | |
| test_update_nested | 74.0310μs | 33.6854μs | 29.6864 KOps/s | 29.8426 KOps/s | |
| test_update__nested | 0.4341ms | 34.1891μs | 29.2491 KOps/s | 28.7883 KOps/s | |
| test_set_nested | 54.0310μs | 18.7495μs | 53.3346 KOps/s | 52.8231 KOps/s | |
| test_set_nested_new | 60.0410μs | 23.7928μs | 42.0296 KOps/s | 41.6141 KOps/s | |
| test_select | 81.5820μs | 40.5694μs | 24.6491 KOps/s | 23.8753 KOps/s | |
| test_select_nested | 0.1025ms | 73.8889μs | 13.5338 KOps/s | 13.3498 KOps/s | |
| test_exclude_nested | 0.1270ms | 91.0455μs | 10.9835 KOps/s | 10.8830 KOps/s | |
| test_empty[True] | 0.4868ms | 0.3987ms | 2.5079 KOps/s | 2.5194 KOps/s | |
| test_empty[False] | 7.2300μs | 1.3004μs | 769.0212 KOps/s | 758.0321 KOps/s | |
| test_to | 0.1055ms | 74.6511μs | 13.3956 KOps/s | 13.1878 KOps/s | |
| test_to_nonblocking | 0.1103ms | 67.9329μs | 14.7204 KOps/s | 15.4394 KOps/s | |
| test_unbind_speed | 0.3878ms | 0.3416ms | 2.9277 KOps/s | 3.0102 KOps/s | |
| test_unbind_speed_stack0 | 0.4289ms | 0.3386ms | 2.9535 KOps/s | 3.0082 KOps/s | |
| test_unbind_speed_stack1 | 0.1050s | 0.9368ms | 1.0674 KOps/s | 1.1880 KOps/s | |
| test_split | 1.2415ms | 1.1393ms | 877.7650 Ops/s | 784.8170 Ops/s | |
| test_chunk | 0.1048s | 1.2087ms | 827.3477 Ops/s | 923.0060 Ops/s | |
| test_to_cpu_blocking | 29.0982ms | 28.8132ms | 34.7063 Ops/s | 35.0451 Ops/s | |
| test_to_cpu_global_sync | 11.6557ms | 11.5147ms | 86.8453 Ops/s | 88.5908 Ops/s | |
| test_to_cpu_event_sync | 12.6491ms | 12.4593ms | 80.2616 Ops/s | 81.6905 Ops/s | |
| test_to_cpu_default | 0.1169s | 13.7641ms | 72.6529 Ops/s | 81.3717 Ops/s | |
| test_consolidate[False-None] | 4.3855ms | 4.1864ms | 238.8687 Ops/s | 242.6339 Ops/s | |
| test_consolidate[default-None] | 2.1985ms | 2.0763ms | 481.6350 Ops/s | 490.1776 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.0782ms | 1.9922ms | 501.9467 Ops/s | 510.1810 Ops/s | |
| test_consolidate_njt[False-None] | 8.7487ms | 8.4806ms | 117.9156 Ops/s | 118.5688 Ops/s | |
| test_to[False-False-None] | 2.2208ms | 2.1056ms | 474.9228 Ops/s | 480.3764 Ops/s | |
| test_to[True-False-None] | 2.2195ms | 1.9481ms | 513.3235 Ops/s | 528.9654 Ops/s | |
| test_to[within-False-None] | 6.2693ms | 6.1602ms | 162.3324 Ops/s | 163.6426 Ops/s | |
| test_to[True-default-None] | 9.0716ms | 8.7932ms | 113.7248 Ops/s | 111.9064 Ops/s | |
| test_to_njt[False-False-None] | 8.8877ms | 8.4700ms | 118.0639 Ops/s | 117.7471 Ops/s | |
| test_to_njt[True-False-None] | 7.2830ms | 6.9928ms | 143.0046 Ops/s | 144.4577 Ops/s | |
| test_to_njt[within-False-None] | 0.1860s | 18.1448ms | 55.1123 Ops/s | 63.0028 Ops/s | |
| test_creation[device0] | 0.3971ms | 0.1161ms | 8.6110 KOps/s | 8.6960 KOps/s | |
| test_creation_from_tensor | 0.4095ms | 0.1127ms | 8.8729 KOps/s | 8.5264 KOps/s | |
| test_add_one[memmap_tensor0] | 0.3027ms | 6.6493μs | 150.3918 KOps/s | 152.3968 KOps/s | |
| test_contiguous[memmap_tensor0] | 27.1110μs | 0.6735μs | 1.4848 MOps/s | 2.1553 MOps/s | |
| test_stack[memmap_tensor0] | 58.1910μs | 4.7147μs | 212.1028 KOps/s | 216.9211 KOps/s | |
| test_memmaptd_index | 1.0749ms | 0.2754ms | 3.6316 KOps/s | 3.7592 KOps/s | |
| test_memmaptd_index_astensor | 0.5436ms | 0.3787ms | 2.6403 KOps/s | 2.7177 KOps/s | |
| test_memmaptd_index_op | 1.0199ms | 0.6329ms | 1.5800 KOps/s | 1.6074 KOps/s | |
| test_serialize_model | 0.1379s | 0.1360s | 7.3553 Ops/s | 7.4120 Ops/s | |
| test_serialize_model_pickle | 1.3614s | 1.1836s | 0.8449 Ops/s | 0.8230 Ops/s | |
| test_serialize_weights | 0.1375s | 0.1345s | 7.4374 Ops/s | 7.4268 Ops/s | |
| test_serialize_weights_returnearly | 0.4320s | 87.0403ms | 11.4889 Ops/s | 11.9424 Ops/s | |
| test_serialize_weights_pickle | 1.3674s | 1.2131s | 0.8244 Ops/s | 0.8216 Ops/s | |
| test_reshape_pytree | 0.2075ms | 32.7865μs | 30.5004 KOps/s | 30.5654 KOps/s | |
| test_reshape_td | 75.4010μs | 44.9839μs | 22.2302 KOps/s | 22.5079 KOps/s | |
| test_view_pytree | 0.2313ms | 32.4522μs | 30.8146 KOps/s | 30.6605 KOps/s | |
| test_view_td | 0.1053ms | 54.7389μs | 18.2685 KOps/s | 18.9569 KOps/s | |
| test_unbind_pytree | 0.2330ms | 36.3629μs | 27.5005 KOps/s | 27.0158 KOps/s | |
| test_unbind_td | 0.2006ms | 50.3752μs | 19.8510 KOps/s | 20.2403 KOps/s | |
| test_split_pytree | 0.2939ms | 42.5633μs | 23.4944 KOps/s | 23.6048 KOps/s | |
| test_split_td | 0.1276ms | 65.2356μs | 15.3290 KOps/s | 15.6560 KOps/s | |
| test_add_pytree | 0.2311ms | 42.7601μs | 23.3863 KOps/s | 23.5767 KOps/s | |
| test_add_td | 0.2106ms | 57.8138μs | 17.2969 KOps/s | 18.0387 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1933ms | 0.1400ms | 7.1407 KOps/s | 6.7761 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.3712ms | 0.2022ms | 4.9447 KOps/s | 4.9972 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1727ms | 0.1078ms | 9.2737 KOps/s | 9.1815 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4564ms | 0.1835ms | 5.4485 KOps/s | 5.6012 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.3301ms | 10.8491μs | 92.1739 KOps/s | 94.7535 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 97.8220μs | 54.0908μs | 18.4874 KOps/s | 18.4413 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1169ms | 9.8307μs | 101.7222 KOps/s | 101.9246 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4089ms | 69.0247μs | 14.4876 KOps/s | 14.3539 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2822ms | 0.1760ms | 5.6807 KOps/s | 5.4438 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3398ms | 0.2778ms | 3.5997 KOps/s | 3.5826 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.2055ms | 0.1167ms | 8.5673 KOps/s | 7.8755 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1085ms | 73.7313μs | 13.5628 KOps/s | 13.6186 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2229ms | 0.1581ms | 6.3265 KOps/s | 6.0157 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8366ms | 0.5335ms | 1.8745 KOps/s | 1.8963 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4993ms | 0.3333ms | 3.0003 KOps/s | 2.9636 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2776ms | 0.1791ms | 5.5837 KOps/s | 5.2729 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1364ms | 91.2261μs | 10.9618 KOps/s | 11.1417 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.2690ms | 0.1216ms | 8.2260 KOps/s | 8.0718 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.7010ms | 0.4422ms | 2.2617 KOps/s | 2.2861 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.3189ms | 0.1593ms | 6.2773 KOps/s | 5.9827 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 0.1106ms | 14.1227μs | 70.8078 KOps/s | 75.0283 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 72.2210μs | 41.6491μs | 24.0101 KOps/s | 24.5280 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1318ms | 10.8652μs | 92.0369 KOps/s | 91.9878 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4287ms | 52.1059μs | 19.1917 KOps/s | 19.2027 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 2.0450ms | 0.1765ms | 5.6668 KOps/s | 5.5794 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.5049ms | 3.3118ms | 301.9512 Ops/s | 303.9276 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9855ms | 0.1636ms | 6.1107 KOps/s | 6.1010 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9226ms | 2.7968ms | 357.5484 Ops/s | 360.4683 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.1473ms | 0.1087ms | 9.2024 KOps/s | 9.1550 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3237ms | 77.7366μs | 12.8639 KOps/s | 13.4958 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.2301ms | 95.5329μs | 10.4676 KOps/s | 10.4864 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2523ms | 47.3279μs | 21.1292 KOps/s | 22.3496 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1558ms | 96.6009μs | 10.3519 KOps/s | 10.4736 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2377ms | 46.8523μs | 21.3437 KOps/s | 22.3237 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1928ms | 56.6454μs | 17.6537 KOps/s | 17.4378 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2270ms | 27.7549μs | 36.0297 KOps/s | 36.6754 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1658ms | 44.2720μs | 22.5876 KOps/s | 22.8950 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2491ms | 22.6693μs | 44.1125 KOps/s | 44.2313 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 0.1040ms | 44.6910μs | 22.3759 KOps/s | 22.1977 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2510ms | 22.7263μs | 44.0018 KOps/s | 44.3899 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 91.0220μs | 57.6760μs | 17.3382 KOps/s | 17.6401 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.3053ms | 28.0356μs | 35.6690 KOps/s | 37.3377 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 86.4210μs | 44.3853μs | 22.5300 KOps/s | 22.2422 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2614ms | 22.6285μs | 44.1920 KOps/s | 44.0027 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 86.5310μs | 44.8756μs | 22.2838 KOps/s | 22.0986 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2581ms | 22.6280μs | 44.1931 KOps/s | 44.5566 KOps/s | |
| test_compile_replace[single-eager] | 86.8020μs | 47.2307μs | 21.1727 KOps/s | 20.9513 KOps/s | |
| test_compile_replace[single-compile] | 0.2171ms | 0.1047ms | 9.5508 KOps/s | 9.6011 KOps/s | |
| test_compile_replace[multi-eager] | 0.6234ms | 0.5642ms | 1.7723 KOps/s | 1.8264 KOps/s | |
| test_compile_replace[multi-compile] | 0.1966ms | 0.1114ms | 8.9738 KOps/s | 9.0365 KOps/s | |
| test_compile_tc_getattr_20[eager] | 0.2227ms | 0.1707ms | 5.8592 KOps/s | 6.1093 KOps/s | |
| test_compile_tc_getattr_20[compile] | 0.2840ms | 0.1184ms | 8.4490 KOps/s | 8.4183 KOps/s | |
| test_compile_clone_shallow[20-eager] | 49.8010μs | 19.5160μs | 51.2400 KOps/s | 51.7538 KOps/s | |
| test_compile_clone_shallow[20-compile] | 52.8910μs | 11.4703μs | 87.1818 KOps/s | 86.8162 KOps/s | |
| test_compile_clone_shallow[40-eager] | 0.1198ms | 34.1166μs | 29.3113 KOps/s | 29.6810 KOps/s | |
| test_compile_clone_shallow[40-compile] | 46.3710μs | 13.0290μs | 76.7522 KOps/s | 80.0917 KOps/s | |
| test_compile_clone_shallow[80-eager] | 0.1011ms | 63.7100μs | 15.6961 KOps/s | 15.9362 KOps/s | |
| test_compile_clone_shallow[80-compile] | 54.5610μs | 15.2151μs | 65.7244 KOps/s | 65.3351 KOps/s | |
| test_compile_update_inplace[eager] | 94.4320μs | 60.7331μs | 16.4655 KOps/s | 17.1292 KOps/s | |
| test_compile_update_inplace[compile] | 0.6843ms | 0.1401ms | 7.1365 KOps/s | 6.7903 KOps/s | |
| test_mod_add[eager] | 0.1041ms | 50.0789μs | 19.9685 KOps/s | 20.4108 KOps/s | |
| test_mod_add[compile] | 0.5004ms | 0.1049ms | 9.5328 KOps/s | 9.5027 KOps/s | |
| test_mod_add[compile-overhead] | 0.2365ms | 0.1483ms | 6.7446 KOps/s | 6.6476 KOps/s | |
| test_mod_wrap[eager] | 0.3728ms | 0.2908ms | 3.4391 KOps/s | 3.4431 KOps/s | |
| test_mod_wrap[compile] | 0.4480ms | 0.3477ms | 2.8757 KOps/s | 2.8448 KOps/s | |
| test_mod_wrap[compile-overhead] | 4.7624ms | 2.6698ms | 374.5608 Ops/s | 249.1080 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.6443ms | 1.5114ms | 661.6468 Ops/s | 671.0626 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.6704ms | 1.5536ms | 643.6773 Ops/s | 691.0024 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.4526ms | 0.9867ms | 1.0135 KOps/s | 1.1106 KOps/s | |
| test_seq_add[eager] | 0.2223ms | 0.1541ms | 6.4912 KOps/s | 6.4936 KOps/s | |
| test_seq_add[compile] | 0.1981ms | 0.1130ms | 8.8526 KOps/s | 8.4678 KOps/s | |
| test_seq_add[compile-overhead] | 0.4034ms | 0.1530ms | 6.5380 KOps/s | 6.3003 KOps/s | |
| test_seq_wrap[eager] | 0.6287ms | 0.5224ms | 1.9141 KOps/s | 1.9191 KOps/s | |
| test_seq_wrap[compile] | 0.4670ms | 0.3661ms | 2.7317 KOps/s | 2.7180 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3528ms | 0.2654ms | 3.7678 KOps/s | 3.7333 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9093ms | 0.8449ms | 1.1836 KOps/s | 1.1897 KOps/s | |
| test_func_call_runtime[False-compile] | 1.1016ms | 0.9173ms | 1.0902 KOps/s | 1.0970 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5780ms | 0.4613ms | 2.1680 KOps/s | 2.1613 KOps/s | |
| test_func_call_runtime[True-eager] | 1.1502ms | 1.0722ms | 932.6604 Ops/s | 929.4673 Ops/s | |
| test_func_call_runtime[True-compile] | 0.9979ms | 0.9316ms | 1.0735 KOps/s | 1.0671 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5457ms | 0.4768ms | 2.0974 KOps/s | 2.0905 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.9681ms | 0.8409ms | 1.1892 KOps/s | 1.1336 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.0271ms | 0.9170ms | 1.0905 KOps/s | 1.0911 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5345ms | 0.4650ms | 2.1505 KOps/s | 2.1532 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3152ms | 1.2240ms | 817.0262 Ops/s | 815.4659 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.0214ms | 0.9614ms | 1.0402 KOps/s | 1.0458 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5598ms | 0.5101ms | 1.9604 KOps/s | 1.9449 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8685ms | 2.3789ms | 420.3695 Ops/s | 419.5055 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0337ms | 0.9845ms | 1.0157 KOps/s | 1.0213 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5595ms | 0.5148ms | 1.9426 KOps/s | 1.9297 KOps/s | |
| test_distributed | 2.8514ms | 0.1684ms | 5.9393 KOps/s | 6.0710 KOps/s | |
| test_tdmodule | 0.3757ms | 28.2052μs | 35.4545 KOps/s | 36.3839 KOps/s | |
| test_tdmodule_dispatch | 76.6110μs | 45.7037μs | 21.8801 KOps/s | 21.9757 KOps/s | |
| test_tdseq | 46.4110μs | 26.7351μs | 37.4041 KOps/s | 37.1168 KOps/s | |
| test_tdseq_dispatch | 70.6110μs | 47.6692μs | 20.9779 KOps/s | 20.9324 KOps/s | |
| test_instantiation_functorch | 2.2046ms | 2.1094ms | 474.0592 Ops/s | 480.0678 Ops/s | |
| test_exec_functorch | 0.2591ms | 0.1790ms | 5.5875 KOps/s | 5.5788 KOps/s | |
| test_exec_functional_call | 0.2589ms | 0.1600ms | 6.2487 KOps/s | 6.3466 KOps/s | |
| test_exec_td_decorator | 0.5058ms | 0.2368ms | 4.2223 KOps/s | 4.2378 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0337ms | 0.8240ms | 1.2137 KOps/s | 1.2167 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.9986ms | 0.8261ms | 1.2105 KOps/s | 1.2140 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8756ms | 0.7121ms | 1.4044 KOps/s | 1.4099 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8895ms | 0.7115ms | 1.4055 KOps/s | 1.4090 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.1226ms | 20.5707ms | 48.6129 Ops/s | 48.7208 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.1447ms | 20.5117ms | 48.7527 Ops/s | 48.8121 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.9081ms | 20.3429ms | 49.1572 Ops/s | 49.1829 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.5821ms | 20.3479ms | 49.1452 Ops/s | 49.2389 Ops/s | |
| test_to_module_speed[True] | 1.5577ms | 1.4808ms | 675.2978 Ops/s | 683.0797 Ops/s | |
| test_to_module_speed[False] | 1.5556ms | 1.4709ms | 679.8436 Ops/s | 692.1223 Ops/s | |
| test_tc_init | 0.1230ms | 43.2597μs | 23.1162 KOps/s | 22.0801 KOps/s | |
| test_tc_init_tensor_only | 39.1610μs | 9.6632μs | 103.4857 KOps/s | 101.5284 KOps/s | |
| test_tc_init_nested | 0.1647ms | 86.8593μs | 11.5129 KOps/s | 11.3659 KOps/s | |
| test_tc_init_many_fields | 53.4200μs | 16.2222μs | 61.6440 KOps/s | 60.8640 KOps/s | |
| test_tc_first_layer_tensor | 26.8300μs | 1.8183μs | 549.9618 KOps/s | 548.5220 KOps/s | |
| test_tc_first_layer_tensor_only | 2.4521μs | 0.3908μs | 2.5589 MOps/s | 2.4983 MOps/s | |
| test_tc_first_layer_tensor_set | 31.7400μs | 3.8886μs | 257.1601 KOps/s | 251.7063 KOps/s | |
| test_tc_first_layer_tensor_only_set | 28.1210μs | 3.2604μs | 306.7086 KOps/s | 301.5672 KOps/s | |
| test_tc_first_layer_nontensor | 34.3210μs | 6.1960μs | 161.3953 KOps/s | 162.1877 KOps/s | |
| test_tc_second_layer_tensor | 27.5000μs | 4.4512μs | 224.6570 KOps/s | 226.4449 KOps/s | |
| test_tc_second_layer_nontensor | 73.8310μs | 8.7442μs | 114.3615 KOps/s | 113.3684 KOps/s | |
| test_unbind | 0.2715s | 16.4519ms | 60.7831 Ops/s | 55.7618 Ops/s | |
| test_full_like | 17.7045ms | 17.5361ms | 57.0252 Ops/s | 226.0408 Ops/s | |
| test_zeros_like | 18.5542ms | 17.4519ms | 57.3005 Ops/s | 113.9673 Ops/s | |
| test_ones_like | 17.0058ms | 16.6901ms | 59.9157 Ops/s | 229.1918 Ops/s | |
| test_clone | 17.8397ms | 17.4010ms | 57.4680 Ops/s | 155.5197 Ops/s | |
| test_squeeze | 96.3520μs | 14.2359μs | 70.2450 KOps/s | 70.1452 KOps/s | |
| test_unsqueeze | 0.2594ms | 0.1109ms | 9.0181 KOps/s | 9.1054 KOps/s | |
| test_split | 0.2407ms | 0.1817ms | 5.5035 KOps/s | 5.4246 KOps/s | |
| test_permute | 0.2578ms | 0.2020ms | 4.9506 KOps/s | 4.9619 KOps/s | |
| test_stack | 51.2593ms | 50.9572ms | 19.6243 Ops/s | 19.6266 Ops/s | |
| test_cat | 51.1861ms | 50.9048ms | 19.6445 Ops/s | 19.6827 Ops/s | |
| test_sequential_tensordict | 0.2732ms | 0.2167ms | 4.6142 KOps/s | 4.5421 KOps/s | |
| test_sequential_graph_module | 0.1739ms | 0.1186ms | 8.4339 KOps/s | 8.3091 KOps/s | |
| test_nested_tensordict | 0.3577ms | 0.2831ms | 3.5320 KOps/s | 3.4821 KOps/s | |
| test_nested_graph_module | 0.1887ms | 0.1318ms | 7.5876 KOps/s | 7.4899 KOps/s |
Stack from ghstack (oldest at bottom):
Strategy B sends the sender's local shard together with its
placement metadata. The receiver gets the raw shard and can
either reconstruct a DTensor from it or call redistribute() to
reach the target placements.
This is more bandwidth-efficient than Strategy A because it
avoids the all-gather, but it requires the receiver to handle
redistribution.
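
The trade-off can be illustrated with a toy, library-free simulation. This is only a sketch and not the code in this PR: `ShardMessage`, `shard`, `send_strategy_a`, `send_strategy_b`, `reconstruct` and `redistribute` are hypothetical names, plain Python lists stand in for tensors, and Strategy A is assumed to materialize the full tensor on every sender before transfer. In PyTorch the receiver-side steps would presumably map to `DTensor.from_local` and `DTensor.redistribute`.

```python
from dataclasses import dataclass


@dataclass
class ShardMessage:
    """Hypothetical wire format: one rank's local shard plus placement metadata."""
    rank: int
    world_size: int
    shard_dim: int   # dimension the global tensor is sharded along (only 0 here)
    global_len: int  # length of the global tensor along shard_dim
    data: list       # the local shard itself


def shard(full, world_size):
    """Split a 1-D list into contiguous chunks, one per rank."""
    chunk = -(-len(full) // world_size)  # ceiling division
    return [full[r * chunk:(r + 1) * chunk] for r in range(world_size)]


def send_strategy_a(shards):
    """Assumed Strategy A: all-gather first, so each sender ships the full tensor."""
    full = [x for s in shards for x in s]
    return [list(full) for _ in shards]


def send_strategy_b(shards):
    """Strategy B: each sender ships only its local shard plus metadata."""
    global_len = sum(len(s) for s in shards)
    return [
        ShardMessage(rank, len(shards), 0, global_len, list(s))
        for rank, s in enumerate(shards)
    ]


def reconstruct(messages):
    """Receiver side: use the metadata to order the shards and reassemble them."""
    ordered = sorted(messages, key=lambda m: m.rank)
    full = [x for m in ordered for x in m.data]
    assert len(full) == ordered[0].global_len
    return full


def redistribute(messages, new_world_size):
    """Receiver side: re-shard the received data to a different target layout."""
    return shard(reconstruct(messages), new_world_size)


full = list(range(8))
shards = shard(full, 4)

a_payloads = send_strategy_a(shards)
b_messages = send_strategy_b(shards)

# Total number of elements put on the wire by each strategy.
elements_a = sum(len(p) for p in a_payloads)
elements_b = sum(len(m.data) for m in b_messages)

resharded = redistribute(b_messages, 2)
```

In this toy setup Strategy B transfers each element once, while the assumed Strategy A transfers the full tensor once per sender. The cost moves to the receiver, which has to interpret the metadata and re-shard.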
Made-with: Cursor