[DTensor] Add transfer plan computation and transport abstraction #1643
vmoens wants to merge 5 commits into gh/vmoens/84/base
PR Title Label Error: Unknown or invalid prefix
Current title:
Supported Prefixes: Your PR title must start with exactly one of these prefixes (case-insensitive):
Note: Matching is case-insensitive. Common variations (singular/plural) are supported.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 39.7410μs | 14.8350μs | 67.4082 KOps/s | 67.5859 KOps/s | |
| test_plain_set_stack_nested | 38.0910μs | 15.2387μs | 65.6226 KOps/s | 66.1767 KOps/s | |
| test_plain_set_nested_inplace | 45.1410μs | 17.0949μs | 58.4969 KOps/s | 59.5600 KOps/s | |
| test_plain_set_stack_nested_inplace | 42.4210μs | 16.8543μs | 59.3321 KOps/s | 59.4904 KOps/s | |
| test_items | 30.4010μs | 6.1200μs | 163.3980 KOps/s | 164.2061 KOps/s | |
| test_items_nested | 0.5949ms | 0.4671ms | 2.1410 KOps/s | 2.1401 KOps/s | |
| test_items_nested_locked | 0.6284ms | 0.4705ms | 2.1253 KOps/s | 2.1356 KOps/s | |
| test_items_nested_leaf | 0.1579ms | 97.9143μs | 10.2130 KOps/s | 10.2942 KOps/s | |
| test_items_stack_nested | 0.6256ms | 0.4703ms | 2.1265 KOps/s | 2.1531 KOps/s | |
| test_items_stack_nested_leaf | 0.1344ms | 98.6255μs | 10.1394 KOps/s | 10.3247 KOps/s | |
| test_items_stack_nested_locked | 0.6559ms | 0.4656ms | 2.1476 KOps/s | 2.1316 KOps/s | |
| test_keys | 25.1500μs | 4.2371μs | 236.0121 KOps/s | 237.0399 KOps/s | |
| test_keys_nested | 0.2011ms | 0.1308ms | 7.6447 KOps/s | 7.7900 KOps/s | |
| test_keys_nested_locked | 0.7410ms | 0.1388ms | 7.2049 KOps/s | 7.2151 KOps/s | |
| test_keys_nested_leaf | 0.1789ms | 0.1213ms | 8.2424 KOps/s | 8.3163 KOps/s | |
| test_keys_stack_nested | 0.2192ms | 0.1303ms | 7.6723 KOps/s | 7.7172 KOps/s | |
| test_keys_stack_nested_leaf | 0.2027ms | 0.1219ms | 8.2053 KOps/s | 8.3483 KOps/s | |
| test_keys_stack_nested_locked | 0.2008ms | 0.1388ms | 7.2068 KOps/s | 7.2501 KOps/s | |
| test_values | 10.0722μs | 1.0206μs | 979.8000 KOps/s | 991.0020 KOps/s | |
| test_values_nested | 90.4820μs | 52.2804μs | 19.1276 KOps/s | 19.1887 KOps/s | |
| test_values_nested_locked | 0.1049ms | 56.0571μs | 17.8389 KOps/s | 18.1235 KOps/s | |
| test_values_nested_leaf | 99.0620μs | 60.8606μs | 16.4310 KOps/s | 15.4669 KOps/s | |
| test_values_stack_nested | 90.8420μs | 52.4409μs | 19.0691 KOps/s | 19.0485 KOps/s | |
| test_values_stack_nested_leaf | 0.1036ms | 60.2870μs | 16.5873 KOps/s | 15.6032 KOps/s | |
| test_values_stack_nested_locked | 0.1014ms | 57.2208μs | 17.4762 KOps/s | 18.1405 KOps/s | |
| test_membership | 5.8552μs | 0.8595μs | 1.1635 MOps/s | 1.1975 MOps/s | |
| test_membership_nested | 21.2910μs | 2.9295μs | 341.3595 KOps/s | 348.6713 KOps/s | |
| test_membership_nested_leaf | 25.8210μs | 2.9444μs | 339.6321 KOps/s | 348.4161 KOps/s | |
| test_membership_stacked_nested | 36.0700μs | 2.8883μs | 346.2221 KOps/s | 351.0882 KOps/s | |
| test_membership_stacked_nested_leaf | 32.1800μs | 2.9012μs | 344.6800 KOps/s | 348.1371 KOps/s | |
| test_membership_nested_last | 32.7310μs | 4.4482μs | 224.8115 KOps/s | 230.2563 KOps/s | |
| test_membership_nested_leaf_last | 39.4410μs | 4.5038μs | 222.0339 KOps/s | 230.2478 KOps/s | |
| test_membership_stacked_nested_last | 24.0000μs | 4.4840μs | 223.0138 KOps/s | 232.0694 KOps/s | |
| test_membership_stacked_nested_leaf_last | 40.5610μs | 4.4266μs | 225.9062 KOps/s | 229.9663 KOps/s | |
| test_nested_getleaf | 57.0310μs | 21.9119μs | 45.6373 KOps/s | 46.1585 KOps/s | |
| test_nested_get | 49.8310μs | 20.6045μs | 48.5332 KOps/s | 48.5698 KOps/s | |
| test_stacked_getleaf | 48.8610μs | 21.5700μs | 46.3607 KOps/s | 46.0071 KOps/s | |
| test_stacked_get | 63.9510μs | 20.3410μs | 49.1619 KOps/s | 48.7758 KOps/s | |
| test_nested_getitemleaf | 51.8610μs | 22.1386μs | 45.1700 KOps/s | 45.1416 KOps/s | |
| test_nested_getitem | 64.9520μs | 21.2535μs | 47.0511 KOps/s | 47.7199 KOps/s | |
| test_stacked_getitemleaf | 52.5710μs | 21.9374μs | 45.5842 KOps/s | 45.3123 KOps/s | |
| test_stacked_getitem | 53.1110μs | 20.9854μs | 47.6522 KOps/s | 47.2239 KOps/s | |
| test_lock_nested | 7.8389ms | 0.4927ms | 2.0295 KOps/s | 2.1054 KOps/s | |
| test_lock_stack_nested | 0.5458ms | 0.4858ms | 2.0586 KOps/s | 2.0645 KOps/s | |
| test_unlock_nested | 0.4706ms | 0.3989ms | 2.5072 KOps/s | 2.5717 KOps/s | |
| test_unlock_stack_nested | 0.4715ms | 0.3998ms | 2.5012 KOps/s | 2.5309 KOps/s | |
| test_flatten_speed | 0.2294ms | 0.1228ms | 8.1410 KOps/s | 8.1472 KOps/s | |
| test_unflatten_speed | 0.6705ms | 0.5755ms | 1.7377 KOps/s | 1.7276 KOps/s | |
| test_common_ops | 0.8548ms | 0.6903ms | 1.4486 KOps/s | 1.4316 KOps/s | |
| test_creation | 58.2510μs | 3.1658μs | 315.8735 KOps/s | 315.3711 KOps/s | |
| test_creation_empty | 45.5810μs | 7.0470μs | 141.9036 KOps/s | 141.2766 KOps/s | |
| test_creation_nested_1 | 36.4910μs | 11.5913μs | 86.2718 KOps/s | 85.7901 KOps/s | |
| test_creation_nested_2 | 60.1310μs | 13.4303μs | 74.4585 KOps/s | 74.3777 KOps/s | |
| test_creation_many_keys[10] | 56.8210μs | 21.1396μs | 47.3047 KOps/s | 47.2827 KOps/s | |
| test_creation_many_keys[50] | 0.1529ms | 91.5510μs | 10.9229 KOps/s | 11.0159 KOps/s | |
| test_creation_many_keys[100] | 0.2758ms | 0.1805ms | 5.5393 KOps/s | 5.6364 KOps/s | |
| test_creation_nested_many_keys[10] | 78.5720μs | 45.2529μs | 22.0980 KOps/s | 22.1989 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2643ms | 0.1848ms | 5.4112 KOps/s | 5.3761 KOps/s | |
| test_clone | 50.9610μs | 13.3451μs | 74.9340 KOps/s | 73.4103 KOps/s | |
| test_getitem[int] | 1.5963ms | 15.4544μs | 64.7067 KOps/s | 59.9698 KOps/s | |
| test_getitem[slice_int] | 0.1395ms | 24.3702μs | 41.0337 KOps/s | 41.1931 KOps/s | |
| test_getitem[range] | 0.1961ms | 63.7679μs | 15.6819 KOps/s | 15.5757 KOps/s | |
| test_getitem[tuple] | 0.1416ms | 24.0199μs | 41.6322 KOps/s | 41.1604 KOps/s | |
| test_getitem[list] | 0.1844ms | 59.3656μs | 16.8448 KOps/s | 16.7702 KOps/s | |
| test_setitem_dim[int] | 46.1610μs | 25.4950μs | 39.2234 KOps/s | 38.2704 KOps/s | |
| test_setitem_dim[slice_int] | 61.2610μs | 42.2885μs | 23.6471 KOps/s | 22.9992 KOps/s | |
| test_setitem_dim[range] | 0.1233ms | 94.9410μs | 10.5329 KOps/s | 10.4135 KOps/s | |
| test_setitem_dim[tuple] | 66.7710μs | 39.6240μs | 25.2372 KOps/s | 24.4398 KOps/s | |
| test_setitem | 62.2910μs | 18.1094μs | 55.2201 KOps/s | 55.3938 KOps/s | |
| test_set | 88.1120μs | 16.7432μs | 59.7258 KOps/s | 57.2459 KOps/s | |
| test_set_shared | 0.6280ms | 0.2015ms | 4.9619 KOps/s | 4.8761 KOps/s | |
| test_update | 0.4556ms | 22.2787μs | 44.8859 KOps/s | 45.1437 KOps/s | |
| test_update_nested | 64.6810μs | 33.4139μs | 29.9277 KOps/s | 30.5345 KOps/s | |
| test_update__nested | 0.4725ms | 34.5399μs | 28.9521 KOps/s | 28.6426 KOps/s | |
| test_set_nested | 54.3010μs | 19.1307μs | 52.2721 KOps/s | 52.0664 KOps/s | |
| test_set_nested_new | 60.9120μs | 23.8494μs | 41.9298 KOps/s | 41.5076 KOps/s | |
| test_select | 68.0220μs | 40.7338μs | 24.5496 KOps/s | 24.3655 KOps/s | |
| test_select_nested | 0.4976ms | 75.5310μs | 13.2396 KOps/s | 13.3224 KOps/s | |
| test_exclude_nested | 0.5179ms | 93.7233μs | 10.6697 KOps/s | 10.6797 KOps/s | |
| test_empty[True] | 0.8185ms | 0.4015ms | 2.4905 KOps/s | 2.4883 KOps/s | |
| test_empty[False] | 0.1051ms | 1.3047μs | 766.4444 KOps/s | 757.3352 KOps/s | |
| test_to | 0.1125ms | 75.1491μs | 13.3069 KOps/s | 13.6172 KOps/s | |
| test_to_nonblocking | 0.2149ms | 66.3078μs | 15.0812 KOps/s | 15.4927 KOps/s | |
| test_unbind_speed | 0.7791ms | 0.3387ms | 2.9524 KOps/s | 3.0007 KOps/s | |
| test_unbind_speed_stack0 | 0.5337ms | 0.3371ms | 2.9667 KOps/s | 3.0159 KOps/s | |
| test_unbind_speed_stack1 | 0.1045s | 0.8429ms | 1.1864 KOps/s | 1.1901 KOps/s | |
| test_split | 0.1045s | 1.2780ms | 782.4536 Ops/s | 778.1599 Ops/s | |
| test_chunk | 0.1046s | 1.2185ms | 820.6931 Ops/s | 912.8000 Ops/s | |
| test_to_cpu_blocking | 29.1710ms | 28.7423ms | 34.7919 Ops/s | 45.6986 Ops/s | |
| test_to_cpu_global_sync | 11.7845ms | 11.3870ms | 87.8193 Ops/s | 87.5224 Ops/s | |
| test_to_cpu_event_sync | 12.8260ms | 12.3989ms | 80.6523 Ops/s | 80.0696 Ops/s | |
| test_to_cpu_default | 0.1157s | 13.6920ms | 73.0351 Ops/s | 80.0434 Ops/s | |
| test_consolidate[False-None] | 4.6791ms | 4.2506ms | 235.2594 Ops/s | 239.1573 Ops/s | |
| test_consolidate[default-None] | 2.2051ms | 2.0751ms | 481.9121 Ops/s | 479.0317 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.1178ms | 2.0039ms | 499.0286 Ops/s | 500.2465 Ops/s | |
| test_consolidate_njt[False-None] | 0.1932s | 10.2074ms | 97.9678 Ops/s | 116.9762 Ops/s | |
| test_to[False-False-None] | 2.2947ms | 2.1199ms | 471.7266 Ops/s | 469.9086 Ops/s | |
| test_to[True-False-None] | 2.2062ms | 1.9726ms | 506.9394 Ops/s | 509.1492 Ops/s | |
| test_to[within-False-None] | 6.4089ms | 6.2964ms | 158.8202 Ops/s | 161.8525 Ops/s | |
| test_to[True-default-None] | 9.1505ms | 8.9708ms | 111.4726 Ops/s | 109.6696 Ops/s | |
| test_to_njt[False-False-None] | 8.5954ms | 8.4820ms | 117.8968 Ops/s | 116.6221 Ops/s | |
| test_to_njt[True-False-None] | 7.1835ms | 7.0180ms | 142.4900 Ops/s | 140.7611 Ops/s | |
| test_to_njt[within-False-None] | 15.9030ms | 15.6883ms | 63.7418 Ops/s | 62.9305 Ops/s | |
| test_creation[device0] | 0.4481ms | 0.1143ms | 8.7456 KOps/s | 8.7454 KOps/s | |
| test_creation_from_tensor | 0.5844ms | 0.1121ms | 8.9174 KOps/s | 8.9397 KOps/s | |
| test_add_one[memmap_tensor0] | 0.3770ms | 6.5694μs | 152.2202 KOps/s | 147.6164 KOps/s | |
| test_contiguous[memmap_tensor0] | 24.7710μs | 0.6733μs | 1.4852 MOps/s | 2.1552 MOps/s | |
| test_stack[memmap_tensor0] | 24.0410μs | 4.7865μs | 208.9191 KOps/s | 217.7404 KOps/s | |
| test_memmaptd_index | 1.1789ms | 0.2714ms | 3.6852 KOps/s | 3.7124 KOps/s | |
| test_memmaptd_index_astensor | 0.5224ms | 0.3734ms | 2.6784 KOps/s | 2.6712 KOps/s | |
| test_memmaptd_index_op | 0.8701ms | 0.6271ms | 1.5946 KOps/s | 1.5936 KOps/s | |
| test_serialize_model | 0.3134s | 0.1643s | 6.0872 Ops/s | 7.3614 Ops/s | |
| test_serialize_model_pickle | 2.0852s | 1.4011s | 0.7137 Ops/s | 0.8227 Ops/s | |
| test_serialize_weights | 0.1357s | 0.1337s | 7.4801 Ops/s | 7.3399 Ops/s | |
| test_serialize_weights_returnearly | 0.4235s | 87.9757ms | 11.3668 Ops/s | 6.1880 Ops/s | |
| test_serialize_weights_pickle | 1.3752s | 1.2158s | 0.8225 Ops/s | 0.8216 Ops/s | |
| test_reshape_pytree | 0.2030ms | 32.9061μs | 30.3895 KOps/s | 30.4017 KOps/s | |
| test_reshape_td | 82.9310μs | 46.0283μs | 21.7257 KOps/s | 21.8317 KOps/s | |
| test_view_pytree | 0.2124ms | 32.5683μs | 30.7047 KOps/s | 30.8098 KOps/s | |
| test_view_td | 94.3010μs | 53.6181μs | 18.6504 KOps/s | 19.0540 KOps/s | |
| test_unbind_pytree | 0.2316ms | 36.3178μs | 27.5347 KOps/s | 27.1214 KOps/s | |
| test_unbind_td | 0.1277ms | 50.1613μs | 19.9357 KOps/s | 19.9160 KOps/s | |
| test_split_pytree | 0.2479ms | 42.7393μs | 23.3977 KOps/s | 23.5533 KOps/s | |
| test_split_td | 0.1981ms | 65.7178μs | 15.2166 KOps/s | 15.4116 KOps/s | |
| test_add_pytree | 0.1915ms | 42.5586μs | 23.4970 KOps/s | 23.5900 KOps/s | |
| test_add_td | 0.1008ms | 56.1230μs | 17.8180 KOps/s | 17.8820 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.2110ms | 0.1430ms | 6.9924 KOps/s | 6.6678 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.3001ms | 0.2021ms | 4.9471 KOps/s | 5.0219 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1975ms | 0.1091ms | 9.1676 KOps/s | 8.8387 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4285ms | 0.1812ms | 5.5175 KOps/s | 5.5586 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.3507ms | 10.3113μs | 96.9813 KOps/s | 98.2303 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 90.5720μs | 54.2980μs | 18.4169 KOps/s | 18.3620 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1475ms | 9.8266μs | 101.7642 KOps/s | 100.7732 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4530ms | 67.7347μs | 14.7635 KOps/s | 14.4393 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2381ms | 0.1802ms | 5.5507 KOps/s | 5.3918 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3707ms | 0.2812ms | 3.5565 KOps/s | 3.5411 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1684ms | 0.1177ms | 8.4988 KOps/s | 8.2983 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1151ms | 73.1718μs | 13.6665 KOps/s | 12.9773 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2445ms | 0.1592ms | 6.2829 KOps/s | 6.1914 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8163ms | 0.5359ms | 1.8659 KOps/s | 1.9173 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4822ms | 0.3373ms | 2.9646 KOps/s | 2.9787 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2150ms | 0.1786ms | 5.5991 KOps/s | 5.1246 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1255ms | 89.4686μs | 11.1771 KOps/s | 11.1179 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.3470ms | 0.1204ms | 8.3066 KOps/s | 7.7936 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6525ms | 0.4451ms | 2.2467 KOps/s | 2.2984 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.2025ms | 0.1586ms | 6.3045 KOps/s | 6.1531 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 78.0120μs | 13.3868μs | 74.7007 KOps/s | 73.8989 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 70.6320μs | 42.1507μs | 23.7244 KOps/s | 23.9433 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1466ms | 10.9137μs | 91.6275 KOps/s | 92.6157 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4003ms | 53.0409μs | 18.8534 KOps/s | 18.8983 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 2.0146ms | 0.1754ms | 5.7024 KOps/s | 5.1489 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.4437ms | 3.3490ms | 298.5958 Ops/s | 292.7712 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 2.0205ms | 0.1643ms | 6.0871 KOps/s | 6.0510 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9696ms | 2.8399ms | 352.1235 Ops/s | 358.2927 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.1464ms | 0.1098ms | 9.1065 KOps/s | 8.7976 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3142ms | 74.5395μs | 13.4157 KOps/s | 13.4654 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.2215ms | 97.3559μs | 10.2716 KOps/s | 10.3401 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2473ms | 44.6869μs | 22.3779 KOps/s | 22.5023 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1694ms | 98.4152μs | 10.1610 KOps/s | 10.2170 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2757ms | 44.7893μs | 22.3268 KOps/s | 22.5701 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1956ms | 57.3267μs | 17.4439 KOps/s | 17.3516 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2216ms | 28.2986μs | 35.3375 KOps/s | 36.0629 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1452ms | 45.3752μs | 22.0385 KOps/s | 22.3186 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2499ms | 22.4867μs | 44.4708 KOps/s | 43.8094 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 87.5510μs | 45.7970μs | 21.8355 KOps/s | 21.6282 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2646ms | 22.5560μs | 44.3341 KOps/s | 43.6497 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 93.4820μs | 56.9722μs | 17.5524 KOps/s | 16.9039 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2446ms | 28.0737μs | 35.6205 KOps/s | 35.5539 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 79.7620μs | 45.8733μs | 21.7992 KOps/s | 21.6208 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2504ms | 22.5075μs | 44.4297 KOps/s | 44.1217 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 99.4620μs | 45.3780μs | 22.0371 KOps/s | 21.6089 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2655ms | 22.3054μs | 44.8322 KOps/s | 44.1378 KOps/s | |
| test_compile_replace[single-eager] | 83.3120μs | 47.1276μs | 21.2190 KOps/s | 21.4974 KOps/s | |
| test_compile_replace[single-compile] | 0.1714ms | 0.1050ms | 9.5221 KOps/s | 9.4295 KOps/s | |
| test_compile_replace[multi-eager] | 0.6397ms | 0.5635ms | 1.7747 KOps/s | 1.8142 KOps/s | |
| test_compile_replace[multi-compile] | 0.2411ms | 0.1119ms | 8.9342 KOps/s | 8.9273 KOps/s | |
| test_compile_tc_getattr_20[eager] | 0.2186ms | 0.1740ms | 5.7481 KOps/s | 5.8275 KOps/s | |
| test_compile_tc_getattr_20[compile] | 0.2966ms | 0.1206ms | 8.2893 KOps/s | 8.2736 KOps/s | |
| test_compile_clone_shallow[20-eager] | 82.0110μs | 19.5420μs | 51.1719 KOps/s | 51.1337 KOps/s | |
| test_compile_clone_shallow[20-compile] | 0.1059ms | 11.4647μs | 87.2244 KOps/s | 88.3936 KOps/s | |
| test_compile_clone_shallow[40-eager] | 66.7110μs | 34.3095μs | 29.1465 KOps/s | 29.2054 KOps/s | |
| test_compile_clone_shallow[40-compile] | 50.7210μs | 12.5060μs | 79.9616 KOps/s | 58.0139 KOps/s | |
| test_compile_clone_shallow[80-eager] | 93.3120μs | 63.6633μs | 15.7076 KOps/s | 15.7708 KOps/s | |
| test_compile_clone_shallow[80-compile] | 49.6910μs | 15.3578μs | 65.1135 KOps/s | 68.0628 KOps/s | |
| test_compile_update_inplace[eager] | 0.1031ms | 58.8925μs | 16.9801 KOps/s | 16.8399 KOps/s | |
| test_compile_update_inplace[compile] | 0.3235ms | 0.1411ms | 7.0876 KOps/s | 6.8439 KOps/s | |
| test_mod_add[eager] | 89.9020μs | 50.2692μs | 19.8929 KOps/s | 20.3551 KOps/s | |
| test_mod_add[compile] | 0.1530ms | 0.1044ms | 9.5824 KOps/s | 9.3872 KOps/s | |
| test_mod_add[compile-overhead] | 0.3613ms | 0.1520ms | 6.5793 KOps/s | 6.5591 KOps/s | |
| test_mod_wrap[eager] | 0.3880ms | 0.3029ms | 3.3016 KOps/s | 3.4074 KOps/s | |
| test_mod_wrap[compile] | 0.4819ms | 0.3479ms | 2.8740 KOps/s | 2.7580 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.2784ms | 4.0399ms | 247.5296 Ops/s | 246.0818 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.6700ms | 1.4997ms | 666.8096 Ops/s | 667.6074 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.5630ms | 1.4451ms | 692.0056 Ops/s | 640.7736 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2427ms | 0.8896ms | 1.1241 KOps/s | 1.0045 KOps/s | |
| test_seq_add[eager] | 0.2262ms | 0.1551ms | 6.4463 KOps/s | 6.5384 KOps/s | |
| test_seq_add[compile] | 0.1795ms | 0.1144ms | 8.7378 KOps/s | 8.1767 KOps/s | |
| test_seq_add[compile-overhead] | 0.4563ms | 0.1545ms | 6.4736 KOps/s | 6.3201 KOps/s | |
| test_seq_wrap[eager] | 0.6178ms | 0.5412ms | 1.8476 KOps/s | 1.8896 KOps/s | |
| test_seq_wrap[compile] | 0.4765ms | 0.3842ms | 2.6029 KOps/s | 2.7212 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3316ms | 0.2661ms | 3.7586 KOps/s | 3.7049 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9363ms | 0.8409ms | 1.1892 KOps/s | 1.1809 KOps/s | |
| test_func_call_runtime[False-compile] | 1.0343ms | 0.9143ms | 1.0937 KOps/s | 1.1023 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5783ms | 0.4660ms | 2.1459 KOps/s | 2.1395 KOps/s | |
| test_func_call_runtime[True-eager] | 1.2501ms | 1.0777ms | 927.8924 Ops/s | 910.9300 Ops/s | |
| test_func_call_runtime[True-compile] | 1.0556ms | 0.9265ms | 1.0793 KOps/s | 1.0884 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5976ms | 0.4789ms | 2.0882 KOps/s | 2.0641 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.9526ms | 0.8730ms | 1.1455 KOps/s | 1.1483 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.1052ms | 0.9129ms | 1.0954 KOps/s | 1.0957 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.7000ms | 0.4690ms | 2.1320 KOps/s | 2.1183 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3550ms | 1.2372ms | 808.2938 Ops/s | 808.8674 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.1060ms | 0.9753ms | 1.0253 KOps/s | 1.0474 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.6341ms | 0.5125ms | 1.9511 KOps/s | 1.9251 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8680ms | 2.3815ms | 419.9028 Ops/s | 421.3745 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0931ms | 0.9835ms | 1.0168 KOps/s | 1.0267 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.6255ms | 0.5182ms | 1.9298 KOps/s | 1.9093 KOps/s | |
| test_distributed | 0.5614ms | 0.1527ms | 6.5482 KOps/s | 6.4722 KOps/s | |
| test_tdmodule | 70.8710μs | 27.6234μs | 36.2012 KOps/s | 35.6299 KOps/s | |
| test_tdmodule_dispatch | 73.5420μs | 45.2259μs | 22.1112 KOps/s | 22.1852 KOps/s | |
| test_tdseq | 58.8710μs | 27.2623μs | 36.6807 KOps/s | 36.6835 KOps/s | |
| test_tdseq_dispatch | 80.3920μs | 47.5111μs | 21.0477 KOps/s | 20.7627 KOps/s | |
| test_instantiation_functorch | 2.1710ms | 2.0976ms | 476.7404 Ops/s | 473.0298 Ops/s | |
| test_exec_functorch | 0.2398ms | 0.1794ms | 5.5755 KOps/s | 5.4970 KOps/s | |
| test_exec_functional_call | 0.2191ms | 0.1627ms | 6.1451 KOps/s | 6.1730 KOps/s | |
| test_exec_td_decorator | 0.4391ms | 0.2383ms | 4.1959 KOps/s | 4.2153 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.1075ms | 0.8279ms | 1.2079 KOps/s | 1.1849 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 1.0518ms | 0.8287ms | 1.2067 KOps/s | 1.1972 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8908ms | 0.7114ms | 1.4056 KOps/s | 1.3841 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8969ms | 0.7182ms | 1.3924 KOps/s | 1.3940 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.3721ms | 20.5401ms | 48.6853 Ops/s | 48.7008 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.3726ms | 20.5941ms | 48.5576 Ops/s | 48.8009 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 21.1982ms | 20.4906ms | 48.8029 Ops/s | 49.2495 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 21.2611ms | 20.4946ms | 48.7933 Ops/s | 48.8362 Ops/s | |
| test_to_module_speed[True] | 1.5883ms | 1.4900ms | 671.1253 Ops/s | 675.5545 Ops/s | |
| test_to_module_speed[False] | 1.5851ms | 1.4881ms | 671.9902 Ops/s | 691.6929 Ops/s | |
| test_tc_init | 74.5020μs | 43.7872μs | 22.8377 KOps/s | 22.2644 KOps/s | |
| test_tc_init_tensor_only | 42.9210μs | 9.8128μs | 101.9073 KOps/s | 103.2558 KOps/s | |
| test_tc_init_nested | 0.1296ms | 88.4875μs | 11.3010 KOps/s | 11.2428 KOps/s | |
| test_tc_init_many_fields | 54.9110μs | 16.4424μs | 60.8183 KOps/s | 60.7143 KOps/s | |
| test_tc_first_layer_tensor | 29.9900μs | 1.8375μs | 544.2283 KOps/s | 549.0406 KOps/s | |
| test_tc_first_layer_tensor_only | 2.5890μs | 0.3994μs | 2.5037 MOps/s | 2.4439 MOps/s | |
| test_tc_first_layer_tensor_set | 31.6310μs | 3.9539μs | 252.9131 KOps/s | 251.4138 KOps/s | |
| test_tc_first_layer_tensor_only_set | 61.1310μs | 3.1418μs | 318.2936 KOps/s | 304.0345 KOps/s | |
| test_tc_first_layer_nontensor | 33.6010μs | 6.1609μs | 162.3149 KOps/s | 161.2139 KOps/s | |
| test_tc_second_layer_tensor | 30.2900μs | 4.4109μs | 226.7090 KOps/s | 225.8417 KOps/s | |
| test_tc_second_layer_nontensor | 38.6910μs | 8.7737μs | 113.9775 KOps/s | 114.5568 KOps/s | |
| test_unbind | 0.2518s | 16.5531ms | 60.4117 Ops/s | 54.8492 Ops/s | |
| test_full_like | 11.2434ms | 4.4072ms | 226.9011 Ops/s | 227.5458 Ops/s | |
| test_zeros_like | 4.5373ms | 4.3600ms | 229.3574 Ops/s | 228.6445 Ops/s | |
| test_ones_like | 4.5170ms | 4.3671ms | 228.9828 Ops/s | 228.1659 Ops/s | |
| test_clone | 6.5761ms | 6.4212ms | 155.7352 Ops/s | 154.1234 Ops/s | |
| test_squeeze | 65.1910μs | 14.3362μs | 69.7533 KOps/s | 70.3355 KOps/s | |
| test_unsqueeze | 0.1907ms | 0.1106ms | 9.0390 KOps/s | 8.4255 KOps/s | |
| test_split | 0.3518ms | 0.1858ms | 5.3825 KOps/s | 5.1766 KOps/s | |
| test_permute | 0.2828ms | 0.2046ms | 4.8866 KOps/s | 4.5996 KOps/s | |
| test_stack | 51.2805ms | 50.8312ms | 19.6729 Ops/s | 28.8883 Ops/s | |
| test_cat | 42.6875ms | 42.5017ms | 23.5285 Ops/s | 28.9235 Ops/s | |
| test_sequential_tensordict | 0.2734ms | 0.2140ms | 4.6720 KOps/s | 4.5568 KOps/s | |
| test_sequential_graph_module | 0.2465ms | 0.1179ms | 8.4822 KOps/s | 8.0810 KOps/s | |
| test_nested_tensordict | 0.3528ms | 0.2929ms | 3.4141 KOps/s | 3.3681 KOps/s | |
| test_nested_graph_module | 0.2170ms | 0.1293ms | 7.7324 KOps/s | 7.2508 KOps/s |
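
The Change column is empty in the table as captured here. As a rough aid for reading the rows, the sketch below assumes that Ops is the reciprocal of Mean and that the change is the relative difference between the PR throughput (Ops) and the throughput on repo HEAD. The helper names (`parse_ops`, `parse_time`, `relative_change`) are hypothetical and not part of the benchmark tooling.

```python
import re

# Multipliers for the throughput and time units that appear in the table.
# Both the Greek mu and the micro sign are accepted for microseconds.
_OPS_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}
_TIME_SCALE = {"s": 1.0, "ms": 1e-3, "μs": 1e-6, "µs": 1e-6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '67.4082 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _OPS_SCALE[unit]


def parse_time(cell: str) -> float:
    """Convert a cell such as '14.8350μs' or '0.4671ms' to seconds."""
    match = re.fullmatch(r"([0-9.]+)\s*(μs|µs|ms|s)", cell.strip())
    if match is None:
        raise ValueError(f"unrecognized time cell: {cell!r}")
    return float(match.group(1)) * _TIME_SCALE[match.group(2)]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput difference of the PR run versus repo HEAD (assumed definition)."""
    head = parse_ops(ops_head)
    return (parse_ops(ops_pr) - head) / head


# Row test_plain_set_nested: the mean time and the Ops column agree (Ops is about 1 / Mean).
print(1.0 / parse_time("14.8350μs"), parse_ops("67.4082 KOps/s"))

# Row test_to_cpu_blocking: 34.7919 Ops/s on the PR versus 45.6986 Ops/s on HEAD.
print(f"{relative_change('34.7919 Ops/s', '45.6986 Ops/s'):+.1%}")  # -23.9%
```

Under these assumptions most rows differ from HEAD by a few percent at most, while rows such as `test_to_cpu_blocking` and `test_serialize_weights_returnearly` show larger swings in either direction.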

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 34.4310μs | 14.8480μs | 67.3490 KOps/s | 67.4070 KOps/s | |
| test_plain_set_stack_nested | 38.6710μs | 15.1778μs | 65.8857 KOps/s | 67.1537 KOps/s | |
| test_plain_set_nested_inplace | 46.8110μs | 16.7443μs | 59.7220 KOps/s | 60.1316 KOps/s | |
| test_plain_set_stack_nested_inplace | 37.4210μs | 16.7024μs | 59.8715 KOps/s | 60.8365 KOps/s | |
| test_items | 38.8810μs | 5.9929μs | 166.8637 KOps/s | 168.5054 KOps/s | |
| test_items_nested | 0.5137ms | 0.4701ms | 2.1274 KOps/s | 2.1650 KOps/s | |
| test_items_nested_locked | 0.5651ms | 0.4749ms | 2.1057 KOps/s | 2.1343 KOps/s | |
| test_items_nested_leaf | 0.1306ms | 98.6902μs | 10.1327 KOps/s | 10.3485 KOps/s | |
| test_items_stack_nested | 0.5412ms | 0.4737ms | 2.1111 KOps/s | 2.1708 KOps/s | |
| test_items_stack_nested_leaf | 0.1383ms | 97.1736μs | 10.2909 KOps/s | 10.3418 KOps/s | |
| test_items_stack_nested_locked | 0.7382ms | 0.4679ms | 2.1372 KOps/s | 2.1569 KOps/s | |
| test_keys | 27.3910μs | 4.2728μs | 234.0376 KOps/s | 238.4868 KOps/s | |
| test_keys_nested | 0.1777ms | 0.1290ms | 7.7543 KOps/s | 7.7055 KOps/s | |
| test_keys_nested_locked | 0.7399ms | 0.1348ms | 7.4204 KOps/s | 7.2914 KOps/s | |
| test_keys_nested_leaf | 0.2594ms | 0.1204ms | 8.3066 KOps/s | 8.3822 KOps/s | |
| test_keys_stack_nested | 0.1752ms | 0.1297ms | 7.7100 KOps/s | 7.7266 KOps/s | |
| test_keys_stack_nested_leaf | 0.1784ms | 0.1203ms | 8.3158 KOps/s | 8.3830 KOps/s | |
| test_keys_stack_nested_locked | 0.2031ms | 0.1380ms | 7.2438 KOps/s | 7.3156 KOps/s | |
| test_values | 6.5262μs | 1.0192μs | 981.1472 KOps/s | 986.7173 KOps/s | |
| test_values_nested | 78.0020μs | 52.8326μs | 18.9277 KOps/s | 19.0619 KOps/s | |
| test_values_nested_locked | 93.0530μs | 56.5342μs | 17.6884 KOps/s | 17.9973 KOps/s | |
| test_values_nested_leaf | 0.1025ms | 61.1708μs | 16.3477 KOps/s | 16.7632 KOps/s | |
| test_values_stack_nested | 89.1520μs | 53.5445μs | 18.6761 KOps/s | 19.1710 KOps/s | |
| test_values_stack_nested_leaf | 96.7920μs | 61.0196μs | 16.3882 KOps/s | 16.6764 KOps/s | |
| test_values_stack_nested_locked | 83.3520μs | 56.7280μs | 17.6280 KOps/s | 18.4957 KOps/s | |
| test_membership | 5.2517μs | 0.8597μs | 1.1632 MOps/s | 1.1811 MOps/s | |
| test_membership_nested | 30.5510μs | 2.8655μs | 348.9826 KOps/s | 346.3378 KOps/s | |
| test_membership_nested_leaf | 28.7610μs | 2.8218μs | 354.3897 KOps/s | 360.4697 KOps/s | |
| test_membership_stacked_nested | 22.2210μs | 2.9180μs | 342.6995 KOps/s | 346.2923 KOps/s | |
| test_membership_stacked_nested_leaf | 26.5200μs | 2.9016μs | 344.6359 KOps/s | 347.2052 KOps/s | |
| test_membership_nested_last | 25.3800μs | 4.4707μs | 223.6794 KOps/s | 230.6209 KOps/s | |
| test_membership_nested_leaf_last | 39.9410μs | 4.4592μs | 224.2574 KOps/s | 230.1453 KOps/s | |
| test_membership_stacked_nested_last | 25.5710μs | 4.4496μs | 224.7390 KOps/s | 230.6251 KOps/s | |
| test_membership_stacked_nested_leaf_last | 20.4300μs | 4.4445μs | 224.9978 KOps/s | 230.3708 KOps/s | |
| test_nested_getleaf | 55.0410μs | 21.7247μs | 46.0305 KOps/s | 45.9888 KOps/s | |
| test_nested_get | 55.9120μs | 20.5695μs | 48.6157 KOps/s | 48.4520 KOps/s | |
| test_stacked_getleaf | 51.4610μs | 21.8989μs | 45.6644 KOps/s | 45.6818 KOps/s | |
| test_stacked_get | 95.7530μs | 20.6191μs | 48.4988 KOps/s | 48.3185 KOps/s | |
| test_nested_getitemleaf | 51.0620μs | 22.1824μs | 45.0807 KOps/s | 44.4895 KOps/s | |
| test_nested_getitem | 51.8710μs | 20.9815μs | 47.6610 KOps/s | 47.1653 KOps/s | |
| test_stacked_getitemleaf | 45.7810μs | 22.0543μs | 45.3427 KOps/s | 44.9653 KOps/s | |
| test_stacked_getitem | 45.6710μs | 21.2201μs | 47.1251 KOps/s | 47.3579 KOps/s | |
| test_lock_nested | 7.8324ms | 0.4932ms | 2.0277 KOps/s | 2.0947 KOps/s | |
| test_lock_stack_nested | 0.5693ms | 0.4885ms | 2.0472 KOps/s | 2.0546 KOps/s | |
| test_unlock_nested | 0.4663ms | 0.3953ms | 2.5296 KOps/s | 2.5508 KOps/s | |
| test_unlock_stack_nested | 0.4846ms | 0.3944ms | 2.5352 KOps/s | 2.5269 KOps/s | |
| test_flatten_speed | 0.1763ms | 0.1223ms | 8.1743 KOps/s | 8.1367 KOps/s | |
| test_unflatten_speed | 0.6790ms | 0.5757ms | 1.7371 KOps/s | 1.7539 KOps/s | |
| test_common_ops | 0.8443ms | 0.6979ms | 1.4329 KOps/s | 1.4350 KOps/s | |
| test_creation | 0.1053ms | 3.1708μs | 315.3799 KOps/s | 314.8631 KOps/s | |
| test_creation_empty | 30.6110μs | 7.0167μs | 142.5173 KOps/s | 143.6992 KOps/s | |
| test_creation_nested_1 | 83.8320μs | 11.5300μs | 86.7306 KOps/s | 86.6763 KOps/s | |
| test_creation_nested_2 | 43.3010μs | 13.3718μs | 74.7842 KOps/s | 75.1634 KOps/s | |
| test_creation_many_keys[10] | 48.6410μs | 21.0545μs | 47.4957 KOps/s | 47.4591 KOps/s | |
| test_creation_many_keys[50] | 0.1280ms | 90.8640μs | 11.0055 KOps/s | 10.9981 KOps/s | |
| test_creation_many_keys[100] | 0.2315ms | 0.1787ms | 5.5945 KOps/s | 5.5514 KOps/s | |
| test_creation_nested_many_keys[10] | 66.3720μs | 45.0899μs | 22.1779 KOps/s | 22.1490 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2885ms | 0.1840ms | 5.4345 KOps/s | 5.3548 KOps/s | |
| test_clone | 38.9110μs | 13.2779μs | 75.3133 KOps/s | 75.1943 KOps/s | |
| test_getitem[int] | 1.4821ms | 15.3869μs | 64.9901 KOps/s | 59.4387 KOps/s | |
| test_getitem[slice_int] | 0.1369ms | 24.6315μs | 40.5985 KOps/s | 41.2835 KOps/s | |
| test_getitem[range] | 0.1797ms | 63.1625μs | 15.8322 KOps/s | 15.8451 KOps/s | |
| test_getitem[tuple] | 0.1403ms | 24.3095μs | 41.1362 KOps/s | 41.8686 KOps/s | |
| test_getitem[list] | 0.1806ms | 57.8496μs | 17.2862 KOps/s | 17.2276 KOps/s | |
| test_setitem_dim[int] | 60.4910μs | 26.1453μs | 38.2478 KOps/s | 37.8364 KOps/s | |
| test_setitem_dim[slice_int] | 66.1220μs | 43.3183μs | 23.0849 KOps/s | 22.9563 KOps/s | |
| test_setitem_dim[range] | 0.1173ms | 94.3787μs | 10.5956 KOps/s | 10.5310 KOps/s | |
| test_setitem_dim[tuple] | 61.8120μs | 39.2737μs | 25.4623 KOps/s | 25.5379 KOps/s | |
| test_setitem | 54.6810μs | 17.8382μs | 56.0594 KOps/s | 56.4916 KOps/s | |
| test_set | 42.5010μs | 17.2441μs | 57.9910 KOps/s | 59.5765 KOps/s | |
| test_set_shared | 0.4884ms | 0.2034ms | 4.9155 KOps/s | 4.9140 KOps/s | |
| test_update | 0.3239ms | 22.0750μs | 45.3001 KOps/s | 46.3833 KOps/s | |
| test_update_nested | 70.6820μs | 33.5481μs | 29.8079 KOps/s | 30.2386 KOps/s | |
| test_update__nested | 0.4530ms | 34.3051μs | 29.1502 KOps/s | 29.2888 KOps/s | |
| test_set_nested | 59.4110μs | 20.3388μs | 49.1670 KOps/s | 53.0083 KOps/s | |
| test_set_nested_new | 57.2810μs | 23.9477μs | 41.7576 KOps/s | 42.5149 KOps/s | |
| test_select | 79.6320μs | 40.3234μs | 24.7995 KOps/s | 24.4876 KOps/s | |
| test_select_nested | 95.4520μs | 74.7358μs | 13.3805 KOps/s | 13.3542 KOps/s | |
| test_exclude_nested | 0.1264ms | 92.1791μs | 10.8484 KOps/s | 10.8581 KOps/s | |
| test_empty[True] | 0.5044ms | 0.3999ms | 2.5005 KOps/s | 2.5181 KOps/s | |
| test_empty[False] | 7.4378μs | 1.3280μs | 752.9850 KOps/s | 769.5914 KOps/s | |
| test_to | 0.1114ms | 77.8706μs | 12.8418 KOps/s | 13.5777 KOps/s | |
| test_to_nonblocking | 0.1095ms | 64.3672μs | 15.5359 KOps/s | 15.5048 KOps/s | |
| test_unbind_speed | 0.3894ms | 0.3377ms | 2.9616 KOps/s | 2.9670 KOps/s | |
| test_unbind_speed_stack0 | 0.3919ms | 0.3364ms | 2.9724 KOps/s | 3.1138 KOps/s | |
| test_unbind_speed_stack1 | 0.1044s | 0.8478ms | 1.1795 KOps/s | 1.1740 KOps/s | |
| test_split | 0.1043s | 1.2764ms | 783.4726 Ops/s | 777.9090 Ops/s | |
| test_chunk | 0.1045s | 1.2218ms | 818.4571 Ops/s | 912.5112 Ops/s | |
| test_to_cpu_blocking | 29.0445ms | 28.7236ms | 34.8146 Ops/s | 46.4259 Ops/s | |
| test_to_cpu_global_sync | 11.5356ms | 11.3431ms | 88.1594 Ops/s | 88.0325 Ops/s | |
| test_to_cpu_event_sync | 12.5333ms | 12.2159ms | 81.8602 Ops/s | 80.9140 Ops/s | |
| test_to_cpu_default | 0.1162s | 13.5195ms | 73.9671 Ops/s | 81.1415 Ops/s | |
| test_consolidate[False-None] | 4.2911ms | 4.2250ms | 236.6855 Ops/s | 240.4390 Ops/s | |
| test_consolidate[default-None] | 3.0535ms | 2.0504ms | 487.7132 Ops/s | 469.7533 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.1052ms | 1.9783ms | 505.4906 Ops/s | 485.1919 Ops/s | |
| test_consolidate_njt[False-None] | 8.7990ms | 8.6115ms | 116.1236 Ops/s | 112.1070 Ops/s | |
| test_to[False-False-None] | 2.2286ms | 2.1041ms | 475.2539 Ops/s | 470.6209 Ops/s | |
| test_to[True-False-None] | 2.2107ms | 1.9390ms | 515.7371 Ops/s | 511.0153 Ops/s | |
| test_to[within-False-None] | 6.3367ms | 6.2489ms | 160.0274 Ops/s | 161.2249 Ops/s | |
| test_to[True-default-None] | 9.0416ms | 8.7769ms | 113.9349 Ops/s | 111.4640 Ops/s | |
| test_to_njt[False-False-None] | 8.7771ms | 8.5135ms | 117.4606 Ops/s | 115.4385 Ops/s | |
| test_to_njt[True-False-None] | 7.1235ms | 6.9478ms | 143.9309 Ops/s | 139.9219 Ops/s | |
| test_to_njt[within-False-None] | 15.7622ms | 15.6496ms | 63.8993 Ops/s | 62.8746 Ops/s | |
| test_creation[device0] | 0.2898ms | 0.1165ms | 8.5804 KOps/s | 8.6727 KOps/s | |
| test_creation_from_tensor | 0.4055ms | 0.1140ms | 8.7695 KOps/s | 8.6682 KOps/s | |
| test_add_one[memmap_tensor0] | 0.2110ms | 6.5381μs | 152.9507 KOps/s | 151.3260 KOps/s | |
| test_contiguous[memmap_tensor0] | 20.7210μs | 0.6661μs | 1.5013 MOps/s | 2.1396 MOps/s | |
| test_stack[memmap_tensor0] | 35.2210μs | 4.7466μs | 210.6781 KOps/s | 211.2343 KOps/s | |
| test_memmaptd_index | 1.0930ms | 0.2809ms | 3.5596 KOps/s | 3.6000 KOps/s | |
| test_memmaptd_index_astensor | 0.5389ms | 0.3852ms | 2.5962 KOps/s | 2.6418 KOps/s | |
| test_memmaptd_index_op | 0.9559ms | 0.6388ms | 1.5653 KOps/s | 1.5192 KOps/s | |
| test_serialize_model | 0.3076s | 0.1603s | 6.2400 Ops/s | 7.3740 Ops/s | |
| test_serialize_model_pickle | 1.3490s | 1.2106s | 0.8260 Ops/s | 0.8328 Ops/s | |
| test_serialize_weights | 0.1375s | 0.1351s | 7.4044 Ops/s | 7.3362 Ops/s | |
| test_serialize_weights_returnearly | 0.4535s | 88.5806ms | 11.2892 Ops/s | 6.1874 Ops/s | |
| test_serialize_weights_pickle | 1.3684s | 1.2136s | 0.8240 Ops/s | 0.8227 Ops/s | |
| test_reshape_pytree | 0.2094ms | 33.4388μs | 29.9054 KOps/s | 30.1401 KOps/s | |
| test_reshape_td | 82.1520μs | 46.9869μs | 21.2825 KOps/s | 22.1320 KOps/s | |
| test_view_pytree | 0.2119ms | 33.4556μs | 29.8903 KOps/s | 30.9273 KOps/s | |
| test_view_td | 97.5820μs | 54.3571μs | 18.3969 KOps/s | 18.4999 KOps/s | |
| test_unbind_pytree | 0.2354ms | 37.0089μs | 27.0205 KOps/s | 26.8639 KOps/s | |
| test_unbind_td | 0.1637ms | 50.5587μs | 19.7790 KOps/s | 19.7753 KOps/s | |
| test_split_pytree | 0.2264ms | 43.2024μs | 23.1468 KOps/s | 23.5356 KOps/s | |
| test_split_td | 0.1842ms | 67.5042μs | 14.8139 KOps/s | 15.4222 KOps/s | |
| test_add_pytree | 0.2407ms | 43.0028μs | 23.2543 KOps/s | 24.1154 KOps/s | |
| test_add_td | 0.1013ms | 54.7817μs | 18.2543 KOps/s | 18.6901 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.2061ms | 0.1426ms | 7.0148 KOps/s | 6.7226 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.5720ms | 0.2014ms | 4.9651 KOps/s | 4.9934 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.4178ms | 0.1107ms | 9.0297 KOps/s | 8.8948 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.6076ms | 0.1850ms | 5.4046 KOps/s | 5.4880 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.3023ms | 10.3629μs | 96.4982 KOps/s | 95.9914 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 91.5230μs | 54.3642μs | 18.3945 KOps/s | 18.2538 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 45.5910μs | 9.8679μs | 101.3382 KOps/s | 99.7046 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4393ms | 69.9695μs | 14.2919 KOps/s | 14.6166 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2131ms | 0.1767ms | 5.6601 KOps/s | 5.2547 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3565ms | 0.2787ms | 3.5885 KOps/s | 3.5713 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1695ms | 0.1178ms | 8.4900 KOps/s | 8.1288 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1044ms | 73.4884μs | 13.6076 KOps/s | 13.6758 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2007ms | 0.1578ms | 6.3374 KOps/s | 6.1861 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8926ms | 0.5438ms | 1.8387 KOps/s | 1.8492 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4193ms | 0.3316ms | 3.0153 KOps/s | 2.9985 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2149ms | 0.1791ms | 5.5842 KOps/s | 5.2009 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1624ms | 89.4496μs | 11.1795 KOps/s | 11.2606 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.1927ms | 0.1198ms | 8.3462 KOps/s | 8.0802 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6576ms | 0.4477ms | 2.2336 KOps/s | 2.2613 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.2385ms | 0.1591ms | 6.2851 KOps/s | 6.1435 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 62.5320μs | 13.0420μs | 76.6752 KOps/s | 74.1977 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 71.7810μs | 41.5088μs | 24.0913 KOps/s | 24.3536 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 38.5710μs | 10.7811μs | 92.7547 KOps/s | 93.2226 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4073ms | 52.7505μs | 18.9572 KOps/s | 19.0477 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 2.0333ms | 0.1736ms | 5.7602 KOps/s | 5.3783 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.4364ms | 3.3265ms | 300.6175 Ops/s | 302.5844 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9732ms | 0.1628ms | 6.1421 KOps/s | 5.9238 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9654ms | 2.8187ms | 354.7787 Ops/s | 352.7546 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.1581ms | 0.1093ms | 9.1504 KOps/s | 8.8007 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3133ms | 73.6091μs | 13.5853 KOps/s | 13.7651 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1401ms | 96.6273μs | 10.3490 KOps/s | 10.1996 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2518ms | 45.0761μs | 22.1847 KOps/s | 21.0891 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1446ms | 97.0734μs | 10.3015 KOps/s | 10.2255 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2574ms | 44.8519μs | 22.2956 KOps/s | 22.6221 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 99.2420μs | 56.3077μs | 17.7596 KOps/s | 16.9161 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2194ms | 27.8620μs | 35.8912 KOps/s | 36.0989 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 93.2120μs | 44.6030μs | 22.4200 KOps/s | 21.9304 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2690ms | 22.8553μs | 43.7536 KOps/s | 44.5956 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 83.1020μs | 44.2714μs | 22.5879 KOps/s | 21.4616 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2777ms | 22.6297μs | 44.1897 KOps/s | 44.7689 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 93.1220μs | 57.2454μs | 17.4686 KOps/s | 16.6321 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2832ms | 28.2226μs | 35.4326 KOps/s | 36.8333 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 86.0520μs | 44.6529μs | 22.3950 KOps/s | 21.5474 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2662ms | 22.6864μs | 44.0792 KOps/s | 44.5158 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 81.7620μs | 44.5219μs | 22.4609 KOps/s | 21.5754 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2747ms | 22.7383μs | 43.9786 KOps/s | 44.7885 KOps/s | |
| test_compile_replace[single-eager] | 0.1012ms | 47.6776μs | 20.9742 KOps/s | 21.2434 KOps/s | |
| test_compile_replace[single-compile] | 0.1447ms | 0.1047ms | 9.5490 KOps/s | 9.3061 KOps/s | |
| test_compile_replace[multi-eager] | 0.6365ms | 0.5651ms | 1.7695 KOps/s | 1.7790 KOps/s | |
| test_compile_replace[multi-compile] | 0.1622ms | 0.1128ms | 8.8651 KOps/s | 8.8814 KOps/s | |
| test_compile_tc_getattr_20[eager] | 0.2178ms | 0.1750ms | 5.7128 KOps/s | 5.9126 KOps/s | |
| test_compile_tc_getattr_20[compile] | 0.1740ms | 0.1194ms | 8.3779 KOps/s | 8.1934 KOps/s | |
| test_compile_clone_shallow[20-eager] | 52.9020μs | 19.5172μs | 51.2368 KOps/s | 52.2831 KOps/s | |
| test_compile_clone_shallow[20-compile] | 52.7410μs | 11.7383μs | 85.1912 KOps/s | 85.2134 KOps/s | |
| test_compile_clone_shallow[40-eager] | 60.9920μs | 34.1464μs | 29.2856 KOps/s | 29.5577 KOps/s | |
| test_compile_clone_shallow[40-compile] | 49.6910μs | 12.7251μs | 78.5850 KOps/s | 78.8690 KOps/s | |
| test_compile_clone_shallow[80-eager] | 0.1747ms | 61.4896μs | 16.2629 KOps/s | 15.7289 KOps/s | |
| test_compile_clone_shallow[80-compile] | 40.2410μs | 15.0043μs | 66.6475 KOps/s | 64.7920 KOps/s | |
| test_compile_update_inplace[eager] | 0.1242ms | 59.8931μs | 16.6964 KOps/s | 16.7293 KOps/s | |
| test_compile_update_inplace[compile] | 0.1918ms | 0.1404ms | 7.1248 KOps/s | 6.9418 KOps/s | |
| test_mod_add[eager] | 0.1177ms | 49.9102μs | 20.0360 KOps/s | 20.1400 KOps/s | |
| test_mod_add[compile] | 0.1487ms | 0.1058ms | 9.4521 KOps/s | 9.4495 KOps/s | |
| test_mod_add[compile-overhead] | 0.2325ms | 0.1496ms | 6.6862 KOps/s | 6.4990 KOps/s | |
| test_mod_wrap[eager] | 0.3636ms | 0.2925ms | 3.4189 KOps/s | 3.4452 KOps/s | |
| test_mod_wrap[compile] | 0.8315ms | 0.3547ms | 2.8189 KOps/s | 2.7142 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.2512ms | 4.0155ms | 249.0346 Ops/s | 247.5443 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.9425ms | 1.5085ms | 662.9031 Ops/s | 659.6341 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.9626ms | 1.4523ms | 688.5602 Ops/s | 680.3849 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2654ms | 0.8898ms | 1.1238 KOps/s | 1.1066 KOps/s | |
| test_seq_add[eager] | 0.6370ms | 0.1560ms | 6.4100 KOps/s | 6.4170 KOps/s | |
| test_seq_add[compile] | 0.6107ms | 0.1163ms | 8.5957 KOps/s | 8.5217 KOps/s | |
| test_seq_add[compile-overhead] | 0.6001ms | 0.1567ms | 6.3829 KOps/s | 6.2174 KOps/s | |
| test_seq_wrap[eager] | 0.9654ms | 0.5225ms | 1.9137 KOps/s | 1.9051 KOps/s | |
| test_seq_wrap[compile] | 0.8736ms | 0.3680ms | 2.7177 KOps/s | 2.7139 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3413ms | 0.2664ms | 3.7537 KOps/s | 3.7001 KOps/s | |
| test_func_call_runtime[False-eager] | 1.2829ms | 0.8422ms | 1.1874 KOps/s | 1.2054 KOps/s | |
| test_func_call_runtime[False-compile] | 1.4088ms | 0.9193ms | 1.0878 KOps/s | 1.0883 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.9048ms | 0.4654ms | 2.1488 KOps/s | 2.1317 KOps/s | |
| test_func_call_runtime[True-eager] | 1.5263ms | 1.0878ms | 919.3103 Ops/s | 928.1597 Ops/s | |
| test_func_call_runtime[True-compile] | 1.4396ms | 0.9307ms | 1.0745 KOps/s | 1.0765 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.9212ms | 0.4785ms | 2.0899 KOps/s | 2.0635 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 1.2917ms | 0.8681ms | 1.1519 KOps/s | 1.2028 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.1656ms | 0.9265ms | 1.0793 KOps/s | 1.0855 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5712ms | 0.4660ms | 2.1459 KOps/s | 2.1217 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3070ms | 1.2253ms | 816.1113 Ops/s | 813.4923 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.0251ms | 0.9606ms | 1.0410 KOps/s | 1.0344 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5610ms | 0.5110ms | 1.9569 KOps/s | 1.9333 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8531ms | 2.3739ms | 421.2492 Ops/s | 420.3055 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.1358ms | 0.9868ms | 1.0134 KOps/s | 1.0132 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5955ms | 0.5149ms | 1.9421 KOps/s | 1.9094 KOps/s | |
| test_distributed | 0.5470ms | 0.1522ms | 6.5706 KOps/s | 6.5085 KOps/s | |
| test_tdmodule | 0.2486ms | 27.4147μs | 36.4767 KOps/s | 36.2088 KOps/s | |
| test_tdmodule_dispatch | 77.4020μs | 45.5802μs | 21.9394 KOps/s | 22.4956 KOps/s | |
| test_tdseq | 53.5910μs | 26.5465μs | 37.6697 KOps/s | 37.2483 KOps/s | |
| test_tdseq_dispatch | 67.3010μs | 46.4783μs | 21.5154 KOps/s | 21.1420 KOps/s | |
| test_instantiation_functorch | 2.2018ms | 2.1025ms | 475.6165 Ops/s | 480.5947 Ops/s | |
| test_exec_functorch | 0.2407ms | 0.1795ms | 5.5704 KOps/s | 5.5451 KOps/s | |
| test_exec_functional_call | 0.2285ms | 0.1593ms | 6.2775 KOps/s | 6.2424 KOps/s | |
| test_exec_td_decorator | 0.4326ms | 0.2346ms | 4.2627 KOps/s | 4.2311 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0064ms | 0.8204ms | 1.2190 KOps/s | 1.2115 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.9999ms | 0.8197ms | 1.2200 KOps/s | 1.2144 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8536ms | 0.7048ms | 1.4187 KOps/s | 1.4041 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8753ms | 0.7081ms | 1.4123 KOps/s | 1.4002 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.4274ms | 20.5676ms | 48.6203 Ops/s | 48.4663 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 20.6841ms | 20.5664ms | 48.6230 Ops/s | 48.4720 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.5470ms | 20.3188ms | 49.2155 Ops/s | 48.9142 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.4746ms | 20.3529ms | 49.1332 Ops/s | 48.8916 Ops/s | |
| test_to_module_speed[True] | 1.8673ms | 1.4789ms | 676.1605 Ops/s | 676.7710 Ops/s | |
| test_to_module_speed[False] | 1.5720ms | 1.4552ms | 687.1957 Ops/s | 689.3482 Ops/s | |
| test_tc_init | 81.3120μs | 44.7082μs | 22.3673 KOps/s | 22.2063 KOps/s | |
| test_tc_init_tensor_only | 84.8020μs | 9.7830μs | 102.2177 KOps/s | 101.2725 KOps/s | |
| test_tc_init_nested | 0.1238ms | 88.8926μs | 11.2495 KOps/s | 11.2355 KOps/s | |
| test_tc_init_many_fields | 65.3710μs | 16.5153μs | 60.5500 KOps/s | 60.8630 KOps/s | |
| test_tc_first_layer_tensor | 73.2820μs | 1.8213μs | 549.0657 KOps/s | 554.8517 KOps/s | |
| test_tc_first_layer_tensor_only | 2.7267μs | 0.4011μs | 2.4932 MOps/s | 2.4802 MOps/s | |
| test_tc_first_layer_tensor_set | 33.7110μs | 3.9616μs | 252.4209 KOps/s | 252.8804 KOps/s | |
| test_tc_first_layer_tensor_only_set | 28.0810μs | 3.2719μs | 305.6372 KOps/s | 303.4225 KOps/s | |
| test_tc_first_layer_nontensor | 27.7100μs | 6.1529μs | 162.5246 KOps/s | 157.1311 KOps/s | |
| test_tc_second_layer_tensor | 27.8510μs | 4.4365μs | 225.4014 KOps/s | 224.8127 KOps/s | |
| test_tc_second_layer_nontensor | 56.9020μs | 8.6910μs | 115.0616 KOps/s | 111.0009 KOps/s | |
| test_unbind | 0.2498s | 16.4056ms | 60.9548 Ops/s | 53.8745 Ops/s | |
| test_full_like | 16.9958ms | 16.5246ms | 60.5157 Ops/s | 73.4493 Ops/s | |
| test_zeros_like | 17.3524ms | 16.8474ms | 59.3563 Ops/s | 74.1591 Ops/s | |
| test_ones_like | 16.9585ms | 16.5884ms | 60.2831 Ops/s | 73.9715 Ops/s | |
| test_clone | 17.8337ms | 17.5699ms | 56.9154 Ops/s | 67.6819 Ops/s | |
| test_squeeze | 97.4530μs | 14.4489μs | 69.2095 KOps/s | 64.2986 KOps/s | |
| test_unsqueeze | 0.1712ms | 0.1104ms | 9.0599 KOps/s | 8.5937 KOps/s | |
| test_split | 0.3520ms | 0.1853ms | 5.3965 KOps/s | 5.1184 KOps/s | |
| test_permute | 0.2700ms | 0.2113ms | 4.7325 KOps/s | 4.5282 KOps/s | |
| test_stack | 51.8540ms | 51.2154ms | 19.5254 Ops/s | 19.4851 Ops/s | |
| test_cat | 51.4389ms | 50.9110ms | 19.6421 Ops/s | 19.4963 Ops/s | |
| test_sequential_tensordict | 0.6098ms | 0.2223ms | 4.4987 KOps/s | 4.5426 KOps/s | |
| test_sequential_graph_module | 0.1993ms | 0.1230ms | 8.1288 KOps/s | 8.4607 KOps/s | |
| test_nested_tensordict | 0.7246ms | 0.2820ms | 3.5457 KOps/s | 3.5493 KOps/s | |
| test_nested_graph_module | 0.1710ms | 0.1311ms | 7.6257 KOps/s | 7.6847 KOps/s |
Stack from ghstack (oldest at bottom):
Pure-logic module for computing optimal P2P transfers between
different DeviceMesh sharding layouts. Includes:
intersections between source and destination meshes.
without GPUs or a distributed runtime).
JSON-over-CUDA metadata serialization) and _UCXXBackend implementations.
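To make the "intersections between source and destination" idea concrete, here is a minimal pure-Python sketch, not the code from this PR. It assumes a 1-D tensor sharded contiguously along dim 0 with ceil-sized chunks; the names `Transfer`, `shard_bounds` and `plan_transfers` are hypothetical and chosen only for illustration. Each source shard is intersected with each destination shard, and every non-empty overlap becomes one P2P transfer.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One point-to-point copy of the global index range [start, stop)."""

    src_rank: int
    dst_rank: int
    start: int  # inclusive global index
    stop: int   # exclusive global index


def shard_bounds(length: int, world_size: int) -> list[tuple[int, int]]:
    """Split [0, length) into contiguous ceil-sized chunks, one per rank.

    Assumption: this mirrors a simple even sharding along dim 0; trailing
    ranks may own an empty range when length is not divisible.
    """
    chunk = -(-length // world_size)  # ceil division
    return [
        (min(rank * chunk, length), min((rank + 1) * chunk, length))
        for rank in range(world_size)
    ]


def plan_transfers(length: int, src_world: int, dst_world: int) -> list[Transfer]:
    """Intersect every source shard with every destination shard."""
    plan = []
    for s, (s0, s1) in enumerate(shard_bounds(length, src_world)):
        for d, (d0, d1) in enumerate(shard_bounds(length, dst_world)):
            lo, hi = max(s0, d0), min(s1, d1)
            if lo < hi:  # keep only non-empty overlaps
                plan.append(Transfer(s, d, lo, hi))
    return plan


if __name__ == "__main__":
    # 8 elements, resharded from 2 source ranks to 4 destination ranks.
    for transfer in plan_transfers(8, 2, 4):
        print(transfer)
```

Because the plan is plain data computed from shapes alone, it can be unit-tested on a CPU-only machine, which matches the "pure-logic" framing of the description. A real implementation would also have to handle multi-dimensional meshes and replicated placements, which this sketch ignores.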
Made-with: Cursor