[Feature] inplace arg in TDM constructor #992
Merged
This was referenced Sep 16, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 40.2750μs | 19.1191μs | 52.3036 KOps/s | 49.2278 KOps/s | |
test_plain_set_stack_nested | 84.5070μs | 19.1860μs | 52.1213 KOps/s | 48.5777 KOps/s | |
test_plain_set_nested_inplace | 46.1460μs | 20.7332μs | 48.2319 KOps/s | 44.8778 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1020ms | 20.9719μs | 47.6830 KOps/s | 45.3087 KOps/s | |
test_items | 21.5810μs | 4.1346μs | 241.8623 KOps/s | 233.9152 KOps/s | |
test_items_nested | 0.5236ms | 0.3592ms | 2.7842 KOps/s | 2.8159 KOps/s | |
test_items_nested_locked | 0.5281ms | 0.3617ms | 2.7645 KOps/s | 2.7950 KOps/s | |
test_items_nested_leaf | 0.1297ms | 68.4004μs | 14.6198 KOps/s | 14.3806 KOps/s | |
test_items_stack_nested | 0.6432ms | 0.3639ms | 2.7478 KOps/s | 2.7598 KOps/s | |
test_items_stack_nested_leaf | 0.1645ms | 71.7869μs | 13.9301 KOps/s | 13.6716 KOps/s | |
test_items_stack_nested_locked | 0.5918ms | 0.3732ms | 2.6794 KOps/s | 2.7540 KOps/s | |
test_keys | 22.7230μs | 3.5836μs | 279.0479 KOps/s | 281.1463 KOps/s | |
test_keys_nested | 0.1785ms | 0.1022ms | 9.7857 KOps/s | 9.7985 KOps/s | |
test_keys_nested_locked | 1.9011ms | 0.1078ms | 9.2747 KOps/s | 9.2849 KOps/s | |
test_keys_nested_leaf | 0.1479ms | 84.3390μs | 11.8569 KOps/s | 11.5880 KOps/s | |
test_keys_stack_nested | 0.1650ms | 99.3052μs | 10.0700 KOps/s | 9.7232 KOps/s | |
test_keys_stack_nested_leaf | 0.1404ms | 81.3850μs | 12.2873 KOps/s | 11.6883 KOps/s | |
test_keys_stack_nested_locked | 0.1998ms | 0.1044ms | 9.5794 KOps/s | 9.2975 KOps/s | |
test_values | 13.1822μs | 1.0974μs | 911.2209 KOps/s | 931.0493 KOps/s | |
test_values_nested | 0.1391ms | 74.6810μs | 13.3903 KOps/s | 13.5471 KOps/s | |
test_values_nested_locked | 0.1432ms | 73.0934μs | 13.6811 KOps/s | 13.5840 KOps/s | |
test_values_nested_leaf | 0.1173ms | 61.6492μs | 16.2208 KOps/s | 15.9900 KOps/s | |
test_values_stack_nested | 0.1353ms | 74.2577μs | 13.4666 KOps/s | 13.3737 KOps/s | |
test_values_stack_nested_leaf | 0.1215ms | 59.6307μs | 16.7699 KOps/s | 16.1709 KOps/s | |
test_values_stack_nested_locked | 0.1385ms | 73.7149μs | 13.5658 KOps/s | 13.5025 KOps/s | |
test_membership | 22.9630μs | 0.8625μs | 1.1595 MOps/s | 1.4136 MOps/s | |
test_membership_nested | 29.8560μs | 2.7373μs | 365.3216 KOps/s | 360.6925 KOps/s | |
test_membership_nested_leaf | 20.0980μs | 2.7118μs | 368.7644 KOps/s | 369.5049 KOps/s | |
test_membership_stacked_nested | 26.9800μs | 2.7166μs | 368.1132 KOps/s | 365.9161 KOps/s | |
test_membership_stacked_nested_leaf | 31.6590μs | 2.7242μs | 367.0826 KOps/s | 367.8747 KOps/s | |
test_membership_nested_last | 34.0540μs | 3.9335μs | 254.2280 KOps/s | 253.7718 KOps/s | |
test_membership_nested_leaf_last | 36.5780μs | 3.9260μs | 254.7129 KOps/s | 250.7386 KOps/s | |
test_membership_stacked_nested_last | 26.1890μs | 3.8876μs | 257.2280 KOps/s | 179.1601 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.8560μs | 3.9049μs | 256.0890 KOps/s | 179.0967 KOps/s | |
test_nested_getleaf | 52.6790μs | 10.4531μs | 95.6656 KOps/s | 93.0541 KOps/s | |
test_nested_get | 39.5240μs | 10.0454μs | 99.5477 KOps/s | 96.7907 KOps/s | |
test_stacked_getleaf | 43.6810μs | 10.4261μs | 95.9132 KOps/s | 94.2641 KOps/s | |
test_stacked_get | 31.4390μs | 9.9847μs | 100.1535 KOps/s | 98.7183 KOps/s | |
test_nested_getitemleaf | 58.4370μs | 10.5833μs | 94.4888 KOps/s | 88.5847 KOps/s | |
test_nested_getitem | 42.1490μs | 10.2851μs | 97.2284 KOps/s | 96.1981 KOps/s | |
test_stacked_getitemleaf | 35.2460μs | 10.9042μs | 91.7077 KOps/s | 89.9554 KOps/s | |
test_stacked_getitem | 35.2460μs | 10.1946μs | 98.0910 KOps/s | 94.7427 KOps/s | |
test_lock_nested | 84.8870ms | 0.5757ms | 1.7369 KOps/s | 2.0512 KOps/s | |
test_lock_stack_nested | 0.7186ms | 0.4492ms | 2.2263 KOps/s | 2.1823 KOps/s | |
test_unlock_nested | 86.3763ms | 0.5038ms | 1.9847 KOps/s | 2.4119 KOps/s | |
test_unlock_stack_nested | 0.5528ms | 0.3716ms | 2.6911 KOps/s | 2.6203 KOps/s | |
test_flatten_speed | 0.1890ms | 89.0567μs | 11.2288 KOps/s | 11.3405 KOps/s | |
test_unflatten_speed | 0.8445ms | 0.4735ms | 2.1120 KOps/s | 2.1504 KOps/s | |
test_common_ops | 2.0146ms | 1.0664ms | 937.7176 Ops/s | 893.8386 Ops/s | |
test_creation | 91.9220μs | 2.1198μs | 471.7330 KOps/s | 474.9589 KOps/s | |
test_creation_empty | 51.4260μs | 15.1910μs | 65.8285 KOps/s | 58.0193 KOps/s | |
test_creation_nested_1 | 60.4830μs | 18.0490μs | 55.4048 KOps/s | 49.0498 KOps/s | |
test_creation_nested_2 | 55.4140μs | 22.3158μs | 44.8112 KOps/s | 39.8406 KOps/s | |
test_clone | 60.9840μs | 17.0236μs | 58.7421 KOps/s | 57.7425 KOps/s | |
test_getitem[int] | 0.8786ms | 17.1419μs | 58.3367 KOps/s | 58.6644 KOps/s | |
test_getitem[slice_int] | 0.1375ms | 31.1479μs | 32.1048 KOps/s | 32.0601 KOps/s | |
test_getitem[range] | 0.3400ms | 59.6111μs | 16.7754 KOps/s | 16.7030 KOps/s | |
test_getitem[tuple] | 0.1303ms | 25.6918μs | 38.9229 KOps/s | 39.6775 KOps/s | |
test_getitem[list] | 0.1910ms | 55.3062μs | 18.0811 KOps/s | 18.1047 KOps/s | |
test_setitem_dim[int] | 74.6990μs | 33.8443μs | 29.5471 KOps/s | 29.8332 KOps/s | |
test_setitem_dim[slice_int] | 0.1044ms | 62.0756μs | 16.1094 KOps/s | 16.1168 KOps/s | |
test_setitem_dim[range] | 0.1402ms | 85.9758μs | 11.6312 KOps/s | 11.6296 KOps/s | |
test_setitem_dim[tuple] | 90.3590μs | 50.0987μs | 19.9606 KOps/s | 19.8739 KOps/s | |
test_setitem | 89.6980μs | 28.8329μs | 34.6826 KOps/s | 33.8369 KOps/s | |
test_set | 81.6530μs | 27.8543μs | 35.9011 KOps/s | 34.9222 KOps/s | |
test_set_shared | 1.3033ms | 0.2117ms | 4.7241 KOps/s | 4.6548 KOps/s | |
test_update | 0.1541ms | 33.3734μs | 29.9640 KOps/s | 27.2493 KOps/s | |
test_update_nested | 0.1426ms | 44.6678μs | 22.3875 KOps/s | 21.3246 KOps/s | |
test_update__nested | 83.8270μs | 34.5626μs | 28.9330 KOps/s | 28.9271 KOps/s | |
test_set_nested | 0.1039ms | 30.1934μs | 33.1198 KOps/s | 31.9790 KOps/s | |
test_set_nested_new | 0.1057ms | 36.2133μs | 27.6141 KOps/s | 26.9884 KOps/s | |
test_select | 0.1322ms | 53.1869μs | 18.8016 KOps/s | 18.7041 KOps/s | |
test_select_nested | 0.1221ms | 60.4615μs | 16.5395 KOps/s | 16.5135 KOps/s | |
test_exclude_nested | 0.1936ms | 76.3665μs | 13.0948 KOps/s | 13.0951 KOps/s | |
test_empty[True] | 0.4708ms | 0.3196ms | 3.1288 KOps/s | 3.1079 KOps/s | |
test_empty[False] | 7.9598μs | 1.2608μs | 793.1405 KOps/s | 798.0378 KOps/s | |
test_unbind_speed | 0.4970ms | 0.3091ms | 3.2357 KOps/s | 3.2186 KOps/s | |
test_unbind_speed_stack0 | 0.4598ms | 0.2962ms | 3.3764 KOps/s | 3.3256 KOps/s | |
test_unbind_speed_stack1 | 97.4602ms | 0.7973ms | 1.2542 KOps/s | 1.3432 KOps/s | |
test_split | 2.2527ms | 2.0236ms | 494.1617 Ops/s | 447.3076 Ops/s | |
test_chunk | 0.1137s | 2.2650ms | 441.4970 Ops/s | 447.0895 Ops/s | |
test_creation[device0] | 4.3576ms | 0.1202ms | 8.3190 KOps/s | 8.2795 KOps/s | |
test_creation_from_tensor | 0.2813ms | 0.1176ms | 8.5047 KOps/s | 8.2531 KOps/s | |
test_add_one[memmap_tensor0] | 0.1865ms | 7.6568μs | 130.6032 KOps/s | 129.1879 KOps/s | |
test_contiguous[memmap_tensor0] | 22.5220μs | 1.8833μs | 530.9959 KOps/s | 516.4560 KOps/s | |
test_stack[memmap_tensor0] | 35.4160μs | 5.9379μs | 168.4100 KOps/s | 172.6294 KOps/s | |
test_memmaptd_index | 1.0350ms | 0.4155ms | 2.4069 KOps/s | 2.4036 KOps/s | |
test_memmaptd_index_astensor | 0.9970ms | 0.4915ms | 2.0346 KOps/s | 2.0366 KOps/s | |
test_memmaptd_index_op | 1.7741ms | 1.0106ms | 989.5301 Ops/s | 945.3920 Ops/s | |
test_serialize_model | 0.2106s | 0.1326s | 7.5421 Ops/s | 8.6298 Ops/s | |
test_serialize_model_pickle | 0.4665s | 0.3882s | 2.5757 Ops/s | 2.5239 Ops/s | |
test_serialize_weights | 0.1223s | 0.1160s | 8.6207 Ops/s | 7.4380 Ops/s | |
test_serialize_weights_returnearly | 0.1864s | 0.1586s | 6.3045 Ops/s | 6.1923 Ops/s | |
test_serialize_weights_pickle | 0.4806s | 0.4242s | 2.3575 Ops/s | 2.3085 Ops/s | |
test_serialize_weights_filesystem | 0.1491s | 0.1412s | 7.0826 Ops/s | 7.0930 Ops/s | |
test_serialize_model_filesystem | 0.1627s | 0.1528s | 6.5445 Ops/s | 6.5664 Ops/s | |
test_reshape_pytree | 85.3390μs | 38.5428μs | 25.9452 KOps/s | 25.5475 KOps/s | |
test_reshape_td | 0.1080ms | 44.7727μs | 22.3350 KOps/s | 21.3932 KOps/s | |
test_view_pytree | 0.1410ms | 38.7193μs | 25.8269 KOps/s | 25.5705 KOps/s | |
test_view_td | 0.1069ms | 50.7389μs | 19.7087 KOps/s | 19.1641 KOps/s | |
test_unbind_pytree | 74.8600μs | 36.2858μs | 27.5590 KOps/s | 27.7266 KOps/s | |
test_unbind_td | 0.2949ms | 45.6083μs | 21.9258 KOps/s | 22.1155 KOps/s | |
test_split_pytree | 79.7490μs | 38.4512μs | 26.0070 KOps/s | 26.0089 KOps/s | |
test_split_td | 0.5231ms | 57.9912μs | 17.2440 KOps/s | 14.4394 KOps/s | |
test_add_pytree | 0.1098ms | 46.6666μs | 21.4286 KOps/s | 21.8181 KOps/s | |
test_add_td | 0.3023ms | 80.8982μs | 12.3612 KOps/s | 12.1715 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1266ms | 57.7656μs | 17.3113 KOps/s | 17.5988 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3520ms | 0.1779ms | 5.6200 KOps/s | 5.5773 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1214ms | 56.4217μs | 17.7237 KOps/s | 17.7431 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2746ms | 0.1457ms | 6.8641 KOps/s | 6.8788 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 62.0460μs | 21.2113μs | 47.1447 KOps/s | 46.2925 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1389ms | 68.1493μs | 14.6737 KOps/s | 14.8630 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1701ms | 75.6658μs | 13.2160 KOps/s | 13.1148 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1533ms | 67.5547μs | 14.8028 KOps/s | 14.8716 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2910ms | 0.1731ms | 5.7760 KOps/s | 5.7489 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3931ms | 0.1885ms | 5.3058 KOps/s | 5.2725 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1091ms | 46.0338μs | 21.7232 KOps/s | 21.3406 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1597ms | 68.8552μs | 14.5232 KOps/s | 13.7768 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3782ms | 0.1772ms | 5.6440 KOps/s | 5.7910 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5842ms | 0.2951ms | 3.3890 KOps/s | 3.4136 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3002ms | 0.1979ms | 5.0526 KOps/s | 5.0011 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3336ms | 0.1771ms | 5.6452 KOps/s | 5.8551 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1343ms | 61.7377μs | 16.1975 KOps/s | 15.9843 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 89.1260μs | 46.5699μs | 21.4731 KOps/s | 21.7755 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3689ms | 0.2352ms | 4.2515 KOps/s | 4.2488 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2673ms | 0.1733ms | 5.7699 KOps/s | 5.6368 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1851ms | 0.1014ms | 9.8645 KOps/s | 9.8803 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1231ms | 58.3331μs | 17.1429 KOps/s | 17.2573 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1870ms | 78.3242μs | 12.7674 KOps/s | 12.8408 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1257ms | 68.5375μs | 14.5905 KOps/s | 14.6573 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3006ms | 0.1972ms | 5.0711 KOps/s | 5.1371 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7753ms | 1.6701ms | 598.7615 Ops/s | 601.7945 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2997ms | 0.1959ms | 5.1043 KOps/s | 5.1998 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.7739ms | 1.1298ms | 885.0817 Ops/s | 896.0053 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5261ms | 0.4234ms | 2.3620 KOps/s | 2.3853 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.7069ms | 3.7008ms | 270.2109 Ops/s | 263.4368 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 74.9800μs | 33.9019μs | 29.4969 KOps/s | 28.7662 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0664ms | 50.0244μs | 19.9903 KOps/s | 20.1267 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 72.1350μs | 29.7170μs | 33.6507 KOps/s | 33.2033 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 77.7950μs | 30.1611μs | 33.1552 KOps/s | 32.8905 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1179ms | 29.8959μs | 33.4494 KOps/s | 32.5290 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1003ms | 29.6459μs | 33.7315 KOps/s | 33.1502 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1554ms | 73.5331μs | 13.5993 KOps/s | 13.4119 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5241ms | 27.5755μs | 36.2641 KOps/s | 35.9191 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1394ms | 67.8788μs | 14.7321 KOps/s | 14.7446 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 63.5690μs | 23.0532μs | 43.3779 KOps/s | 41.0553 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1976ms | 67.7885μs | 14.7518 KOps/s | 14.8753 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 60.4730μs | 23.2240μs | 43.0590 KOps/s | 42.6139 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1436ms | 72.3660μs | 13.8186 KOps/s | 13.7109 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0206ms | 27.6009μs | 36.2308 KOps/s | 36.3019 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1566ms | 67.6187μs | 14.7888 KOps/s | 14.8530 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 63.4080μs | 23.1438μs | 43.2080 KOps/s | 43.0546 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1345ms | 67.9575μs | 14.7151 KOps/s | 15.0064 KOps/s | |
test_compile_indexing[int-pytree-eager] | 83.4660μs | 23.0524μs | 43.3794 KOps/s | 42.7971 KOps/s | |
test_mod_add[eager] | 75.1500μs | 22.9002μs | 43.6677 KOps/s | 40.1954 KOps/s | |
test_mod_add[compile] | 0.1093ms | 40.1907μs | 24.8814 KOps/s | 25.1954 KOps/s | |
test_mod_add[compile-overhead] | 0.1019ms | 39.4446μs | 25.3520 KOps/s | 25.1407 KOps/s | |
test_mod_wrap[eager] | 0.4088ms | 0.2087ms | 4.7912 KOps/s | 4.7632 KOps/s | |
test_mod_wrap[compile] | 0.3145ms | 0.2377ms | 4.2064 KOps/s | 4.2868 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3351ms | 0.2333ms | 4.2856 KOps/s | 4.2996 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.1390ms | 10.6706ms | 93.7152 Ops/s | 92.8704 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.0450ms | 10.6022ms | 94.3200 Ops/s | 91.5298 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 11.5439ms | 10.6135ms | 94.2195 Ops/s | 90.4529 Ops/s | |
test_seq_add[eager] | 0.2153ms | 89.4437μs | 11.1802 KOps/s | 11.2898 KOps/s | |
test_seq_add[compile] | 0.1546ms | 64.4649μs | 15.5123 KOps/s | 15.6057 KOps/s | |
test_seq_add[compile-overhead] | 0.1475ms | 63.2906μs | 15.8001 KOps/s | 15.7932 KOps/s | |
test_seq_wrap[eager] | 0.6391ms | 0.3809ms | 2.6251 KOps/s | 2.5971 KOps/s | |
test_seq_wrap[compile] | 1.4113ms | 0.2692ms | 3.7141 KOps/s | 3.6989 KOps/s | |
test_seq_wrap[compile-overhead] | 1.3333ms | 0.2698ms | 3.7071 KOps/s | 3.6801 KOps/s | |
test_func_call_runtime[False-eager] | 0.9624ms | 0.5235ms | 1.9101 KOps/s | 1.9061 KOps/s | |
test_func_call_runtime[False-compile] | 0.9487ms | 0.5035ms | 1.9861 KOps/s | 1.9970 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6540ms | 0.5041ms | 1.9838 KOps/s | 1.9761 KOps/s | |
test_func_call_runtime[True-eager] | 1.2724ms | 0.7426ms | 1.3466 KOps/s | 1.3587 KOps/s | |
test_func_call_runtime[True-compile] | 0.6465ms | 0.5128ms | 1.9501 KOps/s | 1.9409 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7004ms | 0.5137ms | 1.9467 KOps/s | 1.9533 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7751ms | 0.5198ms | 1.9240 KOps/s | 1.9333 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7145ms | 0.5038ms | 1.9851 KOps/s | 1.9884 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6747ms | 0.5057ms | 1.9776 KOps/s | 1.9832 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2246ms | 0.8687ms | 1.1511 KOps/s | 1.1528 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9016ms | 0.7422ms | 1.3474 KOps/s | 1.3481 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0540ms | 0.7352ms | 1.3601 KOps/s | 1.3422 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.3937ms | 1.8708ms | 534.5239 Ops/s | 528.8157 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6223ms | 1.9317ms | 517.6799 Ops/s | 517.6657 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.7021ms | 1.9169ms | 521.6880 Ops/s | 517.2949 Ops/s | |
test_distributed | 0.2289ms | 0.1236ms | 8.0882 KOps/s | 7.9178 KOps/s | |
test_tdmodule | 75.8020μs | 16.9200μs | 59.1015 KOps/s | 55.3262 KOps/s | |
test_tdmodule_dispatch | 61.3350μs | 33.1418μs | 30.1734 KOps/s | 27.7755 KOps/s | |
test_tdseq | 35.5770μs | 19.0801μs | 52.4106 KOps/s | 49.1686 KOps/s | |
test_tdseq_dispatch | 67.7460μs | 37.4782μs | 26.6822 KOps/s | 24.0881 KOps/s | |
test_instantiation_functorch | 1.7054ms | 1.5942ms | 627.2655 Ops/s | 613.8585 Ops/s | |
test_instantiation_td | 1.9007ms | 1.1772ms | 849.4568 Ops/s | 838.5721 Ops/s | |
test_exec_functorch | 0.3070ms | 0.1857ms | 5.3856 KOps/s | 5.3840 KOps/s | |
test_exec_functional_call | 0.3561ms | 0.1780ms | 5.6166 KOps/s | 5.7057 KOps/s | |
test_exec_td | 0.3237ms | 0.1721ms | 5.8090 KOps/s | 5.8150 KOps/s | |
test_exec_td_decorator | 0.4143ms | 0.2268ms | 4.4082 KOps/s | 4.4447 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0673ms | 0.6402ms | 1.5621 KOps/s | 1.5348 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9571ms | 0.6387ms | 1.5656 KOps/s | 1.5553 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7301ms | 0.4995ms | 2.0020 KOps/s | 1.9889 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7740ms | 0.5014ms | 1.9945 KOps/s | 1.9998 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2443ms | 0.6146ms | 1.6271 KOps/s | 1.5925 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9506ms | 0.6206ms | 1.6114 KOps/s | 1.5949 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8198ms | 0.5164ms | 1.9364 KOps/s | 1.9383 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7084ms | 0.5141ms | 1.9451 KOps/s | 1.9390 KOps/s | |
test_to_module_speed[True] | 2.0584ms | 1.2962ms | 771.4778 Ops/s | 783.5097 Ops/s | |
test_to_module_speed[False] | 1.7362ms | 1.2547ms | 797.0252 Ops/s | 804.8348 Ops/s | |
test_tc_init | 75.1600μs | 41.5878μs | 24.0455 KOps/s | 23.1934 KOps/s | |
test_tc_init_nested | 0.1544ms | 85.3437μs | 11.7173 KOps/s | 11.3298 KOps/s | |
test_tc_first_layer_tensor | 18.4750μs | 1.5655μs | 638.7824 KOps/s | 656.0347 KOps/s | |
test_tc_first_layer_nontensor | 34.4440μs | 4.7573μs | 210.2020 KOps/s | 214.1845 KOps/s | |
test_tc_second_layer_tensor | 25.7080μs | 2.8868μs | 346.4058 KOps/s | 358.1563 KOps/s | |
test_tc_second_layer_nontensor | 24.9570μs | 6.0760μs | 164.5807 KOps/s | 166.5299 KOps/s | |
test_unbind | 0.4783s | 13.2076ms | 75.7140 Ops/s | 75.4316 Ops/s | |
test_full_like | 9.6939ms | 7.9325ms | 126.0629 Ops/s | 129.8895 Ops/s | |
test_zeros_like | 3.8229ms | 3.0628ms | 326.4997 Ops/s | 152.2856 Ops/s | |
test_ones_like | 13.7831ms | 7.0864ms | 141.1157 Ops/s | 128.5688 Ops/s | |
test_clone | 16.5545ms | 8.7870ms | 113.8041 Ops/s | 108.0901 Ops/s | |
test_squeeze | 68.7180μs | 12.3760μs | 80.8015 KOps/s | 83.2828 KOps/s | |
test_unsqueeze | 0.2339ms | 94.9209μs | 10.5351 KOps/s | 10.3708 KOps/s | |
test_split | 0.5352ms | 0.1992ms | 5.0211 KOps/s | 4.9801 KOps/s | |
test_permute | 0.4705ms | 0.2300ms | 4.3479 KOps/s | 4.3016 KOps/s | |
test_stack | 30.7310ms | 24.9623ms | 40.0604 Ops/s | 40.0158 Ops/s | |
test_cat | 28.0563ms | 24.8276ms | 40.2777 Ops/s | 39.9682 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.5810ms | 13.4748μs | 74.2124 KOps/s | 66.7465 KOps/s | |
test_plain_set_stack_nested | 38.8110μs | 13.8443μs | 72.2319 KOps/s | 66.3379 KOps/s | |
test_plain_set_nested_inplace | 67.1610μs | 14.7315μs | 67.8818 KOps/s | 62.6182 KOps/s | |
test_plain_set_stack_nested_inplace | 50.3110μs | 14.4730μs | 69.0940 KOps/s | 61.4923 KOps/s | |
test_items | 32.8400μs | 2.9049μs | 344.2504 KOps/s | 345.5687 KOps/s | |
test_items_nested | 0.4732ms | 0.3242ms | 3.0848 KOps/s | 3.1025 KOps/s | |
test_items_nested_locked | 0.3756ms | 0.3259ms | 3.0687 KOps/s | 3.0323 KOps/s | |
test_items_nested_leaf | 90.2520μs | 55.5788μs | 17.9925 KOps/s | 17.9532 KOps/s | |
test_items_stack_nested | 0.3867ms | 0.3291ms | 3.0383 KOps/s | 3.0554 KOps/s | |
test_items_stack_nested_leaf | 96.0420μs | 57.3979μs | 17.4222 KOps/s | 17.6270 KOps/s | |
test_items_stack_nested_locked | 0.4141ms | 0.3300ms | 3.0300 KOps/s | 3.0336 KOps/s | |
test_keys | 39.1710μs | 3.4615μs | 288.8924 KOps/s | 290.2361 KOps/s | |
test_keys_nested | 85.0910μs | 56.3602μs | 17.7430 KOps/s | 17.6823 KOps/s | |
test_keys_nested_locked | 2.2340ms | 63.0239μs | 15.8670 KOps/s | 15.9831 KOps/s | |
test_keys_nested_leaf | 95.0310μs | 46.9920μs | 21.2802 KOps/s | 21.0737 KOps/s | |
test_keys_stack_nested | 88.6420μs | 55.0631μs | 18.1610 KOps/s | 17.8707 KOps/s | |
test_keys_stack_nested_leaf | 81.5910μs | 48.5605μs | 20.5929 KOps/s | 20.7854 KOps/s | |
test_keys_stack_nested_locked | 84.7620μs | 61.8682μs | 16.1634 KOps/s | 16.2495 KOps/s | |
test_values | 4.8850μs | 0.8646μs | 1.1566 MOps/s | 1.1487 MOps/s | |
test_values_nested | 65.0420μs | 40.8163μs | 24.5000 KOps/s | 24.5855 KOps/s | |
test_values_nested_locked | 73.4810μs | 42.8682μs | 23.3273 KOps/s | 23.3965 KOps/s | |
test_values_nested_leaf | 81.5120μs | 35.6541μs | 28.0472 KOps/s | 28.1902 KOps/s | |
test_values_stack_nested | 81.3420μs | 41.7067μs | 23.9770 KOps/s | 24.0197 KOps/s | |
test_values_stack_nested_leaf | 69.7720μs | 36.0092μs | 27.7706 KOps/s | 28.0227 KOps/s | |
test_values_stack_nested_locked | 70.9220μs | 43.7337μs | 22.8657 KOps/s | 22.8370 KOps/s | |
test_membership | 1.7301μs | 0.5056μs | 1.9778 MOps/s | 1.9776 MOps/s | |
test_membership_nested | 26.2505μs | 1.8986μs | 526.7080 KOps/s | 510.7386 KOps/s | |
test_membership_nested_leaf | 15.1855μs | 1.8988μs | 526.6562 KOps/s | 526.8074 KOps/s | |
test_membership_stacked_nested | 35.8100μs | 1.9418μs | 514.9810 KOps/s | 506.3305 KOps/s | |
test_membership_stacked_nested_leaf | 25.7000μs | 1.9479μs | 513.3830 KOps/s | 499.2538 KOps/s | |
test_membership_nested_last | 34.6100μs | 2.7966μs | 357.5742 KOps/s | 347.4701 KOps/s | |
test_membership_nested_leaf_last | 28.2800μs | 2.8251μs | 353.9746 KOps/s | 346.2352 KOps/s | |
test_membership_stacked_nested_last | 35.8110μs | 3.1286μs | 319.6294 KOps/s | 125.3498 KOps/s | |
test_membership_stacked_nested_leaf_last | 29.3310μs | 3.1006μs | 322.5213 KOps/s | 125.6149 KOps/s | |
test_nested_getleaf | 30.6910μs | 6.1105μs | 163.6516 KOps/s | 161.6791 KOps/s | |
test_nested_get | 37.8510μs | 5.7539μs | 173.7942 KOps/s | 170.7503 KOps/s | |
test_stacked_getleaf | 38.3810μs | 6.0326μs | 165.7649 KOps/s | 162.7685 KOps/s | |
test_stacked_get | 31.0600μs | 5.6708μs | 176.3423 KOps/s | 173.0144 KOps/s | |
test_nested_getitemleaf | 32.4710μs | 6.1837μs | 161.7167 KOps/s | 161.2354 KOps/s | |
test_nested_getitem | 48.7210μs | 5.8339μs | 171.4114 KOps/s | 172.4260 KOps/s | |
test_stacked_getitemleaf | 36.9910μs | 6.1487μs | 162.6371 KOps/s | 161.6949 KOps/s | |
test_stacked_getitem | 34.8310μs | 5.8331μs | 171.4343 KOps/s | 171.8948 KOps/s | |
test_lock_nested | 2.8911ms | 0.4234ms | 2.3617 KOps/s | 2.3629 KOps/s | |
test_lock_stack_nested | 0.4244ms | 0.3870ms | 2.5839 KOps/s | 2.6603 KOps/s | |
test_unlock_nested | 0.7317ms | 0.3610ms | 2.7701 KOps/s | 2.7705 KOps/s | |
test_unlock_stack_nested | 0.3915ms | 0.3268ms | 3.0603 KOps/s | 3.1721 KOps/s | |
test_flatten_speed | 0.1067ms | 68.9094μs | 14.5118 KOps/s | 14.5013 KOps/s | |
test_unflatten_speed | 0.3637ms | 0.2845ms | 3.5155 KOps/s | 3.4684 KOps/s | |
test_common_ops | 1.5132ms | 1.2450ms | 803.2147 Ops/s | 752.7426 Ops/s | |
test_creation | 32.3610μs | 1.4901μs | 671.0760 KOps/s | 659.8788 KOps/s | |
test_creation_empty | 48.4710μs | 15.3696μs | 65.0635 KOps/s | 55.7523 KOps/s | |
test_creation_nested_1 | 49.0810μs | 17.2461μs | 57.9842 KOps/s | 47.9954 KOps/s | |
test_creation_nested_2 | 49.2110μs | 19.1728μs | 52.1573 KOps/s | 39.8790 KOps/s | |
test_clone | 57.3610μs | 32.0787μs | 31.1734 KOps/s | 30.2162 KOps/s | |
test_getitem[int] | 91.9901ms | 23.8026μs | 42.0122 KOps/s | 60.1065 KOps/s | |
test_getitem[slice_int] | 0.1179ms | 28.4842μs | 35.1072 KOps/s | 34.4521 KOps/s | |
test_getitem[range] | 0.2232ms | 0.1113ms | 8.9885 KOps/s | 8.8639 KOps/s | |
test_getitem[tuple] | 0.1166ms | 26.6948μs | 37.4605 KOps/s | 40.0085 KOps/s | |
test_getitem[list] | 0.2029ms | 0.1095ms | 9.1347 KOps/s | 9.2195 KOps/s | |
test_setitem_dim[int] | 75.2820μs | 50.5030μs | 19.8008 KOps/s | 20.1393 KOps/s | |
test_setitem_dim[slice_int] | 0.1154ms | 69.8389μs | 14.3187 KOps/s | 14.5958 KOps/s | |
test_setitem_dim[range] | 0.1779ms | 0.1384ms | 7.2273 KOps/s | 7.6572 KOps/s | |
test_setitem_dim[tuple] | 93.9320μs | 67.3519μs | 14.8474 KOps/s | 16.1380 KOps/s | |
test_setitem | 89.3420μs | 46.0264μs | 21.7267 KOps/s | 22.3079 KOps/s | |
test_set | 82.9720μs | 44.7596μs | 22.3416 KOps/s | 22.8638 KOps/s | |
test_set_shared | 0.3510ms | 53.2729μs | 18.7713 KOps/s | 19.0124 KOps/s | |
test_update | 99.1320μs | 52.7742μs | 18.9486 KOps/s | 18.3886 KOps/s | |
test_update_nested | 0.1070ms | 59.7326μs | 16.7413 KOps/s | 16.4476 KOps/s | |
test_update__nested | 0.1023ms | 66.4040μs | 15.0593 KOps/s | 16.1972 KOps/s | |
test_set_nested | 86.7120μs | 47.5963μs | 21.0100 KOps/s | 21.2739 KOps/s | |
test_set_nested_new | 89.6310μs | 50.8306μs | 19.6732 KOps/s | 19.8309 KOps/s | |
test_select | 0.1035ms | 64.0016μs | 15.6246 KOps/s | 15.6018 KOps/s | |
test_select_nested | 0.5456ms | 42.0982μs | 23.7540 KOps/s | 23.2285 KOps/s | |
test_exclude_nested | 92.8610μs | 59.2438μs | 16.8794 KOps/s | 16.9899 KOps/s | |
test_empty[True] | 0.2966ms | 0.2432ms | 4.1114 KOps/s | 4.1149 KOps/s | |
test_empty[False] | 4.0381μs | 0.7442μs | 1.3438 MOps/s | 1.3093 MOps/s | |
test_to | 45.1310μs | 24.9767μs | 40.0372 KOps/s | 39.0716 KOps/s | |
test_to_nonblocking | 59.7420μs | 24.5954μs | 40.6581 KOps/s | 40.7419 KOps/s | |
test_unbind_speed | 0.3519ms | 0.2869ms | 3.4860 KOps/s | 3.5220 KOps/s | |
test_unbind_speed_stack0 | 0.4128ms | 0.2831ms | 3.5320 KOps/s | 3.6400 KOps/s | |
test_unbind_speed_stack1 | 91.1690ms | 0.7208ms | 1.3873 KOps/s | 1.4236 KOps/s | |
test_split | 93.9870ms | 2.1384ms | 467.6455 Ops/s | 437.5686 Ops/s | |
test_chunk | 92.9697ms | 2.1449ms | 466.2316 Ops/s | 434.3583 Ops/s | |
test_creation[device0] | 0.2902ms | 0.1263ms | 7.9188 KOps/s | 7.7954 KOps/s | |
test_creation_from_tensor | 0.4374ms | 0.1352ms | 7.3972 KOps/s | 7.7041 KOps/s | |
test_add_one[memmap_tensor0] | 0.1497ms | 9.4578μs | 105.7333 KOps/s | 111.4941 KOps/s | |
test_contiguous[memmap_tensor0] | 31.0300μs | 2.2037μs | 453.7833 KOps/s | 447.0314 KOps/s | |
test_stack[memmap_tensor0] | 36.4600μs | 6.8674μs | 145.6146 KOps/s | 146.3735 KOps/s | |
test_memmaptd_index | 1.0938ms | 0.4512ms | 2.2165 KOps/s | 2.3118 KOps/s | |
test_memmaptd_index_astensor | 1.0298ms | 0.5070ms | 1.9723 KOps/s | 2.0490 KOps/s | |
test_memmaptd_index_op | 1.4324ms | 1.0494ms | 952.9142 Ops/s | 919.6501 Ops/s | |
test_serialize_model | 0.1305s | 0.1293s | 7.7332 Ops/s | 7.7371 Ops/s | |
test_serialize_model_pickle | 1.3506s | 1.2109s | 0.8258 Ops/s | 0.8244 Ops/s | |
test_serialize_weights | 0.2225s | 0.1420s | 7.0445 Ops/s | 7.7855 Ops/s | |
test_serialize_weights_returnearly | 0.2242s | 56.0300ms | 17.8476 Ops/s | 16.2926 Ops/s | |
test_serialize_weights_pickle | 1.3481s | 1.2128s | 0.8245 Ops/s | 0.8140 Ops/s | |
test_reshape_pytree | 70.3320μs | 37.5538μs | 26.6284 KOps/s | 26.9185 KOps/s | |
test_reshape_td | 90.0910μs | 47.3919μs | 21.1006 KOps/s | 23.4886 KOps/s | |
test_view_pytree | 75.9110μs | 38.1078μs | 26.2413 KOps/s | 27.8739 KOps/s | |
test_view_td | 0.1117ms | 51.3127μs | 19.4883 KOps/s | 21.5382 KOps/s | |
test_unbind_pytree | 77.4810μs | 37.2413μs | 26.8519 KOps/s | 28.3175 KOps/s | |
test_unbind_td | 0.4452ms | 47.4698μs | 21.0660 KOps/s | 23.3636 KOps/s | |
test_split_pytree | 94.7620μs | 50.4765μs | 19.8112 KOps/s | 20.9131 KOps/s | |
test_split_td | 0.6540ms | 58.6907μs | 17.0385 KOps/s | 16.6761 KOps/s | |
test_add_pytree | 92.6520μs | 58.1044μs | 17.2104 KOps/s | 17.2155 KOps/s | |
test_add_td | 0.1274ms | 91.1096μs | 10.9758 KOps/s | 10.3726 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4155ms | 0.2112ms | 4.7348 KOps/s | 4.6816 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5647ms | 0.1536ms | 6.5103 KOps/s | 6.5907 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2210ms | 0.1477ms | 6.7719 KOps/s | 6.8103 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2381ms | 0.1863ms | 5.3683 KOps/s | 5.2796 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 93.2620μs | 23.2055μs | 43.0932 KOps/s | 43.8754 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 90.3320μs | 44.3302μs | 22.5580 KOps/s | 22.8815 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2386ms | 64.7538μs | 15.4431 KOps/s | 15.9162 KOps/s | |
test_compile_copy_nested[pytree-eager] | 82.7010μs | 49.6292μs | 20.1494 KOps/s | 20.6142 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3955ms | 0.3213ms | 3.1123 KOps/s | 3.1122 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2455ms | 0.2107ms | 4.7471 KOps/s | 4.5826 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1719ms | 0.1300ms | 7.6894 KOps/s | 7.2442 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1052ms | 60.8392μs | 16.4368 KOps/s | 15.5141 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3918ms | 0.3201ms | 3.1244 KOps/s | 3.0390 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6941ms | 0.6318ms | 1.5827 KOps/s | 1.4854 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.2997ms | 0.2495ms | 4.0075 KOps/s | 3.9126 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3715ms | 0.3211ms | 3.1139 KOps/s | 3.0796 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1141ms | 71.2359μs | 14.0379 KOps/s | 13.4230 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1803ms | 0.1306ms | 7.6594 KOps/s | 7.3381 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6313ms | 0.5390ms | 1.8554 KOps/s | 1.7874 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3842ms | 0.3199ms | 3.1257 KOps/s | 3.0755 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 93.4320μs | 18.7330μs | 53.3818 KOps/s | 54.2986 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 61.2910μs | 27.2510μs | 36.6960 KOps/s | 36.9990 KOps/s | |
test_compile_copy_flat[pytree-compile] | 98.4620μs | 70.1178μs | 14.2617 KOps/s | 14.3461 KOps/s | |
test_compile_copy_flat[pytree-eager] | 84.5910μs | 51.9463μs | 19.2506 KOps/s | 19.5800 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3316ms | 0.8141ms | 1.2283 KOps/s | 1.1207 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.2992ms | 3.2029ms | 312.2187 Ops/s | 303.7670 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3213ms | 0.8138ms | 1.2287 KOps/s | 1.1435 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6216ms | 3.2671ms | 306.0813 Ops/s | 298.3852 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1714ms | 0.1098ms | 9.1058 KOps/s | 8.8697 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1954ms | 61.1946μs | 16.3413 KOps/s | 15.6692 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1545ms | 0.1038ms | 9.6308 KOps/s | 9.3787 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1333ms | 43.5138μs | 22.9812 KOps/s | 22.1416 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1453ms | 0.1048ms | 9.5421 KOps/s | 9.4379 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 99.9420μs | 43.4591μs | 23.0101 KOps/s | 22.5792 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1951ms | 0.1389ms | 7.1972 KOps/s | 7.1616 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1681ms | 25.4911μs | 39.2294 KOps/s | 37.3137 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1852ms | 0.1316ms | 7.5982 KOps/s | 7.5063 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 62.7010μs | 20.8899μs | 47.8701 KOps/s | 46.3284 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1799ms | 0.1320ms | 7.5784 KOps/s | 7.4576 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 54.6410μs | 20.5870μs | 48.5744 KOps/s | 46.5170 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1975ms | 0.1399ms | 7.1468 KOps/s | 7.1379 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5000ms | 25.2978μs | 39.5291 KOps/s | 37.5352 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2025ms | 0.1340ms | 7.4652 KOps/s | 7.4437 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1566ms | 21.1098μs | 47.3715 KOps/s | 46.5165 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1933ms | 0.1324ms | 7.5524 KOps/s | 7.4749 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.7910μs | 20.5520μs | 48.6571 KOps/s | 46.7074 KOps/s | |
test_mod_add[eager] | 69.9210μs | 32.2254μs | 31.0315 KOps/s | 26.3846 KOps/s | |
test_mod_add[compile] | 0.3224ms | 69.2753μs | 14.4352 KOps/s | 13.3030 KOps/s | |
test_mod_add[compile-overhead] | 0.2670ms | 0.1379ms | 7.2510 KOps/s | 6.7062 KOps/s | |
test_mod_wrap[eager] | 0.3128ms | 0.2424ms | 4.1249 KOps/s | 3.7233 KOps/s | |
test_mod_wrap[compile] | 0.6757ms | 0.3145ms | 3.1797 KOps/s | 3.3097 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4878ms | 4.0333ms | 247.9364 Ops/s | 255.6675 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4928ms | 1.3631ms | 733.6102 Ops/s | 682.3424 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.9196ms | 1.3346ms | 749.3099 Ops/s | 687.4703 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3269ms | 0.9188ms | 1.0884 KOps/s | 947.9812 Ops/s | |
test_seq_add[eager] | 0.1843ms | 96.9258μs | 10.3172 KOps/s | 9.5984 KOps/s | |
test_seq_add[compile] | 0.3825ms | 82.0399μs | 12.1892 KOps/s | 11.7723 KOps/s | |
test_seq_add[compile-overhead] | 0.1574ms | 0.1161ms | 8.6144 KOps/s | 8.4949 KOps/s | |
test_seq_wrap[eager] | 0.4588ms | 0.3751ms | 2.6656 KOps/s | 2.5066 KOps/s | |
test_seq_wrap[compile] | 0.3848ms | 0.3164ms | 3.1604 KOps/s | 3.1203 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2693ms | 0.2224ms | 4.4967 KOps/s | 4.4714 KOps/s | |
test_func_call_runtime[False-eager] | 0.8312ms | 0.7449ms | 1.3424 KOps/s | 1.3265 KOps/s | |
test_func_call_runtime[False-compile] | 1.0270ms | 0.7882ms | 1.2687 KOps/s | 1.2329 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4618ms | 0.3610ms | 2.7698 KOps/s | 2.7582 KOps/s | |
test_func_call_runtime[True-eager] | 0.9502ms | 0.9063ms | 1.1034 KOps/s | 1.0843 KOps/s | |
test_func_call_runtime[True-compile] | 0.9209ms | 0.8222ms | 1.2162 KOps/s | 1.1866 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4916ms | 0.3946ms | 2.5341 KOps/s | 2.5184 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8044ms | 0.7443ms | 1.3436 KOps/s | 1.3201 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8349ms | 0.7901ms | 1.2657 KOps/s | 1.2261 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4159ms | 0.3646ms | 2.7429 KOps/s | 2.7345 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1230ms | 1.0052ms | 994.8116 Ops/s | 983.3634 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9298ms | 0.8497ms | 1.1769 KOps/s | 1.1483 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4930ms | 0.4205ms | 2.3780 KOps/s | 2.3785 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5329ms | 2.0707ms | 482.9309 Ops/s | 475.6803 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9931ms | 0.8665ms | 1.1540 KOps/s | 1.1264 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5282ms | 0.4264ms | 2.3450 KOps/s | 2.3413 KOps/s | |
test_distributed | 1.6980ms | 0.2082ms | 4.8020 KOps/s | 8.4706 KOps/s | |
test_tdmodule | 91.1310μs | 14.1986μs | 70.4297 KOps/s | 63.6017 KOps/s | |
test_tdmodule_dispatch | 65.4510μs | 28.8618μs | 34.6478 KOps/s | 32.2449 KOps/s | |
test_tdseq | 37.8310μs | 16.3840μs | 61.0351 KOps/s | 60.2729 KOps/s | |
test_tdseq_dispatch | 53.3310μs | 32.3656μs | 30.8970 KOps/s | 29.4375 KOps/s | |
test_instantiation_functorch | 2.0658ms | 1.9913ms | 502.1933 Ops/s | 525.8592 Ops/s | |
test_instantiation_td | 1.8205ms | 1.2194ms | 820.0807 Ops/s | 822.4717 Ops/s | |
test_exec_functorch | 0.2639ms | 0.2162ms | 4.6254 KOps/s | 4.7503 KOps/s | |
test_exec_functional_call | 0.2728ms | 0.2230ms | 4.4833 KOps/s | 4.4088 KOps/s | |
test_exec_td | 0.2773ms | 0.2273ms | 4.3993 KOps/s | 4.2538 KOps/s | |
test_exec_td_decorator | 0.8932ms | 0.2634ms | 3.7963 KOps/s | 3.6105 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7744ms | 0.7139ms | 1.4008 KOps/s | 1.4366 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8058ms | 0.7174ms | 1.3939 KOps/s | 1.3738 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7161ms | 0.6059ms | 1.6504 KOps/s | 1.6399 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6663ms | 0.6063ms | 1.6493 KOps/s | 1.6388 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3873ms | 0.6961ms | 1.4367 KOps/s | 1.4039 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8280ms | 0.6985ms | 1.4317 KOps/s | 1.4019 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7617ms | 0.6168ms | 1.6214 KOps/s | 1.6029 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7387ms | 0.6161ms | 1.6231 KOps/s | 1.5966 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.7707ms | 8.4171ms | 118.8052 Ops/s | 117.0264 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.7276ms | 8.3790ms | 119.3460 Ops/s | 116.5690 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.6083ms | 8.2623ms | 121.0311 Ops/s | 120.0125 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.6025ms | 8.2000ms | 121.9518 Ops/s | 120.8770 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.5133ms | 19.6859ms | 50.7978 Ops/s | 50.2205 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.3825ms | 19.6644ms | 50.8532 Ops/s | 50.4252 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.2846ms | 19.5933ms | 51.0379 Ops/s | 50.8731 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.9370ms | 19.5381ms | 51.1821 Ops/s | 50.6575 Ops/s | |
test_to_module_speed[True] | 1.3718ms | 0.9416ms | 1.0620 KOps/s | 1.0621 KOps/s | |
test_to_module_speed[False] | 1.3037ms | 0.9202ms | 1.0867 KOps/s | 1.0920 KOps/s | |
test_tc_init | 67.7810μs | 32.9802μs | 30.3212 KOps/s | 27.1742 KOps/s | |
test_tc_init_nested | 0.1154ms | 67.9423μs | 14.7184 KOps/s | 13.2055 KOps/s | |
test_tc_first_layer_tensor | 6.0530μs | 0.6759μs | 1.4796 MOps/s | 1.4843 MOps/s | |
test_tc_first_layer_nontensor | 26.8400μs | 2.2289μs | 448.6503 KOps/s | 450.5350 KOps/s | |
test_tc_second_layer_tensor | 8.6800μs | 1.3593μs | 735.6561 KOps/s | 743.7549 KOps/s | |
test_tc_second_layer_nontensor | 0.1044ms | 2.9515μs | 338.8154 KOps/s | 344.7502 KOps/s | |
test_unbind | 0.1901s | 11.9221ms | 83.8777 Ops/s | 91.1574 Ops/s | |
test_full_like | 0.6593ms | 0.5733ms | 1.7444 KOps/s | 1.7418 KOps/s | |
test_zeros_like | 0.2892ms | 0.1980ms | 5.0517 KOps/s | 5.0609 KOps/s | |
test_ones_like | 0.2357ms | 0.1977ms | 5.0569 KOps/s | 5.0677 KOps/s | |
test_clone | 0.4458ms | 0.4148ms | 2.4106 KOps/s | 2.4193 KOps/s | |
test_squeeze | 42.9110μs | 11.4064μs | 87.6704 KOps/s | 100.8936 KOps/s | |
test_unsqueeze | 0.2937ms | 74.4611μs | 13.4298 KOps/s | 12.6548 KOps/s | |
test_split | 0.2596ms | 0.1612ms | 6.2025 KOps/s | 6.0193 KOps/s | |
test_permute | 0.2336ms | 0.1900ms | 5.2634 KOps/s | 5.4249 KOps/s | |
test_stack | 1.2423ms | 0.8745ms | 1.1436 KOps/s | 1.1338 KOps/s | |
test_cat | 1.2625ms | 1.2314ms | 812.0681 Ops/s | 811.9570 Ops/s |
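
The "Change" column of the table above is empty, so the relative difference between the PR ("Ops") and "Ops on Repo HEAD" has to be worked out by hand. A minimal sketch of that calculation follows, assuming rows are pipe-separated exactly as shown and that throughput cells use only the `Ops/s`, `KOps/s` and `MOps/s` units. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
import re

# Multipliers for the throughput units that appear in the table.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '4.8020 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, fractional change of the PR relative to repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    # Column order: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    name, ops_pr, ops_head = cells[0], cells[3], cells[4]
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return name, (pr - head) / head


# Three rows copied from the table above.
rows = [
    "test_distributed | 1.6980ms | 0.2082ms | 4.8020 KOps/s | 8.4706 KOps/s | |",
    "test_squeeze | 42.9110μs | 11.4064μs | 87.6704 KOps/s | 100.8936 KOps/s | |",
    "test_mod_add[eager] | 69.9210μs | 32.2254μs | 31.0315 KOps/s | 26.3846 KOps/s | |",
]
for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Normalising to operations per second before dividing matters for rows where the two columns use different units, such as `test_mod_wrap_and_backward[compile-overhead]` (1.0884 KOps/s against 947.9812 Ops/s). A single-run difference of a few percent may well be noise, so the result is better read as a pointer to rows worth re-running than as a verdict.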
vmoens added a commit that referenced this pull request on Sep 17, 2024:
ghstack-source-id: 2c6c43c5b34be73572d1d1a8da009585e8876bc5
Pull Request resolved: #992
Labels: CLA Signed
Stack from ghstack (oldest at bottom):
- `selected_out_keys` arg in TDS constructor #993
- `inplace` arg in TDM constructor #992