[BugFix] Consolidate lazy stacks of non-tensors [EDIT] #1224
Merged
vmoens added a commit that referenced this pull request on Feb 20, 2025:
ghstack-source-id: d3d822dba235b74128f99e6cbff08989d13c1af4
Pull Request resolved: #1224
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 82.4630μs | 21.0865μs | 47.4236 KOps/s | 47.2165 KOps/s | |
test_plain_set_stack_nested | 45.7460μs | 21.2103μs | 47.1470 KOps/s | 46.5496 KOps/s | |
test_plain_set_nested_inplace | 60.5530μs | 22.7574μs | 43.9418 KOps/s | 42.7514 KOps/s | |
test_plain_set_stack_nested_inplace | 90.4680μs | 22.7549μs | 43.9467 KOps/s | 42.5656 KOps/s | |
test_items | 63.6990μs | 4.2981μs | 232.6621 KOps/s | 239.0922 KOps/s | |
test_items_nested | 0.6487ms | 0.4181ms | 2.3917 KOps/s | 2.4323 KOps/s | |
test_items_nested_locked | 0.5034ms | 0.3982ms | 2.5115 KOps/s | 2.4411 KOps/s | |
test_items_nested_leaf | 0.1316ms | 77.7667μs | 12.8590 KOps/s | 12.9141 KOps/s | |
test_items_stack_nested | 0.8318ms | 0.4017ms | 2.4893 KOps/s | 2.4025 KOps/s | |
test_items_stack_nested_leaf | 0.1793ms | 77.2070μs | 12.9522 KOps/s | 12.6480 KOps/s | |
test_items_stack_nested_locked | 0.5886ms | 0.4012ms | 2.4925 KOps/s | 2.4185 KOps/s | |
test_keys | 23.4340μs | 3.6231μs | 276.0073 KOps/s | 281.7361 KOps/s | |
test_keys_nested | 0.2607ms | 0.1642ms | 6.0885 KOps/s | 6.0467 KOps/s | |
test_keys_nested_locked | 0.7230ms | 0.1701ms | 5.8799 KOps/s | 5.8635 KOps/s | |
test_keys_nested_leaf | 0.2363ms | 0.1428ms | 7.0006 KOps/s | 6.9104 KOps/s | |
test_keys_stack_nested | 0.2537ms | 0.1630ms | 6.1367 KOps/s | 6.0809 KOps/s | |
test_keys_stack_nested_leaf | 0.2248ms | 0.1431ms | 6.9857 KOps/s | 6.9415 KOps/s | |
test_keys_stack_nested_locked | 0.2683ms | 0.1681ms | 5.9486 KOps/s | 5.9060 KOps/s | |
test_values | 8.6340μs | 1.0539μs | 948.8470 KOps/s | 931.9904 KOps/s | |
test_values_nested | 0.1095ms | 61.4310μs | 16.2784 KOps/s | 16.0460 KOps/s | |
test_values_nested_locked | 0.1268ms | 61.6657μs | 16.2165 KOps/s | 16.1329 KOps/s | |
test_values_nested_leaf | 0.1453ms | 70.9685μs | 14.0908 KOps/s | 13.9352 KOps/s | |
test_values_stack_nested | 0.1130ms | 62.0003μs | 16.1290 KOps/s | 15.4073 KOps/s | |
test_values_stack_nested_leaf | 0.1332ms | 70.9605μs | 14.0924 KOps/s | 13.8987 KOps/s | |
test_values_stack_nested_locked | 0.1941ms | 62.7352μs | 15.9400 KOps/s | 15.8732 KOps/s | |
test_membership | 83.1650μs | 0.8880μs | 1.1261 MOps/s | 1.0928 MOps/s | |
test_membership_nested | 19.0050μs | 2.8596μs | 349.7012 KOps/s | 337.3472 KOps/s | |
test_membership_nested_leaf | 24.0750μs | 2.8800μs | 347.2175 KOps/s | 329.7488 KOps/s | |
test_membership_stacked_nested | 22.7920μs | 2.8786μs | 347.3971 KOps/s | 332.5313 KOps/s | |
test_membership_stacked_nested_leaf | 45.8350μs | 2.8799μs | 347.2330 KOps/s | 339.8966 KOps/s | |
test_membership_nested_last | 22.9830μs | 4.2681μs | 234.2990 KOps/s | 225.1568 KOps/s | |
test_membership_nested_leaf_last | 50.6350μs | 4.2918μs | 233.0033 KOps/s | 223.5508 KOps/s | |
test_membership_stacked_nested_last | 25.2270μs | 4.2539μs | 235.0791 KOps/s | 176.9916 KOps/s | |
test_membership_stacked_nested_leaf_last | 47.2080μs | 4.2997μs | 232.5733 KOps/s | 174.1117 KOps/s | |
test_nested_getleaf | 77.1840μs | 10.5409μs | 94.8684 KOps/s | 92.1813 KOps/s | |
test_nested_get | 29.5750μs | 10.0552μs | 99.4508 KOps/s | 97.6473 KOps/s | |
test_stacked_getleaf | 58.2380μs | 10.3622μs | 96.5048 KOps/s | 92.8452 KOps/s | |
test_stacked_get | 53.5000μs | 9.9363μs | 100.6410 KOps/s | 98.3182 KOps/s | |
test_nested_getitemleaf | 33.5030μs | 11.0905μs | 90.1671 KOps/s | 87.3221 KOps/s | |
test_nested_getitem | 49.3910μs | 10.5163μs | 95.0907 KOps/s | 91.0306 KOps/s | |
test_stacked_getitemleaf | 46.5170μs | 11.0967μs | 90.1166 KOps/s | 88.2706 KOps/s | |
test_stacked_getitem | 47.1780μs | 10.4019μs | 96.1365 KOps/s | 93.9209 KOps/s | |
test_lock_nested | 7.3809ms | 0.4163ms | 2.4020 KOps/s | 2.4156 KOps/s | |
test_lock_stack_nested | 0.5416ms | 0.4229ms | 2.3644 KOps/s | 2.3587 KOps/s | |
test_unlock_nested | 0.7279ms | 0.3395ms | 2.9456 KOps/s | 2.9867 KOps/s | |
test_unlock_stack_nested | 0.5199ms | 0.3421ms | 2.9235 KOps/s | 2.9190 KOps/s | |
test_flatten_speed | 0.2089ms | 0.1014ms | 9.8663 KOps/s | 9.8595 KOps/s | |
test_unflatten_speed | 0.9781ms | 0.5211ms | 1.9189 KOps/s | 1.8846 KOps/s | |
test_common_ops | 1.0447ms | 0.8335ms | 1.1998 KOps/s | 1.2173 KOps/s | |
test_creation | 32.6610μs | 2.6163μs | 382.2257 KOps/s | 394.5776 KOps/s | |
test_creation_empty | 36.7990μs | 12.6643μs | 78.9622 KOps/s | 77.5212 KOps/s | |
test_creation_nested_1 | 78.7770μs | 15.3764μs | 65.0349 KOps/s | 63.2115 KOps/s | |
test_creation_nested_2 | 74.3690μs | 20.4473μs | 48.9063 KOps/s | 48.8544 KOps/s | |
test_clone | 67.3480μs | 13.3371μs | 74.9785 KOps/s | 72.8953 KOps/s | |
test_getitem[int] | 0.9629ms | 12.5038μs | 79.9759 KOps/s | 77.0490 KOps/s | |
test_getitem[slice_int] | 0.1616ms | 23.7087μs | 42.1785 KOps/s | 41.0078 KOps/s | |
test_getitem[range] | 0.1698ms | 51.0256μs | 19.5980 KOps/s | 19.4746 KOps/s | |
test_getitem[tuple] | 0.1651ms | 20.0937μs | 49.7669 KOps/s | 48.7459 KOps/s | |
test_getitem[list] | 0.1663ms | 44.9448μs | 22.2495 KOps/s | 21.3282 KOps/s | |
test_setitem_dim[int] | 47.0580μs | 25.6688μs | 38.9578 KOps/s | 38.6969 KOps/s | |
test_setitem_dim[slice_int] | 90.7990μs | 50.9202μs | 19.6386 KOps/s | 19.1944 KOps/s | |
test_setitem_dim[range] | 0.1689ms | 77.4101μs | 12.9182 KOps/s | 13.0314 KOps/s | |
test_setitem_dim[tuple] | 84.8480μs | 40.7397μs | 24.5461 KOps/s | 24.0542 KOps/s | |
test_setitem | 80.0600μs | 20.8298μs | 48.0081 KOps/s | 46.3571 KOps/s | |
test_set | 75.7010μs | 20.2127μs | 49.4739 KOps/s | 47.6689 KOps/s | |
test_set_shared | 3.5387ms | 0.1829ms | 5.4680 KOps/s | 5.4974 KOps/s | |
test_update | 0.1663ms | 23.5092μs | 42.5365 KOps/s | 40.9509 KOps/s | |
test_update_nested | 0.1144ms | 34.0774μs | 29.3449 KOps/s | 28.4413 KOps/s | |
test_update__nested | 0.4722ms | 34.0385μs | 29.3785 KOps/s | 29.4282 KOps/s | |
test_set_nested | 75.6410μs | 22.4759μs | 44.4922 KOps/s | 43.3813 KOps/s | |
test_set_nested_new | 68.1670μs | 27.2126μs | 36.7477 KOps/s | 36.3067 KOps/s | |
test_select | 0.1230ms | 43.6241μs | 22.9231 KOps/s | 22.1864 KOps/s | |
test_select_nested | 0.1165ms | 63.3639μs | 15.7819 KOps/s | 15.8221 KOps/s | |
test_exclude_nested | 0.1516ms | 81.1762μs | 12.3189 KOps/s | 12.3289 KOps/s | |
test_empty[True] | 0.6969ms | 0.4048ms | 2.4706 KOps/s | 2.4313 KOps/s | |
test_empty[False] | 11.6040μs | 1.3546μs | 738.2208 KOps/s | 736.8984 KOps/s | |
test_unbind_speed | 0.5860ms | 0.2685ms | 3.7247 KOps/s | 3.6497 KOps/s | |
test_unbind_speed_stack0 | 0.4651ms | 0.2667ms | 3.7497 KOps/s | 3.6921 KOps/s | |
test_unbind_speed_stack1 | 0.7457ms | 0.6617ms | 1.5114 KOps/s | 1.2302 KOps/s | |
test_split | 0.1043s | 1.8935ms | 528.1172 Ops/s | 569.7388 Ops/s | |
test_chunk | 2.5084ms | 1.5705ms | 636.7429 Ops/s | 630.8728 Ops/s | |
test_consolidate_njt[False-None] | 0.1082s | 9.0311ms | 110.7281 Ops/s | 108.2703 Ops/s | |
test_creation[device0] | 4.3265ms | 93.5931μs | 10.6846 KOps/s | 10.8415 KOps/s | |
test_creation_from_tensor | 0.2459ms | 95.0217μs | 10.5239 KOps/s | 10.4794 KOps/s | |
test_add_one[memmap_tensor0] | 93.3540μs | 5.0357μs | 198.5817 KOps/s | 185.0536 KOps/s | |
test_contiguous[memmap_tensor0] | 10.6600μs | 0.5070μs | 1.9722 MOps/s | 1.9067 MOps/s | |
test_stack[memmap_tensor0] | 23.4030μs | 3.3981μs | 294.2843 KOps/s | 275.7159 KOps/s | |
test_memmaptd_index | 1.1629ms | 0.2282ms | 4.3814 KOps/s | 4.2088 KOps/s | |
test_memmaptd_index_astensor | 0.6253ms | 0.3184ms | 3.1411 KOps/s | 3.0303 KOps/s | |
test_memmaptd_index_op | 1.1309ms | 0.6056ms | 1.6512 KOps/s | 1.5857 KOps/s | |
test_serialize_model | 0.1214s | 0.1143s | 8.7499 Ops/s | 8.5528 Ops/s | |
test_serialize_model_pickle | 0.5032s | 0.4054s | 2.4664 Ops/s | 2.4931 Ops/s | |
test_serialize_weights | 0.1283s | 0.1139s | 8.7810 Ops/s | 8.6161 Ops/s | |
test_serialize_weights_returnearly | 0.1782s | 0.1619s | 6.1782 Ops/s | 6.4577 Ops/s | |
test_serialize_weights_pickle | 0.4992s | 0.4230s | 2.3638 Ops/s | 2.2941 Ops/s | |
test_serialize_weights_filesystem | 0.1513s | 0.1426s | 7.0131 Ops/s | 7.0826 Ops/s | |
test_serialize_model_filesystem | 0.1544s | 0.1442s | 6.9346 Ops/s | 6.4716 Ops/s | |
test_reshape_pytree | 65.9630μs | 26.0539μs | 38.3819 KOps/s | 37.9071 KOps/s | |
test_reshape_td | 74.0080μs | 32.6844μs | 30.5957 KOps/s | 30.0414 KOps/s | |
test_view_pytree | 58.6900μs | 25.9476μs | 38.5392 KOps/s | 38.2133 KOps/s | |
test_view_td | 82.3540μs | 40.0559μs | 24.9651 KOps/s | 24.7749 KOps/s | |
test_unbind_pytree | 67.4560μs | 29.0648μs | 34.4059 KOps/s | 33.6720 KOps/s | |
test_unbind_td | 0.3800ms | 39.6129μs | 25.2443 KOps/s | 24.6740 KOps/s | |
test_split_pytree | 67.8960μs | 28.7404μs | 34.7943 KOps/s | 34.0421 KOps/s | |
test_split_td | 0.5025ms | 45.3800μs | 22.0361 KOps/s | 21.9141 KOps/s | |
test_add_pytree | 96.5800μs | 35.8438μs | 27.8989 KOps/s | 26.8287 KOps/s | |
test_add_td | 0.1593ms | 58.2898μs | 17.1557 KOps/s | 16.2786 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1305ms | 66.2430μs | 15.0959 KOps/s | 14.9286 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5280ms | 0.1710ms | 5.8475 KOps/s | 5.7295 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1045ms | 45.6390μs | 21.9111 KOps/s | 21.6286 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2190ms | 0.1184ms | 8.4429 KOps/s | 8.0804 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 65.2310μs | 28.3307μs | 35.2974 KOps/s | 35.9903 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1191ms | 58.6565μs | 17.0484 KOps/s | 16.9660 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1512ms | 79.0822μs | 12.6451 KOps/s | 12.4669 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1514ms | 66.8387μs | 14.9614 KOps/s | 14.8288 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1934ms | 0.1075ms | 9.3063 KOps/s | 9.3162 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4327ms | 0.2171ms | 4.6060 KOps/s | 4.5933 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1828ms | 47.5467μs | 21.0320 KOps/s | 21.2629 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1556ms | 68.3639μs | 14.6276 KOps/s | 14.5390 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1766ms | 0.1020ms | 9.8062 KOps/s | 9.9329 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3399ms | 0.2042ms | 4.8968 KOps/s | 4.8189 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4098ms | 0.2331ms | 4.2901 KOps/s | 4.2467 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2370ms | 0.1118ms | 8.9432 KOps/s | 9.1892 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3611ms | 63.4030μs | 15.7721 KOps/s | 15.8970 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.3409ms | 49.0053μs | 20.4060 KOps/s | 19.7038 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3684ms | 0.1642ms | 6.0910 KOps/s | 6.2098 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1752ms | 0.1008ms | 9.9214 KOps/s | 9.8986 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 60.9230μs | 20.8340μs | 47.9985 KOps/s | 47.3309 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1494ms | 68.5625μs | 14.5852 KOps/s | 15.0306 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2323ms | 83.4707μs | 11.9803 KOps/s | 12.2345 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1160ms | 68.0617μs | 14.6925 KOps/s | 14.8208 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.5718ms | 0.2180ms | 4.5876 KOps/s | 4.7023 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.6185ms | 1.3687ms | 730.6345 Ops/s | 722.0444 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3123ms | 0.2112ms | 4.7343 KOps/s | 4.7446 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.7398ms | 0.8329ms | 1.2007 KOps/s | 1.1942 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8237ms | 0.4626ms | 2.1617 KOps/s | 2.1920 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.9123ms | 2.6960ms | 370.9247 Ops/s | 354.0799 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 90.6690μs | 38.9448μs | 25.6773 KOps/s | 25.9003 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5940ms | 33.1436μs | 30.1717 KOps/s | 29.2257 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1352ms | 31.6795μs | 31.5661 KOps/s | 31.3385 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.5797ms | 23.0931μs | 43.3029 KOps/s | 40.8484 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 92.7430μs | 31.7059μs | 31.5399 KOps/s | 30.5264 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 64.7300μs | 22.9197μs | 43.6307 KOps/s | 41.9134 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1366ms | 52.7919μs | 18.9423 KOps/s | 19.0129 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3838ms | 19.2303μs | 52.0013 KOps/s | 49.3713 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1221ms | 46.1500μs | 21.6685 KOps/s | 21.2749 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 50.6150μs | 18.5929μs | 53.7839 KOps/s | 53.6053 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 98.9540μs | 46.7660μs | 21.3830 KOps/s | 21.2672 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 65.3720μs | 18.7618μs | 53.2998 KOps/s | 53.6329 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1152ms | 54.0010μs | 18.5182 KOps/s | 18.3499 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9827ms | 19.1544μs | 52.2072 KOps/s | 49.4436 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1124ms | 46.1708μs | 21.6587 KOps/s | 21.3371 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1287ms | 19.0606μs | 52.4643 KOps/s | 53.5299 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1119ms | 46.2624μs | 21.6158 KOps/s | 21.1581 KOps/s | |
test_compile_indexing[int-pytree-eager] | 73.9480μs | 18.5569μs | 53.8882 KOps/s | 53.5920 KOps/s | |
test_mod_add[eager] | 93.8240μs | 35.6360μs | 28.0615 KOps/s | 27.1542 KOps/s | |
test_mod_add[compile] | 0.1101ms | 63.8104μs | 15.6714 KOps/s | 15.5863 KOps/s | |
test_mod_add[compile-overhead] | 0.2042ms | 64.0806μs | 15.6053 KOps/s | 15.5402 KOps/s | |
test_mod_wrap[eager] | 0.9339ms | 0.2254ms | 4.4365 KOps/s | 4.4596 KOps/s | |
test_mod_wrap[compile] | 1.9018ms | 0.2299ms | 4.3492 KOps/s | 4.3505 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4242ms | 0.2264ms | 4.4177 KOps/s | 4.3822 KOps/s | |
test_mod_wrap_and_backward[eager] | 15.9641ms | 13.4876ms | 74.1421 Ops/s | 72.0827 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.5857ms | 11.7592ms | 85.0397 Ops/s | 85.3565 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 16.7307ms | 11.9013ms | 84.0245 Ops/s | 87.4527 Ops/s | |
test_seq_add[eager] | 0.1880ms | 0.1154ms | 8.6621 KOps/s | 8.2400 KOps/s | |
test_seq_add[compile] | 0.1398ms | 75.7865μs | 13.1950 KOps/s | 12.8365 KOps/s | |
test_seq_add[compile-overhead] | 0.1384ms | 74.9989μs | 13.3335 KOps/s | 13.3191 KOps/s | |
test_seq_wrap[eager] | 0.5786ms | 0.4380ms | 2.2831 KOps/s | 2.1509 KOps/s | |
test_seq_wrap[compile] | 0.4534ms | 0.2445ms | 4.0896 KOps/s | 4.0100 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3412ms | 0.2427ms | 4.1197 KOps/s | 4.0278 KOps/s | |
test_func_call_runtime[False-eager] | 0.7223ms | 0.5300ms | 1.8866 KOps/s | 1.8483 KOps/s | |
test_func_call_runtime[False-compile] | 0.5586ms | 0.4438ms | 2.2534 KOps/s | 2.2153 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5366ms | 0.4429ms | 2.2578 KOps/s | 2.2218 KOps/s | |
test_func_call_runtime[True-eager] | 1.1114ms | 0.7417ms | 1.3483 KOps/s | 1.3263 KOps/s | |
test_func_call_runtime[True-compile] | 0.5717ms | 0.4624ms | 2.1627 KOps/s | 2.1236 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6886ms | 0.4639ms | 2.1557 KOps/s | 2.1266 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9641ms | 0.5306ms | 1.8845 KOps/s | 1.8505 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6201ms | 0.4385ms | 2.2805 KOps/s | 2.2049 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6220ms | 0.4407ms | 2.2692 KOps/s | 2.2187 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0202ms | 0.8791ms | 1.1375 KOps/s | 1.1095 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1585ms | 0.7833ms | 1.2767 KOps/s | 1.2448 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.3268ms | 0.7936ms | 1.2601 KOps/s | 1.2324 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.9205ms | 1.9193ms | 521.0315 Ops/s | 517.6806 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8637ms | 0.5537ms | 1.8061 KOps/s | 1.8355 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7810ms | 0.5387ms | 1.8563 KOps/s | 1.8335 KOps/s | |
test_distributed | 0.2775ms | 0.1279ms | 7.8198 KOps/s | 7.8115 KOps/s | |
test_tdmodule | 46.8070μs | 27.0599μs | 36.9551 KOps/s | 36.6310 KOps/s | |
test_tdmodule_dispatch | 0.1148ms | 51.4228μs | 19.4466 KOps/s | 19.7018 KOps/s | |
test_tdseq | 48.8610μs | 29.5049μs | 33.8927 KOps/s | 32.7639 KOps/s | |
test_tdseq_dispatch | 0.1157ms | 55.6928μs | 17.9556 KOps/s | 18.1166 KOps/s | |
test_instantiation_functorch | 2.0360ms | 1.5378ms | 650.2841 Ops/s | 639.8023 Ops/s | |
test_exec_functorch | 0.3884ms | 0.1766ms | 5.6611 KOps/s | 5.6186 KOps/s | |
test_exec_functional_call | 0.3989ms | 0.1713ms | 5.8387 KOps/s | 5.8353 KOps/s | |
test_exec_td_decorator | 0.5217ms | 0.2331ms | 4.2900 KOps/s | 4.2743 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9678ms | 0.6575ms | 1.5208 KOps/s | 1.4939 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8806ms | 0.6531ms | 1.5311 KOps/s | 1.5001 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8592ms | 0.5301ms | 1.8866 KOps/s | 1.8675 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8765ms | 0.5334ms | 1.8748 KOps/s | 1.8564 KOps/s | |
test_to_module_speed[True] | 1.9705ms | 1.3308ms | 751.4401 Ops/s | 753.2216 Ops/s | |
test_to_module_speed[False] | 2.4274ms | 1.3214ms | 756.7445 Ops/s | 771.0601 Ops/s | |
test_tc_init | 88.8260μs | 49.3753μs | 20.2530 KOps/s | 20.2017 KOps/s | |
test_tc_init_nested | 0.1824ms | 98.6175μs | 10.1402 KOps/s | 10.1113 KOps/s | |
test_tc_first_layer_tensor | 43.8520μs | 1.6314μs | 612.9637 KOps/s | 639.3121 KOps/s | |
test_tc_first_layer_nontensor | 30.7970μs | 4.9586μs | 201.6687 KOps/s | 209.5784 KOps/s | |
test_tc_second_layer_tensor | 22.2920μs | 2.9403μs | 340.0963 KOps/s | 343.6534 KOps/s | |
test_tc_second_layer_nontensor | 51.9570μs | 6.3697μs | 156.9938 KOps/s | 163.5150 KOps/s | |
test_unbind | 0.2491s | 13.9931ms | 71.4639 Ops/s | 70.1015 Ops/s | |
test_full_like | 9.2718ms | 8.0533ms | 124.1725 Ops/s | 139.4272 Ops/s | |
test_zeros_like | 9.2510ms | 4.6573ms | 214.7161 Ops/s | 359.8828 Ops/s | |
test_ones_like | 5.0855ms | 3.3418ms | 299.2442 Ops/s | 313.3520 Ops/s | |
test_clone | 13.2095ms | 7.4227ms | 134.7225 Ops/s | 187.3545 Ops/s | |
test_squeeze | 69.4400μs | 12.6291μs | 79.1822 KOps/s | 79.5056 KOps/s | |
test_unsqueeze | 0.2755ms | 95.0682μs | 10.5188 KOps/s | 10.3282 KOps/s | |
test_split | 0.4317ms | 0.1956ms | 5.1122 KOps/s | 5.0690 KOps/s | |
test_permute | 0.3364ms | 0.2030ms | 4.9270 KOps/s | 4.9264 KOps/s | |
test_stack | 28.2078ms | 25.2077ms | 39.6704 Ops/s | 40.2664 Ops/s | |
test_cat | 29.8918ms | 25.2455ms | 39.6111 Ops/s | 40.7719 Ops/s |
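
The Change column is empty in the table above. The following is a minimal sketch of how a reader could fill it in, assuming (the page does not define it) that Change means the relative difference between Ops and Ops on Repo HEAD. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of tensordict or the benchmark tooling.

```python
import re

# Multipliers for the throughput units that appear in the tables.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.4236 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, (ops - ops_head) / ops_head) for one table row.

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    ops_head = parse_ops(cells[4])
    return name, (ops - ops_head) / ops_head


# One row copied verbatim from the table above.
row = (
    "test_membership_stacked_nested_last | 25.2270μs | 4.2539μs | "
    "235.0791 KOps/s | 176.9916 KOps/s | |"
)
name, change = relative_change(row)
print(f"{name}: {change:+.1%}")
```

A positive value means the PR branch ran more operations per second than Repo HEAD for that test. Single-run differences of a few percent are likely within measurement noise, so only large gaps are worth investigating.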

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 40.9700μs | 12.8500μs | 77.8213 KOps/s | 81.3223 KOps/s | |
test_plain_set_stack_nested | 34.1700μs | 12.9815μs | 77.0327 KOps/s | 80.1578 KOps/s | |
test_plain_set_nested_inplace | 43.4410μs | 13.9801μs | 71.5302 KOps/s | 73.7923 KOps/s | |
test_plain_set_stack_nested_inplace | 45.8910μs | 13.8753μs | 72.0706 KOps/s | 74.8436 KOps/s | |
test_items | 29.9310μs | 2.8676μs | 348.7220 KOps/s | 342.5485 KOps/s | |
test_items_nested | 0.4357ms | 0.3702ms | 2.7012 KOps/s | 2.6730 KOps/s | |
test_items_nested_locked | 0.4173ms | 0.3715ms | 2.6916 KOps/s | 2.6617 KOps/s | |
test_items_nested_leaf | 91.4410μs | 60.0957μs | 16.6401 KOps/s | 16.5699 KOps/s | |
test_items_stack_nested | 0.4035ms | 0.3683ms | 2.7155 KOps/s | 2.6927 KOps/s | |
test_items_stack_nested_leaf | 87.5810μs | 60.5718μs | 16.5093 KOps/s | 16.3929 KOps/s | |
test_items_stack_nested_locked | 0.4306ms | 0.3720ms | 2.6880 KOps/s | 2.6653 KOps/s | |
test_keys | 37.2610μs | 3.4274μs | 291.7690 KOps/s | 291.9284 KOps/s | |
test_keys_nested | 0.1334ms | 87.9063μs | 11.3758 KOps/s | 11.3327 KOps/s | |
test_keys_nested_locked | 0.7633ms | 94.4716μs | 10.5852 KOps/s | 10.6198 KOps/s | |
test_keys_nested_leaf | 0.1151ms | 79.8926μs | 12.5168 KOps/s | 12.6384 KOps/s | |
test_keys_stack_nested | 0.1435ms | 88.4761μs | 11.3025 KOps/s | 11.3579 KOps/s | |
test_keys_stack_nested_leaf | 0.1132ms | 80.3783μs | 12.4412 KOps/s | 12.5539 KOps/s | |
test_keys_stack_nested_locked | 0.1325ms | 94.9989μs | 10.5264 KOps/s | 10.5430 KOps/s | |
test_values | 6.0435μs | 0.8584μs | 1.1650 MOps/s | 1.1740 MOps/s | |
test_values_nested | 77.9510μs | 37.2826μs | 26.8222 KOps/s | 26.8376 KOps/s | |
test_values_nested_locked | 72.1010μs | 39.6248μs | 25.2367 KOps/s | 25.6083 KOps/s | |
test_values_nested_leaf | 78.9510μs | 42.4983μs | 23.5304 KOps/s | 23.7158 KOps/s | |
test_values_stack_nested | 65.7010μs | 37.4495μs | 26.7026 KOps/s | 26.5897 KOps/s | |
test_values_stack_nested_leaf | 75.4010μs | 42.4132μs | 23.5776 KOps/s | 23.4314 KOps/s | |
test_values_stack_nested_locked | 0.1033ms | 39.9061μs | 25.0588 KOps/s | 25.5233 KOps/s | |
test_membership | 1.7071μs | 0.5016μs | 1.9936 MOps/s | 1.9741 MOps/s | |
test_membership_nested | 29.0600μs | 2.0976μs | 476.7342 KOps/s | 478.7006 KOps/s | |
test_membership_nested_leaf | 21.9755μs | 2.0089μs | 497.7783 KOps/s | 491.8793 KOps/s | |
test_membership_stacked_nested | 32.7710μs | 2.1207μs | 471.5489 KOps/s | 475.8986 KOps/s | |
test_membership_stacked_nested_leaf | 41.8710μs | 2.1199μs | 471.7275 KOps/s | 462.3066 KOps/s | |
test_membership_nested_last | 31.4910μs | 3.0673μs | 326.0244 KOps/s | 322.5975 KOps/s | |
test_membership_nested_leaf_last | 32.1400μs | 3.0397μs | 328.9748 KOps/s | 323.6620 KOps/s | |
test_membership_stacked_nested_last | 30.4610μs | 3.1044μs | 322.1272 KOps/s | 324.5949 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.6410μs | 3.1018μs | 322.3926 KOps/s | 323.1095 KOps/s | |
test_nested_getleaf | 55.2410μs | 6.1851μs | 161.6790 KOps/s | 159.1955 KOps/s | |
test_nested_get | 34.0410μs | 5.9774μs | 167.2966 KOps/s | 168.1455 KOps/s | |
test_stacked_getleaf | 35.5010μs | 6.1623μs | 162.2769 KOps/s | 160.2197 KOps/s | |
test_stacked_get | 33.0100μs | 5.8616μs | 170.6008 KOps/s | 170.4022 KOps/s | |
test_nested_getitemleaf | 30.0710μs | 6.4790μs | 154.3441 KOps/s | 153.7289 KOps/s | |
test_nested_getitem | 31.0900μs | 6.1507μs | 162.5819 KOps/s | 162.9327 KOps/s | |
test_stacked_getitemleaf | 41.1610μs | 6.4345μs | 155.4115 KOps/s | 155.5882 KOps/s | |
test_stacked_getitem | 35.1610μs | 5.9939μs | 166.8363 KOps/s | 167.0478 KOps/s | |
test_lock_nested | 0.4003ms | 0.3433ms | 2.9130 KOps/s | 2.9817 KOps/s | |
test_lock_stack_nested | 0.3837ms | 0.3494ms | 2.8622 KOps/s | 2.9064 KOps/s | |
test_unlock_nested | 0.3827ms | 0.2898ms | 3.4506 KOps/s | 3.5542 KOps/s | |
test_unlock_stack_nested | 0.3366ms | 0.2898ms | 3.4506 KOps/s | 3.5404 KOps/s | |
test_flatten_speed | 0.1115ms | 77.8701μs | 12.8419 KOps/s | 12.7541 KOps/s | |
test_unflatten_speed | 0.4177ms | 0.3261ms | 3.0662 KOps/s | 3.1084 KOps/s | |
test_common_ops | 0.7546ms | 0.6212ms | 1.6099 KOps/s | 1.6362 KOps/s | |
test_creation | 78.4810μs | 1.7575μs | 569.0022 KOps/s | 580.2079 KOps/s | |
test_creation_empty | 39.7000μs | 9.0458μs | 110.5483 KOps/s | 124.5914 KOps/s | |
test_creation_nested_1 | 42.6100μs | 10.5729μs | 94.5817 KOps/s | 102.6839 KOps/s | |
test_creation_nested_2 | 48.1910μs | 13.3807μs | 74.7344 KOps/s | 80.9147 KOps/s | |
test_clone | 56.7210μs | 11.1498μs | 89.6876 KOps/s | 92.8230 KOps/s | |
test_getitem[int] | 1.2048ms | 10.9538μs | 91.2927 KOps/s | 95.6949 KOps/s | |
test_getitem[slice_int] | 0.1167ms | 21.4230μs | 46.6787 KOps/s | 49.1492 KOps/s | |
test_getitem[range] | 0.1295ms | 39.2133μs | 25.5016 KOps/s | 26.8209 KOps/s | |
test_getitem[tuple] | 0.1064ms | 18.4766μs | 54.1226 KOps/s | 56.3739 KOps/s | |
test_getitem[list] | 0.1675ms | 34.4229μs | 29.0504 KOps/s | 29.7768 KOps/s | |
test_setitem_dim[int] | 47.6910μs | 20.0976μs | 49.7571 KOps/s | 50.4232 KOps/s | |
test_setitem_dim[slice_int] | 63.8010μs | 39.1260μs | 25.5585 KOps/s | 25.7394 KOps/s | |
test_setitem_dim[range] | 0.1059ms | 55.9077μs | 17.8866 KOps/s | 18.8388 KOps/s | |
test_setitem_dim[tuple] | 60.5710μs | 32.6786μs | 30.6011 KOps/s | 29.6221 KOps/s | |
test_setitem | 67.0110μs | 16.0120μs | 62.4533 KOps/s | 66.8440 KOps/s | |
test_set | 75.7920μs | 15.1916μs | 65.8260 KOps/s | 67.6725 KOps/s | |
test_set_shared | 0.5159ms | 0.1599ms | 6.2535 KOps/s | 6.2965 KOps/s | |
test_update | 0.2313ms | 18.6899μs | 53.5049 KOps/s | 56.7598 KOps/s | |
test_update_nested | 68.9410μs | 24.3475μs | 41.0721 KOps/s | 42.8787 KOps/s | |
test_update__nested | 0.5101ms | 25.5786μs | 39.0952 KOps/s | 38.8957 KOps/s | |
test_set_nested | 64.0610μs | 16.7630μs | 59.6552 KOps/s | 62.2903 KOps/s | |
test_set_nested_new | 76.2120μs | 18.6645μs | 53.5777 KOps/s | 53.4038 KOps/s | |
test_select | 67.1610μs | 30.1491μs | 33.1684 KOps/s | 33.7563 KOps/s | |
test_select_nested | 74.3110μs | 43.2636μs | 23.1141 KOps/s | 22.5378 KOps/s | |
test_exclude_nested | 94.1510μs | 63.1741μs | 15.8293 KOps/s | 15.6248 KOps/s | |
test_empty[True] | 0.3604ms | 0.2945ms | 3.3957 KOps/s | 3.3541 KOps/s | |
test_empty[False] | 3.7130μs | 0.8388μs | 1.1922 MOps/s | 1.1961 MOps/s | |
test_to | 88.2910μs | 57.1923μs | 17.4849 KOps/s | 16.4618 KOps/s | |
test_to_nonblocking | 91.3620μs | 49.2259μs | 20.3145 KOps/s | 20.8840 KOps/s | |
test_unbind_speed | 0.2923ms | 0.2480ms | 4.0329 KOps/s | 4.2837 KOps/s | |
test_unbind_speed_stack0 | 0.3126ms | 0.2474ms | 4.0414 KOps/s | 4.2288 KOps/s | |
test_unbind_speed_stack1 | 93.5453ms | 0.7503ms | 1.3328 KOps/s | 1.3467 KOps/s | |
test_split | 94.2802ms | 1.6044ms | 623.2778 Ops/s | 627.8731 Ops/s | |
test_chunk | 96.3463ms | 1.6143ms | 619.4774 Ops/s | 627.1133 Ops/s | |
test_consolidate[False-None] | 2.8077ms | 2.7081ms | 369.2684 Ops/s | 336.0734 Ops/s | |
test_consolidate[default-None] | 1.7968ms | 1.7316ms | 577.4926 Ops/s | 584.7331 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8454ms | 1.7534ms | 570.3048 Ops/s | 572.8247 Ops/s | |
test_consolidate_njt[False-None] | 6.7370ms | 6.5330ms | 153.0692 Ops/s | 111.2031 Ops/s | |
test_to[False-False-None] | 1.8194ms | 1.7532ms | 570.3847 Ops/s | 564.3212 Ops/s | |
test_to[True-False-None] | 1.4931ms | 1.3763ms | 726.6059 Ops/s | 751.6225 Ops/s | |
test_to[within-False-None] | 4.4646ms | 4.2387ms | 235.9187 Ops/s | 237.8136 Ops/s | |
test_to[True-default-None] | 5.5874ms | 5.2450ms | 190.6581 Ops/s | 178.7626 Ops/s | |
test_to_njt[False-False-None] | 7.1205ms | 6.9804ms | 143.2579 Ops/s | 134.1229 Ops/s | |
test_to_njt[True-False-None] | 5.7053ms | 5.5317ms | 180.7763 Ops/s | 173.7753 Ops/s | |
test_to_njt[within-False-None] | 12.4026ms | 12.1236ms | 82.4838 Ops/s | 80.6630 Ops/s | |
test_creation[device0] | 0.4579ms | 79.3675μs | 12.5996 KOps/s | 12.6119 KOps/s | |
test_creation_from_tensor | 0.6225ms | 83.3414μs | 11.9988 KOps/s | 11.8954 KOps/s | |
test_add_one[memmap_tensor0] | 0.4339ms | 7.0271μs | 142.3069 KOps/s | 148.1619 KOps/s | |
test_contiguous[memmap_tensor0] | 2.0430μs | 0.4183μs | 2.3905 MOps/s | 2.2562 MOps/s | |
test_stack[memmap_tensor0] | 38.1010μs | 4.7791μs | 209.2435 KOps/s | 229.5432 KOps/s | |
test_memmaptd_index | 1.8343ms | 0.2515ms | 3.9766 KOps/s | 4.1440 KOps/s | |
test_memmaptd_index_astensor | 0.4534ms | 0.3110ms | 3.2153 KOps/s | 3.2908 KOps/s | |
test_memmaptd_index_op | 0.7522ms | 0.6109ms | 1.6369 KOps/s | 1.7363 KOps/s | |
test_serialize_model | 0.1310s | 0.1298s | 7.7012 Ops/s | 7.6913 Ops/s | |
test_serialize_model_pickle | 1.3820s | 1.2222s | 0.8182 Ops/s | 0.8206 Ops/s | |
test_serialize_weights | 0.1308s | 0.1292s | 7.7384 Ops/s | 7.7419 Ops/s | |
test_serialize_weights_returnearly | 0.2951s | 52.8193ms | 18.9325 Ops/s | 14.8062 Ops/s | |
test_serialize_weights_pickle | 1.3779s | 1.2203s | 0.8195 Ops/s | 0.7714 Ops/s | |
test_reshape_pytree | 51.7400μs | 22.0371μs | 45.3781 KOps/s | 45.0607 KOps/s | |
test_reshape_td | 66.1910μs | 27.2803μs | 36.6564 KOps/s | 36.3830 KOps/s | |
test_view_pytree | 56.2200μs | 22.1690μs | 45.1081 KOps/s | 45.4979 KOps/s | |
test_view_td | 65.5910μs | 31.4571μs | 31.7893 KOps/s | 30.8832 KOps/s | |
test_unbind_pytree | 58.0510μs | 28.5857μs | 34.9825 KOps/s | 35.8853 KOps/s | |
test_unbind_td | 0.6173ms | 37.6149μs | 26.5852 KOps/s | 27.1819 KOps/s | |
test_split_pytree | 65.5810μs | 29.9188μs | 33.4238 KOps/s | 33.3854 KOps/s | |
test_split_td | 0.7787ms | 39.0661μs | 25.5977 KOps/s | 25.5585 KOps/s | |
test_add_pytree | 66.4010μs | 35.5357μs | 28.1407 KOps/s | 28.4188 KOps/s | |
test_add_td | 0.1032ms | 53.7308μs | 18.6113 KOps/s | 20.9821 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1744ms | 0.1223ms | 8.1735 KOps/s | 7.9163 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2263ms | 0.1320ms | 7.5782 KOps/s | 7.3398 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2029ms | 96.0188μs | 10.4146 KOps/s | 10.3462 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.1541ms | 0.1511ms | 6.6191 KOps/s | 6.6733 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 54.8110μs | 24.3814μs | 41.0149 KOps/s | 41.5395 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 95.8120μs | 29.4006μs | 34.0129 KOps/s | 33.1532 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4255ms | 63.8800μs | 15.6544 KOps/s | 15.2975 KOps/s | |
test_compile_copy_nested[pytree-eager] | 82.7710μs | 48.6699μs | 20.5466 KOps/s | 20.2094 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1965ms | 0.1426ms | 7.0124 KOps/s | 7.0377 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3140ms | 0.2188ms | 4.5703 KOps/s | 4.5716 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2210ms | 97.7165μs | 10.2337 KOps/s | 10.2954 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1160ms | 55.8169μs | 17.9157 KOps/s | 17.8406 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1858ms | 0.1375ms | 7.2706 KOps/s | 7.3665 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5390ms | 0.4920ms | 2.0324 KOps/s | 2.0490 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3807ms | 0.2639ms | 3.7898 KOps/s | 3.7699 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1985ms | 0.1474ms | 6.7821 KOps/s | 7.0611 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1636ms | 67.2128μs | 14.8781 KOps/s | 14.3054 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1418ms | 0.1003ms | 9.9734 KOps/s | 10.0912 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4800ms | 0.4182ms | 2.3912 KOps/s | 2.4414 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1792ms | 0.1381ms | 7.2388 KOps/s | 7.4232 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.7710μs | 18.9517μs | 52.7657 KOps/s | 46.4657 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 92.8320μs | 31.3424μs | 31.9056 KOps/s | 31.4160 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2180ms | 68.8135μs | 14.5320 KOps/s | 14.3578 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1384ms | 51.4516μs | 19.4357 KOps/s | 19.2906 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6371ms | 0.4003ms | 2.4983 KOps/s | 2.2053 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9187ms | 2.7465ms | 364.1025 Ops/s | 376.8972 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5984ms | 0.4325ms | 2.3124 KOps/s | 2.2681 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8367ms | 2.7249ms | 366.9871 Ops/s | 376.7115 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5111ms | 0.1176ms | 8.5031 KOps/s | 8.5720 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5650ms | 80.7123μs | 12.3897 KOps/s | 12.1327 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.5430ms | 0.1128ms | 8.8623 KOps/s | 9.2199 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1290ms | 69.8626μs | 14.3138 KOps/s | 14.2006 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2335ms | 0.1139ms | 8.7799 KOps/s | 9.1360 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1521ms | 69.4133μs | 14.4065 KOps/s | 14.2129 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1480ms | 0.1040ms | 9.6187 KOps/s | 9.8932 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1480ms | 17.2742μs | 57.8897 KOps/s | 57.1565 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1439ms | 97.0872μs | 10.3000 KOps/s | 10.3329 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 84.8110μs | 15.6353μs | 63.9577 KOps/s | 64.3575 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1671ms | 97.5174μs | 10.2546 KOps/s | 10.2303 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 47.1010μs | 15.6956μs | 63.7121 KOps/s | 57.7693 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1515ms | 0.1028ms | 9.7282 KOps/s | 9.6683 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5919ms | 17.0922μs | 58.5063 KOps/s | 57.5289 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1636ms | 98.0224μs | 10.2018 KOps/s | 9.9624 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 50.9010μs | 15.7669μs | 63.4239 KOps/s | 64.5180 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1591ms | 97.6275μs | 10.2430 KOps/s | 10.1749 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.2485ms | 17.5007μs | 57.1406 KOps/s | 64.6684 KOps/s | |
test_mod_add[eager] | 77.0410μs | 39.5224μs | 25.3021 KOps/s | 26.1127 KOps/s | |
test_mod_add[compile] | 0.3082ms | 82.0436μs | 12.1886 KOps/s | 12.1894 KOps/s | |
test_mod_add[compile-overhead] | 0.3390ms | 0.1715ms | 5.8308 KOps/s | 5.6896 KOps/s | |
test_mod_wrap[eager] | 0.3380ms | 0.2559ms | 3.9074 KOps/s | 3.7725 KOps/s | |
test_mod_wrap[compile] | 0.3592ms | 0.2894ms | 3.4552 KOps/s | 3.4658 KOps/s | |
test_mod_wrap[compile-overhead] | 7.2202ms | 3.8200ms | 261.7777 Ops/s | 268.9822 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5329ms | 1.3913ms | 718.7608 Ops/s | 677.9732 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3867ms | 1.2899ms | 775.2711 Ops/s | 714.5655 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3848ms | 0.9340ms | 1.0707 KOps/s | 954.8104 Ops/s | |
test_seq_add[eager] | 0.1868ms | 0.1201ms | 8.3245 KOps/s | 8.5295 KOps/s | |
test_seq_add[compile] | 0.1572ms | 89.1346μs | 11.2190 KOps/s | 10.8283 KOps/s | |
test_seq_add[compile-overhead] | 0.1697ms | 0.1305ms | 7.6603 KOps/s | 7.6080 KOps/s | |
test_seq_wrap[eager] | 0.5195ms | 0.4337ms | 2.3055 KOps/s | 2.2236 KOps/s | |
test_seq_wrap[compile] | 0.4825ms | 0.3118ms | 3.2073 KOps/s | 3.1003 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3094ms | 0.2269ms | 4.4066 KOps/s | 4.3613 KOps/s | |
test_func_call_runtime[False-eager] | 0.8157ms | 0.7500ms | 1.3333 KOps/s | 1.2357 KOps/s | |
test_func_call_runtime[False-compile] | 0.8971ms | 0.7634ms | 1.3100 KOps/s | 1.3111 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4105ms | 0.3689ms | 2.7110 KOps/s | 2.7214 KOps/s | |
test_func_call_runtime[True-eager] | 0.9923ms | 0.9187ms | 1.0885 KOps/s | 1.0705 KOps/s | |
test_func_call_runtime[True-compile] | 0.8481ms | 0.7842ms | 1.2752 KOps/s | 1.2902 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5562ms | 0.3894ms | 2.5682 KOps/s | 2.5692 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8293ms | 0.7510ms | 1.3316 KOps/s | 1.3103 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0111ms | 0.7614ms | 1.3134 KOps/s | 1.2879 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4484ms | 0.3688ms | 2.7112 KOps/s | 2.7081 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1059ms | 1.0215ms | 978.9055 Ops/s | 961.0818 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1188ms | 1.0050ms | 995.0678 Ops/s | 975.3712 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0697ms | 1.0064ms | 993.6562 Ops/s | 971.0163 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5238ms | 2.1032ms | 475.4685 Ops/s | 467.7946 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9021ms | 0.8268ms | 1.2096 KOps/s | 1.2106 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4649ms | 0.4187ms | 2.3885 KOps/s | 2.3707 KOps/s | |
test_distributed | 2.8979ms | 0.2354ms | 4.2482 KOps/s | 8.7807 KOps/s | |
test_tdmodule | 36.2710μs | 20.8044μs | 48.0668 KOps/s | 50.5049 KOps/s | |
test_tdmodule_dispatch | 83.2810μs | 37.2086μs | 26.8755 KOps/s | 28.3575 KOps/s | |
test_tdseq | 31.0800μs | 21.4020μs | 46.7246 KOps/s | 48.2386 KOps/s | |
test_tdseq_dispatch | 58.1910μs | 38.6556μs | 25.8694 KOps/s | 25.8834 KOps/s | |
test_instantiation_functorch | 1.6976ms | 1.5738ms | 635.4113 Ops/s | 649.6501 Ops/s | |
test_exec_functorch | 0.1873ms | 0.1461ms | 6.8451 KOps/s | 7.0015 KOps/s | |
test_exec_functional_call | 0.1982ms | 0.1411ms | 7.0885 KOps/s | 7.3345 KOps/s | |
test_exec_td_decorator | 0.3803ms | 0.1909ms | 5.2394 KOps/s | 5.3129 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8366ms | 0.6931ms | 1.4428 KOps/s | 1.4404 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8281ms | 0.6913ms | 1.4465 KOps/s | 1.4427 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7136ms | 0.6010ms | 1.6639 KOps/s | 1.6552 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7155ms | 0.6005ms | 1.6653 KOps/s | 1.6515 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.1005ms | 19.4450ms | 51.4272 Ops/s | 51.1006 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.5093ms | 19.4289ms | 51.4696 Ops/s | 51.2784 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.3391ms | 19.2907ms | 51.8385 Ops/s | 51.6929 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3845ms | 19.2929ms | 51.8326 Ops/s | 51.6996 Ops/s | |
test_to_module_speed[True] | 1.4302ms | 0.9694ms | 1.0316 KOps/s | 1.0240 KOps/s | |
test_to_module_speed[False] | 1.0211ms | 0.9507ms | 1.0519 KOps/s | 1.0391 KOps/s | |
test_tc_init | 75.6310μs | 36.6059μs | 27.3180 KOps/s | 28.2507 KOps/s | |
test_tc_init_nested | 0.1178ms | 73.0007μs | 13.6985 KOps/s | 14.4008 KOps/s | |
test_tc_first_layer_tensor | 24.7310μs | 0.7970μs | 1.2547 MOps/s | 1.2590 MOps/s | |
test_tc_first_layer_nontensor | 31.6900μs | 2.2046μs | 453.5987 KOps/s | 451.2249 KOps/s | |
test_tc_second_layer_tensor | 9.4402μs | 1.4078μs | 710.3391 KOps/s | 709.0221 KOps/s | |
test_tc_second_layer_nontensor | 31.2910μs | 2.9328μs | 340.9676 KOps/s | 335.3211 KOps/s | |
test_unbind | 0.2164s | 11.9649ms | 83.5775 Ops/s | 141.1119 Ops/s | |
test_full_like | 9.3135ms | 9.1365ms | 109.4505 Ops/s | 108.2235 Ops/s | |
test_zeros_like | 6.7772ms | 4.3513ms | 229.8182 Ops/s | 234.7166 Ops/s | |
test_ones_like | 4.9591ms | 4.3278ms | 231.0635 Ops/s | 235.4041 Ops/s | |
test_clone | 6.9333ms | 6.3820ms | 156.6903 Ops/s | 109.8489 Ops/s | |
test_squeeze | 53.4510μs | 10.0993μs | 99.0171 KOps/s | 103.2357 KOps/s | |
test_unsqueeze | 0.1264ms | 74.4285μs | 13.4357 KOps/s | 13.5746 KOps/s | |
test_split | 0.3711ms | 0.1621ms | 6.1703 KOps/s | 6.1252 KOps/s | |
test_permute | 0.2425ms | 0.1859ms | 5.3797 KOps/s | 5.4420 KOps/s | |
test_stack | 50.8762ms | 50.5157ms | 19.7958 Ops/s | 19.9554 Ops/s | |
test_cat | 50.6447ms | 50.2292ms | 19.9088 Ops/s | 20.0065 Ops/s |
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):