[BugFix] Consolidate lazy stacks of non-tensors #1222
Merged
This was referenced Feb 19, 2025
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 45e02491db1e191204d13481602f8611e1588909
Pull Request resolved: #1222
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 8b1d3dcdc4b3f428d79c979b53c6615de1def9f0
Pull Request resolved: #1222
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.8140μs | 21.0699μs | 47.4610 KOps/s | 49.0189 KOps/s | |
test_plain_set_stack_nested | 41.3780μs | 20.9454μs | 47.7431 KOps/s | 48.2252 KOps/s | |
test_plain_set_nested_inplace | 52.3380μs | 22.8619μs | 43.7408 KOps/s | 44.6930 KOps/s | |
test_plain_set_stack_nested_inplace | 93.8690μs | 22.8275μs | 43.8068 KOps/s | 44.7768 KOps/s | |
test_items | 0.1028ms | 4.2246μs | 236.7081 KOps/s | 240.6147 KOps/s | |
test_items_nested | 0.5880ms | 0.4042ms | 2.4739 KOps/s | 2.4419 KOps/s | |
test_items_nested_locked | 0.5196ms | 0.4039ms | 2.4761 KOps/s | 2.4268 KOps/s | |
test_items_nested_leaf | 0.1383ms | 76.7135μs | 13.0355 KOps/s | 12.8818 KOps/s | |
test_items_stack_nested | 0.5989ms | 0.4121ms | 2.4264 KOps/s | 2.4322 KOps/s | |
test_items_stack_nested_leaf | 0.1327ms | 77.6983μs | 12.8703 KOps/s | 12.4785 KOps/s | |
test_items_stack_nested_locked | 0.5776ms | 0.4070ms | 2.4569 KOps/s | 2.4194 KOps/s | |
test_keys | 28.6130μs | 3.4566μs | 289.3011 KOps/s | 289.4759 KOps/s | |
test_keys_nested | 0.2684ms | 0.1649ms | 6.0648 KOps/s | 6.0884 KOps/s | |
test_keys_nested_locked | 0.7396ms | 0.1713ms | 5.8381 KOps/s | 5.8541 KOps/s | |
test_keys_nested_leaf | 0.2294ms | 0.1433ms | 6.9794 KOps/s | 6.9652 KOps/s | |
test_keys_stack_nested | 0.3040ms | 0.1659ms | 6.0290 KOps/s | 6.0627 KOps/s | |
test_keys_stack_nested_leaf | 0.2597ms | 0.1444ms | 6.9248 KOps/s | 7.1811 KOps/s | |
test_keys_stack_nested_locked | 0.2700ms | 0.1715ms | 5.8304 KOps/s | 5.9465 KOps/s | |
test_values | 5.5846μs | 1.0325μs | 968.5399 KOps/s | 969.5777 KOps/s | |
test_values_nested | 0.1054ms | 63.5790μs | 15.7285 KOps/s | 16.1620 KOps/s | |
test_values_nested_locked | 0.1111ms | 63.9656μs | 15.6334 KOps/s | 16.2223 KOps/s | |
test_values_nested_leaf | 0.1217ms | 71.6702μs | 13.9528 KOps/s | 13.8582 KOps/s | |
test_values_stack_nested | 0.1114ms | 64.7863μs | 15.4354 KOps/s | 14.8523 KOps/s | |
test_values_stack_nested_leaf | 0.1258ms | 72.2738μs | 13.8363 KOps/s | 14.2342 KOps/s | |
test_values_stack_nested_locked | 0.1103ms | 64.6482μs | 15.4683 KOps/s | 15.7605 KOps/s | |
test_membership | 3.5251μs | 0.7165μs | 1.3957 MOps/s | 1.1459 MOps/s | |
test_membership_nested | 43.0900μs | 2.9470μs | 339.3234 KOps/s | 340.5480 KOps/s | |
test_membership_nested_leaf | 30.6680μs | 2.9153μs | 343.0187 KOps/s | 344.6672 KOps/s | |
test_membership_stacked_nested | 31.6190μs | 2.8985μs | 345.0027 KOps/s | 344.5362 KOps/s | |
test_membership_stacked_nested_leaf | 39.6550μs | 2.9411μs | 340.0124 KOps/s | 340.1864 KOps/s | |
test_membership_nested_last | 29.5960μs | 4.3264μs | 231.1388 KOps/s | 228.2099 KOps/s | |
test_membership_nested_leaf_last | 35.0460μs | 4.3336μs | 230.7552 KOps/s | 219.9783 KOps/s | |
test_membership_stacked_nested_last | 0.1276ms | 4.4446μs | 224.9913 KOps/s | 74.3158 KOps/s | |
test_membership_stacked_nested_leaf_last | 89.7780μs | 4.3622μs | 229.2423 KOps/s | 74.1329 KOps/s | |
test_nested_getleaf | 38.0210μs | 10.8786μs | 91.9232 KOps/s | 92.7874 KOps/s | |
test_nested_get | 33.6240μs | 10.4930μs | 95.3020 KOps/s | 98.5251 KOps/s | |
test_stacked_getleaf | 26.7600μs | 10.6496μs | 93.9002 KOps/s | 94.2900 KOps/s | |
test_stacked_get | 37.1590μs | 10.3617μs | 96.5094 KOps/s | 98.1857 KOps/s | |
test_nested_getitemleaf | 38.2420μs | 11.5409μs | 86.6482 KOps/s | 87.4805 KOps/s | |
test_nested_getitem | 41.1270μs | 10.9231μs | 91.5487 KOps/s | 94.8482 KOps/s | |
test_stacked_getitemleaf | 43.0810μs | 11.3269μs | 88.2857 KOps/s | 89.8069 KOps/s | |
test_stacked_getitem | 41.8680μs | 11.0153μs | 90.7828 KOps/s | 93.6309 KOps/s | |
test_lock_nested | 7.2654ms | 0.4138ms | 2.4164 KOps/s | 2.4496 KOps/s | |
test_lock_stack_nested | 0.8251ms | 0.4180ms | 2.3921 KOps/s | 2.4366 KOps/s | |
test_unlock_nested | 0.4130ms | 0.3300ms | 3.0305 KOps/s | 2.9932 KOps/s | |
test_unlock_stack_nested | 0.4138ms | 0.3361ms | 2.9756 KOps/s | 3.0138 KOps/s | |
test_flatten_speed | 0.1794ms | 99.6279μs | 10.0374 KOps/s | 9.4253 KOps/s | |
test_unflatten_speed | 1.0881ms | 0.5308ms | 1.8840 KOps/s | 1.8697 KOps/s | |
test_common_ops | 6.9947ms | 0.8422ms | 1.1874 KOps/s | 1.2661 KOps/s | |
test_creation | 41.4780μs | 2.5136μs | 397.8409 KOps/s | 397.1191 KOps/s | |
test_creation_empty | 37.1800μs | 12.5786μs | 79.5001 KOps/s | 87.8107 KOps/s | |
test_creation_nested_1 | 52.4580μs | 15.6082μs | 64.0690 KOps/s | 70.1971 KOps/s | |
test_creation_nested_2 | 53.3390μs | 20.3457μs | 49.1505 KOps/s | 53.8173 KOps/s | |
test_clone | 63.8290μs | 13.4902μs | 74.1279 KOps/s | 73.4214 KOps/s | |
test_getitem[int] | 0.8942ms | 12.3347μs | 81.0723 KOps/s | 78.7365 KOps/s | |
test_getitem[slice_int] | 0.1541ms | 24.6210μs | 40.6158 KOps/s | 41.2602 KOps/s | |
test_getitem[range] | 0.1625ms | 49.7312μs | 20.1081 KOps/s | 19.9930 KOps/s | |
test_getitem[tuple] | 0.1371ms | 19.8102μs | 50.4790 KOps/s | 49.9747 KOps/s | |
test_getitem[list] | 0.2965ms | 45.6687μs | 21.8968 KOps/s | 21.8896 KOps/s | |
test_setitem_dim[int] | 56.7370μs | 25.9076μs | 38.5987 KOps/s | 38.9021 KOps/s | |
test_setitem_dim[slice_int] | 88.4760μs | 52.3291μs | 19.1098 KOps/s | 19.2256 KOps/s | |
test_setitem_dim[range] | 0.1312ms | 75.5177μs | 13.2419 KOps/s | 12.8777 KOps/s | |
test_setitem_dim[tuple] | 95.7290μs | 42.4575μs | 23.5530 KOps/s | 24.8900 KOps/s | |
test_setitem | 86.7630μs | 21.1030μs | 47.3865 KOps/s | 48.7144 KOps/s | |
test_set | 0.2593ms | 20.6274μs | 48.4793 KOps/s | 50.3757 KOps/s | |
test_set_shared | 6.3092ms | 0.1851ms | 5.4037 KOps/s | 5.4576 KOps/s | |
test_update | 0.1446ms | 23.9832μs | 41.6959 KOps/s | 44.4226 KOps/s | |
test_update_nested | 0.2109ms | 36.2747μs | 27.5674 KOps/s | 29.5031 KOps/s | |
test_update__nested | 0.2885ms | 33.8627μs | 29.5310 KOps/s | 29.1522 KOps/s | |
test_set_nested | 0.1289ms | 22.8217μs | 43.8180 KOps/s | 44.2889 KOps/s | |
test_set_nested_new | 83.3370μs | 28.1083μs | 35.5767 KOps/s | 36.4146 KOps/s | |
test_select | 0.1017ms | 44.3833μs | 22.5310 KOps/s | 23.3325 KOps/s | |
test_select_nested | 0.1543ms | 63.1741μs | 15.8293 KOps/s | 15.7809 KOps/s | |
test_exclude_nested | 0.1787ms | 81.8260μs | 12.2210 KOps/s | 12.2071 KOps/s | |
test_empty[True] | 0.7378ms | 0.4076ms | 2.4531 KOps/s | 2.3884 KOps/s | |
test_empty[False] | 6.7450μs | 1.3348μs | 749.2023 KOps/s | 726.9274 KOps/s | |
test_unbind_speed | 0.5793ms | 0.2674ms | 3.7393 KOps/s | 3.6991 KOps/s | |
test_unbind_speed_stack0 | 0.3505ms | 0.2644ms | 3.7827 KOps/s | 3.7965 KOps/s | |
test_unbind_speed_stack1 | 0.1059s | 0.7294ms | 1.3710 KOps/s | 1.1116 KOps/s | |
test_split | 0.1028s | 1.7364ms | 575.9178 Ops/s | 564.5762 Ops/s | |
test_chunk | 0.1030s | 1.7602ms | 568.1126 Ops/s | 625.5824 Ops/s | |
test_consolidate_njt[False-None] | 9.4583ms | 8.3331ms | 120.0035 Ops/s | 108.4618 Ops/s | |
test_creation[device0] | 0.2257ms | 91.2784μs | 10.9555 KOps/s | 10.8791 KOps/s | |
test_creation_from_tensor | 0.2354ms | 95.3409μs | 10.4887 KOps/s | 10.2197 KOps/s | |
test_add_one[memmap_tensor0] | 0.1546ms | 5.0884μs | 196.5260 KOps/s | 210.0658 KOps/s | |
test_contiguous[memmap_tensor0] | 10.6090μs | 0.5192μs | 1.9259 MOps/s | 1.9824 MOps/s | |
test_stack[memmap_tensor0] | 19.3560μs | 3.4829μs | 287.1169 KOps/s | 291.2094 KOps/s | |
test_memmaptd_index | 0.5425ms | 0.2340ms | 4.2739 KOps/s | 4.3027 KOps/s | |
test_memmaptd_index_astensor | 0.6501ms | 0.3181ms | 3.1433 KOps/s | 3.0880 KOps/s | |
test_memmaptd_index_op | 0.7904ms | 0.5980ms | 1.6721 KOps/s | 1.7530 KOps/s | |
test_serialize_model | 0.2162s | 0.1302s | 7.6834 Ops/s | 8.3535 Ops/s | |
test_serialize_model_pickle | 0.4483s | 0.3953s | 2.5298 Ops/s | 2.5202 Ops/s | |
test_serialize_weights | 0.1226s | 0.1164s | 8.5938 Ops/s | 8.7593 Ops/s | |
test_serialize_weights_returnearly | 0.1881s | 0.1650s | 6.0592 Ops/s | 5.5275 Ops/s | |
test_serialize_weights_pickle | 1.1443s | 0.6964s | 1.4359 Ops/s | 2.5664 Ops/s | |
test_serialize_weights_filesystem | 0.2470s | 0.1589s | 6.2920 Ops/s | 6.9201 Ops/s | |
test_serialize_model_filesystem | 0.1512s | 0.1433s | 6.9786 Ops/s | 6.5968 Ops/s | |
test_reshape_pytree | 73.2770μs | 26.6907μs | 37.4662 KOps/s | 37.5043 KOps/s | |
test_reshape_td | 69.0090μs | 32.9719μs | 30.3288 KOps/s | 29.6238 KOps/s | |
test_view_pytree | 0.1866ms | 26.7905μs | 37.3266 KOps/s | 37.1970 KOps/s | |
test_view_td | 90.1690μs | 40.3229μs | 24.7998 KOps/s | 24.5840 KOps/s | |
test_unbind_pytree | 63.0080μs | 29.3664μs | 34.0525 KOps/s | 33.3044 KOps/s | |
test_unbind_td | 0.3642ms | 40.7499μs | 24.5400 KOps/s | 25.1773 KOps/s | |
test_split_pytree | 63.1480μs | 29.3032μs | 34.1260 KOps/s | 34.2084 KOps/s | |
test_split_td | 0.5243ms | 44.8969μs | 22.2733 KOps/s | 21.9722 KOps/s | |
test_add_pytree | 84.1270μs | 35.5820μs | 28.1041 KOps/s | 27.9588 KOps/s | |
test_add_td | 0.1292ms | 62.1528μs | 16.0894 KOps/s | 17.2033 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1420ms | 67.3318μs | 14.8518 KOps/s | 14.6892 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.7703ms | 0.1731ms | 5.7784 KOps/s | 5.7561 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1100ms | 44.9765μs | 22.2338 KOps/s | 21.7205 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2591ms | 0.1196ms | 8.3585 KOps/s | 8.3591 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 78.4670μs | 28.8443μs | 34.6689 KOps/s | 35.3731 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1217ms | 59.0911μs | 16.9230 KOps/s | 17.0069 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1492ms | 80.0687μs | 12.4893 KOps/s | 12.5519 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1208ms | 66.8261μs | 14.9642 KOps/s | 14.7707 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1992ms | 0.1067ms | 9.3685 KOps/s | 9.3417 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4276ms | 0.2160ms | 4.6299 KOps/s | 4.5322 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2038ms | 46.8527μs | 21.3435 KOps/s | 21.0383 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2009ms | 66.9186μs | 14.9435 KOps/s | 14.7576 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1949ms | 0.1014ms | 9.8657 KOps/s | 9.9463 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.2875ms | 0.2016ms | 4.9592 KOps/s | 4.9117 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4715ms | 0.2349ms | 4.2569 KOps/s | 4.2971 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2250ms | 0.1107ms | 9.0320 KOps/s | 9.1829 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3214ms | 64.0077μs | 15.6231 KOps/s | 15.6076 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2054ms | 48.9510μs | 20.4286 KOps/s | 20.1912 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3321ms | 0.1581ms | 6.3256 KOps/s | 6.3079 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2018ms | 0.1003ms | 9.9691 KOps/s | 9.8557 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 52.1480μs | 21.4025μs | 46.7235 KOps/s | 46.3006 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1459ms | 69.2793μs | 14.4343 KOps/s | 14.7805 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1936ms | 81.7199μs | 12.2369 KOps/s | 11.9306 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1422ms | 68.2067μs | 14.6613 KOps/s | 14.5296 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3154ms | 0.2161ms | 4.6285 KOps/s | 4.5294 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7708ms | 1.3654ms | 732.3805 Ops/s | 716.4002 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3084ms | 0.2092ms | 4.7793 KOps/s | 4.6667 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.4820ms | 0.8329ms | 1.2006 KOps/s | 1.1885 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5804ms | 0.4618ms | 2.1655 KOps/s | 2.1614 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.5238ms | 2.7934ms | 357.9804 Ops/s | 375.2794 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1878ms | 39.9192μs | 25.0506 KOps/s | 25.2424 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6247ms | 32.8196μs | 30.4696 KOps/s | 28.9821 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1766ms | 30.8135μs | 32.4534 KOps/s | 31.5156 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 89.8480μs | 22.9968μs | 43.4844 KOps/s | 43.1157 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 87.9240μs | 31.5412μs | 31.7046 KOps/s | 30.3306 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 79.7510μs | 23.0607μs | 43.3639 KOps/s | 43.4152 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1289ms | 52.9340μs | 18.8914 KOps/s | 18.4015 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3172ms | 20.2229μs | 49.4490 KOps/s | 49.1425 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2588ms | 45.2577μs | 22.0957 KOps/s | 20.6437 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 80.3000μs | 18.8654μs | 53.0071 KOps/s | 53.4603 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 93.8360μs | 46.0953μs | 21.6942 KOps/s | 20.2595 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 76.3030μs | 19.2307μs | 52.0003 KOps/s | 53.0337 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1350ms | 54.1959μs | 18.4516 KOps/s | 17.7923 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9307ms | 19.9554μs | 50.1117 KOps/s | 49.3230 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1328ms | 45.7021μs | 21.8808 KOps/s | 20.0593 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 81.7430μs | 19.0197μs | 52.5770 KOps/s | 52.9553 KOps/s | |
test_compile_indexing[int-pytree-compile] | 97.1120μs | 46.4538μs | 21.5268 KOps/s | 20.1124 KOps/s | |
test_compile_indexing[int-pytree-eager] | 65.0820μs | 19.0369μs | 52.5296 KOps/s | 53.6067 KOps/s | |
test_mod_add[eager] | 84.4380μs | 36.7230μs | 27.2309 KOps/s | 28.2854 KOps/s | |
test_mod_add[compile] | 0.1465ms | 66.5918μs | 15.0169 KOps/s | 14.4393 KOps/s | |
test_mod_add[compile-overhead] | 0.1278ms | 66.1579μs | 15.1154 KOps/s | 14.7728 KOps/s | |
test_mod_wrap[eager] | 0.4487ms | 0.2270ms | 4.4049 KOps/s | 4.4530 KOps/s | |
test_mod_wrap[compile] | 1.9457ms | 0.2350ms | 4.2552 KOps/s | 4.3002 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3401ms | 0.2347ms | 4.2604 KOps/s | 4.3384 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.6994ms | 13.5685ms | 73.7002 Ops/s | 77.7724 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.7046ms | 12.0448ms | 83.0235 Ops/s | 87.8823 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.0462ms | 11.8179ms | 84.6176 Ops/s | 88.6191 Ops/s | |
test_seq_add[eager] | 0.1935ms | 0.1188ms | 8.4178 KOps/s | 8.4009 KOps/s | |
test_seq_add[compile] | 0.1534ms | 78.5051μs | 12.7380 KOps/s | 12.8595 KOps/s | |
test_seq_add[compile-overhead] | 0.1568ms | 75.2982μs | 13.2805 KOps/s | 12.9165 KOps/s | |
test_seq_wrap[eager] | 0.6306ms | 0.4546ms | 2.1997 KOps/s | 2.2418 KOps/s | |
test_seq_wrap[compile] | 0.4392ms | 0.2477ms | 4.0374 KOps/s | 4.0999 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3753ms | 0.2479ms | 4.0338 KOps/s | 4.0903 KOps/s | |
test_func_call_runtime[False-eager] | 0.8365ms | 0.5465ms | 1.8297 KOps/s | 1.8258 KOps/s | |
test_func_call_runtime[False-compile] | 0.6137ms | 0.4509ms | 2.2177 KOps/s | 2.2348 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6161ms | 0.4496ms | 2.2242 KOps/s | 2.2536 KOps/s | |
test_func_call_runtime[True-eager] | 0.9667ms | 0.7649ms | 1.3073 KOps/s | 1.3186 KOps/s | |
test_func_call_runtime[True-compile] | 0.7431ms | 0.4702ms | 2.1269 KOps/s | 2.1499 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6723ms | 0.4720ms | 2.1185 KOps/s | 2.1534 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8785ms | 0.5428ms | 1.8423 KOps/s | 1.8502 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8419ms | 0.4521ms | 2.2121 KOps/s | 2.2500 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.8387ms | 0.4518ms | 2.2135 KOps/s | 2.2349 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0786ms | 0.9101ms | 1.0988 KOps/s | 1.1017 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9687ms | 0.8034ms | 1.2446 KOps/s | 1.2354 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.4338ms | 0.8207ms | 1.2184 KOps/s | 1.2180 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 4.6775ms | 2.1474ms | 465.6829 Ops/s | 523.6769 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9376ms | 0.5505ms | 1.8166 KOps/s | 1.8217 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6522ms | 0.5422ms | 1.8442 KOps/s | 1.8479 KOps/s | |
test_distributed | 0.2462ms | 0.1275ms | 7.8453 KOps/s | 7.9587 KOps/s | |
test_tdmodule | 73.4780μs | 27.6771μs | 36.1310 KOps/s | 37.6581 KOps/s | |
test_tdmodule_dispatch | 0.3012ms | 61.6748μs | 16.2141 KOps/s | 21.0470 KOps/s | |
test_tdseq | 48.2200μs | 29.4612μs | 33.9430 KOps/s | 34.3135 KOps/s | |
test_tdseq_dispatch | 0.1183ms | 56.2681μs | 17.7721 KOps/s | 18.8336 KOps/s | |
test_instantiation_functorch | 2.2217ms | 1.5719ms | 636.1615 Ops/s | 639.3416 Ops/s | |
test_exec_functorch | 0.3447ms | 0.1779ms | 5.6219 KOps/s | 5.4789 KOps/s | |
test_exec_functional_call | 0.3268ms | 0.1720ms | 5.8142 KOps/s | 5.7713 KOps/s | |
test_exec_td_decorator | 0.5130ms | 0.2376ms | 4.2085 KOps/s | 4.2162 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8623ms | 0.6829ms | 1.4644 KOps/s | 1.5518 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0170ms | 0.6847ms | 1.4604 KOps/s | 1.5360 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8854ms | 0.5520ms | 1.8116 KOps/s | 1.9205 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9194ms | 0.5527ms | 1.8092 KOps/s | 1.9114 KOps/s | |
test_to_module_speed[True] | 2.2078ms | 1.3670ms | 731.5037 Ops/s | 734.4825 Ops/s | |
test_to_module_speed[False] | 2.1232ms | 1.3211ms | 756.9313 Ops/s | 759.1190 Ops/s | |
test_tc_init | 0.1101ms | 47.5078μs | 21.0492 KOps/s | 21.6205 KOps/s | |
test_tc_init_nested | 0.1721ms | 94.1172μs | 10.6250 KOps/s | 10.8364 KOps/s | |
test_tc_first_layer_tensor | 39.1610μs | 1.5523μs | 644.2204 KOps/s | 647.3888 KOps/s | |
test_tc_first_layer_nontensor | 34.7870μs | 4.7672μs | 209.7652 KOps/s | 208.7714 KOps/s | |
test_tc_second_layer_tensor | 32.3300μs | 3.2059μs | 311.9283 KOps/s | 339.6703 KOps/s | |
test_tc_second_layer_nontensor | 52.1080μs | 6.3732μs | 156.9058 KOps/s | 164.1090 KOps/s | |
test_unbind | 0.2395s | 13.5139ms | 73.9982 Ops/s | 69.2619 Ops/s | |
test_full_like | 9.5861ms | 7.8229ms | 127.8296 Ops/s | 133.1770 Ops/s | |
test_zeros_like | 6.2521ms | 2.8868ms | 346.4076 Ops/s | 350.9916 Ops/s | |
test_ones_like | 4.7427ms | 3.2398ms | 308.6596 Ops/s | 308.4542 Ops/s | |
test_clone | 12.1052ms | 6.9749ms | 143.3704 Ops/s | 191.2663 Ops/s | |
test_squeeze | 68.9190μs | 13.0284μs | 76.7551 KOps/s | 78.3659 KOps/s | |
test_unsqueeze | 0.1608ms | 93.6259μs | 10.6808 KOps/s | 10.6300 KOps/s | |
test_split | 0.4663ms | 0.1979ms | 5.0529 KOps/s | 5.1414 KOps/s | |
test_permute | 0.3498ms | 0.2006ms | 4.9860 KOps/s | 4.9406 KOps/s | |
test_stack | 29.7444ms | 25.2266ms | 39.6407 Ops/s | 38.8995 Ops/s | |
test_cat | 27.0820ms | 24.9148ms | 40.1367 Ops/s | 38.8643 Ops/s |
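
The Change column is empty in this rendering of the table, so comparing a row against Repo HEAD has to be done by hand. The sketch below is one way to do that, assuming the Ops column is the reciprocal of the Mean column (this holds for the first row: 1 / 21.0699μs is about 47.46 KOps/s). The helper names `parse_cell`, `ops_from_mean`, and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
import re

# Unit multipliers for the suffixes used in the benchmark tables.
# Both the Greek mu (U+03BC) and the micro sign (U+00B5) are accepted.
TIME_UNITS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6}
OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_cell(cell: str, units: dict) -> float:
    """Parse a cell such as '0.4042ms' or '2.4739 KOps/s' into base units."""
    match = re.fullmatch(r"([\d.]+)\s*(\D+)", cell.strip())
    if match is None or match.group(2) not in units:
        raise ValueError(f"unrecognized cell: {cell!r}")
    return float(match.group(1)) * units[match.group(2)]


def ops_from_mean(mean_cell: str) -> float:
    """Operations per second implied by a Mean cell (1 / mean time)."""
    return 1.0 / parse_cell(mean_cell, TIME_UNITS)


def relative_change(ops_cell: str, head_cell: str) -> float:
    """Percent change of the Ops column relative to Ops on Repo HEAD."""
    ops = parse_cell(ops_cell, OPS_UNITS)
    head = parse_cell(head_cell, OPS_UNITS)
    return (ops - head) / head * 100.0


if __name__ == "__main__":
    # Values copied from the test_membership_stacked_nested_last row above.
    change = relative_change("224.9913 KOps/s", "74.3158 KOps/s")
    print(f"{change:+.1f}%")
```

A positive result means the pull request's throughput is higher than Repo HEAD for that row. Single-run differences of a few percent are likely within benchmark noise, so only large gaps such as the one in this row are worth reading into.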
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 32.5810μs | 13.1181μs | 76.2307 KOps/s | 79.4407 KOps/s | |
test_plain_set_stack_nested | 40.7810μs | 13.1157μs | 76.2446 KOps/s | 79.7696 KOps/s | |
test_plain_set_nested_inplace | 44.0610μs | 14.1709μs | 70.5671 KOps/s | 73.7085 KOps/s | |
test_plain_set_stack_nested_inplace | 42.7800μs | 14.1118μs | 70.8629 KOps/s | 74.4135 KOps/s | |
test_items | 27.8500μs | 2.9147μs | 343.0860 KOps/s | 334.5725 KOps/s | |
test_items_nested | 0.4203ms | 0.3657ms | 2.7342 KOps/s | 2.7334 KOps/s | |
test_items_nested_locked | 0.5652ms | 0.3677ms | 2.7197 KOps/s | 2.7174 KOps/s | |
test_items_nested_leaf | 0.1220ms | 60.7742μs | 16.4543 KOps/s | 16.3566 KOps/s | |
test_items_stack_nested | 0.3966ms | 0.3656ms | 2.7354 KOps/s | 2.7365 KOps/s | |
test_items_stack_nested_leaf | 92.5020μs | 62.5244μs | 15.9938 KOps/s | 16.2240 KOps/s | |
test_items_stack_nested_locked | 0.4144ms | 0.3639ms | 2.7482 KOps/s | 2.7413 KOps/s | |
test_keys | 28.2200μs | 3.4482μs | 290.0028 KOps/s | 285.9957 KOps/s | |
test_keys_nested | 0.1409ms | 88.6211μs | 11.2840 KOps/s | 11.3073 KOps/s | |
test_keys_nested_locked | 0.7621ms | 94.8208μs | 10.5462 KOps/s | 10.6177 KOps/s | |
test_keys_nested_leaf | 0.1097ms | 79.8425μs | 12.5247 KOps/s | 12.5995 KOps/s | |
test_keys_stack_nested | 0.1425ms | 89.4136μs | 11.1840 KOps/s | 11.3583 KOps/s | |
test_keys_stack_nested_leaf | 0.1039ms | 81.1643μs | 12.3207 KOps/s | 12.5292 KOps/s | |
test_keys_stack_nested_locked | 0.1660ms | 95.2646μs | 10.4971 KOps/s | 10.6813 KOps/s | |
test_values | 5.9617μs | 0.8652μs | 1.1558 MOps/s | 1.1666 MOps/s | |
test_values_nested | 69.0310μs | 37.5837μs | 26.6073 KOps/s | 26.6530 KOps/s | |
test_values_nested_locked | 71.7610μs | 39.3310μs | 25.4252 KOps/s | 25.2526 KOps/s | |
test_values_nested_leaf | 0.1219ms | 42.8121μs | 23.3579 KOps/s | 23.4566 KOps/s | |
test_values_stack_nested | 75.5410μs | 38.2857μs | 26.1194 KOps/s | 26.3029 KOps/s | |
test_values_stack_nested_leaf | 74.2210μs | 43.1559μs | 23.1718 KOps/s | 23.4219 KOps/s | |
test_values_stack_nested_locked | 84.9520μs | 40.2523μs | 24.8433 KOps/s | 24.9702 KOps/s | |
test_membership | 1.5630μs | 0.5050μs | 1.9804 MOps/s | 1.9928 MOps/s | |
test_membership_nested | 25.2400μs | 2.0864μs | 479.2981 KOps/s | 474.3379 KOps/s | |
test_membership_nested_leaf | 17.3605μs | 2.0528μs | 487.1292 KOps/s | 492.2626 KOps/s | |
test_membership_stacked_nested | 30.3010μs | 2.0916μs | 478.1130 KOps/s | 474.4662 KOps/s | |
test_membership_stacked_nested_leaf | 27.3910μs | 2.0822μs | 480.2662 KOps/s | 471.1602 KOps/s | |
test_membership_nested_last | 35.1310μs | 3.1195μs | 320.5633 KOps/s | 320.2239 KOps/s | |
test_membership_nested_leaf_last | 31.0800μs | 3.1185μs | 320.6707 KOps/s | 320.6960 KOps/s | |
test_membership_stacked_nested_last | 28.8700μs | 3.5961μs | 278.0808 KOps/s | 118.1031 KOps/s | |
test_membership_stacked_nested_leaf_last | 96.5320μs | 3.5819μs | 279.1776 KOps/s | 118.1504 KOps/s | |
test_nested_getleaf | 36.3110μs | 6.2846μs | 159.1179 KOps/s | 159.3153 KOps/s | |
test_nested_get | 35.4510μs | 6.0031μs | 166.5812 KOps/s | 167.5453 KOps/s | |
test_stacked_getleaf | 26.3800μs | 6.1809μs | 161.7886 KOps/s | 161.2302 KOps/s | |
test_stacked_get | 55.6210μs | 5.8332μs | 171.4336 KOps/s | 170.6425 KOps/s | |
test_nested_getitemleaf | 32.9310μs | 6.4984μs | 153.8838 KOps/s | 155.2167 KOps/s | |
test_nested_getitem | 40.1900μs | 6.1254μs | 163.2555 KOps/s | 162.7608 KOps/s | |
test_stacked_getitemleaf | 26.2700μs | 6.4281μs | 155.5677 KOps/s | 155.0606 KOps/s | |
test_stacked_getitem | 38.9010μs | 6.0358μs | 165.6775 KOps/s | 165.5376 KOps/s | |
test_lock_nested | 0.3991ms | 0.3457ms | 2.8927 KOps/s | 2.9136 KOps/s | |
test_lock_stack_nested | 0.4563ms | 0.3491ms | 2.8642 KOps/s | 2.8905 KOps/s | |
test_unlock_nested | 0.3527ms | 0.2839ms | 3.5221 KOps/s | 3.5390 KOps/s | |
test_unlock_stack_nested | 0.3283ms | 0.2818ms | 3.5486 KOps/s | 3.5996 KOps/s | |
test_flatten_speed | 0.1159ms | 77.9490μs | 12.8289 KOps/s | 12.6002 KOps/s | |
test_unflatten_speed | 0.3990ms | 0.3246ms | 3.0809 KOps/s | 3.0713 KOps/s | |
test_common_ops | 0.7924ms | 0.6155ms | 1.6246 KOps/s | 1.6594 KOps/s | |
test_creation | 28.0600μs | 1.7796μs | 561.9374 KOps/s | 560.9758 KOps/s | |
test_creation_empty | 35.1110μs | 9.3784μs | 106.6276 KOps/s | 118.0526 KOps/s | |
test_creation_nested_1 | 40.6810μs | 11.1063μs | 90.0390 KOps/s | 98.0476 KOps/s | |
test_creation_nested_2 | 40.9800μs | 13.8350μs | 72.2805 KOps/s | 77.6777 KOps/s | |
test_clone | 29.7900μs | 10.2025μs | 98.0152 KOps/s | 100.0601 KOps/s | |
test_getitem[int] | 1.1932ms | 10.8984μs | 91.7564 KOps/s | 92.0662 KOps/s | |
test_getitem[slice_int] | 0.1082ms | 20.7481μs | 48.1971 KOps/s | 47.0807 KOps/s | |
test_getitem[range] | 0.1229ms | 36.7622μs | 27.2018 KOps/s | 27.0402 KOps/s | |
test_getitem[tuple] | 0.1036ms | 18.2811μs | 54.7012 KOps/s | 54.5493 KOps/s | |
test_getitem[list] | 0.1247ms | 32.9567μs | 30.3429 KOps/s | 30.6559 KOps/s | |
test_setitem_dim[int] | 41.4210μs | 18.6287μs | 53.6806 KOps/s | 53.0447 KOps/s | |
test_setitem_dim[slice_int] | 60.7110μs | 36.8039μs | 27.1710 KOps/s | 26.4793 KOps/s | |
test_setitem_dim[range] | 80.8120μs | 52.2978μs | 19.1213 KOps/s | 19.2374 KOps/s | |
test_setitem_dim[tuple] | 61.2910μs | 31.8053μs | 31.4413 KOps/s | 31.4620 KOps/s | |
test_setitem | 63.5410μs | 15.3794μs | 65.0222 KOps/s | 68.4324 KOps/s | |
test_set | 67.6410μs | 14.8947μs | 67.1379 KOps/s | 71.0523 KOps/s | |
test_set_shared | 0.5080ms | 0.1552ms | 6.4436 KOps/s | 6.3977 KOps/s | |
test_update | 0.4187ms | 18.7631μs | 53.2962 KOps/s | 58.3289 KOps/s | |
test_update_nested | 76.4310μs | 24.2554μs | 41.2279 KOps/s | 43.2062 KOps/s | |
test_update__nested | 0.4917ms | 24.5936μs | 40.6610 KOps/s | 40.2525 KOps/s | |
test_set_nested | 65.8310μs | 16.2997μs | 61.3508 KOps/s | 64.8809 KOps/s | |
test_set_nested_new | 69.3310μs | 18.7962μs | 53.2022 KOps/s | 55.8034 KOps/s | |
test_select | 62.5110μs | 29.8553μs | 33.4949 KOps/s | 33.8052 KOps/s | |
test_select_nested | 71.2410μs | 43.8804μs | 22.7892 KOps/s | 22.5931 KOps/s | |
test_exclude_nested | 95.6510μs | 63.8302μs | 15.6666 KOps/s | 15.4582 KOps/s | |
test_empty[True] | 0.3271ms | 0.2974ms | 3.3624 KOps/s | 3.3753 KOps/s | |
test_empty[False] | 3.2681μs | 0.8185μs | 1.2217 MOps/s | 1.2129 MOps/s | |
test_to | 88.9820μs | 54.8447μs | 18.2333 KOps/s | 17.9100 KOps/s | |
test_to_nonblocking | 85.1810μs | 46.7949μs | 21.3698 KOps/s | 21.2083 KOps/s | |
test_unbind_speed | 0.2773ms | 0.2423ms | 4.1279 KOps/s | 4.1331 KOps/s | |
test_unbind_speed_stack0 | 0.3108ms | 0.2372ms | 4.2159 KOps/s | 4.2058 KOps/s | |
test_unbind_speed_stack1 | 92.9696ms | 0.7199ms | 1.3891 KOps/s | 1.3865 KOps/s | |
test_split | 95.7998ms | 1.6065ms | 622.4583 Ops/s | 617.1189 Ops/s | |
test_chunk | 95.1355ms | 1.6041ms | 623.4199 Ops/s | 614.5226 Ops/s | |
test_consolidate[False-None] | 3.3749ms | 2.6607ms | 375.8454 Ops/s | 372.9970 Ops/s | |
test_consolidate[default-None] | 1.7722ms | 1.6905ms | 591.5393 Ops/s | 589.9783 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8119ms | 1.7313ms | 577.6106 Ops/s | 579.1552 Ops/s | |
test_consolidate_njt[False-None] | 6.8317ms | 6.5840ms | 151.8839 Ops/s | 113.2139 Ops/s | |
test_to[False-False-None] | 1.8000ms | 1.6876ms | 592.5675 Ops/s | 588.9173 Ops/s | |
test_to[True-False-None] | 1.5798ms | 1.3570ms | 736.9057 Ops/s | 721.9913 Ops/s | |
test_to[within-False-None] | 4.4443ms | 4.1686ms | 239.8904 Ops/s | 236.2559 Ops/s | |
test_to[True-default-None] | 5.5471ms | 5.3284ms | 187.6748 Ops/s | 184.1381 Ops/s | |
test_to_njt[False-False-None] | 7.0621ms | 6.9371ms | 144.1529 Ops/s | 141.8789 Ops/s | |
test_to_njt[True-False-None] | 5.7030ms | 5.5799ms | 179.2147 Ops/s | 176.7204 Ops/s | |
test_to_njt[within-False-None] | 12.3841ms | 12.2249ms | 81.8002 Ops/s | 80.5304 Ops/s | |
test_creation[device0] | 0.4601ms | 78.9770μs | 12.6619 KOps/s | 11.8791 KOps/s | |
test_creation_from_tensor | 0.6149ms | 83.7987μs | 11.9334 KOps/s | 11.5049 KOps/s | |
test_add_one[memmap_tensor0] | 0.3524ms | 6.3997μs | 156.2571 KOps/s | 157.2588 KOps/s | |
test_contiguous[memmap_tensor0] | 1.9380μs | 0.4246μs | 2.3554 MOps/s | 2.3448 MOps/s | |
test_stack[memmap_tensor0] | 47.3000μs | 4.5940μs | 217.6739 KOps/s | 229.4514 KOps/s | |
test_memmaptd_index | 1.4722ms | 0.2429ms | 4.1166 KOps/s | 4.0947 KOps/s | |
test_memmaptd_index_astensor | 0.4381ms | 0.3057ms | 3.2711 KOps/s | 3.2643 KOps/s | |
test_memmaptd_index_op | 0.7271ms | 0.5817ms | 1.7192 KOps/s | 1.7628 KOps/s | |
test_serialize_model | 0.1309s | 0.1300s | 7.6953 Ops/s | 7.6725 Ops/s | |
test_serialize_model_pickle | 1.3655s | 1.2161s | 0.8223 Ops/s | 0.8240 Ops/s | |
test_serialize_weights | 0.1316s | 0.1293s | 7.7340 Ops/s | 7.7115 Ops/s | |
test_serialize_weights_returnearly | 0.4397s | 69.6106ms | 14.3656 Ops/s | 14.6216 Ops/s | |
test_serialize_weights_pickle | 1.3503s | 1.1874s | 0.8422 Ops/s | 0.8396 Ops/s | |
test_reshape_pytree | 59.5210μs | 22.9409μs | 43.5903 KOps/s | 43.2346 KOps/s | |
test_reshape_td | 0.1647ms | 26.7824μs | 37.3379 KOps/s | 36.3591 KOps/s | |
test_view_pytree | 47.5910μs | 22.5243μs | 44.3965 KOps/s | 45.0643 KOps/s | |
test_view_td | 66.1910μs | 31.5590μs | 31.6867 KOps/s | 30.9095 KOps/s | |
test_unbind_pytree | 57.7410μs | 28.8184μs | 34.7000 KOps/s | 35.1478 KOps/s | |
test_unbind_td | 0.9564ms | 37.6159μs | 26.5845 KOps/s | 26.8543 KOps/s | |
test_split_pytree | 59.6110μs | 30.8136μs | 32.4532 KOps/s | 32.7437 KOps/s | |
test_split_td | 0.1770ms | 38.8441μs | 25.7439 KOps/s | 24.8252 KOps/s | |
test_add_pytree | 69.4510μs | 33.6187μs | 29.7453 KOps/s | 29.8480 KOps/s | |
test_add_td | 86.4620μs | 50.4692μs | 19.8141 KOps/s | 20.7791 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1758ms | 0.1233ms | 8.1118 KOps/s | 7.8032 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2233ms | 0.1357ms | 7.3699 KOps/s | 7.2587 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1424ms | 0.1002ms | 9.9804 KOps/s | 10.2128 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.4031ms | 0.1457ms | 6.8632 KOps/s | 6.6583 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 99.3020μs | 25.3979μs | 39.3733 KOps/s | 38.5577 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 71.0520μs | 29.8597μs | 33.4900 KOps/s | 34.0503 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3758ms | 65.0348μs | 15.3764 KOps/s | 15.0873 KOps/s | |
test_compile_copy_nested[pytree-eager] | 84.4220μs | 49.6870μs | 20.1260 KOps/s | 19.6321 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1861ms | 0.1424ms | 7.0216 KOps/s | 7.0513 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3207ms | 0.2178ms | 4.5914 KOps/s | 4.6137 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1343ms | 97.7556μs | 10.2296 KOps/s | 10.2508 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1137ms | 56.6725μs | 17.6452 KOps/s | 17.2531 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1840ms | 0.1366ms | 7.3231 KOps/s | 7.3623 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5998ms | 0.4609ms | 2.1698 KOps/s | 2.1027 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3786ms | 0.2616ms | 3.8221 KOps/s | 3.8055 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2853ms | 0.1457ms | 6.8630 KOps/s | 7.0370 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1556ms | 70.6923μs | 14.1458 KOps/s | 14.5429 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2309ms | 99.1181μs | 10.0890 KOps/s | 10.1544 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4403ms | 0.3937ms | 2.5401 KOps/s | 2.5233 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1799ms | 0.1355ms | 7.3797 KOps/s | 7.4048 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 53.0010μs | 19.4865μs | 51.3175 KOps/s | 36.4500 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 67.7910μs | 31.7742μs | 31.4721 KOps/s | 32.0749 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1400ms | 70.8526μs | 14.1138 KOps/s | 14.0032 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1537ms | 53.2118μs | 18.7928 KOps/s | 18.8286 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6570ms | 0.3972ms | 2.5174 KOps/s | 2.1891 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.0433ms | 2.7467ms | 364.0776 Ops/s | 373.8378 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6074ms | 0.4349ms | 2.2992 KOps/s | 2.2413 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.9113ms | 2.6432ms | 378.3338 Ops/s | 386.0650 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.4117ms | 0.1200ms | 8.3312 KOps/s | 8.5653 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5757ms | 81.2232μs | 12.3118 KOps/s | 12.2372 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.6596ms | 0.1125ms | 8.8857 KOps/s | 8.8536 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1947ms | 69.6714μs | 14.3531 KOps/s | 13.7713 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2932ms | 0.1125ms | 8.8857 KOps/s | 8.7580 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2039ms | 69.3709μs | 14.4153 KOps/s | 13.8401 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2479ms | 0.1087ms | 9.1996 KOps/s | 9.4980 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1690ms | 17.4365μs | 57.3509 KOps/s | 55.3237 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2165ms | 0.1015ms | 9.8480 KOps/s | 9.9148 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 88.0420μs | 16.1546μs | 61.9020 KOps/s | 62.2776 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2298ms | 0.1029ms | 9.7160 KOps/s | 9.8866 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1018ms | 16.2286μs | 61.6195 KOps/s | 63.2141 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2278ms | 0.1067ms | 9.3715 KOps/s | 9.8451 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5892ms | 17.3233μs | 57.7258 KOps/s | 56.6625 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2322ms | 0.1037ms | 9.6451 KOps/s | 10.3342 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1112ms | 16.0385μs | 62.3500 KOps/s | 63.1891 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2342ms | 0.1024ms | 9.7627 KOps/s | 10.3637 KOps/s | |
test_compile_indexing[int-pytree-eager] | 87.0010μs | 15.9942μs | 62.5226 KOps/s | 60.4886 KOps/s | |
test_mod_add[eager] | 0.1603ms | 38.6734μs | 25.8576 KOps/s | 25.8636 KOps/s | |
test_mod_add[compile] | 0.2896ms | 82.8233μs | 12.0739 KOps/s | 12.1533 KOps/s | |
test_mod_add[compile-overhead] | 0.3255ms | 0.1685ms | 5.9349 KOps/s | 5.6184 KOps/s | |
test_mod_wrap[eager] | 0.3296ms | 0.2584ms | 3.8701 KOps/s | 3.9950 KOps/s | |
test_mod_wrap[compile] | 0.3770ms | 0.2886ms | 3.4650 KOps/s | 3.4314 KOps/s | |
test_mod_wrap[compile-overhead] | 7.2570ms | 3.8890ms | 257.1373 Ops/s | 261.4183 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4677ms | 1.3422ms | 745.0196 Ops/s | 700.6597 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4074ms | 1.2821ms | 779.9593 Ops/s | 722.3696 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3776ms | 0.9261ms | 1.0798 KOps/s | 961.6483 Ops/s | |
test_seq_add[eager] | 0.1790ms | 0.1231ms | 8.1239 KOps/s | 8.2844 KOps/s | |
test_seq_add[compile] | 0.2026ms | 91.9513μs | 10.8753 KOps/s | 10.9719 KOps/s | |
test_seq_add[compile-overhead] | 0.1723ms | 0.1312ms | 7.6215 KOps/s | 7.6354 KOps/s | |
test_seq_wrap[eager] | 0.4914ms | 0.4277ms | 2.3383 KOps/s | 2.2148 KOps/s | |
test_seq_wrap[compile] | 0.3461ms | 0.3046ms | 3.2828 KOps/s | 3.2607 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2765ms | 0.2283ms | 4.3796 KOps/s | 4.4041 KOps/s | |
test_func_call_runtime[False-eager] | 0.8353ms | 0.7535ms | 1.3271 KOps/s | 1.3705 KOps/s | |
test_func_call_runtime[False-compile] | 0.8966ms | 0.7856ms | 1.2729 KOps/s | 1.3046 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4788ms | 0.3715ms | 2.6917 KOps/s | 2.7218 KOps/s | |
test_func_call_runtime[True-eager] | 0.9468ms | 0.8855ms | 1.1294 KOps/s | 1.1135 KOps/s | |
test_func_call_runtime[True-compile] | 0.8493ms | 0.7788ms | 1.2840 KOps/s | 1.2736 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4557ms | 0.3887ms | 2.5728 KOps/s | 2.5740 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8903ms | 0.7285ms | 1.3727 KOps/s | 1.3831 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.2968ms | 0.7588ms | 1.3179 KOps/s | 1.3066 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4251ms | 0.3699ms | 2.7035 KOps/s | 2.6934 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0703ms | 0.9908ms | 1.0093 KOps/s | 986.5777 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1321ms | 1.0228ms | 977.7369 Ops/s | 1.0115 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0618ms | 0.9842ms | 1.0161 KOps/s | 1.0034 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4638ms | 2.0318ms | 492.1627 Ops/s | 487.2872 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9270ms | 0.8231ms | 1.2149 KOps/s | 1.1912 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4716ms | 0.4211ms | 2.3745 KOps/s | 2.3695 KOps/s | |
test_distributed | 2.6674ms | 0.3375ms | 2.9627 KOps/s | 8.3495 KOps/s | |
test_tdmodule | 53.7110μs | 20.8133μs | 48.0463 KOps/s | 47.9519 KOps/s | |
test_tdmodule_dispatch | 72.0310μs | 37.2834μs | 26.8216 KOps/s | 26.7225 KOps/s | |
test_tdseq | 45.2800μs | 21.5419μs | 46.4212 KOps/s | 47.1392 KOps/s | |
test_tdseq_dispatch | 60.1110μs | 40.2384μs | 24.8519 KOps/s | 25.7853 KOps/s | |
test_instantiation_functorch | 1.6636ms | 1.5560ms | 642.6555 Ops/s | 641.0899 Ops/s | |
test_exec_functorch | 0.2020ms | 0.1393ms | 7.1794 KOps/s | 7.0100 KOps/s | |
test_exec_functional_call | 0.1855ms | 0.1320ms | 7.5732 KOps/s | 7.5718 KOps/s | |
test_exec_td_decorator | 0.3611ms | 0.1816ms | 5.5066 KOps/s | 5.4263 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7883ms | 0.6714ms | 1.4893 KOps/s | 1.4285 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7907ms | 0.6707ms | 1.4910 KOps/s | 1.4724 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7310ms | 0.5766ms | 1.7343 KOps/s | 1.7293 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7114ms | 0.5759ms | 1.7364 KOps/s | 1.7270 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.7757ms | 18.6390ms | 53.6509 Ops/s | 53.6844 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.8548ms | 18.6619ms | 53.5852 Ops/s | 53.1643 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.7087ms | 18.4886ms | 54.0874 Ops/s | 54.0018 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.6755ms | 18.5331ms | 53.9575 Ops/s | 53.9473 Ops/s | |
test_to_module_speed[True] | 1.1839ms | 0.9783ms | 1.0222 KOps/s | 1.0210 KOps/s | |
test_to_module_speed[False] | 1.3594ms | 0.9536ms | 1.0487 KOps/s | 1.0460 KOps/s | |
test_tc_init | 70.7910μs | 36.2284μs | 27.6027 KOps/s | 28.1831 KOps/s | |
test_tc_init_nested | 0.1168ms | 73.7204μs | 13.5648 KOps/s | 13.5546 KOps/s | |
test_tc_first_layer_tensor | 23.4300μs | 0.8275μs | 1.2084 MOps/s | 1.4235 MOps/s | |
test_tc_first_layer_nontensor | 28.5600μs | 2.2735μs | 439.8592 KOps/s | 425.1754 KOps/s | |
test_tc_second_layer_tensor | 12.2437μs | 1.4407μs | 694.1296 KOps/s | 694.2729 KOps/s | |
test_tc_second_layer_nontensor | 23.7700μs | 3.0150μs | 331.6732 KOps/s | 322.1516 KOps/s | |
test_unbind | 0.2238s | 10.1404ms | 98.6159 Ops/s | 142.2247 Ops/s | |
test_full_like | 11.4011ms | 9.3175ms | 107.3249 Ops/s | 107.8702 Ops/s | |
test_zeros_like | 4.9887ms | 4.3329ms | 230.7912 Ops/s | 231.6129 Ops/s | |
test_ones_like | 9.4247ms | 7.1552ms | 139.7586 Ops/s | 230.4833 Ops/s | |
test_clone | 7.3189ms | 6.4523ms | 154.9824 Ops/s | 154.3629 Ops/s | |
test_squeeze | 58.2010μs | 10.0548μs | 99.4547 KOps/s | 100.9527 KOps/s | |
test_unsqueeze | 0.1337ms | 74.1263μs | 13.4905 KOps/s | 13.4667 KOps/s | |
test_split | 0.3691ms | 0.1590ms | 6.2901 KOps/s | 6.0952 KOps/s | |
test_permute | 0.2273ms | 0.1823ms | 5.4849 KOps/s | 5.3682 KOps/s | |
test_stack | 53.6053ms | 51.7006ms | 19.3421 Ops/s | 19.6376 Ops/s | |
test_cat | 54.4925ms | 51.1718ms | 19.5420 Ops/s | 19.5776 Ops/s |
vmoens
added a commit
that referenced
this pull request
Feb 19, 2025
ghstack-source-id: afb1480da5702ec582d4c8438ce16e569b819d9b Pull Request resolved: #1222
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):