[BugFix] Fix tensorclass indexing #1217
Merged
vmoens added a commit that referenced this pull request Feb 12, 2025
ghstack-source-id: ce89eb7de8fb1f7f536668b77bdf0684a92f7e52
Pull Request resolved: #1217
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.9330μs | 20.9126μs | 47.8180 KOps/s | 48.3096 KOps/s | |
test_plain_set_stack_nested | 43.2510μs | 21.1283μs | 47.3299 KOps/s | 47.5316 KOps/s | |
test_plain_set_nested_inplace | 74.3180μs | 22.8614μs | 43.7418 KOps/s | 43.4536 KOps/s | |
test_plain_set_stack_nested_inplace | 51.7570μs | 22.8566μs | 43.7509 KOps/s | 43.6776 KOps/s | |
test_items | 48.9810μs | 4.1853μs | 238.9333 KOps/s | 235.8691 KOps/s | |
test_items_nested | 0.8958ms | 0.3984ms | 2.5103 KOps/s | 2.4164 KOps/s | |
test_items_nested_locked | 0.9200ms | 0.4025ms | 2.4843 KOps/s | 2.4294 KOps/s | |
test_items_nested_leaf | 0.1325ms | 77.5709μs | 12.8914 KOps/s | 12.9296 KOps/s | |
test_items_stack_nested | 0.8113ms | 0.4047ms | 2.4708 KOps/s | 2.4028 KOps/s | |
test_items_stack_nested_leaf | 0.1394ms | 79.2028μs | 12.6258 KOps/s | 12.5048 KOps/s | |
test_items_stack_nested_locked | 0.7993ms | 0.4028ms | 2.4829 KOps/s | 2.4052 KOps/s | |
test_keys | 57.4470μs | 3.4680μs | 288.3490 KOps/s | 286.7885 KOps/s | |
test_keys_nested | 0.2511ms | 0.1639ms | 6.1007 KOps/s | 5.9930 KOps/s | |
test_keys_nested_locked | 1.9823ms | 0.1699ms | 5.8855 KOps/s | 5.7850 KOps/s | |
test_keys_nested_leaf | 0.2340ms | 0.1423ms | 7.0255 KOps/s | 6.9097 KOps/s | |
test_keys_stack_nested | 0.3312ms | 0.1654ms | 6.0461 KOps/s | 5.9481 KOps/s | |
test_keys_stack_nested_leaf | 0.2131ms | 0.1425ms | 7.0165 KOps/s | 6.9475 KOps/s | |
test_keys_stack_nested_locked | 0.2284ms | 0.1682ms | 5.9457 KOps/s | 5.8223 KOps/s | |
test_values | 8.8926μs | 1.0337μs | 967.4079 KOps/s | 954.7189 KOps/s | |
test_values_nested | 0.1110ms | 62.0909μs | 16.1054 KOps/s | 15.4167 KOps/s | |
test_values_nested_locked | 0.1170ms | 62.2827μs | 16.0558 KOps/s | 15.7369 KOps/s | |
test_values_nested_leaf | 0.1667ms | 71.4729μs | 13.9913 KOps/s | 13.6605 KOps/s | |
test_values_stack_nested | 0.1614ms | 63.2511μs | 15.8100 KOps/s | 14.8416 KOps/s | |
test_values_stack_nested_leaf | 0.1419ms | 71.4162μs | 14.0024 KOps/s | 13.7776 KOps/s | |
test_values_stack_nested_locked | 0.1184ms | 63.0918μs | 15.8499 KOps/s | 15.4723 KOps/s | |
test_membership | 5.2754μs | 0.7138μs | 1.4009 MOps/s | 1.1298 MOps/s | |
test_membership_nested | 40.2750μs | 2.8456μs | 351.4242 KOps/s | 346.1891 KOps/s | |
test_membership_nested_leaf | 46.9170μs | 2.8925μs | 345.7235 KOps/s | 339.1152 KOps/s | |
test_membership_stacked_nested | 19.0750μs | 2.8488μs | 351.0283 KOps/s | 345.4491 KOps/s | |
test_membership_stacked_nested_leaf | 39.8340μs | 2.8577μs | 349.9378 KOps/s | 345.7864 KOps/s | |
test_membership_nested_last | 48.2700μs | 4.2825μs | 233.5067 KOps/s | 227.2340 KOps/s | |
test_membership_nested_leaf_last | 29.4550μs | 4.2870μs | 233.2635 KOps/s | 224.1435 KOps/s | |
test_membership_stacked_nested_last | 28.8840μs | 4.3364μs | 230.6054 KOps/s | 225.2379 KOps/s | |
test_membership_stacked_nested_leaf_last | 23.8940μs | 4.2562μs | 234.9537 KOps/s | 225.8065 KOps/s | |
test_nested_getleaf | 45.1340μs | 10.7091μs | 93.3782 KOps/s | 94.4162 KOps/s | |
test_nested_get | 44.7230μs | 10.0625μs | 99.3792 KOps/s | 98.6225 KOps/s | |
test_stacked_getleaf | 34.8040μs | 10.5253μs | 95.0090 KOps/s | 95.0079 KOps/s | |
test_stacked_get | 55.1320μs | 9.9875μs | 100.1254 KOps/s | 100.0398 KOps/s | |
test_nested_getitemleaf | 57.9870μs | 11.3031μs | 88.4711 KOps/s | 87.7676 KOps/s | |
test_nested_getitem | 66.4340μs | 10.7802μs | 92.7627 KOps/s | 93.1884 KOps/s | |
test_stacked_getitemleaf | 57.2670μs | 11.1311μs | 89.8383 KOps/s | 89.4201 KOps/s | |
test_stacked_getitem | 73.7170μs | 10.7427μs | 93.0865 KOps/s | 92.8948 KOps/s | |
test_lock_nested | 0.5054ms | 0.4119ms | 2.4279 KOps/s | 2.4544 KOps/s | |
test_lock_stack_nested | 0.7778ms | 0.4313ms | 2.3188 KOps/s | 2.3505 KOps/s | |
test_unlock_nested | 0.4431ms | 0.3363ms | 2.9735 KOps/s | 2.9849 KOps/s | |
test_unlock_stack_nested | 0.5696ms | 0.3471ms | 2.8813 KOps/s | 2.9241 KOps/s | |
test_flatten_speed | 0.2083ms | 0.1028ms | 9.7287 KOps/s | 10.0902 KOps/s | |
test_unflatten_speed | 0.6647ms | 0.5236ms | 1.9100 KOps/s | 1.9292 KOps/s | |
test_common_ops | 7.5853ms | 0.8401ms | 1.1903 KOps/s | 1.2100 KOps/s | |
test_creation | 49.6740μs | 2.5388μs | 393.8802 KOps/s | 397.4929 KOps/s | |
test_creation_empty | 36.2470μs | 12.6273μs | 79.1937 KOps/s | 81.1336 KOps/s | |
test_creation_nested_1 | 65.2920μs | 15.4917μs | 64.5507 KOps/s | 65.3667 KOps/s | |
test_creation_nested_2 | 51.7760μs | 20.1980μs | 49.5098 KOps/s | 49.9296 KOps/s | |
test_clone | 0.1332ms | 13.5974μs | 73.5437 KOps/s | 73.7947 KOps/s | |
test_getitem[int] | 0.8579ms | 12.9453μs | 77.2483 KOps/s | 75.4107 KOps/s | |
test_getitem[slice_int] | 0.1289ms | 24.9580μs | 40.0672 KOps/s | 39.7490 KOps/s | |
test_getitem[range] | 0.1888ms | 49.4912μs | 20.2056 KOps/s | 19.5324 KOps/s | |
test_getitem[tuple] | 0.1517ms | 20.2457μs | 49.3932 KOps/s | 48.6368 KOps/s | |
test_getitem[list] | 0.1614ms | 45.1020μs | 22.1719 KOps/s | 21.3107 KOps/s | |
test_setitem_dim[int] | 49.3220μs | 25.1635μs | 39.7400 KOps/s | 38.2856 KOps/s | |
test_setitem_dim[slice_int] | 98.3030μs | 51.2743μs | 19.5030 KOps/s | 19.3527 KOps/s | |
test_setitem_dim[range] | 0.1348ms | 76.3492μs | 13.0977 KOps/s | 12.5343 KOps/s | |
test_setitem_dim[tuple] | 71.4030μs | 40.8857μs | 24.4584 KOps/s | 24.5572 KOps/s | |
test_setitem | 0.1885ms | 21.3091μs | 46.9283 KOps/s | 47.6526 KOps/s | |
test_set | 0.1570ms | 21.0043μs | 47.6093 KOps/s | 49.0993 KOps/s | |
test_set_shared | 0.3816ms | 0.1803ms | 5.5459 KOps/s | 5.5087 KOps/s | |
test_update | 0.2319ms | 24.2162μs | 41.2947 KOps/s | 42.0579 KOps/s | |
test_update_nested | 0.1746ms | 35.5556μs | 28.1250 KOps/s | 29.5680 KOps/s | |
test_update__nested | 0.5697ms | 33.7993μs | 29.5864 KOps/s | 29.2036 KOps/s | |
test_set_nested | 0.1363ms | 22.8497μs | 43.7643 KOps/s | 44.1733 KOps/s | |
test_set_nested_new | 90.2680μs | 27.9043μs | 35.8368 KOps/s | 35.8031 KOps/s | |
test_select | 0.2103ms | 43.1188μs | 23.1918 KOps/s | 23.3290 KOps/s | |
test_select_nested | 0.1441ms | 63.3336μs | 15.7894 KOps/s | 15.8573 KOps/s | |
test_exclude_nested | 0.1482ms | 81.1155μs | 12.3281 KOps/s | 12.2637 KOps/s | |
test_empty[True] | 0.6897ms | 0.4067ms | 2.4588 KOps/s | 2.4588 KOps/s | |
test_empty[False] | 15.0435μs | 1.3892μs | 719.8646 KOps/s | 719.1596 KOps/s | |
test_unbind_speed | 0.4341ms | 0.2711ms | 3.6892 KOps/s | 3.6539 KOps/s | |
test_unbind_speed_stack0 | 0.5546ms | 0.2678ms | 3.7339 KOps/s | 3.7028 KOps/s | |
test_unbind_speed_stack1 | 0.1175s | 0.7372ms | 1.3565 KOps/s | 1.3240 KOps/s | |
test_split | 0.1181s | 1.7648ms | 566.6350 Ops/s | 551.4198 Ops/s | |
test_chunk | 0.1207s | 1.7851ms | 560.1855 Ops/s | 556.0109 Ops/s | |
test_consolidate_njt[False-None] | 10.5492ms | 8.2361ms | 121.4171 Ops/s | 119.8518 Ops/s | |
test_creation[device0] | 4.9190ms | 92.7529μs | 10.7813 KOps/s | 10.5687 KOps/s | |
test_creation_from_tensor | 0.2671ms | 93.9038μs | 10.6492 KOps/s | 10.3746 KOps/s | |
test_add_one[memmap_tensor0] | 85.2690μs | 4.9240μs | 203.0858 KOps/s | 200.5956 KOps/s | |
test_contiguous[memmap_tensor0] | 18.9260μs | 0.5434μs | 1.8402 MOps/s | 1.9253 MOps/s | |
test_stack[memmap_tensor0] | 20.0170μs | 3.4722μs | 287.9996 KOps/s | 289.4397 KOps/s | |
test_memmaptd_index | 1.3561ms | 0.2318ms | 4.3135 KOps/s | 4.3576 KOps/s | |
test_memmaptd_index_astensor | 0.5121ms | 0.3204ms | 3.1212 KOps/s | 3.1667 KOps/s | |
test_memmaptd_index_op | 1.0838ms | 0.5971ms | 1.6748 KOps/s | 1.6954 KOps/s | |
test_serialize_model | 0.1271s | 0.1207s | 8.2846 Ops/s | 7.2314 Ops/s | |
test_serialize_model_pickle | 0.4816s | 0.4017s | 2.4894 Ops/s | 2.5789 Ops/s | |
test_serialize_weights | 0.1213s | 0.1171s | 8.5393 Ops/s | 8.3832 Ops/s | |
test_serialize_weights_returnearly | 0.1842s | 0.1615s | 6.1934 Ops/s | 6.3334 Ops/s | |
test_serialize_weights_pickle | 0.4996s | 0.4251s | 2.3525 Ops/s | 2.4535 Ops/s | |
test_serialize_weights_filesystem | 0.1520s | 0.1470s | 6.8017 Ops/s | 5.9185 Ops/s | |
test_serialize_model_filesystem | 0.1644s | 0.1518s | 6.5867 Ops/s | 6.4075 Ops/s | |
test_reshape_pytree | 59.5610μs | 25.8644μs | 38.6631 KOps/s | 36.4713 KOps/s | |
test_reshape_td | 76.3020μs | 32.6262μs | 30.6503 KOps/s | 30.5104 KOps/s | |
test_view_pytree | 55.0420μs | 25.6842μs | 38.9344 KOps/s | 38.4894 KOps/s | |
test_view_td | 80.3990μs | 39.4706μs | 25.3353 KOps/s | 26.4823 KOps/s | |
test_unbind_pytree | 93.0350μs | 28.9502μs | 34.5421 KOps/s | 33.6846 KOps/s | |
test_unbind_td | 0.3596ms | 40.3058μs | 24.8103 KOps/s | 24.7592 KOps/s | |
test_split_pytree | 88.6650μs | 28.9876μs | 34.4975 KOps/s | 34.4891 KOps/s | |
test_split_td | 0.4764ms | 45.4883μs | 21.9837 KOps/s | 21.8619 KOps/s | |
test_add_pytree | 80.5900μs | 34.5669μs | 28.9294 KOps/s | 28.2615 KOps/s | |
test_add_td | 0.1152ms | 58.1620μs | 17.1933 KOps/s | 17.1218 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1489ms | 66.6746μs | 14.9982 KOps/s | 14.8309 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3191ms | 0.1702ms | 5.8748 KOps/s | 5.7770 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1218ms | 45.5333μs | 21.9619 KOps/s | 21.8886 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2591ms | 0.1171ms | 8.5387 KOps/s | 8.3857 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 92.1520μs | 28.0154μs | 35.6946 KOps/s | 34.5153 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1243ms | 59.0367μs | 16.9386 KOps/s | 16.8121 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1574ms | 79.0468μs | 12.6507 KOps/s | 12.5868 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1245ms | 66.2755μs | 15.0885 KOps/s | 14.9142 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4287ms | 0.1077ms | 9.2858 KOps/s | 9.3115 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3524ms | 0.2137ms | 4.6801 KOps/s | 4.6199 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1378ms | 46.6541μs | 21.4344 KOps/s | 21.2066 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.3280ms | 66.0286μs | 15.1449 KOps/s | 14.6852 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1829ms | 0.1002ms | 9.9838 KOps/s | 9.8477 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3662ms | 0.1991ms | 5.0235 KOps/s | 4.9756 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3734ms | 0.2298ms | 4.3521 KOps/s | 4.2996 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2258ms | 0.1085ms | 9.2152 KOps/s | 9.2593 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1257ms | 63.7361μs | 15.6897 KOps/s | 15.8509 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1092ms | 47.8057μs | 20.9180 KOps/s | 20.4114 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3607ms | 0.1566ms | 6.3852 KOps/s | 6.3782 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2027ms | 0.1001ms | 9.9888 KOps/s | 9.7074 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 56.0240μs | 20.8798μs | 47.8931 KOps/s | 45.1034 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1329ms | 68.3182μs | 14.6374 KOps/s | 14.8626 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1779ms | 81.2259μs | 12.3113 KOps/s | 12.1503 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1911ms | 67.0264μs | 14.9195 KOps/s | 14.7831 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3200ms | 0.2148ms | 4.6549 KOps/s | 4.6488 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.6144ms | 1.3812ms | 724.0179 Ops/s | 711.1822 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2853ms | 0.2084ms | 4.7996 KOps/s | 4.7634 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9204ms | 0.8127ms | 1.2305 KOps/s | 1.2091 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5600ms | 0.4565ms | 2.1908 KOps/s | 2.1877 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.4275ms | 2.7820ms | 359.4512 Ops/s | 366.6000 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1308ms | 39.3071μs | 25.4407 KOps/s | 25.9631 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.9087ms | 32.0322μs | 31.2186 KOps/s | 28.3475 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1007ms | 30.7164μs | 32.5559 KOps/s | 32.4585 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 82.2130μs | 22.4909μs | 44.4625 KOps/s | 42.5734 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 82.1830μs | 31.6705μs | 31.5752 KOps/s | 31.6274 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 62.7670μs | 22.3624μs | 44.7179 KOps/s | 42.1138 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1131ms | 53.1871μs | 18.8016 KOps/s | 18.4156 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4227ms | 19.3157μs | 51.7715 KOps/s | 48.4189 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1304ms | 45.6871μs | 21.8880 KOps/s | 21.2612 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 73.0260μs | 18.3236μs | 54.5743 KOps/s | 52.9925 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 93.8450μs | 46.4551μs | 21.5261 KOps/s | 21.0117 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 54.6220μs | 18.0664μs | 55.3514 KOps/s | 53.2737 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1153ms | 54.4959μs | 18.3500 KOps/s | 18.2279 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1067ms | 19.3404μs | 51.7052 KOps/s | 49.4376 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1078ms | 46.1245μs | 21.6804 KOps/s | 21.0777 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 51.6260μs | 18.0064μs | 55.5357 KOps/s | 53.5376 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1139ms | 46.4978μs | 21.5064 KOps/s | 21.0356 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.8080μs | 18.0712μs | 55.3366 KOps/s | 53.7793 KOps/s | |
test_mod_add[eager] | 96.0490μs | 36.4268μs | 27.4523 KOps/s | 26.9125 KOps/s | |
test_mod_add[compile] | 0.1161ms | 65.9224μs | 15.1693 KOps/s | 15.3642 KOps/s | |
test_mod_add[compile-overhead] | 0.1573ms | 64.7372μs | 15.4471 KOps/s | 15.5838 KOps/s | |
test_mod_wrap[eager] | 0.3512ms | 0.2216ms | 4.5131 KOps/s | 4.5396 KOps/s | |
test_mod_wrap[compile] | 2.3072ms | 0.2275ms | 4.3956 KOps/s | 4.2949 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3704ms | 0.2227ms | 4.4906 KOps/s | 4.2825 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.1080ms | 12.7881ms | 78.1975 Ops/s | 75.7988 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.6439ms | 13.8256ms | 72.3295 Ops/s | 82.9229 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 14.2761ms | 11.9669ms | 83.5635 Ops/s | 83.8499 Ops/s | |
test_seq_add[eager] | 0.2163ms | 0.1155ms | 8.6588 KOps/s | 8.1005 KOps/s | |
test_seq_add[compile] | 0.1882ms | 81.7885μs | 12.2267 KOps/s | 12.9853 KOps/s | |
test_seq_add[compile-overhead] | 0.1477ms | 77.3916μs | 12.9213 KOps/s | 13.3169 KOps/s | |
test_seq_wrap[eager] | 0.7341ms | 0.4460ms | 2.2423 KOps/s | 2.2525 KOps/s | |
test_seq_wrap[compile] | 0.4392ms | 0.2433ms | 4.1099 KOps/s | 4.1649 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4404ms | 0.2435ms | 4.1065 KOps/s | 4.2140 KOps/s | |
test_func_call_runtime[False-eager] | 0.6949ms | 0.5469ms | 1.8283 KOps/s | 1.8292 KOps/s | |
test_func_call_runtime[False-compile] | 0.7059ms | 0.4450ms | 2.2471 KOps/s | 2.2237 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6220ms | 0.4407ms | 2.2690 KOps/s | 2.2418 KOps/s | |
test_func_call_runtime[True-eager] | 1.2253ms | 0.7589ms | 1.3178 KOps/s | 1.3210 KOps/s | |
test_func_call_runtime[True-compile] | 0.7647ms | 0.4681ms | 2.1362 KOps/s | 2.1324 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6137ms | 0.4620ms | 2.1645 KOps/s | 2.1202 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7325ms | 0.5358ms | 1.8665 KOps/s | 1.8600 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5877ms | 0.4369ms | 2.2890 KOps/s | 2.2281 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7891ms | 0.4420ms | 2.2625 KOps/s | 2.1398 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4792ms | 0.9027ms | 1.1078 KOps/s | 1.1153 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0332ms | 0.8025ms | 1.2461 KOps/s | 1.2361 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1544ms | 0.8023ms | 1.2464 KOps/s | 1.2170 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5889ms | 1.9195ms | 520.9681 Ops/s | 513.3286 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8648ms | 0.5436ms | 1.8397 KOps/s | 1.8006 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7136ms | 0.5377ms | 1.8599 KOps/s | 1.8335 KOps/s | |
test_distributed | 0.2381ms | 0.1247ms | 8.0208 KOps/s | 7.6962 KOps/s | |
test_tdmodule | 0.1398ms | 27.2315μs | 36.7222 KOps/s | 36.3747 KOps/s | |
test_tdmodule_dispatch | 97.4820μs | 50.1400μs | 19.9442 KOps/s | 19.9716 KOps/s | |
test_tdseq | 65.9430μs | 31.2860μs | 31.9632 KOps/s | 32.7014 KOps/s | |
test_tdseq_dispatch | 93.4640μs | 55.4339μs | 18.0395 KOps/s | 17.9578 KOps/s | |
test_instantiation_functorch | 2.0743ms | 1.5049ms | 664.4764 Ops/s | 642.9392 Ops/s | |
test_exec_functorch | 0.3197ms | 0.1797ms | 5.5645 KOps/s | 5.5324 KOps/s | |
test_exec_functional_call | 0.2956ms | 0.1724ms | 5.7994 KOps/s | 5.8090 KOps/s | |
test_exec_td_decorator | 0.5231ms | 0.2314ms | 4.3211 KOps/s | 4.2357 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8245ms | 0.6577ms | 1.5205 KOps/s | 1.4570 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8944ms | 0.6560ms | 1.5243 KOps/s | 1.4868 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8189ms | 0.5383ms | 1.8576 KOps/s | 1.8288 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8657ms | 0.5355ms | 1.8675 KOps/s | 1.8244 KOps/s | |
test_to_module_speed[True] | 1.9669ms | 1.3585ms | 736.1224 Ops/s | 730.4819 Ops/s | |
test_to_module_speed[False] | 1.3977ms | 1.3218ms | 756.5623 Ops/s | 747.7737 Ops/s | |
test_tc_init | 83.5250μs | 46.2562μs | 21.6187 KOps/s | 20.6718 KOps/s | |
test_tc_init_nested | 0.1613ms | 92.8861μs | 10.7659 KOps/s | 10.3467 KOps/s | |
test_tc_first_layer_tensor | 31.9600μs | 1.5780μs | 633.7022 KOps/s | 637.0831 KOps/s | |
test_tc_first_layer_nontensor | 51.9570μs | 4.7946μs | 208.5689 KOps/s | 211.9894 KOps/s | |
test_tc_second_layer_tensor | 42.5690μs | 2.9091μs | 343.7508 KOps/s | 345.4037 KOps/s | |
test_tc_second_layer_nontensor | 32.6610μs | 6.1252μs | 163.2587 KOps/s | 162.8548 KOps/s | |
test_unbind | 0.2610s | 13.9495ms | 71.6872 Ops/s | 71.9608 Ops/s | |
test_full_like | 10.2669ms | 9.0502ms | 110.4947 Ops/s | 100.9779 Ops/s | |
test_zeros_like | 6.4136ms | 5.1932ms | 192.5601 Ops/s | 188.2032 Ops/s | |
test_ones_like | 4.4535ms | 3.7391ms | 267.4429 Ops/s | 246.6197 Ops/s | |
test_clone | 7.7575ms | 6.1701ms | 162.0716 Ops/s | 154.6524 Ops/s | |
test_squeeze | 59.5810μs | 12.4792μs | 80.1334 KOps/s | 79.5882 KOps/s | |
test_unsqueeze | 0.3065ms | 94.2321μs | 10.6121 KOps/s | 10.8334 KOps/s | |
test_split | 0.3297ms | 0.1885ms | 5.3061 KOps/s | 4.9757 KOps/s | |
test_permute | 0.3729ms | 0.1998ms | 5.0060 KOps/s | 4.9319 KOps/s | |
test_stack | 32.6128ms | 27.9298ms | 35.8041 Ops/s | 33.9217 Ops/s | |
test_cat | 34.8969ms | 27.2786ms | 36.6588 Ops/s | 35.3755 Ops/s |
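
The Change column is empty in this dump. As a rough sketch, a relative change can be recomputed from the Ops and Ops on Repo HEAD columns. The helper names `parse_ops` and `relative_change` are hypothetical, and the sign convention (positive means this run is faster than HEAD) is an assumption, not necessarily the formula the benchmark report uses.

```python
import re

# Unit multipliers for the throughput columns (assumed from the units seen in the table).
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.8180 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, fractional change of this run relative to repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    # Column order: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    name, ops, ops_head = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, (ops - ops_head) / ops_head


# Row copied from the table above.
row = "test_membership | 5.2754μs | 0.7138μs | 1.4009 MOps/s | 1.1298 MOps/s | |"
name, change = relative_change(row)
print(f"{name}: {change:+.1%}")
```

Single-run differences of a few percent in these tables are likely within run-to-run noise, so a sketch like this is mostly useful for flagging large outliers.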

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 41.3000μs | 13.0190μs | 76.8108 KOps/s | 77.1647 KOps/s | |
test_plain_set_stack_nested | 45.7510μs | 13.1710μs | 75.9242 KOps/s | 76.2571 KOps/s | |
test_plain_set_nested_inplace | 0.1188ms | 14.2627μs | 70.1132 KOps/s | 70.9137 KOps/s | |
test_plain_set_stack_nested_inplace | 50.1100μs | 14.1367μs | 70.7380 KOps/s | 71.7187 KOps/s | |
test_items | 54.6510μs | 2.8814μs | 347.0496 KOps/s | 335.1729 KOps/s | |
test_items_nested | 0.4206ms | 0.3655ms | 2.7363 KOps/s | 2.7256 KOps/s | |
test_items_nested_locked | 0.4636ms | 0.3657ms | 2.7347 KOps/s | 2.7185 KOps/s | |
test_items_nested_leaf | 89.5110μs | 60.2376μs | 16.6009 KOps/s | 16.5273 KOps/s | |
test_items_stack_nested | 0.4111ms | 0.3654ms | 2.7369 KOps/s | 2.7304 KOps/s | |
test_items_stack_nested_leaf | 98.1520μs | 61.1041μs | 16.3655 KOps/s | 16.0164 KOps/s | |
test_items_stack_nested_locked | 0.4378ms | 0.3666ms | 2.7280 KOps/s | 2.6969 KOps/s | |
test_keys | 27.3110μs | 3.4391μs | 290.7749 KOps/s | 290.4880 KOps/s | |
test_keys_nested | 0.1128ms | 88.3438μs | 11.3194 KOps/s | 11.4991 KOps/s | |
test_keys_nested_locked | 0.8268ms | 94.4516μs | 10.5874 KOps/s | 10.7496 KOps/s | |
test_keys_nested_leaf | 98.6410μs | 80.2329μs | 12.4637 KOps/s | 12.7571 KOps/s | |
test_keys_stack_nested | 0.1345ms | 88.7403μs | 11.2688 KOps/s | 11.3465 KOps/s | |
test_keys_stack_nested_leaf | 0.1200ms | 80.3389μs | 12.4473 KOps/s | 12.5291 KOps/s | |
test_keys_stack_nested_locked | 0.1368ms | 95.2343μs | 10.5004 KOps/s | 10.6603 KOps/s | |
test_values | 74.7177μs | 0.8523μs | 1.1732 MOps/s | 1.1679 MOps/s | |
test_values_nested | 0.1004ms | 37.4987μs | 26.6676 KOps/s | 26.9329 KOps/s | |
test_values_nested_locked | 64.1410μs | 39.2029μs | 25.5083 KOps/s | 25.3453 KOps/s | |
test_values_nested_leaf | 69.9710μs | 42.9098μs | 23.3047 KOps/s | 23.7360 KOps/s | |
test_values_stack_nested | 76.2120μs | 38.0739μs | 26.2647 KOps/s | 26.5395 KOps/s | |
test_values_stack_nested_leaf | 81.1210μs | 43.2508μs | 23.1209 KOps/s | 23.4183 KOps/s | |
test_values_stack_nested_locked | 72.4610μs | 39.8162μs | 25.1154 KOps/s | 24.8668 KOps/s | |
test_membership | 2.7150μs | 0.5014μs | 1.9946 MOps/s | 1.9913 MOps/s | |
test_membership_nested | 21.3755μs | 1.9822μs | 504.4957 KOps/s | 495.5568 KOps/s | |
test_membership_nested_leaf | 22.3055μs | 1.9935μs | 501.6366 KOps/s | 498.6995 KOps/s | |
test_membership_stacked_nested | 47.5310μs | 2.0686μs | 483.4119 KOps/s | 484.8385 KOps/s | |
test_membership_stacked_nested_leaf | 39.1210μs | 2.0427μs | 489.5404 KOps/s | 480.9282 KOps/s | |
test_membership_nested_last | 59.1710μs | 3.0012μs | 333.2045 KOps/s | 326.8886 KOps/s | |
test_membership_nested_leaf_last | 37.8010μs | 3.0641μs | 326.3554 KOps/s | 326.2458 KOps/s | |
test_membership_stacked_nested_last | 80.1410μs | 3.0441μs | 328.5046 KOps/s | 327.5053 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.3510μs | 3.0259μs | 330.4829 KOps/s | 328.8952 KOps/s | |
test_nested_getleaf | 41.5610μs | 6.1847μs | 161.6897 KOps/s | 159.9246 KOps/s | |
test_nested_get | 33.7910μs | 5.9898μs | 166.9503 KOps/s | 166.9009 KOps/s | |
test_stacked_getleaf | 76.1410μs | 6.1818μs | 161.7655 KOps/s | 161.1945 KOps/s | |
test_stacked_get | 31.2900μs | 5.8456μs | 171.0694 KOps/s | 172.2532 KOps/s | |
test_nested_getitemleaf | 0.2111ms | 6.4943μs | 153.9819 KOps/s | 156.3016 KOps/s | |
test_nested_getitem | 0.1163ms | 6.1859μs | 161.6582 KOps/s | 163.3700 KOps/s | |
test_stacked_getitemleaf | 96.7410μs | 6.4273μs | 155.5875 KOps/s | 157.5055 KOps/s | |
test_stacked_getitem | 29.1210μs | 6.0633μs | 164.9273 KOps/s | 166.0303 KOps/s | |
test_lock_nested | 0.5458ms | 0.3437ms | 2.9096 KOps/s | 2.9109 KOps/s | |
test_lock_stack_nested | 0.4039ms | 0.3528ms | 2.8348 KOps/s | 2.9095 KOps/s | |
test_unlock_nested | 0.3734ms | 0.2919ms | 3.4260 KOps/s | 3.5619 KOps/s | |
test_unlock_stack_nested | 0.3464ms | 0.2907ms | 3.4394 KOps/s | 3.5522 KOps/s | |
test_flatten_speed | 0.2494ms | 77.6027μs | 12.8861 KOps/s | 12.8787 KOps/s | |
test_unflatten_speed | 0.4792ms | 0.3198ms | 3.1265 KOps/s | 3.1146 KOps/s | |
test_common_ops | 0.8533ms | 0.6499ms | 1.5387 KOps/s | 1.5707 KOps/s | |
test_creation | 0.1201ms | 1.7531μs | 570.4076 KOps/s | 567.2540 KOps/s | |
test_creation_empty | 45.1000μs | 9.7030μs | 103.0605 KOps/s | 107.1383 KOps/s | |
test_creation_nested_1 | 0.1333ms | 11.3759μs | 87.9053 KOps/s | 90.5319 KOps/s | |
test_creation_nested_2 | 83.4610μs | 14.0226μs | 71.3132 KOps/s | 71.5941 KOps/s | |
test_clone | 0.1532ms | 11.1402μs | 89.7653 KOps/s | 91.4401 KOps/s | |
test_getitem[int] | 1.3402ms | 10.9824μs | 91.0551 KOps/s | 95.4466 KOps/s | |
test_getitem[slice_int] | 0.1155ms | 21.4080μs | 46.7115 KOps/s | 48.3261 KOps/s | |
test_getitem[range] | 0.1575ms | 38.7324μs | 25.8182 KOps/s | 24.8161 KOps/s | |
test_getitem[tuple] | 0.1044ms | 18.6132μs | 53.7252 KOps/s | 52.9457 KOps/s | |
test_getitem[list] | 0.1299ms | 35.0483μs | 28.5320 KOps/s | 27.9123 KOps/s | |
test_setitem_dim[int] | 43.0010μs | 20.7203μs | 48.2618 KOps/s | 49.5053 KOps/s | |
test_setitem_dim[slice_int] | 63.9910μs | 40.1613μs | 24.8996 KOps/s | 24.6468 KOps/s | |
test_setitem_dim[range] | 81.3620μs | 54.7220μs | 18.2742 KOps/s | 17.4971 KOps/s | |
test_setitem_dim[tuple] | 56.5810μs | 34.1558μs | 29.2776 KOps/s | 29.0253 KOps/s | |
test_setitem | 78.2710μs | 16.7341μs | 59.7582 KOps/s | 61.8204 KOps/s | |
test_set | 76.4110μs | 15.8713μs | 63.0068 KOps/s | 63.4440 KOps/s | |
test_set_shared | 0.5864ms | 0.1583ms | 6.3158 KOps/s | 6.1142 KOps/s | |
test_update | 0.2410ms | 19.6732μs | 50.8305 KOps/s | 49.9033 KOps/s | |
test_update_nested | 86.3910μs | 26.0505μs | 38.3870 KOps/s | 38.4396 KOps/s | |
test_update__nested | 0.4868ms | 26.5209μs | 37.7061 KOps/s | 38.2867 KOps/s | |
test_set_nested | 0.1330ms | 17.3925μs | 57.4959 KOps/s | 58.0329 KOps/s | |
test_set_nested_new | 78.5310μs | 19.5655μs | 51.1105 KOps/s | 49.9437 KOps/s | |
test_select | 77.3210μs | 31.4168μs | 31.8301 KOps/s | 31.0123 KOps/s | |
test_select_nested | 68.6310μs | 43.7849μs | 22.8389 KOps/s | 22.7817 KOps/s | |
test_exclude_nested | 98.0810μs | 64.3574μs | 15.5382 KOps/s | 15.6203 KOps/s | |
test_empty[True] | 0.3360ms | 0.2963ms | 3.3753 KOps/s | 3.4169 KOps/s | |
test_empty[False] | 4.4361μs | 0.8278μs | 1.2081 MOps/s | 1.2140 MOps/s | |
test_to | 87.6910μs | 56.0794μs | 17.8319 KOps/s | 17.8491 KOps/s | |
test_to_nonblocking | 87.0520μs | 47.4421μs | 21.0783 KOps/s | 21.0896 KOps/s | |
test_unbind_speed | 0.2787ms | 0.2477ms | 4.0370 KOps/s | 4.2065 KOps/s | |
test_unbind_speed_stack0 | 0.3205ms | 0.2480ms | 4.0326 KOps/s | 4.1604 KOps/s | |
test_unbind_speed_stack1 | 94.5681ms | 0.7508ms | 1.3319 KOps/s | 1.3701 KOps/s | |
test_split | 95.3628ms | 1.6276ms | 614.4153 Ops/s | 623.8028 Ops/s | |
test_chunk | 98.4289ms | 1.6334ms | 612.2080 Ops/s | 613.3722 Ops/s | |
test_consolidate[False-None] | 3.3910ms | 2.7283ms | 366.5240 Ops/s | 370.9558 Ops/s | |
test_consolidate[default-None] | 1.9600ms | 1.7487ms | 571.8372 Ops/s | 584.0844 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9402ms | 1.7880ms | 559.2795 Ops/s | 571.0710 Ops/s | |
test_consolidate_njt[False-None] | 6.8080ms | 6.6649ms | 150.0401 Ops/s | 151.9469 Ops/s | |
test_to[False-False-None] | 1.9347ms | 1.7298ms | 578.1135 Ops/s | 589.2846 Ops/s | |
test_to[True-False-None] | 1.6699ms | 1.4137ms | 707.3709 Ops/s | 743.0217 Ops/s | |
test_to[within-False-None] | 4.5328ms | 4.2143ms | 237.2895 Ops/s | 241.2616 Ops/s | |
test_to[True-default-None] | 5.6951ms | 5.4786ms | 182.5297 Ops/s | 184.3283 Ops/s | |
test_to_njt[False-False-None] | 7.0990ms | 6.9335ms | 144.2282 Ops/s | 144.4354 Ops/s | |
test_to_njt[True-False-None] | 6.1154ms | 5.7947ms | 172.5729 Ops/s | 170.0059 Ops/s | |
test_to_njt[within-False-None] | 13.2424ms | 12.4426ms | 80.3690 Ops/s | 79.1143 Ops/s | |
test_creation[device0] | 0.4580ms | 80.5566μs | 12.4136 KOps/s | 12.2558 KOps/s | |
test_creation_from_tensor | 0.6121ms | 84.5137μs | 11.8324 KOps/s | 11.9127 KOps/s | |
test_add_one[memmap_tensor0] | 0.5112ms | 7.0844μs | 141.1554 KOps/s | 144.9212 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8690μs | 0.4299μs | 2.3262 MOps/s | 2.3511 MOps/s | |
test_stack[memmap_tensor0] | 35.5310μs | 4.6860μs | 213.4031 KOps/s | 230.4675 KOps/s | |
test_memmaptd_index | 1.4721ms | 0.2508ms | 3.9869 KOps/s | 4.1130 KOps/s | |
test_memmaptd_index_astensor | 0.4536ms | 0.3124ms | 3.2013 KOps/s | 3.3058 KOps/s | |
test_memmaptd_index_op | 0.7878ms | 0.6196ms | 1.6139 KOps/s | 1.6797 KOps/s | |
test_serialize_model | 0.1313s | 0.1304s | 7.6708 Ops/s | 7.6509 Ops/s | |
test_serialize_model_pickle | 1.3491s | 1.1844s | 0.8443 Ops/s | 0.8238 Ops/s | |
test_serialize_weights | 0.1304s | 0.1297s | 7.7119 Ops/s | 5.4851 Ops/s | |
test_serialize_weights_returnearly | 0.3280s | 54.3354ms | 18.4042 Ops/s | 23.5722 Ops/s | |
test_serialize_weights_pickle | 1.3486s | 1.2113s | 0.8256 Ops/s | 0.8197 Ops/s | |
test_reshape_pytree | 0.1505ms | 22.5796μs | 44.2877 KOps/s | 44.8591 KOps/s | |
test_reshape_td | 0.1385ms | 27.8389μs | 35.9209 KOps/s | 37.2678 KOps/s | |
test_view_pytree | 0.1371ms | 22.3399μs | 44.7630 KOps/s | 44.6616 KOps/s | |
test_view_td | 73.3310μs | 33.2113μs | 30.1102 KOps/s | 33.4044 KOps/s | |
test_unbind_pytree | 0.1614ms | 29.6604μs | 33.7150 KOps/s | 35.2892 KOps/s | |
test_unbind_td | 0.8364ms | 38.5398μs | 25.9472 KOps/s | 26.9841 KOps/s | |
test_split_pytree | 66.4910μs | 30.4327μs | 32.8594 KOps/s | 34.4488 KOps/s | |
test_split_td | 0.9557ms | 39.4375μs | 25.3566 KOps/s | 25.3266 KOps/s | |
test_add_pytree | 0.1523ms | 35.4887μs | 28.1780 KOps/s | 27.8497 KOps/s | |
test_add_td | 0.1311ms | 51.4142μs | 19.4499 KOps/s | 18.6919 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2893ms | 0.1292ms | 7.7396 KOps/s | 7.7477 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2956ms | 0.1352ms | 7.3972 KOps/s | 7.3292 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2569ms | 97.4741μs | 10.2591 KOps/s | 10.3119 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.5424ms | 0.1517ms | 6.5938 KOps/s | 6.5488 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1796ms | 25.9721μs | 38.5029 KOps/s | 39.7185 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1073ms | 29.6023μs | 33.7811 KOps/s | 33.0656 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3746ms | 63.6263μs | 15.7168 KOps/s | 15.4172 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1562ms | 48.8730μs | 20.4612 KOps/s | 20.4350 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3273ms | 0.1488ms | 6.7209 KOps/s | 7.0690 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3973ms | 0.2177ms | 4.5933 KOps/s | 4.5795 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2554ms | 98.9484μs | 10.1063 KOps/s | 9.7612 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2125ms | 57.2406μs | 17.4701 KOps/s | 17.4874 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1903ms | 0.1424ms | 7.0230 KOps/s | 7.0524 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6889ms | 0.4929ms | 2.0287 KOps/s | 2.0364 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4342ms | 0.2631ms | 3.8002 KOps/s | 3.8357 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3236ms | 0.1505ms | 6.6461 KOps/s | 7.0642 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2264ms | 72.2368μs | 13.8434 KOps/s | 13.9974 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2550ms | 0.1020ms | 9.7993 KOps/s | 9.7249 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5697ms | 0.4108ms | 2.4341 KOps/s | 2.4233 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2860ms | 0.1404ms | 7.1203 KOps/s | 7.0711 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1513ms | 19.4331μs | 51.4586 KOps/s | 51.0948 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 63.5110μs | 31.9199μs | 31.3284 KOps/s | 31.9251 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1472ms | 68.9447μs | 14.5044 KOps/s | 14.2715 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1905ms | 51.6424μs | 19.3639 KOps/s | 19.1064 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6556ms | 0.3974ms | 2.5164 KOps/s | 2.2460 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9250ms | 2.6736ms | 374.0227 Ops/s | 381.5120 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6206ms | 0.4376ms | 2.2854 KOps/s | 2.2342 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8821ms | 2.6828ms | 372.7451 Ops/s | 382.5473 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.3056ms | 0.1189ms | 8.4117 KOps/s | 8.7750 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5566ms | 79.2082μs | 12.6250 KOps/s | 12.5219 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2957ms | 0.1106ms | 9.0434 KOps/s | 9.3943 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2430ms | 67.0229μs | 14.9203 KOps/s | 14.5171 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.3086ms | 0.1105ms | 9.0498 KOps/s | 9.2912 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2507ms | 70.4247μs | 14.1996 KOps/s | 14.6373 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2595ms | 0.1034ms | 9.6754 KOps/s | 9.9756 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2190ms | 19.3276μs | 51.7394 KOps/s | 56.4978 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2509ms | 96.1054μs | 10.4052 KOps/s | 10.3278 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1602ms | 16.2788μs | 61.4294 KOps/s | 64.5996 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2696ms | 0.1015ms | 9.8486 KOps/s | 10.3290 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1352ms | 16.7865μs | 59.5717 KOps/s | 63.7798 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2926ms | 0.1070ms | 9.3452 KOps/s | 9.9304 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5833ms | 19.1107μs | 52.3268 KOps/s | 58.1756 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2603ms | 99.8558μs | 10.0144 KOps/s | 10.3319 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1263ms | 16.7068μs | 59.8560 KOps/s | 61.8065 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2582ms | 0.1020ms | 9.8051 KOps/s | 9.7960 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.2223ms | 16.7022μs | 59.8724 KOps/s | 61.5031 KOps/s | |
test_mod_add[eager] | 0.1877ms | 39.8320μs | 25.1055 KOps/s | 25.2582 KOps/s | |
test_mod_add[compile] | 0.2110ms | 82.6614μs | 12.0975 KOps/s | 11.2279 KOps/s | |
test_mod_add[compile-overhead] | 0.3283ms | 0.1684ms | 5.9370 KOps/s | 5.6790 KOps/s | |
test_mod_wrap[eager] | 0.4053ms | 0.2537ms | 3.9422 KOps/s | 3.7847 KOps/s | |
test_mod_wrap[compile] | 0.4363ms | 0.2917ms | 3.4280 KOps/s | 3.2343 KOps/s | |
test_mod_wrap[compile-overhead] | 7.5996ms | 3.7919ms | 263.7204 Ops/s | 270.6801 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6206ms | 1.4152ms | 706.6201 Ops/s | 687.1558 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4535ms | 1.2846ms | 778.4433 Ops/s | 718.9665 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3659ms | 0.9332ms | 1.0716 KOps/s | 960.1462 Ops/s | |
test_seq_add[eager] | 0.2581ms | 0.1190ms | 8.4046 KOps/s | 8.5039 KOps/s | |
test_seq_add[compile] | 0.2254ms | 95.0717μs | 10.5184 KOps/s | 11.1619 KOps/s | |
test_seq_add[compile-overhead] | 0.2838ms | 0.1308ms | 7.6447 KOps/s | 7.6987 KOps/s | |
test_seq_wrap[eager] | 0.5733ms | 0.4338ms | 2.3052 KOps/s | 2.3300 KOps/s | |
test_seq_wrap[compile] | 0.4735ms | 0.3130ms | 3.1945 KOps/s | 3.0954 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2909ms | 0.2260ms | 4.4251 KOps/s | 4.3970 KOps/s | |
test_func_call_runtime[False-eager] | 0.9139ms | 0.7486ms | 1.3358 KOps/s | 1.3403 KOps/s | |
test_func_call_runtime[False-compile] | 0.8989ms | 0.7521ms | 1.3297 KOps/s | 1.3155 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4947ms | 0.3677ms | 2.7199 KOps/s | 2.7145 KOps/s | |
test_func_call_runtime[True-eager] | 1.0592ms | 0.9077ms | 1.1017 KOps/s | 1.0968 KOps/s | |
test_func_call_runtime[True-compile] | 0.9819ms | 0.7773ms | 1.2865 KOps/s | 1.2972 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5229ms | 0.3918ms | 2.5523 KOps/s | 2.5823 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9574ms | 0.7641ms | 1.3087 KOps/s | 1.3519 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9061ms | 0.7600ms | 1.3158 KOps/s | 1.3150 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5059ms | 0.3690ms | 2.7099 KOps/s | 2.7092 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1730ms | 1.0280ms | 972.7604 Ops/s | 979.2626 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1726ms | 1.0132ms | 986.9469 Ops/s | 996.6806 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2653ms | 1.0421ms | 959.6352 Ops/s | 994.6242 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5156ms | 2.1076ms | 474.4737 Ops/s | 471.1000 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9714ms | 0.8243ms | 1.2131 KOps/s | 1.1954 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5948ms | 0.4205ms | 2.3779 KOps/s | 2.3677 KOps/s | |
test_distributed | 1.6090ms | 0.1272ms | 7.8635 KOps/s | 8.7134 KOps/s | |
test_tdmodule | 0.4366ms | 21.2195μs | 47.1264 KOps/s | 48.7005 KOps/s | |
test_tdmodule_dispatch | 60.4610μs | 37.2048μs | 26.8783 KOps/s | 27.2743 KOps/s | |
test_tdseq | 46.4710μs | 21.3844μs | 46.7630 KOps/s | 46.9330 KOps/s | |
test_tdseq_dispatch | 70.9910μs | 41.0013μs | 24.3895 KOps/s | 25.2389 KOps/s | |
test_instantiation_functorch | 1.7112ms | 1.5700ms | 636.9629 Ops/s | 646.1420 Ops/s | |
test_exec_functorch | 0.2280ms | 0.1483ms | 6.7444 KOps/s | 6.9206 KOps/s | |
test_exec_functional_call | 0.3084ms | 0.1450ms | 6.8944 KOps/s | 7.1521 KOps/s | |
test_exec_td_decorator | 0.3758ms | 0.1955ms | 5.1162 KOps/s | 5.3051 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8625ms | 0.7177ms | 1.3933 KOps/s | 1.4543 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8770ms | 0.7235ms | 1.3822 KOps/s | 1.4450 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8035ms | 0.6271ms | 1.5947 KOps/s | 1.6788 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8275ms | 0.6265ms | 1.5961 KOps/s | 1.6710 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.9808ms | 19.3037ms | 51.8036 Ops/s | 51.8766 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.4860ms | 19.3239ms | 51.7495 Ops/s | 51.8664 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.4336ms | 19.1355ms | 52.2590 Ops/s | 52.3783 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.7883ms | 19.1928ms | 52.1029 Ops/s | 52.4328 Ops/s | |
test_to_module_speed[True] | 1.4607ms | 0.9698ms | 1.0311 KOps/s | 1.0304 KOps/s | |
test_to_module_speed[False] | 1.0912ms | 0.9520ms | 1.0505 KOps/s | 1.0538 KOps/s | |
test_tc_init | 67.0010μs | 38.6101μs | 25.9000 KOps/s | 26.9750 KOps/s | |
test_tc_init_nested | 0.1609ms | 79.4933μs | 12.5797 KOps/s | 13.2085 KOps/s | |
test_tc_first_layer_tensor | 17.5900μs | 0.7965μs | 1.2554 MOps/s | 1.4472 MOps/s | |
test_tc_first_layer_nontensor | 32.3800μs | 2.2359μs | 447.2437 KOps/s | 456.6678 KOps/s | |
test_tc_second_layer_tensor | 8.6300μs | 1.4229μs | 702.8093 KOps/s | 713.8234 KOps/s | |
test_tc_second_layer_nontensor | 31.1100μs | 2.9731μs | 336.3462 KOps/s | 338.4185 KOps/s | |
test_unbind | 0.2188s | 12.1843ms | 82.0726 Ops/s | 142.5370 Ops/s | |
test_full_like | 10.0911ms | 9.5676ms | 104.5191 Ops/s | 102.7300 Ops/s | |
test_zeros_like | 9.4031ms | 7.3277ms | 136.4687 Ops/s | 113.7315 Ops/s | |
test_ones_like | 5.0540ms | 4.4157ms | 226.4661 Ops/s | 227.6206 Ops/s | |
test_clone | 7.2782ms | 6.8243ms | 146.5360 Ops/s | 147.8310 Ops/s | |
test_squeeze | 69.0120μs | 11.6862μs | 85.5711 KOps/s | 103.5645 KOps/s | |
test_unsqueeze | 0.2120ms | 78.2453μs | 12.7803 KOps/s | 13.8123 KOps/s | |
test_split | 0.3732ms | 0.1700ms | 5.8819 KOps/s | 5.9852 KOps/s | |
test_permute | 0.3309ms | 0.1976ms | 5.0595 KOps/s | 5.3414 KOps/s | |
test_stack | 51.7732ms | 51.2273ms | 19.5208 Ops/s | 19.4379 Ops/s | |
test_cat | 51.4495ms | 51.0807ms | 19.5769 Ops/s | 19.4411 Ops/s |
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):