[Deprecation] Softly deprecate extra-tensors wrt out_keys #1215

Merged 1 commit on Feb 10, 2025

@vmoens commented Feb 10, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: aea9814fecbab903ad22ae54903a2921f4b88c5b
Pull Request resolved: #1215
@facebook-github-bot added the CLA Signed label Feb 10, 2025
@vmoens merged commit 6e97e7a into gh/vmoens/47/base Feb 10, 2025
12 of 25 checks passed
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: aea9814fecbab903ad22ae54903a2921f4b88c5b
Pull Request resolved: #1215
@vmoens deleted the gh/vmoens/47/head branch February 10, 2025 11:24

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 43.3310μs 21.3710μs 46.7923 KOps/s 48.1313 KOps/s $\color{#d91a1a}-2.78\%$
test_plain_set_stack_nested 51.1450μs 21.4281μs 46.6677 KOps/s 47.6782 KOps/s $\color{#d91a1a}-2.12\%$
test_plain_set_nested_inplace 51.1960μs 23.2164μs 43.0731 KOps/s 43.8191 KOps/s $\color{#d91a1a}-1.70\%$
test_plain_set_stack_nested_inplace 50.9150μs 23.2637μs 42.9855 KOps/s 43.8642 KOps/s $\color{#d91a1a}-2.00\%$
test_items 28.3730μs 4.3420μs 230.3108 KOps/s 245.0166 KOps/s $\textbf{\color{#d91a1a}-6.00\%}$
test_items_nested 0.6938ms 0.4114ms 2.4305 KOps/s 2.4038 KOps/s $\color{#35bf28}+1.11\%$
test_items_nested_locked 0.5307ms 0.4108ms 2.4341 KOps/s 2.4015 KOps/s $\color{#35bf28}+1.36\%$
test_items_nested_leaf 0.1475ms 77.2209μs 12.9499 KOps/s 12.7474 KOps/s $\color{#35bf28}+1.59\%$
test_items_stack_nested 0.5810ms 0.4130ms 2.4213 KOps/s 2.3955 KOps/s $\color{#35bf28}+1.08\%$
test_items_stack_nested_leaf 0.1448ms 81.3205μs 12.2970 KOps/s 12.2901 KOps/s $\color{#35bf28}+0.06\%$
test_items_stack_nested_locked 0.5457ms 0.4118ms 2.4281 KOps/s 2.3745 KOps/s $\color{#35bf28}+2.26\%$
test_keys 23.1540μs 3.4466μs 290.1441 KOps/s 286.8820 KOps/s $\color{#35bf28}+1.14\%$
test_keys_nested 0.2668ms 0.1645ms 6.0805 KOps/s 6.0792 KOps/s $\color{#35bf28}+0.02\%$
test_keys_nested_locked 1.6564ms 0.1713ms 5.8390 KOps/s 5.8282 KOps/s $\color{#35bf28}+0.18\%$
test_keys_nested_leaf 0.2387ms 0.1440ms 6.9422 KOps/s 6.9806 KOps/s $\color{#d91a1a}-0.55\%$
test_keys_stack_nested 0.2598ms 0.1631ms 6.1323 KOps/s 6.1011 KOps/s $\color{#35bf28}+0.51\%$
test_keys_stack_nested_leaf 0.2299ms 0.1410ms 7.0918 KOps/s 6.9727 KOps/s $\color{#35bf28}+1.71\%$
test_keys_stack_nested_locked 0.3007ms 0.1667ms 5.9973 KOps/s 5.8444 KOps/s $\color{#35bf28}+2.62\%$
test_values 5.7408μs 1.0348μs 966.4031 KOps/s 935.0607 KOps/s $\color{#35bf28}+3.35\%$
test_values_nested 0.1141ms 61.6739μs 16.2143 KOps/s 16.0833 KOps/s $\color{#35bf28}+0.81\%$
test_values_nested_locked 0.1306ms 61.7398μs 16.1970 KOps/s 16.1327 KOps/s $\color{#35bf28}+0.40\%$
test_values_nested_leaf 0.1229ms 70.6218μs 14.1599 KOps/s 14.0817 KOps/s $\color{#35bf28}+0.56\%$
test_values_stack_nested 0.1104ms 63.5642μs 15.7321 KOps/s 15.8564 KOps/s $\color{#d91a1a}-0.78\%$
test_values_stack_nested_leaf 0.1400ms 70.6894μs 14.1464 KOps/s 13.9895 KOps/s $\color{#35bf28}+1.12\%$
test_values_stack_nested_locked 0.1200ms 63.9920μs 15.6270 KOps/s 15.8084 KOps/s $\color{#d91a1a}-1.15\%$
test_membership 4.8547μs 0.7086μs 1.4112 MOps/s 1.4328 MOps/s $\color{#d91a1a}-1.50\%$
test_membership_nested 23.5240μs 2.8826μs 346.9105 KOps/s 339.9983 KOps/s $\color{#35bf28}+2.03\%$
test_membership_nested_leaf 18.1740μs 2.9053μs 344.1980 KOps/s 339.7454 KOps/s $\color{#35bf28}+1.31\%$
test_membership_stacked_nested 30.8180μs 2.9344μs 340.7873 KOps/s 341.8891 KOps/s $\color{#d91a1a}-0.32\%$
test_membership_stacked_nested_leaf 68.2550μs 2.8743μs 347.9104 KOps/s 334.7921 KOps/s $\color{#35bf28}+3.92\%$
test_membership_nested_last 29.9560μs 4.2887μs 233.1733 KOps/s 227.7165 KOps/s $\color{#35bf28}+2.40\%$
test_membership_nested_leaf_last 24.1550μs 4.3317μs 230.8556 KOps/s 225.1217 KOps/s $\color{#35bf28}+2.55\%$
test_membership_stacked_nested_last 30.0060μs 8.5327μs 117.1964 KOps/s 195.8939 KOps/s $\textbf{\color{#d91a1a}-40.17\%}$
test_membership_stacked_nested_leaf_last 35.0250μs 8.4269μs 118.6676 KOps/s 191.7091 KOps/s $\textbf{\color{#d91a1a}-38.10\%}$
test_nested_getleaf 32.4900μs 10.7751μs 92.8066 KOps/s 92.3074 KOps/s $\color{#35bf28}+0.54\%$
test_nested_get 30.5570μs 10.3064μs 97.0272 KOps/s 97.9345 KOps/s $\color{#d91a1a}-0.93\%$
test_stacked_getleaf 39.2330μs 10.7892μs 92.6852 KOps/s 92.8973 KOps/s $\color{#d91a1a}-0.23\%$
test_stacked_get 31.8990μs 10.3015μs 97.0734 KOps/s 98.6768 KOps/s $\color{#d91a1a}-1.62\%$
test_nested_getitemleaf 31.9200μs 11.5250μs 86.7676 KOps/s 86.4482 KOps/s $\color{#35bf28}+0.37\%$
test_nested_getitem 54.2110μs 10.9338μs 91.4593 KOps/s 90.5801 KOps/s $\color{#35bf28}+0.97\%$
test_stacked_getitemleaf 33.4620μs 11.4672μs 87.2053 KOps/s 86.4730 KOps/s $\color{#35bf28}+0.85\%$
test_stacked_getitem 44.9430μs 10.8217μs 92.4070 KOps/s 91.2542 KOps/s $\color{#35bf28}+1.26\%$
test_lock_nested 0.6054ms 0.4167ms 2.3999 KOps/s 2.3437 KOps/s $\color{#35bf28}+2.40\%$
test_lock_stack_nested 0.6200ms 0.4246ms 2.3554 KOps/s 2.2561 KOps/s $\color{#35bf28}+4.40\%$
test_unlock_nested 0.5708ms 0.3401ms 2.9403 KOps/s 2.9660 KOps/s $\color{#d91a1a}-0.86\%$
test_unlock_stack_nested 0.5123ms 0.3404ms 2.9373 KOps/s 2.8900 KOps/s $\color{#35bf28}+1.64\%$
test_flatten_speed 0.1796ms 0.1024ms 9.7622 KOps/s 9.9143 KOps/s $\color{#d91a1a}-1.53\%$
test_unflatten_speed 0.9475ms 0.5312ms 1.8826 KOps/s 1.8960 KOps/s $\color{#d91a1a}-0.71\%$
test_common_ops 5.0613ms 0.8424ms 1.1871 KOps/s 1.2117 KOps/s $\color{#d91a1a}-2.03\%$
test_creation 41.1360μs 2.4747μs 404.0858 KOps/s 400.2908 KOps/s $\color{#35bf28}+0.95\%$
test_creation_empty 31.6090μs 13.0545μs 76.6020 KOps/s 85.4985 KOps/s $\textbf{\color{#d91a1a}-10.41\%}$
test_creation_nested_1 48.5600μs 16.1709μs 61.8396 KOps/s 67.6186 KOps/s $\textbf{\color{#d91a1a}-8.55\%}$
test_creation_nested_2 52.3180μs 20.6180μs 48.5014 KOps/s 51.4094 KOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_clone 60.2920μs 13.4458μs 74.3727 KOps/s 72.5388 KOps/s $\color{#35bf28}+2.53\%$
test_getitem[int] 0.8615ms 12.8598μs 77.7617 KOps/s 77.3698 KOps/s $\color{#35bf28}+0.51\%$
test_getitem[slice_int] 0.1307ms 24.7161μs 40.4595 KOps/s 39.4523 KOps/s $\color{#35bf28}+2.55\%$
test_getitem[range] 0.1638ms 49.7695μs 20.0926 KOps/s 19.2031 KOps/s $\color{#35bf28}+4.63\%$
test_getitem[tuple] 0.1282ms 20.4115μs 48.9920 KOps/s 48.9847 KOps/s $\color{#35bf28}+0.01\%$
test_getitem[list] 0.1770ms 46.1951μs 21.6473 KOps/s 21.2065 KOps/s $\color{#35bf28}+2.08\%$
test_setitem_dim[int] 52.2580μs 26.6046μs 37.5876 KOps/s 36.5249 KOps/s $\color{#35bf28}+2.91\%$
test_setitem_dim[slice_int] 91.6110μs 51.8526μs 19.2854 KOps/s 18.8343 KOps/s $\color{#35bf28}+2.40\%$
test_setitem_dim[range] 0.1428ms 79.8524μs 12.5231 KOps/s 12.6229 KOps/s $\color{#d91a1a}-0.79\%$
test_setitem_dim[tuple] 75.0500μs 41.9956μs 23.8120 KOps/s 23.2848 KOps/s $\color{#35bf28}+2.26\%$
test_setitem 0.1095ms 21.1218μs 47.3445 KOps/s 47.0468 KOps/s $\color{#35bf28}+0.63\%$
test_set 0.2745ms 20.5772μs 48.5974 KOps/s 48.8129 KOps/s $\color{#d91a1a}-0.44\%$
test_set_shared 0.4126ms 0.1821ms 5.4923 KOps/s 5.4812 KOps/s $\color{#35bf28}+0.20\%$
test_update 0.1275ms 24.1188μs 41.4615 KOps/s 43.1791 KOps/s $\color{#d91a1a}-3.98\%$
test_update_nested 93.5640μs 33.9564μs 29.4495 KOps/s 29.3765 KOps/s $\color{#35bf28}+0.25\%$
test_update__nested 0.3806ms 34.5468μs 28.9462 KOps/s 28.8352 KOps/s $\color{#35bf28}+0.39\%$
test_set_nested 77.1040μs 22.9178μs 43.6342 KOps/s 43.0807 KOps/s $\color{#35bf28}+1.28\%$
test_set_nested_new 94.0350μs 28.1500μs 35.5240 KOps/s 36.0024 KOps/s $\color{#d91a1a}-1.33\%$
test_select 0.1332ms 44.9414μs 22.2512 KOps/s 22.2918 KOps/s $\color{#d91a1a}-0.18\%$
test_select_nested 0.1243ms 66.9719μs 14.9316 KOps/s 14.9684 KOps/s $\color{#d91a1a}-0.25\%$
test_exclude_nested 0.1529ms 83.9536μs 11.9113 KOps/s 11.5112 KOps/s $\color{#35bf28}+3.48\%$
test_empty[True] 0.7656ms 0.4171ms 2.3976 KOps/s 2.4171 KOps/s $\color{#d91a1a}-0.81\%$
test_empty[False] 7.3260μs 1.3888μs 720.0706 KOps/s 708.5516 KOps/s $\color{#35bf28}+1.63\%$
test_unbind_speed 0.5497ms 0.2750ms 3.6362 KOps/s 3.6613 KOps/s $\color{#d91a1a}-0.69\%$
test_unbind_speed_stack0 0.3963ms 0.2690ms 3.7169 KOps/s 3.7217 KOps/s $\color{#d91a1a}-0.13\%$
test_unbind_speed_stack1 95.5298ms 0.7179ms 1.3930 KOps/s 1.1404 KOps/s $\textbf{\color{#35bf28}+22.15\%}$
test_split 99.0242ms 1.7554ms 569.6684 Ops/s 615.5554 Ops/s $\textbf{\color{#d91a1a}-7.45\%}$
test_chunk 0.1033s 1.7726ms 564.1571 Ops/s 516.8340 Ops/s $\textbf{\color{#35bf28}+9.16\%}$
test_consolidate_njt[False-None] 10.8517ms 8.3279ms 120.0784 Ops/s 121.2230 Ops/s $\color{#d91a1a}-0.94\%$
test_creation[device0] 0.2089ms 91.9568μs 10.8747 KOps/s 10.5707 KOps/s $\color{#35bf28}+2.88\%$
test_creation_from_tensor 3.9867ms 97.0200μs 10.3072 KOps/s 10.4504 KOps/s $\color{#d91a1a}-1.37\%$
test_add_one[memmap_tensor0] 0.2033ms 4.9789μs 200.8496 KOps/s 206.1165 KOps/s $\color{#d91a1a}-2.56\%$
test_contiguous[memmap_tensor0] 10.2400μs 0.5057μs 1.9776 MOps/s 1.9658 MOps/s $\color{#35bf28}+0.60\%$
test_stack[memmap_tensor0] 29.5350μs 3.3455μs 298.9052 KOps/s 280.9691 KOps/s $\textbf{\color{#35bf28}+6.38\%}$
test_memmaptd_index 0.4800ms 0.2318ms 4.3143 KOps/s 4.3878 KOps/s $\color{#d91a1a}-1.68\%$
test_memmaptd_index_astensor 0.5227ms 0.3226ms 3.0999 KOps/s 3.1626 KOps/s $\color{#d91a1a}-1.98\%$
test_memmaptd_index_op 0.8423ms 0.6140ms 1.6286 KOps/s 1.6835 KOps/s $\color{#d91a1a}-3.26\%$
test_serialize_model 0.1268s 0.1173s 8.5216 Ops/s 8.8208 Ops/s $\color{#d91a1a}-3.39\%$
test_serialize_model_pickle 0.4982s 0.3940s 2.5382 Ops/s 2.5845 Ops/s $\color{#d91a1a}-1.79\%$
test_serialize_weights 0.1195s 0.1150s 8.6972 Ops/s 8.6977 Ops/s $-0.01\%$
test_serialize_weights_returnearly 0.1711s 0.1580s 6.3273 Ops/s 6.1783 Ops/s $\color{#35bf28}+2.41\%$
test_serialize_weights_pickle 0.6249s 0.4389s 2.2784 Ops/s 2.4937 Ops/s $\textbf{\color{#d91a1a}-8.63\%}$
test_serialize_weights_filesystem 0.1560s 0.1440s 6.9426 Ops/s 6.9698 Ops/s $\color{#d91a1a}-0.39\%$
test_serialize_model_filesystem 0.2461s 0.1622s 6.1635 Ops/s 6.3125 Ops/s $\color{#d91a1a}-2.36\%$
test_reshape_pytree 66.1730μs 27.0710μs 36.9399 KOps/s 37.4476 KOps/s $\color{#d91a1a}-1.36\%$
test_reshape_td 77.3640μs 33.9069μs 29.4926 KOps/s 29.9231 KOps/s $\color{#d91a1a}-1.44\%$
test_view_pytree 66.1230μs 26.5271μs 37.6973 KOps/s 38.0685 KOps/s $\color{#d91a1a}-0.98\%$
test_view_td 93.8850μs 40.0532μs 24.9668 KOps/s 24.7597 KOps/s $\color{#35bf28}+0.84\%$
test_unbind_pytree 0.1155ms 30.0233μs 33.3075 KOps/s 33.5256 KOps/s $\color{#d91a1a}-0.65\%$
test_unbind_td 0.3384ms 40.5318μs 24.6720 KOps/s 24.0447 KOps/s $\color{#35bf28}+2.61\%$
test_split_pytree 88.6050μs 29.4718μs 33.9307 KOps/s 33.8844 KOps/s $\color{#35bf28}+0.14\%$
test_split_td 0.5060ms 45.3180μs 22.0663 KOps/s 21.1685 KOps/s $\color{#35bf28}+4.24\%$
test_add_pytree 81.7020μs 35.8588μs 27.8871 KOps/s 27.2007 KOps/s $\color{#35bf28}+2.52\%$
test_add_td 0.1347ms 61.7450μs 16.1956 KOps/s 17.3856 KOps/s $\textbf{\color{#d91a1a}-6.84\%}$
test_compile_add_one_nested[tensordict-compile] 0.1323ms 66.8135μs 14.9670 KOps/s 14.7189 KOps/s $\color{#35bf28}+1.69\%$
test_compile_add_one_nested[tensordict-eager] 1.3513ms 0.1814ms 5.5115 KOps/s 5.7187 KOps/s $\color{#d91a1a}-3.62\%$
test_compile_add_one_nested[pytree-compile] 0.1118ms 46.0288μs 21.7255 KOps/s 21.4720 KOps/s $\color{#35bf28}+1.18\%$
test_compile_add_one_nested[pytree-eager] 0.3228ms 0.1208ms 8.2795 KOps/s 8.3235 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_copy_nested[tensordict-compile] 75.6410μs 28.3803μs 35.2358 KOps/s 36.1796 KOps/s $\color{#d91a1a}-2.61\%$
test_compile_copy_nested[tensordict-eager] 0.1420ms 61.6690μs 16.2156 KOps/s 16.5230 KOps/s $\color{#d91a1a}-1.86\%$
test_compile_copy_nested[pytree-compile] 0.1433ms 81.2817μs 12.3029 KOps/s 12.0673 KOps/s $\color{#35bf28}+1.95\%$
test_compile_copy_nested[pytree-eager] 0.1530ms 67.8731μs 14.7334 KOps/s 14.7465 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_add_one_flat[tensordict-compile] 0.2391ms 0.1082ms 9.2423 KOps/s 9.3309 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_add_one_flat[tensordict-eager] 0.4600ms 0.2238ms 4.4686 KOps/s 4.5791 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_add_one_flat[tensorclass-compile] 0.1231ms 47.7600μs 20.9380 KOps/s 21.0714 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_add_one_flat[tensorclass-eager] 0.1734ms 70.0648μs 14.2725 KOps/s 14.5360 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_add_one_flat[pytree-compile] 0.1812ms 0.1005ms 9.9463 KOps/s 9.8629 KOps/s $\color{#35bf28}+0.84\%$
test_compile_add_one_flat[pytree-eager] 0.3354ms 0.2014ms 4.9647 KOps/s 4.9377 KOps/s $\color{#35bf28}+0.55\%$
test_compile_add_self_flat[tensordict-eager] 0.3760ms 0.2392ms 4.1807 KOps/s 4.2951 KOps/s $\color{#d91a1a}-2.66\%$
test_compile_add_self_flat[tensordict-compile] 0.2268ms 0.1096ms 9.1201 KOps/s 9.1040 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_self_flat[tensorclass-eager] 0.1992ms 65.3362μs 15.3054 KOps/s 15.6033 KOps/s $\color{#d91a1a}-1.91\%$
test_compile_add_self_flat[tensorclass-compile] 0.1096ms 49.0403μs 20.3914 KOps/s 20.3436 KOps/s $\color{#35bf28}+0.23\%$
test_compile_add_self_flat[pytree-eager] 0.3075ms 0.1586ms 6.3036 KOps/s 6.3412 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_self_flat[pytree-compile] 0.2332ms 0.1017ms 9.8328 KOps/s 9.8141 KOps/s $\color{#35bf28}+0.19\%$
test_compile_copy_flat[tensordict-compile] 74.8690μs 21.8762μs 45.7117 KOps/s 46.6592 KOps/s $\color{#d91a1a}-2.03\%$
test_compile_copy_flat[tensordict-eager] 0.1362ms 67.6554μs 14.7808 KOps/s 14.7257 KOps/s $\color{#35bf28}+0.37\%$
test_compile_copy_flat[pytree-compile] 0.1958ms 85.7207μs 11.6658 KOps/s 11.6609 KOps/s $\color{#35bf28}+0.04\%$
test_compile_copy_flat[pytree-eager] 0.1278ms 67.5033μs 14.8141 KOps/s 14.8371 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_assign_and_add[tensordict-compile] 0.3269ms 0.2201ms 4.5444 KOps/s 4.5943 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_assign_and_add[tensordict-eager] 1.6026ms 1.4061ms 711.2083 Ops/s 716.7199 Ops/s $\color{#d91a1a}-0.77\%$
test_compile_assign_and_add[pytree-compile] 0.2932ms 0.2122ms 4.7131 KOps/s 4.6635 KOps/s $\color{#35bf28}+1.06\%$
test_compile_assign_and_add[pytree-eager] 1.4159ms 0.8286ms 1.2069 KOps/s 1.2109 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_assign_and_add_stack[compile] 0.5802ms 0.4669ms 2.1417 KOps/s 2.1872 KOps/s $\color{#d91a1a}-2.08\%$
test_compile_assign_and_add_stack[eager] 3.9160ms 2.9478ms 339.2371 Ops/s 363.5684 Ops/s $\textbf{\color{#d91a1a}-6.69\%}$
test_compile_indexing[tensor-tensordict-compile] 98.0830μs 38.8670μs 25.7288 KOps/s 25.6671 KOps/s $\color{#35bf28}+0.24\%$
test_compile_indexing[tensor-tensordict-eager] 0.5507ms 33.4039μs 29.9366 KOps/s 29.7643 KOps/s $\color{#35bf28}+0.58\%$
test_compile_indexing[tensor-tensorclass-compile] 96.7700μs 31.0774μs 32.1778 KOps/s 31.4648 KOps/s $\color{#35bf28}+2.27\%$
test_compile_indexing[tensor-tensorclass-eager] 88.0770μs 23.2338μs 43.0408 KOps/s 42.7945 KOps/s $\color{#35bf28}+0.58\%$
test_compile_indexing[tensor-pytree-compile] 84.9960μs 31.5509μs 31.6948 KOps/s 31.2271 KOps/s $\color{#35bf28}+1.50\%$
test_compile_indexing[tensor-pytree-eager] 63.0880μs 23.2430μs 43.0237 KOps/s 42.8499 KOps/s $\color{#35bf28}+0.41\%$
test_compile_indexing[slice-tensordict-compile] 0.1251ms 55.4596μs 18.0311 KOps/s 19.2090 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_compile_indexing[slice-tensordict-eager] 0.3754ms 20.5212μs 48.7302 KOps/s 48.3000 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[slice-tensorclass-compile] 0.1200ms 45.5231μs 21.9669 KOps/s 22.0394 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_indexing[slice-tensorclass-eager] 51.9870μs 18.7621μs 53.2989 KOps/s 53.5156 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_indexing[slice-pytree-compile] 0.1218ms 46.4500μs 21.5285 KOps/s 21.4650 KOps/s $\color{#35bf28}+0.30\%$
test_compile_indexing[slice-pytree-eager] 80.5500μs 18.7551μs 53.3187 KOps/s 53.2381 KOps/s $\color{#35bf28}+0.15\%$
test_compile_indexing[int-tensordict-compile] 0.1149ms 55.6250μs 17.9775 KOps/s 18.4435 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_indexing[int-tensordict-eager] 1.0420ms 19.8880μs 50.2816 KOps/s 48.3592 KOps/s $\color{#35bf28}+3.98\%$
test_compile_indexing[int-tensorclass-compile] 0.1089ms 46.5293μs 21.4918 KOps/s 21.5668 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_indexing[int-tensorclass-eager] 61.6950μs 18.7510μs 53.3304 KOps/s 53.5582 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_indexing[int-pytree-compile] 0.1148ms 46.9113μs 21.3168 KOps/s 21.6464 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_indexing[int-pytree-eager] 0.5180ms 18.8653μs 53.0073 KOps/s 53.5053 KOps/s $\color{#d91a1a}-0.93\%$
test_mod_add[eager] 88.6350μs 38.2279μs 26.1589 KOps/s 26.0632 KOps/s $\color{#35bf28}+0.37\%$
test_mod_add[compile] 0.1253ms 66.6694μs 14.9994 KOps/s 15.1588 KOps/s $\color{#d91a1a}-1.05\%$
test_mod_add[compile-overhead] 0.1261ms 64.7627μs 15.4410 KOps/s 15.0260 KOps/s $\color{#35bf28}+2.76\%$
test_mod_wrap[eager] 0.4769ms 0.2330ms 4.2922 KOps/s 4.3709 KOps/s $\color{#d91a1a}-1.80\%$
test_mod_wrap[compile] 1.8008ms 0.2362ms 4.2345 KOps/s 4.2825 KOps/s $\color{#d91a1a}-1.12\%$
test_mod_wrap[compile-overhead] 0.3698ms 0.2344ms 4.2664 KOps/s 4.3707 KOps/s $\color{#d91a1a}-2.39\%$
test_mod_wrap_and_backward[eager] 12.4441ms 10.8292ms 92.3428 Ops/s 91.3695 Ops/s $\color{#35bf28}+1.07\%$
test_mod_wrap_and_backward[compile] 12.2282ms 10.7982ms 92.6080 Ops/s 89.6509 Ops/s $\color{#35bf28}+3.30\%$
test_mod_wrap_and_backward[compile-overhead] 12.2085ms 10.8901ms 91.8269 Ops/s 88.3045 Ops/s $\color{#35bf28}+3.99\%$
test_seq_add[eager] 0.2639ms 0.1241ms 8.0559 KOps/s 7.8475 KOps/s $\color{#35bf28}+2.66\%$
test_seq_add[compile] 0.1395ms 78.2202μs 12.7844 KOps/s 12.6311 KOps/s $\color{#35bf28}+1.21\%$
test_seq_add[compile-overhead] 0.1759ms 77.9140μs 12.8347 KOps/s 13.3886 KOps/s $\color{#d91a1a}-4.14\%$
test_seq_wrap[eager] 0.5899ms 0.4672ms 2.1406 KOps/s 2.0497 KOps/s $\color{#35bf28}+4.44\%$
test_seq_wrap[compile] 0.4723ms 0.2515ms 3.9760 KOps/s 4.0287 KOps/s $\color{#d91a1a}-1.31\%$
test_seq_wrap[compile-overhead] 0.4047ms 0.2514ms 3.9772 KOps/s 4.0485 KOps/s $\color{#d91a1a}-1.76\%$
test_func_call_runtime[False-eager] 0.9721ms 0.5625ms 1.7778 KOps/s 1.8295 KOps/s $\color{#d91a1a}-2.83\%$
test_func_call_runtime[False-compile] 0.6164ms 0.4575ms 2.1859 KOps/s 2.2052 KOps/s $\color{#d91a1a}-0.88\%$
test_func_call_runtime[False-compile-overhead] 0.6036ms 0.4538ms 2.2036 KOps/s 2.2065 KOps/s $\color{#d91a1a}-0.13\%$
test_func_call_runtime[True-eager] 1.1696ms 0.7917ms 1.2631 KOps/s 1.3067 KOps/s $\color{#d91a1a}-3.34\%$
test_func_call_runtime[True-compile] 0.6076ms 0.4727ms 2.1157 KOps/s 2.1088 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_runtime[True-compile-overhead] 0.6104ms 0.4747ms 2.1065 KOps/s 2.0744 KOps/s $\color{#35bf28}+1.55\%$
test_func_call_cm_runtime[False-eager] 0.7442ms 0.5560ms 1.7986 KOps/s 1.8224 KOps/s $\color{#d91a1a}-1.31\%$
test_func_call_cm_runtime[False-compile] 0.6114ms 0.4578ms 2.1845 KOps/s 2.2010 KOps/s $\color{#d91a1a}-0.75\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5560ms 0.4572ms 2.1871 KOps/s 2.2214 KOps/s $\color{#d91a1a}-1.54\%$
test_func_call_cm_runtime[True-eager] 1.3177ms 0.9493ms 1.0534 KOps/s 1.0882 KOps/s $\color{#d91a1a}-3.20\%$
test_func_call_cm_runtime[True-compile] 1.0185ms 0.8261ms 1.2105 KOps/s 1.2286 KOps/s $\color{#d91a1a}-1.47\%$
test_func_call_cm_runtime[True-compile-overhead] 1.6251ms 0.8356ms 1.1967 KOps/s 1.2210 KOps/s $\color{#d91a1a}-1.98\%$
test_vmap_func_call_cm_runtime[eager] 2.5280ms 1.9799ms 505.0749 Ops/s 520.1111 Ops/s $\color{#d91a1a}-2.89\%$
test_vmap_func_call_cm_runtime[compile] 0.9738ms 0.5501ms 1.8179 KOps/s 1.8122 KOps/s $\color{#35bf28}+0.32\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7320ms 0.5471ms 1.8279 KOps/s 1.8241 KOps/s $\color{#35bf28}+0.21\%$
test_distributed 1.4158ms 0.1298ms 7.7048 KOps/s 7.8068 KOps/s $\color{#d91a1a}-1.31\%$
test_tdmodule 85.6400μs 28.3425μs 35.2827 KOps/s 35.4694 KOps/s $\color{#d91a1a}-0.53\%$
test_tdmodule_dispatch 82.7540μs 52.1372μs 19.1802 KOps/s 19.8095 KOps/s $\color{#d91a1a}-3.18\%$
test_tdseq 50.5640μs 30.7679μs 32.5014 KOps/s 33.5003 KOps/s $\color{#d91a1a}-2.98\%$
test_tdseq_dispatch 0.1136ms 57.2232μs 17.4754 KOps/s 18.2052 KOps/s $\color{#d91a1a}-4.01\%$
test_instantiation_functorch 1.7603ms 1.5580ms 641.8645 Ops/s 638.5101 Ops/s $\color{#35bf28}+0.53\%$
test_exec_functorch 0.4247ms 0.1864ms 5.3646 KOps/s 5.5140 KOps/s $\color{#d91a1a}-2.71\%$
test_exec_functional_call 0.3029ms 0.1759ms 5.6861 KOps/s 5.8878 KOps/s $\color{#d91a1a}-3.43\%$
test_exec_td_decorator 0.4909ms 0.2418ms 4.1363 KOps/s 4.3226 KOps/s $\color{#d91a1a}-4.31\%$
test_vmap_mlp_speed_decorator[True-True] 1.0492ms 0.6744ms 1.4827 KOps/s 1.5177 KOps/s $\color{#d91a1a}-2.30\%$
test_vmap_mlp_speed_decorator[True-False] 0.8290ms 0.6674ms 1.4983 KOps/s 1.5205 KOps/s $\color{#d91a1a}-1.45\%$
test_vmap_mlp_speed_decorator[False-True] 0.8165ms 0.5519ms 1.8120 KOps/s 1.9047 KOps/s $\color{#d91a1a}-4.87\%$
test_vmap_mlp_speed_decorator[False-False] 0.8855ms 0.5450ms 1.8348 KOps/s 1.8984 KOps/s $\color{#d91a1a}-3.35\%$
test_to_module_speed[True] 2.3487ms 1.4326ms 698.0414 Ops/s 749.1888 Ops/s $\textbf{\color{#d91a1a}-6.83\%}$
test_to_module_speed[False] 1.8893ms 1.3902ms 719.3219 Ops/s 754.7698 Ops/s $\color{#d91a1a}-4.70\%$
test_tc_init 0.1038ms 48.4296μs 20.6485 KOps/s 22.0767 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_tc_init_nested 0.1967ms 96.8261μs 10.3278 KOps/s 10.7866 KOps/s $\color{#d91a1a}-4.25\%$
test_tc_first_layer_tensor 25.8680μs 1.6573μs 603.3973 KOps/s 645.6368 KOps/s $\textbf{\color{#d91a1a}-6.54\%}$
test_tc_first_layer_nontensor 43.3000μs 4.8278μs 207.1328 KOps/s 208.5502 KOps/s $\color{#d91a1a}-0.68\%$
test_tc_second_layer_tensor 27.8420μs 3.0955μs 323.0471 KOps/s 346.3778 KOps/s $\textbf{\color{#d91a1a}-6.74\%}$
test_tc_second_layer_nontensor 45.5350μs 6.2853μs 159.1018 KOps/s 163.3695 KOps/s $\color{#d91a1a}-2.61\%$
test_unbind 0.2276s 13.1650ms 75.9592 Ops/s 66.7037 Ops/s $\textbf{\color{#35bf28}+13.88\%}$
test_full_like 8.8780ms 7.6398ms 130.8938 Ops/s 148.4967 Ops/s $\textbf{\color{#d91a1a}-11.85\%}$
test_zeros_like 5.6637ms 2.6929ms 371.3443 Ops/s 355.0984 Ops/s $\color{#35bf28}+4.58\%$
test_ones_like 4.8676ms 3.1439ms 318.0783 Ops/s 321.5912 Ops/s $\color{#d91a1a}-1.09\%$
test_clone 8.7937ms 6.5863ms 151.8294 Ops/s 206.2771 Ops/s $\textbf{\color{#d91a1a}-26.40\%}$
test_squeeze 69.5900μs 12.4290μs 80.4569 KOps/s 82.1673 KOps/s $\color{#d91a1a}-2.08\%$
test_unsqueeze 0.2654ms 95.4930μs 10.4720 KOps/s 10.9722 KOps/s $\color{#d91a1a}-4.56\%$
test_split 0.3738ms 0.1992ms 5.0189 KOps/s 4.9309 KOps/s $\color{#35bf28}+1.79\%$
test_permute 0.3627ms 0.2033ms 4.9182 KOps/s 4.9466 KOps/s $\color{#d91a1a}-0.57\%$
test_stack 31.5728ms 24.4547ms 40.8919 Ops/s 40.7289 Ops/s $\color{#35bf28}+0.40\%$
test_cat 28.6314ms 24.2570ms 41.2252 Ops/s 41.2218 Ops/s $+0.01\%$
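
For readers who want to check the Change column, it appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD. The sketch below is not the benchmark bot's code. The helper names are hypothetical, and the 5% cutoff for bold (counted) rows is an assumption inferred from the rows shown above, where the smallest bold change is 5.66% and the largest non-bold change is 4.87%.

```python
# Hypothetical helpers: a sketch of how the "Change" column can be
# recomputed from the two throughput columns of the table.

# Scale factors for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '230.3108 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (parse_ops(ops) / parse_ops(ops_head) - 1.0) * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; threshold_pct is an assumed cutoff, not a documented one."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # Values copied from the CPU rows test_items and test_unbind_speed_stack1.
    for name, ops, head in [
        ("test_items", "230.3108 KOps/s", "245.0166 KOps/s"),
        ("test_unbind_speed_stack1", "1.3930 KOps/s", "1.1404 KOps/s"),
    ]:
        change = relative_change(ops, head)
        print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under that assumed cutoff, counting the bold rows of the CPU table gives the 4 improved and 17 worsened benchmarks reported in the summary line.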

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 62.5410μs 13.9266μs 71.8049 KOps/s 70.5178 KOps/s $\color{#35bf28}+1.83\%$
test_plain_set_stack_nested 42.9310μs 13.8374μs 72.2679 KOps/s 70.3127 KOps/s $\color{#35bf28}+2.78\%$
test_plain_set_nested_inplace 67.9710μs 14.6674μs 68.1783 KOps/s 66.4561 KOps/s $\color{#35bf28}+2.59\%$
test_plain_set_stack_nested_inplace 46.2910μs 14.8036μs 67.5511 KOps/s 65.6589 KOps/s $\color{#35bf28}+2.88\%$
test_items 35.7110μs 2.8783μs 347.4262 KOps/s 343.3002 KOps/s $\color{#35bf28}+1.20\%$
test_items_nested 0.4223ms 0.3663ms 2.7303 KOps/s 2.7214 KOps/s $\color{#35bf28}+0.33\%$
test_items_nested_locked 0.4977ms 0.3671ms 2.7242 KOps/s 2.7276 KOps/s $\color{#d91a1a}-0.12\%$
test_items_nested_leaf 88.2520μs 57.7628μs 17.3122 KOps/s 17.2398 KOps/s $\color{#35bf28}+0.42\%$
test_items_stack_nested 0.4696ms 0.3647ms 2.7416 KOps/s 2.6882 KOps/s $\color{#35bf28}+1.99\%$
test_items_stack_nested_leaf 85.7320μs 58.3780μs 17.1297 KOps/s 17.0306 KOps/s $\color{#35bf28}+0.58\%$
test_items_stack_nested_locked 0.4184ms 0.3665ms 2.7288 KOps/s 2.7230 KOps/s $\color{#35bf28}+0.21\%$
test_keys 34.9510μs 3.4147μs 292.8503 KOps/s 287.8621 KOps/s $\color{#35bf28}+1.73\%$
test_keys_nested 0.1190ms 87.0728μs 11.4846 KOps/s 11.2895 KOps/s $\color{#35bf28}+1.73\%$
test_keys_nested_locked 0.7184ms 93.4433μs 10.7017 KOps/s 10.6539 KOps/s $\color{#35bf28}+0.45\%$
test_keys_nested_leaf 0.1076ms 78.2827μs 12.7742 KOps/s 12.7365 KOps/s $\color{#35bf28}+0.30\%$
test_keys_stack_nested 0.1218ms 88.1350μs 11.3462 KOps/s 11.3572 KOps/s $\color{#d91a1a}-0.10\%$
test_keys_stack_nested_leaf 0.1224ms 79.5279μs 12.5742 KOps/s 12.6456 KOps/s $\color{#d91a1a}-0.56\%$
test_keys_stack_nested_locked 0.1223ms 94.1789μs 10.6181 KOps/s 10.6136 KOps/s $\color{#35bf28}+0.04\%$
test_values 4.9568μs 0.8551μs 1.1694 MOps/s 1.1661 MOps/s $\color{#35bf28}+0.28\%$
test_values_nested 55.1810μs 37.5875μs 26.6046 KOps/s 26.0672 KOps/s $\color{#35bf28}+2.06\%$
test_values_nested_locked 67.7520μs 39.2893μs 25.4522 KOps/s 25.5994 KOps/s $\color{#d91a1a}-0.57\%$
test_values_nested_leaf 65.6610μs 41.7839μs 23.9327 KOps/s 23.8421 KOps/s $\color{#35bf28}+0.38\%$
test_values_stack_nested 70.2720μs 37.7436μs 26.4945 KOps/s 26.6148 KOps/s $\color{#d91a1a}-0.45\%$
test_values_stack_nested_leaf 79.3720μs 42.2139μs 23.6889 KOps/s 23.8899 KOps/s $\color{#d91a1a}-0.84\%$
test_values_stack_nested_locked 64.7020μs 39.5258μs 25.2999 KOps/s 25.3421 KOps/s $\color{#d91a1a}-0.17\%$
test_membership 1.8360μs 0.5029μs 1.9885 MOps/s 1.9517 MOps/s $\color{#35bf28}+1.89\%$
test_membership_nested 13.7705μs 2.0258μs 493.6226 KOps/s 481.8423 KOps/s $\color{#35bf28}+2.44\%$
test_membership_nested_leaf 17.0505μs 2.0364μs 491.0646 KOps/s 495.4601 KOps/s $\color{#d91a1a}-0.89\%$
test_membership_stacked_nested 36.4010μs 2.0836μs 479.9303 KOps/s 466.4179 KOps/s $\color{#35bf28}+2.90\%$
test_membership_stacked_nested_leaf 23.6300μs 2.0690μs 483.3156 KOps/s 475.5965 KOps/s $\color{#35bf28}+1.62\%$
test_membership_nested_last 32.7310μs 3.0666μs 326.0961 KOps/s 317.1918 KOps/s $\color{#35bf28}+2.81\%$
test_membership_nested_leaf_last 39.0210μs 3.0896μs 323.6647 KOps/s 314.0048 KOps/s $\color{#35bf28}+3.08\%$
test_membership_stacked_nested_last 38.2110μs 3.0989μs 322.6912 KOps/s 315.1818 KOps/s $\color{#35bf28}+2.38\%$
test_membership_stacked_nested_leaf_last 28.8210μs 3.0657μs 326.1863 KOps/s 318.9742 KOps/s $\color{#35bf28}+2.26\%$
test_nested_getleaf 43.7710μs 6.2145μs 160.9145 KOps/s 160.5498 KOps/s $\color{#35bf28}+0.23\%$
test_nested_get 52.1410μs 5.9109μs 169.1787 KOps/s 168.7537 KOps/s $\color{#35bf28}+0.25\%$
test_stacked_getleaf 36.3010μs 6.1618μs 162.2909 KOps/s 161.9195 KOps/s $\color{#35bf28}+0.23\%$
test_stacked_get 47.2110μs 5.8614μs 170.6086 KOps/s 171.8840 KOps/s $\color{#d91a1a}-0.74\%$
test_nested_getitemleaf 26.0310μs 6.4888μs 154.1122 KOps/s 152.7795 KOps/s $\color{#35bf28}+0.87\%$
test_nested_getitem 40.7810μs 6.1348μs 163.0050 KOps/s 163.1968 KOps/s $\color{#d91a1a}-0.12\%$
test_stacked_getitemleaf 34.7400μs 6.4172μs 155.8315 KOps/s 155.4149 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_getitem 30.3500μs 6.1200μs 163.3977 KOps/s 163.0534 KOps/s $\color{#35bf28}+0.21\%$
test_lock_nested 9.1400ms 0.3492ms 2.8641 KOps/s 2.9623 KOps/s $\color{#d91a1a}-3.32\%$
test_lock_stack_nested 0.3877ms 0.3448ms 2.9004 KOps/s 2.9246 KOps/s $\color{#d91a1a}-0.83\%$
test_unlock_nested 0.3908ms 0.2874ms 3.4792 KOps/s 3.6004 KOps/s $\color{#d91a1a}-3.37\%$
test_unlock_stack_nested 0.3293ms 0.2846ms 3.5132 KOps/s 3.5655 KOps/s $\color{#d91a1a}-1.47\%$
test_flatten_speed 0.1107ms 74.5654μs 13.4110 KOps/s 13.1648 KOps/s $\color{#35bf28}+1.87\%$
test_unflatten_speed 0.3775ms 0.3252ms 3.0754 KOps/s 3.0229 KOps/s $\color{#35bf28}+1.74\%$
test_common_ops 0.8003ms 0.6624ms 1.5096 KOps/s 1.5049 KOps/s $\color{#35bf28}+0.31\%$
test_creation 0.1171ms 1.7167μs 582.5281 KOps/s 571.7236 KOps/s $\color{#35bf28}+1.89\%$
test_creation_empty 37.8910μs 10.7556μs 92.9751 KOps/s 90.4590 KOps/s $\color{#35bf28}+2.78\%$
test_creation_nested_1 38.2810μs 12.4817μs 80.1176 KOps/s 77.1707 KOps/s $\color{#35bf28}+3.82\%$
test_creation_nested_2 54.4110μs 15.1789μs 65.8809 KOps/s 64.9880 KOps/s $\color{#35bf28}+1.37\%$
test_clone 46.3510μs 9.8771μs 101.2440 KOps/s 98.2059 KOps/s $\color{#35bf28}+3.09\%$
test_getitem[int] 1.1939ms 10.8255μs 92.3747 KOps/s 94.6235 KOps/s $\color{#d91a1a}-2.38\%$
test_getitem[slice_int] 0.1116ms 21.2525μs 47.0533 KOps/s 48.5551 KOps/s $\color{#d91a1a}-3.09\%$
test_getitem[range] 0.1267ms 37.0554μs 26.9866 KOps/s 27.3236 KOps/s $\color{#d91a1a}-1.23\%$
test_getitem[tuple] 0.1133ms 18.3992μs 54.3501 KOps/s 53.7946 KOps/s $\color{#35bf28}+1.03\%$
test_getitem[list] 0.1313ms 32.6886μs 30.5917 KOps/s 30.9300 KOps/s $\color{#d91a1a}-1.09\%$
test_setitem_dim[int] 40.3510μs 19.2270μs 52.0102 KOps/s 52.8991 KOps/s $\color{#d91a1a}-1.68\%$
test_setitem_dim[slice_int] 60.9010μs 38.5508μs 25.9398 KOps/s 26.2970 KOps/s $\color{#d91a1a}-1.36\%$
test_setitem_dim[range] 79.2020μs 54.4443μs 18.3674 KOps/s 19.0714 KOps/s $\color{#d91a1a}-3.69\%$
test_setitem_dim[tuple] 53.8510μs 32.6459μs 30.6318 KOps/s 30.7702 KOps/s $\color{#d91a1a}-0.45\%$
test_setitem 39.4910μs 15.7682μs 63.4188 KOps/s 61.8214 KOps/s $\color{#35bf28}+2.58\%$
test_set 52.4510μs 15.2434μs 65.6020 KOps/s 63.6868 KOps/s $\color{#35bf28}+3.01\%$
test_set_shared 0.5357ms 0.1556ms 6.4280 KOps/s 6.4129 KOps/s $\color{#35bf28}+0.24\%$
test_update 0.3450ms 19.7443μs 50.6476 KOps/s 50.3554 KOps/s $\color{#35bf28}+0.58\%$
test_update_nested 60.5020μs 24.8754μs 40.2003 KOps/s 37.7204 KOps/s $\textbf{\color{#35bf28}+6.57\%}$
test_update__nested 0.5130ms 23.9241μs 41.7989 KOps/s 40.6999 KOps/s $\color{#35bf28}+2.70\%$
test_set_nested 73.9320μs 16.1783μs 61.8112 KOps/s 58.2407 KOps/s $\textbf{\color{#35bf28}+6.13\%}$
test_set_nested_new 53.4110μs 18.8242μs 53.1232 KOps/s 51.3858 KOps/s $\color{#35bf28}+3.38\%$
test_select 70.3620μs 30.4006μs 32.8941 KOps/s 30.8942 KOps/s $\textbf{\color{#35bf28}+6.47\%}$
test_select_nested 66.1120μs 44.1213μs 22.6648 KOps/s 22.1407 KOps/s $\color{#35bf28}+2.37\%$
test_exclude_nested 0.1105ms 62.4869μs 16.0034 KOps/s 15.4233 KOps/s $\color{#35bf28}+3.76\%$
test_empty[True] 0.6852ms 0.2987ms 3.3477 KOps/s 3.3109 KOps/s $\color{#35bf28}+1.11\%$
test_empty[False] 3.2241μs 0.8240μs 1.2135 MOps/s 1.2091 MOps/s $\color{#35bf28}+0.37\%$
test_to 89.6930μs 56.2816μs 17.7678 KOps/s 17.4607 KOps/s $\color{#35bf28}+1.76\%$
test_to_nonblocking 85.2120μs 47.3738μs 21.1087 KOps/s 20.8527 KOps/s $\color{#35bf28}+1.23\%$
test_unbind_speed 0.2919ms 0.2404ms 4.1598 KOps/s 4.2022 KOps/s $\color{#d91a1a}-1.01\%$
test_unbind_speed_stack0 0.3081ms 0.2422ms 4.1289 KOps/s 4.2412 KOps/s $\color{#d91a1a}-2.65\%$
test_unbind_speed_stack1 92.9130ms 0.7381ms 1.3548 KOps/s 1.2484 KOps/s $\textbf{\color{#35bf28}+8.53\%}$
test_split 94.2502ms 1.6316ms 612.8828 Ops/s 628.5039 Ops/s $\color{#d91a1a}-2.49\%$
test_chunk 96.1375ms 1.6355ms 611.4522 Ops/s 628.7150 Ops/s $\color{#d91a1a}-2.75\%$
test_consolidate[False-None] 2.7500ms 2.6863ms 372.2550 Ops/s 371.5887 Ops/s $\color{#35bf28}+0.18\%$
test_consolidate[default-None] 1.7867ms 1.7121ms 584.0818 Ops/s 584.3285 Ops/s $\color{#d91a1a}-0.04\%$
test_consolidate[reduce-overhead-None] 1.7846ms 1.7377ms 575.4584 Ops/s 575.1133 Ops/s $\color{#35bf28}+0.06\%$
test_consolidate_njt[False-None] 6.9770ms 6.6223ms 151.0041 Ops/s 152.1577 Ops/s $\color{#d91a1a}-0.76\%$
test_to[False-False-None] 1.7974ms 1.7039ms 586.9058 Ops/s 584.2910 Ops/s $\color{#35bf28}+0.45\%$
test_to[True-False-None] 1.6570ms 1.3367ms 748.1224 Ops/s 726.5888 Ops/s $\color{#35bf28}+2.96\%$
test_to[within-False-None] 4.2152ms 4.1075ms 243.4551 Ops/s 235.9393 Ops/s $\color{#35bf28}+3.19\%$
test_to[True-default-None] 5.5808ms 5.3389ms 187.3032 Ops/s 188.1713 Ops/s $\color{#d91a1a}-0.46\%$
test_to_njt[False-False-None] 7.3117ms 6.8558ms 145.8616 Ops/s 142.7771 Ops/s $\color{#35bf28}+2.16\%$
test_to_njt[True-False-None] 6.3956ms 5.5880ms 178.9533 Ops/s 176.2275 Ops/s $\color{#35bf28}+1.55\%$
test_to_njt[within-False-None] 12.8443ms 12.2294ms 81.7702 Ops/s 80.7233 Ops/s $\color{#35bf28}+1.30\%$
test_creation[device0] 0.5612ms 80.3329μs 12.4482 KOps/s 12.5552 KOps/s $\color{#d91a1a}-0.85\%$
test_creation_from_tensor 0.4837ms 82.2117μs 12.1637 KOps/s 12.0797 KOps/s $\color{#35bf28}+0.70\%$
test_add_one[memmap_tensor0] 0.5356ms 6.3118μs 158.4332 KOps/s 157.4028 KOps/s $\color{#35bf28}+0.65\%$
test_contiguous[memmap_tensor0] 2.2425μs 0.4233μs 2.3622 MOps/s 2.3856 MOps/s $\color{#d91a1a}-0.98\%$
test_stack[memmap_tensor0] 35.1100μs 4.6114μs 216.8550 KOps/s 213.5697 KOps/s $\color{#35bf28}+1.54\%$
test_memmaptd_index 1.6492ms 0.2449ms 4.0825 KOps/s 4.0815 KOps/s $\color{#35bf28}+0.02\%$
test_memmaptd_index_astensor 0.4396ms 0.3059ms 3.2691 KOps/s 3.2860 KOps/s $\color{#d91a1a}-0.52\%$
test_memmaptd_index_op 0.7739ms 0.6066ms 1.6484 KOps/s 1.6203 KOps/s $\color{#35bf28}+1.74\%$
test_serialize_model 0.4359s 0.1750s 5.7145 Ops/s 7.6647 Ops/s $\textbf{\color{#d91a1a}-25.44\%}$
test_serialize_model_pickle 1.3508s 1.2103s 0.8262 Ops/s 0.8250 Ops/s $\color{#35bf28}+0.15\%$
test_serialize_weights 0.1320s 0.1299s 7.6954 Ops/s 7.7290 Ops/s $\color{#d91a1a}-0.44\%$
test_serialize_weights_returnearly 0.3178s 54.7911ms 18.2511 Ops/s 23.1517 Ops/s $\textbf{\color{#d91a1a}-21.17\%}$
test_serialize_weights_pickle 1.3753s 1.2160s 0.8224 Ops/s 0.8194 Ops/s $\color{#35bf28}+0.36\%$
test_reshape_pytree 51.5520μs 22.1223μs 45.2032 KOps/s 45.0859 KOps/s $\color{#35bf28}+0.26\%$
test_reshape_td 66.6920μs 26.9420μs 37.1168 KOps/s 36.7205 KOps/s $\color{#35bf28}+1.08\%$
test_view_pytree 54.7610μs 22.1275μs 45.1927 KOps/s 44.8073 KOps/s $\color{#35bf28}+0.86\%$
test_view_td 66.9310μs 30.9688μs 32.2906 KOps/s 30.7131 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_unbind_pytree 62.7420μs 28.0679μs 35.6279 KOps/s 35.5218 KOps/s $\color{#35bf28}+0.30\%$
test_unbind_td 0.8898ms 36.2818μs 27.5621 KOps/s 26.5496 KOps/s $\color{#35bf28}+3.81\%$
test_split_pytree 63.6420μs 29.6438μs 33.7339 KOps/s 32.5642 KOps/s $\color{#35bf28}+3.59\%$
test_split_td 1.0042ms 39.5386μs 25.2918 KOps/s 25.1535 KOps/s $\color{#35bf28}+0.55\%$
test_add_pytree 60.6820μs 32.7817μs 30.5048 KOps/s 29.3925 KOps/s $\color{#35bf28}+3.78\%$
test_add_td 0.1967ms 51.0740μs 19.5794 KOps/s 18.1555 KOps/s $\textbf{\color{#35bf28}+7.84\%}$
test_compile_add_one_nested[tensordict-compile] 0.1800ms 0.1235ms 8.0964 KOps/s 7.8128 KOps/s $\color{#35bf28}+3.63\%$
test_compile_add_one_nested[tensordict-eager] 0.2222ms 0.1292ms 7.7397 KOps/s 7.5710 KOps/s $\color{#35bf28}+2.23\%$
test_compile_add_one_nested[pytree-compile] 0.2038ms 96.8888μs 10.3211 KOps/s 9.9808 KOps/s $\color{#35bf28}+3.41\%$
test_compile_add_one_nested[pytree-eager] 0.2239ms 0.1476ms 6.7737 KOps/s 6.6917 KOps/s $\color{#35bf28}+1.23\%$
test_compile_copy_nested[tensordict-compile] 57.6110μs 24.9619μs 40.0611 KOps/s 39.6818 KOps/s $\color{#35bf28}+0.96\%$
test_compile_copy_nested[tensordict-eager] 57.8820μs 29.2961μs 34.1343 KOps/s 33.4464 KOps/s $\color{#35bf28}+2.06\%$
test_compile_copy_nested[pytree-compile] 0.4289ms 66.3834μs 15.0640 KOps/s 14.8060 KOps/s $\color{#35bf28}+1.74\%$
test_compile_copy_nested[pytree-eager] 85.3320μs 49.5591μs 20.1779 KOps/s 19.9134 KOps/s $\color{#35bf28}+1.33\%$
test_compile_add_one_flat[tensordict-compile] 0.1941ms 0.1420ms 7.0410 KOps/s 6.9883 KOps/s $\color{#35bf28}+0.75\%$
test_compile_add_one_flat[tensordict-eager] 0.3325ms 0.2164ms 4.6221 KOps/s 4.5984 KOps/s $\color{#35bf28}+0.52\%$
test_compile_add_one_flat[tensorclass-compile] 0.1574ms 97.7361μs 10.2316 KOps/s 10.1414 KOps/s $\color{#35bf28}+0.89\%$
test_compile_add_one_flat[tensorclass-eager] 0.1151ms 54.8006μs 18.2480 KOps/s 17.9781 KOps/s $\color{#35bf28}+1.50\%$
test_compile_add_one_flat[pytree-compile] 0.2176ms 0.1383ms 7.2285 KOps/s 7.3217 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_add_one_flat[pytree-eager] 0.5927ms 0.4729ms 2.1146 KOps/s 2.1043 KOps/s $\color{#35bf28}+0.49\%$
test_compile_add_self_flat[tensordict-eager] 0.3784ms 0.2574ms 3.8856 KOps/s 3.8163 KOps/s $\color{#35bf28}+1.82\%$
test_compile_add_self_flat[tensordict-compile] 0.1840ms 0.1442ms 6.9337 KOps/s 7.0102 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_add_self_flat[tensorclass-eager] 0.1612ms 67.5133μs 14.8119 KOps/s 14.3930 KOps/s $\color{#35bf28}+2.91\%$
test_compile_add_self_flat[tensorclass-compile] 0.1463ms 98.9814μs 10.1029 KOps/s 10.0807 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_self_flat[pytree-eager] 0.4706ms 0.4037ms 2.4770 KOps/s 2.4837 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_add_self_flat[pytree-compile] 0.1736ms 0.1353ms 7.3910 KOps/s 7.4356 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_copy_flat[tensordict-compile] 63.0720μs 19.7852μs 50.5428 KOps/s 54.6653 KOps/s $\textbf{\color{#d91a1a}-7.54\%}$
test_compile_copy_flat[tensordict-eager] 75.0220μs 30.9031μs 32.3592 KOps/s 31.8793 KOps/s $\color{#35bf28}+1.51\%$
test_compile_copy_flat[pytree-compile] 0.1792ms 72.6477μs 13.7651 KOps/s 13.7824 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_copy_flat[pytree-eager] 79.5020μs 54.0105μs 18.5149 KOps/s 18.6667 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_assign_and_add[tensordict-compile] 1.6694ms 0.4034ms 2.4789 KOps/s 2.2264 KOps/s $\textbf{\color{#35bf28}+11.34\%}$
test_compile_assign_and_add[tensordict-eager] 2.7901ms 2.6235ms 381.1676 Ops/s 383.4902 Ops/s $\color{#d91a1a}-0.61\%$
test_compile_assign_and_add[pytree-compile] 1.5914ms 0.4312ms 2.3191 KOps/s 2.2788 KOps/s $\color{#35bf28}+1.77\%$
test_compile_assign_and_add[pytree-eager] 2.8437ms 2.6120ms 382.8517 Ops/s 379.3704 Ops/s $\color{#35bf28}+0.92\%$
test_compile_indexing[tensor-tensordict-compile] 0.1800ms 0.1172ms 8.5355 KOps/s 8.6481 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_indexing[tensor-tensordict-eager] 0.5665ms 81.3378μs 12.2944 KOps/s 12.9758 KOps/s $\textbf{\color{#d91a1a}-5.25\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1751ms 0.1093ms 9.1460 KOps/s 9.6294 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.1163ms 69.0964μs 14.4725 KOps/s 14.9935 KOps/s $\color{#d91a1a}-3.47\%$
test_compile_indexing[tensor-pytree-compile] 0.1616ms 0.1108ms 9.0252 KOps/s 9.5792 KOps/s $\textbf{\color{#d91a1a}-5.78\%}$
test_compile_indexing[tensor-pytree-eager] 0.1801ms 70.2718μs 14.2305 KOps/s 15.0363 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_compile_indexing[slice-tensordict-compile] 0.1516ms 0.1006ms 9.9404 KOps/s 9.8661 KOps/s $\color{#35bf28}+0.75\%$
test_compile_indexing[slice-tensordict-eager] 0.1431ms 17.5304μs 57.0439 KOps/s 53.1076 KOps/s $\textbf{\color{#35bf28}+7.41\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1670ms 98.4000μs 10.1626 KOps/s 10.3937 KOps/s $\color{#d91a1a}-2.22\%$
test_compile_indexing[slice-tensorclass-eager] 51.9210μs 16.0181μs 62.4292 KOps/s 63.2829 KOps/s $\color{#d91a1a}-1.35\%$
test_compile_indexing[slice-pytree-compile] 0.2054ms 98.6791μs 10.1339 KOps/s 10.3474 KOps/s $\color{#d91a1a}-2.06\%$
test_compile_indexing[slice-pytree-eager] 47.3710μs 16.0415μs 62.3381 KOps/s 64.0101 KOps/s $\color{#d91a1a}-2.61\%$
test_compile_indexing[int-tensordict-compile] 0.1467ms 0.1007ms 9.9344 KOps/s 9.7746 KOps/s $\color{#35bf28}+1.63\%$
test_compile_indexing[int-tensordict-eager] 0.5643ms 17.1996μs 58.1410 KOps/s 58.0426 KOps/s $\color{#35bf28}+0.17\%$
test_compile_indexing[int-tensorclass-compile] 0.1874ms 96.0290μs 10.4135 KOps/s 10.2679 KOps/s $\color{#35bf28}+1.42\%$
test_compile_indexing[int-tensorclass-eager] 47.6910μs 16.1075μs 62.0828 KOps/s 63.7418 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_indexing[int-pytree-compile] 0.1558ms 95.6917μs 10.4502 KOps/s 10.3084 KOps/s $\color{#35bf28}+1.38\%$
test_compile_indexing[int-pytree-eager] 0.1058ms 15.8449μs 63.1116 KOps/s 63.3079 KOps/s $\color{#d91a1a}-0.31\%$
test_mod_add[eager] 95.2620μs 40.4715μs 24.7087 KOps/s 24.1273 KOps/s $\color{#35bf28}+2.41\%$
test_mod_add[compile] 0.1283ms 82.6796μs 12.0949 KOps/s 12.0769 KOps/s $\color{#35bf28}+0.15\%$
test_mod_add[compile-overhead] 0.3237ms 0.1704ms 5.8688 KOps/s 5.3485 KOps/s $\textbf{\color{#35bf28}+9.73\%}$
test_mod_wrap[eager] 0.3436ms 0.2624ms 3.8109 KOps/s 3.8738 KOps/s $\color{#d91a1a}-1.62\%$
test_mod_wrap[compile] 0.3944ms 0.2838ms 3.5237 KOps/s 3.4476 KOps/s $\color{#35bf28}+2.21\%$
test_mod_wrap[compile-overhead] 7.0435ms 3.7176ms 268.9932 Ops/s 270.6376 Ops/s $\color{#d91a1a}-0.61\%$
test_mod_wrap_and_backward[eager] 1.4504ms 1.3355ms 748.7877 Ops/s 722.9376 Ops/s $\color{#35bf28}+3.58\%$
test_mod_wrap_and_backward[compile] 1.3364ms 1.2581ms 794.8598 Ops/s 723.6089 Ops/s $\textbf{\color{#35bf28}+9.85\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3821ms 0.9241ms 1.0822 KOps/s 947.3077 Ops/s $\textbf{\color{#35bf28}+14.24\%}$
test_seq_add[eager] 0.1759ms 0.1200ms 8.3360 KOps/s 8.0260 KOps/s $\color{#35bf28}+3.86\%$
test_seq_add[compile] 0.2143ms 90.8941μs 11.0018 KOps/s 10.7836 KOps/s $\color{#35bf28}+2.02\%$
test_seq_add[compile-overhead] 0.1819ms 0.1295ms 7.7224 KOps/s 7.3799 KOps/s $\color{#35bf28}+4.64\%$
test_seq_wrap[eager] 0.5039ms 0.4278ms 2.3373 KOps/s 2.1951 KOps/s $\textbf{\color{#35bf28}+6.48\%}$
test_seq_wrap[compile] 0.3775ms 0.2995ms 3.3394 KOps/s 3.1184 KOps/s $\textbf{\color{#35bf28}+7.09\%}$
test_seq_wrap[compile-overhead] 0.3124ms 0.2266ms 4.4127 KOps/s 4.3580 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_runtime[False-eager] 0.7916ms 0.7179ms 1.3930 KOps/s 1.3473 KOps/s $\color{#35bf28}+3.39\%$
test_func_call_runtime[False-compile] 0.8422ms 0.7425ms 1.3467 KOps/s 1.3170 KOps/s $\color{#35bf28}+2.25\%$
test_func_call_runtime[False-compile-overhead] 0.4164ms 0.3649ms 2.7404 KOps/s 2.7222 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_runtime[True-eager] 0.9856ms 0.8830ms 1.1326 KOps/s 1.1009 KOps/s $\color{#35bf28}+2.88\%$
test_func_call_runtime[True-compile] 0.8254ms 0.7631ms 1.3105 KOps/s 1.2833 KOps/s $\color{#35bf28}+2.12\%$
test_func_call_runtime[True-compile-overhead] 0.4328ms 0.3845ms 2.6009 KOps/s 2.5734 KOps/s $\color{#35bf28}+1.07\%$
test_func_call_cm_runtime[False-eager] 0.8510ms 0.7185ms 1.3919 KOps/s 1.3474 KOps/s $\color{#35bf28}+3.30\%$
test_func_call_cm_runtime[False-compile] 0.8228ms 0.7463ms 1.3399 KOps/s 1.3232 KOps/s $\color{#35bf28}+1.27\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4712ms 0.3685ms 2.7139 KOps/s 2.6972 KOps/s $\color{#35bf28}+0.62\%$
test_func_call_cm_runtime[True-eager] 1.0671ms 0.9901ms 1.0100 KOps/s 1.0111 KOps/s $\color{#d91a1a}-0.11\%$
test_func_call_cm_runtime[True-compile] 1.1082ms 1.0251ms 975.5345 Ops/s 996.1682 Ops/s $\color{#d91a1a}-2.07\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0910ms 1.0213ms 979.1309 Ops/s 1.0257 KOps/s $\color{#d91a1a}-4.54\%$
test_vmap_func_call_cm_runtime[eager] 2.4420ms 2.0292ms 492.7947 Ops/s 487.9002 Ops/s $\color{#35bf28}+1.00\%$
test_vmap_func_call_cm_runtime[compile] 0.8787ms 0.8086ms 1.2367 KOps/s 1.2098 KOps/s $\color{#35bf28}+2.22\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5374ms 0.4174ms 2.3960 KOps/s 2.3624 KOps/s $\color{#35bf28}+1.42\%$
test_distributed 0.7021ms 0.1628ms 6.1440 KOps/s 8.3808 KOps/s $\textbf{\color{#d91a1a}-26.69\%}$
test_tdmodule 0.2762ms 21.8788μs 45.7064 KOps/s 43.3628 KOps/s $\textbf{\color{#35bf28}+5.40\%}$
test_tdmodule_dispatch 61.7510μs 39.4320μs 25.3601 KOps/s 25.1674 KOps/s $\color{#35bf28}+0.77\%$
test_tdseq 43.0510μs 22.5085μs 44.4277 KOps/s 44.0800 KOps/s $\color{#35bf28}+0.79\%$
test_tdseq_dispatch 70.0320μs 42.0377μs 23.7882 KOps/s 23.5218 KOps/s $\color{#35bf28}+1.13\%$
test_instantiation_functorch 1.6231ms 1.5266ms 655.0327 Ops/s 640.6130 Ops/s $\color{#35bf28}+2.25\%$
test_exec_functorch 0.1797ms 0.1414ms 7.0726 KOps/s 7.0290 KOps/s $\color{#35bf28}+0.62\%$
test_exec_functional_call 0.1738ms 0.1315ms 7.6068 KOps/s 7.5520 KOps/s $\color{#35bf28}+0.73\%$
test_exec_td_decorator 0.3768ms 0.1821ms 5.4927 KOps/s 5.4423 KOps/s $\color{#35bf28}+0.92\%$
test_vmap_mlp_speed_decorator[True-True] 0.8263ms 0.6742ms 1.4833 KOps/s 1.4454 KOps/s $\color{#35bf28}+2.62\%$
test_vmap_mlp_speed_decorator[True-False] 0.7937ms 0.6765ms 1.4783 KOps/s 1.4426 KOps/s $\color{#35bf28}+2.47\%$
test_vmap_mlp_speed_decorator[False-True] 0.7229ms 0.5809ms 1.7215 KOps/s 1.7176 KOps/s $\color{#35bf28}+0.22\%$
test_vmap_mlp_speed_decorator[False-False] 0.6960ms 0.5803ms 1.7233 KOps/s 1.7073 KOps/s $\color{#35bf28}+0.94\%$
test_vmap_transformer_speed_decorator[True-True] 18.8403ms 18.7681ms 53.2818 Ops/s 53.1446 Ops/s $\color{#35bf28}+0.26\%$
test_vmap_transformer_speed_decorator[True-False] 19.3241ms 18.7745ms 53.2636 Ops/s 51.8336 Ops/s $\color{#35bf28}+2.76\%$
test_vmap_transformer_speed_decorator[False-True] 18.7205ms 18.6046ms 53.7501 Ops/s 53.7642 Ops/s $\color{#d91a1a}-0.03\%$
test_vmap_transformer_speed_decorator[False-False] 18.6585ms 18.5864ms 53.8029 Ops/s 53.6372 Ops/s $\color{#35bf28}+0.31\%$
test_to_module_speed[True] 1.0630ms 0.9617ms 1.0398 KOps/s 1.0374 KOps/s $\color{#35bf28}+0.23\%$
test_to_module_speed[False] 1.3085ms 0.9440ms 1.0593 KOps/s 1.0557 KOps/s $\color{#35bf28}+0.34\%$
test_tc_init 85.6020μs 37.8550μs 26.4166 KOps/s 24.5856 KOps/s $\textbf{\color{#35bf28}+7.45\%}$
test_tc_init_nested 0.1240ms 76.0666μs 13.1464 KOps/s 12.5797 KOps/s $\color{#35bf28}+4.50\%$
test_tc_first_layer_tensor 7.4244μs 0.7166μs 1.3956 MOps/s 1.2047 MOps/s $\textbf{\color{#35bf28}+15.84\%}$
test_tc_first_layer_nontensor 40.3210μs 2.2639μs 441.7234 KOps/s 442.4473 KOps/s $\color{#d91a1a}-0.16\%$
test_tc_second_layer_tensor 8.5502μs 1.4378μs 695.5237 KOps/s 701.9617 KOps/s $\color{#d91a1a}-0.92\%$
test_tc_second_layer_nontensor 60.0810μs 3.0215μs 330.9583 KOps/s 327.9049 KOps/s $\color{#35bf28}+0.93\%$
test_unbind 0.2254s 9.9940ms 100.0603 Ops/s 142.2180 Ops/s $\textbf{\color{#d91a1a}-29.64\%}$
test_full_like 9.1892ms 9.0673ms 110.2869 Ops/s 110.5429 Ops/s $\color{#d91a1a}-0.23\%$
test_zeros_like 5.1303ms 4.3129ms 231.8652 Ops/s 231.8249 Ops/s $\color{#35bf28}+0.02\%$
test_ones_like 4.9413ms 4.3177ms 231.6056 Ops/s 231.8933 Ops/s $\color{#d91a1a}-0.12\%$
test_clone 11.2006ms 9.0089ms 111.0015 Ops/s 159.0664 Ops/s $\textbf{\color{#d91a1a}-30.22\%}$
test_squeeze 80.6520μs 10.6084μs 94.2651 KOps/s 105.5523 KOps/s $\textbf{\color{#d91a1a}-10.69\%}$
test_unsqueeze 0.4838ms 74.2274μs 13.4721 KOps/s 13.4138 KOps/s $\color{#35bf28}+0.43\%$
test_split 0.2825ms 0.1621ms 6.1680 KOps/s 5.9343 KOps/s $\color{#35bf28}+3.94\%$
test_permute 0.5804ms 0.1809ms 5.5272 KOps/s 5.3250 KOps/s $\color{#35bf28}+3.80\%$
test_stack 50.3951ms 49.8068ms 20.0776 Ops/s 20.0702 Ops/s $\color{#35bf28}+0.04\%$
test_cat 52.6312ms 49.9136ms 20.0346 Ops/s 19.7285 Ops/s $\color{#35bf28}+1.55\%$
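
The Change column appears to be the relative difference between the Ops value of this run and the Ops on Repo HEAD value, expressed as a percentage. The sketch below reproduces a few rows under that assumption. The helper names (`to_ops_per_sec`, `percent_change`, `is_highlighted`) are hypothetical and are not part of the benchmark tooling. The 5% cutoff for bold entries is inferred from the rows shown here, not from any documented setting.

```python
# Minimal sketch: recompute the "Change" column from the two Ops columns.
# All names here are illustrative assumptions, not the benchmark bot's code.

# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_sec(value: float, unit: str) -> float:
    """Normalise a throughput reading to plain operations per second."""
    return value * UNIT_SCALE[unit]


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this run versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def is_highlighted(change: float, threshold: float = 5.0) -> bool:
    """Assumed rule: entries at or beyond +/-5% are shown in bold."""
    return abs(change) >= threshold


# (name, ops, unit, ops_head, unit_head) copied from rows of the table.
rows = [
    ("test_setitem", 63.4188, "KOps/s", 61.8214, "KOps/s"),
    ("test_serialize_model", 5.7145, "Ops/s", 7.6647, "Ops/s"),
    ("test_mod_wrap_and_backward[compile-overhead]",
     1.0822, "KOps/s", 947.3077, "Ops/s"),
]

for name, ops, unit, head, head_unit in rows:
    change = percent_change(to_ops_per_sec(ops, unit),
                            to_ops_per_sec(head, head_unit))
    marker = " (bold)" if is_highlighted(change) else ""
    print(f"{name}: {change:+.2f}%{marker}")
```

Normalising units first matters for rows such as `test_mod_wrap_and_backward[compile-overhead]`, where one column is reported in KOps/s and the other in Ops/s.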

vmoens added a commit that referenced this pull request Feb 12, 2025
ghstack-source-id: aea9814fecbab903ad22ae54903a2921f4b88c5b
Pull Request resolved: #1215

(cherry picked from commit 7dd385b)