[BugFix] Consolidate lazy stacks of non-tensors [EDIT] #1224

Merged: 1 commit, Feb 20, 2025

@vmoens (Contributor) commented Feb 20, 2025

[ghstack-poisoned]
@facebook-github-bot added the "CLA Signed" label Feb 20, 2025
@vmoens changed the title from "[BugFix] Consolidate lazy stacks of non-tensors" to "[BugFix] Consolidate lazy stacks of non-tensors [EDIT]" Feb 20, 2025
@vmoens added the "bug" label Feb 20, 2025
@vmoens merged commit 9d11f69 into gh/vmoens/48/base Feb 20, 2025
25 of 36 checks passed
vmoens added a commit that referenced this pull request Feb 20, 2025
ghstack-source-id: d3d822dba235b74128f99e6cbff08989d13c1af4
Pull Request resolved: #1224
@vmoens deleted the gh/vmoens/48/head branch February 20, 2025 10:52

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: 13. Worsened: 4.
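
The Improved and Worsened counts appear to tally only the rows shown in bold in the detailed results, which in this report are the changes of roughly 5% or more in either direction. The sketch below illustrates that tally under this assumption. The 5% cutoff is inferred from the bolded rows, not stated by the benchmark bot, and `summarize` and `THRESHOLD_PCT` are hypothetical names.

```python
# Assumed cutoff: the report does not state it; 5% is inferred from
# which rows are bolded in the detailed results.
THRESHOLD_PCT = 5.0


def summarize(changes_pct, threshold=THRESHOLD_PCT):
    """Count changes at or beyond the threshold in each direction."""
    improved = sum(1 for c in changes_pct if c >= threshold)
    worsened = sum(1 for c in changes_pct if c <= -threshold)
    return {"total": len(changes_pct), "improved": improved, "worsened": worsened}


# A handful of Change values copied from the CPU table (in percent).
sample = [0.44, -2.69, 5.30, 32.82, 33.58, -7.31, -40.34, 4.68]
print(summarize(sample))
```

Under this reading, small fluctuations such as +0.44% or -2.69% count toward the total but toward neither the improved nor the worsened tally.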

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 82.4630μs 21.0865μs 47.4236 KOps/s 47.2165 KOps/s $\color{#35bf28}+0.44\%$
test_plain_set_stack_nested 45.7460μs 21.2103μs 47.1470 KOps/s 46.5496 KOps/s $\color{#35bf28}+1.28\%$
test_plain_set_nested_inplace 60.5530μs 22.7574μs 43.9418 KOps/s 42.7514 KOps/s $\color{#35bf28}+2.78\%$
test_plain_set_stack_nested_inplace 90.4680μs 22.7549μs 43.9467 KOps/s 42.5656 KOps/s $\color{#35bf28}+3.24\%$
test_items 63.6990μs 4.2981μs 232.6621 KOps/s 239.0922 KOps/s $\color{#d91a1a}-2.69\%$
test_items_nested 0.6487ms 0.4181ms 2.3917 KOps/s 2.4323 KOps/s $\color{#d91a1a}-1.67\%$
test_items_nested_locked 0.5034ms 0.3982ms 2.5115 KOps/s 2.4411 KOps/s $\color{#35bf28}+2.89\%$
test_items_nested_leaf 0.1316ms 77.7667μs 12.8590 KOps/s 12.9141 KOps/s $\color{#d91a1a}-0.43\%$
test_items_stack_nested 0.8318ms 0.4017ms 2.4893 KOps/s 2.4025 KOps/s $\color{#35bf28}+3.61\%$
test_items_stack_nested_leaf 0.1793ms 77.2070μs 12.9522 KOps/s 12.6480 KOps/s $\color{#35bf28}+2.40\%$
test_items_stack_nested_locked 0.5886ms 0.4012ms 2.4925 KOps/s 2.4185 KOps/s $\color{#35bf28}+3.06\%$
test_keys 23.4340μs 3.6231μs 276.0073 KOps/s 281.7361 KOps/s $\color{#d91a1a}-2.03\%$
test_keys_nested 0.2607ms 0.1642ms 6.0885 KOps/s 6.0467 KOps/s $\color{#35bf28}+0.69\%$
test_keys_nested_locked 0.7230ms 0.1701ms 5.8799 KOps/s 5.8635 KOps/s $\color{#35bf28}+0.28\%$
test_keys_nested_leaf 0.2363ms 0.1428ms 7.0006 KOps/s 6.9104 KOps/s $\color{#35bf28}+1.30\%$
test_keys_stack_nested 0.2537ms 0.1630ms 6.1367 KOps/s 6.0809 KOps/s $\color{#35bf28}+0.92\%$
test_keys_stack_nested_leaf 0.2248ms 0.1431ms 6.9857 KOps/s 6.9415 KOps/s $\color{#35bf28}+0.64\%$
test_keys_stack_nested_locked 0.2683ms 0.1681ms 5.9486 KOps/s 5.9060 KOps/s $\color{#35bf28}+0.72\%$
test_values 8.6340μs 1.0539μs 948.8470 KOps/s 931.9904 KOps/s $\color{#35bf28}+1.81\%$
test_values_nested 0.1095ms 61.4310μs 16.2784 KOps/s 16.0460 KOps/s $\color{#35bf28}+1.45\%$
test_values_nested_locked 0.1268ms 61.6657μs 16.2165 KOps/s 16.1329 KOps/s $\color{#35bf28}+0.52\%$
test_values_nested_leaf 0.1453ms 70.9685μs 14.0908 KOps/s 13.9352 KOps/s $\color{#35bf28}+1.12\%$
test_values_stack_nested 0.1130ms 62.0003μs 16.1290 KOps/s 15.4073 KOps/s $\color{#35bf28}+4.68\%$
test_values_stack_nested_leaf 0.1332ms 70.9605μs 14.0924 KOps/s 13.8987 KOps/s $\color{#35bf28}+1.39\%$
test_values_stack_nested_locked 0.1941ms 62.7352μs 15.9400 KOps/s 15.8732 KOps/s $\color{#35bf28}+0.42\%$
test_membership 83.1650μs 0.8880μs 1.1261 MOps/s 1.0928 MOps/s $\color{#35bf28}+3.05\%$
test_membership_nested 19.0050μs 2.8596μs 349.7012 KOps/s 337.3472 KOps/s $\color{#35bf28}+3.66\%$
test_membership_nested_leaf 24.0750μs 2.8800μs 347.2175 KOps/s 329.7488 KOps/s $\textbf{\color{#35bf28}+5.30\%}$
test_membership_stacked_nested 22.7920μs 2.8786μs 347.3971 KOps/s 332.5313 KOps/s $\color{#35bf28}+4.47\%$
test_membership_stacked_nested_leaf 45.8350μs 2.8799μs 347.2330 KOps/s 339.8966 KOps/s $\color{#35bf28}+2.16\%$
test_membership_nested_last 22.9830μs 4.2681μs 234.2990 KOps/s 225.1568 KOps/s $\color{#35bf28}+4.06\%$
test_membership_nested_leaf_last 50.6350μs 4.2918μs 233.0033 KOps/s 223.5508 KOps/s $\color{#35bf28}+4.23\%$
test_membership_stacked_nested_last 25.2270μs 4.2539μs 235.0791 KOps/s 176.9916 KOps/s $\textbf{\color{#35bf28}+32.82\%}$
test_membership_stacked_nested_leaf_last 47.2080μs 4.2997μs 232.5733 KOps/s 174.1117 KOps/s $\textbf{\color{#35bf28}+33.58\%}$
test_nested_getleaf 77.1840μs 10.5409μs 94.8684 KOps/s 92.1813 KOps/s $\color{#35bf28}+2.92\%$
test_nested_get 29.5750μs 10.0552μs 99.4508 KOps/s 97.6473 KOps/s $\color{#35bf28}+1.85\%$
test_stacked_getleaf 58.2380μs 10.3622μs 96.5048 KOps/s 92.8452 KOps/s $\color{#35bf28}+3.94\%$
test_stacked_get 53.5000μs 9.9363μs 100.6410 KOps/s 98.3182 KOps/s $\color{#35bf28}+2.36\%$
test_nested_getitemleaf 33.5030μs 11.0905μs 90.1671 KOps/s 87.3221 KOps/s $\color{#35bf28}+3.26\%$
test_nested_getitem 49.3910μs 10.5163μs 95.0907 KOps/s 91.0306 KOps/s $\color{#35bf28}+4.46\%$
test_stacked_getitemleaf 46.5170μs 11.0967μs 90.1166 KOps/s 88.2706 KOps/s $\color{#35bf28}+2.09\%$
test_stacked_getitem 47.1780μs 10.4019μs 96.1365 KOps/s 93.9209 KOps/s $\color{#35bf28}+2.36\%$
test_lock_nested 7.3809ms 0.4163ms 2.4020 KOps/s 2.4156 KOps/s $\color{#d91a1a}-0.57\%$
test_lock_stack_nested 0.5416ms 0.4229ms 2.3644 KOps/s 2.3587 KOps/s $\color{#35bf28}+0.24\%$
test_unlock_nested 0.7279ms 0.3395ms 2.9456 KOps/s 2.9867 KOps/s $\color{#d91a1a}-1.38\%$
test_unlock_stack_nested 0.5199ms 0.3421ms 2.9235 KOps/s 2.9190 KOps/s $\color{#35bf28}+0.15\%$
test_flatten_speed 0.2089ms 0.1014ms 9.8663 KOps/s 9.8595 KOps/s $\color{#35bf28}+0.07\%$
test_unflatten_speed 0.9781ms 0.5211ms 1.9189 KOps/s 1.8846 KOps/s $\color{#35bf28}+1.82\%$
test_common_ops 1.0447ms 0.8335ms 1.1998 KOps/s 1.2173 KOps/s $\color{#d91a1a}-1.43\%$
test_creation 32.6610μs 2.6163μs 382.2257 KOps/s 394.5776 KOps/s $\color{#d91a1a}-3.13\%$
test_creation_empty 36.7990μs 12.6643μs 78.9622 KOps/s 77.5212 KOps/s $\color{#35bf28}+1.86\%$
test_creation_nested_1 78.7770μs 15.3764μs 65.0349 KOps/s 63.2115 KOps/s $\color{#35bf28}+2.88\%$
test_creation_nested_2 74.3690μs 20.4473μs 48.9063 KOps/s 48.8544 KOps/s $\color{#35bf28}+0.11\%$
test_clone 67.3480μs 13.3371μs 74.9785 KOps/s 72.8953 KOps/s $\color{#35bf28}+2.86\%$
test_getitem[int] 0.9629ms 12.5038μs 79.9759 KOps/s 77.0490 KOps/s $\color{#35bf28}+3.80\%$
test_getitem[slice_int] 0.1616ms 23.7087μs 42.1785 KOps/s 41.0078 KOps/s $\color{#35bf28}+2.85\%$
test_getitem[range] 0.1698ms 51.0256μs 19.5980 KOps/s 19.4746 KOps/s $\color{#35bf28}+0.63\%$
test_getitem[tuple] 0.1651ms 20.0937μs 49.7669 KOps/s 48.7459 KOps/s $\color{#35bf28}+2.09\%$
test_getitem[list] 0.1663ms 44.9448μs 22.2495 KOps/s 21.3282 KOps/s $\color{#35bf28}+4.32\%$
test_setitem_dim[int] 47.0580μs 25.6688μs 38.9578 KOps/s 38.6969 KOps/s $\color{#35bf28}+0.67\%$
test_setitem_dim[slice_int] 90.7990μs 50.9202μs 19.6386 KOps/s 19.1944 KOps/s $\color{#35bf28}+2.31\%$
test_setitem_dim[range] 0.1689ms 77.4101μs 12.9182 KOps/s 13.0314 KOps/s $\color{#d91a1a}-0.87\%$
test_setitem_dim[tuple] 84.8480μs 40.7397μs 24.5461 KOps/s 24.0542 KOps/s $\color{#35bf28}+2.04\%$
test_setitem 80.0600μs 20.8298μs 48.0081 KOps/s 46.3571 KOps/s $\color{#35bf28}+3.56\%$
test_set 75.7010μs 20.2127μs 49.4739 KOps/s 47.6689 KOps/s $\color{#35bf28}+3.79\%$
test_set_shared 3.5387ms 0.1829ms 5.4680 KOps/s 5.4974 KOps/s $\color{#d91a1a}-0.53\%$
test_update 0.1663ms 23.5092μs 42.5365 KOps/s 40.9509 KOps/s $\color{#35bf28}+3.87\%$
test_update_nested 0.1144ms 34.0774μs 29.3449 KOps/s 28.4413 KOps/s $\color{#35bf28}+3.18\%$
test_update__nested 0.4722ms 34.0385μs 29.3785 KOps/s 29.4282 KOps/s $\color{#d91a1a}-0.17\%$
test_set_nested 75.6410μs 22.4759μs 44.4922 KOps/s 43.3813 KOps/s $\color{#35bf28}+2.56\%$
test_set_nested_new 68.1670μs 27.2126μs 36.7477 KOps/s 36.3067 KOps/s $\color{#35bf28}+1.21\%$
test_select 0.1230ms 43.6241μs 22.9231 KOps/s 22.1864 KOps/s $\color{#35bf28}+3.32\%$
test_select_nested 0.1165ms 63.3639μs 15.7819 KOps/s 15.8221 KOps/s $\color{#d91a1a}-0.25\%$
test_exclude_nested 0.1516ms 81.1762μs 12.3189 KOps/s 12.3289 KOps/s $\color{#d91a1a}-0.08\%$
test_empty[True] 0.6969ms 0.4048ms 2.4706 KOps/s 2.4313 KOps/s $\color{#35bf28}+1.62\%$
test_empty[False] 11.6040μs 1.3546μs 738.2208 KOps/s 736.8984 KOps/s $\color{#35bf28}+0.18\%$
test_unbind_speed 0.5860ms 0.2685ms 3.7247 KOps/s 3.6497 KOps/s $\color{#35bf28}+2.05\%$
test_unbind_speed_stack0 0.4651ms 0.2667ms 3.7497 KOps/s 3.6921 KOps/s $\color{#35bf28}+1.56\%$
test_unbind_speed_stack1 0.7457ms 0.6617ms 1.5114 KOps/s 1.2302 KOps/s $\textbf{\color{#35bf28}+22.86\%}$
test_split 0.1043s 1.8935ms 528.1172 Ops/s 569.7388 Ops/s $\textbf{\color{#d91a1a}-7.31\%}$
test_chunk 2.5084ms 1.5705ms 636.7429 Ops/s 630.8728 Ops/s $\color{#35bf28}+0.93\%$
test_consolidate_njt[False-None] 0.1082s 9.0311ms 110.7281 Ops/s 108.2703 Ops/s $\color{#35bf28}+2.27\%$
test_creation[device0] 4.3265ms 93.5931μs 10.6846 KOps/s 10.8415 KOps/s $\color{#d91a1a}-1.45\%$
test_creation_from_tensor 0.2459ms 95.0217μs 10.5239 KOps/s 10.4794 KOps/s $\color{#35bf28}+0.42\%$
test_add_one[memmap_tensor0] 93.3540μs 5.0357μs 198.5817 KOps/s 185.0536 KOps/s $\textbf{\color{#35bf28}+7.31\%}$
test_contiguous[memmap_tensor0] 10.6600μs 0.5070μs 1.9722 MOps/s 1.9067 MOps/s $\color{#35bf28}+3.43\%$
test_stack[memmap_tensor0] 23.4030μs 3.3981μs 294.2843 KOps/s 275.7159 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_memmaptd_index 1.1629ms 0.2282ms 4.3814 KOps/s 4.2088 KOps/s $\color{#35bf28}+4.10\%$
test_memmaptd_index_astensor 0.6253ms 0.3184ms 3.1411 KOps/s 3.0303 KOps/s $\color{#35bf28}+3.66\%$
test_memmaptd_index_op 1.1309ms 0.6056ms 1.6512 KOps/s 1.5857 KOps/s $\color{#35bf28}+4.13\%$
test_serialize_model 0.1214s 0.1143s 8.7499 Ops/s 8.5528 Ops/s $\color{#35bf28}+2.30\%$
test_serialize_model_pickle 0.5032s 0.4054s 2.4664 Ops/s 2.4931 Ops/s $\color{#d91a1a}-1.07\%$
test_serialize_weights 0.1283s 0.1139s 8.7810 Ops/s 8.6161 Ops/s $\color{#35bf28}+1.91\%$
test_serialize_weights_returnearly 0.1782s 0.1619s 6.1782 Ops/s 6.4577 Ops/s $\color{#d91a1a}-4.33\%$
test_serialize_weights_pickle 0.4992s 0.4230s 2.3638 Ops/s 2.2941 Ops/s $\color{#35bf28}+3.04\%$
test_serialize_weights_filesystem 0.1513s 0.1426s 7.0131 Ops/s 7.0826 Ops/s $\color{#d91a1a}-0.98\%$
test_serialize_model_filesystem 0.1544s 0.1442s 6.9346 Ops/s 6.4716 Ops/s $\textbf{\color{#35bf28}+7.16\%}$
test_reshape_pytree 65.9630μs 26.0539μs 38.3819 KOps/s 37.9071 KOps/s $\color{#35bf28}+1.25\%$
test_reshape_td 74.0080μs 32.6844μs 30.5957 KOps/s 30.0414 KOps/s $\color{#35bf28}+1.84\%$
test_view_pytree 58.6900μs 25.9476μs 38.5392 KOps/s 38.2133 KOps/s $\color{#35bf28}+0.85\%$
test_view_td 82.3540μs 40.0559μs 24.9651 KOps/s 24.7749 KOps/s $\color{#35bf28}+0.77\%$
test_unbind_pytree 67.4560μs 29.0648μs 34.4059 KOps/s 33.6720 KOps/s $\color{#35bf28}+2.18\%$
test_unbind_td 0.3800ms 39.6129μs 25.2443 KOps/s 24.6740 KOps/s $\color{#35bf28}+2.31\%$
test_split_pytree 67.8960μs 28.7404μs 34.7943 KOps/s 34.0421 KOps/s $\color{#35bf28}+2.21\%$
test_split_td 0.5025ms 45.3800μs 22.0361 KOps/s 21.9141 KOps/s $\color{#35bf28}+0.56\%$
test_add_pytree 96.5800μs 35.8438μs 27.8989 KOps/s 26.8287 KOps/s $\color{#35bf28}+3.99\%$
test_add_td 0.1593ms 58.2898μs 17.1557 KOps/s 16.2786 KOps/s $\textbf{\color{#35bf28}+5.39\%}$
test_compile_add_one_nested[tensordict-compile] 0.1305ms 66.2430μs 15.0959 KOps/s 14.9286 KOps/s $\color{#35bf28}+1.12\%$
test_compile_add_one_nested[tensordict-eager] 0.5280ms 0.1710ms 5.8475 KOps/s 5.7295 KOps/s $\color{#35bf28}+2.06\%$
test_compile_add_one_nested[pytree-compile] 0.1045ms 45.6390μs 21.9111 KOps/s 21.6286 KOps/s $\color{#35bf28}+1.31\%$
test_compile_add_one_nested[pytree-eager] 0.2190ms 0.1184ms 8.4429 KOps/s 8.0804 KOps/s $\color{#35bf28}+4.49\%$
test_compile_copy_nested[tensordict-compile] 65.2310μs 28.3307μs 35.2974 KOps/s 35.9903 KOps/s $\color{#d91a1a}-1.93\%$
test_compile_copy_nested[tensordict-eager] 0.1191ms 58.6565μs 17.0484 KOps/s 16.9660 KOps/s $\color{#35bf28}+0.49\%$
test_compile_copy_nested[pytree-compile] 0.1512ms 79.0822μs 12.6451 KOps/s 12.4669 KOps/s $\color{#35bf28}+1.43\%$
test_compile_copy_nested[pytree-eager] 0.1514ms 66.8387μs 14.9614 KOps/s 14.8288 KOps/s $\color{#35bf28}+0.89\%$
test_compile_add_one_flat[tensordict-compile] 0.1934ms 0.1075ms 9.3063 KOps/s 9.3162 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_one_flat[tensordict-eager] 0.4327ms 0.2171ms 4.6060 KOps/s 4.5933 KOps/s $\color{#35bf28}+0.28\%$
test_compile_add_one_flat[tensorclass-compile] 0.1828ms 47.5467μs 21.0320 KOps/s 21.2629 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_add_one_flat[tensorclass-eager] 0.1556ms 68.3639μs 14.6276 KOps/s 14.5390 KOps/s $\color{#35bf28}+0.61\%$
test_compile_add_one_flat[pytree-compile] 0.1766ms 0.1020ms 9.8062 KOps/s 9.9329 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_add_one_flat[pytree-eager] 0.3399ms 0.2042ms 4.8968 KOps/s 4.8189 KOps/s $\color{#35bf28}+1.62\%$
test_compile_add_self_flat[tensordict-eager] 0.4098ms 0.2331ms 4.2901 KOps/s 4.2467 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_self_flat[tensordict-compile] 0.2370ms 0.1118ms 8.9432 KOps/s 9.1892 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_add_self_flat[tensorclass-eager] 0.3611ms 63.4030μs 15.7721 KOps/s 15.8970 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_add_self_flat[tensorclass-compile] 0.3409ms 49.0053μs 20.4060 KOps/s 19.7038 KOps/s $\color{#35bf28}+3.56\%$
test_compile_add_self_flat[pytree-eager] 0.3684ms 0.1642ms 6.0910 KOps/s 6.2098 KOps/s $\color{#d91a1a}-1.91\%$
test_compile_add_self_flat[pytree-compile] 0.1752ms 0.1008ms 9.9214 KOps/s 9.8986 KOps/s $\color{#35bf28}+0.23\%$
test_compile_copy_flat[tensordict-compile] 60.9230μs 20.8340μs 47.9985 KOps/s 47.3309 KOps/s $\color{#35bf28}+1.41\%$
test_compile_copy_flat[tensordict-eager] 0.1494ms 68.5625μs 14.5852 KOps/s 15.0306 KOps/s $\color{#d91a1a}-2.96\%$
test_compile_copy_flat[pytree-compile] 0.2323ms 83.4707μs 11.9803 KOps/s 12.2345 KOps/s $\color{#d91a1a}-2.08\%$
test_compile_copy_flat[pytree-eager] 0.1160ms 68.0617μs 14.6925 KOps/s 14.8208 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_assign_and_add[tensordict-compile] 0.5718ms 0.2180ms 4.5876 KOps/s 4.7023 KOps/s $\color{#d91a1a}-2.44\%$
test_compile_assign_and_add[tensordict-eager] 1.6185ms 1.3687ms 730.6345 Ops/s 722.0444 Ops/s $\color{#35bf28}+1.19\%$
test_compile_assign_and_add[pytree-compile] 0.3123ms 0.2112ms 4.7343 KOps/s 4.7446 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_assign_and_add[pytree-eager] 1.7398ms 0.8329ms 1.2007 KOps/s 1.1942 KOps/s $\color{#35bf28}+0.54\%$
test_compile_assign_and_add_stack[compile] 0.8237ms 0.4626ms 2.1617 KOps/s 2.1920 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_assign_and_add_stack[eager] 2.9123ms 2.6960ms 370.9247 Ops/s 354.0799 Ops/s $\color{#35bf28}+4.76\%$
test_compile_indexing[tensor-tensordict-compile] 90.6690μs 38.9448μs 25.6773 KOps/s 25.9003 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_indexing[tensor-tensordict-eager] 0.5940ms 33.1436μs 30.1717 KOps/s 29.2257 KOps/s $\color{#35bf28}+3.24\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1352ms 31.6795μs 31.5661 KOps/s 31.3385 KOps/s $\color{#35bf28}+0.73\%$
test_compile_indexing[tensor-tensorclass-eager] 0.5797ms 23.0931μs 43.3029 KOps/s 40.8484 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_compile_indexing[tensor-pytree-compile] 92.7430μs 31.7059μs 31.5399 KOps/s 30.5264 KOps/s $\color{#35bf28}+3.32\%$
test_compile_indexing[tensor-pytree-eager] 64.7300μs 22.9197μs 43.6307 KOps/s 41.9134 KOps/s $\color{#35bf28}+4.10\%$
test_compile_indexing[slice-tensordict-compile] 0.1366ms 52.7919μs 18.9423 KOps/s 19.0129 KOps/s $\color{#d91a1a}-0.37\%$
test_compile_indexing[slice-tensordict-eager] 0.3838ms 19.2303μs 52.0013 KOps/s 49.3713 KOps/s $\textbf{\color{#35bf28}+5.33\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1221ms 46.1500μs 21.6685 KOps/s 21.2749 KOps/s $\color{#35bf28}+1.85\%$
test_compile_indexing[slice-tensorclass-eager] 50.6150μs 18.5929μs 53.7839 KOps/s 53.6053 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[slice-pytree-compile] 98.9540μs 46.7660μs 21.3830 KOps/s 21.2672 KOps/s $\color{#35bf28}+0.54\%$
test_compile_indexing[slice-pytree-eager] 65.3720μs 18.7618μs 53.2998 KOps/s 53.6329 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_indexing[int-tensordict-compile] 0.1152ms 54.0010μs 18.5182 KOps/s 18.3499 KOps/s $\color{#35bf28}+0.92\%$
test_compile_indexing[int-tensordict-eager] 0.9827ms 19.1544μs 52.2072 KOps/s 49.4436 KOps/s $\textbf{\color{#35bf28}+5.59\%}$
test_compile_indexing[int-tensorclass-compile] 0.1124ms 46.1708μs 21.6587 KOps/s 21.3371 KOps/s $\color{#35bf28}+1.51\%$
test_compile_indexing[int-tensorclass-eager] 0.1287ms 19.0606μs 52.4643 KOps/s 53.5299 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_indexing[int-pytree-compile] 0.1119ms 46.2624μs 21.6158 KOps/s 21.1581 KOps/s $\color{#35bf28}+2.16\%$
test_compile_indexing[int-pytree-eager] 73.9480μs 18.5569μs 53.8882 KOps/s 53.5920 KOps/s $\color{#35bf28}+0.55\%$
test_mod_add[eager] 93.8240μs 35.6360μs 28.0615 KOps/s 27.1542 KOps/s $\color{#35bf28}+3.34\%$
test_mod_add[compile] 0.1101ms 63.8104μs 15.6714 KOps/s 15.5863 KOps/s $\color{#35bf28}+0.55\%$
test_mod_add[compile-overhead] 0.2042ms 64.0806μs 15.6053 KOps/s 15.5402 KOps/s $\color{#35bf28}+0.42\%$
test_mod_wrap[eager] 0.9339ms 0.2254ms 4.4365 KOps/s 4.4596 KOps/s $\color{#d91a1a}-0.52\%$
test_mod_wrap[compile] 1.9018ms 0.2299ms 4.3492 KOps/s 4.3505 KOps/s $\color{#d91a1a}-0.03\%$
test_mod_wrap[compile-overhead] 0.4242ms 0.2264ms 4.4177 KOps/s 4.3822 KOps/s $\color{#35bf28}+0.81\%$
test_mod_wrap_and_backward[eager] 15.9641ms 13.4876ms 74.1421 Ops/s 72.0827 Ops/s $\color{#35bf28}+2.86\%$
test_mod_wrap_and_backward[compile] 14.5857ms 11.7592ms 85.0397 Ops/s 85.3565 Ops/s $\color{#d91a1a}-0.37\%$
test_mod_wrap_and_backward[compile-overhead] 16.7307ms 11.9013ms 84.0245 Ops/s 87.4527 Ops/s $\color{#d91a1a}-3.92\%$
test_seq_add[eager] 0.1880ms 0.1154ms 8.6621 KOps/s 8.2400 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_seq_add[compile] 0.1398ms 75.7865μs 13.1950 KOps/s 12.8365 KOps/s $\color{#35bf28}+2.79\%$
test_seq_add[compile-overhead] 0.1384ms 74.9989μs 13.3335 KOps/s 13.3191 KOps/s $\color{#35bf28}+0.11\%$
test_seq_wrap[eager] 0.5786ms 0.4380ms 2.2831 KOps/s 2.1509 KOps/s $\textbf{\color{#35bf28}+6.14\%}$
test_seq_wrap[compile] 0.4534ms 0.2445ms 4.0896 KOps/s 4.0100 KOps/s $\color{#35bf28}+1.98\%$
test_seq_wrap[compile-overhead] 0.3412ms 0.2427ms 4.1197 KOps/s 4.0278 KOps/s $\color{#35bf28}+2.28\%$
test_func_call_runtime[False-eager] 0.7223ms 0.5300ms 1.8866 KOps/s 1.8483 KOps/s $\color{#35bf28}+2.08\%$
test_func_call_runtime[False-compile] 0.5586ms 0.4438ms 2.2534 KOps/s 2.2153 KOps/s $\color{#35bf28}+1.72\%$
test_func_call_runtime[False-compile-overhead] 0.5366ms 0.4429ms 2.2578 KOps/s 2.2218 KOps/s $\color{#35bf28}+1.62\%$
test_func_call_runtime[True-eager] 1.1114ms 0.7417ms 1.3483 KOps/s 1.3263 KOps/s $\color{#35bf28}+1.66\%$
test_func_call_runtime[True-compile] 0.5717ms 0.4624ms 2.1627 KOps/s 2.1236 KOps/s $\color{#35bf28}+1.84\%$
test_func_call_runtime[True-compile-overhead] 0.6886ms 0.4639ms 2.1557 KOps/s 2.1266 KOps/s $\color{#35bf28}+1.37\%$
test_func_call_cm_runtime[False-eager] 0.9641ms 0.5306ms 1.8845 KOps/s 1.8505 KOps/s $\color{#35bf28}+1.84\%$
test_func_call_cm_runtime[False-compile] 0.6201ms 0.4385ms 2.2805 KOps/s 2.2049 KOps/s $\color{#35bf28}+3.43\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6220ms 0.4407ms 2.2692 KOps/s 2.2187 KOps/s $\color{#35bf28}+2.28\%$
test_func_call_cm_runtime[True-eager] 1.0202ms 0.8791ms 1.1375 KOps/s 1.1095 KOps/s $\color{#35bf28}+2.53\%$
test_func_call_cm_runtime[True-compile] 1.1585ms 0.7833ms 1.2767 KOps/s 1.2448 KOps/s $\color{#35bf28}+2.57\%$
test_func_call_cm_runtime[True-compile-overhead] 1.3268ms 0.7936ms 1.2601 KOps/s 1.2324 KOps/s $\color{#35bf28}+2.24\%$
test_vmap_func_call_cm_runtime[eager] 3.9205ms 1.9193ms 521.0315 Ops/s 517.6806 Ops/s $\color{#35bf28}+0.65\%$
test_vmap_func_call_cm_runtime[compile] 0.8637ms 0.5537ms 1.8061 KOps/s 1.8355 KOps/s $\color{#d91a1a}-1.60\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7810ms 0.5387ms 1.8563 KOps/s 1.8335 KOps/s $\color{#35bf28}+1.24\%$
test_distributed 0.2775ms 0.1279ms 7.8198 KOps/s 7.8115 KOps/s $\color{#35bf28}+0.11\%$
test_tdmodule 46.8070μs 27.0599μs 36.9551 KOps/s 36.6310 KOps/s $\color{#35bf28}+0.88\%$
test_tdmodule_dispatch 0.1148ms 51.4228μs 19.4466 KOps/s 19.7018 KOps/s $\color{#d91a1a}-1.30\%$
test_tdseq 48.8610μs 29.5049μs 33.8927 KOps/s 32.7639 KOps/s $\color{#35bf28}+3.45\%$
test_tdseq_dispatch 0.1157ms 55.6928μs 17.9556 KOps/s 18.1166 KOps/s $\color{#d91a1a}-0.89\%$
test_instantiation_functorch 2.0360ms 1.5378ms 650.2841 Ops/s 639.8023 Ops/s $\color{#35bf28}+1.64\%$
test_exec_functorch 0.3884ms 0.1766ms 5.6611 KOps/s 5.6186 KOps/s $\color{#35bf28}+0.76\%$
test_exec_functional_call 0.3989ms 0.1713ms 5.8387 KOps/s 5.8353 KOps/s $\color{#35bf28}+0.06\%$
test_exec_td_decorator 0.5217ms 0.2331ms 4.2900 KOps/s 4.2743 KOps/s $\color{#35bf28}+0.37\%$
test_vmap_mlp_speed_decorator[True-True] 0.9678ms 0.6575ms 1.5208 KOps/s 1.4939 KOps/s $\color{#35bf28}+1.80\%$
test_vmap_mlp_speed_decorator[True-False] 0.8806ms 0.6531ms 1.5311 KOps/s 1.5001 KOps/s $\color{#35bf28}+2.07\%$
test_vmap_mlp_speed_decorator[False-True] 0.8592ms 0.5301ms 1.8866 KOps/s 1.8675 KOps/s $\color{#35bf28}+1.02\%$
test_vmap_mlp_speed_decorator[False-False] 0.8765ms 0.5334ms 1.8748 KOps/s 1.8564 KOps/s $\color{#35bf28}+0.99\%$
test_to_module_speed[True] 1.9705ms 1.3308ms 751.4401 Ops/s 753.2216 Ops/s $\color{#d91a1a}-0.24\%$
test_to_module_speed[False] 2.4274ms 1.3214ms 756.7445 Ops/s 771.0601 Ops/s $\color{#d91a1a}-1.86\%$
test_tc_init 88.8260μs 49.3753μs 20.2530 KOps/s 20.2017 KOps/s $\color{#35bf28}+0.25\%$
test_tc_init_nested 0.1824ms 98.6175μs 10.1402 KOps/s 10.1113 KOps/s $\color{#35bf28}+0.29\%$
test_tc_first_layer_tensor 43.8520μs 1.6314μs 612.9637 KOps/s 639.3121 KOps/s $\color{#d91a1a}-4.12\%$
test_tc_first_layer_nontensor 30.7970μs 4.9586μs 201.6687 KOps/s 209.5784 KOps/s $\color{#d91a1a}-3.77\%$
test_tc_second_layer_tensor 22.2920μs 2.9403μs 340.0963 KOps/s 343.6534 KOps/s $\color{#d91a1a}-1.04\%$
test_tc_second_layer_nontensor 51.9570μs 6.3697μs 156.9938 KOps/s 163.5150 KOps/s $\color{#d91a1a}-3.99\%$
test_unbind 0.2491s 13.9931ms 71.4639 Ops/s 70.1015 Ops/s $\color{#35bf28}+1.94\%$
test_full_like 9.2718ms 8.0533ms 124.1725 Ops/s 139.4272 Ops/s $\textbf{\color{#d91a1a}-10.94\%}$
test_zeros_like 9.2510ms 4.6573ms 214.7161 Ops/s 359.8828 Ops/s $\textbf{\color{#d91a1a}-40.34\%}$
test_ones_like 5.0855ms 3.3418ms 299.2442 Ops/s 313.3520 Ops/s $\color{#d91a1a}-4.50\%$
test_clone 13.2095ms 7.4227ms 134.7225 Ops/s 187.3545 Ops/s $\textbf{\color{#d91a1a}-28.09\%}$
test_squeeze 69.4400μs 12.6291μs 79.1822 KOps/s 79.5056 KOps/s $\color{#d91a1a}-0.41\%$
test_unsqueeze 0.2755ms 95.0682μs 10.5188 KOps/s 10.3282 KOps/s $\color{#35bf28}+1.84\%$
test_split 0.4317ms 0.1956ms 5.1122 KOps/s 5.0690 KOps/s $\color{#35bf28}+0.85\%$
test_permute 0.3364ms 0.2030ms 4.9270 KOps/s 4.9264 KOps/s $\color{#35bf28}+0.01\%$
test_stack 28.2078ms 25.2077ms 39.6704 Ops/s 40.2664 Ops/s $\color{#d91a1a}-1.48\%$
test_cat 29.8918ms 25.2455ms 39.6111 Ops/s 40.7719 Ops/s $\color{#d91a1a}-2.85\%$
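
The Change column is consistent with the relative difference between the Ops column and the Ops on Repo HEAD column. A minimal sketch of that calculation follows, assuming this is how the bot derives the figure. `parse_ops` and `relative_change_pct` are hypothetical helpers, not part of the benchmark tooling.

```python
# Multipliers for the throughput units that appear in the table.
_OPS_UNIT = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text):
    """Turn a cell such as '47.4236 KOps/s' into operations per second."""
    value, unit = text.split()
    return float(value) * _OPS_UNIT[unit]


def relative_change_pct(ops, ops_head):
    """Percentage change of the PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


# Values from the test_membership_stacked_nested_last row.
new = parse_ops("235.0791 KOps/s")
head = parse_ops("176.9916 KOps/s")
print(f"{relative_change_pct(new, head):+.2f}%")
```

Because the metric is throughput, a positive change means the PR ran faster than HEAD and a negative change means it ran slower.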


⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: 16. Worsened: 15.
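
When reading the detailed results, note that the Ops column appears to be the reciprocal of the Mean column, with Mean expressed in seconds. The sketch below checks that relationship against reported rows. `parse_seconds` and `ops_per_second` are hypothetical helpers, and small mismatches are expected because the table rounds Mean.

```python
# Multipliers for the time units that appear in the Max and Mean columns.
_TIME_UNIT = {"s": 1.0, "ms": 1e-3, "μs": 1e-6}


def parse_seconds(text):
    """Turn a cell such as '12.8500μs' or '0.4181ms' into seconds."""
    # Normalise the micro sign (U+00B5) to the Greek mu (U+03BC).
    text = text.strip().replace("\u00b5", "\u03bc")
    # Check the two-character suffixes before the bare 's'.
    for suffix in ("ms", "μs", "s"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * _TIME_UNIT[suffix]
    raise ValueError(f"unrecognised time unit in {text!r}")


def ops_per_second(mean_text):
    """Throughput implied by a mean time per call."""
    return 1.0 / parse_seconds(mean_text)


# GPU test_plain_set_nested: Mean 12.8500μs, reported Ops 77.8213 KOps/s.
print(f"{ops_per_second('12.8500μs') / 1e3:.2f} KOps/s")
```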

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 40.9700μs 12.8500μs 77.8213 KOps/s 81.3223 KOps/s $\color{#d91a1a}-4.31\%$
test_plain_set_stack_nested 34.1700μs 12.9815μs 77.0327 KOps/s 80.1578 KOps/s $\color{#d91a1a}-3.90\%$
test_plain_set_nested_inplace 43.4410μs 13.9801μs 71.5302 KOps/s 73.7923 KOps/s $\color{#d91a1a}-3.07\%$
test_plain_set_stack_nested_inplace 45.8910μs 13.8753μs 72.0706 KOps/s 74.8436 KOps/s $\color{#d91a1a}-3.71\%$
test_items 29.9310μs 2.8676μs 348.7220 KOps/s 342.5485 KOps/s $\color{#35bf28}+1.80\%$
test_items_nested 0.4357ms 0.3702ms 2.7012 KOps/s 2.6730 KOps/s $\color{#35bf28}+1.06\%$
test_items_nested_locked 0.4173ms 0.3715ms 2.6916 KOps/s 2.6617 KOps/s $\color{#35bf28}+1.12\%$
test_items_nested_leaf 91.4410μs 60.0957μs 16.6401 KOps/s 16.5699 KOps/s $\color{#35bf28}+0.42\%$
test_items_stack_nested 0.4035ms 0.3683ms 2.7155 KOps/s 2.6927 KOps/s $\color{#35bf28}+0.85\%$
test_items_stack_nested_leaf 87.5810μs 60.5718μs 16.5093 KOps/s 16.3929 KOps/s $\color{#35bf28}+0.71\%$
test_items_stack_nested_locked 0.4306ms 0.3720ms 2.6880 KOps/s 2.6653 KOps/s $\color{#35bf28}+0.85\%$
test_keys 37.2610μs 3.4274μs 291.7690 KOps/s 291.9284 KOps/s $\color{#d91a1a}-0.05\%$
test_keys_nested 0.1334ms 87.9063μs 11.3758 KOps/s 11.3327 KOps/s $\color{#35bf28}+0.38\%$
test_keys_nested_locked 0.7633ms 94.4716μs 10.5852 KOps/s 10.6198 KOps/s $\color{#d91a1a}-0.33\%$
test_keys_nested_leaf 0.1151ms 79.8926μs 12.5168 KOps/s 12.6384 KOps/s $\color{#d91a1a}-0.96\%$
test_keys_stack_nested 0.1435ms 88.4761μs 11.3025 KOps/s 11.3579 KOps/s $\color{#d91a1a}-0.49\%$
test_keys_stack_nested_leaf 0.1132ms 80.3783μs 12.4412 KOps/s 12.5539 KOps/s $\color{#d91a1a}-0.90\%$
test_keys_stack_nested_locked 0.1325ms 94.9989μs 10.5264 KOps/s 10.5430 KOps/s $\color{#d91a1a}-0.16\%$
test_values 6.0435μs 0.8584μs 1.1650 MOps/s 1.1740 MOps/s $\color{#d91a1a}-0.77\%$
test_values_nested 77.9510μs 37.2826μs 26.8222 KOps/s 26.8376 KOps/s $\color{#d91a1a}-0.06\%$
test_values_nested_locked 72.1010μs 39.6248μs 25.2367 KOps/s 25.6083 KOps/s $\color{#d91a1a}-1.45\%$
test_values_nested_leaf 78.9510μs 42.4983μs 23.5304 KOps/s 23.7158 KOps/s $\color{#d91a1a}-0.78\%$
test_values_stack_nested 65.7010μs 37.4495μs 26.7026 KOps/s 26.5897 KOps/s $\color{#35bf28}+0.42\%$
test_values_stack_nested_leaf 75.4010μs 42.4132μs 23.5776 KOps/s 23.4314 KOps/s $\color{#35bf28}+0.62\%$
test_values_stack_nested_locked 0.1033ms 39.9061μs 25.0588 KOps/s 25.5233 KOps/s $\color{#d91a1a}-1.82\%$
test_membership 1.7071μs 0.5016μs 1.9936 MOps/s 1.9741 MOps/s $\color{#35bf28}+0.99\%$
test_membership_nested 29.0600μs 2.0976μs 476.7342 KOps/s 478.7006 KOps/s $\color{#d91a1a}-0.41\%$
test_membership_nested_leaf 21.9755μs 2.0089μs 497.7783 KOps/s 491.8793 KOps/s $\color{#35bf28}+1.20\%$
test_membership_stacked_nested 32.7710μs 2.1207μs 471.5489 KOps/s 475.8986 KOps/s $\color{#d91a1a}-0.91\%$
test_membership_stacked_nested_leaf 41.8710μs 2.1199μs 471.7275 KOps/s 462.3066 KOps/s $\color{#35bf28}+2.04\%$
test_membership_nested_last 31.4910μs 3.0673μs 326.0244 KOps/s 322.5975 KOps/s $\color{#35bf28}+1.06\%$
test_membership_nested_leaf_last 32.1400μs 3.0397μs 328.9748 KOps/s 323.6620 KOps/s $\color{#35bf28}+1.64\%$
test_membership_stacked_nested_last 30.4610μs 3.1044μs 322.1272 KOps/s 324.5949 KOps/s $\color{#d91a1a}-0.76\%$
test_membership_stacked_nested_leaf_last 34.6410μs 3.1018μs 322.3926 KOps/s 323.1095 KOps/s $\color{#d91a1a}-0.22\%$
test_nested_getleaf 55.2410μs 6.1851μs 161.6790 KOps/s 159.1955 KOps/s $\color{#35bf28}+1.56\%$
test_nested_get 34.0410μs 5.9774μs 167.2966 KOps/s 168.1455 KOps/s $\color{#d91a1a}-0.50\%$
test_stacked_getleaf 35.5010μs 6.1623μs 162.2769 KOps/s 160.2197 KOps/s $\color{#35bf28}+1.28\%$
test_stacked_get 33.0100μs 5.8616μs 170.6008 KOps/s 170.4022 KOps/s $\color{#35bf28}+0.12\%$
test_nested_getitemleaf 30.0710μs 6.4790μs 154.3441 KOps/s 153.7289 KOps/s $\color{#35bf28}+0.40\%$
test_nested_getitem 31.0900μs 6.1507μs 162.5819 KOps/s 162.9327 KOps/s $\color{#d91a1a}-0.22\%$
test_stacked_getitemleaf 41.1610μs 6.4345μs 155.4115 KOps/s 155.5882 KOps/s $\color{#d91a1a}-0.11\%$
test_stacked_getitem 35.1610μs 5.9939μs 166.8363 KOps/s 167.0478 KOps/s $\color{#d91a1a}-0.13\%$
test_lock_nested 0.4003ms 0.3433ms 2.9130 KOps/s 2.9817 KOps/s $\color{#d91a1a}-2.30\%$
test_lock_stack_nested 0.3837ms 0.3494ms 2.8622 KOps/s 2.9064 KOps/s $\color{#d91a1a}-1.52\%$
test_unlock_nested 0.3827ms 0.2898ms 3.4506 KOps/s 3.5542 KOps/s $\color{#d91a1a}-2.92\%$
test_unlock_stack_nested 0.3366ms 0.2898ms 3.4506 KOps/s 3.5404 KOps/s $\color{#d91a1a}-2.53\%$
test_flatten_speed 0.1115ms 77.8701μs 12.8419 KOps/s 12.7541 KOps/s $\color{#35bf28}+0.69\%$
test_unflatten_speed 0.4177ms 0.3261ms 3.0662 KOps/s 3.1084 KOps/s $\color{#d91a1a}-1.36\%$
test_common_ops 0.7546ms 0.6212ms 1.6099 KOps/s 1.6362 KOps/s $\color{#d91a1a}-1.61\%$
test_creation 78.4810μs 1.7575μs 569.0022 KOps/s 580.2079 KOps/s $\color{#d91a1a}-1.93\%$
test_creation_empty 39.7000μs 9.0458μs 110.5483 KOps/s 124.5914 KOps/s $\textbf{\color{#d91a1a}-11.27\%}$
test_creation_nested_1 42.6100μs 10.5729μs 94.5817 KOps/s 102.6839 KOps/s $\textbf{\color{#d91a1a}-7.89\%}$
test_creation_nested_2 48.1910μs 13.3807μs 74.7344 KOps/s 80.9147 KOps/s $\textbf{\color{#d91a1a}-7.64\%}$
test_clone 56.7210μs 11.1498μs 89.6876 KOps/s 92.8230 KOps/s $\color{#d91a1a}-3.38\%$
test_getitem[int] 1.2048ms 10.9538μs 91.2927 KOps/s 95.6949 KOps/s $\color{#d91a1a}-4.60\%$
test_getitem[slice_int] 0.1167ms 21.4230μs 46.6787 KOps/s 49.1492 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_getitem[range] 0.1295ms 39.2133μs 25.5016 KOps/s 26.8209 KOps/s $\color{#d91a1a}-4.92\%$
test_getitem[tuple] 0.1064ms 18.4766μs 54.1226 KOps/s 56.3739 KOps/s $\color{#d91a1a}-3.99\%$
test_getitem[list] 0.1675ms 34.4229μs 29.0504 KOps/s 29.7768 KOps/s $\color{#d91a1a}-2.44\%$
test_setitem_dim[int] 47.6910μs 20.0976μs 49.7571 KOps/s 50.4232 KOps/s $\color{#d91a1a}-1.32\%$
test_setitem_dim[slice_int] 63.8010μs 39.1260μs 25.5585 KOps/s 25.7394 KOps/s $\color{#d91a1a}-0.70\%$
test_setitem_dim[range] 0.1059ms 55.9077μs 17.8866 KOps/s 18.8388 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_setitem_dim[tuple] 60.5710μs 32.6786μs 30.6011 KOps/s 29.6221 KOps/s $\color{#35bf28}+3.30\%$
test_setitem 67.0110μs 16.0120μs 62.4533 KOps/s 66.8440 KOps/s $\textbf{\color{#d91a1a}-6.57\%}$
test_set 75.7920μs 15.1916μs 65.8260 KOps/s 67.6725 KOps/s $\color{#d91a1a}-2.73\%$
test_set_shared 0.5159ms 0.1599ms 6.2535 KOps/s 6.2965 KOps/s $\color{#d91a1a}-0.68\%$
test_update 0.2313ms 18.6899μs 53.5049 KOps/s 56.7598 KOps/s $\textbf{\color{#d91a1a}-5.73\%}$
test_update_nested 68.9410μs 24.3475μs 41.0721 KOps/s 42.8787 KOps/s $\color{#d91a1a}-4.21\%$
test_update__nested 0.5101ms 25.5786μs 39.0952 KOps/s 38.8957 KOps/s $\color{#35bf28}+0.51\%$
test_set_nested 64.0610μs 16.7630μs 59.6552 KOps/s 62.2903 KOps/s $\color{#d91a1a}-4.23\%$
test_set_nested_new 76.2120μs 18.6645μs 53.5777 KOps/s 53.4038 KOps/s $\color{#35bf28}+0.33\%$
test_select 67.1610μs 30.1491μs 33.1684 KOps/s 33.7563 KOps/s $\color{#d91a1a}-1.74\%$
test_select_nested 74.3110μs 43.2636μs 23.1141 KOps/s 22.5378 KOps/s $\color{#35bf28}+2.56\%$
test_exclude_nested 94.1510μs 63.1741μs 15.8293 KOps/s 15.6248 KOps/s $\color{#35bf28}+1.31\%$
test_empty[True] 0.3604ms 0.2945ms 3.3957 KOps/s 3.3541 KOps/s $\color{#35bf28}+1.24\%$
test_empty[False] 3.7130μs 0.8388μs 1.1922 MOps/s 1.1961 MOps/s $\color{#d91a1a}-0.32\%$
test_to 88.2910μs 57.1923μs 17.4849 KOps/s 16.4618 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_to_nonblocking 91.3620μs 49.2259μs 20.3145 KOps/s 20.8840 KOps/s $\color{#d91a1a}-2.73\%$
test_unbind_speed 0.2923ms 0.2480ms 4.0329 KOps/s 4.2837 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_unbind_speed_stack0 0.3126ms 0.2474ms 4.0414 KOps/s 4.2288 KOps/s $\color{#d91a1a}-4.43\%$
test_unbind_speed_stack1 93.5453ms 0.7503ms 1.3328 KOps/s 1.3467 KOps/s $\color{#d91a1a}-1.03\%$
test_split 94.2802ms 1.6044ms 623.2778 Ops/s 627.8731 Ops/s $\color{#d91a1a}-0.73\%$
test_chunk 96.3463ms 1.6143ms 619.4774 Ops/s 627.1133 Ops/s $\color{#d91a1a}-1.22\%$
test_consolidate[False-None] 2.8077ms 2.7081ms 369.2684 Ops/s 336.0734 Ops/s $\textbf{\color{#35bf28}+9.88\%}$
test_consolidate[default-None] 1.7968ms 1.7316ms 577.4926 Ops/s 584.7331 Ops/s $\color{#d91a1a}-1.24\%$
test_consolidate[reduce-overhead-None] 1.8454ms 1.7534ms 570.3048 Ops/s 572.8247 Ops/s $\color{#d91a1a}-0.44\%$
test_consolidate_njt[False-None] 6.7370ms 6.5330ms 153.0692 Ops/s 111.2031 Ops/s $\textbf{\color{#35bf28}+37.65\%}$
test_to[False-False-None] 1.8194ms 1.7532ms 570.3847 Ops/s 564.3212 Ops/s $\color{#35bf28}+1.07\%$
test_to[True-False-None] 1.4931ms 1.3763ms 726.6059 Ops/s 751.6225 Ops/s $\color{#d91a1a}-3.33\%$
test_to[within-False-None] 4.4646ms 4.2387ms 235.9187 Ops/s 237.8136 Ops/s $\color{#d91a1a}-0.80\%$
test_to[True-default-None] 5.5874ms 5.2450ms 190.6581 Ops/s 178.7626 Ops/s $\textbf{\color{#35bf28}+6.65\%}$
test_to_njt[False-False-None] 7.1205ms 6.9804ms 143.2579 Ops/s 134.1229 Ops/s $\textbf{\color{#35bf28}+6.81\%}$
test_to_njt[True-False-None] 5.7053ms 5.5317ms 180.7763 Ops/s 173.7753 Ops/s $\color{#35bf28}+4.03\%$
test_to_njt[within-False-None] 12.4026ms 12.1236ms 82.4838 Ops/s 80.6630 Ops/s $\color{#35bf28}+2.26\%$
test_creation[device0] 0.4579ms 79.3675μs 12.5996 KOps/s 12.6119 KOps/s $\color{#d91a1a}-0.10\%$
test_creation_from_tensor 0.6225ms 83.3414μs 11.9988 KOps/s 11.8954 KOps/s $\color{#35bf28}+0.87\%$
test_add_one[memmap_tensor0] 0.4339ms 7.0271μs 142.3069 KOps/s 148.1619 KOps/s $\color{#d91a1a}-3.95\%$
test_contiguous[memmap_tensor0] 2.0430μs 0.4183μs 2.3905 MOps/s 2.2562 MOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_stack[memmap_tensor0] 38.1010μs 4.7791μs 209.2435 KOps/s 229.5432 KOps/s $\textbf{\color{#d91a1a}-8.84\%}$
test_memmaptd_index 1.8343ms 0.2515ms 3.9766 KOps/s 4.1440 KOps/s $\color{#d91a1a}-4.04\%$
test_memmaptd_index_astensor 0.4534ms 0.3110ms 3.2153 KOps/s 3.2908 KOps/s $\color{#d91a1a}-2.29\%$
test_memmaptd_index_op 0.7522ms 0.6109ms 1.6369 KOps/s 1.7363 KOps/s $\textbf{\color{#d91a1a}-5.72\%}$
test_serialize_model 0.1310s 0.1298s 7.7012 Ops/s 7.6913 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_model_pickle 1.3820s 1.2222s 0.8182 Ops/s 0.8206 Ops/s $\color{#d91a1a}-0.29\%$
test_serialize_weights 0.1308s 0.1292s 7.7384 Ops/s 7.7419 Ops/s $\color{#d91a1a}-0.04\%$
test_serialize_weights_returnearly 0.2951s 52.8193ms 18.9325 Ops/s 14.8062 Ops/s $\textbf{\color{#35bf28}+27.87\%}$
test_serialize_weights_pickle 1.3779s 1.2203s 0.8195 Ops/s 0.7714 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_reshape_pytree 51.7400μs 22.0371μs 45.3781 KOps/s 45.0607 KOps/s $\color{#35bf28}+0.70\%$
test_reshape_td 66.1910μs 27.2803μs 36.6564 KOps/s 36.3830 KOps/s $\color{#35bf28}+0.75\%$
test_view_pytree 56.2200μs 22.1690μs 45.1081 KOps/s 45.4979 KOps/s $\color{#d91a1a}-0.86\%$
test_view_td 65.5910μs 31.4571μs 31.7893 KOps/s 30.8832 KOps/s $\color{#35bf28}+2.93\%$
test_unbind_pytree 58.0510μs 28.5857μs 34.9825 KOps/s 35.8853 KOps/s $\color{#d91a1a}-2.52\%$
test_unbind_td 0.6173ms 37.6149μs 26.5852 KOps/s 27.1819 KOps/s $\color{#d91a1a}-2.20\%$
test_split_pytree 65.5810μs 29.9188μs 33.4238 KOps/s 33.3854 KOps/s $\color{#35bf28}+0.11\%$
test_split_td 0.7787ms 39.0661μs 25.5977 KOps/s 25.5585 KOps/s $\color{#35bf28}+0.15\%$
test_add_pytree 66.4010μs 35.5357μs 28.1407 KOps/s 28.4188 KOps/s $\color{#d91a1a}-0.98\%$
test_add_td 0.1032ms 53.7308μs 18.6113 KOps/s 20.9821 KOps/s $\textbf{\color{#d91a1a}-11.30\%}$
test_compile_add_one_nested[tensordict-compile] 0.1744ms 0.1223ms 8.1735 KOps/s 7.9163 KOps/s $\color{#35bf28}+3.25\%$
test_compile_add_one_nested[tensordict-eager] 0.2263ms 0.1320ms 7.5782 KOps/s 7.3398 KOps/s $\color{#35bf28}+3.25\%$
test_compile_add_one_nested[pytree-compile] 0.2029ms 96.0188μs 10.4146 KOps/s 10.3462 KOps/s $\color{#35bf28}+0.66\%$
test_compile_add_one_nested[pytree-eager] 1.1541ms 0.1511ms 6.6191 KOps/s 6.6733 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_copy_nested[tensordict-compile] 54.8110μs 24.3814μs 41.0149 KOps/s 41.5395 KOps/s $\color{#d91a1a}-1.26\%$
test_compile_copy_nested[tensordict-eager] 95.8120μs 29.4006μs 34.0129 KOps/s 33.1532 KOps/s $\color{#35bf28}+2.59\%$
test_compile_copy_nested[pytree-compile] 0.4255ms 63.8800μs 15.6544 KOps/s 15.2975 KOps/s $\color{#35bf28}+2.33\%$
test_compile_copy_nested[pytree-eager] 82.7710μs 48.6699μs 20.5466 KOps/s 20.2094 KOps/s $\color{#35bf28}+1.67\%$
test_compile_add_one_flat[tensordict-compile] 0.1965ms 0.1426ms 7.0124 KOps/s 7.0377 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_add_one_flat[tensordict-eager] 0.3140ms 0.2188ms 4.5703 KOps/s 4.5716 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_add_one_flat[tensorclass-compile] 0.2210ms 97.7165μs 10.2337 KOps/s 10.2954 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_add_one_flat[tensorclass-eager] 0.1160ms 55.8169μs 17.9157 KOps/s 17.8406 KOps/s $\color{#35bf28}+0.42\%$
test_compile_add_one_flat[pytree-compile] 0.1858ms 0.1375ms 7.2706 KOps/s 7.3665 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_add_one_flat[pytree-eager] 0.5390ms 0.4920ms 2.0324 KOps/s 2.0490 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_add_self_flat[tensordict-eager] 0.3807ms 0.2639ms 3.7898 KOps/s 3.7699 KOps/s $\color{#35bf28}+0.53\%$
test_compile_add_self_flat[tensordict-compile] 0.1985ms 0.1474ms 6.7821 KOps/s 7.0611 KOps/s $\color{#d91a1a}-3.95\%$
test_compile_add_self_flat[tensorclass-eager] 0.1636ms 67.2128μs 14.8781 KOps/s 14.3054 KOps/s $\color{#35bf28}+4.00\%$
test_compile_add_self_flat[tensorclass-compile] 0.1418ms 0.1003ms 9.9734 KOps/s 10.0912 KOps/s $\color{#d91a1a}-1.17\%$
test_compile_add_self_flat[pytree-eager] 0.4800ms 0.4182ms 2.3912 KOps/s 2.4414 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_self_flat[pytree-compile] 0.1792ms 0.1381ms 7.2388 KOps/s 7.4232 KOps/s $\color{#d91a1a}-2.49\%$
test_compile_copy_flat[tensordict-compile] 58.7710μs 18.9517μs 52.7657 KOps/s 46.4657 KOps/s $\textbf{\color{#35bf28}+13.56\%}$
test_compile_copy_flat[tensordict-eager] 92.8320μs 31.3424μs 31.9056 KOps/s 31.4160 KOps/s $\color{#35bf28}+1.56\%$
test_compile_copy_flat[pytree-compile] 0.2180ms 68.8135μs 14.5320 KOps/s 14.3578 KOps/s $\color{#35bf28}+1.21\%$
test_compile_copy_flat[pytree-eager] 0.1384ms 51.4516μs 19.4357 KOps/s 19.2906 KOps/s $\color{#35bf28}+0.75\%$
test_compile_assign_and_add[tensordict-compile] 1.6371ms 0.4003ms 2.4983 KOps/s 2.2053 KOps/s $\textbf{\color{#35bf28}+13.29\%}$
test_compile_assign_and_add[tensordict-eager] 2.9187ms 2.7465ms 364.1025 Ops/s 376.8972 Ops/s $\color{#d91a1a}-3.39\%$
test_compile_assign_and_add[pytree-compile] 1.5984ms 0.4325ms 2.3124 KOps/s 2.2681 KOps/s $\color{#35bf28}+1.95\%$
test_compile_assign_and_add[pytree-eager] 2.8367ms 2.7249ms 366.9871 Ops/s 376.7115 Ops/s $\color{#d91a1a}-2.58\%$
test_compile_indexing[tensor-tensordict-compile] 0.5111ms 0.1176ms 8.5031 KOps/s 8.5720 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_indexing[tensor-tensordict-eager] 0.5650ms 80.7123μs 12.3897 KOps/s 12.1327 KOps/s $\color{#35bf28}+2.12\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5430ms 0.1128ms 8.8623 KOps/s 9.2199 KOps/s $\color{#d91a1a}-3.88\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1290ms 69.8626μs 14.3138 KOps/s 14.2006 KOps/s $\color{#35bf28}+0.80\%$
test_compile_indexing[tensor-pytree-compile] 0.2335ms 0.1139ms 8.7799 KOps/s 9.1360 KOps/s $\color{#d91a1a}-3.90\%$
test_compile_indexing[tensor-pytree-eager] 0.1521ms 69.4133μs 14.4065 KOps/s 14.2129 KOps/s $\color{#35bf28}+1.36\%$
test_compile_indexing[slice-tensordict-compile] 0.1480ms 0.1040ms 9.6187 KOps/s 9.8932 KOps/s $\color{#d91a1a}-2.77\%$
test_compile_indexing[slice-tensordict-eager] 0.1480ms 17.2742μs 57.8897 KOps/s 57.1565 KOps/s $\color{#35bf28}+1.28\%$
test_compile_indexing[slice-tensorclass-compile] 0.1439ms 97.0872μs 10.3000 KOps/s 10.3329 KOps/s $\color{#d91a1a}-0.32\%$
test_compile_indexing[slice-tensorclass-eager] 84.8110μs 15.6353μs 63.9577 KOps/s 64.3575 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_indexing[slice-pytree-compile] 0.1671ms 97.5174μs 10.2546 KOps/s 10.2303 KOps/s $\color{#35bf28}+0.24\%$
test_compile_indexing[slice-pytree-eager] 47.1010μs 15.6956μs 63.7121 KOps/s 57.7693 KOps/s $\textbf{\color{#35bf28}+10.29\%}$
test_compile_indexing[int-tensordict-compile] 0.1515ms 0.1028ms 9.7282 KOps/s 9.6683 KOps/s $\color{#35bf28}+0.62\%$
test_compile_indexing[int-tensordict-eager] 0.5919ms 17.0922μs 58.5063 KOps/s 57.5289 KOps/s $\color{#35bf28}+1.70\%$
test_compile_indexing[int-tensorclass-compile] 0.1636ms 98.0224μs 10.2018 KOps/s 9.9624 KOps/s $\color{#35bf28}+2.40\%$
test_compile_indexing[int-tensorclass-eager] 50.9010μs 15.7669μs 63.4239 KOps/s 64.5180 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_indexing[int-pytree-compile] 0.1591ms 97.6275μs 10.2430 KOps/s 10.1749 KOps/s $\color{#35bf28}+0.67\%$
test_compile_indexing[int-pytree-eager] 0.2485ms 17.5007μs 57.1406 KOps/s 64.6684 KOps/s $\textbf{\color{#d91a1a}-11.64\%}$
test_mod_add[eager] 77.0410μs 39.5224μs 25.3021 KOps/s 26.1127 KOps/s $\color{#d91a1a}-3.10\%$
test_mod_add[compile] 0.3082ms 82.0436μs 12.1886 KOps/s 12.1894 KOps/s $-0.01\%$
test_mod_add[compile-overhead] 0.3390ms 0.1715ms 5.8308 KOps/s 5.6896 KOps/s $\color{#35bf28}+2.48\%$
test_mod_wrap[eager] 0.3380ms 0.2559ms 3.9074 KOps/s 3.7725 KOps/s $\color{#35bf28}+3.58\%$
test_mod_wrap[compile] 0.3592ms 0.2894ms 3.4552 KOps/s 3.4658 KOps/s $\color{#d91a1a}-0.31\%$
test_mod_wrap[compile-overhead] 7.2202ms 3.8200ms 261.7777 Ops/s 268.9822 Ops/s $\color{#d91a1a}-2.68\%$
test_mod_wrap_and_backward[eager] 1.5329ms 1.3913ms 718.7608 Ops/s 677.9732 Ops/s $\textbf{\color{#35bf28}+6.02\%}$
test_mod_wrap_and_backward[compile] 1.3867ms 1.2899ms 775.2711 Ops/s 714.5655 Ops/s $\textbf{\color{#35bf28}+8.50\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3848ms 0.9340ms 1.0707 KOps/s 954.8104 Ops/s $\textbf{\color{#35bf28}+12.13\%}$
test_seq_add[eager] 0.1868ms 0.1201ms 8.3245 KOps/s 8.5295 KOps/s $\color{#d91a1a}-2.40\%$
test_seq_add[compile] 0.1572ms 89.1346μs 11.2190 KOps/s 10.8283 KOps/s $\color{#35bf28}+3.61\%$
test_seq_add[compile-overhead] 0.1697ms 0.1305ms 7.6603 KOps/s 7.6080 KOps/s $\color{#35bf28}+0.69\%$
test_seq_wrap[eager] 0.5195ms 0.4337ms 2.3055 KOps/s 2.2236 KOps/s $\color{#35bf28}+3.68\%$
test_seq_wrap[compile] 0.4825ms 0.3118ms 3.2073 KOps/s 3.1003 KOps/s $\color{#35bf28}+3.45\%$
test_seq_wrap[compile-overhead] 0.3094ms 0.2269ms 4.4066 KOps/s 4.3613 KOps/s $\color{#35bf28}+1.04\%$
test_func_call_runtime[False-eager] 0.8157ms 0.7500ms 1.3333 KOps/s 1.2357 KOps/s $\textbf{\color{#35bf28}+7.90\%}$
test_func_call_runtime[False-compile] 0.8971ms 0.7634ms 1.3100 KOps/s 1.3111 KOps/s $\color{#d91a1a}-0.09\%$
test_func_call_runtime[False-compile-overhead] 0.4105ms 0.3689ms 2.7110 KOps/s 2.7214 KOps/s $\color{#d91a1a}-0.38\%$
test_func_call_runtime[True-eager] 0.9923ms 0.9187ms 1.0885 KOps/s 1.0705 KOps/s $\color{#35bf28}+1.68\%$
test_func_call_runtime[True-compile] 0.8481ms 0.7842ms 1.2752 KOps/s 1.2902 KOps/s $\color{#d91a1a}-1.16\%$
test_func_call_runtime[True-compile-overhead] 0.5562ms 0.3894ms 2.5682 KOps/s 2.5692 KOps/s $\color{#d91a1a}-0.04\%$
test_func_call_cm_runtime[False-eager] 0.8293ms 0.7510ms 1.3316 KOps/s 1.3103 KOps/s $\color{#35bf28}+1.62\%$
test_func_call_cm_runtime[False-compile] 1.0111ms 0.7614ms 1.3134 KOps/s 1.2879 KOps/s $\color{#35bf28}+1.97\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4484ms 0.3688ms 2.7112 KOps/s 2.7081 KOps/s $\color{#35bf28}+0.11\%$
test_func_call_cm_runtime[True-eager] 1.1059ms 1.0215ms 978.9055 Ops/s 961.0818 Ops/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[True-compile] 1.1188ms 1.0050ms 995.0678 Ops/s 975.3712 Ops/s $\color{#35bf28}+2.02\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0697ms 1.0064ms 993.6562 Ops/s 971.0163 Ops/s $\color{#35bf28}+2.33\%$
test_vmap_func_call_cm_runtime[eager] 2.5238ms 2.1032ms 475.4685 Ops/s 467.7946 Ops/s $\color{#35bf28}+1.64\%$
test_vmap_func_call_cm_runtime[compile] 0.9021ms 0.8268ms 1.2096 KOps/s 1.2106 KOps/s $\color{#d91a1a}-0.09\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4649ms 0.4187ms 2.3885 KOps/s 2.3707 KOps/s $\color{#35bf28}+0.75\%$
test_distributed 2.8979ms 0.2354ms 4.2482 KOps/s 8.7807 KOps/s $\textbf{\color{#d91a1a}-51.62\%}$
test_tdmodule 36.2710μs 20.8044μs 48.0668 KOps/s 50.5049 KOps/s $\color{#d91a1a}-4.83\%$
test_tdmodule_dispatch 83.2810μs 37.2086μs 26.8755 KOps/s 28.3575 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_tdseq 31.0800μs 21.4020μs 46.7246 KOps/s 48.2386 KOps/s $\color{#d91a1a}-3.14\%$
test_tdseq_dispatch 58.1910μs 38.6556μs 25.8694 KOps/s 25.8834 KOps/s $\color{#d91a1a}-0.05\%$
test_instantiation_functorch 1.6976ms 1.5738ms 635.4113 Ops/s 649.6501 Ops/s $\color{#d91a1a}-2.19\%$
test_exec_functorch 0.1873ms 0.1461ms 6.8451 KOps/s 7.0015 KOps/s $\color{#d91a1a}-2.23\%$
test_exec_functional_call 0.1982ms 0.1411ms 7.0885 KOps/s 7.3345 KOps/s $\color{#d91a1a}-3.35\%$
test_exec_td_decorator 0.3803ms 0.1909ms 5.2394 KOps/s 5.3129 KOps/s $\color{#d91a1a}-1.38\%$
test_vmap_mlp_speed_decorator[True-True] 0.8366ms 0.6931ms 1.4428 KOps/s 1.4404 KOps/s $\color{#35bf28}+0.17\%$
test_vmap_mlp_speed_decorator[True-False] 0.8281ms 0.6913ms 1.4465 KOps/s 1.4427 KOps/s $\color{#35bf28}+0.27\%$
test_vmap_mlp_speed_decorator[False-True] 0.7136ms 0.6010ms 1.6639 KOps/s 1.6552 KOps/s $\color{#35bf28}+0.52\%$
test_vmap_mlp_speed_decorator[False-False] 0.7155ms 0.6005ms 1.6653 KOps/s 1.6515 KOps/s $\color{#35bf28}+0.84\%$
test_vmap_transformer_speed_decorator[True-True] 20.1005ms 19.4450ms 51.4272 Ops/s 51.1006 Ops/s $\color{#35bf28}+0.64\%$
test_vmap_transformer_speed_decorator[True-False] 19.5093ms 19.4289ms 51.4696 Ops/s 51.2784 Ops/s $\color{#35bf28}+0.37\%$
test_vmap_transformer_speed_decorator[False-True] 19.3391ms 19.2907ms 51.8385 Ops/s 51.6929 Ops/s $\color{#35bf28}+0.28\%$
test_vmap_transformer_speed_decorator[False-False] 19.3845ms 19.2929ms 51.8326 Ops/s 51.6996 Ops/s $\color{#35bf28}+0.26\%$
test_to_module_speed[True] 1.4302ms 0.9694ms 1.0316 KOps/s 1.0240 KOps/s $\color{#35bf28}+0.74\%$
test_to_module_speed[False] 1.0211ms 0.9507ms 1.0519 KOps/s 1.0391 KOps/s $\color{#35bf28}+1.23\%$
test_tc_init 75.6310μs 36.6059μs 27.3180 KOps/s 28.2507 KOps/s $\color{#d91a1a}-3.30\%$
test_tc_init_nested 0.1178ms 73.0007μs 13.6985 KOps/s 14.4008 KOps/s $\color{#d91a1a}-4.88\%$
test_tc_first_layer_tensor 24.7310μs 0.7970μs 1.2547 MOps/s 1.2590 MOps/s $\color{#d91a1a}-0.35\%$
test_tc_first_layer_nontensor 31.6900μs 2.2046μs 453.5987 KOps/s 451.2249 KOps/s $\color{#35bf28}+0.53\%$
test_tc_second_layer_tensor 9.4402μs 1.4078μs 710.3391 KOps/s 709.0221 KOps/s $\color{#35bf28}+0.19\%$
test_tc_second_layer_nontensor 31.2910μs 2.9328μs 340.9676 KOps/s 335.3211 KOps/s $\color{#35bf28}+1.68\%$
test_unbind 0.2164s 11.9649ms 83.5775 Ops/s 141.1119 Ops/s $\textbf{\color{#d91a1a}-40.77\%}$
test_full_like 9.3135ms 9.1365ms 109.4505 Ops/s 108.2235 Ops/s $\color{#35bf28}+1.13\%$
test_zeros_like 6.7772ms 4.3513ms 229.8182 Ops/s 234.7166 Ops/s $\color{#d91a1a}-2.09\%$
test_ones_like 4.9591ms 4.3278ms 231.0635 Ops/s 235.4041 Ops/s $\color{#d91a1a}-1.84\%$
test_clone 6.9333ms 6.3820ms 156.6903 Ops/s 109.8489 Ops/s $\textbf{\color{#35bf28}+42.64\%}$
test_squeeze 53.4510μs 10.0993μs 99.0171 KOps/s 103.2357 KOps/s $\color{#d91a1a}-4.09\%$
test_unsqueeze 0.1264ms 74.4285μs 13.4357 KOps/s 13.5746 KOps/s $\color{#d91a1a}-1.02\%$
test_split 0.3711ms 0.1621ms 6.1703 KOps/s 6.1252 KOps/s $\color{#35bf28}+0.74\%$
test_permute 0.2425ms 0.1859ms 5.3797 KOps/s 5.4420 KOps/s $\color{#d91a1a}-1.14\%$
test_stack 50.8762ms 50.5157ms 19.7958 Ops/s 19.9554 Ops/s $\color{#d91a1a}-0.80\%$
test_cat 50.6447ms 50.2292ms 19.9088 Ops/s 20.0065 Ops/s $\color{#d91a1a}-0.49\%$
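
For reading the Change column: the values appear consistent with the relative difference between the PR's throughput (Ops) and the throughput on repo HEAD, with bold entries seeming to mark changes larger than about 5% in either direction. The sketch below recomputes two rows under that assumption. The helper names (`to_ops_per_second`, `percent_change`) are hypothetical and are not part of tensordict or the benchmark bot; the unit table is likewise an assumption based on the units shown above.

```python
# Hypothetical helpers for sanity-checking the "Change" column.
# Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD * 100.

# Units as they appear in the table, converted to plain operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Normalise a throughput figure so rows with mixed units can be compared."""
    return value * UNIT_SCALE[unit]


def percent_change(ops_pr: float, unit_pr: str, ops_head: float, unit_head: str) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent."""
    pr = to_ops_per_second(ops_pr, unit_pr)
    head = to_ops_per_second(ops_head, unit_head)
    return (pr - head) / head * 100.0


# Figures copied from the test_distributed and test_clone rows above.
distributed = percent_change(4.2482, "KOps/s", 8.7807, "KOps/s")
clone = percent_change(156.6903, "Ops/s", 109.8489, "Ops/s")

print(f"test_distributed: {distributed:+.2f}%")
print(f"test_clone: {clone:+.2f}%")
```

Because the table prints throughput to four decimals, a recomputed value can differ from the reported one in the last digit, most visibly for rows that mix KOps/s and Ops/s.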
