[Feature] inplace arg in TDM constructor #992

Merged Sep 17, 2024 (8 commits)

@vmoens (Contributor) commented Sep 16, 2024

[ghstack-poisoned]
@facebook-github-bot added the CLA Signed label Sep 16, 2024

github-actions bot commented Sep 16, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}25$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 40.2750μs 19.1191μs 52.3036 KOps/s 49.2278 KOps/s $\textbf{\color{#35bf28}+6.25\%}$
test_plain_set_stack_nested 84.5070μs 19.1860μs 52.1213 KOps/s 48.5777 KOps/s $\textbf{\color{#35bf28}+7.29\%}$
test_plain_set_nested_inplace 46.1460μs 20.7332μs 48.2319 KOps/s 44.8778 KOps/s $\textbf{\color{#35bf28}+7.47\%}$
test_plain_set_stack_nested_inplace 0.1020ms 20.9719μs 47.6830 KOps/s 45.3087 KOps/s $\textbf{\color{#35bf28}+5.24\%}$
test_items 21.5810μs 4.1346μs 241.8623 KOps/s 233.9152 KOps/s $\color{#35bf28}+3.40\%$
test_items_nested 0.5236ms 0.3592ms 2.7842 KOps/s 2.8159 KOps/s $\color{#d91a1a}-1.13\%$
test_items_nested_locked 0.5281ms 0.3617ms 2.7645 KOps/s 2.7950 KOps/s $\color{#d91a1a}-1.09\%$
test_items_nested_leaf 0.1297ms 68.4004μs 14.6198 KOps/s 14.3806 KOps/s $\color{#35bf28}+1.66\%$
test_items_stack_nested 0.6432ms 0.3639ms 2.7478 KOps/s 2.7598 KOps/s $\color{#d91a1a}-0.43\%$
test_items_stack_nested_leaf 0.1645ms 71.7869μs 13.9301 KOps/s 13.6716 KOps/s $\color{#35bf28}+1.89\%$
test_items_stack_nested_locked 0.5918ms 0.3732ms 2.6794 KOps/s 2.7540 KOps/s $\color{#d91a1a}-2.71\%$
test_keys 22.7230μs 3.5836μs 279.0479 KOps/s 281.1463 KOps/s $\color{#d91a1a}-0.75\%$
test_keys_nested 0.1785ms 0.1022ms 9.7857 KOps/s 9.7985 KOps/s $\color{#d91a1a}-0.13\%$
test_keys_nested_locked 1.9011ms 0.1078ms 9.2747 KOps/s 9.2849 KOps/s $\color{#d91a1a}-0.11\%$
test_keys_nested_leaf 0.1479ms 84.3390μs 11.8569 KOps/s 11.5880 KOps/s $\color{#35bf28}+2.32\%$
test_keys_stack_nested 0.1650ms 99.3052μs 10.0700 KOps/s 9.7232 KOps/s $\color{#35bf28}+3.57\%$
test_keys_stack_nested_leaf 0.1404ms 81.3850μs 12.2873 KOps/s 11.6883 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_keys_stack_nested_locked 0.1998ms 0.1044ms 9.5794 KOps/s 9.2975 KOps/s $\color{#35bf28}+3.03\%$
test_values 13.1822μs 1.0974μs 911.2209 KOps/s 931.0493 KOps/s $\color{#d91a1a}-2.13\%$
test_values_nested 0.1391ms 74.6810μs 13.3903 KOps/s 13.5471 KOps/s $\color{#d91a1a}-1.16\%$
test_values_nested_locked 0.1432ms 73.0934μs 13.6811 KOps/s 13.5840 KOps/s $\color{#35bf28}+0.72\%$
test_values_nested_leaf 0.1173ms 61.6492μs 16.2208 KOps/s 15.9900 KOps/s $\color{#35bf28}+1.44\%$
test_values_stack_nested 0.1353ms 74.2577μs 13.4666 KOps/s 13.3737 KOps/s $\color{#35bf28}+0.70\%$
test_values_stack_nested_leaf 0.1215ms 59.6307μs 16.7699 KOps/s 16.1709 KOps/s $\color{#35bf28}+3.70\%$
test_values_stack_nested_locked 0.1385ms 73.7149μs 13.5658 KOps/s 13.5025 KOps/s $\color{#35bf28}+0.47\%$
test_membership 22.9630μs 0.8625μs 1.1595 MOps/s 1.4136 MOps/s $\textbf{\color{#d91a1a}-17.98\%}$
test_membership_nested 29.8560μs 2.7373μs 365.3216 KOps/s 360.6925 KOps/s $\color{#35bf28}+1.28\%$
test_membership_nested_leaf 20.0980μs 2.7118μs 368.7644 KOps/s 369.5049 KOps/s $\color{#d91a1a}-0.20\%$
test_membership_stacked_nested 26.9800μs 2.7166μs 368.1132 KOps/s 365.9161 KOps/s $\color{#35bf28}+0.60\%$
test_membership_stacked_nested_leaf 31.6590μs 2.7242μs 367.0826 KOps/s 367.8747 KOps/s $\color{#d91a1a}-0.22\%$
test_membership_nested_last 34.0540μs 3.9335μs 254.2280 KOps/s 253.7718 KOps/s $\color{#35bf28}+0.18\%$
test_membership_nested_leaf_last 36.5780μs 3.9260μs 254.7129 KOps/s 250.7386 KOps/s $\color{#35bf28}+1.59\%$
test_membership_stacked_nested_last 26.1890μs 3.8876μs 257.2280 KOps/s 179.1601 KOps/s $\textbf{\color{#35bf28}+43.57\%}$
test_membership_stacked_nested_leaf_last 24.8560μs 3.9049μs 256.0890 KOps/s 179.0967 KOps/s $\textbf{\color{#35bf28}+42.99\%}$
test_nested_getleaf 52.6790μs 10.4531μs 95.6656 KOps/s 93.0541 KOps/s $\color{#35bf28}+2.81\%$
test_nested_get 39.5240μs 10.0454μs 99.5477 KOps/s 96.7907 KOps/s $\color{#35bf28}+2.85\%$
test_stacked_getleaf 43.6810μs 10.4261μs 95.9132 KOps/s 94.2641 KOps/s $\color{#35bf28}+1.75\%$
test_stacked_get 31.4390μs 9.9847μs 100.1535 KOps/s 98.7183 KOps/s $\color{#35bf28}+1.45\%$
test_nested_getitemleaf 58.4370μs 10.5833μs 94.4888 KOps/s 88.5847 KOps/s $\textbf{\color{#35bf28}+6.66\%}$
test_nested_getitem 42.1490μs 10.2851μs 97.2284 KOps/s 96.1981 KOps/s $\color{#35bf28}+1.07\%$
test_stacked_getitemleaf 35.2460μs 10.9042μs 91.7077 KOps/s 89.9554 KOps/s $\color{#35bf28}+1.95\%$
test_stacked_getitem 35.2460μs 10.1946μs 98.0910 KOps/s 94.7427 KOps/s $\color{#35bf28}+3.53\%$
test_lock_nested 84.8870ms 0.5757ms 1.7369 KOps/s 2.0512 KOps/s $\textbf{\color{#d91a1a}-15.32\%}$
test_lock_stack_nested 0.7186ms 0.4492ms 2.2263 KOps/s 2.1823 KOps/s $\color{#35bf28}+2.01\%$
test_unlock_nested 86.3763ms 0.5038ms 1.9847 KOps/s 2.4119 KOps/s $\textbf{\color{#d91a1a}-17.71\%}$
test_unlock_stack_nested 0.5528ms 0.3716ms 2.6911 KOps/s 2.6203 KOps/s $\color{#35bf28}+2.70\%$
test_flatten_speed 0.1890ms 89.0567μs 11.2288 KOps/s 11.3405 KOps/s $\color{#d91a1a}-0.99\%$
test_unflatten_speed 0.8445ms 0.4735ms 2.1120 KOps/s 2.1504 KOps/s $\color{#d91a1a}-1.78\%$
test_common_ops 2.0146ms 1.0664ms 937.7176 Ops/s 893.8386 Ops/s $\color{#35bf28}+4.91\%$
test_creation 91.9220μs 2.1198μs 471.7330 KOps/s 474.9589 KOps/s $\color{#d91a1a}-0.68\%$
test_creation_empty 51.4260μs 15.1910μs 65.8285 KOps/s 58.0193 KOps/s $\textbf{\color{#35bf28}+13.46\%}$
test_creation_nested_1 60.4830μs 18.0490μs 55.4048 KOps/s 49.0498 KOps/s $\textbf{\color{#35bf28}+12.96\%}$
test_creation_nested_2 55.4140μs 22.3158μs 44.8112 KOps/s 39.8406 KOps/s $\textbf{\color{#35bf28}+12.48\%}$
test_clone 60.9840μs 17.0236μs 58.7421 KOps/s 57.7425 KOps/s $\color{#35bf28}+1.73\%$
test_getitem[int] 0.8786ms 17.1419μs 58.3367 KOps/s 58.6644 KOps/s $\color{#d91a1a}-0.56\%$
test_getitem[slice_int] 0.1375ms 31.1479μs 32.1048 KOps/s 32.0601 KOps/s $\color{#35bf28}+0.14\%$
test_getitem[range] 0.3400ms 59.6111μs 16.7754 KOps/s 16.7030 KOps/s $\color{#35bf28}+0.43\%$
test_getitem[tuple] 0.1303ms 25.6918μs 38.9229 KOps/s 39.6775 KOps/s $\color{#d91a1a}-1.90\%$
test_getitem[list] 0.1910ms 55.3062μs 18.0811 KOps/s 18.1047 KOps/s $\color{#d91a1a}-0.13\%$
test_setitem_dim[int] 74.6990μs 33.8443μs 29.5471 KOps/s 29.8332 KOps/s $\color{#d91a1a}-0.96\%$
test_setitem_dim[slice_int] 0.1044ms 62.0756μs 16.1094 KOps/s 16.1168 KOps/s $\color{#d91a1a}-0.05\%$
test_setitem_dim[range] 0.1402ms 85.9758μs 11.6312 KOps/s 11.6296 KOps/s $\color{#35bf28}+0.01\%$
test_setitem_dim[tuple] 90.3590μs 50.0987μs 19.9606 KOps/s 19.8739 KOps/s $\color{#35bf28}+0.44\%$
test_setitem 89.6980μs 28.8329μs 34.6826 KOps/s 33.8369 KOps/s $\color{#35bf28}+2.50\%$
test_set 81.6530μs 27.8543μs 35.9011 KOps/s 34.9222 KOps/s $\color{#35bf28}+2.80\%$
test_set_shared 1.3033ms 0.2117ms 4.7241 KOps/s 4.6548 KOps/s $\color{#35bf28}+1.49\%$
test_update 0.1541ms 33.3734μs 29.9640 KOps/s 27.2493 KOps/s $\textbf{\color{#35bf28}+9.96\%}$
test_update_nested 0.1426ms 44.6678μs 22.3875 KOps/s 21.3246 KOps/s $\color{#35bf28}+4.98\%$
test_update__nested 83.8270μs 34.5626μs 28.9330 KOps/s 28.9271 KOps/s $\color{#35bf28}+0.02\%$
test_set_nested 0.1039ms 30.1934μs 33.1198 KOps/s 31.9790 KOps/s $\color{#35bf28}+3.57\%$
test_set_nested_new 0.1057ms 36.2133μs 27.6141 KOps/s 26.9884 KOps/s $\color{#35bf28}+2.32\%$
test_select 0.1322ms 53.1869μs 18.8016 KOps/s 18.7041 KOps/s $\color{#35bf28}+0.52\%$
test_select_nested 0.1221ms 60.4615μs 16.5395 KOps/s 16.5135 KOps/s $\color{#35bf28}+0.16\%$
test_exclude_nested 0.1936ms 76.3665μs 13.0948 KOps/s 13.0951 KOps/s $-0.00\%$
test_empty[True] 0.4708ms 0.3196ms 3.1288 KOps/s 3.1079 KOps/s $\color{#35bf28}+0.67\%$
test_empty[False] 7.9598μs 1.2608μs 793.1405 KOps/s 798.0378 KOps/s $\color{#d91a1a}-0.61\%$
test_unbind_speed 0.4970ms 0.3091ms 3.2357 KOps/s 3.2186 KOps/s $\color{#35bf28}+0.53\%$
test_unbind_speed_stack0 0.4598ms 0.2962ms 3.3764 KOps/s 3.3256 KOps/s $\color{#35bf28}+1.53\%$
test_unbind_speed_stack1 97.4602ms 0.7973ms 1.2542 KOps/s 1.3432 KOps/s $\textbf{\color{#d91a1a}-6.63\%}$
test_split 2.2527ms 2.0236ms 494.1617 Ops/s 447.3076 Ops/s $\textbf{\color{#35bf28}+10.47\%}$
test_chunk 0.1137s 2.2650ms 441.4970 Ops/s 447.0895 Ops/s $\color{#d91a1a}-1.25\%$
test_creation[device0] 4.3576ms 0.1202ms 8.3190 KOps/s 8.2795 KOps/s $\color{#35bf28}+0.48\%$
test_creation_from_tensor 0.2813ms 0.1176ms 8.5047 KOps/s 8.2531 KOps/s $\color{#35bf28}+3.05\%$
test_add_one[memmap_tensor0] 0.1865ms 7.6568μs 130.6032 KOps/s 129.1879 KOps/s $\color{#35bf28}+1.10\%$
test_contiguous[memmap_tensor0] 22.5220μs 1.8833μs 530.9959 KOps/s 516.4560 KOps/s $\color{#35bf28}+2.82\%$
test_stack[memmap_tensor0] 35.4160μs 5.9379μs 168.4100 KOps/s 172.6294 KOps/s $\color{#d91a1a}-2.44\%$
test_memmaptd_index 1.0350ms 0.4155ms 2.4069 KOps/s 2.4036 KOps/s $\color{#35bf28}+0.14\%$
test_memmaptd_index_astensor 0.9970ms 0.4915ms 2.0346 KOps/s 2.0366 KOps/s $\color{#d91a1a}-0.10\%$
test_memmaptd_index_op 1.7741ms 1.0106ms 989.5301 Ops/s 945.3920 Ops/s $\color{#35bf28}+4.67\%$
test_serialize_model 0.2106s 0.1326s 7.5421 Ops/s 8.6298 Ops/s $\textbf{\color{#d91a1a}-12.60\%}$
test_serialize_model_pickle 0.4665s 0.3882s 2.5757 Ops/s 2.5239 Ops/s $\color{#35bf28}+2.05\%$
test_serialize_weights 0.1223s 0.1160s 8.6207 Ops/s 7.4380 Ops/s $\textbf{\color{#35bf28}+15.90\%}$
test_serialize_weights_returnearly 0.1864s 0.1586s 6.3045 Ops/s 6.1923 Ops/s $\color{#35bf28}+1.81\%$
test_serialize_weights_pickle 0.4806s 0.4242s 2.3575 Ops/s 2.3085 Ops/s $\color{#35bf28}+2.13\%$
test_serialize_weights_filesystem 0.1491s 0.1412s 7.0826 Ops/s 7.0930 Ops/s $\color{#d91a1a}-0.15\%$
test_serialize_model_filesystem 0.1627s 0.1528s 6.5445 Ops/s 6.5664 Ops/s $\color{#d91a1a}-0.33\%$
test_reshape_pytree 85.3390μs 38.5428μs 25.9452 KOps/s 25.5475 KOps/s $\color{#35bf28}+1.56\%$
test_reshape_td 0.1080ms 44.7727μs 22.3350 KOps/s 21.3932 KOps/s $\color{#35bf28}+4.40\%$
test_view_pytree 0.1410ms 38.7193μs 25.8269 KOps/s 25.5705 KOps/s $\color{#35bf28}+1.00\%$
test_view_td 0.1069ms 50.7389μs 19.7087 KOps/s 19.1641 KOps/s $\color{#35bf28}+2.84\%$
test_unbind_pytree 74.8600μs 36.2858μs 27.5590 KOps/s 27.7266 KOps/s $\color{#d91a1a}-0.60\%$
test_unbind_td 0.2949ms 45.6083μs 21.9258 KOps/s 22.1155 KOps/s $\color{#d91a1a}-0.86\%$
test_split_pytree 79.7490μs 38.4512μs 26.0070 KOps/s 26.0089 KOps/s $-0.01\%$
test_split_td 0.5231ms 57.9912μs 17.2440 KOps/s 14.4394 KOps/s $\textbf{\color{#35bf28}+19.42\%}$
test_add_pytree 0.1098ms 46.6666μs 21.4286 KOps/s 21.8181 KOps/s $\color{#d91a1a}-1.79\%$
test_add_td 0.3023ms 80.8982μs 12.3612 KOps/s 12.1715 KOps/s $\color{#35bf28}+1.56\%$
test_compile_add_one_nested[tensordict-compile] 0.1266ms 57.7656μs 17.3113 KOps/s 17.5988 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_add_one_nested[tensordict-eager] 0.3520ms 0.1779ms 5.6200 KOps/s 5.5773 KOps/s $\color{#35bf28}+0.76\%$
test_compile_add_one_nested[pytree-compile] 0.1214ms 56.4217μs 17.7237 KOps/s 17.7431 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_one_nested[pytree-eager] 0.2746ms 0.1457ms 6.8641 KOps/s 6.8788 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_copy_nested[tensordict-compile] 62.0460μs 21.2113μs 47.1447 KOps/s 46.2925 KOps/s $\color{#35bf28}+1.84\%$
test_compile_copy_nested[tensordict-eager] 0.1389ms 68.1493μs 14.6737 KOps/s 14.8630 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_copy_nested[pytree-compile] 0.1701ms 75.6658μs 13.2160 KOps/s 13.1148 KOps/s $\color{#35bf28}+0.77\%$
test_compile_copy_nested[pytree-eager] 0.1533ms 67.5547μs 14.8028 KOps/s 14.8716 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_add_one_flat[tensordict-compile] 0.2910ms 0.1731ms 5.7760 KOps/s 5.7489 KOps/s $\color{#35bf28}+0.47\%$
test_compile_add_one_flat[tensordict-eager] 0.3931ms 0.1885ms 5.3058 KOps/s 5.2725 KOps/s $\color{#35bf28}+0.63\%$
test_compile_add_one_flat[tensorclass-compile] 0.1091ms 46.0338μs 21.7232 KOps/s 21.3406 KOps/s $\color{#35bf28}+1.79\%$
test_compile_add_one_flat[tensorclass-eager] 0.1597ms 68.8552μs 14.5232 KOps/s 13.7768 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_compile_add_one_flat[pytree-compile] 0.3782ms 0.1772ms 5.6440 KOps/s 5.7910 KOps/s $\color{#d91a1a}-2.54\%$
test_compile_add_one_flat[pytree-eager] 0.5842ms 0.2951ms 3.3890 KOps/s 3.4136 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_add_self_flat[tensordict-eager] 0.3002ms 0.1979ms 5.0526 KOps/s 5.0011 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_self_flat[tensordict-compile] 0.3336ms 0.1771ms 5.6452 KOps/s 5.8551 KOps/s $\color{#d91a1a}-3.58\%$
test_compile_add_self_flat[tensorclass-eager] 0.1343ms 61.7377μs 16.1975 KOps/s 15.9843 KOps/s $\color{#35bf28}+1.33\%$
test_compile_add_self_flat[tensorclass-compile] 89.1260μs 46.5699μs 21.4731 KOps/s 21.7755 KOps/s $\color{#d91a1a}-1.39\%$
test_compile_add_self_flat[pytree-eager] 0.3689ms 0.2352ms 4.2515 KOps/s 4.2488 KOps/s $\color{#35bf28}+0.07\%$
test_compile_add_self_flat[pytree-compile] 0.2673ms 0.1733ms 5.7699 KOps/s 5.6368 KOps/s $\color{#35bf28}+2.36\%$
test_compile_copy_flat[tensordict-compile] 0.1851ms 0.1014ms 9.8645 KOps/s 9.8803 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_copy_flat[tensordict-eager] 0.1231ms 58.3331μs 17.1429 KOps/s 17.2573 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_copy_flat[pytree-compile] 0.1870ms 78.3242μs 12.7674 KOps/s 12.8408 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_copy_flat[pytree-eager] 0.1257ms 68.5375μs 14.5905 KOps/s 14.6573 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_assign_and_add[tensordict-compile] 0.3006ms 0.1972ms 5.0711 KOps/s 5.1371 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_assign_and_add[tensordict-eager] 2.7753ms 1.6701ms 598.7615 Ops/s 601.7945 Ops/s $\color{#d91a1a}-0.50\%$
test_compile_assign_and_add[pytree-compile] 0.2997ms 0.1959ms 5.1043 KOps/s 5.1998 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_assign_and_add[pytree-eager] 1.7739ms 1.1298ms 885.0817 Ops/s 896.0053 Ops/s $\color{#d91a1a}-1.22\%$
test_compile_assign_and_add_stack[compile] 0.5261ms 0.4234ms 2.3620 KOps/s 2.3853 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_assign_and_add_stack[eager] 5.7069ms 3.7008ms 270.2109 Ops/s 263.4368 Ops/s $\color{#35bf28}+2.57\%$
test_compile_indexing[tensor-tensordict-compile] 74.9800μs 33.9019μs 29.4969 KOps/s 28.7662 KOps/s $\color{#35bf28}+2.54\%$
test_compile_indexing[tensor-tensordict-eager] 1.0664ms 50.0244μs 19.9903 KOps/s 20.1267 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_indexing[tensor-tensorclass-compile] 72.1350μs 29.7170μs 33.6507 KOps/s 33.2033 KOps/s $\color{#35bf28}+1.35\%$
test_compile_indexing[tensor-tensorclass-eager] 77.7950μs 30.1611μs 33.1552 KOps/s 32.8905 KOps/s $\color{#35bf28}+0.81\%$
test_compile_indexing[tensor-pytree-compile] 0.1179ms 29.8959μs 33.4494 KOps/s 32.5290 KOps/s $\color{#35bf28}+2.83\%$
test_compile_indexing[tensor-pytree-eager] 0.1003ms 29.6459μs 33.7315 KOps/s 33.1502 KOps/s $\color{#35bf28}+1.75\%$
test_compile_indexing[slice-tensordict-compile] 0.1554ms 73.5331μs 13.5993 KOps/s 13.4119 KOps/s $\color{#35bf28}+1.40\%$
test_compile_indexing[slice-tensordict-eager] 0.5241ms 27.5755μs 36.2641 KOps/s 35.9191 KOps/s $\color{#35bf28}+0.96\%$
test_compile_indexing[slice-tensorclass-compile] 0.1394ms 67.8788μs 14.7321 KOps/s 14.7446 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[slice-tensorclass-eager] 63.5690μs 23.0532μs 43.3779 KOps/s 41.0553 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_compile_indexing[slice-pytree-compile] 0.1976ms 67.7885μs 14.7518 KOps/s 14.8753 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[slice-pytree-eager] 60.4730μs 23.2240μs 43.0590 KOps/s 42.6139 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[int-tensordict-compile] 0.1436ms 72.3660μs 13.8186 KOps/s 13.7109 KOps/s $\color{#35bf28}+0.79\%$
test_compile_indexing[int-tensordict-eager] 1.0206ms 27.6009μs 36.2308 KOps/s 36.3019 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[int-tensorclass-compile] 0.1566ms 67.6187μs 14.7888 KOps/s 14.8530 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_indexing[int-tensorclass-eager] 63.4080μs 23.1438μs 43.2080 KOps/s 43.0546 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[int-pytree-compile] 0.1345ms 67.9575μs 14.7151 KOps/s 15.0064 KOps/s $\color{#d91a1a}-1.94\%$
test_compile_indexing[int-pytree-eager] 83.4660μs 23.0524μs 43.3794 KOps/s 42.7971 KOps/s $\color{#35bf28}+1.36\%$
test_mod_add[eager] 75.1500μs 22.9002μs 43.6677 KOps/s 40.1954 KOps/s $\textbf{\color{#35bf28}+8.64\%}$
test_mod_add[compile] 0.1093ms 40.1907μs 24.8814 KOps/s 25.1954 KOps/s $\color{#d91a1a}-1.25\%$
test_mod_add[compile-overhead] 0.1019ms 39.4446μs 25.3520 KOps/s 25.1407 KOps/s $\color{#35bf28}+0.84\%$
test_mod_wrap[eager] 0.4088ms 0.2087ms 4.7912 KOps/s 4.7632 KOps/s $\color{#35bf28}+0.59\%$
test_mod_wrap[compile] 0.3145ms 0.2377ms 4.2064 KOps/s 4.2868 KOps/s $\color{#d91a1a}-1.88\%$
test_mod_wrap[compile-overhead] 0.3351ms 0.2333ms 4.2856 KOps/s 4.2996 KOps/s $\color{#d91a1a}-0.32\%$
test_mod_wrap_and_backward[eager] 12.1390ms 10.6706ms 93.7152 Ops/s 92.8704 Ops/s $\color{#35bf28}+0.91\%$
test_mod_wrap_and_backward[compile] 12.0450ms 10.6022ms 94.3200 Ops/s 91.5298 Ops/s $\color{#35bf28}+3.05\%$
test_mod_wrap_and_backward[compile-overhead] 11.5439ms 10.6135ms 94.2195 Ops/s 90.4529 Ops/s $\color{#35bf28}+4.16\%$
test_seq_add[eager] 0.2153ms 89.4437μs 11.1802 KOps/s 11.2898 KOps/s $\color{#d91a1a}-0.97\%$
test_seq_add[compile] 0.1546ms 64.4649μs 15.5123 KOps/s 15.6057 KOps/s $\color{#d91a1a}-0.60\%$
test_seq_add[compile-overhead] 0.1475ms 63.2906μs 15.8001 KOps/s 15.7932 KOps/s $\color{#35bf28}+0.04\%$
test_seq_wrap[eager] 0.6391ms 0.3809ms 2.6251 KOps/s 2.5971 KOps/s $\color{#35bf28}+1.08\%$
test_seq_wrap[compile] 1.4113ms 0.2692ms 3.7141 KOps/s 3.6989 KOps/s $\color{#35bf28}+0.41\%$
test_seq_wrap[compile-overhead] 1.3333ms 0.2698ms 3.7071 KOps/s 3.6801 KOps/s $\color{#35bf28}+0.73\%$
test_func_call_runtime[False-eager] 0.9624ms 0.5235ms 1.9101 KOps/s 1.9061 KOps/s $\color{#35bf28}+0.21\%$
test_func_call_runtime[False-compile] 0.9487ms 0.5035ms 1.9861 KOps/s 1.9970 KOps/s $\color{#d91a1a}-0.55\%$
test_func_call_runtime[False-compile-overhead] 0.6540ms 0.5041ms 1.9838 KOps/s 1.9761 KOps/s $\color{#35bf28}+0.39\%$
test_func_call_runtime[True-eager] 1.2724ms 0.7426ms 1.3466 KOps/s 1.3587 KOps/s $\color{#d91a1a}-0.90\%$
test_func_call_runtime[True-compile] 0.6465ms 0.5128ms 1.9501 KOps/s 1.9409 KOps/s $\color{#35bf28}+0.47\%$
test_func_call_runtime[True-compile-overhead] 0.7004ms 0.5137ms 1.9467 KOps/s 1.9533 KOps/s $\color{#d91a1a}-0.34\%$
test_func_call_cm_runtime[False-eager] 0.7751ms 0.5198ms 1.9240 KOps/s 1.9333 KOps/s $\color{#d91a1a}-0.48\%$
test_func_call_cm_runtime[False-compile] 0.7145ms 0.5038ms 1.9851 KOps/s 1.9884 KOps/s $\color{#d91a1a}-0.17\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6747ms 0.5057ms 1.9776 KOps/s 1.9832 KOps/s $\color{#d91a1a}-0.28\%$
test_func_call_cm_runtime[True-eager] 1.2246ms 0.8687ms 1.1511 KOps/s 1.1528 KOps/s $\color{#d91a1a}-0.15\%$
test_func_call_cm_runtime[True-compile] 0.9016ms 0.7422ms 1.3474 KOps/s 1.3481 KOps/s $\color{#d91a1a}-0.05\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0540ms 0.7352ms 1.3601 KOps/s 1.3422 KOps/s $\color{#35bf28}+1.34\%$
test_vmap_func_call_cm_runtime[eager] 2.3937ms 1.8708ms 534.5239 Ops/s 528.8157 Ops/s $\color{#35bf28}+1.08\%$
test_vmap_func_call_cm_runtime[compile] 2.6223ms 1.9317ms 517.6799 Ops/s 517.6657 Ops/s $+0.00\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.7021ms 1.9169ms 521.6880 Ops/s 517.2949 Ops/s $\color{#35bf28}+0.85\%$
test_distributed 0.2289ms 0.1236ms 8.0882 KOps/s 7.9178 KOps/s $\color{#35bf28}+2.15\%$
test_tdmodule 75.8020μs 16.9200μs 59.1015 KOps/s 55.3262 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_tdmodule_dispatch 61.3350μs 33.1418μs 30.1734 KOps/s 27.7755 KOps/s $\textbf{\color{#35bf28}+8.63\%}$
test_tdseq 35.5770μs 19.0801μs 52.4106 KOps/s 49.1686 KOps/s $\textbf{\color{#35bf28}+6.59\%}$
test_tdseq_dispatch 67.7460μs 37.4782μs 26.6822 KOps/s 24.0881 KOps/s $\textbf{\color{#35bf28}+10.77\%}$
test_instantiation_functorch 1.7054ms 1.5942ms 627.2655 Ops/s 613.8585 Ops/s $\color{#35bf28}+2.18\%$
test_instantiation_td 1.9007ms 1.1772ms 849.4568 Ops/s 838.5721 Ops/s $\color{#35bf28}+1.30\%$
test_exec_functorch 0.3070ms 0.1857ms 5.3856 KOps/s 5.3840 KOps/s $\color{#35bf28}+0.03\%$
test_exec_functional_call 0.3561ms 0.1780ms 5.6166 KOps/s 5.7057 KOps/s $\color{#d91a1a}-1.56\%$
test_exec_td 0.3237ms 0.1721ms 5.8090 KOps/s 5.8150 KOps/s $\color{#d91a1a}-0.10\%$
test_exec_td_decorator 0.4143ms 0.2268ms 4.4082 KOps/s 4.4447 KOps/s $\color{#d91a1a}-0.82\%$
test_vmap_mlp_speed[True-True] 1.0673ms 0.6402ms 1.5621 KOps/s 1.5348 KOps/s $\color{#35bf28}+1.77\%$
test_vmap_mlp_speed[True-False] 0.9571ms 0.6387ms 1.5656 KOps/s 1.5553 KOps/s $\color{#35bf28}+0.67\%$
test_vmap_mlp_speed[False-True] 0.7301ms 0.4995ms 2.0020 KOps/s 1.9889 KOps/s $\color{#35bf28}+0.66\%$
test_vmap_mlp_speed[False-False] 0.7740ms 0.5014ms 1.9945 KOps/s 1.9998 KOps/s $\color{#d91a1a}-0.27\%$
test_vmap_mlp_speed_decorator[True-True] 1.2443ms 0.6146ms 1.6271 KOps/s 1.5925 KOps/s $\color{#35bf28}+2.17\%$
test_vmap_mlp_speed_decorator[True-False] 0.9506ms 0.6206ms 1.6114 KOps/s 1.5949 KOps/s $\color{#35bf28}+1.03\%$
test_vmap_mlp_speed_decorator[False-True] 0.8198ms 0.5164ms 1.9364 KOps/s 1.9383 KOps/s $\color{#d91a1a}-0.10\%$
test_vmap_mlp_speed_decorator[False-False] 0.7084ms 0.5141ms 1.9451 KOps/s 1.9390 KOps/s $\color{#35bf28}+0.31\%$
test_to_module_speed[True] 2.0584ms 1.2962ms 771.4778 Ops/s 783.5097 Ops/s $\color{#d91a1a}-1.54\%$
test_to_module_speed[False] 1.7362ms 1.2547ms 797.0252 Ops/s 804.8348 Ops/s $\color{#d91a1a}-0.97\%$
test_tc_init 75.1600μs 41.5878μs 24.0455 KOps/s 23.1934 KOps/s $\color{#35bf28}+3.67\%$
test_tc_init_nested 0.1544ms 85.3437μs 11.7173 KOps/s 11.3298 KOps/s $\color{#35bf28}+3.42\%$
test_tc_first_layer_tensor 18.4750μs 1.5655μs 638.7824 KOps/s 656.0347 KOps/s $\color{#d91a1a}-2.63\%$
test_tc_first_layer_nontensor 34.4440μs 4.7573μs 210.2020 KOps/s 214.1845 KOps/s $\color{#d91a1a}-1.86\%$
test_tc_second_layer_tensor 25.7080μs 2.8868μs 346.4058 KOps/s 358.1563 KOps/s $\color{#d91a1a}-3.28\%$
test_tc_second_layer_nontensor 24.9570μs 6.0760μs 164.5807 KOps/s 166.5299 KOps/s $\color{#d91a1a}-1.17\%$
test_unbind 0.4783s 13.2076ms 75.7140 Ops/s 75.4316 Ops/s $\color{#35bf28}+0.37\%$
test_full_like 9.6939ms 7.9325ms 126.0629 Ops/s 129.8895 Ops/s $\color{#d91a1a}-2.95\%$
test_zeros_like 3.8229ms 3.0628ms 326.4997 Ops/s 152.2856 Ops/s $\textbf{\color{#35bf28}+114.40\%}$
test_ones_like 13.7831ms 7.0864ms 141.1157 Ops/s 128.5688 Ops/s $\textbf{\color{#35bf28}+9.76\%}$
test_clone 16.5545ms 8.7870ms 113.8041 Ops/s 108.0901 Ops/s $\textbf{\color{#35bf28}+5.29\%}$
test_squeeze 68.7180μs 12.3760μs 80.8015 KOps/s 83.2828 KOps/s $\color{#d91a1a}-2.98\%$
test_unsqueeze 0.2339ms 94.9209μs 10.5351 KOps/s 10.3708 KOps/s $\color{#35bf28}+1.58\%$
test_split 0.5352ms 0.1992ms 5.0211 KOps/s 4.9801 KOps/s $\color{#35bf28}+0.82\%$
test_permute 0.4705ms 0.2300ms 4.3479 KOps/s 4.3016 KOps/s $\color{#35bf28}+1.08\%$
test_stack 30.7310ms 24.9623ms 40.0604 Ops/s 40.0158 Ops/s $\color{#35bf28}+0.11\%$
test_cat 28.0563ms 24.8276ms 40.2777 Ops/s 39.9682 Ops/s $\color{#35bf28}+0.77\%$
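
For readers checking the table by hand, the columns appear to be related as follows: Ops is the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. This is inferred from the figures in the table and is not confirmed by the benchmark tooling; the helper names below are hypothetical. A minimal sketch using the `test_plain_set_nested` row:

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed relation: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent (assumed formula)."""
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the test_plain_set_nested row of the CPU table.
mean_s = 19.1191e-6    # Mean: 19.1191 microseconds
ops_pr = 52.3036e3     # Ops: 52.3036 KOps/s
ops_head = 49.2278e3   # Ops on Repo HEAD: 49.2278 KOps/s

print(f"{ops_per_second(mean_s) / 1e3:.2f} KOps/s")   # 52.30 KOps/s
print(f"{percent_change(ops_pr, ops_head):+.2f}%")    # +6.25%
```

The same formula reproduces the largest CPU entry: `percent_change(326.4997, 152.2856)` gives roughly +114.40% for `test_zeros_like`. Rows whose Max is orders of magnitude above their Mean (for example `test_lock_nested`) suggest a few outlier runs, so their Change values may be noisier than the others.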

github-actions bot commented Sep 16, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}36$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.5810ms 13.4748μs 74.2124 KOps/s 66.7465 KOps/s $\textbf{\color{#35bf28}+11.19\%}$
test_plain_set_stack_nested 38.8110μs 13.8443μs 72.2319 KOps/s 66.3379 KOps/s $\textbf{\color{#35bf28}+8.88\%}$
test_plain_set_nested_inplace 67.1610μs 14.7315μs 67.8818 KOps/s 62.6182 KOps/s $\textbf{\color{#35bf28}+8.41\%}$
test_plain_set_stack_nested_inplace 50.3110μs 14.4730μs 69.0940 KOps/s 61.4923 KOps/s $\textbf{\color{#35bf28}+12.36\%}$
test_items 32.8400μs 2.9049μs 344.2504 KOps/s 345.5687 KOps/s $\color{#d91a1a}-0.38\%$
test_items_nested 0.4732ms 0.3242ms 3.0848 KOps/s 3.1025 KOps/s $\color{#d91a1a}-0.57\%$
test_items_nested_locked 0.3756ms 0.3259ms 3.0687 KOps/s 3.0323 KOps/s $\color{#35bf28}+1.20\%$
test_items_nested_leaf 90.2520μs 55.5788μs 17.9925 KOps/s 17.9532 KOps/s $\color{#35bf28}+0.22\%$
test_items_stack_nested 0.3867ms 0.3291ms 3.0383 KOps/s 3.0554 KOps/s $\color{#d91a1a}-0.56\%$
test_items_stack_nested_leaf 96.0420μs 57.3979μs 17.4222 KOps/s 17.6270 KOps/s $\color{#d91a1a}-1.16\%$
test_items_stack_nested_locked 0.4141ms 0.3300ms 3.0300 KOps/s 3.0336 KOps/s $\color{#d91a1a}-0.12\%$
test_keys 39.1710μs 3.4615μs 288.8924 KOps/s 290.2361 KOps/s $\color{#d91a1a}-0.46\%$
test_keys_nested 85.0910μs 56.3602μs 17.7430 KOps/s 17.6823 KOps/s $\color{#35bf28}+0.34\%$
test_keys_nested_locked 2.2340ms 63.0239μs 15.8670 KOps/s 15.9831 KOps/s $\color{#d91a1a}-0.73\%$
test_keys_nested_leaf 95.0310μs 46.9920μs 21.2802 KOps/s 21.0737 KOps/s $\color{#35bf28}+0.98\%$
test_keys_stack_nested 88.6420μs 55.0631μs 18.1610 KOps/s 17.8707 KOps/s $\color{#35bf28}+1.62\%$
test_keys_stack_nested_leaf 81.5910μs 48.5605μs 20.5929 KOps/s 20.7854 KOps/s $\color{#d91a1a}-0.93\%$
test_keys_stack_nested_locked 84.7620μs 61.8682μs 16.1634 KOps/s 16.2495 KOps/s $\color{#d91a1a}-0.53\%$
test_values 4.8850μs 0.8646μs 1.1566 MOps/s 1.1487 MOps/s $\color{#35bf28}+0.68\%$
test_values_nested 65.0420μs 40.8163μs 24.5000 KOps/s 24.5855 KOps/s $\color{#d91a1a}-0.35\%$
test_values_nested_locked 73.4810μs 42.8682μs 23.3273 KOps/s 23.3965 KOps/s $\color{#d91a1a}-0.30\%$
test_values_nested_leaf 81.5120μs 35.6541μs 28.0472 KOps/s 28.1902 KOps/s $\color{#d91a1a}-0.51\%$
test_values_stack_nested 81.3420μs 41.7067μs 23.9770 KOps/s 24.0197 KOps/s $\color{#d91a1a}-0.18\%$
test_values_stack_nested_leaf 69.7720μs 36.0092μs 27.7706 KOps/s 28.0227 KOps/s $\color{#d91a1a}-0.90\%$
test_values_stack_nested_locked 70.9220μs 43.7337μs 22.8657 KOps/s 22.8370 KOps/s $\color{#35bf28}+0.13\%$
test_membership 1.7301μs 0.5056μs 1.9778 MOps/s 1.9776 MOps/s $+0.01\%$
test_membership_nested 26.2505μs 1.8986μs 526.7080 KOps/s 510.7386 KOps/s $\color{#35bf28}+3.13\%$
test_membership_nested_leaf 15.1855μs 1.8988μs 526.6562 KOps/s 526.8074 KOps/s $\color{#d91a1a}-0.03\%$
test_membership_stacked_nested 35.8100μs 1.9418μs 514.9810 KOps/s 506.3305 KOps/s $\color{#35bf28}+1.71\%$
test_membership_stacked_nested_leaf 25.7000μs 1.9479μs 513.3830 KOps/s 499.2538 KOps/s $\color{#35bf28}+2.83\%$
test_membership_nested_last 34.6100μs 2.7966μs 357.5742 KOps/s 347.4701 KOps/s $\color{#35bf28}+2.91\%$
test_membership_nested_leaf_last 28.2800μs 2.8251μs 353.9746 KOps/s 346.2352 KOps/s $\color{#35bf28}+2.24\%$
test_membership_stacked_nested_last 35.8110μs 3.1286μs 319.6294 KOps/s 125.3498 KOps/s $\textbf{\color{#35bf28}+154.99\%}$
test_membership_stacked_nested_leaf_last 29.3310μs 3.1006μs 322.5213 KOps/s 125.6149 KOps/s $\textbf{\color{#35bf28}+156.75\%}$
test_nested_getleaf 30.6910μs 6.1105μs 163.6516 KOps/s 161.6791 KOps/s $\color{#35bf28}+1.22\%$
test_nested_get 37.8510μs 5.7539μs 173.7942 KOps/s 170.7503 KOps/s $\color{#35bf28}+1.78\%$
test_stacked_getleaf 38.3810μs 6.0326μs 165.7649 KOps/s 162.7685 KOps/s $\color{#35bf28}+1.84\%$
test_stacked_get 31.0600μs 5.6708μs 176.3423 KOps/s 173.0144 KOps/s $\color{#35bf28}+1.92\%$
test_nested_getitemleaf 32.4710μs 6.1837μs 161.7167 KOps/s 161.2354 KOps/s $\color{#35bf28}+0.30\%$
test_nested_getitem 48.7210μs 5.8339μs 171.4114 KOps/s 172.4260 KOps/s $\color{#d91a1a}-0.59\%$
test_stacked_getitemleaf 36.9910μs 6.1487μs 162.6371 KOps/s 161.6949 KOps/s $\color{#35bf28}+0.58\%$
test_stacked_getitem 34.8310μs 5.8331μs 171.4343 KOps/s 171.8948 KOps/s $\color{#d91a1a}-0.27\%$
test_lock_nested 2.8911ms 0.4234ms 2.3617 KOps/s 2.3629 KOps/s $\color{#d91a1a}-0.05\%$
test_lock_stack_nested 0.4244ms 0.3870ms 2.5839 KOps/s 2.6603 KOps/s $\color{#d91a1a}-2.87\%$
test_unlock_nested 0.7317ms 0.3610ms 2.7701 KOps/s 2.7705 KOps/s $\color{#d91a1a}-0.02\%$
test_unlock_stack_nested 0.3915ms 0.3268ms 3.0603 KOps/s 3.1721 KOps/s $\color{#d91a1a}-3.52\%$
test_flatten_speed 0.1067ms 68.9094μs 14.5118 KOps/s 14.5013 KOps/s $\color{#35bf28}+0.07\%$
test_unflatten_speed 0.3637ms 0.2845ms 3.5155 KOps/s 3.4684 KOps/s $\color{#35bf28}+1.36\%$
test_common_ops 1.5132ms 1.2450ms 803.2147 Ops/s 752.7426 Ops/s $\textbf{\color{#35bf28}+6.71\%}$
test_creation 32.3610μs 1.4901μs 671.0760 KOps/s 659.8788 KOps/s $\color{#35bf28}+1.70\%$
test_creation_empty 48.4710μs 15.3696μs 65.0635 KOps/s 55.7523 KOps/s $\textbf{\color{#35bf28}+16.70\%}$
test_creation_nested_1 49.0810μs 17.2461μs 57.9842 KOps/s 47.9954 KOps/s $\textbf{\color{#35bf28}+20.81\%}$
test_creation_nested_2 49.2110μs 19.1728μs 52.1573 KOps/s 39.8790 KOps/s $\textbf{\color{#35bf28}+30.79\%}$
test_clone 57.3610μs 32.0787μs 31.1734 KOps/s 30.2162 KOps/s $\color{#35bf28}+3.17\%$
test_getitem[int] 91.9901ms 23.8026μs 42.0122 KOps/s 60.1065 KOps/s $\textbf{\color{#d91a1a}-30.10\%}$
test_getitem[slice_int] 0.1179ms 28.4842μs 35.1072 KOps/s 34.4521 KOps/s $\color{#35bf28}+1.90\%$
test_getitem[range] 0.2232ms 0.1113ms 8.9885 KOps/s 8.8639 KOps/s $\color{#35bf28}+1.41\%$
test_getitem[tuple] 0.1166ms 26.6948μs 37.4605 KOps/s 40.0085 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_getitem[list] 0.2029ms 0.1095ms 9.1347 KOps/s 9.2195 KOps/s $\color{#d91a1a}-0.92\%$
test_setitem_dim[int] 75.2820μs 50.5030μs 19.8008 KOps/s 20.1393 KOps/s $\color{#d91a1a}-1.68\%$
test_setitem_dim[slice_int] 0.1154ms 69.8389μs 14.3187 KOps/s 14.5958 KOps/s $\color{#d91a1a}-1.90\%$
test_setitem_dim[range] 0.1779ms 0.1384ms 7.2273 KOps/s 7.6572 KOps/s $\textbf{\color{#d91a1a}-5.61\%}$
test_setitem_dim[tuple] 93.9320μs 67.3519μs 14.8474 KOps/s 16.1380 KOps/s $\textbf{\color{#d91a1a}-8.00\%}$
test_setitem 89.3420μs 46.0264μs 21.7267 KOps/s 22.3079 KOps/s $\color{#d91a1a}-2.61\%$
test_set 82.9720μs 44.7596μs 22.3416 KOps/s 22.8638 KOps/s $\color{#d91a1a}-2.28\%$
test_set_shared 0.3510ms 53.2729μs 18.7713 KOps/s 19.0124 KOps/s $\color{#d91a1a}-1.27\%$
test_update 99.1320μs 52.7742μs 18.9486 KOps/s 18.3886 KOps/s $\color{#35bf28}+3.05\%$
test_update_nested 0.1070ms 59.7326μs 16.7413 KOps/s 16.4476 KOps/s $\color{#35bf28}+1.79\%$
test_update__nested 0.1023ms 66.4040μs 15.0593 KOps/s 16.1972 KOps/s $\textbf{\color{#d91a1a}-7.02\%}$
test_set_nested 86.7120μs 47.5963μs 21.0100 KOps/s 21.2739 KOps/s $\color{#d91a1a}-1.24\%$
test_set_nested_new 89.6310μs 50.8306μs 19.6732 KOps/s 19.8309 KOps/s $\color{#d91a1a}-0.80\%$
test_select 0.1035ms 64.0016μs 15.6246 KOps/s 15.6018 KOps/s $\color{#35bf28}+0.15\%$
test_select_nested 0.5456ms 42.0982μs 23.7540 KOps/s 23.2285 KOps/s $\color{#35bf28}+2.26\%$
test_exclude_nested 92.8610μs 59.2438μs 16.8794 KOps/s 16.9899 KOps/s $\color{#d91a1a}-0.65\%$
test_empty[True] 0.2966ms 0.2432ms 4.1114 KOps/s 4.1149 KOps/s $\color{#d91a1a}-0.09\%$
test_empty[False] 4.0381μs 0.7442μs 1.3438 MOps/s 1.3093 MOps/s $\color{#35bf28}+2.64\%$
test_to 45.1310μs 24.9767μs 40.0372 KOps/s 39.0716 KOps/s $\color{#35bf28}+2.47\%$
test_to_nonblocking 59.7420μs 24.5954μs 40.6581 KOps/s 40.7419 KOps/s $\color{#d91a1a}-0.21\%$
test_unbind_speed 0.3519ms 0.2869ms 3.4860 KOps/s 3.5220 KOps/s $\color{#d91a1a}-1.02\%$
test_unbind_speed_stack0 0.4128ms 0.2831ms 3.5320 KOps/s 3.6400 KOps/s $\color{#d91a1a}-2.97\%$
test_unbind_speed_stack1 91.1690ms 0.7208ms 1.3873 KOps/s 1.4236 KOps/s $\color{#d91a1a}-2.55\%$
test_split 93.9870ms 2.1384ms 467.6455 Ops/s 437.5686 Ops/s $\textbf{\color{#35bf28}+6.87\%}$
test_chunk 92.9697ms 2.1449ms 466.2316 Ops/s 434.3583 Ops/s $\textbf{\color{#35bf28}+7.34\%}$
test_creation[device0] 0.2902ms 0.1263ms 7.9188 KOps/s 7.7954 KOps/s $\color{#35bf28}+1.58\%$
test_creation_from_tensor 0.4374ms 0.1352ms 7.3972 KOps/s 7.7041 KOps/s $\color{#d91a1a}-3.98\%$
test_add_one[memmap_tensor0] 0.1497ms 9.4578μs 105.7333 KOps/s 111.4941 KOps/s $\textbf{\color{#d91a1a}-5.17\%}$
test_contiguous[memmap_tensor0] 31.0300μs 2.2037μs 453.7833 KOps/s 447.0314 KOps/s $\color{#35bf28}+1.51\%$
test_stack[memmap_tensor0] 36.4600μs 6.8674μs 145.6146 KOps/s 146.3735 KOps/s $\color{#d91a1a}-0.52\%$
test_memmaptd_index 1.0938ms 0.4512ms 2.2165 KOps/s 2.3118 KOps/s $\color{#d91a1a}-4.12\%$
test_memmaptd_index_astensor 1.0298ms 0.5070ms 1.9723 KOps/s 2.0490 KOps/s $\color{#d91a1a}-3.74\%$
test_memmaptd_index_op 1.4324ms 1.0494ms 952.9142 Ops/s 919.6501 Ops/s $\color{#35bf28}+3.62\%$
test_serialize_model 0.1305s 0.1293s 7.7332 Ops/s 7.7371 Ops/s $\color{#d91a1a}-0.05\%$
test_serialize_model_pickle 1.3506s 1.2109s 0.8258 Ops/s 0.8244 Ops/s $\color{#35bf28}+0.18\%$
test_serialize_weights 0.2225s 0.1420s 7.0445 Ops/s 7.7855 Ops/s $\textbf{\color{#d91a1a}-9.52\%}$
test_serialize_weights_returnearly 0.2242s 56.0300ms 17.8476 Ops/s 16.2926 Ops/s $\textbf{\color{#35bf28}+9.54\%}$
test_serialize_weights_pickle 1.3481s 1.2128s 0.8245 Ops/s 0.8140 Ops/s $\color{#35bf28}+1.30\%$
test_reshape_pytree 70.3320μs 37.5538μs 26.6284 KOps/s 26.9185 KOps/s $\color{#d91a1a}-1.08\%$
test_reshape_td 90.0910μs 47.3919μs 21.1006 KOps/s 23.4886 KOps/s $\textbf{\color{#d91a1a}-10.17\%}$
test_view_pytree 75.9110μs 38.1078μs 26.2413 KOps/s 27.8739 KOps/s $\textbf{\color{#d91a1a}-5.86\%}$
test_view_td 0.1117ms 51.3127μs 19.4883 KOps/s 21.5382 KOps/s $\textbf{\color{#d91a1a}-9.52\%}$
test_unbind_pytree 77.4810μs 37.2413μs 26.8519 KOps/s 28.3175 KOps/s $\textbf{\color{#d91a1a}-5.18\%}$
test_unbind_td 0.4452ms 47.4698μs 21.0660 KOps/s 23.3636 KOps/s $\textbf{\color{#d91a1a}-9.83\%}$
test_split_pytree 94.7620μs 50.4765μs 19.8112 KOps/s 20.9131 KOps/s $\textbf{\color{#d91a1a}-5.27\%}$
test_split_td 0.6540ms 58.6907μs 17.0385 KOps/s 16.6761 KOps/s $\color{#35bf28}+2.17\%$
test_add_pytree 92.6520μs 58.1044μs 17.2104 KOps/s 17.2155 KOps/s $\color{#d91a1a}-0.03\%$
test_add_td 0.1274ms 91.1096μs 10.9758 KOps/s 10.3726 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_compile_add_one_nested[tensordict-compile] 0.4155ms 0.2112ms 4.7348 KOps/s 4.6816 KOps/s $\color{#35bf28}+1.14\%$
test_compile_add_one_nested[tensordict-eager] 0.5647ms 0.1536ms 6.5103 KOps/s 6.5907 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_add_one_nested[pytree-compile] 0.2210ms 0.1477ms 6.7719 KOps/s 6.8103 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_add_one_nested[pytree-eager] 0.2381ms 0.1863ms 5.3683 KOps/s 5.2796 KOps/s $\color{#35bf28}+1.68\%$
test_compile_copy_nested[tensordict-compile] 93.2620μs 23.2055μs 43.0932 KOps/s 43.8754 KOps/s $\color{#d91a1a}-1.78\%$
test_compile_copy_nested[tensordict-eager] 90.3320μs 44.3302μs 22.5580 KOps/s 22.8815 KOps/s $\color{#d91a1a}-1.41\%$
test_compile_copy_nested[pytree-compile] 0.2386ms 64.7538μs 15.4431 KOps/s 15.9162 KOps/s $\color{#d91a1a}-2.97\%$
test_compile_copy_nested[pytree-eager] 82.7010μs 49.6292μs 20.1494 KOps/s 20.6142 KOps/s $\color{#d91a1a}-2.25\%$
test_compile_add_one_flat[tensordict-compile] 0.3955ms 0.3213ms 3.1123 KOps/s 3.1122 KOps/s $+0.00\%$
test_compile_add_one_flat[tensordict-eager] 0.2455ms 0.2107ms 4.7471 KOps/s 4.5826 KOps/s $\color{#35bf28}+3.59\%$
test_compile_add_one_flat[tensorclass-compile] 0.1719ms 0.1300ms 7.6894 KOps/s 7.2442 KOps/s $\textbf{\color{#35bf28}+6.15\%}$
test_compile_add_one_flat[tensorclass-eager] 0.1052ms 60.8392μs 16.4368 KOps/s 15.5141 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_compile_add_one_flat[pytree-compile] 0.3918ms 0.3201ms 3.1244 KOps/s 3.0390 KOps/s $\color{#35bf28}+2.81\%$
test_compile_add_one_flat[pytree-eager] 0.6941ms 0.6318ms 1.5827 KOps/s 1.4854 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_compile_add_self_flat[tensordict-eager] 0.2997ms 0.2495ms 4.0075 KOps/s 3.9126 KOps/s $\color{#35bf28}+2.42\%$
test_compile_add_self_flat[tensordict-compile] 0.3715ms 0.3211ms 3.1139 KOps/s 3.0796 KOps/s $\color{#35bf28}+1.11\%$
test_compile_add_self_flat[tensorclass-eager] 0.1141ms 71.2359μs 14.0379 KOps/s 13.4230 KOps/s $\color{#35bf28}+4.58\%$
test_compile_add_self_flat[tensorclass-compile] 0.1803ms 0.1306ms 7.6594 KOps/s 7.3381 KOps/s $\color{#35bf28}+4.38\%$
test_compile_add_self_flat[pytree-eager] 0.6313ms 0.5390ms 1.8554 KOps/s 1.7874 KOps/s $\color{#35bf28}+3.80\%$
test_compile_add_self_flat[pytree-compile] 0.3842ms 0.3199ms 3.1257 KOps/s 3.0755 KOps/s $\color{#35bf28}+1.63\%$
test_compile_copy_flat[tensordict-compile] 93.4320μs 18.7330μs 53.3818 KOps/s 54.2986 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_copy_flat[tensordict-eager] 61.2910μs 27.2510μs 36.6960 KOps/s 36.9990 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_copy_flat[pytree-compile] 98.4620μs 70.1178μs 14.2617 KOps/s 14.3461 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_copy_flat[pytree-eager] 84.5910μs 51.9463μs 19.2506 KOps/s 19.5800 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_assign_and_add[tensordict-compile] 2.3316ms 0.8141ms 1.2283 KOps/s 1.1207 KOps/s $\textbf{\color{#35bf28}+9.60\%}$
test_compile_assign_and_add[tensordict-eager] 3.2992ms 3.2029ms 312.2187 Ops/s 303.7670 Ops/s $\color{#35bf28}+2.78\%$
test_compile_assign_and_add[pytree-compile] 2.3213ms 0.8138ms 1.2287 KOps/s 1.1435 KOps/s $\textbf{\color{#35bf28}+7.46\%}$
test_compile_assign_and_add[pytree-eager] 3.6216ms 3.2671ms 306.0813 Ops/s 298.3852 Ops/s $\color{#35bf28}+2.58\%$
test_compile_indexing[tensor-tensordict-compile] 0.1714ms 0.1098ms 9.1058 KOps/s 8.8697 KOps/s $\color{#35bf28}+2.66\%$
test_compile_indexing[tensor-tensordict-eager] 0.1954ms 61.1946μs 16.3413 KOps/s 15.6692 KOps/s $\color{#35bf28}+4.29\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1545ms 0.1038ms 9.6308 KOps/s 9.3787 KOps/s $\color{#35bf28}+2.69\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1333ms 43.5138μs 22.9812 KOps/s 22.1416 KOps/s $\color{#35bf28}+3.79\%$
test_compile_indexing[tensor-pytree-compile] 0.1453ms 0.1048ms 9.5421 KOps/s 9.4379 KOps/s $\color{#35bf28}+1.10\%$
test_compile_indexing[tensor-pytree-eager] 99.9420μs 43.4591μs 23.0101 KOps/s 22.5792 KOps/s $\color{#35bf28}+1.91\%$
test_compile_indexing[slice-tensordict-compile] 0.1951ms 0.1389ms 7.1972 KOps/s 7.1616 KOps/s $\color{#35bf28}+0.50\%$
test_compile_indexing[slice-tensordict-eager] 0.1681ms 25.4911μs 39.2294 KOps/s 37.3137 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1852ms 0.1316ms 7.5982 KOps/s 7.5063 KOps/s $\color{#35bf28}+1.22\%$
test_compile_indexing[slice-tensorclass-eager] 62.7010μs 20.8899μs 47.8701 KOps/s 46.3284 KOps/s $\color{#35bf28}+3.33\%$
test_compile_indexing[slice-pytree-compile] 0.1799ms 0.1320ms 7.5784 KOps/s 7.4576 KOps/s $\color{#35bf28}+1.62\%$
test_compile_indexing[slice-pytree-eager] 54.6410μs 20.5870μs 48.5744 KOps/s 46.5170 KOps/s $\color{#35bf28}+4.42\%$
test_compile_indexing[int-tensordict-compile] 0.1975ms 0.1399ms 7.1468 KOps/s 7.1379 KOps/s $\color{#35bf28}+0.12\%$
test_compile_indexing[int-tensordict-eager] 0.5000ms 25.2978μs 39.5291 KOps/s 37.5352 KOps/s $\textbf{\color{#35bf28}+5.31\%}$
test_compile_indexing[int-tensorclass-compile] 0.2025ms 0.1340ms 7.4652 KOps/s 7.4437 KOps/s $\color{#35bf28}+0.29\%$
test_compile_indexing[int-tensorclass-eager] 0.1566ms 21.1098μs 47.3715 KOps/s 46.5165 KOps/s $\color{#35bf28}+1.84\%$
test_compile_indexing[int-pytree-compile] 0.1933ms 0.1324ms 7.5524 KOps/s 7.4749 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[int-pytree-eager] 50.7910μs 20.5520μs 48.6571 KOps/s 46.7074 KOps/s $\color{#35bf28}+4.17\%$
test_mod_add[eager] 69.9210μs 32.2254μs 31.0315 KOps/s 26.3846 KOps/s $\textbf{\color{#35bf28}+17.61\%}$
test_mod_add[compile] 0.3224ms 69.2753μs 14.4352 KOps/s 13.3030 KOps/s $\textbf{\color{#35bf28}+8.51\%}$
test_mod_add[compile-overhead] 0.2670ms 0.1379ms 7.2510 KOps/s 6.7062 KOps/s $\textbf{\color{#35bf28}+8.12\%}$
test_mod_wrap[eager] 0.3128ms 0.2424ms 4.1249 KOps/s 3.7233 KOps/s $\textbf{\color{#35bf28}+10.79\%}$
test_mod_wrap[compile] 0.6757ms 0.3145ms 3.1797 KOps/s 3.3097 KOps/s $\color{#d91a1a}-3.93\%$
test_mod_wrap[compile-overhead] 7.4878ms 4.0333ms 247.9364 Ops/s 255.6675 Ops/s $\color{#d91a1a}-3.02\%$
test_mod_wrap_and_backward[eager] 1.4928ms 1.3631ms 733.6102 Ops/s 682.3424 Ops/s $\textbf{\color{#35bf28}+7.51\%}$
test_mod_wrap_and_backward[compile] 1.9196ms 1.3346ms 749.3099 Ops/s 687.4703 Ops/s $\textbf{\color{#35bf28}+9.00\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3269ms 0.9188ms 1.0884 KOps/s 947.9812 Ops/s $\textbf{\color{#35bf28}+14.81\%}$
test_seq_add[eager] 0.1843ms 96.9258μs 10.3172 KOps/s 9.5984 KOps/s $\textbf{\color{#35bf28}+7.49\%}$
test_seq_add[compile] 0.3825ms 82.0399μs 12.1892 KOps/s 11.7723 KOps/s $\color{#35bf28}+3.54\%$
test_seq_add[compile-overhead] 0.1574ms 0.1161ms 8.6144 KOps/s 8.4949 KOps/s $\color{#35bf28}+1.41\%$
test_seq_wrap[eager] 0.4588ms 0.3751ms 2.6656 KOps/s 2.5066 KOps/s $\textbf{\color{#35bf28}+6.34\%}$
test_seq_wrap[compile] 0.3848ms 0.3164ms 3.1604 KOps/s 3.1203 KOps/s $\color{#35bf28}+1.29\%$
test_seq_wrap[compile-overhead] 0.2693ms 0.2224ms 4.4967 KOps/s 4.4714 KOps/s $\color{#35bf28}+0.57\%$
test_func_call_runtime[False-eager] 0.8312ms 0.7449ms 1.3424 KOps/s 1.3265 KOps/s $\color{#35bf28}+1.20\%$
test_func_call_runtime[False-compile] 1.0270ms 0.7882ms 1.2687 KOps/s 1.2329 KOps/s $\color{#35bf28}+2.91\%$
test_func_call_runtime[False-compile-overhead] 0.4618ms 0.3610ms 2.7698 KOps/s 2.7582 KOps/s $\color{#35bf28}+0.42\%$
test_func_call_runtime[True-eager] 0.9502ms 0.9063ms 1.1034 KOps/s 1.0843 KOps/s $\color{#35bf28}+1.76\%$
test_func_call_runtime[True-compile] 0.9209ms 0.8222ms 1.2162 KOps/s 1.1866 KOps/s $\color{#35bf28}+2.50\%$
test_func_call_runtime[True-compile-overhead] 0.4916ms 0.3946ms 2.5341 KOps/s 2.5184 KOps/s $\color{#35bf28}+0.62\%$
test_func_call_cm_runtime[False-eager] 0.8044ms 0.7443ms 1.3436 KOps/s 1.3201 KOps/s $\color{#35bf28}+1.77\%$
test_func_call_cm_runtime[False-compile] 0.8349ms 0.7901ms 1.2657 KOps/s 1.2261 KOps/s $\color{#35bf28}+3.23\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4159ms 0.3646ms 2.7429 KOps/s 2.7345 KOps/s $\color{#35bf28}+0.31\%$
test_func_call_cm_runtime[True-eager] 1.1230ms 1.0052ms 994.8116 Ops/s 983.3634 Ops/s $\color{#35bf28}+1.16\%$
test_func_call_cm_runtime[True-compile] 0.9298ms 0.8497ms 1.1769 KOps/s 1.1483 KOps/s $\color{#35bf28}+2.50\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4930ms 0.4205ms 2.3780 KOps/s 2.3785 KOps/s $\color{#d91a1a}-0.02\%$
test_vmap_func_call_cm_runtime[eager] 2.5329ms 2.0707ms 482.9309 Ops/s 475.6803 Ops/s $\color{#35bf28}+1.52\%$
test_vmap_func_call_cm_runtime[compile] 0.9931ms 0.8665ms 1.1540 KOps/s 1.1264 KOps/s $\color{#35bf28}+2.45\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5282ms 0.4264ms 2.3450 KOps/s 2.3413 KOps/s $\color{#35bf28}+0.16\%$
test_distributed 1.6980ms 0.2082ms 4.8020 KOps/s 8.4706 KOps/s $\textbf{\color{#d91a1a}-43.31\%}$
test_tdmodule 91.1310μs 14.1986μs 70.4297 KOps/s 63.6017 KOps/s $\textbf{\color{#35bf28}+10.74\%}$
test_tdmodule_dispatch 65.4510μs 28.8618μs 34.6478 KOps/s 32.2449 KOps/s $\textbf{\color{#35bf28}+7.45\%}$
test_tdseq 37.8310μs 16.3840μs 61.0351 KOps/s 60.2729 KOps/s $\color{#35bf28}+1.26\%$
test_tdseq_dispatch 53.3310μs 32.3656μs 30.8970 KOps/s 29.4375 KOps/s $\color{#35bf28}+4.96\%$
test_instantiation_functorch 2.0658ms 1.9913ms 502.1933 Ops/s 525.8592 Ops/s $\color{#d91a1a}-4.50\%$
test_instantiation_td 1.8205ms 1.2194ms 820.0807 Ops/s 822.4717 Ops/s $\color{#d91a1a}-0.29\%$
test_exec_functorch 0.2639ms 0.2162ms 4.6254 KOps/s 4.7503 KOps/s $\color{#d91a1a}-2.63\%$
test_exec_functional_call 0.2728ms 0.2230ms 4.4833 KOps/s 4.4088 KOps/s $\color{#35bf28}+1.69\%$
test_exec_td 0.2773ms 0.2273ms 4.3993 KOps/s 4.2538 KOps/s $\color{#35bf28}+3.42\%$
test_exec_td_decorator 0.8932ms 0.2634ms 3.7963 KOps/s 3.6105 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_vmap_mlp_speed[True-True] 0.7744ms 0.7139ms 1.4008 KOps/s 1.4366 KOps/s $\color{#d91a1a}-2.49\%$
test_vmap_mlp_speed[True-False] 0.8058ms 0.7174ms 1.3939 KOps/s 1.3738 KOps/s $\color{#35bf28}+1.47\%$
test_vmap_mlp_speed[False-True] 0.7161ms 0.6059ms 1.6504 KOps/s 1.6399 KOps/s $\color{#35bf28}+0.64\%$
test_vmap_mlp_speed[False-False] 0.6663ms 0.6063ms 1.6493 KOps/s 1.6388 KOps/s $\color{#35bf28}+0.64\%$
test_vmap_mlp_speed_decorator[True-True] 1.3873ms 0.6961ms 1.4367 KOps/s 1.4039 KOps/s $\color{#35bf28}+2.33\%$
test_vmap_mlp_speed_decorator[True-False] 0.8280ms 0.6985ms 1.4317 KOps/s 1.4019 KOps/s $\color{#35bf28}+2.13\%$
test_vmap_mlp_speed_decorator[False-True] 0.7617ms 0.6168ms 1.6214 KOps/s 1.6029 KOps/s $\color{#35bf28}+1.15\%$
test_vmap_mlp_speed_decorator[False-False] 0.7387ms 0.6161ms 1.6231 KOps/s 1.5966 KOps/s $\color{#35bf28}+1.66\%$
test_vmap_transformer_speed[True-True] 8.7707ms 8.4171ms 118.8052 Ops/s 117.0264 Ops/s $\color{#35bf28}+1.52\%$
test_vmap_transformer_speed[True-False] 8.7276ms 8.3790ms 119.3460 Ops/s 116.5690 Ops/s $\color{#35bf28}+2.38\%$
test_vmap_transformer_speed[False-True] 8.6083ms 8.2623ms 121.0311 Ops/s 120.0125 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_transformer_speed[False-False] 8.6025ms 8.2000ms 121.9518 Ops/s 120.8770 Ops/s $\color{#35bf28}+0.89\%$
test_vmap_transformer_speed_decorator[True-True] 20.5133ms 19.6859ms 50.7978 Ops/s 50.2205 Ops/s $\color{#35bf28}+1.15\%$
test_vmap_transformer_speed_decorator[True-False] 20.3825ms 19.6644ms 50.8532 Ops/s 50.4252 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_transformer_speed_decorator[False-True] 20.2846ms 19.5933ms 51.0379 Ops/s 50.8731 Ops/s $\color{#35bf28}+0.32\%$
test_vmap_transformer_speed_decorator[False-False] 19.9370ms 19.5381ms 51.1821 Ops/s 50.6575 Ops/s $\color{#35bf28}+1.04\%$
test_to_module_speed[True] 1.3718ms 0.9416ms 1.0620 KOps/s 1.0621 KOps/s $-0.01\%$
test_to_module_speed[False] 1.3037ms 0.9202ms 1.0867 KOps/s 1.0920 KOps/s $\color{#d91a1a}-0.48\%$
test_tc_init 67.7810μs 32.9802μs 30.3212 KOps/s 27.1742 KOps/s $\textbf{\color{#35bf28}+11.58\%}$
test_tc_init_nested 0.1154ms 67.9423μs 14.7184 KOps/s 13.2055 KOps/s $\textbf{\color{#35bf28}+11.46\%}$
test_tc_first_layer_tensor 6.0530μs 0.6759μs 1.4796 MOps/s 1.4843 MOps/s $\color{#d91a1a}-0.32\%$
test_tc_first_layer_nontensor 26.8400μs 2.2289μs 448.6503 KOps/s 450.5350 KOps/s $\color{#d91a1a}-0.42\%$
test_tc_second_layer_tensor 8.6800μs 1.3593μs 735.6561 KOps/s 743.7549 KOps/s $\color{#d91a1a}-1.09\%$
test_tc_second_layer_nontensor 0.1044ms 2.9515μs 338.8154 KOps/s 344.7502 KOps/s $\color{#d91a1a}-1.72\%$
test_unbind 0.1901s 11.9221ms 83.8777 Ops/s 91.1574 Ops/s $\textbf{\color{#d91a1a}-7.99\%}$
test_full_like 0.6593ms 0.5733ms 1.7444 KOps/s 1.7418 KOps/s $\color{#35bf28}+0.15\%$
test_zeros_like 0.2892ms 0.1980ms 5.0517 KOps/s 5.0609 KOps/s $\color{#d91a1a}-0.18\%$
test_ones_like 0.2357ms 0.1977ms 5.0569 KOps/s 5.0677 KOps/s $\color{#d91a1a}-0.21\%$
test_clone 0.4458ms 0.4148ms 2.4106 KOps/s 2.4193 KOps/s $\color{#d91a1a}-0.36\%$
test_squeeze 42.9110μs 11.4064μs 87.6704 KOps/s 100.8936 KOps/s $\textbf{\color{#d91a1a}-13.11\%}$
test_unsqueeze 0.2937ms 74.4611μs 13.4298 KOps/s 12.6548 KOps/s $\textbf{\color{#35bf28}+6.12\%}$
test_split 0.2596ms 0.1612ms 6.2025 KOps/s 6.0193 KOps/s $\color{#35bf28}+3.04\%$
test_permute 0.2336ms 0.1900ms 5.2634 KOps/s 5.4249 KOps/s $\color{#d91a1a}-2.98\%$
test_stack 1.2423ms 0.8745ms 1.1436 KOps/s 1.1338 KOps/s $\color{#35bf28}+0.86\%$
test_cat 1.2625ms 1.2314ms 812.0681 Ops/s 811.9570 Ops/s $\color{#35bf28}+0.01\%$
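
The Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the table above. The function name `percent_change` and the 5% cutoff are assumptions for illustration: the cutoff is inferred from which rows are shown in bold, not taken from the benchmark tooling.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR throughput against repo HEAD, in percent.

    Both arguments must use the same unit (for example, both KOps/s);
    a few rows in the table mix Ops/s and KOps/s and need converting first.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table, all in KOps/s.
rows = {
    "test_distributed": (4.8020, 8.4706),
    "test_mod_add[eager]": (31.0315, 26.3846),
    "test_squeeze": (87.6704, 100.8936),
}

# Assumed cutoff: rows whose absolute change exceeds 5% are the bold ones.
BOLD_THRESHOLD = 5.0

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    marker = " (bold)" if abs(change) > BOLD_THRESHOLD else ""
    print(f"{name}: {change:+.2f}%{marker}")
```

Recomputing a row this way is a quick check before reading too much into a single outlier such as `test_distributed`.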

@vmoens vmoens merged commit 684707e into gh/vmoens/19/base Sep 17, 2024
39 of 50 checks passed
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: 2c6c43c5b34be73572d1d1a8da009585e8876bc5
Pull Request resolved: #992
@vmoens vmoens deleted the gh/vmoens/19/head branch September 17, 2024 02:54