[BugFix] Fix non-deterministic key order in stack #1230

Open
wants to merge 1 commit into
base: gh/vmoens/48/base

vmoens
Contributor

@vmoens vmoens commented Feb 22, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 22, 2025
ghstack-source-id: 7f394789b783d6359a78a300aaf449eb25adb5e3
Pull Request resolved: #1230
@facebook-github-bot added the CLA Signed label Feb 22, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}4$.
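
The Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column; the rows in this report are consistent with that reading. Below is a minimal sketch under two assumptions: that formula, and a ±5% cutoff for counting a benchmark as improved or worsened. The cutoff is inferred from which rows are shown in bold and is not stated by the report. `percent_change` and `classify` are hypothetical helper names, not part of the benchmark tooling.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of this PR against repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the bold rows."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "neutral"


# CPU row for test_chunk: 578.8473 Ops/s on this PR, 637.4578 Ops/s on HEAD.
chunk_change = percent_change(578.8473, 637.4578)
print(f"{chunk_change:+.2f}% -> {classify(chunk_change)}")
```

With the values from the test_chunk row this gives about -9.19%, which matches the table and would count as worsened under the assumed cutoff.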

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 50.5640μs 20.0925μs 49.7697 KOps/s 49.5689 KOps/s $\color{#35bf28}+0.41\%$
test_plain_set_stack_nested 45.7450μs 20.0545μs 49.8641 KOps/s 49.2224 KOps/s $\color{#35bf28}+1.30\%$
test_plain_set_nested_inplace 54.9320μs 21.5403μs 46.4247 KOps/s 45.2589 KOps/s $\color{#35bf28}+2.58\%$
test_plain_set_stack_nested_inplace 54.5110μs 21.6148μs 46.2646 KOps/s 44.8616 KOps/s $\color{#35bf28}+3.13\%$
test_items 20.8280μs 4.1462μs 241.1856 KOps/s 242.2765 KOps/s $\color{#d91a1a}-0.45\%$
test_items_nested 0.7584ms 0.4077ms 2.4526 KOps/s 2.4268 KOps/s $\color{#35bf28}+1.06\%$
test_items_nested_locked 0.4965ms 0.4056ms 2.4652 KOps/s 2.4229 KOps/s $\color{#35bf28}+1.75\%$
test_items_nested_leaf 0.1441ms 76.4252μs 13.0847 KOps/s 13.0296 KOps/s $\color{#35bf28}+0.42\%$
test_items_stack_nested 0.7546ms 0.4098ms 2.4404 KOps/s 2.4065 KOps/s $\color{#35bf28}+1.41\%$
test_items_stack_nested_leaf 0.1496ms 76.3706μs 13.0940 KOps/s 12.4926 KOps/s $\color{#35bf28}+4.81\%$
test_items_stack_nested_locked 0.7752ms 0.4130ms 2.4214 KOps/s 2.4216 KOps/s $-0.01\%$
test_keys 23.2640μs 3.4641μs 288.6739 KOps/s 289.2372 KOps/s $\color{#d91a1a}-0.19\%$
test_keys_nested 0.2913ms 0.1629ms 6.1402 KOps/s 6.1117 KOps/s $\color{#35bf28}+0.47\%$
test_keys_nested_locked 1.9566ms 0.1694ms 5.9030 KOps/s 5.8818 KOps/s $\color{#35bf28}+0.36\%$
test_keys_nested_leaf 0.2214ms 0.1417ms 7.0567 KOps/s 7.0130 KOps/s $\color{#35bf28}+0.62\%$
test_keys_stack_nested 0.2915ms 0.1622ms 6.1640 KOps/s 6.1385 KOps/s $\color{#35bf28}+0.42\%$
test_keys_stack_nested_leaf 0.2114ms 0.1418ms 7.0499 KOps/s 7.1402 KOps/s $\color{#d91a1a}-1.26\%$
test_keys_stack_nested_locked 0.2965ms 0.1693ms 5.9061 KOps/s 5.9589 KOps/s $\color{#d91a1a}-0.89\%$
test_values 4.7186μs 1.0261μs 974.5827 KOps/s 944.1908 KOps/s $\color{#35bf28}+3.22\%$
test_values_nested 0.1164ms 62.7862μs 15.9271 KOps/s 16.2162 KOps/s $\color{#d91a1a}-1.78\%$
test_values_nested_locked 0.1139ms 62.0827μs 16.1076 KOps/s 16.1797 KOps/s $\color{#d91a1a}-0.45\%$
test_values_nested_leaf 0.1327ms 70.8202μs 14.1203 KOps/s 14.1840 KOps/s $\color{#d91a1a}-0.45\%$
test_values_stack_nested 0.1220ms 62.5424μs 15.9891 KOps/s 15.9563 KOps/s $\color{#35bf28}+0.21\%$
test_values_stack_nested_leaf 0.1353ms 70.6468μs 14.1549 KOps/s 13.8631 KOps/s $\color{#35bf28}+2.11\%$
test_values_stack_nested_locked 0.1346ms 62.7205μs 15.9438 KOps/s 15.8213 KOps/s $\color{#35bf28}+0.77\%$
test_membership 7.3098μs 0.7162μs 1.3962 MOps/s 1.4393 MOps/s $\color{#d91a1a}-2.99\%$
test_membership_nested 20.3280μs 2.8643μs 349.1285 KOps/s 335.7401 KOps/s $\color{#35bf28}+3.99\%$
test_membership_nested_leaf 23.2040μs 2.8934μs 345.6096 KOps/s 331.4669 KOps/s $\color{#35bf28}+4.27\%$
test_membership_stacked_nested 20.4680μs 2.9044μs 344.3051 KOps/s 334.4255 KOps/s $\color{#35bf28}+2.95\%$
test_membership_stacked_nested_leaf 31.7990μs 2.8725μs 348.1283 KOps/s 331.3921 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_membership_nested_last 31.0180μs 4.3341μs 230.7261 KOps/s 227.3015 KOps/s $\color{#35bf28}+1.51\%$
test_membership_nested_leaf_last 29.3750μs 4.4380μs 225.3243 KOps/s 222.3284 KOps/s $\color{#35bf28}+1.35\%$
test_membership_stacked_nested_last 51.2350μs 4.3678μs 228.9479 KOps/s 225.7058 KOps/s $\color{#35bf28}+1.44\%$
test_membership_stacked_nested_leaf_last 24.0650μs 4.3688μs 228.8941 KOps/s 224.5410 KOps/s $\color{#35bf28}+1.94\%$
test_nested_getleaf 42.7790μs 10.3986μs 96.1666 KOps/s 93.9256 KOps/s $\color{#35bf28}+2.39\%$
test_nested_get 29.9550μs 9.8456μs 101.5687 KOps/s 98.9290 KOps/s $\color{#35bf28}+2.67\%$
test_stacked_getleaf 29.9950μs 10.5160μs 95.0936 KOps/s 92.7077 KOps/s $\color{#35bf28}+2.57\%$
test_stacked_get 34.3440μs 10.1883μs 98.1515 KOps/s 98.9119 KOps/s $\color{#d91a1a}-0.77\%$
test_nested_getitemleaf 36.8490μs 11.1755μs 89.4818 KOps/s 87.7022 KOps/s $\color{#35bf28}+2.03\%$
test_nested_getitem 34.4240μs 10.5735μs 94.5759 KOps/s 93.1571 KOps/s $\color{#35bf28}+1.52\%$
test_stacked_getitemleaf 44.5530μs 11.3200μs 88.3392 KOps/s 88.9028 KOps/s $\color{#d91a1a}-0.63\%$
test_stacked_getitem 32.5010μs 10.5897μs 94.4318 KOps/s 92.0642 KOps/s $\color{#35bf28}+2.57\%$
test_lock_nested 0.6591ms 0.4082ms 2.4497 KOps/s 2.4337 KOps/s $\color{#35bf28}+0.66\%$
test_lock_stack_nested 0.6763ms 0.4194ms 2.3843 KOps/s 2.3820 KOps/s $\color{#35bf28}+0.09\%$
test_unlock_nested 0.5369ms 0.3319ms 3.0132 KOps/s 2.9532 KOps/s $\color{#35bf28}+2.03\%$
test_unlock_stack_nested 0.5289ms 0.3386ms 2.9531 KOps/s 2.9409 KOps/s $\color{#35bf28}+0.42\%$
test_flatten_speed 0.2051ms 99.7813μs 10.0219 KOps/s 10.0681 KOps/s $\color{#d91a1a}-0.46\%$
test_unflatten_speed 0.6448ms 0.5143ms 1.9444 KOps/s 1.9447 KOps/s $\color{#d91a1a}-0.01\%$
test_common_ops 4.1030ms 0.7860ms 1.2722 KOps/s 1.2491 KOps/s $\color{#35bf28}+1.85\%$
test_creation 20.4680μs 2.5234μs 396.2863 KOps/s 408.8910 KOps/s $\color{#d91a1a}-3.08\%$
test_creation_empty 36.3480μs 11.1764μs 89.4743 KOps/s 90.6145 KOps/s $\color{#d91a1a}-1.26\%$
test_creation_nested_1 34.7950μs 14.0074μs 71.3911 KOps/s 72.1796 KOps/s $\color{#d91a1a}-1.09\%$
test_creation_nested_2 50.4840μs 18.4126μs 54.3106 KOps/s 54.7872 KOps/s $\color{#d91a1a}-0.87\%$
test_clone 73.9680μs 13.6713μs 73.1459 KOps/s 71.8208 KOps/s $\color{#35bf28}+1.84\%$
test_getitem[int] 1.0845ms 12.6996μs 78.7429 KOps/s 78.6148 KOps/s $\color{#35bf28}+0.16\%$
test_getitem[slice_int] 0.1288ms 23.9138μs 41.8168 KOps/s 40.6290 KOps/s $\color{#35bf28}+2.92\%$
test_getitem[range] 0.1637ms 49.4003μs 20.2428 KOps/s 20.1528 KOps/s $\color{#35bf28}+0.45\%$
test_getitem[tuple] 0.1230ms 20.0088μs 49.9779 KOps/s 49.4755 KOps/s $\color{#35bf28}+1.02\%$
test_getitem[list] 0.1913ms 45.4505μs 22.0020 KOps/s 21.9756 KOps/s $\color{#35bf28}+0.12\%$
test_setitem_dim[int] 53.3200μs 25.0534μs 39.9148 KOps/s 38.2331 KOps/s $\color{#35bf28}+4.40\%$
test_setitem_dim[slice_int] 0.1067ms 51.2792μs 19.5011 KOps/s 19.1575 KOps/s $\color{#35bf28}+1.79\%$
test_setitem_dim[range] 0.1251ms 77.5564μs 12.8938 KOps/s 12.8566 KOps/s $\color{#35bf28}+0.29\%$
test_setitem_dim[tuple] 70.2900μs 39.7106μs 25.1822 KOps/s 24.2307 KOps/s $\color{#35bf28}+3.93\%$
test_setitem 77.5440μs 20.0409μs 49.8979 KOps/s 48.8040 KOps/s $\color{#35bf28}+2.24\%$
test_set 0.2014ms 19.4946μs 51.2962 KOps/s 50.2512 KOps/s $\color{#35bf28}+2.08\%$
test_set_shared 3.3504ms 0.1809ms 5.5277 KOps/s 5.4261 KOps/s $\color{#35bf28}+1.87\%$
test_update 0.1211ms 22.5269μs 44.3915 KOps/s 44.8796 KOps/s $\color{#d91a1a}-1.09\%$
test_update_nested 96.7100μs 32.2025μs 31.0534 KOps/s 29.6571 KOps/s $\color{#35bf28}+4.71\%$
test_update__nested 0.3872ms 33.4913μs 29.8585 KOps/s 29.4236 KOps/s $\color{#35bf28}+1.48\%$
test_set_nested 79.8780μs 21.5691μs 46.3625 KOps/s 45.2932 KOps/s $\color{#35bf28}+2.36\%$
test_set_nested_new 83.5150μs 25.9756μs 38.4976 KOps/s 37.2038 KOps/s $\color{#35bf28}+3.48\%$
test_select 0.1063ms 42.3990μs 23.5854 KOps/s 23.6249 KOps/s $\color{#d91a1a}-0.17\%$
test_select_nested 0.1349ms 62.6949μs 15.9503 KOps/s 15.8965 KOps/s $\color{#35bf28}+0.34\%$
test_exclude_nested 0.1653ms 79.5516μs 12.5705 KOps/s 12.2760 KOps/s $\color{#35bf28}+2.40\%$
test_empty[True] 0.6136ms 0.4020ms 2.4877 KOps/s 2.4720 KOps/s $\color{#35bf28}+0.63\%$
test_empty[False] 8.1878μs 1.3641μs 733.0891 KOps/s 716.2079 KOps/s $\color{#35bf28}+2.36\%$
test_unbind_speed 0.3545ms 0.2666ms 3.7512 KOps/s 3.6597 KOps/s $\color{#35bf28}+2.50\%$
test_unbind_speed_stack0 0.3183ms 0.2653ms 3.7686 KOps/s 3.7366 KOps/s $\color{#35bf28}+0.86\%$
test_unbind_speed_stack1 0.1027s 0.7309ms 1.3682 KOps/s 1.3903 KOps/s $\color{#d91a1a}-1.59\%$
test_split 96.8259ms 1.7293ms 578.2627 Ops/s 539.6095 Ops/s $\textbf{\color{#35bf28}+7.16\%}$
test_chunk 94.1865ms 1.7276ms 578.8473 Ops/s 637.4578 Ops/s $\textbf{\color{#d91a1a}-9.19\%}$
test_consolidate_njt[False-None] 8.6641ms 8.3190ms 120.2063 Ops/s 110.5560 Ops/s $\textbf{\color{#35bf28}+8.73\%}$
test_creation[device0] 0.2666ms 92.1606μs 10.8506 KOps/s 10.7026 KOps/s $\color{#35bf28}+1.38\%$
test_creation_from_tensor 3.7649ms 95.5846μs 10.4619 KOps/s 10.2886 KOps/s $\color{#35bf28}+1.68\%$
test_add_one[memmap_tensor0] 0.1266ms 5.1077μs 195.7816 KOps/s 185.7611 KOps/s $\textbf{\color{#35bf28}+5.39\%}$
test_contiguous[memmap_tensor0] 15.7790μs 0.5100μs 1.9606 MOps/s 1.9694 MOps/s $\color{#d91a1a}-0.44\%$
test_stack[memmap_tensor0] 41.9580μs 3.5252μs 283.6698 KOps/s 277.6067 KOps/s $\color{#35bf28}+2.18\%$
test_memmaptd_index 0.3004ms 0.2315ms 4.3205 KOps/s 4.2787 KOps/s $\color{#35bf28}+0.98\%$
test_memmaptd_index_astensor 1.1184ms 0.3185ms 3.1397 KOps/s 3.1488 KOps/s $\color{#d91a1a}-0.29\%$
test_memmaptd_index_op 1.1183ms 0.5758ms 1.7366 KOps/s 1.6841 KOps/s $\color{#35bf28}+3.12\%$
test_serialize_model 0.2228s 0.1289s 7.7556 Ops/s 8.6258 Ops/s $\textbf{\color{#d91a1a}-10.09\%}$
test_serialize_model_pickle 0.4577s 0.3893s 2.5689 Ops/s 2.5563 Ops/s $\color{#35bf28}+0.49\%$
test_serialize_weights 0.1212s 0.1157s 8.6423 Ops/s 8.9384 Ops/s $\color{#d91a1a}-3.31\%$
test_serialize_weights_returnearly 0.1772s 0.1600s 6.2513 Ops/s 6.3346 Ops/s $\color{#d91a1a}-1.32\%$
test_serialize_weights_pickle 0.4671s 0.4072s 2.4557 Ops/s 2.4569 Ops/s $\color{#d91a1a}-0.05\%$
test_serialize_weights_filesystem 0.1511s 0.1411s 7.0891 Ops/s 7.2141 Ops/s $\color{#d91a1a}-1.73\%$
test_serialize_model_filesystem 0.2512s 0.1609s 6.2161 Ops/s 6.5841 Ops/s $\textbf{\color{#d91a1a}-5.59\%}$
test_reshape_pytree 57.4970μs 26.4788μs 37.7661 KOps/s 38.0484 KOps/s $\color{#d91a1a}-0.74\%$
test_reshape_td 71.6330μs 33.2770μs 30.0508 KOps/s 30.9803 KOps/s $\color{#d91a1a}-3.00\%$
test_view_pytree 73.5870μs 26.3074μs 38.0122 KOps/s 38.3970 KOps/s $\color{#d91a1a}-1.00\%$
test_view_td 0.1005ms 39.7074μs 25.1842 KOps/s 25.0184 KOps/s $\color{#35bf28}+0.66\%$
test_unbind_pytree 69.1680μs 29.6631μs 33.7119 KOps/s 33.8626 KOps/s $\color{#d91a1a}-0.44\%$
test_unbind_td 0.3544ms 39.7603μs 25.1507 KOps/s 24.4127 KOps/s $\color{#35bf28}+3.02\%$
test_split_pytree 58.3090μs 29.0730μs 34.3962 KOps/s 34.3660 KOps/s $\color{#35bf28}+0.09\%$
test_split_td 0.5334ms 44.6635μs 22.3896 KOps/s 22.4785 KOps/s $\color{#d91a1a}-0.40\%$
test_add_pytree 73.8480μs 35.6472μs 28.0527 KOps/s 27.4963 KOps/s $\color{#35bf28}+2.02\%$
test_add_td 0.1030ms 56.9646μs 17.5548 KOps/s 17.9959 KOps/s $\color{#d91a1a}-2.45\%$
test_compile_add_one_nested[tensordict-compile] 0.1527ms 68.3119μs 14.6387 KOps/s 14.7099 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_add_one_nested[tensordict-eager] 0.2900ms 0.1692ms 5.9116 KOps/s 5.8149 KOps/s $\color{#35bf28}+1.66\%$
test_compile_add_one_nested[pytree-compile] 0.1345ms 45.8766μs 21.7976 KOps/s 21.2851 KOps/s $\color{#35bf28}+2.41\%$
test_compile_add_one_nested[pytree-eager] 0.2349ms 0.1185ms 8.4401 KOps/s 8.2548 KOps/s $\color{#35bf28}+2.24\%$
test_compile_copy_nested[tensordict-compile] 67.3660μs 27.8959μs 35.8476 KOps/s 34.0074 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_compile_copy_nested[tensordict-eager] 0.1133ms 59.1498μs 16.9062 KOps/s 17.0681 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_copy_nested[pytree-compile] 0.1955ms 78.4037μs 12.7545 KOps/s 12.5262 KOps/s $\color{#35bf28}+1.82\%$
test_compile_copy_nested[pytree-eager] 0.1414ms 65.7827μs 15.2016 KOps/s 14.9458 KOps/s $\color{#35bf28}+1.71\%$
test_compile_add_one_flat[tensordict-compile] 0.1927ms 0.1097ms 9.1134 KOps/s 9.1113 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_one_flat[tensordict-eager] 0.3237ms 0.2146ms 4.6593 KOps/s 4.6641 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_add_one_flat[tensorclass-compile] 99.4950μs 48.5853μs 20.5824 KOps/s 21.1996 KOps/s $\color{#d91a1a}-2.91\%$
test_compile_add_one_flat[tensorclass-eager] 0.1428ms 67.0000μs 14.9254 KOps/s 14.6614 KOps/s $\color{#35bf28}+1.80\%$
test_compile_add_one_flat[pytree-compile] 0.1749ms 0.1020ms 9.8008 KOps/s 9.8859 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_add_one_flat[pytree-eager] 0.3438ms 0.2007ms 4.9816 KOps/s 4.8463 KOps/s $\color{#35bf28}+2.79\%$
test_compile_add_self_flat[tensordict-eager] 0.3430ms 0.2283ms 4.3797 KOps/s 4.3612 KOps/s $\color{#35bf28}+0.42\%$
test_compile_add_self_flat[tensordict-compile] 0.2364ms 0.1103ms 9.0661 KOps/s 9.1451 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_add_self_flat[tensorclass-eager] 0.4927ms 63.3642μs 15.7818 KOps/s 16.0179 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_add_self_flat[tensorclass-compile] 0.1053ms 51.2503μs 19.5121 KOps/s 20.3721 KOps/s $\color{#d91a1a}-4.22\%$
test_compile_add_self_flat[pytree-eager] 0.2790ms 0.1570ms 6.3711 KOps/s 6.1839 KOps/s $\color{#35bf28}+3.03\%$
test_compile_add_self_flat[pytree-compile] 0.1710ms 0.1023ms 9.7739 KOps/s 9.6994 KOps/s $\color{#35bf28}+0.77\%$
test_compile_copy_flat[tensordict-compile] 53.0380μs 21.7783μs 45.9172 KOps/s 46.9630 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_copy_flat[tensordict-eager] 0.1326ms 69.8782μs 14.3106 KOps/s 15.0708 KOps/s $\textbf{\color{#d91a1a}-5.04\%}$
test_compile_copy_flat[pytree-compile] 0.1743ms 80.8193μs 12.3733 KOps/s 12.4078 KOps/s $\color{#d91a1a}-0.28\%$
test_compile_copy_flat[pytree-eager] 0.1396ms 66.8934μs 14.9491 KOps/s 15.0077 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_assign_and_add[tensordict-compile] 0.3158ms 0.2160ms 4.6302 KOps/s 4.5990 KOps/s $\color{#35bf28}+0.68\%$
test_compile_assign_and_add[tensordict-eager] 1.5460ms 1.3636ms 733.3668 Ops/s 712.4200 Ops/s $\color{#35bf28}+2.94\%$
test_compile_assign_and_add[pytree-compile] 0.2929ms 0.2127ms 4.7010 KOps/s 4.6878 KOps/s $\color{#35bf28}+0.28\%$
test_compile_assign_and_add[pytree-eager] 1.0281ms 0.8341ms 1.1989 KOps/s 1.1686 KOps/s $\color{#35bf28}+2.59\%$
test_compile_assign_and_add_stack[compile] 0.8256ms 0.4602ms 2.1729 KOps/s 2.1490 KOps/s $\color{#35bf28}+1.11\%$
test_compile_assign_and_add_stack[eager] 4.9075ms 2.7143ms 368.4126 Ops/s 358.9632 Ops/s $\color{#35bf28}+2.63\%$
test_compile_indexing[tensor-tensordict-compile] 84.9780μs 38.8282μs 25.7545 KOps/s 26.1942 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_indexing[tensor-tensordict-eager] 0.6854ms 32.4282μs 30.8374 KOps/s 29.8526 KOps/s $\color{#35bf28}+3.30\%$
test_compile_indexing[tensor-tensorclass-compile] 88.8250μs 30.9863μs 32.2723 KOps/s 33.0588 KOps/s $\color{#d91a1a}-2.38\%$
test_compile_indexing[tensor-tensorclass-eager] 73.0960μs 22.9305μs 43.6101 KOps/s 43.1858 KOps/s $\color{#35bf28}+0.98\%$
test_compile_indexing[tensor-pytree-compile] 85.3090μs 31.2143μs 32.0366 KOps/s 32.0131 KOps/s $\color{#35bf28}+0.07\%$
test_compile_indexing[tensor-pytree-eager] 84.8880μs 22.9848μs 43.5070 KOps/s 43.1868 KOps/s $\color{#35bf28}+0.74\%$
test_compile_indexing[slice-tensordict-compile] 0.1078ms 52.8895μs 18.9073 KOps/s 18.7620 KOps/s $\color{#35bf28}+0.77\%$
test_compile_indexing[slice-tensordict-eager] 0.3060ms 19.5923μs 51.0405 KOps/s 49.1046 KOps/s $\color{#35bf28}+3.94\%$
test_compile_indexing[slice-tensorclass-compile] 0.1615ms 47.1177μs 21.2235 KOps/s 21.3950 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_indexing[slice-tensorclass-eager] 74.2480μs 18.6917μs 53.4996 KOps/s 54.1682 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[slice-pytree-compile] 0.1262ms 46.5580μs 21.4786 KOps/s 21.0729 KOps/s $\color{#35bf28}+1.92\%$
test_compile_indexing[slice-pytree-eager] 66.1130μs 18.6723μs 53.5553 KOps/s 53.7130 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[int-tensordict-compile] 0.1126ms 54.1931μs 18.4525 KOps/s 18.0790 KOps/s $\color{#35bf28}+2.07\%$
test_compile_indexing[int-tensordict-eager] 0.9634ms 19.4792μs 51.3369 KOps/s 49.8769 KOps/s $\color{#35bf28}+2.93\%$
test_compile_indexing[int-tensorclass-compile] 0.1104ms 46.2707μs 21.6119 KOps/s 21.1788 KOps/s $\color{#35bf28}+2.05\%$
test_compile_indexing[int-tensorclass-eager] 0.2242ms 18.8348μs 53.0933 KOps/s 54.6501 KOps/s $\color{#d91a1a}-2.85\%$
test_compile_indexing[int-pytree-compile] 0.1144ms 46.2437μs 21.6246 KOps/s 21.0545 KOps/s $\color{#35bf28}+2.71\%$
test_compile_indexing[int-pytree-eager] 65.9830μs 18.7044μs 53.4633 KOps/s 54.7544 KOps/s $\color{#d91a1a}-2.36\%$
test_mod_add[eager] 0.1204ms 35.0507μs 28.5301 KOps/s 29.1686 KOps/s $\color{#d91a1a}-2.19\%$
test_mod_add[compile] 0.1326ms 64.7804μs 15.4368 KOps/s 15.2008 KOps/s $\color{#35bf28}+1.55\%$
test_mod_add[compile-overhead] 0.1294ms 64.9284μs 15.4016 KOps/s 14.8384 KOps/s $\color{#35bf28}+3.80\%$
test_mod_wrap[eager] 0.3898ms 0.2277ms 4.3915 KOps/s 4.3607 KOps/s $\color{#35bf28}+0.71\%$
test_mod_wrap[compile] 1.2908ms 0.2291ms 4.3642 KOps/s 3.5848 KOps/s $\textbf{\color{#35bf28}+21.74\%}$
test_mod_wrap[compile-overhead] 1.3510ms 0.2272ms 4.4014 KOps/s 4.3354 KOps/s $\color{#35bf28}+1.52\%$
test_mod_wrap_and_backward[eager] 12.1833ms 10.7587ms 92.9484 Ops/s 77.9155 Ops/s $\textbf{\color{#35bf28}+19.29\%}$
test_mod_wrap_and_backward[compile] 13.8025ms 11.3765ms 87.9007 Ops/s 86.5198 Ops/s $\color{#35bf28}+1.60\%$
test_mod_wrap_and_backward[compile-overhead] 12.5487ms 10.8182ms 92.4368 Ops/s 86.8288 Ops/s $\textbf{\color{#35bf28}+6.46\%}$
test_seq_add[eager] 0.1956ms 0.1179ms 8.4787 KOps/s 8.3750 KOps/s $\color{#35bf28}+1.24\%$
test_seq_add[compile] 0.1755ms 75.7242μs 13.2058 KOps/s 12.7631 KOps/s $\color{#35bf28}+3.47\%$
test_seq_add[compile-overhead] 0.1745ms 74.4694μs 13.4283 KOps/s 13.1445 KOps/s $\color{#35bf28}+2.16\%$
test_seq_wrap[eager] 3.5348ms 0.4578ms 2.1845 KOps/s 2.1605 KOps/s $\color{#35bf28}+1.11\%$
test_seq_wrap[compile] 0.3870ms 0.2385ms 4.1921 KOps/s 4.0709 KOps/s $\color{#35bf28}+2.98\%$
test_seq_wrap[compile-overhead] 0.3170ms 0.2377ms 4.2072 KOps/s 4.1174 KOps/s $\color{#35bf28}+2.18\%$
test_func_call_runtime[False-eager] 0.9774ms 0.5534ms 1.8071 KOps/s 1.7810 KOps/s $\color{#35bf28}+1.46\%$
test_func_call_runtime[False-compile] 0.5817ms 0.4397ms 2.2741 KOps/s 2.2660 KOps/s $\color{#35bf28}+0.36\%$
test_func_call_runtime[False-compile-overhead] 0.6729ms 0.4396ms 2.2750 KOps/s 2.2349 KOps/s $\color{#35bf28}+1.80\%$
test_func_call_runtime[True-eager] 1.8347ms 0.7688ms 1.3007 KOps/s 1.3000 KOps/s $\color{#35bf28}+0.05\%$
test_func_call_runtime[True-compile] 0.5368ms 0.4597ms 2.1754 KOps/s 2.1697 KOps/s $\color{#35bf28}+0.26\%$
test_func_call_runtime[True-compile-overhead] 0.5398ms 0.4599ms 2.1742 KOps/s 2.1486 KOps/s $\color{#35bf28}+1.19\%$
test_func_call_cm_runtime[False-eager] 0.7193ms 0.5514ms 1.8134 KOps/s 1.8306 KOps/s $\color{#d91a1a}-0.94\%$
test_func_call_cm_runtime[False-compile] 0.6185ms 0.4416ms 2.2644 KOps/s 2.2600 KOps/s $\color{#35bf28}+0.19\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5456ms 0.4402ms 2.2719 KOps/s 2.2559 KOps/s $\color{#35bf28}+0.71\%$
test_func_call_cm_runtime[True-eager] 1.1192ms 0.9084ms 1.1008 KOps/s 1.0947 KOps/s $\color{#35bf28}+0.56\%$
test_func_call_cm_runtime[True-compile] 0.9128ms 0.8051ms 1.2421 KOps/s 1.2232 KOps/s $\color{#35bf28}+1.55\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8967ms 0.8028ms 1.2456 KOps/s 1.2151 KOps/s $\color{#35bf28}+2.52\%$
test_vmap_func_call_cm_runtime[eager] 2.4985ms 1.9024ms 525.6410 Ops/s 515.3538 Ops/s $\color{#35bf28}+2.00\%$
test_vmap_func_call_cm_runtime[compile] 0.9085ms 0.5286ms 1.8917 KOps/s 1.8459 KOps/s $\color{#35bf28}+2.48\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8739ms 0.5283ms 1.8927 KOps/s 1.8473 KOps/s $\color{#35bf28}+2.46\%$
test_distributed 0.2699ms 0.1242ms 8.0535 KOps/s 7.8641 KOps/s $\color{#35bf28}+2.41\%$
test_tdmodule 43.2710μs 26.4252μs 37.8426 KOps/s 36.4959 KOps/s $\color{#35bf28}+3.69\%$
test_tdmodule_dispatch 74.1090μs 49.0706μs 20.3788 KOps/s 20.0747 KOps/s $\color{#35bf28}+1.51\%$
test_tdseq 63.1880μs 29.3815μs 34.0351 KOps/s 33.9213 KOps/s $\color{#35bf28}+0.34\%$
test_tdseq_dispatch 71.0820μs 53.1177μs 18.8261 KOps/s 18.2945 KOps/s $\color{#35bf28}+2.91\%$
test_instantiation_functorch 1.7762ms 1.5410ms 648.9318 Ops/s 644.4957 Ops/s $\color{#35bf28}+0.69\%$
test_exec_functorch 0.4446ms 0.1819ms 5.4976 KOps/s 5.5480 KOps/s $\color{#d91a1a}-0.91\%$
test_exec_functional_call 0.2395ms 0.1745ms 5.7309 KOps/s 5.8931 KOps/s $\color{#d91a1a}-2.75\%$
test_exec_td_decorator 0.4719ms 0.2358ms 4.2415 KOps/s 4.2715 KOps/s $\color{#d91a1a}-0.70\%$
test_vmap_mlp_speed_decorator[True-True] 1.0145ms 0.6690ms 1.4948 KOps/s 1.4987 KOps/s $\color{#d91a1a}-0.26\%$
test_vmap_mlp_speed_decorator[True-False] 0.8576ms 0.6629ms 1.5085 KOps/s 1.3673 KOps/s $\textbf{\color{#35bf28}+10.32\%}$
test_vmap_mlp_speed_decorator[False-True] 0.8055ms 0.5406ms 1.8499 KOps/s 1.8527 KOps/s $\color{#d91a1a}-0.15\%$
test_vmap_mlp_speed_decorator[False-False] 0.7769ms 0.5405ms 1.8500 KOps/s 1.8568 KOps/s $\color{#d91a1a}-0.36\%$
test_to_module_speed[True] 1.7863ms 1.3205ms 757.3101 Ops/s 752.0651 Ops/s $\color{#35bf28}+0.70\%$
test_to_module_speed[False] 2.1728ms 1.3002ms 769.0910 Ops/s 770.5427 Ops/s $\color{#d91a1a}-0.19\%$
test_tc_init 0.1037ms 45.0803μs 22.1826 KOps/s 22.0758 KOps/s $\color{#35bf28}+0.48\%$
test_tc_init_nested 0.1531ms 90.3156μs 11.0723 KOps/s 10.9704 KOps/s $\color{#35bf28}+0.93\%$
test_tc_first_layer_tensor 36.3280μs 1.5456μs 647.0066 KOps/s 636.9337 KOps/s $\color{#35bf28}+1.58\%$
test_tc_first_layer_nontensor 24.8760μs 4.7306μs 211.3913 KOps/s 211.0261 KOps/s $\color{#35bf28}+0.17\%$
test_tc_second_layer_tensor 28.5030μs 2.8407μs 352.0248 KOps/s 349.3585 KOps/s $\color{#35bf28}+0.76\%$
test_tc_second_layer_nontensor 44.9730μs 6.0429μs 165.4843 KOps/s 165.6710 KOps/s $\color{#d91a1a}-0.11\%$
test_unbind 0.2348s 12.9810ms 77.0359 Ops/s 68.7478 Ops/s $\textbf{\color{#35bf28}+12.06\%}$
test_full_like 8.8810ms 7.1752ms 139.3687 Ops/s 137.5501 Ops/s $\color{#35bf28}+1.32\%$
test_zeros_like 4.6198ms 2.7202ms 367.6171 Ops/s 363.9363 Ops/s $\color{#35bf28}+1.01\%$
test_ones_like 3.9289ms 3.2757ms 305.2826 Ops/s 307.3360 Ops/s $\color{#d91a1a}-0.67\%$
test_clone 5.1932ms 4.9158ms 203.4243 Ops/s 202.1834 Ops/s $\color{#35bf28}+0.61\%$
test_squeeze 73.1560μs 12.7443μs 78.4664 KOps/s 79.3490 KOps/s $\color{#d91a1a}-1.11\%$
test_unsqueeze 0.1473ms 94.1755μs 10.6185 KOps/s 10.7362 KOps/s $\color{#d91a1a}-1.10\%$
test_split 0.4407ms 0.1980ms 5.0493 KOps/s 5.0507 KOps/s $\color{#d91a1a}-0.03\%$
test_permute 0.3486ms 0.1985ms 5.0388 KOps/s 5.0286 KOps/s $\color{#35bf28}+0.20\%$
test_stack 29.7805ms 24.8847ms 40.1853 Ops/s 39.6228 Ops/s $\color{#35bf28}+1.42\%$
test_cat 27.1423ms 24.9931ms 40.0111 Ops/s 39.9189 Ops/s $\color{#35bf28}+0.23\%$

raise KeyError(
f"got keys {keys} and {set(td.keys())} which are incompatible"
)
return keys
if strict:
return keys
Contributor Author


We should actually make it a list

return keys
if strict:
return keys
return keys_set
Contributor Author


If keys can be exclusive, their order becomes arbitrary
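
To illustrate the concern: when the common keys are computed with a set intersection, the result has no guaranteed iteration order, and because Python randomizes string hashes per process, that order can differ between runs. The sketch below contrasts this with an order-preserving variant that follows the key order of the first input. It is not the `_check_keys` implementation from this PR; `common_keys_set` and `common_keys_ordered` are hypothetical names, and plain lists of strings stand in for tensordict keys.

```python
from typing import List, Sequence, Set


def common_keys_set(key_lists: Sequence[Sequence[str]]) -> Set[str]:
    """Intersect keys with sets: correct membership, but no guaranteed order."""
    if not key_lists:
        return set()
    common = set(key_lists[0])
    for keys in key_lists[1:]:
        common &= set(keys)
    return common


def common_keys_ordered(key_lists: Sequence[Sequence[str]]) -> List[str]:
    """Intersect keys while keeping the order of the first input."""
    if not key_lists:
        return []
    first, *rest = key_lists
    rest_sets = [set(keys) for keys in rest]
    # Walk the first key list in order and keep keys present everywhere else.
    return [key for key in first if all(key in s for s in rest_sets)]


key_lists = [["obs", "action", "reward"], ["reward", "obs", "extra"]]
print(common_keys_set(key_lists))      # same members, order may vary between runs
print(common_keys_ordered(key_lists))  # ['obs', 'reward']
```

Returning a list derived from the first input keeps the result reproducible across processes, which is what a deterministic stack needs.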

Contributor


Out of curiosity, which downstream functions would be impacted by this? In other words, in which contexts is _check_keys(strict=False) used?


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}21$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 40.5410μs 13.6061μs 73.4966 KOps/s 77.1433 KOps/s $\color{#d91a1a}-4.73\%$
test_plain_set_stack_nested 39.0600μs 13.7418μs 72.7709 KOps/s 76.4285 KOps/s $\color{#d91a1a}-4.79\%$
test_plain_set_nested_inplace 42.1100μs 14.6365μs 68.3222 KOps/s 70.5491 KOps/s $\color{#d91a1a}-3.16\%$
test_plain_set_stack_nested_inplace 39.1910μs 14.5491μs 68.7330 KOps/s 70.7554 KOps/s $\color{#d91a1a}-2.86\%$
test_items 33.5210μs 2.8721μs 348.1801 KOps/s 342.1117 KOps/s $\color{#35bf28}+1.77\%$
test_items_nested 0.4250ms 0.3678ms 2.7189 KOps/s 2.6924 KOps/s $\color{#35bf28}+0.99\%$
test_items_nested_locked 0.5635ms 0.3746ms 2.6694 KOps/s 2.6605 KOps/s $\color{#35bf28}+0.33\%$
test_items_nested_leaf 96.6620μs 60.6408μs 16.4905 KOps/s 16.5184 KOps/s $\color{#d91a1a}-0.17\%$
test_items_stack_nested 0.4210ms 0.3672ms 2.7230 KOps/s 2.6896 KOps/s $\color{#35bf28}+1.24\%$
test_items_stack_nested_leaf 93.3320μs 61.4555μs 16.2719 KOps/s 16.5485 KOps/s $\color{#d91a1a}-1.67\%$
test_items_stack_nested_locked 0.4282ms 0.3740ms 2.6741 KOps/s 2.6949 KOps/s $\color{#d91a1a}-0.77\%$
test_keys 25.9500μs 3.4380μs 290.8635 KOps/s 284.3881 KOps/s $\color{#35bf28}+2.28\%$
test_keys_nested 0.1352ms 89.8187μs 11.1335 KOps/s 11.1605 KOps/s $\color{#d91a1a}-0.24\%$
test_keys_nested_locked 0.7020ms 95.8698μs 10.4308 KOps/s 10.5041 KOps/s $\color{#d91a1a}-0.70\%$
test_keys_nested_leaf 0.1114ms 80.9040μs 12.3603 KOps/s 12.4584 KOps/s $\color{#d91a1a}-0.79\%$
test_keys_stack_nested 0.1172ms 89.2553μs 11.2038 KOps/s 11.2049 KOps/s $-0.01\%$
test_keys_stack_nested_leaf 0.1175ms 80.2939μs 12.4543 KOps/s 12.4887 KOps/s $\color{#d91a1a}-0.28\%$
test_keys_stack_nested_locked 0.1436ms 95.2787μs 10.4955 KOps/s 10.4658 KOps/s $\color{#35bf28}+0.28\%$
test_values 5.1833μs 0.8509μs 1.1752 MOps/s 1.1582 MOps/s $\color{#35bf28}+1.47\%$
test_values_nested 65.9110μs 37.7713μs 26.4751 KOps/s 26.7651 KOps/s $\color{#d91a1a}-1.08\%$
test_values_nested_locked 66.5210μs 39.9201μs 25.0500 KOps/s 25.4821 KOps/s $\color{#d91a1a}-1.70\%$
test_values_nested_leaf 86.2610μs 43.1266μs 23.1876 KOps/s 23.5245 KOps/s $\color{#d91a1a}-1.43\%$
test_values_stack_nested 71.9010μs 37.8733μs 26.4038 KOps/s 26.1705 KOps/s $\color{#35bf28}+0.89\%$
test_values_stack_nested_leaf 69.3410μs 43.0741μs 23.2158 KOps/s 23.2588 KOps/s $\color{#d91a1a}-0.19\%$
test_values_stack_nested_locked 90.7220μs 39.7394μs 25.1640 KOps/s 25.3009 KOps/s $\color{#d91a1a}-0.54\%$
test_membership 2.6790μs 0.5029μs 1.9887 MOps/s 1.9569 MOps/s $\color{#35bf28}+1.62\%$
test_membership_nested 28.9155μs 2.0122μs 496.9570 KOps/s 490.7488 KOps/s $\color{#35bf28}+1.27\%$
test_membership_nested_leaf 20.6800μs 2.0391μs 490.4227 KOps/s 491.2340 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_stacked_nested 47.1210μs 2.1146μs 472.9062 KOps/s 470.5750 KOps/s $\color{#35bf28}+0.50\%$
test_membership_stacked_nested_leaf 31.2210μs 2.0930μs 477.7903 KOps/s 467.4353 KOps/s $\color{#35bf28}+2.22\%$
test_membership_nested_last 39.7600μs 3.1255μs 319.9528 KOps/s 318.6482 KOps/s $\color{#35bf28}+0.41\%$
test_membership_nested_leaf_last 33.1200μs 3.1792μs 314.5422 KOps/s 321.1626 KOps/s $\color{#d91a1a}-2.06\%$
test_membership_stacked_nested_last 29.5010μs 3.1364μs 318.8329 KOps/s 326.2028 KOps/s $\color{#d91a1a}-2.26\%$
test_membership_stacked_nested_leaf_last 31.2410μs 3.1368μs 318.7983 KOps/s 329.0502 KOps/s $\color{#d91a1a}-3.12\%$
test_nested_getleaf 38.9800μs 6.2573μs 159.8122 KOps/s 162.5377 KOps/s $\color{#d91a1a}-1.68\%$
test_nested_get 32.5800μs 5.9320μs 168.5781 KOps/s 166.3394 KOps/s $\color{#35bf28}+1.35\%$
test_stacked_getleaf 46.3210μs 6.1229μs 163.3222 KOps/s 161.0143 KOps/s $\color{#35bf28}+1.43\%$
test_stacked_get 37.7210μs 5.7727μs 173.2294 KOps/s 172.0361 KOps/s $\color{#35bf28}+0.69\%$
test_nested_getitemleaf 38.5500μs 6.5022μs 153.7937 KOps/s 154.0278 KOps/s $\color{#d91a1a}-0.15\%$
test_nested_getitem 30.1200μs 6.0627μs 164.9436 KOps/s 163.6670 KOps/s $\color{#35bf28}+0.78\%$
test_stacked_getitemleaf 44.2110μs 6.4129μs 155.9353 KOps/s 155.1202 KOps/s $\color{#35bf28}+0.53\%$
test_stacked_getitem 34.9600μs 5.9441μs 168.2338 KOps/s 166.5162 KOps/s $\color{#35bf28}+1.03\%$
test_lock_nested 8.9669ms 0.3456ms 2.8932 KOps/s 2.9000 KOps/s $\color{#d91a1a}-0.24\%$
test_lock_stack_nested 0.4020ms 0.3397ms 2.9441 KOps/s 2.8569 KOps/s $\color{#35bf28}+3.05\%$
test_unlock_nested 0.3659ms 0.2836ms 3.5266 KOps/s 3.5408 KOps/s $\color{#d91a1a}-0.40\%$
test_unlock_stack_nested 0.3358ms 0.2793ms 3.5809 KOps/s 3.4643 KOps/s $\color{#35bf28}+3.37\%$
test_flatten_speed 0.1171ms 78.1504μs 12.7958 KOps/s 12.8010 KOps/s $\color{#d91a1a}-0.04\%$
test_unflatten_speed 0.3801ms 0.3205ms 3.1206 KOps/s 3.0819 KOps/s $\color{#35bf28}+1.25\%$
test_common_ops 0.7557ms 0.6423ms 1.5569 KOps/s 1.5454 KOps/s $\color{#35bf28}+0.75\%$
test_creation 72.1510μs 1.7644μs 566.7721 KOps/s 561.2165 KOps/s $\color{#35bf28}+0.99\%$
test_creation_empty 45.4210μs 10.4593μs 95.6086 KOps/s 105.2291 KOps/s $\textbf{\color{#d91a1a}-9.14\%}$
test_creation_nested_1 48.7410μs 12.1048μs 82.6118 KOps/s 89.1546 KOps/s $\textbf{\color{#d91a1a}-7.34\%}$
test_creation_nested_2 44.7310μs 14.8565μs 67.3105 KOps/s 72.9102 KOps/s $\textbf{\color{#d91a1a}-7.68\%}$
test_clone 53.5910μs 10.2820μs 97.2569 KOps/s 90.2163 KOps/s $\textbf{\color{#35bf28}+7.80\%}$
test_getitem[int] 1.1254ms 10.5486μs 94.7993 KOps/s 93.2072 KOps/s $\color{#35bf28}+1.71\%$
test_getitem[slice_int] 0.1144ms 20.7450μs 48.2043 KOps/s 47.3061 KOps/s $\color{#35bf28}+1.90\%$
test_getitem[range] 0.1245ms 37.4302μs 26.7164 KOps/s 25.8075 KOps/s $\color{#35bf28}+3.52\%$
test_getitem[tuple] 0.1041ms 18.0155μs 55.5079 KOps/s 54.1972 KOps/s $\color{#35bf28}+2.42\%$
test_getitem[list] 0.1274ms 32.0851μs 31.1671 KOps/s 30.2182 KOps/s $\color{#35bf28}+3.14\%$
test_setitem_dim[int] 38.7300μs 19.2630μs 51.9129 KOps/s 50.2039 KOps/s $\color{#35bf28}+3.40\%$
test_setitem_dim[slice_int] 67.6210μs 38.0608μs 26.2738 KOps/s 25.3089 KOps/s $\color{#35bf28}+3.81\%$
test_setitem_dim[range] 0.1010ms 52.8484μs 18.9221 KOps/s 18.6140 KOps/s $\color{#35bf28}+1.65\%$
test_setitem_dim[tuple] 54.3300μs 32.3033μs 30.9566 KOps/s 30.5653 KOps/s $\color{#35bf28}+1.28\%$
test_setitem 52.7510μs 16.0923μs 62.1416 KOps/s 61.6079 KOps/s $\color{#35bf28}+0.87\%$
test_set 45.8900μs 15.6797μs 63.7766 KOps/s 65.1270 KOps/s $\color{#d91a1a}-2.07\%$
test_set_shared 0.5185ms 0.1573ms 6.3587 KOps/s 6.0777 KOps/s $\color{#35bf28}+4.62\%$
test_update 0.2391ms 19.6496μs 50.8917 KOps/s 48.7960 KOps/s $\color{#35bf28}+4.29\%$
test_update_nested 59.3410μs 25.0997μs 39.8411 KOps/s 40.3644 KOps/s $\color{#d91a1a}-1.30\%$
test_update__nested 0.4478ms 24.8881μs 40.1798 KOps/s 38.4154 KOps/s $\color{#35bf28}+4.59\%$
test_set_nested 59.6510μs 17.3833μs 57.5265 KOps/s 59.2984 KOps/s $\color{#d91a1a}-2.99\%$
test_set_nested_new 62.5910μs 20.0044μs 49.9891 KOps/s 51.4042 KOps/s $\color{#d91a1a}-2.75\%$
test_select 81.9620μs 30.8203μs 32.4461 KOps/s 32.7691 KOps/s $\color{#d91a1a}-0.99\%$
test_select_nested 81.7210μs 44.2540μs 22.5968 KOps/s 22.9285 KOps/s $\color{#d91a1a}-1.45\%$
test_exclude_nested 92.3110μs 64.4825μs 15.5081 KOps/s 15.7831 KOps/s $\color{#d91a1a}-1.74\%$
test_empty[True] 0.3574ms 0.2971ms 3.3663 KOps/s 3.3998 KOps/s $\color{#d91a1a}-0.99\%$
test_empty[False] 3.8631μs 0.8244μs 1.2130 MOps/s 1.2197 MOps/s $\color{#d91a1a}-0.55\%$
test_to 86.8020μs 54.6384μs 18.3022 KOps/s 16.4763 KOps/s $\textbf{\color{#35bf28}+11.08\%}$
test_to_nonblocking 88.1810μs 46.8541μs 21.3428 KOps/s 20.2539 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_unbind_speed 0.2865ms 0.2379ms 4.2029 KOps/s 4.0886 KOps/s $\color{#35bf28}+2.80\%$
test_unbind_speed_stack0 0.3082ms 0.2349ms 4.2580 KOps/s 4.1218 KOps/s $\color{#35bf28}+3.30\%$
test_unbind_speed_stack1 92.9467ms 0.7319ms 1.3663 KOps/s 1.3443 KOps/s $\color{#35bf28}+1.63\%$
test_split 95.7852ms 1.5787ms 633.4443 Ops/s 628.3468 Ops/s $\color{#35bf28}+0.81\%$
test_chunk 94.8858ms 1.6021ms 624.1842 Ops/s 628.1301 Ops/s $\color{#d91a1a}-0.63\%$
test_consolidate[False-None] 3.2761ms 2.6995ms 370.4344 Ops/s 336.0171 Ops/s $\textbf{\color{#35bf28}+10.24\%}$
test_consolidate[default-None] 1.7847ms 1.6986ms 588.7076 Ops/s 578.4046 Ops/s $\color{#35bf28}+1.78\%$
test_consolidate[reduce-overhead-None] 1.8502ms 1.7463ms 572.6255 Ops/s 572.2877 Ops/s $\color{#35bf28}+0.06\%$
test_consolidate_njt[False-None] 6.8688ms 6.4762ms 154.4106 Ops/s 150.6467 Ops/s $\color{#35bf28}+2.50\%$
test_to[False-False-None] 1.7985ms 1.7034ms 587.0475 Ops/s 568.5955 Ops/s $\color{#35bf28}+3.25\%$
test_to[True-False-None] 1.5271ms 1.2889ms 775.8318 Ops/s 710.3645 Ops/s $\textbf{\color{#35bf28}+9.22\%}$
test_to[within-False-None] 4.3549ms 4.0974ms 244.0583 Ops/s 239.6647 Ops/s $\color{#35bf28}+1.83\%$
test_to[True-default-None] 5.4746ms 5.2467ms 190.5945 Ops/s 187.7344 Ops/s $\color{#35bf28}+1.52\%$
test_to_njt[False-False-None] 7.1078ms 6.9333ms 144.2315 Ops/s 141.4949 Ops/s $\color{#35bf28}+1.93\%$
test_to_njt[True-False-None] 5.7797ms 5.5265ms 180.9475 Ops/s 177.2419 Ops/s $\color{#35bf28}+2.09\%$
test_to_njt[within-False-None] 12.8024ms 12.1745ms 82.1391 Ops/s 80.6673 Ops/s $\color{#35bf28}+1.82\%$
test_creation[device0] 0.2980ms 79.4588μs 12.5851 KOps/s 11.7203 KOps/s $\textbf{\color{#35bf28}+7.38\%}$
test_creation_from_tensor 0.5933ms 84.8317μs 11.7880 KOps/s 11.4038 KOps/s $\color{#35bf28}+3.37\%$
test_add_one[memmap_tensor0] 0.4522ms 6.5038μs 153.7555 KOps/s 144.8975 KOps/s $\textbf{\color{#35bf28}+6.11\%}$
test_contiguous[memmap_tensor0] 2.0486μs 0.4082μs 2.4501 MOps/s 2.4276 MOps/s $\color{#35bf28}+0.93\%$
test_stack[memmap_tensor0] 40.5110μs 4.2548μs 235.0281 KOps/s 221.3827 KOps/s $\textbf{\color{#35bf28}+6.16\%}$
test_memmaptd_index 1.6438ms 0.2453ms 4.0773 KOps/s 4.0229 KOps/s $\color{#35bf28}+1.35\%$
test_memmaptd_index_astensor 0.4423ms 0.2985ms 3.3500 KOps/s 3.2215 KOps/s $\color{#35bf28}+3.99\%$
test_memmaptd_index_op 0.7687ms 0.6007ms 1.6647 KOps/s 1.6365 KOps/s $\color{#35bf28}+1.73\%$
test_serialize_model 0.1312s 0.1300s 7.6909 Ops/s 7.6353 Ops/s $\color{#35bf28}+0.73\%$
test_serialize_model_pickle 1.3476s 1.2124s 0.8248 Ops/s 0.8231 Ops/s $\color{#35bf28}+0.21\%$
test_serialize_weights 0.1311s 0.1298s 7.7048 Ops/s 7.6042 Ops/s $\color{#35bf28}+1.32\%$
test_serialize_weights_returnearly 46.6673ms 41.8307ms 23.9059 Ops/s 15.0261 Ops/s $\textbf{\color{#35bf28}+59.10\%}$
test_serialize_weights_pickle 1.3516s 1.2186s 0.8206 Ops/s 0.8142 Ops/s $\color{#35bf28}+0.79\%$
test_reshape_pytree 55.7210μs 22.3737μs 44.6954 KOps/s 41.7628 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_reshape_td 59.4610μs 26.9061μs 37.1663 KOps/s 37.1952 KOps/s $\color{#d91a1a}-0.08\%$
test_view_pytree 61.4410μs 21.9300μs 45.5996 KOps/s 45.0560 KOps/s $\color{#35bf28}+1.21\%$
test_view_td 74.9310μs 30.5133μs 32.7726 KOps/s 30.0056 KOps/s $\textbf{\color{#35bf28}+9.22\%}$
test_unbind_pytree 60.8110μs 28.2480μs 35.4007 KOps/s 34.4204 KOps/s $\color{#35bf28}+2.85\%$
test_unbind_td 0.6398ms 36.3779μs 27.4892 KOps/s 26.0310 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_split_pytree 68.2310μs 29.7933μs 33.5646 KOps/s 32.6809 KOps/s $\color{#35bf28}+2.70\%$
test_split_td 0.8072ms 37.6357μs 26.5705 KOps/s 25.3147 KOps/s $\color{#35bf28}+4.96\%$
test_add_pytree 75.8410μs 34.3008μs 29.1538 KOps/s 27.8361 KOps/s $\color{#35bf28}+4.73\%$
test_add_td 96.3910μs 50.6841μs 19.7301 KOps/s 19.3592 KOps/s $\color{#35bf28}+1.92\%$
test_compile_add_one_nested[tensordict-compile] 0.1717ms 0.1225ms 8.1620 KOps/s 7.7279 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_compile_add_one_nested[tensordict-eager] 0.2299ms 0.1343ms 7.4443 KOps/s 7.3356 KOps/s $\color{#35bf28}+1.48\%$
test_compile_add_one_nested[pytree-compile] 0.1430ms 95.0112μs 10.5251 KOps/s 10.2156 KOps/s $\color{#35bf28}+3.03\%$
test_compile_add_one_nested[pytree-eager] 0.9848ms 0.1483ms 6.7416 KOps/s 6.5651 KOps/s $\color{#35bf28}+2.69\%$
test_compile_copy_nested[tensordict-compile] 59.0610μs 23.6245μs 42.3290 KOps/s 42.5041 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_copy_nested[tensordict-eager] 0.1288ms 28.9639μs 34.5257 KOps/s 34.0193 KOps/s $\color{#35bf28}+1.49\%$
test_compile_copy_nested[pytree-compile] 0.3531ms 63.5576μs 15.7338 KOps/s 15.7275 KOps/s $\color{#35bf28}+0.04\%$
test_compile_copy_nested[pytree-eager] 92.3420μs 49.1202μs 20.3582 KOps/s 20.2906 KOps/s $\color{#35bf28}+0.33\%$
test_compile_add_one_flat[tensordict-compile] 0.1918ms 0.1419ms 7.0459 KOps/s 7.1010 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_add_one_flat[tensordict-eager] 0.3068ms 0.2188ms 4.5708 KOps/s 4.5735 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_one_flat[tensorclass-compile] 0.1533ms 98.4831μs 10.1540 KOps/s 10.2660 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_add_one_flat[tensorclass-eager] 0.1117ms 55.8289μs 17.9119 KOps/s 17.9233 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_one_flat[pytree-compile] 0.1767ms 0.1366ms 7.3217 KOps/s 6.9329 KOps/s $\textbf{\color{#35bf28}+5.61\%}$
test_compile_add_one_flat[pytree-eager] 0.5509ms 0.4813ms 2.0779 KOps/s 2.0086 KOps/s $\color{#35bf28}+3.45\%$
test_compile_add_self_flat[tensordict-eager] 0.3925ms 0.2626ms 3.8085 KOps/s 3.7874 KOps/s $\color{#35bf28}+0.56\%$
test_compile_add_self_flat[tensordict-compile] 0.1853ms 0.1433ms 6.9783 KOps/s 6.8557 KOps/s $\color{#35bf28}+1.79\%$
test_compile_add_self_flat[tensorclass-eager] 0.1540ms 68.2834μs 14.6448 KOps/s 14.2155 KOps/s $\color{#35bf28}+3.02\%$
test_compile_add_self_flat[tensorclass-compile] 0.1418ms 99.2490μs 10.0757 KOps/s 10.0320 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_self_flat[pytree-eager] 0.4773ms 0.4049ms 2.4695 KOps/s 2.4161 KOps/s $\color{#35bf28}+2.21\%$
test_compile_add_self_flat[pytree-compile] 0.1695ms 0.1355ms 7.3813 KOps/s 7.3041 KOps/s $\color{#35bf28}+1.06\%$
test_compile_copy_flat[tensordict-compile] 58.2710μs 19.0720μs 52.4328 KOps/s 53.6044 KOps/s $\color{#d91a1a}-2.19\%$
test_compile_copy_flat[tensordict-eager] 78.0110μs 31.2280μs 32.0226 KOps/s 31.8294 KOps/s $\color{#35bf28}+0.61\%$
test_compile_copy_flat[pytree-compile] 0.1141ms 69.8338μs 14.3197 KOps/s 14.3092 KOps/s $\color{#35bf28}+0.07\%$
test_compile_copy_flat[pytree-eager] 87.4310μs 52.6082μs 19.0084 KOps/s 19.0429 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_assign_and_add[tensordict-compile] 1.6020ms 0.3913ms 2.5558 KOps/s 2.2413 KOps/s $\textbf{\color{#35bf28}+14.03\%}$
test_compile_assign_and_add[tensordict-eager] 2.7790ms 2.6698ms 374.5635 Ops/s 373.4315 Ops/s $\color{#35bf28}+0.30\%$
test_compile_assign_and_add[pytree-compile] 1.5738ms 0.3796ms 2.6341 KOps/s 2.2359 KOps/s $\textbf{\color{#35bf28}+17.81\%}$
test_compile_assign_and_add[pytree-eager] 3.5151ms 2.7034ms 369.9072 Ops/s 357.2263 Ops/s $\color{#35bf28}+3.55\%$
test_compile_indexing[tensor-tensordict-compile] 0.3020ms 0.1142ms 8.7592 KOps/s 8.2137 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5558ms 80.4934μs 12.4234 KOps/s 11.9349 KOps/s $\color{#35bf28}+4.09\%$
test_compile_indexing[tensor-tensorclass-compile] 0.4888ms 0.1064ms 9.3977 KOps/s 9.3783 KOps/s $\color{#35bf28}+0.21\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1096ms 68.7992μs 14.5351 KOps/s 13.6875 KOps/s $\textbf{\color{#35bf28}+6.19\%}$
test_compile_indexing[tensor-pytree-compile] 0.1494ms 0.1077ms 9.2876 KOps/s 8.8270 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_compile_indexing[tensor-pytree-eager] 0.1701ms 68.1956μs 14.6637 KOps/s 14.3736 KOps/s $\color{#35bf28}+2.02\%$
test_compile_indexing[slice-tensordict-compile] 0.1611ms 0.1006ms 9.9388 KOps/s 9.7779 KOps/s $\color{#35bf28}+1.65\%$
test_compile_indexing[slice-tensordict-eager] 0.1561ms 20.3867μs 49.0517 KOps/s 55.5656 KOps/s $\textbf{\color{#d91a1a}-11.72\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1300ms 95.3262μs 10.4903 KOps/s 10.2276 KOps/s $\color{#35bf28}+2.57\%$
test_compile_indexing[slice-tensorclass-eager] 51.4910μs 15.7716μs 63.4051 KOps/s 62.8394 KOps/s $\color{#35bf28}+0.90\%$
test_compile_indexing[slice-pytree-compile] 0.1646ms 97.0480μs 10.3042 KOps/s 10.3278 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_indexing[slice-pytree-eager] 52.9810μs 15.9421μs 62.7269 KOps/s 63.6386 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_indexing[int-tensordict-compile] 0.1714ms 0.1015ms 9.8475 KOps/s 9.7431 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[int-tensordict-eager] 0.5889ms 17.1893μs 58.1758 KOps/s 57.0524 KOps/s $\color{#35bf28}+1.97\%$
test_compile_indexing[int-tensorclass-compile] 0.1672ms 97.7911μs 10.2259 KOps/s 10.2584 KOps/s $\color{#d91a1a}-0.32\%$
test_compile_indexing[int-tensorclass-eager] 52.6510μs 15.7932μs 63.3185 KOps/s 62.9343 KOps/s $\color{#35bf28}+0.61\%$
test_compile_indexing[int-pytree-compile] 0.1541ms 96.8774μs 10.3223 KOps/s 10.2982 KOps/s $\color{#35bf28}+0.23\%$
test_compile_indexing[int-pytree-eager] 52.1810μs 15.5632μs 64.2541 KOps/s 63.7337 KOps/s $\color{#35bf28}+0.82\%$
test_mod_add[eager] 89.0910μs 39.9671μs 25.0206 KOps/s 24.9853 KOps/s $\color{#35bf28}+0.14\%$
test_mod_add[compile] 0.3607ms 81.9668μs 12.2001 KOps/s 12.0746 KOps/s $\color{#35bf28}+1.04\%$
test_mod_add[compile-overhead] 0.3310ms 0.1682ms 5.9469 KOps/s 5.6685 KOps/s $\color{#35bf28}+4.91\%$
test_mod_wrap[eager] 0.3271ms 0.2527ms 3.9573 KOps/s 3.8663 KOps/s $\color{#35bf28}+2.35\%$
test_mod_wrap[compile] 0.3890ms 0.2871ms 3.4826 KOps/s 3.4426 KOps/s $\color{#35bf28}+1.16\%$
test_mod_wrap[compile-overhead] 7.2251ms 3.8873ms 257.2471 Ops/s 264.6801 Ops/s $\color{#d91a1a}-2.81\%$
test_mod_wrap_and_backward[eager] 1.4753ms 1.3628ms 733.7893 Ops/s 675.3752 Ops/s $\textbf{\color{#35bf28}+8.65\%}$
test_mod_wrap_and_backward[compile] 1.4080ms 1.2710ms 786.7851 Ops/s 756.3527 Ops/s $\color{#35bf28}+4.02\%$
test_mod_wrap_and_backward[compile-overhead] 1.4232ms 0.9374ms 1.0668 KOps/s 1.0385 KOps/s $\color{#35bf28}+2.73\%$
test_seq_add[eager] 0.1702ms 0.1200ms 8.3336 KOps/s 8.2993 KOps/s $\color{#35bf28}+0.41\%$
test_seq_add[compile] 0.1351ms 91.3387μs 10.9483 KOps/s 11.2114 KOps/s $\color{#d91a1a}-2.35\%$
test_seq_add[compile-overhead] 0.2447ms 0.1322ms 7.5619 KOps/s 7.5592 KOps/s $\color{#35bf28}+0.03\%$
test_seq_wrap[eager] 0.5184ms 0.4342ms 2.3031 KOps/s 2.2914 KOps/s $\color{#35bf28}+0.51\%$
test_seq_wrap[compile] 0.3985ms 0.3062ms 3.2654 KOps/s 3.2754 KOps/s $\color{#d91a1a}-0.31\%$
test_seq_wrap[compile-overhead] 0.3202ms 0.2287ms 4.3726 KOps/s 4.4008 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_runtime[False-eager] 0.8233ms 0.7359ms 1.3588 KOps/s 1.3091 KOps/s $\color{#35bf28}+3.80\%$
test_func_call_runtime[False-compile] 1.1495ms 0.7500ms 1.3333 KOps/s 1.3312 KOps/s $\color{#35bf28}+0.16\%$
test_func_call_runtime[False-compile-overhead] 0.4300ms 0.3646ms 2.7426 KOps/s 2.7431 KOps/s $\color{#d91a1a}-0.02\%$
test_func_call_runtime[True-eager] 1.0027ms 0.9042ms 1.1060 KOps/s 1.0873 KOps/s $\color{#35bf28}+1.72\%$
test_func_call_runtime[True-compile] 1.0220ms 0.7818ms 1.2791 KOps/s 1.2832 KOps/s $\color{#d91a1a}-0.33\%$
test_func_call_runtime[True-compile-overhead] 0.4600ms 0.3910ms 2.5574 KOps/s 2.5455 KOps/s $\color{#35bf28}+0.47\%$
test_func_call_cm_runtime[False-eager] 0.8797ms 0.7643ms 1.3084 KOps/s 1.3237 KOps/s $\color{#d91a1a}-1.16\%$
test_func_call_cm_runtime[False-compile] 0.8661ms 0.7816ms 1.2795 KOps/s 1.3177 KOps/s $\color{#d91a1a}-2.90\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4436ms 0.3775ms 2.6487 KOps/s 2.7324 KOps/s $\color{#d91a1a}-3.06\%$
test_func_call_cm_runtime[True-eager] 1.1580ms 1.0041ms 995.8979 Ops/s 960.3710 Ops/s $\color{#35bf28}+3.70\%$
test_func_call_cm_runtime[True-compile] 1.1528ms 1.0436ms 958.1897 Ops/s 981.4276 Ops/s $\color{#d91a1a}-2.37\%$
test_func_call_cm_runtime[True-compile-overhead] 1.2321ms 1.0285ms 972.3066 Ops/s 965.1675 Ops/s $\color{#35bf28}+0.74\%$
test_vmap_func_call_cm_runtime[eager] 2.5443ms 2.1239ms 470.8310 Ops/s 466.0175 Ops/s $\color{#35bf28}+1.03\%$
test_vmap_func_call_cm_runtime[compile] 0.9374ms 0.8319ms 1.2020 KOps/s 1.2033 KOps/s $\color{#d91a1a}-0.10\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4809ms 0.4181ms 2.3919 KOps/s 2.3594 KOps/s $\color{#35bf28}+1.38\%$
test_distributed 2.7628ms 0.1261ms 7.9285 KOps/s 8.5477 KOps/s $\textbf{\color{#d91a1a}-7.24\%}$
test_tdmodule 63.9410μs 22.3580μs 44.7267 KOps/s 46.5545 KOps/s $\color{#d91a1a}-3.93\%$
test_tdmodule_dispatch 62.3410μs 40.3923μs 24.7572 KOps/s 25.9851 KOps/s $\color{#d91a1a}-4.73\%$
test_tdseq 49.1000μs 23.0061μs 43.4667 KOps/s 46.5943 KOps/s $\textbf{\color{#d91a1a}-6.71\%}$
test_tdseq_dispatch 95.1010μs 44.1613μs 22.6442 KOps/s 25.1827 KOps/s $\textbf{\color{#d91a1a}-10.08\%}$
test_instantiation_functorch 2.2260ms 1.5605ms 640.8004 Ops/s 641.9224 Ops/s $\color{#d91a1a}-0.17\%$
test_exec_functorch 0.1867ms 0.1401ms 7.1357 KOps/s 6.9230 KOps/s $\color{#35bf28}+3.07\%$
test_exec_functional_call 0.2506ms 0.1339ms 7.4680 KOps/s 7.1000 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_exec_td_decorator 0.3698ms 0.1858ms 5.3832 KOps/s 5.1882 KOps/s $\color{#35bf28}+3.76\%$
test_vmap_mlp_speed_decorator[True-True] 0.8015ms 0.6927ms 1.4436 KOps/s 1.3819 KOps/s $\color{#35bf28}+4.47\%$
test_vmap_mlp_speed_decorator[True-False] 0.8061ms 0.6978ms 1.4330 KOps/s 1.3732 KOps/s $\color{#35bf28}+4.36\%$
test_vmap_mlp_speed_decorator[False-True] 0.7291ms 0.6078ms 1.6452 KOps/s 1.5833 KOps/s $\color{#35bf28}+3.91\%$
test_vmap_mlp_speed_decorator[False-False] 0.7213ms 0.6059ms 1.6504 KOps/s 1.5823 KOps/s $\color{#35bf28}+4.31\%$
test_vmap_transformer_speed_decorator[True-True] 20.4523ms 19.6633ms 50.8562 Ops/s 50.8737 Ops/s $\color{#d91a1a}-0.03\%$
test_vmap_transformer_speed_decorator[True-False] 19.7450ms 19.4533ms 51.4052 Ops/s 51.4993 Ops/s $\color{#d91a1a}-0.18\%$
test_vmap_transformer_speed_decorator[False-True] 19.6009ms 19.2234ms 52.0199 Ops/s 51.6745 Ops/s $\color{#35bf28}+0.67\%$
test_vmap_transformer_speed_decorator[False-False] 19.6924ms 19.3039ms 51.8030 Ops/s 51.9730 Ops/s $\color{#d91a1a}-0.33\%$
test_to_module_speed[True] 1.4573ms 0.9722ms 1.0286 KOps/s 1.0389 KOps/s $\color{#d91a1a}-0.99\%$
test_to_module_speed[False] 1.0592ms 0.9584ms 1.0434 KOps/s 1.0710 KOps/s $\color{#d91a1a}-2.58\%$
test_tc_init 81.5910μs 38.0381μs 26.2895 KOps/s 27.9877 KOps/s $\textbf{\color{#d91a1a}-6.07\%}$
test_tc_init_nested 0.1376ms 78.5294μs 12.7341 KOps/s 13.8870 KOps/s $\textbf{\color{#d91a1a}-8.30\%}$
test_tc_first_layer_tensor 25.4010μs 0.8015μs 1.2476 MOps/s 1.4627 MOps/s $\textbf{\color{#d91a1a}-14.71\%}$
test_tc_first_layer_nontensor 25.7810μs 2.2142μs 451.6395 KOps/s 451.7166 KOps/s $\color{#d91a1a}-0.02\%$
test_tc_second_layer_tensor 9.8550μs 1.4444μs 692.3257 KOps/s 719.4011 KOps/s $\color{#d91a1a}-3.76\%$
test_tc_second_layer_nontensor 33.2210μs 2.9903μs 334.4190 KOps/s 343.0677 KOps/s $\color{#d91a1a}-2.52\%$
test_unbind 7.3191ms 7.0823ms 141.1964 Ops/s 142.0276 Ops/s $\color{#d91a1a}-0.59\%$
test_full_like 13.2523ms 9.2852ms 107.6986 Ops/s 106.9973 Ops/s $\color{#35bf28}+0.66\%$
test_zeros_like 6.0588ms 4.2828ms 233.4906 Ops/s 230.7746 Ops/s $\color{#35bf28}+1.18\%$
test_ones_like 4.4555ms 4.3340ms 230.7330 Ops/s 230.9418 Ops/s $\color{#d91a1a}-0.09\%$
test_clone 11.7985ms 9.2298ms 108.3449 Ops/s 155.4602 Ops/s $\textbf{\color{#d91a1a}-30.31\%}$
test_squeeze 46.9510μs 10.0025μs 99.9749 KOps/s 101.2956 KOps/s $\color{#d91a1a}-1.30\%$
test_unsqueeze 0.1230ms 77.6865μs 12.8723 KOps/s 12.6843 KOps/s $\color{#35bf28}+1.48\%$
test_split 0.2111s 0.2213ms 4.5182 KOps/s 5.8858 KOps/s $\textbf{\color{#d91a1a}-23.24\%}$
test_permute 0.3020ms 0.1868ms 5.3532 KOps/s 5.3432 KOps/s $\color{#35bf28}+0.19\%$
test_stack 51.7996ms 50.2804ms 19.8885 Ops/s 19.2038 Ops/s $\color{#35bf28}+3.57\%$
test_cat 51.6839ms 50.5569ms 19.7797 Ops/s 19.6903 Ops/s $\color{#35bf28}+0.45\%$

  else:
-     keys: set[str] = set(keys)
+     keys_set: set[str] = set(keys)
Contributor

Out of curiosity, is using a set much more efficient than always using the other option, or is there another reason?
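
For context on this question, here is a minimal stdlib-only sketch of the trade-off. It is not the PR's code, and `ordered_common_keys` is a hypothetical helper name. A set gives average O(1) membership checks, but the iteration order of a set of strings is not guaranteed and can change between processes because string hashing is randomized unless `PYTHONHASHSEED` is fixed. One way to keep both properties is to use the set only for membership and take the order from the first mapping:

```python
from __future__ import annotations

from typing import Mapping, Sequence


def ordered_common_keys(dicts: Sequence[Mapping[str, object]]) -> list[str]:
    """Return the keys shared by all mappings, in the order of the first mapping.

    Hypothetical helper for illustration only.
    """
    if not dicts:
        return []
    # The set is used for fast membership tests, never for iteration order.
    common = set(dicts[0])
    for d in dicts[1:]:
        common &= set(d)
    # dict preserves insertion order, so the result is deterministic.
    return [k for k in dicts[0] if k in common]


a = {"foo": 0, "bar": 1, "baz": 2}
b = {"baz": 5, "bar": 3, "foo": 4}
print(ordered_common_keys([a, b]))  # ['foo', 'bar', 'baz']
```

The result follows the first mapping's insertion order regardless of the hash seed, which is the property a repeated-stack test would rely on.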

@@ -626,7 +626,6 @@ def stack_fn(key, values, is_not_init, is_tensor):
key: stack_fn(key, values, is_not_init, is_tensor)
for key, (values, is_not_init, is_tensor) in out.items()
}

Contributor

Not sure if this added space is on purpose.

          raise KeyError(
              f"got keys {keys} and {set(td.keys())} which are incompatible"
          )
- return keys
+ if strict:
+     return keys
Contributor

Suggested change
- return keys
+ return list(keys)

Pretty sure that's what you meant in your comment, but just to be on the safe side: right now, the return type is not consistent with the type annotation.
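
To illustrate the point about a consistent return type, here is a hedged sketch in which both the strict and the non-strict path return a `list[str]`. The function name `check_keys`, its signature, and its error message are assumptions made for this example and do not reproduce the library's `_check_keys`:

```python
from __future__ import annotations

from typing import Mapping, Sequence


def check_keys(dicts: Sequence[Mapping[str, object]], strict: bool = False) -> list[str]:
    """Illustrative stand-in: always returns a list, whatever `strict` is."""
    if not dicts:
        return []
    keys = list(dicts[0])   # ordered reference keys
    keys_set = set(keys)    # used for comparison and membership only
    for d in dicts[1:]:
        other = set(d)
        if strict:
            if other != keys_set:
                raise KeyError(f"got keys {keys_set} and {other} which are incompatible")
        else:
            keys_set &= other
    # Same return type on every path, ordered like the first mapping.
    return [k for k in keys if k in keys_set]
```

With a single return statement, the annotation and the runtime type cannot drift apart between branches.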

- return keys
+ if strict:
+     return keys
+ return keys_set
Contributor

Out of curiosity, which downstream functions would be impacted by this? In other words, in which contexts is _check_keys(strict=False) used?

tc1 = MyTensorClass(foo=torch.zeros((1,)), bar=torch.ones((1,)))

for _ in range(10000):
assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [
Contributor

Suggested change
- assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [
+ assert list(torch.stack([tc1, tc1], dim=0).keys()) == [

Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

4 participants