[BugFix] Fix non-deterministic key order in stack #1230
base: gh/vmoens/48/base
ghstack-source-id: 7f394789b783d6359a78a300aaf449eb25adb5e3
Pull Request resolved: #1230
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 50.5640μs | 20.0925μs | 49.7697 KOps/s | 49.5689 KOps/s | |
test_plain_set_stack_nested | 45.7450μs | 20.0545μs | 49.8641 KOps/s | 49.2224 KOps/s | |
test_plain_set_nested_inplace | 54.9320μs | 21.5403μs | 46.4247 KOps/s | 45.2589 KOps/s | |
test_plain_set_stack_nested_inplace | 54.5110μs | 21.6148μs | 46.2646 KOps/s | 44.8616 KOps/s | |
test_items | 20.8280μs | 4.1462μs | 241.1856 KOps/s | 242.2765 KOps/s | |
test_items_nested | 0.7584ms | 0.4077ms | 2.4526 KOps/s | 2.4268 KOps/s | |
test_items_nested_locked | 0.4965ms | 0.4056ms | 2.4652 KOps/s | 2.4229 KOps/s | |
test_items_nested_leaf | 0.1441ms | 76.4252μs | 13.0847 KOps/s | 13.0296 KOps/s | |
test_items_stack_nested | 0.7546ms | 0.4098ms | 2.4404 KOps/s | 2.4065 KOps/s | |
test_items_stack_nested_leaf | 0.1496ms | 76.3706μs | 13.0940 KOps/s | 12.4926 KOps/s | |
test_items_stack_nested_locked | 0.7752ms | 0.4130ms | 2.4214 KOps/s | 2.4216 KOps/s | |
test_keys | 23.2640μs | 3.4641μs | 288.6739 KOps/s | 289.2372 KOps/s | |
test_keys_nested | 0.2913ms | 0.1629ms | 6.1402 KOps/s | 6.1117 KOps/s | |
test_keys_nested_locked | 1.9566ms | 0.1694ms | 5.9030 KOps/s | 5.8818 KOps/s | |
test_keys_nested_leaf | 0.2214ms | 0.1417ms | 7.0567 KOps/s | 7.0130 KOps/s | |
test_keys_stack_nested | 0.2915ms | 0.1622ms | 6.1640 KOps/s | 6.1385 KOps/s | |
test_keys_stack_nested_leaf | 0.2114ms | 0.1418ms | 7.0499 KOps/s | 7.1402 KOps/s | |
test_keys_stack_nested_locked | 0.2965ms | 0.1693ms | 5.9061 KOps/s | 5.9589 KOps/s | |
test_values | 4.7186μs | 1.0261μs | 974.5827 KOps/s | 944.1908 KOps/s | |
test_values_nested | 0.1164ms | 62.7862μs | 15.9271 KOps/s | 16.2162 KOps/s | |
test_values_nested_locked | 0.1139ms | 62.0827μs | 16.1076 KOps/s | 16.1797 KOps/s | |
test_values_nested_leaf | 0.1327ms | 70.8202μs | 14.1203 KOps/s | 14.1840 KOps/s | |
test_values_stack_nested | 0.1220ms | 62.5424μs | 15.9891 KOps/s | 15.9563 KOps/s | |
test_values_stack_nested_leaf | 0.1353ms | 70.6468μs | 14.1549 KOps/s | 13.8631 KOps/s | |
test_values_stack_nested_locked | 0.1346ms | 62.7205μs | 15.9438 KOps/s | 15.8213 KOps/s | |
test_membership | 7.3098μs | 0.7162μs | 1.3962 MOps/s | 1.4393 MOps/s | |
test_membership_nested | 20.3280μs | 2.8643μs | 349.1285 KOps/s | 335.7401 KOps/s | |
test_membership_nested_leaf | 23.2040μs | 2.8934μs | 345.6096 KOps/s | 331.4669 KOps/s | |
test_membership_stacked_nested | 20.4680μs | 2.9044μs | 344.3051 KOps/s | 334.4255 KOps/s | |
test_membership_stacked_nested_leaf | 31.7990μs | 2.8725μs | 348.1283 KOps/s | 331.3921 KOps/s | |
test_membership_nested_last | 31.0180μs | 4.3341μs | 230.7261 KOps/s | 227.3015 KOps/s | |
test_membership_nested_leaf_last | 29.3750μs | 4.4380μs | 225.3243 KOps/s | 222.3284 KOps/s | |
test_membership_stacked_nested_last | 51.2350μs | 4.3678μs | 228.9479 KOps/s | 225.7058 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.0650μs | 4.3688μs | 228.8941 KOps/s | 224.5410 KOps/s | |
test_nested_getleaf | 42.7790μs | 10.3986μs | 96.1666 KOps/s | 93.9256 KOps/s | |
test_nested_get | 29.9550μs | 9.8456μs | 101.5687 KOps/s | 98.9290 KOps/s | |
test_stacked_getleaf | 29.9950μs | 10.5160μs | 95.0936 KOps/s | 92.7077 KOps/s | |
test_stacked_get | 34.3440μs | 10.1883μs | 98.1515 KOps/s | 98.9119 KOps/s | |
test_nested_getitemleaf | 36.8490μs | 11.1755μs | 89.4818 KOps/s | 87.7022 KOps/s | |
test_nested_getitem | 34.4240μs | 10.5735μs | 94.5759 KOps/s | 93.1571 KOps/s | |
test_stacked_getitemleaf | 44.5530μs | 11.3200μs | 88.3392 KOps/s | 88.9028 KOps/s | |
test_stacked_getitem | 32.5010μs | 10.5897μs | 94.4318 KOps/s | 92.0642 KOps/s | |
test_lock_nested | 0.6591ms | 0.4082ms | 2.4497 KOps/s | 2.4337 KOps/s | |
test_lock_stack_nested | 0.6763ms | 0.4194ms | 2.3843 KOps/s | 2.3820 KOps/s | |
test_unlock_nested | 0.5369ms | 0.3319ms | 3.0132 KOps/s | 2.9532 KOps/s | |
test_unlock_stack_nested | 0.5289ms | 0.3386ms | 2.9531 KOps/s | 2.9409 KOps/s | |
test_flatten_speed | 0.2051ms | 99.7813μs | 10.0219 KOps/s | 10.0681 KOps/s | |
test_unflatten_speed | 0.6448ms | 0.5143ms | 1.9444 KOps/s | 1.9447 KOps/s | |
test_common_ops | 4.1030ms | 0.7860ms | 1.2722 KOps/s | 1.2491 KOps/s | |
test_creation | 20.4680μs | 2.5234μs | 396.2863 KOps/s | 408.8910 KOps/s | |
test_creation_empty | 36.3480μs | 11.1764μs | 89.4743 KOps/s | 90.6145 KOps/s | |
test_creation_nested_1 | 34.7950μs | 14.0074μs | 71.3911 KOps/s | 72.1796 KOps/s | |
test_creation_nested_2 | 50.4840μs | 18.4126μs | 54.3106 KOps/s | 54.7872 KOps/s | |
test_clone | 73.9680μs | 13.6713μs | 73.1459 KOps/s | 71.8208 KOps/s | |
test_getitem[int] | 1.0845ms | 12.6996μs | 78.7429 KOps/s | 78.6148 KOps/s | |
test_getitem[slice_int] | 0.1288ms | 23.9138μs | 41.8168 KOps/s | 40.6290 KOps/s | |
test_getitem[range] | 0.1637ms | 49.4003μs | 20.2428 KOps/s | 20.1528 KOps/s | |
test_getitem[tuple] | 0.1230ms | 20.0088μs | 49.9779 KOps/s | 49.4755 KOps/s | |
test_getitem[list] | 0.1913ms | 45.4505μs | 22.0020 KOps/s | 21.9756 KOps/s | |
test_setitem_dim[int] | 53.3200μs | 25.0534μs | 39.9148 KOps/s | 38.2331 KOps/s | |
test_setitem_dim[slice_int] | 0.1067ms | 51.2792μs | 19.5011 KOps/s | 19.1575 KOps/s | |
test_setitem_dim[range] | 0.1251ms | 77.5564μs | 12.8938 KOps/s | 12.8566 KOps/s | |
test_setitem_dim[tuple] | 70.2900μs | 39.7106μs | 25.1822 KOps/s | 24.2307 KOps/s | |
test_setitem | 77.5440μs | 20.0409μs | 49.8979 KOps/s | 48.8040 KOps/s | |
test_set | 0.2014ms | 19.4946μs | 51.2962 KOps/s | 50.2512 KOps/s | |
test_set_shared | 3.3504ms | 0.1809ms | 5.5277 KOps/s | 5.4261 KOps/s | |
test_update | 0.1211ms | 22.5269μs | 44.3915 KOps/s | 44.8796 KOps/s | |
test_update_nested | 96.7100μs | 32.2025μs | 31.0534 KOps/s | 29.6571 KOps/s | |
test_update__nested | 0.3872ms | 33.4913μs | 29.8585 KOps/s | 29.4236 KOps/s | |
test_set_nested | 79.8780μs | 21.5691μs | 46.3625 KOps/s | 45.2932 KOps/s | |
test_set_nested_new | 83.5150μs | 25.9756μs | 38.4976 KOps/s | 37.2038 KOps/s | |
test_select | 0.1063ms | 42.3990μs | 23.5854 KOps/s | 23.6249 KOps/s | |
test_select_nested | 0.1349ms | 62.6949μs | 15.9503 KOps/s | 15.8965 KOps/s | |
test_exclude_nested | 0.1653ms | 79.5516μs | 12.5705 KOps/s | 12.2760 KOps/s | |
test_empty[True] | 0.6136ms | 0.4020ms | 2.4877 KOps/s | 2.4720 KOps/s | |
test_empty[False] | 8.1878μs | 1.3641μs | 733.0891 KOps/s | 716.2079 KOps/s | |
test_unbind_speed | 0.3545ms | 0.2666ms | 3.7512 KOps/s | 3.6597 KOps/s | |
test_unbind_speed_stack0 | 0.3183ms | 0.2653ms | 3.7686 KOps/s | 3.7366 KOps/s | |
test_unbind_speed_stack1 | 0.1027s | 0.7309ms | 1.3682 KOps/s | 1.3903 KOps/s | |
test_split | 96.8259ms | 1.7293ms | 578.2627 Ops/s | 539.6095 Ops/s | |
test_chunk | 94.1865ms | 1.7276ms | 578.8473 Ops/s | 637.4578 Ops/s | |
test_consolidate_njt[False-None] | 8.6641ms | 8.3190ms | 120.2063 Ops/s | 110.5560 Ops/s | |
test_creation[device0] | 0.2666ms | 92.1606μs | 10.8506 KOps/s | 10.7026 KOps/s | |
test_creation_from_tensor | 3.7649ms | 95.5846μs | 10.4619 KOps/s | 10.2886 KOps/s | |
test_add_one[memmap_tensor0] | 0.1266ms | 5.1077μs | 195.7816 KOps/s | 185.7611 KOps/s | |
test_contiguous[memmap_tensor0] | 15.7790μs | 0.5100μs | 1.9606 MOps/s | 1.9694 MOps/s | |
test_stack[memmap_tensor0] | 41.9580μs | 3.5252μs | 283.6698 KOps/s | 277.6067 KOps/s | |
test_memmaptd_index | 0.3004ms | 0.2315ms | 4.3205 KOps/s | 4.2787 KOps/s | |
test_memmaptd_index_astensor | 1.1184ms | 0.3185ms | 3.1397 KOps/s | 3.1488 KOps/s | |
test_memmaptd_index_op | 1.1183ms | 0.5758ms | 1.7366 KOps/s | 1.6841 KOps/s | |
test_serialize_model | 0.2228s | 0.1289s | 7.7556 Ops/s | 8.6258 Ops/s | |
test_serialize_model_pickle | 0.4577s | 0.3893s | 2.5689 Ops/s | 2.5563 Ops/s | |
test_serialize_weights | 0.1212s | 0.1157s | 8.6423 Ops/s | 8.9384 Ops/s | |
test_serialize_weights_returnearly | 0.1772s | 0.1600s | 6.2513 Ops/s | 6.3346 Ops/s | |
test_serialize_weights_pickle | 0.4671s | 0.4072s | 2.4557 Ops/s | 2.4569 Ops/s | |
test_serialize_weights_filesystem | 0.1511s | 0.1411s | 7.0891 Ops/s | 7.2141 Ops/s | |
test_serialize_model_filesystem | 0.2512s | 0.1609s | 6.2161 Ops/s | 6.5841 Ops/s | |
test_reshape_pytree | 57.4970μs | 26.4788μs | 37.7661 KOps/s | 38.0484 KOps/s | |
test_reshape_td | 71.6330μs | 33.2770μs | 30.0508 KOps/s | 30.9803 KOps/s | |
test_view_pytree | 73.5870μs | 26.3074μs | 38.0122 KOps/s | 38.3970 KOps/s | |
test_view_td | 0.1005ms | 39.7074μs | 25.1842 KOps/s | 25.0184 KOps/s | |
test_unbind_pytree | 69.1680μs | 29.6631μs | 33.7119 KOps/s | 33.8626 KOps/s | |
test_unbind_td | 0.3544ms | 39.7603μs | 25.1507 KOps/s | 24.4127 KOps/s | |
test_split_pytree | 58.3090μs | 29.0730μs | 34.3962 KOps/s | 34.3660 KOps/s | |
test_split_td | 0.5334ms | 44.6635μs | 22.3896 KOps/s | 22.4785 KOps/s | |
test_add_pytree | 73.8480μs | 35.6472μs | 28.0527 KOps/s | 27.4963 KOps/s | |
test_add_td | 0.1030ms | 56.9646μs | 17.5548 KOps/s | 17.9959 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1527ms | 68.3119μs | 14.6387 KOps/s | 14.7099 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2900ms | 0.1692ms | 5.9116 KOps/s | 5.8149 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1345ms | 45.8766μs | 21.7976 KOps/s | 21.2851 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2349ms | 0.1185ms | 8.4401 KOps/s | 8.2548 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 67.3660μs | 27.8959μs | 35.8476 KOps/s | 34.0074 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1133ms | 59.1498μs | 16.9062 KOps/s | 17.0681 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1955ms | 78.4037μs | 12.7545 KOps/s | 12.5262 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1414ms | 65.7827μs | 15.2016 KOps/s | 14.9458 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1927ms | 0.1097ms | 9.1134 KOps/s | 9.1113 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3237ms | 0.2146ms | 4.6593 KOps/s | 4.6641 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 99.4950μs | 48.5853μs | 20.5824 KOps/s | 21.1996 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1428ms | 67.0000μs | 14.9254 KOps/s | 14.6614 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1749ms | 0.1020ms | 9.8008 KOps/s | 9.8859 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3438ms | 0.2007ms | 4.9816 KOps/s | 4.8463 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3430ms | 0.2283ms | 4.3797 KOps/s | 4.3612 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2364ms | 0.1103ms | 9.0661 KOps/s | 9.1451 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.4927ms | 63.3642μs | 15.7818 KOps/s | 16.0179 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1053ms | 51.2503μs | 19.5121 KOps/s | 20.3721 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2790ms | 0.1570ms | 6.3711 KOps/s | 6.1839 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1710ms | 0.1023ms | 9.7739 KOps/s | 9.6994 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 53.0380μs | 21.7783μs | 45.9172 KOps/s | 46.9630 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1326ms | 69.8782μs | 14.3106 KOps/s | 15.0708 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1743ms | 80.8193μs | 12.3733 KOps/s | 12.4078 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1396ms | 66.8934μs | 14.9491 KOps/s | 15.0077 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3158ms | 0.2160ms | 4.6302 KOps/s | 4.5990 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.5460ms | 1.3636ms | 733.3668 Ops/s | 712.4200 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2929ms | 0.2127ms | 4.7010 KOps/s | 4.6878 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0281ms | 0.8341ms | 1.1989 KOps/s | 1.1686 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8256ms | 0.4602ms | 2.1729 KOps/s | 2.1490 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.9075ms | 2.7143ms | 368.4126 Ops/s | 358.9632 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 84.9780μs | 38.8282μs | 25.7545 KOps/s | 26.1942 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6854ms | 32.4282μs | 30.8374 KOps/s | 29.8526 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 88.8250μs | 30.9863μs | 32.2723 KOps/s | 33.0588 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 73.0960μs | 22.9305μs | 43.6101 KOps/s | 43.1858 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 85.3090μs | 31.2143μs | 32.0366 KOps/s | 32.0131 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 84.8880μs | 22.9848μs | 43.5070 KOps/s | 43.1868 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1078ms | 52.8895μs | 18.9073 KOps/s | 18.7620 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3060ms | 19.5923μs | 51.0405 KOps/s | 49.1046 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1615ms | 47.1177μs | 21.2235 KOps/s | 21.3950 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 74.2480μs | 18.6917μs | 53.4996 KOps/s | 54.1682 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1262ms | 46.5580μs | 21.4786 KOps/s | 21.0729 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 66.1130μs | 18.6723μs | 53.5553 KOps/s | 53.7130 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1126ms | 54.1931μs | 18.4525 KOps/s | 18.0790 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9634ms | 19.4792μs | 51.3369 KOps/s | 49.8769 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1104ms | 46.2707μs | 21.6119 KOps/s | 21.1788 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2242ms | 18.8348μs | 53.0933 KOps/s | 54.6501 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1144ms | 46.2437μs | 21.6246 KOps/s | 21.0545 KOps/s | |
test_compile_indexing[int-pytree-eager] | 65.9830μs | 18.7044μs | 53.4633 KOps/s | 54.7544 KOps/s | |
test_mod_add[eager] | 0.1204ms | 35.0507μs | 28.5301 KOps/s | 29.1686 KOps/s | |
test_mod_add[compile] | 0.1326ms | 64.7804μs | 15.4368 KOps/s | 15.2008 KOps/s | |
test_mod_add[compile-overhead] | 0.1294ms | 64.9284μs | 15.4016 KOps/s | 14.8384 KOps/s | |
test_mod_wrap[eager] | 0.3898ms | 0.2277ms | 4.3915 KOps/s | 4.3607 KOps/s | |
test_mod_wrap[compile] | 1.2908ms | 0.2291ms | 4.3642 KOps/s | 3.5848 KOps/s | |
test_mod_wrap[compile-overhead] | 1.3510ms | 0.2272ms | 4.4014 KOps/s | 4.3354 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.1833ms | 10.7587ms | 92.9484 Ops/s | 77.9155 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.8025ms | 11.3765ms | 87.9007 Ops/s | 86.5198 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.5487ms | 10.8182ms | 92.4368 Ops/s | 86.8288 Ops/s | |
test_seq_add[eager] | 0.1956ms | 0.1179ms | 8.4787 KOps/s | 8.3750 KOps/s | |
test_seq_add[compile] | 0.1755ms | 75.7242μs | 13.2058 KOps/s | 12.7631 KOps/s | |
test_seq_add[compile-overhead] | 0.1745ms | 74.4694μs | 13.4283 KOps/s | 13.1445 KOps/s | |
test_seq_wrap[eager] | 3.5348ms | 0.4578ms | 2.1845 KOps/s | 2.1605 KOps/s | |
test_seq_wrap[compile] | 0.3870ms | 0.2385ms | 4.1921 KOps/s | 4.0709 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3170ms | 0.2377ms | 4.2072 KOps/s | 4.1174 KOps/s | |
test_func_call_runtime[False-eager] | 0.9774ms | 0.5534ms | 1.8071 KOps/s | 1.7810 KOps/s | |
test_func_call_runtime[False-compile] | 0.5817ms | 0.4397ms | 2.2741 KOps/s | 2.2660 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6729ms | 0.4396ms | 2.2750 KOps/s | 2.2349 KOps/s | |
test_func_call_runtime[True-eager] | 1.8347ms | 0.7688ms | 1.3007 KOps/s | 1.3000 KOps/s | |
test_func_call_runtime[True-compile] | 0.5368ms | 0.4597ms | 2.1754 KOps/s | 2.1697 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5398ms | 0.4599ms | 2.1742 KOps/s | 2.1486 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7193ms | 0.5514ms | 1.8134 KOps/s | 1.8306 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6185ms | 0.4416ms | 2.2644 KOps/s | 2.2600 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5456ms | 0.4402ms | 2.2719 KOps/s | 2.2559 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1192ms | 0.9084ms | 1.1008 KOps/s | 1.0947 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9128ms | 0.8051ms | 1.2421 KOps/s | 1.2232 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8967ms | 0.8028ms | 1.2456 KOps/s | 1.2151 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4985ms | 1.9024ms | 525.6410 Ops/s | 515.3538 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9085ms | 0.5286ms | 1.8917 KOps/s | 1.8459 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8739ms | 0.5283ms | 1.8927 KOps/s | 1.8473 KOps/s | |
test_distributed | 0.2699ms | 0.1242ms | 8.0535 KOps/s | 7.8641 KOps/s | |
test_tdmodule | 43.2710μs | 26.4252μs | 37.8426 KOps/s | 36.4959 KOps/s | |
test_tdmodule_dispatch | 74.1090μs | 49.0706μs | 20.3788 KOps/s | 20.0747 KOps/s | |
test_tdseq | 63.1880μs | 29.3815μs | 34.0351 KOps/s | 33.9213 KOps/s | |
test_tdseq_dispatch | 71.0820μs | 53.1177μs | 18.8261 KOps/s | 18.2945 KOps/s | |
test_instantiation_functorch | 1.7762ms | 1.5410ms | 648.9318 Ops/s | 644.4957 Ops/s | |
test_exec_functorch | 0.4446ms | 0.1819ms | 5.4976 KOps/s | 5.5480 KOps/s | |
test_exec_functional_call | 0.2395ms | 0.1745ms | 5.7309 KOps/s | 5.8931 KOps/s | |
test_exec_td_decorator | 0.4719ms | 0.2358ms | 4.2415 KOps/s | 4.2715 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0145ms | 0.6690ms | 1.4948 KOps/s | 1.4987 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8576ms | 0.6629ms | 1.5085 KOps/s | 1.3673 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8055ms | 0.5406ms | 1.8499 KOps/s | 1.8527 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7769ms | 0.5405ms | 1.8500 KOps/s | 1.8568 KOps/s | |
test_to_module_speed[True] | 1.7863ms | 1.3205ms | 757.3101 Ops/s | 752.0651 Ops/s | |
test_to_module_speed[False] | 2.1728ms | 1.3002ms | 769.0910 Ops/s | 770.5427 Ops/s | |
test_tc_init | 0.1037ms | 45.0803μs | 22.1826 KOps/s | 22.0758 KOps/s | |
test_tc_init_nested | 0.1531ms | 90.3156μs | 11.0723 KOps/s | 10.9704 KOps/s | |
test_tc_first_layer_tensor | 36.3280μs | 1.5456μs | 647.0066 KOps/s | 636.9337 KOps/s | |
test_tc_first_layer_nontensor | 24.8760μs | 4.7306μs | 211.3913 KOps/s | 211.0261 KOps/s | |
test_tc_second_layer_tensor | 28.5030μs | 2.8407μs | 352.0248 KOps/s | 349.3585 KOps/s | |
test_tc_second_layer_nontensor | 44.9730μs | 6.0429μs | 165.4843 KOps/s | 165.6710 KOps/s | |
test_unbind | 0.2348s | 12.9810ms | 77.0359 Ops/s | 68.7478 Ops/s | |
test_full_like | 8.8810ms | 7.1752ms | 139.3687 Ops/s | 137.5501 Ops/s | |
test_zeros_like | 4.6198ms | 2.7202ms | 367.6171 Ops/s | 363.9363 Ops/s | |
test_ones_like | 3.9289ms | 3.2757ms | 305.2826 Ops/s | 307.3360 Ops/s | |
test_clone | 5.1932ms | 4.9158ms | 203.4243 Ops/s | 202.1834 Ops/s | |
test_squeeze | 73.1560μs | 12.7443μs | 78.4664 KOps/s | 79.3490 KOps/s | |
test_unsqueeze | 0.1473ms | 94.1755μs | 10.6185 KOps/s | 10.7362 KOps/s | |
test_split | 0.4407ms | 0.1980ms | 5.0493 KOps/s | 5.0507 KOps/s | |
test_permute | 0.3486ms | 0.1985ms | 5.0388 KOps/s | 5.0286 KOps/s | |
test_stack | 29.7805ms | 24.8847ms | 40.1853 Ops/s | 39.6228 Ops/s | |
test_cat | 27.1423ms | 24.9931ms | 40.0111 Ops/s | 39.9189 Ops/s |
raise KeyError(
f"got keys {keys} and {set(td.keys())} which are incompatible"
)
return keys
if strict:
return keys
We should actually make it a list
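A minimal sketch of what returning a list could look like in the strict case, using plain dicts in place of tensordicts. The helper name `check_keys_strict` is hypothetical and this is not the library's `_check_keys`; it only illustrates keeping the first input's key order while still comparing key sets in an order-independent way.

```python
from typing import Mapping, Sequence


def check_keys_strict(mappings: Sequence[Mapping[str, object]]) -> list[str]:
    """Return the shared keys as a list, in the first mapping's insertion order.

    Hypothetical helper for illustration only; not tensordict's _check_keys.
    """
    if not mappings:
        return []
    keys = list(mappings[0].keys())  # dicts preserve insertion order
    keys_set = set(keys)  # used only for the order-independent comparison
    for m in mappings[1:]:
        if set(m.keys()) != keys_set:
            raise KeyError(
                f"got keys {keys} and {list(m.keys())} which are incompatible"
            )
    return keys


a = {"obs": 1, "action": 2, "reward": 3}
b = {"reward": 30, "obs": 10, "action": 20}
ordered = check_keys_strict([a, b])
print(ordered)  # ['obs', 'action', 'reward']
```

Because the result is taken from the first mapping rather than from a set, the order is the same on every run.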
return keys
if strict:
return keys
return keys_set
If keys can be exclusive, their order becomes arbitrary
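To illustrate the point with plain dicts: when some keys are missing from some inputs, the common keys are naturally computed as a set intersection, and the iteration order of a set of strings can differ between interpreter runs because string hashing is randomized by default. One possible way to get a stable order, assuming it is acceptable to anchor on the first input, is sketched below. The helper `common_keys_in_order` is hypothetical and is not part of tensordict.

```python
from typing import Mapping, Sequence


def common_keys_in_order(mappings: Sequence[Mapping[str, object]]) -> list[str]:
    """Keys present in every mapping, in the first mapping's insertion order.

    Hypothetical helper for illustration only.
    """
    if not mappings:
        return []
    common = set(mappings[0])
    for m in mappings[1:]:
        common.intersection_update(m)  # membership only; order is not used
    # Re-impose a deterministic order by walking the first mapping.
    return [k for k in mappings[0] if k in common]


a = {"obs": 1, "action": 2, "reward": 3}
b = {"reward": 30, "obs": 10, "done": False}
shared = common_keys_in_order([a, b])
print(shared)  # ['obs', 'reward']
```

Iterating over `set(a) & set(b)` directly yields the same two keys, but in an order that is not guaranteed to be stable across runs.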
Out of curiosity, which downstream functions would be impacted by this? In other words, in which context is _check_keys(strict=False) used?
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 40.5410μs | 13.6061μs | 73.4966 KOps/s | 77.1433 KOps/s | |
test_plain_set_stack_nested | 39.0600μs | 13.7418μs | 72.7709 KOps/s | 76.4285 KOps/s | |
test_plain_set_nested_inplace | 42.1100μs | 14.6365μs | 68.3222 KOps/s | 70.5491 KOps/s | |
test_plain_set_stack_nested_inplace | 39.1910μs | 14.5491μs | 68.7330 KOps/s | 70.7554 KOps/s | |
test_items | 33.5210μs | 2.8721μs | 348.1801 KOps/s | 342.1117 KOps/s | |
test_items_nested | 0.4250ms | 0.3678ms | 2.7189 KOps/s | 2.6924 KOps/s | |
test_items_nested_locked | 0.5635ms | 0.3746ms | 2.6694 KOps/s | 2.6605 KOps/s | |
test_items_nested_leaf | 96.6620μs | 60.6408μs | 16.4905 KOps/s | 16.5184 KOps/s | |
test_items_stack_nested | 0.4210ms | 0.3672ms | 2.7230 KOps/s | 2.6896 KOps/s | |
test_items_stack_nested_leaf | 93.3320μs | 61.4555μs | 16.2719 KOps/s | 16.5485 KOps/s | |
test_items_stack_nested_locked | 0.4282ms | 0.3740ms | 2.6741 KOps/s | 2.6949 KOps/s | |
test_keys | 25.9500μs | 3.4380μs | 290.8635 KOps/s | 284.3881 KOps/s | |
test_keys_nested | 0.1352ms | 89.8187μs | 11.1335 KOps/s | 11.1605 KOps/s | |
test_keys_nested_locked | 0.7020ms | 95.8698μs | 10.4308 KOps/s | 10.5041 KOps/s | |
test_keys_nested_leaf | 0.1114ms | 80.9040μs | 12.3603 KOps/s | 12.4584 KOps/s | |
test_keys_stack_nested | 0.1172ms | 89.2553μs | 11.2038 KOps/s | 11.2049 KOps/s | |
test_keys_stack_nested_leaf | 0.1175ms | 80.2939μs | 12.4543 KOps/s | 12.4887 KOps/s | |
test_keys_stack_nested_locked | 0.1436ms | 95.2787μs | 10.4955 KOps/s | 10.4658 KOps/s | |
test_values | 5.1833μs | 0.8509μs | 1.1752 MOps/s | 1.1582 MOps/s | |
test_values_nested | 65.9110μs | 37.7713μs | 26.4751 KOps/s | 26.7651 KOps/s | |
test_values_nested_locked | 66.5210μs | 39.9201μs | 25.0500 KOps/s | 25.4821 KOps/s | |
test_values_nested_leaf | 86.2610μs | 43.1266μs | 23.1876 KOps/s | 23.5245 KOps/s | |
test_values_stack_nested | 71.9010μs | 37.8733μs | 26.4038 KOps/s | 26.1705 KOps/s | |
test_values_stack_nested_leaf | 69.3410μs | 43.0741μs | 23.2158 KOps/s | 23.2588 KOps/s | |
test_values_stack_nested_locked | 90.7220μs | 39.7394μs | 25.1640 KOps/s | 25.3009 KOps/s | |
test_membership | 2.6790μs | 0.5029μs | 1.9887 MOps/s | 1.9569 MOps/s | |
test_membership_nested | 28.9155μs | 2.0122μs | 496.9570 KOps/s | 490.7488 KOps/s | |
test_membership_nested_leaf | 20.6800μs | 2.0391μs | 490.4227 KOps/s | 491.2340 KOps/s | |
test_membership_stacked_nested | 47.1210μs | 2.1146μs | 472.9062 KOps/s | 470.5750 KOps/s | |
test_membership_stacked_nested_leaf | 31.2210μs | 2.0930μs | 477.7903 KOps/s | 467.4353 KOps/s | |
test_membership_nested_last | 39.7600μs | 3.1255μs | 319.9528 KOps/s | 318.6482 KOps/s | |
test_membership_nested_leaf_last | 33.1200μs | 3.1792μs | 314.5422 KOps/s | 321.1626 KOps/s | |
test_membership_stacked_nested_last | 29.5010μs | 3.1364μs | 318.8329 KOps/s | 326.2028 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.2410μs | 3.1368μs | 318.7983 KOps/s | 329.0502 KOps/s | |
test_nested_getleaf | 38.9800μs | 6.2573μs | 159.8122 KOps/s | 162.5377 KOps/s | |
test_nested_get | 32.5800μs | 5.9320μs | 168.5781 KOps/s | 166.3394 KOps/s | |
test_stacked_getleaf | 46.3210μs | 6.1229μs | 163.3222 KOps/s | 161.0143 KOps/s | |
test_stacked_get | 37.7210μs | 5.7727μs | 173.2294 KOps/s | 172.0361 KOps/s | |
test_nested_getitemleaf | 38.5500μs | 6.5022μs | 153.7937 KOps/s | 154.0278 KOps/s | |
test_nested_getitem | 30.1200μs | 6.0627μs | 164.9436 KOps/s | 163.6670 KOps/s | |
test_stacked_getitemleaf | 44.2110μs | 6.4129μs | 155.9353 KOps/s | 155.1202 KOps/s | |
test_stacked_getitem | 34.9600μs | 5.9441μs | 168.2338 KOps/s | 166.5162 KOps/s | |
test_lock_nested | 8.9669ms | 0.3456ms | 2.8932 KOps/s | 2.9000 KOps/s | |
test_lock_stack_nested | 0.4020ms | 0.3397ms | 2.9441 KOps/s | 2.8569 KOps/s | |
test_unlock_nested | 0.3659ms | 0.2836ms | 3.5266 KOps/s | 3.5408 KOps/s | |
test_unlock_stack_nested | 0.3358ms | 0.2793ms | 3.5809 KOps/s | 3.4643 KOps/s | |
test_flatten_speed | 0.1171ms | 78.1504μs | 12.7958 KOps/s | 12.8010 KOps/s | |
test_unflatten_speed | 0.3801ms | 0.3205ms | 3.1206 KOps/s | 3.0819 KOps/s | |
test_common_ops | 0.7557ms | 0.6423ms | 1.5569 KOps/s | 1.5454 KOps/s | |
test_creation | 72.1510μs | 1.7644μs | 566.7721 KOps/s | 561.2165 KOps/s | |
test_creation_empty | 45.4210μs | 10.4593μs | 95.6086 KOps/s | 105.2291 KOps/s | |
test_creation_nested_1 | 48.7410μs | 12.1048μs | 82.6118 KOps/s | 89.1546 KOps/s | |
test_creation_nested_2 | 44.7310μs | 14.8565μs | 67.3105 KOps/s | 72.9102 KOps/s | |
test_clone | 53.5910μs | 10.2820μs | 97.2569 KOps/s | 90.2163 KOps/s | |
test_getitem[int] | 1.1254ms | 10.5486μs | 94.7993 KOps/s | 93.2072 KOps/s | |
test_getitem[slice_int] | 0.1144ms | 20.7450μs | 48.2043 KOps/s | 47.3061 KOps/s | |
test_getitem[range] | 0.1245ms | 37.4302μs | 26.7164 KOps/s | 25.8075 KOps/s | |
test_getitem[tuple] | 0.1041ms | 18.0155μs | 55.5079 KOps/s | 54.1972 KOps/s | |
test_getitem[list] | 0.1274ms | 32.0851μs | 31.1671 KOps/s | 30.2182 KOps/s | |
test_setitem_dim[int] | 38.7300μs | 19.2630μs | 51.9129 KOps/s | 50.2039 KOps/s | |
test_setitem_dim[slice_int] | 67.6210μs | 38.0608μs | 26.2738 KOps/s | 25.3089 KOps/s | |
test_setitem_dim[range] | 0.1010ms | 52.8484μs | 18.9221 KOps/s | 18.6140 KOps/s | |
test_setitem_dim[tuple] | 54.3300μs | 32.3033μs | 30.9566 KOps/s | 30.5653 KOps/s | |
test_setitem | 52.7510μs | 16.0923μs | 62.1416 KOps/s | 61.6079 KOps/s | |
test_set | 45.8900μs | 15.6797μs | 63.7766 KOps/s | 65.1270 KOps/s | |
test_set_shared | 0.5185ms | 0.1573ms | 6.3587 KOps/s | 6.0777 KOps/s | |
test_update | 0.2391ms | 19.6496μs | 50.8917 KOps/s | 48.7960 KOps/s | |
test_update_nested | 59.3410μs | 25.0997μs | 39.8411 KOps/s | 40.3644 KOps/s | |
test_update__nested | 0.4478ms | 24.8881μs | 40.1798 KOps/s | 38.4154 KOps/s | |
test_set_nested | 59.6510μs | 17.3833μs | 57.5265 KOps/s | 59.2984 KOps/s | |
test_set_nested_new | 62.5910μs | 20.0044μs | 49.9891 KOps/s | 51.4042 KOps/s | |
test_select | 81.9620μs | 30.8203μs | 32.4461 KOps/s | 32.7691 KOps/s | |
test_select_nested | 81.7210μs | 44.2540μs | 22.5968 KOps/s | 22.9285 KOps/s | |
test_exclude_nested | 92.3110μs | 64.4825μs | 15.5081 KOps/s | 15.7831 KOps/s | |
test_empty[True] | 0.3574ms | 0.2971ms | 3.3663 KOps/s | 3.3998 KOps/s | |
test_empty[False] | 3.8631μs | 0.8244μs | 1.2130 MOps/s | 1.2197 MOps/s | |
test_to | 86.8020μs | 54.6384μs | 18.3022 KOps/s | 16.4763 KOps/s | |
test_to_nonblocking | 88.1810μs | 46.8541μs | 21.3428 KOps/s | 20.2539 KOps/s | |
test_unbind_speed | 0.2865ms | 0.2379ms | 4.2029 KOps/s | 4.0886 KOps/s | |
test_unbind_speed_stack0 | 0.3082ms | 0.2349ms | 4.2580 KOps/s | 4.1218 KOps/s | |
test_unbind_speed_stack1 | 92.9467ms | 0.7319ms | 1.3663 KOps/s | 1.3443 KOps/s | |
test_split | 95.7852ms | 1.5787ms | 633.4443 Ops/s | 628.3468 Ops/s | |
test_chunk | 94.8858ms | 1.6021ms | 624.1842 Ops/s | 628.1301 Ops/s | |
test_consolidate[False-None] | 3.2761ms | 2.6995ms | 370.4344 Ops/s | 336.0171 Ops/s | |
test_consolidate[default-None] | 1.7847ms | 1.6986ms | 588.7076 Ops/s | 578.4046 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8502ms | 1.7463ms | 572.6255 Ops/s | 572.2877 Ops/s | |
test_consolidate_njt[False-None] | 6.8688ms | 6.4762ms | 154.4106 Ops/s | 150.6467 Ops/s | |
test_to[False-False-None] | 1.7985ms | 1.7034ms | 587.0475 Ops/s | 568.5955 Ops/s | |
test_to[True-False-None] | 1.5271ms | 1.2889ms | 775.8318 Ops/s | 710.3645 Ops/s | |
test_to[within-False-None] | 4.3549ms | 4.0974ms | 244.0583 Ops/s | 239.6647 Ops/s | |
test_to[True-default-None] | 5.4746ms | 5.2467ms | 190.5945 Ops/s | 187.7344 Ops/s | |
test_to_njt[False-False-None] | 7.1078ms | 6.9333ms | 144.2315 Ops/s | 141.4949 Ops/s | |
test_to_njt[True-False-None] | 5.7797ms | 5.5265ms | 180.9475 Ops/s | 177.2419 Ops/s | |
test_to_njt[within-False-None] | 12.8024ms | 12.1745ms | 82.1391 Ops/s | 80.6673 Ops/s | |
test_creation[device0] | 0.2980ms | 79.4588μs | 12.5851 KOps/s | 11.7203 KOps/s | |
test_creation_from_tensor | 0.5933ms | 84.8317μs | 11.7880 KOps/s | 11.4038 KOps/s | |
test_add_one[memmap_tensor0] | 0.4522ms | 6.5038μs | 153.7555 KOps/s | 144.8975 KOps/s | |
test_contiguous[memmap_tensor0] | 2.0486μs | 0.4082μs | 2.4501 MOps/s | 2.4276 MOps/s | |
test_stack[memmap_tensor0] | 40.5110μs | 4.2548μs | 235.0281 KOps/s | 221.3827 KOps/s | |
test_memmaptd_index | 1.6438ms | 0.2453ms | 4.0773 KOps/s | 4.0229 KOps/s | |
test_memmaptd_index_astensor | 0.4423ms | 0.2985ms | 3.3500 KOps/s | 3.2215 KOps/s | |
test_memmaptd_index_op | 0.7687ms | 0.6007ms | 1.6647 KOps/s | 1.6365 KOps/s | |
test_serialize_model | 0.1312s | 0.1300s | 7.6909 Ops/s | 7.6353 Ops/s | |
test_serialize_model_pickle | 1.3476s | 1.2124s | 0.8248 Ops/s | 0.8231 Ops/s | |
test_serialize_weights | 0.1311s | 0.1298s | 7.7048 Ops/s | 7.6042 Ops/s | |
test_serialize_weights_returnearly | 46.6673ms | 41.8307ms | 23.9059 Ops/s | 15.0261 Ops/s | |
test_serialize_weights_pickle | 1.3516s | 1.2186s | 0.8206 Ops/s | 0.8142 Ops/s | |
test_reshape_pytree | 55.7210μs | 22.3737μs | 44.6954 KOps/s | 41.7628 KOps/s | |
test_reshape_td | 59.4610μs | 26.9061μs | 37.1663 KOps/s | 37.1952 KOps/s | |
test_view_pytree | 61.4410μs | 21.9300μs | 45.5996 KOps/s | 45.0560 KOps/s | |
test_view_td | 74.9310μs | 30.5133μs | 32.7726 KOps/s | 30.0056 KOps/s | |
test_unbind_pytree | 60.8110μs | 28.2480μs | 35.4007 KOps/s | 34.4204 KOps/s | |
test_unbind_td | 0.6398ms | 36.3779μs | 27.4892 KOps/s | 26.0310 KOps/s | |
test_split_pytree | 68.2310μs | 29.7933μs | 33.5646 KOps/s | 32.6809 KOps/s | |
test_split_td | 0.8072ms | 37.6357μs | 26.5705 KOps/s | 25.3147 KOps/s | |
test_add_pytree | 75.8410μs | 34.3008μs | 29.1538 KOps/s | 27.8361 KOps/s | |
test_add_td | 96.3910μs | 50.6841μs | 19.7301 KOps/s | 19.3592 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1717ms | 0.1225ms | 8.1620 KOps/s | 7.7279 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2299ms | 0.1343ms | 7.4443 KOps/s | 7.3356 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1430ms | 95.0112μs | 10.5251 KOps/s | 10.2156 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.9848ms | 0.1483ms | 6.7416 KOps/s | 6.5651 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 59.0610μs | 23.6245μs | 42.3290 KOps/s | 42.5041 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1288ms | 28.9639μs | 34.5257 KOps/s | 34.0193 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3531ms | 63.5576μs | 15.7338 KOps/s | 15.7275 KOps/s | |
test_compile_copy_nested[pytree-eager] | 92.3420μs | 49.1202μs | 20.3582 KOps/s | 20.2906 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1918ms | 0.1419ms | 7.0459 KOps/s | 7.1010 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3068ms | 0.2188ms | 4.5708 KOps/s | 4.5735 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1533ms | 98.4831μs | 10.1540 KOps/s | 10.2660 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1117ms | 55.8289μs | 17.9119 KOps/s | 17.9233 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1767ms | 0.1366ms | 7.3217 KOps/s | 6.9329 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5509ms | 0.4813ms | 2.0779 KOps/s | 2.0086 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3925ms | 0.2626ms | 3.8085 KOps/s | 3.7874 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1853ms | 0.1433ms | 6.9783 KOps/s | 6.8557 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1540ms | 68.2834μs | 14.6448 KOps/s | 14.2155 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1418ms | 99.2490μs | 10.0757 KOps/s | 10.0320 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4773ms | 0.4049ms | 2.4695 KOps/s | 2.4161 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1695ms | 0.1355ms | 7.3813 KOps/s | 7.3041 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.2710μs | 19.0720μs | 52.4328 KOps/s | 53.6044 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 78.0110μs | 31.2280μs | 32.0226 KOps/s | 31.8294 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1141ms | 69.8338μs | 14.3197 KOps/s | 14.3092 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.4310μs | 52.6082μs | 19.0084 KOps/s | 19.0429 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6020ms | 0.3913ms | 2.5558 KOps/s | 2.2413 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7790ms | 2.6698ms | 374.5635 Ops/s | 373.4315 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5738ms | 0.3796ms | 2.6341 KOps/s | 2.2359 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.5151ms | 2.7034ms | 369.9072 Ops/s | 357.2263 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.3020ms | 0.1142ms | 8.7592 KOps/s | 8.2137 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5558ms | 80.4934μs | 12.4234 KOps/s | 11.9349 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.4888ms | 0.1064ms | 9.3977 KOps/s | 9.3783 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1096ms | 68.7992μs | 14.5351 KOps/s | 13.6875 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1494ms | 0.1077ms | 9.2876 KOps/s | 8.8270 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1701ms | 68.1956μs | 14.6637 KOps/s | 14.3736 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1611ms | 0.1006ms | 9.9388 KOps/s | 9.7779 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1561ms | 20.3867μs | 49.0517 KOps/s | 55.5656 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1300ms | 95.3262μs | 10.4903 KOps/s | 10.2276 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 51.4910μs | 15.7716μs | 63.4051 KOps/s | 62.8394 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1646ms | 97.0480μs | 10.3042 KOps/s | 10.3278 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 52.9810μs | 15.9421μs | 62.7269 KOps/s | 63.6386 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1714ms | 0.1015ms | 9.8475 KOps/s | 9.7431 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5889ms | 17.1893μs | 58.1758 KOps/s | 57.0524 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1672ms | 97.7911μs | 10.2259 KOps/s | 10.2584 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 52.6510μs | 15.7932μs | 63.3185 KOps/s | 62.9343 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1541ms | 96.8774μs | 10.3223 KOps/s | 10.2982 KOps/s | |
test_compile_indexing[int-pytree-eager] | 52.1810μs | 15.5632μs | 64.2541 KOps/s | 63.7337 KOps/s | |
test_mod_add[eager] | 89.0910μs | 39.9671μs | 25.0206 KOps/s | 24.9853 KOps/s | |
test_mod_add[compile] | 0.3607ms | 81.9668μs | 12.2001 KOps/s | 12.0746 KOps/s | |
test_mod_add[compile-overhead] | 0.3310ms | 0.1682ms | 5.9469 KOps/s | 5.6685 KOps/s | |
test_mod_wrap[eager] | 0.3271ms | 0.2527ms | 3.9573 KOps/s | 3.8663 KOps/s | |
test_mod_wrap[compile] | 0.3890ms | 0.2871ms | 3.4826 KOps/s | 3.4426 KOps/s | |
test_mod_wrap[compile-overhead] | 7.2251ms | 3.8873ms | 257.2471 Ops/s | 264.6801 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4753ms | 1.3628ms | 733.7893 Ops/s | 675.3752 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4080ms | 1.2710ms | 786.7851 Ops/s | 756.3527 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4232ms | 0.9374ms | 1.0668 KOps/s | 1.0385 KOps/s | |
test_seq_add[eager] | 0.1702ms | 0.1200ms | 8.3336 KOps/s | 8.2993 KOps/s | |
test_seq_add[compile] | 0.1351ms | 91.3387μs | 10.9483 KOps/s | 11.2114 KOps/s | |
test_seq_add[compile-overhead] | 0.2447ms | 0.1322ms | 7.5619 KOps/s | 7.5592 KOps/s | |
test_seq_wrap[eager] | 0.5184ms | 0.4342ms | 2.3031 KOps/s | 2.2914 KOps/s | |
test_seq_wrap[compile] | 0.3985ms | 0.3062ms | 3.2654 KOps/s | 3.2754 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3202ms | 0.2287ms | 4.3726 KOps/s | 4.4008 KOps/s | |
test_func_call_runtime[False-eager] | 0.8233ms | 0.7359ms | 1.3588 KOps/s | 1.3091 KOps/s | |
test_func_call_runtime[False-compile] | 1.1495ms | 0.7500ms | 1.3333 KOps/s | 1.3312 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4300ms | 0.3646ms | 2.7426 KOps/s | 2.7431 KOps/s | |
test_func_call_runtime[True-eager] | 1.0027ms | 0.9042ms | 1.1060 KOps/s | 1.0873 KOps/s | |
test_func_call_runtime[True-compile] | 1.0220ms | 0.7818ms | 1.2791 KOps/s | 1.2832 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4600ms | 0.3910ms | 2.5574 KOps/s | 2.5455 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8797ms | 0.7643ms | 1.3084 KOps/s | 1.3237 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8661ms | 0.7816ms | 1.2795 KOps/s | 1.3177 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4436ms | 0.3775ms | 2.6487 KOps/s | 2.7324 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1580ms | 1.0041ms | 995.8979 Ops/s | 960.3710 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1528ms | 1.0436ms | 958.1897 Ops/s | 981.4276 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2321ms | 1.0285ms | 972.3066 Ops/s | 965.1675 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5443ms | 2.1239ms | 470.8310 Ops/s | 466.0175 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9374ms | 0.8319ms | 1.2020 KOps/s | 1.2033 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4809ms | 0.4181ms | 2.3919 KOps/s | 2.3594 KOps/s | |
test_distributed | 2.7628ms | 0.1261ms | 7.9285 KOps/s | 8.5477 KOps/s | |
test_tdmodule | 63.9410μs | 22.3580μs | 44.7267 KOps/s | 46.5545 KOps/s | |
test_tdmodule_dispatch | 62.3410μs | 40.3923μs | 24.7572 KOps/s | 25.9851 KOps/s | |
test_tdseq | 49.1000μs | 23.0061μs | 43.4667 KOps/s | 46.5943 KOps/s | |
test_tdseq_dispatch | 95.1010μs | 44.1613μs | 22.6442 KOps/s | 25.1827 KOps/s | |
test_instantiation_functorch | 2.2260ms | 1.5605ms | 640.8004 Ops/s | 641.9224 Ops/s | |
test_exec_functorch | 0.1867ms | 0.1401ms | 7.1357 KOps/s | 6.9230 KOps/s | |
test_exec_functional_call | 0.2506ms | 0.1339ms | 7.4680 KOps/s | 7.1000 KOps/s | |
test_exec_td_decorator | 0.3698ms | 0.1858ms | 5.3832 KOps/s | 5.1882 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8015ms | 0.6927ms | 1.4436 KOps/s | 1.3819 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8061ms | 0.6978ms | 1.4330 KOps/s | 1.3732 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7291ms | 0.6078ms | 1.6452 KOps/s | 1.5833 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7213ms | 0.6059ms | 1.6504 KOps/s | 1.5823 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.4523ms | 19.6633ms | 50.8562 Ops/s | 50.8737 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7450ms | 19.4533ms | 51.4052 Ops/s | 51.4993 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.6009ms | 19.2234ms | 52.0199 Ops/s | 51.6745 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.6924ms | 19.3039ms | 51.8030 Ops/s | 51.9730 Ops/s | |
test_to_module_speed[True] | 1.4573ms | 0.9722ms | 1.0286 KOps/s | 1.0389 KOps/s | |
test_to_module_speed[False] | 1.0592ms | 0.9584ms | 1.0434 KOps/s | 1.0710 KOps/s | |
test_tc_init | 81.5910μs | 38.0381μs | 26.2895 KOps/s | 27.9877 KOps/s | |
test_tc_init_nested | 0.1376ms | 78.5294μs | 12.7341 KOps/s | 13.8870 KOps/s | |
test_tc_first_layer_tensor | 25.4010μs | 0.8015μs | 1.2476 MOps/s | 1.4627 MOps/s | |
test_tc_first_layer_nontensor | 25.7810μs | 2.2142μs | 451.6395 KOps/s | 451.7166 KOps/s | |
test_tc_second_layer_tensor | 9.8550μs | 1.4444μs | 692.3257 KOps/s | 719.4011 KOps/s | |
test_tc_second_layer_nontensor | 33.2210μs | 2.9903μs | 334.4190 KOps/s | 343.0677 KOps/s | |
test_unbind | 7.3191ms | 7.0823ms | 141.1964 Ops/s | 142.0276 Ops/s | |
test_full_like | 13.2523ms | 9.2852ms | 107.6986 Ops/s | 106.9973 Ops/s | |
test_zeros_like | 6.0588ms | 4.2828ms | 233.4906 Ops/s | 230.7746 Ops/s | |
test_ones_like | 4.4555ms | 4.3340ms | 230.7330 Ops/s | 230.9418 Ops/s | |
test_clone | 11.7985ms | 9.2298ms | 108.3449 Ops/s | 155.4602 Ops/s | |
test_squeeze | 46.9510μs | 10.0025μs | 99.9749 KOps/s | 101.2956 KOps/s | |
test_unsqueeze | 0.1230ms | 77.6865μs | 12.8723 KOps/s | 12.6843 KOps/s | |
test_split | 0.2111s | 0.2213ms | 4.5182 KOps/s | 5.8858 KOps/s | |
test_permute | 0.3020ms | 0.1868ms | 5.3532 KOps/s | 5.3432 KOps/s | |
test_stack | 51.7996ms | 50.2804ms | 19.8885 Ops/s | 19.2038 Ops/s | |
test_cat | 51.6839ms | 50.5569ms | 19.7797 Ops/s | 19.6903 Ops/s |
else: | ||
keys: set[str] = set(keys) | ||
keys_set: set[str] = set(keys) |
Out of curiosity, is using a set much more efficient than always using the other option, or is there another reason?
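For context on the question above, here is a minimal sketch of the trade-off in plain Python. A set gives fast membership tests and order-insensitive comparison, but its iteration order for strings is an implementation detail that can change between interpreter runs (string hashing is randomized unless `PYTHONHASHSEED` is fixed). A list or dict keeps insertion order. The variable names are illustrative and are not taken from tensordict.

```python
keys = ["foo", "bar", "baz", "foo"]

# dict preserves insertion order (guaranteed since Python 3.7) and drops duplicates
ordered_unique = list(dict.fromkeys(keys))

# set also drops duplicates, but iterating over it gives no ordering guarantee
unordered_unique = set(keys)

# comparing as sets ignores order, which is what a compatibility check needs
same_keys = unordered_unique == set(ordered_unique)
```

Keeping both an ordered sequence for the return value and a set for comparisons gives deterministic output without giving up cheap membership checks.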
@@ -626,7 +626,6 @@ def stack_fn(key, values, is_not_init, is_tensor): | |||
key: stack_fn(key, values, is_not_init, is_tensor) | |||
for key, (values, is_not_init, is_tensor) in out.items() | |||
} | |||
|
Not sure if this added space is on purpose.
raise KeyError( | ||
f"got keys {keys} and {set(td.keys())} which are incompatible" | ||
) | ||
return keys | ||
if strict: | ||
return keys |
return keys | |
return list(keys) |
Pretty sure that's what you mean with your comment, but just to be on the safe side: right now, the return type is not consistent with the type annotation.
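One possible way to keep both the return type and the key order consistent is sketched below. It is a hypothetical stand-in, not tensordict's `_check_keys`: it works on plain mappings, and it assumes that `strict=True` means "all key sets must match" while `strict=False` means "keep only the common keys". Neither assumption is confirmed by the snippet above.

```python
from typing import Iterable, List, Mapping


def check_keys_sketch(
    dicts: Iterable[Mapping[str, object]], strict: bool = False
) -> List[str]:
    """Hypothetical helper: always returns a list in first-seen key order."""
    dicts = list(dicts)
    if not dicts:
        return []
    # The first mapping fixes the order of the returned keys.
    keys = list(dicts[0].keys())
    for d in dicts[1:]:
        other = set(d.keys())
        if strict:
            # Assumed strict semantics: key sets must be identical.
            if other != set(keys):
                raise KeyError(
                    f"got keys {keys} and {sorted(other)} which are incompatible"
                )
        else:
            # Assumed non-strict semantics: keep common keys, preserving order.
            keys = [k for k in keys if k in other]
    return keys
```

Because sets are only used for comparison and never returned, the result does not depend on hash ordering in either mode.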
return keys | ||
if strict: | ||
return keys | ||
return keys_set |
Out of curiosity, which downstream functions would be impacted by this? In other words, in which context is _check_keys(strict=False) used?
tc1 = MyTensorClass(foo=torch.zeros((1,)), bar=torch.ones((1,))) | ||
|
||
for _ in range(10000): | ||
assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [ |
assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [ | |
assert list(torch.stack([tc1, tc1], dim=0).keys()) == [ |
Stack from ghstack (oldest at bottom):