[Feature] Allow update to increment the number of tds in a lazy stack #1220
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 47.5390μs | 20.9710μs | 47.6848 KOps/s | 45.9395 KOps/s | |
test_plain_set_stack_nested | 55.4240μs | 21.1826μs | 47.2086 KOps/s | 45.3991 KOps/s | |
test_plain_set_nested_inplace | 53.9410μs | 22.8528μs | 43.7584 KOps/s | 42.3928 KOps/s | |
test_plain_set_stack_nested_inplace | 78.3360μs | 22.9130μs | 43.6434 KOps/s | 43.1842 KOps/s | |
test_items | 38.5320μs | 4.3481μs | 229.9851 KOps/s | 242.1970 KOps/s | |
test_items_nested | 0.5306ms | 0.4070ms | 2.4570 KOps/s | 2.4087 KOps/s | |
test_items_nested_locked | 0.6234ms | 0.4025ms | 2.4845 KOps/s | 2.4209 KOps/s | |
test_items_nested_leaf | 0.1380ms | 77.5382μs | 12.8969 KOps/s | 12.6145 KOps/s | |
test_items_stack_nested | 0.5981ms | 0.4041ms | 2.4744 KOps/s | 2.3930 KOps/s | |
test_items_stack_nested_leaf | 0.1387ms | 78.7203μs | 12.7032 KOps/s | 12.0448 KOps/s | |
test_items_stack_nested_locked | 0.7096ms | 0.4035ms | 2.4785 KOps/s | 2.3779 KOps/s | |
test_keys | 23.2740μs | 3.5050μs | 285.3103 KOps/s | 280.7495 KOps/s | |
test_keys_nested | 0.2649ms | 0.1624ms | 6.1577 KOps/s | 6.0311 KOps/s | |
test_keys_nested_locked | 0.7716ms | 0.1686ms | 5.9300 KOps/s | 5.7937 KOps/s | |
test_keys_nested_leaf | 0.2102ms | 0.1420ms | 7.0425 KOps/s | 6.9151 KOps/s | |
test_keys_stack_nested | 0.2927ms | 0.1634ms | 6.1209 KOps/s | 6.1968 KOps/s | |
test_keys_stack_nested_leaf | 0.2289ms | 0.1422ms | 7.0311 KOps/s | 7.1584 KOps/s | |
test_keys_stack_nested_locked | 0.3193ms | 0.1688ms | 5.9244 KOps/s | 6.0006 KOps/s | |
test_values | 36.0880μs | 1.1913μs | 839.4346 KOps/s | 935.6387 KOps/s | |
test_values_nested | 0.1230ms | 62.0990μs | 16.1033 KOps/s | 15.6529 KOps/s | |
test_values_nested_locked | 0.1102ms | 61.6517μs | 16.2201 KOps/s | 15.7507 KOps/s | |
test_values_nested_leaf | 0.1198ms | 70.6305μs | 14.1582 KOps/s | 13.3461 KOps/s | |
test_values_stack_nested | 0.1158ms | 62.4799μs | 16.0051 KOps/s | 15.7248 KOps/s | |
test_values_stack_nested_leaf | 0.1229ms | 70.6027μs | 14.1638 KOps/s | 14.0583 KOps/s | |
test_values_stack_nested_locked | 0.1213ms | 62.7733μs | 15.9303 KOps/s | 15.8867 KOps/s | |
test_membership | 28.7240μs | 0.8604μs | 1.1622 MOps/s | 1.1526 MOps/s | |
test_membership_nested | 36.5690μs | 2.9104μs | 343.6001 KOps/s | 339.8354 KOps/s | |
test_membership_nested_leaf | 34.6140μs | 2.9319μs | 341.0785 KOps/s | 342.5004 KOps/s | |
test_membership_stacked_nested | 39.3430μs | 2.8552μs | 350.2417 KOps/s | 345.7953 KOps/s | |
test_membership_stacked_nested_leaf | 21.1190μs | 2.9028μs | 344.4952 KOps/s | 341.7338 KOps/s | |
test_membership_nested_last | 37.7910μs | 4.3330μs | 230.7884 KOps/s | 228.4639 KOps/s | |
test_membership_nested_leaf_last | 39.1530μs | 4.3362μs | 230.6165 KOps/s | 228.1520 KOps/s | |
test_membership_stacked_nested_last | 24.4350μs | 5.0949μs | 196.2739 KOps/s | 73.4898 KOps/s | |
test_membership_stacked_nested_leaf_last | 82.6440μs | 5.3680μs | 186.2904 KOps/s | 71.9950 KOps/s | |
test_nested_getleaf | 54.0610μs | 10.3212μs | 96.8883 KOps/s | 93.1301 KOps/s | |
test_nested_get | 45.7060μs | 9.7596μs | 102.4629 KOps/s | 98.5229 KOps/s | |
test_stacked_getleaf | 58.0480μs | 10.2383μs | 97.6727 KOps/s | 93.8724 KOps/s | |
test_stacked_get | 56.5260μs | 9.9470μs | 100.5326 KOps/s | 97.7707 KOps/s | |
test_nested_getitemleaf | 43.1700μs | 10.9187μs | 91.5864 KOps/s | 87.7311 KOps/s | |
test_nested_getitem | 54.3110μs | 10.4476μs | 95.7156 KOps/s | 92.4056 KOps/s | |
test_stacked_getitemleaf | 34.9450μs | 10.9653μs | 91.1965 KOps/s | 89.3051 KOps/s | |
test_stacked_getitem | 64.4600μs | 10.4449μs | 95.7409 KOps/s | 93.3780 KOps/s | |
test_lock_nested | 0.5190ms | 0.4018ms | 2.4890 KOps/s | 2.4243 KOps/s | |
test_lock_stack_nested | 0.5236ms | 0.4137ms | 2.4173 KOps/s | 2.3767 KOps/s | |
test_unlock_nested | 0.3971ms | 0.3272ms | 3.0559 KOps/s | 2.9712 KOps/s | |
test_unlock_stack_nested | 0.4654ms | 0.3352ms | 2.9830 KOps/s | 2.9285 KOps/s | |
test_flatten_speed | 0.1679ms | 99.3333μs | 10.0671 KOps/s | 9.7479 KOps/s | |
test_unflatten_speed | 0.6973ms | 0.5149ms | 1.9423 KOps/s | 1.8832 KOps/s | |
test_common_ops | 5.0484ms | 0.8103ms | 1.2341 KOps/s | 1.1970 KOps/s | |
test_creation | 55.4220μs | 2.4591μs | 406.6555 KOps/s | 396.9766 KOps/s | |
test_creation_empty | 98.2530μs | 12.4480μs | 80.3343 KOps/s | 77.5917 KOps/s | |
test_creation_nested_1 | 45.0140μs | 15.3396μs | 65.1906 KOps/s | 63.2497 KOps/s | |
test_creation_nested_2 | 77.0030μs | 19.9413μs | 50.1472 KOps/s | 48.1109 KOps/s | |
test_clone | 0.1569ms | 13.2436μs | 75.5083 KOps/s | 72.7569 KOps/s | |
test_getitem[int] | 0.8829ms | 12.7599μs | 78.3704 KOps/s | 78.1875 KOps/s | |
test_getitem[slice_int] | 0.1256ms | 23.8502μs | 41.9284 KOps/s | 40.6759 KOps/s | |
test_getitem[range] | 0.1943ms | 50.0205μs | 19.9918 KOps/s | 19.5331 KOps/s | |
test_getitem[tuple] | 0.1283ms | 20.0414μs | 49.8968 KOps/s | 48.7592 KOps/s | |
test_getitem[list] | 0.1816ms | 45.9600μs | 21.7581 KOps/s | 21.5686 KOps/s | |
test_setitem_dim[int] | 47.6990μs | 25.1569μs | 39.7505 KOps/s | 38.6959 KOps/s | |
test_setitem_dim[slice_int] | 0.1319ms | 50.3094μs | 19.8770 KOps/s | 19.5930 KOps/s | |
test_setitem_dim[range] | 0.1234ms | 74.6241μs | 13.4005 KOps/s | 12.8237 KOps/s | |
test_setitem_dim[tuple] | 85.5500μs | 39.1472μs | 25.5446 KOps/s | 24.5072 KOps/s | |
test_setitem | 0.2403ms | 20.4924μs | 48.7986 KOps/s | 45.7525 KOps/s | |
test_set | 0.2020ms | 20.1844μs | 49.5433 KOps/s | 47.5697 KOps/s | |
test_set_shared | 4.3360ms | 0.1800ms | 5.5557 KOps/s | 5.5661 KOps/s | |
test_update | 0.2446ms | 23.4970μs | 42.5587 KOps/s | 40.5283 KOps/s | |
test_update_nested | 0.2140ms | 34.5909μs | 28.9093 KOps/s | 28.2041 KOps/s | |
test_update__nested | 0.4553ms | 33.5825μs | 29.7774 KOps/s | 28.4767 KOps/s | |
test_set_nested | 0.1017ms | 22.0899μs | 45.2695 KOps/s | 42.8356 KOps/s | |
test_set_nested_new | 0.1027ms | 26.2374μs | 38.1135 KOps/s | 35.8851 KOps/s | |
test_select | 0.1059ms | 43.6118μs | 22.9296 KOps/s | 22.4953 KOps/s | |
test_select_nested | 0.1283ms | 62.5546μs | 15.9860 KOps/s | 15.6726 KOps/s | |
test_exclude_nested | 0.1629ms | 80.9303μs | 12.3563 KOps/s | 12.2166 KOps/s | |
test_empty[True] | 0.7299ms | 0.4012ms | 2.4926 KOps/s | 2.4370 KOps/s | |
test_empty[False] | 47.9773μs | 1.4002μs | 714.1586 KOps/s | 722.8709 KOps/s | |
test_unbind_speed | 0.4290ms | 0.2649ms | 3.7747 KOps/s | 3.6623 KOps/s | |
test_unbind_speed_stack0 | 0.4482ms | 0.2633ms | 3.7986 KOps/s | 3.8335 KOps/s | |
test_unbind_speed_stack1 | 0.1099s | 0.7282ms | 1.3733 KOps/s | 1.3807 KOps/s | |
test_split | 0.1158s | 1.7546ms | 569.9277 Ops/s | 502.4054 Ops/s | |
test_chunk | 0.1092s | 1.7355ms | 576.2039 Ops/s | 619.1901 Ops/s | |
test_consolidate_njt[False-None] | 9.7715ms | 8.0807ms | 123.7520 Ops/s | 118.5728 Ops/s | |
test_creation[device0] | 3.6637ms | 94.1987μs | 10.6159 KOps/s | 10.4656 KOps/s | |
test_creation_from_tensor | 0.2714ms | 94.1103μs | 10.6258 KOps/s | 10.4892 KOps/s | |
test_add_one[memmap_tensor0] | 95.1380μs | 4.8022μs | 208.2367 KOps/s | 205.3798 KOps/s | |
test_contiguous[memmap_tensor0] | 16.4400μs | 0.5088μs | 1.9654 MOps/s | 1.9490 MOps/s | |
test_stack[memmap_tensor0] | 34.9250μs | 3.3186μs | 301.3363 KOps/s | 292.0765 KOps/s | |
test_memmaptd_index | 1.3155ms | 0.2246ms | 4.4523 KOps/s | 4.3134 KOps/s | |
test_memmaptd_index_astensor | 0.7014ms | 0.3128ms | 3.1966 KOps/s | 3.1244 KOps/s | |
test_memmaptd_index_op | 1.1614ms | 0.5918ms | 1.6898 KOps/s | 1.6440 KOps/s | |
test_serialize_model | 0.2323s | 0.1329s | 7.5273 Ops/s | 8.2217 Ops/s | |
test_serialize_model_pickle | 0.4462s | 0.4010s | 2.4939 Ops/s | 2.4498 Ops/s | |
test_serialize_weights | 0.1296s | 0.1162s | 8.6023 Ops/s | 8.4445 Ops/s | |
test_serialize_weights_returnearly | 0.1846s | 0.1669s | 5.9918 Ops/s | 6.0679 Ops/s | |
test_serialize_weights_pickle | 1.2513s | 0.7327s | 1.3649 Ops/s | 2.4935 Ops/s | |
test_serialize_weights_filesystem | 0.1520s | 0.1452s | 6.8880 Ops/s | 6.1313 Ops/s | |
test_serialize_model_filesystem | 0.1602s | 0.1474s | 6.7828 Ops/s | 6.5327 Ops/s | |
test_reshape_pytree | 73.2560μs | 26.1678μs | 38.2149 KOps/s | 36.4052 KOps/s | |
test_reshape_td | 0.1065ms | 34.0690μs | 29.3522 KOps/s | 29.5760 KOps/s | |
test_view_pytree | 68.7180μs | 26.0008μs | 38.4604 KOps/s | 37.2476 KOps/s | |
test_view_td | 0.1201ms | 40.5325μs | 24.6716 KOps/s | 24.0674 KOps/s | |
test_unbind_pytree | 0.1110ms | 28.9967μs | 34.4867 KOps/s | 33.1244 KOps/s | |
test_unbind_td | 0.3345ms | 39.4881μs | 25.3241 KOps/s | 24.8575 KOps/s | |
test_split_pytree | 71.1820μs | 28.4289μs | 35.1755 KOps/s | 33.5137 KOps/s | |
test_split_td | 0.5320ms | 44.5377μs | 22.4529 KOps/s | 21.5890 KOps/s | |
test_add_pytree | 99.0450μs | 36.2997μs | 27.5484 KOps/s | 28.2620 KOps/s | |
test_add_td | 0.2707ms | 65.4222μs | 15.2853 KOps/s | 17.1198 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1748ms | 67.4365μs | 14.8288 KOps/s | 14.6825 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3760ms | 0.1737ms | 5.7568 KOps/s | 5.7193 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1070ms | 45.4244μs | 22.0146 KOps/s | 21.6428 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2381ms | 0.1180ms | 8.4759 KOps/s | 8.3105 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1271ms | 28.6005μs | 34.9644 KOps/s | 34.9504 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1188ms | 59.0987μs | 16.9208 KOps/s | 16.8001 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1536ms | 80.1640μs | 12.4744 KOps/s | 11.9492 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1625ms | 67.3566μs | 14.8464 KOps/s | 14.5430 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2108ms | 0.1072ms | 9.3258 KOps/s | 9.3108 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3915ms | 0.2164ms | 4.6220 KOps/s | 4.6013 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1197ms | 46.5823μs | 21.4674 KOps/s | 21.1905 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4002ms | 69.8153μs | 14.3235 KOps/s | 14.6419 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1813ms | 0.1004ms | 9.9580 KOps/s | 9.8929 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3360ms | 0.1979ms | 5.0530 KOps/s | 4.8599 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3984ms | 0.2297ms | 4.3526 KOps/s | 4.2649 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2426ms | 0.1071ms | 9.3365 KOps/s | 9.2269 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.4043ms | 65.1571μs | 15.3475 KOps/s | 15.3377 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1331ms | 47.7189μs | 20.9560 KOps/s | 20.5017 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.8970ms | 0.1573ms | 6.3575 KOps/s | 6.2406 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2355ms | 0.1008ms | 9.9199 KOps/s | 9.6602 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 79.0080μs | 20.6165μs | 48.5048 KOps/s | 45.2078 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.3916ms | 69.3567μs | 14.4182 KOps/s | 14.7912 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2715ms | 81.2516μs | 12.3075 KOps/s | 11.9244 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1933ms | 66.7020μs | 14.9921 KOps/s | 14.6149 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2908ms | 0.2130ms | 4.6956 KOps/s | 4.6192 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.7384ms | 1.3971ms | 715.7786 Ops/s | 704.2911 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3424ms | 0.2097ms | 4.7698 KOps/s | 4.7368 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0437ms | 0.8180ms | 1.2226 KOps/s | 1.1854 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.9003ms | 0.4572ms | 2.1871 KOps/s | 2.1817 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.1117ms | 2.7406ms | 364.8883 Ops/s | 355.6559 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1004ms | 38.8906μs | 25.7132 KOps/s | 25.3066 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5475ms | 33.0855μs | 30.2247 KOps/s | 28.5617 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 80.1190μs | 31.4691μs | 31.7772 KOps/s | 32.0703 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 96.9800μs | 22.6679μs | 44.1152 KOps/s | 42.5161 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 71.7440μs | 31.0825μs | 32.1724 KOps/s | 30.7612 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 69.9000μs | 22.1929μs | 45.0595 KOps/s | 42.7921 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1638ms | 53.2265μs | 18.7876 KOps/s | 18.5107 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5117ms | 19.5022μs | 51.2764 KOps/s | 46.6792 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1273ms | 45.8451μs | 21.8126 KOps/s | 20.9472 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 73.8280μs | 18.4346μs | 54.2457 KOps/s | 52.4539 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1040ms | 46.5541μs | 21.4804 KOps/s | 20.6908 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 57.7480μs | 18.4156μs | 54.3019 KOps/s | 52.8610 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2289ms | 57.6594μs | 17.3432 KOps/s | 18.4474 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8632ms | 19.4316μs | 51.4627 KOps/s | 48.0468 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1629ms | 46.6994μs | 21.4135 KOps/s | 20.9104 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 69.7700μs | 18.2384μs | 54.8295 KOps/s | 54.1703 KOps/s | |
test_compile_indexing[int-pytree-compile] | 99.5460μs | 46.1395μs | 21.6734 KOps/s | 21.2319 KOps/s | |
test_compile_indexing[int-pytree-eager] | 56.3050μs | 18.1649μs | 55.0512 KOps/s | 53.1571 KOps/s | |
test_mod_add[eager] | 0.1074ms | 34.5279μs | 28.9621 KOps/s | 26.8650 KOps/s | |
test_mod_add[compile] | 0.1696ms | 63.1480μs | 15.8358 KOps/s | 15.1409 KOps/s | |
test_mod_add[compile-overhead] | 0.1178ms | 63.1047μs | 15.8467 KOps/s | 14.9558 KOps/s | |
test_mod_wrap[eager] | 0.5230ms | 0.2191ms | 4.5643 KOps/s | 4.4254 KOps/s | |
test_mod_wrap[compile] | 2.3553ms | 0.2224ms | 4.4961 KOps/s | 4.3201 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3418ms | 0.2203ms | 4.5387 KOps/s | 4.3844 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.5299ms | 11.1865ms | 89.3931 Ops/s | 71.8616 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.7341ms | 11.3546ms | 88.0698 Ops/s | 82.0002 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.2612ms | 11.6624ms | 85.7454 Ops/s | 84.8896 Ops/s | |
test_seq_add[eager] | 0.2683ms | 0.1154ms | 8.6680 KOps/s | 8.1916 KOps/s | |
test_seq_add[compile] | 0.1660ms | 76.3470μs | 13.0981 KOps/s | 12.3930 KOps/s | |
test_seq_add[compile-overhead] | 0.1410ms | 74.4293μs | 13.4356 KOps/s | 12.9312 KOps/s | |
test_seq_wrap[eager] | 0.7375ms | 0.4441ms | 2.2519 KOps/s | 2.1537 KOps/s | |
test_seq_wrap[compile] | 0.4603ms | 0.2411ms | 4.1480 KOps/s | 4.0425 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3891ms | 0.2385ms | 4.1928 KOps/s | 4.0719 KOps/s | |
test_func_call_runtime[False-eager] | 0.8944ms | 0.5361ms | 1.8654 KOps/s | 1.8001 KOps/s | |
test_func_call_runtime[False-compile] | 0.8297ms | 0.4469ms | 2.2374 KOps/s | 2.2062 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8603ms | 0.4460ms | 2.2419 KOps/s | 2.2220 KOps/s | |
test_func_call_runtime[True-eager] | 1.0845ms | 0.7535ms | 1.3271 KOps/s | 1.2798 KOps/s | |
test_func_call_runtime[True-compile] | 0.5686ms | 0.4646ms | 2.1525 KOps/s | 2.1334 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5789ms | 0.4648ms | 2.1515 KOps/s | 2.1106 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7382ms | 0.5367ms | 1.8631 KOps/s | 1.7927 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5585ms | 0.4450ms | 2.2472 KOps/s | 2.2311 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7316ms | 0.4419ms | 2.2631 KOps/s | 2.2099 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1854ms | 0.9015ms | 1.1093 KOps/s | 1.0844 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2982ms | 0.7973ms | 1.2542 KOps/s | 1.1982 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0346ms | 0.7970ms | 1.2547 KOps/s | 1.1888 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6929ms | 1.9210ms | 520.5732 Ops/s | 509.9440 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9170ms | 0.5410ms | 1.8483 KOps/s | 1.7964 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9545ms | 0.5383ms | 1.8579 KOps/s | 1.8275 KOps/s | |
test_distributed | 0.2628ms | 0.1259ms | 7.9451 KOps/s | 7.4715 KOps/s | |
test_tdmodule | 0.1331ms | 26.6625μs | 37.5059 KOps/s | 36.4821 KOps/s | |
test_tdmodule_dispatch | 0.1102ms | 48.5516μs | 20.5966 KOps/s | 20.3885 KOps/s | |
test_tdseq | 58.9900μs | 28.6339μs | 34.9237 KOps/s | 33.4719 KOps/s | |
test_tdseq_dispatch | 84.0870μs | 53.7526μs | 18.6037 KOps/s | 17.8154 KOps/s | |
test_instantiation_functorch | 1.7374ms | 1.5120ms | 661.3945 Ops/s | 622.6454 Ops/s | |
test_exec_functorch | 0.3150ms | 0.1747ms | 5.7249 KOps/s | 5.4014 KOps/s | |
test_exec_functional_call | 0.3193ms | 0.1680ms | 5.9515 KOps/s | 5.7028 KOps/s | |
test_exec_td_decorator | 0.5747ms | 0.2308ms | 4.3331 KOps/s | 4.2899 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0644ms | 0.6556ms | 1.5252 KOps/s | 1.4995 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0254ms | 0.6598ms | 1.5156 KOps/s | 1.5013 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8258ms | 0.5290ms | 1.8902 KOps/s | 1.8364 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8259ms | 0.5287ms | 1.8914 KOps/s | 1.8541 KOps/s | |
test_to_module_speed[True] | 1.9325ms | 1.3341ms | 749.5759 Ops/s | 749.7244 Ops/s | |
test_to_module_speed[False] | 1.8051ms | 1.3080ms | 764.5515 Ops/s | 754.5233 Ops/s | |
test_tc_init | 89.5670μs | 49.9520μs | 20.0192 KOps/s | 20.0235 KOps/s | |
test_tc_init_nested | 0.1652ms | 99.2404μs | 10.0765 KOps/s | 10.2836 KOps/s | |
test_tc_first_layer_tensor | 39.1230μs | 1.5749μs | 634.9554 KOps/s | 642.6088 KOps/s | |
test_tc_first_layer_nontensor | 39.1930μs | 4.7966μs | 208.4790 KOps/s | 209.2099 KOps/s | |
test_tc_second_layer_tensor | 38.3620μs | 2.8971μs | 345.1728 KOps/s | 352.0118 KOps/s | |
test_tc_second_layer_nontensor | 41.2370μs | 6.1773μs | 161.8818 KOps/s | 166.8481 KOps/s | |
test_unbind | 0.2543s | 14.6081ms | 68.4553 Ops/s | 57.2371 Ops/s | |
test_full_like | 11.9717ms | 8.9160ms | 112.1580 Ops/s | 118.2353 Ops/s | |
test_zeros_like | 5.9825ms | 4.7072ms | 212.4417 Ops/s | 299.1298 Ops/s | |
test_ones_like | 5.0005ms | 3.6802ms | 271.7208 Ops/s | 274.4979 Ops/s | |
test_clone | 8.0553ms | 5.2922ms | 188.9564 Ops/s | 160.1769 Ops/s | |
test_squeeze | 79.9590μs | 12.7776μs | 78.2619 KOps/s | 75.3991 KOps/s | |
test_unsqueeze | 0.1942ms | 91.5747μs | 10.9200 KOps/s | 10.4787 KOps/s | |
test_split | 0.3877ms | 0.1957ms | 5.1089 KOps/s | 4.9852 KOps/s | |
test_permute | 0.2874ms | 0.1980ms | 5.0505 KOps/s | 4.8651 KOps/s | |
test_stack | 30.9389ms | 24.5147ms | 40.7918 Ops/s | 35.9036 Ops/s | |
test_cat | 28.7327ms | 24.1429ms | 41.4200 Ops/s | 35.9172 Ops/s |
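
The "Change" column is empty in the table above. As a rough aid for reading it, the sketch below recomputes a relative change from the "Ops" and "Ops on Repo HEAD" columns. It assumes that change means `Ops / Ops on Repo HEAD - 1`, which may not be the benchmark bot's own definition; the helper names `parse_ops` and `relative_change` are hypothetical and not part of any library.

```python
# Multipliers for the throughput units that appear in the tables.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.6848 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, Ops / Ops-on-HEAD - 1) for one table row.

    Assumed definition of "change"; positive means the PR branch is faster.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, ops_head = cells[0], cells[3], cells[4]
    return name, parse_ops(ops) / parse_ops(ops_head) - 1.0


# Two rows copied verbatim from the table above.
rows = [
    "test_membership_stacked_nested_last | 24.4350μs | 5.0949μs | 196.2739 KOps/s | 73.4898 KOps/s | |",
    "test_serialize_weights_pickle | 1.2513s | 0.7327s | 1.3649 Ops/s | 2.4935 Ops/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Most rows differ by only a few percent between the two columns, so large values from this calculation are the ones worth a second look. A single run can be noisy, so any such figure is indicative rather than conclusive.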

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 36.3100μs | 12.8860μs | 77.6037 KOps/s | 82.8174 KOps/s | |
test_plain_set_stack_nested | 35.4410μs | 12.9705μs | 77.0982 KOps/s | 81.5332 KOps/s | |
test_plain_set_nested_inplace | 55.6710μs | 13.9693μs | 71.5857 KOps/s | 75.6612 KOps/s | |
test_plain_set_stack_nested_inplace | 44.0010μs | 13.9160μs | 71.8600 KOps/s | 75.9952 KOps/s | |
test_items | 37.1400μs | 2.8914μs | 345.8547 KOps/s | 342.4807 KOps/s | |
test_items_nested | 0.4186ms | 0.3602ms | 2.7759 KOps/s | 2.7097 KOps/s | |
test_items_nested_locked | 0.4128ms | 0.3638ms | 2.7486 KOps/s | 2.7075 KOps/s | |
test_items_nested_leaf | 93.1110μs | 60.5760μs | 16.5082 KOps/s | 16.6295 KOps/s | |
test_items_stack_nested | 0.4159ms | 0.3595ms | 2.7814 KOps/s | 2.7280 KOps/s | |
test_items_stack_nested_leaf | 83.9510μs | 60.1707μs | 16.6194 KOps/s | 16.0744 KOps/s | |
test_items_stack_nested_locked | 0.4334ms | 0.3619ms | 2.7634 KOps/s | 2.7335 KOps/s | |
test_keys | 56.4010μs | 3.4141μs | 292.9057 KOps/s | 284.8367 KOps/s | |
test_keys_nested | 0.1157ms | 88.2680μs | 11.3291 KOps/s | 11.3727 KOps/s | |
test_keys_nested_locked | 0.7798ms | 93.6367μs | 10.6796 KOps/s | 10.6806 KOps/s | |
test_keys_nested_leaf | 0.1064ms | 79.9539μs | 12.5072 KOps/s | 12.7092 KOps/s | |
test_keys_stack_nested | 0.1203ms | 88.3432μs | 11.3195 KOps/s | 11.2660 KOps/s | |
test_keys_stack_nested_leaf | 0.1060ms | 79.2688μs | 12.6153 KOps/s | 12.4553 KOps/s | |
test_keys_stack_nested_locked | 0.1449ms | 93.4460μs | 10.7014 KOps/s | 10.6480 KOps/s | |
test_values | 8.2852μs | 0.8781μs | 1.1389 MOps/s | 1.1576 MOps/s | |
test_values_nested | 78.3010μs | 37.5380μs | 26.6397 KOps/s | 26.9128 KOps/s | |
test_values_nested_locked | 93.5010μs | 39.6848μs | 25.1986 KOps/s | 25.5861 KOps/s | |
test_values_nested_leaf | 95.1410μs | 42.3377μs | 23.6196 KOps/s | 23.6476 KOps/s | |
test_values_stack_nested | 71.0510μs | 37.6990μs | 26.5259 KOps/s | 26.4600 KOps/s | |
test_values_stack_nested_leaf | 80.3410μs | 42.8117μs | 23.3581 KOps/s | 23.3128 KOps/s | |
test_values_stack_nested_locked | 67.1010μs | 39.6480μs | 25.2220 KOps/s | 24.9768 KOps/s | |
test_membership | 2.2171μs | 0.5004μs | 1.9984 MOps/s | 1.9912 MOps/s | |
test_membership_nested | 39.0500μs | 2.0663μs | 483.9506 KOps/s | 480.6274 KOps/s | |
test_membership_nested_leaf | 22.8250μs | 2.0039μs | 499.0237 KOps/s | 498.2671 KOps/s | |
test_membership_stacked_nested | 39.5300μs | 2.0766μs | 481.5566 KOps/s | 479.2675 KOps/s | |
test_membership_stacked_nested_leaf | 38.6200μs | 2.0704μs | 482.9931 KOps/s | 478.8443 KOps/s | |
test_membership_nested_last | 25.3300μs | 3.0503μs | 327.8364 KOps/s | 325.2656 KOps/s | |
test_membership_nested_leaf_last | 63.7010μs | 3.0320μs | 329.8182 KOps/s | 324.8731 KOps/s | |
test_membership_stacked_nested_last | 24.3200μs | 3.0424μs | 328.6844 KOps/s | 266.6523 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.6110μs | 3.0581μs | 326.9969 KOps/s | 266.1170 KOps/s | |
test_nested_getleaf | 40.9600μs | 6.2988μs | 158.7601 KOps/s | 159.7387 KOps/s | |
test_nested_get | 95.9210μs | 5.9118μs | 169.1521 KOps/s | 167.5349 KOps/s | |
test_stacked_getleaf | 87.2210μs | 6.2267μs | 160.5985 KOps/s | 162.0254 KOps/s | |
test_stacked_get | 97.8510μs | 5.7864μs | 172.8198 KOps/s | 171.8377 KOps/s | |
test_nested_getitemleaf | 38.4700μs | 6.4273μs | 155.5873 KOps/s | 156.1968 KOps/s | |
test_nested_getitem | 34.5910μs | 6.1395μs | 162.8792 KOps/s | 163.5442 KOps/s | |
test_stacked_getitemleaf | 38.1210μs | 6.4133μs | 155.9249 KOps/s | 156.2090 KOps/s | |
test_stacked_getitem | 33.6200μs | 6.0141μs | 166.2748 KOps/s | 165.8428 KOps/s | |
test_lock_nested | 0.4044ms | 0.3412ms | 2.9309 KOps/s | 2.8585 KOps/s | |
test_lock_stack_nested | 0.3932ms | 0.3471ms | 2.8808 KOps/s | 2.8937 KOps/s | |
test_unlock_nested | 0.3601ms | 0.2875ms | 3.4786 KOps/s | 3.5064 KOps/s | |
test_unlock_stack_nested | 0.3249ms | 0.2849ms | 3.5097 KOps/s | 3.5195 KOps/s | |
test_flatten_speed | 0.1146ms | 77.2441μs | 12.9460 KOps/s | 13.1246 KOps/s | |
test_unflatten_speed | 0.3762ms | 0.3193ms | 3.1317 KOps/s | 3.1324 KOps/s | |
test_common_ops | 0.7522ms | 0.6219ms | 1.6080 KOps/s | 1.6464 KOps/s | |
test_creation | 80.0900μs | 1.7500μs | 571.4268 KOps/s | 566.3684 KOps/s | |
test_creation_empty | 46.8400μs | 9.0115μs | 110.9690 KOps/s | 130.3780 KOps/s | |
test_creation_nested_1 | 34.8200μs | 10.7884μs | 92.6923 KOps/s | 106.6787 KOps/s | |
test_creation_nested_2 | 44.6100μs | 13.3636μs | 74.8300 KOps/s | 83.5185 KOps/s | |
test_clone | 60.9010μs | 11.2269μs | 89.0715 KOps/s | 87.8522 KOps/s | |
test_getitem[int] | 1.2118ms | 10.7657μs | 92.8877 KOps/s | 92.8982 KOps/s | |
test_getitem[slice_int] | 0.1074ms | 20.9363μs | 47.7640 KOps/s | 47.2233 KOps/s | |
test_getitem[range] | 0.1358ms | 37.5655μs | 26.6202 KOps/s | 26.0949 KOps/s | |
test_getitem[tuple] | 95.9610μs | 18.5196μs | 53.9968 KOps/s | 54.8849 KOps/s | |
test_getitem[list] | 0.1459ms | 32.8356μs | 30.4547 KOps/s | 29.6736 KOps/s | |
test_setitem_dim[int] | 42.7110μs | 19.9595μs | 50.1014 KOps/s | 49.7415 KOps/s | |
test_setitem_dim[slice_int] | 59.9500μs | 38.3450μs | 26.0790 KOps/s | 24.5534 KOps/s | |
test_setitem_dim[range] | 75.8910μs | 53.3625μs | 18.7397 KOps/s | 19.1711 KOps/s | |
test_setitem_dim[tuple] | 54.2110μs | 33.1615μs | 30.1555 KOps/s | 30.6064 KOps/s | |
test_setitem | 66.7710μs | 15.9779μs | 62.5863 KOps/s | 63.2529 KOps/s | |
test_set | 74.3610μs | 15.3980μs | 64.9437 KOps/s | 66.1499 KOps/s | |
test_set_shared | 0.5254ms | 0.1582ms | 6.3207 KOps/s | 6.2948 KOps/s | |
test_update | 0.4252ms | 18.7792μs | 53.2503 KOps/s | 56.2339 KOps/s | |
test_update_nested | 68.1710μs | 25.0931μs | 39.8515 KOps/s | 43.0003 KOps/s | |
test_update__nested | 0.5558ms | 25.2867μs | 39.5465 KOps/s | 39.3165 KOps/s | |
test_set_nested | 59.4110μs | 16.9692μs | 58.9304 KOps/s | 61.5391 KOps/s | |
test_set_nested_new | 62.2910μs | 19.6262μs | 50.9524 KOps/s | 52.7833 KOps/s | |
test_select | 75.8010μs | 30.7325μs | 32.5389 KOps/s | 33.6467 KOps/s | |
test_select_nested | 74.5110μs | 43.4738μs | 23.0023 KOps/s | 22.5997 KOps/s | |
test_exclude_nested | 0.1253ms | 62.3253μs | 16.0448 KOps/s | 15.6497 KOps/s | |
test_empty[True] | 0.4028ms | 0.2917ms | 3.4284 KOps/s | 3.3903 KOps/s | |
test_empty[False] | 4.9731μs | 0.8214μs | 1.2175 MOps/s | 1.2201 MOps/s | |
test_to | 88.1410μs | 58.5964μs | 17.0659 KOps/s | 17.5749 KOps/s | |
test_to_nonblocking | 0.1010ms | 47.7186μs | 20.9562 KOps/s | 19.9552 KOps/s | |
test_unbind_speed | 0.2763ms | 0.2469ms | 4.0500 KOps/s | 4.0537 KOps/s | |
test_unbind_speed_stack0 | 0.2922ms | 0.2417ms | 4.1381 KOps/s | 4.1352 KOps/s | |
test_unbind_speed_stack1 | 92.8770ms | 0.7429ms | 1.3461 KOps/s | 1.3623 KOps/s | |
test_split | 95.2124ms | 1.5946ms | 627.1206 Ops/s | 681.2448 Ops/s | |
test_chunk | 95.0993ms | 1.5916ms | 628.3064 Ops/s | 568.7204 Ops/s | |
test_consolidate[False-None] | 2.7863ms | 2.6084ms | 383.3745 Ops/s | 375.7593 Ops/s | |
test_consolidate[default-None] | 1.8285ms | 1.7139ms | 583.4689 Ops/s | 597.6450 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9265ms | 1.7327ms | 577.1238 Ops/s | 575.1144 Ops/s | |
test_consolidate_njt[False-None] | 6.6733ms | 6.3865ms | 156.5802 Ops/s | 155.4058 Ops/s | |
test_to[False-False-None] | 1.8676ms | 1.7427ms | 573.8085 Ops/s | 576.3117 Ops/s | |
test_to[True-False-None] | 1.5858ms | 1.3522ms | 739.5446 Ops/s | 752.9897 Ops/s | |
test_to[within-False-None] | 4.2275ms | 4.1110ms | 243.2503 Ops/s | 244.1300 Ops/s | |
test_to[True-default-None] | 5.3661ms | 5.1771ms | 193.1587 Ops/s | 191.1560 Ops/s | |
test_to_njt[False-False-None] | 7.0257ms | 6.9120ms | 144.6754 Ops/s | 143.6874 Ops/s | |
test_to_njt[True-False-None] | 5.6471ms | 5.4421ms | 183.7525 Ops/s | 180.8922 Ops/s | |
test_to_njt[within-False-None] | 12.0926ms | 11.8740ms | 84.2174 Ops/s | 82.0161 Ops/s | |
test_creation[device0] | 0.6339ms | 79.6813μs | 12.5500 KOps/s | 12.5311 KOps/s | |
test_creation_from_tensor | 0.6370ms | 83.5785μs | 11.9648 KOps/s | 12.0499 KOps/s | |
test_add_one[memmap_tensor0] | 0.6326ms | 7.0792μs | 141.2588 KOps/s | 143.1263 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8505μs | 0.4123μs | 2.4252 MOps/s | 2.4326 MOps/s | |
test_stack[memmap_tensor0] | 23.9610μs | 4.4899μs | 222.7223 KOps/s | 218.6109 KOps/s | |
test_memmaptd_index | 1.6425ms | 0.2417ms | 4.1377 KOps/s | 4.0618 KOps/s | |
test_memmaptd_index_astensor | 0.4155ms | 0.3038ms | 3.2918 KOps/s | 3.2875 KOps/s | |
test_memmaptd_index_op | 0.7120ms | 0.5879ms | 1.7009 KOps/s | 1.7358 KOps/s | |
test_serialize_model | 0.1313s | 0.1300s | 7.6939 Ops/s | 7.6926 Ops/s | |
test_serialize_model_pickle | 1.3471s | 1.1854s | 0.8436 Ops/s | 0.8218 Ops/s | |
test_serialize_weights | 0.1308s | 0.1292s | 7.7386 Ops/s | 7.7161 Ops/s | |
test_serialize_weights_returnearly | 0.3142s | 53.3916ms | 18.7295 Ops/s | 23.3814 Ops/s | |
test_serialize_weights_pickle | 1.3721s | 1.2166s | 0.8220 Ops/s | 0.8189 Ops/s | |
test_reshape_pytree | 47.0400μs | 21.5477μs | 46.4088 KOps/s | 44.7443 KOps/s | |
test_reshape_td | 0.4239ms | 25.7052μs | 38.9026 KOps/s | 37.3024 KOps/s | |
test_view_pytree | 51.1200μs | 21.3602μs | 46.8160 KOps/s | 46.4231 KOps/s | |
test_view_td | 0.4272ms | 29.7893μs | 33.5691 KOps/s | 31.1141 KOps/s | |
test_unbind_pytree | 64.4100μs | 28.2474μs | 35.4015 KOps/s | 34.3948 KOps/s | |
test_unbind_td | 0.5904ms | 36.8639μs | 27.1268 KOps/s | 25.3481 KOps/s | |
test_split_pytree | 0.4367ms | 28.9816μs | 34.5047 KOps/s | 33.9092 KOps/s | |
test_split_td | 0.7511ms | 37.6045μs | 26.5926 KOps/s | 25.9349 KOps/s | |
test_add_pytree | 0.4337ms | 34.4531μs | 29.0250 KOps/s | 28.0642 KOps/s | |
test_add_td | 84.5210μs | 48.5206μs | 20.6098 KOps/s | 21.1898 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1793ms | 0.1222ms | 8.1862 KOps/s | 7.8179 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2230ms | 0.1309ms | 7.6379 KOps/s | 7.4276 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1469ms | 98.6030μs | 10.1417 KOps/s | 10.3071 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.2939ms | 0.1528ms | 6.5462 KOps/s | 6.4389 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 56.5010μs | 22.6837μs | 44.0845 KOps/s | 43.9189 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 84.6200μs | 29.5724μs | 33.8153 KOps/s | 34.2029 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4510ms | 64.6464μs | 15.4688 KOps/s | 15.5123 KOps/s | |
test_compile_copy_nested[pytree-eager] | 84.6410μs | 49.0258μs | 20.3974 KOps/s | 20.5527 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2135ms | 0.1443ms | 6.9291 KOps/s | 7.0638 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3123ms | 0.2198ms | 4.5506 KOps/s | 4.6900 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1501ms | 99.3104μs | 10.0694 KOps/s | 9.7849 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4201ms | 55.0243μs | 18.1738 KOps/s | 17.1592 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2295ms | 0.1387ms | 7.2088 KOps/s | 7.0194 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5911ms | 0.4971ms | 2.0115 KOps/s | 1.9777 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3958ms | 0.2588ms | 3.8642 KOps/s | 3.7682 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2160ms | 0.1468ms | 6.8123 KOps/s | 6.7637 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2974ms | 67.5300μs | 14.8082 KOps/s | 14.1995 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.4461ms | 0.1012ms | 9.8856 KOps/s | 9.6526 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5174ms | 0.4137ms | 2.4174 KOps/s | 2.3823 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1825ms | 0.1344ms | 7.4397 KOps/s | 7.4264 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 52.5100μs | 17.6642μs | 56.6117 KOps/s | 56.5996 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 66.9210μs | 31.6535μs | 31.5921 KOps/s | 31.9560 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1038ms | 70.0766μs | 14.2701 KOps/s | 14.3392 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1560ms | 51.9109μs | 19.2638 KOps/s | 18.9442 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6275ms | 0.3950ms | 2.5316 KOps/s | 2.2220 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8494ms | 2.6573ms | 376.3260 Ops/s | 363.7004 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5971ms | 0.3835ms | 2.6072 KOps/s | 2.2800 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8668ms | 2.7117ms | 368.7784 Ops/s | 368.7640 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5988ms | 0.1194ms | 8.3721 KOps/s | 8.7841 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5804ms | 84.9563μs | 11.7708 KOps/s | 12.3793 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.6200ms | 0.1131ms | 8.8452 KOps/s | 9.4695 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1136ms | 72.6234μs | 13.7697 KOps/s | 13.8870 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1677ms | 0.1134ms | 8.8185 KOps/s | 8.9892 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1260ms | 73.2876μs | 13.6449 KOps/s | 13.8387 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2390ms | 0.1042ms | 9.5951 KOps/s | 9.9116 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1407ms | 17.0375μs | 58.6940 KOps/s | 56.4360 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1475ms | 98.6112μs | 10.1408 KOps/s | 10.2511 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 53.5310μs | 15.9479μs | 62.7043 KOps/s | 62.0060 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1562ms | 0.1015ms | 9.8510 KOps/s | 10.3322 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 51.6610μs | 15.8579μs | 63.0601 KOps/s | 62.5806 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1507ms | 0.1056ms | 9.4654 KOps/s | 9.8382 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5886ms | 17.1237μs | 58.3985 KOps/s | 57.3588 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1860ms | 97.7643μs | 10.2287 KOps/s | 10.3260 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 50.3400μs | 15.7340μs | 63.5567 KOps/s | 62.4231 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1506ms | 0.1003ms | 9.9743 KOps/s | 10.1408 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.4091ms | 15.8521μs | 63.0830 KOps/s | 62.5171 KOps/s | |
test_mod_add[eager] | 0.1819ms | 40.7273μs | 24.5536 KOps/s | 26.5244 KOps/s | |
test_mod_add[compile] | 0.3147ms | 80.6883μs | 12.3934 KOps/s | 12.3370 KOps/s | |
test_mod_add[compile-overhead] | 0.3210ms | 0.1666ms | 6.0026 KOps/s | 5.4186 KOps/s | |
test_mod_wrap[eager] | 0.4039ms | 0.2573ms | 3.8867 KOps/s | 3.7502 KOps/s | |
test_mod_wrap[compile] | 0.5414ms | 0.2841ms | 3.5195 KOps/s | 3.3950 KOps/s | |
test_mod_wrap[compile-overhead] | 7.3763ms | 3.8742ms | 258.1197 Ops/s | 271.2869 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7917ms | 1.4782ms | 676.4861 Ops/s | 683.5139 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4955ms | 1.3867ms | 721.1189 Ops/s | 721.7766 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4378ms | 1.0041ms | 995.9027 Ops/s | 949.9888 Ops/s | |
test_seq_add[eager] | 0.1857ms | 0.1167ms | 8.5715 KOps/s | 8.7131 KOps/s | |
test_seq_add[compile] | 0.1384ms | 87.5944μs | 11.4163 KOps/s | 11.4924 KOps/s | |
test_seq_add[compile-overhead] | 0.1922ms | 0.1295ms | 7.7199 KOps/s | 7.8179 KOps/s | |
test_seq_wrap[eager] | 0.5411ms | 0.4245ms | 2.3559 KOps/s | 2.2907 KOps/s | |
test_seq_wrap[compile] | 0.4429ms | 0.2995ms | 3.3389 KOps/s | 3.2681 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3299ms | 0.2232ms | 4.4807 KOps/s | 4.3611 KOps/s | |
test_func_call_runtime[False-eager] | 0.8639ms | 0.7360ms | 1.3588 KOps/s | 1.2709 KOps/s | |
test_func_call_runtime[False-compile] | 1.1260ms | 0.7371ms | 1.3567 KOps/s | 1.3273 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4349ms | 0.3620ms | 2.7625 KOps/s | 2.7211 KOps/s | |
test_func_call_runtime[True-eager] | 1.3451ms | 0.9094ms | 1.0996 KOps/s | 1.0574 KOps/s | |
test_func_call_runtime[True-compile] | 1.1697ms | 0.7601ms | 1.3157 KOps/s | 1.2968 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8081ms | 0.3835ms | 2.6078 KOps/s | 2.5663 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1722ms | 0.7331ms | 1.3641 KOps/s | 1.2532 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1673ms | 0.7342ms | 1.3619 KOps/s | 1.3222 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4664ms | 0.3602ms | 2.7759 KOps/s | 2.6968 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4409ms | 0.9942ms | 1.0058 KOps/s | 966.4844 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.4285ms | 0.9908ms | 1.0093 KOps/s | 976.2432 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.4127ms | 0.9908ms | 1.0093 KOps/s | 970.7679 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4870ms | 2.0859ms | 479.4032 Ops/s | 463.5644 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9575ms | 0.8034ms | 1.2447 KOps/s | 1.2365 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4890ms | 0.4158ms | 2.4048 KOps/s | 2.3425 KOps/s | |
test_distributed | 5.0659ms | 0.2542ms | 3.9340 KOps/s | 8.7451 KOps/s | |
test_tdmodule | 0.4315ms | 21.4298μs | 46.6640 KOps/s | 52.8064 KOps/s | |
test_tdmodule_dispatch | 71.6510μs | 36.4022μs | 27.4709 KOps/s | 29.9220 KOps/s | |
test_tdseq | 44.5600μs | 20.6232μs | 48.4890 KOps/s | 50.9007 KOps/s | |
test_tdseq_dispatch | 58.4510μs | 38.5900μs | 25.9134 KOps/s | 26.9111 KOps/s | |
test_instantiation_functorch | 1.7350ms | 1.5358ms | 651.1439 Ops/s | 648.9585 Ops/s | |
test_exec_functorch | 0.1909ms | 0.1434ms | 6.9733 KOps/s | 6.9880 KOps/s | |
test_exec_functional_call | 0.2079ms | 0.1407ms | 7.1056 KOps/s | 7.3331 KOps/s | |
test_exec_td_decorator | 0.3728ms | 0.1890ms | 5.2902 KOps/s | 5.3395 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8416ms | 0.7017ms | 1.4251 KOps/s | 1.4646 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0940ms | 0.6887ms | 1.4520 KOps/s | 1.4583 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9875ms | 0.5925ms | 1.6877 KOps/s | 1.6639 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9851ms | 0.5946ms | 1.6818 KOps/s | 1.6701 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.2123ms | 19.2853ms | 51.8530 Ops/s | 51.5752 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2558ms | 19.5621ms | 51.1194 Ops/s | 51.7029 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.1859ms | 19.8063ms | 50.4891 Ops/s | 52.3495 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.0111ms | 19.1527ms | 52.2119 Ops/s | 51.8182 Ops/s | |
test_to_module_speed[True] | 1.5058ms | 0.9540ms | 1.0482 KOps/s | 1.0490 KOps/s | |
test_to_module_speed[False] | 1.0343ms | 0.9476ms | 1.0552 KOps/s | 1.0763 KOps/s | |
test_tc_init | 89.4010μs | 36.4786μs | 27.4133 KOps/s | 30.4610 KOps/s | |
test_tc_init_nested | 0.1150ms | 75.3200μs | 13.2767 KOps/s | 14.9419 KOps/s | |
test_tc_first_layer_tensor | 24.1500μs | 0.8089μs | 1.2363 MOps/s | 1.2506 MOps/s | |
test_tc_first_layer_nontensor | 36.4810μs | 2.2002μs | 454.5026 KOps/s | 446.9891 KOps/s | |
test_tc_second_layer_tensor | 95.9810μs | 1.3837μs | 722.7001 KOps/s | 718.6849 KOps/s | |
test_tc_second_layer_nontensor | 22.4200μs | 2.9296μs | 341.3420 KOps/s | 344.8639 KOps/s | |
test_unbind | 0.2168s | 11.9258ms | 83.8515 Ops/s | 141.4074 Ops/s | |
test_full_like | 9.7789ms | 9.3376ms | 107.0939 Ops/s | 105.0931 Ops/s | |
test_zeros_like | 5.0359ms | 4.3472ms | 230.0351 Ops/s | 230.4989 Ops/s | |
test_ones_like | 9.0880ms | 4.4621ms | 224.1111 Ops/s | 231.1717 Ops/s | |
test_clone | 7.0872ms | 6.6707ms | 149.9091 Ops/s | 155.6649 Ops/s | |
test_squeeze | 59.0510μs | 9.4195μs | 106.1629 KOps/s | 105.9108 KOps/s | |
test_unsqueeze | 0.1276ms | 74.4458μs | 13.4326 KOps/s | 14.1725 KOps/s | |
test_split | 0.5718ms | 0.1589ms | 6.2921 KOps/s | 6.3617 KOps/s | |
test_permute | 0.2501ms | 0.1858ms | 5.3826 KOps/s | 5.6132 KOps/s | |
test_stack | 51.4680ms | 50.7329ms | 19.7111 Ops/s | 19.7586 Ops/s | |
test_cat | 50.9789ms | 50.6237ms | 19.7536 Ops/s | 20.1009 Ops/s |
vmoens added a commit that referenced this pull request on Feb 19, 2025
ghstack-source-id: 5b670a86bb60d8bbde40f3dcf5b1ef9d04ae0a74
Pull Request resolved: #1220
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement
New feature or request
Stack from ghstack (oldest at bottom):