[Benchmark] Add Redis benchmarks, optimize reads with covering-range strategy #1570
Open
vmoens wants to merge 1 commit into gh/vmoens/56/base from
This was referenced Feb 14, 2026
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 36.4500μs | 14.9323μs | 66.9691 KOps/s | 66.5021 KOps/s | |
| test_plain_set_stack_nested | 48.4700μs | 15.3129μs | 65.3045 KOps/s | 65.5666 KOps/s | |
| test_plain_set_nested_inplace | 63.7010μs | 16.3156μs | 61.2910 KOps/s | 59.5171 KOps/s | |
| test_plain_set_stack_nested_inplace | 48.3410μs | 16.6166μs | 60.1807 KOps/s | 60.1078 KOps/s | |
| test_items | 38.7510μs | 5.8163μs | 171.9320 KOps/s | 166.7211 KOps/s | |
| test_items_nested | 0.5936ms | 0.5388ms | 1.8559 KOps/s | 1.8530 KOps/s | |
| test_items_nested_locked | 0.6007ms | 0.5443ms | 1.8371 KOps/s | 1.8143 KOps/s | |
| test_items_nested_leaf | 0.1319ms | 97.1817μs | 10.2900 KOps/s | 10.3392 KOps/s | |
| test_items_stack_nested | 0.6284ms | 0.5377ms | 1.8598 KOps/s | 1.8505 KOps/s | |
| test_items_stack_nested_leaf | 0.1247ms | 95.9722μs | 10.4197 KOps/s | 10.4725 KOps/s | |
| test_items_stack_nested_locked | 0.5938ms | 0.5395ms | 1.8536 KOps/s | 1.8190 KOps/s | |
| test_keys | 30.5010μs | 4.3737μs | 228.6419 KOps/s | 237.3241 KOps/s | |
| test_keys_nested | 0.1716ms | 0.1204ms | 8.3036 KOps/s | 8.3346 KOps/s | |
| test_keys_nested_locked | 88.8248ms | 0.1415ms | 7.0651 KOps/s | 7.7579 KOps/s | |
| test_keys_nested_leaf | 0.1511ms | 0.1101ms | 9.0792 KOps/s | 9.0088 KOps/s | |
| test_keys_stack_nested | 0.1608ms | 0.1196ms | 8.3617 KOps/s | 8.3814 KOps/s | |
| test_keys_stack_nested_leaf | 0.1504ms | 0.1102ms | 9.0727 KOps/s | 9.0689 KOps/s | |
| test_keys_stack_nested_locked | 0.1703ms | 0.1295ms | 7.7216 KOps/s | 7.7914 KOps/s | |
| test_values | 8.6680μs | 1.0208μs | 979.6124 KOps/s | 797.8923 KOps/s | |
| test_values_nested | 78.3520μs | 47.8889μs | 20.8817 KOps/s | 20.8172 KOps/s | |
| test_values_nested_locked | 0.2583ms | 51.0801μs | 19.5771 KOps/s | 19.3353 KOps/s | |
| test_values_nested_leaf | 75.7610μs | 54.6445μs | 18.3001 KOps/s | 17.9812 KOps/s | |
| test_values_stack_nested | 80.8120μs | 47.9406μs | 20.8591 KOps/s | 20.8044 KOps/s | |
| test_values_stack_nested_leaf | 85.4610μs | 54.5549μs | 18.3302 KOps/s | 18.2961 KOps/s | |
| test_values_stack_nested_locked | 91.8320μs | 51.4988μs | 19.4179 KOps/s | 19.4136 KOps/s | |
| test_membership | 4.7035μs | 0.8612μs | 1.1611 MOps/s | 1.1428 MOps/s | |
| test_membership_nested | 31.4100μs | 3.2061μs | 311.9040 KOps/s | 314.5865 KOps/s | |
| test_membership_nested_leaf | 37.6310μs | 3.1951μs | 312.9747 KOps/s | 312.8918 KOps/s | |
| test_membership_stacked_nested | 36.8800μs | 3.2193μs | 310.6278 KOps/s | 312.0186 KOps/s | |
| test_membership_stacked_nested_leaf | 31.3310μs | 3.2049μs | 312.0195 KOps/s | 314.4032 KOps/s | |
| test_membership_nested_last | 71.1120μs | 4.6229μs | 216.3132 KOps/s | 215.5442 KOps/s | |
| test_membership_nested_leaf_last | 76.4820μs | 4.6239μs | 216.2654 KOps/s | 215.6237 KOps/s | |
| test_membership_stacked_nested_last | 41.7510μs | 4.6388μs | 215.5721 KOps/s | 215.3246 KOps/s | |
| test_membership_stacked_nested_leaf_last | 29.6600μs | 4.6625μs | 214.4782 KOps/s | 214.5184 KOps/s | |
| test_nested_getleaf | 50.2910μs | 21.3825μs | 46.7673 KOps/s | 46.2434 KOps/s | |
| test_nested_get | 49.2610μs | 20.1930μs | 49.5220 KOps/s | 47.9664 KOps/s | |
| test_stacked_getleaf | 55.1910μs | 21.7586μs | 45.9589 KOps/s | 45.8879 KOps/s | |
| test_stacked_get | 44.7410μs | 20.4950μs | 48.7923 KOps/s | 48.5729 KOps/s | |
| test_nested_getitemleaf | 48.1910μs | 21.7503μs | 45.9764 KOps/s | 44.1753 KOps/s | |
| test_nested_getitem | 0.2606ms | 20.4035μs | 49.0112 KOps/s | 47.6981 KOps/s | |
| test_stacked_getitemleaf | 50.6710μs | 21.6768μs | 46.1323 KOps/s | 45.0003 KOps/s | |
| test_stacked_getitem | 47.1800μs | 20.7267μs | 48.2470 KOps/s | 46.2045 KOps/s | |
| test_lock_nested | 7.7380ms | 0.4806ms | 2.0805 KOps/s | 2.0862 KOps/s | |
| test_lock_stack_nested | 0.5318ms | 0.4778ms | 2.0928 KOps/s | 2.0615 KOps/s | |
| test_unlock_nested | 0.4985ms | 0.3832ms | 2.6093 KOps/s | 2.6035 KOps/s | |
| test_unlock_stack_nested | 0.4322ms | 0.3823ms | 2.6161 KOps/s | 2.5501 KOps/s | |
| test_flatten_speed | 0.1735ms | 0.1222ms | 8.1846 KOps/s | 8.1060 KOps/s | |
| test_unflatten_speed | 0.6469ms | 0.5899ms | 1.6953 KOps/s | 1.6773 KOps/s | |
| test_common_ops | 0.8179ms | 0.6761ms | 1.4790 KOps/s | 1.4442 KOps/s | |
| test_creation | 0.1267ms | 2.9003μs | 344.7875 KOps/s | 343.1075 KOps/s | |
| test_creation_empty | 39.2810μs | 6.1510μs | 162.5749 KOps/s | 162.1115 KOps/s | |
| test_creation_nested_1 | 50.8110μs | 10.9102μs | 91.6575 KOps/s | 91.5878 KOps/s | |
| test_creation_nested_2 | 34.6200μs | 11.8656μs | 84.2775 KOps/s | 84.2981 KOps/s | |
| test_creation_many_keys[10] | 45.2310μs | 18.2938μs | 54.6633 KOps/s | 54.4250 KOps/s | |
| test_creation_many_keys[50] | 0.1141ms | 78.4317μs | 12.7499 KOps/s | 12.7235 KOps/s | |
| test_creation_many_keys[100] | 0.2061ms | 0.1540ms | 6.4948 KOps/s | 6.5051 KOps/s | |
| test_creation_nested_many_keys[10] | 64.7220μs | 39.3274μs | 25.4276 KOps/s | 25.2352 KOps/s | |
| test_creation_nested_many_keys[50] | 0.1939ms | 0.1602ms | 6.2406 KOps/s | 6.1945 KOps/s | |
| test_clone | 41.8600μs | 13.2048μs | 75.7300 KOps/s | 75.4836 KOps/s | |
| test_getitem[int] | 1.6909ms | 14.6692μs | 68.1700 KOps/s | 55.3323 KOps/s | |
| test_getitem[slice_int] | 0.1458ms | 25.2556μs | 39.5951 KOps/s | 39.7305 KOps/s | |
| test_getitem[range] | 0.2031ms | 60.4210μs | 16.5505 KOps/s | 16.0159 KOps/s | |
| test_getitem[tuple] | 0.1483ms | 24.3137μs | 41.1291 KOps/s | 41.2246 KOps/s | |
| test_getitem[list] | 0.1822ms | 56.8754μs | 17.5823 KOps/s | 17.3770 KOps/s | |
| test_setitem_dim[int] | 47.3910μs | 26.0681μs | 38.3611 KOps/s | 39.0930 KOps/s | |
| test_setitem_dim[slice_int] | 79.3220μs | 43.8630μs | 22.7982 KOps/s | 22.2742 KOps/s | |
| test_setitem_dim[range] | 0.1223ms | 92.5019μs | 10.8106 KOps/s | 10.6523 KOps/s | |
| test_setitem_dim[tuple] | 63.4920μs | 40.6888μs | 24.5768 KOps/s | 23.8971 KOps/s | |
| test_setitem | 49.5910μs | 17.9933μs | 55.5763 KOps/s | 55.8306 KOps/s | |
| test_set | 61.0210μs | 17.0583μs | 58.6225 KOps/s | 59.0393 KOps/s | |
| test_set_shared | 0.6237ms | 0.2041ms | 4.8995 KOps/s | 4.7542 KOps/s | |
| test_update | 0.4200ms | 22.0265μs | 45.3998 KOps/s | 45.4881 KOps/s | |
| test_update_nested | 74.1910μs | 34.5186μs | 28.9699 KOps/s | 29.1625 KOps/s | |
| test_update__nested | 0.4462ms | 35.7117μs | 28.0020 KOps/s | 28.7851 KOps/s | |
| test_set_nested | 55.7110μs | 19.7352μs | 50.6708 KOps/s | 52.3334 KOps/s | |
| test_set_nested_new | 57.1710μs | 24.2829μs | 41.1812 KOps/s | 41.3441 KOps/s | |
| test_select | 74.8520μs | 41.7139μs | 23.9728 KOps/s | 23.4676 KOps/s | |
| test_select_nested | 0.1103ms | 74.5841μs | 13.4077 KOps/s | 13.1392 KOps/s | |
| test_exclude_nested | 0.1274ms | 97.9707μs | 10.2071 KOps/s | 10.1290 KOps/s | |
| test_empty[True] | 0.4894ms | 0.4397ms | 2.2745 KOps/s | 2.2425 KOps/s | |
| test_empty[False] | 8.2027μs | 1.3282μs | 752.9101 KOps/s | 752.6957 KOps/s | |
| test_to | 0.1039ms | 72.6077μs | 13.7726 KOps/s | 13.7468 KOps/s | |
| test_to_nonblocking | 0.1094ms | 64.1692μs | 15.5838 KOps/s | 15.4039 KOps/s | |
| test_unbind_speed | 0.3822ms | 0.3306ms | 3.0250 KOps/s | 3.0425 KOps/s | |
| test_unbind_speed_stack0 | 0.4003ms | 0.3257ms | 3.0702 KOps/s | 3.0581 KOps/s | |
| test_unbind_speed_stack1 | 0.1035s | 0.9129ms | 1.0954 KOps/s | 1.1915 KOps/s | |
| test_split | 1.3371ms | 1.1403ms | 876.9843 Ops/s | 881.8171 Ops/s | |
| test_chunk | 0.1030s | 1.2101ms | 826.3952 Ops/s | 914.2983 Ops/s | |
| test_to_cpu_blocking | 19.4335ms | 19.2501ms | 51.9479 Ops/s | 35.3424 Ops/s | |
| test_to_cpu_global_sync | 11.3527ms | 11.0979ms | 90.1073 Ops/s | 89.0897 Ops/s | |
| test_to_cpu_event_sync | 12.2321ms | 12.0505ms | 82.9843 Ops/s | 81.9194 Ops/s | |
| test_to_cpu_default | 0.1145s | 13.3834ms | 74.7192 Ops/s | 73.8414 Ops/s | |
| test_consolidate[False-None] | 4.1522ms | 4.0756ms | 245.3641 Ops/s | 245.2643 Ops/s | |
| test_consolidate[default-None] | 2.0980ms | 2.0188ms | 495.3341 Ops/s | 492.0250 Ops/s | |
| test_consolidate[reduce-overhead-None] | 1.9968ms | 1.9339ms | 517.0985 Ops/s | 507.2162 Ops/s | |
| test_consolidate_njt[False-None] | 8.6688ms | 8.4112ms | 118.8884 Ops/s | 118.2557 Ops/s | |
| test_to[False-False-None] | 2.1377ms | 2.0497ms | 487.8747 Ops/s | 480.3928 Ops/s | |
| test_to[True-False-None] | 2.1627ms | 1.8910ms | 528.8312 Ops/s | 524.3152 Ops/s | |
| test_to[within-False-None] | 6.3644ms | 6.0603ms | 165.0075 Ops/s | 164.4546 Ops/s | |
| test_to[True-default-None] | 7.6409ms | 7.4984ms | 133.3610 Ops/s | 128.8316 Ops/s | |
| test_to_njt[False-False-None] | 8.8843ms | 8.5521ms | 116.9300 Ops/s | 116.1738 Ops/s | |
| test_to_njt[True-False-None] | 7.3268ms | 7.0017ms | 142.8231 Ops/s | 143.1244 Ops/s | |
| test_to_njt[within-False-None] | 16.0263ms | 15.4576ms | 64.6933 Ops/s | 63.2631 Ops/s | |
| test_creation[device0] | 0.4232ms | 0.1161ms | 8.6147 KOps/s | 8.5626 KOps/s | |
| test_creation_from_tensor | 0.4165ms | 0.1139ms | 8.7758 KOps/s | 8.8228 KOps/s | |
| test_add_one[memmap_tensor0] | 0.3560ms | 6.2059μs | 161.1366 KOps/s | 157.0581 KOps/s | |
| test_contiguous[memmap_tensor0] | 13.9800μs | 0.6694μs | 1.4939 MOps/s | 2.1464 MOps/s | |
| test_stack[memmap_tensor0] | 34.0810μs | 4.6051μs | 217.1524 KOps/s | 221.1890 KOps/s | |
| test_memmaptd_index | 0.9954ms | 0.2585ms | 3.8692 KOps/s | 3.8959 KOps/s | |
| test_memmaptd_index_astensor | 0.5107ms | 0.3500ms | 2.8574 KOps/s | 2.8352 KOps/s | |
| test_memmaptd_index_op | 0.7602ms | 0.5851ms | 1.7091 KOps/s | 1.7013 KOps/s | |
| test_serialize_model | 0.1401s | 0.1368s | 7.3109 Ops/s | 7.3523 Ops/s | |
| test_serialize_model_pickle | 1.8945s | 1.3068s | 0.7652 Ops/s | 0.8384 Ops/s | |
| test_serialize_weights | 0.1397s | 0.1365s | 7.3252 Ops/s | 7.3463 Ops/s | |
| test_serialize_weights_returnearly | 0.4431s | 93.1370ms | 10.7369 Ops/s | 10.5289 Ops/s | |
| test_serialize_weights_pickle | 1.3796s | 1.2030s | 0.8312 Ops/s | 0.8206 Ops/s | |
| test_reshape_pytree | 0.2133ms | 33.2747μs | 30.0529 KOps/s | 30.1810 KOps/s | |
| test_reshape_td | 78.3320μs | 44.6103μs | 22.4164 KOps/s | 22.9459 KOps/s | |
| test_view_pytree | 0.2315ms | 33.3214μs | 30.0108 KOps/s | 30.6489 KOps/s | |
| test_view_td | 87.4920μs | 52.1642μs | 19.1702 KOps/s | 19.8178 KOps/s | |
| test_unbind_pytree | 0.2455ms | 36.8992μs | 27.1008 KOps/s | 27.1588 KOps/s | |
| test_unbind_td | 0.1265ms | 48.5871μs | 20.5816 KOps/s | 20.2761 KOps/s | |
| test_split_pytree | 0.1978ms | 42.9633μs | 23.2757 KOps/s | 23.4172 KOps/s | |
| test_split_td | 0.2204ms | 64.7998μs | 15.4321 KOps/s | 15.2513 KOps/s | |
| test_add_pytree | 0.1948ms | 42.5273μs | 23.5143 KOps/s | 23.7906 KOps/s | |
| test_add_td | 96.2720μs | 52.8003μs | 18.9393 KOps/s | 18.8613 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.2686ms | 0.1428ms | 7.0048 KOps/s | 6.9019 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.4455ms | 0.1934ms | 5.1708 KOps/s | 5.2643 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1590ms | 0.1073ms | 9.3208 KOps/s | 8.7484 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4328ms | 0.1795ms | 5.5722 KOps/s | 5.5702 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.1137ms | 27.9265μs | 35.8083 KOps/s | 31.6740 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 95.3020μs | 52.7892μs | 18.9433 KOps/s | 19.3872 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1008ms | 9.7876μs | 102.1697 KOps/s | 102.7640 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4662ms | 70.0779μs | 14.2698 KOps/s | 14.3808 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.3769ms | 0.1743ms | 5.7363 KOps/s | 5.4380 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.2980ms | 0.2564ms | 3.9003 KOps/s | 3.8841 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.2104ms | 0.1150ms | 8.6965 KOps/s | 8.3867 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1164ms | 69.5807μs | 14.3718 KOps/s | 14.3381 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.3836ms | 0.1573ms | 6.3560 KOps/s | 6.1880 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8606ms | 0.5218ms | 1.9164 KOps/s | 1.7907 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4829ms | 0.3070ms | 3.2570 KOps/s | 3.2046 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2598ms | 0.1774ms | 5.6371 KOps/s | 5.1380 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1438ms | 85.2890μs | 11.7248 KOps/s | 11.5970 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.2905ms | 0.1173ms | 8.5240 KOps/s | 7.7116 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6592ms | 0.4279ms | 2.3369 KOps/s | 2.1403 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.3981ms | 0.1597ms | 6.2611 KOps/s | 6.0735 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 57.4810μs | 23.7654μs | 42.0779 KOps/s | 37.8157 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 75.7120μs | 42.0428μs | 23.7853 KOps/s | 24.2950 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 43.1610μs | 10.8885μs | 91.8400 KOps/s | 92.4187 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.3805ms | 51.9754μs | 19.2399 KOps/s | 19.0988 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 2.0099ms | 0.1721ms | 5.8108 KOps/s | 5.3417 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.3726ms | 3.2654ms | 306.2399 Ops/s | 303.8777 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9793ms | 0.1603ms | 6.2381 KOps/s | 5.8950 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9126ms | 2.7707ms | 360.9199 Ops/s | 354.8662 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.2192ms | 0.1072ms | 9.3293 KOps/s | 8.8389 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3133ms | 71.7209μs | 13.9429 KOps/s | 13.5628 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1410ms | 95.1742μs | 10.5070 KOps/s | 10.2563 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2532ms | 45.1353μs | 22.1556 KOps/s | 22.1327 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1603ms | 96.5130μs | 10.3613 KOps/s | 10.1741 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2803ms | 45.3408μs | 22.0552 KOps/s | 22.0280 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1590ms | 56.3565μs | 17.7442 KOps/s | 17.1170 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2279ms | 28.2254μs | 35.4290 KOps/s | 35.3540 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1520ms | 44.3779μs | 22.5338 KOps/s | 21.9453 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2591ms | 22.9411μs | 43.5899 KOps/s | 44.0569 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 0.1013ms | 44.7216μs | 22.3606 KOps/s | 21.5984 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2636ms | 22.7773μs | 43.9033 KOps/s | 44.4200 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1058ms | 56.2180μs | 17.7879 KOps/s | 17.0155 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2458ms | 27.4760μs | 36.3955 KOps/s | 35.5450 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 82.0820μs | 44.6386μs | 22.4022 KOps/s | 21.8191 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2657ms | 22.6440μs | 44.1618 KOps/s | 44.2512 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 86.5220μs | 44.1136μs | 22.6687 KOps/s | 21.3548 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2717ms | 22.6015μs | 44.2448 KOps/s | 43.8162 KOps/s | |
| test_mod_add[eager] | 0.1102ms | 50.2669μs | 19.8938 KOps/s | 19.6414 KOps/s | |
| test_mod_add[compile] | 0.5504ms | 0.1073ms | 9.3201 KOps/s | 9.3424 KOps/s | |
| test_mod_add[compile-overhead] | 0.2473ms | 0.1466ms | 6.8232 KOps/s | 6.3563 KOps/s | |
| test_mod_wrap[eager] | 0.3810ms | 0.3058ms | 3.2704 KOps/s | 3.2882 KOps/s | |
| test_mod_wrap[compile] | 0.4857ms | 0.3599ms | 2.7787 KOps/s | 2.8184 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.4146ms | 4.0998ms | 243.9163 Ops/s | 247.7261 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.5993ms | 1.4847ms | 673.5327 Ops/s | 664.6137 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.5511ms | 1.4404ms | 694.2566 Ops/s | 636.4033 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2747ms | 0.8853ms | 1.1296 KOps/s | 986.2120 Ops/s | |
| test_seq_add[eager] | 0.2102ms | 0.1559ms | 6.4162 KOps/s | 6.4238 KOps/s | |
| test_seq_add[compile] | 0.2521ms | 0.1139ms | 8.7768 KOps/s | 8.1254 KOps/s | |
| test_seq_add[compile-overhead] | 0.2164ms | 0.1523ms | 6.5650 KOps/s | 6.2783 KOps/s | |
| test_seq_wrap[eager] | 0.6012ms | 0.5243ms | 1.9074 KOps/s | 1.8196 KOps/s | |
| test_seq_wrap[compile] | 0.4185ms | 0.3667ms | 2.7269 KOps/s | 2.6431 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3276ms | 0.2626ms | 3.8088 KOps/s | 3.7508 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9240ms | 0.8268ms | 1.2094 KOps/s | 1.1840 KOps/s | |
| test_func_call_runtime[False-compile] | 1.0874ms | 0.9100ms | 1.0990 KOps/s | 1.0735 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5387ms | 0.4605ms | 2.1716 KOps/s | 2.1521 KOps/s | |
| test_func_call_runtime[True-eager] | 1.1221ms | 1.0767ms | 928.7687 Ops/s | 914.8752 Ops/s | |
| test_func_call_runtime[True-compile] | 1.0222ms | 0.9560ms | 1.0460 KOps/s | 1.0702 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.6044ms | 0.4755ms | 2.1030 KOps/s | 2.0748 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.9421ms | 0.8871ms | 1.1272 KOps/s | 1.1780 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.0072ms | 0.9486ms | 1.0542 KOps/s | 1.0777 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5611ms | 0.4687ms | 2.1335 KOps/s | 2.1408 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.5330ms | 1.2562ms | 796.0406 Ops/s | 805.9934 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.2259ms | 0.9992ms | 1.0008 KOps/s | 1.0372 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5852ms | 0.5047ms | 1.9814 KOps/s | 1.9470 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8282ms | 2.3471ms | 426.0607 Ops/s | 422.1228 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0975ms | 0.9693ms | 1.0316 KOps/s | 1.0197 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5638ms | 0.5084ms | 1.9669 KOps/s | 1.9302 KOps/s | |
| test_distributed | 2.9086ms | 0.1673ms | 5.9765 KOps/s | 5.5932 KOps/s | |
| test_tdmodule | 0.5023ms | 28.6249μs | 34.9347 KOps/s | 34.8051 KOps/s | |
| test_tdmodule_dispatch | 77.5510μs | 46.7294μs | 21.3998 KOps/s | 21.2931 KOps/s | |
| test_tdseq | 46.5910μs | 27.5888μs | 36.2466 KOps/s | 35.8201 KOps/s | |
| test_tdseq_dispatch | 69.3410μs | 49.1759μs | 20.3352 KOps/s | 20.4026 KOps/s | |
| test_instantiation_functorch | 2.1514ms | 2.0562ms | 486.3387 Ops/s | 484.0324 Ops/s | |
| test_exec_functorch | 0.2300ms | 0.1787ms | 5.5947 KOps/s | 5.5388 KOps/s | |
| test_exec_functional_call | 0.2096ms | 0.1594ms | 6.2742 KOps/s | 6.3748 KOps/s | |
| test_exec_td_decorator | 0.4492ms | 0.2345ms | 4.2649 KOps/s | 4.2399 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0181ms | 0.8159ms | 1.2257 KOps/s | 1.2092 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 1.0202ms | 0.8182ms | 1.2221 KOps/s | 1.2132 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8840ms | 0.7075ms | 1.4135 KOps/s | 1.4020 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.9108ms | 0.7079ms | 1.4127 KOps/s | 1.4042 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.1447ms | 20.3616ms | 49.1122 Ops/s | 48.7245 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.1679ms | 20.3888ms | 49.0466 Ops/s | 48.7268 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.7433ms | 20.1562ms | 49.6126 Ops/s | 49.2583 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.3626ms | 20.1742ms | 49.5682 Ops/s | 49.2791 Ops/s | |
| test_to_module_speed[True] | 1.6131ms | 1.4819ms | 674.8184 Ops/s | 665.5558 Ops/s | |
| test_to_module_speed[False] | 1.5464ms | 1.4441ms | 692.4505 Ops/s | 681.6535 Ops/s | |
| test_tc_init | 77.3920μs | 46.7766μs | 21.3782 KOps/s | 21.5013 KOps/s | |
| test_tc_init_tensor_only | 38.5800μs | 9.9674μs | 100.3267 KOps/s | 100.8990 KOps/s | |
| test_tc_init_nested | 0.1478ms | 93.3513μs | 10.7122 KOps/s | 10.8309 KOps/s | |
| test_tc_init_many_fields | 43.5210μs | 16.7281μs | 59.7797 KOps/s | 60.5693 KOps/s | |
| test_tc_first_layer_tensor | 22.1110μs | 1.8567μs | 538.5882 KOps/s | 538.7478 KOps/s | |
| test_tc_first_layer_tensor_only | 5.0900μs | 0.7599μs | 1.3160 MOps/s | 1.3133 MOps/s | |
| test_tc_first_layer_tensor_set | 26.6200μs | 4.1694μs | 239.8401 KOps/s | 235.9953 KOps/s | |
| test_tc_first_layer_tensor_only_set | 16.6000μs | 3.1389μs | 318.5831 KOps/s | 314.0949 KOps/s | |
| test_tc_first_layer_nontensor | 28.3400μs | 6.1696μs | 162.0850 KOps/s | 161.2530 KOps/s | |
| test_tc_second_layer_tensor | 40.4200μs | 4.4558μs | 224.4245 KOps/s | 224.8797 KOps/s | |
| test_tc_second_layer_nontensor | 40.3800μs | 8.7592μs | 114.1657 KOps/s | 113.1505 KOps/s | |
| test_unbind | 0.2448s | 14.2322ms | 70.2631 Ops/s | 57.0595 Ops/s | |
| test_full_like | 4.9841ms | 4.3849ms | 228.0576 Ops/s | 228.6780 Ops/s | |
| test_zeros_like | 4.4643ms | 4.3580ms | 229.4604 Ops/s | 228.8660 Ops/s | |
| test_ones_like | 4.4802ms | 4.3655ms | 229.0671 Ops/s | 228.8934 Ops/s | |
| test_clone | 6.6580ms | 6.4544ms | 154.9337 Ops/s | 154.5093 Ops/s | |
| test_squeeze | 0.1879ms | 13.9733μs | 71.5651 KOps/s | 69.2840 KOps/s | |
| test_unsqueeze | 0.2607ms | 0.1103ms | 9.0686 KOps/s | 9.0254 KOps/s | |
| test_split | 0.2425ms | 0.1835ms | 5.4482 KOps/s | 5.4460 KOps/s | |
| test_permute | 0.2486ms | 0.2029ms | 4.9287 KOps/s | 4.9190 KOps/s | |
| test_stack | 51.4509ms | 51.1405ms | 19.5540 Ops/s | 23.3102 Ops/s | |
| test_cat | 51.4945ms | 50.9972ms | 19.6089 Ops/s | 23.2974 Ops/s |
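The Change column is empty in the table above, so the comparison between Ops and Ops on Repo HEAD has to be worked out by hand. A minimal sketch of one way to do that follows, assuming the change is the percent difference of the PR throughput relative to HEAD; how the benchmark bot actually computes it is not stated here. The names `UNIT_SCALE`, `parse_ops`, and `relative_change` are hypothetical helpers, and the sample rows are copied from the table.

```python
# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '66.9691 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumption: positive means the PR run was faster than HEAD.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops_pr) - head) / head * 100.0


# (name, Ops, Ops on Repo HEAD) taken verbatim from rows of the table.
rows = [
    ("test_values", "979.6124 KOps/s", "797.8923 KOps/s"),
    ("test_mod_wrap_and_backward[compile-overhead]", "1.1296 KOps/s", "986.2120 Ops/s"),
    ("test_stack", "19.5540 Ops/s", "23.3102 Ops/s"),
]

for name, ops_pr, ops_head in rows:
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1f}%")
```

Normalizing units first matters because some rows mix them, for example KOps/s in one column and Ops/s in the other.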
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 53.0710μs | 15.0821μs | 66.3037 KOps/s | 66.1575 KOps/s | |
| test_plain_set_stack_nested | 36.7400μs | 15.1358μs | 66.0684 KOps/s | 64.4900 KOps/s | |
| test_plain_set_nested_inplace | 64.5810μs | 16.6186μs | 60.1737 KOps/s | 58.4135 KOps/s | |
| test_plain_set_stack_nested_inplace | 45.5600μs | 16.6449μs | 60.0786 KOps/s | 59.5080 KOps/s | |
| test_items | 36.3900μs | 5.9295μs | 168.6495 KOps/s | 169.7163 KOps/s | |
| test_items_nested | 0.7194ms | 0.5388ms | 1.8561 KOps/s | 1.8674 KOps/s | |
| test_items_nested_locked | 0.7629ms | 0.5391ms | 1.8549 KOps/s | 1.8492 KOps/s | |
| test_items_nested_leaf | 0.1756ms | 97.8323μs | 10.2216 KOps/s | 10.4667 KOps/s | |
| test_items_stack_nested | 0.7246ms | 0.5398ms | 1.8524 KOps/s | 1.8767 KOps/s | |
| test_items_stack_nested_leaf | 0.1528ms | 96.4331μs | 10.3699 KOps/s | 10.1885 KOps/s | |
| test_items_stack_nested_locked | 0.7529ms | 0.5452ms | 1.8343 KOps/s | 1.8217 KOps/s | |
| test_keys | 46.2210μs | 4.2852μs | 233.3614 KOps/s | 229.2209 KOps/s | |
| test_keys_nested | 0.2133ms | 0.1228ms | 8.1463 KOps/s | 8.3246 KOps/s | |
| test_keys_nested_locked | 88.6536ms | 0.1431ms | 6.9886 KOps/s | 7.7574 KOps/s | |
| test_keys_nested_leaf | 0.1700ms | 0.1122ms | 8.9096 KOps/s | 9.0275 KOps/s | |
| test_keys_stack_nested | 0.1932ms | 0.1197ms | 8.3523 KOps/s | 8.1877 KOps/s | |
| test_keys_stack_nested_leaf | 0.2009ms | 0.1127ms | 8.8700 KOps/s | 8.9122 KOps/s | |
| test_keys_stack_nested_locked | 0.1911ms | 0.1301ms | 7.6835 KOps/s | 7.6462 KOps/s | |
| test_values | 10.3822μs | 1.0281μs | 972.6747 KOps/s | 977.9815 KOps/s | |
| test_values_nested | 87.6420μs | 47.3071μs | 21.1385 KOps/s | 20.8226 KOps/s | |
| test_values_nested_locked | 96.1420μs | 50.5078μs | 19.7989 KOps/s | 19.5504 KOps/s | |
| test_values_nested_leaf | 88.3020μs | 53.6431μs | 18.6417 KOps/s | 18.4562 KOps/s | |
| test_values_stack_nested | 0.1177ms | 47.4836μs | 21.0599 KOps/s | 20.9452 KOps/s | |
| test_values_stack_nested_leaf | 0.1003ms | 54.2725μs | 18.4255 KOps/s | 18.4556 KOps/s | |
| test_values_stack_nested_locked | 84.1010μs | 50.6943μs | 19.7261 KOps/s | 19.7761 KOps/s | |
| test_membership | 7.3252μs | 0.8415μs | 1.1884 MOps/s | 1.1755 MOps/s | |
| test_membership_nested | 22.9900μs | 3.2104μs | 311.4880 KOps/s | 315.5347 KOps/s | |
| test_membership_nested_leaf | 55.1410μs | 3.2106μs | 311.4644 KOps/s | 316.0496 KOps/s | |
| test_membership_stacked_nested | 32.5110μs | 3.2668μs | 306.1077 KOps/s | 314.7572 KOps/s | |
| test_membership_stacked_nested_leaf | 53.8310μs | 3.1878μs | 313.6961 KOps/s | 316.3856 KOps/s | |
| test_membership_nested_last | 37.6410μs | 4.6408μs | 215.4810 KOps/s | 219.5590 KOps/s | |
| test_membership_nested_leaf_last | 67.9310μs | 4.6713μs | 214.0734 KOps/s | 218.2892 KOps/s | |
| test_membership_stacked_nested_last | 37.0010μs | 4.6626μs | 214.4743 KOps/s | 217.4423 KOps/s | |
| test_membership_stacked_nested_leaf_last | 43.9710μs | 4.7036μs | 212.6050 KOps/s | 213.3685 KOps/s | |
| test_nested_getleaf | 60.6610μs | 21.3394μs | 46.8616 KOps/s | 45.1639 KOps/s | |
| test_nested_get | 64.3010μs | 20.4316μs | 48.9438 KOps/s | 47.5484 KOps/s | |
| test_stacked_getleaf | 61.5010μs | 21.9681μs | 45.5205 KOps/s | 45.4699 KOps/s | |
| test_stacked_get | 59.4210μs | 20.5338μs | 48.7002 KOps/s | 47.6541 KOps/s | |
| test_nested_getitemleaf | 50.3110μs | 22.0883μs | 45.2729 KOps/s | 43.5470 KOps/s | |
| test_nested_getitem | 63.5610μs | 21.0652μs | 47.4716 KOps/s | 47.2002 KOps/s | |
| test_stacked_getitemleaf | 72.9610μs | 22.0222μs | 45.4087 KOps/s | 43.9832 KOps/s | |
| test_stacked_getitem | 62.4810μs | 21.1625μs | 47.2533 KOps/s | 47.0606 KOps/s | |
| test_lock_nested | 7.6490ms | 0.4804ms | 2.0816 KOps/s | 2.0715 KOps/s | |
| test_lock_stack_nested | 0.5620ms | 0.4827ms | 2.0717 KOps/s | 2.0573 KOps/s | |
| test_unlock_nested | 0.5229ms | 0.3809ms | 2.6253 KOps/s | 2.5299 KOps/s | |
| test_unlock_stack_nested | 0.4517ms | 0.3842ms | 2.6031 KOps/s | 2.5435 KOps/s | |
| test_flatten_speed | 0.1983ms | 0.1215ms | 8.2333 KOps/s | 8.1208 KOps/s | |
| test_unflatten_speed | 0.7803ms | 0.6066ms | 1.6486 KOps/s | 1.6593 KOps/s | |
| test_common_ops | 0.8836ms | 0.6943ms | 1.4404 KOps/s | 1.4179 KOps/s | |
| test_creation | 0.4426ms | 2.9864μs | 334.8525 KOps/s | 334.0187 KOps/s | |
| test_creation_empty | 29.7600μs | 6.2811μs | 159.2086 KOps/s | 157.7846 KOps/s | |
| test_creation_nested_1 | 30.8810μs | 10.8843μs | 91.8756 KOps/s | 90.5681 KOps/s | |
| test_creation_nested_2 | 0.4265ms | 11.8778μs | 84.1909 KOps/s | 82.3429 KOps/s | |
| test_creation_many_keys[10] | 41.1510μs | 18.7131μs | 53.4385 KOps/s | 54.3758 KOps/s | |
| test_creation_many_keys[50] | 0.1189ms | 78.6281μs | 12.7181 KOps/s | 12.6690 KOps/s | |
| test_creation_many_keys[100] | 0.5739ms | 0.1538ms | 6.5041 KOps/s | 6.5291 KOps/s | |
| test_creation_nested_many_keys[10] | 0.4560ms | 39.6310μs | 25.2327 KOps/s | 24.9416 KOps/s | |
| test_creation_nested_many_keys[50] | 0.5745ms | 0.1616ms | 6.1898 KOps/s | 6.1940 KOps/s | |
| test_clone | 40.0210μs | 13.4088μs | 74.5779 KOps/s | 74.5675 KOps/s | |
| test_getitem[int] | 1.6164ms | 14.5949μs | 68.5172 KOps/s | 55.6337 KOps/s | |
| test_getitem[slice_int] | 0.4546ms | 26.4629μs | 37.7887 KOps/s | 39.1609 KOps/s | |
| test_getitem[range] | 0.1820ms | 66.4768μs | 15.0428 KOps/s | 16.0363 KOps/s | |
| test_getitem[tuple] | 0.1490ms | 25.0304μs | 39.9514 KOps/s | 40.8746 KOps/s | |
| test_getitem[list] | 0.5042ms | 62.0778μs | 16.1088 KOps/s | 17.5870 KOps/s | |
| test_setitem_dim[int] | 49.6810μs | 28.8160μs | 34.7030 KOps/s | 38.8951 KOps/s | |
| test_setitem_dim[slice_int] | 72.0010μs | 47.4451μs | 21.0770 KOps/s | 22.5778 KOps/s | |
| test_setitem_dim[range] | 0.5472ms | 0.1027ms | 9.7360 KOps/s | 10.6715 KOps/s | |
| test_setitem_dim[tuple] | 71.9810μs | 44.4919μs | 22.4760 KOps/s | 24.1922 KOps/s | |
| test_setitem | 67.9910μs | 18.4187μs | 54.2928 KOps/s | 55.1216 KOps/s | |
| test_set | 60.7710μs | 17.5603μs | 56.9465 KOps/s | 56.6949 KOps/s | |
| test_set_shared | 0.6301ms | 0.2100ms | 4.7630 KOps/s | 4.7867 KOps/s | |
| test_update | 0.1997ms | 22.3499μs | 44.7430 KOps/s | 44.9933 KOps/s | |
| test_update_nested | 78.1010μs | 34.6024μs | 28.8997 KOps/s | 28.6023 KOps/s | |
| test_update__nested | 0.4597ms | 34.3179μs | 29.1393 KOps/s | 28.4697 KOps/s | |
| test_set_nested | 58.2410μs | 18.8768μs | 52.9751 KOps/s | 51.6232 KOps/s | |
| test_set_nested_new | 59.0210μs | 24.2108μs | 41.3039 KOps/s | 39.9394 KOps/s | |
| test_select | 80.7810μs | 42.8060μs | 23.3612 KOps/s | 23.4409 KOps/s | |
| test_select_nested | 0.1054ms | 75.0275μs | 13.3284 KOps/s | 13.2317 KOps/s | |
| test_exclude_nested | 0.1481ms | 98.6181μs | 10.1401 KOps/s | 10.1745 KOps/s | |
| test_empty[True] | 0.5673ms | 0.4435ms | 2.2549 KOps/s | 2.2647 KOps/s | |
| test_empty[False] | 9.3575μs | 1.3300μs | 751.8762 KOps/s | 750.8220 KOps/s | |
| test_to | 0.1026ms | 71.1736μs | 14.0501 KOps/s | 13.7655 KOps/s | |
| test_to_nonblocking | 0.1180ms | 64.8959μs | 15.4093 KOps/s | 15.6550 KOps/s | |
| test_unbind_speed | 0.4544ms | 0.3270ms | 3.0584 KOps/s | 3.0610 KOps/s | |
| test_unbind_speed_stack0 | 0.4375ms | 0.3248ms | 3.0788 KOps/s | 3.0059 KOps/s | |
| test_unbind_speed_stack1 | 0.1020s | 0.9105ms | 1.0983 KOps/s | 1.1799 KOps/s | |
| test_split | 1.3307ms | 1.1421ms | 875.5524 Ops/s | 776.5132 Ops/s | |
| test_chunk | 0.1018s | 1.2053ms | 829.6720 Ops/s | 916.1265 Ops/s | |
| test_to_cpu_blocking | 19.6972ms | 19.4171ms | 51.5011 Ops/s | 40.0220 Ops/s | |
| test_to_cpu_global_sync | 11.4553ms | 11.2534ms | 88.8621 Ops/s | 88.9312 Ops/s | |
| test_to_cpu_event_sync | 0.1137s | 13.4787ms | 74.1913 Ops/s | 81.7288 Ops/s | |
| test_to_cpu_default | 12.6440ms | 12.3314ms | 81.0937 Ops/s | 81.4164 Ops/s | |
| test_consolidate[False-None] | 4.7298ms | 4.1539ms | 240.7394 Ops/s | 217.2961 Ops/s | |
| test_consolidate[default-None] | 2.1744ms | 2.0278ms | 493.1467 Ops/s | 472.2677 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.0166ms | 1.9471ms | 513.5753 Ops/s | 501.8109 Ops/s | |
| test_consolidate_njt[False-None] | 8.6583ms | 8.5203ms | 117.3665 Ops/s | 116.0160 Ops/s | |
| test_to[False-False-None] | 2.1788ms | 2.0743ms | 482.0791 Ops/s | 475.4570 Ops/s | |
| test_to[True-False-None] | 2.2244ms | 1.9593ms | 510.3743 Ops/s | 515.8969 Ops/s | |
| test_to[within-False-None] | 6.2253ms | 6.1525ms | 162.5365 Ops/s | 162.2329 Ops/s | |
| test_to[True-default-None] | 7.7209ms | 7.5354ms | 132.7076 Ops/s | 127.5393 Ops/s | |
| test_to_njt[False-False-None] | 8.8480ms | 8.6691ms | 115.3521 Ops/s | 115.6911 Ops/s | |
| test_to_njt[True-False-None] | 7.1665ms | 6.9966ms | 142.9261 Ops/s | 141.6473 Ops/s | |
| test_to_njt[within-False-None] | 16.1343ms | 15.8437ms | 63.1164 Ops/s | 63.4315 Ops/s | |
| test_creation[device0] | 0.3517ms | 0.1170ms | 8.5491 KOps/s | 8.3763 KOps/s | |
| test_creation_from_tensor | 0.3679ms | 0.1140ms | 8.7683 KOps/s | 8.6514 KOps/s | |
| test_add_one[memmap_tensor0] | 0.2681ms | 6.3528μs | 157.4103 KOps/s | 155.4092 KOps/s | |
| test_contiguous[memmap_tensor0] | 19.7610μs | 0.6724μs | 1.4872 MOps/s | 2.1097 MOps/s | |
| test_stack[memmap_tensor0] | 26.8000μs | 4.6422μs | 215.4173 KOps/s | 215.4886 KOps/s | |
| test_memmaptd_index | 0.9977ms | 0.2599ms | 3.8480 KOps/s | 3.7552 KOps/s | |
| test_memmaptd_index_astensor | 0.5143ms | 0.3538ms | 2.8265 KOps/s | 2.8007 KOps/s | |
| test_memmaptd_index_op | 0.8490ms | 0.5900ms | 1.6950 KOps/s | 1.6673 KOps/s | |
| test_serialize_model | 0.1398s | 0.1371s | 7.2915 Ops/s | 7.2363 Ops/s | |
| test_serialize_model_pickle | 1.3485s | 1.1930s | 0.8382 Ops/s | 0.8230 Ops/s | |
| test_serialize_weights | 0.1374s | 0.1360s | 7.3506 Ops/s | 7.2940 Ops/s | |
| test_serialize_weights_returnearly | 0.4393s | 94.4192ms | 10.5911 Ops/s | 6.3024 Ops/s | |
| test_serialize_weights_pickle | 1.3649s | 1.2132s | 0.8242 Ops/s | 0.8215 Ops/s | |
| test_reshape_pytree | 0.2088ms | 33.6015μs | 29.7606 KOps/s | 29.9943 KOps/s | |
| test_reshape_td | 80.0820μs | 44.9238μs | 22.2599 KOps/s | 21.8760 KOps/s | |
| test_view_pytree | 0.2185ms | 34.4273μs | 29.0467 KOps/s | 30.1859 KOps/s | |
| test_view_td | 96.3720μs | 53.6073μs | 18.6542 KOps/s | 19.2987 KOps/s | |
| test_unbind_pytree | 0.2373ms | 38.1386μs | 26.2202 KOps/s | 26.7354 KOps/s | |
| test_unbind_td | 0.1823ms | 49.6080μs | 20.1580 KOps/s | 19.9604 KOps/s | |
| test_split_pytree | 0.2632ms | 42.7840μs | 23.3732 KOps/s | 23.2869 KOps/s | |
| test_split_td | 0.1141ms | 64.6353μs | 15.4714 KOps/s | 15.2330 KOps/s | |
| test_add_pytree | 0.2341ms | 42.6024μs | 23.4729 KOps/s | 24.0821 KOps/s | |
| test_add_td | 0.1123ms | 54.0868μs | 18.4888 KOps/s | 19.0788 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1941ms | 0.1396ms | 7.1637 KOps/s | 6.6339 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.3957ms | 0.2097ms | 4.7684 KOps/s | 5.1859 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1634ms | 0.1100ms | 9.0908 KOps/s | 8.9516 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4371ms | 0.1814ms | 5.5129 KOps/s | 5.5160 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.2853ms | 38.3004μs | 26.1094 KOps/s | 30.5081 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 92.5320μs | 53.5682μs | 18.6678 KOps/s | 18.8920 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1233ms | 10.0526μs | 99.4769 KOps/s | 101.5130 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4569ms | 70.1654μs | 14.2520 KOps/s | 14.2108 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2825ms | 0.1790ms | 5.5859 KOps/s | 5.3769 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3377ms | 0.2572ms | 3.8879 KOps/s | 3.8951 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1734ms | 0.1184ms | 8.4473 KOps/s | 8.2173 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1355ms | 69.5889μs | 14.3701 KOps/s | 14.3150 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.4215ms | 0.1588ms | 6.2978 KOps/s | 6.1289 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8531ms | 0.5310ms | 1.8832 KOps/s | 1.8557 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.3901ms | 0.3167ms | 3.1576 KOps/s | 3.2020 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.5002ms | 0.1833ms | 5.4561 KOps/s | 5.1329 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.2150ms | 88.5229μs | 11.2965 KOps/s | 11.7413 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.2827ms | 0.1225ms | 8.1622 KOps/s | 8.0540 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6848ms | 0.4460ms | 2.2422 KOps/s | 2.2745 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.2817ms | 0.1643ms | 6.0849 KOps/s | 6.0232 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 0.1308ms | 24.8354μs | 40.2652 KOps/s | 38.3730 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 0.1587ms | 41.4162μs | 24.1452 KOps/s | 23.6685 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 1.3401ms | 10.8812μs | 91.9012 KOps/s | 91.1391 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4009ms | 52.6727μs | 18.9852 KOps/s | 18.8340 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 1.9930ms | 0.1737ms | 5.7584 KOps/s | 5.3748 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.4359ms | 3.3210ms | 301.1182 Ops/s | 301.3263 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9662ms | 0.1626ms | 6.1501 KOps/s | 6.1088 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9228ms | 2.7896ms | 358.4711 Ops/s | 360.3436 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.2334ms | 0.1092ms | 9.1595 KOps/s | 8.8248 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3122ms | 74.5294μs | 13.4175 KOps/s | 13.8701 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.2212ms | 98.0198μs | 10.2020 KOps/s | 10.2938 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2690ms | 45.3790μs | 22.0366 KOps/s | 22.0753 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1476ms | 99.5113μs | 10.0491 KOps/s | 10.2247 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.3125ms | 46.3775μs | 21.5622 KOps/s | 22.2019 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1915ms | 56.3591μs | 17.7434 KOps/s | 17.0846 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2331ms | 28.1087μs | 35.5762 KOps/s | 34.9948 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1401ms | 45.2288μs | 22.1098 KOps/s | 21.4823 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2646ms | 22.9107μs | 43.6477 KOps/s | 43.4145 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 87.6410μs | 45.6474μs | 21.9071 KOps/s | 21.3913 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2553ms | 22.9371μs | 43.5974 KOps/s | 42.9271 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1001ms | 57.3708μs | 17.4305 KOps/s | 16.7750 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2761ms | 28.6169μs | 34.9443 KOps/s | 35.2447 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 88.3910μs | 46.2193μs | 21.6360 KOps/s | 20.6802 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2655ms | 22.9767μs | 43.5223 KOps/s | 43.4663 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 82.1110μs | 45.2072μs | 22.1204 KOps/s | 21.0865 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2938ms | 22.9199μs | 43.6303 KOps/s | 42.9167 KOps/s | |
| test_mod_add[eager] | 87.3720μs | 50.9337μs | 19.6334 KOps/s | 19.6733 KOps/s | |
| test_mod_add[compile] | 0.1856ms | 0.1041ms | 9.6021 KOps/s | 9.3726 KOps/s | |
| test_mod_add[compile-overhead] | 0.2436ms | 0.1496ms | 6.6841 KOps/s | 6.6424 KOps/s | |
| test_mod_wrap[eager] | 0.3961ms | 0.3145ms | 3.1793 KOps/s | 3.4323 KOps/s | |
| test_mod_wrap[compile] | 0.4561ms | 0.3634ms | 2.7516 KOps/s | 2.8185 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.3591ms | 4.0797ms | 245.1177 Ops/s | 245.7921 Ops/s | |
| test_mod_wrap_and_backward[eager] | 2.1742ms | 1.5233ms | 656.4601 Ops/s | 658.4517 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.6233ms | 1.4580ms | 685.8927 Ops/s | 685.1319 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.3419ms | 0.9050ms | 1.1049 KOps/s | 1.1001 KOps/s | |
| test_seq_add[eager] | 0.2301ms | 0.1608ms | 6.2195 KOps/s | 6.2239 KOps/s | |
| test_seq_add[compile] | 0.1732ms | 0.1164ms | 8.5927 KOps/s | 8.3709 KOps/s | |
| test_seq_add[compile-overhead] | 0.2381ms | 0.1553ms | 6.4387 KOps/s | 6.3019 KOps/s | |
| test_seq_wrap[eager] | 0.5970ms | 0.5341ms | 1.8725 KOps/s | 1.9071 KOps/s | |
| test_seq_wrap[compile] | 0.4865ms | 0.3751ms | 2.6663 KOps/s | 2.6953 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3333ms | 0.2667ms | 3.7496 KOps/s | 3.7371 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9848ms | 0.8934ms | 1.1193 KOps/s | 1.1299 KOps/s | |
| test_func_call_runtime[False-compile] | 1.0669ms | 0.9299ms | 1.0754 KOps/s | 1.0792 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5276ms | 0.4635ms | 2.1574 KOps/s | 2.1547 KOps/s | |
| test_func_call_runtime[True-eager] | 1.1718ms | 1.0832ms | 923.2145 Ops/s | 912.0631 Ops/s | |
| test_func_call_runtime[True-compile] | 1.0548ms | 0.9247ms | 1.0814 KOps/s | 1.0665 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5521ms | 0.4793ms | 2.0862 KOps/s | 2.0600 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.9313ms | 0.8390ms | 1.1919 KOps/s | 1.1811 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.0198ms | 0.9272ms | 1.0785 KOps/s | 1.0758 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.6012ms | 0.4674ms | 2.1394 KOps/s | 2.1297 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3851ms | 1.2372ms | 808.2510 Ops/s | 804.6322 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.0321ms | 0.9615ms | 1.0400 KOps/s | 1.0344 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5699ms | 0.5056ms | 1.9778 KOps/s | 1.9547 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8230ms | 2.3434ms | 426.7330 Ops/s | 422.0612 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0659ms | 0.9823ms | 1.0180 KOps/s | 1.0151 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5667ms | 0.5165ms | 1.9361 KOps/s | 1.9179 KOps/s | |
| test_distributed | 0.7618ms | 0.1544ms | 6.4783 KOps/s | 6.4874 KOps/s | |
| test_tdmodule | 0.2678ms | 28.6635μs | 34.8875 KOps/s | 33.8436 KOps/s | |
| test_tdmodule_dispatch | 77.8610μs | 46.3951μs | 21.5540 KOps/s | 21.2084 KOps/s | |
| test_tdseq | 56.9010μs | 28.3841μs | 35.2310 KOps/s | 36.0692 KOps/s | |
| test_tdseq_dispatch | 78.1610μs | 49.2009μs | 20.3248 KOps/s | 20.4376 KOps/s | |
| test_instantiation_functorch | 2.1964ms | 2.1119ms | 473.5070 Ops/s | 481.7340 Ops/s | |
| test_exec_functorch | 0.2236ms | 0.1805ms | 5.5393 KOps/s | 5.4067 KOps/s | |
| test_exec_functional_call | 0.2245ms | 0.1616ms | 6.1866 KOps/s | 6.1780 KOps/s | |
| test_exec_td_decorator | 0.4552ms | 0.2367ms | 4.2240 KOps/s | 4.1925 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0259ms | 0.8205ms | 1.2188 KOps/s | 1.2157 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.9943ms | 0.8187ms | 1.2214 KOps/s | 1.2178 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.9430ms | 0.7097ms | 1.4091 KOps/s | 1.3848 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.9088ms | 0.7148ms | 1.3990 KOps/s | 1.3984 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 20.9487ms | 20.5709ms | 48.6124 Ops/s | 48.8250 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.0421ms | 20.6276ms | 48.4788 Ops/s | 48.6567 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 21.2422ms | 20.5887ms | 48.5703 Ops/s | 48.6957 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 21.5296ms | 20.6171ms | 48.5035 Ops/s | 49.0393 Ops/s | |
| test_to_module_speed[True] | 2.0307ms | 1.4770ms | 677.0336 Ops/s | 656.8239 Ops/s | |
| test_to_module_speed[False] | 1.9573ms | 1.4562ms | 686.7321 Ops/s | 669.2965 Ops/s | |
| test_tc_init | 88.1010μs | 45.9732μs | 21.7518 KOps/s | 20.7578 KOps/s | |
| test_tc_init_tensor_only | 33.9900μs | 9.9855μs | 100.1455 KOps/s | 99.8510 KOps/s | |
| test_tc_init_nested | 0.1384ms | 93.9041μs | 10.6492 KOps/s | 10.4251 KOps/s | |
| test_tc_init_many_fields | 57.5410μs | 17.2164μs | 58.0841 KOps/s | 59.4010 KOps/s | |
| test_tc_first_layer_tensor | 27.9200μs | 1.8947μs | 527.7760 KOps/s | 533.3707 KOps/s | |
| test_tc_first_layer_tensor_only | 4.2029μs | 0.7665μs | 1.3046 MOps/s | 1.3105 MOps/s | |
| test_tc_first_layer_tensor_set | 37.9210μs | 4.1936μs | 238.4579 KOps/s | 238.9117 KOps/s | |
| test_tc_first_layer_tensor_only_set | 32.1600μs | 3.1561μs | 316.8470 KOps/s | 314.0508 KOps/s | |
| test_tc_first_layer_nontensor | 34.9210μs | 6.3186μs | 158.2642 KOps/s | 158.7240 KOps/s | |
| test_tc_second_layer_tensor | 35.2600μs | 4.6189μs | 216.5035 KOps/s | 222.3110 KOps/s | |
| test_tc_second_layer_nontensor | 42.2510μs | 8.9711μs | 111.4697 KOps/s | 111.6973 KOps/s | |
| test_unbind | 0.2577s | 16.0201ms | 62.4214 Ops/s | 56.7847 Ops/s | |
| test_full_like | 5.0700ms | 4.3374ms | 230.5533 Ops/s | 228.1880 Ops/s | |
| test_zeros_like | 4.4818ms | 4.3645ms | 229.1218 Ops/s | 228.8570 Ops/s | |
| test_ones_like | 4.9553ms | 4.3808ms | 228.2689 Ops/s | 228.6968 Ops/s | |
| test_clone | 6.6185ms | 6.4721ms | 154.5098 Ops/s | 154.7065 Ops/s | |
| test_squeeze | 0.1815ms | 14.4860μs | 69.0322 KOps/s | 64.9912 KOps/s | |
| test_unsqueeze | 0.1677ms | 0.1148ms | 8.7124 KOps/s | 8.7355 KOps/s | |
| test_split | 0.2596ms | 0.1865ms | 5.3629 KOps/s | 5.3679 KOps/s | |
| test_permute | 0.2969ms | 0.2063ms | 4.8468 KOps/s | 4.7086 KOps/s | |
| test_stack | 51.8021ms | 50.8625ms | 19.6608 Ops/s | 23.1830 Ops/s | |
| test_cat | 51.6385ms | 51.2774ms | 19.5018 Ops/s | 23.0894 Ops/s |

Xmaster6y pushed a commit to Xmaster6y/tensordict that referenced this pull request (Feb 27, 2026):

…strategy

Add benchmarks/storage/bench_redis.py comparing RedisTensorDict against local TensorDict for get/set, key iteration, indexed read/write (int, slice, step-slice, fancy, bool mask), and td[idx].to_tensordict().

Performance improvements:
- Fix _tensor_to_bytes: replace bytes(untyped_storage()) with tensor.numpy().tobytes() (~8000x faster serialization).
- Override _index_tensordict with _abatch_index: batch all leaf key fetches into a single pipeline instead of one round-trip per key.
- Covering-range strategy (_compute_covering_range): every index type (int, slice, step-slice, tensor, bool mask) emits at most ONE GETRANGE per key. For non-contiguous indices, the covering byte range is fetched and a local post-index extracts the requested rows.
- Coalesce contiguous byte ranges for step-1 slices.
- Partial covering-range RMW for writes: step/fancy/bool writes fetch only the covering range, patch locally, write back (2 cmds/key instead of N SETRANGEs).

ghstack-source-id: b1854cb
Pull-Request: pytorch#1570
Stack from ghstack (oldest at bottom):
Add benchmarks/storage/bench_redis.py comparing RedisTensorDict against
local TensorDict for get/set, key iteration, indexed read/write (int,
slice, step-slice, fancy, bool mask), and td[idx].to_tensordict().
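Performance improvements:
- Fix _tensor_to_bytes: replace bytes(untyped_storage()) with
  tensor.numpy().tobytes() (~8000x faster serialization).
- Override _index_tensordict with _abatch_index: batch all leaf key
  fetches into a single pipeline instead of one round-trip per key.
- Covering-range strategy (_compute_covering_range): every index type
  (int, slice, step-slice, tensor, bool mask) emits at most ONE
  GETRANGE per key. For non-contiguous indices, the covering byte range
  is fetched and a local post-index extracts the requested rows.
- Coalesce contiguous byte ranges for step-1 slices.
- Partial covering-range RMW for writes: step/fancy/bool writes fetch
  only the covering range, patch locally, write back (2 cmds/key
  instead of N SETRANGEs).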
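
The covering-range idea described above can be illustrated with a small, self-contained sketch. This is not the PR's _compute_covering_range implementation: the function names (covering_range, read_rows, write_rows) are hypothetical, an in-memory bytes buffer stands in for the value stored under a Redis key, and rows are assumed to be fixed-size and laid out contiguously. The only Redis behaviour relied on is that GETRANGE takes inclusive start and end offsets.

```python
from typing import List, Sequence, Tuple


def covering_range(rows: Sequence[int], row_nbytes: int) -> Tuple[int, int]:
    """Return the inclusive byte range [start, end] spanning all requested rows."""
    lo, hi = min(rows), max(rows)
    return lo * row_nbytes, (hi + 1) * row_nbytes - 1


def getrange(blob: bytes, start: int, end: int) -> bytes:
    """Stand-in for Redis GETRANGE (inclusive offsets); no server involved."""
    return blob[start : end + 1]


def read_rows(blob: bytes, rows: Sequence[int], row_nbytes: int) -> List[bytes]:
    """Fetch ONE covering range, then extract the requested rows locally."""
    start, end = covering_range(rows, row_nbytes)
    chunk = getrange(blob, start, end)  # the single simulated fetch
    first = start // row_nbytes  # index of the first row inside the chunk
    return [
        chunk[(r - first) * row_nbytes : (r - first + 1) * row_nbytes]
        for r in rows
    ]


def write_rows(
    blob: bytearray, rows: Sequence[int], values: Sequence[bytes], row_nbytes: int
) -> None:
    """Read-modify-write: fetch the covering range, patch it, write it back once."""
    start, end = covering_range(rows, row_nbytes)
    chunk = bytearray(blob[start : end + 1])  # command 1: read covering range
    first = start // row_nbytes
    for r, v in zip(rows, values):
        off = (r - first) * row_nbytes
        chunk[off : off + row_nbytes] = v  # patch locally
    blob[start : end + 1] = chunk  # command 2: one write-back


# Toy store: 6 rows of 4 bytes each, where row i is byte value i repeated.
ROW = 4
store = bytearray(b"".join(bytes([i]) * ROW for i in range(6)))

picked = read_rows(bytes(store), [1, 3, 5], ROW)  # like a step slice 1::2
write_rows(store, [0, 2], [b"\xff" * ROW, b"\xee" * ROW], ROW)
```

The trade-off in this sketch is that rows 2 and 4 are transferred even though they are not requested; in exchange, the number of commands per key stays constant regardless of how many rows the index selects.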