[Feature] Add Redis-backed TensorDict (RedisTensorDict) #1567

Open
vmoens wants to merge 1 commit into gh/vmoens/53/base from gh/vmoens/53/head

@vmoens commented Feb 14, 2026

Stack from ghstack (oldest at bottom):

Adds RedisTensorDict, a TensorDictBase subclass that stores tensors in a
Redis instance for out-of-core storage. This enables datasets that exceed
local RAM to be accessed with the familiar TensorDict interface.

MVP features:

  • Async redis.asyncio client on a background event loop with sync wrappers
  • Zero-copy serialization via torch.frombuffer / untyped_storage bytes
  • Metadata (shape, dtype) stored in Redis Hashes for fast introspection
  • Redis Cluster-compatible key schema using hash tags
  • Pipelined batch I/O for all get/set operations
  • Nested TensorDict support via lightweight prefix-based views
  • Pickle support for multi-process usage
  • from_dict() factory, to_local() / to_tensordict() materialization
  • CI: install redis package and start redis-server before tests
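
Two of the items above (sync wrappers over an async client on a background event loop, and a Cluster-compatible key schema using hash tags) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the PR's implementation: `BackgroundLoop`, `tensor_key`, `fake_get`, and the `{td:<id>}:<kind>:<key>` layout are hypothetical names chosen for the example, and an in-memory dict stands in for the `redis.asyncio` client so that no server is needed.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical helper: runs an asyncio loop on a daemon thread."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run_sync(self, coro, timeout=None):
        # Schedule the coroutine on the background loop, then block the
        # calling thread until its result (or exception) is available.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self):
        # Stop the loop from the caller's thread, then release its resources.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


def tensor_key(td_id, key, kind="data"):
    # Assumed schema. Redis Cluster hashes only the substring inside the
    # first {...}, so every key of one TensorDict lands in the same slot.
    return f"{{td:{td_id}}}:{kind}:{key}"


async def fake_get(store, key):
    # Stand-in for an awaitable client call; no Redis server is contacted.
    await asyncio.sleep(0)
    return store[key]


loop = BackgroundLoop()
store = {tensor_key("abc", "obs"): b"\x00\x01"}
value = loop.run_sync(fake_get(store, tensor_key("abc", "obs")))
loop.close()
```

Keeping the data and metadata keys of one TensorDict under a shared hash tag is what would let multi-key pipelines run against a single cluster slot. Whether the PR uses this exact layout is not stated in the description.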

[ghstack-poisoned]
@github-actions

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 243. Improved: 12. Worsened: 7.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 38.8100μs 14.8761μs 67.2220 KOps/s 66.7814 KOps/s $\color{#35bf28}+0.66\%$
test_plain_set_stack_nested 52.4900μs 15.1528μs 65.9944 KOps/s 66.0276 KOps/s $\color{#d91a1a}-0.05\%$
test_plain_set_nested_inplace 45.5210μs 16.5238μs 60.5186 KOps/s 59.7986 KOps/s $\color{#35bf28}+1.20\%$
test_plain_set_stack_nested_inplace 48.6410μs 16.4485μs 60.7960 KOps/s 60.0789 KOps/s $\color{#35bf28}+1.19\%$
test_items 37.3200μs 5.7434μs 174.1140 KOps/s 173.1432 KOps/s $\color{#35bf28}+0.56\%$
test_items_nested 0.5996ms 0.5333ms 1.8752 KOps/s 1.8502 KOps/s $\color{#35bf28}+1.35\%$
test_items_nested_locked 0.5971ms 0.5373ms 1.8613 KOps/s 1.8400 KOps/s $\color{#35bf28}+1.16\%$
test_items_nested_leaf 0.1378ms 96.6860μs 10.3428 KOps/s 10.5075 KOps/s $\color{#d91a1a}-1.57\%$
test_items_stack_nested 0.5766ms 0.5304ms 1.8853 KOps/s 1.8528 KOps/s $\color{#35bf28}+1.76\%$
test_items_stack_nested_leaf 0.1372ms 96.8933μs 10.3206 KOps/s 10.4732 KOps/s $\color{#d91a1a}-1.46\%$
test_items_stack_nested_locked 0.5981ms 0.5399ms 1.8521 KOps/s 1.8368 KOps/s $\color{#35bf28}+0.83\%$
test_keys 41.7300μs 4.3934μs 227.6158 KOps/s 235.9342 KOps/s $\color{#d91a1a}-3.53\%$
test_keys_nested 0.1513ms 0.1204ms 8.3029 KOps/s 8.2911 KOps/s $\color{#35bf28}+0.14\%$
test_keys_nested_locked 88.5357ms 0.1412ms 7.0831 KOps/s 7.7570 KOps/s $\textbf{\color{#d91a1a}-8.69\%}$
test_keys_nested_leaf 0.1950ms 0.1113ms 8.9826 KOps/s 9.0781 KOps/s $\color{#d91a1a}-1.05\%$
test_keys_stack_nested 0.1621ms 0.1210ms 8.2613 KOps/s 8.3147 KOps/s $\color{#d91a1a}-0.64\%$
test_keys_stack_nested_leaf 0.1493ms 0.1113ms 8.9837 KOps/s 9.0314 KOps/s $\color{#d91a1a}-0.53\%$
test_keys_stack_nested_locked 0.1724ms 0.1295ms 7.7197 KOps/s 7.7985 KOps/s $\color{#d91a1a}-1.01\%$
test_values 6.5562μs 1.0195μs 980.8678 KOps/s 972.5758 KOps/s $\color{#35bf28}+0.85\%$
test_values_nested 76.1810μs 47.8318μs 20.9066 KOps/s 20.9505 KOps/s $\color{#d91a1a}-0.21\%$
test_values_nested_locked 81.0710μs 50.6063μs 19.7604 KOps/s 19.5009 KOps/s $\color{#35bf28}+1.33\%$
test_values_nested_leaf 81.6810μs 54.2028μs 18.4492 KOps/s 18.4555 KOps/s $\color{#d91a1a}-0.03\%$
test_values_stack_nested 94.2510μs 47.5883μs 21.0136 KOps/s 20.9630 KOps/s $\color{#35bf28}+0.24\%$
test_values_stack_nested_leaf 93.2620μs 54.3416μs 18.4021 KOps/s 18.4109 KOps/s $\color{#d91a1a}-0.05\%$
test_values_stack_nested_locked 0.1325ms 50.1099μs 19.9561 KOps/s 19.7002 KOps/s $\color{#35bf28}+1.30\%$
test_membership 11.8168μs 0.8633μs 1.1583 MOps/s 1.1848 MOps/s $\color{#d91a1a}-2.23\%$
test_membership_nested 31.8200μs 3.1531μs 317.1526 KOps/s 314.6553 KOps/s $\color{#35bf28}+0.79\%$
test_membership_nested_leaf 26.3600μs 3.1701μs 315.4464 KOps/s 314.9292 KOps/s $\color{#35bf28}+0.16\%$
test_membership_stacked_nested 41.7610μs 3.1961μs 312.8817 KOps/s 311.8411 KOps/s $\color{#35bf28}+0.33\%$
test_membership_stacked_nested_leaf 32.4410μs 3.2024μs 312.2706 KOps/s 310.9717 KOps/s $\color{#35bf28}+0.42\%$
test_membership_nested_last 33.5700μs 4.6278μs 216.0867 KOps/s 212.2262 KOps/s $\color{#35bf28}+1.82\%$
test_membership_nested_leaf_last 33.5000μs 4.6339μs 215.8022 KOps/s 212.8174 KOps/s $\color{#35bf28}+1.40\%$
test_membership_stacked_nested_last 46.8510μs 4.6284μs 216.0559 KOps/s 213.0321 KOps/s $\color{#35bf28}+1.42\%$
test_membership_stacked_nested_leaf_last 42.1610μs 4.6303μs 215.9678 KOps/s 210.7766 KOps/s $\color{#35bf28}+2.46\%$
test_nested_getleaf 52.2410μs 21.7358μs 46.0071 KOps/s 45.7895 KOps/s $\color{#35bf28}+0.48\%$
test_nested_get 48.1810μs 20.1031μs 49.7436 KOps/s 48.6377 KOps/s $\color{#35bf28}+2.27\%$
test_stacked_getleaf 51.3310μs 21.5575μs 46.3875 KOps/s 46.1158 KOps/s $\color{#35bf28}+0.59\%$
test_stacked_get 49.3510μs 20.6307μs 48.4715 KOps/s 47.4049 KOps/s $\color{#35bf28}+2.25\%$
test_nested_getitemleaf 89.7820μs 21.5987μs 46.2990 KOps/s 44.4674 KOps/s $\color{#35bf28}+4.12\%$
test_nested_getitem 60.7710μs 20.9220μs 47.7966 KOps/s 47.3372 KOps/s $\color{#35bf28}+0.97\%$
test_stacked_getitemleaf 54.2010μs 21.7717μs 45.9311 KOps/s 44.8127 KOps/s $\color{#35bf28}+2.50\%$
test_stacked_getitem 61.5410μs 20.8582μs 47.9428 KOps/s 47.0166 KOps/s $\color{#35bf28}+1.97\%$
test_lock_nested 7.7418ms 0.4781ms 2.0917 KOps/s 2.0997 KOps/s $\color{#d91a1a}-0.38\%$
test_lock_stack_nested 0.5782ms 0.4777ms 2.0933 KOps/s 2.0591 KOps/s $\color{#35bf28}+1.66\%$
test_unlock_nested 0.4861ms 0.3790ms 2.6386 KOps/s 2.6068 KOps/s $\color{#35bf28}+1.22\%$
test_unlock_stack_nested 0.4923ms 0.3824ms 2.6149 KOps/s 2.5587 KOps/s $\color{#35bf28}+2.20\%$
test_flatten_speed 0.1987ms 0.1221ms 8.1887 KOps/s 8.0733 KOps/s $\color{#35bf28}+1.43\%$
test_unflatten_speed 0.6312ms 0.5927ms 1.6873 KOps/s 1.6475 KOps/s $\color{#35bf28}+2.41\%$
test_common_ops 0.8029ms 0.6782ms 1.4745 KOps/s 1.4520 KOps/s $\color{#35bf28}+1.55\%$
test_creation 0.2025ms 2.9452μs 339.5368 KOps/s 343.7897 KOps/s $\color{#d91a1a}-1.24\%$
test_creation_empty 35.8410μs 6.1466μs 162.6908 KOps/s 162.0687 KOps/s $\color{#35bf28}+0.38\%$
test_creation_nested_1 35.7810μs 10.7669μs 92.8776 KOps/s 91.6629 KOps/s $\color{#35bf28}+1.33\%$
test_creation_nested_2 73.5910μs 11.8233μs 84.5789 KOps/s 84.3577 KOps/s $\color{#35bf28}+0.26\%$
test_creation_many_keys[10] 45.6410μs 18.2740μs 54.7227 KOps/s 54.7909 KOps/s $\color{#d91a1a}-0.12\%$
test_creation_many_keys[50] 0.1187ms 77.9019μs 12.8367 KOps/s 12.6842 KOps/s $\color{#35bf28}+1.20\%$
test_creation_many_keys[100] 0.1822ms 0.1522ms 6.5683 KOps/s 6.4817 KOps/s $\color{#35bf28}+1.34\%$
test_creation_nested_many_keys[10] 75.4020μs 39.2902μs 25.4517 KOps/s 25.4173 KOps/s $\color{#35bf28}+0.14\%$
test_creation_nested_many_keys[50] 0.1964ms 0.1594ms 6.2747 KOps/s 6.2383 KOps/s $\color{#35bf28}+0.58\%$
test_clone 44.3600μs 13.3148μs 75.1043 KOps/s 72.3906 KOps/s $\color{#35bf28}+3.75\%$
test_getitem[int] 1.6267ms 14.5622μs 68.6709 KOps/s 59.9894 KOps/s $\textbf{\color{#35bf28}+14.47\%}$
test_getitem[slice_int] 0.1414ms 25.4012μs 39.3682 KOps/s 39.0950 KOps/s $\color{#35bf28}+0.70\%$
test_getitem[range] 0.1800ms 61.6472μs 16.2213 KOps/s 15.9601 KOps/s $\color{#35bf28}+1.64\%$
test_getitem[tuple] 0.1428ms 24.7496μs 40.4047 KOps/s 40.6766 KOps/s $\color{#d91a1a}-0.67\%$
test_getitem[list] 0.1921ms 57.0904μs 17.5161 KOps/s 17.3126 KOps/s $\color{#35bf28}+1.18\%$
test_setitem_dim[int] 48.3600μs 26.2116μs 38.1511 KOps/s 37.4252 KOps/s $\color{#35bf28}+1.94\%$
test_setitem_dim[slice_int] 67.2510μs 44.8000μs 22.3214 KOps/s 22.0219 KOps/s $\color{#35bf28}+1.36\%$
test_setitem_dim[range] 0.1325ms 93.9189μs 10.6475 KOps/s 10.6252 KOps/s $\color{#35bf28}+0.21\%$
test_setitem_dim[tuple] 62.4610μs 41.7655μs 23.9432 KOps/s 23.5620 KOps/s $\color{#35bf28}+1.62\%$
test_setitem 56.0110μs 18.1592μs 55.0685 KOps/s 53.3905 KOps/s $\color{#35bf28}+3.14\%$
test_set 61.7810μs 17.0090μs 58.7925 KOps/s 55.8208 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_set_shared 0.4988ms 0.2076ms 4.8162 KOps/s 4.8221 KOps/s $\color{#d91a1a}-0.12\%$
test_update 0.3734ms 22.5558μs 44.3345 KOps/s 43.8610 KOps/s $\color{#35bf28}+1.08\%$
test_update_nested 75.0410μs 34.7768μs 28.7548 KOps/s 28.0326 KOps/s $\color{#35bf28}+2.58\%$
test_update__nested 0.4623ms 34.8749μs 28.6739 KOps/s 27.5375 KOps/s $\color{#35bf28}+4.13\%$
test_set_nested 53.2310μs 19.1006μs 52.3543 KOps/s 50.5732 KOps/s $\color{#35bf28}+3.52\%$
test_set_nested_new 62.7710μs 24.3212μs 41.1164 KOps/s 39.6876 KOps/s $\color{#35bf28}+3.60\%$
test_select 77.6410μs 42.7179μs 23.4094 KOps/s 23.1380 KOps/s $\color{#35bf28}+1.17\%$
test_select_nested 0.1239ms 75.5640μs 13.2338 KOps/s 13.2843 KOps/s $\color{#d91a1a}-0.38\%$
test_exclude_nested 0.1416ms 97.5130μs 10.2550 KOps/s 10.1010 KOps/s $\color{#35bf28}+1.53\%$
test_empty[True] 0.5086ms 0.4375ms 2.2855 KOps/s 2.2607 KOps/s $\color{#35bf28}+1.10\%$
test_empty[False] 7.1900μs 1.3258μs 754.2791 KOps/s 757.3441 KOps/s $\color{#d91a1a}-0.40\%$
test_to 0.1017ms 71.1624μs 14.0524 KOps/s 13.1335 KOps/s $\textbf{\color{#35bf28}+7.00\%}$
test_to_nonblocking 0.1148ms 64.4632μs 15.5127 KOps/s 14.9826 KOps/s $\color{#35bf28}+3.54\%$
test_unbind_speed 0.3563ms 0.3280ms 3.0490 KOps/s 3.0276 KOps/s $\color{#35bf28}+0.71\%$
test_unbind_speed_stack0 0.3985ms 0.3241ms 3.0855 KOps/s 3.0524 KOps/s $\color{#35bf28}+1.08\%$
test_unbind_speed_stack1 0.1033s 0.9158ms 1.0920 KOps/s 1.1797 KOps/s $\textbf{\color{#d91a1a}-7.44\%}$
test_split 1.2225ms 1.1389ms 878.0329 Ops/s 883.4957 Ops/s $\color{#d91a1a}-0.62\%$
test_chunk 0.1030s 1.2127ms 824.5811 Ops/s 912.9976 Ops/s $\textbf{\color{#d91a1a}-9.68\%}$
test_to_cpu_blocking 28.4204ms 28.3334ms 35.2940 Ops/s 35.4372 Ops/s $\color{#d91a1a}-0.40\%$
test_to_cpu_global_sync 11.3438ms 11.2063ms 89.2354 Ops/s 87.3708 Ops/s $\color{#35bf28}+2.13\%$
test_to_cpu_event_sync 12.4547ms 12.2027ms 81.9490 Ops/s 80.6291 Ops/s $\color{#35bf28}+1.64\%$
test_to_cpu_default 12.4839ms 12.2623ms 81.5509 Ops/s 80.5745 Ops/s $\color{#35bf28}+1.21\%$
test_consolidate[False-None] 4.2464ms 4.0808ms 245.0526 Ops/s 243.3805 Ops/s $\color{#35bf28}+0.69\%$
test_consolidate[default-None] 2.1288ms 2.0192ms 495.2481 Ops/s 482.2397 Ops/s $\color{#35bf28}+2.70\%$
test_consolidate[reduce-overhead-None] 2.0573ms 1.9494ms 512.9901 Ops/s 504.3587 Ops/s $\color{#35bf28}+1.71\%$
test_consolidate_njt[False-None] 9.0339ms 8.6180ms 116.0359 Ops/s 117.4618 Ops/s $\color{#d91a1a}-1.21\%$
test_to[False-False-None] 2.2485ms 2.0750ms 481.9345 Ops/s 472.0930 Ops/s $\color{#35bf28}+2.08\%$
test_to[True-False-None] 2.1706ms 1.9171ms 521.6186 Ops/s 522.3086 Ops/s $\color{#d91a1a}-0.13\%$
test_to[within-False-None] 6.2734ms 6.0884ms 164.2465 Ops/s 164.1972 Ops/s $\color{#35bf28}+0.03\%$
test_to[True-default-None] 8.0055ms 7.7475ms 129.0734 Ops/s 131.3863 Ops/s $\color{#d91a1a}-1.76\%$
test_to_njt[False-False-None] 8.9005ms 8.5668ms 116.7292 Ops/s 115.1994 Ops/s $\color{#35bf28}+1.33\%$
test_to_njt[True-False-None] 7.1006ms 6.9267ms 144.3690 Ops/s 142.9813 Ops/s $\color{#35bf28}+0.97\%$
test_to_njt[within-False-None] 16.1117ms 15.4796ms 64.6012 Ops/s 63.4711 Ops/s $\color{#35bf28}+1.78\%$
test_creation[device0] 0.3869ms 0.1169ms 8.5513 KOps/s 8.5303 KOps/s $\color{#35bf28}+0.25\%$
test_creation_from_tensor 0.4203ms 0.1175ms 8.5142 KOps/s 8.6616 KOps/s $\color{#d91a1a}-1.70\%$
test_add_one[memmap_tensor0] 0.2097ms 6.5079μs 153.6596 KOps/s 145.8026 KOps/s $\textbf{\color{#35bf28}+5.39\%}$
test_contiguous[memmap_tensor0] 26.2800μs 0.6924μs 1.4443 MOps/s 2.1773 MOps/s $\textbf{\color{#d91a1a}-33.66\%}$
test_stack[memmap_tensor0] 35.1900μs 4.6728μs 214.0033 KOps/s 214.3699 KOps/s $\color{#d91a1a}-0.17\%$
test_memmaptd_index 1.0587ms 0.2623ms 3.8120 KOps/s 3.8442 KOps/s $\color{#d91a1a}-0.84\%$
test_memmaptd_index_astensor 0.5166ms 0.3584ms 2.7904 KOps/s 2.8338 KOps/s $\color{#d91a1a}-1.53\%$
test_memmaptd_index_op 0.7479ms 0.5996ms 1.6677 KOps/s 1.6390 KOps/s $\color{#35bf28}+1.75\%$
test_serialize_model 0.1403s 0.1366s 7.3221 Ops/s 7.2873 Ops/s $\color{#35bf28}+0.48\%$
test_serialize_model_pickle 1.3501s 1.2151s 0.8230 Ops/s 0.8357 Ops/s $\color{#d91a1a}-1.53\%$
test_serialize_weights 0.1375s 0.1350s 7.4064 Ops/s 7.3401 Ops/s $\color{#35bf28}+0.90\%$
test_serialize_weights_returnearly 0.3801s 94.2204ms 10.6134 Ops/s 14.2458 Ops/s $\textbf{\color{#d91a1a}-25.50\%}$
test_serialize_weights_pickle 1.3635s 1.2126s 0.8247 Ops/s 0.8232 Ops/s $\color{#35bf28}+0.18\%$
test_reshape_pytree 0.2312ms 33.4882μs 29.8613 KOps/s 30.0470 KOps/s $\color{#d91a1a}-0.62\%$
test_reshape_td 78.4520μs 44.2527μs 22.5975 KOps/s 22.4181 KOps/s $\color{#35bf28}+0.80\%$
test_view_pytree 0.2210ms 32.7771μs 30.5091 KOps/s 30.1352 KOps/s $\color{#35bf28}+1.24\%$
test_view_td 88.0710μs 52.1178μs 19.1873 KOps/s 18.8571 KOps/s $\color{#35bf28}+1.75\%$
test_unbind_pytree 0.2327ms 37.0252μs 27.0086 KOps/s 26.9573 KOps/s $\color{#35bf28}+0.19\%$
test_unbind_td 0.1373ms 48.1297μs 20.7772 KOps/s 20.1772 KOps/s $\color{#35bf28}+2.97\%$
test_split_pytree 0.2539ms 43.4082μs 23.0371 KOps/s 23.2688 KOps/s $\color{#d91a1a}-1.00\%$
test_split_td 0.1637ms 65.5897μs 15.2463 KOps/s 15.2849 KOps/s $\color{#d91a1a}-0.25\%$
test_add_pytree 0.1983ms 42.2851μs 23.6490 KOps/s 23.1334 KOps/s $\color{#35bf28}+2.23\%$
test_add_td 0.1594ms 52.9230μs 18.8954 KOps/s 18.4168 KOps/s $\color{#35bf28}+2.60\%$
test_compile_add_one_nested[tensordict-compile] 0.1901ms 0.1404ms 7.1200 KOps/s 6.6027 KOps/s $\textbf{\color{#35bf28}+7.84\%}$
test_compile_add_one_nested[tensordict-eager] 0.2735ms 0.1946ms 5.1380 KOps/s 5.1179 KOps/s $\color{#35bf28}+0.39\%$
test_compile_add_one_nested[pytree-compile] 0.1597ms 0.1094ms 9.1393 KOps/s 9.0529 KOps/s $\color{#35bf28}+0.95\%$
test_compile_add_one_nested[pytree-eager] 0.4386ms 0.1852ms 5.3988 KOps/s 5.4644 KOps/s $\color{#d91a1a}-1.20\%$
test_compile_copy_nested[tensordict-compile] 69.4820μs 31.4807μs 31.7655 KOps/s 27.3936 KOps/s $\textbf{\color{#35bf28}+15.96\%}$
test_compile_copy_nested[tensordict-eager] 98.6820μs 52.1287μs 19.1833 KOps/s 19.5848 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_copy_nested[pytree-compile] 51.8800μs 9.9452μs 100.5514 KOps/s 102.2964 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_copy_nested[pytree-eager] 0.4708ms 70.0124μs 14.2832 KOps/s 14.7194 KOps/s $\color{#d91a1a}-2.96\%$
test_compile_add_one_flat[tensordict-compile] 0.2485ms 0.1793ms 5.5759 KOps/s 5.2510 KOps/s $\textbf{\color{#35bf28}+6.19\%}$
test_compile_add_one_flat[tensordict-eager] 0.3808ms 0.2566ms 3.8976 KOps/s 3.7538 KOps/s $\color{#35bf28}+3.83\%$
test_compile_add_one_flat[tensorclass-compile] 0.1940ms 0.1192ms 8.3906 KOps/s 8.1248 KOps/s $\color{#35bf28}+3.27\%$
test_compile_add_one_flat[tensorclass-eager] 0.1261ms 69.5568μs 14.3767 KOps/s 14.0912 KOps/s $\color{#35bf28}+2.03\%$
test_compile_add_one_flat[pytree-compile] 0.2212ms 0.1600ms 6.2495 KOps/s 6.0679 KOps/s $\color{#35bf28}+2.99\%$
test_compile_add_one_flat[pytree-eager] 0.8920ms 0.5290ms 1.8904 KOps/s 1.8295 KOps/s $\color{#35bf28}+3.32\%$
test_compile_add_self_flat[tensordict-eager] 0.4589ms 0.3093ms 3.2328 KOps/s 3.0870 KOps/s $\color{#35bf28}+4.73\%$
test_compile_add_self_flat[tensordict-compile] 0.2403ms 0.1822ms 5.4877 KOps/s 5.2425 KOps/s $\color{#35bf28}+4.68\%$
test_compile_add_self_flat[tensorclass-eager] 0.1467ms 85.7217μs 11.6657 KOps/s 11.3902 KOps/s $\color{#35bf28}+2.42\%$
test_compile_add_self_flat[tensorclass-compile] 0.1823ms 0.1213ms 8.2468 KOps/s 8.0336 KOps/s $\color{#35bf28}+2.65\%$
test_compile_add_self_flat[pytree-eager] 0.7024ms 0.4375ms 2.2859 KOps/s 1.8322 KOps/s $\textbf{\color{#35bf28}+24.77\%}$
test_compile_add_self_flat[pytree-compile] 0.2230ms 0.1610ms 6.2100 KOps/s 6.0648 KOps/s $\color{#35bf28}+2.40\%$
test_compile_copy_flat[tensordict-compile] 0.1455ms 23.5729μs 42.4216 KOps/s 38.2469 KOps/s $\textbf{\color{#35bf28}+10.91\%}$
test_compile_copy_flat[tensordict-eager] 0.1064ms 41.6068μs 24.0345 KOps/s 23.7493 KOps/s $\color{#35bf28}+1.20\%$
test_compile_copy_flat[pytree-compile] 63.9210μs 11.0320μs 90.6450 KOps/s 93.4306 KOps/s $\color{#d91a1a}-2.98\%$
test_compile_copy_flat[pytree-eager] 0.4056ms 51.9494μs 19.2495 KOps/s 19.0931 KOps/s $\color{#35bf28}+0.82\%$
test_compile_assign_and_add[tensordict-compile] 2.0656ms 0.1835ms 5.4484 KOps/s 5.3656 KOps/s $\color{#35bf28}+1.54\%$
test_compile_assign_and_add[tensordict-eager] 3.4961ms 3.3131ms 301.8281 Ops/s 296.8796 Ops/s $\color{#35bf28}+1.67\%$
test_compile_assign_and_add[pytree-compile] 1.9831ms 0.1626ms 6.1507 KOps/s 5.9677 KOps/s $\color{#35bf28}+3.07\%$
test_compile_assign_and_add[pytree-eager] 2.9606ms 2.8013ms 356.9826 Ops/s 348.4601 Ops/s $\color{#35bf28}+2.45\%$
test_compile_indexing[tensor-tensordict-compile] 0.1707ms 0.1124ms 8.8966 KOps/s 8.6883 KOps/s $\color{#35bf28}+2.40\%$
test_compile_indexing[tensor-tensordict-eager] 0.3198ms 73.3438μs 13.6344 KOps/s 13.0560 KOps/s $\color{#35bf28}+4.43\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2390ms 0.1016ms 9.8465 KOps/s 10.1507 KOps/s $\color{#d91a1a}-3.00\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2640ms 45.4480μs 22.0032 KOps/s 21.4678 KOps/s $\color{#35bf28}+2.49\%$
test_compile_indexing[tensor-pytree-compile] 0.1554ms 0.1036ms 9.6570 KOps/s 10.0384 KOps/s $\color{#d91a1a}-3.80\%$
test_compile_indexing[tensor-pytree-eager] 0.2548ms 45.4154μs 22.0190 KOps/s 22.0784 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_indexing[slice-tensordict-compile] 0.1055ms 56.5880μs 17.6716 KOps/s 17.2491 KOps/s $\color{#35bf28}+2.45\%$
test_compile_indexing[slice-tensordict-eager] 0.2186ms 28.4199μs 35.1866 KOps/s 35.6141 KOps/s $\color{#d91a1a}-1.20\%$
test_compile_indexing[slice-tensorclass-compile] 0.1074ms 45.7301μs 21.8674 KOps/s 21.9837 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[slice-tensorclass-eager] 0.2602ms 22.5128μs 44.4191 KOps/s 44.4752 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_indexing[slice-pytree-compile] 83.4110μs 45.9758μs 21.7506 KOps/s 21.5551 KOps/s $\color{#35bf28}+0.91\%$
test_compile_indexing[slice-pytree-eager] 0.2715ms 22.5451μs 44.3555 KOps/s 44.3825 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_indexing[int-tensordict-compile] 0.1014ms 57.6009μs 17.3608 KOps/s 17.0898 KOps/s $\color{#35bf28}+1.59\%$
test_compile_indexing[int-tensordict-eager] 0.2354ms 28.3772μs 35.2396 KOps/s 35.4238 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_indexing[int-tensorclass-compile] 95.9610μs 46.4561μs 21.5257 KOps/s 21.5092 KOps/s $\color{#35bf28}+0.08\%$
test_compile_indexing[int-tensorclass-eager] 0.2602ms 22.5402μs 44.3652 KOps/s 44.2483 KOps/s $\color{#35bf28}+0.26\%$
test_compile_indexing[int-pytree-compile] 82.6820μs 46.0722μs 21.7051 KOps/s 20.8411 KOps/s $\color{#35bf28}+4.15\%$
test_compile_indexing[int-pytree-eager] 0.2560ms 22.7018μs 44.0495 KOps/s 44.1107 KOps/s $\color{#d91a1a}-0.14\%$
test_mod_add[eager] 95.8220μs 51.3667μs 19.4679 KOps/s 18.1954 KOps/s $\textbf{\color{#35bf28}+6.99\%}$
test_mod_add[compile] 0.1828ms 0.1049ms 9.5337 KOps/s 9.2834 KOps/s $\color{#35bf28}+2.70\%$
test_mod_add[compile-overhead] 0.2571ms 0.1480ms 6.7569 KOps/s 6.5492 KOps/s $\color{#35bf28}+3.17\%$
test_mod_wrap[eager] 0.3706ms 0.2953ms 3.3863 KOps/s 3.2830 KOps/s $\color{#35bf28}+3.15\%$
test_mod_wrap[compile] 0.4368ms 0.3627ms 2.7569 KOps/s 2.6855 KOps/s $\color{#35bf28}+2.66\%$
test_mod_wrap[compile-overhead] 7.2599ms 4.0084ms 249.4737 Ops/s 251.6971 Ops/s $\color{#d91a1a}-0.88\%$
test_mod_wrap_and_backward[eager] 1.7709ms 1.6331ms 612.3472 Ops/s 611.4631 Ops/s $\color{#35bf28}+0.14\%$
test_mod_wrap_and_backward[compile] 1.6393ms 1.5481ms 645.9625 Ops/s 674.6945 Ops/s $\color{#d91a1a}-4.26\%$
test_mod_wrap_and_backward[compile-overhead] 1.2934ms 0.9846ms 1.0156 KOps/s 1.1020 KOps/s $\textbf{\color{#d91a1a}-7.84\%}$
test_seq_add[eager] 0.2708ms 0.1580ms 6.3278 KOps/s 6.2103 KOps/s $\color{#35bf28}+1.89\%$
test_seq_add[compile] 0.1729ms 0.1158ms 8.6379 KOps/s 8.3691 KOps/s $\color{#35bf28}+3.21\%$
test_seq_add[compile-overhead] 0.2313ms 0.1550ms 6.4529 KOps/s 6.2774 KOps/s $\color{#35bf28}+2.80\%$
test_seq_wrap[eager] 0.6024ms 0.5292ms 1.8895 KOps/s 1.8660 KOps/s $\color{#35bf28}+1.26\%$
test_seq_wrap[compile] 0.4657ms 0.3816ms 2.6205 KOps/s 2.6672 KOps/s $\color{#d91a1a}-1.75\%$
test_seq_wrap[compile-overhead] 0.3306ms 0.2655ms 3.7658 KOps/s 3.7278 KOps/s $\color{#35bf28}+1.02\%$
test_func_call_runtime[False-eager] 0.9488ms 0.8491ms 1.1777 KOps/s 1.1687 KOps/s $\color{#35bf28}+0.77\%$
test_func_call_runtime[False-compile] 1.1123ms 0.9131ms 1.0951 KOps/s 1.0762 KOps/s $\color{#35bf28}+1.76\%$
test_func_call_runtime[False-compile-overhead] 0.6054ms 0.4601ms 2.1737 KOps/s 2.1401 KOps/s $\color{#35bf28}+1.57\%$
test_func_call_runtime[True-eager] 1.2796ms 1.0896ms 917.7396 Ops/s 905.5765 Ops/s $\color{#35bf28}+1.34\%$
test_func_call_runtime[True-compile] 1.1100ms 0.9261ms 1.0798 KOps/s 1.0602 KOps/s $\color{#35bf28}+1.84\%$
test_func_call_runtime[True-compile-overhead] 0.6271ms 0.4759ms 2.1013 KOps/s 2.0595 KOps/s $\color{#35bf28}+2.03\%$
test_func_call_cm_runtime[False-eager] 0.9796ms 0.8530ms 1.1724 KOps/s 1.1721 KOps/s $\color{#35bf28}+0.02\%$
test_func_call_cm_runtime[False-compile] 1.1656ms 0.9161ms 1.0916 KOps/s 1.0704 KOps/s $\color{#35bf28}+1.98\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5575ms 0.4659ms 2.1462 KOps/s 2.1246 KOps/s $\color{#35bf28}+1.02\%$
test_func_call_cm_runtime[True-eager] 1.3689ms 1.2329ms 811.1066 Ops/s 796.7297 Ops/s $\color{#35bf28}+1.80\%$
test_func_call_cm_runtime[True-compile] 1.2319ms 0.9601ms 1.0415 KOps/s 1.0244 KOps/s $\color{#35bf28}+1.67\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5772ms 0.5085ms 1.9665 KOps/s 1.9288 KOps/s $\color{#35bf28}+1.95\%$
test_vmap_func_call_cm_runtime[eager] 2.8771ms 2.3754ms 420.9759 Ops/s 415.5628 Ops/s $\color{#35bf28}+1.30\%$
test_vmap_func_call_cm_runtime[compile] 1.4423ms 0.9765ms 1.0240 KOps/s 1.0036 KOps/s $\color{#35bf28}+2.03\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6033ms 0.5179ms 1.9309 KOps/s 1.9071 KOps/s $\color{#35bf28}+1.25\%$
test_distributed 0.5562ms 0.1559ms 6.4151 KOps/s 6.0267 KOps/s $\textbf{\color{#35bf28}+6.44\%}$
test_tdmodule 56.9310μs 28.5873μs 34.9805 KOps/s 33.4947 KOps/s $\color{#35bf28}+4.44\%$
test_tdmodule_dispatch 76.8510μs 46.5002μs 21.5053 KOps/s 21.0842 KOps/s $\color{#35bf28}+2.00\%$
test_tdseq 49.5710μs 27.4797μs 36.3905 KOps/s 36.0686 KOps/s $\color{#35bf28}+0.89\%$
test_tdseq_dispatch 84.4710μs 48.4127μs 20.6557 KOps/s 20.3030 KOps/s $\color{#35bf28}+1.74\%$
test_instantiation_functorch 2.1572ms 2.0547ms 486.6906 Ops/s 479.1383 Ops/s $\color{#35bf28}+1.58\%$
test_exec_functorch 0.2484ms 0.1798ms 5.5609 KOps/s 5.4946 KOps/s $\color{#35bf28}+1.21\%$
test_exec_functional_call 0.2234ms 0.1629ms 6.1380 KOps/s 6.2068 KOps/s $\color{#d91a1a}-1.11\%$
test_exec_td_decorator 0.4444ms 0.2352ms 4.2524 KOps/s 4.1743 KOps/s $\color{#35bf28}+1.87\%$
test_vmap_mlp_speed_decorator[True-True] 1.0182ms 0.8242ms 1.2134 KOps/s 1.1933 KOps/s $\color{#35bf28}+1.68\%$
test_vmap_mlp_speed_decorator[True-False] 1.0022ms 0.8230ms 1.2150 KOps/s 1.1955 KOps/s $\color{#35bf28}+1.63\%$
test_vmap_mlp_speed_decorator[False-True] 0.9262ms 0.7124ms 1.4038 KOps/s 1.3828 KOps/s $\color{#35bf28}+1.51\%$
test_vmap_mlp_speed_decorator[False-False] 0.9131ms 0.7168ms 1.3950 KOps/s 1.3798 KOps/s $\color{#35bf28}+1.10\%$
test_vmap_transformer_speed_decorator[True-True] 21.4903ms 20.6609ms 48.4006 Ops/s 48.1414 Ops/s $\color{#35bf28}+0.54\%$
test_vmap_transformer_speed_decorator[True-False] 21.2953ms 20.6013ms 48.5406 Ops/s 48.1103 Ops/s $\color{#35bf28}+0.89\%$
test_vmap_transformer_speed_decorator[False-True] 21.0924ms 20.3711ms 49.0892 Ops/s 48.6067 Ops/s $\color{#35bf28}+0.99\%$
test_vmap_transformer_speed_decorator[False-False] 20.9951ms 20.3883ms 49.0478 Ops/s 48.5230 Ops/s $\color{#35bf28}+1.08\%$
test_to_module_speed[True] 2.0448ms 1.4747ms 678.0883 Ops/s 669.6816 Ops/s $\color{#35bf28}+1.26\%$
test_to_module_speed[False] 1.9644ms 1.4464ms 691.3759 Ops/s 686.8315 Ops/s $\color{#35bf28}+0.66\%$
test_tc_init 0.1130ms 46.3185μs 21.5896 KOps/s 21.3095 KOps/s $\color{#35bf28}+1.31\%$
test_tc_init_tensor_only 0.4200ms 9.8377μs 101.6498 KOps/s 100.3485 KOps/s $\color{#35bf28}+1.30\%$
test_tc_init_nested 0.1386ms 92.0386μs 10.8650 KOps/s 10.7582 KOps/s $\color{#35bf28}+0.99\%$
test_tc_init_many_fields 68.3610μs 16.5271μs 60.5065 KOps/s 59.2717 KOps/s $\color{#35bf28}+2.08\%$
test_tc_first_layer_tensor 27.1810μs 1.8496μs 540.6550 KOps/s 531.7546 KOps/s $\color{#35bf28}+1.67\%$
test_tc_first_layer_tensor_only 8.5301μs 0.7569μs 1.3212 MOps/s 1.3097 MOps/s $\color{#35bf28}+0.87\%$
test_tc_first_layer_tensor_set 26.0300μs 4.2018μs 237.9954 KOps/s 229.9180 KOps/s $\color{#35bf28}+3.51\%$
test_tc_first_layer_tensor_only_set 48.6600μs 3.2022μs 312.2824 KOps/s 312.0821 KOps/s $\color{#35bf28}+0.06\%$
test_tc_first_layer_nontensor 43.6500μs 6.2149μs 160.9037 KOps/s 158.0249 KOps/s $\color{#35bf28}+1.82\%$
test_tc_second_layer_tensor 65.1710μs 4.5138μs 221.5408 KOps/s 222.4747 KOps/s $\color{#d91a1a}-0.42\%$
test_tc_second_layer_nontensor 41.0400μs 8.8609μs 112.8558 KOps/s 111.1884 KOps/s $\color{#35bf28}+1.50\%$
test_unbind 0.2597s 16.0434ms 62.3309 Ops/s 57.0841 Ops/s $\textbf{\color{#35bf28}+9.19\%}$
test_full_like 4.8676ms 4.3731ms 228.6730 Ops/s 227.8673 Ops/s $\color{#35bf28}+0.35\%$
test_zeros_like 4.8571ms 4.3647ms 229.1087 Ops/s 229.3249 Ops/s $\color{#d91a1a}-0.09\%$
test_ones_like 4.7639ms 4.3706ms 228.8004 Ops/s 228.7921 Ops/s $+0.00\%$
test_clone 6.6302ms 6.4414ms 155.2457 Ops/s 155.3133 Ops/s $\color{#d91a1a}-0.04\%$
test_squeeze 69.8310μs 15.7133μs 63.6403 KOps/s 70.0136 KOps/s $\textbf{\color{#d91a1a}-9.10\%}$
test_unsqueeze 0.5526ms 0.1132ms 8.8336 KOps/s 9.1136 KOps/s $\color{#d91a1a}-3.07\%$
test_split 0.3666ms 0.1910ms 5.2346 KOps/s 5.4595 KOps/s $\color{#d91a1a}-4.12\%$
test_permute 0.7832ms 0.2111ms 4.7377 KOps/s 4.9190 KOps/s $\color{#d91a1a}-3.68\%$
test_stack 51.6454ms 51.2433ms 19.5147 Ops/s 19.5604 Ops/s $\color{#d91a1a}-0.23\%$
test_cat 51.4113ms 51.2002ms 19.5312 Ops/s 19.5760 Ops/s $\color{#d91a1a}-0.23\%$

@github-actions

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 243. Improved: 9. Worsened: 5.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.1020μs 14.8039μs 67.5499 KOps/s 66.3922 KOps/s $\color{#35bf28}+1.74\%$
test_plain_set_stack_nested 39.9420μs 15.1255μs 66.1135 KOps/s 65.4059 KOps/s $\color{#35bf28}+1.08\%$
test_plain_set_nested_inplace 41.2020μs 16.5429μs 60.4491 KOps/s 59.1621 KOps/s $\color{#35bf28}+2.18\%$
test_plain_set_stack_nested_inplace 41.2020μs 16.4431μs 60.8157 KOps/s 59.2763 KOps/s $\color{#35bf28}+2.60\%$
test_items 29.1520μs 5.7094μs 175.1488 KOps/s 172.2860 KOps/s $\color{#35bf28}+1.66\%$
test_items_nested 0.6139ms 0.5323ms 1.8787 KOps/s 1.8620 KOps/s $\color{#35bf28}+0.90\%$
test_items_nested_locked 0.5969ms 0.5368ms 1.8628 KOps/s 1.8523 KOps/s $\color{#35bf28}+0.56\%$
test_items_nested_leaf 0.1502ms 95.6673μs 10.4529 KOps/s 10.4474 KOps/s $\color{#35bf28}+0.05\%$
test_items_stack_nested 0.6205ms 0.5368ms 1.8629 KOps/s 1.8653 KOps/s $\color{#d91a1a}-0.13\%$
test_items_stack_nested_leaf 0.1529ms 95.7700μs 10.4417 KOps/s 10.0847 KOps/s $\color{#35bf28}+3.54\%$
test_items_stack_nested_locked 0.5925ms 0.5401ms 1.8514 KOps/s 1.8267 KOps/s $\color{#35bf28}+1.35\%$
test_keys 30.5520μs 4.2016μs 238.0046 KOps/s 238.7535 KOps/s $\color{#d91a1a}-0.31\%$
test_keys_nested 0.1642ms 0.1190ms 8.4022 KOps/s 8.2091 KOps/s $\color{#35bf28}+2.35\%$
test_keys_nested_locked 89.0343ms 0.1393ms 7.1799 KOps/s 7.5516 KOps/s $\color{#d91a1a}-4.92\%$
test_keys_nested_leaf 0.1511ms 0.1097ms 9.1161 KOps/s 8.8741 KOps/s $\color{#35bf28}+2.73\%$
test_keys_stack_nested 0.1647ms 0.1193ms 8.3823 KOps/s 8.2176 KOps/s $\color{#35bf28}+2.00\%$
test_keys_stack_nested_leaf 0.1402ms 0.1099ms 9.1032 KOps/s 8.9258 KOps/s $\color{#35bf28}+1.99\%$
test_keys_stack_nested_locked 0.1603ms 0.1272ms 7.8635 KOps/s 7.6686 KOps/s $\color{#35bf28}+2.54\%$
test_values 7.0924μs 1.0109μs 989.2086 KOps/s 976.6508 KOps/s $\color{#35bf28}+1.29\%$
test_values_nested 75.9940μs 47.6666μs 20.9791 KOps/s 20.7713 KOps/s $\color{#35bf28}+1.00\%$
test_values_nested_locked 79.3940μs 50.2585μs 19.8971 KOps/s 19.4220 KOps/s $\color{#35bf28}+2.45\%$
test_values_nested_leaf 85.2940μs 53.9334μs 18.5414 KOps/s 18.2193 KOps/s $\color{#35bf28}+1.77\%$
test_values_stack_nested 1.0662ms 47.3207μs 21.1324 KOps/s 20.8430 KOps/s $\color{#35bf28}+1.39\%$
test_values_stack_nested_leaf 80.8040μs 53.9359μs 18.5405 KOps/s 18.3605 KOps/s $\color{#35bf28}+0.98\%$
test_values_stack_nested_locked 87.5840μs 50.6828μs 19.7305 KOps/s 19.6024 KOps/s $\color{#35bf28}+0.65\%$
test_membership 5.4953μs 0.8499μs 1.1766 MOps/s 1.1984 MOps/s $\color{#d91a1a}-1.81\%$
test_membership_nested 32.2520μs 3.1503μs 317.4257 KOps/s 317.9781 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_nested_leaf 41.4120μs 3.1555μs 316.9096 KOps/s 317.5069 KOps/s $\color{#d91a1a}-0.19\%$
test_membership_stacked_nested 28.1010μs 3.2004μs 312.4568 KOps/s 315.6988 KOps/s $\color{#d91a1a}-1.03\%$
test_membership_stacked_nested_leaf 34.3610μs 3.1833μs 314.1415 KOps/s 315.2285 KOps/s $\color{#d91a1a}-0.34\%$
test_membership_nested_last 33.3920μs 4.5922μs 217.7585 KOps/s 216.4191 KOps/s $\color{#35bf28}+0.62\%$
test_membership_nested_leaf_last 32.4720μs 4.6158μs 216.6486 KOps/s 220.3237 KOps/s $\color{#d91a1a}-1.67\%$
test_membership_stacked_nested_last 35.9120μs 4.6508μs 215.0173 KOps/s 219.0620 KOps/s $\color{#d91a1a}-1.85\%$
test_membership_stacked_nested_leaf_last 35.5620μs 4.5575μs 219.4168 KOps/s 218.7080 KOps/s $\color{#35bf28}+0.32\%$
test_nested_getleaf 54.5530μs 21.7188μs 46.0431 KOps/s 45.8114 KOps/s $\color{#35bf28}+0.51\%$
test_nested_get 49.4930μs 20.2324μs 49.4258 KOps/s 48.8635 KOps/s $\color{#35bf28}+1.15\%$
test_stacked_getleaf 55.7530μs 21.3837μs 46.7647 KOps/s 46.4175 KOps/s $\color{#35bf28}+0.75\%$
test_stacked_get 43.1720μs 20.6730μs 48.3722 KOps/s 48.7358 KOps/s $\color{#d91a1a}-0.75\%$
test_nested_getitemleaf 50.9230μs 21.8250μs 45.8191 KOps/s 44.8081 KOps/s $\color{#35bf28}+2.26\%$
test_nested_getitem 49.2620μs 21.0831μs 47.4313 KOps/s 47.9748 KOps/s $\color{#d91a1a}-1.13\%$
test_stacked_getitemleaf 46.1520μs 22.2885μs 44.8661 KOps/s 44.6298 KOps/s $\color{#35bf28}+0.53\%$
test_stacked_getitem 50.9520μs 21.2884μs 46.9740 KOps/s 47.2890 KOps/s $\color{#d91a1a}-0.67\%$
test_lock_nested 7.5117ms 0.4830ms 2.0705 KOps/s 2.1019 KOps/s $\color{#d91a1a}-1.49\%$
test_lock_stack_nested 0.5230ms 0.4818ms 2.0755 KOps/s 2.0548 KOps/s $\color{#35bf28}+1.01\%$
test_unlock_nested 0.4534ms 0.3855ms 2.5938 KOps/s 2.5932 KOps/s $\color{#35bf28}+0.02\%$
test_unlock_stack_nested 0.4251ms 0.3870ms 2.5839 KOps/s 2.5585 KOps/s $\color{#35bf28}+1.00\%$
test_flatten_speed 0.1595ms 0.1219ms 8.2051 KOps/s 8.1081 KOps/s $\color{#35bf28}+1.20\%$
test_unflatten_speed 0.6509ms 0.5939ms 1.6837 KOps/s 1.6719 KOps/s $\color{#35bf28}+0.71\%$
test_common_ops 0.8080ms 0.6812ms 1.4680 KOps/s 1.4651 KOps/s $\color{#35bf28}+0.20\%$
test_creation 72.3640μs 2.9212μs 342.3281 KOps/s 343.7786 KOps/s $\color{#d91a1a}-0.42\%$
test_creation_empty 28.9420μs 6.1357μs 162.9814 KOps/s 161.8800 KOps/s $\color{#35bf28}+0.68\%$
test_creation_nested_1 36.6620μs 10.8540μs 92.1318 KOps/s 91.9630 KOps/s $\color{#35bf28}+0.18\%$
test_creation_nested_2 50.4030μs 11.9434μs 83.7285 KOps/s 83.8562 KOps/s $\color{#d91a1a}-0.15\%$
test_creation_many_keys[10] 47.3520μs 17.9759μs 55.6302 KOps/s 54.8379 KOps/s $\color{#35bf28}+1.44\%$
test_creation_many_keys[50] 0.1300ms 77.5863μs 12.8889 KOps/s 12.8414 KOps/s $\color{#35bf28}+0.37\%$
test_creation_many_keys[100] 0.2314ms 0.1534ms 6.5184 KOps/s 6.5556 KOps/s $\color{#d91a1a}-0.57\%$
test_creation_nested_many_keys[10] 72.7640μs 39.2096μs 25.5039 KOps/s 25.7431 KOps/s $\color{#d91a1a}-0.93\%$
test_creation_nested_many_keys[50] 0.2009ms 0.1593ms 6.2758 KOps/s 6.2821 KOps/s $\color{#d91a1a}-0.10\%$
test_clone 40.2820μs 13.3023μs 75.1749 KOps/s 76.0248 KOps/s $\color{#d91a1a}-1.12\%$
test_getitem[int] 1.6991ms 14.7760μs 67.6771 KOps/s 60.7592 KOps/s $\textbf{\color{#35bf28}+11.39\%}$
test_getitem[slice_int] 0.1439ms 25.2344μs 39.6284 KOps/s 38.0171 KOps/s $\color{#35bf28}+4.24\%$
test_getitem[range] 0.1727ms 61.3349μs 16.3039 KOps/s 15.4026 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_getitem[tuple] 0.1502ms 24.5988μs 40.6525 KOps/s 39.1156 KOps/s $\color{#35bf28}+3.93\%$
test_getitem[list] 0.1825ms 56.7717μs 17.6144 KOps/s 16.9334 KOps/s $\color{#35bf28}+4.02\%$
test_setitem_dim[int] 53.7230μs 26.3676μs 37.9254 KOps/s 36.9508 KOps/s $\color{#35bf28}+2.64\%$
test_setitem_dim[slice_int] 66.4440μs 43.8949μs 22.7817 KOps/s 21.2756 KOps/s $\textbf{\color{#35bf28}+7.08\%}$
test_setitem_dim[range] 0.1226ms 93.8977μs 10.6499 KOps/s 10.7048 KOps/s $\color{#d91a1a}-0.51\%$
test_setitem_dim[tuple] 68.4130μs 41.0424μs 24.3650 KOps/s 24.1905 KOps/s $\color{#35bf28}+0.72\%$
test_setitem 46.9920μs 17.6954μs 56.5119 KOps/s 55.7334 KOps/s $\color{#35bf28}+1.40\%$
test_set 52.1630μs 17.1859μs 58.1871 KOps/s 58.1212 KOps/s $\color{#35bf28}+0.11\%$
test_set_shared 0.5418ms 0.2042ms 4.8980 KOps/s 4.7682 KOps/s $\color{#35bf28}+2.72\%$
test_update 0.3871ms 22.4336μs 44.5761 KOps/s 45.2067 KOps/s $\color{#d91a1a}-1.40\%$
test_update_nested 71.0330μs 34.6431μs 28.8658 KOps/s 28.8527 KOps/s $\color{#35bf28}+0.05\%$
test_update__nested 0.4741ms 34.3347μs 29.1251 KOps/s 28.7162 KOps/s $\color{#35bf28}+1.42\%$
test_set_nested 56.3830μs 19.0132μs 52.5949 KOps/s 52.2120 KOps/s $\color{#35bf28}+0.73\%$
test_set_nested_new 68.9630μs 23.9491μs 41.7552 KOps/s 41.5631 KOps/s $\color{#35bf28}+0.46\%$
test_select 73.1830μs 42.0124μs 23.8025 KOps/s 23.8242 KOps/s $\color{#d91a1a}-0.09\%$
test_select_nested 0.1171ms 75.5933μs 13.2287 KOps/s 13.4230 KOps/s $\color{#d91a1a}-1.45\%$
test_exclude_nested 0.1282ms 97.2427μs 10.2835 KOps/s 10.2995 KOps/s $\color{#d91a1a}-0.16\%$
test_empty[True] 0.4801ms 0.4383ms 2.2815 KOps/s 2.2811 KOps/s $\color{#35bf28}+0.02\%$
test_empty[False] 8.1205μs 1.3211μs 756.9307 KOps/s 755.5254 KOps/s $\color{#35bf28}+0.19\%$
test_to 0.1028ms 72.5689μs 13.7800 KOps/s 13.7452 KOps/s $\color{#35bf28}+0.25\%$
test_to_nonblocking 0.1168ms 64.1186μs 15.5961 KOps/s 15.4219 KOps/s $\color{#35bf28}+1.13\%$
test_unbind_speed 0.3829ms 0.3307ms 3.0242 KOps/s 3.0912 KOps/s $\color{#d91a1a}-2.17\%$
test_unbind_speed_stack0 0.4087ms 0.3297ms 3.0329 KOps/s 3.0853 KOps/s $\color{#d91a1a}-1.70\%$
test_unbind_speed_stack1 0.1036s 0.8423ms 1.1872 KOps/s 1.1822 KOps/s $\color{#35bf28}+0.42\%$
test_split 0.1030s 1.2798ms 781.3985 Ops/s 789.0487 Ops/s $\color{#d91a1a}-0.97\%$
test_chunk 0.1028s 1.2174ms 821.4535 Ops/s 923.9799 Ops/s $\textbf{\color{#d91a1a}-11.10\%}$
test_to_cpu_blocking 19.4532ms 19.3358ms 51.7174 Ops/s 47.2084 Ops/s $\textbf{\color{#35bf28}+9.55\%}$
test_to_cpu_global_sync 11.3406ms 11.2148ms 89.1678 Ops/s 90.4132 Ops/s $\color{#d91a1a}-1.38\%$
test_to_cpu_event_sync 12.3965ms 12.1800ms 82.1017 Ops/s 82.9855 Ops/s $\color{#d91a1a}-1.07\%$
test_to_cpu_default 0.1145s 13.4246ms 74.4901 Ops/s 83.1621 Ops/s $\textbf{\color{#d91a1a}-10.43\%}$
test_consolidate[False-None] 4.3190ms 4.1138ms 243.0860 Ops/s 220.3182 Ops/s $\textbf{\color{#35bf28}+10.33\%}$
test_consolidate[default-None] 2.1310ms 2.0325ms 492.0016 Ops/s 484.9602 Ops/s $\color{#35bf28}+1.45\%$
test_consolidate[reduce-overhead-None] 2.0454ms 1.9552ms 511.4439 Ops/s 506.2937 Ops/s $\color{#35bf28}+1.02\%$
test_consolidate_njt[False-None] 8.6283ms 8.4527ms 118.3055 Ops/s 118.5736 Ops/s $\color{#d91a1a}-0.23\%$
test_to[False-False-None] 2.1667ms 2.0652ms 484.2058 Ops/s 486.7050 Ops/s $\color{#d91a1a}-0.51\%$
test_to[True-False-None] 2.1575ms 1.9097ms 523.6349 Ops/s 511.7328 Ops/s $\color{#35bf28}+2.33\%$
test_to[within-False-None] 6.3219ms 6.0998ms 163.9401 Ops/s 164.1838 Ops/s $\color{#d91a1a}-0.15\%$
test_to[True-default-None] 0.1769s 8.8406ms 113.1139 Ops/s 125.0978 Ops/s $\textbf{\color{#d91a1a}-9.58\%}$
test_to_njt[False-False-None] 8.8518ms 8.5440ms 117.0413 Ops/s 116.2744 Ops/s $\color{#35bf28}+0.66\%$
test_to_njt[True-False-None] 7.2936ms 6.9321ms 144.2560 Ops/s 141.9425 Ops/s $\color{#35bf28}+1.63\%$
test_to_njt[within-False-None] 16.3893ms 15.4682ms 64.6486 Ops/s 63.6290 Ops/s $\color{#35bf28}+1.60\%$
test_creation[device0] 0.2550ms 0.1161ms 8.6099 KOps/s 8.2013 KOps/s $\color{#35bf28}+4.98\%$
test_creation_from_tensor 0.3603ms 0.1162ms 8.6043 KOps/s 8.5467 KOps/s $\color{#35bf28}+0.67\%$
test_add_one[memmap_tensor0] 0.2791ms 6.5394μs 152.9196 KOps/s 152.1972 KOps/s $\color{#35bf28}+0.47\%$
test_contiguous[memmap_tensor0] 15.8500μs 0.6535μs 1.5303 MOps/s 2.1397 MOps/s $\textbf{\color{#d91a1a}-28.48\%}$
test_stack[memmap_tensor0] 60.5340μs 4.6228μs 216.3198 KOps/s 221.9371 KOps/s $\color{#d91a1a}-2.53\%$
test_memmaptd_index 1.0250ms 0.2637ms 3.7924 KOps/s 3.7694 KOps/s $\color{#35bf28}+0.61\%$
test_memmaptd_index_astensor 0.5156ms 0.3610ms 2.7704 KOps/s 2.8022 KOps/s $\color{#d91a1a}-1.14\%$
test_memmaptd_index_op 0.9048ms 0.6048ms 1.6534 KOps/s 1.6526 KOps/s $\color{#35bf28}+0.05\%$
test_serialize_model 0.1404s 0.1358s 7.3625 Ops/s 7.3157 Ops/s $\color{#35bf28}+0.64\%$
test_serialize_model_pickle 1.3484s 1.2098s 0.8266 Ops/s 0.8260 Ops/s $\color{#35bf28}+0.07\%$
test_serialize_weights 0.1386s 0.1355s 7.3793 Ops/s 7.3625 Ops/s $\color{#35bf28}+0.23\%$
test_serialize_weights_returnearly 0.4051s 91.0123ms 10.9875 Ops/s 10.3534 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_serialize_weights_pickle 1.3584s 1.2145s 0.8234 Ops/s 0.8210 Ops/s $\color{#35bf28}+0.29\%$
test_reshape_pytree 0.2068ms 33.6047μs 29.7578 KOps/s 30.0986 KOps/s $\color{#d91a1a}-1.13\%$
test_reshape_td 89.6550μs 45.1162μs 22.1650 KOps/s 22.6971 KOps/s $\color{#d91a1a}-2.34\%$
test_view_pytree 0.2398ms 33.2681μs 30.0588 KOps/s 30.3642 KOps/s $\color{#d91a1a}-1.01\%$
test_view_td 92.7950μs 52.8714μs 18.9138 KOps/s 19.0357 KOps/s $\color{#d91a1a}-0.64\%$
test_unbind_pytree 0.2360ms 37.1907μs 26.8885 KOps/s 27.0006 KOps/s $\color{#d91a1a}-0.42\%$
test_unbind_td 0.1309ms 49.2682μs 20.2971 KOps/s 20.3649 KOps/s $\color{#d91a1a}-0.33\%$
test_split_pytree 0.2592ms 43.0572μs 23.2249 KOps/s 23.3264 KOps/s $\color{#d91a1a}-0.43\%$
test_split_td 0.2223ms 66.7872μs 14.9729 KOps/s 15.3116 KOps/s $\color{#d91a1a}-2.21\%$
test_add_pytree 0.2366ms 42.7689μs 23.3815 KOps/s 23.9051 KOps/s $\color{#d91a1a}-2.19\%$
test_add_td 99.3050μs 53.7840μs 18.5929 KOps/s 18.5222 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_nested[tensordict-compile] 0.1917ms 0.1388ms 7.2046 KOps/s 6.9835 KOps/s $\color{#35bf28}+3.17\%$
test_compile_add_one_nested[tensordict-eager] 0.6047ms 0.1930ms 5.1811 KOps/s 5.1779 KOps/s $\color{#35bf28}+0.06\%$
test_compile_add_one_nested[pytree-compile] 0.1721ms 0.1087ms 9.1966 KOps/s 9.0752 KOps/s $\color{#35bf28}+1.34\%$
test_compile_add_one_nested[pytree-eager] 0.6122ms 0.1828ms 5.4695 KOps/s 5.5508 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_copy_nested[tensordict-compile] 68.3140μs 30.0555μs 33.2718 KOps/s 32.8376 KOps/s $\color{#35bf28}+1.32\%$
test_compile_copy_nested[tensordict-eager] 0.1268ms 52.3918μs 19.0870 KOps/s 18.6722 KOps/s $\color{#35bf28}+2.22\%$
test_compile_copy_nested[pytree-compile] 41.8320μs 9.7931μs 102.1126 KOps/s 105.0910 KOps/s $\color{#d91a1a}-2.83\%$
test_compile_copy_nested[pytree-eager] 0.4550ms 70.2780μs 14.2292 KOps/s 14.4065 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_add_one_flat[tensordict-compile] 0.2763ms 0.1760ms 5.6818 KOps/s 5.4825 KOps/s $\color{#35bf28}+3.64\%$
test_compile_add_one_flat[tensordict-eager] 0.3201ms 0.2528ms 3.9550 KOps/s 3.9086 KOps/s $\color{#35bf28}+1.19\%$
test_compile_add_one_flat[tensorclass-compile] 0.2501ms 0.1158ms 8.6392 KOps/s 8.3385 KOps/s $\color{#35bf28}+3.61\%$
test_compile_add_one_flat[tensorclass-eager] 0.1277ms 68.5560μs 14.5866 KOps/s 14.2765 KOps/s $\color{#35bf28}+2.17\%$
test_compile_add_one_flat[pytree-compile] 0.2034ms 0.1588ms 6.2973 KOps/s 6.1230 KOps/s $\color{#35bf28}+2.85\%$
test_compile_add_one_flat[pytree-eager] 0.8311ms 0.5428ms 1.8423 KOps/s 1.8783 KOps/s $\color{#d91a1a}-1.92\%$
test_compile_add_self_flat[tensordict-eager] 0.3588ms 0.3059ms 3.2687 KOps/s 3.1835 KOps/s $\color{#35bf28}+2.68\%$
test_compile_add_self_flat[tensordict-compile] 0.2332ms 0.1797ms 5.5650 KOps/s 5.1982 KOps/s $\textbf{\color{#35bf28}+7.06\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1293ms 84.9662μs 11.7694 KOps/s 11.5265 KOps/s $\color{#35bf28}+2.11\%$
test_compile_add_self_flat[tensorclass-compile] 0.1691ms 0.1181ms 8.4644 KOps/s 7.8869 KOps/s $\textbf{\color{#35bf28}+7.32\%}$
test_compile_add_self_flat[pytree-eager] 0.6600ms 0.4435ms 2.2549 KOps/s 2.3019 KOps/s $\color{#d91a1a}-2.04\%$
test_compile_add_self_flat[pytree-compile] 0.1999ms 0.1596ms 6.2672 KOps/s 6.1310 KOps/s $\color{#35bf28}+2.22\%$
test_compile_copy_flat[tensordict-compile] 53.8430μs 24.3683μs 41.0369 KOps/s 41.2146 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_copy_flat[tensordict-eager] 0.1393ms 41.5300μs 24.0790 KOps/s 24.2078 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_copy_flat[pytree-compile] 36.7120μs 10.9684μs 91.1711 KOps/s 90.0701 KOps/s $\color{#35bf28}+1.22\%$
test_compile_copy_flat[pytree-eager] 0.4169ms 52.9486μs 18.8862 KOps/s 19.2205 KOps/s $\color{#d91a1a}-1.74\%$
test_compile_assign_and_add[tensordict-compile] 1.9986ms 0.1744ms 5.7348 KOps/s 5.5009 KOps/s $\color{#35bf28}+4.25\%$
test_compile_assign_and_add[tensordict-eager] 3.4202ms 3.3156ms 301.6059 Ops/s 301.9230 Ops/s $\color{#d91a1a}-0.11\%$
test_compile_assign_and_add[pytree-compile] 1.9817ms 0.1623ms 6.1608 KOps/s 6.1002 KOps/s $\color{#35bf28}+0.99\%$
test_compile_assign_and_add[pytree-eager] 2.9829ms 2.8232ms 354.2037 Ops/s 359.0176 Ops/s $\color{#d91a1a}-1.34\%$
test_compile_indexing[tensor-tensordict-compile] 0.1726ms 0.1078ms 9.2728 KOps/s 8.9912 KOps/s $\color{#35bf28}+3.13\%$
test_compile_indexing[tensor-tensordict-eager] 0.3160ms 73.0188μs 13.6951 KOps/s 13.6530 KOps/s $\color{#35bf28}+0.31\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2220ms 95.6873μs 10.4507 KOps/s 10.2449 KOps/s $\color{#35bf28}+2.01\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2570ms 45.4354μs 22.0093 KOps/s 22.3016 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[tensor-pytree-compile] 0.1415ms 96.7035μs 10.3409 KOps/s 10.1937 KOps/s $\color{#35bf28}+1.44\%$
test_compile_indexing[tensor-pytree-eager] 0.2560ms 45.0586μs 22.1933 KOps/s 22.2557 KOps/s $\color{#d91a1a}-0.28\%$
test_compile_indexing[slice-tensordict-compile] 94.5450μs 55.3631μs 18.0626 KOps/s 17.5316 KOps/s $\color{#35bf28}+3.03\%$
test_compile_indexing[slice-tensordict-eager] 0.2266ms 28.3421μs 35.2832 KOps/s 35.7393 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_indexing[slice-tensorclass-compile] 83.9650μs 45.2949μs 22.0775 KOps/s 21.9064 KOps/s $\color{#35bf28}+0.78\%$
test_compile_indexing[slice-tensorclass-eager] 0.2646ms 22.8489μs 43.7657 KOps/s 43.2980 KOps/s $\color{#35bf28}+1.08\%$
test_compile_indexing[slice-pytree-compile] 78.8240μs 45.9017μs 21.7857 KOps/s 21.7501 KOps/s $\color{#35bf28}+0.16\%$
test_compile_indexing[slice-pytree-eager] 0.2761ms 22.8375μs 43.7875 KOps/s 43.1896 KOps/s $\color{#35bf28}+1.38\%$
test_compile_indexing[int-tensordict-compile] 96.0650μs 57.2522μs 17.4666 KOps/s 17.4749 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[int-tensordict-eager] 0.2817ms 28.4499μs 35.1495 KOps/s 35.4883 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_indexing[int-tensorclass-compile] 85.5640μs 45.8102μs 21.8292 KOps/s 21.5160 KOps/s $\color{#35bf28}+1.46\%$
test_compile_indexing[int-tensorclass-eager] 0.2602ms 22.8504μs 43.7629 KOps/s 43.2590 KOps/s $\color{#35bf28}+1.16\%$
test_compile_indexing[int-pytree-compile] 95.4650μs 45.6022μs 21.9288 KOps/s 21.6005 KOps/s $\color{#35bf28}+1.52\%$
test_compile_indexing[int-pytree-eager] 0.2752ms 23.0035μs 43.4716 KOps/s 43.2933 KOps/s $\color{#35bf28}+0.41\%$
test_mod_add[eager] 97.8850μs 51.5056μs 19.4154 KOps/s 19.3960 KOps/s $\color{#35bf28}+0.10\%$
test_mod_add[compile] 0.1946ms 0.1047ms 9.5536 KOps/s 9.3685 KOps/s $\color{#35bf28}+1.98\%$
test_mod_add[compile-overhead] 0.2314ms 0.1461ms 6.8439 KOps/s 6.5885 KOps/s $\color{#35bf28}+3.88\%$
test_mod_wrap[eager] 0.3727ms 0.2930ms 3.4126 KOps/s 3.3929 KOps/s $\color{#35bf28}+0.58\%$
test_mod_wrap[compile] 0.4313ms 0.3552ms 2.8152 KOps/s 2.8396 KOps/s $\color{#d91a1a}-0.86\%$
test_mod_wrap[compile-overhead] 7.3632ms 4.0581ms 246.4193 Ops/s 247.5345 Ops/s $\color{#d91a1a}-0.45\%$
test_mod_wrap_and_backward[eager] 1.6182ms 1.4965ms 668.2378 Ops/s 649.7632 Ops/s $\color{#35bf28}+2.84\%$
test_mod_wrap_and_backward[compile] 1.6172ms 1.4531ms 688.1667 Ops/s 686.9249 Ops/s $\color{#35bf28}+0.18\%$
test_mod_wrap_and_backward[compile-overhead] 1.7191ms 0.8828ms 1.1328 KOps/s 1.1084 KOps/s $\color{#35bf28}+2.20\%$
test_seq_add[eager] 0.2233ms 0.1638ms 6.1055 KOps/s 5.9676 KOps/s $\color{#35bf28}+2.31\%$
test_seq_add[compile] 0.1760ms 0.1204ms 8.3065 KOps/s 8.3047 KOps/s $\color{#35bf28}+0.02\%$
test_seq_add[compile-overhead] 0.2127ms 0.1590ms 6.2902 KOps/s 6.2783 KOps/s $\color{#35bf28}+0.19\%$
test_seq_wrap[eager] 0.6301ms 0.5572ms 1.7948 KOps/s 1.8262 KOps/s $\color{#d91a1a}-1.72\%$
test_seq_wrap[compile] 0.4701ms 0.3811ms 2.6243 KOps/s 2.6609 KOps/s $\color{#d91a1a}-1.38\%$
test_seq_wrap[compile-overhead] 0.3362ms 0.2729ms 3.6637 KOps/s 3.7405 KOps/s $\color{#d91a1a}-2.05\%$
test_func_call_runtime[False-eager] 0.9755ms 0.8786ms 1.1382 KOps/s 1.1807 KOps/s $\color{#d91a1a}-3.59\%$
test_func_call_runtime[False-compile] 1.0436ms 0.9406ms 1.0632 KOps/s 1.0879 KOps/s $\color{#d91a1a}-2.27\%$
test_func_call_runtime[False-compile-overhead] 0.5135ms 0.4560ms 2.1929 KOps/s 2.1493 KOps/s $\color{#35bf28}+2.03\%$
test_func_call_runtime[True-eager] 1.2320ms 1.0991ms 909.8338 Ops/s 914.0972 Ops/s $\color{#d91a1a}-0.47\%$
test_func_call_runtime[True-compile] 1.0388ms 0.9481ms 1.0548 KOps/s 1.0707 KOps/s $\color{#d91a1a}-1.49\%$
test_func_call_runtime[True-compile-overhead] 0.5193ms 0.4709ms 2.1235 KOps/s 2.0761 KOps/s $\color{#35bf28}+2.28\%$
test_func_call_cm_runtime[False-eager] 0.9095ms 0.8385ms 1.1926 KOps/s 1.1422 KOps/s $\color{#35bf28}+4.41\%$
test_func_call_cm_runtime[False-compile] 1.0750ms 0.9161ms 1.0916 KOps/s 1.0736 KOps/s $\color{#35bf28}+1.68\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5341ms 0.4601ms 2.1735 KOps/s 2.1339 KOps/s $\color{#35bf28}+1.86\%$
test_func_call_cm_runtime[True-eager] 1.3197ms 1.2224ms 818.0837 Ops/s 797.9728 Ops/s $\color{#35bf28}+2.52\%$
test_func_call_cm_runtime[True-compile] 1.0197ms 0.9559ms 1.0462 KOps/s 1.0293 KOps/s $\color{#35bf28}+1.64\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5582ms 0.5035ms 1.9863 KOps/s 1.9518 KOps/s $\color{#35bf28}+1.76\%$
test_vmap_func_call_cm_runtime[eager] 2.8425ms 2.3473ms 426.0246 Ops/s 417.1863 Ops/s $\color{#35bf28}+2.12\%$
test_vmap_func_call_cm_runtime[compile] 1.0810ms 0.9743ms 1.0263 KOps/s 1.0211 KOps/s $\color{#35bf28}+0.51\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5529ms 0.5064ms 1.9746 KOps/s 1.9358 KOps/s $\color{#35bf28}+2.01\%$
test_distributed 0.6858ms 0.1545ms 6.4725 KOps/s 6.4883 KOps/s $\color{#d91a1a}-0.24\%$
test_tdmodule 0.2680ms 29.6059μs 33.7770 KOps/s 34.9812 KOps/s $\color{#d91a1a}-3.44\%$
test_tdmodule_dispatch 77.3740μs 46.4540μs 21.5267 KOps/s 21.3803 KOps/s $\color{#35bf28}+0.68\%$
test_tdseq 48.1320μs 27.9357μs 35.7964 KOps/s 35.3917 KOps/s $\color{#35bf28}+1.14\%$
test_tdseq_dispatch 69.8130μs 48.8922μs 20.4531 KOps/s 20.1647 KOps/s $\color{#35bf28}+1.43\%$
test_instantiation_functorch 2.1901ms 2.0566ms 486.2512 Ops/s 484.8135 Ops/s $\color{#35bf28}+0.30\%$
test_exec_functorch 0.2329ms 0.1790ms 5.5856 KOps/s 5.4687 KOps/s $\color{#35bf28}+2.14\%$
test_exec_functional_call 0.2111ms 0.1679ms 5.9562 KOps/s 6.1385 KOps/s $\color{#d91a1a}-2.97\%$
test_exec_td_decorator 0.4642ms 0.2367ms 4.2256 KOps/s 4.2065 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[True-True] 1.0197ms 0.8194ms 1.2204 KOps/s 1.2090 KOps/s $\color{#35bf28}+0.94\%$
test_vmap_mlp_speed_decorator[True-False] 1.0082ms 0.8430ms 1.1863 KOps/s 1.2110 KOps/s $\color{#d91a1a}-2.04\%$
test_vmap_mlp_speed_decorator[False-True] 0.9321ms 0.7160ms 1.3966 KOps/s 1.4012 KOps/s $\color{#d91a1a}-0.33\%$
test_vmap_mlp_speed_decorator[False-False] 0.8714ms 0.7094ms 1.4097 KOps/s 1.4014 KOps/s $\color{#35bf28}+0.59\%$
test_vmap_transformer_speed_decorator[True-True] 21.1009ms 20.5919ms 48.5627 Ops/s 48.5387 Ops/s $\color{#35bf28}+0.05\%$
test_vmap_transformer_speed_decorator[True-False] 20.7170ms 20.5539ms 48.6526 Ops/s 48.4994 Ops/s $\color{#35bf28}+0.32\%$
test_vmap_transformer_speed_decorator[False-True] 20.9515ms 20.3823ms 49.0621 Ops/s 48.9317 Ops/s $\color{#35bf28}+0.27\%$
test_vmap_transformer_speed_decorator[False-False] 21.0341ms 20.3757ms 49.0782 Ops/s 48.9304 Ops/s $\color{#35bf28}+0.30\%$
test_to_module_speed[True] 2.0412ms 1.4736ms 678.6007 Ops/s 667.7855 Ops/s $\color{#35bf28}+1.62\%$
test_to_module_speed[False] 1.9711ms 1.4450ms 692.0253 Ops/s 682.1545 Ops/s $\color{#35bf28}+1.45\%$
test_tc_init 82.2440μs 46.0791μs 21.7018 KOps/s 21.2611 KOps/s $\color{#35bf28}+2.07\%$
test_tc_init_tensor_only 35.8720μs 9.9138μs 100.8695 KOps/s 101.0767 KOps/s $\color{#d91a1a}-0.21\%$
test_tc_init_nested 0.1703ms 92.7423μs 10.7826 KOps/s 10.8651 KOps/s $\color{#d91a1a}-0.76\%$
test_tc_init_many_fields 44.8530μs 16.5403μs 60.4584 KOps/s 60.6943 KOps/s $\color{#d91a1a}-0.39\%$
test_tc_first_layer_tensor 15.2810μs 1.8247μs 548.0313 KOps/s 540.2200 KOps/s $\color{#35bf28}+1.45\%$
test_tc_first_layer_tensor_only 4.0173μs 0.7525μs 1.3289 MOps/s 1.3396 MOps/s $\color{#d91a1a}-0.79\%$
test_tc_first_layer_tensor_set 24.6720μs 4.2070μs 237.6987 KOps/s 250.1938 KOps/s $\color{#d91a1a}-4.99\%$
test_tc_first_layer_tensor_only_set 18.4710μs 3.1647μs 315.9831 KOps/s 319.6985 KOps/s $\color{#d91a1a}-1.16\%$
test_tc_first_layer_nontensor 28.0720μs 6.1860μs 161.6558 KOps/s 160.7817 KOps/s $\color{#35bf28}+0.54\%$
test_tc_second_layer_tensor 23.9210μs 4.4168μs 226.4081 KOps/s 228.4642 KOps/s $\color{#d91a1a}-0.90\%$
test_tc_second_layer_nontensor 29.2310μs 8.7549μs 114.2213 KOps/s 115.0136 KOps/s $\color{#d91a1a}-0.69\%$
test_unbind 0.2673s 16.1285ms 62.0020 Ops/s 56.9528 Ops/s $\textbf{\color{#35bf28}+8.87\%}$
test_full_like 10.4718ms 4.3975ms 227.4010 Ops/s 227.7124 Ops/s $\color{#d91a1a}-0.14\%$
test_zeros_like 4.7837ms 4.3488ms 229.9503 Ops/s 228.6936 Ops/s $\color{#35bf28}+0.55\%$
test_ones_like 4.5044ms 4.3561ms 229.5634 Ops/s 228.6935 Ops/s $\color{#35bf28}+0.38\%$
test_clone 6.8953ms 6.4059ms 156.1068 Ops/s 155.8422 Ops/s $\color{#35bf28}+0.17\%$
test_squeeze 0.1873ms 14.4104μs 69.3944 KOps/s 68.2709 KOps/s $\color{#35bf28}+1.65\%$
test_unsqueeze 0.1596ms 0.1113ms 8.9859 KOps/s 8.7081 KOps/s $\color{#35bf28}+3.19\%$
test_split 0.2834ms 0.1882ms 5.3132 KOps/s 5.2430 KOps/s $\color{#35bf28}+1.34\%$
test_permute 0.2590ms 0.2127ms 4.7019 KOps/s 4.6756 KOps/s $\color{#35bf28}+0.56\%$
test_stack 51.2381ms 50.6467ms 19.7446 Ops/s 19.5978 Ops/s $\color{#35bf28}+0.75\%$
test_cat 51.2586ms 51.0277ms 19.5972 Ops/s 23.3962 Ops/s $\textbf{\color{#d91a1a}-16.24\%}$

Xmaster6y pushed a commit to Xmaster6y/tensordict that referenced this pull request Feb 27, 2026
Adds RedisTensorDict, a TensorDictBase subclass that stores tensors in a
Redis instance for out-of-core storage. This enables datasets that exceed
local RAM to be accessed with the familiar TensorDict interface.

MVP features:
- Async redis.asyncio client on a background event loop with sync wrappers
- Zero-copy serialization via torch.frombuffer / untyped_storage bytes
- Metadata (shape, dtype) stored in Redis Hashes for fast introspection
- Redis Cluster-compatible key schema using hash tags
- Pipelined batch I/O for all get/set operations
- Nested TensorDict support via lightweight prefix-based views
- Pickle support for multi-process usage
- from_dict() factory, to_local() / to_tensordict() materialization
- CI: install redis package and start redis-server before tests
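
Two of the features above, the background event loop with sync wrappers and the hash-tag key schema, can be illustrated with a minimal stdlib-only sketch. This is not the PR's implementation: `BackgroundLoop`, `FakeAsyncStore`, `SyncStore`, `make_key` and the `{td:<id>}:<kind>:<name>` key layout are all hypothetical names chosen for illustration, and the in-memory store stands in for a real `redis.asyncio` client.

```python
import asyncio
import threading


class BackgroundLoop:
    """Runs an asyncio event loop on a daemon thread (hypothetical helper)."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro):
        # Schedule the coroutine on the background loop and block for its result.
        return asyncio.run_coroutine_threadsafe(coro, self._loop).result()

    def close(self):
        # Stop the loop from its own thread, then wait for the thread to exit.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


class FakeAsyncStore:
    """In-memory stand-in for an async Redis client (assumption, not redis.asyncio)."""

    def __init__(self):
        self._data = {}

    async def set(self, key, value):
        self._data[key] = value

    async def get(self, key):
        return self._data.get(key)


def make_key(td_id, name, kind="data"):
    # Redis Cluster hashes only the substring inside the first {...},
    # so all keys sharing the same td_id land in the same hash slot.
    return f"{{td:{td_id}}}:{kind}:{name}"


class SyncStore:
    """Synchronous facade over the async client (hypothetical)."""

    def __init__(self, td_id):
        self._td_id = td_id
        self._bg = BackgroundLoop()
        self._client = FakeAsyncStore()

    def set(self, name, payload):
        self._bg.run(self._client.set(make_key(self._td_id, name), payload))

    def get(self, name):
        return self._bg.run(self._client.get(make_key(self._td_id, name)))

    def close(self):
        self._bg.close()
```

Keeping the loop on its own thread lets ordinary synchronous callers block on one result at a time while the async client stays free to batch or pipeline requests internally.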


ghstack-source-id: 6cb2eff
Pull-Request: pytorch#1567

Labels

CLA Signed, Feature
