[DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A #1640
Open
vmoens wants to merge 1 commit into gh/vmoens/81/base from
Contributor
PR Title Label Error: Unknown or invalid prefix.
Current title:
Supported prefixes: your PR title must start with exactly one of these prefixes (case-insensitive):
Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
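The rule stated in the comment above (one leading prefix, case-insensitive, singular/plural tolerated) can be sketched as follows. This is only an illustration: the prefix list is not shown in the comment, so `SUPPORTED_PREFIXES` is a hypothetical placeholder, and the bracketed `[Prefix]` form is an assumption based on the current title. The names `has_supported_prefix` and `_normalize` are likewise invented for this sketch and are not the bot's actual code.

```python
import re

# Hypothetical prefix list: the real list is not shown in the comment.
SUPPORTED_PREFIXES = {"bugfix", "feature", "doc", "refactor"}


def _normalize(tag: str) -> str:
    """Lowercase the tag and drop one trailing plural 's'."""
    tag = tag.strip().lower()
    return tag[:-1] if tag.endswith("s") and len(tag) > 1 else tag


def has_supported_prefix(title: str) -> bool:
    """Return True if the title starts with a supported bracketed prefix.

    Assumes prefixes are written as a leading "[Prefix]" tag.
    """
    match = re.match(r"\s*\[([^\]]+)\]", title)
    if match is None:
        return False
    allowed = {_normalize(p) for p in SUPPORTED_PREFIXES}
    return _normalize(match.group(1)) in allowed


title = "[DTensor] Add unified dtensor_send/dtensor_recv API with Strategy A"
title_ok = has_supported_prefix(title)
```

Under these assumed prefixes, `[DTensor]` is rejected, which matches the error reported for this PR, while `[Docs]` and `[bugfix]` would both be accepted.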
This was referenced Mar 6, 2026
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 43.9500μs | 14.3876μs | 69.5044 KOps/s | 70.4946 KOps/s | |
| test_plain_set_stack_nested | 37.2910μs | 14.7438μs | 67.8250 KOps/s | 68.7282 KOps/s | |
| test_plain_set_nested_inplace | 42.4910μs | 16.0103μs | 62.4597 KOps/s | 62.2523 KOps/s | |
| test_plain_set_stack_nested_inplace | 62.5210μs | 15.8845μs | 62.9544 KOps/s | 62.4856 KOps/s | |
| test_items | 27.0910μs | 5.4805μs | 182.4636 KOps/s | 182.0408 KOps/s | |
| test_items_nested | 0.5698ms | 0.4462ms | 2.2409 KOps/s | 2.2634 KOps/s | |
| test_items_nested_locked | 0.5638ms | 0.4518ms | 2.2134 KOps/s | 2.2362 KOps/s | |
| test_items_nested_leaf | 0.1665ms | 91.3404μs | 10.9481 KOps/s | 10.8012 KOps/s | |
| test_items_stack_nested | 0.4805ms | 0.4472ms | 2.2359 KOps/s | 2.2537 KOps/s | |
| test_items_stack_nested_leaf | 0.1301ms | 93.7867μs | 10.6625 KOps/s | 10.6662 KOps/s | |
| test_items_stack_nested_locked | 0.5176ms | 0.4493ms | 2.2258 KOps/s | 2.2304 KOps/s | |
| test_keys | 33.9100μs | 4.1361μs | 241.7738 KOps/s | 246.1273 KOps/s | |
| test_keys_nested | 0.1937ms | 0.1276ms | 7.8350 KOps/s | 7.8925 KOps/s | |
| test_keys_nested_locked | 2.1438ms | 0.1374ms | 7.2793 KOps/s | 7.4172 KOps/s | |
| test_keys_nested_leaf | 0.1678ms | 0.1189ms | 8.4132 KOps/s | 8.5649 KOps/s | |
| test_keys_stack_nested | 0.2127ms | 0.1292ms | 7.7417 KOps/s | 7.8971 KOps/s | |
| test_keys_stack_nested_leaf | 0.1886ms | 0.1189ms | 8.4098 KOps/s | 8.5845 KOps/s | |
| test_keys_stack_nested_locked | 0.1630ms | 0.1380ms | 7.2474 KOps/s | 7.4417 KOps/s | |
| test_values | 5.4782μs | 1.0093μs | 990.8121 KOps/s | 999.7890 KOps/s | |
| test_values_nested | 92.6810μs | 51.7589μs | 19.3204 KOps/s | 19.5699 KOps/s | |
| test_values_nested_locked | 87.6210μs | 55.2306μs | 18.1059 KOps/s | 18.5448 KOps/s | |
| test_values_nested_leaf | 89.3110μs | 59.3172μs | 16.8585 KOps/s | 17.2467 KOps/s | |
| test_values_stack_nested | 81.6310μs | 51.8681μs | 19.2797 KOps/s | 19.5830 KOps/s | |
| test_values_stack_nested_leaf | 86.8820μs | 59.1589μs | 16.9036 KOps/s | 17.2172 KOps/s | |
| test_values_stack_nested_locked | 0.1030ms | 55.7281μs | 17.9443 KOps/s | 18.3803 KOps/s | |
| test_membership | 9.5518μs | 0.7978μs | 1.2534 MOps/s | 1.2521 MOps/s | |
| test_membership_nested | 33.3110μs | 2.7173μs | 368.0186 KOps/s | 370.0032 KOps/s | |
| test_membership_nested_leaf | 15.4150μs | 2.6389μs | 378.9392 KOps/s | 362.2450 KOps/s | |
| test_membership_stacked_nested | 33.6400μs | 2.7621μs | 362.0380 KOps/s | 369.0774 KOps/s | |
| test_membership_stacked_nested_leaf | 39.8610μs | 2.7499μs | 363.6506 KOps/s | 364.1836 KOps/s | |
| test_membership_nested_last | 35.8010μs | 4.1316μs | 242.0356 KOps/s | 244.2148 KOps/s | |
| test_membership_nested_leaf_last | 29.6210μs | 4.1077μs | 243.4443 KOps/s | 244.9041 KOps/s | |
| test_membership_stacked_nested_last | 73.4710μs | 4.0802μs | 245.0843 KOps/s | 244.1228 KOps/s | |
| test_membership_stacked_nested_leaf_last | 36.2310μs | 4.1006μs | 243.8696 KOps/s | 246.6123 KOps/s | |
| test_nested_getleaf | 51.2110μs | 20.6709μs | 48.3771 KOps/s | 48.9640 KOps/s | |
| test_nested_get | 61.8610μs | 19.2225μs | 52.0224 KOps/s | 51.7182 KOps/s | |
| test_stacked_getleaf | 48.6210μs | 20.2948μs | 49.2738 KOps/s | 48.1768 KOps/s | |
| test_stacked_get | 50.1410μs | 19.4363μs | 51.4502 KOps/s | 51.0424 KOps/s | |
| test_nested_getitemleaf | 85.2010μs | 21.0590μs | 47.4855 KOps/s | 46.8230 KOps/s | |
| test_nested_getitem | 46.9510μs | 19.7722μs | 50.5760 KOps/s | 49.9813 KOps/s | |
| test_stacked_getitemleaf | 48.7810μs | 20.7738μs | 48.1374 KOps/s | 47.2947 KOps/s | |
| test_stacked_getitem | 46.9910μs | 19.9313μs | 50.1724 KOps/s | 49.4553 KOps/s | |
| test_lock_nested | 0.5208ms | 0.4586ms | 2.1808 KOps/s | 2.1894 KOps/s | |
| test_lock_stack_nested | 0.5230ms | 0.4622ms | 2.1635 KOps/s | 2.1527 KOps/s | |
| test_unlock_nested | 0.5241ms | 0.3726ms | 2.6842 KOps/s | 2.6912 KOps/s | |
| test_unlock_stack_nested | 0.4330ms | 0.3743ms | 2.6717 KOps/s | 2.6395 KOps/s | |
| test_flatten_speed | 0.1596ms | 0.1152ms | 8.6834 KOps/s | 8.6092 KOps/s | |
| test_unflatten_speed | 0.6151ms | 0.5436ms | 1.8396 KOps/s | 1.7923 KOps/s | |
| test_common_ops | 0.8383ms | 0.6969ms | 1.4349 KOps/s | 1.4781 KOps/s | |
| test_creation | 69.0610μs | 2.9521μs | 338.7461 KOps/s | 340.8953 KOps/s | |
| test_creation_empty | 32.2200μs | 6.5272μs | 153.2040 KOps/s | 151.5111 KOps/s | |
| test_creation_nested_1 | 57.3110μs | 11.0053μs | 90.8650 KOps/s | 91.0301 KOps/s | |
| test_creation_nested_2 | 38.6110μs | 12.5652μs | 79.5851 KOps/s | 78.9655 KOps/s | |
| test_creation_many_keys[10] | 48.3310μs | 19.5516μs | 51.1466 KOps/s | 50.5929 KOps/s | |
| test_creation_many_keys[50] | 0.1489ms | 83.5534μs | 11.9684 KOps/s | 11.6980 KOps/s | |
| test_creation_many_keys[100] | 0.2267ms | 0.1642ms | 6.0917 KOps/s | 5.9790 KOps/s | |
| test_creation_nested_many_keys[10] | 75.4310μs | 41.7514μs | 23.9513 KOps/s | 23.3619 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2343ms | 0.1710ms | 5.8469 KOps/s | 5.6919 KOps/s | |
| test_clone | 42.6010μs | 12.7681μs | 78.3202 KOps/s | 75.4074 KOps/s | |
| test_getitem[int] | 1.7072ms | 14.5209μs | 68.8664 KOps/s | 61.9057 KOps/s | |
| test_getitem[slice_int] | 0.1372ms | 24.5522μs | 40.7296 KOps/s | 43.3453 KOps/s | |
| test_getitem[range] | 0.1762ms | 65.0297μs | 15.3776 KOps/s | 15.2469 KOps/s | |
| test_getitem[tuple] | 0.1455ms | 23.1887μs | 43.1245 KOps/s | 41.2563 KOps/s | |
| test_getitem[list] | 0.1775ms | 59.6661μs | 16.7599 KOps/s | 16.4677 KOps/s | |
| test_setitem_dim[int] | 59.3110μs | 27.4442μs | 36.4376 KOps/s | 37.4142 KOps/s | |
| test_setitem_dim[slice_int] | 67.2610μs | 43.0431μs | 23.2325 KOps/s | 23.0178 KOps/s | |
| test_setitem_dim[range] | 0.1225ms | 97.6786μs | 10.2377 KOps/s | 10.3676 KOps/s | |
| test_setitem_dim[tuple] | 70.2410μs | 38.7339μs | 25.8172 KOps/s | 24.1116 KOps/s | |
| test_setitem | 49.9000μs | 17.6363μs | 56.7012 KOps/s | 53.7185 KOps/s | |
| test_set | 58.3110μs | 17.0492μs | 58.6539 KOps/s | 56.4200 KOps/s | |
| test_set_shared | 0.4909ms | 0.2083ms | 4.8016 KOps/s | 4.8322 KOps/s | |
| test_update | 0.3316ms | 22.2901μs | 44.8629 KOps/s | 44.2841 KOps/s | |
| test_update_nested | 72.7310μs | 32.0420μs | 31.2091 KOps/s | 29.3695 KOps/s | |
| test_update__nested | 0.4756ms | 33.7405μs | 29.6380 KOps/s | 28.3864 KOps/s | |
| test_set_nested | 58.7100μs | 18.8534μs | 53.0408 KOps/s | 49.6968 KOps/s | |
| test_set_nested_new | 72.3310μs | 23.6633μs | 42.2595 KOps/s | 39.3543 KOps/s | |
| test_select | 80.8010μs | 39.8606μs | 25.0874 KOps/s | 23.7536 KOps/s | |
| test_select_nested | 0.1068ms | 70.3085μs | 14.2230 KOps/s | 14.1926 KOps/s | |
| test_exclude_nested | 0.1354ms | 87.1797μs | 11.4706 KOps/s | 11.7028 KOps/s | |
| test_empty[True] | 0.4469ms | 0.3877ms | 2.5790 KOps/s | 2.6127 KOps/s | |
| test_empty[False] | 8.2325μs | 1.2523μs | 798.5551 KOps/s | 808.4200 KOps/s | |
| test_to | 0.1023ms | 71.0523μs | 14.0741 KOps/s | 14.0506 KOps/s | |
| test_to_nonblocking | 0.1141ms | 65.4895μs | 15.2696 KOps/s | 16.1060 KOps/s | |
| test_unbind_speed | 0.3637ms | 0.3222ms | 3.1039 KOps/s | 3.1749 KOps/s | |
| test_unbind_speed_stack0 | 0.3957ms | 0.3172ms | 3.1522 KOps/s | 3.1636 KOps/s | |
| test_unbind_speed_stack1 | 0.1049s | 0.8889ms | 1.1250 KOps/s | 1.2313 KOps/s | |
| test_split | 1.1472ms | 1.0897ms | 917.6620 Ops/s | 810.0656 Ops/s | |
| test_chunk | 0.1055s | 1.1590ms | 862.7786 Ops/s | 957.4254 Ops/s | |
| test_to_cpu_blocking | 28.5776ms | 28.1296ms | 35.5497 Ops/s | 53.3079 Ops/s | |
| test_to_cpu_global_sync | 11.2611ms | 11.1424ms | 89.7471 Ops/s | 81.4204 Ops/s | |
| test_to_cpu_event_sync | 12.3055ms | 12.0195ms | 83.1985 Ops/s | 83.9742 Ops/s | |
| test_to_cpu_default | 12.3259ms | 12.0569ms | 82.9402 Ops/s | 83.9019 Ops/s | |
| test_consolidate[False-None] | 4.0514ms | 3.9607ms | 252.4792 Ops/s | 224.2915 Ops/s | |
| test_consolidate[default-None] | 2.2210ms | 1.9313ms | 517.7986 Ops/s | 494.3408 Ops/s | |
| test_consolidate[reduce-overhead-None] | 1.9435ms | 1.8660ms | 535.8932 Ops/s | 515.3975 Ops/s | |
| test_consolidate_njt[False-None] | 8.4550ms | 8.2501ms | 121.2106 Ops/s | 120.0504 Ops/s | |
| test_to[False-False-None] | 2.1710ms | 2.0248ms | 493.8807 Ops/s | 487.9463 Ops/s | |
| test_to[True-False-None] | 2.0172ms | 1.8714ms | 534.3733 Ops/s | 530.1603 Ops/s | |
| test_to[within-False-None] | 6.2363ms | 5.9472ms | 168.1465 Ops/s | 165.9848 Ops/s | |
| test_to[True-default-None] | 9.0054ms | 8.7144ms | 114.7521 Ops/s | 112.5332 Ops/s | |
| test_to_njt[False-False-None] | 8.5388ms | 8.2590ms | 121.0807 Ops/s | 120.3476 Ops/s | |
| test_to_njt[True-False-None] | 6.8938ms | 6.7835ms | 147.4159 Ops/s | 146.6121 Ops/s | |
| test_to_njt[within-False-None] | 15.7899ms | 15.2026ms | 65.7783 Ops/s | 65.4721 Ops/s | |
| test_creation[device0] | 0.3528ms | 0.1101ms | 9.0791 KOps/s | 8.8788 KOps/s | |
| test_creation_from_tensor | 0.3579ms | 0.1097ms | 9.1169 KOps/s | 8.9933 KOps/s | |
| test_add_one[memmap_tensor0] | 0.2790ms | 6.3189μs | 158.2557 KOps/s | 156.5768 KOps/s | |
| test_contiguous[memmap_tensor0] | 15.2700μs | 0.6077μs | 1.6456 MOps/s | 2.2697 MOps/s | |
| test_stack[memmap_tensor0] | 32.8500μs | 4.4642μs | 224.0060 KOps/s | 219.1616 KOps/s | |
| test_memmaptd_index | 1.0126ms | 0.2644ms | 3.7818 KOps/s | 3.8193 KOps/s | |
| test_memmaptd_index_astensor | 0.5162ms | 0.3625ms | 2.7590 KOps/s | 2.7748 KOps/s | |
| test_memmaptd_index_op | 0.8939ms | 0.5988ms | 1.6701 KOps/s | 1.6556 KOps/s | |
| test_serialize_model | 0.1392s | 0.1368s | 7.3108 Ops/s | 7.3416 Ops/s | |
| test_serialize_model_pickle | 1.3619s | 1.2125s | 0.8248 Ops/s | 0.8262 Ops/s | |
| test_serialize_weights | 0.1387s | 0.1367s | 7.3155 Ops/s | 7.3918 Ops/s | |
| test_serialize_weights_returnearly | 0.3978s | 85.8888ms | 11.6430 Ops/s | 6.0512 Ops/s | |
| test_serialize_weights_pickle | 1.3698s | 1.2141s | 0.8236 Ops/s | 0.8224 Ops/s | |
| test_reshape_pytree | 0.2161ms | 30.9432μs | 32.3173 KOps/s | 30.2491 KOps/s | |
| test_reshape_td | 90.2720μs | 43.0876μs | 23.2086 KOps/s | 22.6174 KOps/s | |
| test_view_pytree | 0.2125ms | 30.7738μs | 32.4951 KOps/s | 30.7558 KOps/s | |
| test_view_td | 91.7320μs | 51.9014μs | 19.2673 KOps/s | 19.0076 KOps/s | |
| test_unbind_pytree | 0.2328ms | 34.6836μs | 28.8321 KOps/s | 27.2902 KOps/s | |
| test_unbind_td | 0.1891ms | 47.7887μs | 20.9255 KOps/s | 20.2093 KOps/s | |
| test_split_pytree | 0.2397ms | 40.2995μs | 24.8142 KOps/s | 23.6289 KOps/s | |
| test_split_td | 0.1229ms | 61.8757μs | 16.1614 KOps/s | 15.3843 KOps/s | |
| test_add_pytree | 0.2274ms | 41.9811μs | 23.8203 KOps/s | 23.6734 KOps/s | |
| test_add_td | 99.3020μs | 57.7142μs | 17.3267 KOps/s | 17.4002 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1898ms | 0.1360ms | 7.3548 KOps/s | 6.7497 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.4364ms | 0.1936ms | 5.1656 KOps/s | 5.2034 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1500ms | 0.1050ms | 9.5233 KOps/s | 9.3811 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4301ms | 0.1709ms | 5.8525 KOps/s | 5.8162 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.2520ms | 9.6702μs | 103.4107 KOps/s | 97.9012 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 0.1107ms | 50.5786μs | 19.7712 KOps/s | 19.5222 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1472ms | 9.5602μs | 104.6007 KOps/s | 105.1866 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4607ms | 64.5649μs | 15.4883 KOps/s | 15.5073 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2273ms | 0.1735ms | 5.7640 KOps/s | 5.4699 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3116ms | 0.2753ms | 3.6321 KOps/s | 3.6114 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.3064ms | 0.1150ms | 8.6985 KOps/s | 8.5564 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1159ms | 75.3891μs | 13.2645 KOps/s | 13.7948 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2042ms | 0.1551ms | 6.4481 KOps/s | 6.3510 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.7669ms | 0.5039ms | 1.9845 KOps/s | 1.9475 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4848ms | 0.3275ms | 3.0535 KOps/s | 3.0247 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.3082ms | 0.1755ms | 5.6973 KOps/s | 5.4049 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1292ms | 87.4687μs | 11.4327 KOps/s | 11.3693 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.3294ms | 0.1169ms | 8.5551 KOps/s | 7.9955 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6337ms | 0.4186ms | 2.3889 KOps/s | 2.3068 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.1887ms | 0.1557ms | 6.4209 KOps/s | 6.2437 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 55.3710μs | 13.1493μs | 76.0499 KOps/s | 73.4540 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 65.5310μs | 40.0824μs | 24.9486 KOps/s | 24.9993 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1190ms | 10.5573μs | 94.7208 KOps/s | 96.3560 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4036ms | 51.5072μs | 19.4148 KOps/s | 19.5658 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 1.9706ms | 0.1714ms | 5.8327 KOps/s | 5.5493 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.5348ms | 3.3142ms | 301.7274 Ops/s | 307.5907 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9179ms | 0.1576ms | 6.3438 KOps/s | 6.0741 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.9230ms | 2.7507ms | 363.5463 Ops/s | 368.5598 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.2190ms | 0.1118ms | 8.9473 KOps/s | 8.9589 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3156ms | 75.6438μs | 13.2198 KOps/s | 13.6862 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.2139ms | 97.7599μs | 10.2291 KOps/s | 10.3475 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2501ms | 44.7380μs | 22.3524 KOps/s | 23.3598 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1737ms | 98.6168μs | 10.1403 KOps/s | 10.5005 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2592ms | 44.8896μs | 22.2769 KOps/s | 23.3753 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1975ms | 56.6533μs | 17.6512 KOps/s | 16.4805 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2221ms | 26.9639μs | 37.0866 KOps/s | 37.9288 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1537ms | 44.5785μs | 22.4324 KOps/s | 21.9284 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2568ms | 21.2579μs | 47.0412 KOps/s | 46.3510 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 85.7710μs | 45.7084μs | 21.8778 KOps/s | 22.5172 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2744ms | 21.5654μs | 46.3705 KOps/s | 46.4178 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1362ms | 57.2840μs | 17.4569 KOps/s | 17.3533 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2389ms | 27.5982μs | 36.2342 KOps/s | 37.5533 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 0.1022ms | 44.3722μs | 22.5366 KOps/s | 21.6460 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2604ms | 21.2694μs | 47.0158 KOps/s | 46.6142 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 94.1510μs | 44.7105μs | 22.3661 KOps/s | 21.9285 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2608ms | 21.3546μs | 46.8282 KOps/s | 46.4967 KOps/s | |
| test_compile_replace[single-eager] | 0.1095ms | 47.1690μs | 21.2004 KOps/s | 22.1953 KOps/s | |
| test_compile_replace[single-compile] | 0.1714ms | 0.1053ms | 9.4939 KOps/s | 9.4245 KOps/s | |
| test_compile_replace[multi-eager] | 0.6229ms | 0.5627ms | 1.7773 KOps/s | 1.8377 KOps/s | |
| test_compile_replace[multi-compile] | 0.2628ms | 0.1102ms | 9.0756 KOps/s | 8.6270 KOps/s | |
| test_compile_tc_getattr_20[eager] | 0.2151ms | 0.1628ms | 6.1411 KOps/s | 6.1038 KOps/s | |
| test_compile_tc_getattr_20[compile] | 0.3152ms | 0.1189ms | 8.4077 KOps/s | 8.4732 KOps/s | |
| test_compile_clone_shallow[20-eager] | 54.3110μs | 18.6009μs | 53.7609 KOps/s | 53.8095 KOps/s | |
| test_compile_clone_shallow[20-compile] | 67.3010μs | 11.0020μs | 90.8922 KOps/s | 90.5705 KOps/s | |
| test_compile_clone_shallow[40-eager] | 90.6810μs | 32.8757μs | 30.4176 KOps/s | 30.7664 KOps/s | |
| test_compile_clone_shallow[40-compile] | 0.1806ms | 12.1626μs | 82.2191 KOps/s | 80.4860 KOps/s | |
| test_compile_clone_shallow[80-eager] | 0.1199ms | 61.3453μs | 16.3012 KOps/s | 16.5544 KOps/s | |
| test_compile_clone_shallow[80-compile] | 48.3610μs | 15.0108μs | 66.6188 KOps/s | 67.1884 KOps/s | |
| test_compile_update_inplace[eager] | 0.1069ms | 57.7756μs | 17.3083 KOps/s | 17.3916 KOps/s | |
| test_compile_update_inplace[compile] | 0.2039ms | 0.1331ms | 7.5106 KOps/s | 7.0600 KOps/s | |
| test_mod_add[eager] | 88.9220μs | 47.2403μs | 21.1684 KOps/s | 21.3474 KOps/s | |
| test_mod_add[compile] | 0.5193ms | 0.1003ms | 9.9738 KOps/s | 9.2071 KOps/s | |
| test_mod_add[compile-overhead] | 0.3762ms | 0.1477ms | 6.7689 KOps/s | 6.7233 KOps/s | |
| test_mod_wrap[eager] | 0.3520ms | 0.2809ms | 3.5601 KOps/s | 3.3860 KOps/s | |
| test_mod_wrap[compile] | 0.5081ms | 0.3405ms | 2.9366 KOps/s | 2.9176 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.3268ms | 4.0242ms | 248.4955 Ops/s | 252.3634 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.6764ms | 1.4742ms | 678.3110 Ops/s | 668.9375 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.5428ms | 1.4097ms | 709.3700 Ops/s | 705.8707 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2367ms | 0.8709ms | 1.1483 KOps/s | 1.1411 KOps/s | |
| test_seq_add[eager] | 0.2088ms | 0.1473ms | 6.7879 KOps/s | 6.6645 KOps/s | |
| test_seq_add[compile] | 0.2914ms | 0.1099ms | 9.1029 KOps/s | 8.7859 KOps/s | |
| test_seq_add[compile-overhead] | 0.2192ms | 0.1490ms | 6.7126 KOps/s | 6.4874 KOps/s | |
| test_seq_wrap[eager] | 0.5736ms | 0.5052ms | 1.9793 KOps/s | 1.9879 KOps/s | |
| test_seq_wrap[compile] | 0.4343ms | 0.3550ms | 2.8173 KOps/s | 2.7879 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3265ms | 0.2585ms | 3.8687 KOps/s | 3.8490 KOps/s | |
| test_func_call_runtime[False-eager] | 0.8814ms | 0.8063ms | 1.2402 KOps/s | 1.2462 KOps/s | |
| test_func_call_runtime[False-compile] | 0.9837ms | 0.8785ms | 1.1383 KOps/s | 1.1298 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5202ms | 0.4437ms | 2.2536 KOps/s | 2.2269 KOps/s | |
| test_func_call_runtime[True-eager] | 1.1815ms | 1.0375ms | 963.8985 Ops/s | 958.1684 Ops/s | |
| test_func_call_runtime[True-compile] | 0.9594ms | 0.8865ms | 1.1281 KOps/s | 1.0836 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5144ms | 0.4550ms | 2.1977 KOps/s | 2.1543 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 1.1505ms | 0.8059ms | 1.2409 KOps/s | 1.2331 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 0.9734ms | 0.8788ms | 1.1380 KOps/s | 1.1258 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5000ms | 0.4470ms | 2.2373 KOps/s | 2.2117 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.2881ms | 1.1767ms | 849.8311 Ops/s | 831.0437 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.0643ms | 0.9338ms | 1.0709 KOps/s | 1.0631 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5738ms | 0.4873ms | 2.0522 KOps/s | 2.0066 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.7736ms | 2.3063ms | 433.6018 Ops/s | 429.9375 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0087ms | 0.9400ms | 1.0638 KOps/s | 1.0470 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5340ms | 0.4924ms | 2.0307 KOps/s | 1.9913 KOps/s | |
| test_distributed | 2.9237ms | 0.1564ms | 6.3951 KOps/s | 6.5930 KOps/s | |
| test_tdmodule | 0.7404ms | 27.4274μs | 36.4599 KOps/s | 37.1106 KOps/s | |
| test_tdmodule_dispatch | 73.1310μs | 43.8796μs | 22.7896 KOps/s | 22.5539 KOps/s | |
| test_tdseq | 45.5310μs | 25.9340μs | 38.5594 KOps/s | 38.1209 KOps/s | |
| test_tdseq_dispatch | 79.3520μs | 45.7623μs | 21.8521 KOps/s | 21.7433 KOps/s | |
| test_instantiation_functorch | 2.0467ms | 1.9741ms | 506.5563 Ops/s | 502.2256 Ops/s | |
| test_exec_functorch | 0.2497ms | 0.1781ms | 5.6145 KOps/s | 5.7718 KOps/s | |
| test_exec_functional_call | 0.1894ms | 0.1558ms | 6.4171 KOps/s | 6.4885 KOps/s | |
| test_exec_td_decorator | 0.4306ms | 0.2272ms | 4.4011 KOps/s | 4.3895 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0045ms | 0.7997ms | 1.2504 KOps/s | 1.2338 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.9742ms | 0.7981ms | 1.2530 KOps/s | 1.2371 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8865ms | 0.6912ms | 1.4468 KOps/s | 1.4186 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8765ms | 0.6930ms | 1.4431 KOps/s | 1.3940 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 20.1729ms | 20.0469ms | 49.8831 Ops/s | 49.4307 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 20.2045ms | 20.0664ms | 49.8345 Ops/s | 49.3949 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.5757ms | 19.8974ms | 50.2577 Ops/s | 49.8765 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.3919ms | 19.8943ms | 50.2657 Ops/s | 49.8832 Ops/s | |
| test_to_module_speed[True] | 1.9485ms | 1.4091ms | 709.6781 Ops/s | 700.2111 Ops/s | |
| test_to_module_speed[False] | 1.8542ms | 1.3738ms | 727.9298 Ops/s | 714.0486 Ops/s | |
| test_tc_init | 70.8710μs | 42.7568μs | 23.3881 KOps/s | 23.0945 KOps/s | |
| test_tc_init_tensor_only | 33.9910μs | 9.2502μs | 108.1057 KOps/s | 108.8304 KOps/s | |
| test_tc_init_nested | 0.3643ms | 85.3820μs | 11.7121 KOps/s | 11.5583 KOps/s | |
| test_tc_init_many_fields | 78.5810μs | 15.7123μs | 63.6446 KOps/s | 64.2410 KOps/s | |
| test_tc_first_layer_tensor | 27.4810μs | 1.7198μs | 581.4536 KOps/s | 584.9094 KOps/s | |
| test_tc_first_layer_tensor_only | 2.2700μs | 0.3818μs | 2.6192 MOps/s | 2.5742 MOps/s | |
| test_tc_first_layer_tensor_set | 39.8210μs | 3.6915μs | 270.8908 KOps/s | 268.6168 KOps/s | |
| test_tc_first_layer_tensor_only_set | 24.6710μs | 3.0889μs | 323.7362 KOps/s | 319.5773 KOps/s | |
| test_tc_first_layer_nontensor | 30.4800μs | 5.8585μs | 170.6914 KOps/s | 172.3147 KOps/s | |
| test_tc_second_layer_tensor | 45.7010μs | 4.1549μs | 240.6797 KOps/s | 235.7931 KOps/s | |
| test_tc_second_layer_nontensor | 31.2710μs | 8.3184μs | 120.2154 KOps/s | 120.7440 KOps/s | |
| test_unbind | 0.2639s | 17.0437ms | 58.6728 Ops/s | 55.1392 Ops/s | |
| test_full_like | 5.0348ms | 4.4048ms | 227.0243 Ops/s | 59.3602 Ops/s | |
| test_zeros_like | 4.9890ms | 4.3868ms | 227.9555 Ops/s | 59.4565 Ops/s | |
| test_ones_like | 4.8867ms | 4.3952ms | 227.5231 Ops/s | 59.3844 Ops/s | |
| test_clone | 6.8351ms | 6.5965ms | 151.5948 Ops/s | 55.4774 Ops/s | |
| test_squeeze | 0.1767ms | 13.5540μs | 73.7791 KOps/s | 72.1011 KOps/s | |
| test_unsqueeze | 0.1803ms | 0.1093ms | 9.1486 KOps/s | 8.8205 KOps/s | |
| test_split | 0.3444ms | 0.1775ms | 5.6348 KOps/s | 5.4029 KOps/s | |
| test_permute | 0.2627ms | 0.2003ms | 4.9934 KOps/s | 4.8037 KOps/s | |
| test_stack | 35.5903ms | 35.2742ms | 28.3493 Ops/s | 19.1189 Ops/s | |
| test_cat | 35.5180ms | 35.1998ms | 28.4093 Ops/s | 19.1711 Ops/s | |
| test_sequential_tensordict | 0.2649ms | 0.2079ms | 4.8103 KOps/s | 4.6210 KOps/s | |
| test_sequential_graph_module | 0.1610ms | 0.1143ms | 8.7508 KOps/s | 8.2879 KOps/s | |
| test_nested_tensordict | 0.3727ms | 0.2780ms | 3.5972 KOps/s | 3.4827 KOps/s | |
| test_nested_graph_module | 0.2242ms | 0.1318ms | 7.5857 KOps/s | 7.6489 KOps/s |
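As a reading aid for the table above: the Ops column appears to be the reciprocal of the Mean column, and the empty Change column would presumably compare Ops against Ops on Repo HEAD. The sketch below checks the first relationship on one row and shows one plausible definition of the change (relative difference). The helper names and the relative-difference formula are assumptions, not taken from the benchmark bot.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean time per call."""
    return 1.0 / mean_seconds


def relative_change(ops: float, ops_head: float) -> float:
    """Assumed definition of 'Change': relative difference versus repo HEAD."""
    return (ops - ops_head) / ops_head


# Row test_plain_set_nested: Mean 14.3876 us, Ops 69.5044 KOps/s,
# Ops on Repo HEAD 70.4946 KOps/s.
ops = ops_per_second(14.3876e-6)            # roughly 69.5e3 ops/s
change = relative_change(69.5044, 70.4946)  # roughly -1.4%

print(f"{ops / 1e3:.2f} KOps/s, change {change:+.2%}")
```

For this row the reciprocal of the mean reproduces the reported Ops value to within rounding, and the PR branch is about 1.4% slower than HEAD, a difference small enough to be plausibly explained by run-to-run noise.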
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 29.1210μs | 14.9734μs | 66.7849 KOps/s | 66.7369 KOps/s | |
| test_plain_set_stack_nested | 31.9500μs | 15.2942μs | 65.3845 KOps/s | 66.2761 KOps/s | |
| test_plain_set_nested_inplace | 41.0610μs | 16.7826μs | 59.5855 KOps/s | 59.1133 KOps/s | |
| test_plain_set_stack_nested_inplace | 45.2710μs | 16.8056μs | 59.5041 KOps/s | 59.7575 KOps/s | |
| test_items | 81.1620μs | 5.9896μs | 166.9553 KOps/s | 166.2827 KOps/s | |
| test_items_nested | 0.5164ms | 0.4633ms | 2.1585 KOps/s | 2.1332 KOps/s | |
| test_items_nested_locked | 0.5268ms | 0.4702ms | 2.1268 KOps/s | 2.1316 KOps/s | |
| test_items_nested_leaf | 0.1298ms | 99.6165μs | 10.0385 KOps/s | 10.0889 KOps/s | |
| test_items_stack_nested | 0.4983ms | 0.4659ms | 2.1466 KOps/s | 2.1326 KOps/s | |
| test_items_stack_nested_leaf | 0.1860ms | 98.6724μs | 10.1345 KOps/s | 10.0941 KOps/s | |
| test_items_stack_nested_locked | 0.5017ms | 0.4738ms | 2.1108 KOps/s | 2.1326 KOps/s | |
| test_keys | 46.5000μs | 4.2335μs | 236.2131 KOps/s | 234.4990 KOps/s | |
| test_keys_nested | 0.1745ms | 0.1292ms | 7.7429 KOps/s | 7.7069 KOps/s | |
| test_keys_nested_locked | 2.1171ms | 0.1389ms | 7.1976 KOps/s | 7.1915 KOps/s | |
| test_keys_nested_leaf | 0.1552ms | 0.1207ms | 8.2817 KOps/s | 8.2495 KOps/s | |
| test_keys_stack_nested | 0.1647ms | 0.1303ms | 7.6745 KOps/s | 7.7220 KOps/s | |
| test_keys_stack_nested_leaf | 0.1521ms | 0.1208ms | 8.2759 KOps/s | 8.3017 KOps/s | |
| test_keys_stack_nested_locked | 0.1878ms | 0.1380ms | 7.2475 KOps/s | 7.2418 KOps/s | |
| test_values | 6.8760μs | 1.0325μs | 968.5682 KOps/s | 976.4249 KOps/s | |
| test_values_nested | 78.7710μs | 52.5351μs | 19.0349 KOps/s | 19.1824 KOps/s | |
| test_values_nested_locked | 81.5220μs | 56.3740μs | 17.7387 KOps/s | 17.8245 KOps/s | |
| test_values_nested_leaf | 98.6220μs | 59.9604μs | 16.6777 KOps/s | 16.5542 KOps/s | |
| test_values_stack_nested | 91.7020μs | 52.2525μs | 19.1379 KOps/s | 18.9917 KOps/s | |
| test_values_stack_nested_leaf | 0.1115ms | 59.6800μs | 16.7560 KOps/s | 16.5378 KOps/s | |
| test_values_stack_nested_locked | 97.6920μs | 55.8526μs | 17.9043 KOps/s | 18.1336 KOps/s | |
| test_membership | 6.2583μs | 0.8464μs | 1.1814 MOps/s | 1.1733 MOps/s | |
| test_membership_nested | 29.9200μs | 2.8913μs | 345.8608 KOps/s | 347.6444 KOps/s | |
| test_membership_nested_leaf | 67.0910μs | 2.9148μs | 343.0713 KOps/s | 346.2628 KOps/s | |
| test_membership_stacked_nested | 23.6400μs | 2.9125μs | 343.3479 KOps/s | 344.1282 KOps/s | |
| test_membership_stacked_nested_leaf | 33.8510μs | 2.9153μs | 343.0157 KOps/s | 347.8675 KOps/s | |
| test_membership_nested_last | 27.5000μs | 4.2837μs | 233.4443 KOps/s | 231.1597 KOps/s | |
| test_membership_nested_leaf_last | 24.4800μs | 4.3562μs | 229.5593 KOps/s | 231.7868 KOps/s | |
| test_membership_stacked_nested_last | 28.9400μs | 4.3651μs | 229.0915 KOps/s | 232.2604 KOps/s | |
| test_membership_stacked_nested_leaf_last | 46.9510μs | 4.3266μs | 231.1276 KOps/s | 231.2833 KOps/s | |
| test_nested_getleaf | 90.0410μs | 21.8823μs | 45.6990 KOps/s | 46.3581 KOps/s | |
| test_nested_get | 55.4810μs | 20.3703μs | 49.0910 KOps/s | 49.2792 KOps/s | |
| test_stacked_getleaf | 53.5410μs | 21.4981μs | 46.5157 KOps/s | 46.5158 KOps/s | |
| test_stacked_get | 51.9710μs | 20.7002μs | 48.3088 KOps/s | 49.4772 KOps/s | |
| test_nested_getitemleaf | 0.1298ms | 21.3850μs | 46.7617 KOps/s | 45.7059 KOps/s | |
| test_nested_getitem | 53.4110μs | 20.8258μs | 48.0173 KOps/s | 47.8463 KOps/s | |
| test_stacked_getitemleaf | 46.3010μs | 21.8770μs | 45.7101 KOps/s | 45.6299 KOps/s | |
| test_stacked_getitem | 47.0410μs | 20.7768μs | 48.1306 KOps/s | 47.2081 KOps/s | |
| test_lock_nested | 0.5883ms | 0.4777ms | 2.0935 KOps/s | 2.0905 KOps/s | |
| test_lock_stack_nested | 0.5299ms | 0.4797ms | 2.0848 KOps/s | 2.0637 KOps/s | |
| test_unlock_nested | 0.4722ms | 0.3896ms | 2.5667 KOps/s | 2.5609 KOps/s | |
| test_unlock_stack_nested | 0.4531ms | 0.3902ms | 2.5627 KOps/s | 2.5267 KOps/s | |
| test_flatten_speed | 0.1770ms | 0.1237ms | 8.0814 KOps/s | 8.1201 KOps/s | |
| test_unflatten_speed | 0.6208ms | 0.5688ms | 1.7580 KOps/s | 1.7634 KOps/s | |
| test_common_ops | 0.9383ms | 0.6938ms | 1.4413 KOps/s | 1.4177 KOps/s | |
| test_creation | 0.1145ms | 3.1748μs | 314.9785 KOps/s | 318.0065 KOps/s | |
| test_creation_empty | 33.1510μs | 7.0011μs | 142.8355 KOps/s | 143.3830 KOps/s | |
| test_creation_nested_1 | 41.4510μs | 11.5752μs | 86.3919 KOps/s | 86.7919 KOps/s | |
| test_creation_nested_2 | 40.6810μs | 13.2396μs | 75.5309 KOps/s | 74.9807 KOps/s | |
| test_creation_many_keys[10] | 64.2420μs | 21.2317μs | 47.0993 KOps/s | 47.4688 KOps/s | |
| test_creation_many_keys[50] | 0.1366ms | 92.2125μs | 10.8445 KOps/s | 10.9727 KOps/s | |
| test_creation_many_keys[100] | 0.2243ms | 0.1809ms | 5.5286 KOps/s | 5.6273 KOps/s | |
| test_creation_nested_many_keys[10] | 72.9910μs | 45.4590μs | 21.9979 KOps/s | 22.1264 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2183ms | 0.1852ms | 5.3987 KOps/s | 5.4211 KOps/s | |
| test_clone | 42.1500μs | 13.1777μs | 75.8859 KOps/s | 74.1090 KOps/s | |
| test_getitem[int] | 1.5547ms | 15.0112μs | 66.6170 KOps/s | 59.8943 KOps/s | |
| test_getitem[slice_int] | 0.1390ms | 24.1160μs | 41.4662 KOps/s | 41.1270 KOps/s | |
| test_getitem[range] | 0.1697ms | 61.8070μs | 16.1794 KOps/s | 15.0874 KOps/s | |
| test_getitem[tuple] | 0.1521ms | 23.9193μs | 41.8072 KOps/s | 41.9820 KOps/s | |
| test_getitem[list] | 0.1790ms | 56.6272μs | 17.6594 KOps/s | 17.1190 KOps/s | |
| test_setitem_dim[int] | 48.8310μs | 25.1914μs | 39.6961 KOps/s | 37.8830 KOps/s | |
| test_setitem_dim[slice_int] | 62.4910μs | 41.7970μs | 23.9252 KOps/s | 23.2862 KOps/s | |
| test_setitem_dim[range] | 0.1225ms | 93.5775μs | 10.6863 KOps/s | 10.6868 KOps/s | |
| test_setitem_dim[tuple] | 66.3710μs | 38.7307μs | 25.8193 KOps/s | 25.3527 KOps/s | |
| test_setitem | 47.9010μs | 17.3828μs | 57.5280 KOps/s | 55.8101 KOps/s | |
| test_set | 40.2510μs | 16.7394μs | 59.7393 KOps/s | 58.5777 KOps/s | |
| test_set_shared | 0.6334ms | 0.2067ms | 4.8389 KOps/s | 4.7085 KOps/s | |
| test_update | 0.4515ms | 21.3619μs | 46.8124 KOps/s | 45.4338 KOps/s | |
| test_update_nested | 69.1210μs | 32.7373μs | 30.5462 KOps/s | 29.9491 KOps/s | |
| test_update__nested | 0.4889ms | 33.3281μs | 30.0047 KOps/s | 28.6655 KOps/s | |
| test_set_nested | 54.1710μs | 18.7751μs | 53.2622 KOps/s | 52.4980 KOps/s | |
| test_set_nested_new | 60.9510μs | 23.7880μs | 42.0381 KOps/s | 41.8450 KOps/s | |
| test_select | 73.2810μs | 40.0874μs | 24.9455 KOps/s | 24.7061 KOps/s | |
| test_select_nested | 0.1023ms | 74.3953μs | 13.4417 KOps/s | 13.6216 KOps/s | |
| test_exclude_nested | 0.1377ms | 91.1171μs | 10.9749 KOps/s | 11.0470 KOps/s | |
| test_empty[True] | 0.4259ms | 0.3977ms | 2.5142 KOps/s | 2.5177 KOps/s | |
| test_empty[False] | 7.8102μs | 1.3197μs | 757.7758 KOps/s | 773.5999 KOps/s | |
| test_to | 0.1056ms | 72.7765μs | 13.7407 KOps/s | 13.7530 KOps/s | |
| test_to_nonblocking | 0.1131ms | 65.3397μs | 15.3046 KOps/s | 15.1682 KOps/s | |
| test_unbind_speed | 0.3717ms | 0.3317ms | 3.0149 KOps/s | 2.9919 KOps/s | |
| test_unbind_speed_stack0 | 0.4030ms | 0.3296ms | 3.0337 KOps/s | 3.0163 KOps/s | |
| test_unbind_speed_stack1 | 0.1068s | 0.9201ms | 1.0869 KOps/s | 1.1714 KOps/s | |
| test_split | 1.1984ms | 1.1397ms | 877.3913 Ops/s | 785.5224 Ops/s | |
| test_chunk | 0.1072s | 1.2137ms | 823.9068 Ops/s | 925.7417 Ops/s | |
| test_to_cpu_blocking | 18.9162ms | 18.6117ms | 53.7297 Ops/s | 52.6000 Ops/s | |
| test_to_cpu_global_sync | 11.3384ms | 11.2231ms | 89.1017 Ops/s | 77.8927 Ops/s | |
| test_to_cpu_event_sync | 12.4673ms | 12.0596ms | 82.9213 Ops/s | 80.7280 Ops/s | |
| test_to_cpu_default | 12.3148ms | 12.0894ms | 82.7173 Ops/s | 80.6695 Ops/s | |
| test_consolidate[False-None] | 4.3301ms | 4.1199ms | 242.7232 Ops/s | 215.0100 Ops/s | |
| test_consolidate[default-None] | 2.1291ms | 2.0440ms | 489.2291 Ops/s | 479.3427 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.0426ms | 1.9467ms | 513.6862 Ops/s | 499.8178 Ops/s | |
| test_consolidate_njt[False-None] | 8.8481ms | 8.4786ms | 117.9444 Ops/s | 117.4216 Ops/s | |
| test_to[False-False-None] | 2.2331ms | 2.0537ms | 486.9179 Ops/s | 473.3487 Ops/s | |
| test_to[True-False-None] | 2.2645ms | 1.8980ms | 526.8760 Ops/s | 527.0959 Ops/s | |
| test_to[within-False-None] | 6.3199ms | 6.1320ms | 163.0782 Ops/s | 162.9656 Ops/s | |
| test_to[True-default-None] | 8.8644ms | 8.6824ms | 115.1754 Ops/s | 112.8102 Ops/s | |
| test_to_njt[False-False-None] | 8.5953ms | 8.4503ms | 118.3392 Ops/s | 116.7291 Ops/s | |
| test_to_njt[True-False-None] | 7.0457ms | 6.9107ms | 144.7037 Ops/s | 141.4012 Ops/s | |
| test_to_njt[within-False-None] | 15.8740ms | 15.6648ms | 63.8374 Ops/s | 63.1720 Ops/s | |
| test_creation[device0] | 0.3914ms | 0.1156ms | 8.6517 KOps/s | 8.3176 KOps/s | |
| test_creation_from_tensor | 0.4030ms | 0.1125ms | 8.8912 KOps/s | 8.7836 KOps/s | |
| test_add_one[memmap_tensor0] | 0.1383ms | 6.4558μs | 154.9006 KOps/s | 149.6523 KOps/s | |
| test_contiguous[memmap_tensor0] | 19.5710μs | 0.6782μs | 1.4745 MOps/s | 2.1320 MOps/s | |
| test_stack[memmap_tensor0] | 31.3800μs | 4.6482μs | 215.1354 KOps/s | 217.6022 KOps/s | |
| test_memmaptd_index | 1.0481ms | 0.2712ms | 3.6868 KOps/s | 3.6748 KOps/s | |
| test_memmaptd_index_astensor | 0.5374ms | 0.3739ms | 2.6742 KOps/s | 2.6754 KOps/s | |
| test_memmaptd_index_op | 0.7635ms | 0.6200ms | 1.6128 KOps/s | 1.5992 KOps/s | |
| test_serialize_model | 0.1401s | 0.1373s | 7.2846 Ops/s | 7.2912 Ops/s | |
| test_serialize_model_pickle | 1.3493s | 1.2107s | 0.8260 Ops/s | 0.8261 Ops/s | |
| test_serialize_weights | 0.1376s | 0.1359s | 7.3583 Ops/s | 7.3249 Ops/s | |
| test_serialize_weights_returnearly | 0.4476s | 94.9856ms | 10.5279 Ops/s | 14.8097 Ops/s | |
| test_serialize_weights_pickle | 1.3746s | 1.1911s | 0.8396 Ops/s | 0.8181 Ops/s | |
| test_reshape_pytree | 0.2067ms | 32.8905μs | 30.4039 KOps/s | 30.4817 KOps/s | |
| test_reshape_td | 87.3120μs | 46.1950μs | 21.6474 KOps/s | 21.9999 KOps/s | |
| test_view_pytree | 0.2258ms | 32.8348μs | 30.4555 KOps/s | 31.0965 KOps/s | |
| test_view_td | 89.9420μs | 54.5786μs | 18.3222 KOps/s | 19.0684 KOps/s | |
| test_unbind_pytree | 0.2371ms | 36.1129μs | 27.6910 KOps/s | 27.3272 KOps/s | |
| test_unbind_td | 0.1982ms | 49.4892μs | 20.2064 KOps/s | 19.7120 KOps/s | |
| test_split_pytree | 0.2550ms | 42.2740μs | 23.6552 KOps/s | 23.8254 KOps/s | |
| test_split_td | 0.1689ms | 63.4914μs | 15.7502 KOps/s | 15.5655 KOps/s | |
| test_add_pytree | 0.2330ms | 42.2940μs | 23.6440 KOps/s | 23.5172 KOps/s | |
| test_add_td | 0.1110ms | 54.9641μs | 18.1937 KOps/s | 17.9952 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.2022ms | 0.1401ms | 7.1358 KOps/s | 6.8324 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.3122ms | 0.2027ms | 4.9334 KOps/s | 4.9464 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1603ms | 0.1111ms | 9.0020 KOps/s | 8.9872 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4442ms | 0.1782ms | 5.6130 KOps/s | 5.5894 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.2559ms | 11.1881μs | 89.3810 KOps/s | 98.7399 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 86.5120μs | 53.6030μs | 18.6557 KOps/s | 18.4986 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1642ms | 9.7085μs | 103.0030 KOps/s | 103.1693 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4712ms | 68.1504μs | 14.6734 KOps/s | 14.6958 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.3204ms | 0.1766ms | 5.6609 KOps/s | 5.1586 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3644ms | 0.2809ms | 3.5604 KOps/s | 3.4914 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1973ms | 0.1203ms | 8.3134 KOps/s | 7.8609 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1233ms | 74.6816μs | 13.3902 KOps/s | 13.2373 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.3857ms | 0.1594ms | 6.2724 KOps/s | 5.9968 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8122ms | 0.5194ms | 1.9252 KOps/s | 1.8973 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.3909ms | 0.3328ms | 3.0044 KOps/s | 2.9569 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2382ms | 0.1798ms | 5.5631 KOps/s | 5.0841 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1489ms | 90.7747μs | 11.0163 KOps/s | 11.1786 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.2354ms | 0.1223ms | 8.1770 KOps/s | 7.7289 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6689ms | 0.4348ms | 2.3000 KOps/s | 2.2967 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.2164ms | 0.1598ms | 6.2586 KOps/s | 6.0495 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 0.1065ms | 13.2543μs | 75.4474 KOps/s | 75.5765 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 85.7020μs | 41.2387μs | 24.2491 KOps/s | 24.0866 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1135ms | 10.7158μs | 93.3199 KOps/s | 93.4844 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4152ms | 52.7723μs | 18.9493 KOps/s | 19.0813 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 2.0142ms | 0.1739ms | 5.7501 KOps/s | 5.4517 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.4723ms | 3.2949ms | 303.5016 Ops/s | 302.5634 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 2.0119ms | 0.1608ms | 6.2186 KOps/s | 5.9648 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.8630ms | 2.7515ms | 363.4394 Ops/s | 357.3338 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.2253ms | 0.1090ms | 9.1724 KOps/s | 8.3599 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3164ms | 73.0917μs | 13.6814 KOps/s | 13.4581 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1992ms | 98.0979μs | 10.1939 KOps/s | 10.1075 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2600ms | 45.4529μs | 22.0008 KOps/s | 22.8179 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1483ms | 97.5423μs | 10.2520 KOps/s | 9.5309 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2564ms | 44.2970μs | 22.5749 KOps/s | 22.0274 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1103ms | 54.9331μs | 18.2040 KOps/s | 17.2833 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2204ms | 27.2677μs | 36.6735 KOps/s | 36.9183 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1463ms | 43.2706μs | 23.1104 KOps/s | 21.7881 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2572ms | 22.1995μs | 45.0460 KOps/s | 45.0873 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 96.6320μs | 45.3527μs | 22.0494 KOps/s | 21.1751 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2684ms | 21.9986μs | 45.4574 KOps/s | 45.1415 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1061ms | 55.0894μs | 18.1523 KOps/s | 16.9716 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2075ms | 26.8142μs | 37.2937 KOps/s | 36.7996 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 92.1920μs | 43.7046μs | 22.8809 KOps/s | 21.4838 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2634ms | 22.1738μs | 45.0982 KOps/s | 45.2912 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 0.1013ms | 44.1522μs | 22.6489 KOps/s | 21.7056 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2659ms | 21.9895μs | 45.4763 KOps/s | 45.2488 KOps/s | |
| test_compile_replace[single-eager] | 99.3420μs | 46.6874μs | 21.4191 KOps/s | 21.5229 KOps/s | |
| test_compile_replace[single-compile] | 0.1842ms | 0.1052ms | 9.5024 KOps/s | 9.2608 KOps/s | |
| test_compile_replace[multi-eager] | 0.7133ms | 0.5794ms | 1.7258 KOps/s | 1.7912 KOps/s | |
| test_compile_replace[multi-compile] | 0.2588ms | 0.1117ms | 8.9500 KOps/s | 8.7469 KOps/s | |
| test_compile_tc_getattr_20[eager] | 0.2239ms | 0.1659ms | 6.0263 KOps/s | 6.0302 KOps/s | |
| test_compile_tc_getattr_20[compile] | 0.3271ms | 0.1199ms | 8.3412 KOps/s | 8.2301 KOps/s | |
| test_compile_clone_shallow[20-eager] | 47.8000μs | 19.2590μs | 51.9237 KOps/s | 53.1324 KOps/s | |
| test_compile_clone_shallow[20-compile] | 62.0820μs | 11.2650μs | 88.7708 KOps/s | 87.2189 KOps/s | |
| test_compile_clone_shallow[40-eager] | 77.7310μs | 33.4404μs | 29.9040 KOps/s | 29.6623 KOps/s | |
| test_compile_clone_shallow[40-compile] | 63.9010μs | 12.4582μs | 80.2682 KOps/s | 76.1360 KOps/s | |
| test_compile_clone_shallow[80-eager] | 99.2820μs | 62.3292μs | 16.0438 KOps/s | 15.7490 KOps/s | |
| test_compile_clone_shallow[80-compile] | 57.6610μs | 14.7163μs | 67.9521 KOps/s | 69.5440 KOps/s | |
| test_compile_update_inplace[eager] | 0.1028ms | 58.7403μs | 17.0241 KOps/s | 16.7911 KOps/s | |
| test_compile_update_inplace[compile] | 0.2582ms | 0.1389ms | 7.1972 KOps/s | 6.7085 KOps/s | |
| test_mod_add[eager] | 0.1220ms | 49.4539μs | 20.2208 KOps/s | 20.4172 KOps/s | |
| test_mod_add[compile] | 0.1721ms | 0.1041ms | 9.6031 KOps/s | 9.3163 KOps/s | |
| test_mod_add[compile-overhead] | 0.3135ms | 0.1490ms | 6.7125 KOps/s | 6.4987 KOps/s | |
| test_mod_wrap[eager] | 0.3646ms | 0.2914ms | 3.4318 KOps/s | 3.4484 KOps/s | |
| test_mod_wrap[compile] | 0.6309ms | 0.3583ms | 2.7906 KOps/s | 2.8147 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.4590ms | 4.1025ms | 243.7534 Ops/s | 248.2331 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.6271ms | 1.5043ms | 664.7544 Ops/s | 671.4262 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.6054ms | 1.4397ms | 694.6018 Ops/s | 689.4579 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2448ms | 0.8885ms | 1.1255 KOps/s | 1.0995 KOps/s | |
| test_seq_add[eager] | 0.2303ms | 0.1569ms | 6.3735 KOps/s | 6.5007 KOps/s | |
| test_seq_add[compile] | 0.5575ms | 0.1144ms | 8.7397 KOps/s | 8.1318 KOps/s | |
| test_seq_add[compile-overhead] | 0.2111ms | 0.1576ms | 6.3463 KOps/s | 5.9324 KOps/s | |
| test_seq_wrap[eager] | 0.7037ms | 0.5350ms | 1.8691 KOps/s | 1.8848 KOps/s | |
| test_seq_wrap[compile] | 0.4748ms | 0.3743ms | 2.6716 KOps/s | 2.6332 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3377ms | 0.2708ms | 3.6926 KOps/s | 3.6915 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9710ms | 0.8942ms | 1.1184 KOps/s | 1.1898 KOps/s | |
| test_func_call_runtime[False-compile] | 1.1608ms | 0.9267ms | 1.0792 KOps/s | 1.0881 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.5167ms | 0.4631ms | 2.1596 KOps/s | 2.1232 KOps/s | |
| test_func_call_runtime[True-eager] | 1.1885ms | 1.0776ms | 928.0098 Ops/s | 925.7568 Ops/s | |
| test_func_call_runtime[True-compile] | 1.0103ms | 0.9203ms | 1.0866 KOps/s | 1.0645 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5385ms | 0.4762ms | 2.1001 KOps/s | 2.0600 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 1.2957ms | 0.8865ms | 1.1280 KOps/s | 1.2007 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 0.9970ms | 0.9086ms | 1.1006 KOps/s | 1.0892 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5416ms | 0.4675ms | 2.1390 KOps/s | 2.1286 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3434ms | 1.2310ms | 812.3533 Ops/s | 822.9822 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 0.9965ms | 0.9473ms | 1.0557 KOps/s | 1.0296 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5554ms | 0.5079ms | 1.9690 KOps/s | 1.9106 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8804ms | 2.3742ms | 421.1882 Ops/s | 419.0351 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0390ms | 0.9704ms | 1.0305 KOps/s | 1.0093 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5661ms | 0.5179ms | 1.9309 KOps/s | 1.8976 KOps/s | |
| test_distributed | 0.8121ms | 0.1532ms | 6.5273 KOps/s | 6.5408 KOps/s | |
| test_tdmodule | 0.3072ms | 27.6923μs | 36.1111 KOps/s | 35.7193 KOps/s | |
| test_tdmodule_dispatch | 72.6810μs | 44.4640μs | 22.4901 KOps/s | 21.9698 KOps/s | |
| test_tdseq | 48.6710μs | 26.9789μs | 37.0660 KOps/s | 37.0605 KOps/s | |
| test_tdseq_dispatch | 69.0010μs | 47.6851μs | 20.9709 KOps/s | 21.1648 KOps/s | |
| test_instantiation_functorch | 2.1768ms | 2.0700ms | 483.0873 Ops/s | 479.8831 Ops/s | |
| test_exec_functorch | 0.2189ms | 0.1781ms | 5.6161 KOps/s | 5.4980 KOps/s | |
| test_exec_functional_call | 0.2071ms | 0.1590ms | 6.2910 KOps/s | 6.2286 KOps/s | |
| test_exec_td_decorator | 0.4426ms | 0.2341ms | 4.2715 KOps/s | 4.2058 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0070ms | 0.8260ms | 1.2107 KOps/s | 1.2090 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 1.0332ms | 0.8274ms | 1.2086 KOps/s | 1.2165 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.9048ms | 0.7131ms | 1.4023 KOps/s | 1.4119 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8990ms | 0.7135ms | 1.4014 KOps/s | 1.4083 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.1480ms | 20.5223ms | 48.7275 Ops/s | 48.5774 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.3982ms | 20.5646ms | 48.6272 Ops/s | 48.6261 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.9987ms | 20.3596ms | 49.1168 Ops/s | 49.1641 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 21.1997ms | 20.3957ms | 49.0299 Ops/s | 49.0477 Ops/s | |
| test_to_module_speed[True] | 1.5579ms | 1.4819ms | 674.8291 Ops/s | 674.1735 Ops/s | |
| test_to_module_speed[False] | 1.5642ms | 1.4589ms | 685.4508 Ops/s | 689.5384 Ops/s | |
| test_tc_init | 68.3110μs | 44.9895μs | 22.2274 KOps/s | 22.5047 KOps/s | |
| test_tc_init_tensor_only | 40.3110μs | 9.7048μs | 103.0422 KOps/s | 101.3112 KOps/s | |
| test_tc_init_nested | 0.1408ms | 88.7637μs | 11.2659 KOps/s | 11.4285 KOps/s | |
| test_tc_init_many_fields | 41.9810μs | 16.3301μs | 61.2365 KOps/s | 59.8829 KOps/s | |
| test_tc_first_layer_tensor | 27.0500μs | 1.8241μs | 548.2130 KOps/s | 540.5713 KOps/s | |
| test_tc_first_layer_tensor_only | 2.7034μs | 0.4055μs | 2.4661 MOps/s | 2.5418 MOps/s | |
| test_tc_first_layer_tensor_set | 42.1710μs | 3.9508μs | 253.1162 KOps/s | 253.3258 KOps/s | |
| test_tc_first_layer_tensor_only_set | 30.5710μs | 3.2812μs | 304.7711 KOps/s | 304.4631 KOps/s | |
| test_tc_first_layer_nontensor | 34.0510μs | 6.1591μs | 162.3609 KOps/s | 161.7195 KOps/s | |
| test_tc_second_layer_tensor | 27.7510μs | 4.4256μs | 225.9563 KOps/s | 224.8043 KOps/s | |
| test_tc_second_layer_nontensor | 37.8410μs | 8.6909μs | 115.0624 KOps/s | 114.1211 KOps/s | |
| test_unbind | 0.2666s | 18.3167ms | 54.5949 Ops/s | 55.2336 Ops/s | |
| test_full_like | 7.5427ms | 4.4471ms | 224.8650 Ops/s | 225.1095 Ops/s | |
| test_zeros_like | 5.1286ms | 4.4142ms | 226.5406 Ops/s | 226.1677 Ops/s | |
| test_ones_like | 4.6109ms | 4.4113ms | 226.6917 Ops/s | 225.5629 Ops/s | |
| test_clone | 7.3167ms | 6.7254ms | 148.6903 Ops/s | 148.4699 Ops/s | |
| test_squeeze | 0.2300ms | 14.2391μs | 70.2291 KOps/s | 69.2488 KOps/s | |
| test_unsqueeze | 0.1645ms | 0.1123ms | 8.9061 KOps/s | 8.9112 KOps/s | |
| test_split | 0.2351ms | 0.1821ms | 5.4909 KOps/s | 5.4149 KOps/s | |
| test_permute | 0.2703ms | 0.2028ms | 4.9302 KOps/s | 4.5704 KOps/s | |
| test_stack | 36.7227ms | 35.7382ms | 27.9813 Ops/s | 18.5745 Ops/s | |
| test_cat | 36.2006ms | 35.6175ms | 28.0761 Ops/s | 19.2982 Ops/s | |
| test_sequential_tensordict | 0.5246ms | 0.2140ms | 4.6738 KOps/s | 4.4058 KOps/s | |
| test_sequential_graph_module | 0.2299ms | 0.1217ms | 8.2203 KOps/s | 8.4604 KOps/s | |
| test_nested_tensordict | 0.6846ms | 0.2811ms | 3.5568 KOps/s | 3.5104 KOps/s | |
| test_nested_graph_module | 0.1849ms | 0.1308ms | 7.6426 KOps/s | 7.6602 KOps/s |
Stack from ghstack (oldest at bottom):
Add dtensor_send() and dtensor_recv() methods to TensorDictBase with:
TensorDictPipe -> UCXX)
Strategy A (materialize) implementation:
Strategies B and C are stubbed with NotImplementedError (to be implemented in
subsequent PRs).
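
For illustration only, the strategy dispatch described above could look roughly like the sketch below. Everything in it is an assumption, not the code from this PR: `send_with_strategy`, `send_fn` and the `strategy` argument are hypothetical names, and "materialize" is stood in for by a plain local snapshot of a dict, since the actual dtensor_send()/dtensor_recv() signatures are not shown here.

```python
from typing import Any, Callable, Dict

# Hypothetical labels: only "A", "B" and "C" are named in the PR description.
_STUBBED = ("B", "C")


def send_with_strategy(
    data: Dict[str, Any],
    send_fn: Callable[[str, Any], None],
    strategy: str = "A",
) -> int:
    """Send every entry of ``data`` through ``send_fn``; return the entry count.

    Illustrative helper only, not part of the tensordict API.
    """
    if strategy == "A":
        # Stand-in for "materialize": snapshot the data locally, then send it
        # entry by entry through the supplied transport callback.
        materialized = dict(data)
        for key, value in materialized.items():
            send_fn(key, value)
        return len(materialized)
    if strategy in _STUBBED:
        # Mirrors the stubbing described above: known strategies that are
        # not available yet fail loudly instead of silently falling back.
        raise NotImplementedError(f"Strategy {strategy} is not implemented yet")
    raise ValueError(f"Unknown strategy: {strategy!r}")


# Usage: collect what would be sent instead of touching a real transport.
sent = []
count = send_with_strategy(
    {"obs": [1, 2], "reward": 0.5},
    lambda key, value: sent.append((key, value)),
)
```

Raising NotImplementedError for the stubbed strategies, rather than falling back to Strategy A, keeps callers from assuming behavior that later PRs may change.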
Made-with: Cursor