[DTensor] Fix CI: Python 3.10 compat, lint, and pyi stubs#1652

Open
vmoens wants to merge 3 commits into gh/vmoens/92/base from gh/vmoens/92/head

Conversation

@vmoens
Collaborator

@vmoens vmoens commented Mar 11, 2026

Stack from ghstack (oldest at bottom):


  • Fix _deduplicate_src_specs to use hashable (start, stop) tuples
    instead of slice objects as dict keys (slice is unhashable on
    Python < 3.12)
  • Add dtensor_send/dtensor_recv stubs to tensorclass.pyi
  • Fix lint: remove unused imports (TensorDictPipe TYPE_CHECKING,
    Sequence in megatron, _TransferPlan/ParameterPlan in tests),
    unused variable (sends in ModelTransferPlan.execute), loop
    variable naming (key -> _key in base.py)
  • Fix docstring formatting (D205/D415 in _chunk_slice,
    ModelTransferPlan)
  • Add examples/*.py to T201 (print) lint ignore in setup.cfg
  • Remove unused variable in example file
  • Auto-format with ufmt/black
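
The first bullet can be illustrated with a minimal sketch. It is not the actual `_deduplicate_src_specs` implementation: the helper name `dedupe_by_range`, the `(slice, payload)` input shape, and the assumption that every slice has `step=None` are all hypothetical. It only shows why `(start, stop)` tuples work as dict keys on every supported Python version, whereas `slice` objects became hashable only in Python 3.12.

```python
import sys


def dedupe_by_range(specs):
    """Group hypothetical (slice, payload) pairs that cover the same range.

    Assumes every slice has step=None, so (start, stop) identifies it fully.
    """
    grouped = {}
    for chunk, payload in specs:
        # Tuples of ints are hashable on every Python version,
        # unlike slice objects before Python 3.12.
        key = (chunk.start, chunk.stop)
        grouped.setdefault(key, []).append(payload)
    return grouped


# Check what this interpreter does with a slice used as a hash key.
try:
    hash(slice(0, 4))
    slice_is_hashable = True
except TypeError:
    slice_is_hashable = False

specs = [(slice(0, 4), "rank0"), (slice(0, 4), "rank1"), (slice(4, 8), "rank2")]
result = dedupe_by_range(specs)

# The original slices can be rebuilt from the tuple keys when needed.
slices = [slice(start, stop) for start, stop in result]
```

The tuple key drops the `step` field, so this approach only fits cases where slices are contiguous ranges; otherwise the key would need to be `(start, stop, step)`.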

Made-with: Cursor

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 11, 2026
ghstack-source-id: 92ae2d9
Pull-Request: #1652
@meta-cla meta-cla bot added the CLA Signed label Mar 11, 2026
@github-actions
Contributor

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Fix CI: Python 3.10 compat, lint, and pyi stubs

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
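
A rough sketch of how such a case-insensitive prefix check could work is shown below. This is not the bot's actual code: the function name `label_for_title` and the `PREFIX_LABELS` mapping are hypothetical, and the mapping covers only a few of the prefixes listed above.

```python
import re

# Hypothetical subset of the prefix-to-label mapping, keyed in lowercase.
PREFIX_LABELS = {
    "bugfix": "bug",
    "fix": "bug",
    "feature": "Feature",
    "doc": "documentation",
    "docs": "documentation",
    "ci": "CI",
    "typing": "Typing",
    "type": "Typing",
}


def label_for_title(title):
    """Return the label for a title's leading [Prefix], or None if unknown."""
    match = re.match(r"\[([^\]]+)\]", title)
    if match is None:
        return None
    # Lowercasing the captured prefix makes the lookup case-insensitive.
    return PREFIX_LABELS.get(match.group(1).lower())


rejected = label_for_title("[DTensor] Fix CI: Python 3.10 compat, lint, and pyi stubs")
accepted = label_for_title("[ci] Fix workflow permissions")
```

Under these assumptions, `[DTensor]` maps to no label, which matches the error reported for this PR title, while `[ci]` resolves to the `CI` label.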

@github-actions
Contributor

github-actions bot commented Mar 11, 2026

$\color{#D29922}\textsf{\Large&#x26A0;\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 261. Improved: $\large\color{#35bf28}31$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 29.0600μs 15.1333μs 66.0794 KOps/s 67.4849 KOps/s $\color{#d91a1a}-2.08\%$
test_plain_set_stack_nested 36.1500μs 15.4504μs 64.7233 KOps/s 66.4998 KOps/s $\color{#d91a1a}-2.67\%$
test_plain_set_nested_inplace 41.0510μs 17.0275μs 58.7285 KOps/s 59.3107 KOps/s $\color{#d91a1a}-0.98\%$
test_plain_set_stack_nested_inplace 46.5410μs 16.7894μs 59.5613 KOps/s 59.5288 KOps/s $\color{#35bf28}+0.05\%$
test_items 32.5610μs 5.9877μs 167.0084 KOps/s 165.9980 KOps/s $\color{#35bf28}+0.61\%$
test_items_nested 0.5489ms 0.4684ms 2.1348 KOps/s 2.1381 KOps/s $\color{#d91a1a}-0.16\%$
test_items_nested_locked 0.5628ms 0.4694ms 2.1303 KOps/s 2.1274 KOps/s $\color{#35bf28}+0.13\%$
test_items_nested_leaf 0.1334ms 97.5230μs 10.2540 KOps/s 10.1245 KOps/s $\color{#35bf28}+1.28\%$
test_items_stack_nested 0.5531ms 0.4639ms 2.1554 KOps/s 2.1285 KOps/s $\color{#35bf28}+1.26\%$
test_items_stack_nested_leaf 0.1313ms 98.1045μs 10.1932 KOps/s 10.2551 KOps/s $\color{#d91a1a}-0.60\%$
test_items_stack_nested_locked 0.4983ms 0.4679ms 2.1372 KOps/s 2.1444 KOps/s $\color{#d91a1a}-0.34\%$
test_keys 47.7900μs 4.2285μs 236.4915 KOps/s 234.4696 KOps/s $\color{#35bf28}+0.86\%$
test_keys_nested 1.0876ms 0.1284ms 7.7863 KOps/s 7.7487 KOps/s $\color{#35bf28}+0.49\%$
test_keys_nested_locked 2.2387ms 0.1392ms 7.1856 KOps/s 7.2364 KOps/s $\color{#d91a1a}-0.70\%$
test_keys_nested_leaf 0.1707ms 0.1208ms 8.2752 KOps/s 8.3516 KOps/s $\color{#d91a1a}-0.91\%$
test_keys_stack_nested 0.1849ms 0.1302ms 7.6804 KOps/s 7.7289 KOps/s $\color{#d91a1a}-0.63\%$
test_keys_stack_nested_leaf 0.1869ms 0.1213ms 8.2448 KOps/s 8.3947 KOps/s $\color{#d91a1a}-1.78\%$
test_keys_stack_nested_locked 0.2061ms 0.1358ms 7.3633 KOps/s 7.2850 KOps/s $\color{#35bf28}+1.08\%$
test_values 4.5520μs 1.0052μs 994.8491 KOps/s 985.2960 KOps/s $\color{#35bf28}+0.97\%$
test_values_nested 81.6310μs 53.0034μs 18.8667 KOps/s 19.1074 KOps/s $\color{#d91a1a}-1.26\%$
test_values_nested_locked 84.7910μs 55.8795μs 17.8957 KOps/s 18.0761 KOps/s $\color{#d91a1a}-1.00\%$
test_values_nested_leaf 91.4410μs 60.9412μs 16.4093 KOps/s 16.7864 KOps/s $\color{#d91a1a}-2.25\%$
test_values_stack_nested 83.1410μs 53.0780μs 18.8402 KOps/s 19.1890 KOps/s $\color{#d91a1a}-1.82\%$
test_values_stack_nested_leaf 0.1438ms 59.5702μs 16.7869 KOps/s 16.5063 KOps/s $\color{#35bf28}+1.70\%$
test_values_stack_nested_locked 91.0010μs 55.8675μs 17.8995 KOps/s 17.8368 KOps/s $\color{#35bf28}+0.35\%$
test_membership 6.7550μs 0.8315μs 1.2027 MOps/s 1.1681 MOps/s $\color{#35bf28}+2.96\%$
test_membership_nested 26.8700μs 2.8703μs 348.4007 KOps/s 348.1948 KOps/s $\color{#35bf28}+0.06\%$
test_membership_nested_leaf 29.7710μs 2.8676μs 348.7185 KOps/s 345.2127 KOps/s $\color{#35bf28}+1.02\%$
test_membership_stacked_nested 39.4600μs 2.8871μs 346.3661 KOps/s 343.3956 KOps/s $\color{#35bf28}+0.87\%$
test_membership_stacked_nested_leaf 32.5100μs 2.8743μs 347.9107 KOps/s 339.7473 KOps/s $\color{#35bf28}+2.40\%$
test_membership_nested_last 35.3310μs 4.4713μs 223.6461 KOps/s 228.2546 KOps/s $\color{#d91a1a}-2.02\%$
test_membership_nested_leaf_last 31.6310μs 4.4515μs 224.6444 KOps/s 227.9318 KOps/s $\color{#d91a1a}-1.44\%$
test_membership_stacked_nested_last 31.3900μs 4.4586μs 224.2872 KOps/s 227.5900 KOps/s $\color{#d91a1a}-1.45\%$
test_membership_stacked_nested_leaf_last 25.8700μs 4.4033μs 227.1002 KOps/s 228.8977 KOps/s $\color{#d91a1a}-0.79\%$
test_nested_getleaf 51.7010μs 21.6473μs 46.1952 KOps/s 46.0297 KOps/s $\color{#35bf28}+0.36\%$
test_nested_get 51.8900μs 20.5072μs 48.7635 KOps/s 48.2837 KOps/s $\color{#35bf28}+0.99\%$
test_stacked_getleaf 44.8900μs 21.4864μs 46.5410 KOps/s 46.0792 KOps/s $\color{#35bf28}+1.00\%$
test_stacked_get 46.6700μs 20.7275μs 48.2452 KOps/s 48.4771 KOps/s $\color{#d91a1a}-0.48\%$
test_nested_getitemleaf 50.8710μs 22.1175μs 45.2130 KOps/s 45.3210 KOps/s $\color{#d91a1a}-0.24\%$
test_nested_getitem 52.2610μs 21.0902μs 47.4153 KOps/s 47.6014 KOps/s $\color{#d91a1a}-0.39\%$
test_stacked_getitemleaf 52.0010μs 21.8872μs 45.6888 KOps/s 45.0605 KOps/s $\color{#35bf28}+1.39\%$
test_stacked_getitem 58.4510μs 21.1625μs 47.2535 KOps/s 47.5097 KOps/s $\color{#d91a1a}-0.54\%$
test_lock_nested 0.5909ms 0.4744ms 2.1081 KOps/s 2.0985 KOps/s $\color{#35bf28}+0.46\%$
test_lock_stack_nested 0.5823ms 0.4787ms 2.0892 KOps/s 2.0598 KOps/s $\color{#35bf28}+1.43\%$
test_unlock_nested 0.4805ms 0.3901ms 2.5637 KOps/s 2.5654 KOps/s $\color{#d91a1a}-0.06\%$
test_unlock_stack_nested 0.4298ms 0.3889ms 2.5711 KOps/s 2.5526 KOps/s $\color{#35bf28}+0.72\%$
test_flatten_speed 0.1729ms 0.1226ms 8.1575 KOps/s 8.1141 KOps/s $\color{#35bf28}+0.54\%$
test_unflatten_speed 0.6873ms 0.5832ms 1.7146 KOps/s 1.7476 KOps/s $\color{#d91a1a}-1.89\%$
test_common_ops 0.8183ms 0.6789ms 1.4730 KOps/s 1.4228 KOps/s $\color{#35bf28}+3.52\%$
test_creation 85.7610μs 3.1232μs 320.1835 KOps/s 316.4082 KOps/s $\color{#35bf28}+1.19\%$
test_creation_empty 24.1300μs 6.9117μs 144.6812 KOps/s 141.7417 KOps/s $\color{#35bf28}+2.07\%$
test_creation_nested_1 42.3310μs 11.4074μs 87.6624 KOps/s 85.5611 KOps/s $\color{#35bf28}+2.46\%$
test_creation_nested_2 33.7200μs 13.2179μs 75.6547 KOps/s 76.4012 KOps/s $\color{#d91a1a}-0.98\%$
test_creation_many_keys[10] 49.9110μs 20.9743μs 47.6773 KOps/s 47.6964 KOps/s $\color{#d91a1a}-0.04\%$
test_creation_many_keys[50] 0.1300ms 89.9243μs 11.1205 KOps/s 10.9170 KOps/s $\color{#35bf28}+1.86\%$
test_creation_many_keys[100] 0.2373ms 0.1746ms 5.7288 KOps/s 5.6040 KOps/s $\color{#35bf28}+2.23\%$
test_creation_nested_many_keys[10] 76.2400μs 44.5191μs 22.4623 KOps/s 21.9838 KOps/s $\color{#35bf28}+2.18\%$
test_creation_nested_many_keys[50] 0.2349ms 0.1817ms 5.5031 KOps/s 5.3957 KOps/s $\color{#35bf28}+1.99\%$
test_clone 39.4100μs 12.8084μs 78.0737 KOps/s 76.4184 KOps/s $\color{#35bf28}+2.17\%$
test_getitem[int] 1.5745ms 14.9158μs 67.0430 KOps/s 59.1629 KOps/s $\textbf{\color{#35bf28}+13.32\%}$
test_getitem[slice_int] 0.1395ms 23.6049μs 42.3642 KOps/s 40.2691 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_getitem[range] 0.1867ms 61.4794μs 16.2656 KOps/s 15.6688 KOps/s $\color{#35bf28}+3.81\%$
test_getitem[tuple] 0.1398ms 23.4837μs 42.5828 KOps/s 41.1456 KOps/s $\color{#35bf28}+3.49\%$
test_getitem[list] 0.1812ms 56.9312μs 17.5651 KOps/s 17.1251 KOps/s $\color{#35bf28}+2.57\%$
test_setitem_dim[int] 59.7810μs 26.0026μs 38.4576 KOps/s 38.3590 KOps/s $\color{#35bf28}+0.26\%$
test_setitem_dim[slice_int] 78.4410μs 42.6472μs 23.4482 KOps/s 23.0289 KOps/s $\color{#35bf28}+1.82\%$
test_setitem_dim[range] 0.1204ms 93.6106μs 10.6826 KOps/s 10.5111 KOps/s $\color{#35bf28}+1.63\%$
test_setitem_dim[tuple] 60.9910μs 39.6742μs 25.2053 KOps/s 25.4403 KOps/s $\color{#d91a1a}-0.92\%$
test_setitem 57.7000μs 17.0869μs 58.5243 KOps/s 55.7609 KOps/s $\color{#35bf28}+4.96\%$
test_set 47.5310μs 16.3240μs 61.2596 KOps/s 58.6794 KOps/s $\color{#35bf28}+4.40\%$
test_set_shared 0.5005ms 0.2096ms 4.7717 KOps/s 4.8750 KOps/s $\color{#d91a1a}-2.12\%$
test_update 0.3221ms 20.6199μs 48.4968 KOps/s 45.3088 KOps/s $\textbf{\color{#35bf28}+7.04\%}$
test_update_nested 69.0210μs 31.6569μs 31.5887 KOps/s 30.1239 KOps/s $\color{#35bf28}+4.86\%$
test_update__nested 0.4510ms 32.8382μs 30.4524 KOps/s 28.8319 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_set_nested 53.2900μs 18.2573μs 54.7726 KOps/s 51.7454 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_set_nested_new 61.9710μs 22.8146μs 43.8315 KOps/s 41.6039 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_select 78.9310μs 39.0837μs 25.5861 KOps/s 24.4280 KOps/s $\color{#35bf28}+4.74\%$
test_select_nested 0.1139ms 73.3014μs 13.6423 KOps/s 13.4005 KOps/s $\color{#35bf28}+1.80\%$
test_exclude_nested 0.1323ms 92.2179μs 10.8439 KOps/s 10.7811 KOps/s $\color{#35bf28}+0.58\%$
test_empty[True] 0.4853ms 0.4004ms 2.4973 KOps/s 2.4903 KOps/s $\color{#35bf28}+0.28\%$
test_empty[False] 8.9125μs 1.3206μs 757.2062 KOps/s 758.2936 KOps/s $\color{#d91a1a}-0.14\%$
test_to 0.1014ms 70.6145μs 14.1614 KOps/s 13.4125 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_to_nonblocking 0.1124ms 62.5342μs 15.9912 KOps/s 15.3518 KOps/s $\color{#35bf28}+4.17\%$
test_unbind_speed 0.3975ms 0.3285ms 3.0445 KOps/s 2.9707 KOps/s $\color{#35bf28}+2.49\%$
test_unbind_speed_stack0 0.4344ms 0.3273ms 3.0551 KOps/s 2.9996 KOps/s $\color{#35bf28}+1.85\%$
test_unbind_speed_stack1 0.1046s 0.9088ms 1.1003 KOps/s 950.3171 Ops/s $\textbf{\color{#35bf28}+15.78\%}$
test_split 1.2408ms 1.1324ms 883.0683 Ops/s 875.0311 Ops/s $\color{#35bf28}+0.92\%$
test_chunk 0.1043s 1.2033ms 831.0411 Ops/s 925.7356 Ops/s $\textbf{\color{#d91a1a}-10.23\%}$
test_to_cpu_blocking 19.0780ms 18.7354ms 53.3748 Ops/s 48.0045 Ops/s $\textbf{\color{#35bf28}+11.19\%}$
test_to_cpu_global_sync 11.1699ms 11.0412ms 90.5697 Ops/s 88.8544 Ops/s $\color{#35bf28}+1.93\%$
test_to_cpu_event_sync 0.1164s 13.1437ms 76.0823 Ops/s 82.5885 Ops/s $\textbf{\color{#d91a1a}-7.88\%}$
test_to_cpu_default 12.2973ms 11.8770ms 84.1960 Ops/s 82.6844 Ops/s $\color{#35bf28}+1.83\%$
test_consolidate[False-None] 4.2835ms 4.1135ms 243.1046 Ops/s 214.7061 Ops/s $\textbf{\color{#35bf28}+13.23\%}$
test_consolidate[default-None] 2.4368ms 1.9852ms 503.7273 Ops/s 472.6338 Ops/s $\textbf{\color{#35bf28}+6.58\%}$
test_consolidate[reduce-overhead-None] 2.3615ms 1.9097ms 523.6354 Ops/s 493.8311 Ops/s $\textbf{\color{#35bf28}+6.04\%}$
test_consolidate_njt[False-None] 9.4315ms 8.8199ms 113.3806 Ops/s 114.1105 Ops/s $\color{#d91a1a}-0.64\%$
test_to[False-False-None] 2.4909ms 2.0705ms 482.9669 Ops/s 469.6955 Ops/s $\color{#35bf28}+2.83\%$
test_to[True-False-None] 2.3597ms 1.9202ms 520.7691 Ops/s 511.5337 Ops/s $\color{#35bf28}+1.81\%$
test_to[within-False-None] 6.6140ms 6.1909ms 161.5265 Ops/s 160.8586 Ops/s $\color{#35bf28}+0.42\%$
test_to[True-default-None] 9.3567ms 8.9462ms 111.7793 Ops/s 111.3806 Ops/s $\color{#35bf28}+0.36\%$
test_to_njt[False-False-None] 9.5135ms 8.4564ms 118.2531 Ops/s 116.8191 Ops/s $\color{#35bf28}+1.23\%$
test_to_njt[True-False-None] 7.4288ms 7.1791ms 139.2941 Ops/s 142.5648 Ops/s $\color{#d91a1a}-2.29\%$
test_to_njt[within-False-None] 16.4044ms 15.9493ms 62.6986 Ops/s 62.6990 Ops/s $-0.00\%$
test_creation[device0] 0.3910ms 0.1189ms 8.4078 KOps/s 8.4138 KOps/s $\color{#d91a1a}-0.07\%$
test_creation_from_tensor 0.6820ms 0.1164ms 8.5918 KOps/s 8.6342 KOps/s $\color{#d91a1a}-0.49\%$
test_add_one[memmap_tensor0] 0.1986ms 6.7221μs 148.7639 KOps/s 152.2398 KOps/s $\color{#d91a1a}-2.28\%$
test_contiguous[memmap_tensor0] 21.4200μs 0.7241μs 1.3811 MOps/s 2.1617 MOps/s $\textbf{\color{#d91a1a}-36.11\%}$
test_stack[memmap_tensor0] 36.9400μs 4.6172μs 216.5795 KOps/s 209.4272 KOps/s $\color{#35bf28}+3.42\%$
test_memmaptd_index 1.0315ms 0.2650ms 3.7729 KOps/s 3.7337 KOps/s $\color{#35bf28}+1.05\%$
test_memmaptd_index_astensor 0.8002ms 0.3729ms 2.6819 KOps/s 2.6898 KOps/s $\color{#d91a1a}-0.29\%$
test_memmaptd_index_op 1.0638ms 0.6286ms 1.5909 KOps/s 1.5804 KOps/s $\color{#35bf28}+0.67\%$
test_serialize_model 0.3145s 0.1612s 6.2020 Ops/s 7.4072 Ops/s $\textbf{\color{#d91a1a}-16.27\%}$
test_serialize_model_pickle 1.3591s 1.2104s 0.8262 Ops/s 0.8215 Ops/s $\color{#35bf28}+0.57\%$
test_serialize_weights 0.1384s 0.1355s 7.3784 Ops/s 7.3862 Ops/s $\color{#d91a1a}-0.11\%$
test_serialize_weights_returnearly 0.4154s 87.2336ms 11.4635 Ops/s 15.1754 Ops/s $\textbf{\color{#d91a1a}-24.46\%}$
test_serialize_weights_pickle 1.4264s 1.2237s 0.8172 Ops/s 0.8226 Ops/s $\color{#d91a1a}-0.65\%$
test_reshape_pytree 0.1999ms 32.4443μs 30.8220 KOps/s 30.3226 KOps/s $\color{#35bf28}+1.65\%$
test_reshape_td 77.1810μs 44.0160μs 22.7190 KOps/s 21.9994 KOps/s $\color{#35bf28}+3.27\%$
test_view_pytree 0.2136ms 32.2388μs 31.0185 KOps/s 30.6064 KOps/s $\color{#35bf28}+1.35\%$
test_view_td 0.1610ms 52.6381μs 18.9977 KOps/s 18.6420 KOps/s $\color{#35bf28}+1.91\%$
test_unbind_pytree 0.2286ms 36.4357μs 27.4456 KOps/s 27.1801 KOps/s $\color{#35bf28}+0.98\%$
test_unbind_td 0.1511ms 48.4535μs 20.6384 KOps/s 19.7801 KOps/s $\color{#35bf28}+4.34\%$
test_split_pytree 0.1995ms 42.2379μs 23.6754 KOps/s 23.4694 KOps/s $\color{#35bf28}+0.88\%$
test_split_td 0.2747ms 63.8646μs 15.6581 KOps/s 15.3820 KOps/s $\color{#35bf28}+1.80\%$
test_add_pytree 0.2321ms 41.7137μs 23.9729 KOps/s 23.8688 KOps/s $\color{#35bf28}+0.44\%$
test_add_td 0.1718ms 55.3151μs 18.0783 KOps/s 18.2646 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_add_one_nested[tensordict-compile] 0.2780ms 0.1456ms 6.8682 KOps/s 6.6644 KOps/s $\color{#35bf28}+3.06\%$
test_compile_add_one_nested[tensordict-eager] 0.3354ms 0.2003ms 4.9920 KOps/s 5.0535 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_add_one_nested[pytree-compile] 0.1572ms 0.1053ms 9.4992 KOps/s 8.9503 KOps/s $\textbf{\color{#35bf28}+6.13\%}$
test_compile_add_one_nested[pytree-eager] 0.4310ms 0.1723ms 5.8053 KOps/s 5.4588 KOps/s $\textbf{\color{#35bf28}+6.35\%}$
test_compile_copy_nested[tensordict-compile] 0.3556ms 10.2749μs 97.3247 KOps/s 97.9908 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_copy_nested[tensordict-eager] 87.5310μs 53.9406μs 18.5389 KOps/s 18.4914 KOps/s $\color{#35bf28}+0.26\%$
test_compile_copy_nested[pytree-compile] 40.8010μs 10.2332μs 97.7213 KOps/s 106.3919 KOps/s $\textbf{\color{#d91a1a}-8.15\%}$
test_compile_copy_nested[pytree-eager] 0.4684ms 69.3478μs 14.4201 KOps/s 14.7948 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_add_one_flat[tensordict-compile] 0.2134ms 0.1754ms 5.7000 KOps/s 5.3864 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_compile_add_one_flat[tensordict-eager] 0.3756ms 0.2818ms 3.5486 KOps/s 3.5316 KOps/s $\color{#35bf28}+0.48\%$
test_compile_add_one_flat[tensorclass-compile] 0.1916ms 0.1170ms 8.5442 KOps/s 8.2602 KOps/s $\color{#35bf28}+3.44\%$
test_compile_add_one_flat[tensorclass-eager] 0.1075ms 72.9477μs 13.7084 KOps/s 13.5960 KOps/s $\color{#35bf28}+0.83\%$
test_compile_add_one_flat[pytree-compile] 0.1973ms 0.1576ms 6.3438 KOps/s 6.1720 KOps/s $\color{#35bf28}+2.78\%$
test_compile_add_one_flat[pytree-eager] 0.7958ms 0.5022ms 1.9912 KOps/s 1.9014 KOps/s $\color{#35bf28}+4.72\%$
test_compile_add_self_flat[tensordict-eager] 0.4735ms 0.3342ms 2.9924 KOps/s 2.9753 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_self_flat[tensordict-compile] 0.2379ms 0.1807ms 5.5338 KOps/s 3.3141 KOps/s $\textbf{\color{#35bf28}+66.98\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1278ms 90.1250μs 11.0957 KOps/s 11.2513 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_add_self_flat[tensorclass-compile] 0.1970ms 0.1186ms 8.4315 KOps/s 7.7744 KOps/s $\textbf{\color{#35bf28}+8.45\%}$
test_compile_add_self_flat[pytree-eager] 0.6201ms 0.4228ms 2.3652 KOps/s 2.2679 KOps/s $\color{#35bf28}+4.29\%$
test_compile_add_self_flat[pytree-compile] 0.2012ms 0.1579ms 6.3331 KOps/s 6.1189 KOps/s $\color{#35bf28}+3.50\%$
test_compile_copy_flat[tensordict-compile] 79.1910μs 13.3782μs 74.7482 KOps/s 73.9853 KOps/s $\color{#35bf28}+1.03\%$
test_compile_copy_flat[tensordict-eager] 69.6200μs 40.8693μs 24.4682 KOps/s 24.3438 KOps/s $\color{#35bf28}+0.51\%$
test_compile_copy_flat[pytree-compile] 39.6410μs 10.5783μs 94.5328 KOps/s 91.7269 KOps/s $\color{#35bf28}+3.06\%$
test_compile_copy_flat[pytree-eager] 0.4251ms 53.0622μs 18.8458 KOps/s 18.8683 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_assign_and_add[tensordict-compile] 2.0111ms 0.1744ms 5.7344 KOps/s 5.2561 KOps/s $\textbf{\color{#35bf28}+9.10\%}$
test_compile_assign_and_add[tensordict-eager] 3.3954ms 3.2628ms 306.4894 Ops/s 296.8255 Ops/s $\color{#35bf28}+3.26\%$
test_compile_assign_and_add[pytree-compile] 1.9793ms 0.1615ms 6.1925 KOps/s 6.2019 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_assign_and_add[pytree-eager] 2.8034ms 2.7001ms 370.3517 Ops/s 367.0204 Ops/s $\color{#35bf28}+0.91\%$
test_compile_indexing[tensor-tensordict-compile] 0.1592ms 0.1081ms 9.2538 KOps/s 8.6548 KOps/s $\textbf{\color{#35bf28}+6.92\%}$
test_compile_indexing[tensor-tensordict-eager] 0.3141ms 72.8803μs 13.7211 KOps/s 13.3369 KOps/s $\color{#35bf28}+2.88\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1224ms 94.0212μs 10.6359 KOps/s 10.2629 KOps/s $\color{#35bf28}+3.63\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2768ms 43.8508μs 22.8046 KOps/s 22.4614 KOps/s $\color{#35bf28}+1.53\%$
test_compile_indexing[tensor-pytree-compile] 0.1349ms 94.7453μs 10.5546 KOps/s 9.6259 KOps/s $\textbf{\color{#35bf28}+9.65\%}$
test_compile_indexing[tensor-pytree-eager] 0.2411ms 43.5344μs 22.9703 KOps/s 22.4147 KOps/s $\color{#35bf28}+2.48\%$
test_compile_indexing[slice-tensordict-compile] 94.2810μs 56.4411μs 17.7176 KOps/s 16.3271 KOps/s $\textbf{\color{#35bf28}+8.52\%}$
test_compile_indexing[slice-tensordict-eager] 0.2249ms 27.5361μs 36.3160 KOps/s 36.6506 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_indexing[slice-tensorclass-compile] 76.1810μs 44.0567μs 22.6980 KOps/s 21.7853 KOps/s $\color{#35bf28}+4.19\%$
test_compile_indexing[slice-tensorclass-eager] 0.2532ms 22.6342μs 44.1809 KOps/s 44.8621 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_indexing[slice-pytree-compile] 83.3410μs 44.2587μs 22.5944 KOps/s 21.9735 KOps/s $\color{#35bf28}+2.83\%$
test_compile_indexing[slice-pytree-eager] 0.2615ms 22.5623μs 44.3218 KOps/s 44.8871 KOps/s $\color{#d91a1a}-1.26\%$
test_compile_indexing[int-tensordict-compile] 87.3910μs 56.2642μs 17.7733 KOps/s 16.0755 KOps/s $\textbf{\color{#35bf28}+10.56\%}$
test_compile_indexing[int-tensordict-eager] 0.2220ms 27.5504μs 36.2971 KOps/s 36.5001 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[int-tensorclass-compile] 79.2510μs 44.2589μs 22.5943 KOps/s 21.3040 KOps/s $\textbf{\color{#35bf28}+6.06\%}$
test_compile_indexing[int-tensorclass-eager] 0.2560ms 22.5240μs 44.3971 KOps/s 44.8330 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_indexing[int-pytree-compile] 82.1410μs 44.9324μs 22.2556 KOps/s 21.2286 KOps/s $\color{#35bf28}+4.84\%$
test_compile_indexing[int-pytree-eager] 0.2760ms 22.5614μs 44.3234 KOps/s 44.6102 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_replace[single-eager] 0.1068ms 47.5067μs 21.0497 KOps/s 20.2244 KOps/s $\color{#35bf28}+4.08\%$
test_compile_replace[single-compile] 0.1798ms 0.1042ms 9.5930 KOps/s 9.2607 KOps/s $\color{#35bf28}+3.59\%$
test_compile_replace[multi-eager] 0.6643ms 0.5602ms 1.7851 KOps/s 1.8000 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_replace[multi-compile] 0.1583ms 0.1106ms 9.0452 KOps/s 8.6785 KOps/s $\color{#35bf28}+4.23\%$
test_compile_tc_getattr_20[eager] 0.1978ms 0.1587ms 6.3013 KOps/s 6.0247 KOps/s $\color{#35bf28}+4.59\%$
test_compile_tc_getattr_20[compile] 0.1649ms 0.1182ms 8.4605 KOps/s 8.0514 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_compile_clone_shallow[20-eager] 78.3510μs 19.3399μs 51.7065 KOps/s 52.2929 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_clone_shallow[20-compile] 42.8510μs 11.3442μs 88.1505 KOps/s 85.2723 KOps/s $\color{#35bf28}+3.38\%$
test_compile_clone_shallow[40-eager] 66.0210μs 33.9398μs 29.4639 KOps/s 29.9198 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_clone_shallow[40-compile] 41.0410μs 12.7131μs 78.6591 KOps/s 78.6425 KOps/s $\color{#35bf28}+0.02\%$
test_compile_clone_shallow[80-eager] 98.4310μs 63.2659μs 15.8063 KOps/s 15.8587 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_clone_shallow[80-compile] 49.6100μs 15.2926μs 65.3911 KOps/s 65.2588 KOps/s $\color{#35bf28}+0.20\%$
test_compile_update_inplace[eager] 98.4010μs 58.9444μs 16.9651 KOps/s 16.9771 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_update_inplace[compile] 0.1961ms 0.1395ms 7.1697 KOps/s 6.8812 KOps/s $\color{#35bf28}+4.19\%$
test_mod_add[eager] 0.1250ms 48.7457μs 20.5146 KOps/s 20.3689 KOps/s $\color{#35bf28}+0.72\%$
test_mod_add[compile] 0.1568ms 0.1045ms 9.5729 KOps/s 9.3310 KOps/s $\color{#35bf28}+2.59\%$
test_mod_add[compile-overhead] 0.2424ms 0.1492ms 6.7009 KOps/s 6.4966 KOps/s $\color{#35bf28}+3.14\%$
test_mod_wrap[eager] 0.3666ms 0.2917ms 3.4287 KOps/s 3.4401 KOps/s $\color{#d91a1a}-0.33\%$
test_mod_wrap[compile] 0.4223ms 0.3416ms 2.9275 KOps/s 2.7174 KOps/s $\textbf{\color{#35bf28}+7.73\%}$
test_mod_wrap[compile-overhead] 7.2327ms 3.9985ms 250.0911 Ops/s 250.5469 Ops/s $\color{#d91a1a}-0.18\%$
test_mod_wrap_and_backward[eager] 1.6850ms 1.4789ms 676.1608 Ops/s 669.5056 Ops/s $\color{#35bf28}+0.99\%$
test_mod_wrap_and_backward[compile] 1.6902ms 1.4165ms 705.9618 Ops/s 683.2401 Ops/s $\color{#35bf28}+3.33\%$
test_mod_wrap_and_backward[compile-overhead] 1.2686ms 0.8903ms 1.1232 KOps/s 1.1071 KOps/s $\color{#35bf28}+1.45\%$
test_seq_add[eager] 0.1979ms 0.1511ms 6.6167 KOps/s 6.5335 KOps/s $\color{#35bf28}+1.27\%$
test_seq_add[compile] 0.1683ms 0.1144ms 8.7430 KOps/s 8.4610 KOps/s $\color{#35bf28}+3.33\%$
test_seq_add[compile-overhead] 0.1985ms 0.1513ms 6.6084 KOps/s 6.2656 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_seq_wrap[eager] 0.6105ms 0.5352ms 1.8684 KOps/s 1.8773 KOps/s $\color{#d91a1a}-0.47\%$
test_seq_wrap[compile] 0.4493ms 0.3742ms 2.6724 KOps/s 2.5759 KOps/s $\color{#35bf28}+3.75\%$
test_seq_wrap[compile-overhead] 0.3485ms 0.2646ms 3.7796 KOps/s 3.5525 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_func_call_runtime[False-eager] 0.9796ms 0.8692ms 1.1504 KOps/s 1.1774 KOps/s $\color{#d91a1a}-2.29\%$
test_func_call_runtime[False-compile] 1.0205ms 0.8889ms 1.1250 KOps/s 1.1028 KOps/s $\color{#35bf28}+2.01\%$
test_func_call_runtime[False-compile-overhead] 0.5613ms 0.4604ms 2.1718 KOps/s 2.1491 KOps/s $\color{#35bf28}+1.06\%$
test_func_call_runtime[True-eager] 1.2038ms 1.0662ms 937.9433 Ops/s 927.7884 Ops/s $\color{#35bf28}+1.09\%$
test_func_call_runtime[True-compile] 1.0100ms 0.8975ms 1.1142 KOps/s 1.0437 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_func_call_runtime[True-compile-overhead] 0.6008ms 0.4728ms 2.1152 KOps/s 2.0290 KOps/s $\color{#35bf28}+4.25\%$
test_func_call_cm_runtime[False-eager] 0.9613ms 0.8366ms 1.1953 KOps/s 1.1405 KOps/s $\color{#35bf28}+4.80\%$
test_func_call_cm_runtime[False-compile] 1.0005ms 0.8865ms 1.1280 KOps/s 1.0975 KOps/s $\color{#35bf28}+2.78\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5160ms 0.4618ms 2.1655 KOps/s 2.1423 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_cm_runtime[True-eager] 1.3430ms 1.2085ms 827.4667 Ops/s 820.9217 Ops/s $\color{#35bf28}+0.80\%$
test_func_call_cm_runtime[True-compile] 1.0247ms 0.9328ms 1.0721 KOps/s 1.0169 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_func_call_cm_runtime[True-compile-overhead] 0.5958ms 0.5069ms 1.9728 KOps/s 1.9385 KOps/s $\color{#35bf28}+1.77\%$
test_vmap_func_call_cm_runtime[eager] 2.8072ms 2.3444ms 426.5436 Ops/s 421.9948 Ops/s $\color{#35bf28}+1.08\%$
test_vmap_func_call_cm_runtime[compile] 1.0197ms 0.9516ms 1.0508 KOps/s 997.1071 Ops/s $\textbf{\color{#35bf28}+5.39\%}$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5649ms 0.5152ms 1.9412 KOps/s 1.8606 KOps/s $\color{#35bf28}+4.33\%$
test_distributed 0.4454ms 0.1521ms 6.5765 KOps/s 6.3673 KOps/s $\color{#35bf28}+3.28\%$
test_tdmodule 0.3111ms 26.4948μs 37.7433 KOps/s 36.9889 KOps/s $\color{#35bf28}+2.04\%$
test_tdmodule_dispatch 73.6700μs 43.7894μs 22.8366 KOps/s 22.5548 KOps/s $\color{#35bf28}+1.25\%$
test_tdseq 45.5500μs 26.3895μs 37.8938 KOps/s 37.7338 KOps/s $\color{#35bf28}+0.42\%$
test_tdseq_dispatch 69.2310μs 46.7402μs 21.3948 KOps/s 21.3535 KOps/s $\color{#35bf28}+0.19\%$
test_instantiation_functorch 2.3263ms 2.0553ms 486.5482 Ops/s 482.2071 Ops/s $\color{#35bf28}+0.90\%$
test_exec_functorch 0.2316ms 0.1749ms 5.7182 KOps/s 5.6679 KOps/s $\color{#35bf28}+0.89\%$
test_exec_functional_call 0.2270ms 0.1588ms 6.2976 KOps/s 6.3506 KOps/s $\color{#d91a1a}-0.83\%$
test_exec_td_decorator 0.4425ms 0.2326ms 4.2999 KOps/s 4.2743 KOps/s $\color{#35bf28}+0.60\%$
test_vmap_mlp_speed_decorator[True-True] 1.0140ms 0.8127ms 1.2305 KOps/s 1.2222 KOps/s $\color{#35bf28}+0.68\%$
test_vmap_mlp_speed_decorator[True-False] 0.9905ms 0.8160ms 1.2255 KOps/s 1.2245 KOps/s $\color{#35bf28}+0.08\%$
test_vmap_mlp_speed_decorator[False-True] 0.8532ms 0.7021ms 1.4242 KOps/s 1.4176 KOps/s $\color{#35bf28}+0.47\%$
test_vmap_mlp_speed_decorator[False-False] 0.8828ms 0.6998ms 1.4290 KOps/s 1.4202 KOps/s $\color{#35bf28}+0.62\%$
test_vmap_transformer_speed_decorator[True-True] 20.7818ms 20.2003ms 49.5042 Ops/s 49.0682 Ops/s $\color{#35bf28}+0.89\%$
test_vmap_transformer_speed_decorator[True-False] 20.5747ms 20.2035ms 49.4963 Ops/s 49.0530 Ops/s $\color{#35bf28}+0.90\%$
test_vmap_transformer_speed_decorator[False-True] 20.1029ms 19.9953ms 50.0117 Ops/s 49.6085 Ops/s $\color{#35bf28}+0.81\%$
test_vmap_transformer_speed_decorator[False-False] 20.5209ms 19.9943ms 50.0141 Ops/s 49.5052 Ops/s $\color{#35bf28}+1.03\%$
test_to_module_speed[True] 2.0761ms 1.4651ms 682.5700 Ops/s 681.6623 Ops/s $\color{#35bf28}+0.13\%$
test_to_module_speed[False] 1.5282ms 1.4446ms 692.2263 Ops/s 699.4862 Ops/s $\color{#d91a1a}-1.04\%$
test_tc_init 70.6600μs 43.6608μs 22.9038 KOps/s 22.7859 KOps/s $\color{#35bf28}+0.52\%$
test_tc_init_tensor_only 35.6900μs 9.8726μs 101.2908 KOps/s 100.1246 KOps/s $\color{#35bf28}+1.16\%$
test_tc_init_nested 0.1344ms 86.2864μs 11.5893 KOps/s 11.3521 KOps/s $\color{#35bf28}+2.09\%$
test_tc_init_many_fields 40.4710μs 16.5862μs 60.2911 KOps/s 60.4519 KOps/s $\color{#d91a1a}-0.27\%$
test_tc_first_layer_tensor 27.6600μs 1.8064μs 553.5886 KOps/s 555.8392 KOps/s $\color{#d91a1a}-0.40\%$
test_tc_first_layer_tensor_only 3.2342μs 0.4004μs 2.4978 MOps/s 2.5483 MOps/s $\color{#d91a1a}-1.98\%$
test_tc_first_layer_tensor_set 27.3800μs 3.9092μs 255.8050 KOps/s 258.2975 KOps/s $\color{#d91a1a}-0.96\%$
test_tc_first_layer_tensor_only_set 22.9910μs 3.2883μs 304.1128 KOps/s 305.2388 KOps/s $\color{#d91a1a}-0.37\%$
test_tc_first_layer_nontensor 31.4300μs 6.1437μs 162.7679 KOps/s 162.3115 KOps/s $\color{#35bf28}+0.28\%$
test_tc_second_layer_tensor 35.6310μs 4.3666μs 229.0095 KOps/s 232.3604 KOps/s $\color{#d91a1a}-1.44\%$
test_tc_second_layer_nontensor 39.1900μs 8.6074μs 116.1791 KOps/s 115.6663 KOps/s $\color{#35bf28}+0.44\%$
test_unbind 0.2582s 16.4449ms 60.8090 Ops/s 67.4710 Ops/s $\textbf{\color{#d91a1a}-9.87\%}$
test_full_like 5.1689ms 4.4321ms 225.6253 Ops/s 227.5731 Ops/s $\color{#d91a1a}-0.86\%$
test_zeros_like 9.3465ms 7.3568ms 135.9282 Ops/s 227.2699 Ops/s $\textbf{\color{#d91a1a}-40.19\%}$
test_ones_like 4.6450ms 4.4385ms 225.2995 Ops/s 226.5538 Ops/s $\color{#d91a1a}-0.55\%$
test_clone 12.2700ms 9.5052ms 105.2058 Ops/s 149.8548 Ops/s $\textbf{\color{#d91a1a}-29.79\%}$
test_squeeze 0.1644ms 13.9972μs 71.4427 KOps/s 70.9473 KOps/s $\color{#35bf28}+0.70\%$
test_unsqueeze 0.1682ms 0.1073ms 9.3181 KOps/s 9.2567 KOps/s $\color{#35bf28}+0.66\%$
test_split 0.3819ms 0.1797ms 5.5654 KOps/s 5.4704 KOps/s $\color{#35bf28}+1.74\%$
test_permute 0.2710ms 0.1989ms 5.0273 KOps/s 5.0168 KOps/s $\color{#35bf28}+0.21\%$
test_stack 53.7877ms 51.8928ms 19.2705 Ops/s 19.3288 Ops/s $\color{#d91a1a}-0.30\%$
test_cat 51.8441ms 51.3720ms 19.4658 Ops/s 19.3637 Ops/s $\color{#35bf28}+0.53\%$
test_sequential_tensordict 0.3806ms 0.2150ms 4.6503 KOps/s 4.6008 KOps/s $\color{#35bf28}+1.08\%$
test_sequential_graph_module 0.1814ms 0.1197ms 8.3561 KOps/s 8.4177 KOps/s $\color{#d91a1a}-0.73\%$
test_nested_tensordict 0.3694ms 0.2785ms 3.5903 KOps/s 3.4195 KOps/s $\color{#35bf28}+5.00\%$
test_nested_graph_module 0.1911ms 0.1259ms 7.9441 KOps/s 7.3654 KOps/s $\textbf{\color{#35bf28}+7.86\%}$

@github-actions
Contributor

github-actions bot commented Mar 11, 2026

$\color{#D29922}\textsf{\Large&#x26A0;\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 261. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 42.8020μs 14.8142μs 67.5026 KOps/s 67.4417 KOps/s $\color{#35bf28}+0.09\%$
test_plain_set_stack_nested 38.8110μs 15.3234μs 65.2596 KOps/s 65.2262 KOps/s $\color{#35bf28}+0.05\%$
test_plain_set_nested_inplace 49.7720μs 16.9416μs 59.0263 KOps/s 59.3359 KOps/s $\color{#d91a1a}-0.52\%$
test_plain_set_stack_nested_inplace 51.6120μs 16.6298μs 60.1331 KOps/s 59.7145 KOps/s $\color{#35bf28}+0.70\%$
test_items 29.1410μs 5.9517μs 168.0200 KOps/s 169.8426 KOps/s $\color{#d91a1a}-1.07\%$
test_items_nested 0.5181ms 0.4672ms 2.1405 KOps/s 2.1089 KOps/s $\color{#35bf28}+1.50\%$
test_items_nested_locked 0.5167ms 0.4731ms 2.1139 KOps/s 2.1073 KOps/s $\color{#35bf28}+0.31\%$
test_items_nested_leaf 0.1342ms 97.5612μs 10.2500 KOps/s 10.1713 KOps/s $\color{#35bf28}+0.77\%$
test_items_stack_nested 0.5187ms 0.4675ms 2.1390 KOps/s 2.1127 KOps/s $\color{#35bf28}+1.24\%$
test_items_stack_nested_leaf 0.1423ms 98.0072μs 10.2033 KOps/s 10.1958 KOps/s $\color{#35bf28}+0.07\%$
test_items_stack_nested_locked 0.5338ms 0.4707ms 2.1243 KOps/s 2.0947 KOps/s $\color{#35bf28}+1.41\%$
test_keys 28.6010μs 4.2287μs 236.4769 KOps/s 236.8275 KOps/s $\color{#d91a1a}-0.15\%$
test_keys_nested 0.1902ms 0.1313ms 7.6169 KOps/s 7.6678 KOps/s $\color{#d91a1a}-0.67\%$
test_keys_nested_locked 2.1945ms 0.1398ms 7.1541 KOps/s 7.1510 KOps/s $\color{#35bf28}+0.04\%$
test_keys_nested_leaf 0.1561ms 0.1214ms 8.2396 KOps/s 8.2256 KOps/s $\color{#35bf28}+0.17\%$
test_keys_stack_nested 0.2256ms 0.1310ms 7.6318 KOps/s 7.6247 KOps/s $\color{#35bf28}+0.09\%$
test_keys_stack_nested_leaf 0.1664ms 0.1213ms 8.2428 KOps/s 8.3270 KOps/s $\color{#d91a1a}-1.01\%$
test_keys_stack_nested_locked 0.1906ms 0.1379ms 7.2494 KOps/s 7.1963 KOps/s $\color{#35bf28}+0.74\%$
test_values 5.3022μs 1.0359μs 965.3030 KOps/s 979.6212 KOps/s $\color{#d91a1a}-1.46\%$
test_values_nested 86.9230μs 52.6978μs 18.9761 KOps/s 18.9033 KOps/s $\color{#35bf28}+0.39\%$
test_values_nested_locked 0.1443ms 56.0400μs 17.8444 KOps/s 17.7052 KOps/s $\color{#35bf28}+0.79\%$
test_values_nested_leaf 84.1830μs 60.2044μs 16.6101 KOps/s 16.4508 KOps/s $\color{#35bf28}+0.97\%$
test_values_stack_nested 95.7440μs 52.3900μs 19.0876 KOps/s 18.9883 KOps/s $\color{#35bf28}+0.52\%$
test_values_stack_nested_leaf 92.1840μs 60.1045μs 16.6377 KOps/s 16.5021 KOps/s $\color{#35bf28}+0.82\%$
test_values_stack_nested_locked 94.8040μs 55.7455μs 17.9387 KOps/s 17.7003 KOps/s $\color{#35bf28}+1.35\%$
test_membership 4.1385μs 0.8494μs 1.1772 MOps/s 1.1914 MOps/s $\color{#d91a1a}-1.19\%$
test_membership_nested 32.9510μs 2.9065μs 344.0592 KOps/s 341.9376 KOps/s $\color{#35bf28}+0.62\%$
test_membership_nested_leaf 31.9210μs 2.8951μs 345.4090 KOps/s 344.4914 KOps/s $\color{#35bf28}+0.27\%$
test_membership_stacked_nested 20.4200μs 2.9315μs 341.1246 KOps/s 341.0867 KOps/s $\color{#35bf28}+0.01\%$
test_membership_stacked_nested_leaf 45.4720μs 2.8714μs 348.2617 KOps/s 342.8235 KOps/s $\color{#35bf28}+1.59\%$
test_membership_nested_last 77.2630μs 4.4542μs 224.5069 KOps/s 228.7206 KOps/s $\color{#d91a1a}-1.84\%$
test_membership_nested_leaf_last 33.5710μs 4.4254μs 225.9660 KOps/s 229.9371 KOps/s $\color{#d91a1a}-1.73\%$
test_membership_stacked_nested_last 34.3120μs 4.4614μs 224.1440 KOps/s 229.0108 KOps/s $\color{#d91a1a}-2.13\%$
test_membership_stacked_nested_leaf_last 29.3310μs 4.4470μs 224.8712 KOps/s 229.6869 KOps/s $\color{#d91a1a}-2.10\%$
test_nested_getleaf 49.2220μs 21.4833μs 46.5479 KOps/s 45.7058 KOps/s $\color{#35bf28}+1.84\%$
test_nested_get 60.8630μs 20.5402μs 48.6849 KOps/s 48.3090 KOps/s $\color{#35bf28}+0.78\%$
test_stacked_getleaf 65.5530μs 21.7093μs 46.0632 KOps/s 46.0020 KOps/s $\color{#35bf28}+0.13\%$
test_stacked_get 0.2302ms 20.5109μs 48.7546 KOps/s 48.5839 KOps/s $\color{#35bf28}+0.35\%$
test_nested_getitemleaf 51.8920μs 21.9778μs 45.5004 KOps/s 44.6719 KOps/s $\color{#35bf28}+1.85\%$
test_nested_getitem 48.9810μs 21.1559μs 47.2681 KOps/s 47.1257 KOps/s $\color{#35bf28}+0.30\%$
test_stacked_getitemleaf 52.0220μs 22.4986μs 44.4473 KOps/s 45.6350 KOps/s $\color{#d91a1a}-2.60\%$
test_stacked_getitem 46.6220μs 20.9245μs 47.7908 KOps/s 47.3346 KOps/s $\color{#35bf28}+0.96\%$
test_lock_nested 0.5645ms 0.4809ms 2.0794 KOps/s 2.1060 KOps/s $\color{#d91a1a}-1.26\%$
test_lock_stack_nested 0.6425ms 0.4859ms 2.0581 KOps/s 2.0621 KOps/s $\color{#d91a1a}-0.19\%$
test_unlock_nested 0.4682ms 0.3950ms 2.5315 KOps/s 2.5842 KOps/s $\color{#d91a1a}-2.04\%$
test_unlock_stack_nested 0.5181ms 0.3949ms 2.5324 KOps/s 2.5507 KOps/s $\color{#d91a1a}-0.72\%$
test_flatten_speed 0.2480ms 0.1244ms 8.0358 KOps/s 8.1165 KOps/s $\color{#d91a1a}-0.99\%$
test_unflatten_speed 0.6278ms 0.5828ms 1.7159 KOps/s 1.7217 KOps/s $\color{#d91a1a}-0.34\%$
test_common_ops 0.8417ms 0.6985ms 1.4317 KOps/s 1.4430 KOps/s $\color{#d91a1a}-0.78\%$
test_creation 68.0630μs 3.1511μs 317.3543 KOps/s 319.4553 KOps/s $\color{#d91a1a}-0.66\%$
test_creation_empty 35.9710μs 6.9802μs 143.2618 KOps/s 142.9645 KOps/s $\color{#35bf28}+0.21\%$
test_creation_nested_1 46.5210μs 11.5942μs 86.2497 KOps/s 86.8473 KOps/s $\color{#d91a1a}-0.69\%$
test_creation_nested_2 50.6610μs 13.3376μs 74.9759 KOps/s 74.7717 KOps/s $\color{#35bf28}+0.27\%$
test_creation_many_keys[10] 40.4910μs 21.0501μs 47.5057 KOps/s 47.6311 KOps/s $\color{#d91a1a}-0.26\%$
test_creation_many_keys[50] 0.1322ms 91.4377μs 10.9364 KOps/s 11.1129 KOps/s $\color{#d91a1a}-1.59\%$
test_creation_many_keys[100] 0.2418ms 0.1760ms 5.6818 KOps/s 5.6742 KOps/s $\color{#35bf28}+0.13\%$
test_creation_nested_many_keys[10] 71.4030μs 44.7933μs 22.3248 KOps/s 22.2651 KOps/s $\color{#35bf28}+0.27\%$
test_creation_nested_many_keys[50] 0.2400ms 0.1861ms 5.3746 KOps/s 5.4723 KOps/s $\color{#d91a1a}-1.78\%$
test_clone 43.5120μs 13.0305μs 76.7428 KOps/s 75.9651 KOps/s $\color{#35bf28}+1.02\%$
test_getitem[int] 1.5477ms 15.2467μs 65.5878 KOps/s 60.8282 KOps/s $\textbf{\color{#35bf28}+7.82\%}$
test_getitem[slice_int] 0.1402ms 24.0114μs 41.6469 KOps/s 41.6381 KOps/s $\color{#35bf28}+0.02\%$
test_getitem[range] 0.1756ms 65.5750μs 15.2497 KOps/s 15.7111 KOps/s $\color{#d91a1a}-2.94\%$
test_getitem[tuple] 0.1430ms 24.3192μs 41.1198 KOps/s 41.6993 KOps/s $\color{#d91a1a}-1.39\%$
test_getitem[list] 0.1774ms 57.7244μs 17.3237 KOps/s 17.3214 KOps/s $\color{#35bf28}+0.01\%$
test_setitem_dim[int] 57.3420μs 25.3593μs 39.4332 KOps/s 39.0995 KOps/s $\color{#35bf28}+0.85\%$
test_setitem_dim[slice_int] 68.6530μs 43.6574μs 22.9056 KOps/s 23.3034 KOps/s $\color{#d91a1a}-1.71\%$
test_setitem_dim[range] 0.1330ms 96.3197μs 10.3821 KOps/s 10.5062 KOps/s $\color{#d91a1a}-1.18\%$
test_setitem_dim[tuple] 61.9720μs 40.6150μs 24.6215 KOps/s 24.9536 KOps/s $\color{#d91a1a}-1.33\%$
test_setitem 51.5520μs 17.6810μs 56.5580 KOps/s 57.1758 KOps/s $\color{#d91a1a}-1.08\%$
test_set 61.3720μs 16.9324μs 59.0585 KOps/s 59.6188 KOps/s $\color{#d91a1a}-0.94\%$
test_set_shared 0.5563ms 0.2064ms 4.8446 KOps/s 4.9236 KOps/s $\color{#d91a1a}-1.61\%$
test_update 0.3400ms 21.9541μs 45.5496 KOps/s 46.2784 KOps/s $\color{#d91a1a}-1.57\%$
test_update_nested 67.8230μs 32.8182μs 30.4709 KOps/s 30.0656 KOps/s $\color{#35bf28}+1.35\%$
test_update__nested 0.4501ms 34.0752μs 29.3468 KOps/s 29.1562 KOps/s $\color{#35bf28}+0.65\%$
test_set_nested 45.9620μs 18.6571μs 53.5990 KOps/s 53.2829 KOps/s $\color{#35bf28}+0.59\%$
test_set_nested_new 58.4120μs 23.3928μs 42.7482 KOps/s 42.4902 KOps/s $\color{#35bf28}+0.61\%$
test_select 82.8030μs 41.7844μs 23.9324 KOps/s 24.5079 KOps/s $\color{#d91a1a}-2.35\%$
test_select_nested 0.1120ms 75.2927μs 13.2815 KOps/s 13.5715 KOps/s $\color{#d91a1a}-2.14\%$
test_exclude_nested 0.1244ms 92.2179μs 10.8439 KOps/s 10.7425 KOps/s $\color{#35bf28}+0.94\%$
test_empty[True] 0.4933ms 0.4016ms 2.4902 KOps/s 2.4875 KOps/s $\color{#35bf28}+0.11\%$
test_empty[False] 7.6277μs 1.3215μs 756.6985 KOps/s 755.7644 KOps/s $\color{#35bf28}+0.12\%$
test_to 0.1044ms 73.0515μs 13.6890 KOps/s 13.8008 KOps/s $\color{#d91a1a}-0.81\%$
test_to_nonblocking 0.1358ms 66.7619μs 14.9786 KOps/s 15.4601 KOps/s $\color{#d91a1a}-3.11\%$
test_unbind_speed 0.3758ms 0.3365ms 2.9714 KOps/s 2.9895 KOps/s $\color{#d91a1a}-0.61\%$
test_unbind_speed_stack0 0.4212ms 0.3341ms 2.9929 KOps/s 2.9757 KOps/s $\color{#35bf28}+0.58\%$
test_unbind_speed_stack1 0.1049s 0.8381ms 1.1932 KOps/s 1.1782 KOps/s $\color{#35bf28}+1.27\%$
test_split 0.1045s 1.2702ms 787.2730 Ops/s 782.4974 Ops/s $\color{#35bf28}+0.61\%$
test_chunk 0.1045s 1.2096ms 826.7023 Ops/s 919.7810 Ops/s $\textbf{\color{#d91a1a}-10.12\%}$
test_to_cpu_blocking 19.6518ms 19.4713ms 51.3575 Ops/s 46.6734 Ops/s $\textbf{\color{#35bf28}+10.04\%}$
test_to_cpu_global_sync 11.4063ms 11.2307ms 89.0417 Ops/s 89.6625 Ops/s $\color{#d91a1a}-0.69\%$
test_to_cpu_event_sync 12.5092ms 12.2293ms 81.7711 Ops/s 82.4771 Ops/s $\color{#d91a1a}-0.86\%$
test_to_cpu_default 0.1170s 13.5209ms 73.9597 Ops/s 82.4812 Ops/s $\textbf{\color{#d91a1a}-10.33\%}$
test_consolidate[False-None] 4.3720ms 4.1730ms 239.6376 Ops/s 242.5197 Ops/s $\color{#d91a1a}-1.19\%$
test_consolidate[default-None] 2.4395ms 2.0445ms 489.1076 Ops/s 490.4678 Ops/s $\color{#d91a1a}-0.28\%$
test_consolidate[reduce-overhead-None] 2.0519ms 1.9766ms 505.9263 Ops/s 507.7312 Ops/s $\color{#d91a1a}-0.36\%$
test_consolidate_njt[False-None] 9.8455ms 8.5743ms 116.6277 Ops/s 117.2257 Ops/s $\color{#d91a1a}-0.51\%$
test_to[False-False-None] 2.5499ms 2.0953ms 477.2500 Ops/s 481.8991 Ops/s $\color{#d91a1a}-0.96\%$
test_to[True-False-None] 2.2472ms 1.9709ms 507.3942 Ops/s 513.8630 Ops/s $\color{#d91a1a}-1.26\%$
test_to[within-False-None] 6.5185ms 6.2373ms 160.3268 Ops/s 163.3451 Ops/s $\color{#d91a1a}-1.85\%$
test_to[True-default-None] 9.0439ms 8.7957ms 113.6919 Ops/s 111.0265 Ops/s $\color{#35bf28}+2.40\%$
test_to_njt[False-False-None] 8.6013ms 8.4640ms 118.1469 Ops/s 117.3664 Ops/s $\color{#35bf28}+0.67\%$
test_to_njt[True-False-None] 7.0742ms 6.9319ms 144.2614 Ops/s 142.5180 Ops/s $\color{#35bf28}+1.22\%$
test_to_njt[within-False-None] 16.3172ms 15.5858ms 64.1609 Ops/s 63.4930 Ops/s $\color{#35bf28}+1.05\%$
test_creation[device0] 0.3438ms 0.1153ms 8.6729 KOps/s 8.7113 KOps/s $\color{#d91a1a}-0.44\%$
test_creation_from_tensor 0.4491ms 0.1150ms 8.6934 KOps/s 8.8725 KOps/s $\color{#d91a1a}-2.02\%$
test_add_one[memmap_tensor0] 0.3698ms 6.5143μs 153.5091 KOps/s 157.9501 KOps/s $\color{#d91a1a}-2.81\%$
test_contiguous[memmap_tensor0] 16.9710μs 0.6697μs 1.4932 MOps/s 2.1456 MOps/s $\textbf{\color{#d91a1a}-30.40\%}$
test_stack[memmap_tensor0] 22.5210μs 4.7504μs 210.5079 KOps/s 215.6386 KOps/s $\color{#d91a1a}-2.38\%$
test_memmaptd_index 1.0351ms 0.2736ms 3.6549 KOps/s 3.7565 KOps/s $\color{#d91a1a}-2.70\%$
test_memmaptd_index_astensor 0.5366ms 0.3750ms 2.6669 KOps/s 2.6940 KOps/s $\color{#d91a1a}-1.01\%$
test_memmaptd_index_op 0.9232ms 0.6308ms 1.5852 KOps/s 1.6224 KOps/s $\color{#d91a1a}-2.29\%$
test_serialize_model 0.1387s 0.1363s 7.3380 Ops/s 7.3269 Ops/s $\color{#35bf28}+0.15\%$
test_serialize_model_pickle 1.3489s 1.2171s 0.8216 Ops/s 0.8394 Ops/s $\color{#d91a1a}-2.12\%$
test_serialize_weights 0.1377s 0.1344s 7.4404 Ops/s 7.3960 Ops/s $\color{#35bf28}+0.60\%$
test_serialize_weights_returnearly 0.4275s 88.0491ms 11.3573 Ops/s 12.8033 Ops/s $\textbf{\color{#d91a1a}-11.29\%}$
test_serialize_weights_pickle 1.3706s 1.2130s 0.8244 Ops/s 0.8225 Ops/s $\color{#35bf28}+0.24\%$
test_reshape_pytree 0.2028ms 32.1587μs 31.0958 KOps/s 30.7890 KOps/s $\color{#35bf28}+1.00\%$
test_reshape_td 85.6030μs 45.5653μs 21.9465 KOps/s 22.1186 KOps/s $\color{#d91a1a}-0.78\%$
test_view_pytree 0.2219ms 31.9315μs 31.3171 KOps/s 31.2262 KOps/s $\color{#35bf28}+0.29\%$
test_view_td 96.9630μs 54.7726μs 18.2573 KOps/s 18.9168 KOps/s $\color{#d91a1a}-3.49\%$
test_unbind_pytree 0.2346ms 36.5281μs 27.3762 KOps/s 27.7348 KOps/s $\color{#d91a1a}-1.29\%$
test_unbind_td 0.2134ms 50.0136μs 19.9946 KOps/s 20.2398 KOps/s $\color{#d91a1a}-1.21\%$
test_split_pytree 0.2427ms 42.6356μs 23.4546 KOps/s 23.5731 KOps/s $\color{#d91a1a}-0.50\%$
test_split_td 0.2112ms 64.2051μs 15.5751 KOps/s 15.5796 KOps/s $\color{#d91a1a}-0.03\%$
test_add_pytree 0.2070ms 42.4377μs 23.5640 KOps/s 23.9014 KOps/s $\color{#d91a1a}-1.41\%$
test_add_td 0.1122ms 54.7018μs 18.2809 KOps/s 18.4770 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_add_one_nested[tensordict-compile] 0.2305ms 0.1409ms 7.0972 KOps/s 6.9614 KOps/s $\color{#35bf28}+1.95\%$
test_compile_add_one_nested[tensordict-eager] 0.4228ms 0.2044ms 4.8934 KOps/s 5.0197 KOps/s $\color{#d91a1a}-2.52\%$
test_compile_add_one_nested[pytree-compile] 0.1875ms 0.1115ms 8.9710 KOps/s 8.8546 KOps/s $\color{#35bf28}+1.31\%$
test_compile_add_one_nested[pytree-eager] 0.4322ms 0.1779ms 5.6225 KOps/s 5.6681 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_copy_nested[tensordict-compile] 0.3067ms 10.5522μs 94.7668 KOps/s 97.2229 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_copy_nested[tensordict-eager] 88.9230μs 54.2258μs 18.4414 KOps/s 18.3198 KOps/s $\color{#35bf28}+0.66\%$
test_compile_copy_nested[pytree-compile] 0.1189ms 9.8304μs 101.7257 KOps/s 101.0089 KOps/s $\color{#35bf28}+0.71\%$
test_compile_copy_nested[pytree-eager] 0.4507ms 67.7156μs 14.7676 KOps/s 12.2172 KOps/s $\textbf{\color{#35bf28}+20.88\%}$
test_compile_add_one_flat[tensordict-compile] 0.2192ms 0.1805ms 5.5393 KOps/s 5.2770 KOps/s $\color{#35bf28}+4.97\%$
test_compile_add_one_flat[tensordict-eager] 0.3476ms 0.2802ms 3.5687 KOps/s 3.5361 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[tensorclass-compile] 0.1925ms 0.1174ms 8.5168 KOps/s 8.1847 KOps/s $\color{#35bf28}+4.06\%$
test_compile_add_one_flat[tensorclass-eager] 0.1190ms 73.8810μs 13.5353 KOps/s 13.5817 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_one_flat[pytree-compile] 0.1997ms 0.1595ms 6.2676 KOps/s 6.1945 KOps/s $\color{#35bf28}+1.18\%$
test_compile_add_one_flat[pytree-eager] 0.8065ms 0.5163ms 1.9370 KOps/s 1.9382 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_self_flat[tensordict-eager] 0.4652ms 0.3337ms 2.9967 KOps/s 2.9493 KOps/s $\color{#35bf28}+1.61\%$
test_compile_add_self_flat[tensordict-compile] 0.2952ms 0.1801ms 5.5531 KOps/s 5.4137 KOps/s $\color{#35bf28}+2.57\%$
test_compile_add_self_flat[tensorclass-eager] 0.1217ms 89.4494μs 11.1795 KOps/s 11.2803 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_add_self_flat[tensorclass-compile] 0.1902ms 0.1243ms 8.0439 KOps/s 8.1781 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_add_self_flat[pytree-eager] 0.6497ms 0.4326ms 2.3118 KOps/s 2.3380 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_add_self_flat[pytree-compile] 0.2229ms 0.1596ms 6.2638 KOps/s 4.2689 KOps/s $\textbf{\color{#35bf28}+46.73\%}$
test_compile_copy_flat[tensordict-compile] 0.1122ms 13.3082μs 75.1417 KOps/s 76.4582 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_copy_flat[tensordict-eager] 73.4130μs 42.4439μs 23.5605 KOps/s 23.7825 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_copy_flat[pytree-compile] 90.5730μs 10.8834μs 91.8828 KOps/s 93.5376 KOps/s $\color{#d91a1a}-1.77\%$
test_compile_copy_flat[pytree-eager] 0.4597ms 52.7078μs 18.9725 KOps/s 18.9319 KOps/s $\color{#35bf28}+0.21\%$
test_compile_assign_and_add[tensordict-compile] 2.0849ms 0.1749ms 5.7176 KOps/s 5.3472 KOps/s $\textbf{\color{#35bf28}+6.93\%}$
test_compile_assign_and_add[tensordict-eager] 3.4806ms 3.3132ms 301.8188 Ops/s 303.0679 Ops/s $\color{#d91a1a}-0.41\%$
test_compile_assign_and_add[pytree-compile] 2.0206ms 0.1639ms 6.1001 KOps/s 6.0501 KOps/s $\color{#35bf28}+0.83\%$
test_compile_assign_and_add[pytree-eager] 2.9004ms 2.7613ms 362.1515 Ops/s 363.9189 Ops/s $\color{#d91a1a}-0.49\%$
test_compile_indexing[tensor-tensordict-compile] 0.1739ms 0.1109ms 9.0195 KOps/s 8.7784 KOps/s $\color{#35bf28}+2.75\%$
test_compile_indexing[tensor-tensordict-eager] 0.3121ms 74.8347μs 13.3628 KOps/s 13.5383 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1433ms 99.7722μs 10.0228 KOps/s 10.1510 KOps/s $\color{#d91a1a}-1.26\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2580ms 44.7727μs 22.3351 KOps/s 22.5630 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[tensor-pytree-compile] 0.1436ms 99.7608μs 10.0240 KOps/s 10.1369 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_indexing[tensor-pytree-eager] 0.2773ms 44.3613μs 22.5422 KOps/s 22.5527 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[slice-tensordict-compile] 0.2108ms 58.0834μs 17.2166 KOps/s 16.6985 KOps/s $\color{#35bf28}+3.10\%$
test_compile_indexing[slice-tensordict-eager] 0.2396ms 27.5543μs 36.2920 KOps/s 36.9236 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[slice-tensorclass-compile] 88.8540μs 44.0243μs 22.7147 KOps/s 22.1027 KOps/s $\color{#35bf28}+2.77\%$
test_compile_indexing[slice-tensorclass-eager] 0.2432ms 22.3796μs 44.6836 KOps/s 44.9612 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_indexing[slice-pytree-compile] 90.7740μs 46.8714μs 21.3350 KOps/s 21.9924 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_indexing[slice-pytree-eager] 0.2598ms 22.4712μs 44.5014 KOps/s 44.7646 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_indexing[int-tensordict-compile] 98.8630μs 57.2182μs 17.4770 KOps/s 16.6563 KOps/s $\color{#35bf28}+4.93\%$
test_compile_indexing[int-tensordict-eager] 0.2719ms 27.6484μs 36.1684 KOps/s 36.6674 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_indexing[int-tensorclass-compile] 85.8930μs 45.5074μs 21.9745 KOps/s 21.3016 KOps/s $\color{#35bf28}+3.16\%$
test_compile_indexing[int-tensorclass-eager] 0.2626ms 22.2512μs 44.9413 KOps/s 44.9941 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_indexing[int-pytree-compile] 84.5730μs 43.9531μs 22.7515 KOps/s 21.0958 KOps/s $\textbf{\color{#35bf28}+7.85\%}$
test_compile_indexing[int-pytree-eager] 0.2706ms 22.3606μs 44.7216 KOps/s 44.8932 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_replace[single-eager] 0.1002ms 47.4602μs 21.0703 KOps/s 20.1772 KOps/s $\color{#35bf28}+4.43\%$
test_compile_replace[single-compile] 0.1900ms 0.1045ms 9.5682 KOps/s 9.3873 KOps/s $\color{#35bf28}+1.93\%$
test_compile_replace[multi-eager] 0.6550ms 0.5724ms 1.7471 KOps/s 1.7382 KOps/s $\color{#35bf28}+0.51\%$
test_compile_replace[multi-compile] 0.2251ms 0.1112ms 8.9891 KOps/s 8.7466 KOps/s $\color{#35bf28}+2.77\%$
test_compile_tc_getattr_20[eager] 0.2248ms 0.1665ms 6.0072 KOps/s 6.0348 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_tc_getattr_20[compile] 0.3317ms 0.1213ms 8.2424 KOps/s 8.2110 KOps/s $\color{#35bf28}+0.38\%$
test_compile_clone_shallow[20-eager] 47.2520μs 19.5995μs 51.0217 KOps/s 51.3748 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_clone_shallow[20-compile] 0.1130ms 11.5103μs 86.8787 KOps/s 88.2125 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_clone_shallow[40-eager] 62.4220μs 34.5611μs 28.9343 KOps/s 28.9755 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_clone_shallow[40-compile] 42.8110μs 12.4629μs 80.2378 KOps/s 81.4736 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_clone_shallow[80-eager] 90.2640μs 63.5471μs 15.7364 KOps/s 15.7875 KOps/s $\color{#d91a1a}-0.32\%$
test_compile_clone_shallow[80-compile] 61.0920μs 15.3218μs 65.2663 KOps/s 66.2456 KOps/s $\color{#d91a1a}-1.48\%$
test_compile_update_inplace[eager] 98.3040μs 59.1113μs 16.9172 KOps/s 16.7094 KOps/s $\color{#35bf28}+1.24\%$
test_compile_update_inplace[compile] 0.1863ms 0.1407ms 7.1048 KOps/s 6.9773 KOps/s $\color{#35bf28}+1.83\%$
test_mod_add[eager] 98.5140μs 49.6174μs 20.1542 KOps/s 20.5480 KOps/s $\color{#d91a1a}-1.92\%$
test_mod_add[compile] 0.1433ms 0.1037ms 9.6406 KOps/s 9.4587 KOps/s $\color{#35bf28}+1.92\%$
test_mod_add[compile-overhead] 0.2370ms 0.1498ms 6.6748 KOps/s 6.6020 KOps/s $\color{#35bf28}+1.10\%$
test_mod_wrap[eager] 0.3701ms 0.2924ms 3.4194 KOps/s 3.4224 KOps/s $\color{#d91a1a}-0.09\%$
test_mod_wrap[compile] 0.4342ms 0.3483ms 2.8714 KOps/s 2.8468 KOps/s $\color{#35bf28}+0.86\%$
test_mod_wrap[compile-overhead] 7.3311ms 4.0113ms 249.2937 Ops/s 248.0279 Ops/s $\color{#35bf28}+0.51\%$
test_mod_wrap_and_backward[eager] 1.9336ms 1.5043ms 664.7738 Ops/s 663.9060 Ops/s $\color{#35bf28}+0.13\%$
test_mod_wrap_and_backward[compile] 1.7253ms 1.5568ms 642.3480 Ops/s 689.8355 Ops/s $\textbf{\color{#d91a1a}-6.88\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4413ms 0.9982ms 1.0018 KOps/s 999.5225 Ops/s $\color{#35bf28}+0.23\%$
test_seq_add[eager] 0.1979ms 0.1542ms 6.4852 KOps/s 6.4880 KOps/s $\color{#d91a1a}-0.04\%$
test_seq_add[compile] 0.5817ms 0.1138ms 8.7858 KOps/s 8.2818 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_seq_add[compile-overhead] 0.2192ms 0.1571ms 6.3651 KOps/s 6.0582 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_seq_wrap[eager] 0.6044ms 0.5377ms 1.8597 KOps/s 1.8371 KOps/s $\color{#35bf28}+1.23\%$
test_seq_wrap[compile] 0.4900ms 0.3671ms 2.7239 KOps/s 2.7147 KOps/s $\color{#35bf28}+0.34\%$
test_seq_wrap[compile-overhead] 0.3282ms 0.2649ms 3.7749 KOps/s 3.7417 KOps/s $\color{#35bf28}+0.89\%$
test_func_call_runtime[False-eager] 0.9361ms 0.8394ms 1.1914 KOps/s 1.1934 KOps/s $\color{#d91a1a}-0.17\%$
test_func_call_runtime[False-compile] 1.0721ms 0.9152ms 1.0927 KOps/s 1.0987 KOps/s $\color{#d91a1a}-0.54\%$
test_func_call_runtime[False-compile-overhead] 0.5344ms 0.4611ms 2.1687 KOps/s 2.1515 KOps/s $\color{#35bf28}+0.80\%$
test_func_call_runtime[True-eager] 1.1531ms 1.0701ms 934.4923 Ops/s 918.0039 Ops/s $\color{#35bf28}+1.80\%$
test_func_call_runtime[True-compile] 1.0287ms 0.9259ms 1.0800 KOps/s 1.0790 KOps/s $\color{#35bf28}+0.09\%$
test_func_call_runtime[True-compile-overhead] 0.5556ms 0.4746ms 2.1069 KOps/s 2.0815 KOps/s $\color{#35bf28}+1.22\%$
test_func_call_cm_runtime[False-eager] 0.9476ms 0.8346ms 1.1982 KOps/s 1.1918 KOps/s $\color{#35bf28}+0.54\%$
test_func_call_cm_runtime[False-compile] 1.0240ms 0.9120ms 1.0965 KOps/s 1.0893 KOps/s $\color{#35bf28}+0.66\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5755ms 0.4615ms 2.1669 KOps/s 2.1488 KOps/s $\color{#35bf28}+0.84\%$
test_func_call_cm_runtime[True-eager] 1.4184ms 1.2197ms 819.8830 Ops/s 810.9643 Ops/s $\color{#35bf28}+1.10\%$
test_func_call_cm_runtime[True-compile] 1.0129ms 0.9555ms 1.0466 KOps/s 1.0421 KOps/s $\color{#35bf28}+0.43\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5740ms 0.5081ms 1.9681 KOps/s 1.9459 KOps/s $\color{#35bf28}+1.14\%$
test_vmap_func_call_cm_runtime[eager] 2.8443ms 2.3747ms 421.0970 Ops/s 422.4423 Ops/s $\color{#d91a1a}-0.32\%$
test_vmap_func_call_cm_runtime[compile] 1.0787ms 0.9713ms 1.0296 KOps/s 1.0206 KOps/s $\color{#35bf28}+0.88\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6086ms 0.5121ms 1.9527 KOps/s 1.9290 KOps/s $\color{#35bf28}+1.23\%$
test_distributed 0.5712ms 0.1511ms 6.6202 KOps/s 6.4898 KOps/s $\color{#35bf28}+2.01\%$
test_tdmodule 80.8130μs 27.6108μs 36.2177 KOps/s 36.7021 KOps/s $\color{#d91a1a}-1.32\%$
test_tdmodule_dispatch 76.1330μs 45.8944μs 21.7891 KOps/s 21.8471 KOps/s $\color{#d91a1a}-0.27\%$
test_tdseq 46.6310μs 27.2058μs 36.7568 KOps/s 37.4533 KOps/s $\color{#d91a1a}-1.86\%$
test_tdseq_dispatch 77.8640μs 46.7312μs 21.3990 KOps/s 21.1733 KOps/s $\color{#35bf28}+1.07\%$
test_instantiation_functorch 2.1796ms 2.0849ms 479.6422 Ops/s 478.7470 Ops/s $\color{#35bf28}+0.19\%$
test_exec_functorch 0.2137ms 0.1802ms 5.5500 KOps/s 5.5377 KOps/s $\color{#35bf28}+0.22\%$
test_exec_functional_call 0.2052ms 0.1573ms 6.3563 KOps/s 6.1751 KOps/s $\color{#35bf28}+2.93\%$
test_exec_td_decorator 0.4342ms 0.2343ms 4.2681 KOps/s 4.2195 KOps/s $\color{#35bf28}+1.15\%$
test_vmap_mlp_speed_decorator[True-True] 0.9929ms 0.8226ms 1.2157 KOps/s 1.2148 KOps/s $\color{#35bf28}+0.07\%$
test_vmap_mlp_speed_decorator[True-False] 1.0058ms 0.8202ms 1.2193 KOps/s 1.2172 KOps/s $\color{#35bf28}+0.17\%$
test_vmap_mlp_speed_decorator[False-True] 0.9087ms 0.7116ms 1.4053 KOps/s 1.4066 KOps/s $\color{#d91a1a}-0.09\%$
test_vmap_mlp_speed_decorator[False-False] 0.9039ms 0.7114ms 1.4056 KOps/s 1.4096 KOps/s $\color{#d91a1a}-0.28\%$
test_vmap_transformer_speed_decorator[True-True] 21.2449ms 20.4502ms 48.8994 Ops/s 49.0205 Ops/s $\color{#d91a1a}-0.25\%$
test_vmap_transformer_speed_decorator[True-False] 20.5689ms 20.4418ms 48.9194 Ops/s 49.1274 Ops/s $\color{#d91a1a}-0.42\%$
test_vmap_transformer_speed_decorator[False-True] 20.4082ms 20.2404ms 49.4062 Ops/s 49.5721 Ops/s $\color{#d91a1a}-0.33\%$
test_vmap_transformer_speed_decorator[False-False] 20.3493ms 20.2561ms 49.3678 Ops/s 49.5463 Ops/s $\color{#d91a1a}-0.36\%$
test_to_module_speed[True] 1.5675ms 1.4699ms 680.3029 Ops/s 675.4660 Ops/s $\color{#35bf28}+0.72\%$
test_to_module_speed[False] 1.5483ms 1.4427ms 693.1486 Ops/s 688.9056 Ops/s $\color{#35bf28}+0.62\%$
test_tc_init 76.7430μs 44.2057μs 22.6215 KOps/s 22.0036 KOps/s $\color{#35bf28}+2.81\%$
test_tc_init_tensor_only 28.7410μs 9.8484μs 101.5389 KOps/s 102.1466 KOps/s $\color{#d91a1a}-0.59\%$
test_tc_init_nested 0.1236ms 86.1470μs 11.6081 KOps/s 11.1406 KOps/s $\color{#35bf28}+4.20\%$
test_tc_init_many_fields 43.5120μs 16.2851μs 61.4058 KOps/s 61.3361 KOps/s $\color{#35bf28}+0.11\%$
test_tc_first_layer_tensor 22.4710μs 1.8222μs 548.7747 KOps/s 552.8342 KOps/s $\color{#d91a1a}-0.73\%$
test_tc_first_layer_tensor_only 2.4921μs 0.4041μs 2.4743 MOps/s 2.5519 MOps/s $\color{#d91a1a}-3.04\%$
test_tc_first_layer_tensor_set 33.5210μs 3.9226μs 254.9332 KOps/s 255.4827 KOps/s $\color{#d91a1a}-0.22\%$
test_tc_first_layer_tensor_only_set 35.6310μs 3.3053μs 302.5405 KOps/s 304.3319 KOps/s $\color{#d91a1a}-0.59\%$
test_tc_first_layer_nontensor 28.8010μs 6.1548μs 162.4741 KOps/s 155.8909 KOps/s $\color{#35bf28}+4.22\%$
test_tc_second_layer_tensor 22.3810μs 4.4184μs 226.3274 KOps/s 225.0272 KOps/s $\color{#35bf28}+0.58\%$
test_tc_second_layer_nontensor 40.0210μs 8.7153μs 114.7408 KOps/s 110.4745 KOps/s $\color{#35bf28}+3.86\%$
test_unbind 0.2501s 16.7074ms 59.8538 Ops/s 55.7229 Ops/s $\textbf{\color{#35bf28}+7.41\%}$
test_full_like 11.6720ms 8.8086ms 113.5256 Ops/s 60.4101 Ops/s $\textbf{\color{#35bf28}+87.92\%}$
test_zeros_like 4.8462ms 4.3534ms 229.7031 Ops/s 60.6811 Ops/s $\textbf{\color{#35bf28}+278.54\%}$
test_ones_like 4.5496ms 4.3603ms 229.3403 Ops/s 229.1508 Ops/s $\color{#35bf28}+0.08\%$
test_clone 11.4253ms 9.1131ms 109.7322 Ops/s 157.2598 Ops/s $\textbf{\color{#d91a1a}-30.22\%}$
test_squeeze 0.1282ms 14.2015μs 70.4152 KOps/s 70.1681 KOps/s $\color{#35bf28}+0.35\%$
test_unsqueeze 0.2656ms 0.1143ms 8.7493 KOps/s 8.7339 KOps/s $\color{#35bf28}+0.18\%$
test_split 0.2736ms 0.1876ms 5.3296 KOps/s 5.3291 KOps/s $\color{#35bf28}+0.01\%$
test_permute 0.2895ms 0.2001ms 4.9964 KOps/s 4.6922 KOps/s $\textbf{\color{#35bf28}+6.48\%}$
test_stack 51.0135ms 50.7435ms 19.7070 Ops/s 19.7203 Ops/s $\color{#d91a1a}-0.07\%$
test_cat 50.9005ms 50.6226ms 19.7540 Ops/s 19.8043 Ops/s $\color{#d91a1a}-0.25\%$
test_sequential_tensordict 0.2985ms 0.2239ms 4.4672 KOps/s 4.6418 KOps/s $\color{#d91a1a}-3.76\%$
test_sequential_graph_module 0.5211ms 0.1251ms 7.9943 KOps/s 8.4164 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_nested_tensordict 0.3750ms 0.2947ms 3.3937 KOps/s 3.5473 KOps/s $\color{#d91a1a}-4.33\%$
test_nested_graph_module 0.5437ms 0.1300ms 7.6922 KOps/s 7.7907 KOps/s $\color{#d91a1a}-1.26\%$
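
The Change column above appears to be the relative difference between this PR's throughput (Ops) and the throughput on repo HEAD. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code. The function names are hypothetical, and the 5% cutoff for counting a row as improved or worsened is an assumption inferred from which rows are bolded in the table.

```python
def percent_change(ops: float, ops_on_head: float) -> float:
    """Relative change of the PR's throughput against repo HEAD, in percent."""
    return (ops - ops_on_head) / ops_on_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD) copied from three rows of the CPU table, all in Ops/s.
rows = [
    ("test_zeros_like", 229.7031, 60.6811),
    ("test_chunk", 826.7023, 919.7810),
    ("test_ones_like", 229.3403, 229.1508),
]

for name, ops, head in rows:
    change = percent_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Some rows mix units (for example KOps/s against Ops/s), so both values would need to be converted to the same unit before calling `percent_change`.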

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 17, 2026
- Fix _deduplicate_src_specs to use hashable (start, stop) tuples
  instead of slice objects as dict keys (slice is unhashable on
  Python < 3.12)
- Add dtensor_send/dtensor_recv stubs to tensorclass.pyi
- Fix lint: remove unused imports (TensorDictPipe TYPE_CHECKING,
  Sequence in megatron, _TransferPlan/ParameterPlan in tests),
  unused variable (sends in ModelTransferPlan.execute), loop
  variable naming (key -> _key in base.py)
- Fix docstring formatting (D205/D415 in _chunk_slice,
  ModelTransferPlan)
- Add examples/*.py to T201 (print) lint ignore in setup.cfg
- Remove unused variable in example file
- Auto-format with ufmt/black

Made-with: Cursor
ghstack-source-id: c7756ce
Pull-Request: #1652
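
The first bullet of the commit message replaces `slice` objects with `(start, stop)` tuples as dict keys, because `slice` only became hashable in Python 3.12. The sketch below illustrates that idea under stated assumptions. `slice_key` and `deduplicate_by_range` are hypothetical names, the `(rank, slice)` input shape is invented for illustration, and this is not the actual body of `_deduplicate_src_specs`.

```python
from typing import Dict, Iterable, List, Tuple


def slice_key(s: slice) -> Tuple[int, int]:
    """Turn a slice into a hashable (start, stop) tuple usable as a dict key."""
    # Assumption: the step is always None, so (start, stop) identifies the range.
    return (s.start, s.stop)


def deduplicate_by_range(
    specs: Iterable[Tuple[int, slice]]
) -> Dict[Tuple[int, int], List[int]]:
    """Group hypothetical (rank, slice) pairs by the range they cover."""
    grouped: Dict[Tuple[int, int], List[int]] = {}
    for rank, s in specs:
        # Using the slice itself as the key raises TypeError before Python 3.12.
        grouped.setdefault(slice_key(s), []).append(rank)
    return grouped


specs = [(0, slice(0, 4)), (1, slice(0, 4)), (2, slice(4, 8))]
print(deduplicate_by_range(specs))
```

Keying on a plain tuple behaves the same on every supported Python version, which is the point of a 3.10 compatibility fix.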
@github-actions
Contributor

PR Title Label Error

Unknown or invalid prefix [DTensor].

Current title: [DTensor] Fix CI: Python 3.10 compat, lint, and pyi stubs

Supported Prefixes

Your PR title must start with exactly one of these prefixes (case-insensitive):

| Prefix | Label Applied | Example |
| --- | --- | --- |
| [BugFix] or [Fix] | bug | [BugFix] Fix memory leak in TensorDict |
| [Feature] | Feature | [Feature] Add new storage backend |
| [Doc] or [Docs] | documentation | [Doc] Update installation guide |
| [Refactor] | Refactor | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Test | [Test] Add unit tests for nn module |
| [Compile] | Compile | [Compile] Fix torch.compile issue |
| [Performance] or [Perf] | Performance | [Perf] Optimize tensor operations |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Setup] | setup | [Setup] Update build configuration |
| [Distributed] or [Dist] | Distributed | [Distributed] Add scatter collective |
| [Benchmark] or [Bench] | Benchmarks | [Benchmark] Add compile benchmark |
| [Typing] or [Type] | Typing | [Typing] Add type stubs |
| [BC-breaking] or [BC] | BC-breaking | [BC-breaking] Remove deprecated API |
| [Formatting] or [Format] | Formatting | [Format] Fix code style |
| [Quality] | Quality | [Quality] Improve error messages |

Note: Matching is case-insensitive. Common variations (singular/plural) are supported.
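
The rule the bot describes (a bracketed prefix, matched case-insensitively against a fixed list) can be sketched as follows. This is an illustrative reconstruction and not the workflow's real implementation. `PREFIX_TO_LABEL` and `label_for_title` are hypothetical names, and the mapping is copied from the table in the comment above.

```python
import re
from typing import Optional

# Lowercased prefix -> label, copied from the bot's table of supported prefixes.
PREFIX_TO_LABEL = {
    "bugfix": "bug", "fix": "bug",
    "feature": "Feature",
    "doc": "documentation", "docs": "documentation",
    "refactor": "Refactor",
    "ci": "CI",
    "test": "Test", "tests": "Test",
    "compile": "Compile",
    "performance": "Performance", "perf": "Performance",
    "deprecation": "Deprecation",
    "setup": "setup",
    "distributed": "Distributed", "dist": "Distributed",
    "benchmark": "Benchmarks", "bench": "Benchmarks",
    "typing": "Typing", "type": "Typing",
    "bc-breaking": "BC-breaking", "bc": "BC-breaking",
    "formatting": "Formatting", "format": "Formatting",
    "quality": "Quality",
}

# Capture the text inside the first pair of square brackets at the start of the title.
_PREFIX_RE = re.compile(r"^\[([^\]]+)\]")


def label_for_title(title: str) -> Optional[str]:
    """Return the label for a PR title, or None if the prefix is not supported."""
    match = _PREFIX_RE.match(title.strip())
    if match is None:
        return None
    return PREFIX_TO_LABEL.get(match.group(1).strip().lower())


print(label_for_title("[DTensor] Fix CI: Python 3.10 compat, lint, and pyi stubs"))
print(label_for_title("[ci] Fix workflow permissions"))
```

Under this sketch the current title is rejected because `dtensor` is not in the mapping, while a title starting with `[CI]` or `[Fix]` would be accepted.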


[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 17, 2026
ghstack-source-id: 4fd0dc9
Pull-Request: #1652


Labels: CLA Signed, tensorclass, Test
