Merged
103 commits
aa602ac
Initial LTX 2.0 transformer implementation
dg845 Dec 12, 2025
b3096c3
Add tests for LTX 2 transformer model
dg845 Dec 13, 2025
980591d
Get LTX 2 transformer tests working
dg845 Dec 13, 2025
e100b8f
Rename LTX 2 compile test class to have LTX2
dg845 Dec 13, 2025
780fb61
Remove RoPE debug print statements
dg845 Dec 13, 2025
5765759
Get LTX 2 transformer compile tests passing
dg845 Dec 15, 2025
aeecc4d
Fix LTX 2 transformer shape errors
dg845 Dec 15, 2025
a5f2d2d
Initial script to convert LTX 2 transformer to diffusers
dg845 Dec 15, 2025
d86f89d
Add more LTX 2 transformer audio arguments
dg845 Dec 16, 2025
57a8b9c
Allow LTX 2 transformer to be loaded from local path for conversion
dg845 Dec 16, 2025
a7bc052
Improve dummy inputs and add test for LTX 2 transformer consistency
dg845 Dec 16, 2025
bda3ff1
Fix LTX 2 transformer bugs so consistency test passes
dg845 Dec 16, 2025
269cf7b
Initial implementation of LTX 2.0 video VAE
dg845 Dec 17, 2025
baf23e2
Explicitly specify temporal and spatial VAE scale factors when conver…
dg845 Dec 17, 2025
5b950d6
Add initial LTX 2.0 video VAE tests
dg845 Dec 17, 2025
491aae0
Add initial LTX 2.0 video VAE tests (part 2)
dg845 Dec 17, 2025
a748975
Get diffusers implementation on par with official LTX 2.0 video VAE i…
dg845 Dec 19, 2025
c6a11a5
Initial LTX 2.0 vocoder implementation
dg845 Dec 19, 2025
8bfeb4a
Merge pull request #3 from huggingface/ltx-2-vocoder
dg845 Dec 20, 2025
b1cf6ff
Merge pull request #2 from huggingface/ltx-2-video-vae
dg845 Dec 20, 2025
6c56954
Use RMSNorm implementation closer to original for LTX 2.0 video VAE
dg845 Dec 20, 2025
b34ddb1
start audio decoder.
sayakpaul Dec 22, 2025
f4c2435
init registration.
sayakpaul Dec 22, 2025
e54cd6b
up
sayakpaul Dec 22, 2025
907896d
simplify and clean up
sayakpaul Dec 22, 2025
4904fd6
up
sayakpaul Dec 22, 2025
0028955
Initial LTX 2.0 text encoder implementation
dg845 Dec 22, 2025
d0f9cda
Rough initial LTX 2.0 pipeline implementation
dg845 Dec 22, 2025
5f0f2a0
up
sayakpaul Dec 22, 2025
58257eb
up
sayakpaul Dec 22, 2025
059999a
up
sayakpaul Dec 22, 2025
8134da6
up
sayakpaul Dec 22, 2025
409d651
resolve conflicts.
sayakpaul Dec 22, 2025
7bb4cf7
Merge pull request #5 from huggingface/audio-decoder
dg845 Dec 23, 2025
5f7e43d
Add imports for LTX 2.0 Audio VAE
dg845 Dec 23, 2025
d303e2a
Conversion script for LTX 2.0 Audio VAE Decoder
dg845 Dec 23, 2025
ae3b6e7
Merge branch 'ltx-2-transformer' into ltx-2-t2v-pipeline
dg845 Dec 23, 2025
54bfc5d
Add Audio VAE logic to T2V pipeline
dg845 Dec 23, 2025
6e6ce20
Duplicate scheduler for audio latents
dg845 Dec 23, 2025
cbb10b8
Support num_videos_per_prompt for prompt embeddings
dg845 Dec 23, 2025
595f485
LTX 2.0 scheduler and full pipeline conversion
dg845 Dec 23, 2025
3bf7369
Add script to test full LTX2Pipeline T2V inference
dg845 Dec 23, 2025
fa7d9f7
Fix pipeline return bugs
dg845 Dec 23, 2025
a56cf23
Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__
dg845 Dec 23, 2025
90edc6a
Fix more bugs in LTX2Pipeline.__call__
dg845 Dec 23, 2025
1484c43
Improve CPU offload support
dg845 Dec 23, 2025
f9b9476
Fix pipeline audio VAE decoding dtype bug
dg845 Dec 23, 2025
e89d9c1
Fix video shape error in full pipeline test script
dg845 Dec 23, 2025
b5891b1
Get LTX 2 T2V pipeline to produce reasonable outputs
dg845 Dec 24, 2025
0c41297
Merge pull request #4 from huggingface/ltx-2-t2v-pipeline
dg845 Dec 24, 2025
581f21c
Make LTX 2.0 scheduler more consistent with original code
dg845 Dec 29, 2025
e1f0b7e
Fix typo when applying scheduler fix in T2V inference script
dg845 Dec 29, 2025
280e347
Refactor Audio VAE to be simpler and remove helpers (#7)
sayakpaul Dec 30, 2025
46822c4
Add support for I2V (#8)
sayakpaul Dec 30, 2025
6a236a2
Merge branch 'ltx-2-transformer' into make-scheduler-consistent
dg845 Dec 30, 2025
bd607b9
Denormalize audio latents in I2V pipeline (analogous to T2V change) (…
dg845 Dec 31, 2025
d3f10fe
test i2v.
sayakpaul Dec 31, 2025
aae70b9
Merge pull request #10 from huggingface/make-scheduler-consistent
dg845 Dec 31, 2025
caae167
Move Video and Audio Text Encoder Connectors to Transformer (#12)
dg845 Jan 5, 2026
0be4f31
up (#19)
sayakpaul Jan 5, 2026
c5b52d6
address initial feedback from lightricks team (#16)
sayakpaul Jan 5, 2026
2fa4f84
When using split RoPE, make sure that the output dtype is same as inp…
dg845 Jan 5, 2026
bff9891
Fix apply split RoPE shape error when reshaping x to 4D
dg845 Jan 6, 2026
cb50cac
Add export_utils file for exporting LTX 2.0 videos with audio
dg845 Jan 6, 2026
ce9da5d
Merge pull request #20 from huggingface/video-export-utils-file
dg845 Jan 6, 2026
93a417f
Tests for T2V and I2V (#6)
sayakpaul Jan 6, 2026
9b8788c
resolve conflicts.
sayakpaul Jan 6, 2026
c039c87
up
sayakpaul Jan 6, 2026
550eca3
use export util funcs.
sayakpaul Jan 6, 2026
ef19911
Point original checkpoint to LTX 2.0 official checkpoint
dg845 Jan 6, 2026
ace2ee9
Allow the I2V pipeline to accept image URLs
dg845 Jan 6, 2026
dd81242
make style and make quality
dg845 Jan 6, 2026
2fc5789
Merge branch 'main' into ltx-2-transformer
sayakpaul Jan 6, 2026
57ead0b
remove function map.
sayakpaul Jan 6, 2026
c39f1b8
remove args.
sayakpaul Jan 6, 2026
bdcf23e
update docs.
sayakpaul Jan 6, 2026
61e0fb4
update doc entries.
sayakpaul Jan 6, 2026
8c5ab1f
disable ltx2_consistency test
sayakpaul Jan 6, 2026
64b48c1
Merge branch 'main' into ltx-2-transformer
sayakpaul Jan 6, 2026
5e0cf2b
Simplify LTX 2 RoPE forward by removing coords is None logic
dg845 Jan 6, 2026
d01a242
make style and make quality
dg845 Jan 6, 2026
79cf6d7
Support LTX 2.0 audio VAE encoder
dg845 Jan 7, 2026
cc28cf7
Merge branch 'main' into ltx-2-transformer
sayakpaul Jan 7, 2026
91ee2dd
resolve conflicts
sayakpaul Jan 7, 2026
5269ee5
Merge branch 'ltx-2-transformer' of github.com:huggingface/diffusers …
dg845 Jan 7, 2026
a17f5cb
Apply suggestions from code review
dg845 Jan 7, 2026
964f106
Remove print statement in audio VAE
dg845 Jan 7, 2026
4dfe509
up
sayakpaul Jan 7, 2026
249ae1f
Merge branch 'main' into ltx-2-transformer
sayakpaul Jan 7, 2026
040c118
Fix bug when calculating audio RoPE coords
dg845 Jan 7, 2026
44925cb
Ltx 2 latent upsample pipeline (#12922)
sayakpaul Jan 7, 2026
5e50046
Fix latent upsampler filename in LTX 2 conversion script
dg845 Jan 8, 2026
2b85b93
Add latent upsample pipeline to LTX 2 docs
dg845 Jan 8, 2026
40ee3e3
Add dummy objects for LTX 2 latent upsample pipeline
dg845 Jan 8, 2026
99ff722
Set default FPS to official LTX 2 ckpt default of 24.0
dg845 Jan 8, 2026
165b945
Set default CFG scale to official LTX 2 ckpt default of 4.0
dg845 Jan 8, 2026
1a4ae58
Update LTX 2 pipeline example docstrings
dg845 Jan 8, 2026
b4d33df
make style and make quality
dg845 Jan 8, 2026
724afee
Remove LTX 2 test scripts
dg845 Jan 8, 2026
d24faa7
Fix LTX 2 upsample pipeline example docstring
dg845 Jan 8, 2026
353f0db
Add logic to convert and save a LTX 2 upsampling pipeline
dg845 Jan 8, 2026
0c9e4e2
Merge branch 'main' into ltx-2-transformer
sayakpaul Jan 8, 2026
f85b969
Document LTX2VideoTransformer3DModel forward pass
dg845 Jan 8, 2026
8 changes: 8 additions & 0 deletions docs/source/en/_toctree.yml
@@ -367,6 +367,8 @@
title: LatteTransformer3DModel
- local: api/models/longcat_image_transformer2d
title: LongCatImageTransformer2DModel
- local: api/models/ltx2_video_transformer3d
title: LTX2VideoTransformer3DModel
- local: api/models/ltx_video_transformer3d
title: LTXVideoTransformer3DModel
- local: api/models/lumina2_transformer2d
@@ -443,6 +445,10 @@
title: AutoencoderKLHunyuanVideo
- local: api/models/autoencoder_kl_hunyuan_video15
title: AutoencoderKLHunyuanVideo15
- local: api/models/autoencoderkl_audio_ltx_2
title: AutoencoderKLLTX2Audio
- local: api/models/autoencoderkl_ltx_2
title: AutoencoderKLLTX2Video
- local: api/models/autoencoderkl_ltx_video
title: AutoencoderKLLTXVideo
- local: api/models/autoencoderkl_magvit
@@ -678,6 +684,8 @@
title: Kandinsky 5.0 Video
- local: api/pipelines/latte
title: Latte
- local: api/pipelines/ltx2
title: LTX-2
- local: api/pipelines/ltx_video
title: LTXVideo
- local: api/pipelines/mochi
29 changes: 29 additions & 0 deletions docs/source/en/api/models/autoencoderkl_audio_ltx_2.md
@@ -0,0 +1,29 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# AutoencoderKLLTX2Audio

The audio variational autoencoder (VAE) model with KL loss used in [LTX-2](https://huggingface.co/Lightricks/LTX-2) was introduced by Lightricks. It encodes audio into latent representations and decodes those latents back into audio.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import AutoencoderKLLTX2Audio

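# Assumption: the audio VAE weights sit under the "audio_vae" subfolder (its pipeline component name);
# the "vae" subfolder is the one used for the video VAE.
vae = AutoencoderKLLTX2Audio.from_pretrained("Lightricks/LTX-2", subfolder="audio_vae", torch_dtype=torch.float32).to("cuda")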
```

## AutoencoderKLLTX2Audio

[[autodoc]] AutoencoderKLLTX2Audio
- encode
- decode
- all
29 changes: 29 additions & 0 deletions docs/source/en/api/models/autoencoderkl_ltx_2.md
@@ -0,0 +1,29 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# AutoencoderKLLTX2Video

The 3D variational autoencoder (VAE) model with KL loss used in [LTX-2](https://huggingface.co/Lightricks/LTX-2) was introduced by Lightricks.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import AutoencoderKLLTX2Video

vae = AutoencoderKLLTX2Video.from_pretrained("Lightricks/LTX-2", subfolder="vae", torch_dtype=torch.float32).to("cuda")
```

## AutoencoderKLLTX2Video

[[autodoc]] AutoencoderKLLTX2Video
- decode
- encode
- all
26 changes: 26 additions & 0 deletions docs/source/en/api/models/ltx2_video_transformer3d.md
@@ -0,0 +1,26 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# LTX2VideoTransformer3DModel

A Diffusion Transformer model for 3D data, used in [LTX-2](https://huggingface.co/Lightricks/LTX-2), was introduced by Lightricks.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import LTX2VideoTransformer3DModel

transformer = LTX2VideoTransformer3DModel.from_pretrained("Lightricks/LTX-2", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
```

## LTX2VideoTransformer3DModel

[[autodoc]] LTX2VideoTransformer3DModel
43 changes: 43 additions & 0 deletions docs/source/en/api/pipelines/ltx2.md
@@ -0,0 +1,43 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License. -->

# LTX-2

LTX-2 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model. It brings together the core building blocks of modern video generation, with open weights and a focus on practical, local execution.

You can find all the original LTX-2 checkpoints under the [Lightricks](https://huggingface.co/Lightricks) organization.

The original codebase for LTX-2 can be found [here](https://github.com/Lightricks/LTX-2).
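
As a rough illustration of how the pipeline documented below might be called, here is a minimal sketch of text-to-video generation. It assumes that `Lightricks/LTX-2` loads as a complete pipeline, and that `LTX2Pipeline.__call__` accepts `prompt`, `frame_rate`, and `guidance_scale` keyword arguments; those argument names are assumptions, not confirmed by this page. The default values of 24.0 FPS and a CFG scale of 4.0 come from the commit messages in this PR. `build_call_kwargs` and `generate` are hypothetical helper names used only for this example.

```python
from typing import Any

# Defaults taken from this PR's commit messages (FPS 24.0, CFG scale 4.0).
# The keyword names themselves are assumptions about LTX2Pipeline.__call__.
LTX2_DEFAULTS: dict[str, Any] = {"frame_rate": 24.0, "guidance_scale": 4.0}


def build_call_kwargs(prompt: str, **overrides: Any) -> dict[str, Any]:
    """Merge the prompt and any overrides on top of the checkpoint defaults."""
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    return {**LTX2_DEFAULTS, "prompt": prompt, **overrides}


def generate(prompt: str, repo_id: str = "Lightricks/LTX-2", **overrides: Any):
    """Run text-to-video generation and return the pipeline output unchanged."""
    # Imported lazily so build_call_kwargs works without torch or diffusers installed.
    import torch
    from diffusers import LTX2Pipeline

    pipe = LTX2Pipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    # Offload idle components to the CPU to lower peak GPU memory use.
    pipe.enable_model_cpu_offload()
    return pipe(**build_call_kwargs(prompt, **overrides))
```

Calling `generate("a short prompt")` would download the checkpoint and return whatever the pipeline returns, which this page documents as `LTX2PipelineOutput`. Keeping the keyword construction in a separate function makes it possible to inspect or override the defaults without loading any weights.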

## LTX2Pipeline

[[autodoc]] LTX2Pipeline
- all
- __call__

## LTX2ImageToVideoPipeline

[[autodoc]] LTX2ImageToVideoPipeline
- all
- __call__

## LTX2LatentUpsamplePipeline

[[autodoc]] LTX2LatentUpsamplePipeline
- all
- __call__

## LTX2PipelineOutput

[[autodoc]] pipelines.ltx2.pipeline_output.LTX2PipelineOutput