Add SkyReels V2: Infinite-Length Film Generative Model #11518
…usion forcing - Introduced the drafts of `SkyReelsV2TextToVideoPipeline`, `SkyReelsV2ImageToVideoPipeline`, `SkyReelsV2DiffusionForcingPipeline`, and `FlowUniPCMultistepScheduler`.
It's about time. Thanks.
Replaces custom attention implementations with `SkyReelsV2AttnProcessor2_0` and the standard `Attention` module. Updates `WanAttentionBlock` to use `FP32LayerNorm` and `FeedForward`. Removes the `model_type` parameter, simplifying model architecture and attention block initialization.
Introduces new classes `SkyReelsV2ImageEmbedding` and `SkyReelsV2TimeTextImageEmbedding` for enhanced image and time-text processing. Refactors the `SkyReelsV2Transformer3DModel` to integrate these embeddings, updating the constructor parameters for better clarity and functionality. Removes unused classes and methods to streamline the codebase.
…ds and begin reorganizing the forward pass.
…hod, integrating rotary embeddings and improving attention handling. Removes the deprecated `rope_apply` function and streamlines the attention mechanism for better integration and clarity.
…ethod by updating parameter names for clarity, integrating attention masks, and improving the handling of encoder hidden states.
…ethod by enhancing the handling of time embeddings and encoder hidden states. Updates parameter names for clarity and integrates rotary embeddings, ensuring better compatibility with the model's architecture.
… of latent tensors and video output generation. Update logic for processing latents and ensure correct video formatting, enhancing clarity and maintainability in the video generation workflow.
FWIW, I have successfully used the same T5 encoder as WAN 2.1 with this model, just by fiddling with their pipeline:
Then this, where I combine my bitsandbytes nf4 transformer, their tokenizer, and the WAN-based T5 encoder:
I need to add this function to the pipeline for the T5 encoder to work:
…predix_video_latent_length` to `prefix_video_latent_length` for consistency.
… consistency and update related logic in `SkyReelsV2DiffusionForcingPipeline`.
…st `causal_block_size` and `num_steps` references in `SkyReelsV2DiffusionForcingPipeline` to use the correct configuration properties for improved consistency and clarity.
…al_attn_mask` method to use `Union` for compatibility with Python <= 3.9.
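For context on the `Union` change: the `X | Y` annotation syntax only works from Python 3.10, so on 3.9 it fails when the function is defined. Below is a minimal sketch of the idea, not the code from this PR. `block_causal_mask` is a hypothetical helper, and the assumption that frames are grouped into blocks of `causal_block_size` which may attend to their own and earlier blocks is an illustration only.

```python
from typing import List, Union


def block_causal_mask(
    num_frames: int,
    # `int | None` raises TypeError at definition time on Python 3.9
    # (unless `from __future__ import annotations` is used), hence `Union`.
    causal_block_size: Union[int, None] = None,
) -> List[List[bool]]:
    """Hypothetical helper: mask[i][j] is True when frame i may attend to frame j."""
    block = causal_block_size or 1  # None falls back to plain per-frame causality
    return [
        [(j // block) <= (i // block) for j in range(num_frames)]
        for i in range(num_frames)
    ]


mask = block_causal_mask(4, causal_block_size=2)
```

With `causal_block_size=2`, frames 0 and 1 see each other but not frames 2 and 3, while frames 2 and 3 see everything.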
It seems appropriate to me. Only Diffusion Forcing pipelines are different for large models. How are the results with your setting?
…ng of timesteps for sample schedulers in both short and long video generation paths, enhancing code clarity and maintainability.
…d timestep handling and refactor related logic to enhance clarity and maintainability.
…maintainability by updating type hints, adjusting timestep handling, and correcting parameter defaults for `shift` and `addnoise_condition`.
…fy examples and return values.
…ion to use `pathlib.Path` for improved compatibility.
…ToVideoPipeline` and set text encoder and VAE to None to reduce memory usage.
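On the `pathlib.Path` commit above: the function it touches is truncated in the log, so the following is only a sketch of the general pattern, with a hypothetical helper name and made-up directory names. Joining path segments with `Path` avoids hard-coded separators, which is usually the compatibility concern.

```python
from pathlib import Path
from typing import Union


def resolve_checkpoint_file(
    root: Union[str, Path], subfolder: str, filename: str
) -> Path:
    """Hypothetical helper: join path segments without hard-coding '/' or '\\'."""
    return Path(root) / subfolder / filename


# Illustrative names only; they are not taken from the PR.
config_path = resolve_checkpoint_file("checkpoints", "transformer", "config.json")
```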
Thanks for the opportunity to fix #11374!
Original repo: https://github.com/SkyworkAI/SkyReels-V2
TODOs:
✅ `FlowMatchUniPCMultistepScheduler`: Just copy-pasted (for now).
⏳ `SkyReelsV2DiffusionForcingPipeline`
⏳ `SkyReelsV2DiffusionForcingImageToVideoPipeline`: Includes Start/End Frame Control.
⏳ `SkyReelsV2DiffusionForcingVideoToVideoPipeline`: Extends a given video.
⬜ `SkyReelsV2Pipeline`
⬜ `SkyReelsV2ImageToVideoPipeline`
⬜ Did you make sure to update the documentation with your changes?
⬜ Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.