AeroScripts/leapfusion-hunyuan-image2video

Leapfusion Hunyuan Image-to-Video

Show your support! You can try HunyuanVideo free with some of our custom spice here. Supporting LeapFusion enables us to do more open source releases like this in the future!

Training code can be found here.

Usage

First, download the Hunyuan weights as explained here and get the image2video LoRA weights from here. Then run the following command to encode an image (for example, input_image.png):

python encode_image.py --vae hunyuan-video-t2v-720p/vae/pytorch_model.pt --vae_chunk_size 32 --vae_tiling --image ./input_image.png

Then you can generate a video with a command like:

python generate.py --fp8 --video_size 320 512 --infer_steps 30 --save_path ./samples/ --output_type both --dit mp_rank_00_model_states.pt --attn_mode sdpa --split_attn --vae hunyuan-video-t2v-720p/vae/pytorch_model.pt --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 --text_encoder1 llava_llama3_fp16.safetensors --text_encoder2 clip_l.safetensors --lora_multiplier 1.0 --lora_weight img2vid.safetensors --video_length 129 --prompt "" --seed 123 

If you leave the prompt blank, the model will infer based on the image alone. If your prompt asks for changes, make sure to also describe some baseline details about the image, or you might get bad results.

Note: The current model is trained at 512x320, as our research budget is quite small. If you would like to help train a higher-resolution checkpoint and have some spare compute, please reach out!
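
If you want to script the two steps above, the following is a minimal Python sketch that builds both command lines and runs them in order. It uses only the flags and file names shown in the commands above. The helper names (`encode_cmd`, `generate_cmd`, `run_pipeline`) and the `dry_run` switch are hypothetical and not part of this repository. It assumes you run it from the repository root with the weights in the paths shown.

```python
import shlex
import subprocess

# Paths copied from the commands above; adjust them to where your weights live.
VAE = "hunyuan-video-t2v-720p/vae/pytorch_model.pt"


def encode_cmd(image, vae=VAE):
    """Build the encode_image.py command line (hypothetical helper)."""
    return [
        "python", "encode_image.py",
        "--vae", vae,
        "--vae_chunk_size", "32",
        "--vae_tiling",
        "--image", image,
    ]


def generate_cmd(prompt="", seed=123, save_path="./samples/", vae=VAE):
    """Build the generate.py command line (hypothetical helper)."""
    return [
        "python", "generate.py",
        "--fp8",
        # Same values as the example command; the note above says the model
        # was trained at 512x320.
        "--video_size", "320", "512",
        "--infer_steps", "30",
        "--save_path", save_path,
        "--output_type", "both",
        "--dit", "mp_rank_00_model_states.pt",
        "--attn_mode", "sdpa",
        "--split_attn",
        "--vae", vae,
        "--vae_chunk_size", "32",
        "--vae_spatial_tile_sample_min_size", "128",
        "--text_encoder1", "llava_llama3_fp16.safetensors",
        "--text_encoder2", "clip_l.safetensors",
        "--lora_multiplier", "1.0",
        "--lora_weight", "img2vid.safetensors",
        "--video_length", "129",
        "--prompt", prompt,  # an empty string leaves the prompt blank
        "--seed", str(seed),
    ]


def run_pipeline(image, prompt="", seed=123, dry_run=True):
    """Print both commands and, unless dry_run is set, execute them in order."""
    cmds = [encode_cmd(image), generate_cmd(prompt, seed)]
    for cmd in cmds:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a step exits with an error.
            subprocess.run(cmd, check=True)
    return cmds
```

Calling `run_pipeline("./input_image.png")` only prints the two commands. Pass `dry_run=False` to actually run them.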

Samples

icremyuv.mp4
dog.mp4
bubbles.mp4
meme2.mp4

License

Much of the code is based on musubi-tuner. Code under the hunyuan_model directory is modified from HunyuanVideo and follows their license. Other code is under the Apache License 2.0. Some code is copied and modified from musubi-tuner, k-diffusion and Diffusers.

About

A novel approach to hunyuan image-to-video sampling
