Sallyeen
This PR introduces several major enhancements to FramePack:

  • Local Service Deployment: Added support for running and invoking models as local services.
  • Distributed Inference Acceleration: Enabled parallelism with Ulysses and Ring Attention for faster distributed inference.
  • Parallel VAE Decode: Optimized VAE decoding by introducing parallel execution.
  • QKV Projection Fusion: Improved efficiency with optional fused QKV projection operations.
  • Torch.compile Integration: Leveraged torch.compile to further optimize runtime performance.
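
The QKV-fusion idea can be illustrated with a minimal numpy sketch (illustrative only, not FramePack's actual implementation; all names and shapes are hypothetical): the three attention projection weight matrices are concatenated so that a single wide matmul replaces three separate ones, which reduces kernel-launch overhead and improves GPU utilization.

```python
import numpy as np

# Hypothetical shapes for illustration; FramePack's real code differs.
rng = np.random.default_rng(0)
d_model = 8
x = rng.standard_normal((4, d_model))          # (seq_len, d_model)
w_q = rng.standard_normal((d_model, d_model))  # query projection
w_k = rng.standard_normal((d_model, d_model))  # key projection
w_v = rng.standard_normal((d_model, d_model))  # value projection

# Unfused: three separate projections (three matmuls).
q, k, v = x @ w_q, x @ w_k, x @ w_v

# Fused: concatenate the weights once, do a single wide matmul, then split.
w_qkv = np.concatenate([w_q, w_k, w_v], axis=1)  # (d_model, 3 * d_model)
q2, k2, v2 = np.split(x @ w_qkv, 3, axis=1)

# The fused path is numerically equivalent to the unfused one.
assert np.allclose(q, q2) and np.allclose(k, k2) and np.allclose(v, v2)
```

In PyTorch this is typically done by merging three `nn.Linear` layers into one with triple the output features; `torch.compile` can then further fuse the surrounding elementwise ops around the single projection.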

Mia and others added 4 commits August 26, 2025 17:03
@Sallyeen (Author)

Performance Improvement

  • Before optimization: Single RTX 4090, deployment time ≈ 237s
  • After optimization: 8× RTX 4090, deployment time ≈ 49s
  • Speedup: ~4.8× faster
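
The quoted speedup follows directly from the two reported timings:

```python
# Speedup implied by the reported deployment times (237 s -> 49 s).
before_s, after_s = 237, 49
speedup = before_s / after_s
print(f"~{speedup:.1f}x")  # prints "~4.8x"
```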

Notes

These improvements have been tested in both local and distributed environments and show significant gains in deployment efficiency and inference scalability.

Finally, I would like to sincerely thank you and the community for your outstanding work on this project.
It provides an excellent foundation for further optimization, and I hope this PR can be helpful in return.
Looking forward to your feedback!
