Nixl optimization for llama4 local attention #87

Open
wants to merge 59 commits into
base: pd-launch-branch

Conversation

mgoin
Member

@mgoin mgoin commented May 15, 2025

No description provided.

ekagra-ranjan and others added 30 commits May 14, 2025 12:31
…aft model to free ~1GB for llama 3 model (vllm-project#17326)

Co-authored-by: root <[email protected]>
Co-authored-by: Woosuk Kwon <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Signed-off-by: reidliu41 <[email protected]>
Co-authored-by: reidliu41 <[email protected]>
schoennenbeck and others added 25 commits May 15, 2025 09:00
… unquantizedMethod to reenable LLama4 BF16 (vllm-project#18205)

Signed-off-by: tjtanaa <[email protected]>
Signed-off-by: lisiqi23 <[email protected]>
Signed-off-by: skylee-01 <[email protected]>
Co-authored-by: lisiqi23 <[email protected]>
Signed-off-by: mgoin <[email protected]>