Support InternVL2 Series #2629

Draft
wants to merge 10 commits into
base: main

amosyou
Contributor

@amosyou commented Dec 28, 2024

Motivation

This PR adds support for the InternVL2 series of models. I'm currently working on InternVL2-8B.

Modifications

These are the main modifications I'm making relative to the vLLM implementation.

  • add a new chat template
  • add a dummy image processor, because InternVL2 processes images dynamically
  • rename type -> rope_type in rope_scaling, following vLLM's standardization
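
A minimal sketch of the rope_scaling patch, assuming the model config is available as a plain dict. The function name patch_rope_scaling and the example values are hypothetical and are not taken from this PR's diff.

```python
import copy


def patch_rope_scaling(config_dict: dict) -> dict:
    """Return a copy of config_dict whose rope_scaling uses 'rope_type' instead of 'type'.

    Hypothetical helper for illustration; the PR may apply the patch elsewhere.
    """
    patched = copy.deepcopy(config_dict)
    rope_scaling = patched.get("rope_scaling")
    if isinstance(rope_scaling, dict) and "type" in rope_scaling:
        legacy_value = rope_scaling.pop("type")
        # Keep an existing 'rope_type' if one is already set.
        rope_scaling.setdefault("rope_type", legacy_value)
    return patched


# Example config fragment (illustrative values only).
example = {"rope_scaling": {"type": "dynamic", "factor": 2.0}}
patched = patch_rope_scaling(example)
print(patched)
```

Working on a deep copy leaves the loaded config untouched, which makes the patch easy to drop once the upstream config uses rope_type directly.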

I have some broader design questions as I implement this.

  1. Because InternVL2 processes images dynamically, there is no image_processor for us to load. The transforms are computed on the fly for each individual image, so I'm using a dummy image processor and transforming the images in the forward pass of the model. This should be fine, right?
  2. Do we want to remove the vLLM RoPE dependency now, or can that be done in a later commit?
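
To make question 1 concrete, here is a rough sketch of the idea under stated assumptions: a pass-through processor stands in for the missing image_processor, and a per-image step picks a tile grid from the image's aspect ratio. The names DummyImageProcessor and choose_tile_grid, the max_tiles default, and the tie-breaking rule are all hypothetical simplifications, not the code in this PR or the exact InternVL2 preprocessing.

```python
from typing import Any, List, Tuple


class DummyImageProcessor:
    """Hypothetical pass-through processor: returns the images unchanged."""

    def __call__(self, images: List[Any]) -> List[Any]:
        return list(images)


def choose_tile_grid(width: int, height: int, max_tiles: int = 12) -> Tuple[int, int]:
    """Pick the (cols, rows) grid whose aspect ratio is closest to the image's.

    Simplified assumption: ties are broken in favor of fewer tiles.
    """
    aspect = width / height
    candidates = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if cols * rows <= max_tiles
    ]
    return min(
        candidates,
        key=lambda grid: (abs(aspect - grid[0] / grid[1]), grid[0] * grid[1]),
    )


# The dummy processor defers all work; the grid is chosen per image later,
# for example inside the model's forward pass.
processor = DummyImageProcessor()
raw_images = ["image_a", "image_b"]  # placeholders for real image objects
passed_through = processor(raw_images)
wide_grid = choose_tile_grid(896, 448)
tall_grid = choose_tile_grid(448, 1344)
print(passed_through, wide_grid, tall_grid)
```

Because the grid depends on each image's size, the number of tiles varies per request, which is why a fixed, preloaded image processor is awkward here.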

TODOs

  • pad_input_ids in forward

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@amosyou marked this pull request as draft January 1, 2025 05:04