feat(transformers): Transformers 4.54 base #1387
feat(transformers): support qwen3-vl series
fix(transformers): fix typos in qwen3_vl docs
…e_search/group_beam_search/constrained beam search
Summary of Changes
This pull request integrates the Qwen3-VL model series, encompassing both standard and Mixture-of-Experts architectures, into the MindOne Transformers library. The integration is accompanied by new inference examples and a specialized module for MoE handling. Concurrently, the PR features a comprehensive refactoring effort across the library's foundational components, including updates to the activation functions.
Code Review
This pull request upgrades several modules to align with Hugging Face Transformers v4.54.1, including significant refactoring for MoE support, attention mechanisms, and generation utilities. Overall, the changes are a good step forward, introducing better structure and new features. I've identified a few areas for improvement, mainly concerning code robustness and documentation clarity.
What does this PR do?
This PR upgrades the modeling_XXX / generation / cache_utils / activations / processing_XXX modules to align with HF Transformers 4.54.1.
Highlighted Features
modeling_XXX
- EmbeddingUtilsMixin has been extracted from the pretrained base class as an independent part for handling input/output embeddings and tie_weights.
- The attn_implementation handling has been refactored. In transformers 4.50, "autoset_attn_implementation" ran inside the from_pretrained and from_config funcs, so attn_implementation could still be modified after being set in your scripts. In transformers 4.54, "check_and_adjust_attn_implementation" replaces "autoset_attn_implementation": the default attn_implementation is set during init, and in from_pretrained or from_config it is only changed when the user sets it explicitly (sketched below).
- get_mindspore_dtype has been extracted from the from_pretrained logic to clarify the code.
- find_missing_and_unexpected_keys has been extracted as an independent func, and key_rename_mapping has been added to minimize the model-weight gap between PyTorch and HF Transformers.
- get_compiled_call has been added as an initial exploration of a substitute for "torch.compile".

generation
- Previously, mindone.transformers used _supports_dynamic_input as the flag controlling whether input padding and a compilable cache need to be implemented. The logic now uses _supports_jit as the flag instead, so no extra variables need to be added in modeling_xxx.py.
- load_custom_generate has been added so that a custom generate.py can be used for inference (sketched below).
- chunk_prefill and compile support has been added.

cache
- CacheLayerMixin has been extracted as the base class for HybridCache.

Models
- Qwen3_VL/Qwen3_VL_MoE model support (a usage sketch follows this list).
- XXXModel has been extracted from XXXForConditionalGeneration to handle the visual and text models; XXXForConditionalGeneration keeps only the generic functionality and is composed of XXXModel and lm_head (sketched below).
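A minimal sketch of the new attention-selection behavior, assuming the mindone.transformers auto classes mirror the HF API here (the checkpoint name is illustrative only):

```python
from mindone.transformers import AutoModelForCausalLM

# Under the 4.54-style flow, the default attn_implementation is fixed during
# init and is no longer silently auto-adjusted afterwards.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# The only way to change it via from_pretrained is an explicit user setting:
model_fa = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    attn_implementation="flash_attention_2",
)
```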
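For load_custom_generate, a hedged sketch of how a custom generate.py might be consumed, modeled on the upstream HF `custom_generate` convention; the local directory name is hypothetical:

```python
import mindspore as ms
from transformers import AutoTokenizer
from mindone.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

inputs = tokenizer("Hello", return_tensors="np")
inputs = {k: ms.Tensor(v) for k, v in inputs.items()}

# "./my_decoding" is a hypothetical directory holding custom_generate/generate.py;
# its generate() would be loaded and used instead of the built-in decoding loop.
out = model.generate(
    **inputs,
    custom_generate="./my_decoding",
    trust_remote_code=True,
    max_new_tokens=32,
)
```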
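A usage sketch for the newly supported models, assuming the class name mirrors upstream Transformers (Qwen3VLForConditionalGeneration) and using an illustrative checkpoint:

```python
import mindspore as ms
from transformers import AutoProcessor
from mindone.transformers import Qwen3VLForConditionalGeneration  # name assumed to mirror upstream

model = Qwen3VLForConditionalGeneration.from_pretrained("Qwen/Qwen3-VL-4B-Instruct")
processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-4B-Instruct")

messages = [{"role": "user", "content": [
    {"type": "image", "image": "demo.jpg"},  # local image path, illustrative
    {"type": "text", "text": "Describe this image."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="np",
)
inputs = {k: ms.Tensor(v) for k, v in inputs.items()}

out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```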
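The XXXModel/XXXForConditionalGeneration split can be pictured with this self-contained skeleton (XXX placeholders as in this PR; Dense layers stand in for the real sub-models):

```python
from mindspore import nn

class XXXModel(nn.Cell):
    """Backbone owning the visual and text sub-models (stand-ins here)."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.visual = nn.Dense(hidden_size, hidden_size)          # stand-in visual encoder
        self.language_model = nn.Dense(hidden_size, hidden_size)  # stand-in text model

    def construct(self, inputs_embeds):
        # the real model merges visual features into the text stream here
        return self.language_model(self.visual(inputs_embeds))

class XXXForConditionalGeneration(nn.Cell):
    """Thin wrapper keeping only generic glue: XXXModel plus lm_head."""
    def __init__(self, hidden_size=32, vocab_size=100):
        super().__init__()
        self.model = XXXModel(hidden_size)
        self.lm_head = nn.Dense(hidden_size, vocab_size, has_bias=False)

    def construct(self, inputs_embeds):
        return self.lm_head(self.model(inputs_embeds))
```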
Validation
ModernBert: the ModernBert UT passes when run for a single model, e.g. pytest tests/transformers_tests/models/ModernBert, but it raises an OSError when running the commands above.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@xxx