Releases: modelscope/ms-swift

v3.0.1

27 Dec 03:45

English Version

New Features:

  1. Support for training, inference, and deployment of SequenceClassification models. You can check the following examples: qwen2.5, bert.
  2. LlamaPro supports multimodal models, such as qwen2vl, internvl2.5, and llama3-vision.

New Models:

  1. Qwen/QVQ-72B-Preview
  2. iic/DocOwl2
  3. OpenGVLab/InternVL2-Pretrain-Models, OpenGVLab/InternVL2_5-4B-AWQ series, OpenGVLab/InternVL2_5-1B-MPO series
  4. deepseek-ai/DeepSeek-V3 series
  5. answerdotai/ModernBERT-base series
  6. AI-ModelScope/paligemma2-3b-pt-224 series, AI-ModelScope/paligemma2-3b-ft-docci-448 series
  7. AI-ModelScope/Skywork-o1-Open-Llama-3.1-8B

What's Changed

Full Changelog: v3.0.0...v3.0.1

v3.0.0

23 Dec 03:17

English Version

Architecture Modifications and New Features:

For more details, please visit: https://swift.readthedocs.io/en/latest/Instruction/ReleaseNote3.0.html

New Models:

  1. OpenGVLab/InternVL2_5-1B series models
  2. LLM-Research/Llama-3.3-70B-Instruct
  3. BAAI/Emu3-Gen
  4. deepseek-ai/DeepSeek-V2.5-1210, deepseek-ai/deepseek-vl2 series models
  5. Shanghai_AI_Laboratory/internlm-xcomposer2d5-ol-7b
  6. InfiniAI/Megrez-3b-Instruct, InfiniAI/Megrez-3B-Omni
  7. TeleAI/TeleChat2-3B series models

What's Changed

Full Changelog: v2.6.1...v3.0.0

v2.6.1

29 Nov 09:29

New Models:

  1. Marco-o1
  2. mPLUG-Owl3-7B-241101
  3. QwQ-32B-Preview
  4. glm-edge, glm-edge-v

New Datasets:

  1. OpenO1-SFT

What's Changed

Full Changelog: v2.6.0...v2.6.1

v2.6.0

13 Nov 08:06

English Version

Models

  1. Support Qwen2.5 coder models

Features

  1. Support the new loss and gradient accumulation (GA) computation from transformers.trainer, and fix the bugs in it.
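
The release note does not spell out what changed in the loss computation. The upstream issue is generally described as a normalization problem: averaging each micro-batch's mean loss is not the same as averaging over all tokens when micro-batches contain different numbers of tokens. The sketch below illustrates that difference in plain Python. The function names and the sample losses are hypothetical, and this is not code from transformers or ms-swift.

```python
def mean_of_means(micro_batches):
    """Average each micro-batch's mean loss (the naive accumulation)."""
    per_batch = [sum(b) / len(b) for b in micro_batches]
    return sum(per_batch) / len(per_batch)


def token_weighted_mean(micro_batches):
    """Sum all token losses, then divide by the total token count."""
    total_loss = sum(sum(b) for b in micro_batches)
    total_tokens = sum(len(b) for b in micro_batches)
    return total_loss / total_tokens


# Hypothetical per-token losses for two micro-batches of unequal length.
micro_batches = [[2.0, 2.0, 2.0, 2.0], [4.0]]
naive = mean_of_means(micro_batches)
exact = token_weighted_mean(micro_batches)

# Reference: the same five tokens processed as one full batch.
full_batch = sum([2.0, 2.0, 2.0, 2.0, 4.0]) / 5
```

Only the token-weighted form reproduces the full-batch loss, because the single-token micro-batch no longer counts as much as the four-token one.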

What's Changed

Full Changelog: v2.5.2...v2.6.0

v2.5.2

02 Nov 07:50

New Models:

  1. emu3-chat
  2. aya-expanse
  3. ministral-8b-inst-2410

New Datasets:

  1. llava-video-178k
  2. moviechat-1k-test

What's Changed

Full Changelog: v2.5.1...v2.5.2

v2.5.1

21 Oct 12:05

English Version

New Features:

  1. Support for reward model (RM) training for LLMs and MLLMs, and for PPO for LLMs.
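
For readers unfamiliar with reward modeling, a common formulation trains the model on preference pairs with a pairwise (Bradley-Terry) loss, `-log(sigmoid(r_chosen - r_rejected))`. The sketch below shows that loss in plain Python. It assumes this standard formulation; the release note does not state which loss ms-swift uses, and the function name is hypothetical.

```python
import math


def pairwise_rm_loss(chosen_reward, rejected_reward):
    """Return -log(sigmoid(chosen - rejected)) in a numerically stable form."""
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(x)) == log(1 + exp(-x)); branch so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss shrinks as the chosen response is scored further above the rejected one, and grows roughly linearly when the ordering is wrong.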

New Models:

  1. molmo series
  2. mplug-owl3 1b/2b
  3. llama3.1-nemotron-70b-instruct
  4. deepseek-janus

What's Changed

Full Changelog: v2.5.0...v2.5.1

v2.5.0

10 Oct 02:21

English Version

New Features:

  1. Support for GPTQ & AWQ quantization of multimodal LLMs.
  2. Support for dynamically enabling gradient checkpointing on the ViT part to reduce GPU memory consumption.
  3. Support for multimodal model pre-training.

New Models:

  1. llama3.2, llama3.2-vision series
  2. got-ocr2
  3. llama3.1-omni
  4. ovis1.6-gemma2
  5. pixtral-12b
  6. telechat2-115b
  7. mistral-small-inst-2409

New Datasets:

  1. egoschema

What's Changed

Full Changelog: v2.4.2...v2.5.0

v2.4.2

18 Sep 16:56

English Version

New Features:

  1. RLHF has been refactored: it supports all integrated multimodal models, is compatible with DeepSpeed ZeRO-2/ZeRO-3, and supports lazy_tokenize.
  2. With infer_backend vllm, inference and deployment of multimodal large models support multiple images.
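
The idea behind lazy tokenization is to tokenize each sample when it is fetched rather than preprocessing the whole dataset up front, which matters most for large multimodal samples. The following is a minimal sketch of that pattern, assuming this reading of lazy_tokenize; the class, the whitespace tokenizer, and the call counter are hypothetical and are not ms-swift code.

```python
class LazyTokenizedDataset:
    """Tokenize a sample only when it is indexed, not at construction time."""

    def __init__(self, texts, tokenize):
        self.texts = texts
        self.tokenize = tokenize
        self.calls = 0  # counts tokenizer invocations, for illustration only

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, index):
        self.calls += 1
        return self.tokenize(self.texts[index])


def whitespace_tokenize(text):
    """Stand-in tokenizer (hypothetical): split on whitespace."""
    return text.split()


dataset = LazyTokenizedDataset(["a b c", "d e"], whitespace_tokenize)
calls_before = dataset.calls  # nothing has been tokenized yet
first = dataset[0]            # tokenization happens here
calls_after = dataset.calls
```

The trade-off is that tokenization cost is paid during training on every access instead of once before training starts.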

New Models:

  1. Qwen2.5 series, Qwen2-vl-72b series (base/instruct/gptq-int4/gptq-int8/awq)
  2. Qwen2.5-math, Qwen2.5-coder series (base/instruct)
  3. Deepseek-v2.5

New Datasets:

  1. longwriter-6k-filtered

What's Changed

Full Changelog: v2.4.1...v2.4.2

v2.4.1

13 Sep 05:03

English Version

New Features:

  1. Inference and deployment support for logprobs.
  2. RLHF support for lazy_tokenize.
  3. Multimodal model support for neftune.
  4. dynamic_eos compatibility with glm4 series and other models.
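
As background for the logprobs feature, a token's log-probability is the log-softmax of the model's logits at that decoding step. The sketch below computes it in plain Python and picks the top-k tokens. The vocabulary, logits, and function names are hypothetical, and the snippet does not show the ms-swift request or response format.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities with the max-subtraction trick."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def top_logprobs(logits, vocab, k):
    """Return the k most likely tokens with their log-probabilities."""
    scored = sorted(zip(vocab, log_softmax(logits)), key=lambda p: p[1], reverse=True)
    return scored[:k]


# Hypothetical three-token vocabulary and logits for one decoding step.
vocab = ["yes", "no", "maybe"]
logits = [2.0, 1.0, 0.0]
top2 = top_logprobs(logits, vocab, k=2)
```

Exponentiating the returned values recovers ordinary probabilities, which sum to 1 over the full vocabulary.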

New Models:

  1. mplug-owl3, best practices can be found here.
  2. yi-coder 1.5b and 9b, base/chat models.
  3. minicpm3-4b.
  4. reflection-llama3.1-70b.

What's Changed

Full Changelog: v2.4.0...v2.4.1

v2.4.0

13 Sep 04:50

English Version

New Features:

  1. Support for training models such as LLaMA, Qwen, and Mistral with Liger, which reduces memory usage by 10% to 60%.
  2. Support for custom loss function training using a registration mechanism.
  3. Training now supports pushing models to ModelScope and HuggingFace.
  4. Support for the freeze_vit parameter to control the behavior of full parameter training for multimodal models.
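
A registration mechanism for custom losses usually means a name-to-function registry that the trainer consults by key. The sketch below shows the general pattern in plain Python. `LOSS_REGISTRY`, `register_loss`, and `get_loss` are hypothetical names chosen for illustration; they are not the ms-swift API, which the release note does not show.

```python
LOSS_REGISTRY = {}


def register_loss(name):
    """Decorator that stores a loss function under a string key."""
    def decorator(func):
        if name in LOSS_REGISTRY:
            raise ValueError(f"loss '{name}' is already registered")
        LOSS_REGISTRY[name] = func
        return func
    return decorator


@register_loss("mse")
def mse_loss(predictions, targets):
    """Mean squared error over two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def get_loss(name):
    """Look up a registered loss by the name given in a config."""
    return LOSS_REGISTRY[name]


loss_fn = get_loss("mse")
value = loss_fn([1.0, 2.0], [1.0, 4.0])
```

The benefit of this design is that a new loss can be selected by name from a command line or config file without modifying the training loop.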

New Models:

  1. Qwen2-VL series, including GPTQ/AWQ quantized models. For best practices, see here.
  2. InternVL2 AWQ quantized models.

New Datasets:

  1. qwen2-pro series

What's Changed

Full Changelog: v2.3.2...v2.4.0