Releases: ModelCloud/GPTQModel
GPTQModel v1.2.0
Note:
v1.2.0 was released with an incorrect version value of 1.2.1-dev.
We are re-releasing v1.2.0 correctly as v1.2.1.
GPTQModel v1.1.0
What's Changed
Added IBM Granite model support. Enabled fully automatic, buildless wheel install from PyPI. Reduced peak CPU memory usage by more than 20% during quantization. Reached 100% CI model/feature coverage. Updated HF integration support for the latest transformers.
Fully deprecated: liger-kernel support and the exllama v1 quant kernel.
- Fix deprecated by @CSY-ModelCloud in #447
- [COMPAT] [FIX] vllm params by @ZYC-ModelCloud in #448
- add estimate-vram by @PZS-ModelCloud in #452
- add field uri by @ZYC-ModelCloud in #449
- auto infer model base name from model files by @ZYC-ModelCloud in #451
- remove exllama v1 by @PZS-ModelCloud in #453
- [SECURITY] drop support of loading unsafe .bin weights by @ZYC-ModelCloud in #460
- [MODEL] add granite support by @LRL-ModelCloud in #466
- Split base.py file by @ZYC-ModelCloud in #465
- Move save_quantized function into saver.py by @ZYC-ModelCloud in #467
- remove deprecated exllama v1 code by @Qubitium in #473
- [MISC] move model def file to model_def folder by @PZS-ModelCloud in #479
- [FIX] Fix unit test by @PZS-ModelCloud in #480
- Download whl in setup.py by @CSY-ModelCloud in #481
- [Fix] cpu memory leak by @ZX-ModelCloud in #485
- [CI] set ninja threads to 4 by @CSY-ModelCloud in #487
- [FIX] sharded model loading error by @ZX-ModelCloud in #490
- add internlm test by @PZS-ModelCloud in #491
- remove needless function by @ZYC-ModelCloud in #494
- Fix unit test by @ZYC-ModelCloud in #495
- [FIX] fix test_integration by @PZS-ModelCloud in #497
- [Test] add codegen and xverse test by @PZS-ModelCloud in #496
Full Changelog: v1.0.9...v1.1.0
GPTQModel v1.0.9
What's Changed
Fixed HF integration to work with the latest transformers. Moved AutoRound to an optional dependency. Updated flaky CI tests.
- [FIX] mark auto_round extras_require by @LRL-ModelCloud in #430
- [BUILD] update compile flags by @Qubitium in #428
- [FIX] failed test_transformers_integration.py by @ZX-ModelCloud in #435
Full Changelog: v1.0.8...v1.0.9
GPTQModel v1.0.8
What's Changed
Moved QBits to an optional dependency. Added Python 3.12 wheels and fixed wheel generation for CUDA 11.8.
- [PKG] update vllm/sglang optional depends by @PZS-ModelCloud in #423
- [FIX] autoround depend causing torch-cpu to be installed by @Qubitium in #422
Full Changelog: v1.0.7...v1.0.8
GPTQModel v1.0.7
What's Changed
Fixed the Marlin (faster) kernel not being auto-selected for some models, and AutoRound
quantization saves throwing JSON errors.
- [FIX] marlin_inference_linear not correctly auto selected for eligible models by @ZX-ModelCloud in #413
- [FIX] remove "scale" and "zp" Tensor from layer_config by @ZX-ModelCloud in #414
- [FIX] Failed unit test by @ZX-ModelCloud in #420
Full Changelog: v1.0.6...v1.0.7
GPTQModel v1.0.6
What's Changed
Patch release to fix loading of quantized Llama 3.2 Vision models.
- [FIX] mllama loader by @LRL-ModelCloud in #404
Full Changelog: v1.0.5...v1.0.6
GPTQModel v1.0.5
What's Changed
Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 allows quantization of the text layers (the layers responsible for text generation) only; vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model quantizes to ~50% of its original size in 4-bit mode. Once vision-layer support is added, the size will drop to the expected ~1/4.
- [MODEL] Add Llama 3.2 Vision (mllama)* support by @LRL-ModelCloud in #401
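The size figures above follow from simple arithmetic, sketched below. The assumption that text layers hold roughly 2/3 of the weights is illustrative only (it makes the ~50% figure come out exactly), not a measured split, and per-group scale/zero-point overhead is ignored:

```python
def quantized_size_fraction(quantized_frac, bits=4, baseline_bits=16):
    """Estimated model size as a fraction of the unquantized (16-bit) size,
    when `quantized_frac` of the weights are stored at `bits` bits each.
    Ignores per-group scale/zero-point overhead."""
    ratio = bits / baseline_bits  # 4-bit weights take 1/4 the bytes of 16-bit
    return 1 - quantized_frac * (1 - ratio)

# Quantizing all layers in 4-bit -> the expected ~1/4 of original size.
print(quantized_size_fraction(1.0))    # 0.25
# If text layers are ~2/3 of the weights, text-only quantization -> ~50%.
print(quantized_size_fraction(2 / 3))  # ~0.5
```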
Full Changelog: v1.0.4...v1.0.5
GPTQModel v1.0.4
What's Changed
Added Liger Kernel support for a ~50% VRAM reduction during the quantization stage for some models. Added a toggle to disable parallel packing to avoid OOM on larger models. Updated the transformers dependency to 4.45.0 for Llama 3.2 support.
- [FEATURE] add a parallel_packing toggle by @LRL-ModelCloud in #393
- [FEATURE] add liger_kernel support by @LRL-ModelCloud in #394
Full Changelog: v1.0.3...v1.0.4
GPTQModel v1.0.3
What's Changed
- [MODEL] Add minicpm3 by @LDLINGLINGLING in #385
- [FIX] fix minicpm3 support by @LRL-ModelCloud in #387
- [MODEL] Added GRIN-MoE support by @LRL-ModelCloud in #388
New Contributors
- @LDLINGLINGLING made their first contribution in #385
- @mrT23 made their first contribution in #386
Full Changelog: v1.0.2...v1.0.3
GPTQModel v1.0.2
What's Changed
Upgraded the AutoRound package to v0.3.0. Pre-built WHL and PyPI source releases are now available. Install by downloading our pre-built WHL or via `pip install gptqmodel --no-build-isolation`.
- [CORE] Autoround v0.3 by @LRL-ModelCloud in #368
- [CI] Lots of CI fixups by @CSY-ModelCloud
Full Changelog: v1.0.0...v1.0.2