v0.22.0.dev0
Pre-release
What's Changed
- Fix Roformer export symbol by @abheesht17 in #2199
- Bump up master version to 0.21 by @abheesht17 in #2204
- reenable test by @mattdangerw in #2188
- Add xception model by @mattdangerw in #2179
- Make image converter built by @mattdangerw in #2206
- Qwen - Fix Preset Loader + Add Causal LM Test by @kanpuriyanawab in #2193
- Update Qwen conversion script by @laxmareddyp in #2207
- Revert "Do not export Qwen for release" by @sachinprasadhs in #2208
- Fixes compute_output_shape for PaliGemmaVitEncoder and Gemma3VisionEncoderBlock by @JyotinderSingh in #2210
- Python 3.12 fix by @mattdangerw in #2211
- Small Gemma3 doc-string edits by @abheesht17 in #2214
- Llama3.1 by @pctablet505 in #2132
- Update gemma3_causal_lm_preprocessor.py by @pctablet505 in #2217
- fix: apply `weights_only=True` by @b8zhong in #2215
- Fix the keras_hub package for typecheckers and IDEs by @mattdangerw in #2222
- Add utility to map COCO IDs to class names by @mattdangerw in #2219
- Set GPU timeouts to 2 hours by @mattdangerw in #2226
- Fix nightly by @mattdangerw in #2227
- Another fix for nightly builds by @mattdangerw in #2229
- Cast a few more inputs to tensors in SD3 by @mattdangerw in #2234
- Fix up package build scripts again by @mattdangerw in #2230
- Add qwen presets by @laxmareddyp in #2241
- Script for converting RetinaNet weights from torchvision by @sineeli in #2233
- Sharded weights support by @james77777778 in #2218
- Add Qwen Moe by @kanpuriyanawab in #2163
- Add Mixtral by @kanpuriyanawab in #2196
- Made label data optional for inference and adopted other required changes by @laxmareddyp in #2183
- Fix the layer names by @kanpuriyanawab in #2247
- Add new CSPNet preset and add manual padding. by @sachinprasadhs in #2212
- Update the int8 quant logic in `ReversibleEmbedding` by @james77777778 in #2250
- Add Moonshine to KerasHub by @harshaljanjani in #2093
- Add Kaggle handle for moonshine presets by @laxmareddyp in #2253
- Update requirements-jax-cuda.txt by @pctablet505 in #2252
- Add Mixtral, Qwen-MoE presets and update conversion script by @laxmareddyp in #2248
- fix flash attention test by @divyashreepathihalli in #2263
- Fix JAX bugs for qwen moe & mixtral by @kanpuriyanawab in #2258
- Create pull_request_template.md by @sachinprasadhs in #2262
- Update preset versions for sharded models by @laxmareddyp in #2264
- Add AudioToText and AudioToTextPreprocessor class stubs to enable auto class functionality by @harshaljanjani in #2265
- register moonshine presets by @sachinprasadhs in #2267
- register presets by @sachinprasadhs in #2268
- Fix batch preprocessing bug in Moonshine generation by @harshaljanjani in #2266
- fix get_lora_target_names function by @divyashreepathihalli in #2167
- Implement left padding by @pass-lin in #2242
- Make ViT compatible with non-square images by @sineeli in #2255
- Bump up master version to 0.22.0.dev0 by @laxmareddyp in #2277
- Fix keras-io integration test by @laxmareddyp in #2280
- Add Qwen3 by @kanpuriyanawab in #2249
- Add DeiT Model by @Sohaib-Ahmed21 in #2203
- [HOTFIX] Add Docstring for QwenCausalLM by @kanpuriyanawab in #2279
- Fix: Correct coverage tracking for keras_hub by @sachinprasadhs in #2283
- Update the sharded version number for Llama3 variants by @laxmareddyp in #2294
- Support None for max_shard_size by @laxmareddyp in #2261
- Sharded weights type error by @laxmareddyp in #2296
- Fix PaliGemmaCausalLM example. by @hertschuh in #2302
- Routine HF sync by @divyashreepathihalli in #2303
- Incorrect condition on sliding_window_size by @laxmareddyp in #2289
- Bump the python group with 2 updates by @dependabot[bot] in #2282
- Modify TransformerEncoder masking documentation by @sonali-kumari1 in #2297
- Fix Gemma3InterleaveEmbeddings JAX inference error by ensuring indices are int32 by @pctablet505 in #2305
- Update preset versions for Mixtral, Qwen-MoE and Mistral models by @laxmareddyp in #2307
- Fix Mistral conversion script by @laxmareddyp in #2306
- Bump the python group with 6 updates by @dependabot[bot] in #2317
- Qwen3 causal lm by @kanpuriyanawab in #2311
- Fix JAX GPU tests by @sachinprasadhs in #2319
- support flash-attn at torch backend by @pass-lin in #2257
- Add HGNetV2 to KerasHub by @harshaljanjani in #2293
- Qwen3 presets register by @laxmareddyp in #2325
- Register HGNetV2 presets by @laxmareddyp in #2326
- Safetensors conversion by @Bond099 in #2290
- Add DINOV2. by @james77777778 in #2328
- Refactor `CLIP` and update SD3. by @james77777778 in #2316
- Add DINOv2 preset details by @sachinprasadhs in #2336
- Fix dtype issues on JAX CPU in SD3 tests. by @james77777778 in #2338
- Revert "Fix dtype issues of JAX CPU in SD3. (#2338)" by @divyashreepathihalli in #2344
- Resolve preset comparison bug in glue load model method by @emmanuel-ferdman in #2345
- Remove unnecessary call to `torch.no_grad()` by @JyotinderSingh in #2353
- Add ESM by @pass-lin in #2244
- Fix float16 issue in SD3 when using JAX CPU. by @james77777778 in #2354
- update python to 3.10 and Keras minimum version to 3.8 by @sachinprasadhs in #2292
- register DeiT presets by @sachinprasadhs in #2348
- Fix path for presets to link it to API docs in keras.io by @sachinprasadhs in #2357
- Fix for llama3.1 instruct models by @pctablet505 in #2355
- Add & register ESM presets by @sachinprasadhs in #2356
- Add Gemma 3 conversion script by @abheesht17 in #2358
- Remove exact matching of outputs from Gemma 3 conversion notebook by @abheesht17 in #2359
New Contributors
- @JyotinderSingh made their first contribution in #2210
- @pctablet505 made their first contribution in #2132
- @b8zhong made their first contribution in #2215
- @harshaljanjani made their first contribution in #2093
- @Sohaib-Ahmed21 made their first contribution in #2203
- @sonali-kumari1 made their first contribution in #2297
- @Bond099 made their first contribution in #2290
- @emmanuel-ferdman made their first contribution in #2345
Full Changelog: v0.20.0.dev0...v0.22.0.dev0