Releases: magda-io/magda-embedding-api
v1.1.0
What's New
- Rename EmbeddingGenerator to EmbeddingEncoder
- Fixed `serverOptions` not being passed through properly in test cases
- Upgrade to @huggingface/transformers v3.2.4
- Upgrade onnxruntime-node v1.20.1
- Avoid including unused models in docker images (smaller image size)
- Increase probe timeout seconds
- Use worker pool
- Process sentence list with separate model runs
- Set default `workerTaskTimeout` to 60 seconds
- Use the quantized (q8) version of the default model
- Set default `limits.memory` to 850M
- Set default number of replicas to 2
- Add `max_length` option to the model config (configurable via Helm)
- Set `max_length` of the default model to 1024 due to excessive memory usage on text longer than 2048 tokens (the default model supports up to 8192)
- Only apply padding when multiple inputs are received for encoding
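The new defaults above are set through the Helm chart. A sketch of overriding them in a `values.yaml` follows; the key names here are assumptions inferred from the notes, not confirmed chart keys — check the chart's own `values.yaml` for the real structure:

```yaml
# Hypothetical values.yaml overrides for magda-embedding-api v1.1.0.
# Key names are illustrative; verify against the chart's values.yaml.
replicas: 2                # release default
resources:
  limits:
    memory: "850M"         # release default; raise if pods are OOM-killed
appConfig:
  workerTaskTimeout: 60    # seconds; per-task timeout in the worker pool
  model:
    quantized: true        # the q8 quantized model is now the default
    max_length: 1024       # caps input length to limit memory usage
                           # (the default model supports up to 8192)
```

Values such as `replicas` and `limits.memory` are standard Kubernetes-facing settings; the `appConfig` block is purely illustrative.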
Full Changelog: v1.0.0...v1.1.0
v1.1.0-alpha.3
What's New
- Rename EmbeddingGenerator to EmbeddingEncoder
- Fixed `serverOptions` not being passed through properly in test cases
- Upgrade to @huggingface/transformers v3.2.4
- Upgrade onnxruntime-node v1.20.1
- Avoid including unused models in docker images (smaller image size)
- Increase probe timeout seconds
- Use worker pool
- Process sentence list with separate model runs
- Set default `workerTaskTimeout` to 60 seconds
- Use the quantized (q8) version of the default model
- Set default `limits.memory` to 850M
- Set default number of replicas to 2
- Add `max_length` option to the model config (configurable via Helm)
- Set `max_length` of the default model to 1024 due to excessive memory usage on text longer than 2048 tokens (the default model supports up to 8192)
- Only apply padding when multiple inputs are received for encoding
Full Changelog: v1.0.0...v1.1.0-alpha.2
v1.1.0-alpha.2
What's New
- Rename EmbeddingGenerator to EmbeddingEncoder
- Fixed `serverOptions` not being passed through properly in test cases
- Upgrade to @huggingface/transformers v3.2.4
- Upgrade onnxruntime-node v1.20.1
- Avoid including unused models in docker images (smaller image size)
- Increase probe timeout seconds
- Use worker pool
- Process sentence list with separate model runs
- Set default `workerTaskTimeout` to 60 seconds
- Set default `limits.memory` to 2000M
- Set default number of replicas to 2
Full Changelog: v1.0.0...v1.1.0-alpha.2
v1.1.0-alpha.1
What's New
- Rename EmbeddingGenerator to EmbeddingEncoder
- Fixed `serverOptions` not being passed through properly in test cases
- Upgrade to @huggingface/transformers v3.2.4
- Upgrade onnxruntime-node v1.20.1
- Avoid including unused models in docker images (smaller image size)
- Increase probe timeout seconds
- Use worker pool
- Process sentence list with separate model runs
- Set default `workerTaskTimeout` to 60 seconds
- Set default `limits.memory` to 1100M
- Set default number of replicas to 2
Full Changelog: v1.0.0...v1.1.0-alpha.1
v1.1.0-alpha.0
What's New
- Rename EmbeddingGenerator to EmbeddingEncoder
- Use the non-quantized default model by default for better embedding performance, at the cost of more memory
- Fixed `serverOptions` not being passed through properly in test cases
- Upgrade to @huggingface/transformers v3.2.4
- Upgrade onnxruntime-node v1.20.1
- Avoid including unused models in docker images
- Increase probe timeout seconds
- Use worker pool
- Process sentence list with separate model runs
Full Changelog: v1.0.0...v1.1.0-alpha.0
v1.0.1
v1.0.1-alpha.0
v1.0.0
What's New
- Initial production release
Full Changelog: https://github.com/magda-io/magda-embedding-api/commits/v1.0.0
v1.0.0-alpha.2
What's New
- Bug fixes
- Add support for specifying a different model
Full Changelog: v1.0.0-alpha.1...v1.0.0-alpha.2
v1.0.0-alpha.1
What's New
- Add more helm chart config fields
- Re-install prebuilt binaries for `onnxruntime-node` and `sharp` during docker build
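The rebuild step above can be sketched as a Dockerfile fragment. This is illustrative only — the project's actual build stages, base image, and entrypoint are not shown in these notes:

```dockerfile
# Hypothetical fragment: rebuild native modules inside the image so their
# prebuilt binaries match the container platform, not the build host.
FROM node:18-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci
# Re-run install scripts for the native modules, fetching prebuilt
# binaries for the image's platform.
RUN npm rebuild onnxruntime-node sharp
COPY . .
CMD ["node", "dist/index.js"]
```

`npm rebuild` re-runs each package's install scripts, which for `sharp` and `onnxruntime-node` re-resolves the platform-specific prebuilt binary.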
Full Changelog: v1.0.0-alpha.0...v1.0.0-alpha.1