Commit 5a4c5a8

committed
v1.0.0-alpha.1
1 parent 78bd83a commit 5a4c5a8

3 files changed: 12 additions and 13 deletions


README.md

Lines changed: 9 additions & 10 deletions
@@ -1,6 +1,6 @@
 # magda-csv-semantic-indexer
 
-![Version: 1.0.0-alpha.0](https://img.shields.io/badge/Version-1.0.0--alpha.0-informational?style=flat-square)
+![Version: 1.0.0-alpha.1](https://img.shields.io/badge/Version-1.0.0--alpha.1-informational?style=flat-square)
 
 A Helm chart for Magda CSV Semantic Indexer
 
@@ -23,6 +23,9 @@ Kubernetes: `>= 1.14.0-0`
 | Key | Type | Default | Description |
 |-----|------|---------|-------------|
 | defaultAdminUserId | string | `"00000000-0000-4000-8000-000000000000"` | |
+| defaultImage.imagePullSecret | bool | `false` | |
+| defaultImage.pullPolicy | string | `"IfNotPresent"` | |
+| defaultImage.repository | string | `"ghcr.io/magda-io"` | |
 | defaultSemanticIndexerConfig.bulkEmbeddingsSize | int | `1` | |
 | defaultSemanticIndexerConfig.bulkIndexSize | int | `50` | |
 | defaultSemanticIndexerConfig.chunkSizeLimit | int | `512` | |
@@ -33,18 +36,14 @@ Kubernetes: `>= 1.14.0-0`
 | defaultSemanticIndexerConfig.overlap | int | `50` | |
 | defaultSemanticIndexerConfig.overlap | int | `50` | |
 | embeddingApiURL | string | `"http://magda-embedding-api"` | |
-| global | object | `{"image":{},"rollingUpdate":{},"searchEngine":{"defaultDatasetBucket":"magda-datasets","semanticIndexer":{"indexName":null,"indexVersion":null,"knnVectorFieldConfig":{"compressionLevel":null,"dimension":768,"efConstruction":100,"efSearch":100,"encoder":{"clip":false,"name":"sq","type":"fp16"},"m":16,"mode":"in_memory","spaceType":"l2"},"numberOfReplicas":0,"numberOfShards":1}}}` | only for providing appropriate default value for helm lint |
-| global.searchEngine.semanticIndexer.knnVectorFieldConfig.compressionLevel | string | `nil` | The compression_level mapping parameter selects a quantization encoder that reduces vector memory consumption by the given factor. |
+| global | object | `{"image":{},"rollingUpdate":{},"searchEngine":{"defaultDatasetBucket":"magda-datasets","semanticIndexer":{"indexName":null,"indexVersion":null,"knnVectorFieldConfig":{"compressionLevel":"32x","dimension":768,"efConstruction":100,"efSearch":100,"m":16,"mode":"on_disk","spaceType":"l2"},"numberOfReplicas":0,"numberOfShards":1}}}` | only for providing appropriate default value for helm lint |
+| global.searchEngine.semanticIndexer.knnVectorFieldConfig.compressionLevel | string | `"32x"` | The compression_level mapping parameter selects a quantization encoder that reduces vector memory consumption by the given factor. |
 | global.searchEngine.semanticIndexer.knnVectorFieldConfig.dimension | int | `768` | Dimension of the embedding vectors. |
 | global.searchEngine.semanticIndexer.knnVectorFieldConfig.efConstruction | int | `100` | Similar to efSearch but used during index construction. Higher values improve search quality but increase index build time. |
 | global.searchEngine.semanticIndexer.knnVectorFieldConfig.efSearch | int | `100` | The size of the candidate queue during search. Larger values may improve search quality but increase search latency. |
-| global.searchEngine.semanticIndexer.knnVectorFieldConfig.encoder | object | `{"clip":false,"name":"sq","type":"fp16"}` | FAISS Encoder configuration (If compressionLevel is set, encoder will be ignored). |
 | global.searchEngine.semanticIndexer.knnVectorFieldConfig.m | int | `16` | The maximum number of graph edges per vector. Higher values increase memory usage but may improve search quality. |
-| global.searchEngine.semanticIndexer.knnVectorFieldConfig.mode | string | `"in_memory"` | Vector workload mode: `on_disk` or `in_memory`. |
-| image.name | string | `"data61/magda-csv-semantic-indexer"` | |
-| image.pullPolicy | string | `"IfNotPresent"` | |
-| image.repository | string | `"localhost:5000"` | |
-| image.tag | string | `"latest"` | |
+| global.searchEngine.semanticIndexer.knnVectorFieldConfig.mode | string | `"on_disk"` | Vector workload mode: `on_disk` or `in_memory`. |
+| image.name | string | `"magda-csv-semantic-indexer"` | |
 | minioConfig.defaultDatasetBucket | string | `""` | |
 | minioConfig.endPoint | string | `"magda-minio"` | |
 | minioConfig.port | int | `9000` | |
@@ -58,7 +57,7 @@ Kubernetes: `>= 1.14.0-0`
 | semanticIndexer.bulkEmbeddingsSize | int | `nil` | number of string we request embedding api to process in one request |
 | semanticIndexer.bulkIndexSize | int | `nil` | Number of documents we send to OpenSearch for bulk processing in a single request |
 | semanticIndexer.chunkSizeLimit | int | `nil` | The maximum number of tokens in a single chunk. |
-| semanticIndexer.id | string | `"csv-semantic-indexer-5"` | Semantic indexer ID |
+| semanticIndexer.id | string | `""` | Semantic indexer ID |
 | semanticIndexer.indexName | string | `nil` | index name |
 | semanticIndexer.indexVersion | int | `nil` | index version |
 | semanticIndexer.overlap | int | `nil` | The number of overlapping tokens between chunks. |
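
Note on the README changes above: several defaults moved in this commit (image settings now live under `defaultImage`, `semanticIndexer.id` defaults to an empty string, and the k-NN vector field defaults to `on_disk` with `32x` compression). A values override that pins these explicitly might look like the sketch below. The file name is hypothetical, the keys are taken from the values table in the diff, and the `semanticIndexer.id` value simply reuses the previous default as an example. The table describes the `global` block as existing only to give `helm lint` sensible defaults, so in a real deployment those keys are presumably supplied by the parent chart.

```yaml
# values-override.yaml (hypothetical file name; keys copied from the README values table)
defaultImage:
  repository: ghcr.io/magda-io
  pullPolicy: IfNotPresent
  imagePullSecret: false
image:
  name: magda-csv-semantic-indexer
semanticIndexer:
  # The new default is ""; this is the previous default, used here only as an example.
  id: csv-semantic-indexer-5
global:
  searchEngine:
    semanticIndexer:
      knnVectorFieldConfig:
        # New defaults introduced by this commit, stated explicitly.
        mode: on_disk
        compressionLevel: "32x"
```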

deploy/magda-csv-semantic-indexer/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 apiVersion: v2
 name: magda-csv-semantic-indexer
-version: 1.0.0-alpha.0
+version: 1.0.0-alpha.1
 kubeVersion: ">= 1.14.0-0"
 description: A Helm chart for Magda CSV Semantic Indexer
 home: "https://github.com/magda-io/magda-csv-semantic-indexer"

package.json

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 {
   "name": "@magda/magda-csv-semantic-indexer",
-  "version": "1.0.0-alpha.0",
+  "version": "1.0.0-alpha.1",
   "description": "Magda CSV Semantic Indexer",
   "type": "module",
   "files": [
@@ -34,8 +34,8 @@
     "@types/mocha": "^10.0.10",
     "@types/nock": "^11.1.0",
     "@types/node": "^20.19.0",
-    "@types/yargs": "^17.0.33",
     "@types/sinon": "^17.0.4",
+    "@types/yargs": "^17.0.33",
     "chai": "^5.2.0",
     "husky": "^9.1.7",
     "mocha": "^11.7.1",
