@@ -11,36 +11,14 @@ the indexing algorithm runs searches under the hood to create the vector index
 structures. So these same recommendations also help with indexing speed.
 
 [discrete]
-=== Ensure data nodes have enough memory
-
-{es} uses the https://arxiv.org/abs/1603.09320[HNSW] algorithm for approximate
-kNN search. HNSW is a graph-based algorithm which only works efficiently when
-most vector data is held in memory. You should ensure that data nodes have at
-least enough RAM to hold the vector data and index structures. To check the
-size of the vector data, you can use the <<indices-disk-usage>> API. As a
-loose rule of thumb, and assuming the default HNSW options, the bytes used will
-be `num_vectors * 4 * (num_dimensions + 12)`. When using the `byte` <<dense-vector-element-type,`element_type`>>
-the space required will be closer to `num_vectors * (num_dimensions + 12)`. Note that
-the required RAM is for the filesystem cache, which is separate from the Java
-heap.
-
-The data nodes should also leave a buffer for other ways that RAM is needed.
-For example your index might also include text fields and numerics, which also
-benefit from using filesystem cache. It's recommended to run benchmarks with
-your specific dataset to ensure there's a sufficient amount of memory to give
-good search performance.
-You can find https://elasticsearch-benchmarks.elastic.co/#tracks/so_vector[here]
-and https://elasticsearch-benchmarks.elastic.co/#tracks/dense_vector[here] some examples
-of datasets and configurations that we use for our nightly benchmarks.
-
-[discrete]
-include::search-speed.asciidoc[tag=warm-fs-cache]
-
-The following file extensions are used for the approximate kNN search:
+=== Reduce vector memory foot-print
 
-* `vec` and `veq` for vector values
-* `vex` for HNSW graph
-* `vem`, `vemf`, and `vemq` for metadata
+The default <<dense-vector-element-type,`element_type`>> is `float`. But this
+can be automatically quantized during index time through
+<<dense-vector-quantization,`quantization`>>. Quantization will reduce the
+required memory by 4x, but it will also reduce the precision of the vectors. For
+`float` vectors with `dim` greater than or equal to `384`, using a
+<<dense-vector-quantization,`quantized`>> index is highly recommended.
 
 [discrete]
 === Reduce vector dimensionality
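The 4x figure in the added paragraph follows from storing each dimension as one signed byte instead of a 32-bit float. A minimal scalar-quantization sketch of that idea, in plain Python (illustrative only; Lucene's int8 quantization is more refined than this min/max linear scheme):

```python
import random
import struct

def scalar_quantize(vector: list[float]) -> tuple[bytes, float, float]:
    """Quantize a float vector to one byte per dimension using a simple
    min/max linear scheme. Illustrative only, not the Lucene algorithm."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255.0 or 1.0  # avoid div-by-zero for constant vectors
    # each component maps to an integer in [0, 255]
    q = bytes(round((v - lo) / scale) for v in vector)
    return q, lo, scale

vec = [random.random() for _ in range(384)]
q, lo, scale = scalar_quantize(vec)

# float32 storage is 4 bytes per dimension; quantized storage is 1 byte
float_bytes = len(struct.pack(f"{len(vec)}f", *vec))
print(float_bytes // len(q))  # → 4

# dequantization recovers each value to within one quantization step
dequant = [b * scale + lo for b in q]
```
The precision cost the paragraph mentions is visible here: after dequantization each component is only accurate to within `scale`, which is why measuring the impact on search relevance is still recommended.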
@@ -54,14 +32,6 @@ reduction techniques like PCA. When experimenting with different approaches,
 it's important to measure the impact on relevance to ensure the search
 quality is still acceptable.
 
-[discrete]
-=== Reduce vector memory foot-print
-
-The default <<dense-vector-element-type,`element_type`>> is `float`. But this can be
-automatically quantized during index time through <<dense-vector-quantization,`quantization`>>. Quantization will
-reduce the required memory by 4x, but it will also reduce the precision of the vectors. For `float` vectors with
-`dim` greater than or equal to `384`, using a <<dense-vector-quantization,`quantized`>> index is highly recommended.
-
 [discrete]
 === Exclude vector fields from `_source`
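The `_source` exclusion this hunk's context refers to is configured in the index mapping. A minimal sketch (index and field names here are hypothetical):

```json
PUT my-index
{
  "mappings": {
    "_source": {
      "excludes": ["my_vector"]
    },
    "properties": {
      "my_vector": { "type": "dense_vector", "dims": 384 }
    }
  }
}
```
With this mapping the vector is still indexed and searchable, but is not stored in `_source`, which avoids duplicating the (often large) vector data on disk.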
@@ -82,6 +52,37 @@ downsides of omitting fields from `_source`.
 Another option is to use <<synthetic-source,synthetic `_source`>> if all
 your index fields support it.
 
+[discrete]
+=== Ensure data nodes have enough memory
+
+{es} uses the https://arxiv.org/abs/1603.09320[HNSW] algorithm for approximate
+kNN search. HNSW is a graph-based algorithm which only works efficiently when
+most vector data is held in memory. You should ensure that data nodes have at
+least enough RAM to hold the vector data and index structures. To check the
+size of the vector data, you can use the <<indices-disk-usage>> API. As a
+loose rule of thumb, and assuming the default HNSW options, the bytes used will
+be `num_vectors * 4 * (num_dimensions + 12)`. When using the `byte` <<dense-vector-element-type,`element_type`>>
+the space required will be closer to `num_vectors * (num_dimensions + 12)`. Note that
+the required RAM is for the filesystem cache, which is separate from the Java
+heap.
+
+The data nodes should also leave a buffer for other ways that RAM is needed.
+For example your index might also include text fields and numerics, which also
+benefit from using filesystem cache. It's recommended to run benchmarks with
+your specific dataset to ensure there's a sufficient amount of memory to give
+good search performance.
+You can find https://elasticsearch-benchmarks.elastic.co/#tracks/so_vector[here]
+and https://elasticsearch-benchmarks.elastic.co/#tracks/dense_vector[here] some examples
+of datasets and configurations that we use for our nightly benchmarks.
+
+[discrete]
+include::search-speed.asciidoc[tag=warm-fs-cache]
+
+The following file extensions are used for the approximate kNN search:
+
+* `vec` and `veq` for vector values
+* `vex` for HNSW graph
+* `vem`, `vemf`, and `vemq` for metadata
 
 [discrete]
 === Reduce the number of index segments
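The rule of thumb quoted in the memory section can be turned into a quick estimator. A sketch (a hypothetical helper, not part of {es}; it assumes the default HNSW options, as the docs do):

```python
def estimate_vector_ram_bytes(num_vectors: int,
                              num_dimensions: int,
                              element_type: str = "float") -> int:
    """Rough off-heap (filesystem cache) RAM needed for vector data,
    per the documented rule of thumb for default HNSW options."""
    if element_type == "float":
        # 4 bytes per dimension, plus ~12 bytes of per-vector overhead
        return num_vectors * 4 * (num_dimensions + 12)
    if element_type == "byte":
        # 1 byte per dimension, same per-vector overhead
        return num_vectors * (num_dimensions + 12)
    raise ValueError(f"unsupported element_type: {element_type}")

# e.g. one million 768-dimension float vectors need roughly 3.12 GB
print(estimate_vector_ram_bytes(1_000_000, 768) / 1e9)  # → 3.12
```
Remember that this estimate covers only the vector data and index structures; as the section notes, the nodes also need filesystem cache headroom for any other fields in the index.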