Add vector search documentation #9135

Open
wants to merge 22 commits into
base: main
Changes from 9 commits
6 changes: 6 additions & 0 deletions _config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,9 @@ collections:
workspace:
permalink: /:collection/:path/
output: true
vector-search:
permalink: /:collection/:path/
output: true

opensearch_collection:
# Define the collections used in the theme
Expand Down Expand Up @@ -173,6 +176,9 @@ opensearch_collection:
search-plugins:
name: Search features
nav_fold: true
vector-search:
name: Vector search
nav_fold: true
ml-commons-plugin:
name: Machine learning
nav_fold: true
Expand Down
14 changes: 7 additions & 7 deletions _field-types/supported-field-types/knn-vector.md
Original file line number Diff line number Diff line change
@@ -1,17 +1,17 @@
---
layout: default
title: k-NN vector
nav_order: 58
nav_order: 20
has_children: false
parent: Supported field types
has_math: true
---

# k-NN vector field type
# k-NN vector
**Introduced 1.0**
{: .label .label-purple }

The [k-NN plugin]({{site.url}}{{site.baseurl}}/search-plugins/knn/index/) introduces a custom data type, the `knn_vector`, that allows users to ingest their k-NN vectors into an OpenSearch index and perform different kinds of k-NN search. The `knn_vector` field is highly configurable and can serve many different k-NN workloads. In general, a `knn_vector` field can be built either by providing a method definition or specifying a model id.
The `knn_vector` data type allows you to ingest vectors into an OpenSearch index and perform different kinds of vector search. The `knn_vector` field is highly configurable and can serve many different vector workloads. In general, a `knn_vector` field can be built either by providing a method definition or specifying a model id.

## Example

Expand Down Expand Up @@ -53,7 +53,7 @@
| `in_memory` (Default) | `nmslib` | Prioritizes low-latency search. This mode uses the `nmslib` engine without any quantization applied. It is configured with the default parameter values for vector search in OpenSearch. |
| `on_disk` | `faiss` | Prioritizes low-cost vector search while maintaining strong recall. By default, the `on_disk` mode uses quantization and rescoring to execute a two-pass approach to retrieve the top neighbors. The `on_disk` mode supports only `float` vector types. |

To create a k-NN index that uses the `on_disk` mode for low-cost search, send the following request:
To create a vector index that uses the `on_disk` mode for low-cost search, send the following request:

```json
PUT test-index
Expand Down Expand Up @@ -130,7 +130,7 @@

## Method definitions

[Method definitions]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#method-definitions) are used when the underlying [approximate k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/) algorithm does not require training. For example, the following `knn_vector` field specifies that *nmslib*'s implementation of *hnsw* should be used for approximate k-NN search. During indexing, *nmslib* will build the corresponding *hnsw* segment files.
[Method definitions]({{site.url}}{{site.baseurl}}/vector-search/creating-vector-index/method/) are used when the underlying [approximate k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/) algorithm does not require training. For example, the following `knn_vector` field specifies that NMSLIB's implementation of HNSW should be used for approximate k-NN search. During indexing, NMSLIB will build the corresponding HNSW segment files.

Check failure on line 133 in _field-types/supported-field-types/knn-vector.md

GitHub Actions / vale

[vale] _field-types/supported-field-types/knn-vector.md#L133

[OpenSearch.Spelling] Error: NMSLIB's. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.
Raw output
{"message": "[OpenSearch.Spelling] Error: NMSLIB's. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_field-types/supported-field-types/knn-vector.md", "range": {"start": {"line": 133, "column": 308}}}, "severity": "ERROR"}

```json
"my_vector": {
Expand All @@ -150,7 +150,7 @@

## Model IDs

Model IDs are used when the underlying Approximate k-NN algorithm requires a training step. As a prerequisite, the model must be created with the [Train API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#train-a-model). The
Model IDs are used when the underlying approximate k-NN algorithm requires a training step. As a prerequisite, the model must be created with the [Train API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#train-a-model). The
model contains the information needed to initialize the native library segment files.

```json
Expand Down Expand Up @@ -180,7 +180,7 @@
When using `byte` vectors, expect some loss of precision in the recall compared to using `float` vectors. Byte vectors are useful in large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
{: .important}

When using `byte` vectors with the `faiss` engine, we recommend using [SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), which helps to significantly reduce search latencies and improve indexing throughput.
When using `byte` vectors with the `faiss` engine, we recommend using [SIMD optimization]({{site.url}}{{site.baseurl}}/vector-search/creating-vector-index/vector-field/#simd-optimization-for-the-faiss-engine), which helps to significantly reduce search latencies and improve indexing throughput.
{: .important}

Introduced in k-NN plugin version 2.9, the optional `data_type` parameter defines the data type of a vector. The default value of this parameter is `float`.
Expand Down
56 changes: 15 additions & 41 deletions _includes/cards.html
Original file line number Diff line number Diff line change
@@ -1,43 +1,17 @@
<div class="card-container-wrapper">
<p class="heading-main">Explore OpenSearch documentation</p>
<div class="card-container">
<div class="card">
<a href="{{site.url}}{{site.baseurl}}/about/" class='card-link'></a>
<p class="heading">OpenSearch and OpenSearch Dashboards</p>
<p class="description">Build your OpenSearch solution using core tooling and visualizations</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="card">
<a href="{{site.url}}/docs/latest/data-prepper/" class='card-link'></a>
<p class="heading">OpenSearch Data Prepper</p>
<p class="description">Filter, mutate, and sample your data for ingestion into OpenSearch</p>
<p class="last-link" >Documentation &#x2192;</p>
</div>

<div class="card">
<a href="{{site.url}}/docs/latest/clients/" class='card-link'></a>
<p class="heading">Clients</p>
<p class="description">Interact with OpenSearch from your application using language APIs</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="card">
<a href="{{site.url}}/docs/latest/benchmark/" class='card-link'></a>
<p class="heading">OpenSearch Benchmark</p>
<p class="description">Measure performance metrics for your OpenSearch cluster</p>
<p class="last-link">Documentation &#x2192;</p>
</div>

<div class="card">
<a href="{{site.url}}/docs/latest/migration-assistant/" class='card-link'></a>
<p class="heading">Migration Assistant</p>
<p class="description">Migrate to OpenSearch</p>
<p class="last-link">Documentation &#x2192;</p>
</div>
<div class="card-container">
{% for card in include.cards %}
<div class="card">
<a href="{{ site.url }}{{ site.baseurl }}{{ card.link }}" class="card-link"></a>
<p class="heading">{{ card.heading }}</p>
{% if card.description %}
<p class="description">{{ card.description }}</p>
{% endif %}
{% if include.documentation_link %}
<p class="last-link">Documentation &#x2192;</p>
{% endif %}
</div>
{% endfor %}
</div>
</div>

</div>
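
A minimal sketch of how a page might call the parameterized `cards.html` include, assuming the card data lives in the page's front matter. The `cards` front matter key, the page title, and the choice of cards are hypothetical. Only the `heading`, `description`, and `link` fields and the `cards` and `documentation_link` parameters come from the template in this diff.

```liquid
---
layout: default
title: Example landing page
# Hypothetical front matter key; any name works as long as the include call matches it.
cards:
  - heading: "Vector search"
    description: "Use vector database capabilities for more relevant search results."
    link: "/vector-search/"
  - heading: "Machine learning"
    link: "/ml-commons-plugin/"
---

{% include cards.html cards=page.cards documentation_link=true %}
```

Because the template wraps `description` in an `if`, the second card would render without a description paragraph. Leaving out `documentation_link` would drop the "Documentation →" line from every card.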


72 changes: 72 additions & 0 deletions _includes/home_cards.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
<div class="home-card-container-wrapper">
<p class="heading-main">OpenSearch and OpenSearch Dashboards</p>
<div class="home-card-container">
<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/about/" class='card-link'></a>
<p class="heading">All documentation</p>
<p class="description">Build your OpenSearch solution using core tooling and visualizations.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/vector-search/" class='card-link'></a>
<p class="heading">Vector search</p>
<p class="description">Use vector database capabilities for more relevant search results.</p>
<p class="last-link" >Documentation &#x2192;</p>
</div>

<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/ml-commons-plugin/" class='card-link'></a>
<p class="heading">Machine learning</p>
<p class="description">Power your applications with machine learning model integration.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/dashboards/" class='card-link'></a>
<p class="heading">OpenSearch Dashboards</p>
<p class="description">Explore and visualize your data using interactive dashboards.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>
</div>

</div>

<div class="home-card-container-wrapper">
<p class="heading-main">Supporting tools</p>
<div class="home-card-container">

<div class="home-card">
<a href="{{site.url}}/docs/latest/data-prepper/" class='card-link'></a>
<p class="heading">Data Prepper</p>
<p class="description">Filter, mutate, and sample your data for ingestion into OpenSearch.</p>
<p class="last-link" >Documentation &#x2192;</p>
</div>

<div class="home-card">
<a href="{{site.url}}/docs/latest/clients/" class='card-link'></a>
<p class="heading">Clients</p>
<p class="description">Interact with OpenSearch from your application using language APIs.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="home-card">
<a href="{{site.url}}/docs/latest/benchmark/" class='card-link'></a>
<p class="heading">OpenSearch Benchmark</p>
<p class="description">Measure performance metrics for your OpenSearch cluster.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>

<div class="home-card">
<a href="{{site.url}}/docs/latest/migration-assistant/" class='card-link'></a>
<p class="heading">Migration Assistant</p>
<p class="description">Migrate to OpenSearch.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>
</div>

</div>

22 changes: 22 additions & 0 deletions _includes/list.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
<div class="numbered-list">
{% if include.list_title %}
<div class="heading">{{ include.list_title }}</div>
{% endif %}
{% assign counter = 0 %}
{% for item in include.list_items %}
{% assign counter = counter | plus: 1 %}
<div class="list-item">
<div class="number-circle">{{ counter }}</div>
<div class="list-content">
<div class="list-heading">
{% if item.link %}
<a href="{{ site.url }}{{ site.baseurl }}{{ item.link }}">{{ item.heading }}</a>
{% else %}
{{ item.heading }}
{% endif %}
</div>
<p class="description">{{ item.description | markdownify }}</p>
</div>
</div>
{% endfor %}
</div>
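
A minimal sketch of how a page might call the new `list.html` include, assuming the list items are defined in the page's front matter. The `steps` key, the page title, and the item text are hypothetical. The `list_title` and `list_items` parameters and the `heading`, `description`, and `link` fields are the ones the template reads.

```liquid
---
layout: default
title: Example getting started page
# Hypothetical front matter key holding the list items.
steps:
  - heading: "Create a vector index"
    description: "Define a `knn_vector` field in the index mapping."
    link: "/vector-search/"
  - heading: "Run a search"
    description: "Query the index."
---

{% include list.html list_title="Getting started" list_items=page.steps %}
```

The template numbers items in the order given, starting at 1. The second item has no `link`, so its heading would render as plain text rather than an anchor. Each `description` passes through `markdownify`, so it can contain Markdown such as inline code.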
2 changes: 1 addition & 1 deletion _ml-commons-plugin/custom-local-models.md
Original file line number Diff line number Diff line change
Expand Up @@ -320,7 +320,7 @@ The response contains the tokens and weights:

## Step 5: Use the model for search

To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
To learn how to use the model for vector search, see [ML-powered search methods]({{site.url}}{{site.baseurl}}/vector-search/ml-powered-search/index/#ml-powered-search-methods).

## Question answering models

Expand Down
2 changes: 1 addition & 1 deletion _ml-commons-plugin/remote-models/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -323,7 +323,7 @@ To learn how to use the model for batch ingestion in order to improve ingestion

## Step 7: Use the model for search

To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
To learn how to use the model for vector search, see [ML-powered search methods]({{site.url}}{{site.baseurl}}/vector-search/ml-powered-search/index/#ml-powered-search-methods).

## Step 8 (Optional): Undeploy the model

Expand Down
81 changes: 77 additions & 4 deletions _sass/_home.scss
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,16 @@

// Card style

.card-container-wrapper {
.home-card-container-wrapper {
@include gradient-open-sky;
margin-bottom: 2rem;
}

.card-container {
.card-container-wrapper {
margin-bottom: 0;
}

.home-card-container {
display: grid;
grid-template-columns: 1fr;
margin: 0 auto;
Expand All @@ -42,11 +47,27 @@
}
}

.card {
.card-container {
display: grid;
grid-template-columns: 1fr;
margin: 0 auto;
padding: 2rem 0;
grid-row-gap: 1rem;
grid-column-gap: 1rem;
grid-auto-rows: 1fr;
@include mq(md) {
grid-template-columns: repeat(1, 1fr);
}
@include mq(lg) {
grid-template-columns: repeat(2, 1fr);
}
}

.home-card {
@extend .panel;
@include thick-edge-left;
padding: 1rem;
margin-bottom: 4rem;
margin-bottom: 2rem;
text-align: left;
background-color: white;
display: flex;
Expand All @@ -67,6 +88,11 @@
}
}

.card {
@extend .home-card;
margin-bottom: 0;
}

@mixin heading-font {
@include heading-sans-serif;
font-size: 1.5rem;
Expand Down Expand Up @@ -110,6 +136,53 @@
width: 100%;
}

// List layout

.numbered-list {
display: flex;
flex-direction: column;
gap: 2rem;
padding: 1rem;
}

.list-item {
display: flex;
align-items: flex-start;
gap: 1rem;
}

.number-circle {
width: 2.5rem;
height: 2.5rem;
border-radius: 50%;
background-color: $blue-lt-100;
color: $blue-dk-300;
display: flex;
align-items: center;
justify-content: center;
font-weight: bold;
font-size: 1.2rem;
flex-shrink: 0;
}

.list-content {
max-width: 100%;
}

.list-heading {
@include heading-font;
margin: 0 0 0.75rem 0;
font-size: 1.2rem;
color: $blue-dk-300;
font-weight: bold;
}

.list-content p {
margin: 0.5rem 0;
font-size: 1rem;
line-height: 1.5;
}

// Banner style

.os-banner {
Expand Down