This repository was archived by the owner on Mar 29, 2025. It is now read-only.

Commit a7e1d06

rderbier and gcxml authored
Raphael/v24 graphql (#669)
Add content of alpha and alpha3 posts in doc for v24. --------- Co-authored-by: Gajanan <[email protected]>
1 parent 4dd9983 commit a7e1d06

7 files changed, +153 -1 lines changed

content/graphql/mutations/mutations-overview.md

Lines changed: 29 additions & 0 deletions
@@ -221,6 +221,35 @@ mutation {
}
```

## Vector Embedding mutations

For types with vector embeddings, Dgraph automatically generates the add mutation. The example below shows a schema followed by an add mutation against it.

```graphql
type User {
  userID: ID!
  name: String!
  name_v: [Float!] @embedding @search(by: ["hnsw(metric: euclidean, exponent: 4)"])
}

mutation {
  addUser(input: [
    { name: "iCreate with a Mini iPad", name_v: [0.12, 0.53, 0.9, 0.11, 0.32] },
    { name: "Resistive Touchscreen", name_v: [0.72, 0.89, 0.54, 0.15, 0.26] },
    { name: "Fitness Band", name_v: [0.56, 0.91, 0.93, 0.71, 0.24] },
    { name: "Smart Ring", name_v: [0.38, 0.62, 0.99, 0.44, 0.25] }])
  {
    user {
      userID
      name
      name_v
    }
  }
}
```

Note: The embeddings are generated outside of Dgraph using any suitable machine learning model.
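
Since the embeddings are produced outside Dgraph, a client would usually compute them first and pass them to the mutation as variables. The Python sketch below only builds the JSON request body and sends nothing. It assumes that the generated input type is named `AddUserInput` and that the payload exposes the new nodes under `user`; `fake_embed` is a hypothetical stand-in for a real embedding model.

```python
import json

# GraphQL document that takes the list of users as a variable.
# Assumption: the generated input type is AddUserInput and the payload
# field for the created nodes is `user`.
ADD_USERS = """
mutation AddUsers($input: [AddUserInput!]!) {
  addUser(input: $input) {
    user {
      userID
      name
      name_v
    }
  }
}
"""


def fake_embed(text, dims=5):
    """Hypothetical placeholder for a real embedding model.

    Returns `dims` deterministic floats in [0, 1) derived from the character
    codes of `text`. The values carry no semantic meaning.
    """
    total = sum(ord(ch) for ch in text)
    return [((total * (i + 1)) % 100) / 100 for i in range(dims)]


def build_request_body(names, embed=fake_embed):
    """Return the JSON body of a GraphQL-over-HTTP POST request."""
    users = [{"name": name, "name_v": embed(name)} for name in names]
    return json.dumps({"query": ADD_USERS, "variables": {"input": users}})


body = build_request_body(["Fitness Band", "Smart Ring"])
print(body)
```

In practice `fake_embed` would be replaced by a call to the chosen model, and the body would be posted to the cluster's GraphQL endpoint with any HTTP client.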

## Examples

You can refer to the following [link](https://github.com/dgraph-io/dgraph/tree/main/graphql/schema/testdata/schemagen) for more examples.

content/graphql/queries/aggregate.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
+++
title = "Aggregate Queries"
description = "Dgraph automatically generates aggregate queries for GraphQL schemas. These are compatible with the @auth directive."
-weight = 3
+weight = 4
[menu.main]
parent = "graphql-queries"
name = "Aggregate Queries"
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+++
title = "Similarity Search"
description = "Dgraph automatically generates GraphQL queries for each vector index that you define in your schema. There are two types of queries generated for each index."
weight = 3
[menu.main]
parent = "graphql-queries"
identifier = "vector-queries"
+++

Dgraph automatically generates two GraphQL similarity queries for each type that has at least one [vector predicate](/graphql/schema/types/#vectors) with the `@search` directive.

For example:

```graphql
type User {
  id: ID!
  name: String!
  name_v: [Float!] @embedding @search(by: ["hnsw(metric: euclidean, exponent: 4)"])
}
```

With the above schema, the auto-generated `querySimilar<Object>ByEmbedding` query lets us run a similarity search using the vector index specified in the schema.

```graphql
querySimilar<Object>ByEmbedding(
  by: vector_predicate,
  topK: n,
  vector: searchVector): [User]
```

For example, to find the top 3 users whose names are most similar to a given name embedding, the following query can be used.

```graphql
querySimilarUserByEmbedding(by: name_v, topK: 3, vector: [0.1, 0.2, 0.3, 0.4, 0.5]) {
  id
  name
  vector_distance
}
```
The results of this query include the 3 closest users, ordered by `vector_distance`. The `vector_distance` is the Euclidean distance between the `name_v` embedding vector and the input vector used in the query.
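
To make the meaning of `vector_distance` and `topK` concrete, here is a minimal Python sketch that ranks vectors by Euclidean distance with an exact brute-force scan. It is only an illustration of the ordering: it does not use Dgraph or its HNSW index, and an HNSW index is an approximate structure, so its results may differ from an exact scan. The sample vectors are the ones from the add mutation example elsewhere in this commit; the helper names `euclidean` and `top_k` are made up for this sketch.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(users, vector, k):
    """Return the k (name, distance) pairs closest to `vector`, nearest first."""
    scored = [(name, euclidean(emb, vector)) for name, emb in users.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Sample embeddings, keyed by user name (illustrative data only).
users = {
    "iCreate with a Mini iPad": [0.12, 0.53, 0.9, 0.11, 0.32],
    "Resistive Touchscreen": [0.72, 0.89, 0.54, 0.15, 0.26],
    "Fitness Band": [0.56, 0.91, 0.93, 0.71, 0.24],
    "Smart Ring": [0.38, 0.62, 0.99, 0.44, 0.25],
}

results = top_k(users, [0.1, 0.2, 0.3, 0.4, 0.5], k=3)
for name, distance in results:
    print(f"{name}: {distance:.4f}")
```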

Note: You can omit the `vector_distance` field from the query; the results are still ordered by `vector_distance`.

The distance metric used is the one specified when the index is created.

Similarly, the auto-generated `querySimilar<Object>ById` query lets us search for objects similar to an existing object, given its ID.

```graphql
querySimilar<Object>ById(
  by: vector_predicate,
  topK: n,
  id: userID): [User]
```

For example, the following query searches for the top 3 users whose names are most similar to the name of the user with ID "0xef7".

```graphql
querySimilarUserById(by: name_v, topK: 3, id: "0xef7") {
  id
  name
  vector_distance
}
```

content/graphql/schema/directives/_index.md

Lines changed: 6 additions & 0 deletions
@@ -38,6 +38,12 @@ Reference: [Deprecation]({{< relref "deprecated.md" >}})

Reference: [@dgraph directive]({{< relref "directive-dgraph" >}})

### @embedding

The `@embedding` directive designates one or more fields as vector embeddings.

Reference: [@embedding directive]({{< relref "embedding" >}})

### @generate

The `@generate` directive is used to specify which GraphQL APIs are generated for a type.
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+++
title = "@embedding"
weight = 1
[menu.main]
parent = "directives"
+++


A Float array can be used as a vector by applying the `@embedding` directive. It denotes a vector of floating point numbers, i.e., an ordered array of float32 values.

The embeddings can be defined on one or more predicates of a type, and they are generated using suitable machine learning models.

This directive is used in conjunction with the `@search` directive to declare the HNSW index. For more information, see the [@search](/graphql/schema/directives/search/#vector-embedding) directive for vector embeddings.

content/graphql/schema/directives/search.md

Lines changed: 20 additions & 0 deletions
@@ -624,3 +624,23 @@ query {
  }
}
```

### Vector embedding

The `@search` directive is used in conjunction with the `@embedding` directive to define an HNSW index on vector embeddings. These vector embeddings are obtained from external machine learning models.

```graphql
type User {
  userID: ID!
  name: String!
  name_v: [Float!] @embedding @search(by: ["hnsw(metric: euclidean, exponent: 4)"])
}
```

In this schema, the field `name_v` is an embedding on which the HNSW algorithm is used to create a vector search index.

The metric used to compute the distance between vectors in this example is Euclidean distance (`euclidean`). Other possible metrics are `cosine` and `dotproduct`.
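
As a rough illustration of how the three metrics differ, the Python sketch below computes the textbook Euclidean distance, cosine similarity, and dot product for a few small vectors. These are the standard mathematical definitions only; how Dgraph turns `cosine` or `dotproduct` into the distance it reports is not described in this section, so nothing here should be read as Dgraph's exact formula. The helper names are made up for this sketch.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

# math.dist (Python 3.8+) returns the Euclidean distance between two points.
for other in (b, c):
    print(math.dist(a, other), cosine_similarity(a, other), dot_product(a, other))
```

Euclidean distance is sensitive to vector length, cosine similarity depends only on direction, and the dot product reflects both, which is why the choice of metric usually follows from how the embedding model was trained.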

The `@embedding` directive designates one or more fields as vector embeddings.

The `exponent` value is used to set reasonable defaults for the HNSW internal tuning parameters. It is an integer giving the approximate number of vectors expected in the index, expressed as a power of 10. The default is `4` (10^4 vectors).
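
One way to pick an `exponent` from an expected index size is sketched below in Python. Rounding up to the next power of 10 is a choice made for this sketch, not a documented rule; since the value is only approximate, rounding to the nearest power would be equally defensible. Both helper names are hypothetical.

```python
def suggest_exponent(expected_vectors):
    """Smallest integer e such that 10**e >= expected_vectors."""
    if expected_vectors < 1:
        raise ValueError("expected_vectors must be at least 1")
    exponent = 0
    while 10 ** exponent < expected_vectors:
        exponent += 1
    return exponent


def hnsw_search_arg(metric, expected_vectors):
    """Render the string used inside @search(by: [...]) for an HNSW index."""
    if metric not in ("euclidean", "cosine", "dotproduct"):
        raise ValueError(f"unknown metric: {metric}")
    exponent = suggest_exponent(expected_vectors)
    return f"hnsw(metric: {metric}, exponent: {exponent})"


print(hnsw_search_arg("euclidean", 10_000))
print(hnsw_search_arg("cosine", 250_000))
```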

content/graphql/schema/types.md

Lines changed: 20 additions & 0 deletions
@@ -51,6 +51,26 @@ type User {

Scalar lists in Dgraph act more like sets, so `tags: [String]` would always contain unique tags. Similarly, `recentScores: [Float]` could never contain duplicate scores.

### Vectors

A Float array can be used as a vector by applying the `@embedding` directive. It denotes a vector of floating point numbers, i.e., an ordered array of float32 values. A type can contain more than one vector predicate.

Vectors are normally used to store embeddings obtained from an ML model.

When a Float vector is indexed, the GraphQL `querySimilar<type name>ByEmbedding` and `querySimilar<type name>ById` functions can be used for [similarity search]({{<relref "vector-similarity.md">}}).

A simple example of adding a vector embedding on `name` to the `User` type is shown below.

```graphql
type User {
  userID: ID!
  name: String!
  name_v: [Float!] @embedding @search(by: ["hnsw(metric: euclidean, exponent: 4)"])
}
```

In this schema, the field `name_v` is an embedding on which the [@search](/graphql/schema/directives/search/#vector-embedding) directive for vector embeddings is used.

### The `ID` type

In Dgraph, every node has a unique 64-bit identifier that you can expose in GraphQL using the `ID` type. An `ID` is auto-generated, immutable and never reused. Each type can have at most one `ID` field.
