[EIS] Dense Text Embedding task type integration #129847

timgrein · 2025-06-23T12:19:39Z

This PR adds the text_embedding task type to the elastic inference provider.

Testing flow:

Prerequisites:
- EIS is set up locally to allow multilingual-embed-v1
- EIS returns a dummy response for dense text embeddings

Verifying that the default endpoint exists:

curl --location 'http://localhost:9200/_inference?pretty=null' \
--header 'Authorization: {BASIC_AUTH}'

{
    "endpoints": [
        ...
        {
            "inference_id": ".multilingual-embed-v1-elastic",
            "task_type": "text_embedding",
            "service": "elastic",
            "service_settings": {
                "model_id": "multilingual-embed-v1",
                "rate_limit": {
                    "requests_per_minute": 10000
                }
            }
        },
        ...

Generating embeddings using the default endpoint:

curl --location 'http://localhost:9200/_inference/text_embedding/.multilingual-embed-v1-elastic' \
--header 'Content-Type: application/json' \
--header 'Authorization: {BASIC_AUTH}' \
--data '{
    "input": "A blue sky"
}'

{
    "text_embedding": [
        {
            "embedding": [
                2.1259406,
                1.7073475,
                ...
            ]
        }
    ]
}
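
For anyone who wants to script the same call rather than use curl, here is a minimal sketch in Java using the JDK's built-in `java.net.http` types. It only builds the request and does not send it. The class name, the `build` helper, and the placeholder credential string are assumptions for illustration; the URL, headers, and body mirror the curl command above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class EmbeddingRequestSketch {
    // Assumption: a local cluster on port 9200, as in the curl command above.
    static final String BASE_URL = "http://localhost:9200";

    /** Builds (but does not send) the POST request for a text_embedding inference call. */
    static HttpRequest build(String inferenceId, String authorizationHeader, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(BASE_URL + "/_inference/text_embedding/" + inferenceId))
            .header("Content-Type", "application/json")
            .header("Authorization", authorizationHeader)
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        // "Basic PLACEHOLDER" stands in for {BASIC_AUTH}; replace it with real credentials.
        HttpRequest request = build(
            ".multilingual-embed-v1-elastic",
            "Basic PLACEHOLDER",
            "{\"input\": \"A blue sky\"}"
        );
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would additionally need an `HttpClient` and a running cluster with EIS configured as described in the prerequisites.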


# Conflicts:
#   server/src/main/java/org/elasticsearch/TransportVersions.java
#   x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceGetServicesIT.java
#   x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java
#   x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionCreator.java
#   x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionVisitor.java
#   x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceTests.java


jonathan-buttner

Overall looks good, I left a few suggestions.

jonathan-buttner · 2025-06-23T13:47:49Z

...ence/external/response/elastic/ElasticInferenceServiceDenseTextEmbeddingsResponseEntity.java

+    public static TextEmbeddingFloatResults fromResponse(Request request, HttpResult response) throws IOException {
+        var parserConfig = XContentParserConfiguration.EMPTY.withDeprecationHandler(LoggingDeprecationHandler.INSTANCE);
+
+        try (XContentParser jsonParser = XContentFactory.xContent(XContentType.JSON).createParser(parserConfig, response.body())) {


Can we use the ConstructingObjectParser style instead?

Here's an example: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/openai/response/OpenAiEmbeddingsResponseEntity.java#L76

Addressed by @brendan-jugan-elastic's commit: use ConstructingObjectParser for response parsing

jonathan-buttner · 2025-06-23T13:53:06Z

...s/elastic/densetextembeddings/ElasticInferenceServiceDenseTextEmbeddingsServiceSettings.java

+        ElasticInferenceServiceRateLimitServiceSettings {
+
+    public static final String NAME = "elastic_inference_service_dense_embeddings_service_settings";
+    static final String DIMENSIONS_SET_BY_USER = "dimensions_set_by_user";


Having this field is important if the request to EIS will include a field called dimensions, or some other way of telling EIS the number of dimensions to return in the response. I don't see such a field being sent to EIS. Or did I miss it?

The reason we rely on a "set by user" flag is that it helps determine whether we figured out the number of dimensions automatically or took the user's value.

It's not there yet because we're not 100% sure which model we're going to host, so this includes some guesswork. I thought it would be better to have it. Or would you suggest removing it and adding it in a patch version if we really need it?

Ah I see. I think I'd remove it for now, because we could get into a weird state: if we release the code as it is right now and the user specifies the dimensions field, we'll set dimensions_set_by_user to true, but behind the scenes we're going to auto-set it after the fact. So the actual state will not be accurate.

Addressed by @brendan-jugan-elastic's commit: remove dimensions_set_by_user
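
The inconsistent state described in this thread can be illustrated with a small sketch. Everything here is hypothetical: the `Settings` record is a simplified stand-in rather than the real service settings class, and the numbers 256 and 1024 are made up. It assumes, as discussed above, that the user's requested value is never sent to EIS and that the stored value is overwritten with the detected one.

```java
final class DimensionsFlagSketch {
    /** Hypothetical, simplified service settings; not the real class from the PR. */
    record Settings(Integer dimensions, boolean dimensionsSetByUser) {}

    /** Mirrors the "set by user" idea: the flag is true whenever the user supplied a value. */
    static Settings fromUserRequest(Integer requestedDimensions) {
        return new Settings(requestedDimensions, requestedDimensions != null);
    }

    /**
     * Assumed behavior under discussion: the requested value is not forwarded upstream,
     * so the stored size is replaced by the length of the embedding that came back.
     */
    static Settings afterAutoDetection(Settings settings, int returnedEmbeddingLength) {
        return new Settings(returnedEmbeddingLength, settings.dimensionsSetByUser());
    }

    public static void main(String[] args) {
        Settings requested = fromUserRequest(256);             // hypothetical user value
        Settings stored = afterAutoDetection(requested, 1024); // hypothetical detected size
        // The flag still claims a user-chosen size, although the stored size was detected.
        System.out.println(stored);
    }
}
```

Dropping the flag until a dimensions value is actually forwarded avoids persisting this mismatch.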

jonathan-buttner · 2025-06-23T14:31:05Z

...erence/services/elastic/request/ElasticInferenceServiceDenseTextEmbeddingsRequestEntity.java

+
+        // optional field
+        if ((usageContext == ElasticInferenceServiceUsageContext.UNSPECIFIED) == false) {
+            builder.field(USAGE_CONTEXT, usageContext);


If usageContext is null, I believe this condition will evaluate to true. Is that what we want? I think we could also use != here, right?

Fair point, adjusted in the commit "Do not set usage context, if it's null".
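
The null case raised here can be checked in isolation. The following is a minimal sketch with a stand-in enum: only UNSPECIFIED appears in the diff, the other constants are assumed for illustration, and the null-safe variant is one possible fix, not necessarily the one that was committed.

```java
final class UsageContextCheckSketch {
    // Stand-in for ElasticInferenceServiceUsageContext; SEARCH and INGEST are assumed constants.
    enum UsageContext { SEARCH, INGEST, UNSPECIFIED }

    /** The condition as written in the reviewed diff: it is also true for null. */
    static boolean originalCondition(UsageContext usageContext) {
        return (usageContext == UsageContext.UNSPECIFIED) == false;
    }

    /** A possible null-safe variant (an assumption, not the committed code). */
    static boolean nullSafeCondition(UsageContext usageContext) {
        return usageContext != null && usageContext != UsageContext.UNSPECIFIED;
    }

    public static void main(String[] args) {
        System.out.println(originalCondition(null));                // true: the field would be written
        System.out.println(nullSafeCondition(null));                // false: the field is skipped
        System.out.println(nullSafeCondition(UsageContext.SEARCH)); // true
    }
}
```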

brendan-jugan-elastic · 2025-06-24T03:32:00Z

...ence/external/response/elastic/ElasticInferenceServiceDenseTextEmbeddingsResponseEntity.java

+     *                      0.9020516
+     *                  ],
+     *                  (...)
+     *             ],


I vaguely remembered Tim's thread on this a couple weeks ago, but should we revisit the response format? Looking at OpenAI, Alibaba, and Mixedbread as quick references, it looks like they return a list of objects. I don't have a strong preference, but just wanted to bring this up since we might be differing from others here and wanted to confirm that this is what we want.
Thanks!

Answered in the thread

elasticsearchmachine · 2025-06-24T07:44:29Z

Pinging @elastic/ml-core (Team:ML)

brendan-jugan-elastic

LGTM! One small question

brendan-jugan-elastic · 2025-06-24T12:48:29Z

...st/java/org/elasticsearch/xpack/inference/integration/InferenceRevokeDefaultEndpointsIT.java

+                                ElasticInferenceService.NAME,
+                                ElasticInferenceService.DENSE_TEXT_EMBEDDINGS_DIMENSIONS,
+                                ElasticInferenceService.defaultDenseTextEmbeddingsSimilarity(),
+                                DenseVectorFieldMapper.ElementType.FLOAT


Not a blocker, but can you explain why the MinimalServiceSettings differ from other task types?

I think it's just about the different purposes of the models/tasks:

Dense vector embeddings can have different element types (typically float, but they can also be quantized to bit vectors or int vectors, for example), therefore we need to specify the ElementType. Some models also allow you to specify a target number of dimensions (e.g. when using Matryoshka embeddings), therefore we need to specify the number of dimensions. Also, vector embeddings can be compared using different similarity measures, therefore we need to specify the similarity measure.

A reranking model simply returns an ordered list of ranked documents, so it doesn't make sense to specify dimensions, an element type, or a similarity measure.
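
To make the difference concrete, here is a rough sketch of how per-task minimal settings could be modeled. All names are hypothetical simplifications, not the real MinimalServiceSettings API, and the values 1024 and COSINE are placeholders, since the PR's actual defaults are hidden behind constants in the diff.

```java
final class MinimalSettingsSketch {
    enum TaskType { TEXT_EMBEDDING, RERANK }
    enum Similarity { COSINE, DOT_PRODUCT, L2_NORM }
    enum ElementType { FLOAT, BYTE, BIT }

    /** Hypothetical simplification; the vector-specific fields are null when they do not apply. */
    record MinimalSettings(
        String service,
        TaskType taskType,
        Integer dimensions,
        Similarity similarity,
        ElementType elementType
    ) {}

    /** Embeddings need all three vector-specific settings. */
    static MinimalSettings textEmbedding(String service, int dimensions, Similarity similarity, ElementType elementType) {
        return new MinimalSettings(service, TaskType.TEXT_EMBEDDING, dimensions, similarity, elementType);
    }

    /** Reranking returns ordered documents, so no vector-specific settings are carried. */
    static MinimalSettings rerank(String service) {
        return new MinimalSettings(service, TaskType.RERANK, null, null, null);
    }

    public static void main(String[] args) {
        System.out.println(textEmbedding("elastic", 1024, Similarity.COSINE, ElementType.FLOAT));
        System.out.println(rerank("elastic"));
    }
}
```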

Makes sense! Thanks for the background :)

elasticsearchmachine · 2025-06-24T19:39:35Z

💔 Backport failed

Status	Branch	Result
❌	8.19	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 129847

(cherry picked from commit 3b51dd5)

# Conflicts:
#   server/src/main/java/org/elasticsearch/TransportVersions.java
#   x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceGetModelsWithElasticInferenceServiceIT.java
#   x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/MockElasticInferenceServiceAuthorizationServer.java
#   x-pack/plugin/inference/src/internalClusterTest/java/org/elasticsearch/xpack/inference/integration/InferenceRevokeDefaultEndpointsIT.java
#   x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java
#   x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionVisitor.java
#   x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/response/ElasticInferenceServiceAuthorizationResponseEntity.java
#   x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceTests.java

timgrein · 2025-06-24T19:50:07Z

💚 All backports created successfully

Status	Branch	Result
✅	8.19

Questions ?

Please refer to the Backport tool documentation


timgrein added 4 commits June 23, 2025 13:19

Add working dense text embeddings integration with default endpoint. Some tests WIP

f054dca

Fix merge conflicts, compilation errors and test failures

6584dab

Spotless apply

9d47176

elasticsearchmachine added the v9.1.0 label Jun 23, 2025

timgrein and others added 8 commits June 23, 2025 15:12

Add ElasticInferenceServiceDenseTextEmbeddingsRequestTests

3e8c70a

Add ElasticInferenceServiceDenseTextEmbeddingsRequestEntityTests

23e7595

Add "-v1" to multilingual-embed

5af7516

Add ElasticInferenceServiceDenseTextEmbeddingsServiceSettingsTests.java

fddfd9d

Add dense text embedding test cases to ElasticInferenceServiceActionCreatorTests

9b48dfb

[CI] Auto commit changes from spotless

dbdadbe

Add ElasticInferenceServiceDenseTextEmbeddingsResponseEntityTests

e2f872e

Merge remote-tracking branch 'origin/eis-text-embedding-task-type' into eis-text-embedding-task-type

485dd89

jonathan-buttner reviewed Jun 23, 2025


timgrein and others added 11 commits June 23, 2025 16:50

Merge branch 'main' into eis-text-embedding-task-type

172070a

Fix compilation error after resolving merge conflict and spotlessAppl

6a35870

Merge branch 'main' into eis-text-embedding-task-type

a8b604b

remove dimensions_set_by_user

3b486b7

Merge branch 'main' into eis-text-embedding-task-type

6ffcc22

[CI] Auto commit changes from spotless

3489a09

fix checkstyle

fb5dbc0

fix checkstyle

1dcbcab

[CI] Auto commit changes from spotless

dc6f320

use ConstructingObjectParser for response parsing

087d4e5

[CI] Auto commit changes from spotless

cd3e116

brendan-jugan-elastic reviewed Jun 24, 2025


timgrein and others added 2 commits June 24, 2025 09:17

Merge branch 'main' into eis-text-embedding-task-type

aa24341

Some cleanup (removing unused vars etc.)

7269c51

timgrein marked this pull request as ready for review June 24, 2025 07:42

elasticsearchmachine added the needs:triage label Jun 24, 2025

timgrein added the >non-issue, :ml, Team:ML, v8.19.0, and auto-backport labels and removed the needs:triage label Jun 24, 2025

timgrein requested review from jonathan-buttner and brendan-jugan-elastic June 24, 2025 08:09

timgrein added 2 commits June 24, 2025 10:19

Fix integration test

220e208

Do not set usage context, if it's null

27ca440

brendan-jugan-elastic approved these changes Jun 24, 2025


timgrein and others added 4 commits June 24, 2025 15:57

Pass through chunking settings and provide default for default endpoint

b7d10b8

Merge branch 'main' into eis-text-embedding-task-type

3164c6c

After merge conflict resolution clean-up

fc11815

Merge branch 'main' into eis-text-embedding-task-type

59f84a9

timgrein merged commit 3b51dd5 into elastic:main Jun 24, 2025
32 checks passed

elasticsearchmachine added the backport pending label Jun 24, 2025

timgrein mentioned this pull request Jun 24, 2025

[8.19] [EIS] Dense Text Embedding task type integration (#129847) #129963

Merged

mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jun 25, 2025

[EIS] Dense Text Embedding task type integration (elastic#129847)

aea4d5f

timgrein added a commit that referenced this pull request Jun 25, 2025

[8.19] [EIS] Dense Text Embedding task type integration (#129847) (#129963)

7d09f05
