-
Notifications
You must be signed in to change notification settings - Fork 25.3k
[EIS] Dense Text Embedding task type integration #129847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EIS] Dense Text Embedding task type integration #129847
Conversation
# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java # x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceGetServicesIT.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionCreator.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionVisitor.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceTests.java
…to eis-text-embedding-task-type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, I left a few suggestions.
public static TextEmbeddingFloatResults fromResponse(Request request, HttpResult response) throws IOException { | ||
var parserConfig = XContentParserConfiguration.EMPTY.withDeprecationHandler(LoggingDeprecationHandler.INSTANCE); | ||
|
||
try (XContentParser jsonParser = XContentFactory.xContent(XContentType.JSON).createParser(parserConfig, response.body())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the ConstructingObjectParser
style instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed by @brendan-jugan-elastic's commit: use ConstructingObjectParser for response parsing
ElasticInferenceServiceRateLimitServiceSettings { | ||
|
||
public static final String NAME = "elastic_inference_service_dense_embeddings_service_settings"; | ||
static final String DIMENSIONS_SET_BY_USER = "dimensions_set_by_user"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this field is important if the request to EIS will include a field called dimensions
or someway to telling EIS to number of dimensions to return in the response. I don't see a field being sent to EIS. Or did I miss it?
The reason we rely on a "set by user" is because it helps determining whether we automatically figured out the number of dimensions or if we took the user's value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not there yet as we're not 100% sure, which model we're going to host, so this includes some guesswork. I thought it would be better to have it or would you suggest to remove it and add it in a patch version, if we really need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. I think I'd remove it for now because we could get in a weird state where if we release the code as it is right now and the user specifies the dimensions field, we'll set dimensions_set_by_user
to true but behind the scenes we're going to auto set it after the fact. So the actual state will not be accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed by @brendan-jugan-elastic's commit: remove dimensions_set_by_user
|
||
// optional field | ||
if ((usageContext == ElasticInferenceServiceUsageContext.UNSPECIFIED) == false) { | ||
builder.field(USAGE_CONTEXT, usageContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If usageContext
is null
I believe this if-block
will return true
. Is that what we want? I think we could also do !=
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, adjusted with Do not set usage context, if it's null
* 0.9020516 | ||
* ], | ||
* (...) | ||
* ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vaguely remembered Tim's thread on this a couple weeks ago, but should we revisit the response format? Looking at OpenAI, Alibaba, and Mixedbread as quick references, it looks like they return a list of objects. I don't have a strong preference, but just wanted to bring this up since we might be differing from others here and wanted to confirm that this is what we want.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered in the thread
Pinging @elastic/ml-core (Team:ML) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! One small question
ElasticInferenceService.NAME, | ||
ElasticInferenceService.DENSE_TEXT_EMBEDDINGS_DIMENSIONS, | ||
ElasticInferenceService.defaultDenseTextEmbeddingsSimilarity(), | ||
DenseVectorFieldMapper.ElementType.FLOAT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but can you explain why the MinimalServiceSettings
differ from other task types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's just about the different purposes models/tasks:
- Dense vector embeddings can have different element types (typically float, but they can also be quantized to bit vectors or int vectors for example) , therefore we need to specify the
ElementType
. Some models also allow you to specify a target number of dimensions (f.e. when using Matryoshka embeddings, therefore we need to specify the number of dimensions. Also vector embeddings can be compared using different similarity measures, therefore we need to specify the similarity measure. - A reranking model simply returns an ordered list of ranked documents, so it doesn't make sense to specify dimensions, an element type or a similarity measure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! Thanks for the background :)
💔 Backport failed
You can use sqren/backport to manually backport by running |
(cherry picked from commit 3b51dd5) # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java # x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceGetModelsWithElasticInferenceServiceIT.java # x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/MockElasticInferenceServiceAuthorizationServer.java # x-pack/plugin/inference/src/internalClusterTest/java/org/elasticsearch/xpack/inference/integration/InferenceRevokeDefaultEndpointsIT.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionVisitor.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/response/ElasticInferenceServiceAuthorizationResponseEntity.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceTests.java
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
This PR adds the
text_embedding
task type to theelastic
inference provider.Testing flow:
multilingual-embed-v1
Verifying, that the default endpoint exists:
Generating embeddings using the default endpoint: