-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Handle legacy mappings with placeholder fields #85059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-search (Team:Search) |
…ersion indexCreatedVersion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with the Mapper changes now, I think. Can we have some tests checking serialization of placeholder mappers with unknown fields?
@@ -227,7 +227,8 @@ public String toString() { | |||
|
|||
public static final TypeParser PARSER = new TypeParser( | |||
(n, c) -> new Builder(n, c.indexVersionCreated()), | |||
notInMultiFields(CONTENT_TYPE) | |||
notInMultiFields(CONTENT_TYPE), | |||
Version.CURRENT.minimumCompatibilityVersion() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that there are a few of these multi-parameter calls, maybe we should add a super constructor that takes the Builder and verifier functions and passes Version.CURRENT.minimumCompatibilityVersion()
as a default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fixed in 07234b1
mappingsBuilder.startObject("date").field("type", "date").field("format", "yyyy/MM/dd").endObject(); | ||
mappingsBuilder.endObject().endObject(); | ||
putMappingsRequest.setJsonEntity(Strings.toString(mappingsBuilder)); | ||
assertOK(client().performRequest(putMappingsRequest)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify why this needs to be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now automated by the mapping conversion. Prior to this PR, you had to manually define the mapping after importing a legacy index in order to access the fields in that index (the original mapping would only be imported to _meta/legacy-mappings section).
I've added such a test in ad99173 in addition to the ones in OldMappingsIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of nits, but LGTM otherwise. Thanks for your patience on this @ywelsch!
server/src/main/java/org/elasticsearch/index/mapper/FieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/BooleanFieldMapper.java
Outdated
Show resolved
Hide resolved
"include_in_all", | ||
"[include_in_all] is deprecated, the _all field have been removed in this version" | ||
); | ||
if (parserContext.indexVersionCreated().isLegacyIndexVersion() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to guard this with a version check? I think it will still be true even for older mappings, in that we won't generate an _all
field type that you can search against even if the underlying lucene field exists in the index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spot. In an earlier iteration, I was pondering on adding _all support for legacy indices, hence the version check. I've removed it now.
@elasticmachine run elasticsearch-ci/part-1 (unrelated failure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some nits and some questions, no blockers on my end. LGTM, thanks for all the iterations.
*/ | ||
private void checkMappingsCompatibility(IndexMetadata indexMetadata) { | ||
@Nullable | ||
public Mapping checkMappingsCompatibility(IndexMetadata indexMetadata) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I find it slightly confusing that this check method is now returning the mappings. I can see how it is convenient to return the mappings obtained from the merge operation. Would it make sense renaming the method to adapt it to the updated behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed in 07a2602
Map<String, MetadataFieldMapper.TypeParser> metadata5x = new LinkedHashMap<>(metadata7x); | ||
metadata5x.put(LegacyTypeFieldMapper.NAME, LegacyTypeFieldMapper.PARSER); | ||
this.metadataMapperParsers5x = metadata5x; | ||
this.fieldFilter = fieldFilter; | ||
} | ||
|
||
/** | ||
* Return a map of the mappers that have been registered. The | ||
* Return a map of the non-legacy mappers that have been registered. The | ||
* returned map uses the type of the field as a key. | ||
*/ | ||
public Map<String, Mapper.TypeParser> getMapperParsers() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this method is only used in tests now: is that ok or do we need to adapt tests as a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've adapted the tests and removed it in 4c86d3c
* so that the original mapping can be preserved and proper exception messages can | ||
* be provided when accessing these fields. | ||
*/ | ||
public class PlaceHolderFieldMapper extends FieldMapper { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to get confused with naming here, and maybe it is just me: would a different name like UnknownLegacyFieldMapper better represent what this mapper is used for? Shall we make it final also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the PlaceHolder terminology, as we might know the field type (e.g. it's a completion field, or a search_as_you_type field, which the current version/implementation knows about), but we just not chose to support for legacy indices. I'm open for changing it if we find a better name, but also happy with the current one.
Regarding "final" modifier, it's more of a personal preference, but I like to avoid adding too many restrictive modifiers in our codebase, as we're not developing a library (unlike Lucene), and it feels unnecessary doing so (putting extra focus on things that should not matter).
this(builderFunction, contextValidator, Version.CURRENT.minimumIndexCompatibilityVersion()); | ||
} | ||
|
||
public TypeParser( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this one could be made private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in f163412
this(builderFunction, (n, c) -> {}, Version.CURRENT.minimumIndexCompatibilityVersion()); | ||
} | ||
|
||
public TypeParser(BiFunction<String, MappingParserContext, Builder> builderFunction, Version minimumCompatibilityVersion) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it help to add javadocs here to clarify when to use which constructor? as far as I understand, the constructor that takes the version is the one called by all the mappers that are supported on legacy indices, all the rest remains the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in f163412
@@ -1352,6 +1358,10 @@ public final void parse(String name, MappingParserContext parserContext, Map<Str | |||
validate(); | |||
} | |||
|
|||
protected void handleUnknownParam(String propName, Object propNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we rename it to something that removes any doubt that this is only called for legacy indices? handleLegacyindexUnknownParam ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed the method to handleUnknownParamOnLegacyIndex in f163412
|
||
@Override | ||
public boolean supportsVersion(Version indexCreatedVersion) { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this I find a bit surprising especially given that field aliases were added with 6.4 . I wonder if this is never called, in which case maybe we should throw exception instead, or if it could rely on the default impl instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though field aliases were only added in ES 6.4, they could be used with older indices as well (i.e. you had a 5.x index for example that you kept using in 6.4 and now added the field alias to the mapping of that 5.x index). As field aliases are more of a runtime property (e.g. just like runtime fields, which you could also use on indices created before the release of runtime fields) and have no connection to the actual data, I see no reason to limit their use.
|
||
@Override | ||
public boolean supportsVersion(Version indexCreatedVersion) { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if overriding these two supportsVersion is needed. Maybe it does because metadata fields are always supported and they don't take a version in like FieldMapper does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was never called, it mainly resulted from the fact that MetadataFieldMapper.TypeParser was extending Mapper.TypeParser (with no good reason). I separated the two in b25369e, which means we no longer have to have this method here.
|
||
@Override | ||
public boolean supportsVersion(Version indexCreatedVersion) { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too, I wonder what this means. Does this get called, or does this simply mean "objects are always supported regardless of the version"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter (objects are always supported regardless of the version). This gets called indeed, e.g. when you have an object below an object.
Thank you for the reviews @romseygeek and @javanna! |
As part of #81210 we would like to add support for handling legacy (Elasticsearch 5 and 6) mappings in newer Elasticsearch versions. The idea is to import old mappings "as-is" into Elasticsearch 8, and adapt the mapper parsers so that they can handle those old mappings. Only a select subset of the legacy mapping will actually be parsed, and fields that are neither known to newer ES version nor supported for search will be mapped as "placeholder fields", i.e., they are still represented as fields in the system so that they can give proper error messages when queried by a user.
Fields that are supported:
5.x indices with mappings that have multiple mapping types are collapsed together on a best-effort basis before they are imported.
Relates #81210