Add Migration Assistant Type Mapping documentation #9164
Signed-off-by: Andre Kurait <[email protected]>
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review. |
8. When running metadata migration, run with the additional parameter `console metadata migrate --transformer-config-file /shared-logs-output/transformation.json`.[^1] | ||
9. If the transformation configuration is updated, backfill/replayer will need to be stopped and restarted to apply the changes. | ||
|
||
[^1]: The `/shared-logs-output` mount will soon be relocated with a new mountpoint on the container for the EFS volume. |
Can we remove this? What action would a reader of the document take with this note?
Thanks for writing this up.
Let's consider reframing this documentation for a user who doesn't know whether they use this deprecated feature. It jumps into "what to do" without enough context for a user to know if they are impacted and why they should make different choices.
### Using the TypeMappingsSanitizationTransformer | ||
|
||
1. Navigate to the bootstrap box and open the `cdk.context.json` with vim. | ||
2. Add/Update the key `reindexFromSnapshotExtraArgs` to include `--doc-transformer-config-file /shared-logs-output/transformation.json`. [^1] |
Can we supply an empty file at this location and adjust our defaults to pull from this location? That would shorten what a user needs to do. We are adding a lot of steps that look very similar and are easy to do incorrectly.
"post": "new_posts" | ||
} | ||
}, | ||
"sourceProperties": { |
Can we remove the version here? Maybe I am missing how it is valuable for a user to pick a different version
"user": "combined_activity", | ||
"post": "combined_activity" |
Can we simplify this name to `activity`?
[ | ||
{ | ||
"TypeMappingsSanitizationTransformerProvider": { | ||
"staticMappings": { |
Can we drop this node, so it's just a list of indices and the types to be mapped?
"TypeMappingsSanitizationTransformerProvider": { | ||
"staticMappings": { | ||
"activity": { | ||
"user": "users_only", |
So any types that are [not] included will be dropped? That seems easy to get wrong or mistype. Do we have a way to preflight this and warn users if they are dropping types by accident, or is there a way we can make this more intentional?
{
"{index-name}": {
"mappedTypes": {
"users": "users_only"
},
"droppedTypes": [
"post"
]
}
}
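The preflight idea above could be sketched roughly as follows. This is an illustration only, not part of Migration Assistant: the function name `find_unmapped_types` and the `source_types` input (a listing of the types present in each source index) are hypothetical, and it assumes that types missing from a `staticMappings` entry are the ones that would be dropped, which is the behavior being asked about here rather than a confirmed one.

```python
from typing import Dict, Iterable, List


def find_unmapped_types(
    source_types: Dict[str, Iterable[str]],
    static_mappings: Dict[str, Dict[str, str]],
) -> Dict[str, List[str]]:
    """For each index that has a static mapping entry, list the source
    types that the entry does not mention."""
    unmapped: Dict[str, List[str]] = {}
    for index, mapped in static_mappings.items():
        # Types present on the source index but absent from the mapping.
        missing = sorted(set(source_types.get(index, ())) - set(mapped))
        if missing:
            unmapped[index] = missing
    return unmapped


# Hypothetical source cluster: the "activity" index holds two types.
source_types = {"activity": ["user", "post"]}
# Mapping shaped like the reviewed example: only "user" is listed.
static_mappings = {"activity": {"user": "users_only"}}

for index, missing in find_unmapped_types(source_types, static_mappings).items():
    print(f"warning: index '{index}' has types with no mapping: {missing}")
```

A check like this only reports; whether unlisted types should be dropped silently, rejected, or require an explicit `droppedTypes` entry is the design question raised in this thread.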
@peternied are you asking "any types that aren't included"? That's what I got reading this, but I'd love if the documentation said that explicitly. (and/or peter's proposal here)
Thanks for catching my typo, fixed it!
[ | ||
"(.*)", | ||
".*", | ||
"$1" | ||
] |
Please add property names to signal what these mean.
+1, these are super confusing and not explained at all. I think we had previously discussed these being dicts, e.g. `{"index_regex": "(.*)", "type_regex": ".*", "new_index": "$1"}` or whatever names, but either way we definitely need more explanation.
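To make the intent of the three positional values concrete, here is a minimal sketch of how a named rule could be interpreted. It is not the transformer's implementation. The key names are the ones floated in the comment above, and the semantics are assumptions: both patterns must match the whole index and type name, and `$n` in `new_index` refers to capture groups numbered across the index pattern first and the type pattern second.

```python
import re
from typing import Dict, Optional


def apply_regex_mapping(rule: Dict[str, str], index: str, doc_type: str) -> Optional[str]:
    """Return the target index name for (index, doc_type), or None if the
    rule does not apply. Semantics are assumed, not taken from the tool."""
    index_match = re.fullmatch(rule["index_regex"], index)
    type_match = re.fullmatch(rule["type_regex"], doc_type)
    if index_match is None or type_match is None:
        return None
    # Assumption: groups are numbered index-pattern first, then type-pattern.
    groups = index_match.groups() + type_match.groups()

    def substitute(ref: "re.Match[str]") -> str:
        # "$1" refers to groups[0]; an unmatched optional group becomes "".
        return groups[int(ref.group(1)) - 1] or ""

    return re.sub(r"\$(\d+)", substitute, rule["new_index"])


# The rule from the reviewed snippet, written with named properties.
keep_index = {"index_regex": "(.*)", "type_regex": ".*", "new_index": "$1"}
# A hypothetical rule that combines index and type names.
combine = {"index_regex": "(.*)", "type_regex": "(.*)", "new_index": "$1_$2"}

print(apply_regex_mapping(keep_index, "activity", "user"))
print(apply_regex_mapping(combine, "activity", "user"))
```

With named properties, a reader can tell from the config alone which pattern matches the index, which matches the type, and which string builds the target name.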
- Within the replayer, create index split is not supported. | ||
- Note: This is only impactful on ES 5.x since multi type index creation is not supported on ES 6.x. |
Let's frame this like the other operations so it's easier to see.
If it is a known gap, we should proactively create a GitHub issue that folks can comment or vote on, and leave a link to it here in the docs.
- PUT/POST /{index}/{type}/{id} | ||
- PUT/POST /{index}/{type}/ | ||
- GET /{index}/{type}/{id} | ||
- PUT/POST /_bulk | ||
- PUT/POST /{index} |
Let's turn this into a table with a leading column for the operation name, such as `Index`, `Bulk Index/Update/Delete`, etc.
- Within the replayer, create index split is not supported. | ||
- Note: This is only impactful on ES 5.x since multi type index creation is not supported on ES 6.x. | ||
|
||
### Important Notes |
These important notes are at the bottom. Let's see if we can better integrate them into the workflow.
|
||
# Handling type mapping deprecation | ||
|
||
During a migration from Elasticsearch versions 6.x and before, you may have used the type mapping feature. This page provides a cookbook of different scenarios and templates that can be used to handle these deprecations. |
"handle these deprecations when migrating to OpenSearch."
6. Create a file with `vim /shared-logs-output/transformation.json`. [^1] | ||
7. Add your transformation configuration to the file (see examples below). | ||
8. When running metadata migration, run with the additional parameter `console metadata migrate --transformer-config-file /shared-logs-output/transformation.json`.[^1] | ||
9. If the transformation configuration is updated, backfill/replayer will need to be stopped and restarted to apply the changes. |
There should also probably be a note here that already-transferred data and metadata should be cleared to avoid inconsistent and incompatible states.
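Because the steps above have the user hand-edit a JSON file in vim, a small sanity check before running the migration might catch typos early. The sketch below is only an illustration: the helper name `load_transformation_config` is hypothetical, it writes to a temporary directory instead of `/shared-logs-output/transformation.json`, and the only shape it checks (a top-level JSON list) is inferred from the examples in this PR.

```python
import json
import tempfile
from pathlib import Path


def load_transformation_config(path: Path) -> list:
    """Parse the transformation file and check its top-level shape."""
    with path.open(encoding="utf-8") as handle:
        config = json.load(handle)  # raises json.JSONDecodeError on a typo
    if not isinstance(config, list):
        raise ValueError("expected a top-level JSON list of transformers")
    return config


# Config shaped like the example under review.
example_config = [
    {
        "TypeMappingsSanitizationTransformerProvider": {
            "staticMappings": {"activity": {"user": "users_only"}}
        }
    }
]

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the real file location used in the documented steps.
    config_path = Path(tmp) / "transformation.json"
    config_path.write_text(json.dumps(example_config, indent=2), encoding="utf-8")
    loaded = load_transformation_config(config_path)
    print(f"parsed {len(loaded)} transformer entries")
```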
|
||
### Defaults | ||
|
||
When no regex mappings are specified, the transformer will default to the following behavior: |
Is this when nothing at all is specified? Or `regexMappings` is specified but it's empty?
``` | ||
|
||
This has the effect of retaining the index name for ES 6+ created indices while combining the type and index name for ES 5.x created indices. If you wish to retain the index name for ES 5.x created indices, please use the `staticMappings` option or override the type mappings using the `regexMappings` option similar to the `Keep original structure` | ||
|
Can you give an example/some more info on how these can be combined for multiple indices? Is it possible to combine `regexMappings` for one index with static mappings for another? Can you show an example with multiple static mappings?
### Important Notes | ||
|
||
- After ES 6, documents either have no type or are forced to use the type `_doc` | ||
- When migrating from ES 6, some type context might not be available (e.g., during RFS operations) |
I don't think I understand what this point means
@AndreKurait: Let me know once you've implemented @peternied and @mikaylathompson's feedback and I'll begin the writer review process. |
Description
Add Migration Assistant Type Mapping documentation
Issues Resolved
MIGRATIONS-2385
Version
all (Migration Assistant 2.1.5+)
Frontend features
n/a
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.