discovers: replace invalid collection keys with discovered fallback keys#2808
Open
If a collection's existing key no longer resolves in the discovered schema, and the discovered fallback key does, replace the collection's key with the discovered fallback key. If both are invalid, or the existing key still resolves, do nothing. Also update `Changed` to carry an optional `reason` field that surfaces in the publication detail when a fallback key replacement occurs.
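The decision rule above can be sketched in isolation. This is a minimal illustration, not the code in this PR: the schema is modeled as a plain set of JSON pointers, and `KeyDecision`, `key_resolves`, and `decide_key` are hypothetical names. Treating an empty key as invalid is also an assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of comparing an existing key with a discovered fallback key.
#[derive(Debug, PartialEq)]
pub enum KeyDecision {
    /// Leave the collection's key untouched.
    Keep,
    /// Replace the key with the fallback; `reason` would surface in the publication detail.
    Replace { key: Vec<String>, reason: String },
}

/// A key resolves if it is non-empty (an assumption) and every pointer is in the schema.
fn key_resolves(key: &[String], schema_fields: &BTreeSet<String>) -> bool {
    !key.is_empty() && key.iter().all(|ptr| schema_fields.contains(ptr))
}

/// Replace only when the existing key is invalid AND the fallback key is valid.
pub fn decide_key(
    existing: &[String],
    fallback: &[String],
    schema_fields: &BTreeSet<String>,
) -> KeyDecision {
    if key_resolves(existing, schema_fields) {
        // The existing (possibly user-customized) key still works: do nothing.
        return KeyDecision::Keep;
    }
    if !key_resolves(fallback, schema_fields) {
        // Both keys are invalid: do nothing.
        return KeyDecision::Keep;
    }
    KeyDecision::Replace {
        key: fallback.to_vec(),
        reason: format!(
            "existing key {:?} no longer resolves in the discovered schema; using fallback key {:?}",
            existing, fallback
        ),
    }
}
```

With a schema containing `/email` and `/name`, an existing key of `/id` and a fallback of `/email` yields `Replace`. Any case where the existing key still resolves, or where neither key resolves, yields `Keep`.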
Context
When a user removes a primary key column from a captured table, the connector's discover response returns the updated schema (without that column) and optionally a set of fields which together uniquely identify a row, marked as a fallback key. The discover merge logic is designed to skip key updates for fallback keys, preserving any user-customized key, but it still updates the schema of the collection. The result is a collection whose key references field(s) in the schema that no longer exist, which causes newly captured documents to fail validation due to:
The connector behavior is correct: a secondary unique index is not the primary key, so marking it as a fallback is accurate. The merge logic just didn't account for the case where the existing key becomes invalid after a schema update, and we have a viable alternative in the form of the fallback key.
What changed
`merge_collections` now validates the existing key against the discovered schema. When a fallback key is discovered and the existing key no longer resolves (its fields aren't in the schema), the fallback replaces it. Two guards prevent over-application:
The key validity check uses `doc::Shape::infer` + `shape.locate()` to determine whether each pointer resolves to an explicit schema location (`Exists::Must` or `Exists::May`) vs an implicit or impossible one.
When a fallback key replacement occurs, a `reason` string is attached to the `Changed` struct and included in the publication detail, producing output like:
Question: Should this be further constrained to `Exists::Must`? I imagine that we don't currently have any constraint that fields included in a fallback key be required fields, so many of them likely are `Exists::May`. What... happens when a key is optional? Would that even work?
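To make the question concrete, here is a rough sketch of the two possible validity rules side by side. It does not use the real `doc` crate: `Exists` is a local stand-in enum, `locate` is a hypothetical lookup over a map of pointers, and returning `Implicit` for unknown pointers is an assumption made only for illustration.

```rust
use std::collections::BTreeMap;

/// Local stand-in for the existence classification named in this PR (not the real type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Exists {
    Must,     // location is required by the schema
    May,      // location is explicitly described but optional
    Implicit, // location is not mentioned by the schema
    Cannot,   // location is impossible under the schema
}

/// Hypothetical lookup: pointers absent from the map are treated as implicit.
pub fn locate(schema: &BTreeMap<String, Exists>, ptr: &str) -> Exists {
    schema.get(ptr).copied().unwrap_or(Exists::Implicit)
}

/// Returns true if every key pointer resolves to an explicit location.
/// With `require_must`, optional (`May`) locations are rejected as well.
pub fn key_is_valid(schema: &BTreeMap<String, Exists>, key: &[&str], require_must: bool) -> bool {
    !key.is_empty()
        && key.iter().all(|ptr| match locate(schema, ptr) {
            Exists::Must => true,
            Exists::May => !require_must,
            Exists::Implicit | Exists::Cannot => false,
        })
}
```

Under the rule described in this PR (`require_must = false`), a key of `/id` (`Must`) and `/email` (`May`) is accepted. Under the stricter variant raised in the question (`require_must = true`), the same key is rejected because `/email` is optional.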