Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dynamodb_item_version metadata that is derived from timestamp for… #3596

Merged
merged 2 commits into from
Nov 8, 2023

Conversation

graytaylor0
Copy link
Member

@graytaylor0 graytaylor0 commented Nov 7, 2023

… stream events

Description

This change adds a new metadata field for DymamoDB Events, dynamodb_item_version. This version is derived from the timestamp for stream events, and set to 0L for export Events, in order to give stream events precedence over export

This change also writes stream event timestamps to the externalOriginationMetadata for stream Events. This will provide an out of the box external latency metric for ddb streams.

The dynamodb_item_version metadata is intended for use with the document_version field of the OpenSearch sink to always take the latest Events.

Since the timestamp for stream Events is in seconds, we are differentiating stream events received in the same second for a shard with the formula ddb_version = ddbEventTimestamp.getEpochSecond() * 1_000_000 + recordsSeenThisSecond.

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

public void addToBuffer(final AcknowledgementSet acknowledgementSet,
final Map<String, Object> data,
final Map<String, Object> keys,
long eventCreationTimeMillis,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make these two parameters final.


// Only set external origination time for stream events, not export
if (eventName != null) {
event.getEventHandle().setExternalOriginationTime(Instant.ofEpochMilli(eventCreationTimeMillis));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also should set it on the EventMetadata.

recordsSeenThisSecond++;
}

return eventTimeInSeconds.toEpochMilli() * 1000 + recordsSeenThisSecond;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when there are more than 1000 records in a second?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be an issue, but we would have to get 1000 records in same shard, which is impossible at this time

Signed-off-by: Taylor Gray <[email protected]>
@graytaylor0 graytaylor0 requested a review from dlvenable November 7, 2023 22:27
@graytaylor0 graytaylor0 dismissed dlvenable’s stale review November 8, 2023 00:28

Addressing comments in follow up PR

@graytaylor0
Copy link
Member Author

Addressed the comments from @dlvenable

@graytaylor0 graytaylor0 merged commit a1c56d0 into opensearch-project:main Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants