Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: DynamoDB projections #1222

Merged
merged 1 commit into from
Oct 25, 2024
Merged

feat: DynamoDB projections #1222

merged 1 commit into from
Oct 25, 2024

Conversation

pvlugter
Copy link
Contributor

Ported over from the internal incubating project.

project/project-info.conf Outdated Show resolved Hide resolved
@@ -38,6 +38,13 @@ object Dependencies {
case Seq(major, minor, _*) => s"$major.$minor"
}

val AkkaPersistenceDynamodb = "2.0.0-M1"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First 2.0 milestone published, while depending on Akka 2.10 milestones.

@pvlugter pvlugter marked this pull request as ready for review October 16, 2024 06:41
private def readTimestampOffset(): Future[TimestampOffsetBySlice] = {
val oldState = state.get()
// retrieve latest timestamp for each slice, and use the earliest
val futTimestamps =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is conceivably 1024 queries fired off right at startup, wonder if it's better to do this in a stream with mapAsync and limited parallelism?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, needing the separate queries is a weak point for the plugin. The events-by-slices queries are also all separate and in parallel. How many queries that actually run in parallel should come down to the concurrency settings for the DynamoDB client. Not sure if @patriknw had any other thoughts at the time?

(And let's do any improvements in separate PRs. With this one as the initial copy over.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the concern that it creates a burst that is not desired? I don't know if that is a problem. Create an issue/pr if that has been seen as a real problem when testing.

events-by-slices queries are more separate and therefore more spread over time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At minimum, if client.http.max-concurrency + client.http.max-pending-connection-acquires is less than 1024 (which particularly latency-sensitive applications may end up having), there is no chance of a DynamoDB projection successfully starting, so the projection should check for this.

Another possibility would be having the queries here retry, so that the offsets can eventually be recovered.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have a PR up for at least one of these when this is merged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to add the bounded parallelism, thanks

Copy link
Member

@patriknw patriknw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pvlugter pvlugter merged commit 1742fd1 into main Oct 25, 2024
21 of 22 checks passed
@pvlugter pvlugter deleted the dynamodb branch October 25, 2024 06:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants