-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cio optimization #2550
base: main
Are you sure you want to change the base?
feat: cio optimization #2550
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
.infra/application.properties
Outdated
@@ -9,7 +9,7 @@ debezium.source.database.password=%database_pass% | |||
debezium.source.database.dbname=%database_dbname% | |||
debezium.source.database.server.name=api | |||
debezium.source.table.include.list=public.comment,public.user_comment,public.comment_mention,public.source_request,public.post,public.user,public.post_report,public.source_feed,public.settings,public.reputation_event,public.submission,public.user_state,public.notification_v2,public.source_member,public.feature,public.source,public.post_mention,public.content_image,public.comment_report,public.user_post,public.banner,public.post_relation,public.marketing_cta,public.squad_public_request,public.user_streak,public.bookmark,public.user_company,public.source_report,public.user_top_reader,public.source_post_moderation | |||
debezium.source.column.exclude.list=public.post.tsv,public.post.placeholder,public.source.flags,public.user_top_reader.image | |||
debezium.source.column.exclude.list=public.post.tsv,public.post.placeholder,public.source.flags,public.user_top_reader.image,public.user.cioRegistered,public.user.acceptedMarketing,public.user.followingEmail,public.user.notificationEmail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to track changes from any of these columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure though?
For example acceptedMarketing should trigger userUpdated no? to sync them to CIO (when they not in your flow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. We have these props in CIO, they should definitely be synced. Let me check the worker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should not add these columns, but only when running the initial rollout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please note, this list doesn't exclude these columns from triggering cdc but from being sent in the debezium payload. To ignore them you will need to add a filter, it's more complex process but you can find examples in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does, didn't we fix that issue a while back by finally updating debezium?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not an issue just misunderstanding of what this config does from our side. But it's better to check with adhoc infra any change in debezium props
src/cio.ts
Outdated
user: ChangeObject<User>, | ||
) => { | ||
const { id } = user; | ||
const changed = JSON.parse(JSON.stringify(user)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we have a util function that can turn regular objects into ChangeObject
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
structuredClone maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Yeah, I think that's better than json-parse-stringify.
Edit: introduced a function instead that makes the same parse and stringify of our objects to pubsub. Structured clone returns Date object still for which we need either a string or number, etc.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
.infra/application.properties
Outdated
@@ -28,4 +28,5 @@ debezium.transforms.ReadOperationFilter.condition=!(valueSchema.field('op') && v | |||
debezium.transforms.PostsFilter.type=io.debezium.transforms.Filter | |||
debezium.transforms.PostsFilter.language=jsr223.groovy | |||
debezium.transforms.PostsFilter.condition=!(valueSchema.field('op') && value.op == 'u' && value.source.table == 'post' && value.before.views != value.after.views) | |||
# debezium.transforms.PostsFilter.condition=!(valueSchema.field('op') && value.op == 'u' && value.source.table == 'post' && value.before.views != value.after.views) && !(valueSchema.field('op') && value.op == 'u' && value.source.table == 'user' && value.before.cioRegistered != value.after.cioRegistered && !value.after.cioRegistered) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reactivation is still one-by-one based on this filter right?
We is it commented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, we are running it also in batches to avoid the rate-limiting as the users can reach up to thousands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we run in batches, we definitely need to uncomment this line, don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes for that part. This is in part of the preparation, so we can merge this PR once done, and when the initial rollout is about to start, we will just uncomment this line. I will mention this in the confluence page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we need this to be uncommented for production as well, not just initial rollout. Don't we? We don't rely on CDC anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Yes, moving forward, we also won't have to track the changes for the column. In that case, then yeah, we can always have it ignored.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't commit the initial rollout csv yet. we need to do it only when we're ready
Yes, this was just a sample. Just 1k records to test it. |
I will add more details in the rollout plan confluence page on how to proceed. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good but let's not merge until we meet to do the initial rollout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me, just monitor very closely I guess.
This comment has been minimized.
This comment has been minimized.
🍹 The Update (preview) for dailydotdev/api/prod was successful. Resource Changes Name Type Operation
~ vpc-native-personalized-digest-deployment kubernetes:apps/v1:Deployment update
~ vpc-native-generate-search-invites-cron kubernetes:batch/v1:CronJob update
~ vpc-native-ws-deployment kubernetes:apps/v1:Deployment update
- vpc-native-api-migration-64eff7cd kubernetes:batch/v1:Job delete
+- vpc-native-debezium-props kubernetes:core/v1:Secret create-replacement
~ vpc-native-private-deployment kubernetes:apps/v1:Deployment update
~ vpc-native-check-analytics-report-cron kubernetes:batch/v1:CronJob update
~ vpc-native-deployment kubernetes:apps/v1:Deployment update
~ vpc-native-personalized-digest-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-trending-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-highlighted-views-cron kubernetes:batch/v1:CronJob update
+- vpc-native-debezium-deployment kubernetes:apps/v1:Deployment create-replacement
~ vpc-native-update-tag-recommendations-cron kubernetes:batch/v1:CronJob update
~ vpc-native-temporal-deployment kubernetes:apps/v1:Deployment update
~ vpc-native-generic-referral-reminder-cron kubernetes:batch/v1:CronJob update
~ vpc-native-bg-deployment kubernetes:apps/v1:Deployment update
~ vpc-native-clean-zombie-users-cron kubernetes:batch/v1:CronJob update
~ vpc-native-clean-zombie-images-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-source-public-threshold-cron kubernetes:batch/v1:CronJob update
~ vpc-native-sync-subscription-with-cio-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-views-cron kubernetes:batch/v1:CronJob update
~ vpc-native-hourly-notification-cron kubernetes:batch/v1:CronJob update
~ vpc-native-clean-zombie-user-companies-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-source-tag-view-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-current-streak-cron kubernetes:batch/v1:CronJob update
~ vpc-native-daily-digest-cron kubernetes:batch/v1:CronJob update
~ vpc-native-update-tags-str-cron kubernetes:batch/v1:CronJob update
~ vpc-native-calculate-top-readers-cron kubernetes:batch/v1:CronJob update
+ vpc-native-api-migration-94455505 kubernetes:batch/v1:Job create
|
CIO optimization by removing/registering users based on their current activity, along with downgrading their digest if we need to.
Once we've agreed on the processing code-wise, I will start working on the rollout plan, as it should basically use the same processing.