Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix client provider dist acc aggregation crashing #65

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

kacperzuk-neti
Copy link
Collaborator

@kacperzuk-neti kacperzuk-neti commented Mar 10, 2025

This PR:

  1. Bumps prisma CLI to match @prisma/client version
  2. Fixes measuring query & store time metrics in "streaming" cases
  3. Switches from truncate to delete all for data consistency (though this will need to be monitored closely by @lukasz-wal to make sure the DB can handle vacuuming this)
  4. Changes client-provider-distribution-acc to process data week by week instead of all at once to avoid blowing up DMOB DB. After changes it takes 1-3 minutes per week and it needs ~2G of temporary disk space on DMOB DB.

The biggest issue I see that this new aggregation on it's own requires ~2h. But that's a temporary until we switch to the incremental version, which will take <5 minutes.

@kacperzuk-neti kacperzuk-neti self-assigned this Mar 10, 2025
@kacperzuk-neti kacperzuk-neti force-pushed the fix-client-provider-dist-acc-perf branch 2 times, most recently from d913270 to 94454b0 Compare March 10, 2025 14:34
@@ -65,7 +65,7 @@
"eslint-plugin-prettier": "^5.2.1",
"jest": "^29.7.0",
"prettier": "^3.3.3",
"prisma": "^5.22.0",
"prisma": "^6.0.1",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes it consistent with @prisma/client

@@ -1,7 +1,7 @@
generator client {
provider = "prisma-client-js"
output = "./generated/client"
previewFeatures = ["typedSql", "omitApi", "metrics"]
previewFeatures = ["typedSql", "metrics"]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's no longer a preview feature in prisma 6

Comment on lines +22 to +37
//const latestStored =
// await prismaService.client_provider_distribution_weekly_acc.findFirst({
// select: {
// week: true,
// },
// orderBy: {
// week: 'desc',
// },
// });
//let nextWeek = latestStored
// ? DateTime.fromJSDate(latestStored.date, { zone: 'UTC' }) // we want to reprocess the last stored week, as it might've been incomplete
// : DateTime.fromSeconds(3847920 * 30 + 1598306400).startOf('week'); // nv22 start week - a.k.a. reprocess everything
Copy link
Collaborator Author

@kacperzuk-neti kacperzuk-neti Mar 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commented code is for incremental processing - it only recalculates since last processed week (inclusive, as last processed week could be incomplete or could be current ongoing week).

My plan is to first do the uncommented version that simply reprocesses everything every time. That's because we actually need to recalculate current data - it's wrong, data uniqueness was only calculated "per week" and not "up to a week". So a client could store that same data with the same SP on week 1 and week 2 and it wouldn't detect it as duplicates.

After it runs at least once, we can switch to the incremental version for performance.

@kacperzuk-neti kacperzuk-neti force-pushed the fix-client-provider-dist-acc-perf branch from 94454b0 to eb3e3e7 Compare March 10, 2025 14:41
@kacperzuk-neti kacperzuk-neti force-pushed the fix-client-provider-dist-acc-perf branch from eb3e3e7 to 2a3cb95 Compare March 10, 2025 14:46
@kacperzuk-neti kacperzuk-neti deployed to production-fidl March 10, 2025 14:48 — with GitHub Actions Active
@kacperzuk-neti kacperzuk-neti changed the title Fix client provider dist acc perf Fix client provider dist acc aggregation crashing Mar 10, 2025
@kacperzuk-neti kacperzuk-neti marked this pull request as ready for review March 10, 2025 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants