
Add new pipeline 2025-03-6 #2846

Open
wants to merge 8 commits into base: main

Conversation

StepanBrychta
Contributor

@StepanBrychta StepanBrychta commented Mar 6, 2025

What does this change?

#5912

Following on wellcomecollection/platform#5911.

How to test

How can we measure success?

After reindexing with the new pipeline, all concepts that reference the same source concept, or that have the same labels (for label-derived concepts), should share the same canonical ID.
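
One way to check this criterion against a sample of reindexed concepts is to group them by what they should be matched on and flag any group that ends up with more than one canonical ID. The following is a minimal sketch, not the pipeline's own code: the `Concept` record, its field names, and the helper functions are all hypothetical, and it assumes that label-derived concepts are the ones without a source concept.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Concept:
    # Hypothetical record shape; the field names are assumptions for illustration.
    canonical_id: str
    label: str
    source_id: Optional[str] = None  # None for label-derived concepts


def merge_key(concept: Concept) -> Tuple[str, str]:
    """Key on the source concept when present, otherwise on the label."""
    if concept.source_id is not None:
        return ("source", concept.source_id)
    return ("label", concept.label)


def find_inconsistent_ids(concepts: List[Concept]) -> Dict[Tuple[str, str], Set[str]]:
    """Return every merge key that maps to more than one canonical ID."""
    ids_by_key: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for concept in concepts:
        ids_by_key[merge_key(concept)].add(concept.canonical_id)
    # Only keys with conflicting canonical IDs are reported.
    return {key: ids for key, ids in ids_by_key.items() if len(ids) > 1}
```

An empty result from `find_inconsistent_ids` would indicate that every group in the sample shares one canonical ID; any returned key points at a group worth inspecting.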

Have we considered potential risks?

Risks should be minimal.

github-actions bot commented Mar 6, 2025

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|-------|---------|----------|-----------|--------|
| 1833  | 1582    | 86%      | 0%        | 🟢     |

New Files

No new covered files...

Modified Files

No covered modified files...

updated for commit: 3eceeb4 by action🐍

@StepanBrychta StepanBrychta force-pushed the Add-pipeline-2025-03-06 branch from b78d6d3 to f1ada4b on March 6, 2025 11:21
@agnesgaroux
Contributor

Approved for reindex 👍
I assume it'll be made ready for review once it's all said and done and the reindexing_state variables are set to false?

@kenoir kenoir force-pushed the Add-pipeline-2025-03-06 branch from f1ada4b to 9689608 on March 7, 2025 10:46
@kenoir kenoir force-pushed the Add-pipeline-2025-03-06 branch from 9689608 to 0a4a359 on March 7, 2025 11:52
@StepanBrychta StepanBrychta marked this pull request as ready for review March 10, 2025 09:08
@StepanBrychta StepanBrychta requested a review from a team as a code owner March 10, 2025 09:08
@@ -52,7 +52,7 @@ module "id_minter_lambda" {
   queue_config = {
     topic_arns = local.transformer_output_topic_arns
     max_receive_count = 3
-    maximum_concurrency = 10
+    maximum_concurrency = 30
Contributor Author

When the ID minter concurrency is set to 10, it becomes a bottleneck during a reindex. Increasing the concurrency to 30 helps, without overwhelming the database.
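
As a rough illustration of the trade-off described above, the sketch below caps the number of messages handled at once and reports the peak number in flight. It is an analogy only: it assumes `maximum_concurrency` limits how many messages the ID minter processes simultaneously, and `process_all`, `fake_mint`, and `peak_in_flight` are hypothetical names, not part of the pipeline.

```python
import asyncio


async def process_all(messages, handler, maximum_concurrency):
    """Run handler over messages with at most maximum_concurrency in flight."""
    limiter = asyncio.Semaphore(maximum_concurrency)
    in_flight = 0
    peak = 0

    async def run_one(message):
        nonlocal in_flight, peak
        async with limiter:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await handler(message)
            finally:
                in_flight -= 1

    await asyncio.gather(*(run_one(m) for m in messages))
    return peak


async def fake_mint(message):
    # Stand-in for a database round trip; this is not the real ID minter.
    await asyncio.sleep(0.01)


def peak_in_flight(message_count, maximum_concurrency):
    """Return the highest number of simultaneous handler calls observed."""
    return asyncio.run(
        process_all(range(message_count), fake_mint, maximum_concurrency)
    )
```

A higher cap lets more messages proceed in parallel, which shortens a reindex, but each in-flight call in this model represents load on the database, so the cap also bounds that load.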

3 participants