fix: mitigate 429 errors in cloud function execution for validation reports #883

Draft
wants to merge 2 commits into
base: main

davidgamez
Member

Summary:
This PR adds a Cloud Tasks queue to the infrastructure. The queue handles tasks that can be executed within two hours and is limited to 8 concurrent requests. The call from the gtfs_validator_execution workflow to the process-validation-report function now goes through this queue instead of being made directly.
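
For reviewers, here is a minimal sketch of what a cap of 8 concurrent requests means in practice. It is a local simulation with an asyncio semaphore, not the Cloud Tasks implementation. `run_with_limit` and the short sleep standing in for the function call are hypothetical.

```python
import asyncio

MAX_CONCURRENT = 8  # concurrency cap stated in the summary


async def run_with_limit(n_jobs: int, limit: int = MAX_CONCURRENT) -> int:
    """Run n_jobs fake requests and return the peak number in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def job() -> None:
        nonlocal in_flight, peak
        async with semaphore:  # waits while `limit` jobs are already running
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the real HTTP call
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(n_jobs)))
    return peak


if __name__ == "__main__":
    print(asyncio.run(run_with_limit(50)))
```

However many jobs are submitted, the peak never exceeds the limit, which is the property that should keep the function from being flooded and answering with 429.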

From our AI friend

This pull request includes several changes to the infrastructure and workflow configurations to improve the handling of cloud tasks and permissions. The most important changes include adding new resources for cloud task queues, updating IAM permissions, and modifying the workflow for task enqueuing and handling.

Infrastructure Changes:

  • Added a local variable x_number_of_concurrent_instance to define the number of concurrent instances. (infra/functions-python/main.tf, R26-R27)
  • Created a dead letter queue (google_cloud_tasks_queue resource) to handle failed cloud tasks with specific rate limits and retry configurations. (infra/functions-python/main.tf, R351-R390)
  • Created a 2X rate queue (google_cloud_tasks_queue resource) with defined rate limits and retry configurations for cloud tasks. (infra/functions-python/main.tf, R351-R390)
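
To illustrate what a queue retry configuration controls, the sketch below computes a retry schedule from a minimum backoff, a maximum backoff and a number of doublings, following the behavior described in the Cloud Tasks documentation as I understand it. The function name and the example values are hypothetical. The values come from the documentation's own example, not from this PR's Terraform.

```python
def retry_intervals(min_backoff: int, max_backoff: int,
                    max_doublings: int, attempts: int) -> list[int]:
    """Return the wait (in seconds) before each of the first `attempts` retries.

    The interval starts at min_backoff, doubles max_doublings times, then
    grows linearly by min_backoff * 2**max_doublings, capped at max_backoff.
    """
    intervals = []
    interval = min_backoff
    for i in range(attempts):
        intervals.append(min(interval, max_backoff))
        if i < max_doublings:
            interval *= 2
        else:
            interval += min_backoff * 2 ** max_doublings
    return intervals


# Illustrative values only: 10s minimum, 300s maximum, 3 doublings.
print(retry_intervals(10, 300, 3, 8))
```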

IAM Permissions:

  • Added IAM permissions for the workflow service account to act as a service account user, enqueuer, and viewer for cloud tasks. (infra/workflows/main.tf, R55-R80)

Workflow Modifications:

  • Updated the gtfs_validator_execution.yml workflow to extract the environment from the project ID and use it to define the cloud task queue name. (workflows/gtfs_validator_execution.yml, R22-R24)
  • Replaced the direct database update call with a task enqueueing process, including steps to create a payload, enqueue a task, and handle task completion with retries and logging. (workflows/gtfs_validator_execution.yml, L224-R297)
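
The workflow steps above can be sketched in Python roughly as follows. This is not the workflow YAML from the PR. It assumes the environment is the last hyphen-separated segment of the project ID, and the queue prefix, location, function URL, payload fields and service account are all hypothetical. The request body follows the shape of the Cloud Tasks REST `tasks.create` call as I understand it.

```python
import base64
import json


def environment_from_project_id(project_id: str) -> str:
    """Assumption: the project ID ends with the environment, e.g. '...-dev'."""
    return project_id.rsplit("-", 1)[-1]


def queue_path(project_id: str, location: str, queue_prefix: str) -> str:
    """Build the fully qualified queue name, suffixed with the environment."""
    env = environment_from_project_id(project_id)
    return f"projects/{project_id}/locations/{location}/queues/{queue_prefix}-{env}"


def build_task_request(function_url: str, payload: dict,
                       service_account_email: str) -> dict:
    """Wrap a JSON payload in an HTTP task; the body must be base64-encoded."""
    body = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {
        "task": {
            "httpRequest": {
                "httpMethod": "POST",
                "url": function_url,
                "headers": {"Content-Type": "application/json"},
                "body": body,
                # The queue calls the function on behalf of this account.
                "oidcToken": {"serviceAccountEmail": service_account_email},
            }
        }
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    print(queue_path("example-project-dev", "us-central1", "example-queue"))
    print(build_task_request(
        "https://example.invalid/process-validation-report",
        {"dataset_id": "example-dataset"},
        "workflow-sa@example-project-dev.iam.gserviceaccount.com",
    ))
```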

Expected behavior:

Explain and/or show screenshots for how you expect the pull request to work in your testing (in case other devices exhibit different behavior).

Testing tips:

Provide tips, procedures and sample files on how to test the feature.
Testers are invited to follow the tips AND to try anything they deem relevant outside the bounds of the testing tips.

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with ./scripts/api-tests.sh to make sure you didn't break anything
  • Add or update any needed documentation to the repo
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)
