Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review query performance #3571

Open
5 tasks
finestructure opened this issue Dec 30, 2024 · 5 comments
Open
5 tasks

Review query performance #3571

finestructure opened this issue Dec 30, 2024 · 5 comments

Comments

@finestructure
Copy link
Member

finestructure commented Dec 30, 2024

The incident on Dec 30 is a reminder to review query performance.

The package page in particular is powered by a number of queries which aren't trivial and could perhaps be optimised, for example by backing them with a materialised view.

  • see if we can obtain some info what queries were active during the incident
  • enumerate sitemap queries
  • enumerate package page queries
  • enumerate doc page queries
  • investigate opportunities to back queries with materialised views
@finestructure
Copy link
Member Author

Running SELECT pid, datname, usename, query FROM pg_stat_activity showed lots of active queries starting with:

SELECT "packages"."id" AS "packages_id", "packages"."created_at" AS "packages_created_at", "packages"."updated_at" AS "packages_updated_at", "packages"."platform_compatibility" AS "packages_platform_compatibility", "packages"."processing_stage" AS "packages_processing_stage", "packages"."score" AS "packages_score", "packages"."score_details" AS "packages_score_details", "packages"."status" AS "packages_status", "packages"."url" AS "packages_url", "repositories"."id" AS "repositories_id", "repositories"."created_at" AS "repositories_created_at", "repositories"."updated_at" AS "repositories_updated_at", "repositories"."package_id" AS "repositories_package_id", "repositories"."authors" AS "repositories_authors", "repositories"."commit_count" AS "repositories_commit_count", "repositories"."default_branch" AS "repositories_default_branch", "repositories"."first_commit_date" AS "repositories_first_commit_date", "repositories"."forked_from" AS "repositories_forked_from", "repositories"."forks" AS "repositories_fork

Unfortunately, the query is truncated but I'm pretty sure we can narrow this down and it's almost certainly a package page query.

We should also be able to get some request logs from CF to back this up.

@finestructure
Copy link
Member Author

The top requests were to sitemap urls, so these queries are also worth reviewing.

@finestructure
Copy link
Member Author

finestructure commented Dec 30, 2024

I've set pg_stat_statements.track to TOP on dev to see if that will help us enumerate the top queries more easily.

Edit: Not sure if this can actually be made to work in Azure's DB:

ERROR:  pg_stat_statements must be loaded via shared_preload_libraries

@finestructure
Copy link
Member Author

It's actually already loaded, not sure why this error is coming up:

Screenshot 2024-12-30 at 10 51 45

@finestructure
Copy link
Member Author

It's working but it needs to be run via an admin user:

Screenshot 2024-12-30 at 10 54 34

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant