-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Index results by build id and pkg id #55
Comments
Another index that may be worthwhile to speed up GetSingleResultByPkgName, to look up failed dependencies on a package results detail page of an indirect-failed package like https://releng.netbsd.org/bulktracker/pkg/17645166 (especially once #56 is fixed so it can display more than one failed dependency at a time):
I haven't measured how much space it takes or how much it speeds up queries, though -- just guessing by code inspection. (Should measure these before implementing it.) |
Another index that may be worthwhile to speed up GetPkgsBreakingMostOthers, to show on a bulk build details page like https://releng.netbsd.org/bulktracker/build/658 which packages break most others, by narrowing the search down in advance to which builds are broken:
|
I added this index just now. Will look into the others. Thank you! |
The list of results for a particular package like https://releng.netbsd.org/bulktracker/lang/rust is bottlenecked on GetAllPkgResults, and the results_i_build_pkg index doesn't help because it wants to look up the pkg id first, not the build id first. Could add an opposite index:
But maybe it would be better to just use three separate indices on each of the two or three relevant columns -- a cursory glance suggests that will work just as well for all the queries I checked, and cost less space than both results_i_build_pkg and results_i_pkg_build combined (about 1.1 GB vs 1.2 GB for the whole database):
Might be worthwhile to systematically examine all the queries to see which ones are improved by indices -- I have been spot-checking by mousing around the web site and noticing when things are slow, and I may have missed this slowness last time around because of the caching layer. All of the queries I spot-checked (GetResultsInCategory, GetAllPkgResults, GetSingleResultByPkgName, getPkgsBrokenBy) were quick with results_i_build + results_i_pkg, and some were slower with just results_i_build_pkg or with just results_i_pkg_build. |
I measured over 21sec to get an answer back for https://releng.netbsd.org/bulktracker/www/firefox115 today. It would be nice if this were a little more responsive -- adding one of the indices I suggested would make it essentially instantaneous, rather than eating half a minute of 100% CPU to show the latest build results. |
I did some further investigation, and it looks like the indexing may not be the problem here. Computing the GetAllPkgResults query takes <30ms on my laptop with the current index as is. (And running all the queries through EXPLAIN QUERY PLAN or the sqlite3 shell's .expert command doesn't turn up any low-hanging fruit for other indices, so never mind about my other index suggestions.) I guessed maybe the loop in API.PkgResults to filter results to the most recent is slow -- it is almost certainly better to do the filtering in SQL rather than in the SQL caller, like this:
But on my laptop, it's only an improvement of ~25ms (77ms vs 49ms, 66ms vs 43ms, 64ms vs 35ms, in three trials). So, while that's nearly half the time of /json/pkgresults/N, that still might not explain it either. |
Maybe what you saw was a concurrent request? If the app was busy adding a new result to the DB at the time, that would explain the slowness. |
Can you run the bulktracker under a profiler to see where the hot spots are? The very long delays make browsing it for reviewing failures somewhat difficult. |
I can, once I'll be back from vacation. Sorry for the slowness in the meantime. |
For the
I added this index: CREATE INDEX pkg_id ON results (pkg_id); And it turns the query plan into
This helps tremendously with a query like in https://releng.netbsd.org/bulktracker/www/firefox. Maybe the composite index CREATE INDEX results_i_build_pkg ON results (build_id, pkg_id); should go and be replaces with a pure |
Looking at this:
It seems to me that an index on just |
The following index dramatically speeds up common queries:
It increases the size of the database by about 1/3, but compare, e.g., the GetResultsInCategory query (bottleneck of https://releng.netbsd.org/bulktracker/build/645/meta-pkgs):
Without index:
With index:
GetSingleResultByPkgName (bottleneck of https://releng.netbsd.org/bulktracker/pkg/17227701):
Without index:
With index:
Given the amount of CPU time mollari is spending in bulktracker, I think this couple hundred megabytes of space is worth it.
The text was updated successfully, but these errors were encountered: