-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Extract reposcheduler and add support for syntactic indexes #60958
Conversation
internal/codeintel/autoindexing/internal/background/scheduler/job_scheduler.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a first round visiting 20/30 files, will do a second round later since this PR is quite big. 😬
internal/codeintel/autoindexing/internal/background/scheduler/job_scheduler.go
Outdated
Show resolved
Hide resolved
"codeintel_autoindexing_store", | ||
metrics.WithLabels("op"), | ||
metrics.WithCountHelp("Total number of method invocations."), | ||
) | ||
}) | ||
|
||
op := func(name string) *observation.Operation { | ||
return observationCtx.Operation(observation.Op{ | ||
Name: fmt.Sprintf("codeintel.reposcheduler.store.%s", name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these two names be more similar/do you know what both of these are doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, would be good to have more clarity on the test logic and the decision to set the boolean to be on by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the changes. The test cases in particular are much more easily understandable.
// The following tests simulate the passage of time (i.e. repeated scheduled invocations of the repo scheduling logic) | ||
// T is time | ||
|
||
// N.B. We use 1 hour process delay in all those tests, | ||
// which means that IF the repository id was returned, it will be put on cooldown for an hour | ||
|
||
// T = 0: No records in last index scan table, so we return all repos permitted by the limit parameter | ||
assertRepoList(t, tt.store, NewBatchOptions(1*time.Hour, true, nil, 2), now, []int{50, 51}) | ||
|
||
// T + 20 minutes: first two repositories are still on cooldown | ||
assertRepoList(t, tt.store, NewBatchOptions(1*time.Hour, true, nil, 100), now.Add(time.Minute*20), []int{52, 53}) | ||
|
||
// T + 30 minutes: all repositories are on cooldown | ||
assertRepoList(t, tt.store, NewBatchOptions(1*time.Hour, true, nil, 100), now.Add(time.Minute*30), []int(nil)) | ||
|
||
// T + 90 minutes: all repositories are visible again | ||
// Repos 50, 51 are visible because they were scheduled at (T + 0) - which is more than 1 hour ago | ||
// Repos 52, 53 are visible because they were scheduled at (T + 20) - which is more than 1 hour ago | ||
assertRepoList(t, tt.store, NewBatchOptions(1*time.Hour, true, nil, 100), now.Add(time.Minute*90), []int{50, 51, 52, 53}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, now this is much more understandable. I think part of what was throwing me off was the names. Here, the function name starts with assert
and the query this bottoms out is getRepositoriesForIndexScanQuery
. However, that query modifies state, and consequently, this assertBlah
function also modifies state, whereas normally you'd expect 'get' and 'assert' to not modify state.
I don't have specific alternate naming suggestions, but the usage of get
for a query which does inserts seems a bit unfortunate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the upsert nature of that operation is a bit confusing - also makes it hard to debug.
The insert is pushed into the query for (I believe) atomicity and to handle possible dotcom batch sizes (though I wonder how much of a problem it is to literally return only integer repository IDs, you can do thousands of them)
We can rename it to Calculate/Compute, and replace this single helper method assert with a combo of assert + a local computeReposForScheduling
or something, but I think we'd hit diminishing returns wrt readability
type RepositorySchedulingService interface { | ||
GetRepositoriesForIndexScan( | ||
ctx context.Context, | ||
_ RepositoryBatchOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Go question), why the underscore here in an interface definition?
This fixes scheduling, isolating the precise indexes from syntactic ones in the last scan table.
80aa55c
to
5962c57
Compare
Part of #59795
Fixes ENG-23423
This PR's main goal is to extract the logic that selects repositories that should be scanned for new commits-candidates for syntactic indexing.
As part of this work, I've extracted this specific functionality into a new component, called reposcheduler, to reduce the area of responsibility assumed by auto-indexing service.
This is a relatively minor refactoring, mostly invisible from the outside.
The implementation of reposcheduling logic was mostly ported over, with some changes to SQL queries.
We are using a new table, instead of a
index_type
extra field on thelsif_last_index_scan
table, because there is currently no way to introduce a composite uniqueness constraint (repository_id, index_type) in a backwards compatible way.This PR will be squashed and merged. Old commits are left in PR history to refer to previous approaches.
Test plan