-
Notifications
You must be signed in to change notification settings - Fork 867
feat(shard distributor): Persist Shard-Level Statistics for Load Balancing, and Add Cleanup Function #7354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
AndreasHolt
wants to merge
22
commits into
cadence-workflow:master
Choose a base branch
from
AndreasHolt:lb-shard-metrics-etcd
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+513
−33
Open
feat(shard distributor): Persist Shard-Level Statistics for Load Balancing, and Add Cleanup Function #7354
Changes from all commits
Commits
Show all changes
22 commits
Select commit
Hold shift + click to select a range
d1e858f
feat(shard distributor): add shard key helpers and metrics state
AndreasHolt 5f105bf
feat(shard distributor): persist shard metrics in etcd store
AndreasHolt d32cc72
fix(shard distributor): update LastMoveTime in the case where a shard…
AndreasHolt 9e958d9
test(shard distributor): add tests for shard metrics
AndreasHolt 3be24c4
fix(shard distributor): modify comment
AndreasHolt 1ca89cd
fix(shard distributor): add atomic check to prevent metrics race
AndreasHolt 9a48ab6
fix(shard distributor): apply shard metric updates in a second phase …
AndreasHolt 876472d
feat(shard distributor): move shard metric updates out of AssignShard…
AndreasHolt 023fc73
fix(shard distributor): keep NamespaceState revisions tied to assignm…
AndreasHolt 3b3b8db
refactor(shard distributor): use shard cache and clock for preparing …
AndreasHolt 81143d8
test(shard distributor): BuildShardPrefix, BuildShardKey, ParseShardKey
AndreasHolt 3035006
feat(shard distributor): simplify shard metrics updates
AndreasHolt 266d00e
refactor(shard distributor): ShardMetrics renamed to ShardStatistics.…
AndreasHolt c5dee7f
test(shard distributor): small changes to shard key tests s.t. they l…
AndreasHolt b333e7d
fix(shard distributor): no longer check for key type ShardStatisticsK…
AndreasHolt ac4b237
refactor(shard distributor): found a place where I forgot to rename t…
AndreasHolt 24888ac
fix(shard distributor): move non-exported helpers to end of file to f…
AndreasHolt 63d060b
feat(shard distributor): clean up the shard statistics
AndreasHolt 588a8f4
test(shard distributor): add test case for when shard stats are deleted
AndreasHolt 2c88990
fix(shard distributor): add mapping (new metric)
AndreasHolt 2642080
feat(shard distributor): retain shard stats while shards are within h…
AndreasHolt 3931fea
refactor(shard distributor): fetch namespace state once per cleanup tick
AndreasHolt File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,6 +11,7 @@ const ( | |
| ExecutorReportedShardsKey = "reported_shards" | ||
| ExecutorAssignedStateKey = "assigned_state" | ||
| ShardAssignedKey = "assigned" | ||
| ShardStatisticsKey = "statistics" | ||
| ExecutorMetadataKey = "metadata" | ||
| ) | ||
|
|
||
|
|
@@ -70,6 +71,30 @@ func ParseExecutorKey(prefix string, namespace, key string) (executorID, keyType | |
| return parts[0], parts[1], nil | ||
| } | ||
|
|
||
| func BuildShardPrefix(prefix string, namespace string) string { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We need to have tests for BuildShardPrefix, BuildShardKey and ParseShardKey :) |
||
| return fmt.Sprintf("%s/shards/", BuildNamespacePrefix(prefix, namespace)) | ||
| } | ||
|
|
||
| func BuildShardKey(prefix string, namespace, shardID, keyType string) (string, error) { | ||
| if keyType != ShardStatisticsKey { | ||
| return "", fmt.Errorf("invalid shard key type: %s", keyType) | ||
| } | ||
| return fmt.Sprintf("%s%s/%s", BuildShardPrefix(prefix, namespace), shardID, keyType), nil | ||
| } | ||
|
|
||
| func ParseShardKey(prefix string, namespace, key string) (shardID, keyType string, err error) { | ||
| prefix = BuildShardPrefix(prefix, namespace) | ||
| if !strings.HasPrefix(key, prefix) { | ||
| return "", "", fmt.Errorf("key '%s' does not have expected prefix '%s'", key, prefix) | ||
| } | ||
| remainder := strings.TrimPrefix(key, prefix) | ||
| parts := strings.Split(remainder, "/") | ||
| if len(parts) != 2 { | ||
| return "", "", fmt.Errorf("unexpected shard key format: %s", key) | ||
| } | ||
| return parts[0], parts[1], nil | ||
| } | ||
|
|
||
| func BuildMetadataKey(prefix string, namespace, executorID, metadataKey string) string { | ||
| metadataKeyPrefix, err := BuildExecutorKey(prefix, namespace, executorID, ExecutorMetadataKey) | ||
| if err != nil { | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if the executor is in draining state? are we fine with losing the statics for that? it is covered from the following case where the shard is not in a done state right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ShardStatus != DONE check in cleanupStaleShardStats keeps shard stats alive while a draining executor still reports them, and the TTL "grace period" only removes them after the shard has been marked DONE and stayed idle for that whole TTL window.