`SyncBlocks` is what the Store Gateway uses for the initial block sync and for periodic syncs at the configured sync interval.
When store gateways are sharded, Thanos uses the metadata fetcher and applies block metadata filters there to select the blocks each store gateway instance needs to download.
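To make the filtering step concrete, here is a minimal sketch of a static, hash-based shard filter. It does not reproduce Thanos' actual filter interface: `blockMeta`, `ownedByShard`, and `filterMetas` are hypothetical names, and hashing the block ID modulo a fixed shard count is only an assumed stand-in for whatever sharding rule a deployment uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// blockMeta is a hypothetical, stripped-down stand-in for a block's metadata.
type blockMeta struct {
	ID string
}

// ownedByShard reports whether a block ID hashes to the given shard.
// Assumption: ownership is hash(blockID) mod totalShards.
func ownedByShard(blockID string, shard, totalShards uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(blockID))
	return h.Sum32()%totalShards == shard
}

// filterMetas deletes, in place, every block this shard does not own,
// so only the remaining blocks would be downloaded by this instance.
func filterMetas(metas map[string]*blockMeta, shard, totalShards uint32) {
	for id := range metas {
		if !ownedByShard(id, shard, totalShards) {
			delete(metas, id)
		}
	}
}

func main() {
	const totalShards = 3
	ids := []string{"block-a", "block-b", "block-c", "block-d", "block-e"}

	for shard := uint32(0); shard < totalShards; shard++ {
		metas := make(map[string]*blockMeta, len(ids))
		for _, id := range ids {
			metas[id] = &blockMeta{ID: id}
		}
		filterMetas(metas, shard, totalShards)
		fmt.Printf("shard %d keeps %d of %d blocks\n", shard, len(metas), len(ids))
	}
}
```

Because the shard count is fixed, the result of this filter stays valid for as long as the sync takes.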
One problem we found during the block sync process is that, because blocks are added with a bounded block sync concurrency, it can take a long time to finish syncing all blocks for each store gateway, e.g. 15 to 20 minutes to sync 5000 blocks.
This is fine for Thanos because Thanos only supports static sharding, i.e. the number of shards is usually predefined.
Cortex uses the Ring to manage shards dynamically. Instances can be added to or removed from the Ring during a deployment, so the metadata files fetched at the beginning of the block sync might be outdated by the time a block is actually added, and block ownership may have changed.
Describe the solution you'd like
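```go
for i := 0; i < s.blockSyncConcurrency; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		for meta := range blockc {
			// Proposed hook: skip blocks this instance no longer owns.
			// meta is passed so the hook can identify the block.
			if shouldAdd := s.preAddBlock(meta); !shouldAdd {
				continue
			}
			if err := s.addBlock(ctx, meta); err != nil {
				continue
			}
		}
	}()
}
```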
Expose a new hook that runs before a block is added. It can be called either `preAddBlock` or `checkBlockOwnership`, and it returns true if the block is still owned by the current store gateway instance.
> This is fine for Thanos' case because Thanos only supports static sharding. e.g. number of shards are usually predefined.

If Thanos supports only static sharding and shard ownership doesn't change dynamically, could you please clarify why a `checkBlockOwnership` hook is needed?
Are there any further plans to support dynamic sharding?
It is not used in Thanos, so we can just pass a no-op in Thanos.
The hook is for downstream projects like Cortex to customize the logic here.
> Are there any further plans to support dynamic sharding?

I feel this would definitely be useful for the Thanos project, too, similar to the hash ring used in Receiver. But I am not aware of any ongoing effort for it.
Is your proposal related to a problem?
`SyncBlocks`: https://github.com/thanos-io/thanos/blob/main/pkg/store/bucket.go#L672C23-L672C33
Metadata fetcher filters: https://github.com/thanos-io/thanos/blob/main/pkg/block/fetcher.go#L616