Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store Gateway: Expose hook before adding blocks when sync #8029

Open
yeya24 opened this issue Dec 31, 2024 · 2 comments · May be fixed by #8036
Open

Store Gateway: Expose hook before adding blocks when sync #8029

yeya24 opened this issue Dec 31, 2024 · 2 comments · May be fixed by #8036

Comments

@yeya24
Copy link
Contributor

yeya24 commented Dec 31, 2024

Is your proposal related to a problem?

https://github.com/thanos-io/thanos/blob/main/pkg/store/bucket.go#L672C23-L672C33

SyncBlocks is what Store Gateway uses for initial block sync and periodic sync with sync interval.

When store gateways are sharded, Thanos uses the metadata fetcher and applies block metadata filters there to filter out blocks each store gateway instance needs to download.

metas, _, metaFetchErr := s.fetcher.Fetch(ctx)

https://github.com/thanos-io/thanos/blob/main/pkg/block/fetcher.go#L616

One problem we found during the block sync process is, because of the block sync concurrency, it might take a long time to finish syncing all blocks for each store gateway. e.g. 15 ~ 20 mins to sync 5000 blocks.

This is fine for Thanos' case because Thanos only supports static sharding. e.g. number of shards are usually predefined.

Cortex uses Ring to manage shards dynamically. Instances can be added or removed from the Ring dynamically during a deployment. The metadata files fetched at the beginning of the block sync might be outdated and the blocks ownership can be changed.

Describe the solution you'd like

	for i := 0; i < s.blockSyncConcurrency; i++ {
		wg.Add(1)
		go func() {
			for meta := range blockc {
                                if shouldAdd := s.preAddBlock(); !shouldAdd {
                                        continue
                                }
				if err := s.addBlock(ctx, meta); err != nil {
					continue
				}
			}
			wg.Done()
		}()
	}

Expose a new hook before adding block. It can be called either preAddBlock or checkBlockOwnership and returns true if the block is still owned by the current store gateway instance.

@milinddethe15
Copy link
Contributor

This is fine for Thanos' case because Thanos only supports static sharding. e.g. number of shards are usually predefined.

If Thanos supports static sharding and shard ownership doesnt change dynamically, could you please clarify the necessity of adding checkBlockOwnership hook?

Are there any further plans to support dynamic sharding?

@yeya24
Copy link
Contributor Author

yeya24 commented Jan 2, 2025

It is not used in Thanos so we can just pass a noop in Thanos.
The hook is for downstream projects like Cortex to customize the logic here.

Are there any further plans to support dynamic sharding?

I feel this is definitely something useful for the Thanos project, too. Similar to the hash ring used in Receiver. But I am not aware of any ongoing effort for it.

@yeya24 yeya24 linked a pull request Jan 3, 2025 that will close this issue
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants