
Data race in ForEachShard #2621

Open

Description

@bibicu123

We have a bunch of goroutines using pipelines to send commands to the server.
Each routine creates a set of pipelines, one for each shard in the cluster.
We took this approach to be able to send the KEYS command to each shard, since go-redis only sends it to a random node.
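
For illustration, this is roughly the per-shard KEYS idea we are after (a simplified sketch; the collectAllKeys name and the key pattern are just for the example, and in practice the KEYS commands are queued on per-shard pipelines as in the repro below):

// Sketch only; assumes the usual "context", "sync" and
// "github.com/redis/go-redis/v9" imports.
func collectAllKeys(ctx context.Context, cc *redis.ClusterClient, pattern string) ([]string, error) {
	var (
		mu   sync.Mutex
		keys []string
	)
	// ForEachShard runs the callback concurrently for each shard node,
	// hence the mutex around the shared slice.
	err := cc.ForEachShard(ctx, func(ctx context.Context, shard *redis.Client) error {
		res, err := shard.Keys(ctx, pattern).Result()
		if err != nil {
			return err
		}
		mu.Lock()
		keys = append(keys, res...)
		mu.Unlock()
		return nil
	})
	return keys, err
}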

The Redis ClusterClient is shared between all routines, but the pipelines are not.

We get a data race when calling ForEachShard.

This is with the latest go-redis v9.

Expected Behavior

no data race

Current Behavior

WARNING: DATA RACE
Write at 0x00c000af28c0 by goroutine 71:
github.com/redis/go-redis/v9.(*clusterNodes).GC()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:480 +0x225
github.com/redis/go-redis/v9.newClusterState.func1()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:648 +0x4f

Previous read at 0x00c000af28c0 by goroutine 75:
github.com/redis/go-redis/v9.(*ClusterClient).loadState()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:1157 +0x2d0
github.com/redis/go-redis/v9.(*ClusterClient).loadState-fm()
:1 +0x4d
github.com/redis/go-redis/v9.(*clusterStateHolder).Reload()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:786 +0x56
github.com/redis/go-redis/v9.(*clusterStateHolder).ReloadOrGet()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:823 +0x44
github.com/redis/go-redis/v9.(*ClusterClient).ForEachShard()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:1068 +0x97
.....

Goroutine 71 (running) created at:
time.goFunc()
/usr/lib/go-1.18/src/time/sleep.go:176 +0x47

Possible Solution

I do not understand all the intricacies of go-redis, but I think go-redis reloads the cluster slots each time we call ForEachShard.
This creates additional (and significant) network overhead.

In our case the number of shards and the cluster configuration never change dynamically.

May I suggest adding an option to disable reloading the cluster state automatically? This would fix this issue for now, and it would also greatly improve speed.
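
Until such an option exists, a workaround we are considering (a sketch of our own code, assuming the usual errors/sync/go-redis imports; the shardClients method and the shardsOnce/shards/shardsErr fields are additions to our redisClientType, not part of go-redis) is to resolve the shard clients once and reuse them, since our topology never changes. This does not remove the underlying race inside go-redis, but it avoids calling ForEachShard, and the cluster-state reload behind it, on every Pipeline() call:

// Assumes fields added to our own redisClientType:
//     shardsOnce sync.Once
//     shards     []*redis.Client
//     shardsErr  error
func (r *redisClientType) shardClients(ctx context.Context) ([]*redis.Client, error) {
	r.shardsOnce.Do(func() {
		cc, ok := r.univClient.(*redis.ClusterClient)
		if !ok {
			r.shardsErr = errors.New("redisClientType.shardClients: not a cluster client")
			return
		}
		var mu sync.Mutex
		r.shardsErr = cc.ForEachShard(ctx, func(ctx context.Context, shard *redis.Client) error {
			mu.Lock()
			r.shards = append(r.shards, shard)
			mu.Unlock()
			return nil
		})
	})
	return r.shards, r.shardsErr
}

Pipeline() would then build p.shardPipes by calling shard.Pipeline() on each cached client instead of invoking ForEachShard every time.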

Steps to Reproduce

Here is the code that generates the race:

type pipe struct {
	numShards      int               // number of shards; 1 for a non-cluster client
	shardPipes     []redis.Pipeliner // pipelines corresponding to each shard node
	commonPipe     redis.Pipeliner   // pipeline that will be executed on all nodes
	shardPipesUsed bool              // true if shardPipes contain commands (KEYS commands)
}


func (r *redisClientType) Pipeline(ctx context.Context) (IRedisPipeline, error) {
	var err error

	p := pipe{}

	if r.clusterMode {
		clusterClient, ok := r.univClient.(*redis.ClusterClient)
		if !ok {
			panic("redisClientType.Pipeline: not a cluster client!")
		}

		var mu sync.Mutex

		// create one pipeline for each shard (even if only the KEYS command uses all of them)
		// !!!!!!!!!!! ForEachShard generates the data race !!!!!!!!!!!
		err = clusterClient.ForEachShard(ctx, func(ctx context.Context, shard *redis.Client) error {
			mu.Lock()
			p.shardPipes = append(p.shardPipes, shard.Pipeline())
			p.numShards++
			mu.Unlock()
			return nil
		})
	} else {
		p.numShards = 1
	}

	p.commonPipe = r.univClient.Pipeline()

	return &p, err
}
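
Pipeline is called concurrently from many goroutines, roughly like this (simplified driver; the goroutine count is arbitrary, r and ctx are as above, and log/sync are assumed to be imported; run with -race):

var wg sync.WaitGroup
for i := 0; i < 50; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		p, err := r.Pipeline(ctx)
		if err != nil {
			log.Println("Pipeline:", err)
			return
		}
		_ = p // queue commands and Exec here
	}()
}
wg.Wait()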

Context (Environment)

This is using the latest go-redis v9 (v9.0.3, per the paths in the stack trace above).
