Description
We have a number of goroutines using pipelines to send commands to the server.
Each goroutine creates a set of pipelines, one per shard in the cluster.
We took this approach to be able to send the KEYS command to each shard, since go-redis only sends it to a random node.
The Redis ClusterClient is shared between all goroutines, but the pipelines are not.
We get a data race when calling ForEachShard.
This is the latest go-redis v9.
Expected Behavior
no data race
Current Behavior
WARNING: DATA RACE
Write at 0x00c000af28c0 by goroutine 71:
github.com/redis/go-redis/v9.(*clusterNodes).GC()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:480 +0x225
github.com/redis/go-redis/v9.newClusterState.func1()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:648 +0x4f
Previous read at 0x00c000af28c0 by goroutine 75:
github.com/redis/go-redis/v9.(*ClusterClient).loadState()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:1157 +0x2d0
github.com/redis/go-redis/v9.(*ClusterClient).loadState-fm()
:1 +0x4d
github.com/redis/go-redis/v9.(*clusterStateHolder).Reload()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:786 +0x56
github.com/redis/go-redis/v9.(*clusterStateHolder).ReloadOrGet()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:823 +0x44
github.com/redis/go-redis/v9.(*ClusterClient).ForEachShard()
/home/ssm-user/go/pkg/mod/github.com/redis/go-redis/[email protected]/cluster.go:1068 +0x97
.....
Goroutine 71 (running) created at:
time.goFunc()
/usr/lib/go-1.18/src/time/sleep.go:176 +0x47
Possible Solution
I do not understand all the intricacies of go-redis, but I think go-redis reloads the cluster slots each time we call ForEachShard.
This creates significant additional network overhead.
In our case the number of shards and the cluster configuration never change dynamically.
May I suggest adding an option to disable the automatic reloading of the cluster state? This would fix the issue for now, and it would also greatly improve speed.
Steps to Reproduce
Here is the code that generates the race:
type pipe struct {
	numShards      int               // number of shards; 1 for a non-cluster client
	shardPipes     []redis.Pipeliner // pipelines corresponding to each shard node
	commonPipe     redis.Pipeliner   // pipeline that will be executed on all nodes
	shardPipesUsed bool              // true if shardPipes contain commands (KEYS commands)
}

func (r *redisClientType) Pipeline(ctx context.Context) (IRedisPipeline, error) {
	var err error
	p := pipe{}
	if r.clusterMode {
		clusterClient, ok := r.univClient.(*redis.ClusterClient)
		if !ok {
			panic("redisClientType.Pipeline: not a cluster client!")
		}
		var mu sync.Mutex
		// create one pipeline for each shard (even if only the KEYS command uses all of them)
		// !!!!!!!!!!! ForEachShard generates the data race !!!!!!!!!!!
		err = clusterClient.ForEachShard(ctx, func(ctx context.Context, shard *redis.Client) error {
			mu.Lock()
			p.shardPipes = append(p.shardPipes, shard.Pipeline())
			p.numShards++
			mu.Unlock()
			return nil
		})
	} else {
		p.numShards = 1
	}
	p.commonPipe = r.univClient.Pipeline()
	return &p, err
}
Context (Environment)
This is using the latest go-redis v9 (9.0.2, per the stack trace) with Go 1.18.