[BUG] redis-cluster fails to recover after benchmarking and deleting all pods #9742

@JashBook

Describe the bug
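After loading data into a Redis Cluster with redis-benchmark and then deleting all of its pods, one shard (rediscl-wizpai-shard-shw) fails to recover. Its component is reported as Failed and both of its pods stay in CrashLoopBackOff: the redis-cluster container shuts Redis down after send_cluster_meet fails three times with "Connection refused".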

kbcli version
Kubernetes: v1.30.4-vke.4
KubeBlocks: 1.0.1
kbcli: 1.0.1

To Reproduce
Steps to reproduce the behavior:

  1. Create the cluster:
kubectl apply -f -<<EOF
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: rediscl-wizpai
  namespace: default
spec:
  terminationPolicy: Delete
  shardings:
  - name: shard
    shards: 3
    template:
      name: redis
      componentDef: redis-cluster-7-1.0.1
      serviceVersion: 7.2.4
      replicas: 2
      services:
      - name: redis-advertised
        serviceType: NodePort
        podService: true
      systemAccounts:
      - name: default
        passwordConfig:
          length: 10
          numDigits: 5
          numSymbols: 0
          letterCase: MixedCases
          seed: rediscl-wizpai
      resources:
        limits:
          cpu: 100m
          memory: 0.5Gi
        requests:
          cpu: 100m
          memory: 0.5Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName: 
            accessModes:
              - ReadWriteOnce
            resources:
               requests:
                 storage: 20Gi
EOF
  2. Run redis-benchmark against the cluster:
kubectl create -f -<<EOF
apiVersion: v1
kind: Pod
metadata:
  name: benchtest-rediscl-wizpai
  namespace: default
spec:
  containers:
    - name: test-benchmark
      imagePullPolicy: IfNotPresent
      image: apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-benchmark:latest
      args:
        - "-h"
        - "rediscl-wizpai-shard-shw-redis-advertised-0.default.svc.cluster.local"
        - "-p"
        - "6379"
        - "-a"
        - "4xI7V26Z4b"
        - "-n"
        - "5000"
        - "-c"
        - "10"
        - "--cluster"
        - "-q"
  restartPolicy: Never
EOF
  3. Delete all pods of the cluster.
  4. See the error:
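
The report does not say how the pods were deleted in step 3. One way to do it, assuming the same app.kubernetes.io/instance label selector used by the kubectl get pod command in this report, is sketched below. The delete_cluster_pods helper and the DRY_RUN switch are illustrative and not part of kbcli or KubeBlocks:

```shell
# Hypothetical helper: delete every pod that carries the cluster's instance label.
# With DRY_RUN=1 it only prints the kubectl command instead of running it.
delete_cluster_pods() {
  local cluster="$1"
  local namespace="${2:-default}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "kubectl delete pod -n $namespace -l app.kubernetes.io/instance=$cluster"
  else
    kubectl delete pod -n "$namespace" -l "app.kubernetes.io/instance=$cluster"
  fi
}

# Preview the command for the cluster from this report.
DRY_RUN=1 delete_cluster_pods rediscl-wizpai default
```

Running the helper without DRY_RUN=1 deletes the pods of all three shards at once, which is the situation the shard has to recover from.
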
kubectl get cmp -l app.kubernetes.io/instance=rediscl-wizpai
NAME                       DEFINITION              SERVICE-VERSION   STATUS    AGE
rediscl-wizpai-shard-cbv   redis-cluster-7-1.0.1   7.2.4             Running   30m
rediscl-wizpai-shard-shw   redis-cluster-7-1.0.1   7.2.4             Failed    30m
rediscl-wizpai-shard-svb   redis-cluster-7-1.0.1   7.2.4             Running   30m
kubectl get pod -l app.kubernetes.io/instance=rediscl-wizpai
NAME                         READY   STATUS             RESTARTS      AGE
rediscl-wizpai-shard-cbv-0   3/3     Running            0             18m
rediscl-wizpai-shard-cbv-1   3/3     Running            0             18m
rediscl-wizpai-shard-shw-0   2/3     CrashLoopBackOff   8 (64s ago)   18m
rediscl-wizpai-shard-shw-1   2/3     CrashLoopBackOff   8 (16s ago)   18m
rediscl-wizpai-shard-svb-0   3/3     Running            0             18m
rediscl-wizpai-shard-svb-1   3/3     Running            0             18m

Describe the crashing pod:

kubectl describe pod rediscl-wizpai-shard-shw-0
Name:             rediscl-wizpai-shard-shw-0
Namespace:        default
Priority:         0
Service Account:  kb-redis-cluster-7-1.0.1
Node:             192.168.0.250/192.168.0.250
Start Time:       Mon, 15 Sep 2025 15:43:19 +0800
Labels:           app.kubernetes.io/component=redis-cluster-7-1.0.1
                  app.kubernetes.io/instance=rediscl-wizpai
                  app.kubernetes.io/managed-by=kubeblocks
                  apps.kubeblocks.io/component-name=shard-shw
                  apps.kubeblocks.io/pod-name=rediscl-wizpai-shard-shw-0
                  apps.kubeblocks.io/release-phase=stable
                  apps.kubeblocks.io/service-version=7.2.4
                  apps.kubeblocks.io/sharding-name=shard
                  controller-revision-hash=544655fc48
                  kubeblocks.io/role=primary
                  workloads.kubeblocks.io/instance=rediscl-wizpai-shard-shw
                  workloads.kubeblocks.io/managed-by=InstanceSet
Annotations:      apps.kubeblocks.io/last-role-snapshot-version: 1757923255263080
                  vke.volcengine.com/cello-pod-evict-policy: allow
Status:           Running
IP:               192.168.0.17
IPs:
  IP:           192.168.0.17
Controlled By:  InstanceSet/rediscl-wizpai-shard-shw
Init Containers:
  init-dbctl:
    Container ID:  containerd://3e253d9d9d37ee69c483a0d339b3a5278b1ea11e72ce84eae0de6ca0a65f4363
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl:0.1.8
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl@sha256:af3024b9bf44b353b670938fb490b9f1e651f52785036895fed69a6bf62e9feb
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      -r
      /bin/dbctl
      /config
      /tools/
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:24 +0800
      Finished:     Mon, 15 Sep 2025 15:43:24 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
    Mounts:
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  init-kbagent:
    Container ID:  containerd://6ebecc27b398ebc034f8b0b4d45f3ddb66404482652a5691c2da45048d36c1ce
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools@sha256:6e0084ec006f707226b29e30b9e6e81d3a2d454152e1a6b4bf5dfdc60edf17c8
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      -r
      /bin/kbagent
      /kubeblocks/
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:25 +0800
      Finished:     Mon, 15 Sep 2025 15:43:25 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
    Mounts:
      /kubeblocks from kubeblocks (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  kbagent-worker:
    Container ID:  containerd://15b2bb1b9f9060a15433f13a33456bd3185aed34b72887b3b64f01d00096b597
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:f55712f253ffafedd9f403cee24bc6644a26e058d48a65037bc64b6d31a86349
    Port:          <none>
    Host Port:     <none>
    Command:
      /kubeblocks/kbagent
    Args:
      --server=false
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:25 +0800
      Finished:     Mon, 15 Sep 2025 15:43:25 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:        <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:    <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:       <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CURRENT_POD_NAME:          rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      CURRENT_POD_IP:             (v1:status.podIP)
      CURRENT_POD_HOST_IP:        (v1:status.hostIP)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_FQDN:               $(CURRENT_POD_NAME).$(CURRENT_SHARD_COMPONENT_NAME)-headless.$(CLUSTER_NAMESPACE).svc.cluster.local
      KB_CLUSTER_COMP_NAME:      $(CURRENT_SHARD_COMPONENT_NAME)
      REDIS_LB_ADVERTISED_HOST:  $(CURRENT_SHARD_LB_ADVERTISED_HOST)
      KB_AGENT_NAMESPACE:        default (v1:metadata.namespace)
      KB_AGENT_POD_NAME:         rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      KB_AGENT_POD_UID:           (v1:metadata.uid)
      KB_AGENT_NODE_NAME:         (v1:spec.nodeName)
      KB_AGENT_ACTION:           [{"name":"postProvision","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --post-provision  \u003e /tmp/post-provision.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"preTerminate","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --pre-terminate \u003e /tmp/pre-terminate.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"switchover","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-switchover.sh  \u003e /tmp/switchover.log 2\u003e\u00261"]}},{"name":"memberLeave","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-replica-member-leave.sh \u003e /tmp/member-leave.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"roleProbe","exec":{"command":["/tools/dbctl","--config-path","/tools/config/dbctl/components","redis","getrole"]},"timeoutSeconds":1}]
      KB_AGENT_PROBE:            [{"instance":"rediscl-wizpai-shard-shw","action":"roleProbe","periodSeconds":1}]
    Mounts:
      /data from data (rw)
      /etc/conf from redis-cluster-config (rw)
      /etc/redis from redis-conf (rw)
      /kubeblocks from kubeblocks (rw)
      /scripts from scripts (rw)
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  install-config-manager-tool:
    Container ID:  containerd://9d5ea6fe2aa1b36668f99906f4ecda4724e8c162c4449e97aad695b6afc072ee
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools@sha256:6e0084ec006f707226b29e30b9e6e81d3a2d454152e1a6b4bf5dfdc60edf17c8
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      /bin/reloader
      /kb_tools
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 15:43:27 +0800
      Finished:     Mon, 15 Sep 2025 15:43:27 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
    Mounts:
      /etc/conf from redis-cluster-config (rw)
      /kb_tools from kb-tools (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/redis-cluster-config from cm-script-redis-cluster-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
Containers:
  redis-cluster:
    Container ID:  containerd://87a35cd5d22c4863697fe255fa5be167a45a2df07709ccb91913d21162af569d
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:f55712f253ffafedd9f403cee24bc6644a26e058d48a65037bc64b6d31a86349
    Ports:         6379/TCP, 16379/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /scripts/redis-cluster-server-start.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Sep 2025 16:00:54 +0800
      Finished:     Mon, 15 Sep 2025 16:01:02 +0800
    Ready:          False
    Restart Count:  8
    Limits:
      cpu:                        100m
      memory:                     512Mi
      vke.volcengine.com/eni-ip:  1
    Requests:
      cpu:                        100m
      memory:                     512Mi
      vke.volcengine.com/eni-ip:  1
    Readiness:                    exec [sh -c /scripts/redis-ping.sh] delay=10s timeout=5s period=5s #success=1 #failure=5
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CURRENT_POD_NAME:        rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      CURRENT_POD_IP:           (v1:status.podIP)
      CURRENT_POD_HOST_IP:      (v1:status.hostIP)
      POD_FQDN:                $(CURRENT_POD_NAME).$(CURRENT_SHARD_COMPONENT_NAME)-headless.$(CLUSTER_NAMESPACE).svc.cluster.local
    Mounts:
      /data from data (rw)
      /etc/conf from redis-cluster-config (rw)
      /etc/redis from redis-conf (rw)
      /kb_tools from kb-tools (rw)
      /scripts from scripts (rw)
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  kbagent:
    Container ID:  containerd://46152cb497d10e12ec44b87c24c484e21aa48da4e2bee816162aa65701de52ef
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:f55712f253ffafedd9f403cee24bc6644a26e058d48a65037bc64b6d31a86349
    Ports:         3501/TCP, 3502/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /kubeblocks/kbagent
    Args:
      --port
      3501
      --streaming-port
      3502
    State:          Running
      Started:      Mon, 15 Sep 2025 15:43:28 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Startup:   tcp-socket :3501 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:        <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:    <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:       <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CURRENT_POD_NAME:          rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      CURRENT_POD_IP:             (v1:status.podIP)
      CURRENT_POD_HOST_IP:        (v1:status.hostIP)
      KB_HOST_IP:                 (v1:status.hostIP)
      KB_POD_FQDN:               $(CURRENT_POD_NAME).$(CURRENT_SHARD_COMPONENT_NAME)-headless.$(CLUSTER_NAMESPACE).svc.cluster.local
      KB_CLUSTER_COMP_NAME:      $(CURRENT_SHARD_COMPONENT_NAME)
      REDIS_LB_ADVERTISED_HOST:  $(CURRENT_SHARD_LB_ADVERTISED_HOST)
      KB_AGENT_NAMESPACE:        default (v1:metadata.namespace)
      KB_AGENT_POD_NAME:         rediscl-wizpai-shard-shw-0 (v1:metadata.name)
      KB_AGENT_POD_UID:           (v1:metadata.uid)
      KB_AGENT_NODE_NAME:         (v1:spec.nodeName)
      KB_AGENT_ACTION:           [{"name":"postProvision","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --post-provision  \u003e /tmp/post-provision.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"preTerminate","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-manage.sh --pre-terminate \u003e /tmp/pre-terminate.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"switchover","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-switchover.sh  \u003e /tmp/switchover.log 2\u003e\u00261"]}},{"name":"memberLeave","exec":{"command":["/bin/bash","-c","/scripts/redis-cluster-replica-member-leave.sh \u003e /tmp/member-leave.log 2\u003e\u00261"]},"retryPolicy":{"maxRetries":10}},{"name":"roleProbe","exec":{"command":["/tools/dbctl","--config-path","/tools/config/dbctl/components","redis","getrole"]},"timeoutSeconds":1}]
      KB_AGENT_PROBE:            [{"instance":"rediscl-wizpai-shard-shw","action":"roleProbe","periodSeconds":1}]
    Mounts:
      /data from data (rw)
      /etc/conf from redis-cluster-config (rw)
      /etc/redis from redis-conf (rw)
      /kb_tools from kb-tools (rw)
      /kubeblocks from kubeblocks (rw)
      /scripts from scripts (rw)
      /tools from tools (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
  config-manager:
    Container ID:  containerd://76798595c2e4f6cd804deada953e93a8e71ec8742f92eb5bb5a7c12579f25569
    Image:         apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v18
    Image ID:      apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server@sha256:02c6fbcb146638f2943b501ab184d5ebff2c11838c6057f2838d92bd0ab9ee9d
    Port:          9901/TCP
    Host Port:     0/TCP
    Command:
      env
    Args:
      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$(TOOLS_PATH)
      /kb_tools/reloader
      --log-level
      info
      --operator-update-enable
      --tcp
      9901
      --config
      /opt/config-manager/config-manager.yaml
    State:          Running
      Started:      Mon, 15 Sep 2025 15:43:28 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     0
      memory:  0
    Requests:
      cpu:     0
      memory:  0
    Environment Variables from:
      rediscl-wizpai-shard-shw-env  ConfigMap  Optional: false
    Environment:
      REDIS_DEFAULT_USER:      <set to the key 'username' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_DEFAULT_PASSWORD:  <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      REDIS_REPL_PASSWORD:     <set to the key 'password' in secret 'rediscl-wizpai-shard-shw-account-default'>  Optional: false
      CONFIG_MANAGER_POD_IP:    (v1:status.podIP)
      TOOLS_PATH:              /opt/kb-tools/reload/redis-cluster-config:/opt/config-manager:/kb_tools
    Mounts:
      /etc/conf from redis-cluster-config (rw)
      /kb_tools from kb-tools (rw)
      /opt/config-manager from config-manager-config (rw)
      /opt/kb-tools/reload/redis-cluster-config from cm-script-redis-cluster-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fdblv (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-rediscl-wizpai-shard-shw-0
    ReadOnly:   false
  redis-conf:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  tools:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kubeblocks:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  redis-cluster-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rediscl-wizpai-shard-shw-redis-cluster-config
    Optional:  false
  redis-metrics-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rediscl-wizpai-shard-shw-redis-metrics-config
    Optional:  false
  scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rediscl-wizpai-shard-shw-redis-cluster-scripts
    Optional:  false
  cm-script-redis-cluster-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-redis-reload-tools-script-rediscl-wizpai
    Optional:  false
  config-manager-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sidecar-rediscl-wizpai-shard-shw-config-manager-config
    Optional:  false
  kb-tools:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-fdblv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  19m                   default-scheduler  Successfully assigned default/rediscl-wizpai-shard-shw-0 to 192.168.0.250
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl:0.1.8" already present on machine
  Normal   Created    19m                   kubelet            Created container init-dbctl
  Normal   Started    19m                   kubelet            Started container init-dbctl
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1" already present on machine
  Normal   Created    19m                   kubelet            Created container init-kbagent
  Normal   Started    19m                   kubelet            Started container init-kbagent
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10" already present on machine
  Normal   Created    19m                   kubelet            Created container kbagent-worker
  Normal   Started    19m                   kubelet            Started container kbagent-worker
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.1" already present on machine
  Normal   Created    19m                   kubelet            Created container install-config-manager-tool
  Normal   Started    19m                   kubelet            Started container install-config-manager-tool
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v18" already present on machine
  Normal   Pulled     19m                   kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10" already present on machine
  Normal   Created    19m                   kubelet            Created container kbagent
  Normal   Started    19m                   kubelet            Started container kbagent
  Normal   Created    19m                   kubelet            Created container config-manager
  Normal   Started    19m                   kubelet            Started container config-manager
  Normal   roleProbe  19m (x10 over 2m15s)  kbagent            {"instance":"rediscl-wizpai-shard-shw","probe":"roleProbe","code":0,"output":"cHJpbWFyeQ=="}
  Normal   roleProbe  19m                   kbagent            {"instance":"rediscl-wizpai-shard-shw","probe":"roleProbe","code":0,"output":"c2Vjb25kYXJ5"}
  Normal   Pulled     18m (x2 over 19m)     kubelet            Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/redis-stack-server:7.2.0-v10" already present on machine
  Normal   Created    18m (x2 over 19m)     kubelet            Created container redis-cluster
  Normal   Started    18m (x2 over 19m)     kubelet            Started container redis-cluster
  Normal   roleProbe  18m (x1080 over 0s)   kbagent            {"instance":"rediscl-wizpai-shard-shw","probe":"roleProbe","code":-1,"output":"cHJpbWFyeQ==","message":"exit code: 1: failed"}
  Warning  BackOff    4m46s (x67 over 18m)  kubelet            Back-off restarting failed container redis-cluster in pod rediscl-wizpai-shard-shw-0_default(d16ed53d-7cd1-4cdd-a488-bacbf9180a89)
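
The roleProbe events above carry the probe result base64-encoded in the output field, while the kbagent log prints it decoded. A minimal sketch for decoding the two values seen in the events, assuming a base64 binary that accepts -d (GNU coreutils or a recent macOS). The helper name decode_probe_output is made up for illustration:

```shell
# Hypothetical helper: decode the base64 "output" field of a roleProbe event.
decode_probe_output() {
  printf '%s' "$1" | base64 -d
}

decode_probe_output 'cHJpbWFyeQ=='; echo   # prints: primary
decode_probe_output 'c2Vjb25kYXJ5'; echo   # prints: secondary
```

Decoded, the events read primary and secondary, which matches the "output": "primary" value shown in the kbagent log.
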

Logs of the failing pods:

kubectl logs rediscl-wizpai-shard-shw-0 --tail 50
Defaulted container "redis-cluster" out of: redis-cluster, kbagent, config-manager, init-dbctl (init), init-kbagent (init), kbagent-worker (init), install-config-manager-tool (init)
+ send_cluster_meet_with_retry 192.168.0.250 31267 192.168.0.226 30228 30325
+ local primary_endpoint=192.168.0.250
+ local primary_port=31267
+ local announce_ip=192.168.0.226
+ local announce_port=30228
+ local announce_bus_port=30325
++ call_func_with_retry 3 2 send_cluster_meet 192.168.0.250 31267 192.168.0.226 30228 30325
++ local max_retries=3
++ local retry_interval=2
++ local function_name=send_cluster_meet
++ shift 3
++ local retries=0
++ true
++ send_cluster_meet 192.168.0.250 31267 192.168.0.226 30228 30325
++ local primary_endpoint=192.168.0.250
++ local primary_port=31267
++ local announce_ip=192.168.0.226
++ local announce_port=30228
++ local announce_bus_port=30325
++ unset_xtrace_when_ut_mode_false
++ '[' false == false ']'
++ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.250:31267: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 1 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.250:31267: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 2 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.250:31267: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed after 3 retries.
+ send_cluster_meet_result='check and correct other primary nodes meet command: redis-cli -h 192.168.0.250 -p 31267 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.250 -p 31267 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.250 -p 31267 -a ******** cluster meet 192.168.0.226 30228 30325'
+ status=1
+ '[' 1 -ne 0 ']'
+ echo 'Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry'
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry
+ return 1
+ echo 'Failed to meet the node 192.168.0.226'
Failed to meet the node 192.168.0.226
+ shutdown_redis_server 6379
+ local service_port=6379
+ unset_xtrace_when_ut_mode_false
+ '[' false == false ']'
+ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
kubectl logs rediscl-wizpai-shard-shw-1 --tail 50
Defaulted container "redis-cluster" out of: redis-cluster, kbagent, config-manager, init-dbctl (init), init-kbagent (init), kbagent-worker (init), install-config-manager-tool (init)
+ send_cluster_meet_with_retry 192.168.0.85 30656 192.168.0.226 30228 30325
+ local primary_endpoint=192.168.0.85
+ local primary_port=30656
+ local announce_ip=192.168.0.226
+ local announce_port=30228
+ local announce_bus_port=30325
++ call_func_with_retry 3 2 send_cluster_meet 192.168.0.85 30656 192.168.0.226 30228 30325
++ local max_retries=3
++ local retry_interval=2
++ local function_name=send_cluster_meet
++ shift 3
++ local retries=0
++ true
++ send_cluster_meet 192.168.0.85 30656 192.168.0.226 30228 30325
++ local primary_endpoint=192.168.0.85
++ local primary_port=30656
++ local announce_ip=192.168.0.226
++ local announce_port=30228
++ local announce_bus_port=30325
++ unset_xtrace_when_ut_mode_false
++ '[' false == false ']'
++ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.85:30656: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 1 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.85:30656: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed in 2 times. Retrying in 2 seconds...
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 192.168.0.85:30656: Connection refused
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes
Function 'send_cluster_meet' failed after 3 retries.
+ send_cluster_meet_result='check and correct other primary nodes meet command: redis-cli -h 192.168.0.85 -p 30656 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.85 -p 30656 -a ******** cluster meet 192.168.0.226 30228 30325
check and correct other primary nodes meet command: redis-cli -h 192.168.0.85 -p 30656 -a ******** cluster meet 192.168.0.226 30228 30325'
+ status=1
+ '[' 1 -ne 0 ']'
+ echo 'Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry'
Failed to meet the node 192.168.0.226:30228 in check_and_meet_other_primary_nodes after retry
+ return 1
+ echo 'Failed to meet the node 192.168.0.226'
Failed to meet the node 192.168.0.226
+ shutdown_redis_server 6379
+ local service_port=6379
+ unset_xtrace_when_ut_mode_false
+ '[' false == false ']'
+ set +x
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
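
The xtrace output above shows the start script wrapping send_cluster_meet in call_func_with_retry with 3 attempts and a 2 second interval before it gives up and shuts Redis down. The sketch below reconstructs that retry pattern from the log lines only. It is not the actual KubeBlocks script, and the argument order (max retries, interval, function name, then the function's arguments) is inferred from the trace:

```shell
# Reconstruction for illustration only; inferred from the log, not copied
# from the KubeBlocks redis scripts.
call_func_with_retry() {
  local max_retries="$1"
  local retry_interval="$2"
  local function_name="$3"
  shift 3
  local retries=0
  while true; do
    # Run the wrapped function with the remaining arguments.
    if "$function_name" "$@"; then
      return 0
    fi
    retries=$((retries + 1))
    if [ "$retries" -ge "$max_retries" ]; then
      echo "Function '$function_name' failed after $max_retries retries." >&2
      return 1
    fi
    echo "Function '$function_name' failed in $retries times. Retrying in $retry_interval seconds..." >&2
    sleep "$retry_interval"
  done
}

# Demo with a command that always fails and no wait between attempts.
call_func_with_retry 3 0 false || echo "gave up"
```

With 3 attempts and a 2 second interval, a peer that needs longer than a few seconds to come back (for example because all pods were deleted at once) would exhaust the retries. That would be consistent with the restart loop reported here, though the report itself does not confirm the root cause.
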
kubectl logs rediscl-wizpai-shard-shw-0 --tail 10 kbagent
2025-09-15T08:04:51Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:52Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:53Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:54Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:55Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:56Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:57Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:58Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:04:59Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}
2025-09-15T08:05:00Z	INFO	send probe event	{"probe": "roleProbe", "probe": "roleProbe", "code": -1, "output": "primary", "message": "exit code: 1: failed"}

Expected behavior
After all pods are deleted, every shard should recover on its own: all components return to Running and no pod stays in CrashLoopBackOff.

