Operator caches obsolete IP address during cluster migration #3059

@baznikin

Please answer some short questions which should help us understand your problem / question better.

  • Which image of the operator are you using? e.g. ghcr.io/zalando/postgres-operator:v1.15.1
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? DigitalOcean managed K8S
  • Are you running Postgres Operator in production? yes
  • Type of issue? Bug report

We observed the operator malfunctioning during Kubernetes node rotation: it recreated the database pod and then tried to connect to the pod's old IP address:

Before restart:

asana-automate-db-0   2/2     Running   0          19d   10.244.3.71     psql-staging-b248b8-p7zrb   <none>           <none>

After restart:

asana-automate-db-0   2/2     Running   0          16m   10.244.15.254   psql-staging-b248b8-dxqfq   <none>           <none>

Operator logs:

time="2026-03-23T16:24:16Z" level=warning msg="migrating single pod cluster \"sprint-reports/asana-automate-db\", this will cause downtime of the Postgres cluster until pod is back" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:24:16Z" level=info msg="moving pod \"sprint-reports/asana-automate-db-0\" out of the end-of-life node \"psql-staging-b248b8-p7zrb\"" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:24:16Z" level=debug msg="subscribing to pod \"sprint-reports/asana-automate-db-0\"" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:00Z" level=info msg="pod \"sprint-reports/asana-automate-db-0\" has been recreated" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:00Z" level=debug msg="unsubscribing from pod \"sprint-reports/asana-automate-db-0\" events" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:00Z" level=info msg="pod \"sprint-reports/asana-automate-db-0\" moved from node \"psql-staging-b248b8-p7zrb\" to node \"psql-staging-b248b8-dxqfq\"" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:00Z" level=debug msg="subscribing to pod \"sprint-reports/asana-automate-db-0\"" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:00Z" level=debug msg="switching over from \"asana-automate-db-0\" to \"sprint-reports/asana-automate-db-0\"" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:00Z" level=debug msg="making POST http request: http://10.244.3.71:8008/switchover" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:30Z" level=debug msg="unsubscribing from pod \"sprint-reports/asana-automate-db-0\" events" cluster-name=sprint-reports/asana-automate-db pkg=cluster
time="2026-03-23T16:25:30Z" level=error msg="could not switchover to pod \"sprint-reports/asana-automate-db-0\": could not switch over from \"asana-automate-db-0\" to \"sprint-reports/asana-automate-db-0\": could not make request: Post \"http://10.244.3.71:8008/switchover\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" cluster-name=sprint-reports/asana-automate-db pkg=cluster
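
To make the mismatch explicit: the switchover request goes to 10.244.3.71, the pod's IP before the restart, while the recreated pod has 10.244.15.254. Below is a minimal sketch of a check that pulls the switchover target out of the operator's debug log and compares it with the pod's current IP. The function names `switchover_targets` and `stale_targets` are hypothetical helpers written for this report, not part of the operator, and the regular expression assumes the debug message format shown in the logs above.

```python
import re

# Matches the host in the switchover request the operator logs at debug level.
# Assumes the message format seen in this report.
REQUEST_RE = re.compile(r"making POST http request: http://([0-9.]+):\d+/switchover")


def switchover_targets(log_lines):
    """Return the IP addresses the operator sent switchover requests to."""
    targets = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            targets.append(match.group(1))
    return targets


def stale_targets(log_lines, current_pod_ip):
    """Return logged switchover targets that differ from the pod's current IP."""
    return [ip for ip in switchover_targets(log_lines) if ip != current_pod_ip]


# Log line and pod IP copied from this report.
logs = [
    'time="2026-03-23T16:25:00Z" level=debug msg="making POST http request: '
    'http://10.244.3.71:8008/switchover" '
    'cluster-name=sprint-reports/asana-automate-db pkg=cluster',
]
print(stale_targets(logs, "10.244.15.254"))  # prints ['10.244.3.71']
```

A non-empty result means the operator addressed an IP that no longer belongs to the pod, which is what happened here.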
