Bug in failsafe_mode ? #3055

@baznikin

Description

Please, answer some short questions which should help us to understand your problem / question better?

  • Which image of the operator are you using? ghcr.io/zalando/postgres-operator:v1.14.0
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? DigitalOcean managed K8S
  • Are you running Postgres Operator in production? yes
  • Type of issue? Bug report

The leader replica failed to connect to the K8S API, but the cluster maintained its state because failsafe_mode is enabled.

2026-03-18 15:21:36,310 ERROR: Error communicating with DCS
2026-03-18 15:21:36,314 INFO: Got response from develop-postgresql-0 http://10.244.4.163:8008/failsafe: Accepted
2026-03-18 15:21:36,315 INFO: continue to run as a leader because failsafe mode is enabled and all members are accessible

It works as expected.

But after 30 seconds the master went into "Demoting self (immediate-nolock)" mode. I see no error like "got no response from https://X.X.X.X/failsafe"; the leader just decided to demote itself.

Reading https://patroni.readthedocs.io/en/master/dcs_failsafe_mode.html, I assume the cluster should keep running indefinitely while the other replicas are available. From my perspective it looks like a bug in the operator.
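
To make the expectation concrete, here is a minimal sketch of the rule as I read it from the failsafe documentation. It is a simplification for discussion, not Patroni's actual code: the function name `should_stay_leader` and its parameters are hypothetical, and the member name is taken from the log above.

```python
from typing import Dict


def should_stay_leader(dcs_reachable: bool,
                       failsafe_mode: bool,
                       member_responses: Dict[str, bool]) -> bool:
    """Hypothetical model of the leader's decision in one HA loop iteration.

    member_responses maps each known member name to whether its
    /failsafe REST endpoint accepted the leader's request.
    """
    if dcs_reachable:
        # Normal path: the leader lock can be refreshed in the DCS.
        return True
    if not failsafe_mode:
        # Without failsafe mode, losing the DCS means demotion.
        return False
    # With failsafe mode, the leader keeps running only while
    # every known member accepts the failsafe request.
    return all(member_responses.values())


# Situation from the log: DCS unreachable, the replica answered "Accepted".
print(should_stay_leader(False, True, {"develop-postgresql-0": True}))
```

Under this reading the leader should keep running for as long as the replica keeps accepting, which is why the demotion without any failed /failsafe response looks wrong to me.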

2026-03-18 15:21:36,316 WARNING: Loop time exceeded, rescheduling immediately.
2026-03-18 15:21:37.788 UTC [29] LOG {ticks: 0, maint: 0, retry: 0}
2026-03-18 15:21:38,416 WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=0, status=None)) after connection broken by 'SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)'))': /api/v1/namespaces/production/endpoints/develop-postgresql
2026-03-18 15:21:38,433 INFO: Could not take out TTL lock
2026-03-18 15:21:38,434 INFO: Demoting self (immediate-nolock)
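
For anyone who wants to measure the gap between the failsafe message and the demotion in a longer log, a small sketch follows. It assumes Patroni's `YYYY-MM-DD HH:MM:SS,mmm` timestamp prefix as seen above; the helper names `first_timestamp` and `seconds_between` are made up for this example, and lines in other formats (such as the PostgreSQL `LOG` line) are skipped.

```python
import re
from datetime import datetime
from typing import Optional

# Matches the Patroni-style timestamp prefix, e.g. "2026-03-18 15:21:36,315 ".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def first_timestamp(log_text: str, marker: str) -> Optional[datetime]:
    """Return the timestamp of the first Patroni log line containing marker."""
    for line in log_text.splitlines():
        match = TS.match(line)
        if match and marker in line:
            return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return None


def seconds_between(log_text: str, start_marker: str, end_marker: str) -> Optional[float]:
    """Seconds from the first start_marker line to the first end_marker line."""
    start = first_timestamp(log_text, start_marker)
    end = first_timestamp(log_text, end_marker)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


if __name__ == "__main__":
    import sys
    text = sys.stdin.read()
    print(seconds_between(text, "continue to run as a leader", "Demoting self"))
```

Piping the full log from the gist into this script would show how long the leader actually stayed in failsafe mode before it demoted itself.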

More log lines - https://gist.github.com/baznikin/608ff8709f00a008723080e4388a579e
