
High availability

High availability (HA) ensures that your PostgreSQL database remains accessible even in the event of node or pod failures. With the Percona Operator for PostgreSQL, high availability is achieved by running multiple PostgreSQL nodes in a cluster, using the Patroni framework for automated failover and PostgreSQL streaming replication for data consistency.

A PostgreSQL cluster consists of the following members:

  • A Primary node handles all write operations. The Primary continuously streams changes to its Standby nodes.
  • Read-only replicas (Standby nodes in PostgreSQL terminology) that continuously receive and replay changes from the Primary. If the Primary fails, one of the Standby nodes can be automatically promoted to become the new Primary.

Data replication

Percona Operator leverages PostgreSQL streaming replication to keep Standby nodes up-to-date.

By default, asynchronous replication is used: the Primary sends changes to Standbys but does not wait for confirmation before committing transactions. This offers better performance but carries a small risk of data loss (transactions not yet copied to a Standby could be lost in a failure).

Synchronous replication is also supported. With this replication type, the Primary waits for at least one Standby to acknowledge receipt of data before marking a transaction as committed. This minimizes the risk of data loss but can be slightly slower, since each transaction must wait for a confirmation.
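
You can check which mode each Standby currently uses by querying the pg_stat_replication view on the Primary: the sync_state column shows async for asynchronous Standbys and sync (or quorum) for synchronous ones. A minimal sketch, assuming you run psql inside the Primary Pod (the Pod and container names below are placeholders):

kubectl exec -it <primary-pod-name> -c database -- \
  psql -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"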

Minimum and recommended number of nodes for high availability:

The absolute minimum that can technically work for high availability is 2 nodes. However, this does not provide full high availability or protection against split-brain scenarios, since the loss of either node can impact availability and data safety.

The recommended number of nodes for high availability setups is 3 or more PostgreSQL nodes.
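
For example, a three-node instance set may look like the following sketch. The affinity block is optional and is shown only to illustrate spreading the PostgreSQL Pods across different Kubernetes nodes; the label key used in the selector is an assumption and may differ in your installation.

spec:
...
  instances:
    - name: instance1
      replicas: 3
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/data: postgres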

Adding nodes to a cluster

There are two ways to control the number of replicas in your HA cluster:

  1. By changing the spec.instances.replicas value
  2. By adding a new entry into spec.instances

Using spec.instances.replicas

For example, you have the following Custom Resource manifest:

spec:
...
  instances:
    - name: instance1
      replicas: 2

This will provision a cluster with two nodes: one Primary and one Replica. Add a node by increasing the replicas value in the manifest…

spec:
...
  instances:
    - name: instance1
      replicas: 3

…and applying the Custom Resource:

kubectl apply -f deploy/cr.yaml

The Operator will provision a new replica node. It will be ready and available once data is synchronized from the Primary.
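
To follow the progress, you can watch the instance Pods until the new one reaches the Running state. A sketch assuming the default labels set by the Operator (the label key may differ in your installation):

kubectl get pods \
  --selector=postgres-operator.crunchydata.com/instance-set=instance1 \
  --watch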

Using spec.instances

Each entry in instances has its own set of parameters, such as resources, storage configuration, sidecars, and so on. When you add a new entry to instances, the Operator creates additional PostgreSQL replica nodes with this new set of parameters. This can be useful in various cases:

  • Test or migrate to new hardware
  • Blue-green deployment of a new configuration
  • Try out new versions of your sidecar containers

For example, you have the following Custom Resource manifest:

spec:
...
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: old-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi

Now suppose your goal is to migrate to new disks that come with the new-ssd storage class. You can create a new instance entry. This instructs the Operator to create additional nodes with the new configuration while keeping your existing nodes intact.

spec:
...
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: old-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
    - name: instance2
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: new-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
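
Apply the updated manifest as usual. The Operator provisions the instance2 nodes on the new storage while the instance1 nodes keep running; once the new nodes have caught up, you can remove the old entry in the same way. A quick way to confirm that the new PersistentVolumeClaims use the new-ssd storage class:

kubectl apply -f deploy/cr.yaml
kubectl get pvc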

Using synchronous replication

Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one or more synchronous standby servers. When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the write-ahead log on disk of both the primary and standby server. The drawbacks of synchronous replication are increased latency and reduced throughput on writes.

You can turn on synchronous replication by customizing the patroni.dynamicConfiguration Custom Resource option.

  • Enable synchronous replication by setting the synchronous_mode option to on.
  • Use the synchronous_node_count option to set the number of replicas (PostgreSQL standby servers) that should operate in synchronous mode (the default value is 1).

The result in your deploy/cr.yaml manifest may look as follows:

...
  patroni:
    dynamicConfiguration:
      synchronous_mode: "on"
      synchronous_node_count: 2
      ...

After applying the changes as usual with the kubectl apply -f deploy/cr.yaml command, the desired number of replicas will switch to synchronous replication.

You can find more options for tuning how your database cluster operates in synchronous mode in the official Patroni documentation.
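
To verify which Standby nodes are actually running synchronously, you can run patronictl inside any database Pod; synchronous members are reported with the Sync Standby role. A sketch (the Pod name is a placeholder and the container name may differ in your installation):

kubectl exec -it <cluster-pod-name> -c database -- patronictl list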

