diff --git a/docs/_images/PPG_links.png b/docs/_images/PPG_links.png deleted file mode 100644 index 9ba9add26..000000000 Binary files a/docs/_images/PPG_links.png and /dev/null differ diff --git a/docs/_images/diagrams/HA-basic.svg b/docs/_images/diagrams/HA-basic.svg new file mode 100644 index 000000000..d47d87be8 --- /dev/null +++ b/docs/_images/diagrams/HA-basic.svg @@ -0,0 +1,4 @@ + + + +
Database layer
Primary
Replica 1
Stream Replication
PostgreSQL
Patroni
                 ETCD
PostgreSQL
Patroni
                   ETCD
           Read Only   
                  Read / write
Application
ETCD Witness
                    ETCD
pgBackRest
(Backup Server)
\ No newline at end of file diff --git a/docs/_images/diagrams/ha-architecture-patroni.png b/docs/_images/diagrams/ha-architecture-patroni.png deleted file mode 100644 index 0f18b0d61..000000000 Binary files a/docs/_images/diagrams/ha-architecture-patroni.png and /dev/null differ diff --git a/docs/_images/diagrams/ha-overview-backup.svg b/docs/_images/diagrams/ha-overview-backup.svg new file mode 100644 index 000000000..03b06cda1 --- /dev/null +++ b/docs/_images/diagrams/ha-overview-backup.svg @@ -0,0 +1,3 @@ + + +
PostgreSQL 
Primary
PostgreSQL 
Replicas
Replication
Failover
Client
Load balancing proxy
Backup tool
\ No newline at end of file diff --git a/docs/_images/diagrams/ha-overview-failover.svg b/docs/_images/diagrams/ha-overview-failover.svg new file mode 100644 index 000000000..ea77da45c --- /dev/null +++ b/docs/_images/diagrams/ha-overview-failover.svg @@ -0,0 +1,3 @@ + + +
PostgreSQL 
Primary
PostgreSQL 
Replicas
Replication
Failover
\ No newline at end of file diff --git a/docs/_images/diagrams/ha-overview-load-balancer.svg b/docs/_images/diagrams/ha-overview-load-balancer.svg new file mode 100644 index 000000000..318ede1ed --- /dev/null +++ b/docs/_images/diagrams/ha-overview-load-balancer.svg @@ -0,0 +1,3 @@ + + +
PostgreSQL 
Primary
PostgreSQL 
Replicas
Replication
Failover
Client
Load balancing proxy
\ No newline at end of file diff --git a/docs/_images/diagrams/ha-overview-replication.svg b/docs/_images/diagrams/ha-overview-replication.svg new file mode 100644 index 000000000..114320498 --- /dev/null +++ b/docs/_images/diagrams/ha-overview-replication.svg @@ -0,0 +1,4 @@ + + + +
PostgreSQL 
Primary
PostgreSQL 
Replicas
Replication
\ No newline at end of file diff --git a/docs/_images/diagrams/ha-recommended.svg b/docs/_images/diagrams/ha-recommended.svg new file mode 100644 index 000000000..4fe393fa6 --- /dev/null +++ b/docs/_images/diagrams/ha-recommended.svg @@ -0,0 +1,3 @@ + + +
Proxy Layer
HAProxy-Node2
HAProxy-Node1
Database layer
DCS Layer
ETCD-Node2
ETCD-Node3
ETCD-Node1
Replica 2
Primary
Replica 1
Stream Replication
PostgreSQL
Patroni
ETCD
PMM Client
PMM Server
pgBackRest
(Backup Server)
Stream Replication
PostgreSQL
Patroni
ETCD
PMM Client
PostgreSQL
Patroni
ETCD
PMM Client
   Read/write   
   Read  Only
Application
PMM Client
PMM Client
PMM Client
PMM Client
PMM Client
HAProxy-Node3
PMM Client
watchdog
watchdog
watchdog
\ No newline at end of file diff --git a/docs/_images/diagrams/patroni-architecture.png b/docs/_images/diagrams/patroni-architecture.png deleted file mode 100644 index 20729d3c4..000000000 Binary files a/docs/_images/diagrams/patroni-architecture.png and /dev/null differ diff --git a/docs/css/design.css b/docs/css/design.css index f4861d6db..eeca956a9 100644 --- a/docs/css/design.css +++ b/docs/css/design.css @@ -144,6 +144,16 @@ --md-typeset-table-color: hsla(var(--md-hue),0%,100%,0.25) } +[data-md-color-scheme="percona-light"] img[src$="#only-dark"], +[data-md-color-scheme="percona-light"] img[src$="#gh-dark-mode-only"] { + display: none; /* Hide dark images in light mode */ +} + +[data-md-color-scheme="percona-dark"] img[src$="#only-light"], +[data-md-color-scheme="percona-dark"] img[src$="#gh-light-mode-only"] { + display: none; /* Hide light images in dark mode */ +} + /* Typography */ .md-typeset { diff --git a/docs/docker.md b/docs/docker.md index 0bb3dbb45..897aae0a4 100644 --- a/docs/docker.md +++ b/docs/docker.md @@ -160,6 +160,71 @@ Follow these steps to enable `pg_tde`: CREATE TABLE ( ) USING tde_heap; ``` +## Enable encryption + +Percona Distribution for PostgreSQL Docker image includes the `pg_tde` extension to provide data encryption. You must explicitly enable it when you start the container. + +Here's how to do this: +{.power-number} + +1. Start the container with the `ENABLE_PG_TDE=1` environment variable: + + ```{.bash data-prompt="$"} + $ docker run --name container-name -e ENABLE_PG_TDE=1 -e POSTGRES_PASSWORD=sUpers3cRet -d percona/percona-distribution-postgresql:{{dockertag}}-multi + ``` + + where: + + * `container-name` is the name you assign to your container + * `ENABLE_PG_TDE=1` adds the `pg_tde` to the `shared_preload_libraries` and enables the custom storage manager + * `POSTGRES_PASSWORD` is the superuser password + + +2. Connect to the container and start the interactive `psql` session: + + ```{.bash data-prompt="$"} + $ docker exec -it container-name psql + ``` + + ??? example "Sample output" + + ```{.text .no-copy} + psql ({{dockertag}} - Percona Server for PostgreSQL {{dockertag}}.1) + Type "help" for help. + + postgres=# + ``` + +3. Create the extension in the database where you want to encrypt data. This requires superuser privileges. + + ```sql + CREATE EXTENSION pg_tde; + ``` + +4. Configure a key provider. In this sample configuration intended for testing and development purpose, we use a local keyring provider. + + For production use, set up an external key management store and configure an external key provider. Refer to the [Setup :octicons-link-external-16:](https://percona.github.io/pg_tde/main/setup.html#key-provider-configuration) chapter in the `pg_tde` documentation. + + :material-information: Warning: This example is for testing purposes only: + + ```sql + SELECT pg_tde_add_key_provider_file('file-keyring','/tmp/pg_tde_test_local_keyring.per'); + ``` + +5. Add a principal key + + ```sql + SELECT pg_tde_set_principal_key('test-db-master-key','file-keyring'); + ``` + + The key is autogenerated. You are ready to use data encryption. + +6. Create a table with encryption enabled. 
Pass the `USING tde_heap` clause to the `CREATE TABLE` command: + + ```sql + CREATE TABLE ( ) USING tde_heap; + ``` + ## Enable `pg_stat_monitor` To enable the `pg_stat_monitor` extension after launching the container, do the following: diff --git a/docs/minor-upgrade.md b/docs/minor-upgrade.md index 22cecad60..4b5b32231 100644 --- a/docs/minor-upgrade.md +++ b/docs/minor-upgrade.md @@ -40,6 +40,24 @@ Minor upgrade of Percona Distribution for PostgreSQL includes the following step ## Procedure +## Before you start + +1. [Update the `percona-release` :octicons-link-external-16:](https://www.percona.com/doc/percona-repo-config/percona-release.html#updating-percona-release-to-the-latest-version) utility to the latest version. This is required to install the new version packages of Percona Distribution for PostgreSQL. + +2. Starting with version 17.2.1, `pg_tde` is part of the Percona Server for PostgreSQL package. If you installed `pg_tde` from its dedicated package, do the following to avoid conflicts during the upgrade: + + * Drop the extension using the `DROP EXTENSION` with `CASCADE` command. + + :material-alert: Warning: The use of the `CASCADE` parameter deletes all tables that were created in the database with `pg_tde` enabled and also all dependencies upon the encrypted table (e.g. foreign keys in a non-encrypted table used in the encrypted one). + + ```sql + DROP EXTENSION pg_tde CASCADE + ``` + + * Uninstall the `percona-postgresql-17-pg-tde` package for Debian/Ubuntu or the `percona-pg_tde_17` package for RHEL and derivatives. + +## Procedure + Run **all** commands as root or via **sudo**: {.power-number} diff --git a/docs/release-notes-v17.2.md b/docs/release-notes-v17.2.md index 0bbb621a9..6d97890d0 100644 --- a/docs/release-notes-v17.2.md +++ b/docs/release-notes-v17.2.md @@ -10,7 +10,9 @@ This release of Percona Distribution for PostgreSQL is based on Percona Server f * Percona Distribution for PostgreSQL includes [`pgvector` :octicons-link-external-16:](https://github.com/pgvector/pgvector) - an open source extension that enables you to use PostgreSQL as a vector database. It brings vector data type and vector operations (mainly similarity search) to PostgreSQL. You can install `pgvector` from repositories, tarballs, and it is also available as a Docker image. * The new version of `pg_tde` extension features index encryption and the support of storing encryption keys in KMIP-compatible servers. These feature come with the Beta version of the `tde_heap` access method. Learn more in the [pg_tde release notes :octicons-link-external-16:](https://docs.percona.com/pg-tde/release-notes/beta2.html) * The `pg_tde` extension itself is now a part of the Percona Server for PostgreSQL server package and a Docker image. If you installed the extension before, from its individual package, uninstall it first to avoid conflicts during the upgrade. See the [Minor Upgrade of Percona Distribution for PostgreSQL](minor-upgrade.md#before-you-start) for details. -For how to run `pg_tde` in Docker, check the [Enable encryption](docker.md#enable-encryption) section in the documentation. + + For how to run `pg_tde` in Docker, check the [Enable encryption](docker.md#enable-encryption) section in the documentation. + * Percona Distribution for PostgreSQL now statically links `llvmjit.so` library for Red Hat Enterprise Linux 8 and 9 and compatible derivatives. This resolves the conflict between the LLVM version required by Percona Distribution for PostgreSQL and the one supplied with the operating system. 
This also enables you to use the LLVM modules supplied with the operating system for other software you require. * Percona Monitoring and Management (PMM) 2.43.2 is now compatible with `pg_stat_monitor` 2.1.0 to monitor PostgreSQL 17. diff --git a/docs/solutions/dr-pgbackrest-setup.md b/docs/solutions/dr-pgbackrest-setup.md index af548538e..74cba57e0 100644 --- a/docs/solutions/dr-pgbackrest-setup.md +++ b/docs/solutions/dr-pgbackrest-setup.md @@ -239,7 +239,7 @@ log-level-console=info log-level-file=debug [prod_backup] -pg1-path=/var/lib/postgresql/14/main +pg1-path=/var/lib/postgresql/{{pgversion}}/main ``` diff --git a/docs/solutions/etcd-info.md b/docs/solutions/etcd-info.md new file mode 100644 index 000000000..520a08e51 --- /dev/null +++ b/docs/solutions/etcd-info.md @@ -0,0 +1,64 @@ +# ETCD + +`etcd` is one of the key components in high availability architecture, therefore, it's important to understand it. + +`etcd` is a distributed key-value consensus store that helps applications store and manage cluster configuration data and perform distributed coordination of a PostgreSQL cluster. + +`etcd` runs as a cluster of nodes that communicate with each other to maintain a consistent state. The primary node in the cluster is called the "leader", and the remaining nodes are the "followers". + +## How `etcd` works + +Each node in the cluster stores data in a structured format and keeps a copy of the same data to ensure redundancy and fault tolerance. When you write data to `etcd`, the change is sent to the leader node, which then replicates it to the other nodes in the cluster. This ensures that all nodes remain synchronized and maintain data consistency. + +When a client wants to change data, it sends the request to the leader. The leader accepts the writes and proposes this change to the followers. The followers vote on the proposal. If a majority of followers agree (including the leader), the change is committed, ensuring consistency. The leader then confirms the change to the client. + +This flow corresponds to the Raft consensus algorithm, based on which `etcd` works. Read morea bout it the [`ectd` Raft consensus](#etcd-raft-consensus) section. + +## Leader election + +An `etcd` cluster can have only one leader node at a time. The leader is responsible for receiving client requests, proposing changes, and ensuring they are replicated to the followers. When an `etcd` cluster starts, or if the current leader fails, the nodes hold an election to choose a new leader. Each node waits for a random amount of time before sending a vote request to other nodes, and the first node to get a majority of votes becomes the new leader. The cluster remains available as long as a majority of nodes (quorum) are still running. + +### How many members to have in a cluster + +The recommended approach is to deploy an odd-sized cluster (e.g., 3, 5, or 7 nodes). The odd number of nodes ensures that there is always a majority of nodes available to make decisions and keep the cluster running smoothly. This majority is crucial for maintaining consistency and availability, even if one node fails. For a cluster with `n` members, the majority is `(n/2)+1`. + +To better illustrate this concept, take an example of clusters with 3 nodes and 4 nodes. In a 3-node cluster, if one node fails, the remaining 2 nodes still form a majority (2 out of 3), and the cluster can continue to operate. In a 4-node cluster, if one node fails, there are only 3 nodes left, which is not enough to form a majority (3 out of 4). 
The cluster stops functioning. + +## `etcd` Raft consensus + +The heart of `etcd`'s reliability is the Raft consensus algorithm. Raft ensures that all nodes in the cluster agree on the same data. This ensures a consistent view of the data, even if some nodes are unavailable or experiencing network issues. + +An example of Raft's role in `etcd` is the situation when there is no majority in the cluster. If a majority of nodes can't communicate (for example, due to network partitions), no new leader can be elected, and no new changes can be committed. This prevents the system from getting into an inconsistent state. The system waits for the network to heal and a majority to be re-established. This is crucial for data integrity. + +You can also check [this resource :octicons-link-external-16:](https://thesecretlivesofdata.com/raft/) to learn more about Raft and understand it better. + +## `etcd` logs and performance considerations + +`etcd` keeps a detailed log of every change made to the data. These logs are essential for several reasons: they ensure consistency and fault tolerance, support leader elections and auditing, and keep the state consistent across nodes. For example, if a node fails, it can use the logs to catch up with the other nodes and restore its data. The logs also provide a history of all changes, which can be useful for debugging and security analysis if needed. + +### Slow disk performance + +`etcd` is very sensitive to disk I/O performance. Writing to the logs is a frequent operation and will be slow if the disk is slow. This can lead to timeouts, delayed consensus, instability, and even data loss. In extreme cases, slow disk performance can cause a leader to fail health checks, triggering unnecessary leader elections. Always use fast, reliable storage for `etcd`. + +### Slow or high-latency networks + +Communication between `etcd` nodes is critical. A slow or unreliable network can delay data replication and increase the risk of stale reads. It can also trigger premature timeouts, causing leader elections to happen more frequently and, in some cases, delaying them, which impacts performance and stability. Also keep in mind that if nodes cannot reach each other in a timely manner, the cluster may lose quorum and become unavailable. + +## `etcd` locks + +`etcd` provides a distributed locking mechanism, which helps applications coordinate actions across multiple nodes and control access to shared resources, preventing conflicts. Locks ensure that only one process can hold a resource at a time, avoiding race conditions and inconsistencies. Patroni is an example of an application that uses `etcd` locks for primary election control in the PostgreSQL cluster. + +### Deployment considerations + +Running `etcd` on separate hosts has the following benefits: + +* Both PostgreSQL and `etcd` are highly dependent on I/O, so running them on separate hosts improves performance. + +* Higher resilience. If one or even two PostgreSQL nodes crash, the `etcd` cluster remains healthy and can trigger a new primary election. + +* Scalability and better performance. You can scale the `etcd` cluster separately from PostgreSQL based on the load and thus achieve better performance. + +Note that separate deployment increases the complexity of the infrastructure and requires additional maintenance effort. 
Also, pay close attention to the network configuration to minimize the latency of communication between the `etcd` and Patroni nodes. + +If a separate dedicated host for `etcd` is not a viable option, you can use the same host machines that run Patroni and PostgreSQL. + \ No newline at end of file diff --git a/docs/solutions/ha-architecture.md b/docs/solutions/ha-architecture.md new file mode 100644 index 000000000..a2c67d9d1 --- /dev/null +++ b/docs/solutions/ha-architecture.md @@ -0,0 +1,60 @@ +# Architecture + +In the [overview of high availability](high-availability.md), we discussed the components required to achieve high availability. + +Our recommended minimalistic approach to a highly-available deployment is a three-node PostgreSQL cluster with cluster management and failover mechanisms, a load balancer, and a backup/restore solution. + +The following diagram shows this architecture, including all additional components. If you are considering a simple and cost-effective setup, refer to the [Bare-minimum architecture](#bare-minimum-architecture) section. + +![Architecture of the three-node, single primary PostgreSQL cluster](../_images/diagrams/ha-recommended.svg) + +## Components + +The components in this architecture are: + +### Database layer + +- PostgreSQL nodes bearing the user data. + +- Patroni - an automatic failover system. Patroni requires and uses the Distributed Configuration Store to store the cluster configuration, health and status. + +- watchdog - a mechanism that resets the whole system when it does not receive a keepalive heartbeat within a specified timeframe. This adds an additional layer of fail-safety in case the usual Patroni split-brain protection mechanisms fail. + +### DCS layer + +- etcd - a Distributed Configuration Store. It stores the state of the PostgreSQL cluster and handles the election of a new primary. An odd number of nodes (minimum three) is required to always have a majority to agree on updates to the cluster state. + +### Load balancing layer + +- HAProxy - the load balancer and the single point of entry to the cluster for client applications. A minimum of two instances is required for redundancy. + +- keepalived - a high-availability and failover solution for HAProxy. It provides a virtual IP (VIP) address for HAProxy and prevents its single point of failure by failing over the services to the operational instance. + +- (Optional) pgbouncer - a connection pooler for PostgreSQL. The aim of pgbouncer is to lower the performance impact of opening new connections to PostgreSQL. + +### Services layer + +- pgBackRest - the backup and restore solution for PostgreSQL. It should also be redundant to eliminate a single point of failure. + +- (Optional) Percona Monitoring and Management (PMM) - the solution to monitor the health of your cluster. + +## Bare-minimum architecture + +There may be constraints that prevent you from using the [reference architecture with all additional components](#architecture), such as the number of available servers or the cost of additional hardware. You can still achieve high availability with a minimum of two database nodes and three `etcd` instances. The following diagram shows this architecture: + +![Bare-minimum architecture of the PostgreSQL cluster](../_images/diagrams/HA-basic.svg) + +This architecture has the following limitations: + +* This setup only protects against a single node failure, either a database or an etcd node. Losing more than one node leaves the database in read-only mode. 
+* The application must be able to connect to multiple database nodes and fail over to the new primary in the case of an outage. +* The application must act as the load balancer. It must be able to distinguish read/write from read-only requests and distribute them across the cluster. +* The `pgBackRest` component is optional as it doesn't serve the purpose of high availability. However, it is highly recommended for disaster recovery and is a must for production environments. [Contact us](https://www.percona.com/about/contact) to discuss backup configurations and retention policies. + +## Additional reading + +[How components work together](ha-components.md){.md-button} + +## Next steps + +[Deployment - initial setup](ha-init-setup.md){.md-button} \ No newline at end of file diff --git a/docs/solutions/ha-components.md b/docs/solutions/ha-components.md new file mode 100644 index 000000000..9ec070a03 --- /dev/null +++ b/docs/solutions/ha-components.md @@ -0,0 +1,50 @@ +# How components work together + +This document explains how the components of the proposed [high-availability architecture](ha-architecture.md) work together. + +## Database and DCS layers + +Let's start with the database and DCS layers as they are interconnected and work closely together. + +Every database node hosts PostgreSQL and Patroni instances. + +Each PostgreSQL instance in the cluster maintains consistency with other members through streaming replication. Streaming replication is asynchronous by default, meaning that the primary does not wait for the secondaries to acknowledge the receipt of the data to consider the transaction complete. + +Each Patroni instance manages its own PostgreSQL instance. This means that Patroni starts and stops PostgreSQL and manages its configuration, acting as a sophisticated service manager for a PostgreSQL cluster. + +Patroni can also perform the initial cluster initialization, monitor the cluster state, and take other automatic actions if needed. To do so, Patroni relies on and uses the Distributed Configuration Store (DCS), represented by `etcd` in our architecture. + +Though Patroni supports various Distributed Configuration Stores like ZooKeeper, etcd, Consul or Kubernetes, we recommend and support `etcd` as the most popular DCS due to its simplicity, consistency and reliability. + +Note that the PostgreSQL high availability (HA) cluster and the Patroni cluster are the same thing, and we will use these names interchangeably. + +When you start Patroni, it writes the cluster configuration information to `etcd`. During the initial cluster initialization, Patroni uses the `etcd` locking mechanism to ensure that only one instance becomes the primary. This mechanism ensures that only a single process can hold a resource at a time, avoiding race conditions and inconsistencies. + +You start Patroni instances one by one, so the first instance acquires the lock with a lease in `etcd` and becomes the primary PostgreSQL node. The other instances join the primary as replicas, waiting for the lock to be released. + +If the current primary node crashes, its lease on the lock in `etcd` expires. The lock is automatically released after its expiration time. `etcd` then starts a new election, and a standby node attempts to acquire the lock to become the new primary. + +Patroni uses `etcd` for more than locking. It also uses `etcd` to store the current state of the cluster, ensuring that all nodes are aware of the latest topology and status. + +Another important component is the watchdog. It runs on each database node. 
The purpose of watchdog is to prevent split-brain scenarios, where multiple nodes might mistakenly think they are the primary node. The watchdog monitors the node's health by receiving periodic "keepalive" signals from Patroni. If these signals stop due to a crash, high system load or any other reason, the watchdog resets the node to ensure it does not cause inconsistencies. + +## Load balancing layer + +This layer consists of HAProxy as the connection router and load balancer. + +HAProxy acts as a single point of entry to your cluster for client applications. It accepts all requests from client applications and distributes the load evenly across the cluster nodes. It can route read/write requests to the primary and read-only requests to the secondary nodes. This behavior is defined within HAProxy configuration. To determine the current primary node, HAProxy queries the Patroni REST API. + +HAProxy must be also redundant. Each application server or Pod can have its own HAProxy. If it cannot have own HAProxy, you can deploy HAProxy outside the application layer. This may introduce additional network hops and a failure point. + +If you are deploying HAProxy outside the application layer, you need a minimum of 2 HAProxy nodes (one is active and another one standby) to avoid a single point of failure. These instances share a floating virtual IP address using Keepalived. + +Keepalived acts as the failover tool for HAProxy. It provides the virtual IP address (VIP) for HAProxy and monitors its state. When the current active HAProxy node is down, it transfers the VIP to the remaining node and fails over the services there. + +## Services layer + +Finally, the services layer is represented by `pgBackRest` and PMM. + +`pgBackRest` can manage a dedicated backup server or make backups to the cloud. `pgBackRest` agent are deployed on every database node. `pgBackRest` can utilize standby nodes to offload the backup load from the primary. However, WAL archiving is happening only from the primary node. By communicating with its agents,`pgBackRest` determines the current cluster topology and uses the nodes to make backups most effectively without any manual reconfiguration at the event of a switchover or failover. + +The monitoring solution is optional but nice to have. It enables you to monitor the health of your high-availability architecture, receive timely alerts should performance issues occur and proactively react to them. + diff --git a/docs/solutions/ha-etcd-config.md b/docs/solutions/ha-etcd-config.md new file mode 100644 index 000000000..68fc52b5c --- /dev/null +++ b/docs/solutions/ha-etcd-config.md @@ -0,0 +1,167 @@ +# Etcd setup + +In our solutions, we use etcd distributed configuration store. [Refresh your knowledge about etcd](ha-components.md#etcd). + +## Install etcd + +Install etcd on all PostgreSQL nodes: `node1`, `node2` and `node3`. + +=== ":material-debian: On Debian / Ubuntu" + + 1. Install etcd: + + ```{.bash data-prompt="$"} + $ sudo apt install etcd etcd-server etcd-client + ``` + + 3. Stop and disable etcd: + + ```{.bash data-prompt="$"} + $ sudo systemctl stop etcd + $ sudo systemctl disable etcd + ``` + +=== ":material-redhat: On RHEL and derivatives" + + + 1. Install etcd. + + ```{.bash data-prompt="$"} + $ sudo yum install + etcd python3-python-etcd\ + ``` + + 3. Stop and disable etcd: + + ```{.bash data-prompt="$"} + $ sudo systemctl stop etcd + $ systemctl disable etcd + ``` + +!!! 
note + + If you [installed etcd from tarballs](../tarball.md), you must first [enable it](../enable-extensions.md#etcd) before configuring it. + +## Configure etcd + +To get started with `etcd` cluster, you need to bootstrap it. This means setting up the initial configuration and starting the etcd nodes so they can form a cluster. There are the following bootstrapping mechanisms: + +* Static in the case when the IP addresses of the cluster nodes are known +* Discovery service - for cases when the IP addresses of the cluster are not known ahead of time. + +Since we know the IP addresses of the nodes, we will use the static method. For using the discovery service, please refer to the [etcd documentation :octicons-link-external-16:](https://etcd.io/docs/v3.5/op-guide/clustering/#etcd-discovery){:target="_blank"}. + +We will configure and start all etcd nodes in parallel. This can be done either by modifying each node's configuration or using the command line options. Use the method that you prefer more. + +### Method 1. Modify the configuration file + +1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes. + + === "node1" + + ```yaml title="/etc/etcd/etcd.conf.yaml" + name: 'node1' + initial-cluster-token: PostgreSQL_HA_Cluster_1 + initial-cluster-state: new + initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380 + data-dir: /var/lib/etcd + initial-advertise-peer-urls: http://10.104.0.1:2380 + listen-peer-urls: http://10.104.0.1:2380 + advertise-client-urls: http://10.104.0.1:2379 + listen-client-urls: http://10.104.0.1:2379 + ``` + + === "node2" + + ```yaml title="/etc/etcd/etcd.conf.yaml" + name: 'node2' + initial-cluster-token: PostgreSQL_HA_Cluster_1 + initial-cluster-state: new + initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380 + data-dir: /var/lib/etcd + initial-advertise-peer-urls: http://10.104.0.2:2380 + listen-peer-urls: http://10.104.0.2:2380 + advertise-client-urls: http://10.104.0.2:2379 + listen-client-urls: http://10.104.0.2:2379 + ``` + + === "node3" + + ```yaml title="/etc/etcd/etcd.conf.yaml" + name: 'node3' + initial-cluster-token: PostgreSQL_HA_Cluster_1 + initial-cluster-state: new + initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380 + data-dir: /var/lib/etcd + initial-advertise-peer-urls: http://10.104.0.3:2380 + listen-peer-urls: http://10.104.0.3:2380 + advertise-client-urls: http://10.104.0.3:2379 + listen-client-urls: http://10.104.0.3:2379 + ``` + +2. Enable and start the `etcd` service on all nodes: + + ```{.bash data-prompt="$"} + $ sudo systemctl enable --now etcd + $ sudo systemctl status etcd + ``` + + During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes are not yet running, the start may fail by a quorum timeout. This is expected behavior. Try starting all nodes again at the same time for the etcd cluster to be created. + +--8<-- "check-etcd.md" + +### Method 2. Start etcd nodes with command line options + +1. 
On each etcd node, set the environment variables for the cluster members, the cluster token and state: + + ``` + TOKEN=PostgreSQL_HA_Cluster_1 + CLUSTER_STATE=new + NAME_1=node1 + NAME_2=node2 + NAME_3=node3 + HOST_1=10.104.0.1 + HOST_2=10.104.0.2 + HOST_3=10.104.0.3 + CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380 + ``` + +2. Start each etcd node in parallel using the following command: + + === "node1" + + ```{.bash data-prompt="$"} + THIS_NAME=${NAME_1} + THIS_IP=${HOST_1} + etcd --data-dir=data.etcd --name ${THIS_NAME} \ + --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ + --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ + --initial-cluster ${CLUSTER} \ + --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} & + ``` + + === "node2" + + ```{.bash data-prompt="$"} + THIS_NAME=${NAME_2} + THIS_IP=${HOST_2} + etcd --data-dir=data.etcd --name ${THIS_NAME} \ + --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ + --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ + --initial-cluster ${CLUSTER} \ + --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} & + ``` + + === "node3" + + ```{.bash data-prompt="$"} + THIS_NAME=${NAME_3} + THIS_IP=${HOST_3} + etcd --data-dir=data.etcd --name ${THIS_NAME} \ + --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ + --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ + --initial-cluster ${CLUSTER} \ + --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} & + ``` + +--8<-- "check-etcd.md" \ No newline at end of file diff --git a/docs/solutions/ha-haproxy.md b/docs/solutions/ha-haproxy.md new file mode 100644 index 000000000..84daff5e3 --- /dev/null +++ b/docs/solutions/ha-haproxy.md @@ -0,0 +1,270 @@ +# Configure HAProxy + +HAproxy is the connection router and acts as a single point of entry to your PostgreSQL cluster for client applications. Additionally, HAProxy provides load-balancing for read-only connections. + +A client application connects to HAProxy and sends its read/write requests there. You can provide different ports in the HAProxy configuration file so that the client application can explicitly choose between read-write (primary) connection or read-only (replica) connection using the right port number to connect. In this deployment, writes are routed to port 5000 and reads - to port 5001. + +The client application doesn't know what node in the underlying cluster is the current primary. But it must connect to the HAProxy read-write connection to send all write requests. This ensures that HAProxy correctly routes all write load to the current primary node. Read requests are routed to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. + +When you deploy HAProxy outside the application layer, you must deploy multiple instances of it and have the automatic failover mechanism to eliminate a single point of failure for HAProxy. + +For this document we focus on deployment on premises and we use `keepalived`. It monitors HAProxy state and manages the virtual IP for HAProxy. + +If you use a cloud infrastructure, it may be easier to use the load balancer provided by the cloud provider to achieve high-availability with HAProxy. 
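For example, once the setup described below is complete, an application can reach the cluster through the HAProxy entry point with standard PostgreSQL client tools. The following is a minimal sketch that assumes the virtual IP address `203.0.113.1` used later in this guide; the database name `app_db` and the user `app_user` are placeholders:

```{.bash data-prompt="$"}
$ # Read/write session: HAProxy routes port 5000 to the current primary
$ psql "host=203.0.113.1 port=5000 user=app_user dbname=app_db"

$ # Read-only session: HAProxy balances port 5001 across the replicas
$ psql "host=203.0.113.1 port=5001 user=app_user dbname=app_db"
```

If the primary changes after a failover, the application keeps using the same connection string: HAProxy detects the new primary through the Patroni REST API and routes the write traffic accordingly.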
+ +## HAProxy setup + +1. Install HAProxy on the HAProxy nodes: `HAProxy1`, `HAProxy2` and `HAProxy3`: + + ```{.bash data-prompt="$"} + $ sudo apt install percona-haproxy + ``` + +2. The HAProxy configuration file path is: `/etc/haproxy/haproxy.cfg`. Specify the following configuration in this file for every node. + + ``` + global + maxconn 100 # Maximum number of concurrent connections + + defaults + log global # Use global logging configuration + mode tcp # TCP mode for PostgreSQL connections + retries 2 # Number of retries before marking a server as failed + timeout client 30m # Maximum time to wait for client data + timeout connect 4s # Maximum time to establish connection to server + timeout server 30m # Maximum time to wait for server response + timeout check 5s # Maximum time to wait for health check response + + listen stats # Statistics monitoring + mode http # The protocol for web-based stats UI + bind *:7000 # Port to listen to on all network interfaces + stats enable # Statistics reporting interface + stats uri /stats # URL path for the stats page + stats auth percona:myS3cr3tpass # Username:password authentication + + listen primary + bind *:5000 # Port for write connections + option httpchk /primary + http-check expect status 200 + default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions # Server health check parameters + server node1 node1:5432 maxconn 100 check port 8008 + server node2 node2:5432 maxconn 100 check port 8008 + server node3 node3:5432 maxconn 100 check port 8008 + + listen standbys + balance roundrobin # Round-robin load balancing for read connections + bind *:5001 # Port for read connections + option httpchk /replica + http-check expect status 200 + default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions # Server health check parameters + server node1 node1:5432 maxconn 100 check port 8008 + server node2 node2:5432 maxconn 100 check port 8008 + server node3 node3:5432 maxconn 100 check port 8008 + ``` + + HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately. + + To monitor HAProxy stats, create the user who has the access to it. Read more about statistics dashboard in [HAProxy documentation :octicons-link-external-16:](https://www.haproxy.com/documentation/haproxy-configuration-tutorials/alerts-and-monitoring/statistics/) + +3. Restart HAProxy: + + ```{.bash data-prompt="$"} + $ sudo systemctl restart haproxy + ``` + +4. Check the HAProxy logs to see if there are any errors: + + ```{.bash data-prompt="$"} + $ sudo journalctl -u haproxy.service -n 100 -f + ``` + +## Keepalived setup + +The HAproxy instances will share a virtual IP address `203.0.113.1` as the single point of entry for client applications. + +In this setup we define the basic health check for HAProxy. You may want to use a more sophisticated check. You can do this by writing a script and referencing it in the `keeplaived` configuration. See the [Example of HAProxy health check](#example-of-haproxy-health-check) section for details. + +1. Install `keepalived` on all HAProxy nodes: + + === ":material-debian: On Debian and Ubuntu" + + ```{.bash data-prompt="$"} + $ sudo apt install keepalived + ``` + + === ":material-redhat: On RHEL and derivatives" + + ```{.bash data-prompt="$"} + $ sudo yum install keepalived + ``` + +2. 
Create the `keepalived` configuration file at `/etc/keepalived/keepalived.conf` with the following contents for each node: + + === "Primary HAProxy (HAProxy1)" + + ```ini + vrrp_script chk_haproxy { + script "killall -0 haproxy" # Basic check if HAProxy process is running + interval 3 # Check every 2 seconds + fall 3 # The number of failures to mark the node as down + rise 2 # The number of successes to mark the node as up + weight -11 # Reduce priority by 2 on failure + } + + vrrp_instance CLUSTER_1 { # The name of Patroni cluster + state MASTER # Initial state for the primary node + interface eth1 # Network interface to bind to + virtual_router_id 99 # Unique ID for this VRRP instance + priority 110 # The priority for the primary must be the highest + advert_int 1 # Advertisement interval + authentication { + auth_type PASS + auth_pass myS3cr3tpass # Authentication password + } + virtual_ipaddress { + 203.0.113.1/24 # The virtual IP address + } + track_script { + chk_haproxy + } + } + ``` + + === "HAProxy2" + + ```ini + vrrp_script chk_haproxy { + script "killall -0 haproxy" # Basic check if HAProxy process is running + interval 2 # Check every 2 seconds + fall 2 # The number of failures to mark the node as down + rise 2 # The number of successes to mark the node as up + weight 2 # Reduce priority by 2 on failure + } + + vrrp_instance CLUSTER_1 { + state BACKUP # Initial state for backup node + interface eth1 # Network interface to bind to + virtual_router_id 99 # Same ID as primary + priority 100 # Lower priority than primary + advert_int 1 # Advertisement interval + authentication { + auth_type PASS + auth_pass myS3cr3tpass # Same password as primary + } + virtual_ipaddress { + 203.0.113.1/24 + } + track_script { + chk_haproxy + } + } + ``` + + === "HAProxy3" + + ```ini + vrrp_script chk_haproxy { + script "killall -0 haproxy" # Basic check if HAProxy process is running + interval 2 # Check every 2 seconds + fall 3 # The number of failures to mark the node as down + rise 2 # The number of successes to mark the node as up + weight 6 # Reduce priority by 2 on failure + } + + vrrp_instance CLUSTER_1 { + state BACKUP # Initial state for backup node + interface eth1 # Network interface to bind to + virtual_router_id 99 # Same ID as primary + priority 105 # Lowest priority + advert_int 1 # Advertisement interval + authentication { + auth_type PASS + auth_pass myS3cr3tpass # Same password as primary + } + virtual_ipaddress { + 203.0.113.1/24 + } + track_script { + chk_haproxy + } + } + ``` + +3. Start `keepalived`: + + ```{.bash data-prompt="$"} + $ sudo systemctl start keepalived + ``` + +4. Check the `keepalived` status: + + ```{.bash data-prompt="$"} + $ sudo systemctl status keepalived + ``` + +!!! note + + The basic health check (`killall -0 haproxy`) only verifies that the HAProxy process is running. For production environments, consider implementing more comprehensive health checks that verify the node's overall responsiveness and HAProxy's ability to handle connections. + +### Example of HAProxy health check + +Sometimes checking only the running haproxy process is not enough. The process may be running while HAProxy is in a degraded state. A good practice is to make additional checks to ensure HAProxy is healthy. + +Here's an example health check script for HAProxy. It performs the following checks: + +1. Verifies that the HAProxy process is running +2. Tests if the HAProxy admin socket is accessible +3. 
Confirms that HAProxy is binding to the default port `5432` + +```bash +#!/bin/bash + +# Exit codes: +# 0 - HAProxy is healthy +# 1 - HAProxy is not healthy + +# Check if HAProxy process is running +if ! pgrep -x haproxy > /dev/null; then + echo "HAProxy process is not running" + exit 1 +fi + +# Check if HAProxy socket is accessible +if ! socat - UNIX-CONNECT:/var/run/haproxy/admin.sock > /dev/null 2>&1; then + echo "HAProxy socket is not accessible" + exit 1 +fi + +# Check if HAProxy is binding to port 5432 +if ! netstat -tuln | grep -q ":5432 "; then + exit 1 +fi + +# All checks passed +exit 0 +``` + +Save this script as `/usr/local/bin/check_haproxy.sh` and make it executable: + +```{.bash data-prompt="$"} +$ sudo chmod +x /usr/local/bin/check_haproxy.sh +``` + +Then define this script in Keepalived configuration on each node: + +```ini +vrrp_script chk_haproxy { + script "/usr/local/bin/check_haproxy.sh" + interval 2 + fall 3 + rise 2 + weight -10 +} +``` + + +Congratulations! You have successfully configured your HAProxy solution. Now you can proceed to testing it. + +## Next steps + +[Test Patroni PostgreSQL cluster](ha-test.md){.md-button} \ No newline at end of file diff --git a/docs/solutions/ha-init-setup.md b/docs/solutions/ha-init-setup.md new file mode 100644 index 000000000..a857197d4 --- /dev/null +++ b/docs/solutions/ha-init-setup.md @@ -0,0 +1,81 @@ +# Initial setup for high availability + +This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni. This guide relies on the provided [architecture](ha-architecture.md) for high-availability. + +## Considerations + +1. This is an example deployment where etcd runs on the same host machines as the Patroni and PostgreSQL and there is a single dedicated HAProxy host. Alternatively etcd can run on different set of nodes. + + If etcd is deployed on the same host machine as Patroni and PostgreSQL, separate disk system for etcd and PostgreSQL is recommended due to performance reasons. + +2. For this setup, we will use the nodes that have the following IP addresses: + + + | Node name | Public IP address | Internal IP address + |---------------|-------------------|-------------------- + | node1 | 157.230.42.174 | 10.104.0.7 + | node2 | 68.183.177.183 | 10.104.0.2 + | node3 | 165.22.62.167 | 10.104.0.8 + | HAProxy1 | 112.209.126.159 | 10.104.0.6 + | HAProxy2 | 134.209.111.138 | 10.104.0.5 + | HAProxy3 | 134.60.204.27 | 10.104.0.3 + | backup | 97.78.129.11 | 10.104.0.9 + + We also need a virtual IP address for HAProxy: `203.0.113.1` + + +!!! important + + We recommend not to expose the hosts/nodes where Patroni / etcd / PostgreSQL are running to public networks due to security risks. Use Firewalls, Virtual networks, subnets or the like to protect the database hosts from any kind of attack. + +## Configure name resolution + +It’s not necessary to have name resolution, but it makes the whole setup more readable and less error prone. Here, instead of configuring a DNS, we use a local name resolution by updating the file `/etc/hosts`. By resolving their hostnames to their IP addresses, we make the nodes aware of each other’s names and allow their seamless communication. + +Run the following commands on each node. + +1. Set the hostname for nodes. Change the node name to `node1`, `node2`, `node3`, `HAProxy1`, `HAProxy2` and `backup`, respectively: + + ```{.bash data-prompt="$"} + $ sudo hostnamectl set-hostname node1 + ``` + +2. 
Modify the `/etc/hosts` file of each node to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes: + + ```text + # Cluster IP and names + + 10.104.0.7 node1 + 10.104.0.2 node2 + 10.104.0.8 node3 + 10.104.0.6 HAProxy1 + 10.104.0.5 HAProxy2 + 10.104.0.3 HAProxy3 + 10.104.0.9 backup + ``` + +## Configure Percona repository + +To install the software from Percona, you need to subscribe to Percona repositories. To do this, you require `percona-release` - the repository management tool. + +Run the following commands on each node as the root user or with `sudo` privileges. + +1. Install `percona-release` + + === ":material-debian: On Debian and Ubuntu" + + --8<-- "percona-release-apt.md" + + === ":material-redhat: On RHEL and derivatives" + + --8<-- "percona-release-yum.md" + +2. Enable the repository: + + ```{.bash data-prompt="$"} + $ sudo percona-release setup ppg{{pgversion}} + ``` + +## Next steps + +[Install Percona Distribution for PostgreSQL](ha-install-postgres.md){.md-button} \ No newline at end of file diff --git a/docs/solutions/ha-measure.md b/docs/solutions/ha-measure.md new file mode 100644 index 000000000..d246544b2 --- /dev/null +++ b/docs/solutions/ha-measure.md @@ -0,0 +1,23 @@ +# Measuring high availability + +The need for high availability is determined by the business requirements, potential risks, and operational limitations. For example, the more components you add to your infrastructure, the more complex and time-consuming it is to maintain. Moreover, it may introduce extra failure points. The recommendation is to follow the principle "The simpler the better". + +The level of high availability depends on the following: + +* how frequently you may encounter an outage or a downtime. +* how much downtime you can bear without negatively impacting your users for every outage, and +* how much data loss you can tolerate during the outage. + +The measurement of availability is done by establishing a measurement time frame and dividing it by the time that it was available. This ratio will rarely be one, which is equal to 100% availability. At Percona, we don’t consider a solution to be highly available if it is not at least 99% or two nines available. + +The following table shows the amount of downtime for each level of availability from two to five nines. 
+ +| Availability % | Downtime per year | Downtime per month | Downtime per week | Downtime per day | +|--------------------------|-------------------|--------------------|-------------------|-------------------| +| 99% (“two nines”) | 3.65 days | 7.31 hours | 1.68 hours | 14.40 minutes | +| 99.5% (“two nines five”) | 1.83 days | 3.65 hours | 50.40 minutes | 7.20 minutes | +| 99.9% (“three nines”) | 8.77 hours | 43.83 minutes | 10.08 minutes | 1.44 minutes | +| 99.95% (“three nines five”) | 4.38 hours | 21.92 minutes | 5.04 minutes | 43.20 seconds | +| 99.99% (“four nines”) | 52.60 minutes | 4.38 minutes | 1.01 minutes | 8.64 seconds | +| 99.995% (“four nines five”) | 26.30 minutes | 2.19 minutes | 30.24 seconds | 4.32 seconds | +| 99.999% (“five nines”) | 5.26 minutes | 26.30 seconds | 6.05 seconds | 864.00 milliseconds | diff --git a/docs/solutions/ha-patroni.md b/docs/solutions/ha-patroni.md new file mode 100644 index 000000000..ad60d777c --- /dev/null +++ b/docs/solutions/ha-patroni.md @@ -0,0 +1,367 @@ +# Patroni setup + +## Install Percona Distribution for PostgreSQL and Patroni + +Run the following commands as root or with `sudo` privileges on `node1`, `node2` and `node3`. + +=== "On Debian / Ubuntu" + + 1. Disable the upstream `postgresql-{{pgversion}}` package. + + 2. Install Percona Distribution for PostgreSQL package + + ```{.bash data-prompt="$"} + $ sudo apt install percona-postgresql-{{pgversion}} + ``` + + 3. Install some Python and auxiliary packages to help with Patroni + + ```{.bash data-prompt="$"} + $ sudo apt install python3-pip python3-dev binutils + ``` + + 4. Install Patroni + + ```{.bash data-prompt="$"} + $ sudo apt install percona-patroni + ``` + + 5. Stop and disable all installed services: + + ```{.bash data-prompt="$"} + $ sudo systemctl stop {patroni,postgresql} + $ sudo systemctl disable {patroni,postgresql} + ``` + + 6. Even though Patroni can use an existing Postgres installation, our recommendation for a **new cluster that has no data** is to remove the data directory. This forces Patroni to initialize a new Postgres cluster instance. + + ```{.bash data-prompt="$"} + $ sudo systemctl stop postgresql + $ sudo rm -rf /var/lib/postgresql/{{pgversion}}/main + ``` + +=== "On RHEL and derivatives" + + 1. Install Percona Distribution for PostgreSQL package + + ```{.bash data-prompt="$"} + $ sudo yum install percona-postgresql{{pgversion}}-server + ``` + + 2. Check the [platform specific notes for Patroni](../yum.md#for-percona-distribution-for-postgresql-packages) + + 3. Install some Python and auxiliary packages to help with Patroni and etcd + + ```{.bash data-prompt="$"} + $ sudo yum install python3-pip python3-devel binutils + ``` + + 4. Install Patroni + + ```{.bash data-prompt="$"} + $ sudo yum install percona-patroni + ``` + + 3. Stop and disable all installed services: + + ```{.bash data-prompt="$"} + $ sudo systemctl stop {patroni,postgresql-{{pgversion}}} + $ sudo systemctl disable {patroni,postgresql-{{pgversion}}} + ``` + + !!! important + + **Don't** initialize the cluster and start the `postgresql` service. The cluster initialization and setup are handled by Patroni during the bootsrapping stage. + +## Configure Patroni + +Run the following commands on all nodes. You can do this in parallel: + +### Create environment variables + +Environment variables simplify the config file creation: + +1. Node name: + + ```{.bash data-prompt="$"} + $ export NODE_NAME=`hostname -f` + ``` + +2. 
Node IP: + + ```{.bash data-prompt="$"} + $ export NODE_IP=`getent hosts $(hostname -f) | awk '{ print $1 }' | grep -v grep | grep -v '127.0.1.1'` + ``` + + * Check that the correct IP address is defined: + + ```{.bash data-prompt="$"} + $ echo $NODE_IP + ``` + + ??? admonition "Sample output `node1`" + + ```{text .no-copy} + 10.104.0.7 + ``` + + If you have multiple IP addresses defined on your server and the environment variable contains the wrong one, you can manually redefine it. For example, run the following command for `node1`: + + ```{.bash data-prompt="$"} + $ NODE_IP=10.104.0.7 + ``` + +3. Create variables to store the `PATH`. Check the path to the `data` and `bin` folders on your operating system and change it for the variables accordingly: + + === ":material-debian: Debian and Ubuntu" + + ```bash + DATA_DIR="/var/lib/postgresql/{{pgversion}}/main" + PG_BIN_DIR="/usr/lib/postgresql/{{pgversion}}/bin" + ``` + + === ":material-redhat: RHEL and derivatives" + + ```bash + DATA_DIR="/var/lib/pgsql/data/" + PG_BIN_DIR="/usr/pgsql-{{pgversion}}/bin" + ``` + +4. Patroni information: + + ```bash + NAMESPACE="percona_lab" + SCOPE="cluster_1" + ``` + +### Create the directories required by Patroni + +Create the directory to store the configuration file and make it owned by the `postgres` user. + +```{.bash data-prompt="$"} +$ sudo mkdir -p /etc/patroni/ +$ sudo chown -R postgres:postgres /etc/patroni/ +``` + +### Patroni configuration file + +Use the following command to create the `/etc/patroni/patroni.yml` configuration file and add the following configuration for every node: + +```bash +echo " +namespace: ${NAMESPACE} +scope: ${SCOPE} +name: ${NODE_NAME} + +restapi: + listen: 0.0.0.0:8008 + connect_address: ${NODE_IP}:8008 + +etcd3: + host: ${NODE_IP}:2379 + +bootstrap: + # this section will be written into Etcd:///config after initializing new cluster + dcs: + ttl: 30 + loop_wait: 10 + retry_timeout: 10 + maximum_lag_on_failover: 1048576 + + postgresql: + use_pg_rewind: true + use_slots: true + parameters: + wal_level: replica + hot_standby: "on" + wal_keep_segments: 10 + max_wal_senders: 5 + max_replication_slots: 10 + wal_log_hints: "on" + logging_collector: 'on' + max_wal_size: '10GB' + archive_mode: "on" + archive_timeout: 600s + archive_command: "cp -f %p /home/postgres/archived/%f" + + pg_hba: # Add following lines to pg_hba.conf after running 'initdb' + - host replication replicator 127.0.0.1/32 trust + - host replication replicator 0.0.0.0/0 md5 + - host all all 0.0.0.0/0 md5 + - host all all ::0/0 md5 + recovery_conf: + restore_command: cp /home/postgres/archived/%f %p + + # some desired options for 'initdb' + initdb: # Note: It needs to be a list (some options need values, others are switches) + - encoding: UTF8 + - data-checksums + + +postgresql: + cluster_name: cluster_1 + listen: 0.0.0.0:5432 + connect_address: ${NODE_IP}:5432 + data_dir: ${DATA_DIR} + bin_dir: ${PG_BIN_DIR} + pgpass: /tmp/pgpass0 + authentication: + replication: + username: replicator + password: replPasswd + superuser: + username: postgres + password: qaz123 + parameters: + unix_socket_directories: "/var/run/postgresql/" + create_replica_methods: + - basebackup + basebackup: + checkpoint: 'fast' + + watchdog: + mode: required # Allowed values: off, automatic, required + device: /dev/watchdog + safety_margin: 5 + +tags: + nofailover: false + noloadbalance: false + clonefrom: false + nosync: false +" | sudo tee /etc/patroni/patroni.yml +``` + +??? 
admonition "Patroni configuration file" + + Let’s take a moment to understand the contents of the `patroni.yml` file. + + The first section provides the details of the node and its connection ports. After that, we have the `etcd` service and its port details. + + Following these, there is a `bootstrap` section that contains the PostgreSQL configurations and the steps to run once + +### Systemd configuration + +1. Check that the systemd unit file `percona-patroni.service` is created in `/etc/systemd/system`. If it is created, skip this step. + + If it's **not created**, create it manually and specify the following contents within: + + ```ini title="/etc/systemd/system/percona-patroni.service" + [Unit] + Description=Runners to orchestrate a high-availability PostgreSQL + After=syslog.target network.target + + [Service] + Type=simple + + User=postgres + Group=postgres + + # Start the patroni process + ExecStart=/bin/patroni /etc/patroni/patroni.yml + + # Send HUP to reload from patroni.yml + ExecReload=/bin/kill -s HUP $MAINPID + + # only kill the patroni process, not its children, so it will gracefully stop postgres + KillMode=process + + # Give a reasonable amount of time for the server to start up/shut down + TimeoutSec=30 + + # Do not restart the service if it crashes, we want to manually inspect database on failure + Restart=no + + [Install] + WantedBy=multi-user.target + ``` + +2. Make `systemd` aware of the new service: + + ```{.bash data-prompt="$"} + $ sudo systemctl daemon-reload + ``` + +3. Make sure you have the configuration file and the `systemd` unit file created on every node. + +### Start Patroni + +Now it's time to start Patroni. You need the following commands on all nodes but **not in parallel**. + +1. Start Patroni on `node1` first, wait for the service to come to live, and then proceed with the other nodes one-by-one, always waiting for them to sync with the primary node: + + ```{.bash data-prompt="$"} + $ sudo systemctl enable --now percona-patroni + ``` + + When Patroni starts, it initializes PostgreSQL (because the service is not currently running and the data directory is empty) following the directives in the bootstrap section of the configuration file. + +2. Check the service to see if there are errors: + + ```{.bash data-prompt="$"} + $ sudo journalctl -fu percona-patroni + ``` + + See [Troubleshooting Patroni startup](#troubleshooting-patroni-startup) for guidelines in case of errors. + + If Patroni has started properly, you should be able to locally connect to a PostgreSQL node using the following command: + + ```{.bash data-prompt="$"} + $ sudo psql -U postgres + + psql ({{dockertag}}) + Type "help" for help. + + postgres=# + ``` + +9. When all nodes are up and running, you can check the cluster status using the following command: + + ```{.bash data-prompt="$"} + $ sudo patronictl -c /etc/patroni/patroni.yml list + ``` + + The output resembles the following: + + ??? admonition "Sample output node1" + + ```{.text .no-copy} + + Cluster: cluster_1 (7440127629342136675) -----+----+-------+ + | Member | Host | Role | State | TL | Lag in MB | + +--------+------------+---------+-----------+----+-----------+ + | node1 | 10.0.100.1 | Leader | running | 1 | | + ``` + + ??? 
admonition "Sample output node3" + + ```{.text .no-copy} + + Cluster: cluster_1 (7440127629342136675) -----+----+-------+ + | Member | Host | Role | State | TL | Lag in MB | + +--------+------------+---------+-----------+----+-----------+ + | node1 | 10.0.100.1 | Leader | running | 1 | | + | node2 | 10.0.100.2 | Replica | streaming | 1 | 0 | + | node3 | 10.0.100.3 | Replica | streaming | 1 | 0 | + +--------+------------+---------+-----------+----+-----------+ + ``` + +### Troubleshooting Patroni startup + + A common error is Patroni complaining about the lack of proper entries in the `pg_hba.conf` file. If you see such errors, you must manually add or fix the entries in that file and then restart the service. + +An example of such an error is `No pg_hba.conf entry for replication connection from host to , user replicator, no encryption`. This means that Patroni cannot connect to the node you're adding to the cluster. To resolve this issue, add the IP addresses of the nodes to the `pg_hba:` section of the Patroni configuration file. + +``` +pg_hba: # Add following lines to pg_hba.conf after running 'initdb' + - host replication replicator 127.0.0.1/32 trust + - host replication replicator 0.0.0.0/0 md5 + - host replication replicator 10.0.100.2/32 trust + - host replication replicator 10.0.100.3/32 trust + - host all all 0.0.0.0/0 md5 + - host all all ::0/0 md5 + recovery_conf: + restore_command: cp /home/postgres/archived/%f %p +``` + +For production use, we recommend adding nodes individually as the more secure way. However, if your network is secure and you trust it, you can add the whole network these nodes belong to as the trusted one to bypass passwords use during authentication. Then all nodes from this network can connect to Patroni cluster. + +Changing the `patroni.yml` file and restarting the service will not have any effect here because the bootstrap section specifies the configuration to apply when PostgreSQL is first started in the node. It will not repeat the process even if the Patroni configuration file is modified and the service is restarted. diff --git a/docs/solutions/ha-setup-apt.md b/docs/solutions/ha-setup-apt.md deleted file mode 100644 index 0052a2661..000000000 --- a/docs/solutions/ha-setup-apt.md +++ /dev/null @@ -1,581 +0,0 @@ -# Deploying PostgreSQL for high availability with Patroni on Debian or Ubuntu - -This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni on Debian or Ubuntu. - - -## Preconditions - -1. This is an example deployment where etcd runs on the same host machines as the Patroni and PostgreSQL and there is a single dedicated HAProxy host. Alternatively etcd can run on different set of nodes. - - If etcd is deployed on the same host machine as Patroni and PostgreSQL, separate disk system for etcd and PostgreSQL is recommended due to performance reasons. - -2. For this setup, we will use the nodes running on Ubuntu 22.04 as the base operating system: - - -| Node name | Public IP address | Internal IP address -|---------------|-------------------|-------------------- -| node1 | 157.230.42.174 | 10.104.0.7 -| node2 | 68.183.177.183 | 10.104.0.2 -| node3 | 165.22.62.167 | 10.104.0.8 -| HAProxy-demo | 134.209.111.138 | 10.104.0.6 - - -!!! note - - We recommend not to expose the hosts/nodes where Patroni / etcd / PostgreSQL are running to public networks due to security risks. Use Firewalls, Virtual networks, subnets or the like to protect the database hosts from any kind of attack. 
- -## Initial setup - -Configure every node. - -### Set up hostnames in the `/etc/hosts` file - -It's not necessary to have name resolution, but it makes the whole setup more readable and less error prone. Here, instead of configuring a DNS, we use a local name resolution by updating the file `/etc/hosts`. By resolving their hostnames to their IP addresses, we make the nodes aware of each other's names and allow their seamless communication. - -=== "node1" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname node1 - ``` - - 2. Modify the `/etc/hosts` file to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes: - - ```text hl_lines="3 4" - # Cluster IP and names - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -=== "node2" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname node2 - ``` - - 2. Modify the `/etc/hosts` file to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes: - - ```text hl_lines="2 4" - # Cluster IP and names - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -=== "node3" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname node3 - ``` - - 2. Modify the `/etc/hosts` file to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes: - - ```text hl_lines="2 3" - # Cluster IP and names - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -=== "HAproxy-demo" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname HAProxy-demo - ``` - - 2. Modify the `/etc/hosts` file. The HAProxy instance should have the name resolution for all the three nodes in its `/etc/hosts` file. Add the following lines at the end of the file: - - ```text hl_lines="3 4 5" - # Cluster IP and names - 10.104.0.6 HAProxy-demo - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -### Install the software - -Run the following commands on `node1`, `node2` and `node3`: - -1. Install Percona Distribution for PostgreSQL - - * Disable the upstream `postgresql-{{pgversion}}` package. - - * Install the `percona-release` repository management tool - - --8<-- "percona-release-apt.md" - - * Enable the repository - - ```{.bash data-prompt="$"} - $ sudo percona-release setup ppg{{pgversion}} - ``` - - * Install Percona Distribution for PostgreSQL package - - ```{.bash data-prompt="$"} - $ sudo apt install percona-postgresql-{{pgversion}} - ``` - -2. Install some Python and auxiliary packages to help with Patroni and etcd - - ```{.bash data-prompt="$"} - $ sudo apt install python3-pip python3-dev binutils - ``` - -3. Install etcd, Patroni, pgBackRest packages: - - - ```{.bash data-prompt="$"} - $ sudo apt install percona-patroni \ - etcd etcd-server etcd-client \ - percona-pgbackrest - ``` - -4. Stop and disable all installed services: - - ```{.bash data-prompt="$"} - $ sudo systemctl stop {etcd,patroni,postgresql} - $ systemctl disable {etcd,patroni,postgresql} - ``` - -5. Even though Patroni can use an existing Postgres installation, remove the data directory to force it to initialize a new Postgres cluster instance. 
- - ```{.bash data-prompt="$"} - $ sudo systemctl stop postgresql - $ sudo rm -rf /var/lib/postgresql/{{pgversion}}/main - ``` - -## Configure etcd distributed store - -In our implementation we use etcd distributed configuration store. [Refresh your knowledge about etcd](high-availability.md#etcd). - -!!! note - - If you [installed the software from tarballs](../tarball.md), you must first [enable etcd](../enable-extensions.md#etcd) before configuring it. - -To get started with `etcd` cluster, you need to bootstrap it. This means setting up the initial configuration and starting the etcd nodes so they can form a cluster. There are the following bootstrapping mechanisms: - -* Static in the case when the IP addresses of the cluster nodes are known -* Discovery service - for cases when the IP addresses of the cluster are not known ahead of time. - -Since we know the IP addresses of the nodes, we will use the static method. For using the discovery service, please refer to the [etcd documentation :octicons-external-link-16:](https://etcd.io/docs/v3.5/op-guide/clustering/#etcd-discovery){:target="_blank"}. - -We will configure and start all etcd nodes in parallel. This can be done either by modifying each node's configuration or using the command line options. Use the method that you prefer more. - -### Method 1. Modify the configuration file - -1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes. - - === "node1" - - ```yaml title="/etc/etcd/etcd.conf.yaml" - name: 'node1' - initial-cluster-token: PostgreSQL_HA_Cluster_1 - initial-cluster-state: new - initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380 - data-dir: /var/lib/etcd - initial-advertise-peer-urls: http://10.104.0.1:2380 - listen-peer-urls: http://10.104.0.1:2380 - advertise-client-urls: http://10.104.0.1:2379 - listen-client-urls: http://10.104.0.1:2379 - ``` - - === "node2" - - ```yaml title="/etc/etcd/etcd.conf.yaml" - name: 'node2' - initial-cluster-token: PostgreSQL_HA_Cluster_1 - initial-cluster-state: new - initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380 - data-dir: /var/lib/etcd - initial-advertise-peer-urls: http://10.104.0.2:2380 - listen-peer-urls: http://10.104.0.2:2380 - advertise-client-urls: http://10.104.0.2:2379 - listen-client-urls: http://10.104.0.2:2379 - ``` - - === "node3" - - ```yaml title="/etc/etcd/etcd.conf.yaml" - name: 'node3' - initial-cluster-token: PostgreSQL_HA_Cluster_1 - initial-cluster-state: new - initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380 - data-dir: /var/lib/etcd - initial-advertise-peer-urls: http://10.104.0.3:2380 - listen-peer-urls: http://10.104.0.3:2380 - advertise-client-urls: http://10.104.0.3:2379 - listen-client-urls: http://10.104.0.3:2379 - ``` - -2. Enable and start the `etcd` service on all nodes: - - ```{.bash data-prompt="$"} - $ sudo systemctl enable --now etcd - $ sudo systemctl start etcd - $ sudo systemctl status etcd - ``` - - During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes are not yet running, the start may fail by a quorum timeout. This is expected behavior. Try starting all nodes again at the same time for the etcd cluster to be created. 
- ---8<-- "check-etcd.md" - -### Method 2. Start etcd nodes with command line options - -1. On each etcd node, set the environment variables for the cluster members, the cluster token and state: - - ``` - TOKEN=PostgreSQL_HA_Cluster_1 - CLUSTER_STATE=new - NAME_1=node1 - NAME_2=node2 - NAME_3=node3 - HOST_1=10.104.0.1 - HOST_2=10.104.0.2 - HOST_3=10.104.0.3 - CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380 - ``` - -2. Start each etcd node in parallel using the following command: - - === "node1" - - ```{.bash data-prompt="$"} - THIS_NAME=${NAME_1} - THIS_IP=${HOST_1} - etcd --data-dir=data.etcd --name ${THIS_NAME} \ - --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ - --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ - --initial-cluster ${CLUSTER} \ - --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} - ``` - - === "node2" - - ```{.bash data-prompt="$"} - THIS_NAME=${NAME_2} - THIS_IP=${HOST_2} - etcd --data-dir=data.etcd --name ${THIS_NAME} \ - --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ - --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ - --initial-cluster ${CLUSTER} \ - --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} - ``` - - === "node3" - - ```{.bash data-prompt="$"} - THIS_NAME=${NAME_3} - THIS_IP=${HOST_3} - etcd --data-dir=data.etcd --name ${THIS_NAME} \ - --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ - --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ - --initial-cluster ${CLUSTER} \ - --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} - ``` - ---8<-- "check-etcd.md" - -## Configure Patroni - -Run the following commands on all nodes. You can do this in parallel: - -1. Export and create environment variables to simplify the config file creation: - - * Node name: - - ```{.bash data-prompt="$"} - $ export NODE_NAME=`hostname -f` - ``` - - * Node IP: - - ```{.bash data-prompt="$"} - $ export NODE_IP=`hostname -i | awk '{print $1}'` - ``` - - * Create variables to store the PATH: - - ```bash - DATA_DIR="/var/lib/postgresql/{{pgversion}}/main" - PG_BIN_DIR="/usr/lib/postgresql/{{pgversion}}/bin" - ``` - - **NOTE**: Check the path to the data and bin folders on your operating system and change it for the variables accordingly. - - * Patroni information: - - ```bash - NAMESPACE="percona_lab" - SCOPE="cluster_1" - ``` - -2. 
Use the following command to create the `/etc/patroni/patroni.yml` configuration file and add the following configuration for `node1`: - - ```bash - echo " - namespace: ${NAMESPACE} - scope: ${SCOPE} - name: ${NODE_NAME} - - restapi: - listen: 0.0.0.0:8008 - connect_address: ${NODE_IP}:8008 - - etcd3: - host: ${NODE_IP}:2379 - - bootstrap: - # this section will be written into Etcd:///config after initializing new cluster - dcs: - ttl: 30 - loop_wait: 10 - retry_timeout: 10 - maximum_lag_on_failover: 1048576 - - postgresql: - use_pg_rewind: true - use_slots: true - parameters: - wal_level: replica - hot_standby: "on" - wal_keep_segments: 10 - max_wal_senders: 5 - max_replication_slots: 10 - wal_log_hints: "on" - logging_collector: 'on' - max_wal_size: '10GB' - archive_mode: "on" - archive_timeout: 600s - archive_command: "cp -f %p /home/postgres/archived/%f" - pg_hba: - - local all all peer - - host replication replicator 127.0.0.1/32 trust - - host replication replicator 192.0.0.0/8 scram-sha-256 - - host all all 0.0.0.0/0 scram-sha-256 - recovery_conf: - restore_command: cp /home/postgres/archived/%f %p - - # some desired options for 'initdb' - initdb: # Note: It needs to be a list (some options need values, others are switches) - - encoding: UTF8 - - data-checksums - - postgresql: - cluster_name: cluster_1 - listen: 0.0.0.0:5432 - connect_address: ${NODE_IP}:5432 - data_dir: ${DATA_DIR} - bin_dir: ${PG_BIN_DIR} - pgpass: /tmp/pgpass0 - authentication: - replication: - username: replicator - password: replPasswd - superuser: - username: postgres - password: qaz123 - parameters: - unix_socket_directories: "/var/run/postgresql/" - create_replica_methods: - - basebackup - basebackup: - checkpoint: 'fast' - - watchdog: - mode: required # Allowed values: off, automatic, required - device: /dev/watchdog - safety_margin: 5 - - - tags: - nofailover: false - noloadbalance: false - clonefrom: false - nosync: false - " | sudo tee -a /etc/patroni/patroni.yml - ``` - - ??? admonition "Patroni configuration file" - - Let’s take a moment to understand the contents of the `patroni.yml` file. - - The first section provides the details of the node and its connection ports. After that, we have the `etcd` service and its port details. - - Following these, there is a `bootstrap` section that contains the PostgreSQL configurations and the steps to run once the database is initialized. The `pg_hba.conf` entries specify all the other nodes that can connect to this node and their authentication mechanism. - - -3. Check that the systemd unit file `percona-patroni.service` is created in `/etc/systemd/system`. If it is created, skip this step. - - If it's **not created**, create it manually and specify the following contents within: - - ```ini title="/etc/systemd/system/percona-patroni.service" - [Unit] - Description=Runners to orchestrate a high-availability PostgreSQL - After=syslog.target network.target - - [Service] - Type=simple - - User=postgres - Group=postgres - - # Start the patroni process - ExecStart=/bin/patroni /etc/patroni/patroni.yml - - # Send HUP to reload from patroni.yml - ExecReload=/bin/kill -s HUP $MAINPID - - # only kill the patroni process, not its children, so it will gracefully stop postgres - KillMode=process - - # Give a reasonable amount of time for the server to start up/shut down - TimeoutSec=30 - - # Do not restart the service if it crashes, we want to manually inspect database on failure - Restart=no - - [Install] - WantedBy=multi-user.target - ``` - -4. 
Make systemd aware of the new service: - - ```{.bash data-prompt="$"} - $ sudo systemctl daemon-reload - ``` - -5. Repeat steps 1-4 on the remaining nodes. In the end you must have the configuration file and the systemd unit file created on every node. -6. Now it's time to start Patroni. You need the following commands on all nodes but not in parallel. Start with the `node1` first, wait for the service to come to live, and then proceed with the other nodes one-by-one, always waiting for them to sync with the primary node: - - - ```{.bash data-prompt="$"} - $ sudo systemctl enable --now patroni - $ sudo systemctl restart patroni - ``` - - When Patroni starts, it initializes PostgreSQL (because the service is not currently running and the data directory is empty) following the directives in the bootstrap section of the configuration file. - -7. Check the service to see if there are errors: - - ```{.bash data-prompt="$"} - $ sudo journalctl -fu patroni - ``` - - A common error is Patroni complaining about the lack of proper entries in the pg_hba.conf file. If you see such errors, you must manually add or fix the entries in that file and then restart the service. - - Changing the patroni.yml file and restarting the service will not have any effect here because the bootstrap section specifies the configuration to apply when PostgreSQL is first started in the node. It will not repeat the process even if the Patroni configuration file is modified and the service is restarted. - -8. Check the cluster. Run the following command on any node: - - ```{.bash data-prompt="$"} - $ patronictl -c /etc/patroni/patroni.yml list $SCOPE - ``` - - The output resembles the following: - - ```{.text .no-copy} - + Cluster: cluster_1 (7440127629342136675) -----+----+-------+ - | Member | Host | Role | State | TL | Lag in MB | - +--------+------------+---------+-----------+----+-----------+ - | node1 | 10.0.100.1 | Leader | running | 1 | | - | node2 | 10.0.100.2 | Replica | streaming | 1 | 0 | - | node3 | 10.0.100.3 | Replica | streaming | 1 | 0 | - +--------+------------+---------+-----------+----+-----------+ - ``` - -If Patroni has started properly, you should be able to locally connect to a PostgreSQL node using the following command: - -```{.bash data-prompt="$"} -$ sudo psql -U postgres -``` - -The command output is the following: - -``` -psql ({{dockertag}}) -Type "help" for help. - -postgres=# -``` - -## Configure HAProxy - -HAproxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application accesses the HAPpoxy URL and sends its read/write requests there. Behind-the-scene, HAProxy routes write requests to the primary node and read requests - to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads - to port 5001 - -This way, a client application doesn’t know what node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected. - -1. Install HAProxy on the `HAProxy-demo` node: - - ```{.bash data-prompt="$"} - $ sudo apt install percona-haproxy - ``` - -2. The HAProxy configuration file path is: `/etc/haproxy/haproxy.cfg`. Specify the following configuration in this file. 
- - ``` - global - maxconn 100 - - defaults - log global - mode tcp - retries 2 - timeout client 30m - timeout connect 4s - timeout server 30m - timeout check 5s - - listen stats - mode http - bind *:7000 - stats enable - stats uri / - - listen primary - bind *:5000 - option httpchk /primary - http-check expect status 200 - default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions - server node1 node1:5432 maxconn 100 check port 8008 - server node2 node2:5432 maxconn 100 check port 8008 - server node3 node3:5432 maxconn 100 check port 8008 - - listen standbys - balance roundrobin - bind *:5001 - option httpchk /replica - http-check expect status 200 - default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions - server node1 node1:5432 maxconn 100 check port 8008 - server node2 node2:5432 maxconn 100 check port 8008 - server node3 node3:5432 maxconn 100 check port 8008 - ``` - - - HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately. - -3. Restart HAProxy: - - ```{.bash data-prompt="$"} - $ sudo systemctl restart haproxy - ``` - -4. Check the HAProxy logs to see if there are any errors: - - ```{.bash data-prompt="$"} - $ sudo journalctl -u haproxy.service -n 100 -f - ``` - -## Next steps - -[Configure pgBackRest](pgbackrest.md){.md-button} diff --git a/docs/solutions/ha-setup-yum.md b/docs/solutions/ha-setup-yum.md deleted file mode 100644 index 98f40ca29..000000000 --- a/docs/solutions/ha-setup-yum.md +++ /dev/null @@ -1,584 +0,0 @@ -# Deploying PostgreSQL for high availability with Patroni on RHEL or CentOS - -This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni on Red Hat Enterprise Linux or CentOS. - - -## Considerations - -1. This is an example deployment where etcd runs on the same host machines as the Patroni and PostgreSQL and there is a single dedicated HAProxy host. Alternatively etcd can run on different set of nodes. - - If etcd is deployed on the same host machine as Patroni and PostgreSQL, separate disk system for etcd and PostgreSQL is recommended due to performance reasons. - -2. For this setup, we use the nodes running on Red Hat Enterprise Linux 8 as the base operating system: - - | Node name | Application | IP address - |---------------|-------------------|-------------------- - | node1 | Patroni, PostgreSQL, etcd | 10.104.0.1 - | node2 | Patroni, PostgreSQL, etcd | 10.104.0.2 - | node3 | Patroni, PostgreSQL, etcd | 10.104.0.3 - | HAProxy-demo | HAProxy | 10.104.0.6 - - -!!! note - - We recommend not to expose the hosts/nodes where Patroni / etcd / PostgreSQL are running to public networks due to security risks. Use Firewalls, Virtual networks, subnets or the like to protect the database hosts from any kind of attack. - -## Initial setup - -### Set up hostnames in the `/etc/hosts` file - -It's not necessary to have name resolution, but it makes the whole setup more readable and less error prone. Here, instead of configuring a DNS, we use a local name resolution by updating the file `/etc/hosts`. By resolving their hostnames to their IP addresses, we make the nodes aware of each other's names and allow their seamless communication. - -=== "node1" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname node1 - ``` - - 2. Modify the `/etc/hosts` file to include the hostnames and IP addresses of the remaining nodes. 
Add the following at the end of the `/etc/hosts` file on all nodes: - - ```text hl_lines="3 4" - # Cluster IP and names - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -=== "node2" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname node2 - ``` - - 2. Modify the `/etc/hosts` file to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes: - - ```text hl_lines="2 4" - # Cluster IP and names - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -=== "node3" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname node3 - ``` - - 2. Modify the `/etc/hosts` file to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes: - - ```text hl_lines="2 3" - # Cluster IP and names - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -=== "HAproxy-demo" - - 1. Set up the hostname for the node - - ```{.bash data-prompt="$"} - $ sudo hostnamectl set-hostname HAProxy-demo - ``` - - 2. Modify the `/etc/hosts` file. The HAProxy instance should have the name resolution for all the three nodes in its `/etc/hosts` file. Add the following lines at the end of the file: - - ```text hl_lines="3 4 5" - # Cluster IP and names - 10.104.0.6 HAProxy-demo - 10.104.0.1 node1 - 10.104.0.2 node2 - 10.104.0.3 node3 - ``` - -### Install the software - -Run the following commands on `node1`, `node2` and `node3`: - -1. Install Percona Distribution for PostgreSQL: - - * Check the [platform specific notes](../yum.md#for-percona-distribution-for-postgresql-packages) - * Install the `percona-release` repository management tool - - --8<-- "percona-release-yum.md" - - * Enable the repository: - - ```{.bash data-prompt="$"} - $ sudo percona-release setup ppg{{pgversion}} - ``` - - * Install Percona Distribution for PostgreSQL package - - ```{.bash data-prompt="$"} - $ sudo yum install percona-postgresql{{pgversion}}-server - ``` - - !!! important - - **Don't** initialize the cluster and start the `postgresql` service. The cluster initialization and setup are handled by Patroni during the bootsrapping stage. - -2. Install some Python and auxiliary packages to help with Patroni and etcd - - ```{.bash data-prompt="$"} - $ sudo yum install python3-pip python3-devel binutils - ``` - -3. Install etcd, Patroni, pgBackRest packages. Check [platform specific notes for Patroni](../yum.md#for-percona-patroni-package): - - ```{.bash data-prompt="$"} - $ sudo yum install percona-patroni \ - etcd python3-python-etcd\ - percona-pgbackrest - ``` - -4. Stop and disable all installed services: - - ```{.bash data-prompt="$"} - $ sudo systemctl stop {etcd,patroni,postgresql-{{pgversion}}} - $ sudo systemctl disable {etcd,patroni,postgresql-{{pgversion}}} - ``` - -## Configure etcd distributed store - -In our implementation we use etcd distributed configuration store. [Refresh your knowledge about etcd](high-availability.md#etcd). - -!!! note - - If you [installed the software from tarballs](../tarball.md), you must first [enable etcd](../enable-extensions.md#etcd) before configuring it. - -To get started with `etcd` cluster, you need to bootstrap it. This means setting up the initial configuration and starting the etcd nodes so they can form a cluster. 
There are the following bootstrapping mechanisms: - -* Static in the case when the IP addresses of the cluster nodes are known -* Discovery service - for cases when the IP addresses of the cluster are not known ahead of time. - -Since we know the IP addresses of the nodes, we will use the static method. For using the discovery service, please refer to the [etcd documentation :octicons-external-link-16:](https://etcd.io/docs/v3.5/op-guide/clustering/#etcd-discovery){:target="_blank"}. - -We will configure and start all etcd nodes in parallel. This can be done either by modifying each node's configuration or using the command line options. Use the method that you prefer more. - -### Method 1. Modify the configuration file - -1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes. - - === "node1" - - ```yaml title="/etc/etcd/etcd.conf.yaml" - name: 'node1' - initial-cluster-token: PostgreSQL_HA_Cluster_1 - initial-cluster-state: new - initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380 - data-dir: /var/lib/etcd - initial-advertise-peer-urls: http://10.104.0.1:2380 - listen-peer-urls: http://10.104.0.1:2380 - advertise-client-urls: http://10.104.0.1:2379 - listen-client-urls: http://10.104.0.1:2379 - ``` - - === "node2" - - ```yaml title="/etc/etcd/etcd.conf.yaml" - name: 'node2' - initial-cluster-token: PostgreSQL_HA_Cluster_1 - initial-cluster-state: new - initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380 - data-dir: /var/lib/etcd - initial-advertise-peer-urls: http://10.104.0.2:2380 - listen-peer-urls: http://10.104.0.2:2380 - advertise-client-urls: http://10.104.0.2:2379 - listen-client-urls: http://10.104.0.2:2379 - ``` - - === "node3" - - ```yaml title="/etc/etcd/etcd.conf.yaml" - name: 'node3' - initial-cluster-token: PostgreSQL_HA_Cluster_1 - initial-cluster-state: new - initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380 - data-dir: /var/lib/etcd - initial-advertise-peer-urls: http://10.104.0.3:2380 - listen-peer-urls: http://10.104.0.3:2380 - advertise-client-urls: http://10.104.0.3:2379 - listen-client-urls: http://10.104.0.3:2379 - ``` - -2. Enable and start the `etcd` service on all nodes: - - ```{.bash data-prompt="$"} - $ sudo systemctl enable --now etcd - $ sudo systemctl start etcd - $ sudo systemctl status etcd - ``` - - During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes are not yet running, the start may fail by a quorum timeout. This is expected behavior. Try starting all nodes again at the same time for the etcd cluster to be created. - ---8<-- "check-etcd.md" - -### Method 2. Start etcd nodes with command line options - -1. On each etcd node, set the environment variables for the cluster members, the cluster token and state: - - ``` - TOKEN=PostgreSQL_HA_Cluster_1 - CLUSTER_STATE=new - NAME_1=node1 - NAME_2=node2 - NAME_3=node3 - HOST_1=10.104.0.1 - HOST_2=10.104.0.2 - HOST_3=10.104.0.3 - CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380 - ``` - -2. 
Start each etcd node in parallel using the following command: - - === "node1" - - ```{.bash data-prompt="$"} - THIS_NAME=${NAME_1} - THIS_IP=${HOST_1} - etcd --data-dir=data.etcd --name ${THIS_NAME} \ - --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ - --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ - --initial-cluster ${CLUSTER} \ - --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} - ``` - - === "node2" - - ```{.bash data-prompt="$"} - THIS_NAME=${NAME_2} - THIS_IP=${HOST_2} - etcd --data-dir=data.etcd --name ${THIS_NAME} \ - --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ - --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ - --initial-cluster ${CLUSTER} \ - --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} - ``` - - === "node3" - - ```{.bash data-prompt="$"} - THIS_NAME=${NAME_3} - THIS_IP=${HOST_3} - etcd --data-dir=data.etcd --name ${THIS_NAME} \ - --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \ - --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \ - --initial-cluster ${CLUSTER} \ - --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} - ``` - ---8<-- "check-etcd.md" - -## Configure Patroni - -Run the following commands on all nodes. You can do this in parallel: - -1. Export and create environment variables to simplify the config file creation: - - * Node name: - - ```{.bash data-prompt="$"} - $ export NODE_NAME=`hostname -f` - ``` - - * Node IP: - - ```{.bash data-prompt="$"} - $ export NODE_IP=`hostname -i | awk '{print $1}'` - ``` - - * Create variables to store the PATH: - - ```bash - DATA_DIR="/var/lib/pgsql/data/" - PG_BIN_DIR="/usr/pgsql-{{pgversion}}/bin" - ``` - - **NOTE**: Check the path to the data and bin folders on your operating system and change it for the variables accordingly. - - * Patroni information: - - ```bash - NAMESPACE="percona_lab" - SCOPE="cluster_1 - ``` - -2. Create the directories required by Patroni - - * Create the directory to store the configuration file and make it owned by the `postgres` user. - - ```{.bash data-prompt="$"} - $ sudo mkdir -p /etc/patroni/ - $ sudo chown -R postgres:postgres /etc/patroni/ - ``` - - * Create the data directory to store PostgreSQL data. Change its ownership to the `postgres` user and restrict the access to it - - ```{.bash data-prompt="$"} - $ sudo mkdir /data/pgsql -p - $ sudo chown -R postgres:postgres /data/pgsql - $ sudo chmod 700 /data/pgsql - ``` - -3. 
Use the following command to create the `/etc/patroni/patroni.yml` configuration file and add the following configuration for `node1`: - - ```bash - echo " - namespace: ${NAMESPACE} - scope: ${SCOPE} - name: ${NODE_NAME} - - restapi: - listen: 0.0.0.0:8008 - connect_address: ${NODE_IP}:8008 - - etcd3: - host: ${NODE_IP}:2379 - - bootstrap: - # this section will be written into Etcd:///config after initializing new cluster - dcs: - ttl: 30 - loop_wait: 10 - retry_timeout: 10 - maximum_lag_on_failover: 1048576 - - postgresql: - use_pg_rewind: true - use_slots: true - parameters: - wal_level: replica - hot_standby: "on" - wal_keep_segments: 10 - max_wal_senders: 5 - max_replication_slots: 10 - wal_log_hints: "on" - logging_collector: 'on' - max_wal_size: '10GB' - archive_mode: "on" - archive_timeout: 600s - archive_command: "cp -f %p /home/postgres/archived/%f" - pg_hba: - - local all all peer - - host replication replicator 127.0.0.1/32 trust - - host replication replicator 192.0.0.0/8 scram-sha-256 - - host all all 0.0.0.0/0 scram-sha-256 - recovery_conf: - restore_command: cp /home/postgres/archived/%f %p - - # some desired options for 'initdb' - initdb: # Note: It needs to be a list (some options need values, others are switches) - - encoding: UTF8 - - data-checksums - - postgresql: - cluster_name: cluster_1 - listen: 0.0.0.0:5432 - connect_address: ${NODE_IP}:5432 - data_dir: ${DATA_DIR} - bin_dir: ${PG_BIN_DIR} - pgpass: /tmp/pgpass0 - authentication: - replication: - username: replicator - password: replPasswd - superuser: - username: postgres - password: qaz123 - parameters: - unix_socket_directories: "/var/run/postgresql/" - create_replica_methods: - - basebackup - basebackup: - checkpoint: 'fast' - - watchdog: - mode: required # Allowed values: off, automatic, required - device: /dev/watchdog - safety_margin: 5 - - - tags: - nofailover: false - noloadbalance: false - clonefrom: false - nosync: false - " | sudo tee -a /etc/patroni/patroni.yml - ``` - -4. Check that the systemd unit file `percona-patroni.service` is created in `/etc/systemd/system`. If it is created, skip this step. - - If it's **not created**, create it manually and specify the following contents within: - - ```ini title="/etc/systemd/system/percona-patroni.service" - [Unit] - Description=Runners to orchestrate a high-availability PostgreSQL - After=syslog.target network.target - - [Service] - Type=simple - - User=postgres - Group=postgres - - # Start the patroni process - ExecStart=/bin/patroni /etc/patroni/patroni.yml - - # Send HUP to reload from patroni.yml - ExecReload=/bin/kill -s HUP $MAINPID - - # only kill the patroni process, not its children, so it will gracefully stop postgres - KillMode=process - - # Give a reasonable amount of time for the server to start up/shut down - TimeoutSec=30 - - # Do not restart the service if it crashes, we want to manually inspect database on failure - Restart=no - - [Install] - WantedBy=multi-user.target - ``` - -5. Make `systemd` aware of the new service: - - ```{.bash data-prompt="$"} - $ sudo systemctl daemon-reload - ``` - -6. Repeat steps 1-5 on the remaining nodes. In the end you must have the configuration file and the systemd unit file created on every node. -7. Now it's time to start Patroni. You need the following commands on all nodes but not in parallel. 
Start with the `node1` first, wait for the service to come to live, and then proceed with the other nodes one-by-one, always waiting for them to sync with the primary node: - - ```{.bash data-prompt="$"} - $ sudo systemctl enable --now patroni - $ sudo systemctl restart patroni - ``` - - When Patroni starts, it initializes PostgreSQL (because the service is not currently running and the data directory is empty) following the directives in the bootstrap section of the configuration file. - -8. Check the service to see if there are errors: - - ```{.bash data-prompt="$"} - $ sudo journalctl -fu patroni - ``` - - A common error is Patroni complaining about the lack of proper entries in the pg_hba.conf file. If you see such errors, you must manually add or fix the entries in that file and then restart the service. - - Changing the patroni.yml file and restarting the service will not have any effect here because the bootstrap section specifies the configuration to apply when PostgreSQL is first started in the node. It will not repeat the process even if the Patroni configuration file is modified and the service is restarted. - - If Patroni has started properly, you should be able to locally connect to a PostgreSQL node using the following command: - - ```{.bash data-prompt="$"} - $ sudo psql -U postgres - - psql ({{dockertag}}) - Type "help" for help. - - postgres=# - ``` - -9. When all nodes are up and running, you can check the cluster status using the following command: - - ```{.bash data-prompt="$"} - $ sudo patronictl -c /etc/patroni/patroni.yml list - ``` - - The output resembles the following: - - ```{.text .no-copy} - + Cluster: cluster_1 (7440127629342136675) -----+----+-------+ - | Member | Host | Role | State | TL | Lag in MB | - +--------+------------+---------+-----------+----+-----------+ - | node1 | 10.0.100.1 | Leader | running | 1 | | - | node2 | 10.0.100.2 | Replica | streaming | 1 | 0 | - | node3 | 10.0.100.3 | Replica | streaming | 1 | 0 | - +--------+------------+---------+-----------+----+-----------+ - ``` - -## Configure HAProxy - -HAproxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application accesses the HAPpoxy URL and sends its read/write requests there. Behind-the-scene, HAProxy routes write requests to the primary node and read requests - to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads - to port 5001 - -This way, a client application doesn’t know what node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected. - -1. Install HAProxy on the `HAProxy-demo` node: - - ```{.bash data-prompt="$"} - $ sudo yum install percona-haproxy - ``` - -2. The HAProxy configuration file path is: `/etc/haproxy/haproxy.cfg`. Specify the following configuration in this file. 
- - ``` - global - maxconn 100 - - defaults - log global - mode tcp - retries 2 - timeout client 30m - timeout connect 4s - timeout server 30m - timeout check 5s - - listen stats - mode http - bind *:7000 - stats enable - stats uri / - - listen primary - bind *:5000 - option httpchk /primary - http-check expect status 200 - default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions - server node1 node1:5432 maxconn 100 check port 8008 - server node2 node2:5432 maxconn 100 check port 8008 - server node3 node3:5432 maxconn 100 check port 8008 - - listen standbys - balance roundrobin - bind *:5001 - option httpchk /replica - http-check expect status 200 - default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions - server node1 node1:5432 maxconn 100 check port 8008 - server node2 node2:5432 maxconn 100 check port 8008 - server node3 node3:5432 maxconn 100 check port 8008 - ``` - - - HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately. - -3. Enable a SELinux boolean to allow HAProxy to bind to non standard ports: - - ```{.bash data-prompt="$"} - $ sudo setsebool -P haproxy_connect_any on - ``` - -4. Restart HAProxy: - - ```{.bash data-prompt="$"} - $ sudo systemctl restart haproxy - ``` - -5. Check the HAProxy logs to see if there are any errors: - - ```{.bash data-prompt="$"} - $ sudo journalctl -u haproxy.service -n 100 -f - ``` - -## Next steps - -[Configure pgBackRest](pgbackrest.md){.md-button} - - diff --git a/docs/solutions/haproxy-info.md b/docs/solutions/haproxy-info.md new file mode 100644 index 000000000..9ff03b325 --- /dev/null +++ b/docs/solutions/haproxy-info.md @@ -0,0 +1,75 @@ +# HAProxy + +HAProxy (High Availability Proxy) is a powerful, open-source load balancer and +proxy server used to improve the performance and reliability of web services by +distributing network traffic across multiple servers. It is widely used to enhance the scalability, availability, and reliability of web applications by balancing client requests among backend servers. + +HAProxy architecture is +optimized to move data as fast as possible with the least possible operations. +It focuses on optimizing the CPU cache's efficiency by sticking connections to +the same CPU as long as possible. + +## How HAProxy works + +HAProxy operates as a reverse proxy, which means it accepts client requests and distributes them to one or more backend servers using the configured load-balancing algorithm. This ensures efficient use of server resources and prevents any single server from becoming overloaded. + +- **Client request processing**: + + 1. A client application connects to HAProxy instead of directly to the server. + 2. HAProxy analyzes the requests and determines what server to route it to for further processing. + 3. HAProxy forwards the request to the selected server using the routing algorithm defined in its configuration. It can be round robin, least connections, and others. + 4. HAProxy receives the response from the server and forwards it back to the client. + 5. After sending the response, HAProxy either closes the connection or keeps it open, depending on the configuration. + +- **Load balancing**: HAProxy distributes incoming traffic using various algorithms such as round-robin, least connections, and IP hash. +- **Health checks**: HAProxy continuously monitors the health of backend servers to ensure requests are only routed to healthy servers. 
+- **SSL termination**: HAProxy offloads SSL/TLS encryption and decryption, reducing the workload on backend servers.
+- **Session persistence**: HAProxy ensures that requests from the same client are routed to the same server for session consistency.
+- **Traffic management**: HAProxy supports rate limiting, request queuing, and connection pooling for optimal resource utilization.
+- **Security**: HAProxy supports SSL/TLS, IP filtering, and integration with Web Application Firewalls (WAF).
+
+## Role in a HA Patroni cluster
+
+HAProxy plays a crucial role in managing PostgreSQL high availability in a Patroni cluster. Patroni is an open-source tool that automates PostgreSQL cluster management, including failover and replication. HAProxy acts as a load balancer and proxy, distributing client connections across the cluster nodes.
+
+Client applications connect to HAProxy, which transparently forwards their requests to the appropriate PostgreSQL node. This ensures that clients always connect to the active primary node without needing to know the cluster's internal state and topology.
+
+HAProxy monitors the health of PostgreSQL nodes using Patroni's API and routes traffic to the primary node. If the primary node fails, Patroni promotes a secondary node to a new primary, and HAProxy updates its routing to reflect the change. You can configure HAProxy to route write requests to the primary node and read requests to the secondary nodes.
+
+## Redundancy for HAProxy
+
+A single HAProxy node creates a single point of failure. If HAProxy goes down, clients lose connection to the cluster. To prevent this, set up multiple HAProxy instances with a failover mechanism. This way, if one instance fails, another takes over automatically.
+
+To implement HAProxy redundancy:
+
+1. Set up a virtual IP address that can move between HAProxy instances.
+
+2. Install and configure a failover mechanism to monitor HAProxy instances and move the virtual IP to a backup if the primary fails.
+
+3. Keep HAProxy configurations synchronized across all instances.
+
+!!! note
+
+    In this reference architecture we focus on the on-premises deployment and use Keepalived as the failover mechanism.
+
+    If you use a cloud infrastructure, it may be easier to use the load balancer provided by the cloud provider to achieve high availability for HAProxy.
+
+### How Keepalived works
+
+Keepalived manages failover by moving the virtual IP to a backup HAProxy node when the primary fails.
+
+No matter how many HAProxy nodes you have, only one of them can be the primary and have the MASTER state. All other nodes are BACKUP nodes. They monitor the MASTER state and take over when it is down.
+
+To determine the MASTER, Keepalived uses the `priority` setting. Every node must have a different priority. The node with the highest priority becomes the MASTER. Keepalived periodically checks every node's health.
+
+When the MASTER node is down or unavailable, its priority is lowered so that the node with the next highest priority becomes the new MASTER and takes over. The priority is adjusted by the value you define in the `weight` setting.
+
+You must define the `priority` and `weight` values carefully. When the primary node fails its health check, its adjusted priority must end up at least 1 lower than the priority of the remaining healthy node with the lowest priority.
+
+For example, suppose your nodes have priorities 110 and 100. The node with priority 110 is the MASTER. When it goes down, its adjusted priority must fall below 100, the priority of the remaining node, so a `weight` of `-11` or lower is required.
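+
+The following is a minimal `/etc/keepalived/keepalived.conf` sketch that illustrates how `priority` and `weight` work together. It is an example only: the instance name, the network interface (`eth1`), the virtual IP address (`203.0.113.50`), and the health-check command are placeholders that you must adapt to your environment.
+
+```
+vrrp_script chk_haproxy {
+    script "killall -0 haproxy"   # succeeds while an haproxy process is running
+    interval 2                    # run the check every 2 seconds
+    weight -11                    # on failure, lower this node's priority by 11 (110 -> 99)
+}
+
+vrrp_instance PG_HAPROXY {
+    state MASTER                  # use BACKUP on the node with priority 100
+    interface eth1                # interface that carries the virtual IP
+    virtual_router_id 51          # must be identical on all HAProxy nodes
+    priority 110                  # set to 100 on the second HAProxy node
+    advert_int 1
+    virtual_ipaddress {
+        203.0.113.50              # floating IP that client applications connect to
+    }
+    track_script {
+        chk_haproxy
+    }
+}
+```
+
+With such a configuration, a failed health check on the MASTER drops its effective priority to 99, which is below the BACKUP node's 100, so the virtual IP moves to the backup instance.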
+
+When a failed node recovers, its priority adjusts again. If it becomes the highest among the nodes, this node regains its MASTER state, holds the virtual IP address, and handles the client connections.
+
+
diff --git a/docs/solutions/high-availability.md b/docs/solutions/high-availability.md
index e6118b3fc..fc7ef2a50 100644
--- a/docs/solutions/high-availability.md
+++ b/docs/solutions/high-availability.md
@@ -1,110 +1,120 @@
 # High Availability in PostgreSQL with Patroni
-PostgreSQL has been widely adopted as a modern, high-performance transactional database. A highly available PostgreSQL cluster can withstand failures caused by network outages, resource saturation, hardware failures, operating system crashes or unexpected reboots. Such cluster is often a critical component of the enterprise application landscape, where [four nines of availability :octicons-link-external-16:](https://en.wikipedia.org/wiki/High_availability#Percentage_calculation) is a minimum requirement.
+Whether you are a small startup or a big enterprise, downtime of your services may cause severe consequences, such as loss of customers, impact on your reputation, and penalties for not meeting the Service Level Agreements (SLAs). That’s why ensuring a highly-available deployment is crucial.
-There are several methods to achieve high availability in PostgreSQL. This solution document provides [Patroni](#patroni) - the open-source extension to facilitate and manage the deployment of high availability in PostgreSQL.
+But what does high availability (HA) mean? And how do you achieve it? This document answers these questions.
-??? admonition "High availability methods"
+After reading this document, you will learn the following:
-    There are several native methods for achieving high availability with PostgreSQL:
+* [what is high availability](#what-is-high-availability)
+* the recommended [reference architecture](ha-architecture.md) to achieve it
+* how to deploy it using our step-by-step deployment guides for each component. The deployment instructions focus on the minimalistic approach to high availability that we recommend. They also explain how to deploy additional components that you can add when your infrastructure grows.
+* how to verify that your high availability deployment works as expected, providing replication and failover, with the [testing guidelines](ha-test.md)
+* additional components that you can add to address existing limitations of your infrastructure. Examples of such limitations are restrictions of the application driver/connector or the lack of a connection pooler in the application framework.
-    - shared disk failover,
-    - file system replication,
-    - trigger-based replication,
-    - statement-based replication,
-    - logical replication,
-    - Write-Ahead Log (WAL) shipping, and
-    - [streaming replication](#streaming-replication)
+## What is high availability
+High availability (HA) is the ability of a system to operate continuously without interruption of services. During an outage, the system must be able to transfer the services from the failed component to the healthy ones so that they can take over its responsibility. The system must have sufficient automation to perform this transfer without human intervention, minimizing disruption.
- ## Streaming replication +Overall, High availability is about: - Streaming replication is part of Write-Ahead Log shipping, where changes to the WALs are immediately made available to standby replicas. With this approach, a standby instance is always up-to-date with changes from the primary node and can assume the role of primary in case of a failover. +1. Reducing the chance of failures +2. Elimination of single-point-of-failure (SPOF) +3. Automatic detection of failures +4. Automatic action to reduce the impact +### How to achieve it? - ### Why native streaming replication is not enough +A short answer is: add redundancy to your deployment, eliminate a single point of failure (SPOF) and have the mechanism to transfer the services from a failed member to the healthy one. - Although the native streaming replication in PostgreSQL supports failing over to the primary node, it lacks some key features expected from a truly highly-available solution. These include: +For a long answer, let's break it down into steps. +#### Step 1. Replication - * No consensus-based promotion of a “leader” node during a failover - * No decent capability for monitoring cluster status - * No automated way to bring back the failed primary node to the cluster - * A manual or scheduled switchover is not easy to manage +First, you should have more than one copy of your data. This means, you need to have several instances of your database where one is the primary instance that accepts reads and writes. Other instances are replicas – they must have an up-to-date copy of the data from the primary and remain in sync with it. They may also accept reads to offload your primary. - To address these shortcomings, there are a multitude of third-party, open-source extensions for PostgreSQL. The challenge for a database administrator here is to select the right utility for the current scenario. +You must deploy these instances on separate hardware (servers or nodes) and use a separate storage for storing the data. This way you eliminate a single point of failure for your database. - Percona Distribution for PostgreSQL solves this challenge by providing the [Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/) extension for achieving PostgreSQL high availability. +The minimum number of database nodes is two: one primary and one replica. -## Patroni +The recommended deployment is a three-instance cluster consisting of one primary and two replica nodes. The replicas receive the data via the replication mechanism. -[Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/) is a Patroni is an open-source tool that helps to deploy, manage, and monitor highly available PostgreSQL clusters using physical streaming replication. Patroni relies on a distributed configuration store like ZooKeeper, etcd, Consul or Kubernetes to store the cluster configuration. +![Primary-replica setup](../_images/diagrams/ha-overview-replication.svg) -### Key benefits of Patroni: +PostgreSQL natively supports logical and streaming replication. To achieve high availability, use streaming replication to ensure an exact copy of data is maintained and is ready to take over, while reducing the delay between primary and replica nodes to prevent data loss. -* Continuous monitoring and automatic failover -* Manual/scheduled switchover with a single command -* Built-in automation for bringing back a failed node to cluster again. -* REST APIs for entire cluster configuration and further tooling. 
-* Provides infrastructure for transparent application failover -* Distributed consensus for every action and configuration. -* Integration with Linux watchdog for avoiding split-brain syndrome. +#### Step 2. Switchover and Failover -## etcd +You may want to transfer the primary role from one machine to another. This action is called a **manual switchover**. A reason for that could be the following: -As stated before, Patroni uses a distributed configuration store to store the cluster configuration, health and status.The most popular implementation of the distributed configuration store is etcd due to its simplicity, consistency and reliability. Etcd not only stores the cluster data, it also handles the election of a new primary node (a leader in ETCD terminology). +* a planned maintenance on the OS level, like applying quarterly security updates or replacing some of the end-of-life components from the server +* troubleshooting some of the problems, like high network latency. -etcd is deployed as a cluster for fault-tolerance. An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state. +Switchover is a manual action performed when you decide to transfer the primary role to another node. The high-availability framework makes this process easier and helps minimize downtime during maintenance, thereby improving overall availability. -The recommended approach is to deploy an odd-sized cluster (e.g. 3, 5 or 7 nodes). The odd number of nodes ensures that there is always a majority of nodes available to make decisions and keep the cluster running smoothly. This majority is crucial for maintaining consistency and availability, even if one node fails. For a cluster with n members, the majority is (n/2)+1. +There could be an unexpected situation where a primary node is down or not responding. Reasons for that can be different, from hardware or network issues to software failures, power outages and the like. In such situations, the high-availability solution should automatically detect the problem, find out a suitable candidate from the remaining nodes and transfer the primary role to the best candidate (promote a new node to become a primary). Such automatic remediation is called **Failover**. -To better illustrate this concept, let's take an example of clusters with 3 nodes and 4 nodes. +![Failover](../_images/diagrams/ha-overview-failover.svg) -In a 3-node cluster, if one node fails, the remaining 2 nodes still form a majority (2 out of 3), and the cluster can continue to operate. +You can do a manual failover when automatic remediation fails, for example, due to: -In a 4-nodes cluster, if one node fails, there are only 3 nodes left, which is not enough to form a majority (3 out of 4). The cluster stops functioning. +* a complete network partitioning +* high-availability framework not being able to find a good candidate +* the insufficient number of nodes remaining for a new primary election. -In this solution we use a 3-nodes etcd cluster that resides on the same hosts with PostgreSQL and Patroni. Though +The high-availability framework allows a human operator / administrator to take control and do a manual failover. -!!! admonition "See also" +#### Step 3. Connection routing and load balancing - - [Patroni documentation :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/SETTINGS.html#settings) +Instead of a single node you now have a cluster. 
How do you enable users to connect to the cluster and ensure they always connect to the correct node, especially when the primary node changes? - - Percona Blog: +One option is to configure DNS to resolve the IP addresses of all cluster nodes. A drawback here is that the primary node still handles all requests. When your system grows, so does the load, and it may overload the primary node and degrade performance. - - [PostgreSQL HA with Patroni: Your Turn to Test Failure Scenarios :octicons-link-external-16:](https://www.percona.com/blog/2021/06/11/postgresql-ha-with-patroni-your-turn-to-test-failure-scenarios/) +You can write your application to send read/write requests to the primary and read-only requests to the secondary nodes. This requires significant programming experience. -## Architecture layout +![Load-balancer](../_images/diagrams/ha-overview-load-balancer.svg) -The following diagram shows the architecture of a three-node PostgreSQL cluster with a single-leader node. +Another option is to use a load-balancing proxy. Instead of connecting directly to the IP address of the primary node, which can change during a failover, you use a proxy that acts as a single point of entry for the entire cluster. This proxy provides the IP address visible to user applications. It also knows which node is currently the primary and directs all incoming write requests to it. At the same time, it can distribute read requests among the replicas to evenly spread the load and improve performance. -![Architecture of the three-node, single primary PostgreSQL cluster](../_images/diagrams/ha-architecture-patroni.png) +To eliminate a single point of failure for the load balancer, we recommend deploying multiple connection routers/proxies for redundancy. Each application server can have its own connection router whose task is to identify the cluster topology and route the traffic to the current primary node. -### Components +Alternatively, you can deploy a redundant load balancer for the whole cluster. The load balancer instances share a public IP address that can "float" from one instance to another in the case of a failure. To control the load balancer's state and transfer the IP address to the active instance, you also need a failover solution for the load balancers. -The components in this architecture are: +The use of a load balancer is optional. If your application does not implement connection routing and load balancing itself, using a load-balancing proxy is a highly recommended approach. -- PostgreSQL nodes -- Patroni - a template for configuring a highly available PostgreSQL cluster. +#### Step 4. Backups -- etcd - a Distributed Configuration store that stores the state of the PostgreSQL cluster. +Even with replication and failover mechanisms in place, it’s crucial to have regular backups of your data. Backups provide a safety net for catastrophic failures that affect both the primary and replica nodes. While replication ensures data is synchronized across multiple nodes, it does not protect against data corruption, accidental deletions, or malicious attacks that can affect all nodes. -- HAProxy - the load balancer for the cluster and is the single point of entry to client applications. +![Backup tool](../_images/diagrams/ha-overview-backup.svg) -- pgBackRest - the backup and restore solution for PostgreSQL +Having regular backups ensures that you can restore your data to a previous state, preserving data integrity and availability even in the worst-case scenarios.
Store your backups in separate, secure locations and regularly test them to ensure that you can quickly and accurately restore them when needed. This additional layer of protection is essential to maintaining continuous operation and minimizing data loss. -- Percona Monitoring and Management (PMM) - the solution to monitor the health of your cluster +The backup tool is optional but highly recommended for data corruption recovery. Additionally, backups protect against human error, such as when a user accidentally drops a table or makes another mistake. -### How components work together +As a result, you end up with the following components for a minimalistic highly available deployment: -Each PostgreSQL instance in the cluster maintains consistency with other members through streaming replication. Each instance hosts Patroni - a cluster manager that monitors the cluster health. Patroni relies on the operational etcd cluster to store the cluster configuration and sensitive data about the cluster health there. +* A minimum two-node PostgreSQL cluster with replication configured among the nodes. The recommended minimalistic cluster is a three-node one. +* A solution to manage the cluster and perform automatic failover when the primary node is down. +* (Optional but recommended) A load-balancing proxy that provides a single point of entry to your cluster and distributes the load across cluster nodes. You need at least two instances of a load-balancing proxy and a failover tool to eliminate a single point of failure. +* (Optional but recommended) A backup and restore solution to protect data against loss, corruption and human error. -Patroni periodically sends heartbeat requests with the cluster status to etcd. etcd writes this information to disk and sends the response back to Patroni. If the current primary fails to renew its status as leader within the specified timeout, Patroni updates the state change in etcd, which uses this information to elect the new primary and keep the cluster up and running. +Optionally, you can add a monitoring tool to observe the health of your deployment, receive alerts about performance issues, and react to them in a timely manner. -The connections to the cluster do not happen directly to the database nodes but are routed via a connection proxy like HAProxy. This proxy determines the active node by querying the Patroni REST API. +### What tools to use? + +The PostgreSQL ecosystem offers many tools for high availability, but choosing the right ones can be challenging. At Percona, we have carefully selected and tested open-source tools to ensure they work well together and help you achieve high availability. + +In our [reference architecture](ha-architecture.md) section we recommend a combination of open-source tools, focusing on a minimalistic three-node PostgreSQL cluster. + +Note that the tools are recommended but not mandatory. You can use your own solutions and alternatives if they better meet your business needs. However, in this case, we cannot guarantee their compatibility and smooth operation.
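+Whichever tools you choose, you can verify the replication building block from Step 1 with standard PostgreSQL monitoring views once a primary/replica pair is running. The following is only an illustrative sketch and assumes streaming replication is already configured:
+
+```sql
+-- On the primary: list connected standbys and estimate their replay lag
+SELECT client_addr, state, sent_lsn, replay_lsn,
+       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
+FROM pg_stat_replication;
+
+-- On a replica: confirm the node runs in read-only recovery mode
+SELECT pg_is_in_recovery();
+```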
+ +### Additional reading + +[Measuring high availability](ha-measure.md){.md-button} ## Next steps -[Deploy on Debian or Ubuntu](ha-setup-apt.md){.md-button} -[Deploy on RHEL or derivatives](ha-setup-yum.md){.md-button} +[Architecture](ha-architecture.md){.md-button} + diff --git a/docs/solutions/patroni-info.md b/docs/solutions/patroni-info.md new file mode 100644 index 000000000..18df3aa58 --- /dev/null +++ b/docs/solutions/patroni-info.md @@ -0,0 +1,81 @@ +# Patroni + +Patroni is an open-source tool designed to manage and automate the high availability (HA) of PostgreSQL clusters. It ensures that your PostgreSQL database remains available even in the event of hardware failures, network issues or other disruptions. Patroni achieves this by using distributed consensus stores like etcd, Consul, or ZooKeeper to manage cluster state and automate failover processes. We'll use [`etcd`](etcd-info.md) in our architecture. + +## Key benefits of Patroni for high availability + +- Automated failover and promotion of a new primary in case of a failure. +- Prevention of and protection from split-brain scenarios (where two nodes believe they are the primary and both accept transactions). Split-brain can lead to serious logical corruption, such as wrong or duplicate data or data loss, and to the associated business loss and risk of litigation. +- Simplified management of PostgreSQL clusters across multiple data centers. +- Self-healing via automatic restarts of failed PostgreSQL instances or reinitialization of broken replicas. +- Integration with tools like `pgBackRest`, `HAProxy`, and monitoring systems for a complete HA solution. + +## How Patroni works + +Patroni uses the `etcd` distributed consensus store to coordinate the state of a PostgreSQL cluster for the following operations: + +1. Cluster state management: + + - After a user installs and configures Patroni, Patroni takes over the PostgreSQL service administration and configuration. + - Patroni maintains the cluster state data such as the PostgreSQL configuration, information about which node is the primary and which are replicas, and their health status. + - Patroni manages PostgreSQL configuration files such as `postgresql.conf` and `pg_hba.conf` dynamically, ensuring consistency across the cluster. + - A Patroni agent runs on each cluster node and communicates with `etcd` and other nodes. + +2. Primary node election: + + - Patroni initiates a primary election process after the cluster is initialized. + - Patroni initiates a failover process if the primary node fails. + - When the old primary is recovered, it rejoins the cluster as a new replica. + - Every new node added to the cluster joins it as a new replica. + - `etcd` and the Raft consensus algorithm ensure that only one node is elected as the new primary, preventing split-brain scenarios. + +3. Automatic failover: + + - If the primary node becomes unavailable, Patroni initiates a new primary election process among the most up-to-date replicas. + - When a node is elected, it is automatically promoted to primary. + - Patroni updates the `etcd` consensus store and reconfigures the remaining replicas to follow the new primary. + +4. Health checks: + + - Patroni continuously monitors the health of all PostgreSQL instances. + - If a node fails or becomes unreachable, Patroni takes corrective actions by restarting PostgreSQL or initiating a failover process.
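+To see this coordination in practice, you can inspect the cluster state that Patroni maintains. The commands below are an illustrative sketch: they assume the Patroni configuration file is at `/etc/patroni/patroni.yml` and that the REST API listens on its default port 8008 on a node named `node1`; adjust these to your environment:
+
+```{.bash data-prompt="$"}
+$ patronictl -c /etc/patroni/patroni.yml list
+$ curl -s http://node1:8008/cluster
+```
+
+The first command prints the cluster members with their roles, state, and replication lag; the second returns the same topology as JSON via Patroni's REST API.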
+ +## Split-brain prevention + +Split-brain is an issue that occurs when two or more nodes believe they are the primary, leading to data inconsistencies. + +Patroni prevents split-brain by using a three-layer protection and prevention mechanism where the `etcd` distributed locking mechanism plays a key role: + +* At the Patroni layer, a node needs to acquire a leader key in the race before promoting itself as the primary. If the node cannot renew its leader key, Patroni demotes it to a replica. +* The `etcd` layer uses the Raft consensus algorithm to allow only one node to acquire the leader key. +* At the OS and hardware layers, Patroni uses the Linux watchdog to perform [STONITH](https://en.wikipedia.org/wiki/Fencing_(computing)#STONITH) / fencing and terminate a PostgreSQL instance to prevent a split-brain scenario. + + +One important aspect of how Patroni works is that it requires a quorum (the majority) of nodes to agree on the cluster state, preventing isolated nodes from becoming a primary. The quorum requirement strengthens Patroni's ability to prevent split-brain. + +## Watchdog + +Patroni can use a watchdog mechanism to improve resilience. But what is a watchdog? + +A watchdog is a mechanism that ensures a system can recover from critical failures. In the context of Patroni, a watchdog is used to forcibly restart the node and terminate a failed primary instance to prevent split-brain scenarios. + +While Patroni itself is designed for high availability, a watchdog provides an extra layer of protection against system-level failures that Patroni might not be able to detect, such as kernel panics or hardware lockups. If the entire operating system becomes unresponsive, Patroni might not be able to function correctly. The watchdog operates independently, so it can detect that the server is unresponsive and reset it, bringing it back to a known good state. + +The watchdog adds an extra layer of safety because it helps protect against scenarios where the `etcd` consensus store is unavailable or network partitions occur. + +There are two types of watchdogs: + +* Hardware watchdog: A physical device that reboots the server if the operating system becomes unresponsive. +* Software watchdog (also called a softdog): A software-based watchdog timer that emulates the functionality of a hardware watchdog but is implemented entirely in software. It is part of the Linux kernel's watchdog infrastructure and is useful in systems that lack dedicated hardware watchdog timers. The softdog monitors the system and takes corrective actions such as killing processes or rebooting the node. + +Most cloud servers nowadays use a softdog. + +## Integration with other tools + +Patroni integrates well with other tools to create a comprehensive high-availability solution. In our architecture, such tools are: + +* HAProxy to check the current topology and route the traffic to both the primary and replica nodes, balancing the load among them, +* pgBackRest to ensure robust backup and restore, +* PMM for monitoring. + +Patroni provides hooks that allow you to customize its behavior. You can use hooks to execute custom scripts or commands at various stages of the Patroni lifecycle, such as before and after a failover, or when a new instance joins the cluster. This way you can integrate Patroni with other systems and automate various tasks. For example, use a hook to update the monitoring system when a failover occurs.
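+For example, Patroni exposes these hooks as *callbacks* in its configuration. The fragment below is only a sketch: the script path and its contents are hypothetical, while the `on_role_change` callback and the arguments Patroni passes to it (action, role, and cluster name) follow Patroni's documented callback interface:
+
+```yaml
+postgresql:
+  callbacks:
+    # Patroni runs this script as: <script> <action> <role> <cluster-name>
+    on_role_change: /usr/local/bin/patroni_notify.sh
+```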
\ No newline at end of file diff --git a/docs/solutions/pgbackrest-info.md b/docs/solutions/pgbackrest-info.md new file mode 100644 index 000000000..34f675061 --- /dev/null +++ b/docs/solutions/pgbackrest-info.md @@ -0,0 +1,35 @@ +# PgBackRest + +`pgBackRest` is an advanced backup and restore tool designed specifically for PostgreSQL databases. `pgBackRest` emphasizes simplicity, speed, and scalability. Its architecture is focused on minimizing the time and resources required for both backup and restoration processes. + +`pgBackRest` uses a custom protocol, which allows for more flexibility compared to traditional tools like `tar` and `rsync` and limits the types of connections that are required to perform a backup, thereby increasing security. `pgBackRest` is a simple, but feature-rich, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads. + +## Key features of `pgBackRest` + +1. **Full, differential, and incremental backups (at file or block level)**: `pgBackRest` supports various types of backups, including full, differential, and incremental, providing efficient storage and recovery options. Block-level backups save space by only copying the parts of files that have changed. + +2. **Point-in-Time recovery (PITR)**: `pgBackRest` enables restoring a PostgreSQL database to a specific point in time, crucial for disaster recovery scenarios. + +3. **Parallel backup and restore**: `pgBackRest` can perform backups and restores in parallel, utilizing multiple CPU cores to significantly reduce the time required for these operations. + +4. **Local or remote operation**: A custom protocol allows `pgBackRest` to backup, restore, and archive locally or remotely via TLS/SSH with minimal configuration. This allows for flexible deployment options. + +5. **Backup rotation and archive expiration**: You can set retention policies to manage backup rotation and WAL archive expiration automatically. + +6. **Backup integrity and verification**: `pgBackRest` performs integrity checks on backup files, ensuring they are consistent and reliable for recovery. + +7. **Backup resume**: `pgBackRest` can resume an interrupted backup from the point where it was stopped. Files that were already copied are compared with the checksums in the manifest to ensure integrity. This operation can take place entirely on the repository host, therefore, it reduces load on the PostgreSQL host and saves time since checksum calculation is faster than compressing and retransmitting data. + +8. **Delta restore**: This feature allows pgBackRest to quickly apply incremental changes to an existing database, reducing restoration time. + +9. **Compression and encryption**: `pgBackRest` offers options for compressing and encrypting backup data, enhancing security and reducing storage requirements. + +## How `pgBackRest` works + +For making backups and restores you need a backup server and the `pgBackRest` agents running on the database nodes. The backup server has the information about a PostgreSQL cluster, where it is located, how to back it up and where to store backup files. This information is defined within a configuration section called a *stanza*. + +The storage location where `pgBackRest` stores backup data and WAL archives is called the repository. It can be a local directory, a remote server, or a cloud storage service like AWS S3, S3-compatible storages or Azure blob storage. `pgBackRest` supports up to 4 repositories, allowing for redundancy and flexibility in backup storage. 
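+To make the stanza and repository concepts more tangible, here is a minimal, illustrative configuration fragment. The paths are examples only; a production setup typically adds TLS, retention, and host options, as shown later in the setup guide:
+
+```ini
+[global]
+# The repository: where backups and WAL archives are stored
+repo1-path=/var/lib/pgbackrest
+
+[cluster_1]
+# The stanza: the PostgreSQL cluster to protect, identified by its data directory
+pg1-path=/var/lib/postgresql/{{pgversion}}/main
+```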
+ +When you create a stanza, it initializes the repository and prepares it for storing backups. During the backup process, `pgBackRest` reads the data from the PostgreSQL cluster and writes it to the repository. It also performs integrity checks and compresses the data if configured. + +Similarly, during the restore process, `pgBackRest` reads the backup data from the repository and writes it to the PostgreSQL data directory. It also verifies the integrity of the restored data. \ No newline at end of file diff --git a/docs/solutions/pgbackrest.md b/docs/solutions/pgbackrest.md index 7697f4e19..f941d8ad4 100644 --- a/docs/solutions/pgbackrest.md +++ b/docs/solutions/pgbackrest.md @@ -1,48 +1,44 @@ # pgBackRest setup -[pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) is a backup tool used to perform PostgreSQL database backup, archiving, restoration, and point-in-time recovery. While it can be used for local backups, this procedure shows how to deploy a [pgBackRest server running on a dedicated host :octicons-link-external-16:](https://pgbackrest.org/user-guide-rhel.html#repo-host) and how to configure PostgreSQL servers to use it for backups and archiving. +[pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) is a backup tool used to perform PostgreSQL database backup, archiving, restoration, and point-in-time recovery. + +In our solution we deploy a [pgBackRest server on a dedicated host :octicons-link-external-16:](https://pgbackrest.org/user-guide-rhel.html#repo-host) and also deploy pgBackRest on the PostgreSQL servers. Then we configure the PostgreSQL servers to use it for backups and archiving. You also need backup storage to store the backups. It can be either a remote storage, such as AWS S3, an S3-compatible storage or Azure blob storage, or a filesystem-based one. -## Configure backup server +## Preparation -To make things easier when working with some templates, run the commands below as the root user. Run the following command to switch to the root user: - -```{.bash data-prompt="$"} -$ sudo su - -``` +Make sure to complete the [initial setup](ha-init-setup.md) steps. + +## Install pgBackRest -### Install pgBackRest +Install pgBackRest on the following nodes: `node1`, `node2`, `node3`, `backup`. -1. Enable the repository with [percona-release :octicons-link-external-16:](https://www.percona.com/doc/percona-repo-config/index.html) +=== ":material-debian: On Debian/Ubuntu" ```{.bash data-prompt="$"} - $ percona-release setup ppg-{{pgversion}} + $ sudo apt install percona-pgbackrest ``` -2. Install pgBackRest package + === ":material-redhat: On RHEL/derivatives" - === ":material-debian: On Debian/Ubuntu" - - ```{.bash data-prompt="$"} - $ apt install percona-pgbackrest - ``` + ```{.bash data-prompt="$"} + $ sudo yum install percona-pgbackrest + ``` - === ":material-redhat: On RHEL/derivatives" +## Configure a backup server - ```{.bash data-prompt="$"} - $ yum install percona-pgbackrest - ``` +Do the following steps on the `backup` node. ### Create the configuration file 1. Create environment variables to simplify the config file creation: ```{.bash data-prompt="$"} - export SRV_NAME="bkp-srv" - export NODE1_NAME="node-1" - export NODE2_NAME="node-2" - export NODE3_NAME="node-3" + export SRV_NAME="backup" + export NODE1_NAME="node1" + export NODE2_NAME="node2" + export NODE3_NAME="node3" export CA_PATH="/etc/ssl/certs/pg_ha" ``` @@ -53,25 +49,25 @@ $ sudo su - This directory is usually created during pgBackRest's installation process.
If it's not there already, create it as follows: ```{.bash data-prompt="$"} - $ mkdir -p /var/lib/pgbackrest - $ chmod 750 /var/lib/pgbackrest - $ chown postgres:postgres /var/lib/pgbackrest + $ sudo mkdir -p /var/lib/pgbackrest + $ sudo chmod 750 /var/lib/pgbackrest + $ sudo chown postgres:postgres /var/lib/pgbackrest ``` 3. The default `pgBackRest` configuration file location is `/etc/pgbackrest/pgbackrest.conf`, but some systems continue to use the old path, `/etc/pgbackrest.conf`, which remains a valid alternative. If the former is not present in your system, create the latter. - Access the file's parent directory (either `cd /etc/` or `cd /etc/pgbackrest/`), and make a backup copy of it: + Go to the file's parent directory (either `cd /etc/` or `cd /etc/pgbackrest/`), and make a backup copy of it: ```{.bash data-prompt="$"} - $ cp pgbackrest.conf pgbackrest.conf.bak + $ sudo cp pgbackrest.conf pgbackrest.conf.orig ``` - Then use the following command to create a basic configuration file using the environment variables we created in a previous step: +4. Then use the following command to create a basic configuration file using the environment variables we created in a previous step. This example command adds the configuration file at the path `/etc/pgbackrest.conf`. Make sure to specify the correct path for the configuration file on your system: === ":material-debian: On Debian/Ubuntu" ``` - cat < pgbackrest.conf + echo " [global] # Server repo details @@ -96,7 +92,7 @@ $ sudo su - repo1-retention-full=4 # Server general options - process-max=12 + process-max=4 # This depends on the number of CPU resources your server has. The recommended value should equal or be less than the number of CPUs. While more processes can speed up backups, they will also consume additional system resources. log-level-console=info #log-level-file=debug log-level-file=info @@ -146,13 +142,14 @@ $ sudo su - pg3-host-key-file=${CA_PATH}/${SRV_NAME}.key pg3-host-ca-file=${CA_PATH}/ca.crt pg3-socket-path=/var/run/postgresql - EOF + + " | sudo tee /etc/pgbackrest.conf ``` === ":material-redhat: On RHEL/derivatives" ``` - cat < pgbackrest.conf + echo " [global] # Server repo details @@ -177,7 +174,7 @@ $ sudo su - repo1-retention-full=4 # Server general options - process-max=12 + process-max=4 # This depends on the number of CPU resources your server has. The recommended value should equal or be less than the number of CPUs. While more processes can speed up backups, they will also consume additional system resources. 
log-level-console=info #log-level-file=debug log-level-file=info @@ -201,7 +198,7 @@ $ sudo su - pg1-host=${NODE1_NAME} pg1-host-port=8432 pg1-port=5432 - pg1-path=/var/lib/pgsql/{{pgversion}}/data + pg1-path=/var/lib/postgresql/{{pgversion}}/main pg1-host-type=tls pg1-host-cert-file=${CA_PATH}/${SRV_NAME}.crt pg1-host-key-file=${CA_PATH}/${SRV_NAME}.key @@ -211,7 +208,7 @@ $ sudo su - pg2-host=${NODE2_NAME} pg2-host-port=8432 pg2-port=5432 - pg2-path=/var/lib/pgsql/{{pgversion}}/data + pg2-path=/var/lib/postgresql/{{pgversion}}/main pg2-host-type=tls pg2-host-cert-file=${CA_PATH}/${SRV_NAME}.crt pg2-host-key-file=${CA_PATH}/${SRV_NAME}.key @@ -221,55 +218,69 @@ $ sudo su - pg3-host=${NODE3_NAME} pg3-host-port=8432 pg3-port=5432 - pg3-path=/var/lib/pgsql/{{pgversion}}/data + pg3-path=/var/lib/postgresql/{{pgversion}}/main pg3-host-type=tls pg3-host-cert-file=${CA_PATH}/${SRV_NAME}.crt pg3-host-key-file=${CA_PATH}/${SRV_NAME}.key pg3-host-ca-file=${CA_PATH}/ca.crt pg3-socket-path=/var/run/postgresql - EOF + + " | sudo tee /etc/pgbackrest.conf ``` *NOTE*: The option `backup-standby=y` above indicates the backups should be taken from a standby server. If you are operating with a primary only, or if your secondaries are not configured with `pgBackRest`, set this option to `n`. ### Create the certificate files + +Run the following commands as a root user or with `sudo` privileges 1. Create the folder to store the certificates: ```{.bash data-prompt="$"} - $ mkdir -p ${CA_PATH} + $ sudo mkdir -p /etc/ssl/certs/pg_ha + ``` + +2. Create the environment variable to simplify further configuration + + ```{.bash data-prompt="$"} + $ export CA_PATH="/etc/ssl/certs/pg_ha" ``` - -2. Create the certificates and keys + +3. Create the CA certificates and keys + + ```{.bash data-prompt="$"} + $ sudo openssl req -new -x509 -days 365 -nodes -out ${CA_PATH}/ca.crt -keyout ${CA_PATH}/ca.key -subj "/CN=root-ca" + ``` + +3. Create the certificate and keys for the backup server ```{.bash data-prompt="$"} - $ openssl req -new -x509 -days 365 -nodes -out ${CA_PATH}/ca.crt -keyout ${CA_PATH}/ca.key -subj "/CN=root-ca" + $ sudo openssl req -new -nodes -out ${CA_PATH}/${SRV_NAME}.csr -keyout ${CA_PATH}/${SRV_NAME}.key -subj "/CN=${SRV_NAME}" ``` -3. Create the certificate for the backup and the PostgreSQL servers +4. Create the certificates and keys for each PostgreSQL node ```{.bash data-prompt="$"} - $ for node in ${SRV_NAME} ${NODE1_NAME} ${NODE2_NAME} ${NODE3_NAME} - do - openssl req -new -nodes -out ${CA_PATH}/$node.csr -keyout ${CA_PATH}/$node.key -subj "/CN=$node"; - done + $ sudo openssl req -new -nodes -out ${CA_PATH}/${NODE1_NAME}.csr -keyout ${CA_PATH}/${NODE1_NAME}.key -subj "/CN=${NODE1_NAME}" + $ sudo openssl req -new -nodes -out ${CA_PATH}/${NODE2_NAME}.csr -keyout ${CA_PATH}/${NODE2_NAME}.key -subj "/CN=${NODE2_NAME}" + $ sudo openssl req -new -nodes -out ${CA_PATH}/${NODE3_NAME}.csr -keyout ${CA_PATH}/${NODE3_NAME}.key -subj "/CN=${NODE3_NAME}" ``` -4. Sign the certificates with the `root-ca` key +4. 
Sign all certificates with the `root-ca` key ```{.bash data-prompt="$"} - $ for node in ${SRV_NAME} ${NODE1_NAME} ${NODE2_NAME} ${NODE3_NAME} - do - openssl x509 -req -in ${CA_PATH}/$node.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/$node.crt; - done + $ sudo openssl x509 -req -in ${CA_PATH}/${SRV_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${SRV_NAME}.crt + $ sudo openssl x509 -req -in ${CA_PATH}/${NODE1_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${NODE1_NAME}.crt + $ sudo openssl x509 -req -in ${CA_PATH}/${NODE2_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${NODE2_NAME}.crt + $ sudo openssl x509 -req -in ${CA_PATH}/${NODE3_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${NODE3_NAME}.crt ``` 5. Remove temporary files, set ownership of the remaining files to the `postgres` user, and restrict their access: ```{.bash data-prompt="$"} - $ rm -f ${CA_PATH}/*.csr - $ chown postgres:postgres -R ${CA_PATH} - $ chmod 0600 ${CA_PATH}/* + $ sudo rm -f ${CA_PATH}/*.csr + $ sudo chown postgres:postgres -R ${CA_PATH} + $ sudo chmod 0600 ${CA_PATH}/* ``` ### Create the `pgbackrest` daemon service @@ -295,59 +306,70 @@ $ sudo su - WantedBy=multi-user.target ``` -2. Reload, start, and enable the service +2. Make `systemd` aware of the new service: + + ```{.bash data-prompt="$"} + $ sudo systemctl daemon-reload + ``` +3. Enable `pgBackRest`: + ```{.bash data-prompt="$"} - $ systemctl daemon-reload - $ systemctl start pgbackrest.service - $ systemctl enable pgbackrest.service + $ sudo systemctl enable --now pgbackrest.service ``` ## Configure database servers Run the following commands on `node1`, `node2`, and `node3`. -1. Install pgBackRest package +1. Install `pgBackRest` package === ":material-debian: On Debian/Ubuntu" ```{.bash data-prompt="$"} - $ apt install percona-pgbackrest + $ sudo apt install percona-pgbackrest ``` === ":material-redhat: On RHEL/derivatives" ```{.bash data-prompt="$"} - $ yum install percona-pgbackrest + $ sudo yum install percona-pgbackrest + ``` 2. Export environment variables to simplify the config file creation: ```{.bash data-prompt="$"} $ export NODE_NAME=`hostname -f` - $ export SRV_NAME="bkp-srv" + $ export SRV_NAME="backup" $ export CA_PATH="/etc/ssl/certs/pg_ha" ``` 3. Create the certificates folder: ```{.bash data-prompt="$"} - $ mkdir -p ${CA_PATH} + $ sudo mkdir -p ${CA_PATH} ``` 4. Copy the `.crt`, `.key` certificate files and the `ca.crt` file from the backup server where they were created to every respective node. Then change the ownership to the `postgres` user and restrict their access. Use the following commands to achieve this: ```{.bash data-prompt="$"} - $ scp ${SRV_NAME}:${CA_PATH}/{$NODE_NAME.crt,$NODE_NAME.key,ca.crt} ${CA_PATH}/ - $ chown postgres:postgres -R ${CA_PATH} - $ chmod 0600 ${CA_PATH}/* + $ sudo scp ${SRV_NAME}:${CA_PATH}/{$NODE_NAME.crt,$NODE_NAME.key,ca.crt} ${CA_PATH}/ + $ sudo chown postgres:postgres -R ${CA_PATH} + $ sudo chmod 0600 ${CA_PATH}/* ``` -5. Edit or create the configuration file which, as explained above, can be either at the `/etc/pgbackrest/pgbackrest.conf` or `/etc/pgbackrest.conf` path: +5. Make a copy of the configuration file. 
The path to it can be either `/etc/pgbackrest/pgbackrest.conf` or `/etc/pgbackrest.conf`: + + ```{.bash data-prompt="$"} + $ sudo cp pgbackrest.conf pgbackrest.conf.orig + ``` + +6. Create the configuration file. This example command adds the configuration file at the path `/etc/pgbackrest.conf`. Make sure to specify the correct path for the configuration file on your system: === ":material-debian: On Debian/Ubuntu" ```ini title="pgbackrest.conf" - cat < pgbackrest.conf + echo " [global] repo1-host=${SRV_NAME} repo1-host-user=postgres @@ -357,7 +379,7 @@ Run the following commands on `node1`, `node2`, and `node3`. repo1-host-ca-file=${CA_PATH}/ca.crt # general options - process-max=16 + process-max=6 log-level-console=info log-level-file=debug @@ -370,14 +392,14 @@ Run the following commands on `node1`, `node2`, and `node3`. [cluster_1] pg1-path=/var/lib/postgresql/{{pgversion}}/main - EOF + " | sudo tee /etc/pgbackrest.conf ``` === ":material-redhat: On RHEL/derivatives" ```ini title="pgbackrest.conf" - cat < pgbackrest.conf + echo " [global] repo1-host=${SRV_NAME} repo1-host-user=postgres @@ -387,7 +409,7 @@ Run the following commands on `node1`, `node2`, and `node3`. repo1-host-ca-file=${CA_PATH}/ca.crt # general options - process-max=16 + process-max=6 log-level-console=info log-level-file=debug @@ -400,10 +422,10 @@ Run the following commands on `node1`, `node2`, and `node3`. [cluster_1] pg1-path=/var/lib/pgsql/{{pgversion}}/data - EOF + " | sudo tee /etc/pgbackrest.conf ``` -6. Create the pgbackrest `systemd` unit file at the path `/etc/systemd/system/pgbackrest.service` +7. Create the pgbackrest `systemd` unit file at the path `/etc/systemd/system/pgbackrest.service` ```ini title="/etc/systemd/system/pgbackrest.service" [Unit] @@ -424,64 +446,73 @@ Run the following commands on `node1`, `node2`, and `node3`. WantedBy=multi-user.target ``` -7. Reload, start, and enable the service +8. Reload the `systemd`, the start the service ```{.bash data-prompt="$"} - $ systemctl daemon-reload - $ systemctl start pgbackrest - $ systemctl enable pgbackrest + $ sudo systemctl daemon-reload + $ sudo systemctl enable --now pgbackrest ``` The pgBackRest daemon listens on port `8432` by default: ```{.bash data-prompt="$"} - $ netstat -taunp - Active Internet connections (servers and established) - Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name - tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd - tcp 0 0 0.0.0.0:8432 0.0.0.0:* LISTEN 40224/pgbackrest + $ netstat -taunp | grep '8432' ``` -8. If you are using Patroni, change its configuration to use `pgBackRest` for archiving and restoring WAL files. Run this command only on one node, for example, on `node1`: + ??? admonition "Sample output" + + ```{text .no-copy} + Active Internet connections (servers and established) + Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name + tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd + tcp 0 0 0.0.0.0:8432 0.0.0.0:* LISTEN 40224/pgbackrest + ``` + +9. If you are using Patroni, change its configuration to use `pgBackRest` for archiving and restoring WAL files. Run this command only on one node, for example, on `node1`: ```{.bash data-prompt="$"} $ patronictl -c /etc/patroni/patroni.yml edit-config ``` - - === ":material-debian: On Debian/Ubuntu" - ```yaml title="/etc/patroni/patroni.yml" - postgresql: - (...) - parameters: - (...) - archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/postgresql/{{pgversion}}/main/pg_wal/%f - (...) - recovery_conf: - (...) 
- restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f %p - (...) - ``` - - === ":material-redhat: On RHEL/derivatives" + This opens the editor for you. + +10. Change the configuration as follows: + + ```yaml title="/etc/patroni/patroni.yml" + postgresql: + parameters: + archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/postgresql/{{pgversion}}/main/pg_wal/%f + archive_mode: true + archive_timeout: 600s + hot_standby: true + logging_collector: 'on' + max_replication_slots: 10 + max_wal_senders: 5 + max_wal_size: 10GB + wal_keep_segments: 10 + wal_level: logical + wal_log_hints: true + recovery_conf: + recovery_target_timeline: latest + restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f "%p" + use_pg_rewind: true + use_slots: true + retry_timeout: 10 + slots: + percona_cluster_1: + type: physical + ttl: 30 + ``` - ```yaml title="/etc/patroni/patroni.yml" - postgresql: - (...) - parameters: - archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/pgsql/{{pgversion}}/data/pg_wal/%f - (...) - recovery_conf: - restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f %p - (...) - ``` - Reload the changed configurations: +11. Reload the changed configurations. Provide the cluster name or the node name for the following command. In our example we use the `cluster_1` cluster name: ```{.bash data-prompt="$"} - $ patronictl -c /etc/patroni/postgresql.yml reload + $ patronictl -c /etc/patroni/patroni.yml restart cluster_1 ``` + It may take a while to reload the new configuration. + :material-information: Note: When configuring a PostgreSQL server that is not managed by Patroni to archive/restore WALs from the `pgBackRest` server, edit the server's main configuration file directly and adjust the `archive_command` and `restore_command` variables as shown above. ## Create backups diff --git a/docs/solutions/postgis.md b/docs/solutions/postgis.md index 00c0b59f7..ebba7f775 100644 --- a/docs/solutions/postgis.md +++ b/docs/solutions/postgis.md @@ -1,7 +1,5 @@ # Spatial data manipulation -!!! admonition "Version added: 15.3" - Organizations dealing with spatial data need to store it somewhere and manipulate it. PostGIS is the open source extension for PostgreSQL that allows doing just that. It adds support for storing the spatial data types such as: * Geographical data like points, lines, polygons, GPS coordinates that can be mapped on a sphere. 
diff --git a/mkdocs-base.yml b/mkdocs-base.yml index 5e545da23..8b8d35bad 100644 --- a/mkdocs-base.yml +++ b/mkdocs-base.yml @@ -3,7 +3,7 @@ site_name: Percona Distribution for PostgreSQL site_description: Documentation site_author: Percona LLC -copyright: Percona LLC and/or its affiliates © 2025 — Cookie Consent +copyright: Percona LLC and/or its affiliates © 2024 — Cookie Consent repo_name: percona/postgresql-docs repo_url: https://github.com/percona/postgresql-docs @@ -53,7 +53,8 @@ theme: - content.tabs.link - content.action.edit - content.action.view - - content.code.copy + - content.code.copy + - content.tooltips extra_css: - https://unicons.iconscout.com/release/v3.0.3/css/line.css @@ -98,8 +99,8 @@ markdown_extensions: pymdownx.inlinehilite: {} pymdownx.snippets: base_path: ["snippets"] - # auto_append: - # - services-banner.md + auto_append: + - services-banner.md pymdownx.emoji: emoji_index: !!python/name:material.extensions.emoji.twemoji emoji_generator: !!python/name:material.extensions.emoji.to_svg @@ -124,110 +125,97 @@ plugins: macros: include_yaml: - 'variables.yml' # Use in markdown as '{{ VAR }}' +# exclude: # Don't process these files +# glob: +# - file.md + with-pdf: # https://github.com/orzih/mkdocs-with-pdf + output_path: '_pdf/PerconaDistributionPostgreSQL-17.pdf' + cover_title: 'Distribution for PostgreSQL Documentation' + + cover_subtitle: 17.0 (October 3, 2024) + author: 'Percona Technical Documentation Team' + cover_logo: docs/_images/Percona_Logo_Color.png + debug_html: false + custom_template_path: _resource/templates + enabled_if_env: ENABLE_PDF_EXPORT mike: version_selector: true css_dir: css javascript_dir: js canonical_version: null - print-site: - add_to_navigation: false - print_page_title: 'Percona Distribution for PostgreSQL documentation' - add_print_site_banner: false - # Table of contents - add_table_of_contents: true - toc_title: 'Table of Contents' - toc_depth: 2 - # Content-related - add_full_urls: false - enumerate_headings: false - enumerate_headings_depth: 1 - enumerate_figures: true - add_cover_page: true - cover_page_template: "docs/templates/pdf_cover_page.tpl" - path_to_pdf: "" - include_css: true - enabled: true extra: version: provider: mike #homepage: # https://docs.percona.com - postgresrecommended: 17 -# Google Analytics configuration - analytics: - provider: google - property: G-J4J70BNH0G - feedback: - title: Was this page helpful? - ratings: - - icon: material/emoticon-happy-outline - name: This page was helpful - data: 1 - note: >- - Thanks for your feedback! - - icon: material/emoticon-sad-outline - name: This page could be improved - data: 0 - note: >- - Thank you for your feedback! Help us improve by using our - - feedback form. + postgresrecommended: 16 nav: - - index.md + - 'Home': 'index.md' - get-help.md + - 'Percona Server for PostgreSQL': postgresql-server.md - Get started: - - installing.md + - Quickstart guide: installing.md - 1. Install: - - apt.md - - yum.md - - tarball.md - - docker.md + - Via apt: apt.md + - Via yum: yum.md + - From tarballs: tarball.md + - Run in Docker: docker.md - enable-extensions.md - repo-overview.md - - 2. Connect to PostgreSQL: - - connect.md - - 3. Manipulate data in PostgreSQL: - - crud.md - - 4. What's next: - - whats-next.md + - 2. Connect to PostgreSQL: connect.md + - 3. Manipulate data in PostgreSQL: crud.md + - 4. 
What's next: whats-next.md - Extensions: - - extensions.md + - 'Extensions': extensions.md - contrib.md - - percona-ext.md + - Percona-authored extensions: percona-ext.md - third-party.md - Solutions: - - solutions.md + - Overview: solutions.md - High availability: - - solutions/high-availability.md - - solutions/ha-setup-apt.md - - solutions/ha-setup-yum.md - - solutions/pgbackrest.md + - 'Overview': 'solutions/high-availability.md' + - solutions/ha-measure.md + - 'Architecture': solutions/ha-architecture.md + - Components: + - 'ETCD': 'solutions/etcd-info.md' + - 'Patroni': 'solutions/patroni-info.md' + - 'HAProxy': 'solutions/haproxy-info.md' + - 'pgBackRest': 'solutions/pgbackrest-info.md' + - solutions/ha-components.md + - Deployment: + - 'Initial setup': 'solutions/ha-init-setup.md' + - 'etcd setup': 'solutions/ha-etcd-config.md' + - 'Patroni setup': 'solutions/ha-patroni.md' + - solutions/pgbackrest.md + - 'HAProxy setup': 'solutions/ha-haproxy.md' - solutions/ha-test.md - Backup and disaster recovery: - - solutions/backup-recovery.md + - 'Overview': 'solutions/backup-recovery.md' - solutions/dr-pgbackrest-setup.md - Spatial data handling: - - solutions/postgis.md - - solutions/postgis-deploy.md - - solutions/postgis-testing.md - - solutions/postgis-upgrade.md - - ldap.md + - Overview: solutions/postgis.md + - Deployment: solutions/postgis-deploy.md + - Query spatial data: solutions/postgis-testing.md + - Upgrade spatial database: solutions/postgis-upgrade.md + - LDAP authentication: + - ldap.md - Upgrade: - - major-upgrade.md + - "Major upgrade": major-upgrade.md - minor-upgrade.md - migration.md - - troubleshooting.md - - uninstalling.md - - release-notes.md - - release-notes-v17.5.md - - release-notes-v17.4.md - - release-notes-v17.2.md - - release-notes-v17.0.md - - telemetry.md - - licensing.md - - trademark-policy.md - - + - Troubleshooting guide: troubleshooting.md + - How to: how-to.md + - Uninstall: uninstalling.md + - Release Notes: + - "Release notes index": "release-notes.md" + - release-notes-v17.0.md + - release-notes-v17.2.md + - release-notes-v17.4.md + - release-notes-v17.5.md + - Reference: + - Telemetry: telemetry.md + - Licensing: licensing.md + - Trademark policy: trademark-policy.md diff --git a/mkdocs.yml b/mkdocs.yml index 00ad630a0..76e16fdd8 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -52,10 +52,21 @@ nav: - Solutions: - Overview: solutions.md - High availability: - - 'High availability': 'solutions/high-availability.md' - - 'Deploying on Debian or Ubuntu': 'solutions/ha-setup-apt.md' - - 'Deploying on RHEL or derivatives': 'solutions/ha-setup-yum.md' - - solutions/pgbackrest.md + - 'Overview': 'solutions/high-availability.md' + - solutions/ha-measure.md + - 'Architecture': solutions/ha-architecture.md + - Components: + - 'ETCD': 'solutions/etcd-info.md' + - 'Patroni': 'solutions/patroni-info.md' + - 'HAProxy': 'solutions/haproxy-info.md' + - 'pgBackRest': 'solutions/pgbackrest-info.md' + - solutions/ha-components.md + - Deployment: + - 'Initial setup': 'solutions/ha-init-setup.md' + - 'etcd setup': 'solutions/ha-etcd-config.md' + - 'Patroni setup': 'solutions/ha-patroni.md' + - solutions/pgbackrest.md + - 'HAProxy setup': 'solutions/ha-haproxy.md' - solutions/ha-test.md - Backup and disaster recovery: - 'Overview': 'solutions/backup-recovery.md' diff --git a/variables.yml b/variables.yml index 9bd256ae6..3782ac236 100644 --- a/variables.yml +++ b/variables.yml @@ -1,6 +1,7 @@ # PG Variables set for HTML output # See also mkdocs.yml 
plugins.with-pdf.cover_subtitle and output_path + release: 'release-notes-v17.5' dockertag: '17.5' pgversion: '17'