The Postgres documentation covers streaming replication pretty comprehensively, but you may also need something more digestible for reference. In this blog post, we'll discuss how to set up streaming replication in Windows. Credit goes to my colleague Douglas Hunley whose blog post on setting up streaming replication on Linux served as inspiration.
To recap, Postgres replication is the process of copying data from one database server (primary) to another (standby). While this blog post will focus on streaming replication, there is another option with logical replication as well.
The general steps are the same in Windows as in Linux but for the sake of convenience the commands and code samples in this post are tailored specifically to a Windows environment. We are assuming that you already have two servers with Postgres installed on them. Our testing environment was Postgres 11.9 on Windows Server 2012.
Keep in mind that when we say "primary," we mean the Postgres node that you already have running and have configured for read/writes (standbys are not available for writing).
Now, let’s get started.
On your primary, you would need to edit your Postgres.conf to append the following settings:
wal_level = hot_standby
full_page_writes = on
wal_log_hints = on
max_wal_senders = 6
max_replication_slots = 6
hot_standby = on
hot_standby_feedback = on
You will also need to edit your pg_hba.conf on your primary and add the external IP addresses of both the servers. Yes, you will need to put both IP addresses. Below is an example:
host replication repl_user IP_of_primary/32 scram-sha-256
host replication repl_user IP_of_standby/32 scram-sha-256
This file will be replicated to the standby. When that node is promoted to a primary, it will already be configured to allow the old primary to connect.
You may be wondering: who is the repl_user? This is the user we are going to create that has the replication privilege that is needed. We should probably explain why we do this. On your primary, in the command prompt, run the following:
psql -d postgres -U postgres -c "CREATE ROLE repl_user LOGIN REPLICATION ENCRYPTED PASSWORD 'secretpasswordhere';"
As you might have noticed in our configuration changes above (max_replication_slots = 6 ), we are using replication slots for this setup as well. Replication slots avoid issues with not being able to find write-ahead log (WAL) segments when a standby goes down and is brought back up at a later time.
There is one thing to keep in mind about replication slots: when the standby is down, the primary will continue to accumulate WAL since the replication slot isn't being updated. This will cause the WAL directory to fill until the standby is back up and receiving the WAL or until the replication slot is removed. Postgres 13 did bring in a parameter to help with replication slots and disk usage. Postgres 13 now has the max_slot_wal_keep_size parameter which will allow you to specify the amount of WAL replication slots retained. While this will prevent Postgres from filling your disk with WAL waiting for the missing standby, it also means WAL will be discarded and your standby will need to be rebuilt due to the necessary WAL being absent.
On the primary, create the replication slot for the standby by running the following:
psql -d postgres -U postgres -c "SELECT * FROM pg_create_physical_replication_slot('standby1', true);"
After you have completed the above, you will need to wipe your existing standby. To do this you will first need to stop the cluster.
pg_ctl -D "your\data\directory" stop -mi
After stopping the cluster you will need to delete the data directory. You can do this by either using the command prompt and running:
rmdir /s \your\data\directory
Or you can delete it from File Explorer.
Once you’ve deleted the data directory on your standby, you will now run pg_basebackup on the standby to copy your primary’s data directory to it.
pg_basebackup -D \your\data\directory -Fp -R -Xs -c fast -l 'initial clone' -P -v -h IP_of_master -U repl_user
Keep in mind that you will need to enter the repl_user’s password when asked.
Lastly, once the base backup has completed for Postgres 11 and below, you will need to edit your standby’s recovery.conf and add the replication slot name. For Postgres 12 and up, the recovery.conf file no longer exists and the settings for recovery.conf are now moved to Postgres.conf. (Take a look at the release notes for more details.) Using your favorite text editing tool append the below to either your recovery.conf or Postgres.conf depending on your version.
primary_slot_name = 'standby1'
Once you’ve edited the above you can start the standby by either using
[data dir] start as the Postgres user, or using the
net start command
if you created a service entry on your Windows server.
There are many ways to check that replication is working properly. First, in the logs you should see:
Started streaming from primary at [wal] on timeline [number]
You can also run the following commands:
On the primary
SELECT * FROM pg_stat_replication;
On the replica
SELECT * FROM pg_stat_wal_receiver;
You have now successfully set up streaming replication - congratulations!
Once again a huge shout out to Douglas Hunley and his blog post on streaming replication for Linux environments. As always, it is recommended that you review the Postgres documentation. With that said, you might be interested in checking out more blog posts about replication by my Crunchy colleagues. Hope you enjoy:
November 13, 2020 •More by this author