The pgBackRest team is pleased to announce the introduction of multiple repository support in v2.33. Backups already provide redundancy by creating an offline copy of your PostgreSQL cluster that can be used in disaster recovery. Multiple repositories allow you to have copies of your backups and WAL archives in separate locations to increase your redundancy and provide even more protection for your data. This feature is the culmination of many months of hard work, so let's delve into why we think multiple repositories are so important and how they can help preserve your data.
Up to four repositories may be configured and each one can be any repo type,
e.g. S3 or Posix. The configuration below defines two repositories.
Posix and stored on locally-mounted NFS volume; the repository should always be
located off the PostgreSQL server in case disaster strikes.
repo2 is stored on
$ cat /etc/pgbackrest/pgbackrest.conf [demo] pg1-path=/var/lib/postgresql/13/demo [global] repo1-path=/var/lib/pgbackrest repo1-retention-full=2 repo2-type=azure repo2-azure-account=pgbackrest repo2-azure-container=demo-container repo2-azure-key=YXpLZXk= repo2-path=/demo-repo repo2-retention-full=8
Note that the retention has been configured differently on each repository.
repo1 has a shorter retention to save space but still provide several backups
and plenty of WAL on local storage that is both fast and cheap (in terms of
bandwidth cost) to access.
repo2 has a longer retention and is stored in Azure
where storage is cheap but retrieving data is both slower and more expensive
repo2 is both a fail-safe in case something goes wrong with
repo1 and a resource to restore backups from remote sites.
pgBackRest will treat
repo1 with higher priority than
repo2 for certain
commands like restore and archive-get. In general, the lower-numbered
repositories should be faster and/or cheaper than the higher-numbered
Now that the repositories are configured, run
stanza-create. This will
demo stanza in each repository.
$ pgbackrest --stanza=demo stanza-create <...> INFO: stanza-create for stanza 'demo' on repo1 INFO: stanza-create for stanza 'demo' on repo2 <...>
Once the repositories have been initialized it is a good idea to run
ensure that everything is working. Note that WAL segments are being pushed to
$ pgbackrest --stanza=demo check <...> INFO: check repo1 configuration (primary) INFO: check repo2 configuration (primary) INFO: check repo1 archive for WAL (primary) INFO: WAL segment 000000010000000000000003 successfully archived to '/var/lib/pgbackrest/archive/demo/13-1/0000000100000000/000000010000000000000003-c981b4ddc8c1437c539eda05427a6aa454a0923e.gz' on repo1 INFO: check repo2 archive for WAL (primary) INFO: WAL segment 000000010000000000000003 successfully archived to '/demo-repo/archive/demo/13-1/0000000100000000/00000001000000000000000
Now a backup should be run for each repository. Most commands operate automatically on all repos but backup requires the repo to be specified. Each repo is likely to have different retention and backup schedules so backups should be run independently.
$ pgbackrest --stanza=demo --repo=1 backup $ pgbackrest --stanza=demo --repo=2 backup
Info for the repositories is shown in a unified fashion with the organizing unit being the stanza. This way all the backups for a stanza can be seen together and the most recent backup easily identified since the list is sorted oldest to newest.
$ pgbackrest info stanza: demo status: ok cipher: none db (current) wal archive min/max (13): 000000010000000000000003/000000010000000000000006 full backup: 20210331-155726F timestamp start/stop: 2021-03-31 15:57:26 / 2021-03-31 15:57:30 wal start/stop: 000000010000000000000005 / 000000010000000000000005 database size: 23.1MB, database backup size: 23.1MB repo1: backup set size: 2.8MB, backup size: 2.8MB full backup: 20210331-155736F timestamp start/stop: 2021-03-31 15:57:36 / 2021-03-31 15:57:41 wal start/stop: 000000010000000000000006 / 000000010000000000000006 database size: 23.1MB, database backup size: 23.1MB repo2: backup set size: 2.8MB, backup size: 2.8MB
It is also possible to get info for a single repository by specifying the
$ pgbackrest --repo=2 info stanza: demo status: ok cipher: none db (current) wal archive min/max (13): 000000010000000000000003/000000010000000000000006 full backup: 20210331-155736F timestamp start/stop: 2021-03-31 15:57:36 / 2021-03-31 15:57:41 wal start/stop: 000000010000000000000006 / 000000010000000000000006 database size: 23.1MB, database backup size: 23.1MB repo2: backup set size: 2.8MB, backup size: 2.8MB
When restoring, pgBackRest will automatically select the best backup from the repositories based on your criteria. Here is an example of time-based recovery:
pgbackrest --stanza=demo --delta --type=time --target="2021-03-31 15:57:31-04" restore <...> INFO: repo1: restore backup set 20210331-155726F <...>
In this case only the backup from
repo1 was valid because of the time
restriction. Even if a later time was specified that would seem to favor the
later backup in
repo2, the same backup from
repo1 will be selected. This is
because pgBackRest tends to prefer the repo with higher priority, i.e. repo1
over repo2, under the assumption that it will be faster and/or cheaper than a
repository with lower priority.
If you want to restore from a specific repository then simply specify the
preferred repository using
archive-get command generated for
recovery will always search all repositories in priority order for WAL segments.
Multiple repositories allow for more redundancy as well as cost savings and performance improvements by allowing a repository to be located close to the PostgreSQL clusters while also having a repository located safely out in the cloud or in another data center.
For more information about multiple repository support, see the User Guide.
April 9, 2021 •More by this author