Postgres Migrations Using Logical Replication
Jesse Soyland
Moving a Postgres database isn’t a small task; for most Postgres users it’s one of the biggest projects you’ll undertake. If you’re migrating to a new Postgres major version or moving to an entirely new platform or host, you have a couple of options:
- Using pg_dump and pg_restore: pg_dump is a very reliable way to collect an entire database and restore it to a new place. This includes the entire schema, all tables, and special database elements. If you’re migrating a small database, say 50, 100, or 150GB, this is probably the easiest way to do it. On modern hardware a dump and restore using this method can be done in less than an hour.
- Using WAL: For folks who have a WAL-based backup system like pgBackRest or WAL-G/E, you can do a major Postgres migration by running a full base backup and streaming the WAL to your new host. Once you’re ready to cut over to the new database, you have an exact copy already standing by. This is a great way for larger databases, in the terabyte range, to do a major migration with minimal downtime.
But what if your database is too big for a dump and restore, you can’t take the downtime, and you don’t have access to the WAL (i.e. you’re on a host like RDS that doesn’t share it)? There’s a third option:
- Logical migration: Using Postgres logical replication you can set up a database copy at a new location. While WAL has everything, logical replication just captures data changes and doesn’t migrate the schema, indexes, sequences, and a couple of other fiddly bits. But with a few tricks in this post, you can capture everything for a full migration using logical replication.
The architecture of logical replication is straightforward; see our intro post on Data To Go: Postgres Logical Replication if you’re brand new to the topic. Your existing database will be the publisher, and the receiving database will be the subscriber. In the initial load, all data is copied from the publisher to the subscriber. Following the initial data copy, any transactions made on the publisher side are sent to the subscriber.
## Step 1: Migrate schema
Logical replication only replicates data changes (`INSERT`, `UPDATE`, `DELETE`), so you must ensure that the target database has the correct schema beforehand. To get a schema-only dump of your source and apply it to your target database, run something like:
```
pg_dump -Fc -s $SOURCE_DB_URI | pg_restore --no-acl --no-owner -d $TARGET_DB_URI
```
If your migration process is proceeding while application development continues, make sure to apply any schema changes you make on the source database to the receiving database as well.
## Step 2: Publisher (current host) setup
Logical replication is enabled via the `wal_level` setting. Some managed Postgres services may have a different way to turn this on.

```
wal_level = logical
```

### Slot configuration

Review the replication slot settings to make sure there are sufficient resources. For very large replication projects, the defaults may need to be changed:
- `max_replication_slots`
- `max_wal_senders`
- `max_logical_replication_workers`
- `max_worker_processes`
- `max_sync_workers_per_subscription`
For details on how these parameters should be set, see the PostgreSQL chapter on logical replication configuration settings.
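As a rough sketch of what that looks like (the values here are illustrative, and on managed services these settings are usually changed through the provider’s console rather than SQL), you can inspect and raise them like this:

```sql
-- Check the current values on the publisher
SHOW wal_level;
SHOW max_replication_slots;
SHOW max_wal_senders;

-- On self-managed Postgres, raise them if needed; wal_level,
-- max_replication_slots, and max_wal_senders require a restart to take effect
ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 20;
ALTER SYSTEM SET max_wal_senders = 20;
```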
### Networking
Make sure that the network/firewall for your old database permits connections from your new database.
### Replication user for the new subscriber
You can create a specific user for this purpose that has the `REPLICATION` role attribute. Also make sure that the new role has read access to the tables being replicated.
```sql
CREATE ROLE elizabeth WITH REPLICATION LOGIN PASSWORD 'my_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO elizabeth;
```
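If new tables might be created on the source while replication is being set up, you may also want those future tables to be readable by the replication role. One way to do that, as a sketch (run it as the role that will own the new tables, or add a FOR ROLE clause):

```sql
-- Hypothetical: extend SELECT to tables created later in the public schema
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO elizabeth;
```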
### Find tables without primary keys or UNIQUE indexes
For logical replication, Postgres needs a way to uniquely identify rows to be updated/deleted. For tables with primary keys, that key is used, so first identify any tables that lack primary keys:
```sql
select tab.table_schema,
       tab.table_name
from information_schema.tables tab
  left join information_schema.table_constraints tco
    on tab.table_schema = tco.table_schema
    and tab.table_name = tco.table_name
    and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
  and tab.table_schema not in ('pg_catalog', 'information_schema')
  and tco.constraint_name is null
order by table_schema,
         table_name;
```
For tables without primary keys, any `UNIQUE` index can be used:
```sql
ALTER TABLE tablename REPLICA IDENTITY USING INDEX idx_some_unique_index;
```
If there are no existing `UNIQUE` indexes, one can be created, or the table can be set with `REPLICA IDENTITY FULL`, in which case it treats each row as its own "key":
```sql
ALTER TABLE tablename REPLICA IDENTITY FULL;
```
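To double-check which replica identity each table ended up with, you can query the catalogs; a quick sketch, assuming your tables live in the public schema:

```sql
-- relreplident: 'd' = default (primary key), 'i' = index, 'f' = full, 'n' = nothing
SELECT relname, relreplident
FROM pg_class
WHERE relkind = 'r'
  AND relnamespace = 'public'::regnamespace;
```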
Next create a publication, which is a grouping of tables you intend to replicate. In most cases you will create a publication FOR ALL TABLES:
```sql
CREATE PUBLICATION bridge_migration FOR ALL TABLES;
```
Check that your tables are ready for publication; all the tables should be listed here:
```sql
SELECT * FROM pg_publication_tables;
```
## Step 3: Subscriber (new host) settings
On the new host side, create a subscription to each publication to begin receiving the published data. Using the connection details for your old host and the login you created in Step 2, set up a subscription to that published data.
```sql
CREATE SUBSCRIPTION bridge_migration CONNECTION 'host={host} port=5432 dbname={database} user={login} password={password}' PUBLICATION bridge_migration;
```
Creating the subscription in this way will create a replication slot on the publisher and begin copying data from tables specified in the publication. A separate temporary slot will be created for each table for the duration of its initial data synchronization copy.
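You can watch those slots on the publisher side; something like this, run on the old host, will list them:

```sql
-- Run on the publisher: expect one durable slot for the subscription, plus
-- temporary slots for tables still in their initial copy
SELECT slot_name, slot_type, temporary, active, restart_lsn
FROM pg_replication_slots;
```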
You can limit how many tables are synchronized at once with the `max_sync_workers_per_subscription` setting.
## Step 4: Monitor the initial copy
You likely want to monitor this initial copy. The `pg_stat_subscription` view shows activity on the subscriber end of the subscription:
```sql
select * from pg_stat_subscription;
```

```
-[ RECORD 1 ]---------+------------------------------
subid                 | 27183
subname               | bridge_migration
worker_type           | table synchronization
pid                   | 1197139
leader_pid            |
relid                 | 26721
received_lsn          |
last_msg_send_time    | 2025-09-26 15:54:45.095215+00
last_msg_receipt_time | 2025-09-26 15:54:45.095215+00
latest_end_lsn        |
latest_end_time       | 2025-09-26 15:54:45.095215+00
-[ RECORD 2 ]---------+------------------------------
subid                 | 27183
subname               | bridge_migration
worker_type           | apply
pid                   | 47075
leader_pid            |
relid                 |
received_lsn          | 4E32/7092F6F8
last_msg_send_time    | 2025-09-26 15:55:11.020012+00
last_msg_receipt_time | 2025-09-26 15:55:11.021989+00
latest_end_lsn        | 4E32/7092F3E0
latest_end_time       | 2025-09-26 15:55:10.843251+00
```
You can also look at the `pg_subscription_rel` view to see the synchronization state of each table with `select * from pg_subscription_rel;`. Here, the `srsubstate` code tells you the state of each table (a query sketch showing tables by name follows the list):
- d - data is being copied
- f - finished table copy
- s - synchronized
- r - ready (normal replication)
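To see each table by name alongside its state, something like this works:

```sql
-- srsubstate: d = copying, f = finished copy, s = synchronized, r = ready
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
ORDER BY srsubstate, table_name;
```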
Because of table bloat and other differences in internal table statistics, you won't be able to compare table sizes between the two hosts. You can, however, run `select count(*)` against tables on both sides to compare row counts.
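One way to script that comparison is to generate a `count(*)` statement per user table and run the output on both the publisher and the subscriber; a rough sketch:

```sql
-- Generates one "SELECT count(*)" per user table; run the generated statements
-- on both hosts and diff the results
SELECT 'SELECT ' || quote_literal(c.oid::regclass::text) || ' AS table_name, count(*) FROM ' || c.oid::regclass || ';'
FROM pg_class c
WHERE c.relkind = 'r'
  AND c.relnamespace::regnamespace::text NOT IN ('pg_catalog', 'information_schema');
```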
## Step 5: Testing and cutover
Now you can begin testing your application against the new database. Once you have confirmed that all the data is present, you can do a migration cutover. This will require stopping transactions on the original host, fixing your sequences, and pointing your application to the new database.
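Before the cutover, it's worth confirming the subscriber has caught up. One rough check is to compare the publisher's current WAL position with what the subscriber has applied:

```sql
-- On the publisher (old host): current WAL write position
SELECT pg_current_wal_lsn();

-- On the subscriber (new host): latest_end_lsn should be at or very near the value above
SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;
```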
## Step 6: Fix sequences
While logical replication will copy over all the data from the source, it doesn't update any of the sequences. For this reason, we recommend that you update your sequences post-cutover, before you begin production operations. The best approach to fixing your sequences is to simply create setval commands for all sequences in your source database(s), which you can do with this query:
```sql
SELECT
  'SELECT setval(' || quote_literal(quote_ident(n.nspname) || '.' || quote_ident(c.relname)) || ', ' || s.last_value || ');'
FROM
  pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  JOIN pg_sequences s ON s.schemaname = n.nspname
    AND s.sequencename = c.relname
WHERE
  c.relkind = 'S';
```
The resulting statements can be saved to a file and executed on the new host to synchronize all sequences.
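The generated statements will look something like this (the sequence name and value here are hypothetical):

```sql
-- Hypothetical example of one generated statement
SELECT setval('public.orders_id_seq', 1234567);
```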
## Final thoughts
Logical replication is a safe and effective migration strategy. Data consistency for replicated tables is ensured as long as the subscriber's schema is identical and replication is one-way with no conflicting writes on the subscriber.
We help folks with migrations to Crunchy Bridge every day. With Postgres you have a lot of choices for no-downtime or low downtime platform changes. Contact us to find out more about the right plan for your project.