Get Excited About Postgres 18

Elizabeth Christensen


David Christensen


Postgres 18 will be released in just a couple of weeks! Here are some details on the most important and exciting features.

Asynchronous I/O

Postgres 18 adds asynchronous I/O, which means faster reads for many use cases. It is also part of a bigger series of performance improvements planned for future Postgres releases, which may eventually include multi-threading. Expect to see more on this in coming versions.

What is async I/O?

When data isn’t already in the shared memory buffers, Postgres reads from disk, and I/O is needed to retrieve the data. With synchronous I/O, the backend waits for each individual disk request to complete before moving on to something else. For busy databases with a lot of activity, this can be a bottleneck.

Postgres 18 will introduce asynchronous I/O, allowing workers to optimize idle time and improve system throughput by batching reads. Currently, Postgres relies on the operating system for intelligent I/O handling, expecting OS or storage read-ahead for sequential scans and using features like Linux's posix_fadvise for other read types like Bitmap Index Scans. Moving this work into the database with asynchronous I/O will provide a more predictable and better-performing method for batching operations at the database level. Additionally, a new system view, pg_aios, will be available to provide data about the asynchronous I/O system.

Postgres writes will remain synchronous, since that is needed for ACID compliance.

If async I/O seems confusing, think of it like ordering food at a restaurant. In a synchronous model, you would place your order and stand at the counter, waiting, until your food is ready before you can do anything else. In an asynchronous model, you place your order, receive a buzzer, and are free to go back to your table and chat with friends until the buzzer goes off, signaling that your food is ready to be picked up.

Async I/O will affect:

  • sequential scans
  • bitmap heap scans (following the bitmap index scan)
  • some maintenance operations like VACUUM

By default, Postgres 18 sets io_method = worker with 3 I/O worker processes; this can be adjusted up for systems with more CPUs. I haven’t seen any reliable recommendations on tuning this yet, so stay tuned for more on that from our team soon.

For Postgres running on Linux 5.1+, you can instead set the optional io_method = io_uring to use the io_uring system call interface, where I/O is issued by the backends themselves rather than by separate worker processes.
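As a sketch, the relevant postgresql.conf settings look like this (the worker count shown is the default, not a tuning recommendation):

```
# postgresql.conf
io_method = worker       # new default in Postgres 18; 'sync' restores the old behavior
io_workers = 3           # default number of I/O worker processes
#io_method = io_uring    # Linux 5.1+ alternative: backends issue I/O themselves
```

Once running, the new pg_aios view (SELECT * FROM pg_aios;) shows the state of in-flight asynchronous I/O.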

UUID v7

UUIDs are getting a bit of an overhaul in this version with native support for UUID v7.

UUIDs are 128-bit identifiers, usually generated with a large random component, that are globally unique and often used for primary keys. UUIDs are popular in modern applications for a few reasons:

  • Unique: keys generated in more than one place won’t collide.
  • Decoupled: your application can generate a primary key before sending the data to the database.
  • URL obscurity: if your URLs use sequential primary keys (e.g., .../users/5), other URLs are easy to guess (.../users/6, .../users/7). With a UUID (.../users/f47ac10b-58cc-4372-a567-0e02b2c3d479), guessing other IDs is infeasible.

A new standard for UUID v7 came out in mid-2024 (RFC 9562). UUIDv4 was the prior version with native Postgres support, but sorting and indexing it in large tables had performance issues due to its randomness, leading to fragmented indexes and bad locality. UUIDv7 helps with the sorting and indexing issues. It is still partly random, but the first 48 bits (12 hex characters) are a timestamp and the remaining bits are random; this gives better locality for data inserted around the same time and thus better indexability.

The timestamp part is a hexadecimal encoding of the number of milliseconds since January 1, 1970. For example, a UUID that begins with 01896d6e4a5d (hex) represents 1689758091869 (decimal) milliseconds since the epoch, which lands in mid-July 2023.
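As a sketch, you can pull that timestamp back out in SQL by decoding the first 12 hex characters of the UUID; Postgres 18 also adds a uuid_extract_timestamp() helper that does the same thing:

```sql
-- Decode the first 48 bits (12 hex characters) of a UUIDv7 as
-- milliseconds since the Unix epoch, then convert to a timestamp
SELECT to_timestamp(
    ('x' || substring(replace(uuidv7()::text, '-', '') FROM 1 FOR 12))::bit(48)::bigint
    / 1000.0
);

-- The built-in helper, new in Postgres 18
SELECT uuid_extract_timestamp(uuidv7());
```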

Here’s how the DDL looks with uuidv7():

CREATE TABLE user_actions (
    action_id UUID PRIMARY KEY DEFAULT uuidv7(),
    user_id BIGINT NOT NULL,
    action_description TEXT,
    action_time TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- The PRIMARY KEY constraint already creates a unique index on action_id,
-- so no separate index is needed.

B-tree skip scans

There’s a nice performance bump coming in Postgres 18 for some multi-column B-tree indexes.

In Postgres, a multi-column index can serve queries on its leading column(s). For instance, if you have an index on columns (status, date), that index can be used for queries that filter on both status and date, or on just status.

In Postgres 17 and below, this same index cannot be used to answer queries against just the date field; you would have to index that column separately, or the database would resort to a sequential scan plus filter if no appropriate index existed.

In Postgres 18, in many cases it can automatically use this multi-column index for queries touching only the date field.  Known as a skip scan, this lets the system "skip" over portions of the index.

This works when a query omits the leading column(s) from its conditions and the omitted column has low cardinality, i.e. a small number of distinct values. The optimization works by:

  1. Identifying all the distinct values in the omitted leading column(s).
  2. Effectively transforming the query to add conditions matching each of those leading values.
  3. Using the existing machinery for lookups across multiple leading columns, skipping any index pages that cannot match both conditions.

For example, if we had a sales table with columns status and date, we might have a multi-column index:

CREATE INDEX idx_status_date
ON sales (status, date);

An example query could have a where clause that doesn’t include status.

SELECT * FROM sales
WHERE date = '2025-01-01';

Nothing in the query plan explicitly labels this as a skip scan, so you’ll see a normal index scan like this, showing the index condition.

                                QUERY PLAN
-------------------------------------------------------------
 Index Only Scan using idx_status_date on sales  (cost=0.29..21.54 rows=4 width=8)
   Index Cond: (date = '2025-01-01'::date)
(2 rows)

Before 18, a full scan would be done, since the leading column of the index is not included in the query; with skip scan, Postgres can use the same index for this index scan.

In Postgres 18, because status has low cardinality (just a few distinct values), the same compound index can be used via a skip scan. Note that this optimization only works for queries which use the = operator, so it will not work with inequalities or ranges.
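Conceptually, the planner behaves as if the query had been rewritten to enumerate the distinct values of the skipped column. A sketch, assuming status only takes the hypothetical values 'new', 'shipped', and 'returned':

```sql
-- What the skip scan effectively evaluates for
--   SELECT * FROM sales WHERE date = '2025-01-01';
SELECT * FROM sales
WHERE status IN ('new', 'shipped', 'returned')  -- all distinct status values
  AND date = '2025-01-01';
```

Both forms can walk the (status, date) index; the skip scan just discovers the status values for you.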

This all happens behind-the-scenes in the Postgres planner so you don’t need to turn it on. The idea is that it will benefit analytics use cases where filters and conditions often change and aren’t necessarily related to existing indexes.

The query planner will decide if using a skip scan is worthwhile, based on the table's statistics and the number of distinct values in the columns being skipped.

Generated columns on-the-fly

PostgreSQL 18 introduces virtual generated columns. Previously, generated columns were always stored on disk: values were computed at insert or update time, adding a bit of write overhead.

In PostgreSQL 18, virtual generated columns are now the default type for generated columns. If you define a generated column without explicitly specifying STORED, it will be created as a virtual generated column, computed when it is read.

CREATE TABLE user_profiles (
    user_id SERIAL PRIMARY KEY,
    settings JSONB,
    username VARCHAR(100) GENERATED ALWAYS AS (settings ->> 'username') VIRTUAL
);

This is a great update for folks using JSON data: queries can be simplified, and derived values or normalization can be computed on the fly as needed.
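For example, rather than repeating the JSON extraction in every query, queries can reference the generated column on the user_profiles table above directly:

```sql
-- The extraction expression lives in the table definition, not the query
SELECT user_id, username
FROM user_profiles
WHERE username IS NOT NULL;
```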

Note that virtual generated columns are not indexable, since they’re not stored on disk. To index a value derived from JSONB, use a STORED generated column or an expression index.
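If you do need an index on the derived value, two alternatives (sketched against the user_profiles table above, with hypothetical names) are a STORED generated column or an expression index:

```sql
-- Option 1: a STORED generated column, which is materialized and indexable
ALTER TABLE user_profiles
    ADD COLUMN username_stored VARCHAR(100)
        GENERATED ALWAYS AS (settings ->> 'username') STORED;
CREATE INDEX idx_username_stored ON user_profiles (username_stored);

-- Option 2: an expression index on the JSONB extraction itself
CREATE INDEX idx_username_expr ON user_profiles ((settings ->> 'username'));
```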

OAuth 2.0

Good news for folks who use Okta, Keycloak, and other managed authentication services: Postgres is now compatible with OAuth 2.0. This is specified in the main host-based authentication configuration file (pg_hba.conf).

The OAuth system uses bearer tokens: the client application presents a token instead of a password to prove identity. The token is an opaque string whose format is determined by the authorization server. This feature removes the need to store passwords in the database, and it allows more robust security measures like multi-factor authentication (MFA) and single sign-on (SSO) to be managed by external identity providers.
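A pg_hba.conf entry for OAuth might look like the following sketch; the issuer and scope values here are placeholders for your identity provider’s settings, and the server also needs a token validator library configured (via the oauth_validator_libraries setting):

```
# TYPE  DATABASE  USER  ADDRESS   METHOD
host    all       all   samenet   oauth issuer="https://auth.example.com" scope="openid"
```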

Postgres 18 is packed with other improvements

Postgres 18 comes with a staggering 3,000 commits from more than 200 authors. While many of these are features, there are numerous behind-the-scenes additions and optimizations to the Postgres query planner and other parts of the system. Even if you don’t use the optional features, there are still performance benefits (uh ... async I/O is a biggie), bug fixes, and security patches that make upgrading on a regular cadence a good idea.
