Postgres’ Original Project Goals: The Creators Totally Nailed It

Elizabeth Christensen


I had a chance last week to sit down and read the original academic paper announcing Postgres as a platform and the original design goals from 1986. I was just awestruck at the forethought - and how the original project goals laid the foundation for the database that seems to be taking over the world right now.

The PostgreSQL creators totally nailed it. They laid out a flexible framework for a variety of business use cases that would eventually become the most popular database 30 years later.

The paper outlines 6 project goals:

  1. better support for complex objects for a growing world of business and engineering use cases

  2. provide user extendibility for data types, operators and access methods

  3. provide facilities for active databases like alerters and triggers

  4. simplify process for crash recovery

  5. take advantage of upgraded hardware

  6. utilize Codd’s relational model

Let's look at all of them in reference to modern features of Postgres.

1) Objects and data types for a growing world of business and engineering use cases

Postgres has a rich and flexible set of native data types that are designed to meet a vast array of business use cases, from simple record-keeping to complex data analysis.

Numeric Types like SMALLINT and INTEGER are used for whole numbers, while BIGINT might back a user's unique ID or primary keys. Arbitrary-precision types like NUMERIC and DECIMAL are used where exact precision is critical, especially for money. Floating-Point Types like REAL or DOUBLE PRECISION can be used for scientific or engineering calculations where absolute precision isn't as important as the range of values. You also have UUID (with index-friendly UUIDv7 in Postgres 18) for distributed systems and secure URLs.

Character Types like VARCHAR(n) store variable-length text up to a specified maximum length (n) and only use as much storage as needed for the actual text, while CHAR(n) pads values to a fixed length.

Date/Time Types like DATE store only the date (year, month, day). TIMESTAMPTZ is the time and date GOAT and is easily used in global systems, since it stores a single, time zone-aware point in time.
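
To see how several of these might come together, here's a quick hypothetical sketch (the table and column names are made up for illustration):

-- illustrative table using several native Postgres types
CREATE TABLE customer_orders (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id uuid NOT NULL,
total numeric(10,2) NOT NULL, -- exact precision for money
weight_kg double precision, -- range matters more than exactness
notes varchar(500),
ordered_at timestamptz DEFAULT now());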

But wait, that's not all: Postgres also has the ability to easily create custom data types and constrain data to the specifics of each use case.

Using CREATE DOMAIN you can create specific value checks, like confirming a reasonable date range for a birthday or validating an email format.

-- Postgres create domain
CREATE DOMAIN date_of_birth AS date
CHECK (value > '1930-01-01'::date);

CREATE DOMAIN valid_email AS text
NOT NULL
CHECK (value ~* '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+[.][A-Za-z]+$');

Or, using CREATE TYPE directly, you can make a new composite type. For example, a custom data type that stores height, width, and weight in a single field.

-- Postgres create type with composite
CREATE TYPE physical_package AS (
height numeric,
width numeric,
weight numeric);

Enums let you create a custom type with a set of predefined values.

-- Postgres enum
CREATE TYPE order_status AS ENUM (
'pending',
'shipped',
'cancelled');

Constraints take this a step further and let you specify rules and restrictions for your data. A CHECK constraint can validate values against a condition or even refer to other fields, like making sure a room reservation's start time comes before its end time.

-- Postgres check constraint
ALTER TABLE public.reservations
ADD CONSTRAINT start_before_end
CHECK (start_time < end_time);

While most applications will constrain data in their own way, Postgres' strict yet flexible typing gives you both rigid validity and flexibility at the database layer.

2) Extensibility for data types, operators and access methods

The authors knew that just data types wouldn't be enough - the system would actually need to be extensible. In my estimation - this is actually the killer feature of Postgres. Sure, the database is solid - but the ingenuity and enthusiasm of the extension ecosystem is incredibly special.

Let's take PostGIS for example. This extension adds several key data types to the mix - point, line, and polygon - to store geospatial data. PostGIS also brings hundreds of functions along with it. There's now an entire ecosystem around this project, including open source mapping tools and fully open source web map servers that rival paid GIS systems like ESRI.
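
A minimal sketch of what that looks like in practice - the locations table and coordinates below are made up for illustration:

-- enable PostGIS and store one point per row in a hypothetical locations table
CREATE EXTENSION IF NOT EXISTS postgis;

CREATE TABLE locations (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text,
geom geometry(Point, 4326));

-- find locations within roughly 1 km of a point (geography cast gives meters)
SELECT name FROM locations
WHERE ST_DWithin(geom::geography,
ST_SetSRID(ST_MakePoint(-122.33, 47.61), 4326)::geography,
1000);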

The pgvector extension is another good example of Postgres extensibility. Now Postgres can store embedding data right alongside application data. You can have LLMs create embeddings based on your data, then query your data to find relatedness. You can also build your own Postgres RAG system right inside your database.

-- cosine distance between two embedding values (assumes a recipes table with a vector column)
SELECT recipe_1.embedding <=> recipe_2.embedding AS distance
FROM recipes recipe_1, recipes recipe_2
WHERE recipe_1.id = 1 AND recipe_2.id = 2;

Data types and extensions aren’t the only thing that came out of this idea though - the indexes themselves in Postgres are incredibly advanced. Generalized Inverted Index (GIN) and Generalized Search Tree (GiST) are themselves extensible indexing frameworks that support many of the complex data types mentioned above.
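
A minimal sketch of how those index types attach to complex columns - the table and column names here are assumptions:

-- GIN index over a JSONB document column (hypothetical documents table)
CREATE INDEX documents_data_gin ON documents USING GIN (data);

-- GiST index over a PostGIS geometry column (hypothetical locations table)
CREATE INDEX locations_geom_gist ON locations USING GIST (geom);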

3) Features for active databases like alerters and triggers

Modern Postgres users have a suite of tools available for having the database do necessary work itself. The trigger system can automatically update fields once another field changes.

-- Postgres sample function to update fields
CREATE OR REPLACE FUNCTION update_inventory_on_sale()
RETURNS TRIGGER AS $$
BEGIN
UPDATE products
SET quantity_on_hand = quantity_on_hand - NEW.quantity_sold
WHERE id = NEW.product_id;
IF NOT FOUND THEN
RAISE EXCEPTION 'No product found with ID %', NEW.product_id;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
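
To have this function actually fire, you attach it with a trigger - the sales table below is assumed for illustration:

-- run the function after every row inserted into a hypothetical sales table
CREATE TRIGGER sale_updates_inventory
AFTER INSERT ON sales
FOR EACH ROW
EXECUTE FUNCTION update_inventory_on_sale();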

For events outside the database, Postgres has a handy little NOTIFY/LISTEN mechanism for sending notifications to the outside world, so your application or dashboard will know when a new order was placed or a specific action happened. There's even an extension now that exposes LISTEN/NOTIFY events as WebSockets.
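
A bare-bones sketch of the mechanism, with a made-up channel name and payload:

-- a connected application session subscribes to a channel
LISTEN new_orders;

-- any other session (or a trigger function) publishes to that channel
NOTIFY new_orders, 'order 1234 placed';

-- the equivalent function form, handy inside PL/pgSQL
SELECT pg_notify('new_orders', 'order 1234 placed');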

Postgres' logical replication also makes use of the 'active database' idea. Logical replication is cool because it streams individual data changes rather than physical block-level copies, allowing you to replicate data between different major Postgres versions or even different platforms. This flexibility enables powerful use cases like creating specialized read replicas, consolidating multiple databases into a central one, and performing zero-downtime major version upgrades.

-- Postgres create logical replication
CREATE PUBLICATION user_pub FOR TABLE user_id, forum_posts;
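
On the receiving side, a subscription connects to the publisher and starts streaming those changes - the connection details below are placeholders:

-- on the subscriber database (placeholder connection string)
CREATE SUBSCRIPTION user_sub
CONNECTION 'host=publisher.example.com dbname=app user=replicator'
PUBLICATION user_pub;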

4) Simplify process for crash recovery

The original method of Postgres data recovery relied on writing all data modifications to the files on disk before each commit, an approach called "force-to-disk". Unfortunately, this original implementation had major performance issues and a potential for corruption. The Write-Ahead Log (WAL), released with version 7.1, changed this to a system that first writes changes to a log file and then applies those changes to the main data files.

WAL is the foundation of Postgres' amazing backup and disaster recovery story. WAL is used to create incremental backups, complete with the Point-in-Time Recovery system that many rely on today.

WAL is also foundational to Postgres streaming replication, which makes high availability possible. A primary writes all database changes (inserts, updates, deletes) into its Write-Ahead Log and then "streams" these WAL records over the network to the standby (replica) nodes. The standby nodes receive these WAL records and apply them to their own copy of the database, keeping them in sync with the primary. In the event of an emergency, automated failover tooling like Patroni can promote a standby to be the new primary.
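
For a sense of what enabling this looks like on the primary, here are the kinds of settings involved (values are illustrative, not a recommendation):

-- primary-side settings for streaming replication and WAL archiving
ALTER SYSTEM SET wal_level = 'replica'; -- WAL carries enough detail for standbys
ALTER SYSTEM SET max_wal_senders = 5; -- concurrent streaming connections allowed
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET archive_command = 'cp %p /mnt/wal_archive/%f'; -- placeholder archive location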

5) Take advantage of upgraded hardware

PostgreSQL was engineered for the hardware realities of its time: single-core CPUs, severely limited RAM often measured in megabytes, and slow, spinning hard drives. The primary design focus was on correctness and data durability over raw speed. PostgreSQL built its legendary reputation for stability and ACID compliance, ensuring that data remained safe even when running on less reliable hardware.

Fast forward to today, where PostgreSQL runs on hardware with dozens of CPU cores, terabytes of ultra-fast NVMe storage, and vast amounts of RAM (machines with half a TB of RAM are readily available now). PostgreSQL introduced parallel query execution, which breaks a complex query into pieces, runs them simultaneously across multiple workers, and gathers the results at the end. Modern PostgreSQL has also vastly improved its locking mechanisms, connection pooling solutions, and replication capabilities, evolving from a robust single-server database into a high-performance powerhouse that can scale horizontally and handle the massive, concurrent workloads of the modern internet.
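
A tiny sketch from the SQL side - the orders table is hypothetical and the setting value is just an example:

-- allow up to 4 parallel workers per query step in this session (example value)
SET max_parallel_workers_per_gather = 4;

-- on a large table, the plan shows a Gather node fed by parallel worker scans
EXPLAIN SELECT count(*) FROM orders;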

While Postgres does not yet use multi-threading within a single backend process, this is on the horizon, and Postgres 18 just added asynchronous I/O.

6) Utilize Codd’s relational model

At the height of the NoSQL movement in the late 2000s and early 2010s, a common story was told that relational databases were a relic of the past. With the rise of big and unstructured data, the story went, this old model would soon be cast out.

Postgres continued to do what it has always done: it embraced its core strength - flexibility of data typing - and adopted some of NoSQL's own ideas. Postgres introduced the JSON data type and later the binary, indexable JSONB type. With this update, applications can store schema-less, API-driven JSON data directly in a relational database and query it efficiently using a rich set of operators and functions. With features like JSON_TABLE, you can move between JSON documents and traditional tabular rows.
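
As a small, hypothetical example of storing and querying schema-less JSON next to relational columns:

-- store raw API payloads alongside relational columns in a hypothetical events table
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
payload jsonb NOT NULL);

-- containment query: find events whose payload includes this key/value pair
SELECT id FROM events
WHERE payload @> '{"type": "signup"}';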

The newest revolution in the Postgres world seems to be the adoption of technologies that tie Postgres directly to unstructured flat files. Projects like pg_duckdb, pg_mooncake, and Crunchy Data Warehouse use custom extensions to work with CSV, Parquet, and Iceberg files directly in the remote object stores (data lakes) where they reside. Even with the data living in another location, Postgres' relational model is still relevant, efficient, and trusted.

Summary

With Postgres’ flexibility - you can have a fully normalized, relational schema with foreign keys and JOINs, while also having an indexed JSONB document and full spatial geometry. We’re at a point in history where AI, science, and research are backed by a database that had no idea what the world would be like when it was built. Postgres is still here.

These original goals have had a profound impact on the project: they allowed for complexity and flexibility in a growing business landscape while remaining easy to adapt to individual use cases, and they left Postgres ready for the hardware (and cloud) technology that makes running it anywhere even easier.
