Intro to Postgres Custom Data Types
Custom data types is one of those many features that makes PostgreSQL flexible for development of a huge variety of business and application use cases. Data types will help you primarily for data integrity, ensuring your data is stored in the database as you want it to be. A common surprise gift of using strict data types is that you can end up reducing your long term data maintenance.
There’s two main ways to customize data types in Postgres:
- Create DOMAINs, which are value constraints added to built-in data types
- Creating user-defined custom data types
Before you go crazy with custom data types, make sure that one of the 43 existing data types won’t work for you ;). There’s so many data types and existing extensions, most people will never even need to touch custom data types. That doesn't mean it's not fun to learn about!
Hands on tutorials
My wonderful colleague, Jean-Paul Agudo wrote two hands-on tutorials for these topics, one for DOMAINS and the other for User Defined Custom Data Types.
Using CREATE DOMAIN
DOMAINs let you create a specific value check. For example if I want to make sure my birthdays are all greater than Jan 1st, 1930 and my emails are valid, I could create this:
CREATE DOMAIN date_of_birth AS date CHECK (value > '1930-01-01'::date) ; CREATE DOMAIN valid_email AS text NOT NULL CHECK (value ~* '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+[.][A-Za-z]+$') ;
When creating the table, assign the
valid_email data types
to their respective columns. Now, if a database has multiple fields representing
data of birth or valid email, the logic for those fields is portable to those
CREATE TABLE person_using_domains ( id INTEGER GENERATED always AS IDENTITY PRIMARY KEY, firstname TEXT NOT NULL, lastname TEXT NOT NULL, birth_date DATE_OF_BIRTH, email VALID_EMAIL );
psql, all domains will be listed by running
DOMAIN vs CHECK CONSTRAINT
So you might be wondering why you would use a domain when you could just use a check constraint on the data itself? The simple answer here is that check constraints are not easily altered. They have to be dropped and re-added.
DOMAIN can be created at the schema level and there may be
several tables with email address or birth dates. Use one
DOMAIN to control
several fields, thus centralizing the logic.
Using ALTER DOMAIN
Let’s say the application has now evolved, and we only want to store persons born after 1980. You’ll run:
ALTER DOMAIN date_of_birth ADD CHECK (value > '1980-01-01'::date);
Using CREATE TYPE for Custom Data Types
When using they
CREATE TYPE functionality, there are a few options:
- composite type - few data fields are rolled into a single type
- enumerated type - set of values
- range type - a range of values
Below, we walk through using each of these.
Composite types are used when you’re composing several different values into a single field.
If an application manages a delivery system for package delivery, then this application will use the ID of the package most of the time. In addition, characteristics of the package's dimensions and weight are something we want to store together.
CREATE TYPE will create a new data type allowing for storage of height,
width, and, weight in a single field:
CREATE TYPE physical_package AS ( height numeric , width numeric , weight numeric );
Then, assign that type to a table definition:
CREATE TABLE packages ( id BIGINT GENERATED always AS IDENTITY PRIMARY KEY, properties PHYSICAL_PACKAGE );
Casting Data Types
To insert existing data into this field you’ll want to use a cast() function
:: operator. We have a
tutorial on casting
INSERT INTO packages (properties) VALUES ( '(10.3,4.0,0.5)' :: physical_package ), ( '(5,3.0,0.2)' :: physical_package );
Querying composite data types
To access the values inside a composite data type, use dot-notation, similar to
(properties).weight. The composite data type name must be set into
parenthesis when extracting values:
SELECT id, (properties).weight FROM packages; id | weight ----+-------- 1 | 0.5 2 | 0.2
Composite types vs JSON
You’re probably thinking composite types look just like JSON, and you are right. Composite types are more formally structured and JSON is notoriously less structured. Postgres is likely to be more performant for composite types. So if you care about structure and performance, this might be the solution for you. However, JSON is hugely popular for a variety of reasons so that might be more preferable for ease of future development.
When you want to create a set of static values in PostgreSQL, you can use the
CREATE TYPE AS ENUM.
If we wanted a new field for package category as either box or letter, we could use:
CREATE TYPE package_cat AS ENUM ('box','letter');
Later, to add a new category:
ALTER TYPE package_cat ADD VALUE 'postcard';
Working with ENUMs
Craig Kerstiens has a great primer on enums vs check constraints and the conclusion is CHECK constraints are a little better. Postgres allows adding values to an enum, but cannot remove a value from the enum without removing the type and re-adding.
Numerous range data types exist in PostgreSQL. The existing range data types are limited to, integers, big integers, numerics, timestamps with or without time zone and dates. The common usage of creating a new type of range, is to have it handle other range subtypes.
In our delivery scenario, customers accept some price discounts for delays in delivery (e.g. 3 day, overnight, etc). Let’s create that new range type, composed of intervals:
CREATE TYPE delay AS RANGE ( subtype = interval );
Now we create a new table, with previous custom data types, and this this new one:
CREATE TABLE packages_with_delay ( id BIGINT GENERATED always AS IDENTITY PRIMARY KEY, properties PHYSICAL_PACKAGE NOT NULL, category PACKAGE_CAT NOT NULL, acceptable_delay DELAY NOT NULL );
Recent versions of Postgres have additional options for out of the box range types.
Postgres is awesome. With 43 existing datatypes so you probably do not need a custom one, but if you do, you can add a DOMAIN to an existing type or create completely custom types. DOMAINs are awesome, perhaps better than constraints. There are custom types for composites, enums, and ranges. Composites can be helpful, but you might just use JSON instead. Enums aren’t as good as constraints. Range types can be good and are usually subtype definitions.
Experiment with the functionality in our playground and you will have a new tool in the box.
Co-authored with Jean-Paul Argudo.
February 1, 2023 •More by this author