Integer overflow occurs when a computer program tries to store an integer but the value being stored exceeds the maximum value that can be represented by the data type being used to store it. We have helped a few Crunchy Data clients navigate this recently and wanted to write up some notes.
In Postgres, there are three integer types:
smallint- A 2-byte integer, -32768 to 32767
integer- A 4-byte integer, -2147483648 to 2147483647
bigint- An 8-byte integer, -9223372036854775808 to +9223372036854775807
It is not uncommon to use a 4-byte integer as a primary key when defining a new table. This can cause problems if the value to be represented is more than 4-bytes can hold. If a sequence’s limit is reached you might see an error in your logs that looks like this:
ERROR: nextval: reached maximum value of sequence "test_id_seq" (2147483647)
Don’t Panic! We have some helpful and intelligible PostgreSQL solutions.
How do you know if you are close to overflowing an integer?
The following query will identify any auto-incrementing columns, which SEQUENCE object it owns, data types of the column and SEQUENCE object, and percent until the sequence value exceeds the sequence or column data type:
SELECT seqs.relname AS sequence, format_type(s.seqtypid, NULL) sequence_datatype, CONCAT(tbls.relname, '.', attrs.attname) AS owned_by, format_type(attrs.atttypid, atttypmod) AS column_datatype, pg_sequence_last_value(seqs.oid::regclass) AS last_sequence_value, TO_CHAR(( CASE WHEN format_type(s.seqtypid, NULL) = 'smallint' THEN (pg_sequence_last_value(seqs.relname::regclass) / 32767::float) WHEN format_type(s.seqtypid, NULL) = 'integer' THEN (pg_sequence_last_value(seqs.relname::regclass) / 2147483647::float) WHEN format_type(s.seqtypid, NULL) = 'bigint' THEN (pg_sequence_last_value(seqs.relname::regclass) / 9223372036854775807::float) END) * 100, 'fm9999999999999999999990D00%') AS sequence_percent, TO_CHAR(( CASE WHEN format_type(attrs.atttypid, NULL) = 'smallint' THEN (pg_sequence_last_value(seqs.relname::regclass) / 32767::float) WHEN format_type(attrs.atttypid, NULL) = 'integer' THEN (pg_sequence_last_value(seqs.relname::regclass) / 2147483647::float) WHEN format_type(attrs.atttypid, NULL) = 'bigint' THEN (pg_sequence_last_value(seqs.relname::regclass) / 9223372036854775807::float) END) * 100, 'fm9999999999999999999990D00%') AS column_percent FROM pg_depend d JOIN pg_class AS seqs ON seqs.relkind = 'S' AND seqs.oid = d.objid JOIN pg_class AS tbls ON tbls.relkind = 'r' AND tbls.oid = d.refobjid JOIN pg_attribute AS attrs ON attrs.attrelid = d.refobjid AND attrs.attnum = d.refobjsubid JOIN pg_sequence s ON s.seqrelid = seqs.oid WHERE d.deptype = 'a' AND d.classid = 1259;
To show this query in action, let me set up a test table with an
primary key, where the sequence has been artificially advanced to 2 Billion:
postgres=# create table test(id serial primary key, value integer); CREATE TABLE postgres=# select setval('test_id_seq', 2000000000); setval ------------ 2000000000 (1 row) postgres=# \d test Table "public.test" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+---------------------------------- id | integer | | not null | nextval('test_id_seq'::regclass) value | integer | | | Indexes: "test_pkey" PRIMARY KEY, btree (id)
Now when running the query above to find the integer overflow percent, I can see
that that the data types for both the column and the sequence are both
integer, and since the sequence’s next value is 2 Billion, it is 93% through
the acceptable range:
sequence | sequence_datatype | owned_by | column_datatype | last_sequence_value | sequence_percent | column_percent -------------+-------------------+----------+-----------------+---------------------+------------------+---------------- test_id_seq | integer | test.id | integer | 2000000001 | 93.13% | 93.13% (1 row)
Changing to negative number sequencing
integer types in Postgres include negative numbers, a simple way to
deal with integer overflow is to flip to sequencing with negative numbers. This
can be done by giving the sequence a new start value of
-1 and converting to a
descending sequence by giving it a negative
alter sequence test_id_seq no minvalue start with -1 increment -1 restart;
If the purpose of the generated key is purely to create uniqueness, negative values are perfectly acceptable, but in some application frameworks or other use cases negative numbers may be undesirable or not work at all. In those cases we can change the field type entirely.
Keep in mind that the data type will need to be changed for any fields that reference this ID as well, or else they will also be out of bounds. Also any foreign key constraints will need to be dropped and reapplied after the both fields’ types have been updated.
Benefits of the negative number approach:
- No change to the column structure
- Very fast: just change the sequence start number
- Negative numbers might not work with your application framework
- You only buy yourself double the amount of IDs. You could be in this situation again soon
In general, this is a buy you some time approach and seen as a short term fix.
The more complete fix to your sequence exhaustion is changing to the
In order to change the field type of the above
test table, we will first
create a new ID of type
bigint that will eventually replace the current
and create a unique constraint on it:
alter table test add column id_new bigint; CREATE UNIQUE INDEX CONCURRENTLY test_id_new ON test (id_new);
The new column will also need a new sequence of type
bigint. The sequence
needs to start at some point after the latest value that had been recorded.
CREATE SEQUENCE test_id_new_seq START 2147483776 AS bigint; ALTER TABLE test ALTER COLUMN id_new SET DEFAULT nextval ('test_id_new_seq'); alter sequence test_id_new_seq owned by test.id_new;
Now new values can be added to the table, but there are two different sequences being incremented - the old and the new, ie:
postgres=# select * from test; id | value | id_new ------------+-------+------------ 2000000007 | | 2000000008 | | 2000000009 | | 2000000010 | | 2000000011 | | 2147483776 2000000012 | | 2147483777 2000000013 | | 2147483778 2000000014 | | 2147483779
In a single transaction, we will drop the old ID constraint and default, rename columns, and add an invalid “not null” constraint on the new ID column:
BEGIN; ALTER TABLE test DROP CONSTRAINT test_pkey; ALTER TABLE test ALTER COLUMN id DROP DEFAULT; ALTER TABLE test RENAME COLUMN id TO id_old; ALTER TABLE test RENAME COLUMN id_new TO id; ALTER TABLE test ALTER COLUMN id_old DROP NOT NULL; ALTER TABLE test ADD CONSTRAINT id_not_null CHECK (id IS NOT NULL) NOT VALID; COMMIT;
Now new IDs are being added to the table. Thanks to the
NOT NULL constraint on
id, new NULL values cannot be added, but since it is also
NOT VALID the
existing NULL values are permitted. In order to make
id back into a primary
id_old data must be backfilled so that the constraint can be made
valid. This can be done in batches, ie:
WITH unset_values AS ( SELECT id_old FROM test WHERE id IS NULL LIMIT 1000) UPDATE test SET id = unset_values.id_old FROM unset_values WHERE unset_values.id_old = test.id_old;
Once all rows have been backfilled, the
NOT NULL constraint can be validated,
the UNIQUE index on
id can be converted to a primary key, and finally the
NOT NULL constraint can be dropped:
ALTER TABLE test VALIDATE CONSTRAINT id_not_null; ALTER TABLE test ADD CONSTRAINT test_pkey PRIMARY KEY USING INDEX test_id_new; ALTER TABLE test DROP CONSTRAINT id_not_null;
At any point now the 4-byte
id_old column can be dropped, as the bigint has
taken its place:
postgres=# ALTER table test drop column id_old; ALTER TABLE postgres=# \d test Table "public.test" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+-------------------------------------- value | integer | | | id | bigint | | not null | nextval('test_id_new_seq'::regclass) Indexes: "test_pkey" PRIMARY KEY, btree (id)
The new 8-byte bigint id should be sufficient for a very, very, very long time:
sequence | sequence_datatype | owned_by | column_datatype | last_sequence_value | sequence_percent | column_percent -----------------+-------------------+----------+-----------------+---------------------+------------------+---------------- test_id_new_seq | bigint | test.id | bigint | 2147483788 | 0.00% | 0.00%
Benefits of the
- This is a long term fix and you won't have to worry about running out of sequence numbers for a very long time.
- You probably need to update a lot of other things to larger integers
- Takes coordination with the entire database. In our experience, this is a large project.
In Postgres, the
SERIAL data types (
are shortcuts for creating auto-incrementing identifier columns whose values are
assigned the next value from a Postgres SEQUENCE object.
Creating a column of type
SERIAL will default to as type
simultaneously creating an integer sequence object owned by the specified table
column and make its nextval() the default value for the column.
For new tables, consider using
- You can check with a query if you’re running out of sequence numbers.
- Changing to negative numbers can be a short term fix.
- Changing to
bigintis the recommended long term fix.
- When you are setting up a new database that’s likely to have a lot of data in
Integer overflow may appear at a glance to be insanely complicated. I have written this to keep Postgres DBAs and intergalactic travelers from panicking.
March 3, 2023 •More by this author