Announcing Crunchy Bridge for Analytics—interact with your data lake using Postgres commands via extension, with a vectorized, parallel query engine. Learn more in our announcement.

Tutorial Instructions

Creating Tables

In the database world, your data is stored and organized in tables. Data is stored in a tabular format, which means the system organizes data into rows and columns inside each table. Imagine tables as a filing cabinet with many folders (rows).

Columns and data types

When creating a table, you should follow this golden rule; pick the right datatype to store your data.

  • For instance, don't create a TEXT column if you want to store a date.
  • Don't create a TEXT column to store a price. Instead use a NUMERIC column which is better suited for storing money information.

To explore table creation, I’m going to create tables for a room reservation system. Below is the information I want to start with for creating this data store.

Let’s start with the creation of guests table. In this table, we want to store the following information:

  • the name of the person who will reserve the room
  • an email to reach them

We can create an empty table named guests.

CREATE TABLE guests ();

We will also want to add entries for each column in the table definition. These follow this pattern: <column_name> <datatype> and end with a comma.

The name column will have a text datatype, name TEXT, as it represents the information we want to store. Email will also be TEXT. If we want to avoid someone registering with the same email multiple time, we can make it unique by adding a constrain to the column type definition like this: email TEXT UNIQUE.

Let’s try our table creation again. Drop the first table:

DROP TABLE guests;

Add a table with columns.

CREATE TABLE guests ( name TEXT, email TEXT UNIQUE );

Primary keys

We also need a primary key for our new table. A primary key is a unique identifier for your data that cannot be null. This primary key will also be used as a key, a reference called the FOREIGN KEY,for other tables referencing this tables information.

The words PRIMARY KEY on the column definition is what specify that this column should be the primary key.

DROP TABLE guests;

Here’s our final table definition.

CREATE TABLE guests ( guest_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name TEXT, email TEXT UNIQUE );

The key words GENERATED ALWAYS AS IDENTITY were added to let Postgres populate the column guest_id automatically. This prevents the need for a process or application to supply the values. Under the hood, Postgres accomplishes incrementing the value with the help of a sequence.

INT & BIGINT

You should note that the data type for INT does have some limits. If you are building a database that might have millions of records and Postgres is automatically creating integers for you, you may want to use a BIGINT instead. See our blog post on The Integer at the End of the Universe: Integer Overflow in Postgres for advice on this topic.

psql commands for table inspection

You can use the command below to list all the tables:

\dt

You can also execute the next command to view the table definition:

\d+ guests

Inserting data

Now that our first table is created, we can now add data to it.

INSERT INTO guests (name, email) VALUES('Simon', 'simon@email.com'); INSERT INTO guests (name, email) VALUES('Bob', 'bob@email.com'); INSERT INTO guests (name, email) VALUES('Patrick', 'patrick@email.com');

Query the table guests.

SELECT * FROM guests;

Now let's try to add the first insert a second time:

INSERT INTO guests (name, email) VALUES('Simon', 'simon@email.com');
INSERT INTO guests (name, email) VALUES('Simon', 'simon@email.com');
ERROR:  duplicate key value violates unique constraint "guests_email_key"
DETAIL:  Key (email)=(simon@email.com) already exists.

As you can see, the last insert was rejected as it would violate the uniqueness of the email column.

More tables for our example

Our first table is in good shape, we will tackle the next table rooms. This table is very simple, we want to store the room information, the room number. In our case, the room number does not utilize alphanumeric characters, only numbers. And since we don't have a lot of rooms, an INT datatype will do the job.

Create the rooms table.

CREATE TABLE rooms ( room_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, rom_number INT UNIQUE );

I made a typo in the column name for the room number... We have 2 choices here, we can drop the table or we can alter the table to change the column name.

Let's start with the ALTER TABLE command.

ALTER TABLE rooms RENAME COLUMN rom_number TO room_number;

Let's take a look at the table definition:

\d+ rooms

Just like that, I fixed my typo!

Insert some data into the table rooms:

INSERT INTO rooms (room_number) VALUES(101); INSERT INTO rooms (room_number) VALUES(201); INSERT INTO rooms (room_number) VALUES(301); INSERT INTO rooms (room_number) VALUES(401); INSERT INTO rooms (room_number) VALUES(501);

Query the table rooms:

SELECT * FROM rooms;

Foreign keys

Finally, we need to create our last table reservations.

This table is particular in the sense that it will store the primary keys of the other 2 tables. In this context, those columns are called foreign key. A foreign key in Postgres is a field in a table that uniquely identifies a row in another table. It establishes a relationship between two tables by referencing the primary key or a unique key of another table.

Here’s a visual for how these tables reference each other. A diagram like this is called a schema.

The key words REFERENCES followed by the table name and column name is what defines a column to be a foreign key. Since we need to store date and time for the reservation start time and end time, we will use a datatype TIMESTAMP.

Here’s our final create statement for the reservations table:

CREATE TABLE reservations ( guest_id INT REFERENCES guests(guest_id), room_id INT REFERENCES rooms(room_id), start_time TIMESTAMP, end_time TIMESTAMP, event_title TEXT );

Now let’s insert some data in table reservations:

INSERT INTO reservations(guest_id, room_id, start_time, end_time, event_title) VALUES(1,1,'2024-04-12 8:00:00','2024-04-12 8:30:00','CPK scrum'); INSERT INTO reservations(guest_id, room_id, start_time, end_time, event_title) VALUES(2,2,'2024-04-12 8:00:00', '2024-04-12 9:00:00','CPA Roadmap'); INSERT INTO reservations(guest_id, room_id, start_time, end_time, event_title) VALUES(1,3,'2024-04-12 10:00:00','2024-04-12 12:00:00','Creating Tables Tutorial'); INSERT INTO reservations(guest_id, room_id, start_time, end_time, event_title) VALUES(3,4,'2024-04-13 10:00:00', '2024-04-13 11:00:00','Intriguing Incidents Meeting');

Let's query our reservations for the guest Simon using a join.

SELECT g.name, r.event_title, ro.room_number, r.start_time, r.end_time FROM reservations r INNER JOIN rooms ro ON ro.room_id = r.room_id INNER JOIN guests g ON g.guest_id = r.guest_id WHERE g.email = 'simon@email.com' ORDER BY r.start_time;

Conclusion

We covered the basics of creating tables, usage of primary and foreign keys. Primary keys ensure uniqueness of our data. Foreign keys ensure referential integrity between related tables.

Next, try the tutorial on creating more advanced table constraints to add extra data validity protection on your tables.

Loading terminal...

Loading terminal...