Temporal Joins

My first thought seeing a temporal join in 2008 was, “Why is this query so complex?” The company I was at relied heavily on database queries, as it was a CRM and student success tracking system for colleges and universities. The query returned a filtered list of users and their last associated record from a second table. The hard part about the query isn’t returning the last timestamp or even performing joins, it’s returning only their last associated record from a second table.

Back in 2008, we didn’t have window functions or CTEs, so the query algorithm was a series of nested tables that looked like this:

SELECT
    *
FROM users, ( -- find the record for the last second_table by created_at and user_id
                SELECT
                    second_table.*
                FROM second_table, ( -- find the last second_table created_at per user_id
                                        SELECT
                                            user_id,
                                            max(created_at) AS created_at
                                        FROM second_table
                                        GROUP BY 1
                                    ) AS last_second_table_at
                WHERE
                    last_second_table_at.user_id = second_table.user_id
                    AND second_table.created_at = last_second_table_at.created_at
            ) AS last_second_table
WHERE users.id = last_second_table.user_id;

See the Sample Code section below for the schema and data to run these queries.

But, even that query was wrong because the second table may have records with duplicate created_at values. That was the source of a bug back in 2008 that resulted in duplicate rows being listed.

Obviously, we weren't using Postgres at the time because there has always been a simpler way to do this in Postgres using DISTINCT ON:

SELECT DISTINCT ON (u.id)
    u.id,
    u.name,
    s.created_at AS last_action_time,
    s.action_type
FROM users u
JOIN second_table s ON u.id = s.user_id
ORDER BY u.id, s.created_at DESC, s.id DESC;

Temporal joins require attention to detail.

Robust Solution: CTEs & Window Functions

Before we go too far into the topic, for those looking for a solution to their current problem, below is how I would write that query today if you aren't finding the first or last in a series. For these situations, we use CTEs and window functions, so there's no need to nest queries when we can abstract them for a cleaner purpose. Here is the template for the temporal joins that do not work with DISTINCT ON:

WITH max_second_table AS (
    SELECT
        *
    FROM (
            SELECT
                *,
                -- Use ROW_NUMBER() window function to return the latest record:
                -- The ORDER BY clause is critical:
                -- 1. ORDER BY created_at DESC finds the latest time.
                -- 2. ORDER BY id DESC serves as a reliable tie-breaker
                ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC, id DESC) AS row_order
            FROM second_table
        ) AS ordered_second_table
    WHERE row_order = 2
)

SELECT
    *
FROM users
LEFT JOIN max_second_table ON users.id = max_second_table.user_id;

In this example, we are joining the second occurrence (WHERE row_order = 2) in the second_table for a user. For the university example, we used these types of queries to report on progress over time by showing the 1st, 2nd, 3rd, and occurrences of events.

Is this actually less code than the first example? No, but it is compartmentalized with a cleaner purpose.

Also, introducing the primary key (id) in the ORDER BY clause provides the necessary tie-breaker logic for the sorting -- that is how we fixed the SQL issue in the opening text.

Problem with ORMs

Due to their query complexity, ORMs are generally not capable of handling temporal joins without complex manipulation. The ORM I'm most familiar with is ActiveRecord, part of the Ruby on Rails suite. When Rails developers encounter temporal joins, they typically resort to the N+1 query pattern from their application code like this:

users = User.all

users.each do |user|
  last_action = user.second_table.last
end

If you aren't running this query too frequently or over too many user records, this is generally performant enough. However, this approach becomes suboptimal for application performance as the user list grows because each iteration of the loop requires a network hop back and forth with the database and an object initialization in the application. While you can make ActiveRecord do this natively, the resulting code is often harder to read and maintain for the typical use case—a pattern you see in other ORMs as well.

Sample Code

Below is the sample SQL you can use to load data into your database to test a few of these queries. Note that Alice has two actions at the exact same timestamp to replicate the original bug scenario.

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name TEXT
);

CREATE TABLE second_table (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    action_type TEXT,
    created_at TIMESTAMP WITHOUT TIME ZONE
);

INSERT INTO users (name) VALUES ('Alice'), ('Bob'), ('Charlie');

-- Alice has two actions at the exact same timestamp (The 2008 bug scenario)
INSERT INTO second_table (user_id, action_type, created_at) VALUES
(1, 'login', '2023-10-01 10:00:00'),
(1, 'page_view', '2023-10-01 10:00:00'),
(2, 'purchase', '2023-10-02 11:00:00'),
(3, 'registration', '2023-10-03 12:00:00'),
(3, 'profile_update', '2023-10-04 13:00:00');

Conclusion

The term "temporal join" isn't a common piece of developer jargon, but the underlying pattern, retrieving the related record, is critical in reporting and analytics. It's a known pattern among people who have worked on a code base that relies heavily on SQL capabilties, typically for reporting.

Using the PostgreSQL feature DISTINCT ON for the simplest case, or CTEs with Window Functions for complex retrieval, we avoid the bugs of older SQL patterns and eliminate the performance penalty of the N+1 problem.

If you would like to learn more about advanced SQL patterns, check out our Postgres Playground.

Latest Articles

Robust Solution: CTEs & Window Functions

Problem with ORMs

Sample Code

Conclusion

Related Articles