Tutorial Instructions

Advent of Code - Day 1

This article will contain spoilers both on how to solve Day 1's challenge "Calorie Counting" in SQL. Data has been loaded for you already in a table called calories.

Data Modeling

We have preloaded our data in a table calories with a single text column, calories_count. This data file used a blank line for a group separator, which we will need to be aware of in our solution.

Now that we have a table with one row per line, how can we group it? Grouping similar rows together is a job for window functions. In this case, we need to somehow put all rows together until we hit an empty line, then we need to start a new group. Let's create a pseudo-column and use a sequence to increment it only when we find an empty value in our calories_count column. We will also call setval() on this column in order to define an initial value so the currval() function will work.

CREATE SEQUENCE aoc; SELECT setval('aoc', 1); SELECT calories_count, CASE WHEN calories_count = '' THEN nextval('aoc') ELSE currval('aoc') END FROM calories;
calories_count | currval 
----------------+---------
 9686           |       1
 10178          |       1
 3375           |       1
 9638           |       1
 6318           |       1
 4978           |       1
 5988           |       1
 6712           |       1
                |       2
 10422          |       2
 3062           |       2
 2753           |       2
 3695           |       2
 1722           |       2
 8110           |       2
 6568           |       2
 7124           |       2
 ...

As we can see, the currval changes when it detects a different group. Let’s use this property to find out the sum of each group. Note that since this is a text field, we will need to convert to an int in order to sum these fields. Additionally, since the blank line cannot convert to an int, we will explicitly detect when the row we’re looking at is the separator and consider its value 0.

Group and Sum

SELECT SUM(calories_count) OVER(partition by currval) FROM (SELECT CASE WHEN calories_count = '' then 0 else calories_count :: int end as calories_count, CASE WHEN calories_count = '' THEN nextval('aoc') ELSE currval('aoc') END FROM calories ) x;

sum  
-------
 56873
 56873
 56873
 56873
 56873
 56873
 56873
 56873
 43456
 43456
 43456
 43456
 43456
 43456
 43456
 43456
 43456
 ...

In this case since we are using a window function this will calculate the sum of those values once for each row, however the total for the group will be the same.

Maximum Value

Note that we don't care which row it came from, nor do we care about the duplicate entries. All we care about is the maximum value in this new table. Rather than a subselect, let's use a CTE to make things look better. Since one of the goals is to do this in as few SQL statements as possible, let's put the setval()  into the top of the CTE:

WITH setup AS (SELECT setval('aoc',1)), x AS (SELECT CASE WHEN calories_count = '' then 0 else calories_count :: int end as calories_count, CASE WHEN calories_count = '' THEN nextval('aoc') ELSE currval('aoc') END FROM setup, calories), y AS (SELECT sum(calories_count) OVER(partition by currval) from x) SELECT max(sum) FROM y;

max  
-------
 69206
(1 row)

Voila! This returns the highest combined number for all the groups.

Loading terminal...

Loading terminal...