Announcing Crunchy Bridge for Analytics—interact with your data lake using Postgres commands via extension, with a vectorized, parallel query engine. Learn more in our announcement.

Tutorial Instructions

Advent of Code - Day 19

Advent of Code - Day 19

Disclaimer

This article will contain spoilers both on how I solved 2022 Day 19's challenge "Not Enough Minerals" using SQL, as well as general ideas on how to approach the problem. I recommend trying to solve it yourself first, using your favorite language.

AOC Day 19

Day 19 tasks us with creating lots and lots of mini robots to gather resources and feed our herd of elephants (rescued a few days back). We'll do this in SQL per usual, with the help of some items such as:

Let’s import our input data:

CREATE TABLE aoc_day19 ( line text ); INSERT INTO aoc_day19 VALUES ('Blueprint 1: Each ore robot costs 4 ore. Each clay robot costs 2 ore. Each obsidian robot costs 3 ore and 14 clay. Each geode robot costs 2 ore and 7 obsidian.'); INSERT INTO aoc_day19 VALUES ('Blueprint 2: Each ore robot costs 2 ore. Each clay robot costs 3 ore. Each obsidian robot costs 3 ore and 8 clay. Each geode robot costs 3 ore and 12 obsidian');

AOC 2022 Day 19 Part 1

For this part of the challenge, we need to figure out the optimal number of geodes collected for each of the given blueprints. Geode robots gather geodes, and building a geode robot requires both ore and obsidian. There are also clay robots and ore robots. Starting with a single ore robot, and a finite number of rounds (24), we need to determine the optimal action of each step (build a type of robot, or do nothing). The input is multiple lines describing the costs for that particular blueprint, e.g.

 Blueprint 1:  Each ore robot costs 4 ore.  Each clay robot costs 2 ore.  \
   Each obsidian robot costs 3 ore and 14 clay. Each geode robot costs 2 ore and 7 obsidian.

We will use SQL to extract that information, but first we will need a place to store it. Let's create a quick unlogged table and then populate it:

CREATE UNLOGGED TABLE blueprint ( minute SMALLINT NOT NULL DEFAULT 0, id INT NOT NULL PRIMARY KEY, ore SMALLINT, clay SMALLINT, obsidian SMALLINT, geodes SMALLINT, ore_robot_cost SMALLINT, clay_robot_cost SMALLINT, obsidian_robot_cost SMALLINT, obsidian_robot_cost2 SMALLINT, geode_robot_cost SMALLINT, geode_robot_cost2 SMALLINT, ore_robots SMALLINT, clay_robots SMALLINT, obsidian_robots SMALLINT, geode_robots SMALLINT );

Every line of the input file follows the exact same format, so we can use the handy regexp_split_to_array function to extract all the numbers, and then insert those into our table:

WITH x AS (SELECT regexp_split_to_array(line, '(\D+)') AS b FROM aoc_day19) ,y AS (SELECT 0 AS minute, b[2]::INT AS blueprint_id, 0 AS ore, 0 AS clay, 0 AS obsidian, 0 AS geodes, b[3]::INT AS ore_robot_cost, b[4]::INT AS clay_robot_cost, b[5]::INT AS obs_robot_cost, b[6]::INT AS obs_robot_cost2, b[7]::INT AS geode_robot_cost, b[8]::INT AS geode_robot_cost2, 1 AS ore_robots, 0 AS clay_robots, 0 AS obsidian_robots, 0 AS geode_robots FROM x) INSERT INTO blueprint SELECT * FROM y;

Now we will need a function to compute the best path for a given blueprint. The best way to do this is with a recursive function, in which we investigate the effects of each possible choice. During each round, we can either build one specific type of robot, or build no robot at all. With only four different types of robots, this may sound easy, but the number of choices available over those 24 rounds gets very large. A naïve approach with no other optimizations will actually cause an exception of ERROR: stack depth limit exceeded which indicates we had too many calls to our recursive function. The basic function will look like this (pseudo-code):

myfunc:
  ore_cost = myfunc(ore_robots++)
  clay_cost = myfunc(clay_robots++)
  obsidian_cost = myfunc(obsidian_robot++)
  geode_cost = myfunc(geode_robots++)
  nothing_cost = myfunc()
return greatest(ore_cost, clay_cost, obsidian_cost, geode_cost, nothing_cost)

During each round, the robots also gather resources, which are used (except for geodes) to build other robots, so the code needs to check for that as well:

myfunc:
  if no_more_minutes
    return geodes
  if ore >= ore_robot_cost
    ore_cost = myfunc(ore_robots++)
  if ore >= clay_robot_cost
    clay_cost = myfunc(clay_robots++)
  if ore >= obsidian_robot_cost and clay >= obsidian_robot_cost2
    obsidian_cost = myfunc(obsidian_robot++)
  if ore >= geode_robot_cost and obsidian >= geode_robot_cost2
    geode_cost = myfunc(geode_robots++)
  nothing_cost = myfunc()
return greatest(ore_cost, clay_cost, obsidian_cost, geode_cost, nothing_cost)

That's the basic idea: walk through deeper and deeper until we run out of minutes, and then bubble up our results through the recursive callers. The "cost" is our ultimate goal: the number of geodes produced. We compare each of the five scenarios and return whichever has produced the greatest number of geodes for that round.

A big part of this puzzle was figuring out ways to reduce the number of times the function runs. For example, if we have 10 ore robots, which produce 10 ore each round, then there is no point in ever producing more ore robots if the most expensive robot costs less than 10 ore. Here are some other rules that I added (see how many you can think of first - I'll put a hint first, then the actual rule)

  • Because our ultimate goal is to produce geodes ... we always want to create geode robots, so that choice always wins
  • If we already have enough of a certain robot type to satisfy demand ... we never build any more of that robot type
  • If we are nearing the end ... only build robots that could possibly lead to more geode robots
  • If we did nothing last round, but we COULD HAVE built a robot type ... never build it this round: always build "greedy"
  • If there are no geode robots yet ... building an obsidian robot is always the best choice

There were a few other rules that got considered but thrown out, as the above rules had a lot more bang for the buck. By far the most important one is the "don't build if we did nothing last round but could have built" as that saves over 1.4 million calls. Here is the final function, annotated with the savings for each of the added rules:

CREATE or replace FUNCTION give_me_the_remote( ore INT, oldore INT, clay INT, oldclay INT, obsidian INT, geodes INT, ore_robots INT, clay_robots INT, obsidian_robots INT, geode_robots INT, ore_robot_cost INT, clay_robot_cost INT, obsidian_robot_cost INT, obsidian_robot_cost2 INT, geode_robot_cost INT, geode_robot_cost2 INT, minute INT, maxminute INT, last_action TEXT = '' ) RETURNS INT language plpgsql AS $$ DECLARE ore_score INT = 0; clay_score INT = 0; obsidian_score INT = 0; geode_score INT = 0; nobody_score INT = 0; BEGIN minute = minute + 1; IF minute >= maxminute THEN RETURN geodes + geode_robots; END IF; IF ore >= geode_robot_cost AND obsidian >= geode_robot_cost2 THEN geode_score = give_me_the_remote( ore + ore_robots - geode_robot_cost, ore, clay + clay_robots, clay, obsidian + obsidian_robots - geode_robot_cost2, geodes + geode_robots, ore_robots, clay_robots, obsidian_robots, geode_robots + 1, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, minute, maxminute, '' ); RETURN geode_score; END IF; IF ore >= obsidian_robot_cost AND clay >= obsidian_robot_cost2 AND minute < maxminute-1 AND obsidian_robots < geode_robot_cost2 AND (last_action <> 'nada' OR oldore < obsidian_robot_cost OR oldclay < obsidian_robot_cost2) THEN obsidian_score = give_me_the_remote( ore + ore_robots - obsidian_robot_cost, ore, clay + clay_robots - obsidian_robot_cost2, clay, obsidian + obsidian_robots, geodes + geode_robots, ore_robots, clay_robots, obsidian_robots + 1, geode_robots, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, minute, maxminute, '' ); IF geode_robots < 1 THEN return geodes + obsidian_score; END IF; END IF; IF ore >= clay_robot_cost AND (last_action <> 'nada' OR oldore < clay_robot_cost) AND minute < maxminute-8 THEN clay_score = give_me_the_remote( ore + ore_robots - clay_robot_cost, ore, clay + clay_robots, clay, obsidian + obsidian_robots, geodes + geode_robots, ore_robots, clay_robots + 1, obsidian_robots, geode_robots, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, minute, maxminute, '' ); END IF; IF ore >= ore_robot_cost AND ore_robots < GREATEST( ore_robot_cost, clay_robot_cost, obsidian_robot_cost, geode_robot_cost) AND minute < maxminute-10 AND (last_action <> 'nada' OR oldore < ore_robot_cost) THEN ore_score = give_me_the_remote( ore + ore_robots - ore_robot_cost, ore, clay + clay_robots, clay, obsidian + obsidian_robots, geodes + geode_robots, ore_robots + 1, clay_robots, obsidian_robots, geode_robots, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, minute, maxminute, '' ); END IF; nobody_score = give_me_the_remote( ore + ore_robots, ore, clay + clay_robots, clay, obsidian + obsidian_robots, geodes + geode_robots, ore_robots, clay_robots, obsidian_robots, geode_robots, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, minute, maxminute, 'nada' ); RETURN GREATEST(nobody_score, ore_score, clay_score, obsidian_score); END $$;

This ran in 115ms and took around 30,000 calls! Not too shabby. The puzzle asked us to sum up the "quality" of each blueprint by finding the maximum number of geodes each can open, then multiplying that by the blueprint number. Summing all the quality numbers produces the final answer, which is simple in SQL:

SELECT SUM(id * give_me_the_remote( ore,0,clay,0,obsidian,geodes, ore_robots,clay_robots,obsidian_robots,geode_robots, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, 0, 24, '') ) FROM blueprint;

AOC 2022 Day 19 Part 2

Part 2 tries to increase the difficulty by changing the number of rounds from 24 to 32, while reducing the total blueprints to the first three. While 32 rounds is exponentially more difficult than 24, all our optimizations and rules above made it easy! The solution we need is the value of each of the first three blueprints multiplied together.

WITH x AS (SELECT give_me_the_remote(ore,ore,clay,clay,obsidian,geodes, ore_robots,clay_robots,obsidian_robots,geode_robots, ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2, 0, 32) FROM blueprint WHERE id <= 3) SELECT round( exp( sum( ln(give_me_the_remote) ) ) ) FROM x;

Postgres has no direct function to compute the product (i.e. the multiplication analog to sum), so we use some ln/exp math to do the equivalent of x1 * x2 * x3.

Loading terminal...

Loading terminal...