Zoom! Enhance!

Nov 7

Today we'll be tackling the Day 20 problem from Advent of Code 2021. This problem is a fun take on the Zoom and Enhance cliche from TV dramas where cops and spies can always seem to get unrealistic details from grainy camera footage by "enhancing" it. We'll have a binary image and we'll need to keep applying a decoding key to expand the image.

As always, you can see all the nitty gritty details of the code at once by going to the GitHub repository I've made for these problems. If you're enjoying these in-depth walkthroughs, make sure to subscribe so you can stay up to date with the latest news.

Problem Statement

Our problem input consists of a couple sections that have "binary" data, where the . character represents 0 and the # character represents 1.

..#.#..#####.#.#.#.###.##.....###.##.#..###.####..#####..#....#..#..##..##
#..######.###...####..#..#####..##..#.#####...##.#.#..#.##..#.#......#.###
.######.###.####...#.##.##..#..#..#####.....#.#....###..#.##......#.....#.
.#..#..##..#...##.######.####.####.#.#...#.......#..#.#.#...####.##.#.....
.#..#...##.#.##..#...##.#.##..###.#......#.#.......#.#.#.####.###.##...#..
...####.#..#..#.##.#....##..#.####....##...##..#...#......#.#.......#.....
..##..####..#...#.#.#...##..#.#..###..#####........#..####......#..#

#..#.
#....
##..#
..#..
..###

The first part (which actually would appear all on one line) is a 512 character decoding key. Why length 512? Well 512 = 2^9, and we'll see in a second why the ninth power is significant.

The second part of the input is a 2D "image", represented in binary. Our goal is to "enhance" the image using the decoding key. How do we enhance it?

To get the new value at a coordinate (x, y), we have to consider the value at that coordinate together with all 8 of its neighbors.

# . . # .
#[. . .].
#[# . .]#
.[. # .].
. . # # #

The brackets show every pixel that is involved in getting the new value at the "center" of our grid. The way we get the value is to line up these pixels in binary: ...#...#. = 000100010. Then we get the decimal value (34 in this case). This tells us new value comes from the 34th character in the decoder key, which is #. So this middle pixel will be "on" after the first expansion. Since each pixel expansion factors in 9 pixels, there are 2^9 = 512 possible values, hence the length of the decoding key.

All transformations happen simultaneously. What is noteworthy is that for "fringe" pixels we must account for the boundary outside the initial image. And in fact, our image then expands into this new region! The enhanced version of our first 5x5 image actually becomes size 7x7.

.##.##.
#..#.#.
##.#..#
####..#
.#..##.
..##..#
...#.#.

For the easy part, we'll do this expansion twice. For the hard part, we'll do it 50 times. Our puzzle answer is the number of pixels that are lit in the final iteration.

Solution Approach

At first glance, this problem is pretty straightforward. It's another "state evolution" problem where we take the problem in an initial state and write a function to evolve that state to the next step. Evolving a single step involves looking at the individual pixels, and applying a fairly simple algorithm to get the resulting pixel.

The ever-expanding range of coordinates is a little tricky. But if we use a structure that allows "negative" indices (and Haskell makes this easy!), it's not too bad.

But there's one BIG nuance though with how the "infinite" image works. We still have to implicitly imagine that the enhancement algorithm is applying to all the other pixels in "infinite space". You would hope that, since all those pixels are surrounded by other "off" pixels, they remain "off".

However, my "hard" puzzle input got a decoding key with # in the 0 position, meaning that "off" pixels surrounded by other "off" pixels all turn on! Luckily, the decoder also has . in the final position, meaning that these pixels turn "off" again on the next step. However, we need to account for this on/off pattern of all these "outside pixels" since they'll affect the pixels on the fringe of our solution.

To that end, we'll need to keep track of the value of outer pixels throughout our algorithm - I'll refer to this as the "outside bit". This will impact every layer of the solution!

So with that to look forward to, let's start coding!

Utilities

As always, a few utilities will benefit us. From last week's look at binary numbers, we'll use a couple helpers like the Bit type and a binary-to-decimal conversion function.

data Bit = Zero | One
  deriving (Eq, Ord)

bitsToDecimal64 :: [Bit] -> Word64

Another very useful idea is turning a nested list into a hash map. This helps simplify parsing a lot. We saw this function in the Day 11 Octopus Problem.

hashMapFromNestedLists :: [[a]] -> HashMap Coord2 a

Another idea from Day 11 was getting all 8 neighbors of a 2D coordinate. Originally, we did this with (0,0) as a hard lower bound. But we can expand this idea so that the grid bounds of the function are taken as inputs. So getNeighbors8Flex takes two additional coordinate parameters to help provide those bounds for us.

getNeighbors8Flex :: Coord2 -> Coord2 -> Coord2 -> [Coord2]
getNeighbors8Flex (minRow, minCol) (maxRow, maxCol) (row, col) = catMaybes
  [maybeUpLeft, maybeUp, maybeUpRight, maybeLeft, maybeRight, maybeDownLeft, maybeDown, maybeDownRight]
  where
    maybeUp = if row > minRow then Just (row - 1, col) else Nothing
    maybeUpRight = if row > minRow && col < maxCol then Just (row - 1, col + 1) else Nothing
    maybeRight = if col < maxCol then Just (row, col + 1) else Nothing
    maybeDownRight = if row < maxRow && col < maxCol then Just (row + 1, col + 1) else Nothing
    maybeDown = if row < maxRow then Just (row + 1, col) else Nothing
    maybeDownLeft = if row < maxRow && col > minCol then Just (row + 1, col - 1) else Nothing
    maybeLeft = if col > minCol then Just (row, col - 1) else Nothing
    maybeUpLeft = if row > minRow && col > minCol then Just (row - 1, col - 1) else Nothing

Of particular note is the way we order the results. This ordering (top, then same row, then bottom), will allow us to easily decode our values for this problem.

Another detail for this problem is that we'll just want to use "no bounds" on the coordinates with the minimum and maximum integers as the bounds.

getNeighbors8Unbounded :: Coord2 -> [Coord2]
getNeighbors8Unbounded = getNeighbors8Flex (minBound, minBound) (maxBound, maxBound)

Last but not least, we'll also rely on this old standby, the countWhere function, to quickly get the occurrence of certain values in a list.

countWhere :: (a -> Bool) -> [a] -> Int

Inputs

Like all Advent of Code problems, we'll start with parsing our input. We need to get everything into bits, but instead of 0 and 1 characters, we're dealing with the character . for off, and # for 1. So we start with a choice parser to get a single pixel.

parsePixel :: (MonadLogger m) => ParsecT Void Text m Bit
parsePixel = choice [char '.' >> return Zero, char '#' >> return One]

Now we need a couple types to organize our values. The decoder map will tell us a particular bit for every index from 0-511. So we can use a hash map with Word64 as the key.

type DecoderMap = HashMap Word64 Bit

Furthermore, it's easy to see how we build this decoder from a list of bits with a simple zip:

buildDecoder :: [Bit] -> DecoderMap
buildDecoder input = HM.fromList (zip [0..] input)

For the image though, we have 2D data. So let's using a hash map over Coord2 for our ImageMap type:

type ImageMap = HashMap Coord2 Bit

We have enough tools to start writing our function now. We'll parse an initial series of pixels and build the decoder out of them, followed by a couple eol characters.

parseInput :: (MonadLogger m) => ParsecT Void Text m (DecoderMap, ImageMap)
parseInput = do
  decoderMap <- buildDecoder <$> some parsePixel
  eol >> eol
  ...

Now we'll get the 2D image. We'll start by getting a nested list structure using the sepEndBy1 ... eol trick we've seen so many times already.

parse2DImage :: (MonadLogger m) => ParsecT Void Text m [[Bit]]
parse2DImage = sepEndBy1 (some parsePixel) eol

Now to put it all together, we'll use our conversion function to get our map from the nested lists, and then we've got our two inputs: the DecoderMap and the initial ImageMap!

parseInput :: (MonadLogger m) => ParsecT Void Text m (DecoderMap, ImageMap)
parseInput = do
  decoderMap <- buildDecoder <$> some parsePixel
  eol >> eol
  image <- hashMapFromNestedLists <$> parse2DImage
  return (decoderMap, image)

Processing One Pixel

In terms of writing out the algorithm, we'll try a "bottom up" approach this time. We'll start by solving the smallest problem we can think of, which is this: For a single pixel, how do we calculate its new value in one step of expansion?

There are multiple ways to approach this piece, but the way I chose was to imagine this as a folding function. We'll start a new "enhanced" image as an empty map, and we'll insert the new pixels one-by-one using this folding function. So each iteration modifies a single Coord2 key of an ImageMap. We can fit this into a "fold" pattern if the end of this function's signature looks like this:

-- At some point we have HM.insert coord bit newImage
f :: ImageMap -> Coord2 -> m ImageMap
f newImage coord = ...

But we need some extra information in this function to solve the problem of which "bit" we're inserting. We'll need the original image of course, to find the pixels around this coordinate. We'll also need the decoding map once we convert these to a decimal index. Last of all, we need the "outside bit" discussed above in the solution approach. Here's a type signature to gather these together.

processPixel ::
  (MonadLogger m) =>
  DecoderMap ->
  ImageMap ->
  Bit ->
  ImageMap -> Coord2 -> m ImageMap
processPixel decoderMap initialImage bounds outsideBit newImage pixel = ...

Let's start with a helper function to get the original image's bit at a particular coordinate. Whenever we do a bit lookup outside our original image, its coordinates will not exist in the initialImage map. In this case we'll use the outside bit.

processPixel decoderMap initialImage outsideBit newImage pixel = do
  ...
  where
    getBit :: Coord2 -> Bit
    getBit coord = fromMaybe outsideBit (initialImage HM.!? coord)

Now we need to get all the neighboring coordinates of this pixel. We'll use our getNeighbors8Unbounded utility from above. We could restrict ourselves to the bounds of the original, augmented by 1, but there's no particular need. We get the bit at each location, and assert that we have indeed found all 8 neighbors.

processPixel decoderMap initialImage outsideBit newImage pixel = do
  let allNeighbors = getNeighbors8Unbounded pixel
      neighborBits = getBit <$> allNeighbors
  if length allNeighbors /= 8
    then error "Must have 8 neighbors!"
    ...
where
    getBit = ...

Now the "neighbors" function doesn't include the bit at the specific input pixel! So we have to split our neighbors and insert it into the middle like so:

processPixel decoderMap initialImage outsideBit newImage pixel = do
  let allNeighbors = getNeighbors8Unbounded pixel
      neighborBits = getBit <$> allNeighbors
  if length allNeighbors /= 8
    then error "Must have 8 neighbors!"
    else do
      let (first4, second4) = splitAt 4 neighborBits
          finalBits = first4 ++ (getBit pixel : second4)
     ...
where
    getBit = ...

Now that we have a list of 9 bits, we can decode those bits (using bitsToDecimal64 from last time). This gives us the index to look up in our decoder, which we insert into the new image!

processPixel decoderMap initialImage outsideBit newImage pixel = do
  let allNeighbors = getNeighbors8Unbounded pixel
      neighborBits = getBit <$> allNeighbors
  if length allNeighbors /= 8
    then error "Must have 8 neighbors!"
    else do
      let (first4, second4) = splitAt 4 neighborBits
          finalBits = first4 ++ (getBit pixel : second4)
          indexToDecode = bitsToDecimal64 finalBits
          bit = decoderMap HM.! indexToDecode
      return $ HM.insert pixel bit newImage
  where
    getBit :: Coord2 -> Bit
    getBit coord = fromMaybe outsideBit (initialImage HM.!? coord)

Expanding the Image

Now that we can populate the value for a single pixel, let's step back one layer of the problem and determine how to expand the full image. As mentioned above, we ultimately want to use our function above like a fold. So we need enough arguments to reduce it to:

ImageMap -> Coord2 -> m ImageMap

Then we can start with an empty image map, and loop through every coordinate. So let's make sure we include the decoder map, the original image, and the "outside bit" in our type signature to ensure we have all the processing arguments.

expandImage :: (MonadLogger m) => DecoderMap -> ImageMap -> Bit -> m ImageMap
expandImage decoderMap image outsideBit = ...

Our chief task is to determine the coordinates to loop through. We can't just use the coordinates from the original image though. We have to expand by 1 in each direction so that the outside pixels can come into play. After adding 1, we use Data.Ix.range to interpolate all the coordinates in between our minimum and maximum.

expandImage decoderMap image outsideBit = ...
  where
    (minRow, minCol) = minimum (HM.keys image)
    (maxRow, maxCol) = maximum (HM.keys image)
    newBounds = ((minRow - 1, minCol - 1), (maxRow + 1, maxCol + 1))
    allCoords = range newBounds

And now we have all the ingredients for our fold! We partially apply decoderMap, image, and outsideBit, and then use a fresh empty image and the coordinates.

expandImage decoderMap image outsideBit = foldM
  (processPixel decoderMap image outsideBit)
  HM.empty
  allCoords
  where
    (minRow, minCol) = minimum (HM.keys image)
    (maxRow, maxCol) = maximum (HM.keys image)
    newBounds = ((minRow - 1, minCol - 1), (maxRow + 1, maxCol + 1))
    allCoords = range newBounds

Running the Expansion

Now that we can expand the image once, we just have to zoom out one more layer, and run the expansion a certain number of times. We'll write a recursive function that uses the decoder map, the initial image, and an integer argument for our current step count. This will return the total number of pixels that are lit in the final image.

runExpand :: (MonadLogger m) => DecoderMap -> ImageMap -> Int -> m Int

The base case occurs when we have 0 steps remaining. We'll just count the number of elements that have the One bit in our current image.

runExpand _ image 0 = return $ countWhere (== One) (HM.elems image)

The only trick with the recursive case is that we have to determine the "outside bit". If the element corresponding to 0 in the decoder map is One, then all the outside bits will flip back and forth. So we need to check this bit, as well as the step count. For even step counts, we'll use Zero for the outside bits. And of course, if the decoder head is 0, then there's no flipping at all, so we always get Zero.

runExpand _ image 0 = return $ countWhere (== One) (HM.elems image)
runExpand decoderMap initialImage stepCount = do
  ...
  where
    outsideBit = if decoderMap HM.! 0 == Zero || even stepCount
      then Zero
      else One

Now we have all the arguments we need for our expandImage call! So let's get that new image and recurse using runExpand, with a reduced step count.

runExpand _ image 0 = return $ countWhere (== One) (HM.elems image)
runExpand decoderMap initialImage stepCount = do
  finalImage <- expandImage decoderMap initialImage outsideBit
  runExpand decoderMap finalImage (stepCount - 1)
  where
    outsideBit = if decoderMap HM.! 0 == Zero || even stepCount then Zero else One

Solving the Problem

Now we're well positioned to solve the problem. We'll parse the input into the decoder map and the first image with another old standby, parseFile. Then we'll run the expansion for 2 steps and return the number of lit pixels.

solveDay20Easy :: String -> IO (Maybe Int)
solveDay20Easy fp = runStdoutLoggingT $ do
  (decoderMap, initialImage) <- parseFile parseInput fp
  pixelsLit <- runExpand decoderMap initialImage 2
  return $ Just pixelsLit

The hard part is virtually identical, just increasing the number of steps up to 50.

solveDay20Hard :: String -> IO (Maybe Int)
solveDay20Hard fp = runStdoutLoggingT $ do
  (decoderMap, initialImage) <- parseFile parseInput fp
  pixelsLit <- runExpand decoderMap initialImage 50
  return $ Just pixelsLit

And we're done!

Conclusion

Later this week we'll have the video walkthrough! If you want to see the complete code in action, you can take a look on GitHub.

If you subscribe to our monthly newsletter, you'll get all the latest news and offers from Monday Morning Haskell, as well as access to our subscriber resources!

In Depth

James Bowen