Advent of Code: Seven Segment Logic Puzzle

We're into the last quarter of the year, and this means Advent of Code is coming up again in a couple months! I'm hoping to do a lot of these problems in Haskell again and this time do up-to-date recaps. To prepare for this, I'm going back through my solutions from last year and trying to update them and come up with common helpers and patterns that will be useful this year.

You can follow me doing these implementation reviews on my stream, and you can take a look at my code on GitHub here!

Most of my blog posts for the next few weeks will recap some of these problems. I'll do written summaries of solutions as well as video summaries to see which are more clear. The written summaries will use the In-Depth Coding style, so get ready for a lot of code! As a final note, you'll notice my frequent use of MonadLogger, as I covered in this article. So let's get started!

Problem Statement

I'm going to start with Day 8 from last year, which I found to be an interesting problem because it was more of a logic puzzle than a traditional programming problem. The problem starts with the general concept of a seven segment display, a way of showing numbers on an electronic display (like scoreboards, for example).

We can label each of the seven segments like so, with letters "a" through "g":

Segments a-g:

     aaaa 
    b    c
    b    c
     dddd 
    e    f
    e    f
     gggg

If all seven segments are lit up, this indicates an 8. If only "c" and "f" are lit up, that's 1, and so on.

The puzzle input consists of lines with 10 "code" strings, and 4 "output" strings, separated by a pipe delimiter:

be cfbegad cbdgef fgaecd cgeb fdcge agebfd fecdb fabcd edb | fdgacbe cefdb cefbgd gcbe
edbfga begcd cbg gc gcadebf fbgde acbgfd abcde gfcbed gfec | fcgedb cgb dgebacf gc

The 10 code strings show a "re-wiring" of the seven segment display. On the first line, we see that be is present as a code string. Since only a "one" has length 2, we know that "b" and "e" each refer either to the "c" or "f" segment, since only those segments are lit up for "one". We can use similar lines of logic to fully determine the mapping of code characters to the original segment display.

Once we have this, we can decode each output string on the right side, get a four-digit number, and then add all of these up.

Solution Approach

When I first solved this problem over a year ago, I went through the effort of deriving a general function to decode any string based on the input codes, and then used this function

However, upon revisiting the problem, I realized it's quite a bit simpler. The length of the output to decode is obviously the first big branching point (as we'll see, "part 1" of the problem clues you on to this). Four of the numbers have unique lengths of "on" segments:

  1. 2 Segments = 1
  2. 3 Segments = 7
  3. 4 Segments = 4
  4. 7 Segments = 8

Then, three possible numbers have 5 "on" segments (2, 3, and 5). The remaining three (0, 6, 9) use six segments.

However, when it comes to solving these more ambiguous numbers, the key still lies with the digits 1 and 4, because we can always find the codes referring to these by their length. So we can figure out which two code characters are on the right side (referring to the c and fsegments) and which two segments refer to "four minus one", so segments b and d. We don't immediately know which is which in either pair, but it doesn't matter!

Between our "length 5" outputs (2, 3, 5), only 3 contains both segments from "one". So if that isn't true, we can then look at the "four minus one" segments (b and d), and if both are present, it's a 5, otherwise it's a 2.

We can employ similar logic for the length-6 possibilities. If either "one" segment is missing, it must be 6. Then if both "four minus one" segments are present, the answer is 9. Otherwise it is 0.

If this logic doesn't make sense in paragraphs, here's a picture that captures the essential branches of the logic.

So how do we turn this solution into code?

Utilities

First, let's start with a couple utility functions. These functions capture patterns that are useful across many different problems. The first of these is countWhere. This is a small helper whenever we have a list of items and we want the number of items that fulfill a certain predicate. This is a simple matter of filtering on the predicate and taking the length.

countWhere :: (a -> Bool) -> [a] -> Int
countWhere predicate list = length $ filter predicate list

Next we'll have a flexible parsing function. In general, I've been trying to use Megaparsec to parse the problem inputs (though it's often easier to parse them by hand). You can read this series to learn more about parsing in Haskell, and this part specifically for megaparsec.

But a good general helper we can have is "given a file where each line has a specific format, parse the file into a list of outputs." I refer to this function as parseLinesFromFile.

parseLinesFromFile :: (MonadIO m) => ParsecT Void Text m a -> FilePath -> m [a]
parseLinesFromFile parser filepath = do
  input <- pack <$> liftIO (readFile filepath)
  result <- runParserT (sepEndBy1 parser eol) "Utils.hs" input
  case result of
    Left e -> error $ "Failed to parse: " ++ show e
    Right x -> return x

Two key observations about this function. We take the parser as an input (this type is ParsecT Void Text m a). Then we apply it line-by-line using the flexible combinator sepEndBy1 and the eol parser for "end of line". The combinator means we parse several instances of the parser that are separated and optionally ended by the second parser. So each instance (except perhaps the last) of the input parser then is followed by an "end of line" character (or carriage return).

Parsing the Lines

Now when it comes to the specific problem solution, we always have to start by parsing the input from a file (at least that's how I prefer to do it). The first step of parsing is to determine what we're parsing into. What is the "output type" of parsing the data?

In this case, each line we parse consists of 10 "code" strings and 4 "output" strings. So we can make two types to hold each of these parts - InputCode and OutputCode.

data InputCode = InputCode
  { screen0 :: String
  , screen1 :: String
  , screen2 :: String
  , screen3 :: String
  , screen4 :: String
  , screen5 :: String
  , screen6 :: String
  , screen7 :: String
  , screen8 :: String
  , screen9 :: String
  } deriving (Show)

data OutputCode = OutputCode
  { output1 :: String
  , output2 :: String
  , output3 :: String
  , output4 :: String
  } deriving (Show)

Now each different code string can be captured by the parser some letterChar. If we wanted to be more specific, we could even do some like:

choice [char 'a', char 'b', char 'c', char 'd', char 'e', char 'f', char 'g']

Now for each group of strings, we'll parse them using the same sepEndBy1 combinator we used before. This time, the separator is hspace, covering horizontal space characters (including tabs, but not newlines). Between these, we use `string "| " to parse the bar in between the input line. So here's the start of our parser:

parseInputLine :: (MonadLogger m) => ParsecT Void Text m (Maybe (InputCode, OutputCode))
parseInputLine = do
  screenCodes <- sepEndBy1 (some letterChar) hspace
  string "| "
  outputCodes <- sepEndBy1 (some letterChar) hspace
  ...

Both screenCodes and outputCodes are lists, and we want to convert them into our output types. So first, we do some validation and ensure that the right number of strings are in each list. Then we can pattern match and group them properly. Invalid results give Nothing.

parseInputLine :: (MonadLogger m) => ParsecT Void Text m (Maybe (InputCode, OutputCode))
parseInputLine = do
  screenCodes <- sepEndBy1 (some letterChar) hspace
  string "| "
  outputCodes <- sepEndBy1 (some letterChar) hspace
  if length screenCodes /= 10 
    then lift (logErrorN $ "Didn't find 10 screen codes: " <> intercalate ", " (pack <$> screenCodes)) >> return Nothing
    else if length outputCodes /= 4
      then lift (logErrorN $ "Didn't find 4 output codes: " <> intercalate ", " (pack <$> outputCodes)) >> return Nothing
      else
        let [s0, s1, s2, s3, s4, s5, s6, s7, s8, s9] = screenCodes
            [o1, o2, o3, o4] = outputCodes
        in  return $ Just (InputCode s0 s1 s2 s3 s4 s5 s6 s7 s8 s9, OutputCode o1 o2 o3 o4)

Then we can parse the codes using parseLinesFromFile, applying this in both the "easy" part and the "hard" part of the problem.

solveDay8Easy :: String -> IO (Maybe Int)
solveDay8Easy fp = runStdoutLoggingT $ do
  codes <- catMaybes <$> parseLinesFromFile parseInputLine fp
  ...

solveDay8Hard :: String -> IO (Maybe Int)
solveDay8Hard fp = runStdoutLoggingT $ do
  inputCodes <- catMaybes <$> parseLinesFromFile parseInputLine fp
  ...

The First Part

Now to complete the "easy" part of the problem, we have to answer the question: "In the output values, how many times do digits 1, 4, 7, or 8 appear?". As we've discussed, each of these has a unique length. So it's easy to first describe a function that can tell from an output string if it is one of these items:

isUniqueDigitCode :: String -> Bool
isUniqueDigitCode input = length input `elem` [2, 3, 4, 7]

Then we can use our countWhere utility to apply this function and figure out how many of these numbers are in each output code.

uniqueOutputs :: OutputCode -> Int
uniqueOutputs (OutputCode o1 o2 o3 o4) = countWhere isUniqueDigitCode [o1, o2, o3, o4]

Finally, we take the sum of these, applied across the outputs, to get our first answer:

solveDay8Easy :: String -> IO (Maybe Int)
solveDay8Easy fp = runStdoutLoggingT $ do
  codes <- catMaybes <$> parseLinesFromFile parseInputLine fp
  let result = sum $ uniqueOutputs <$> (snd <$> codes)
  return $ Just result

The Second Part

Now for the hard part! We have to decode each digit in the output, determine its value, and then get the value on the 4-digit display. The root of this is to decode a single string, given the InputCode of 10 values. So let's write a function that does that. We'll use MaybeT since there are some failure conditions on this function.

decodeString :: (MonadLogger m) => InputCode -> String -> MaybeT m Int

As we've discussed, the logic is easy for certain lengths. If the string length is 2, 3, 4 or 7, we have obvious answers.

decodeString :: (MonadLogger m) => InputCode -> String -> MaybeT m Int
decodeString inputCodes output
  | length output == 2 = return 1
  | length output == 3 = return 7
  | length output == 4 = return 4
  | length output == 7 = return 8
  ...

Now for length 5 and 6, we'll have separate functions:

decode5 :: (MonadLogger m) => InputCode -> String -> MaybeT m Int

decode6 :: (MonadLogger m) => InputCode -> String -> MaybeT m Int

Then we can call these from our base function:

decodeString :: (MonadLogger m) => InputCode -> String -> MaybeT m Int
decodeString inputCodes output
  | length output == 2 = return 1
  | length output == 3 = return 7
  | length output == 4 = return 4
  | length output == 7 = return 8
  | length output == 5 = decode5 inputCodes output
  | length output == 6 = decode6 inputCodes output
  | otherwise = mzero

We have a failure case of mzero if the length doesn't fall within our expectations for some reason.

Now before we can write decode5 and decode6, we'll write a helper function. This helper will determine the two characters present in the "one" segment as well as the two characters present in the "four minus one" segment.

For some reason I separated the two Chars for the "one" segment but kept them together for "four minus one". This probably isn't necessary. But anyways, here's our type signature:

sortInputCodes :: (MonadLogger m) => InputCode -> MaybeT m (Char, Char, String)
sortInputCodes ic@(InputCode c0 c1 c2 c3 c4 c5 c6 c7 c8 c9) = do
  ...

Let's start with some more validation. We'll sort the strings by length and ensure the length distributions are correct.

sortInputCodes :: (MonadLogger m) => InputCode -> MaybeT m (Char, Char, String)
sortInputCodes ic@(InputCode c0 c1 c2 c3 c4 c5 c6 c7 c8 c9) = do
  ...
  where
    [sc0, sc1,sc2,sc3,sc4,sc5,sc6,sc7,sc8,sc9] = sortOn length [c0, c1, c2, c3, c4, c5, c6, c7, c8, c9]
    validLengths =
      length sc0 == 2 && length sc1 == 3 && length sc2 == 4 &&
      length sc3 == 5 && length sc4 == 5 && length sc5 == 5 &&
      length sc6 == 6 && length sc7 == 6 && length sc8 == 6 &&
      length sc9 == 7

If the lengths aren't valid, we'll return mzero as a failure case again. But if they are, we'll pattern match to identify our characters for "one" and the string for "four". By deleting the "one" characters, we'll get a string for "four minus one". Then we can return all our items:

sortInputCodes :: (MonadLogger m) => InputCode -> MaybeT m (Char, Char, String)
sortInputCodes ic@(InputCode c0 c1 c2 c3 c4 c5 c6 c7 c8 c9) = do
  if not validLengths
    then logErrorN ("Invalid inputs: " <> (pack . show $ ic)) >> mzero
    else do
      let [sc01, sc02] = sc0
      let fourMinusOne = delete sc02 (delete sc01 sc2)
      return (sc01, sc02, fourMinusOne)
  where
    [sc0, sc1,sc2,sc3,sc4,sc5,sc6,sc7,sc8,sc9] = sortOn length [c0, c1, c2, c3, c4, c5, c6, c7, c8, c9]
    validLengths =
      length sc0 == 2 && length sc1 == 3 && length sc2 == 4 &&
      length sc3 == 5 && length sc4 == 5 && length sc5 == 5 &&
      length sc6 == 6 && length sc7 == 6 && length sc8 == 6 &&
      length sc9 == 7

Length 5 Logic

Now we're ready to decode a string of length 5! We start by sorting the inputs, and then picking out the three elements from the list that could be of length 5:

decode5 :: (MonadLogger m) => InputCode -> String -> MaybeT m Int
decode5 ic output = do
  (c01, c02, fourMinusOne) <- sortInputCodes ic
  ...

So first we'll check if the "one" characters are present, we get 3.

decode5 :: (MonadLogger m) => InputCode -> String -> MaybeT m Int
decode5 ic output = do
  (c01, c02, fourMinusOne) <- sortInputCodes ic
  -- If both from c0 are present, it's a 3
  if c01 `elem` output && c02 `elem` output
    then return 3
    else ...

Then if "four minus one" shares both its characters with the output, the answer is 5, otherwise it is 2.

decode5 :: (MonadLogger m) => InputCode -> String -> MaybeT m Int
decode5 ic output = do
  (c01, c02, fourMinusOne) <- sortInputCodes ic
  -- If both from c0 are present, it's a 3
  if c01 `elem` output && c02 `elem` output
    then return 3
    else do
      let shared = fourMinusOne `intersect` output
      if length shared == 2
        then return 5
        else return 2

Length 6 Logic

The logic for length 6 strings is very similar. I wrote it a little differently in this function, but the idea is the same.

decode6 :: (MonadLogger m) => InputCode -> String -> MaybeT m Int
decode6 ic output = do
  (c01, c02, fourMinusOne) <- sortInputCodes ic
  -- If not both from c0 are present, it's a 6
  if not (c01 `elem` output && c02 `elem` output)
    then return 6
    else do
      -- If both of these characters are present in output, 9 else 0
      if all (`elem` output) fourMinusOne then return 9 else return 0

Wrapping Up

Now that we can decode an output string, we just have to be able to do this for all strings in our output. We just multiply their values by the appropriate power of 10.

decodeAllOutputs :: (MonadLogger m) => (InputCode, OutputCode) -> MaybeT m Int
decodeAllOutputs (ic, OutputCode o1 o2 o3 o4) = do
  d01 <- decodeString ic o1
  d02 <- decodeString ic o2
  d03 <- decodeString ic o3
  d04 <- decodeString ic o4
  return $ d01 * 1000 + d02 * 100 + d03 * 10 + d04

And now we can complete our "hard" function by decoding all these inputs and taking their sums.

solveDay8Hard :: String -> IO (Maybe Int)
solveDay8Hard fp = runStdoutLoggingT $ do
  inputCodes <- catMaybes <$> parseLinesFromFile parseInputLine fp
  results <- runStdoutLoggingT $ runMaybeT (mapM decodeAllOutputs inputCodes)
  return $ fmap sum results

Conclusion

That's all for this week! You can take a look at all this code on GitHub if you want! Here's the main solution module!

Next time, we'll go through another one of these problems! If you'd like to stay up to date with the latest on Monday Morning Haskell, subscribe to our mailing list! This will give you access to all our subscriber resources!

Previous
Previous

Flashing Octopuses and BFS

Next
Next

Haskell and Visual Studio