James Bowen

My New Favorite Monad?

In my last article, I introduced a more complicated example of a problem using Dijkstra's algorithm and suggested MonadLogger as an approach to help debug some of the intricate helper functions.

But as I've started getting back to working on some of the "Advent of Code" type problems, I've come to the conclusion that this kind of logging might be more important than I initially realized. At the very least, it's gotten me thinking about improving my general problem solving process. Let's explore why.

Hitting a Wall

Here's an experience I've had quite a few times, especially with Haskell. I'll be solving a problem, working through the ideas in my head and writing out code that seems to fit. And since I'm using Haskell, a lot of problems get resolved just by making the types work out.

And then, ultimately, the solution is wrong. I don't get the answer I expected, even though everything seems correct.

So how do I fix this problem? A lot of times (unfortunately), I just look at the code, think about it some more, and eventually realize an idea I missed in one of my functions. This is what I would call an insight-based approach to debugging.

Insight is a useful thing. But used on its own, it's probably one of the worst ways to debug code. Why do I say this? Because insight is not systematic. You have no process or guarantee that you'll eventually come to the right solution.

So what are systematic approaches you can take?

Systematic Approaches to Debugging

Three general approaches come to my mind when I think about systematic debugging methods.

  1. Writing unit tests
  2. Using a debugging program (e.g. GDB)
  3. Using log statements

Unit Tests

The first approach is distinct from the other two in that it is a "black box" method. With unit tests, we provide a function with particular inputs and see if it produces the outputs we expect. We don't need to think about the specific implementation of the function in order to come up with these input/output pairs.

This approach has advantages. Most importantly, when we write unit tests, we will have an automated program that we can always run to verify that the function still behaves how we expect. So we can always refactor our function or try to improve its performance and still know that it's giving us the correct outputs.

Writing unit tests proactively (test driven development) can also force us to think about edge cases before we start programming, which will help us implement our function with these cases in mind.
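To make this concrete, here's what a few black-box input/output pairs might look like for a hypothetical grid-neighbors helper. Both the function and the hand-rolled test list are made up for illustration; a real project would likely use a framework like Hspec or tasty.

```haskell
-- A hypothetical helper to test (not from this article): the in-bounds
-- orthogonal neighbors of a cell in a w x h grid.
gridNeighbors :: Int -> Int -> (Int, Int) -> [(Int, Int)]
gridNeighbors w h (x, y) =
  [ (x', y')
  | (x', y') <- [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
  , x' >= 0 && x' < w && y' >= 0 && y' < h
  ]

-- Black-box test cases: input/output pairs chosen without looking at
-- the implementation, deliberately including a corner as an edge case.
testCases :: [((Int, Int), [(Int, Int)])]
testCases =
  [ ((0, 0), [(1, 0), (0, 1)])                  -- corner: only 2 neighbors
  , ((1, 1), [(0, 1), (2, 1), (1, 0), (1, 2)])  -- center: all 4 neighbors
  ]

runTests :: Bool
runTests = all (\(input, expected) -> gridNeighbors 3 3 input == expected) testCases
```

The point is that the test cases say nothing about how `gridNeighbors` is implemented, so we can freely refactor it and re-run the checks.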

However, unit testing has a few disadvantages as well. It can be cumbersome to construct the inputs needed to unit-test intermediate functions. It can also be hard to develop non-trivial test cases that really exercise our code at scale, because we can't necessarily know the answers to harder cases beforehand. And coming up with a unit test that exposes a good edge case can take exactly the kind of non-systematic insight we were trying to avoid in the first place.

Unit tests can be tricky and time-consuming to do well, but for industry projects you should expect to unit test everything you can. So it's a good habit to get into.

Using a Debugger

The second approach on the list is more of a "white box" approach. Debuggers allow us to explore the interior state of our function while it is running and see if the values match our expectations.

So for example, the typical debugger can set a breakpoint so that the program pauses execution in the middle of our function. We can then explore all the values in scope at this point. And examining these values can tell us if the assumptions we're making about our code are correct. We can then step the program forward and see if these values update the way we expect.

However, it is a bit harder to use debugging programs with Haskell. With most imperative languages, the ordering of the machine code (at least when unoptimized) somewhat resembles the order of the code you write in the editor. But Haskell is not an imperative language, so the ordering of operations is more complicated. This is made even worse because of Haskell's laziness.

But debuggers are still worth pursuing! I'll be exploring this subject more in the future. For now, let's move on to our last approach.

Log Messages

What are the advantages of using a logging approach? Are "print statements" really the way to go?

Well, this is the quickest and easiest way to get some information about your program. Anyone who's done a technical interview knows this. When something isn't going right in that kind of fast-paced situation, you don't have time to write unit tests. And attaching a debugger isn't an option in interview environments like CoderPad. So throwing a quick print statement into your code can help you get "unstuck" without spending too much time.

But generally speaking, you don't want your program to be printing out random values. Once you've resolved your issue, you'll often get rid of the debugging statements. Since these statements won't become part of your production code, they won't remain as a way to help others understand and debug your code, as unit tests do.

Logging and Frustration in Haskell

Of these three approaches, logging and print statements are the most common for anyone first learning a language. Setting up a debugger can be a complex task that no beginner wants to go through. Nor does a novice typically want to spend the time learning unit test frameworks just to solve basic problems.

This presents us with a conundrum in helping people learn Haskell, because logging is not intuitive in Haskell. Or rather, proper logging is not intuitive.

Showing someone main :: IO () and functions like print and putStrLn is easy enough. But once beginners start writing "pure" functions to solve problems, they'll get confused about how to use logging statements, since the whole point of pure functions is that we can't just add print statements to them.

There are, of course, "unsafe" ways to do this with the trace function (from Debug.Trace) and unsafePerformIO. But even these options use patterns that are unintuitive for beginners to the language.
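For the curious, here's roughly what the trace approach looks like. The collatzStep function is just an illustrative stand-in:

```haskell
import Debug.Trace (trace)

-- trace prints its message (to stderr) whenever the result is evaluated,
-- even though collatzStep is a "pure" function. Handy, but note that
-- laziness means you can't rely on when (or whether) the message appears.
collatzStep :: Int -> Int
collatzStep n = trace ("collatzStep called with " ++ show n) result
  where
    result = if even n then n `div` 2 else 3 * n + 1
```

The function still behaves as if it were pure, which is exactly why this pattern tends to confuse newcomers.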

Start with Logger Monad

With these considerations in mind, I'm going to try an experiment for a while. As I write solutions to puzzle problems (the current focus of my Haskell activity), I'm going to write all my code with a MonadLogger constraint. I would even consider recommending that beginners do the same. My hypothesis is that I'll solve problems much more quickly with a systematic approach than with unsystematic, insight-driven development. So I want to see how this goes.

Using MonadLogger is much more "pure" than using IO everywhere. While the most common instances of the logger monad will use IO, it still has major advantages over just using the IO monad from a "purity" standpoint. The logging can be disabled. You can also put different levels of logging into your program. So you can actually use logging statements to log a full trace of your program, but restrict it so that the most verbose statements are at DEBUG and INFO levels. In production, you would disable those messages so that you only see ERROR and WARN messages.

Most importantly, you can't do arbitrary IO activity with just a MonadLogger constraint. You can't open files or send network requests.
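As a rough sketch of what this looks like in practice (assuming the monad-logger package; sumWithLogging is an invented example function), a helper with only a MonadLogger constraint can log but can't do arbitrary IO, and the caller decides which levels actually print:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Logger
import qualified Data.Text as T

-- An illustrative helper with only a MonadLogger constraint: it can log,
-- but it cannot open files or make network calls.
sumWithLogging :: (MonadLogger m) => [Int] -> m Int
sumWithLogging xs = do
  logDebugN "starting to sum the input list"
  let result = sum xs
  logInfoN (T.pack ("result: " ++ show result))
  return result

-- Print every message:
runVerbose :: IO Int
runVerbose = runStdoutLoggingT (sumWithLogging [1, 2, 3])

-- Hide DEBUG messages, as you might in production:
runQuiet :: IO Int
runQuiet = runStdoutLoggingT
  (filterLogger (\_ level -> level > LevelDebug) (sumWithLogging [1, 2, 3]))
```

Swapping runQuiet for runVerbose changes what gets printed without touching sumWithLogging at all.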

Of course, the price of this approach for a beginner is that the newcomer would have to get comfortable with having monadic code with a typeclass constraint before they necessarily understand these topics. But I'm not sure that's worse than using IO everywhere. And if it relieves the frustration of "I can't inspect my program", then I think it could be worthwhile for someone starting out with Haskell.

Conclusion

If you want to see me using this approach on real problems, then tune into my Twitch Stream! I usually stream 5 times a week for short periods. Some of these episodes will end up on my YouTube channel as well.

Meanwhile, you can also subscribe to the mailing list for more updates! This will give you access to Monday Morning Haskell's subscriber resources!


Dijkstra with Monads!

Last time on the blog, we considered this library version of Dijkstra's algorithm that you can find on Hackage.

dijkstra ::
    (Foldable f, Num cost, Ord cost, Ord state)
  => (state -> f state) -- Function to generate list of neighbors
  -> (state -> state -> cost) -- Function for cost generation
  -> (state -> Bool) -- Destination predicate
  -> state -- Initial state
  -> Maybe (cost, [state]) -- Solution cost and path, Nothing if goal is unreachable

However, there are a number of situations where this might be insufficient. In this article, we'll explore some reasons why you might want to introduce monads into your solution for Dijkstra's algorithm.

Note! This is a "coding ideas" blog post, rather than an "In Depth Tutorial" (see this article for a summary of different reading styles). Some of the code samples are pretty well fleshed out, but some of them are more hypothetical ideas for you to try out on your own!

The Monadic Version

In addition to the "pure" version of Dijkstra's algorithm, the Algorithm.Search library also provides a "monadic" version. This version allows each of the input functions to act within a monad m, and of course gives its final result within this monad.

dijkstraM ::
    (Monad m, Foldable f, Num cost, Ord cost, Ord state)
  => (state -> m (f state))
  -> (state -> state -> m cost)
  -> (state -> m Bool)
  -> state
  -> m (Maybe (cost, [state]))

Now, if you've read our Monads Series, you'll know that a monad is a computational context. What are the kinds of contexts we might find ourselves in while performing Dijkstra's algorithm? Here are a few ideas to start with.

  1. We're reading our graph from a global state (mutable or immutable)
  2. Our graph functions require reading a file or making a network call
  3. We would like to log the actions taken in our graph.

Let's go through some pseudocode examples to see how each of these could be useful.

Using a Global State

A global mutable state is represented with (of course) the State monad, while an immutable global state uses the Reader monad. Now, in the simplest usage, the Reader context could allow us to "pass the graph" without actually including it as an argument:

import qualified Data.Array as A
import Control.Monad.Reader
import Algorithm.Search (dijkstraM)

newtype Graph2D = Graph2D (A.Array (Int, Int) Int)

getNeighbors :: A.Array (Int, Int) Int -> (Int, Int) -> [(Int, Int)]

findShortestPath :: Graph2D -> (Int, Int) -> (Int, Int) -> Maybe (Int, [(Int, Int)])
findShortestPath graph start end = runReader
  (dijkstraM neighbors cost (return . (== end)) start)
  graph
  where
    cost :: (Int, Int) -> (Int, Int) -> Reader Graph2D Int
    cost _ b = do
      (Graph2D gr) <- ask
      return $ gr A.! b

    neighbors :: (Int, Int) -> Reader Graph2D [(Int, Int)]
    neighbors source = do
      (Graph2D gr) <- ask
      return $ getNeighbors gr source

If we're already in this monad for whatever reason, then this could make sense. But on its own, it's not necessarily much of an improvement over partial function application.

A mutable state could be useful in certain circumstances as well. We likely wouldn't want to mutate the graph itself during iteration, as this would invalidate the algorithm. However, we could store certain metadata about what is happening during the search. For instance, we might want to track how often certain nodes are returned as a potential neighbor.

import qualified Data.HashMap.Strict as HM
import Control.Monad.State
import Data.Maybe (fromMaybe)
import Data.Foldable (find)

newtype Graph = Graph
   { edges :: HM.HashMap String [(String, Int)] }

type Metadata = HM.HashMap String Int

incrementKey :: String -> Metadata -> Metadata
incrementKey k metadata = HM.insert k (count + 1) metadata
  where
    count = fromMaybe 0 (HM.lookup k metadata)

findShortestPath :: Graph -> String -> String -> Maybe (Int, [String])
findShortestPath graph start end = evalState
  (dijkstraM neighbors cost (return . (== end)) start)
  HM.empty
  where
    cost :: String -> String -> State Metadata Int
    cost n1 n2 = 
      let assocs = fromMaybe [] (HM.lookup n1 (edges graph))
          costForN2 = find (\(n, _) -> n == n2) assocs
      in  case costForN2 of
            Nothing -> return maxBound
            Just (_, x) -> return x
    neighbors :: String -> State Metadata [String]
    neighbors node = do
      let neighbors = fst <$> fromMaybe [] (HM.lookup node (edges graph))
      metadata <- get
      put $ foldr incrementKey metadata neighbors
      return neighbors

In this implementation, we end up discarding our metadata, but if we wanted to we could include it as an additional output to help us understand what's happening in our search.
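To illustrate that last point, here's a minimal, self-contained sketch of the evalState/runState distinction. It uses Data.Map rather than HashMap so it stands alone, and the node names are made up:

```haskell
import qualified Data.Map as M
import Control.Monad.State

type Metadata = M.Map String Int

-- Record a visit to a node in the metadata map.
recordVisit :: String -> State Metadata ()
recordVisit node = modify (M.insertWith (+) node 1)

-- runState (unlike evalState) returns the final state alongside the result,
-- so the accumulated metadata survives the computation.
searchDemo :: (String, Metadata)
searchDemo = runState go M.empty
  where
    go = do
      mapM_ recordVisit ["a", "b", "a", "c", "a"]
      return "done"
```

In the findShortestPath example above, swapping evalState for runState in the same way would return the metadata as an additional output.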

Reading from Files

In many cases, our "graph" is actually too big to fit in memory. The entire graph might instead be distributed across many files on our system. Consider this simplified example:

data Location = Location
  { filename :: FilePath
  , tag :: String
  ...
  }

Each file could track a certain "region" of your map, with references to certain locations "on the edge" whose primary data must be found in a different file. This means you'll need access to the file system to find all the "neighbors" of a particular location, which means you'll need the IO monad in Haskell!

getMapNeighbors :: Location -> IO [Location]
-- Open original locations file
-- Find tag and include neighboring tags together with references to other files

This matches the signature of the "neighbor generator" function in dijkstraM, so we'll be able to pass this function as the first argument.

Using Network Calls

Here's a fun example. Consider wiki-racing - finding the shortest path between the Wikipedia pages of two topics using only the links in the bodies of those pages. You could (theoretically) write a program to do this for you. You might create a type like this:

data WikiPage = WikiPage
  { pageTitle :: Text
  , url :: URL
  , bodyContentHtml :: Text
  }

In order to find the "neighbors" of this page, you would first have to parse the body HTML and find all the Wikipedia links within it. This could be done in a pure fashion. But in order to create the WikiPage objects for each of those links, you would then need to send an HTTP GET request to fetch their body HTML. Such a network call requires the IO monad (or some other MonadIO), so your function will necessarily look like:

getWikiNeighbors :: WikiPage -> IO [WikiPage]
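The pure parsing half of that job might be sketched naively like this. A real implementation would use a proper HTML parser; wikiLinks and its assumptions about the link format are purely illustrative:

```haskell
import Data.List (stripPrefix, tails)

-- Naively pull out /wiki/ link targets from raw HTML by scanning for
-- href="/wiki/ prefixes. This is a pure function: the IO only comes in
-- when we fetch the linked pages to build their WikiPage values.
wikiLinks :: String -> [String]
wikiLinks html =
  [ takeWhile (/= '"') rest
  | chunk <- tails html
  , Just rest <- [stripPrefix "href=\"/wiki/" chunk]
  ]
```

This split keeps the testable logic pure, leaving only the page fetching inside IO.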

But if you successfully implement that function, it's very easy to apply dijkstraM because the "cost" of each hop is always 1!

findShortestWikiPath :: Text -> Text -> IO (Maybe (Int, [WikiPage]))
findShortestWikiPath start end = do
  firstPage <- findWikiPageFromTitle start
  dijkstraM getWikiNeighbors (\_ _ -> return 1) (return . (== end) . pageTitle) firstPage

findWikiPageFromTitle :: Text -> IO WikiPage
...

Of course, because the cost is always 1, this is actually a case where breadth-first search would work more simply than Dijkstra's algorithm, so you could use the bfsM function from the same library!

Logging

Another common context for problem solving is logging. While solving our problem, we might want to record helpful statements telling us what is happening, so that we can debug when things go wrong. In Haskell this happens through the MonadLogger typeclass, which comes with functions for different "levels" of logging.

class MonadLogger m where
  ...

logDebugN :: (MonadLogger m) => Text -> m ()
logInfoN :: (MonadLogger m) => Text -> m ()
logWarnN :: (MonadLogger m) => Text -> m ()
logErrorN :: (MonadLogger m) => Text -> m ()

Now, unlike the previous two examples, this doesn't require the IO monad. A couple of the most common implementations of this monad class will, in fact, use IO functionality (printing to the screen or logging to a file). But this isn't necessary. You can still do logging in a "pure" way by storing the log messages in a sequence or other structure so you can examine them at the end of your program.
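The underlying idea of pure logging, sketched here with a plain Writer monad (logLine and costWithLog are illustrative names, not part of monad-logger), is simply to accumulate messages as data:

```haskell
import Control.Monad.Writer

-- Pure "logging": messages accumulate in a list instead of going to IO.
logLine :: String -> Writer [String] ()
logLine msg = tell [msg]

-- An illustrative cost function that records what it was asked to do.
costWithLog :: Int -> Int -> Writer [String] Int
costWithLog a b = do
  logLine ("computing cost between " ++ show a ++ " and " ++ show b)
  return (abs (a - b))
```

Running the computation with runWriter yields both the result and the full list of messages, with no IO anywhere.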

When would we want this for Dijkstra's algorithm? Well, sometimes the process of determining neighbors and costs can be complicated! I'll motivate this by introducing a more complicated example of a Dijkstra's algorithm problem.

A Complicated Example

Here's an example from last year's Advent of Code challenge. You can read the full description on that page. This problem demonstrates a less intuitive use of Dijkstra's algorithm.

The problem input is a "map" of sorts, showing a diagram of 4 rooms leading into one shared hallway.

#############
#...........#
###B#C#B#D###
  #A#D#C#A#
  #########

Each of the four rooms is filled with "tokens", which come in 4 different varieties, A, B, C, D. (The Advent of Code description refers to them as "Amphipods", but that takes a while to write out, so I'm simplifying to "tokens").

We want to move the tokens around so that the A tokens end up in the far-left room, the B tokens in the room next to them, and so on.

#############
#...........#
###A#B#C#D###
  #A#B#C#D#
  #########

But there are rules on how these tokens move. You can only move each token twice. Once to get it into an empty space in the hallway, and once to get it from the hallway to its final room. And tokens can't move "past" each other within the hallway.

Now each token has a specific cost for each space it moves.

A = 1 energy per move
B = 10 energy per move
C = 100 energy per move
D = 1000 energy per move
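This cost table translates directly into code. The Token type here is just one plausible representation:

```haskell
-- One way to represent the four token varieties.
data Token = A | B | C | D deriving (Show, Eq, Ord)

-- Energy cost per space moved, straight from the problem statement.
tokenCost :: Token -> Int
tokenCost A = 1
tokenCost B = 10
tokenCost C = 100
tokenCost D = 1000
```

So, for example, moving a D token three spaces costs 3 * tokenCost D = 3000 energy, while the same move for an A token costs only 3.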

So you want to move the tokens into the final state with the lowest total cost.

Using Dijkstra's Algorithm

It turns out the most efficient solution (especially at a larger scale) is to treat this like a graph problem and use Dijkstra's algorithm! Each "state" of the problem is like a node in our graph, and we can move to certain "neighboring" nodes by moving tokens at a certain cost.

But the implementation turns out to be quite tricky! To give you an idea of this, here are some of the data type names and functions I came up with.

data Token = ...

data HallSpace = ...

data TokenGraphState = ...

tokenEdges :: TokenGraphState -> [(TokenGraphState, Int)]

updateStateWithMoveFromRoom :: Token -> HallSpace -> Int -> TokenGraphState -> (TokenGraphState, Int)

updateStateWithMoveFromHall :: Token -> HallSpace -> Int -> TokenGraphState -> (TokenGraphState, Int)

validMovesToHall :: Token -> TokenGraphState -> [(HallSpace, Int)]

validMoveToRoom :: TokenGraphState -> (HallSpace, TokenGraphState -> Maybe Token) -> Maybe (Int, Token, HallSpace)

And these are just the functions with complex logic! There are a few simpler helpers beyond these as well!

But when I ran this implementation, I didn't get the right answer! So how could I learn more about my solution and figure out what's going wrong? Unit testing and applying a formal debugger would be nice, but simply being able to print out what is going on in the problem is a quicker way to get started.

Haskell doesn't let you (safely) print from pure functions like those I've written above, nor can you add values to a global logging state. But we can fix this by modifying the type signatures to use a MonadLogger constraint instead.

tokenEdges :: (MonadLogger m) => TokenGraphState -> m [(TokenGraphState, Int)]

updateStateWithMoveFromRoom :: (MonadLogger m) => Token -> HallSpace -> Int -> TokenGraphState -> m (TokenGraphState, Int)

updateStateWithMoveFromHall :: (MonadLogger m) => Token -> HallSpace -> Int -> TokenGraphState -> m (TokenGraphState, Int)

validMovesToHall :: (MonadLogger m) => Token -> TokenGraphState -> m [(HallSpace, Int)]

validMoveToRoom :: (MonadLogger m) => TokenGraphState -> (HallSpace, TokenGraphState -> Maybe Token) -> m (Maybe (Int, Token, HallSpace))

Now it's simple enough to modify a function to give us some important information about what's happening. Hopefully this is enough to help us solve the problem.

Ideally, we'd limit the number of functions that "need" the monadic action. But in practice, it's frustrating to discover you need a monad in a deeply nested function of your algorithm, because you'll have to modify everything on its call stack. So it might be a good idea to add at least a basic monad constraint from the beginning!

(Update: I did a full implementation of this particular problem that incorporates the logger monad!)

Conclusion

Next time on the blog, we'll start talking more generally about this idea of using monads to debug, especially MonadLogger. We'll consider the implementation pattern of "monads first" and different ways to approach this.

Make sure you're staying up to date with the latest news from Monday Morning Haskell by subscribing to our mailing list! This will also give you access to our subscriber resources!
