Compile Driven Development In Action: Refactoring to Arrays!

big_matrix.jpg

In the last couple weeks, we've been slowly building up our maze game. For instance, last week, we added the ability to serialize our mazes. But software development is never a perfect process! So it's not uncommon to revisit some past decisions and come up with better approaches. This week we're going to address a particular code wart in the random maze generation code.

Right now, we store our Maze as a mapping from Locations to CellBoundaries items. We do this using Data.Map. The Map.lookup function returns a Maybe result, since it might not exist. But most of the time we accessed a location, we had good reason to believe that it would exist in the map. This led to several instances of the following idiom:

fromJust $ Map.lookup location boundsMap

Using a function like fromJust is a code smell, a sign that we could be doing something better. This week, we're going to change this structure so that it uses the Array type instead from Data.Array. It captures our idiomatic definitions better. We'll use "Compile Driven Development" to make this change. We won't need to hunt around our code to figure out what's wrong. We'll just make type changes and follow the compiler errors!

To learn more about compile driven development and the mental part of Haskell, read our Haskell Brain series. It will help you think about the language in a different way. So it's a great tool for beginners!

Another good resource for this article is to look at the Github repository for this project. The complete code for this part is on the part-3 branch. You can consult this commit to see all the changes we make in migrating to arrays.

Initial Changes

To start with, we should make sure our code uses the following type synonym for our maze type:

type Maze = Map.Map Location CellBoundaries

Now we can observe the power of type synonyms! We'll make a change in this one type, and that'll update all the instances in our code!

import qualified Data.Array as Array

type Maze = Array.Array Location CellBoundaries

Of course, this will cause a host of compiler issues! But most of these will be pretty simple to fix. But we should be methodical and start at the top. The errors begin in our parsing code. In our mazeParser, we use Map.fromList to construct the final map. This requires the pairs of Location and CellBoundaries.

mazeParser :: (Int, Int) -> Parsec Void Text Maze
mazeParser (numRows, numColumns) = do
  …
  return $ Map.fromList (cellSpecToBounds <$> (concat rows))

The Array library has a similar function, Array.array. However, it also requires us to provides the bounds for the Array. That is, we need the "min" and "max" locations in a tuple. But these are easy, since we have the dimensions as an input!

mazeParser :: (Int, Int) -> Parsec Void Text Maze
mazeParser (numRows, numColumns) = do
  …
  return $ Array.array 
    ((0,0), (numColumns - 1, numRows - 1))
    (cellSpecToBounds <$> (concat rows))

Our next issue comes up in the dumpMaze function. We use Map.mapKeys to transpose the keys of our map. Then we use Map.toList to get the association list back out. Again, all we need to do is find the comparable functions for arrays to update these.

To change the keys, we want the ixmap function. It does the same thing as mapKeys. As with Array.array, we need to provide an extra argument for the min and max bounds. We'll provide the bounds of our original maze.

transposedMap = Array.ixmap (Array.bounds maze) (\(x, y) -> (y, x)) maze

A few lines below, we can see the usage of Map.toList when grouping our pairs. All we need instead is Array.assocs

cellsByRow :: [[(Location, CellBoundaries)]]
cellsByRow = groupBy
  (\((r1, _), _) ((r2, _), _) -> r1 == r2)
  (Array.assocs transposedMap)

Updating Map Generation

That's all the changes for the basic parsing code. Now let's move on to the random generation code. This is where we have a lot of those yucky fromJust $ Map.lookup calls. We can now instead use the "bang" operator, Array.! to access those elements!

findCandidates currentLocation@(x, y) bounds visited =
  let currentLocBounds = bounds Array.! currentLocation
  ...

Of course, it's possible for an "index out of bounds" error to occur if we aren't careful! But our code should reflect the fact that we expect all these calls to work. After fixing the initial call, we need to change each directional component. Here's what the first update looks like:

findCandidates currentLocation@(x, y) bounds visited =
      let currentLocBounds = bounds Array.! currentLocation
          upLoc = (x, y + 1)
          maybeUpCell = case (upBoundary currentLocBounds,
                              Set.member upLoc visited) of
                          (Wall, False) -> Just
                            ( upLoc
                            , (bounds Array.! upLoc) {downBoundary = 
                                AdjacentCell currentLocation}
                            , currentLocation
                            , currentLocBounds {upBoundary =
                                AdjacentCell upLoc}
                            )
                          _ -> Nothing

We've replaced Map.lookup with Array.! in the second part of the resulting tuple. The other three directions need the same fix.

Then there's one last change in the random generation section! When we choose a new candidate, we currently need two calls to Map.insert. But arrays let us do this with one function call. The function is Array.//, and it takes a list of association updates. Here's what it looks like:

chooseCandidate candidates = do
      (SearchState gen currentLocs boundsMap visited) <- get
      ...
      -- Previously used Map.insert twice!!!
      let newBounds = boundsMap Array.//
            [(chosenLocation, newChosenBounds),
             (prevLocation, newPrevBounds)]
      let newVisited = Set.insert chosenLocation visited
      put (SearchState
             newGen
             (chosenLocation : currentLocs) 
             newBounds 
             newVisited)

Final Touch Ups

Now our final remaining issues are within the Runner code. But they're all similar fixes to what we saw in the parsing code.

In our sample boundariesMap, we once again replace Map.fromList with Array.array. Again, we add a parameter with the bounds of the array. Then, when drawing the pictures for our cells, we need to use Array.assocs instead of Map.toList.

For the final change, we need to update our input handler so that it accesses the array properly. This is our final instance of fromJust $ Map.lookup! We can replace it like so:

inputHandler :: Event -> World -> World
inputHandler event w = case event of
  ...
  where
    cellBounds = (worldBoundaries w) Array.! (playerLocation w)

And that's it! Now our code will compile and work as it did before!

Conclusion

There's a pretty big inefficiency with our new approach. Whereas Map.insert can give us an updated map in log(n) time, the Array.// function isn't so nice. It has to create a complete copy of the array, and we run that function many times! How can we fix this? Next week, we'll find out! We'll use the Mutable Array interface to make it so that we can update our array in-place! This is super efficient, but it requires our code to be more monadic!

For some more ideas of cool projects you can do in Haskell, download our Production Checklist! It goes through a whole bunch of libraries on topics from database management to web servers!