Traverse: Fully Generalized Loops

Apr 28

Last time around, we discussed mapM and sequence, which allow us to run loop-like activities in Haskell while also incorporating monadic effects. Today for our last article of for-loops month, we'll look at the final generalization of this idea: the traverse function.

To understand traverse, it helps to recall the ideas behind fmap. When we use fmap, we can take any Functor structure and transform all the underlying elements of that functor, returning a new object with the exact same structure, but different elements. The new elements might be of the same type or they can be entirely different.

fmap :: (Functor f) => (a -> b) -> f a -> f b

We can apply this idea over many different data structures in Haskell. However, it is a pure function. If the operation we're attempting requires a monadic effect, we won't be able to use fmap. For an example, consider having a list of strings which represent people's names. These names correspond to objects of type Person in a database, and we would like to take our list and look up all the people. Here's a naive C++ outline of this:

class Person;

Person lookupPersonByName(const std::string& name) {
  // Database call
  ...
}

std::vector<Person> lookupAllNames(const std::vector& names) {
  std::vector<Person> results;
  for (const auto& name : names) {
    results.push_back(lookupPersonByName(name));
  }
  return results;
}

In C++, this function can just be a normal for loop (though we would want to parallelize in a production setting). In Haskell though, the lookupAllNames function would need to be an IO-like function.

data Person = ...

lookupPersonByName :: String -> IO Person
...

This means we can't use fmap. Now, mapM from the last article is a viable option here. But it's important to also consider its generalization, found in the Traversable class:

class (Functor t, Foldable t) => Traversable t where
  traverse :: Applicative f => (a -> f b) -> t a -> f (t b)

Let's break this down. The traverse function has two inputs:

A function transforming an object using an effect (Applicative or Monadic)
A container of that object. (The container is the traversable class)

The result of this is a new container which has applied the transformation to the input elements, occurring within the effect class. So for our database example, we might have a list of names, and we transform them all:

data Person = ...

lookupPersonByName :: String -> IO Person

lookupAllNames :: [String] -> IO [Person]
lookupAllNames = traverse lookupPersonByName

Since any "foldable functor" will do though, we can also apply a traversal over a Maybe String, an Either String object, or a Map of strings, for example. All these calls will occur in the IO monad.

lookupAllNames :: (Foldable t, Functor t) => t String -> IO (t Person)
lookupAllNames = traverse lookupPersonByName

...

>> :t (lookupAllNames (Just "Joseph"))
IO (Maybe Person)
>> :t (lookupAllNames (Right "Joseph"))
IO (Either a Person)
>> :t (lookupAllNames (Map.fromList [(1, "Joseph"), (2, "Christina")]))
IO (Map.Map Int Person)

A big advantage of this approach over C++ is that we can use Haskell's monadic behavior to easily determine the correct side effect when an operation fails. In our example, by wrapping the calls in IO we ensure that the user is aware that an IO error could occur that they might need to catch. But we could also improve the monad to make the type of error more clear:

data DatabaseError

lookupPersonByName :: String -> ExceptT DatabaseError IO Person

lookupAllNames :: (Foldable t, Functor t) => t String -> ExceptT DatabaseError IO (t Person)

In this particular case, we would short circuit the operation when an error is encountered. In C++, you would probably want to have lookupPersonByName return a type like StatusOr<Person>. Combining these Status objects appropriately might be a bit tricky. So it's nice that monads do this for us automatically in Haskell.

The last thing I'll note is that in Data.Traversable we finally have a function defined as the word for! Similar to mapM and forM, this function is just a flipped version of traverse:

for :: (Traversable t, Applicative f) => t a -> (a -> f b) -> f (t b)

lookupPersonByName :: String -> IO Person

lookupAllNames :: [String] -> IO [Person]
lookupAllNames inputs = for inputs lookupPersonByName

So now when someone says "Haskell doesn't have for loops", you know the proper reply is "yes it does, here is the 'for' function!".

For these two months now, we've explored a bit about monads, and we've explored different kinds of loops. In both these cases, IO presents some interesting challenges. So next month is going to be all about IO! So make sure you keep coming back Mondays and Thursdays, and subscribe to our monthly newsletter so you'll get a summary in case you miss something!

James Bowen

Traverse: Fully Generalized Loops

Getting a Handle on IO

Effectful Loops: Sequence and MapM