Haskell Data 5: Type Families

Welcome to the conclusion of our series on Haskell data types! We've gone over a lot of things in this series that demonstrated Haskell's simplicity. We compared Haskell against other languages where we saw more cumbersome syntax. In this final part, though, we'll see something a bit more complicated: a quick exploration of the idea of type families. We'll start by tracing the evolution of some related type ideas, and then look at a quick example.

Type families are a rather advanced concept. But if you're more of a beginner, we've got plenty of other resources to help you out! Take a look at our Getting Started Checklist or our Liftoff Series!

Different Kinds of Type Holes

In this series so far, we've seen a couple of different ways to "plug in a hole", as far as a type or class definition goes. In the third part of this series we explored parametric types. These have type variables as part of their definition. We can view each type variable as a hole we need to fill in with another type.

Then in the fourth part, we explored the concept of typeclasses. For any instance of a typeclass, we're plugging in the holes of the function definitions of that class. We fill in each hole with an implementation of the function for that particular type.

In this last part, we're going to combine these ideas to get type families! A type family is an enhanced class where one or more of the "holes" we fill in is actually a type! This allows us to associate different types with each other. The result is that we can write special kinds of polymorphic functions.
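As a quick preview of the idea before the logger example, here is a minimal sketch of a class with a type-level "hole". The Container class, its Elem type, and the CharBag wrapper are all hypothetical names made up for this illustration; they do not appear elsewhere in the series.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

-- Hypothetical class: every instance fills in the Elem "hole" with a type,
-- alongside the usual function "holes".
class Container c where
  type Elem c :: Type
  emptyC :: c
  insertC :: Elem c -> c -> c
  toListC :: c -> [Elem c]

-- For lists, the element type is the list's own type parameter.
instance Container [a] where
  type Elem [a] = a
  emptyC = []
  insertC = (:)
  toListC = id

-- Hypothetical wrapper with no type parameter; it fixes its element type to Char.
newtype CharBag = CharBag String

instance Container CharBag where
  type Elem CharBag = Char
  emptyC = CharBag ""
  insertC ch (CharBag s) = CharBag (ch : s)
  toListC (CharBag s) = s

numbers :: [Int]
numbers = insertC 1 (insertC 2 emptyC)

letters :: CharBag
letters = insertC 'a' (insertC 'b' emptyC)
```

The same insertC function accepts an Int for one container and a Char for the other, because each instance associates its own element type with the container type.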

A Basic Logger

First, here's a contrived example to use throughout this article. We want to have a logging typeclass, which we'll call MyLogger. It will have two main functions. We should be able to get all the messages in the log in chronological order. Then we should be able to log a new message, possibly while performing some sort of effect. A first pass at this class might look like this:

class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> logger -> logger

We can make a slight change that would use the State monad instead of passing the logger as an argument:

class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> State logger ()
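To make the State-based version concrete, here is a minimal sketch of one instance and how it could be run. SimpleLogger and demoMessages are hypothetical names introduced only for this illustration, and the sketch assumes the mtl package for Control.Monad.State.

```haskell
import Control.Monad.State (State, execState, modify)

-- The State-based class from above.
class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> State logger ()

-- Hypothetical logger: the newest message is kept at the head of the list.
newtype SimpleLogger = SimpleLogger [String]

instance MyLogger SimpleLogger where
  -- Reverse so that callers see the oldest message first.
  prevMessages (SimpleLogger msgs) = reverse msgs
  logString s = modify (\(SimpleLogger msgs) -> SimpleLogger (s : msgs))

-- Log two messages starting from an empty logger, then read them back.
demoMessages :: [String]
demoMessages = prevMessages (execState actions (SimpleLogger []))
  where
    actions :: State SimpleLogger ()
    actions = do
      logString "first"
      logString "second"
```

Everything here is pure: execState runs the logging actions without touching IO, which is exactly why this version cannot write to a console, file, or database.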

But this class is deficient in an important way: we can't associate any effects with our logging. What if we want to save the log message in a database, send it over a network connection, or log it to the console? We could allow this, while still keeping prevMessages pure, like so:

class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> StateT logger IO ()

Now our logString function can use arbitrary effects. But this has the obvious downside that it forces us to introduce the IO monad in places where we don't need it. If our logger doesn't need IO, we don't want it. So what do we do?

Type Family Basics

One answer is to make our class a type family. We do this with the type keyword in the class definition. First, we need a few language pragmas to allow this:

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

Now we'll make a type within our class that refers to the monadic effect type of the logString function. We have to describe the "kind" of the type with the definition. Since it's a monad, its kind is * -> *. This indicates that it requires another type parameter. Here's what our definition looks like:

class MyLogger logger where
  type LoggerMonad logger :: * -> *
  prevMessages :: logger -> [String]
  logString :: String -> (LoggerMonad logger) ()
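To see how the associated type gets filled in, here is a minimal sketch with a hypothetical NullLogger (not part of the original example) whose monad is Identity. It spells the kind as Type -> Type using Data.Kind, an equivalent spelling of * -> *. It also uses TypeApplications, because the logger type appears in logString only underneath LoggerMonad, so the caller has to name it explicitly.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}

import Data.Functor.Identity (Identity, runIdentity)
import Data.Kind (Type)

class MyLogger logger where
  type LoggerMonad logger :: Type -> Type
  prevMessages :: logger -> [String]
  logString :: String -> LoggerMonad logger ()

-- Hypothetical logger that discards every message.
data NullLogger = NullLogger

instance MyLogger NullLogger where
  type LoggerMonad NullLogger = Identity
  prevMessages _ = []
  logString _ = pure ()

-- LoggerMonad NullLogger () reduces to Identity (), so runIdentity applies.
-- The @NullLogger application selects the instance; the result type alone
-- would not, since several loggers could share the same monad.
discarded :: ()
discarded = runIdentity (logString @NullLogger "ignored")
```

This is also the reason AllowAmbiguousTypes is in the pragma list: without it, GHC rejects the logString signature in the class declaration as ambiguous.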

Some Simple Instances

Now that we have our class, let's make an instance that does NOT involve IO. We'll use a simple wrapper type for our logger. Our "monad" will contain the logger in a State. Then all we do when logging a string is change the state!

newtype ListWrapper = ListWrapper [String]
instance MyLogger ListWrapper where
  type (LoggerMonad ListWrapper) = State ListWrapper
  prevMessages (ListWrapper msgs) = reverse msgs
  logString s = do
    (ListWrapper msgs) <- get
    put $ ListWrapper (s : msgs)
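Here is a self-contained sketch of how this pure instance could be exercised. The pureLog helper is a hypothetical name added for illustration, and the sketch assumes the mtl package. The calls name the logger with a type application, since LoggerMonad alone does not tell GHC which logger is meant.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}

import Control.Monad.State (State, execState, get, put)
import Data.Kind (Type)

class MyLogger logger where
  type LoggerMonad logger :: Type -> Type
  prevMessages :: logger -> [String]
  logString :: String -> LoggerMonad logger ()

newtype ListWrapper = ListWrapper [String]

instance MyLogger ListWrapper where
  type LoggerMonad ListWrapper = State ListWrapper
  prevMessages (ListWrapper msgs) = reverse msgs
  logString s = do
    ListWrapper msgs <- get
    put $ ListWrapper (s : msgs)

-- Hypothetical helper: run two log actions purely and read the messages back.
pureLog :: [String]
pureLog = prevMessages (execState actions (ListWrapper []))
  where
    actions :: State ListWrapper ()
    actions = do
      logString @ListWrapper "first"
      logString @ListWrapper "second"
```

Note that pureLog is an ordinary value of type [String]; no IO appears anywhere in its type.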

Now we can make a version of this that starts involving IO, but without any extra "logging" effects. Instead of using a list for our state, we'll use a mapping from timestamps to the messages. When we log a string, we'll use IO to get the current time and store the string in the map with that time.

newtype StampedMessages = StampedMessages (Data.Map.Map UTCTime String)
instance MyLogger StampedMessages where
  type (LoggerMonad StampedMessages) = StateT StampedMessages IO
  prevMessages (StampedMessages msgs) = Data.Map.elems msgs
  logString s = do
    (StampedMessages msgs) <- get
    currentTime <- lift getCurrentTime
    put $ StampedMessages (Data.Map.insert currentTime s msgs)
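A similar sketch shows this instance running inside IO. The stampedDemo helper is a hypothetical name, and the sketch assumes the mtl, transformers, containers, and time packages.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}

import Control.Monad.State (StateT, execStateT, get, put)
import Control.Monad.Trans.Class (lift)
import Data.Kind (Type)
import qualified Data.Map as Map
import Data.Time.Clock (UTCTime, getCurrentTime)

class MyLogger logger where
  type LoggerMonad logger :: Type -> Type
  prevMessages :: logger -> [String]
  logString :: String -> LoggerMonad logger ()

newtype StampedMessages = StampedMessages (Map.Map UTCTime String)

instance MyLogger StampedMessages where
  type LoggerMonad StampedMessages = StateT StampedMessages IO
  -- Map.elems returns values in ascending key order, i.e. oldest first.
  prevMessages (StampedMessages msgs) = Map.elems msgs
  logString s = do
    StampedMessages msgs <- get
    currentTime <- lift getCurrentTime
    put $ StampedMessages (Map.insert currentTime s msgs)

-- Hypothetical helper: log two messages in IO, then read them back.
stampedDemo :: IO [String]
stampedDemo = do
  finalLog <- execStateT actions (StampedMessages Map.empty)
  pure (prevMessages finalLog)
  where
    actions :: StateT StampedMessages IO ()
    actions = do
      logString @StampedMessages "first"
      logString @StampedMessages "second"
```

One caveat of keying by timestamp: Map.insert replaces any existing entry with the same key, so two messages logged with an identical timestamp would collapse into one.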

More IO

Now for a couple of examples that use IO in a traditional logging way while also storing the messages. Our first example is a ConsoleLogger. It will save the message in its State but also log the message to the console.

newtype ConsoleLogger = ConsoleLogger [String]
instance MyLogger ConsoleLogger where
  type (LoggerMonad ConsoleLogger) = StateT ConsoleLogger IO
  prevMessages (ConsoleLogger msgs) = reverse msgs
  logString s = do
    (ConsoleLogger msgs) <- get
    lift $ putStrLn s
    put $ ConsoleLogger (s : msgs)

Another option is to write our messages to a file! We'll store the file name as part of our state, though we could use the Handle if we wanted.

newtype FileLogger = FileLogger (String, [String])
instance MyLogger FileLogger where
  type (LoggerMonad FileLogger) = StateT FileLogger IO
  prevMessages (FileLogger (_, msgs)) = reverse msgs
  logString s = do
    (FileLogger (filename, msgs)) <- get
    handle <- lift $ openFile filename AppendMode
    lift $ hPutStrLn handle s
    lift $ hClose handle
    put $ FileLogger (filename, s : msgs)

And we can imagine that we would have a similar situation if we wanted to send the logs over the network. We would use our State to store information about the destination server. Or else we could add something like Servant's ClientM monad to our stack in the type definition.

Using Our Logger

By defining our class like this, we can now write a polymorphic function that will work with any of our loggers!

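-- Uses ScopedTypeVariables and TypeApplications: LoggerMonad is not injective,
-- so each call names the logger type with @logger.
runComputations :: forall logger. (MyLogger logger, Monad (LoggerMonad logger)) => InputType -> (LoggerMonad logger) ResultType
runComputations input = do
  logString @logger "Starting Computation!"
  let x = firstFunction input
  logString @logger "Finished First Computation!"
  let y = secondFunction x
  logString @logger "Finished Second Computation!"
  return y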

This is awesome because our code is now abstracted away from the needed effects. We could call this with or without the IO monad.
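To show both call styles side by side, here is a self-contained sketch that runs the same function with a pure logger and with a console logger. InputType, ResultType, firstFunction, and secondFunction are never defined in this article, so the versions below are hypothetical stand-ins, as are the pureRun and ioRun helpers. The sketch assumes the mtl and transformers packages, and relies on ScopedTypeVariables and TypeApplications so that each call can name the logger type explicitly.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Control.Monad.State (State, StateT, get, put, runState, runStateT)
import Control.Monad.Trans.Class (lift)
import Data.Kind (Type)

class MyLogger logger where
  type LoggerMonad logger :: Type -> Type
  prevMessages :: logger -> [String]
  logString :: String -> LoggerMonad logger ()

newtype ListWrapper = ListWrapper [String]

instance MyLogger ListWrapper where
  type LoggerMonad ListWrapper = State ListWrapper
  prevMessages (ListWrapper msgs) = reverse msgs
  logString s = do
    ListWrapper msgs <- get
    put $ ListWrapper (s : msgs)

newtype ConsoleLogger = ConsoleLogger [String]

instance MyLogger ConsoleLogger where
  type LoggerMonad ConsoleLogger = StateT ConsoleLogger IO
  prevMessages (ConsoleLogger msgs) = reverse msgs
  logString s = do
    ConsoleLogger msgs <- get
    lift $ putStrLn s
    put $ ConsoleLogger (s : msgs)

-- Hypothetical stand-ins for the types and functions the article leaves open.
type InputType = Int
type ResultType = Int

firstFunction :: InputType -> Int
firstFunction = (+ 1)

secondFunction :: Int -> ResultType
secondFunction = (* 2)

runComputations :: forall logger. (MyLogger logger, Monad (LoggerMonad logger))
                => InputType -> LoggerMonad logger ResultType
runComputations input = do
  logString @logger "Starting Computation!"
  let x = firstFunction input
  logString @logger "Finished First Computation!"
  let y = secondFunction x
  logString @logger "Finished Second Computation!"
  return y

-- Pure run: no IO anywhere in the type.
pureRun :: (ResultType, [String])
pureRun = (result, prevMessages finalLogger)
  where
    (result, finalLogger) = runState (runComputations @ListWrapper 3) (ListWrapper [])

-- Effectful run: the same function, now also printing each message.
ioRun :: IO (ResultType, [String])
ioRun = do
  (result, finalLogger) <- runStateT (runComputations @ConsoleLogger 3) (ConsoleLogger [])
  pure (result, prevMessages finalLogger)
```

The only thing that changes between the two runs is the type application at the call site; the body of runComputations is shared.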

Comparing to Other Languages

Now, to be fair, this is one area of Haskell's type system that makes it a bit more difficult to use than other languages. Arbitrary effects can happen anywhere in Java or Python. Because of this, we don't have to worry about matching up effects with types.

But let's not forget about the benefits! For all parts of our code, we know what effects we can use. This lets us determine at compile time where certain problems can arise.

And type families give us the best of both worlds! They allow us to write polymorphic code that can work either with or without IO effects!

Conclusion

That's all for our series on Haskell's data system! We've now seen a wide range of elements, from the simple to the complex. We compared Haskell against other languages. Again, the simplicity with which one can declare data in Haskell and use it polymorphically was a key selling point for me!

Hopefully this series has inspired you to get started with Haskell if you haven't already! Download our Getting Started Checklist or read our Liftoff Series to get going!