Understanding Haskell's Type System

One of the things that makes Haskell unique is its strong, static type system. Understanding this system is one of the keys to understanding Haskell. It is radically different from the dynamic typing of languages like Python, Javascript, and Ruby. It’s even different in some respects from other statically typed languages like Java and C++.

Static Typing Vs. Dynamic Typing

To understand why the type system is so important, we have to first consider the advantages and disadvantages of static typing in the first place. One important benefit of  static typing is that many bugs are found at compile time, rather than run time. With a dynamically typed language, you might ship code that seems to work perfectly well. But then in production a user stumbles on an edge case where, for instance, you forget to cast something as a number. And your program crashes.

With static typing, your code will simply not compile in the first place. This also means that errors are more likely to be found where they first occur. An object with the wrong type will raise a compile issue at its source, rather than further down the call chain where it is actually used. As a result of this, static typing needs a lot less unit testing around type agreement between methods. It doesn’t excuse a total lack of testing, but it certainly reduces the burden.

So what are the disadvantages of static typing? Well for one thing, development can be slower. It’s oftentimes a lot easier to hack together a solution in a dynamically typed language. Suppose you suddenly find it useful to do a database operation in the middle of otherwise pure code. This is very hard in Haskell, but easier in Javascript or Python.

Another disadvantage is that there’s more boilerplate code in statically typed languages. In languages like C++ and Java you declare the type of almost every variable you make, which you don’t have to do in Python. However, this is isn’t a big a problem for Haskell, as we’ll see.

Haskell and Type Inference

The first step in truly understanding Haskell’s type system is to remember the fundamental difference between functional programming and imperative programming. In imperative programming, we are giving the program a series of commands to run. In functional programming, our code actually consists of expressions to be evaluated. Under Haskell’s type system, every expression has a type.

Luckily, unlike C++ and Java, Haskell’s most widely used compiler (GHC) is generally quite good at type inference. It can usually determine the type of every expression in our code. For instance, consider this function:

func baseAmt str = replicate rptAmt newStr
  where
    rptAmt = if baseAmt > 5 
      then baseAmt 
      else baseAmt + 2
    newStr = “Hello “ ++ str

In this function, notice that we do not assign any types to our expressions, as we would have to with Java or C++ variables. The compiler has enough information to know that “baseAmt” is an integer, and “str” is a string without us telling it. Further, if we were to use the result of this function somewhere else in our code, the compiler would already know that the resulting type is a list of strings.

We can also see some consequences that the type system has on code syntax. Because every expression has a type, all if statements must have an else branch, and the result of both branches must have the same type. Otherwise the compiler would not, in this example, know what the type of “rptAmt” is.

While we can avoid using type signatures most of the time, it is nonetheless standard practice to at least give a signature for your top level functions and objects. This is because the best way to reason about a Haskell program is to look at the types of the relevant functions. We would write the above code as:

func :: Int -> String -> [String]
func baseAmt str = repeat rptAmt newStr
  where
    rptAmt = if baseAmt > 5 
      then baseAmt 
      else baseAmt + 2
    newStr = “Hello “ ++ str

In fact, it is not unreasonable when writing a Haskell program to write out the types of every function you will need before implementing any of them. This is called Type Driven Development.

You could go further and also give every variable a type signature, but this is generally considered excessive. We can see how much clunkier the code gets when we do this:

func :: Int -> String -> [String]
func baseAmt str = repeat rptAmt newStr
  where
    rptAmt :: Int
    rptAmt = if baseAmt > 5 
      then baseAmt 
      else baseAmt + 2
    newStr :: String
    newStr = “Hello “ ++ str

Functions as Types

In Haskell, functions are expressions, just like an integer or a string. Hence, they all have types. In the example, we see that “func” is a function that takes two arguments, an Int and a String, and returns a list of Strings. We can apply a function by giving it arguments. Function application changes the type of an expression. For instance, we could fully apply our function by providing two arguments, and the resulting type would be a list of Strings:

myStrings :: [String]
myStrings = func 4 “Hello”

However, we don’t have to apply ALL the arguments of a function at once. We could apply only the integer argument, and this would leave us with a function that still required a String. We could then call this partial function with the one missing argument:

partial :: String -> [String]
partial = func 4

full :: [String]
full = partial “World”

Conclusion

So if you’re new to Haskell, this was probably a lot of information, but here are a few main takeaways to remember:

  1. Haskell is a statically typed language

  2. Every expression in Haskell has a type, including functions and if statements

  3. The compiler can usually infer the types of expressions, but you should generally write out the type signature for top level functions and expressions.

Previous
Previous

Using Compound Types to Make Haskell Easier