James Bowen

Faster Code with Laziness

Most programming languages use an evaluation paradigm called eager evaluation. This means the moment the program execution reaches an expression, the expression is evaluated. This makes sense in imperative languages. An imperative program is a series of commands. To execute the commands in order, it is best to evaluate each expression immediately.

Haskell is different. As we’ve discussed, a Haskell program is actually a series of expressions to be evaluated. However, Haskell uses lazy evaluation. It only evaluates the expressions it absolutely needs to in order to produce the program’s output. Furthermore, it only evaluates an expression at the moment it is needed, rather than the moment it is encountered.

Laziness Under the Hood

So how does Haskell accomplish this? To understand, we have to take a deeper look at Haskell’s memory model. Consider this simple program:

main = do
  let myTuple = ("first", map (*2) [1,2,3,4])
  print "Hello"
  print $ fst myTuple
  print $ head (snd myTuple)
  print (length (snd myTuple))
  print $ snd myTuple

When the first print statement (“Hello”) happens, the expression myTuple is actually completely unevaluated, even though it is defined before the print statement. It is represented in memory by what is called a “thunk”. Of course, our program knows how to evaluate this thunk when the time comes. But for now, no value is there. When it prints the first element of the tuple, it still doesn’t completely evaluate myTuple. When the compiler sees us call the fst function on myTuple, it knows myTuple must be a tuple. Instead of seeing a single thunk at this point, the compiler sees myTuple as a tuple containing two unevaluated thunks.

Next, we print the first element of myTuple. Printing an expression forces it to be completely evaluated. So after this, the compiler sees myTuple as a tuple containing a string in its first position and an unevaluated thunk in its second position.

At the next step, we print the head of the second element of myTuple. This tells the compiler this second element must be a non-empty list. If it were empty, our program would actually crash here. This forces the evaluation of this first element (2). However, the remainder of the list remains an unevaluated thunk.

Next, we print the length of the list. Haskell will do enough work here to determine how many elements are in the list, but it will not actually evaluate any more items. The list is now an evaluated first element, and 3 unevaluated thunks. Finally, we print the full list. This evaluates the list in its entirety. If we did not do this last step, the final 3 elements would never be evaluated.
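You can actually watch this happen in GHCi with the :sprint command, which prints a binding without forcing it. Here is a rough sketch (underscores stand for unevaluated thunks; the exact output can vary between GHC versions):

>> let myList = map (*2) [1,2,3,4] :: [Int]
>> :sprint myList
myList = _
>> head myList
2
>> :sprint myList
myList = 2 : _
>> length myList
4
>> :sprint myList
myList = [2,_,_,_]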

Advantages of Laziness

So all of this seems extremely convoluted. How exactly does laziness help us? Well, the main benefit is it makes our program faster. Consider the following example:

function1 :: Int
function1 = function2 exp1 exp2 exp3
  where
    exp1 = reallyLongFunction 1234
    exp2 = reallyLongFunction 3151
    exp3 = reallyLongFunction 8571

function2 :: Int -> Int -> Int -> Int
function2 exp1 exp2 exp3 = if exp1 < 1000
  then exp2
  else if exp1 < 2000
    then exp3
    else exp1

Comparable code in C++ or Java would need to make all three expensive calls to reallyLongFunction before calling function2 with the results. But in Haskell, the program will not call reallyLongFunction until it absolutely needs to.

So in this example, we will always evaluate exp1 in order to determine the result of the if statement in function2. However, if we happen to have exp1 >= 2000, then we’ll never evaluate exp2 or exp3! We don’t need them! We’ll save ourselves from having to make the expensive calls to reallyLongFunction. As a result, our program will run faster.
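If you want to see this for yourself, one rough sketch uses Debug.Trace, with function1 and function2 defined as above. The reallyLongFunction here is just a made-up stand-in for an expensive call:

import Debug.Trace (trace)

-- A hypothetical stand-in for an expensive computation
reallyLongFunction :: Int -> Int
reallyLongFunction n = trace ("evaluating reallyLongFunction " ++ show n) (n * 2)

main :: IO ()
main = print function1
-- Only "evaluating reallyLongFunction 1234" is traced before 2468 is printed:
-- exp1 evaluates to 2468, so exp2 and exp3 are never demanded.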

For another example, suppose we have a quicksort function. The quicksort algorithm works like this:

  1. Pick a partition element
  2. Move all elements smaller than the partition into one list
  3. Move all elements larger than the partition into another list
  4. Sort these two smaller lists recursively.
  5. Combine them with the partition.

In Haskell, we can leverage this function quite easily to just get the smallest 3 elements of a list:

smallest3 :: [Int] -> [Int]
smallest3 input = take 3 (quicksort input)

quicksort :: Ord a => [a] -> [a]
...
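One standard way to fill in that definition, following the steps above (a simple sketch, not the most optimized version):

quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (partition : rest) = quicksort smaller ++ [partition] ++ quicksort larger
  where
    smaller = filter (< partition) rest   -- elements smaller than the partition
    larger  = filter (>= partition) rest  -- the remaining elements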

Since Haskell is lazy, we’ll never have to touch the larger half of the partition, even in the recursive calls. In an eager language, the program would run the full sorting algorithm before taking the first three elements, which takes much longer than we need.

Another cool advantage of laziness is the possibility of infinite lists. Here are two ways to create infinite lists:

allTwos :: [Int]
allTwos = repeat 2

first4 :: [Int]
first4 = cycle [1,2,3,4]

Infinite lists can provide elegant solutions to many list problems. We can never do something that will bring the full infinite list into scope by, say, printing it. But we can take the first n elements of an infinite list! The "infinite" remainder will stay as an unevaluated thunk.
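For example:

>> take 5 allTwos
[2,2,2,2,2]
>> take 6 first4
[1,2,3,4,1,2]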

Disadvantages of Laziness

Laziness does have some disadvantages though. In particular, when using recursion or recursive data structures, unevaluated thunks can build up in heap memory. If they consume all your memory, your program will crash. The foldl function is a classic victim of this problem. The following usage will probably fail due to a memory leak:

foldl (+) 0 [1..10^7]

Laziness can also make it harder to reason about our code. Just because a number of expressions are defined in the same where clause does not mean that they will be evaluated at the same time. This can easily confuse beginners.

Introducing Strictness

Now, if we want, there are a few ways to "remove" laziness from our code and instead introduce strictness. The first is the seq function. It has the type:

seq :: a -> b -> b

It takes two arguments. The output is simply the second argument. However, the function is “strict” in its first argument: whenever the result of seq is evaluated, the first argument gets evaluated too (to its outermost constructor), even if its value is never otherwise needed. For example, consider the following:

>> seq 3 "Hello"
"Hello"
>> seq undefined 3
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:2:5 in interactive:Ghci2

The foldl function has a strict counterpart, foldl' (found in Data.List). Observe how it uses seq to avoid the memory leak problem:

foldl' f z [] = z
foldl' f z (x:xs) = let z' = z `f` x
  in seq z' $ foldl' f z' xs

Try the same addition example from above with foldl'. It works perfectly well. By forcing the evaluation of the sum, we prevent the buildup of unevaluated thunks on the heap. This avoids the memory leak.
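In GHCi, that looks something like this:

>> import Data.List (foldl')
>> foldl' (+) 0 [1..10^7]
50000005000000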

We can also introduce strictness into our datatypes. Let’s compare two different Person types.

data Person = Person
  { pFirstName :: String
  , pLastName :: String
  , pAge :: Int }

data StrictPerson  = StrictPerson
  { firstName :: !String
  , lastName :: !String
  , age :: !Int }

The StrictPerson data type (with the exclamation points) is strict in all its arguments. Whenever a StrictPerson object is evaluated, all its fields will be evaluated as well. This is not the case with the regular Person type. One difference is that we can use record syntax to create an incomplete person. We cannot create an incomplete StrictPerson:

-- This works! (though the compiler might give us a warning)
ageOfIncomplete :: Int
ageOfIncomplete = pAge incompletePerson
  where
    incompletePerson = Person { pAge = 25}

-- This will not compile
ageOfIncompleteStrict :: Int
ageOfIncompleteStrict = age incompletePerson
  where 
    incompletePerson = StrictPerson {age = 25}

Naturally, even the Person object would fail if we tried to access the undefined pFirstName field. In general, it is good to introduce strictness into your data types. The exception is if one of the fields is possibly unnecessary and expensive to compute.

Summary

  1. Haskell is a lazy language. It does not evaluate expressions until it absolutely must.
  2. This frequently allows our programs to save time by avoiding unnecessary computation, but it puts them at greater risk of leaking memory.
  3. There are ways of introducing strictness into our programs when we don’t want lazy evaluation.

If you're ready to try out your Haskell skills, check out our free workbook featuring content on recursion and higher order functions, plus 10 practice problems!

James Bowen

Immutability is Awesome

Immutability is a confusing concept in Haskell. It is the property of all Haskell expressions that, once they are assigned to a name, they CANNOT change their value. To most programmers, this seems like a completely crazy idea. How can you program at all when you cannot change the value of an expression? To start, let’s consider this example of how mutability works in Python:

>>> a = [1,2,3]
>>> a.reverse()
>>> a
[3, 2, 1]
>>> b = a.reverse()
>>> a
[1, 2, 3]
>>> b
(No output)
>>>

We see that when we call the reverse function on a, the value of a itself changes. When we inspect a later, we get the reversed list. Meanwhile, reverse has no return value (it returns None), so when we assign the result of a.reverse() to b, we get nothing.

Let’s see what happens in Haskell:

>> let a = [1,2,3]
>> reverse a
[3,2,1]
>> a
[1,2,3]

This time, the value of a did not change. Calling reverse seems to have produced the output we expected, but it had no impact on the original variable. To access the reversed list, we would need to assign it to a name (by using let).
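For example:

>> let b = reverse a
>> b
[3,2,1]
>> a
[1,2,3]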

Build it Again

To understand what is happening, let’s first look at the type signature for reverse:

reverse :: [a] -> [a]

We see reverse takes one argument, a list. It then outputs a list of the same type. It’s a pure function, so it cannot change the input list! Instead, it constructs a completely new list with the same elements as the input, only in reverse order. This may seem strange, but it has huge implications on our ability to reason about code. Consider the following:

operateOnList :: [Int] -> [Int]
operateOnList ints = concat [l1, l2, l3]
  where 
    l1 = funcOnList1 ints
    l2 = funcOnList2 ints
    l3 = funcOnList3 ints

Comparable Python code is much harder to reason about. Each function might mutate the input list, so we don’t know what value is passed to the second and third functions. However, in Haskell, we know we will pass the exact same list as a parameter to each of these functions! This simple principle makes it extremely easy to refactor Haskell code.

Efficiency Concerns

You might be wondering, isn’t immutability really inefficient? If reverse creates a completely new list, is it copying every element from the old list? In fact, it does not copy the elements. It simply uses the exact same elements from the original list. In C++ or Python, this would raise the question of what would happen when we modify one of the elements in the first list. Wouldn’t the reversed list also change? But in Haskell, we can’t modify the items in the first list because of immutability! So we’re safe!

Now, remaking the data structure itself does introduce some overhead. In the case of reverse we do have to create a new linked list with a new set of pointers. But in most cases, it’s not so bad. Consider the cons operator (:).

(:) :: a -> [a] -> [a]

This function takes an element and a list, and returns a new list with the new element added to the front of the old list. This is quite efficient, because the new list simply uses the old list as its tail. There is no need to re-create a separate data structure. Immutability protects us from all the nasty side effects of having two data structures share the same data. Other data structures work in the same way. For instance, the insert function of the Map data structure takes a key, a value, and an old map, and returns a new map with all the elements of the old map, plus the new mapping between the key and the value.

insert :: Ord k => k -> v -> Map k v -> Map k v
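Here is a small sketch using Data.Map (from the containers package) showing that the old map is untouched:

import qualified Data.Map as Map

maps :: (Map.Map Int String, Map.Map Int String)
maps = (m1, m2)
  where
    m1 = Map.fromList [(1, "a"), (2, "b")]
    -- m2 gets the new key, while m1 still has only keys 1 and 2
    m2 = Map.insert 3 "c" m1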

This is done efficiently: the new map shares almost all of the old map’s structure, and none of the underlying elements are copied. In a previous article we discussed record syntax on algebraic data types. We saw how we could “modify” objects with record syntax:

makeJohnDoe :: Person -> Person
makeJohnDoe person = person
  { firstName = "John"
  , lastName = "Doe" }

But this does not change the old Person object. This has the same effect as reversing a list. It recreates the Person structure, but none of the underlying elements. And the result is a completely new Person object. So remember:

Summary

  1. Expressions in Haskell are immutable. They cannot change after they are evaluated.
  2. Immutability makes refactoring super easy and code much easier to reason about.
  3. To “change” an object, most data structures provide methods taking the old object and creating a new copy.

If you want to get started with Haskell, be sure to check out our checklist to make sure you have everything you need!

James Bowen

Many Types, One Behavior

In the last two articles, we learned all the basic ways to make our own data types. Now we need to ask a question: what happens when we have multiple types with similar behavior? Polymorphic code can work with many different types. This flexibility makes our code more reusable and easier to reason about.

Equality

For a concrete example, let’s consider the simple issue of comparing two objects for equality. Let’s review our person data type:

data Person = Person
  { firstName :: String
  , lastName :: String
  , age :: Int
  , hometown :: String
  , homestate :: String }

We’ve seen before how we can compare integers and strings using the equality function (==). What happens when we try to use this with two Person objects?

>> 1 == 1
True
>> "Hi" == "Bye"
False
>> let p1 = Person "John" "Doe" 23 "Kansas City" "Missouri"
>> let p2 = Person "Jane" "Doe" 22 "Columbus" "Ohio"
>> p1 == p2
    No instance for (Eq Person) arising from a use of ‘==’
    In the expression: p1 == p2
    In an equation for ‘it’: it = p1 == p2

This gives us an error. In frustration, we could define our own function for comparing people, using the equality of each field:

personsEqual :: Person -> Person -> Bool
personsEqual p1 p2 = firstName p1 == firstName p2 &&
  lastName p1 == lastName p2 && 
  age p1 == age p2 &&
  hometown p1 == hometown p2 &&
  homestate p1 == homestate p2

The Eq Typeclass

But this is tedious code. It would be a major pain to write out functions like this for every single type we create. How can we make it so we can compare people with (==)? Let’s dig a little deeper into the (==) function by looking at its type:

(==) :: Eq a => a -> a -> Bool

Notice this doesn’t specify a particular type. It uses a generic type a! This explains how it can compare both Ints and Strings. But why not the Person type? Let’s observe the caveat on this type signature. What does Eq a mean? We saw something similar in the error when we tried to compare our Person objects. Eq is a typeclass. A typeclass is a Haskell construct that specifies a set of functions any member type must support, describing common behavior across otherwise different types. It is similar to an Interface in Java. The Eq typeclass has two functions, (==) and (/=). Here is its definition:

class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool

These two functions are equality and inequality. We can define them for any type. In order for a data type to belong to the Eq typeclass, it must implement these functions. We make our data type a member of the typeclass by defining an instance for it:

instance Eq Person where
  (==) p1 p2 = firstName p1 == firstName p2 &&
  … as above

We don’t need to define (/=). The Eq class provides a default implementation of (/=) in terms of (==), which just negates the result. With this instance available, we can simply compare Person objects:

>> p1 == p2
False

Deriving Typeclasses

But this isn’t completely satisfying. We’re still writing out this tedious definition of the (==) function, when really it should be obvious when two Person objects are the same. They’re the same if all their fields are the same! Haskell is smart enough to figure this out if we tell it, using a derived instance:

data Person = Person { …as above... }
  deriving (Eq)

Then the compiler will derive an implementation of (==) and (/=) for the Person object doing exactly what we want them to do. This is pretty awesome!

The Show Typeclass

Let’s observe some other common typeclasses. The Show typeclass is also simple. Its key method is show, which takes the type and converts it to a String, similar to a toString() method in Java. We use this whenever we want to print something to the console. The compiler can usually derive a Show instance, just like an Eq instance. The only condition is that all the fields of the object also belong to the Show typeclass. But we can also customize it. For instance, we can use this implementation:

instance Show Person where
  show p1 = (firstName p1) ++ " "
    ++ (lastName p1) ++ ", age: "
    ++ (show (age p1)) ++ ", born in: "
    ++ (hometown p1) ++ ", "
    ++ (homestate p1)

And we can compare the printed string for a Person in the two cases.

(Using derived show)
>> let p = Person "John" "Doe" 23 "El Paso" "Texas"
>> p
Person {firstName = "John", lastName = "Doe", age = 23, hometown = "El Paso", homestate = "Texas"}
(Using our show)
>> p
John Doe, age: 23, born in: El Paso, Texas

The Ord Typeclass

Another common typeclass is Ord. We need this typeclass whenever we want to be able to sort or compare objects. It is necessary for certain data structures, like maps. We can derive it, but more often we’ll want special control over how we compare our objects. For instance, the derived instance of Ord will sort our people first by first name, then by last name, then by age, etc. However, suppose we want a different order. Let’s sort them by age instead, and then by last name, and then by first name. We would define the Ord instance like so:

instance Ord Person where
  compare p1 p2 = if ageComparison == EQ
    then if lastNameComparison == EQ
      then firstName p1 `compare` firstName p2
      else lastNameComparison
    else ageComparison
    where
      ageComparison = (age p1) `compare` (age p2)
      lastNameComparison = (lastName p1) `compare` (lastName p2)

We complete this instance with the single function, compare. This function compares two objects, and determines an ordering. The ordering can be one of three values: LT (less than, meaning the first parameter comes “first”), EQ, or GT (greater than). We can see the difference in the sort order of our people using these two different approaches:

>> let p1 = Person "John" "Doe" 23 "El Paso" "Texas"
>> let p2 = Person "Jane" "Doe" 24 "Houston" "Texas"
>> let p3 = Person "Patrick" "Green" 22 "Boston" "Massachusetts"
(Using derived Ord)
>> sort [p1, p2, p3]
[Jane Doe, age: 24, born in: Houston, Texas,John Doe, age: 23, born in: El Paso, Texas,Patrick Green, age: 22, born in: Boston, Massachusetts]
(Using our Ord)
>> sort [p1, p2, p3]
[Patrick Green, age: 22, born in: Boston, Massachusetts,John Doe, age: 23, born in: El Paso, Texas,Jane Doe, age: 24, born in: Houston, Texas]

Defining a Typeclass

We can even create our own typeclasses. Suppose we have another data type called Student:

data Student = Student
  { studentFirstName :: String
  , studentMiddleName :: String
  , studentLastName :: String
  , studentGradeLevel :: Int }

Notice both the Person class and the Student class have name fields. We could combine these to create a “total” name. We could make a typeclass for this behavior and call it HasName. Then we could provide instances for both types:

class HasName a where
  totalName :: a -> String

instance HasName Person where
  totalName p = firstName p ++ " " ++ lastName p

instance HasName Student where
  totalName s = studentFirstName s ++ " " ++ studentMiddleName s ++ " " ++ studentLastName s

And we’re done! That’s all it takes to make your own typeclasses!
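The payoff is that we can now write code against the typeclass rather than any concrete type. For instance, a hypothetical greet function works on both a Person and a Student:

greet :: HasName a => a -> String
greet x = "Hello, " ++ totalName x ++ "!"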

Summary

  1. Typeclasses allow us to establish common behavior between different types.
  2. Many typeclasses can be derived automatically.
  3. They allow us to use many helpful library methods, like sort.
  4. We can define our own typeclasses.

Now you can use built-in typeclasses and make your own. This opens up a big world of Haskell possibilities, so it’s time to start making your own project! Check out this checklist to make sure you have all the tools to get going!

James Bowen

Making Your Types Readable!

So in the last post we discussed creating our own data types using the data keyword. This is an excellent way to describe most of our data. But there are more ways we should explore. We’ll look at two simpler alternatives to creating a full data type.

The Type Keyword

The first method we’ll discuss is the type keyword. This keyword creates a type synonym, much like the typedef keyword from C. We take some type, and simply give it a new name we can use to reference it.

type MyWrapperType = (String, Int, Float)

When we do this, the code’s behavior does not really change. It makes no difference whether we call the type by its new name or its old name. We can even use both in the same function! All the same pattern matching rules apply to the synonym as the original type.

myTypeFunction :: MyWrapperType -> (String, Int, Float) -> MyWrapperType
myTypeFunction (str1, int1, flt1) (str2, int2, flt2) = (str2, int1, flt1)

The type keyword is an excellent way to clarify compound types. As you write more complicated code, you’ll find you often need to have more complicated types, like nested tuples and lists. For example, suppose some function generates a 5-tuple, and a few other functions take the type as input.

complicatedFunction :: … -> ([Int], Map String Float, Maybe String, Float, Int)

func1 :: ([Int], Map String Float, Maybe String, Float, Int) -> …

func2 :: ([Int], Map String Float, Maybe String, Float, Int) -> …

func3 :: ([Int], Map String Float, Maybe String, Float, Int) -> ...

We can dramatically clarify this code with type:

type UserChunk = ([Int], Map String Float, Maybe String, Float, Int)

complicatedFunction :: … -> UserChunk

func1 :: UserChunk -> …

func2 :: UserChunk -> …

func3 :: UserChunk -> …

The best example of a type synonym in normal Haskell is the String type. String is actually just a synonym for a list of characters!

type String = [Char]

One possible downside to using type synonyms is the compiler may not use your synonym when telling you about errors. Thus you need to be aware of the different renamed types in your code as you analyze errors.

Newtypes

Now we’ll discuss a different way of expressing our types: the newtype keyword. This creates a new, unique type, wrapping a single other type. For instance, we could create an Interest Rate newtype wrapping a floating point number:

newtype InterestRate = InterestRate Float

The syntax for newtype is just like creating an algebraic data type, except it can only use one constructor, and that constructor must have exactly one parameter. You can still use record syntax with a newtype. One convention is to give the field a name such as “un”-(newtype name), as follows:

newtype InterestRate = InterestRate { unInterestRate :: Float }

As with ADTs, record syntax allows you to unwrap the original float value without pattern matching. Though you can do either if you want:

func1 :: InterestRate -> Float
func1 (InterestRate flt1) = flt1 + 2.5

func2 :: InterestRate -> Float
func2 intRt = (unInterestRate intRt) + 2.5

The most important point about newtypes is they DO change compile time behavior! The following code would not work:

rate :: InterestRate
rate = InterestRate 0.5

badCall :: Float
badCall = func1 rate

func1 :: Float -> Float
func1 flt1 = flt1 + 2.5

This is because func1 demands a Float as input, but badCall passes an InterestRate, which is a different type. This is super helpful when you have numerous items which have the same underlying type, but actually represent different kinds of values.

Using Newtypes to Catch Errors

Let’s see how we can use newtypes to make a real program safer. Suppose you’re writing code for a bank. You might write a function taking a couple interest rates and a bank balance and performing some operations on them. If these are all represented by floating point numbers, you might get code looking like:

bankFunction :: Float -> Float -> Float -> Float
bankFunction balance intRate fedRate = balance * (1 + intRate + fedRate)

But now imagine trying to call this function. There are numerous ways to misorder the arguments, and the program will still compile without telling you about the error!

determineNewBalance :: Float -> Float
-- Wrong order!
determineNewBalance balance = bankFunction intRate fedRate balance
  where
    intRate = 6.2
    fedRate = 0.5

This code will have an error you can only catch at runtime (or with good testing), since we misordered the different float arguments. We want to avoid such code in Haskell when we can, because Haskell gives us the tools to do so! Let’s see how newtypes can help us avoid this error:

newtype BankBalance = BankBalance { unBankBalance :: Float }
newtype InterestRate = InterestRate { unInterestRate :: Float }

bankFunction :: BankBalance -> InterestRate -> InterestRate -> BankBalance
bankFunction (BankBalance b) (InterestRate i) (InterestRate f) = BankBalance (b * (1 + i + f))

determineNewBalance :: BankBalance -> BankBalance
determineNewBalance b = bankFunction intRate fedRate b
  where 
    intRate = InterestRate 6.2
    fedRate = InterestRate 0.5

This code, which contains the same fundamental error, will not compile anymore! This is because we pass interest rates as the first two parameters, when the first parameter to bankFunction is actually a BankBalance. Marvelous! Note that using type synonyms would not have helped us here, because the underlying types would continue to be floats.
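For completeness, the fix is simply to pass the arguments in the order bankFunction’s signature demands:

determineNewBalance :: BankBalance -> BankBalance
determineNewBalance b = bankFunction b intRate fedRate
  where
    intRate = InterestRate 6.2
    fedRate = InterestRate 0.5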

Summary

  1. The type keyword allows us to create type synonyms.
  2. It can allow renaming of complex types without altering compiler behavior.
  3. The newtype keyword allows us to create a wrapper around a single type.
  4. This makes it easier for us to distinguish (at compile time) between variables which have the same underlying type, but different meanings in our code.

Now that you’re ready to use types and newtypes to make your Haskell awesome, time to start making a Haskell project! Check out this checklist to make sure you have all the tools to get going!

James Bowen

Making Your Own Data Types in Haskell

So in the last post we looked at some of the built-in types in Haskell. But one of the best parts about Haskell is that it is easy to build our own types. We do this using the “data” keyword, and then giving a type name (which must begin with a capital letter):

data FightMove = …

Type Constructors

We then describe our type with a series of type constructors, separated by the | character. Each of these also has a name that begins with a capital letter. Let’s look at the most basic type constructors we can have:

data FightMove = Punch | Kick | Block

To see these type constructors in action, let’s build a function which takes some input, and returns a FightMove. We can customize which type of move we return based on the input.

chooseMove :: Int -> FightMove
chooseMove i = if i == 1
  then Punch
  else if i == 2
    then Kick
    else Block

Notice how we use the different constructors in the different cases, but they all typecheck as a FightMove! Now you might be wondering at this point, are these type constructors like constructors in, say, Java? In Java, we can have multiple constructors that each take a different set of parameters but all produce the same type of object.

However, no matter what, the resulting object in Java will always have the same instance variables with the same types. This is not the case in Haskell! Different cases of our types can have entirely different fields. For each constructor, we are not just listing the types it is initialized with, we are listing all the fields the object will have.

data FightMove = Punch Int |
  Kick Int Float |
  Block String

chooseMove :: Int -> FightMove
chooseMove i = if i == 1
  then Punch 3
  else if i == 2
    then Kick 3 6.5
    else Block "Please Protect Me"

So if we’re taking our type as input, how do we determine which type constructor was used? If we think back to the example with Maybe types in the last article, we remember using pattern matching to determine whether a value was Nothing or Just. But Nothing and Just are simply the two type constructors of Maybe! So we can use the same pattern matching approach for our types!

evaluateMove :: FightMove -> String
evaluateMove (Punch damage) = "Punched for " ++ (show damage) ++ " damage"
evaluateMove (Kick damage angle) = "Kicked at " ++ (show angle) ++ " degrees for " ++ (show damage) ++ " damage"
evaluateMove (Block message) = "Blocked while saying " ++ message

Parameterized Types

Speaking of Maybe, how is it that Maybe can wrap any type? How can we have a Maybe Int type and a Maybe [Float] type? The answer is that types can be “parameterized” by other types! Maybe is implemented as follows:

data Maybe a = Nothing | Just a

The “a” value in this definition is the parameter of the Maybe type. This lets us use any type we want to be wrapped inside! If we wanted our user to choose the way they represent “damage” in the FightMove type, we could easily parameterize this type as well:

data FightMove a = Punch a |
  Kick a Float |
  Block String

Now we can have separate types like FightMove Int, or FightMove Float, just as we can have different Maybe types like Maybe Int or Maybe Float.
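As a quick sketch, here are two concrete moves using different damage types:

intPunch :: FightMove Int
intPunch = Punch 3

floatKick :: FightMove Float
floatKick = Kick 2.5 45.0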

Record Syntax

As you get more comfortable creating your own types, you’ll make some that have many different fields. For instance, consider this type, which has one constructor with five fields.

data Person = Person String String Int String String

The only tool we have for accessing these fields right now is pattern matching, like we used above. If we wanted to get the values of these fields out, our code would need to look something like this:

funcOnPerson :: Person -> String
funcOnPerson (Person str1 str2 int1 str3 str4) = …

But what if we only cared about one of these fields? This code seems rather cumbersome for extracting that. On top of that, how are we supposed to keep track of what field represents what value? This is where record syntax comes into play. Record syntax allows us to assign a name to each field of a constructor.

data Person = Person
  { firstName :: String
  , lastName :: String
  , age :: Int
  , hometown :: String
  , homestate :: String }

Now that our different fields have names, each name acts as a function that allows us to extract that value from the larger object. We no longer have to deconstruct the entire object via pattern matching to get each value out.

birthPlace :: Person -> String
birthPlace person = hometown person ++ ", " ++ homestate person

It is also easy to create an updated copy of a person using record syntax. The following code will make a new Person that preserves the age, hometown, and home state fields of the original, but has a different name:

makeJohnDoe :: Person -> Person
makeJohnDoe person = person
  { firstName = "John"
  , lastName = "Doe" }

Summary

  1. You can make your own data types, with several different constructors.
  2. Constructors can have different numbers of parameters and different types.
  3. A type can be parameterized by other types, allowing a user to fill in what kind of type will be used inside the object.
  4. Record syntax allows us to access or change* fields of an object.

*Haskell objects are immutable, and cannot actually be changed. When you do this, you create a new copy of the object! We’ll discuss immutability more later!

If you’re super excited about making your own types and want to try out Haskell, check out our free checklist to kickstart your Haskell learning process. And stay tuned for more content here at the Monday Morning Haskell blog!

James Bowen

Using Compound Types to Make Haskell Easier

Now that we know a bit more about Haskell’s type system, let’s get more familiar with some of the basic types. Like pretty much any other language, Haskell has integers, floating point numbers, characters, and strings. We can define a function that uses all of these basic types:

myFunction :: Int -> Float -> Char -> String -> String
myFunction x y c str = (c : str) ++ " " ++ show x ++ show y


myResult :: String
myResult = myFunction 4 5.5 'a' "bcd"

Notice also that “myFunction” has its own type, based on what arguments it takes, and what it returns. As we know from the previous article, applying the function to arguments results in the final return type of the function, which we see in “myResult”.

Tuples

But what if we want to return multiple things from a function? Say we want to return both a floating number (“myNum”) and a string (“myStr”) from “myFunction”:

myFunction :: Int -> Float -> Char -> String -> ???
myFunction x y c str = ???
  where
    myNum = fromIntegral x * y
    myStr = c : str

This is where tuples come in. Tuples allow us to combine multiple types into a single type. The type name of a tuple is denoted by a comma separated list of types within parentheses. Similarly, you build tuples with a comma separated list of the values. In this example, we would use a tuple of a Float and a String:

myFunction :: Int -> Float -> Char -> String -> (Float, String)
myFunction x y c str = (myNum, myStr)
  where
    myNum = fromIntegral x * y
    myStr = c : str

Tuples can have as many types as you want, but many library functions only work up to tuples of 7 types. You can have the same type multiple times within a tuple as well. For instance, “(String, Int, Float)” and “(String, Int, Char, Float, Int)” are both valid types.

Most often though, two items are present within a tuple. In this case, you can use “fst” to access the first element of the tuple, and “snd” to access the second element.

addElementsOfTuple :: (Int, Int) -> Int
addElementsOfTuple tup = fst tup + snd tup
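We could also pattern match on the tuple directly, which often reads more clearly (a small sketch):

addElementsOfTuple' :: (Int, Int) -> Int
addElementsOfTuple' (x, y) = x + y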

Lists

The second special type we will discuss is the List. While a tuple can contain several elements, each with a different type, a List contains many elements, each having the same type. A list type is denoted by brackets around the element type’s name. In code, you can create lists by putting brackets around a comma separated list of values.

listExample :: [Int]
listExample = [1,2,3,4]

It is important to note, for people coming from other programming languages, that the list type in Haskell represents a linked list, not an array. It is quite efficient to separate the first element of a list from the rest, or to add an element to the front of a list. However, accessing an element in a list by index and appending two lists together are relatively inefficient operations. You can only effectively add to the back of a list using inefficient appending (++).

-- ":" is the cons operator for lists. It adds an element to the front, and it is efficient!
add1 :: [Int] -> [Int]
add1 firstList = 1 : firstList


-- This is efficient!
getRest :: [Int] -> [Int]
getRest (_ : rs) = rs


-- This is relatively inefficient
add1ToBack :: [Int] -> [Int]
add1ToBack l = l ++ [1]


-- Indexing to the 6th element. Also relatively inefficient
index :: [Int] -> Int
index ls = ls !! 5

Notice above in the getRest example that when we receive the list as a parameter, we can pattern match on the structure of the list, allowing us to access certain parts of it more easily. We’ll see this pattern matching behavior a lot as we dig deeper into the type system.

Lists are ubiquitous and important in Haskell. In fact, the String type is actually implemented as a list of characters! So if you ever see an error mentioning the type “[Char]”, understand it is talking about Strings! Accordingly, there are a number of library functions that help with manipulating lists. We’ll talk about these more in a later article.

The last thing to mention about lists is the existence of infinite lists. Another concept for the future is Haskell’s system of lazy evaluation. It only evaluates the expressions that it absolutely needs to. We can never fully evaluate an infinite list (trying to print one will never finish), but we don’t need the whole list in order to take its first 4 elements. Here are two ways to make infinite lists in Haskell:

infiniteLists :: ([Int], [Int])
infiniteLists = (repeat 4, cycle [1,2,3])


-- This works fine!
first4Elements :: [Int]
first4Elements = take 4 (fst infiniteLists)


-- This does not work!
printInfiniteList :: IO ()
printInfiniteList = print (snd infiniteLists)

The first element in “infiniteLists” will be a list containing nothing but 4’s, and the second will be a list that cycles between its three elements, looking like “[1,2,3,1,2,3,1,2,3,...]”
 

Maybes

The last special type we will discuss is Maybes. Maybe is effectively a wrapper that can be put around any type. The wrapper encapsulates the possibility of failure.

There are two types of maybe values: Nothing and Just. If a Maybe value is Nothing, there is no other value attached to it. If it is Just, then it contains some inner value. You typically “unwrap” a Maybe value by pattern matching, either on a function argument or using a case statement. This allows you to customize the behavior in case something failed.

resultString :: Maybe Int -> String
resultString Nothing = "It failed!"
resultString (Just x) = "It succeeded! " ++ (show x)


resultString2 :: Maybe Int -> String
resultString2 maybeX = case maybeX of
  Nothing -> "It failed!"
  Just x -> "It succeeded! " ++ (show x)
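And to see where a Maybe value might come from in the first place, here is a hypothetical safeDivide that fails on division by zero:

safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide x y = Just (x `div` y)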

The cool thing about all these types is that they are infinitely composable. You can have a list of lists, a tuple containing a Maybe and a List, or a Maybe List of Tuples of Ints, Floats, and Lists! So no matter what, remember these 3 fundamental Haskell data structures:

  1. Tuples

  2. Lists

  3. Maybes

James Bowen

Understanding Haskell's Type System

One of the things that makes Haskell unique is its strong, static type system. Understanding this system is one of the keys to understanding Haskell. It is radically different from the dynamic typing of languages like Python, Javascript, and Ruby. It’s even different in some respects from other statically typed languages like Java and C++.

Static Typing Vs. Dynamic Typing

To understand why the type system is so important, we have to first consider the advantages and disadvantages of static typing in the first place. One important benefit of static typing is that many bugs are found at compile time, rather than run time. With a dynamically typed language, you might ship code that seems to work perfectly well. But then in production a user stumbles on an edge case where, for instance, you forgot to cast something as a number. And your program crashes.

With static typing, your code will simply not compile in the first place. This also means that errors are more likely to be found where they first occur. An object with the wrong type will raise a compile error at its source, rather than further down the call chain where it is actually used. As a result, statically typed code needs a lot less unit testing around type agreement between functions. It doesn’t excuse a total lack of testing, but it certainly reduces the burden.

So what are the disadvantages of static typing? Well for one thing, development can be slower. It’s oftentimes a lot easier to hack together a solution in a dynamically typed language. Suppose you suddenly find it useful to do a database operation in the middle of otherwise pure code. This is very hard in Haskell, but easier in Javascript or Python.

Another disadvantage is that there’s more boilerplate code in statically typed languages. In languages like C++ and Java you declare the type of almost every variable you make, which you don’t have to do in Python. However, this isn’t as big a problem for Haskell, as we’ll see.

Haskell and Type Inference

The first step in truly understanding Haskell’s type system is to remember the fundamental difference between functional programming and imperative programming. In imperative programming, we are giving the program a series of commands to run. In functional programming, our code actually consists of expressions to be evaluated. Under Haskell’s type system, every expression has a type.

Luckily, unlike C++ and Java, Haskell’s most widely used compiler (GHC) is generally quite good at type inference. It can usually determine the type of every expression in our code. For instance, consider this function:

func baseAmt str = replicate rptAmt newStr
  where
    rptAmt = if baseAmt > 5
      then baseAmt
      else baseAmt + 2
    newStr = "Hello " ++ str

In this function, notice that we do not assign any types to our expressions, as we would have to with Java or C++ variables. The compiler has enough information to know that “baseAmt” is an integer, and “str” is a string without us telling it. Further, if we were to use the result of this function somewhere else in our code, the compiler would already know that the resulting type is a list of strings.

We can also see some consequences that the type system has on code syntax. Because every expression has a type, all if statements must have an else branch, and the result of both branches must have the same type. Otherwise the compiler would not, in this example, know what the type of “rptAmt” is.

While we can avoid using type signatures most of the time, it is nonetheless standard practice to at least give a signature for your top level functions and objects. This is because the best way to reason about a Haskell program is to look at the types of the relevant functions. We would write the above code as:

func :: Int -> String -> [String]
func baseAmt str = replicate rptAmt newStr
  where
    rptAmt = if baseAmt > 5
      then baseAmt
      else baseAmt + 2
    newStr = "Hello " ++ str

In fact, it is not unreasonable when writing a Haskell program to write out the types of every function you will need before implementing any of them. This is called Type Driven Development.

You could go further and also give every variable a type signature, but this is generally considered excessive. We can see how much clunkier the code gets when we do this:

func :: Int -> String -> [String]
func baseAmt str = replicate rptAmt newStr
  where
    rptAmt :: Int
    rptAmt = if baseAmt > 5
      then baseAmt
      else baseAmt + 2
    newStr :: String
    newStr = "Hello " ++ str

Functions as Types

In Haskell, functions are expressions, just like an integer or a string. Hence, they all have types. In the example, we see that “func” is a function that takes two arguments, an Int and a String, and returns a list of Strings. We can apply a function by giving it arguments. Function application changes the type of an expression. For instance, we could fully apply our function by providing two arguments, and the resulting type would be a list of Strings:

myStrings :: [String]
myStrings = func 4 "Hello"

However, we don’t have to apply ALL the arguments of a function at once. We could apply only the integer argument, and this would leave us with a function that still requires a String. We could then call this partially applied function with the one missing argument:

partial :: String -> [String]
partial = func 4

full :: [String]
full = partial "World"

Conclusion

So if you’re new to Haskell, this was probably a lot of information, but here are a few main takeaways to remember:

  1. Haskell is a statically typed language

  2. Every expression in Haskell has a type, including functions and if statements

  3. The compiler can usually infer the types of expressions, but you should generally write out the type signature for top level functions and expressions.
