Beginners

Haskell From Scratch Re-Opened!

This week we're taking a break from our Gloss/AI series to make a special announcement! Haskell from Scratch, our beginners course, is now re-opened for enrollment! We've added some more content since the last time we offered it. The biggest addition is a mini-project to help you practice your new skills!

Enrollment will only be open for another week, so don't wait! Next Monday, August 5th, will be the last day to sign up! Enrollments will close at midnight. Once you sign up for the course, you'll have permanent access to the course material. This includes any new content we add in the future. So even if you don't have the time now, it's still a good idea to sign up!

I also want to take this opportunity to tell a little bit of the story of how I learned Haskell. I want to share the mistakes I made, since those motivated me to make this course.

My History with Haskell

I first learned Haskell in college as part of a course on programming language theory. I admired the elegance of a few things in particular. I liked how lists and tuples worked well with the type system. I also appreciated the elegance of Haskell's type definitions. No other language I had seen represented the idea of sum types so well. I also saw how useful pattern matching and recursion were. They made it very easy to break problems down into manageable parts.

After college, I had the idea for a code generation project. A college assignment had taught me some useful Haskell libraries for the task. So I got to work writing some Haskell. At first things were quite haphazard. Eventually though, I developed some semblance of test driven development and project organization.

About nine months into that project, I had the great fortune of landing a Haskell project at my day job. As I ramped up on this project, I saw how deficient my knowledge was in a lot of areas. I realized then a lot of the mistakes I had been making while learning the language. This motivated me to start the Monday Morning Haskell blog.

Main Advice

Of course, I've tried to incorporate my learnings throughout the material on this blog. But if I had to distill the key ideas, here's what they'd be.

First, learn tools and project organization early! Learn how to use Stack and/or Cabal! For help with this, you can check out our free Stack mini-course! After several months on my side project, I had to start from scratch to some extent. The only "testing" I was doing was running some manual executables and commands in GHCi. So once I learned more about these tools, I had to re-work a lot of code.

Second, it helps a lot to have some kind of structure when you're first learning the language. Working on a project is nice, but there are a lot of unknown-unknowns out there. You'll often find a "solution" for your problem, only to see that you need a lot more knowledge to implement it. You need to have a solid foundation on the core concepts before you can dive in on anything. So look for a source that provides some kind of structure to your Haskell learning, like a book (or an online course!).

Third, let's get to monads. They're an important key to Haskell and widely misunderstood. But there are a couple things that will help a lot. First, learn the syntactic patterns of do-syntax. Second, learn how to use run functions (runState, runReaderT, etc.). These are how you bring monadic expressions into the rest of your code. You can check out our Monads Series for some help on these ideas. (And of course, you'll learn all about monads in Haskell From Scratch!)
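To make that second point a bit more concrete, here's a minimal sketch of what "running" a monadic expression looks like. The names nextId, twoIds, and greet are made up for this illustration; only runState and runReaderT (plus get, put, and ask) come from the mtl library.

```haskell
import Control.Monad.Reader (ReaderT, ask, runReaderT)
import Control.Monad.State (State, get, put, runState)

-- Hypothetical counter: returns the current value and stores the next one.
nextId :: State Int Int
nextId = do
  n <- get
  put (n + 1)
  return n

-- Do-syntax threads the state through both calls for us.
twoIds :: State Int (Int, Int)
twoIds = do
  a <- nextId
  b <- nextId
  return (a, b)

-- Hypothetical reader action: the "environment" is just a name.
greet :: ReaderT String IO String
greet = do
  name <- ask
  return ("Hello, " ++ name)
```

In GHCi, runState twoIds 10 gives back both the result and the final state, ((10,11),12), as an ordinary pure value. Likewise, runReaderT greet "World" turns the reader action into a plain IO String.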

Finally, ask for help earlier! I still don't plug into the Haskell network as much as I should. There are a lot of folks out there who are more than willing to help. Freenode is a great place, as are Reddit and even Twitter!

Conclusion

There's never been a better time to start learning Haskell! The language tools have developed a ton in the last few years and the community is growing stronger. And of course, we've once again opened up our Haskell From Scratch Beginners Course! You don't need any Haskell experience to take this course. So if you always wanted to learn more about Haskell but needed more organization, this is your chance!

If you want to stay up to date with the latest at Monday Morning Haskell, make sure to Subscribe to our mailing list! You'll hear the latest about upcoming articles, as well as any new course offerings. You'll also get access to our Subscriber Resources.

Making Arrays Mutable!

Last week we walked through the process of refactoring our code to use Data.Array instead of Data.Map. But in the process, we introduced a big inefficiency! When we use the Array.// function to "update" our array, it has to create a completely new copy of the array! Map doesn't have to do this. It's a persistent tree, so an insert only rebuilds the path to the changed key and shares the rest of the structure.

So how can we fix this problem? The answer is to use the MArray interface, for mutable arrays. With mutable arrays, we can modify them in-place, without a copy. This results in code that is much more efficient. This week, we'll explore the modifications we can make to our code to allow this. You can see a quick summary of all the changes in this Git Commit.

Refactoring code can seem like a hard process, but it's actually quite easy with Haskell! In this article, we'll use the idea of "Compile Driven Development". With this process, we update our types and then let compiler errors show us all the changes we need. To learn more about this, and other Haskell paradigms, read our Haskell Brain series!

Mutable Arrays

To start with, let's address the seeming contradiction of having mutable data in an immutable language. We'll be working with the IOArray type in this article. An item of type IOArray acts like a pointer, similar to an IORef. And this pointer is, in fact, immutable! We can't make it point to a different spot in memory. What we can change is the underlying data at that memory location. To do so, though, we'll need a monad that allows such side effects.

In our case, with IOArray, we'll use the IO monad. This is also possible with the ST monad. But the specific interface functions we'll use (which are possible with either option) live in the Data.Array.MArray module. There are four in particular we're concerned with:

freeze :: (Ix i, MArray a e m, IArray b e) => a i e -> m (b i e)

thaw :: (Ix i, IArray a e, MArray b e m) => a i e -> m (b i e)

readArray :: (MArray a e m, Ix i) => a i e -> i -> m e

writeArray :: (MArray a e m, Ix i) => a i e -> i -> e -> m ()

The first two are conversion functions between normal, immutable arrays and mutable arrays. Freezing turns the array immutable, thawing makes it mutable. The second two are our replacements for Array.! and Array.// when reading and updating the array. There are a lot of typeclass constraints in these. So let's simplify them by substituting in the types we'll use:

freeze
  :: IOArray Location CellBoundaries 
  -> IO (Array Location CellBoundaries)

thaw 
  :: Array Location CellBoundaries 
  -> IO (IOArray Location CellBoundaries)

readArray
  :: IOArray Location CellBoundaries 
  -> Location 
  -> IO CellBoundaries

writeArray
  :: IOArray Location CellBoundaries
  -> Location
  -> CellBoundaries
  -> IO ()
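Here's a small sketch of those four functions working together. To keep it self-contained, it assumes a plain array of Ints instead of our maze types, and the doubleAt and demo names are hypothetical ones made up for this example.

```haskell
import Data.Array (Array, elems, listArray)
import Data.Array.IO (IOArray, freeze, readArray, thaw, writeArray)

-- Hypothetical helper: double the element at index i, in place.
doubleAt :: IOArray Int Int -> Int -> IO ()
doubleAt arr i = do
  x <- readArray arr i
  writeArray arr i (2 * x)

-- Thaw an immutable array, mutate one element, then freeze the result.
demo :: IO (Array Int Int)
demo = do
  let frozen = listArray (0, 3) [1, 2, 3, 4] :: Array Int Int
  mutable <- thaw frozen :: IO (IOArray Int Int)
  doubleAt mutable 2
  freeze mutable
```

Running elems <$> demo in GHCi gives [1,2,6,4]. Note that thaw and freeze each copy the array, so we want to call them once at the edges of our algorithm rather than on every update.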

Obviously, we'll need to add the IO monad into our code at some point. Let's see how this works.

Basic Changes

We won't need to change how the main World type uses the array. We'll only be changing how the SearchState stores it. So let's go ahead and change that type:

type MMaze = IA.IOArray Location CellBoundaries

data SearchState = SearchState
  { randomGen :: StdGen
  , locationStack :: [Location]
  , currentBoundaries :: MMaze
  , visitedCells :: Set.Set Location
  }

The first issue is that we should now pass a mutable array to our initial search state. We'll use the same initialBounds item, except we'll thaw it first to get a mutable version. Then we'll construct the state and pass it along to our search function. At the end, we'll freeze the resulting state. All this involves making our generation function live in the IO monad:

-- This did not have IO before!
generateRandomMaze :: StdGen -> (Int, Int) -> IO Maze
generateRandomMaze gen (numRows, numColumns) = do
  initialMutableBounds <- IA.thaw initialBounds
  let initialState = SearchState 
                       g2
                       [(startX, startY)]
                       initialMutableBounds
                       Set.empty
  let finalBounds = currentBoundaries
                      (execState dfsSearch initialState)
  IA.freeze finalBounds
  where
    (startX, g1) = …
    (startY, g2) = …

    initialBounds :: Maze
    initialBounds = …

This seems to "solve" our issues in this function and push all our errors into dfsSearch. But it should be obvious that we need a fundamental change there. We'll need the IO monad to make array updates. So the type signatures of all our search functions need to change. In particular, we want to combine monads with StateT SearchState IO. Then we'll make any "pure" functions use IO instead.

dfsSearch :: StateT SearchState IO ()

findCandidates :: Location -> MMaze -> Set.Set Location
  -> IO [(Location, CellBoundaries, Location, CellBoundaries)]

chooseCandidate
  :: [(Location, CellBoundaries, Location, CellBoundaries)]
  -> StateT SearchState IO ()

This will lead us to update our generation function.

generateRandomMaze :: StdGen -> (Int, Int) -> IO Maze
generateRandomMaze gen (numRows, numColumns) = do
  initialMutableBounds <- IA.thaw initialBounds
  let initialState = SearchState
                       g2
                       [(startX, startY)]
                       initialMutableBounds
                       Set.empty
  finalBounds <- currentBoundaries <$>
                  (execStateT dfsSearch initialState)
  IA.freeze finalBounds
  where
  …

The original dfsSearch definition is almost fine. But findCandidates is now a monadic function. So we'll have to extract its result instead of using let:

-- Previously
let candidateLocs = findCandidates currentLoc bounds visited

-- Now
candidateLocs <- lift $ findCandidates currentLoc bounds visited
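If lift is new to you, here's a stripped-down sketch of the same pattern. It's not our maze code: the state is a single Int counter, and fillNext and fillSquares are hypothetical names invented for this example. The idea is only to show an IO action (writeArray) running inside StateT via lift.

```haskell
import Control.Monad.State (StateT, execStateT, get, put)
import Control.Monad.Trans.Class (lift)
import Data.Array.IO (IOArray, getElems, newArray, writeArray)

-- The state is the next index to fill. The array write is an IO action,
-- so we lift it into the StateT layer.
fillNext :: IOArray Int Int -> StateT Int IO ()
fillNext arr = do
  i <- get
  lift $ writeArray arr i (i * i)
  put (i + 1)

-- Fill an n-element array with squares by running fillNext n times.
fillSquares :: Int -> IO [Int]
fillSquares n = do
  arr <- newArray (0, n - 1) 0
  _ <- execStateT (mapM_ (const (fillNext arr)) [1 .. n]) 0
  getElems arr
```

Calling fillSquares 4 in GHCi gives [0,1,4,9]. Just like in our generation function, execStateT is what brings the combined monad back down to plain IO.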

The findCandidates function though will need a bit more re-tooling. The main thing is that we need readArray instead of Array.!. The first swap is easy:

findCandidates currentLocation@(x, y) bounds visited = do
  currentLocBounds <- IA.readArray bounds currentLocation
  ...

It's tempting to go ahead and read all the other values for upLoc, rightLoc, etc. right now:

findCandidates currentLocation@(x, y) bounds visited = do
  currentLocBounds <- IA.readArray bounds currentLocation
  let upLoc = (x, y + 1)
  upBounds <- IA.readArray bounds upLoc
  ...

We can't do that though, because this will access them in a strict way. We don't want to access upLoc until we know the location is valid. So we need to do this within the case statement:

findCandidates currentLocation@(x, y) bounds visited = do
  currentLocBounds <- IA.readArray bounds currentLocation
  let upLoc = (x, y + 1)
  maybeUpCell <- case (upBoundary currentLocBounds,
                       Set.member upLoc visited) of
    (Wall, False) -> do
      upBounds <- IA.readArray bounds upLoc
      return $ Just
        ( upLoc
        , upBounds {downBoundary = AdjacentCell currentLocation}
        , currentLocation
        , currentLocBounds {upBoundary = AdjacentCell upLoc}
        )
    _ -> return Nothing

And then we'll do the same for the other directions and that's all for this function!

Choosing Candidates

We don't have to change too much about our chooseCandidate function! The primary change is to eliminate the line where we use Array.// to update the array. We'll replace this with two monadic lines using writeArray instead. Here's all that happens!

chooseCandidate candidates = do
  (SearchState gen currentLocs boundsMap visited) <- get
  ...
  lift $ IA.writeArray boundsMap chosenLocation newChosenBounds
  lift $ IA.writeArray boundsMap prevLocation newPrevBounds
  put (SearchState newGen (chosenLocation : currentLocs) boundsMap newVisited)

Aside from that, there's one small change in our runner to use the IO monad for generateRandomMaze. But after that, we're done!

Conclusion

As mentioned above, you can see all these changes in this commit on our GitHub repository. The last two articles have illustrated how it's not hard to refactor our Haskell code much of the time. As long as we are methodical, we can pick the one thing that needs to change. Then we let the compiler errors direct us to everything we need to update as a result. I find refactoring other languages (particularly Python/JavaScript) to be much more stressful. I'm often left wondering...have I actually covered everything? But in Haskell, there's a much better chance of getting everything right the first time!

To learn more about Compile Driven Development, read our Haskell Brain Series. If you're new to Haskell you can also read our Liftoff Series and download our Beginners Checklist!

Compile Driven Development In Action: Refactoring to Arrays!

In the last couple weeks, we've been slowly building up our maze game. For instance, last week, we added the ability to serialize our mazes. But software development is never a perfect process! So it's not uncommon to revisit some past decisions and come up with better approaches. This week we're going to address a particular code wart in the random maze generation code.

Right now, we store our Maze as a mapping from Locations to CellBoundaries items. We do this using Data.Map. The Map.lookup function returns a Maybe result, since the key might not exist in the map. But most of the time we accessed a location, we had good reason to believe that it would exist. This led to several instances of the following idiom:

fromJust $ Map.lookup location boundsMap

Using a function like fromJust is a code smell, a sign that we could be doing something better. This week, we're going to change this structure so that it uses the Array type instead from Data.Array. It captures our idiomatic definitions better. We'll use "Compile Driven Development" to make this change. We won't need to hunt around our code to figure out what's wrong. We'll just make type changes and follow the compiler errors!
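To see the difference side by side, here's a minimal sketch. It uses String as a stand-in for CellBoundaries, and the cellsMap, cellsArray, lookupInMap, and lookupInArray names are hypothetical ones made up for this comparison.

```haskell
import qualified Data.Array as Array
import qualified Data.Map as Map
import Data.Maybe (fromJust)

type Location = (Int, Int)

-- Stand-in data: String replaces CellBoundaries for this sketch.
cellsMap :: Map.Map Location String
cellsMap = Map.fromList [((0, 0), "start"), ((0, 1), "end")]

cellsArray :: Array.Array Location String
cellsArray = Array.array ((0, 0), (0, 1)) [((0, 0), "start"), ((0, 1), "end")]

-- With a Map, we have to unwrap a Maybe, even when we "know" the key exists.
lookupInMap :: Location -> String
lookupInMap loc = fromJust $ Map.lookup loc cellsMap

-- With an Array, the type already says every index in bounds has a value.
lookupInArray :: Location -> String
lookupInArray loc = cellsArray Array.! loc
```

Both functions are still partial, since a bad location fails at runtime either way. But the array version states our assumption in the data structure itself instead of hiding it behind fromJust.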

To learn more about compile driven development and the mental part of Haskell, read our Haskell Brain series. It will help you think about the language in a different way. So it's a great tool for beginners!

Another good resource for this article is the GitHub repository for this project. The complete code for this part is on the part-3 branch. You can consult this commit to see all the changes we make in migrating to arrays.

Initial Changes

To start with, we should make sure our code uses the following type synonym for our maze type:

type Maze = Map.Map Location CellBoundaries

Now we can observe the power of type synonyms! We'll make a change in this one type, and that'll update all the instances in our code!

import qualified Data.Array as Array

type Maze = Array.Array Location CellBoundaries

Of course, this will cause a host of compiler issues! Most of these will be pretty simple to fix, but we should be methodical and start at the top. The errors begin in our parsing code. In our mazeParser, we use Map.fromList to construct the final map. This requires the pairs of Location and CellBoundaries.

mazeParser :: (Int, Int) -> Parsec Void Text Maze
mazeParser (numRows, numColumns) = do
  …
  return $ Map.fromList (cellSpecToBounds <$> (concat rows))

The Array library has a similar function, Array.array. However, it also requires us to provide the bounds for the Array. That is, we need the "min" and "max" locations in a tuple. But these are easy, since we have the dimensions as an input!

mazeParser :: (Int, Int) -> Parsec Void Text Maze
mazeParser (numRows, numColumns) = do
  …
  return $ Array.array 
    ((0,0), (numColumns - 1, numRows - 1))
    (cellSpecToBounds <$> (concat rows))
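As a standalone illustration of Array.array and its bounds argument, here's a small sketch that builds a grid of Ints. The grid function is hypothetical and only exists for this example. It follows the same (column, row) convention for bounds as the parser above.

```haskell
import Data.Array (Array, array, bounds, (!))

-- Hypothetical example: each cell stores its position in row-major order.
-- The first argument to array is the (min, max) pair of indices.
grid :: Int -> Int -> Array (Int, Int) Int
grid numRows numColumns = array
  ((0, 0), (numColumns - 1, numRows - 1))
  [ ((x, y), x + y * numColumns)
  | x <- [0 .. numColumns - 1]
  , y <- [0 .. numRows - 1]
  ]
```

For example, bounds (grid 2 3) is ((0,0),(2,1)) and grid 2 3 ! (2, 1) is 5. The association list can come in any order, as long as every index within the bounds gets a value.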

Our next issue comes up in the dumpMaze function. We use Map.mapKeys to transpose the keys of our map. Then we use Map.toList to get the association list back out. Again, all we need to do is find the comparable functions for arrays to update these.

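To change the keys, we want the ixmap function. It plays the same role as mapKeys, with one difference: the function we pass maps each index of the new array back to an index of the old one. Swapping coordinates is its own inverse, so the same lambda works either way. As with Array.array, we need to provide an extra argument for the min and max bounds. We'll provide the bounds of our original maze.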

transposedMap = Array.ixmap (Array.bounds maze) (\(x, y) -> (y, x)) maze
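Here's a self-contained sketch of ixmap doing a transposition on a small array of Chars. The transpose2D and sample names are hypothetical. One thing this version does differently from the line above is swap the bounds as well, which matters as soon as the grid isn't square.

```haskell
import Data.Array (Array, bounds, elems, ixmap, listArray, (!))

-- Transpose a 2D array. The function given to ixmap takes an index of the
-- NEW array and returns the index to read from in the OLD array.
transpose2D :: Array (Int, Int) a -> Array (Int, Int) a
transpose2D arr = ixmap newBounds swap arr
  where
    ((r0, c0), (r1, c1)) = bounds arr
    newBounds = ((c0, r0), (c1, r1))
    swap (x, y) = (y, x)

-- A 2x3 sample: "abc" in the first row, "def" in the second.
sample :: Array (Int, Int) Char
sample = listArray ((0, 0), (1, 2)) "abcdef"
```

With these definitions, elems (transpose2D sample) is "adbecf", and transpose2D sample ! (2, 0) is 'c'.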

A few lines below, we can see the usage of Map.toList when grouping our pairs. All we need instead is Array.assocs:

cellsByRow :: [[(Location, CellBoundaries)]]
cellsByRow = groupBy
  (\((r1, _), _) ((r2, _), _) -> r1 == r2)
  (Array.assocs transposedMap)
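The grouping works because assocs returns the pairs in index order, so all the entries sharing a first coordinate sit next to each other. Here's a small runnable sketch of that idea, with a hypothetical rowsOf function and a sample array of Chars standing in for our cells.

```haskell
import Data.Array (Array, assocs, listArray)
import Data.List (groupBy)

-- Group the association list by the first component of each index.
-- This relies on assocs returning entries in index order.
rowsOf :: Array (Int, Int) a -> [[((Int, Int), a)]]
rowsOf arr = groupBy
  (\((r1, _), _) ((r2, _), _) -> r1 == r2)
  (assocs arr)

-- A 2x3 sample: "abc" in the first row, "def" in the second.
sample :: Array (Int, Int) Char
sample = listArray ((0, 0), (1, 2)) "abcdef"
```

Here map (map snd) (rowsOf sample) gives ["abc","def"].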

Updating Map Generation

That's all the changes for the basic parsing code. Now let's move on to the random generation code. This is where we have a lot of those yucky fromJust $ Map.lookup calls. We can now instead use the "bang" operator, Array.! to access those elements!

findCandidates currentLocation@(x, y) bounds visited =
  let currentLocBounds = bounds Array.! currentLocation
  ...

Of course, it's possible for an "index out of bounds" error to occur if we aren't careful! But our code should reflect the fact that we expect all these calls to work. After fixing the initial call, we need to change each directional component. Here's what the first update looks like:

findCandidates currentLocation@(x, y) bounds visited =
      let currentLocBounds = bounds Array.! currentLocation
          upLoc = (x, y + 1)
          maybeUpCell = case (upBoundary currentLocBounds,
                              Set.member upLoc visited) of
                          (Wall, False) -> Just
                            ( upLoc
                            , (bounds Array.! upLoc) {downBoundary = 
                                AdjacentCell currentLocation}
                            , currentLocation
                            , currentLocBounds {upBoundary =
                                AdjacentCell upLoc}
                            )
                          _ -> Nothing

We've replaced Map.lookup with Array.! in the second part of the resulting tuple. The other three directions need the same fix.

Then there's one last change in the random generation section! When we choose a new candidate, we currently need two calls to Map.insert. But arrays let us do this with one function call. The function is Array.//, and it takes a list of association updates. Here's what it looks like:

chooseCandidate candidates = do
      (SearchState gen currentLocs boundsMap visited) <- get
      ...
      -- Previously used Map.insert twice!!!
      let newBounds = boundsMap Array.//
            [(chosenLocation, newChosenBounds),
             (prevLocation, newPrevBounds)]
      let newVisited = Set.insert chosenLocation visited
      put (SearchState
             newGen
             (chosenLocation : currentLocs) 
             newBounds 
             newVisited)
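Here's a tiny standalone sketch of Array.// on an array of Chars, just to show the shape of the call. The original and updated names are made up for this example.

```haskell
import Data.Array (Array, elems, listArray, (//))

original :: Array Int Char
original = listArray (0, 4) "hello"

-- One call applies both updates and returns a new array.
updated :: Array Int Char
updated = original // [(0, 'j'), (4, 'y')]
```

Here elems updated is "jelly", while elems original is still "hello". The old array is untouched, which is a hint about what this operation costs us.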

Final Touch Ups

Now our final remaining issues are within the Runner code. But they're all similar fixes to what we saw in the parsing code.

In our sample boundariesMap, we once again replace Map.fromList with Array.array. Again, we add a parameter with the bounds of the array. Then, when drawing the pictures for our cells, we need to use Array.assocs instead of Map.toList.

For the final change, we need to update our input handler so that it accesses the array properly. This is our final instance of fromJust $ Map.lookup! We can replace it like so:

inputHandler :: Event -> World -> World
inputHandler event w = case event of
  ...
  where
    cellBounds = (worldBoundaries w) Array.! (playerLocation w)

And that's it! Now our code will compile and work as it did before!

Conclusion

There's a pretty big inefficiency with our new approach. Whereas Map.insert can give us an updated map in log(n) time, the Array.// function isn't so nice. It has to create a complete copy of the array, and we run that function many times! How can we fix this? Next week, we'll find out! We'll use the Mutable Array interface to make it so that we can update our array in-place! This is super efficient, but it requires our code to be more monadic!

For some more ideas of cool projects you can do in Haskell, download our Production Checklist! It goes through a whole bunch of libraries on topics from database management to web servers!

Declaring Victory! (And Starting Again!)

In last week's article, we used a neat little algorithm to generate random mazes for our game. This was cool, but nothing happens yet when we "finish" the maze! We'll change that this week. We'll allow the game to continue re-generating new mazes when we're finished! You can find all the code for this part on the part-2 branch on the GitHub repository for this project!

If you're a beginner to Haskell, hopefully this series is helping you learn simple ways to do cool things! If you're a little overwhelmed, try reading our Liftoff Series first!

Goals

Our objectives for this part are pretty simple. We want to make it so that when we reach the "end" location, we get a "victory" message and can restart the game by pressing a key. We'll get a new maze when we do this. There are a few components to this:

  1. Reaching the end should change a component of our World.
  2. When that component changes, we should display a message instead of the maze.
  3. Pressing "Enter" with the game in this state should start the game again with a new maze.

Sounds pretty simple! Let's get going!

Game Result

We'll start by adding a new type to represent the current "result" of our game. We'll add this piece of state to our World. As an extra little piece, we'll add a random generator to our state. We'll need this when we re-make the maze:

data GameResult = GameInProgress | GameWon
  deriving (Show, Eq)

data World = World
  { playerLocation :: Location
  , startLocation :: Location
  , endLocation :: Location
  , worldBoundaries :: Maze
  , worldResult :: GameResult
  , worldRandomGenerator :: StdGen
  }

Our generation step needs a couple small tweaks. The function itself should now return its final generator as an extra result:

generateRandomMaze :: StdGen -> (Int, Int) -> (Maze, StdGen)
generateRandomMaze gen (numRows, numColumns) =
  (currentBoundaries finalState, randomGen finalState)
  where
    ...
    finalState = execState dfsSearch initialState

Then in our main function, we incorporate the new generator and game result into our World:

main = do
  gen <- getStdGen
  let (maze, gen') = generateRandomMaze gen (25, 25)
  play
    windowDisplay
    white
    20
    (World (0, 0) (0, 0) (24, 24) maze GameInProgress gen')
    ...

Now let's fix our updating function so that it changes the game result if we hit the final location! We'll add a guard here to check for this condition and update accordingly:

updateFunc :: Float -> World -> World
updateFunc _ w
  | playerLocation w == endLocation w = w { worldResult = GameWon }
  | otherwise = w

We could do this in the eventHandler but it seems more idiomatic to let the update function handle it. If we use the event handler, we'll never see our token enter the final square. The game will jump straight to the victory screen. That would be a little odd. Here there's at least a tiny gap.

Displaying Victory!

Now our game will update properly. But we have to respond to this change by changing what the display looks like! This is a quick fix. We'll add a similar guard to our drawingFunc:

drawingFunc :: (Float, Float) -> Float -> World -> Picture
drawingFunc (xOffset, yOffset) cellSize world
  | worldResult world == GameWon =
      Translate (-275) 0 $ Scale 0.12 0.25
        (Text "Congratulations! You've won! \
              \Press enter to restart with a new maze!")
  | otherwise = ...

Note that Text here is the Gloss Picture constructor, not Data.Text. We also scale and translate it a bit to make the text appear on the screen. This is all we need to get the victory screen to appear on completion!

Restarting the Game

The last step is that we have to follow through on our process to restart the game if they hit enter! This involves changing our inputHandler to give us a brand new World. As with our other functions, we'll add a guard to handle the GameWon case:

inputHandler :: Event -> World -> World
inputHandler event w
  | worldResult w == GameWon = …
  | otherwise = case event of
    ...

We'll want to make a new case section that accounts for the user pressing the "Enter" key. All this section needs to do is call generateRandomMaze and re-initialize the world!

inputHandler event w
  | worldResult w == GameWon = case event of
      (EventKey (SpecialKey KeyEnter) Down _ _) ->
        let (newMaze, gen') = generateRandomMaze 
              (worldRandomGenerator w) (25, 25)
        in  World (0, 0) (0, 0) (24, 24) newMaze GameInProgress gen'
      _ -> w

And with that, we're done! We can restart the game and navigate random mazes to our heart's content!

Conclusion

The ability to restart the game is great! But if we want to make our game re-playable instead of random, we'll need some way of storing mazes. In the next part, we'll look at some code for dumping a maze to an output format. We'll also need a way to re-load from this stored representation. This will ultimately allow us to make a true game with saving and loading state.

In preparation for that, you can read our series on Parsing. You'll especially want to acquaint yourself with the Megaparsec library. We go over this in Part 4 of the series!

Extending Haskell's Syntax!

When you're starting out with Haskell, compiler extensions seem a little weird. And in a way, they are. It's strange to think that you need to "opt in" to certain compiler features. And as a beginner, it can be overwhelming to think you need to know the meaning of certain odd terms. I still remember how some of the first Haskell code I worked on had at least 10 extensions in every file. And I didn't have a clue what they meant!

But there are good reasons for certain features to be "opt in". They might make the compilation process longer. Or they might make some types of code less performant. But there are many extensions you can use that can make your life easier without worrying. And many extensions are easy to learn, so you can get the hang of the concept.

In this article, we’re going to do a quick run-down of some simple extensions. You’ve probably heard of at least a few of these. But it’s always good to keep learning. None of these are too advanced. For the most part, they allow you to use some more syntactic sugar and write cleaner code. So they’re pretty uncontroversial and you should feel free to use them in any file you want. Learning a few of these will help you get more comfortable so you can tackle harder extensions when you need to.

For some more tools to take your Haskell to the next level, download our Production Checklist! You can also read our Haskell Web Series for some tutorials.

Overloaded Strings

We’ve done one article already on overloaded strings. But here’s another quick summary. There are five different string types in Haskell. By default, Haskell assumes that whenever you use a string literal, you intend for it to be the String type.

-- Defaults as String
aString = "Hello"

-- The following will NOT WORK (by default)
aText :: Text
aText = "Hello"

This is annoying, because String is generally inferior to the other string types. You should be using Text most of the time for performance reasons. If we use the OverloadedStrings extension, then we can use literals for any of these string types!

{-# LANGUAGE OverloadedStrings #-}

-- Now this works!
aText :: Text
aText = "Hello"

aByteString :: ByteString
aByteString = "Hello"

And, in fact, you can use string literals for any type you want! All you have to do is create an instance of the IsString class for it by defining the fromString function.

import Data.String (IsString(..))

newtype Name = Name String

instance IsString Name where
  fromString s = Name s

myName :: Name
myName = "James"

This is one of the most common and simplest extensions you can use, so it's a great one to start with!

Lambda Case

Lambda case is another simple extension, but it isn’t quite as common as overloaded strings. It helps clean up a particular syntax wart that comes up from time to time. Consider a function where we take a single argument, and then immediately run a case statement on it:

useParseResult :: Either ParseError Result -> IO ()
useParseResult x = case x of
  Left parseError -> …
  Right goodResult -> ...

You could do a direct pattern match, but sometimes this is impossible if you’re in-lining the function. Notice that we use a one-letter variable name x. We could come up with a better name. But it seems like a waste since we don't use this variable anywhere else in the function definition. It would be nice if we could remove it altogether.

The LambdaCase extension allows this by providing the following syntactic sugar. You can use case as if it were the argument of a lambda expression, and then immediately do the pattern match:

{-# LANGUAGE LambdaCase #-}

useParseResult :: Either ParseError Result -> IO ()
useParseResult = \case
  Left parseError -> ...
  Right goodResult -> ...

At the end of the day, it’s a small difference. But it's a nice little trick you can use to save yourself some unneeded variable names.
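Here's a complete, runnable sketch of the in-lined situation, where a direct pattern match isn't an option. The describeResults function is a hypothetical example, with String standing in for a real error type.

```haskell
{-# LANGUAGE LambdaCase #-}

-- The \case lambda is passed straight to map, with no named argument.
describeResults :: [Either String Int] -> [String]
describeResults = map $ \case
  Left err -> "error: " ++ err
  Right n -> "ok: " ++ show n
```

For instance, describeResults [Left "bad input", Right 3] gives ["error: bad input","ok: 3"].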

Bang Patterns

Haskell is lazy by default. But there are certain situations where you need a strict value as an input to your function. This means the value should get evaluated BEFORE the function gets run. This is a little tricky to do with normal Haskell syntax. Consider this function:

bangTest :: Bool -> Int -> Int
bangTest b i = if b then 42 else 2 * i

If the boolean is true, laziness means we never evaluate the int argument. Hence the following works:

>> bangTest True undefined
42

But if we want that situation to fail, we need to use seq:

bangTest b i = seq i $ if b then 42 else 2 * i

…
>> bangTest True undefined
***Exception: Prelude.undefined

The BangPatterns extension allows us to use the bang character ! to specify that the function should be strict in an argument. So instead of using seq like above, we can get the same behavior like so:

{-# LANGUAGE BangPatterns #-}

bangTest :: Bool -> Int -> Int
bangTest b !i = if b then 42 else 2 * i

…

>> bangTest True undefined
***Exception: Prelude.undefined
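One place where this often comes in handy is an accumulator in a recursive helper. Here's a minimal sketch, with a hypothetical sumTo function. The bang on acc forces the running total at each step instead of letting unevaluated additions pile up.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Sum the integers from 1 to n with a strict accumulator.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc i
      | i > n = acc
      | otherwise = go (acc + i) (i + 1)
```

As a quick check, sumTo 100 is 5050. With optimizations on, GHC can often work out this strictness by itself, so treat this as an illustration of the syntax rather than a guaranteed speedup.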

Even without this extension, you can use strictness annotations in type definitions. Consider this example:

data Person = Person String

printName :: Bool -> Person -> IO ()
printName b (Person name) = if b
  then putStrLn "Hello"
  else putStrLn name

…
>> printName True (Person undefined)
Hello

But we can also make Person strict in its String argument like so:

data Person = Person !String

…

>> printName True (Person undefined)
*** Exception: Prelude.undefined

And again, this last example works even without the extension!

Type Operators

Haskell is sometimes criticized for an abundance of confusing operators. This next syntax extension does not ease this criticism! But it does provide some neat new possibilities when defining types! Here's an example with the Servant library. It requires both DataKinds and TypeOperators, but we'll focus on the latter.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}

type PersonAPI =
       "person" :> Capture "personid" Int :> Get '[JSON] Person
  :<|> "person" :> ReqBody '[JSON] Person :> Post '[JSON] Int

As a reminder, we've defined a type up there, not a normal expression! This means the :> and :<|> operators are actually type constructors! Let's define an example for ourselves. Suppose we have a simple type that wraps a couple other types in a pair:

data MyPair a b = MyPair a b

collection :: MyPair [Int] [String]
...

Now suppose we want to join more types together. We can do this by nesting MyPair instances, but the type signatures will get messy:

bigCollection :: MyPair [Int] (MyPair [String] (Map String Int))

But we can define a type operator that allows us to join these together!

infixr 8 +>>
type (t1 +>> t2) = MyPair t1 t2

And now we can get far cleaner signatures!

collection :: [Int] +>> [String]

bigCollection :: [Int] +>> [String] +>> Map String Int

Haskell lets us make complex recursive structures with many different type parameters. Type operators help us keep the signatures concise when we do this!
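
Here's a small self-contained sketch of the operator in action. The values and the thirdPart function are made up for illustration. Since +>> is only a type synonym, we still build and match on values with the ordinary MyPair constructor.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

import qualified Data.Map as Map

data MyPair a b = MyPair a b
  deriving (Show)

-- Right-associative, so a +>> b +>> c means a +>> (b +>> c).
infixr 8 +>>
type a +>> b = MyPair a b

bigCollection :: [Int] +>> [String] +>> Map.Map String Int
bigCollection = MyPair [1, 2] (MyPair ["one", "two"] (Map.fromList [("one", 1)]))

-- Hypothetical accessor: the operator works in argument positions too.
thirdPart :: a +>> b +>> c -> c
thirdPart (MyPair _ (MyPair _ c)) = c

main :: IO ()
main = print (Map.toList (thirdPart bigCollection))   -- [("one",1)]
```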

Tuple Sections

We've got one last trick for you. The TupleSections extension makes tuples easier to work with. Even without an extension, we can use the comma operator to build tuples like so:

combined :: (Int, String)
combined = (,) 5 "Hello"

combined3 :: (Int, String, Float)
combined3 = (,,) 5 "Hello" 2.3

But suppose we want to apply a function where we hardcode a particular value of a tuple. We'd need a separate definition of this function:

injectHello :: Int -> Float -> (Int, String, Float)
injectHello i f = (i, "Hello", f)

fetchInt :: IO Int

fetchFloat :: IO Float

combined :: IO (Int, String, Float)
combined = injectHello <$> fetchInt <*> fetchFloat

But with TupleSections, we can create a constructor that already has "Hello" built in! We can then apply it as a function without using another definition!

{-# LANGUAGE TupleSections #-}

combined :: IO (Int, String, Float)
combined = (,"Hello",) <$> fetchInt <*> fetchFloat

This is another useful little trick that lets us skip annoying in-between definitions. When you add up all these small things, it can go a long way towards cleaner code!
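
If you want to try this out, here's a runnable sketch. The bodies of fetchInt and fetchFloat are placeholder stand-ins (the snippets above leave them undefined), and the tagged example is an extra illustration of a two-element section.

```haskell
{-# LANGUAGE TupleSections #-}
module Main where

-- Placeholder implementations, assumed only for this example.
fetchInt :: IO Int
fetchInt = pure 5

fetchFloat :: IO Float
fetchFloat = pure 2.5

-- The section (,"Hello",) is a function: \i f -> (i, "Hello", f)
combined :: IO (Int, String, Float)
combined = (,"Hello",) <$> fetchInt <*> fetchFloat

-- Sections are also handy with map: pair every element with a fixed label.
tagged :: [(String, Int)]
tagged = map ("item",) [1, 2, 3]

main :: IO ()
main = do
  combined >>= print   -- (5,"Hello",2.5)
  print tagged         -- [("item",1),("item",2),("item",3)]
```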

Conclusion

The ecosystem of Haskell compiler extensions is very large. As a beginner, it can be hard to know where to start. But many extensions are simple. In this article, we went over a couple simple ones and a couple more complicated ones. Once you get familiar with one or two, the concept starts making a lot more sense.

For some more ideas on taking your Haskell to the next level, check out our Production Checklist! It has a list of libraries for cool purposes like writing servers and using databases!

Modifying a Library!

library.jpg

Sometime last year, I wrote an article about advanced Stack techniques. We discussed one hypothetical case where a library had a bug. The solution here was to fork the repository and use a Github location in stack.yaml.

I recently came across an easy example of doing this and thought I'd share it. We can see how to incorporate the change without stressing about its complexity. Even if you're only a beginner, this is a good skill to learn now!

You'll need some background on Stack first though! If you've never used Stack before, take a look at our Stack mini-course to learn more!

Background

I've been doing a bit of work lately with the persistent-migration library. This library lets us set up manual migrations for a Persistent schema we've set up through template Haskell. See this article for more on that topic.

The migrations library has a couple functions that let us run some SQL operations. They live in the SqlPersistT monad, as we would expect. However, something's a little off about them:

getMigration ::
  MigrateSettings -> Migration -> SqlPersistT IO [MigrateSql]

runMigration ::
  MigrateSettings -> Migration -> SqlPersistT IO ()

We can see that these functions restrict us to using SqlPersistT on top of the normal IO monad. But in most cases with Persistent, I use the withPostgresqlConn function. This adds a MonadLogger constraint. Thus IO doesn't cut it. Most functions have a type signature looking like this:

databaseOp :: SqlPersistT (LoggingT IO) a

So we can't interoperate as easily as we'd like between our normal database operations, and these migration functions. The solution is that the migration functions should be more general in what monads they can use. This will be an easy fix, as we'll see. But first, we have to find a way to get our own version of the code.

Getting Started

The persistent-migration library is on Github here. So we can make a fork of the repository, and clone that to our machine:

git clone https://github.com/jhb563/persistent-migration

Now we'll follow the build instructions in the Developing.md file to get set up. Some of the tests fail on my machine, but we'll ignore that for this article.

Making Our Code Fixes

The code fixes turn out to be very easy. We can go into the relevant module and change the type signatures so they are more general:

module Database.Persist.Migration.Postgres where

...

getMigration :: (MonadIO m) =>
  MigrateSettings -> Migration -> SqlPersistT m [MigrateSql]

runMigration :: (MonadIO m) =>
  MigrateSettings -> Migration -> SqlPersistT m ()

Even in the worst case, the only changes we would need to make here would be to add liftIO calls in various places. But it turns out that this change doesn't break anything! The library still builds, and all the tests that were passing before still pass.
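
To see why this generalization is so painless, here's a minimal sketch that doesn't depend on Persistent at all. It assumes that SqlPersistT is a ReaderT over a backend (which, as far as I know, is how Persistent defines it), and swaps in a hypothetical FakeSqlT that reads a plain String. The StateT layer plays the role of LoggingT. All names here are made up for illustration.

```haskell
module Main where

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State (StateT, evalStateT)

-- Hypothetical stand-in for SqlPersistT: a reader over a connection label.
type FakeSqlT = ReaderT String

-- Restricted signature: only usable directly on top of IO.
restrictedOp :: FakeSqlT IO ()
restrictedOp = do
  conn <- ask
  liftIO (putStrLn ("restricted op on " ++ conn))

-- Generalized signature: the body is unchanged, since it only needs liftIO.
generalOp :: MonadIO m => FakeSqlT m ()
generalOp = do
  conn <- ask
  liftIO (putStrLn ("general op on " ++ conn))

-- A different stack, analogous to SqlPersistT (LoggingT IO).
-- Using restrictedOp here would be a type error.
otherStack :: FakeSqlT (StateT Int IO) ()
otherStack = generalOp

main :: IO ()
main = do
  runReaderT restrictedOp "db"
  runReaderT generalOp "db"
  evalStateT (runReaderT otherStack "db") 0
```

The general version still works on plain IO, so existing callers don't notice the change.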

So now we can commit this change to our fork and push it to the repository.

Incorporating the Fix

Now we have to use our own fork as an alternative to the version of the library on Hackage. Before, the extra-deps section in our stack.yaml looked like this:

extra-deps:
  - persistent-migration-0.1.0

This indicates we would grab the code from Hackage. But now we can use an alternative package format to reference our Github repository. Here's what we change it to:

extra-deps:
  - git: https://github.com/jhb563/persistent-migration.git
    commit: 9f9c5035efe

And now we've got our own fork as a dependency for our project! We can write code like so:

doMigrations :: SqlPersistT (LoggingT IO) a
doMigrations = do
  runMigration defaultSettings migration
  ...

And everything works! Our code builds!

Potential Issues

Now, an approach like this can lead to some possible issues. We're now disconnected from the original repository. So if there was a new release, we'd have to do a bit more work to pull those changes into our own repository. Still, this isn't too difficult. One solution to this is to submit a pull request with our changes. If it gets accepted, they'll be in the next release! Then we can go back to using the version on Hackage!

Conclusion

In this article, we did a quick overview of how to make our own changes to libraries. We cloned the repository, made a code change, and added our fork as a dependency. Obviously, most of the changes you'll want to make aren't as simple as this one was. But it's good to use an example where all we're doing is tackling the issue of getting the code into our code base!

For a broad overview of how to use the Stack program, make sure to check out our Stack mini-course! If you've never written any Haskell before, you can also look at our Beginners Checklist!

Shareable Haskell with Jupyter!

In the last couple weeks, we've discussed a couple options for Haskell IDEs, like Atom and IntelliJ. But there's another option I'll talk about this week. Both our IDE setups are still most useful for fully-fledged projects. But if you're writing some quick and dirty one-off code, they can be a little cumbersome to work with.

This other option is Jupyter with IHaskell. It's like IPython, for those who have used that. I got the idea when the good folks at Tweag made a blog post with it. Jupyter was originally intended for making quick Python data science scripts. It allows a nice UI for making data visualizations. Thanks to the hard work of Andrew Gibiansky, there is a Haskell kernel for Jupyter! In this article, we'll discuss some quick approaches to using it.

IHaskell is actually a great tool when you're first learning Haskell! If you've never programmed in Haskell before, you can read our Liftoff Series and follow along with the code examples! You can write them using IHaskell instead of making a Stack project!

Installing

If you want to make a full-fledged Jupyter notebook, you'll need to install Jupyter first. The most heavy-duty but easiest way would be to use the Anaconda distribution. But there are also other options like pip.

After that, you'll need to install the Haskell kernel for it. Unfortunately, you can't do this on a Windows machine. You need either a Mac, a Linux machine, or a virtual machine. The instructions for these systems are well documented in the README. In short, you need to sort out your Python dependencies, grab the Github repo, and build the project.

Now if you're on Windows, or you don't want to install the full Jupyter system, you can try out IHaskell online. Head to this Binder page, make a new notebook, and get cracking!

Making a Basic Example

In our notebook, we can write Haskell code as if we're in a file, but evaluate it as if we're in GHCI. A quick look at the .cabal file will reveal the libraries we have easy access to in this notebook setup. We can see for instance that we have stalwarts such as mtl, aeson, and split. Using this last library, we can write the following snippet:

firstPic.jpg

Then we press shift+enter to finish the cell and it gets evaluated. Evaluations work as in GHCI. Anything you assign to a variable name will be usable later on in your notebook. Then the final expression you put will get printed. So we'll see output like so:

secondPic.jpg

Then we can use the items we named in another snippet like this:

thirdPic.jpg

Since we also have access to the Aeson library, we can serialize our list like so:

fourthPic.jpg

As a final note, it is easy to create multiline definitions and use those! This is a big improvement over GHCI, where it would be very annoying to define a new data type, for instance:

fifthPic.jpg

Exporting the Notebook

Now one of the awesome things about Jupyter notebooks is that it's easy to share your work! There's an option off the file menu for downloading your notebooks. There are a great many options, including Haskell source files, pdfs, and HTML documents. These last two can be extremely useful if you want to make a presentation!

sixthPic.png

Conclusion

As I move forward with MMH, I'm definitely going to explore using notebooks like this more. It should provide a better reader experience than what I have now. I'll also be looking at migrating some of our existing permanent content to Jupyter. The lack of Windows functionality for Haskell is unfortunate, but I'll find a way around it.

Jupyter IHaskell is a great way to get familiar with the basics of Haskell without downloading any of the tools. But at some point, you'll need these! Read our Liftoff Series and download our Beginners Checklist to find out more!

Another IDEa: Haskell and IntelliJ!

IntelliJ.jpg

Last week we explored one way to get a nice development environment for Haskell. We used the Atom text editor, which has a couple Haskell plugins and is quite hackable.

But there's another option I hadn't considered at all during that article. This is the IntelliJ IDE. Its primary use is Java and Android development. But like Atom, Visual Studio, and other IDEs, it has a rich library of plugins. And one of these is a Haskell environment!

This week, we'll see how to configure IntelliJ to work with Haskell. We'll see how we can get a nifty Haskell environment set up with the same features we had in Atom. I'm working on a Windows machine, but you should be able to do all these steps on a Mac as well.

An IDE is no substitute for basic knowledge though! If you're new to Haskell, getting a good dev setup will help. But you should also read our Liftoff Series and download our Beginners Checklist! These will give you some other tools you'll need!

Installation and Setup

Getting started with IntelliJ is quite easy. Installing the editor works through the normal wizard. You'll have a lot of options for different plugins to install immediately. A lot of these are Java specific so you won't need them. But once you've done that you can also install the IntelliJ-Haskell plugin. In my case, I also installed a Vim plugin for those keybindings.

There's a little bit of trickiness involved in setting your project up to build with Stack. When you first install the plugin, it will ask you what program to use to "build" a project. This means you'll need to locate your stack executable in the file finder so you can drag it in. On Windows this will mean showing hidden folders in the finder. You might also need to use the where command in the terminal to help (instead of which from Linux). Once you've done this though, you should be good!

Keyboard Shortcuts

When working with Atom, we stressed the importance of keyboard shortcuts. These can streamline our workflow a lot. IntelliJ also allows a good deal of customization options for these. The main thing to know is that you need to hit ctrl+alt+s to get to the settings menu. Then you can find the keymap on the side panel. From here you can customize pretty much anything. The big ones for me were building the project and manipulating panels.

The ability to search for commands is very useful. I found it a lot easier to alter commands for, say, splitting windows than I did in Atom. My current setup involves the following combinations:

Build Project: Ctrl+Alt+Shift+B
Split Screen Vertically: Ctrl+Alt+Shift+Right
Split Screen Horizontally: Ctrl+Alt+Shift+Down
Next/Previous Split: Ctrl+Shift+[Right/Left]
Unsplit: Ctrl+Alt+Shift+U
Toggle Bottom Terminal: Ctrl+Shift+Up

For what it's worth, I'll also note that the Vim keybindings are better than what I had in Atom. Jumping between lines and saving files with :w both work, to give a couple examples.

Haskell Features

This is where the IntelliJ plugin shines. Lots of features work right out of the box. For instance, it knows to use hlint and highlights any code with lints. Compilation hints show up automatically. There's even a good deal of auto-completion from libraries for expressions and types. Integration with Hoogle is fairly straightforward.

Best of all, it seems to me that these features work across projects with different GHC versions. As far as I can tell you don't need to manually install ghc-mod and worry about its version, as you did with Atom. Given the difficulties I had setting up Atom to work with these features, this was a major relief.

Git Integrations

We didn't go over version control last week. But it's another vital component in any developer's workflow, so IDE integration is a big plus. Both Atom and IntelliJ have good support for Github, which is excellent news! Both come with batteries included when it comes to all the common Git operations we want. You can make new branches, add commits, push and pull with ease. Both allow you to bind these to keys, allowing you the freedom to streamline your workflow even more.

Disadvantages

If I were to find one fault with my IntelliJ setup, it's that project setup can involve a lot of loading time. When you add a new library to the .cabal file, you need to run the Tools->Haskell->Update Settings command. The IDE will take a little while to reset everything to account for this. Having said that, a lot of that loading time goes into getting all the appropriate libraries to set up. This enables all the nice features I mentioned earlier. So I suppose that's the price you pay. Atom is also sometimes slow, for its part. But the program itself isn't quite as bulky as IntelliJ, which has a lot of extra features you probably won't need.

One last note is that IntelliJ will add a .idea folder to your project directory. Make sure to add this to your .gitignore!

Conclusion

All in all working with IntelliJ/HaskelIDE has been a good experience so far. It has all the features I'm looking for, and setup is a bit easier than Atom. Long loads can hold me back at times, but it's usually fine. Again, you can take a look at the Github page for the project for some more details. I highly recommend trying out this plugin! Much love to Rik van der Kleij, the author!

A full IDE setup will really help you get started learning Haskell! But you also need some other tools and knowledge. Download our Beginners Checklist for some other tools you'll need. Also take a look at our Stack mini-course to learn more about setting up a Haskell project!

Upgrading My Development Setup!

code_setup.jpg

In the last year or so, I haven't actually written as much Haskell as I'd like to. There are a few reasons for this. First, I switched to a non-Haskell job, meaning I was now using my Windows laptop for all my Haskell coding. Even on my previous work laptop (a Mac), things weren't easy. Haskell doesn't have a real IDE, like IntelliJ for Java or XCode for iOS development.

Besides not having an IDE, Windows presents extra pains compared to the more developer-friendly Mac. And it finally dawned on me. If I, as an experienced developer, was having this much friction, it must be a nightmare for beginners. Many people must be trying to both learn the language AND fight against their dev setup. So I decided to take some time to improve how I program so that it'll be easier for me to actually do it.

I wanted good general functionality, but also some Haskell-specific functions. I did a decent amount of research and settled on Atom as my editor of choice. In this article, we'll explore the basics of setting up Atom, what it offers, and the Haskell tools we can use within it. If you're just starting out with Haskell, I hope you can take these tips to make your Haskell Journey easier.

Note that many tips in this article won't work without the Haskell platform! To start with Haskell, download our Beginners Checklist, or read our Liftoff Series!

Goals

It's always good to begin with the end in mind. So before we start out, let's establish some goals for our development environment. A lot of these are basic items we should have regardless of what language we're using.

  1. Autocomplete. Must have for terms within the file. Nice to have for extra library functions and types.
  2. Syntax highlighting.
  3. Should be able to display at least two code files side-by-side, should also be able to open extra files in tabs.
  4. Basic file actions should only need the keyboard. These include opening new files to new tabs or splitting the window and opening a new file in the pane.
  5. Should be able to build code using the keyboard only. Should be able to examine terminal output and code at the same time.
  6. Should be able to format code automatically (using, for instance, Hindent)
  7. Some amount of help filling in library functions and basic types. Should be able to coordinate types from other files.
  8. Partial compilation. If I make an obvious mistake, the IDE should let me know immediately.
  9. Vim keybindings (depends on your preference of course)

With these goals in mind, let's go ahead and see how Atom can help us.

Basics of Atom

Luckily, the installation process for Atom is pretty painless. Using the Windows installer came off without a hitch for me. Out of the box, Atom fulfills most of the basic requirements we'd have for an IDE. In fact, we get goals 1-4 without putting in any effort. The trick is that we have to learn a few keybindings. The following are what you'll need to open files.

  1. Ctrl+P - Open a new tab with a file using fuzzy find
  2. Ctrl+K + Direction (left/right/up/down arrow) - Open a new pane (will initially have the same file as before).
  3. Ctrl+K + Ctrl+Direction - Switch pane focus

Those commands solve requirements 3 and 4 from our goals list.

Another awesome thing about Atom is the extensive network of easy-to-install plugins. We'll look at some Haskell specific items below. But to start, we can use the package manager to install vim-mode-improved. This allows most Vim keybindings, fulfilling requirement 9 from above. There are a few things to re-learn with different keystrokes, but it works all right.

Adding Our Own Keybindings

Since Atom is so hackable, you can also add your own keybindings and change ones you don't like. We'll do one simple example here, but you can also check out the documentation for some more ideas. One thing we'll need for goal #5 is to make it easier to bring up the bottom panel within Atom. This is where terminal output goes when we run a command. You'll first want to open up keymap.cson, which you can do by going to the file menu and clicking Keymap….

Then you can add the following lines at the bottom:

'atom-workspace':
  'ctrl-shift-down': 'window:toggle-bottom-dock'
  'ctrl-shift-up': 'window:toggle-bottom-dock'

First, we scope the command to the entire atom workspace. (We'll see an example below of a command with a more limited scope). Then we assign the Ctrl+Shift+Down Arrow key combination to toggle the bottom dock. Since it's a toggle command, we could repeat the command to move it both up and down. But this isn't very intuitive, so we add the second line so that we can also use the up arrow to bring it up.

A super helpful tool is the key binding resolver. At any point, you can use ctrl+. (control key plus the period key) to bring up the resolver. Then pressing any key combination will bring up the commands Atom will run for it. It will highlight the one it will pick in case of conflicts. This is great for finding unassigned key combinations!

Haskell Mode in Atom

Now let's start looking at adding some Haskell functionality to our editor. We'll start by installing a few different Haskell-related packages in Atom. You don't need all these, but this is a list of the core packages suggested in the Atom documentation.

language-haskell
ide-haskell
ide-haskell-cabal
haskell-ghc-mod
autocomplete-haskell

The trickier part of getting Haskell functionality is the binary dependencies. A couple of the packages we added depend on having certain programs installed. The most prominent of these is ghc-mod, which exposes some functionality of GHC. You'll also want a code formatter installed, such as hindent or stylish-haskell.

At the most basic level, it's easy to install these programs with Stack. You can run the command:

stack install ghc-mod stylish-haskell

However, ghc-mod matches up with a specific version of GHC. The command above installs the binaries at a system-wide level. This means you can only have the version for one GHC version installed at a time. So imagine you have one project using GHC 8.0, and another project using GHC 8.2. You won't be able to get Haskell features for each one at the same time using this approach. You would need to re-install the proper version whenever you switched projects.

As a note, there are a couple ways to ensure you know what version you've installed. First, you can run the stack install ghc-mod command from within the particular project directory. This will use that project's LTS to resolve which version you need. You can also modify the install command like so:

stack --resolver lts-9 install ghc-mod

There is an approach where you can install different, compiler specific versions of the binary on your system, and have Atom pick the correct one. I haven't been able to make this approach work yet. But you can read about that approach on Alexis King's blog post here.

Keybinding for Builds

Once we have that working, we'll have met most of our feature goals. We'll have partial compilation and some Haskell specific autocompletion. There are other packages, such as haskell-hoogle that you can install for even more features.

There's one more feature we want though, which is to be able to build our project from the keyboard. When we installed our Haskell packages, Atom added a "Haskell IDE" menu at the top. We can use this to build our project with "Haskell IDE" -> "Builder" -> "Build Project". We can add a keybinding for this command like so.

'atom-text-editor[data-grammar~="haskell"]':
  ...
  'ctrl-alt-shift-b': 'ide-haskell-cabal:build'

Notice that we used a more limited scope here, so this command will only run on Haskell files. Now we can build our project at any time with Ctrl+Shift+Alt+B, which will really streamline our development!

Weaknesses

The biggest weakness with Atom Haskell-mode is binary dependencies and GHC versions. The idea behind Stack is that switching to a different project with a different compiler shouldn't be hard. But there are a lot of hoops to jump through to get editor support. To be fair though, these problems are not exclusive to Atom.

Another weakness is that the Haskell plugins for Atom currently only support up through LTS 9 (GHC 8). This is a big weakness if you're looking to use new features from the cutting edge of GHC development. So Atom Haskell-mode might not be fully-featured for industry projects or experimental work.

As a further note, the Vim mode in Atom doesn't give all the keybindings you might expect from Vim. For example, I could no longer use the colon key plus a number to jump to a line. Of course, Atom has its own bindings for these things. But it takes a little while to re-learn the muscle memory.

Alternatives

There are, of course, alternatives to the approach I've laid out in this article. Many plugins/packages exist enabling you to get good Haskell features with Emacs and Vim. For Emacs, you should look at haskell-mode. For Vim, I made the most progress following this article from Stephen Diehl. I'll say for my part that I haven't tried the Emacs approach, and ran into problems a couple times with Vim. But with enough time and effort, you can get them to work!

If you use Visual Studio Code, there are a couple packages for Haskell: Haskelly and Haskero. I haven't used either of these, but they both seem to provide a lot of nice features.

Conclusion

Having a good development environment is one of the keys to programming success. More popular languages have full-featured IDE's that make programming a lot easier. Haskell doesn't have this level of support. But there's enough community help that you can use a hackable editor like Atom to get most of what you want. Since I fixed this glaring weakness, I've been able to write Haskell much more efficiently. If you're starting out with the language, this can make or break your experience! So it's worth investing at least a little bit of time and effort to ensure you've got a smooth system to work with.

Of course, having an editor setup for Haskell is meaningless if you've never used the language! Download our Beginners Checklist or read our Liftoff Series to get going!

Haskell Data Types Review!

This week we're taking a quick break from new content. We've added our new series on Haskell's data system to our permanent collection. You can find it under the beginners panel or check it out here! This series had five parts. Let's take a quick review:

  1. In part 1 we reviewed the basic way to construct data types in Haskell. We compared this to the syntax of other languages like Java and Python.
  2. Part 2 showed the simple way we can extend our Haskell types to make them sum types! We saw that this is a more difficult process in other languages. In fact, we resorted to making different inherited types in object oriented languages.
  3. Next, we demonstrated the concept of parametric types in part 3. We saw how little we needed to add to Haskell's definitions to make this work. Again, we looked at comparable examples in other languages as well.
  4. In part 4, we delved into Haskell's typeclasses. We compared them against inherited types from OO languages and noted some pros and cons.
  5. Finally, in part 5 we concluded the series by exploring the idea of type families. Our code was more complicated than we'd need in other languages. And yet, our code contains a lot more behavioral guarantees in Haskell than it does elsewhere. And we achieved this while still having a good deal of flexibility. Type families have a definite learning curve, but they're a useful concept to know.

As always, keep coming back every Monday morning for some new Haskell content! For more updates and our monthly newsletter, make sure you Subscribe! This will also give you access to our Subscriber Resources!

Why Haskell V: Type Families

type_families.png

Welcome to the conclusion of our series on Haskell data types! We've gone over a lot of things in this series that demonstrated Haskell's simplicity. We compared Haskell against other languages where we saw more cumbersome syntax. In this final part, we'll see something a bit more complicated though. We'll do a quick exploration of the idea of type families. We'll start by tracing the evolution of some related type ideas, and then look at a quick example.

Type families are a rather advanced concept. But if you're more of a beginner, we've got plenty of other resources to help you out! Take a look at our Getting Started Checklist or our Liftoff Series!

Different Kinds of Type Holes

In this series so far, we've seen a couple different ways to "plug in a hole", as far as a type or class definition goes. In the third part of this series we explored parametric types. These have type variables as part of their definition. We can view each type variable as a hole we need to fill in with another type.

Then in the fourth part, we explored the concept of typeclasses. For any instance of a typeclass, we're plugging in the holes of the function definitions of that class. We fill in each hole with an implementation of the function for that particular type.

This week, we're going to combine these ideas to get type families! A type family is an enhanced class where one or more of the "holes" we fill in is actually a type! This allows us to associate different types with each other. The result is that we can write special kinds of polymorphic functions.

A Basic Logger

First, here's a contrived example to use through this article. We want to have a logging typeclass. We'll call it MyLogger. We'll have two main functions in this class. We should be able to get all the messages in the log in chronological order. Then we should be able to log a new message while performing some sort of effect. A first pass at this class might look like this:

class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> logger -> logger

We can make a slight change that would use the State monad instead of passing the logger as an argument:

class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> State logger ()

But this class is deficient in an important way. We won't be able to have any effects associated with our logging. What if we want to save the log message in a database, send it over network connection, or log it to the console? We could allow this, while still keeping prevMessages pure like so:

class MyLogger logger where
  prevMessages :: logger -> [String]
  logString :: String -> StateT logger IO ()

Now our logString function can use arbitrary effects. But this has the obvious downside that it forces us to introduce the IO monad into places where we don't need it. If our logger doesn't need IO, we don't want it. So what do we do?

Type Family Basics

One answer is to make our class a type family. We do this with the type keyword in the class definition. First, we need a few language pragmas to allow this:

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE AllowAmbiguousTypes #-}

Now we'll make a type within our class that refers to the monadic effect type of the logString function. We have to describe the "kind" of the type with the definition. Since it's a monad, its kind is * -> *. This indicates that it requires another type parameter. Here's what our definition looks like:

class MyLogger logger where
  type LoggerMonad logger :: * -> *
  prevMessages :: logger -> [String]
  logString :: String -> (LoggerMonad logger) ()

Some Simple Instances

Now that we have our class, let's make an instance that does NOT involve IO. We'll use a simple wrapper type for our logger. Our "monad" will contain the logger in a State. Then all we do when logging a string is change the state!

newtype ListWrapper = ListWrapper [String]
instance MyLogger ListWrapper where
  type (LoggerMonad ListWrapper) = State ListWrapper
  prevMessages (ListWrapper msgs) = reverse msgs
  logString s = do
    (ListWrapper msgs) <- get
    put $ ListWrapper (s : msgs)

Now we can make a version of this that starts involving IO, but without any extra "logging" effects. Instead of using a list for our state, we'll use a mapping from timestamps to the messages. When we log a string, we'll use IO to get the current time and store the string in the map with that time.

newtype StampedMessages = StampedMessages (Data.Map.Map UTCTime String)
instance MyLogger StampedMessages where
  type (LoggerMonad StampedMessages) = StateT StampedMessages IO
  prevMessages (StampedMessages msgs) = Data.Map.elems msgs
  logString s = do
    (StampedMessages msgs) <- get
    currentTime <- lift getCurrentTime
    put $ StampedMessages (Data.Map.insert currentTime s msgs)

More IO

Now for a couple examples that use IO in a traditional logging way while also storing the messages. Our first example is a ConsoleLogger. It will save the message in its State but also log the message to the console.

newtype ConsoleLogger = ConsoleLogger [String]
instance MyLogger ConsoleLogger where
  type (LoggerMonad ConsoleLogger) = StateT ConsoleLogger IO
  prevMessages (ConsoleLogger msgs) = reverse msgs
  logString s = do
    (ConsoleLogger msgs) <- get
    lift $ putStrLn s
    put $ ConsoleLogger (s : msgs)

Another option is to write our messages to a file! We'll store the file name as part of our state, though we could use the Handle if we wanted.

newtype FileLogger = FileLogger (String, [String])
instance MyLogger FileLogger where
  type (LoggerMonad FileLogger) = StateT FileLogger IO
  prevMessages (FileLogger (_, msgs)) = reverse msgs
  logString s = do
    (FileLogger (filename, msgs)) <- get
    handle <- lift $ openFile filename AppendMode
    lift $ hPutStrLn handle s
    lift $ hClose handle
    put $ FileLogger (filename, s : msgs)

And we can imagine that we would have a similar situation if we wanted to send the logs over the network. We would use our State to store information about the destination server. Or else we could add something like Servant's ClientM monad to our stack in the type definition.

Using Our Logger

By defining our class like this, we can now write a polymorphic function that will work with any of our loggers!

runComputations :: (MyLogger logger, Monad (LoggerMonad logger)) => InputType -> (LoggerMonad logger) ResultType
runComputations input = do
  logString "Starting Computation!"
  let x = firstFunction input
  logString "Finished First Computation!"
  let y = secondFunction x
  logString "Finished Second Computation!"
  return y

This is awesome because our code is now abstracted away from the needed effects. We could call this with or without the IO monad.
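
To make this concrete, here's a minimal, self-contained sketch that runs a polymorphic function against the pure ListWrapper logger. A few things here are assumptions rather than part of the code above. The function doubleWithLogs is a hypothetical stand-in for runComputations. The TypeApplications and ScopedTypeVariables extensions (with the @logger annotations) are one way to tell GHC which logger each logString call refers to, since the type family alone doesn't determine it. The sketch also uses Type from Data.Kind in place of *, and it assumes the mtl package for State.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Control.Monad.State (State, get, put, runState)
import Data.Kind (Type)

-- Same shape as the class in the article, with Type instead of *.
class MyLogger logger where
  type LoggerMonad logger :: Type -> Type
  prevMessages :: logger -> [String]
  logString :: String -> LoggerMonad logger ()

newtype ListWrapper = ListWrapper [String]

-- Pure instance: logging only changes the State.
instance MyLogger ListWrapper where
  type LoggerMonad ListWrapper = State ListWrapper
  prevMessages (ListWrapper msgs) = reverse msgs
  logString s = do
    ListWrapper msgs <- get
    put (ListWrapper (s : msgs))

-- Hypothetical polymorphic computation (a stand-in for runComputations).
-- The @logger annotations are an assumption: they pick the logger explicitly.
doubleWithLogs
  :: forall logger. (MyLogger logger, Monad (LoggerMonad logger))
  => Int -> LoggerMonad logger Int
doubleWithLogs n = do
  logString @logger "Starting computation"
  let result = n * 2
  logString @logger "Finished computation"
  return result

-- Run with the pure logger: no IO anywhere in this expression.
pureRun :: (Int, [String])
pureRun =
  let (result, logger) = runState (doubleWithLogs @ListWrapper 21) (ListWrapper [])
  in (result, prevMessages logger)
```

Here pureRun pairs the result with the logged messages in chronological order. Running the same function against an IO-based logger would mean choosing a different type application and unwrapping its StateT layer instead.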

Comparing to Other Languages

Now, to be fair, this is one area of Haskell's type system that makes it a bit more difficult to use than other languages. Arbitrary effects can happen anywhere in Java or Python. Because of this, we don't have to worry about matching up effects with types.

But let's not forget about the benefits! For all parts of our code, we know what effects we can use. This lets us determine at compile time where certain problems can arise.

And type families give us the best of both worlds! They allow us to write polymorphic code that can work either with or without IO effects!

Conclusion

That's all for our series on Haskell's data system! We've now seen a wide range of elements, from the simple to the complex. We compared Haskell against other languages. Again, the simplicity with which one can declare data in Haskell and use it polymorphically was a key selling point for me!

Hopefully this series has inspired you to get started with Haskell if you haven't already! Download our Getting Started Checklist or read our Liftoff Series to get going!

Why Haskell IV: Typeclasses vs. Inheritance

Welcome to part four of our series comparing Haskell's data types to other languages. As I've expressed before, the type system is one of the key reasons I enjoy programming in Haskell. And this week, we're going to get to the heart of the matter. We'll compare Haskell's typeclass system with the idea of inheritance used by object oriented languages.

If Haskell's simplicity inspires you as well, try it out! Download our Beginners Checklist and read our Liftoff Series to get going!

Typeclasses Review

Before we get started, let's do a quick review of the concepts we're discussing. First, let's remember how typeclasses work. A typeclass describes a behavior we expect. Different types can choose to implement this behavior by creating an instance.

One of the most common classes is the Functor typeclass. The behavior of a functor is that it contains some data, and we can map a type transformation over that data.

In the raw code definition, a typeclass is a series of function names with type signatures. Functor needs only one function, fmap:

class Functor f where
  fmap :: (a -> b) -> f a -> f b

A lot of different container types implement this typeclass. For example, lists implement it with the basic map function:

instance Functor [] where
  fmap = map

But now we can write a function that assumes nothing about one of its inputs except that it is a functor:

stringify :: (Functor f) => f Int -> f String

We could pass a list of ints, an IO action returning an Int, or a Maybe Int if we wanted. This function would still work! This is the core idea of how we can get polymorphic code in Haskell.
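
As a quick sketch of this, here's one possible body for stringify along with a few calls. The article only gives the type signature, so the implementation with fmap show is an assumption, as are the example names.

```haskell
-- One possible implementation: turn each Int inside the functor into a String.
stringify :: Functor f => f Int -> f String
stringify = fmap show

-- The same function applied to a list...
listExample :: [String]
listExample = stringify [1, 2, 3]

-- ...to a Maybe...
maybeExample :: Maybe String
maybeExample = stringify (Just 42)

-- ...and to an IO action returning an Int.
ioExample :: IO String
ioExample = stringify (return 7)
```

The function body never mentions lists, Maybe, or IO. The Functor constraint is all it needs.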

Inheritance Basics

As we saw in previous parts, object oriented languages like Java, C++, and Python tend to use inheritance to achieve polymorphism. With inheritance, we make a new class that extends the functionality of a parent class. The child class can access the fields and functions of the parent. We can call functions from the parent class on the child object. Here's an example:

public class Person {
  public String firstName;
  public String lastName;
  public int age;

  public Person(String fn, String ln, int age) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = age;
  }

  public String getFullName() {
    return this.firstName + " " + this.lastName;
  }
}

public class Employee extends Person {
  public String company;
  public String email;
  public int salary;

  public Employee(String fn,
                  String ln,
                  int age,
                  String company,
                  String em,
                  int sal) {
    super(fn, ln, age);
    this.company = company;
    this.email = em;
    this.salary = sal;
  }
}

Inheritance expresses an "Is-A" relationship. An Employee "is a" Person. Because of this, we can create an Employee, but pass it to any function that expects a Person. We can also call the getFullName function from Person on our Employee type.

public void printPerson(Person p) {
  ...
}

public void main() {
  Employee e = new Employee("Michael", "Smith", 23, "Google", "msmith@google.com", 100000);
  printPerson(e);
  String s = e.getFullName();
}

Here's another trick. We can put items constructed as either Person or Employee in the same array, if that array has type Person[]:

public void main() {
  Employee e = new Employee("Michael", "Smith", 23, "Google", "msmith@google.com", 100000);
  Person p = new Person("Katie", "Johnson", 25);
  Person[] people = {e, p};
}

This provides a useful kind of polymorphism we can't get in Haskell.

Benefits

Inheritance does have a few benefits. It allows us to reuse code. The Employee class can use the getFullName function without having to define it. If we wanted, we could override the definition in the Employee class, but we don't have to.

Inheritance also allows a degree of polymorphism, as we saw in the code examples above. If the circumstances only require us to use a Person, we can use an Employee or any other subclass of Person we make.

We can also use inheritance to hide variables away when they aren't needed by subclasses. In our example above, we made all our instance variables public. This means an Employee function can still call this.firstName. But if we make them private instead, the subclasses can't use them in their functions. This helps to encapsulate our code.

Drawbacks

Inheritance is not without its downsides though. One unpleasant consequence is that it creates a tight coupling between classes. If we change the parent class, we run the risk of breaking all child classes. If the interface to the parent class changes, we'll have to change any subclass that overrides the function.

Another potential issue is that your interface could deform to accommodate child classes. There might be some parameters only a certain child class needs, and some only the parent needs. But you'll end up having all parameters in all versions because the API needs to match.

A final problem comes from trying to understand source code. There's a yo-yo effect that can happen when you need to hunt down what function definition your code is using. For example your child class can call a parent function. That parent function might call another function in its interface. But if the child has overridden it, you'd have to go back to the child. And this pattern can continue, making it difficult to keep track of what's happening. It gets even worse the more levels of a hierarchy you have.

I was a mobile developer for a couple years, using Java and Objective C. These kinds of flaws were part of what turned me off OO-focused languages.

Typeclasses as Inheritance

Now, Haskell doesn't allow you to "subclass" a type. But we can still get some of the same effects of inheritance by using typeclasses. Let's see how this works with the Person example from above. Instead of making a separate Person data type, we can make a Person typeclass. Here's one approach:

class Person a where
  firstName :: a -> String
  lastName :: a -> String
  age :: a -> Int
  getFullName :: a -> String

data Employee = Employee
  { employeeFirstName :: String
  , employeeLastName :: String
  , employeeAge :: Int
  , company :: String
  , email :: String
  , salary :: Int
  }

instance Person Employee where
  firstName = employeeFirstName
  lastName = employeeLastName
  age = employeeAge
  getFullName e = employeeFirstName e ++ " " ++ employeeLastName e

We can make one interesting observation here. Multiple inheritance is now trivial. After all, a type can implement as many typeclasses as it wants. Python and C++ allow multiple inheritance. But it presents enough conceptual pains that languages like Java and Objective C do not allow it.

Looking at this example though, we can see a big drawback. We won't get much code reusability out of this. Every new type will have to define getFullName. That will get tedious. A different approach could be to only have the data fields in the interface. Then we could have a library function as a default implementation:

class Person a where
  firstName :: a -> String
  lastName :: a -> String
  age :: a -> Int

getFullName :: (Person a) => a -> String
getFullName p = firstName p ++ " " ++ lastName p

data Employee = ...

instance Person Employee where
  ...

This allows code reuse. But it does not allow overriding, which the first example would. So you'd have to choose on a one-off basis which approach made more sense for your type. And no matter what, we can't place different types into the same array, as we could in Java.
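
Here's a small sketch of that second approach with the elided parts filled in, to show the reuse in action. The Student type and the example values are hypothetical additions, and Employee is trimmed to fewer fields to keep things short.

```haskell
class Person a where
  firstName :: a -> String
  lastName :: a -> String
  age :: a -> Int

-- Library function: written once, shared by every Person instance.
getFullName :: Person a => a -> String
getFullName p = firstName p ++ " " ++ lastName p

data Employee = Employee
  { employeeFirstName :: String
  , employeeLastName :: String
  , employeeAge :: Int
  , company :: String
  }

instance Person Employee where
  firstName = employeeFirstName
  lastName = employeeLastName
  age = employeeAge

-- Hypothetical second type, not from the article.
data Student = Student
  { studentFirstName :: String
  , studentLastName :: String
  , studentAge :: Int
  , school :: String
  }

instance Person Student where
  firstName = studentFirstName
  lastName = studentLastName
  age = studentAge

employeeName :: String
employeeName = getFullName (Employee "Michael" "Smith" 23 "Google")

studentName :: String
studentName = getFullName (Student "Katie" "Johnson" 25 "State University")
```

Neither instance defines getFullName, yet both types can use it. The trade-off is the one described above: since getFullName lives outside the class, no instance can swap in its own version.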

So while we could do inheritance in Haskell, it's a pattern you should avoid. Stick to using typeclasses in the intended way.

Comparisons

Object oriented inheritance has some interesting uses. But at the end of the day, I found the warts very annoying. Tight coupling between classes seems to defeat the purpose of abstraction. Meanwhile, restrictions like single inheritance feel like a code smell to me. The existence of that restriction suggests a design flaw. Finally, the issue of figuring out which version of a function you're using can be quite tricky. This is especially true when your class hierarchy is large.

Typeclasses express behaviors. And as long as our types implement those behaviors, we get access to a lot of useful code. It can be a little tedious to flesh out a new instance of a class for every type you make. But there are all kinds of ways to derive instances, and this can reduce the burden. I find typeclasses a great deal more intuitive and less restrictive. Whenever I see a requirement expressed through a typeclass, it feels clean and not clunky. This distinction is one of the big reasons I prefer Haskell over other languages.

Conclusion

That wraps up our comparison of typeclasses and inheritance! There's one more topic I'd like to cover in this series. It goes a bit beyond the "simplicity" of Haskell into some deeper ideas. We've seen concepts like parametric types and typeclasses. These force us to fill in "holes" in a type's definition. We can expand on this idea by looking at type families. Next week, we'll explore this more advanced concept and see what it's useful for.

If you want to stay up to date with our blog, make sure to subscribe! That will give you access to our subscriber only resources page!

Why Haskell III: Parametric Types

Welcome back to our series on the simplicity of Haskell's data declarations. Last week, we looked at how to express sum types in different languages. We saw that they fit very well within Haskell's data declaration system. For Java and Python, we ended up using inheritance, which presents some interesting benefits and drawbacks. We'll explore those more next week. But first, we should wrap our heads around one more concept: parametric types.

We'll see how each of these languages allows for the concept of parametric types. In my view, Haskell does have the cleanest syntax. But other compiled languages do pretty well to incorporate the concept. Dynamic languages, though, provide insufficient guarantees for my liking.

This all might seem a little wild if you haven't done any Haskell at all yet! Read our Liftoff Series to get started!

Haskell Parametric Types

Let's remember how easy it is to do parametric types in Haskell. When we want to parameterize a type, we'll add a type variable after its name in the definition. Then we can use this variable as we would any other type. Remember our Person type from the first week? Here's what it looks like if we parameterize the occupation field.

data Person o = Person
  { personFirstName :: String
  , personLastName :: String
  , personEmail :: String
  , personAge :: Int
  , personOccupation :: o
  }

We add the o at the start, and then we can use o in place of our former String type. Now whenever we use the Person type, we have to specify a type parameter to complete the definition.

data Occupation = Lawyer | Doctor | Engineer

person1 :: Person String
person1 = Person "Michael" "Smith" "msmith@gmail.com" 27 "Lawyer"

person2 :: Person Occupation
person2 = Person "Katie" "Johnson" "kjohnson@gmail.com" 26 Doctor

When we define functions, we can use a specific version of our parameterized type if we want to constrain it. We can also use a generic type if it doesn't matter.

salesMessage :: Person Occupation -> String
salesMessage p = case personOccupation p of
  Lawyer -> "We'll get you the settlement you deserve"
  Doctor -> "We'll get you the care you need"
  Engineer -> "We'll build that app for you"

fullName :: Person o -> String
fullName p = personFirstName p ++ " " ++ personLastName p

Last of all, we can use a typeclass constraint on the parametric type if we only need certain behaviors:

sortOnOcc :: (Ord o) => [Person o] -> [Person o]
sortOnOcc = sortBy (\p1 p2 -> compare (personOccupation p1) (personOccupation p2))
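
To try sortOnOcc out, here's a self-contained sketch. It makes a couple of assumptions beyond the snippets above: Occupation derives Ord (which orders constructors as declared), and the comparison uses comparing from Data.Ord as a shorthand for the lambda. The people list is just sample data.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Deriving Ord orders the constructors as written: Lawyer < Doctor < Engineer.
data Occupation = Lawyer | Doctor | Engineer
  deriving (Show, Eq, Ord)

data Person o = Person
  { personFirstName :: String
  , personLastName :: String
  , personEmail :: String
  , personAge :: Int
  , personOccupation :: o
  }

-- Works for any occupation type that can be ordered.
sortOnOcc :: Ord o => [Person o] -> [Person o]
sortOnOcc = sortBy (comparing personOccupation)

people :: [Person Occupation]
people =
  [ Person "Katie" "Johnson" "kjohnson@gmail.com" 26 Engineer
  , Person "Michael" "Smith" "msmith@gmail.com" 27 Lawyer
  ]

-- First names after sorting by occupation.
sortedFirstNames :: [String]
sortedFirstNames = map personFirstName (sortOnOcc people)
```

The same sortOnOcc would also accept a [Person String], since String has an Ord instance too.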

Java Generic Types

Java has a comparable concept called generics. The syntax for defining generic types is pretty clean. We define a type variable in brackets. Then we can use that variable as a type freely throughout the class definition.

public class Person<T> {
    private String firstName;
    private String lastName;
    private String email;
    private int age;
    private T occupation;

    public Person(String fn, String ln, String em, int age, T occ) {
        this.firstName = fn;
        this.lastName = ln;
        this.email = em;
        this.age = age;
        this.occupation = occ;
    }

    public T getOccupation() { return this.occupation; }
    public void setOccupation(T occ) { this.occupation = occ; }
    ...
}

There's a bit of a wart in how we pass constraints. This comes from the Java distinction of interfaces from classes. Normally, when you define a class and state its parent class, you would use the extends keyword. But when your class uses an interface, you use the implements keyword.

But with generic type constraints, you only use extends. You can chain constraints together with &. But if one of the constraints is a class rather than an interface, it must come first.

public class Person<T extends Number & Comparable & Serializable> {

In this example, our template type T must be a subclass of Number. It must then implement the Comparable and Serializable interfaces. If we mix the order up and put an interface before the parent class, it will not compile:

public class Person<T extends Comparable & Number & Serializable> {

C++ Templates

For the first time in this series, we'll reference a little bit of C++ code. C++ has the idea of "template types" which are very much like Java's generics. Here's how we can create our user type as a template:

template <class T>
class Person {
public:
  string firstName;
  string lastName;
  string email;
  int age;
  T occupation;

  bool compareOccupation(const T& other);
};

There's a bit more overhead with C++ though. C++ function implementations are typically defined outside the class definition. Because of this, you need an extra leading line for each of these stating that T is a template. This can get a bit tedious.

template <class T>
bool Person<T>::compareOccupation(const T& other) {
  ...
}

One more thing I'll note from my experience with C++ templates. The error messages from template types can be verbose and difficult to parse. For example, you could forget the template line above. This alone could cause a very confusing message. So there's definitely a learning curve. I've always found Haskell's error messages easier to deal with.

Python - The Wild West!

Since Python is dynamically typed and has no compile-time type checking, there aren't type constraints when you construct an object. Thus, there is no need for type parameters. You can pass whatever object you want to a constructor. Take this example with our user and occupation:

class Person(object):

  # This definition hasn't changed!
  def __init__(self, fn, ln, em, age, occ):
    self.firstName = fn
    self.lastName = ln
    self.email = em
    self.age = age
    self.occupation = occ

stringOcc = "Lawyer"
person1 = Person(
    "Michael",
    "Smith",
    "msmith@gmail.com",
    27,
    stringOcc)

class Occupation(object):
  ...

classOcc = Occupation()

# Still works!
person2 = Person(
  "Katie",
  "Johnson",
  "kjohnson@gmail.com",
  26,
  classOcc)

Of course, with this flexibility comes great danger. If you expect there are different types you might pass for the occupation, your code must handle them all! Without compilation, it can be tricky to know you can do this. So while you can do polymorphic code in Python, you're more limited. You shouldn't get too carried away, because it is more likely to blow up in your face.

Conclusion

Now that we know about parametric types, we have more intuition for the idea of filling in type holes. This will come in handy next week as we look at Haskell's typeclass system for sharing behaviors. We'll compare the object oriented notion of inheritance and Haskell's typeclasses. This distinction gets to the core of why I've come to prefer Haskell as a language. You won't want to miss it!

If these comparisons have intrigued you, you should give Haskell a try! Download our Beginners Checklist to get started!

Why Haskell II: Sum Types

Today, I'm continuing our series on "Why Haskell". We're looking at concepts that are simple to express in Haskell but harder in other languages. Last week we began by looking at simple data declarations. This week, we'll go one step further and look at sum types. That is, we'll consider types with more than one constructor. These allow the same type to represent different kinds of data. They're invaluable in capturing many concepts.

Most of the material in this article is pretty basic. But if you haven't gotten the chance to use Haskell yet, you might want to start from the beginning! Download our Beginners Checklist or read our Liftoff Series!

Haskell Basic Sum Types

Last week we started with a basic Person type like so:

data Person = Person String String String Int String

We can expand this type by adding more constructors to it. Let's imagine our first constructor refers to an adult person. Then we could make a second constructor for a Child. It will have different information attached. For instance, we only care about their first name, age, and what grade they're in:

data Person =
  Adult String String String Int String |
  Child String Int Int

To determine what kind of Person we're dealing with, it's a simple case of pattern matching. So whenever we need to branch, we do this pattern match in a function definition or a case statement!

personAge :: Person -> Int
personAge (Adult _ _ _ a _) = a
personAge (Child _ a _) = a

-- OR

personAge :: Person -> Int
personAge p = case p of
  Adult _ _ _ a _ -> a
  Child _ a _ -> a

On the whole, our definition is very simple! And the approach scales. Adding a third or fourth constructor is just as simple! This extensibility is super attractive when designing types. The ease of this concept was a key point in convincing me about Haskell.
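
As a sketch of how this scales, here's the same type with a third constructor added. The Retiree constructor and its fields (first name, last name, age) are hypothetical, made up only for illustration.

```haskell
data Person
  = Adult String String String Int String
  | Child String Int Int
  | Retiree String String Int -- hypothetical: first name, last name, age

-- One extra line of pattern matching covers the new constructor.
personAge :: Person -> Int
personAge (Adult _ _ _ a _) = a
personAge (Child _ a _) = a
personAge (Retiree _ _ a) = a

ages :: [Int]
ages =
  map personAge
    [ Adult "Michael" "Smith" "msmith@gmail.com" 32 "Lawyer"
    , Child "Mike" 12 7
    , Retiree "Ann" "Lee" 70
    ]
```

If you compile with -Wall, GHC's incomplete-pattern warning can point out any function that forgot to handle the new constructor.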

Record Syntax

Before we move onto other languages, it's worth noting the imperfections with this design. In our type above, it can be a bit confusing what each field represents. We used record syntax in the previous part to ease this pain. We can apply that again on this sum type:

data Person =
  Adult
    { adultFirstName :: String
    , adultLastName :: String
    , adultEmail :: String
    , adultAge :: Int
    , adultOccupation :: String
    } |
  Child
    { childFirstName :: String
    , childAge :: Int
    , childGrade :: Int
    }

This works all right, but it still leaves us with some code smells we don't want in Haskell. In particular, record syntax derives functions for us. Here are a few type signatures of those functions:

adultEmail :: Person -> String
childAge :: Person -> Int
childGrade :: Person -> Int

Unfortunately, these are partial functions. They are only defined for Person elements of the proper constructor. If we call adultEmail on a Child, we'll get an error, and we don't like that. The types appear to match up, but it will crash our program! We can work around this a little by merging field names like adultAge and childAge. But at the end of the day we'll still have some differences in what data we need.

Coding practices can reduce the burden somewhat. For example, it is quite safe to call head on a list if you've already pattern matched that it is non-empty. Likewise, we can use record syntax functions if we're in a "post-pattern-match" situation. But we would need to ignore them otherwise! And this is a rule we would like to avoid in Haskell.
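
Here's a rough sketch of what those practices could look like. The helper names safeAdultEmail and personFirstName are assumptions for illustration. The first wraps a partial field in Maybe, and the second only calls record functions after matching on the constructor.

```haskell
data Person
  = Adult
      { adultFirstName :: String
      , adultLastName :: String
      , adultEmail :: String
      , adultAge :: Int
      , adultOccupation :: String
      }
  | Child
      { childFirstName :: String
      , childAge :: Int
      , childGrade :: Int
      }

-- Total alternative to adultEmail: a Child yields Nothing instead of a crash.
safeAdultEmail :: Person -> Maybe String
safeAdultEmail Adult { adultEmail = em } = Just em
safeAdultEmail Child {} = Nothing

-- "Post-pattern-match" use: each record function is only called
-- in the branch where its constructor is already known.
personFirstName :: Person -> String
personFirstName p = case p of
  Adult {} -> adultFirstName p
  Child {} -> childFirstName p

exampleAdult :: Person
exampleAdult = Adult "Michael" "Smith" "msmith@gmail.com" 25 "Lawyer"

exampleChild :: Person
exampleChild = Child "Mike" 12 7
```

With these helpers, calling safeAdultEmail on exampleChild gives Nothing rather than a runtime error. The raw adultEmail function is still in scope though, so this remains a convention rather than a guarantee.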

Java Approach I: Multiple Constructors

Now let's try to replicate the idea of sum types in other languages. It's a little tricky. Here's a first approach we can do in Java. We could set a flag on our type indicating whether it's an Adult or a Child. Then we'll have all the different fields within our type. Note we'll use public fields without getters and setters for the sake of simplicity. Like Haskell, Java allows us to use two different constructors for our type:

public class Person {
  public boolean isAdult;
  public String adultFirstName;
  public String adultLastName;
  public String adultEmail;
  public int adultAge;
  public String adultOccupation;
  public String childFirstName;
  public int childAge;
  public int childGrade;

  // Adult Constructor
  public Person(String fn, String ln, String em, int age, String occ) {
    this.isAdult = true;
    this.adultFirstName = fn;
    ...
  }

  // Child Constructor
  public Person(String fn, int age, int grade) {
    this.isAdult = false;
    this.childFirstName = fn;
    ...
  }
}

We can see that there's a big amount of bloat on the field values, even if we were to combine common ones like age. Then we'll have more awkwardness when writing functions that have to pattern match. Each function within the type will involve a check on the boolean flag. And these checks might also percolate to outer calls as well.

public class Person {
  ...
  public String getFullName() {
    if (this.isAdult) {
      // Adult Code
    } else {
      // Child Code
    }
  }
}

This approach is harder to scale to more constructors. We would need an enumerated type rather than a boolean for the "flag" value. And it would add more conditions to each of our functions. This approach is cumbersome. It's also very unidiomatic Java code. The more "proper" way involves using inheritance.

Java Approach II: Inheritance

Inheritance is a way of sharing code between types in an object oriented language. For this example, we would make Person a "superclass" of separate Adult and Child classes. We would have separate class declarations for each of them. The Person class would share all the common information. Then the child classes would have code specific to them.

public class Person {
  public String firstName;
  public int age;

  public Person(String fn, int age) {
    this.firstName = fn;
    this.age = age;
  }
}

// NOTICE: extends Person
public class Adult extends Person {
  public String lastName;
  public String email;
  public String occupation;

  public Adult(String fn, String ln, String em, int age, String occ) {
    // super calls the "Person" constructor
    super(fn, age);
    this.lastName = ln;
    this.email = em;
    this.occupation = occ;
  }
}

// NOTICE: extends Person
public class Child extends Person {
  public int grade;

  public Child(String fn, int age, int grade) {
    // super calls the "Person" constructor
    super(fn, age);
    this.grade = grade;
  }
}

By extending the Person type, each of our subclasses gets access to the firstName and age fields. There's a big upside we get here that Haskell doesn't usually have. In this case, we've encoded the constructor we used with the type. We'll be passing around Adult and Child objects for the most part. This saves a lot of the partial function problems we encounter in Haskell.

We will, on occasion, combine these in a form where we need to do pattern matching. For example, we can make an array of Person objects. Then at some point we'll need to determine which have type Adult and which have type Child. This is possible by using the instanceof operator in Java. But again, it's unidiomatic and we should strive to avoid it. Still, inheritance represents a big improvement over our first approach.

Python: Only One Constructor!

Unlike Java, Python only allows a single constructor for each type. The way we would control what "type" we make is by passing a certain set of arguments. We then provide None default values for the rest. Here's what it might look like.

class Person(object):
  def __init__(self,
               fn = None,
               ln = None,
               em = None,
               age = None,
               occ = None,
               grade = None):
    if fn and ln and em and age and occ:
      self.isAdult = True
      self.firstName = fn
      self.lastName = ln
      self.email = em
      self.age = age
      self.occupation = occ
      self.grade = None
    elif fn and age and grade:
      self.isAdult = False
      self.firstName = fn
      self.age = age
      self.grade = grade
      self.lastName = None
      self.email = None
      self.occupation = None
    else:
      raise ValueError("Failed to construct a Person!")

# Note which arguments we use!
adult = Person(fn="Michael", ln="Smith", em="msmith@gmail.com", age=25, occ="Lawyer")
child = Person(fn="Mike", age=12, grade=7)

But there's a lot of messiness here! A lot of input combinations lead to errors! Because of this, the inheritance approach we proposed for Java is also the best way to go for Python. Again though, Python lacks pattern matching across different types of classes. This means we'll have more if statements like if isinstance(x, Adult). In fact, these will be even more prevalent in Python, as type information isn't attached.

Comparisons

Once again, we see certain themes arising. Haskell has a clean, simple syntax for this concept. It isn't without its difficulties, but it gets the job done if we're careful. Java gives us a couple ways to manage the issue of sum types. One is cumbersome and unidiomatic. The other is more idiomatic, but presents other issues as we'll see later. Then Python gives us a great deal of flexibility but few guarantees about anything's type. The result is that we can get a lot of errors.

Conclusion

This week, we continued our look at the simplicity of constructing types in Haskell. We saw how a first try at replicating the concept of sum types in other languages leads to awkward code. In a couple weeks, we'll dig deeper into the concept of inheritance. It offers a decent way to accomplish our task in Java and Python. And yet, there's a reason we don't have it in Haskell. But first up, our next article will look at the idea of parametric types. We'll see again that it is simpler to do this in Haskell's syntax than other languages. We'll need those ideas to help us explore inheritance later.

If this series makes you want to try Haskell more, it's time to get going! Download our Beginner's Checklist for some tips and tools on starting out! Or read our Liftoff Series for a more in depth look at Haskell basics.

Why Haskell I: Simple Data Types!

I first learned about Haskell in college. I've considered why I kept up with Haskell after, even when I didn't know about its uses in industry. I realized there were a few key elements that drew me to it.

In a word, Haskell is elegant. For me, this means we can express simple concepts in simple terms. In the next few weeks, we're going to look at some of these concepts. We'll see that Haskell expresses a lot of ideas in simple terms that other languages express in more complicated terms. This week, we'll start by looking at simple data declarations.

If you've never used Haskell, now is the perfect time to start! For a quick start guide, download our Beginners Checklist. For a more in-depth walkthrough, read our Liftoff Series!

Haskell Data Declarations

This week, we'll be comparing a data type with a single constructor across a few different languages. Next week, we'll look at multi-constructor types. So let's examine a simple type declaration:

data Person = Person String String String Int String

Our declaration is very simple, and fits on one line. There's a single constructor with a few different fields attached to it. We know exactly what the types of those fields are, so we can build the object. The only way we can declare a Person is to provide all the right information in order.

firstPerson :: Person
firstPerson = Person "Michael" "Smith" "msmith@gmail.com" 32 "Lawyer"

If we provide any less information, we won't have a Person! We can leave off the last argument. But then the resulting type reflects that we still need that field to complete our item:

incomplete :: String -> Person
incomplete = Person "Michael" "Smith" "msmith@gmail.com" 32

Now, our type declaration is admittedly confusing. We don't know what each field means at all when looking at it. And it would be easy to mix things up. But we can fix that in Haskell with record syntax, which assigns a name to each field.

data Person = Person
  { personFirstName :: String
  , personLastName :: String
  , personEmail :: String
  , personAge :: Int
  , personOccupation :: String
  }

We can use these names as functions to retrieve the specific fields out of the data item later.

fullName :: Person -> String
fullName person = personFirstName person ++ " "
  ++ personLastName person
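
Record syntax also lets us name the fields when we build a value. Here's a minimal sketch; the names namedPerson and positionalPerson and the deriving clause are assumptions added for the example.

```haskell
data Person = Person
  { personFirstName :: String
  , personLastName :: String
  , personEmail :: String
  , personAge :: Int
  , personOccupation :: String
  } deriving (Show, Eq)

-- With record syntax we can label each field at the construction site,
-- and the order of the labels no longer matters.
namedPerson :: Person
namedPerson = Person
  { personLastName = "Smith"
  , personFirstName = "Michael"
  , personAge = 32
  , personEmail = "msmith@gmail.com"
  , personOccupation = "Lawyer"
  }

-- The positional form still works and builds an equal value.
positionalPerson :: Person
positionalPerson = Person "Michael" "Smith" "msmith@gmail.com" 32 "Lawyer"

fullName :: Person -> String
fullName person = personFirstName person ++ " "
  ++ personLastName person
```

The labeled form makes it much harder to swap, say, the email and the occupation by accident.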

And that's the basics of data types in Haskell! Let's take a look at this same type declaration in a couple other languages.

Java

If we wanted to express this in the simplest possible Java form, we'd do so like this:

public class Person {
  public String firstName;
  public String lastName;
  public String email;
  public int age;
  public String occupation;
}

Now, this definition isn't much longer than the Haskell definition. It isn't a very useful definition as written though! We can only initialize it with a default constructor Person(). And then we have to assign all the fields ourselves! So let's fix this with a constructor:

public class Person {
    public String firstName;
    public String lastName;
    public String email;
    public int age;
    public String occupation;

    public Person(String fn,
                  String ln, 
                  String em, 
                  int age, 
                  String occ) {
        this.firstName = fn;
        this.lastName = ln;
        this.email = em;
        this.age = age;
        this.occupation = occ;
    }
}

Now we can initialize it in a sensible way. But this still isn't idiomatic Java. Normally, we would have our instance variables declared as private, not public. Then we would expose the ones we wanted via "getter" and "setter" methods. If we did this for all our fields, it would bloat the class quite a bit. In general though, you wouldn't have arbitrary setters for all your fields. Here's our code with getters and one setter.

public class Person {
    private String firstName;
    private String lastName;
    private String email;
    private int age;
    private String occupation;

    public Person(String fn, 
                  String ln, 
                  String em,
                  int age,
                  String occ) {
        this.firstName = fn;
        this.lastName = ln;
        this.email = em;
        this.age = age;
        this.occupation = occ;
    }

    public String getFirstName() { return this.firstName; }
    public String getLastName() { return this.lastName; }
    public String getEmail() { return this.email; }
    public int getAge() { return this.age; }
    public String getOccupation() { return this.occupation; }

    public void setOccupation(String occ) { this.occupation = occ; }
}

Now we've got code that is both complete and idiomatic Java.

Public and Private

We can see that the lack of a public/private distinction in Haskell saves us a lot of grief in defining our types. Why don't we need this distinction?

In general, we'll declare our data types so that constructors and fields are all visible. After all, data objects should contain data. And this data is usually only useful if we expose it to the outside world. But remember, it's only exposed as read-only, because our objects are immutable! We'd have to construct another object if we want to "mutate" an existing item (IO monad aside).
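
Here's a short sketch of what "constructing another object" looks like in practice. It uses record update syntax on the record version of our type; changeOccupation, original and updated are hypothetical names.

```haskell
data Person = Person
  { personFirstName :: String
  , personLastName :: String
  , personEmail :: String
  , personAge :: Int
  , personOccupation :: String
  } deriving (Show, Eq)

original :: Person
original = Person "Michael" "Smith" "msmith@gmail.com" 32 "Lawyer"

-- Record update syntax builds a NEW Person with one field changed.
-- The value we started from is left untouched.
changeOccupation :: String -> Person -> Person
changeOccupation occ person = person { personOccupation = occ }

updated :: Person
updated = changeOccupation "Judge" original
```

After this, personOccupation original is still "Lawyer". Only updated carries the new occupation.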

The other thing to note is we don't consider functions as a part of our data type in the same way Java (or C++) does. A function is a function whether we define it along with our type or not. So we separate them syntactically from our type, which also contributes to conciseness.

Of course, we do have some notion of public and private items in Haskell. Instead of using the type definition, we handle it with our module definitions. For instance, we might abstract constructors behind other functions. This allows extra features like validation checks. Here's how we can define our person type but hide its true constructor:

module Person (Person, mkPerson) where

-- We do NOT export the `Person` constructor!
--
-- To do that, we would use:
-- module Person (Person(Person)) where
--   OR
-- module Person (Person(..)) where

data Person = Person String String String Int String

mkPerson :: String -> String -> String -> Int -> String
  -> Either String Person
mkPerson = ...

Now anyone who uses our code has to use the mkPerson function. This lets us return an error if something is wrong!
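
The body of mkPerson is left open above. As one possible sketch, here's a version with two validation rules. The rules themselves (a non-negative age and an '@' in the email) are assumptions made up for this example.

```haskell
-- In a real project this file would begin with the export list from above:
--   module Person (Person, mkPerson) where

data Person = Person String String String Int String
  deriving (Show, Eq)

-- Hypothetical validation rules, chosen only for illustration:
-- the age must be non-negative and the email must contain an '@'.
mkPerson :: String -> String -> String -> Int -> String
  -> Either String Person
mkPerson fn ln em age occ
  | age < 0 = Left "Age must be non-negative"
  | '@' `notElem` em = Left "Email must contain '@'"
  | otherwise = Right (Person fn ln em age occ)
```

Callers now have to handle the Left case before they can get at a Person, so an invalid value never escapes the module.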

Python

As our last example in this article, here's a simple Python version of our data type.

class Person(object):

  def __init__(self, fn, ln, em, age, occ):
    self.firstName = fn
    self.lastName = ln
    self.email = em
    self.age = age
    self.occupation = occ

This definition is pretty compact. We can add functions to this class, or define them outside and pass the class as another variable. It's not as clean as Haskell, but much shorter than Java.

Now, Python has no notion of private member variables. Conventions exist, like using an underscore in front of "private" variable names. But you can't restrict their usage outside of your file, even through imports! This helps keep the type definition smaller. But it does make Python a little less flexible than other languages.

What Python does have is more flexibility in argument ordering. We can name our arguments as follows, allowing us to change the order we use to initialize our type. We can also include default arguments (like None).

class Person(object):

  def __init__(self, fn=None, ln=None, em=None, age=None, occ=None):
    self.firstName = fn
    self.lastName = ln
    self.email = em
    self.age = age
    self.occupation = occ

# This person won't have a first name!
myPerson = Person(
             ln="Smith",
             age=25,
             em="msmith@gmail.com",
             occ="Lawyer")

This gives more flexibility. We can initialize our object in many more ways. But it's also a bit dangerous. Now we don't necessarily know which fields are null when using our object. This can cause a lot of problems later. We'll explore this theme throughout this series when looking at Python data types and code.
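
For contrast, here's a rough sketch of how we might model an optional first name in Haskell. This trimmed-down Person and the greeting function are hypothetical. The point is only that the possibility of a missing value shows up in the type.

```haskell
-- Hypothetical variant of our type where the first name is optional.
data Person = Person
  { personFirstName :: Maybe String
  , personLastName :: String
  } deriving (Show, Eq)

-- We must handle the missing case before we can use the first name.
greeting :: Person -> String
greeting person = case personFirstName person of
  Just first -> "Hello, " ++ first ++ " " ++ personLastName person
  Nothing -> "Hello, Mr./Ms. " ++ personLastName person
```

Fields that aren't wrapped in Maybe, like personLastName here, can't be silently left out.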

Javascript

We'll be making more references to Python throughout this series as we explore its syntax. Most of the observations we make about Python apply equally well to Javascript. In general, Javascript offers us flexibility in constructing objects. For instance, we can even extend objects with new fields once they're created. Javascript even naturalizes the concept of extending objects with functions. (This is possible in Python, but not as idiomatic).

A result of this though is that we have no guarantees about which of our objects have which fields. We won't know for sure we'll get a good value from calling any given property. Even basic computations in Javascript can give results like NaN or undefined. In Haskell you can end up with undefined, but pretty much only if you assign that value yourself! And in Haskell, we're likely to see an immediate termination of the program if that happens. Javascript might percolate these bad values far up the stack. These can lead to strange computations elsewhere that don't crash our program but give weird output instead!

But the specifics of Javascript can change a lot with the framework you happen to be using. So we won't cite too many code examples in this series. Remember though, most of the observations we make with Python will apply.

Conclusion

So after comparing these methods, I much prefer using Haskell's way of defining data. It's clean, and quick. We can associate functions with our type or not, and we can make fields private if we want. And that's just in the one-constructor case! We'll see how things get even more hairy for other languages when we add more constructors! Come back next week to see how things stack up!

If you've never programmed in Haskell, hopefully this series shows you why it's actually not too hard! Read our Liftoff Series or download our Beginners Checklist to get started!

Deeper Stack Knowledge


This week we'll look at another way we can "level up" our Haskell skills. We'll look at some of the details around how Stack determines package versions. This will help us explain the nuances of when you need "extra deps" and why. We'll also explore some ways to bring in non-standard Haskell code.

But of course you need to know the basics before you can really start going! So if you've never used Stack before, take our free Stack mini-course!

Adding Libraries (Basics)

When I'm writing a small project for one of these articles, I don't have to think much about library versions. Generally, anything recent is fine. Let's take a simple example using Servant. I can start the project with stack new ServantExample. Then I can add servant as a dependency for my library by modifying the .cabal file:

build-depends:
  servant

When we run stack build, it'll install a whole bunch of dependencies for our project. We can do stack ls dependencies (or stack list-dependencies if you're on an older version of Stack). We'll see a list with many libraries, because servant has a lot of dependencies. In this article, we'll explore a few questions. How does Stack know which versions of these to get? Is it always finding the latest version? What happens if we need a different version? Do we ever need to look elsewhere?

First, if we want, we can specify manual constraints on libraries within the .cabal file. To start, we can observe that Stack has downloaded servant-0.14.1 and directory-1.3.1.5 for our program. What happens if we add constraints like so:

build-depends:
  servant >= 0.14.1,
  directory <= 1.2.0.0

We'll find that we can't build, because no version of directory matches our constraints. The error message will suggest adding it to extra dependencies in stack.yaml (we'll talk about this later). But this will cause dependency conflicts. So how do we avoid these conflicts? To understand this, let's examine the concept of resolvers.

Resolvers

Resolvers are one of the big things that separate Stack from Cabal on its own. A resolver is a set of package versions that have no conflicts among their dependencies. Each resolver has a version number.

If we go into our stack.yaml file, we'll see that we have a field that relates to the lts version number of our resolver. When we invoked the stack new command, this chose the latest lts, or "Long-Term Support" resolver. If there's an issue with this resolver, we can ask the great people at Stackage what's going wrong. At the time of writing, this version was 12.9:

resolver: lts-12.9

There are other kinds of resolvers we can use, as the comments in our auto-generated file will tell us. There are nightly builds, and resolvers that map to particular versions of GHC.

But let's stick to the idea of lts resolvers for now. A resolver gives us a big set of packages that work together and have no dependency conflicts. This prevents some of the more annoying issues that can come along when we try to have a lot of libraries.

For lts resolvers, the package directory lives on Stackage, and we can examine it if we like. We can see, for instance, on the site that there's a page dedicated to listing everything for lts-12.9. And we can compare the library versions on this site to what we've already got in our directory. And we'll see they're the same! For example, it lists version 0.14.1 of servant and version 1.3.1.5 of directory.

So when we write our cabal file, we don't need to list version constraints on our packages. Stack will find the matching version from the resolver. Then we'll know that we can meet dependency constraints with other packages there!

Resolvers and GHC

Now, our resolver has to work with whatever compiler we use. Each lts resolver links to a specific GHC version. We can't use our lts-12.9 resolver with GHC 7.10.3, because many of the library versions it links to only work for GHC 8+. If we are intent on using an older version of GHC, we'll have to use a resolver version that corresponds to it.

We can also get this kind of information by going onto the Stackage website. Let's look up GHC 7.10.3, and we'll find that the last resolver for that was lts-6.35. Given this information, we can then set the resolver in our stack.yaml file. We'll then run stack build again. We'll find it uses different versions of certain packages for servant! For instance, servant itself is now version 0.7.1. Meanwhile the dependency on directory is gone entirely!

Extra Dependencies

Now let's suppose we don't want to write our program using the Servant library. Let's suppose we want to use Spock instead, like we did recently. When we first try to add Spock as a dependency in our .cabal file, we'll actually get an error. It looks like this:

In the dependencies for SpockExample-0.1.0.0:
    Spock needed, but the stack configuration has no specified version (latest matching version is 0.13.0.0)

Stack then recommends we add Spock as an extra dependency in stack.yaml. Why do we need to do this? (Get ready for some rhyming). We can't depend on Stackage to contain every package that lives on Hackage. After all, pretty much anyone can publish to Hackage! As you add more libraries, it's more work to ensure they are conflict free.

Often, updates will introduce new conflicts. And often, the original library's authors are no longer maintaining the package. This means they won't release an update to fix it. Thus, the package gets dropped from the latest resolver. And many packages aren't used enough to justify the effort of keeping them in the resolver.

But this is OK! We can still introduce Spock into our Stack program. We'll go to our stack.yaml file and add it under our extra-deps part of the file:

extra-deps:
- Spock-0.13.0.0

This however, leads us to more dependencies we must add:

- Spock-core-0.13.0.0
- reroute-0.5.0.0

After adding these to our extra-deps, everything will build!

Unfortunately, it can be an arduous process to slog through ALL the extra deps you need as a result of one library. You can use the stack solver command to list them all. If you add the --update-config flag, it will even add them to your file for you! At the time of writing though, there seems to be a bug in this feature, as it fails whenever I try to use it on Spock.

Be warned though. Extra dependencies have no guards against conflicts. Packages within the resolver still won't conflict. But every new extra package you introduce brings some more risk. Sometimes you'll need to play version tetris to get things to work. Sometimes you may need to try a different library altogether.

Different Kinds of Packages

Changing gears a bit, the stack.yaml file allows you to specify different packages within your project. Each package is a self-contained Haskell unit containing its own .cabal file. The auto-generated stack.yaml always has this simple packages section:

packages:
- .

One option for what to use as a package is a local directory, relative to wherever the stack.yaml file lives. So the default is the directory itself. But at a certain point, it might make sense for you to break your project into more pieces. This way, they can be independently maintained and tested. You might have sub-packages that look a bit like this:

packages:
- './my-project-core'
- './my-project-server'
- './my-project-client'
- './my-project-db'

You can use other options besides local directories as well. If you have a package stored on a remote server as a tar file, you can reference that:

packages:
- '.'
- https://mysite.com/my-project-client-1.0.0.tar.gz

Stack will download the code as necessary and build it. The other common option you'll use is a Github repository. You'll often want to reference a specific commit hash to use. Here's what that would look like:

packages:
- '.'
- location:
    git: https://github.com/my-user/my-project.git
    commit: b7deadc0def7128384

This technique is especially useful when you need to fix bugs on a dependency. The normal release process on a library can take a long time. And the library's maintainers might not have time to review your fix. But you can supply your own code and then reference it through Github. Say you want to fix something in Servant. You can make your own fork of the repository, fix the bug, and use that as a package in your project:

packages:
- '.'
- location:
    git: https://github.com/jhb563/servant.git
    commit: b7deadc0def7128384

Other Fields

That covers most of what you'll want to do in the Stack file. There are other fields. For instance, the flags field allows you to override certain build flags for packages. Here's an example covered in the docs. The yackage package typically builds with the flag upload. If you're using it as a package or a dependency, you can set this flag in the stack.yaml file:

flags:
  yackage:
    upload: true

But if you want to set it to false, you can do this as well by flipping the flag there.

You can also use the extra-package-dbs field. This is necessary if you need a specialized set of libraries that aren't on Hackage. You can create your own local database if you like and store modified versions of packages there. This feature is pretty advanced so it's unlikely most of you will need it.

Conclusion

Using Stack is easy at a basic level. For starter projects, you probably won't have to change the stack.yaml file much at all. At most you'll add a couple extra dependencies. But as you make more complicated things, you'll need some extra features. You'll need to know how Stack resolves conflicts and how you can bring in code from different places. These small extra features are important to your growth as a Haskell developer.

If you've never learned the basics of Stack, you're in luck! You can take our free Stack mini-course! If you've never learned Haskell at all, now's the time to start! Download our Beginners Checklist to start your journey!

Common (But not so Common) Monads


Last week we looked at how monads can help you make the next jump in your Haskell development. We went over the runXXXT pattern and how it’s a common gateway for us to use certain monads from the rest of our code. But sometimes it also helps to go back to the basics. I actually went a long time without really grasping how to use a couple basic monads. Or at the very least, I didn’t understand how to use them as monads.

In this article, we’ll look at how to use the list monad and the function monad. Lists and functions are core concepts that any Haskeller learns from the get-go. But the list data structure and function application are also monads! And understanding how they work as such can teach us more about how monads work.

For an in-depth discussion of monads, check out our Functional Data Structures Series!

The General Pattern of Do Syntax

Using do syntax is one of the keys to understanding how to actually use monads. The bind operator makes it hard to track where your arguments are. Do syntax keeps the structure clean and allows you to pass results with ease. Let’s see how this works with IO, the first monad a lot of Haskellers learn. Here’s an example where we read the second line from a file:

import System.IO

readLineFromFile :: IO String
readLineFromFile = do
  handle <- openFile "myFile.txt" ReadMode
  -- Read and discard the first line.
  _firstLine <- hGetLine handle
  secondLine <- hGetLine handle
  _ <- hClose handle
  return secondLine

By keeping in mind the type signatures of all the IO functions, we can start to see the general pattern of do syntax. Let’s replace each expression with its type:

openFile :: FilePath -> IOMode -> IO Handle
hGetLine :: Handle -> IO String
hClose :: Handle -> IO ()
return :: a -> IO a

readLineFromFile :: IO String
readLineFromFile = do
  (Handle) <- (IO Handle)
  (String) <- (IO String)
  (String) <- (IO String)
  () <- (IO ())
  IO String

Every line in a do expression (except the last) uses the assignment operator <-. Then it has an expression of IO a on the right side, which it assigns to a value of a on the left side. The last line’s type then matches the final return value of this function. What’s important now is to recognize that we can generalize this structure to ANY monad:

monadicFunction :: m c
monadicFunction = do
  (_ :: a) <- (_ :: m a)
  (_ :: b) <- (_ :: m b)
  (_ :: m c)

So for example, if we have a function in the Maybe monad, we can use it and plug that in for m above:

myMaybeFunction :: a -> Maybe a

monadicMaybe :: a -> Maybe a
monadicMaybe x = do
  (y :: a) <- myMaybeFunction x
  (z :: a) <- myMaybeFunction y
  (Just z :: Maybe a)

The important thing to remember is that a monad captures a computational context. For IO, this context is that the computation might interact with the terminal or network. For Maybe, the context is that the computation might fail.
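
To see the Maybe context at work, here's a small runnable sketch. The function halveEven is a hypothetical stand-in for myMaybeFunction, and halveTwice' shows the same logic written with the bind operator instead of do syntax.

```haskell
-- A hypothetical function in the Maybe monad: halving fails on odd numbers.
halveEven :: Int -> Maybe Int
halveEven x
  | even x = Just (x `div` 2)
  | otherwise = Nothing

-- Do syntax: each line binds the result of a Maybe computation.
-- If any step gives Nothing, the whole function gives Nothing.
halveTwice :: Int -> Maybe Int
halveTwice x = do
  y <- halveEven x
  z <- halveEven y
  return z

-- The same function written with the bind operator directly.
halveTwice' :: Int -> Maybe Int
halveTwice' x = halveEven x >>= \y -> halveEven y >>= \z -> return z
```

Here halveTwice 12 gives Just 3, while halveTwice 6 gives Nothing, because the second halving receives the odd number 3.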

The List Monad

Now to grasp the list monad, we need to know its computational context. We can view any function returning a list as non-deterministic. It could have many different values. So if we chain these computations, our final result is every possible combination. That is, our first computation could return a list of values. Then we want to check what we get with each of these different results as an input to the next function. And then we’ll take all those results. And so on.

To see this, let’s imagine we have a game. We can start that game with a particular number x. On each turn, we can either subtract one, add one, or keep the number the same. We want to know all the possible results after 5 turns, and the distribution of the possibilities. So we start by writing our non-deterministic function. It takes a single input and returns the possible game outputs:

runTurn :: Int -> [Int]
runTurn x = [x - 1, x, x + 1]

Here’s how we’d apply this to our 5-turn game. We’ll add the type signatures so you can see the monadic structure:

runGame :: Int -> [Int]
runGame x = do
  (m1 :: Int) <- (runTurn x :: [Int])
  (m2 :: Int) <- (runTurn m1 :: [Int])
  (m3 :: Int) <- (runTurn m2 :: [Int])
  (m4 :: Int) <- (runTurn m3 :: [Int])
  (m5 :: Int) <- (runTurn m4 :: [Int])
  return m5

On the right side, every expression has type [Int]. Then on the left side, we get our Int out. So each of the m expressions represents one of the many solutions we'll get from runTurn. Then we run the rest of the function imagining we’re only using one of them. In reality though, we’ll run them all, because of how the list monad defines its bind operator. This mental jump is a little tricky. And it’s often more intuitive to just stick to using where expressions when we do list computations. But it's cool to see patterns like this pop up in unexpected places.
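
We also said we wanted the distribution of the possibilities. Here's a sketch of one way to compute it. The distribution helper and the use of Data.Map from the containers package are assumptions for this example. The inline annotations are dropped because, depending on your GHC setup, they may need the ScopedTypeVariables extension.

```haskell
import qualified Data.Map as Map

runTurn :: Int -> [Int]
runTurn x = [x - 1, x, x + 1]

-- Same game as above, without the inline type annotations.
runGame :: Int -> [Int]
runGame x = do
  m1 <- runTurn x
  m2 <- runTurn m1
  m3 <- runTurn m2
  m4 <- runTurn m3
  runTurn m4

-- The same chain written with the bind operator. For lists, bind applies
-- the function to every element and concatenates the results.
runGame' :: Int -> [Int]
runGame' x = runTurn x >>= runTurn >>= runTurn >>= runTurn >>= runTurn

-- Count how often each final value occurs.
-- (distribution is a hypothetical helper.)
distribution :: [Int] -> Map.Map Int Int
distribution results = Map.fromListWith (+) [(r, 1) | r <- results]
```

Starting from 0, the game has 3^5 = 243 outcomes in total. Of those, 51 end back at 0 and only one reaches 5.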

The Function Monad

The function monad is another one I struggled to understand for a while. In some ways, it's the same as the Reader monad. It encapsulates the context of having a single argument we can pass to different functions. But it’s not defined in the same way as Reader. When I tried to grok the definition, it didn’t make much sense to me:

instance Monad ((->) r) where
  return x = \_ -> x
  h >>= f = \w -> f (h w) w

The return definition makes sense. We’ll have a function that takes some argument, ignore that argument, and give the value as an output. The bind operator is a little more complicated. When we bind two functions together, we’ll get a new function that takes some argument w. We’ll apply our first function to that argument (h w). Then we’ll pass the result of that to f, and THEN also pass the argument w again. It’s a little hard to follow.
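
It can help to step through the bind definition by hand. In this sketch, returnFn and bindFn are hypothetical standalone copies of the instance methods, renamed so they don't clash with the real instance.

```haskell
-- Standalone copies of the instance methods from above.
returnFn :: a -> (r -> a)
returnFn x = \_ -> x

bindFn :: (r -> a) -> (a -> r -> b) -> (r -> b)
bindFn h f = \w -> f (h w) w

-- h adds one to the shared argument.
-- f multiplies h's result by the shared argument.
addThenMultiply :: Int -> Int
addThenMultiply = (1 +) `bindFn` (\x -> \w -> x * w)

-- Step by step for the argument 3:
--   addThenMultiply 3
--   = (\x -> \w -> x * w) ((1 +) 3) 3
--   = (\w -> 4 * w) 3
--   = 12
```

Both functions receive the same argument 3, which is exactly the "shared input" context of this monad.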

But let’s think about this in the context of do syntax. Every expression on the right side will be a function that takes our type as its only argument.

myFunctionMonad :: a -> (b, c, d)
myFunctionMonad = do
  x <- (_ :: a -> b)
  y <- (_ :: a -> c)
  z <- (_ :: a -> d)
  return (x, y, z)

Now let’s imagine we’ll pass an Int and use a few different functions that can take an Int. Here’s what we’ll get:

myFunctionMonad :: Int -> (Int, Int, String)
myFunctionMonad = do
  x <- (1 +)
  y <- (2 *)
  z <- show
  return (x, y, z)

And now we have valid do syntax! So what happens when we run this function? We’ll call our different functions on the same input.

>> myFunctionMonad 3
(4,6,"3")
>> myFunctionMonad (-1)
(0,-2,"-1")

When we pass 3 in the first example, we add 1 to it on the first line, multiply it by 2 on the second line, and show it on the third line. And we do this all without explicitly stating the argument! The tricky part is that all your functions have to take the input argument as their last argument. So you might have to do a little bit of argument flipping.
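
Here's a sketch that spells out what the do block does, along with one example of argument flipping. The names explicit and withFlip are hypothetical.

```haskell
myFunctionMonad :: Int -> (Int, Int, String)
myFunctionMonad = do
  x <- (1 +)
  y <- (2 *)
  z <- show
  return (x, y, z)

-- The same function with the shared argument written out by hand.
explicit :: Int -> (Int, Int, String)
explicit w = (1 + w, 2 * w, show w)

-- replicate takes its Int FIRST (replicate :: Int -> a -> [a]).
-- We flip it so the shared Int becomes the last argument.
withFlip :: Int -> (Int, String)
withFlip = do
  doubled <- (2 *)
  stars <- flip replicate '*'
  return (doubled, stars)
```

Calling withFlip 3 doubles the input and also uses it as the repeat count, giving (6,"***").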

Conclusion

In this article we explored lists and functions, two of the most common concepts in Haskell. We generally don’t use these as monads. But we saw how they still fit into the monadic structure. We can use them in do-syntax, and follow the patterns we already know to make things work.

Perhaps you’ve tried to learn Haskell before but found monads a little too complex. Hopefully this article helped clarify the structure of monads. If you want to get your Haskell journey back under way, download our Beginners Checklist! Or to learn monads from the ground up, read our series on Functional Data Structures!

Making the Jump II: Using More Monads


A few weeks ago, we addressed some important steps to advance past the "beginner" stage of Haskell. We learned how to organize your project and how to find the relevant documentation. This week we’re going to continue to look at another place where we can make a big step up. We’ll explore how to expand our vocabulary on monad usage.

Monads are a vital component of Haskell. You can’t use a lot of libraries unless you know how to incorporate their monadic functions. These functions often involve a monad that is custom to that library. When you’re first starting out, it can be hard to know how to incorporate these monads into the rest of your program.

In this article, we’ll focus on a specific pattern a lot of monads and libraries use. I call this pattern the “run” pattern. Often, you’ll use a function with a name like runXXX or runXXXT, where XXX is the name of the monad. These functions will always take a monadic expression as their first argument. Then they'll also take some other initialization information, and finally return some output. This output can either be in pure form or a different monad you’re already using like IO. We’ll start by seeing how this works with the State monad, and then move onto some other libraries.

Once you grasp this topic, it seems very simple. But a lot of us first learned monads with a bad mental model. For instance, the first thing I learned about monads was that they had side effects. And thus, you can only call them from places that have the same side effects. This applies to IO but doesn’t generalize to other monads. So even though it seems obvious now, I struggled to learn this idea at first. But let's start looking at some examples of this pattern.

For a more in depth look at monads, check out our series on Functional Data Structures! We start by learning about simpler things like functors. Then we eventually work our way up to monads and even monad transformers!

The Basics of “Run”: The State Monad

Let’s start by recalling the State monad. This monad has a single type parameter, and we can access this type as a global read/write state. Here’s an example function written in the State monad:

stateExample :: State Int (Int, Int, Int)
stateExample = do
  a <- get
  modify (+1)
  b <- get
  put 5
  c <- get
  return (a, b, c)

If this function is confusing, you should take a look at the documentation for State. It’ll at least show you the relevant type signatures. First we read the initial state. Then we modify it with some function. Finally we completely change it.

In the example above, if our initial state is 1, we’ll return (1,2,5) as the result. If the initial state is 2, we’ll return (2,3,5). But suppose we have a pure function. How do we call our state function?

pureFunction :: Int -> Int
pureFunction = ???

The answer is the runState function. We can check the documentation and find its type:

runState :: State s a -> s -> (a, s)

This function has two parameters. The first is a State action. We’ll pass our function above as this parameter! Then the second is the initial state, and this is how we’ll configure it. Then the result is pure. It contains our result, as well as the final value of the state. So here’s a sample call we can make that gives us this monadic expression in our pure function. We’ll call it from a where clause, and discard the final state:

pureFunction :: Int -> Int
pureFunction input = a + b + c
  where
    ((a,b,c), _) = runState stateExample input

This is the simplest example of how we can use the runXXX pattern.
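
If we only want one half of that tuple, there are two close relatives of runState. This sketch assumes the Control.Monad.State module from the mtl package; the names both, resultOnly and finalStateOnly are hypothetical.

```haskell
import Control.Monad.State
  (State, evalState, execState, get, modify, put, runState)

stateExample :: State Int (Int, Int, Int)
stateExample = do
  a <- get
  modify (+1)
  b <- get
  put 5
  c <- get
  return (a, b, c)

-- runState returns both the result and the final state.
both :: ((Int, Int, Int), Int)
both = runState stateExample 1

-- evalState keeps only the result.
resultOnly :: (Int, Int, Int)
resultOnly = evalState stateExample 1

-- execState keeps only the final state.
finalStateOnly :: Int
finalStateOnly = execState stateExample 1
```

With an initial state of 1, the result is (1,2,5) and the final state is 5.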

Upgrading to Transformers

Now, suppose our State function isn’t quite pure. It now wants to print some of its output, so it’ll need the IO monad. This means it’ll use the StateT monad transformer over IO:

stateTExample :: StateT Int IO (Int, Int, Int)
stateTExample = do
  a <- get
  lift $ putStrLn "Initial Value:"
  lift $ print a
  modify (+1)
  b <- get
  lift $ putStrLn "After adding 1:"
  lift $ print b
  put 5
  c <- get
  lift $ putStrLn "After setting as 5:"
  lift $ print c
  return (a, b, c)

Now instead of calling this function from a pure function, we’ll need to call it from an IO function. But once again, we’ll use a runXXX function. Now though, since we’re using a monad transformer, we won’t get a pure result. Instead, we’ll get our result in the underlying monad. This means we can call this function from IO. So let’s examine the type of the runStateT function. We’ve substituted IO for the generic monad parameter m:

runStateT :: StateT s IO a -> s -> IO (a, s)

It looks a lot like runState, except for the extra IO parameters! Instead of returning a pure tuple for the result, it returns an IO action containing that result. Thus we can call it from the IO monad.

main :: IO ()
main = do
  putStrLn "Please enter a number."
  input <- read <$> getLine
  results <- runStateT stateTExample input
  print results

We’ll get the following output as a result. Note that the last line contains both the result tuple and the final state:

Please enter a number.
10
Initial Value:
10
After adding 1:
11
After setting as 5:
5
((10,11,5),5)
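
If we only care about one half of the pair that runStateT returns, we can use evalStateT or execStateT instead. Here's a minimal sketch, assuming the mtl and transformers packages. The action incrementAndReport is a smaller hypothetical example, not the function from above.

```haskell
import Control.Monad.State (StateT, evalStateT, execStateT, get, modify)
import Control.Monad.Trans.Class (lift)

-- A small StateT action over IO (hypothetical, for illustration).
incrementAndReport :: StateT Int IO Int
incrementAndReport = do
  modify (+1)
  new <- get
  lift $ putStrLn ("State is now " ++ show new)
  return (new * 2)

-- evalStateT discards the final state and returns only the result in IO.
resultOnly :: Int -> IO Int
resultOnly = evalStateT incrementAndReport

-- execStateT discards the result and returns only the final state in IO.
finalStateOnly :: Int -> IO Int
finalStateOnly = execStateT incrementAndReport
```

Both still follow the same shape: the monadic action comes first, then the initial state, and the answer comes back in IO.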

Using Run For Libraries

This pattern will often extend into libraries you use. For example, in our series on parsing, we examine the Megaparsec library. A lot of the individual parser combinators in that library exist in the Parsec or ParsecT monad. So we can combine a bunch of different parsers together into one function.

But then to run that function from your normal IO code (or another monad), you need to use the runParserT function. Let’s look at its type signature:

runParserT
  :: Monad m
  => ParsecT e s m a
  -> String -- Name of source file
  -> s -- Input for parser
  -> m (Either (ParseError (Token s) e) a)

There are a lot of type parameters there that you don’t need to understand. But the structure is the same. The first parameter to our run function is the monadic action. Then we’ll supply some other inputs we need. Then we get some result, wrapped in an outer monad (such as IO).

We can see the same pattern if we use the servant-client library to make client-side API calls. Any call you make to your API will be in the ClientM monad. Now here’s the type signature of the runClientM function:

runClientM :: ClientM a -> ClientEnv -> IO (Either ServantError a)

So again, the same pattern emerges. We’ll compose our monadic action and pass that as the first parameter. Then we’ll provide some initial state, in this case a ClientEnv. Finally, we’ll get our result (Either ServantError a) wrapped in an outer monad (IO).

Monads Within Expressions

It’s also important to remember that a lot of basic monads work without even needing a runXXX function! For instance, you can use a Maybe or Either monad to take out some of your error handling logic:

divideIfEven :: Int -> Maybe Int
divideIfEven x = if x `mod` 2 == 0
  then Just (x `quot` 2)
  else Nothing

dividesBy8 :: Int -> Bool
dividesBy8 x = case thirdResult of
  Just _ -> True
  Nothing -> False
  where
    thirdResult :: Maybe Int
    thirdResult = do
      y <- divideIfEven x
      z <- divideIfEven y
      divideIfEven z
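The same idea works with Either when we’d like to know why a step failed. Here’s a small sketch using only the Prelude. The names halveEither and eighth are invented for this example.

```haskell
-- Halve an even number, or explain why we could not.
halveEither :: Int -> Either String Int
halveEither x
  | even x = Right (x `quot` 2)
  | otherwise = Left (show x ++ " is odd")

-- Chain three halvings; the first Left short-circuits the rest.
eighth :: Int -> Either String Int
eighth x = do
  y <- halveEither x
  z <- halveEither y
  halveEither z
```

Here eighth 24 evaluates to Right 3, while eighth 12 evaluates to Left "3 is odd". No run function is needed, since we can pattern match on the result directly.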

Conclusion

Monads are the key to using a lot of different Haskell libraries. But when you’re first starting out, it can be very confusing how you call into these functions from your code. The same applies with some common monad transformers like Reader and State. The most common pattern to look out for is the runXXX pattern. Master this pattern and you’re well on your way to understanding monads and writing better Haskell!

For a closer look at monads and similar structures, make sure to read our series on Functional Data Structures. If the code in this article was confusing, you should definitely check it out! And if you’ve never written Haskell but want to start, download our Beginners Checklist!

Taking a Look Back: My Mistakes in Learning Haskell

Last week, we announced our Haskell From Scratch Beginners Course. Course sign-ups are still open, but not for much longer! They will close at midnight Pacific time on Wednesday, August 29th!

The course starts next week. But before it does, I wanted to take this opportunity to tell a little bit of the story of how I learned Haskell. I want to share the mistakes I made, since those motivated me to make this course.

My Haskell History

I first learned Haskell in college as part of a course on programming language theory. I admired the elegance of a few things in particular. I liked how lists and tuples worked well with the type system. I also appreciated the elegance of Haskell’s type definitions. No other language I had seen represented the idea of sum types so well. I also saw how useful pattern matching and recursion were. They made it very easy to break problems down into manageable parts.

After college, I had the idea for a code generation project. A couple of college assignments had dealt with code generation. So I realized I already knew a couple of Haskell libraries that could provide the foundation for the work. So I got to work writing up some Haskell. At first things were quite haphazard. Eventually though, I developed some semblance of test driven development and product organization.

About nine months into that project, I had the great fortune of landing a Haskell project at my day job. As I ramped up on this project, I saw how deficient my knowledge was in a lot of areas. I realized then a lot of the mistakes I had been making while learning the language. This motivated me to start the Monday Morning Haskell blog.

Main Advice

Of course, I’ve tried to incorporate my learnings throughout the material on this blog. But if I had to distill the key ideas, here’s what they’d be.

First, learn tools and project organization early! Learn how to use Stack and/or Cabal! For help with this, you can check out our free Stack mini-course! After several months on my side project, I had to start from scratch to some extent. The only “testing” I was doing was running some manual executables and commands in GHCi. So once I learned more about these tools, I had to re-work a lot of code.

Second, it helps a lot to have some kind of structure when you’re first learning the language. Working on a project is nice, but there are a lot of unknown-unknowns out there. You’ll often find a “solution” for your problem, only to see that you need a lot more knowledge to implement it. You need to have a solid foundation on the core concepts before you can dive in on anything. So look for a source that provides some kind of structure to your Haskell learning, like a book (or an online course!).

Third, let’s get to monads. They’re an important key to Haskell and widely misunderstood. But there are a couple things that will help a lot. First, learn the syntactic patterns of do-syntax. Second, learn how to use run functions (runState, runReaderT, etc.). These are how you bring monadic expressions into the rest of your code. You can check out our Monads Series for some help on these ideas. We’ll also have an article on this topic next week! (And of course, you’ll learn all about monads in Haskell From Scratch!)
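As a small illustration of the run-function idea, here’s a sketch using Reader. It assumes the transformers package, and Config, greet, greetAll and greetings are hypothetical names invented for this example.

```haskell
import Control.Monad.Trans.Reader (Reader, asks, runReader)

-- Hypothetical read-only configuration shared by the whole computation.
data Config = Config
  { greeting :: String
  , punctuation :: String
  }

-- A monadic expression: it can read the Config without taking it as an argument.
greet :: String -> Reader Config String
greet name = do
  g <- asks greeting
  p <- asks punctuation
  pure (g ++ ", " ++ name ++ p)

greetAll :: [String] -> Reader Config [String]
greetAll = mapM greet

-- The run function brings the monadic expression back into ordinary pure code.
greetings :: Config -> [String] -> [String]
greetings cfg names = runReader (greetAll names) cfg
```

Calling greetings (Config "Hello" "!") ["Ada", "Alan"] yields ["Hello, Ada!","Hello, Alan!"]. Everything inside greetAll is monadic, but callers of greetings never need to know that.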

Finally, ask for help earlier! I still don’t plug into the Haskell network as much as I should. There are a lot of folks out there who are more than willing to help. Freenode is a great place, as are Reddit and even Twitter!

Conclusion

There’s never been a better time to start learning Haskell! The language tools have developed a ton in the last few years and the community is growing stronger. As we announced last week, we’ve now opened up our Haskell From Scratch Beginners Course! You don’t need any Haskell experience to take this course. So if you always wanted to learn more about the language but needed more organization, this is your chance!

Keeping it Clean: Haskell Code Formatters

A long time ago, we had an article that featured some tips on how to organize your import statements. As far as I remember, that’s the only piece we’ve ever done on code formatting. But as you work on projects with more people, formatting is one thing you need to consider to keep everyone sane. You’d like to have a consistent style across the code base. That way, there’s less controversy in code reviews, and people don't need to think as much to update code. They shouldn't have to wonder about following the style guide or what exists in a fragment of code.

This week, we’re going to go over three different Haskell code formatting tools. We’ll examine Stylish Haskell, Hindent, and Brittany. These all have their pluses and minuses, as we’ll see.

For some ideas of what Haskell projects you can do, download our Production Checklist. You can also take our free Stack mini-course and learn how to use Stack to organize your code!

Stylish Haskell

The first tool we’ll look at is Stylish Haskell. This is a straightforward tool to use, as it does some cool things with no configuration required. Let’s take a look at a poorly formatted version of code from our Beam article.

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE ImpredicativeTypes #-}

module Schema where

import Database.Beam
import Database.Beam.Backend
import Database.Beam.Migrate
import Database.Beam.Sqlite
import Database.SQLite.Simple (open, Connection)

import Data.Int (Int64)
import Data.Text (Text)
import Data.Time (UTCTime)
import qualified Data.UUID as U

data UserT f = User
  { _userId :: Columnar f Int64
  , _userName :: Columnar f Text
  , _userEmail :: Columnar f Text
  , _userAge :: Columnar f Int
  , _userOccupation :: Columnar f Text
  } deriving (Generic)

There are many undesirable things here. Our language pragmas don’t line up their end braces. They also aren’t in any discernible order. Our imports are also not lined up, and neither are the fields in our data types.

Stylish Haskell can fix all this. First, we’ll install it globally with:

stack install stylish-haskell

(You can also use cabal instead of stack.) Then we can call the stylish-haskell command on a file. By default, it will output the results to the terminal. But if we pass the -i flag, it will update the file in place. This will make all the changes we want to line up the various statements in our file!

>> stylish-haskell -i Schema.hs

--- Result:

{-# LANGUAGE DeriveGeneric         #-}
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE GADTs                 #-}
{-# LANGUAGE ImpredicativeTypes    #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings     #-}
{-# LANGUAGE StandaloneDeriving    #-}
{-# LANGUAGE TypeApplications      #-}
{-# LANGUAGE TypeFamilies          #-}
{-# LANGUAGE TypeSynonymInstances  #-}

module Schema where

import           Database.Beam
import           Database.Beam.Backend
import           Database.Beam.Migrate
import           Database.Beam.Sqlite
import           Database.SQLite.Simple (Connection, open)

import           Data.Int               (Int64)
import           Data.Text              (Text)
import           Data.Time              (UTCTime)
import qualified Data.UUID              as U

data UserT f = User
  { _userId         :: Columnar f Int64
  , _userName       :: Columnar f Text
  , _userEmail      :: Columnar f Text
  , _userAge        :: Columnar f Int
  , _userOccupation :: Columnar f Text
  } deriving (Generic)

Stylish Haskell integrates well with most common editors. For instance, if you use Vim, you can run it on the current buffer from within the editor:

:%!stylish-haskell

We get all these features without any configuration. If we want to change things though, we can create a configuration file. We’ll make a default file with the following command:

stylish-haskell --defaults > .stylish-haskell.yaml

Then if we want, we can modify it a bit. For one example, we've aligned our imports above globally. This means they all leave space for qualified. But we can decide we don’t want a group of imports to have that space if there are no qualified imports. There’s a setting for this in the config. By default, it looks like this:

imports:
  align: global

We can change it to group to ensure our imports are only aligned within their grouping.

imports:
  align: group

And now when we run the command, we’ll get a different result:

module Schema where

import Database.Beam
import Database.Beam.Backend
import Database.Beam.Migrate
import Database.Beam.Sqlite
import Database.SQLite.Simple (Connection, open)

import           Data.Int  (Int64)
import           Data.Text (Text)
import           Data.Time (UTCTime)
import qualified Data.UUID as U

So in short, Stylish Haskell is a great tool for a limited scope. It has uncontroversial suggestions for several areas like imports and pragmas. It also removes trailing whitespace, and adjusts case statements sensibly. That said, it doesn’t affect your main Haskell code. Let’s look at a couple tools that can do that.

Hindent

Another program we can use is hindent. As its name implies, it deals with updating whitespace and indentation levels. Let’s look at a very simple example. Consider this code, adapted from our Beam article:

user1' = User default_  (val_ "James")  (val_ "james@example.com")  (val_ 25)  (val_ "programmer")

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
    users <- runSelectReturningList $ select $ do
        user <- (all_ (_blogUsers blogDb))
        article <- (all_ (_blogArticles blogDb))
        guard_ (user ^. userName ==. (val_ "James"))
        guard_ (article ^. articleUserId ==. user ^. userId) 
        return (user, article)
    mapM_ (liftIO . putStrLn . show) users

There are a few things we could change. First, we might want to update the indentation level so that it is 2 instead of 4. Second, let's restrict the line length to 80 characters. When we run hindent on this file, it’ll make these changes.

user1' =
  User
    default_
    (val_ "James")
    (val_ "james@example.com")
    (val_ 25)
    (val_ "programmer")

findUsers :: Connection -> IO ()
findUsers conn =
  runBeamSqlite conn $ do
    users <-
      runSelectReturningList $
      select $ do
        user <- (all_ (_blogUsers blogDb))
        article <- (all_ (_blogArticles blogDb))
        guard_ (user ^. userName ==. (val_ "James"))
        guard_ (article ^. articleUserId ==. user ^. userId)
        return (user, article)
    mapM_ (liftIO . putStrLn . show) users

Hindent is also configurable. We can create a file .hindent.yaml. By default, we would have the following configuration:

indent-size: 2
line-length: 80
force-trailing-newline: true

But then we can change it if we want so that the indentation level is 3:

indent-size: 3

And now when we run it, we’ll actually see that it’s changed to reflect that:

findUsers :: Connection -> IO ()
findUsers conn =
   runBeamSqlite conn $ do
      users <-
         runSelectReturningList $
         select $ do
            user <- (all_ (_blogUsers blogDb))
            article <- (all_ (_blogArticles blogDb))
            guard_ (user ^. userName ==. (val_ "James"))
            guard_ (article ^. articleUserId ==. user ^. userId)
            return (user, article)
      mapM_ (liftIO . putStrLn . show) users

Hindent also has some other effects that, as far as I can tell, are not configurable. You can see that the separation of lines was not preserved above. In another example, it spaced out instance definitions that I had grouped in another file:

-- BEFORE
deriving instance Show User
deriving instance Eq User
deriving instance Show UserId
deriving instance Eq UserId

-- AFTER
deriving instance Show User

deriving instance Eq User

deriving instance Show UserId

deriving instance Eq UserId

So make sure you’re aware of everything it does before committing to using it. Like stylish-haskell, hindent integrates well with text editors.

Brittany

Brittany is an alternative to Hindent for modifying your expression definitions. It mainly focuses on the use of horizontal space throughout your code. As far as I can see, it doesn’t line up language pragmas or change import statements in the way stylish-haskell does. It also doesn’t touch data type declarations. Instead, it seeks to reformat your code to make maximal use of space while avoiding lines that are too long. As an example, we could look at this line from our Beam example:

insertArticles :: Connection -> IO ()
insertArticles conn = runBeamSqlite conn $ runInsert $ 
  insert (_blogArticles blogDb) $ insertValues articles

Our decision on where to separate the line is a little bit arbitrary. But at the very least we don’t try to cram it all on one line. But if we have either the approach above or the one-line version, Brittany will change it to this:

brittany --write-mode=inplace MyModule.hs

--

insertArticles :: Connection -> IO ()
insertArticles conn =
  runBeamSqlite conn $ runInsert $ insert (_blogArticles blogDb) $ insertValues
    articles

This makes “better” use of horizontal space in the sense that we get as much as possible on the first line. That said, one could argue that the first approach we have actually looks nicer. Brittany can also change type signatures that overflow the line limit. Suppose we have this arbitrary type signature that’s too long for a single line:

myReallyLongFunction :: State ComplexType Double -> Maybe Double -> Either Double ComplexType -> IO a -> StateT ComplexType IO a

Brittany will fix it up so that each argument type is on a single line:

myReallyLongFunction
  :: State ComplexType Double
  -> Maybe Double
  -> Either Double ComplexType
  -> IO a
  -> StateT ComplexType IO a

This can be useful in projects with very complicated types. The structure makes it easier for you to add Haddock comments to the various arguments.
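To show what those per-argument comments look like, here’s a small sketch. The function scaleAndClamp is made up for this example. The `-- ^` markers are standard Haddock syntax for documenting the argument on the same line.

```haskell
-- | Scale an input by a factor, but never return more than the upper bound.
scaleAndClamp
  :: Double -- ^ Scaling factor
  -> Double -- ^ Upper bound for the result
  -> Double -- ^ Input value
  -> Double -- ^ Scaled value, capped at the upper bound
scaleAndClamp factor upper x = min upper (factor * x)
```

With one argument per line, each comment sits right next to the type it describes, and adding or removing an argument only touches a single line in a diff.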

Dangers

There is, of course, a (small) danger to using tools like these. If you’re going to use them, you want to ensure everyone on the project is using them. Suppose person A isn’t using the program, and commits code that isn’t formatted by the program. Person B might then look through that code, and their editor will correct the file. This will leave them with local changes to the file that aren’t relevant to whatever work they’re doing. This can cause a lot of confusion when they submit code for review. Whoever reviews their code has to sift through the format changes, which slows the review.

People can also have (unreasonably) strong opinions about code formatting. So it’s generally something you want to nail down early on a project and avoid changing afterward. With the examples in this article, I would say it would be an easy sell to use Stylish Haskell on a project. However, the specific choices made in Hindent and Brittany can be more controversial. So it might cause more problems than it would solve to institute those project-wide.

Conclusion

It’s possible to lose a surprising amount of productivity to code formatting. So it can be important to nail down standards early and often. Code formatting programs can make it easy to enforce particular standards. They’re also very simple to incorporate into your projects with stack and your editor of choice!

Now that you know how to format your code, need some suggestions for what to work on next? Take a look at our Production Checklist! It’ll give you some cool ideas of libraries you can use for building Haskell web apps and much more!