Spock II: Databases and Sessions!


Last week we learned the basics of the Spock library. We saw how to set up some simple routes. Like Servant, there's a bit of type-level machinery with routing. But we didn't need to learn any complex operators. We just needed to match up the number of arguments to our routes. We also saw how to use an application state to persist some data between requests.

This week, we'll add a couple more complex features to our Spock application. First we'll connect to a database. Second, we'll use sessions to keep track of users.

For some more examples of useful Haskell libraries, check out our Production Checklist!

Adding a Database

Last week, we added some global application state. Even with this improvement, our visitor count doesn't persist. When we reset the server, everything goes away, and our users will see a different number. We can change this by adding a database connection to our server. We'll follow the Spock tutorial example and connect to an SQLite database by using Persistent.

If you haven't used Persistent before, take a look at this tutorial in our Haskell Web series! You can also look at our sample code on Github for any of the boilerplate you might be missing. Here's the super simple schema we'll use. Remember that Persistent will give us an auto-incrementing primary key.

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
  NameEntry json
    name Text
    deriving Show
|]

Spock expects us to use a pool of connections to our database when we use it. So let's create one to an SQLite file using createSqlitePool. We need to run this from a logging monad. While we're at it, we can migrate our database from the main startup function. This ensures we're using an up-to-date schema:

import Database.Persist.Sqlite (createSqlitePool)

...

main :: IO ()
main = do
  ref <- newIORef M.empty
  pool <- runStdoutLoggingT $ createSqlitePool "spock_example.db" 5
  runStdoutLoggingT $ runSqlPool (runMigration migrateAll) pool
  ...

Now that we've created this pool, we can pass it to our configuration. We'll use the PCPool constructor. We're now using a SqlBackend for our server, so we'll also have to change the type of our router to reflect this:

main :: IO ()
main = do
  …
  spockConfig <-
    defaultSpockCfg EmptySession (PCPool pool) (AppState ref)
  runSpock 8080 (spock spockConfig app)

app :: SpockM SqlBackend MySession AppState ()
app = ...

Now we want to update our route action to access the database instead of this map. But first, we'll write a helper function that will allow us to call any SQL action from within our SpockM monad. It looks like this:

runSQL :: (HasSpock m, SpockConn m ~ SqlBackend)
  => SqlPersistT (LoggingT IO) a -> m a
runSQL action = runQuery $ \conn -> 
  runStdoutLoggingT $ runSqlConn action conn

At the core of this is the runQuery function from the Spock library. It works since our router now uses SpockM SqlBackend instead of SpockM (). Now let's write a couple SQL actions we can use. We'll have one performing a lookup by name, and returning the Key of the first entry that matches, if one exists. Then we'll also have one that will insert a new name and return its key.

fetchByName
  :: T.Text
  -> SqlPersistT (LoggingT IO) (Maybe Int64)
fetchByName name = (fmap (fromSqlKey . entityKey)) <$> 
  (listToMaybe <$> selectList [NameEntryName ==. name] [])

insertAndReturnKey
  :: T.Text
  -> SqlPersistT (LoggingT IO) Int64
insertAndReturnKey name = fromSqlKey <$> insert (NameEntry name)

Now we can use these functions instead of our map!

app :: SpockM SqlBackend MySession AppState ()
app = do
  get root $ text "Hello World!"
  get ("hello" <//> var) $ \name -> do
    existingKeyMaybe <- runSQL $ fetchByName name
    visitorNumber <- case existingKeyMaybe of
      Nothing -> runSQL $ insertAndReturnKey name
      Just i -> return i
    text ("Hello " <> name <> ", you are visitor number " <> 
      T.pack (show visitorNumber))

And voila! We can shut down our server between runs, and we'll preserve the visitors we've seen!

Tracking Users

Now, using a route to identify our users isn't what we want to do. Anyone can visit any route after all! So for the last modification to the server, we're going to add a small "login" functionality. We'll use the App's session to track what user is currently visiting. Our new flow will look like this:

  1. We'll change our entry route to /hello.
  2. If the user visits this, we'll show a field allowing them to enter their name and log in.
  3. Pressing the login button will send a post request to our server. This will update the session to match the session ID with the username.
  4. It will then send the user to the /home page, which will greet them and present a logout button.
  5. If they log out, we'll clear the session.

Note that using the session is different from using the app state map that we had in the first part. We share the app state across everyone who uses our server. But the session will contain user-specific references.

Adding a Session

The first step is to change our session type. Once again, we'll use an IORef wrapper around a map. This time though, we'll use a simple type synonym to simplify things. Here's our type definition and the updated main function.

type MySession = IORef (M.Map T.Text T.Text)

main :: IO ()
main = do
  ref <- newIORef M.empty
  -- Initialize a reference for the session
  sessionRef <- newIORef M.empty
  pool <- runStdoutLoggingT $ createSqlitePool "spock_example.db" 5
  runStdoutLoggingT $ runSqlPool (runMigration migrateAll) pool
  -- Pass that reference!
  spockConfig <-
    defaultSpockCfg sessionRef (PCPool pool) (AppState ref)
  runSpock 8080 (spock spockConfig app)

Updating the Hello Page

Now let's update our "Hello" page. Check out the appendix below for what our helloHTML looks like. It's a "login" form with a username field and a submit button.

-- Notice we use MySession!
app :: SpockM SqlBackend MySession AppState ()
app = do
  get root $ text "Hello World!"
  get "hello" $ html helloHTML
  ...

Now we need to add a handler for the post request to /hello. We'll use the post function instead of get. Now instead of our action taking an argument, we'll extract the post body using the body function. If our application were more complicated, we would want to use a proper library for Form URL encoding and decoding. But for this small example, we'll use a simple helper decodeUsername. You can view this helper in the appendix.

app :: SpockM SqlBackend MySession AppState ()
app = do
  …
  post "hello" $ do
    nameEntry <- decodeUsername <$> body
    ...

Now we want to save this user using our session and then redirect them to the home page. First we'll need to get the session ID and the session itself. We use the functions getSessionId and readSession for this. Then we'll want to update our session by associating the name with the session ID. Finally, we'll redirect to home.

post "hello" $ do
  nameEntry <- decodeUsername <$> body
  sessId <- getSessionId 
  currentSessionRef <- readSession
  liftIO $ modifyIORef' currentSessionRef $
    M.insert sessId (nameEntryName nameEntry)
  redirect "home"

The Home Page

Now on the home page, we'll want to check if we've got a user associated with the session ID. If we do, we'll display some text greeting that user (and also display a logout button). Again, we need to invoke getSessionId and readSession. If we have no user associated with the session, we'll bounce them back to the hello page.

get "home" $ do
  sessId <- getSessionId 
  currentSessionRef <- readSession
  currentSession <- liftIO $ readIORef currentSessionRef
  case M.lookup sessId currentSession of
    Nothing -> redirect "hello"
    Just name -> html $ homeHTML name

The last piece of functionality we need is to "logout". We'll follow the familiar pattern of getting the session ID and session. This time, we'll change the session by clearing the session key. Then we'll redirect the user back to the hello page.

post "logout" $ do
  sessId <- getSessionId 
  currentSessionRef <- readSession
  liftIO $ modifyIORef' currentSessionRef $ M.delete sessId
  redirect "hello"

And now our site tracks our users' sessions! We can access the same page as a different user on different sessions!
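The three handlers above all boil down to simple operations on the session map. As a rough sketch, here is a pure model of those operations. The names login, currentUser and logout are hypothetical and only for illustration, and we use String instead of T.Text to keep the example dependent only on base and containers:

```haskell
import qualified Data.Map as M

-- Hypothetical aliases for this sketch; the article uses T.Text for both.
type SessionId = String
type Username = String
type Sessions = M.Map SessionId Username

-- Login: associate the session ID with a username.
login :: SessionId -> Username -> Sessions -> Sessions
login = M.insert

-- Home page check: do we know a user for this session ID?
currentUser :: SessionId -> Sessions -> Maybe Username
currentUser = M.lookup

-- Logout: clear the entry for this session ID.
logout :: SessionId -> Sessions -> Sessions
logout = M.delete
```

In the real handlers, these pure functions are what we pass to modifyIORef' and readIORef. Keeping them pure like this would also make them easy to test without starting a server.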

Conclusion

This wraps up our exploration of the Spock library! We've done a shallow but wide look at some of the different features Spock has to offer. We saw several different ways to persist information across requests on our server! Connecting to a database is the most important. But using the session is a pretty advanced feature that is quite easy in Spock!

For some more cool examples of Haskell web libraries, take a look at our Web Skills Series! You can also download our Production Checklist for even more ideas!

Appendix - HTML Fragments and Helpers

helloHTML :: T.Text
helloHTML =
  "<html>\
    \<body>\
      \<p>Hello! Please enter your username!\
      \<form action=\"/hello\" method=\"post\">\
        \Username: <input type=\"text\" name=\"username\"><br>\
        \<input type=\"submit\"><br>\
      \</form>\
    \</body>\
  \</html>"

homeHTML :: T.Text -> T.Text
homeHTML name =
  "<html><body><p>Hello " <> name <> 
    "</p>\
    \<form action=\"logout\" method=\"post\">\
      \<input type=\"submit\" name=\"logout_button\"><br>\
    \</form>\
  \</body>\
  \</html>" 

-- Note: 61 -> '=' in ASCII
-- We expect input like "username=christopher"
parseUsername :: B.ByteString -> T.Text
parseUsername input = decodeUtf8 $ B.drop 1 tail_
  where
    tail_ = B.dropWhile (/= 61) input

Simple Web Routing with Spock!


In our Haskell Web Series, we go over the basics of how we can build a web application with Haskell. That includes using Persistent for our database layer, and Servant for our HTTP layer. But these aren't the only libraries for those tasks in the Haskell ecosystem.

We've already looked at how to use Beam as another potential database library. In these next two articles, we'll examine Spock, another HTTP library. We'll compare it to Servant and see what the different design decisions are. We'll start this week by looking at the basics of routing. We'll also see how to use a global application state to coordinate information on our server. Next week, we'll see how to hook up a database and use sessions.

For some useful libraries, make sure to download our Production Checklist. It will give you some more ideas for libraries you can use even beyond these! Also, you can follow along the code here by looking at our Github repository!

Getting Started

Spock gives us a helpful starting point for making a basic server. We'll begin by taking a look at the starter code on their homepage. Here's our initial adaptation of it:

data MySession = EmptySession
data MyAppState = DummyAppState (IORef Int)

main :: IO ()
main = do
  ref <- newIORef 0
  spockConfig <- defaultSpockCfg EmptySession PCNoDatabase (DummyAppState ref)
  runSpock 8080 (spock spockConfig app)

app :: SpockM () MySession MyAppState ()
app = do
  get root $ text "Hello World!"
  get ("hello" <//> var) $ \name -> do
    (DummyAppState ref) <- getState
    visitorNumber <- liftIO $ atomicModifyIORef' ref $ \i -> (i+1, i+1)
    text ("Hello " <> name <> ", you are visitor number " <> T.pack (show visitorNumber))

In our main function, we initialize an IORef that we'll use as the only "state" of our application. Then we'll create a configuration object for our server. Last, we'll run our server using our app specification of the actual routes.

The configuration has a few important fields attached to it. For now, we're using dummy values for all these. Our config wants a Session, which we've defined as EmptySession. It also wants some kind of a database, which we'll add later. Finally, it includes an application state, and for now we'll only supply our pointer to an integer. We'll see later how we can add a bit more flavor to each of these parameters. But for the moment, let's dig a bit deeper into the app expression that defines the routing for our Server.

The SpockM Monad

Our router lives in the SpockM monad. We can see this has three different type parameters. Remember that defaultSpockCfg had three comparable arguments! We have the empty session as MySession and the IORef app state as MyAppState. Finally, there's an extra () parameter corresponding to our empty database. (The return value of our router is also ().)

Now each element of this monad is a path component. These path components use HTTP verbs, as you might expect. At the moment, our router only has a couple get routes. The first lies at the root of our path, and outputs Hello World!. The second lies at hello/{name}. It will print a message specifying the input name while keeping track of how many visitors we've had.

Composing Routes

Now let's talk a little bit about the structure of our router code. The SpockM monad works like a Writer monad. Each action we take adds a new route to the application. In this case, we take two actions, each responding to get requests (we'll see an example of a post request next week).
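To make the Writer comparison concrete, here is a toy sketch that uses the tuple monad from base, which behaves like a Writer over its first component. ToyRouter and addRoute are hypothetical names for illustration. This is not how Spock is implemented, only a model of "each action adds a route":

```haskell
-- A toy router: the first tuple component collects route descriptions.
-- The tuple monad in base behaves like a Writer over that list.
type ToyRouter = (,) [String]

-- Register one route by "writing" its description to the list.
addRoute :: String -> ToyRouter ()
addRoute description = ([description], ())

-- Each line of the do block appends one more route.
toyApp :: ToyRouter ()
toyApp = do
  addRoute "GET /"
  addRoute "GET /hello/{name}"

-- Running the "router" just means reading off the accumulated routes.
registeredRoutes :: [String]
registeredRoutes = fst toyApp
```

Evaluating registeredRoutes gives back both descriptions in the order we declared them, which mirrors how our app block declares its two get routes.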

For any of our HTTP verbs, the first argument will be a representation of the path. On our first route, we use the hard-coded root expression to refer to the / path. For our second expression, we have a couple different components that we combine with <//>.

First, we have a string path component hello. We could combine other strings as well. Let's suppose we wanted the route /api/hello/world. We'd use the expression:

"api" <//> "hello" <//> "world"

In our original code though, the second part of the path is a var. This allows us to substitute information into the path. When we visit /hello/james, we'll be able to get the path component james as a variable. Spock passes this argument to the function we have as the second argument of the get combinator.

This argument has a rather complicated type RouteSpec. We don't need to go into the details here. But the simplest thing we can return is some raw text by using the text combinator. (We could also use html if we have our own template). We conclude both our route definitions by doing this.

Notice that the expression for our first route has no parameters, while the second has one parameter. As you might guess, the parameter in the second route refers to the variable we can pull out of the path thanks to var. We have the same number of var elements in the path as we do arguments to the function. Spock uses type-level programming to ensure these match at compile time.
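If it helps to picture what var does, here is a toy model of path matching. Segment and matchPath are hypothetical names, and Spock's real implementation is different. In particular, this sketch only discovers the captured values at runtime, while Spock tracks the number of variables in the route's type:

```haskell
-- A toy model of route matching; Spock's real implementation differs.
data Segment = Static String | Var

-- Match URL segments against a pattern, collecting the text captured
-- by each Var in order. Nothing means the path does not match.
matchPath :: [Segment] -> [String] -> Maybe [String]
matchPath [] [] = Just []
matchPath (Static s : rest) (p : ps)
  | s == p = matchPath rest ps
matchPath (Var : rest) (p : ps) = (p :) <$> matchPath rest ps
matchPath _ _ = Nothing
```

With the pattern [Static "hello", Var], the path segments ["hello", "james"] match and capture "james", which plays the role of the name argument in our handler.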

Using the App State

Now that we know the basics, let's start using some of Spock's more advanced features. This week, we'll see how to use the App State.

Currently, we bump the visitor count each time we visit the route with a name, even if that name is the same. So visiting /hello/michael the first time results in:

Hello michael, you are visitor number 1

Then we'll visit again and see:

Hello michael, you are visitor number 2

Instead, let's make it so we assign each name to a particular number. This way, when a user visits the same route again, they'll see what number they originally were.

Making this change is rather easy. Instead of using an IORef on an Int for our state, we'll use a mapping from Text to Int:

data AppState = AppState (IORef (M.Map T.Text Int))

Now we'll initialize our ref with an empty map and pass it to our config:

main :: IO ()
main = do
  ref <- newIORef M.empty
  spockConfig <- defaultSpockCfg EmptySession PCNoDatabase (AppState ref)
  runSpock 8080 (spock spockConfig app)

And for our hello/{name} route, we'll update it to follow this process:

  1. Get the map reference
  2. See if we have an entry for this user yet.
  3. If not, insert them with the length of the map, and write this back to our IORef
  4. Return the message

This process is pretty straightforward. Let's see what it looks like:

app :: SpockM () MySession AppState ()
app = do
  get root $ text "Hello World!"
  get ("hello" <//> var) $ \name -> do
    (AppState mapRef) <- getState
    visitorNumber <- liftIO $ atomicModifyIORef' mapRef $ updateMapWithName name
    text ("Hello " <> name <> ", you are visitor number " <> T.pack (show visitorNumber))

updateMapWithName :: T.Text -> M.Map T.Text Int -> (M.Map T.Text Int, Int)
updateMapWithName name nameMap = case M.lookup name nameMap of
  Nothing -> (M.insert name (mapSize + 1) nameMap, mapSize + 1)
  Just i -> (nameMap, i)
  where
    mapSize = M.size nameMap

We create a function to update the map every time our app encounters a new name. Then we update our IORef with atomicModifyIORef'. And now if we visit /hello/michael twice in a row, we'll get the same output both times!
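Since the update function is pure, we can check its behavior without running the server. Here is a small sketch that simulates a sequence of visits. The helper visitAll is hypothetical, and we use String keys instead of T.Text so the example only needs base and containers:

```haskell
import Data.List (mapAccumL)
import qualified Data.Map as M

-- Same logic as in the article, but with String keys for this sketch.
updateMapWithName :: String -> M.Map String Int -> (M.Map String Int, Int)
updateMapWithName name nameMap = case M.lookup name nameMap of
  Nothing -> (M.insert name (mapSize + 1) nameMap, mapSize + 1)
  Just i -> (nameMap, i)
  where
    mapSize = M.size nameMap

-- Simulate a sequence of visits, threading the map through each one
-- and collecting the visitor number each visitor would see.
visitAll :: [String] -> [Int]
visitAll names = snd (mapAccumL (flip updateMapWithName) M.empty names)
```

For the visits michael, james, michael, this gives the numbers 1, 2 and 1. The repeat visitor keeps the original number.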

Conclusion

That's as far as we'll go this week! We covered how to make a simple application in Spock. We saw the basics of composing routes. Then we saw how we could use the app state to keep track of information across requests. Next week, we'll improve this process by adding a database to our application. We'll also use sessions to keep track of users.

For more cool libraries, read up on our Haskell Web Series. Also, you can download our Production Checklist for more ideas!

Common (But not so Common) Monads


Last week we looked at how monads can help you make the next jump in your Haskell development. We went over the runXXXT pattern and how it’s a common gateway for us to use certain monads from the rest of our code. But sometimes it also helps to go back to the basics. I actually went a long time without really grasping how to use a couple basic monads. Or at the very least, I didn’t understand how to use them as monads.

In this article, we’ll look at how to use the list monad and the function monad. Lists and functions are core concepts that any Haskeller learns from the get-go. But the list data structure and function application are also monads! And understanding how they work as such can teach us more about how monads work.

For an in-depth discussion of monads, check out our Functional Data Structures Series!

The General Pattern of Do Syntax

Using do syntax is one of the keys to understanding how to actually use monads. The bind operator makes it hard to track where your arguments are. Do syntax keeps the structure clean and allows you to pass results with ease. Let’s see how this works with IO, the first monad a lot of Haskellers learn. Here’s an example where we read the second line from a file:

readLineFromFile :: IO String
readLineFromFile = do
  handle <- openFile "myFile.txt" ReadMode
  nextLine <- hGetLine handle
  secondLine <- hGetLine handle
  _ <- hClose handle
  return secondLine

By keeping in mind the type signatures of all the IO functions, we can start to see the general pattern of do syntax. Let’s replace each expression with its type:

openFile :: FilePath -> IOMode -> IO Handle
hGetLine :: Handle -> IO String
hClose :: Handle -> IO ()
return :: a -> IO a

readLineFromFile :: IO String
readLineFromFile = do
  (Handle) <- (IO Handle)
  (String) <- (IO String)
  (String) <- (IO String)
  () <- (IO ())
  IO String

Every line in a do expression (except the last) uses the assignment operator <-. Then it has an expression of IO a on the right side, which it assigns to a value of a on the left side. The last line’s type then matches the final return value of this function. What’s important now is to recognize that we can generalize this structure to ANY monad:

monadicFunction :: m c
monadicFunction = do
  (_ :: a) <- (_ :: m a)
  (_ :: b) <- (_ :: m b)
  (_ :: m c)

So for example, if we have a function in the Maybe monad, we can use it and plug that in for m above:

myMaybeFunction :: a -> Maybe a

monadicMaybe :: a -> Maybe a
monadicMaybe x = do
  (y :: a) <- myMaybeFunction x
  (z :: a) <- myMaybeFunction y
  (Just z :: Maybe a)

The important thing to remember is that a monad captures a computational context. For IO, this context is that the computation might interact with the terminal or network. For Maybe, the context is that the computation might fail.
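It can also help to see what do syntax turns into. Below is a small sketch with a concrete Maybe function. The names halveIfEven, withDo and withBind are hypothetical. The second version writes out the bind operator by hand, which is roughly what the compiler does with the first:

```haskell
-- A hypothetical function that can fail: halve only even numbers.
halveIfEven :: Int -> Maybe Int
halveIfEven x
  | even x = Just (x `div` 2)
  | otherwise = Nothing

-- Using do syntax, following the general pattern.
withDo :: Int -> Maybe Int
withDo x = do
  y <- halveIfEven x
  z <- halveIfEven y
  Just z

-- The same computation with explicit binds.
withBind :: Int -> Maybe Int
withBind x =
  halveIfEven x >>= \y ->
    halveIfEven y >>= \z ->
      Just z
```

Both versions return Just 3 for the input 12, and Nothing for 6, since the second halving hits an odd number. The "might fail" context handles the short-circuiting for us.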

The List Monad

Now to grasp the list monad, we need to know its computational context. We can view any function returning a list as non-deterministic. It could have many different values. So if we chain these computations, our final result is every possible combination. That is, our first computation could return a list of values. Then we want to check what we get with each of these different results as an input to the next function. And then we’ll take all those results. And so on.

To see this, let’s imagine we have a game. We can start that game with a particular number x. On each turn, we can either subtract one, add one, or keep the number the same. We want to know all the possible results after 5 turns, and the distribution of the possibilities. So we start by writing our non-deterministic function. It takes a single input and returns the possible game outputs:

runTurn :: Int -> [Int]
runTurn x = [x - 1, x, x + 1]

Here’s how we’d apply this to a 5-turn game. We’ll add the type signatures so you can see the monadic structure:

runGame :: Int -> [Int]
runGame x = do
  (m1 :: Int) <- (runTurn x :: [Int])
  (m2 :: Int) <- (runTurn m1 :: [Int])
  (m3 :: Int) <- (runTurn m2 :: [Int])
  (m4 :: Int) <- (runTurn m3 :: [Int])
  (m5 :: Int) <- (runTurn m4 :: [Int])
  return m5

On the right side, every expression has type [Int]. Then on the left side, we get our Int out. So each of the m expressions represents one of the many solutions we'll get from runTurn. Then we run the rest of the function imagining we’re only using one of them. In reality though, we’ll run them all, because of how the list monad defines its bind operator. This mental jump is a little tricky. And it’s often more intuitive to just stick to using where expressions when we do list computations. But it's cool to see patterns like this pop up in unexpected places.
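We said we wanted the distribution of the possibilities, so here is a sketch of how we might compute it. The distribution helper is a hypothetical addition. We also drop the type annotations from runGame so the block compiles without any language extensions:

```haskell
import Data.List (group, sort)

runTurn :: Int -> [Int]
runTurn x = [x - 1, x, x + 1]

-- Five turns; the last line returns every result of the fifth turn.
runGame :: Int -> [Int]
runGame x = do
  m1 <- runTurn x
  m2 <- runTurn m1
  m3 <- runTurn m2
  m4 <- runTurn m3
  runTurn m4

-- Count how often each final value occurs, as (value, count) pairs.
distribution :: [Int] -> [(Int, Int)]
distribution results = [(v, length g) | g@(v : _) <- group (sort results)]
```

Starting from 0, the game produces 3^5 = 243 results ranging from -5 to 5. The counts are 1, 5, 15, 30, 45, 51, 45, 30, 15, 5 and 1, so ending back at 0 is the most likely outcome.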

The Function Monad

The function monad is another one I struggled to understand for a while. In some ways, it's the same as the Reader monad. It encapsulates the context of having a single argument we can pass to different functions. But it’s not defined in the same way as Reader. When I tried to grok the definition, it didn’t make much sense to me:

instance Monad ((->) r) where
  return x = \_ -> x
  h >>= f = \w -> f (h w) w

The return definition makes sense. We’ll have a function that takes some argument, ignore that argument, and give the value as an output. The bind operator is a little more complicated. When we bind two functions together, we’ll get a new function that takes some argument w. We’ll apply that argument against our first function ((h w)). Then we’ll take the result of that, apply it to f, and THEN also apply the argument w again. It’s a little hard to follow.

But let’s think about this in the context of do syntax. Every expression on the right side will be a function that takes our type as its only argument.

myFunctionMonad :: a -> (b, c, d)
myFunctionMonad = do
  x <- (_ :: a -> b)
  y <- (_ :: a -> c)
  z <- (_ :: a -> d)
  return (x, y, z)

Now let’s imagine we’ll pass an Int and use a few different functions that can take an Int. Here’s what we’ll get:

myFunctionMonad :: Int -> (Int, Int, String)
myFunctionMonad = do
  x <- (1 +)
  y <- (2 *)
  z <- show
  return (x, y, z)

And now we have valid do syntax! So what happens when we run this function? We’ll call our different functions on the same input.

>> myFunctionMonad 3
(4,6,"3")
>> myFunctionMonad (-1)
(0,-2,"-1")

When we pass 3 in the first example, we add 1 to it on the first line, multiply it by 2 on the second line, and show it on the third line. And we do this all without explicitly stating the argument! The tricky part is that all your functions have to take the input argument as their last argument. So you might have to do a little bit of argument flipping.
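To see that nothing magical is happening, here is a sketch comparing the do version with one that passes the argument explicitly. The names explicit and describe are hypothetical. The describe function shows the kind of argument flipping we mean: replicate takes its Int first, so we flip it to make the shared input the last argument:

```haskell
myFunctionMonad :: Int -> (Int, Int, String)
myFunctionMonad = do
  x <- (1 +)
  y <- (2 *)
  z <- show
  return (x, y, z)

-- The same function with the shared argument passed explicitly.
explicit :: Int -> (Int, Int, String)
explicit w = (1 + w, 2 * w, show w)

-- 'replicate' takes the count first and the element second, so we
-- flip it to leave the shared Int input as the final argument.
describe :: Int -> (Int, String)
describe = do
  doubled <- (2 *)
  stars <- flip replicate '*'
  return (doubled, stars)
```

Calling describe 3 gives (6,"***"). Both the doubling and the replicate call received the same input, 3.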

Conclusion

In this article we explored lists and functions, two of the most common concepts in Haskell. We generally don’t use these as monads. But we saw how they still fit into the monadic structure. We can use them in do-syntax, and follow the patterns we already know to make things work.

Perhaps you’ve tried to learn Haskell before but found monads a little too complex. Hopefully this article helped clarify the structure of monads. If you want to get your Haskell journey back under way, download our Beginners Checklist! Or to learn monads from the ground up, read our series on Functional Data Structures!

Making the Jump II: Using More Monads


A few weeks ago, we addressed some important steps to advance past the "beginner" stage of Haskell. We learned how to organize your project and how to find the relevant documentation. This week we’re going to continue to look at another place where we can make a big step up. We’ll explore how to expand our vocabulary on monad usage.

Monads are a vital component of Haskell. You can’t use a lot of libraries unless you know how to incorporate their monadic functions. These functions often involve a monad that is custom to that library. When you’re first starting out, it can be hard to know how to incorporate these monads into the rest of your program.

In this article, we’ll focus on a specific pattern a lot of monads and libraries use. I call this pattern the “run” pattern. Often, you’ll use a function with a name like runXXX or runXXXT, where XXX is the name of the monad. These functions will always take a monadic expression as their first argument. Then they'll also take some other initialization information, and finally return some output. This output can either be in pure form or a different monad you’re already using like IO. We’ll start by seeing how this works with the State monad, and then move onto some other libraries.

Once you grasp this topic, it seems very simple. But a lot of us first learned monads with a bad mental model. For instance, the first thing I learned about monads was that they had side effects. And thus, you can only call them from places that have the same side effects. This applies to IO but doesn’t generalize to other monads. So even though it seems obvious now, I struggled to learn this idea at first. But let's start looking at some examples of this pattern.

For a more in depth look at monads, check out our series on Functional Data Structures! We start by learning about simpler things like functors. Then we eventually work our way up to monads and even monad transformers!

The Basics of “Run”: The State Monad

Let’s start by recalling the State monad. This monad is parameterized by a state type, and we can access a value of this type as a global read/write state. Here’s an example function written in the State monad:

stateExample :: State Int (Int, Int, Int)
stateExample = do
  a <- get
  modify (+1)
  b <- get
  put 5
  c <- get
  return (a, b, c)

If this function is confusing, you should take a look at the documentation for State. It’ll at least show you the relevant type signatures. First we read the initial state. Then we modify it with some function. Finally we completely change it.

In the example above, if our initial state is 1, we’ll return (1,2,5) as the result. If the initial state is 2, we’ll return (2,3,5). But suppose we have a pure function. How do we call our state function?

pureFunction :: Int -> Int
pureFunction = ???

The answer is the runState function. We can check the documentation and find its type:

runState :: State s a -> s -> (a, s)

This function has two parameters. The first is a State action. We’ll pass our function above as this parameter! Then the second is the initial state, and this is how we’ll configure it. Then the result is pure. It contains our result, as well as the final value of the state. So here’s a sample call we can make that gives us this monadic expression in our pure function. We’ll call it from a where clause, and discard the final state:

pureFunction :: Int -> Int
pureFunction input = a + b + c
  where
    ((a,b,c), _) = runState stateExample input

This is the simplest example of how we can use the runXXX pattern.
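Alongside runState, there are two closely related functions worth knowing. Here is a sketch, assuming we import from Control.Monad.State in the mtl package. It uses evalState when we only want the result, and execState when we only want the final state. The top-level names besides stateExample are hypothetical:

```haskell
import Control.Monad.State (State, evalState, execState, get, modify, put, runState)

stateExample :: State Int (Int, Int, Int)
stateExample = do
  a <- get
  modify (+1)
  b <- get
  put 5
  c <- get
  return (a, b, c)

-- runState gives back both the result and the final state.
resultAndState :: ((Int, Int, Int), Int)
resultAndState = runState stateExample 1

-- evalState keeps only the result.
onlyResult :: (Int, Int, Int)
onlyResult = evalState stateExample 1

-- execState keeps only the final state.
onlyState :: Int
onlyState = execState stateExample 1
```

With an initial state of 1, the result is (1,2,5) and the final state is 5. Using evalState would let us skip the pattern match that discards the state in pureFunction.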

Upgrading to Transformers

Now, suppose our State function isn’t quite pure. It now wants to print some of its output, so it’ll need the IO monad. This means it’ll use the StateT monad transformer over IO:

stateTExample :: StateT Int IO (Int, Int, Int)
stateTExample = do
  a <- get
  lift $ putStrLn "Initial Value:"
  lift $ print a
  modify (+1)
  b <- get
  lift $ putStrLn "After adding 1:"
  lift $ print b
  put 5
  c <- get
  lift $ putStrLn "After setting as 5:"
  lift $ print c
  return (a, b, c)

Now instead of calling this function from a pure function, we’ll need to call it from an IO function. But once again, we’ll use a runXXX function. Now though, since we’re using a monad transformer, we won’t get a pure result. Instead, we’ll get our result in the underlying monad. This means we can call this function from IO. So let’s examine the type of the runStateT function. We’ve substituted IO for the generic monad parameter m:

runStateT :: StateT s IO a -> s -> IO (a, s)

It looks a lot like runState, except for the extra IO parameters! Instead of returning a pure tuple for the result, it returns an IO action containing that result. Thus we can call it from the IO monad.

main :: IO ()
main = do
  putStrLn "Please enter a number."
  input <- read <$> getLine
  results <- runStateT stateTExample input
  print results

We’ll get the following output as a result:

Please enter a number.
10
Initial Value:
10
After adding 1:
11
After setting as 5:
5
((10,11,5),5)

Using Run For Libraries

This pattern will often extend into libraries you use. For example, in our series on parsing, we examine the Megaparsec library. A lot of the individual parser combinators in that library exist in the Parsec or ParsecT monad. So we can combine a bunch of different parsers together into one function.

But then to run that function from your normal IO code (or another monad), you need to use the runParserT function. Let’s look at its type signature:

runParserT
  :: Monad m
  => ParsecT e s m a
  -> String -- Name of source file
  -> s -- Input for parser
  -> m (Either (ParseError (Token s) e) a)

There are a lot of type parameters there that you don’t need to understand. But the structure is the same. The first parameter to our run function is the monadic action. Then we’ll supply some other inputs we need. Then we get some result, wrapped in an outer monad (such as IO).

We can see the same pattern if we use the servant-client library to make client-side API calls. Any call you make to your API will be in the ClientM monad. Now here’s the type signature of the runClientM function:

runClientM :: ClientM a -> ClientEnv -> IO (Either ServantError a)

So again, the same pattern emerges. We’ll compose our monadic action and pass that as the first parameter. Then we’ll provide some initial state, in this case a ClientEnv. Finally, we’ll get our result (Either ServantError a) wrapped in an outer monad (IO).

Monads Within Expressions

It’s also important to remember that a lot of basic monads work without even needing a runXXX function! For instance, you can use a Maybe or Either monad to take out some of your error handling logic:

divideIfEven :: Int -> Maybe Int
divideIfEven x = if x `mod` 2 == 0
  then Just (x `quot` 2)
  else Nothing

dividesBy8 :: Int -> Bool
dividesBy8 x = case thirdResult of
  Just _ -> True
  Nothing -> False
  where
    thirdResult :: Maybe Int
    thirdResult = do
      y <- divideIfEven x
      z <- divideIfEven y
      divideIfEven z
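The same shape works with Either when we want to know why a computation failed. Here is a sketch under that assumption. The error messages and the divideThrice name are hypothetical choices for illustration:

```haskell
-- Like the Maybe version, but a failure carries an explanation.
divideIfEven :: Int -> Either String Int
divideIfEven x
  | even x = Right (x `quot` 2)
  | otherwise = Left (show x ++ " is odd")

-- Halve three times; the first Left stops the computation.
divideThrice :: Int -> Either String Int
divideThrice x = do
  y <- divideIfEven x
  z <- divideIfEven y
  divideIfEven z

dividesBy8 :: Int -> Bool
dividesBy8 x = case divideThrice x of
  Right _ -> True
  Left _ -> False
```

For the input 12, the computation goes 12, 6, 3 and then stops with Left "3 is odd". Once again there is no runXXX function involved. We just pattern match on the result.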

Conclusion

Monads are the key to using a lot of different Haskell libraries. But when you’re first starting out, it can be very confusing how you call into these functions from your code. The same applies with some common monad transformers like Reader and State. The most common pattern to look out for is the runXXXT pattern. Master this pattern and you’re well on your way to understanding monads and writing better Haskell!

For a closer look at monads and similar structures, make sure to read our series on Functional Data Structures. If the code in this article was confusing, you should definitely check it out! And if you’ve never written Haskell but want to start, download our Beginners Checklist!

Taking a Look Back: My Mistakes in Learning Haskell

Last week, we announced our Haskell From Scratch Beginners Course. Course sign-ups are still open, but not for much longer! They will close at midnight Pacific time on Wednesday, August 29th!

The course starts next week. But before it does, I wanted to take this opportunity to tell a little bit of the story of how I learned Haskell. I want to share the mistakes I made, since those motivated me to make this course.

My Haskell History

I first learned Haskell in college as part of a course on programming language theory. I admired the elegance of a few things in particular. I liked how lists and tuples worked well with the type system. I also appreciated the elegance of Haskell’s type definitions. No other language I had seen represented the idea of sum types so well. I also saw how useful pattern matching and recursion were. They made it very easy to break problems down into manageable parts.

After college, I had the idea for a code generation project. A couple college assignments had dealt with code generation. So I realized I already knew a couple Haskell libraries that could provide the foundation for the work. So I got to work writing up some Haskell. At first things were quite haphazard. Eventually though, I developed some semblance of test driven development and product organization.

About nine months into that project, I had the great fortune of landing a Haskell project at my day-job. As I ramped up on this project, I realized how deficient my knowledge was in a lot of areas. I realized then a lot of the mistakes I had been making while learning the language. This motivated me to start the Monday Morning Haskell blog.

Main Advice

Of course, I’ve tried to incorporate my learnings throughout the material on this blog. But if I had to distill the key ideas, here’s what they’d be.

First, learn tools and project organization early! Learn how to use Stack and/or Cabal! For help with this, you can check out our free Stack mini-course! After several months on my side project, I had to start from scratch to some extent. The only “testing” I was doing was running some manual executables and commands in GHCI. So once I learned more about these tools, I had to re-work a lot of code.

Second, it helps a lot to have some kind of structure when you’re first learning the language. Working on a project is nice, but there are a lot of unknown-unknowns out there. You’ll often find a “solution” for your problem, only to see that you need a lot more knowledge to implement it. You need to have a solid foundation on the core concepts before you can dive in on anything. So look for a source that provides some kind of structure to your Haskell learning, like a book (or an online course!).

Third, let’s get to monads. They’re an important key to Haskell and widely misunderstood. But there are a couple things that will help a lot. First, learn the syntactic patterns of do-syntax. Second, learn how to use run functions (runState, runReaderT, etc.). These are how you bring monadic expressions into the rest of your code. You can check out our Monads Series for some help on these ideas. We’ll also have an article on this topic next week! (And of course, you’ll learn all about monads in Haskell From Scratch!)

Finally, ask for help earlier! I still don’t plug into the Haskell network as much as I should. There are a lot of folks out there who are more than willing to help. Freenode is a great place, as are Reddit and even Twitter!

Conclusion

There’s never been a better time to start learning Haskell! The language tools have developed a ton in the last few years and the community is growing stronger. As we announced last week, we’ve now opened up our Haskell From Scratch Beginners Course! You don’t need any Haskell experience to take this course. So if you always wanted to learn more about the language but needed more organization, this is your chance!

Announcing: Haskell From Scratch Beginners Course!

This week we have a huge announcement we’ve been working towards for a long time. One of the main goals of this blog has been to create content to make it easy for newcomers to learn Haskell. We’ve now reached the culmination of that goal with our brand new Haskell From Scratch course. This online course will teach you the basics of using and writing Haskell. It assumes no prior knowledge of Haskell, but you should have at least some programming background. To sign up, head over to the course page.

Course Overview

The course consists of seven modules. Each module has a series of video lectures and accompanying exercises. In the first module, we’ll go over the fundamental structure of the language. We’ll take an in-depth look at how we compose programs by using expressions. Then we’ll see how the type system affects what we can do with those expressions.

In module 2, we’ll try to build a deeper mastery of the type system by learning how to construct our own types. We’ll also see how Haskell’s typeclass system lets us capture common behavior between types.

Module 3 deals with lists and recursion. Haskell doesn’t use for-loops like you have in mainstream languages. Instead, we tend to solve problems with recursion. There's a clear, recognizable pattern to recursion. We'll use Haskell's list structure to help understand this pattern.

In module 4, we’ll take our first steps towards learning about monads and writing real programs. We’ll learn about the IO monad, whose functions allow us to do more interesting things. We'll see how to get user input, manipulate files and even use threads.

In module 5, we’ll take what we learned from writing IO code and apply it to learning about other monads. We’ll start by learning about other kinds of functional data structures. Then we’ll use these patterns to help us learn this most dreaded of Haskell concepts (it’s actually not bad!).

Module 6 deals with the complications of similar-looking data. We have many different ways of representing numbers or strings, for example. And each representation can have a different type. This presents a lot of unique challenges for us as Haskell developers. We’ll explore these more in this module.

Finally, we’ll wrap the course up by learning the "Haskell" approach to problem solving. We'll look at some common programming problems and see how to solve them in Haskell. We’ll consider paradigms like memoization and dynamic programming.

Besides the course material, there will also be a Slack group for this course. This will be a place where you can get help from myself or any of your fellow classmates!

Course Schedule

The course will launch on Monday, September 3rd, with the release of module 1. We will then release a new module each Thursday and Monday thereafter. Don’t worry if you’re busy on a particular week! You’ll be able to access all old content indefinitely.

Now, sign-ups for the course will end on Wednesday, August 29th! So don’t miss out! Head over to the course page and reserve your spot today! If you’re not sure yet about starting with Haskell, you can also download our Beginners Checklist. It’ll give you the tools you need to try the language out!

Series Spotlight: Haskell Web Skills!

Haskell has a reputation of being a neat language with cool concepts used a lot in academic research. But the dark side of this reputation is that people imagine it's unsuited for production use. But nothing could be further from the truth!

A common misconception about Haskell is that it “doesn’t have side effects.” Of course any language needs to have side effects to be effective. What makes Haskell unique isn’t that it doesn’t have side effects at all. What makes it unique (or at least uncommon) is that you have to encode side effects within the type system. This means you can know at compile time where the effects lie in your system.
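
As a tiny illustration of that last point, here's a sketch with two made-up functions (shout and shoutToConsole are hypothetical names, not from the series). The type signature alone tells us which one can touch the outside world:

```haskell
import Data.Char (toUpper)

-- No IO in the type, so this function cannot print, read files,
-- or talk to a database. It can only compute a new String.
shout :: String -> String
shout = map toUpper

-- The IO in the result type marks this as an effectful action.
shoutToConsole :: String -> IO ()
shoutToConsole msg = putStrLn (shout msg)
```

If we tried to call putStrLn inside shout, the compiler would reject it. That's what it means to know at compile time where the effects are.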

To get a feel for how you can write normal programs with Haskell, you should read our Haskell Web Series. It will take you through the basics of building a simple web application using Haskell. We’ll highlight a couple specific libraries that allow you to do common web tasks.

In part 1, we’ll use the Persistent library to connect to a PostgreSQL database. This involves using a more complex concept called Template Haskell to create our database schema. That way, we can automatically generate the SQL statements we need.

In part 2, we’ll learn how to use the Servant library to create an HTTP server. We’ll also see how we can connect it to our database and make requests from there.

Next, we’ll use the Hedis library to connect to Redis. This will allow us to cache some of our database information. Then when we need it again, we won't have to go to our database!

Part 4 covers how we test such a complicated system. We’ll also see how to use Stack to connect our system with Docker so that our side services are easy to manage.

Finally, we wrap this series up in part 5, where we’ll look at some more complicated database queries we can make. This will require learning another library called Esqueleto. This cool library works in tandem with Persistent.

There are a lot of different libraries involved in this series. So it’s essential you know how to bring outside code into your Haskell application. To learn more about package management, you should also take our free Stack mini-course. It’ll help you learn how to create a real Haskell program and connect to libraries with ease.

And if you’ve never written Haskell before, now’s a great time to start! Download our Beginner’s Checklist for some tips!

Keeping it Clean: Haskell Code Formatters

A long time ago, we had an article that featured some tips on how to organize your import statements. As far as I remember, that’s the only piece we’ve ever done on code formatting. But as you work on projects with more people, formatting is one thing you need to consider to keep everyone sane. You’d like to have a consistent style across the code base. That way, there’s less controversy in code reviews, and people don't need to think as much to update code. They shouldn't have to wonder about following the style guide or what exists in a fragment of code.

This week, we’re going to go over three different Haskell code formatting tools. We’ll examine Stylish Haskell, Hindent, and Brittany. These all have their pluses and minuses, as we’ll see.

For some ideas of what Haskell projects you can do, download our Production Checklist. You can also take our free Stack mini-course and learn how to use Stack to organize your code!

Stylish Haskell

The first tool we’ll look at is Stylish Haskell. This is a straightforward tool to use, as it does some cool things with no configuration required. Let’s take a look at a poorly formatted version of code from our Beam article.

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE ImpredicativeTypes #-}

module Schema where

import Database.Beam
import Database.Beam.Backend
import Database.Beam.Migrate
import Database.Beam.Sqlite
import Database.SQLite.Simple (open, Connection)

import Data.Int (Int64)
import Data.Text (Text)
import Data.Time (UTCTime)
import qualified Data.UUID as U

data UserT f = User
  { _userId :: Columnar f Int64
  , _userName :: Columnar f Text
  , _userEmail :: Columnar f Text
  , _userAge :: Columnar f Int
  , _userOccupation :: Columnar f Text
  } deriving (Generic)

There are many undesirable things here. Our language pragmas don’t line up their end braces. They also aren’t in any discernible order. Our imports are also not lined up, and neither are the fields in our data types.

Stylish Haskell can fix all this. First, we’ll install it globally with:

stack install stylish-haskell

(You can also use cabal instead of stack.) Then we can call the stylish-haskell command on a file. By default, it will output the results to the terminal. But if we pass the -i flag, it will update the file in place. This will make all the changes we want to line up the various statements in our file!

>> stylish-haskell -i Schema.hs

--- Result:

{-# LANGUAGE DeriveGeneric         #-}
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE GADTs                 #-}
{-# LANGUAGE ImpredicativeTypes    #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings     #-}
{-# LANGUAGE StandaloneDeriving    #-}
{-# LANGUAGE TypeApplications      #-}
{-# LANGUAGE TypeFamilies          #-}
{-# LANGUAGE TypeSynonymInstances  #-}

module Schema where

import           Database.Beam
import           Database.Beam.Backend
import           Database.Beam.Migrate
import           Database.Beam.Sqlite
import           Database.SQLite.Simple (Connection, open)

import           Data.Int               (Int64)
import           Data.Text              (Text)
import           Data.Time              (UTCTime)
import qualified Data.UUID              as U

data UserT f = User
  { _userId         :: Columnar f Int64
  , _userName       :: Columnar f Text
  , _userEmail      :: Columnar f Text
  , _userAge        :: Columnar f Int
  , _userOccupation :: Columnar f Text
  } deriving (Generic)

Stylish Haskell integrates well with most common editors. For instance, if you use Vim, you can also run the command from within the editor with the command:

:%!stylish-haskell

We get all these features without any configuration. If we want to change things though, we can create a configuration file. We’ll make a default file with the following command:

stylish-haskell --defaults > .stylish-haskell.yaml

Then if we want, we can modify it a bit. For one example, we've aligned our imports above globally. This means they all leave space for qualified. But we can decide we don’t want a group of imports to have that space if there are no qualified imports. There’s a setting for this in the config. By default, it looks like this:

imports:
  align: global

We can change it to group to ensure our imports are only aligned within their grouping.

imports:
  align: group

And now when we run the command, we’ll get a different result:

module Schema where

import Database.Beam
import Database.Beam.Backend
import Database.Beam.Migrate
import Database.Beam.Sqlite
import Database.SQLite.Simple (Connection, open)

import           Data.Int  (Int64)
import           Data.Text (Text)
import           Data.Time (UTCTime)
import qualified Data.UUID as U

So in short, Stylish Haskell is a great tool for a limited scope. It has uncontroversial suggestions for several areas like imports and pragmas. It also removes trailing whitespace, and adjusts case statements sensibly. That said, it doesn’t affect your main Haskell code. Let’s look at a couple tools that can do that.

Hindent

Another program we can use is hindent. As its name implies, it deals with updating whitespace and indentation levels. Let’s look at a very simple example. Consider this code, adapted from our Beam article:

user1' = User default_  (val_ "James")  (val_ "james@example.com")  (val_ 25)  (val_ "programmer")

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
    users <- runSelectReturningList $ select $ do
        user <- (all_ (_blogUsers blogDb))
        article <- (all_ (_blogArticles blogDb))
        guard_ (user ^. userName ==. (val_ "James"))
        guard_ (article ^. articleUserId ==. user ^. userId) 
        return (user, article)
    mapM_ (liftIO . putStrLn . show) users

There are a few things we could change. First, we might want to update the indentation level so that it is 2 instead of 4. Second, let's restrict the line size to only being 80. When we run hindent on this file, it’ll make the changes.

user1' =
  User
    default_
    (val_ "James")
    (val_ "james@example.com")
    (val_ 25)
    (val_ "programmer")

findUsers :: Connection -> IO ()
findUsers conn =
  runBeamSqlite conn $ do
    users <-
      runSelectReturningList $
      select $ do
        user <- (all_ (_blogUsers blogDb))
        article <- (all_ (_blogArticles blogDb))
        guard_ (user ^. userName ==. (val_ "James"))
        guard_ (article ^. articleUserId ==. user ^. userId)
        return (user, article)
    mapM_ (liftIO . putStrLn . show) users

Hindent is also configurable. We can create a file .hindent.yaml. By default, we would have the following configuration:

indent-size: 2
line-length: 80
force-trailing-newline: true

But then we can change it if we want so that the indentation level is 3:

indent-size: 3

And now when we run it, we’ll actually see that it’s changed to reflect that:

findUsers :: Connection -> IO ()
findUsers conn =
   runBeamSqlite conn $ do
      users <-
         runSelectReturningList $
         select $ do
            user <- (all_ (_blogUsers blogDb))
            article <- (all_ (_blogArticles blogDb))
            guard_ (user ^. userName ==. (val_ "James"))
            guard_ (article ^. articleUserId ==. user ^. userId)
            return (user, article)
      mapM_ (liftIO . putStrLn . show) users

Hindent also has some other effects that, as far as I can tell, are not configurable. You can see that the separation of lines was not preserved above. In another example, it spaced out instance definitions that I had grouped in another file:

-- BEFORE
deriving instance Show User
deriving instance Eq User
deriving instance Show UserId
deriving instance Eq UserId

-- AFTER
deriving instance Show User

deriving instance Eq User

deriving instance Show UserId

deriving instance Eq UserId

So make sure you’re aware of everything it does before committing to using it. Like stylish-haskell, hindent integrates well with text editors.

Brittany

Brittany is an alternative to Hindent for modifying your expression definitions. It mainly focuses on the use of horizontal space throughout your code. As far as I see, it doesn’t line up language pragmas or change import statements in the way stylish-haskell does. It also doesn’t touch data type declarations. Instead, it seeks to reformat your code to make maximal use of space while avoiding lines that are too long. As an example, we could look at this line from our Beam example:

insertArticles :: Connection -> IO ()
insertArticles conn = runBeamSqlite conn $ runInsert $ 
  insert (_blogArticles blogDb) $ insertValues articles

Our decision on where to separate the line is a little bit arbitrary. But at the very least we don’t try to cram it all on one line. But if we have either the approach above or the one-line version, Brittany will change it to this:

brittany --write-mode=inplace MyModule.hs

--

insertArticles :: Connection -> IO ()
insertArticles conn =
  runBeamSqlite conn $ runInsert $ insert (_blogArticles blogDb) $ insertValues
    articles

This makes “better” use of horizontal space in the sense that we get as much as possible on the first line. That said, one could argue that the first approach we have actually looks nicer. Brittany can also change type signatures that overflow the line limit. Suppose we have this arbitrary type signature that’s too long for a single line:

myReallyLongFunction :: State ComplexType Double -> Maybe Double -> Either Double ComplexType -> IO a -> StateT ComplexType IO a

Brittany will fix it up so that each argument type is on a single line:

myReallyLongFunction
  :: State ComplexType Double
  -> Maybe Double
  -> Either Double ComplexType
  -> IO a
  -> StateT ComplexType IO a

This can be useful in projects with very complicated types. The structure makes it easier for you to add Haddock comments to the various arguments.
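
To show what that looks like, here's a small sketch with per-argument Haddock comments. The function clampTo is a hypothetical example rather than something from our Beam code, but the layout is the same one Brittany produces:

```haskell
-- | Clamp a value into an inclusive range.
clampTo
  :: Int -- ^ Lower bound of the range
  -> Int -- ^ Upper bound of the range
  -> Int -- ^ Value to clamp
  -> Int -- ^ The value, limited to the range
clampTo lo hi x = max lo (min hi x)
```

Since each argument already sits on its own line, adding or editing one of these comments only touches a single line in a diff.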

Dangers

There is, of course, a (small) danger to using tools like these. If you’re going to use them, you want to ensure everyone on the project is using them. Suppose person A isn’t using the program, and commits code that isn’t formatted by the program. Person B might then look through that code, and their editor will correct the file. This will leave them with local changes to the file that aren’t relevant to whatever work they’re doing. This can cause a lot of confusion when they submit code for review. Whoever reviews their code has to sift through the format changes, which slows the review.

People can also have (unreasonably) strong opinions about code formatting. So it’s generally something you want to nail down early on a project and avoid changing afterward. With the examples in this article, I would say it would be an easy sell to use Stylish Haskell on a project. However, the specific choices made in Hindent and Brittany can be more controversial. So it might cause more problems than it would solve to institute those project-wide.

Conclusion

It’s possible to lose a surprising amount of productivity to code formatting. So it can be important to nail down standards early and often. Code formatting programs can make it easy to enforce particular standards. They’re also very simple to incorporate into your projects with stack and your editor of choice!

Now that you know how to format your code, need some suggestions for what to work on next? Take a look at our Production Checklist! It’ll give you some cool ideas of libraries you can use for building Haskell web apps and much more!

Beam: Database Power without Template Haskell!

As part of our Haskell Web Series, we examined the Persistent and Esqueleto libraries. The first of these allows you to create a database schema in a special syntax. You can then use Template Haskell to generate all the necessary Haskell data types and instances for your types. Even better, you can write Haskell code to query on these that resembles SQL. These queries are type-safe, which is awesome. However, the need to specify our schema with Template Haskell presented some drawbacks. For instance, the code takes longer to compile and is less approachable for beginners.

This week on the blog, we'll be exploring another database library called Beam. This library allows us to specify our database schema without using Template Haskell. There's some boilerplate involved, but it's not bad at all! Like Persistent, Beam has support for many backends, such as SQLite and PostgreSQL. Unlike Persistent, Beam also supports join queries as a built-in part of its system.

For some more ideas on advanced libraries, be sure to check out our Production Checklist! It includes a couple more different database options to look at.

Specifying our Types

As a first note, while Beam doesn't require Template Haskell, it does need a lot of other compiler extensions. You can look at those in the appendix below, or else take a look at the example code on Github. Now let's think back to how we specified our schema when using Persistent:

import qualified Database.Persist.TH as PTH

PTH.share [PTH.mkPersist PTH.sqlSettings, PTH.mkMigrate "migrateAll"] [PTH.persistLowerCase|
  User sql=users
    name Text
    email Text
    age Int
    occupation Text
    UniqueEmail email
    deriving Show Read Eq

  Article sql=articles
    title Text
    body Text
    publishedTime UTCTime
    authorId UserId
    UniqueTitle title
    deriving Show Read Eq
|]

With Beam, we won't use Template Haskell, so we'll actually be creating normal Haskell data types. There will still be some oddities though. First, by convention, we'll specify our types with the extra character T at the end. This is unnecessary, but the convention helps us remember what types relate to tables. We'll also have to provide an extra type parameter f, that we'll get into a bit more later:

data UserT f =
  …

data ArticleT f =
  ...

Our next convention will be to use an underscore in front of our field names. We will also, unlike Persistent, specify the type name in the field names. With these conventions, I'm following the advice of the library's creator, Travis.

data UserT f = User
  { _userId :: ...
  , _userName :: …
  , _userEmail :: …
  , _userAge :: …
  , _userOccupation :: …
  }

data ArticleT f = Article
  { _articleId :: …
  , _articleTitle :: …
  , _articleBody :: …
  , _articlePublishedTime :: …
  }

So when we specify the actual types of each field, we'll just put the relevant data type, like Int, Text or whatever, right? Well, not quite. To complete our types, we're going to fill in each field with the type we want, except specified via Columnar f. Also, we'll derive Generic on both of these types, which will allow Beam to work its magic:

data UserT f = User
  { _userId :: Columnar f Int64
  , _userName :: Columnar f Text
  , _userEmail :: Columnar f Text
  , _userAge :: Columnar f Int
  , _userOccupation :: Columnar f Text
  } deriving (Generic)

data ArticleT f = Article
  { _articleId :: Columnar f Int64
  , _articleTitle :: Columnar f Text
  , _articleBody :: Columnar f Text
  , _articlePublishedTime :: Columnar f Int64 -- Unix Epoch
  } deriving (Generic)

Now there are a couple small differences between this and our previous schema. First, we have the primary key as an explicit field of our type. With Persistent, we separated it using the Entity abstraction. We'll see below how we can deal with situations where that key isn't known. The second difference is that (for now), we've left out the userId field on the article. We'll add this when we deal with primary keys.

Columnar

So what exactly is this Columnar business about? Well under most circumstances, we'd like to specify a User with the raw field types. But there are some situations where we'll have to use a more complicated type for an SQL expression. Let's start with the simple case first.

Luckily, Columnar works in such a way that if we use Identity for f, we can use raw types to fill in the field values. We'll make a type synonym specifically for this identity case. We can then make some examples:

type User = UserT Identity
type Article = ArticleT Identity

user1 :: User
user1 = User 1 "James" "james@example.com" 25 "programmer"

user2 :: User
user2 = User 2 "Katie" "katie@example.com" 25 "engineer"

users :: [User]
users = [ user1, user2 ]

As a note, if you find it cumbersome to repeat the Columnar name, you can shorten it to C:

data UserT f = User
  { _userId :: C f Int64
  , _userName :: C f Text
  , _userEmail :: C f Text
  , _userAge :: C f Int
  , _userOccupation :: C f Text
  } deriving (Generic)

Now, our initial examples will assign all our fields with raw values. So we won't initially need to use anything for the f parameter besides Identity. Further down though, we'll deal with the case of auto-incrementing primary keys. In this case, we'll use the default_ function, whose type is actually a Beam form of an SQL expression. In this case, we'll be using a different type for f, but the flexibility will allow us to keep using our User constructor!
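
To get a feel for why this flexibility works, here's a simplified sketch of the idea using only base. Col is a hypothetical stand-in for Columnar (Beam's real definition handles more cases), and PersonT is a made-up table type that has nothing to do with our schema:

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Functor.Identity (Identity)

-- Simplified stand-in for Columnar: with Identity we get the raw type,
-- and with any other f we get the value wrapped in f.
type family Col f a where
  Col Identity a = a
  Col f a        = f a

data PersonT f = Person
  { personName :: Col f String
  , personAge  :: Col f Int
  }

type Person = PersonT Identity

-- With Identity, the fields are plain values.
alice :: Person
alice = Person "Alice" 30

-- With Maybe, the same constructor takes wrapped values instead.
unknownAge :: PersonT Maybe
unknownAge = Person (Just "Bob") Nothing
```

The same Person constructor builds both values. Only the choice of f changes what each field holds, which is the property we'll lean on when we replace a raw key with an SQL expression.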

Instances for Our Types

Now that we've specified our types, we can use the Beamable and Table type classes to tell Beam more about our types. Before we can make any of these types a Table, we'll want to assign its primary key type. So let's make a couple more type synonyms to represent these:

type UserId = PrimaryKey UserT Identity
type ArticleId = PrimaryKey ArticleT Identity

While we're at it, let's add that foreign key to our Article type:

data ArticleT f = Article
  { _articleId :: Columnar f Int64
  , _articleTitle :: Columnar f Text
  , _articleBody :: Columnar f Text
  , _articlePublishedTime :: Columnar f Int64
  , _articleUserId :: PrimaryKey UserT f
  } deriving (Generic)

We can now generate instances for Beamable both on our main types and on the primary key types. We'll also derive instances for Show and Eq:

data UserT f =
  …

deriving instance Show User
deriving instance Eq User

instance Beamable UserT
instance Beamable (PrimaryKey UserT)

data ArticleT f =
  …

deriving instance Show Article
deriving instance Eq Article

instance Beamable ArticleT
instance Beamable (PrimaryKey ArticleT)

Now we'll create an instance for the Table class. This will involve some type family syntax. We'll specify UserId and ArticleId as our primary key data types. Then we can fill in the primaryKey function to match up the right field.

instance Table UserT where
  data PrimaryKey UserT f = UserId (Columnar f Int64) deriving Generic
  primaryKey = UserId . _userId

instance Table ArticleT where
  data PrimaryKey ArticleT f = ArticleId (Columnar f Int64) deriving Generic
  primaryKey = ArticleId . _articleId
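
The type family syntax can look odd at first, so here's a stripped-down sketch of the same shape using only base. HasKey, Key, keyOf, Col and BookT are made-up names that mimic Table, PrimaryKey, primaryKey, Columnar and a table type. This isn't Beam's actual class, which carries more constraints:

```haskell
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies   #-}

import Data.Functor.Identity (Identity)
import Data.Kind (Type)

-- Simplified stand-in for Columnar.
type family Col (f :: Type -> Type) a where
  Col Identity a = a
  Col f a        = f a

-- Simplified stand-in for Table. Each instance declares its own key
-- type through the associated data family.
class HasKey (t :: (Type -> Type) -> Type) where
  data Key t (f :: Type -> Type)
  keyOf :: t f -> Key t f

data BookT f = Book
  { bookId    :: Col f Int
  , bookTitle :: Col f String
  }

instance HasKey BookT where
  data Key BookT f = BookId (Col f Int)
  keyOf = BookId . bookId

-- With Identity, the key wraps a plain Int.
bookKeyValue :: Key BookT Identity -> Int
bookKeyValue (BookId n) = n
```

The data line inside the instance is what creates a brand new key type for each table. So a key for one table can't be mixed up with a key for another, even though both wrap the same kind of number.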

Accessor Lenses

We'll do one more thing to mimic Persistent. The Template Haskell automatically generated lenses for us. We could use those when making database queries. Below, we'll use something similar. But we'll use a special function, tableLenses, to make these rather than Template Haskell. If you remember back to how we used the Servant Client library, we could create client functions by using client and matching it against a pattern. We'll do something similar with tableLenses. We'll use LensFor on each field of our tables, and create a pattern constructing an item.

User
  (LensFor userId)
  (LensFor userName)
  (LensFor userEmail)
  (LensFor userAge)
  (LensFor userOccupation) = tableLenses

Article
  (LensFor articleId)
  (LensFor articleTitle)
  (LensFor articleBody)
  (LensFor articlePublishedTime)
  (UserId (LensFor articleUserId)) = tableLenses

Note we have to wrap the foreign key lens in UserId.

Creating our Database

Now unlike Persistent, we'll create an extra type that will represent our database. Each of our two tables will have a field within this database:

data BlogDB f = BlogDB
  { _blogUsers :: f (TableEntity UserT)
  , _blogArticles :: f (TableEntity ArticleT)
  } deriving (Generic)

We'll need to make our database type an instance of the Database class. We'll also specify a set of default settings we can use on our database. Both of these items will involve a parameter be, which stands for a backend (e.g. SQLite, Postgres). We leave this parameter generic for now.

instance Database be BlogDB

blogDb :: DatabaseSettings be BlogDB
blogDb = defaultDbSettings

Inserting into Our Database

Now, migrating our database with Beam is a little more complicated than it is with Persistent. We might cover that in a later article. For now, we'll keep things simple, and use an SQLite database and migrate it ourselves. So let's first create our tables. We have to follow Beam's conventions here, particularly on the user_id__id field for our foreign key:

CREATE TABLE users \
  ( id INTEGER PRIMARY KEY AUTOINCREMENT \
  , name VARCHAR NOT NULL \
  , email VARCHAR NOT NULL \
  , age INTEGER NOT NULL \
  , occupation VARCHAR NOT NULL \
  );
CREATE TABLE articles \
  ( id INTEGER PRIMARY KEY AUTOINCREMENT \
  , title VARCHAR NOT NULL \
  , body VARCHAR NOT NULL \
  , published_time INTEGER NOT NULL \
  , user_id__id INTEGER NOT NULL \
  );

Now we want to write a couple queries that can interact with the database. Let's start by inserting our raw users. We begin by opening up an SQLite connection, and we'll write a function that uses this connection:

import Database.SQLite.Simple (open, Connection)

main :: IO ()
main = do
  conn <- open "blogdb1.db"
  insertUsers conn

insertUsers :: Connection -> IO ()
insertUsers = ...

We start our expression by using runBeamSqlite and passing the connection. Then we use runInsert to specify to Beam that we wish to make an insert statement.

import Database.Beam
import Database.Beam.Sqlite

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  ...

Now we'll use the insert function and signal which one of our tables we want out of our database:

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  insert (_blogUsers blogDb) $ ...

Last, since we are inserting raw values (UserT Identity), we use the insertValues function to complete this call:

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  insert (_blogUsers blogDb) $ insertValues users

And now we can check and verify that our users exist!

SELECT * FROM users;
1|James|james@example.com|25|programmer
2|Katie|katie@example.com|25|engineer

Let's do the same for articles. We'll use the pk function to access the primary key of a particular User:

article1 :: Article
article1 = Article 1 "First article" 
  "A great article" 1531193221 (pk user1)

article2 :: Article
article2 = Article 2 "Second article" 
  "A better article" 1531199221 (pk user2)

article3 :: Article
article3 = Article 3 "Third article" 
  "The best article" 1531200221 (pk user1)

articles :: [Article]
articles = [article1, article2, article3]

insertArticles :: Connection -> IO ()
insertArticles conn = runBeamSqlite conn $ runInsert $
  insert (_blogArticles blogDb) $ insertValues articles

Select Queries

Now that we've inserted a couple elements, let's run some basic select statements. In general for select, we'll want the runSelectReturningList function. We could also query for a single element with a different function if we wanted:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ ...

Now we'll use select instead of insert from the last query. We'll also use the function all_ on our users field in the database to signify that we want them all. And that's all we need!

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select (all_ (_blogUsers blogDb))
  mapM_ (liftIO . putStrLn . show) users

To do a filtered query, we'll start with the same framework. But now we need to enhance our select statement into a monadic expression. We'll start by selecting user from all our users:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select $ do
    user <- (all_ (_blogUsers blogDb))
    ...
  mapM_ (liftIO . putStrLn . show) users

And we'll now filter on that by using guard_ and applying one of our lenses. We use the ==. operator for equality, like in Persistent. We also have to wrap our raw comparison value with val_:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select $ do
    user <- (all_ (_blogUsers blogDb))
    guard_ (user ^. userName ==. (val_ "James"))
    return user
  mapM_ (liftIO . putStrLn . show) users

And that's all we need! Beam will generate the SQL for us! Now let's try to do a join. This is actually much simpler in Beam than with Persistent/Esqueleto. All we need is to add a couple more statements to our "select" on the articles. We'll just filter them by the user ID!

findUsersAndArticles :: Connection -> IO ()
findUsersAndArticles conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select $ do
    user <- (all_ (_blogUsers blogDb))
    guard_ (user ^. userName ==. (val_ "James"))
    article <- (all_ (_blogArticles blogDb))
    guard_ (article ^. articleUserId ==. user ^. userId)
    return user
  mapM_ (liftIO . putStrLn . show) users

That's all there is to it!
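If the monadic shape of that query feels unfamiliar, it may help to see the same pattern in the plain list monad. This is only an analogy using base: the User and Article records, the sample data, and jamesArticles below are hypothetical stand-ins, not Beam types, and no SQL is generated here.

```haskell
import Control.Monad (guard)

-- Hypothetical plain records standing in for the rows of each table.
data User = User { userId :: Int, userName :: String }
  deriving (Show, Eq)

data Article = Article { articleTitle :: String, articleUserId :: Int }
  deriving (Show, Eq)

sampleUsers :: [User]
sampleUsers = [User 1 "James", User 2 "Katie"]

sampleArticles :: [Article]
sampleArticles =
  [ Article "First article" 1
  , Article "Second article" 2
  , Article "Third article" 1
  ]

-- Same shape as the Beam query: draw a user, filter on the name,
-- draw an article, filter on the foreign key, then return the pair.
jamesArticles :: [(User, Article)]
jamesArticles = do
  user <- sampleUsers
  guard (userName user == "James")
  article <- sampleArticles
  guard (articleUserId article == userId user)
  return (user, article)
```

Loading this in GHCi and evaluating jamesArticles pairs James with his first and third articles. Returning the pair rather than just the user is a small variation that makes the effect of the join visible.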

Auto Incrementing Primary Keys

In the examples above, we hard-coded all our IDs. But this isn't typically what you want. We should let the database assign the ID via some rule, in our case auto-incrementing. In this case, instead of creating a User "value", we'll make an "expression". This is possible through the polymorphic f parameter in our type. We'll leave off the type signature since it's a bit confusing. But here's the expression we'll create:

user1' = User
  default_ 
  (val_ "James")
  (val_ "james@example.com")
  (val_ 25)
  (val_ "programmer")

We use default_ to represent an expression that will tell SQL to use a default value. Then we lift all our other values with val_. Finally, we'll use insertExpressions instead of insertValues in our Haskell expression.

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  insert (_blogUsers blogDb) $ insertExpressions [ user1' ]

Then we'll have our auto-incrementing key!
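To get a feel for why the polymorphic f parameter makes this possible, here's a minimal sketch using only base. Everything in it (UserRow, Expr, resolve, assignId) is a toy we made up for illustration. Beam's real expression type is far richer, and the real database assigns the ID, not our Haskell code.

```haskell
import Data.Functor.Identity (Identity (..))

-- A toy row type with the same "polymorphic f" shape as a Beam table.
data UserRow f = UserRow
  { rowId   :: f Int
  , rowName :: f String
  }

-- A toy stand-in for a SQL expression: either "use the column default"
-- or a literal value.
data Expr a = Default | Val a
  deriving (Show, Eq)

-- A fully specified row: every field is a plain value.
concreteUser :: UserRow Identity
concreteUser = UserRow (Identity 1) (Identity "James")

-- A row "expression": the ID is left for the database to fill in.
pendingUser :: UserRow Expr
pendingUser = UserRow Default (Val "James")

-- Pick the fallback for Default, or unwrap a literal value.
resolve :: a -> Expr a -> a
resolve fallback Default = fallback
resolve _ (Val x) = x

-- Simulate the database choosing the next auto-incremented ID.
-- The name column has no real default; the fallback here is arbitrary.
assignId :: Int -> UserRow Expr -> UserRow Identity
assignId nextId (UserRow i n) =
  UserRow (Identity (resolve nextId i)) (Identity (resolve "unknown" n))
```

The same record definition serves both purposes. With Identity we get ordinary values, and with another type constructor we get something that can say "let the database decide".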

Conclusion

That concludes our introduction to the Beam library. As we saw, Beam is a great library that lets you specify a database schema without using any Template Haskell. For more details, make sure to check out the documentation!

For a more in depth look at using Haskell libraries to make a web app, be sure to read our Haskell Web Series. It goes over some database mechanics as well as creating APIs and testing. As an added challenge, try rewriting the code in that series to use Beam instead of Persistent. See how much of the Servant code needs to change to accommodate that.

And for more examples of cool libraries, download our Production Checklist! There are some more database and API libraries you can check out!

Appendix: Compiler Extensions

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE NoMonomorphismRestriction #-}

Making the Jump: Advancing Past Beginner Haskell

At certain points in your Haskell development, you might plateau. This can happen to people at many stages, including right at the start. You get used to certain patterns in the language, and it’s hard to know where to go next. This is our first article on how to break out of these plateaus. Specifically, we’ll look at breaking out of a beginner’s plateau. To jump past the simple language constructs, you need to learn how to organize your Haskell code and incorporate other people’s code.

Of course, you can’t make this jump if you’ve never started! Download our Beginners Checklist for some tools on starting out with Haskell. And read our Liftoff Series for some more details on the basic language mechanics.

Organizing Your Project

When I started writing my first Haskell project, it felt like I was trying to hold together steel beams with Elmer’s glue. I installed all my packages system-wide and only tested my code with GHCI and the runghc command. This limited the development of the project and made it hard for me to incorporate anything new.

Generally, you’ll either be building a library for other people to use, or an executable program. And you want to be able to organize everything involved in that so you can coherently run and test what you’ve created. To do this, you need to learn one of the package management tools for Haskell.

The oldest system in common use is Cabal. Cabal allows you to install Haskell packages. It also allows you to organize your project, install executables, and run tests with ease. Check out the user guide for more information on using Cabal on its own.

Cabal has some gotchas though. So if you’re a beginner, I strongly recommend using Stack. Stack adds another layer on top of Cabal, and it simplifies many of Cabal’s difficulties. To learn more about Stack, you can take our free Stack mini-course!

With either of these, there are three main takeaways. You should declare all your dependencies in 1-2 files. The final product of your project should be explicitly declared in your .cabal file. Finally, you should be able to run all your tests with a single command.

Hackage

Stack and Cabal both make it easy to bring in dependencies. But how do you know what other libraries you can use, and what their functions are? The main source of Haskell libraries is Hackage. You should get used to the idea of googling Haskell functions, and finding the package you need on Hackage. Once you’ve found a function you want to use, you should get used to this pattern:

  1. What module is this function in?
  2. What package is that module part of?
  3. Add this package to your .cabal file and import the module into your code.
  4. Compile your code and make sure it works.

Sometimes, you’ll need code that isn’t on Hackage (Github, for instance). Stack has some special ways for you to incorporate that code. You’ll need an entry in the stack.yaml file instead of just the .cabal file.

Hackage Gotchas

While we’re discussing Hackage, there are some gotchas you should be aware of when perusing Haskell documentation. First, it’s important to keep an eye on the version of the package you’re using. On many occasions, I’ve found a discrepancy between the type I had in my program and the type in the documentation. Typically, I would google the package name, and get a link to the documentation for the wrong version. And types can sometimes change between versions, so that will definitely cause problems!

A useful command is stack list-dependencies. It will show the version number for every package you are currently using. Then compare that number for your package to what is at the top of the Hackage page. If they don’t match, click the contents button in the top right. This will take you to the list of versions, where you can find the one you’re actually using!

Another gotcha can occur when you know a function name and want to find it on Hackage. You’ll import it via a generic sounding module name. And then you’ll go to the documentation for that module and find that all you have is a list of re-exported modules. This can be frustrating, especially when those modules in turn re-export other modules. You have to do a full depth first search to try to find the function you’re after. In this situation, you’ll want help from some other tools.

Hoogle and Hayoo

Hackage does have a search function. But I’ve often had better luck with Hoogle and Hayoo. Both of these are search engines written for finding things on Hackage. They’ll both help you track down the actual module a function comes from.

Hayoo especially does a good job with operators. Oftentimes, a simple google search will help you find a function with a normal sounding name (e.g. split). But Google searches are bad at finding special characters. If you enter the specific operator in Hayoo, though, you have a great chance of finding the module you’re looking for!

Using Types

My last piece of documentation related advice is to learn to use types. In truth, the type of a function is the most important piece of information you need. You can generally bring up GHCI and get the type of an expression. But oftentimes you’ll only see a type synonym that is not understandable. Also, you won’t be able to see the ways in which you can create something of a particular type. If the code and documentation are well organized, you can almost always do this on Hackage. This can help guide your development process so you can fill in missing holes. For more ideas about using types to drive your development, check out this article in our Haskell Brain series!

Conclusion

Haskell is a cool language. But you’ll soon get over the cool factor of functional purity and laziness. Then it’ll be time to get down to business and start writing more serious programs. To do this, you have to master a Haskell package manager. This will help you organize your work. You’ll also have to learn a lot about the organization of Haskell documentation on Hackage.

To get started on your Haskell journey, download our Beginners Checklist and read our Liftoff Series! For some more ideas on getting to the next level, read our Haskell Brain series!

Series Spotlight: Liftoff Series!

Learning Haskell, like learning anything, can be a long and sometimes difficult journey. But every journey starts with a first step! And when you take things step-by-step, it’s not too difficult to make progress. So what is the first thing you should do? That’s what our Liftoff Series is for!

If you’ve never written any Haskell before, this is a great place to start. We’ll walk you through the first steps of learning the language. You don't need to worry about any prerequisites. The series assumes no prior knowledge of Haskell, and walks you through installation. That said, it's important to have some background in programming.

In part 1 of the series, you’ll learn about expressions and types. These form the basic building blocks of Haskell as a language. The structure they create is a radical departure from most non-functional languages.

In part 2 of the series, we’ll learn about code modules and how to write your own Haskell source files. We’ll also learn some of the more complicated elements of function syntax in Haskell. For instance, we’ll explore the differences between how Haskell if statements work compared to other languages. We'll also see Haskell's other options for control flow.

In part 3, we’ll conclude by improving our mastery of the type system. Specifically, we’ll look at how we can create our own types. Again, this is very different from the process you might follow in other languages. But it’s also a lot easier and more flexible!

An excellent addendum to this series is our free Stack mini-course. This will help you learn how to use the Stack tool so you can start building your own Haskell programs. You’ll need this knowledge if you want to get beyond building simple toy projects, so take a look!

Coming Summer Attractions

Our recent blog series on GHC is now part of our permanent collection of material on the site! Check it out in the advanced section. As a reminder, here are the different parts of that series:

  1. Part 1 goes over how to set up our local development environment to use GHC. This part focuses on Windows but also has good advice for Mac and Linux Users.
  2. In part 2, we’ll set up a basic development cycle and make a simple change. We’ll also learn about the organization of the codebase.
  3. In the third part of the series, we’ll do some more serious hacking. We’ll look at adding some of our own keywords and adding a new syntactic construct.
  4. In the final part, we’ll look at how GHC handles issue tracking. This will lead to us submitting some code for a couple very simple issues.

What’s Next?

In the next few weeks, we’ve got a wide variety of content planned. We’ll be highlighting some more of our permanent content as well as showing some new tutorials. We’ll specifically look at a couple concepts we’ve handled in the past, but we’ll find new ways to build those programs. We’ve also got some more cerebral articles for how to take the next steps in your Haskell development. So stay tuned for more!

As always, if you’re new to Haskell, we’ve got some great resources to help you get started! Download our Beginners Checklist to get the language installed and discover some cool tools! For more language details, read our Liftoff Series!

And for you more experienced Haskell developers, take a look at our Production Checklist. It’ll give you some more ideas for advanced libraries to use for your projects!

Contributing to GHC 4: Real Issues

In the last few weeks, we’ve taken a good look at GHC. We started by looking at the steps we would need to prepare our local machine for GHC development. This was an especially difficult process on Windows, so we focused there. After that, we looked at the basic way of creating a development cycle for ourselves. We validated that by changing an error message and seeing how it appeared in the compiler. Last week, we made some more complicated changes. This week, we’re going to wrap this series up by looking at some basic ways of making contributions.

Documentation

Documentation is a tricky thing on any software project. At any given moment, most of the effort is going into making sure the program works as it ought to. When you understand the code already, you don’t need to look at the documentation. So the temptation is to not change any of the comments. This means documentation is always likely to fall out of date. Haskell, if anything, is more prone to this kind of lapse. We look for issues by making changes, compiling, and seeing what breaks. And documentation never breaks!

Experienced developers will remember to change documentation more. Still though, it’s inevitable that something will slip through the cracks. But there's good news for us as newcomers to the GHC code base! We’re in the best position to find holes in the documentation, since we’re the ones who need to read it most! This is how I found the first contribution I could make.

While exploring the lexing types, I found a comment that didn’t quite make sense. At the top of compiler/basicTypes/BasicTypes.hs, it states:

-- There is considerable overlap between the logic here and the logic
-- in Lexer.x, but sadly there seems to be way to merge them.

That doesn’t quite read right. From the context, it seems pretty clear that the author intended to write “there seems to be no way to merge them”. Great, so let’s submit a pull request for this! We’ll create our fork, clone the repo, make the change on a new branch, and open a pull request against master.

Now there’s a somewhat annoying issue with the fact that the CI builds don’t actually seem to be passing right now. But hopefully this PR will get merged in at some point.

Issue Tracking with Trac

Of course, there are also much more complicated issues at stake with GHC. There’s the real features we want to add to the codebase, and the bugs we want to fix! To take a look at what’s going on there, you’ll need to look at the issue tracker. GHC uses Trac for this, and you can observe all the issues on that list. They have labels based on what release they’re for, and how important they are.

It can be quite an overwhelming list. I scrolled through many different tickets and wasn’t sure what I could actually help with. So how can you find something to start out with? First, you can subscribe to the GHC devs mailing list. Conversations there will help you find what people are working on. Second, you can log onto Freenode and get onto the #ghc channel. You can ask anyone what’s going on and where you might help. Luckily, there is also a tag for “newcomers” on the list of issues. The GHC devs use it to highlight issues that should be easy for people new to the codebase. Let’s take a look at one of these issues.

Looking at a Real Issue: Infix Patterns

From this hunt, I found this ticket, related to the infix value of (->). The ticket claims that the stated infix level of 0 for the arrow operator is actually incorrect. Let’s take a look at what they mean.

As a reminder, the infix level states an operator's priority when determining order of operations. For instance, the multiplication operator (*) has a higher infix level than the addition operator (+). We can confirm this information with a quick ghci session by using the :info command on each of these.

>> :i (+)
…
infixl 6 +
>> :i (*)
…
infixl 7 *
>> 5 + 2 * 3
11 -- Would be 21 if addition were higher precedence

Now, when two operators have the same infix level, we look at the direction (the associativity) of the fixity declaration. As an example, we can compare subtraction to addition. We’ll find it’s also infixl 6. Since it’s infixl (as opposed to infixr), we give the left side operation priority. Here’s an example.

>> :i (-)
…
infixl 6 -
>> 5 - 2 + 18
21 -- Not (-15)
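We can also see the effect of direction by declaring our own operators. In this small sketch, (|-) and (-|) are hypothetical names we've invented. Both are plain subtraction and share infix level 6, so the only difference is infixl versus infixr.

```haskell
-- Left-associative subtraction, like the built-in (-).
infixl 6 |-
(|-) :: Int -> Int -> Int
(|-) = (-)

-- The same operation, but declared right-associative.
infixr 6 -|
(-|) :: Int -> Int -> Int
(-|) = (-)

-- Parsed as (10 |- 4) |- 3, which is 3.
leftGrouped :: Int
leftGrouped = 10 |- 4 |- 3

-- Parsed as 10 -| (4 -| 3), which is 9.
rightGrouped :: Int
rightGrouped = 10 -| 4 -| 3
```

The two expressions look identical apart from the operator name, yet they evaluate to different numbers purely because of the fixity declarations.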

So let’s look at our arrow operator, which we use when defining our type signatures:

>> :i (->)
data (->) (a :: TYPE q) (b :: TYPE r) -- Defined in `GHC.Prim`
infixr 0 `(->)`
...

This suggests an infix level of 0 for this operator, and that we should prioritize things on the right. However, the person filing the bug suggests the following code:

{-# LANGUAGE TypeOperators #-}

module Bug where

import Data.Type.Equality

type (~>) = (->)
infixr 0 ~>

f :: (a ~> b -> c) :~: (a ~> (b -> c))
f = Refl

There’s a lot going on here with some higher level concepts, so let’s break it all down. First, (->) is a type operator, meaning that it itself is actually a type. Thus we can create a type synonym for it called (~>). Then we can assign this new operator to have whatever infix level we like. In this case, we’ll choose the same stated infix level as we have for the original operator, infixr 0.

The next part creates an expression f. Its type signature uses the (:~:) operator for propositional equality between types. This type has the Refl constructor. The only thing you need to understand is that each of our arrow patterns ((a ~> b -> c) and (a ~> (b -> c))) is a type. And this code should only compile if those types are the same.

And on the face of it, these types should be the same. After all, both operators purport to be infixr 0, meaning the way we parenthesize it on the right side of (:~:) should match how it is naturally ordered. But the code does not compile!

>> ghci
>> :l Bug.hs
Bug.hs:11:5: error:
    * Couldn’t match type `a` with `a ~> b`
      `a` is a rigid type variable bound by
        f :: forall a b c. ((a ~> b) -> c) :~: (a ~> ( b -> c))
        At Bug.hs:10:1-38
      Expected type: ((a ~> b) -> c) :~: (a ~> (b -> c))
        Actual type: ((a ~> b) -> c) :~: ((a ~> b) -> c)
    * In the expression: Refl
      In an equation for `f’: f = Refl
    * Relevant bindings include
      f :: ((a ~> b) -> c) :~: (a ~> (b -> c))
        (bound at Bug.hs:11:1)
   |
11 | f = Refl
   |

We can see on the “Actual type” line how the compiler interprets (a ~> b -> c). It gives priority to the left, not the right. Indeed, if we change the type signature to reflect priority given to (~>), our code will compile:

f :: (a ~> b -> c) :~: ((a ~> b) -> c)
f = Refl
…
>> ghci
>> :l Bug.hs
Ok, one module loaded.
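For contrast, here's a small sketch of our own (not from the ticket) in which every arrow is the user-defined synonym. We write the synonym with explicit arguments, which is a slight variation on the ticket's definition. Since only (~>) appears, the declared infixr 0 fixity alone decides the grouping, and the right-nested reading type checks as we'd expect.

```haskell
{-# LANGUAGE TypeOperators #-}

import Data.Type.Equality ((:~:) (..))

-- A synonym for the function arrow, with an explicit fixity.
type a ~> b = a -> b
infixr 0 ~>

-- Both sides use only (~>), so infixr makes the left side mean
-- a ~> (b ~> c), and Refl is accepted.
sameOperator :: (a ~> b ~> c) :~: (a ~> (b ~> c))
sameOperator = Refl
```

So fixity declarations on type operators do work in general. The surprise in the ticket only shows up once the built-in (->) is mixed in.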

The Fix

The fix, luckily for us, has already been proposed in the ticket. The compiler represents the infix level of our operators using the Fixity type. We can see a particular location where we’ve defined the level for some of our built-in operators:

negateFixity, funTyFixity :: Fixity
negateFixity = Fixity NoSourceText 6 InfixL -- Fixity of unary negate
funTyFixity = Fixity NoSourceText 0 InfixR -- Fixity of `->`

We want to change the fixity of the function type operator. Instead of it appearing to be 0, we should make it appear to be -1, showing the lower precedence of this operator. Note this code only affects how we report the fixity. The actual reasons why it ends up having lower priority are more complicated. But let’s make that change:

funTyFixity = Fixity NoSourceText (-1) InfixR
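To see why reporting -1 lines up with the behavior we observed, here's a toy model of how two fixities decide the grouping of a op1 b op2 c. The Assoc, Fixity and Grouping types and the groupOps function are simplified stand-ins we wrote for illustration. GHC's own types carry more information, and as noted above, the real reason for the arrow's behavior is more involved.

```haskell
data Assoc = InfixL | InfixR | InfixN
  deriving (Show, Eq)

-- A simplified fixity: just a level and a direction.
data Fixity = Fixity Int Assoc
  deriving (Show, Eq)

data Grouping = GroupLeft | GroupRight | Ambiguous
  deriving (Show, Eq)

-- Decide how `a op1 b op2 c` is grouped, given the fixities of
-- op1 and op2 (in that order).
groupOps :: Fixity -> Fixity -> Grouping
groupOps (Fixity p1 d1) (Fixity p2 d2) =
  case compare p1 p2 of
    GT -> GroupLeft   -- op1 binds tighter: (a op1 b) op2 c
    LT -> GroupRight  -- op2 binds tighter: a op1 (b op2 c)
    EQ -> case (d1, d2) of
      (InfixL, InfixL) -> GroupLeft
      (InfixR, InfixR) -> GroupRight
      _                -> Ambiguous

-- The fixity we declared for (~>).
tildeArrow :: Fixity
tildeArrow = Fixity 0 InfixR

-- What GHCi reports for (->) before and after the change.
reportedArrow, proposedArrow :: Fixity
reportedArrow = Fixity 0 InfixR
proposedArrow = Fixity (-1) InfixR
```

In this model, groupOps tildeArrow reportedArrow gives GroupRight, which predicts a ~> (b -> c). With proposedArrow we get GroupLeft instead, which matches the (a ~> b) -> c grouping the compiler actually chose.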

Testing Our Change

This seems like it should be a simple change to test. First, we’ll run make again. Then we’ll boot up GHCI and ask for info on (->). But this doesn’t appear to work when we try it!

> make
> ghci
...
>> :i (->)
data (->) (a :: TYPE q) (b :: TYPE r) -- Defined in `GHC.Prim`
infixr 0 `(->)`
...

The issue here is that re-making does not cause GHCI to use our new locally built version of GHC. Even when using ghci.exe from within the ghc/inplace/bin directory, it still doesn’t account for this change. The way around this is that instead of using ghci, we can pass the --interactive flag to a normal call to ghc. So we’ll want something like this:

~/ghc/inplace/bin/ghc-stage2.exe -o prog --interactive Main.hs

This will bring up a GHCI prompt that loads our main module. And now when we go ahead and get info, we’ll see that it works!

> ~/ghc/inplace/bin/ghc-stage2.exe -o prog --interactive Main.hs
...
>> :i (->)
data (->) (a :: TYPE q) (b :: TYPE r) -- Defined in `GHC.Prim`
infixr -1 `(->)`
...

So I’ll now make a simple pull request addressing this bug. You can follow the progress here. I’ll update this post as it moves further along in the process.

Conclusion

This wraps up our series on contributing to GHC! There are a lot of bugs out there, so don’t be afraid to take a look at anything labeled as newcomer. Just make sure to take a look at the discussion that’s occurred already on the ticket!

To learn more about Haskell, you can read our Liftoff Series (for beginners) or our Haskell Web Series if you’re already familiar with the language. You can also download our Haskell Beginners Checklist to get started! Or you can look at our Production Checklist if you want some ideas for more advanced projects.

Contributing to GHC 3: Hacking Syntax and Parsing

In last week's article, we made more progress in understanding GHC. We got our basic development cycle down and explored the general structure of the code base. We also made the simplest possible change by updating one of the error messages. This week, we'll make some more complex changes to the compiler, showing the ways you can tweak the language. It's unlikely you would make changes like these to fix existing issues. But it'll help us get a better grasp of what's going on.

As always, you can learn more about the basics of Haskell by checking out our other resources. Take a look at our Liftoff Series or download our Beginners Checklist!

Comments and Changing the Lexer

Let's get warmed up with a straightforward change. We'll add some new syntax to allow different kinds of comments. First we have to get a little familiar with the Lexer, defined in parser/Lexer.x. Let's try and define it so that we'll be able to use four apostrophes to signify a comment. Here's what this might look like in our code and the error message we'll get if we try to do this right now.

module Main where

'''' This is our main function
main :: IO ()
main = putStrLn "Hello World!"

…

Parser error on `''`
Character literals may not be empty
  |
5 | '''' This is our main function
  | ^^

Now, it's easy enough to add a new line describing what to do with this token. We can follow the example in the Lexer file. Here's where GHC defines a normal single line comment:

"-- " ~$docsym .* { lineCommentToken }
"--" [^$symbol \ ] . * { lineCommentToken }

It needs two cases because of Haddock comments. But we don't need to worry about that. We can specify our symbol on one line like so:

"''''" .* { lineCommentToken }

Now we can add the comment above into our code, and it compiles!
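As a rough picture of what that rule does, here's a tiny function that drops everything from the first '''' marker to the end of a line. This is our own sketch and not how the lexer is implemented. The name stripQuadComment is made up, and unlike a real lexer it would also cut off a marker that appears inside a string literal.

```haskell
import Data.List (isPrefixOf)

-- Drop everything from the first '''' marker to the end of the line.
stripQuadComment :: String -> String
stripQuadComment [] = []
stripQuadComment line@(c : rest)
  | "''''" `isPrefixOf` line = []
  | otherwise = c : stripQuadComment rest
```

Applied to the comment line from our example, this returns an empty string, while a line with no marker comes back unchanged.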

Adding a New Keyword

Let's now look at how we could add a new keyword to the language. We'll start with a simple substitution. Suppose we want to use the word iffy like we use if. Here's what a code snippet would look like, along with the compiler error we get at first:

main :: IO ()
main = do
  i <- read <$> getLine
  iffy i `mod` 2 == 0
    then putStrLn "Hello"
    else putStrLn "World"

…

Main.hs:11:5: error: parse error on input 'then'
   |
11 |     then putStrLn "Hello"
   |     ^^^^

Let's do a quick search for where the keyword "if" already exists in the parser section. We'll find two spots. The first is a list of all the reserved words in the language. We can update this by adding our new keyword to the list. We'll look for the reservedIds set in basicTypes/Lexeme.hs, and we can add it:

reservedIds :: Set.Set String
reservedIds = Set.fromList [ …
  , "_", "iffy" ]

Now we also have to parse it so that it maps against a particular token. We can see a line in Lexer.x where this happens:

( "if", ITif, 0)

We can add another line right below it, matching it to the same ITif token:

( "iffy", ITif, 0)

The lexer now maps both keywords to the same token, so the rest of the compiler treats iffy exactly like if. Our code compiles and produces the expected result!

lghc Main.hs
./prog.exe
5
World
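
The idea of two spellings sharing one token can be sketched with a simple lookup table. The Token type here is a cut-down toy version and classify is a hypothetical helper of ours. The real table in Lexer.x has three fields per entry and many more keywords.

```haskell
import Data.Maybe (fromMaybe)

-- A toy subset of lexer tokens.
data Token = ITif | ITthen | ITelse | ITvarid String
  deriving (Show, Eq)

-- Keyword table: "if" and "iffy" both map to ITif.
reservedWords :: [(String, Token)]
reservedWords =
  [ ("if", ITif)
  , ("iffy", ITif)
  , ("then", ITthen)
  , ("else", ITelse)
  ]

-- Look a word up in the table, falling back to an identifier token.
classify :: String -> Token
classify word = fromMaybe (ITvarid word) (lookup word reservedWords)
```

Because classify "iffy" and classify "if" give the same token, nothing downstream can tell the two keywords apart. That's why we didn't need to touch the parser for this change.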

Reversing If

Now let's add a little twist to this process. We'll add another "if" keyword and call it reverseif. This will change the ordering of the if-statement. So when the boolean is false, our code will execute the first branch instead of the second. We'll need to work a little further upstream. We want to re-use as much of the existing machinery as possible and just reverse our two expressions at the right moment. Let's use the same code as above, except with the reverseif keyword. Then if we input 5 we should get Hello instead of World.

main :: IO ()
main = do
  i <- read <$> getLine
  reverseif i `mod` 2 == 0
    then putStrLn "Hello"
    else putStrLn "World"

So we'll have to start by adding a new constructor to our Token type, under the current if token in the lexer.

data Token =
  …
  | ITif
  | ITreverseif
  ...

Now we'll have to add a line to convert our keyword into this kind of token.

...
("if", ITif, 0),
("reverseif", ITreverseif, 0),
...

As before, we'll also add it to our list of keywords:

reservedIds :: Set.Set String
reservedIds = Set.fromList [ …
  , "_", "iffy", "reverseif" ]

Let's take a look now at the different places where we use the ITif constructor. Then we can apply them to ITreverseif as well. We can find two more instances in Lexer.x. First, there's the function maybe_layout, which dictates if a syntactic construct might need an open brace. Then there's the isALRopen function, which tells us if we can start some kind of other indentation. In both of these, we'll follow the example of ITif:

maybe_layout :: Token -> P ()
…
  where
    f ITif = pushLexState layout_if
    f ITreverseif = pushLexState layout_if

...
isALRopen ITif = True
isALRopen ITreverseif = True
...

There's also a bit in Parser.y where we'll need to parse our new token:

%token
 …
 'if' { L _ ITif }
 'reverseif' { L _ ITreverseif }

Now we need to figure out how these tokens create syntactic constructs. This also seems to occur in Parser.y. We can look, for instance, at the section that constructs basic if statements:

| 'if' exp optSemi 'then' exp optSemi 'else' exp
    {% checkDoAndIfThenElse $2 (snd $3) $5 (snd $6) $8 >>
      Ams (sLL $1 $> $ mkHsIf $2 $5 $8)
        (mj AnnIf $1:mj AnnThen $4
          :mj AnnElse $7
          :(map (\l -> mj AnnSemi l) (fst $3))
         ++(map (\l -> mj AnnSemi l) (fst $6))) }

There's a lot going on here, and we're not going to try to understand it all right now! But there are only two things we'll need to change to make a new rule for reverseif. First, we'll obviously need to use that token instead of if on the first line.

Second, see that mkHsIf statement on the third line? This is where we make the actual Haskell "If" expression in our syntax tree. The $5 refers to the second instance of exp in the token list, and the $8 refers to the third and final expression. These are, respectively, the True and False branch expressions of our "If" statement. Thus, to reverse our "If", all we need to do is flip these arguments on the third line!

| 'reverseif' exp optSemi 'then' exp optSemi 'else' exp
    {% checkDoAndIfThenElse $2 (snd $3) $5 (snd $6) $8 >>
      Ams (sLL $1 $> $ mkHsIf $2 $8 $5)
        (mj AnnIf $1:mj AnnThen $4
          :mj AnnElse $7
          :(map (\l -> mj AnnSemi l) (fst $3))
         ++(map (\l -> mj AnnSemi l) (fst $6))) }

Finally, there's one more change we need to make. Adding this line will introduce a couple new shift/reduce conflicts into our grammar. There are already 233, so we're not going to worry too much about that right now. All we need to do is change the count on the assertion for the number of conflicts:

%expect 235 -- shift/reduce conflicts

Now when we compile and run our simple program, we'll indeed see that it works as expected!

lghc Main.hs
./prog.exe
5
Hello
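
The trick we used in the parser can be modeled with a toy syntax tree. The Expr type, mkIf, mkReverseIf and eval below are all hypothetical simplifications of ours, not GHC's HsExpr or mkHsIf. The point is only that reverseif builds the very same If node with its branches swapped.

```haskell
-- A toy expression tree with only what we need.
data Expr
  = BoolLit Bool
  | StrLit String
  | If Expr Expr Expr
  deriving (Show, Eq)

-- Ordinary if: condition, then-branch, else-branch.
mkIf :: Expr -> Expr -> Expr -> Expr
mkIf = If

-- reverseif: reuse the same node, but flip the two branches.
mkReverseIf :: Expr -> Expr -> Expr -> Expr
mkReverseIf cond thenBranch elseBranch = If cond elseBranch thenBranch

-- Evaluate an expression down to a literal.
eval :: Expr -> Expr
eval (If cond t e) = case eval cond of
  BoolLit True  -> eval t
  BoolLit False -> eval e
  other         -> error ("condition is not a boolean: " ++ show other)
eval literal = literal

-- The condition from our example with an input of 5.
condition :: Expr
condition = BoolLit (5 `mod` 2 == (0 :: Int))
```

Evaluating mkIf condition (StrLit "Hello") (StrLit "World") gives the World literal, while the mkReverseIf version gives Hello, mirroring the two runs above.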

Conclusion

So this week we saw some more complicated changes to GHC that have tangible effects. Next week, we'll wrap up our discussion of GHC by looking at the contribution process. We'll see the "simple" way with Github first. Then we'll also walk through the more complicated process using tools like Arc and Phabricator.

To learn more about Haskell, you should check out some of our basic materials! If you're a beginner to the language, read our Liftoff Series. It'll teach you how to use Haskell from the ground up. You can also take a look at our Haskell Web Series to see some more advanced and practical skills!

Contributing to GHC 2: Basic Hacking and Organization

Last week, we took our first step into the world of GHC, the Glasgow Haskell Compiler. We summarized the packages and tools we needed to install to get it building. We did this even in the rather hostile environment of a Windows laptop. But, at the end of the day, we can now build the project with make and create our local version of GHC.

This week, we’ll establish our development cycle by looking at a very simple change we can make to the compiler. We’ll also discuss the architecture of the repository so we can make some cooler changes next week.

GHC is truly a testament to some of the awesome benefits of open source software. Haskell would not be the same language without it. But to understand GHC, you first have to have a decent grasp of Haskell itself! If you’ve never written a line of Haskell before, take a look at our Liftoff Series for some tips on how to get going. You can also download our Beginners Checklist.

You may have also heard that while Haskell is a neat language, it’s useless from an industry perspective. But if you take a look at our Production Checklist, you’ll find tons of tools to write more interesting Haskell programs!

Getting Started

Let’s start off by writing a very simple program in Main.hs.

module Main where

main :: IO ()
main = do
  putStrLn "Using GHC!"

We can compile this program into an executable using the ghc command. We start by running:

ghc -o prog Main.hs

This creates our executable prog.exe (or just prog if you’re not using Windows). Then we can run it like we can run any kind of program:

./prog.exe
Using GHC!

However, this uses the system-level GHC, the one we had to install in order to build GHC locally!

which ghc
/mingw/bin/ghc

When we build GHC, it creates executables for each stage of the compilation process. It produces these in a directory called ghc/inplace/bin. So we can create an alias that will simplify things for us. We’ll write lghc to be a "local GHC" command:

alias lghc="~/ghc/inplace/bin/ghc-stage2.exe -o prog"

This will enable us to compile our single module program with lghc Main.hs.

Hacking Level 0

Ultimately, we want to be able to verify our changes. So we should be able to modify the compiler, build it again, use it on our program, and then see our changes reflected in the code. One simple way to test the compiler’s behavior is to change the error messages. For example, we could try to import a module that doesn’t exist:

module Main where

import OtherModule (otherModuleString)

main :: IO ()
main = do
  putStrLn otherModuleString

Of course, we’ll get an error message:

[1 of 1] Compiling Main (Main.hs, Main.o)

Main.hs:3:1: error:
    Could not find module 'OtherModule'
    Use -v to see a list of the files searched for.
   |
3  |import OtherModule (otherModuleString)
   |^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let’s try now changing the text of this error message. We can do a quick search for this message in the compiler section of the codebase and find where it’s defined:

cd ~/ghc/compiler
grep -r "Could not find module" .
./main/Finder.hs:cannotFindModule = cantFindErr (sLit "Could not find module")

Let’s go ahead and update that string to something a little different:

cannotFindModule :: DynFlags -> ModuleName -> FindResult -> SDoc
cannotFindModule = cantFindErr
  (sLit "We were unable to locate the module")
  (sLit "Ambiguous module name")

Now let’s go ahead and rebuild, except let’s use some of the techniques from last week to make the process go a bit faster. First, we’ll copy mk/build.mk.sample to mk/build.mk. We’ll uncomment the following line, as per the recommendation from the setup guide:

BuildFlavour=devel2

We’ll also uncomment the line that says stage=2. This will restrict the build to only the final stage of the compiler. It will skip past stage 0 and stage 1, which we’ve already built.

We’ll also build from the compiler directory instead of the root ghc directory. Note though that since we’ve changed our build file, we’ll have to boot and configure once again. But after we’ve re-compiled, we’ll now find that we have our new error message!

[1 of 1] Compiling Main (Main.hs, Main.o)

Main.hs:3:1: error:
    We were unable to locate the module 'OtherModule'
    Use -v to see a list of the files searched for.
   |
3  |import OtherModule (otherModuleString)
   |^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

General Architecture

Next week, we’ll look into making a more sophisticated change to the compiler. But at least now we’ve validated that we can develop properly. We can make a change, compile in a short amount of time, and then determine that the change has made a difference. But now let’s consider the organization of the GHC repository. This will help us think some more about the types of changes we’ll make. I’ll be drawing on this description written by Simon Peyton Jones and Simon Marlow.

There are three main parts to the GHC codebase. The first of these is the compiler itself. The job of the compiler is to take our Haskell source code and convert it into machine executable code. Here is a very non-exhaustive list of some of the compiler’s tasks:

  1. Determining the location of referenced modules
  2. Reading a single source file
  3. Breaking that source into its simplest syntactic representation
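
To make the third task a little more concrete, here is a minimal sketch of a lexer for a tiny made-up language. It has nothing to do with GHC's real lexer, which is generated by alex. The Token type, the keyword list and the tokenize function are all assumptions for illustration.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- The simplest syntactic pieces of our toy language.
data Token
  = TKeyword String
  | TIdent String
  | TNumber Int
  | TSymbol Char
  deriving (Eq, Show)

-- A hypothetical, deliberately short keyword list.
keywords :: [String]
keywords = ["if", "then", "else", "module", "where", "import"]

-- Walk the source text once, emitting one token per word, number or symbol.
tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c:cs)
  | isSpace c = tokenize cs
  | isDigit c =
      let (digits, rest) = span isDigit s
      in TNumber (read digits) : tokenize rest
  | isAlpha c =
      let (word, rest) = span isAlphaNum s
          tok = if word `elem` keywords then TKeyword word else TIdent word
      in tok : tokenize rest
  | otherwise = TSymbol c : tokenize cs
```

For example, tokenize "if x then 1 else 2" yields [TKeyword "if", TIdent "x", TKeyword "then", TNumber 1, TKeyword "else", TNumber 2]. A parser would then consume this token list to build a syntax tree.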

Then there is the boot section. This section deals with the libraries that the compiler itself depends on. They include low-level types like Int as well as modules like Data.Map. This section is somewhat more stable, so we won’t look at it too much.

The last major section is the Runtime System (RTS). This takes the code generated by the compiler above and determines how to run it. A lot of magic happens in this part that makes Haskell particularly strong at tasks like concurrency and parallelism. It’s also where we handle mechanics like garbage collection.

We’ll try to spend most of our time in the compiler section. The compilation pipeline has many stages, like type checking and de-sugaring. This will let us zero in on a particular stage and make a small change. Also the Runtime System is mostly C code, while much of the compiler is in Haskell itself!

Conclusion

Next week we’ll take a look at a couple more ways to modify the compiler. After that, we’ll start looking at taking real issues from GHC and see what we can do to try and fix them! We’ll eventually take a peek at the submission process both with Github and Phabricator.

If you want to start out your Haskell journey, you should read our Liftoff Series! It will help you learn the basics of this awesome language. For more updates, you can also subscribe to our monthly newsletter!

Contributing to GHC 1: Preparation

In the last few weeks, we’ve looked at a few different open source Haskell projects like HNix and Codeworld. This week, we’ll start looking at perhaps the biggest and most important open source element of the Haskell ecosystem. This is GHC, the Glasgow Haskell Compiler. Without GHC and the hard work that goes into it from many volunteers, Haskell would not be the language it is today. So in the next few weeks we’ll explore the process of building and (hopefully) contributing to GHC.

I’m currently operating on a Windows laptop, which brings many frustrations. Getting GHC to build on Windows is a non-trivial task with many potential hurdles. On the bright side, I view this as an opportunity to show that one can contribute even in the most adverse circumstances. So most of this article will focus on the trials of using Windows. There is a section further down that goes over the most important parts of building for Mac and Linux. I’ll be following this guide by Simon Peyton Jones, sharing my own complications.

Now, you need to walk before you can run. If you’ve never used Haskell before, you have to try it out first to understand GHC! Download our Beginner’s Checklist to get started! You can also read our Liftoff Series to learn more about the language basics.

MSys

The main complication with Windows is that the build tools for GHC are made for Unix-like environments. These tools include programs like autoconf and make. And they don’t work in the normal Windows terminal environment. This means we need some way of emulating a Unix terminal environment in Windows. There are a couple different options for this. One is Cygwin, but the more supported option for GHC is MSYS 2. So my first step was to install this program. This terminal will apply the “Minimalist GNU for Windows” libraries, abbreviated as “MinGW”.

Installing this worked fine the first time. However, there did come a couple points where I decided to nuke everything and start from scratch. Re-installing did bring about one problem I’ll share. In a couple circumstances where I decided to start over, I would run the installer, only to find an error stating bash.exe: permission denied. This occurred because the old version of this program was still running on a process. You can delete the process or else just restart your machine to get around this.

Once MSys is working, you’ll want to set up your terminal to use MinGW programs by default. To do this, you’ll want to set the path to put the mingw directory first:

echo "export PATH=/mingw<bitness>/bin:\$PATH" >> ~/.bash_profile

Use either 32 or 64 for <bitness> depending on your system. Also don’t forget the quotation marks around the command itself!

Package Preparation

Our next step will be to get all the necessary packages for GHC. MSys 2 uses an older package manager called pacman, which operates kind of like apt-get. First you’ll need to update your package repository with this command:

pacman -Syuu

As per the instructions in SPJ’s description, you may need to run this a couple times if a connection times out. This happened to me once. Now that pacman is working, you’ll need to install a host of programs and libraries that will assist in building GHC:

pacman -S --needed git tar bsdtar binutils autoconf make xz \
    curl libtool automake python python2 p7zip patch ca-certificates \
    mingw-w64-$(uname -m)-gcc mingw-w64-$(uname -m)-python3-sphinx \
    mingw-w64-$(uname -m)-tools-git

This command typically worked fine for me. The final items we’ll need are alex and happy. These are Haskell programs for lexing and parsing. We’ll want to install Cabal to do this. First let’s set a couple variables for our system:

arch=x86_64 # could also be i386
bitness=64  # could also be 32

Now we’ll get a pre-built GHC binary that we’ll use to bootstrap our own build later:

curl -L https://downloads.haskell.org/~ghc/8.2.2/ghc-8.2.2-${arch}-unknown-mingw32.tar.xz | tar -xJ -C /mingw${bitness} --strip-components=1

Now we’ll use Cabal to get those packages. We’ll place them (and Cabal) in /usr/local/bin, so we’ll make sure that’s created first:

mkdir -p /usr/local/bin
curl -L https://www.haskell.org/cabal/release/cabal-install-2.2.0.0/cabal-install-2.2.0.0-${arch}-unknown-mingw32.zip | bsdtar -xzf- -C /usr/local/bin

Now we’ll update our Cabal repository and get both alex and happy:

cabal update
cabal install -j --prefix=/usr/local alex happy

Once while running this command I found that happy failed to install due to an issue with the mtl library. I got errors of this sort when running the ghc-pkg check command:

Cannot find any of ["Control\\Monad\\Cont.hi", "Control\\Monad\\Cont.p_hi", "Control\\Monad\\Cont.dyn_hi"]
Cannot find any of ["Control\\Monad\\Cont\\Class.hi", "Control\\Monad\\Cont\\Class.p_hi", "Control\\Monad\\Cont\\Class.dyn_hi"]

I managed to fix this by doing a manual re-install of the mtl package:

cabal install -j --prefix=/usr/local/ mtl --reinstall

After this step, there were no errors on ghc-pkg check, and I was able to install happy without any problems.

cabal install -j --prefix=/usr/local/ happy
Resolving dependencies…
Configuring happy-1.19.9…
Building happy-1.19.9…
Installed happy-1.19.9

Getting the Source and Building

Now our dependencies are all set up, so we can actually go get the source code now! The main workflow for contributing to GHC uses some other tools, but we can start from Github.

git clone --recursive git://git.haskell.org/ghc.git

Now, you should run the ./boot command from the ghc directory. This resulted in some particularly nasty problems for me thanks to my antivirus. It decided that perl was an existential threat to my system and threw it in the Virus Chest. You might see an error like this:

sh: /usr/bin/autoreconf: /usr/bin/perl: bad interpreter: No such file or directory

Even after copying another version of perl over to the directory, I saw errors like the following:

Could not locate Autom4te/ChannelDefs.pm in @INC (@INC contains /usr/share/autoconf C:/msys64/usr/lib .) at C:/msys64/usr/bin/autoreconf line 39

In reality, the @INC path should have a lot more entries than that! It took me a while (and a couple complete do-overs) to figure out that my antivirus was the problem here. Everything worked once I dug perl out of the Virus Chest. Once boot runs, you’re almost set! You now need to configure everything:

./configure --enable-tarballs-autodownload

The extra option is only necessary on Windows. Finally you’ll use the make command to build everything. Expect this to take a while (12 hours and counting for me!). Once you’re familiar with the codebase, there are a few different ways you can make things build faster. For instance, you can customize the build.mk file in a couple different ways. You can set BuildFlavour=devel2, or you can set stage=2. The latter will skip the first stage of the compiler.

You can also run make from the sub-directories rather than the main directory. This will only build the sub-component, rather than the full compiler. Finally, there’s also a make fast command that will skip building a lot of the dependencies.

Mac and Linux

I won’t go into depth on the instructions for Mac and Linux here, since I haven’t gotten the chance to try them myself. But due to the developer-friendly nature of those systems, they’re likely to have fewer hitches than Windows.

On Linux for instance, you can actually do most of the setup by using a Docker container. You’ll download the source first, and then you can run this docker command:

>> sudo docker run --rm -i -t -v `pwd`:/home/ghc gregweber/ghc-haskell-dev /bin/bash

On Mac, you’ll need to install some similar programs to Windows, but there’s no need to use a terminal emulator like MSys. If you have the basic developer tools and a working version of GHC and Cabal already, it might be as simple as:

>> brew install autoconf automake
>> cabal install alex happy haddock
>> sudo easy_install pip
>> sudo pip install sphinx

For more details, check here. But once you’re set up, you’ll follow the same boot, configure and make instructions as for Windows.

Conclusion

So that wraps up our first look at GHC. There’s plenty of work to do just to get it to build! But next week we’ll start looking at some of the simplest modifications we can make to GHC. That way, we can start getting a feel for how the code base works.

If you haven’t written Haskell, it’s hard to appreciate the brilliance of GHC! Get started by downloading our Beginners Checklist and reading our Liftoff Series!

Codeworld: Haskell as a First Programming Language

In the last couple weeks, we’ve explored a couple different Haskell open source projects. We checked out the Nix package manager and its Haskell cousin. Open source is very important to the Haskell community, so we’ll continue in this vein for a little while longer. This week, we’ll explore Codeworld, another project I learned about at Bayhac about a month ago. In the coming weeks, we’ll look at GHC itself, a vital open-source component of the Haskell ecosystem.

What is Codeworld?

Codeworld is an educational tool for teaching kids about mathematics and programming. The most basic version of Codeworld allows students to create geometric images. They do this using simple programming expressions similar to Haskell. Here’s a very basic program we can write and the picture it would draw.

leaves = sector(0, 180, 4)
trunk = solidRectangle(1,4)
tree = colored(leaves, translucent(green)) & colored(trunk, dark(brown))

program = drawingOf(tree)

code_world_0.png

This is different from similar sorts of programs and languages in many ways. The Logo programming language that I first learned used a more procedural style. You create “turtles” that move around the screen and perform commands. For example, you could tell a turtle to start drawing, move 25 pixels, turn, and move again. You might also approach drawing in an object oriented fashion. You'd create shapes that have different properties and change these over time. But Codeworld eschews both these approaches in favor of a more functional style.

Your program is ultimately a single drawing. You can compose this drawing with different components, always represented by expressions. As you learn more about the different patterns, you can create your own functions.

leaves = sector(0, 180, 4)
trunk = solidRectangle(1,4)

tree :: (Color, Color) -> Picture
tree(c1, c2) = colored(leaves, translucent(c1)) &
               colored(trunk, dark(c2))

myTree :: (Number, Color, Color) -> Picture
myTree(x, c1, c2) = translated(tree(c1, c2), x, 0)

program = drawingOf(myTree(-5, green, brown) & myTree(5, red, black))

code_world_1.png

Within a few examples, it’s relatively easy to teach the concept of recursion! Here’s a simple example showing repetition and fractals:

branch :: Number -> Picture
branch(0) = blank
branch(n) =
    polyline([(0,0), (0, 5)]) &
    translated(smallBranch, 0, 5) &
    translated(rotated(smallBranch,  30), 0, 2.5) &
    translated(rotated(smallBranch, -30), 0, 2.5)
  where smallBranch = scaled(branch(n-1), 0.5, 0.5)

tree :: Picture
tree = branch(7)

program = drawingOf(tree)

code_world_2.png

Codeworld Haskell

Now the basic version of Codeworld is like Haskell but with some simplifications and syntactic changes. There is also Codeworld Haskell, which employs the full Haskell feature set. This lets you use more complex items and dive into the type signatures a bit more.

It also involves more complex functions than drawing. You can create animations and interactions between different elements, or track a global state. It’s even possible to create simple games. The interactionOf function allows you to handle input events that can affect the world. The collaborationOf function looks a bit complicated with its use of StaticPtr. But it allows you to create multiplayer games with relative ease!

drawingOf :: Picture -> IO ()

animationOf :: (Double -> Picture) -> IO ()

simulationOf
  :: world
  -> (Double -> world -> world)
  -> (world -> Picture)
  -> IO ()

interactionOf
  :: world
  -> (Double -> world -> world)
  -> (Event -> world -> world)
  -> (world -> Picture)
  -> IO ()

collaborationOf
  :: Int
  -> StaticPtr (StdGen -> world)
  -> StaticPtr (Double -> world -> world)
  -> StaticPtr (Int -> Event -> world -> world)
  -> StaticPtr (Int -> world -> Picture)
  -> IO ()
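
To get a feel for the shape of interactionOf without installing anything, here is a minimal pure sketch of the same idea. It is only a toy model and not how Codeworld is implemented. The Event type here is a hypothetical stand-in with a single constructor, and Input, runInteraction, stepPos and handlePos are names invented for this example. The world is threaded through a time-step function and an event handler, just like the first three arguments in the signature above.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for Codeworld's much richer Event type.
data Event = KeyPress String
  deriving (Eq, Show)

-- An input is either some elapsed time or a user event.
data Input = TimePassing Double | UserEvent Event

-- Feed a list of inputs through the two update functions,
-- returning the final world instead of drawing it.
runInteraction
  :: world
  -> (Double -> world -> world)
  -> (Event -> world -> world)
  -> [Input]
  -> world
runInteraction initial step handle = foldl' apply initial
  where
    apply w (TimePassing dt) = step dt w
    apply w (UserEvent ev)   = handle ev w

-- Example world: a position that drifts right over time...
stepPos :: Double -> Double -> Double
stepPos dt x = x + dt

-- ...and jumps one unit left when the "Left" key is pressed.
handlePos :: Event -> Double -> Double
handlePos (KeyPress "Left") x = x - 1
handlePos _ x = x
```

Running runInteraction 0 stepPos handlePos [TimePassing 0.5, UserEvent (KeyPress "Left"), TimePassing 0.25] gives -0.25. In the real library, the final argument would turn each intermediate world into a Picture.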

Using Codeworld

The easiest way to get started is to go to https://code.world, follow the Guide, and make some simple programs! Everything takes place in your web browser, so you can get a feel for how it works without needing to do any setup.

If you want to contribute to or fiddle with the source code at all, you’ll have to do some more involved work. You’ll need to follow the instructions on the Github repository, which are primarily for the main Linux distributions. You’ll also need to sign a Google Contributor License Agreement if you haven’t already. But if you want to help on some kind of educational Haskell tool, this is a great project to contribute to! It’s already in use in several schools!

Conclusion

Next week we’ll continue our open-source focus by beginning to look at the process of contributing to GHC. This compiler is a mainstay of the Haskell community. And it depends entirely on volunteer contributions! Naturally though, it's difficult to understand all the inner workings of a compiler. So we’ll start at a very basic level and work our way up. We'll begin by looking at contributions to less technical areas. Only at the end of our discussion will we start looking at any of the organization of the code itself.

If you’ve never written any Haskell before, Codeworld is actually a great way to introduce yourself to some of the fundamentals! But for a more classical introduction, you can also get our Haskell Beginner’s Checklist. It’ll walk you through the basics of setting Haskell up on your system.

HNix: Enhancing Nix with Haskell

hnix.png

Last week we introduced Nix, the purely functional package manager. We saw how it used some different conceptual techniques from functional programming. With these concepts, it seeks to solve some problems in package management. It shares many concepts with Haskell, so it is most often used by Haskell developers.

Because of the Haskell community's interest in Nix, an interesting project has arisen alongside it. This is HNix, which I mentioned a few weeks ago in my article about BayHac. HNix is a Haskell implementation of various components of Nix. In this quick article, we’ll look at the different elements of this project.

The Nix Language and the Nix Store

The term “Nix” is a little overloaded. It refers to the package manager or the operating system, but also refers to a language. The Nix language is how we specify the values that represent our different packages. The core repository of this project implements the Nix language in Haskell.

This implementation would make it easier to integrate Nix with your Haskell code. For example, you could combine Nix versioning of your packages with a database schema. This could ensure that you can automatically handle migrations.

Another part of the project is an interface to the Nix Store. The store deals with how Nix actually saves all the different packages on your system. While Nix does sandbox its packages, it can still be useful to have a programmatic interface on them. This allows you to manipulate a representation of this store in-memory, rather than on the file system. For instance, one store implementation has no side effects at all, to allow for unit testing. Another would read from the file system. But then it would perform all write effects in memory without modifying anything.

Open Source Haskell

One of the main reasons I’m discussing HNix is that it’s a good gateway to open source Haskell. If you’ve wanted to contribute to an OS Haskell project and weren’t sure where to start, HNix is a great option. The maintainers are very friendly guys. They'd be more than happy to help you get started in understanding the code base. At BayHac I was very impressed with how well organized the project was. Specifically, the maintainers made it easy for new people to get involved in the project. They laid out many different issue tickets that were doable even for non-experts.

So to get started, take a look at the repository. The README instructions are pretty thorough. Then you can go through the issues section for a little bit and pick up one of the tickets with a “Help Wanted” label. You can email one of the maintainers for help (John Wiegley is probably your best bet). Tagging them in an issue comment should also work if you need some direction.

Conclusion

Haskell depends a lot on open source contributions. A lot of the core pieces of infrastructure (GHC, Stack, Cabal) are all maintained via open source. When you can make these contributions, you’ll be able to rapidly improve your Haskell, add value to the community, and meet great people along the way! Next week, we’ll look at another open source Haskell project.

And if you’ve never written any Haskell before, don’t be afraid! You can start your learning journey with our Beginners Checklist. You’ll be able to make solid contributions much quicker than you think!

Nix: Haskell Concepts for Package Management

nix_logo.png

Back in my BayHac article, I discussed some of my adventures with Nix and HNix. I didn’t get a lot done. But I was still curious to learn more about these systems. I “used” Nix a little bit at a previous job. And by “used” I mean I learned enough of the basic commands to write code and get on with my life. But I never developed a full understanding of “why Nix” or “what’s good about Nix”. So I’m going to spend a couple weeks doing a high level overview of this program and why it's so cool.

As an introduction, Nix is a purely functional package manager. It aims to be a language-agnostic system to achieve deterministic builds. We’ll get into what it means to be a “purely functional” package manager down below. But a lot of the properties that make Nix what it is are also present in Haskell. So while you could use Nix for any language, most of the development effort so far has come from Haskellers. Meanwhile, NixOS is a Linux distribution that seeks to apply the main principles of Nix at the operating system level.

This first article will discuss the basics of Nix, its advantages, and disadvantages. Next week, we’ll take a look at the HNix project, which seeks to implement Nix in Haskell. It’s important to understand though that Nix is definitely not the easiest package manager to use for Haskell. For now, I would still recommend starting out with Stack. You can read the docs or check out our free Stack mini-course to learn more! And if you’ve never used Haskell before, download our Beginners Checklist to get started!

Now to motivate the use of Nix, let’s consider some of the broader issues with package management.

Package Problems

At the most basic level, a package manager should enable you to get a program up and running in a small number (~3) of commands. And most accomplish this task, but there are always complications. We’ll look at two main issues. One is versioning. This includes both versioning your own projects and versioning dependencies. The other problem relates to the portability of your application.

The versioning problem plagued Haskell developers when Cabal was still young. Cabal would, by default, install dependencies system wide. But suppose you had many projects on your machine. These might depend on different versions of the same library. And this could lead to conflicts in your system that might render multiple projects unusable.

The addition of Cabal sandboxes and the Stack program mitigated this problem. Both these systems install dependencies in project specific locations. But there was still a problem where it could be difficult to roll back to a previous version of your project. The commands to uninstall and downgrade the packages weren’t intuitive. They could easily break things if you weren't careful.

Meanwhile, unseen dependencies threaten our portability. This is somewhat more common in building C or C++ programs than Haskell programs. C libraries are often still installed system wide. One of the consequences is that you might have a library from another project on your system. Then a new project also depends on it, but you forget to list that dependency. It works fine for you on your local machine. But then when you push your code somewhere else, that dependency isn’t found. This can be quite a hassle.

The Nix Functional Approach

Nix (the package manager) seeks to avoid these problems by using a functional approach to package management. It treats every package as a value constructed by a function. The key input to the function of any package is its dependency graph. That is, a package is the final output, and the other (versioned) libraries are the input. Each version of a package you build has a unique identifier. This identifier is a cryptographic hash of the dependency graph. So if any of the dependencies to your program change, you’ll rebuild and create a totally new version of your package. This means adding dependencies, removing them, or changing versions.

Nix stores all its packages in the /nix/store directory. So you might build one version of your project that ends up in this directory:

/nix/store/2gk7rk2sxx2dkmsjr59gignrfdmya8f6s-my-project-1.0.1

And then you might change the dependencies and end up with another directory.

/nix/store/lg5mkbclaphyayzsxnjlxcko3kll5nbaie-my-project-1.0.2

What are the consequences of this?

Notice it’s very easy to version our project! If we decide to roll back to a previous set of dependencies, that version will still be living on our machine! We’ll update the dependency set. Nix then calculates the hash of the dependency graph, and this will match an old configuration. So we’ll be all set! This goes for any of our dependencies as well.
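
Here is a minimal sketch of the idea that a store path is a function of the dependency graph. It is a toy model under loud assumptions: real Nix hashes full build descriptions with a cryptographic hash, while this example hashes only names and versions with FNV-1a, a simple non-cryptographic hash. The Package type and the storePath and projectWith functions are hypothetical names for illustration.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl', intercalate)
import Data.Word (Word64)
import Numeric (showHex)

-- A package is described by its name, version and dependencies.
data Package = Package
  { pkgName    :: String
  , pkgVersion :: String
  , pkgDeps    :: [Package]
  }

-- FNV-1a: a simple NON-cryptographic hash standing in for Nix's real hashing.
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- The path depends on the package itself and, recursively, on every dependency's path.
storePath :: Package -> FilePath
storePath p =
  "/nix/store/" ++ showHex (fnv1a key) "" ++ "-" ++ pkgName p ++ "-" ++ pkgVersion p
  where
    key = intercalate "|" (pkgName p : pkgVersion p : map storePath (pkgDeps p))

-- A hypothetical project whose only dependency is some version of "text".
projectWith :: String -> Package
projectWith textVersion =
  Package "my-project" "1.0.1" [Package "text" textVersion []]
```

Because storePath is a pure function, projectWith "1.2.3" always maps to the same path, so going back to an old dependency set lands on the directory that was built before. Changing the dependency version changes the key that gets hashed, which gives the new build its own directory next to the old one.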

There are in fact specific commands related to rollbacks. This means you can upgrade packages without being afraid of any difficulties.

Nix also solves the second problem we mentioned above. First, we explicitly declare all the dependencies as inputs. And second, we only use dependencies we get from the Nix store, rather than any system wide location. This means our derivations are complete. Thus someone else should be able to take the definition and build it themselves.

NixOS

NixOS seeks to take many of the lessons from the Nix package manager and apply them at the OS level. Many of the problems that plague package management also plague OS management. For instance, upgrading packages with sudo apt-get install can be a risky operation. It can be difficult to roll back, and almost impossible to test what is going to happen before you upgrade. NixOS fixes these. It allows you to have versioned, reproducible system configurations. And you can roll back to a configuration with ease. It also gives you atomic transactions on system modifications. This way, even if something goes wrong, you’ll be completely reverted to your old system state.
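
One way to picture versioned configurations with rollback is as a stack of generations. The following is a minimal sketch of that mental model only. It says nothing about how NixOS is actually implemented, and Config, System, switchTo and rollback are hypothetical names.

```haskell
-- A system configuration, reduced here to a list of package names
-- (a deliberate simplification).
type Config = [String]

-- The active configuration plus every earlier generation, newest first.
data System = System
  { current  :: Config
  , previous :: [Config]
  } deriving (Eq, Show)

-- Switching never overwrites the old configuration; it is pushed onto the history.
switchTo :: Config -> System -> System
switchTo new (System cur olds) = System new (cur : olds)

-- Rolling back restores the most recent earlier generation, if there is one.
rollback :: System -> Maybe System
rollback (System _ [])           = Nothing
rollback (System _ (old : olds)) = Just (System old olds)
```

Since switchTo returns a whole new System value, there is no half-upgraded state in this model: we either have the old value or the new one, and rollback (switchTo new sys) always gives back Just sys.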

Weaknesses with Nix

One potential weakness with Nix is that it defaults to building from source. This means you’ll often have long build times, even for small changes in your code or dependencies. If you’re in luck, you can use the Nix cache for your specific libraries. It stores pre-built binaries you can use. But from my experience using Nix, the length of build times was one of the biggest things holding it back. In particular it was very difficult to incorporate Nix into a CI system, as it was prone to cause timeouts.

Conclusion

So hopefully this gives you some idea of what Nix is about. Next week, we’ll look into HNix. This open source project is seeking to re-implement Nix in Haskell. We’ll see why in our exploration of the project. In the meantime, check out some of our resources on Getting Started with Haskell so you can learn how to get going! And if you want a little bit of experience with package management in Haskell, make sure to try out Stack! Check out our free Stack mini-course to learn how!

Advanced Github: Webhooks and Automation

Github Haskell.png

A couple weeks ago, we saw how to use Docker in conjunction with Heroku to deploy our Haskell application. The process resulted in a simpler Circle CI config than we had before, as we let Docker do most of the heavy lifting. In particular, we no longer needed to download and build stack ourselves. We specified the build process in our Dockerfile, and then called docker build. We also saw a couple different ways to login to these services from our Circle CI box.

In the future, we’ll look at ways to use more diverse deployment platforms than Heroku. In particular, we’ll look at AWS. But that’s a tough nut to crack, so it might be worthy of its own series! For now, we’ll conclude our series on deployment by looking at the Github developer API. Most projects you’ll work on use Github for version control. But with the API, there are a lot of interesting tricks that can make your experience cooler! This week, we’ll see how to setup a server that will respond to events that happen within our repository. Then we’ll see how we can send our own events from the server! You can follow along with this code by looking at this Github repository!

This article builds a lot on our knowledge of the Servant library. If you’ve never used that before, I highly recommend you read our Haskell Web Skills series. You'll learn about Servant and much more! You can also download our Production Checklist for more tools to use in your applications.

Github Webhooks Primer

First let’s understand the concept of webhooks. Many services besides Github also use them. A webhook is an integration where a service will send an HTTP request to an endpoint of your choosing whenever some event happens. Webhooks are often a way for you to get some more advanced functionality out of a system. They can let you automate a lot of your processes. With Github, we can choose which events trigger a request. So for instance, we can trigger a request whenever someone creates a pull request.

In this article, we’ll set up a very simple server that will do just that. When a user opens a new PR, we’ll add a comment saying we’ll take a look at the pull request soon. We’ll also have the comment tag our account so we get a notification.

The Github part of this is easy. We go to the settings for our repository, and then find the “Webhooks” section. We’ll add a webhook for custom events, and we’ll only check the box next to “Pull Requests”. We’ll point it at the URL of a server that we’ll deploy on Heroku, hitting the /api/hook endpoint.

Building Our Server

First let’s make a data type for a Github request. This will be a simple two-constructor type. Our first constructor will contain information about an opened pull request. We’ll want to get the user’s name out of the request object, as well as the URL for us to send our comment to. We’ll also have an Other constructor for when the request isn’t about an open pull request.

data GithubRequest =
  GithubOpenPRRequest Text Text | -- User’s name, comments URL
  GithubOtherRequest
  deriving (Show)

So we need a simple server that listens for requests on a particular endpoint. As we have in the past, we’ll use Servant for this process. Our endpoint type will use our desired path. Then it will also take a request body with our GithubRequest. We’ll listen for a POST request, and then return a Text as our result, to help debug.

type ServerAPI = "api" :> "hook" :>
  ReqBody '[JSON] GithubRequest :> Post '[JSON] Text

Now we need to specify a FromJSON instance for our request type. Using the documentation, we’ll find a few fields we need to read to make this happen. First, we’ll check that, indeed, this request has a pull_request section and that its action is “opened”. If these aren’t there, we’ll return Other:

instance FromJSON GithubRequest where
  parseJSON = withObject "GithubRequest" $ \o -> do
    (action :: Maybe Text) <- o .:? "action"
    (prSectionMaybe :: Maybe Value) <- o .:? "pull_request"
    case (action, prSectionMaybe) of
      (Just "opened", Just prSection) -> do
        ...
      _ -> return GithubOtherRequest

Now we can fetch the user section and the comments URL from the pull_request section. We do this with a function on a Data.Aeson object like so:

where
  fetchUserAndComments o' = do
    uSection <- o' .: "user"
    commentsURL <- o' .: "comments_url"
    return (uSection, commentsURL)

Note we want comments_url, NOT review_comments_url! We want to leave a single comment, rather than performing a full review of this PR. It was VERY annoying to figure out that the documentation covers this under the Issues section, NOT the section on pull requests! Once we get the user section and comments URL, we need one more step. We’ll get the user name out of the section, and we’ll return our final request!

instance FromJSON GithubRequest where
  parseJSON = withObject "GithubRequest" $ \o -> do
    (action :: Maybe Text) <- o .:? "action"
    (prSectionMaybe :: Maybe Value) <- o .:? "pull_request"
    case (action, prSectionMaybe) of
      (Just "opened", Just prSection) -> do
        (userSection :: Value, commentsURL :: Text) <-
          withObject "PR Section" fetchUserAndComments prSection
        userName <-
          withObject "User Section" (\o' -> o' .: "login") userSection
        return $ GithubOpenPRRequest userName commentsURL
      _ -> return GithubOtherRequest
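
As a quick sanity check on the parsing logic, here is a minimal, self-contained sketch that runs the instance against a hand-written payload. It assumes the aeson, text, and bytestring packages. The sample JSON is a heavily trimmed, made-up stand-in for a real Github payload (the login and URL are hypothetical), and the Eq instance is added only so the results can be compared.

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Aeson (FromJSON (..), Value, eitherDecode, withObject, (.:), (.:?))
import Data.ByteString.Lazy (ByteString)
import Data.Text (Text)

-- Same shape as the article's type; Eq is an addition for this sketch.
data GithubRequest
  = GithubOpenPRRequest Text Text -- user's name, comments URL
  | GithubOtherRequest
  deriving (Show, Eq)

instance FromJSON GithubRequest where
  parseJSON = withObject "GithubRequest" $ \o -> do
    (action :: Maybe Text) <- o .:? "action"
    (prSectionMaybe :: Maybe Value) <- o .:? "pull_request"
    case (action, prSectionMaybe) of
      (Just "opened", Just prSection) -> do
        (userSection :: Value, commentsURL :: Text) <-
          withObject "PR Section" fetchUserAndComments prSection
        userName <-
          withObject "User Section" (\o' -> o' .: "login") userSection
        return $ GithubOpenPRRequest userName commentsURL
      _ -> return GithubOtherRequest
    where
      fetchUserAndComments o' = do
        uSection <- o' .: "user"
        commentsURL <- o' .: "comments_url"
        return (uSection, commentsURL)

-- Hypothetical, trimmed payload: only the fields the parser reads.
samplePayload :: ByteString
samplePayload = mconcat
  [ "{\"action\": \"opened\","
  , " \"pull_request\": {"
  , "   \"user\": {\"login\": \"octocat\"},"
  , "   \"comments_url\": \"https://api.github.com/repos/octocat/example/issues/1/comments\""
  , " }}"
  ]

main :: IO ()
main = do
  -- An opened pull request should yield the first constructor.
  print (eitherDecode samplePayload :: Either String GithubRequest)
  -- Any other event should fall through to GithubOtherRequest.
  print (eitherDecode "{\"action\": \"closed\"}" :: Either String GithubRequest)
```

Using eitherDecode rather than decode means a malformed pull_request section shows up as a readable error message instead of a bare Nothing.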

Handling the Endpoint

Now we need a handler function for our endpoint. This handler will pattern match on the type of request and return a debugging string. If we have indeed found a request to open the PR, we’ll also want to call another IO function that will add our comment:

hookHandler :: GithubRequest -> Handler Text
hookHandler GithubOtherRequest =
  return "Found a non-PR opening request."
hookHandler (GithubOpenPRRequest userName commentsURL) = do
  liftIO $ addComment userName commentsURL
  return $ "User: " <> userName <>
    " opened a pull request with comments at: " <> commentsURL

addComment :: Text -> Text -> IO ()
...
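
To see how the endpoint type and handler plug into a running server, here is a minimal sketch. It assumes the servant-server and warp packages. The FromJSON instance is a placeholder standing in for the real one from the previous section, the handler leaves out the addComment call, and the names hookAPI, app, and parsePort are hypothetical rather than taken from the article’s repository. It reads the port from the PORT environment variable, which Heroku sets for web processes.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeOperators #-}

import Data.Aeson (FromJSON (..))
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import Network.Wai (Application)
import Network.Wai.Handler.Warp (run)
import Servant
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

data GithubRequest
  = GithubOpenPRRequest Text Text -- user's name, comments URL
  | GithubOtherRequest
  deriving (Show)

-- PLACEHOLDER: stands in for the real instance from the previous section.
instance FromJSON GithubRequest where
  parseJSON _ = return GithubOtherRequest

type ServerAPI = "api" :> "hook" :>
  ReqBody '[JSON] GithubRequest :> Post '[JSON] Text

-- Same shape as the article's handler, minus the call to addComment.
hookHandler :: GithubRequest -> Handler Text
hookHandler GithubOtherRequest =
  return "Found a non-PR opening request."
hookHandler (GithubOpenPRRequest userName commentsURL) =
  return $ "User: " <> userName <>
    " opened a pull request with comments at: " <> commentsURL

hookAPI :: Proxy ServerAPI
hookAPI = Proxy

-- Turn the API type and its handler into a WAI application.
app :: Application
app = serve hookAPI hookHandler

-- Fall back to 8080 (an arbitrary choice) when PORT is unset or malformed.
parsePort :: Maybe String -> Int
parsePort mPort = fromMaybe 8080 (mPort >>= readMaybe)

main :: IO ()
main = do
  port <- parsePort <$> lookupEnv "PORT"
  run port app
```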

Adding a Comment

In order to add a comment to this pull request, we’ll need to hit the Github API with our own request. Again, we’ll do this using Servant’s magic! First, let’s make another API type to represent Github’s own developer API. Since we’re getting the full comments URL as part of our request, we don’t need any path components here. But we will need to authenticate using BasicAuth:

type GithubAPI = BasicAuth "GithubUser" () :>
  ReqBody '[JSON] GitPRComment :> Post '[JSON] ()

Our GitPRComment will only need a Text for the body of the comment. So let’s make a simple newtype wrapper and add a ToJSON instance for it:

newtype GitPRComment = GitPRComment Text

instance ToJSON GitPRComment where
  toJSON (GitPRComment body) = object [ "body" .= body ]
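
To confirm what this instance puts on the wire, here is a small sketch that encodes a comment and prints the resulting JSON. It assumes the aeson, text, and bytestring packages, and the comment text is just an example value.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (ToJSON (..), encode, object, (.=))
import qualified Data.ByteString.Lazy.Char8 as BLC
import Data.Text (Text)

newtype GitPRComment = GitPRComment Text

instance ToJSON GitPRComment where
  toJSON (GitPRComment body) = object ["body" .= body]

main :: IO ()
main =
  -- Prints: {"body":"Thanks for posting this!"}
  BLC.putStrLn (encode (GitPRComment "Thanks for posting this!"))
```

The single "body" key is the only field our comment needs.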

We can create a client function for this API now using the magic client function from Servant.Client:

sendCommentClient :: BasicAuthData -> GitPRComment -> ClientM ()
sendCommentClient = client (Proxy :: Proxy GithubAPI)

Now to build our commenting function, we’ll start by building the auth data.

import qualified Data.ByteString.Char8 as BSC

...
addComment :: Text -> Text -> IO ()
addComment userName commentsURL = do
  gitUsername <- getEnv "GITHUB_USERNAME"
  gitPassword <- getEnv "GITHUB_PASSWORD"
  let authData = BasicAuthData (BSC.pack gitUsername)
                               (BSC.pack gitPassword)
  ...

Now we’ll set up our client environment using the comments URL:

addComment :: Text -> Text -> IO ()
addComment userName commentsURL = do
  ...
  manager <- newManager tlsManagerSettings
  baseUrl <- parseBaseUrl (Data.Text.unpack commentsURL)
  let clientEnv = mkClientEnv manager baseUrl
  ...

We’ll add a simple function taking our admin’s username and composing the body of the comment. We’ll tag ourselves as well as the user who opened the PR:

addComment :: Text -> Text -> IO ()
addComment userName commentsURL = do
  ...
  where
     commentBody adminName = GitPRComment $
       "Thanks for posting this @" <> userName <>
       "! I'll take a look soon! - @" <> adminName
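
One detail worth spelling out: getEnv returns a String, while the comment body is built from Text, so the admin name has to be packed before it is appended. The sketch below shows just the text-building step as a standalone function. It assumes the text package; commentText and both account names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Build the comment text, tagging the PR author and the admin account.
commentText :: Text -> Text -> Text
commentText userName adminName =
  "Thanks for posting this @" <> userName <>
  "! I'll take a look soon! - @" <> adminName

main :: IO ()
main = do
  -- Stand-in for the String that getEnv "GITHUB_USERNAME" would return.
  let adminFromEnv = "my-admin-account" :: String
  TIO.putStrLn (commentText "octocat" (T.pack adminFromEnv))
```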

Now we wrap everything together by making our client call. And that’s it!

addComment :: Text -> Text -> IO ()
addComment userName commentsURL = do
  gitUsername <- getEnv "GITHUB_USERNAME"
  gitPassword <- getEnv "GITHUB_PASSWORD"
  let authData = BasicAuthData (BSC.pack gitUsername)
                               (BSC.pack gitPassword)
  manager <- newManager tlsManagerSettings
  baseUrl <- parseBaseUrl (Data.Text.unpack commentsURL)
  -- mkClientEnv comes from Servant.Client (servant-client 0.13 and later)
  let clientEnv = mkClientEnv manager baseUrl
  -- getEnv gives us a String, so pack it into Text for the comment body
  _ <- runClientM (sendCommentClient
                     authData
                     (commentBody (Data.Text.pack gitUsername)))
                  clientEnv
  return ()
  where
    commentBody = ...
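
Note that this version throws away the Either that runClientM returns, so a failed request passes silently. Here is a minimal sketch of reporting the outcome instead, using only base. The helper names describeResult and reportResult are hypothetical, and they are generic in the error type because the name of servant-client’s error type differs between versions.

```haskell
import System.IO (hPutStrLn, stderr)

-- Turn the outcome of a client call into a log line.
describeResult :: Show e => Either e a -> String
describeResult (Left err) = "Failed to post comment: " <> show err
describeResult (Right _) = "Comment posted."

-- Send failures to stderr and successes to stdout.
reportResult :: Show e => Either e a -> IO ()
reportResult result = case result of
  Left _ -> hPutStrLn stderr (describeResult result)
  Right _ -> putStrLn (describeResult result)

main :: IO ()
main = do
  -- Stand-ins for the values runClientM would return.
  reportResult (Left "connection refused" :: Either String ())
  reportResult (Right () :: Either String ())
```

Inside addComment, this would mean ending the do-block with runClientM (...) clientEnv >>= reportResult rather than discarding the result.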

Conclusion

Services like Github do their best to provide a good user experience to all their normal users. But if you get a little bit advanced, you can often customize their behavior to a great degree! Notice how important it is to know how to set up a simple server. This gives you a lot of freedom to extend the system and add your own behaviors. It’s a cool perk of learning these specific web skills. If you want to see the full code I wrote for this article, check it out on this Github repo!

To learn about more web skills that can magnify your programming ability, check out our Haskell Web Skills Series. It’ll walk you through some different Haskell libraries, like Persistent for databases, and Servant for web servers. You can also download our Production Checklist. It’ll give you a lot more ideas of libraries to use to enhance your Haskell experience!