Beam: Database Power without Template Haskell!

haskell_beam.png

As part of our Haskell Web Series, we examined the Persistent and Esqueleto libraries. The first of these allows you to create a database schema in a special syntax. You can then use Template Haskell to generate all the necessary Haskell data types and instances for your types. Even better, you can write Haskell code to query on these that resembles SQL. These queries are type-safe, which is awesome. However, the need to specify our schema with template Haskell presented some drawbacks. For instance, the code takes longer to compile and is less approachable for beginners.

This week on the blog, we'll be exploring another database library called Beam. This library allows us to specify our database schema without using Template Haskell. There's some boilerplate involved, but it's not bad at all! Like Persistent, Beam has support for many backends, such as SQLite and PostgresQL. Unlike Persistent, Beam also supports join queries as a built-in part of its system.

For some more ideas on advanced libraries, be sure to check out our Production Checklist! It includes a couple more different database options to look at.

Specifying our Types

As a first note, while Beam doesn't require Template Haskell, it does need a lot of other compiler extensions. You can look at those in the appendix below, or else take a look at the example code on Github. Now let's think back to how we specified our schema when using Persistent:

import qualified Database.Persist.TH as PTH

PTH.share [PTH.mkPersist PTH.sqlSettings, PTH.mkMigrate "migrateAll"] [PTH.persistLowerCase|
  User sql=users
    name Text
    email Text
    age Int
    occupation Text
    UniqueEmail email
    deriving Show Read Eq

  Article sql=articles
    title Text
    body Text
    publishedTime UTCTime
    authorId UserId
    UniqueTitle title
    deriving Show Read Eq

With Beam, we won't use Template Haskell, so we'll actually be creating normal Haskell data types. There will still be some oddities though. First, by convention, we'll specify our types with the extra character T at the end. This is unnecessary, but the convention helps us remember what types relate to tables. We'll also have to provide an extra type parameter f, that we'll get into a bit more later:

data UserT f =
  …

data ArticleT f =
  ...

Our next convention will be to use an underscore in front of our field names. We will also, unlike Persistent, specify the type name in the field names. With these conventions, I'm following the advice of the library's creator, Travis.

data UserT f =
  { _userId :: ...
  , _userName :: …
  , _userEmail :: …
  , _userAge :: …
  , _userOccupation :: …
  }

data ArticleT f =
  { _articleId :: …
  , _articleTitle :: …
  , _articleBody :: …
  , _articlePublishedTime :: …
  }

So when we specify the actual types of each field, we'll just put the relevant data type, like Int, Text or whatever, right? Well, not quite. To complete our types, we're going to fill in each field with the type we want, except specified via Columnar f. Also, we'll derive Generic on both of these types, which will allow Beam to work its magic:

data UserT f =
  { _userId :: Columnar f Int64
  , _userName :: Columnar f Text
  , _userEmail :: Columnar f Text
  , _userAge :: Columnar f Int
  , _userOccupation :: Columnar f Text
  } deriving (Generic)

data ArticleT f =
  { _articleId :: Columnar f Int64
  , _articleTitle :: Columnar f Text
  , _articleBody :: Columnar f Text
  , _articlePublishedTime :: Columnar f Int64 -- Unix Epoch
  } deriving (Generic)

Now there are a couple small differences between this and our previous schema. First, we have the primary key as an explicit field of our type. With Persistent, we separated it using the Entity abstraction. We'll see below how we can deal with situations where that key isn't known. The second difference is that (for now), we've left out the userId field on the article. We'll add this when we deal with primary keys.

Columar

So what exactly is this Columnar business about? Well under most circumstances, we'd like to specify a User with the raw field types. But there are some situations where we'll have to use a more complicated type for an SQL expression. Let's start with the simple case first.

Luckily, Columnar works in such a way that if we useIdentity for f, we can use raw types to fill in the field values. We'll make a type synonym specifically for this identity case. We can then make some examples:

type User = UserT Identity
type Article = ArticleT Identity

user1 :: User
user1 = User 1 "James" "james@example.com" 25 "programmer"

user2 :: User
user2 = User 2 "Katie" "katie@example.com " 25 "engineer"

users :: [User]
users = [ user1, user2 ]

As a note, if you find it cumbersome to repeat the Columnar keyword, you can shorten it to C:

data UserT f =
  { _userId :: C f Int64
  , _userName :: C f Text
  , _userEmail :: C f Text
  , _userAge :: C f Int
  , _userOccupation :: C f Text
  } deriving (Generic)

Now, our initial examples will assign all our fields with raw values. So we won't initially need to use anything for the f parameter besides Identity. Further down though, we'll deal with the case of auto-incrementing primary keys. In this case, we'll use the default_ function, whose type is actually a Beam form of an SQL expression. In this case, we'll be using a different type for f, but the flexibility will allow us to keep using our User constructor!

Instances for Our Types

Now that we've specified our types, we can use the Beamable and Table type classes to tell Beam more about our types. Before we can make any of these types a Table, we'll want to assign its primary key type. So let's make a couple more type synonyms to represent these:

type UserId = PrimaryKey UserT Identity
type ArticleId = PrimaryKey ArticleT Identity

While we're at it, let's add that foreign key to our Article type:

data ArticleT f =
  { _articleId :: Columnar f Int64
  , _articleTitle :: Columnar f Text
  , _articleBody :: Columnar f Text
  , _articlePublishedTime :: Columnar f Int64
  , _articleUserId :: PrimaryKey UserT f
  } deriving (Generic)

We can now generate instances for Beamable both on our main types and on the primary key types. We'll also derive instances for Show and Eq:

data UserT f =
  …

deriving instance Show User
deriving instance Eq User

instance Beamable UserT
instance Beamable (PrimaryKey UserT)

data ArticleT f =
  …

deriving instance Show Article
deriving instance Eq Article

instance Beamable ArticleT
instance Beamable (PrimaryKey ArticleT)

Now we'll create an instance for the Table class. This will involve some type family syntax. We'll specify UserId and ArticleId as our primary key data types. Then we can fill in the primaryKey function to match up the right field.

instance Table UserT where
  data PrimaryKey UserT f = UserId (Columnar f Int64) deriving Generic
  primaryKey = UserId . _userId

instance Table ArticleT where
  data PrimaryKey ArticleT f = ArticleId (Columnar f Int64) deriving Generic
  primaryKey = ArticleId . _articleId

Accessor Lenses

We'll do one more thing to mimic Persistent. The Template Haskell automatically generated lenses for us. We could use those when making database queries. Below, we'll use something similar. But we'll use a special function, tableLenses, to make these rather than Template Haskell. If you remember back to how we used the Servant Client library, we could create client functions by using client and matching it against a pattern. We'll do something similar with tableLenses. We'll use LensFor on each field of our tables, and create a pattern constructing an item.

User
  (LensFor userId)
  (LensFor userName)
  (LensFor userEmail)
  (LensFor userAge)
  (LensFor userOccupation) = tableLenses

Article
  (LensFor articleId)
  (LensFor articleTitle)
  (LensFor articleBody)
  (LensFor articlePublishedTime)
  (UserId (LensFor articuleUserId)) = tableLenses

Note we have to wrap the foreign key lens in UserId.

Creating our Database

Now unlike Persistent, we'll create an extra type that will represent our database. Each of our two tables will have a field within this database:

data BlogDB f = BlogDB
  { _blogUsers :: f (TableEntity UserT)
  , _blogArticles :: f (TableEntity ArticleT)
  } deriving (Generic)

We'll need to make our database type an instance of the Database class. We'll also specify a set of default settings we can use on our database. Both of these items will involve a parameter be, which stands for a backend, (e.g. SQLite, Postgres). We leave this parameter generic for now.

instance Database be BlogDB

blogDb :: DatabaseSettings be BlogDB
blogDb = defaultDbSettings

Inserting into Our Database

Now, migrating our database with Beam is a little more complicated than it is with Persistent. We might cover that in a later article. For now, we'll keep things simple, and use an SQLite database and migrate it ourselves. So let's first create our tables. We have to follow Beam's conventions here, particularly on the user_id__id field for our foreign key:

CREATE TABLE users \
  ( id INTEGER PRIMARY KEY AUTOINCREMENT\
  , name VARCHAR NOT NULL \
  , email VARCHAR NOT NULL \
  , age INTEGER NOT NULL \
  , occupation VARCHAR NOT NULL \
  );
CREATE TABLE articles \
  ( id INTEGER PRIMARY KEY AUTOINCREMENT \
  , title VARCHAR NOT NULL \
  , body VARCHAR NOT NULL \
  , published_time INTEGER NOT NULL \
  , user_id__id INTEGER NOT NULL \
  );

Now we want to write a couple queries that can interact with the database. Let's start by inserting our raw users. We begin by opening up an SQLite connection, and we'll write a function that uses this connection:

import Database.SQLite.Simple (open, Connection)

main :: IO ()
main = do
  conn <- open "blogdb1.db"
  insertUsers conn

insertUsers :: Connection -> IO ()
insertUsers = ...

We start our expression by using runBeamSqlite and passing the connection. Then we use runInsert to specify to Beam that we wish to make an insert statement.

import Database.Beam
import Database.Beam.SQLite

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  ...

Now we'll use the insert function and signal which one of our tables we want out of our database:

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  insert (_blogUsers blogDb) $ ...

Last, since we are inserting raw values (UserT Identity), we use the insertValues function to complete this call:

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  insert (_blogUsers blogDb) $ insertValues users

And now we can check and verify that our users exist!

SELECT * FROM users;
1|James|james@example.com|25|programmer
2|Katie|katie@example.com|25|engineer

Let's do the same for articles. We'll use the pk function to access the primary key of a particular User:

article1 :: Article
article1 = Article 1 "First article" 
  "A great article" 1531193221 (pk user1)

article2 :: Article
article2 = Article 2 "Second article" 
  "A better article" 1531199221 (pk user2)

article3 :: Article
article3 = Article 3 "Third article" 
  "The best article" 1531200221 (pk user1)

articles :: [Article]
articles = [ article1, article2, article3]

insertArticles :: Connection -> IO ()
insertArticles conn = runBeamSqlite conn $ runInsert $
  insert (_blogArticles blogDb) $ insertValues articles

Select Queries

Now that we've inserted a couple elements, let's run some basic select statements. In general for select, we'll want the runSelectReturningList function. We could also query for a single element with a different function if we wanted:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ ...

Now we'll use select instead of insert from the last query. We'll also use the function all_ on our users field in the database to signify that we want them all. And that's all we need!:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select (all_ (_blogUsers blogDb))
  mapM_ (liftIO . putStrLn . show) users

To do a filtered query, we'll start with the same framework. But now we need to enhance our select statement into a monadic expression. We'll start by selecting user from all our users:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select $ do
   user <- (all_ (_blogUsers blogDb))
    ...
  mapM_ (liftIO . putStrLn . show) users

And we'll now filter on that by using guard_ and applying one of our lenses. We use a ==. operator for equality like in Persistent. We also have to wrap our raw comparison value with val:

findUsers :: Connection -> IO ()
findUsers conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select $ do
    user <- (all_ (_blogUsers blogDb))
    guard_ (user ^. userName ==. (val_ "James"))
    return user
  mapM_ (liftIO . putStrLn . show) users

And that's all we need! Beam will generate the SQL for us! Now let's try to do a join. This is actually much simpler in Beam than with Persistent/Esqueleto. All we need is to add a couple more statements to our "select" on the articles. We'll just filter them by the user ID!

findUsersAndArticles :: Connection -> IO ()
findUsersAndArticles conn = runBeamSqlite conn $ do
  users <- runSelectReturningList $ select $ do
    user <- (all_ (_blogUsers blogDb))
    guard_ (user ^. userName ==. (val_ "James"))
    articles <- (all_ (_blogArticles blogDb))
    guard_ (article ^. articleUserId ==. user ^. userId)
    return user
  mapM_ (liftIO . putStrLn . show) users

That's all there is to it!

Auto Incrementing Primary Keys

In the examples above, we hard-coded all our IDs. But this isn't typically what you want. We should let the database assign the ID via some rule, in our case auto-incrementing. In this case, instead of creating a User "value", we'll make an "expression". This is possible through the polymorphic f parameter in our type. We'll leave off the type signature since it's a bit confusing. But here's the expression we'll create:

user1' = User
  default_ 
  (val_ "James")
  (val_ "james@example.com")
  (val_ 25)
  (val_ "programmer")

We use default_ to represent an expression that will tell SQL to use a default value. Then we lift all our other values with val_. Finally, we'll use insertExpressions instead of insertValues in our Haskell expression.

insertUsers :: Connection -> IO ()
insertUsers conn = runBeamSqlite conn $ runInsert $
  insert (_blogUsers blogDb) $ insertExpressions [ user1' ]

Then we'll have our auto-incrementing key!

Conclusion

That concludes our introduction to the Beam library. As we saw, Beam is a great library that lets you specify a database schema without using any Template Haskell. For more details, make sure to check out the documentation!

For a more in depth look at using Haskell libraries to make a web app, be sure to read our Haskell Web Series. It goes over some database mechanics as well as creating APIs and testing. As an added challenge, trying re-writing the code in that series to use Beam instead of Persistent. See how much of the Servant code needs to change to accommodate that.

And for more examples of cool libraries, download our Production Checklist! There are some more database and API libraries you can check out!

Appendix: Compiler Extensions

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE NoMonoMorphismRestriction #-}