Try Fold First!
Last time we talked about map
and filter
, which allow us to do the work of certain, very simple for-loops in other languages. However, they have a major limitation in that they don't let us carry any kind of state from element to element. In order to get this, we'll need the idea of a Fold.
Folds are the most important idea I will talk about this month. When it pops into your head to use a for-loop, the first one you should try to use to represent that idea in Haskell is a fold. I used them constantly in my Advent of Code run last year. Let's start with a simple example of a for-loop in C++.
struct Point {
int x;
int y;
};
int myFunc(const std::vector<Point>& myPoints) {
int result = 0;
for (const auto& point : myPoints) {
if (result % 2 == 0) {
result += point.x;
} else {
result += point.y
}
}
return result;
}
Map and filter are not sufficient to replicate this in Haskell. We cannot filter myPoints
for those points where we add x
or y
because we need the accumulated results to determine which way each of them goes. We cannot merely map
a function over our list because, again, we lack the stateful information to figure out what information we need from each point.
But we can see the pattern developing.
We have our result variable, set to an initial value. Each element in our list affects the result in some way (based on the previous result). At the end we get our final result.
These element are all present if we look at the type signature for foldl
, the most basic folding function in Haskell.
foldl :: (a -> b -> a) -> a -> [b] -> a
Notice that a
is our "result" type, whereas b
is the type of the items in our list. Things become more clear if we do this renaming:
foldl ::
(result -> item -> result) -> -- Folding Function
result -> -- Initial Result
[item] -> -- List of inputs
result -- Final Result
The most important part of this is the "folding" function. This function takes the previous result and the next item in the list, and incorporates that item into the result. The best analogy I can think of is like a stew. It starts out as water/stock (the initial value). And then you incorporate each item into the stew (but it's still a stew after each thing you add). And at the end your have your "final stew".
In our example above, the folding function considers whether the previous result is even or odd, and uses that to determine which part of the point to add to make the new result.
myFuncFold :: Int -> Point -> Int
myFuncFold prev (Point x y) = if prev `mod` 2 == 0
then prev + x
else prev + y
Once we have the folding function, it is usually very simple to write the final expression!
myFunc :: [Point] -> Int
myFunc points = foldl myFuncFold 0 points
With the list as the last argument, you can often eta reduce:
myFunc :: [Point] -> Int
myFunc = foldl myFuncFold 0
As a note, there are three different folding functions. We'll seen foldl
, but there's also foldr
and foldl'
:
foldr :: (item -> result -> result) -> result -> [item] -> result
foldl' :: (result -> item -> result) -> result -> [item] -> result
Notice that the arguments of the folding function are flipped for foldr
! This is a "right" fold, so it will actually start from the right of your list instead of the left. For associative operations, order does not matter, but often it does!
>> foldl (+) 0 [1, 2, 3, 4]
10
>> foldr (+) 0 [1, 2, 3, 4]
10 -- Addition is associative, same result
>> foldl (-) 0 [1, 2, 3, 4]
-10
>> foldr (-) 0 [1, 2, 3, 4]
-2 -- Subtraction is not, different result!
More commonly foldl
will capture how you see the operation in your head. However, foldl
is more prone to stack overflows with large input lists due to laziness in Haskell. Using foldr
gets around these problems, but might force you to reverse your list. You can use foldl'
to solve both problems. It processes the list in the same order as foldl
, while introducing strictness that prevents memory accumulation. Note that foldl'
must be imported from Data.List
while the other two functions are included in Prelude
.
Finally, another great use for folds is manipulating data structures like sets and sequences. Suppose you're just trying to add a list of elements to an existing set. You could create a second set with fromList
and append.
addListToSet :: Set a -> [a] -> Set [a]
addListToSet prevSet items = prevSet <> (Set.fromList items)
However, this is inefficient. You can avoid making an intermediate set like so:
addListToSet :: Set a -> [a] -> Set a
addListToSet prevSet items = foldr Set.insert prevSet items
-- Alternative with foldl
addListToSet prevSet items = foldl (flip Set.insert) prevSet items
Set insertion is associative so the difference in the two above approaches doesn't matter. But with a Map
, the values for keys can be overwritten, so the order is important. Remember that!
If you master folds, you'll be well on your way to being more comfortable with Haskell. But stay tuned for more ways that Haskell will save you from for-loops! If you're interested in more beginner level tips, you should download our Beginners Checklist! It'll help you get set up with Haskell so you can be more effective!
Does Haskell have For Loops?
Before I get started with today's article, I'll mention that today, April 4th is the last day to take advantage of our Spring Sale! If you subscribe today, you'll get a discount code that will get you 20% off any of our courses. This code is good until 11:59PM tonight (PST, GMT-07), so don't wait!
But now on to our new topic for the month of April: For Loops! I'll start with my least favorite quote about Haskell and for loops.
"But Haskell doesn't even have for loops!"
-- Anyone who dislikes Haskell.
This statement is often used to demonstrate just how different Haskell is from other languages. And unfortunately often, it's used to scare beginners away from trying Haskell. For loops are, in some ways, one of the most fundamental building blocks of modern programming languages. After variables, "if" statements and perhaps basic functions, they are one of the first bits of syntax you'll need to pick up on when you're learning a language like C++, Java, Javascript, Python, Rust, or Go. It's almost just muscle memory for a lot of programmers. "Oh I have to do something with every element of an iteration...", and out come the following lines:
for (int i = 0; i < my_array.size(); ++i) {
// do something
}
// or
for (const auto& item : my_array) {
// do something
}
So the idea that Haskell "doesn't have for loops" can make it a scary prospect to learn Haskell. How can one give up something so fundamental to programming?
However, some programmers suggest, in apparent contrast, that "bare" loops, including for loops, are a code smell. In C++ especially, there is probably an existing std::algorithm
function that gives exactly the loop behavior you are looking for (e.g. std::search
, or std::copy_if
). And using these built-in functions will save you from some potential bugs.
A lot of these algorithms are "functional", requiring you to pass a function as an input to the algorithm. And this is the point! Haskell doesn't have the same broad, generic for
syntax (although a for
function does exist, as we'll explore). But it allows all the behaviors of for loops through different functional algorithms.
For the month of April, we'll explore all the common for-loop patterns and describe how to implement them in Haskell. Many of these are a simple matter of a single functional combinator. Today we'll look at two examples of this pattern: map
and filter
.
Consider these two patterns we could write.
- Given a list of integers, return a new list where each value is doubled.
- Given a list of integers, return a new list containing only the even values in the first list.
Here's some "bare" for loop code we could write to accomplish this in C++:
std::vector<int> myInts = {...};
std::vector<int> doubled;
std::vector<int> onlyEven;
for (int i = 0; i < myInts.size(); ++i) {
doubled.push_back(myInts[i] * 2);
}
for (int i = 0; i < myInts.size(); ++i) {
if (myInts[i] % 2 == 0) {
onlyEven.push_back(myInts[i]);
}
}
In Haskell, we don't need a for
construct for these tasks. We just have the combinators map
and filter
, which take functions as arguments. In the most basic sense, we can treat these as operating over lists.
map :: (a -> b) -> [a] -> [b]
filter :: (a -> Bool) -> [a] -> [a]
Getting our new lists is as easy as supplying a lambda function to each of these.
myInts :: [Int]
myInts = ...
doubled :: [Int]
doubled = map (2 *) myInts
onlyEven :: [Int]
onlyEven = filter (\x -> x `mod` 2 == 0) myInts
Both of these concepts generalize to many structures though! For the idea of "mapping", we can use the fmap
function over any Functor type.
fmap :: Functor f => (a -> b) -> f a -> f b
For filter, we can use any Foldable type.
filter :: Foldable t => (a -> Bool) -> t a -> t b
Most basic data structures implement these classes, so we can use these functions extremely generically!
Throughout the rest of the month we'll be talking about more ways Haskell replaces for-loops with functional constructs. Make sure to subscribe so you can stay up to date with the latest news!
Spring Sale ends in 4 Days!
On Monday we'll start April off with a fresh topic, but I wanted to take one more opportunity to remind you of the special Spring Sale going on right now with our courses at Monday Morning Haskell Academy! Here's what's new this time around:
You can now get the Effects Bundle, which includes Making Sense of Monads as well as Effectful Haskell! These courses will help you build a solid understanding of the critical concept of monads and how to use them in your programs. Between the two courses there's over 3.5 hours of video content, two mini-projects, and a full project where you'll deploy some Haskell server code to Heroku!
If you're super interested in using Haskell for machine learning, you can also enroll in our Haskell Brain course . This course is newly reopened, so if you missed out in the winter sale, you can sign up now! It will teach you the basics of using Tensorflow with Haskell as well as some other useful libraries.
Best of all you can get a further 20% discount on all of our courses if your subscribe to our newsletter! You'll get a discount code in your inbox that you can use on any of our courses.
But Which Course Should I Take?
Our course offerings have grown over the years so it might be a little overwhelming to figure out which one is right for you. Here are a few statements that might describe where you are in your Haskell journey, and which course you should look at!
I've never written a line of Haskell before and I want to try it out!
You should take Haskell From Scratch! This 7-module course will walk you through all the most basic elements of Haskell's syntax and teach you some basic problem solving skills.
I understand the basics but I want a deeper understanding of important concepts.
The Effects Bundle is the way to go here. You'll learn about monads from the ground up in Making Sense of Monads and then you'll learn different ways to organize your monads in Effectful Haskell.
I want to try writing a real-world application using Haskell.
Take a look at Practical Haskell! You'll learn the basics of building a web server, interacting with a database, and even adding a frontend with Elm!
I want to write a machine learning program in Haskell.
Haskell Brain is your bet!
FAQ
Last of all, here are some frequently asked questions you might have!
Do I have to wait to access any course content?
Nope! Our courses are all self-paced, and you'll have access to all content immediately!
How long will I have access to the course content?
There's no expiration! You get lifetime access to the content!
Can I get a discount?
In addition to the 20% discount you can get from subscribing to our newsletter, discounts are available for university students! We also recognize that the prices may be steep for people from countries with unfavorable exchange rates against the US Dollar and can provide an extra discount for these situations. Email me at james@mondaymorninghaskell.me for details on these special cases!
Will I be able to get a refund?
Yes! We provide full refunds with no questions asked within 30 days of purchase.
The Spring Sale ends on Monday, April 4th, so don't miss out! Subscribe, get the discount code, and head over to our courses page!
New Course Bundle!
This whole month, I've been writing about some of the basics of using monads in Haskell. This often misunderstood concept is very important for putting together more sophisticated Haskell programs. In my series on Monads and Functional Structures I discuss monads more from the ground up and go into a lot more depth about different ways to use monads.
But last year, I released two great new ways to learn about monads! If you head over to the Monday Morning Haskell Academy, you'll find two courses that are specifically geared towards this tricky concept!
First, there is Making Sense of Monads. If monads have always confused you and you don't even know where to start, start here! This beginner-level course starts from the ground up, similar to the monads series. But it goes into even more depth, offering lots of slides to clarify concepts, and providing you with the opportunity to practice your skills with dozens of exercises and two different mini-projects!
If you think you've got the basics down, you can try something more advanced with our Effectful Haskell course. This goes into way more depth about how we can actually use monads in a real application. It will teach you several different ways of representing and organizing side effects in your Haskell program, including the idea of Free Monads! By the end of this course, you'll have built your own simple web server and learned to host it on Heroku! This project can serve as an example for many more complicated ideas. If you know the basics of monads already, but want to know how they actually help you build a real program, this is the course for you!
Now perhaps both of these ideas sound appealing to you. For the first time, we're offering our Effects Bundle, which combines both of these courses! If you get them together, you'll save almost 30%!
Speaking of discounts, you can also get 20% off all our courses if you subscribe to our monthly newsletter! Subscribing always gives you access to our Subscriber Resources, but this week it will also get you a discount code on anything on Monday Morning Haskell Academy. This includes an additional discount on the the above-mentioned Effects Bundle. It also includes the newly re-opened Haskell Brain course, which will introduce you to the basics of using machine learning in Haskell! If you missed out on this course back in the winter, now's your chance to get started!
The sale will last for a week! So make sure you sign up and get those discounts before 11:59PM PST (GMT-07) next Monday, April 4th! Don't miss out!
Making your own Monad
There are many built-in monads from the MTL library that you'll find useful, like Reader, Writer, State, ExceptT, and so on. You can use transformers to combine these into a monad that is uniquely specific to your application. Today, we'll talk about how to make your monad into its own data type.
For our example monad, we'll incorporate an environment configuration using a Reader, a persistent application state, the ability to throw certain exceptions, and have all of that on top of IO. This would give us a monad like:
StateT AppState (ReaderT EnvConfig (ExceptT AppError IO)) a
And you might have many different functions in your application using this monad. Obviously you wouldn't want to keep writing down that long expression each time. So you could use a type alias, parameterized by the result type of the operation:
type AppMonad a = StateT AppState (ReaderT EnvConfig (ExceptT AppError IO)) a
loginUser :: AuthInfo -> AppMonad User
logoutUser :: AppMonad ()
printEnvironmentLogs :: AppMonad ()
However, an important trick to know is that we can make AppMonad
a proper type instead of a simple alias by using newtype
. To use this type as a monad, you'll need instances for the Monad
class and its ancestors. But we can derive those automatically, as long as we're using common MTL layers.
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype AppMonad a = AppMonad
(StateT AppState
(ReaderT EnvConfig
(ExceptT AppError IO))) a
deriving (Functor, Applicative, Monad)
Notice we are still using a
as a type parameter for the result!
Like other monads, it is usually a good idea to use a "run" function to give users an entrypoint into this monad. This function will need appropriate inputs. First of all, it will take the monadic action itself. Then it will take initial values for the state and configuration.
There are a few different options for the result value. In certain cases, it can be the "pure" result type from the computation. But if IO
is on the monad stack, your result will need to be in the IO
monad. And in our case, since we've included ExceptT
, we'll also allow the result to be Either
, since it may encounter our error type.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
How do we write such a function? The answer is that we'll incorporate the "run" functions of all the other monads on our stack! The key is knowing how to destructure a single monadic action. First, we pattern match our input action. Below, the expression stateAction
corresponds to a StateT
value, because that is the outer layer of our monad.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ...
How do we make progress with a stateAction
? By using a "run" function from the State
monad of course! In our case, we don't care about the final stateful value, so we'll use evalStateT
. This gives us an expression at the next level of the monad, which is a ReaderT
.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ...
where
readerAction :: ReaderT (ExceptT AppError IO) a
readerAction = evalStateT stateAction appState
Now we can do the same thing to unwind the Reader
action. We'll call runReaderT
using our supplied environment config. This gives us an action in the ExceptT
layer.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ...
where
readerAction :: ReaderT (ExceptT AppError IO) a
readerAction = evalStateT stateAction appState
exceptAction :: ExceptT AppError IO a
exceptAction = runReaderT readerAction envConfig
And finally, we use runExceptT
to unwind the ExceptT
layer, which gives us an IO action. This is our final result.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ioAction
where
readerAction :: ReaderT (ExceptT AppError IO) a
readerAction = evalStateT stateAction appState
exceptAction :: ExceptT AppError IO a
exceptAction = runReaderT readerAction envConfig
ioAction :: IO (Either AppError a)
ioAction = runExceptT exceptAction
Now we can use our monad in any location that has access to the initial state values and the IO monad!
There are a couple more tricks we can pull with our own monad. We can, for example, make this an instance of certain monadic type classes. This will make it easier to incorporate normal IO actions, or let us use State
actions without needing to lift
. Here are a couple examples:
instance MonadIO AppMonad where
liftIO = AppMonad . lift . lift . lift
instance MonadState AppState AppMonad where
get = AppMonad get
put = AppMonad . put
We could even create our own typeclass related to this monad! This idea is a bit more specialized. So if you want to learn more about this and other monad ideas, you should follow up and read our full Monads Series! You can also subscribe to our monthly newsletter so you can keep up to date with the latest Haskell news and offers!
An Alternative Approach
Part of what monads do is that they encapsulate side effects. They typically include activity that is not a simple and pure calculation like 2 + 2
. Because of this, there is a much higher chance that monadic actions will "fail" in some way. They won't be able to achieve the stated goal of the computation and will enter some kind of exceptional flow. There are a number of different classes that help us deal with these failures in reasonable ways. They'll also enable us to write code that can work with many different monads. These classes are Alternative
, MonadFail
, and MonadPlus
.
The Alternative
class works for all "applicative" types, so it actually applies more broadly than monads. But still, most of the common users of it are monads. As you might guess, this class allows us to say, "In case this action fails, do this instead." It has two primary functions: empty
and the operator (<|>)
.
The empty
function provides a "failure" condition of sorts. What happens when we can't execute the monadic (or applicative) action? For a monad like Maybe
, the outcome of this is still something we could resolve with "pure" code. We just get the value Nothing
.
instance Alternative Monad where
empty = Nothing
...
However, IO
provides a failure mechanism that will cause a runtime error if the operation doesn't succeed:
instance Alternative IO where
empty = failIO "mzero"
...
Using failIO
will throw an IOError
that will crash our whole program unless it gets caught elsewhere. The word "mzero" is a default failure message that will make more sense in a second!
Now the (<|>)
operator is, of course intended to look like the "or" operator (||)
. It takes two different actions as its inputs. It allows us to provide a different action to run if our first fails. So for Maybe
, we can see if our first result is Nothing
. And if so, we'll resolve the second Maybe
value. Of course, this second value might also be Nothing
! Using alternatives doesn't always guarantee success!
instance Alternative Monad where
empty = Nothing
m1 <|> m2 = case m1 of
Nothing -> m2
justValue -> justValue
We can see this in action:
>> let f x = if even x then Just (quot x 2) else Nothing
>> f 4
Just 2
>> f 3
Nothing
>> f 4 <|> f 8
Just 2
>> f 5 <|> f 8
Just 4
>> f 3 <|> f 5
Nothing
With IO
, we actually need to catch the exception thrown by failIO
and then perform the next action:
instance Alternative IO where
empty = failIO "mzero"
action1 <|> action2 = action1 `catchException` (\_ :: IOError -> action2)
Here's a quick look at these functions with an IO
action:
>> readFile "does_not_exist.txt"
Error: openFile: does not exist (No such file or directory)
>> readFile "does_not_exist.txt" <|> readFile "does_exist.txt"
"Text in second file"
There are a couple other functions we can use with Alternative
to provide a list of outcomes. The many
function will take a single operation and run it repeatedly until the operation fails by returning empty
:
many :: (Alternative f) => f a -> f [a]
In this case, we are guaranteed that the result of the function is never an error (empty
)! If the first attempt at the operation fails, we'll get an empty list.
But if we want to enforce that it will succeed at least once, we can use some
instead:
some :: (Alternative f) => f a -> f [a]
This cannot return an empty list. If the first result is empty
, it will give the failure action.
It may seem a little odd that these functions take only a single action, rather than a list of actions. But these functions (and the Alternative
class in general) lie at the heart of many parsing operations! You can learn more about that in our Parsing series.
There are a couple other classes that build on these Alternative
ideas. The MonadFail
class has one function: fail
. This function takes a string as an argument and will perform an appropriate failure action, often involving a runtime error:
class MonadFail m where
fail :: String -> m a
Then there's MonadPlus
. This takes the essential activities of Alternative
and raises them to be monad-specific. It has mzero
, which mimics empty
, and mplus
, which works like (<|>)
.
instance (Alternative m, Monad m) => MonadPlus m where
mzero :: m a
mplus :: m a -> m a -> m a
Often, the underlying alternative functions are used as the default instances, and there is no change in behavior. I personally find it a little confusing that "plus" is an "or" operation instead of "and" operation. I would expect that "adding" two operations would perform the first and then the second in succession. But this isn't what happens! If the first succeeds, the second never occurs.
Hopefully these different classes help you to write cleaner monadic operations. To learn more about the basics and fundamentals of monads, you should read our series on Monads and Functional Structures and subscribe to our newsletter!
Cool Monad Combinators
Haskell's if-statements work a bit differently from most other languages. In a language like C++ you can have an if statement that just has a single branch like this:
int sumOfList(const std::vector<int>& inputs, bool onlyHalf) {
size_t listSize = inputs.size();
if (onlyHalf) {
// Only operate on half the list
listSize = listSize / 2;
}
...
}
But a statement like that doesn't strictly fit into Haskell's paradigms of limiting side effects and assigning types to all expressions. An if statement has to be an expression like everything else, and that expression must have a type. So the way Haskell does this is that an if statement must have two branches (if and else) and each branch must be an expression with the same result type.
But what about a situation where you just want to do conditional logging? Here's another quick example:
int sumOfList(const std::vector<int>& inputs, bool isVerbose) {
if (isVerbose) {
std::cout << "Taking sum of list..." << std::endl;
}
...
}
In Haskell, we would need this function to be in the IO
monad, since it's printing to the console. But how would we represent the "conditional logging portion? We'd have to make an "if statement" where each branch has the same type. A putStrLn
expression has the type IO ()
, so we would need an empty
expression of type IO ()
as the other branch. So in this case return ()
works.
sumOfList :: [Int] -> Bool -> m Int
sumOfList inputs isVerbose = do
if isVerbose
then putStrLn "Taking sum of list..."
else return ()
return $ foldl (+) 0 inputs
But it could be annoying to have this pattern of return ()
on a lot of different branches. But there are a couple useful combinators to help us: when
and unless
.
when :: Bool -> m () -> m ()
unless :: Bool -> m () -> m ()
These simply take a boolean value and a monadic action, and they only perform the action based on the boolean value. So we could rewrite our code from above:
sumOfList :: [Int] -> Bool -> m Int
sumOfList inputs isVerbose = do
when isVerbose (putStrLn "Taking sum of list...")
return $ foldl (+) 0 inputs
Now it looks a lot cleaner. With when
, we perform the action whenever it's true. With unless
, the action occurs only when the input is false.
We can observe though that these functions only work when the result value of the input is ()
, which is to say it has no result. Because when the alternative is that "nothing happens", we can't produce a result other than ()
. So these combinators are only sensible when there's some kind of side effect from the monadic action, like printing to the console, or modifying some stateful value.
Sometimes, you may have an action that produces a desired side effect but also returns a value.
printSize :: [Int] -> IO Int
printSize inputs = do
putStrLn $ "Input has size: " ++ show (length inputs)
return $ length inputs
If you want to use this with when
or unless
, you'll have to change the action so it instead returns ()
. This is the job of the void
combinator. It just performs the action and then returns ()
at the end.
void :: m a -> m ()
void action = do
_ <- action
return ()
Now we could apply this with our different expressions above:
sumOfList :: [Int] -> Bool -> m Int
sumOfList inputs isVerbose = do
when isVerbose (void $ printSize inputs)
return $ foldl (+) 0 inputs
Hopefully you're able to use these combinators to make your Haskell a bit cleaner! If you're just getting started on your Haskell journey, you should download our Beginners Checklist! It'll provide you with some helpful tools to get going! Next week, we'll be back with some more monad tips!
Does your Monad even Lift?
Monad transformers are one of the keys to writing Haskell that solves tricker problems with interlocking effect structures. For a more complete look at monad transformers, you should take a look at Part 6 of our Monads Series. But for today we'll tackle the basic idea of "Lifting", which is one of the core ideas behind transformers.
We combine monads by placing them on a "stack". A common example might be StateT s IO
. This transformer allows us to keep track of a stateful value AND perform IO actions.
basicState :: Int -> StateT Int IO Int
basicState x = do
prev <- get
put (2 * prev + x)
return (x + prev)
So far, this function is only performing State
actions like get
and put
. We'd like it to also perform IO
actions like this:
basicState :: Int -> StateT Int IO Int
basicState x = do
prev <- get
put (2 * prev + x)
putStrLn "Double value and added input"
return (x + prev)
However this won't compile. The putStrLn
function lives in the IO
monad, and our overall expression is in the StateT
monad. How do we solve this? By using lift
:
basicState :: Int -> StateT Int IO Int
basicState x = do
prev <- get
put (2 * prev + x)
lift $ putStrLn "Double value and added input"
return (x + prev)
What exactly does lift
do here? It tells Haskell to take a function in one monad and treat it as though it's part of another monad transformer using the original monad as its "underlying" type.
lift :: (Monad m, MonadTrans t) => m a -> t m a
To specialize it to this example:
lift :: IO a -> StateT Int IO a
Why do we call this "lifting"? This comes back to treating monads like a "stack". In this example, IO
is on the bottom of the stack, and State Int
is on top of it. So we have to "lift" it up one level.
Sometimes you might need to lift multiple levels! Consider this example with State
, Writer
, and IO
. We have to use one lift
for tell
(the Writer
function) and two for putStrLn
(the IO
function).
stateWriterIO :: Int -> StateT Int (WriterT [String] IO) Int
stateWriterIO x = do
prev <- get
put (2 * prev + x)
lift . lift $ putStrLn "Double value and added input"
lift $ tell ["Returning input and previous value"]
return (x + prev)
However, with monad classes, we can sometimes skip using multiple lifts. When you combine IO
with any of the basic monads, you can use the special liftIO
function:
liftIO :: (MonadIO m) => IO a -> m a
stateWriterIO :: Int -> StateT Int (WriterT [String] IO) Int
stateWriterIO x = do
prev <- get
put (2 * prev + x)
liftIO $ putStrLn "Double value and added input"
lift $ tell ["Returning input and previous value"]
return (x + prev)
Again, if you monads are a stumbling block for you, I highly encourage you to do some more in depth study and read our Monads Series in some more depth. For our next two articles, we'll go over some other useful combinators with monads in Haskell!
Shorter Run Functions
Last time around, I discussed 'run' functions and how these are the "entrypoint" for using most monads. However, it's also useful to have a couple potential shortcuts up our sleeve. Today we'll go over a couple "shortcut" functions when you don't need everything the monad supplies.
Recall that with the Writer
and State
monads, the run
function produces two outputs. The first is the "result" of our computation (some type a
). The second is the stateful value tracked by the monad:
runWriter :: Writer w a -> (a, w)
runState :: State s a -> s -> (a, s)
There are times however, especially with State
, where the stateful value is the result. There are no shortage of functions out there that look like State s ()
. They have essentially no return value, but they update the tracked value:
doubleAndAddInput :: Int -> State Int ()
doubleAndAddInput x = do
modify (* 2)
modify (+ x)
Now let's think about running this computation. If we use runState
, we'll end up with a tuple where one of the elements is just the unit ()
.
>> runState (doubleAndAddInput 5) 6
((), 17)
Of course we can always just use snd
to ignore the first element of the tuple. But for the sake of code cleanliness it's nice to know that there are helper functions to skip this for you! Here are the exec
functions for these two monads. They will return only the tracked state value!
execWriter :: Writer w a -> w
execState :: State s a -> s -> s
So applying this with our example above, we would get the following:
>> execState (doubleAndAddInput 5) 6
17
On the flip side, there are also times where you don't care about the accumulated state. All you care about is the final result! In this case, the function you want for the State
monad is evalState
:
evalState :: State s a -> s -> a
So now we could supply a return value in our function:
doubleAndAddInput :: Int -> State Int Int
doubleAndAddInput x = do
prev <- get
put (2 * prev + x)
return (x + prev)
Then we can drop the accumulated state like so:
>> evalState (doubleAndAddInput 5) 6
11
For whatever reason, there doesn't seem to be evalWriter
in the library. Perhaps the logic here is that the accumulated Writer value doesn't affect the computation, so if you're going to ignore its output, it has no effect. However, I could imagine cases where you originally wrote the function as a Writer
, but in a particular usage of it, you don't need the value. So it's an interesting design decision.
Anyways, these functions also exist with monad transformers of course:
execStateT :: StateT s m a -> s -> m s
evalStateT :: StateT s m a -> s -> m a
execWriterT :: WriterT w m a -> m w
Next time we'll dig into monad transformers a bit more! In the meantime, learn more about monads by taking a look at our series on Monads and Functional Structures!
Running with Monads
I had a big stumbling block in learning monads. Perhaps unsurprisingly, this occurred because I was trying to take a particular monadic analogy too far in the college class where I first learned about them. I got the idea in my head that, "Monads are like a treasure chest", and my mental model went something like this:
- Monads are like a treasure chest.
- You can't directly access what's inside (the
a
inIO a
). - That is, unless you are already in that monad.
- In that case, the "bind" (
>>=
) operator let's us pass in a function that accesses the "inner value". - But the result we produce is re-wrapped in the monad!
And so my head was spinning in a loop trying to figure out how I could actually get into a monad in the first place.
However, the code I was looking at was not normal monadic code. It was IO
code. And I was conflating the IO
monad with all monads. Don't do this! The IO monad is, in fact, special! It does follow some of the same patterns as other monads. But this special property of "you can't access the inner value unless you're in IO" does not apply to other monads!
The majority of monads you will encounter can be accessed from pure code. The most common way of doing this is through a run
function. Here are three of the most common examples with the Reader
, Writer
and State
monads:
runReader :: Reader r a -> r -> a
runWriter :: Writer w a -> (a, w)
runState :: State s a -> s -> (a, s)
Most often, you'll need to supply an "initial" value, as we see with Reader
and State
. And many times you'll get the final stateful value as a second product of the function. This occurs in Writer
and State
.
Here's a simple example. We have a State
computation that adds 1 to the stored value, and then adds the previous stored value to the input, returning that. We can call into this stateful function from a totally pure function using runState
:
stateFunction :: Int -> State Int Int
stateFunction input = do
prev <- get
modify (+1)
return $ prev + input
callState :: Int -> (Int, Int)
callState x = runState (stateFunction (x + 5)) 11
...
>> callState 3
(19, 12)
>> callState 7
(23, 12)
With monad transformers, the concept of the "run" function is very similar. The functions now end with the suffix T
. The only difference is that it produces a value in the underlying monad m
:
runReaderT :: ReaderT r m a -> r -> m a
runWriterT :: WriterT w m a -> m (a, w)
runStateT :: StateT s m a -> s -> m (a, s)
Of course, there are exceptions to this pattern of "run" functions. As we learned about last time, Either
can be a monad, but we can also treat Either
values as totally normal objects. To access the inner values, we just need a case
statement or a pattern match. You don't need to take the "treasure box" approach. Or at least, with certain monads, the treasure box is very easy to unlock.
If we go back to IO for a second. There is no "run" function for IO. There is no runIO
function, or runIO_T
transformer. You can't conjure IO computations out of nothing (at least not safely). Your program's entrypoint is always a function that looks like:
main :: IO ()
To use any IO function in your program, you must have an unbroken chain of IO access going back to this main function, whether through the IO
monad itself or a transformer like StateT IO
. This pattern allows us to close off large parts of our program to IO computations. But no part of your program is firmly closed off to a State
computation. As long as you can generate the "initial" state, you can then access the State
monad via runState
.
So if you were struggling with the same stumbling block I was, hopefully this clears things up for you! If you want more examples of how monads work and how to apply these run functions, take a look at our series Monads and Functional Structures!
Using Either as a Monad
Now that February is over and we're into March, it's time for "Monads Month"! Over the course of the next month I'll be giving some helpful tips on different ways to use monads.
Today I'll start with a simple observation: the Either
type is a monad! For a long time, I used Either
as if it were just a normal type with no special rules. But its monadic behavior allows us to chain together several computations with it with ease!
Let's start from the beginning. What does Either
look like? Well, it's a very basic type that can essentially hold one of two types at runtime. It takes two type parameters and has two corresponding constructors. If it is "Left", then it will hold a value of the first type. If it is "Right" then it will hold a value of the second type.
data Either a b = Left a | Right b
A common semantic understanding of Either
is that it is an extension of Maybe
. The Maybe
type allows our computations to either succeed and produce a Just
result or fail and produce Nothing
. We can follow this pattern in Either
except that failures now produce some kind of object (the first type parameter) that allows us to distinguish different kinds of failures from each other.
Here's a basic example I like to give. Suppose we are validating a user registration, where they give us their email, their password, and their age. We'll provide simple functions for validating each of these input strings and converting them into newtype
values:
newtype Email = Email String
newtype Password = Password String
newtype Age = Age Int
validateEmail :: String -> Maybe Email
validateEmail input = if '@' member input
then Just (Email input)
else Nothing
validatePassword :: String -> Maybe Password
validatePassword input = if length input > 12
then Just (Password input)
else Nothing
validateAge :: String -> Maybe Age
validateAge input = case (readMaybe input :: Maybe Int) of
Nothing -> Nothing
Just a -> Just (Age a)
We can then chain these operations together using the monadic behavior of Maybe
, which short-circuits the computation if Nothing
is encountered.
data User = User Email Password Age
processInputs :: (String, String, String) -> Maybe User
processInputs (i1, i2, i3) = do
email <- validateEmail i1
password <- validatePassword i2
age <- validateAge i3
return $ User email password age
However, our final function won't have much to say about what the error was. It can only tell us that an error occurred. It can't tell us which input was problematic:
createUser :: IO (Maybe User)
createUser = do
i1 <- getLine
i2 <- getLine
i3 <- getLine
result <- processInputs (i1, i2, i3)
case result of
Nothing -> print "Couldn't create user from those inputs!" >> return Nothing
Just u -> return (Just u)
We can extend this example to use Either
instead of Maybe
. We can make a ValidationError
type that will help explain which kind of error a user encountered. Then we'll update each function to return Left ValidationError
instead of Nothing
in the failure cases.
data ValidationError =
BadEmail String |
BadPassword String |
BadAge String
deriving (Show)
validateEmail :: String -> Either ValidationError Email
validateEmail input = if '@' member input
then Right (Email input)
else Left (BadEmail input)
validatePassword :: String -> Either ValidationError Password
validatePassword input = if length input > 12
then Right (Password input)
else Left (BadPassword input)
validateAge :: String -> Either ValidationError Age
validateAge input = case (readMaybe input :: Maybe Int) of
Nothing -> Left (BadAge input)
Just a -> Right (Age a)
Because Either
is a monad that follows the same short-circuiting pattern as Maybe
, we can also chain these operations together. Only now, the result we give will have more information.
processInputs :: (String, String, String) -> Either ValidationError User
processInputs (i1, i2, i3) = do
email <- validateEmail i1
password <- validatePassword i2
age <- validateAge i3
return $ User email password age
createUser :: IO (Either ValidationError User)
createUser = do
i1 <- getLine
i2 <- getLine
i3 <- getLine
result <- processInputs (i1, i2, i3)
case result of
Left e -> print ("Validation Error: " ++ show e) >> return e
Right u -> return (Right u)
Whereas Maybe
gives us the monadic context of "this computation may fail", Either
can extend this context to say, "If this fails, the program will give you an error why."
Of course, it's not mandatory to view Either
in this way. You can simply use it as a value that could hold two arbitrary types with no error relationship:
parseIntOrString :: String -> Either Int String
parseIntOrString input = case (readMaybe input :: Maybe Int) of
Nothing -> Right input
Just i -> Left i
This is completely valid, you just might not find much use for the Monad
instance.
But you might find the monadic behavior helpful by making the Left
value represent a successful case. Suppose you're writing a function to deal with a multi-layered logic puzzle. For a simple example:
- If the first letter of the string is capitalized, return the third letter. Otherwise, drop the first letter from the string.
- If the third letter in the remainder is an 'a', return the final character. Otherwise, drop the last letter from the string. 3 (and so on with similar rules)
We can encode each rule as an Either
function:
rule1 :: String -> Either Char String
rule1 input = if isUpper (head input)
then Left (input !! 2)
else Right (tail input)
rule2 :: String -> Either Char String
rule2 input = if (input !! 2 == 'a')
then Left (last input)
else Right (init input)
rule3 :: String -> Either Char String
...
To solve this problem, we can use the Either
monad!
solveRules :: String -> Either Char String
solveRules input = do
result1 <- rule1 input
result2 <- rule2 result1
...
If you want to learn more about monads, you should check out our blog series! For a systematic, in depth introduction to the concept, you can also take our Making Sense of Monads course!
Treating Strings like Lists
We've spent the last month studying some of the intricacies of the different string types in Haskell. For this last article, I wanted to have a little bit of fun and consider how we might apply some of the ideas we learned from "list" functions back in January to these different string types.
The naive way to use these functions would be to transform your "Text" or "ByteString" back into a "String", run the list-based function, and then convert back. But it should hopefully be obvious now that that isn't such a great idea!
But these other string types still kind of seem like lists, so we should want to apply those functions. And because of this the authors for those packages included versions of the most common list based functions specifically geared for these types. For example, Text and ByteStrings have their own versions of functions like "map", "filter" and "fold".
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.ByteString as B
T.map :: (Char -> Char) -> T.Text -> T.Text
T.filter :: (Char -> Bool) -> T.Text -> T.Text
T.foldl :: (a -> Char -> a) -> a -> T.Text -> a
B.map :: (Word8 -> Word8) -> B.ByteString -> B.ByteString
B.filter :: (Word8 -> Bool) -> B.ByteString -> B.ByteString
B.foldl :: (a -> Word8 -> a) -> a -> B.ByteString -> a
Notice how for the ByteString
, instead of Char
being the underlying value, it's Word8
. Unfortunately, this makes it a little harder to do character-specific manipulations, but we can still try!
>> T.filter isLower "Hello World"
"ello orld"
>> B.map (+1) "abcde"
"bcdef"
The numeric property underlying ByteStrings means you can do some interesting mathematical things with them. For example, you can build a simple "Caesar Cypher" like so, if you assume that your input is all lower-case or spaces. The number 97 indicates the offset for the letter 'a' as a Word8
:
caesarCypher :: Int -> B.ByteString -> B.ByteString
caesarCypher shift = B.map shiftChar
where
shiftChar :: Word8 -> Word8
shiftChar 32 = 32 -- Space
shiftChar x = (((x - 97) + shift) `mod` 26) + 97
...
>> caesarCypher "hello there"
"mjqqt ymjwj"
Even more complicated functions like intercalate
and transpose
can be found in these libraries!
>> let bytestrings = ["One", "Two", "Three", "Four", "Five"] :: [B.ByteString]
>> B.transpose bytestrings
["OTTFF", "nwhoi", "eoruv", "ere", "e"]
>> let texts = ["Hello", "my", "friend"] :: [T.Text]
>> T.intercalate ", " texts
"Hello, my, friend"
So even if your program is using more advanced string types instead of the basic [Char]
, you can still get all the functional benefits of Haskell's many different ways of manipulating lists of values!
This is the end of our exploration of string types for now. But we'll be back with a new topic for March. So subscribe to our newsletter and keep coming back here if you want to stay up to date with the blog!
Fusion Powered Strings!
In the last few articles, we've gone through the different String types in Haskell, and I've made a few mentions of "efficiency". But what does this mean? It should be a bit more clear with Bytestrings. By using a more compact representation of the data, and operating on a lower level type like Word8
, our strings will take up less space in memory, and they'll have more continuity. This makes operations on them more efficient. But what is it, exactly, about Text
that makes it more efficient than using String
?
One part of the answer to that is the concept of fusion. Throughout the documentation on Data.Text
, you'll see the phrase "subject" to fusion. The short explanation is that GHC knows how to "fuse" Text operations together so that they happen more quickly and take less memory. Let's see a quick example of this.
Suppose we want to write a string manipulation function. This will drop the first 6 letters, add the letter 'a' on the beginning, append the word "goodbye" to the end, and then capitalize the whole thing. Here's what that function might look like using String
:
transformString :: String -> String
transformString s = map toUpper $ 'a' : (drop 6 s) ++ ", goodbye."
...
>> transformString "Hello Friend"
"AFRIEND, GOODBYE."
In the process of making our final string ("AFRIEND, GOODBYE."
), our program will actually make 4 different strings for each of the operations we run.
- First, it will drop the 6 letters, and allocate space for the string
"Friend"
. - Then, it will add the "a" to the front, giving us a new string
"aFriend"
. - Next, it will append "goodbye", and we'll have
"aFriend, goodbye."
. - Finally, it will uppercase everything, giving
"AFRIEND, GOODBYE."
.
Allocating all this memory seems a bit wasteful. After all, the first three strings were just intermediate computations! But Haskell can't modify these values in place, because values are immutable in Haskell!
However, the Text
type is specifically designed to work around this. It is written so that when compiled with optimizations, fusion functions can all be combined into a single step. So suppose we write the equivalent function with Text:
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
transformString :: T.Text -> T.Text
transformString s = T.map toUpper $ T.append (T.cons 'a' (T.drop 6 s)) ", goodbye."
...
>> transformString (T.pack "Hello Friend")
"AFRIEND, GOODBYE."
If we compile this code with sufficient optimization, it will only allocate memory for the final Text
value. There will be no intermediate allocations, because the whole function is "fused"!
Fusion is a tricky subject, but if you're just starting out with Haskell, you don't need to worry too much about it. You should focus on the basics, like getting conversions right between these types. For more tips and tricks that will help you on your Haskell journey, download our free Beginners Checklist. It'll give you some more help and guide you to other resources that will help you learn!
Loading Different Strings
In the last couple of articles we've explored the Text
and ByteString
types. These provide alternative string representations that are more efficient when compared to the basic "list of characters" definition. But we've also seen that it can be a minor pain to remember all the different ways we can convert back and forth between them. And in fact, not all conversions are actually guaranteed to work the way we would like.
Now let's imagine our program deals with input and output. Now as a beginner, it's easy to think that there are only a couple primary functions for dealing with input and output, and that these only operate on String
values:
putStrLn :: String -> IO ()
getLine :: IO String
If your program actually makes use of Text
, you might end up (as I have in many of my programs) constantly doing something like T.pack <$> getLine
to read the input as a Text
, or putStrLn (unpack text)
to print it out.
When you're dealing with files, this makes it more likely that you'll mess up lazy vs. strict semantics. You'd have to know that readFile
is lazy, and readFile'
, an alternative, is strict.
readFile :: FilePath -> IO String
readFile' :: FilePath -> IO String
But you don't need to resort to always using the String
type! All of our alternative types have their own IO functions. You just have to know which modules to use, and import them in a qualified way!
With Text
, there are separate IO
modules that offer these options for us:
import qualified Data.Text as T
import qualified Data.Text.IO as TI
import qualified Data.Text.Lazy as TL
Import qualified Data.Text.Lazy.IO as TLI
TI.putStrLn :: T.Text -> IO ()
TI.getLine :: IO T.Text
TLI.putStrLn :: TL.Text -> IO ()
TLI.getLine :: IO TL.Text
The modules also contain functions for reading and writing files as well, all with the appropriate Text
types.
With ByteString
types, these functions live in the primary module, and they aren't quite as exhaustive. There is putStr
, but not putStrLn
(I'm not sure why).
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
B.putStr :: B.ByteString -> IO ()
B.getLine :: IO B.ByteString
BL.putStr :: BL.ByteString -> IO ()
BL.getLine :: IO BL.ByteString
So, when you're program is using the more sophisticated string types, use the appropriate input and output mechanisms to spare yourself the inefficiencies (both performance-wise and in terms of code clarity) that come with using String
as a go-between!
If you've never really written a Haskell program with input and output before, you'll need to have some idea of how Stack works so you can get all the pieces working. Take our free Stack mini-course to learn how everything works, and stay tuned for more tips on Strings and other useful areas of Haskell!
Taking a Byte out of Strings
Earlier this week we learned about the Text
type, which is a more efficient alternative to String
. But there's one more set of string-y types we need to learn about, and these are "ByteStrings"!
The Text
types capture a unicode representation of character data. But ByteString
is more low-level, storing its information at the "byte" level. A normal string is a list of the Char
type, but the fundamental underlying data structure of the ByteString
is a list of Word8
- an 8-bit (1 byte) unsigned integer!
This means that in the normal ByteString
library, the pack
and unpack
functions are reserved for converting back and forth between [Word8]
and the ByteString
, rather than a raw String
:
pack :: [Word8] -> ByteString
unpack :: ByteString -> [Word8]
This can make it seem as though it would be quite difficult to construct ByteStrings.
>> import qualified Data.ByteString as B
>> let b = B.pack [72, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 33]
>> b
"Hello world!"
>> B.unpack b
[72, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 33]
There are two ways around this. Once again, we can use the "OverloadedStrings" extension:
>> :set -XOverloadedStrings
>> let b = "Hello world!" :: B.ByteString
>> :t b
b :: B.ByteString
There is also another module called Data.ByteString.Char8
. In this module, the pack
and unpack
functions are used for strings instead of Word8
chunks. Note though, that all characters you use will be truncated to 8 bits, so your results may not be correct, especially if you go far beyond the simple ASCII character set.
>> import qualified Data.ByteString.Char8 as BC
>> let b = BC.pack "Hello World!"
>> b
"Hello world!"
>> :t b
BC.ByteString
>> :t (BC.unpack b)
[Char]
Like Text
, there are both strict and lazy versions of ByteString
, which you convert using the same way.
>> import qualified Data.ByteString as B
>> import qualified Data.ByteString.Lazy as BL
>> :set -XOverloadedStrings
>> let b1 = "Hello" :: B.ByteString
>> let b2 = "World" :: BL.ByteString
>> let b3 = BL.fromStrict b1
>> let b4 = BL.toStrict b2
>> :t b3
BL.ByteString
>> :t b4
B.ByteString
The last item to make a note of is that we can convert directly between Text
and ByteString
. This is often desirable because of the transcription errors that can occur using String
as a go-between. It also allows strictness or laziness to be maintained, since the following functions live in both Data.Text.Encoding
and Data.Text.Lazy.Encoding
.
The trick is that we have to have some idea of what encoding we are using. Most often, this is UTF-8. In this case, we can use the following functions:
encodeUtf8 :: Text -> ByteString
decodeUtf8 :: ByteString -> Text
Let's see these sorts of functions in action:
>> import qualified Data.Text as T
>> import qualified Data.ByteString as B
>> import Data.Text.Encoding
>> let t = T.pack "Hello world"
>> let b = encodeUtf8 t
>> b
"Hello world"
>> :t b
B.ByteString
>> let t2 = decodeUtf8 b
>> t2
"Hello world"
>> :t t2
T.Text
Encoding a Text
as a ByteString
will always succeed. But decoding a ByteString
might fail. This is because the raw bytes someone uses might not be a representation of a valid set of characters. So decodeUtf8
can throw errors in special cases. If you're concerned about catching these, you can use one of the following functions:
decodeUtf8' :: ByteString -> Either UnicodeException Text
decodeUtf8With :: OnDecodeError -> ByteString -> Text
Other encodings exist besides UTF-8, but you also need to know if it is "big-endian" or "little-endian", indicating the ordering of the bytes in the underlying representation:
encodeUtf16LE :: Text -> ByteString
decodeUtf32BEWith :: OnDecodeError -> ByteString -> Text
It can be difficult to keep all these conversions straight. But here's the TLDR, with 4 different conversions to remember:
- String <-> Text - Use
Data.Text
, withpack
andunpack
. - String <-> ByteString - Use
Data.ByteString.Char8
, withpack
andunpack
- Text <-> ByteString - Use
Data.Text.Encoding
withencodeUtf8
anddecodeUtf8
- Strict and Lazy - Use the "Lazy" module (
Data.Text.Lazy
orData.ByteString.Lazy
) withfromStrict
andtoStrict
.
Like text
, bytestring
is a separate package, so you'll need to include in your project with either Stack or Cabal (or Nix). To learn how to use the Stack tool, sign up for our free Stack mini-course!
Con-Text-ualizing Strings
In the past couple weeks of string exploration, we've only consider the basic String
type, which is just an alias for a list of characters:
type String = [Char]
But it turns out that this representation has quite a few drawbacks. There are other string representations that are more compact and result in more efficient operations, which is paramount when you are parsing a large amount of data.
But since the type system plays such a strong role in Haskell, each of these different string representations must have its own type. Today we'll talk about Text
, which is one of these alternatives.
Rather than storing a raw list of the Char
type, a Text
object stores values as unicode characters. This allows many operations to be much faster.
But perhaps the first and most important thing to learn when you're a beginner is how to convert back and forth between a normal String
and Text
. This is done with the pack
and unpack
functions:
import Data.Text (Text, pack, unpack)
pack :: String -> Text
unpack :: Text -> String
Because the underlying representations are a bit different, not every String
can be converted into Text
in a comprehensible manner. But if you're sticking to the basic ASCII character set, everything will work fine.
>> let s = "ABCD" :: String
>> let t = pack s
>> t
"ABCD"
>> :t t
Text
>> unpack t
"ABCD"
>> :t (unpack t)
[Char]
The Text library has many different functions for manipulating Text
objects. For example, append
will combine two Text
items, and cons
will add a character to the front. Many of these overlap with Prelude functions, so it is usually best to import this module in a qualified way.
>> import qualified Data.Text as T
>> let t1 = T.pack "Hello"
>> let t2 = T.pack "World"
>> T.append t1 (T.cons ' ' t2)
"Hello World"
Naturally, Text
implements the IsString
class we talked about a little while ago. So if you enable OverloadedStrings
, you don't actually need to use pack
to initialize it! You can just use a string literal.
>> import qualified Data.Text as T
>> :set -XOverloadedStrings
>> let t = "Hello" :: T.Text
Now technically there are two different Text
types. So far, we've referred to "strict" Text
objects. These must store all of their data in memory at once. However, we can also have "lazy" Text
objects. These make use of Haskell's laziness mechanics so that you can stream data without having to store it all at once. All the operations are the same, they just come from the Data.Text.Lazy
module instead of Data.Text
!
>> import qualified Data.Text.Lazy as TL
>> let t1 = TL.pack "Hello"
>> let t2 = TL.pack "World"
>> TL.append t1 (TL.cons ' ' t2)
"Hello World"
There will often come times where you need to convert back and forth between these. The Lazy module contains functions for doing this.
>> import qualified Data.Text as T
>> import qualified Data.Text.Lazy as TL
>> let t1 = T.pack "Hello"
>> let t2 = TL.pack "World"
>> let t3 = TL.fromStrict t1
>> let t4 = TL.toStrict t2
>> :t t3
TL.Text
>> :t t4
T.Text
Because these types live in the text package, and this package is not included in base
, you'll need to know how dependencies work in use it in your projects. To learn more about this, take our free Stack mini-course! You'll learn how to make a project using Stack and add dependencies to it.
When Strings get Word-y
Earlier this week we explored the lines
and unlines
functions which allow us to split up strings based on where the newline characters are and then glue them back together if we want. But a lot of times the newline is the character you're worried about, it's actually just a normal space! How can we take a string that is a single line and then separate it?
Again, we have very simple functions to turn to: words
and unwords
:
words :: String -> [String]
unwords :: [String] -> String
These do exactly what you think they do. We can take a string with several words in it and break it up into a list that excludes all whitespace.
>> words "Hello there my friend"
["Hello", "there", "my", "friend"]
And then we can recombine that list into a single string using unwords
:
>> unwords ["Hello", "there", "my", "friend"]
"Hello there my friend"
This pattern is actually quite helpful when solving problems of the kind you'll see on Hackerrank. Your input will often have a particular format like, "the first line has 'n' and 'k' which are the number of cases and the case size, and then there are 'n' lines with 'k' elements in them." So this might look like:
5 3
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
Then words
is very useful for splitting each line. For example:
readFirstLine :: IO (Int, Int)
readFirstLine = do
firstNumbers <- words <$> getLine
case firstNumbers of
(n : k : _) -> return (n, k)
_ -> error "Invalid input!"
As with lines
and unlines
, these functions aren't inverses, properly speaking. With unwords
, the separator will always be a single space character. However, if there are multiple spaces in the original string, words
will work in the same way.
>> words "Hello there my friend"
["Hello", "there", "my", "friend"]
>> unwords (words "Hello there my friend")
"Hello there my friend"
In fact, words
will separate based on any whitespace characters, including tabs and even newlines!
>> words "Hello \t there \n my \n\t friend"
["Hello", "there", "my", "friend"]
So it could actually do the same job as lines
, but then reversing the process would require unlines
rather than `unwords.
Finally, it's interesting to note that unlines
and unwords
are both really just variations of intercalate
, which we learned about last month.
unlines :: [String] -> String
unlines = intercalate "\n"
unwords :: [String] -> String
unwords = intercalate " "
For more tips on string manipulation in Haskell, make sure to come back next week! We'll start exploring the different String types that exist in Haskell! If you're interested in problem solving in Haskell, you should try our free Recursion Workbook! It explains all the ins and outs of recursion and includes 10 different practice problems!
Line 'em Up!
Reading from files and writing to files is a very important job that you'll have to do in a lot of programs. So it's very much worth investing your time in learning functions that will streamline that as much as possible.
Enter lines
and unlines
. These two functions make it substantially easier for you to deal with the separate lines of a file (or any other string really). Fundamentally, these two functions are inverses of each other. The "lines" function will take a single string, and return a list of strings, while "unlines" will take a list of strings and return a single string.
lines :: String -> [String]
unlines :: [String] -> String
Our friend "lines" takes a string that has newline characters (or carriage returns if you're using Windows) and then break it up by the lines.
>> lines "Hello\nWorld\n!"
["Hello", "World", "!"]
Then "unlines" is able to take that list and turn it back into a single string, where the entries of the list are separated by newlines.
>> unlines ["Hello", "World", "!"]
"Hello\nWorld\n!\n"
One distinction you'll notice is that our final string from "unlines" has an extra newline character. So strictly speaking, these functions aren't total inverses (i.e. unlines . lines
/= id
). But they still essentially function in this way.
As mentioned above, these functions are extremely handy for file reading and writing. Suppose we are reading a file and want to break it into its input lines. A naive solution would be to produce a Handle
and then use iterated calls to hGetLine
. But this gets tedious and is also error prone when it comes to our end condition.
readFileLines :: FilePath -> IO [String]
readFileLines fp = do
handle <- openFile fp ReadMode
results <- readLinesHelper handle []
hClose handle
return results
where
readLinesHelper :: Handle -> [String] -> IO [String]
readLinesHelper handle prevs = do
isEnd <- hIsEOF handle
if isEnd
then return (reverse prevs)
else do
nextLine <- hGetLine handle
readLinesHelper handle (nextLine : prevs)
Boy that's a little messy…
And similarly, if we wanted to write each string in a list to a separate line in the file using hPutStrLn
, this is non-trivial.
writeFileLines :: [String] -> FilePath -> IO ()
writeFileLines strings fp = do
handle <- openFile fp WriteMode
mapM_ (hPutStrLn handle) strings
hClose handle
But these both get waaay easier if we think about using lines
and unlines
. We don't even have to think about the file handle! When reading a file, we just use lines
in conjunction with readFile
:
readLines :: FilePath -> IO [String]
readLines fp = lines <$> readFile fp
Then similarly, we use unlines
with writeFile
:
writeFileLines :: [String] -> FilePath -> IO ()
writeFileLines strings fp = writeFile fp (unlines strings)
And thus what were tricky implementations (especially for a beginner) are now made much easier!
Later this week we'll look at another set of functions that help us with this problem of splitting and fusing our strings. In the meantime, make sure to sign up for the Monday Morning Haskell newsletter if you haven't already! If you do, you'll get access to a lot of our great resources for beginning and intermediate Haskellers alike!
Classy Strings
In January we shared many different examples of helpful functions you can use with lists. In Februrary, we'll be focusing on Strings. Of course, Haskell's different string representations all use lists to some extent, so we've already seen several examples of list manipulation being useful with strings. But now we'll look more specifically at the different types that all us to describe string-like data.
Our first useful idea is the IsString class. This can apply to any type that we can derive from a String. The key function is fromString
.
class IsString a where
fromString :: String -> a
Making an instance of this type is easy enough as long as your data makes sense. Suppose we have a simple newtype
wrapper for a string.
newtype Person = Person String
Our IsString
instance is trivial:
instance IsString Person where
fromString = Person
But sometimes we might want something a little trickier. Suppose we want to give our person a first AND last name, given only a single string.
data Person = Person
{ firstName :: String
, lastName :: String
} deriving (Show)
Now our instance is non-trivial, but still simple if we remember our helpful list functions! We first take all the characters that are "not spaces" to get the first name. Then to get the last name, we drop these characters instead, and then drop all the spaces as well!
instance IsString Person where
fromString s = Person firstName lastName
where
firstName = takeWhile (not . isSpace) s
lastName = dropWhile isSpace (dropWhile (not . isSpace) s)
We can use this instance in the simple way, applying the fromString
function directly:
>> fromString "George Washington" :: Person
Person {firstName = "George", lastName = "Washington"}
But what's more interesting is that if we turn on the "Overloaded Strings" extension, we can use a string literal for this object in our code!
>> :set -XOverloadedStrings
>> let president = "George Washington" :: Person
This compiler extension is extremely useful once we get to other kinds of strings (Text
, ByteString
, etc.). But, combined with the IsString
class, it's a nice little quirk that can make our lives easier in certain circumstances, especially when we are running some kind of parsing program, or pasting in stringified data into our Haskell source when we don't want to bother reformatting it.
We'll be back next Monday with more string tricks! If you want to use tricks like these in a project, you need to learn how to use Stack first. To help with this, sign up for our free Stack mini-course!
To Infinity and Beyond!
We're at the end of January now, so this will be the last article on basic list utilities. For this last set of functions, we'll explore some items that are very helpful when you are using infinite lists in Haskell.
Infinite lists are kind of a cool construct because they only really exist due to Haskell's lazy evaluation mechanisms. Most other languages don't have them because eager evaluation would force you to use an infinite amount of space! But in Haskell, you only need to allocate space for the items in the list that actually get evaluated.
The two most basic functions for creating an infinite list are "repeat" and "cycle". The first makes an infinite number of the same element, while the second allows you to cycle through a specific series of elements.
repeat :: a -> [a]
cycle :: [a] -> [a]
But how exactly are these useful? For example, simply by trying to print them we'll create a mess of output that will make us force quit the program:
>> repeat 3
[3, 3, 3, 3, 3, ...
>> cycle [1, 2, 3, 4]
[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, ...
Well one answer is to use take
, which we learned about last time. We can "take" a specific number of a single element.
>> take 5 (repeat 3)
[3, 3, 3, 3, 3]
Infinite lists are also great in conjunction with zip
, because this "shortens" the result to the size of the finite list. Suppose we want to match up a list with the index modulo 4:
>> let l1 = [5, 3, 1, -4, 6, 20]
>> zip l1 (cycle [0, 1, 2, 3])
[(5, 0), (3, 1) ,(1, 2) ,(-4, 3) ,(6, 0) ,(20, 1)]
One more way to generate an infinite list is to use iterate
. This allows us to continually apply a function against a starting value. We start with a
, and then the second element will apply the function once as f a
, and then the next value will be f (f a)
, and so on.
iterate :: (a -> a) -> a -> [a]
One use case for this might be to calculate the value of an investment using compound interest. Let's say you start with $10,000 and you get 5% interest every year. How much will you have after 5 years?
>> take 6 (iterate (* 1.05) 10000.0)
[10000.0, 10500.0, 11025.0, 11576.25, 12155.06, 12762.82]
You can also use fairly simple addition functions with iterate
. But you should also know that in most cases you can represent such an infinite list with a range!
>> take 5 (iterate (+3) 1)
[1, 4, 7, 10, 13]
>> take 5 [1,4..]
[1, 4, 7, 10, 13]
That's all for our exploration of basic list functions! We'll have a new topic next month, so subscribe to our monthly newsletter to stay on top of what we're studying! You can also sign up by downloading our free Beginners Checklist!