Finding What You Seek
In our last couple of articles, we've gone through the basics of how to use Handles. This useful abstraction not only lets us access files in a stateful way, but also to treat terminal streams (standard in, standard out) in the same way we treat files. This week we'll learn a few tricks that are a little more specific to handles on files. File handles are seekable, meaning we can move around where we are "pointing" to in the file, similar to moving the position of a video recording.
To understand how this works, we should first make a note, if it isn't clear already, that a Handle
is a stateful object. The handle points to the file, but it also tracks information about where it is in the file. For example, let's define a file:
First Line
Second Line
Third Line
Fourth Line
...
We can have two different functions that will print from a handle. One will print a single line, the other will print two lines.
printOneLine :: Handle -> IO ()
printOneLine h = hGetLine h >>= putStrLn
printTwoLines :: Handle -> IO ()
printTwoLines h = do
hGetLine h >>= putStrLn
hGetLine h >>= putStrLn
If we call these back to back on our file, it will print all three lines of the file, rather than re-printing the first line.
main :: IO ()
main = do
h <- openFile "testfile.txt" ReadMode
printOneLine h
printTwoLines h
hClose h
...
>> stack exec io-program
First Line
Second Line
Third Line
This is because the state of h
carries over after the first call. So the handle remembers that it is now pointing at the second line.
Now you might wonder, if this computation is stateful, why doesn't it use the State
monad? It turns out the IO
monad already is its own "state" monad. However, the "state" in this case is the state of the whole operating system! Or we can even think of IO as tracking the state of "the whole outside world". This is why IO is so impure, because the "state of the whole outside world" changes for every single call!
We can illustrate most plainly how the state has changed by printing the position of the handle. This is accessible through the function hGetPosn
, which gives us an item of type HandlePosn
. We can also use hTell
to give us this value as an integer.
hGetPosn :: Handle -> IO HandlePosn
hTell :: Handle -> IO Integer
Let's see the position at different points in our program.
main :: IO ()
main = do
h <- openFile "testfile.txt" ReadMode
hGetPosn h >>= print
hTell h >>= print
printOneLine h
hGetPosn h >>= print
hTell h >>= print
printTwoLines h
hGetPosn h >>= print
hTell h >>= print
hClose h
...
>> stack exec io-program
{handle: testfile.txt} at position 0
0
First Line
{handle: testfile.txt} at position 11
11
Second Line
Third Line
{handle: testfile.txt} at position 34
34
We can manipulate this position in a number of different ways, but they all depend on the file being seekable. By and large, read and write file handles are seekable, while the terminal handles are not. As we'll see, "append" handles are also not seekable.
hIsSeekable :: IO Bool
main :: IO ()
main = do
hIsSeekable stdin >>= print
hIsSeekable stdout >>= print
h <- openFile "testfile.txt" ReadMode
hIsSeekable h >>= print
hClose h
...
>> stack exec io-program
False
False
True
Note: you'll get an error if you even query a closed handle for whether or not it is seekable.
So how do we change the position? The first way is through hSetPosn
.
hSetPosn :: HandlePosn -> IO ()
This lets us go back to a previous position we saw. So in this example, we'll read one line and save that position. Then we'll read two more lines, go back to the previous position, and read one line again. Because the HandlePosn
object relates both to the numeric position AND the specific handle, we don't need to specify the Handle
again in the function call.
main :: IO ()
main = do
h <- openFile "testfile.txt" ReadMode
printOneLine h
p <- hGetPosn h
printTwoLines h
hSetPosn p
printOneLine h
hClose h
...
>> stack exec io-program
First Line
Second Line
Third Line
Second Line
We can do various tricks with hSeek
, which takes the handle and an integer position. It also takes a SeekMode
. This tells us if the integer refers to an "absolute" position in the file, a position "relative" to the current position, or even a position relative to the end.
data SeekMode =
AbsoluteSeek |
RelativeSeek |
SeekFromEnd
hSeek :: Handle -> SeekMode -> Integer -> IO ()
In this example we'll read the first line, advance the seek position a few characters (which will cut off what we see of the second line), and then go back to the start.
main :: IO ()
main = do
h <- openFile "testfile.txt" ReadMode
printOneLine h
hSeek h RelativeSeek 4
printTwoLines h
hSeek h AbsoluteSeek 0
printOneLine h
hClose h
>> stack exec io-program
First Line
nd Line
Third Line
Second Line
We can also seek when writing to a file. As always with WriteMode
, there's a gotcha. In this example, we'll write our first line, go back to the start, write another line, and then write a final line at the end.
main :: IO ()
main = do
h <- openFile "testfile.txt" WriteMode
hPutStrLn h "The First Line"
hSeek h AbsoluteSeek 0
hPutStrLn h "Second Line"
hSeek h SeekFromEnd 0
hPutStrLn h "Third Line"
hClose h
The result file is a little confusing though!
Second Line
ne
Third Line
We overwrote most of the first line we wrote, instead of appending at the beginning! All that's left of "The First Line" is the "ne" and newline character!
We might hope to fix this by using AppendMode
, but it doesn't work! This mode makes the assumption that you are only writing new information to the end of a file. Therefore, append handles are not seekable.
If you're just writing application level code, you probably don't need to worry too often about these subtleties. But if you have any desire to write a low-level library, you'll need to know about all these specific mechanics! Stay tuned for more IO-related content in the coming weeks. If you want to stay up to date, make sure to subscribe to our monthly newsletter! You'll get access to our subscriber resources, which includes a lot of great beginner materials!
Handling Files more Easily
Earlier this week we learned about the Handle
abstraction which helps us to deal with files in Haskell. An important part of this abstraction is that handles are either "open" or "closed". Today we'll go over a couple ideas to help us deal with opening and closing handles more gracefully.
When we first get a handle, it is "open". Once we're done with it, we can (and should) "close" it so that other parts of our program can use it safely. We close a handle with the hClose
function:
hClose :: Handle -> IO ()
Most of the time I use files, I find myself opening and closing the handle in the same function. So for example, if we're reading a file and getting the first 4 lines:
read4Lines :: FilePath -> IO [String]
read4Lines fp = do
handle <- openFile fp ReadMode
myLines <- lines <$> readFile handle
let result = take 4 myLines
hClose handle
return myLines
A recommendation I would give when you are writing a function like this is to write the code for the hClose
call immediately after you write the code of openFile
. So your might start like this:
read4Lines :: FilePath -> IO [String]
read4Lines fp = do
handle <- openFile fp ReadMode
-- Handle logic
hClose handle
-- Return result
And only after writing these two lines should you write the core logic of the function and the final return statement. This is an effective way to make sure you don't forget to close your handles.
In the strictest sense though, this isn't even fool proof. If you cause some kind of exception while performing your file operations, an exception will be thrown before your program closes the handle. The technically correct way out of this is to use the bracket
pattern. The bracket
function allows you to specify an action that will take place after the main action is done, no matter if an exception is thrown. This is like using try/catch/finally
in Java or try/except/finally
in Python. The finally
action is the second input to bracket
, while our main action is the final argument.
bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
If we specialize this signature to our specific use case, it might look like this:
bracket :: IO Handle -> (Handle -> IO ()) -> (Handle -> IO [String]) -> IO [String]
And we can then write our function above like this:
read4LinesHandle :: Handle -> IO [String]
read4LinesHandle handle = do
myLines <- lines <$> readFile handle
let result = take 4 myLines
return result
read4Lines :: FilePath -> IO [String]
read4Lines fp = bracket (openFile fp) hClose read4LinesHandle
Now our handle gets closed even if we encounter an error while reading.
Now, this pattern (open a file, perform a handle operation, close the handle) is so common with file handles specifically that there's a special function for it: withFile
:
withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r
This makes our lives a little simpler in the example above:
read4LinesHandle :: Handle -> IO [String]
read4LinesHandle handle = ...
read4Lines :: FilePath -> IO [String]
read4Lines fp = withFile fp ReadMode read4LinesHandle
If you're ever in doubt about whether a Handle
is open or not, you can also check this very easily. There are a couple boolean functions to help you out:
hIsOpen :: Handle -> IO Bool
hIsClosed :: Handle -> IO Bool
Hopefully this gives you more confidence in the proper way to deal with file handles! We'll be back next week with some more tricks you can do with these objects. In the meantime, you can subscribe to our monthly newsletter! This will keep you up to date with our latest articles and give you access to our subscriber resources!
Getting a Handle on IO
Welcome to May! This month is "All About IO". We'll be discussing many of the different useful types and functions related to our program's input and output. Many of these will live in the System.IO library module, so bookmark that if you want to demystify how IO works in Haskell!
The first concept you should get a grasp on if you want to do anything non-trivial with IO in Haskell is the idea of a Handle
. You can think of a handle as a pointer to a file. We can use this pointer to read from the file or write to the file. The first interaction you'll have with a handle is when you generate it with openFile
.
data Handle
openFile :: FilePath -> IOMode -> IO Handle
The first argument here is the FilePath
, which is just a type alias for a plain old string. The second argument tells us how we are interacting with the file. There are four different modes of interacting with a file:
data IOMode =
ReadMode |
WriteMode |
AppendMode |
ReadWriteMode
Each one allows a different set of IO operations, and these are mostly intuitive. With ReadMode
, we can read lines from the handle we receive, but we can't edit the file. With AppendMode
, we can write new lines to the end of the file, but we can't read from it. In order to do both kinds of operations, we need ReadWriteMode
.
As an important note, WriteMode
is the most dangerous! This mode only allows writing. It is impossible to read from the file handle. This is because opening a file in WriteMode
will erase its existing contents. At first glance it's easy to think that WriteMode
will allow you to just write to the end of the file, adding to its contents. But this is the job of AppendMode
! Note however that both these modes will create the file if it does not already exist.
Here's an example of some simple interactions with files:
readFirstLine :: FilePath -> IO String
readFirstLine fp = do
handle <- openFile fp ReadMode
let firstLine = hGetLine handle
hClose handle
return firstLine
writeSingleLine :: FilePath -> String -> IO ()
writeSingleLine fp newLine = do
-- Create file if it doesn't exist, overwrite its contents if it does!
handle <- openFile fp WriteMode
hPutStrLn handle newLine
hClose handle
addNewLine :: FilePath -> String -> IO ()
addNewLine fp newLine = do
handle <- openFile fp AppendMode
hPutStrLn handle newLine
hClose handle
A few notes. All these functions use hClose
when they're done with the handle. This is an important way of letting the file system know we are done with this file.
hClose :: Handle -> IO ()
If we don't close our handles, we might end up with conflicts. Multiple different handles can exist at the same time for reading a file. But only a single handle can existing for writing to a file at any given time. And if we have a write-capable handle (anything other than ReadMode
), we can't have other ReadMode
handles to that file. So if we write a file but don't close it's handle, we won't be able to read from that file later!
A couple of the functions we wrote above might sound familiar to the most basic IO functions. The first functions you learned in Haskell were probably print
, putStrLn
, and getLine
:
print :: (Show a) => a -> IO ()
putStrLn :: String -> IO ()
getLine :: IO String
The first two will output text to the console, the third will pause and let the user enter a line on the console. Above, we used these two functions:
hPutStrLn :: Handle -> String -> IO ()
hGetLine :: Handle -> IO String
These functions work exactly the same, except they are dealing with a file, so they have the extra Handle
argument.
The neat thing is that interacting with the console uses the same Handle
abstraction! When your Haskell program starts, you already have access to the following open file handles:
stdin :: Handle
stdout :: Handle
stderr :: Handle
So the basic functions are simply defined in terms of the Handle
functions like so:
putStrLn = hPutStrLn stdout
getLine = hGetLine stdin
This fact allows you to write a program that can work either with predefined files as the input and output channels, or the standard handles. This is amazingly useful for writing unit tests.
echoProgram :: (Handle, Handle) -> IO ()
echoProgram (inHandle, outHandle) = do
inputLine <- hGetLine inHandle
hPutStrLn outHandle inputLine
main :: IO ()
main = echoProgram (stdin, stdout)
testMain :: IO ()
testMain = do
input <- openFile "test_input.txt" ReadMode
output <- openFile "test_output.txt" WriteMode
echoProgram (input, output)
hClose input
hClose output
-- Assert that "test_output.txt" contains the expected line.
...
That's all for our first step into the world of IO
. For the rest of this month, we'll be looking at other useful functions. For now, make sure you subscribe to our monthly newsletter so you get a summary of anything you might have missed. You'll also get access to our subscriber resources, which can really help you kickstart your Haskell journey!
Traverse: Fully Generalized Loops
Last time around, we discussed mapM
and sequence
, which allow us to run loop-like activities in Haskell while also incorporating monadic effects. Today for our last article of for-loops month, we'll look at the final generalization of this idea: the traverse
function.
To understand traverse
, it helps to recall the ideas behind fmap
. When we use fmap
, we can take any Functor
structure and transform all the underlying elements of that functor, returning a new object with the exact same structure, but different elements. The new elements might be of the same type or they can be entirely different.
fmap :: (Functor f) => (a -> b) -> f a -> f b
We can apply this idea over many different data structures in Haskell. However, it is a pure function. If the operation we're attempting requires a monadic effect, we won't be able to use fmap
. For an example, consider having a list of strings which represent people's names. These names correspond to objects of type Person
in a database, and we would like to take our list and look up all the people. Here's a naive C++ outline of this:
class Person;
Person lookupPersonByName(const std::string& name) {
// Database call
...
}
std::vector<Person> lookupAllNames(const std::vector& names) {
std::vector<Person> results;
for (const auto& name : names) {
results.push_back(lookupPersonByName(name));
}
return results;
}
In C++, this function can just be a normal for loop (though we would want to parallelize in a production setting). In Haskell though, the lookupAllNames
function would need to be an IO-like function.
data Person = ...
lookupPersonByName :: String -> IO Person
...
This means we can't use fmap
. Now, mapM
from the last article is a viable option here. But it's important to also consider its generalization, found in the Traversable
class:
class (Functor t, Foldable t) => Traversable t where
traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
Let's break this down. The traverse
function has two inputs:
- A function transforming an object using an effect (Applicative or Monadic)
- A container of that object. (The container is the traversable class)
The result of this is a new container which has applied the transformation to the input elements, occurring within the effect class. So for our database example, we might have a list of names, and we transform them all:
data Person = ...
lookupPersonByName :: String -> IO Person
lookupAllNames :: [String] -> IO [Person]
lookupAllNames = traverse lookupPersonByName
Since any "foldable functor" will do though, we can also apply a traversal over a Maybe String
, an Either String
object, or a Map
of strings, for example. All these calls will occur in the IO
monad.
lookupAllNames :: (Foldable t, Functor t) => t String -> IO (t Person)
lookupAllNames = traverse lookupPersonByName
...
>> :t (lookupAllNames (Just "Joseph"))
IO (Maybe Person)
>> :t (lookupAllNames (Right "Joseph"))
IO (Either a Person)
>> :t (lookupAllNames (Map.fromList [(1, "Joseph"), (2, "Christina")]))
IO (Map.Map Int Person)
A big advantage of this approach over C++ is that we can use Haskell's monadic behavior to easily determine the correct side effect when an operation fails. In our example, by wrapping the calls in IO
we ensure that the user is aware that an IO
error could occur that they might need to catch. But we could also improve the monad to make the type of error more clear:
data DatabaseError
lookupPersonByName :: String -> ExceptT DatabaseError IO Person
lookupAllNames :: (Foldable t, Functor t) => t String -> ExceptT DatabaseError IO (t Person)
In this particular case, we would short circuit the operation when an error is encountered. In C++, you would probably want to have lookupPersonByName
return a type like StatusOr<Person>
. Combining these Status
objects appropriately might be a bit tricky. So it's nice that monads do this for us automatically in Haskell.
The last thing I'll note is that in Data.Traversable
we finally have a function defined as the word for
! Similar to mapM
and forM
, this function is just a flipped version of traverse
:
for :: (Traversable t, Applicative f) => t a -> (a -> f b) -> f (t b)
lookupPersonByName :: String -> IO Person
lookupAllNames :: [String] -> IO [Person]
lookupAllNames inputs = for inputs lookupPersonByName
So now when someone says "Haskell doesn't have for loops", you know the proper reply is "yes it does, here is the 'for' function!".
For these two months now, we've explored a bit about monads, and we've explored different kinds of loops. In both these cases, IO
presents some interesting challenges. So next month is going to be all about IO! So make sure you keep coming back Mondays and Thursdays, and subscribe to our monthly newsletter so you'll get a summary in case you miss something!
Effectful Loops: Sequence and MapM
We've covered a lot of different ways to run loop behavior in Haskell, but all of them operate in a "pure" way. That is, none of them use monadic behavior. While folding and scanning provide us with a basic mechanism for tracking stateful computations, you sometimes have more complicated problems where the stateful object has more layers. And none of the functions we've seen so far allow IO activity, like printing to the console or accessing the file system.
So to motivate this example, let's imagine we're parsing several different files, each containing a set of names. We would like to read each of these files and create a combined list. Here's what this might look like in C++ with a for-loop:
std::vector<std::string> readSingleFile(std::string filename) { ... }
std::vector<std::string> readNamesFromFiles(std::vector<std::string> filenames) {
std::vector<std::string> results;
for (const auto& file : filenames) {
auto namesFromThisFile = readSingleFile(file);
// std::copy also works well
for (const auto& name : namesFromThisFile) {
results.push_back(name);
}
}
return results;
}
This looks a lot like a map
problem in Haskell (or really, concatMap
). However, we have a problem. The function we would like to map
must be an IO function!
readSingleFile :: FilePath -> IO [String]
readSingleFile fp = lines <$> readFile fp
This means that if we try to use map
with this function and a list of FilePaths
, we won't get a list of lists that we can immediately concat
. And in fact, we won't even get an IO
object containing the list of lists! The type will actually be [IO [String]]
, a list of IO actions which each return a list of strings!
>> let filepaths = [...]
>> let results = map readSingleFile filePaths
>> :t results
[IO [String]]
By itself, this doesn't seem to help us achieve our goal. But there are a couple helpers that can get the rest of the way. One function is sequence
. This takes a list of monadic actions and runs them back to back, collecting the results! This function generalizes beyond lists, but we'll just think about the type signature using lists for right now.
sequence :: (Monad m) => [m a] -> m [a]
If we imagine our list of monadic actions, this function essentially acts as though it is running them all together in do
syntax and returning a list of the results.
sequence :: (Monad m) => [m a] -> m [a]
sequence [action1, action2, action3, ...] = do
result1 <- action1
result2 <- action2
result3 <- action3
...
return [result1, result2, result3, ...]
So we could apply this function against our previous output, and then concat
the results within the IO
object using fmap
.
readSingleFile :: FilePath -> IO [String]
readNamesFromFiles :: [FilePath] -> IO [String]
readNamesFromFiles files =
(fmap concat) $ sequence (map readSingleFile files)
But there's an even simpler way to do this! There's also the function mapM
, which essentially combines map
and sequence
:
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
It's type signature is like the original map
, but instead it takes a function in a monadic action and produces a single monadic action. This allows us to simplify our solution:
readSingleFile :: FilePath -> IO [String]
readNamesFromFiles :: [FilePath] -> IO [String]
readNamesFromFiles files =
(fmap concat) $ mapM readSingleFile files
Since mapping works so much like a for-loop, there is even the function forM
. This is the same as mapM
, except that its arguments are reversed, so the list comes first.
forM :: (Monad m) => [a] -> (a -> m b) -> m [b]
Each of these functions also has an equivalent underscored function, which discards the result. These can be useful when you're only interested in the monadic side effect of the function, rather than the return value.
sequence_ :: (Monad m) => [m a] -> m ()
mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()
forM_ :: (Monad m) => [a] -> (a -> m b) -> m ()
Here's an example where we'll want that. Suppose our files might contain duplicated names, and we want to discard duplicates. This means we'll want a Set
instead of a list. Our C++ code doesn't change much:
std::vector<std::string> readSingleFile(std::string filename) { ... }
std::set<std::string> readNamesFromFiles(std::vector<std::string> filenames) {
std::set<std::string> results;
for (const auto& file : filenames) {
auto namesFromThisFile = readSingleFile(file);
for (const auto& name : namesFromThisFile) {
results.insert(name);
}
}
return results;
}
We can change things up in our Haskell code though! Instead of post-processing the list of lists afterward, we'll keep track of the growing set using the State
monad!
readSingleFile :: FilePath -> StateT (Set.Set String) IO ()
readSingleFile fp = do
names <- lift (lines <$> readFile fp)
prevSet <- get
put $ foldr Set.insert prevSet names
readNamesFromFiles :: [FilePath] -> IO [String]
readNamesFromFiles filenames = Set.toList <$> execStateT
(mapM_ readSingleFile filenames) Set.empty
Notice how this uses the execStateT
shortcut we talked about last month! Generally speaking, combining the State
monad and mapM
provides a fully generalizable way to write for-loop code. It allows you to incorporate IO
as needed, and it allows you to track any kind of state you would like. Sometimes though, it'll be much more convenient to use traverse
, which we'll talk about in our final article this month!
If you're enjoying learning about for loops in Haskell, you should sign up for our monthly newsletter! This will keep you up to date with any new content that comes out and you'll get access to our subscriber resources!
What about While Loops? Unfolding in Haskell
So far this month, we've been focusing on "for" loops. All of our functions have taken a list as an input and produced either a single value or a list of accumulated values matching the length of the list. But sometimes we actually want to do the opposite! We want to take a single seed value and produce a list of values! This is more in keeping with the behavior of a "while" loop, though it's also possible to do this as a for-loop.
Recall that "fold" is our basic tool for producing a single value from a list. Now when we do the opposite, the concept is calling "unfolding"! The key function here is unfoldr
:
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
It takes a single input value and produces a list of items in a resulting type. The function we pass will take our input type and produce a new result value as well as a new input value! This new input value gets passed again to the function on the next iteration. It continues until our function produces Nothing
.
unfoldr :: (input -> Maybe (result, input)) -> input -> [result]
Here's a C++ example. Suppose we're trying to produce a binary representation of a number and we're OK with this representation being variable length. Here's how we might do this with a while
loop.
enum BitValue { ONE, ZERO };
// This representation is a little odd in that we use an empty list to
// represent '0'
std::vector<BitValue> produceBits(uint64_t input) {
std::vector result;
while (input > 0) {
if (input % 2 == 0) {
result.push_back(ZERO);
} else {
result.push_back(ONE);
}
input /= 2;
}
std::reverse(result.begin(), result.end());
return result;
}
So each time through the loop we produce a new bit depending on the "input" value status. Once we hit 0, we're done.
How can we do this in Haskell with unfoldr
? Well first let's write a function to do the "unfolding". This follows the same logic that's inside the loop:
produceSingleBit :: Word -> Maybe (Bit, Word)
produceSingleBit 0 = Nothing
produceSingleBit x = if x `mod` 2 == 0
then Just (Zero, x `quot` 2)
else Just (One, x `quot` 2)
And now to complete the function, it's a simple application of unfoldr
!
produceBits :: Word -> [Bit]
produceBits x = reverse (unfoldr produceSingleBit x)
...
>> produceBits 4
[One, Zero, Zero]
>> produceBits 3
[One, One]
>> produceBits 9
[One, Zero, Zero, One]
We can also implement a fibonacci function using unfold. Our "unfolding" function just needs one of its inputs to act as a counter. We provide this counter and the initial values 0 and 1 as the seed values, and it will keep counting down. This will provide us with the complete list of fibonacci numbers up to the given input index.
fib :: Word -> [Word]
fib x = unfoldr unfoldFib (x, 0, 1)
where
unfoldFib (count, a b) = if count == 0
then Nothing
else Just (b, (count - 1, b, a + b))
...
>> fib 1
[1]
>> fib 2
[1, 1]
>> fib 5
[1, 1, 2, 3, 5]
It might seem a little unnatural, but there are lots of opportunities to incorporate unfold
into your Haskell code! Just keep a lookout for these "while loop" kinds of problems. To stay updated with all the latest on Monday Morning Haskell, make sure to subscribe to our newsletter! You can also follow my streaming schedule on Twitch, which I'll also post on Twitter!
Combining Ideas: mapAccum
In the last couple weeks, we've learned about folding and scanning, which are valuable tools for replacing conventional for loops in our Haskell code. Today we'll go over a lesser known idea that sort of combines folding and scanning (or just folding and mapping). The particular function we'll look at is mapAccumL
. For our motivating example, let's consider a parsing program.
Our input is a set of strings giving a series of mathematical operations.
Add 5
Multiply 3
Add 9
Divide 8
Subtract 2
We want to know the final "value" of this file, (in this case it's 6) but we also want to keep track of the operations we used to get there. Here's a C++ outline:
enum OpType { ADD, SUB, MUL, DIV };
struct Operation {
OpType opType;
double value;
};
Operation parseOperation(const std::string& inputLine) { ... }
double processOperation(double currentValue, Operation op) {
switch (op.opType) {
case ADD: return currentValue + op.value;
case SUB: return currentValue - op.value;
case MUL: return currentValue * op.value;
case DIV: return currentValue / op.value;
}
}
Given the list of input lines, there are two approaches we could use to get the desired outputs. We can process the lines first and then do the calculation. This results in two loops like so:
std::pair<double, std::vector<Operation>> processLines(const std::vector<std::string>& lines) {
std::vector<Operation> operations;
for (const auto& line : lines) {
operations.push_back(parseOperation(line));
}
double result = 0.0;
for (auto& operation : operations) {
result = processOperation(result, operation);
}
return {result, operations);
}
But we can also write it with a single for loop like so:
std::pair<double, std::vector<Operation>> processLines(const std::vector<std::string>& lines) {
std::vector<Operation> operations;
double result = 0.0;
for (const auto& line : lines) {
Operation newOp = parseOperation(line);
operations.push_back(newOp);
result = processOperation(result, newOp);
}
return {result, operations);
}
This for loop essentially performs multiple different actions for us at the same time. It appends to our list, AND it updates our result value. So it's acting like a map and a fold simultaneously.
In Haskell, most of the functional loop replacements really only perform a single action, so it's not necessarily clear how to do "double duty" like this. Here's a simple Haskell approach that splits the work in two pieces:
data OpType = Add | Sub | Mul | Div
data Operation = Operation OpType Double
parseOperation :: String -> Operation
processOperation :: Double -> Operation -> Double
processLines :: [String] -> (Double, [Operation])
processLines lines = (ops, result)
where
ops = map parseOperation lines
result = foldl processOperation 0.0 ops
It turns out though, we have a function that can combine these steps! This function is mapAccumL
.
mapAccumL :: (a -> b -> (a, c)) -> a -> [b] -> (a, [c])
Once again, I'll change up this type signature to assign a semantic value to each of these types. As always, the type b
that lives in our input list is our item
type. The type a
is still our primary result
, but now I'll assign c
as the accum
type.
mapAccumL :: (result -> item -> (result, accum)) -> result -> [item] -> (result, [accum])
It should be obvious how we can rewrite our Haskell expression now using this function:
processLines :: [String] -> (Double, [Operation])
processLines lines = mapAccumL processSingleLine 0.0 lines
where
processSingleLine result line =
let newOp = parseOperation line
in (processOperation result newOp, newOp)
Now, our original implementation is still perfectly fine and clean! And we could also do this using foldl
as well. But it's good to know that Haskell has more functions out there that can do more complicated types of loops than just simple maps and folds.
If you want to see me writing some Haskell code, including many many for loops, tune into my Twitch Stream every Monday evening! Start times will be announced via Twitter. You can also subscribe to our newsletter to learn more and stay up to date!
Using Scans to Accumulate
In last week's article we talked about how fold is one of the fundamental for-loop functions you'll want to use in Haskell. It relies on a folding function, which incorporates each new list item into a result value. There's one potential drawback though: you won't get any intermediate results, at least without some effort. Sometimes, you want to know what the "result" value was each step of the way.
This is where "scan" comes in. Let's start with a C++ example for accumulated sums:
std::vector<int> addWithSums(const std::vector<int>& inputs) {
std::vector<int> results = {0};
int total = 0;
for (int i = 0; i < inputs.size(); ++i) {
total += inputs[i];
results.push_back(total);
}
return results;
}
Let's consider a simple folding sum solution in Haskell:
sum :: [Int] -> Int
sum = foldl (+) 0
We could adapt this solution to give intermediate results. But it would be a little bit tricky. Instead of using (+)
by itself as our folding function, we have to make a custom function that will store the list of accumulated values. In order to make it efficient, we'll also have to accumulate it in reverse and add an extra step at the end.
accumulatedSum :: [Int] -> [Int]
accumulatedSum inputs = reverse (foldl foldingFunc [0] inputs)
where
foldingFunc :: [Int] -> Int -> [Int]
foldingFunc prev x = x + head prev : prev
However, we can instead perform this job with the idea of a "scan". There are scan functions corresponding to the fold functions, so we have scanl
, scanr
, and scanl'
.
scanl :: (b -> a -> b) -> b -> [a] -> [b]
scanl' :: (b -> a -> b) -> b -> [a] -> [b]
scanr :: (a -> b -> b) -> b -> [a] -> [b]
Let's focus on scanl
. Once again, I'll re-write the type signatures to be more clear with "items" and "results".
scanl :: (result -> item -> result) -> result -> [item] -> [result]
This is almost identical to foldl
, except that the final result is a list of results, rather than a single result. And it does exactly what you would expect! Each time it calculates a result value in the folding function (scanning function?) it will include this in a list at the end. This makes it much easier for us to write our accumulated sum function!
accumulatedSum :: [Int] -> [Int]
accumulatedSum inputs = scanl (+) 0 inputs
...
>> scanl (+) 0 [5, 3, 8, 11]
[0, 5, 8, 16, 27]
As a curiosity, you can use this pattern to provide an infinite list of all the triangle numbers:
triangleNumbers :: [Int]
triangleNumbers = scanl' (+) 0 [1..]
…
>> triangleNumbers !! 4
10
>> triangleNumbers !! 7
28
Next week we'll be back with more loop alternatives! In the meantime, you should subscribe to our newsletter so you can stay up to date with the latest news!
Two for One: Using concatMap
Today's for-loop replacement is a simpler one that combines two functions we should already know! We'll see how we can use concatMap
to cover some of the basic loop cases we might encounter in other languages. This function covers the case of "every item in my list produces multiple results, and I want these results in a single new list." Let's write some C++ code that demonstrates this idea. We'll begin with a basic function that takes a single (unsigned) integer and produces a list of unsigned integers.
std::vector<uint64_t> makeBoundaries(uint_64t input) {
if (input == 0) {
return {0, 1, 2);
} else if (input == 1) {
return {0, 1, 2, 3};
} else {
return {input - 2, input - 1, input, input + 1, input + 2)
}
}
This function gives the two numbers above and below our input, with a floor of 0 since the value is unsigned. Now let's suppose we want to take the boundaries of each integer in a vector of inputs, and place them all in a single list. We might end up with something like this:
std::vector<uint64_t> makeAllBoundaries(std::vector<uint64_t> inputs) {
std::vector<uint64_t> results;
for (uint64_t i : inputs) {
std::vector<uint64_t> outputs = makeBoundaries(i);
for (uint64_t o : outputs) {
results.push_back(o);
}
}
return results;
}
Here we end up with nested for loops in the same function! We can't avoid this behavior occurring. But we can avoid needing to write this into our source code in Haskell with the concatMap
function:
concatMap :: (a -> [b]) -> [a] -> [b]
As the name implies, this is a combination of the map
function we've already seen with an extra concatenation step. Instead of mapping a function that transforms a single item to a single item, the function now produces a list of items. But this function's "concat" step will append all the result lists for us. Here's how we could write this code in Haskell:
makeBoundaries :: Word -> [Word]
makeBoundaries 0 = [0, 1, 2]
makeBoundaries 1 = [0, 1, 2, 3]
makeBoundaries i = [i - 2, i - 1, i, i + 1, i + 2]
makeAllBoundaries :: [Word] -> [Word]
makeAllBoundaries inputs = concatMap makeBoundaries inputs
Nice and simple! Nothing in our code really looks "iterative". We're just mapping a single function over our input, with a little bit of extra magic to bring the results together. Under the hood, this function combines the existing "concat" and "map" functions, which use recursion. Ultimately, most Haskell for-loop replacements rely on recursion to create their "iterative" behavior. But it's nice that we don't always have to bring that pattern into our own code.
If you want to stay up to date with Haskell tips and tricks, make sure to subscribe to our monthly newsletter! We'll have some special offers coming up soon, so you won't want to miss them!
Try Fold First!
Last time we talked about map
and filter
, which allow us to do the work of certain, very simple for-loops in other languages. However, they have a major limitation in that they don't let us carry any kind of state from element to element. In order to get this, we'll need the idea of a Fold.
Folds are the most important idea I will talk about this month. When it pops into your head to use a for-loop, the first one you should try to use to represent that idea in Haskell is a fold. I used them constantly in my Advent of Code run last year. Let's start with a simple example of a for-loop in C++.
struct Point {
int x;
int y;
};
int myFunc(const std::vector<Point>& myPoints) {
int result = 0;
for (const auto& point : myPoints) {
if (result % 2 == 0) {
result += point.x;
} else {
result += point.y
}
}
return result;
}
Map and filter are not sufficient to replicate this in Haskell. We cannot filter myPoints
for those points where we add x
or y
because we need the accumulated results to determine which way each of them goes. We cannot merely map
a function over our list because, again, we lack the stateful information to figure out what information we need from each point.
But we can see the pattern developing.
We have our result variable, set to an initial value. Each element in our list affects the result in some way (based on the previous result). At the end we get our final result.
These element are all present if we look at the type signature for foldl
, the most basic folding function in Haskell.
foldl :: (a -> b -> a) -> a -> [b] -> a
Notice that a
is our "result" type, whereas b
is the type of the items in our list. Things become more clear if we do this renaming:
foldl ::
(result -> item -> result) -> -- Folding Function
result -> -- Initial Result
[item] -> -- List of inputs
result -- Final Result
The most important part of this is the "folding" function. This function takes the previous result and the next item in the list, and incorporates that item into the result. The best analogy I can think of is like a stew. It starts out as water/stock (the initial value). And then you incorporate each item into the stew (but it's still a stew after each thing you add). And at the end your have your "final stew".
In our example above, the folding function considers whether the previous result is even or odd, and uses that to determine which part of the point to add to make the new result.
myFuncFold :: Int -> Point -> Int
myFuncFold prev (Point x y) = if prev `mod` 2 == 0
then prev + x
else prev + y
Once we have the folding function, it is usually very simple to write the final expression!
myFunc :: [Point] -> Int
myFunc points = foldl myFuncFold 0 points
With the list as the last argument, you can often eta reduce:
myFunc :: [Point] -> Int
myFunc = foldl myFuncFold 0
As a note, there are three different folding functions. We'll seen foldl
, but there's also foldr
and foldl'
:
foldr :: (item -> result -> result) -> result -> [item] -> result
foldl' :: (result -> item -> result) -> result -> [item] -> result
Notice that the arguments of the folding function are flipped for foldr
! This is a "right" fold, so it will actually start from the right of your list instead of the left. For associative operations, order does not matter, but often it does!
>> foldl (+) 0 [1, 2, 3, 4]
10
>> foldr (+) 0 [1, 2, 3, 4]
10 -- Addition is associative, same result
>> foldl (-) 0 [1, 2, 3, 4]
-10
>> foldr (-) 0 [1, 2, 3, 4]
-2 -- Subtraction is not, different result!
More commonly foldl
will capture how you see the operation in your head. However, foldl
is more prone to stack overflows with large input lists due to laziness in Haskell. Using foldr
gets around these problems, but might force you to reverse your list. You can use foldl'
to solve both problems. It processes the list in the same order as foldl
, while introducing strictness that prevents memory accumulation. Note that foldl'
must be imported from Data.List
while the other two functions are included in Prelude
.
Finally, another great use for folds is manipulating data structures like sets and sequences. Suppose you're just trying to add a list of elements to an existing set. You could create a second set with fromList
and append.
addListToSet :: Set a -> [a] -> Set [a]
addListToSet prevSet items = prevSet <> (Set.fromList items)
However, this is inefficient. You can avoid making an intermediate set like so:
addListToSet :: Set a -> [a] -> Set a
addListToSet prevSet items = foldr Set.insert prevSet items
-- Alternative with foldl
addListToSet prevSet items = foldl (flip Set.insert) prevSet items
Set insertion is associative so the difference in the two above approaches doesn't matter. But with a Map
, the values for keys can be overwritten, so the order is important. Remember that!
If you master folds, you'll be well on your way to being more comfortable with Haskell. But stay tuned for more ways that Haskell will save you from for-loops! If you're interested in more beginner level tips, you should download our Beginners Checklist! It'll help you get set up with Haskell so you can be more effective!
Does Haskell have For Loops?
Before I get started with today's article, I'll mention that today, April 4th is the last day to take advantage of our Spring Sale! If you subscribe today, you'll get a discount code that will get you 20% off any of our courses. This code is good until 11:59PM tonight (PST, GMT-07), so don't wait!
But now on to our new topic for the month of April: For Loops! I'll start with my least favorite quote about Haskell and for loops.
"But Haskell doesn't even have for loops!"
-- Anyone who dislikes Haskell.
This statement is often used to demonstrate just how different Haskell is from other languages. And unfortunately often, it's used to scare beginners away from trying Haskell. For loops are, in some ways, one of the most fundamental building blocks of modern programming languages. After variables, "if" statements and perhaps basic functions, they are one of the first bits of syntax you'll need to pick up on when you're learning a language like C++, Java, Javascript, Python, Rust, or Go. It's almost just muscle memory for a lot of programmers. "Oh I have to do something with every element of an iteration...", and out come the following lines:
for (int i = 0; i < my_array.size(); ++i) {
// do something
}
// or
for (const auto& item : my_array) {
// do something
}
So the idea that Haskell "doesn't have for loops" can make it a scary prospect to learn Haskell. How can one give up something so fundamental to programming?
However, some programmers suggest, in apparent contrast, that "bare" loops, including for loops, are a code smell. In C++ especially, there is probably an existing std::algorithm
function that gives exactly the loop behavior you are looking for (e.g. std::search
, or std::copy_if
). And using these built-in functions will save you from some potential bugs.
A lot of these algorithms are "functional", requiring you to pass a function as an input to the algorithm. And this is the point! Haskell doesn't have the same broad, generic for
syntax (although a for
function does exist, as we'll explore). But it allows all the behaviors of for loops through different functional algorithms.
For the month of April, we'll explore all the common for-loop patterns and describe how to implement them in Haskell. Many of these are a simple matter of a single functional combinator. Today we'll look at two examples of this pattern: map
and filter
.
Consider these two patterns we could write.
- Given a list of integers, return a new list where each value is doubled.
- Given a list of integers, return a new list containing only the even values in the first list.
Here's some "bare" for loop code we could write to accomplish this in C++:
std::vector<int> myInts = {...};
std::vector<int> doubled;
std::vector<int> onlyEven;
for (int i = 0; i < myInts.size(); ++i) {
doubled.push_back(myInts[i] * 2);
}
for (int i = 0; i < myInts.size(); ++i) {
if (myInts[i] % 2 == 0) {
onlyEven.push_back(myInts[i]);
}
}
In Haskell, we don't need a for
construct for these tasks. We just have the combinators map
and filter
, which take functions as arguments. In the most basic sense, we can treat these as operating over lists.
map :: (a -> b) -> [a] -> [b]
filter :: (a -> Bool) -> [a] -> [a]
Getting our new lists is as easy as supplying a lambda function to each of these.
myInts :: [Int]
myInts = ...
doubled :: [Int]
doubled = map (2 *) myInts
onlyEven :: [Int]
onlyEven = filter (\x -> x `mod` 2 == 0) myInts
Both of these concepts generalize to many structures though! For the idea of "mapping", we can use the fmap
function over any Functor type.
fmap :: Functor f => (a -> b) -> f a -> f b
For filter, we can use any Foldable type.
filter :: Foldable t => (a -> Bool) -> t a -> t b
Most basic data structures implement these classes, so we can use these functions extremely generically!
Throughout the rest of the month we'll be talking about more ways Haskell replaces for-loops with functional constructs. Make sure to subscribe so you can stay up to date with the latest news!
Spring Sale ends in 4 Days!
On Monday we'll start April off with a fresh topic, but I wanted to take one more opportunity to remind you of the special Spring Sale going on right now with our courses at Monday Morning Haskell Academy! Here's what's new this time around:
You can now get the Effects Bundle, which includes Making Sense of Monads as well as Effectful Haskell! These courses will help you build a solid understanding of the critical concept of monads and how to use them in your programs. Between the two courses there's over 3.5 hours of video content, two mini-projects, and a full project where you'll deploy some Haskell server code to Heroku!
If you're super interested in using Haskell for machine learning, you can also enroll in our Haskell Brain course . This course is newly reopened, so if you missed out in the winter sale, you can sign up now! It will teach you the basics of using Tensorflow with Haskell as well as some other useful libraries.
Best of all you can get a further 20% discount on all of our courses if your subscribe to our newsletter! You'll get a discount code in your inbox that you can use on any of our courses.
But Which Course Should I Take?
Our course offerings have grown over the years so it might be a little overwhelming to figure out which one is right for you. Here are a few statements that might describe where you are in your Haskell journey, and which course you should look at!
I've never written a line of Haskell before and I want to try it out!
You should take Haskell From Scratch! This 7-module course will walk you through all the most basic elements of Haskell's syntax and teach you some basic problem solving skills.
I understand the basics but I want a deeper understanding of important concepts.
The Effects Bundle is the way to go here. You'll learn about monads from the ground up in Making Sense of Monads and then you'll learn different ways to organize your monads in Effectful Haskell.
I want to try writing a real-world application using Haskell.
Take a look at Practical Haskell! You'll learn the basics of building a web server, interacting with a database, and even adding a frontend with Elm!
I want to write a machine learning program in Haskell.
Haskell Brain is your bet!
FAQ
Last of all, here are some frequently asked questions you might have!
Do I have to wait to access any course content?
Nope! Our courses are all self-paced, and you'll have access to all content immediately!
How long will I have access to the course content?
There's no expiration! You get lifetime access to the content!
Can I get a discount?
In addition to the 20% discount you can get from subscribing to our newsletter, discounts are available for university students! We also recognize that the prices may be steep for people from countries with unfavorable exchange rates against the US Dollar and can provide an extra discount for these situations. Email me at james@mondaymorninghaskell.me for details on these special cases!
Will I be able to get a refund?
Yes! We provide full refunds with no questions asked within 30 days of purchase.
The Spring Sale ends on Monday, April 4th, so don't miss out! Subscribe, get the discount code, and head over to our courses page!
New Course Bundle!
This whole month, I've been writing about some of the basics of using monads in Haskell. This often misunderstood concept is very important for putting together more sophisticated Haskell programs. In my series on Monads and Functional Structures I discuss monads more from the ground up and go into a lot more depth about different ways to use monads.
But last year, I released two great new ways to learn about monads! If you head over to the Monday Morning Haskell Academy, you'll find two courses that are specifically geared towards this tricky concept!
First, there is Making Sense of Monads. If monads have always confused you and you don't even know where to start, start here! This beginner-level course starts from the ground up, similar to the monads series. But it goes into even more depth, offering lots of slides to clarify concepts, and providing you with the opportunity to practice your skills with dozens of exercises and two different mini-projects!
If you think you've got the basics down, you can try something more advanced with our Effectful Haskell course. This goes into way more depth about how we can actually use monads in a real application. It will teach you several different ways of representing and organizing side effects in your Haskell program, including the idea of Free Monads! By the end of this course, you'll have built your own simple web server and learned to host it on Heroku! This project can serve as an example for many more complicated ideas. If you know the basics of monads already, but want to know how they actually help you build a real program, this is the course for you!
Now perhaps both of these ideas sound appealing to you. For the first time, we're offering our Effects Bundle, which combines both of these courses! If you get them together, you'll save almost 30%!
Speaking of discounts, you can also get 20% off all our courses if you subscribe to our monthly newsletter! Subscribing always gives you access to our Subscriber Resources, but this week it will also get you a discount code on anything on Monday Morning Haskell Academy. This includes an additional discount on the the above-mentioned Effects Bundle. It also includes the newly re-opened Haskell Brain course, which will introduce you to the basics of using machine learning in Haskell! If you missed out on this course back in the winter, now's your chance to get started!
The sale will last for a week! So make sure you sign up and get those discounts before 11:59PM PST (GMT-07) next Monday, April 4th! Don't miss out!
Making your own Monad
There are many built-in monads from the MTL library that you'll find useful, like Reader, Writer, State, ExceptT, and so on. You can use transformers to combine these into a monad that is uniquely specific to your application. Today, we'll talk about how to make your monad into its own data type.
For our example monad, we'll incorporate an environment configuration using a Reader, a persistent application state, the ability to throw certain exceptions, and have all of that on top of IO. This would give us a monad like:
StateT AppState (ReaderT EnvConfig (ExceptT AppError IO)) a
And you might have many different functions in your application using this monad. Obviously you wouldn't want to keep writing down that long expression each time. So you could use a type alias, parameterized by the result type of the operation:
type AppMonad a = StateT AppState (ReaderT EnvConfig (ExceptT AppError IO)) a
loginUser :: AuthInfo -> AppMonad User
logoutUser :: AppMonad ()
printEnvironmentLogs :: AppMonad ()
However, an important trick to know is that we can make AppMonad
a proper type instead of a simple alias by using newtype
. To use this type as a monad, you'll need instances for the Monad
class and its ancestors. But we can derive those automatically, as long as we're using common MTL layers.
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype AppMonad a = AppMonad
(StateT AppState
(ReaderT EnvConfig
(ExceptT AppError IO))) a
deriving (Functor, Applicative, Monad)
Notice we are still using a
as a type parameter for the result!
Like other monads, it is usually a good idea to use a "run" function to give users an entrypoint into this monad. This function will need appropriate inputs. First of all, it will take the monadic action itself. Then it will take initial values for the state and configuration.
There are a few different options for the result value. In certain cases, it can be the "pure" result type from the computation. But if IO
is on the monad stack, your result will need to be in the IO
monad. And in our case, since we've included ExceptT
, we'll also allow the result to be Either
, since it may encounter our error type.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
How do we write such a function? The answer is that we'll incorporate the "run" functions of all the other monads on our stack! The key is knowing how to destructure a single monadic action. First, we pattern match our input action. Below, the expression stateAction
corresponds to a StateT
value, because that is the outer layer of our monad.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ...
How do we make progress with a stateAction
? By using a "run" function from the State
monad of course! In our case, we don't care about the final stateful value, so we'll use evalStateT
. This gives us an expression at the next level of the monad, which is a ReaderT
.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ...
where
readerAction :: ReaderT (ExceptT AppError IO) a
readerAction = evalStateT stateAction appState
Now we can do the same thing to unwind the Reader
action. We'll call runReaderT
using our supplied environment config. This gives us an action in the ExceptT
layer.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ...
where
readerAction :: ReaderT (ExceptT AppError IO) a
readerAction = evalStateT stateAction appState
exceptAction :: ExceptT AppError IO a
exceptAction = runReaderT readerAction envConfig
And finally, we use runExceptT
to unwind the ExceptT
layer, which gives us an IO action. This is our final result.
runAppMonad :: AppMonad a -> EnvConfig -> AppState -> IO (Either AppError a)
runAppMonad (AppMonad stateAction) envConfig appState = ioAction
where
readerAction :: ReaderT (ExceptT AppError IO) a
readerAction = evalStateT stateAction appState
exceptAction :: ExceptT AppError IO a
exceptAction = runReaderT readerAction envConfig
ioAction :: IO (Either AppError a)
ioAction = runExceptT exceptAction
Now we can use our monad in any location that has access to the initial state values and the IO monad!
There are a couple more tricks we can pull with our own monad. We can, for example, make this an instance of certain monadic type classes. This will make it easier to incorporate normal IO actions, or let us use State
actions without needing to lift
. Here are a couple examples:
instance MonadIO AppMonad where
liftIO = AppMonad . lift . lift . lift
instance MonadState AppState AppMonad where
get = AppMonad get
put = AppMonad . put
We could even create our own typeclass related to this monad! This idea is a bit more specialized. So if you want to learn more about this and other monad ideas, you should follow up and read our full Monads Series! You can also subscribe to our monthly newsletter so you can keep up to date with the latest Haskell news and offers!
An Alternative Approach
Part of what monads do is that they encapsulate side effects. They typically include activity that is not a simple and pure calculation like 2 + 2
. Because of this, there is a much higher chance that monadic actions will "fail" in some way. They won't be able to achieve the stated goal of the computation and will enter some kind of exceptional flow. There are a number of different classes that help us deal with these failures in reasonable ways. They'll also enable us to write code that can work with many different monads. These classes are Alternative
, MonadFail
, and MonadPlus
.
The Alternative
class works for all "applicative" types, so it actually applies more broadly than monads. But still, most of the common users of it are monads. As you might guess, this class allows us to say, "In case this action fails, do this instead." It has two primary functions: empty
and the operator (<|>)
.
The empty
function provides a "failure" condition of sorts. What happens when we can't execute the monadic (or applicative) action? For a monad like Maybe
, the outcome of this is still something we could resolve with "pure" code. We just get the value Nothing
.
instance Alternative Monad where
empty = Nothing
...
However, IO
provides a failure mechanism that will cause a runtime error if the operation doesn't succeed:
instance Alternative IO where
empty = failIO "mzero"
...
Using failIO
will throw an IOError
that will crash our whole program unless it gets caught elsewhere. The word "mzero" is a default failure message that will make more sense in a second!
Now the (<|>)
operator is, of course intended to look like the "or" operator (||)
. It takes two different actions as its inputs. It allows us to provide a different action to run if our first fails. So for Maybe
, we can see if our first result is Nothing
. And if so, we'll resolve the second Maybe
value. Of course, this second value might also be Nothing
! Using alternatives doesn't always guarantee success!
instance Alternative Monad where
empty = Nothing
m1 <|> m2 = case m1 of
Nothing -> m2
justValue -> justValue
We can see this in action:
>> let f x = if even x then Just (quot x 2) else Nothing
>> f 4
Just 2
>> f 3
Nothing
>> f 4 <|> f 8
Just 2
>> f 5 <|> f 8
Just 4
>> f 3 <|> f 5
Nothing
With IO
, we actually need to catch the exception thrown by failIO
and then perform the next action:
instance Alternative IO where
empty = failIO "mzero"
action1 <|> action2 = action1 `catchException` (\_ :: IOError -> action2)
Here's a quick look at these functions with an IO
action:
>> readFile "does_not_exist.txt"
Error: openFile: does not exist (No such file or directory)
>> readFile "does_not_exist.txt" <|> readFile "does_exist.txt"
"Text in second file"
There are a couple other functions we can use with Alternative
to provide a list of outcomes. The many
function will take a single operation and run it repeatedly until the operation fails by returning empty
:
many :: (Alternative f) => f a -> f [a]
In this case, we are guaranteed that the result of the function is never an error (empty
)! If the first attempt at the operation fails, we'll get an empty list.
But if we want to enforce that it will succeed at least once, we can use some
instead:
some :: (Alternative f) => f a -> f [a]
This cannot return an empty list. If the first result is empty
, it will give the failure action.
It may seem a little odd that these functions take only a single action, rather than a list of actions. But these functions (and the Alternative
class in general) lie at the heart of many parsing operations! You can learn more about that in our Parsing series.
There are a couple other classes that build on these Alternative
ideas. The MonadFail
class has one function: fail
. This function takes a string as an argument and will perform an appropriate failure action, often involving a runtime error:
class MonadFail m where
fail :: String -> m a
Then there's MonadPlus
. This takes the essential activities of Alternative
and raises them to be monad-specific. It has mzero
, which mimics empty
, and mplus
, which works like (<|>)
.
instance (Alternative m, Monad m) => MonadPlus m where
mzero :: m a
mplus :: m a -> m a -> m a
Often, the underlying alternative functions are used as the default instances, and there is no change in behavior. I personally find it a little confusing that "plus" is an "or" operation instead of "and" operation. I would expect that "adding" two operations would perform the first and then the second in succession. But this isn't what happens! If the first succeeds, the second never occurs.
Hopefully these different classes help you to write cleaner monadic operations. To learn more about the basics and fundamentals of monads, you should read our series on Monads and Functional Structures and subscribe to our newsletter!
Cool Monad Combinators
Haskell's if-statements work a bit differently from most other languages. In a language like C++ you can have an if statement that just has a single branch like this:
int sumOfList(const std::vector<int>& inputs, bool onlyHalf) {
size_t listSize = inputs.size();
if (onlyHalf) {
// Only operate on half the list
listSize = listSize / 2;
}
...
}
But a statement like that doesn't strictly fit into Haskell's paradigms of limiting side effects and assigning types to all expressions. An if statement has to be an expression like everything else, and that expression must have a type. So the way Haskell does this is that an if statement must have two branches (if and else) and each branch must be an expression with the same result type.
But what about a situation where you just want to do conditional logging? Here's another quick example:
int sumOfList(const std::vector<int>& inputs, bool isVerbose) {
if (isVerbose) {
std::cout << "Taking sum of list..." << std::endl;
}
...
}
In Haskell, we would need this function to be in the IO
monad, since it's printing to the console. But how would we represent the "conditional logging portion? We'd have to make an "if statement" where each branch has the same type. A putStrLn
expression has the type IO ()
, so we would need an empty
expression of type IO ()
as the other branch. So in this case return ()
works.
sumOfList :: [Int] -> Bool -> m Int
sumOfList inputs isVerbose = do
if isVerbose
then putStrLn "Taking sum of list..."
else return ()
return $ foldl (+) 0 inputs
But it could be annoying to have this pattern of return ()
on a lot of different branches. But there are a couple useful combinators to help us: when
and unless
.
when :: Bool -> m () -> m ()
unless :: Bool -> m () -> m ()
These simply take a boolean value and a monadic action, and they only perform the action based on the boolean value. So we could rewrite our code from above:
sumOfList :: [Int] -> Bool -> m Int
sumOfList inputs isVerbose = do
when isVerbose (putStrLn "Taking sum of list...")
return $ foldl (+) 0 inputs
Now it looks a lot cleaner. With when
, we perform the action whenever it's true. With unless
, the action occurs only when the input is false.
We can observe though that these functions only work when the result value of the input is ()
, which is to say it has no result. Because when the alternative is that "nothing happens", we can't produce a result other than ()
. So these combinators are only sensible when there's some kind of side effect from the monadic action, like printing to the console, or modifying some stateful value.
Sometimes, you may have an action that produces a desired side effect but also returns a value.
printSize :: [Int] -> IO Int
printSize inputs = do
putStrLn $ "Input has size: " ++ show (length inputs)
return $ length inputs
If you want to use this with when
or unless
, you'll have to change the action so it instead returns ()
. This is the job of the void
combinator. It just performs the action and then returns ()
at the end.
void :: m a -> m ()
void action = do
_ <- action
return ()
Now we could apply this with our different expressions above:
sumOfList :: [Int] -> Bool -> m Int
sumOfList inputs isVerbose = do
when isVerbose (void $ printSize inputs)
return $ foldl (+) 0 inputs
Hopefully you're able to use these combinators to make your Haskell a bit cleaner! If you're just getting started on your Haskell journey, you should download our Beginners Checklist! It'll provide you with some helpful tools to get going! Next week, we'll be back with some more monad tips!
Does your Monad even Lift?
Monad transformers are one of the keys to writing Haskell that solves tricker problems with interlocking effect structures. For a more complete look at monad transformers, you should take a look at Part 6 of our Monads Series. But for today we'll tackle the basic idea of "Lifting", which is one of the core ideas behind transformers.
We combine monads by placing them on a "stack". A common example might be StateT s IO
. This transformer allows us to keep track of a stateful value AND perform IO actions.
basicState :: Int -> StateT Int IO Int
basicState x = do
prev <- get
put (2 * prev + x)
return (x + prev)
So far, this function is only performing State
actions like get
and put
. We'd like it to also perform IO
actions like this:
basicState :: Int -> StateT Int IO Int
basicState x = do
prev <- get
put (2 * prev + x)
putStrLn "Double value and added input"
return (x + prev)
However this won't compile. The putStrLn
function lives in the IO
monad, and our overall expression is in the StateT
monad. How do we solve this? By using lift
:
basicState :: Int -> StateT Int IO Int
basicState x = do
prev <- get
put (2 * prev + x)
lift $ putStrLn "Double value and added input"
return (x + prev)
What exactly does lift
do here? It tells Haskell to take a function in one monad and treat it as though it's part of another monad transformer using the original monad as its "underlying" type.
lift :: (Monad m, MonadTrans t) => m a -> t m a
To specialize it to this example:
lift :: IO a -> StateT Int IO a
Why do we call this "lifting"? This comes back to treating monads like a "stack". In this example, IO
is on the bottom of the stack, and State Int
is on top of it. So we have to "lift" it up one level.
Sometimes you might need to lift multiple levels! Consider this example with State
, Writer
, and IO
. We have to use one lift
for tell
(the Writer
function) and two for putStrLn
(the IO
function).
stateWriterIO :: Int -> StateT Int (WriterT [String] IO) Int
stateWriterIO x = do
prev <- get
put (2 * prev + x)
lift . lift $ putStrLn "Double value and added input"
lift $ tell ["Returning input and previous value"]
return (x + prev)
However, with monad classes, we can sometimes skip using multiple lifts. When you combine IO
with any of the basic monads, you can use the special liftIO
function:
liftIO :: (MonadIO m) => IO a -> m a
stateWriterIO :: Int -> StateT Int (WriterT [String] IO) Int
stateWriterIO x = do
prev <- get
put (2 * prev + x)
liftIO $ putStrLn "Double value and added input"
lift $ tell ["Returning input and previous value"]
return (x + prev)
Again, if you monads are a stumbling block for you, I highly encourage you to do some more in depth study and read our Monads Series in some more depth. For our next two articles, we'll go over some other useful combinators with monads in Haskell!
Shorter Run Functions
Last time around, I discussed 'run' functions and how these are the "entrypoint" for using most monads. However, it's also useful to have a couple potential shortcuts up our sleeve. Today we'll go over a couple "shortcut" functions when you don't need everything the monad supplies.
Recall that with the Writer
and State
monads, the run
function produces two outputs. The first is the "result" of our computation (some type a
). The second is the stateful value tracked by the monad:
runWriter :: Writer w a -> (a, w)
runState :: State s a -> s -> (a, s)
There are times however, especially with State
, where the stateful value is the result. There are no shortage of functions out there that look like State s ()
. They have essentially no return value, but they update the tracked value:
doubleAndAddInput :: Int -> State Int ()
doubleAndAddInput x = do
modify (* 2)
modify (+ x)
Now let's think about running this computation. If we use runState
, we'll end up with a tuple where one of the elements is just the unit ()
.
>> runState (doubleAndAddInput 5) 6
((), 17)
Of course we can always just use snd
to ignore the first element of the tuple. But for the sake of code cleanliness it's nice to know that there are helper functions to skip this for you! Here are the exec
functions for these two monads. They will return only the tracked state value!
execWriter :: Writer w a -> w
execState :: State s a -> s -> s
So applying this with our example above, we would get the following:
>> execState (doubleAndAddInput 5) 6
17
On the flip side, there are also times where you don't care about the accumulated state. All you care about is the final result! In this case, the function you want for the State
monad is evalState
:
evalState :: State s a -> s -> a
So now we could supply a return value in our function:
doubleAndAddInput :: Int -> State Int Int
doubleAndAddInput x = do
prev <- get
put (2 * prev + x)
return (x + prev)
Then we can drop the accumulated state like so:
>> evalState (doubleAndAddInput 5) 6
11
For whatever reason, there doesn't seem to be evalWriter
in the library. Perhaps the logic here is that the accumulated Writer value doesn't affect the computation, so if you're going to ignore its output, it has no effect. However, I could imagine cases where you originally wrote the function as a Writer
, but in a particular usage of it, you don't need the value. So it's an interesting design decision.
Anyways, these functions also exist with monad transformers of course:
execStateT :: StateT s m a -> s -> m s
evalStateT :: StateT s m a -> s -> m a
execWriterT :: WriterT w m a -> m w
Next time we'll dig into monad transformers a bit more! In the meantime, learn more about monads by taking a look at our series on Monads and Functional Structures!
Running with Monads
I had a big stumbling block in learning monads. Perhaps unsurprisingly, this occurred because I was trying to take a particular monadic analogy too far in the college class where I first learned about them. I got the idea in my head that, "Monads are like a treasure chest", and my mental model went something like this:
- Monads are like a treasure chest.
- You can't directly access what's inside (the
a
inIO a
). - That is, unless you are already in that monad.
- In that case, the "bind" (
>>=
) operator let's us pass in a function that accesses the "inner value". - But the result we produce is re-wrapped in the monad!
And so my head was spinning in a loop trying to figure out how I could actually get into a monad in the first place.
However, the code I was looking at was not normal monadic code. It was IO
code. And I was conflating the IO
monad with all monads. Don't do this! The IO monad is, in fact, special! It does follow some of the same patterns as other monads. But this special property of "you can't access the inner value unless you're in IO" does not apply to other monads!
The majority of monads you will encounter can be accessed from pure code. The most common way of doing this is through a run
function. Here are three of the most common examples with the Reader
, Writer
and State
monads:
runReader :: Reader r a -> r -> a
runWriter :: Writer w a -> (a, w)
runState :: State s a -> s -> (a, s)
Most often, you'll need to supply an "initial" value, as we see with Reader
and State
. And many times you'll get the final stateful value as a second product of the function. This occurs in Writer
and State
.
Here's a simple example. We have a State
computation that adds 1 to the stored value, and then adds the previous stored value to the input, returning that. We can call into this stateful function from a totally pure function using runState
:
stateFunction :: Int -> State Int Int
stateFunction input = do
prev <- get
modify (+1)
return $ prev + input
callState :: Int -> (Int, Int)
callState x = runState (stateFunction (x + 5)) 11
...
>> callState 3
(19, 12)
>> callState 7
(23, 12)
With monad transformers, the concept of the "run" function is very similar. The functions now end with the suffix T
. The only difference is that it produces a value in the underlying monad m
:
runReaderT :: ReaderT r m a -> r -> m a
runWriterT :: WriterT w m a -> m (a, w)
runStateT :: StateT s m a -> s -> m (a, s)
Of course, there are exceptions to this pattern of "run" functions. As we learned about last time, Either
can be a monad, but we can also treat Either
values as totally normal objects. To access the inner values, we just need a case
statement or a pattern match. You don't need to take the "treasure box" approach. Or at least, with certain monads, the treasure box is very easy to unlock.
If we go back to IO for a second. There is no "run" function for IO. There is no runIO
function, or runIO_T
transformer. You can't conjure IO computations out of nothing (at least not safely). Your program's entrypoint is always a function that looks like:
main :: IO ()
To use any IO function in your program, you must have an unbroken chain of IO access going back to this main function, whether through the IO
monad itself or a transformer like StateT IO
. This pattern allows us to close off large parts of our program to IO computations. But no part of your program is firmly closed off to a State
computation. As long as you can generate the "initial" state, you can then access the State
monad via runState
.
So if you were struggling with the same stumbling block I was, hopefully this clears things up for you! If you want more examples of how monads work and how to apply these run functions, take a look at our series Monads and Functional Structures!
Using Either as a Monad
Now that February is over and we're into March, it's time for "Monads Month"! Over the course of the next month I'll be giving some helpful tips on different ways to use monads.
Today I'll start with a simple observation: the Either
type is a monad! For a long time, I used Either
as if it were just a normal type with no special rules. But its monadic behavior allows us to chain together several computations with it with ease!
Let's start from the beginning. What does Either
look like? Well, it's a very basic type that can essentially hold one of two types at runtime. It takes two type parameters and has two corresponding constructors. If it is "Left", then it will hold a value of the first type. If it is "Right" then it will hold a value of the second type.
data Either a b = Left a | Right b
A common semantic understanding of Either
is that it is an extension of Maybe
. The Maybe
type allows our computations to either succeed and produce a Just
result or fail and produce Nothing
. We can follow this pattern in Either
except that failures now produce some kind of object (the first type parameter) that allows us to distinguish different kinds of failures from each other.
Here's a basic example I like to give. Suppose we are validating a user registration, where they give us their email, their password, and their age. We'll provide simple functions for validating each of these input strings and converting them into newtype
values:
newtype Email = Email String
newtype Password = Password String
newtype Age = Age Int
validateEmail :: String -> Maybe Email
validateEmail input = if '@' member input
then Just (Email input)
else Nothing
validatePassword :: String -> Maybe Password
validatePassword input = if length input > 12
then Just (Password input)
else Nothing
validateAge :: String -> Maybe Age
validateAge input = case (readMaybe input :: Maybe Int) of
Nothing -> Nothing
Just a -> Just (Age a)
We can then chain these operations together using the monadic behavior of Maybe
, which short-circuits the computation if Nothing
is encountered.
data User = User Email Password Age
processInputs :: (String, String, String) -> Maybe User
processInputs (i1, i2, i3) = do
email <- validateEmail i1
password <- validatePassword i2
age <- validateAge i3
return $ User email password age
However, our final function won't have much to say about what the error was. It can only tell us that an error occurred. It can't tell us which input was problematic:
createUser :: IO (Maybe User)
createUser = do
i1 <- getLine
i2 <- getLine
i3 <- getLine
result <- processInputs (i1, i2, i3)
case result of
Nothing -> print "Couldn't create user from those inputs!" >> return Nothing
Just u -> return (Just u)
We can extend this example to use Either
instead of Maybe
. We can make a ValidationError
type that will help explain which kind of error a user encountered. Then we'll update each function to return Left ValidationError
instead of Nothing
in the failure cases.
data ValidationError =
BadEmail String |
BadPassword String |
BadAge String
deriving (Show)
validateEmail :: String -> Either ValidationError Email
validateEmail input = if '@' member input
then Right (Email input)
else Left (BadEmail input)
validatePassword :: String -> Either ValidationError Password
validatePassword input = if length input > 12
then Right (Password input)
else Left (BadPassword input)
validateAge :: String -> Either ValidationError Age
validateAge input = case (readMaybe input :: Maybe Int) of
Nothing -> Left (BadAge input)
Just a -> Right (Age a)
Because Either
is a monad that follows the same short-circuiting pattern as Maybe
, we can also chain these operations together. Only now, the result we give will have more information.
processInputs :: (String, String, String) -> Either ValidationError User
processInputs (i1, i2, i3) = do
email <- validateEmail i1
password <- validatePassword i2
age <- validateAge i3
return $ User email password age
createUser :: IO (Either ValidationError User)
createUser = do
i1 <- getLine
i2 <- getLine
i3 <- getLine
result <- processInputs (i1, i2, i3)
case result of
Left e -> print ("Validation Error: " ++ show e) >> return e
Right u -> return (Right u)
Whereas Maybe
gives us the monadic context of "this computation may fail", Either
can extend this context to say, "If this fails, the program will give you an error why."
Of course, it's not mandatory to view Either
in this way. You can simply use it as a value that could hold two arbitrary types with no error relationship:
parseIntOrString :: String -> Either Int String
parseIntOrString input = case (readMaybe input :: Maybe Int) of
Nothing -> Right input
Just i -> Left i
This is completely valid, you just might not find much use for the Monad
instance.
But you might find the monadic behavior helpful by making the Left
value represent a successful case. Suppose you're writing a function to deal with a multi-layered logic puzzle. For a simple example:
- If the first letter of the string is capitalized, return the third letter. Otherwise, drop the first letter from the string.
- If the third letter in the remainder is an 'a', return the final character. Otherwise, drop the last letter from the string. 3 (and so on with similar rules)
We can encode each rule as an Either
function:
rule1 :: String -> Either Char String
rule1 input = if isUpper (head input)
then Left (input !! 2)
else Right (tail input)
rule2 :: String -> Either Char String
rule2 input = if (input !! 2 == 'a')
then Left (last input)
else Right (init input)
rule3 :: String -> Either Char String
...
To solve this problem, we can use the Either
monad!
solveRules :: String -> Either Char String
solveRules input = do
result1 <- rule1 input
result2 <- rule2 result1
...
If you want to learn more about monads, you should check out our blog series! For a systematic, in depth introduction to the concept, you can also take our Making Sense of Monads course!