Combining Ideas: mapAccum
In the last couple weeks, we've learned about folding and scanning, which are valuable tools for replacing conventional for loops in our Haskell code. Today we'll go over a lesser known idea that sort of combines folding and scanning (or just folding and mapping). The particular function we'll look at is mapAccumL
. For our motivating example, let's consider a parsing program.
Our input is a set of strings giving a series of mathematical operations.
Add 5
Multiply 3
Add 9
Divide 8
Subtract 2
We want to know the final "value" of this file, (in this case it's 6) but we also want to keep track of the operations we used to get there. Here's a C++ outline:
enum OpType { ADD, SUB, MUL, DIV };
struct Operation {
OpType opType;
double value;
};
Operation parseOperation(const std::string& inputLine) { ... }
double processOperation(double currentValue, Operation op) {
switch (op.opType) {
case ADD: return currentValue + op.value;
case SUB: return currentValue - op.value;
case MUL: return currentValue * op.value;
case DIV: return currentValue / op.value;
}
}
Given the list of input lines, there are two approaches we could use to get the desired outputs. We can process the lines first and then do the calculation. This results in two loops like so:
std::pair<double, std::vector<Operation>> processLines(const std::vector<std::string>& lines) {
std::vector<Operation> operations;
for (const auto& line : lines) {
operations.push_back(parseOperation(line));
}
double result = 0.0;
for (auto& operation : operations) {
result = processOperation(result, operation);
}
return {result, operations);
}
But we can also write it with a single for loop like so:
std::pair<double, std::vector<Operation>> processLines(const std::vector<std::string>& lines) {
std::vector<Operation> operations;
double result = 0.0;
for (const auto& line : lines) {
Operation newOp = parseOperation(line);
operations.push_back(newOp);
result = processOperation(result, newOp);
}
return {result, operations);
}
This for loop essentially performs multiple different actions for us at the same time. It appends to our list, AND it updates our result value. So it's acting like a map and a fold simultaneously.
In Haskell, most of the functional loop replacements really only perform a single action, so it's not necessarily clear how to do "double duty" like this. Here's a simple Haskell approach that splits the work in two pieces:
data OpType = Add | Sub | Mul | Div
data Operation = Operation OpType Double
parseOperation :: String -> Operation
processOperation :: Double -> Operation -> Double
processLines :: [String] -> (Double, [Operation])
processLines lines = (ops, result)
where
ops = map parseOperation lines
result = foldl processOperation 0.0 ops
It turns out though, we have a function that can combine these steps! This function is mapAccumL
.
mapAccumL :: (a -> b -> (a, c)) -> a -> [b] -> (a, [c])
Once again, I'll change up this type signature to assign a semantic value to each of these types. As always, the type b
that lives in our input list is our item
type. The type a
is still our primary result
, but now I'll assign c
as the accum
type.
mapAccumL :: (result -> item -> (result, accum)) -> result -> [item] -> (result, [accum])
It should be obvious how we can rewrite our Haskell expression now using this function:
processLines :: [String] -> (Double, [Operation])
processLines lines = mapAccumL processSingleLine 0.0 lines
where
processSingleLine result line =
let newOp = parseOperation line
in (processOperation result newOp, newOp)
Now, our original implementation is still perfectly fine and clean! And we could also do this using foldl
as well. But it's good to know that Haskell has more functions out there that can do more complicated types of loops than just simple maps and folds.
If you want to see me writing some Haskell code, including many many for loops, tune into my Twitch Stream every Monday evening! Start times will be announced via Twitter. You can also subscribe to our newsletter to learn more and stay up to date!