Quicksort with Haskell!

sorting_array_2.png

Last week we referenced the ST monad and went into a little bit of depth with how it enables mutable arrays. It provides an alternative to the IO monad that gives us mutable data without side effects. This week, we're going to take a little bit of a break from adding features to our Maze game. We'll look at a specific example where mutable data can allow different algorithms.

Let's consider the quicksort algorithm. We can do this "in place", mutating an input array. But immutable data in Haskell makes it difficult to implement this approach. We'll examine one approach using normal, immutable lists. Then we'll see how we can use a more common quicksort algorithm using ST. At the end of the day, there are still difficulties with making this work the way we'd like. But it's a useful experiment to try nonetheless.

Still new to monads in Haskell? You should read our series on Monads and Functional Structures! It'll help you learn monads from the ground up, starting with simpler concepts like functors!

The ST Monad

Before we dive back into using arrays, let's take a quick second to grasp the purpose of the ST monad. My first attempt at using mutable arrays in the Maze game involved using an IOArray. This worked, but it caused generateRandomMaze to use the IO monad. You should be very wary of any action that changes your code from pure to using IO. The old version of the function couldn't have weird side effects like file system access! The new version could have any number of weird bugs present! Among other things, it makes it much harder to use and test this code.

In my specific case, there was a more pressing issue. It became impossible to run random generation from within the eventHandler. This meant I couldn't restart the game how I wanted. The handler is a pure function and can't use IO.

The ST monad provides precisely what we need. It allows us to run code that can mutate values in place without allowing arbitrary side effects, as IO does. We can use the generic runST function to convert a computation in the ST monad to it's pure result. This is similar to how we can use runState to run a State computation from a pure one.

runST :: (forall. s ST a) -> a

The s parameter is a little bit magic. We generally don't have to specify what it is. But the parameter prevents the outside world from having extra side effects on the data. Don't worry about it too much.

There's another function runSTArray. This does the same thing, except it works with mutable arrays:

runSTArray :: (forall. s ST s (STArray s i e)) -> Array i e

This allows us to use STArray instead of IOArray as our mutable data type. Later in this article, we'll use this type to make our "in-place" quicksort algorithm. But first, let's look at a simpler version of this algorithm.

Slow Quicksort

Learn You a Haskell For Great Good presents a short take on the quicksort algorithm. It demonstrates the elegance with which we can express recursive solutions.

quicksort1 :: (Ord a) => [a] -> [a]
quicksort1 [] = []
quicksort1 (x:xs) =
  let smallerSorted = quicksort1 [a | a <- xs, a <= x]
      biggerSorted = quicksort1 [a | a <- xs, a > x]
  in  smallerSorted ++ [x] ++ biggerSorted

This looks very nice! It captures the general idea of quicksort. We take the first element as our pivot. We divide the remaining list into the elements greater than the pivot and less than the pivot. Then we recursively sort each of these sub-lists, and combine them with the pivot in the middle.

However, each new list we make takes extra memory. So we are copying part of the list at each recursive step. This means we will definitely use at least O(n) memory for this algorithm.

We can also note the way this algorithm chooses its pivot. It always selects the first element. This is quite inefficient on certain inputs (sorted or reverse sorted arrays). To get our expected performance to a good point, we want to choose the pivot index at random. But then we would need an extra argument of type StdGen, so we'll ignore it for this article.

It's possible of course, to do quicksort "in place", without making any copies of any part of the array! But this requires mutable memory. To get an idea of what this algorithm looks like, we'll implement it in Java first. Mutable data is more natural in Java, so this code will be easier to follow.

In-Place Quicksort (Java)

The quicksort algorithm is recursive, but we're going to handle the recursion in a helper. The helper will take two add extra arguments: the int values for the "start" and "end" of this quicksort section. The goal of quicksortHelper will be to ensure that we've sorted only this section. As a stylistic matter, I use "end" to mean one index past the point we're sorting to. So our main quicksort function will call the helper with 0 and arr.length.

public static void quicksort(int[] arr) {
  quicksortHelper(arr, 0, arr.length);
}

public static void quicksortHelper(int[] arr, int start, int end) {
  ...
}

Before we dive into the rest of that function though, let's design two smaller helpers. The first is very simple. It will swap two elements within the array:

public static void swap(int[] arr, int i, int j) {
  int temp = arr[i];
  arr[i] = arr[j];
  arr[j] = temp;
}

The next helper will contain the core of the algorithm. This will be our partition function. It's responsible for choosing a pivot (again, we'll use the first element for simplicity). Then it divides the array so that everything smaller than the pivot is in the first part of the array. After, we insert the pivot, and then we get the larger elements. It returns the index of partition:

public static int partition(int[] arr, int start, int end) {
  int pivotElement = arr[start];
  int pivotIndex = start + 1;
  for (int i = start + 1; i < end; ++i) {
    if (arr[i] <= pivotElement) {
      swap(arr, i, pivotIndex);
      ++pivotIndex;
    }
  }
  swap(arr, start, pivotIndex - 1);
  return pivotIndex - 1;
}

Now our quicksort helper is easy! It will partition the array, and then make recursive calls on the sub-parts! Notice as well the base case:

public static void quicksortHelper(int[] arr, int start, int end) {
  if (start + 1 >= end) {
    return;
  }
  int pivotIndex = partition(arr, start, end);
  quicksortHelper(arr, start, pivotIndex);
  quicksortHelper(arr, pivotIndex + 1, end);
}

Since we did everything in place, we didn't allocate any new arrays! So our function definitions only add O(1) extra memory for the temporary values. Since the stack depth is, on average, O(log n), that is the asymptotic memory usage for this algorithm.

In-Place Quicksort (Haskell)

Now that we're familiar with the in-place algorithm, let's see what it looks like in Haskell. We want to do this with STArray. But we'll still write a function with pure input and output. Unfortunately, this means we'll end up using O(n) memory anyway. The thaw function must copy the array to make a mutable version of it. However, the rest of our operations will work in-place on the mutable array. We'll follow the same patterns as our Java code! Let's start simple and write our swap function!

swap :: ST s Int a -> Int -> Int -> ST s ()
swap arr i j = do
  elem1 <- readArray arr i
  elem2 <- readArray arr j
  writeArray arr i elem2
  writeArray arr j elem1

Now let's write out our partition function. We're going to make it look as much like our Java version as possible. But it's a little tricky because we're don't have for-loops! Let's deal with this problem head on by first designing a function to handle the loop.

The loop produces our value for the final pivot index. But we have to keep track of its current value. This sounds like a job for the State monad! Our state function will take the pivotElement and the array itself as a parameter. Then it will take a final parameter for the i value we have in our partition loop in the Java version.

partitionLoop :: (Ord a)
  => STArray s Int a
  -> a
  -> Int
  -> StateT Int (ST s) ()
partitionLoop arr pivotElement i = do
  ...

We fill this with comparable code to Java. We read the current pivot and the element for the current i index. Then, if it's smaller, we swap them in our array, and increment the pivot:

partitionLoop :: (Ord a)
  => STArray s Int a
  -> a
  -> Int
  -> StateT Int (ST s) ()
partitionLoop arr pivotElement i = do
  pivotIndex <- get
  thisElement <- lift $ readArray arr i
  when (thisElement <= pivotElement) $ do
    lift $ swap arr i pivotIndex
    put (pivotIndex + 1)

Now we incorporate this loop into our primary partition function after getting the pivot element. We'll use mapM to sequence the state actions together and pass that to execStateT. Then we'll return the final pivot (subtracting 1). Don't forget to swap the pivot into the middle of the array though!

partition :: (Ord a)
 => STArray s Int a
 -> Int
 -> Int
 -> ST s Int
partition arr start end = do
  pivotElement <- readArray arr start
  let pivotIndex_0 = start + 1
  finalPivotIndex <- execStateT
    (mapM (partitionLoop arr pivotElement) [(start+1)..(end-1)])
    pivotIndex_0
  swap arr start (finalPivotIndex - 1)
  return $ finalPivotIndex - 1

Now it's super easy to incorporate these into our final function!

quicksort2 :: (Ord a) => Array Int a -> Array Int a
quicksort2 inputArr = runSTArray $ do
  stArr <- thaw inputArr
  let (minIndex, maxIndex) = bounds inputArr
  quicksort2Helper minIndex (maxIndex + 1) stArr
  return stArr

quicksort2Helper :: (Ord a)
  => Int 
  -> Int
  -> STArray s Int a
  -> ST s ()
quicksort2Helper start end stArr = when (start + 1 < end) $ do
  pivotIndex <- partition stArr start end
  quicksort2Helper start pivotIndex stArr
  quicksort2Helper (pivotIndex + 1) end stArr

This completes our algorithm! Notice again though, that we use thaw and freeze. This means our main quicksort2 function can have pure inputs and outputs. But it comes at the price of extra memory. It's still cool though that we can use mutable data from inside a pure function!

Conclusion

Since we have to copy the list, this particular example doesn't result in much improvement. In fact, when we benchmark these functions, we see that the first one actually performs quite a bit faster! But it's still a useful trick to understand how we can manipulate data "in-place" in Haskell. The ST monad allows us to do this in a "pure" way. If we're willing to accept impure code, the IO monad is also possible.

Next week we'll get back to game development! We'll add enemies to our game that will go around and try to destroy our player! As we add more and more features, we'll continue to see cool ways to learn about algorithms in Haskell. We'll also see new ways to architect the game code.

There are many other advanced Haskell programs you can write! Check out our Production Checklist for ideas!