Unit Tests and Benchmarks in Rust

hourglass.jpg

For a couple months now, we've focused on some specific libraries you can use in Rust for web development. But we shouldn't lose sight of some other core language skills and mechanics. Whenever you write code, you should be able to show first that it works, and second that it works efficiently. If you're going to build a larger Rust app, you should also know a bit about unit testing and benchmarking. This week, we'll take a couple simple sorting algorithms as our examples to learn these skills.

As always, you can take a look at the code for this article on our Github Repo for the series. You can find this week's code specifically in sorters.rs! For a more basic introduction to Rust, be sure to check out our Rust Beginners Series!

Insertion Sort

We'll start out this article by implementing insertion sort. This is one of the simpler sorting algorithms, which is rather inefficient. We'll perform this sort "in place". This means our function won't return a value. Rather, we'll pass a mutable reference to our vector so we can manipulate its items. To help out, we'll also define a swap function to change two elements around that same reference:

pub fn swap(numbers: &mut Vec<i32>, i: usize, j: usize) {
    let temp = numbers[i];
    numbers[i] = numbers[j];
    numbers[j] = temp;
}

pub fn insertion_sorter(numbers: &mut Vec<i32>) {
    ...
}

At its core, insertion sort is a pretty simple algorithm. We maintain the invariant that the "left" part of the array is always sorted. (At the start, with only 1 element, this is clearly true). Then we loop through the array and "absorb" the next element into our sorted part. To absorb the element, we'll loop backwards through our sorted portion. Each time we find a larger element, we switch their places. When we finally encounter a smaller element, we know the left side is once again sorted.

pub fn insertion_sorter(numbers: &mut Vec<i32>) {
    for i in 1..numbers.len() {
        let mut j = i;
        while j > 0 && numbers[j-1] > numbers[j] {
            swap(numbers, j, j - 1);
            j = j - 1;
        }
    }
}

Testing

Our algorithm is simple enough. But how do we know it works? The obvious answer is to write some unit tests for it. Rust is actually a bit different from Haskell and most other languages in the canonical approach to unit tests. Most of the time, you'll make a separate test directory. But Rust encourages you to write unit tests in the same file as the function definition. We do this by having a section at the bottom of our file specifically for tests. We delineate a test function with the test macro:

[#test]
fn test_insertion_sort() {
  ...
}

To keep things simple, we'll define a random vector of 100 integers and pass it to our function. We'll use assert to verify that each number is smaller than the next one after it.

#[test]
fn test_insertion_sort() {
    let mut numbers: Vec<i32> = random_vector(100);
    insertion_sorter(&mut numbers);
    for i in 0..(numbers.len() - 1) {
        assert!(numbers[i] <= numbers[i + 1]);
    }
}

When we run the cargo test command, Cargo will automatically detect that we have a test suite in this file and run it.

running 1 test...
test sorter::test_insertion_sort ... ok

Benchmarking

So we know our code works, but how quickly does it work? When you want to check the performance of your code, you need to establish benchmarks. These are like test suites except that they're meant to give out the average time it takes to perform a task.

Just as we had a test macro for making test suites, we can use the bench macro for benchmarks. Each of these takes a mutable Bencher object as an argument. To record some code, we'll call iter on that object and pass a closure that will run our function.

#[bench]
fn bench_insertion_sort_100_ints(b: &mut Bencher) {
    b.iter(|| {
        let mut numbers: Vec<i32> = random_vector(100);
        insertion_sorter(&mut numbers)
    });
}

We can then run the benchmark with cargo bench.

running 2 tests
test sorter::test_insertion_sort ... ignored
test sorter::bench_insertion_sort_100_ints   ... bench:       6,537 ns
/iter (+/- 1,541)

So on average, it took about 6ms to sort 100 numbers. On its own, this number doesn't tell us much. But we can get a more clear idea for the runtime of our algorithm by looking at benchmarks of different sizes. Suppose we make lists of 1000 and 10000:

#[bench]
fn bench_insertion_sort_1000_ints(b: &mut Bencher) {
    b.iter(|| {
        let mut numbers: Vec<i32> = random_vector(1000);
        insertion_sorter(&mut numbers)
    });
}

#[bench]
fn bench_insertion_sort_10000_ints(b: &mut Bencher) {
    b.iter(|| {
        let mut numbers: Vec<i32> = random_vector(10000);
        insertion_sorter(&mut numbers)
    });
}

Now when we run the benchmark, we can compare the results of these different runs:

running 4 tests
test sorter::test_insertion_sort ... ignored
test sorter::bench_insertion_sort_10000_ints ... bench:  65,716,130 ns
/iter (+/- 11,193,188)
test sorter::bench_insertion_sort_1000_ints  ... bench:     612,373 ns
/iter (+/- 124,732)
test sorter::bench_insertion_sort_100_ints   ... bench:      12,032 ns
/iter (+/- 904)

We see that when we increase the problem size by a factor of 10, we increase the runtime by a factor of nearly 100! This confirms for us that our simple insertion sort has an asymptotic runtime of O(n^2), which is not very good.

Quick Sort

There are many ways to sort more efficiently! Let's try our hand at quicksort. For this algorithm, we first "partition" our array. We'll choose a pivot value, and then move all the numbers smaller than the pivot to the left of the array, and all the greater numbers to the right. The upshot is that we know our pivot element is now in the correct final spot!

Here's what the partition algorithm looks like. It works on a specific sub-segment of our vector, indicated by start and end. We initially move the pivot element to the back, and then loop through the other elements of the array. The i index tracks where our pivot will end up. Each time we encounter a smaller number, we increment it. At the very end we swap our pivot element back into its place, and return its final index.

pub fn partition(
  numbers: &mut Vec<i32>,
  start: usize,
  end: usize,
  partition: usize)
  -> usize {
    let pivot_element = numbers[partition];
    swap(numbers, partition, end - 1);
    let mut i = start;
    for j in start..(end - 1) {
        if numbers[j] < pivot_element {
            swap(numbers, i, j);
            i = i + 1;
        }
    }
    swap(numbers, i, end - 1);
    i
}

So to finish sorting, we'll set up a recursive helper that, again, functions on a sub-segment of the array. We'll choose a random element and partition by it:

pub fn quick_sorter_helper(
  numbers: &mut Vec<i32>, start: usize, end: usize) {
    if start >= end {
        return;
    }

    let mut rng = thread_rng();
    let initial_partition = rng.gen_range(start, end);
    let partition_index =
          partition(numbers, start, end, initial_partition);
    ...
}

Now that we've partitioned, all that's left to do is recursively sort each side of the partition! Our main API function will call this helper with the full size of the array.

pub fn quick_sorter_helper(
  numbers: &mut Vec<i32>, start: usize, end: usize) {
    if start >= end {
        return;
    }

    let mut rng = thread_rng();
    let initial_partition = rng.gen_range(start, end);
    let partition_index =
          partition(numbers, start, end, initial_partition);
    quick_sorter_helper(numbers, start, partition_index);
    quick_sorter_helper(numbers, partition_index + 1, end);
}

pub fn quick_sorter(numbers: &mut Vec<i32>) {
    quick_sorter_helper(numbers, 0, numbers.len());
}

Now that we've got this function, let's add tests and benchmarks for it:

#[test]
fn test_quick_sort() {
    let mut numbers: Vec<i32> = random_vector(100);
    quick_sorter(&mut numbers);
    for i in 0..(numbers.len() - 1) {
        assert!(numbers[i] <= numbers[i + 1]);
    }
}

#[bench]
fn bench_quick_sort_100_ints(b: &mut Bencher) {
    b.iter(|| {
        let mut numbers: Vec<i32> = random_vector(100);
        quick_sorter(&mut numbers)
    });
}

// Same kind of benchmarks for 1000, 10000, 100000

Then we can run our benchmarks and see our results:

running 9 tests
test sorter::test_insertion_sort ... ignored
test sorter::test_quick_sort ... ignored
test sorter::bench_insertion_sort_10000_ints ... bench:  65,130,880 ns
/iter (+/- 49,548,187)
test sorter::bench_insertion_sort_1000_ints  ... bench:     312,300 ns
/iter (+/- 243,337)
test sorter::bench_insertion_sort_100_ints   ... bench:       6,159 ns
/iter (+/- 4,139)
test sorter::bench_quick_sort_100000_ints    ... bench:  14,292,660 ns
/iter (+/- 5,815,870)
test sorter::bench_quick_sort_10000_ints     ... bench:   1,263,985 ns
/iter (+/- 622,788)
test sorter::bench_quick_sort_1000_ints      ... bench:     105,443 ns
/iter (+/- 65,812)
test sorter::bench_quick_sort_100_ints       ... bench:       9,259 ns
/iter (+/- 3,882)

Quicksort does much better on the larger values, as expected! We can discern that the times seem to only go up by a factor of around 10. It's difficult to determine that the true runtime is actually O(n log n). But we can clearly see that we're much closer to linear time!

Conclusion

That's all for this intermediate series on Rust! Next week, we'll summarize the skills we learned over the course of these couple months in Rust. Then we'll look ahead to our next series of topics, including some totally new kinds of content!

Don't forget! If you've never programmed in Rust before, our Rust Video Tutorial provides an in-depth introduction to the basics!

Previous
Previous

Rust Web Series Complete!

Next
Next

Cleaning our Rust with Monadic Functions