Ballparking Solutions

In last week's article, we went over how to build a simple benchmarking library for ourselves by timing computations. Now most of the time, we're inclined to use benchmarking in a purely reactive way. We write a solution to a problem, and then document how long it takes.

But perhaps the most important use of benchmarking is proactive. In this article, we'll learn how to use ballpark estimates to guide our problem-solving process. This can save you a lot of time when you're working on performance-intensive problems, whether on Advent of Code, Hackerrank, or, of course, in the real world of software development!

The Uses of Ballparking

A ballpark estimate gives us an order-of-magnitude idea of how long we expect our solution to take. We should be able to describe our solution's runtime abstractly with Big-O notation. And given that complexity class, along with a problem size, we should be able to determine whether we'll get an answer within a second, a few seconds, a few minutes, or a few hours.

Using this ballpark estimate, we can then determine if our proposed solution is even feasible, without needing to implement the solution. Whether we're solving a problem on Advent of Code, Hackerrank, Leetcode, or in the real world, we can usually have some idea of the scale of a problem. AoC will let you observe the "large" problem input on which to test your code. Hackerrank will give you an idea of the bounds of various problem parameters.

These problem parameters can, in fact, tell us if a solution is acceptable before we even write that solution. For example, if my input problem size is 10000, roughly how long should my program take if it's O(n)? What about O(n log n)? What about O(n^2)?
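
As a rough first approximation (an assumption on my part, since real throughput varies widely by language and operation), compiled code handles somewhere around 10^8 simple operations per second. A quick back-of-the-envelope calculation then answers these questions:

-- A hypothetical helper for back-of-the-envelope estimates, assuming
-- roughly 10^8 simple operations per second (real throughput varies).
opsPerSecond :: Double
opsPerSecond = 1e8

estimatedSeconds :: Double -> Double
estimatedSeconds ops = ops / opsPerSecond

-- With n = 10000:
--   O(n):       estimatedSeconds 1e4                   ~= 0.0001s
--   O(n log n): estimatedSeconds (1e4 * logBase 2 1e4) ~= 0.0013s
--   O(n^2):     estimatedSeconds (1e4 * 1e4)           ~= 1s

The measurements below will let us check these guesses against reality.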

Now it's up to situational judgment to determine what's "acceptable". As a general rule, no one wants to wait around for hours for their code to finish. And yet, in real-world machine learning development, taking several hours to train a model on the latest data is quite commonplace. Hackerrank, however, will reject your solution (or at least not give you full points) if it takes more than a few seconds on the largest input. On Advent of Code, meanwhile, you run the code locally, and you only need to get the answer once. So if your code takes 30 minutes or even an hour to run, you can still use it, as long as you're sure the algorithm is correct (I've done this a few times).

Getting Ballpark Values

So how do we get an idea of these ballpark values? How do we estimate how an algorithm will perform on some data? Well, we can use the library we built last time! All we have to do is select some known operations and time how long they take on inputs of varying sizes. So we can choose some candidate functions for different runtimes:

  1. O(n) -> Take the sum of a list of integers
  2. O(n log n) -> Sort a list of values
  3. O(n^2) -> Take the intersection of two lists of length n
  4. O(n^3) -> Populate a "cubic" multiplication table

For each operation we can propose different sizes, and see what the times are. We'll do this with sum first. To start, let's write a function to generate a randomized list of integers, given a range and a size:

import Control.Monad (forM_, replicateM)
import System.Random (randomRIO)

-- Generate n random integers, each drawn uniformly from the given
-- (inclusive) range.
randomList :: (Int, Int) -> Int -> IO [Int]
randomList rng n = replicateM n (randomRIO rng)
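
Here replicateM just repeats the randomRIO action n times and collects the results. For instance, randomList (1, 100) 5 might give us something like [42, 7, 99, 13, 58] (your numbers will differ, of course, since the output is random).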

Now we'll generate random lists for many different sizes:

sizes :: [Int]
sizes = [10, 100, 1000, 10000, 100000, 1000000, 10000000]

main :: IO ()
main = do
  listsWithSizes <- zip sizes <$> mapM (randomList (1, 10000)) sizes
  ...

And now we just use our timeOp function from the last article on each list. We'll print out the time it takes, followed by the size:

import qualified System.IO.Strict as S

timeOp :: (NFData b) => (a -> b) -> a -> S.SIO NominalDiffTime
...

main :: IO ()
main = do
  listsWithSizes <- zip sizes <$> mapM (randomList (1, 10000)) sizes
  putStrLn "Sum"
  forM_ listsWithSizes $ \(sz, lst) -> do
    t <- S.run (timeOp sum lst)
    putStr (show t)
    putStrLn $ " (" <> show sz <> ")"

With these sizes, we get the following output:

Sum
0.000039492s (10)
0.000024104s (100)
0.000031986s (1000)
0.000257356s (10000)
0.001830953s (100000)
0.014561602s (1000000)
0.287729319s (10000000)

And so we can see that if our algorithm is O(n), we can get an answer in less than a second, even for extremely large input sizes! If we went up to 100 million, it would start taking multiple seconds.

Larger Size Examples

Obviously though, O(n) is generally the best-case scenario, and most problems don't have such a solution. So we can start trying more expensive algorithms. For O(n log n), we'll do sorting. Now that we have our template, it's easy to write this code:

import qualified Data.List as L

main :: IO ()
main = do
  listsWithSizes <- zip sizes <$> mapM (randomList (1, 10000)) sizes
  putStrLn "Sorting"
  forM_ listsWithSizes $ \(sz, lst) -> do
    t <- S.run (timeOp L.sort lst)
    putStr (show t)
    putStrLn $ " (" <> show sz <> ")"

And we get our results. With the same sizes, we see much longer times:

Sorting
0.000011084s (10)
0.000028496s (100)
0.000264609s (1000)
0.004021375s (10000)
0.086308336s (100000)
1.691808146s (1000000)
39.815963511s (10000000)

We can sort 10 million items in under a minute, so an O(n log n) algorithm on such a large input would probably be fine for something like Advent of Code, but not for Hackerrank. If your size is limited to one hundred thousand or smaller, such an algorithm will probably work on Hackerrank.

As we increase the complexity of the algorithm, we can't include as many test sizes, since the times become prohibitively long much earlier! Let's try list intersection, which, using the naive algorithm in Data.List, is quadratic. We'll only run up to 100000 items:

main :: IO ()
main = do
  listsWithSizes <- zip sizes <$> mapM (randomList (1, 10000)) (take 5 sizes)
  putStrLn "Intersect"
  forM_ listsWithSizes $ \(sz, lst) -> do
    -- Intersect the list with itself (a generous case; see below).
    t <- S.run (timeOp (L.intersect lst) lst)
    putStr (show t)
    putStrLn $ " (" <> show sz <> ")"

Here are the results:

Intersect
0.000009277s (10)
0.000092984s (100)
0.005050921s (1000)
0.396993992s (10000)
10.751072816s (100000)

So 10000 items might be doable. However, by intersecting a list with itself, we're being generous. The problem takes longer if the two lists are disjoint:

main :: IO ()
main = do
  listsWithSizes <- zip sizes <$> mapM (randomList (1, 10000)) (take 5 sizes)
  -- Draw the second lists from a disjoint range, so no elements match.
  altLists <- mapM (randomList (10001, 20000)) (take 5 sizes)
  putStrLn "Intersect"
  forM_ (zip listsWithSizes altLists) $ \((sz, lst), lst') -> do
    t <- S.run (timeOp (L.intersect lst) lst')
    putStr (show t)
    putStrLn $ " (" <> show sz <> ")"

Now it takes longer:

Intersect
0.000009074s (10)
0.000167983s (100)
0.010436625s (1000)
1.109146166s (10000)
215.779740028s (100000)

We're over a second for 10000, and 100000 is now quite prohibitive. Keep in mind, though, that this operation also has low constant factors. So while 10000 seems doable, a good general rule of thumb is that at this problem size, you should look for something faster than O(n^2).
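
As an aside that goes beyond the original benchmark (so treat this as a sketch): if intersection itself is the operation you need, you can beat O(n^2) by going through Data.Set, whose operations are logarithmic per element:

import qualified Data.Set as Set

-- Intersect two lists via sets: building each set is O(n log n), and
-- Set.intersection is cheaper still, so this avoids the O(n^2) cost of
-- L.intersect. Note it also deduplicates, unlike L.intersect.
setIntersect :: Ord a => [a] -> [a] -> [a]
setIntersect xs ys =
  Set.toList (Set.fromList xs `Set.intersection` Set.fromList ys)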

For one last example, we'll do a cubic algorithm. This function produces a cubic multiplication table:

cubicTable :: [Int] -> [((Int, Int, Int), Int)]
cubicTable ns = [((x, y, z), x * y * z) | x <- ns, y <- ns, z <- ns]
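
Note that the output itself has cubic size, which is a quick way to see the scale we're dealing with (simple arithmetic, not a measurement):

-- The table has n^3 entries:
-- length (cubicTable [1..10])  == 1000
-- length (cubicTable [1..200]) == 8000000   -- 8 million triples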

We have to shrink the sizes considerably for this to be even tractable! Let's just try 10, 100, and 200.

main :: IO ()
main = do
  listsWithSizes <- zip [10, 100, 200] <$> mapM (randomList (1, 10000)) [10, 100, 200]
  putStrLn "Cubic Table"
  forM_ listsWithSizes $ \(sz, lst) -> do
    t <- S.run (timeOp cubicTable lst)
    putStr (show t)
    putStrLn $ " (" <> show sz <> ")"

Already at size 200, we're taking several seconds.

Cubic Table
0.000397915s (10)
0.553257425s (100)
3.549556318s (200)

So if your problem size is larger than 100, you'll really need to find something better than a cubic algorithm.

With slower algorithms, it gets even worse. You may be able to get away with an exhaustive exponential or factorial algorithm, but only at sizes around 10 or below. These kinds of algorithms are rarely what programming problems test for, though, since they amount to the most brute-force thing you can do.
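
To see why the ceiling sits so low, it helps to count the raw amount of work (again, just arithmetic):

-- Exponential and factorial growth, concretely:
--   2^10 = 1024            (trivial)
--   2^30 ~= 1.07e9         (~10 seconds at 10^8 ops/sec)
--   10!  = 3628800         (still fine)
--   15!  ~= 1.3e12         (hopeless)
factorial :: Integer -> Integer
factorial n = product [1..n]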

Example: Fences Problem

Here's an example of applying this ballpark methodology. I run through this example in more detail in my Testing Series (including benchmarking). The problem is a Hackerrank question called "John and Fences", where we are essentially looking to calculate the largest rectangular area we can find amidst fence segments of varying heights.

The solution here is recursive. We find the minimum-height plank over an interval. We then either take that height multiplied by the length of the interval, or we recurse strictly left of that minimum plank and strictly right of it. We return the maximum of these three options.

At each recursive step, we need to keep finding the minimum over a particular interval. Naively, this takes O(n) time, and we may have to do this step O(n) times, giving an O(n^2) algorithm.
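
Here is a minimal sketch of that naive recursion (my own illustration, assuming the heights live in a Vector; see the series for the real solution):

import Data.List (minimumBy)
import Data.Ord (comparing)
import qualified Data.Vector as V

-- Naive O(n^2) version: a linear scan finds the index of the minimum
-- plank in [lo, hi), then we recurse strictly left and right of it.
largestRect :: V.Vector Int -> Int
largestRect heights = go 0 (V.length heights)
  where
    go lo hi
      | lo >= hi = 0
      | otherwise =
          let idx = minimumBy (comparing (heights V.!)) [lo .. hi - 1]
              whole = (heights V.! idx) * (hi - lo)
          in maximum [whole, go lo idx, go (idx + 1) hi]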

However, we can observe the constraints on the problem, and find that n could be as high as 10000.

This is our clue that O(n^2) may not be fast enough. And indeed, to get full credit, you need to find an algorithm that is O(n log n). You can do this with a Segment Tree, which lets you calculate the minimum over any interval in your fence in logarithmic time. You can read the series for a complete solution.
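
To give a flavor of how that works (a sketch under my own design assumptions, not the series' exact code), each node of the tree covers an interval [lo, hi) and caches the minimum value in that interval along with its index:

import qualified Data.Vector as V

-- Each node caches (minimum value, its index) for its interval, so the
-- recursion above can find its split point in O(log n) instead of O(n).
data MinTree = MinTree
  { mtLo       :: Int                     -- interval is [mtLo, mtHi)
  , mtHi       :: Int
  , mtMin      :: (Int, Int)              -- (value, index) of the minimum
  , mtChildren :: Maybe (MinTree, MinTree)
  }

buildTree :: V.Vector Int -> Int -> Int -> MinTree
buildTree v lo hi
  | hi - lo == 1 = MinTree lo hi (v V.! lo, lo) Nothing
  | otherwise =
      let mid = (lo + hi) `div` 2
          l = buildTree v lo mid
          r = buildTree v mid hi
      in MinTree lo hi (min (mtMin l) (mtMin r)) (Just (l, r))

-- Minimum (value, index) over the query interval [qlo, qhi).
queryMin :: MinTree -> Int -> Int -> (Int, Int)
queryMin t qlo qhi
  | qhi <= mtLo t || mtHi t <= qlo = (maxBound, -1)  -- disjoint: neutral value
  | qlo <= mtLo t && mtHi t <= qhi = mtMin t         -- fully covered
  | otherwise = case mtChildren t of
      Just (l, r) -> min (queryMin l qlo qhi) (queryMin r qlo qhi)
      Nothing     -> mtMin t  -- a leaf is always disjoint or covered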

The power of ballparking is when you realize this fact before you start writing your code, skipping the slow implementation altogether.

Conclusion

For more tips and content on problem solving, make sure to subscribe to our mailing list! There's a special deal coming out next week that you won't want to miss!
