Robotics & Parsing MCAP

Nov 24

For the last 3 weeks, we’ve been learning about parsing in Haskell, particularly using the Megaparsec library. We wrote a basic HTTP server that parsed the request information from scratch and produced a response. We learned valuable lessons like how to use documentation to inform our parser design, and how to build smaller parser combinators into larger pieces.

At the end of the day though, HTTP is a fairly simple protocol. Today, we’re going to start a new parsing series with a bigger objective. We’re going to learn to parse an MCAP file as part of a ROS (Robotics) program. While the general rules of this system are still fairly simple, there’s a lot more pieces of the puzzle, and we’ll really see how Megaparsec helps us put them together!

To learn how to parse in Haskell from the ground up, you should take a look at our course Solve.hs! You’ll learn tools for writing simple parsers yourself, and then you’ll master Megaparsec!

What is ROS?

Before we dive into any code, let’s give ourselves a little more background on the data we’re parsing. In the intro I mentioned ROS, the Robot Operating System. This is a set of libraries that help you to write robotics applications. The key elements of ROS consist of facilitating communication between different processes over a shared protocol.

For example, a robot might run separate processes for each hardware device, like camera sensors and the actuators that actually move it. Then there are separate processes for internal software components that do things like route planning or controls modeling. All these processes have to communicate and share information, and ROS provides a lot of tools to help coordinate this effort so the whole robot works smoothly.

Over the last several years, ROS has “upgraded” from “ROS 1” to “ROS 2”. ROS 2 arose out of a need to write new components and performance improvements that couldn’t be made backwards compatible with ROS 1. At this point, ROS 1 is reaching the end of its support lifecycle, so if you’re just starting out you should learn ROS 2!

While this is the context of the data we’re parsing, we won’t focus on the actual ROS program in this series, since we can’t really use Haskell for it yet! You can read the docs to learn more about the (tricky) setup process.

ROS Messages

Now the primary communication model for ROS is Pub/Sub. Certain processes can “publish” messages to a particular topic, and other processes can “subscribe” to messages on this topic. Each message has a type, which we define in a message file. These are fairly simple. Each line defines a “field” of the message, with a type and field name. Here’s a very simple example of a message file:

# Simple.msg
int64 num
float64 value
string name

The ROS libraries include ways to translate these message definitions into types and classes for languages like C++ and Python and translate all the relevant code. If we were to define a Haskell type for this message, it might look like this:

data Simple = Simple
  { num :: Int64
  , value :: Double
  , name :: String
  }

More complicated types are possible. You can refer to other messages you’ve defined, and you can also define arrays. However, maps and tuples are not allowed:

# Complex.msg
double[] measurements
Simple msg1
Simple msg2

So the way our system works, we might have one process that produces objects of type Simple and publishes these to a topic /simple_msgs. Then another node might subscribe to that topic, and produce another topic /complex_msgs using the type Complex.

Bag Recording and MCAP

Now when you have a robotics system sending all these messages back and forth, you’ll want to record the messages so that you can analyze and debug your program. You may also wish to “replay” certain messages from a previous recording while running a subset of your system to see if code modifications in that subset had the desired effect. A file with a recording of ROS messages is often called a “bag”.

In ROS1, there was a bespoke format for these bags. But ROS2 developers initially chose to use the SQLite file format as a backend. This is nice because many languages (including Haskell!) already have libraries for dealing with SQLite. Parsing an SQLite file from scratch is a very interesting challenge, but not one we’ll do right now.

The developers found though that SQLite didn’t give the best performance in a lot of common robotics situations, so a few years ago started working on MCAP (short for Motion CAPture) as a new format for ROS2. MCAP more closely resembles the ROS1 bag format, but contains additional optimization possibilities. It has only recently been adapted as the “default” format for ROS2 projects.

Our Goal

Our primary objective will be to parse messages out of an MCAP bag using Megaparsec in Haskell. As we’ll see, this will actually involve multiple layers of parsing. We’ll first want to parse the bytes into a “Record” structure, and then we’ll need to do some more parsing of the records to get our topic data out of it.

As with HTTP, MCAP has a very helpful schema that can help guide our development. For example, it has a Serialization section that focuses on parsing basic types. To complete this first part of our series, we’ll write parsers for all the basic types we’ll need in our program. Then we’ll be ready to go next time when we start parsing the records themselves!

Basic Type Parsers

Before we jump in, let’s define the same type alias we had with our HTTP Parser, so that our types are cleaner:

import Text.Megaparsec
import Data.ByteStrint.Lazy (ByteString)
import Data.Void

type Parser a = ParserT Void ByteString IO a

Fixed-Width Integers

Our first order of business is parsing fixed-width integer types. Sometimes the specification tells us to parse a 32-bit integer, but sometimes it’s 64-bit or 16-bit. But regardless, the MCAP specification always uses little-endian byte ordering. For our purposes, we’ll also only be parsing unsigned values. For the simplest understanding of the concept, let’s start by writing a 16-bit integer parser.

import Data.Bits (shift)
import Data.Word

parseUint16LE :: Parser Word16
parseUint16LE = do
  b1 <- fromIntegral <$> anySingle
  b2 <- fromIntegral <$> anySingle
  return $ (shift b2 8) + b1

Each call to anySingle gives us a “token” of a bytestring, which is a Word8. We use fromIntegral here to cast both of these values to Word16. Then we want to add them together, except that we shift the second value 8 bits to the left. With a big-endian order, we would instead shift b1.

We could continue this pattern with the larger integer types. For example, we could parse 4 bytes for a 32-bit integer:

parseUint16LE :: Parser Word16
parseUint16LE = do
  b1 <- fromIntegral <$> anySingle
  b2 <- fromIntegral <$> anySingle
  b3 <- fromIntegral <$> anySingle
  b4 <- fromIntegral <$> anySingle
  return $ (shift b4 24) + (shift b3 16) + (shift b2 8) + b1

This gets a little tedious, so we could actually write a general parser that takes the number of bytes, and parses that number of bytes. This relies on a “fold” that goes through an index for each of the bytes we want to parse:

{-# LANGUAGE ScopedTypeVariables #-}

parseUintLE :: (Bits a, Integral a) => Int -> Parser a
parseUintLE numBytes = foldM f 0 [0..(numBytes - 1)]
  where
    f :: (Num a, Bits a) => a -> Int -> Parser a
    f prevSum i = ...

In our folding function, we receive the previous sum, and the next “index” we are parsing. We’ll call anySingle to get the next byte, shift it by the index (multiplied by 8) and then add it to the previous sum.

parseUintLE :: (Bits a, Integral a) => Int -> Parser a
parseUintLE numBytes = foldM f 0 [0..(numBytes - 1)]
  where
    f :: (Num a, Bits a) => a -> Int -> Parser a
    f prevSum i = do
      (next :: Word8) <- anySingle
      return $ shift (fromIntegral next) (8 * i) + prevSum

Now we get 3 parsers for the price of 1!

parseUint64LE :: Parser Word64
parseUint64LE = parseUintLE 8

parseUint32LE :: Parser Word32
parseUint32LE = parseUintLE 4

parseUint16LE :: Parser Word16
parseUint16LE = parseUintLE 2

As a final flourish, we will write this parser to also return the number of bytes parsed. This seems redundant with these fixed-width parsers, but we’ll discuss why this is important in the next sections.

parseUintLE :: (Bits a, Integral a) => Int -> Parser (Word64, a)
parseUintLE numBytes = do
  res <- foldM f 0 [0..(numBytes - 1)]
  return (fromIntegral numBytes, res)
  where
    f = ...


parseUint64LE :: Parser (Word64, Word64)
parseUint64LE = parseUintLE 8

parseUint32LE :: Parser (Word64, Word32)
parseUint32LE = parseUintLE 4

parseUint16LE :: Parser (Word64, Word16)
parseUint16LE = parseUintLE 2

As a final note, the “timestamp” is serialized as a Word64:

parseTimestamp :: Parser (Word64, Word64)
parseTimestamp = parseUint64LE

Strings

The next primitive type we’ll parse the string. Unlike integers, strings do not have a fixed size. Here is how the spec tells us to parse a string:

<byte length><utf-8 bytes>

The byte length in this case is a 32-bit integer which tells us the number of bytes that are in the string. Knowing this, it’s not difficult to write this parser, while also returning the number of bytes parsed:

import qualified Data.ByteString.Lazy as BS

parseString :: Parser (Word64, ByteString)
parseString = do
  (n, len) <- parseUint32LE
  str <- BS.pack <$> count (fromIntegral len) anySingle
  return (n + fromIntegral len, str)

We use the count monad combinator to get a specific number of bytes. Using count with anySingle will give us a list of Word8 values, so we pack them into a ByteString. When we return our final count, we include the bytes we parsed as part of the length.

Tuples

Now we start getting to compound parsers, which are parametric and returned more structured data. The first is the tuple. There’s nothing special in the specification here. We just parse one value after the other.

<first value><second value>

We’ll represent a tuple parser by passing in the Parser arguments for each type. It’s simple to construct once we do this:

parseTuple :: Parser (Word64, a) -> Parser (Word64, b) -> Parser (Word64, (a, b))
parseTuple parseA parseB = do
  (aLen, a) <- parseA
  (bLen, b) <- parseB
  return (aLen + bLen, (a, b))

A more advanced form of this and the other compound parsers might involve building a typeclass representing a parsable type. We’ll skip this idea for now.

Guards

The next couple parsers will start presenting us with the possibility of validation errors. We’ll want to proactively catch these so that we don’t have to spend a lot of time deciphering records with nonsensical information in them. So let’s write a generic function that we can use to make validation checks along the way.

Here’s the idea. Our function will take a string as the “debug message” and a boolean condition. If the condition is not met, we’ll “fail” the parser and use the message. If the condition passes, we’ll do nothing. Since ParsecT uses MonadFail, we don’t have to do anything strange with throwing and catching errors.

import Control.Monad (unless)

guard’ :: String -> Bool -> Parser ()
guard’ dbg cond = unless cond $ fail dbg

Let’s go back to our basic types and see how we can use this!

Arrays

Arrays are where we’ll finally see the importance of tracking the number of bytes parsed. Here’s the specification for an array of items:

<byte length><serialized element><serialized element>...

We begin with a 32-bit size, and then we have a series of elements. We parse elements until we reach the given number of bytes. As we’ll see, we have to watch for a mismatch in the byte count! We’ll start our parser by parsing this length and setting up a helper while-loop function to parse until we run out of bytes:

parseArray :: Parser (Word64, a) -> Parser (Word64, [a])
parseArray parser = do
  (countLen, byteCount) <- parseUint32LE
  let byteCount64 = fromIntegral byteCount -- Cast to Word64
  vs <- f byteCount64 []
  return (countLen + byteCount64, vs)
  where
    f :: Word64 -> [a] -> Parser [a]
    f = ...

The first argument of our helper is the remaining byte count, and the second is the accumulated values. Once the count reaches 0, we should return our accumulated values. Otherwise, we’ll parse the next item in the array:

parseArray :: Parser (Word64, a) -> Parser (Word64, [a])
parseArray parser = do
  ...
  where
    f :: Word64 -> [a] -> Parser [a]
    f 0 acc = return $ reverse acc
    f rem acc = do
      (nextLen, next) <- parser
      ...

Here’s where we should do some validation though. We want to ensure that the bytes we parsed getting this element do not exceed the remaining value. If it does, we should fail, because something has gone wrong, so we’ll add a guard’ check. Otherwise, we can append this to the list and continue, subtracting this parsed length from the remainder. Here’s the complete function:

parseArray :: Parser (Word64, a) -> Parser (Word64, [a])
parseArray parser = do
  (countLen, byteCount) <- parseUint32LE
  let byteCount64 = fromIntegral byteCount -- Cast to Word64
  vs <- f byteCount64 []
  return (countLen + byteCount64, vs)
  where
    f :: Word64 -> [a] -> Parser [a]
    f 0 acc = return $ reverse acc
    f rem acc = do
      (nextLen, next) <- parser
      guard’ (“Exceeded bytes parsing array: “ <> show nextLen <> “ “ <> show rem) (nextLen <= rem)
      f (rem - nextLen) (next : acc)

Keep in mind, we must do the check before subtracting! If we subtract first and compare against 0, we won’t fail the check! These are unsigned values, so it will wrap around to a large positive value!

Maps

The final basic type we’ll parse is the map. Maps are similar to arrays, except they have keys and values. Here’s the specification:

<byte length><key><value><key><value>...

We can do a fun trick here! If we treat each key/value pair as a tuple, then we can simply parse a map as an array of these tuples and then turn them into a map! So we’ll just use our existing functions.

import Data.Hashable
import qualified Data.HashMap.Lazy as HM

parseMap :: (Hashable k) => Parser (Word64, k) -> Parser (Word64, v) -> Parser (Word64, HM.HashMap k v)
parseMap parseK parseV = do
  let parseKV = parseTuple parseK parseV
  (len, kvs) <- parseArray parseKV
  return (len, HM.fromList kvs)

Very nice to see how we can combine our parsers in this way!

Conclusion

This is only the first step in parsing a complete MCAP bag. We’ve dealt with the simple primitives, but now we have to build these up to parse records within the bag file. We’ll start looking at records and the overall file structure next time.

For now, you should learn more about Megaparsec by taking our Solve.hs course! Module 4 will teach you how to parse in Haskell from the ground up. You’ll learn how to use basic library functions, build your own parser, use regular expressions, and of course get a systematic overview of Megaparsec. Try it out today!

James Bowen