James Bowen

Enemies, Convolution and Pooling

convolution_small.png

Let's remember that last week we were still working with the idea of supervised learning. We were trying to get our agent to reach the goal in an empty maze, using our own moves as a guide. Since the problem was so simple, we could train an agent that consistently navigated the maze.

But what happens if we introduce enemies into the maze? This will complicate the process. Our manual AI isn't actually that successful in this scenario. But recording our own moves provides a reasonable data set.

First, we need to update the features a little bit. Recall that we're representing our board so that every grid space has a single feature. In our old representation, we used a value of 100 to represent the target space, and 10 to represent our own space. We need to expand this representation. We want to know where the enemies are, if they're stunned, and if we still have our stun.

Let's use 25.0 to represent our position if we have the stun. If we don't have our stun available, we'll use 10.0 instead. For positions containing active enemies, we'll use -25.0. If the enemy is in the stunned state, we'll use -10.0.

vectorizeWorld :: World -> V.Vector Float
vectorizeWorld w = finalFeatures
  where
    initialGrid = V.fromList $ take 100 (repeat 0.0)
    (px, py) = playerLocation (worldPlayer w)
    (gx, gy) = endLocation w
    playerLocIndex = (9 - py) * 10 + px
    playerHasStun = playerCurrentStunDelay (worldPlayer w) == 0
    goalLocIndex = (9 - gy) * 10 + gx
    finalFeatures = initialGrid V.//
      ([ (playerLocIndex
       , if playerHasStun then 25.0 else 10.0)
       , (goalLocIndex, 100.0)] ++ enemyLocationUpdates)

    enemyLocationUpdates = enemyPair <$> (worldEnemies w)

    enemyPair e =
      let (ex, ey) = enemyLocation e
      in  ( (9 - ey) * 10 + ex,
             if enemyCurrentStunTimer e == 0 then -25.0 else -10.0)

Using our moves as a training set, we see some interesting results. It's difficult to get great results on our error rates. The data set is very prone to overfitting. We often end up with training error in the 20-30% range with a test error near 50%. And yet, our agent can consistently win the game! Our problem space is still a bit simplistic, but it is an encouraging result!

Convolution and Pooling

As suggested last week, the ideas of convolution and pooling could be useful on this feature set. Let's try them out as an experiment. We'll start with a function to create a layer in our network that does convolution and pooling.

buildConvPoolLayer :: Int64 -> Int64 -> Tensor v Float -> Build (Variable Float, Variable Float, Tensor Build Float)
buildConvPoolLayer inputChannels outputChannels input = do
  weights <- truncatedNormal (vector weightsShape)
    >>= initializedVariable
  bias <- truncatedNormal (vector [outputChannels])
    >>= initializedVariable
  let conv = conv2D' convAttrs input (readValue weights)
               `add` readValue bias
  let results = maxPool' poolAttrs (relu conv)
  return (weights, bias, results)
  where
    weightsShape :: [Int64]
    weightsShape = [5, 5, inputChannels, outputChannels]

    -- Create "attributes" for our operations
    -- E.g. How far does convolution slide?
    convStridesAttr = opAttr "strides" .~ ([1,1,1,1] :: [Int64])
    poolStridesAttr = opAttr "strides" .~ ([1,2,2,1] :: [Int64])
    poolKSizeAttr = opAttr "ksize" .~ ([1,2,2,1] :: [Int64])
    paddingAttr = opAttr "padding" .~ ("SAME" :: ByteString)
    dataFormatAttr = opAttr "data_format" .~ ("NHWC" :: ByteString)
    convAttrs = convStridesAttr . paddingAttr . dataFormatAttr
    poolAttrs = poolKSizeAttr . poolStridesAttr . paddingAttr . dataFormatAttr

The input to this layer will be our 10x10 "image" of a grid. The output will be 5x5x32. The "5" dimensions come from halving the board dimension. The "32" will come from the number of channels. We'll replace our first neural network layer with this layer:

createModel :: Build Model
createModel = do
  let batchSize = -1
  let (gridDimen :: Int64) = 10
  let inputChannels = 1
  let (convChannels :: Int64) = 32
  let (nnInputSize :: Int64) = 5 * 5 * 32

  (inputs :: Tensor Value Float) <-
    placeholder [batchSize, moveFeatures]
  (outputs :: Tensor Value Int64) <-
    placeholder [batchSize]

  let convDimens = [batchSize, gridDimen, gridDimen, inputChannels]
  let inputAsGrid = reshape inputs (vector convDimens)
  (convWeights, convBias, convResults) <-
    buildConvPoolLayer inputChannels convChannels inputAsGrid

  let nnInput = reshape
          convResults (vector [batchSize, nnInputSize])
  (nnWeights, nnBias, nnResults) <-
    buildNNLayer nnInputSize moveLabels nnInput

  (actualOutput :: Tensor Value Int64) <- render $
    argMax nnResults (scalar (1 :: Int64))

We have a bit less success here. Our training is slower, and we get a similar error rate. But now our agent doesn't seem to win, unlike the agent from the pure, dense network. So it's not clear that we can actually treat this grid as an image and make any degree of progress.

What's Next?

Over the course of creating these different algorithms, we've discretized the game. We've broken it down into two phases: one where we move, and one where the enemies can move. This will make it easier for us to move back to our evaluation function approach. We can even try using multi-move look-ahead. This could lead us to a position where we can try temporal difference learning. Researchers first used this approach to train an agent to play Backgammon at a very high level.

Temporal difference learning is a very interesting concept. It's the last approach we'll try with our agent. We'll start out next time with an overview of TD learning before we jump into coding.

For another look at using Haskell with AI problems, read our Haskell and AI series! You can also download our Haskell Tensor Flow Guide for some more help with this library!

James Bowen

Different Feature Schemes

new_features.jpg

In last week's edition of our Maze AI series, we explored the mechanics of supervised learning. We took the training data we'd been building up and trained an agent on it. We had one set of data to make the AI follow our own human moves, and another to follow our hand-crafted AI. This wasn't particularly successful. The resulting agent had a lot of difficulty navigating the maze and using its stun at the right times.

This week, we'll explore a couple different ways we can expand the feature set. First, we'll try encoding the legality of moves in our feature set. Second, we'll try expanding the feature set to include more data about the grid. This will motivate some other approaches to the problem. We'll conclude by taking the specifics of grid navigation out. We'll let our agent go to work on an empty grid to validate that this is at least a reasonable approach.

For some more reading on using Haskell and AI, take a look at our Haskell AI Series. We explore some reasons why Haskell could be a good fit for AI and machine learning problems. It will also help you through some of the basics of using Haskell and Tensor Flow.

Our supervised agent uses our current feature set. Let's remind ourselves what these features are. We have five different directions we can go (up, down, left, right, stand still). And in each of these directions, we calculate 8 different features, for 40 features in total:

  1. The maze distance to the goal
  2. The Manhattan distance to the goal
  3. Whether the location contains an active enemy
  4. The number of enemies on the shortest path to the goal from that location
  5. The distance to the nearest enemy from that location
  6. The number of nearby enemies in Manhattan distance terms
  7. Whether our stun is available
  8. The number of drills we have after the move

Some of these features are higher level. We do non-trivial calculations to figure them out. This gives our agent some idea of strategy. But there's not a ton of lower level information available! We zero out the features for a particular spot if it's past the world boundary. But we can't immediately tell from these features if a particular move is legal.

This is a big oversight. It's possible for our AI to learn about the legality of moves from the higher level training data. But it would take a lot more data and a lot more time.

So let's add a feature for how "easy" a move is. A value of 0 will indicate an illegal move, either past the world boundary or through a wall when we don't have a drill. A value of 1 will indicate a move that requires a drill. A value of 2 will indicate a normal move.

We'll add the extra feature into the LocationFeatures type. We'll also add an extra parameter to our produceLocationFeatures function. This boolean indicates whether a drill would be necessary. Note, we don't need to account for WorldBoundary. The value will get zeroed out in that case. We'll call this feature lfMoveEase since a higher value indicates less effort.

data LocationFeatures = LocationFeatures
  { …
  , lfMoveEase :: Int
  }

produceLocationFeatures :: Location -> World -> Bool -> LocationFeatures
produceLocationFeatures location@(lx, ly) w needsDrill = LocationFeatures
  …
  moveEase
  where
    moveEase = if not needsDrill then 2
      else if drillsRemaining > 0 then 1 else 0
    drillsRemaining = playerDrillsRemaining (worldPlayer w)

It's easy to add the extra parameter to the function call in produceWorldFeatures, since we already use case statements on the boundary types there.
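As a rough sketch, the call site might look something like this. We're assuming a BoundaryType along the lines of WorldBoundary, Wall, and AdjacentCell; only WorldBoundary appears above, so treat the other constructor names as placeholders. A wall means the move would need the drill:

locationFeaturesForBoundary :: World -> BoundaryType -> Maybe LocationFeatures
locationFeaturesForBoundary w boundary = case boundary of
  WorldBoundary -> Nothing
  Wall cell -> Just (produceLocationFeatures cell w True)
  AdjacentCell cell -> Just (produceLocationFeatures cell w False)

Now we need to account for the new feature when vectorizing our world: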

vectorizeWorld :: World -> V.Vector Float
vectorizeWorld w = V.fromList (fromIntegral <$>
  [ ...
  , lfMoveEase standStill
  ...
  , zeroIfNull (lfMoveEase <$> up)
  ...
  , zeroIfNull (lfMoveEase <$> right)
  ...
  , zeroIfNull (lfMoveEase <$> down)
  ...
  , zeroIfNull (lfMoveEase <$> left)
  ])

When we train with this feature set, we actually get a good training error, down to around 10%. Thus it can learn our data a bit better. Yet the agent still can't navigate the maze correctly.

Expanding the Feature Set

Another option we can try is to serialize the world in a more raw state. We currently use more strategic features. But what about using the information on the board?

Here's a different way to look at it. Let's fix it so that the grid must be 10x10, there must be 2 enemies, and we must start with 2 drill powerups on the map. Let's get these features about the world:

  1. 100 features for the grid cells. Each feature will be the integer value corresponding to the wall-shape of that cell. These are hexadecimal, like we have when serializing the maze.
  2. 4 features for the player. Get the X and Y coordinates for the position, the current stun delay, and the number of drills remaining.
  3. 3 features for each enemy. Again, X and Y coordinates, as well as a stun timer.
  4. 2 coordinate features for each drill location. Once a drill gets taken, we'll use -1 and -1.

This will give us a total of 114 features. Here's how it breaks down.

vectorizeWorld :: World -> V.Vector Float
vectorizeWorld w = gridFeatures V.++ playerFeatures V.++
                     enemyFeatures V.++ drillFeatures
  where

    -- 1. Features for the Grid
    mazeStr = Data.Text.unpack $ dumpMaze (worldBoundaries w)
    gridFeatures = V.fromList $
      fromIntegral <$> digitToInt <$> mazeStr

    player = worldPlayer w
    enemies = worldEnemies w

    -- 2. Features for the player
    playerFeatures = V.fromList $ fromIntegral <$>
      [ fst . playerLocation $ player
      , snd . playerLocation $ player
      , fromIntegral $ playerCurrentStunDelay player
      , fromIntegral $ playerDrillsRemaining player
      ]

    -- 3. Features for the two enemies
    enemy1 = worldEnemies w !! 0
    enemy2 = worldEnemies w !! 1
    enemyFeatures = V.fromList $ fromIntegral <$>
      [ fst . enemyLocation $ enemy1
      , snd . enemyLocation $ enemy1
      , fromIntegral $ enemyCurrentStunTimer enemy1
      , fst . enemyLocation $ enemy2
      , snd . enemyLocation $ enemy2
      , fromIntegral $ enemyCurrentStunTimer enemy2
      ]

    -- 4. Features for the drill locations
    drills = worldDrillPowerUpLocations w
    drillFeatures = V.fromList $ fromIntegral <$>
      if length drills == 0 then [-1, -1, -1, -1]
        else if length drills == 1
          then [fst (head drills), snd (head drills), -1, -1]
          else [ fst (head drills), snd (head drills)
               , fst (drills !! 1), snd (drills !! 1)
               ]

As an optimization, we can make the grid features part of the world since they will not change.

Still though, our model struggles to complete the maze when training off this data. Compared to the high-level features, the model doesn't even learn very well. We get training errors around 25-30%, but a test error close to 50%. With more data and time, our model might be able to draw the connection between various features.

We could attempt to make our model more sophisticated. We're working with grid data, which is a little like an image. Image processing algorithms use concepts such as convolution and pooling. This allows them to derive patterns arising from how the grid actually looks. We're only looking at the data as a flat vector.

It's unlikely though that convolution and pooling would help us with this feature set. Our secondary features don't fit into the grid. So we would actually want to add them in at a later stage in the process. Besides, we won't get that much data from taking the average value or the max value in a 2x2 segment of the maze. (This is what pooling does).

If we simplify the problem though, we might find a situation where they'll help.

A Simpler Problem

We're having a lot of difficulty getting our agent to navigate the maze. So let's throw away the problem of walls and enemies for a second. Can we train an agent that will navigate an empty maze? This should be doable.

Let's start with a bare bones feature set with the goal and current location highlighted in a grid. We'll give a value of 10 for our player's location, and a value of 100 for the target location. We start with a vector of all zeros, and use Vector.// to modify the proper values:

vectorizeWorld :: World -> V.Vector Float
vectorizeWorld w = finalFeatures
  where
    initialGrid = V.fromList $ take 100 (repeat 0.0)
    (px, py) = playerLocation (worldPlayer w)
    (gx, gy) = endLocation w
    playerLocIndex = (9 - py) * 10 + px
    goalLocIndex = (9 - gy) * 10 + gx
    finalFeatures = initialGrid V.//
      [(playerLocIndex, 10.0), (goalLocIndex, 100.0)]

Our AI bot will always follow the same path in this grid, so it will be quite easy for our agent to learn this path! Even if we use our own moves and vary the path a little bit, the agent can still learn it. It'll achieve 100% accuracy on the AI data. It can't get that high on our data, since we might choose different moves for different squares. But we can still train it so it wins every time.

Conclusion

So our results are still not looking great. But next week we'll take this last idea and run a little further with it. We'll keep it so that our features only come from the grid itself. But we'll add a few more complications with enemies. We might find that convolution and pooling are useful in that case.

If you're interested in using Haskell for AI but don't know where to start, read our Haskell AI Series! We discuss some important ideas like why Haskell is a good AI language. We also get into the basics of Tensor Flow with Haskell.

James Bowen

Using Our Data with Supervised Learning

supervised_learning.png

Our aim these last couple weeks has been to try a supervised learning approach to our game. In last week's article we gathered training data for playing the game. We had two different sources. First, we played the game ourselves and recorded our moves. Second, we let our AI play the game and recorded it. This gave us a few CSV files. Each line in these is a record containing the 40 "features" of the game board at the time and the move we chose.

This week, we're going to explore how to build a machine-learning agent based on this data. This will use supervised learning techniques. Look at the supervised-learning branch on our Github repository for more details.

To get started with Haskell and Tensor Flow, download our Haskell Tensor Flow Guide. This library is a little tricky to work with, so you want to make sure you know what you're doing!

Defining Our Model

For our supervised model, we're going to use a fully connected neural network with a single hidden layer. We'll have 40 input features, 100 hidden units, and then our 10 output values for the different move scores. We'll be following a very similar pattern to one we explored in this older article, using the basic Iris data set. We'll copy a lot of code from that article. We won't go over a lot of the helper code in this article, so feel free to check out that one for some help with that!
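Before diving in, let's pin down the sizes from that description as constants. These are the names the code below uses; we're assuming they're defined as simple top-level values along these lines:

moveFeatures, hiddenUnits, moveLabels :: Int64
moveFeatures = 40
hiddenUnits = 100
moveLabels = 10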

We define each layer with a "weights" matrix and a "bias" vector. We multiply the input by the weights and then add the bias vector. Let's explore how we can build a single layer of the network. This will take the input and output size, as well as the input tensor. It will have three results. One variable for the weights, one for the biases, and then a final "output" tensor:

buildNNLayer :: Int64 -> Int64 -> Tensor v Float
  -> Build (Variable Float, Variable Float, Tensor Build Float)

The definition is pretty simple. We'll initialize random variables for the weights and bias. We'll produce the result tensor by multiplying by the weights and adding the bias.

buildNNLayer :: Int64 -> Int64 -> Tensor v Float
  -> Build (Variable Float, Variable Float, Tensor Build Float)
buildNNLayer inputSize outputSize input = do
  weights <- truncatedNormal (vector [inputSize, outputSize])
    >>= initializedVariable
  bias <- truncatedNormal (vector [outputSize])
    >>= initializedVariable
  let results = (input `matMul` readValue weights)
        `add` readValue bias
  return (weights, bias, results)

Now that we understand the layers a little better, it's easier to define our model. First, we'll want to include both sets of weights and biases in the model, so we can output them later:

data Model = Model
  { w1 :: Variable Float
  , b1 :: Variable Float
  , w2 :: Variable Float
  , b2 :: Variable Float
  ...
  }

Now we want two different "steps" we can run. The training step will take a batch of data and determine what our network produces for the inputs. It will compare our network's output with the expected output. Then it will train to minimize the loss function. The error rate step will simply produce the error rate on the given data. That is, it will tell us what percentage of the moves we are getting correct. Both of these will be Session actions that take two inputs. First, the TensorData for the features, and then the TensorData for the correct moves:

data Model = Model
  { 
  … -- Weights and biases
  , train :: TensorData Float
          -> TensorData Int64
          -> Session ()
  , errorRate :: TensorData Float
              -> TensorData Int64
              -> Session (V.Vector Float) -- Produces a single number
  }

Let's see how we put this all together.

Building Our Model

To start, let's make placeholders for our input features and expected output results. A dimension of -1 means we can provide any size we like:

createModel :: Build Model
createModel = do
  let batchSize = -1
  (inputs :: Tensor Value Float) <-
    placeholder [batchSize, moveFeatures]
  (outputs :: Tensor Value Int64) <-
    placeholder [batchSize]
  ...

Now we build the layers of our neural network using our helper. We'll apply relu, an activation function, on the results of our hidden layer. This helps our model deal with interaction effects and non-linearities:

createModel :: Build Model
createModel = do
  ...
  (hiddenWeights, hiddenBiases, hiddenResults) <-
    buildNNLayer moveFeatures hiddenUnits inputs
  let rectifiedHiddenResults = relu hiddenResults
  (finalWeights, finalBiases, finalResults) <-
    buildNNLayer hiddenUnits moveLabels rectifiedHiddenResults
  ...

Now to get our error rate, we need a couple steps. We'll get the best move from each predicted result using argMax. We can then compare these to the training data using equal. By using reduceMean we'll get the percentage of our moves that match. Subtracting this from 1 gives our error rate:

createModel :: Build Model
createModel = do
  ...
  (actualOutput :: Tensor Value Int64) <- render $
    argMax finalResults (scalar (1 :: Int64))
  let (correctPredictions :: Tensor Build Float) = cast $
        equal actualOutput outputs
  (errorRate_ :: Tensor Value Float) <- render $
    1 - (reduceMean correctPredictions)

Now we need our training step. We'll compare our network's outputs against the expected outputs to get a loss value. This involves the softmaxCrossEntropyWithLogits function. We train our model by selecting our variables for training, and using minimizeWith. This will update the variables to reduce the value of the loss function:

createModel :: Build Model
createModel = do
  ...
  let outputVectors = oneHot outputs (fromIntegral moveLabels) 1 0
  let loss = reduceMean $ fst $
        softmaxCrossEntropyWithLogits finalResults outputVectors
  let params =
        [hiddenWeights, hiddenBiases, finalWeights, finalBiases]
  train_ <- minimizeWith adam loss params
  ...

We conclude by creating our train and errorRate functions. These take the tensor data as parameters. Then they use runWithFeeds to put the data into our placeholders:

createModel :: Build Model
createModel = do
  ...
  return $ Model
    { train = \inputFeed outputFeed ->
        runWithFeeds
          [ feed inputs inputFeed
          , feed outputs outputFeed
          ]
          train_
    , errorRate = \inputFeed outputFeed ->
        runWithFeeds
          [ feed inputs inputFeed
          , feed outputs outputFeed
          ]
          errorRate_
    , w1 = hiddenWeights
    , b1 = hiddenBiases
    , w2 = finalWeights
    , b2 = finalBiases
    }

Running Our Tests

Now let's run our tests. We'll read the move record data from the file, shuffle them, and set aside a certain proportion as our test set. Then we'll build our model:

runTraining totalFile = runSession $ do
  initialRecords <- liftIO $ readRecordFromFile totalFile
  shuffledRecords <- liftIO $ shuffleM (V.toList initialRecords)
  let testRecords = V.fromList $ take 2000 shuffledRecords
  let trainingRecords = V.fromList $ drop 2000 shuffledRecords
  model <- build createModel
  ...

Then we run our iterations (we'll do 50000, as an example). We select some random records (100 per batch), and then convert them to data. Then we run our train step. Finally, every 100 iterations or so, we'll get a gauge of the training error on this set. This involves the errorRate step. Note our error rate returns a vector with a single wrapped value. So we need to unwrap it with !.

runTraining totalFile = runSession $ do
  ...
  forM_ ([0..50000] :: [Int]) $ \i -> do
    trainingSample <- liftIO $ chooseRandomRecords trainingRecords
    let (trainingInputs, trainingOutputs) =
          convertRecordsToTensorData trainingSample
    (train model) trainingInputs trainingOutputs
    when (i `mod` 100 == 0) $ do
      err <- (errorRate model) trainingInputs trainingOutputs
      liftIO $ putStrLn $
        (show i) ++ " : current error " ++ show ((err V.! 0) * 100)

Now to run the final test, we use the errorRate step again, this time on our test data:

runTraining totalFile = runSession $ do
  ...

  -- Testing
  let (testingInputs, testingOutputs) =
        convertRecordsToTensorData testRecords
  testingError <- (errorRate model) testingInputs testingOutputs
  liftIO $ putStrLn $
    "test error " ++ show ((testingError V.! 0) * 100)

Results

When it comes to testing our system, we should use proper validation techniques. We want a model that will represent our training data well. But it should also generalize well to other reasonable examples. If our model represents the training data too well, we're in danger of "overfitting" our data. To check this, we'll hold back roughly 20% of the data. This will be our "test" data. We'll train our model on the other 80% of the data. Every 100 steps or so, we print out the training error on that batch of data. We hope this figure drops. But then at the very end, we'll run the model on the other 20% of the data, and we'll see what the error rate is. This will be the true test of our system.

We know we have overfitting if we see figures on training error that are lower than the testing error. When training on human moves for 50000 iterations, the training error drops to the high teens and low 20's. But the test error is often still close to 50%. This suggests we shouldn't be training quite as long.

The AI moves provide a little more consistency though. The training error seems to stabilize around the mid 20's and low 30's, and we end up with a test error of about 34%.

Conclusion

Our error rate isn't terrible. But it's not great either. And worse, testing shows it doesn't appear to capture the behaviors well enough to win the game. A case like this suggests our model isn't sophisticated enough to capture the problem. It could also suggest our data is too noisy, and the patterns we hoped to find aren't there. The feature set we have might not capture all the important information about the graph.

For our final look at this problem, we're going to try a new serialization technique. Instead of deriving our own features, we're going to serialize the entire game board! The "feature space" will be much, much larger now. It will include the structure of the graph and information about enemies and drills. This will call for a more sophisticated model. A pure fully connected network will take a long time to learn things like how walls allow moves or not. A big drawback of this technique is that it will not generalize to arbitrary mazes. It will only work for a certain size and number of enemies. But with enough training time we may find that interesting patterns emerge. So come back next week to see how this works!

James Bowen

Gathering Smart Data

gather_data.jpg

Last week we made a few more fixes to our Q-Learning algorithm. Ultimately though, it still seems to fall short for even basic versions of our problem.

Q-learning is an example of an "unsupervised" learning approach. We don't tell the machine learning algorithm what the "correct" moves are. We give it rewards when it wins the game (and negative rewards when it loses). But it needs to figure out how to play to get those rewards. With supervised learning, we'll have specific examples of what it should do! We'll have data points saying, "for this feature set, we should make this move." We'll determine a way to record the moves we make in our game, both as a human player and with our manual AI algorithm! This will become our "training" data for the supervised learning approach.

This week's code is all on the Gloss side of things. You can find it on our Github repository under the branch record-player-ai. Next week, we'll jump back into Tensor Flow. If you're not familiar yet with how to use Haskell and Tensor Flow, download our Haskell Tensor Flow Guide!

Recording Moves

To gather training data, we first need a way to record moves in the middle of the game. Gloss doesn't give us access to the IO monad in our update functions. So we'll unfortunately have to resort to unsafePerformIO for this, since we need the data in a file. (We did the same thing when saving game states). Here's the skeleton of our function:

unsafeSaveMove :: Int -> World -> World -> World
unsafeSaveMove moveChoice prevWorld nextWorld = unsafePerformIO $ do
  ...

The first parameter will be a representation of our move, an integer from 0-9. This follows the format we had with serialization.

0 -> Move Up
1 -> Move Right
2 -> Move Down
3 -> Move Left
4 -> Stand Still
X + 5 -> Move direction X and use the stun

The first World parameter will be the world under which we made the move. The second world will be the resulting world. This parameter only exists as a pass-through, because of how unsafePerformIO works.

Given these parameters, our function is pretty straightforward. We want to record a single line that has the serialized world state values and our final move choice. These will go in a comma separated list. We'll save everything to a file called moves.csv. So let's open that file and get the list of numbers. We'll immediately convert the numbers to strings with show.

unsafeSaveMove :: Int -> World -> World -> World
unsafeSaveMove moveChoice prevWorld nextWorld = unsafePerformIO $ do
  handle <- openFile "moves.csv" AppendMode
  let numbers = show <$>
        (Vector.toList (vectorizeWorld prevWorld) ++
          [fromIntegral moveChoice])
  ...

Now that our values are all strings, we can get them in a comma separated format with intercalate. We'll write this string to the file and close the handle!

unsafeSaveMove :: Int -> World -> World -> World
unsafeSaveMove moveChoice prevWorld nextWorld = unsafePerformIO $ do
  handle <- openFile "moves.csv" AppendMode
  let numbers = show <$>
        (Vector.toList (vectorizeWorld prevWorld) ++
          [fromIntegral moveChoice])
  let csvString = intercalate "," numbers
  hPutStrLn handle csvString
  hClose handle
  return nextWorld

Now let's figure out how to call this function!

Saving Human Moves

Saving the moves we make as a human is pretty easy. All we need to do is hook into the inputHandler. Recall this section, which receives arrow key presses and makes our move:

inputHandler :: Event -> World -> World
inputHandler event w
  ...
  | otherwise = case event of
      (EventKey (SpecialKey KeyUp) Down (Modifiers _ _ Down) _) ->
        drillLocation upBoundary breakUpWall breakDownWall w
      (EventKey (SpecialKey KeyUp) Down _ _) ->
        updatePlayerMove upBoundary
      (EventKey (SpecialKey KeyDown) Down (Modifiers _ _ Down) _) ->
        drillLocation downBoundary breakDownWall breakUpWall w
      (EventKey (SpecialKey KeyDown) Down _ _) ->
        updatePlayerMove downBoundary
      (EventKey (SpecialKey KeyRight) Down (Modifiers _ _ Down) _) ->
        drillLocation rightBoundary breakRightWall breakLeftWall w
      (EventKey (SpecialKey KeyRight) Down _ _) ->
        updatePlayerMove rightBoundary
      (EventKey (SpecialKey KeyLeft) Down (Modifiers _ _ Down) _) ->
        drillLocation leftBoundary breakLeftWall breakRightWall w
      (EventKey (SpecialKey KeyLeft) Down _ _) ->
        updatePlayerMove leftBoundary
      (EventKey (SpecialKey KeySpace) Down _ _) ->
        if playerCurrentStunDelay currentPlayer /= 0
          then w
          else w
            { worldPlayer =
                activatePlayerStun currentPlayer playerParams
            , worldEnemies = stunEnemyIfClose <$> worldEnemies w
            , stunCells = stunAffectedCells
            }
  …

All these lines return World objects! So we just need to wrap them as the final argument to unsafeSaveMove. Then we add the appropriate move choice number. The strange part is that we cannot move AND stun at the same time when playing as a human. So using the stun will always be 9, which means stunning while standing still. Here are the updates:

inputHandler :: Event -> World -> World
inputHandler event w
  ...
  | otherwise = case event of
      (EventKey (SpecialKey KeyUp) Down (Modifiers _ _ Down) _) -> 
        unsafeSaveMove 0 w $
          drillLocation upBoundary breakUpWall breakDownWall w
      (EventKey (SpecialKey KeyUp) Down _ _) ->
        unsafeSaveMove 0 w $ updatePlayerMove upBoundary
      (EventKey (SpecialKey KeyDown) Down (Modifiers _ _ Down) _) -> 
        unsafeSaveMove 2 w $
          drillLocation downBoundary breakDownWall breakUpWall w
      (EventKey (SpecialKey KeyDown) Down _ _) ->
        unsafeSaveMove 2 w $ updatePlayerMove downBoundary
      (EventKey (SpecialKey KeyRight) Down (Modifiers _ _ Down) _) -> 
        unsafeSaveMove 1 w $
          drillLocation rightBoundary breakRightWall breakLeftWall w
      (EventKey (SpecialKey KeyRight) Down _ _) ->
        unsafeSaveMove 1 w $ updatePlayerMove rightBoundary
      (EventKey (SpecialKey KeyLeft) Down (Modifiers _ _ Down) _) ->
        unsafeSaveMove 3 w $
          drillLocation leftBoundary breakLeftWall breakRightWall w
      (EventKey (SpecialKey KeyLeft) Down _ _) ->
        unsafeSaveMove 3 w $ updatePlayerMove leftBoundary
      (EventKey (SpecialKey KeySpace) Down _ _) ->
        if playerCurrentStunDelay currentPlayer /= 0
          then w
          else unsafeSaveMove 9 w $ w
            { worldPlayer =
                activatePlayerStun currentPlayer playerParams
            , worldEnemies = stunEnemyIfClose <$> worldEnemies w
            , stunCells = stunAffectedCells
            }
  …

And now whenever we play the game, it will save our moves! Keep in mind though, it takes a lot of training data to get good results when using supervised learning. I played for an hour and got around 10000 data points. We'll see if this is enough!

Saving AI Moves

While the game is at least a little fun, it's also exhausting to keep playing it to generate data! So now let's consider how we can get the AI to play the game itself and generate data. The first step is to reset the game automatically on winning or losing:

updateFunc :: Float -> World -> World
updateFunc _ w
  | (worldResult w == GameWon || worldResult w == GameLost) &&
       (usePlayerAI params) =
    ...

The rest will follow the other logic we have for resetting the game. Now we must examine where to insert our call to unsafeSaveMove. The answer is our updateWorldForPlayerMove function. We can see that we get the move (and our player's cached memory) as part of makePlayerMove:

updateWorldForPlayerMove :: World -> World
updateWorldForPlayerMove w = …
  where
    (move, memory) = makePlayerMove w
    ...

We'll want a quick function to convert our move into the number choice:

moveNumber :: PlayerMove -> Int
moveNumber (PlayerMove md useStun dd) =
  let directionFactor = case (md, dd) of
        (DirectionUp, _) -> 0
        (_, DirectionUp) -> 0
        (DirectionRight, _) -> 1
        (_, DirectionRight) -> 1
        (DirectionDown, _) -> 2
        (_, DirectionDown) -> 2
        (DirectionLeft, _) -> 3
        (_, DirectionLeft) -> 3
        _ -> 4
  in  if useStun then directionFactor + 5 else directionFactor

Our saving function requires a pass-through world parameter. So we'll do the saving on our first new World calculation. This comes from modifyWorldForPlayerDrill:

updateWorldForPlayerMove :: World -> World
updateWorldForPlayerMove w = …
  where
    (move, memory) = makePlayerMove w

    worldAfterDrill = unsafeSaveMove (moveNumber move) w
     (modifyWorldForPlayerDrill …)
    ...

And that's all! Now our AI will play the game by itself, gathering data for hours on end if we like! We'll get some different data for different cases, such as 4 enemies 4 drills, 8 enemies 5 drills, and so on. This is much faster and easier than playing the game ourselves! It will automatically get 12-15 thousand data points an hour if we let it!

Conclusion

With a little bit of persistence, we can now get a lot of data for the decisions a smarter agent will make. Next week, we'll take the data we've acquired and use it to write a supervised learning algorithm! Instead of using Q-learning, we'll make the weights reflect the decisions that we (or the AI) would make.

Supervised learning is not without its pitfalls! It won't necessarily perform optimally. It will perform like the training data. So even if we're successful, our algorithm will replicate our own mistakes! It'll be interesting to see how this plays out, so stay tuned!

For more information on using Haskell in AI, take a look at our Haskell AI Series. Plus, download our Haskell Tensor Flow Guide to learn more about using this library!

James Bowen

Tweaks, Fixes, and Some Results

tweaks.jpg

In last week's episode of this AI series, we added random exploration to our algorithm. This helped us escape certain "traps" and local minimums in the model that could keep us rooted in bad spots. But it still didn't improve results too much.

This week we'll explore a couple more ways we can fix and improve our algorithm. For the first time, we'll see some positive outcomes. Still, we'll find our approach isn't great.

To get started with Tensor Flow and Haskell, download our guide! It's a complex process so you'll want some help! You should also check out our Haskell AI Series to learn more about why Haskell is a good choice as an AI language!

Improvements

To start out, there are a few improvements we can make to how we do q-learning. Let's recall the basic outline of running a world iteration. There are three steps. We get our "new" move from the "input" world. Then we apply that move, and get our "next" move against the "next" world. Then we use the possible reward to create our target actions, and use that to train our model.

runWorldIteration model = do
  (prevWorld, _, _) <- get

  -- Get the next move on the current world (with random chance)
  let inputWorldVector = … -- vectorize prevWorld
  currentMoveWeights <- lift $ lift $
    (iterateWorldStep model) inputWorldVector
  let bestMove = moveFromOutput currentMoveWeights
  let newMove = chooseRandomMoveWithChance …

  -- Get the next world using this move, and produce our next move
  let nextWorld = stepWorld newMove prevWorld
  let nextWorldVector = vectorizeWorld nextWorld
  nextMoveVector <- lift $ lift $
    (iterateWorldStep model) nextWorldVector

  -- Use these to get "target action values" and use them to train!
  let (bestNextMoveIndex, maxScore) =
          (V.maxIndex nextMoveVector, V.maximum nextMoveVector)
  let targetActionData = encodeTensorData (Shape [10, 1]) $
          nextMoveVector V.//
            [(bestNextMoveIndex, newReward + maxScore)]
  lift $ lift $ (trainStep model) nextWorldVector targetActionData

There are a couple issues here. First, we want to substitute based on the first new move, not the later move. We want to learn from the move we are taking now, since we assess its result now. Thus we want to substitute for that index. We'll re-write our randomizer to account for this and return the index it chooses.
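Here's a sketch of the updated randomizer, adapted from the chooseMoveWithRandomChance function we wrote last week (the exact signature in the repository may differ, and moveFromIndex also comes from last week's article). It now takes the vector of move scores and returns the index of whichever move it picks, whether random or best:

chooseRandomMoveWithChance
  :: V.Vector Float -> StdGen -> Float -> (PlayerMove, Int, StdGen)
chooseRandomMoveWithChance moveScores gen randomChance =
  let (randVal, gen') = randomR (0.0, 1.0) gen
      (randomIndex, gen'') = randomR (0, 9) gen'
      bestMoveIndex = V.maxIndex moveScores
  in  if randVal < randomChance
        then (moveFromIndex randomIndex, randomIndex, gen'')
        else (moveFromIndex bestMoveIndex, bestMoveIndex, gen')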

Next, when training our model, we should use the original world, instead of the next world. That is, we want inputWorldVector instead of nextWorldVector. Our logic is this: we get our "future" action, which accounts for the game's reward, and we want our current action on this world to be more like that future action. Here's what the changes look like:

runWorldIteration model = do
  (prevWorld, _, _) <- get

  -- Get the next move on the current world (with random chance)
  let inputWorldVector = … -- vectorize prevWorld
  currentMoveWeights <- lift $ lift $
    (iterateWorldStep model) inputWorldVector
  let bestMove = moveFromOutput currentMoveWeights
  let (newMove, newMoveIndex) = chooseRandomMoveWithChance …

  -- Get the next world using this move, and produce our next move
  let nextWorld = stepWorld newMove prevWorld
  let nextWorldVector = vectorizeWorld nextWorld
  nextMoveVector <- lift $ lift $
    (iterateWorldStep model) nextWorldVector

  -- Use these to get "target action values" and use them to train!
  let maxScore = V.maximum nextMoveVector
  let targetActionData = encodeTensorData (Shape [10, 1]) $
          nextMoveVector V.//
            [(newMoveIndex, newReward + maxScore)]
  lift $ lift $ (trainStep model) inputWorldVector targetActionData

Another change we can make is to provide some rewards based on whether the selected move was legal or not. To do this, we'll need to update the stepWorld game API to return this boolean value:

stepWorld :: PlayerMove -> World -> (World, Bool)

Then we can add a small amount (0.01) to our reward value if we get a legal move, and subtract this otherwise.
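As a tiny helper, that bonus might look like this (a sketch; it could just as well be inlined where we compute the reward):

legalityReward :: Bool -> Float
legalityReward wasLegal = if wasLegal then 0.01 else (-0.01)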

As a last flourish, we should also add a timeout condition. Our next step will be to test on simple mazes that have no enemies. This means we'll never get eaten, so we need some loss condition if we get stuck. This timeout condition should have the same negative reward as losing.
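Here's a sketch of how the timeout could factor into the reward logic from our iteration function. The stepCount and maxSteps values are hypothetical (we'd track the step count as part of our iteration state), and we're assuming the type returned by worldResult is called GameResult:

rewardForStep :: Int -> Int -> GameResult -> Float
rewardForStep stepCount maxSteps result
  | stepCount > maxSteps = -1.0  -- timed out: same penalty as losing
  | result == GameWon = 1.0
  | result == GameLost = -1.0
  | otherwise = 0.0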

Results

Now that we've made some improvements, we'll train on a very basic maze that's only 5x5 and has no walls and no enemies. Whereas we used to struggle to even finish this maze, we now achieve the goal a fair amount of the time. One of our training iterations achieved the goal around 2/3 of the time.

However, our bot is still useless against enemies! It loses every time if we try to train from scratch on a map with a single enemy. One attempt to circumvent this is to first train our weights to solve the empty maze. Then we can start with these weights as we attempt to avoid the enemy. That way, we have some pre-existing knowledge, and we don't have to learn everything at once. Still though, it doesn't result in much improvement. Typical runs only succeeded 40-50 times out of 2000 iterations.

Limiting Features

One conclusion we can draw is that we actually have too many features! Our intuition is that a larger feature set would take more iterations to learn. If the features aren't chosen carefully, they'll introduce noise.

So instead of tracking 8 features for each possible direction of movement, let's stick with 3. We'll see if the enemy is on the location, check the distance to the end, and count the number of nearby enemies. When we do this, we get comparable results on the empty maze. But when it comes to avoiding enemies, we do a little better, surviving 150-250 iterations out of 2000. These statistics are all very rough, of course. If we wanted a more thorough analysis, we'd use multiple maze configurations and a lot more runs using the finalized weights.
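For concreteness, the reduced per-direction feature set might be represented like this (the type and field names here are hypothetical):

data LimitedLocationFeatures = LimitedLocationFeatures
  { llfContainsActiveEnemy :: Int -- is an active enemy on this location?
  , llfDistanceToGoal :: Int -- distance to the end from this location
  , llfNearbyEnemies :: Int -- nearby enemies in manhattan distance terms
  }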

Conclusions

We can't draw too many conclusions from this yet. Our model is still failing to solve simple versions of our problem. It's quite possible that our model is too simplistic. After all, all we're doing is a simple matrix multiplication on our features. In theory, this should be able to solve the problem, but it may take a lot more iterations. The results stream we see also suggests local minimums are a big problem. Logging information reveals that we often die in the same spot in the maze many times in a row. The negative rewards aren't enough to draw us out, and we are often relying on random moves to find better outcomes.

So next week we're going to start changing our approach. We'll explore a way to introduce supervised learning into our process. This depends on "correct" data. We'll try a couple different ways to get that data. We'll use our own "human" input, as well as the good AI we've written in the past to solve this problem. All we need is a way to record the moves we make! So stay tuned!

James Bowen

Adding Random Exploration

brain_idea.jpg

Last week, we finally built a pipeline to use machine learning on our maze game. We made a Tensor Flow graph that could train a "brain" with weights so we could navigate the maze. This week, we'll see how our training works, or rather how it doesn't work. We'll consider how randomizing moves during training might help.

Our machine learning code lives in this repository. For this article, you'll want to look at the randomize-moves branch. Take a look here for the original game code. You'll want the q-learning branch in the main repo.

This part of the series uses Haskell and Tensor Flow. To learn more about using these together, download our Haskell Tensor Flow Guide!

Unsupervised Machine Learning

With a few tweaks, we can run our game using the new output weights. But what we'll find as we train the weights is that our bot never seems to win! It always seems to do the same thing! It might move up and then get stuck because it can't move up anymore. It might stand still the whole time and let the enemies come grab it. Why would this happen?

Remember that reinforcement learning depends on being able to reinforce good behaviors. Thus at some point, we have to hope our AI will win the game. Then it will get the good reward so that it can change its behavior to adapt and get good results more often. But if it never gets a good result in the whole training process, it will never learn good behaviors!

This is part of the challenge of unsupervised learning. In a supervised learning algorithm, we have specific good examples to learn from. One way to approach this would be to record our own moves of playing the game. Then the AI could learn directly from us! We'll probably try this approach in the future!

But q-learning is an unsupervised algorithm. We're forcing our AI to explore the world and learn on its own. But right now, it's only making moves that it thinks are "optimal." But with a random set of weights, the "optimal" moves aren't very optimal at all! Part of a good "exploration" plan means letting it choose moves from time to time that don't seem optimal.

Adding a Random Choice

As our first attempt to fix this, we'll add a "random move chance" to our training process. At each training step, our network chooses its "best" move, and we use that to update the world state. From now on, whenever we do this, we'll roll the dice. And if we get a number below our random chance, we'll pick a random move instead of our "best" move.

Over the course of training though, we want to decrease this random chance. In theory, our AI should be better as we train the network. So as we get closer to the end of training, we'll want to make fewer random decisions, and more "best" decisions. We'll aim to start this parameter as 1 in 5, and reduce it down to 1 in 50 as training continues. So how do we implement this?
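One simple schedule that does this is to divide down by the iteration count. This is the formula we'll plug in below: at iteration 0 it gives 1/5 = 0.2, and by iteration 2000 it's down to roughly 1/55:

randomChanceForIteration :: Int -> Float
randomChanceForIteration i = 1.0 / ((fromIntegral i / 40.0) + 5)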

First of all, we want to keep track of a value representing our chance of making a random move. Our runAllIterations function should be stateful in this parameter.

-- Third "Float" parameter is the random chance
runAllIterations :: Model -> World
  -> StateT ([Float], Int, Float) Session ()
...

trainGame :: World -> Session (Vector Float)
trainGame w = do
  model <- buildModel
  let initialRandomChance = 0.2
  (finalReward, finalWinCount, _) <- execStateT
    (runAllIterations model w)
    ([], 0, initialRandomChance)
  run (readValue $ weightsT model)

Then within runAllIterations, we'll make two changes. First, we'll make a new random generator for each training game. Then, we'll update the random chance, reducing it with the number of iterations:

runAllIterations :: Model -> World
  -> StateT ([Float], Int, Float) Session ()
runAllIterations model initialWorld = do
  let numIterations = 2000
  forM [1..numIterations] $ \i -> do
    gen <- liftIO getStdGen
    (wonGame, (_, finalReward, _)) <- runStateT
      (runWorldIteration model)
      (initialWorld, 0.0, gen)
    (prevRewards, prevWinCount, randomChance) <- get
    let newRewards = finalReward : prevRewards
    let newWinCount = if wonGame then prevWinCount + 1 else prevWinCount
    let modifiedRandomChance = 1.0 / ((fromIntegral i / 40.0) + 5)
    put (newRewards, newWinCount, modifiedRandomChance)
  return ()

Making Random Moves

We can see now that runWorldIteration must now be stateful in the random generator. We'll retrieve that as well as the random chance at the start of the operation:

runWorldIteration :: Model -> StateT (World, Float, StdGen)
  (StateT ([Float], Int, Float) Session) Bool
runWorldIteration model = do
  (prevWorld, prevReward, gen) <- get
  (_, _, randomChance) <- lift get
  ...

Now let's refactor our serialization code a bit. We want to be able to make a new move based on the index, without needing the weights:

moveFromIndex :: Int -> PlayerMove
moveFromIndex bestMoveIndex =
  PlayerMove moveDirection useStun moveDirection
  where
    -- Indices 5-9 are the stun versions of the five moves
    useStun = bestMoveIndex >= 5
    moveDirection = case bestMoveIndex `mod` 5 of
      0 -> DirectionUp
      1 -> DirectionRight
      2 -> DirectionDown
      3 -> DirectionLeft
      _ -> DirectionNone

Now we can add a function that will run the random generator and give us a random move if it's low enough. Otherwise, it will keep the best move.

chooseMoveWithRandomChance ::
  PlayerMove -> StdGen -> Float -> (PlayerMove, StdGen)
chooseMoveWithRandomChance bestMove gen randomChance =
  let (randVal, gen') = randomR (0.0, 1.0) gen
      (randomIndex, gen'') = randomR (0, 9) gen'
      randomMove = moveFromIndex randomIndex
  in  if randVal < randomChance
        then (randomMove, gen'')
        else (bestMove, gen')

Now it's a simple matter of applying this function, and we're all set!

runWorldIteration :: Model -> StateT (World, Float, StdGen)
  (StateT ([Float], Int, Float) Session) Bool
runWorldIteration model = do
  (prevWorld, prevReward, gen) <- get
  (_, _, randomChance) <- lift get
  ...
  let bestMove = ...
  let (newMove, newGen) = chooseMoveWithRandomChance
                            bestMove gen randomChance
  …
  put (nextWorld, prevReward + newReward, newGen)
  continuationAction

Conclusion

When we test our bot, it has a bit more variety in its moves now, but it's still not succeeding. So what do we want to do about this? It's possible that something is wrong with our network or the algorithm. But it's difficult to reveal this when the problem space is difficult. After all, we're expecting this agent to navigate a complex maze AND avoid/stun enemies.

It might help to break this process down a bit. Next week, we'll start looking at simpler examples of mazes. We'll see if our current approach can be effective at navigating an empty grid. Then we'll see if we can take some of the weights we learned and use them as a starting point for harder problems. We'll try to navigate a true maze, and see if we get better weights. Then we'll look at an empty grid with enemies. And so on. This approach will make it more obvious if there are flaws with our machine learning method.

If you've never programmed in Haskell before, it might be a little hard to jump into machine learning. Check out our Beginners Checklist and our Liftoff Series to get started!

James Bowen

Running Training Iterations

iterations.png

In our last article we built a simple Tensor Flow model to perform Q-Learning on our brain. This week, we'll build out the rest of the code we need to run iterations on this model. This will train it to perform better and make more intelligent decisions.

The machine learning code for this project is in a separate repository from the game code. Check out MazeLearner to follow along. Everything for this article is on the basic-trainer branch.

To learn more about Haskell and AI, make sure to read our Haskell and AI Series!

Iterating on the Model

First let's recall what our Tensor Flow model looks like:

data Model = Model
  { weightsT :: Variable Float
  , iterateWorldStep :: TensorData Float -> Session (Vector Float)
  , trainStep :: TensorData Float -> TensorData Float -> Session ()
  }

We need to think about how we're going to use its last two functions. We want to iterate on and make updates to the weights. Across the different iterations, there's certain information we need to track. The first value we'll track is the list of "rewards" from each iteration (this will be more clear in the next section). Then we'll also track the number of wins we get across the iterations.

To track these, we'll use the State monad, run on top of the Session.

runAllIterations :: Model -> World -> StateT ([Float], Int) Session ()

We'll also want a function to run a single iteration. This, in turn, will have its own state information. It will track the World state of the game it's playing. It will also track the sum of the accumulated reward values from the moves in that game. Since we'll run it from our function above, it will have a nested StateT type. It will ultimately return a boolean value indicating if we have won the game. We'll define the details in the next section:

runWorldIteration :: Model ->
  StateT (World, Float) (StateT ([Float], Int) Session) Bool

We can now start by filling out our function for running all the iterations. Supposing we'll perform 1000 iterations, we'll make a loop for each iteration. We can start each loop by running the world iteration function on the current model.

runAllIterations :: Model -> World -> StateT ([Float], Int) Session ()
runAllIterations model initialWorld = do
  let numIterations = 1000
  void $ forM [1..numIterations] $ \i -> do
    (wonGame, (_, finalReward)) <-
      runStateT (runWorldIteration model) (initialWorld, 0.0)
    ...

And now the rest is a simple matter of using the results to update the existing state:

runAllIterations :: Model -> World -> StateT ([Float], Int) Session ()
runAllIterations model initialWorld = do
  let numIterations = 2000
  forM [1..numIterations] $ \i -> do
    (wonGame, (_, finalReward)) <-
      runStateT (runWorldIteration model) (initialWorld, 0.0)
    (prevRewards, prevWinCount) <- get
    let newRewards = finalReward : prevRewards
    let newWinCount = if wonGame
          then prevWinCount + 1
          else prevWinCount
    put (newRewards, newWinCount)

Running a Single Iteration

Now let's delve into the process of a single iteration. Broadly speaking, we have four goals.

  1. Take the current world and serialize it. Pass it through the iterateWorldStep to get the move our model would make in this world.
  2. Apply this move, getting the "next" world state.
  3. Determine the scores for our moves in this next world. Apply the given reward as the score for the best of these moves.
  4. Use this result to compare against our original moves. Feed it into the training step and update our weights.

Let's start with steps 1 and 2. We'll get the vector representation of the current world. Then we need to encode it as TensorData so we can pass it to an input feed. Next we run our model's iterate step and get our output move. Then we can use that to advance the world state using stepWorld and updateEnvironment.

runWorldIteration
  :: Model
  -> StateT (World, Float) (StateT ([Float], Int) Session) Bool
runWorldIteration model = do
  -- Serialize the world
  (prevWorld :: World, prevReward) <- get
  let (inputWorldVector :: TensorData Float) =
        encodeTensorData (Shape [1, 8]) (vectorizeWorld prevWorld)

  -- Run our model to get the output vector and de-serialize it
  -- Lift twice to get into the Session monad
  (currentMove :: Vector Float) <- lift $ lift $
    (iterateWorldStep model) inputWorldVector
  let newMove = moveFromOutput currentMove

  -- Get the next world state
  let nextWorld = updateEnvironment (stepWorld newMove prevWorld)

Now we need to perform the Q-Learning step. We'll start by repeating the process in our new world state and getting the next vector of move scores:

runWorldIteration model = do
  ...
  let nextWorld = updateEnvironment (stepWorld newMove prevWorld)

  let nextWorldVector =
        encodeTensorData (Shape [1, 8]) (vectorizeWorld nextWorld)

  (nextMoveVector :: Vector Float) <- lift $ lift $
    (iterateWorldStep model) nextWorldVector
  ...

Now it gets a little tricky. We want to examine if the game is over after our last move. If we won, we'll get a reward of 1.0. If we lost, we'll get a reward of -1.0. Otherwise, there's no reward. While we figure out this reward value, we can also determine our final monadic action. We could return a boolean value if the game is over, or recursively iterate again:

runWorldIteration model = do
  ...
  let nextWorld = ...
  (nextMoveVector :: Vector Float) <- ...
  let (newReward, continuationAction) = case worldResult nextWorld of
        GameInProgress -> (0.0, runWorldIteration model)
        GameWon -> (1.0, return True)
        GameLost -> (-1.0, return False)
  ...

Now we'll look at the vector for our next move and replace one of its values. We'll find the maximum score, and replace it with a value that factors in the actual reward we get from the game. This is how we insert "truth" into our training process and how we'll actually learn good reward values.

import qualified Data.Vector as V

runWorldIteration model = do
  ...
  let nextWorld = ...
  (nextMoveVector :: Vector Float) <- ...
  let (newReward, continuationAction) = ...
  let (bestNextMoveIndex, maxScore) =
        (V.maxIndex nextMoveVector, V.maximum nextMoveVector)
  let (targetActionValues :: Vector Float) = nextMoveVector V.//
        [(bestNextMoveIndex, newReward + (0.99 * maxScore))]
  let targetActionData =
        encodeTensorData (Shape [10, 1]) targetActionValues
  ...

Then we'll encode this new vector as the second input to our training step. We'll still use the nextWorldVector as the first input. We conclude by updating our state variables to have their new values. Then we run the continuation action we got earlier.

runWorldIteration model = do
  ...
  let nextWorld = ...
  (nextMoveVector :: Vector Float) <- ...
  let targetActionData = ...

  -- Run training to alter the weights
  lift $ lift $ (trainStep model) nextWorldVector targetActionData
  put (nextWorld, prevReward + newReward)
  continuationAction

Tying It Together

Now we need a little bit of extra code to tie everything together. We'll make a Session action to train our game. It will output the final weights of our model.

trainGame :: World -> Session (Vector Float)
trainGame w = do
  model <- buildModel
  (finalReward, finalWinCount) <-
    execStateT (runAllIterations model w) ([], 0)
  run (readValue $ weightsT model)
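
One piece we haven't shown here is the runAllIterations helper that trainGame calls. A minimal sketch might look like the following, assuming we simply play a fixed number of games (1000 is an arbitrary choice), restart from the base world each time, and record each game's reward and whether we won in the outer state:

import Control.Monad (replicateM_)
import Control.Monad.State (StateT, runStateT, get, put)

runAllIterations :: Model -> World -> StateT ([Float], Int) Session ()
runAllIterations model baseWorld = replicateM_ 1000 $ do
  -- Play one complete game, starting fresh from the base world.
  (won, (_, gameReward)) <-
    runStateT (runWorldIteration model) (baseWorld, 0.0)
  -- Record this game's reward and bump the win count if we won.
  (rewards, winCount) <- get
  put (gameReward : rewards, if won then winCount + 1 else winCount)

With something like that in place, trainGame has everything it needs.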

Then we can run this from IO using runSession.

playGameTraining :: World -> IO (Vector Float)
playGameTraining w = runSession (trainGame w)

Last of all, we can run this on any World we like by first loading it from a file. For our first examples, we'll use a smaller 10x10 grid with 2 enemies and 1 drill powerup.

main :: IO ()
main = do
  world <- loadWorldFromFile "training_games/maze_grid_10_10_2_1.game"
  finalWeights <- playGameTraining world
  print finalWeights

Conclusion

We've now got the basics down for making our Tensor Flow program work. Come back next week where we'll take a more careful look at how it's performing. We'll see if the AI from this process is actually any good or if there are tweaks we need to make to the learning process.

And make sure to download our Haskell Tensor Flow Guide! This library is difficult to use. There are a lot of secondary dependencies for it. So don't go in trying to use it blind!

James Bowen

Making a Learning Model

tf_logo.jpg

Last week we took a few more steps towards using machine learning to improve the player AI for our maze game. We saw how to vectorize the input and output data for our world state and moves. This week, we'll finally start seeing how to use these in the larger context of a Tensor Flow program. We'll make a model for a super basic neural network that will apply the technique of Q-Learning.

Our machine learning code will live in a separate repository from the primary game code. Be sure to check that out here! The first couple weeks of this part of the series will use the basic-trainer branch.

This week, we finally start diving into using Haskell with Tensor Flow. Be sure to read our Haskell AI Series to learn more about this! You can also download our Haskell Tensor Flow guide to learn the basics of the library.

Model Basics

This week's order of business will be to build a Tensor Flow graph that can make decisions in our maze game. The graph should take a serialized world state as an input, and then produce a distribution of scores. These scores correspond to the different moves we can make.

Recalling from last week, the input to our model will be a 1x8 vector, and the output will be a 10x1 vector. For now, we'll represent our model with a single variable tensor: a matrix of size 8x10. We'll get the output by multiplying the inputs by the weights.

Ultimately, there are three things we need to access from this model.

  1. The final weights
  2. A step to iterate the world
  3. A step to train our model and adjust the weights.

Here's what the model looks like, using Tensor Flow types:

data Model = Model
  { weightsT :: Variable Float
  , iterateWorldStep :: TensorData Float -> Session (Vector Float)
  , trainStep :: TensorData Float -> TensorData Float -> Session ()
  }

The first element is the variable tensor for our weights. We need to expose this so we can output the weights at the end. The second element is a function that will take in a serialized world state and produce the output move. Then the third element will take both a serialized world state AND some expected values. It will update the variable tensor as part of the Q-Learning process. Next week, we'll write iteration functions in the Session monad. They'll use these last two elements.

Building the Iterate Step

To make these Tensor Flow items, we'll also need to use the Session monad. Let's start with a basic function to build up our model:

buildModel :: Session Model
buildModel = do
  ...

To start, let's make a variable for our weights. At the start, we'll randomize them with truncatedNormal and then make that into a Variable:

buildModel :: Session Model
buildModel = do
  (initialWeights :: Tensor Value Float) <-
    truncatedNormal (vector [8, 10])
  (weights :: Variable Float) <- initializedVariable initialWeights

Now let's build the items for running our iterate step. This first involves taking the inputs as a placeholder. Remember, the inputs come from the vectorization of the world state.

Then to produce our output, we'll multiply the inputs by our weights. The result is a Build tensor, so we need to render it to use it in the next part. As an extra note, we need readValue to turn our Variable into a Tensor we can use in operations.

buildModel :: Session Model
buildModel = do
  (initialWeights :: Tensor Value Float) <-
    truncatedNormal (vector [8, 10])
  (weights :: Variable Float) <- initializedVariable initialWeights
  (inputs :: Tensor Value Float) <- placeholder (Shape [1,8])
  let (allOutputs :: Tensor Build Float) =
        inputs `matMul` (readValue weights)
  returnedOutputs <- render allOutputs
  ...

The next part is to create a step to "run" the outputs. Since the outputs depend on a placeholder, we need to create a feed for the input. Then we can create a runnable Session action with runWithFeeds. This gives us the second element of our Model, the iterateStep.

buildModel :: Session Model
buildModel = do
  ...
  let iterateStep = \inputFeed ->
        runWithFeeds [feed inputs inputFeed] returnedOutputs
  ...

Using Q-Learning in the Model

This gives us what we need to run our basic AI and make moves in the game. But we still need to apply some learning mechanism to update the weights!

We want to use Q-Learning. This means we'll compare the output of our model with the next output from continuing to step through the world. So first let's introduce another placeholder for these new outputs:

buildModel :: Session Model
buildModel = do
  initialWeights <- ...
  weights <- ...
  inputs <- ...
  returnedOutputs <- ...

  let iterateStep = ...

  -- Next set of outputs
  (nextOutputs :: Tensor Value Float) <- placeholder (Shape [10, 1])
  ...

Now we'll define our "loss" function. That is, we'll find the squared difference between our real output and the "next" output. Next week we'll see that the "next" output uses extra information about the game. This will allow us to bring an element of "truth" that we can learn from.

buildModel :: Session Model
buildModel = do
  ...
  returnedOutputs <- ...

  -- Q-Learning Section
  (nextOutputs :: Tensor Value Float) <- placeholder (Shape [10, 1])
  let (diff :: Tensor Build Float) = nextOutputs `sub` allOutputs
  let (loss :: Tensor Build Float) = reduceSum (diff `mul` diff)
  ...

Now, we'll make a final ControlNode using minimizeWith. This will minimize the loss function using the adam optimizer. We'll pass weights in the list of variables to update, since this is the variable we're adjusting at each training step.

buildModel :: Session Model
buildModel = do
  ...
  returnedOutputs <- ...

  -- Q-Learning Section
  (nextOutputs :: Tensor Value Float) <- placeholder (Shape [10, 1])
  let (diff :: Tensor Build Float) = nextOutputs `sub` allOutputs
  let (loss :: Tensor Build Float) = reduceSum (diff `mul` diff)
  (trainer_ :: ControlNode) <- minimizeWith adam loss [weights]

Finally, we'll make our training step, which runs the training node with two input feeds: one for the world input, and one for the expected output. Then we can return our completed model.

buildModel :: Session Model
buildModel = do
  ...
  inputs <- ...
  weights <- ...
  returnedOutputs <- ...
  let iterateStep = ...

  -- Q-Learning Section
  (nextOutputs :: Tensor Value Float) <- placeholder (Shape [10, 1])
  let diff = ...
  let loss = ...
  (trainer_ :: ControlNode) <- minimizeWith adam loss [weights]
  let trainingStep = \inputFeed nextOutputFeed -> runWithFeeds
        [ feed inputs inputFeed
        , feed nextOutputs nextOutputFeed
        ]
        trainer_
  return $ Model
    weights
    iterateStep
    trainingStep

Conclusion

Now we've got our machine learning model. We have different functions that can iterate on our world state as well as train the outputs of our graph. Next week, we'll see how to combine these steps within the Session monad. Then we can start running training iterations and produce results.

If you want to follow along with these code examples, make sure to download our Haskell Tensor Flow Guide! This library is quite tricky to use. There are a lot of secondary dependencies for it. So you won't want to go in trying to use it blind!

James Bowen

Q-Learning Primer

q-learning.png

This week, we're going to take the machine learning process in a different direction than I expected. In the last couple weeks, we've built a simple evaluation function for our world state. We could learn this function using an approach called Temporal Difference learning. We might come back to this approach at some point. But for now, we're actually going to try something a little different.

Instead, we're going to focus on a technique called Q-Learning. Instead of an evaluation function for the world, we're going to learn makePlayerMove. We'll keep most of the same function structure. We're still going to take our world and turn it into a feature space that we can represent as a numeric vector. But instead of producing a single output, we'll give a score for every move from that position. This week, we'll take the basic steps to ready our game for this approach.

As always, check out the Github repository for this project. This week's code is on the q-learning branch!

Next week, we'll finally get into some Tensor Flow code. Make sure you're ready for it by reading up on our Tensor Flow Guide!

Vectorizing Inputs

To learn a function, we need to be able to represent both the inputs and the outputs of our system as numeric vectors. We've already done most of the work here. Let's recall our evaluateWorld function. We'll keep the same feature values. But now we'll wrap them, instead of applying scores immediately:

data WorldFeatures = WorldFeatures
  { onActiveEnemy :: Int
  , shortestPathLength :: Int
  , manhattanDistance :: Int
  , enemiesOnPath :: Int
  , nearestEnemyDistance :: Int
  , numNearbyEnemies :: Int
  , stunAvailable :: Int
  , drillsRemaining :: Int
  }

produceWorldFeatures :: World -> WorldFeatures
produceWorldFeatures w = WorldFeatures
  (if onActiveEnemy then 1 else 0)
  shortestPathLength
  manhattanDistance
  enemiesOnPath
  nearestEnemyDistance
  numNearbyEnemies
  (if stunAvailable then 1 else 0)
  (fromIntegral drillsRemaining)
  where
    -- Calculated as before
    onActiveEnemy = ...
    enemiesOnPath = ...
    shortestPathLength = ...
    nearestEnemyDistance = ...
    manhattanDistance = ...
    stunAvailable = ...
    numNearbyEnemies = ...
    drillsRemaining = ...

Now, in our ML code, we'll want to convert this into a vector. Using a vector will enable us to encode this information as a tensor.

vectorizeWorld :: World -> Vector Float
vectorizeWorld w = fromList (fromIntegral <$>
  [ wfOnActiveEnemy features
  , wfShortestPathLength features
  , wfManhattanDistance features
  , wfEnemiesOnPath features
  , wfNearestEnemyDistance features
  , wfNumNearbyEnemies features
  , wfStunAvailable features
  , wfDrillsRemaining features
  ])
  where
    features = produceWorldFeatures w

Vectorizing Outputs

Now we have the inputs to our tensor system. We'll ultimately get a vector of outputs as a result. We want this vector to provide a score for every move. We have 10 potential moves in general. There are five "movement" directions: up, right, down, left, and standing still. Then for each direction, we can either use our stun or not. We'll use a drill whenever the movement direction sends us against a wall. Certain moves won't be available in certain situations, but our vector size should account for them all.

Our function will often propose invalid moves. For example, it might suggest using the stun while on cooldown, or drilling when we don't have one. In these cases, our game logic should dictate that our player doesn't move. Hopefully this trains our network to make correct moves. If we wanted to, we could even apply a slight negative reward for these.

What we need is the ability to convert a vector of outputs into a move. Once we fix the vector size, this is not difficult. As a slight hack, this function will always give the same direction for moving and drilling. We'll let the game logic determine if the drill needs to apply.

moveFromOutput :: Vector Int -> PlayerMove
moveFromOutput vals = PlayerMove moveDirection useStun moveDirection
  where
    bestMoveIndex = maxIndex vals
    moveDirection = case bestMoveIndex `mod` 5 of
      0 -> DirectionUp
      1 -> DirectionRight
      2 -> DirectionDown
      3 -> DirectionLeft
      4 -> DirectionNone
    useStun = bestMoveIndex > 4
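
As a quick example with made-up scores, suppose the highest value lands at index 7. Since 7 mod 5 is 2 and 7 is greater than 4, that output means "move down and activate the stun":

-- Hypothetical output scores; the maximum (9) sits at index 7.
exampleMove :: PlayerMove
exampleMove = moveFromOutput (fromList [0, 1, 0, 0, 2, 0, 0, 9, 0, 3])
-- Equivalent to: PlayerMove DirectionDown True DirectionDown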

Discrete Updates

Now that we can get numeric vectors for everything, we need to be able to step through the world, one player move at a time. We currently have a generic update function. Depending on the time, it might step the player forward, or it might not. We want to change this so there are two steps. First we receive a player's move, and then we step the world forward until it is time for the next player move:

stepWorld :: PlayerMove -> World -> World

This isn't too difficult; it just requires a little shuffling of our existing code. First, we'll add a new applyPlayerMove' function. This will take the existing applyPlayerMove and add a little bit of validation to it:

applyPlayerMove' :: PlayerMove -> World -> World
applyPlayerMove' move w = if isValidMove
  then worldAfterMove
  else w
  where
    player = worldPlayer w
    currentLoc = playerLocation player

    worldAfterDrill = modifyWorldForPlayerDrill w
      (drillDirection move)

    worldAfterStun = if activateStun move
      then modifyWorldForStun worldAfterDrill
      else worldAfterDrill

    newLocation = nextLocationForMove
      (worldBoundaries worldAfterDrill Array.! currentLoc) 
      currentLoc
      (playerMoveDirection move)

    isValidStunUse = if activateStun move
      then playerCurrentStunDelay player == 0
      else True
    isValidMovement = playerMoveDirection move == DirectionNone ||
      newLocation /= currentLoc
    isValidMove = isValidStunUse && isValidMovement

    worldAfterMove =
      modifyWorldForPlayerMove worldAfterStun newLocation

Now we'll add an updateEnvironment function. This will perform all the work of our updateFunc except for moving the player.

updateEnvironment :: World -> World
updateEnvironment w
  | playerLocation player == endLocation w =
    w { worldResult = GameWon }
  | playerLocation player `elem` activeEnemyLocations =
    w { worldResult = GameLost }
  | otherwise =
    updateWorldForEnemyTicks .
    updateWorldForPlayerTick .
    updateWorldForEnemyMoves .
    clearStunCells .
    incrementWorldTime $ w
  where
    player = worldPlayer w
    activeEnemyLocations = enemyLocation <$>
      filter (\e -> enemyCurrentStunTimer e == 0) (worldEnemies w)

Now we combine these. First we'll make the player's move. Then we'll update the environment once for each tick of the player's "lag" time.

stepWorld :: PlayerMove -> World -> World
stepWorld move w = execState (sequence updateActions) worldAfterMove
  where
    worldAfterMove = applyPlayerMove' move w
    updateActions = replicate
      ( fromIntegral .
        lagTime .
        playerGameParameters .
        worldParameters $ w)
      (modify updateEnvironment)

And these are all the modifications we'll need to get going!

Q Learning Teaser

Now we can start thinking about the actual machine learning process. We'll get into a lot more detail next week. But for now, let's think about a particular training iteration. We'll want to use our existing network to step forward into the game. This will produce a certain "reward", and leave the game state in a new position. Then we'll get more values for our next moves out of that position. We'll use the updated move scores and the reward to learn better values for our function weights.

Of course, the immediate "reward" values for most moves will be 0. The only moves that will carry a reward will be those where we either win the game or lose the game. So it could take a while for our program to learn good behaviors. It will take time for the "end" behaviors of the game to affect normal moves. For this reason, we'll start our training on much smaller mazes than the primary game. This should help speed up the training process.
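
To make the update rule concrete, here's the target value we'll compute for the move we chose. (The name qTarget and the 0.99 discount factor are our own choices; we'll use this exact expression when we implement the training loop in Tensor Flow.)

-- The "target" score for the chosen move: the immediate reward plus the
-- discounted best score available from the resulting position.
qTarget :: Float -> Float -> Float
qTarget reward bestNextScore = reward + 0.99 * bestNextScore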

Conclusion

Next week, we'll take our general framework for Q-Learning and apply it within Tensor Flow. We'll get the basics of Q-Learning down with a couple different types of models. For a wider perspective on Haskell and AI problems, make sure to check out our Haskell AI Series!

James Bowen

Adding Features for Better Behavior

brain_feature.jpg

Last week we started exploring the idea of an AI built on an evaluation function. This has the potential to save us a lot of the hand-crafting that comes with AI design. The old, hard-coded way specified all the rules for the AI to follow. In the new approach, we create a mathematical function to evaluate a game position. Then we can look at all our possible moves and select the one with the best result. We can turn the input to our evaluation function into a vector of numbers, and its output is also a number. This property will make it much easier to machine-learn this function down the road.

We made a rudimentary version of this function last week. Even before turning to machine learning, there are a couple ways to improve our function. We can try tweaking the weights we applied to each feature. But we can also try coming up with new features, or try different combinations of features. This week, we'll try the latter approach.

In the coming weeks as we start exploring machine learning, we'll use Tensor Flow with Haskell! To get prepared, download our Haskell Tensor Flow guide!

Existing Features

Last week, we came up with a few different features that could help us navigate this maze. These features included:

  1. Maze distance to goal
  2. Manhattan distance to goal
  3. Whether or not an enemy is on our location
  4. Whether or not our stun is available
  5. The number of drills we have available
  6. The number of enemies that are nearby (using manhattan distance)

But there were some clear sub-optimal behaviors with our bot. We tend to get "zoned out" by enemies, even when they aren't near us by maze distance. Obviously, it would suit us to use maze distance instead of manhattan distance. But we also want to be willing to approach enemies aggressively when we have our stun, and retreat intelligently without it. To that end, let's add a couple more features:

  1. The number of enemies on the shortest path to the goal.
  2. The shortest distance to an enemy from a particular square (only up to 5)

We'll impose a penalty for close enemies if we don't have our stun; otherwise we'll ignore that feature. Then we'll also impose a penalty for each enemy on our shortest path. This will make us more willing to use the stun, rather than waiting.

Enemies In The Way

Our first order of business will be to determine how many enemies lie on our shortest path. We'll filter the path itself based on membership in the active enemies set:

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    activeEnemyLocations = …

    shortestPath =
      getShortestPath (worldBoundaries w) playerLoc goalLoc

    enemiesOnPath = length $ filter
      (\l -> Set.member l (Set.fromList activeEnemyLocations))
      shortestPath

Then we'll assign each enemy on this path a penalty greater than the value of using the stun. We'll add this score to our other scores.

evaluateWorld :: World -> Float
evaluateWorld w =
  enemiesOnPathScore +
  ...
  where
    enemiesOnPath = ...
    enemiesOnPathScore = -85.0 * (fromIntegral enemiesOnPath)

Maze Distance

Next, let's get the shortest maze distance to a nearby enemy. We'll actually want to generalize the behavior of our existing BFS function for this. We want to find the shortest path to any one of the enemy locations. So instead of supplying a single target location, we'll supply a set of target locations. Then we'll cap the distance to search so we aren't doing a full BFS of the maze every time; this gives us an optional range parameter. Let's use these ideas to make an expanded API that our original function will use.

getShortestPathToTargetsWithLimit
  :: Maze
  -> Location
  -> Set.Set Location
  -> Maybe Int
  -> [Location]
getShortestPathToTargetsWithLimit
  maze initialLocation targetLocations maxRange = ...

-- Original function call!
getShortestPath maze initialLocation targetLocation =
  getShortestPathToTargetsWithLimit maze initialLocation
    (Set.singleton targetLocation) Nothing

bfs
  :: Maze
  -> Location
  -> Set.Set Location -- Now a set of targets
  -> Maybe Int -- Added range parameter
  -> [Location]
bfs = ...

We'll have to make a few tweaks to our algorithm now. Each search state element will have a "distance" associated with it.

data BFSState = BFSState
  { bfsSearchQueue :: Seq.Seq (Location, Int)
  ...



-- Our initial state has a distance of 0
getShortestPathToTargetsWithLimit
  maze initialLocation targetLocations maxRange =
    evalState
      (bfs maze initialLocation targetLocations maxRange)
      (BFSState
        (Seq.singleton (initialLocation, 0))
        (Set.singleton initialLocation)
        Map.empty)

Now we need a couple modifications to the core bfs function. When extracting the next element in the queue, we have to consider its distance. All new items we create will increment that distance. And if we're at the max distance, we won't add anything to the queue. Finally, when evaluating if we're done, we'll check against the set of targets, rather than a single target. Here's our bfs code, with differences noted.

bfs
  :: Maze
  -> Location
  -> Set.Set Location
  -> Maybe Int
  -> State BFSState [Location]
bfs maze initialLocation targetLocations maxRange = do
  BFSState searchQueue visitedSet parentsMap <- get
  if Seq.null searchQueue
    then return []
    else do

      -- ! Unwrap distance as well
      let (nextLoc, distance) = Seq.index searchQueue 0

      -- ! Check set membership, not equality
      if Set.member nextLoc targetLocations
        then return (unwindPath parentsMap [nextLoc])
        else do

          -- ! Add the new distance to each adjacent cell
          let adjacentCells = (, distance + 1) <$> 
                getAdjacentLocations maze nextLoc


          -- ! Account for the distance with a new helper function
          let unvisitedNextCells = filter
                (shouldAddNextCell visitedSet)
                adjacentCells

          let newSearchQueue = foldr
                (flip (Seq.|>))
                (Seq.drop 1 searchQueue) 
                unvisitedNextCells
              newVisitedSet = Set.insert nextLoc visitedSet
              newParentsMap = foldr
                (\(l, _) -> Map.insert l nextLoc)
                parentsMap unvisitedNextCells
          put (BFSState newSearchQueue newVisitedSet newParentsMap)
          bfs maze initialLocation targetLocations maxRange
  where
    -- ! Helper function to account for distance when adding to queue
    shouldAddNextCell visitedSet (loc, distance) = case maxRange of
      Nothing -> not (Set.member loc visitedSet)
      Just x -> distance <= x && not (Set.member loc visitedSet)

    unwindPath parentsMap currentPath = ...

Now, to use this feature, we'll make our new shortest path call. If the distance is "0", this means there are no enemies near us, and there's no penalty. We also won't apply a penalty if our stun is available. Otherwise, we'll apply a stiffer penalty the shorter the path is. Then we mix it in with the other scores.

evaluateWorld :: World -> Float
evaluateWorld w =
  ...
  nearestEnemyDistanceScore +
  ...
  where
    ...
    nearestEnemyDistance = length $ getShortestPathToTargetsWithLimit
      (worldBoundaries w)
      playerLoc
      (Set.fromList activeEnemyLocations)
      (Just 4)
    nearestEnemyDistanceScore =
      if nearestEnemyDistance == 0 || stunAvailable then 0.0
        else -100.0 * (fromIntegral (5 - nearestEnemyDistance))

We'll also drop the enemy manhattan distance weight to -5.0.

Results

From this change, our player suddenly appears much more intelligent! It will back away from enemies when it is missing its stun. It will use the stun and go past the enemy when appropriate.

There are still ways we could improve the AI. It doesn't account for future space to retreat when running away. It sometimes uses the stun too early, when it might be better to wait for more enemies to come into range. But it's not clear how we could improve it by tweaking the weights. This means it's time to consider machine learning as an option to get better weights!

Conclusion

Next week, we'll re-acquaint ourselves with the basics of machine learning and Tensor Flow. This will set us up to write a program that will determine our AI weights.

We're going to start working with Tensor Flow next week! To make sure you can keep up, download our Haskell Tensor Flow Guide. It'll help you with the basics of making this complex Haskell library work.

James Bowen

Building a Better Brain

brain_cache.jpg

In the last few weeks, we've focused a lot on the player AI for our game. We've used a few more advanced tricks to help our player navigate the maze using drills. But that's come at a performance cost. The game can now get a little choppy when there are a lot of enemies, or when our player is far away from the goal. It also takes longer to run our analysis iterations than we would like.

This week, we'll improve the performance of our AI by caching the determined path. Lots of our shortest path calculations get repeated. We can keep track of these, and avoid running the BFS algorithm altogether in a lot of circumstances!

This week, you should take a look at the search-caching branch on our Github repository for the complete code we're implementing here. We'll focus on changes in the MazeUtils.hs file.

We're also going to do a little bit of profiling for this article. Profiling your code is an important skill to learn about if you ever want to use Haskell in production. For some other useful skills, check out our Production Checklist!

Profiling Our Code

As alluded to above, we have a pretty good idea of where the performance bottleneck is for our code. But it always pays to be sure. So to double check, we're going to run our code under profiling. We'll go through some of the basics here, but you should also check out this article we did on profiling a while back.

We'll get a readout for our code that will tell us which functions are taking the most time. This will tell us where we can make the most effective improvements. It will also give us a concrete way to prove our improvement later.

To start, we'll need to rebuild our code with stack build --profile. Be warned this can take a while, since all the libraries also need to be re-built. Then we can re-run the analysis program we used last week:

stack exec -- analyze-game maze_save_2 --enemies +RTS -p

Here's the abbreviated readout in the file analyze-game.EXE.prof:

total time = 32.62 secs

COST CENTRE                                  %time
drillBFS.newParentsMap.\                     21.9
drillBFS.unvisitedNextItems.\                21.7
drillBFS.newVisitedSet                       19.4
getDrillAdjacentItems                        6.2
drillBFS                                     4.5
drillBFS.newSearchQueue                      4.0
getDrillAdjacentItems.mkItemFromResult       3.0
bfs.newParentsMap.\                          2.1
bfs.newVisitedSet                            2.0
getDrillAdjacentItems.mkItemFromResult.(...) 1.7
drillBFS.unvisitedNextItems                  1.4
bfs.unvisitedNextCells.\                     1.1
drillBFS.newParentsMap                       1.0
getDrillAdjacentItems.bounds                 1.0
bfs                                          0.6
getAdjacentLocations                         0.5

Unsurprisingly, we see that drillBFS and its helpers are the biggest culprits. They account for the top seven entries on the list and a whopping 82% of the time we spend. The enemy AI calculations come in a distant second at around 6.3% of the time. So let's focus on fixing the player algorithm.

A Basic Cache for the Player

As we try to improve our player AI, there's one big observation we can make. Perhaps some of you already noted this when reading about that AI in the first place. For the most part, our player follows a single path the whole time. We calculate the complete path from start to finish on each player move cycle, but then throw most of it away. The only time we get "blown off" this path is when we have to run away from enemies.

There are only a few circumstances where we change this path! So let's make a PlayerMemory type that will keep track of it. This should save us a ton of time!

newtype PlayerMemory = PlayerMemory (Maybe [Location])

data Player = Player
  { …
  , playerMemory :: PlayerMemory
  }

We'll add this memory to our player type. When we initialize it from JSON instances, it should start out empty. There's no need to keep track of this in a save-game file.
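
As a rough sketch (the real Player type has more fields than shown here, and the JSON keys are stand-ins), the parsing code can simply default the new field instead of reading it from the file:

import Data.Aeson (FromJSON(..), withObject, (.:))

instance FromJSON Player where
  parseJSON = withObject "Player" $ \o -> do
    location  <- o .: "location"
    stunDelay <- o .: "currentStunDelay"
    drills    <- o .: "drillsRemaining"
    -- The memory never comes from the save file; it always starts empty.
    return $ Player location stunDelay drills (PlayerMemory Nothing)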

This change will complicate our move API a little bit. It will now produce the PlayerMemory as an output:

makePlayerMove :: World -> (PlayerMove, PlayerMemory)

Using Our Memory

When it comes to making our move, we first need to put the path into memory. To start, we'll make a PlayerMemory out of the path we get from BFS.

makePlayerMove :: World -> (PlayerMove, PlayerMemory)
makePlayerMove w =
  ( PlayerMove finalMoveDirection useStun drillDirection
  , ...
  )
  where
    shortestPath = getShortestPathWithDrills …
    memoryFromMove = PlayerMemory (Just shortestPath)
    ...

In general, we'll want to return this "memory". But there's one circumstance where we'll want to invalidate it. When we have to retreat from our enemies, we'll diverge from this ideal path. In this case, we'll return Nothing. Here's what that logic looks like:

makePlayerMove :: World -> (PlayerMove, PlayerMemory)
makePlayerMove w =
  ( PlayerMove finalMoveDirection useStun drillDirection
  , if emptyCache then (PlayerMemory Nothing) else memoryFromMove
  )
  where
    (finalMoveDirection, useStun, emptyCache) = if not enemyClose
      then (shortestPathMoveDirection, False, False)
      else if canStun
        then (shortestPathMoveDirection, True, False)
        else case find (/= shortestPathMoveLocation) possibleMoves of
          Nothing -> (DirectionNone, False, True)
          Just l -> (getMoveDirection playerLoc l, False, True)

Now let's consider when we use the cached information, as this will let us skip the BFS call altogether! We'll add one more validity check when doing this. We'll ensure that the list is non-empty and that our current location is at the head of the list. Then we can use the tail of the memory list in place of the shortest path call!

makePlayerMove :: World -> (PlayerMove, PlayerMemory)
makePlayerMove w = ...
  where
    (useCache, cachePath) = case playerMemory currentPlayer of
      (PlayerMemory (Just (first : rest))) ->
        (first == playerLoc, rest)
      _ -> (False, [])
    shortestPath = if useCache then cachePath
      else getShortestPathWithDrills ...

The last thing we need is to ensure that the cache goes back into memory. This is a simple modification of our function for making the player move:

modifyWorldForPlayerMove :: World -> Location -> PlayerMemory -> World
modifyWorldForPlayerMove w newLoc memory = ...
  where
    currentPlayer = worldPlayer w
    playerWithMemory = currentPlayer {playerMemory = memory}
    playerAfterMove = movePlayer newLoc playerWithMemory
    ...

Now we can run our analysis again. We'll see that our Player's AI functions are still the biggest contributor. But the percentage has gone down a lot. They now only take up around 55% of our total time, instead of 82%! Meanwhile, the percentage of time from the normal BFS functions is now up to around 35%. Most importantly, the total time for the analysis declined five-fold. On the first run, it was 32.62 seconds, and it now only takes 6.79 seconds, a huge improvement!

total time = 6.79 secs

COST CENTRE                                  %time
drillBFS.unvisitedNextItems.\                14.3
drillBFS.newParentsMap.\                     14.2
drillBFS.newVisitedSet                       12.6
bfs.newParentsMap.\                          9.9
bfs.newVisitedSet                            9.2
bfs.unvisitedNextCells.\                     5.7
getDrillAdjacentItems                        4.3
drillBFS.newSearchQueue                      2.8
getAdjacentLocations                         2.8
drillBFS                                     2.6
bfs                                          2.6
getDrillAdjacentItems.mkItemFromResult       2.0
bfs.newSearchQueue                           1.8
getDrillAdjacentItems.mkItemFromResult.(...) 1.1
bfs.unwindPath                               1.1
bfs.unvisitedNextCells                       1.0
drillBFS.unvisitedNextItems                  0.9
bfs.newParentsMap                            0.7

Conclusion

Profiling is an important tool we can use for improving our code, no matter what language we're working in. When our program isn't performing how we like, we have to be sure to address the right parts of it. It may have been tempting to make a different assumption from the start. Since there are many enemy characters, it would be natural to tackle that algorithm first. But our profiling output made it clear that the player AI was the problem.

Next week, we'll start exploring different AI concepts. We'll start moving towards a kind of AI that can be machine-learned. Our code will be simpler, but our product won't be as good, at least at the start! But we'll start getting used to the way an AI can evaluate positions.

For more useful resources in improving your Haskell skills, download our Production Checklist! It has a lot of different tools and libraries to check out!

James Bowen

Moving Towards ML: Evaluation Functions

decision_tree.png

Before we get started, here's a reminder that today (August 5th) is the last day of enrollment for our Haskell From Scratch course! Sign-ups close at midnight Pacific time! Don't miss out!

This week, we're going to start taking our AI in a somewhat new direction. Right now, we're hard-coding specific decisions for our player to make. But this week, we'll make a more general function for evaluating different positions. Our initial results will be inferior to the AI we've hand-coded. But we'll set ourselves up to have a much better AI in the future by applying machine learning.

For more details on the code for this article, take a look at the evaluation-game-function branch on our Github Repository! This article also starts our move towards machine learning related concepts. So now would be a good time to review our Haskell AI Series. You can download our Tensor Flow Guide to learn more about using Haskell and Tensor Flow!

Evaluation as a Strategy

Currently, our AI follows a strict set of rules. It performs pretty well for the current problem space. But suppose circumstances changed. Suppose we use different maze structures. Or we could add a completely new feature to the game. In these cases, we might need a completely different set of ideas to build a competent AI.

Our new strategy will be much more general. We'll supply our AI with a function that can evaluate a particular board position. That is, it will look at the world, and create a numeric output scoring it. Then our brain will look at all possible moves, score each position, and choose the move with the best result.

If game rules change, we'll need to rethink the evaluation function. But, by making the problem one of numbers to numbers, it'll be easier to use machine learning (instead of our own logic) to devise this function. This way, we can radically change the nature of the game, and we won't need to do too much manual work to change the AI. We might need to add new features (as we'll discuss later). But otherwise we would just need to re-train the evaluation function.

Top Down Development

To implement this approach, we'll put the "function" in functional programming. We'll start by outlining our decision making process with a series of type signatures. Let's remember that first, our overarching goal is a function that takes a World and gives us a PlayerMove:

makePlayerMove :: World -> PlayerMove

We should first determine the set of possible moves:

possibleMoves :: World -> [PlayerMove]

Then we'll need to calculate the new World from each of those moves. (We won't go over this function in this article. It mainly consists of refactoring code we already have for manipulating the game).

applyPlayerMove :: World -> PlayerMove -> World

Then we'll score each of those resulting worlds. This is where the real "brain" is going to live now:

evaluateWorld :: World -> Float

Now that we know the functions we're writing, we can already implement makePlayerMove. We'll assume our helpers already exist and then we apply the process outlined above:

makePlayerMove :: World -> PlayerMove
makePlayerMove w = bestMove
  where
    -- 1. Get our Moves
    allMoves = possibleMoves w

    -- 2. See what the results of each move are
    possibleWorlds = applyPlayerMove w <$> allMoves

    -- 3. Score each resulting world
    scores = evaluateWorld <$> possibleWorlds

    -- 4. Combine the world with its move and choose the best one
    movesWithScores = zip allMoves scores
    bestMove = fst $ maximumBy (\(_, score1) (_, score2) ->
      compare score1 score2) movesWithScores

This will compile, and we can now move on to the individual components.

Getting Possible Moves

Let's start with getting all the possible moves. When it comes to movement, we generally have five options: stand still, or move in one of four directions. But if we're out of drills, or near the boundary of the world, this can restrict our options. But we always have the sure option of standing still, so let's start with that:

possibleMoves :: World -> [PlayerMove]
possibleMoves w = …
  where
    standStillMove = PlayerMove DirectionNone False DirectionNone
    ...

Now in every direction, we'll have a Maybe move possibility. If it's a WorldBoundary, we'll get Nothing. Otherwise if it's a wall, then we'll have a possible move as long as a drill is available. Otherwise the move is possible, and we won't need a drill. We'll wrap these behaviors in a helper function, and then it's easy to use that in each direction:

possibleMoves :: World -> [PlayerMove]
possibleMoves w = baseMoves
  where
    standStillMove = PlayerMove DirectionNone False DirectionNone
    player = worldPlayer w
    bounds = (worldBoundaries w) Array.! (playerLocation player)

    possibleMove :: (CellBoundaries -> BoundaryType) ->
      MoveDirection -> Maybe PlayerMove
    possibleMove boundaryFunc direction =
      case boundaryFunc bounds of
        WorldBoundary -> Nothing
        Wall _ -> if playerDrillsRemaining player > 0
          then Just $ PlayerMove direction False direction
          else Nothing
        AdjacentCell _ -> Just $
          PlayerMove direction False DirectionNone

    upMove = possibleMove upBoundary DirectionUp
    rightMove = possibleMove rightBoundary DirectionRight
    downMove = possibleMove downBoundary DirectionDown
    leftMove = possibleMove leftBoundary DirectionLeft

    baseMoves = standStillMove : (catMaybes [upMove, rightMove, downMove, leftMove])

Now we have to factor in that each move can also apply the stun if it's available.

possibleMoves :: World -> [PlayerMove]
possibleMoves w = baseMoves ++ stunMoves
  where
    ...
    baseMoves = standStillMove : (catMaybes [upMove, rightMove, downMove, leftMove])

    stunMoves = if playerCurrentStunDelay player /= 0 then []
      else [ m { activateStun = True } | m <- baseMoves ]

And now we've got our moves!

Evaluating the Game Position

Now let's start tackling the problem of evaluating a particular game situation. Any manual solution we come up with here is likely to have problems. This is where machine learning will come in. But here's the general approach we want.

First, we'll select particular "features" of the world. For instance, how far away are we from the end of the maze? How many enemies are within our stun radius? We'll consider all these elements, and then come up with a "weight" for each feature. A weight is a measurement of whether that feature makes the position "good" or "bad". Then, we'll add together the weighted feature values to get a score. So here's a list of the features we're going to use:

  1. How close are we (in maze search terms) from the target location? This will use pure BFS and it will not account for using drills.
  2. How close are we in manhattan distance terms from the target location?
  3. Is there an active enemy on the same square as the player (this will receive a heavy negative weight!)
  4. How many enemies are within our stun radius?
  5. Is our stun available?
  6. How many drills do we have left?

Let's start by getting all these features:

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    player = worldPlayer w
    playerLoc@(px, py) = playerLocation player
    radius = stunRadius . playerGameParameters . worldParameters $ w
    goalLoc@(gx, gy) = endLocation w
    activeEnemyLocations = enemyLocation <$>
      (filter (\e -> enemyCurrentStunTimer e == 0) (worldEnemies w))

    onActiveEnemy = playerLocation player `elem` activeEnemyLocations

    shortestPathLength = length $
      getShortestPath (worldBoundaries w) playerLoc goalLoc

    manhattanDistance = abs (gx - px) + abs (gy - py)

    stunAvailable = playerCurrentStunDelay player == 0

    numNearbyEnemies = length
      [ el | el@(elx, ely) <- activeEnemyLocations,
        abs (elx - px) <= radius && abs (ely - py) <= radius ]

    drillsRemaining = playerDrillsRemaining player

Now let's move on to assigning scores. If our player is on the same square as an active enemy, we lose. So let's give this a weight of -1000. Conversely, the closer we get to the target, the closer we are to winning. So let's devise a function where if that distance is 0, the score is 1000. Then the farther away we get, the more points we lose. Let's say, 20 points per square. For manhattan distance, we'll use a strict penalty, rather than reward:

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    ...
    onActiveEnemyScore = if onActiveEnemy then -1000.0 else 0.0
    shortestPathScore = 1000.0 - (20.0 * (fromIntegral shortestPathLength))
    manhattanDistanceScore = (-5.0) * (fromIntegral manhattanDistance)

Now we want to generally reward having our power ups available to us. This will stop the bot from needlessly using them and also reward it for picking up new drills. We'll also penalize having enemies too close to us.

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    ...
    stunAvailableScore = if stunAvailable then 80.0 else 0.0
    numNearbyEnemiesScore = -100.0 * (fromIntegral numNearbyEnemies)
    drillsRemainingScore = 30.0 * (fromIntegral drillsRemaining)

And to complete the function, we'll just add these together:

evaluateWorld :: World -> Float
evaluateWorld w =
  onActiveEnemyScore +
  shortestPathScore +
  manhattanDistanceScore +
  stunAvailableScore +
  numNearbyEnemiesScore +
  drillsRemainingScore

How Well Does it Work?

When we run the game now with the AI active, we see some interesting behaviors. Our bot will generally navigate the maze well. Its path isn't optimal, the way it is with drillBFS. But it makes decent choices about drilling. Its behavior around enemies is a bit strange. It tends to stay away from them, even if they're not actually close in maze distance. This makes it take longer than it needs to.

We still don't have good retreating behavior in certain cases. It will often stand still and let an enemy grab it instead of running away.

At this point, we have a couple options for improving the AI. First, we could try tweaking the weights. This will be tedious for us to do manually. This is why we want to apply machine learning techniques to come up with optimal weights.

But the other option is to update the feature space. If we can come up with more intelligent features, we won't need as precise weights.

Conclusion

Next week, we'll try to fix our behavior around enemies. We'll use true maze distance in more places as opposed to manhattan distance. This should give us some big improvements. Then we'll start looking into how we can learn better weights.

We'll be coming up pretty soon on using Tensor Flow for this program! Download our Haskell Tensor Flow Guide to learn more!

And if you're still a Haskell beginner, there's never been a better time to learn! Register for our Haskell From Scratch course to jump-start your Haskell journey! Enrollment ends at midnight TODAY! (August 5th).

James Bowen

Haskell From Scratch Re-Opened!

newlogo3transparent.png

This week we're taking a break from our Gloss/AI series to make a special announcement! Haskell from Scratch, our beginners course, is now re-opened for enrollment! We've added some more content since the last time we offered it. The biggest addition is a mini-project to help you practice your new skills!

Enrollment will only be open for another week, so don't wait! Next Monday, August 5th, will be the last day to sign up! Enrollments will close at midnight. Once you sign up for the course, you'll have permanent access to the course material. This includes any new content we add in the future. So even if you don't have the time now, it's still a good idea to sign up!

I also want to take this opportunity to tell a little bit of the story of how I learned Haskell. I want to share the mistakes I made, since those motivated me to make this course.

My History with Haskell

I first learned Haskell in college as part of a course on programming language theory. I admired the elegance of a few things in particular. I liked how lists and tuples worked well with the type system. I also appreciated the elegance of Haskell's type definitions. No other language I had seen represented the idea of sum types so well. I also saw how useful pattern matching and recursion were. They made it very easy to break problems down into manageable parts.

After college, I had the idea for a code generation project. A college assignment had taught me some useful Haskell libraries for the task. So I got to work writing some Haskell. At first things were quite haphazard. Eventually though, I developed some semblance of test driven development and product organization.

About nine months into that project, I had the great fortune of landing a Haskell project at my day job. As I ramped up on this project, I saw how deficient my knowledge was in a lot of areas. I realized then a lot of the mistakes I had been making while learning the language. This motivated me to start the Monday Morning Haskell blog.

Main Advice

Of course, I've tried to incorporate my learnings throughout the material on this blog. But if I had to distill the key ideas, here's what they'd be.

First, learn tools and project organization early! Learn how to use Stack and/or Cabal! For help with this, you can check out our free Stack mini-course! After several months on my side project, I had to start from scratch to some extent. The only "testing" I was doing was running some manual executables and commands in GHCI. So once I learned more about these tools, I had to re-work a lot of code.

Second, it helps a lot to have some kind of structure when you're first learning the language. Working on a project is nice, but there are a lot of unknown-unknowns out there. You'll often find a "solution" for your problem, only to see that you need a lot more knowledge to implement it. You need to have a solid foundation on the core concepts before you can dive in on anything. So look for a source that provides some kind of structure to your Haskell learning, like a book (or an online course!).

Third, let's get to monads. They're an important key to Haskell and widely misunderstood. But there are a couple things that will help a lot. First, learn the syntactic patterns of do-syntax. Second, learn how to use run functions (runState, runReaderT, etc.). These are how you bring monadic expressions into the rest of your code. You can check out our Monads Series for some help on these ideas. (And of course, you'll learn all about monads in Haskell From Scratch!)
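
To illustrate that second point, here's about the simplest run function example there is. runState takes a stateful computation and an initial state, and hands back an ordinary pair that the rest of your code can use:

import Control.Monad.State (runState, modify, get)

counterExample :: (Int, Int)
counterExample = runState (modify (+1) >> get) 0
-- (1, 1): the result of `get` paired with the final state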

Finally, ask for help earlier! I still don't plug into the Haskell network as much as I should. There are a lot of folks out there who are more than willing to help. Freenode is a great place, as is Reddit and even Twitter!

Conclusion

There's never been a better time to start learning Haskell! The language tools have developed a ton in the last few years and the community is growing stronger. And of course, we've once again opened up our Haskell From Scratch Beginners Course! You don't need any Haskell experience to take this course. So if you always wanted to learn more about Haskell but needed more organization, this is your chance!

If you want to stay up to date with the latest at Monday Morning Haskell, make sure to Subscribe to our mailing list! You'll hear the latest about upcoming articles, as well as any new course offerings. You'll also get access to our Subscriber Resources.

James Bowen

Analyzing Our Parameters

analysis.jpg

Our last couple articles have focused on developing an AI for the player character in our game. It isn't perfect, but it's a decent approximation of how a human would try to play the game. This means we can now play iterations of the game without any human involvement. And by changing the parameters of our world, we can play a lot of different versions of the game.

Our goal for this week will be to write some simple analysis functions. These will play through the game without needing to display anything on the screen. Then we'll be able to play different versions in quick succession and compare results.

As always, the code for this project is on a Github Repository. For this article, take a look at the analyze-game branch.

If you're completely new to Haskell, a simple game like this is a great way to get started! But you should start with our Beginners Checklist! It'll help you get everything set up with the language on your local machine! Then you can move onto our Liftoff Series to learn more about Haskell's mechanics.

Generating a Result

The first thing we need is a function that takes a world state and generates a result for it. Our game does have a degree of randomness. But once we fix the starting random generator, everything is deterministic. This means we need a function like:

runGameToResult :: World -> GameResult

We'll want to use our updateFunc from the main game runner. This is our "evolution" function. Its job is to go from one World state to another. It evolves the game over the course of one timestep by allowing each of the agents to make a decision (or wait). (Note we don't use the Float parameter in our game. It's just needed by Gloss.)

updateFunc :: Float -> World -> World

Since we want to track an ever evolving stateful variable, we'll use the State monad. For each iteration, we'll change the world using this update step. Then we'll check its result and see if it's finished. If not, we'll continue to run the game.

runGameToResult :: World -> GameResult
runGameToResult = evalState runGameState
  where
    runGameState :: State World GameResult
    runGameState = do
      modify (updateFunc 1.0)
      currentResult <- gets worldResult
      if currentResult /= GameInProgress
        then return currentResult
        else runGameState

Analysis: Generating World Iterations

Now that we can run a given world to its conclusion, let's add another step to the process. We'll run several different iterations with any given set of parameters on a world. Each of these will have a different set of starting enemy locations and drill power-ups. Let's make a function that will take a random generator and a "base world". It will derive a new world with random initial enemy positions and drill locations.

generateWorldIteration :: World -> StdGen -> World

We'll use a helper function from our game that generates random locations in our maze. It's stateful over the random generator.

generateRandomLocation :: (Int, Int) -> State StdGen Location
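
The game code already provides this helper, so we won't re-derive it here, but a sketch of how it might work (assuming Location is an (x, y) pair of Ints) looks like this:

import Control.Monad.State (State, state)
import System.Random (StdGen, randomR)

-- Assumption: this matches the game's Location type.
type Location = (Int, Int)

-- Pick a random column and row within the maze bounds, threading the
-- random generator through the State monad.
generateRandomLocation :: (Int, Int) -> State StdGen Location
generateRandomLocation (numRows, numColumns) = do
  x <- state (randomR (0, numColumns - 1))
  y <- state (randomR (0, numRows - 1))
  return (x, y)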

So first let's get all our locations:

generateWorldIteration :: World -> StdGen -> World
generateWorldIteration w gen1 = ...
  where
    params = worldParameters w
    rowCount = numRows params
    columnCount = numColumns params
    enemyCount = numEnemies params
    drillCount = numDrillPowerups params

    (enemyLocations, gen2) = runState
      (sequence
        (map
          (const (generateRandomLocation (rowCount, columnCount)))
          [1..enemyCount])
        )
      gen1
    (drillLocations, gen3) = runState
      (sequence
        (map
          (const (generateRandomLocation (rowCount, columnCount)))
          [1..drillCount])
        )
      gen2
    ...

Then we have to use the locations to generate our different enemies. Last, we'll plug all these new elements into our base world and return it!

generateWorldIteration :: World -> StdGen -> World
generateWorldIteration w gen1 = w
  { worldEnemies = enemies
  , worldDrillPowerUpLocations = drillLocations
  , worldRandomGenerator = gen3
  , worldTime = 0
  }
  where
    ...
    (enemyLocations, gen2) = ...
    (drillLocations, gen3) = …
    enemies = mkNewEnemy (enemyGameParameters params) <$> enemyLocations

Analysis: Making Parameter Sets

For our next order of business, we want to make what we'll call a parameter set. We want to run the game with different parameters each time. For instance, we can take a base set of parameters, and then change the number of enemies present in each one:

varyNumEnemies :: GameParameters -> [GameParameters]
varyNumEnemies baseParams = newParams <$> allEnemyNumbers
  where
    baseNumEnemies = numEnemies baseParams
    allEnemyNumbers = [baseNumEnemies..(baseNumEnemies + 9)]
    newParams i = baseParams { numEnemies = i }

We can do the same for the number of drill pickups:

varyNumDrillPickups :: GameParameters -> [GameParameters]
varyNumDrillPickups baseParams = newParams <$> allDrillNumbers
  where
    baseNumDrills = numDrillPowerups baseParams
    allDrillNumbers = [baseNumDrills..(baseNumDrills + 9)]
    newParams i = baseParams { numDrillPowerups = i }

Finally, we can have a different cooldown time for our player's stun ability.

varyPlayerStunCooldown :: GameParameters -> [GameParameters]
varyPlayerStunCooldown baseParams = newParams <$> allCooldowns
  where
    basePlayerParams = playerGameParameters baseParams
    baseCooldown = initialStunTimer basePlayerParams
    allCooldowns = [(baseCooldown - 4)..(baseCooldown + 5)]
    newParams i = baseParams
      { playerGameParameters = basePlayerParams { initialStunTimer = i }}

If you fork our code, you can try altering some other parameters. You can even try combining certain parameters to see what the results are!
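
For example, here's one simple way (not part of the branch itself) to combine two variations, producing a parameter set for every pairing of values:

-- Apply a second variation to every parameter set produced by the first.
combineVariations
  :: (GameParameters -> [GameParameters])
  -> (GameParameters -> [GameParameters])
  -> (GameParameters -> [GameParameters])
combineVariations vary1 vary2 base = concatMap vary2 (vary1 base)

-- We could then pass something like
-- (combineVariations varyNumEnemies varyNumDrillPickups)
-- as the generator to the runner in the next section.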

Tying It Together

We've done most of the hard work now. We'll have a function that takes a number of iterations per parameter set, the base world, and a generator for those sets. It'll match up each parameter set to the number of wins the player gets over the course of the iterations.

runAllIterations
  :: Int
  -> World
  -> (GameParameters -> [GameParameters])
  -> [(GameParameters, Int)]
runAllIterations numIterations w paramGenerator =
  map countWins results
  where
    aiParams = (worldParameters w) { usePlayerAI = True }
    paramSets = paramGenerator aiParams

    runParamSet :: GameParameters -> [GameResult]
    runParamSet ps = map
      (runGame w {worldParameters = ps })
      [1..numIterations]

    runGame :: World -> Int -> GameResult
    runGame baseWorld seed = runGameToResult
      (generateWorldIteration baseWorld (mkStdGen seed))

    results :: [(GameParameters, [GameResult])]
    results = zip paramSets (map runParamSet paramSets)

    countWins :: (GameParameters, [GameResult]) -> (GameParameters, Int)
    countWins (gp, gameResults) =
      (gp, length (filter (== GameWon) gameResults))

We need one more function. It will read an input file and apply our steps over a particular parameter group. Here's an example with varying the number of enemies:

analyzeNumEnemies :: FilePath -> IO ()
analyzeNumEnemies fp = do
  world <- loadWorldFromFile fp
  let numIterations = 10
  putStrLn "Analyzing Different Numbers of Enemies"
  let results = runAllIterations numIterations world varyNumEnemies
  forM_ results $ \(gp, numWins) -> putStrLn $
    "With " ++ (show (numEnemies gp)) ++ " Enemies: " ++ (show numWins)
      ++ " wins out of " ++ (show numIterations) ++ " iterations."

Now we're done! In the appendix, you can find some basic results of our investigation!

Conclusion

Soon, we'll take our analysis steps and apply them in a more systematic way. We'll try to gauge the difficulty of a particular game level. Then we can make levels that get more and more challenging!

But first, we'll start exploring a few ways we can improve the player and enemy AI abilities. We'll start by implementing some basic caching mechanisms in our breadth first search. Then we'll consider some other AI patterns besides simple BFS.

For a review of the code in this article, take a look at our Github Repository. You'll want to explore the analyze-game branch!

We'll soon be exploring machine learning a bit more as we try to improve the game. Make sure to read our series on Haskell and AI to learn more! Download our Haskell Tensorflow Guide to see how we can use TensorFlow with Haskell!

Appendix

With 4 drills and 10 cooldown time:

Analyzing Different Numbers of Enemies
With 4 Enemies: 10 wins out of 10 iterations.
With 5 Enemies: 9 wins out of 10 iterations.
With 6 Enemies: 9 wins out of 10 iterations.
With 7 Enemies: 10 wins out of 10 iterations.
With 8 Enemies: 9 wins out of 10 iterations.
With 9 Enemies: 9 wins out of 10 iterations.
With 10 Enemies: 9 wins out of 10 iterations.
With 11 Enemies: 9 wins out of 10 iterations.
With 12 Enemies: 8 wins out of 10 iterations.
With 13 Enemies: 7 wins out of 10 iterations.

With 13 enemies and 10 cooldown time:

With 2 Drills: 5 wins out of 10 iterations.
With 3 Drills: 7 wins out of 10 iterations.
With 4 Drills: 8 wins out of 10 iterations.
With 5 Drills: 8 wins out of 10 iterations.
With 6 Drills: 8 wins out of 10 iterations.
With 7 Drills: 7 wins out of 10 iterations.
With 8 Drills: 8 wins out of 10 iterations.
With 9 Drills: 8 wins out of 10 iterations.
With 10 Drills: 8 wins out of 10 iterations.
With 11 Drills: 8 wins out of 10 iterations.

Advanced Search with Drilling!

drill_2.png

In last week's article we explored how we can make an AI for our main player character. This meant we could play the game without input from a user. The game can now "play itself", and churn out a lot of iterations and results. This, in turn, will let us test combinations of parameters so we can make levels that are challenging.

Our AI is still a little too simplistic. The version we made last week doesn't incorporate the drill feature at all. So this week, let's see if we can devise a way to use that. We'll start with some of the search algorithm ideas we're already using for BFS, and expand from there.

For this article, you'll want to look at the player-ai-drill branch on our Github Repository. It has the full implementation, and you can check the newest commits to see what has changed.

This article will depend a lot on our knowledge of monads, particularly the state monad. If you're newer to Haskell development, you should check out our series on Functional Data Structures. It'll help you understand this tricky concept better.

Updating Our Types

Once again, there are a few quick updates we'll want to make before we start writing our AI. Last week we wrote our main function using this type:

data MoveChoice =
  MoveUp |
  MoveRight |
  MoveDown |
  MoveLeft |
  StandStill

data PlayerMove = PlayerMove
  { playerMoveChoice :: MoveChoice
  , activateStun :: Bool
  }

makePlayerMove :: World -> PlayerMove
...

First we'll rename the original type to MoveDirection. Then, to account for the drill power-up, we'll add another field of that same type to PlayerMove:

data MoveDirection =
  DirectionUp |
  DirectionRight |
  DirectionDown |
  DirectionLeft |
  DirectionNone

data PlayerMove = PlayerMove
  { playerMoveDirection :: MoveDirection
  , activateStun :: Bool
  , drillDirection :: MoveDirection
  }

So we can start out by using DirectionNone for the drill every time, and ensure our game works.
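Since the constructors were renamed, the helper that converts two locations into a move (getMoveChoice in last week's AI) presumably becomes getMoveDirection, which we'll lean on later in makePlayerMove. A minimal sketch under that assumption:

getMoveDirection :: Location -> Location -> MoveDirection
getMoveDirection (x1, y1) (x2, y2)
  | y2 == y1 + 1 = DirectionUp
  | x2 == x1 + 1 = DirectionRight
  | y2 == y1 - 1 = DirectionDown
  | x2 == x1 - 1 = DirectionLeft
  | otherwise = DirectionNone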

Modifying the World

With the new potential to use the drill to change our world, we'll want another modifier function. This will reuse a lot of old code from our player input section.

modifyWorldForDrill :: World -> MoveDirection -> World

To start, we'll refactor our drillLocation function to be a top level function. Nothing much has to change with its implementation. Let's recall its type signature:

drillLocation
  :: (CellBoundaries -> BoundaryType)
  -> (CellBoundaries -> CellBoundaries)
  -> (CellBoundaries -> CellBoundaries)
  -> World
  -> World

Now our player modifier is pretty straightforward. We'll do a case analysis of the desired direction, as we do with the input keys. This gives us all the necessary inputs to drillLocation:

modifyWorldForPlayerDrill :: World -> MoveDirection -> World
modifyWorldForPlayerDrill w drillDirection = case drillDirection of
  DirectionUp ->
      drillLocation upBoundary breakUpWall breakDownWall w
  DirectionRight ->
      drillLocation rightBoundary breakRightWall breakLeftWall w
  DirectionDown ->
      drillLocation downBoundary breakDownWall breakUpWall w
  DirectionLeft ->
      drillLocation leftBoundary breakLeftWall breakRightWall w
  DirectionNone -> w

And now working this into our update function is easy. It's just another function we have to compose with the other options. We'll apply the drill as the first step, even before the stun:

updateWorldForPlayerMove :: World -> World
updateWorldForPlayerMove w = if shouldMovePlayer
  then worldAfterMove
  else w
  where
    ...
    move = makePlayerMove w

    worldAfterDrill =
        modifyWorldForPlayerDrill w (drillDirection move)

    worldAfterStun = if activateStun move
      then modifyWorldForStun worldAfterDrill
      else worldAfterDrill

    … -- Apply move

Changing the AI

Now let's get cracking on a better algorithm! The good news is that we can still stick to breadth first search. What needs to change is the items in our search queue, because there's now a certain amount of state involved in each path we take. Before, we knew that if a path backtracked onto a previous location, it would be slower. But now, a path that crosses a previous square CAN be faster, if it arrives there with more drills. And each time we pick up a drill, we need to remove it from the map (from the point of view of this path); otherwise we could keep picking up the same drill! Thus a "search item", as we'll call it, must contain not just a location but also the number of drills we have and the set of drill power-ups remaining on the map. This item has all the important world state for our player moving around the map.

type DrillSearchItem = (Location, Word, Set.Set Location)
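-- In order: the path's current location, the number of drills the player
-- would be holding there, and the drill power-ups still uncollected
-- from this path's point of view.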

Then the other obvious change is that we can go to more adjacent squares. As long as the drill count is at least one, we can go to any adjacent tile. We'll see how this factors into our algorithm below.

Re-Doing BFS

We're going to mimic our original BFS algorithm for the most part when it comes to the drilling version. First, let's reconsider our notion of "adjacent" cells. Remember this function from BFS?

getAdjacentLocations :: Maze -> Location -> [Location]

We now want to re-write it using our DrillSearchItem alias.

getDrillAdjacentItems :: Maze -> DrillSearchItem -> [DrillSearchItem]

In the previous version, we had a section dedicated to checking the bounds around the given location. Then we could see which locations were adjacent. We'll want a similar section for drilling, except that locations behind walls also count, as long as we have at least one drill. So let's pattern match on our current item and gather the locations we can reach. We'll also use a boolean to denote whether we need a drill to get to each location.

getDrillAdjacentItems :: Maze -> DrillSearchItem -> [DrillSearchItem]
getDrillAdjacentItems maze (location, drillsRemaining, drillLocs) =
  …
  where
    canDrill = drillsRemaining > 0
    bounds = maze Array.! location

    maybeUpLoc = case upBoundary bounds of
      (AdjacentCell loc) -> Just (loc, False)
      (Wall loc) -> if canDrill then Just (loc, True) else Nothing
      _ -> Nothing
    maybeRightLoc = case rightBoundary bounds of
      (AdjacentCell loc) -> Just (loc, False)
      (Wall loc) -> if canDrill then Just (loc, True) else Nothing
      _ -> Nothing
    maybeDownLoc = case downBoundary bounds of
      (AdjacentCell loc) -> Just (loc, False)
      (Wall loc) -> if canDrill then Just (loc, True) else Nothing
      _ -> Nothing
    maybeLeftLoc = case leftBoundary bounds of
      (AdjacentCell loc) -> Just (loc, False)
      (Wall loc) -> if canDrill then Just (loc, True) else Nothing
      _ -> Nothing

Now we want a helper function to convert each of these results into a new DrillSearchItem. If we applied the drill, we'll want to subtract one from the remaining drills count. But then if we go over a drill powerup, we'll increment the count and remove this location from our set:

getDrillAdjacentItems :: Maze -> DrillSearchItem -> [DrillSearchItem]
getDrillAdjacentItems maze (location, drillsRemaining, drillLocs) =
  ...
  where
    ...
    mkItemFromResult :: (Location, Bool) -> DrillSearchItem
    mkItemFromResult (loc, usedDrill) =
      let drillsAfterMove = if usedDrill
            then drillsRemaining - 1
            else drillsRemaining
          (drillsAfterPickup, newDrillLocs) =
            if Set.member loc drillLocs
              then (drillsAfterMove + 1, Set.delete loc drillLocs)
              else (drillsAfterMove, drillLocs)
      in  (loc, drillsAfterPickup, newDrillLocs)

And then finally we apply this function over every Just result in our adjacent items!

getDrillAdjacentItems :: Maze -> DrillSearchItem -> [DrillSearchItem]
getDrillAdjacentItems maze (location, drillsRemaining, drillLocs) =
  mkItemFromResult <$>
   (catMaybes [maybeUpLoc, maybeRightLoc, maybeDownLoc, maybeLeftLoc])
  where
  ...

When we rewrite the other BFS functions, the changes are quite trivial. We'll write a new function for the BFS search, since it will apply the different helper, along with other surface level tweaks. The state stays the same except for using DrillSearchItems:

data DrillBFSState = DrillBFSState
  (Seq.Seq DrillSearchItem)
  (Set.Set DrillSearchItem)
  (Map.Map DrillSearchItem DrillSearchItem)

drillBFS :: Maze -> Location -> State DrillBFSState [Location]
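The article only gives the signature, but the body should mirror the original BFS almost line for line, just carrying DrillSearchItems. Here's a rough sketch under that assumption (unwindPath is an assumed helper, analogous to the one in the original BFS, that walks the parents map back from the target and projects out the locations):

drillBFS :: Maze -> Location -> State DrillBFSState [Location]
drillBFS maze targetLocation = do
  (DrillBFSState searchQueue visitedItems parentsMap) <- get
  case Seq.viewl searchQueue of
    -- Nothing left to search: no path exists.
    Seq.EmptyL -> return []
    (currentItem@(currentLoc, _, _) Seq.:< restOfQueue) ->
      if currentLoc == targetLocation
        -- Found the target: rebuild the path of locations.
        then return (unwindPath parentsMap currentItem)
        else do
          let adjacentItems = getDrillAdjacentItems maze currentItem
              unvisited = filter
                (\i -> not (Set.member i visitedItems)) adjacentItems
              newQueue = foldl (Seq.|>) restOfQueue unvisited
              newVisited = foldr Set.insert visitedItems unvisited
              newParents = foldr
                (\i -> Map.insert i currentItem) parentsMap unvisited
          put (DrillBFSState newQueue newVisited newParents)
          drillBFS maze targetLocation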

Note that we're still only returning a list of locations. We'll let other functions handle the logic of determining if we used the drill or not. So our final call to this function is pretty clean. It just takes a couple extra arguments for our initial state:

getShortestPathWithDrills
  :: Maze
  -> Word
  -> Set.Set Location
  -> Location
  -> Location
  -> [Location]
getShortestPathWithDrills
  maze numDrills drillLocs initialLocation targetLocation =
    evalState
      (drillBFS maze targetLocation)
      (DrillBFSState
        (Seq.singleton initialItem)
        (Set.singleton initialItem)
        Map.empty)
  where
    initialItem = (initialLocation, numDrills, drillLocs)

Last Touches

Once we've changed our algorithm, we have to make one more change to the makePlayerMove function. We'll compute shortestPathMoveDirection, the direction of the next location on our path. If that location lies behind a wall, we know we also have to apply the drill in that direction. If it's an AdjacentCell, the drill isn't necessary, so we use DirectionNone.

makePlayerMove :: World -> PlayerMove
makePlayerMove w =
  PlayerMove finalMoveDirection useStun drillDirection
  where
    ...
    shortestPath = getShortestPathWithDrills
      maze
      (playerDrillsRemaining currentPlayer)
      (Set.fromList $ worldDrillPowerUpLocations w)
      playerLoc
      (endLocation w)
    shortestPathMoveLocation = if null shortestPath
      then playerLoc
      else (head shortestPath)
    shortestPathMoveDirection =
      getMoveDirection playerLoc shortestPathMoveLocation

    locationBounds = maze Array.! playerLoc


    -- Apply the drill if there's a wall!
    drillDirection = case shortestPathMoveDirection of
      DirectionUp -> case upBoundary locationBounds of
        Wall _ -> DirectionUp
        _ -> DirectionNone
      DirectionRight -> case rightBoundary locationBounds of
        Wall _ -> DirectionRight
        _ -> DirectionNone
      DirectionDown -> case downBoundary locationBounds of
        Wall _ -> DirectionDown
        _ -> DirectionNone
      DirectionLeft -> case leftBoundary locationBounds of
        Wall _ -> DirectionLeft
        _ -> DirectionNone
      DirectionNone -> DirectionNone

      ...

Note that we don't need to perform any checks on whether drilling is viable here. If, for some reason, our AI tells us to use the drill and it's invalid, drillLocation will stop this. Then our function for getting the next location will keep the player stationary.

Now we can go ahead and play the game with the AI enabled. We can see that it uses the drill effectively and makes intelligent choices about where and when to do so. It will sometimes go out of its way to pick up an extra drill if this allows it to make a strategic hole. It's not 100% optimal. For instance, we won't use a drill to break a wall to pick up three hidden drills. But it covers most of the important cases for us.

There is a drawback with our algorithm. The game can actually appear a little choppy, especially at the start of the maze. Because we modified the search state to contain a lot more information, the search space is many times bigger. So it is slower to arrive at solutions, especially when we're far from the end.

There are ways we can make this basic algorithm more efficient. Caching is a good place to start. We recompute a lot of information about distances within the maze every time. But there are also other search approaches we can make that will generally be faster. In the coming weeks, we'll explore some of these options.

Conclusion

But first, we'll take a look next week at how we can now run the game separately from the UI. This will allow us to perform many iterations. We'll measure the AI's success rate under different parameters. We'll see how the size of the maze, the number of enemies, and the number of drills affect that rate. This will give us information we can use to make a better user experience for players.

If you want to learn more about Haskell, you should subscribe to our mailing list! You'll get our monthly newsletter, as well as access to our subscriber resources! This includes, for instance, our Beginners Checklist, to help you get started!

And last, don't forget to take a look at our Github Repository to see all the code in its final state! For this article, look for the player-ai-drill branch!


Preparing for Simulation: Player AI

two_brains.jpg

Our goal for the next couple of weeks will be to run our game as a simulation, without any kind of player input. This means we'll be able to run through rapid iterations of the game and see the results. We can tune parameters and see what makes the game more competitive. But if we don't have player input, we have to move the main character somehow!

This means writing an AI for our player like we have for the enemies. The "player" has more possible actions and inputs, so it will be a little more complicated. We'll restructure our code a bit to make this easier. As with previous parts, take a look at our Github repository for more details. Refer to the player-ai branch for this article.

If you've never programmed in Haskell before, take a look at our Beginners Checklist and read our Liftoff Series! They'll help you get going so you can understand some of the concepts in these articles!

Basic Rules

For the first iteration of our AI, we won't use the drill power up at all. Each "move" will consist of two things. First, we'll pick a direction to move (or stand still). Second, we'll return a boolean for whether to activate the stun power-up.

data MoveChoice =
  MoveUp |
  MoveRight |
  MoveDown |
  MoveLeft |
  StandStill

data PlayerMove = PlayerMove
  { playerMoveChoice :: MoveChoice
  , activateStun :: Bool
  }

The choice depends on the state of the world. So our final goal is a function like:

makePlayerMove :: World -> PlayerMove
...

Once we have the player's move, we'll leave it up to the main update function to determine the result. We could also return a Location instead of a MoveChoice. This might seem more natural at times. But there are a couple reasons we want to stick with this approach.

One long-term goal is to machine-learn this function. So it will help a lot to limit the scope of the output space as much as possible. This makes it much more likely that we come up with legal moves. We also want to leave it to the game engine to ensure that our move can't corrupt the game state. So we'll make our "choice" and devise other functions to manipulate the world.

Now that we know our basic types, let's come up with some simple rules to dictate our behavior.

  1. Determine the shortest path to the destination
  2. If there is an enemy on that path within the stun radius, activate the stun, if we can.
  3. If we don't have our stun, choose any move away from the enemy.

So we can see some specific elements we'll need to code up. Step 1 is the simplest. We've already written out a function for finding the shortest path to our destination. Determining if there's an enemy on that path will be simple as well. The last part will be a little trickier, but manageable. Before we write this function though, we want to update our game infrastructure a bit to support the AI. We'll try to keep this section somewhat light on details.

Game Infrastructure

First, we'll need a couple new parameters for our game. Our player now needs a lagTime parameter like what the Enemy type has. We also need a boolean on our parameter object indicating whether we're using the AI or human input. As always, we'll update JSON instances to reflect these new fields:

data GameParameters = GameParameters
  { …
  , usePlayerAI :: Bool
  }

data PlayerGameParameters = PlayerGameParameters
  { …
  , lagTime :: Word
  }

For a default lag time, we'll use 5. This gives the player 4 moves for every enemy move. We'll add an extra command line parser that will look for the argument --use-player-ai. We'll use this to fill in the parameter value.

usePlayerAIInfo :: Parser Bool
usePlayerAIInfo = switch
  (long "use-player-ai" <>
    help "Whether to use the AI version of the player")

There are also some structural changes we want to make in the game. Both the player and enemies can now move on the same update cycle, so we have to decide how they interact.

We'll now have a series of discrete update functions for each part of the game. Each of these functions changes the world in some particular way. So they all have the same type signature. We'll structure our update function by composing these different functions. Here's what it might look like:

updateFunc :: Float -> World -> World
updateFunc _ w
  | … -- Win/Lose Conditions
  | otherwise = newWorld -- Normal Game Tick
  where
    newWorld :: World
    newWorld =
      updateWorldForEnemyTicks .
      updateWorldForPlayerTick .
      updateWorldForEnemyMoves .
      updateWorldForPlayerMove .
      clearStunCells .
      incrementWorldTime $
      w

-- Reduce Stun Timers, etc.
updateWorldForEnemyTicks :: World -> World
updateWorldForPlayerTick :: World -> World

-- Update Locations
updateWorldForEnemyMoves :: World -> World

-- Update Location, Pickup Drills, Activate Stun if necessary
updateWorldForPlayerMove :: World -> World

incrementWorldTime :: World -> World

clearStunCells :: World -> World

Note though that we do want to account for the case where the player isn't AI controlled. So the final product actually looks like this:

updateFunc :: Float -> World -> World
updateFunc _ w
  ...
  | otherwise = newWorld
  where
    afterPlayerMoveWorld = if usePlayerAI . worldParameters $ w
      then
        updateWorldForPlayerMove .
        clearStunCells .
        incrementWorldTime $ w
      else clearStunCells . incrementWorldTime $ w

    newWorld :: World
    newWorld =
      updateWorldForEnemyTicks .
      updateWorldForPlayerTick .
      updateWorldForEnemyMoves $
      afterPlayerMoveWorld

We won't go over most of these changes in depth right here, as they're purely refactoring. In future articles we'll definitely look a bit more at how the enemy movement code changed. We'll want it to look as much as possible like the player AI code, so that we can eventually machine-learn it as well. In the next section, we'll look at the player move function in a little more detail before writing our AI. Before that though, it's worth noting that we should also disable our input handler when using the AI:

inputHandler :: Event -> World -> World
inputHandler event w
  ...
  | usePlayerAI . worldParameters $ w = w -- No updates when AI is on!

Wrapping the Player Move

Above, we can see a mutator function updateWorldForPlayerMove. This will mainly be a wrapper around our primary AI function, makePlayerMove. But we'll establish a general pattern with it. This wrapper will first handle the game logic of determining whether we should move or not.

updateWorldForPlayerMove :: World -> World
updateWorldForPlayerMove w = if shouldMovePlayer
  then worldAfterMove
  else w
  where
    shouldMovePlayer =
      (worldTime w)
      `mod`
      (lagTime . playerGameParameters . worldParameters $ w) == 0
    worldAfterMove = ...

Then it will call the AI function. It will use the results to figure out what world changes are necessary and apply them. The object is to separate the "brain" from the game logic as much as possible. Note that we update the world for the stun first, and then the player's move.

updateWorldForPlayerMove :: World -> World
updateWorldForPlayerMove w = if shouldMovePlayer
  then worldAfterMove
  else w
  where
    shouldMovePlayer =
      (worldTime w) `mod`
      (lagTime . playerGameParameters . worldParameters $ w) == 0
    move = makePlayerMove w
    player = worldPlayer w
    currentLoc = playerLocation player

    worldAfterStun = if activateStun move
      then modifyWorldForStun w
      else w

    newLocation = nextLocationForMove
      (worldBoundaries w Array.! currentLoc)
      currentLoc   
      (playerMoveChoice move)
    worldAfterMove = modifyWorldForPlayerMove
      worldAfterStun newLocation

-- Stun enemies and change parameters as needed
modifyWorldForStun :: World -> World
...

-- Changes the player's location, track drill pickups
modifyWorldForPlayerMove :: World -> Location -> World
…

-- Take a move direction and give the location
nextLocationForMove ::
  CellBoundaries -> Location -> MoveChoice -> Location

Note we can apply more game logic and rules in our World modifier functions. Suppose, for some reason, the AI function returns True to stun enemies when our stun is on cooldown. It's the job of modifyWorldForStun to ensure nothing actually happens.

Shortest Path

Now that that's done, we can start writing our AI function at last! Let's start simple, and find the shortest path to the end.

makePlayerMove w = …
  where
    currentPlayer = worldPlayer w
    playerLoc = playerLocation currentPlayer
    maze = worldBoundaries w

    shortestPath = getShortestPath maze playerLoc (endLocation w)

Now we'll derive a MoveChoice based on that path. As mentioned above, it's tempting to return the location on the path. But remember, we want to limit the output scope as much as we can. Let's make a function for getting the move choice out of two locations.

getMoveChoice :: Location -> Location -> MoveChoice
getMoveChoice (x1, y1) (x2, y2)
  | y2 == y1 + 1 = MoveUp
  | x2 == x1 + 1 = MoveRight
  | y2 == y1 - 1 = MoveDown
  | x2 == x1 - 1 = MoveLeft
  | otherwise = StandStill
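A couple of quick sanity checks, assuming a Location is an (x, y) pair with y increasing upward:

-- getMoveChoice (2, 3) (2, 4) == MoveUp
-- getMoveChoice (2, 3) (3, 3) == MoveRight
-- getMoveChoice (2, 3) (2, 3) == StandStill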

So now we could, if we were naive, simply get out our move choice from the path:

makePlayerMove w = PlayerMove moveChoice False
  where
    …
    shortestPath = getShortestPath maze playerLoc (endLocation w)
    moveChoice = if null shortestPath
      then StandStill
      else getMoveChoice playerLoc (head shortestPath)

At this point, we should be able to run our game! The player will boldly walk towards the destination square through the maze. But if we have any enemies in our path, we're toast! Let's determine how we can use our stun!

Stunning Enemies

To determine when to use our stun, let's first get a set of the active enemy locations.

makePlayerMove w = ...
  where
    …
    shortestPath = …
    activeEnemyLocs = Set.fromList
      (enemyLocation <$>
        (filter (\e -> enemyCurrentStunTimer e == 0)
          (worldEnemies w)))

Now let's truncate the path to only include spots that might be in our stun radius. Then we can find if an enemy is there:

makePlayerMove w = PlayerMove moveChoice useStun
  where
    …
    shortestPath = …
    activeEnemyLocs = ...
    radius = stunRadius . playerGameParameters . worldParameters $ w
    enemyClose = any
      (\l -> Set.member l activeEnemyLocs)
      (take radius shortestPath)
    canStun = playerCurrentStunDelay currentPlayer == 0
    useStun = enemyClose && canStun

We're getting closer now! We'll find that our player can stun enemies as long as it's ready. But if the stun isn't ready, we'll run straight into them! We don't want that! So let's figure out how to make a good retreat in case we don't have our stun ready.

Retreating

This leaves us with our last bit of logic. If an enemy is close by on our shortest path and we don't have our stun ready, we should run away. We'll keep this as simple as possible: get the list of adjacent locations and pick the first one that isn't the next step on our shortest path. This doesn't always work that well, but it'll do for now. Here's how that logic pans out:

makePlayerMove :: World -> PlayerMove
makePlayerMove w = PlayerMove finalMoveChoice useStun
  where
    ...
    shortestPath = getShortestPath maze playerLoc (endLocation w)
    shortestPathMoveLocation = if null shortestPath
      then playerLoc
      else (head shortestPath)
    shortestPathMoveChoice = getMoveChoice
      playerLoc shortestPathMoveLocation

    activeEnemyLocs = ...
    radius = ...
    enemyClose = ...

    canStun = playerCurrentStunDelay currentPlayer == 0

    possibleMoves = getAdjacentLocations maze playerLoc

    (finalMoveChoice, useStun) = if not enemyClose
      then (shortestPathMoveChoice, False)
      else if canStun
        then (shortestPathMoveChoice, True)
        else case find (/= shortestPathMoveLocation) possibleMoves of
          Nothing -> (StandStill, False)
          Just l -> (getMoveChoice playerLoc l, False)

And now our AI actually works quite well! It can navigate the maze and stun enemies when it needs to. When it has to wait for its stun to re-charge, it'll back away from enemies as needed until the stun is ready.

Conclusion

Next week, we'll use some more advanced tactics to navigate the maze. Specifically, we'll look into how we can use our drill powerup. We'll need to re-think how we calculate the shortest path. We won't be using pure maze distance any more, since we can change the maze! So that will be an interesting problem. After we wrap that up, we'll look into separating the game from the GUI component so we can run simulations.

As always, check out our Github repository for full implementation details. This article uses the player-ai branch, so make sure you select that one!

And for more tips on getting better at Haskell, you should subscribe to our mailing list! You'll get access to all our subscriber resources. This includes our Beginners Checklist and our Production Checklist!


Gloss Review!

For the last few months, we've been constructing a simple game using the Gloss library. This library provides a neat and tidy interface for us to construct game components and put them together. The game is available to build and fork on our Github Repository. Here's a quick review of everything we've done:

Part 1: Overview of the Gloss library, constructing basic simulations and games

Part 2: Creating our maze type

Part 3: Generating random mazes using Depth-First-Search

Part 4: Victory Status Screen

Part 5: Serializing and parsing a maze

Part 6: Refactoring using Compile Driven Development

Part 7: Using Mutable Arrays for maze construction

Part 8: Adding enemies to the maze

Part 9: Making enemies more intelligent with Breadth-First-Search

Part 10: Adding a stun power to fight back against enemies

Part 11: Parameterizing the application and saving the world state

Part 12: Re-loading the world state, adding command line parameters

Part 13: Adding a drill power-up to take shortcuts through the maze

In this series, we explored some cool algorithmic concepts. We saw how to use the state monad for breadth first search and depth first search. We also explored mutable arrays, which aren't used that often in Haskell. If you're new to Haskell, it's good to get familiar with how some of these algorithms work.

We also emphasized using a methodical development approach. The Gloss architecture enables us to have a simple process for adding new features to the game. Combining this with compile driven development is powerful combination for rapid iteration.

We're probably not going to add any more core features to this game. But that doesn't mean we're done working with it! We're going to continue using the game as a platform for learning about interesting concepts and algorithms. Expect to see some articles related to search algorithms, AI development, and game architecture coming soon!

Until then, don't forget that you can subscribe to Monday Morning Haskell! You'll hear about our upcoming articles through our monthly newsletter. You'll also get access to our subscriber resources! There's plenty to learn both for beginners and more advanced Haskellers!


Taking a Shortcut!

drilling.png

In the last couple weeks we focused on code improvements and peripheral features. We added parameters and enabled saving and loading. This week we're going back into the meat of the game to add a new feature. We want to give our player another option for navigating the maze faster. We'll add some "drill" power-ups throughout the map. These will allow our player to "drill" through a wall, permanently removing that wall from our grid. Then we can take shortcuts through the maze!

This article will once again emphasize using a methodical process. We'll add the feature step-by-step. We'll include specific commit links for each step of the process, so you can follow along. You'll want to be on the part-10 branch of our Github repository. As a reminder, here's our development approach with this game:

  1. Determine what extra data we need in our World and related types. Initialize these with reasonable values.
  2. Implement the core logic. This means determining how the new data affects either player inputs or the passage of game time. In practice, that means changing our update function and/or our input handler.
  3. Update the drawing function. Determine what Pictures we can make to represent the new data.

Steps 1 and 3 will involve modifying our parameter types and changing JSON instances. We won't emphasize these changes in the article because they're boilerplate code. But it's good to be aware of that! Check out the commits if you're confused about how to update the instances!

For some more ideas on building full-fledged Haskell applications, download our Production Checklist!

Modifying Data

We'll first concern ourselves with the status of the player's drilling ability. We'll add power-ups to the map later. What information do we need to have? First, the Player type needs a field for drillsRemaining.

data Player = Player
  { …
  , playerDrillsRemaining :: Word
  }

We'll also need to select an initial value for this. We'll put this in our game parameters.

data PlayerGameParameters = PlayerGameParameters
  { …
  , initialDrills :: Word
  }

Now when we initialize the player, we'll use that parameter:

newPlayer :: PlayerGameParameters -> Player
newPlayer params = Player (0, 0) 0
  (initialStunTimer params) (initialDrills params)

In this step we also need to update our JSON instances, and add to the game's default parameters.

Our drilling action will also change the maze itself. To make this easier, let's add the adjacent location to our Wall constructor:

data BoundaryType =
  WorldBoundary |
  Wall Location |
  AdjacentCell Location
  deriving (Show, Eq)

The resulting changes aren't too complicated. In pretty much all cases of initialization, we already have access to the proper location.

You can explore these changes more by perusing the first two commits on this branch. The first is for the player data, the second is for the adjacent walls.

Activating the Drill

Now let's implement the logic for actually breaking down these walls! We'll break down walls in response to key commands. So the main part of our logic goes in the input handler. But first, a few helpers will be very useful. For best code reuse, we're going to make some functions that mutate cell boundaries. Each one will remove a wall in the specified direction. Here's an example removing the wall in the "up" direction:

breakUpWall :: CellBoundaries -> CellBoundaries
breakUpWall cb = case upBoundary cb of
  (Wall adjacentLoc) -> cb {upBoundary = AdjacentCell adjacentLoc}
  _ -> error "Can't break wall"

Notice how the extra location information makes it easy to create the AdjacentCell! We want to throw an error because we shouldn't invoke this function if it's not a wall in that direction. We'll want comparable functions for the other three directions.
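For instance, the "down" version only changes which boundary it inspects and updates; the left and right versions are analogous.

breakDownWall :: CellBoundaries -> CellBoundaries
breakDownWall cb = case downBoundary cb of
  (Wall adjacentLoc) -> cb {downBoundary = AdjacentCell adjacentLoc}
  _ -> error "Can't break wall"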

We also want a mutator function on the player. This reduces the number of drills remaining:

activatePlayerDrill :: Player -> Player
activatePlayerDrill pl = pl
  { playerDrillsRemaining = decrementIfPositive (playerDrillsRemaining pl)}
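The decrementIfPositive helper isn't shown in the article. Presumably it just guards against underflowing the Word type, something like:

decrementIfPositive :: Word -> Word
decrementIfPositive 0 = 0
decrementIfPositive n = n - 1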

Now we can create a function drillLocation. This function will fall under our input handler. This way, we don't have to pass all the world state information as parameters.

where
    worldBounds = worldBoundaries w
    currentPlayer = worldPlayer w
    currentLocation = playerLocation currentPlayer
    cellBounds = worldBounds Array.! currentLocation

    drillLocation = ...

The function will take two of the mutator functions for breaking walls. The first will allow us to break the wall from the current cell. The second will allow us to break the wall from the adjacent cell. It will also take a function giving us the cell boundaries in a particular direction. Finally, it will take the World as a parameter. We could access the existing w value if we wanted. But doing it this way could allow us to chain multiple World mutation functions in the future.

drillLocation
      :: (CellBoundaries -> BoundaryType)
      -> (CellBoundaries -> CellBoundaries)
      -> (CellBoundaries -> CellBoundaries)
      -> World
      -> World
    drillLocation boundaryFunc breakFunc1 breakFunc2 w = ...

We first need to determine if we can drill in this state. Our player must have a drill remaining, and there must be a wall in the given direction. If these conditions aren't met, we return our original World parameter.

drillLocation boundaryFunc breakFunc1 breakFunc2 w =
  case (drillLeft, boundaryFunc cellBounds) of
    (True, Wall location2) -> …
    _ -> w
  where
    drillLeft = playerDrillsRemaining currentPlayer > 0

Now we'll create our "new" player with the activatePlayerDrill function from above. And we'll use the input functions to get our new cell boundaries. We'll update our maze, and then return the new world!

drillLocation boundaryFunc breakFunc1 breakFunc2 w =
  case (drillLeft, boundaryFunc cellBounds) of
    (True, Wall location2) ->
      let newPlayer = activatePlayerDrill currentPlayer
          newBounds1 = breakFunc1 cellBounds
          newBounds2 = breakFunc2 (worldBounds Array.! location2)
          newMaze = worldBounds Array.//
            [(currentLocation, newBounds1), (location2, newBounds2)]
      in  w { worldPlayer = newPlayer, worldBoundaries = newMaze }
    _ -> w
  where
    drillLeft = playerDrillsRemaining currentPlayer > 0

Last of all, we have to make key inputs for this handler. We'll use the "alt" key in conjunction with a direction to signify that we should drill. We use the Modifiers constructor to signal such a combination. Here's that little snippet:

inputHandler event w
  ...
  | otherwise = case event of
      (EventKey (SpecialKey KeyUp) Down (Modifiers _ _ Down) _) ->
        drillLocation upBoundary breakUpWall breakDownWall w
      (EventKey (SpecialKey KeyDown) Down (Modifiers _ _ Down) _) ->
        drillLocation downBoundary breakDownWall breakUpWall w
      (EventKey (SpecialKey KeyRight) Down (Modifiers _ _ Down) _) ->
        drillLocation rightBoundary breakRightWall breakLeftWall w
      (EventKey (SpecialKey KeyLeft) Down (Modifiers _ _ Down) _) ->
        drillLocation leftBoundary breakLeftWall breakRightWall w

Again, notice how we use our mutator functions as parameters. When drilling "up", we pass the breakUpWall and breakDownWall functions, and so on. For a review of all these steps, take a look at this commit!

Drill Power Ups

At this point, we can use our drill within the game. But we can only use it the specified number of times from the initialDrills parameter. Next, we're going to add power-ups we can pick up in the grid that will give us more drilling abilities. This requires adding a bit more data to our world. We'll have a list of locations for the power-ups, as well as a parameter for the number of them:

data World = World
  { ...
  , worldDrillPowerUpLocations :: [Location]
  }

data GameParameters = GameParameters
  { ...
  , numDrillPowerups :: Int
  }

When we initialize our game, we have to create these locations, as we did for enemies. Since this is essentially the same task, we can replicate that logic:

main = …
      let (enemyLocations, gen'') = runState
            (replicateM (numEnemies gameParams) 
              (generateRandomLocation
                (numRows gameParams, numColumns gameParams)))
            gen'
          (drillPowerupLocations, gen''') = runState
            (replicateM (numDrillPowerups gameParams) 
              (generateRandomLocation 
                (numRows gameParams, numColumns gameParams)))
            gen''
    in ...

We also need this same process when resetting the game, whether after winning or losing.

We'll also change how we update the world after a player move. We'll add a wrapper function to update the world whenever the player moves. We'll supply a directional function parameter to determine where the player goes next:

inputHandler :: Event -> World -> World
inputHandler event w
  ...
  | otherwise = case event of
      (EventKey (SpecialKey KeyUp) Down _ _) ->
        updatePlayerMove upBoundary
      (EventKey (SpecialKey KeyDown) Down _ _) ->
        updatePlayerMove downBoundary
      (EventKey (SpecialKey KeyRight) Down _ _) ->
        updatePlayerMove rightBoundary
      (EventKey (SpecialKey KeyLeft) Down _ _) ->
        updatePlayerMove leftBoundary
  where
    updatePlayerMove :: (CellBoundaries -> BoundaryType) -> World
    updatePlayerMove = …

    -- Other values we have in scope, for convenience
    playerParams = playerGameParameters . worldParameters $ w
    enemyParams = enemyGameParameters . worldParameters $ w

    worldRows = numRows . worldParameters $ w
    worldCols = numColumns . worldParameters $ w
    worldBounds = worldBoundaries w
    currentPlayer = worldPlayer w
    currentLocation = playerLocation currentPlayer
    cellBounds = worldBounds Array.! currentLocation

Let's devise a couple more mutators to help us fill in this new function. One moves the player to a new location, the other gives them an extra drill power-up if they find one:

pickupDrill :: Player -> Player
pickupDrill pl = pl
  { playerDrillsRemaining = (playerDrillsRemaining pl) + 1}

movePlayer :: Location -> Player -> Player
movePlayer newLoc pl = pl
  { playerLocation = newLoc }

Now if the location in the proper direction is "adjacent" to us (no wall), then we'll move there and try to pick up a drill. If it is not, the world does not change!

updatePlayerMove :: (CellBoundaries -> BoundaryType) -> World
updatePlayerMove boundaryFunc = case boundaryFunc cellBounds of
  (AdjacentCell cell) ->
    let movedPlayer = movePlayer cell currentPlayer
        drillLocs = worldDrillPowerUpLocations w
        ...
  _ -> w

Then for one last trick. When our next location has a drill, we have to update the player again. We also have to remove this power-up from the world! We update the world with those parameters, and we're done!

updatePlayerMove :: (CellBoundaries -> BoundaryType) -> World
updatePlayerMove boundaryFunc = case boundaryFunc cellBounds of
  (AdjacentCell cell) ->
    let movedPlayer = movePlayer cell currentPlayer
        drillLocs = worldDrillPowerUpLocations w
        (finalPlayer, finalDrillList) = if cell `elem` drillLocs
          then (pickupDrill movedPlayer, delete cell drillLocs)
          else (movedPlayer, drillLocs)
    in w
      { worldPlayer = finalPlayer, 
        worldDrillPowerUpLocations = finalDrillList }
  _ -> w

Check out this commit for a more thorough look at the changes in this part!

Drawing our Drills

We're almost done now! We have to actually draw the drills on the screen so we can figure out what's happening! As with the stun indicator, we'll use a circular ring to show when our player has a drill ready. We'll show the power-ups using purple triangles on the grid. With our new parameters system, we'll want to specify the color we'll use and the size of these indicators.

data PlayerRenderParameters = PlayerRenderParameters
  { ...
  , playerDrillPowerupSize :: Float
  , playerDrillIndicatorSize :: Float
  , playerDrillColor :: Color
  }

defaultRenderParameters :: RenderParameters
defaultRenderParameters = RenderParameters ...
  where
    playerParams = PlayerRenderParameters 10 black 5 red
      5.0 2.0 violet
    ...

As with our game parameters change, this also involves updating JSON instances. We'll also have to update the command line options parsing. But let's focus on the actual drawing that needs to take place.

The player indicator code looks a lot like the code for the stun indicator. We'll use different parameters and a different condition, but nothing else changes:

playerRP = playerRenderParameters rp
drillReadyMarker = if playerDrillsRemaining (worldPlayer world) > 0
  then Color (playerDrillColor playerRP)
    (Circle (playerDrillIndicatorSize playerRP))
  else Blank
...
playerMarker = translate px py (Pictures [drillReadyMarker, ...])

Then for the power-ups, we want to handle them like we do enemies. We'll loop through the list of locations, and apply a picture function against each one:

drawingFunc :: RenderParameters -> World -> Picture
drawingFunc rp world
  ...
 | otherwise = Pictures
    [ ...
    , Pictures (enemyPic <$> worldEnemies world)
    , Pictures (drillPic <$> worldDrillPowerUpLocations world)
    ]
  where
    ...
    drillPic :: Location -> Picture
    ...

The drillPic function will even look a lot like the function for enemy pictures. We'll just use different coordinates to make a triangle instead of a square!

drillPic :: Location -> Picture
drillPic loc =
  let (centerX, centerY) = cellCenter $ conversion loc
      radius = playerDrillPowerupSize playerRP
      top = (centerX, centerY + radius)
      br = (centerX + radius, centerY - radius)
      bl = (centerX - radius, centerY - radius)
      drillColor = playerDrillColor playerRP
  in  Color drillColor (Polygon [top, br, bl])

And that's all! We've got a working, rendered prototype of this feature now! Take a look at this commit for some wrap-up details.

Drills in Action

Here are a few quick pictures of our game in action now! We can see our player is out of drills but near a power-up here:

maze_game_drill-1.png

Then we pick up the new drill, and we're fresh again!

maze_game_drill-2.png

Then we can use it to make a hole and get closer to the end!

maze_game_drill-3.png

Conclusion

This wraps up our feature! In the next couple weeks we're going to take a step back and re-organize things a bit. We want to work towards adding an AI for the main player, and enhancing the AI for the enemies. It would be cool to use some more advanced tactics, rather than relying on hard-coded rules. At some point, we'll consider using machine learning to improve this AI. This will require re-working our code a bit so we can run it independently of Gloss. This will allow us to run many game situations!

As we get ready to try some more advanced AI ideas, it might be a good idea to brush up on how and why we can use Haskell for AI development. Take a look at our Haskell AI series for some ideas! You can also download our Haskell Tensorflow Guide to learn more!


Loading Games and Changing Colors!

floppy_disk.png

Last week we added functionality for serializing our world state. This allowed us to save our game with the press of a button. We also parameterized our application so that we could customize many aspects of it. We didn't explore all the possibilities though! This week, we'll see how we can load a previous game-state by using command line options. We'll also use options to specify how the game appears!

As always, take a look at our Github repository to see the full code. This article corresponds to the part-9 branch.

All this is part of our effort to make our game more "mature". But you should also consider some libraries that will be more useful in industry! Take a look at our Production Checklist! Whether it's web servers or databases, you'll learn how Haskell interacts with more advanced concepts!

Command Line Options

Everything we do this week will happen through command line options. For a quick refresher on these, take a look at this article! We'll have a couple main concerns. First, we want to take an optional filename argument to load the initial world state. If we get this, we'll throw the player right back into their game! If we don't get this argument, we'll generate a new, random state for them to work with.

Second, we'll make command line parameters for all our different render parameters. This will allow anyone invoking the game to customize the appearance! We'll also allow them to specify a whole file for these as well. This will involve quite a bit of work with the Options.Applicative library.

Our main goal is this function:

parseOptions :: IO (Maybe FilePath, RenderParameters)

This will return us a possible file path to load our world state from, as well as a set of render parameters. Let's start with some basic framework code.

Parser Setup

The first item we want is a generic parser that will give us Maybe values. As far as I could tell the options library doesn't have this built-in. But it's not too hard to write. We want to use the option function to start with. It will attempt the given parser. If it succeeds, we want to wrap it with Just. Then we'll append a default value of Nothing to the options in case it fails.

maybeParser ::
  ReadM a -> Mod OptionFields (Maybe a) -> Parser (Maybe a)
maybeParser reader opts =
  option (Just <$> reader) (opts <> value Nothing)

We can now use this to build a parser for the maze file:

mazeFileParser :: Parser (Maybe FilePath)
mazeFileParser = maybeParser str
 (long "load-file" <> short 'f'
   <> help "A file to use to load the world state")

And now we just apply execParser on this, supplying some simple Info for our parser:

parseOptions :: IO (Maybe FilePath)
parseOptions = execParser $ info mazeFileParser commandInfo

commandInfo :: InfoMod (Maybe FilePath)
commandInfo = fullDesc <> progDesc "Haskell Maze Game"

Runner Updates

The next step is a short function for loading our world from the file. Since we have our JSON instance on the World type, we'll rely on decodeFileStrict'. There's one caveat. If the game parameters don't have a random seed value, we'll use a new one. Otherwise we'll use mkStdGen on the seed:

loadWorldFromFile :: FilePath -> IO World
loadWorldFromFile fp = do
 parseResult <- Data.Aeson.decodeFileStrict' fp
 case parseResult of
   Just w -> do
     gen <- case randomGeneratorSeed (worldParameters w) of
       Nothing -> getStdGen
       Just i -> return $ mkStdGen i
     return $ w { worldRandomGenerator = gen }
   Nothing -> error $ "Couldn't parse world from file " ++ fp ++ "!"

Now we want to make some changes to our main running function. We'll run our argument parser first. If we get a file from the options, we'll load the initial World from that file. Otherwise, we'll use our previous flow with generating a random maze.

main :: IO ()
main = do
 maybeLoadFile <- parseOptions
 initialWorld <- case maybeLoadFile of
   Just loadFile -> loadWorldFromFile loadFile
   Nothing -> …
 play ...

Parsing a Render File

This is good enough to load the world, so we can re-start from a saved (or derived) position. But suppose we wanted to go a step further. Suppose we wanted to also load our render parameters. We could use another file if we liked. We could start with another parser for a file path:

renderFileParser :: Parser (Maybe FilePath)
renderFileParser = maybeParser str
 (long "render-param-file" <> short 'r'
   <> help "A file to use to load render parameters")

Then we'll combine our two parsers together like so:

parser :: Parser (Maybe FilePath, Maybe FilePath)
parser = (,) <$>
 mazeFileParser <*>
 renderFileParser

Now we'll add a bit more logic to the wrapper function. If we have a file, we should use it to load the RenderParameters object:

parseOptions :: IO (Maybe FilePath, RenderParameters)
parseOptions = do
 (mazeFile, renderFile) <- execParser $ info parser commandInfo
 case renderFile of
   Nothing -> return (mazeFile, defaultRenderParameters)
   Just fp -> do
     parseResult <- decodeFileStrict' fp
     case parseResult of
       Nothing -> return (mazeFile, defaultRenderParameters)
       Just fileRenderParams -> return (mazeFile, fileRenderParams)

Note that the type of our commandInfo will also need to change as a result of this. But then we just have the simple task of getting these items out in our main function:

main :: IO ()
main = do
 (maybeLoadFile, renderParams) <- parseOptions
 ...

Individual Render Parameters

We have one last trick though! Suppose we want to change one thing about the game's appearance and we don't want to use a JSON file. We can add individual options on render elements. We've got a lot of possible elements. We'll wrap them all in a basic type, matching the fields we have in the different sub-components. Each of these fields is optional. We'll "merge" them with a complete set of render parameters to get a final result.

data RenderParamInfo = RenderParamInfo
 -- Screen/Text Parameters
 (Maybe Int)
 (Maybe Int)
 (Maybe Int)
 (Maybe Float)
 (Maybe Float)
 (Maybe Float)
 (Maybe Float)
 -- Player Parameters
 (Maybe Float)
 (Maybe Color)
 (Maybe Float)
 (Maybe Color)
 -- Enemy Parameters
 (Maybe Float)
 (Maybe Color)
 (Maybe Color)
 -- Cell Parameters
 (Maybe Color)
 (Maybe Color)
 (Maybe Float)

Each field will have its own parser. These will all be variations on our maybeParser:

maybeIntParser :: Mod OptionFields (Maybe Int) -> Parser (Maybe Int)
maybeIntParser = maybeParser auto

maybeFloatParser ::
  Mod OptionFields (Maybe Float) -> Parser (Maybe Float)
maybeFloatParser = maybeParser auto

maybeColorParser ::
  Mod OptionFields (Maybe Color) -> Parser (Maybe Color)
maybeColorParser = maybeParser (maybeReader colorReader)
 where
   colorReader "blue" = Just blue
   … -- other colors
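A fuller reader might cover a handful of the standard Gloss colors; the exact set the repository supports may differ, so treat this as a sketch:

colorReader :: String -> Maybe Color
colorReader "blue" = Just blue
colorReader "red" = Just red
colorReader "green" = Just green
colorReader "black" = Just black
colorReader "white" = Just white
colorReader "violet" = Just violet
colorReader _ = Nothing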

Then we can combine them using applicative syntax, and providing some help information:

parseRenderInfo :: Parser RenderParamInfo
parseRenderInfo = RenderParamInfo <$>
 maybeIntParser (long "screen-dimen"
   <> help "The screen width/height") <*>
 maybeIntParser (long "screen-offset-x"
   <> help "The screen width/height") <*>
 ...

Next we'll write a "merge" function. This will take a RenderParameters item with default values, and apply the Just values.

mergeOptions ::
  RenderParameters -> RenderParamInfo -> RenderParameters
mergeOptions rp (RenderParamInfo sd_ sox_ ...)
 = RenderParameters
   (fromMaybe (screenDimen rp) sd_)
   (fromMaybe (screenOffsetX rp) sox_)
   ...

Then we add this new parser to our set:

parser :: Parser (Maybe FilePath, Maybe FilePath, RenderParamInfo)
parser = (,,) <$>
 mazeFileParser <*>
 renderFileParser <*>
 parseRenderInfo

And our original function should now reflect the need to merge parameters.

parseOptions :: IO (Maybe FilePath, RenderParameters)
parseOptions = do
 (mazeFile, renderFile, renderInfo) <- execParser $
                                         info parser commandInfo
 case renderFile of
   Nothing -> return
     (mazeFile, mergeOptions defaultRenderParameters renderInfo)
   Just fp -> do
     parseResult <- decodeFileStrict' fp
     case parseResult of
       Nothing -> return
         (mazeFile, mergeOptions defaultRenderParameters renderInfo)
       Just fileRenderParams -> return
         (mazeFile, mergeOptions fileRenderParams renderInfo)

Wrapping Up

Now we've got a lot of interesting possibilities. We can save our game from a particular state:

maze_game_1.png

Then we can load it back up with different colors. For example, we can make some obnoxious green walls:

stack exec -- maze-game --load-file=maze_game_save_1557121650\
  --cell-wall-color=green --enemy-base-color=red
maze_game_2.png

Conclusion

Now that our game is customizable and re-loadable we'll be able to do a lot more interesting things with it. We'll be able to run many simulations that test the difficulty. Some day, we'll be able to make the AI much better! For the time being though, there are still some more features we can add. So the next couple parts of this series will explore some other exciting twists in the game-play. This will lead us to a point where other types of refactors will be necessary.

Making a project like this requires a good knowledge of the Stack tool. To learn more, take our free mini-course on Using Stack!


Spring Cleaning: Parameters and Saving!

broom.jpg

Our game is a lot more interesting after the changes we made last week. But there's still lots of room for improvement. There are many things that could make the game more or less interesting, depending how we tune things. For instance, how many enemies is too many? What's a good stun duration? How quickly does the game get harder if we tweak these parameters?

Right now, it would be hard for us to answer these questions in a systematic way. We've baked many of these parameters directly into the code. So we would have to recompile everything if we wanted to test out another version of the game. This is a bad sign. We should be able to run the same binary with different sets of parameters.

Another issue with our game is that the only real "flow" is to start off with a random arrangement. We don't know what the map will be or where the enemies start. But if we want to test how well certain concepts work, we'll want true re-playability. In other words, we'll want to be able to start the game from a certain state we've established.

This week, we'll start to clean these things up. The first job will be to move a lot of our magic numbers into the World type. Then we'll devise a way to serialize the complete world state. We've already done the hard work of ensuring we can serialize the map. The rest will be pretty straightforward using JSON instances. We'll wrap up this week by adding the option to save the game in the middle of the action. Then next week, we'll add some options to our game so we can load a particular starting state from a file.

As with each of the different phases of this project, you can take a look at our Github repository to see how we do everything. For this article, you should be following the part-8 branch. We'll also provide a couple of commit links so you can follow along step-by-step.

As you get better at using Haskell, you'll be able to use it for more and more types of projects. Download our Production Checklist for some ideas!

Parameterizing the App

There are a lot of "magic numbers" floating around our app right now. Some of these have to do with game-play logic. How many enemies are there? What's their cool-down time? How long is our player's stun timer? Then there are other parameters that have to do with how we draw the game. For instance, what colors do we use? How big are the cells?

We make this distinction because we should be able to run our game without any render information. At some point, we'll run simulations of this game that don't get drawn at all. So it would be useless to have drawing information around. Thus we'll have GameParameters types that will live in the World. Then we'll have RenderParameters types for drawing everything.

With that said, let's start devising what information these types contain. We'll start out with types describing the player and enemy parameters:

data PlayerGameParameters = PlayerGameParameters
  { initialStunTimer :: Word
  , stunTimerIncrease :: Word
  , stunTimerMax :: Word
  , stunRadius :: Int
  }

data EnemyGameParameters = EnemyGameParameters
  { initialStunTime :: Word
  , stunTimeDecrease :: Word
  , minStunTime :: Word
  , enemyRandomMoveChance :: Word
  , initialLagTime :: Word
  , minLagTime :: Word
  }

Now we can use these to populate a bigger type with more generic game parameters:

data GameParameters = GameParameters
  { numRows :: Int
  , numColumns :: Int
  , numEnemies :: Int
  , tickRate :: Int
  , playerGameParameters :: PlayerGameParameters
  , enemyGameParameters :: EnemyGameParameters
  , randomGeneratorSeed :: Maybe Int
  }

Notice the random seed is a Maybe value. In the normal circumstances of running the program, we don't want to fix the random generator. But there are cases where we'll want to load from a specific stored state. If we fix the generator seed value, gameplay will be deterministic. This could be a desirable property in some circumstances. In most cases though, this will be Nothing.
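
The main function we'll see shortly leans on a defaultGameParameters value. The exact numbers live in the repository; the value itself is just a filled-in record along these lines (the figures here are illustrative, not the real defaults):

defaultGameParameters :: GameParameters
defaultGameParameters = GameParameters
  25 25 4 20
  (PlayerGameParameters 200 10 500 2)
  (EnemyGameParameters 60 5 20 5 60 10)
  Nothing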

With all these types in place, we'll now add the game parameters to our World:

data World = World
  { …
  , worldParameters :: GameParameters
  }

We'll go through a similar process with RenderParameters. The main difference will be that we will not attach the type to the World. There will also be a CellRenderParameters type as well as types for the Player and Enemy. This gives us information about how individual cells get displayed on our screen. Here's a quick sample of this code. You can see the other types at the bottom as an appendix.

data RenderParameters = RenderParameters
  { screenWidth :: Float
  , screenHeight :: Float
  , screenOffsetX :: Float
  , screenOffsetY :: Float
  , textOffset :: (Float, Float)
  , textScale :: (Float, Float)
  , playerRenderParameters :: PlayerRenderParameters
  , enemyRenderParameters :: EnemyRenderParameters
  , cellRenderParameters :: CellRenderParameters
  }

data CellRenderParameters = CellRenderParameters
  { cellWallColor :: Color
  , cellStunColor :: Color
  , cellWallWidth :: Float
  }

No More Magic

Once we have these types in place, our next step is to replace the magic numbers (and colors) in our application. We'll need to add the parameters as arguments in a few places. Most of all, the drawingFunc will need the RenderParameters argument.

drawingFunc :: RenderParameters -> World -> Picture
...
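
Gameplay code, on the other hand, can reach into the World for its numbers, since the GameParameters ride along inside it. Here are a couple of tiny sketches (the helper names are illustrative, not functions from the repository):

-- Looking up parameters instead of using magic numbers.
currentTickRate :: World -> Int
currentTickRate w = tickRate (worldParameters w)

playerStunRadius :: World -> Int
playerStunRadius w =
  stunRadius (playerGameParameters (worldParameters w))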

This process isn't too much of a challenge, since all our important functions already take the World as an input. For now, we'll pass our default parameter packs as arguments when running the program. Here's a quick look at the changes to our main function:

main :: IO ()
main = do
  gen <- getStdGen
  let gameParams = defaultGameParameters
      renderParams = defaultRenderParameters
      (maze, gen') = generateRandomMaze
        gen (numRows gameParams, numColumns gameParams)
      (randomLocations, gen'') = runState
        (replicateM
          (numEnemies gameParams)
          (generateRandomLocation
            (numRows gameParams, numColumns gameParams)))
            gen'
      enemies = (mkNewEnemy
        (enemyGameParameters gameParams)) <$> randomLocations
      endCell = (numColumns gameParams - 1, numRows gameParams - 1)
      initialWorld = World
        (newPlayer (playerGameParameters gameParams))
        (0,0) endCell maze GameInProgress gen'' enemies [] 0 
        gameParams
  play
    (windowDisplay renderParams)
    white
    (tickRate gameParams)
    initialWorld
    (drawingFunc renderParams)
    inputHandler
    updateFunc
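
The window setup reads from the render pack in the same way. Here's a plausible windowDisplay, assuming the integer-based screen fields from the appendix version of RenderParameters (the window title is just a placeholder):

windowDisplay :: RenderParameters -> Display
windowDisplay rp = InWindow "Maze Game"
  (screenDimen rp, screenDimen rp)
  (screenOffsetX rp, screenOffsetY rp)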

Take a look at this commit to see the full set of parameter changes.

Serializing Our World

Now that we've updated our World type, we'll want to determine how we can serialize it. For simplicity's sake, we'll use JSON serialization. This is mostly a matter of creating (or, if you wish, deriving) a bunch of ToJSON and FromJSON instances. Check out this article for a refresher on the Data.Aeson library.
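
For the plain parameter records, deriving really is an option. With GHC.Generics and aeson's default implementations, a declaration like this picks up both instances for free, using the record field names as JSON keys:

{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}

import Data.Aeson (FromJSON, ToJSON)
import GHC.Generics (Generic)

data PlayerGameParameters = PlayerGameParameters
  { initialStunTimer :: Word
  , stunTimerIncrease :: Word
  , stunTimerMax :: Word
  , stunRadius :: Int
  } deriving (Generic, ToJSON, FromJSON)

We'll write ours out by hand below, since a few of them need custom handling anyway.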

Most of this code is pretty simple. With game parameters, a lot of the instances are a simple matter of creating and parsing pairs. Here's an example with Player:

instance FromJSON Player where
  parseJSON = withObject "Player" $ \o -> do
    location <- o .: "location"
    currentStunDelay <- o .: "currentStunDelay"
    nextStunDelay <- o .: "nextStunDelay"
    return $ Player location currentStunDelay nextStunDelay

instance ToJSON Player where
  toJSON p = object
    [ "location" .= playerLocation p
    , "currentStunDelay" .= playerCurrentStunDelay p
    , "nextStunDelay" .= playerNextStunDelay p
    ]

But there are a few caveats. To start, we need to put these instances in a separate file. The World instance needs our maze parsing code, and that code already depends on the Types module, so defining the instances alongside the types would create a dependency cycle.

It's generally bad practice to separate instances from their type declarations. We call these "orphan" instances, and GHC will warn you about them if you compile with -Wall. A way around this is to create wrapper types. This is a little tedious, so we won't do it for all the types. But we will show the concept for the Color type from Graphics.Gloss. Let's start with a wrapper type:

newtype ColorWrapper = ColorWrapper { unColor :: Color }

Now we can create instances on this wrapper, and they're considered valid. To cover all possible colors, we'd use RGB arrays. But we'll only cover the 9 or so colors we care about and parse them as strings. Here's what the instances look like. Notice how we wrap and unwrap the actual library colors:

instance ToJSON ColorWrapper where
  toJSON (ColorWrapper c) = Ae.String colorStr
    where
      colorStr
        | c == blue = "blue"
        | c == red = "red"
        | c == yellow = "yellow"
        | c == green = "green"
        | c == cyan = "cyan"
        | c == orange = "orange"
        | c == magenta = "magenta"
        | c == rose = "rose"
        | c == black = "black"
        | otherwise = error "Unsupported color!"

instance FromJSON ColorWrapper where
  parseJSON = withText "ColorWrapper" parseText
    where
      parseText "blue" = return (ColorWrapper blue)
      parseText "red" = return (ColorWrapper red)
      parseText "yellow" = return (ColorWrapper yellow)
      parseText "green" = return (ColorWrapper green)
      parseText "cyan" = return (ColorWrapper cyan)
      parseText "orange" = return (ColorWrapper orange)
      parseText "magenta" = return (ColorWrapper magenta)
      parseText "rose" = return (ColorWrapper rose)
      parseText "black" = return (ColorWrapper black)
      parseText _ = error "Couldn't parse color!"
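
To see the wrapper in action, here's roughly how a render parameter parser can unwrap colors as it reads them (the JSON key names here are my own guesses):

instance FromJSON CellRenderParameters where
  parseJSON = withObject "CellRenderParameters" $ \o -> do
    wallColor <- o .: "cellWallColor"
    stunColor <- o .: "cellStunColor"
    wallWidth <- o .: "cellWallWidth"
    return $ CellRenderParameters
      (unColor wallColor) (unColor stunColor) wallWidth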

We'll use these instances the same way throughout the other render parameter parsers. The other caveat is that we parse our World out in two stages. We'll get all the basic fields and parameters first:

instance FromJSON World where
  parseJSON = withObject "World" $ \o -> do
    player <- o .: "player"
    startLoc <- o .: "startLocation"
    endLoc <- o .: "endLocation"
    result <- o .: "result"
    enemies <- o .: "enemies"
    stunCells <- o .: "stunCells"
    time <- o .: "time"
    params <- o .: "gameParameters"
  ...

Now we'll pull the boundaries out as a Text item and run them through our maze parser, using the number of rows and columns from the parameters.

instance FromJSON World where
  parseJSON = withObject "World" $ \o -> do
    ...
    (boundaryString :: Text) <- o .: "boundaries"
    let (rs, cs) = (numRows params, numColumns params)
    let boundaries =
          case runParser (mazeParser (rs, cs)) "" boundaryString of
            Right result -> result
            _ -> error "Map parse failed!"

As a last trick, we'll check what the random seed is within our parameters. If it's Nothing, we'll fix the generator with a seed of 1 and rely on other code to change it:

instance FromJSON World where
  parseJSON = withObject "World" $ \o -> do
    ...
    let gen = case randomGeneratorSeed params of
          Just i -> mkStdGen i
          _ -> mkStdGen 1
    return $ World player startLoc endLoc boundaries
      result gen enemies stunCells time params
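
Going the other direction, the ToJSON instance for World is the mirror image. We skip the generator (the FromJSON side rebuilds it from the seed) and flatten the maze back out to Text. The sketch below assumes a helper for that last step; we call it dumpMaze here, but the name is hypothetical:

instance ToJSON World where
  toJSON (World player startLoc endLoc boundaries result
                _gen enemies stunCells time params) = object
    [ "player" .= player
    , "startLocation" .= startLoc
    , "endLocation" .= endLoc
    , "result" .= result
    , "enemies" .= enemies
    , "stunCells" .= stunCells
    , "time" .= time
    , "gameParameters" .= params
    -- dumpMaze (hypothetical) renders the maze back into the
    -- Text format that mazeParser understands.
    , "boundaries" .= dumpMaze boundaries
    ]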

Take a look at this commit to see the full code for these instances. Now let's see how we use them!

Saving Our World

We'd like to make it so that our user can save their game-state by hitting the s key at any point in the game. This idea starts out simple enough. We add a handler for the key in our inputHandler.

inputHandler :: Event -> World -> World
inputHandler event w
  | worldResult w == GameWon = ...
  | worldResult w == GameLost = ...
  | otherwise = case event of
      ...
      (EventKey (Char 's') Down _ _) -> ...

But now we're a little stuck! We want to write out to a file, but our handler is a pure function! There miiight be a way to do this without breaking functional purity. Perhaps we could keep a list of saved world states and add a handler to save them at the end of our program. But Gloss wasn't made for that. So we're going to break the rules a bit and resort to unsafePerformIO. This allows us to run an IO computation from a seemingly pure context. Here's the basic layout:

inputHandler :: Event -> World -> World
inputHandler event w
      ...
      (EventKey (Char 's') Down _ _) -> unsafeSaveWorldToFile w

unsafeSaveWorldToFile :: World -> World
unsafeSaveWorldToFile w = unsafePerformIO $ do
  …
  return w

Since we have a JSON instance for our World, we'll lean on the encodeFile function. The rest of the work here comes from generating a filename using the current time, for uniqueness:

unsafeSaveWorldToFile :: World -> World
unsafeSaveWorldToFile w = unsafePerformIO $ do
  timeAsString <- show . floor <$> getPOSIXTime
  currentDir <- getCurrentDirectory
  let filename = currentDir ++ "/maze_game_save_" ++ timeAsString
  encodeFile filename w
  return w
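
For reference, the helpers above come from these modules:

import Data.Aeson (encodeFile)
import Data.Time.Clock.POSIX (getPOSIXTime)
import System.Directory (getCurrentDirectory)
import System.IO.Unsafe (unsafePerformIO)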

And that's all it takes for us to save some game files! Perhaps you've heard the phrase "don't try this at home." When it comes to unsafePerformIO, feel free to try it at home, but don't try it at work! Take a look at this commit for details on saving the state.

Conclusion

In spite of unsafePerformIO, our game feels like a much more "grown-up" program now. The code quality is much better with our parameters. We now have a lot more options when it comes to improving it. Saving the world state is the first step towards solving some interesting problems. Next week, we'll explore how we can load the saved game states we've created.

As we move forward, we'll keep trying to turn this game into a more mature program. Eventually though, you should think about using Haskell for more common production use cases. To learn about different libraries you can use, download our Production Checklist!

Appendix: Render Parameter Types

data RenderParameters = RenderParameters
  { screenDimen :: Int
  , screenOffsetX :: Int
  , screenOffsetY :: Int
  , textOffset :: (Float, Float)
  , textScale :: (Float, Float)
  , playerRenderParameters :: PlayerRenderParameters
  , enemyRenderParameters :: EnemyRenderParameters
  , cellRenderParameters :: CellRenderParameters
  }

data PlayerRenderParameters = PlayerRenderParameters
  { playerIndicatorSize :: Float
  , playerIndicatorColor :: Color
  , playerStunIndicatorSize :: Float
  , playerStunIndicatorColor :: Color
  }

data EnemyRenderParameters = EnemyRenderParameters
  { enemySize :: Float
  , enemyBaseColor :: Color
  , enemyStunnedColor :: Color
  }

data CellRenderParameters = CellRenderParameters
  { cellWallColor :: Color
  , cellStunColor :: Color
  , cellWallWidth :: Float
  }

defaultRenderParameters :: RenderParameters
defaultRenderParameters = RenderParameters
  625 10 10 (-275, 0) (0.12, 0.25) playerParams enemyParams cellParams
  where
    playerParams = PlayerRenderParameters 10 black 5 red
    enemyParams = EnemyRenderParameters 10 orange yellow
    cellParams = CellRenderParameters blue cyan 2