Adding Features for Better Behavior
Last week we started exploring the idea of an AI built on an evaluation function. This has the potential to allow us to avoid a lot of the hand-crafting that comes with AI design. Hard old way specified all the rules for the AI to follow. In the new approach, we create a mathematical function to evaluate a game position. Then we can look at all our possible moves and select the one with the best result. We could, if we wanted to, turn the input to our evaluation function into a vector of numbers. And its output is also a number. This property will help us realize our dream future to machine learn this function.
We made a rudimentary version of this function last week. Even before turning to machine learning, there are a couple ways to improve our function. We can try tweaking the weights we applied to each feature. But we can also try coming up with new features, or try different combinations of features. This week, we'll try the latter approach.
In the coming weeks as we start exploring machine learning, we'll use Tensor Flow with Haskell! To get prepared, download our Haskell Tensor Flow guide!
Existing Features
Last week, we came up with a few different features that could help us navigate this maze. These features included:
- Maze distance to goal
- Manhattan distance to goal
- Whether or not an enemy is on our location
- Whether or not our stun is available
- The number of drills we have available
- The number of enemies that are nearby (using manhattan distance)
But there were some clear sub-optimal behaviors with our bot. We tend to get "zoned out" by enemies, even when they aren't near us by maze distance. Obviously, it would suit us to use maze distance instead of manhattan distance. But we also want to be willing to approach enemies aggressively when we have our stun, and retreat intelligently without it. To that end, let's add a couple more features:
- The number of enemies on the shortest path to the goal.
- The shortest distance to an enemy from a particular square (only up to 5)
We'll impose a penalty for close enemies if we don't have our stun. Otherwise we'll ignore this first new feature. Then we'll also impose a penalty having more enemies on our shortest path. This will make us more willing to use the stun, rather than waiting.
Enemies In The Way
Our first order of business will be to determine how many enemies lie on our shortest path. We'll filter the path itself based on membership in the active enemies set:
evaluateWorld :: World -> Float
evaluateWorld w =
where
activeEnemyLocations = …
shortestPath =
getShortestPath (worldBoundaries w) playerLoc goalLoc
enemiesOnPath = length $ filter
(\l -> Set.member l (Set.fromList activeEnemyLocations))
shortestPath
Then we'll assign each enemy on this path a penalty greater than the value of using the stun. We'll add this score to our other scores.
evaluateWorld :: World -> Float
evaluateWorld w =
enemiesOnPathScore +
...
where
enemiesOnPath = ...
enemiesOnPathScore = -85.0 * (fromIntegral enemiesOnPath)
Maze Distance
Next lets get the shortest maze distance to a nearby enemy. We'll actually want to generalize the behavior of our existing BFS function for this. We want to find the shortest path to any one of the enemy locations. So instead of supplying a single target location, we'll supply a set of target locations. Then we'll cap the distance to search so we aren't doing a full BFS of the maze every time. This gives an optional range parameter. Let's use these ideas to make an expanded API that our original function will use.
getShortestPathToTargetsWithLimit
:: Maze
-> Location
-> Set.Set Location
-> Maybe Int
-> [Location]
getShortestPathToTargetsWithLimit
maze initialLocation targetLocations maxRange = ...
-- Original function call!
getShortestPath maze initialLocation targetLocation =
getShortestPathToTargetsWithLimit maze initialLocation
(Set.singleton targetLocation) Nothing
bfs
:: Maze
-> Location
-> Set.Set Location -- Now a set of targets
-> Maybe Int -- Added range parameter
-> [Location]
bfs = ...
We'll have to make a few tweaks to our algorithm now. Each search state element will have a "distance" associated with it.
data BFSState = BFSState
{ bfsSearchQueue :: Seq.Seq (Location, Int)
...
-- Our initial state has a distance of 0
getShortestPathToTargetsWithLimit
maze initialLocation targetLocations maxRange =
evalState
(bfs maze initialLocation targetLocations maxRange)
(BFSState
(Seq.singleton (initialLocation, 0))
(Set.Singleton initialLocation)
Map.empty)
Now we need a couple modifications to the core bfs
function. When extracting the next element in the queue, we have to consider its distance. All new items we create will increment that distance. And if we're at the max distance, we won't add anything to the queue. Finally, when evaluating if we're done, we'll check against the set of targets, rather than a single target. Here's our bfs
code, with differences noted.
bfs
:: Maze
-> Location
-> Set.Set Location
-> Maybe Int
-> State BFSState [Location]
bfs maze initialLocation targetLocations maxRange = do
BFSState searchQueue visitedSet parentsMap <- get
if Seq.null searchQueue
then return []
else do
-- ! Unwrap distance as well
let (nextLoc, distance) = Seq.index searchQueue 0
-- ! Check set membership, not equality
if Set.member nextLoc targetLocations
then return (unwindPath parentsMap [nextLoc])
else do
-- ! Add the new distance to each adjacent cell
let adjacentCells = (, distance + 1) <$>
getAdjacentLocations maze nextLoc
-- ! Account for the distance with a new helper function
let unvisitedNextCells = filter
(shouldAddNextCell visitedSet)
adjacentCells
let newSearchQueue = foldr
(flip (Seq.|>))
(Seq.drop 1 searchQueue)
unvisitedNextCells
newVisitedSet = Set.insert nextLoc visitedSet
newParentsMap = foldr
(\(l, _) -> Map.insert l nextLoc)
parentsMap unvisitedNextCells
put (BFSState newSearchQueue newVisitedSet newParentsMap)
bfs maze initialLocation targetLocations maxRange
where
-- ! Helper function to account for distance when adding to queue
shouldAddNextCell visitedSet (loc, distance) = case maxRange of
Nothing -> not (Set.member loc visitedSet)
Just x -> distance <= x && not (Set.member loc visitedSet)
unwindPath parentsMap currentPath = ...
Now to use this feature, we'll use our new different shortest path call. If the distance is "0", this means we have no enemies near us, and there's no penalty. We also won't apply a penalty if our stun is available. Otherwise, we'll provide a stiffer penalty the shorter the path. Then we mix it in with the other scores.
evaluateWorld :: World -> Float
evaluateWorld w =
...
nearestEnemyDistanceScore +
...
where
...
nearestEnemyDistance = length $ getShortestPathToTargetsWithLimit
(worldBoundaries w)
playerLoc
(Set.fromList activeEnemyLocations)
(Just 4)
nearestEnemyDistanceScore =
if nearestEnemyDistance == 0 || stunAvailable then 0.0
else -100.0 * (fromIntegral (5 - nearestEnemyDistance))
We'll also drop the enemy manhattan distance weight to -5.0.
Results
From this change, our player suddenly appears much more intelligent! It will back away from enemies when it is missing it's stun. It will use the stun and go past the enemy when appropriate.
There are still ways we could improve the AI. It doesn't account for future space to retreat when running away. It sometimes uses the stun too early, when it might be better to wait for more enemies to come into range. But it's not clear how we could improve it by tweaking the weights. This means it's time to consider machine learning as an option to get better weights!
Conclusion
Next week, we'll re-acquaint ourselves with the basics of machine learning and Tensor Flow. This will set us up to write a program that will determine our AI weights.
We're going to start working with Tensor Flow next week! To make sure you can keep up, download our Haskell Tensor Flow Guide. It'll help you with the basics of making this complex Haskell library work.