The following is based on the famous article "Why Functional Programming Matters" (sometimes referred to as "whyfp90") by John Hughes.
I'd like to implement an alpha-beta heuristic search algorithm for the complex game of tic-tac-toe. This document is the set of my notes going through the process of implementing and playing with the heuristic in Haskell.
Before we get into the game, here are the imports we'll need.
import Data.Array.IArray
import Data.List (intercalate)
import Data.Tree
import Control.Monad (forM_)
Game State
The game state for tic-tac-toe is a 3x3 grid filled with X's, O's, or blanks. Let's represent the grid cell contents (X, O, or blank) with a new data type.
data Cell = Blank | XX | OO
  deriving (Enum, Read, Eq, Ord)
instance Show Cell where
  show Blank = "."
  show XX = "X"
  show OO = "O"
-- Return the "opposite" cell contents
opp :: Cell -> Cell
opp Blank = Blank
opp XX = OO
opp OO = XX
data Player = PlayerX | PlayerO
  deriving (Show, Eq)
-- Return the piece assigned to each player
piece :: Player -> Cell
piece PlayerX = XX
piece PlayerO = OO
A position will be an integer tuple (x, y) with entries in {0, 1, 2}. To represent the grid, the most natural data type is the array. We'll use the basic Data.Array interface, defining the game state as an array with indices of type (Int, Int) and values of type Cell.
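As a quick warm-up (a standalone sketch, separate from the game code; the names grid0 and grid1 are just for illustration), here's how the handful of array operations we'll rely on behave on a toy 2x2 grid of characters:

```haskell
import Data.Array

-- A 2x2 grid of characters, indexed by (Int, Int) pairs
grid0 :: Array (Int, Int) Char
grid0 = listArray ((0,0), (1,1)) "...."

-- (//) builds a new array with some entries replaced;
-- (!) looks a cell up; elems lists cells in index order
grid1 :: Array (Int, Int) Char
grid1 = grid0 // [((0,1), 'X')]
-- grid1 ! (0,1) == 'X';  elems grid1 == ".X.."
```

Indices are enumerated row-major via the Ix instance, i.e. (0,0), (0,1), (1,0), (1,1), which is why the updated cell appears second in elems.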
type Pos = (Int, Int)
type State = Array Pos Cell
boardSize = 9
boardInds = ((0,0), (2,2))
-- Empty board
empty :: State
empty = listArray boardInds $ replicate boardSize Blank
-- Return a list of board positions in order
positions :: [Pos]
positions = indices empty
-- Return a list of cells in index order
cells :: State -> [Cell]
cells = elems
-- Look up the cell at a given position
lookupCell :: State -> Pos -> Cell
lookupCell = (!)
-- Add a new move to the board (argument order suits left folds)
update :: State -> (Pos, Cell) -> State
update st pc = st // [pc]
-- Return the list of unoccupied positions on the board
freePos :: State -> [Pos]
freePos st = filter (\p -> st ! p == Blank) positions
The goal is to use the interface specified above everywhere below. This way it will be easy to modify the internal representation of a board without rewriting all the code below.
Here are some basic constructors for game states.
-- Generate a State given a function from positions to cells
genState :: (Pos -> Cell) -> State
genState f = foldl (\s p -> update s (p, f p)) empty positions
-- Assumes that X goes first; constructs a game state from a sequence of
-- positions
posSeq :: [Pos] -> State
posSeq ps = foldl update empty (zip ps (cycle [XX, OO]))
It's useful to have a pretty printer for the game state.
-- Return a multi-line string representation of the state
pretty :: State -> String
pretty st = let chars = concat $ map show $ cells st
                rows = [ take 3 (drop (3*i) chars) | i <- [0..2] ]
            in intercalate "\n" rows
pprint :: State -> IO ()
pprint = putStrLn . pretty
-- Return a one-line string representation of the state
prettyOneLine :: State -> String
prettyOneLine = concat . map show . cells
Here are a couple of example states to experiment with:
exState0 = posSeq [(0,0), (2,2), (0,1)]
exState1 = posSeq [(1,1)]
exState2 = posSeq [(0,0), (2,2), (0,1), (1,0)]
exStateWinX = posSeq [(1,1), (0,0), (1,2), (0,1), (1,0)]
exStateWinO = posSeq [(2,2), (1,1), (0,0), (1,2), (0,1), (1,0)]
ghci> putStrLn $ pretty exState0
XX.
...
..O
Game Trees
We want to build the game tree starting at a given state. To do that we need to know whose turn it is, X or O, from the board data.
-- Examine the board and return whose turn it is, X or O.
getTurn :: State -> Cell
getTurn st
  | k == 0 = XX
  | k == 1 = OO
  | otherwise = error "encountered invalid board state"
  where k = count XX st - count OO st
        count c s = length . filter (==c) $ cells s
To build a game tree we need to know what moves are possible from a particular state. We define a moves function that takes a single game state and returns a list of possible game states which are reachable in one move from the given one.
-- A collection of cell extraction functions
getRow, getCol :: State -> Int -> [Cell]
getRow s i = [ lookupCell s (i,j) | j <- [0,1,2] ]
getCol s i = [ lookupCell s (j,i) | j <- [0,1,2] ]

getDiag, getAntiDiag :: State -> [Cell]
getDiag s = [ lookupCell s (i,j) | (i,j) <- [(0,0), (1,1), (2,2)] ]
getAntiDiag s = [ lookupCell s (i,j) | (i,j) <- [(2,0), (1,1), (0,2)] ]
-- Check if the given player has won the game
win :: Player -> State -> Bool
win player state = checkRows || checkCols || checkDiag || checkAntiDiag
  where c = piece player
        checkWin c list = all (==c) list
        checkRows = or [checkWin c $ getRow state i | i <- [0,1,2]]
        checkCols = or [checkWin c $ getCol state i | i <- [0,1,2]]
        checkDiag = checkWin c $ getDiag state
        checkAntiDiag = checkWin c $ getAntiDiag state
-- Generate a list of new states reachable from the given state in one move
moves :: State -> [State]
moves st
  | win PlayerX st || win PlayerO st = []
  | otherwise = map (\p -> update st (p, getTurn st)) (freePos st)
To generate the full game tree from a given state we'll unfold the tree using the function \s -> (s, moves s). The input is a seed, the output is a pair consisting of a node value and a list of seeds with which to generate the next level.
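To see the unfold pattern on its own, here's a standalone toy (the name smallTree is just for illustration): each seed n below 2 produces two child seeds n+1, so unfolding from 0 yields a complete binary tree of depth 2.

```haskell
import Data.Tree

-- Each node carries its depth; nodes at depth 2 are leaves
smallTree :: Tree Int
smallTree = unfoldTree (\n -> (n, if n < 2 then [n+1, n+1] else [])) 0
-- flatten smallTree == [0,1,2,2,1,2,2]  (pre-order)
```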
-- Generate a game tree from an initial game state
generate :: State -> Tree State
generate = unfoldTree (\s -> (s, moves s))
-- A general purpose tree fold
foldTree :: (a -> b -> a) -> (a -> a -> a) -> a -> Tree b -> a
foldTree f g a (Node b []) = f a b
foldTree f g a (Node b (t:rest)) = g (foldTree f g a t)
                                     (foldTree f g a (Node b rest))
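foldTree isn't used again in these notes, but a toy example shows the shape of the two combining functions. (Newer versions of containers export their own Data.Tree.foldTree, so this standalone sketch imports only the Tree type and renames the fold to avoid a clash; countNodes is just an illustrative name.)

```haskell
import Data.Tree (Tree(..))

-- Same definition as above, renamed to dodge Data.Tree.foldTree
foldTree' :: (a -> b -> a) -> (a -> a -> a) -> a -> Tree b -> a
foldTree' f _ a (Node b []) = f a b
foldTree' f g a (Node b (t:rest)) = g (foldTree' f g a t)
                                      (foldTree' f g a (Node b rest))

-- Count nodes: "visit" bumps the accumulator, "combine" adds subcounts
countNodes :: Tree b -> Int
countNodes = foldTree' (\a _ -> a + 1) (+) 0
-- countNodes (Node 'a' [Node 'b' [], Node 'c' [Node 'd' []]]) == 4
```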
It's interesting to see what would happen if we had defined moves in a naive way, just using the expression from the second guard.
moves' st = map (\p -> update st (p, getTurn st)) (freePos st)
generate' = unfoldTree (\s -> (s, moves' s))
For example,
ghci> let t = prune 6 . generate $ exState1
ghci> length . flatten $ t
24785
ghci> let t' = prune 6 . generate' $ exState1
ghci> length . flatten $ t'
28961
That's about 17% more states (at a depth of 6) if you ignore wins and keep on playing.
Back to business... to generate a game tree to a given depth we'll define a prune function.
-- A general purpose tree pruner
prune :: Int -> Tree a -> Tree a
prune 0 t = Node (rootLabel t) []
prune n t = Node (rootLabel t) (map (prune (n-1)) (subForest t))
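Thanks to laziness, prune also works on infinite trees, which is part of the appeal of this style: we can describe the complete tree and only ever build the part we look at. A standalone sketch (with its own copy of prune, renamed; infiniteTree is an illustrative name):

```haskell
import Data.Tree

prune' :: Int -> Tree a -> Tree a
prune' 0 t = Node (rootLabel t) []
prune' n t = Node (rootLabel t) (map (prune' (n-1)) (subForest t))

-- An infinite binary tree; pruning to depth 2 leaves 1 + 2 + 4 = 7 nodes
infiniteTree :: Tree Int
infiniteTree = unfoldTree (\n -> (n, [n+1, n+1])) 0
-- length (flatten (prune' 2 infiniteTree)) == 7
```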
For example, here's the game tree starting at exState0 pruned to level 1:
ghci> let t = prune 1 $ generate exState0
ghci> putStrLn $ drawTree $ fmap prettyOneLine t
XX......O
|
+- XXO.....O
|
+- XX.O....O
|
+- XX..O...O
|
+- XX...O..O
|
+- XX....O.O
|
`- XX.....OO
Evaluation of Game States
Static evaluation: first we should be able to determine if a game state represents a win for one player.
-- Take a game state and return a rough valuation of it. Larger results
-- are better for X, smaller (esp. negative) results are better for O.
staticVal :: State -> Int
staticVal state
  | win PlayerX state = 1
  | win PlayerO state = (-1)
  | otherwise = 0
Now we can look through a game tree and find wins or losses for X:
ghci> let t = fmap staticVal . prune 4 . generate $ exState1
ghci> 1 `elem` flatten t
True
So X can win in 2 moves starting at exState1, where X played the center position (not surprising!). However, O cannot win in that number of moves (also not surprising!), but O can win if we look one move deeper.
ghci> (-1) `elem` flatten t
False
ghci> let t' = fmap staticVal . prune 5 . generate $ exState1
ghci> (-1) `elem` flatten t'
True
Minimax Valuation
The minimax idea is that in order to judge the value of a position, it is useful to assume each player makes the best move available to them at each juncture. That's obviously not how every game plays out, but it's a good simplifying assumption.
We proceed by defining the valuation of a game state where it is X's turn as the maximum of the valuations of all the game states reachable in one move. Likewise, if it is O's turn, we define the valuation as the minimum of the valuations of all the game states reachable in one move. If the game state is a win or a tie, we define the valuation to be the static valuation above.
Note that this definition only makes sense if the game tree is finite. Even in such cases we may not be able to compute it if the game tree is too large, so in practice we'll prune the game tree to a reasonable depth before computing the valuation (and use the static valuation at leaves).
maximizeTreeVal :: (Ord a) => Tree a -> a
maximizeTreeVal (Node x []) = x
maximizeTreeVal (Node x subs) = maximum . map minimizeTreeVal $ subs

minimizeTreeVal :: (Ord a) => Tree a -> a
minimizeTreeVal (Node x []) = x
minimizeTreeVal (Node x subs) = minimum . map maximizeTreeVal $ subs
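A sanity check on a tiny hand-built tree (a standalone sketch with its own copies of the two functions, renamed maxVal and minVal; demoTree is an illustrative name): the maximizer picks a branch, but inside each branch the minimizer gets the final word, so the value is the best of the per-branch minima.

```haskell
import Data.Tree

maxVal, minVal :: (Ord a) => Tree a -> a
maxVal (Node x [])   = x
maxVal (Node _ subs) = maximum (map minVal subs)
minVal (Node x [])   = x
minVal (Node _ subs) = minimum (map maxVal subs)

-- Branch one offers leaves 3 and 12; branch two offers 2 and 8.
-- The minimizer answers 3 and 2 respectively, so the maximizer gets 3.
demoTree :: Tree Int
demoTree = Node 0 [ Node 0 [Node 3 [], Node 12 []]
                  , Node 0 [Node 2 [], Node 8  []] ]
-- maxVal demoTree == 3
```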
To make the functions more flexible, we can do the same as
maximizeTreeVal, but return the list of numbers that maximizeTreeVal
would take the maximum of.
maximizeTreeList :: (Ord a) => Tree a -> [a]
maximizeTreeList (Node x []) = [x]
maximizeTreeList (Node x subs) = map minimum . map minimizeTreeList $ subs

minimizeTreeList :: (Ord a) => Tree a -> [a]
minimizeTreeList (Node x []) = [x]
minimizeTreeList (Node x subs) = map maximum . map maximizeTreeList $ subs
With these definitions, the actual valuation function would be, e.g.:
ghci> let f = maximum . maximizeTreeList . fmap staticVal . prune 5 . generate
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package containers-0.5.0.0 ... linking ... done.
ghci> f exState1
0
If we increase the lookahead to 6 plies, we see that this is a good position for X!
ghci> let f = maximum . maximizeTreeList . fmap staticVal . prune 6 . generate
ghci> f exState1
1
Alpha-Beta
The key to implementing alpha-beta is to replace map maximum and map minimum in the definitions above with a function that decides whether it needs to consider the next list of numbers in the list, based on early numbers in that list.
mapmin :: (Ord a) => [[a]] -> [a]
mapmin [] = []
mapmin (xs:rest) = n : (omit n rest)
  where n = minimum xs

omit :: (Ord a) => a -> [[a]] -> [a]
omit _ [] = []
omit n (xs:rest)
  | minleq n xs = omit n rest
  | otherwise = k : omit k rest
  where k = minimum xs
The next function, minleq, is key to the whole thing. It takes a value n and a list of values and returns True if the minimum of the list is less than or equal to n. The function returns immediately, without evaluating the rest of the list, if it finds an element less than or equal to n. Under Haskell's lazy evaluation strategy, this means that if we never look at the numbers in one of the lists sent to minleq, then the entire subtree of the game tree rooted at those unexamined numbers never gets evaluated, saving time and space!
minleq :: (Ord a) => a -> [a] -> Bool
minleq _ [] = False
minleq n (y:ys)
  | y <= n = True
  | otherwise = minleq n ys
For example:
ghci> minleq 1 [0,2,3,undefined]
True
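We can also watch the pruning happen on plain lists of lists (a standalone sketch with its own copies of mapmin, omit, and minleq from above, so it loads on its own). The first list's minimum is 1; the second list starts with 0 <= 1, so the rest of it, including the undefined, is never evaluated:

```haskell
mapmin :: (Ord a) => [[a]] -> [a]
mapmin [] = []
mapmin (xs:rest) = n : omit n rest
  where n = minimum xs

omit :: (Ord a) => a -> [[a]] -> [a]
omit _ [] = []
omit n (xs:rest)
  | minleq n xs = omit n rest
  | otherwise = k : omit k rest
  where k = minimum xs

minleq :: (Ord a) => a -> [a] -> Bool
minleq _ [] = False
minleq n (y:ys)
  | y <= n = True
  | otherwise = minleq n ys

-- maximum (mapmin [[1,2], [0,undefined], [3,4]]) == 3
-- (the middle list is skipped: 0 <= 1 already proves it can't win)
```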
Here is the code for the corresponding mapmax:
mapmax :: (Ord a) => [[a]] -> [a]
mapmax [] = []
mapmax (xs:rest) = n : (omit' n rest)
  where n = maximum xs

omit' :: (Ord a) => a -> [[a]] -> [a]
omit' _ [] = []
omit' n (xs:rest)
  | maxleq n xs = omit' n rest
  | otherwise = k : omit' k rest
  where k = maximum xs

maxleq :: (Ord a) => a -> [a] -> Bool
maxleq _ [] = False
maxleq n (y:ys)
  | y >= n = True
  | otherwise = maxleq n ys
Finally, we can define our alpha-beta heuristic game state valuation:
abMaxList :: (Ord a) => Tree a -> [a]
abMaxList (Node x []) = [x]
abMaxList (Node x subs) = mapmin . map abMinList $ subs

abMinList :: (Ord a) => Tree a -> [a]
abMinList (Node x []) = [x]
abMinList (Node x subs) = mapmax . map abMaxList $ subs

abmax :: State -> Int
abmax = maximum . abMaxList . fmap staticVal . prune 8 . generate

abmin :: State -> Int
abmin = minimum . abMinList . fmap staticVal . prune 8 . generate
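As a cross-check, on any finite tree the alpha-beta lists should yield exactly the same value as plain minimax; only the amount of work differs. Here's a standalone sketch (with its own copies of the definitions above; omitMin and omitMax play the roles of omit and omit', and maxVal, minVal, and sample are illustrative names):

```haskell
import Data.Tree

-- Plain minimax, for reference
maxVal, minVal :: (Ord a) => Tree a -> a
maxVal (Node x [])   = x
maxVal (Node _ subs) = maximum (map minVal subs)
minVal (Node x [])   = x
minVal (Node _ subs) = minimum (map maxVal subs)

-- Alpha-beta machinery
minleq, maxleq :: (Ord a) => a -> [a] -> Bool
minleq _ [] = False
minleq n (y:ys) | y <= n = True
                | otherwise = minleq n ys
maxleq _ [] = False
maxleq n (y:ys) | y >= n = True
                | otherwise = maxleq n ys

mapmin, mapmax :: (Ord a) => [[a]] -> [a]
mapmin [] = []
mapmin (xs:rest) = minimum xs : omitMin (minimum xs) rest
mapmax [] = []
mapmax (xs:rest) = maximum xs : omitMax (maximum xs) rest

omitMin, omitMax :: (Ord a) => a -> [[a]] -> [a]
omitMin _ [] = []
omitMin n (xs:rest) | minleq n xs = omitMin n rest
                    | otherwise   = minimum xs : omitMin (minimum xs) rest
omitMax _ [] = []
omitMax n (xs:rest) | maxleq n xs = omitMax n rest
                    | otherwise   = maximum xs : omitMax (maximum xs) rest

abMaxList, abMinList :: (Ord a) => Tree a -> [a]
abMaxList (Node x [])   = [x]
abMaxList (Node _ subs) = mapmin (map abMinList subs)
abMinList (Node x [])   = [x]
abMinList (Node _ subs) = mapmax (map abMaxList subs)

-- A depth-two tree: minimax value is max(min 3 12, min 2 8, min 5 1) = 3
sample :: Tree Int
sample = Node 0 [ Node 0 [Node 3 [], Node 12 []]
                , Node 0 [Node 2 [], Node 8  []]
                , Node 0 [Node 5 [], Node 1  []] ]
-- maximum (abMaxList sample) == maxVal sample
```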
Analysis
Here's a simple printer for sets of boards we can use to examine different moves and their valuation under abmax:
printMoves :: State -> IO ()
printMoves st = forM_ (zip [0..] (moves st)) $ \(i,s) -> do
  putStrLn (show i)
  pprint s
  putStrLn ""
To start, let's see how the computer values the empty board:
ghci> fmap abmin $ moves empty
[0,0,0,0,0,0,0,0,0]
So all moves are equally good (our static valuation doesn't take into consideration that choosing the center is a good strategy).
So let's make a move as X and one as O:
ghci> let st = posSeq [(0,0), (2,0)]
ghci> pprint st
X..
...
O..
Now let's compute the valuation of each of X's possible moves from here:
ghci> fmap abmin $ moves st
[1,1,0,0,0,0,1]
Remember that positive values favor X and negative values favor O in our static valuation. Let's see what the game states are that correspond to the 1's in our valuation:
ghci> printMoves st
0
XX.
...
O..
1
X.X
...
O..
2
X..
X..
O..
3
X..
.X.
O..
4
X..
..X
O..
5
X..
...
OX.
6
X..
...
O.X
That certainly makes sense... Now let's look at another position a few moves in and see what happens:
ghci> let st' = posSeq [(1,1), (1,0), (0,2), (2,0)]
ghci> pprint st'
..X
OX.
O..
ghci> map abmax $ moves st'
[1,1,0,1,1]
ghci> printMoves st'
0
X.X
OX.
O..
1
.XX
OX.
O..
2
..X
OXX
O..
3
..X
OX.
OX.
4
..X
OX.
O.X
It's interesting here that move 2 is not valued highly, despite the chance to win in one. Also, move 0 is the only one that prevents O from winning in one. I think the reason this doesn't factor into the valuation is that two-in-a-row isn't particularly valued by the static valuation, so when O is trying to minimize the valuation, it doesn't see the value of having an unblocked two-in-a-row.
That, or there's a bug in my program :)