Search Methods

Three search methods were tested: expected minimax, expected alpha-beta and monte carlo search. For high-level implementation information, see below, for specifications, check the javadoc of the corresponding classes.

Expected Alpha-Beta

The class AlphaBetaSearchAlgorithm behaves just like the standard alpha-beta algorithm in the case of a deterministic node (see RUSSELL, 2003). In addition, it handles stochastic nodes in the following way:

  • stochasticMax(node, alpha, beta)
  • Let partialSum := 0
  • for each successive nodes s (i.e., each possible outcome of a dice roll)
  •   Let alphaPrime := max{-1, (#sidesOnTheDie) * alpha + (#remainingIterations) - partialSum}
  •   Let betaPrime := min{1, (#sidesOnTheDie) * beta - (#remainingIterations) - partialSum}
  •   partialSum := partialSum + min(alphaPrime, betaPrime)
  •   Let currentLowerBound = (partialSum - (#remainingIterations)) / (#sidesOnTheDie)
  •   if currentLowerBound >= beta
  •     return currentLowerBound
  • return partialSum / (#sidesOnTheDie)
  • (for stochasticMin, see the java source of AlphaBetaSearchAlgorithm) This cut-off is possible since all Heuristic used in this project are s.t. |heuristic(s)| <= 1 (for all knockabout state s). [Note that it is an implicit contract, see interface Heuristic]. So at each iteration, it is possible to find a lower bound for max value by assuming that all the remaining nodes (i.e. not explored yet) have a utility of -1.

    Moreover, AlphaBetaSearchAlgorithm sorts the move based on a heuristic evaluation at each deterministic node using mergesort. This has the effect of slowing down the search (from order 100000 nodes/sec with minimax to order 10000 nodes/sec on my machine), but the reduction of the average branching factor b is considerable. With move ordering and stochastic cutoff, the average b goes from approx. 100 (with minimax) to approx. 20 (if counting the stochastic ramifications). This allows AlphaBetaSearchAlgorithm to "see" 3 or 4 moves aheads and execute basic attack and defense moves. Unfortunately, empirical tests have shown that this is not enough to make AlphaBetaSearchAlgorithm significantly better than a mere expectiminimax. (See below for testing methodology) Possible improvements that could make alpha-beta a more viable alternative include the use of transposition table, a database of pre-computed chance nodes (because the search is harder to cutoff there), guessing opponent's move and using his time period to start the search.

    Monte Carlo Search

    The class MCSearchAlgorithm evaluates a knockabout board position using a Monte Carlo search technique. For each successor state, k simulations of depth d are performed and averaged. The first version of MCSearchAlgorithm simulated trajectories completely randomly and at first, did not peform very well against the expected minimax implementation MinMaxSearchAlgorithm. Since this first version of MCSearchAlgorithm generates moves at random, it implies that less heuristic operations have to be done per node. Therefore it is expected to explore more nodes by units of time than MinMaxSearchAlgorithm but surprisingly, the opposite is observed (a drop from about 100000 nodes/second to 15000 nodes/second). (And 100% of the games lost over minmax-2) This problem is not yet solved in either the first or second implementation, but the latter is already much better. Instead of generating the trajectories completely randomly, it greedily chooses the best successor half the time and uses a random generator the other half.

    Testing methodology