Frontier Software

Robert Laing's programming notes

Adversarial Search

By Robert Laing


State Table

    CREATE TABLE chess_states
    (    id              SERIAL PRIMARY KEY
    ,    state           JSONB UNIQUE
    ,    goal            JSONB  -- NULL for non-terminal states
    );


Only final states have a goal value, so rather than storing a separate boolean terminal flag, non-terminal states simply have NULL for goal.

Move Table

    CREATE TABLE chess_moves
    (    id            SERIAL PRIMARY KEY
    ,    move          JSONB UNIQUE
    );

The move id is unique; moves get appended, but never updated.

Game Tree

    CREATE TABLE chess_game_tree  -- table name assumed; the original fragment omits it
    (    init_id      INTEGER REFERENCES chess_states(id)
    ,    move_id      INTEGER REFERENCES chess_moves(id)
    ,    next_id      INTEGER REFERENCES chess_states(id)
    ,    utility      JSONB
    ,    mobility     JSONB
    ,    visits       INTEGER DEFAULT 0
    ,    total        JSONB
    ,    PRIMARY KEY  (init_id, move_id)
    );


The utility for (init_id, move_id) is calculated from next_id.

If the next state is terminal, utility = goal.

For non-terminal next states, utility needs to be estimated using heuristics based on the next state and the move:

  1. Piece count of the next state
  2. Moves to be avoided (e.g. in chess, moving a king or rook, which kills chances of later castling) and moves to be encouraged (e.g. in chess, checking the opponent, castling, advancing pawns two squares)
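The two heuristics above can be sketched as follows. This is a minimal illustration, not the post's actual code: the piece values and move-bonus weights are invented, the move kinds are hypothetical labels, and states use the JSON-array cell representation shown later in this post (`["cell", file, rank, piece]`, with `"b"` alone meaning a blank square).

```python
# Invented piece values for the material (piece count) heuristic.
PIECE_VALUE = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material(state, colour):
    """Sum piece values for one colour ('w' or 'b') over the cells."""
    total = 0
    for term in state:
        piece = term[3]
        # Occupied squares hold two-character codes like "wr"; "b" alone is blank.
        if term[0] == "cell" and len(piece) == 2 and piece[0] == colour:
            total += PIECE_VALUE[piece[1]]
    return total

# Invented weights for moves to encourage or avoid (hypothetical labels).
MOVE_BONUS = {"castle": 0.5, "check": 0.5, "pawn_two_squares": 0.2,
              "king_or_rook_first_move": -0.5}

def guess_utility(next_state, move_kind, colour):
    """Heuristic utility estimate when next_state is not terminal."""
    return material(next_state, colour) + MOVE_BONUS.get(move_kind, 0.0)

state = [["cell", "a", 1, "wr"], ["cell", "a", 2, "wp"],
         ["cell", "a", 3, "b"], ["cell", "a", 7, "bp"]]
print(material(state, "w"))                 # 6 (rook 5 + pawn 1)
print(guess_utility(state, "castle", "w"))  # 6.5
```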


Use the JSON reserved word null to differentiate unexplored parts of the game tree from final states, where the number of available moves is zero.

If next_id is a terminal state, mobility for all players is zero, e.g.:

[["white", 0], ["black", 0]]
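The three cases this null convention distinguishes can be sketched with the mobility column decoded to Python, where JSON null becomes `None` and the pair list above becomes a list of `[player, count]` pairs (the function name is my own):

```python
def node_status(mobility):
    """Classify a game-tree row by its decoded mobility value."""
    if mobility is None:
        return "unexplored"   # JSON null: this branch has not been expanded yet
    if all(count == 0 for _, count in mobility):
        return "terminal"     # zero moves for every player: a final state
    return "explored"

print(node_status(None))                           # unexplored
print(node_status([["white", 0], ["black", 0]]))   # terminal
print(node_status([["white", 20], ["black", 0]]))  # explored
```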

State representation

Original KIF

(init (cell a 1 wr))
(init (cell a 2 wp))
(init (cell a 3 b))
(init (cell a 4 b))
(init (cell a 5 b))
(init (cell a 6 b))
(init (cell a 7 bp))
(init (cell a 8 br))

(init (cell b 1 wn))
(init (cell b 2 wp))
(init (cell b 3 b))
(init (cell b 4 b))
(init (cell b 5 b))
(init (cell b 6 b))
(init (cell b 7 bp))
(init (cell b 8 bn))

(init (cell c 1 wb))
(init (cell c 2 wp))
(init (cell c 3 b))
(init (cell c 4 b))
(init (cell c 5 b))
(init (cell c 6 b))
(init (cell c 7 bp))
(init (cell c 8 bb))

(init (cell d 1 wq))
(init (cell d 2 wp))
(init (cell d 3 b))
(init (cell d 4 b))
(init (cell d 5 b))
(init (cell d 6 b))
(init (cell d 7 bp))
(init (cell d 8 bq))

(init (cell e 1 wk))
(init (cell e 2 wp))
(init (cell e 3 b))
(init (cell e 4 b))
(init (cell e 5 b))
(init (cell e 6 b))
(init (cell e 7 bp))
(init (cell e 8 bk))

(init (cell f 1 wb))
(init (cell f 2 wp))
(init (cell f 3 b))
(init (cell f 4 b))
(init (cell f 5 b))
(init (cell f 6 b))
(init (cell f 7 bp))
(init (cell f 8 bb))

(init (cell g 1 wn))
(init (cell g 2 wp))
(init (cell g 3 b))
(init (cell g 4 b))
(init (cell g 5 b))
(init (cell g 6 b))
(init (cell g 7 bp))
(init (cell g 8 bn))

(init (cell h 1 wr))
(init (cell h 2 wp))
(init (cell h 3 b))
(init (cell h 4 b))
(init (cell h 5 b))
(init (cell h 6 b))
(init (cell h 7 bp))
(init (cell h 8 br))

(init (control white))
(init (step 1))


Start = [

JSON Array

Json = [
 ["cell","a",6,"b"], ["cell","b",6,"b"], ["cell","c",6,"b"], ["cell","d",6,"b"], ["cell","e",6,"b"], ["cell","f",6,"b"], ["cell","g",6,"b"], ["cell","h",6,"b"],
 ["cell","a",5,"b"], ["cell","b",5,"b"], ["cell","c",5,"b"], ["cell","d",5,"b"], ["cell","e",5,"b"], ["cell","f",5,"b"], ["cell","g",5,"b"], ["cell","h",5,"b"],
 ["cell","a",4,"b"], ["cell","b",4,"b"], ["cell","c",4,"b"], ["cell","d",4,"b"], ["cell","e",4,"b"], ["cell","f",4,"b"], ["cell","g",4,"b"], ["cell","h",4,"b"],
 ["cell","a",3,"b"], ["cell","b",3,"b"], ["cell","c",3,"b"], ["cell","d",3,"b"], ["cell","e",3,"b"], ["cell","f",3,"b"], ["cell","g",3,"b"], ["cell","h",3,"b"],

Expanding Minimax from 2 to n players

There is a tradition in textbooks of using an algorithm called minimax, which uses a single number to value states in a two-player game: one player is ahead if this number is positive, the opponent is ahead if it's negative, and zero indicates a draw.

This is often attributed to John von Neumann and Oskar Morgenstern's book Theory of Games and Economic Behavior, though they never used the term, and they extended their theory from "zero-sum two-person games" to zero-sum n-person games.

A snag with using one number and a Max and a Min player is that it obscures the basic idea: the controlling player in a given state will pick their most favourable move. Instead of one number, something like a vector is needed, which I've done as [value(white, 50), value(black, 50)], and which GGP provides with its goal(State, Player, Value) predicates.

Minimax tends to go hand-in-hand with α-β pruning, which again restricts an idea applicable to n players to just two.

Essentially, what we need to do is:

  1. Assume the player whose turn it is, member(control(Player), Parent), will pick the maximum of the value(Player, Value) of the next states available.
  2. The opposing player needs to be forewarned about where future moves will lead. This means that while growing the game tree downward, parent nodes need to update their values with fresh values obtained from children, which I think is what the jargon term dynamic programming means.
  3. Thanks to iterative deepening, the branches that need to be expanded narrow, helping the AI focus on the most promising lines.
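The vector-valued generalisation of minimax described above (sometimes called maxⁿ in the literature) can be sketched with a toy game tree. The tree and the utility vectors here are invented purely for the demo; the point is step 1: each node carries a value per player, and the player in control picks the child that maximises their own component.

```python
TREE = {                       # state -> (player in control, child states)
    "root": ("white", ["a", "b"]),
    "a":    ("black", ["a1", "a2"]),
    "b":    ("black", ["b1", "b2"]),
}
UTILITY = {                    # terminal state -> per-player value vector
    "a1": {"white": 100, "black": 0},
    "a2": {"white": 0,   "black": 100},
    "b1": {"white": 50,  "black": 50},
    "b2": {"white": 80,  "black": 40},
}

def maxn(state):
    """Return the value vector the controlling player can force from state."""
    if state in UTILITY:
        return UTILITY[state]
    mover, children = TREE[state]
    # The controlling player maximises only their own component of the vector.
    return max((maxn(c) for c in children), key=lambda v: v[mover])

print(maxn("root"))   # {'white': 50, 'black': 50}
```

Note how the forewarning in step 2 emerges: white avoids branch "a", because although it contains white's best terminal (a1), black controls that node and would steer to a2.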

Dynamic Programming

The above diagram indicates that once a terminal state is reached that isn't a draw, the existing value of its parent node can be replaced with that value. The grandparent node in turn needs to check whether member(control(Player), GrandParent) has been snookered.

At this stage, I'm doing this upward propagation by replacing the existing max of member(control(Player), Parent) with the max of member(control(Player), Child), continuing upward to the root as long as changes are made.
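That upward propagation can be sketched as follows. The dict-based tree layout (parent links, control, children, value vectors) is an invented stand-in for the real memoized store; only the loop, re-maximising each ancestor and stopping as soon as nothing changes, is the point.

```python
PARENT   = {"a": "root", "b": "root", "a1": "a", "a2": "a"}
CONTROL  = {"root": "white", "a": "black"}
CHILDREN = {"root": ["a", "b"], "a": ["a1", "a2"]}
VALUES = {                     # current (possibly stale) value vectors
    "root": {"white": 50, "black": 50},
    "a":    {"white": 50, "black": 50},
    "b":    {"white": 50, "black": 50},
    "a1":   {"white": 50, "black": 50},
    "a2":   {"white": 0,  "black": 100},   # freshly reached terminal
}

def propagate(leaf):
    """Walk from leaf toward the root, re-maximising each ancestor's
    value over its children; stop as soon as a node's value is unchanged."""
    node = PARENT.get(leaf)
    while node is not None:
        mover = CONTROL[node]
        best = max((VALUES[c] for c in CHILDREN[node]),
                   key=lambda v: v[mover])
        if best == VALUES[node]:
            break              # no change, so ancestors are already correct
        VALUES[node] = best
        node = PARENT.get(node)

propagate("a2")
print(VALUES["a"], VALUES["root"])
```

After the call, node "a" (black to move) has picked up the terminal's value, while the root is unchanged because white still prefers branch "b".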

One of the simplifications in the example used in Game Trees was that the entire graph was placed in the clausal store to search. Moving on, the game player starts knowing only the initial state and the rules for expanding the tree, so it has to play blind man's buff to find its way to winning states and avoid losing ones.

I’m going to continue using the clausal store as a place to memoize explored paths so that they don’t have to be constantly regenerated. Another reason to remember the game tree as it develops is to support dynamic programming.
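The memoization idea behind the "Generated" and "Memoized" log lines below can be sketched with a dict standing in for the clausal store (the real states, being JSONB, would need a canonical hashable key; `legal_moves` here is an invented stand-in for the real rule evaluation):

```python
CACHE = {}        # state -> legal moves, standing in for the clausal store
GENERATED = []    # log of states actually expanded by the rules

def legal_moves(state):
    """Stand-in for expensive rule evaluation; logs each expansion."""
    GENERATED.append(state)
    return [state + (m,) for m in range(2)]

def moves(state):
    if state not in CACHE:             # "Generated": expand via the rules
        CACHE[state] = legal_moves(state)
    return CACHE[state]                # "Memoized": replayed from the store

moves(("root",))
moves(("root",))                       # second call hits the cache
print(len(GENERATED))                  # 1
```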

?- bestmove([cell(1, 1, x), cell(1, 2, x), cell(1, 3, o),
            cell(2, 1, b), cell(2, 2, o), cell(2, 3, b),
            cell(3, 1, b), cell(3, 2, o), cell(3, 3, x),
            control(white)], Move).
% Generated
% [Thread 3] Generated
% [Thread 4] Generated
% [Thread 4] Generated
% [Thread 4] Memoized
% [Thread 4] Memoized
% [Thread 4] Generated
Move = [does(white, mark(3, 1)), does(black, noop)].

Last updated on 12 Jan 2021
Published on 12 Jan 2021
