Frontier Software

Robert Laing's programming notes

Adversarial Search

By Robert Laing

Mobility

Mobility is a measure of how many things a player can do. This could be the number of actions available in the given state or within n steps of it, or the number of distinct states reachable within n steps. (These can differ, since multiple action sequences may lead to the same state. All roads lead to Rome.)

SELECT count(DISTINCT next_id) FROM tictactoe_graph;
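The query above counts distinct reachable states in a memoized tic-tac-toe game graph. Counting stored edges instead gives the action count; the two differ whenever move sequences converge. A sketch, assuming a tictactoe_graph table shaped like the chess_graph table below:

```sql
-- Distinct reachable states (transpositions collapse onto one target)
SELECT count(DISTINCT next_id) FROM tictactoe_graph;

-- Stored edges, i.e. actions explored so far; usually larger, since
-- several move sequences can reach the same state
SELECT count(*) FROM tictactoe_graph;
```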

Tables/Schemas

State Table

CREATE TABLE IF NOT EXISTS chess_states 
    (    id              SERIAL PRIMARY KEY
    ,    state           JSONB UNIQUE
    ,    goal            JSONB  -- NULL for non-terminal states
    );
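Since state is UNIQUE, a PostgreSQL upsert can insert a state when it is new and return its id either way. A minimal sketch (the DO UPDATE clause on conflict is only there so that RETURNING also works for pre-existing rows):

```sql
INSERT INTO chess_states (state, goal)
VALUES ($1, $2)
ON CONFLICT (state) DO UPDATE SET state = EXCLUDED.state
RETURNING id;
```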

Goal

Only terminal states have a goal value, so instead of storing a boolean terminal column, non-terminal states simply have NULL for goal.

Move Table

CREATE TABLE IF NOT EXISTS chess_moves
    (    id            SERIAL PRIMARY KEY
    ,    move          JSONB UNIQUE
    );

The move id is unique; moves only ever get appended, never updated.

Game Tree

CREATE TABLE IF NOT EXISTS chess_graph 
    (    init_id      INTEGER REFERENCES chess_states(id)
    ,    move_id      INTEGER REFERENCES chess_moves(id)
    ,    next_id      INTEGER REFERENCES chess_states(id)
    ,    utility      JSONB
    ,    mobility     JSONB
    ,    visits       INTEGER DEFAULT 0
    ,    total        JSONB
    ,    PRIMARY KEY  (init_id, move_id)
    );
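The Prolog snippets further down refer to prepared queries by name through a Dict (select_state_from_id, get_parent, update_utility, update_mobility). Minimal sketches of what those queries might look like; the exact column lists are assumptions:

```sql
-- select_state_from_id
SELECT state FROM chess_states WHERE id = $1;

-- get_parent: parents of a state, with the utility stored on the edge
SELECT init_id, utility FROM chess_graph WHERE next_id = $1;

-- update_utility
UPDATE chess_graph SET utility = $1 WHERE init_id = $2 AND next_id = $3;

-- update_mobility
UPDATE chess_graph SET mobility = $1 WHERE init_id = $2 AND next_id = $3;
```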

Bestmove

bestmove(Dict, TimeLimit, State, Bestmove)

Iterative deepening

Use a list [level(1, ID1), …|Frontier], appending children obtained from getmoves/2 to the front for depth-first search.

  1. Use getmoves(Dict, State, ValMoves) to get a list of move(Move, Value, Terminal, Visits, Focus, PlayerMobility, NextID) terms, ordered by MaxValue, MinFocus, MaxPlayerMobility.
  2. Prune moves with Value = -100 or Terminal == true from the list.
  3. findall(Next, odbc_execute(Dict.select_state_from_id, [NextID], row(Next)), Nexts)
  4. getmoves(Dict, State, ValMoves)
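The steps above can be sketched as a frontier loop. Here children/3 stands in for the combination of getmoves, pruning and state lookup, so everything other than the level(Depth, ID) bookkeeping is an assumption:

```prolog
% Depth-first search to a depth limit: prepend children to the frontier,
% and stop expanding once MaxDepth is reached. Iterative deepening calls
% this repeatedly with a growing MaxDepth.
search(_, [], _).
search(Dict, [level(Depth, ID)|Frontier], MaxDepth) :-
    (   Depth < MaxDepth
    ->  children(Dict, ID, ChildIDs),        % hypothetical expansion step
        Depth1 is Depth + 1,
        findall(level(Depth1, C), member(C, ChildIDs), Levels),
        append(Levels, Frontier, Frontier1)  % prepend => depth-first
    ;   Frontier1 = Frontier
    ),
    search(Dict, Frontier1, MaxDepth).
```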

Utility

  1. For each parent of newly added child node, get utility

utility:update(Dict, NextIDs, JUtilities),

update(_, [], []).

update(Dict, [NextID|NextIDs], [JUtility|JUtilities]) :-
    json_terms:json_terms(JUtility, Utility),
    findall(ParentID,
            odbc_execute(Dict.get_parent, [NextID], row(ParentID, Utility)),
            Unsorted),
    sort(Unsorted, ParentIDs),
    maplist(update_(Dict, Utility), ParentIDs),
    update(Dict, NextIDs, JUtilities).

update_(Dict, Goal, ID) :-
    odbc_execute(Dict.select_state_from_id, [ID], row(Json)),
    json_terms:json_terms(Json, Terms),
    member(control(Player), Terms),
    findall(parent(ParentID, JUtility),
            odbc_execute(Dict.get_parent, [ID], row(ParentID, JUtility)),
            Unsorted),
    sort(Unsorted, Parents),
    (   Parents == []
    ->  true
    ;   update_parents(Dict, Goal, Player, ID, Parents),
        findall(ParID, member(parent(ParID, _), Parents), ParIDs),
        maplist(update_(Dict, Goal), ParIDs)
    ).

update_parents(_, _, _, _, []) :- !.

update_parents(Dict, Goal, Player, ChildID, [parent(ParentID, JUtility)|Parents]) :-
    json_terms:json_terms(JUtility, Utility),
    member(goal(Player, X), Goal),
    member(goal(Player, Y), Utility),
    X > Y, !,
    json_terms:json_terms(NewUtility, Goal),
    odbc_execute(Dict.update_utility, [NewUtility, ParentID, ChildID]),
    update_parents(Dict, Goal, Player, ChildID, Parents).

update_parents(Dict, Goal, Player, ChildID, [_|Parents]) :-
    update_parents(Dict, Goal, Player, ChildID, Parents).

Mobility

For each new child created by add_children(Dict, State), the mobility of each role for InitID and NextID is initialised to NULL for non-terminals and zero for terminals.

GrandparentID and InitID can then be updated recursively upwards.

InitID’s Player can be given the number of non-losing moves as its value if that is smaller than the existing value (or replaces NULL).

odbc_execute(Dict.update_mobility, [JNewMobility, InitID, NextID])

mobility:update(Dict, Player, JUtilities, InitID).

Adding node 5:

  1. Count the non-losing moves for Player from JUtilities (5).
  2. Find the parent(s) and old mobility with select_parent_utility_mobility.
  3. If the old mobility for the player is NULL or higher, update_mobility.
  4. Recursively repeat bottom-up until reaching the previous Player turn.
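The counting step can be sketched as follows; treating any move whose goal value for Player is 0 as losing is an assumption about the utility encoding:

```prolog
% Count Player's non-losing moves from a list of JSON-encoded utilities.
non_losing_count(Player, JUtilities, Count) :-
    include(non_losing(Player), JUtilities, Keep),
    length(Keep, Count).

non_losing(Player, JUtility) :-
    json_terms:json_terms(JUtility, Utility),
    member(goal(Player, Value), Utility),
    Value > 0.
```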

State representation

Original kif

(init (cell a 1 wr))
(init (cell a 2 wp))
(init (cell a 3 b))
(init (cell a 4 b))
(init (cell a 5 b))
(init (cell a 6 b))
(init (cell a 7 bp))
(init (cell a 8 br))

(init (cell b 1 wn))
(init (cell b 2 wp))
(init (cell b 3 b))
(init (cell b 4 b))
(init (cell b 5 b))
(init (cell b 6 b))
(init (cell b 7 bp))
(init (cell b 8 bn))

(init (cell c 1 wb))
(init (cell c 2 wp))
(init (cell c 3 b))
(init (cell c 4 b))
(init (cell c 5 b))
(init (cell c 6 b))
(init (cell c 7 bp))
(init (cell c 8 bb))

(init (cell d 1 wq))
(init (cell d 2 wp))
(init (cell d 3 b))
(init (cell d 4 b))
(init (cell d 5 b))
(init (cell d 6 b))
(init (cell d 7 bp))
(init (cell d 8 bq))

(init (cell e 1 wk))
(init (cell e 2 wp))
(init (cell e 3 b))
(init (cell e 4 b))
(init (cell e 5 b))
(init (cell e 6 b))
(init (cell e 7 bp))
(init (cell e 8 bk))

(init (cell f 1 wb))
(init (cell f 2 wp))
(init (cell f 3 b))
(init (cell f 4 b))
(init (cell f 5 b))
(init (cell f 6 b))
(init (cell f 7 bp))
(init (cell f 8 bb))

(init (cell g 1 wn))
(init (cell g 2 wp))
(init (cell g 3 b))
(init (cell g 4 b))
(init (cell g 5 b))
(init (cell g 6 b))
(init (cell g 7 bp))
(init (cell g 8 bn))

(init (cell h 1 wr))
(init (cell h 2 wp))
(init (cell h 3 b))
(init (cell h 4 b))
(init (cell h 5 b))
(init (cell h 6 b))
(init (cell h 7 bp))
(init (cell h 8 br))

(init (control white))
(init (step 1))

Prolog

Start = [
cell(a,8,br),cell(b,8,bn),cell(c,8,bb),cell(d,8,bq),cell(e,8,bk),cell(f,8,bb),cell(g,8,bn),cell(h,8,br),
cell(a,7,bp),cell(b,7,bp),cell(c,7,bp),cell(d,7,bp),cell(e,7,bp),cell(f,7,bp),cell(g,7,bp),cell(h,7,bp),
cell(a,6,b),cell(b,6,b),cell(c,6,b),cell(d,6,b),cell(e,6,b),cell(f,6,b),cell(g,6,b),cell(h,6,b),
cell(a,5,b),cell(b,5,b),cell(c,5,b),cell(d,5,b),cell(e,5,b),cell(f,5,b),cell(g,5,b),cell(h,5,b),
cell(a,4,b),cell(b,4,b),cell(c,4,b),cell(d,4,b),cell(e,4,b),cell(f,4,b),cell(g,4,b),cell(h,4,b),
cell(a,3,b),cell(b,3,b),cell(c,3,b),cell(d,3,b),cell(e,3,b),cell(f,3,b),cell(g,3,b),cell(h,3,b),
cell(a,2,wp),cell(b,2,wp),cell(c,2,wp),cell(d,2,wp),cell(e,2,wp),cell(f,2,wp),cell(g,2,wp),cell(h,2,wp),
cell(a,1,wr),cell(b,1,wn),cell(c,1,wb),cell(d,1,wq),cell(e,1,wk),cell(f,1,wb),cell(g,1,wn),cell(h,1,wr),
control(white)
]

JSON Array

Json = [
 ["cell","a",8,"br"],["cell","b",8,"bn"],["cell","c",8,"bb"],["cell","d",8,"bq"],["cell","e",8,"bk"],["cell","f",8,"bb"],["cell","g",8,"bn"],["cell","h",8,"br"],
 ["cell","a",7,"bp"],["cell","b",7,"bp"],["cell","c",7,"bp"],["cell","d",7,"bp"],["cell","e",7,"bp"],["cell","f",7,"bp"],["cell","g",7,"bp"],["cell","h",7,"bp"],
 ["cell","a",6,"b"], ["cell","b",6,"b"], ["cell","c",6,"b"], ["cell","d",6,"b"], ["cell","e",6,"b"], ["cell","f",6,"b"], ["cell","g",6,"b"], ["cell","h",6,"b"],
 ["cell","a",5,"b"], ["cell","b",5,"b"], ["cell","c",5,"b"], ["cell","d",5,"b"], ["cell","e",5,"b"], ["cell","f",5,"b"], ["cell","g",5,"b"], ["cell","h",5,"b"],
 ["cell","a",4,"b"], ["cell","b",4,"b"], ["cell","c",4,"b"], ["cell","d",4,"b"], ["cell","e",4,"b"], ["cell","f",4,"b"], ["cell","g",4,"b"], ["cell","h",4,"b"],
 ["cell","a",3,"b"], ["cell","b",3,"b"], ["cell","c",3,"b"], ["cell","d",3,"b"], ["cell","e",3,"b"], ["cell","f",3,"b"], ["cell","g",3,"b"], ["cell","h",3,"b"],
 ["cell","a",2,"wp"],["cell","b",2,"wp"],["cell","c",2,"wp"],["cell","d",2,"wp"],["cell","e",2,"wp"],["cell","f",2,"wp"],["cell","g",2,"wp"],["cell","h",2,"wp"],
 ["cell","a",1,"wr"],["cell","b",1,"wn"],["cell","c",1,"wb"],["cell","d",1,"wq"],["cell","e",1,"wk"],["cell","f",1,"wb"],["cell","g",1,"wn"],["cell","h",1,"wr"],
 ["control","white"]
]
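The json_terms:json_terms/2 conversions used throughout map between these two representations. A bidirectional sketch of what such a predicate might look like (the real module may differ):

```prolog
% term_json(?Term, ?Json): cell(a,8,br) <-> ["cell","a",8,"br"]
term_json(Term, [Name|Args]) :-
    nonvar(Term), !,
    Term =.. [Functor|TArgs],        % decompose the Prolog term
    atom_string(Functor, Name),
    maplist(arg_json, TArgs, Args).
term_json(Term, [Name|Args]) :-      % Json -> Term direction
    atom_string(Functor, Name),
    maplist(arg_json, TArgs, Args),
    Term =.. [Functor|TArgs].

arg_json(N, N) :- number(N), !.      % numbers pass through unchanged
arg_json(A, S) :- atom(A), !, atom_string(A, S).
arg_json(A, S) :- string(S), atom_string(A, S).
```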

Expanding Minimax from 2 to n players

There is a tradition in textbooks of using an algorithm called minimax, which uses a single number to value states in a two-player game: one player is ahead if this number is positive, the opponent is ahead if it is negative, and zero indicates a draw.

This is often attributed to John von Neumann and Oskar Morgenstern’s book Theory of Games and Economic Behavior, though they never used the term, and they extended their theory from “zero-sum two-person games” to “zero-sum n-person games”.

A snag with using one number and a Max and a Min player is that it obscures the basic idea — the controlling player in a given state will pick their most favourable move. Instead of one number, something like a vector is needed, which I’ve written as [value(white, 50), value(black, 50)] and which GGP provides with its goal(State, Player, Value) predicates.

Minimax tends to go hand-in-hand with α-β pruning, which again restricts an idea applicable to n players to just two.

Essentially, what we need to do is:

  1. Assume the player whose turn it is — member(control(Player), Parent) — will pick the maximum of the value(Player, Value) of next states available.
  2. The opposing player needs to be forewarned about where future moves will lead. This means that while growing the game tree downward, parent nodes need to update their values with fresh values obtained from children, which I take to be what the jargon term dynamic programming means.
  3. Thanks to iterative deepening, the branches that need to be expanded narrow, helping the AI focus its search on the most promising moves.
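The first step, move selection by the controlling player, reduces to taking the maximum over that player's component of each child's value vector. A sketch, where pairing child ids with vectors as ID-Vector is an assumption about how children are listed:

```prolog
% best_child(+Player, +Children, -BestID): Children is a list of
% ID-Vector pairs, e.g. 7-[value(white, 100), value(black, 0)].
best_child(Player, Children, BestID) :-
    findall(V-ID,
            ( member(ID-Vector, Children),
              member(value(Player, V), Vector) ),
            Pairs),
    max_member(_-BestID, Pairs).   % standard order: compares V first
```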

Dynamic Programming

The diagram above indicates that once a terminal that isn’t a draw is reached, the existing value of its parent node can be replaced with the terminal’s value. The grandparent node in turn needs to check whether member(control(Player), GrandParent) has been snookered.

At this stage, I’m doing this upward propagation by replacing the existing max of member(control(Player), Parent) with the max of member(control(Player), Child), continuing upward to the root as long as changes are made.


One of the simplifications in the example used in Game Trees was that the entire graph was placed in the clausal store to search. Moving on, the game player starts knowing only the initial state and the rules for expanding the tree, so it has to play blind man’s buff to find its way to winning states and avoid losing ones.

I’m going to continue using the clausal store as a place to memoize explored paths so that they don’t have to be constantly regenerated. Another reason to remember the game tree as it develops is to support dynamic programming.

?- bestmove([cell(1, 1, x), cell(1, 2, x), cell(1, 3, o),
            cell(2, 1, b), cell(2, 2, o), cell(2, 3, b),
            cell(3, 1, b), cell(3, 2, o), cell(3, 3, x),
            control(white)], Move).
% Generated
% [Thread 3] Generated
% [Thread 4] Generated
% [Thread 4] Generated
% [Thread 4] Memoized
% [Thread 4] Memoized
% [Thread 4] Generated
Move = [does(white, mark(3, 1)), does(black, noop)].

Last updated on 12 Jan 2021
Published on 12 Jan 2021
