A computer program plays a better opponent and loses. Look back over the program's analysis, particularly each move's principal variation (PV). Where did something go wrong? Usually at the point where the opponent played a much better move than the reply the PV predicted.
Was the failure due to a bad board evaluation? Or because a node was pruned that shouldn't have been? In these cases, we may be able to apply a machine learning classification algorithm to derive a new heuristic that puts the move on the correct side of the classifier.
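As a rough illustration of the first step, here is a minimal sketch of how one might scan a game log for the moments worth studying: positions where the opponent's actual reply differed from the PV's predicted reply and the program's score then dropped sharply. The `MoveRecord` fields, the `find_pv_misses` helper, the threshold, and the sample game are all hypothetical, invented for this example; a real engine would log this differently.

```python
from dataclasses import dataclass


@dataclass
class MoveRecord:
    """Hypothetical per-move log entry; the field names are assumptions."""
    ply: int
    predicted_reply: str  # opponent move the program's PV expected
    actual_reply: str     # opponent move actually played
    eval_before: float    # program's score before the reply (pawns, program's view)
    eval_after: float     # program's score after searching the actual reply


def find_pv_misses(records, swing_threshold=0.5):
    """Return records where the opponent left the PV and the score fell by at least the threshold."""
    misses = []
    for r in records:
        swing = r.eval_before - r.eval_after
        if r.actual_reply != r.predicted_reply and swing >= swing_threshold:
            misses.append(r)
    return misses


# Made-up game fragment: only the last record is both a surprise and a big drop.
game = [
    MoveRecord(10, "Nf6", "Nf6", 0.20, 0.18),
    MoveRecord(12, "d5", "c5", 0.25, 0.22),
    MoveRecord(14, "O-O", "Bxh2+", 0.30, -1.40),
]
misses = find_pv_misses(game)
```

The positions this returns would be the candidates to label ("evaluation was wrong" versus "move was pruned") before feeding them to whatever classifier is chosen.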
If the failure was purely due to the horizon effect, for example when the program loses to a deeper-searching version of itself, this approach will not (or at least should not) be very effective, though temporal difference learning has had its successes.
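For readers unfamiliar with temporal difference learning, the following is a minimal sketch of a textbook offline TD(λ) update for a linear evaluation function, applied to one finished game. It is an illustration under assumptions, not the method of any particular chess program: the `td_lambda_update` function, the two-feature vectors, and the values of `alpha` and `lam` are made up for the example.

```python
def td_lambda_update(weights, features, outcome, alpha=0.1, lam=0.7):
    """One offline TD(lambda) pass over a single game for a linear evaluator.

    features: one feature vector per position, from the program's point of view.
    outcome:  final result from the program's point of view (e.g. 1.0 win, 0.0 loss).
    """
    # Linear evaluation: value = dot(weights, feature vector).
    values = [sum(w * x for w, x in zip(weights, f)) for f in features]

    # Each position's target is the next position's value;
    # the last position's target is the actual game outcome.
    targets = values[1:] + [outcome]
    diffs = [t - v for t, v in zip(targets, values)]  # temporal differences

    new_weights = list(weights)
    for t, f in enumerate(features):
        # Discounted sum of this and all later temporal differences.
        credit = sum(lam ** (j - t) * diffs[j] for j in range(t, len(diffs)))
        for i, x in enumerate(f):
            # For a linear evaluator the gradient w.r.t. weight i is feature i.
            new_weights[i] += alpha * x * credit
    return new_weights


# Made-up example: three positions, two features, the program won.
weights = [0.0, 0.0]
features = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
updated = td_lambda_update(weights, features, outcome=1.0)
```

The point of the sketch is that the weights move toward whatever later positions (and finally the result) say the earlier positions were worth, which is why the method can help even when no single pruning or evaluation mistake can be pinned down.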
Surely this has been done.
1 comment:
Yes.
"KnightCap: A chess program that learns by combining TD(λ) with game-tree search" by Jonathan Baxter, Andrew Tridgell and Lex Weaver.
Source code is available from http://samba.org/KnightCap/, although I don't think the paper is there.