A computer program learns to play a game like go 囲碁 or chess, using machine learning to discover heuristics that evaluate positions accurately, perhaps by playing itself or by playing other opponents. Rather than always playing to win, it deliberately steers games toward positions that test and refine the heuristics it has discovered, even making suboptimal moves to reach them. This may be especially applicable when learning by self-play, because both sides can then cooperate to reach such positions.
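Here is a minimal sketch of the idea, under heavy assumptions: instead of go or chess it uses a toy single-pile game (take 1 to 3 stones, taking the last stone wins), the "heuristic" is just a table of estimated outcomes per pile size, and "wanting to test a heuristic" is modeled as a bonus for positions that have been visited rarely. The names `CuriousLearner`, `curiosity`, `self_play_game`, and `train` are all hypothetical, invented for illustration.

```python
import random
from collections import defaultdict

PILE = 10      # starting stones (toy game, an assumption of this sketch)
MAX_TAKE = 3   # a move removes 1..MAX_TAKE stones; taking the last stone wins


class CuriousLearner:
    """Tabular learner that prefers moves leading to rarely tested positions."""

    def __init__(self, curiosity=1.0, seed=0):
        self.value = defaultdict(float)  # heuristic: pile -> estimate for the side to move
        self.visits = defaultdict(int)   # how often each pile size has been tested
        self.curiosity = curiosity       # weight of the "go test this position" bonus
        self.rng = random.Random(seed)

    def score(self, pile, take):
        nxt = pile - take
        if nxt == 0:
            return 1.0                   # immediate win; nothing left to test
        exploit = -self.value[nxt]       # the opponent moves next, so negate
        bonus = self.curiosity / (1 + self.visits[nxt])
        return exploit + bonus

    def choose_move(self, pile):
        moves = list(range(1, min(MAX_TAKE, pile) + 1))
        best = max(self.score(pile, m) for m in moves)
        return self.rng.choice([m for m in moves if self.score(pile, m) == best])

    def update(self, history, winner):
        # Move each visited position's estimate toward the observed result
        # (a running average of +1 for a win, -1 for a loss).
        for pile, mover in history:
            outcome = 1.0 if mover == winner else -1.0
            self.visits[pile] += 1
            self.value[pile] += (outcome - self.value[pile]) / self.visits[pile]


def self_play_game(learner):
    """Both sides share one learner, so they cooperate in seeking test positions."""
    pile, player, history = PILE, 0, []
    while pile > 0:
        history.append((pile, player))
        pile -= learner.choose_move(pile)
        player = 1 - player
    winner = 1 - player                  # the side that took the last stone
    learner.update(history, winner)
    return winner


def train(games=200, curiosity=1.0, seed=0):
    learner = CuriousLearner(curiosity, seed)
    for _ in range(games):
        self_play_game(learner)
    return learner
```

With `curiosity=0.0` an untrained learner facing 3 stones simply takes all of them and wins. With a large `curiosity` it passes up the win to reach an untested pile size, which is the deliberately suboptimal move described above. Because the estimates are learned from these exploratory games, they reflect that playing style rather than perfect play; a fuller version might lower `curiosity` over time.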