
It turns out there is this thing called "Deep Learning", which as far as I can tell stands for "neural networks now work", and there are all sorts of things that NNs have become good at since around 2010.
About a year ago I made a very strong Spanish checkers program using a NN as the evaluation function, and I used unsupervised learning to train it. These are the steps I followed:
1) I computed 6-piece endgame tablebases, because shallow searches are not going to learn some difficult endgames, and in this particular game it is critically important to know about them in order to determine what "enough advantage to win" means in the midgame.
2) I generated a game database (1000 or so games?) using alpha-beta search with a random evaluation function.
3) I trained a NN using the game database.
4) I generated another game database using alpha-beta search with the NN plus a small random number as the evaluation function.
5) I iterated steps 3 and 4 (a total of 3 times, I think; I didn't keep good records) using increasingly large, higher-quality databases and increasingly complex NNs. A skeleton of this loop is sketched right below the list.
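Roughly, the loop looks like this in Python. Only the iteration structure comes from the steps above; `bootstrap`, `play_game` and `train_nn` are hypothetical placeholder names for the engine-specific parts (the alpha-beta search and whatever NN training code you use), not anything from my actual program:

```python
import random

def bootstrap(play_game, train_nn, iterations=3, games_per_iter=1000, noise=0.05):
    """Skeleton of the self-play bootstrap (steps 2-5).

    play_game(evaluate) -- plays one game with alpha-beta using `evaluate`
                           at the leaves and returns its positions/result.
    train_nn(games)     -- fits a NN to the accumulated game database and
                           returns the new evaluation function.
    Both are placeholders for the engine-specific code.
    """
    # Step 2: start from a purely random evaluation function.
    evaluate = lambda position: random.uniform(-1.0, 1.0)
    games = []
    for _ in range(iterations):
        # Step 4: add a small random term so the games are diverse.
        noisy_eval = lambda position, e=evaluate: e(position) + random.uniform(-noise, noise)
        games += [play_game(noisy_eval) for _ in range(games_per_iter)]
        # Step 3: retrain on the (growing) database.
        evaluate = train_nn(games)
    return evaluate
```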
I didn't go overboard with the complexity of the NN. My final version had 121 inputs (a very raw description of the board, with booleans indicating the presence of each type of piece on each possible square, plus whose turn it is), two hidden layers with 32 ReLU units each and a single linear output (tanh might have been better?).
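For concreteness, that architecture is small enough to write down in a few lines. The layer sizes below match the description above; the PyTorch framing and the class name `CheckersEval` are just my choice of illustration, not what the original code used:

```python
import torch
import torch.nn as nn

class CheckersEval(nn.Module):
    """121 binary inputs -> 32 ReLU -> 32 ReLU -> 1 linear output."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(121, 32),
            nn.ReLU(),
            nn.Linear(32, 32),
            nn.ReLU(),
            nn.Linear(32, 1),  # linear score; wrap in torch.tanh to try the tanh variant
        )

    def forward(self, x):
        return self.net(x)

# Shape check: evaluate a batch of 4 random 121-dimensional input vectors.
scores = CheckersEval()(torch.rand(4, 121))
```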
It's hard to test how strong the resulting program was, because high-quality games result in lots of draws, but qualitatively the games kept getting better and better. In the first batch of games the program didn't know that having more pieces than the opponent is a good thing, so it would needlessly give material away. Then it figured out that more material is a good thing, but it would advance its back-row pawns too readily, letting the opponent get a king without difficulty. By the end, it seemed to have at least as much positional understanding of the game as I did. I then let my friend and Spanish checkers master José Manuel Morán play against it, and his assessment was "it cannot be beaten".
The whole project took about a month. Training a simple NN like the ones I used can be done in a couple of hours on a single CPU thread; with a GPU you can scale to much larger NNs.
==BEYOND CHECKERS==
Yes, checkers was solved by Chinook a few years ago, although the variant of checkers I was dealing with is quite different. But the technique is completely general, and I managed to write a very strong engine without using any insights from human players. Can we do this for other games?
In particular, I am interested in computer go. Starting in December 2014 there have been at least three papers about using NNs to predict the next move in go games played by strong players (they all use convolutional neural networks; a rough sketch of that kind of network follows the list):
* http://arxiv.org/abs/1412.3409
* http://arxiv.org/abs/1412.6564
* http://arxiv.org/abs/1511.06410
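If anyone wants a starting point, something along these lines is the basic shape of the move-prediction networks in those papers. The plane count, channel width and depth below are illustrative assumptions of mine, not the configurations from the papers, and `GoPolicyNet` is just a made-up name:

```python
import torch
import torch.nn as nn

class GoPolicyNet(nn.Module):
    """Tiny convolutional move-prediction net.
    Input: (batch, planes, 19, 19) binary feature planes; output: 361 move logits."""
    def __init__(self, planes=8, channels=64, depth=5):
        super().__init__()
        layers = [nn.Conv2d(planes, channels, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU()]
        self.trunk = nn.Sequential(*layers)
        self.head = nn.Conv2d(channels, 1, 1)  # 1x1 conv -> one logit per board point

    def forward(self, x):
        return self.head(self.trunk(x)).flatten(1)  # (batch, 361)

# Train with cross-entropy against the move the strong player actually chose.
loss = nn.CrossEntropyLoss()(GoPolicyNet()(torch.rand(2, 8, 19, 19)),
                             torch.randint(0, 361, (2,)))
```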
I have been playing around with similar ideas the last few weeks, with the ultimate goal of making a NN that can be used as an evaluation function for go (where no usable evaluation function exists).
What are other games for which evaluation functions seem difficult to write? Do you think NNs might work for them? I'll be happy to assist others in getting started with this exciting technique, if there is interest.