## Applying go methods to other games

Implementing and improving the AlphaZero algorithm
spurious_ai
Posts: 34
Joined: Fri Mar 28, 2008 8:15 pm

### Re: Applying go methods to other games

If you were to use a hash table to store positions, wouldn't that solve most of the difficulty? I guess you would also have to have some bounds, you couldn't keep playing add one to column one.
Rémi Coulom
Posts: 207
Joined: Tue Feb 12, 2008 8:31 pm
Contact:

### Re: Applying go methods to other games

spurious_ai wrote:If you were to use a hash table to store positions, wouldn't that solve most of the difficulty? I guess you would also have to have some bounds, you couldn't keep playing add one to column one.
The basic principle of MC search is this: in order to evaluate a position, play plenty of games at random, and take the average outcome as evaluation. It works for Go, because you can play moves at random, and the board will eventually fill up. It works also for SameGame. But if you select actions uniformly at random in Numbrosia, random playouts are likely to never finish. I can't see how a hash table would help. Maybe it would be possible to use it to avoid playing into previously visited states, but it would still be extremely unlikely to converge to the solution in a meaningful number of moves.

It may be possible to make it work by using clever playouts, though. That is to say, instead of selecting actions uniformly at random, use a more clever probability distribution.

I don't expect it could beat beam search, though.

Rémi
luke_g
Posts: 29
Joined: Sun Feb 17, 2008 9:52 am

### Re: Applying go methods to other games

Yes, for MC to work, a playout needs to be a good approximation for the true value. Meaning, the expected value of the playout reasonably reflects the true value. Games like Go and SameGame, which "stabilize" over time, are good candidates. In a game like Numbrosia, the MC playout would tend to play way too many moves. There are various ways you could try to make a MC playout be a better approximation, but it's not worthwhile when a reasonable, simple evaulation function exists (sum of squares or whatnot).

It's worth mentioning, UCT is a tree search algorithm, and could be applied with any evaluation function-- but it has been especially effective with MC playouts.
luke_g, your solution for 20+ move Numbrosia puzzles take over a day, how long would a 30 move puzzle take? At some point you have to abandon the idea of optimal, and find a method that gives you the best solution possible in a reasonable amount of time.
Absolutely. However I never heard of using A* /IDA* for non-optimal solutions, which is what confused me about that part of the paper.
spurious_ai
Posts: 34
Joined: Fri Mar 28, 2008 8:15 pm

### Re: Applying go methods to other games

I adjusted my Jt's Blocks program to run on the SameGame dimensions. I started with the first board in the test set they used in their paper. I posted a score of 2799 ( http://www.js-games.de/eng/highscores/samegame/lx ). This beat their best result by 242 points. My program ran about 15 minutes to find this result. They report using 2 hours, but it doesn't say if that was per board or for all 20 boards.

I still have some adjustments to my program and believe I can find a higher score. The biggest problem is the 2 block move. Under the scoring used for this version of SameGame, (n-2)*(n-2), a 2 block move scores 0 points.

Could the 0 point move cause a problem in their MC implementation? Should have I been able to beat their score?
Rémi Coulom
Posts: 207
Joined: Tue Feb 12, 2008 8:31 pm
Contact:

### Re: Applying go methods to other games

spurious_ai wrote:They report using 2 hours, but it doesn't say if that was per board or for all 20 boards.
If I understood correctly, they used 2 hours per board. It seems rather clear according to the sentence at the top of page 10.
spurious_ai wrote:Should have I been able to beat their score?
You would have to run more boards to figure out if you are better on average. I would not be surprised at all if you are. The SameGame domain had probably not been researched very deeply before their paper.

Rémi
luke_g
Posts: 29
Joined: Sun Feb 17, 2008 9:52 am

### Re: Applying go methods to other games

spurious_ai wrote:I still have some adjustments to my program and believe I can find a higher score. The biggest problem is the 2 block move. Under the scoring used for this version of SameGame, (n-2)*(n-2), a 2 block move scores 0 points.

Could the 0 point move cause a problem in their MC implementation? Should have I been able to beat their score?
I noticed that the scores listed in the paper aren't in the hall of fame at the website. I wonder if they submitted their scores?
spurious_ai
Posts: 34
Joined: Fri Mar 28, 2008 8:15 pm

### Re: Applying go methods to other games

The scores they compare their program to are posted.

I have a way around the 0 point move for now, but it really does cause problems with my method.
spurious_ai
Posts: 34
Joined: Fri Mar 28, 2008 8:15 pm

### Re: Applying go methods to other games

First pass thru the test datasets.

I will to call mine Simple Breadth Search (SBS)

Code: Select all

``````Brd  SP-MCTS  SBS  +/-
---  -------  ---  ---
1     2557   2977  420
2     3749   3969  220
3     3085   3569  484
4     3641   3703   62
5     3653   4143  490
6     3971   4689  718
7     2797   2997  200
8     3715   4363  648
9     4603   4967  364
10    3213   3781  568
11    3047   3373  326
12    3131   3851  720
13    3097   3179   82
14    2589   3211  622
15    3183   3679  496
16    4879   5481  602
17    4609   5003  394
18    4853   5411  558
19    4503   5319  816
20    4853   4919   66
--------------------------
Tot  73998  82604 8606
``````
I have a few problems to solve, but going to 2 hours from 15 minutes, I should be able to find a lot more points.

EDIT: I will update scores as I find them
Last edited by spurious_ai on Mon Jun 09, 2008 8:42 pm, edited 19 times in total.
luke_g
Posts: 29
Joined: Sun Feb 17, 2008 9:52 am

### Re: Applying go methods to other games

Oh, so actually it looks like the website's scoring system penalizes you more than the scoring system presented in the paper. In the paper it says that you lose (n-2)^2 points for each color (where n is the number of blocks of that color remaining), while on the website you lose (N-2)^2 points (where N is the total number of blocks remaining). For eample, if you have 1 green and 2 red remaining, the paper's scoring system would penalize you by 1+4 = 5, while the javascript game penalizes you 9.

Spurious, if that's correct, that means you're beating them by quite a bit more than you've indicated for puzzles where you didn't completely clear the board.

EDIT: It looks like the board is usually completely cleared, so this doesn't apply.
spurious_ai
Posts: 34
Joined: Fri Mar 28, 2008 8:15 pm

### Re: Applying go methods to other games

Jt's Blocks is only a 14 x 9 grid with 5 colors and I have never seen a board that would not clear ( and I have played over 200,000 boards ). So a 15 x 15 grid would probably need 6 or 7 colors before clear would become difficult. Of course you could design a board that would not clear, checkerboard with only 2 colors, but a for a randomized board, probably would not happen.

The guys that wrote the paper have now posted their scores.