How does alphazero start learning if moves are random and games don't finish

Implementing and improving the AlphaZero algorithm
tttt1234
Posts: 1
Joined: Sat Jul 08, 2023 4:07 pm

Post by tttt1234 »

Hi everyone! I am trying to program my own version of AlphaZero for chess. However, when training starts, all moves are essentially random, so it is highly unlikely that a game will reach a terminal state. How do you update the values in the MCTS nodes if the policy and value outputs are still just random?

Is this a common problem that is simply solved with more computational power and more iterations? Has anyone tackled a similar problem, or does anyone know how to proceed?

Any help is appreciated, thank you!
Rémi Coulom
Posts: 219
Joined: Tue Feb 12, 2008 8:31 pm

Re: How does alphazero start learning if moves are random and games don't finish

Post by Rémi Coulom »

In my experience, this is not a problem for chess. Even if the initial games are played completely at random, a few of them will end in checkmate. That should be enough to drive progress. After the first few iterations, games will look more normal.
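
As a rough illustration of why a handful of decisive games can be enough, here is a minimal sketch of how a finished self-play game could be turned into value targets. It is not taken from any particular implementation: the function name `value_targets`, the result encoding (+1 White wins, -1 Black wins, 0 draw), and the idea of scoring games that hit a move cap as draws are all assumptions made for the example.

```python
from typing import List

def value_targets(num_plies: int, result: int) -> List[int]:
    """Return one value target per position of a finished self-play game.

    result: +1 if White won, -1 if Black won, 0 for a draw. In this sketch,
    games stopped at a move cap are assumed to be scored as draws.
    Ply 0 is White to move, ply 1 is Black to move, and so on; each target
    is the final result seen from the side to move in that position.
    """
    targets = []
    for ply in range(num_plies):
        white_to_move = (ply % 2 == 0)
        targets.append(result if white_to_move else -result)
    return targets

# Hypothetical batch of (number of plies, result): mostly draws, two mates.
games = [(120, 0), (87, 1), (200, 0), (64, -1)]
labels = [t for plies, result in games for t in value_targets(plies, result)]
nonzero = sum(1 for t in labels if t != 0)
print(nonzero, "of", len(labels), "positions carry a win/loss signal")
```

Every position of a decisive game receives a non-zero target, so even a few random games that end in mate give the value head something to fit besides zeros. The drawn games are not wasted either, since they still contribute policy targets from the search visit counts.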