Hi,
A friend of mine is trying to develop a bot to play rummy, and he asked me for advice. I told him the trendy way of doing this for any game these days is to set up a neural network and train it from self-play games. The obvious problem with adapting the AlphaZero approach to a game with lots of hidden information is that MCTS doesn't quite work. Still, you could just train a network to predict the outcome of the game from a position and a move, and play lots of self-play games to generate training data. Assuming randomized strategies are not necessary to play the game well (I don't know if that's the case for rummy, actually), this should work.
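To make concrete what I mean by "predict the outcome from a position and a move", here is a toy sketch of that training loop. It uses a tabular Monte Carlo estimator instead of a neural network, and a trivial perfect-information subtraction game instead of rummy; all names are illustrative, not from any library.

```python
import random
from collections import defaultdict

# Toy subtraction game: players alternately remove 1-3 stones from a pile;
# whoever takes the last stone wins. A real rummy bot would replace the
# tabular predictor with a neural network over hand/discard features.

PILE = 10

def legal_moves(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def predictor_factory():
    # Monte Carlo win-rate estimates for (position, move), from the mover's view.
    wins = defaultdict(float)
    visits = defaultdict(int)
    def value(pile, move):
        key = (pile, move)
        return wins[key] / visits[key] if visits[key] else 0.5
    def update(pile, move, outcome):
        key = (pile, move)
        visits[key] += 1
        wins[key] += outcome
    return value, update

def play_game(value, epsilon):
    # Epsilon-greedy self-play; record (position, move, player) for training.
    pile, player, history = PILE, 0, []
    while pile > 0:
        moves = legal_moves(pile)
        if random.random() < epsilon:
            move = random.choice(moves)
        else:
            move = max(moves, key=lambda m: value(pile, m))
        history.append((pile, move, player))
        pile -= move
        player ^= 1
    winner = player ^ 1  # the player who just took the last stone
    return history, winner

def train(games=20_000, epsilon=0.2, seed=0):
    random.seed(seed)
    value, update = predictor_factory()
    for _ in range(games):
        history, winner = play_game(value, epsilon)
        for pile, move, player in history:
            update(pile, move, 1.0 if player == winner else 0.0)
    return value

value = train()
```

After training, `value(pile, move)` is the estimated win probability of playing `move` from `pile`, and the bot just plays the highest-valued legal move. The epsilon-greedy exploration is the crude part; this is exactly where the AlphaZero recipe's MCTS improvement step would normally go.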
Are there any examples of projects that have succeeded at training strong card-game bots using something of this style?
Thanks!
Álvaro.
AlphaZero approach for card games
Re: AlphaZero approach for card games
Hi Alvaro,
Because of hidden information, methods that work for board games usually don't work well for card games. The AlphaZero method itself may not work well, but other machine learning algorithms based on self play have been successfully applied to card games.
The main approach for statistical machine learning applied to card games is called CFR, which stands for Counterfactual Regret Minimization. Another important algorithm is Fictitious Self-Play (FSP), and its neural variant, NFSP. Searching the web for these keywords should give a lot of interesting resources.
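The core idea behind CFR is regret matching: accumulate, per action, how much better that action would have done than what you actually played, and mix your strategy in proportion to the positive regrets. A minimal sketch on rock-paper-scissors (illustrative names, no real library):

```python
import random

# Regret matching on rock-paper-scissors, the building block of CFR.
# Each player tracks per-action regret; the *average* strategies of two
# regret matchers in self-play converge to the Nash equilibrium,
# which for RPS is uniform (1/3, 1/3, 1/3).

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def utility(a, b):
    # Payoff to the player choosing a against b: win +1, lose -1, tie 0.
    if a == b:
        return 0
    return 1 if (a - b) % 3 == 1 else -1

def strategy_from_regret(regret):
    # Mix in proportion to positive regrets; uniform if none are positive.
    positives = [max(r, 0.0) for r in regret]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    return [1.0 / ACTIONS] * ACTIONS

def train(iterations=100_000, seed=0):
    rng = random.Random(seed)
    regret = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    strategy_sum = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        strats = [strategy_from_regret(r) for r in regret]
        actions = [rng.choices(range(ACTIONS), weights=s)[0] for s in strats]
        for p in range(2):
            opp = actions[1 - p]
            payoff = utility(actions[p], opp)
            for a in range(ACTIONS):
                # Regret: how much better a would have done vs. what was played.
                regret[p][a] += utility(a, opp) - payoff
                strategy_sum[p][a] += strats[p][a]
    # The time-averaged strategy approximates the equilibrium;
    # the current strategy just cycles.
    return [[s / iterations for s in ss] for ss in strategy_sum]

avg = train()
```

Full CFR applies this same update at every information set of the game tree, weighted by counterfactual reach probabilities, but the regret-matching core is all there is to it.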
The game of rummy is a bit similar to the game of mahjong. Recently Microsoft published a very interesting paper about reaching superhuman strength in mahjong.
Here are some papers:
- Suphx: Mastering Mahjong with Deep Reinforcement Learning https://arxiv.org/abs/2003.13590
- Deep Reinforcement Learning from Self-Play in Imperfect-Information Games https://arxiv.org/abs/1603.01121