Alpha Zero plays chess
Joe Joyce wrote on Thu, Dec 14, 2017 09:12 AM UTC:

Aurelian, I've read the first part of the paper V. Reinhart linked a bit after our comments. My math was always bad, but I think this is a relevant paragraph in the paper:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root s_root to leaf. Each simulation proceeds by selecting in each state s a move a with low visit count, high move probability and high value (averaged over the leaf states of simulations that selected a from s) according to the current neural network f_θ. The search returns a vector π representing a probability distribution over moves, either proportionally or greedily with respect to the visit counts at the root state.
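To make that paragraph a bit more concrete, here is a rough sketch of how I read the selection step and the returned vector π. It is not code from the paper: the scoring formula is a PUCT-style rule of the kind described for AlphaGo Zero, and the names `puct_score`, `select_move`, `policy_from_visits`, the `stats` layout and the constant `c_puct=1.5` are my own assumptions for illustration.

```python
import math

def puct_score(q, prior, visits, total_visits, c_puct=1.5):
    """Score a move: high value q, high prior probability and low visit count
    all push the score up. c_puct is an assumed exploration constant."""
    return q + c_puct * prior * math.sqrt(total_visits) / (1 + visits)

def select_move(stats, c_puct=1.5):
    """Pick the move with the best score in one state.
    stats maps move -> {"visits": int, "prior": float, "q": float}."""
    total = sum(s["visits"] for s in stats.values())
    return max(
        stats,
        key=lambda a: puct_score(
            stats[a]["q"], stats[a]["prior"], stats[a]["visits"], total, c_puct
        ),
    )

def policy_from_visits(stats, greedy=False):
    """Turn root visit counts into the vector pi, either proportionally
    or greedily (all probability on the most visited move)."""
    total = sum(s["visits"] for s in stats.values())
    if total == 0:
        # No simulations yet: fall back to a uniform distribution.
        return {a: 1.0 / len(stats) for a in stats}
    if greedy:
        best = max(stats, key=lambda a: stats[a]["visits"])
        return {a: (1.0 if a == best else 0.0) for a in stats}
    return {a: s["visits"] / total for a, s in stats.items()}

# Hypothetical root statistics after 11 simulations (made-up numbers).
root = {
    "e4": {"visits": 10, "prior": 0.5, "q": 0.1},
    "d4": {"visits": 0, "prior": 0.4, "q": 0.0},
    "a3": {"visits": 1, "prior": 0.1, "q": -0.2},
}
next_to_explore = select_move(root)        # the unvisited, high-prior move
pi_proportional = policy_from_visits(root)
pi_greedy = policy_from_visits(root, greedy=True)
```

With these made-up numbers the next simulation goes to the move that has not been visited yet, while π still puts most of its weight on the most visited move. The point is that the net only has to rank the moves available in one state at a time; the tree search does the rest.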

I believe that it would take a truly remarkable neural net to significantly outperform all humans, whether playing individually or as teams acting as a general staff, because the sheaves of probability explode as each potential group of moving pieces interacts with each different board, or even with the different entry squares or entry times presented.

Let me offer you a link to a website under construction that steps through the first "day" of a purely combinatorial abstract strategy combat simulation. A day includes 24 sequential "daylight" turns alternating between blue and red, and a lesser number of "night" turns to finish all combat, separate the 2 sides, and "rally" troops, which means returning 1/3rd of each side's losses to the owning player to drop by friendly leaders. Marked reinforcements come in between turns 8 & 9 (4 turns for each side) on their assigned entry areas; they are unmarked and move normally from the start of the next daylight turn. The sequence above is then repeated, with the on-board sides each being reinforced twice, once on daylight turns 29/30 and again on 39/40. After a second night, a 3rd day with no reinforcements is played. If none of the 3 criteria for victory has been achieved by either player, both lose. Otherwise, a victor or a draw is determined.

http://anotherlevel.games/?page_id=193 (please wait for it to load - thanks! Said it's under construction!)

Note that terrain blocks movement and is completely variable. There are a handful of elements I put in each version of the scenario: a "city" of around 10 squares in the center of the board, a "mountain" in the northwest quadrant of the board, a "forest" in the south, a "ridge" running from NE to SE of the city's east edge, a light scattering of terrain to break up and clog up empty areas on the board, and a dozenish entry areas. Nothing need be fixed from game to game. How does even a great neural net do better than any human or team every single time? There are far too many possibilities for each game state, and truly gigantic numbers of game states, in my semi-skilled opinion.