Sac Chess. Game with 60 pieces. (10x10, Cells: 100)
💡📝Kevin Pacey wrote on Sat, Dec 19, 2015 05:35 AM UTC:
H.G. wrote:

"...One would expect the playtest to be only meaningful when the values used are consistent, i.e. the programmed value used for deciding on trades are the same as the value that comes out based on the score percentage. But to my surprise, putting a moderately wrong value there hardly had any effect on the outcome at all. If I put Q=9.5 and C=9, and play an army with Q against an army with C, the Queen wins by ~58%. If I put Q=9.5, and C=10, the Queen still wins by 58%! The explanation is that both engines share the misconception. So one of the two sides will always try to avoid the trade, meaning that Q for C trades will be relatively rare. So the test mainly measures how much damage Q and C do to the other pieces. Although a wrong C value might lead to wrong 2-for-1 or 3-for-1 trades, the number of occasions where such a trade can be forced is relatively rare, especially if they are not exactly equal, so that one of the players will try to avoid them. So the most error-prone value assignment is actually the one where the value is exactly the same as that of another piece, or the sum of two other pieces (and wrongly so). So I usually avoid that.
..."


Fwiw, the case uppermost in my mind was indeed the one where a piece's preliminary value is set exactly equal to that of another piece, and wrongly so. For example, if one incorrectly sets the value of a rook (or, I would opine for argument's sake, even a bishop) exactly equal to that of a knight, I'd imagine that in a number of playtest games the side with an extra rook would erroneously trade it for the opposing side's extra knight, say when the engine judges the position to be approximately equal in all other respects. If such games make up a significant share of the playtest, throwing away the rook-for-knight advantage this way could seriously drive up the percentage of draws (let alone losses) and substantially skew the results.
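To make the rook-for-knight case concrete, here is a minimal sketch of how a purely material-based trade check would see it. The value tables and the `trade_delta` helper are hypothetical, made up for illustration (the "usual" figures are just the conventional textbook values in pawns); this is not how any particular engine is implemented.

```python
# Hypothetical piece values in pawns, for illustration only.
# In WRONG_VALUES the rook is deliberately (and wrongly) set equal to the knight.
WRONG_VALUES = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 3.0}
USUAL_VALUES = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 5.0}


def trade_delta(values, given_up, gained):
    """Material change, as the engine sees it, when trading `given_up` for `gained`."""
    return sum(values[p] for p in gained) - sum(values[p] for p in given_up)


# With the wrong table the engine sees the trade as neutral and has no reason to avoid it.
print(trade_delta(WRONG_VALUES, ["R"], ["N"]))  # 0.0
# With the usual table the same trade shows up as a two-pawn loss.
print(trade_delta(USUAL_VALUES, ["R"], ["N"]))  # -2.0
```

Under these assumed numbers, an engine using the wrong table is indifferent to giving up the rook for the knight, which is the kind of erroneous trade I have in mind.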

Now consider the case where an Archbishop and a Queen are pitted against each other in playtesting (other material being equal at the start), and the preliminary value assigned to the Archbishop differs at least slightly from that of the Queen but is still at least slightly wrong. Here I am wondering about something similar to the above: whether the cumulative effect of all the incorrectly avoided trades (e.g. of Queen for Archbishop plus some number of pawns) might be to drive up the number of unnecessary draws (let alone losses), so that the Queen's final overall percentage score vs. the Archbishop does not favour it even approximately as much as it should.

In short, I wonder if the results of such playtest games might still be substantially skewed (setting aside the quality of the engines' play). At the risk of stating the obvious, even if the Queen's ratio of wins to losses comes out about right, an incorrect (say too high) percentage of drawn games still skews the overall score percentage used to gauge the Archbishop. For example, if in 20 games the Queen wins 8, loses 4 and draws 8, it scores 60% overall; if it instead wins 10, loses 5 and draws 5, the ratio of wins to losses is the same (2:1), but the overall score is better (62.5%). The actual distortion from faulty playtesting could conceivably be considerably greater, percentage-wise, than in these example figures.
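For anyone who wants to check the arithmetic, here is a small sketch that computes the overall score percentage with a draw counted as half a point. The function name `score_percentage` is just my own label for illustration.

```python
from fractions import Fraction


def score_percentage(wins, losses, draws):
    """Overall score as a percentage, counting a win as 1 point and a draw as 1/2."""
    games = wins + losses + draws
    # Work in exact fractions (points doubled to avoid halves), convert at the end.
    return float(Fraction(2 * wins + draws, 2 * games) * 100)


# Same 2:1 ratio of wins to losses, different number of draws.
print(score_percentage(8, 4, 8))   # 60.0
print(score_percentage(10, 5, 5))  # 62.5
```

Both lines have the Queen winning twice as often as it loses, yet the overall score differs by 2.5 percentage points purely because of the draw count.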