Derek Nalls:
| The additional time I normally give to playtesting games to improve
| the move quality is partially wasted because I can only control the
| time per move instead of the number of plies completed using most
| chess variant programs.
Well, on Fairy-Max you won't have that problem, as it always finishes an
iteration once it decides to start it. But although Fairy-Max might be
stronger than most other variant-playing AIs you use, it is not stronger
than SMIRF, so using it for 10x8 CVs would still be a waste of time.
Joker80 tries to minimize the time wastage you point out by attempting
to start iterations only when it has time to finish them. It cannot always
guess the required time accurately, though, so unlike Fairy-Max it has
some emergency brakes built in. If they are triggered, you get an
incomplete iteration. Basically, the mechanism works by no longer starting
the search of new moves at the root, once it gets into 'overtime', if there
already is a move with a score similar to that of the previous iteration.
In practice, these unexpectedly long iterations mainly occur when the
previously best move runs into trouble that was so far just beyond the
horizon. As the tree for that move will then look completely different from
before, it takes a long time to search (no useful information in the hash),
and the score will take a huge drop. The engine then continues searching
new moves even in overtime, in a desperate attempt to find one that avoids
the disaster. Usually this is time well spent: even if there is no
guarantee that an early-aborted iteration finds the best move of that
iteration, it at least has found a move significantly better than the one
found in the previous iteration.
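The brake logic can be sketched in a few lines (a minimal illustration
under my own assumptions; `root_search`, `search_move`, and the score
`margin` are hypothetical names, not Joker80's actual code or interface):

```python
import time

def root_search(moves, prev_best_score, search_move, t_alloc, margin=0.25):
    """Hypothetical sketch of an 'emergency brake' at the root: once the
    allocated time t_alloc is exceeded, stop starting new root moves as
    soon as some already-searched move scores within `margin` of the
    previous iteration's best score."""
    start = time.time()
    best_move, best_score = None, float("-inf")
    for move in moves:
        overtime = time.time() - start > t_alloc
        if overtime and best_score > prev_best_score - margin:
            break  # good enough: accept the incomplete iteration
        score = search_move(move)  # full search of this root move
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score
```

If the previous best move collapses, no searched move comes near the old
score, so the condition stays false and the loop keeps searching new moves
even in overtime, as described above.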
Of course both Joker80 and Fairy-Max support the WinBoard 'sd' command,
allowing you to limit the depth to a certain number of plies, although I
never use that. I don't like to fix the ply depth, as it makes the engine
play like an idiot in the end-game.
| Can you explain to me in a way I can understand how and why
| you are able to successfully obtain valuable results using this
| method?
Well, to start with, Joker80 at 1 sec per move still reaches a depth of
8-9 ply in the middle-game, and would probably still beat most Humans at
that level. My experience is that, if I immediately see an obvious error,
it is usually because the engine makes a strategic mistake, not a tactical
one. And such strategic mistakes are awfully persistent, as they are a
result of faulty evaluation, not search: if it makes them at 8 ply, it is
very likely to make the same error at 20 ply, as even 20 ply is usually
not enough to bring the resolution of the strategic feature within the
horizon.
That being said, I really think that an important reason I can afford fast
games is a statistical one: by playing so many games I can be reasonably
sure that I get a representative number of gross errors in my sample, and
they more or less cancel each other out on average. Suppose at a
certain level of play 2% of the games contain a gross error that turns a
totally won position into a loss. If I play 10 games, there is a ~20%
chance that one game contains such an error (affecting my result by 10%),
and only a ~2% probability of two such errors (which would then cancel in
half the cases, but put the result off by 20% in the other half).
If, OTOH, I play 1000 faster games, with an increased 'blunder rate' of
5% because of the lower quality, I would expect 50 blunders. But the
probability that they were all made by the same side would be negligible.
In most cases the imbalance would be around sqrt(50) ~ 7, which would
impact the 1000-game result by only 0.7%. So virtually all results would
be off, but only by about 0.7%, so I don't care too much.
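The arithmetic behind both cases is plain binomial counting; this short
check (my own, using the numbers from the text) reproduces it:

```python
from math import comb, sqrt

def binom(n, k, p):
    """Probability of exactly k blunder-games among n independent games."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# 10 slow games, 2% gross-blunder rate per game:
print(binom(10, 1, 0.02))  # ~0.17: one blunder, shifting the result by 10%
print(binom(10, 2, 0.02))  # ~0.015: two blunders

# 1000 fast games at a 5% blunder rate: 50 expected blunders, but the
# typical imbalance between the two sides is only ~sqrt(50) ~ 7 blunders,
# i.e. a ~0.7% shift of the 1000-game score.
n, p = 1000, 0.05
print(n * p, sqrt(n * p), sqrt(n * p) / n)
```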
Another way of visualizing this would be to imagine the game state-space
as a 2-dimensional plane, with two evaluation terms determining the x- and
y-coordinates. Suppose these terms can both run from -5 to +5 (so the state
space is a square), and the game is won if we end inside the unit circle
(x^2 + y^2 < 1), but that we don't know that. Now suppose we want to know
how large the probability of winning is if we start within the square with
corners (0,0) and (1,1) (say this is the possible range of the evaluation
terms when we possess a certain combination of pieces). This should be the
area of a quarter circle, PI/4, divided by the area of the square (1), so
PI/4 ~ 79%.
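That 79% figure is easy to verify with a quick Monte Carlo sample (my own
check, mirroring the setup above):

```python
import random

random.seed(1)
n = 200_000
hits = 0
for _ in range(n):
    # random starting point in the square with corners (0,0) and (1,1)
    x, y = random.random(), random.random()
    if x * x + y * y < 1.0:  # inside the unit circle: a 'won' start
        hits += 1
print(hits / n)  # close to PI/4 ~ 0.79
```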
We try to determine this empirically by randomly picking points in the
square (by setting up the piece combination in some shuffled
configuration), and letting the engines play the game. The engines know
that getting closer to or farther away from (0,0) is associated with a
change in the game result, and are programmed to maximize or minimize this
distance to
the origin. If they both play perfectly, they should by definition succeed
in doing this. They don't care about the 'polar angle' of the game
state, so the point representing the game state will make a random walk on
a circle around the origin. When the game ends, it will still be in the
same region (inside or outside the unit circle), and games starting in the
won region will all be won.
Now with imperfect play, the engines will not conserve the distance to the
origin, but their tug of war will sometimes change it in favor of one or
the other (i.e. towards the origin, or away from it). If the engines are
still equally strong, by definition on the average this distance will not
change. But its probability distribution will now spread out over a ring
with finite width during the game. This might lead to won positions close
to the boundary (the unit circle) now ending up outside it, in the lost
region. But if the ring of final game states is narrow (width << 1), the
number of initial game states diffusing from inside the unit circle to the
outside will be comparable to the number diffusing the other way.
In other words, the game score as a function of the initial evaluation
terms is no longer an absolute all or nothing, but the circle is radially
smeared out a little, making a smooth transition from 100% to 0% in a
narrow band centered on the original circle.
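This radial smearing can be played out in a toy simulation (my own
construction, not from the comment: the distance of the game state to the
origin performs a random walk with a per-move spread `sigma`, and the game
counts as won iff the final distance is still below 1):

```python
import random

def win_prob(r0, sigma, moves=40, trials=2000, seed=7):
    """Fraction of games won when the radius starts at r0 and takes
    `moves` Gaussian steps of width sigma (toy model of play errors)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        r = r0
        for _ in range(moves):
            r += rng.gauss(0.0, sigma)  # one move's worth of play error
        if r < 1.0:
            wins += 1
    return wins / trials

# A narrow ring (small sigma) keeps the win/loss transition sharp near
# r = 1; a larger sigma smears it out over a wider band.
for r0 in (0.5, 0.9, 1.0, 1.1, 1.5):
    print(r0, win_prob(r0, 0.01), win_prob(r0, 0.05))
```

Starts well inside or well outside the circle keep their result; only the
band around r = 1 is affected, and its width grows with sigma.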
This will hardly affect the averaging, and in particular, making the ring
wider by decreasing playing accuracy will initially hardly have any
effect. Only when play gets so wildly inaccurate that the final positions
(where win/loss is determined) diverge so far from the initial point that
they could cross the entire circle will you start to see effects on the
score. In the extreme case where the radial diffusion is so fast that you
could end up anywhere in the 10x10 square when the game finishes, the
result score will only be PI/100 ~ 3%.
So it all depends on how much the imperfections in the play spread out the
initial positions in game-state space. If this spread is small compared
to the size of the won and lost areas, the result will be almost
independent of it.