Derek Nalls:
| The additional time I normally give to playtesting games to improve
| the move quality is partially wasted because I can only control the
| time per move instead of the number of plies completed using most
| chess variant programs.
Well, on Fairy-Max you won't have that problem, as it always finishes an
iteration once it decides to start it. But although Fairy-Max might be
stronger than most other variant-playing AIs you use, it is not stronger
than SMIRF, so using it for 10x8 CVs would still be a waste of time.
Joker80 tries to minimize the time wastage you point out by attempting
to start iterations only when it has time to finish them. It cannot always
guess the required time accurately, though, so unlike Fairy-Max it has
some emergency brakes built in. If they are triggered, you get an
incomplete iteration. Basically, the mechanism works by no longer starting
the search of new moves at the root, once it gets into 'overtime', if there
already is a move with a score similar to that of the previous iteration.
In practice, these unexpectedly long iterations mainly occur when the
previously best move runs into trouble that was so far just beyond the
horizon. As the tree for that move will then look completely different from
before, it takes a long time to search (no useful information in the hash),
and the score will take a huge drop. The engine then continues searching
new moves even in overtime, in a desperate attempt to find one that avoids
the disaster. Usually this is time well spent: even if there is no
guarantee that an early-aborted iteration finds the best move of that
iteration, it at least has found a move significantly better than the one
found in the previous iteration.
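The brake logic can be sketched in a few lines (a minimal illustration
under my own assumptions; `root_search`, `search_move`, and the score
`margin` are hypothetical names, not Joker80's actual code or interface):

```python
import time

def root_search(moves, prev_best_score, search_move, t_alloc, margin=0.25):
    """Hypothetical sketch of an 'emergency brake' at the root: once the
    allocated time t_alloc is exceeded, stop starting new root moves as
    soon as some already-searched move scores within `margin` of the
    previous iteration's best score."""
    start = time.time()
    best_move, best_score = None, float("-inf")
    for move in moves:
        overtime = time.time() - start > t_alloc
        if overtime and best_score > prev_best_score - margin:
            break  # good enough: accept the incomplete iteration
        score = search_move(move)  # full search of this root move
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score
```

If the previous best move collapses, no searched move comes near the old
score, so the condition stays false and the loop keeps searching new moves
even in overtime, as described above.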
Of course both Joker80 and Fairy-Max support the WinBoard 'sd' command,
allowing you to limit the depth to a certain number of plies, although I
never use that. I don't like to fix the ply depth, as it makes the engine
play like an idiot in the end-game.
| Can you explain to me in a way I can understand how and why
| you are able to successfully obtain valuable results using this
| method?
Well, to start with, Joker80 at 1 sec per move still reaches a depth of
8-9 ply in the middle-game, and would probably still beat most Humans at
that level. My experience is that, if I immediately see an obvious error,
it is usually because the engine makes a strategic mistake, not a tactical
one. And such strategic mistakes are awfully persistent, as they are a
result of faulty evaluation, not search: if it makes them at 8 ply, it is
very likely to make the same error at 20 ply, as even 20 ply is usually
not enough to bring the resolution of the strategic feature within the
horizon.
That being said, I really think that an important reason I can afford fast
games is a statistical one: by playing so many games I can be reasonably
sure that I get a representative number of gross errors in my sample, and
they more or less cancel each other out on average. Suppose at a
certain level of play 2% of the games contain a gross error that turns a
totally won position into a loss. If I play 10 games, there is a ~20%
chance that one game contains such an error (affecting my result by 10%),
and only a ~2% probability of two such errors (which would then cancel in
half the cases, but put the result off by 20% in the other half).
If, OTOH, I play 1000 faster games, with an increased 'blunder rate' of
5% because of the lower quality, I would expect 50 blunders. But the
probability that they were all made by the same side would be negligible.
In most cases the imbalance would be around sqrt(50) ~ 7, which would
impact the 1000-game result by only 0.7%. So virtually all results would
be off, but only by about 0.7%, so I don't care too much.
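The arithmetic behind both cases is plain binomial counting; this short
check (my own, using the numbers from the text) reproduces it:

```python
from math import comb, sqrt

def binom(n, k, p):
    """Probability of exactly k blunder-games among n independent games."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# 10 slow games, 2% gross-blunder rate per game:
print(binom(10, 1, 0.02))  # ~0.17: one blunder, shifting the result by 10%
print(binom(10, 2, 0.02))  # ~0.015: two blunders

# 1000 fast games at a 5% blunder rate: 50 expected blunders, but the
# typical imbalance between the two sides is only ~sqrt(50) ~ 7 blunders,
# i.e. a ~0.7% shift of the 1000-game score.
n, p = 1000, 0.05
print(n * p, sqrt(n * p), sqrt(n * p) / n)
```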
Another way of visualizing this would be to imagine the game state-space
as a 2-dimensional plane, with two evaluation terms determining the x- and
y-coordinates. Suppose these terms can both run from -5 to +5 (so the state
space is a square), and the game is won if we end inside the unit circle
(x^2 + y^2 < 1), but that we don't know that. Now suppose we want to know
how large the probability of winning is if we start within the square with
corners (0,0) and (1,1) (say this is the possible range of the evaluation
terms when we possess a certain combination of pieces). This should be the
area of a quarter circle, PI/4, divided by the area of the square (1), so
PI/4 ~ 79%.
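That 79% figure is easy to verify with a quick Monte Carlo sample (my own
check, mirroring the setup above):

```python
import random

random.seed(1)
n = 200_000
hits = 0
for _ in range(n):
    # random starting point in the square with corners (0,0) and (1,1)
    x, y = random.random(), random.random()
    if x * x + y * y < 1.0:  # inside the unit circle: a 'won' start
        hits += 1
print(hits / n)  # close to PI/4 ~ 0.79
```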
We try to determine this empirically by randomly picking points in the
square (by setting up the piece combination in some shuffled
configuration), and letting the engines play the game. The engines know
that getting closer to or farther away from (0,0) is associated with a
change in the game result, and are programmed to maximize or minimize this
distance to
the origin. If they both play perfectly, they should by definition succeed
in doing this. They don't care about the 'polar angle' of the game
state, so the point representing the game state will make a random walk on
a circle around the origin. When the game ends, it will still be in the
same region (inside or outside the unit circle), and games starting in the
won region will all be won.
Now with imperfect play, the engines will not conserve the distance to the
origin, but their tug of war will sometimes change it in favor of one or
the other (i.e. towards the origin, or away from it). If the engines are
still equally strong, by definition on the average this distance will not
change. But its probability distribution will now spread out over a ring
with finite width during the game. This might lead to won positions close
to the boundary (the unit circle) now ending up outside it, in the lost
region. But if the ring of final game states is narrow (width << 1), the
number of initial game states diffusing from inside the unit circle to the
outside will be comparable to the number diffusing the other way.
In other words, the game score as a function of the initial evaluation
terms is no longer an absolute all or nothing, but the circle is radially
smeared out a little, making a smooth transition from 100% to 0% in a
narrow band centered on the original circle.
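This radial smearing can be played out in a toy simulation (my own
construction, not from the comment: the distance of the game state to the
origin performs a random walk with a per-move spread `sigma`, and the game
counts as won iff the final distance is still below 1):

```python
import random

def win_prob(r0, sigma, moves=40, trials=2000, seed=7):
    """Fraction of games won when the radius starts at r0 and takes
    `moves` Gaussian steps of width sigma (toy model of play errors)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        r = r0
        for _ in range(moves):
            r += rng.gauss(0.0, sigma)  # one move's worth of play error
        if r < 1.0:
            wins += 1
    return wins / trials

# A narrow ring (small sigma) keeps the win/loss transition sharp near
# r = 1; a larger sigma smears it out over a wider band.
for r0 in (0.5, 0.9, 1.0, 1.1, 1.5):
    print(r0, win_prob(r0, 0.01), win_prob(r0, 0.05))
```

Starts well inside or well outside the circle keep their result; only the
band around r = 1 is affected, and its width grows with sigma.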
This will hardly affect the averaging, and in particular, making the ring
wider by decreasing playing accuracy will initially hardly have any
effect. Only when play gets so wildly inaccurate that the final positions
(where win/loss is determined) diverge so far from the initial point that
they could cross the entire circle will you start to see effects on the
score. In the extreme case where the radial diffusion is so fast that you
could end up anywhere in the 10x10 square when the game finishes, the
result score will only be PI/100 ~ 3%.
So it all depends on how much the imperfections in the play spread out the
initial positions in game-state space. If this spread is small compared
to the size of the won and lost areas, the result will be almost
independent of it.