Comments by DerekNalls

Piece Values
Derek Nalls wrote on Mon, May 19, 2008 10:28 PM UTC:
Muller:

I would like to conduct two focused playtests using Joker80 at very long
time controls (e.g., 30 minutes per move) to investigate these important questions-

1.  Is Muller's rook value within the CRC set too low?
2.  Is Scharnagl's archbishop value within the CRC set too low?

I would need for you to compile special versions of Joker80 for me using
significantly different values for those CRC pieces as well as
Scharnagl's CRC piece set.  To isolate the target variable, these games would be Muller (standard values) vs. Muller (test values) and Scharnagl (standard values) vs. Scharnagl (test values) via symmetrical playtesting.  Anyway, we can discuss the details if you are interested or willing.  Please let me know.

Derek Nalls wrote on Tue, May 20, 2008 01:13 AM UTC:
Muller:

Please investigate this potentially serious bug I may have discovered
while testing Joker80 under Winboard F ...

Bugs, Bugs, Bugs!
http://www.symmetryperfect.com/pass

I am having a hard time with software today.

Derek Nalls wrote on Tue, May 20, 2008 07:16 AM UTC:
'Human vs. engine play is virtually untested. 
Did you at any point of the game use 'undo'
(through the WinBoard 'retract move')?'

Yes.
Many of us error-prone humans use it frequently.
________________________________________________

'This is indeed something I should fix but
the current work-around would be not to use 'undo'.'

Makes sense to me.
I can avoid using the 'retract move' command altogether.
________________________________________________________

'I could make a Joker80 version that reads the piece base values from a
file 'joker.ini' at startup. Then you could change them to anything you
want to test, without the need to re-compile. Would that satisfy your
needs?'

Yes, better than I ever imagined.
Thank you!

Derek Nalls wrote on Tue, May 20, 2008 04:48 PM UTC:
Everything is working fine.
Thank you!

I now have 12 instances of the Joker80 program running in various
sub-directories of Winboard F with the 'winboard.ini' file set to
conveniently initiate any desired standard or special material values for
the CRC models by Muller, Scharnagl and Nalls.

In the first test, I am going to attempt to find a playtesting time where
a distinct separation in playing strength occurs between the standard
Muller model wherein the rook is 1 pawn more valuable than the bishop and
a special Muller model wherein the rook is 2 pawns more valuable than the
bishop.  If I successfully find a playtesting time that is survivable by
humans, then we can hopefully establish a tentative probability as to
which CRC model plays decisively better after a few-several games.

At par 100 (for the pawn), the bishop is at 459 under both models with the
rook at 559 under the standard Muller model and 659 under the special
Muller model.

I want to playtest a special Muller model with a rook value 2.00 pawns higher than the bishop because the Nalls model has a rook value 2.19 pawns higher than the bishop and the Scharnagl model has a rook value 1.94 pawns higher than the bishop (for an average of 2.06 pawns).

Since I am attempting to test for such a small difference in the material value of only one type of piece (the rook), I have doubts that I will be able to obtain conclusive results.  In any case ... If I obtain conclusive results, then very long time controls will surely be required to produce them.

Derek Nalls wrote on Tue, May 20, 2008 09:05 PM UTC:
Of course, I would bet anything that there are no 1:1 exchanges supported
under the standard Muller CRC model that could cause material losses.  If
that were the case, yours would not be one of the three most credible CRC
models under close consideration.  In fact, even your excellent Joker80
program would play poorly if stuck with using faulty CRC piece values.

Obviously, the longer the exchange, the rarer its occurrence during
gameplay.  The predominance of simple 1:1 exchanges over even the least
complicated, 1:2 or 2:1 exchanges, in gameplay is large although I do not
know the stats.

In fact, there is a certain 1:2 or 2:1 exchange I am hoping to see that is
likely to support my contention that the Muller rook value should be
higher: the 1 queen for 2 rooks or 2 rooks for 1 queen exchange.  Please
recall that under the standard Muller model, this is an equal exchange. 
However, under asymmetrical playtesting of comparable quality to and
similar to that I used to confirm the correctness of your higher
archbishop value, I played numerous CRC games at various moderate time
controls where the player without 1 queen (yet with 2 rooks) defeated the
player without 2 rooks (yet with 1 queen).  Ultimately, a key mechanism to conclusive results is that while the standard Muller model is neutral toward a 2 rook : 1 queen or 1 queen : 2 rook exchange, the special Muller model regards its 1 queen as significantly less valuable than 2 rooks of its opponent.  Consequently, this contrast in valuation could be played into ... and we would see who wins.

I am actually pleased that you are a realist who shares my pessimism in
this experiment.  In any case, low odds do not deter a best effort to
succeed.  The main difference between us is that you calculate your
pessimism by extreme statistical methods whereas I calculate my pessimism
by moderate probabilistic methods.  I remain hopeful that eventually I
will prove to you that the method Scharnagl & I developed is occasionally
productive.

Derek Nalls wrote on Tue, May 20, 2008 09:17 PM UTC:
Muller:

Please confirm that these are legal values for the 'winboard.ini' file.

/firstChessProgramNames={'C:\winboard-F\Joker80\w\M-st\w-M-st 22
P100=353=459=559=1029=1059=1118'
'C:\winboard-F\Joker80\w\M-sp\w-M-sp 22
P100=353=459=659=1029=1059=1118'
'C:\winboard-F\Joker80\w\S-st\w-S-st 22
P100=306=363=557=702=912=960'
'C:\winboard-F\Joker80\w\S-sp\w-S-sp 22
P100=306=363=557=866=912=960'
'C:\winboard-F\Joker80\w\N-st\w-N-st 22
P100=308=376=594=940=958=1031'
'C:\winboard-F\Joker80\w\N-sp\w-N-sp 22
P100=308=376=594=940=958=1031'
'C:\winboard-F\TJchess\TJChess10x8'
}
/secondChessProgramNames={'C:\winboard-F\Joker80\b\M-st\b-M-st 22
P100=353=459=559=1029=1059=1118'
'C:\winboard-F\Joker80\b\M-sp\b-M-sp 22
P100=353=459=659=1029=1059=1118'
'C:\winboard-F\Joker80\b\S-st\b-S-st 22
P100=306=363=557=702=912=960'
'C:\winboard-F\Joker80\b\S-sp\b-S-sp 22
P100=306=363=557=866=912=960'
'C:\winboard-F\Joker80\b\N-st\b-N-st 22
P100=308=376=594=940=958=1031'
'C:\winboard-F\Joker80\b\N-sp\b-N-sp 22
P100=308=376=594=940=958=1031'
'C:\winboard-F\TJchess\TJChess10x8'
}
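
For anyone who wants to read these lines programmatically, here is a minimal Python sketch that splits one value argument into named pieces. The piece order (pawn, knight, bishop, rook, archbishop, chancellor, queen) is an assumption inferred from the values quoted earlier in this thread (bishop 459, rook 559). The names `PIECE_ORDER` and `parse_values` are hypothetical and are not part of Joker80 or Winboard.

```python
# Assumed piece order, inferred from the values quoted in this thread.
PIECE_ORDER = ("pawn", "knight", "bishop", "rook",
               "archbishop", "chancellor", "queen")


def parse_values(arg):
    """Split a string such as 'P100=353=...' into a {piece: value} dict."""
    if not arg.startswith("P"):
        raise ValueError("expected the argument to start with 'P'")
    numbers = [int(field) for field in arg[1:].split("=")]
    if len(numbers) != len(PIECE_ORDER):
        raise ValueError("expected %d values" % len(PIECE_ORDER))
    return dict(zip(PIECE_ORDER, numbers))


standard = parse_values("P100=353=459=559=1029=1059=1118")
special = parse_values("P100=353=459=659=1029=1059=1118")

# Rook-minus-bishop gap, in pawns, under each version of the Muller model.
for name, model in (("standard", standard), ("special", special)):
    gap = (model["rook"] - model["bishop"]) / model["pawn"]
    print(name, gap)
```

Run against the two Muller lines, this shows the standard and special versions differing only in the rook-bishop gap.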

Derek Nalls wrote on Wed, May 21, 2008 04:53 PM UTC:
As I moved to renormalize all of the values used in Joker80 (written into
the 'winboard.ini' file) with the pawn at a par of 85 points, I looked
at my notes again.  They reminded me that your use of the 'bishop pair'
refinement (with a bonus of 40 points) implies that the material value of
the rook is either 1.00 pawns or 1.47 pawns greater than the material value
of the bishop in CRC, depending upon whether only one bishop or both
bishops, respectively, remain in the game.  At that point, I realized that
I would be attempting to playtest for a discrepancy that I know from
experience is just too small to detect even at very long time controls. 
So, this planned test has been cancelled.
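
The two figures can be checked with a short calculation. This sketch assumes the Muller values at pawn = 85 that appear in the 'winboard.ini' lines later in this thread (rook 475, bishop 350 before the 40-point pair bonus). The variable names are illustrative only.

```python
PAWN = 85
ROOK = 475
BISHOP_BASE = 350   # bishop value without the pair bonus (assumed)
PAIR_BONUS = 40

# Rook-minus-bishop gap in pawns, with the bishop counted without
# and with the pair bonus.
gap_without_bonus = (ROOK - BISHOP_BASE) / PAWN
gap_with_bonus = (ROOK - (BISHOP_BASE + PAIR_BONUS)) / PAWN

print(round(gap_without_bonus, 2))  # 1.47
print(round(gap_with_bonus, 2))     # 1.0
```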

I am not implying that this matter is unimportant, though.  I remain
concerned for the standard Muller model whenever it allows the exchange of
its 2 rooks for 1 queen belonging to its opponent.

Derek Nalls wrote on Wed, May 21, 2008 07:02 PM UTC:
Muller:

Please have another look at this excerpt from my 'winboard.ini' file. 
There are standard and special versions of piece values by Muller,
Scharnagl & Nalls for the white and black players renormalized to pawn =
85 points.

The special version of the Muller model has a rook value exactly 85 points
or 1.00 pawn higher than the standard version.

The special version of the Scharnagl model has an archbishop value (736
points) at appr. 95% of the archbishop value (775 points) instead of 597
points at appr. 77% for the standard version.

The special version of the Nalls model is identical to the standard
version until some test is needed and planned.

Since I assume that the 'bishop pairs bonus' is hardwired into Joker80,
40 points has been subtracted from the model-independent, material values
of the bishop under all three models.  Is this correct?
_____________________________________________________

/firstChessProgramNames={'C:\winboard-F\Joker80\w\M-st\w-M-st 22
P85=300=350=475=875=900=950'
'C:\winboard-F\Joker80\w\M-sp\w-M-sp 22
P85=300=350=560=875=900=950'
'C:\winboard-F\Joker80\w\S-st\w-S-st 22
P85=260=269=474=597=775=816'
'C:\winboard-F\Joker80\w\S-sp\w-S-sp 22
P85=260=269=474=736=775=816'
'C:\winboard-F\Joker80\w\N-st\w-N-st 22
P85=262=279=505=799=815=876'
'C:\winboard-F\Joker80\w\N-sp\w-N-sp 22
P85=262=279=505=799=815=876'
'C:\winboard-F\TJchess\TJChess10x8'
}
/secondChessProgramNames={'C:\winboard-F\Joker80\b\M-st\b-M-st 22
P85=300=350=475=875=900=950'
'C:\winboard-F\Joker80\b\M-sp\b-M-sp 22
P85=300=350=560=875=900=950'
'C:\winboard-F\Joker80\b\S-st\b-S-st 22
P85=260=269=474=597=775=816'
'C:\winboard-F\Joker80\b\S-sp\b-S-sp 22
P85=260=269=474=736=775=816'
'C:\winboard-F\Joker80\b\N-st\b-N-st 22
P85=262=279=505=799=815=876'
'C:\winboard-F\Joker80\b\N-sp\b-N-sp 22
P85=262=279=505=799=815=876'
'C:\winboard-F\TJchess\TJChess10x8'
}
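
The pawn = 85 line for the standard Muller model can be reproduced from the earlier pawn = 100 line by scaling every value by 0.85 and then subtracting the 40-point bonus from the bishop. Below is a minimal sketch, assuming simple rounding to the nearest point; the function name and its parameters are hypothetical. The same recipe lands within a point of the Scharnagl and Nalls lines, which were presumably derived from unrounded figures.

```python
def renormalize(values_p100, new_pawn=85, bishop_index=2, pair_bonus=40):
    """Rescale pawn = 100 values to a new pawn value, then remove the
    bishop-pair bonus from the bishop entry."""
    scaled = [round(v * new_pawn / 100) for v in values_p100]
    scaled[bishop_index] -= pair_bonus
    return scaled


# Muller standard values at pawn = 100, as listed earlier in this thread.
muller_p100 = [100, 353, 459, 559, 1029, 1059, 1118]
print(renormalize(muller_p100))
```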

Derek Nalls wrote on Thu, May 22, 2008 12:13 AM UTC:
'If I were you, I would normalize all models to Q=950 but then replace
the pawn value everywhere by 85.'

Since this is what you (the developer of Joker80) recommend as optimum, 
this is what I will do.

Are you sure that replacing any pawn values different than 85 points
after renormalization to queen = 950 points still renders an accurate 
and complete representation, more or less, of the Scharnagl and Nalls 
models?

At par of queen = 950 points, the pawn value in the Nalls model
is not represented as being only 92.19% as high as that in the Muller 
model and the pawn value in the Scharnagl model is not represented
as being only 98.95% as high as that in the Muller model.

Thru it all ... If a perfect representation is not quite possible, 
I can accept that without reservation.
__________________________________

'I don't think you could say then that you deviate from the
model as the models do not really specify which type of Pawn they use as
a standard.'

Correctly calculating pawn values at the start of the game (much less, 
throughout the game) requires finesse as it is indeed a complex issue.
In fact, its excessive complexity is the reason my 66-page paper on
material values of pieces is silent in the case of calculating pawn values
in FRC & CRC.  Instead, someone needs to read an entire book from an 
outside source about calculating the material values of the pieces in 
Chess to sufficiently understand it.

Personally, I am content with the test situation as long as Joker80 
handles all pawns under all three models initially valued at 85 points
as fairly and equally as realistically possible.

I cannot speak for Reinhard Scharnagl at all, though.
________________________________________________

'The way you did it now would make the first Bishop to be traded of the 
value the model prescribes, but would make the second much lighter. 
If you would subtract half the bonus, then on the average they would 
be what the model prescribes.'

Now, I understand better.
It makes sense.
[I am glad I asked you.]

Yes, I will subtract 20 points (1/2 of the 'bishop pair bonus') from the
model-independent, material values for the bishop under the 
Scharnagl & Nalls models.
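
The reasoning can be made concrete with a small calculation. This sketch assumes the engine scores each bishop at a base value and adds the pair bonus only while both bishops are on the board, which is how the quoted explanation reads. The model value of 359 is illustrative (the 339 used for the Scharnagl bishop in the next revision, plus 20), and the function name is hypothetical.

```python
PAIR_BONUS = 40


def trade_losses(base, bonus=PAIR_BONUS):
    """Material given up when the first and then the second bishop is traded."""
    first = base + bonus   # the pair bonus disappears with the first bishop
    second = base          # the lone remaining bishop is worth only its base
    return first, second


model_value = 359  # illustrative bishop value prescribed by a model

# Compare subtracting the full bonus with subtracting half of it.
for subtracted in (PAIR_BONUS, PAIR_BONUS // 2):
    first, second = trade_losses(model_value - subtracted)
    print(subtracted, first, second, (first + second) / 2)
```

Subtracting the full 40 points makes the first trade match the model and the second fall 40 points short; subtracting 20 makes the average of the two trades match the model.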

Derek Nalls wrote on Thu, May 22, 2008 12:33 AM UTC:
Muller:

Here is my latest revision to my 'winboard.ini' file.
Are these piece values acceptable to you?
Do you think these piece values will work smoothly with Joker80 running
under Winboard F yet remain true to all three models?
______________________________________________________

/firstChessProgramNames={'C:\winboard-F\Joker80\w\M-st\w-M-st 22
P85=300=350=475=875=900=950'
'C:\winboard-F\Joker80\w\M-sp\w-M-sp 22
P85=300=350=560=875=900=950'
'C:\winboard-F\Joker80\w\S-st\w-S-st 22
P85=302=339=551=694=902=950'
'C:\winboard-F\Joker80\w\S-sp\w-S-sp 22
P85=302=339=551=857=902=950'
'C:\winboard-F\Joker80\w\N-st\w-N-st 22
P85=284=326=548=866=884=950'
'C:\winboard-F\Joker80\w\N-sp\w-N-sp 22
P85=284=326=548=866=884=950'
'C:\winboard-F\TJchess\TJChess10x8'
}
/secondChessProgramNames={'C:\winboard-F\Joker80\b\M-st\b-M-st 22
P85=300=350=475=875=900=950'
'C:\winboard-F\Joker80\b\M-sp\b-M-sp 22
P85=300=350=560=875=900=950'
'C:\winboard-F\Joker80\b\S-st\b-S-st 22
P85=302=339=551=694=902=950'
'C:\winboard-F\Joker80\b\S-sp\b-S-sp 22
P85=302=339=551=857=902=950'
'C:\winboard-F\Joker80\b\N-st\b-N-st 22
P85=284=326=548=866=884=950'
'C:\winboard-F\Joker80\b\N-sp\b-N-sp 22
P85=284=326=548=866=884=950'
'C:\winboard-F\TJchess\TJChess10x8'
}

Derek Nalls wrote on Fri, May 23, 2008 12:47 AM UTC:
Originally, I planned two 'internal playtests'.  [By this self-invented
term I mean playtests of the standard model of a person against a special
model that I have compelling reasons to think may be superior by a
provable margin.]

The first planned test involves the standard CRC model of Muller against a
special CRC model with a higher, closer-to-conventional rook value.  Upon
closer examination, I suspected that the discrepancy was possibly too
small to be detected even with very long time controls.  So, I announced
that this test was cancelled.

Notwithstanding, I may change my mind and return to this unsolved mystery
if Joker80 demonstrates unusually-high aptitude as a playtesting tool. 
This might require very deep runs of moves with a completion time of a few
weeks to a few months per pair of games to achieve conclusive results.

The second planned test involves the standard CRC model of Scharnagl
against a special CRC model with a higher, unconventional archbishop
value.

Scharnagl currently assigns the archbishop a material value of appr.
77% that of the chancellor in his standard CRC model.

Muller currently assigns the archbishop a material value of greater
than 97% that of the chancellor in his standard CRC model.

Nalls currently assigns the archbishop a material value of less
than 98% that of the chancellor in his standard CRC model.

I devised a special CRC model using identical material values for every
piece in the standard CRC model by Scharnagl except that it assigns the
archbishop a material value of exactly 95% that of the chancellor
(18% or 1.65 pawns higher).  [Note that this figure is slightly more
moderate than those by Muller & Nalls.]  A discrepancy this large should
be detectable at short-moderate time controls.  This test is now
underway.
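
These percentages can be checked against the archbishop and chancellor values in the latest 'winboard.ini' revision above. A minimal sketch follows; the dictionary layout and names are illustrative, and the position of each piece within a line is inferred from this thread rather than documented.

```python
# (archbishop, chancellor) values taken from the queen = 950 lines above.
MODELS = {
    "Scharnagl standard": (694, 902),
    "Scharnagl special": (857, 902),
    "Muller standard": (875, 900),
    "Nalls standard": (866, 884),
}

# Archbishop value as a fraction of the chancellor value.
ratios = {name: a / c for name, (a, c) in MODELS.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%}")
```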

If either of these tests is successful at establishing or implicating a
probability that the special models play stronger than the standard
models, then revisions to the standard models may occur.  At that
juncture, we would be ready to begin 'external playtests'.  [By this
self-invented term I mean playtests of the standard models of different
persons against one another.]

Derek Nalls wrote on Fri, May 23, 2008 09:38 PM UTC:
I have recently been sufficiently convinced via asymmetrical playtesting
(still underway) that the 2 rooks : 1 queen advantage in material values
is appr. the same in CRC as in FRC.  [I used to think it was higher in
CRC.] Consequently, I revised my model (again) and my CRC piece values:

universal calculation of piece values
http://www.symmetryperfect.com/shots/calc.pdf

CRC
material values of pieces
http://www.symmetryperfect.com/shots/values-capa.pdf

FRC
material values of pieces
http://www.symmetryperfect.com/shots/values-chess.pdf

This change was implemented by raising the value of the queen in CRC- not
by lowering the value of the rook.

revised Joker80 values
Nalls standard CRC model
P85=268=307=518=818=835=950

Derek Nalls wrote on Fri, May 23, 2008 10:22 PM UTC:
'If the result would be different from playing at a more 'normal' TC,
like one or two hours per game, it would only mean that any conclusions 
you draw on them would be irrelevant for playing Chess at normal TC.'

Conclusions drawn from playing at normal time controls are irrelevant
compared to extremely-long time controls.  It is desirable to see what
secrets can be discovered from a rarely viewed vantage of extremely
well-played games.  Are you not at all interested in analyzing, move by move,
games played better than almost any pair of human players could manage?

You do not seem to understand that I, too, am discontent with the
probability of a small number of wins or losses in a row.  This is a
compensation that reduces the chance that the games were randomly
played to the greatest extent attainable and consequently, the winner 
or loser randomly determined.
_____________________________

'... playing 2 games will be like flipping a coin.'

Correction-

Playing 1 game will be like flipping a coin ... once.
Playing 2 games will be like flipping a coin ... twice.

The chance of getting the same flip (heads or tails) twice-in-a-row is
1/4.  Not impressive but a decent beginning.  Add a couple or a few or several consecutive same flips and it departs 'luck' by a huge margin.
_______________________________________________________________

'The result, whatever it is, will not prove anything, as it would be
different if you would repeat the test. Experiments that do not give a
fixed outcome will tell you nothing, unless you conduct enough of them to
get a good impression on the probability for each outcome to occur.'

I have wondered why the performance of computer chess programs is
unpredictable and varied even under identical controls.  Despite their
extraordinary complexity, I think of computer hardware, operating systems
and applications (such as Joker80) as deterministic.

The details of the differences in outcomes do not concern me.  In fact,
to the extent that your remarks are true, they will support my case if my
playtesting is successful that the unlikelihood of achieving the same
outcome (i.e., wins or losses for one player) is extreme.

I am pleased to report that I estimate it will be possible, over time, to
generate enough experiments using Joker80 to have meaning for a
high-quality, low-quantity advocate (such as myself) and even a
moderate-quality, moderate-quantity advocate (such as Scharnagl).  As for
a low-quality, high-quantity advocate (such as you), you will always be
disappointed as you are impossible to please.

Derek Nalls wrote on Sat, May 24, 2008 01:39 PM UTC:
'Actually the chance for twice the same flip in a row is 1/2.'
______________________________________________________

Really?
You obviously need a lesson on probability.
Let us start with elementary stuff.

Mathematical Ideas
fifth edition
Miller & Heeren
1986

It is an old college textbook from a class I took in the mid-90's.
[Yes, I passed the class.]
______________________

It says interesting things such as-

'The relative frequency with which an outcome happens 
represents its probability.'

'In probability, each repetition of an experiment is a trial.
The possible results of each trial are outcomes.'
____________________________________________

An example of a probability experiment is 'tossing a coin'.
Each 'toss' (trial of the experiment) has only two equally-possible 
outcomes, 'heads' or 'tails' ... assuming the condition that the 
coin is fair (i.e., not loaded).

probability = p
heads = h
tails = t
number of tosses = x
addition = +
involution = ^

[This is a substitute upon a single line for superscript representation 
of an exponent to the upper right of a base.]

probability of heads = p(h)
probability of tails = p(t)

p(h) is a base.
p(t) is a base.

x is an exponent.

p(h) = 0.5
p(t) = 0.5
_________________

What follows are examples of the chances of getting the same result
upon EVERY consecutive toss.

1 time
x = 1

p(h) ^ x = 0.5 ^ 1 = 0.5
p(t) ^ x = 0.5 ^ 1 = 0.5

Note:  In this case only ...
p(h) + p(t) = 1.0

2 times
x = 2

p(h) ^ x = 0.5 ^ 2 = 0.25
p(t) ^ x = 0.5 ^ 2 = 0.25

3 times
x = 3

p(h) ^ x = 0.5 ^ 3 = 0.125
p(t) ^ x = 0.5 ^ 3 = 0.125

Etc ...
______________________

By a function that is the inverse of successive exponents of base 2,
the chance for consecutive tosses to yield the same result rapidly
becomes extremely small.
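
The pattern in the figures above can be tabulated for longer runs. Here is a minimal sketch for a fair coin; `streak_probability` is a hypothetical helper name.

```python
def streak_probability(tosses, p=0.5):
    """Chance that one named outcome (say, heads) occurs on every toss."""
    return p ** tosses


# Tabulate runs of 1 to 10 tosses.
for x in range(1, 11):
    print(x, streak_probability(x))
```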

When this occurs, there are only two possibilities- 'random good-bad
luck' or an unfair advantage-disadvantage exists (i.e., 'the coin is loaded').  The sum of these two possibilities always equals 1.

random luck (good or bad) = l
unfair (advantage or disadvantage) = u

luck (heads) = l(h)
luck (tails) = l(t)

unfair (heads) = u(h)
unfair (tails) = u(t)

p(h) ^ x = l(h)
p(t) ^ x = l(t)

l(h) + u(h) = 1
l(t) + u(t) = 1

Therefore, as the chances of 'random good-bad luck' become extremely low in the example, the chances of an advantage-disadvantage existing for 'one side of the coin' or (if you follow the analogy) 'one side of the gameboard' or 'one player' or 'one set of piece values' become likewise extremely high.

Only if it can be proven that an advantage-disadvantage does not exist for one player can it be accepted that the extremely unlikely event by
'random good-bad luck' is indeed the case.

It is essential to understand that random good luck or random bad luck
cannot be consistently relied upon.  From this fact alone, firm
conclusions can be responsibly drawn with a strong probability of
correctness.
____________________________________________________________

1 time
x = 1

p(h) ^ x = 0.5
u(h) = 0.5

p(t) ^ x = 0.5
u(t) = 0.5

2 times
x = 2

p(h) ^ x = 0.25
u(h) = 0.75

p(t) ^ x = 0.25
u(t) = 0.75

3 times
x = 3

p(h) ^ x = 0.125
u(h) = 0.875

p(t) ^ x = 0.125
u(t) = 0.875

Etc ...

Derek Nalls wrote on Sat, May 24, 2008 02:16 PM UTC:
'... in Joker the source of indeterminism is much less subtle: it is
programmed explicitly.'

This renders Joker80 totally unsuitable for my playtesting purposes.  [I
am just relieved that you told me this bizarre fact now before I invested
large amounts of computer time and effort.]

It is critically important that any AI program attempt (to its greatest
capability) to pinpoint the single, very best possible move in the time allowed upon every move in the game even if this means that it would
often-sometimes repeat an identical move from an identical position.

Do you not realize that forcing Joker80 to do otherwise must reduce its
playing strength significantly from its maximum potential?

Derek Nalls wrote on Sun, May 25, 2008 02:14 PM UTC:
Well, when you said ...

'Actually the chance for twice the same flip in a row is 1/2.'

... that was vague and misleading.

I thought you meant 'heads' twice OR 'tails' twice equals a chance of
1/2 instead of the sum of 'heads' twice AND 'tails' twice equals a chance
of 1/2.
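
Both readings can be checked by listing the four equally likely outcomes of two tosses of a fair coin. A minimal sketch using exact fractions (the helper name `probability` is illustrative):

```python
from fractions import Fraction
from itertools import product

# All outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))


def probability(event):
    """Fraction of the equally likely outcomes for which `event` holds."""
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


p_heads_twice = probability(lambda o: o == ("H", "H"))
p_tails_twice = probability(lambda o: o == ("T", "T"))
p_same_twice = probability(lambda o: o[0] == o[1])

print(p_heads_twice, p_tails_twice, p_same_twice)  # 1/4 1/4 1/2
```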

Since English is a second language to you, of course I will overlook this
minor mis-communication and even apologize for implicitly accusing you 
of incompetence.  However, you should expect that you will draw critical 
reactions from others when you have previously, falsely, explicitly
accused them of incompetence in a subject matter.

Derek Nalls wrote on Sun, May 25, 2008 10:03 PM UTC:
The reason you have never been able to find any correlation between winning
probabilities for one army and time controls [contrary to the experiences
of people using other AI programs] in asymmetrical playtests using Joker80
is that you have destructively randomized the algorithm within your program
to such an extent that it fails to measurably improve the quality of its
moves as a function of time or plies completed.  A program with serious
problems of this nature may do well in speed chess but at truly long time
controls against quality programs that improve as they should with time or
plies per move, it cannot consistently win.

I have two useful, important pieces of news for you:

1.  All of the statistical data you have generated using Joker80 (appr.
20,000+ games) is corrupt.  It must all be thrown out and started over
from scratch after you repair Joker80.

2.  All of your material values for CRC pieces are unreliable since they
are based upon and derived from #1 (corrupt statistical data).

I hope you can handle constructive advice.

Derek Nalls wrote on Mon, May 26, 2008 02:54 PM UTC:
I am slightly relieved and surprised that Joker80 measurably improves the
quality of its moves as a function of time or plies completed over a range
of speed chess tournaments.  Nonetheless, completing games of CRC (where a
long, close, well-played game can require more than 80 moves per player)
in 0:24 minutes to 36 minutes does NOT qualify as long or even moderate
time controls.  In the case of your longest 36-minute games, with an example total of 160 moves, that allows just 13.5 seconds per move per player.  In fact, that is an extremely short time by any serious standards.  
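
The per-move figure follows directly from the numbers in the example. A minimal sketch (the function name is illustrative):

```python
def seconds_per_move(game_minutes, total_moves):
    """Average thinking time per move when a game of `total_moves`
    moves is completed in `game_minutes` minutes."""
    return game_minutes * 60 / total_moves


print(seconds_per_move(36, 160))  # 13.5
```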

I consider 10 minutes per move a moderate time that produces results of
marginal, unreliable quality and 60-90 minutes per move a long time that
produces results of acceptable, reliable quality.  Ask Reinhard Scharnagl or ET about the longest time per move they have used testing openings with their programs playing 'Unmentionable Chess'- 24 hours per move!

It is noteworthy that you are now resorting to playing dirty by using the
'exclusivist argument' that essentially 'since I am not a computer
chess programmer, I cannot possibly know what I am talking about when I
dare criticize an important working of your Joker80 program'.  What you
fail to take into account is that I am a playtester with more experience
than you at truly long time controls.  If you will not listen to what I am
trying to tell you, then why will you not listen to Scharnagl?  After all,
he is also a computer chess programmer with a lot of knowledge in
important subject matters (such as mathematics).

You really should not be laughing.  This is a serious problem.  Your
sarcastic reaction does nothing to reassure my trust or confidence that
you will competently investigate it, confirm it and fix it.

Now, please do not misconstrue my remarks.  My intent is not to overstate
the problem.  I realize Joker80 in its present form is not a totally
random 'woodpusher'.  It would not be able to win any short time control
tournaments if that were the case.  In fact, I believe you when you state
that you have not experienced any problems with it but ... I think this is
strictly because you have not done any truly long time control playtesting with it.

You must decide upon and define the best primary function for your Joker80
program:

1.  To pinpoint the single, very best move available from any position. 
[Ideally, repeats could produce an identical move.]

OR

2.  To produce a different move from any position upon most repeats. 
[At best, by randomly choosing amongst a short list of the best available
moves.]

These two objectives are mutually exclusive.  It is impossible and
self-contradictory for a program to somehow accomplish both.  Virtually
every AI game developer in the world except you chooses #1 as preferable
to #2 by a long shot in terms of the move quality produced on average.  

If you do not even commit your AI program to TRYING to find the single
best move available because you think variety is just a whole lot more
interesting and fun, then it will be soft competition at truly long time
controls facing other quality AI programs that are frequently-sometimes
pinpointing the single, best move available and playing it against you.

Derek Nalls wrote on Mon, May 26, 2008 07:04 PM UTC:
'Joker80's strength increases with time as expected, 
in the range from 0.4 sec to 36 sec per move, 
in a regular and theoretically expected way.'

'The effect you mention is observed NOT to occur
and thus cannot explain anything that was observed to occur.'

Admittedly, I have no proof ... yet.  Of course, this is due to Joker80
never having been playtested at truly long time controls (from my point of
view).
_______________________________________________________________

'Now if you want to conjecture that this will all miraculously become
very different at longer TC, you are welcome to test it and show us convincing results. I am not going to waste my computer time on such a wild and expensive goose chase.'

I respect your bravery to issue the challenge.  Although I would surely
find the results of a randomized Joker80 vs. non-randomized Joker80
tournament at 60 minutes per move (on average) interesting, I am not
willing either to invest a few (3-4) months of my computer time that I
estimate it would require to playtest 16 games under acceptable, reliable
conditions.
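
The estimate can be checked with rough arithmetic. This sketch assumes games of 160 total moves (the example length used earlier in this thread), games played one after another on a single machine, and 30-day months. All three are assumptions made for illustration.

```python
def tournament_days(games, moves_per_game, minutes_per_move):
    """Total running time in days for games played sequentially."""
    return games * moves_per_game * minutes_per_move / (60 * 24)


days = tournament_days(games=16, moves_per_game=160, minutes_per_move=60)
print(round(days, 1), round(days / 30, 1))  # 106.7 3.6
```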

My refusal is due to it not being extremely important or worthwhile to me
just to keep the chess variant community from losing one potentially great
talent to numerology (or some such).  Besides, I have nothing to gain and
nothing new to learn by conducting this long, difficult experiment.  
Only you stand to benefit tangibly from its results.

I just cannot understand how any rational, intelligent man could believe
that introducing chaos (i.e., randomness) is beneficial (instead of
detrimental) to achieving a goal defined in terms of filtering-out
disorder to pinpoint order.  

When you reduce the power of your algorithm in any way to filter-out
inferior moves, you thereby reduce the average quality of the moves chosen
and consequently, you reduce the playing strength of your program- 
esp. at long time controls.  In other words, you are counteracting a
portion of everything desirable that you achieve thru advanced pruning
techniques used elsewhere within your program.

Since you argue that randomization is no problem at all and I argue
that randomization is a moderate-major problem, everything we say to 
one another is becoming purely argumentative.  Only tests (that neither 
one of us intends to perform) can prove who is correct and settle the
issue.
___________________________________________________________________

'As I explained, it is very easy to switch this feature off. 
But you should be prepared for significant loss of strength if you do
that.'

To the contrary, you should be prepared for a significant gain of strength
if you do that.  Notably, you do not dare.

In any event, the addition of the completely-unnecessary module of code 
used to create the randomization effect within Joker80 that you desire 
irrefutably makes your program larger, more complicated and slower.  
Can that be a good thing?

Derek Nalls wrote on Mon, May 26, 2008 10:42 PM UTC:
'It would be very educational then to get yourself acquainted with the
current state of the art of Go programming ...'

Go is a connection game that is not related to Chess or its variants.
The only thing Go has in common with Chess is that it is played upon a 
board using pieces.  You did not directly address my comment.

Derek Nalls wrote on Mon, May 26, 2008 11:36 PM UTC:
Rest assured, I intend to drop this futile topic of conversation soon and
leave you alone.

The following is my impression of how the limited randomization of 
move selection that you have described as being at work within Joker80
must be harmful to the quality of moves made (on average) at long 
time controls.  Since you have experience and knowledge as the
developer of Joker80, I will defer to you the prerogative to correct 
errors in my inferred, general understanding of its workings.
_______________________________________________________

short time control
1x

At an example time control of 10 seconds per move (average),
Joker80 cuts thru 8 plies before it runs out of time and must
produce a move.  At the moment the time expires, it has selected 12 
high-scoring moves as candidates out of a much larger number of 
legal moves available.  Generally, all of them score closely together
with a few of them even tied for the same score.  So, when Joker80 
randomly chooses one move out of this select list, it has probably not 
chosen a move (on average) that is beneath the quality of the best 
move it could have found (within those severe time constraints)
by anything except a minor amount.  In other words, the damage to 
playing strength via randomization of move selection is minimized 
under minimal time controls.
___________________________

long time control
360x

At an example time control of 60 minutes per move (average),
Joker80 cuts thru 14 plies (due to its sophisticated advanced pruning
techniques) before it runs out of time and must produce a move.  
At the moment the time expires, it has selected only 4 high-scoring 
moves as candidates out of a much larger number of legal moves 
available.  Generally, all of them score far apart with a probable 
best move scored significantly higher than the probable second best 
move.  So, when Joker80 randomly chooses one move out of this 
select list, the chances are 3/4 that it has ignored its probable best
move.  Furthermore, it may not have chosen the probable second best move,
either.  It just as likely could have chosen the probable third or fourth
best move, instead.  Ultimately, it has probably chosen a move 
(on average) that is beneath the quality of the best move it may have 
successfully found by a moderate-major amount.  In other words, 
the damage to playing strength via randomization of move selection is 
maximized under maximal time controls.
_______________________________________

The moral of the story is that randomization of move selection reduces 
the growth in playing strength that normally occurs with time and plies 
completed.
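
To put rough numbers on the two cases above, here is a minimal sketch of the
arithmetic.  It assumes the model I inferred (one move picked uniformly at
random from the candidate list), which may not be how Joker80 actually
works.  The function names and all of the centipawn scores are hypothetical,
invented only to mimic "12 closely-scored moves" versus "4 widely-spread
moves".

```python
def expected_loss(scores):
    """Average shortfall versus the best candidate when one candidate
    is picked uniformly at random (the assumed selection model)."""
    return max(scores) - sum(scores) / len(scores)


def miss_probability(scores):
    """Chance that a uniform random pick is not a top-scoring candidate."""
    best = max(scores)
    return 1 - scores.count(best) / len(scores)


# HYPOTHETICAL centipawn scores, invented purely for illustration.
short_tc = [52, 52, 51, 51, 50, 50, 50, 49, 49, 48, 48, 47]  # 12 close candidates
long_tc = [90, 60, 35, 10]                                   # 4 spread-out candidates

for label, scores in (("short", short_tc), ("long", long_tc)):
    print(label, expected_loss(scores), miss_probability(scores))
```

With these made-up scores, the average shortfall is 2.25 centipawns in the
short case and 41.25 in the long case, and the long case misses the best
move 3 times out of 4.  The sizes depend entirely on the invented scores;
only the direction of the effect is the point.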

Derek Nalls wrote on Tue, May 27, 2008 04:11 PM UTC:
I have read that most computer chess programmers use the brute force method
initially when the plies can be cut thru quickly and then switch to use
advanced pruning techniques to focus the search from then on.  This led
to my misinterpretation that Joker80 would have more moves under
consideration as the best at short time controls than long time controls. 

Some moves that score highly positive after only a few to several plies will
score slightly positive, neutral or negative after more plies.  Thus, I do
not see how the number of moves under consideration as the best could
avoid being reduced slightly as plies are completed.  As a practical
concern, there is rarely any benefit in accepting the CPU load associated
with, for example, carrying a low-scoring positive move returned after
13-ply completion thru 14-ply completion when other
high-scoring positive moves exist in sufficient number.

Derek Nalls wrote on Mon, Jun 2, 2008 01:43 PM UTC:
Upon reflection, I have no conceivable reason to be distrustful of using
Joker80 IF I shut off its limited randomization of move selection which
Winboard F activates by default.  

Could you please give me example lines within the 'winboard.ini' file
that would successfully do so?  I need to make sure every character is
correct.

Derek Nalls wrote on Tue, Jun 17, 2008 03:44 PM UTC:
Muller:

Thank you for the helpful response.  Frankly, I considered my own question
so obvious as to be borderline-stupid but I just wanted to be certain.

The following entries within the 'winboard.ini' file should enable me to
playtest (limited) randomized and non-randomized versions of Joker80
against one another.  Do they look all right?  If/When I run out of more
pressing playtesting missions, I may undertake this one after all.

/firstChessProgramNames={'Joker80 22' /firstInitString='new\n'
'Joker80 22'
}
/secondChessProgramNames={'Joker80 22' /secondInitString='new\n'
'Joker80 22'
}

Unfortunately, I no longer plan to playtest sets of CRC piece values by
Muller, Scharnagl and Nalls against one another.  I think having the pawn
set to 85 and the queen set to 950 (as required by Joker80) for all three sets of material values would have the unintentional side effect of equalizing their scales (which are normally different).  This means that the Muller set would, in fact, be tested against something other than a true, accurate representation of the Scharnagl and Nalls sets.
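
A minimal sketch of that equalizing effect: once the pawn and the queen are
both pinned to Joker80's anchors, the queen-to-pawn ratio is 950/85 for
every set, whatever ratio the set had natively.  The "native" pairs below
are hypothetical placeholders, NOT the published Muller, Scharnagl or Nalls
values, and the function name is my own.

```python
# Anchors that Joker80 requires, as stated above.
JOKER80_PAWN = 85
JOKER80_QUEEN = 950


def queen_per_pawn(pawn, queen):
    """Queen value expressed in pawns for a given pair of anchors."""
    return queen / pawn


# HYPOTHETICAL native (pawn, queen) anchors, invented for illustration only.
native = {"set A": (100, 900), "set B": (100, 1050), "set C": (100, 1200)}

forced = queen_per_pawn(JOKER80_PAWN, JOKER80_QUEEN)
for name, (pawn, queen) in native.items():
    # Native ratios differ from set to set; the forced ratio is identical.
    print(name, round(queen_per_pawn(pawn, queen), 2), "->", round(forced, 2))
```

Each line prints a different native ratio on the left and the same forced
ratio (about 11.18) on the right.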

I am currently in the midst of conducting several 'minimized asymmetrical
playtests' using SMIRF at moderate time controls.  I want to tentatively
determine who is correct in disagreements between our models involving 2:1
or 1:2 exchanges (with supreme pieces).  I have to avoid its checkmate bug,
though.  This requires me to take back one move whenever the program
declares checkmate and 'call the game' if a sizeable material and/or
positional advantage indisputably exists for one player.  Fortunately,
this is almost always the case.  I will give a report in a few to several
weeks.

Derek Nalls wrote on Tue, Jun 17, 2008 05:57 PM UTC:
'Of course you could also use Joker80 or TJchess10x8, which do not suffer
from such problems.'
____________________

While you were on vacation, I started a series of 'minimized
asymmetrical playtests' using SMIRF.  So, I will complete them using SMIRF.

Joker80, running under Winboard F, has never acted buggy in computer
vs. computer games.  However, TJChess cannot handle my favorite CRC
opening setup, Embassy Chess, without issuing false 'illegal move'
warnings and stopping the game.

25 comments displayed
