Comments/Ratings for a Single Item

Piece Values

H. G. Muller wrote on Fri, Feb 14, 2020 10:28 AM UTC:

Strange anomaly

I have started an attempt to determine piece values on a 12x12 board, about which very little is known. I use Fairy-Max self-play for this. To keep the games from getting too long, the basic start position I use consists of a FIDE army augmented with 4 Pawns (to close the Pawn rank). So the board is pretty thinly populated. The 4 central Pawns start on 4th rank, to speed up engagement, but the 3 Pawns in each wing start on 2nd rank, to provide some King shelter. The Knights also start in a somewhat advanced position, on 3rd rank behind the central Pawns, as does the Queen (to not have too many unprotected Pawns in the start position).

To determine piece values I delete some pieces from this setup, or sometimes replace them by fairy pieces. The side with the weaker piece can be compensated by giving the opponent Pawn odds, for which I always delete the same Pawn (on the h-file). Such Pawn odds alone result in a 68-70% win for the side with the extra Pawn. I only trust results when the total advantage is closer to equality than Pawn odds.

I don't want to delete more than a single Pawn, because Pawns are highly cooperative pieces, and I cannot be sure that deleting two Pawns gives twice the advantage of deleting a single one. By always granting the same simple Pawn advantage (or none at all), I can get a reliable 'measuring stick' on which the values of other pieces can be mapped. This poses a problem if there is a gap of more than 2 Pawns in the 'piece-value spectrum', though: I cannot play those 1-on-1 without getting an imbalance that is too extreme. I solve that by also introducing fairy pieces with values in the gap, and also determine their value.

I try to avoid Bishops (and color-bound pieces in general) because of the pair-bonus problem, which would introduce yet another unknown that, if not handled properly by the engine, might invalidate the results. So I tried to find a very 'Bishop-like' piece that is not color bound. My first choice was this: to break the color binding of a Bishop, I gave it Wazir moves. To compensate for that (and prevent mating potential), I take away the Ferz moves. But to not affect the more distant B moves, they can still be blocked on the F squares. (So basically this is a Tamerlane 'Picket' + Wazir.)

Such a modified Bishop beats a Knight on 12x12 (as ordinary Bishops also do). The strange thing is that when I then handicap them with Pawn odds, they still beat the Knight, by about the same score. The extra Pawn doesn't seem to help the Knight at all! I have never seen that before; normally deleting a Pawn lowers the score by about as much as the pure Pawn-odds advantage. I watched a few games, and often they end in Knight + Pawns vs modified Bishop + Pawns, where the Knights are still judged by the engine to be ahead (because there are more Pawns on their side). But the Knight then almost always loses. The 'WazirPicket' can easily prevent the advance of many isolated Pawns, by guarding a diagonal. And by just stepping in front of a Pawn it attacks as well as blocks it, so that the Pawn easily falls. Even connected passers are easily destroyed this way, if they still have far to go. (And on 12x12 they usually have; I use promotion on last rank only.)

I guess the WazirPicket is just unrepresentatively dangerous to Pawns, in the absence of pieces that can protect those. (And Knights are pitifully slow on 12x12...) As a result, the value of opposing Pawns shrinks to almost nothing in the late end-game.

My next attempt at a non-color-bound substitute for a Bishop will only change the mF moves of a Bishop to mW, but leave the captures in place. This will also have the advantage that it cannot be blocked on a square that it doesn't attack, so that it cannot be blocked with impunity (as lame leapers can). Such a piece cannot do more damage to Pawns than ordinary Bishops can, but the mW move does allow it to switch color on non-captures. (Hence I dubbed it Swishop.)

George Duke wrote on Mon, Nov 25, 2013 05:44 PM UTC:

The 'Piece Values' topic has over 170 comments, the most of any article or
thread, but they all date from 2002 to 2008 and the topic has been in abeyance for five years, the last comment being http://www.chessvariants.org/index/displaycomment.php?commentid=22208. Paulowich's Grand Rook of Unicorn Great Chess is Rook plus Elephant, this Elephant being Alfil + Ferz -- making Grand Rook tri-compound (Rook + Alfil + Ferz). Paulowich's 9-point Grand Rook sets off with 10-point Queen in his system. Not that piece values haven't come up in several other forums since then, but this was one end-point.

http://en.wikipedia.org/wiki/Chess_piece_relative_value -- here Wikipedia,
which is steadily adding Chess information, has Queen value fluctuating
from 7.9 to 10.4 and Knight from 2.4 to 3.5 in realistic systems.

Peter Hatch started the topic 12 years ago with http://www.chessvariants.org/index/displaycomment.php?commentid=197, covering very standard piece-types.
Where Paulowich has an interest in Unicorn and Grand Rook, I of course study Falcon, as well as other personal favourites: Scorpion, Half-Duck, Sissa, the two Cannons and the two Canons, "Chinese" and "Korean." The original year-2000 Falcon article had too high a Falcon value estimate of 7.0, which was soon recalibrated to 6.0 in a comment. Then, thanks to H.G. Muller's work, Falcon is now pegged at 5.5 to Rook 5.0. However, Falcon falls towards 4.5 to the same Rook 5.0 in later stages and the endgame. Certain pieces, including the Cannons and Canons, whether divergent or convergent, have a fluctuating value that has to take into account how far the game has progressed.

More recently, Sovereign Piece-types,
http://www.chessvariants.org/index/listcomments.php?subjectid=SOVEREIGN_P_Ts [tables disrupted in Explorer], show a way to gauge value not pointwise, but by how many of each unit are needed to Mate. Thus Rook mate number is 1, and Bishop 2, and Falcon 1, and Knight 3, and Xiangqi Cannon 3.
One interest is to extend Mating Number to some hexagonal pieces. For example, Gilman's ForeRook and HindRook of AltOrthHex do have 1 Mating Number, but Settler(0,7) and Heptagram(4,1) both 3. A piece-type having a Mate number is Sovereign.

David Paulowich wrote on Fri, Dec 26, 2008 02:13 PM UTC:

John Smith asks: 'How would a Rook be strengthened by adding a Tripper's move? How would a Bishop be strengthened by adding a Threeleaper's move?'

My guess is, considerably. On a related note, see my note on piece values relating to Unicorn Great Chess. I introduced the Grand Rook, a Rook-Alfil-Ferz combination worth 100 points less than the Queen on a 10x10 board. I would expect the Grand Rook and the Chancellor to have the same value on any square board, as they are (roughly) similar in design.

David Paulowich wrote on Fri, Dec 26, 2008 01:50 PM UTC:

20. Two Wazirs cannot checkmate a lone King. 
21. Two Ferz's, even on different colors, cannot checkmate a lone King. 
22. A Wazir and a Ferz cannot checkmate a lone King. 
23. Two Camels, even on different colors, cannot checkmate a lone King. 
24. A Camel and a Zebra cannot checkmate a lone King.

- this data on forcing mate against the lone King is provided by Dave McCooey. Back on [2008-07-02] Sam Trenholme wrote: '... an alfil can only access 25% of the board ...'. CORRECTION: one Alfil can visit only 8 squares on the 8x8 board - it is truly a pitiful piece. The Ferz is a useful little piece (perhaps more valuable than a Wazir) and can be found in my 10x10 Shatranj variant, along with Elephants that can access 25% of the board.
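
The square counts in this correction can be checked with a short breadth-first search over an empty board. This is only a minimal sketch: the function name and the choice of a corner as starting square are assumptions made for illustration, and the count includes the starting square itself.

```python
from collections import deque

def reachable(start, steps, size=8):
    """Return every square a leaper can ever reach on an empty size x size board."""
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            on_board = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if on_board and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

ALFIL = [(2, 2), (2, -2), (-2, 2), (-2, -2)]   # diagonal leap of two squares
FERZ = [(1, 1), (1, -1), (-1, 1), (-1, -1)]    # diagonal step of one square
WAZIR = [(1, 0), (-1, 0), (0, 1), (0, -1)]     # orthogonal step of one square

for name, steps in [("Alfil", ALFIL), ("Ferz", FERZ), ("Wazir", WAZIR)]:
    print(name, len(reachable((0, 0), steps)))
```

On 8x8 this gives 8 squares for the Alfil, 32 for the Ferz and 64 for the Wazir, i.e. 12.5%, 50% and 100% of the board.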

John Smith wrote on Fri, Dec 26, 2008 09:44 AM UTC:

Does anyone here think a Knight is just an Alibaba that happens not to be
colourbound? ;) How would a Rook be strengthened by adding a Tripper's
move? How would a Bishop be strengthened by adding a Threeleaper's move?

Derek Nalls wrote on Fri, Nov 7, 2008 06:46 AM UTC:

Upon closer consideration, I have decided to cancel 3 out of the 4 planned
playtests using Joker80 running under Winboard F to play Embassy Chess
(mirror).  The reason is that I suspect they are probably untestable
conclusively within an achievable amount of time and number of games since
differences of less than 5% in value between the CRC pieces under study are
expected.  Obviously, 'untestable playtests' are oxymorons indicative of
a total waste of CPU time.

Please allow me to show the numbers behind my thinking based upon the
present CRC piece values models of Nalls & Muller.  

[Unfortunately, I no longer regard the CRC model of Scharnagl as being
sufficiently refined in compliance with experimental results to yield
accurate, predictive values.]
_____________________________

playtest #1
Embassy Chess (mirror)
1 queen missing vs. 2 rooks missing

Nalls

rook    59.43
queen  115.18

2 rooks / 1 queen = 1.0319

Muller

rook    55.88
queen  111.76

2 rooks / 1 queen = 1.0000
__________________________

average
(Nalls & Muller)

2 rooks / 1 queen = 1.0160

Conclusion- untestable!
_______________________

playtest #2
Embassy Chess (mirror)
1 archbishop missing vs. 1 rook + 1 bishop missing

Nalls

bishop       37.56
rook         59.43
archbishop   98.22

1 rook + 1 bishop / 1 archbishop = 0.9875

Muller

bishop       45.88 
rook         55.88
archbishop  102.94

1 rook + 1 bishop / 1 archbishop = 0.9885
__________________________________________

average
(Nalls & Muller)

1 rook + 1 bishop / 1 archbishop = 0.9880

Conclusion- untestable!
_______________________

playtest #3
Embassy Chess (mirror)
1 chancellor missing vs. 1 rook + 1 bishop missing

Nalls

bishop       37.56
rook         59.43
chancellor  101.48

1 rook + 1 bishop / 1 chancellor = 0.9558

Muller

bishop       45.88 
rook         55.88
chancellor  105.88

1 rook + 1 bishop / 1 chancellor = 0.9611
__________________________________________

average
(Nalls & Muller)

1 rook + 1 bishop / 1 chancellor = 0.9585

Conclusion- untestable!
________________________

playtest #4
Embassy Chess (mirror)
1 archbishop missing vs. 1 rook + 1 knight missing

Nalls

knight       30.77
rook         59.43
archbishop   98.22

1 rook + 1 knight / 1 archbishop = 0.9183

Muller

knight       35.29
rook         55.88
archbishop  102.94

1 rook + 1 knight / 1 archbishop = 0.8857
__________________________________________

average
(Nalls & Muller)

1 rook + 1 knight / 1 archbishop = 0.9020

Conclusion- testable!
______________________

Thus, I will begin playtest #4 very soon.
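
The ratios above can be reproduced with a short script. This is a sketch only: the piece values are the ones quoted in this post, while the function names and the exact form of the cut-off (averaged ratio at least 5% away from 1) are assumptions made for illustration.

```python
# Piece values quoted in the post (Nalls and Muller CRC models).
NALLS = {"knight": 30.77, "bishop": 37.56, "rook": 59.43,
         "archbishop": 98.22, "chancellor": 101.48, "queen": 115.18}
MULLER = {"knight": 35.29, "bishop": 45.88, "rook": 55.88,
          "archbishop": 102.94, "chancellor": 105.88, "queen": 111.76}

# Each playtest: (pieces missing on one side, piece missing on the other side).
PLAYTESTS = [
    (("rook", "rook"), ("queen",)),
    (("rook", "bishop"), ("archbishop",)),
    (("rook", "bishop"), ("chancellor",)),
    (("rook", "knight"), ("archbishop",)),
]

MIN_DIFFERENCE = 0.05  # assumption: the "less than 5%" rule applied to the averaged ratio

def ratio(model, numerator, denominator):
    """Summed value of the numerator pieces divided by that of the denominator pieces."""
    return sum(model[p] for p in numerator) / sum(model[p] for p in denominator)

def averaged_ratio(numerator, denominator):
    """Mean of the Nalls and Muller ratios for one playtest."""
    return (ratio(NALLS, numerator, denominator)
            + ratio(MULLER, numerator, denominator)) / 2

def is_testable(numerator, denominator):
    """True when the averaged ratio differs from 1 by at least MIN_DIFFERENCE."""
    return abs(1.0 - averaged_ratio(numerator, denominator)) >= MIN_DIFFERENCE

for number, (num, den) in enumerate(PLAYTESTS, start=1):
    verdict = "testable" if is_testable(num, den) else "untestable"
    print(f"playtest #{number}: average ratio {averaged_ratio(num, den):.4f} -> {verdict}")
```

Under these assumptions only playtest #4 clears the 5% bar, matching the conclusions listed above.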

Reinhard Scharnagl wrote on Fri, Jul 4, 2008 04:50 PM UTC:

To Derek Nalls and H.G.M.:

For your testing purposes I will provide new SMIRF releases. There will be
three of them. The regular one will have the suffix -0, the one with the
minor increased Archbishop basic exchange value will have the suffix -1,
and the one more increased will have the suffix -2.

SMIRF basic exchange values will be for 10x8:

P=1.0000
S=3.0556
B=3.5972 
R=5.4306
A=6.6528 / 7.5685 / 8.0278
C=8.4861
Q=9.0278

So I will be waiting for your testing results ...

Derek Nalls wrote on Thu, Jul 3, 2008 05:25 AM UTC:

Conclusive Report
(but without any evidence)

I began this round of playtesting using SMIRF MS-174b-O which contained
a bad checkmate bug.  Since I regard it as inconsistent to me to:

1.  present saved games unaltered whenever the checkmate bug did not 
present itself.

YET

2.  present saved games altered whenever the checkmate bug did present
itself.

... I chose to present no saved games at all for the sake of consistency.

In fact, I did not save any games at all generated via SMIRF playtests.

This puts me in the strange position of playtesting mainly for my own
interest since I do not have the right to demand that anyone else take my
word for the playtesting results I am reporting.

[The latest version of SMIRF recently given to me by Reinhard Scharnagl, 
MS-174c-O, has never shown me a checkmate bug.  Hopefully, it never
will.]
_____________________________________________________________________

Since I have been convinced thru playtesting recommended by Muller that 
the archbishop has a material value nearly as great as the chancellor in
CRC, the desirability of confirming the order of material values for the
'supreme pieces' (i.e., queen, chancellor, archbishop) used in all
reputable CRC models occurred to me.  Accordingly, 3 asymmetrical
playtests were devised.  These are 1:1 exchanges involving a player
missing 1 given supreme piece versus a player missing 1 different supreme
piece.  Generally, the results were normal as expected.

Embassy Chess

(player without 1 archbishop) vs. (player without 1 chancellor)
10 minutes per move
(player without 1 archbishop) wins 2 games (playing white & black)
75% (3/4) probability of correctness

(player without 1 chancellor) vs. (player without 1 queen)
10 minutes per move
(player without 1 chancellor) wins 2 games (playing white & black)
75% (3/4) probability of correctness

(player without 1 archbishop) vs. (player without 1 queen)
10 minutes per move
(player without 1 archbishop) wins 2 games (playing white & black)
75% (3/4) probability of correctness

order of material values of CRC pieces
(from highest to lowest)

1.  queen
2.  chancellor
3.  archbishop

By transitive logic, the third playtest could have been considered
totally unnecessary.  Nonetheless, I conducted it as a double-check to the
consistency of the results from the first and second playtests.  
Although a 75% (3/4) probability per test could be improved upon greatly 
with a couple-few more games, I am already satisfied that the results are
correct and that something unexpected is not the reality.  So, I will not
be playtesting this issue further.  There are more interesting and
pressing mysteries to me awaiting tests.

Greg Strong wrote on Wed, Jul 2, 2008 09:02 PM UTC:

Sam:

In practice, mild colorboundness seems to matter less than one would
expect. (By mild, I mean only losing half the board, and having a pair on
different colors.) Interestingly, a Ferz is worth more than a Wazir
because it attacks in two forward directions instead of one.

H. G.:

Yes, this is the plan.  The rewrite will begin as console only, with GUI
to be attached later.

H. G. Muller wrote on Wed, Jul 2, 2008 06:25 PM UTC:

Sam Trenholme:
| What is your experience with how being colorbound affects the value 
| of a short range leaper?

I never tried measuring heavily 'challenged' pieces like the Alfil or
Dabbaba. So I can only speak for color-bound pieces that can still access
50% of the board, like Bishop, Ferz, Camel, FD.

My experience is that, when I measure those in pairs of opposite color,
their value hardly suffers. A pair of FDs was worth almost as much as a
pair of Knights (580 vs 600). But in analogy to Bishops the value of such
a pair should be split in a base value and a pair bonus. A good way to
measure the pair bonus seems playing the two color-bound pieces on the
same color against a pair on different color. At least for the Bishops
this worked quite well, using Joker.

Problem is that Fairy-Max is really a bit too simple to measure a subtle
effect like this, as its evaluation does not include any pair bonuses. In
micro-Max, for orthodox Chess, I simply make the Bishop worth more than a
Knight, to bias it against B vs N trades. Although this makes it shy away
from B vs N trades even with only a single Bishop for no justifiable
reason, this is not very harmful. Unfortunately, this trick does not make
it avoid trading Bishops of unlike color against Bishops of like color.
And when both engines see these as perfectly equal trades, they become
very likely, wasting the advantage of the pair. I guess I could fix this
by programming the Bishops of either side as different pieces, and give
the Bishops of the side that has the pair a larger base value. (And
similar for other color-bound pieces.) I have not tried this yet.

Note that one should also expect cross-type pair bonuses, e.g. an FD plus
a Bishop are worth more if they are on unlike color. I am also not sure
how to calculate pair bonuses if there are more than 2 color-bound pieces
on the board for each side. E.g. with 4 Bishops, two on white, two on
black, do I have two pairs, or four pairs?
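
To make the two readings of that question concrete, here is a small sketch of two possible counting conventions. Both conventions and the function name are hypothetical illustrations; nothing here says which one, if either, an evaluation should use.

```python
def pair_counts(on_light, on_dark):
    """Count unlike-colour pairs under two hypothetical conventions."""
    return {
        # each piece may belong to only one pair
        "disjoint_pairs": min(on_light, on_dark),
        # every light/dark combination counts as a pair
        "all_unlike_pairs": on_light * on_dark,
    }

print(pair_counts(2, 2))  # 4 Bishops: two on white, two on black
print(pair_counts(1, 1))  # the ordinary Bishop pair
```

For the 4-Bishop case the first convention gives two pairs and the second gives four, while both agree on one pair for the ordinary Bishop pair.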

I currently believe Betza's conjecture as a working hypothesis, that as
long as you have one piece of every color-class, the total value of the
set does not suffer from the color boundness. But I haven't tested 8
Alfils per side, and I have no idea how much the value of the set
decreases if you have only 4 left. There could be a term that is quadratic
in the number of Alfils in the evaluation. All this can in principle be
tested, but a piece with 4 targets, like Ferz, is not worth much to begin
with (~150 cP on 8x8). The Alfil is most likely not better, even in a
dense pack. And pair-bonus effects are usually again a small fraction of
the base value, and might be as low as 20 cP. It requires an enormous
number of games to get such small difference above the noise threshold.

Sam Trenholme wrote on Wed, Jul 2, 2008 06:12 PM UTC:

You know, I think Fairy Max is a good program for doing piece value research; I think I will download it and see if I can get some interesting figures for the value of the pieces in 8x10 chess.

- Sam

H. G. Muller wrote on Wed, Jul 2, 2008 05:54 PM UTC:

Reinhard:
Why is it relevant what you like, for giving Derek what he wants? He would
not ask for it unless HE liked it. You seem to deny other people what they
want/need/like because it is different from what you like.

Just add 2 Pawns to the value of any Archbishop. No matter how the rest of
your evaluation is, that can't be that difficult? If you think the
evaluation becomes totally nonsensical because of this, that is Derek's
problem.

Reinhard Scharnagl wrote on Wed, Jul 2, 2008 05:27 PM UTC:

H.G.M.: '... But the point really is that Derek ASKS you to provide such a
version of SMIRF to help him conduct an experiment he thinks is
interesting. ...'

Harm, I have released such special versions of SMIRF. But in the meantime
SMIRF internally has separated the mobility parts from the static values
and replaces it by combining a positional detail evaluation, where needed.
This is not at all a mature approach, because I develop and improve it
entirely on my own.

Here I am not very interested in an epicycle-theory kind of approach, but
more in an approach along the lines of Kepler's laws, using clear and simple
statements that could be of practical use. Statistical results
might give a hint for research, but I would like to see convincing
explanations.

Sam Trenholme wrote on Wed, Jul 2, 2008 04:12 PM UTC:

Muller:

What is your experience with how being colorbound affects the value of a short range leaper?

For example, my gut instinct tells me a ferz (moves one square like a bishop) is worth more than an alfil (jumps two squares like a bishop), since an alfil can only access 25% of the board, and a ferz can access 50% of the squares on the board. Likewise, a wazir (one square like a rook) should be worth more than a ferz, since it can access all of the squares on the board.

Thanks for your input (and I'm sure the short range project will greatly appreciate your research).

- Sam

H. G. Muller wrote on Wed, Jul 2, 2008 11:08 AM UTC:

Some more empirical data for those who are working on ab-initio theories
for calculating piece values:

I did determine piece values of several fully symmetric elementary and
compound leapers, with various number of target squares, in the context of
a normal FIDE Chess set in which the extra pieces were embedded in pairs,
on a 10x8 board. The number of target squares varied from 4 (Ferz, Wazir)
to 24 (Lion), the length of the leap limited to 2 in one dimension. From
this I noticed that the empirical values for pieces with the same number
of target squares tend to cluster quite closely around certain values
(140, 285, 630 and 1140 centiPawn for pieces with 4, 8, 16 and 24 targets,
respectively). These values can be fitted by the expression

value = (30 + 5/8*N)*N,

where N is the number of target squares (when unrestricted by board
edges).
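
As a quick check of how closely the expression tracks the quoted cluster values, the following sketch evaluates it at the four measured target counts. The function and variable names are made up for illustration; the numbers are the ones given above.

```python
def fitted_value(n_targets):
    """Fitted leaper value in centiPawn: (30 + 5/8*N)*N."""
    return (30 + 5 / 8 * n_targets) * n_targets

# Empirical cluster values quoted above, in centiPawn, keyed by target count.
EMPIRICAL = {4: 140, 8: 285, 16: 630, 24: 1140}

for n, measured in EMPIRICAL.items():
    fit = fitted_value(n)
    print(f"N={n:2d}  measured={measured:4d}  fitted={fit:6.1f}  residual={fit - measured:+.1f}")
```

The fit gives 130, 280, 640 and 1080 cP, so it stays within 10 cP of the measurements for N up to 16 and falls 60 cP short at N = 24.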

Then I went on to test how the value of the Lion is affected by taking
some moves away. In this context the Lion is a piece that reaches all
targets in the 5x5 area in which it is centered, so it is nearly saturated
with moves (taking away 1 or 2 hardly affects its overall
manoeuvrability). In taking away moves, I preserved the
left-right symmetry of the piece, so that moves not on a file were
disabled in pairs. This left 14 distinct leap types, which I disabled one
at a time. I then played a pair of the thus handicapped pieces against a
pair of unimpeded Lions (plus the FIDE array present for both sides).

The resulting excess scores in favor of the unimpeded Lions when disabling
the various leaps were:

forward:   12.5% 15.1%  8.8% 15.1% 12.5%
           11.0% 14.8%  5.9% 14.8% 11.0%
            6.8%  5.0%    -   5.0%  6.8%
            7.9%  7.8%  5.4%  7.8%  7.9%
backward:   7.6%  9.1%  5.4%  9.1%  7.6%

So disabling both forward (2,2) leaps (fA in Betza notation) reduced the
winning chances by 12.5%, etc. Pawn odds produces approximately 12% excess
score, so the two fA leaps marginally contribute a value of 100 cP to the
Lion. Note the values were obtained from 1000-game matches, and thus have
a statistical error of ~1.5% (12.5 cP). Also note that the numbers on the
vertical symmetry axis have to be multiplied by at least a factor 2 for
fair comparison with the other numbers, as in these tests only a single
leap was disabled, as opposed to two in the others.
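
The conversions used in this paragraph can be sketched as follows. The 12% excess score per Pawn is the figure quoted above; the per-game standard deviation of 0.5 is an assumption (the worst case, with no draws), so the resulting error is an upper bound rather than the exact figure quoted. Function names are illustrative.

```python
import math

PAWN_ODDS_EXCESS = 12.0  # percent excess score produced by Pawn odds (from the post)

def excess_to_centipawns(excess_percent):
    """Convert an excess score in percent to centiPawn, assuming a linear scale."""
    return 100.0 * excess_percent / PAWN_ODDS_EXCESS

def score_std_error(games, per_game_sd=0.5):
    """Standard error of the mean score over independent games.

    per_game_sd=0.5 is the no-draw worst case (an assumption); draws lower it.
    """
    return per_game_sd / math.sqrt(games)

print(f"12.5% excess  ~ {excess_to_centipawns(12.5):.0f} cP")
print(f"1.5% error    ~ {excess_to_centipawns(1.5):.1f} cP")
print(f"1000 games    ~ {100 * score_std_error(1000):.2f}% standard error at most")
```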

As a general conclusion, we can see that forward moves are worth more (by
about a factor 5/3) than sideway or backward moves. 'Narrow' leaps seem
on average to be worth a little bit more than 'wide' leaps.

I am not sure if the scores above can be taken at face value as indicators
of the relative value of the particular leap in other pieces as well; it
could be that there are some cooperative contributions here that are
included in the measured marginal values, as all other leaps are always
present. E.g. the forward narrow Knight leaps are worth most, but perhaps
this is because they provide the piece with distant solo mating potential
of a King on the backrank. Perhaps the observed piece values should be
corrected for such global properties (of the entire target pattern) first,
before ascribing the value to individual leaps. Note, however, that all the
marginal scores add up to 123%, which is about 10.25 Pawns, not so far away
from the empirical total value of the Lion. This suggests that cooperative
effects can't be very large on average.
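
The total quoted here can be reproduced by adding up the 14 distinct leap types, taking the left half of the table plus the centre file (the right half mirrors the left). A minimal sketch, with the 12% per Pawn conversion taken from the post and the variable names chosen for illustration:

```python
PAWN_ODDS_EXCESS = 12.0  # percent excess score per Pawn (from the post)

# Excess scores in percent for the 14 distinct leap types.
DISTINCT_LEAP_EXCESS = [
    12.5, 15.1, 8.8,   # two ranks forward
    11.0, 14.8, 5.9,   # one rank forward
    6.8, 5.0,          # same rank (sideways)
    7.9, 7.8, 5.4,     # one rank backward
    7.6, 9.1, 5.4,     # two ranks backward
]

total_excess = sum(DISTINCT_LEAP_EXCESS)
total_pawns = total_excess / PAWN_ODDS_EXCESS

print(f"{len(DISTINCT_LEAP_EXCESS)} leap types, "
      f"{total_excess:.1f}% total, about {total_pawns:.2f} Pawns")
```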

Next I intend to figure out how much of the value of each leap is provided
by its capture aspect, and how much by the non-capture aspect, by disabling
these separately. For the distant leaps, I want furthermore to know how
much the value changes if these are turned into lame leaps, blockable on a
single intermediate square. Note that the Xiangqi Horse (Mao) drops a
factor 2 in value compared to an orthodox Knight by being lame. I also
want to investigate if the lameness is worse if the piece has no capture
to the square on which the move could be blocked (a cooperative effect).

H. G. Muller wrote on Wed, Jul 2, 2008 08:48 AM UTC:

Greg Strong:
| The current state of ChessV? 

Hi Greg! Good to see you back here! What would be very interesting to me
is to have a version of ChessV that just plays as a console application
rather than having its own graphical interface. Preferably using WinBoard
protocol, of course, but I would be happy with anything, no matter how
primitive. I wouldn't even mind if the graphical interface stays, as long
as ChessV would also print the move it makes on its standard output, and
reads and accepts a move from its standard input. If it could do those
things, I would be able to write an adapter to run it under WinBoard
against other engines.

Would this be feasible?

| For one thing, it doesn't anticipate forced repetition draws in 
| the appropriate way; even if it is winning by quite a margin, 
| it won't break the repetition to save its advantage.  

I can vouch from my experience with micro-Max that this is extremely
important. It is almost impossible to quantitatively judge performance of
the engine if it can be tricked into rep draws, to the point where very
clear improvements do not affect the score at all.

In uMax I could fix 95% of the problem by recognizing returns to positions
that already occurred before in the game history, and evaluate those at
0.00. That it cannot really plan (or avoid) forced repetitions that occur
entirely in the tree is only a minor problem, as it does not occur too
often that repetitions can be forced.
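
The idea of scoring a return to an earlier game position as a draw can be sketched like this. It is an illustration of the principle only, not micro-Max's actual code: the position keys (for example hashes), the `game_history` set and the `static_eval` callback are all hypothetical names.

```python
def evaluate(position_key, game_history, static_eval):
    """Score a position, treating any position already seen in the game as a draw."""
    if position_key in game_history:
        return 0.0  # repetition of a game-history position: evaluate as 0.00
    return static_eval(position_key)

# Hypothetical usage: two positions already occurred in the game.
history = {"pos_a", "pos_b"}
winning_eval = lambda key: 3.5  # stand-in evaluation: the engine thinks it is well ahead

print(evaluate("pos_a", history, winning_eval))  # repeated position
print(evaluate("pos_c", history, winning_eval))  # new position
```

With a check like this, a side that is ahead sees the repeated position as worth less than its current advantage and so prefers any other reasonable move.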

H. G. Muller wrote on Wed, Jul 2, 2008 08:29 AM UTC:

Reinhard:
| P.S.: Thus it would be best to present a short and convincing argument.

If you don't consider the fact that 'the side having piece A beats that
having piece(s) B 90% of the time' a convincing argument to value A
higher than B, I don't really see what could convince you.

But the point really is that Derek ASKS you to provide such a version of
SMIRF to help him conduct an experiment he thinks is interesting. So it
should not really matter if the piece values he requests are CORRECT or
not, because this is exactly what he is trying to test. The question is if
you want to HELP him searching for the truth, by providing him what he
needs to conduct this search...

H. G. Muller wrote on Wed, Jul 2, 2008 07:51 AM UTC:

Sam Trenholme:
| I think the best way to come up with reasonable piece values is
| to have a computer program play itself hundreds or thousands of
| games of a given chess variant, and use genetic selection (evolution)
| to choose the version of the program with piece values that win the
| most games.

This has been tried many times before (in normal Chess, mainly), with an
appalling lack of success. The reason is that even a very wrong evaluation
of one of the pieces (say a program that values a Queen at 7.5 instead of
a correct 9.5) still only leads to bad trades in a minority of the cases,
like 10-20%. This is because it requires complex exchanges (like Q vs R+B)
rather than simple 1:1 exchanges, and such complex exchanges simply do not
present themselves very often in games. In the other 80-90% of games the Queens will be
traded against each other, which will always be a neutral trade to each
program, no matter how much they differ in Queen value.

When only 10% of the games is affected by the piece-value difference,
while 90% with equal trades will have a 50-50 outcome, that latter set of
games will still produce statistical noise, which is added to the noise in
the overall result score, while it dilutes the systematic bias because of
the different evaluation. If, after a wrong trade (say Q vs R+B) induced
by the faulty Q value the side with Q left would have 70% winning chance,
(20% above par), the total score would be only 52% (2% above par). To
detect this score excess with the same relative statistical accuracy as
the 20% excess would require 100x as many games. (So 10,000 instead of
100.)
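
The arithmetic of this dilution argument can be written out as a sketch. It assumes, as the paragraph does, that statistical noise shrinks with the square root of the number of games; the function names are illustrative.

```python
def diluted_excess(fraction_affected, excess_when_affected):
    """Overall excess score when only a fraction of games carries the bias."""
    return fraction_affected * excess_when_affected

def games_multiplier(fraction_affected):
    """Factor by which the number of games must grow to keep the same relative accuracy.

    The signal shrinks by fraction_affected while noise falls only as 1/sqrt(games),
    so the game count has to grow by 1/fraction_affected**2.
    """
    return 1.0 / fraction_affected ** 2

excess = diluted_excess(0.10, 0.20)   # 10% of games affected, 20% above par in those
print(f"overall score: {100 * (0.5 + excess):.0f}%")
print(f"games needed: {round(100 * games_multiplier(0.10))} instead of 100")
```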

The situation could be slightly improved if one would SELECT games before
analysis, throwing out all games with equal trades (Q vs Q). Then you
eliminate the random noise produced by them from the result, and would
only look at the sample with unequal trades (with 20% score excess). You
would still need about 100 of those, but now you only have to play 1,000
games to acquire them. Problem is that judging which games were affected
by the piece-value under study is a bit subjective, as Q vs Q trades do
not always occur through QxQ, ...xQ combinations, but sometimes are part
of a larger exchange with intermediate positions with material imbalance
(not affecting the engine decision, as they were within the horizon of the
engine search).

This is why I adopted the methodology of forcing the material imbalance
under study into the game from the very beginning. ('Asymmetric
playtesting' in Derek's terminology.) All games I play are then relevant.
Even if the engine I play with has a completely wrong idea of the piece
values, the material advantage it has at the outset (say A vs B+N) will be
needlessly traded away in only 10% of the cases. And if both engines share
the misconception, that will be still lower, as the opponent would
actually try to avoid such trades. So you will have only a slight
suppression of the excess score, and very little noise added to it.

| I could do it myself, but I need a chess variant engine that I can
| set, from the command line, white's and black's values of the pieces
| independently, and then have the variant play itself a game of the
| chess variant.

I would only applaud this. In fact the engines you request do exist, and
can be downloaded as free software from my website:

* Joker80 allows setting of the piece values by a command-line argument,
(a feature requested by Derek, as discussed below in this thread) but is
limited to 10x8 variants with the Capablanca pieces.

* Fairy-Max allows implementation of (nearly) arbitrary fairy pieces, and
setting of their values, through a configuration file (fmax.ini) that can
be changed with a simple text editor like Notepad. (This because the
options here are too elaborate to fit on the command line.) The format of
the piece description is admittedly a bit cumbersome (that is, the
description of the way it moves, especially if it is a complex move like
that of a Crooked Bishop), but the fmax.ini that is provided for download includes many examples for the more common fairy pieces. And changing the piece value is absolutely trivial. Furthermore, I am always available to provide assistance.

Greg Strong wrote on Wed, Jul 2, 2008 12:58 AM UTC:

Wow... So much going on here!  I have been away from chessvariants.org for
quite a long time (focusing on other things, for better or worse,) but it
is exciting for me to see so much interest in this topic.  I have been
away for too long.  I actually wrote ChessV initially because I was
inspired by Ralph Betza's work on determining values of fairy chess pieces
and Chess with Different Armies.  ChessV was intended more as a research
tool than as a Zillions-of-Games replacement.

The current state of ChessV?  Well, it's a very good program for
entertaining a human who wants a challenging opponent for a wide variety
of chess variants.  Unfortunately, although I intended it for research,
and it is not bad for that purpose, it is not great either because it
likely has some deep-rooted bugs that I am ill-equipped to fix.  For one
thing, it doesn't anticipate forced repetition draws in the appropriate
way; even if it is winning by quite a margin, it won't break the
repetition to save its advantage.  I started on a complete re-write, but
got distracted a while ago.  Clearly, I need to get back to the grindstone
...

Thanks to everyone who is conducting research into the value of chess
pieces, and, furthermore, into the value of different variants.

Derek Nalls wrote on Tue, Jul 1, 2008 10:20 PM UTC:

Muller & Scharnagl:

Please note that I have revised my model again in consideration of recent
playtesting results.  This affects material values of 'supreme pieces'
in both FRC and CRC.

CRC
material values of pieces
http://www.symmetryperfect.com/shots/values-capa.pdf

pawn 10.00
knight 30.77
bishop 37.56
rook 59.43
archbishop 98.22 
chancellor 101.48
queen 115.18

FRC
material values of pieces
http://www.symmetryperfect.com/shots/values-chess.pdf

pawn 10.00
knight 30.00
bishop 32.42
rook 50.88
queen 98.92

For details, please see:

universal calculation of piece values
revision- July 1, 2008
http://www.symmetryperfect.com/shots/calc.pdf
65 pages

Consequently ...

My current CRC model is more similar to the Muller model than any other.
My current FRC model is more similar to the Kaufmann model than any
other.

Unfortunately, a 65-page explanation, even if it is 'elaborate sense',
is not conducive to the 'short, convincing argument' you seek.
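
As a quick arithmetic check on the CRC figures listed above, the following minimal sketch computes how far each supreme piece sits above a rook + bishop pair, in pawn units. The values are copied from the list in this comment; the function name `exchange_balance` is only illustrative and is not taken from any of the programs discussed.

```python
# CRC material values quoted in the comment above (pawn = 10.00).
CRC_VALUES = {
    "pawn": 10.00,
    "knight": 30.77,
    "bishop": 37.56,
    "rook": 59.43,
    "archbishop": 98.22,
    "chancellor": 101.48,
    "queen": 115.18,
}


def exchange_balance(given_up, received, values=CRC_VALUES):
    """Return (value received - value given up), expressed in pawns."""
    lost = sum(values[piece] for piece in given_up)
    gained = sum(values[piece] for piece in received)
    return (gained - lost) / values["pawn"]


if __name__ == "__main__":
    for supreme in ("archbishop", "chancellor", "queen"):
        diff = exchange_balance(["rook", "bishop"], [supreme])
        print(f"{supreme} vs rook + bishop: {diff:+.3f} pawns")
```

With these numbers the archbishop comes out roughly 0.12 pawns above rook + bishop, the chancellor roughly 0.45 pawns, and the queen roughly 1.82 pawns, while two rooks (118.86) remain above the queen.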

Sam Trenholme wrote on Tue, Jul 1, 2008 07:23 PM UTC:

I think the best way to come up with reasonable piece values is to have a computer program play itself hundreds or thousands of games of a given chess variant, and use genetic selection (evolution) to choose the version of the program with piece values that win the most games.

You can even have mating (Sex! Can we say that on the Chess Variants site?): Two sets of known piece values can mate and the resulting piece values will be an average of the piece values of the two 'mates' (with some randomization; the child's piece values will be a random mix of the two parents' piece values).

I could do it myself, but I need a chess variant engine that lets me set White's and Black's piece values independently from the command line, and then plays a game of the chess variant against itself.

- Sam
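
The selection scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `play_game` is a hypothetical callback standing in for an engine that accepts separate piece values for White and Black (no such command-line interface is confirmed for any engine mentioned here), and the crossover uses the "random mix" variant, where the child takes each value from one parent at random, followed by a small random perturbation.

```python
import random


def crossover(a, b, rng):
    """Child takes each piece value from one of the two parents at random."""
    return {piece: rng.choice((a[piece], b[piece])) for piece in a}


def mutate(values, rng, scale=0.05):
    """Perturb each value by up to +/- scale (relative), keeping it positive."""
    return {
        piece: max(0.01, v * (1 + rng.uniform(-scale, scale)))
        for piece, v in values.items()
    }


def evolve(population, play_game, generations, rng):
    """Round-robin selection over piece-value sets.

    play_game(white_values, black_values) is a hypothetical callback that
    returns 1.0, 0.5 or 0.0 from White's point of view.
    """
    for _ in range(generations):
        scores = [0.0] * len(population)
        for i, white in enumerate(population):
            for j, black in enumerate(population):
                if i == j:
                    continue
                result = play_game(white, black)
                scores[i] += result
                scores[j] += 1.0 - result
        # Sort by score only, so the dicts themselves are never compared.
        ranked = [
            values
            for _, values in sorted(
                zip(scores, population), key=lambda pair: pair[0], reverse=True
            )
        ]
        # Keep the better half (at least two) and refill with their children.
        survivors = ranked[: max(2, len(ranked) // 2)]
        children = []
        while len(survivors) + len(children) < len(population):
            mom, dad = rng.sample(survivors, 2)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = survivors + children
    return population
```

The pawn would normally be left out of the evolving sets and fixed as the unit of measure. In practice the expensive part is `play_game`: every generation of a population of n needs n * (n - 1) games, so the population size and the time per move have to be chosen with that in mind.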

Reinhard Scharnagl wrote on Tue, Jul 1, 2008 06:23 PM UTC:

Derek: '... Please raise the material value of your archbishop within your
CRC model? ...'

If there were a theoretical explanation of your value estimate besides
that statistical argument, I would reconsider my model. For me it is more
interesting to have a well-argued and preferably *SIMPLE* value
derivation, even if it does not match all experience. That is because
such ideas could also influence the layout of other program parts, e.g.
the detail evaluation.

P.S.: Thus it would be best to present a short and convincing argument.

Derek Nalls wrote on Tue, Jul 1, 2008 04:00 PM UTC:

Inconclusive Report

One type of 1:2 or 2:1 exchange I have been playtesting using SMIRF
(versions MS-174b-O and MS-174c-O) involves a player missing 1 archbishop
OR 1 chancellor versus a player missing 1 rook and 1 bishop.  Generally,
the results favored the Muller model, in which any 1 supreme piece in
CRC (archbishop, chancellor, queen) has a material value significantly
higher than any other 2 pieces (except 2 rooks).

Embassy Chess

(player without 1 archbishop) vs. (player without 1 rook + 1 bishop)
10 minutes per move
(player without 1 rook + 1 bishop) wins 2 games (playing white & black)
75% (3/4) probability of correctness

(player without 1 chancellor) vs. (player without 1 rook + 1 bishop)
15 minutes per move
(player without 1 rook + 1 bishop) wins 2 games (playing white & black)
75% (3/4) probability of correctness
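
The report does not say how the "75% (3/4)" figure was derived. One reading that reproduces it, offered here only as an assumption, is: if both sides were equally strong and draws are ignored, the same side would win both games with probability 1/4, so a 2-0 result is taken as 3/4 support for a real difference. A minimal sketch of that calculation:

```python
from math import comb


def prob_at_least(wins, games, p=0.5):
    """P(at least `wins` wins in `games` independent games won with probability p)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# Assumption: equal strength (p = 0.5), independent games, draws ignored.
chance_of_sweep = prob_at_least(2, 2)  # same side wins both games by luck
confidence = 1 - chance_of_sweep

if __name__ == "__main__":
    print(f"chance of a 2-0 sweep between equal sides: {chance_of_sweep:.2f}")
    print(f"implied 'probability of correctness': {confidence:.2f}")
```

Under the same assumptions, the function can be reused for longer matches, for example `prob_at_least(8, 10)` for the chance of scoring at least 8 wins out of 10 between equal sides.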

Unfortunately, since I used standard versions of SMIRF loaded with
Scharnagl CRC material values, the results became tainted due to a game
between the (player without 1 chancellor) and the (player without 1 rook +
1 bishop) at 10 minutes per move.  The player with the potentially
game-winning 3:2 advantage in supreme pieces unnecessarily permitted the
exchange of its 1 archbishop for 2 minor power pieces (i.e., 1 bishop + 1
knight).  Eventually, a 3-fold repetition draw occurred.

Scharnagl:

Please raise the material value of your archbishop within your CRC model?
My experience has convinced me that it is obviously 1-2 pawns too low. 
Otherwise, I will be forced to abandon the use of SMIRF in favor of a
program (such as Joker80) with more reliable CRC piece values when I
return to this unresolved playtesting issue.

Reinhard Scharnagl wrote on Wed, Jun 18, 2008 09:15 PM UTC:

Sam Trenholme: '... and I needed to give Smirf over a minute to make a
move ...'

I presume you used the donationware version of SMIRF, which is probably
about 150 Elo behind the donors' bonus version of SMIRF (currently
improved again). Nevertheless, Joker80 is the top 10x8 engine.

Sam Trenholme wrote on Wed, Jun 18, 2008 04:37 PM UTC:

I have been playing around with Joker80 and the corresponding Winboard program; I am very pleased with Joker80's performance. In my initial testing, the only program I could get to beat Joker80 was Smirf, and I needed to give Smirf over a minute to make a move while making Joker80 make all of its moves in under 5 minutes.

Joker80 soundly defeated ChessV 0.9.3 and Zillions of Games, even with a time handicap in both cases.

Anyway, some issues:

- I can't set up a custom opening setup in Winboard, save the opening setup, and have Winboard read the setup without complaining the file is unreadable.
- It would be nice if Joker80 had support for the 'free' castling used by Grotesque and Univers Chess (not to mention my own humble contribution to 10x8 chess, Schoolbook).

25 comments displayed
