The Chess Variant Pages





Comments/Ratings for a Single Item

This item is a game information page
It belongs to categories: Orthodox chess, 
It was last modified on: 2002-05-28
By Ralph Betza. Chess with Different Armies. Betza's classic variant where white and black play with different sets of pieces. (Recognized!)
H. G. Muller wrote on 2020-04-28 UTC

I did some more work on the CwDA version of KingSlayer, and finally put the source code online. The latest version now also supports the Daring Dragons army. This was not a trivial addition; this army needed several unusual features that were not implemented yet. For one, the Dragoons (KimN) need a divergent virgin move, and neither divergence nor virgin moves were implemented (other than in the hard-coded Pawn). The Wyvern has a ski-sliding move, which thoroughly affects the way we have to test for check, and what evasions to generate. It introduces a new mode of checking (which I call 'tandem check'), which is a double check where both checks come from the same direction. These cannot be cured by capture of the checker, but unlike a normal double check, they can be cured by interposition.
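To make the difference between the checking modes concrete, here is a minimal sketch of which kinds of evasion are worth generating in each case. It is not KingSlayer's actual code; the `Check` record and the function name are hypothetical, and the sketch assumes each check is described only by the direction it comes from and whether there is an empty square on its path.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Check:
    """Hypothetical description of one check on the King."""
    direction: Tuple[int, int]  # step from the King towards the checker
    blockable: bool             # True if an empty square lies on the checking path


def evasion_kinds(checks: List[Check]) -> Set[str]:
    """Return the kinds of evasion that can cure the given check(s)."""
    if not checks:
        return set()                      # not in check: nothing to evade
    kinds = {"king move"}                 # stepping away is always a candidate
    if len(checks) == 1:
        kinds.add("capture checker")
        if checks[0].blockable:
            kinds.add("interposition")
    elif len(checks) == 2:
        first, second = checks
        # Tandem check: both checks arrive along the same ray, so one
        # interposed piece can block both, but capturing one checker
        # still leaves the other check standing.
        if first.direction == second.direction and first.blockable and second.blockable:
            kinds.add("interposition")
    return kinds
```

For a normal double check (two different directions) only King moves remain, while a tandem check adds interposition but never capture of the checker.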

The Dragonfly is a tricky piece, with binding to odd or even files. It requires special evaluation to handle it well in the end-game. One of the unusual properties is that it is a 'semi-major': it can force checkmate on a bare King, but the KFK end-game also has fortress draws. Which of the two it is, is about an even call, like a promotion race in KPK: If the bare King can reach the b-file before the Dragonfly gets there, he can take safe shelter on the a-file, and it is a draw. Otherwise it is a win. From the material composition alone, you cannot make a good guess. So I put in a routine that makes a reasonable guess based on the actual locations. (Not perfect yet, as it doesn't take account of the bare King hindering the Dragonfly in its attempt to reach the b-file, or vice versa, but that only happens in a minority of the positions.)

When the weak side still has Pawns (e.g. KFKP), I classify the end-game as drawish. (But not as bad as for KBKP, where you have no chance at all.) This assumes that the Pawn can act as a sufficient distraction for the strong side that the weak side has a very good chance of reaching safety with his King in the mean time. In fact a fair number of positions in this end-game are won for the Pawn! If the Dragonfly cannot visit the file the Pawn is on, you only have the King to stop it, and the Pawn can easily be outside its reach as well. So Dragonfly endings, like Pawn endings, should really test for 'unstoppable passers' in their evaluation. (At the moment, KingSlayer doesn't do that for either, with as a consequence that it sometimes trades the last (non-Dragonfly) piece in a near-equal position, and on the next move (where it can search much deeper) sees the score dropping to -8.xx because the opponent's promotion can no longer be prevented.)
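As an illustration of what such an 'unstoppable passer' test could look like, here is a rough sketch that combines the classical 'rule of the square' for the King with the file binding of the Dragonfly. It is only a sketch under simplifying assumptions, not something KingSlayer does: the function name and coordinate convention are made up, the Pawn is taken to be white, and the initial double step, support by the Pawn's own King, and blocking pieces are all ignored.

```python
def pawn_is_unstoppable(pawn, king, dragonfly_file, stopper_to_move):
    """Rough 'unstoppable passer' test for a white Pawn against King + Dragonfly.

    pawn, king: (file, rank) with 0-based coordinates; the Pawn promotes on rank 7.
    dragonfly_file: file (0-7) the Dragonfly currently stands on.
    stopper_to_move: True if the side trying to stop the Pawn moves first.
    """
    pawn_file, pawn_rank = pawn
    king_file, king_rank = king

    # Rule of the square: compare the Pawn's distance to promotion with the
    # King's (Chebyshev) distance to the promotion square.
    pawn_steps = 7 - pawn_rank
    king_steps = max(abs(king_file - pawn_file), 7 - king_rank)
    if stopper_to_move:
        king_steps -= 1                   # the King gets a head start of one move
    king_too_late = king_steps > pawn_steps

    # A piece bound to odd or even files can never reach a file of the other parity.
    dragonfly_blind = (dragonfly_file - pawn_file) % 2 != 0

    return king_too_late and dragonfly_blind
```

For example, a Pawn on a5 with the enemy King on h1 is flagged as unstoppable when the Dragonfly stands on the b-file, but not when it stands on the c-file, from where it can still get to the a-file.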

The version I uploaded has the announcement of equal-army sub-variants commented out. With all combinations of the 5 supported armies, the list of variants in the CECP variants feature had become so long that it crashed XBoard!

I also started implementing limited configurability: it supports a variant 'custom', for which the user can specify (in a file gamedef.ini) the armies as an arbitrary selection of all the supported CwDA pieces. In addition there are two user-configurable pieces that can be selected too. These pieces can be built as an arbitrary combination of the move sets used to construct the standard CwDA pieces, plus one user-specified set of leaper moves. I am still thinking about a way to also allow specification of divergent or lame moves on these pieces. It might also be useful to allow redefinition of the set of leaps that is only used for the Charging Knight, in cases where the latter doesn't participate. And perhaps to redefine one or two slider moves, e.g. by making the range of R4 configurable, or perhaps replaceable by B3 or B4, or fB.

[Edit] I now uploaded a Windows binary of the latest version to http://hgm.nubati.net/CwDA.zip .


Aurelian Florea wrote on 2019-06-04 UTC

While watching a CPU vs CPU game of Eurasian I noticed that Vaos do not seem to care about color binding either: in the early game the color binding is compensated by the other pieces, and in the late game the lack of platforms probably damages them more.


Aurelian Florea wrote on 2019-06-01 UTC

Cool analysis, HG!...


H. G. Muller wrote on 2019-05-31 UTC

Well, it is difficult to assess whether this capability for a pair to statically create an impenetrable barrier for a King is really important. Actually I think that Wizards can just do it (on 8x8), when standing next to each other in the center. But very often pieces can inflict a 'dynamic confinement' on a King. As long as you have to spend fewer moves to maintain it than the King needs to escape, you have moves to spare for other pieces to approach. Besides, FAD complement each other in a different way: standing next to each other they completely cover a 5x6 area. As a result they can drive a King to the edge with checks, and checkmate it there, without any help. This makes them very, very dangerous.

Even a King + Bishop can dynamically confine a King on boards of any size. The King has to cover the hole through which the opponent threatens to escape, and has to follow the bare King as long as it keeps running in the same direction to renew the escape threat. But when it reverses direction, to try an escape on the other side (which he eventually must, as he bumps into the edge) you have one free move. Therefore a Bishop can checkmate together with an arbitrarily weak piece (as long as that can go everywhere) on boards of any size.


Aurelian Florea wrote on 2019-05-30 UTC

@HG,

Also there is another effect that amplifies the pairing bonus or color binding penalty: the pair being able to block the King from part of the board, the same way Rooks do on their own. Bishops do that. Two Dabbabahriders do that, and they only cover half the board between themselves anyway. Wizards or FADs do not.


Aurelian Florea wrote on 2019-05-29 UTC

Also there is the case of Bede and WAD on different shades, which work a bit awkwardly but do work together fine. Probably stronger than a Charging Rook + Fibnif or Waffle + Short Rook. Many Pawns would help the CC pair a lot. But ChessV for example knows such tricks. I did watch some games.


H. G. Muller wrote on 2019-05-29 UTC

Well, this is the whole point of making KingSlayer play CwDA: its playing algorithm can take the effects of color binding into account. But it still requires some thought on what exactly it should pay attention to. The only things I discovered about color binding so far were obtained with Fairy-Max, which doesn't take any color binding into account. It thus might under-estimate the effects. E.g. it approximates the effect of the Bishop pair bonus by making all Bishops worth more than Knights. This biases it against trading B for N in general. Which helps to preserve the B pair, (as it should), but makes it unnecessarily shy in lone B vs N situations (which should be a self-inflicted disadvantage of having a Bishop), and it doesn't prevent it from breaking up the pair by Bishop trading in a BB vs BN situation.

But it still finds an effect of about half a Pawn. I.e. B tests about equal to N, also in 'anti-pairs' (on the same shade), but a true B-pair tests as 0.5 Pawn stronger than B+N or 2N. I also did tests with more than 2 Bishops, and concluded that with 3 Bishops (divided 2:1 over the shades) you get 1 pair bonus, and with 4 Bishops (2:2) you get 2, compared to the simple addition of lone-Bishop values. While one could argue that the number of pairs is 2 and 4, respectively, in those cases.

There is a completely different interpretation of this data, not in terms of a pair bonus, but of a binding penalty. With Kaufman values B=N=325, and the pair bonus=50, so 2B(2:0)=650, 2B(1:1)=700, 3B(2:1)=1025 and 4B(2:2)=1400. These same numbers would be obtained by setting B=350, and giving a penalty of 25 when they are not equally distributed over the shades. The remarkable thing is that the penalty doesn't seem any higher for a shade imbalance of 2 than for an imbalance of 1. So it doesn't seem to matter how much power you have on your strong shade (with non-color-bound pieces you could aim them all at the same shade anyway), but it hurts when you lack power on a shade. This would mean the magnitude of the bonus is not really dependent on the value of the color-bound piece, as it is mainly expressing the disadvantage of absence of a piece. Indeed a preliminary test with Pair-o-Max (a Fairy-Max derivative that takes pair effects into account in a primitive way) suggested that the bonus for Bede was also just 50. (Pitting 2 BD on like or unlike shade versus 2 BmW + Pawn.)
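The pair-bonus reading of these numbers can be written down directly. The sketch below uses the Kaufman-style values quoted above (B=325, pair bonus 50); counting the number of bonuses as the smaller of the two per-shade counts is my own shorthand for "1 bonus with 2:1, 2 bonuses with 2:2", and the function name is hypothetical.

```python
BISHOP_VALUE = 325   # lone Bishop, equal to a Knight
PAIR_BONUS = 50      # bonus per Bishop that has a partner on the other shade


def bishops_value(light: int, dark: int) -> int:
    """Total value (centipawns) of a set of Bishops split over the two shades."""
    bishops = light + dark
    pairs = min(light, dark)      # 2:1 gives one bonus, 2:2 gives two
    return BISHOP_VALUE * bishops + PAIR_BONUS * pairs


# The four configurations quoted in the text: 2:0, 1:1, 2:1 and 2:2.
for light, dark in [(2, 0), (1, 1), (2, 1), (2, 2)]:
    print(f"{light + dark}B({light}:{dark}) = {bishops_value(light, dark)}")
```

This reproduces the quoted totals of 650, 700, 1025 and 1400.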

The situation in the Clobberers army should be pretty much like the 4B(2:2) case; after trading one BD or FAD you incur the penalty, which you lose again after you then trade BD or FAD on the opposite shade (making that effectively worth 50 less than the first), but which you would keep after trading the second of the same shade (effectively giving that the 'average' value). This is how KingSlayer treats it now.

But pair bonuses / binding penalties are relevant in the middle-game; in the late end-game you could be in a much graver danger than the penalty suggests, vulnerable to tactics that would destroy your mating potential. Like sacrificing a Rook for the piece on the 'minority shade' in a 2:1 situation. (Similar to what makes KBNN-KR a draw in FIDE, while KBBN-KR is a general win.) But this weakness would only be fully exploited if the defending engine would know about it; otherwise it would just randomly trade the Rook for a member of the pair that threatens checkmate, with a 50% probability that it leaves a 1:1 distribution, and will be checkmated later anyway. (Like that it should know in KBNN-KR that it should leave NN, and not BN.) Failing to fully exploit an advantage might lead to underestimation of the value of that advantage.


Aurelian Florea wrote on 2019-05-29 UTC

@HG,

But the issue of a game with different armies where one player has more color-bound pairs of pieces is a rather difficult one. The more color-bound side has stronger pieces (in order to compensate for the color binding). The issues you mentioned are also strongly related to the fact that the playing algorithm does not understand it. If it does, then it will play differently. But the problem does not go away this way either, as the game is now reduced to whether early middle-game tactics work for the color-bound side. And from a game design point of view, frankly, this is not much. It lacks complexity.

I'm wondering whether it would work better if the more color-bound army had weaker color-bound pieces than their counterparts in the other army, and then compensated through the rest of the army. Or if the color-bound army just had more pieces, even if they are individually weaker. Even if this is contrary to Betza's game. This last case also has problems, though, as the army with more pieces needs more time for coordination.

So the issue you raise is not that simple in its depths! And quite likely something that people on the musketeer chess website have not fully considered!


Aurelian Florea wrote on 2019-05-28 UTC

@HG

Your analysis is much deeper (although treating only a niche of the problem) than any of those made by the guys from musketeer chess!...


H. G. Muller wrote on 2019-05-28 UTC

Indeed, these asymmetric variants from the musketeer.net website are very unbalanced. Sometimes as badly as playing 6 minors against 6 Rooks in FIDE.

I discovered that the generalization of 'unlike Bishops' in KingSlayer's drawishness detection is not satisfactory. I had it only kick in when both sides have a single piece (plus Pawns, possibly differing by 1 or 2 in number), and both these pieces are color bound. But from watching games with the Clobberers I noticed it still stumbles in completely hopeless draws with a huge 'naive' advantage. E.g. there was a game where it had Bede and Fad on the same shade, plus an extra passer, versus a Half Duck. All the opponent's Pawns were on the safe shade, ('passively' blocking his own, i.e. without the possibility to offer trades or a majority to create new passers), and the enemy King was blocking the passer on a safe square. All the Half Duck had to do to block all progress was neutralizing any King attacks on its Pawns. Which it could easily do sitting on the safe shade, through its F and D moves. A single Bishop on the safe color (which can also protect from a safe distance) would also have done.

So I guess any situation where you have only two like-shaded color-bounds plus Pawns should be classified as drawish when the opponent has a piece with significant diagonal power (so it can keep a Pawn protected against King attack) that is not bound to the same shade as the attacker. Under some conditions a Ferz would even do, e.g. when a Pawn and the Ferz mutually protect each other, and block two opponent Pawns, while the King blocks the third (which is a passer). Tempo moves can be done with King or Ferz, depending on which of the two is far away from the attacking King. No way Bede + Fad + 3 Pawns would be able to beat Ferz + Pawn. While the naive advantage would be about +9 (Bede, Fad being worth 4-4.5, Ferz 1.5 Pawn)! Of course there is no Ferz in CwDA, but there are pieces with F moves. (They are of course worth a bit more, but then you are still at +7 instead of +9.) A or D moves could sometimes do too, when two connected Pawns and the piece cyclically protect each other (although with D moves you can then only block two Pawns, rather than three).

So end-games with same-shade color bounds can also be very drawish even with many Pawns, even when you are ahead not just in Pawns but also in pieces. I guess these must be heavily discounted in order to play well with or against the Clobberers. Having a Knight as defending piece would probably not do very well, though, due to its color alternation. So it would depend on what pieces exactly the opponent has.
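The rule of thumb proposed above could be sketched as follows. This is only an illustration of the condition as stated, not KingSlayer code: the `Piece` record, the field names and the function name are hypothetical, and the 'significant diagonal power' flag is assumed to be assigned by hand per piece type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Piece:
    """Hypothetical description of a non-royal, non-Pawn piece."""
    name: str
    shade: Optional[str]     # "light" or "dark" if color bound, None otherwise
    diagonal_power: bool     # can it keep a Pawn protected along a diagonal?


def same_shade_drawish(attackers: List[Piece], defenders: List[Piece]) -> bool:
    """True if the strong side has only like-shaded color-bound pieces and the
    weak side has a diagonal mover that is not bound to that same shade."""
    shades = {piece.shade for piece in attackers}
    if len(shades) != 1 or None in shades:
        return False                      # no pieces, mixed shades, or an unbound piece
    (attack_shade,) = shades
    return any(d.diagonal_power and d.shade != attack_shade for d in defenders)


bede = Piece("Bede", "light", True)
fad = Piece("Fad", "light", True)
half_duck = Piece("Half Duck", None, True)
knight = Piece("Knight", None, False)

print(same_shade_drawish([bede, fad], [half_duck]))  # the game described above
print(same_shade_drawish([bede, fad], [knight]))     # a Knight does not qualify
```

A Bishop on the attackers' own shade does not trigger the rule either, since it is bound to the same shade as the attacker.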


Aurelian Florea wrote on 2019-05-20 UTC

This link with different armies opposing the black orthodox army from the musketeer chess website could be of interest. To me it seems that not much effort has been put into balancing the 2 armies.

http://musketeerchess.net/games/castellum/rules/rules.php

http://musketeerchess.net/games/castellum/rules/rules-marsu.php

http://musketeerchess.net/games/castellum/rules/rules-jumpers.php


H. G. Muller wrote on 2019-05-14 UTC

End-games: more armies

The Nutters

I adapted FairyGen to handle also two-fold symmetry (at the expense of the EGT being twice as large, and generation twice slower). This was a bit tricky, as this required distinction between retrograde and prograde moves, and flipping the orientation of the black pieces (neither of which was needed with 4-fold symmetry). But for 3-men EGT it finally gave identical results to the mating app here (which doesn't assume any symmetry). This means I could now do end-games with the Nutters majors as well. To keep everything together as an easy reference, I added the results to the tables in the previous comments.

The 4-men endings of light pieces were already interesting: it turns out the Charging Rook is very adept at beating other light pieces, much more so than an ordinary Rook. It has a general win against B, N, FAD, WA, and Fibnif single-handedly, while wins against BD and WD can in general be forced, but are then almost always cursed. I guess this success can be explained by that checkmating with Rook requires zugzwang, and will not work as long as the opponent has another piece to dump a tempo on. So you have to gain the other piece first, and in most cases this isn't any easier than checkmating (unless the additional piece is much weaker than a King, such as Ferz or Wazir), with the additional handicap that the piece can be protected by its King. Checkmating a bare King with the Charging Rook doesn't require zugzwang, however. So the mere possession of an un-involved defensive piece at a safe distance is no help. The piece must actively engage the Charging Rook, and the weaker pieces will perish in this combat. I did not calculate any 5-men EGT with Charging Rook + other vs defender where the Charging Rook would already beat the defender on its own; these should obviously be won as well.

A Charging Knight as defender behaves like a typical light piece: it loses against pairs of majors and (unlike) Bede/Fad pairs, and draws pairs of minors. Also for the Nutters, pairs of majors typically beat any single light piece. Apart from the WD the Charging Knight is the weakest major, though, and a pair of it has similar difficulties to beat a Rook, or its replacements Charging Rook and Dragon Horse. It does slightly better than the WD in this (as might be expected from the fact that it has one more move target), and has a cursed win against the Rook rather than a plain draw, etc.

The Nutters add new pairs of major + minor. These are interesting, because their ability to win depends on the possibility of the defender to choose which of the two pieces he will trade away. Charging Knight + Fibnif have similar difficulties here as Rook + Knight, against the Rook(-replacements) except Bede (which due to its color binding is apparently easy to dodge); the comparative weakness of Charging Knight compared to Rook is apparently compensated by the relative strength of the Fibnif that was already noticed before. Charging Rook + Fibnif does even better than Rook + FIDE minor, and beats almost anything, although its general wins against Rook or Charging Rook are partly cursed.

End-games with the Colonel are difficult to classify. Because of the extreme forwardness of this super-piece, the outcome will depend very much on where it is placed on the board. End-games where both players have a Colonel thus always contain a fair number of wins and losses, even if one would expect them to be draws. This even holds for the 4-men case Colonel vs Colonel: 15% of those are lost even when you have the move! (For comparison, for Queen vs Queen this is only 0.27%.) A Colonel beats most light pieces; it has mixed results against R, R4 and the Charging Rook, while the Commoner (and thus the Dragon Horse) can hold a draw against it.

The Dragons

I also added the pieces from the Daring Dragons army: Commoner, vRsN (Dragonfly) and BW (Dragon Horse). This didn't really require any modification of the existing code for 4-fold symmetry, but to make a more accurate judgement on end-games containing more than one Dragonfly I put in some code to split the statistics into 'like' and 'unlike' cases of the Dragonfly's special form of color-binding, and only report the result for the unlike pair here (as that is what you start with, and it is not a likely promotion choice).

The BW is quite strong, which should not be surprising, as its middle-game piece value is also slightly above that of a Rook. As a defender it can stand up to the Bede/Fad pairs in addition to pairs of other minors, probably because the Bede cannot easily sneak up on it from a diagonal, as it can against a Rook. (Note two FAD, which lack the distant diagonal attacks, can also not beat a Rook.) The BW is upward compatible with the Commoner, so in cases where a pair containing a Commoner already wins, replacing that Commoner with a BW should win even more easily, and no EGT for these end-games were generated.

The Commoner (once under protection of its King) can keep a draw against a Queen and an Archbishop, because it cannot be approached by the enemy King. The Chancellor beats it, however. FairyGen cannot handle the ski-slide of the Wyvern yet.


H. G. Muller wrote on 2019-05-08 UTC

End-games part 3: Super-pieces versus a pair

This is a very murky problem. I have generated the relevant 5-men EGT, but they seem very hard to interpret. Take for example Queen vs Bishop + Knight. This has 98.56% of all positions won when the Queen has the move (including 40.42% immediate King capture). The weak side is lost in 49.08% of the positions where it has the move, 28.44% of such positions are instant wins by King capture (so really illegal positions, that one could choose not to count). And 17.88% are wins by other means, which has to mean in this case gaining of the Queen and a subsequent mate with Bishop + Knight (obtained from generating the reverse EGT), or (rarely) a checkmate with the Queen still on the board. Almost all of these (98%) capture the Queen (or mate) on the first move, and none in more than 5 moves. These should not really be counted as Q vs B+N, they are tactically non-quiet positions in the process of converting to a simpler end-game. The remaining 4.61% of the positions with the weak side to move must be draws.

This looks as much like a general win as one could hope for. Nevertheless it is well known that B + N can make a 'fortress' that even resists the onslaught of an Amazon (Ka1, Bb2, Nd4). The resulting fortress draws are hidden in those 4.61% (which amounts to 8.5% after disregarding the illegal and non-quiet positions). So in most cases the end-game in a quiet position (where chess engines evaluate) would be a win for the Queen, so it seems reasonable not to excessively discount it. (The factor 2 applied to all pawnless advantages would already do justice to the difficulty of winning this, as the 'raw' advantage is equivalent to a single minor, which after discounting translates to 1.5 Pawns, which is only marginally above the threshold for winning advantages.)
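The renormalization behind that last percentage is simple enough to spell out. The snippet below just redoes the arithmetic with the Q vs B+N figures quoted above (weak side to move); the variable names are mine.

```python
# Q vs B+N, weak side to move, as percentages of all positions.
lost = 49.08            # Queen side eventually wins
king_capture = 28.44    # illegal positions: the King can be captured at once
tactical = 17.88        # Queen is gained (or mate delivered) by the weak side
draws = 4.61            # what remains, including the fortress draws

# Quiet, legal positions are those that are neither illegal nor tactical.
quiet = 100.0 - king_capture - tactical

draw_share = 100.0 * draws / quiet
win_share = 100.0 * lost / quiet
print(f"draws among quiet positions: {draw_share:.1f}%")
print(f"wins among quiet positions:  {win_share:.1f}%")
```

The draw share comes out close to the 8.5% quoted, with the rest of the quiet positions being wins for the Queen.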

This makes it impossible to avoid the fortress, however. The problem with fortresses that are not recognized by the evaluation is that the engine continues to count itself rich for the almost indefinite duration the defender can maintain the fortress (until the 50-move rule puts an end to it, but that will be seen only after 100 ply, way beyond the horizon when you first enter the fortress). The alternative is to always discount end-games that contain a fortress draw heavily. That would be wrong in the majority of cases, but the won cases will eventually convert to another end-game (KQ-KB or KQ-KN), or checkmate outright. And once this gets within the horizon the score will be corrected. Basically this puts the 'burden of proof' that an end-game with a fortress draw is a win on the winning side, even when it is the most likely case, because that case is easier to prove. E.g. 26.75% of all positions (=49% of the quiet ones) convert in 5 moves or less, and the search can presumably find that. This still leaves more cases where it is in error than just ignoring the fortress, though. In addition to such a 'passive' fortress there can also be draws due to perpetual checking. But these usually lead to repetitions quickly, so that the search has no difficulty recognizing those without any special discounting.

It is kind of hard to devise a satisfactory algorithm here without actually probing the EGT, or putting in dedicated code to recognize the fortress. The latter doesn't seem feasible for CwDA, where in most end-games we really have no idea at all whether there is a fortress or not, let alone how it looks. When embedding a single exotic piece in, say, a FIDE context, it does seem feasible to generate the Q vs 2 minors EGTs (6 of those, for all combinations of B, N and the exo-piece) in advance. Even an uncompressed 5-men EGT only takes 160MB, so with today's memory sizes a number of those can easily be kept in memory (possibly shared between several instances of the engine).

Fortunately in many cases of super-piece versus a pair of light pieces the discounting is not really important, because the 'raw' advantage is already pretty small to begin with. E.g. with Q vs 2R the difference is only 0.5 Pawn in favor of the Rooks, and for Q vs R+B it is only 1.25 in favor of the Queen. And the general factor 2 penalty for pawnlessness already would reduce that to 0.25 and 0.625, respectively. So it would always shy away from these end-games in favor of an advantage of a healthy Pawn, even when they are not listed as drawish. The drawishness discounting is only important for end-games that have a large raw advantage, possibly only super-piece vs pairs of the weakest minors B, N, WA, WD and Fibnif.
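A minimal sketch of the pawnless discount, using the two examples above; the function name and the way the 'has Pawns' condition is passed in are assumptions for illustration only.

```python
PAWNLESS_FACTOR = 2.0   # general penalty factor for advantages without Pawns


def discounted_advantage(raw_advantage: float, winning_side_has_pawns: bool) -> float:
    """Scale down a raw material advantage (in Pawns) when no Pawns are left."""
    if winning_side_has_pawns:
        return raw_advantage
    return raw_advantage / PAWNLESS_FACTOR


q_vs_2r = discounted_advantage(0.5, False)    # 2R vs Q, in favor of the Rooks
q_vs_rb = discounted_advantage(1.25, False)   # Q vs R+B, in favor of the Queen
healthy_pawn = discounted_advantage(1.0, True)

print(q_vs_2r, q_vs_rb, healthy_pawn)
```

Both discounted scores stay below the value of a single healthy Pawn, which is why the engine would prefer the Pawn-up alternative anyway.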

I will publish a table here when I have figured out how to best present the calculated statistics.

[Edit]

I made a useful addition to my EGT generator: when it is done generating the normal statistics for a 2-vs-1 end-game, it declares all drawn positions in the successor 2-vs-0 and 1-vs-0 end-games a win, and then continues generating from there, effectively calculating whether King-baring can be forced (and in how many moves). This is a great help in investigating end-games like KQ.KBN, by generating the 'reverse' end-game KBN.KQ with King-baring victory. That makes it possible to recognize draws achieved by trading B or N for Q, which otherwise would show up as draws, indistinguishable from any fortress draws with all material, but now are reported as wins. This leads to the conclusion that almost all draws in KQ.KBN are due to shallow tactics that lose the Q against one or both minors: of the legal positions with the weak side to move only 0.14% are fortress draws. The known fortress is apparently very difficult to reach. This is in sharp contrast to Q vs two WD, which has 46.93% wins (38.12% converting within 3 moves), 28.5% forced losses of Q or K (the large majority in 1 move), leaving 24.57% for fortress draws. Indeed the WD pair has a huge capacity for setting up fortresses: a mutually protecting pair can confine the enemy King on boards of any size, trapping it behind the file or rank they are on. You either gain one of the WD by checking/forking before they connect, or it will be a dead draw. Such end-games deserve heavy discounting, as the search (using check extension and capture search) will easily find the won or lost cases. Queen vs two WA has rather similar statistics, although I don't have a clue as to how the fortress looks there.

[Edit 2]

OK, I finally compiled a table, by combining info from the super-piece vs pair end-games themselves, the reverse end-games, and the reverse end-games under the baring rule. I extracted the info from the positions with the pair on move. This shouldn't really paint a different picture from when the super-piece was on move, except that in the latter case the large majority of positions (>80%) captures a hanging piece on the first move, altering the material balance from the intended one, so that the interesting results are much diluted there. Of course when such a capture does not happen, the other player gets to move, with the statistics presented here.

I only considered end-games where the advantage based on piece values would be large enough to reasonably suspect it could be a win even in the absence of Pawns.

The table lists 6 numbers, all percentages:
1) win by shallow tactics (conversion in first 3 moves)
2) win by deep tactics (conversion in move 4-6)
3) lengthy wins
4) fortress draws
5) forced loss of super-piece (or checkmate)
6) immediate loss through King capture

     Q                  C                      A                     Colonel
NN   26-4-13-11-21-25   21-10-25-.1-19-25      9-5-2-40-19-25        16-9-14-15-20-25
BN   21-7-21-.1-23-28   18-10-21-1-22-28       6-2-15(~8)-27-22-28   10-7-8-23-23-28
BB   15-7-18-1-24-35    14-5-20(~6)-.1-26-35   3-.2-0-38-24-35       7-3-4-25-26-35
XX   31-4-5-15-19-26    26-9-14-5-20-26        10-8-3-32-20-26       16-9-9-20-21-26
FX   21-5-5-19-21-29    18-6-4-20-22-29        7-4-2-36-22-29
FF   22-7-1-14-23-34    29-5-1-15-26-34        6-2-1-2-54-34
WW   27-5-2-18-20-28    23-11-3-14-21-28       12-9-2-28-21-28       15-9-4-11-32-28
II   30-5-4-16-20-26    23-9-5-17-20-26        9-8-5-32-20-26        16-7-3-28-21-26
YY   18-4-1-13-27-37    11-8-3-15-26-37        6-3-1-19-34-37        8-5-1-7-42-37
KK                      15-4-2-30-20-28                              6-5-2-13-45-28

The relevant statistics for classifying the end-game are the third and fourth numbers of each entry (highlighted in bold in the original table). (Note '.1' means 0.1!) These are the lengthy (i.e. non-tactical) wins versus the fortress draws. The other cases resolve fast enough to simpler end-games for the engine to base the score on static evaluations outside this end-game. A smart evaluation strategy for these end-games could be to initially classify them as a (pawnless) win, but for those that are mainly fortress draws increase the discount factor to a drawish value when the 50-move counter goes up, reflecting the observation that when you cannot make a winning exit from the end-game in the first 3 moves, your chances for a win will be pretty bleak. When looking ahead from end-games with a single Pawn in jeopardy (e.g. Q+P vs F+2X) they should be treated as drawish, as after sacrificing X or F for P the remaining F and/or X will typically be tactically safe (or they would have been picked off before).
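For anyone who wants to process the table entries mechanically, the following sketch parses one entry and compares the lengthy wins with the fortress draws. The helper names are made up, and the "mostly fortress" criterion (fortress draws outnumbering lengthy wins) is just one simple way to read the table, not a rule taken from KingSlayer.

```python
import re
from typing import List


def parse_entry(entry: str) -> List[float]:
    """Split an entry like '6- 2-15(~8)-27-22-28' into its six percentages."""
    entry = re.sub(r"\(~[\d.]+\)", "", entry)     # drop the cursed-win annotation
    return [float(part.strip()) for part in entry.split("-")]


def mostly_fortress(entry: str) -> bool:
    """True if fortress draws (4th number) outnumber lengthy wins (3rd number)."""
    fields = parse_entry(entry)
    lengthy_wins, fortress_draws = fields[2], fields[3]
    return fortress_draws > lengthy_wins


print(mostly_fortress("26-4-13-11-21-25"))   # Q vs NN
print(mostly_fortress("9- 5- 2-40-19-25"))   # A vs NN
```

The parser accepts entries with or without the padding spaces, and reads '.1' as 0.1.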

The Archbishop vs two Fads sticks out because in 54% of the cases the Fads can force capture of the Archbishop. (More typically the chances to force super-piece capture are only 20-25%.) One should not conclude from this that the game is mostly won for the Fads, though. The Archbishop is only rarely captured without compensation, and even trading it for a single Fad leaves no mating potential, and thus causes an instant draw. Only 7.46% are genuine losses (Archbishop lost without compensation, or an immediate checkmate). The Fads do dominate the game, however. Where in the other end-games gaining the super-piece in almost all cases happens on the first or second move, here that happens in only 10% of the cases, and takes on average 25 moves otherwise (worst case even 57 moves). The Fads will just methodically tighten the mating net around the enemy King, keeping their own King safe from perpetual check, and at some point the mate can only be averted by sacrificing the Archbishop.

In two cases (A vs B+N, C vs B-pair) a large fraction of the lengthy wins was cursed, and the table mentions the number of cursed wins in parentheses. We see the Archbishop doesn't perform very well; the only case where it has a good number of wins is against B+N (which is the weakest defending combination). A Queen beats the FIDE minors; even the pair of Knights, which still puts up a fight, manages to reach a fortress in less than half the cases, after disregarding all initial tactics. It doesn't manage to beat any pair from the other armies, though. The Chancellor does better: it also beats two WA, and thoroughly crushes the pair of Knights, but has some difficulty with the B-pair because the wins take too long.

[Edit 15-4-2019]

The Colonel is also weak, and only has some success against a pair of Knights. But because it is quite poor in delivering perpetual check, it actually runs a large risk of losing against pairs of majors, where sacrificing it for one leaves a lost 3-men ending. Even against the weak ones, where the piece values suggest it has an advantage (Woody Rook, Commoner and Dragonfly). The large part of the forced conversions against these pairs are indeed mostly losing conversions, and especially for the Commoners most of these are lengthy.


H. G. Muller wrote on 2019-05-07 UTC

End-games part 2: Super-pieces

The super-pieces are in general so much stronger than the light pieces, that they will almost always beat the latter in a 1-to-1 situation. Only the strongest light piece (Rook) manages to hold a draw against an Archbishop, while its result against a Chancellor is a bit unclear. (The Chancellor can win if its King is already advanced so much that the Rook cannot cut it off at a safe distance from its own King, so that the Chancellor can attack it with its N move while checking with its R move, which is the case in a fair fraction of all possible positions.) The general win of Archbishop vs HFD is mostly cursed.

More interesting are the 5-men end-games where both players have a super-piece (which in itself would be a general draw in all cases), to see whether an extra light piece can tip the balance. Unnatural pairs are not so unlikely here, as promoting to the super-piece the opponent starts with should be reasonably common. To be complete I also generated EGT for the 'impossible pairs', where the light piece did not belong to the army of either super-piece, because there were not that many, and some of those can occur in Seirawan Chess.

It is a bit tricky to interpret the statistics of super-piece end-games; their capacity for initial tactics that would alter the intended material balance is enormous. And even in generally won positions there will be many draws due to perpetual checking. If I had a Xiangqi-style EGT generator it would detect perpetual checking and count it as a loss (so that I could judge its importance by comparing with the stats of normal generation), but alas... In theory it would also be possible to count draws through forced conversion to a non-lost end-game, e.g. by forking King + Chancellor by an Archbishop (possibly after some checks) and trade (or gain) it, by making that the 'winning' goal for the defending side in the table with all material present (as this would count as tactically non-quiet positions). But my generator doesn't do that either. How much of such tactics is possible depends very much on the blind spots pieces have w.r.t. attacks of the opponent pieces, so it is hard to say what is 'normal' for a general draw or a general win, and even more difficult to recognize end-games that are part win, part draw.

I compiled the following table, which should be read as that the piece in the upper margin should team up with the first piece mentioned in the left margin, to beat the second piece mentioned there.

C = RN
A = BN

?  = probably only partially won

       WA FvN  WD  N   B  FAD  BD vRsN K   N' HFD  R4  R'  R
none   =   =   +   =   =   =   =   +?  +   +   +   +   +   +

Q-A    +   +   +   +   +   +   +               +   +       +
C-A    +   +   +   +   +   +   +               +   +       +
Q-C    =   ?   +   =   +   +   +               +   +       +
Q-Q    =   =   +   =   =   =   =   ~?  +   +   +   +   +   +
C-C    =   ~   +   =   =   +   +   ?   +   +   +   +   +   +
A-A    =   =   +   =   =   =   =   +   +   +   +   +   +   +
C-Q    =   =   +   =   =   =   ?               +   +       +
A-C    =   =   ~   =   =   =   =               +   +       +
A-Q    =   =   =   =   =   =   =               +   +       +

We can see that the super-pieces are not equally strong, but that mating potential of the extra piece in general is sufficient to preserve the win no matter which super-pieces are added, even if the extra piece teams up with the weaker one. The exception is the WD, which is rather minimal for a piece with mating potential. This is not able to overcome the Archbishop vs Queen disadvantage, while with Archbishop against Chancellor the win only seems partial, and then most of it is spoiled by the 50-move rule.

The minors show a more varied behavior. With equal super-pieces, or teaming up with the weaker one, they tend to preserve the draw. It is apparently too difficult to avoid trading of your super-piece against an equal or superior one. The exception occurs with Chancellors. These seem unusually good in cooperating with other pieces (which might have to do with their well-known unusual adeptness at perpetual checking): the pure advantage of Bede or Fad secures a win, and even together with Fibnif it makes a remarkable attempt (partial win, if it were not almost entirely cursed; worst case takes 154 moves!) Together with Bede (the strongest minor) it even gets a partial win against the (stronger) Queen. Together with a better super-piece the Bishop, Bede and Fad are good for a win, and the Knight, WA and Fibnif are if the weaker super-piece is the Archbishop.


Aurelian Florea wrote on 2019-05-07 UTC

Indeed, great work!...


Greg Strong wrote on 2019-05-06 UTC

Wow.  Great work!  Very interesting.


H. G. Muller wrote on 2019-05-06 UTC

End-games: light pieces

The table below gives an overview of some 5-men CwDA end-games, based on the statistics of generated End-Game Tables. I don't have a generator that can handle pieces with only 2-fold symmetry, but a special build of FairyGen can handle 4-fold symmetry, so I did include the Fibnif as the only Nutters piece. CwDA armies consist of a super-piece worth 2.5-3 typical minors (such as Knights), and 3 pairs of 'light' pieces worth 1-1.5 minors. In FIDE the Rooks really stand out amongst the latter; in the other armies the pieces are closer in value, only 1 piece being of Knight strength, the other two lying somewhere in between Knight and Rook.

These pieces can be divided into majors and minors, depending on whether they are able to force checkmate onto a bare King. All light pieces of the Clobberers are minors, all of the light Rookies are majors. The Nutters have one minor, FIDE has two. Of all these minors, the Knight is the only one that cannot checkmate as a pair; for the Clobberers the heterogenous pair Bede + Fad cannot checkmate if they are on the same square shade. All other pairs of minors from the same army can force checkmate. Even all 'unnatural' pairs (which can in theory be obtained by promotion) can force checkmate, provided that pairs of color-bound pieces (Bede, Fad, Bishop) are on unlike shades.

The difference in strength between the light pieces is usually not enough to force a win in a 1-vs-1 situation. Somewhat exceptional are Rook vs WA (which would be a general win if it were not for the 50-move rule; as it is the win is cursed) and Rook vs Fibnif (where the result is unclear; a Fibnif is easily confined by a Rook, and in positions where it is separated from its King it can probably be chased to doom). Of course only the major pieces can hope for a win, in these situations.

Because of their closeness in value, I treated the light pieces as a single group, and generated all EGT of a natural pair versus a single one. Each army has 6 natural pairs, but for the Nutters I could only handle the pair of Fibnifs, so 19 pairs in total. I did not bother with a pair of Knights, as these cannot even win without opposition. I also did not bother with a pair of Rooks, as a pair of R4 could already beat any opponent. Each of the 17 remaining pairs was pitted against the 10 light pieces, 170 combinations in total. This gave the following result.

R = Rook
B = Bishop
N = kNight
D = BD      (beDe)
F = FAD     (Fad)
X = WA      (phoeniX)
S = R4      (Short rook)
H = HFD     (Half duck)
W = WD      (Woody rook)
N'= fhNbFbW (charging kNight)
R'= fsRbFbW (charging Rook)
I = FvN     (fIbnif)
K = non-royal King
Y = vRsN    (dragonflY)
O = BW      (dragon HOrse)

+  = general win
=  = general draw
~  = cursed general win
~? = half-cursed general win
+? = mostly won, but lots of fortress draws
?  = mixed win/draw
?~ = mixed, and about half the wins cursed
*  = already won without the second piece

       X   I   W   N   B   F   D   H   S   R   K   Y   O   N'  R'
XX     =   =   =   =   =   =   =   =   =   =   =   =   =   =   =
BN     =   =   =   =   =   =   =   =   =   =   =   =   =   =   =
FX     =   =   =   ~? ~/=  =   =   =   =   =   =   =   =   =   =
BB     +   =   =   ~?  =   =   =   =   =   =   =   ~   =   =   =
II     +   ~   =   +   =   =   =   =   =   =   =   =   =   =   =
DX     =   ~   =   +  +/= ~/=  =   =   =   =   =   =   =   =   =
YY     +   +   +   +   +   +   +?  +   =   =   +   =   =   +   =
WW     +   +   +   +   +   +   +   +   +   =   +   +   ?   +   ?
N'N'   +   +   +   +   +   +   +   +   +   ~?  +   +   ~?  +   +
N'I    +   +   +   +   +   +   +   ~   ~   =   +   +   =   +   =
RN     +   +   +   +   +   +   +   ~?  ~   =   +   +   ~   +   =
RB     +   +   +   +   +  +/+ +/+  ~   +   =   +   +   =   +   =
FF     +   +   +   +   +   +   +   +   +   =   ?   +   =   +   =
R'I    *   *   +   *   *   *   +   +   +   ~   +   +   +   +   ~? 
KY     +   +   +   +   +   +   +   +   +   =   +?  +?  ?   +   ?~
DF     +   +   +   +   +   +   +   +   +   +   +   +   =   +   ?
DD     +   +   +   +   +   +   +   +   +   +   +   +   =   +   +
KK     +   +   +   +   +   +   +   +   +   +   +   +   +   +   ?
HW     +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
SW     +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
HH     +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
SH     +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
SS     +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
R'N'   *   *   +   *   *   *   +   +   +   +   +   +   +   +   +
R'R'   *   *   +   *   *   *   +   +   +   +   +   +   +   +   +
OY                                     +   +   +   +   +       +
OK                                     +               +       +

We see that the Bede and Fad, despite their lack of mating potential as an individual, form quite strong pairs. This is probably because they are able to drive an unprotected King to checkmate with checks, in a way reminiscent of the 'hand-over-hand' checking of a pair of Rooks. This makes it hard even for a Rook to harass the pieces, threatening to trade and destroy the mating potential, which is the usual way in which pairs of minors fail to win. So in first approximation a pair wins if both members have mating potential (so that trading any of them will not rescue the defender), or if they are Bede/Fad pairs of unlike color, while other pairs of minors draw against any opposition.

The case major + minor only occurred in FIDE here, (R+B and R+N), as Clobberers have no majors, Rookies have no minors, and for Nutters I could not handle the majors. Because the major is relatively strong in FIDE, only a defending Rook can truly measure up to it; any other defender is so much weaker that adding even a 'standard minor' tips the balance. Against R4 or HFD, however, it takes too long to force the win, and the latter is cursed in almost all, or about half the cases. For Rook + Bishop vs Bede or Fad it doesn't matter if the defender is on like or unlike shade w.r.t. the Bishop.

Of the pairs of minors Bede + WA stands out: it in general beats a Knight, and a Bishop when it is on the same shade as the Bede. The win they in general have against a Fad on the Bede shade, or a Fibnif, is almost always cursed. They cannot beat a WA (which is probably the weakest defender in such end-games), but beating an equal piece is always more difficult, as you cannot attack it without offering it an opportunity to trade. The Bishop pair and a pair of Fibnifs (like a lone Rook) can beat the WA. The pair of Fibnifs is surprisingly strong: it can also beat a Knight. For the Bishop pair it takes so long to beat a Knight that the win is cursed more often than not. A win of two Fibnifs against one is very cursed (it takes on average 90 moves), but in view of the remark above it is amazing that it can force such a win at all. That the Fad is just a bit weaker than the Bede is also demonstrated by the fact that the wins Bede + WA have against Bishop and Knight turn into cursed wins when the Bede is replaced by a Fad.

[Edit 14-5-2019] The Nutters and Dragons pieces were added to the table.


H. G. Muller wrote on 2019-05-05 UTC

A pair of WA is a general win. The rule of thumb is that one of the minors must be able to move from c1 to a1 (or their symmetry equivalents) in three moves (more precisely, for divergent or asymmetric pieces an uncapture, a move and a capture). A WA can do that (c1-c2-c3-a1), and can thus inflict a corner mate (moving c2-c3) with its King on b3 after the other minor has driven the bare King with check from b1 to a1. Furthermore, edge mates can be forced when one minor can 'fork' a1 and c1 at the same time, and the other minor can move from c1 to b1 in three moves. But that doesn't work if the forking piece has to be on b3 (as a Knight would have to be), where it would collide with the King.
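The c1-to-a1 part of this rule of thumb can be checked mechanically for simple pieces. Below is a minimal sketch, assuming a symmetric, non-divergent pure leaper on an otherwise empty 8x8 board (so it ignores the Kings and the uncapture/move/capture refinement). The helper names `reachable_in` and `passes_three_move_rule` are made up for this illustration.

```python
FILES = "abcdefgh"

def sq(name):
    """Convert a square name such as 'c1' to (file, rank) indices."""
    return FILES.index(name[0]), int(name[1]) - 1

def reachable_in(start, leaps, n):
    """Squares reachable from `start` in exactly `n` leaps on an empty 8x8 board."""
    frontier = {start}
    for _ in range(n):
        frontier = {
            (f + df, r + dr)
            for (f, r) in frontier
            for (df, dr) in leaps
            if 0 <= f + df < 8 and 0 <= r + dr < 8
        }
    return frontier

# Leap vectors as (file step, rank step).
WAZIR = [(1, 0), (-1, 0), (0, 1), (0, -1)]
ALFIL = [(2, 2), (2, -2), (-2, 2), (-2, -2)]
KNIGHT = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]

def passes_three_move_rule(leaps):
    """True if the leaper can go from c1 to a1 in exactly three moves."""
    return sq("a1") in reachable_in(sq("c1"), leaps, 3)

print("WA:", passes_three_move_rule(WAZIR + ALFIL))
print("N: ", passes_three_move_rule(KNIGHT))
```

The WA passes (one route is c1-c2-c3-a1), while the Knight fails because it changes square shade on every move and c1 and a1 have the same shade.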

As to the level of ambition: perhaps I should start indeed a bit more simple. The general scheme is to discount a pawnless advantage by a factor 2 even if it still is a win (except for known easy wins such as KQK and KRK), to properly reflect the relative difficulty of the win. But known general draws should be discounted much more, e.g. by a factor 8 if there still is some hope, or even 16 or infinite if it is a truly dead draw. (A factor 16 would even shrink the KNNK advantage to much less than a Pawn, and when that would still make it the best option the alternatives will almost certainly offer no hope for a win either.)

That leaves room for discounting end-games with a single Pawn by a 4 times smaller factor, when the opponent can afford to sac a piece for that Pawn to leave the pawnless general draw. Such a sac typically increases the advantage from +1 to +3, but the relative factor 4 makes the latter +0.75, so the leading side would be biased against allowing the sac. The remaining discount factor (2 or 4) would still discourage converting to such end-games, e.g. by trading Pawns in KBNPPKBNP.
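The arithmetic of this scheme can be written out as a small function. This is only a sketch of the idea as described here, not KingSlayer code: the function name `discounted_score` and its parameters are hypothetical, and `pawnless_factor` is assumed to be the discount (8 or 16) of the pawnless general draw that the sac would lead to.

```python
def discounted_score(raw_advantage, pawns_of_leader, pawnless_factor):
    """Scale a raw material advantage (in Pawn units) for drawishness.

    pawnless_factor: assumed discount of the pawnless ending (8 or 16
    for a general draw). With a single Pawn left the discount is
    4 times smaller; with more Pawns no discount is applied here.
    """
    if pawns_of_leader == 0:
        return raw_advantage / pawnless_factor
    if pawns_of_leader == 1:
        return raw_advantage / max(1.0, pawnless_factor / 4)
    return raw_advantage

# Leader is +1 with one Pawn; the defender sacs a piece for that Pawn,
# which makes the raw advantage +3 but leaves a pawnless general draw.
before = discounted_score(1.0, 1, 8)
after = discounted_score(3.0, 0, 8)
print(before, after, after / before)
```

With a pawnless factor of 8 the position scores 0.5 before the sac and 0.375 after it, so the ratio is the 0.75 mentioned above and the leading side prefers to avoid the sac.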

This scheme would need a table that specifies which non-Pawn material should be considered a dead or a general draw. The simplest version of this would just list single minors vs nothing: KBK and KNK in FIDE. But having some 4-men endings in there (like non-mating pairs of minors, such as KNNK, or 'exchange'-type advantages like KRKN) would not be too demanding either. These entries would already extend their influence to KNNPKB and KRPKNN, through the sac-rule. The really tedious part would be to add 5-men end-games such as the 'minor ahead' situations KBNKN, KRBKR, KQBKQ,... But I already generated a lot of those tables; I will summarize those results in another comment.

It could also be good to discount cases like unlike Bishops with a difference of up to 2 Pawns by a factor 2, but at the moment I have no idea how to generalize that. (E.g. it seems that end-games with unlike Ferzes are not particularly drawish.)


Greg Strong wrote on 2019-05-04 UTC

Hard to say; perhaps 2 weeks if I would give it priority.

Ok, thanks.  I was just trying to get an idea, not asking to make it priority.  It'll probably take me a couple of weeks to get Quadrox ready.  I'm going to start with FIDEs vs. Clobberers because that will be easiest.  No asymmetric pieces or range-limited sliders.

I'm glad you mentioned endgames - I was going to bring that up.  At a minimum, I need to determine under which conditions the game should be terminated immediately because there isn't enough material for checkmate to be possible.  (E.g., any number of BDs and FADs vs. a lone king if they are all on the same color.)  But, yeah, like you I also want to identify those piece combinations where the game should not be terminated because mate is theoretically possible if the opponent walks into the corner but the evaluation function should return zero (e.g., king + fibnif vs. lone king.)  Your Javascript checkmating app is really awesome and answers the question for single pieces.  I'm glad you're going to work on determining the answer for multiple pieces.  Can a king plus two WAs force checkmate?  I doubt it but I don't really know and I have no experience with endgame database generation.
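The BD/FAD example is easy to test for. A minimal sketch, assuming pieces are given as (type, square-name) pairs; the function names are invented for this illustration and only cover that one termination condition.

```python
COLOR_BOUND = {"BD", "FAD"}  # piece types bound to one square shade

def shade(square):
    """Return 0 or 1 for the shade of a square such as 'c1'."""
    file_index = "abcdefgh".index(square[0])
    rank_index = int(square[1]) - 1
    return (file_index + rank_index) % 2

def only_same_shade_bedes_and_fads(pieces):
    """True if every piece is a BD or FAD and all stand on one shade.

    pieces: list of (piece_type, square) for the side facing a lone King.
    """
    if any(ptype not in COLOR_BOUND for ptype, _ in pieces):
        return False
    return len({shade(square) for _, square in pieces}) <= 1

print(only_same_shade_bedes_and_fads([("BD", "c1"), ("FAD", "a3")]))  # same shade
print(only_same_shade_bedes_and_fads([("BD", "c1"), ("FAD", "f1")]))  # unlike shades
```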

It sounds like you're being really ambitious though.  Recognizing KBBPKBN as drawish is really advanced.  Throw in all the fairy pieces from cwda and the number of permutations is out of sight...


H. G. Muller wrote on 2019-05-02 UTC

Greg Strong:

Any guess when you think you'll have your new cwda engine ready for testing?

Hard to say; perhaps 2 weeks if I would give it priority. But the Tenjiku Shogi implementation in Jocly is still not finished, and already 2 months (of 4) have elapsed on its clock in the yearly Modern Tenjiku correspondence championship in which it is supposed to participate...

Main issue is that I want it to recognize drawishness through lack of mating potential, which would include strongly discounting the score in end-games like KBBPKBN, because of the almost undodgeable N-for-P sac leaving a KBBKB known general draw. This requires the knowledge of which Pawnless 5-men CwDA endings are general draws, which I must first acquire by generating EGT for those (with FairyGen). And there are rather many of those, especially if I want to keep open the possibility to test individual pieces out of their own context (i.e. dropping the requirement that the two pieces fighting on one side must belong to the same army, so that I can test, say, WD+R vs R to see if the (winning) advantage of a WD is preserved on adding equal pieces on each side). An additional complication is that the standard version of FairyGen counts on 8-fold symmetry, although I once made a compile that can handle 4-fold symmetric pieces. But even that would not be able to handle the Nutters pieces other than Fibnif.

Otherwise there are only minor issues; KingSlayer supported only 6 piece types (1-6, code 0 being reserved for empty squares), and I already added some initialization code to set their move tables to that of the various armies. I still want to allow use of code 7 as an extra piece type, which requires a small code change because originally I used the 7th entry in the array that counts the number of pieces that is present of each type to hold the 'game phase' (minors + 2*Rooks + Queens). So I must move that to a separate array. And I still have to fix a-side castling for the Clobberers.


Ben Reiniger wrote on 2019-04-11 UTC

Neat!

Checkmating with the Dragon fly

Play with the Wyvern (The checkmating applet doesn't seem to like the jumping sideways rook component, putting the black king in check by that move.)


H. G. Muller wrote on 2019-04-10 UTC

The Daring Dragons

I designed a new army, which in tests with Pair-o-Max scores about equal against FIDE. I named it the Daring Dragons.

promoChoice=WHLD
graphicsDir=../membergraphics/MSelven-chess/
whitePrefix=w
blackPrefix=b
graphicsType=png
symmetry=none
pawn::::a2,b2,c2,d2,e2,f2,g2,h2,,a7,b7,c7,d7,e7,f7,g7,h7
Dragon Fly:F:sNvR:chancellor:b1,g1,,b8,g8
Dragoon:D:KivmN:man:c1,f1,,c8,f8
Dragon Horse:H:BW:crownedbishop:a1,h1,,a8,h8
Wyvern:W:vNsjRB:dragon:d1,,d8
king::::e1,,e8

An interesting feature that sets it apart from other armies is a piece with an unusual (meta-)color binding, the Dragon Fly: this is bound to even or odd board files, along which it moves like a Rook. It can switch between files through a sideway Knight jump. (It is in fact half a Chancellor.) It is worth slightly less than a Bishop, and can often force checkmate on a bare King. The other light piece is the Man / Commoner, but to facilitate its development (which would otherwise heavily compete with that of the Dragon Fly), it has some additional initial non-capture Knight jumps. It is called a Dragoon. (Dragoons are mounted infantry, using horses for mobility, but fighting on foot.) The Rook replacement is the Dragon Horse known from Shogi (moves as Bishop or one step orthogonally), worth slightly more than a Rook.

The super-piece (called Wyvern) is a somewhat weird construct; first I wanted it to be a Centaur (Knight-Man compound), but then the army proved too weak. Then I replaced the wide Knight moves of the Centaur by a sideway Rook slide, to also have the latter in the game. This makes it a compound of a Man and a 90-degree rotated Dragon Fly. But this was not really stronger than a Centaur; with either the army scored only 40% with black. A sideway Rook slide should be worth more than four Knight moves, but the Centaur already covered the first step of it, so it did not add enough. I also did not like its low speed in the vertical directions, which was unworthy of a super-piece. After some experimenting, a compound of a rotated Dragon Fly and a Bishop proved a little too strong (60% against FIDE), although not out of line with what the other CwDA armies do. A suitable way to weaken it to exactly match FIDE was to replace the sR slide by a ski-slide, skipping the first square on the ray (jumping any occupant if needed).

Ski-sliders are interesting anyway: on a near-empty board they are obviously inferior to the corresponding ordinary slider, as they lack the moves to the adjacent square. That the more distant moves cannot be blocked on that square is of no import if there is nothing around to block them. But on a crowded board, where slides almost always are blocked before they hit the board edge, the ski-slider will have the same number of moves as the normal slider, each target just being moved outward one step. Which should make them nearly equivalent. So ski-slider strength will depend on game phase in a different way than that of the other pieces, relatively decreasing towards the end-game.
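This argument can be made concrete on a single ray. The sketch below is only a toy model and not taken from any engine: a ray is a list of squares going outward from the piece, each square is occupied independently with probability p (half own, half enemy pieces, an assumption made purely for illustration), and the expected number of targets is computed by enumerating all occupancy patterns.

```python
from itertools import product

def slider_targets(ray):
    """Indices on `ray` an ordinary slider can move to.

    ray: list with None (empty), "own" or "enemy" per square, nearest first.
    """
    targets = []
    for i, occupant in enumerate(ray):
        if occupant == "own":
            break
        targets.append(i)
        if occupant == "enemy":
            break  # capture ends the slide
    return targets

def ski_slider_targets(ray):
    """Same, but the first square is skipped (jumping any occupant)."""
    return [i + 1 for i in slider_targets(ray[1:])]

def expected_mobility(targets_fn, length, p):
    """Expected number of targets when each square is occupied with probability p."""
    states = [(None, 1 - p), ("own", p / 2), ("enemy", p / 2)]
    total = 0.0
    for combo in product(states, repeat=length):
        prob = 1.0
        for _, weight in combo:
            prob *= weight
        total += prob * len(targets_fn([occupant for occupant, _ in combo]))
    return total

for p in (0.0, 0.5):
    print(p, expected_mobility(slider_targets, 7, p),
          expected_mobility(ski_slider_targets, 7, p))
```

On an empty 7-square ray the slider has 7 targets and the ski-slider 6. At p = 0.5 the two expectations differ by only about 0.01, because the ray is then rarely open all the way to the edge.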

 


Greg Strong wrote on 2019-03-31 UTC

Hi, H.G.  It's good to hear from you and to hear that you are working on another engine to help test these things!  I got distracted on other things and never got around to following up.  I have far too many different projects that attract my attention - usually chess variant stuff, but sometimes other things as well.  I found this programming language for writing interactive fiction (think Zork) where source code reads like English called Inform7.  I would not have thought it possible for a real programming language to be a subset of English.  Wild stuff.  But yeah, anyway, I get sidetracked a lot :)

First, I did complete the FIDEs vs. Nutters test with the FIDEs given added incentive to move forward through the PST.  This helped a tiny bit, but not much at all:

Nutty Knights: 261
Fabulous FIDEs: 84
draw: 55

My next thought is to reduce the value of the knight and bishop when facing off against the Nutters.  This will give the FIDEs a strong desire to trade off and the Nutters will have to limit their options to prevent that.  Once the minor pieces are traded off I think the FIDEs are fine.  I don't believe a charging rook is better than a normal one, although a colonel may be a little better than a queen.

I have recently switched back to trying to get the next version of ChessV out.  I have several new features that are mostly done that just need to be closed out.  (Of course, I don't always finish a feature before starting on the next ...)  The most significant of these is that I have added a stand-alone ChessV CECP engine so it can be run without the GUI.  This code is all written but almost completely untested.  I admit I've been procrastinating on that.  In the whole scope of this project, there is nothing that is less appealing to me than trying to plan/code/debug for inter-process communication.  The other side of the coin - ChessV's ability to host other XBoard engines is not 100% bug-free either, although it is certainly good enough to be usable.

The material hash is something else I've added but am not making much use of yet.  It is implemented as you describe, and will handle binding of any kind such as your even/odd file example.  I think it was here I described the recursive algorithm I used to find all the different 'slices' of the board for any given piece.  (I'm calling them slices rather than colors because colors becomes confusing when different pieces have different bindings - the knight in Alice Chess being a wacky example.)  It will be interesting to see what scientific testing determines colorbinding bonuses/penalties should be for multiple color-bound pieces.  Currently, ChessV starts discounting the value of pieces heavily starting with the second piece bound to a slice if there are no pieces on a complementary slice.
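One way to find such slices is a flood fill over the squares a piece can reach. The sketch below is not ChessV's code (it uses an explicit stack instead of recursion) and assumes a symmetric piece on an empty 8x8 board, with sliders described by their unit steps, since repeating a step reaches the same squares as a slide.

```python
def slices(steps, size=8):
    """Partition the board into sets of squares mutually reachable via `steps`.

    steps: list of (file step, rank step); assumed symmetric, so that
    reachability is an equivalence relation.
    """
    seen = set()
    result = []
    for start in ((f, r) for f in range(size) for r in range(size)):
        if start in seen:
            continue
        component = {start}
        stack = [start]
        while stack:
            f, r = stack.pop()
            for df, dr in steps:
                nxt = (f + df, r + dr)
                if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        result.append(component)
    return result

BISHOP = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
KNIGHT = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]
# Even/odd-file binding: vertical Rook steps plus sideway Knight jumps (vRsN).
DRAGONFLY = [(0, 1), (0, -1), (2, 1), (2, -1), (-2, 1), (-2, -1)]

for name, steps in (("Bishop", BISHOP), ("Knight", KNIGHT), ("vRsN", DRAGONFLY)):
    print(name, len(slices(steps)))
```

The Bishop and the vRsN piece each get two slices (square shades and file parities respectively), while the Knight gets a single slice covering the whole board.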

Regarding enabling CwDA for inter-engine play, yes, I am definitely interested in figuring out how we can do this.  I am certainly of the opinion that both our GUIs and all our engines should be as inter-operable as possible.  I will post some thoughts about this shortly.  (I'll start a new thread for it.)


H. G. Muller wrote on 2019-03-31 UTC

@Greg

Any progress on this? I am contemplating also returning to piece-value measurements. Because I want to measure the more subtle effects, such as mating potential and pair bonuses, this will require a less coarse approach than Fairy-Max. I have started to extend the capabilities of my engine KingSlayer (originally released as 'Simple', until I was told that name was already taken), which I wrote a few years ago as a demo source code for orthodox Chess somewhat more advanced than TSCP, to also support fairy pieces. And in particular CwdA. So I changed the move generator to support limited-range sliding/riding on a per-move basis. (For Chess it was done on a per-piece-type basis, and the range could only be 1 or infinite.)

As that engine only supports 6 piece types per side (which, with a little bit of work, could be expanded to 7), I implemented this by initializing the tables with piece properties it uses during play from a larger table that contains descriptions of all supported piece types. (So far the 16 piece types of the 4 classical CwdA armies.) For a particular game it then just picks up to 4 of these in addition to the always participating P and K. Unlike Fairy-Max, this engine has a dedicated check test (rather than just trying a null move and waiting for a King capture), and this had to be extended too in order to handle the new moves. Basically it works by having a 15x15 'board' indexed by the relative distance, where for each step a bitmap indicates which piece type in principle could make such a step, where for sliding moves a contact threat is distinguished from a distant one (to easily see if you need to test for blocking). By making use of the fact that some pieces are compounds of others (like Q=R+B), and decomposing some pieces into 'primitives' to make even better use of that, the number of different primitives needed to support CwdA was 13, too large for the byte originally used for this purpose, but less than half a 32-bit integer, so that I can now even use separate bits for white and black attacks, eliminating the need to test the piece for being an enemy by other means. This type of check test would become more cumbersome with hoppers (where you don't only have direct and discovered checks, but also have to deal with 'activation' by interposition), and very awkward in the presence of bent sliders (like the Gryphon). So this engine will probably never support those kinds of moves. Divergent pieces would still be a realistic possibility, though.
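The table-driven test can be illustrated with a much reduced version. The sketch below is not KingSlayer's code: it uses only five primitives instead of 13, has no separate white/black bits, and all names (`ORTH_NEAR`, `TABLE`, `attacks`, ...) are invented for the illustration.

```python
# Primitive move bits; "near" is a contact step, "far" needs a blocking test.
ORTH_NEAR, ORTH_FAR, DIAG_NEAR, DIAG_FAR, KNIGHT_LEAP = 1, 2, 4, 8, 16
DISTANT = ORTH_FAR | DIAG_FAR

def build_table():
    """15x15 table indexed by (dx + 7, dy + 7): which primitives make that step."""
    table = [[0] * 15 for _ in range(15)]
    for dx in range(-7, 8):
        for dy in range(-7, 8):
            ax, ay = abs(dx), abs(dy)
            bits = 0
            if (ax == 0) != (ay == 0):           # orthogonal, non-zero step
                bits = ORTH_NEAR if max(ax, ay) == 1 else ORTH_FAR
            elif ax == ay and ax > 0:            # diagonal step
                bits = DIAG_NEAR if ax == 1 else DIAG_FAR
            elif {ax, ay} == {1, 2}:
                bits = KNIGHT_LEAP
            table[dx + 7][dy + 7] = bits
    return table

TABLE = build_table()

# Pieces as compounds of primitives (e.g. Q = R + B, A = B + N).
ROOK = ORTH_NEAR | ORTH_FAR
BISHOP = DIAG_NEAR | DIAG_FAR
PIECES = {
    "R": ROOK, "B": BISHOP, "N": KNIGHT_LEAP, "Q": ROOK | BISHOP,
    "A": BISHOP | KNIGHT_LEAP, "C": ROOK | KNIGHT_LEAP,
    "M": ORTH_NEAR | DIAG_NEAR,  # Man / Commoner
}

def sign(v):
    return (v > 0) - (v < 0)

def attacks(piece, frm, to, occupied):
    """True if `piece` on `frm` attacks `to`; `occupied` is a set of squares."""
    dx, dy = to[0] - frm[0], to[1] - frm[1]
    hit = TABLE[dx + 7][dy + 7] & PIECES[piece]
    if not hit:
        return False
    if hit & DISTANT:  # only distant slides need the path checked
        sx, sy = sign(dx), sign(dy)
        x, y = frm[0] + sx, frm[1] + sy
        while (x, y) != to:
            if (x, y) in occupied:
                return False
            x, y = x + sx, y + sy
    return True

print(attacks("A", (2, 0), (3, 2), set()))     # Knight leap of the Archbishop
print(attacks("R", (0, 0), (0, 5), {(0, 3)}))  # blocked Rook slide
```

A single lookup rejects most piece/square combinations immediately, and the loop over intermediate squares only runs when a distant slide bit matches.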

Unlike Fairy-Max this engine does have an advanced Pawn-structure evaluation, (e.g. passer recognition), which is directly usable in CwdA, as that uses the same Pawns. It did keep track of the number of pieces of each type that are still present, and used this to award a Bishop pair bonus (if there were two), or discount the static evaluation score when mating potential gets into jeopardy for lack of Pawns (i.e. with 1 Pawn or less). This will have to be substantially refined, though, as with multiple color-bound types cross bonuses are to be expected, and you cannot conclude from the piece counts alone whether you have a pair or not. Also drawish cases similar to 'unlike Bishops' cannot be recognized this way, which was already a weakness in regular Chess. So I plan to add a 'material hash', which uses a hash key that depends on the present material, but counts color-bound pieces of the same type but on different square shades as different. (This can be done through a Zobrist-like hashing scheme that doesn't assign a different key to a piece type for each board square, but just one for each 'meta-color' relevant for that type.) Which piece combinations will have mating potential will now depend on the army, and will thus require a more complex analysis, but if the results of that analysis are kept in a hash table, this will not impact engine speed.
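A material key of this kind could look as follows. This is a sketch under assumptions that are not from the description above: keys are combined by addition modulo 2^64 (so that two identical pieces do not cancel, as they would with XOR), only square shade is used as meta-color, and the names `material_key`, `META_COLOR` and `key_for` are invented for the example.

```python
import random

_rng = random.Random(2019)   # fixed seed so keys are reproducible
_MASK = (1 << 64) - 1
_KEYS = {}

def shade(square):
    """Square shade (0 or 1) of a (file, rank) pair."""
    return (square[0] + square[1]) % 2

# Meta-color function per color-bound type; other types have a single class.
META_COLOR = {"B": shade, "BD": shade, "FAD": shade}

def key_for(side, ptype, meta):
    """One random 64-bit key per (side, type, meta-color), not per square."""
    ident = (side, ptype, meta)
    if ident not in _KEYS:
        _KEYS[ident] = _rng.getrandbits(64)
    return _KEYS[ident]

def material_key(pieces):
    """pieces: iterable of (side, piece_type, (file, rank))."""
    total = 0
    for side, ptype, square in pieces:
        meta = META_COLOR[ptype](square) if ptype in META_COLOR else 0
        total = (total + key_for(side, ptype, meta)) & _MASK
    return total

unlike = material_key([("w", "B", (2, 0)), ("w", "B", (5, 0))])
moved = material_key([("w", "B", (0, 2)), ("w", "B", (7, 2))])
like = material_key([("w", "B", (2, 0)), ("w", "B", (4, 0))])
print(unlike == moved, unlike == like)
```

Moving the Bishops along their diagonals leaves the key unchanged, while a pair on like shades gets a different key than a pair on unlike shades, so a table lookup on the key can tell the two apart.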

BTW, other types of (meta-)color binding can be interesting as well. E.g. odd/even file binding, such as for vRsD, which does have substantial mating potential. (Although a fortress draw is possible when the bare King cannot be cut off from the safe edge. A vRsDD would even be better in preventing that.)

 

You mentioned lack of standardization presenting a problem for having XBoard engines playing CwdA in ChessV. What would be needed here? Now that I am making KingSlayer into a CwdA engine, it might be good to have a closer look at the specific problems, and try to find ways to remove those. At the moment I have KingSlayer report in the CECP variants feature that it supports variant 'fairy', and gave it engine-defined combo options "White Army:" and "Black Army:" that can be used to select the flavors FIDE, Clobberers, Rookies or Nutters, and will determine what variant fairy means. But in addition to that I could allow setting of the default value of those options through arguments in the engine command (so that you would never have to bother setting the option).


Greg Strong wrote on 2018-10-13 UTC

I have a bit of discomfort as the game did not have any lame leapers before but that borders on nothing. I'm more concerned how the change affects the balance against the two other armies. As this seems to me that will lead to a wave of interconnected changes that are probably not easy to pull through. Some sort of logical system of equations needs maintaining and I honestly doubt such an endeavour is even doable, little to say about feasible. This because you don't have many options for tuning while keeping the initial flavour on

This is a valid concern, but I'm hoping this does not become a problem.  And a tiny bit of rock-paper-scissors effect is acceptable so long as things are balanced against the FIDEs.  Obviously, the FIDEs are the one army that cannot be modified.  For an example of a board game that has significant R-P-S effect but is still an awesome game, see tournament Star Fleet Battles.  I should say this as I was probably not clear - I am NOT proposing making this change until testing of all combinations is complete, along with some testing of evaluation terms changes...  This is just what I'm leaning towards given what we know so far.

It is a pity that test takes so long (a common problem in computer chess...)

Indeed it is, but I can scale up quite a bit.  I actually have quite a few i5 and i7 PCs that can be pressed into service to do testing (6 or 7 of them.)  The longest part, which is largely manual, is calculating out all the starting positions so I can feel very confident that my tests aren't playing the same games over and over.  But when this is accomplished I can scale up testing quickly.  I have just finished generating 20 positions of FF vs RR and am just starting on those with the colors reversed.

I suppose that the new ChessV is stronger than Fairy-Max? Have you ever measured by how much?

My current builds are definitely stronger than Fairy-Max, at least at the various 10x8 variants, but I have not done formal measurements.  I intend to test that with my new "batch mode" capability also, but I've been focused on CwDA tests instead :)  ChessV will control XBoard protocol engines for many games, but CwDA is not one of them because it would require more standards than presently exist.  I should also mention that Fairy-Max is an absolute speed demon, in terms of nodes-per-second, compared to ChessV at approximately 4x the nodes.  ChessV's strength comes from smarter search (using ideas stolen from Stockfish and other GPL engines - I take absolutely no credit for this) and better evaluation.

What TC are you using for these tests?

The 400-game sets use different time controls as one way to get more varied results.  They also modify the new Variation setting from None (which is completely deterministic) to Small (for most games) to Medium (for a few games.)  The fastest time controls I'm using are 25 sec + 2 sec/move.  The longest are 5 minutes + 1 sec/move.  Typically a 400-game set on one computer takes about 2 days.  I will post a new (unofficial) version here shortly along with all my opening positions and batch mode control files so everyone can see exactly what I'm doing and run tests of their own.

Regarding the NN vs FF test with the FIDEs given more encouragement to advance through the PSTs, the test is half done.  The 200 games where the NNs are white and the FFs are black are done.  The Nutters won 136, the FIDEs won 36, with 28 draws.  So it doesn't look like this is making the situation any better although these are all games where the nutters have the first move.  Tomorrow we should know the final results.

 


25 comments displayed
