
Chess with Different Armies. Betza's classic variant where white and black play with different sets of pieces. (Recognized!)

Greg Strong wrote on Sat, Oct 13, 2018 09:34 PM UTC:

I have a bit of discomfort, as the game did not have any lame leapers before, but that borders on nothing. I'm more concerned about how the change affects the balance against the two other armies, as it seems to me that this will lead to a wave of interconnected changes that are probably not easy to pull through. Some sort of logical system of equations needs maintaining, and I honestly doubt such an endeavour is even doable, let alone feasible. This is because you don't have many options for tuning while keeping the initial flavour on

This is a valid concern, but I'm hoping this does not become a problem. And a tiny bit of rock-paper-scissors effect is acceptable so long as things are balanced against the FIDEs. Obviously, the FIDEs are the one army that cannot be modified. For an example of a board game that has a significant R-P-S effect but is still an awesome game, see tournament Star Fleet Battles. I should say this, as I was probably not clear: I am NOT proposing making this change until testing of all combinations is complete, along with some testing of changes to the evaluation terms. This is just what I'm leaning towards given what we know so far.

It is a pity that testing takes so long (a common problem in computer chess...)

Indeed it is, but I can scale up quite a bit. I actually have quite a few i5 and i7 PCs (6 or 7 of them) that can be pressed into service to do testing. The longest part, which is largely manual, is working out all the starting positions so I can feel very confident that my tests aren't playing the same games over and over. But once this is accomplished, I can scale up testing quickly. I have just finished generating 20 positions of FF vs RR and am just starting on those with the colors reversed.

I suppose that the new ChessV is stronger than Fairy-Max? Have you ever measured by how much?

My current builds are definitely stronger than Fairy-Max, at least at the various 10x8 variants, but I have not done formal measurements. I intend to test that with my new "batch mode" capability also, but I've been focused on CwDA tests instead :) ChessV will control XBoard protocol engines for many games, but CwDA is not one of them because it would require more standards than presently exist. I should also mention that Fairy-Max is an absolute speed demon in terms of nodes per second, searching approximately 4x as many nodes as ChessV. ChessV's strength comes from smarter search (using ideas stolen from Stockfish and other GPL engines; I take absolutely no credit for this) and better evaluation.

What TC are you using for these tests?

The 400-game sets use different time controls as one way to get more varied results. They also modify the new Variation setting from None (which is completely deterministic) to Small (for most games) to Medium (for a few games.) The fastest time controls I'm using are 25 sec + 2 sec/move. The longest are 5 minutes + 1 sec/move. Typically a 400-game set on one computer takes about 2 days. I will post a new (unofficial) version here shortly along with all my opening positions and batch mode control files so everyone can see exactly what I'm doing and run tests of their own.
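As a rough back-of-the-envelope check on those figures, the sketch below converts "400 games in about 2 days" into an average game length and a daily throughput. The function names are made up for illustration, and it assumes each machine plays one game at a time and that throughput scales linearly with the number of PCs, which real runs may not quite achieve.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_game_seconds(games: int, days: float) -> float:
    """Average wall-clock time per game for one machine running a whole set."""
    return days * SECONDS_PER_DAY / games


def games_per_day(machines: int, games_per_set: int, days_per_set: float) -> float:
    """Total games per day, assuming perfectly linear scaling across machines."""
    return machines * games_per_set / days_per_set


# Figures from the post: a 400-game set takes about 2 days on one computer.
avg = average_game_seconds(400, 2.0)
print(f"average game length: {avg / 60:.1f} minutes")

# Hypothetical scale-up to 6 of the available PCs.
print(f"games per day on 6 machines: {games_per_day(6, 400, 2.0):.0f}")
```

An average of a little over 7 minutes per game sits between the fastest and slowest time controls mentioned above, so the 2-day estimate looks self-consistent.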

Regarding the NN vs FF test with the FIDEs given more encouragement to advance through the PSTs, the test is half done. The 200 games where the NNs are white and the FFs are black are done. The Nutters won 136, the FIDEs won 36, and 28 were drawn. So it doesn't look like this is making the situation any better, although these are all games where the Nutters have the first move. Tomorrow we should know the final results.
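For anyone who wants to put a number on that half-result, here is a minimal sketch that turns the win/loss/draw counts into a score fraction and an approximate rating gap. It uses the standard logistic Elo formula, which is an assumption on my part and not something ChessV's batch mode reports; the function names are illustrative only.

```python
import math


def score_fraction(wins: int, losses: int, draws: int) -> float:
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


def elo_difference(score: float) -> float:
    """Rating gap implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Partial result from the post: Nutters (white) won 136, lost 36, drew 28.
nn_score = score_fraction(wins=136, losses=36, draws=28)
print(f"Nutters score: {nn_score:.1%}")
print(f"implied Elo gap: {elo_difference(nn_score):.0f}")
```

Keep in mind that this half of the test has the Nutters moving first in every game, so the implied gap includes whatever the first-move advantage is worth and should shrink once the reversed-color games are in.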