Greg Strong wrote on Sat, Oct 13, 2018 09:34 PM UTC:

I have a bit of discomfort, as the game did not have any lame leapers before, but that borders on nothing. I'm more concerned about how the change affects the balance against the two other armies, as it seems to me that this will lead to a wave of interconnected changes that are probably not easy to pull through. Some sort of logical system of equations needs maintaining, and I honestly doubt such an endeavour is even doable, let alone feasible. This is because you don't have many options for tuning while keeping the initial flavour on

This is a valid concern, but I'm hoping this does not become a problem. And a tiny bit of rock-paper-scissors effect is acceptable so long as things are balanced against the FIDEs. Obviously, the FIDEs are the one army that cannot be modified. For an example of a board game that has significant R-P-S effect but is still an awesome game, see tournament Star Fleet Battles. I should say this, as I was probably not clear: I am NOT proposing making this change until testing of all combinations is complete, along with some testing of changes to the evaluation terms. This is just what I'm leaning towards given what we know so far.

It is a pity that testing takes so long (a common problem in computer chess...)

Indeed it is, but I can scale up quite a bit.  I actually have quite a few i5 and i7 PCs that can be pressed into service to do testing (6 or 7 of them.)  The longest part, which is largely manual, is calculating out all the starting positions so I can feel very confident that my tests aren't playing the same games over and over.  But when this is accomplished I can scale up testing quickly.  I have just finished generating 20 positions of FF vs RR and am just starting on those with the colors reversed.

I suppose that the new ChessV is stronger than Fairy-Max? Have you ever measured by how much?

My current builds are definitely stronger than Fairy-Max, at least at the various 10x8 variants, but I have not done formal measurements. I intend to test that with my new "batch mode" capability also, but I've been focused on CwDA tests instead :) ChessV will control XBoard protocol engines for many games, but CwDA is not one of them because it would require more standards than presently exist. I should also mention that Fairy-Max is an absolute speed demon in terms of nodes per second, searching approximately 4x as many nodes as ChessV. ChessV's strength comes from smarter search (using ideas stolen from Stockfish and other GPL engines - I take absolutely no credit for this) and better evaluation.

What TC are you using for these tests?

The 400-game sets use different time controls as one way to get more varied results.  They also modify the new Variation setting from None (which is completely deterministic) to Small (for most games) to Medium (for a few games.)  The fastest time controls I'm using are 25 sec + 2 sec/move.  The longest are 5 minutes + 1 sec/move.  Typically a 400-game set on one computer takes about 2 days.  I will post a new (unofficial) version here shortly along with all my opening positions and batch mode control files so everyone can see exactly what I'm doing and run tests of their own.

Regarding the NN vs FF test with the FIDEs given more encouragement to advance through the PSTs, the test is half done. The 200 games where the NNs are white and the FFs are black are done. The Nutters won 136, the FIDEs won 36, with 28 draws. So it doesn't look like this is making the situation any better, although these are all games where the Nutters have the first move. Tomorrow we should know the final results.
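For anyone who wants to put a number on that half-set, here is a minimal sketch that turns the 136 / 36 / 28 result into a score percentage and a rough rating gap. The function names are made up for illustration, and the standard logistic Elo formula is my assumption here, not something ChessV's batch mode is known to report. Since these are only the games with the Nutters as white, the figure likely overstates the real imbalance.

```python
import math


def score_fraction(wins, losses, draws):
    """Fraction of available points scored, counting a draw as half a point."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Rating gap implied by a score fraction under the logistic Elo model.

    Only defined for 0 < score < 1.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


# Half-set quoted above: Nutters (white) vs FIDEs (black), 200 games.
nutters_score = score_fraction(wins=136, losses=36, draws=28)
print(f"Nutters score: {nutters_score:.1%}")
print(f"Implied Elo difference: {elo_difference(nutters_score):+.0f}")
```

Once the color-reversed 200 games are in, the same two calls on the combined totals would give a fairer picture.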

 



Comment on the page Chess with Different Armies
