
Game Courier Ratings for Shogi

This page reads data on finished games and calculates Game Courier Ratings (GCRs) for each player. These are most meaningful when calculated for a single Chess variant, though they may also be calculated across variants. This page is presently in development, and the method used is experimental; I may change it in due time. How the method works is described below.

SELECT * FROM FinishedGames WHERE Rated='on' AND Game = 'Shogi'
Game Courier Ratings for Shogi
Accuracy: GCR 80.24%, GCR1 82.12%, GCR2 80.66%

Name | Userid | GCR | Percent won | GCR1 | GCR2
Francis Fahy | stamandua | 1807 | 59.0/63 = 93.65% | 1820 | 1793
dax00 | dax00 | 1620 | 9.0/9 = 100.00% | 1624 | 1616
Kevin Pacey | panther | 1595 | 8.0/11 = 72.73% | 1583 | 1608
Vitya Makov | makov333 | 1583 | 7.0/8 = 87.50% | 1584 | 1583
pheko Motaung | couriermabovini | 1573 | 6.0/7 = 85.71% | 1577 | 1569
Fergus Duniho | fergus | 1560 | 6.0/9 = 66.67% | 1560 | 1561
Nicola Caridi | niccar | 1551 | 3.0/3 = 100.00% | 1553 | 1548
shift2shift | shift2shift | 1537 | 2.0/2 = 100.00% | 1539 | 1534
Eric Greenwood | cavalier | 1534 | 2.0/2 = 100.00% | 1534 | 1534
Pericles Tesone de Souza | peritezz | 1533 | 2.0/2 = 100.00% | 1533 | 1533
Chuck Lee | gyw6t | 1531 | 3.0/4 = 75.00% | 1530 | 1532
Raymond D | lewel | 1521 | 5.0/10 = 50.00% | 1519 | 1523
Alexander Trotter | qilin | 1518 | 1.0/1 = 100.00% | 1519 | 1518
Play Tester | playtester | 1518 | 1.0/1 = 100.00% | 1518 | 1518
Doug | bughouse | 1518 | 1.0/1 = 100.00% | 1518 | 1518
Daniil Frolov | flowermann | 1518 | 1.0/1 = 100.00% | 1518 | 1518
juan rodriguez | rodriguez | 1517 | 1.0/1 = 100.00% | 1518 | 1517
John Smith | ultimatecoolster | 1517 | 1.0/1 = 100.00% | 1517 | 1518
Natalia Dolindo | whitetiger | 1517 | 1.0/1 = 100.00% | 1515 | 1519
Oisín D. | sxg | 1517 | 1.0/1 = 100.00% | 1516 | 1518
Nicholas Wolff | maeko | 1515 | 2.0/3 = 66.67% | 1513 | 1516
Jenard Cabilao | mgawalangmagawa | 1514 | 2.0/5 = 40.00% | 1509 | 1520
Julien Coll Morat | facteurix | 1512 | 1.0/2 = 50.00% | 1513 | 1511
S S | sim | 1500 | 2.0/4 = 50.00% | 1495 | 1506
Bogot Bogoto | lbog | 1499 | 1.0/2 = 50.00% | 1498 | 1499
boukine | boukine | 1498 | 1.0/2 = 50.00% | 1497 | 1500
ctz | ctz | 1498 | 5.0/10 = 50.00% | 1495 | 1501
kunkun | kunkun | 1491 | 0.0/1 = 0.00% | 1494 | 1488
Hugo Mendes-Nunes | hugo1995 | 1491 | 0.0/1 = 0.00% | 1494 | 1488
Fabner Cruz Graciliano | fabner | 1491 | 0.0/1 = 0.00% | 1494 | 1487
Bob Brown | bobhihih | 1490 | 0.0/1 = 0.00% | 1494 | 1486
wyatt wyatt | quimssarcasm | 1490 | 0.0/1 = 0.00% | 1495 | 1486
jesus babyboy | pokechamp | 1490 | 0.0/1 = 0.00% | 1495 | 1485
Sagi Gabay | sagig72 | 1490 | 0.0/1 = 0.00% | 1495 | 1484
Nicholas Archer | chess_hunter | 1489 | 0.0/1 = 0.00% | 1495 | 1484
Hsa Said | h | 1489 | 0.0/1 = 0.00% | 1495 | 1483
Georg Spengler | avunjahei | 1489 | 0.0/1 = 0.00% | 1496 | 1482
wdtr | wdtr | 1489 | 0.0/1 = 0.00% | 1496 | 1481
Daniel Zacharias | arx | 1484 | 1.0/3 = 33.33% | 1485 | 1483
Matias I. | tsatziq | 1484 | 0.0/1 = 0.00% | 1486 | 1481
sixty | sixty | 1484 | 0.0/2 = 0.00% | 1488 | 1479
Richard milner | sesquipedalian | 1483 | 0.0/1 = 0.00% | 1484 | 1481
Éric Manálang | edubble19 | 1483 | 0.0/1 = 0.00% | 1484 | 1481
Michał Jarski | hookz | 1483 | 0.0/1 = 0.00% | 1483 | 1483
Jose Cancel | joche | 1483 | 0.0/1 = 0.00% | 1483 | 1482
btstw | btstw | 1483 | 0.0/1 = 0.00% | 1484 | 1481
Juan Pablo Schweitzer Kirsinger | defender | 1482 | 0.0/1 = 0.00% | 1483 | 1481
George Duke | gwduke | 1482 | 0.0/1 = 0.00% | 1482 | 1481
Erlang Shen | erlangshen | 1481 | 0.0/1 = 0.00% | 1481 | 1481
Sandra#Paul BRANDLYARD | sandravers130675 | 1481 | 0.0/1 = 0.00% | 1481 | 1481
voicant | voicant | 1481 | 0.0/1 = 0.00% | 1481 | 1481
wabba | wabba | 1480 | 0.0/1 = 0.00% | 1478 | 1481
xeongrey | xeongrey | 1477 | 1.0/4 = 25.00% | 1478 | 1477
Jeremy Good | judgmentality | 1475 | 0.0/2 = 0.00% | 1477 | 1473
Jose Carrillo | j_carrillo_vii | 1469 | 0.0/2 = 0.00% | 1469 | 1468
Greg Strong | mageofmaple | 1467 | 1.0/4 = 25.00% | 1466 | 1468
Samuel de Souza | samsou | 1466 | 0.0/2 = 0.00% | 1466 | 1466
Jon Dann | jon_dann | 1465 | 0.0/2 = 0.00% | 1466 | 1464
Matthew La Vallee | sherman101 | 1464 | 0.0/2 = 0.00% | 1464 | 1463
Omnia Nihilo | sacredchao | 1463 | 1.0/5 = 20.00% | 1467 | 1459
Gary Gifford | penswift | 1459 | 0.0/3 = 0.00% | 1458 | 1459
Julian | redpanda | 1455 | 0.0/3 = 0.00% | 1451 | 1459
Aurelian Florea | catugo | 1454 | 2.0/8 = 25.00% | 1455 | 1453
Arthur Yvrard | torendil | 1452 | 0.0/3 = 0.00% | 1450 | 1453
pallab basu | pallab | 1451 | 0.0/3 = 0.00% | 1453 | 1448
Erik Lerouge | erik | 1450 | 0.0/3 = 0.00% | 1452 | 1449
darren paull | ramalam | 1418 | 2.0/26 = 7.69% | 1367 | 1469
Carlos Cetina | sissa | 1417 | 0.0/9 = 0.00% | 1410 | 1424
wdtr2 | wdtr2 | 1393 | 0.0/9 = 0.00% | 1382 | 1404

Meaning

The ratings are estimates of relative playing strength. Given the ratings of two players, the difference between them is used to estimate the percentage of games each may be expected to win against the other. A difference of zero predicts that each player should win half the games, and a difference of 400 or more predicts that the higher-rated player should win every game. Between these extremes, the higher-rated player is expected to win a percentage of games given by the formula (difference/8)+50. A rating means nothing on its own. It is meaningful only in comparison with another player whose rating was derived from the same set of data through the same set of calculations. So your rating here cannot be compared with someone's Elo rating.
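As a quick illustration, here is a minimal Python sketch of that expected-win formula. It is not the actual site code; the function name is made up, and the clamping to the 0-100 range simply reflects the limits described above.

    def expected_win_percent(rating_a, rating_b):
        # (difference/8) + 50, limited to the range 0-100
        diff = rating_a - rating_b
        return min(max(diff / 8 + 50, 0), 100)

    print(expected_win_percent(1600, 1500))  # 62.5: a 100-point edge predicts winning 62.5% of games
    print(expected_win_percent(1900, 1400))  # 100.0: a gap of 400 or more predicts winning every game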

Accuracy

Ratings are calculated through a self-correcting trial-and-error process that compares actual outcomes with expected outcomes and gradually adjusts the ratings to better reflect the actual outcomes. With enough data, this process can become highly accurate, but error is inherent in any trial-and-error process, and without enough data its results will remain error-ridden. Unfortunately, Chess variants are not played often enough to provide a large data set to work with. The data sets here are usually small, which means the ratings will not be fully accurate.

One measure taken to eke the most out of the small data sets that are available is to calculate ratings in a holistic manner that incorporates all results into the evaluation of each result. The first step is to go through pairs of players in a way that doesn't concentrate all of one player's games in a single stage of the process. This involves ordering the players in a zig-zagging manner that distributes each player evenly throughout the process of evaluating ratings. The second step is to reverse the order in which pairs of players are evaluated, recalculate all the ratings, and average the two sets of ratings. This allows the outcome of every game to affect the rating calculations for every pair of players. One consequence is that your rating is not a static figure: games played by other people may influence your rating even after you have stopped playing. The upside is that the ratings of inactive players should become more accurate as more games are played by other people.

Fairness

High ratings have to be earned by playing many games; they are not available through shortcuts. In a previous version of the rating system, I focused on accuracy more than fairness, which resulted in some players getting high ratings after playing only a few games. This new rating system curbs rating growth more, so that you have to win many games to get a high rating. One way it curbs rating growth is to base the amount a rating changes on the number of games played between two players. The more games they have played together, the closer the change comes to the maximum amount a rating may be changed after comparing two players. This maximum is the difference between expected and actual results, expressed as a fraction, times 400, so the amount a rating may change in one comparison is limited in magnitude to at most 400 points. The amount of change is further limited by the number of games each player has already played: the more past games a player has played, the more stable his rating is considered to be, making it less subject to change.
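For example, with hypothetical numbers: suppose p1 is rated 1550 and p2 is rated 1500, so p1 is expected to win (50/8)+50 = 56.25% of their games. If they have played 4 games together and p1 won 3 of them (75%), the difference between actual and expected results is 18.75 percentage points, or 0.1875 as a fraction. Multiplied by 400, the maximum change is 75 points; weighting by n/(n+10) = 4/14 reduces it to about 21.4 points. If p1 already has 20 past games on record, his share is further multiplied by 1-(20/820), roughly 0.976, so about 20.9 points are added to his rating, while p2's deduction is scaled by his own past-game count in the same way.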

Algorithm

  1. Each finished public game matching the wildcard or list of games is read, with wins and draws being recorded into a table of pairwise wins. A win counts as 1 for the winner, and a draw counts as .5 for each player.
  2. All players get an initial rating of 1500.
  3. All players are sorted in order of decreasing number of games. Ties are broken first by number of games won, then by number of opponents. This determines the order in which pairs of players will have their ratings recalculated.
  4. Initialize the count of all players' past games to zero.
  5. Based on the ordering of players, go through all pairs of players in a zig-zagging order that spreads out the pairing of each player with each of his opponents. For each pair that has played games together, recalculate their ratings as described below (a simplified code sketch of the whole procedure follows this list):
    1. Add up the number of games played. If none, skip to the next pair of players.
    2. Identify the players as p1 and p2, and subtract p2's rating from p1's.
    3. Based on this difference, calculate the percentage of games p1 is expected to win (see the formula under Meaning).
    4. Subtract this expected percentage from the percentage of games p1 actually won. // This is the difference between actual and predicted outcomes. It may range from -100% to +100%.
    5. Express this difference as a fraction (from -1 to +1) and multiply it by 400 to get the maximum amount of change allowed.
    6. Where n is the number of games played together, multiply the maximum amount of change by (n)/(n+10).
    7. For each player, where p is the number of his past games, multiply this product by (1-(p/(p+800))).
    8. Add this amount to the rating for p1, and subtract it from the rating for p2. // If it is negative, p1 will lose points, and p2 will gain points.
    9. Update the count of each player's past games by adding the games they played together.
  6. Reinitialize all players' past games to zero.
  7. Repeat the same procedure in the reverse zig-zagging order, creating a new set of ratings.
  8. Average both sets of ratings into one set.
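To make the procedure concrete, here is a minimal Python sketch of it. It is not the actual Game Courier code, and it takes liberties with details the text leaves open: the zig-zag ordering is replaced by a plain pairing order that is simply reversed for the second pass, the sorting of step 3 is not shown, and the game data is made up. The constants (400, n/(n+10), p/(p+800)) follow the steps above.

    from collections import defaultdict
    from itertools import combinations

    def expected_share(diff):
        # Expected fraction of games for the first player: (difference/8)+50 percent, clamped to 0-100.
        return min(max(diff / 8 + 50, 0), 100) / 100

    def one_pass(wins, players, reverse=False):
        # One pass of pairwise adjustments (steps 2-5 above); returns a dict of ratings.
        ratings = {p: 1500.0 for p in players}          # step 2: everyone starts at 1500
        past = defaultdict(int)                         # step 4: past-game counts start at zero
        pairs = list(combinations(players, 2))          # assumption: simple pair order, not the true zig-zag
        if reverse:
            pairs.reverse()                             # step 7: the second pass runs in reverse order
        for p1, p2 in pairs:
            n = wins[(p1, p2)] + wins[(p2, p1)]         # step 5.1: games these two played together
            if n == 0:
                continue
            diff = ratings[p1] - ratings[p2]            # step 5.2
            expected = expected_share(diff)             # step 5.3
            actual = wins[(p1, p2)] / n                 # fraction of those games p1 actually won
            change = (actual - expected) * 400          # steps 5.4-5.5: at most 400 points either way
            change *= n / (n + 10)                      # step 5.6: more shared games, more movement
            for p, sign in ((p1, 1), (p2, -1)):         # steps 5.7-5.8: damp by each player's past games
                ratings[p] += sign * change * (1 - past[p] / (past[p] + 800))
            past[p1] += n                               # step 5.9
            past[p2] += n
        return ratings

    # Hypothetical data: wins[(winner, loser)] = games won (a draw would add 0.5 in each direction).
    wins = defaultdict(float)
    wins[("alice", "bob")] = 3
    wins[("bob", "alice")] = 1
    wins[("bob", "carol")] = 2
    players = ["alice", "bob", "carol"]

    forward = one_pass(wins, players)
    backward = one_pass(wins, players, reverse=True)
    final = {p: round((forward[p] + backward[p]) / 2) for p in players}  # steps 6-8: average both passes
    print(final)

In this sketch the forward and backward passes play the roles of GCR1 and GCR2, and their average plays the role of GCR.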


Written by Fergus Duniho
WWW Page Created: 6 January 2006