Game Courier Ratings for Wormhole Chess
This file reads data on finished games and calculates Game Courier Ratings (GCRs) for each player. These will be most meaningful for single Chess variants, though they may be calculated across variants. This page is presently in development, and the method used is experimental. I may change the method in due time. How the method works is described below.
SELECT * FROM FinishedGames WHERE Rated='on' AND Game = 'Wormhole Chess'
Meaning
The ratings are estimates of relative playing strength. Given the ratings of two players, the difference between their ratings is used to estimate the percentage of games each may win against the other. A difference of zero estimates that each player should win half the games. A difference of 400 or more estimates that the higher rated player should win every game. Between these, the higher rated player is expected to win a percentage of games calculated by the formula (difference/8)+50.
A rating means nothing on its own. It is meaningful only in comparison to another player whose rating is derived from the same set of data through the same set of calculations. So your rating here cannot be compared to someone's Elo rating.
Accuracy
Ratings are calculated through a self-correcting trial-and-error process that compares actual outcomes with expected outcomes, gradually changing the ratings to better reflect actual outcomes. With enough data, this process can approach accuracy to a high degree, but error remains an essential element of any trial-and-error process, and without enough data, its results will remain error-ridden. Unfortunately, Chess variants are not played enough to give it a large data set to work with. The data sets here are usually small, and that means the ratings will not be fully accurate.
One measure taken to eke out the most data from the small data sets that are available is to calculate ratings in a holistic manner that incorporates all results into the evaluation of each result. The first step of this is to go through pairs of players in a manner that doesn't concentrate all the games of one player in one stage of the process. This involves ordering the players in a zig-zagging manner that evenly distributes each player throughout the process of evaluating ratings. The second step is to reverse the order that pairs of players are evaluated in, recalculate all the ratings, and average the two sets of ratings.
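The expected-percentage rule stated under Meaning can be sketched in a few lines of Python. This is only an illustration of that one formula, not the page's own code: the function name `expected_win_percentage` is hypothetical, and treating the lower rated player's expectation as the complement of the higher rated player's is an assumption.

```python
def expected_win_percentage(rating_a: float, rating_b: float) -> float:
    """Estimated percentage of games player A wins against player B."""
    difference = rating_a - rating_b
    # A gap of 400 or more means the higher rated player is expected to win every game.
    if difference >= 400:
        return 100.0
    if difference <= -400:
        return 0.0
    # The page gives (difference / 8) + 50 for the higher rated player.
    # Assumption: with a signed difference, the same expression yields the
    # complementary percentage for the lower rated player.
    return difference / 8 + 50


print(expected_win_percentage(1600, 1500))  # 62.5
```

With equal ratings the function returns 50, and it reaches 100 exactly at a difference of 400, so the linear part and the cutoff agree at the boundary.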
Evaluating the pairs in both orders and averaging the results allows the outcome of every game to affect the rating calculations for every pair of players. One consequence of this is that your rating is not a static figure. Games played by other people may influence your rating even if you have stopped playing. The upside to this is that ratings of inactive players should get more accurate as more games are played by other people.
Fairness
High ratings have to be earned by playing many games. They are not available through shortcuts. In a previous version of the rating system, I focused on accuracy more than fairness, which resulted in some players getting high ratings after playing only a few games. This new rating system curbs rating growth more, so that you have to win many games to get a high rating.
One way it curbs rating growth is to base the amount it changes a rating on the number of games played between two players. The more games they play together, the closer the change comes to the maximum amount a rating may be changed after comparing two players. This maximum amount is equal to the difference between expected and actual results, taken as a fraction, times 400. So the amount ratings may change in one go is limited to a range of 0 to 400. The amount of change is further limited by the number of games each player has already played. The more past games a player has played, the more his rating is considered stable, making it less subject to change.
Algorithm
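As a partial illustration only, the cap on rating change described under Fairness can be sketched as below. This is not the page's actual algorithm. The function name `max_rating_change` is hypothetical, and the way the change is scaled down by the number of games between two players, or by each player's past games, is not specified in the text and is left out.

```python
def max_rating_change(expected_pct: float, actual_pct: float) -> float:
    """Upper bound on the rating adjustment after comparing two players.

    Both arguments are percentages of games won, from 0 to 100.
    """
    # Difference between expectation and actual result, as a fraction from 0.0 to 1.0.
    gap = abs(actual_pct - expected_pct) / 100.0
    # The text states the maximum change is this fraction times 400.
    return gap * 400.0


# A player expected to win 62.5% of games who actually won all of them:
print(max_rating_change(62.5, 100.0))  # 150.0
```

The result always falls between 0 and 400, matching the range given above; any real adjustment would be at most this amount once the games-played limits are applied.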
Written by Fergus Duniho
WWW Page Created: 6 January 2006