7. Established Ratings

Established rating formulas in the Elo System are thought to be suited for ratings based on reasonably large samples, where percentage expectancy can be reliably inferred from rating differences or ratios.  This is a curious view that takes little account of actual rating practice.  In fact, established formulas follow as close approximations of performance formulas and require no particular assumptions about probability.  They are generally easier to evaluate as change formulas, and therein lies their primary usefulness.  The key concept in this conundrum is percentage expectancy, which will be explored in more detail further on.  It may be taken for the moment as a ponderous term for a simple concept, namely, percentage score viewed as a function of rating difference or ratio.  If the performance rating is a function of the mean opposition rating and the percentage score against that opposition, as in the case of interval ratings,

[7.1]         R  =  ERc + K(2P - 1) ,

then the percentage expectancy for the difference R - ERc is

[7.2]         Pe  =  (R  -  ERc) / (2K) + .5 ,

which is no more mysterious than algebraic manipulation. 
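The round trip between [7.1] and [7.2] can be sketched in a few lines of Python.  This is only an illustration: the function names are hypothetical, and the values of ERc and K (1500 and 400) are assumed for the example, not taken from any rating system.

```python
def performance_rating(mean_opp, p, k):
    """Linear performance rating, formula [7.1]: R = ERc + K(2P - 1)."""
    return mean_opp + k * (2.0 * p - 1.0)


def percentage_expectancy(r, mean_opp, k):
    """Formula [7.2], the algebraic inverse of [7.1]: Pe = (R - ERc)/(2K) + .5"""
    return (r - mean_opp) / (2.0 * k) + 0.5


# Illustrative values only (assumptions, not from the text).
mean_opp = 1500.0   # ERc, mean rating of the opposition
k = 400.0           # K, the scale constant of the linear formula

r = performance_rating(mean_opp, 0.75, k)     # 1500 + 400 * 0.5 = 1700.0
pe = percentage_expectancy(r, mean_opp, k)    # 200 / 800 + 0.5 = 0.75
print(r, pe)
```

Feeding the rating produced by [7.1] back into [7.2] returns the percentage score that produced it, which is all that "percentage expectancy" means at this point.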

The established formula is essentially a method for combining ratings.  As we have seen, rating formulas may be written as arithmetic averages.  If we combine a rating Ro based on No games with a rating R based on N games, the new rating may be written as the weighted average

[7.3]         Rn  =  (RoNo + RN) / (No + N) .

This formula can be applied recursively, with the original sample No increasing without limit.  At some point in this process, if No is sufficiently large, it can be arbitrarily held at a constant value with a small loss of precision.  The formula then becomes

[7.4]         Rn  =  [Ro(No - N) + RN] / No  ,

which is Elo's "blending process" [E1, 8.63, described in 8.25].  Writing the original rating Ro as the identity

[7.5]          Ro  =  [Ro(No - N) + RoN ] / No  ,

a change formula follows immediately as

[7.6]        DR  =  Rn - Ro  =  (R - Ro) (N / No) .

This formula simplifies the averaging process and restricts the sample weight of the original rating by attenuation.  The extent of simplification is illustrated by substitution of linear values into [7.6] for the event rating, 

[7.7]        R  =  ERc + K(2P - 1) ,

and for the original rating in terms of percentage expectancy,

[7.8]        Ro  =  ERc + K(2Pe - 1) ,

giving the simple result

[7.9]         DR  =  2K(P - Pe) (N / No) .
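As a numerical check, the blending formula [7.4], the change formula [7.6], and the linear result [7.9] can be compared directly.  The sketch below uses hypothetical function names and assumed values for ERc, K, N, No, P and Pe; it shows only that the three expressions agree when the linear formulas [7.7] and [7.8] are used.

```python
def blend(r_old, r_event, n, n0):
    """Blending formula [7.4]: Rn = [Ro(No - N) + RN] / No."""
    return (r_old * (n0 - n) + r_event * n) / n0


def change(r_old, r_event, n, n0):
    """Change formula [7.6]: DR = (R - Ro)(N / No)."""
    return (r_event - r_old) * n / n0


def linear_change(p, pe, n, n0, k):
    """Linear change formula [7.9]: DR = 2K(P - Pe)(N / No)."""
    return 2.0 * k * (p - pe) * n / n0


# Illustrative values only (assumptions, not from the text).
mean_opp, k = 1500.0, 400.0   # ERc and K
n, n0 = 10, 40                # event size N and sampling weight No
p, pe = 0.70, 0.60            # percentage score and percentage expectancy

r_event = mean_opp + k * (2.0 * p - 1.0)    # [7.7]
r_old = mean_opp + k * (2.0 * pe - 1.0)     # [7.8]

dr_blend = blend(r_old, r_event, n, n0) - r_old
dr_change = change(r_old, r_event, n, n0)
dr_linear = linear_change(p, pe, n, n0, k)

# All three agree (20 rating points, up to floating-point rounding).
print(dr_blend, dr_change, dr_linear)
```

Note that ERc cancels out of [7.9]: once the expectancy Pe is known, the change depends only on the score difference, K, and the ratio N / No.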

Results are more complicated, of course, with nonlinear formulas.  Elo chose a differential version of formula [7.6],

[7.10]          DR  =  R' (P - Pe) (N / No) ,

where R' is the derivative of the basic rating formula with respect to P.  With the score W = NP and the expected score We = NPe, this simplifies to

[7.11]          DR  =  R' (W - We) / No ,

which is Elo's established rating formula.  This differential is a reasonable approximation for small rating changes, which again require a large value for No.  The derivative of the Percentage Expectancy Curve is taken from the average slope of "the most used portion," roughly 1 percentage point over 8 points of rating difference [E1, 8.25].  It can be calculated precisely from [5.4] by converting first to natural logarithms, giving

[7.12]     R   =   ERc + (400 / ln10) . ln(P / Pc) ,     P > 0,  Pc > 0.

The derivative with respect to P then follows as

[7.13]        R'   =   400 / (ln10 . P . Pc) ,              P > 0,  Pc > 0,

which is the inverse of Elo's rendering of the Verhulst distribution [E1, 8.43].  The derivative for the normal version of the Percentage Expectancy Curve is given by the normal density function [E1, 8.22], which is formidable enough to warrant the approximation proposed by Elo. 
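The derivative [7.13] can be checked numerically against [7.12], and then used in the established formula [7.11].  The sketch below is illustrative only: the function names are hypothetical, Pc is assumed to be the complementary score 1 - P, and the values of ERc, W, We and No in the example are assumptions.  The slope of 800 in the last line is the "1 percentage point over 8 points" approximation quoted above.

```python
import math


def rating_from_score(mean_opp, p):
    """Formula [7.12], assuming Pc = 1 - P (an assumption of this sketch)."""
    return mean_opp + (400.0 / math.log(10.0)) * math.log(p / (1.0 - p))


def rating_slope(p):
    """Formula [7.13]: R' = 400 / (ln10 . P . Pc), with Pc = 1 - P."""
    return 400.0 / (math.log(10.0) * p * (1.0 - p))


def numeric_slope(mean_opp, p, h=1e-6):
    """Central finite difference of [7.12], used only as a cross-check."""
    upper = rating_from_score(mean_opp, p + h)
    lower = rating_from_score(mean_opp, p - h)
    return (upper - lower) / (2.0 * h)


def established_change(w, we, n0, slope):
    """Established formula [7.11]: DR = R'(W - We) / No."""
    return slope * (w - we) / n0


# The analytic and numeric slopes should agree closely at any 0 < P < 1.
for p in (0.5, 0.6, 0.75):
    print(p, rating_slope(p), numeric_slope(1500.0, p))

# Illustrative event (assumed values): W = 6.5, We = 5.0, No = 50.
print(established_change(6.5, 5.0, 50, 800.0))   # 800 * 1.5 / 50 = 24.0
```

At P = .5 the exact slope from [7.13] is 1600 / ln10, a little under 700, so the rounded figure of 800 reflects an average over a wider range of scores than the midpoint alone.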

The arbitrary constant No is sometimes thought of as the sample size of the original established rating, which is to be balanced against the size of the performance sample N for "reasonable confidence" when ratings are combined.  This overlooks the recursive nature of the process, which is somewhat obscured by Elo's notation.  Assuming for the sake of simplicity events of equal size N, a performance that starts with a sample weight of

                                  N / No  

has a sample weight after q calculations of 

                    (N / No) (1  -  N / No)^(q-1) .

The sample weight of an event never quite disappears, and the number of events on which the established rating is based may become indefinitely large.  We have used the term sampling weight for the constant No to distinguish it from an ordinary sample weight.
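The decay of an event's sample weight can be verified by applying the blending formula [7.4] repeatedly.  The sketch below uses hypothetical function names and assumed values of N and No.  It relies on a simple indicator trick: the original rating and all later events are set to 0 and the event of interest to 1, so that the blended result is exactly the weight that event still carries.

```python
def blend(r_old, r_event, n, n0):
    """Blending formula [7.4]: Rn = [Ro(No - N) + RN] / No."""
    return (r_old * (n0 - n) + r_event * n) / n0


def closed_form_weight(q, n, n0):
    """Sample weight after q calculations: (N / No)(1 - N / No)^(q-1)."""
    return (n / n0) * (1.0 - n / n0) ** (q - 1)


def simulated_weight(q, n, n0):
    """Weight of the first event after q calculations, by direct recursion."""
    rating = blend(0.0, 1.0, n, n0)          # first calculation: weight N / No
    for _ in range(q - 1):                   # each later event attenuates it
        rating = blend(rating, 0.0, n, n0)
    return rating


# Illustrative values only (assumptions): events of N = 10 games, No = 40.
n, n0 = 10, 40
for q in range(1, 6):
    print(q, simulated_weight(q, n, n0), closed_form_weight(q, n, n0))
```

With these values the weight starts at .25 and is multiplied by .75 at each subsequent calculation, so it shrinks geometrically but never reaches zero.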