7. Established Ratings

Established rating formulas in the Elo System are thought to be suited for ratings based on reasonably large samples, where percentage expectancy can be reliably inferred from rating differences or ratios.  This view is at odds with actual rating practice.  Established formulas are developed as close approximations of performance formulas and require no particular assumptions about probability.  They are generally easier to evaluate as change formulas, and therein lies their primary usefulness.  The key concept in this conundrum is percentage expectancy, which will be explored in more detail further on.  It may be taken for the moment as a ponderous term for a simple concept:  namely, percentage score viewed as a function of rating difference or rating ratio.  In the case of interval ratings, if the performance rating is a function of the mean opposition rating and the percentage score against that opposition, as

[7.1]         R  =  ERc + K(2P - 1) ,

then the percentage expectancy for the difference R - ERc is

[7.2]         Pe  =  (R  -  ERc) / (2K) + .5 ,

which is nothing more mysterious than an algebraic rearrangement of [7.1], with the percentage score P read as the expectancy Pe.
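The relationship between [7.1] and [7.2] can be checked numerically. The following is a minimal sketch; the function names and the sample values (K = 400, a mean opposition rating of 1500, a 75 percent score) are assumptions chosen for illustration and do not come from the text.

```python
def performance_rating(mean_opp, score_fraction, k):
    """Formula [7.1]: R = ERc + K(2P - 1)."""
    return mean_opp + k * (2.0 * score_fraction - 1.0)


def percentage_expectancy(rating, mean_opp, k):
    """Formula [7.2]: Pe = (R - ERc) / (2K) + .5."""
    return (rating - mean_opp) / (2.0 * k) + 0.5


# Hypothetical inputs, for illustration only.
mean_opp = 1500.0   # ERc, mean opposition rating
k = 400.0           # K, the scale constant of the linear formula
p = 0.75            # P, percentage score as a fraction

r = performance_rating(mean_opp, p, k)
pe = percentage_expectancy(r, mean_opp, k)
print(r, pe)
```

Feeding the rating from [7.1] back into [7.2] returns the original percentage score, which is all that the term percentage expectancy asserts for the linear case.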

The established formula is essentially a method for combining ratings.  As we have seen, rating formulas may be written as arithmetic averages.  If we combine a rating Ro based on No games with a rating R based on N games, the new rating may be written as the weighted average

[7.3]         Rn  =  (RoNo + RN) / (No + N) .

This formula can be applied cumulatively, with the original sample No growing without limit.  At some point in this process, once No is sufficiently large, it can be held at an arbitrary constant value with only a small loss of precision.  The term sampling weight will be used hereafter for the constant No, to distinguish it from an ordinary sample size.  The formula then becomes

[7.4]         Rn  =  [Ro(No - N) + RN] / No  ,

which is Elo's "blending process" [E1, 8.63, described in 8.25].  Writing the original rating Ro as the identity

[7.5]          Ro  =  [Ro(No - N) + RoN ] / No  ,

a change formula follows immediately as

[7.6]        DR  =  Rn - Ro  =  (R - Ro) (N / No) .
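That the change formula [7.6] is the blending formula [7.4] in another form can be confirmed with a short calculation. This is a sketch only; the function names and the numbers (an original rating of 1600, a performance rating of 1700 over 5 games, a sampling weight of 25) are hypothetical.

```python
def blend(r_old, r_perf, n, n_weight):
    """Formula [7.4]: Rn = [Ro(No - N) + RN] / No."""
    return (r_old * (n_weight - n) + r_perf * n) / n_weight


def rating_change(r_old, r_perf, n, n_weight):
    """Formula [7.6]: DR = (R - Ro)(N / No)."""
    return (r_perf - r_old) * (n / n_weight)


# Hypothetical inputs, for illustration only.
r_old = 1600.0    # Ro, original rating
r_perf = 1700.0   # R, performance rating in the new event
n = 5             # N, games in the new event
n_weight = 25     # No, sampling weight

r_new = blend(r_old, r_perf, n, n_weight)
delta = rating_change(r_old, r_perf, n, n_weight)
print(r_new, r_old + delta)
```

Both routes give the same new rating, so the change formula loses nothing relative to the weighted average.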

This formula simplifies the averaging process and restricts the sample weight of the original rating by attenuation.  The simplification is pursued further by substituting linear formulas for R and Ro,

[7.7]        R  =  ERc + K(2P - 1) ,

and for the original rating in terms of percentage expectancy,

[7.8]        Ro  =  ERc + K(2Pe - 1) ,

giving the simple result

[7.9]         DR  =  2K(P - Pe) (N / No) .
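The substitution leading to [7.9] can be verified by computing the change both ways. The sketch below assumes illustrative values (K = 400, mean opposition 1500, N = 5, No = 25, P = .6, Pe = .625); none of these figures come from the text.

```python
def linear_rating(mean_opp, fraction, k):
    """Linear formula of [7.7] and [7.8]: rating = ERc + K(2p - 1)."""
    return mean_opp + k * (2.0 * fraction - 1.0)


def linear_change(p, pe, n, n_weight, k):
    """Formula [7.9]: DR = 2K(P - Pe)(N / No)."""
    return 2.0 * k * (p - pe) * (n / n_weight)


# Hypothetical inputs, for illustration only.
mean_opp, k = 1500.0, 400.0
n, n_weight = 5, 25
p, pe = 0.6, 0.625

r_perf = linear_rating(mean_opp, p, k)    # R from [7.7]
r_old = linear_rating(mean_opp, pe, k)    # Ro from [7.8]

change_direct = (r_perf - r_old) * (n / n_weight)     # formula [7.6]
change_linear = linear_change(p, pe, n, n_weight, k)  # formula [7.9]
print(change_direct, change_linear)
```

Under these assumptions both expressions come to about -4 rating points; the mean opposition rating cancels out, which is why it does not appear in [7.9].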

Results are more complicated with nonlinear formulas.  Elo chose a differential version of Formula [7.6],

[7.10]          DR  =  R' (P - Pe) (N / No) ,

where R' is the derivative of the basic rating formula with respect to P.  Since the score W equals NP and the expected score We equals NPe, this simplifies to

[7.11]          DR  =  R' (W - We) / No .
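A brief numerical sketch shows that [7.10] and [7.11] agree once scores are written as W = NP and We = NPe. It assumes R' = 800, the round figure the text attributes to Elo's slope approximation, together with a hypothetical sampling weight of 25 and a hypothetical 5-game event; the function name is likewise an illustration.

```python
def established_change(score, expected_score, r_prime, n_weight):
    """Formula [7.11]: DR = R'(W - We) / No."""
    return r_prime * (score - expected_score) / n_weight


# Assumed values, for illustration only.
r_prime = 800.0   # R', taken here as the linear approximation of 800
n_weight = 25     # No, hypothetical sampling weight
n = 5             # N, games in the event
score = 3.5       # W, actual score
expected = 3.0    # We, expected score

change = established_change(score, expected, r_prime, n_weight)

# The same change via [7.10], using percentage scores P = W/N and Pe = We/N.
change_pct = r_prime * (score / n - expected / n) * (n / n_weight)

# The ratio R'/No is a single constant for fixed R' and No.
ratio = r_prime / n_weight
print(change, change_pct, ratio)
```

With these assumed values the change is about 16 points by either formula, and the ratio R' / No is 32.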

A single constant K customarily combines the derivative and the sampling constant, giving

[7.12]          DR  =  K (W - We) ,

which is Elo's established rating formula.  The differential [7.11] is a reasonable approximation for small rating changes, which again requires a large value for the sampling weight.  The derivative R' is the inverse of the derivative of the Percentage Expectancy Curve, which is approximated as the average slope of "the most used portion," roughly 1 percentage point over 8 points of rating difference [E1, 8.25].  R' thus evaluates to the reciprocal of this approximation, 800 rating points.  For logistic ratings, R' is the inverse of the derivative of Elo's logistic formula (46), which is his Verhulst formula (45).  A precise value for R' is found by differentiating the logistic formula.  As a function of D, this may be written

[7.13]      P  =  1 / (1 + 10^(-D/2C)) .

It follows that

[7.14]       10^(D/2C)  =  P / (1 - P) .

Taking the common log of each side

[7.15]       D / 2C  =  log10 [P / (1 - P)] .

Multiplying by 2C and converting to natural logs,

[7.16]       D  =  (2C / ln10) . ln [P / (1 - P)]  .

The derivative with respect to P follows as

[7.17]       dD / dP  =  2C / (ln10 . P . [1 - P]) .

Since D is measured from a constant opposition rating, this is also the derivative of R with respect to P.  Thus,

[7.18]      R'  =  2C / (ln10 . P . [1 - P]) .
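The derivation of [7.13] through [7.18] can be checked numerically by comparing the closed form for R' with a finite-difference estimate. The sketch below assumes C = 200, so that 2C = 400 as in the familiar form of the logistic curve; that value and the function names are assumptions made for the example.

```python
import math


def logistic_expectancy(d, c=200.0):
    """Formula [7.13]: P = 1 / (1 + 10^(-D/2C))."""
    return 1.0 / (1.0 + 10.0 ** (-d / (2.0 * c)))


def rating_difference(p, c=200.0):
    """Formula [7.16]: D = (2C / ln10) ln[P / (1 - P)]."""
    return (2.0 * c / math.log(10.0)) * math.log(p / (1.0 - p))


def r_prime(p, c=200.0):
    """Formula [7.18]: R' = 2C / (ln10 . P . [1 - P])."""
    return 2.0 * c / (math.log(10.0) * p * (1.0 - p))


p = 0.5
h = 1e-6  # step for the central difference

# Central-difference estimate of dD/dP, to compare with [7.18].
numeric = (rating_difference(p + h) - rating_difference(p - h)) / (2.0 * h)
analytic = r_prime(p)
print(numeric, analytic)

# Round trip: [7.16] should invert [7.13].
print(logistic_expectancy(rating_difference(0.75)))
```

Under the assumption C = 200, R' at P = .5 comes to roughly 695 rating points, somewhat below the round figure of 800 obtained from the linear approximation, and it grows as P moves away from .5.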

The inverse of [7.17] is

[7.19]     dP / dD  =  ln10 . P . (1 - P) / (2C) ,

which is a simplified version of Elo's Verhulst formula [E1, 8.43], although Elo's formula properly gives the derivative in terms of D.

The sampling weight in Elo's blending method is sometimes thought of as the maximum sample size of the original performance rating, which is to be balanced against the size N of a further performance sample for proper weighting of results.  This analysis overlooks the recursive nature of the process, which is somewhat obscured by Elo's notation.  Assuming for the sake of simplicity events of equal size N, a performance that starts with a sample weight of

                                  N / No  

has a sample weight after q calculations of 

                    (N / No) (1  -  N / No)^(q-1) .
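The decay of an event's sample weight can be tabulated directly, and cross-checked by applying the change formula [7.6] recursively. This is a sketch under stated assumptions: N = 5 as in the text, while the sampling weights 25, 50, and 100 and the function names are hypothetical choices for illustration.

```python
def sample_weight(n, n_weight, q):
    """Weight of an event after q calculations: (N/No)(1 - N/No)^(q-1)."""
    share = n / n_weight
    return share * (1.0 - share) ** (q - 1)


def weight_by_recursion(n, n_weight, q):
    """Trace one event's weight by applying [7.6] q times.

    The event is entered as a performance of 1.0 against an original
    rating of 0.0; the q - 1 later events are entered as performances
    of 0.0, so what remains of the rating is the first event's weight.
    """
    share = n / n_weight
    rating = 0.0
    rating += (1.0 - rating) * share      # first calculation
    for _ in range(q - 1):                # q - 1 further calculations
        rating += (0.0 - rating) * share
    return rating


n = 5
for n_weight in (25, 50, 100):  # hypothetical sampling weights
    weights = [round(sample_weight(n, n_weight, q), 4) for q in (1, 2, 5, 10)]
    print(n_weight, weights)
```

The recursion reproduces the closed form, and the weight shrinks geometrically without ever reaching zero.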

The sample weight of an event never quite disappears as the number of events on which the established rating is based becomes indefinitely large.  The formula is plotted below for some typical values of sampling weight, with N = 5.