8. Established Ratings

Established ratings are an alternate method of calculation replacing performance ratings when the latter become unwieldy.  To understand them it is necessary to begin with the performance formula, which Elo gives as his first formula,

[8.1]        Rp  =  Rc + D(P) .

"D(P) is to be read as the difference based on the percentage score P, which is obtained from the curve or table" [E1, 1.51].  We have seen in the discussion of probability that the normal curve is a questionable basis for D(P) in the interval system, and in the ratio system it is not clear that D(P) is a probability function.  We proceed, in any case, to the established rating. 

Established ratings are calculated by a change formula arising from the mathematical procedure known as cumulative averaging.  We have seen from the discussion of interval ratings that performance ratings may be treated as averages.  Given an old rating Ro based on No games and a performance rating R based on N games, the ratings are combined as

[8.2]         Rn  =  (RoNo + RN) / (No + N) .

The old rating may be represented by the identity

[8.3]         Ro  =  (RoNo + RoN) / (No + N) .

Subtracting Ro from Rn gives

[8.4]         DR  =  ([R - Ro] N) / (No + N) ,

and repeating this process gives a recursive change formula for cumulative averaging.  Rather than allowing the old sample to increase indefinitely, we can maintain it at a constant value.  This we call the sampling constant to distinguish it from ordinary samples.  The resulting error in cumulative averaging is negligible if the constant is sufficiently large.  If we substitute No – N for No in [8.2] we get

[8.5]         Rn  =  [Ro(No – N) + RN] / No ,

which is Elo's "blending process" [E1, 8.63, described in 8.25].  Writing the original rating Ro as the identity

[8.6]          Ro  =  [Ro(No – N) + RoN ] / No  ,

a change formula follows by subtraction as

[8.7]        DR  =  (R - Ro) (N / No) ,              N  <  No .
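
The averaging formulas above can be checked numerically.  The following is a minimal sketch in Python, not part of Elo's text; the function names and the figures (an old rating of 1500 on 30 games, a performance of 1600 on 5 games) are assumptions chosen only for illustration.

```python
def combine(r_old, n_old, r_perf, n):
    """[8.2]: cumulative average of an old rating and a new performance."""
    return (r_old * n_old + r_perf * n) / (n_old + n)


def change_growing(r_old, n_old, r_perf, n):
    """[8.4]: rating change when the old sample is allowed to grow."""
    return (r_perf - r_old) * n / (n_old + n)


def blend(r_old, n_const, r_perf, n):
    """[8.5]: blending with the old sample held at the sampling constant."""
    return (r_old * (n_const - n) + r_perf * n) / n_const


def change_constant(r_old, n_const, r_perf, n):
    """[8.7]: rating change with a sampling constant; requires N < No."""
    if not n < n_const:
        raise ValueError("[8.7] assumes N < No")
    return (r_perf - r_old) * n / n_const


# Hypothetical figures, for illustration only.
r_old, n_old, r_perf, n = 1500.0, 30, 1600.0, 5

new_avg = combine(r_old, n_old, r_perf, n)
dr_84 = change_growing(r_old, n_old, r_perf, n)
new_blend = blend(r_old, n_old, r_perf, n)
dr_87 = change_constant(r_old, n_old, r_perf, n)

print(f"[8.2] Rn = {new_avg:.2f}   [8.4] DR = {dr_84:.2f}")
print(f"[8.5] Rn = {new_blend:.2f}   [8.7] DR = {dr_87:.2f}")
```

In each pair the change formula reproduces the difference between the new and old ratings; the two pairs differ only in the denominator, No + N in [8.4] against No in [8.7].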

A more accurate procedure is to hold No constant in [8.4], which puts no restraint on N.  The rating change thus far has been a straightforward application of cumulative averaging, but at this point complications arise in Elo's procedure.  The rating difference R - Ro becomes a change in rating difference by subtracting the old performance rating,

[8.8]            Ro  =  Rc + D(Po) ,

from the new performance rating,

[8.9]            Rp  =  Rc + D(P) ,

and "by making the simplifying assumption that Rc is the same for both samples,"

[8.10]         DR  =  [D(P) - D(Po)] N / No .

The difference in D represents the change in rating, but it is the difference in percentage score that is "read from the percentage expectancy curve, with slope S" to get the rating change, which presumably means that the derivative of the curve is applied to the percentage difference.  The entire procedure, beginning with [8.7], is somewhat roundabout.  It is clear from [8.9] that P is the percentage score encountered in the new performance, and from [8.8] that Po can be calculated from [8.1] using the new opposition rating.  The difference in percentage score can then be translated into a rating change using the derivative of the performance rating with respect to P, which in this case amounts to the inverse slope of the Percentage Expectancy Curve.  The established formula becomes

[8.11]          DR  =  R' (P - Pe) (N / No) ,

where R' is the derivative of the basic rating formula with respect to P, and Pe is the expected percentage score, that is, Po recalculated against the new opposition.  The result is the same, but the new procedure offers the possibility of dispensing with the probability distribution altogether.  It furthermore suggests a new interpretation of the established rating.  Since the change in percentage score arises from cumulative averaging, it may be regarded as itself the result of cumulative averaging.  On the assumption that percentage scores tend to long-term limiting values, the old percentage score represents an expected value which is corrected by the additional score.  For nonlinear ratings in general, formula [8.11] becomes a differential yielding an approximate value for rating change where the difference in percentage score is small.  With W = PN the actual score and We = PeN the expected score over the N games, the differential simplifies by multiplication to

[8.12]          DR  =  R' (W - We) / No .

A constant K customarily combines the derivative and the sampling constant, giving

[8.13]          DR  =  K (W - We) ,

which is Elo's established rating formula.  When D(P) in [8.1] is based on the Percentage Expectancy Curve, P(D), the derivative R' is taken from the inverse function.  The derivative of the curve is approximated from the average slope of "the most used portion," roughly 1 percentage point over 8 points of rating difference [E1, 8.25].  R' is the reciprocal of this approximation, 800 rating points.  This simplified derivative applies to either the normal curve or the logistic function, and the complicated Verhulst distribution, Elo's formula (45), goes by the board. 
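
The equivalence of [8.11] and [8.13] under the constant derivative can be illustrated with a short calculation.  This is a sketch only: the sampling constant of 25 and the event figures (6.5 points scored against an expected 5.0 in 10 games) are hypothetical, and the expected score is simply supplied as a number rather than read from any expectancy curve.

```python
R_PRIME = 800.0  # rating points per unit of percentage score (text's slope approximation)


def k_factor(n_const, r_prime=R_PRIME):
    """K combines the derivative R' and the sampling constant No."""
    return r_prime / n_const


def change_811(p, p_exp, n, n_const, r_prime=R_PRIME):
    """[8.11]: DR = R' (P - Pe) (N / No), with P and Pe as fractions of 1."""
    return r_prime * (p - p_exp) * (n / n_const)


def change_813(w, w_exp, n_const, r_prime=R_PRIME):
    """[8.13]: DR = K (W - We), with W and We as game scores."""
    return k_factor(n_const, r_prime) * (w - w_exp)


# Hypothetical event, for illustration only.
n, n_const = 10, 25
w, w_exp = 6.5, 5.0

dr_a = change_811(w / n, w_exp / n, n, n_const)
dr_b = change_813(w, w_exp, n_const)

print(f"K = {k_factor(n_const):.0f}")
print(f"[8.11] DR = {dr_a:.2f}   [8.13] DR = {dr_b:.2f}")
```

With R' = 800, a sampling constant of 25 corresponds to K = 32 and a constant of 50 to K = 16; the two formulas agree because (P - Pe) N = W - We.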

The sampling constant in the Elo System, No, is taken to be the maximum sample size for the performance rating, which is then balanced against the size of a further performance sample, N, in the established rating.  To establish an appropriate value for the sampling constant, Elo compared No to N with respect to sampling error [E1, 8.28].  This analysis overlooks the recursive nature of the established rating, which is somewhat obscured by the notation.  Assuming for the sake of simplicity events of equal size N, a performance that starts with a sample weight of

                                  N / No  

has a sample weight after q calculations of 

                    (N / No) (1  -  N / No)^(q-1) .

The sample weight of an event never quite disappears as the number of events on which the established rating is based becomes indefinitely large.  The formula is plotted below for some typical values of the sampling constant, with N = 5.
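
The decay of the sample weight can also be tabulated directly.  The sketch below evaluates the closed form and checks it against the recursion Rn = Ro + (R - Ro)(N / No) applied event by event.  The event size N = 5 follows the text; the sampling constants 20, 30 and 50 and the function names are assumptions made for illustration.

```python
def sample_weight(n, n_const, q):
    """Weight of an event of n games after q calculations (q = 1 is its own)."""
    a = n / n_const
    return a * (1 - a) ** (q - 1)


def recursive_weights(n, n_const, events):
    """Track the weight each event carries after `events` recursive updates.

    Each update scales every earlier weight by (1 - N/No) and gives the
    newest event the weight N/No.  Returns the list, oldest event first.
    """
    a = n / n_const
    weights = []
    for _ in range(events):
        weights = [w * (1 - a) for w in weights]
        weights.append(a)
    return weights


N = 5  # event size used in the text
for n_const in (20, 30, 50):  # hypothetical sampling constants
    row = [round(sample_weight(N, n_const, q), 4) for q in (1, 5, 10, 20)]
    print(f"No = {n_const}: weights at q = 1, 5, 10, 20 -> {row}")
```

The weights of all events, together with the residual weight (1 - N / No)^q left on the starting rating, sum to one after q calculations, which is a convenient check on the recursion.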