3. Ordinal Ratings

 

     

   

Ratings that are based on an ordinal scale are called ranks in ordinary mathematical parlance.  The process of ranking is based on the dominance relation, indicated in a dominance graph by an arrow or directed edge drawn from the dominant to the subordinate player or node.  Dominance in a competitive setting means prevailing over an opponent by some established criterion, such as a win or winning percentage.  A competing field is represented by a complete dominance graph when all pairings are accounted for.  In a partial dominance graph there are edges missing, corresponding to pairings for which there are no decisive results.  The classic ranking problem is to assign ranks to the nodes of a dominance graph so as to minimize violations, in this case dominance by lower-ranking players. The problem has been shown to be NP-complete [GJ]. That is, barring an earth-shaking breakthrough, there is simply no ranking algorithm to be discovered that is both exact and efficient for data of significant quantity. Rankings of twenty or so players can be optimized by brute force, but as the number of ranks increases, the degree of complexity increases exponentially.  Finding a ranking that minimizes violations soon becomes impractical, even for a high-speed computer. 

For some NP-complete problems, difficulties can be overcome by restating the conditions of the problem, though perhaps at the risk of trivializing it. It turns out that there is an efficient algorithm for ranking so as to minimize the algebraic sum of directed rank differences. A directed (signed) difference in rank is taken from the rank of the subordinate player to the rank of the dominant player as 

            rank (dominant) - rank (subordinate) .

For expected results the direction is negative since the higher ranks are smaller. Such results are desirable from the standpoint of ranking because they tend to minimize directed rank differences. For upsets the direction is positive. The algorithm for minimizing the algebraic sum of directed differences in a ranking consists simply of sorting the players in descending order according to the difference

               Wd - Ld

with respect to each player, where Wd is the number of opponents dominated by the player and Ld is the number of opponents who dominate the player. (The notation suggests "wins" and "losses," but only as a mnemonic device. The subscript indicates dominance as the essential criterion.)  The algorithm should be applied as a tiebreaker whenever possible, in which case the dominance relations among players with equal scores are considered.  

It is difficult to prove the efficiency of sorting by Wd - Ld to minimize directed rank differences without being long-winded.  A wordy sketch will suffice for the present purpose.  Note that when a player moves up in rank, the sum of his directed rank differences is decremented for each player dominated by him (that is, the sum is decreased by Wd) and is incremented for each player who dominates him (that is, the sum is increased by Ld).  Similarly, when a player moves down in rank, the sum of his directed rank differences increases by Wd and decreases by Ld.  For any adjacent pair of players, A and B, where A is the higher ranked, if

               Wd(A) - Ld(A)  <  Wd(B) - Ld(B) ,

a swap will produce an overall decrease in directed rank difference.  By an appropriate sequence of swaps, a ranking can be produced that minimizes the algebraic sum of directed rank differences.  This is best seen by working out an actual case.

A statistic related to Wd - Ld is dominance percentage Pd, defined as the percentage of the playing field dominated by the player in question, plus half the percentage which stands in no dominance relation with that player (either dominant or subordinate), including for the sake of generality the player himself. It is convenient to think of this as the percentage score that would result from a single game with every other player in the competing field, assuming a draw against any player not actually encountered, including the player himself.  If n is the total number of players in the field,

[3.1]          Pd  =  (Wd + .5Dd) / n,

where Dd is the number of players who do not stand in a dominance relation to the player in question.  Since a player does not stand in a dominance relation with himself, Pd will never have a value of 0 or 1.  More importantly, Wd - Ld is a linear transformation of Pd.  From the definition of dominance percentage,

[3.2]         Wd  =  nPd - .5Dd

and, by process of elimination,

[3.3]          Ld  =  n - Wd - Dd .

Substituting [3.2] into [3.3] ,

[3.4]          Ld  =  n - nPd - .5Dd

and subtracting [3.4] from [3.2],

[3.5]          Wd - Ld   =  2n (Pd) - n , 

 which is the linear transformation claimed.

Sorting the field by Pd is therefore equivalent to sorting the field by Wd - L and has the same effect in minimizing directed rank differences.  One application of this principle is that it explains the efficiency of a round-robin tournament, which has long been recognized as a rational means for ordering the performances of its participants.  The percentage scores resulting from a round robin, including half points for draws, are equivalent to dominance percentages, and their natural order consequently minimizes directed rank differences.  The application may be extended to partial tournaments, where the number of games played by each contestant may vary, but here it does not yield much insight into individual performance.  An interesting illustration is the Mannheim 1914 tournament, an 18-player round-robin that was abandoned after 11 rounds with the outbreak of the First World War [HW, "auxiliary scoring methods"] .  Alekhine had the best score at that point with 9.5 points.  It has been argued that Vidmar with 8.5 points actually outperformed Alekhine because he had met stronger opposition, and this is confirmed by several auxiliary scoring methods, including Neustadtl and Solkoff.  Sorting the field by Pd or by Wd - L merely confirms the ordering of performances by standard point scores.  The dominance percentage for Alehine was 13/18, and for Vidmar 12/18.

The dominance percentage statistic suggests a natural ordering of performances, expressed by the relation

[3.6]          Q  @  Pd ,

where Q is the quantile (e.g., percentile) position of the particular value Pd in the sorted competing field.  The relation is almost never an exact equality, but it is analogous to the basic formulas of more advanced interval and ratio rating systems, to be introduced under subsequent headings.  The relation motivates an interesting ranking system, in which ranks are determined sequentially as outcomes are presented.  In this sequential ranking system, a sorted list of opposition quantiles with respect to the entire field is maintained for each player, say the quantiles of his/her last N opponents.  Using the player's point score against his/her current roster of opponents, a quantile ranking based on [3.6] may be estimated.  To take a simple example, a player has in this list of opposition rankings the three quantile values

                 .23          .45          .67 .

With a win and two draws against this opposition, the dominance percentage is 2/3 (ignoring the quantile yet to be determined), and his/her Q therefore falls between .45 and .67.  This quantile, estimated say as .56, is then used in subsequent estimates of opposition quantiles, and the algorithm proceeds in self-corrective fashion.  The scheme is likely to be judged impractical as a rating system, if not hopelessly imprecise.  A similar scheme, however, was shown by this author to more be efficient, at least in random experiments, than considerably more sophisticated methods based on probability considerations [J].   The ranking system does, in any case, shed light on some of the problems of more advanced rating theory, to which we turn forthwith.