|
3. Ordinal Ratings
|
|
|
|
Ratings that are based on an ordinal scale typically take the
form of ranks, though other forms are admissible, e.g.
percentiles. The process of ranking is based on
the dominance relation, indicated in a dominance graph by
an arrow or directed edge drawn from the dominant to the
subordinate player or node. Dominance in a competitive
setting means prevailing over an opponent by some established criterion,
such as a win or winning percentage. A competing field is
represented by a complete dominance graph when all pairings are
accounted for. In a partial dominance graph there are edges
missing, corresponding to pairings for which there are no decisive
results. The classic ranking problem is to assign ranks to the
nodes of a dominance graph so as to minimize violations, in this
case dominance by lower-ranking players. The problem has been shown to
be NP-complete [GJ]. That is,
barring an earth-shaking breakthrough, there is simply no ranking
algorithm to be discovered that is both exact and efficient for data of
significant quantity. Rankings of a dozen or so players can be optimized
by brute force, but as the number of ranks increases, the degree of
complexity increases exponentially. Finding a ranking that
minimizes violations soon becomes impractical, even for a high-speed
computer.
For some NP-complete problems, difficulties can be overcome by restating the conditions of the problem, though at the risk of trivializing it. It turns out that there is an efficient algorithm for ranking so as to minimize the algebraic sum of directed rank differences. A directed (signed) difference in rank is taken from the rank of the subordinate player to the rank of the dominant player as rank (dominant) - rank (subordinate) . For expected results the direction is negative since the higher ranks are smaller. Such results are desirable from the standpoint of ranking because they tend to minimize directed rank differences. For upsets the direction is positive. The algorithm for minimizing the algebraic sum of directed differences in a ranking consists simply of sorting the players in descending order according to the difference Wd - Ld with respect to each player, where Wd is the number of opponents dominated by the player and Ld is the number of opponents who dominate the player. (The notation suggests "wins" and "losses," but only as a mnemonic device. The subscript indicates dominance. The equivalent in a dominance graph is outdegree minus indegree.) The algorithm should be applied as a tiebreaker whenever possible, in which case the dominance relations among players with equal ranks are considered. It is difficult to prove the efficiency of sorting by Wd - Ld to minimize directed rank differences without being long-winded. A wordy sketch will suffice for the present purpose. Note that when a player moves up in rank, the sum of his directed rank differences is decremented for each player dominated by him (that is, the sum is decreased by Wd) and is incremented for each player who dominates him (that is, the sum is increased by Ld). Similarly, when a player moves down in rank, the sum of his directed rank differences increases by Wd and decreases by Ld. For any adjacent pair of players, A and B, where A is the higher ranked, if Wd(A) - Ld(A) < Wd(B) - Ld(B) , a swap will produce an overall decrease in directed rank difference.
By an appropriate sequence of swaps, a ranking can be produced that
minimizes the algebraic sum of directed rank differences. This is
perhaps best seen by working out an actual case. Wd - Ld is a linear transformation of Pd, which may be demonstrated as follows: From the definition of dominance percentage, [3.2] Wd = nPd - .5Dd , and, by complement, [3.3] Ld = n - Wd - Dd . Substituting [3.2] into [3.3] , [3.4] Ld = n - nPd - .5Dd , and subtracting [3.4] from [3.2], Sorting by Wd - Ld and Pd does not minimize violations but is a useful algorithm for reducing violations to a near minimum.. A simulation by the present author pits this algorithm against a brute-force method (Downloads). It produces a sequence of one hundred round-robin tournaments, each consisting of ten players. The players may be judged to be of equal strength inasmuch as every dominance relation is decided by the equivalent of a coin toss. For each tournament, first, the players were ranked by the dominance statistics developed here, using tiebreakers by the same method where possible, and the total number of violations produced by this ranking was noted as T1. Second, a ranking was found by a brute-force method that minimized violations, and the resulting number of violations was noted as T2. The upshot was an array of one hundred values for T1 and a corresponding array of one hundred values for T2. The mean value for T1 was 11.75, and for T2 it was 9.3. Considering the onerous nature of the brute-force method, there was not a great deal of improvement over the method based on dominance statistics. A method similar to the latter was shown by the author to be more efficient in reducing total violations, at least in random simulations, than more sophisticated probability algorithms [J]. The principle of minimizing directed rank differences explains the efficiency of a round-robin
tournament, which has long been regarded as a rational means for
ordering the performances of its participants. The percentage
scores resulting from a round robin, including half points for draws,
are equivalent to dominance percentages, and their descending order
consequently minimizes directed rank differences.
The principle may be extended to partial tournaments. An interesting illustration is the The dominance percentage statistic suggests a natural ordering of performances, expressed by the relation [3.6] Q @ Pd , where Q is the quantile (e.g., percentile) position of the particular value Pd in the competing field, which is almost never exact. The relation is analogous to the basic formulas of more advanced interval and ratio rating systems, to be introduced under subsequent headings. It motivates a ranking algorithm in which quantile estimates are adjusted as outcomes occur. In this sequential ranking system, a list of opposition quantiles is maintained in ascending order for each player, say the quantiles of his/her last n opponents. Using the player's point score against the current roster of opponents, a quantile rank based on [3.6] may be estimated. Suppose a player has in the list of n opposition ranks the quantile values Q1, Q2, Q3, . . . , Qn-1, Qn such that Qi <= Qi+1 and that he/she has scored s points against the n opponents. The player's quantile would be calculated to fall above Qs, but below Qs+1, perhaps as the mean of these two values. The calculated quantile is then used in subsequent estimates of opposition quantiles, and the algorithm proceeds in self-corrective fashion. This scheme is implemented in a program available by download from ftp.rathingtheory.com in the folder "Ordinal Ratings 2." The program generates outcomes based on an arbitrary ranking of 400 players in 25 cycles of 200 games each, but users may change the experimental parameters. Results are calculated in the form of quantiles, which are then used to calculate rankings at the end of each cycle. The correlations between arbitrary rankings and calculated rankings are depicted in the graph below. It appears that a small list of opposition quantiles, three or four elements for each player instance, is adequate for generating plausible correlations. The output depicted is based on four elements in each list. Note that generating rankings from scratch presents subtle problems for the programmer. These are largely overcome by assigning initial quantiles as random values. Correlation of Calculated Ranks with Predefined Ranks
|