[EM] Second (and higher)-order methods?

Mon Apr 30 14:11:16 PDT 2012

I always thought the "Condorcet is like a round-robin athletic tournament"
analogy was weak, because individual voters don't get to go through the
round-robin and make their pairwise preferences explicit.  (As a voter, I'd
find a "better/worse" pairwise choice for all pairs easier than filling out
a ranked ballot, but that just may be because I've been making pairwise
choices between the ophthalmologist's lenses since I was six.) N x (N-1) "A
or B" choices is an easier way to fill out a ballot  than "rank A1,A2,A3."
so no matter what method you use to translate my ranked ballot into pairwise
comparisons I have no way to know if you counted my A<>B preferences the way
I would have.

The sports analogy caught my attention because one of my main interests for
a decade or so  has been analyzing why and how college sports rating systems
work. They EXIST precisely because there ISN'T a round-robin that a voting
system can be compared to.  

What (US) college sports ratings systems do to account for the fact that the
games-played pairwise matrix is "sparse" is to apply some algorithm to the
directed games graph based upon the number of one-way connections A[i] -> .
-> A[j] compared to the number of A[j] -> . -> A[i] paths for some number of
"->"s. 

What is called here the "pairwise matrix" (call it PM) is PM[I,j]= the
number of ballots where alternative i is ranked better than alternative j
and PM[j,i] is the number of ballots where alternative j is ranked better
than alternative i.  In sports ratings, the beginning point is an "incidence
matrix" where IM[i,j] is the number of wins by team i over team j and
IM[j,i] is the number of wins by team j over team i.

The PM and IM count the number of edges in a graph where points
(alternatives/teams) are connected  to other points (alternatives/teams) by
an arrow from point A to B representing a "win" by A against B. In sports
it's an actual win in a game, in the PM it's a ballot that has A ranked
above B.

"Cycles" are common in sports. Team A beat B and team B beat team C and team
C beat team A.  Ratings systems routinely sort that A>B>C>A chain by
comparing how A, B and C performed against their common opponents, which is
analogous to comparing alternatives in a PM "loop" by how each compares to
the alternatives that are included on the same ranked ballots they are.

The way sports ratings systems use a very sparse PM to come up with an
ordered list of teams is to begin with that directed graph that counts the
number of "->"s   and then look at how every team-pair compares based upon
number of ->-> paths that relate each vertex (team/alternative) to each
other, then the number of ->->-> paths that do so, and continuing in that
fashion until their algorithm "converges" to an ordered list that doesn't
change.

This is easy to do: the number of paths from Alternative i to Alternative j
that are two steps long is PM^2[i,j]. What that's counting is the number of
paths i has with k's that have winning paths against j.

If there is some m such that PM^m results in a Condorcet winner when paths
of length m are counted that could be a simple way to break Cycles.

When applied to the example in http://en.wikipedia.org/wiki/Schulze_method
(the Wikipedia explanation of Schultze) you find a Condorcet winner when
paths of length 2 are counted:

        A       B       C       D       E

A       0       20      26      30      22

B       25      0       16      33      18

C       19      29      0       17      24

D       15      12      28      0       14

E       23      27      21      31      0

PM(i,j)-PM(j,i):

        A       B       C       D       E

A        0       -5      7       15      -1

B        5       0       -13     21      -9

C        -7      13      0       -11     3

D        -15     -21     11      0       -17

E        1       9       -3      17      0

finding PM^2

2 done. Elapsed=.000000 seconds.

A        1950    1708    1622    1784    1404

B        1213    1846    1952    1580    1396

C        1532    1232    1938    2271    1178

D        1154    1490    876     1756    1218

E        1539    1441    1898    1938    1930

PM^2(i,j)-PM^2(j,i): 

A        0       495     90      630     -135

B        -495    0       720     90      -45

C        -90     -720    0       1395    -720

D        -630    -90     -1395   0       -720

E        135     45      720     720     0

Now, I don't think it's a coincidence that JUST looking at PM^2 gives the
same winner (E) as Schultze does, since it's counting the x->y->z chains,
giving extra credit to x >> z based upon x's wins over alternatives that
themselves have {}->z wins, and that's explicitly part of the motivation for
Schultze.

But if that is equivalent to Schultze (I'll leave that test to people who
know better than I how it works) I find it more cosmetically appealing than
the Schultze definition.

There's no "eliminate candidate based upon." which has always rubbed me the
wrong way - too IRVish. All ballots and all alternatives are directly
involved in the final count.

I'd be interested in any further analysis of my observation. I'm not
proposing anything, just offering an area of research that I'm surprised
hasn't been explored here (afaik) as much as in the sports market for rating
systems.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.electorama.com/pipermail/election-methods-electorama.com/attachments/20120430/e743166e/attachment-0003.htm>