[EM] Correlated Instant Borda Runoff, without Borda

Fri Dec 23 01:45:36 PST 2005

Months ago, someone proposed a method called "Correlated Instant Borda 
Runoff (CIBR)" in order to fix Borda's clone problem. But only now has 
it occurred to be that CIBR's clone-purging idea might be good for 
things other than improving on Borda.

I've got a couple of examples below, but first I'd like to propose an 
improved defintion of "correlation".

*** DEFINITIONS ***

A candidate C is "voted between" A and B on a ballot if C is voted 
either both strictly higher than A and strictly lower than B, or vice-versa.

The "correlation between X and Y with respect to Z" (where X, Y, and Z 
are candidates), denoted "corr(X, Y) wrt Z" is the proportion of the 
ballots on which Z is NOT voted between X and Y.

The "correlation between D and E", denoted corr(D, E), is defined the 
minimum of corr(D, E) wrt F over all candidates F in the complement of 
{D, E}.

*** SUMMATION ***

Consider a summation array labelled by all possible combinations of X, 
Y, and Z in "corr(X, Y) wrt Z".  To represent a ballot in this summation 
array, put a 0 in each positions in which Z is not voted between X and 
Y, and a 1 elsewhere.  To calculate corr(X, Y) wrt Z, simply add the 
appropriate elements of the ballot arrays and divide by the number of 
ballots.

Let N be the number of candidates, and assume N>2.  By symmetry in the 
definition, corr(X, Y) wrt Z == corr(Y, X) wrt Z, so we only need to 
consider the unordered combinations of X and Y: There are N×(N-1)/2 of 
them.  For each (X, Y) pair, there are N-2 of them.  Therefore, the size 
of the summation array is N×(N-1)/2×(N-2) = (1/2)N³-(3/2)N²+N <= (1/2)N³.

Therefore, the computation of the correlations is third-order summable.

*** EXAMPLE: COMPUTATION OF CORRELATIONS ***

Consider the hypothetical Tennessee capital election between Chattanooga 
(C), Knoxville (K), Memphis (M), and Nashville (N).

42%: M>N>C>K
26%: N>C>K>M
15%: C>K>N>M
17%: K>C>N>M

If I calculated correctly, the correlations of each possible trio of 
candidates are:

corr(C, K) wrt M = 100%
corr(C, K) wrt N = 100%
corr(C, M) wrt K =  59%
corr(C, M) wrt N =  26%
corr(C, N) wrt K =  85%
corr(C, N) wrt M = 100%
corr(K, M) wrt C =  41%
corr(K, M) wrt N =  26%
corr(K, N) wrt C =  15%
corr(K, N) wrt M = 100%
corr(M, N) wrt C =  74%
corr(M, N) wrt K =  74%

Thus, the correlations between each possible pair of candidates are:

corr(C, K) = min(100%, 100%) = 100%
corr(C, M) = min( 59%,  26%) =  26%
corr(C, N) = min( 85%, 100%) =  85%
corr(K, M) = min( 41%,  26%) =  26%
corr(K, N) = min( 15%, 100%) =  15%
corr(M, N) = min( 74%,  74%) =  74%

Note that the clone pair has 100% correlation.

*** EXAMPLE: CLONE-PURGING WITH PAIRWISE COMPARISONS ***
(Just like CIBR, but without the Borda points.)

Consider the election method:

while there are at least 2 candidates:
    compute the correlations
    find the most-correlated pair of candidates
    eliminate the loser of a contest between that pair
elect the remaining candidate

With the ballots from the previous example, the most-correlated pair is 
Chattanooga and Knoxville.  Chattanooga pairwise defeats Knoxville, 83% 
to 17%, so eliminate Knoxville.

The correlation trios without Knoxville are:

corr(C, M) wrt N =  26%
corr(C, N) wrt M = 100%
corr(M, N) wrt C =  74%

The new most-correlated pair is Chattanooga and Memphis.  Chattanooga 
beats Memphis, 58% to 42%, so eliminate Memphis.

This leaves only two candidates: Chattanooga and Nashville.  Chattanooga 
pairwise loses to Nashville, 32% to 68%, so is eliminated.  Nashville wins.

Note that the example elected the Condorcet winner.  This will always be 
the case, because if a CW exists, then by definition it cannot pairwise 
lose to another candidate and thus will never be eliminated.

*** EXAMPLE: CLONE-TRANSFER APPROXIMATION OF IRV ***

Consider the election method:

count the first-choice votes of each candidate
while no candidate has a majority of the vote:
    eliminate the last-place candidate
    transfer that candidate's votes to their most-correlated candidate
elect the candidate with a majority of a vote

With the ballots from the original example, the first-choice votes are:

42%: M
26%: N
17%: K
15%: C

No candidate has a majority, so the last-place one (Chattanooga) is 
eliminated.  The correlations without Chattanooga are:

corr(K, M) wrt N =  26%
corr(K, N) wrt M = 100%
corr(M, N) wrt K =  74%

Chattanooga is 26% correlated with Memphis, 85% correlated with 
Nashville, and 100% correlated with Knoxville.  Because Knoxville is 
best-correlated to Chattanooga, it receives its vote, so the count is now:

42%: M
32%: K
26%: N

The next candidate for elimination is Nashville.  The best-correlated 
candidate to Nashville is Knoxville, which receives Nashville's vote, 
producing a final tally of:

58%: K
42%: M

Knoxville now has a majority of the vote, and is declared the winner. 
(Note that is the same as the IRV winner.  I'd be interested to know how 
often this is the case.)

By changing "majority" to "Droop quota" and extending the "transfer 
votes to the most-correlated candidate" rule to excess votes, this 
method can be used to approximate STV (or STV-CLE or a pairwise 
comparision for CPO-STV).  The advantage over traditional STV is that 
the clone-transfer method is third-order summable.