<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<font face="Helvetica, Arial, sans-serif">There was a discussion
last year about the merits of enforced truncation of ballots. I
modified my evaluation code to allow me to measure the effects
under a spatial model. The results for Condorcet methods and IRV
were as you'd expect, but the results for the Borda count were
surprising. <br>
The first oddity is that with aggressive truncation (e.g. from
10 to 4 candidates), the Borda count may outperform Condorcet
methods. In this particular case (10 candidates truncated to 4) I
find that IRV is 47% correct, the Borda count 83% correct, and
Condorcet methods have lower and upper bounds of 70% and 76%, with
minimax coming in at 72%. (The model is bivariate Gaussian.) I
don't think this is hard to explain: a truncated Condorcet method
isn't a true Condorcet method, doesn't satisfy the conditions of
the median voter theorem, and need not work particularly well. <br>
<br>
The second oddity is that the Borda count does not behave
monotonically. When 10 candidates are truncated to 6 rather than
4, the Borda count returns 80% accuracy - i.e. it does *less* well
than when subjected to more drastic truncation. <br>
This is surprising but not astonishing. The Borda count is a
positional method whose coefficients have no claim to optimality;
the truncated Borda count is a method with different coefficients;
and changing the degree of truncation, even in an
information-destroying way, is not guaranteed to affect
performance in any particular direction. <br>
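To make the "different coefficients" point concrete, here is a minimal sketch of a positional method with and without truncation. It assumes that unranked candidates simply score zero; that is one common convention and not necessarily the one used for the figures in this post, and the function names are only illustrative.<br>
<pre>
```python
def borda_weights(n_candidates):
    """Standard Borda coefficients: n-1 for first place down to 0 for last."""
    return [n_candidates - 1 - rank for rank in range(n_candidates)]


def truncate_weights(weights, k):
    """Keep the top k coefficients; unranked positions score 0 (assumed convention)."""
    return list(weights[:k]) + [0] * (len(weights) - k)


def positional_scores(ballots, weights, n_candidates):
    """Total each candidate's score; a ballot lists candidate indices, best first."""
    scores = [0] * n_candidates
    for ballot in ballots:
        for rank, candidate in enumerate(ballot):
            scores[candidate] += weights[rank]
    return scores


ballots = [[0, 1, 2, 3], [1, 0, 3, 2], [0, 2, 1, 3]]
full = borda_weights(4)               # [3, 2, 1, 0]
cut = truncate_weights(full, 2)       # [3, 2, 0, 0]
top_two = [b[:2] for b in ballots]    # the same ballots truncated to two places
print(positional_scores(ballots, full, 4))
print(positional_scores(top_two, cut, 4))
```
</pre>
Seen this way, truncation to k places is just another weight vector, which is why nothing forces accuracy to fall steadily as k shrinks.<br>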
<br>
It struck me that it should be possible to improve on the Borda
coefficients. A natural way to do so is to look at Euclidean
distances. We can compute (by numerical integration) the average
distance of a voter from the candidate she ranks top, second,
etc., and if we use these averages as weights, then the average
score obtained by a candidate should be an approximation to his
average distance from voters; so we elect the candidate with the
lowest score. For a more conventional statistic we can transform
the weights so as to be justified in electing the candidate with
the highest score. I will call this the Euclidean Borda count. <br>
The results aren't very good. In one dimension the EBC scores
*worse* than the standard Borda count whereas in two it performs
slightly better; but it still isn't monotonic under truncation.<br>
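A rough sketch of how Euclidean weights of this kind could be estimated. It uses Monte Carlo sampling in place of numerical integration, and it assumes that candidates are drawn from the same standard Gaussian as the voters, which the description above does not state. The function names and the final rescaling are illustrative choices only.<br>
<pre>
```python
import math
import random


def mean_distance_by_rank(n_candidates, dim=2, n_trials=5000, seed=0):
    """Estimate the mean distance from a voter to her 1st, 2nd, ... choice."""
    rng = random.Random(seed)
    totals = [0.0] * n_candidates
    for _ in range(n_trials):
        voter = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Distances to a fresh set of candidates, nearest (top-ranked) first.
        dists = sorted(
            math.dist(voter, [rng.gauss(0.0, 1.0) for _ in range(dim)])
            for _ in range(n_candidates)
        )
        for rank, d in enumerate(dists):
            totals[rank] += d
    return [t / n_trials for t in totals]


def to_high_is_best(weights):
    """One possible rescaling: top place maps to 1, bottom place to 0."""
    lo, hi = min(weights), max(weights)
    return [(hi - w) / (hi - lo) for w in weights]


raw = mean_distance_by_rank(10)
print([round(w, 3) for w in to_high_is_best(raw)])
```
</pre>
Any decreasing linear map of the raw distances would elect the same candidate; the rescaling here only makes the weights easy to compare with Borda's.<br>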
<br>
At this point I abandoned all subtlety in favour of brute force. I
simply hillclimbed to maximise the accuracy of the Borda count
with parametric weights. I will call the resulting method the
Spatial Borda Count. To be precise, I simulated a large number of
elections under the model, and maximised the total number of
occasions on which candidate A outscored candidate B when A was
actually better than B (i.e. closer to the centre of the voters).
This is more symmetric (but further from the intended use) than
maximising the number of occasions on
which *the best* candidate outscored each rival individually, or
outscored all rivals together. <br>
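The objective and the search can be sketched as follows. This is only an illustration of the idea under stated assumptions: the perturbation scheme, step size and function names are hypothetical, and the details of the actual hillclimb are not given here.<br>
<pre>
```python
import random
from itertools import combinations


def pairwise_agreement(distances, scores):
    """Count pairs in which the candidate nearer the origin has the higher score."""
    agree = 0
    for i, j in combinations(range(len(distances)), 2):
        if (distances[j] - distances[i]) * (scores[i] - scores[j]) > 0:
            agree += 1
    return agree


def hillclimb(weights, objective, steps=500, step_size=0.1, seed=0):
    """Perturb one weight at a time, keeping a change only if the objective rises."""
    rng = random.Random(seed)
    best = list(weights)
    best_value = objective(best)
    for _ in range(steps):
        trial = list(best)
        trial[rng.randrange(len(trial))] += rng.uniform(-step_size, step_size)
        value = objective(trial)
        if value > best_value:
            best, best_value = trial, value
    return best, best_value


# Three candidates at increasing distance; the scores get 2 of the 3 pairs right.
print(pairwise_agreement([0.1, 0.5, 0.9], [3, 1, 2]))
```
</pre>
In use, the objective passed to hillclimb would score a fixed batch of simulated elections with the trial weights and sum pairwise_agreement over them.<br>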
We can view the hillclimb as minimising the sum of terms
sgn(d[k,i]-d[k,j]) * sgn(scr[k,i]-scr[k,j]),
where d[k,i] is the distance of the ith candidate from the
origin in election k (the voters being centred on the origin)
and scr[k,i] is the candidate's score.<br>
This can be simplified to the sum of terms
sgn( (d[k,i]-d[k,j]) * (scr[k,i]-scr[k,j]) ), and if we
adopt the approximation sgn(x)=x, we end up looking for
scores which are maximally correlated with distances;
and these are obviously obtained by letting the scores
be proportional to distances. So the EBC drops out as a
natural approximation to the SBC.<br>
</font><font face="Helvetica, Arial, sans-serif"><font face="Helvetica, Arial, sans-serif"><font face="Helvetica, Arial, sans-serif"><font
face="Helvetica, Arial, sans-serif"><font face="Helvetica, Arial, sans-serif"><font face="Helvetica, Arial, sans-serif">
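The step from pairwise products to correlation rests on a standard identity: within one election of n candidates, the sum over pairs of (d[i]-d[j]) * (scr[i]-scr[j]) equals n times the sum of centred products of d and scr, so it is proportional to their covariance. A small numerical check, with illustrative helper names:<br>
<pre>
```python
from itertools import combinations


def pair_sum(d, s):
    """Sum of (d[i]-d[j]) * (s[i]-s[j]) over all unordered pairs."""
    return sum(
        (d[i] - d[j]) * (s[i] - s[j])
        for i, j in combinations(range(len(d)), 2)
    )


def n_times_centred_products(d, s):
    """n times the sum of (d[i]-mean(d)) * (s[i]-mean(s))."""
    n = len(d)
    d_bar = sum(d) / n
    s_bar = sum(s) / n
    return n * sum((x - d_bar) * (y - s_bar) for x, y in zip(d, s))


d = [1.0, 2.0, 4.0]   # distances from the origin
s = [3.0, 1.0, 0.0]   # scores, high is best
print(pair_sum(d, s), n_times_centred_products(d, s))
```
</pre>
With high scores best, the quantity is negative when scores track closeness. Under the opposite convention (low score best) the sign flips, which is the sense in which scores proportional to distances come out as the linearised optimum.<br>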
The attachment is a graph showing the weights against
ballot position for various methods (normalising the
weights for comparability and so that high rather than
low scores are best). The lowest position on a ballot is
at the left and the top at the right. <br>
I measure the accuracy of the Borda count (on
untruncated ballots) as 85% whereas the SBC is 91%. The
number of errors is thus reduced by 40% by a minor
change in coefficients. (The improvement is less
impressive in a single dimension.)<br>
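For anyone checking the 40% figure, the arithmetic runs on error rates rather than accuracies. A trivial sketch (the function name is illustrative):<br>
<pre>
```python
def error_reduction(accuracy_before, accuracy_after):
    """Fraction by which the error rate falls; accuracies are percentages."""
    error_before = 100 - accuracy_before
    error_after = 100 - accuracy_after
    return (error_before - error_after) / error_before


# Borda at 85% leaves 15% errors, the SBC at 91% leaves 9%: a reduction of 6/15.
print(error_reduction(85, 91))
```
</pre>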
When I look at truncation, I find to my relief that
the SBC is monotonic. Under sufficiently severe
truncation (k = 3 in the table below, for instance) the
Borda count does better than the SBC; but the weights
used here were not optimised for truncated ballots, and
if we wanted that, we’d actually generate a different
set of weights.<br>
The results are as follows (accuracy in %), using a bivariate Gaussian
model and truncating 10 candidates to k. "Cond. lower" and "Cond. upper"
are the lower and upper bounds for Condorcet methods:<br>
<pre>
 k : IRV : Borda :  SBC : Cond. lower : minimax : Cond. upper
10 :  51 :    85 : 91.2 :          99 :      99 :          99
 8 :  51 :    85 : 91.0 :          99 :      99 :          99
 7 :  51 :    84 : 89.6 :          99 :      99 :          99
 6 :  51 :    80 : 86.5 :          96 :      97 :          97
 5 :  50 :    81 : 86.0 :          88 :      89 :          90
 4 :  47 :    83 : 83.7 :          70 :      72 :          76
 3 :  41 :    66 : 64.7 :          50 :      52 :          59
 2 :  35 :    44 : 43.5 :          36 :      38 :          43
 1 :  30 :    30 : 30.0 :          30 :      30 :          30
</pre>
<br>
CJC<br>
</font></font></font></font></font></font>
</body>
</html>