[EM] A distance based method

Kristofer Munsterhjelm km_elmet at lavabit.com
Mon Jul 11 10:46:03 PDT 2011


fsimmons at pcc.edu wrote:
> First find a clone consistent way of defining distance between candidates.

This could be an interesting algorithm problem in itself. It is possible 
to triangulate points in space (assuming Euclidean distances) if you 
have the exact distances; but what if you have only the rank order of 
distances? You get a (usually) underconstrained problem of the sort:

Given v voters, n candidates, and a dimension integer k > 1,
find v + n coordinates in k-dimensional space so that the Euclidean 
distance from V's coordinate to C's coordinate for some voter V and 
candidate C is less than the Euclidean distance from V's coordinate to 
D's coordinate, for the same voter V and another candidate D, iff V 
ranks C ahead of D. Break ties by assigning coordinates so that the sum 
of the Euclidean distances to the candidates from the origin is minimized.
If it is not possible to make such an assignment, make one that 
contradicts as few candidate rankings as possible.

I have no idea how you would actually do this, though, and it would be 
prone to overfitting. It might not be cloneproof, either, since 
differences in clone rankings could eliminate some rotations that would 
otherwise be picked as the best choice by the tiebreaker. It would 
definitely not pass IIA, as the addition of "superbad" candidates could 
serve as anchors.

> Then while two or more candidates remain
>   of the two with the greatest distance from each other
>   eliminate the one with the greatest pairwise defeat
> EndWhile.

Or, if you have the voters' coordinates too, you could use histograms, 
kernel density estimation, or some other estimation to try to 
reconstruct opinion space, and then pick a subset of the candidates that 
"reproduces" that opinion space as best as is possible. E.g. if each 
candidate (and voter) is a Gaussian in opinion space and you want p 
seats, find the p Gaussians where the difference between the space given 
by the sum of the p Gaussians (of the prospective council) and of the v 
Gaussians (of every voter), normalized, is minimized.

Bandwidth selection would be a pain, as would finding the "right" number 
of dimensions.




More information about the Election-Methods mailing list