[EM] A distance based method
Kristofer Munsterhjelm
km_elmet at lavabit.com
Mon Jul 11 10:46:03 PDT 2011
fsimmons at pcc.edu wrote:
> First find a clone consistent way of defining distance between candidates.
This could be an interesting algorithm problem in itself. It is possible
to triangulate points in space (assuming Euclidean distances) if you
have the exact distances; but what if you have only the rank order of
distances? You get a (usually) underconstrained problem of the sort:
Given v voters, n candidates, and a dimension integer k > 1,
find v + n coordinates in k-dimensional space so that the Euclidean
distance from V's coordinate to C's coordinate for some voter V and
candidate C is less than the Euclidean distance from V's coordinate to
D's coordinate, for the same voter V and another candidate D, iff V
ranks C ahead of D. Break ties by assigning coordinates so that the sum
of the Euclidean distances to the candidates from the origin is minimized.
If it is not possible to make such an assignment, make one that
contradicts as few candidate rankings as possible.
I have no idea how you would actually do this, though, and it would be
prone to overfitting. It might not be cloneproof, either, since
differences in clone rankings could eliminate some rotations that would
otherwise be picked as the best choice by the tiebreaker. It would
definitely not pass IIA, as the addition of "superbad" candidates could
serve as anchors.
> Then while two or more candidates remain
> of the two with the greatest distance from each other
> eliminate the one with the greatest pairwise defeat
> EndWhile.
Or, if you have the voters' coordinates too, you could use histograms,
kernel density estimation, or some other estimation to try to
reconstruct opinion space, and then pick a subset of the candidates that
"reproduces" that opinion space as best as is possible. E.g. if each
candidate (and voter) is a Gaussian in opinion space and you want p
seats, find the p Gaussians where the difference between the space given
by the sum of the p Gaussians (of the prospective council) and of the v
Gaussians (of every voter), normalized, is minimized.
Bandwidth selection would be a pain, as would finding the "right" number
of dimensions.
More information about the Election-Methods
mailing list