[EM] CFC-Young, a more time-complexity friendly clustering multiwinner method
Kristofer Munsterhjelm
km-elmet at broadpark.no
Mon Feb 21 03:25:02 PST 2011
Since mentioning CFC-Kemeny, I've been thinking about whether it's
possible to make a CFC-method with a more viable runtime. I think I have
found one, but let's define my terminology first.
CFC means "continuous forced clustering". The idea of these methods is
like Monroe's method, in that each voter can split his vote among
different "clusters", each corresponding to a seat. The voter doesn't
actually do so, but the method does so as to maximize an objective
function. There are usually three constraints:
- Each voter has a fixed (and equal) amount of power that he may spread
among the clusters, so that no voter has more power than another.
- Each cluster must have the same amount of power allocated to it, in
total, so that the result is proportional.
- Each cluster must elect a different candidate.
CFC-Kemeny works by assigning a ranking to each cluster so that the top
rank is different for each, then calculating the point score values
(Kendall-tau distance) for each voter with respect to his vote and the
cluster's ranking. Any given assignment of a voter's power gives a score
for that voter, which is the sum of scores of his ranking with respect
to each cluster's ranking, weighted by the amount of power he allocates
to the cluster. Then the method finds the power assignment that
maximizes the sum of voters' scores (minimizes distance).
That takes the form of an assignment problem, and means that a voter who
votes A>... will probably allocate most of his power to the A-cluster,
unless that would leave someone else unable to allocate power there
because of the proportionality constraint.
There is one problem with CFC-Kemeny. It is extremely slow. While each
instance's objective score (sum of scores across voters) can be
calculated in polynomial time, there are somewhere between (n-1)! and n!
possible rankings per cluster, so this is a combinatorial method where
each seat has an exponential number of choices. No wonder, then, that it
doesn't work for more than a few candidates and seats.
I have been trying to find other Condorcet methods to use than Kemeny,
but most are very hard to express in linear programming and leave few
variables left for weighting-by-power. For instance, it is possible to
find the minmax winner by linear programming, but only by optimizing a
variable, and multiplying one variable-to-optimize (the voter's power to
a given cluster) with another (the minmax objective variable) is not
permitted under linear programming.
However, I think I have found a method that would work. It is a
linear-programming relaxation of the Young method (not Kemeny-Young,
simply Young). The Young method is this: the candidate where you have to
remove the least number of voters to have him become the CW, is the
winner. In integer programming, it is expressed like this:
c*'s Young score is:
max sum over non-negative values x[i], given that
for A being any and every candidate except c*,
sum p = 1...i
x[p] * e[p][c*][A]
> 0
for voters 1..i, x[i] is integer (either 0 if the voter is not part or 1
if he is), and where e[i][c*][A] is 1 if the ith voter ranks c* above A,
otherwise 0.
The candidate with the greatest Young score wins.
This works because the "must be greater than zero" constraint implies
that more voters have c* above A than A above c* no matter who A is.
Thus the program maximizes the sum of x[i] across all i, i.e. the number
of voters who are still being counted, subject to that c* is the CW.
In the integer programming variety, one deals with discrete voters, but
it's not such a stretch to consider each voter a collection of
"mini-voters", so that it's possible to have ballots of any weight.
Doing so turns the problem polytime, too, since we can use linear
instead of integer programming.
The clustering variant of this, is:
Assign the council to check as candidates c[1]...c[k] for k seats. Then,
Council c[1]...c[k]'s Young score is
max sum over non-negative values x[i][k], given that
for each C = 1...k
for A[C] being any and every candidate except c[C],
sum p = 1...i
x[p][C] * e[p][c[C]][A]
> 0
for each p = 1...i
sum C = 1...k
x[p][C]
<= 1
for each C = 1...k, D = 1...k except C
sum p = 1...i
x[p][C] - x[p][D]
= 0
(end of linear program)
As before, the council with the maximal Young score wins.
Here, x[a][b] is the ath voter's power given to cluster b. Unlike the
original method, x[a][b] can be continuous given that 0 <= x[a][b] <= 1.
The first for-each (really C constraints) implements Young for each
cluster, i.e. that the given candidate must be the CW in his cluster.
The second for-each (p constraints) sets the voter proportionality
criterion - the voter can't contribute more than one point of power in
total. The third for-each sets the per-cluster proportionality
constraint, that each cluster has an equal amount of power in total.
It might work. I'm a bit unsure about the third for-each, though,
because Young is not monotone as an optimization function. That is, if
you have something like Dodgson (minimal number of changes needed to
turn the candidate into the CW), guessing at a too high value does no
harm. If 3 changes can turn A into a CW, so can 4 or 10. But that is not
the case for Young. It may be the case that four voters can make A into
a CW, but that doesn't mean that three - or five - can. So CFC-Young
should work if leaving the third for-each out still keeps
proportionality. If not, who knows? I'd have to implement it.
More information about the Election-Methods
mailing list