[EM] A less artificial multiwinner Kemeny method

Kristofer Munsterhjelm km-elmet at broadpark.no
Fri Oct 30 14:59:47 PDT 2009


In an earlier message, "A multiwinner election method based on Kemeny",
I detailed an extension of the Kemeny-Young idea to multiple winners:
each voter is assigned to the candidate ranking for which the Kemeny
distance between his ballot and that ranking is least, and the sum of
these distances is used to find the best set of candidate rankings.
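
For concreteness, here is a small Python sketch of that distance (mine,
not from the earlier post; it assumes complete strict rankings with no
truncation or equal-ranking). The Kemeny distance used here is the
Kendall tau distance: the number of candidate pairs on whose relative
order the two rankings disagree.

def kemeny_distance(ballot, ranking):
    """Kendall tau distance between two strict rankings of the same
    candidates, each given as a sequence from most to least preferred."""
    pos_a = {cand: i for i, cand in enumerate(ballot)}
    pos_b = {cand: i for i, cand in enumerate(ranking)}
    cands = list(ballot)
    disagreements = 0
    for i in range(len(cands)):
        for j in range(i + 1, len(cands)):
            x, y = cands[i], cands[j]
            # the pair counts if one ranking puts x over y and the
            # other puts y over x
            if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
                disagreements += 1
    return disagreements

# e.g. kemeny_distance("ABC", "CBA") == 3, kemeny_distance("ABC", "BAC") == 1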

However, Jameson Quinn showed that this method could be rather unfair
to majorities, with the example

	99: A>B>C
	 1: C>B>A

which would draw "A" from the group to which the 99 voters belonged,
and "C" from the group to which the single voter belonged. I replied by
"bolting on" proportionality, treating the candidate rankings as party
lists whose relative support determined how many candidates were drawn
from each ("Bolting proportionality onto the Kemeny method"). That
approach, though it works, is very artificial and quite inelegant, and
may cause discontinuities (nonmonotonicity and the like), since slight
changes in the ballots may cause large changes in the results.

This suggests that one should look for a way of keeping the result
proportional that is not separate from the actual process of deciding
to which group a given ballot is assigned when calculating the
distances (distances that are then used to find the set of rankings
that is optimal while not ranking the same candidate first more than
once); and that is what I think I have found.

What is the problem with the naive method that elects {A, C} from
Jameson's example? It is that the group sizes are different! So why not
force the groups, or clusters, to be the same size? Then each "virtual
list" (ranking) will have the support of equally many voters, and we
may apply the original method's selection approach (pick the first
candidate from each group's associated ranking, unless that names a
single candidate multiple times, in which case the solution is invalid)
and still be proportional.

To implement that idea, we have to solve a more complex problem every
time we want to find the cost (negative score) of a set of orderings.
The ordinary multiwinner Kemeny method just records the minimal
distance (and associated cluster) for each ballot, but, as Jameson's
example shows, this can lead to very different cluster sizes. Instead,
for k winners, we have to partition the ballots into k groups, each
corresponding to a different candidate ordering, so that each of these
groups contains at least floor(n/k) ballots, where n is the total
number of ballots. Furthermore, we have to choose this partition so
that the sum of distances is still minimized.
More formally,

if D_ab is the distance from the a-th ballot's ordering to the b-th
cluster ordering, 0 < a <= n, 0 < b <= k, then construct a binary
n-by-k matrix P, so that:
/  n        \
| SUM  P_ij |  >= floor(n/k)  for any and all j, 0 < j <= k        (1)
\ i=1       /

so that the disproportionality is limited;

/  k        \
| SUM  P_ji |  = 1  for any and all j, 0 < j <= n
\ i=1       /

so that a ballot can't appear in multiple groups or in none;

and then minimize

  n   k
SUM SUM (P_ij * D_ij)
i=1 j=1

Unfortunately, this problem is NP-hard, since one can reduce
multiprocessor scheduling to it by padding with zero-cost "tasks" to
get around constraint (1).
In practice, a certain greedy algorithm seems to do a reasonable job.
It processes the ballots in order of greatest minimum cost first (full
clusters excluded when taking that minimum), and allocates each ballot
to the non-full cluster for which its cost is least, where a cluster
counts as full once it contains >= ceil(n/k) ballots.
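
A rough Python sketch of that heuristic (my reconstruction, not code
from the original post; D[i][j] is just an illustrative layout for the
distance from ballot i to cluster ordering j):

import math

def greedy_assign(D, k):
    """Greedy heuristic: ballots are processed in order of greatest
    minimum cost first, each going to the cheapest cluster that is not
    yet full (>= ceil(n/k) ballots).  Returns the cluster of each
    ballot and the total cost."""
    n = len(D)
    cap = math.ceil(n / k)
    sizes = [0] * k
    assignment = [None] * n
    unassigned = set(range(n))
    while unassigned:
        open_clusters = [j for j in range(k) if sizes[j] < cap]
        # the ballot whose cheapest available option is the most
        # expensive gets placed first
        ballot = max(unassigned,
                     key=lambda i: min(D[i][j] for j in open_clusters))
        cluster = min(open_clusters, key=lambda j: D[ballot][j])
        assignment[ballot] = cluster
        sizes[cluster] += 1
        unassigned.remove(ballot)
    total = sum(D[i][assignment[i]] for i in range(n))
    return assignment, total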

But now consider the case of fractional ballots, for instance something 
like:
	36.9: A > B > C
	63.1: C > B > A

There are two ways to deal with this. The first is to multiply by a 
sufficiently large integer to turn all the fractional ballots into whole 
ballots. For the case above, 10 will do, giving
	369: A > B > C
	631: C > B > A.

The second is to turn P from a binary matrix into a nonnegative real
matrix. Let u then be the number of *unique* ballots and introduce a
new u-vector W, so that W_i equals the number of voters who voted
according to the i-th unique ballot. D and P become u-by-k matrices,
and the relaxed constraints are:

/  u        \
| SUM  P_ij |  >= floor(n/k)  for any and all j, 0 < j <= k
\ i=1       /

so that the disproportionality is limited;

/  k        \
| SUM  P_ji |  = W_j  for any and all j, 0 < j <= u
\ i=1       /

so that a ballot has total weight equal to how many expressed it;

and then minimize

  u   k
SUM SUM (P_ij * D_ij)
i=1 j=1

Now the problem can be solved using linear programming. I haven't
actually implemented that solution, since I'm trying to find out
whether there's an easier way to solve it than linear programming; but
if it is a good solution, it would also make the method itself more
continuous (because even whole votes can now be divided into fractions
that support multiple groups).
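
If one did hand the relaxed problem to an off-the-shelf LP solver, it
might look roughly like the sketch below (my own sketch, not the
author's code; it uses scipy's linprog, and the function name and data
layout are only illustrative):

import math
import numpy as np
from scipy.optimize import linprog

def solve_relaxed_assignment(D, W, k):
    """D is a u-by-k array of distances from each unique ballot to each
    cluster ordering, W a length-u array of ballot weights, k the number
    of clusters.  Returns the optimal cost and the u-by-k matrix P."""
    D = np.asarray(D, dtype=float)
    W = np.asarray(W, dtype=float)
    u, cols = D.shape
    assert cols == k
    n = W.sum()
    quota = math.floor(n / k)

    c = D.flatten()                       # minimize sum_ij P_ij * D_ij

    # cluster-size lower bounds: -sum_i P_ij <= -floor(n/k), per cluster
    A_ub = np.zeros((k, u * k))
    for j in range(k):
        A_ub[j, j::k] = -1.0
    b_ub = -quota * np.ones(k)

    # weight conservation: sum_j P_ij = W_i, per unique ballot
    A_eq = np.zeros((u, u * k))
    for i in range(u):
        A_eq[i, i * k:(i + 1) * k] = 1.0
    b_eq = W

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(u, k)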

However we solve the cost-determination problem, the rest of the
method is the same: find k orderings that each have a different
candidate ranked first, so that the summed cost is minimized. The
winners are the candidates ranked first in the ordering set that
produces the least cost.
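
As a sketch of that outer search (mine, not the author's
implementation; assignment_cost stands for whichever cost routine one
plugs in - the greedy heuristic, the LP relaxation, or an exact solver):

from itertools import combinations, permutations, product

def best_ordering_set(candidates, k, assignment_cost):
    """Enumerate every set of k orderings whose first-place candidates
    are all different, score each with assignment_cost, and return the
    winners (the first-place candidates of the cheapest set)."""
    best_cost, best_set = None, None
    for firsts in combinations(candidates, k):
        # each chosen first-place candidate may be followed by any
        # ordering of the remaining candidates
        tails = [[(f,) + rest
                  for rest in permutations(c for c in candidates if c != f)]
                 for f in firsts]
        for ordering_set in product(*tails):
            cost = assignment_cost(ordering_set)
            if best_cost is None or cost < best_cost:
                best_cost, best_set = cost, ordering_set
    winners = [ordering[0] for ordering in best_set]
    return winners, best_set, best_cost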

For Jameson's example, what happens is that 50 of the A > B > C voters
gather in one cluster. The remaining 49 gather in the other, along with
the single C > B > A voter. Now, A > B > C is a certainty for the first
cluster. For the second, B > A > C incurs a cost from the C > B > A
voter as well as from the 49 A > B > C voters, but C > B > A incurs a
greater one from the 49, even though the single C > B > A voter is now
perfectly happy. Thus, B > A > C wins the second cluster, and so the
council is {A, B}, which is what we wanted.
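
Spelling out the Kendall-tau arithmetic for that second cluster (49
A > B > C ballots plus the one C > B > A ballot), for the orderings
that don't put A first:

	B > A > C:  49*1 + 1*2 =  51
	B > C > A:  49*2 + 1*1 =  99
	C > A > B:  49*2 + 1*1 =  99
	C > B > A:  49*3 + 1*0 = 147

so B > A > C is indeed the cheapest.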

-----------------------------------------------

The revised multiwinner method (let's call it "forced clustering
Kemeny", since the clusters are forced to have a certain size) does not
elect and punish, except indirectly. Instead, it divides the electorate
into as many groups as there are candidates to be elected, and each
group's election then proceeds like a Kemeny single-winner election.
Why is this any better than just having "Droop proportionality, with
Condorcet when there's no longer agreement"? I'd say that it is because
the "borders" of the single-winner "districts" are themselves
reconfigured based on the ballots, so the method can take more than
just "is supported by p Droop quotas" into account.

That strength is also its weakness. While the method would work well
if there really are k groups in society, what if there are not? As an
opposite extreme, consider a two-winner election where one of the
candidates has unanimous first-rank support. This candidate, call it A,
should win, and the next seat should presumably go to the single-winner
choice with A ineligible - I say so because A is supported by
*everyone*, and equally among all, so there's no subgroup to punish. In
such a case, A would certainly win in one of the clusters, but since A
can't be elected more than once, the people who voted for A in the
other cluster are just increasing their distance by having A > (...)
rather than (...) alone.

Now, one may argue that because everybody voted for A, no harm is
done, since everybody's distance is increased equally. That might be
true, but if a single voter stops voting for A, it's no longer so; to
which one may reply that the A-voter surplus will affect every cluster
equally and so not be unfair, but I'm not certain of that. Ultimately,
the problem is the "hackish" nature of the "but no one candidate may be
listed twice" rule.

The method also has really horrible time complexity. Part of this is
inevitable because the method reduces to Kemeny in the single-winner
case (when there's only one cluster), but even so...
To find the winning set, we need to pick k different orderings out of
the c! possible ones, where c is the number of candidates and k is the
number of seats. The naive approach yields on the order of (c!)^k
combinations, but we can make use of the fact that each ordering must
have a different candidate ranked first, and so attain C(c, k) *
((c-1)!)^k, where C is the choice function. With equal-rank (which I
haven't tried), it would presumably be worse.
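
To give a sense of scale (my own arithmetic from the counts above):
with c = 5 candidates and k = 2 seats, that is C(5,2) * (4!)^2 =
10 * 576 = 5760 ordering sets to evaluate, down from (5!)^2 = 14400 for
the naive approach - and every one of them still requires solving the
assignment problem described earlier.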

Therefore, the method isn't very practical. It does, however, show a
way of doing proportionality other than by elect-and-punish. It could,
perhaps, be likened to those other superpolynomial-time multiwinner
election methods, the "try every council against every other" sort,
because in a sense, when we're trying (for instance) the cluster
orderings A > (abc), B > (def), and C > (ghi) (for various values of
abc, def, and ghi), we're testing the council {A, B, C}.

While the method is very slow, similar approaches might be possible
for faster methods. Just as Kemeny is slow but MAM is fast, perhaps
there is a distance-clustering method along these lines that is fast.

For MAM, one may define the score of a cluster's Condorcet matrix with
respect to an ordering as the strength of the pairwise defeat of A
against B, where A and B are picked to maximize that strength among the
pairs the ordering ranks A ahead of B, breaking ties with the
next-strongest such pairwise defeat, and so on. The MAM procedure
itself picks the optimal (highest-scoring) ordering, so the problem
would be to select subsets of nearly equal numbers of voters so that
the greatest pairwise defeat of each group's Condorcet matrix is
maximized, and so that the MAM procedure, applied to each Condorcet
matrix, outputs orderings with different candidates ranked first. How
that is to be done, I don't know, but the Condorcet matrix separation
problem sounds NPC-ish. Perhaps the constraints can be relaxed (as we
did with Kemeny) to get around the NP-hardness.
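
To make that per-cluster score concrete, here is one possible reading
of it in Python (a hypothetical sketch of mine, not MAM's actual
definition): the score of a cluster against an ordering is the list of
strengths of the pairwise defeats the ordering agrees with, strongest
first, and such lists are compared lexicographically.

def cluster_score(condorcet, ordering):
    """condorcet maps (x, y) to the number of the cluster's voters who
    rank x over y; ordering is a sequence from most to least preferred.
    Returns the strengths of the defeats the ordering agrees with,
    sorted strongest first, for lexicographic comparison."""
    strengths = []
    for i, a in enumerate(ordering):
        for b in ordering[i + 1:]:
            # count a defeat only where the ordering ranks a ahead of b
            # and a actually pairwise beats b (my reading of "defeat")
            if condorcet[(a, b)] > condorcet[(b, a)]:
                strengths.append(condorcet[(a, b)])
    return sorted(strengths, reverse=True)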

-----------------------------------------------

(Phew! That was quite a bit of text. Hopefully there aren't too many 
errors here - if there are any, just let me know which.)


