[EM] Monroe's Clustering Method (was PR methods and Quotas)

Tue Jul 26 00:03:40 PDT 2011

Andy Jennings wrote:
> Kristofer Munsterhjelm wrote:
> 
>     Andy Jennings wrote:
> 
>         Like Jameson and Toby, I have spent some time thinking about how
>         to make a median-based PR system.
> 
>         The system I came up with is similar to Jameson's, but simpler,
>         and uses the Hare quota!
> 
> 
>     How about clustering logic? Say you have an electorate of n voters,
>     and you want k seats. The method would be combinatorial: you'd check
>     a prospective slate. Say the slate is {ABC...}. Then that means you
>     make a group of n/k voters and assign A to this gorup, another group
>     of n/k other voters and assign B to that group, and so on.
>     The score of each slate is equal to the sum of the median scores for
>     each assigned candidate, when considering only the voters in the
>     assigned candidate's group.
> 
> 
> Kristofer,
> 
> If you want a clustering PR method, then I would highly recommend 
> Monroe.  In Monroe, the score of each slate is equal to the sum of each 
> voter's score of his assigned candidate.  I think that, for a given 
> slate, finding the optimal way of assigning the voters to the winners is 
> polynomial.  (See Warren's page for more 
> information: http://rangevoting.org/MonroeMW.html)
> 
> I consider it to be a very promising method.

I think this is the method I called CFC-Range, for "continuous forced 
clustering, Range". This method assigns voters, or fractional voters (so 
we can handle weighted votes) to one of a number of clusters, where each 
cluster is assigned to a candidate, so as to maximize the score that is 
the sum of scores to the assigned candidates in each cluster. This is 
done subject to the constraint that each cluster must have the same 
weight (sum of votes' weights), so one doesn't end up with one cluster 
representing 99% of the people and others less than 1%. Then it tries 
each possible slate to find the one that has a max sum, and that one wins.

As an example, say that there are three clusters, and they're assigned 
to A, B, and C, respectively. Then the method assigns voters (or 
fractional voters) to cluster #1, #2, and #3 so as to maximize [the 
score for A in the first cluster plus the score for B in the second 
cluster plus the score for C in the third cluster].

I do that assignment per slate by using linear programming, but I guess 
it could be done faster by more specialized algorithms, as Warren 
states. It might also be the case that Monroe's original integer 
programming formulation will relax easily to linear programming and so 
in practice be solvable quickly, similarly to how I usually can find the 
Kemeny and Dodgson winners quickly.

I also made a Kemeny version of this - actually, the Kemeny version was 
the one I tried first. Here, one assigns orderings (rankings of 
candidates) to each cluster, and then allocates voters to the different 
clusters to maximize the Kemeny score of the assigned ordering wrt the 
cluster in question. The orderings are constrained so that there is a 
different first place candidate for each cluster's ordering, and then 
the candidates at first place of the orderings whose Kemeny scores' sums 
are maximum, win.
However, this is *very* slow, because now one doesn't only have to find 
the right slate, but the right combination of orderings. Thus, it's 
impractical, though it seems to work to some degree as a multiwinner 
method (were it not so slow...).
Therefore, I've been interested in Condorcet methods that give 
reasonable results under a variant of margins where the X against Y 
score is always (number of votes where X is above Y) - (number of voters 
where Y is above X), not either this or 0, whichever is more. If I could 
find such a method, it might be incorporable into linear programming and 
then I could make a clustering Condorcet method that would only be 
per-slate combinatorial instead of worse.

> If I recall, when you did your PR simulations and graphed them on two 
> axes (total satisfaction vs. proportionality), you said you weren't 
> completely happy with your metric for proportionality.  I think you 
> should consider using Monroe's metric.  That is, no matter how a given 
> method chooses it's slate, you run the algorithm on that slate to 
> optimally assign the voters to that slate and then you compute the sum 
> of the voters' grades of their assigned candidates.

Yes, I could do that. I could do that for any combinatorial method, 
actually. Just like different Condorcet methods can be phrased as "find 
the winner while minimizing the influence of noise type X" (where X 
depends on the method), the combinatorial methods could also be 
considered "find the best slate according to logic Y". There is a side 
effect, though, and that is that doing so will turn the combinatorial 
optimization method perfect (because it optimizes according to that 
logic). Because I don't want to give any method an automatic advantage, 
I would prefer the metric to be indirect - to measure something related 
to proportionality rather than the candidates themselves. The binary 
issue-space metric I'm currently using does that, but it's not very 
detailed (since it is binary).

> Of course, to actually find the optimal slate is an NP problem.  People 
> often show this by embedding the "exact cover by 3-sets problem" in it. 
>  But that requires having one-third as many candidates as voters!  In 
> fact, finding the optimal slate is NP in the number of candidates and 
> seats to fill, not in the number of voters.  If there are not too many 
> candidates or not too many seats to fill, it should be tractable to 
> iterate over all possible winning sets of candidates.  In reality, I 
> think an integer programming algorithm should be able to find the 
> optimal slate fairly quickly.  Or you can always dodge the question by 
> having a public contest to find the best slate.

Or a combination of the two: run the state computer for a day, then set 
the result as the best so far and let competitors try different ones if 
they have any that are better.