[EM] Condorcet clustering methods: correction and quotas

Sun Feb 23 03:17:44 PST 2014

First, a correction to my minmax clustering method.

In the data part, we have:

param ballot_condmat :=
                  #C L R
         [1,*,*]:  1 2 3 :=      # 11: L>C>R
                 1 0 0 11
                 2 12 0 11
                 3 0 0 0
         [2,*,*]:  1 2 3 :=      # 10: R>C>L
                 1 0 10 0
                 2 0 0 0
                 3 10 10 0
         [3,*,*]:  1 2 3 :=      #  2: C > L > R
                 1 0 2 2
                 2 0 0 2
                 3 0 0 0;

This should obviously be

param ballot_condmat :=
                  #C L R
         [1,*,*]:  1 2 3 :=      # 11: L>C>R
                 1 0 0 11
                 2 11 0 11
                 3 0 0 0
         [2,*,*]:  1 2 3 :=      # 10: R>C>L
                 1 0 10 0
                 2 0 0 0
                 3 10 10 0
         [3,*,*]:  1 2 3 :=      #  2: C > L > R
                 1 0 2 2
                 2 0 0 2
                 3 0 0 0;

i.e. the 12 in the second row (L's row), in the column for C, is 
altered into an 11, since that ballot is cast by only 11 voters. With 
this fix, the clustering method with ordinary (summed) scores gives C a 
seat for council sizes of 1, 3, 5, 7, 8, 9, and 10 seats (of possible 
sizes <= 10 seats).
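
As a sanity check on the corrected data, here is a small Python sketch 
(not part of the MathProg programs) that rebuilds each ballot's Condorcet 
matrix from its ranking and weight. The function name and the 0-based 
candidate order C, L, R are my own assumptions for illustration; the 
only thing taken from the data section is the three ballot types.

```python
# Candidate order matches the data section's columns: C, L, R.
CANDIDATES = ["C", "L", "R"]

def condorcet_matrix(ranking, weight):
    """Return m where m[a][b] = weight if a is ranked above b, else 0."""
    pos = {cand: i for i, cand in enumerate(ranking)}
    return [[weight if pos[a] < pos[b] else 0 for b in CANDIDATES]
            for a in CANDIDATES]

# The three ballot types of the LCR example: (ranking, number of voters).
BALLOTS = [("LCR", 11), ("RCL", 10), ("CLR", 2)]

matrices = [condorcet_matrix(r, w) for r, w in BALLOTS]
for (ranking, weight), m in zip(BALLOTS, matrices):
    print(weight, ">".join(ranking), m)
```

No entry of a ballot's matrix can exceed that ballot's weight, which is 
why a 12 in an 11-voter ballot has to be wrong.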

The minmax score option only gives C a seat for sizes 1, 9, and 10, 
which favors the center less than even Plurality-based Webster.

-----------------------------------------------------------

I didn't quite realize it at first, but the method (and thus Monroe's 
original method) has an implied quota.

In his paper, Monroe says that each winner (of which there are m, for m 
seats) is associated with a constituency of n/m voters, for n voters in 
total (p. 928, American Political Science Review, Vol. 89, No. 4). This 
means that every constituency (what I've been calling a cluster) is of 
the same size.

This constraint is expressed, in the minmax code, as:

s.t. same_size{c in CLUSTERS}:
         sum {k in BALLOTS} (ballot_fraction[k, c] * ballot_wt[k]) =
                 totweight/numclusters;

(The River code is analogous, except that ballot_fraction's indices are 
reversed: what the minmax program stores as ballot_fraction[x, y], the 
River program stores as ballot_fraction[y, x]. Oops!)

What this says is that the number of voters in each cluster (the sum, 
over all ballots, of the ballot's fraction in that cluster times the 
ballot's weight) must equal the same number, which is totweight / 
numclusters, or n/m.
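
To make the constraint concrete, here is a minimal Python sketch that 
checks it for a given assignment. It uses the minmax index order 
ballot_fraction[k][c] and exact fractions; the helper names and the 
two-cluster assignment are hypothetical, and I'm assuming each ballot's 
fractions sum to 1 across clusters.

```python
from fractions import Fraction

def cluster_sizes(ballot_fraction, ballot_wt):
    """Weighted number of voters in each cluster.

    ballot_fraction[k][c] is the share of ballot type k assigned to
    cluster c; ballot_wt[k] is the number of voters casting ballot k.
    """
    num_clusters = len(ballot_fraction[0])
    return [sum(ballot_fraction[k][c] * ballot_wt[k]
                for k in range(len(ballot_wt)))
            for c in range(num_clusters)]

def satisfies_same_size(ballot_fraction, ballot_wt):
    """True if every cluster holds exactly totweight / numclusters voters."""
    target = Fraction(sum(ballot_wt), len(ballot_fraction[0]))
    return all(size == target
               for size in cluster_sizes(ballot_fraction, ballot_wt))

ballot_wt = [11, 10, 2]  # L>C>R, R>C>L, C>L>R

# Hypothetical assignment for two clusters; each must hold 23/2 voters.
ballot_fraction = [
    [Fraction(1), Fraction(0)],        # all L>C>R voters in cluster 0
    [Fraction(0), Fraction(1)],        # all R>C>L voters in cluster 1
    [Fraction(1, 4), Fraction(3, 4)],  # C>L>R voters split 0.5 / 1.5
]

print(cluster_sizes(ballot_fraction, ballot_wt))
print(satisfies_same_size(ballot_fraction, ballot_wt))
```

With 23 voters and two clusters, the two C>L>R voters have to be split 
unevenly to bring both clusters to 11.5 voters.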

But this implies a Hare quota. Say that, in the minmax program, a given 
candidate exceeds a Hare quota of first preferences. Then a cluster 
consisting entirely of voters ranking this candidate first will get 
maximum score. Since each voter in that cluster ranks the candidate 
above everybody else, the weakest victory for this candidate will equal 
the cluster size, which is the maximum possible score a cluster can 
attain.
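
The argument can be illustrated with a short Python sketch. It assumes 
that a cluster's minmax score for a candidate is that candidate's 
weakest pairwise victory among the cluster's voters, as described above; 
the function name and the two example clusters are hypothetical. With 
three seats the Hare quota is 23/3, which L's 11 first preferences 
exceed.

```python
from fractions import Fraction

def weakest_victory(cluster, candidate, candidates):
    """Smallest weight of cluster voters preferring `candidate` to a rival.

    `cluster` is a list of (ranking, weight) pairs; every ranking lists
    all candidates, best first.
    """
    def prefers(ranking, a, b):
        return ranking.index(a) < ranking.index(b)

    return min(
        sum(w for ranking, w in cluster if prefers(ranking, candidate, rival))
        for rival in candidates if rival != candidate
    )

candidates = "CLR"
quota = Fraction(23, 3)  # Hare quota for 3 seats and 23 voters

# A cluster made up only of L>C>R voters: L's score equals the cluster size.
pure = [("LCR", quota)]
# A cluster of the same size with some R>C>L voters mixed in.
mixed = [("LCR", Fraction(5)), ("RCL", quota - 5)]

print(weakest_victory(pure, "L", candidates) == quota)
print(weakest_victory(mixed, "L", candidates) < quota)
```

Any voter in the cluster who doesn't rank the candidate first lowers at 
least one pairwise count, so the pure cluster is the one that attains the 
maximum.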

But if that is a Hare quota constraint, then it's relatively easy to 
alter it into a Droop quota constraint. By the same notation (and array 
order) as above:

s.t. quota_constraint{c in CLUSTERS}:
         sum {k in BALLOTS} (ballot_fraction[k, c] * ballot_wt[k]) >=
                 (totweight + 1e-6)/(numclusters+1);

The 1e-6 is there because a Droop quota constraint is strictly that "any 
party supported by more than k Droop quotas should have at least k 
seats", so if it's supported by exactly k Droop quotas, it doesn't 
necessarily get k seats. But exact ties like that are vanishingly rare, 
and you may remove the fudge factor if it seems ugly.

What happens now is that each cluster is constrained to be at least a 
Droop quota in size, but can be larger if that increases the score.
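
For a feel of the numbers, here is a small Python sketch comparing the 
Hare size n/m with the Droop lower bound from quota_constraint for the 
23 voters of the LCR example. The function names are made up for 
illustration; "free weight" is what remains after every cluster has 
taken its minimum, i.e. the weight the solver may distribute wherever it 
raises the score.

```python
from fractions import Fraction

TOTWEIGHT = 23  # 11 + 10 + 2 voters in the LCR example

def hare_quota(seats):
    """Exact cluster size required by the same_size constraint."""
    return Fraction(TOTWEIGHT, seats)

def droop_bound(seats, eps=Fraction(1, 10**6)):
    """Minimum cluster size required by quota_constraint."""
    return (TOTWEIGHT + eps) / (seats + 1)

for seats in range(1, 11):
    # Weight left over once every cluster holds exactly its minimum.
    free = TOTWEIGHT - seats * droop_bound(seats)
    print(f"{seats:2d} seats: Hare {float(hare_quota(seats)):7.3f}  "
          f"Droop bound {float(droop_bound(seats)):7.3f}  "
          f"free weight {float(free):6.3f}")
```

The free weight is always just under one Droop quota, so the Droop 
version leaves roughly one quota's worth of voters unassigned by the 
size constraint itself.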

But how does this change the outcome? Well, in the LCR example, running 
the minmax program with the Droop constraint, we get that, for council 
sizes <= 10:

- with ordinary (summed) scores, C gets a seat when the total number of 
seats is 1, 3, 4, 5, 6, 7, 8, 9, or 10.
- with minimax scores, C gets a seat when the total number of seats is 
1, 3, 5, 7, 8, 9, or 10.
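
The four sets of results reported in this post can be laid side by side 
with a short Python sketch. Nothing is recomputed here: the sets are 
simply copied from the text, and the dictionary layout is my own.

```python
SIZES = range(1, 11)

# Council sizes (<= 10) at which C wins a seat, as reported in this post.
c_wins = {
    ("Hare", "summed"):   {1, 3, 5, 7, 8, 9, 10},
    ("Hare", "minimax"):  {1, 9, 10},
    ("Droop", "summed"):  {1, 3, 4, 5, 6, 7, 8, 9, 10},
    ("Droop", "minimax"): {1, 3, 5, 7, 8, 9, 10},
}

for (quota, score), sizes in c_wins.items():
    row = " ".join("C" if s in sizes else "." for s in SIZES)
    print(f"{quota:5s} {score:8s} {row}  ({len(sizes)}/10)")
```

For both scoring options, every size at which C wins under Hare is also 
a size at which C wins under Droop, and the Droop/minimax row coincides 
with the Hare/summed row.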

So with a Droop quota, the method is less proportional and more 
majoritarian than with a Hare quota. This seems in line with how 
Plurality-based party list methods with smaller quotas favor large 
parties. If those methods have a greater "large party bias" with smaller 
quotas, the Condorcet-clustering method has a greater "centrist bias" 
with smaller quotas.

More generally, it seems that smaller quotas make the method act more 
like it does in the single-winner case. Plurality- and IRV-based methods 
have a large-party bias because large parties have many votes and so 
would be elected more often in plain Plurality (or IRV). Similarly, if 
I'm right, Condorcet-based methods have a centrist-majority bias because 
centrists are more likely to be elected in single-winner Condorcet.

-

And now that I know there's a quota in these methods, what's next? 
"Floating quota" Webster/Sainte-Laguë? That would be hard to implement 
within the limits of mixed-integer programming.

The most surprising thing here is that minimax scoring seems to behave 
"properly" with a Droop quota whereas summed scores behave better with a 
Hare quota. How do the quota rules influence the outcome with summed 
scores vs with minimax scores? It might be interesting to explore in 
greater detail. Perhaps there is something to the distinction mentioned 
about Hare and Droop on Wikipedia: a Hare quota *represents* a group, 
whereas a Droop quota *is elected by* a group. But a more formal or 
rigorous investigation would probably need extreme cases, where minimax 
and sum differ as greatly as possible.