[EM] Statistical Condorcet: Combining Schulze STV and Webster

Sat Aug 30 09:06:54 PDT 2014

Some days ago, I thought more about the relation of chi-square to the 
Sainte-Lague index and Webster as a MLE for the multinomial (for large 
seat assemblies), and after considering this further, I think I have 
found a way to combine Schulze STV and Webster to make a more fair party 
list method. Unfortunately, unless the structure of the thing permits 
massive pruning, it will be much too slow to run on a national scale. 
But perhaps it could be used in local elections.

The method goes like this: The meta-system is like Schulze STV. Define a 
contest function c(sA,sB) where sA and sB are sets of candidates that
differ from each other by just one candidate. Contests between sets that 
differ by more than one candidate are handled through strongest path, as 
in ordinary Schulze STV.

Since voters vote for parties, a contest may also contain numerous 
copies of the same "candidate". For instance, [AAAB] is a set with 3 
members from party A and one from B.

The main difference between this method and Schulze STV is in what c(sA, 
sB) is. Let sA and sB be council assignments that differ by a single 
candidate which we'll call C. C is present in sB but not in sA, so c(sA, 
sB) returning a higher score means sA is better than sB.

As in Schulze STV, in a given contest, each voter may give his vote to
any candidate he prefers to C. But how is this assignment to be guided? 
Well, that's where the difference comes in. Let x_j be the number of 
seats given to the jth party in the set sA.
Let f(x_1, ..., x_n, n; p_1, ..., p_n) be the multinomial pmf for n 
draws where the probability of drawing an item from category j is p_j, 
and x_j is the number of items drawn. Let p_j be the fraction of the 
voters' support that goes to the jth party. Then the voter-party support 
assignment is done so as to maximize f. So in this particular contest, 
the voters are engaging in a kind of maximum-likelihood cumulative vote 
DSV against C.

Some voters may not be able to contribute to this - for instance, if
they rank C on top. Let nv be the number of voters who did contribute.
Then c(sA, sB) = nv * f(x_1, ..., x_n, n; p_1, ..., p_n).

There is one other quirk to iron out. Since parties may have multiple
candidates in the running, C might be a member of a party that's also 
present in sA. In that case, since voters directly vote for parties, 
they are also allowed to support the party C is a part of if sA contains 
at least one member of that party. This doesn't directly help C because 
if the supporters of C's party were to dump too much support into that 
party, the fit that is optimized by f would decrease. Yet it's only 
permitted if at least one candidate of that party is present on the left 
side; otherwise, single-winner won't reduce properly. (It can also be 
seen by considering party support for A>B>C as 
A1>...>An>B1>...>Bn>C1>...>Cn and then using ordinary Schulze STV rules)

====

Note that it's conceptually easy to alter this system to take other
factors into account. Just alter the distribution and the pmf. For 
instance, if you want to incorporate that some parties may only have a 
few members (not enough to fill the council), then replace the 
multinomial with the multivariate hypergeometric distribution.

Also, the method above converges to Sainte-Lague/Webster when everybody
plumps for one party each, and the council size is large. This is 
because Sainte-Lague/Webster optimizes the Sainte-Lague index, which, 
through its connection to the chi-squared test, finds the council that 
optimizes f above when the council size is large enough.

The mathematical simplicity of the thing might mean that one can 
optimize it to run on large instances. In particular, dealing with it on 
a candidate by candidate basis rather than a party-by-party basis is 
designed to make full use of excess votes so they may add up to give a 
seat to a compromise party. Perhaps one could determine the number of 
safe seats for each party in a per-party manner and then hone the result 
afterwards.

One could also imagine an Approval variant of the concept: voters 
approve of one or more candidates or parties. Then the method picks the 
council that maximizes nv * f as above where each voter's support is 
distributed over the candidates or parties he approved, cumulative 
voting style. But I have a hunch that such a method would reduce to 
Monroe's method or a variant of it.

As for applying MJ to this idea... perhaps robust estimators? I don't know.

I could write something about how to find the optimal vote supports as 
well, but this post is long enough. Tell me so if you think I should. 
The short of it is that it comes down to optimizing a weighted sum of 
logarithms under linear constraints.