[EM] Whoops! Correction to Statistical Condorcet
Kristofer Munsterhjelm
km_elmet at t-online.de
Wed Sep 3 11:10:47 PDT 2014
Apparently the unconstrained MLE for the multinomial distribution with
given probabilities isn't the Sainte-Laguë or Webster apportionment, but
the D'Hondt or Jefferson apportionment[1].
This came as quite a surprise to me, given that the chi-squared and
G-tests are claimed to approach the exact test as n goes to infinity. I
may look into this in detail later, but I suspect the situation is that
although the exact test value, call it x, and the G-test value, call it
y, obey lim n->inf x_n-y_n = 0, for any finite n, the exact test will
give the D'Hondt assignment a greater value (more likely draw) than the
Sainte-Laguë one, whereas it's the other way around for the G-test (or
chi-squared test).
I am not a formal statistician, though! And since I got the implications
of the convergence wrong, I might be wrong about this as well.
For clarification purposes, letting s be the number of seats, the exact
test for the multinomial distribution is: given a draw vector x, (i.e.
so many of x_1, so many of x_2, etc) and a probability vector p = (p_1,
p_2, ...), and the multinomial pmf P
Pr(x) = sum [for all y so that P(y; s, p) <= P(x; s, p)] P(y;s,p)
And letting s be the number of seats, define chi-squared test statistic as
chi(x, p) = SUM i=1...n (x_i/s - p_i)^2/(p_i)
Then, say, for the following votes: (10, 9, 8, 5, 4) and 5 seats, we have:
p-vector: 0.28, 0.25, 0.22, 0.14, 0.11
Sainte-Lague: (1, 1, 1, 1, 1)
D'Hondt: (2, 1, 1, 1, 0)
Exact test value for Sainte-Lague: 0.899
Exact test value for D'Hondt: 1.0
multinomial pmf for the Sainte-Lague assignment: 0.028
multinomial pmf for the assignments with greater probability than this:
0.03234 for [1, 2, 1, 1, 0]
0.03622 for [2, 1, 1, 1, 0]
0.03234 for [2, 2, 1, 0, 0]
But the chi-squared statistic for Sainte-Lague is 0.13403 while the one
for the D'Hondt apportionment is 0.19896, thus ranking the former higher
than the latter.
This is true even for large s, e.g.:
p = (0.3786, 0.245265, 0.1846, 0.06637, 0.06583, 0.059335)
150 seats
Sainte-Lague: [56, 37, 28, 10, 10, 9]
D'Hondt: [57, 37, 28, 10, 9, 9]
pmf for Sainte-Lague: 1.64*10^-5
pmf for D'Hondt: 1.66*10^-5
chi-square for Sainte-Lague: 0.00012
chi-square for D'Hondt: 0.00058
-
Of course, if you like D'Hondt (for stability reasons or otherwise), you
don't need to do anything to Statistical Condorcet to fix the above.
Because it's Condorcet-based, it should also favor compromise parties
rather than parties that get large numbers of first preference votes, so
it is better than ordinary D'Hondt in that respect.
But if you don't, then the elegance of maximizing the pmf falls. So we'd
have to find some way of using, say, the global optimality properties
mentioned in http://rangevoting.org/Apportion.html directly. But this is
tricky because they are all minimization properties, which means that
the optimizer might just decide to set zero voters to participate and
thus get a perfect zero every time.
That is again something to investigate later. Perhaps taking the area of
the chi-squared distribution above the point given by the chi-squared or
G-test would work: that turns it into a maximization problem again. But
since the cdf for chi-squared involves gamma functions, optimizing that
might be rather difficult.
Alternatively, we might go deeper. Why choose Sainte-Laguë to begin
with? Because it's unbiased: it doesn't consistently favor small or
large parties (and, because it's a divisor method, it has certain
favorable properties we'd like to carry over). So find something that is
unbiased. But the problem with that is that we might lose the "reduction
to Webster when everybody plumps" property.
-
[1] I uncovered this when reading "A fast and simple algorithm for
finding the modes of a multinomial distribution" by White and Hendy. It
gives an algorithm for finding a mode of the multinomial, i.e. an
apportionment that maximizes the exact test value. The paper is
paywalled, but the algorithm is essentially a combination of Jefferson
and D'Hondt: first they get a Jefferson solution for a number of seats
that's close enough to the number of seats specified, and then they run
D'Hondt either forwards or in reverse until they get the number of seats
you want. The authors don't appear to recognize the solution as D'Hondt,
though.
More information about the Election-Methods
mailing list