[EM] The Sainte-Lague index and proportionality

Sat Jul 7 08:56:03 PDT 2012

The Sainte-Laguë index is a measure of disproportionality that is 
minimized by the Sainte-Laguë / Webster method. (Michael Gallagher also 
recommended it as "the standard measure of disproportionality".)

The Sainte-Laguë index is simply the sum, over all parties (or other 
distinct groups), of (V_p - S_p)^2 / V_p, where V_p is the share of votes 
for party (or group) p, and S_p is its share of seats. If there were as 
many seats as voters, every party could get exactly its vote share, so 
V_p - S_p would be 0 for each p. Since 0/x is 0 for any x != 0, this 
index is 0 in the case of perfect proportionality.
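
To make the definition concrete, here is a minimal sketch in Python. The function name and the dict-based input are my own choices for illustration, and it assumes every listed party got at least one vote.

```python
def sainte_lague_index(votes, seats):
    """Sum over parties of (V_p - S_p)^2 / V_p, with V_p and S_p as shares.

    votes and seats map party name -> raw count. Assumes every party in
    votes has a nonzero vote count (hypothetical helper, not a library API).
    """
    total_votes = sum(votes.values())
    total_seats = sum(seats.values())
    index = 0.0
    for party, v in votes.items():
        v_share = v / total_votes
        s_share = seats.get(party, 0) / total_seats
        index += (v_share - s_share) ** 2 / v_share
    return index


votes = {"A": 50, "B": 30, "C": 20}
proportional = sainte_lague_index(votes, {"A": 5, "B": 3, "C": 2})
skewed = sainte_lague_index(votes, {"A": 6, "B": 3, "C": 1})
```

With seats matching the vote shares exactly, the index is 0. Moving one seat from C to A gives 0.01/0.5 + 0.01/0.2 = 0.07.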

However, the case of perfect disproportionality shows a problem with 
this index. If there's a party that gets no votes whatsoever, then V_p is 
0 and you get a division by zero. For a zero-vote party with no seats, 
it's easy to either say 0/0 = 0 or just exclude the party (as adding a 
party with no seats and no votes shouldn't have an effect), but if that 
party gets a seat, then the index resolves to infinity. It's pretty 
unlikely that a party with no votes would get a seat, but if a party 
with a very low vote share happened to get a seat, its term would be 
divided by a tiny V_p and could dominate the index, so it'd be useful to 
find something that acts like the Sainte-Laguë index but handles those 
situations better.
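
A small sketch of those edge cases, assuming the "0/0 = 0" convention described above (the function name and the example shares are made up for illustration):

```python
import math


def sli_with_zero_rule(vote_shares, seat_shares):
    """Sainte-Laguë index on shares, with the 0/0 = 0 convention.

    A party with no votes and no seats is skipped; a party with no votes
    but some seats makes the index infinite.
    """
    total = 0.0
    for v, s in zip(vote_shares, seat_shares):
        if v == 0:
            if s == 0:
                continue          # 0/0 treated as 0: the party has no effect
            return math.inf       # seats without votes: index blows up
        total += (v - s) ** 2 / v
    return total


# Adding a no-vote, no-seat party changes nothing.
padded = sli_with_zero_rule([0.6, 0.4, 0.0], [0.6, 0.4, 0.0])
# A no-vote party that somehow gets a seat gives infinity.
impossible = sli_with_zero_rule([0.6, 0.4, 0.0], [0.5, 0.4, 0.1])
# A party with 0.1% of the votes and 10% of the seats dominates the sum.
lopsided = sli_with_zero_rule([0.6, 0.399, 0.001], [0.6, 0.3, 0.1])
```

In the last case the tiny party alone contributes 0.099^2 / 0.001 = 9.801, against roughly 0.025 from the party that lost the seat, so the index comes out around 9.83.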

The expression SUM over p of (V_p - S_p)^2 / V_p looks a lot like the 
chi^2 statistic of the chi-squared test. If we multiply both V_p and S_p 
by the number of seats (which multiplies the whole sum by the number of 
seats), we get a chi-squared test where the expected value is the number 
of seats the given party "ought" to have (in the ideal case), and the 
observed value is the number of seats it actually got -- although then 
the chi^2 value is used directly instead of transformed into a p-value.
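
The correspondence can be checked numerically. This is only a sketch with made-up numbers; the helper names are hypothetical, and the statistic is computed by hand rather than through a statistics library.

```python
def pearson_chi2(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def sainte_lague_index(vote_shares, seat_shares):
    """Sum of (V_p - S_p)^2 / V_p over parties, on shares."""
    return sum((v - s) ** 2 / v for v, s in zip(vote_shares, seat_shares))


n_seats = 10
vote_shares = [0.5, 0.3, 0.2]
seats_won = [6, 3, 1]

# Expected seat counts are the vote shares scaled up by the assembly size.
expected = [v * n_seats for v in vote_shares]
seat_shares = [k / n_seats for k in seats_won]

chi2 = pearson_chi2(seats_won, expected)
sli = sainte_lague_index(vote_shares, seat_shares)
```

Here chi2 equals n_seats * sli (0.7 versus 0.07), since scaling both shares by the number of seats scales each term's numerator by its square and the denominator by one factor.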

And to my knowledge, the same problem exists in the context of 
chi-squared tests. There, they use rules of thumb like "where there is 
only one degree of freedom, the approximation is not reliable if 
expected frequencies are below 10".

One could go in two directions, then. First, one could note that the 
Sainte-Laguë index is related to a chi-squared test of the hypothesis 
that the seats were sampled from the ideal distribution of seats as 
given by the voters. Then, other ways of measuring goodness-of-fit might 
work where the Sainte-Laguë index itself fails. Perhaps an exact 
multinomial test would work for small assemblies. If one needs to have 
numbers similar to the Sainte-Laguë index, one could just reverse the 
final step of the chi-squared test (and go from p-value to chi^2 rather 
than vice versa).
Second, improvements to the chi-squared test could be used to improve 
the Sainte-Laguë index. Again, as an example, one could construct a 
"Sainte-Laguë G-index": 2 * SUM (over p) of S_p * ln(S_p/V_p), which is 
to the Sainte-Laguë index what the G-test is to the chi-squared test. 
Note, though, that this still has the original SLI's division-by-zero 
problem when a party with V_p = 0 gets a seat, and to get the same 
independence of no-vote parties, one would have to set 0 * ln(0/0) = 0 
(and, as is usual for the G-test, treat the term of any zero-seat party 
as 0).
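
A rough sketch of such a G-index, assuming the usual G-test form in which each log term is weighted by the observed share S_p, and assuming zero-seat terms are taken as 0 (the function name is hypothetical):

```python
import math


def sainte_lague_g_index(vote_shares, seat_shares):
    """G-test analogue of the Sainte-Laguë index: 2 * sum S_p * ln(S_p / V_p).

    Terms with S_p = 0 are taken as 0 (including the 0/0 case), and a
    party with seats but no votes makes the index infinite.
    """
    total = 0.0
    for v, s in zip(vote_shares, seat_shares):
        if s == 0:
            continue              # 0 * ln(0 / v) treated as 0
        if v == 0:
            return math.inf       # seats without votes: same blow-up as SLI
        total += s * math.log(s / v)
    return 2.0 * total


perfect = sainte_lague_g_index([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
skewed = sainte_lague_g_index([0.5, 0.3, 0.2], [0.6, 0.3, 0.1])
padded = sainte_lague_g_index([0.5, 0.3, 0.2, 0.0], [0.6, 0.3, 0.1, 0.0])
broken = sainte_lague_g_index([0.6, 0.4, 0.0], [0.5, 0.4, 0.1])
```

For the skewed example this gives about 0.08, close to the 0.07 the ordinary index gives for the same shares, and padding the lists with a no-vote, no-seat party leaves the result unchanged.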

(Usual disclaimer: I Am Not A Statistician.)