[EM] The Sainte-Lague index and proportionality

Sun Jul 8 21:33:47 PDT 2012

>> You know, as desirable as it is for everyone to be as close as possible
>> to their correct proportional share, _bias_ seems to me to be the really
>> important consideration in apportionment and PR. Size-bias.
>>
>
> I imagine that the SLI would penalize bias more heavily than random
> inaccuracy. My intuition goes like this: there are only a few ways a method
> can be consistently biased (small-party bias or large-party bias), but
> there are many ways one might have random noise. Therefore, if you see bias
> of any given type in a seat distribution, then that would make you much
> less likely to think the distribution is a good fit to the voting
> distribution than if you saw just random noise of the same magnitude. In
> the same way, if the SLI measures goodness-of-fit between the distribution
> given by the votes and the seat distribution, adding consistent bias would
> produce a worse fit than would random noise of the same magnitude.
>
> Chi-squared tests are also pretty good at distinguishing low quality
> pseudorandom number generators from better ones. When PRNGs fail, they
> usually fail by exhibiting bias. For instance, linear congruential
> generators exhibit bias where n sequential numbers fall on one of a
> small number of planes in n-dimensional space, where n depends on the
> generator.
>
> So I think SLI would penalize bias pretty effectively. In any event, it's
> easy to check. Take the voting distribution, then add either consistent
> bias (correlation between s/q and q) or random noise, scaled to fix the
> RMSE of the result at a predetermined level. Then compare the SLI of the
> biased distribution with the SLI of the randomized one, and do this
> enough times in a Monte-Carlo fashion. If I'm right, the mean SLI should
> be worse for the distributions with bias than for the ones with random
> noise.
>
>
>
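The check described in the quoted message could be set up roughly as
follows. This is only a sketch under my own assumptions, not anyone's
actual test code: seat shares are treated as real-valued fractions rather
than whole seats, the "bias" is a large-party bias with s/q exactly linear
in q, the "noise" is Gaussian, and all function names are made up for the
illustration.

```python
import math
import random

def sl_index(vote_shares, seat_shares):
    """Sainte-Lague index: sum of (s_i - q_i)^2 / q_i, shares as fractions."""
    return sum((s - q) ** 2 / q for q, s in zip(vote_shares, seat_shares))

def rmse(deltas):
    """Root-mean-square size of a list of per-party deviations."""
    return math.sqrt(sum(d * d for d in deltas) / len(deltas))

def biased_deltas(q, target):
    """Assumed bias model: s_i/q_i = 1 + k*(q_i - c), linear in q_i."""
    c = sum(x * x for x in q)          # deltas sum to zero since sum(q) == 1
    raw = [x * (x - c) for x in q]
    k = target / rmse(raw)             # scale so the RMSE hits the target
    return [k * d for d in raw]

def noisy_deltas(q, target, rng):
    """Assumed noise model: Gaussian deltas, centred, scaled to the target."""
    raw = [rng.gauss(0.0, 1.0) for _ in q]
    mean = sum(raw) / len(raw)
    centred = [d - mean for d in raw]  # keep total seat share unchanged
    k = target / rmse(centred)
    return [k * d for d in centred]

def compare(trials=2000, parties=6, target=0.002, seed=1):
    """Return (mean SLI with bias, mean SLI with noise) at equal RMSE."""
    rng = random.Random(seed)
    biased_total = noisy_total = 0.0
    for _ in range(trials):
        weights = [rng.uniform(0.05, 1.0) for _ in range(parties)]
        total = sum(weights)
        q = [w / total for w in weights]
        biased = [x + d for x, d in zip(q, biased_deltas(q, target))]
        noisy = [x + d for x, d in zip(q, noisy_deltas(q, target, rng))]
        biased_total += sl_index(q, biased)
        noisy_total += sl_index(q, noisy)
    return biased_total / trials, noisy_total / trials

if __name__ == "__main__":
    mean_biased, mean_noisy = compare()
    print("mean SLI, biased:", mean_biased)
    print("mean SLI, noisy: ", mean_noisy)
```

Which of the two means comes out larger depends on the bias and noise
models chosen, so the sketch only sets up the comparison and does not
presume the answer.
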
SL/Webster minimizes the SL index, right? It's known that Webster has _no_
bias if the distribution condition that I described holds: the uniform
distribution condition.
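
To make the first claim concrete, here is a small sketch that allocates
seats by Webster/Sainte-Lague (divisors 1, 3, 5, ...) and compares its SL
index with that of every other possible allocation. The vote totals and
the function names are illustrative assumptions, and the SL index is
computed on shares as sum of (s_i - v_i)^2 / v_i.

```python
def sainte_lague(votes, seats):
    """Highest-averages allocation with divisors 1, 3, 5, ..."""
    alloc = [0] * len(votes)
    for _ in range(seats):
        # The next seat goes to the party with the largest v / (2s + 1).
        winner = max(range(len(votes)),
                     key=lambda i: votes[i] / (2 * alloc[i] + 1))
        alloc[winner] += 1
    return alloc

def sl_index(votes, alloc):
    """Sainte-Lague index on vote shares and seat shares (fractions)."""
    total_votes = sum(votes)
    total_seats = sum(alloc)
    return sum((a / total_seats - v / total_votes) ** 2 / (v / total_votes)
               for v, a in zip(votes, alloc))

def compositions(total, parts):
    """Yield every way to split `total` seats among `parts` parties."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

if __name__ == "__main__":
    votes = [53000, 24000, 23000]   # hypothetical vote totals
    seats = 7
    webster = sainte_lague(votes, seats)
    best = min(compositions(seats, len(votes)),
               key=lambda a: sl_index(votes, a))
    print("Webster/SL allocation:", webster, sl_index(votes, webster))
    print("Lowest-SLI allocation:", list(best), sl_index(votes, best))
```

The exhaustive search is only practical for small examples; it is there
to check the claim, not as a way of apportioning.
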

I'm not a statistician either, so this is just a tentative suggestion:
What about finding, by trial and error, the allocation that minimizes some
calculated correlation measure? Say, the Pearson correlation: find by trial
and error the allocation whose Pearson correlation between q and s/q is
closest to zero.

For the goal of getting the best allocation each time (as opposed to
overall time-averaged equality of s/q), might that correlation optimization
be best?
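
A trial-and-error search of that kind might look like the sketch below.
It assumes that q is each party's exact quota (seats times vote share),
that "lowest correlation" means smallest absolute Pearson correlation
between q and s/q, and that every possible allocation is tried; the
function names are made up for the illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def size_bias(votes, alloc):
    """Absolute Pearson correlation between q and s/q for one allocation."""
    seats = sum(alloc)
    total = sum(votes)
    quotas = [seats * v / total for v in votes]
    ratios = [s / q for s, q in zip(alloc, quotas)]
    return abs(pearson(quotas, ratios))

def compositions(total, parts):
    """Yield every way to split `total` seats among `parts` parties."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def least_correlated(votes, seats):
    """Try every allocation and keep the one with the smallest size_bias."""
    return min(compositions(seats, len(votes)),
               key=lambda alloc: size_bias(votes, alloc))

if __name__ == "__main__":
    votes = [53000, 24000, 23000]   # hypothetical vote totals
    print(least_correlated(votes, 7))
```

Because this criterion looks only at the correlation, and not at how far
each s is from its q, an unrestricted search can favor allocations that
are far from proportional. Restricting the candidates, for example to
allocations within one seat of quota, would be one way to guard against
that, though that restriction is my assumption and not part of the
suggestion above.
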

Webster and Weighted-Webster both depend on a common assumption: that the
state or district populations, or the party vote totals, are unknown and
unpredictable, and that they have some known or estimated probability
distribution.

The methods are unbiased from our point of view if we don't know (or at
least pretend that we don't know) more than that.

With Webster/SL, the assumption is that the probability distribution is
uniform. Weighted Webster attempts to estimate that distribution. There are
various ways of estimating it.

I've recently suggested interpolating through a few population or
vote-total values in and near each particular N to N+1 interval. That gives
a non-smooth collection of interpolating curves. Webster or
Weighted-Webster will be really unbiased, by empirical tests, only if the
populations or party vote totals really vary enough to be unpredictable,
and if the distribution really is as estimated or assumed.

Maybe Weighted-Webster would be better with an estimated distribution
gotten by least-squares based on a greater number of states, districts or
parties, over a larger range.

Warren assumed one exponential function for the whole set of states,
districts or parties, finding it based on the total numbers of states and
seats.

But what if the states' or districts' populations are unchanging, or
changing together in the same proportions? Webster/SL and
Weighted-Webster's guesses about what allocation is unbiased might not be
very good.

And, even at best, even if they vary enough, and the distribution
assumption is accurate, there's no guarantee that _each_ allocation is
the least biased one that could be made, according to empirical
bias-tests. The guarantee of unbiasedness would hold only over time.

But if we're talking about unbiasedness over time, then why not just
equalize the time-average of the s/q values?

So, Webster/SL is the best of the simple methods being considered, and I'm
certain that it's the one that should be recommended and used.

But, if we're interested in optimizing _each_ allocation, for _each_
apportionment or election, then might it not be better to do it by trial
and error, to find the allocation that looks the least biased by some
empirical test, such as Pearson correlation between q and s/q?

I'm just talking about _ideally_. In practice, I suggest
Webster/Sainte-Lague.

Mike Ossipoff