[EM] The Sainte-Lague index and proportionality

Kristofer Munsterhjelm km_elmet at lavabit.com
Sat Jul 14 01:00:03 PDT 2012


On 07/11/2012 08:16 PM, Michael Ossipoff wrote:
>
>
> On Tue, Jul 10, 2012 at 6:17 AM, Kristofer Munsterhjelm
> <km_elmet at lavabit.com <mailto:km_elmet at lavabit.com>> wrote:
>
>     On 07/09/2012 06:33 AM, Michael Ossipoff wrote:
>
>         What about finding, by trial and error, the
>         allocation that minimizes the calculated correlation measure.
>         Say, the
>         Pearson correlation, for example. Find by trial and error the
>         allocation
>         with the lowest Pearson correlation between q and s/q.
>
>
>         For the goal of getting the best allocation each time (as opposed to
>         overall time-averaged equality of s/q), might that correlation
>         optimization be best?
>
>
>     Sure, you could empirically optimize the method. If you want
>     population-pair monotonicity, then your task becomes much easier:
>     only divisor methods can have it
>
> If unbias in each allocation is all-important, then can anything else be
> as good as trial-and-error minimization of the measured correlation
> between q and s/q, for each allocation?

You answered this below. If you know the distribution, then you can 
directly find out what kind of rounding rule would be best and you'd use 
that one.

That is, unless you mean something different, namely: "if the only 
thing you care about is correlation, then wouldn't limiting yourself to 
divisor methods be a bad thing?" Well, perhaps a non-divisor method 
would be closer to unbias, but divisor methods are close enough and you 
don't get weird paradoxes. For practical purposes, as you've said, even 
Sainte-Laguë/Webster is good enough.
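
For concreteness, here's a rough Python sketch of Webster/Sainte-Laguë 
done as a divisor method, with made-up populations and no tie-breaking; 
it's just to show the mechanics, not a reference implementation:

# Webster/Sainte-Lague as a divisor method: pick a divisor so that
# floor(pop/divisor + 0.5), summed over the states, equals the house size.
def webster(populations, seats):
    lo, hi = 1e-9, 2.0 * sum(populations)   # bracket the divisor generously
    alloc = []
    for _ in range(200):                     # bisect on the divisor
        d = (lo + hi) / 2.0
        alloc = [int(pop / d + 0.5) for pop in populations]  # floor(x + 0.5)
        if sum(alloc) == seats:
            return alloc
        if sum(alloc) > seats:
            lo = d    # too many seats handed out: the divisor is too small
        else:
            hi = d
    return alloc      # exact ties would need a tie-breaking rule

# made-up populations, ten seats:
print(webster([5030, 2570, 1390, 610, 400], 10))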

>     so you just have to find the right parameter for the generalized
>     divisor method:
>
>     f(x,g) = floor(x + g(x))
>
>     where g(x) is within [0...1] for all x, and one then finds a divisor
>     so that x_1 = voter share for state 1 / divisor, so that sum over
>     all states is equal to the number of seats.
>
> [unquote]
> Yes, that's a divisor method, and its unbias depends on whether or not
> the probability density distribution approximation on which it's based
> is accurate. For Webster, it's known to be a simplification. For
> Weighted-Webster (WW), it's known to be only a guess.

It does seem to be pretty unbiased according to Warren's computer 
simulations, and slightly better than ordinary Webster. See 
http://www.RangeVoting.org/BishopSim.html . But of course, you can't 
entirely exclude the presence of bugs.
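
For what it's worth, one can run a crude version of that kind of check 
oneself, using the correlation between q and s/q as the bias measure 
we've been discussing. The parameters below are arbitrary (50 states 
with exponentially distributed populations, 435 seats, plain Webster 
rounding via the webster() sketch above):

# Crude bias check: simulate many apportionments and correlate each
# state's quota q with its seats-per-quota s/q. A positive correlation
# means large states are favored, a negative one means small states are.
import random

def pearson(xs, ys):
    n = float(len(xs))
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5

random.seed(1)
qs, ratios = [], []
for trial in range(1000):
    pops = [random.expovariate(1.0) for _ in range(50)]
    alloc = webster(pops, 435)      # sketch from earlier in this message
    d = sum(pops) / 435.0           # the "natural" divisor
    for pop, s in zip(pops, alloc):
        q = pop / d                 # the state's quota
        if s > 0:                   # skip zero-seat states
            qs.append(q)
            ratios.append(s / q)
print(pearson(qs, ratios))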

> You said:
>>   We may further restrict ourselves to a "somewhat" generalized divisor
>> method:
>>
>> f(x, p) = floor(x + p).
>>
>> For Webster, p = 0.5. Warren said p = 0.495 or so would optimize in the
>> US (and it might, I haven't read his reasoning in detail).
>> [endquote]

> Yes, Warren said that if the probability distribution is exponential,
> then that results in a constant p in your formula. He used one
> exponential function for the whole range of states and their
> populations, determined based on the total numbers of states and seats.
> But that's a detail that isn't important unless you've actually decided
> to use WW, and to use Warren's one overall exponential distribution.
> After I'd proposed WW, Warren suggested the one exponential probability
> distribution for the whole range of state populations, and that was his
> version of WW.

The aforementioned page also shows Warren's modified Webster to be 
better than plain Webster and to have very low bias, both for states 
with exponentially distributed populations and for states with 
uniformly distributed populations.
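
In code, the "somewhat generalized" rounding f(x, p) = floor(x + p) only 
changes one line of the earlier sketch; p = 0.5 is Webster, and 
something like Warren's 0.495 shades it slightly toward the large states:

import math

def divisor_apportion(populations, seats, p=0.5):
    # f(x, p) = floor(x + p): p = 0.5 is Webster, p = 0 is
    # Jefferson/D'Hondt, p = 1 is Adams.
    lo, hi = 1e-9, 2.0 * sum(populations)
    alloc = []
    for _ in range(200):              # bisect on the divisor, as before
        d = (lo + hi) / 2.0
        alloc = [int(math.floor(pop / d + p)) for pop in populations]
        if sum(alloc) == seats:
            return alloc
        if sum(alloc) > seats:
            lo = d
        else:
            hi = d
    return alloc

# same made-up populations as above, with Warren's suggested p:
print(divisor_apportion([5030, 2570, 1390, 610, 400], 10, p=0.495))

(With these toy numbers, p = 0.495 happens to give the same allocation 
as p = 0.5; the difference only shows up statistically over many 
apportionments.)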

> You said:
>> Also, I think that the bias is monotone with respect to p. At one end
>> you have
>>
>> f(x) = floor(x + 0) = floor(x)
>>
>> which is Jefferson's method (D'Hondt) and greatly favors large states.
>> At the other, you have
>>
>> f(x) = floor(x + 1) = ceil(x)
>>
>> which is Adams's method and greatly favors small states.
>>
>> If f(x, p) is monotone with respect to bias as p is varied, then you
>> could use any number of root-finding algorithms to find the p that sets
>> bias to zero, assuming your bias measure is continuous. Even if it's not
>> continuous, you could find p so that decreasing p just a little leads
>> your bias measure to report large-state favoritism and increasing p just
>> a little leads your bias measure to report small-state favoritism.
> [endquote]

> You're referring to trial and error algorithms. You mean find, by trial
> and error, the p that will always give the lowest correlation between q
> and s/q?  For there to be such a constant, p, you have to already know
> that the probability distribution is exponential (because, it seems to
> me, that was the assumption that Warren said results in a constant p for
> an unbiased formula).

Then why is the modified Webster so good on uniformly distributed 
populations?

> If you know that it's exponential, you could find
> out p without trial and error, by analytically finding the rounding
> point for which the expected s/q is the same in each interval between
> two consecutive integers, given some assumed probability distribution
> (exponential, because that's what Warren said results in a constant p).
> As I was saying before, it's solvable if the distribution-approximating
> function is analytically antidifferentiable, as is the case for an
> exponential or polynomial approximating function.
> You might say that it could turn out that solving for R, the rounding
> point, requires a trial-and-error equation-solving algorithm. I don't
> think it would, because R only occurs at one place in the expression. We
> had analytical solutions.
> But, as I was saying, you only know that WW is unbiased to the extent
> that you know that your distribution-approximating function is accurate.
> I felt that interpolation with a few cumulative-state-number(population)
> data points, or least-squares with more data points, would be better.
> Warren preferred finding one exponential function to cover the entire
> range of state populations, based on the total numbers of states and seats.
> I guess that trying all 3 ways would show which can give the lowest
> correlations between q and s/q.

It seems that even if you don't know anything about distributions, you 
could gather past data and optimize p just by educated trial and error.
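
Sketched in the same Python terms, and assuming (as I speculated above) 
that the bias measure is monotone in p, that trial and error could just 
be a bisection on p, reusing pearson() and divisor_apportion() from the 
earlier sketches. Here pop_sets stands for whatever past population 
data one has gathered:

def average_bias(pop_sets, seats, p):
    # correlation between q and s/q, pooled over a batch of population
    # vectors apportioned with rounding point p
    qs, ratios = [], []
    for pops in pop_sets:
        alloc = divisor_apportion(pops, seats, p)
        d = sum(pops) / float(seats)
        for pop, s in zip(pops, alloc):
            q = pop / d
            if s > 0:
                qs.append(q)
                ratios.append(s / q)
    return pearson(qs, ratios)

def find_p(pop_sets, seats, tol=1e-4):
    lo, hi = 0.0, 1.0          # p = 0 is Jefferson, p = 1 is Adams
    while hi - lo > tol:
        p = (lo + hi) / 2.0
        if average_bias(pop_sets, seats, p) > 0:
            lo = p             # still favoring the large states: raise p
        else:
            hi = p
    return (lo + hi) / 2.0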

Actually, I think one could go further. Consider a method that tries 
every possible allocation to find the one with least bias. Then you know 
what your assembly "should" look like. Now you can use a divisor method 
and some inference or black-box search algorithms to find out what g(x) 
"should" look like for various x. Repeat for different assemblies to get 
a better idea of the shape of g(x), then fit a function to it and there 
you go.
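
As a toy illustration of that first step (only feasible for very small 
assemblies, since the number of possible allocations blows up quickly), 
one could brute-force the least-biased allocation, again scoring by the 
correlation between q and s/q and reusing pearson() from above. 
Requiring at least one seat per state is just a simplification here:

from itertools import product

def least_biased_allocation(populations, seats):
    # try every way of handing out the seats (each state gets at least
    # one) and keep the allocation with the smallest |correlation(q, s/q)|
    d = sum(populations) / float(seats)
    quotas = [pop / d for pop in populations]
    best, best_score = None, float("inf")
    for alloc in product(range(1, seats + 1), repeat=len(populations)):
        if sum(alloc) != seats:
            continue
        score = abs(pearson(quotas, [s / q for s, q in zip(alloc, quotas)]))
        if score < best_score:
            best, best_score = alloc, score
    return best

# four made-up states, ten seats:
print(least_biased_allocation([5030, 2570, 1390, 610], 10))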

But in practice, I agree with you. Sainte-Laguë is good enough. If you 
really want better, try all the different suggestions on Warren's page 
and one of them most likely will be good enough. If none are, *then* one 
may consider more exotic approaches.



