[EM] The Sainte-Lague index and proportionality

Michael Ossipoff email9648742 at gmail.com
Sun Jul 15 02:47:20 PDT 2012


>> If unbias in each allocation is all-important, then can anything else be
>> as good as trial-and-error minimization of the measured correlation
>> between q and s/q, for each allocation?
>
>
> You answered this below. If you know the distribution, then you can directly
> find out what kind of rounding rule would be best and you'd use that one.

Yes, but that's a big "if". If you, by trial and error, in each
allocation, minimize the measured correlation between q and s/q, then
you're achieving unbias, and you needn't know or assume anything about
that distribution.

Besides, Pearson correlation is well-known, and WW is new. And
minimizing correlation is obvious and easily explained. Of course
you're then losing the minimization of the deviation of the states'
s/q from its ideal equal value.

On the other hand, if one exponential function, over the entire
range of states, is a good approximation, then we have a constant p.
And a p that is just very slightly less than .5 wouldn't be so hard
to get acceptance for, if it's explained that it gets rid of Webster's
tiny bias of about 1/3 of one percent, to better attain true unbias.
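
For concreteness, here's a minimal sketch (Python, with invented
populations) of the kind of Webster-style divisor method being
discussed, with the rounding point in each N to N+1 interval taken as
N + p. With p = .5 it's ordinary Webster; with a p very slightly below
.5, such as Warren's proposed .495, it's the adjusted version described
above. The bisection on the divisor is just one convenient way to make
the seats add up.

# Sketch of a Webster-style divisor method with rounding point N + p.
# Populations are invented, for illustration only.
def allocate(populations, house_size, p=0.5):
    def seats_for(divisor):
        seats = []
        for pop in populations:
            q = pop / divisor
            n = int(q)                        # whole-number part N
            seats.append(n + 1 if q - n >= p else n)
        return seats

    # Bisect on the common divisor until the seats sum to house_size.
    lo, hi = min(populations) / (2 * house_size), sum(populations)
    for _ in range(200):
        mid = (lo + hi) / 2
        result = seats_for(mid)
        total = sum(result)
        if total == house_size:
            break
        if total > house_size:
            lo = mid      # too many seats: the divisor needs to be larger
        else:
            hi = mid      # too few seats: the divisor needs to be smaller
    return result

pops = [9_500_000, 4_200_000, 2_700_000, 1_100_000, 750_000, 480_000]
print(allocate(pops, 40, p=0.5))      # ordinary Webster
print(allocate(pops, 40, p=0.495))    # rounding point slightly below N + .5

With only a handful of states the two p values will often give the same
allocation; the difference only shows up statistically, over many
allocations.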

So either approach would be proposable, if that one overall
exponential is a good approximation. But Warren himself admitted that,
at the low-population end, it isn't accurate, because the states, at
some point, stop getting smaller. But Warren said that his single
exponential function worked pretty well in his tests.


> That is, unless you mean something different, that: "if the only thing you
> care about is correlation, then wouldn't limiting yourself to divisor
> methods be a bad thing?". Well, perhaps a non-divisor method would be closer
> to unbias

Direct trial & error minimization of correlation between q and s/q
would be unbiased without depending on an accurate approximation of
the distribution function.
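
Here's a sketch of what that trial and error could look like (Python
again; the allocate() routine is the same Webster-style divisor sketch
as above, repeated compactly so this runs on its own, and the
populations are invented). q is taken here to be each state's exact
quota, its population divided by the ideal district size. The sketch
scans p over a grid and keeps the p whose allocation has the Pearson
correlation between q and s/q closest to zero.

import numpy as np

def allocate(populations, house_size, p):
    # Webster-style divisor method with rounding point N + p.
    def seats_for(d):
        return [int(x / d) + (1 if (x / d) - int(x / d) >= p else 0)
                for x in populations]
    lo, hi = min(populations) / (2 * house_size), sum(populations)
    for _ in range(200):
        mid = (lo + hi) / 2
        total = sum(seats_for(mid))
        if total == house_size:
            break
        lo, hi = (mid, hi) if total > house_size else (lo, mid)
    return seats_for(mid)

def corr_q_sq(populations, house_size, p):
    # Pearson correlation between the states' quotas q and their s/q.
    seats = np.array(allocate(populations, house_size, p))
    q = np.array(populations) * house_size / sum(populations)  # exact quotas
    return np.corrcoef(q, seats / q)[0, 1]

pops = [9_500_000, 4_200_000, 2_700_000, 1_900_000, 1_100_000, 750_000,
        480_000]
best = min(np.linspace(0.40, 0.60, 201),
           key=lambda p: abs(corr_q_sq(pops, 40, p)))
print("p with the smallest |correlation(q, s/q)| for this allocation:", best)

Over a range of p the allocation (and so the correlation) doesn't
change, so the "minimizing" p is really a whole interval; the sketch
just reports the first grid point in it.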


>, but divisor methods are close enough and you don't get weird
> paradoxes.

That's true. Webster's 1/3 of one percent bias is, itself, negligible,
for all practical purposes. WW, even with Warren's rough
single-exponential approximation, will be considerably more unbiased
still. Webster or WW, then, might be the most easily proposed and
acceptable of the low-bias apportionment proposals.


>For practical purposes, as you've said, even Sainte-Laguë/Webster
> is good enough.

Certainly.

I was just discussing the ideal. Besides, it's good to look at it,
just in case the small states reject Webster because it has _any_
large-bias. (Even though Hill has about 17 times as much bias as
Webster, based on Warren's proposed WW p value of .495.) WW would have
even less bias than Webster, and its direction would either be unknown,
or at least its magnitude would be minimized to the best of our ability
(for a divisor method).



>
> It [Warren's WW] does seem to be pretty unbiased by Warren's computer simulations,
> slightly better than ordinary Webster. See
> http://www.RangeVoting.org/BishopSim.html . But of course, you can't
> entirely exclude the presence of bugs.

I'm certain that Warren's WW is considerably more unbiased than
Webster. But it might be better to, instead, do an actual
least-squares approximation, based on all of the states, or based on
different sets of states in different population ranges; or do an
interpolation of a few data points in and near each N to N+1 integer
interval.

By the way, it's to be expected that p is a constant if the
approximation is one exponential function, over all of the states.
With that exponential, the probability density varies by the same
factor over equal increments of population. It varies by the same
factor in each N to N+1 integer interval. So one would expect p to be
the same in each interval.
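
A quick numeric illustration of that point (just a sketch; the rate
constant is arbitrary):

import math

# With an exponential density f(q) proportional to exp(-lam * q), the
# density falls by the same factor, exp(-lam), across every unit
# interval. So each N to N+1 interval sees the same shape of density,
# only rescaled, which is why a single constant p is to be expected.
lam = 0.3                       # arbitrary rate, for illustration only
f = lambda q: math.exp(-lam * q)

for N in (1, 5, 20, 50):
    print(N, f(N + 1) / f(N))   # the same factor, exp(-lam), for every N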

> The aforementioned page also shows Warren's modified Webster to be better
> than Webster, and have very low bias, both for states with exponentially
> distributed populations and for states with uniformly distributed
> populations.

Ok, now we're getting into a complication that I'd hoped to avoid. In
December 2006, and maybe in the first weeks of 2007, I'd felt that the
deviation of a state's s/q from 1 should be divided by the state's
population: unfairness per person. That was mistaken; it's a
fallacious way of judging unfairness. If a state's s/q is a certain
amount high or low, that is already a measure of how far _each_
person's representation is off in that state, so dividing by
population was fallacious.

But, when I believed that division was the way to go, I proposed a
method that was unbiased by that measure. I called it "Bias-Free"
(BF).  I said, because it seemed so to me at the time, that, with a
uniform distribution, BF was the unbiased method, and Webster was not.
Then, later I realized the fallacy, and I posted that Webster is the
unbiased method when the distribution is uniform, and that BF was
based on a fallacy.

But, before that, Warren and I were working on Weighted BF versions.
Our versions, then as now, differed in what distribution
approximations we used.

When I posted that the division by population didn't make sense, and
involved a fairness fallacy, Warren didn't agree, and said (it's at
his website) that he still preferred that division by population.

So, with that population-divided approach, BF would be the unbiased
method with a uniform distribution. BF is what Weighted BF would be,
if the distribution were uniform.

That's why Warren said what you quoted him (above) as saying.

When the distribution is assumed uniform, WW becomes ordinary Webster.


>
>> You're referring to trial and error algorithms. You mean find, by trial
>> and error, the p that will always give the lowest correlation between q
>> and s/q?  For there to be such a constant, p, you have to already know
>> that the probability distribution is exponential (because, it seems to
>> me, that was the assumption that Warren said results in a constant p for
>> an unbiased formula).
>
>
> Then why is the modified Webster so good on uniformly distributed
> populations?

Warren's modified Webster, with the division by population, was the
most unbiased method by Warren's measure, because Warren's measure was
based (as mine had been) on the division by population. In other
words, Warren found a method that would be unbiased by his measure.
With a uniform distribution, his method amounted to my BF. With
Warren's fairness measure (previously mine too), BF is the unbiased
method for a uniform distribution. Now I realize that ordinary Webster
is the unbiased method for a uniform distribution.

In fairness to Warren, he wrote his website in 2007, right around the
time of our discussion.

As I said, Weighted-Webster (my version, which seeks merely to make
the s/q equal) becomes ordinary Webster if the distribution is
uniform.

>
> It seems that even if you don't know anything about distributions, you could
> gather past data and optimize p just by educated trial and error.

Yes, but how much of the past data would be useful? We've only had 50
states for about 50 or so years. That's 5 censuses. Besides, why
should the probability distribution be the same now, and in the
future, as it was during the past 50 years?

Don't the actual current state populations give us the best way to
estimate the distribution function? By least-squares curve-fitting, or
by interpolation of a few data points in and around each N to N+1
integer interval?
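
As one rough sketch of the curve-fitting option (invented populations;
regressing the logarithm of the "number of states at least this large"
count on population is just one simple way to get a single-exponential
approximation, not necessarily the best one):

import numpy as np

# Least-squares fit of an exponential to the current populations, by
# regressing log(number of states with population >= x) on x.
# Populations (in millions) are invented; real use would take the
# census figures.
pops = np.array([0.48, 0.62, 0.75, 1.1, 1.9, 2.7, 4.2, 6.0, 9.5])

x = np.sort(pops)
survival = np.arange(len(x), 0, -1)     # states with population >= x[i]
slope, intercept = np.polyfit(x, np.log(survival), 1)
lam = -slope                            # fitted exponential rate
print("fitted rate lam (per million people):", lam)
# The fitted density is then proportional to exp(-lam * population):
# the kind of single-exponential approximation discussed above.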

Or maybe estimate it as Warren did, by finding the exponential that is
consistent with the total number of states and the total number of
seats. Maybe the total population figured in his determination of the
exponential, too.

Anyway, that sounds rougher than the least-squares and the interpolation.

Those estimates of the best distribution approximation seem more
reliable than an assumption that the distribution is as it was over
the past 50 years.

> Actually, I think one could go further. Consider a method that tries every
> possible allocation to find the one with least bias.

Least correlation between q and s/q?  For each allocation, or over
time? WW makes no claim to minimize that correlation in each
allocation.
>
> But in practice, I agree with you. Sainte-Laguë is good enough. If you
> really want better, try all the different suggestions on Warren's page and
> one of them most likely will be good enough. If none are, *then* one may
> consider more exotic approaches.

I don't agree with the fairness-measure that Warren was using. I was,
at first, using it too.

But then I realized that unbias merely consists of making the expected
s/q equal, among the various N to N+1 integer intervals. Warren was
assuming something else.

If you want something more unbiased than Sainte-Laguë, then use one of
the WW versions, with one of the distribution approximations that we've
been discussing.

Or minimize, by trial and error, the Pearson correlation between q and s/q.

Or make the average, over time, of s/q equal. (As Warren pointed out,
there are various ways of doing that).

Mike Ossipoff


