[EM] Various ways of judging bias and approaches to eliminating it

Tue Jul 17 16:48:43 PDT 2012

This posting accidentally got sent when I was partway through writing
it. I have no idea which keys on the keyboard somehow sent the
message.

I'm sending it again. This time I'll make sure that it only gets sent
when it's completed. I'll do that by not filling in the "To:" field
until I've finished the posting.

So, starting from the beginning:

When I say "interval" without qualifying it, I'm referring to the
interval between two integers, for the value of q, the quotient of
dividing states' populations by some common divisor.

Divisor methods, expected s/q:

Divisor methods can eliminate bias by making equal, for all of the
intervals, the expected s/q for a state somewhere in a particular
interval.

That can be done for different assumptions about the
probability-density distribution for states, over the range of q.

As I've said before, Bias-Free (BF) is unbiased if that distribution
is assumed to be uniform.
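
As a rough illustration of the uniform case, here is a minimal Python sketch. It assumes that each interval runs from an integer a to a+1, that a state gets a seats when its q is below a rounding point r and a+1 seats otherwise, and that the common value of expected s/q being aimed for is 1. The function names and the target of 1 are assumptions made for this example; the closed form in the second function is just what those assumptions imply, not a formula quoted from the posting.

```python
import math

def expected_s_over_q(a, r):
    """Mean of s/q for q uniform on [a, a+1], given rounding point r.

    Below r the state gets a seats, at or above r it gets a+1 seats.
    The integral of a/q from a to r is a*ln(r/a); the integral of
    (a+1)/q from r to a+1 is (a+1)*ln((a+1)/r). The interval has
    width 1, so the sum is already the mean.
    """
    b = a + 1
    low = a * math.log(r / a) if a > 0 else 0.0  # a = 0 contributes nothing
    high = b * math.log(b / r)
    return low + high

def uniform_unbiased_rounding_point(a):
    """Rounding point r in [a, a+1] that makes expected_s_over_q equal 1.

    Solving a*ln(r/a) + b*ln(b/r) = 1 with b = a+1 gives
    r = b**b / (a**a * e). Python evaluates 0**0 as 1, so a = 0 works.
    """
    b = a + 1
    return b**b / (a**a * math.e)

for a in range(0, 4):
    r = uniform_unbiased_rounding_point(a)
    print(a, r, expected_s_over_q(a, r))
```

Under these assumptions the rounding point in the 1 to 2 interval comes out at 4/e, slightly below 1.5.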

Weighted Bias-Free (WBF) is unbiased if that distribution is
accurately approximated by an approximating function that is used with
WBF.  The definition of WBF doesn't specify any particular
approximating function.
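
To make the weighted idea concrete, here is a sketch that finds a rounding point numerically for an arbitrary density. Everything specific in it is an assumption for illustration: the function names, the target value of 1 for expected s/q, the midpoint-rule integration, the bisection search, and especially the exponential density in the usage lines, which is a made-up stand-in and not an approximating function proposed in this posting.

```python
import math

def _midpoint_integral(f, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def weighted_expected_s_over_q(a, r, density):
    """Density-weighted mean of s/q on [a, a+1] with rounding point r."""
    b = a + 1
    low = _midpoint_integral(lambda q: density(q) * a / q, a, r)
    high = _midpoint_integral(lambda q: density(q) * b / q, r, b)
    total = _midpoint_integral(density, a, b)
    return (low + high) / total

def weighted_rounding_point(a, density, target=1.0, iterations=60):
    """Bisect for the r in [a, a+1] whose weighted mean s/q hits target.

    Raising r gives fewer states the extra seat, so the weighted mean
    falls as r rises; that monotonicity is what bisection relies on.
    """
    lo, hi = float(a), float(a + 1)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if weighted_expected_s_over_q(a, mid, density) > target:
            lo = mid  # mean too high: move the rounding point up
        else:
            hi = mid  # mean too low: move the rounding point down
    return (lo + hi) / 2

# Uniform density: should agree with the uniform-case result for a = 1.
print(weighted_rounding_point(1, lambda q: 1.0))
# Hypothetical decreasing density, chosen only for illustration.
print(weighted_rounding_point(1, lambda q: math.exp(-0.5 * q)))
```

With a density that decreases over the interval, the rounding point found this way falls below the uniform one, because more weight sits on the part of the interval where s/q is below 1.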

As a first impression, it might seem as if WBF is objectively more
unbiased than BF, because the distribution is known to be non-uniform.

But that depends on how one judges bias. One can judge bias by the
expected s/q of a state somewhere in an interval, equally likely to be
anywhere in that interval, without actually believing or assuming that
a state is really equally likely to be anywhere in that interval. I
suggest that that is a legitimate way of judging bias.

If you reside in a state whose population is such as to usually put it
in a certain interval, and you want fairness to states in that
interval, maybe you're interested in the overall average for s/q in
that interval, averaged over all q values in the interval. So it isn't
incorrect to calculate bias as if the distribution were uniform in the
interval. After all, it's not as if your state is really randomly
varying in population throughout the interval.

So BF is completely unbiased in a meaningful sense.

If you really felt that your state were randomly varying in
population, according to some estimated probability density function,
throughout the interval, then WBF would be more justified, for
bias-elimination by a finer measure.

But BF's meaningful justification for genuine unbiasedness is good
news, because BF is a lot simpler than WBF.

Correlation between q and s/q in a particular allocation:

I've already proposed a method that, by trial-and-error, minimizes the
Pearson correlation between the states' q and s/q.

That's bias-by-state. A problem with that is that that bias reverses
its direction with each interval. For instance, consider
Jefferson/d'Hondt:

Jefferson/d'Hondt is well-known to be strongly large-biased. But
that's only true _overall_, globally. Within each interval,
Jefferson/d'Hondt is strongly small-biased.

Consider two states in the 1 to 2 interval. One of them has q of
1.001. The other has q of 1.999.

Which one has higher s/q?  They both have one seat. The smaller state
has q close to 1. The larger one has q close to 2.

The smaller state has s/q about twice that of the large state.

So, overall, the s/q varies as a sort of saw-tooth function.
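
The two-state example and the saw-tooth can be checked with a few lines of Python. This is only a sketch: it takes Jefferson/d'Hondt to mean rounding every quotient down, and the function names and the extra sample q values are chosen for illustration.

```python
import math

def jefferson_seats(q):
    """Jefferson/d'Hondt: round the quotient down to get the seats."""
    return math.floor(q)

def s_over_q(q):
    """Seats per quotient for a state with quotient q."""
    return jefferson_seats(q) / q

small = s_over_q(1.001)  # one seat, q just above 1
large = s_over_q(1.999)  # one seat, q just below 2
print("s/q at 1.001:", small)
print("s/q at 1.999:", large)
print("ratio:", small / large)

# s/q falls across each interval, then jumps back up at the next integer.
for q in (1.001, 1.5, 1.999, 2.001, 2.5, 2.999, 3.001):
    print(q, s_over_q(q))
```

The ratio printed is 1.999/1.001, a little under 2, and s/q rises again as soon as q passes 2.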

The bias within each interval, I call "microbias". The overall global
bias, comparing only whole intervals, disregarding microbias, I call
"macrobias".

I suggest that macrobias is the important bias that we're interested in.

So maybe, for minimizing the correlation between q and s/q, it would
be better to do so for intervals, instead of for states.

That would mean looking at, for each interval, the combined q of all
the states in that interval, and the combined number of seats won by
those states, and then minimizing the correlation between that q and
that s/q over all the intervals.
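
The two correlation measures can be sketched side by side in Python. This only computes the correlations for one given allocation; it doesn't do the trial-and-error minimizing. The function names are made up for this example, the sample quotients are invented, and the seats in the usage lines come from rounding down (Jefferson/d'Hondt) purely to have something to measure.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

def correlation_by_state(qs, seats):
    """Correlation between each state's q and its s/q."""
    return pearson(qs, [s / q for q, s in zip(qs, seats)])

def correlation_by_interval(qs, seats):
    """Correlation between each interval's combined q and combined s/q.

    A state belongs to the interval [k, k+1) where k = floor(q).
    """
    total_q = defaultdict(float)
    total_s = defaultdict(int)
    for q, s in zip(qs, seats):
        k = math.floor(q)
        total_q[k] += q
        total_s[k] += s
    keys = sorted(total_q)
    combined_q = [total_q[k] for k in keys]
    combined_ratio = [total_s[k] / total_q[k] for k in keys]
    return pearson(combined_q, combined_ratio)

# Invented quotients; seats by rounding down, for illustration only.
qs = [1.2, 1.8, 2.3, 2.9, 3.5, 5.1, 8.7]
seats = [math.floor(q) for q in qs]
print("by state:   ", correlation_by_state(qs, seats))
print("by interval:", correlation_by_interval(qs, seats))
```

Both functions need at least two data points that differ, so the by-interval version needs states in at least two intervals.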

Either approach probably has legitimate appeal for meaningfulness. I
think I prefer the by-interval approach, because I don't think that
microbias is part of what we're talking about when we speak of bias.

But that's just one impression that someone could have.

If you reside in a state at the top end of an interval, and you
compare your state's s/q to that of a state at the bottom end of that
interval, you might resent that state's higher s/q, if bias was only
minimized for intervals instead of for states.  So maybe there's also
a case for minimizing bias by state, which doesn't ignore microbias.
But maybe you might feel that that intra-interval s/q difference is
justified if you're more interested in equalizing the s/q for small
states vs big states.

Mike Ossipoff