[EM] Proportionality: the small bias effect seems to be real for Harmonic

Mon Sep 23 05:07:15 PDT 2024

I implemented some different measures of proportionality for my 
simulator, and they all favor small values of delta for the cardinal 
methods.

Since the result seemed so persistent, I decided to take a more 
mathematical approach with a 1D standard normal to see if I could 
reproduce it there: infinitely many voters and candidates along the 
Gaussian, with the whole thing treated as an integration problem.

That Harmonic voting only cares about the ratings of the winners makes 
it easier, as I don't have to sum up infinitely many non-winning 
candidate terms per voter.

My simple proportionality idea for this model is: suppose the winners 
are x_1 and x_2, identified by their x coordinates on the standard 
normal, and suppose WLOG that x_1 is to the left of x_2. Then we want 
x_1's right wing (the voters to its right who are still closer to x_1 
than to x_2) to contain just as many voters (area under the curve) as 
x_1's left wing, and ditto for x_2's wings.

This means that x_1 should be at the 25th percentile and x_2 at the 
75th. Then x_1 covers/represents everybody from the minimum to the 
median, and x_2 covers everybody from the median to the maximum, with 
equal area on both sides.

We can then integrate over all the voters for some choices of x_1, x_2, 
and delta, and get Harmonic's quality score for those choices. Since 
the normal is symmetric, we can also let x_1 = -x_2 and x_2 >= 0. We 
would then want to determine the delta where the maximum quality 
function value is attained at x_1 ~= -0.6745. For that delta, Harmonic 
would pick winners who have equally strong left and right wings.
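
A quick sanity check of the target position, using only Python's 
standard library (the variable names are just for illustration):

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Winners at the 25th and 75th percentiles.
x_1 = std_normal.inv_cdf(0.25)
x_2 = std_normal.inv_cdf(0.75)

# The boundary between the two winners' supporters is their midpoint.
midpoint = (x_1 + x_2) / 2.0

# Voter mass in x_1's left wing and right wing.
left_wing_x1 = std_normal.cdf(x_1)
right_wing_x1 = std_normal.cdf(midpoint) - std_normal.cdf(x_1)

print(f"x_1 = {x_1:.4f}, x_2 = {x_2:.4f}")
print(f"x_1 wings: {left_wing_x1:.4f} (left), {right_wing_x1:.4f} (right)")
```

Both wings of x_1 should come out with the same mass, and by symmetry 
the same holds for x_2.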

Doing the integral is pretty hairy but the general idea is that there 
are four types of voter:
	1. voters to the left of x_1
	2. voters between x_1 and x_2, but closer to x_1
	3. voters between x_1 and x_2, but closer to x_2
	4. voters to the right of x_2,

and they all rate x_1 and x_2 according to some constant (I set 20) 
minus the distance to the winner in question.

The first two voter types rate x_1 higher than x_2, and the last two 
rate x_2 higher than x_1, so we know whose rating will get divided by 
delta and whose will be divided by (1 + delta).
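
The setup above can also be checked by brute force. The following is a 
minimal sketch, not the simulator's code: the function names, the 
midpoint-rule integration and the [-10, 10] truncation are all 
assumptions made for illustration.

```python
import math

C = 20.0  # the rating constant from the text


def voter_score(v, x_1, x_2, delta):
    """Harmonic contribution of a voter at position v (two winners)."""
    ratings = sorted((C - abs(v - x_1), C - abs(v - x_2)), reverse=True)
    # The favorite's rating is divided by delta, the other's by 1 + delta.
    return ratings[0] / delta + ratings[1] / (1.0 + delta)


def normal_pdf(v):
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def quality_numeric(x_1, x_2, delta, lo=-10.0, hi=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral over all voters."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        v = lo + (i + 0.5) * h
        total += voter_score(v, x_1, x_2, delta) * normal_pdf(v)
    return total * h


# Compare two symmetric placements under Sainte-Lague (delta = 0.5).
for x in (0.43, 0.6745):
    print(f"x_1 = {-x}: quality ~= {quality_numeric(-x, x, 0.5):.5f}")
```

Sorting the two ratings sidesteps the four-way case split: it picks out 
the favorite for every voter type automatically.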

After a particularly long procedure (made possible by WolframAlpha), the 
integral is found to evaluate to:

2 * (20 - sqrt(2/pi) + x_1)/(2 + 2 * delta) + 2 * ((2 * x_1 * 
erfc(x_1/sqrt(2)) - 2 * sqrt(2/pi) * exp(-(x_1*x_1)/2) - 3 * x_1 + 
sqrt(2/pi) + 20) / (2 * delta)).
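
For experimenting with the expression, here is a direct Python 
transcription (the function name and argument layout are mine for 
illustration; it assumes x_1 <= 0 and x_2 = -x_1 as above):

```python
import math


def quality_closed_form(x_1, delta, c=20.0):
    """Closed-form quality for winners at x_1 <= 0 and x_2 = -x_1."""
    k = math.sqrt(2.0 / math.pi)
    # Integrated rating of each voter's less-preferred winner.
    second = c - k + x_1
    # Integrated rating of each voter's favorite winner.
    favorite = (2.0 * x_1 * math.erfc(x_1 / math.sqrt(2.0))
                - 2.0 * k * math.exp(-0.5 * x_1 * x_1)
                - 3.0 * x_1 + k + c)
    return 2.0 * second / (2.0 + 2.0 * delta) + 2.0 * favorite / (2.0 * delta)


print(quality_closed_form(-0.6745, 0.5))
```

One easy check: at x_1 = 0 both winners sit at the median, every voter 
is at distance |v| from both, and the mean of |v| is sqrt(2/pi), so the 
result should equal (20 - sqrt(2/pi)) * (1/delta + 1/(1 + delta)).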

Some numerical testing later, and the optimum for delta=0.5 
(Sainte-Laguë) is x_1 ~= -0.43, which WolframAlpha states as x_1 = 
-sqrt(2) * erfc^-1(2/3) = -0.43073... x_1 is at the 33.3rd percentile 
(the 1/3 quantile) in this case.

For delta = 1, the optimum is approximately -0.31864, i.e. 
-sqrt(2) * erfc^-1(3/4), which is the 37.5th percentile.

Further numerical testing suggests that the correct position, x_1 ~= 
-0.6745, is only obtained in the limit of delta->0. E.g. delta=1e-6 
gives x_1 ~= -0.67449.
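
The numerical testing can be reproduced along these lines. This is only 
a sketch: the golden-section search, the [-3, 0] bracket and the 
function names are my choices for illustration, and the search assumes 
the quality function has a single peak on that bracket.

```python
import math


def quality(x_1, delta, c=20.0):
    """The closed-form expression above, with x_2 = -x_1 and x_1 <= 0."""
    k = math.sqrt(2.0 / math.pi)
    second = c - k + x_1
    favorite = (2.0 * x_1 * math.erfc(x_1 / math.sqrt(2.0))
                - 2.0 * k * math.exp(-0.5 * x_1 * x_1)
                - 3.0 * x_1 + k + c)
    return second / (1.0 + delta) + favorite / delta


def argmax_golden(f, lo, hi, iterations=60):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


results = {}
for delta in (1.0, 0.5, 1e-6):
    results[delta] = argmax_golden(lambda x: quality(x, delta), -3.0, 0.0)
    print(f"delta = {delta:g}: x_1 ~= {results[delta]:.5f}")
```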

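Some fiddling and setting derivatives to zero appear to indicate that 
the optimum for a given delta is at sqrt(2) * erf^-1(-1/(2*delta+2)), 
and that this corresponds to the (2*delta+1)/(4*delta+4) quantile. (The 
derivative of the expression above with respect to x_1 comes out as 
1/(1 + delta) + (2 * erfc(x_1/sqrt(2)) - 3)/delta; the exponential 
terms cancel.) This gives the desired point exactly at delta=0, and 
at-large Range, with both winners at the median, as delta->infty.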
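
The quantile formula is easy to tabulate. Python's standard library has 
no erf^-1, so this sketch uses NormalDist.inv_cdf instead, which is 
equivalent here because sqrt(2) * erf^-1(2p - 1) is the p quantile of 
the standard normal. The function names are hypothetical.

```python
from statistics import NormalDist


def optimal_quantile(delta):
    """Quantile of the left winner x_1 at the optimum for a given delta."""
    return (2.0 * delta + 1.0) / (4.0 * delta + 4.0)


def optimal_x1(delta):
    """Position of the left winner at the optimum (x_2 = -x_1)."""
    return NormalDist().inv_cdf(optimal_quantile(delta))


for delta in (0.0, 0.5, 1.0, 1e6):
    q = optimal_quantile(delta)
    print(f"delta = {delta:g}: quantile = {q:.6f}, x_1 = {optimal_x1(delta):.5f}")
```

Note that the quantile formula itself is well defined at delta = 0, 
even though the quality function is not.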

So at least in this respect, the effect seems to be real. You can either 
have optimal proportionality for party list (at delta = 0.5) or for the 
1D Gaussian (at delta -> 0), but not both at the same time.

One may argue that the "wings are balanced" definition of 
proportionality is kind of sketchy. I wouldn't entirely disagree; it 
would be better to have three candidates (one at zero, one at -x, and 
one at +x), and then set the requirement so that the number of voters 
closest to each is the same. But I wouldn't want to do *those* 
integrals; two winners was hard enough!

One could also argue that this kind of proportionality idea is too 
Monrovian in that it only takes into account the voter's favorite. 
Perhaps a better notion of proportionality would take the other winners 
into account. But how?

(In the limit of delta approaching zero, Harmonic reduces to simply: 
each voter contributes to the quality function the rating of the voter's 
favorite winner. Which shows the similarity to Monroe, although Monroe 
imposes an explicit limit on the fraction of voters assigned to each 
winner.)
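
A small illustration of that limit. The per-voter score below divides 
the k-th favorite winner's rating by (k - 1 + delta); for two winners 
that is exactly the delta and (1 + delta) rule used above, and the 
extension to more winners is my assumption. Multiplying the score by 
delta does not change which winner set is best, and makes the limit 
visible.

```python
def harmonic_score(ratings_of_winners, delta):
    """One voter's contribution: the k-th favorite winner's rating is
    divided by (k - 1 + delta). Assumed form for more than two winners."""
    ordered = sorted(ratings_of_winners, reverse=True)
    return sum(r / (k + delta) for k, r in enumerate(ordered))


ratings = [18.0, 19.5]  # made-up ratings of the two winners
for delta in (1.0, 0.5, 1e-3, 1e-6):
    scaled = delta * harmonic_score(ratings, delta)
    print(f"delta = {delta:g}: delta * score = {scaled:.6f}")
```

As delta shrinks, the scaled score approaches the rating of the voter's 
favorite winner (19.5 in this made-up example).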

-km