[EM] A VSE-like proportionality measure, or the start of one
Kristofer Munsterhjelm
km-elmet at munsterhjelm.no
Tue Jul 30 16:26:35 PDT 2024
Someone on reddit said this list is too focused on single-winner
methods, and there's been some talk about PR methods here, so I thought
I'd add something... that ends up being related to my first post on the
list.
The Sainte-Laguë index, which is optimized by Sainte-Laguë/Webster, is:
f_SLI = sum over parties i:
(s_i - v_i)^2 / v_i
where s_i is the fraction of seats that party got, and v_i is the
fraction of votes. (Or analogously: the fraction of seats is what we see
- "observed" - and the fraction of the vote is what we would expect with
perfect proportionality - "expected".) Fair enough, but it doesn't seem
to be useful for non-party-list methods.
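For concreteness, here's the index in the party-list case as a minimal
Python sketch (the function name and the example vote/seat figures are
mine, not anything standard):

```python
def sainte_lague_index(seat_shares, vote_shares):
    # sum over parties of (s_i - v_i)^2 / v_i;
    # 0 means perfect proportionality, larger means worse.
    return sum((s - v) ** 2 / v for s, v in zip(seat_shares, vote_shares))

# 10 seats, votes split 50/30/20: Sainte-Lague/Webster awards 5/3/2,
# which happens to be exactly proportional here.
print(sainte_lague_index([0.5, 0.3, 0.2], [0.5, 0.3, 0.2]))  # -> 0.0

# A more majoritarian 6/2/2 split scores strictly worse (higher index).
print(sainte_lague_index([0.6, 0.2, 0.2], [0.5, 0.3, 0.2]))
```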
But then I had a thought: we should be able to just do this in opinion
space.
Suppose we have a spatial model where candidates and voters' positions
in opinion space are drawn from the same distribution. Then for each
opinion space point represented in the winner set, let
s_i = the fraction of winners whose opinions have this coordinate in
opinion space, and
v_i = the probability density at this point.
In the traditional party list case, each candidate and voter occupies a
position in opinion space corresponding to one of the parties, and each
voter votes for the party whose opinion he holds. Thus
v_i = the share of voters sharing party i's position, and
s_i = the fraction of the winner set (seats) given to party i,
which reduces to the traditional Sainte-Laguë index.
Fans of Mike Ossipoff's divisor method might want to use the G-test instead:
f_GT = 2 * sum over positions i:
s_i * ln (s_i/v_i)
but it is not optimized by Sainte-Laguë in the party list case.
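A sketch of the f_GT formula above, with the usual convention that
terms with s_i = 0 contribute zero (the function name is mine):

```python
from math import log

def g_test_index(seat_shares, vote_shares):
    # 2 * sum over positions i of s_i * ln(s_i / v_i);
    # positions with no winners (s_i == 0) contribute 0.
    return 2 * sum(s * log(s / v)
                   for s, v in zip(seat_shares, vote_shares) if s > 0)

# Perfectly proportional shares score 0; any deviation scores > 0.
print(g_test_index([0.5, 0.3, 0.2], [0.5, 0.3, 0.2]))  # -> 0.0
print(g_test_index([0.6, 0.2, 0.2], [0.5, 0.3, 0.2]))
```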
It's important that the v_i figures use the probability density of the
underlying distribution, not the share of voters who hold the opinion in
question, because otherwise you could have a candidate who has an
opinion none of the voters happen to have. Then the Sainte-Laguë index
would be infinite. Call this the "zero opinion" problem.
===
Now, with a finite number of seats, it's impossible to obtain perfect
proportionality. So why not use a VSE-like step to normalize the scores?
Let prop = (E[f_SLI(random)] - E[f_SLI(actual)]) / (E[f_SLI(random)]
- E[f_SLI(optimum)]).
Now prop = 0 corresponds to a random winner (random winning set), and
prop = 1 corresponds to optimal.
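A brute-force sketch of that normalization in a simple party-list-style
setting (the function names, the exhaustive-search optimum, and the
Monte Carlo random baseline are all my own framing, not from the post):

```python
import random
from itertools import combinations

def sli(seat_shares, vote_shares):
    # Sainte-Lague index as defined earlier in the post.
    return sum((s - v) ** 2 / v for s, v in zip(seat_shares, vote_shares))

def prop_score(actual_set, parties, vote_shares, k, trials=2000, seed=0):
    """VSE-style normalization of the index: ~0 for the expected random
    k-set, 1 for the best achievable k-set. parties[j] is the
    opinion-space position (party) of candidate j."""
    n = len(parties)

    def shares(winners):
        counts = [0] * len(vote_shares)
        for w in winners:
            counts[parties[w]] += 1
        return [c / k for c in counts]

    # Optimum by exhaustive search over all k-subsets (fine for small n).
    best = min(sli(shares(c), vote_shares)
               for c in combinations(range(n), k))
    # Random baseline by Monte Carlo sampling of winner sets.
    rng = random.Random(seed)
    rand = sum(sli(shares(rng.sample(range(n), k)), vote_shares)
               for _ in range(trials)) / trials
    return (rand - sli(shares(actual_set), vote_shares)) / (rand - best)

# Two parties at 50/50, four candidates (two per party), elect k=2.
# A mixed pair is perfectly proportional and scores exactly 1.
print(prop_score((0, 2), parties=[0, 0, 1, 1],
                 vote_shares=[0.5, 0.5], k=2))  # -> 1.0
```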
A nice additional property is that in the single-winner case, with a
normal distribution, the Condorcet winner should optimize the measure,
as a corollary of Black's single-peakedness theorem.
===
This *almost* works for solving the problem of quantifying
proportionality for methods like STV. One could imagine something like:
- Ask a representative sample of the voters a bunch of political questions.
- Ask every candidate the same questions.
- Calculate prop between the parliament and the inferred voter opinion
space.
That would give a proportionality measure that works for any PR method
at all, although it would conflate two sources of error: voters not
ranking/rating candidates in order of opinion, and disproportionality
of the method itself.
But, alas, the "zero opinion" problem is still real, and the more
questions your survey has, the worse it gets. I think the problem
ultimately is that we're trying to make a one-sample test (do these
chosen points represent the distribution) do the job of a two-sample
test (are these and those points from the same distribution).
Empirically speaking, maybe one could get around this by smoothing the
voter opinion distribution. But perhaps there's a better theoretically
founded way to do it. There do, after all, exist two-sample chi-squared
tests - maybe I should look more into them.
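For what it's worth, the standard two-sample chi-squared statistic
compares two binned samples directly, with no "true" density needed; a
pure-Python sketch with the usual scaling for unequal sample sizes
(the bin edges here are an arbitrary choice of mine, so the
where-to-put-the-bins caveat still applies):

```python
from math import sqrt

def two_sample_chi2(xs, ys, edges):
    """Two-sample chi-squared statistic over shared bins: 0 when the
    binned samples match, larger the more they differ."""
    def binned(pts):
        counts = [0] * (len(edges) - 1)
        for p in pts:
            for i in range(len(edges) - 1):
                if edges[i] <= p < edges[i + 1]:
                    counts[i] += 1
                    break
        return counts

    r, s = binned(xs), binned(ys)
    # Scaling factors for unequal sample sizes.
    k1, k2 = sqrt(sum(s) / sum(r)), sqrt(sum(r) / sum(s))
    return sum((k1 * ri - k2 * si) ** 2 / (ri + si)
               for ri, si in zip(r, s) if ri + si > 0)

# Identical samples: statistic is 0. Disjoint samples: large.
print(two_sample_chi2([0.1, 0.5, 0.9], [0.1, 0.5, 0.9],
                      [0.0, 0.33, 0.66, 1.0]))  # -> 0.0
print(two_sample_chi2([0.1, 0.1, 0.1], [0.9, 0.9, 0.9], [0.0, 0.5, 1.0]))
```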
(It would also be pretty easy to augment this measure to include
descriptive representation.)
===
In any case, this could be useful to evaluate PR methods'
proportionality, and to check how much proportionality we lose by having
small districts instead of one large district with something like STV.
Ranking a thousand candidates may be impractical for real voters, but
not for computer simulated ones :-)
It's not perfect. Suppose every voter but two has opinion x=0 on a
single axis. Of the remaining two, one is at x=0.1 and one at x=0.9.
Due to some kind of bias (ballot restrictions, funding problems, etc.),
no candidate has this opinion. There are 10 candidates at x=0.1 and 10
more at x=0.9. By the measure above, electing a council full of
candidates at x=0.1 is just as proportional (not at all) as electing one
full of candidates at x=0.9. This seems wrong. It can be dealt with by
dividing voter space into a number of regions and then counting over
each "bin", like a histogram... but then just where you put the bins
will affect the outcome. Maybe there is a kernel density-like solution?
I don't know if there is.
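One kernel density-style option would be to replace the histogram with
a Gaussian KDE of the voter sample, which gives strictly positive
density everywhere, at the cost of a bandwidth parameter that plays
the same role bin placement did. A sketch (the function name and the
bandwidth value are my own arbitrary choices):

```python
from math import exp, pi, sqrt

def kde_density(point, samples, bandwidth=0.1):
    """Gaussian kernel density estimate of the voter opinion
    distribution on one axis. Nonzero everywhere, so the "zero
    opinion" infinities disappear; the bandwidth is a free parameter."""
    norm = len(samples) * bandwidth * sqrt(2 * pi)
    return sum(exp(-0.5 * ((point - x) / bandwidth) ** 2)
               for x in samples) / norm

# The scenario above: 98 voters at x=0, one at x=0.1, one at x=0.9.
# The smoothed density is nonzero at both candidate positions, and
# higher at x=0.1 (closer to the mass at 0), so electing the x=0.1
# slate now scores better than electing the x=0.9 slate.
voters = [0.0] * 98 + [0.1, 0.9]
print(kde_density(0.1, voters) > kde_density(0.9, voters))  # -> True
```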
Using a continuous distribution mitigates the problem because there's
always some support and it's always decreasing away from where the voter
opinions are concentrated[1]. But in full generality, both this and the
"zero opinion" problem are quite real.
The measure, like any straightforward proportionality measure, can also
be criticized as quantifying the wrong thing: as Steve Eppley said in
2008, (paraphrased) "why should I be interested if they have the same
opinions as me, if they don't legislate the way I would?". In a similar
way, it doesn't take into account coalition effects or kingmaker
problems with low threshold PR methods, either.
-km
[1] In the limit - in the spatial model, a particular draw of voters
might just happen to be placed elsewhere.