[EM] Feature extraction and criteria for multiwinner elections
Kristofer Munsterhjelm
kmelmet at broadpark.no
Fri Jan 2 02:38:55 PST 2009
One way of making multiwinner elections proportional is to have the
method pass certain criteria. Most obvious of these are Droop
proportionality, which is the multiwinner analog to mutual majority.
However, such criteria can only say what the method should do, in
certain cases, not what it does in all cases. This is like Condorcet  a
Condorcet method elects a CW whenever there is one, but it says nothing
about what happens when there isn't a CW.
So, if we're going to make better multiwinner methods, I think we need a
way that's applicable to all situations, so that we can apply the same
rule consistently. Another approach would be to try to devise criteria
that cover increasingly more of the situations, but I'll get to that later.
If we're going to have a rule or metric for a good multiwinner election,
what should it be? Well, the intuitive nature of proportional
representation is that if there are some factions, and those factions
all vote accordingly to their preferences, then those factions should be
represented according to their numbers. This suggests some way of
feature extraction: find underlying points in issue space, or the
composition of issue space itself; then pick the candidates that most
accurately represent this configuration.
That, in a sense, is what my test system does to find good multiwinner
systems: it constructs n issues (of varying support among the people),
then randomly assigns agrees and disagrees for each issue to each voter
(and candidate) so that the support reaches the fraction in question.
For instance, for an issue set so that 70% are in favor, a random
fraction of 0.7 of the people (voters and candidates) have that issue in
favor as well.
It may be possible to use this idea in reverse: construct some model of
how voters weight disagreements, then assign n bits (for some n) so that
the RMSE between their ballot scores and the predicted ballot scores
(based on assigned issues to voters and candidates) is minimized. For a
binary issue profile, in my simulations, I used simple "Hamming
agreement". That is, if a voter and a candidate agree on five issues out
of ten, the voter gives five points of ten (or 2.5 of 5 or whatnot).
Obviously, this lends itself much more to cardinal than to ordinal
ballots. What's important is not whether voters actually do this, but
whether it's a good model  that is, whether the RMSE can be made very
low by doing this. If voters vote based on personal appeal or something
like that, and that can be modeled as three or four virtual "issues",
then no great loss.
The good thing about binary issues is that once we have reconstructed
them, it's simple (well, in the NP sense) to ensure proportionality.
Simply pick the set of candidates so that the difference between the
fractions supporting each issue, and those fractions for the entire
people, is minimized according to some error (probably should be the
SainteLague metric, Gini, or RMSE).
What's not so easy is to assign the issues in the first place. The
formal problem becomes something like: "define an issue matrix, n_i *
(voters + candidates), where n_i[issue][person] is either 0 or 1;
further define a model scoring function f(voter, cand) = SUM(k =
1..num_issues, (1  n_i[k][voter]  n_i[k][cand])); now, given a
voters*candidates score matrix q, and an integer p > 0, populate an
issue array of p*(voter+candidates) so that the RMSE of the matrix,
where difference at a point is defined as (f(voter, cand) 
q[voter][cand]), is minimized". The decision version of the problem is,
"is there any way of constructing such a binary issue matrix so that the
RMSE (or some other error) is below a certain value?". I think the
decision version is in NP, so at worst, the problem is NPcomplete, but
what's worse is that if it is, it's in not just the number of
candidates, but also in the number of voters.
So that might be too hard. The idea seems sound, though; construct an
opinion space and then pick proportionally from it. Could we use other
feature extraction methods? There is one such function/method that can
be done in polytime, namely SVD  singular value decomposition. It's
used in, among other things, predicting movie ratings by most entries to
the Netflix contest. However, though I have tinkered a bit with SVD, I
haven't found any way of translating its result into issue space, or
getting good results in my simulations by any such SVDdriven selection.
The two ways I've tried have been to pick candidates so that the sum of
each row is the same for the candidates and for the population at large,
and also one so that a histogram over the candidates (or rather, a KDE,
but it's roughly the same) is similar to one over the people, for all
"issues". Neither seems to give much better results than random. Do any
of you have ideas as to how to use SVD for this purpose? I'm no expert
in statistics.
The binary issue model might also be expanded into a continuous value
model, where each "issue" is no longer a yes/no but a point along a
line. The right way of reproducing that issue space would probably be,
as above, to use a KDE to synthesize a probability distribution and then
check the similarity of that for some candidate set against that for the
people; but if we can't construct binary issue models, we probably can't
construct continuous value ones either, unless there's some
"discontinuity makes it more difficult" case analogous to the difference
between linear and integer programming.

I said I was going to mention some criteria that would cover more than
Droop proportionality. I'm going to use the shorthand (k,n) for an
election that picks a council of k out of n candidates. Some ideas are:
Condorcet: If the election is (1, n) and there's a CW, it should be elected.
Reverse Condorcet: If the election is (n1, n) and there's a Condorcet
loser, all but the Condorcet loser should be elected.
Fragmented Condorcet: If the election is (k, n), and there's a way of
dividing the ballots into k piles so that each of those piles have a CW,
and all k CWs are different, then those CWs should be elected. If there
are multiple such partitions, the method passes if it elects according
to one of them.
The idea of Condorcet is rather simple; if the multiwinner method's any
good, it should be a good singlewinner method in the (1,n) election
case as well.
Reverse Condorcet is inspired by something I observed in Raph Frank's
2of3 multiwinner diagrams. If you assign the three candidates
according to primary colors (that is, red, green, and blue), then you
get a diagram with Voronoitype shapes in composite colors. See this,
for instance: http://munsterhjelm.no/km/composite_multiwinner.png ,
which was made by overlaying "Optimal Utility (PAV)" from
http://ivnryan.com/ping_yee/triang_10000_2.html
Fragmented Condorcet is something I'm more uncertain about, but it can
be traced to two ideas. The first one is that of PR being like elections
by faction: if there are two factions, one can imagine each faction
having a separate CW, in which case, if the factions are of equal size,
those two are those that should be elected. The other idea is that of
how PR methods have to fail house monotonicity. Say you have something like
30: Left > Center > Right
30: Right > Center > Left
4: Center > Left > Right
When electing one, that one should be Center to be Condorcet. When
electing two, it should be Left and Right. This can be done by splitting
like this:
15: Left > Center > Right
2: Center > Left > Right
15: Right > Center > Left
2: Center > Left > Right
which gives {Left, Right}, W5. By similar reasoning, if there's a direct
democracy and everybdoy votes for themselves first, and the council's
the same size as the people (direct democracy) then everybody is in it.
However, the LeftCenterRight example may also suggest that Fragmented
Condorcet isn't desirable at all, since it splits the Center voters, and
thus is merely a way of doing automatic gerrymandering.
What are your opinions on the criteria in general?
More information about the ElectionMethods
mailing list